Course Research Engineer: Sujan Dutta
Course Description: Online social media provides a rich source of detailed data reflecting how political sentiments evolve over time and in response to news events. This seminar course studies a diverse set of recent papers at the intersection of machine learning, natural language processing, and political science, with the aim of posing research questions concerning US politics and devising ML and NLP frameworks for answering them. A key component of the course is a semester-long research project with a view toward a peer-reviewed publication. To this end, the course provides a large text data set relevant to US politics. Each student will formulate, explore, and address a focused research question through the lens of this data. By the end of the course, apart from acquiring hands-on experience in realizing the synergy between large-scale data, creative research questions, and effective NLP solutions, we all hope to have an improved understanding of why we are in what we are in.
Prerequisites: The course requires familiarity with machine learning at the level of being able to complete a substantial project. Any of the following courses can serve as a prerequisite at RIT: CSCI-635, STAT-547, CISC-865, CSEC-620, CMPE-679, CSCI-722, ENGL-681, CSCI-630, ISTE-780, and STAT-745. Any of the following courses can serve as a prerequisite at CMU: 10-601, 10-701, 10-605, 10-805, 11-685, 11-785, 95-845. Interested students who have not taken one of these courses should contact the instructor (Ashique KhudaBukhsh, axkvse@rit.edu or akhudabu@cs.cmu.edu) to check whether they have the required background. RIT students: Please reach out to Ashique KhudaBukhsh (axkvse@rit.edu) if you are experiencing any trouble registering for the course.
Course project and resources: In this course, students are encouraged to explore ambitious, open-ended projects. The course will provide some useful data sets and a list of potential project ideas. Students will work in small teams (2-3 members) on course projects. The course research engineer will provide useful scripts and code to efficiently process the data.
Grading: Weekly reading summaries (10%), class participation (10%), class presentation (15%), weekly/bi-weekly progress discussions (10%), midterm project evaluation (15%), final project evaluation (40%).
Reading summaries: Weekly reading summaries are due every Monday (or Wednesday, if Monday is a holiday) before the first class of the week begins. The summary will cover all the papers we are scheduled to read for the week. The primary goal of the summary is to make sure we have read the papers beforehand and are ready to discuss the finer points during class. The summary can be fairly informal; there is no need to regurgitate the whole paper. A few interesting lines of thought that came to you while reading these papers is what we are looking for. Please email the reading summary to axkvse@rit.edu. Summaries are not required for the optional readings.
Tentative syllabus: Each lecture (from lecture 8 onward) is organized around a main paper (and a few supplementary readings), with a student presenting the key ideas in the first half of the lecture, followed by class discussion. Students are required to submit a short reading summary (no more than a single page) outlining the key ideas of the papers scheduled for the week before class starts on Monday. Lectures far in the future may be reshuffled if we need to adjust to the schedules of invited guests and speakers.
Date | Topic | Readings and notes |
---|---|---|
Aug 22 | Course overview | |
Aug 24 | No class, office hour during the regular class hour. | |
Aug 29 | Course overview; data set characterization. | |
Aug 31 | Our first paper on the data set | Guest: Rupak Sarkar (University of Maryland, College Park). Chapter 6 from Dan Jurafsky and James H. Martin's book. |
Sep 5 | Labor Day | No class. |
Sep 7 | Why does the Internet behave the way it does: a brief history of content moderation. | Reading summary due for the week (before class starts). We Don't Speak the Same Language: Interpreting Polarization through Machine Translation; KhudaBukhsh*, Sarkar*, Kamlet, Mitchell; AAAI 2021. Paper. Word Embeddings Quantify 100 Years of Gender and Ethnic Stereotypes; Garg, Schiebinger, Jurafsky, Zou; PNAS, 2018. Paper. |
Sep 12 | The political evolution in the US (Mark) | Reading summary due for the week (before class starts). The Polarization of Contemporary American Politics; Hare, Poole; Polity, 2014. Paper. A Computational Model of the Citizen as Motivated Reasoner: Modeling the Dynamics of the 2000 Presidential Election; Kim, Taber, Lodge; Political Behavior, 2010. Paper. |
Sep 14 | Potential research questions and project ideas | Text-based Inference of Moral Sentiment Change; Xie, Ferreira Pinto Jr., Hirst, Xu; EMNLP, 2019. Paper. (Optional reading) Black is to Criminal as Caucasian is to Police: Detecting and Removing Multiclass Bias in Word Embeddings; Manzini, Chong, Tsvetkov, Black; NAACL, 2019. Paper. |
Sep 19 | The political evolution in the US continued (Mark) | Reading summary due for the week (before class starts). "i have a feeling trump will win..................": Forecasting Winners and Losers from User Predictions on Twitter; Swamy, Ritter, de Marneffe; EMNLP, 2017. Paper. |
Sep 21 | Potential research questions and project ideas | Fringe News Networks: Dynamics of US News Viewership following the 2020 Presidential Election; KhudaBukhsh*, Sarkar*, Kamlet, Mitchell; ACM Web Science 2022. Paper. Project Proposal is due (Friday, September 23). |
Sep 26 | Predicting political outcomes | Reading summary due for the week (before class starts). Student presentations start from this lecture. From Tweets to Polls: Linking Text Sentiment to Public Opinion Time Series; O'Connor, Balasubramanyan, Routledge, Smith; ICWSM 2010. Paper. (Optional reading) Does the @realDonaldTrump Really Matter to Financial Markets?; Benton, Philips; American Journal of Political Science, 2020. Paper. |
Sep 28 | Opinion aggregation using language models | Mining Insights from Large-scale Corpora Using Fine-tuned Language Models; Palakodety, KhudaBukhsh, Carbonell; ECAI 2020. Paper. (Optional reading) How Can We Know What Language Models Know; Jiang, Xu, Araki, Neubig; TACL 2020. Paper. |
Oct 3 | Political users | Reading summary due for the week (before class starts). Classification without (Proper) Representation: Political Heterogeneity in Social Media and Its Implications for Classification and Behavioral Analysis; Alkiek, Zhang, Jurgens; ACL 2022. Paper. |
Oct 5 | Censorship and moderation | Censorship and Deletion Practices in Chinese Social Media; Bamman, O'Connor, and Smith; First Monday, 2012. Paper. You Can't Stay Here: The Efficacy of Reddit's 2015 Ban Examined Through Hate Speech; Chandrasekharan, Pavalanathan, Srinivasan, Glynn, Eisenstein, Gilbert; CSCW 2017. Paper. |
Oct 10 | Serendipity and Challenges in CSS Research (Ashique) | Reading summary due for the week (before class starts). For CMU Students |
Oct 12 | Polarization | Mark Kamlet. |
Oct 17 | Serendipity and Challenges in CSS Research (Ashique) | For RIT Students |
Oct 19 | No class. | |
Oct 24 | Spotlight talks (5-7 minutes per group) | Reading summary due for the week (before class starts). Midterm project evaluation |
Oct 26 | Polarization | Political Polarization in Online News Consumption; Garimella, Smith, Weiss, West; ICWSM 2021. |
Oct 31 | Polarization | Reading summary due for the week (before class starts). Aligning Multidimensional Worldviews and Discovering Ideological Differences; Milbauer, Mathew, Evans; EMNLP 2021. (Optional reading) Analyzing Polarization in Social Media: Method and Application to Tweets on 21 Mass Shootings; Demszky, Garg, Voigt, Zou, Shapiro, Gentzkow, Jurafsky; NAACL 2019. Paper. |
Nov 2 | Controversy | Something's Brewing! Early Prediction of Controversy-causing Posts from Discussion Features; Hessel, Lee; NAACL 2019. Paper. Events and Controversies: Influences of a Shocking News Event on Information Seeking; Koutra, Bennett, Horvitz; WWW 2015. Paper. |
Nov 7 | Hate speech, counter speech | Reading summary due for the week (before class starts). Thou Shalt Not Hate: Countering Online Hate Speech; Mathew, Saha, Tharad, Rajgaria, Singhania, Maity, Goyal, Mukherjee; ICWSM 2019. Paper. (Optional reading) Voice for the Voiceless: Active Sampling to Detect Comments Supporting the Rohingyas; Palakodety, KhudaBukhsh, Carbonell; AAAI 2020. Paper. (Optional reading) Hate Speech Detection is Not as Easy as You May Think; Arango, Pérez, Poblete; SIGIR 2019. Paper. |
Nov 9 | TBD. | TBD. |
Nov 14 | Misinformation | Reading summary due for the week (before class starts). Guest lecture: Kathleen Carley. Capturing the Style of Fake News; Przybyla; AAAI 2020. Paper. Political Knowledge and Misinformation in the Era of Social Media: Evidence from the 2015 U.K. Election; Munger, Egan, Nagler, Ronen, Tucker; British Journal of Political Science, forthcoming. Paper. |
Nov 16 | Policing | A Murder and Protests, the Capitol Riot, and the Chauvin Trial: Estimating Disparate News Media Stance; Dutta, Li, Nagin, KhudaBukhsh; IJCAI 2022. Paper. (Optional reading) Language from police body camera footage shows racial disparities in officer respect; Voigt, Camp, Prabhakaran, Hamilton, Hetey, Griffiths, Jurgens, Jurafsky, Eberhardt; PNAS, 2017. Paper. |
Nov 21 | Politics and news media | Reading summary due for the week (before class starts). Strategic Candidate Entry and Congressional Elections in the Era of Fox News; Arceneaux, Dunaway, Johnson, Vander Wielen; American Journal of Political Science, 2020. Paper. Partisanship, Propaganda, and Disinformation: Online Media and the 2016 U.S. Presidential Election; Faris, Roberts, Etling, Bourassa, Zuckerman, Benkler; SSRN 2017. Paper. |
Nov 28 | Immigration and climate change | Reading summary due for the week (before class starts). Computational analysis of 140 years of US political speeches reveals more positive but increasingly polarized framing of immigration; Card, Chang, Becker, Mendelsohn, Voigt, Boustan, Abramitzky, Jurafsky; PNAS 2022. Paper. |
Nov 30 | Final project presentations | |
Dec 5 | Final project presentations | |