Dr. Aron Culotta

Assistant Professor of Computer Science | Illinois Institute of Technology | Chicago, IL 60616 | 312.567.5261 | SB 229B







Dr. Culotta leads the Text Analysis in the Public Interest Lab, which investigates socially beneficial applications of natural language processing, machine learning, and text mining algorithms. See the Projects page for more information. The lab focuses on three core areas:

Press: Our work in social media analysis has been discussed in several press outlets, including the Wall Street Journal, The Atlantic, CNET, and the Communications of the ACM.

Interested PhD students should apply here.
(see also my Google Scholar page)

2016
Domain Adaptation for Learning from Label Proportions Using Self-Training
Ehsan Mohammady Ardehaly and Aron Culotta
IJCAI, 2016.
Cold-Start Recommendations for Audio News Stories Using Matrix Factorization
Ehsan Mohammady Ardehaly and Aron Culotta
IJCAI, 2016.
Towards identifying leading indicators of smoking cessation attempts from social media
Aron Culotta
ICHI CHS Workshop, 2016.
#Polar Scores: Measuring Partisanship Using Social Media Content
Libby Hemphill, Aron Culotta, and Matthew Heston
Journal of Information Technology & Politics, 2016.
Robust Text Classification in the Presence of Confounding Bias
Virgile Landeiro and Aron Culotta
AAAI, 2016.   (code)
Reducing confounding bias in observational studies that use text classification
Virgile Landeiro and Aron Culotta
AAAI Spring Symposium on Observational Studies through Social Media, 2016.
Mining brand perceptions from Twitter social networks   (code)
Aron Culotta and Jennifer Cutler
Marketing Science, 35(3), pp. 343--362, 2016.
Predicting Twitter user demographics using distant supervision from website traffic data   (code)
Aron Culotta, Nirmal Kumar Ravi, and Jennifer Cutler
Journal of Artificial Intelligence Research, 55, pp. 389--408, 2016.
Training a text classifier with a single word using Twitter Lists and domain adaptation
Aron Culotta
Social Network Analysis and Mining, 6(1), 1--15, 2016.
2015
Inferring latent attributes of Twitter users with label regularization
Ehsan Mohammady Ardehaly, Aron Culotta
HLT, 2015.
A demographic and sentiment analysis of e-cigarette messages on Twitter   (code)
Elaine Cristina Resende and Aron Culotta
Computational Health Science Workshop at ACM BCB-BIO, 2015.
Finding truth in cause-related advertising: A lexical analysis of brands' health, environment, and social justice communications on Twitter  
Aron Culotta, Jennifer Cutler, and Junzhe Zheng
The Journal of Values-Based Leadership, Vol. 8, Iss. 2, 2015
Predicting the Demographics of Twitter Users from Website Traffic Data   (code)
Aron Culotta, Nirmal Kumar Ravi, Jennifer Cutler
AAAI, 2015.
Outstanding Paper (Honorable Mention)
Using Matched Samples to Estimate the Effects of Exercise on Mental Health from Twitter   (code)
Virgile Landeiro Dos Reis and Aron Culotta
AAAI, 2015.
2014
Using county demographics to infer attributes of Twitter users
Ehsan Mohammady Ardehaly, Aron Culotta
ACL Joint Workshop on Social Dynamics and Personal Attributes in Social Media, 2014.
Reducing Sampling Bias in Social Media Data for County Health Inference   (code)
Aron Culotta
JSM, 2014.
Anytime Active Learning  
Maria E Ramirez-Loaiza, Aron Culotta, Mustafa Bilgic
AAAI, 2014.
Estimating County Health Statistics with Twitter   (code)
Aron Culotta
CHI, 2014.
Tweedr: Mining Twitter to Inform Disaster Response  (code)
Zahra Ashktorab, Christopher Brown, Manojit Nandi, Aron Culotta
ISCRAM, 2014.
Inferring the Origin Locations of Tweets with Quantitative Confidence   (code)
Reid Priedhorsky, Aron Culotta, Sara Y. Del Valle
CSCW, 2014.
Best Paper (Honorable Mention)
2013
Framing in Social Media: How the US Congress Uses Twitter Hashtags to Frame Political Issues   (code)
Libby Hemphill, Aron Culotta, Matthew Heston
Technical Report (SSRN)
Too Neurotic, Not too Friendly: Structured Personality Classification on Textual Data
Francisco Iacobelli and Aron Culotta
ICWSM Workshop on Personality Classification, 2013.
Towards Anytime Active Learning: Interrupting Experts to Reduce Annotation Costs
Maria E. Ramirez-Loaiza, Aron Culotta, Mustafa Bilgic
KDD Workshop on Interactive Data Exploration and Analytics (IDEA), 2013.
Lightweight methods to estimate influenza rates and alcohol sales volume from Twitter messages
Aron Culotta
Language Resources and Evaluation, Special Issue on Analysis of Short Texts on the Web, 47(1):217--238, 2013
Preprint. Final version at Springer
2012
A demographic analysis of online sentiment during Hurricane Irene
Benjamin Mandel, Aron Culotta, John Boulahanis, Danielle Stark, Bonnie Lewis, Jeremy Rodrigue
NAACL-HLT Workshop on Language in Social Media, 2012
2011
SampleRank: Training factor graphs with atomic gradients
Michael Wick, Khashayar Rohanimanesh, Kedar Bellare, Aron Culotta, Andrew McCallum
Proceedings of the International Conference on Machine Learning (ICML), 2011
2010
Detecting influenza epidemics by analyzing Twitter messages
Aron Culotta
arXiv:1007.4748v1 [cs.IR], 2010
Towards detecting influenza epidemics by analyzing Twitter messages
Aron Culotta
KDD Workshop on Social Media Analytics, 2010
2009
SampleRank: Learning preferences from atomic gradients
Michael Wick, Khashayar Rohanimanesh, Aron Culotta, Andrew McCallum
Neural Information Processing Systems (NIPS) Workshop on Advances in Ranking, 2009
An entity-based model for coreference resolution
Michael Wick, Aron Culotta, Khashayar Rohanimanesh, Andrew McCallum
SIAM International Conference on Data Mining, 2009
2008
Learning and inference in weighted logic with application to natural language processing
Aron Culotta
Ph.D. Thesis, University of Massachusetts, Amherst, 2008
2007
Canonicalization of Database Records using Adaptive Similarity Measures
Aron Culotta, Michael Wick, Robert Hall, Matthew Marzilli, Andrew McCallum
Proceedings of the 13th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2007
Sparse Message Passing Algorithms for Weighted Maximum Satisfiability
Aron Culotta, Andrew McCallum, Bart Selman, Ashish Sabharwal
New England Student Colloquium on Artificial Intelligence (NESCAI), 2007
Author Disambiguation using Error-driven Machine Learning with a Ranking Loss Function
Aron Culotta, Pallika Kanani, Robert Hall, Michael Wick, Andrew McCallum
Sixth International Workshop on Information Integration on the Web (IIWeb-07), 2007
First-Order Probabilistic Models for Coreference Resolution
Aron Culotta, Michael Wick, Robert Hall, Andrew McCallum
Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL), 2007
2006
Corrective Feedback and Persistent Learning for Information Extraction
Aron Culotta, Trausti Kristjansson, Andrew McCallum, Paul Viola
Artificial Intelligence, 2006
Tractable Learning and Inference with High-Order Representations
Aron Culotta, Andrew McCallum
International Conference on Machine Learning Workshop on Open Problems in Statistical Relational Learning, 2006
Learning field compatibilities to extract database records from unstructured text
Michael Wick, Aron Culotta, Andrew McCallum
Conference on Empirical Methods in Natural Language Processing (EMNLP), 2006
Practical Markov logic containing first-order quantifiers with application to identity uncertainty
Aron Culotta, Andrew McCallum
Human Language Technology Workshop on Computationally Hard Problems and Joint Inference in Speech and Language Processing (HLT/NAACL), 2006
Integrating probabilistic extraction models and data mining to discover relations and patterns in text
Aron Culotta, Andrew McCallum, Jonathan Betz
Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL), 2006
2005
Learning clusterwise similarity with first-order features
Aron Culotta, Andrew McCallum
Neural Information Processing Systems (NIPS) Workshop on the Theoretical Foundations of Clustering, 2005
A conditional model of deduplication for multi-type relational data
Aron Culotta, Andrew McCallum
University of Massachusetts IR-443, 2005
Joint deduplication of multiple record types in relational data
Aron Culotta, Andrew McCallum
ACM CIKM International Conference on Information and Knowledge Management, 2005
Reducing labeling effort for structured prediction tasks
Aron Culotta, Andrew McCallum
The Twentieth National Conference on Artificial Intelligence (AAAI), 2005
Gene prediction with conditional random fields
Aron Culotta, David Kulp, Andrew McCallum
University of Massachusetts, Amherst UM-CS-2005-028, 2005
2004
Dependency tree kernels for relation extraction
Aron Culotta, Jeffrey Sorensen
42nd Annual Meeting of the Association for Computational Linguistics (ACL), 2004
Interactive information extraction with constrained conditional random fields
Trausti Kristjansson, Aron Culotta, Paul Viola, Andrew McCallum
Nineteenth National Conference on Artificial Intelligence (AAAI), 2004
Best Paper Award (Honorable Mention)
Confidence estimation for information extraction
Aron Culotta, Andrew McCallum
Human Language Technology Conference of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL), 2004
Extracting social networks and contact information from email and the Web
Aron Culotta, Ron Bekkerman, Andrew McCallum
First Conference on Email and Anti-Spam (CEAS), 2004
2003
Maximizing cascades in social networks
Aron Culotta
University of Massachusetts, 2003
Fall 2016 CS 579: Online Social Network Analysis
Spring 2016 CS 429: Information Retrieval
Fall 2015 CS 579: Online Social Network Analysis
Spring 2015 CS 429: Information Retrieval
Fall 2014 CS 579: Online Social Network Analysis
Spring 2014 CS 429: Information Retrieval
Fall 2013 CS 595: Machine Learning and Social Media

Some data sets I've created to train and evaluate machine learning algorithms: