This data set contains 66,502 Twitter users with their real gender, text-predicted age and text-predicted Big Five personality scores. We also include the URL of the profile picture used in our study.

The users were mapped to their real gender by linking the Twitter accounts to blogs where users could have specified their gender (see "Discriminating Gender on Twitter" - John D. Burger, John Henderson, George Kim, Guido Zarrella - EMNLP 2011). The age and Big Five personality scores were predicted automatically from the text of each user's tweets (for personality - "Personality, Gender, and Age in the Language of Social Media: the Open-Vocabulary Approach" - H.A. Schwartz et al - PLoS ONE 2013; for age - "Developing Age and Gender Predictive Lexica over Social Media" - M. Sap et al - EMNLP 2014).

The dataset consists of a single csv file:

users-persimages.csv - a csv file containing the Twitter user_id, real gender (coded as 0 - male, 1 - female), text-predicted age, text-predicted Big Five personality traits and the URL of the profile picture.

- Per the Twitter TOS, we are not allowed to share the actual images, as users should retain the right to remove them from the system. However, providing the URL confirms which picture was used at the time of our analysis.
- This is only the data for the TwitterText data set described in the paper. We are unable to share the TwitterSurvey data set personality scores under our IRB protocol.

If using this data set, please cite the following publication:

@inproceedings{persimages16icwsm,
  title={{Analyzing Personality through Social Media Profile Picture Choice}},
  author={Liu, Leqi and Preo\c{t}iuc-Pietro, Daniel and Riahi Samani, Zahra and Moghaddam, Mohsen E. and Ungar, Lyle},
  series = {ICWSM},
  booktitle = {{Proceedings of the 10th International AAAI Conference on Web and Social Media}},
  year={2016}
}
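
As an illustration of reading the file and decoding the gender field, here is a minimal Python sketch. Only the 0/1 gender coding comes from the description above. The column names (`user_id`, `gender`, `age`, `profile_url`), the helper `load_users`, and the two sample rows are hypothetical placeholders, so check the header of the real users-persimages.csv before relying on them.

```python
import csv
import io

# Gender coding stated in this README: 0 = male, 1 = female.
GENDER_LABELS = {"0": "male", "1": "female"}


def load_users(handle, gender_column="gender"):
    """Read rows from an open CSV handle and add a readable gender label.

    `gender_column` is an assumed header name, not taken from the real file.
    """
    rows = []
    for row in csv.DictReader(handle):
        row = dict(row)
        # Values are read as strings, so the lookup keys are "0" and "1".
        row["gender_label"] = GENDER_LABELS.get(row[gender_column], "unknown")
        rows.append(row)
    return rows


# Synthetic two-row sample with hypothetical column names (not real data).
SAMPLE = io.StringIO(
    "user_id,gender,age,profile_url\n"
    "1,0,25.5,http://example.com/a.jpg\n"
    "2,1,31.0,http://example.com/b.jpg\n"
)

users = load_users(SAMPLE)
print([u["gender_label"] for u in users])
```

For the real data, the same function can be given a file opened with `open("users-persimages.csv", newline="")`, passing the actual header name as `gender_column` if it differs from the assumed one.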