Paraphrase choice based on user personality traits

PPDB paraphrase pairs and clusters with their associated usage scores across the five personality traits of the Big Five model:
a. Openness to Experience
b. Conscientiousness
c. Extraversion
d. Agreeableness
e. Neuroticism

Contents:
frequencies.tar.gz - contains the raw frequency statistics for all phrases and each trait
pairs.tar.gz - contains files with pairwise usage scores for each trait
clusters.tar.gz - contains files with cluster usage scores for each trait

In pairs and clusters, negative values mark phrases that are more associated with being low in a trait.

If you are using this dataset, please reference our work:

@inproceedings{perspara17nlpcss,
  author = {Preo\c{t}iuc-Pietro, Daniel and Xu, Wei and Ungar, Lyle},
  title = {{Personality Driven Difference in Paraphrase Preference}},
  booktitle = {{Proceedings of the Workshop on Natural Language Processing and Computational Social Science (NLP+CSS)}},
  series = {ACL},
  year = {2017}
}
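
Usage sketch: this README does not document the file layout inside the three archives, so the Python snippet below only lists what each archive contains and encodes the sign convention stated above. The helper names list_members and describe_score are hypothetical and not part of the dataset. Reading a positive score as "high" in a trait and a zero score as "neutral" is an assumption; the README only states the meaning of negative values.

```python
import tarfile
from pathlib import Path

# Archive names as listed under "Contents" in this README.
ARCHIVES = ["frequencies.tar.gz", "pairs.tar.gz", "clusters.tar.gz"]


def list_members(archive_path):
    """Return the names of the regular files stored in a .tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as tar:
        return [member.name for member in tar.getmembers() if member.isfile()]


def describe_score(score):
    """Map a usage score to the end of the trait it points to.

    Negative means "low" (stated in the README). Positive meaning "high"
    and zero meaning "neutral" are assumptions made for this sketch.
    """
    if score < 0:
        return "low"
    if score > 0:
        return "high"
    return "neutral"


if __name__ == "__main__":
    # Only inspect archives that are present in the current directory.
    for name in ARCHIVES:
        path = Path(name)
        if path.exists():
            print(name, list_members(path))
        else:
            print(name, "not found in the current directory")
```

Listing the members first shows the actual file names and layout before any parsing code is written against them.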