How Your Facebook Posts Reflect Your Personality Style
Author: Susan Cain
Wow. This is an incredible study of personality styles, based on the language that people with different personality traits use in their Facebook posts.
The researchers analyzed 700 million words and phrases collected from the Facebook messages of 75,000 volunteers, and found fascinating differences between the posts of introverts and extroverts, men and women, etc.
Would love to hear your thoughts about the study! For me, the findings weren’t always substantively surprising, but seeing them presented as word clouds gave me a whole new view into human nature.
Personality, Gender, and Age in the Language of Social Media: The Open-Vocabulary Approach
The social sciences have entered the age of data science, leveraging the unprecedented sources of written language that social media afford. Through media such as Facebook and Twitter, used regularly by more than 1/7th of the world's population, variation in mood has been tracked diurnally and across seasons, used to predict the stock market, and leveraged to estimate happiness across time. Search patterns on Google detect influenza epidemics weeks before CDC data confirm them, and the digitization of books makes possible the quantitative tracking of cultural trends over decades. To make sense of the massive data available, multidisciplinary collaborations between fields such as computational linguistics and the social sciences are needed. Here, we demonstrate an instrument which uniquely describes similarities and differences among groups of people in terms of their differential language use.
Our technique leverages what people say in social media to find distinctive words, phrases, and topics as functions of known attributes of people such as gender, age, location, or psychological characteristics. The standard approach to correlating language use with individual attributes is to examine usage of a priori fixed sets of words, limiting findings to preconceived relationships with words or categories. In contrast, we extract a data-driven collection of words, phrases, and topics, in which the lexicon is based on the words of the text being analyzed. This yields a comprehensive description of the differences between groups of people for any given attribute, and allows one to find unexpected results. We call approaches like ours, which do not rely on a priori word or category judgments, open-vocabulary analyses.
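To make the open-vocabulary idea concrete, here is a minimal Python sketch in which the lexicon is built from whatever words and two-word phrases actually occur in the text, rather than from a fixed word list. The tokenizer, the function names (tokenize, ngram_counts), and the sample messages are illustrative assumptions, not the researchers' actual pipeline, and the sketch leaves out the automatically generated topics.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens.

    Assumption: a deliberately crude tokenizer, used only for illustration.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(messages, max_n=2):
    """Count every 1- to max_n-word sequence that occurs in the messages.

    The vocabulary is whatever appears in the data; no word list is
    supplied in advance.
    """
    counts = Counter()
    for message in messages:
        tokens = tokenize(message)
        for n in range(1, max_n + 1):
            for i in range(len(tokens) - n + 1):
                counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Hypothetical sample messages, invented for this example.
sample = ["Happy birthday to you", "happy happy day"]
features = ngram_counts(sample)
print(features.most_common(3))
```

A closed-vocabulary approach would instead start from a predefined dictionary of words and count only those, so anything outside the dictionary would go unnoticed.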
We use differential language analysis (DLA), our particular method of open-vocabulary analysis, to find language features across millions of Facebook messages that distinguish demographic and psychological attributes. From a dataset of over 15.4 million Facebook messages collected from 75 thousand volunteers, we extract 700 million instances of words, phrases, and automatically generated topics and correlate them with gender, age, and personality. We replicate traditional language analyses by applying Linguistic Inquiry and Word Count (LIWC), a popular tool in psychology, to our data set. Then, we show that open-vocabulary analyses can yield additional insights (correlations between personality and behavior as manifest through language) and more information (as measured through predictive accuracy) than traditional a priori word-category approaches. We present a word cloud-based technique to visualize results of DLA. Our large set of correlations is made available for others to use (available at: http://www.wwbp.org/).
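The correlation step can be sketched roughly as follows. This is a simplified illustration, not the authors' DLA implementation: it substitutes a plain Pearson correlation between each word's relative frequency and a numeric attribute, and it omits any statistical controls or significance testing. The function names (pearson, rank_words), the sample texts, and the extraversion scores are all hypothetical.

```python
import math
from collections import Counter


def pearson(xs, ys):
    """Plain Pearson correlation; returns 0.0 if either side is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_words(user_texts, scores):
    """Rank words by how strongly their per-user relative frequency
    correlates with the attribute score, most positive first."""
    freqs = []
    for text in user_texts:
        tokens = text.lower().split()
        total = len(tokens) or 1  # avoid dividing by zero for empty text
        freqs.append({w: c / total for w, c in Counter(tokens).items()})
    vocab = set().union(*freqs)
    correlations = {
        word: pearson([f.get(word, 0.0) for f in freqs], scores)
        for word in vocab
    }
    return sorted(correlations.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical data: one text per user and an invented extraversion score.
texts = ["party party tonight", "reading a book",
         "party with friends", "quiet reading time"]
extraversion = [5.0, 1.0, 4.0, 2.0]
ranked = rank_words(texts, extraversion)
print(ranked[0][0], ranked[-1][0])  # most and least extraversion-linked word
```

In a word cloud of the kind the study presents, the words at the top of such a ranking would be the ones drawn largest for the high-scoring group.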