Training Workshop on Social Media and Demographic Methods

Washington, D.C., United States, 30 March 2016 


The IUSSP Scientific Panel on Big Data and Population Processes organized a training workshop on Social Media and Demographic Methods at the 2016 annual meeting of the Population Association of America (PAA). The workshop was funded by grants from the William and Flora Hewlett Foundation to the IUSSP and the PAA to support demographers' participation in the Data Revolution. The half-day workshop, held in Washington, D.C. on 30 March 2016, focused on accessing social media data and analyzing them using demographic methods. It provided an introduction to tools such as Application Programming Interfaces (APIs) for collecting data from social media (e.g., Twitter and Facebook), and offered examples of demographic methods that could be used to gain insights from these data.



In the first part of the workshop, Emilio Zagheni offered an introduction to tools for gathering social media data and for extracting demographic characteristics from profile pictures (e.g., the Face++ API). Charles Lanfear and Kivan Polimis then presented a series of hands-on tutorials (e.g., on the Twitter Streaming API and the Facebook Pages API). Participants followed a walk-through on their own laptops, using R code and datasets that were distributed in advance, as well as data collected in real time during the workshop. In addition to showing how to access the APIs and gather data, the presenters introduced simple examples of the types of analyses that can be done. These included: a) building a simple index of positive and negative keywords for Tweets from the Washington, D.C. area containing hashtags related to politicians (e.g., #donaldtrump and #berniesanders); b) generating a network map of people who “liked” and commented on posts from the Facebook Pages of various public figures; c) generating estimates of the age, gender, and race of individuals using pictures of their faces. R code for the tutorials is available here:


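As an illustration of analysis (a) above, the following is a minimal sketch of a positive/negative keyword index. It is not the workshop's R code: it is written in Python, the keyword lists and sample Tweets are invented for illustration, and the function names `keyword_index` and `index_by_hashtag` are hypothetical. It works on Tweet texts that have already been collected and makes no API calls.

```python
import re
from collections import defaultdict

# Hypothetical keyword lists, invented for this sketch only.
POSITIVE = {"great", "love", "win", "best"}
NEGATIVE = {"bad", "hate", "lose", "worst"}

# Invented sample Tweets; real input would come from the Streaming API.
SAMPLE_TWEETS = [
    "Great rally today, love the energy #berniesanders",
    "Worst debate answer ever #donaldtrump",
    "Best speech so far #donaldtrump",
    "I hate the traffic near the rally #berniesanders",
]


def keyword_index(text):
    """Return the number of positive minus negative keywords in one Tweet."""
    words = re.findall(r"[a-z']+", text.lower())
    positive = sum(word in POSITIVE for word in words)
    negative = sum(word in NEGATIVE for word in words)
    return positive - negative


def index_by_hashtag(tweets, hashtags):
    """Average the keyword index over the Tweets that mention each hashtag."""
    scores = defaultdict(list)
    for text in tweets:
        lowered = text.lower()
        for tag in hashtags:
            if tag in lowered:
                scores[tag].append(keyword_index(text))
    return {tag: sum(values) / len(values) for tag, values in scores.items()}


if __name__ == "__main__":
    print(index_by_hashtag(SAMPLE_TWEETS, ["#donaldtrump", "#berniesanders"]))
```

A count-based index like this ignores negation, sarcasm and context, so it is best read as a rough summary of tone rather than a measure of opinion.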

To foster communication and interaction among scholars and students interested in the emerging field of digital demography, the second part of the workshop featured lightning talks by researchers in the field. Some of the presenters offered a more conceptual view of the relation between digital data sources and traditional demographic data and techniques. Emilio Zagheni discussed how the set of all Twitter users can be studied as a population (with “births” occurring when users sign up and “deaths” when users become inactive), so that standard models of population growth apply. Emmanuel Letouzé reviewed the state of the field of sample bias correction and argued that such techniques need to be refined in order to better understand and use Big Data sources for deriving development indicators. Nina Cesare discussed methods, applications and challenges for doing sociological research with Twitter. Guy Abel showed visualizations of scientific collaboration networks for papers presented at European Population Conferences. Francesco Billari discussed how fertility choices might be affected by the Internet. Elizabeth Bruch summarized her work on extracting decision rules from online dating data.
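The idea of treating Twitter users as a population can be sketched with the demographic balancing equation, N(t+1) = N(t) + B(t) - D(t), where "births" are sign-ups and "deaths" are users who become inactive. The Python sketch below is an illustration under stated assumptions, not the model presented in the talk: the function names and the numbers are hypothetical, and migration has no counterpart here.

```python
def project_users(initial_users, signups, dropouts):
    """Apply N(t+1) = N(t) + B(t) - D(t) period by period.

    signups  -- "births": new accounts in each period
    dropouts -- "deaths": accounts that become inactive in each period
    """
    sizes = [initial_users]
    for births, deaths in zip(signups, dropouts):
        sizes.append(sizes[-1] + births - deaths)
    return sizes


def crude_growth_rates(sizes, signups, dropouts):
    """Crude growth rate per period: (B - D) divided by the mid-period population."""
    rates = []
    for t, (births, deaths) in enumerate(zip(signups, dropouts)):
        mid_period = (sizes[t] + sizes[t + 1]) / 2
        rates.append((births - deaths) / mid_period)
    return rates


if __name__ == "__main__":
    # Invented numbers, for illustration only.
    signups = [200, 150]
    dropouts = [100, 150]
    sizes = project_users(1000, signups, dropouts)
    print(sizes)
    print(crude_growth_rates(sizes, signups, dropouts))
```

In practice the hard part is defining a "death": an inactivity threshold (for example, no Tweets for a chosen number of months) has to be picked, and the estimated rates depend on that choice.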


The workshop drew about 150 registered participants from various disciplines, including demography, sociology, computer science and economics, and the tutorials and lightning talks prompted a number of stimulating questions and conversations. A mailing list was set up after the workshop to facilitate communication among researchers interested in Web data, social media data and demography. Anyone interested in joining the mailing list can do so by following the instructions on this webpage:


Similar training workshops will be organized at the forthcoming European Population Conference in Mainz, Germany, in August 2016 (see registration) and at the ALAP/ABEP conference in Foz do Iguaçu, Brazil, in October 2016.