Refugee or Migrant? Mixing methods for social media analysis

Adina Nerghes


Social media is often at the centre of our daily life. While it may appear that social media is filled with objective information on current events, this information is often biased, can distort facts, and can influence sentiments and the formation of opinions. For example, the recent refugee crisis – with many fleeing their countries and seeking refuge in neighbouring countries or in Europe – triggered a fierce debate on social media which spawned all kinds of labels, such as ‘refugee’, ‘migrant’, or ‘immigrant’. Such labels are not rigid units of factual information; they encode meanings, opinions, interpretations, positions, and sentiments. Moreover, labels are not just words. They can alter perceptions, influence behaviours, undermine public support, steer public opinion in a certain direction, and ultimately have serious consequences for the lives and safety of the displaced individuals that they refer to.

To understand this debate, and more specifically the sentiments associated with the different labels used to describe these events, we collected and analysed over 60,000 social media posts on the refugee crisis. While collecting such large amounts of data has become relatively easy in recent years, analysing this data to answer complex research questions, such as ‘What are the patterns of label use in online discussion of the European refugee/migrant crisis and what are the sentiments associated with these labels?’, requires a mix of conceptual, methodological and contextual knowledge drawn from various disciplines.

For this particular study, we combined theories drawn from the humanities, social and communication sciences (e.g., framing and labelling, socio-semantic networks), automatic text analysis methods drawn from computer science (e.g., topic modelling and sentiment analysis), and expert knowledge of social media platforms and the refugee crisis. Thus, the robust analysis of label use in social media debates presented in this study is inherently multidisciplinary. And while most of the methods used to analyse our data were automated, human coding (for example, sentiment identification) and contextual knowledge of the debate and corpora were also crucial to the analysis.

By mixing methods and approaches in this manner, we were able to show not only that negative sentiments permeated most of the social media debate on the refugee crisis, but also that the labels could be ordered by sentiment. In other words, certain labels employed in these debates had a more negative connotation than others. The most negative labels were those characterising displaced people as constituting a criminal threat to host societies, followed by those portraying displaced people as having relatively higher agency in crossing borders (for example, as not being in fear of their lives). Furthermore, labels related to permanence (for example, the length of time that displaced individuals are expected to remain in the host country) were more negative than those related to the expectation of economic costs incurred by the host country as a result of its taking in refugees.
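To make the idea of an ordering of sentiment across labels more concrete, here is a minimal sketch in Python. It is not the pipeline used in the study: the tiny word list, the three example labels, and the four posts are all invented for illustration, and a real analysis would rely on validated sentiment resources, human coding, and far more data. The sketch only shows the general shape of the computation: score each post, group the scores by the label the post uses, and sort the labels by their average score.

```python
import re
from collections import defaultdict

# Hypothetical toy lexicon (word -> polarity); an assumption for
# illustration only, not the lexicon or method used in the study.
LEXICON = {
    "welcome": 1, "safe": 1, "help": 1,
    "threat": -1, "illegal": -1, "burden": -1,
}

# Labels to compare; taken from the examples named in the text.
LABELS = ("refugee", "migrant", "immigrant")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def post_sentiment(tokens):
    """Sum the polarity of every token; unknown words count as 0."""
    return sum(LEXICON.get(token, 0) for token in tokens)


def rank_labels(posts):
    """Return (label, mean sentiment) pairs, most negative first."""
    scores = defaultdict(list)
    for post in posts:
        tokens = tokenize(post)
        score = post_sentiment(tokens)
        for label in LABELS:
            # Match whole tokens (singular or plural) so that
            # "immigrants" is not also counted as "migrant".
            if label in tokens or label + "s" in tokens:
                scores[label].append(score)
    means = {label: sum(vals) / len(vals) for label, vals in scores.items()}
    return sorted(means.items(), key=lambda pair: pair[1])


# Invented example posts; not drawn from the study's corpus.
posts = [
    "Refugees welcome, they need help",
    "Every migrant is a burden",
    "Illegal immigrants are a threat",
    "Keep each refugee safe",
]

ranking = rank_labels(posts)
for label, mean_score in ranking:
    print(f"{label}: {mean_score:+.2f}")
```

The ordering this prints reflects only the invented posts above and says nothing about the study's actual findings. Matching whole tokens rather than substrings is a deliberate choice here, since ‘migrant’ is contained in ‘immigrant’ and a substring match would blur the two labels.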

As social media becomes more prevalent, studies of online opinions and discussions become increasingly valuable because they offer insights into the nature and direction of focal themes and public sentiment. In turn, such themes and sentiments can have serious implications at the individual level by altering attitudes towards important issues as well as influencing processes of political socialisation and collective action at a societal level.

To conclude, analysing the response on social media to events in society requires taking a multidisciplinary approach and applying a combination of analytical methods. For example, linguistic approaches can elucidate structure and grammar, and methods from computer science can provide appropriate tools and analysis algorithms, while the humanities and social sciences can relate data and results to the social context in which they were produced and also to the mechanisms whereby words influence, and are influenced by, human behaviour. The different perspectives from which these various disciplines approach text analysis are not mutually exclusive, and they hold great potential to inform one another.

Click here for more on this study.


Adina Nerghes is a postdoctoral researcher at the Digital Humanities Lab and a communication scientist who uses digital methods to expose sentiments, opinions, and change in large text collections. Drawing on various datasets, Adina has analysed transformations in metaphor use during the financial crisis, levels of agreement in the European Parliament, opinions associated with the spread of the Zika virus, and public sentiment on the refugee crisis.

Visit Adina’s website for more information:


On Wednesday, December 12, the KNAW Humanities Cluster presents HuC LIVE! At this event, the DHLab and Digital Infrastructure departments will present their innovative research and infrastructure. The main theme of the afternoon will be bridging the gulf between science and the humanities. In this series of blogs, our guest speakers talk about why they bring science and the humanities together.

Find more information here.
