Doctoral Thesis
Inference of demographic data from digital advertising platforms based on social media
Date
2019-03-29
Author
Filipe Nunes Ribeiro
Institution
Abstract
Online Social Networks (OSNs) have grown remarkably in recent years. Facebook alone, the most popular OSN, attracted more than 500 million new users in the last two years, reaching 2.32 billion monthly active users. The revenue of OSNs is concentrated in their advertising platforms, which have evolved substantially compared with the traditional advertising model. Using OSN ad platforms, an advertiser can exploit micro-targeted advertising, selecting users with very particular characteristics from thousands of attributes such as race, gender, interests, and behaviors. In this work, we propose and develop a framework to infer demographics based on the attributes available on OSN advertising platforms. Inferring the demographics of Internet users from personal data (browsing behavior, recent purchases, etc.) is limited and challenging. However, it can be very useful for many purposes, including product recommendation, delivery of personalized content, and even the study of migration across countries. Social networks provide an ideal environment for inferring user demographics, drawing on public profiles as well as posts and user behaviors such as likes and shopping. In our framework, we leverage the aggregate information about users provided by the Facebook advertising platform to build new applications. We conducted four case studies to apply our framework. In the first case study, we applied our methodology to the US news ecosystem and show that the ideological (liberal or conservative) leaning of a news source can be accurately estimated by the extent to which liberals or conservatives are over- or under-represented among its audience. We also show how bias in a news source's audience demographics, along the lines of race, gender, age, national identity, and income, can be used to infer more fine-grained biases of the source, such as social vs. economic vs. nationalistic conservatism. We then build and deploy a system, called "Media Bias Monitor", which exposes the biases in audience demographics for over 20,000 news outlets on Facebook to any Internet user. In the second case study, we examine a specific case of malicious advertising, exploring the extent to which political ads from the Russian Internet Research Agency (IRA), run prior to the 2016 U.S. elections, exploited Facebook's targeted advertising infrastructure to efficiently target ads on divisive or polarizing topics (e.g., immigration, race-based policing) at vulnerable sub-populations.
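The idea of estimating a source's leaning from audience over- or under-representation can be illustrated with a minimal Python sketch. It is not the scoring method used in the thesis, which the abstract does not specify. The function names, the dictionary keys, the example shares, and the normalized-difference formula are all assumptions made for illustration. The sketch compares each group's share of a source's audience with its share of a baseline population and combines the two ratios into a score between -1 (liberal-leaning audience) and +1 (conservative-leaning audience).

```python
from typing import Dict


def representation_ratio(audience_share: float, baseline_share: float) -> float:
    """Return how over- (>1) or under- (<1) represented a group is
    in an audience, relative to its share of a baseline population."""
    if baseline_share <= 0:
        raise ValueError("baseline_share must be positive")
    return audience_share / baseline_share


def leaning_score(audience: Dict[str, float], baseline: Dict[str, float]) -> float:
    """Hypothetical leaning score in [-1, 1].

    Negative values mean liberals are more over-represented than
    conservatives; positive values mean the opposite. This formula is an
    illustrative assumption, not the thesis's method.
    """
    lib = representation_ratio(audience["liberal"], baseline["liberal"])
    con = representation_ratio(audience["conservative"], baseline["conservative"])
    total = lib + con
    if total == 0:
        return 0.0  # neither group appears in the audience
    return (con - lib) / total


# Made-up shares for illustration only (not data from the thesis).
baseline = {"liberal": 0.3, "conservative": 0.3}
source_audience = {"liberal": 0.6, "conservative": 0.2}
score = leaning_score(source_audience, baseline)
print(f"leaning score: {score:.2f}")
```

In practice the audience and baseline shares would come from aggregate audience estimates reported by the advertising platform. A ratio-based score is only one of several reasonable ways to summarize them.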