Ruben L. Bach

Statistics | Data Science | Survey Methodology

Hi, I’m Ruben, a postdoctoral researcher in quantitative social science research methods at the University of Mannheim (see also here). I’m especially interested in big data in the social sciences, machine learning, causal inference, and survey research.

About me

I spent three years in the graduate program of the Institute for Employment Research (IAB), where I wrote my dissertation on the behavioral consequences of repeated measurements in social science surveys, using methods of causal inference and machine learning. My dissertation recently received the Lorenz-von-Stein Award of the Mannheim Centre for European Social Research (MZES) at the University of Mannheim.

If you want to know more about me, check out my CV! If you prefer an academic-style version of my CV (including those long and boring lists like conference presentations, journals I reviewed for, workshops I organized etc.), look here.

My work

My current research projects involve using digital trace data (online and mobile web activity) and social media data (Reddit) for social research, machine learning in survey research (e.g., for response propensity models), and the various ways in which participating in a survey can influence respondents’ future behavior.

Together with colleagues from the University of Mannheim and other institutions, I recently published findings on the accuracy of predicted personal sensitive information (voting behavior and political preferences) in Social Science Computer Review (open access). You can find a PDF of the paper here.

A major concern arising from ubiquitous tracking of individuals’ online activity is that algorithms may be trained to predict personal sensitive information, even for users who do not wish to reveal such information. Although previous research has shown that digital trace data can accurately predict sociodemographic characteristics, little is known about the potentials of such data to predict sensitive outcomes. Against this background, we investigate in this article whether we can accurately predict voting behavior, which is considered personal sensitive information in Germany and subject to strict privacy regulations. Using records of web browsing and mobile device usage of about 2,000 online users eligible to vote in the 2017 German federal election combined with survey data from the same individuals, we find that online activities do not predict (self-reported) voting well in this population. These findings add to the debate about users’ limited control over (inaccurate) personal information flows.

My dissertation work on behavioral changes due to repeated survey participation was published (open access) in the Journal of the Royal Statistical Society, Series A: Statistics in Society (JRSS-A) (pdf). A second paper on changes in reporting over the waves of a panel survey (panel conditioning) has appeared in the Journal of Survey Statistics and Methodology (JSSAM). The final empirical chapter of my dissertation, on the connection between response propensities and misreporting in surveys, recently appeared in JSSAM (link).

Get in touch!

If you want to know more about me or my work, or if you are interested in collaborating, send me an email or contact me on LinkedIn or Twitter.