
 The 2016 CityU Workshop on Computational Approaches to Big Data in the Social Sciences and Humanities

(CityU CSS 2016)

Speakers

Jaime Settle is an Assistant Professor of Government at the College of William & Mary in the U.S. She received her B.A. from the University of Richmond and her Ph.D. from the University of California at San Diego. Her current research explores how social interactions (both face-to-face and on Facebook) affect how we think, feel and behave politically, as well as how innate differences between people moderate the effects of those interpersonal exposures. She uses the core methodological tools employed in political science, but is also interested in integrating tools from behavior genetics, psychophysiology, and computer science. Her work has been published in Nature, Proceedings of the National Academy of Sciences, American Journal of Political Science, Journal of Politics, and Political Science Research and Methods. She is a coauthor of “A 61-million-person experiment in social influence and political mobilization” (Nature, 2012).
Jaime Settle: The Value of Experimental Design in Social Media Research
Social scientists increasingly turn to experimental methods to better understand causal relationships. Although difficult to design and execute, experiments involving social media are especially promising because of the potential to study the dynamic spread of information, emotion, and behavior. For what kinds of research questions are experimental designs most appropriate? What sorts of challenges do these designs present? What trade-offs are involved in experimental designs exploring behaviors on social media? I will explore these ideas using examples from my own research and the field more broadly. (Slides: http://weblab.com.cityu.edu.hk/workshops/cityu-css-2016/Jaime_Settle_Online_Experiments_post.pdf)
Robert Ackland is an Associate Professor with a joint appointment in the School of Sociology and the Centre for Social Research and Methods at the Australian National University. An economist by training, Robert has been conducting quantitative research into online social and organisational networks since 2002. He leads the Virtual Observatory for the Study of Online Networks Lab (http://vosonlab.net) and he created the VOSON software for hyperlink network construction and analysis. Robert established the Social Science of the Internet specialization in the ANU's Master of Social Research in 2008, and his book Web Social Science: Concepts, Data and Tools for Social Scientists in the Digital Age (SAGE) was published in 2013.
Robert Ackland: Collecting and analyzing online social network data
This presentation provides an overview of recent developments in the conceptualization, collection and analysis of online social networks. The presentation first provides a typology of online networks and discusses whether all online networks are amenable to social network analysis. Next there is a summary of ways for collecting online network data via crawlers and APIs, and some comments on the importance for research of open source data and tools. The presentation also looks at when network visualization is useful and how to overcome the “hairball problem” of visualizing large-scale dense networks, via pruning and sampling nodes/edges. The presentation concludes with some thoughts on the role of social scientists in the big (network) data era. (Slides: http://weblab.com.cityu.edu.hk/workshops/cityu-css-2016/Rob_Ackland_Data_Colleciton.pdf)
(Wayne) Xin Zhao is an Assistant Professor at the School of Information, Renmin University of China. He received his Ph.D. from Peking University in 2014. His research interests include web text mining and natural language processing. He has published 30+ refereed papers in top international conferences and journals such as ACL, EMNLP, COLING, ECIR, CIKM, SIGIR, SIGKDD, AAAI, IJCAI, ACM TOIS, ACM TIST, IEEE TKDE, KAIS and WWWJ. He is the lead author of “Comparing Twitter and traditional media using topic models” (Advances in Information Retrieval, 2011).
Xin Zhao: Mining Social Text Data
Information is being generated on social media channels such as microblogs and forums at a rate that few anticipated. One of the most basic social data types is plain text, called social text in this talk. Social text data is complex and difficult to understand. In this talk, we will cover several important topics related to social text mining, including fundamental processing techniques, mainstream mining methods, methodological challenges, typical applications and future directions. Finally, we will present a piece of work from our research group: a demographic-based system for product recommendation on microblogs. (Slides: http://weblab.com.cityu.edu.hk/workshops/cityu-css-2016/Xin_Zhao_Mining_Social_Text_Data.pdf)
Steffen Roth is a Professor of Management and Organization at ESC Rennes School of Business, France. He was awarded a PhD in management from the Chemnitz University of Technology and a PhD in sociology from the University of Geneva. His research fields include organization and management theory, strategic management, social innovation, ideation and crowdsourcing, and culturomics. He has published in indexed journals such as Futures, Innovation: The European Journal of Social Science Research, International Journal of Entrepreneurship and Small Business, International Journal of Technology Management, International Journal of Manufacturing Technology and Management, Games and Culture, Journal of Interdisciplinary Economics, and Creativity and Innovation Management. He is the author of “Fashionable functions: A Google ngramview of trends in functional differentiation (1800-2000)” (International Journal of Technology and Human Interaction, 2014).
Steffen Roth: Futures of a shared memory: A global brain wave measurement based on Google ngram corpus (1800-2000)
If the global brain is a suitable metaphor for an emerging collective intelligence formed by human communication and ICT, then one future of research in this global brain will be in its past, which is its “shared memory”. In this talk, I’ll show that future research in this global brain will have to reclaim classical theories of social differentiation, functional differentiation in particular, to develop higher resolution images of this brain’s function and sub-functions. We use culturomics (i.e., Google Ngram Viewer) to analyze word frequency time-series plots of key concepts of social differentiation in major European languages between 1800 and 2000. The results suggest that the emerging global intelligence features distinct and not-yet conscious biases to its earlier forms of internal differentiation as well as to particular sub-functions. We speculate that an increasingly intelligent global brain will start to critically reflect upon these biases and learn how to anticipate or design its own desired futures. (Slides: http://weblab.com.cityu.edu.hk/workshops/cityu-css-2016/Stefffen_Roth_global-brain-wave-measurement.pdf)
Federico Botta is a member of the Centre for Complexity Science, Department of Mathematics, University of Warwick, Britain. His research focuses on complex social systems, aiming to provide a deeper understanding of how such systems behave. In particular, his work investigates how people interact with systems such as smartphones and the Internet. Using tools ranging from network theory to the physical and computer sciences, Botta analyses large data sets to study social systems and human behaviour. Part of his research also focuses on improving current techniques in the analysis of networked systems. He is the lead author of “Quantifying crowd size with mobile phone and Twitter data” (Royal Society Open Science, 2015).
Federico Botta: Quantifying Complex Social Systems Using Mobile Data
Mobile phones have drastically changed not only our communication habits, but also the way researchers investigate social interactions. Records of mobile phone calls contain an unprecedented amount of information about interactions between individuals, which can be localized in both space and time. This provides researchers with a granular description of user behavior. In this presentation, I will discuss the main research questions that have been successfully answered by exploiting this vast amount of mobile data and those that remain unanswered. I will also introduce some of the commonly used methodologies in the analysis of mobile phone data, using examples and visualizations of some mobile phone data sets. (Slides: http://weblab.com.cityu.edu.hk/workshops/cityu-css-2016/Federoci_Botta_Mobile_Data.pdf)
Nan Cao is an Assistant Professor in the Department of Computer Science, NYU Shanghai and a research assistant professor in the Tandon School of Engineering, NYU. He obtained his Ph.D. in Computer Science from Hong Kong University of Science and Technology and worked at IBM T.J. Watson Research Center prior to joining NYU. His research interests include information visualization and visual analytics, using novel visualization and interaction techniques to represent and analyze complex (big, dynamic, multidimensional, and multivariate) data. His work has appeared in IEEE TVCG, IEEE INFOVIS/VAST, and ACM CHI. He is the lead author of “Facetatlas: Multifaceted visualization for rich text corpora” (IEEE TVCG, 2010).
Nan Cao: Visual Analysis of User Behaviors in Social Media
Research interest is growing in visualization techniques that help interpret and analyze user behaviors in social media. In this talk, we will present visualization techniques developed for illustrating different user behaviors, with a special focus on visually analyzing anomalous user behaviors on Twitter. Traditional anomaly detection techniques face two challenges when applied to real-world scenarios: the ambiguous boundary between normal and abnormal behavior, and the lack of ground truth for algorithm validation. To tackle these problems, we will introduce advanced visual analysis techniques to analyze user behaviors, identify suspicious accounts, and estimate the potential risk, thus helping ordinary users to protect their privacy. (Slides: http://weblab.com.cityu.edu.hk/workshops/cityu-css-2016/Nan_Cao_Visual_Analysis_User_Behavior.pdf)