Crowdsourcing platforms like Amazon MTurk are being used to collect manually generated data and annotations at scale. Such platforms can be accessed programmatically and are commonly used to build hybrid human-machine systems that leverage machines to scale over large amounts of data and keep humans in the loop to increase data processing quality. After an introduction to crowdsourcing and hybrid human-machine systems, in this talk we will focus on some of the crowdsourcing research challenges around effectiveness and efficiency. We will present a study of malicious behaviours displayed by paid crowd workers. We will discuss how limiting the time available to complete a task affects the accuracy of the work done by the crowd. We will discuss how crowdsourcing task complexity can be predicted from the task design and what effect it has on crowd work efficiency. Finally, we will present ongoing work on understanding the effect of crowdsourcing work environments.
Dr. Gianluca Demartini is a Senior Lecturer in Data Science at the University of Sheffield, Information School. His research is currently supported by the UK Engineering and Physical Sciences Research Council (EPSRC) and by the EU H2020 framework program. His main research interests are Information Retrieval, Semantic Web, and Human Computation. He received the Best Paper Award at the European Conference on Information Retrieval (ECIR) in 2016 and the Best Demo Award at the International Semantic Web Conference (ISWC) in 2011. He has published more than 70 peer-reviewed scientific publications, including papers at major venues such as WWW, ACM SIGIR, VLDBJ, ISWC, and ACM CHI. He has given invited talks, tutorials, and keynotes at academic conferences (e.g., ISWC, ICWSM, WebScience, and the RuSSIR Summer School), companies (e.g., Facebook), and Dagstuhl seminars. He has been an ACM Distinguished Speaker since 2015. He served as area editor for the Journal of Web Semantics, as Student Coordinator for ISWC 2017, and as Senior Program Committee member for the AAAI Conference on Human Computation and Crowdsourcing (HCOMP), the International Conference on Web Engineering (ICWE), and the ACM International Conference on Information and Knowledge Management (CIKM). He has been a Program Committee member for several conferences, including WWW, SIGIR, KDD, IJCAI, ISWC, and ICWSM. He was co-chair for the Human Computation and Crowdsourcing Track at ESWC 2015. He co-organized the Entity Ranking Track at the Initiative for the Evaluation of XML Retrieval in 2008 and 2009. Before joining the University of Sheffield, he was a post-doctoral researcher at the eXascale Infolab at the University of Fribourg in Switzerland, a visiting researcher at UC Berkeley, a junior researcher at the L3S Research Center in Germany, and an intern at Yahoo! Research in Spain. In 2011, he obtained a Ph.D. in Computer Science from the Leibniz University of Hanover, focusing on Semantic Search.