Twente Data Science Day, 12 October 2016
Scientific and economic progress is increasingly powered by our capabilities to explore big datasets. Data is the driving force behind the successful innovation of Internet companies like Google, Twitter, and Yahoo. The need for data scientists is apparent in almost every sector of our society, including business, health care, and education.

12 October, University of Twente

Twente Data Science is a collaboration between research groups of the University of Twente to research, promote and facilitate big data analysis for all scientific disciplines. We operate by sharing expertise, ideas, our research infrastructure for big data analysis and - of course - by sharing our data. The Data Science guest lectures are kindly sponsored by 4TU Data Science.


10:30   Welcome / Coffee

10:40   Opening

10:45   Thijs Westerveld (WizeNoze): Child-friendly access to age-specific content

12:30   Lunch

13:45   Iadh Ounis (University of Glasgow): Working with Streams: Applications in Information Retrieval

15:30   Closing


University of Twente: DesignLab,
Building The Gallery, Hengelosestraat 500, Enschede, The Netherlands.


Child-friendly access to age-specific content

by Thijs Westerveld (WizeNoze, Amsterdam)

WizeNoze is an Amsterdam-based start-up that develops technology to make the Internet a more suitable place for children. The amount of online information that is targeted at children is currently small and the content that does exist is hard to find. In the main web search engines, this information gets overpowered by the plethora of information that is available for adults. Moreover, the information that is aimed at children often targets them as one homogeneous group, failing to differentiate between children of different ages and comprehension levels. WizeNoze provides a child-friendly technology platform that increases the amount of content available to children, that improves the access to this information, and that targets each child at their own comprehension level. In this talk we focus on our classification technology that, given a text, determines the comprehension level required to understand it. This classifier needs to work with a highly heterogeneous set of training data with a mix of fine-grained and coarser labels that sometimes cover overlapping comprehension level ranges. We will showcase two applications in which we use the classification technology: the editing tool that helps authors to simplify their texts and the search engine to access our collection of content for children.

Thijs Westerveld is an information retrieval specialist with an interest in both scientific and practical work. As Chief Science Officer at WizeNoze, he is responsible for identifying, inventing and assessing algorithms to solve key technical questions to give customers access to the latest state of the art in kids technology. Thijs holds a PhD in Computer Science from the University of Twente and he has over 15 years of experience in various areas of information retrieval, in both academic and industry settings.

Working with Streams: Applications in Information Retrieval

by Iadh Ounis (University of Glasgow)

Historically, the information retrieval field has focused on providing information access for largely static document corpora, which are updated infrequently. However, with the emergence of online news reporting, the growth in social media as a ubiquitous communication tool and the increasing prevalence of IoT infrastructure, we are seeing new IR tasks emerge that require truly real-time processing and updating. To support these new tasks, we have started to see classical batch processing paradigms for IR give way to a new generation of stream processing techniques, based on new platforms like Apache Storm and Apache Spark. In this talk, I will discuss how stream processing is changing the face of modern IR, and present three IR applications that make use of this new paradigm, namely: real-time search, temporal summarization, and streaming event detection.

Iadh Ounis is a Professor of Information Retrieval in the School of Computing Science at the University of Glasgow. He has been an active researcher in large-scale text retrieval and mining for over 20 years, working on search applications from billions of web documents to voluminous social media and sensor network streams. He leads a team of researchers investigating new effective and efficient data-driven approaches for a variety of modern search tasks. Over the years, he has supervised numerous PhD students and postdoctoral research assistants on the general topic of large-scale information retrieval, and has authored over 150 refereed articles and publications. Prof Ounis is the principal investigator of the open source Terrier Information Retrieval (IR) platform, widely used both in academia and industry for large-scale text mining and search applications. Prof Ounis has led many international IR projects and initiatives (e.g. the TREC Blog and Microblog evaluation forums under the auspices of the US Nat. Inst. of Standards & Technology), chaired a number of major IR-related events (e.g. ACM CIKM 2011), and been involved in several externally funded EU, RCUK and industry projects (e.g. SMART, SUPER, CROSS, ReDiTES, UBDC). He has consulted for or had research collaborations with a number of industrial organisations and currently serves as the Director of Knowledge Exchange at the Scottish Informatics and Computer Science Alliance (SICSA). Prof Ounis is also a board member and the Glasgow Hub lead for the SFC Data Lab Innovation Centre.

The Data Science Day is organized by Djoerd Hiemstra, Theo Huibers, and Dolf Trieschnigg.

