Industrial Session

Overall Acceptance Criterion in Online A/B Testing: Classical and Promising Approaches

Modern Internet companies improve their web services through data-driven decisions based on online controlled experiments, also known as A/B tests. The scale at which this state-of-the-art technique is used is impressive: for instance, the largest search engines report running hundreds of concurrent experiments per day.
An A/B test compares two variants of a service at a time, usually its current version and a new one, by exposing them to two groups of users. The aim of a controlled experiment is to detect the causal effect of a service update on the service's performance, relying on an Overall Acceptance Criterion (OAC). A typical OAC consists of a key metric, an evaluation statistic, and a statistical significance test.
The most challenging problem is to define an appropriate OAC that is both interpretable and sensitive. This talk presents a brief overview of classical and novel approaches to obtain an OAC.
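To make the three components of an OAC concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the approach presented in the talk: the key metric is a hypothetical per-user value, the evaluation statistic is the difference of group means, and the significance test is a two-sided z-test with unpooled variances (a large-sample normal approximation). The function name `evaluate_oac`, the threshold `alpha`, and the simulated data are all assumptions made for this example.

```python
import math
import random
from statistics import NormalDist, fmean, variance


def evaluate_oac(control, treatment, alpha=0.05):
    """Toy OAC over per-user metric values (hypothetical helper).

    Key metric:  one numeric value per user (e.g. sessions per user).
    Statistic:   difference of group means (treatment minus control).
    Test:        two-sided z-test with unpooled variances, which is
                 a reasonable approximation only for large samples.
    """
    diff = fmean(treatment) - fmean(control)
    # Standard error of the difference, without assuming equal variances.
    se = math.sqrt(
        variance(control) / len(control) + variance(treatment) / len(treatment)
    )
    z = diff / se
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return {"diff": diff, "z": z, "p_value": p_value, "significant": p_value < alpha}


# Simulated data (an assumption for illustration): the new version
# shifts the mean of the key metric from 10.0 to 10.3.
rng = random.Random(0)
control = [rng.gauss(10.0, 2.0) for _ in range(5000)]
treatment = [rng.gauss(10.3, 2.0) for _ in range(5000)]

result = evaluate_oac(control, treatment)
aa_result = evaluate_oac(control, control)  # A/A comparison: no effect expected
```

In this sketch, sensitivity corresponds to how small a true difference the test can flag at a given sample size, and interpretability to how directly the chosen key metric reflects service quality. Comparing a group with itself, as in `aa_result`, is a simple sanity check that the criterion does not report an effect where none exists.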

Drutsa Alexey
Yandex, researcher and software developer

Generation and analysis of speech sounds: a new approach

Abstract. As enterprises and state institutions switch to electronic document circulation, large quantities of official, legal and commercial information become available in digital form. However, its processing often poses even bigger challenges than ordinary information extraction from plain text.
The main issue is the complex structure of such documents: typically they contain both structured (tables, lists, headers) and unstructured (plain text) elements, which complicates automation. In many cases it is essential that the system use both the linguistic and the structural information present in the document.
In this talk we present an information extraction system in which these issues are addressed and to some extent resolved. ABBYY Compreno extracts data from official documents, taking into account their structure as well as their textual content. This enables us to process all sorts of complex documents, from contracts, agreements and bank statements to CVs and financial spreadsheets. Awareness of document structure improves the performance of named entity and fact extraction. During the talk we will describe the main principles and logical components of the system and show some real-life examples.

Anatoly Starostin, Maria Stepanova

Anatoly Starostin
Anatoly Starostin is the Head of the Information Extraction Technologies Research Group at ABBYY Infopoisk LLC. He holds a Master’s degree from Moscow State University (Faculty of Computational Mathematics and Cybernetics). He is currently working on a PhD thesis in the field of information retrieval.

Maria Stepanova
Maria Stepanova has been the head of the Ontology Description group at ABBYY Infopoisk LLC since 2013. In 2011 she graduated from the Department of Mathematical Linguistics, Faculty of Philology, Saint-Petersburg State University. In 2013 she received a Master’s degree in Applied Informatics from the Faculty of Arts, Saint-Petersburg State University. Maria’s scientific interests include named entity recognition and fact extraction.

Detailed Analysis of Interests of VKontakte Audience

Ann Tyshchenko, Felix Zinatullin
Cerebro Target