ECIR 2022 Tutorial: Technology-Assisted Review for High Recall Retrieval
We are collaborating with the ALTARS Workshop! Join us in the morning and attend ALTARS in the afternoon to discuss more advanced topics in TAR.
Human-in-the-loop (HITL) IR workflows are being applied to an increasing range of tasks in law, medicine, social media, and other areas. These tasks differ from ad hoc retrieval in their focus on high recall, and differ from text categorization in their need for extensive human judgment. These tasks also differ from both in their industrial scale and, often, their use of teams of multiple reviewers. In the research literature, these tasks have been variously referred to as review, moderation, annotation, or high recall retrieval (HRR) tasks. Technologies applied to these tasks have also been referred to by many names, but technology-assisted review (TAR) has emerged as a consensus term, so these tasks are also referred to as TAR tasks.
The growth in the deployment of TAR systems, combined with the many open research problems in this area, suggests this is an appropriate time for a TAR tutorial at a major IR conference. Such a tutorial would also serve as background for attendees of the TAR workshop that has been approved for ECIR 2022.
- Length: Half day.
- Target audience: Intermediate.
- Expected prerequisite knowledge: Some exposure to basics of information retrieval and machine learning.
Eugene Yang is a Research Associate at the Human Language Technology Center of Excellence at Johns Hopkins University. He has been developing state-of-the-art approaches for technology-assisted review. His Ph.D. dissertation focuses on cost reduction and analysis for TAR, including cost modeling and stopping rules for one- and two-phase workflows. He is currently working on cross-lingual human-in-the-loop retrieval approaches.
Jeremy Pickens is a pioneer in the field of collaborative exploratory search, a form of information seeking in which a group of people who share a common information need actively collaborate to satisfy it. As Principal Data Scientist at OpenText, he has spearheaded the development of Insight Predict. His ongoing research and development focuses on methods for continuous learning and the variety of real-world technology-assisted review workflows that are only possible with this approach. Dr. Pickens earned his doctoral degree at the University of Massachusetts, Amherst, Center for Intelligent Information Retrieval. Before joining Catalyst Repository Systems and later OpenText, he spent five years as a research scientist at FX Palo Alto Lab, Inc.
David D. Lewis is Chief Scientific Officer for Redgrave Data, a legal technology services company. He has researched, designed, and consulted on human-in-the-loop document classification and review systems since the early 1990s. His 1994 paper with Gale introduced uncertainty sampling, a core technique used in commercial TAR systems. This paper won an ACM SIGIR Test of Time Award in 2017. In 2005, Dave co-founded the TREC Legal Track, the first open evaluation of TAR technology. He was elected a Fellow of the American Association for the Advancement of Science in 2006 for foundational work on algorithms, data sets, and evaluation in text analytics.