ECIR 2022 Tutorial: Technology-Assisted Review for High Recall Retrieval

We are collaborating with the ALTARS Workshop! Join us in the morning and attend ALTARS in the afternoon to discuss more advanced topics in TAR.

Human-in-the-loop (HITL) IR workflows are being applied to an increasing range of tasks in law, medicine, social media, and other areas. These tasks differ from ad hoc retrieval in their focus on high recall, and from text categorization in their need for extensive human judgment. They differ from both in their industrial scale and, often, in their use of teams of multiple reviewers. In the research literature, these tasks have been variously referred to as review, moderation, annotation, or high recall retrieval (HRR) tasks. The technologies applied to them have also gone by many names, but technology-assisted review (TAR) has emerged as a consensus term, so these tasks are also referred to as TAR tasks.

The growth in the deployment of TAR systems, combined with the many open research problems in this area, suggests this is an appropriate time for a TAR tutorial at a major IR conference. The tutorial also serves as background for attendees of the ALTARS workshop on TAR at ECIR 2022.

  • Length: Half day.
  • Target audience: Intermediate.
  • Expected prerequisite knowledge: Some exposure to basics of information retrieval and machine learning.

Outline

Introduction to TAR

  • What is a TAR task
  • Comparison to other IR tasks
  • Application areas
    • Law: litigation, antitrust, investigations
    • Systematic reviews in medicine
    • Content moderation
    • Data set annotation
    • Sunshine laws, declassification, and archival tasks
    • Patent search and other high recall review tasks
  • History

Dimensions of TAR Tasks

  • Volume and temporal characteristics of data
  • Time constraints
  • Reviewer characteristics
  • Cost structure and constraints
  • Nature of classification task (single, multiple, and cascaded classifications)

TAR Workflows

  • Importance of workflow design in HITL systems
  • One-phase vs. two-phase workflows
  • Quality vs. quantity of training
  • Pipeline workflows
  • Collection segmentation and multi-technique workflows
  • When to stop?

Technology: Basics

  • Review software and traditional review workflows
  • Duplicate detection, aggregation, and propagation (see the sketch after this list)
  • Search and querying
  • Unsupervised learning and visual analytics
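
As a small illustration of duplicate detection and label propagation, here is a minimal sketch in Python. It groups exact duplicates by hashing whitespace-normalized text and copies a reviewer's decision to every copy; the document ids and labels are invented for illustration. Real systems typically add near-duplicate and email-thread detection, which this toy example does not attempt.

```python
import hashlib
from collections import defaultdict

def normalize(text):
    """Crude normalization: lowercase and collapse whitespace."""
    return " ".join(text.lower().split())

def group_exact_duplicates(docs):
    """Map a content hash to the list of document ids sharing that content."""
    groups = defaultdict(list)
    for doc_id, text in docs.items():
        digest = hashlib.sha1(normalize(text).encode("utf-8")).hexdigest()
        groups[digest].append(doc_id)
    return dict(groups)

def propagate_label(groups, labels):
    """Copy a reviewed label to every exact duplicate of a labeled document."""
    doc_to_group = {d: g for g, ds in groups.items() for d in ds}
    propagated = dict(labels)
    for doc_id, label in labels.items():
        for dup_id in groups[doc_to_group[doc_id]]:
            propagated.setdefault(dup_id, label)
    return propagated

docs = {"d1": "Quarterly  report", "d2": "quarterly report", "d3": "Budget memo"}
groups = group_exact_duplicates(docs)
print(propagate_label(groups, {"d1": "relevant"}))   # d2 inherits d1's label
```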

Technology: Supervised Learning

  • Basics of text classification
  • Data modeling and task definition
  • Prioritization vs. classification
  • Reviewing and labeling in TAR workflows
  • Relevance feedback and other active learning approaches (see the sketch after this list)
  • Implications of transductive context
  • Classifier reuse and transfer learning
  • Research questions
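
To make relevance feedback and uncertainty sampling concrete, here is a minimal sketch of an iterative TAR training loop using scikit-learn. The toy documents, the simulated oracle labels standing in for human reviewers, and the batch size are all invented for illustration; this shows the general pattern of active learning in a one-phase workflow, not any particular system's implementation.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy collection: real TAR collections have thousands to millions of documents.
docs = ["fraud claim settlement", "payment schedule fraud", "picnic planning",
        "holiday party menu", "fraudulent invoice review", "team offsite agenda"]
true_labels = np.array([1, 1, 0, 0, 1, 0])           # oracle standing in for reviewers

X = TfidfVectorizer().fit_transform(docs)
labeled = {0: 1, 3: 0}                                # seed set: one relevant, one not

BATCH, STRATEGY = 1, "uncertainty"                    # or "relevance"
while len(labeled) < len(docs):
    model = LogisticRegression().fit(X[list(labeled)], [labeled[i] for i in labeled])
    pool = [i for i in range(len(docs)) if i not in labeled]
    scores = model.predict_proba(X[pool])[:, 1]
    if STRATEGY == "relevance":                       # relevance feedback: top-scored docs
        order = np.argsort(-scores)
    else:                                             # uncertainty sampling: scores near 0.5
        order = np.argsort(np.abs(scores - 0.5))
    for j in order[:BATCH]:
        labeled[pool[j]] = true_labels[pool[j]]       # "review" the selected document
    print(f"reviewed {len(labeled)} docs, relevant found: {sum(labeled.values())}")
```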

Evaluation

  • Effectiveness measures
  • Sample-based estimation of effectiveness (see the sketch after this list)
  • Impact of category prevalence
  • Cost measures
  • Evaluating progress within a TAR project
  • Collection segmentation and evaluation
  • Choosing and tuning methods across multiple projects
  • Research questions
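
As one example of sample-based estimation of effectiveness, the sketch below estimates recall at the end of a review from a simple random sample of the unreviewed (discard) documents. The counts are hypothetical, and the normal-approximation interval is a deliberate simplification; the tutorial covers more careful estimators and the impact of low category prevalence on their variance.

```python
import math

def estimate_recall(found_relevant, discard_size, sample, z=1.96):
    """Estimate recall from a random sample of the discarded (unreviewed) documents.

    found_relevant : relevant documents found during the review
    discard_size   : number of unreviewed (discarded) documents
    sample         : list of 0/1 labels for a simple random sample of the discards
    """
    p_hat = sum(sample) / len(sample)                  # prevalence in the discards
    missed = p_hat * discard_size                      # estimated relevant docs missed
    recall = found_relevant / (found_relevant + missed)
    # Crude normal-approximation interval on the missed count, propagated to recall.
    se = math.sqrt(p_hat * (1 - p_hat) / len(sample)) * discard_size
    lo = found_relevant / (found_relevant + missed + z * se)
    hi = found_relevant / (found_relevant + max(missed - z * se, 0.0))
    return recall, (lo, hi)

# Hypothetical numbers: 8,000 relevant found; 50,000 discards; 500-doc sample with 4 relevant.
sample = [1] * 4 + [0] * 496
print(estimate_recall(8000, 50_000, sample))
```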

Stopping rules

  • Stopping rules, cutoffs, and workflow design
  • Cost targets and effectiveness targets
  • The cost landscape
  • Distinctions among stopping rules
    • Interventional, standoff, and hybrid rules
    • Gold standard vs. self-evaluation rules
    • Certification vs. heuristic rules
  • Example stopping rules (see the sketch after this list)
  • Research questions
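
To give a flavor of example stopping rules, here is a minimal sketch of one simple heuristic: stop a one-phase review when each of the last few batches has yielded almost no new relevant documents. The window, threshold, and batch history below are invented for illustration, and this is a heuristic rather than a certification rule: it offers no statistical guarantee on recall, which is exactly the kind of distinction the tutorial examines.

```python
def marginal_yield_stop(batch_relevant_counts, window=3, min_yield=2):
    """Heuristic stopping rule: stop when each of the last `window` review
    batches contained fewer than `min_yield` relevant documents."""
    if len(batch_relevant_counts) < window:
        return False                        # not enough history to decide yet
    return all(count < min_yield for count in batch_relevant_counts[-window:])

# Relevant documents found per 100-document batch in a hypothetical review.
history = [41, 35, 28, 17, 9, 4, 1, 0, 1]
for i in range(1, len(history) + 1):
    if marginal_yield_stop(history[:i]):
        print(f"heuristic says stop after batch {i}")
        break
```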

Societal context

  • TAR and the ethical obligations of attorneys
  • Bias and ethics issues in TAR for monitoring and surveillance
  • Implications of TAR for evidence-based medicine
  • Controversies in automated content moderation
  • Research questions

Summary and Future

  • TAR research and industry practice
  • Challenges in access to data
  • The potential for interdisciplinary TAR research

Presenters

Eugene Yang is a Research Associate at the Human Language Technology Center of Excellence at Johns Hopkins University. He has been developing state-of-the-art approaches for technology-assisted review. His Ph.D. dissertation focuses on cost reduction and analysis for TAR, including cost modeling and stopping rules for one- and two-phase workflows. He is currently working on cross-lingual human-in-the-loop retrieval approaches.

Jeremy Pickens is a pioneer in the field of collaborative exploratory search, a form of information seeking in which a group of people who share a common information need actively collaborate to satisfy it. As Principal Data Scientist at OpenText, he has spearheaded the development of Insight Predict. His ongoing research and development focuses on methods for continuous learning and the variety of real-world technology-assisted review workflows that are only possible with this approach. Dr. Pickens earned his doctoral degree at the University of Massachusetts Amherst, Center for Intelligent Information Retrieval. Before joining Catalyst Repository Systems and later OpenText, he spent five years as a research scientist at FX Palo Alto Lab, Inc.

David D. Lewis is Chief Scientific Officer for Redgrave Data, a legal technology services company. He has researched, designed, and consulted on human-in-the-loop document classification and review systems since the early 1990s. His 1994 paper with Gale introduced uncertainty sampling, a core technique used in commercial TAR systems. This paper won an ACM SIGIR Test of Time Award in 2017. In 2005, Dave co-founded the TREC Legal Track, the first open evaluation of TAR technology. He was elected a Fellow of the American Association for the Advancement of Science in 2006 for foundational work on algorithms, data sets, and evaluation in text analytics.