Objective
In response to the COVID-19 pandemic, the Epidemic Question Answering (EPIC-QA) track challenges teams to develop systems capable of automatically answering ad-hoc questions about the disease COVID-19, its causal virus SARS-CoV-2, related coronaviruses, and the recommended response to the pandemic. While COVID-19 has been an impetus for a large body of emergent scientific research and inquiry, the response to COVID-19 raises questions for consumers. The rapid growth of the coronavirus literature and the evolving guidelines on community response create a challenging burden not only for the scientific and medical communities but also for the general public in staying up-to-date on the latest developments. Consequently, the goal of the track is to evaluate systems on their ability to provide timely and accurate expert-level answers, as expected by the scientific and medical communities, as well as answers in consumer-friendly language for the general public.
Background
A pneumonia of unknown origin was detected in Wuhan, China, and was first reported to the World Health Organization (WHO) on December 31, 2019. By January 30, 2020, the outbreak had escalated to the point that it was declared a Public Health Emergency of International Concern. The WHO officially named the 2019 coronavirus disease "COVID-19" on February 11, 2020. By March 11, 2020, after more than 118,000 reported cases in 114 countries resulting in over 4,291 reported fatalities, the WHO formally characterized COVID-19 as a pandemic. In the United States, in March 2020, various states and cities began issuing mandatory quarantine ordinances as well as guidance on "social distancing" and forced closures of gyms, bars, and nightclubs. As of April 7, 41 states as well as the District of Columbia had issued mandatory self-quarantine directives, forbidding non-essential activity outside the home. Over this period, there has been a rapid escalation in scientific research on COVID-19 and related coronaviruses, as well as in government and community response to prevent or contain the outbreak. For example, the scientific community has sequenced the SARS-CoV-2 genome, proposed multiple potential vaccines, and explored antibody, anti-viral, and cell-based treatments. The rapid escalation of government and community response has placed a large burden on consumers, as well as on scientists and healthcare professionals, seeking to maintain up-to-date knowledge of COVID-19, the recommended response, and the necessary adjustments to their daily lives. Consequently, the EPIC-QA track at TAC 2020 aims to help reduce this burden by fostering research in the design of automatic question answering systems to support scientific and consumer inquiry into COVID-19 and the recommended response.
It is our hope that the track will stimulate research in automatic question answering, not only to support providing high-quality, timely information about COVID-19, but also so that the resulting collection can be used to develop generalizable approaches to meeting information needs in the face of varying levels of expertise.
Tasks
The 2020 EPIC-QA track involves two tasks:
- Task A (Expert QA): Teams are provided with a set of questions asked by experts and are asked to provide a ranked list of expert-level answers to each question. In Task A, answers should provide information that is useful to researchers, scientists, or clinicians.
- Task B (Consumer QA): Teams are provided with a set of questions asked by consumers and are asked to provide a ranked list of consumer-friendly answers to each question. In Task B, answers should be understandable by the general public.
While each task will have its own set of questions, many of the questions will overlap. This is by design, so that the collection can be used to explore whether the same approaches or systems can account for different types of users.
Answering Questions
In this track, answers must be in the form of consecutive sentences extracted from a single context in a single document. Below, we illustrate four consumer-friendly and four expert-level example answers extracted for the question "What is the origin of COVID-19?"; the numbered markers indicate the sentence at which each example answer begins:
Consumer Passage
(1) COVID-19 is caused by a new coronavirus. (2) Coronaviruses are a large family of viruses that are common in people and many different species of animals, including camels, cattle, cats, and bats. Rarely, animal coronaviruses can infect people and then spread between people, such as with MERS-CoV, SARS-CoV, and now with this new virus (named SARS-CoV-2). The SARS-CoV-2 virus is a betacoronavirus, like MERS-CoV and SARS-CoV. (3) All three of these viruses have their origins in bats. (4) The sequences from U.S. patients are similar to the one that China initially posted, suggesting a likely single, recent emergence of this virus from an animal reservoir.
Expert Passage
(1) It is improbable that SARS-CoV-2 emerged through laboratory manipulation of a related SARS-CoV-like coronavirus. As noted above, the RBD of SARS-CoV-2 is optimized for binding to human ACE2 with an efficient solution different from those previously predicted.[7,11] (2) Furthermore, if genetic manipulation had been performed, one of the several reverse-genetic systems available for betacoronaviruses would probably have been used.[19] (3) However, the genetic data irrefutably show that SARS-CoV-2 is not derived from any previously used virus backbone.[20] (4) Instead, we propose two scenarios that can plausibly explain the origin of SARS-CoV-2: (i) natural selection in an animal host before zoonotic transfer; and (ii) natural selection in humans following zoonotic transfer.
Contexts and sentence IDs will be provided to the participants as part of the collection. In the CORD-19 collection, contexts will correspond to paragraphs defined by the authors of the publications. In the government collection, contexts will correspond to sections identified through the HTML of government websites. Contexts longer than 15 sentences will be segmented into approximately 15-sentence chunks. Contexts will be further segmented into sentences, each associated with a unique ID. Participants will be required to provide the starting and ending IDs of the sentences that constitute each of their answers. To maintain provenance, each answer must also be associated with the document and context IDs from which it originated, as illustrated in the sketch below.
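To make this concrete, here is a minimal sketch of how a system might represent one answer. The field names (doc_id, context_id, and the per-sentence id values) are illustrative assumptions, not the official submission schema:

    # A minimal sketch of an answer record, using assumed field names rather
    # than the official submission schema. An answer is a run of consecutive
    # sentences within a single context of a single document.
    from dataclasses import dataclass

    @dataclass
    class Answer:
        doc_id: str          # document the answer came from
        context_id: str      # context (paragraph/section) within that document
        sentence_start: str  # ID of the first sentence of the answer
        sentence_end: str    # ID of the last sentence of the answer

    def make_answer(context: dict, start: int, end: int) -> Answer:
        """Build an answer from sentences start..end (inclusive) of one context."""
        sentences = context["sentences"]  # assumed: list of {"id": ..., "text": ...}
        assert 0 <= start <= end < len(sentences), "answer must stay within one context"
        return Answer(
            doc_id=context["doc_id"],
            context_id=context["context_id"],
            sentence_start=sentences[start]["id"],
            sentence_end=sentences[end]["id"],
        )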
Note:
In this task, the goal is to explore the landscape of answers asserted by the document collection. A statement that answers the question will be considered a valid answer regardless of whether it is factually accurate. The answers in this task are intended as an intermediate step, wherein one would like to explore all answers provided by the document collection: both correct answers and incorrect answers people may have discovered on their own.
Document Collection
Answers should originate from the EPIC-QA collection, which includes scientific and government articles about COVID-19, SARS-CoV-2, related coronaviruses, and information about community response. This collection consists of two parts:
Research Articles
We adapt the collection of biomedical articles released for the COVID-19 Open Research Dataset Challenge (CORD-19). The primary evaluation uses a snapshot of CORD-19 from October 22, 2020. The dataset was created by the Allen Institute for AI in partnership with the Chan Zuckerberg Initiative, Georgetown University's Center for Security and Emerging Technology, Microsoft Research, and the National Library of Medicine at the National Institutes of Health, in coordination with The White House Office of Science and Technology Policy. The CORD-19 collection includes a subset of articles in PubMed Central (PMC) as well as pre-prints from bioRxiv. Contexts in this collection will correspond to automatically identified paragraphs in the articles' abstracts or main texts.
By downloading this dataset you are agreeing to the Open COVID Pledge-compatible Dataset License for the CORD-19 dataset, which details the terms and conditions under which partner data and content are being made available. Specific licensing information for individual articles in the dataset is available in the metadata file.
Additional licensing information is available on the PMC, medRxiv, and bioRxiv websites.
11/13 Update
Due to a bug in the processing pipeline for the CORD-19 collection, we have re-released the 10/22 snapshot of CORD-19. The updated snapshot includes 236,034 documents with 4,075,478 contexts. Individual file MD5 checksums are included inside the cord-19 archive and can be downloaded separately here.
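Since each release ships with published MD5 checksums, it is worth verifying an archive after download. A minimal sketch using only the Python standard library (the file name and expected digest below are placeholders):

    # Verify a downloaded archive against its published MD5 checksum.
    # The file name and expected digest below are placeholders.
    import hashlib

    def md5_of(path: str, chunk_size: int = 1 << 20) -> str:
        digest = hashlib.md5()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    expected = "..."  # value published alongside the download link
    if md5_of("cord-19-archive.tar.gz") != expected:
        raise SystemExit("Checksum mismatch: re-download the archive.")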
Consumer Articles
We include a subset of the articles used by the Consumer Health Information Question Answering (CHIQA) service of the U.S. National Library of Medicine (NLM). This collection includes authoritative articles from the Centers for Disease Control and Prevention (CDC); the Genetic and Rare Disease Information Center (GARD); the Genetics Home Reference (GHR); MedlinePlus; the National Institute of Allergy and Infectious Diseases (NIAID); and the World Health Organization (WHO). Contexts in this collection will correspond to paragraphs or sections as indicated by the HTML markup of the document.
We also include 265 Reddit threads from /r/askscience tagged with COVID-19, Medicine, Biology, or the Human Body, and filtered for COVID-19 content.
Finally, we include a subset of the CommonCrawl News crawl from January 1 to April 30, 2020, as used in the TREC Health Misinformation Track. Documents in this subset were filtered by domain using SALSA, PageRank, and HITS, and were further filtered for COVID-19 content.
Works produced by the federal government are not copyrighted under U.S. law. You may reproduce, redistribute, and link freely to non-copyrighted content, including on social media. Documents from the WHO may be reviewed, reproduced or translated for research or private study but not for sale or for use in conjunction with commercial purposes. Additional copyright information can be obtained from their respective websites.
Collection (812 MB) MD5 Checksum
Preliminary Evaluation Collection

Research Articles
For the preliminary evaluation, we adapted the same CORD-19 collection described above, using a snapshot of CORD-19 from June 19, 2020. As with the primary collection, contexts correspond to automatically identified paragraphs in the articles' abstracts or main texts, and the same licensing terms and conditions apply.
11/18 Update
The preliminary CORD-19 collection has been updated to account for changes in the CORD-19 preprocessing pipeline. If you downloaded the preliminary CORD-19 collection before 11/18, it contained a large number of duplicated contexts, so please download the updated version (version 3).
Consumer Articles
We include a subset of the articles used by the Consumer Health Information Question Answering (CHIQA) service of the U.S. National Library of Medicine (NLM). This collection includes authoritative articles from the Centers for Disease Control and Prevention (CDC); DailyMed; the Genetic and Rare Disease Information Center (GARD); the Genetics Home Reference (GHR); the Mayo Clinic; MedlinePlus; the National Heart, Lung, and Blood Institute (NHLBI); the National Institute of Allergy and Infectious Diseases (NIAID); the World Health Organization (WHO); and the Office on Women's Health of the U.S. Department of Health & Human Services. Contexts in this collection will correspond to paragraphs or sections as indicated by the HTML markup of the document.
The same copyright considerations described above apply to these documents; additional copyright information can be obtained from the respective websites.
Collection (2.1 MB) MD5 Checksum
The documents in this collection adhere to a modified version of the CORD-19 JSON schema described here.
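As an illustration, the sketch below iterates over the contexts and sentence IDs of one document. The field names (contexts, context_id, sentences, sentence_id, text) are assumptions and should be checked against the schema description linked above:

    # Iterate over the contexts and sentences of one collection document.
    # Field names are assumed and may differ from the actual schema.
    import json

    with open("example_document.json") as f:  # placeholder path
        doc = json.load(f)

    for context in doc.get("contexts", []):
        print("context:", context.get("context_id"))
        for sentence in context.get("sentences", []):
            # each sentence carries the unique ID used to delimit answers
            print("  ", sentence.get("sentence_id"), sentence.get("text", "")[:60])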
Judgments
The human-generated answers and sentence-level answer annotations for the 21 questions judged in Task A and the 18 questions judged in Task B during the preliminary evaluation are now available. Note: these judgments were made using the preliminary evaluation collection; the context and sentence IDs may have shifted in the primary evaluation collection.
11/13 Update
The preliminary judgments have been corrected to account for an issue with the CORD-19 collection used during the preliminary evaluation. If you downloaded them before 11/13, please download them again.
Preliminary Judgments (49 KB) MD5 Checksum
Questions
In conjunction with the TREC-COVID track at the 2020 Text REtrieval Conference (TREC), we have prepared sets of approximately 45 questions. Specifically, two sets of questions will be provided: one for expert-level questions and one for consumer-level questions. By design, many of these questions will overlap, allowing us to evaluate the extent to which the background of the user affects their preference for answers. The majority of these questions originated from consumers' interactions with MedlinePlus. Additional scientific questions were developed based on group discussions from the National Institutes of Health (NIH) special interest group on COVID-19, questions asked by Oregon Health & Science University clinicians, and responses to a public call for questions.
A new set of 30 expert-level and consumer-friendly questions is provided for the final evaluation cycle. None of these questions will have been evaluated in TREC-COVID; thus, systems will not be able to rely on existing document-level relevance judgments. However, relevance judgments produced during the preliminary evaluation are provided here.
Task A Questions (12 KB) Task B Questions (10 KB)

The goal of the first, preliminary evaluation cycle is to produce data that can be used to develop systems for the final evaluation cycle in the fall. To reduce the barrier to entry, we will be using 45 topics evaluated in the fourth round of TREC-COVID. Participants are free to use the document-level relevance judgments for these topics, available here, during the preliminary evaluation cycle. Please note that in the final evaluation cycle there will be no document-level relevance judgments; they are provided only during the preliminary evaluation cycle to expedite the process.
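For teams that use these judgments, TREC distributes relevance judgments in the standard qrels layout, with one whitespace-separated record per line: topic ID, iteration, document ID, and relevance grade. A small loader sketch (the file name is a placeholder):

    # Load TREC-style document-level relevance judgments (qrels).
    # Columns: topic-id, iteration, doc-id, relevance grade.
    from collections import defaultdict

    def load_qrels(path: str) -> dict:
        relevant = defaultdict(set)
        with open(path) as f:
            for line in f:
                if not line.strip():
                    continue
                topic, _iteration, doc_id, grade = line.split()
                if int(grade) > 0:  # keep only documents judged relevant
                    relevant[topic].add(doc_id)
        return relevant

    # usage: qrels = load_qrels("trec-covid-round4.qrels")  # placeholder file name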
The results of the preliminary evaluation cycle will be provided to all participants, not just those that participated in the preliminary evaluation.
Task A
The first 45 topics evaluated in the fourth round of TREC-COVID are used as-is for Task A.
Task B
A subset of topics evaluated in the fourth round of TREC-COVID updated with consumer-friendly narratives are used for Task B.