Reboot: COVID-Cancer Project.

Reboot Rx is the nonprofit startup fast-tracking the development of affordable cancer treatments using repurposed generic drugs and AI technology. Our technology aggregates and synthesizes large amounts of data to help us find the most promising repurposing opportunities.

The COVID-19 pandemic has led to unprecedented acceleration in research and drug development, making it impossible to keep up with and sort through the high volume of potential treatments, clinical trials, and scientific publications.

Patients with certain cancers may be at increased risk of death or severe complications from COVID-19, and there is an urgent need to understand how to treat COVID-19 in the context of cancer. The majority of drugs under consideration for the treatment of COVID-19 are repurposed generic drugs, many of which have been studied extensively for the treatment of cancer and may possess either pro- or anti-cancer activity.

We were already developing our technology when the pandemic hit, and we knew that both our technology and our expertise could accelerate the search for the most relevant data. We have identified and aggregated information on COVID-19 and cancer to create a unified data resource.

Press release announcing the launch of the Reboot: COVID-Cancer Project.

Blog post describing how the Reboot: COVID-Cancer Project is a proof-of-concept of our evidence synthesis technology.

We are releasing two datasets:

Dataset 1 - The first dataset draws from recent studies and contains information on the effects of COVID-19 infection on outcomes in cancer patients. The current dataset includes 347 published clinical studies that report on outcomes across 40,136 cancer patients with COVID-19, as well as 137 registered clinical trials specific to cancer patients with COVID-19. Detailed information regarding cancer type, treatments, and other specifics of the patient populations has been extracted.

Dataset 2 - The second dataset draws on years of cancer research on drugs that are now being tested for COVID-19 and contains information on the effects of 202 investigational COVID-19 treatments on cancer outcomes (independent of COVID-19 infection). The current dataset includes 28,139 published clinical studies that report clinical outcomes related to the use of these treatments in cancer and 9,118 registered clinical trials investigating these treatments in cancer.

 
 

Both datasets are licensed by Reboot Rx, Inc. under CC BY-NC 4.0. It is our hope that these efforts will spur additional data sharing and collaboration. If you use information from the project or datasets in any communication or in any published or unpublished work, please cite this resource directly as ‘Reboot: COVID-Cancer Project. Reboot Rx, Inc., 2021, https://rebootrx.org/covid-cancer. Accessed DATE’.

We welcome feedback on the utility of this data resource. Contact us at covid@rebootrx.org to learn more or get involved.

Disclaimer: The information included in the Reboot: COVID-Cancer Project is for informational purposes only and is not intended to be used as medical advice, or as a substitute for the medical advice of a physician. Users assume full responsibility for use of the information and understand and agree that Reboot Rx, Inc. and its third party content providers are not responsible or liable for any claim, loss, or damage (including personal injury or wrongful death) resulting from its use. We disclaim any warranty concerning the accuracy, timeliness, and completeness of the information, and we are not responsible for the content of external sites.


Dataset 1: COVID-19 and Cancer

To understand the unique implications of COVID-19 for cancer patients, we set out to identify all relevant clinical data on cancer patients with COVID-19. Dataset 1 is a validated list of all published clinical studies (meta-analyses, clinical trials, observational studies, and case studies) containing information on clinical outcomes of cancer patients with COVID-19, together with all registered clinical trials, completed or in progress, that focus specifically on cancer patients with COVID-19.

Published clinical studies were identified using targeted search queries in PubMed, MedRxiv, BioRxiv, and the SSRN eLibrary, followed by rule-based approaches and extensive manual annotation to extract detailed information regarding treatments and specifics of the patient populations. This information was cross-referenced with publications extracted by the Castleman Disease Collaborative Network CORONA project where available. The majority of the non-relevant studies mention cancer but do not specifically report outcomes related to cancer patients with COVID-19.

Registered clinical trials were compiled by using targeted search queries to aggregate data from three clinical trial registries: clinicaltrials.gov, the World Health Organization's International Clinical Trials Registry Platform, and the ReDO Project’s Covid19_DB. Many trials about COVID-19 that mention cancer are not aimed at understanding the implications of COVID-19 or treatments on cancer specifically, and these were excluded.
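As a rough illustration of the registry aggregation described above, the merge-and-screen step could look something like the sketch below. This is not Reboot Rx's actual pipeline: the record fields (`trial_id`, `title`, `source`), the keyword lists, and the function names are all hypothetical, and a keyword screen like this would only produce candidates for the manual review that decides whether a trial truly focuses on cancer patients with COVID-19.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrialRecord:
    trial_id: str  # registry identifier (hypothetical field name)
    title: str
    source: str    # which registry the record came from


# Hypothetical keyword lists; the real search queries are not given in this document.
CANCER_TERMS = ("cancer", "oncology", "tumor", "leukemia", "lymphoma")
COVID_TERMS = ("covid-19", "sars-cov-2", "coronavirus")


def merge_registries(*registries):
    """Combine records from several registries, keeping the first record seen per trial ID."""
    merged = {}
    for records in registries:
        for rec in records:
            key = rec.trial_id.strip().upper()  # normalize IDs so duplicates collapse
            merged.setdefault(key, rec)
    return list(merged.values())


def mentions_both(rec):
    """Rule-based pre-screen: the title mentions both a cancer term and a COVID-19 term."""
    text = rec.title.lower()
    return any(t in text for t in CANCER_TERMS) and any(t in text for t in COVID_TERMS)


def candidate_trials(*registries):
    """Deduplicate across registries, then keep only records that pass the pre-screen."""
    return [rec for rec in merge_registries(*registries) if mentions_both(rec)]
```

Keying on a normalized trial ID means a trial listed in more than one registry is counted once. Passing the keyword screen is deliberately not treated as a final inclusion decision, since many trials mention cancer without studying it.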

 
 

Last update: April 2021

 

Dataset 2: COVID-19 Drugs and Cancer

Most treatments being tested for COVID-19 are repurposed generic drugs, and many of them also have documented effects on cancer. Dataset 2 includes a validated list of all published clinical studies (meta-analyses, clinical trials, observational studies, and case studies) that report cancer-specific outcomes related to the use of these treatments in cancer (independent of COVID-19) and all clinical trials that have been previously conducted or are actively in progress testing the effects of the drugs on cancer. This dataset draws on years of cancer research on drugs that are now being tested for COVID-19 and contains information on the effects of investigational COVID-19 treatments on cancer outcomes.

For this analysis of the potential effects of COVID-19 drugs on cancer, we selected the drugs most actively being tested for COVID-19 in clinical trials. As of November 2020, there were 202 drugs being tested for the treatment of COVID-19 in at least two interventional clinical trials worldwide (the full list of these drugs is available via ‘Explore Dataset 2 in Tableau’). Each drug was annotated with its FDA approval status, indication, patent information, and ATC drug classification. Of the 202 drugs, 141 are FDA-approved, 4 have been given FDA emergency use authorization for the treatment of COVID-19, and 57 are not FDA-approved.
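The breakdown by regulatory status can be checked with a few lines of arithmetic. The sketch below uses only the counts stated above; the dictionary keys and the `share` helper are illustrative names, not part of the dataset.

```python
# Counts by FDA status as stated in the text (as of November 2020).
fda_status_counts = {
    "FDA-approved": 141,
    "FDA emergency use authorization for COVID-19": 4,
    "not FDA-approved": 57,
}

# The three groups should account for every selected drug.
total = sum(fda_status_counts.values())
assert total == 202


def share(count, total):
    """Percentage of the total, rounded to one decimal place."""
    return round(100 * count / total, 1)


shares = {status: share(n, total) for status, n in fda_status_counts.items()}

for status, pct in shares.items():
    print(f"{status}: {fda_status_counts[status]} of {total} ({pct}%)")
```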

Published clinical studies were assembled using our evidence synthesis pipeline, which combines targeted search queries in PubMed, rule-based approaches, and machine learning models. This automated approach identifies relevant studies and extracts key information, such as the drug, whether the drug was used alone or in combination, cancer type, study type, and therapeutic association. The accuracy of the machine learning models, which were trained on manually annotated clinical studies and curated for specific tasks, ranges between 83% and 95% depending on the task. To confirm that this baseline accuracy was maintained, 197 studies were randomly sampled from the final results and manually verified.
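A spot check of this kind can be sketched as follows. This is only an illustration under stated assumptions: the source does not describe how the sample was drawn or how the result was summarized, so the seeded sampler, the function names, and the use of a Wilson score interval to put bounds on the observed accuracy are our own choices for the example.

```python
import math
import random


def draw_audit_sample(study_ids, sample_size=197, seed=0):
    """Draw a reproducible simple random sample (without replacement) for manual review."""
    rng = random.Random(seed)  # fixed seed so the same sample can be re-drawn later
    return rng.sample(list(study_ids), sample_size)


def wilson_interval(correct, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95% coverage)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = correct / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Usage: sample 197 of the 28,139 studies, then summarize the manual review.
sample = draw_audit_sample(range(28_139))
n_verified_correct = len(sample)  # placeholder only; the real count comes from manual review
low, high = wilson_interval(n_verified_correct, len(sample))
```

An interval is more informative than a single percentage here because 197 is a small fraction of the 28,139 studies, so the observed accuracy carries sampling uncertainty.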

Registered clinical trials were compiled from clinicaltrials.gov using targeted search queries, automated mapping, rule-based screening, and manual validation. This approach identifies relevant trials (as opposed to those that might mention cancer but not actually test a drug in a cancer population) and extracts key information, such as cancer and tissue type, in addition to the metadata available via the clinicaltrials.gov API. All trials were manually validated for their cancer type and tissue type classification. To ensure accuracy in our data aggregation and extraction methods, 100 studies were randomly selected and manually verified to be accurate.

 
 

Last update: December 2020

 

Acknowledgements

We are grateful for an amazing team that worked tirelessly on this project, including many interns and collaborators. Without a creative and open effort, this data release would not have been possible. THANK YOU.

Reboot Rx:
Catherine Del Vecchio Fitz, PhD, MSM; Co-founder and former CSO, Reboot Rx
Laura Kleiman, PhD; Founder and CEO, Reboot Rx
Pradeep Mangalath, MBA, MS; Co-founder and COO/CTO, Reboot Rx
Devon Crittenden; Program Associate, Reboot Rx
Anne Lin; Biomedical Scientist, Reboot Rx
Abby Mynahan, Colby College
Allison Britt, Bowdoin College
Allyson Imbacuan, Boston University
Ellie Strauss, Bates College
Emily Duffy, Bates College
Emily van der Veen, Colby College
Emily Yang, Stanford University
Emily Zhu, Brown University
Eva Verzani, Bowdoin College
Gabby Farrell, Bowdoin College
Ishita Mahajan, University of Virginia
Jennifer Griffith, Brown University
Katie McKinley, Colby College
Kelly Fan, Brown University
Kriti Sharda, University of Connecticut
Lailoo Perriello, Bowdoin College
Mallika Pajjuri, MIT
Murshea Tuor, Tufts University
Noopur Ranganathan, MIT
Sam Marchant, Colby College
Shailesh Advani, PhD, Terasaki Institute for Biomedical Innovation
Tanvi Kongara, University of Pennsylvania
Tommy Bhangdia, Tufts University
Yuyuan Lin, Yale School of Public Health

Collaborators at The Francis Crick Institute and University College London:
Charles Swanton, FRCP, BSc, PhD; Group Leader, Crick Institute
Chris Bailey, MS, MB ChB; Clinical Research Fellow, Crick Institute
James Black, PhD student, University College London
Kevin Litchfield, MS, PhD; Group Leader, University College London

Collaborators at Northeastern University Khoury College of Computer Sciences:
Byron Wallace, PhD; Assistant Professor
Ben Nye, PhD student
Eric Lehman, Research Assistant

Collaborators at IBM’s Science for Social Good Initiative:
Ioana Baldini, PhD
Mihaela Bornea, PhD
Dmitriy Katz-Rogozhnikov, PhD
Sara Rosenthal, PhD
Shivashankar Subramanian, IBM Science for Social Good Fellow (University of Melbourne)
Kush Varshney, PhD