Therapeutic Data Commons Benchmark Competition 2022

By: Katana Graph

April 18, 2022

The overarching goal of biomedical research is to develop therapeutics to cure diseases and improve human health. The Therapeutic Data Commons (TDC) provides datasets that represent several aspects of drug discovery and are organized for processing by machine learning and AI. According to the TDC website, making these datasets available to the public helps data and life scientists “translate algorithmic innovation into biomedical and clinical implementation.”

Domain scientists identify meaningful tasks and datasets, and AI and machine learning experts design powerful models to solve the assigned problems. Each problem can be addressed at several levels: predicting properties, generating a new desirable entity, developing learning methods, and evaluating a machine learning model’s performance.

In the benchmark competition, models are trained using datasets provided by the TDC. Results are submitted to the TDC leaderboard, where each model is evaluated for repeatability, reproducibility, and reusability. TDC challenges include a variety of learning tasks, such as target discovery, activity screening, and efficacy and safety for small molecules, antibodies, and vaccines. Because this sort of innovation has wide-reaching implications for technological and medical advances, TDC provides public benchmarks with performance metrics.

At last year’s ScaledML conference, Katana Graph’s Keshav Pingali mentioned prior success in the TDC competition, citing a 6% improvement in accuracy over the previous best result on a problem involving toxicity prediction for an unknown entity (watch his full ScaledML conference presentation). This year, Katana Graph’s AI team invented a graph machine learning technique to solve the twenty-two problems in the TDC competition, ranking first in eight problems and second in two others.

Several of the challenges presented by the TDC competition are regression and binary classification tasks, for which Katana Graph’s AI team built a graph convolutional network model. They used this model to predict, with increased precision and speed, properties related to drug consumption such as half-life, concentration in body tissue or blood, and barrier penetration. The model was also applied to other tasks, such as determining whether a compound acts as a substrate or inhibits a protein.
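For readers unfamiliar with graph convolutional networks, the following is a minimal sketch of the general idea, not Katana Graph's model, whose architecture is not described in this post. It assumes the widely used symmetric-normalization propagation rule, a mean-pool readout over atoms, and random untrained weights; the toy three-atom graph, the feature encoding, and every function and variable name are hypothetical and chosen only for illustration.

```python
import numpy as np

def normalized_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for an undirected graph."""
    a_hat = adj + np.eye(adj.shape[0])      # add self-loops
    deg = a_hat.sum(axis=1)                 # node degrees including self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(adj, feats, w1, w2, w_out, task="regression"):
    """Two graph-convolution layers, mean pooling, then one output value."""
    a_norm = normalized_adjacency(adj)
    h = np.maximum(a_norm @ feats @ w1, 0.0)  # layer 1: aggregate neighbors, ReLU
    h = np.maximum(a_norm @ h @ w2, 0.0)      # layer 2: aggregate neighbors, ReLU
    graph_vec = h.mean(axis=0)                # one vector per molecule
    score = float(graph_vec @ w_out)
    if task == "binary":
        return float(1.0 / (1.0 + np.exp(-score)))  # sigmoid -> probability
    return score                                    # raw value for regression

# Hypothetical toy molecule: a three-atom chain (for example C-C-O).
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
# Assumed one-hot atom features: column 0 = carbon, column 1 = oxygen.
feats = np.array([[1.0, 0.0],
                  [1.0, 0.0],
                  [0.0, 1.0]])

# Untrained random weights; a real model would learn these from data.
rng = np.random.default_rng(0)
w1 = rng.normal(size=(2, 8))
w2 = rng.normal(size=(8, 8))
w_out = rng.normal(size=8)

pred_reg = gcn_forward(adj, feats, w1, w2, w_out, task="regression")
pred_bin = gcn_forward(adj, feats, w1, w2, w_out, task="binary")
print("regression output:", pred_reg)
print("binary probability:", pred_bin)
```

The same network body serves both task types: a regression task such as half-life reads the pooled score directly, while a binary task such as barrier penetration passes it through a sigmoid. Because the readout averages over atoms, the prediction does not depend on the order in which atoms are listed.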

These and similar problems and challenges have been part of the medical industry for decades. As technology adapts to meet such demands, tackling these tasks from a new perspective with more nuanced methods brings the community ever closer to realizing the potential for innovation. Shorter drug discovery timeframes, better-targeted experimentation, and prediction of molecular activity all hold promise for faster, more reliable medicine for everyone.

Whether your data is structured or unstructured, Katana Graph can greatly improve the insights and opportunities uncovered from all your organization’s data. Schedule a meeting or take a look at our purpose-built solutions to drive innovation across a spectrum of use cases.

