Solicited Cough Sound Analysis for Tuberculosis Triage Testing: The CODA TB DREAM Challenge Dataset

Summary

This paper describes and provides access to a large multi-country database of cough sounds from individuals being evaluated for tuberculosis (TB). The dataset includes more than 700,000 cough sounds from 2,143 individuals, along with detailed demographic, clinical, and microbiologic information.

The CODA TB DREAM Challenge paper presents a study aimed at improving tuberculosis (TB) diagnosis through the analysis of cough sounds. The motivation is twofold: TB is a leading infectious disease killer globally, and current screening methods fail to meet the World Health Organization's accuracy targets for TB triage tests.

The paper details how the dataset was assembled from individuals being evaluated for TB in multiple countries. It comprises over 700,000 cough sounds from 2,143 participants, together with detailed demographic, clinical, and microbiologic diagnostic information. The dataset was created to support the development of artificial intelligence models for cough sound analysis, with the ultimate goal of improving TB diagnosis at the point of care. The initiative is a step forward for the field of acoustic epidemiology and illustrates the potential of digital cough monitoring to aid TB screening and diagnosis.

Methodologically, the study describes the recruitment of participants in multiple countries, all of whom were undergoing evaluation for TB. Cough sounds were collected using smartphones running the Hyfe research platform, which allowed recordings to be gathered across sites and yielded a diverse set of cough samples. The dataset was subsequently divided into training and validation sets to support the development and testing of predictive models.
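As an illustration of the training/validation division mentioned above, the following is a minimal sketch of a participant-level split, in which every cough from a given participant lands in only one set so that a model is not evaluated on people it has already heard. This summary does not state how the challenge organizers performed the split, so the grouping strategy, the 20% validation fraction, the function name `split_by_participant`, and the `(participant_id, cough_id)` record layout are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_participant(records, validation_fraction=0.2, seed=0):
    """Split cough records so each participant appears in exactly one set.

    `records` is a list of (participant_id, cough_id) pairs. Both field
    names are hypothetical and not taken from the dataset's actual schema.
    """
    # Group cough identifiers under the participant who produced them.
    by_participant = defaultdict(list)
    for participant_id, cough_id in records:
        by_participant[participant_id].append(cough_id)

    # Shuffle participants (not individual coughs) reproducibly.
    participants = sorted(by_participant)
    random.Random(seed).shuffle(participants)

    # Assumed validation fraction; keep at least one participant held out.
    n_validation = max(1, round(len(participants) * validation_fraction))
    validation_ids = set(participants[:n_validation])

    train = [(p, c) for p, c in records if p not in validation_ids]
    validation = [(p, c) for p, c in records if p in validation_ids]
    return train, validation


# Toy data: 10 made-up participants with 3 coughs each.
records = [(f"P{i}", f"P{i}_cough{j}") for i in range(10) for j in range(3)]
train, validation = split_by_participant(records)
```

Splitting on participants rather than on individual coughs matters here because each participant contributes many recordings; a cough-level split would place near-duplicate sounds from the same person on both sides and could inflate validation scores.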

The study also describes the data collection process in detail, including participant demographics, clinical data collection, TB reference standard testing, and cough recording procedures using Hyfe's technology. This documentation supports the reliability of the dataset and opens avenues for future research on cough-based diagnostic tools for TB.

The CODA TB DREAM Challenge Dataset represents a novel and promising approach to TB screening and diagnosis, leveraging advancements in artificial intelligence and machine learning to interpret cough sounds. By providing a detailed and accessible dataset, the study empowers researchers worldwide to contribute to the fight against TB through innovative diagnostic solutions.

Latest Publications