RadNLP 2024 (Natural Language Processing for Radiology) is a shared task focusing on the application of natural language processing in radiology. The organizing team prepares the tasks, publishes the dataset, and calls for solutions from participants.
The aim of RadNLP 2024 is to create open medical data and to contribute insights back to the medical and informatics communities.
RadNLP 2024 is held as a core task of the international conference NTCIR-18, organized by the National Institute of Informatics in Japan.
The objective of RadNLP 2024 is to automatically determine the stage (i.e., the degree of progression) of lung cancer from radiology reports.
Radiology reports are clinical documents authored by radiologists and sent to referring clinicians, in which medical images such as CT and MRI are described and interpreted.
Radiology reports are rich in information related to cancer staging, which can be essential for clinical or research purposes. However, radiology reports do not always specify the stage of the cancer explicitly¹, which imposes extra workload on human experts to read them through and extract information manually.
RadNLP 2024 aims to aid clinical practice by automating cancer staging from radiology reports using natural language processing techniques.
[UPDATED] The RadNLP 2024 datasets contain 243 radiology reports for each language track:
English Track
The dataset consists of 243 English radiology reports.
Japanese Track
The dataset consists of 243 Japanese radiology reports.
Participants are welcome to join either the English track, the Japanese track, or both. Scoring and ranking will be conducted independently for each track.
Our datasets contain NO personal health information. The radiology reports are not derived from real medical institutions; they were created through crowdsourcing, by having contributors diagnose de-identified images on Radiopaedia². Task participants do not need to go through a complex application procedure to use our datasets.
2 Nakamura Y et al. Clinical Comparable Corpus Describing the Same Subjects with Different Expressions. Stud Health Technol Inform 2022;290:253-257.
RadNLP 2024 is a multi-label document classification task to correctly determine T, N, and M categories for each radiology report.
Gold standard labels for the training and validation datasets are provided as CSV tables. Each table has four columns (ID, T, N, and M):
ID | T | N | M |
---|---|---|---|
1 | 3 | 3 | 1 |
2 | 1 | 0 | 0 |
3 | 3 | 1 | 0 |
4 | 2 | 0 | 0 |
... | ... | ... | ... |
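As a rough illustration of working with such a label table, the sketch below parses it with Python's standard `csv` module. It assumes a comma-separated file with a header row `ID,T,N,M`; the inline sample mirrors the example rows above, and the function name `load_labels` is hypothetical rather than part of any official RadNLP tooling.

```python
import csv
import io

# Hypothetical excerpt mirroring the example rows above. The comma delimiter
# and the exact header spelling are assumptions about the real files.
SAMPLE_CSV = """ID,T,N,M
1,3,3,1
2,1,0,0
3,3,1,0
4,2,0,0
"""


def load_labels(handle):
    """Return {report ID: {"T": ..., "N": ..., "M": ...}} from a CSV file object."""
    labels = {}
    for row in csv.DictReader(handle):
        # Keep categories as strings so values are stored exactly as written.
        labels[row["ID"]] = {key: row[key] for key in ("T", "N", "M")}
    return labels


labels = load_labels(io.StringIO(SAMPLE_CSV))
print(labels["1"])
```

For a file on disk, the same function could be called with an open file handle in place of the `io.StringIO` object.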
We provide a separate, empty CSV table for the prediction results on the test set. Task participants should fill in this table and submit it to the task organizer team during the task period. After the submission deadline, submission scores are calculated and sent back to the task participants.
The registration period has not yet started.
We will post an announcement on this page once registration opens.
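Filling in the prediction table might look like the following minimal sketch. It assumes the empty table already lists the test-set IDs with blank T, N, and M cells, which is a guess about the file layout; `predict_tnm` is a hypothetical placeholder standing in for a participant's own model, and the IDs shown are invented for illustration.

```python
import csv
import io

# Hypothetical empty submission table: IDs present, T/N/M left blank.
# Both the layout and the ID values are assumptions for illustration only.
EMPTY_TABLE = "ID,T,N,M\n101,,,\n102,,,\n"


def predict_tnm(report_id):
    """Placeholder for a participant's model; always returns a fixed guess."""
    return {"T": "1", "N": "0", "M": "0"}


def fill_table(empty_handle, out_handle, predictor):
    """Copy each row of the empty table, overwriting T, N, and M with predictions."""
    reader = csv.DictReader(empty_handle)
    writer = csv.DictWriter(
        out_handle, fieldnames=["ID", "T", "N", "M"], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        row.update(predictor(row["ID"]))
        writer.writerow(row)


out = io.StringIO()
fill_table(io.StringIO(EMPTY_TABLE), out, predict_tnm)
submission = out.getvalue()
print(submission)
```

Keeping the predictor as a plain function argument makes it easy to swap in a real model without touching the file-handling code.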
Yuta Nakamura
Department of Computational Diagnostic Radiology and Preventive Medicine, the University of Tokyo Hospital
Shouhei Hanaoka
Department of Radiology, Graduate School of Medicine, the University of Tokyo
Eiji Aramaki
Social Computing Laboratory, Nara Institute of Science and Technology
Shuntaro Yada
Social Computing Laboratory, Nara Institute of Science and Technology
Koji Fujimoto
Department of Real World Data Research and Development, Graduate School of Medicine, Kyoto University
Michael Krauthammer
Department of Quantitative Biomedicine, University of Zurich
Jonas Kluckert
Institute of Diagnostic and Interventional Radiology, University Hospital Zurich
E-mail: radnlp [at] googlegroups.com