RadNLP 2024 shared task
Natural Language Processing for Radiology (NTCIR-18)

RadNLP 2024

Multi-label sentence & document classification for lung cancer staging

RadNLP 2024 (Natural Language Processing for Radiology) is a shared task at the international conference NTCIR-18, organized by the National Institute of Informatics in Japan.

We propose the tasks, publish the dataset, and call for solutions from participants. RadNLP 2024 aims to create open medical data and contribute insights back to the medical and informatics communities.

News

Task overview

1. Motivation

RadNLP 2024 aims to automatically determine the stage (i.e., the degree of progression) of lung cancer from radiology reports.

Radiology reports are documents in which medical images such as CT and MRI are described and interpreted. They are authored by radiologists and sent to clinicians.

Radiology reports are rich in information related to cancer staging, which can be essential for clinical or research purposes. However, radiology reports do not always specify the stage of the cancer explicitly¹, so human experts must read them through and extract the information manually, which adds to their workload.

RadNLP 2024 aims to aid clinical practice by automating cancer staging from radiology reports using natural language processing techniques.

¹ Sexauer R et al. Towards more structure: comparing TNM staging completeness and processing time of text-based reports versus fully segmented and annotated PET/CT data of non-small-cell lung cancer. Contrast Media Mol Imaging 2018:5693058.

2. Dataset

Our datasets contain NO personal health information. The radiology reports are not derived from real medical institutions; they were created through crowdsourcing by diagnosing de-identified images on Radiopaedia². Task participants do not need to go through a complex application process to use our datasets.

² Nakamura Y et al. Clinical Comparable Corpus Describing the Same Subjects with Different Expressions. Stud Health Technol Inform 2022;290:253-257.

3. Tracks

RadNLP 2024 offers two independent tracks, the English track and the Japanese track:

English Track

The dataset consists of English radiology reports.

Japanese Track

The dataset consists of Japanese radiology reports.

Participants are welcome to join the English track, the Japanese track, or both.

Scoring and ranking will be conducted independently for each track.

4. Tasks

[UPDATED] RadNLP 2024 consists of two tasks, a sub task and a main task:

Sub task

An auxiliary task to classify sentences in radiology reports.

Main task

Automated lung cancer staging, the goal of this shared task.

Sub task: sentence classification

The sub task is a multi-class topic classification of the individual sentences in every radiology report.

Note that sentences that are not relevant to cancer staging may have no gold standard labels.

Further details will be announced later in this section.

Main task: multi-label document classification for lung cancer staging

The main task is a multi-label document classification that determines the T, N, and M categories for each radiology report.

Gold standard labels for the training and validation datasets are provided as CSV tables. Each table has four columns (ID, T, N, and M):

  • ID: a unique integer assigned to each radiology report.
  • T: gold standard for the T category.
  • N: gold standard for the N category.
  • M: gold standard for the M category.
ID   T    N    M
1    3    3    1
2    1    0    0
3    3    1    0
4    2    0    0
...  ...  ...  ...
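
As an illustration of the table layout described above, the following is a minimal sketch of reading such a gold standard table in Python. It assumes a comma-separated file whose header row is exactly ID,T,N,M; the delimiter, the helper name load_gold_labels, and the in-memory sample are assumptions made for this example, not part of the official data release. The sample rows are the ones shown in the table above.

```python
import csv
import io

# Sample rows copied from the example table; a real file would be read from disk.
# The comma delimiter and exact header spelling are assumptions.
SAMPLE_CSV = """ID,T,N,M
1,3,3,1
2,1,0,0
3,3,1,0
4,2,0,0
"""


def load_gold_labels(csv_text):
    """Return {report ID: {"T": ..., "N": ..., "M": ...}} parsed from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    labels = {}
    for row in reader:
        # Labels are kept as strings because the full label vocabulary
        # is not specified in this overview.
        labels[int(row["ID"])] = {key: row[key] for key in ("T", "N", "M")}
    return labels


gold = load_gold_labels(SAMPLE_CSV)
print(gold[1])
```

Keeping the three categories in one dictionary per report makes it easy to compare a system's output against the gold standard one category at a time.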

We also provide an empty CSV table to fill in with the prediction results for the test set. Task participants should fill in the table and submit it to the task organizer team during the task period. After the submission deadline, submission scores are calculated and sent back to the task participants.
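
The fill-in step could be sketched as follows. This is only an illustration under stated assumptions: it supposes that the empty table already lists the test report IDs with blank T, N, and M cells, and that the file is comma-separated. The sample IDs, the function fill_submission, and the stand-in predictor placeholder_predict are hypothetical names introduced for this example; the actual file layout is defined by the organizers.

```python
import csv
import io

# Hypothetical empty submission table: IDs present, T/N/M cells left blank.
EMPTY_TABLE = """ID,T,N,M
101,,,
102,,,
"""


def placeholder_predict(report_id):
    """Stand-in for a real model; always returns the same categories."""
    return {"T": "1", "N": "0", "M": "0"}


def fill_submission(empty_csv_text, predict):
    """Fill the T, N, and M columns for every ID and return the CSV text."""
    reader = csv.DictReader(io.StringIO(empty_csv_text))
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=["ID", "T", "N", "M"], lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        prediction = predict(int(row["ID"]))
        # Keep the original ID untouched and overwrite only the label cells.
        writer.writerow({"ID": row["ID"], **prediction})
    return out.getvalue()


submission = fill_submission(EMPTY_TABLE, placeholder_predict)
print(submission)
```

In practice, placeholder_predict would be replaced by a system that reads the report text for the given ID and returns its predicted categories.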

Important dates

How to participate

1. Prepare your email address and click the link below:

2. Read the instructions carefully and open the online registration form.

3. In the registration form, choose “Yes” in Question 12, and check the track(s) to join in Question 13.

* You do not need to decide at the time of registration whether to participate in the sub task, the main task, or both.

FAQ

Yes. You may use the Japanese track’s dataset to solve the English track, or the English track’s dataset to solve the Japanese track.

If you re-use one track’s dataset in the other, please describe the details in your system paper for the sake of reproducibility.

Our policy is that RadNLP 2024 is a workshop rather than a competition: we welcome participants’ diverse approaches and aim to share the insights widely.

Yes. We do not impose any specific limits on the use of external resources, including models, dictionaries, corpora, or datasets.

If you use external resources, please describe the details in your system paper for the sake of reproducibility.

Our policy is that RadNLP 2024 is a workshop rather than a competition: we welcome participants’ diverse approaches and aim to share the insights widely.

Should you use external resources, please handle them with the utmost care so as not to violate human rights, especially privacy.

If you have any other questions, please feel free to contact us.

About us

Organizers

Co-chair
Yuta Nakamura

Department of Computational Diagnostic Radiology and Preventive Medicine, the University of Tokyo Hospital

Co-chair
Shouhei Hanaoka

Department of Radiology, Graduate School of Medicine, the University of Tokyo

Co-chair
Eiji Aramaki

Social Computing Laboratory, Nara Institute of Science and Technology (NAIST)

Co-chair
Shuntaro Yada

Social Computing Laboratory, Nara Institute of Science and Technology (NAIST)

Adviser
Koji Fujimoto

Department of Advanced Imaging in Medical Magnetic Resonance, Graduate School of Medicine, Kyoto University

Adviser
Jonas Kluckert

Institute of Diagnostic and Interventional Radiology, University Hospital Zurich

Adviser
Michael Krauthammer

Department of Quantitative Biomedicine, University of Zurich

Collaborators

Past shared tasks

RadNLP 2024 has two preceding shared tasks, whose websites are available below:

Contact

E-mail: radnlp [at] googlegroups.com
