RadNLP 2024 shared task
Natural Language Processing for Radiology (NTCIR-18)

RadNLP 2024 (Natural Language Processing for Radiology) is a shared task at the international conference NTCIR-18, organized by the National Institute of Informatics in Japan.

We propose the tasks, publish the dataset, and call for solutions from participants. RadNLP 2024 aims to create open medical data and contribute insights back to the medical and informatics communities.

Task overview

1. Motivation

RadNLP 2024 aims to automatically determine the stage (i.e., the degree of progression) of lung cancer from radiology reports.

Management of lung cancer is based on the stage, and radiology reports provide various related information by describing medical images such as CT and MRI.

However, radiology reports do not always specify the stage explicitly¹. This imposes extra workload on human experts for careful manual information extraction, which can be aided by automation.

1 Sexauer R et al. Towards more structure: comparing TNM staging completeness and processing time of text-based reports versus fully segmented and annotated PET/CT data of non-small-cell lung cancer. Contrast Media Mol Imaging 2018:5693058.

2. Dataset

Our datasets contain NO personal health information. The radiology reports are not derived from real medical institutions; they were created through crowdsourcing, by diagnosing de-identified images on Radiopaedia². Task participants need no complex application procedure to use our datasets.

2 Nakamura Y et al. Clinical Comparable Corpus Describing the Same Subjects with Different Expressions. Stud Health Technol Inform 2022:290:253-257.

3. Tracks and Tasks

RadNLP 2024 opens two independent tracks, English Track and Japanese Track:

English Track

The dataset consists of English radiology reports.

Japanese Track

The dataset consists of Japanese radiology reports.

Participants are welcome to join either the English track, the Japanese track, or both. Scoring and ranking will be conducted independently for each track.

RadNLP 2024 also consists of two tasks, a sub task and a main task:

Sub task

Auxiliary task to classify sentences in radiology reports.

Main task

Automated lung cancer staging, the goal of this shared task.

3-1. Sub task: document segmentation

[UPDATED] The sub task is document segmentation: identifying up to eight spans related to the following topics:

(i) Measure  –  Span describing mainly the existence and diameter of the primary lesion.

(ii) Extension  –  Span describing the range of the primary lesion’s extension outside the lung parenchyma.

(iii) Atelectasis  –  Span pointing out atelectasis or obstructive pneumonia.

(iv) Satellite  –  Span pointing out intrapulmonary metastasis or lymphangitis carcinomatosa.

(v) Lymphadenopathy  –  Span pointing out enlarged regional lymph nodes.

(vi) Pleural  –  Span pointing out pleural/pericardial effusion/dissemination.

(vii) Distant  –  Span pointing out distant metastasis outside the lung parenchyma.

(viii) Omittable  –  Span without any findings that are positive or related to lung cancer staging.

In NLP terminology, this sub task is multi-label binary classification at the sentence level.

Segmentation is done at the sentence level. A topic span may be discontinuous, and the same sentence may belong to more than one topic.

Participants are requested to determine whether each sentence falls into categories (i) to (viii). If it does, mark it as “1,” and if it does not, mark it as “0.” Therefore, each sentence requires eight binary answers.

Note that every sentence will either fall into one or more of categories (i) to (vii), or it will solely fall into category (viii).
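The labeling rule above can be checked mechanically. The following is a minimal sketch, not an official tool: the topic names in TOPICS and the function name is_valid_label_vector are hypothetical, and it assumes each sentence's answers are stored as a list of eight 0/1 values in the order (i) to (viii).

```python
# Hypothetical topic names, in the order of categories (i) to (viii).
TOPICS = ["measure", "extension", "atelectasis", "satellite",
          "lymphadenopathy", "pleural", "distant", "omittable"]


def is_valid_label_vector(labels):
    """Return True if an eight-element 0/1 vector obeys the stated constraint."""
    if len(labels) != len(TOPICS) or any(v not in (0, 1) for v in labels):
        return False
    staging_hits = sum(labels[:7])  # categories (i) to (vii)
    omittable = labels[7]           # category (viii)
    # Exactly one case must hold: at least one staging topic, or omittable alone.
    return (staging_hits >= 1) != (omittable == 1)


# A sentence tagged as both Measure and Lymphadenopathy is valid.
print(is_valid_label_vector([1, 0, 0, 0, 1, 0, 0, 0]))
# A sentence tagged as both Measure and Omittable violates the constraint.
print(is_valid_label_vector([1, 0, 0, 0, 0, 0, 0, 1]))
```

A check like this could be run over a whole prediction file before submission to catch sentences with no label at all or with Omittable combined with another topic.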

3-2. Main task: multi-label document classification for lung cancer staging

The main task is multi-label document classification: determining the correct T, N, and M categories for each radiology report.

Gold standard labels for the training and validation datasets are provided as CSV tables. Each table has four columns (ID, T, N, and M):

  • ID: unique integers assigned to each radiology report.
  • T: gold standard for the T category.
  • N: gold standard for the N category.
  • M: gold standard for the M category.
ID    T     N     M
1     3     3     1
2     1     0     0
3     3     1     0
4     2     0     0
...   ...   ...   ...

We provide a separate, empty CSV table for the prediction results on the test set. Task participants should fill in this table and submit it to the task organizer team during the task period. After the submission deadline, submission scores are calculated and sent back to the task participants.
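The table layout described above could be read and written roughly as follows. This is only a sketch under assumptions: it presumes a comma-separated file with a header row ID,T,N,M, and the helper names read_label_table and write_prediction_table are hypothetical. Label values are kept as strings so that nothing is assumed about the label vocabulary.

```python
import csv
import io

FIELDS = ["ID", "T", "N", "M"]


def read_label_table(text):
    """Parse CSV text with ID, T, N, M columns into a dict keyed by report ID."""
    reader = csv.DictReader(io.StringIO(text))
    return {int(row["ID"]): {k: row[k] for k in ("T", "N", "M")} for row in reader}


def write_prediction_table(predictions):
    """Serialize {ID: {"T": ..., "N": ..., "M": ...}} to CSV text, sorted by ID."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for report_id in sorted(predictions):
        writer.writerow({"ID": report_id, **predictions[report_id]})
    return buffer.getvalue()


# The example rows from the task description, assumed to be comma-separated.
gold_text = "ID,T,N,M\n1,3,3,1\n2,1,0,0\n3,3,1,0\n4,2,0,0\n"
gold = read_label_table(gold_text)
print(gold[1])
print(write_prediction_table(gold))
```

In practice the text would come from the distributed files rather than an inline string; the in-memory version is used here only to keep the example self-contained.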

Important dates

How to participate

1. Prepare your email address and click the link below:

2. Read the instructions carefully and open the online registration form.

3. In the registration form, choose “Yes” in Question 12, and check the track(s) to join in Question 13.

* You do not need to decide at registration whether to participate in the sub task, the main task, or both.

FAQ

We expect that the sub task will support the main task.

Our aim is to apply NLP to staging (i.e., the main task), which was also the focus of the previous shared task at NTCIR-17.

However, the results from NTCIR-17 indicated that even the most advanced solutions at that time left room for improvement.

Therefore, we are providing new sentence-level annotations (i.e., the sub task), in addition to document-level annotations, to enable participants to explore various new approaches.

Yes. It is fine for one person to join two or more different teams.

Yes. It is no problem to use the Japanese track’s dataset to solve the English track, or to use the English track’s dataset to solve the Japanese track.

If you re-use one track’s dataset in the other, please describe the details in your system paper for the sake of reproducibility.

Our policy is that RadNLP 2024 is, rather than a competition, a workshop to welcome participants’ diverse approaches and share the insights widely.

Yes. We will not impose any specific limits on the use of external resources, including models, dictionaries, corpora, or datasets.

If you use external resources, please describe the details in your system paper for the sake of reproducibility.

Our policy is that RadNLP 2024 is, rather than a competition, a workshop to welcome participants’ diverse approaches and share the insights widely.

Should you use external resources, please handle them with the utmost care so as not to violate human rights, especially privacy.

No. We request every team to submit ONE system paper and make ONE presentation, even if you participate in both the English and Japanese tracks.

No. We request every team to submit ONE system paper and make ONE presentation, even if you solve both the main and sub tasks.

If you have any other questions, please feel free to contact us.

About us

Organizers

Co-chair
Yuta Nakamura

Department of Computational Diagnostic Radiology and Preventive Medicine, the University of Tokyo Hospital

Co-chair
Shouhei Hanaoka

Department of Radiology, Graduate School of Medicine, the University of Tokyo

Co-chair
Eiji Aramaki

Social Computing Laboratory, Nara Institute of Science and Technology (NAIST)

Co-chair
Shuntaro Yada

Social Computing Laboratory, Nara Institute of Science and Technology (NAIST)

Staff
Jun Kanzawa

Division of Radiology and Biomedical Engineering, Graduate School of Medicine, The University of Tokyo

Staff
Akira Katayama

Division of Radiology and Biomedical Engineering, Graduate School of Medicine, The University of Tokyo

Staff
Tomohiro Kikuchi

Data Science Center, Jichi Medical University

Staff
Ryo Kurokawa

Department of Radiology, The University of Tokyo Hospital

Staff
Wataru Gonoi

Department of Radiology, Graduate School of Medicine, the University of Tokyo

Staff
Peitao Han

Social Computing Laboratory, Nara Institute of Science and Technology (NAIST)

Staff
Kiyoto Hashimoto

Social Computing Laboratory, Nara Institute of Science and Technology (NAIST)

Collaborators

Adviser
Koji Fujimoto

Department of Advanced Imaging in Medical Magnetic Resonance, Graduate School of Medicine, Kyoto University

Adviser
Jonas Kluckert

Institute of Diagnostic and Interventional Radiology, University Hospital Zurich

Adviser
Michael Krauthammer

Department of Quantitative Biomedicine, University of Zurich

Past shared tasks

RadNLP 2024 has two preceding shared tasks, whose websites are available below:

Contact

E-mail: radnlp [at] googlegroups.com
