Medical Natural Language Processing for Emergency Call

Emergency call triage is designed to sort and prioritize calls, from the most serious, life-threatening cases to those with no immediate need for medical treatment. This helps medical staff and facilities allocate resources efficiently, prevent overcrowding in emergency departments, and ensure ambulances are dispatched in an orderly manner. However, the accuracy and safety of triage decision-making remain a challenge.

Therefore, our shared task aims to build models that can automatically classify the triage level of patients based on information from telephone calls and generate short summaries, making it easier for healthcare providers to understand patient conditions.

Participants will be provided with a dataset of dispatcher-caller dialogues, annotated with triage labels and short summaries, and will be tasked with predicting the correct triage level and generating corresponding notes.

Task Overview

Given the dialogue between the dispatcher and the caller as input, the task is to automatically classify the triage level of patients based on information from the telephone calls and generate the corresponding short summaries.

Subtask 1

Determining the emergency triage level based on the call (Very emergency, semi-emergency, low-emergency).

Subtask 2

Generating a short summary that contains the detailed information given by the caller.
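
The input and output of the two subtasks can be sketched as a small Python interface. This is only an illustration under stated assumptions: the names `TriageLevel`, `Turn`, `Prediction`, and `placeholder_system`, the speaker tags, the label strings, and the example dialogue are all invented here and are not part of the official task definition or data format.

```python
from dataclasses import dataclass
from enum import Enum


class TriageLevel(Enum):
    # The three levels named in Subtask 1; the string values are an assumption.
    VERY_EMERGENCY = "very emergency"
    SEMI_EMERGENCY = "semi-emergency"
    LOW_EMERGENCY = "low-emergency"


@dataclass(frozen=True)
class Turn:
    speaker: str  # hypothetical role tag: "dispatcher" or "caller"
    text: str


@dataclass(frozen=True)
class Prediction:
    triage: TriageLevel  # Subtask 1 output
    summary: str         # Subtask 2 output


def placeholder_system(dialogue: list[Turn], max_chars: int = 80) -> Prediction:
    """Trivial stand-in, not a baseline: fixed label, caller text as the summary."""
    caller_text = " ".join(t.text for t in dialogue if t.speaker == "caller")
    return Prediction(TriageLevel.SEMI_EMERGENCY, caller_text[:max_chars])


# Invented two-turn dialogue, used only to exercise the interface.
dialogue = [
    Turn("dispatcher", "Emergency services, what is happening?"),
    Turn("caller", "My father collapsed and is not responding."),
]
prediction = placeholder_system(dialogue)
print(prediction.triage.value, "|", prediction.summary)
```

A real system would replace `placeholder_system` with a classifier for Subtask 1 and a summarizer for Subtask 2; the sketch only fixes what goes in and what comes out.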

Dataset

The dataset consists of dialogues between dispatchers and callers, each paired with a triage label (Very emergency, semi-emergency, low-emergency) and a short summary. 

  • The dialogues are synthetically generated using a large language model (LLM), then validated by the LLM for communication quality and information accuracy. 
  • Triage labels are assigned by medical experts, while the summaries are written directly by experts based on the call content.
  • Languages: Japanese, English, German, French, Indonesian, and Filipino.
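
Since the sample data has not been released yet, the following is only a hypothetical sketch of how one dataset entry could be represented and checked in Python. The field names, the label strings, and the language codes (`ja`, `en`, `de`, `fr`, `id`, `fil`) are assumptions made for illustration; the released files may use a different layout.

```python
from collections import Counter
from dataclasses import dataclass

# Assumed label strings and ISO 639 language codes; the released files may differ.
LABELS = {"very emergency", "semi-emergency", "low-emergency"}
LANGUAGES = {"ja", "en", "de", "fr", "id", "fil"}


@dataclass(frozen=True)
class Record:
    dialogue: str   # dispatcher-caller dialogue text
    language: str   # one of LANGUAGES
    triage: str     # expert-assigned label, one of LABELS
    summary: str    # expert-written short summary

    def __post_init__(self) -> None:
        # Reject entries whose language or label is outside the assumed sets.
        if self.language not in LANGUAGES:
            raise ValueError(f"unknown language: {self.language!r}")
        if self.triage not in LABELS:
            raise ValueError(f"unknown triage label: {self.triage!r}")


def label_counts(records: list[Record]) -> Counter:
    """Count records per (language, triage label) to check class balance."""
    return Counter((r.language, r.triage) for r in records)


# Placeholder entries only; no real dialogue or summary text is shown.
records = [
    Record("(dialogue text)", "ja", "very emergency", "(summary text)"),
    Record("(dialogue text)", "en", "low-emergency", "(summary text)"),
    Record("(dialogue text)", "en", "low-emergency", "(summary text)"),
]
counts = label_counts(records)
print(counts[("en", "low-emergency")])
```

Counting per language and label is a quick way to spot class imbalance before training, which matters when the most urgent level is likely to be the rarest.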

Sample Data (TBA)

Important Dates

  • September 3, 2025: Kickoff event
  • January 2026: Sample dataset release
  • March 2026: Training dataset release (ja)
  • May 2026: Training set release (complete version)
  • July 1, 2026: Registration deadline
  • July 1, 2026: Test set release
  • July 1-8, 2026: Formal run
  • July 15, 2026: Evaluation results return
  • August 1, 2026: Task overview release (draft)
  • September 1, 2026: Submission due for participant papers (draft)
  • November 1, 2026: Camera-ready participant paper due
  • December 8-10, 2026: NTCIR-19 Conference @ NII, Tokyo, Japan

Organized by

Sa’idah Zahrotul Jannah
(NAIST, Japan)

Tomohiro Nishiyama, Ph.D.
(NAIST, Japan)

Lisa Raithel, Ph.D.
(TU Berlin and BIFOLD, Germany)

Roland Roller, Ph.D.
(DFKI, Germany)

Eiji Aramaki, Ph.D.
(NAIST, Japan)

Lenard Paulo Tamayo
(NAIST, Japan)

Maria Regina Justina E. Estuar, Ph.D.
(Ateneo de Manila University, Philippines)

Philippe Thomas, Ph.D.
(DFKI, Germany)

Shoko Wakamiya, Ph.D. 
(NAIST, Japan)

Pierre Zweigenbaum, Ph.D.
(Université Paris-Saclay, CNRS, LISN, France)

Kiki Ferawati, Ph.D.
(UNS, Indonesia)