News
- July 16, 2024: Japanese domain sample data updated
- July 12, 2024: Updated descriptions in Task Overview and Dataset
- July 9, 2024: Japanese domain sample data available
About MedNLP-CHAT
Medical Natural Language Processing for AI Chat (MedNLP-CHAT), one of the core tasks in NTCIR-18, aims to evaluate medical chatbots from multiple viewpoints. Medical chatbot services are a promising solution to the human resource problem in medicine and healthcare. However, the risks of such chatbots are not well understood. We create a testbed of potential chatbot responses evaluated from various aspects: medical validity, legal viewpoints, ethical issues, etc.
Task Overview
- INPUT
- A pair of a patient’s question and a chatbot answer
- OUTPUT
- Objective evaluation by a specialist: Binary class (TRUE/FALSE)
- medicalRisk
- ethicalRisk
- legalRisk
- Subjective evaluation by the general public: A probability distribution of evaluations on a 5-point scale from -2 to 2
- fluency
- helpfulness
- harmlessness
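The output format above can be sketched as a small data structure. This is only an illustrative sketch: the six field names come from the label names listed above, but the `Prediction` class, the `SCALE` constant, and the `is_valid_distribution` helper are hypothetical and are not part of any official task format or submission tool.

```python
from dataclasses import dataclass
from typing import Dict

# The 5-point scale from the task description (-2 to 2).
SCALE = (-2, -1, 0, 1, 2)


@dataclass
class Prediction:
    """Hypothetical container for the six outputs of one question-answer pair."""
    # Objective labels: binary TRUE/FALSE.
    medicalRisk: bool
    ethicalRisk: bool
    legalRisk: bool
    # Subjective labels: probability for each point on the 5-point scale.
    fluency: Dict[int, float]
    helpfulness: Dict[int, float]
    harmlessness: Dict[int, float]


def is_valid_distribution(dist: Dict[int, float], tol: float = 1e-6) -> bool:
    """Return True if dist covers exactly the scale points, has no
    negative probabilities, and sums to 1 within tol."""
    if set(dist) != set(SCALE):
        return False
    if any(p < 0 for p in dist.values()):
        return False
    return abs(sum(dist.values()) - 1.0) <= tol
```

A system could run such a check on each subjective label before writing its output, so that malformed distributions are caught early.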
Dataset
Japanese domain dataset
- The data consists of a question, an answer, and a set of labels for the answer: objective labels (risks) based on Japanese laws and medical guidelines, and subjective labels (fluency, helpfulness, and harmlessness) assigned by Japanese people [README].
- Data size: We are preparing 200 sets of {Question, Answer, Answer labels}
- Both the questions and answers are created by humans, referencing responses from a chatbot.
- Answer labels represent the evaluation of the answers and are the targets to be estimated in this task. There are six labels: three objective labels (medicalRisk, ethicalRisk, and legalRisk) assigned by experts based on Japanese laws and medical guidelines, and three subjective labels (fluency, helpfulness, and harmlessness) assigned by Japanese crowdworkers.
- Languages: Japanese (JA), English (EN), German (DE), and French (FR).
- Step 1: Japanese data is created.
- Step 2: It is translated into the other languages.
![](https://sociocom.naist.jp/mednlp-chat/wp-content/uploads/sites/10/2024/07/sample_data-1024x557.png)
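As a rough illustration of how a subjective label could become a probability distribution over the 5-point scale, the sketch below turns a list of individual ratings into relative frequencies. This is an assumption made for illustration only: the page does not state how the crowdsourced ratings are aggregated, and the function name `ratings_to_distribution` is hypothetical.

```python
from collections import Counter
from typing import Dict, List

# The 5-point scale from the task description (-2 to 2).
SCALE = (-2, -1, 0, 1, 2)


def ratings_to_distribution(ratings: List[int]) -> Dict[int, float]:
    """Convert individual ratings into relative frequencies per scale point.

    Assumption: each rater gives one integer rating from -2 to 2 and all
    raters are weighted equally.
    """
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in SCALE for r in ratings):
        raise ValueError("ratings must be integers from -2 to 2")
    counts = Counter(ratings)
    return {point: counts[point] / len(ratings) for point in SCALE}
```

For example, four ratings of 2, 2, 1, and 0 would give a probability of 0.5 for 2, 0.25 each for 1 and 0, and 0.0 for the negative points.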
German domain dataset (TBA)
- The data consists of a question, an answer, and a set of labels for the answer.
- Data size: TBA
- Languages: German (DE), TBD
Registration
Schedule
- Mar 29, 2024: Kickoff event
- May -> July 2024: Sample dataset release
- Aug 2024: Training dataset release (Ja)
- Nov 2024-Jan 2025: Formal run
- Feb 1, 2025: Evaluation results return
- Feb 1, 2025: Task overview release (draft)
- Mar 1, 2025: Participant paper (draft) submission due
- May 1, 2025: Camera-ready participant paper due
- Jun 10-13, 2025: NTCIR-18 Conference @ NII, Tokyo, Japan
Organizer
![](https://sociocom.naist.jp/mednlp-chat/wp-content/uploads/sites/10/2024/03/aramaki-150x150-1.png)
![](https://sociocom.naist.jp/mednlp-chat/wp-content/uploads/sites/10/2024/03/wakamiya-150x150-2.png)
![](https://sociocom.naist.jp/mednlp-chat/wp-content/uploads/sites/10/2024/03/yada-150x150-1.png)
![](https://sociocom.naist.jp/mednlp-chat/wp-content/uploads/sites/10/2024/07/1hisada-300x300-1-150x150.jpg)
![](https://sociocom.naist.jp/mednlp-chat/wp-content/uploads/sites/10/2024/04/nishiyama-150x150.png)
![](https://sociocom.naist.jp/mednlp-chat/wp-content/uploads/sites/10/2024/03/lisa_ntcir18-150x150.png)
![](https://sociocom.naist.jp/mednlp-chat/wp-content/uploads/sites/10/2024/03/roland_roller_ntcir18-150x150.png)
![](https://sociocom.naist.jp/mednlp-chat/wp-content/uploads/sites/10/2024/03/philippe_thomas_ntcir18-150x150.png)
![](https://sociocom.naist.jp/mednlp-chat/wp-content/uploads/sites/10/2024/03/kat_ntcir18-150x150.png)
![](https://sociocom.naist.jp/mednlp-chat/wp-content/uploads/sites/10/2024/03/pierre_ntcir18-150x150.png)
Adviser
Ryuma Shineha, Ph.D. (Osaka University, Japan)