SREDH/AI CUP 2025

Recent advancements in generative artificial intelligence (AI) and natural language processing (NLP) have positioned large language models (LLMs) as transformative tools across various industries, particularly healthcare. These technologies enhance patient care, streamline administrative tasks, and advance medical research by analyzing extensive clinical data from electronic health records (EHRs), medical imaging, and genomics.

The application of LLMs in clinical medicine raises privacy concerns, particularly about the inadvertent leakage of confidential information. EHRs contain sensitive patient data, making proper de-identification crucial for advancing research. To address these challenges, the SREDH/AI CUP 2023 competition featured two subtasks: SHI Recognition, which focused on recognizing sensitive health information (SHI) within clinical texts, and Temporal Information Normalization, which aimed to standardize temporal information to ensure consistency across medical records. The goal of the Artificial Intelligence CUP 2023 competition was to improve the usability of electronic medical record (EMR) data while ensuring patient privacy and consistency, seeking automatic de-identification and standardization solutions from researchers worldwide. For more detailed information, please refer to the SREDH/AI CUP 2023 website.

However, integrating LLMs into healthcare still presents challenges in protecting data privacy and safeguarding SHI within medical documents. Sophisticated algorithms are needed to identify and remove SHI from unstructured clinical texts while accounting for medical context, terminology, and evolving data privacy regulations.

To further tackle these challenges, the Ministry of Education in Taiwan and CGD Health Pvt. Ltd. will sponsor the Artificial Intelligence CUP 2025 competition titled “Controllable Multimodal Privacy De-Identification Technology.” Participating teams will receive medical speech datasets, corresponding text annotations, and audio files. The mission is to develop innovative de-identification solutions for medical data. The competition will focus on two critical subtasks: the Multimodal SHI Recognition and Normalization Competition, where researchers will strive to recognize SHI within clinical texts and standardize temporal information, and the Controlled Generation of Virtual Agents, which aims to create virtual SHIs while setting up validation standards for assessing algorithm performance.
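As a rough illustration of what recognizing and normalizing temporal information can involve, the following minimal sketch finds numeric dates in a transcript and converts them to ISO 8601 form. It is not the competition's baseline or official output format. The function name `normalize_dates`, the regular expression, and the day/month/year reading of numeric dates are all assumptions made for this example; the actual SHI categories and normalization rules are defined by the organisers' annotation guidelines.

```python
import re
from datetime import datetime

# Hypothetical pattern: numeric dates written as day/month/year (an assumption;
# real clinical text uses many more formats).
DATE_PATTERN = re.compile(r"\b(\d{1,2})/(\d{1,2})/(\d{4})\b")


def normalize_dates(text):
    """Return (matched_text, start, end, iso_value) for each date found in text."""
    results = []
    for match in DATE_PATTERN.finditer(text):
        day, month, year = (int(group) for group in match.groups())
        try:
            iso_value = datetime(year, month, day).date().isoformat()
        except ValueError:
            # Skip strings that look like dates but are not valid calendar days.
            continue
        results.append((match.group(0), match.start(), match.end(), iso_value))
    return results


if __name__ == "__main__":
    transcript = "Seen on 3/11/2024 for review."
    for mention in normalize_dates(transcript):
        print(mention)
```

Each result keeps the character offsets of the original mention alongside its normalized value, so the same span can later be masked or replaced with a surrogate. A rule-based pattern like this covers only a narrow slice of the problem; spoken dates, relative expressions, and durations would need additional handling.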


Artificial Intelligence CUP 2025 Competition

**The dates will be updated soon**

Datasets

OpenDeID Corpus Dataset

2025 International Workshop on Deidentification of Electronic Medical Record Notes (IW-DMRN)

2025 IW-DMRN Information


Organisers