2025 International Workshop on Deidentification of Electronic Medical Record Notes (IW-DMRN)

Announcement (1st July 2026):
The proceedings of this workshop are now available at: https://link.springer.com/book/10.1007/978-981-92-2282-7
eBook ISBN: 978-981-92-2282-7
Print ISBN: 978-981-92-2281-0

Announcement (25 August 2025):
We are delighted to share that authors of outstanding and innovative papers presented at this workshop will have the opportunity to be considered for inclusion in a special collection of BMJ Health & Care Informatics.

Selected submissions will undergo a thorough peer-review process in accordance with the journal’s standards.

For submission guidelines and eligibility criteria, please refer to the following link:

Workshop information

This international workshop will be held as the closing event of the SREDH/AI-Cup 2025 Deidentification competition. It will take place during MedInfo 2025 (9th to 13th August 2025, Taipei, Taiwan) and will feature presentations from the top-performing teams that participated in the competition.

Artificial intelligence (AI) and natural language processing (NLP) have played transformative roles in healthcare, with large language models (LLMs) becoming prominent in clinical decision-making and electronic health record (EHR) processing. LLM-driven systems analyse complex medical data and assist with diagnosis, treatment planning, and personalized medicine. However, safeguarding sensitive health information (SHI) embedded in EHRs and exchanged during doctor-patient interactions remains challenging. The first International Workshop on Deidentification of Electronic Medical Record Notes (IW-DMRN), which focused on LLM-based approaches for SHI deidentification, was held on 15th January 2024. Building on the outcomes of the first workshop [1-3], the 2nd IW-DMRN is proposed with the primary objective of developing advanced AI algorithms capable of effectively identifying and replacing SHI in medical speech datasets.

** Final date and venue **

Time: Friday, August 10, 2025 (Taiwan time zone, GMT+8)

Venue: Taipei International Convention Center (TICC), Taipei, Taiwan, 201F, 2F

** Agenda **

Workshop WS14

Super Theme:

TRACK 3: Health Data Science & Artificial Intelligence

Theme:

Theme 2 - Applications

Time: 09:00-10:30 (GMT+8)

Chair(s): Jitendra Jonnagaddala

09:00-09:05

Ching-Tai Chen

Chair Opening Remarks

Welcome message and introduction (host & participants)

Presentation link

09:05-09:15

Liang-Chun Fang

Presentation Topic: Overview of the AI CUP 2025 Medical Speech Sensitive Personal Data Recognition Competition

Presentation link

09:15-09:25

Zheng-Hao Li

Presentation Topic: A Generative Approach to Sensitive Data Identification in Medical Speech using Large Language Models

Presentation link

09:25-09:35

Liang-Kai Chen

Presentation Topic: Prompt Engineering and Post-processing for Sensitive Health Information Recognition

Presentation link

09:35-09:45

Jing Jin (Online)

Presentation Topic: Instruction-Tuned LLMs for Multilingual Medical ASR and Privacy Entity Extraction

Presentation link

09:45-09:55

Lien-hung Su

Presentation Topic: Named Entity Recognition in Chinese-English Speech via Automatic Speech Recognition and Large Language Models

Presentation link

09:55-10:05

Yan-Jun Chen (Online)

Presentation Topic: Speech Privacy and Personal Information Recognition

Presentation link

10:05-10:15

Yuan-Chi Hsu

Presentation Topic: Chinese Models for De-identifying Mixed Chinese-English-Minnan Speech

Presentation link

10:15-10:25

Chao-Long Huang (Online)

Presentation Topic: Temporal Subword De-identification of Medical Speech for Privacy Protection Leveraging ASR and LLMs

Presentation link

10:25-10:30

Group Photo and Closing

** Submission information ** (deadline: 1st August, 11:59 PM GMT+8)

https://www.sredhconsortium.org/sredh-workshops/2025-iw-dmrn/submission-information

References

1. Jonnagaddala, J., & Wong, Z. S.-Y. (2025). Privacy-preserving strategies for electronic health records in the era of large language models. npj Digital Medicine. https://doi.org/10.1038/s41746-025-01429-0
2. Jonnagaddala, J., Dai, H.-J., & Chen, C.-T. (2025). SREDH. Large Language Models for Automatic Deidentification of Electronic Health Record Notes. Springer CCIS. https://doi.org/10.1007/978-981-97-7966-6
3. Jonnagaddala, J., Chen, A., Batongbacal, S., & Nekkantti, C. (2021). The OpenDeID corpus for patient de-identification. Scientific Reports, 11(1), 19973. https://doi.org/10.1038/s41598-021-99554-9
4. Chen, A., Jonnagaddala, J., Nekkantti, C., & Liaw, S. T. (2019). Generation of surrogates for de-identification of electronic health records. Studies in Health Technology and Informatics, 264, 70–73. https://doi.org/10.3233/SHTI190185
5. Alla, N. L. V., Chen, A., Batongbacal, S., Nekkantti, C., Dai, H., & Jonnagaddala, J. (2021). Cohort selection for construction of a clinical natural language processing corpus. Computer Methods and Programs in Biomedicine Update, 1, 100024. https://doi.org/10.1016/j.cmpbup.2021.100024
6. Liu, J., Gupta, S., Chen, A., Wang, C. K., Mishra, P., Dai, H. J., Wong, Z. S., & Jonnagaddala, J. (2023). OpenDeID pipeline for unstructured electronic health record text notes based on rules and transformers: Deidentification algorithm development and validation study. Journal of Medical Internet Research, 25, e48145. https://doi.org/10.2196/48145