CL4Health @ LREC-COLING 2024

Patient-oriented language processing

CL4Health 2024 Program  

Monday May 20, 2024



09:00 - 09:05   Opening remarks




09:05 - 10:30   

Session 1: Communicating with patients





09:05 - 09:35   

Invited talk -- Barbara Di Eugenio: Engaging the Patient in Healthcare: Summarization and Interaction





09:35 - 09:55   

Improving Sign Language Production in the Healthcare Domain Using UMLS and Multi-task Learning 

Jonathan David Mutal, Raphael Rubino, Pierrette Bouillon, Bastien David, Johanna Gerlach, Irene Strasly

TIM/FTI, University of Geneva





09:55 - 10:15   

It's Difficult to Be Neutral – Human and LLM-based Sentiment Annotation of Patient Comments 

Petter Mæhlum1, David Samuel1, Rebecka Maria Norman2, Elma Jelin2, Øyvind Andresen Bjertnæs2, Lilja Øvrelid1, Erik Velldal1

1Department of Informatics, University of Oslo, 2Norwegian Institute of Public Health





10:15 - 10:30   

Poster boasters





10:30 - 11:00   

Coffee break





11:00 - 13:00   

Session 2: Patients' language and care





11:00 - 11:30   

Invited talk -- Natalia Grabar: Linguistic Foundations of the Simplification and its Current State





11:30 - 11:50   

Simulating Diverse Patient Populations Using Patient Vignettes and Large Language Models 

Daniel Reichenpfader and Kerstin Denecke

Bern University of Applied Sciences





11:50 - 12:10   

Annotating Emotions in Acquired Brain Injury Patients' Narratives 

Salomé Klein1, Amalia Todirascu1, Hélène Vassiliadou1, Marie Kuppelin2, Joffrey Becart1, Thalassio Briand1, Clara Coridon1, Francine Gerhard-Krait1, Joé Laroche1, Jean Ulrich1, Agata Krasny-Pacini2

1UR 1339/LiLPa & FRLC (University of Strasbourg), 2INSERM (University of Strasbourg)





12:10 - 12:30   

Structuring Clinical Notes of Italian ST-elevation Myocardial Infarction Patients 

Vittorio Torri1, Sara Mazzucato2, Stefano Dalmiani3, Umberto Paradossi3, Claudio Passino4, Sara Moccia2, Silvestro Micera5, Francesca Ieva6

1MOX - Modelling and Scientific Computing Lab, Dipartimento di Matematica, Politecnico di Milano, Milano, Italy, 2Biorobotics Institute, Department of Excellence in Robotics and AI - Scuola Superiore Sant'Anna, Pisa, Italy, 3Fondazione Toscana G. Monasterio, Pisa, Italy, 4Fondazione Toscana G. Monasterio, Pisa, Italy and Health Science Interdisciplinary Research Center, Scuola Superiore Sant'Anna, Pisa, Italy, 5Bertarelli Foundation Chair in Translational Neural Engineering, Center for Neuroprosthetics and Institute of Bioengineering, Ecole Polytechnique Federale de Lausanne, Lausanne, Switzerland, Health Science Interdisciplinary Research Center, Scuola Superiore Sant'Anna, Pisa, Italy and Biorobotics Institute, Department of Excellence in Robotics and AI - Scuola Superiore Sant'Anna, Pisa, Italy, 6MOX - Modelling and Scientific Computing Lab, Dipartimento di Matematica, Politecnico di Milano, Milano, Italy and HDS - Health Data Science Centre, Human Technopole, Milano, Italy





12:30 - 13:00   

Poster boasters





13:00 - 14:30   

Lunch






14:30 - 14:50

Invited talk -- Graciela Gonzalez-Hernandez: Patients are speaking - are we listening? Incorporating patient perspectives posted online into clinical trials

14:50 - 16:30   

Poster session (parallel)





  

Towards AI-supported Health Communication in Plain Language: Evaluating Intralingual Machine Translation of Medical Texts 

Silvana Deilen1, Ekaterina Lapshinova-Koltunski1, Sergio Hernández Garrido1, Christiane Maaß1, Julian Hörner2, Vanessa Theel3, Sophie Ziemer3

1University of Hildesheim, 2Wort und Bild Verlag, 3SUMM AI





  

Large Language Models as Drug Information Providers for Patients 

Luca Giordano and Maria Pia di Buono

UNIOR NLP Research Group - University of Naples "L'Orientale"





  

Towards Generation of Personalised Health Intervention Messages 

Clara Wan Ching Ho1 and Volha Petukhova2

1Goethe University Frankfurt, 2Saarland University





  

Analysing Emotions in Cancer Narratives: A Corpus-Driven Approach 

Daisy Monika Lal, Paul Rayson, Sheila A Payne, Yufeng Liu

Lancaster University





  

Study of Medical Text Reading and Comprehension through Eye-Tracking Fixations 

Oksana Ivchenko and Natalia Grabar

University of Lille





  

A Neuro-Symbolic Approach to Monitoring Salt Content in Food 

Anuja Tayal, Barbara Di Eugenio, Devika Salunke, Andrew D. Boyd, Carolyn A Dickens, Eulalia P Abril, Olga Garcia-Bedoya, Paula G Allen-Meares

University of Illinois Chicago





  

On Simplification of Discharge Summaries in Serbian: Facing the Challenges 

Anđelka Zečević1, Milica Ćulafić2, Stefan Stojković3

1Mathematical Institute, Serbian Academy of Sciences and Arts, 2Faculty of Pharmacy, University of Belgrade, 3University Clinical Center of Serbia





  

Medical-FLAVORS: A Figurative Language and Vocabulary Open Repository for Spanish in the Medical Domain 

Lucia Pitarch1, Emma Angles-Herrero1, Yufeng Liu2, Daisy Monika Lal2, Jorge Gracia1, Paul Rayson2, Judith Rietjens3

1University of Zaragoza, 2Lancaster University, 3TU Delft





  

Generating Synthetic Documents with Clinical Keywords: A Privacy-Sensitive Methodology 

Simon Meoni1, Éric De la Clergerie2, Théo Ryffel3

1Arkhn/INRIA, 2Inria, 3Arkhn





  

Building Certified Medical Chatbots: Overcoming Unstructured Data Limitations with Modular RAG 

Leonardo Sanna, Patrizio Bellan, Simone Magnolini, Marina Segala, Saba Ghanbari Haez, Monica Consolandi, Mauro Dragoni

Fondazione Bruno Kessler





  

Towards Using Automatically Enhanced Knowledge Graphs to Aid Temporal Relation Extraction 

Timotej Knez and Slavko Žitnik

University of Ljubljana, Faculty of Computer and Information Science





  

Experiments in Automated Generation of Discharge Summaries in Italian 

Lorenzo Ruinelli1, Amos Colombo1, Mathilde Rochat1, Sotirios Georgios Popeskou1, Andrea Franchini2, Sandra Mitrović2, Oscar William Lithgow2, Joseph Cornelius2, Fabio Rinaldi2

1Ente Ospedaliero Cantonale, Bellinzona, CH, 2Dalle Molle Institute for Artificial Intelligence, Lugano, CH





  

Evaluating LLMs for Temporal Entity Extraction from Pediatric Clinical Text in Rare Diseases Context 

Judith Jeyafreeda Andrew, Marc Vincent, Anita Burgun, Nicolas Garcelon

Université de Paris Cité, Imagine Institute, Data Science Platform, INSERM UMR 1163 PaRis Artificial Intelligence Research InstitutE (PRAIRIE)





  

Generating Distributable Surrogate Corpus for Medical Multi-label Classification 

Seiji Shimizu, Shuntaro Yada, Shoko Wakamiya, Eiji Aramaki

Nara Institute of Science and Technology





  

Development of a Benchmark Corpus for Medical Device Adverse Event Detection 

Susmitha Wunnava1, David A. Harris2, Florence T Bourgeois2, Timothy A Miller3

1Harvard Medical School, 2Boston Children's Hospital, 3Boston Children's Hospital and Harvard Medical School





  

CliniRes: Publicly Available Mapping of Clinical Lexical Resources 

Elena Zotova1, Montse Cuadros1, German Rigau2

1Vicomtech, 2UPV/EHU





  

MedDialog-FR: A French Version of the MedDialog Corpus for Multi-label Classification and Response Generation Related to Women's Intimate Health 

Xingyu Liu1, Vincent Segonne2, Aidan Mannion3, Didier Schwab1, Lorraine Goeuriot1, François Portet1

1Univ. Grenoble Alpes, 2IRISA - Université Bretagne Sud, 3Univ. Grenoble Alpes / EPOS





  

Exploring the Suitability of Transformer Models to Analyse Mental Health Peer Support Forum Data for a Realist Evaluation 

Matthew Coole, Paul Rayson, Zoe Glossop, Fiona Lobban, Paul Marshall, John Vidler

Lancaster University





14:50 - 16:30   

Virtual poster session (parallel)





  

Revisiting the MIMIC-IV Benchmark: Experiments Using Language Models for Electronic Health Records 

Jesus Lovon-Melgarejo, Thouria Ben-Haddi, Jules Di Scala, Jose G. Moreno, Lynda Tamine Lechani

University of Toulouse





  

Unraveling Clinical Insights: A Lightweight and Interpretable Approach for Multimodal and Multilingual Knowledge Integration 

Kanimozhi Uma and Marie-Francine Moens

Katholieke Universiteit Leuven





  

Automated Question-Answer Generation for Evaluating RAG-based Chatbots 

Juan José González Torres, Mihai Bogdan Bîndilă, Sebastiaan Hofstee, Daniel Szondy, Quang-Hung Nguyen, Shenghui Wang, Gwenn Englebienne

University of Twente





  

Speech Accommodation in Health-Care Interactions: Evidence Using a Mixed-Reality Platform 

Rose Baker1, Susan C. Bobb2, Dai'Sha Dowson1, Elisha Eanes1, Makyah McNeill1, Hannah Ragsdale1, Audrey Eaves1, Joseph G. Lee1, Kathrin Rothermich1

1East Carolina University, 2Gordon College





  

Enhancing Consumer Health Question Reformulation: Chain-of-Thought Prompting Integrating Focus, Type, and User Knowledge Level 

Jooyeon Lee, Luan Huy Pham, Özlem Uzuner

George Mason University





  

Exploring the Challenges of Behaviour Change Language Classification: A Study on Semi-Supervised Learning and the Impact of Pseudo-Labelled Data 

Selina Meyer1, Marcos Fernandez-Pichel2, David Elsweiler1, David E. Losada2

1University of Regensburg, 2University of Santiago de Compostela





  

Using BART to Automatically Generate Discharge Summaries from Swedish Clinical Text 

Nils Berg and Hercules Dalianis

Department of Computer and Systems Sciences, Stockholm University





16:00 - 16:30   

Coffee break





16:30 - 18:00   

Session 3: Social media and literature





16:30 - 17:00   

Invited talk -- Abeed Sarker: Learning and Educating via NLP of Social Media: the Use Case for Substance Use and Overdose in the United States





17:00 - 17:20   

Biomedical Entity Linking for Dutch: Fine-tuning a Self-alignment BERT Model on an Automatically Generated Wikipedia Corpus 

Fons Hartendorp1, Tom Seinen2, Erik van Mulligen2, Suzan Verberne1

1LIACS, Leiden University, 2Erasmus Medical Center





17:20 - 17:40   

Unveiling Voices: Identification of Concerns in a Social Media Breast Cancer Cohort via Natural Language Processing 

Swati Rajwal, Avinash Kumar Pandey, Zhishuo Han, Abeed Sarker

Emory University





17:40 - 18:00   

Intent Detection and Entity Extraction from BioMedical Literature 

Ankan Mullick1, Mukur Gupta2, Pawan Goyal1

1Indian Institute of Technology, Kharagpur, 2Columbia University





18:00 - 18:05   

Closing remarks





Invited Talks

Barbara Di Eugenio, University of Illinois Chicago

Engaging the Patient in Healthcare: Summarization and Interaction

Effective and compassionate communication with patients is becoming central to healthcare. The talk discusses the results of and lessons learned from three ongoing projects in this space. The first, MyPHA, aims to provide patients with a clear and understandable summary of their hospital stay, which is informed by doctors’ and nurses’ perspectives, and by the strengths and concerns of the patients themselves. The second, VIRTUAL-COACH, models health coaching interactions via text exchanges that encourage patients to adopt specific and realistic physical activity goals. The third, HFChat, envisions an always-on-call conversational assistant that heart failure patients can ask for information about lifestyle issues such as food and exercise.

Dr. Di Eugenio’s work is characterized by: large interdisciplinary groups of investigators who bring different perspectives to the research; grounding computational models in ecologically valid data, which is small by its own nature; and the need for culturally valid interventions, since the University of Illinois Health system predominantly serves underprivileged, minority populations.

Abeed Sarker, Associate Professor and Vice Chair for Research in Biomedical Informatics @ Emory School of Medicine

Learning and Educating via NLP of Social Media: the Use Case for Substance Use and Overdose in the United States

Substance use and overdose is an ongoing crisis in the United States and growing globally. The sphere of substance-related overdose also evolves continuously as novel psychoactive substances enter the supply. Nonmedical substance use surveillance via social media has the potential to provide low-cost and more timely insights than traditional approaches. In our research, we leverage natural language processing (NLP) and machine learning to obtain insights from targeted cohorts of people who use substances about emerging patterns and problems in substance use disorder and treatment. This talk outlines our NLP pipeline for analyzing substance use-related chatter from Twitter (X) and Reddit, and how insights derived from these sources may be used to educate medical practitioners at the forefront of the opioid crisis in the United States, facilitating more patient-centered care.
Dr. Sarker is an Associate Professor and the Vice Chair for Research at the Department of Biomedical Informatics, School of Medicine, Emory University. He leads several large-scale projects focusing on the application of NLP for health-related tasks, particularly those involving vulnerable populations such as people with substance use disorders, victims of intimate partner violence, and people at risk of self-harm and suicide. His research is primarily funded by the National Institutes of Health (NIH) and Centers for Disease Control and Prevention (CDC). Dr. Sarker’s research has been covered by various national and international media outlets such as the Wall Street Journal, Forbes, and Scripps National News.

Natalia Grabar, CNRS Researcher, Université de Lille

Foundations of the Simplification and its Current State

The purpose of text simplification is to adapt the content of documents in order to make them easier to read and understand for a given population. Although simplification usually targets specific language levels (lexical, morphological, syntactic, semantic...), the available data cannot always provide the precise indications required for this process. The talk discusses some sources of such data. Dr. Grabar also analyzes how linguistic indicators are currently used in defining language complexity and in simplification.
Dr. Grabar is a CNRS Researcher at the University of Lille. She studied philology at Lviv University, Ukraine and obtained her PhD in Medical Informatics from the Université Paris 6, France. She develops linguistic and statistical methods to access information and knowledge within scientific and technical texts and terminologies. The results are used in information retrieval, information extraction and text simplification. Dr. Grabar has co-authored over 200 publications.

Graciela Gonzalez-Hernandez, Cedars Sinai Medical Center

Patients are speaking - are we listening? Incorporating patient perspectives posted online into clinical trials

Research that aims to be equitable and effective at treating chronic diseases and improving patient outcomes must incorporate a broad range of patient perspectives (health-related uncertainties, beliefs, and experiences). Setting research priorities and designing trials is complex since clinicians, researchers, and patients differ on what is considered important. Patients often prioritize outcomes that directly impact their quality of life, such as symptom relief, functional status, and treatment side effects, while clinicians prioritize outcomes related to survival, disease progression, and biomarker endpoints. Methods commonly used for gaining patient perspectives are often limited: they are subject to recall and other biases, are expensive and time-consuming, are limited in recruitment number and diversity, and may not comprehensively capture factors important for research design.
A vast amount of data from the patient’s perspective is already publicly available: patients openly share useful perspectives on different social media platforms. Despite its potential, approaches for the systematic integration of such data to inform the prioritization and design of health research are still to be developed and validated.
In this talk, Prof. Gonzalez-Hernandez discusses her ongoing efforts to enable the extraction of relevant patient perspectives posted online using state-of-the-art natural language processing (NLP) methods, and the promise of their integration into clinical trial design.
Dr. Gonzalez-Hernandez has over 23 years of experience and more than 200 publications in health AI and NLP, funded by multiple NIH grants. She is currently a Professor and Vice Chair for Research and Education in the Cedars-Sinai Department of Computational Biomedicine. She launched the #SMM4H (Social Media Mining for Health) Workshop and Shared Tasks, which has run annually for the last 8 years.

Important Dates

March 15, 2024

Workshop paper due date

March 25, 2024

Notification of acceptance

March 31, 2024

Camera-ready papers due

May 20, 2024

Workshop @ LREC-COLING

Submissions

Two types of submissions are invited: full papers and short papers.

Full papers should not exceed eight (8) pages of text, plus unlimited references. These are intended to be reports of original research.

Short papers may consist of up to four (4) pages of content, plus unlimited references. Appropriate short paper topics include preliminary results, application notes, descriptions of work in progress, etc.

Electronic Submission: Submissions must be electronic and in PDF format, using the Softconf START conference management system. Submissions need to be anonymous. The submission site is:
     https://softconf.com/lrec-coling2024/cl4health2024/

Dual submission policy: papers may NOT be submitted to the workshop if they are or will be concurrently submitted to another meeting or publication.

Main conference resubmissions: We welcome submissions of topically relevant papers that have been rejected from the main LREC-COLING conference. The scores and reviews from the main conference will be taken into consideration, and the highest-ranking papers may be considered without additional review.

Scope

This first workshop on patient-oriented language processing aims to establish a general venue for presenting research and applications focused on patients’ needs, including summarizing health records for the patients, answering consumer-health questions using reliable resources, detecting misinformation or potentially harmful information, and providing multi-modal information, such as video, if it better satisfies patients’ needs. Such a venue is needed both to invigorate patient-oriented language processing research and to build a community of researchers interested in this area. The growing interest in this topic is fueled by several current trends:

  1. a proliferation of online services that target patients but do not always act in their best interests;
  2. policy changes that allow patients to access their health records written in the professional vernacular, which may confuse the patients or lead to misinterpretation;
  3. replacement of customer services with chat bots; and
  4. the increasing tendency of patients to consult online resources as a second or even first opinion on their health problems.


We invite papers concerning all areas of language processing focused on patients’ health. The workshop will be centered on language technologies for health-related issues concerning the public.

Broadly, CL4Health is concerned with the resources, computational approaches, and behavioral and socio-economic aspects of the public interactions with digital resources in search of health-related information that satisfies their information needs and guides their actions.


Meeting

The workshop will be hybrid. Virtual attendees must be registered for the workshop to access the online environment.

Accepted papers will be presented as posters or oral presentations based on the reviewers’ recommendations. All accepted papers will be included in the workshop proceedings and the ACL Anthology.

Organizers

Program Committee