Biomedical and Healthcare Natural Language Processing

CS532, Fall 2023

University of Illinois Chicago


Objective

In the recent past, there has been a dramatic revolution in Artificial Intelligence (AI), which has reshaped our interaction with machines on many fronts, from conversations with chatbots to face recognition and autonomous driving. Natural Language Processing (NLP), a sub-field of AI, has been a central part of this shift. A similar trend is observed in the healthcare and biomedical domains. With the rapid digitization of medical records, an exponential rise in biomedical literature, and growing interest in how patients interact with social media, there have been significant advances across several biomedical and healthcare NLP problems. From curating biological information to automating health surveillance of disease outbreaks, biomedical NLP applications have produced a number of success stories.

This advanced seminar course is designed to familiarize students with cutting-edge research in biomedical and healthcare NLP. It will offer a systematic introduction to various biomedical, clinical, and healthcare tasks, including the challenges associated with those tasks. The course will also cover the latest methodologies and advancements in NLP tailored specifically for the biomedical domain. In addition, the course will involve interactive paper discussions and research projects focused on real-world healthcare and biomedical NLP problems. The key topics to be covered include: biomedical information extraction, semantics and biomedical knowledge graphs, multimodal biomedical NLP, biomedical and healthcare question answering, disease prediction and progression, biomedical document and healthcare records summarization, dialogue generation in healthcare and medical domains, modeling conversations in the healthcare domain, and the role of social media in the healthcare domain.



Time

Tuesday and Thursday from 14:00-15:15 CST/CDT (Taft Hall 215)
Office hours
Friday from 11:00-12:30 CST/CDT
Office location
3-190E LIB, 801 S. Morgan Street, MC 152, Chicago, IL 60607

Piazza

https://piazza.com/uic/fall2023/cs532
Textbook and Readings
Grading Policy
  • Project: proposal, presentations and final paper (50%)
  • Paper presentations (20%)
  • Paper critique (10%)
  • Class participation, discussion, and brainstorming (20%)
Prerequisites
  • CS 421 (Natural Language Processing) or CS 521 (Statistical Natural Language Processing) or another equivalent NLP course, and
  • CS 533 (Deep Learning for NLP)
  • Programming experience in Python

Coursework

The course consists of:

  1. Paper Presentation
    In each paper presentation class, there will be two presentations (20 minutes each) on the pre-defined research topic. Each student has to present at least one paper in the course.

  2. Paper Discussion Session
    The paper presentation will be followed by a 15-minute paper discussion session. One participant will argue in favor of the paper, and one will argue against it. Each student has to lead the discussion twice in the course (once in favor of a paper and once against).

  3. Paper Critique
    Each student has to submit a detailed assessment (not more than two pages) of two papers, covering: (i) a summary, (ii) key contributions, (iii) main strengths, and (iv) weaknesses of the paper.

  4. Project
    The final project provides you the opportunity to apply your newly acquired skills towards solving real-life biomedical and healthcare problems. A team of two students has to submit a project at the end of the coursework. The deliverables for the final project include:

    • Proposal Submission
      Each team needs to provide a one-page proposal. The proposal should outline your research objectives, an explanation of those objectives, and plans for pursuing them.

    • Project Presentation
      Each team has to present twice in the semester: at the beginning (week 5) and at the end of the semester (week 15). In the first presentation, the team needs to present their research problem, motivation, a plan to tackle the challenge, and a timeline to complete the project. The final presentation will focus more on the methodology, experimental results, analysis, and discussion of the research project.

    • Final Paper
      Each team must write a final report in the paper format of an NLP conference (e.g., ACL, NAACL, AAAI). The paper should have an abstract, introduction, clear motivation, contributions, related work, proposed method, comparison with other baselines, results, analysis, and conclusion.


Schedule

This is a tentative schedule and is subject to change.

Week | Topics | Readings and useful links
Week 1
Introduction to Biomedical/Healthcare NLP: background, challenges, applications | Chapter 1 from Cohen and Demner-Fushman's book
Week 2
Introduction to biomedical knowledge graphs | Lecture slides
Week 3
Introduction to multimodal biomedical NLP | Lecture slides
Week 4
Social and ethical considerations in biomedical NLP | Lecture slides
Week 5
Project Proposal
Week 6
Biomedical/Clinical Information Extraction
  1. Document-level Biomedical Relation Extraction Based on Multi-Dimensional Fusion Information and Multi-Granularity Logical Reasoning
  2. Overview of the 2022 n2c2 shared task on contextualized medication event extraction in clinical notes
  3. Adapting Event Extractors to Medical Data: Bridging the Covariate Shift
  4. Amalgamation of protein sequence, structure and textual information for improving protein-protein interaction identification
Week 7
Semantics and Biomedical/Healthcare Knowledge Graph
  1. Cross-Domain Data Integration for Named Entity Disambiguation in Biomedical Text
  2. Knowledge Injected Prompt Based Fine-tuning for Multi-label Few-shot ICD Coding
  3. Learning to Leverage High-Order Medical Knowledge Graph for Joint Entity and Relation Extraction
  4. Rewire-then-Probe: A Contrastive Recipe for Probing Biomedical Knowledge of Pre-trained Language Models
Week 8
Biomedical and Healthcare Question Answering
  1. PaniniQA: Enhancing Patient Education Through Interactive Question Answering
  2. GreaseLM: Graph REASoning Enhanced Language Models for Question Answering
  3. MedMCQA: A Large-scale Multi-Subject Multi-Choice Dataset for Medical domain Question Answering
  4. Interpretable Multi-Step Reasoning with Knowledge Extraction on Complex Healthcare Question Answering
Week 9
Disease Prediction and Progression
  1. GPT-D: Inducing Dementia-related Linguistic Anomalies by Deliberate Degradation of Artificial Neural Language Models
  2. CoAD: Automatic Diagnosis through Symptom and Disease Collaborative Generation
  3. Clinical Outcome Prediction from Admission Notes using Self-Supervised Knowledge Integration
Week 10
Multimodal Biomedical NLP
  1. JPG - Jointly Learn to Align: Automated Disease Prediction and Radiology Report Generation
  2. Multimodal Prompt Retrieval for Generative Visual Question Answering
  3. Overview of the MedVidQA 2022 Shared Task on Medical Video Question-Answering
  4. Multimodal Generation of Radiology Reports using Knowledge-Grounded Extraction of Entities and Relations
Week 11
Biomedical Document and Healthcare Records Summarization
  1. Readability Controllable Biomedical Document Summarization
  2. Towards Understanding Consumer Healthcare Questions on the Web with Semantically Enhanced Contrastive Learning
  3. Differentiable Multi-Agent Actor-Critic for Multi-Step Radiology Report Summarization
  4. Dr. Summarize: Global Summarization of Medical Dialogue by Exploiting Local Structures
Week 12
Dialogue Generation in Healthcare and Medical Domains
  1. Would you like to tell me more? Generating a corpus of psychotherapy dialogues
  2. Towards Facilitating Empathic Conversations in Online Mental Health Support: A Reinforcement Learning Approach
  3. Medical Dialogue Generation via Dual Flow Modeling
  4. MidMed: Towards Mixed-Type Dialogues for Medical Consultation
Week 13
Modeling Conversations in the Healthcare Domain
  1. A Computational Approach to Understanding Empathy Expressed in Text-Based Mental Health Support
  2. Overview of the MEDIQA-Chat 2023 Shared Tasks on the Summarization & Generation of Doctor-Patient Conversations
  3. Towards Enhancing Health Coaching Dialogue in Low-Resource Settings
Week 14
The Role of Social Media in the Healthcare Domain
  1. It Takes Two to Empathize: One to Seek and One to Provide
  2. Towards Intention Understanding in Suicidal Risk Assessment with Natural Language Processing
  3. Towards Identifying Fine-Grained Depression Symptoms from Memes
  4. CancerEmo: A Dataset for Fine-Grained Emotion Detection
Week 15
Final Project Presentation
Week 16
Project report/paper due
Note
Students may select papers outside the list above. However, each such paper has to be approved by the course instructor one week before the paper-discussion week.