At the session, I had a pretty good turnout. A lot of people asked me questions, and I got a lot of good feedback. Big thanks to Maggie for creating a very attractive poster!
Here is the paper, now that it is published: AMIA Annu Symp Proc. 2011 Oct 25;2011:1710.
Automated creation of clinical progress notes with machine learning
Michael Cham, CTO, Raymond Benza, MD, Jaime Carbonell, PhD
BlenderHouse, Pittsburgh, PA; Allegheny General Hospital, Pittsburgh, PA; Carnegie Mellon University, Pittsburgh, PA
Abstract
The goal of this project was to demonstrate that our machine learning approach could mimic individual physicians’ diagnostic and care planning abilities by learning from clinical progress notes. Using 500 notes from 3 physicians, we were able to achieve accuracies in excess of 97%. This technology will be embedded in a product that can improve clinical efficiency and patient safety through electronic medical records and clinical decision support systems.
Introduction
Electronic medical record (EMR) systems and clinical decision support (CDS) systems have the potential to reduce preventable readmissions and medical errors in the U.S., but they are missing key pieces of functionality or aren’t integrated well enough into the clinical workflow to help. Additionally, EMRs are introducing new kinds of medical errors and negatively impacting the productivity of clinical staff.
The goal of the study was to prove that the technology at the core of our product, CompleteNote™, could learn to mimic individual physicians’ diagnostic and patient care planning abilities from actual electronic clinical progress notes. Adding this capability to EMR and CDS systems would improve the efficiency and accuracy of documentation and provide customized, patient-centric decision support.
Methods
We obtained 500 de-identified clinical progress notes from the Heart Failure, Heart Transplant and Pulmonary Hypertension Clinic at Allegheny General Hospital. Three different physicians composed the notes. The notes were in Microsoft Word format, directly from a dictation/transcription service. The notes were then parsed and stored in a MySQL database, broken down into logical sections: vitals, problem lists, medication lists, lab results, diagnoses, and care plans.
We utilized our proprietary, support vector machine-based, multi-statement prediction approach and generated eight statistical models. There are two models for each of the three physicians: one for predicting patient diagnoses and the other for predicting care plans. For each physician model, we used only the notes that physician created. We also created the same two models for the union of all three physicians, which effectively blends the abilities of all the physicians. The union models used all the notes from all physicians. All of the models were restricted to predicting statements that appeared in at least 2% of the notes, in order to produce significant results. We achieved the best results using linear kernels, but also tested polynomial, sigmoid, and RBF kernels.
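The 2% restriction can be illustrated with a short sketch. This is not the CompleteNote implementation, which is proprietary. It assumes a hypothetical representation in which each note is a set of statement strings, and the names `frequent_statements` and `to_label_matrix` are invented for illustration.

```python
from collections import Counter


def frequent_statements(notes, min_fraction=0.02):
    """Return the statements that appear in at least min_fraction of the notes.

    `notes` is a list of sets of statement strings (an assumed representation,
    not the schema used in the study).
    """
    if not notes:
        return set()
    counts = Counter()
    for statements in notes:
        counts.update(set(statements))  # count each statement once per note
    return {s for s, c in counts.items() if c / len(notes) >= min_fraction}


def to_label_matrix(notes, vocabulary):
    """Build one binary column per kept statement (multi-label targets)."""
    ordered = sorted(vocabulary)
    rows = [[1 if s in statements else 0 for s in ordered] for statements in notes]
    return ordered, rows
```

Under these assumptions, each column of the resulting matrix could serve as the target for one binary classifier, which is one common way to frame multi-statement prediction.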
Results
The prediction accuracy results are shown in Figure 1, calculated using 5-fold cross-validation with linear kernels:
| Prediction model | Accuracy | Prediction model | Accuracy | 
| Physician #1 Diagnosis | 97.65% | Physician #1 Plans | 97.86% | 
| Physician #2 Diagnosis | 97.94% | Physician #2 Plans | 97.88% | 
| Physician #3 Diagnosis | 97.51% | Physician #3 Plans | 97.52% | 
| Consolidated Diagnosis | 97.94% | Consolidated Plans | 97.95% | 
Figure 1. Prediction Accuracy results
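For readers unfamiliar with 5-fold cross-validation, the following is a minimal sketch of how such an accuracy figure could be computed. The paper does not say how accuracy was scored, so this sketch assumes that each (note, statement) prediction is counted as one correct or incorrect cell. The `fit` and `predict` callables are hypothetical placeholders for any classifier.

```python
def k_fold_indices(n_items, k=5):
    """Split range(n_items) into k contiguous (train, test) index pairs."""
    fold_sizes = [n_items // k + (1 if i < n_items % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n_items) if i < start or i >= start + size]
        folds.append((train, test))
        start += size
    return folds


def cross_validated_accuracy(features, labels, fit, predict, k=5):
    """Average per-fold accuracy, scoring every (note, statement) cell.

    Assumes at least k notes and at least one statement per label row.
    """
    scores = []
    for train, test in k_fold_indices(len(features), k):
        # Train only on the notes outside the held-out fold.
        model = fit([features[i] for i in train], [labels[i] for i in train])
        correct = total = 0
        for i in test:
            predicted = predict(model, features[i])
            correct += sum(p == t for p, t in zip(predicted, labels[i]))
            total += len(labels[i])
        scores.append(correct / total)
    return sum(scores) / len(scores)
```

Each note is held out exactly once, so the reported number reflects predictions on notes the model did not see during training.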
Conclusion
We conclude that it is feasible to predict diagnosis and care plan information learned from progress notes using the CompleteNote technology. 
 
