
pmid: 22195227
pmc: PMC3243221
Although information redundancy has been reported as an important problem for clinicians when using electronic health records and clinical reports, measuring redundancy in clinical text has not been extensively investigated. We evaluated several automated techniques to quantify the redundancy in clinical documents using an expert-derived reference standard consisting of outpatient clinical documents. The technique that resulted in the best correlation (82%) with human ratings consisted of a modified dynamic programming alignment algorithm over a sliding window, augmented with a) lexical normalization and b) stopword removal. When this method was applied to the full outpatient record, we found that information redundancy in clinical notes increased over time and that mean redundancy scores for individual patients' documents appear to follow cyclical patterns corresponding to clinical events. These results show that outpatient documents contain large amounts of redundant information and that the development of effective redundancy measures warrants additional investigation.
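The general idea of the best-performing technique (alignment by dynamic programming over a sliding window, after normalization and stopword removal) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the alignment modification, the window size, the normalization resource, or the stopword list. The sketch assumes a plain longest-common-subsequence alignment, lowercasing as a crude stand-in for lexical normalization, a tiny hypothetical stopword set, and a hypothetical window of 5 tokens.

```python
import re

# Tiny illustrative stopword list (an assumption, not the list used in the paper).
STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is", "was", "with", "for", "on"}


def normalize(text):
    """Lowercase, tokenize on alphanumerics, and drop stopwords.

    Lowercasing is only a crude stand-in for lexical normalization.
    """
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def lcs_length(a, b):
    """Length of the longest common subsequence, by dynamic programming."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def redundancy(new_text, prior_text, window=5):
    """Mean fraction of each sliding window of the new note that aligns to the prior note.

    Returns a value between 0.0 (nothing shared) and 1.0 (fully redundant).
    """
    new, prior = normalize(new_text), normalize(prior_text)
    if not new or not prior:
        return 0.0
    window = min(window, len(new))  # short notes get a single, shorter window
    scores = [
        lcs_length(new[i:i + window], prior) / window
        for i in range(len(new) - window + 1)
    ]
    return sum(scores) / len(scores)
```

For example, `redundancy(todays_note, previous_note)` (both hypothetical strings) would score how much of today's note repeats the previous one. Averaging per-window scores, rather than aligning the whole note at once, keeps each alignment local, which is one plausible reason to use a sliding window; the paper's actual motivation may differ.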
Electronic Data Processing, Electronic Health Records, Humans, Information Storage and Retrieval, Algorithms, Pattern Recognition, Automated
| Indicator | Value |
| --- | --- |
| selected citations | 32 |
| popularity | Top 10% |
| influence | Top 10% |
| impulse | Top 10% |
