Sunjit Rana and Rajdeep Sarkar
Tata Consultancy Services, Hadapsar, India
Posters & Accepted Abstracts: J Pharmacovigil
Medical case reports contain detailed information about patients, their illnesses, and the medical procedures they undergo. Such information can provide useful input for pharmacovigilance studies. Because case reports are presented as plain text, coded summarization of them is often complicated and demands a specialized NLP approach. In the present work, we attempt to extract treatment and intervention details from the case presentations of medical case reports, focusing specifically on drug usage details and medical procedures. We use a combination of ontology-based and pattern-driven extraction mechanisms. We employ a sequence tagger and supply it with fine-grained context features, including a sentence-context feature that helps reduce false positives. Additionally, we explore the use of phoneme patterns occurring in drug names, learned via a rule mining algorithm, to further enhance detection accuracy. Finally, to resolve long-distance dependencies between entities, we use a dependency parser. With a training corpus of 80 manually annotated case reports from the Hindawi and OCMR medical case report databases, we test on 10 reports and achieve a precision of 83% and a recall of 84%.
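As an illustration of the kind of per-token features a sequence tagger could be given, the following is a minimal sketch only, not the authors' implementation. The lexicon, suffix list, cue words, and function names are all hypothetical stand-ins: a tiny set replaces the drug ontology, character suffixes replace the mined phoneme patterns, and a cue-word check approximates the sentence-context feature.

```python
import re

# Hypothetical mini-lexicon standing in for a drug ontology (assumption).
DRUG_LEXICON = {"amoxicillin", "prednisone", "metformin"}
# Hypothetical character suffixes standing in for mined drug-name patterns (assumption).
DRUG_SUFFIXES = ("cillin", "mycin", "prazole", "statin")
# Hypothetical cue words suggesting a sentence describes a treatment (assumption).
TREATMENT_CUES = {"treated", "administered", "prescribed", "started", "underwent"}


def token_features(tokens, i):
    """Build a feature dictionary for the token at position i."""
    lowered = [t.lower() for t in tokens]
    word = lowered[i]
    return {
        "word": word,
        "in_lexicon": word in DRUG_LEXICON,            # ontology-style lookup
        "drug_suffix": word.endswith(DRUG_SUFFIXES),   # pattern-driven cue
        "prev": lowered[i - 1] if i > 0 else "<BOS>",  # local context
        "next": lowered[i + 1] if i < len(tokens) - 1 else "<EOS>",
        # Sentence-level context: the same value is shared by every token
        # in the sentence, so a tagger can discount drug-like words that
        # appear outside treatment sentences.
        "sentence_has_cue": any(t in TREATMENT_CUES for t in lowered),
    }


def sentence_features(sentence):
    """Tokenize a sentence and return (tokens, per-token feature dicts)."""
    tokens = re.findall(r"[A-Za-z]+|\d+(?:\.\d+)?|\S", sentence)
    return tokens, [token_features(tokens, i) for i in range(len(tokens))]
```

Feature dictionaries of this shape are the usual input to CRF-style taggers; the choice of tagger and the full feature set used in the work are not specified in the abstract.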
Email: sg.rana@tcs.com