Prakash Nadkarni, M.D.

Lecturer, Yale Center for Medical Informatics


Email:   pmnadkarni @ geisinger.edu

Phone:
Fax:     
Mailing Address:
Senior Research Investigator II
Geisinger Center for Healthcare Research
100 North Academy Avenue
Danville, PA 17821

Research Interests

My hobby (which also happens to be my job) is working with biomedical databases of all kinds.

At the moment, I'm playing with Entity-Attribute-Value (EAV) databases, which are used in domains where the number of potential descriptors (attributes) describing an object is a couple of orders of magnitude greater than the actual number of descriptors for a given object. For example, when dealing with patient data across all clinical specialties, the number of history elements, symptoms, clinical examination findings, lab tests and so on runs into several tens of thousands, and this number is constantly growing. Yet, for a given patient, no more than a few dozen types of positive or significant negative findings are actually relevant. That is, the data is highly sparse, and a set of conventional relational tables, with one finding per column, would result in much wasted space, because most columns would be null. In the EAV approach, one stores only non-null findings in a table containing three types of information: the Entity (the patient, plus the date/time the finding was recorded), the Attribute (i.e., the name of the finding) and the Value of the finding.
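To make the EAV idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name, column names and sample findings are all made up for illustration; this is not the TrialDB schema, just one possible way to store only the non-null findings as (entity, attribute, value) rows.

```python
import sqlite3


def build_eav_demo():
    """Create an in-memory EAV table holding a few hypothetical findings."""
    conn = sqlite3.connect(":memory:")
    # Entity = (patient_id, recorded_at); Attribute = name of the finding;
    # Value = the finding itself. All names here are illustrative assumptions.
    conn.execute(
        """CREATE TABLE finding (
               patient_id  INTEGER NOT NULL,
               recorded_at TEXT    NOT NULL,
               attribute   TEXT    NOT NULL,
               value       TEXT    NOT NULL
           )"""
    )
    rows = [
        (1, "2003-01-15T09:30", "serum_potassium_mmol_per_l", "4.1"),
        (1, "2003-01-15T09:30", "chest_pain", "absent"),
        (2, "2003-02-02T14:00", "systolic_bp_mmhg", "150"),
    ]
    conn.executemany("INSERT INTO finding VALUES (?, ?, ?, ?)", rows)
    return conn


def findings_for(conn, patient_id):
    """Return the recorded findings for one patient as {attribute: value}."""
    cur = conn.execute(
        "SELECT attribute, value FROM finding WHERE patient_id = ?",
        (patient_id,),
    )
    return dict(cur.fetchall())


conn = build_eav_demo()
patient_1 = findings_for(conn, 1)
```

Only three rows are stored, however many thousands of attributes could in principle exist; a one-column-per-finding table would instead carry a null for every finding that was never recorded. Storing every value as text is a simplification made for this sketch.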

TrialDB, an EAV database for management of Clinical Studies Data that is copyrighted by me and my colleague Cindy Brandt (though it is open-source freeware), is described on the TrialDB Home Page. This page has an FAQ, and links to the ftp site, online documentation (also ftp-able) and the demo site where you can try it out.

I also dabble in information retrieval (a fancy phrase for text processing). This is an offshoot of my database interests: a large component of biomedical databases consists of narrative text (which captures nuances that coded text cannot): examples are discharge summaries and operative notes. I'm looking at ways to optimize the searching process by indexing the content based on recognition of concepts in controlled biomedical vocabularies (I mainly play with the National Library of Medicine's Unified Medical Language System), and at ways of integrating text search with conventional database search.
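As a rough illustration of concept-based indexing, the sketch below maps phrases found in narrative text to concept identifiers and builds an inverted index from concept to document. The three-phrase vocabulary, the identifiers (X001, X002) and the sample notes are invented for this example; they are not real UMLS content, and the plain phrase matching shown here is only a stand-in for real concept recognition.

```python
import re
from collections import defaultdict

# Hypothetical mini-vocabulary: synonymous phrases share one concept ID.
VOCAB = {
    "myocardial infarction": "X001",
    "heart attack": "X001",
    "chest pain": "X002",
}


def concepts_in(text):
    """Return the set of concept IDs whose phrases occur in the text."""
    lowered = re.sub(r"\s+", " ", text.lower())
    return {
        concept_id
        for phrase, concept_id in VOCAB.items()
        if re.search(r"\b" + re.escape(phrase) + r"\b", lowered)
    }


def build_index(documents):
    """Build an inverted index: concept ID -> set of document IDs."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for concept_id in concepts_in(text):
            index[concept_id].add(doc_id)
    return index


DOCS = {
    "note1": "Patient admitted with chest pain; ruled out for myocardial infarction.",
    "note2": "History of heart attack in 1995.",
    "note3": "Routine follow-up, no complaints.",
}
INDEX = build_index(DOCS)
```

Searching on concept X001 finds both note1 and note2 even though they use different wording, which is the point of indexing by concept rather than by word. Note that note1 is returned although the infarction was ruled out: this naive sketch does no negation handling.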

In the past, I've worked in the area of genome informatics as well as parallel computation in molecular biology and genetics.

Presentations:

The following links point to the contents of presentations (organized as HTML framesets) that should be of general interest to medical informaticians.

Clinical Data Warehousing. Presented at AMIA Fall Symposium, Orlando, FL, Nov 8, 1998

ACT/DB: An Infrastructure for Clinical Trials Data Management. Columbia University, Jan 21, 1999

The EAV/CR Physical Data Model for Heterogeneous Scientific Databases. Human Brain Project Annual Meeting, NIH, Jun 5, 1999

Understanding and Implementing the EAV Database in the General Clinical Research Center. National GCRC Meeting, Baltimore, MD, April 13, 2002. The URL above is a converted PowerPoint presentation. A detailed explanatory paper can be found below.

An Introduction to EAV systems: National GCRC Meeting, Baltimore, MD, April 13, 2002.

Informatics Support of Data management for multi-centric clinical studies: Integrating clinical and genetics/genomic data. American College of Medical Informatics, Fort Lauderdale, Florida, March 2003.

Database Representation of Phenotype Data: Issues and Challenges. Human Genome Variation Society, American Society for Human Genetics Meeting, Los Angeles, CA, November 4, 2003.

Selected Publications:

Here is a list of selected recent publications. (Some of the papers are downloadable as MS-Word files plus figures, compressed into zip files. See the hyperlinks at the bottom of the publications list.)

  1. Nadkarni PM. Management of Evolving Map Data: Data Structures and Algorithms Based on the Framework Map. Genomics (1995) 30:565-573. Abstract
  2. Nadkarni PM, Cheung K-H. SQLGEN: An environment for rapid client-server database development. Computers and Biomedical Research (1995) 28:479-499. Abstract
  3. Nadkarni PM, Montgomery KM, Leblanc-Stracewski J, Krauter K. CONTIG EXPLORER: Interactive Exploratory Contig Assembly. Genomics (1996) 31:301-310. Abstract
  4. Nadkarni PM, Cheung K-H, Castiglione C, Miller PL, Kidd KK. DNA Workbench: a database for support of regional chromosomal mapping. Journal of Computational Biology, (1996) 3 (2), 319-329. Abstract
  5. Nadkarni PM. Mapdiff: an algorithm to report the differences between two genomic maps. Comput Applic Biosci, (1997) 13 (3) 217 - 225. Abstract
  6. Cheung, K-H, Nadkarni PM, Silverstein S, Miller PL, Kidd KK. PhenoDB: A database for the storage and analysis of pedigree and population genetic data. Computers and Biomedical Research (1996) 29:327-337. Abstract
  7. Nadkarni PM. Concept Locator: A Client-Server Application for Retrieval of UMLS Metathesaurus Concepts Through Complex Boolean Query. Computers and Biomedical Research (1997) 30:323-336. Abstract
  8. Nadkarni PM. QAV: Querying Entity-Attribute Value Metadata in a Biomedical Database. Computer Methods and Programs in Biomedicine (1997) 53 93-103. Abstract
  9. Nadkarni PM. Mapmerge: merging genomic maps. Bioinformatics (1998) 14(4), 310-316. Abstract
  10. Nadkarni, P, Brandt, C, Frawley, S M, Sayward, F, Einbinder, R, Zelterman, D, Schacter, L, Miller, P L. Managing attribute-value clinical trials data using the ACT/DB client-server database system. Journal of the American Medical Informatics Association (1998) 5(2) 139-151. Abstract
  11. Cheung, KH, Nadkarni PM, Shin DG. A metadata approach to query interoperation between molecular biology databases. Bioinformatics (1998) 14(6) 486-497. Abstract
  12. Nadkarni PM. CHRONOMERGE: An Application for the Merging and Display of Multiple Time-Stamped Data Streams. Computers and Biomedical Research (1998) 31 451-464. Abstract
  13. Nadkarni, P, Brandt, C. Data Extraction and Ad Hoc Query of an Entity-Attribute-Value Database. Journal of the American Medical Informatics Association (1998) 5(6) 511-527. Abstract
  14. Nadkarni PM, Marenco L, Chen R, Skoufos E, Shepherd G, Miller P. Organization of heterogeneous scientific data using the EAV/CR representation. Journal of the American Medical Informatics Association 1999 Nov-Dec;6(6):478-93. Abstract
  15. Stein HD, Nadkarni P, Erdos J, Miller PL. Exploring the degree of concordance of coded and textual data in answering clinical queries from a clinical data repository. J Am Med Inform Assoc 2000 Jan-Feb;7(1):42-54. Abstract
  16. Nadkarni PM, Brandt C, Marenco L. WebEAV: Automatic Metadata-Driven Generation of Web Interfaces to Entity-Attribute-Value Databases. J Am Med Inform Assoc 2000;7(4):343-56. Abstract
  17. Roland S. Chen, Prakash Nadkarni, Luis Marenco, Forrest Levin, Joseph Erdos and Perry L. Miller. Exploring Performance Issues for a Clinical Database Organized Using an Entity-Attribute-Value Representation. J Am Med Inform Assoc 2000; 7(5):475-487. Abstract
  18. Chen RS and Brandt CA. UMLS Concept Indexing for Production Databases: A Feasibility Study. Journal of American Medical Informatics Association, 2001 8: 80-91 Abstract
  19. Mutalik PG, Deshpande AM, Nadkarni PM. Use of General-purpose Negation Detection to Augment Concept Indexing of Medical Documents: A Quantitative Study using the UMLS. Journal of American Medical Informatics Association, 2001 8: 598-609 Abstract
  20. Nadkarni PM: An Introduction to Information Retrieval: Applications in Genomics. The Pharmacogenomics Journal, 2002 2(2) 96-102. Abstract
  21. Deshpande AM, Brandt CA, Nadkarni PM. Metadata-driven Ad hoc Query of Patient Data: Meeting the Needs of Clinical Studies. Journal of the American Medical Informatics Association. 2002; 9(4) 369-382. Abstract
  22. Fisk JM, Mutalik PG, Levin FW, Erdos J, Taylor C, Nadkarni PM. Integrating Query of Relational and Textual Data in Clinical Databases: a Case Study. Journal of the American Medical Informatics Association. 2003: 10(1) 21-38. Abstract
  23. Nadkarni PM, Sun K, Wiepert M. Designing and Implementing Special-Purpose Databases: Lessons from the Pharmacogenetic Network. Pharmacogenomics 2002 3(5): 687-96. Abstract
  24. Nadkarni PM. The challenges of recording phenotype in a generalizable, computable form (Perspective). The Pharmacogenomics Journal 2003 3(1) 8-10.
  25. Deshpande AM, Brandt CA, Nadkarni PM. Temporal Query of Attribute-Value Patient Data: Utilizing the Constraints of Clinical Studies. International Journal of Medical Informatics 2003; 70 (1): 59-77. Abstract
  26. Marenco L, Tosches N, Crasto C, Shepherd G, Miller PL, Nadkarni PM. Achieving Evolvable Web-Database Bioscience Applications Using the EAV/CR Framework: Recent Advances. Journal of the American Medical Informatics Association. 2003: 10(5). Abstract
  27. Brandt CA, Gadagkar R, Rodriguez C, Nadkarni PM. Managing Complex Change in Clinical Study Metadata. Journal of the American Medical Informatics Association. 2004 11(3) 380-91 Abstract
  28. Marenco L, Wang TY, Shepherd G, Miller PL, Nadkarni PM. QIS: A Framework for Biomedical Database Federation. Journal of the American Medical Informatics Association. 2004 11(6) 523-34. Abstract
  29. Holford M, Li N, Nadkarni P, Zhao H. VitaPad: visualization tools for analysis of pathway data. Bioinformatics: 2004, 21 (8) 1596-1602 Abstract
  30. Nadkarni PM and Wiepert M. Translating Pharmacogenomics Discoveries into Clinical Practice: The Role of Curated Databases. Pharmacogenomics 2005 Jul;6(5):451-4.
  31. Brandt CA, Argraves S, Money R, Ananth G, Trocky NM, Nadkarni PM. Informatics tools to improve clinical research study implementation. Contemporary Clinical Trials. 2005.
  32. Nadkarni PM, Brandt CA. The Common Data Elements for cancer research: remarks on functions and structure. Methods Inf Med. 2006;45(6):594-601.

Book Chapters

Shepherd, G.M., M.D. Healy, M.S. Singer, B.E. Peterson, J.S. Mirsky, L. Wright, J.E. Smith, P.M. Nadkarni, & P.L. Miller. Senselab: a project in multidisciplinary, multilevel sensory integration. pp. 21-56 of Neuroinformatics: An Overview of the Human Brain Project, ed. S.H. Koslow & M.F. Huerta. Lawrence Erlbaum Associates, Inc. Mahwah, NJ: 1997.

Prakash Nadkarni, Jason Mirsky, Emmanouil Skoufos, Matthew Healy, Michael Hines, Perry Miller and Gordon Shepherd. Modeling Heterogeneous Data on the Nervous System (Book Chapter) in Bioinformatics Databases, ed. Stanley I. Letovsky. Kluwer Academic Publishers, Dordrecht, Netherlands. pp. 38-51

Books

Prakash Nadkarni: Parallel Programming with Linda: An Advanced Introduction.

Linda, conceived by David Gelernter and initially implemented by Nick Carriero, both of Yale University, is a "mini-language" consisting of just four constructs that is embedded in a conventional language (e.g., C or FORTRAN) to give it parallel capabilities. While Linda is conceptually very simple, its reliance on a pre-processor (which must be hand-crafted for the language in which it is to be embedded) has limited its widespread use, in contrast to MPI, which, though somewhat more difficult to use, depends purely on a subroutine library. This book was written back in 1992, and also gives an introduction to parallel programming. It can be downloaded by clicking here.
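For readers who have not met Linda, here is a toy sketch of the idea in Python, assuming the four constructs meant are the usual out, in, rd and eval operations on a shared "tuple space". It is not the book's code and not a real Linda implementation: the class name TupleSpace is my own, None stands in for Linda's formal (wildcard) fields, in is spelled in_ because in is a Python keyword, and eval is approximated with a thread.

```python
import threading


class TupleSpace:
    """Toy shared tuple space guarded by a condition variable."""

    def __init__(self):
        self._tuples = []
        self._cond = threading.Condition()

    def out(self, *tup):
        """Deposit a tuple and wake any waiting readers."""
        with self._cond:
            self._tuples.append(tup)
            self._cond.notify_all()

    def _find(self, pattern):
        # A None field in the pattern matches anything (hypothetical wildcard).
        for tup in self._tuples:
            if len(tup) == len(pattern) and all(
                p is None or p == t for p, t in zip(pattern, tup)
            ):
                return tup
        return None

    def rd(self, *pattern):
        """Block until a matching tuple exists; return it without removing it."""
        with self._cond:
            while (tup := self._find(pattern)) is None:
                self._cond.wait()
            return tup

    def in_(self, *pattern):
        """Block until a matching tuple exists; remove and return it."""
        with self._cond:
            while (tup := self._find(pattern)) is None:
                self._cond.wait()
            self._tuples.remove(tup)
            return tup

    def eval(self, func, *args):
        """Compute func(*args) in a new thread and deposit the resulting tuple."""
        worker = threading.Thread(target=lambda: self.out(*func(*args)))
        worker.start()
        return worker


def square(n):
    return ("square", n, n * n)


space = TupleSpace()
for n in range(4):
    space.eval(square, n)  # four workers, each producing one tuple
results = sorted(space.in_("square", None, None)[2] for _ in range(4))
```

The workers and the collector never refer to each other directly; they coordinate only through tuples in the shared space, which is what makes the model conceptually simple.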

Downloadable Publications 

The downloadable zip files linked to below typically contain more than one publication. Refer to the numbers in the list above. (Please note: figures are generally bundled with the MS-Word file or supplied separately, but some figures may be missing.) Many of the publications originally appeared in the Journal of the American Medical Informatics Association, where you can find excellent content related to Medical Informatics. JAMIA publications that are more than three years old are also freely downloadable (as PDFs) from NCBI's PubMed Central site.

Click on the following links to download full text of publications of interest:
publications 1, 3 and 4
publications 7 and 8
publication 9
publications 10, 12 and 13
publication 14
publications 16 and 17
publication 18
publication 19
publication 20
publication 21
publication 22
publication 23
publication 24
publication 25
publication 26
publication 28
publication 29
publication 30
publication 32
