Infrastructure of the Electronic Health Record Data Management for Digital Patient Phenotype Creating
- 10.2991/itids-19.2019.51
- medical information system, electronic health record, digital phenotype, information storage, Big Data, NoSQL, data extraction, data analysis, machine learning
Improving the health care system requires effective use of the digitized biomedical data that has already been accumulated and is constantly updated thanks to the widespread introduction of applied information systems. One of the priorities here is the creation of a patient's digital phenotype based on data from an electronic health record (EHR). Solving this problem requires an ecosystem that provides secure storage, processing, and analysis of large volumes of heterogeneous information (tables, images, natural-language texts). The main goal of the article is to study the potential of Big Data methods and technologies for the reuse of digitized biomedical data. We consider the design and implementation of an EHR data management infrastructure that provides tools for creating digital patient phenotypes. We designed the data lake prototype “5P Medicine-Big Data” based on the original multi-layer “Big Data to Smart Data” (BD2SD) approach. We developed an information repository based on the generalized NoSQL approach and the “document repository” model, together with a system of services. These services provide secure data transfer from medical information systems, data storage, validation of information, preliminary data analysis and visualization, data extraction from unstructured documents, and sampling for machine learning and deep learning methods. We propose methods for analyzing large volumes of EHR data based on machine learning and Big Data technologies. We applied these methods to extract valid information from unstructured EHR data (first of all, patient examination protocols) and to identify characteristic patient categories. We constructed digital patient phenotypes, each represented by exactly those features extracted from the EHR data that are key in the context of a specific medical problem. The digital phenotypes were formed from more than 30,000 medical texts for about 2,000 patients.
We demonstrate the effectiveness of the proposed approach on the problem of identifying patterns in patients' visits to a doctor before and after cardiovascular diseases (angina pectoris, myocardial infarction, and ischemic heart disease) appeared in the electronic health record.
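The before/after visit analysis mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: the `Visit` record, the `split_visits_by_first_cardio` helper, and the choice of the earliest cardiovascular diagnosis as the split point are all assumptions made for illustration, and the diagnosis is assumed to have already been extracted from the examination protocol text.

```python
from dataclasses import dataclass
from datetime import date

# Diagnosis labels taken from the abstract's examples; the exact strings are hypothetical.
CARDIO_DIAGNOSES = {"angina pectoris", "myocardial infarction", "ischemic heart disease"}


@dataclass
class Visit:
    """Hypothetical, simplified EHR visit record."""
    patient_id: str
    visit_date: date
    diagnosis: str  # assumed to be already extracted from unstructured text


def split_visits_by_first_cardio(visits):
    """Split each patient's visits around the earliest cardiovascular diagnosis.

    Patients with no cardiovascular diagnosis are omitted from the result.
    """
    by_patient = {}
    for v in visits:
        by_patient.setdefault(v.patient_id, []).append(v)

    result = {}
    for pid, patient_visits in by_patient.items():
        ordered = sorted(patient_visits, key=lambda v: v.visit_date)
        # Assumption: the split point is the first visit with a cardiovascular diagnosis.
        onset = next(
            (v.visit_date for v in ordered if v.diagnosis in CARDIO_DIAGNOSES), None
        )
        if onset is None:
            continue
        result[pid] = {
            "onset": onset,
            "before": [v for v in ordered if v.visit_date < onset],
            "after": [v for v in ordered if v.visit_date >= onset],
        }
    return result
```

Per-patient "before" and "after" lists of this kind could then feed whatever pattern analysis is applied downstream; the abstract does not specify that step, so it is left out here.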
- © 2019, the Authors. Published by Atlantis Press.
- Open Access
- This is an open access article distributed under the CC BY-NC license (http://creativecommons.org/licenses/by-nc/4.0/).
Cite this article
TY - CONF
AU - Alexander Zakharov
AU - Alexander Potapov
AU - Irina Zakharova
AU - Alexander Kotelnikov
AU - Dmitriy Panfilenko
PY - 2019/05
DA - 2019/05
TI - Infrastructure of the Electronic Health Record Data Management for Digital Patient Phenotype Creating
BT - Proceedings of the 7th Scientific Conference on Information Technologies for Intelligent Decision Making Support (ITIDS 2019)
PB - Atlantis Press
SP - 285
EP - 290
SN - 1951-6851
UR - https://doi.org/10.2991/itids-19.2019.51
DO - 10.2991/itids-19.2019.51
ID - Zakharov2019/05
ER -