The Center for Health Systems Innovation at Oklahoma State University has been given a massive data warehouse by Cerner Corporation, a major electronic medical record (EMR) provider, to help develop analytic applications. The data warehouse contains EMRs on the visits of more than 50 million unique patients across U.S. hospitals (1995–2014). It includes more than 84 million acute admissions and emergency and ambulatory visits. It is the largest and the industry’s only relational database that includes comprehensive records with pharmacy, laboratory, clinical events, admissions, and billing data. The database also includes more than 2.4 billion laboratory results and more than 295 million orders for nearly 4,500 drugs by name and brand. It is one of the largest compilations of de-identified, real-world, HIPAA-compliant data of its type.
The EMRs can be used to develop multiple analytics applications. One application is to understand the relationships between diseases based on which diseases occur simultaneously in patients. When multiple diseases are present in a patient, the condition is called comorbidity. Comorbidities can differ across population groups. In this application, a research group at Oklahoma State University compared the comorbidities in patients from urban and rural regions.
To compare the comorbidities, a network analysis approach was applied. A network consists of a defined set of items called nodes, which are linked to each other through edges. An edge represents a defined relationship between the nodes. A very common example of a network is a friendship network, in which individuals are connected to each other if they are friends. Other common networks are computer networks, Web page networks, road networks, and airport networks. For this comparison, networks of the diseases in the patients from rural and urban hospitals were developed.
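The node-and-edge definition can be made concrete with a small sketch. The Python below is only an illustration and is not part of the study: the friendship pairs and the helper name `build_network` are hypothetical. It stores an undirected network as a map from each node to the set of nodes it shares an edge with.

```python
from collections import defaultdict


def build_network(edges):
    """Return an undirected network as a dict: node -> set of neighboring nodes."""
    adjacency = defaultdict(set)
    for a, b in edges:
        # An undirected edge is recorded from both endpoints.
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


# Hypothetical friendship network: each pair is one edge between two people.
friendships = [("Ana", "Ben"), ("Ben", "Cara"), ("Ana", "Cara"), ("Cara", "Dev")]
network = build_network(friendships)

# The degree of a node is the number of edges attached to it.
degree = {node: len(neighbors) for node, neighbors in network.items()}
```

In a disease network the same structure applies, with diseases as the nodes and a comorbidity relationship as the edge.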
The information about the diseases developed by each patient during hospital visits was used to create a disease network. The total number of hospital visits was 66 million in the urban hospitals and 1 million in the rural hospitals. To manage such a huge data set, the Teradata Aster Big Data platform was used. To extract and prepare the network data, the SQL, SQL-MR, and SQL-GR frameworks supported by Aster were utilized. To visualize the networks, Aster AppCenter and Gephi were used. Figure 7.11 presents the rural and urban comorbidity networks. In these networks, nodes represent different diseases classified according to the International Classification of Diseases, Ninth Revision, Clinical Modification (ICD-9-CM), aggregated at the three-digit level. Two diseases were linked if they were significantly correlated or comorbid (p <>
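As a rough illustration of how a disease network of this kind might be assembled, the sketch below uses plain Python rather than the Aster SQL frameworks used in the study. Everything in it is an assumption made for illustration: the patient records are invented, the function names `three_digit` and `comorbidity_edges` are hypothetical, and the phi correlation with a `min_phi` cutoff stands in for whatever significance test the researchers applied.

```python
from collections import Counter
from itertools import combinations
from math import sqrt


def three_digit(code):
    """Aggregate a dotted ICD-9-CM code to its category (the text before the dot)."""
    return code.strip().upper().split(".")[0]


def comorbidity_edges(patient_codes, min_phi=0.0):
    """Return {(disease_a, disease_b): phi} for disease pairs that co-occur.

    patient_codes maps a patient id to that patient's list of ICD-9-CM codes.
    min_phi is a hypothetical cutoff, not the threshold used in the study.
    """
    # One set of disease categories per patient, so repeat visits count once.
    patients = [{three_digit(c) for c in codes} for codes in patient_codes.values()]
    n = len(patients)
    prevalence = Counter(d for p in patients for d in p)
    pair_counts = Counter(
        pair for p in patients for pair in combinations(sorted(p), 2)
    )

    edges = {}
    for (a, b), n_ab in pair_counts.items():
        n_a, n_b = prevalence[a], prevalence[b]
        denom = sqrt(n_a * n_b * (n - n_a) * (n - n_b))
        if denom == 0:
            continue  # a disease present in every patient has no defined correlation
        phi = (n_ab * n - n_a * n_b) / denom
        if phi > min_phi:
            edges[(a, b)] = phi
    return edges


# Invented records: patient id -> diagnosis codes seen during hospital visits.
records = {
    "p1": ["250.00", "401.9"],
    "p2": ["250.02", "401.1"],
    "p3": ["428.0"],
    "p4": ["428.0", "486"],
}
edges = comorbidity_edges(records)
```

Running the same function separately on rural and urban records would give two edge lists that can be compared or exported to a tool such as Gephi for visualization.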
Questions for Discussion
1. Why could comorbidity of diseases be different between rural and urban hospitals?
2. What issue does the huge difference between the numbers of rural and urban patient encounters raise?
3. What are the main components of a network?
4. Where else can you apply the network approach?