The examples in this repository build on the information from the MIMIC-III: Getting Started repository.
PowerPoint slides are in the root directory in both .pptx and .pdf formats for two talks:
-
KU Med Frontier's Informatics Meetup on 2019-09-26: MIMIC III: A free publicly available EHR Database for Research
-
Kansas City R Users Group on 2019-02-09: Exploring MIMIMC-III Critical Care Data Using Postgres/tidyverse/dbplyr/dplyr.
Let's build a data dictionary with information about fields by table and how they can be used.
The MIMIC-III database schema can be viewed online here.
-
Explore the six fields in the
patientstable. -
Why is there a pattern in the counts of records by
subject_id? -
Let's compute age at death for patients when both
dob(date of birth) anddod(date of death) are defined. -
To protect patient anonymity dates were shifted. Patients originally >89 years old show up as 300 years old! This is an unusual approach.
-
Database drivers
RPostgresandPostgreSQLgive different age-at-death density plots (for now) since date computations are not always correct.
-
Explore the 15 fields in the
admissionstable. -
A log scale was used to view a density plot for length-of-stay computed from admit time to discharge time.
-
The relationship between the
diagnosisfield in this table and thediagnoses_icdtable information is unclear.
-
Explore the
diagnoses_icdfact table with additional information in thed_icd_diagnosesdimension table. -
A bar plot of
seq_num(diagnosis priority) shows values can range from 1 to 39, but usually are less than 10. -
The dimension table has many icd 9 codes that are never referenced by a fact table record.
-
The fact table has over 140 icd 9 codes that cannot be found in the dimension table.
-
A
left_joinis likely more desirable than aninner_joinwhen connecting the dimension table to the fact table. -
Computed Summaries
-
Counts by ICD Diagnosis COde
-
Counts by primary vs secondary diagnosis by ICD Diagnosis Code (stored in file)
-
Counts by 10-year age intervals by ICD Diagnosis Code (stored in file)
-
-
Explore
charteventstable with 330 million records and over 6400 types of chart events from two differentdbsources. -
Explore event categories, parameter types, and units of measure.
-
The "first look" file identified data inconsistencies with
valueuom(value unit of measure), and several fields in the lab items dimension. -
The file Lab-Event-Multiple-Units.xlsx identifies
loinc_codes anditemids with multiple units of measure. -
The file Lab-Event-Item-Counts.xlsx shows lab event counts by
loinc_codeanditemidalong with the label and unit(s) of measure. -
The file Lab-Event-Fluid-Category-Counts shows lab event counts by fluid (9 groups) and category (3 groups).