Use the OMOP analytics sample notebooks in healthcare data solutions (preview)

Article
03/14/2024

[This article is prerelease documentation and is subject to change.]

This section presents two Observational Medical Outcomes Partnership (OMOP) sample scenarios. These scenarios reflect common clinical research investigations conducted by the OMOP community regarding exposure to primary and secondary drugs across patient populations. From a time-to-value perspective, they demonstrate how quickly you can visualize analytical outcomes within your Fabric workspace. To produce the visualization, run the sample notebooks after the data pipelines hydrate the silver and gold lakehouses with the Fast Healthcare Interoperability Resources (FHIR) clinical data.

Prerequisites

Before you run the sample notebooks healthcare#_msft_omop_sample_drug_exposure_era and healthcare#_msft_omop_sample_drug_exposure_insights, make sure you complete the following prerequisites:

Verify that the OMOP database is created and populated with sample data.
Deploy and set up the sample data in your environment, as explained in Deploy sample data.
Review the sample notebook configuration, as explained in:
- Configure the drug exposure era sample notebook
- Configure the drug exposure insights sample notebook

Sample scenario

The sample scenarios aim to identify patient cohorts stratified by gender and age who are exposed to a secondary drug during a certain period while on the same primary drug. The process includes the following steps:

Stratify patient population by gender and age.
Identify the drug (for example, insulin isophane, human 70 UNT/ML / insulin, regular, human 30 UNT/ML) that the patient population took at least once over a period of one year. If there isn't enough data, consider a period of five years instead.
Identify a second drug that the same patient population is exposed to during the same period.
Plot the distribution of secondary drug exposure across the gender strata.
Generate the records and visualize the distribution as a histogram plot.
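
The steps above can be sketched in plain Python. This is a minimal illustration, not the logic of the sample notebooks: the function names, the dictionary-based records, the ten-year age bands, and the sample patients are all hypothetical. It also assumes that "the same period" means both drug eras overlap the analysis window, and that the five-year fallback extends the window backward from the chosen year.

```python
from collections import Counter
from datetime import date

def age_band(age, width=10):
    """Return a label such as '60-69' for the band that contains age."""
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def exposed_in_window(eras, drug_concept_id, start, end):
    """Person IDs with at least one era of the drug overlapping [start, end]."""
    return {
        era["person_id"]
        for era in eras
        if era["drug_concept_id"] == drug_concept_id
        and era["drug_era_start_date"] <= end
        and era["drug_era_end_date"] >= start
    }

def secondary_exposure_by_strata(persons, eras, primary_drug, secondary_drug,
                                 year, span_years=1):
    """Count patients exposed to both drugs, keyed by (gender, age band)."""
    # Assumption: the window ends on December 31 of `year` and spans
    # `span_years` calendar years backward (1 by default, 5 as the fallback).
    start, end = date(year - span_years + 1, 1, 1), date(year, 12, 31)
    on_primary = exposed_in_window(eras, primary_drug, start, end)
    on_both = on_primary & exposed_in_window(eras, secondary_drug, start, end)
    counts = Counter()
    for person in persons:
        if person["person_id"] in on_both:
            age = year - person["year_of_birth"]
            counts[(person["gender"], age_band(age))] += 1
    return counts

# Hypothetical sample data, for illustration only.
PRIMARY, SECONDARY = 1596977, 1308216
persons = [
    {"person_id": 1, "gender": "F", "year_of_birth": 1960},
    {"person_id": 2, "gender": "M", "year_of_birth": 1985},
    {"person_id": 3, "gender": "F", "year_of_birth": 1965},
]

def make_era(person_id, drug, start, end):
    return {"person_id": person_id, "drug_concept_id": drug,
            "drug_era_start_date": start, "drug_era_end_date": end}

eras = [
    make_era(1, PRIMARY, date(2022, 1, 10), date(2022, 3, 1)),
    make_era(1, SECONDARY, date(2022, 2, 1), date(2022, 4, 1)),
    make_era(2, PRIMARY, date(2022, 5, 1), date(2022, 6, 1)),
    make_era(3, PRIMARY, date(2022, 3, 1), date(2022, 4, 1)),
    make_era(3, SECONDARY, date(2019, 3, 1), date(2019, 4, 1)),
]

one_year = secondary_exposure_by_strata(persons, eras, PRIMARY, SECONDARY, 2022)
five_years = secondary_exposure_by_strata(persons, eras, PRIMARY, SECONDARY,
                                          2022, span_years=5)
```

With this data, the one-year window finds only patient 1, while the five-year window also picks up patient 3, whose secondary drug era falls in 2019. The resulting counts per stratum are the values a histogram would plot.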

Tip

The sample scenarios reference the OHDSI Drug Eras sample scripts and the OMOP Drug Exposure queries. You can review these resources to learn more about similar examples published by the OMOP community.

Sample notebook execution inputs

The primary design objective is to generate the drug era records, represented by the OMOP standardized derived table drug_era. This table stores the calculated drug eras: drug exposures aggregated by person, drug ingredient, and persistence window. Each drug era represents a continuous period during which a person is assumed to be exposed to a specific active ingredient, as distinct from an individual drug exposure record.

The table contains the following columns:

drug_era_id: Unique identifier for each drug era.
person_id: Foreign key referencing the person exposed to the drug, with demographic details in the Person table.
drug_concept_id: Foreign key referring to a standardized concept identifier for the active ingredient.
drug_era_start_date: Start date of the drug era, derived from the first drug exposure.
drug_era_end_date: End date of the drug era, based on the last drug exposure.
drug_exposure_count: Total number of drug exposures during the drug era.
gap_days: Number of days not covered by the drug exposure records that contributed to the drug era.

To generate the drug era records, we use the following OMOP standardized clinical tables:

Drug Exposure: This table contains the drug exposure data, including drug_exposure_id, person_id, drug_concept_id, drug_exposure_start_date, drug_exposure_end_date, and days_supply.
Concept Ancestor: This table stores hierarchical relationships between concepts in various vocabularies such as RxNorm. It includes the ancestor_concept_id (a reference to a higher-level concept) and the descendant_concept_id (a reference to a lower-level concept), representing the broader to narrower concept connections.
Concept: This table contains the concept data, including concept_id, concept_name, domain_id, vocabulary_id, and concept_class_id.
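
To make the relationship between these tables and drug_era concrete, the following sketch derives drug eras from a handful of exposure records. It is an illustration under stated assumptions, not the implementation used by the sample notebook: it assumes a 30-day persistence window (the convention in the OHDSI drug era scripts), assumes every exposure already has an end date, and replaces the Concept Ancestor lookup with a plain dictionary. The drug concept IDs 111 and 222 are hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from itertools import groupby

# Assumption: 30-day persistence window; the notebook's value may differ.
PERSISTENCE_WINDOW_DAYS = 30

@dataclass
class DrugEra:
    drug_era_id: int
    person_id: int
    drug_concept_id: int  # ingredient-level concept
    drug_era_start_date: date
    drug_era_end_date: date
    drug_exposure_count: int
    gap_days: int

def build_drug_eras(exposures, ingredient_of, window_days=PERSISTENCE_WINDOW_DAYS):
    """Merge exposures into eras per person and ingredient.

    exposures: iterable of (person_id, drug_concept_id, start_date, end_date).
    ingredient_of: dict from drug concept to ingredient concept, standing in
    for the Concept Ancestor lookup.
    """
    # Roll each exposure up to its ingredient, then sort by person,
    # ingredient, and start date so groupby sees contiguous groups.
    rolled_up = sorted(
        (person_id, ingredient_of[drug], start, end)
        for person_id, drug, start, end in exposures
        if drug in ingredient_of
    )
    eras = []
    for (person_id, ingredient), rows in groupby(rolled_up, key=lambda r: r[:2]):
        era = None
        for _, _, start, end in rows:
            if era is not None and (start - era.drug_era_end_date).days <= window_days:
                # Extend the current era; days strictly between the previous
                # end and this start are not covered by any exposure.
                uncovered = (start - era.drug_era_end_date).days - 1
                era.gap_days += max(uncovered, 0)
                era.drug_era_end_date = max(era.drug_era_end_date, end)
                era.drug_exposure_count += 1
            else:
                era = DrugEra(len(eras) + 1, person_id, ingredient, start, end, 1, 0)
                eras.append(era)
    return eras

# Hypothetical data: two drug products that share one ingredient.
sample_ingredient_of = {111: 1596977, 222: 1596977}
sample_exposures = [
    (1, 111, date(2022, 1, 1), date(2022, 1, 10)),
    (1, 222, date(2022, 1, 15), date(2022, 1, 20)),
    (1, 111, date(2022, 6, 1), date(2022, 6, 10)),
]
sample_eras = build_drug_eras(sample_exposures, sample_ingredient_of)
```

In this example, the two January exposures fall within the persistence window, so they merge into one era with an exposure count of 2 and four uncovered gap days (January 11 through 14). The June exposure starts a separate era.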

Sample input parameters

primary_drug = 1596977 - insulin
secondary_drug = 1308216 - lisinopril
year = 2022

Sample notebook outputs

Executing the two sample notebooks generates a histogram that shows the distribution of secondary drug exposure across the gender and age strata of the patient population. The population is identified for a specific period from the derived OMOP table omop.drug_era. In this example, the period is one year.

You can use the distribution to analyze the following aspects:

Impact of exposure by gender and age.
Median distribution of impacted population.
Descriptive statistics that characterize the population.
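
As a rough illustration of this kind of analysis, the sketch below computes per-gender descriptive statistics with Python's standard statistics module. The cohort records and the function name are hypothetical; in practice the input would come from the notebook's query results.

```python
from collections import defaultdict
from statistics import mean, median

def summarize_by_gender(cohort):
    """Summarize ages per gender for (gender, age) records."""
    ages = defaultdict(list)
    for gender, age in cohort:
        ages[gender].append(age)
    return {
        gender: {
            "count": len(values),
            "median_age": median(values),
            "mean_age": mean(values),
            "min_age": min(values),
            "max_age": max(values),
        }
        for gender, values in ages.items()
    }

# Hypothetical cohort of patients exposed to the secondary drug.
cohort = [("F", 62), ("F", 57), ("F", 71), ("M", 45), ("M", 66)]
summary = summarize_by_gender(cohort)
```

Here the median age is 62 for the three female patients and 55.5 for the two male patients. The same grouping could be extended with age bands to match the strata shown in the histogram.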

Things to remember

To test your custom scenarios, make a copy of the sample notebooks. Don't update the notebooks directly.
The visualization notebook has the following parameters that you can configure to run different analyses:
- primary_drug: The primary drug to analyze.
- secondary_drug: The secondary drug to analyze.
- year: The year for which the analysis should be performed.
Each run of the drug exposure era notebook first deletes all existing OMOP drug_era records, and then recreates them from the latest OMOP data, so repeated runs don't create duplicate records.
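
The delete-then-recreate behavior can be pictured with a small sketch. This example uses an in-memory SQLite database purely as a stand-in; the sample notebook runs against lakehouse tables, and the function name, the reduced column set, and the rows shown here are hypothetical.

```python
import sqlite3

def rebuild_drug_era(conn, era_rows):
    """Replace the contents of drug_era with era_rows in one transaction."""
    with conn:  # commits on success, rolls back if an error is raised
        conn.execute("DELETE FROM drug_era")
        conn.executemany(
            "INSERT INTO drug_era (person_id, drug_concept_id, "
            "drug_era_start_date, drug_era_end_date) VALUES (?, ?, ?, ?)",
            era_rows,
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE drug_era (drug_era_id INTEGER PRIMARY KEY, person_id INTEGER, "
    "drug_concept_id INTEGER, drug_era_start_date TEXT, drug_era_end_date TEXT)"
)

# Hypothetical era rows, for illustration only.
rows = [
    (1, 1596977, "2022-01-01", "2022-01-20"),
    (1, 1308216, "2022-02-01", "2022-03-15"),
]

rebuild_drug_era(conn, rows)
rebuild_drug_era(conn, rows)  # a second run replaces the rows, it doesn't append
row_count = conn.execute("SELECT COUNT(*) FROM drug_era").fetchone()[0]
```

After two runs the table still holds two rows, which is the property that makes the notebook safe to rerun. It also means that any manual changes to drug_era are lost on the next run.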
