Drug Domain‎ > ‎

NDF-RT

Vocabulary_ID=7

The following describes the procedure to update the NDF-RT vocabulary (concepts, synonyms, relationships, ancestry) with the latest release of the source.

The load process is a complete refresh of data. All new vocabulary entieties will be added. All vocabulary entities that do not exist in the latest files set will be marked as deprecated.

 

The concept hierarchy is defined in OMOP for  NDF-RT  concepts that are part of the following:

– Chemical Structure
– Indication or Contra-indication
– Mechanism of Action
– Pharmaceutical Preparations
– Physiologic Effect

 

1.1 Scripts Transfer

All scripts and files listed at the bottom of this page should be transferred to specially created directory (eg /data/RxNorm).

 

 Script Name

 Description

07_create_schema.sql

 Create prestage-stage schema with tables and indexes

07_load_to_prestage.sh

 Load NDF-RT row-data

07_transform_row_concepts.sql    

 Convert and store in stage table concept_stage

07_load_concepts.sql 

 Load only new concepts into concept table, add invalid code information

07_load_relations.sql  

 Load only new relationships into relationship table, add invalid code information.

 07_transform_row_relations.sql 
 Convert and store in stage table concept_relationship_stage

1.2 Download data from source

The most current international release is located at the following URL: http://www.nlm.nih.gov/research/umls/rxnorm/docs/rxnormfiles.html

Historical archives located at: http://www.nlm.nih.gov/research/umls/rxnorm/docs/rxnormarchive.html

Note: Make sure you perform updates in chronological order. So, if you want to process a January and April release, execute the below steps for January, and after successful completion, for April.

Download the latest release file (e.g. RxNorm_full_01062014.zip)

Transfer downloaded zip file to
/data/RxNorm directory.

Zip file suffix contains release date (e.g. 08052012=05-Aug-2013)
Latest production zip file is ~136MB in size

Note: For the rest of the document we will be referring to current schema as RXNORM_20130805 you would have to adjust this entry to correspond to current release date file

1.3 Build Database Schema

All tables will be located in the schema RxNorm_<DateOfSourceFile> as following: RXNORM_20130805
The following steps will create a schema and will grant to a schema owner appropriate permissions.

In addition to the schema, it will create tables to load raw data: RXNATOMARCHIVE,RXNCONSO,RXNCUI,RXNCUICHANGES,RXNDOC,RXNREL,RXNSAB,RXNSAT and RXNSTY  as well as staging tables: 

CONCEPT_STAGE, 

CONCEPT_ANCESTOR_STAGE, 

CONCEPT_RELATIONSHIP_STAGE, 

SOURCE_TO_CONCEPT_MAP_STAGE

$ sqlplus System/<SystemPass>@DEV_VOCAB @07_create_schema.sql <RXNORM_Schema> <Pass_RXNORM_Schema>

Format of RXNORM_Schema should be RXNORM_YYYYMMDD (e.g. RXNORM_20130805)

 

Staging tables match Vocabulary schema version 4.4 with slight modification. Modifications made to the tables are:

CONCEPT_STAGE.CONCEPT_ID is nullable

          CONCEPT_RELATIONSHIP_STAGE.REL_ID is nullable

 

Staging tables match Vocabulary schema version 4.0 with slight modification. Modifications made to the tables are:

CONCEPT_STAGE.CONCEPT_ID is nullable

CONCEPT_RELATIONSHIP_STAGE.REL_ID is nullable

1.4 Raw data load into Oracle tables

         1.4.1 Extract files from the zip archive into the current directory

        $ unzip -u RxNorm_full_08052013.zip -d rxnorm_20130805

 

1.4.2 Verify that the following 2 files have been created:

    RXNCONSO.RRF    (Latest file size ~96MB)

    RXNATOMARCHIVE.RRF    (Latest file size ~42MB)

    RXNCUI.RRF    (Latest file size ~1MB)

    RXNCUICHANGES.RRF    (Latest file size ~1MB)

    RXNDOC.RRF    (Latest file size ~1MB)

    RXNREL.RRF    (Latest file size ~230MB)

    RXNSAB.RRF    (Latest file size ~8KB)

    RXNSAT.RRF    (Latest file size ~385MB)

    RXNSTY.RRF    (Latest file size ~15MB)

   

        1.4.3    Load raw data into Oracle. 

        This sqlldr process will run for about 2min.

        $ 07_load_to_prestage.sh /data/RxNorm/rxnorm_20130805 <RXNORM_Schema>/<Pass_RXNORM_Schema>

        1.4.3.1 Verify that files with extension .bad are empty, no records have been rejected.

 

        1.4.3.2 Copy data and intermediate files to backup area In this step, downloaded files will be archived to preserve source of the data. Discuss a location of backup area with your system administrator.         

        $ cp -Ru /data/RxNorm/*.zip /data/backup_area ; cp -Ru /data/RxNorm/*.log /data/backup_area


        1.4.3.3   Verify that number of records loaded is approximate equivalent to prior stage load

                     sqlplus RXNORM_20130805/myPass@DEV_VOCAB

            Execute the following SQL commands:

    SELECT count(*) FROM RXNCONSO;

Current production row count is:     1,002,967 Rows

 SELECT count(*) FROM RXNREL;

Current production row count is:     4,346,111 Rows

1.5 Loading Staging Tables from raw

 

1.5.1 Convert and store in staging table concepts

Loaded from raw staged data RXNCONSO into the staged data CONCEPT_STAGE

In this step we will perform substitute GUID identifiers into numerical format.

 

sqlplus RXNORM_20120702/myPass@DEV_VOCAB @07_transform_row_concepts.sql

 

1.5.2  Verify that number of records loaded is equivalent to prior production load

    In this step we will verify that number of active records in staging tables is valid prior to transfer data to historical DEV schema. 

    a. Number of Records in stage table

    b. Number of records in DEV schema not deleted (active, prior load production)

    c. How many records would be added to DEV schema table

    d. How many active DEV records will be marked for deletion


    In most cases  number of records in stage (a) should be greater than number of active records in production (b) 
    In rare occasion that might not be the case. It is responsibility of the developer running the scripts to verify that that is the the valid.

 

 

     $ sqlplus RXNORM_20120131/myPass@DEV_VOCAB

 

 

        Execute the following SQL commands:

SELECT '- Num Rec in stage' AS scr, COUNT (8) cnt
  FROM CONCEPT_STAGE c
 WHERE C.VOCABULARY_ID  IN (07, 28, 32)
UNION ALL
SELECT '- Num Rec in DEV not deleted' AS scr, COUNT (8) cnt
  FROM DEV.CONCEPT d
 WHERE d.VOCABULARY_ID  IN (07, 28, 32) AND NVL (d.INVALID_REASON, 'X') <> 'D'
UNION ALL
SELECT '- How many records would be new in DEV added' AS scr, COUNT (8) cnt
  FROM CONCEPT_STAGE c
 WHERE C.VOCABULARY_ID  IN (07, 28, 32)
       AND NOT EXISTS
              (SELECT 1
                 FROM DEV.CONCEPT d
                WHERE d.VOCABULARY_ID  IN (07, 28, 32) AND c.CONCEPT_CODE = D.CONCEPT_CODE)
UNION ALL
SELECT '- How many DEV active will be marked for deletion' AS scr,
       COUNT (8) cnt
  FROM DEV.CONCEPT d
 WHERE     d.VOCABULARY_ID  IN (07, 28, 32)
       AND NVL (d.INVALID_REASON, 'X') <> 'D'
       AND NOT EXISTS
              (SELECT 1
                 FROM CONCEPT_STAGE c
                WHERE c.VOCABULARY_ID  IN (07, 28, 32) AND c.CONCEPT_CODE = D.CONCEPT_CODE);


                Current Result:

 - Num Rec in stage

 51705

 - Num Rec in DEV not deleted

 51076

 - How many records would be new in DEV added

637

 - How many DEV active will be marked for deletion

8




1.5.3    Load new concepts into DEV schema concept table,mark deprecated concepts as deleted. Valid start date, valid end date and invalid reason are left empty for now.

          Transfer records from CONCEPT_STAGE to DEV.CONCEPT table

           sqlplus RXNORM_20120702/myPass@DEV_VOCAB @07_load_concepts.sql

 

 

1.5.4    Load new concept relationships stage, identify invalid codes  information 

       Loaded from the raw staged data RXNREL into the staged data CONCEPT_RELATIONSHIP_STAGE, identify invalid code information     

      $ 
sqlplus RXNORM_20120702/myPass@DEV_VOCAB @07_transform_row_relations.sql 




1.5.5  Verify that number of records loaded is equivalent to prior production load



SELECT '- Num Rec in stage' AS scr, COUNT (8) cnt
  FROM CONCEPT_RELATIONSHIP_STAGE c
UNION ALL
SELECT '- Num Rec in DEV not deleted' AS scr, COUNT (8) cnt
  FROM DEV.CONCEPT_RELATIONSHIP d
 WHERE     EXISTS
              (SELECT 1
                 FROM DEV.CONCEPT c
                WHERE c.VOCABULARY_ID IN (07, 08) AND c.CONCEPT_ID = CONCEPT_ID_1)
       AND EXISTS
              (SELECT 1
                 FROM DEV.CONCEPT c
                WHERE c.VOCABULARY_ID IN (07, 08) AND c.CONCEPT_ID = CONCEPT_ID_2)
       AND NVL(d.INVALID_REASON, 'X') NOT IN ( 'D', 'U')
       AND D.RELATIONSHIP_ID NOT IN ( 1, 309, 311, 313, 135, 310 ,312, 314, 281, 282)
UNION ALL
SELECT '- How many records would be new in DEV added' AS scr, COUNT (8) cnt
  FROM CONCEPT_RELATIONSHIP_STAGE c
 WHERE NOT EXISTS
              (SELECT 1
                 FROM DEV.CONCEPT_RELATIONSHIP d
                WHERE     c.CONCEPT_ID_1 = d.CONCEPT_ID_1
                      AND c.CONCEPT_ID_2 = d.CONCEPT_ID_2
                      AND c.RELATIONSHIP_ID = d.RELATIONSHIP_ID)
UNION ALL
SELECT '- How many DEV active will be marked for deletion' AS scr,
       COUNT (8) cnt
  FROM DEV.CONCEPT_RELATIONSHIP d
 WHERE     EXISTS
              (SELECT 1
                 FROM DEV.CONCEPT c
                WHERE c.VOCABULARY_ID IN (07, 08) AND c.CONCEPT_ID = CONCEPT_ID_1)
       AND EXISTS
              (SELECT 1
                 FROM DEV.CONCEPT c
                WHERE c.VOCABULARY_ID IN (07, 08) AND c.CONCEPT_ID = CONCEPT_ID_2)
             AND NVL(d.INVALID_REASON, 'X') NOT IN ( 'D', 'U')
       AND D.RELATIONSHIP_ID NOT IN ( 1, 309, 311, 313, 135, 310 ,312, 314, 281, 282)
       AND NOT EXISTS
                  (SELECT 1
                     FROM CONCEPT_RELATIONSHIP_STAGE c
                    WHERE     c.CONCEPT_ID_1 = d.CONCEPT_ID_1
                          AND c.CONCEPT_ID_2 = d.CONCEPT_ID_2
                          AND c.RELATIONSHIP_ID = d.RELATIONSHIP_ID) ;

Current Result:

 - Num Rec in stage

 983086

 - Num Rec in DEV not deleted

 984528

 - How many records would be new in DEV added

3382

 - How many DEV active will be marked for deletion

 4824

 

 

1.5.6    Load new concept relationships into DEV schema,mark deprecated concept relatuionships as deleted.

          Transfer records from CONCEPT_RELATIONSHIP_STAGE to DEV.CONCEPT_RELATIONSHIP table       

         $  sqlplus RXNORM_20120131/myPass@DEV_VOCAB @07_load_relations.sql 

 

 
ċ
07_create_schema.sql
(7k)
Serg Vereshagin,
Oct 22, 2013, 5:07 AM
ċ
07_load_concepts.sql
(5k)
Serg Vereshagin,
Oct 22, 2013, 5:07 AM
ċ
07_load_relations.sql
(5k)
Serg Vereshagin,
Oct 22, 2013, 5:07 AM
ċ
07_load_to_prestage.sh
(5k)
Serg Vereshagin,
Oct 22, 2013, 5:07 AM
ċ
07_transform_row_concepts.sql
(5k)
Serg Vereshagin,
Oct 22, 2013, 5:07 AM
ċ
07_transform_row_relations.sql
(12k)
Serg Vereshagin,
Oct 22, 2013, 5:07 AM
Comments