Drug Domain‎ > ‎

FDB Indication

Vocabulary_ID=19

The following describes the procedure to update the FDB Indication vocabulary (concepts, synonyms, relationships, ancestry) with the latest release of the source.

The load process is a complete refresh of data. All new vocabulary entieties will be added. All vocabulary entities that do not exist in the latest files set will be marked as deprecated.

The concept hierarchy is defined in OMOP for  FDB Indication  concepts that are part of the following:

    - Indication or Contra-indication


1.1 Scripts Transfer

All scripts and files listed at the bottom of this page should be transferred to specially created directory (eg /data/FDB_IND).


 Script Name  Description
19_create_schema.sql  Create prestage-stage schema with tables and indexes
19_load_to_prestage.sh 
 Load RxNorm row-data
19_transform_row_concepts.sql 
 Convert and store in stage table concept_stage
19_transform_row_relations.sql   Convert and store in stage table concept_relationship_stage
19_load_concepts.sql   Load only new concepts into concept table, add invalid code information
19_load_relations.sql    Load only new relationships into relationship table, add invalid code information.
19_load_to_prestage_FDB_IND.ctl  Concept Indications control file
19_load_to_prestage_FDB_NDC.ctl ICD9CM_SEARCH control file
19_load_to_prestage_FDB_SEQ.ctl INDCTS_GCNSEQNO_LINK control file
19_load_to_prestage_FDB_MST.ctl INDCTS_MSTR control file
19_load_to_prestage_FDB_SQN.ctl CONTRA_GCNSEQNO_LINK   control file
19_load_to_prestage_FDB_CON.ctl CONTRA_MSTR   control file


1.2 Download data from source

            1.2.1 Download data from FDB source

                    The most current international release is located at the following URL: http://www.fdbhealth.com/


                   
Download the latest release file (e.g.FDB_20120815.zip)

                     Transfer downloaded zip file to /data/FDB_IND directory.

                    Zip file suffix contains release date (e.g. 07022012=02-Jul-2012)
                     Latest production zip file is ~2MB in size


            1.2.2 Download data from RxNorm source

                    The most current international release is located at the following URL: http://www.nlm.nih.gov/research/umls/rxnorm/docs/rxnormfiles.html

                    Historical archives located at: http://www.nlm.nih.gov/research/umls/rxnorm/docs/rxnormarchive.html

                    Note: Make sure you perform updates in chronological order. So, if you want to process a January and April release, execute the below steps for January, and after successful completion, for April.

                    Download the latest release file (e.g. RxNorm_full_07022012.zip)
                     Transfer downloaded zip file to /data/FDB_IND directory.

                    Zip file suffix contains release date (e.g. 07022012=02-Jul-2012)
                     Latest production zip file is ~136MB in size

                    Note: For the rest of the document we will be referring to current schema as FDB_IND_20120702 you would have to adjust this entry to correspond to current release date file


1.3 Build Database Schema

All tables will be located in the schema FDB_IND_<DateOfSourceFile> as following: FDB_IND_20120131
The following steps will create a schema and will grant to a schema owner appropriate permissions. 


In addition to the schema, it will create tables to load raw data:

RXNCONSO, RINDMDD0_INDCTS_DRUG_DESC, RINDMGC0_INDCTS_GCNSEQNO_LINK, RINDMLD0_LABELED_DESC, 

RINDMMA2_INDCTS_MSTR, RINDMPD0_PREDICTOR_DESC and RINDMRG0_ROUTED_GEN_LINK as well as staging tables: 

CONCEPT_STAGE, CONCEPT_ANCESTOR_STAGE, 

CONCEPT_RELATIONSHIP_STAGE, 

SOURCE_TO_CONCEPT_MAP_STAGE


$ sqlplus System/<SystemPass>@DEV_VOCAB @19_create_schema.sql <FDB_IND_Schema> <Pass_FDB_IND_Schema>

Format of FDB_IND_Schema should be FDB_IND_YYYYMMDD (e.g. FDB_IND_20120815)


1.4 Raw data load into Oracle tables

1.4.1 Extract files from the zip archive into the current directory

        $ unzip -u RxNorm_full_07022012.zip -d rxnorm_20120702


1.4.2 Verify that the following 7 files have been created:

1.4.2.1 RxNorm raw data

     RXNCONSO.RRF    (Latest file size ~96MB)

1.4.2.2 FDB raw data

    RDDCMMA1_CONTRA_MSTR.rrf    ( Search it in FDB source )
                RFMLDX0_DXID.rrf               
                RINDMMA2_INDCTS_MSTR.rrf       
                RFMLISR0_ICD9CM_SEARCH.rrf     
                RFMLISR1_ICD_SEARCH.rrf        
                RINDMGC0_INDCTS_GCNSEQNO_LINK.rrf
                RDDCMGC0_CONTRA_GCNSEQNO_LINK.rrf

   

          1.4.3.    Load raw data into Oracle. 

      

1.4.3.1   Load RxNorm raw data into Oracle. 

            This sqlldr process will run for about 2min.

        $ ./19_load_to_prestage.sh /data/FDB_IND/rxnorm_20120702 <FDB_IND_Schema>/<Pass_FDB_IND_schema>


            1.4.3.2    Load FDB raw data into Oracle. 

                Each sqlldr process will run for about 1min.

                $  sqlldr  FDB_IND_20120815/myPass@DEV_VOCAB control=19_load_to_prestage_FDB_IND.ctl

                $  sqlldr  FDB_IND_20120815/myPass@DEV_VOCAB control=19_load_to_prestage_FDB_NDC.ctl

                $  sqlldr  FDB_IND_20120815/myPass@DEV_VOCAB control=19_load_to_prestage_FDB_SEQ.ctl

                $  sqlldr  FDB_IND_20120815/myPass@DEV_VOCAB control=19_load_to_prestage_FDB_MST.ctl

                $  sqlldr  FDB_IND_20120815/myPass@DEV_VOCAB control=19_load_to_prestage_FDB_SQN.ctl

                $  sqlldr  FDB_IND_20120815/myPass@DEV_VOCAB control=19_load_to_prestage_FDB_CON.ctl

1.4.3.3 Verify that files with extension .bad are empty, no records have been rejected.

1.4.3.4 Copy data and intermediate files to backup area In this step, downloaded files will be archived to preserve source of the data. Discuss a location of backup area with your system administrator.         

        $ cp -Ru /data/FDB_IND/*.zip /data/backup_area ; cp -Ru /data/FDB_IND/*.log /data/backup_area


1.4.3.5  Verify that number of records loaded is equivalent to prior production load

                     sqlplus  FDB_IND_20120815/myPass@DEV_VOCAB

                Execute the following SQL commands:

    SELECT count(*) FROM RXNCONSO;
Current production row count is:     977,332 Rows

     SELECT count(*) FROM RDDCMGC0_CONTRA_GCNSEQNO_LINK;

Current production row count is:     24,989 Rows



     SELECT count(*) FROM RDDCMMA1_CONTRA_MSTR;

Current production row count is:     17,192 Rows


SELECT count(*) FROM RFMLDX0_DXID;
Current production row count is:     4,544 Rows

     SELECT count(*) FROM RFMLISR0_ICD9CM_SEARCH;

Current production row count is:     172,776 Rows



     SELECT count(*) FROM RINDMGC0_INDCTS_GCNSEQNO_LINK;

Current production row count is:     14,280 Rows


     SELECT count(*) FROM RINDMMA2_INDCTS_MSTR;

Current production row count is:     11,943 Rows



1.5 Loading Staging Tables from raw

1.5.1 Convert and store in staging table concepts

Loaded from raw staged data RFMLDX0_DXID&RXNCONSO into the staged data CONCEPT_STAGE

In this step we will perform substitute GUID identifiers into numerical format. 

 

$  sqlplus  FDB_IND_20120815/myPass@DEV_VOCAB @19_transform_row_concepts.sql


1.5.2  Verify that number of records loaded is equivalent to prior production load

    In this step we will verify that number of active records in staging tables is valid prior to transfer data to historical DEV schema. 

    a. Number of Records in stage table

    b. Number of records in DEV schema not deleted (active, prior load production)

    c. How many records would be added to DEV schema table

            d. How many active DEV records will be marked for deletion


    In most cases  number of records in stage (a) should be greater than number of active records in production (b) 
    In rare occasion that might not be the case. It is responsibility of the developer running the scripts to verify that that is the the valid.



      $ sqlplus  FDB_IND_20120815/myPass@DEV_VOCAB

Execute the following SQL commands:

SELECT '- Num Rec in stage' AS scr, COUNT (8) cnt
  FROM CONCEPT_STAGE c
 WHERE C.VOCABULARY_ID = 19
UNION ALL
SELECT '- Num Rec in DEV not deleted' AS scr, COUNT (8) cnt
  FROM DEV.CONCEPT d
 WHERE d.VOCABULARY_ID = 19 AND NVL (d.INVALID_REASON, 'X') <> 'D'
UNION ALL
SELECT '- How many records would be new in DEV added' AS scr, COUNT (8) cnt
  FROM CONCEPT_STAGE c
 WHERE C.VOCABULARY_ID = 19
       AND NOT EXISTS
              (SELECT 1
                 FROM DEV.CONCEPT d
                WHERE d.VOCABULARY_ID = 19 AND c.CONCEPT_CODE = D.CONCEPT_CODE)
UNION ALL
SELECT '- How many DEV active will be marked for deletion' AS scr,
       COUNT (8) cnt
  FROM DEV.CONCEPT d
 WHERE     d.VOCABULARY_ID = 19
       AND NVL (d.INVALID_REASON, 'X') <> 'D'
       AND NOT EXISTS
              (SELECT 1
                 FROM CONCEPT_STAGE c
                WHERE c.VOCABULARY_ID = 19 AND c.CONCEPT_CODE = D.CONCEPT_CODE);

Current Result:
- Num Rec in stage                                                    4622
- Num Rec in DEV not deleted                                     4544
- How many records would be new in DEV added            78
- How many DEV active will be marked for deletion           0


1.5.3    Load new concepts into DEV schema concept table,mark deprecated concepts as deleted. 

         Valid start date, valid end date and invalid reason are left empty for now.
         Transfer records from CONCEPT_STAGE to DEV.CONCEPT table


     $  sqlplus  FDB_IND_20120815/myPass@DEV_VOCAB @19_load_concepts.sql 



        1.5.4    Load new concept relationships stage, identify invalid codes  information 

               Loaded from the raw staged data RXNREL into the staged data CONCEPT_RELATIONSHIP_STAGE
               identify invalid code information    
 

          $  
sqlplus  FDB_IND_20120815/myPass@DEV_VOCAB @19_transform_row_relations.sql 



1.5.5  Verify that number of records loaded is equivalent to prior production load

         sqlplus  FDB_IND_20120815/myPass@DEV_VOCAB

SELECT '- Num Rec in stage' AS scr, COUNT (8) cnt
  FROM CONCEPT_RELATIONSHIP_STAGE d
  WHERE EXISTS
          (SELECT 1
             FROM DEV.CONCEPT c
            WHERE c.VOCABULARY_ID IN ( 19) AND c.CONCEPT_ID = CONCEPT_ID_1)
       AND EXISTS
             (SELECT 1
                 FROM DEV.CONCEPT c
                WHERE c.VOCABULARY_ID IN ( 8)
                      AND c.CONCEPT_ID = CONCEPT_ID_2)
       AND D.RELATIONSHIP_ID IN (010, 144, 126, 240, 127, 241, 129, 243)
UNION ALL
SELECT '- Num Rec in DEV not deleted' AS scr, COUNT (8) cnt
  FROM DEV.CONCEPT_RELATIONSHIP d
 WHERE EXISTS
          (SELECT 1
             FROM DEV.CONCEPT c
            WHERE c.VOCABULARY_ID IN ( 19) AND c.CONCEPT_ID = CONCEPT_ID_1)
       AND EXISTS
              (SELECT 1
                 FROM DEV.CONCEPT c
                WHERE c.VOCABULARY_ID IN ( 08)
                      AND c.CONCEPT_ID = CONCEPT_ID_2)
       AND NVL (d.INVALID_REASON, 'X') <> 'D'
       AND D.RELATIONSHIP_ID IN (010, 144, 126, 240, 127, 241, 129, 243)
UNION ALL
SELECT '- How many records would be new in DEV added' AS scr, COUNT (8) cnt
  FROM CONCEPT_RELATIONSHIP_STAGE c
 WHERE NOT EXISTS
              (SELECT 1
                 FROM DEV.CONCEPT_RELATIONSHIP d
                WHERE     c.CONCEPT_ID_1 = d.CONCEPT_ID_1
                      AND c.CONCEPT_ID_2 = d.CONCEPT_ID_2
                      AND c.RELATIONSHIP_ID = d.RELATIONSHIP_ID)
UNION ALL
SELECT '- How many DEV active will be marked for deletion' AS scr,
       COUNT (8) cnt
  FROM DEV.CONCEPT_RELATIONSHIP d
 WHERE EXISTS
          (SELECT 1
             FROM DEV.CONCEPT c
            WHERE c.VOCABULARY_ID IN ( 19) AND c.CONCEPT_ID = CONCEPT_ID_1)
       AND EXISTS
              (SELECT 1
                 FROM DEV.CONCEPT c
                WHERE c.VOCABULARY_ID IN ( 08)
                      AND c.CONCEPT_ID = CONCEPT_ID_2)
       AND NVL (d.INVALID_REASON, 'X') <> 'D'
       AND NOT EXISTS
                  (SELECT 1
                     FROM CONCEPT_RELATIONSHIP_STAGE c
                    WHERE     c.CONCEPT_ID_1 = d.CONCEPT_ID_1
                          AND c.CONCEPT_ID_2 = d.CONCEPT_ID_2
                          AND c.RELATIONSHIP_ID = d.RELATIONSHIP_ID)
       AND D.RELATIONSHIP_ID IN (010, 144, 126, 240, 127, 241, 129, 243);   
             
    

Current Result:

 - Num Rec in stage

611524

 - Num Rec in DEV not deleted

412011

 - How many records would be new in DEV added

127720

 - How many DEV active will be marked for deletion

49274

1.5.6    Load new concept relationships into DEV schema,mark deprecated concept relatuionships as deleted
         Transfer records from CONCEPT_RELATIONSHIP_STAGE to DEV.CONCEPT_RELATIONSHIP table       


     $ 
 sqlplus  FDB_IND_20120815/myPass@DEV_VOCAB @19_load_relations.sql 



ċ
19_create_schema.sql
(11k)
Serg Vereshagin,
Oct 25, 2013, 2:47 AM
ċ
19_load_concepts.sql
(5k)
Serg Vereshagin,
Oct 25, 2013, 2:47 AM
ċ
19_load_relations.sql
(4k)
Serg Vereshagin,
Oct 25, 2013, 2:47 AM
ċ
19_load_to_prestage.sh
(5k)
Serg Vereshagin,
Oct 25, 2013, 2:47 AM
ċ
19_load_to_prestage_FDB_CON.ctl
(0k)
Serg Vereshagin,
Oct 25, 2013, 2:48 AM
ċ
19_load_to_prestage_FDB_IND.ctl
(0k)
Serg Vereshagin,
Oct 25, 2013, 2:49 AM
ċ
19_load_to_prestage_FDB_MST.ctl
(1k)
Serg Vereshagin,
Oct 25, 2013, 2:49 AM
ċ
19_load_to_prestage_FDB_NDC.ctl
(0k)
Serg Vereshagin,
Oct 25, 2013, 2:49 AM
ċ
19_load_to_prestage_FDB_SEQ.ctl
(0k)
Serg Vereshagin,
Oct 25, 2013, 2:49 AM
ċ
19_load_to_prestage_FDB_SQN.ctl
(0k)
Serg Vereshagin,
Oct 25, 2013, 2:49 AM
ċ
19_transform_row_concepts.sql
(2k)
Serg Vereshagin,
Oct 25, 2013, 2:47 AM
ċ
19_transform_row_relations.sql
(9k)
Serg Vereshagin,
Oct 25, 2013, 2:47 AM
Comments