Vocabulary_ID=9
The following describes the procedure to update
the NDC vocabulary with the
latest release of the source.
The
load process is a complete refresh of data. All new vocabulary
entieties will be added. All vocabulary entities that do not exist in
the latest files set will be marked as deprecated.
1.1 Scripts Transfer
All scripts and files listed at the bottom of this page should be transferred to specially created directory (eg /data/RxNorm).
Script Name |
Description |
09_create_schema.sql |
Create prestage-stage schema with tables and indexes
|
09_load_to_prestage.sh |
Load RxNorm row-data |
09_transform_row_maps.sql |
Convert and store in stage table SOURCE_TO_CONCEPT_MAP_STAGE |
09_load_maps.sql
| Load only new maps into SOURCE_TO_CONCEPT_MAP table, add invalid code information |
09_load_to_prestage_ATC.ctl | FDB-ATC control file |
09_load_to_prestage_ETC.ctl | FDB-ETC control file |
09_load_to_prestage_NDC.ctl | FDB-NDC control file |
09_load_to_prestage_FDA_NDC.ctl | FDA-NDC control file |
09_load_to_prestage_FDA_PRO.ctl | FDA-Product control file |
1.2 Download data from source
1.2.1 Download data from RxNorm source
The most current international release is
located at the following URL: http://www.nlm.nih.gov/research/umls/rxnorm/docs/rxnormfiles.html
Historical archives
located at: http://www.nlm.nih.gov/research/umls/rxnorm/docs/rxnormarchive.html
Note: Make sure you perform updates in chronological order. So, if you want to
process a January and April release, execute the below steps for January, and
after successful completion, for April.
Download the latest release file (e.g. RxNorm_full_07022012.zip)
Transfer downloaded zip file to /data/RxNorm directory.
Zip file suffix contains release date (e.g. 07022012=02-Jul-2012)
Latest production zip file is ~136MB in size
Note:
For the rest of the document we will be referring to current schema as RXNORM_20120702 you would have to adjust
this entry to correspond to current release date file
1.2.2 Download data from FDB source
The most current international release is
located at the following URL: http://www.fdbhealth.com/
Download the latest release file (e.g.FDB_10252011.zip)
Transfer downloaded zip file to /data/RxNorm directory.
Zip file suffix contains release date (e.g. 07022012=02-Jul-2012)
Latest production zip file is ~2MB in size
1.2.3 Download data from FDA source
The most current international release is
located at the following URL:
http://www.fda.gov/downloads/Drugs/DevelopmentApprovalProcess/UCM070838.zip
Download the latest release file (e.g.UCM070838.zip)
Transfer downloaded zip file to /data/RxNorm directory.
Latest production zip file is ~6MB in size
1.3 Build Database Schema
All tables will be
located in the schema RxNorm_<DateOfSourceFile> as following: RXNORM_20130805
The following steps will create a schema and will grant to a schema owner appropriate permissions.
In addition to the
schema, it will create tables to load raw data: RXNATOMARCHIVE,RXNCONSO,RXNCUI,RXNCUICHANGES,RXNDOC,RXNREL,RXNSAB,RXNSAT
and RXNSTY as well as staging
tables:
CONCEPT_STAGE,
CONCEPT_ANCESTOR_STAGE,
CONCEPT_RELATIONSHIP_STAGE,
SOURCE_TO_CONCEPT_MAP_STAGE
$ sqlplus
System/<SystemPass>@DEV_VOCAB @09_create_schema.sql <RXNORM_Schema>
<Pass_RXNORM_Schema>
Format of RXNORM_Schema should be RXNORM_YYYYMMDD (e.g. RXNORM_20130805)
1.4 Raw data load into Oracle tables
1.4.1 Extract files from the zip
archive into the current directory
$ unzip -u RxNorm_full_07022012.zip -d rxnorm_20120702
$ unzip ndc.zip
1.4.2 Verify that the following 7 files have been created: RXNCONSO.RRF (Latest file size ~96MB)
RXNSAT.RRF (Latest file size ~385MB)
RATCGC0_ATC_GCNSEQNO_LINK.rrf (Latest file size ~496KB)
( Search it in FDB source eg \NDDF_PLU\NDDF_PLU\NDDF Plus DB\NDDF Descriptive and Pricing\NDDF BASICS 3.0\Miscellaneous Therapeutic Class\ )
RETCGC0_ETC_GCNSEQNO.rrf (Latest file size ~612KB) ( Search it in FDB source )
RNDC14_NDC_MSTR.rrf (Latest file size ~49MB) ( Search it in FDB source )
package.txt (Latest file size ~9MB)
product.txt (Latest file size ~22MB)
1.4.3
Load raw data into
Oracle.
1.4.3.1
Load RxNorm raw data into
Oracle.
This sqlldr process will run for about 2min.
$ 09_load_to_prestage.sh /data/RxNorm/rxnorm_20120702 <RXNORM_Schema>/<Pass_RXNORM_Schema>
1.4.3.2
Load FDB raw data into
Oracle.
Each sqlldr process will run for about 1min.
$ sqlldr RxNorm_20120131/myPass@DEV_VOCAB control=09_load_to_prestage_ATC.ctl
$ sqlldr RxNorm_20120131/myPass@DEV_VOCAB control=09_load_to_prestage_ETC.ctl
$ sqlldr RxNorm_20120131/myPass@DEV_VOCAB control=09_load_to_prestage_NDC.ctl
1.4.3.3
Load FDA raw data into
Oracle.
Each sqlldr process will run for about 1min.
$ sqlldr RxNorm_20120131/myPass@DEV_VOCAB control=09_load_to_prestage_FDA_NDC.ctl
$ sqlldr RxNorm_20120131/myPass@DEV_VOCAB control=09_load_to_prestage_FDA_PRO.ctl
1.4.3.4 Verify that files with extension .bad are empty, no records have been rejected.
1.4.3.5 Copy data and intermediate files to backup area. In this step, downloaded files will be archived to preserve source of the data. Discuss a location of backup area with your system administrator.
$ cp -Ru /data/RxNorm/*.zip /data/backup_area ; cp -Ru /data/RxNorm/*.log /data/backup_area
1.4.3.6 Verify that number of records loaded is approximate equivalent to prior stage load
$ sqlplus RXNORM_20120702/myPass@DEV_VOCAB
Execute the following SQL commands:
SELECT count(*) FROM RXNCONSO; |
Current production row count is: 1,002,967 Rows
SELECT count(*) FROM RXNSAT;
|
Current production row count is: 6,693,748 Rows
SELECT count(*) FROM RATCGC0_ATC_GCNSEQNO_LINK; |
Current production row count is: 21,094 Rows
SELECT count(*) FROM RETCGC0_ETC_GCNSEQNO; |
Current production row count is: 26,863 Rows
SELECT count(*) FROM RNDC14_NDC_MSTR; |
Current production row count is: 133,796 Rows
SELECT count(*) FROM FDA_NDC_PACKAGES; |
Current production row count is: 121,564 Rows
SELECT count(*) FROM FDA_NDC_PRODUCTS; |
Current production row count is: 59,931 Rows
1.5 Loading Staging Tables from raw
1.5.1 Convert and store in
staging table maps
Loaded from raw staged data RXNCONSO into the staged data SOURCE_TO_CONCEPT_MAP_STAGE
$ sqlplus RXNORM_20120702/myPass@DEV_VOCAB @09_transform_row_maps.sql
1.5.2 Verify that number of records loaded is equivalent to prior production load
In this step we will verify that number of active records in staging
tables is valid prior to transfer data to historical DEV schema.
a. Number of Records in stage table
b. Number of records in DEV schema not deleted (active, prior load production)
c. How many records would be added to DEV schema table
d. How many active DEV records will be marked for deletion
In most cases number of records in stage (a) should be greater than number of active records in production (b)
In rare occasion that might not be the case. It is responsibility of
the developer running the scripts to verify that that is the the valid.
$ sqlplus RXNORM_20120131/myPass@DEV_VOCAB
Execute the following SQL commands:
SELECT '- Num Rec in stage' AS scr, COUNT (8) cnt FROM SOURCE_TO_CONCEPT_MAP_STAGE c WHERE c.SOURCE_VOCABULARY_ID IN (09) UNION ALL SELECT '- Num Rec in DEV not deleted' AS scr, COUNT (8) cnt FROM DEV.SOURCE_TO_CONCEPT_MAP d WHERE d.SOURCE_VOCABULARY_ID IN (09) AND D.TARGET_VOCABULARY_ID IN (08) AND NVL (d.INVALID_REASON, 'X') <> 'D' UNION ALL SELECT '- How many records would be new in DEV added' AS scr, COUNT (8) cnt FROM SOURCE_TO_CONCEPT_MAP_STAGE c WHERE c.SOURCE_VOCABULARY_ID IN (09) AND c.TARGET_VOCABULARY_ID IN (08) AND NOT EXISTS (SELECT 1 FROM DEV.SOURCE_TO_CONCEPT_MAP d WHERE d.SOURCE_VOCABULARY_ID IN (09) AND c.SOURCE_CODE = D.SOURCE_CODE AND d.SOURCE_VOCABULARY_ID = c.SOURCE_VOCABULARY_ID AND d.MAPPING_TYPE = c.MAPPING_TYPE AND d.TARGET_CONCEPT_ID = c.TARGET_CONCEPT_ID AND d.TARGET_VOCABULARY_ID = c.TARGET_VOCABULARY_ID) UNION ALL SELECT '- How many DEV active will be marked for deletion' AS scr, COUNT (8) cnt FROM DEV.SOURCE_TO_CONCEPT_MAP d WHERE d.SOURCE_VOCABULARY_ID IN (09) AND D.TARGET_VOCABULARY_ID IN (08) AND NVL (d.INVALID_REASON, 'X') <> 'D' AND d.VALID_START_DATE < TO_DATE ( SUBSTR (USER, REGEXP_INSTR (USER, '_[[:digit:]]') + 1, 256), 'YYYYMMDD') AND NOT EXISTS (SELECT 1 FROM SOURCE_TO_CONCEPT_MAP_STAGE c WHERE c.SOURCE_VOCABULARY_ID IN (09) AND c.SOURCE_CODE = D.SOURCE_CODE AND d.SOURCE_VOCABULARY_ID = c.SOURCE_VOCABULARY_ID AND d.MAPPING_TYPE = c.MAPPING_TYPE AND d.TARGET_CONCEPT_ID = c.TARGET_CONCEPT_ID AND d.TARGET_VOCABULARY_ID = c.TARGET_VOCABULARY_ID) AND EXISTS (SELECT 1 FROM SOURCE_TO_CONCEPT_MAP_STAGE c WHERE d.SOURCE_CODE = c.SOURCE_CODE AND d.SOURCE_VOCABULARY_ID = c.SOURCE_VOCABULARY_ID AND d.MAPPING_TYPE = c.MAPPING_TYPE AND d.TARGET_VOCABULARY_ID = c.TARGET_VOCABULARY_ID);
|
Current Result:
- Num Rec in stage 314556
- Num Rec in DEV not deleted 457751
- How many records would be new in DEV added 2383
- How many DEV active will be marked for deletion 332
1.5.3 Load new maps into DEV schema concept table,mark deprecated concepts as deleted. Valid start date, valid end date and invalid
reason are left empty for now.
Transfer records from SOURCE_TO_CONCEPT_MAP_STAGE to DEV.SOURCE_TO_CONCEPT_MAP table
$ sqlplus RXNORM_20120702/myPass@DEV_VOCAB @09_load_maps.sql