Vocabulary_ID=8
The following
describes the procedure to update the RxNorm vocabulary (concepts, synonyms,
relationships, ancestry) with the latest release of the source.
The load process is a
complete refresh of data. All new vocabulary entieties will be added. All
vocabulary entities that do not exist in the latest files set will be marked
as deprecated.
The
concept hierarchy is defined in OMOP for RxNorm concepts that are
part of the following:
–
Branded Drug
– Branded Pack
– Clinical Drug
– Clinical Pack
– Ingredient
1.1 Scripts Transfer
All scripts and files listed at the bottom of this page should be transferred to specially created directory (eg /data/RxNorm).
Script Name
|
Description
|
08_create_schema.sql
|
Create
prestage-stage schema with tables and indexes
|
08_load_to_prestage.sh
|
Load RxNorm
raw data
|
08_transform_row_concepts.sql
|
Convert and
store in stage table concept_stage
|
08_load_concepts.sql
|
Load only new
concepts into concept table, add invalid code information
|
08_load_relations.sql
|
Load only new
relationships into relationship table, add invalid code information.
|
08_transform_row_relations.sql
|
Convert and store in
stage table concept_relationship_stage
|
1.2 Download data from source
The most current
international release is located at the following URL: http://www.nlm.nih.gov/research/umls/rxnorm/docs/rxnormfiles.html
Historical archives
located at: http://www.nlm.nih.gov/research/umls/rxnorm/docs/rxnormarchive.html
Note:
Make sure you perform updates in
chronological order. So, if you want to process a January and April release,
execute the below steps for January, and after successful completion, for
April.
Download the latest release file (e.g. RxNorm_full_01062014.zip)
Transfer downloaded zip file to /data/RxNorm directory.
Zip file suffix
contains release date (e.g. 08052012=05-Aug-2013)
Latest production zip file is ~136MB in size
Note:
For the rest of the document we will be referring to current schema as RXNORM_20130805 you
would have to adjust this entry to correspond to current release date file
1.3 Build Database Schema
All tables will be
located in the schema RxNorm_<DateOfSourceFile> as following: RXNORM_20130805
The following steps will create a schema and will grant to a schema owner appropriate permissions.
In addition to the
schema, it will create tables to load raw data: RXNATOMARCHIVE,RXNCONSO,RXNCUI,RXNCUICHANGES,RXNDOC,RXNREL,RXNSAB,RXNSAT
and RXNSTY as well as staging
tables:
CONCEPT_STAGE,
CONCEPT_ANCESTOR_STAGE,
CONCEPT_RELATIONSHIP_STAGE,
SOURCE_TO_CONCEPT_MAP_STAGE
$ sqlplus
System/<SystemPass>@DEV_VOCAB @08_create_schema.sql <RXNORM_Schema>
<Pass_RXNORM_Schema>
Format of RXNORM_Schema should be RXNORM_YYYYMMDD (e.g. RXNORM_20130805)
Staging tables match
Vocabulary schema version 4.4 with slight modification. Modifications made to
the tables are:
CONCEPT_STAGE.CONCEPT_ID
is nullable
CONCEPT_RELATIONSHIP_STAGE.REL_ID
is nullable
1.4 Raw data load into Oracle tables
1.4.1 Extract files from the zip archive
into the current directory
$ unzip -u RxNorm_full_08052013.zip -d rxnorm_20130805
1.4.2
Verify that the following 2 files have been created:
RXNCONSO.RRF (Latest file size ~96MB)
RXNATOMARCHIVE.RRF
(Latest file size ~42MB)
RXNCUI.RRF
(Latest file size ~1MB)
RXNCUICHANGES.RRF
(Latest file size ~1MB)
RXNDOC.RRF
(Latest file size ~1MB)
RXNREL.RRF
(Latest file size ~230MB)
RXNSAB.RRF
(Latest file size ~8KB)
RXNSAT.RRF
(Latest file size ~385MB)
RXNSTY.RRF
(Latest file size ~15MB)
1.4.3 Load
raw data into Oracle.
This sqlldr process will run for about 2min.
$ 08_load_to_prestage.sh /data/RxNorm/rxnorm_20130805
<RXNORM_Schema>/<Pass_RXNORM_Schema>
1.4.3.1 Verify that files with extension .bad are empty, no records have
been rejected.
1.4.3.2 Copy data and intermediate files to backup area. In this step, downloaded files will be archived to preserve source of the data. Discuss a location of backup area with your system administrator.
$ cp -Ru /data/RxNorm/*.zip /data/backup_area ; cp -Ru /data/RxNorm/*.log /data/backup_area
1.4.3.3 Verify that number of records loaded is approximate equivalent to prior stage load
$ sqlplus RXNORM_20130805/myPass@DEV_VOCAB
Execute the following SQL commands:
SELECT count(*) FROM RXNCONSO;
|
Current production row count is: 1,002,967 Rows
SELECT count(*) FROM RXNREL;
|
Current
production row count is: 4,346,111 Rows
1.5 Loading Staging Tables from raw
1.5.1 Convert and store in staging
table concepts
Loaded
from raw staged data RXNCONSO into the staged data CONCEPT_STAGE
In
this step we will perform substitute GUID identifiers into numerical format.
$ sqlplus RXNORM_20130805/myPass@DEV_VOCAB
@08_transform_row_concepts.sql
1.5.2 Verify that number of records loaded is
equivalent to prior production load
In this step we will verify that
number of active records in staging tables is valid prior to transfer data to
historical DEV schema.
a. Number of Records in stage
table
b. Number of records in DEV
schema not deleted (active, prior load production)
c. How many records would be
added to DEV schema table
d. How many active DEV records will be marked for deletion
In most cases number of records in stage (a) should be greater
than number of active records in production (b)
In rare occasion that might not be the case. It is
responsibility of the developer running the scripts to verify that that is
the the valid.
$ sqlplus RXNORM_20130805/myPass@DEV_VOCAB
Execute the following SQL commands:
SELECT '- Num Rec in stage' AS scr, COUNT (8) cnt FROM CONCEPT_STAGE c WHERE C.VOCABULARY_ID = 08 UNION ALL SELECT '- Num Rec in DEV not deleted' AS scr, COUNT (8) cnt FROM DEV.CONCEPT d WHERE d.VOCABULARY_ID = 08 AND NVL (d.INVALID_REASON, 'X') NOT IN ( 'D', 'U') UNION ALL SELECT '- How many records would be new in DEV added' AS scr, COUNT (8) cnt FROM CONCEPT_STAGE c WHERE C.VOCABULARY_ID = 08 AND NOT EXISTS (SELECT 1 FROM DEV.CONCEPT d WHERE d.VOCABULARY_ID = 08 AND c.CONCEPT_CODE = D.CONCEPT_CODE) UNION ALL SELECT '- How many DEV active will be marked for deletion' AS scr, COUNT (8) cnt FROM DEV.CONCEPT d WHERE d.VOCABULARY_ID = 08 AND NVL (d.INVALID_REASON, 'X') NOT IN ( 'D', 'U') AND NOT EXISTS (SELECT 1 FROM CONCEPT_STAGE c WHERE c.VOCABULARY_ID = 08 AND c.CONCEPT_CODE = D.CONCEPT_CODE);
|
Current
Result:
-
Num Rec in stage
158043
-
Num Rec in DEV not deleted
157626
-
How many records would be new in DEV added
421
-
How many DEV active will be marked for deletion 4
1.5.3 Load new concepts into DEV schema
concept table,mark deprecated concepts as deleted.
Valid start date, valid
end date and invalid reason are left empty for now. Transfer records from CONCEPT_STAGE to DEV.CONCEPT table
$ sqlplus RXNORM_20130805/myPass@DEV_VOCAB
@08_load_concepts.sql
1.5.4 Load new concept relationships stage,
identify invalid codes information
Loaded from the raw staged data RXNREL into the
staged data CONCEPT_RELATIONSHIP_STAGE, identify invalid code information
$ sqlplus RXNORM_20130805/myPass@DEV_VOCAB
@08_transform_row_relations.sql
1.5.5 Verify that number of
records loaded is equivalent to prior production load
SELECT '- Num Rec in stage' AS scr, COUNT (8) cnt
FROM CONCEPT_RELATIONSHIP_STAGE c
UNION ALL
SELECT '- Num Rec in DEV not deleted' AS scr, COUNT (8) cnt
FROM DEV.CONCEPT_RELATIONSHIP d
WHERE EXISTS
(SELECT 1
FROM DEV.CONCEPT c
WHERE c.VOCABULARY_ID = 08 AND c.CONCEPT_ID = CONCEPT_ID_1)
AND EXISTS
(SELECT 1
FROM DEV.CONCEPT c
WHERE c.VOCABULARY_ID = 08 AND c.CONCEPT_ID = CONCEPT_ID_2)
AND NVL(d.INVALID_REASON, 'X') NOT IN
( 'D', 'U')
UNION ALL
SELECT '- How many records would be new in DEV added' AS scr, COUNT (8) cnt
FROM CONCEPT_RELATIONSHIP_STAGE c
WHERE NOT EXISTS
(SELECT 1
FROM DEV.CONCEPT_RELATIONSHIP d
WHERE c.CONCEPT_ID_1 = d.CONCEPT_ID_1
AND c.CONCEPT_ID_2 = d.CONCEPT_ID_2
AND c.RELATIONSHIP_ID = d.RELATIONSHIP_ID)
UNION ALL
SELECT '- How many DEV active will be marked for deletion' AS scr,
COUNT (8) cnt
FROM DEV.CONCEPT_RELATIONSHIP d
WHERE EXISTS
(SELECT 1
FROM DEV.CONCEPT c
WHERE c.VOCABULARY_ID = 08 AND c.CONCEPT_ID = CONCEPT_ID_1)
AND EXISTS
(SELECT 1
FROM DEV.CONCEPT c
WHERE c.VOCABULARY_ID = 08 AND c.CONCEPT_ID = CONCEPT_ID_2)
AND NVL(d.INVALID_REASON, 'X') NOT IN
( 'D', 'U')
AND D.RELATIONSHIP_ID NOT IN ( 1, 309,
311, 313, 135, 310 ,312, 314)
AND NOT EXISTS
(SELECT 1
FROM CONCEPT_RELATIONSHIP_STAGE c
WHERE c.CONCEPT_ID_1 = d.CONCEPT_ID_1
AND c.CONCEPT_ID_2 = d.CONCEPT_ID_2
AND c.RELATIONSHIP_ID = d.RELATIONSHIP_ID);
|
Current
Result:
- Num Rec in stage
804506
- Num Rec in DEV not deleted
886452
- How many records would be new in DEV
added
2124
- How many DEV active will be marked for
deletion 30
1.5.6 Load new concept relationships into DEV
schema, mark deprecated concept relatuionships as deleted
Transfer records from CONCEPT_RELATIONSHIP_STAGE to DEV.CONCEPT_RELATIONSHIP table
$ sqlplus RXNORM_20130805/myPass@DEV_VOCAB
@08_load_relations.sql
|