Vocabulary_ID=15 The following describes the procedure to update the MedDRA vocabulary (concepts, synonyms, relationships, ancestry) with the latest release of the source. The load process is a complete refresh of data. All new vocabulary entieties will be added. All vocabulary entities that do not exist in the latest files set will be marked as deprecated. The concept hierarchy is defined in OMOP for MedDRA concepts that are part of the following: - Preferred Term - High Level Term - High Level Group Term - System Organ Class
1.1 Scripts TransferAll scripts and files listed at the bottom of this page should be transferred to specially created directory (eg /data/MEDDRA).
Script Name | Description | 15_create_schema.sql | Create prestage-stage schema with tables and indexes
| 15_transform_row_concepts.sql
| Convert and store in stage table concept_stage | 15_transform_row_relations.sql | Convert and store in stage table concept_relationship_stage | 15_load_concepts.sql | Load only new concepts into concept table, add invalid code information | 15_load_relations.sql | Load only new relationships into relationship table, add invalid code information. | 15_load_to_prestage_hlgt.ctl | sqlldr control file | 15_load_to_prestage_hlgt_hlt.ctl | sqlldr control file
| 15_load_to_prestage_hlt.ctl | sqlldr control file
| 15_load_to_prestage_hlt_pt.ctl | sqlldr control file
| 15_load_to_prestage_intl_ord.ctl | sqlldr control file
| 15_load_to_prestage_llt.ctl | sqlldr control file
| 15_load_to_prestage_mdhier.ctl | sqlldr control file
| 15_load_to_prestage_pt.ctl | sqlldr control file | 15_load_to_prestage_SMQ_Content.ctl | sqlldr control file | 15_load_to_prestage_SMQ_List.ctl | sqlldr control file | 15_load_to_prestage_soc.ctl | sqlldr control file | 15_load_to_prestage_soc_hlgt.ctl | sqlldr control file |
1.2 Download data from source 1.2.1 Download data from MedDRA source The most current international release is located at the following URL: http://www.meddramsso.com Download the latest release file (e.g.MEDDRA_20120822.zip) Transfer downloaded zip file to /data/MEDDRA directory. Zip file suffix contains release date (e.g. 20120702=02-Jul-2012) Latest production zip file is ~17MB in size Note:
For the rest of the document we will be referring to current schema as MEDDRA_20120822 you would have to adjust
this entry to correspond to current release date file
1.3 Build Database SchemaAll tables will be located in the schema MEDDRA_<DateOfSourceFile> as following: MEDDRA_20120131
The following steps will create a schema and
will grant to a schema owner appropriate permissions.
In addition to the schema, it will create tables to
load raw data: HLGT_HLT_COMP, HLGT_PREF_TERM, HLT_PREF_COMP, HLT_PREF_TERM, LOW_LEVEL_TERM, MD_HIERARCHY, PREF_TERM, SMQ_CONTENT, SMQ_LIST, SOC_HLGT_COMP, SOC_INTL_ORDER and SOC_TERM as well as staging tables: CONCEPT_STAGE, CONCEPT_ANCESTOR_STAGE,
CONCEPT_RELATIONSHIP_STAGE,
SOURCE_TO_CONCEPT_MAP_STAGE
$ sqlplus System/<SystemPass>@DEV_VOCAB @15_create_schema.sql <MEDDRA_Schema> <Pass_MEDDRA_Schema>
Format of MEDDRA_schema should be MEDDRA_YYYYMMDD (e.g. MEDDRA_20120822)
1.4 Raw data load into Oracle tables
1.4.1 Extract files from the zip
archive into the current directory
$ unzip -u MEDDRA_20120822.zip
1.4.2 Verify that the following 2 files have been created:
hlgt.asc (Latest file size ~18,035 bytes) hlgt_hlt.asc (Latest file size ~34,380 bytes) hlt.asc (Latest file size ~85,800 bytes) hlt_pt.asc (Latest file size ~513,340 bytes) intl_ord.asc (Latest file size ~364 bytes) llt.asc (Latest file size ~4,185,419 bytes) mdhier.asc (Latest file size ~4,978,921 bytes) pt.asc (Latest file size ~995,416 bytes) SMQ_Content.asc (Latest file size ~2,092,155 bytes) SMQ_List.asc (Latest file size ~198,959 bytes) soc.asc (Latest file size ~1,459 bytes) soc_hlgt.asc (Latest file size ~7,000 bytes)
1.4.3
Load raw data into
Oracle.
1.4.3.1
Load FDB raw data into
Oracle. Each sqlldr process will run for about 1min.
$ sqlldr MEDDRA_20120822/myPass@DEV_VOCAB control=15_load_to_prestage_hlgt.ctl $ sqlldr MEDDRA_20120822/myPass@DEV_VOCAB control=15_load_to_prestage_hlgt_hlt.ctl $ sqlldr MEDDRA_20120822/myPass@DEV_VOCAB control=15_load_to_prestage_hlt.ctl $ sqlldr MEDDRA_20120822/myPass@DEV_VOCAB control=15_load_to_prestage_hlt_pt.ctl $ sqlldr MEDDRA_20120822/myPass@DEV_VOCAB control=15_load_to_prestage_intl_ord.ctl $ sqlldr MEDDRA_20120822/myPass@DEV_VOCAB control=15_load_to_prestage_llt.ctl $ sqlldr MEDDRA_20120822/myPass@DEV_VOCAB control=15_load_to_prestage_mdhier.ctl $ sqlldr MEDDRA_20120822/myPass@DEV_VOCAB control=15_load_to_prestage_pt.ctl $ sqlldr MEDDRA_20120822/myPass@DEV_VOCAB control=15_load_to_prestage_SMQ_Content.ctl $ sqlldr MEDDRA_20120822/myPass@DEV_VOCAB control=15_load_to_prestage_SMQ_List.ctl $ sqlldr MEDDRA_20120822/myPass@DEV_VOCAB control=15_load_to_prestage_soc.ctl $ sqlldr MEDDRA_20120822/myPass@DEV_VOCAB control=15_load_to_prestage_soc_hlgt.ctl
1.4.3.2 Verify that files with extension .bad are empty, no records have been rejected.
1.4.3.3 Copy data and intermediate files to backup area. In this step, downloaded files will be archived to preserve source of the data. Discuss a location of backup area with your system administrator.
$ cp -Ru /data/MEDDRA/*.zip /data/backup_area ; cp -Ru /data/MEDDRA/*.log /data/backup_area
1.4.3.4 Verify that number of records loaded is equivalent to prior production load
$ sqlplus MEDDRA_20120822/myPass@DEV_VOCAB
Execute the following SQL commands:
SELECT count(*) FROM HLGT_HLT_COMP; |
Current production row count is: 34,38 Rows
SELECT count(*) FROM HLGT_PREF_TERM; | Current production row count is: 999 Rows
SELECT count(*) FROM HLT_PREF_COMP; | Current production row count is: 51,334 Rows
SELECT count(*) FROM HLT_PREF_TERM; | Current production row count is: 1,699 Rows
SELECT count(*) FROM LOW_LEVEL_TERM; | Current production row count is: 134,318 Rows
SELECT count(*) FROM MD_HIERARCHY; | Current production row count is: 54,208 Rows
SELECT count(*) FROM PREF_TERM; | Current production row count is: 36,966 Rows
SELECT count(*) FROM SMQ_CONTENT; | Current production row count is: 105,758 Rows
SELECT count(*) FROM SMQ_LIST; | Current production row count is: 271 Rows
SELECT count(*) FROM SOC_HLGT_COMP; | Current production row count is: 700 Rows
SELECT count(*) FROM SOC_INTL_ORDER; | Current production row count is: 52 Rows
SELECT count(*) FROM SOC_TERM; | Current production row count is: 52 Rows
1.5 Loading Staging Tables from raw
1.5.1 Convert and store in staging table concepts Loaded from raw staged data into the staged data CONCEPT_STAGE In this step we will perform substitute GUID identifiers into numerical format.
$ sqlplus MEDDRA_20120822/myPass@DEV_VOCAB @15_transform_row_concepts.sql
1.5.2 Verify that number of records loaded is equivalent to prior production load
In this step we will verify that number of active records in staging
tables is valid prior to transfer data to historical DEV schema.
a. Number of Records in stage table
b. Number of records in DEV schema not deleted (active, prior load production)
c. How many records would be added to DEV schema table
d. How many active DEV records will be marked for deletion In most cases number of records in stage (a) should be greater than number of active records in production (b)
In rare occasion that might not be the case. It is responsibility of
the developer running the scripts to verify that that is the the valid.
$ sqlplus
MEDDRA_20120822/myPass@DEV_VOCAB
Execute the following SQL commands:
SELECT '-
Num Rec in stage' AS scr, COUNT (8) cnt
FROM CONCEPT_STAGE c
WHERE C.VOCABULARY_ID = 15
UNION ALL
SELECT '-
Num Rec in DEV not deleted' AS scr, COUNT (8) cnt
FROM DEV.CONCEPT d
WHERE d.VOCABULARY_ID = 15 AND NVL
(d.INVALID_REASON, 'X') <> 'D'
UNION ALL
SELECT '-
How many records would be new in DEV added' AS scr, COUNT (8) cnt
FROM CONCEPT_STAGE c
WHERE C.VOCABULARY_ID = 15
AND NOT EXISTS
(SELECT 1
FROM DEV.CONCEPT d
WHERE d.VOCABULARY_ID = 15 AND
c.CONCEPT_CODE = D.CONCEPT_CODE)
UNION ALL
SELECT '-
How many DEV active will be marked for deletion' AS scr,
COUNT (8) cnt
FROM DEV.CONCEPT d
WHERE
d.VOCABULARY_ID = 15
AND NVL (d.INVALID_REASON, 'X') <>
'D'
AND NOT EXISTS
(SELECT 1
FROM CONCEPT_STAGE c
WHERE c.VOCABULARY_ID = 15 AND
c.CONCEPT_CODE = D.CONCEPT_CODE); |
Current Result:
- Num Rec in stage 85333
- Num Rec in DEV not deleted 93461
- How many records would be new in DEV added 1038
- How many DEV active will be marked for deletion 9166
1.5.3 Load new concepts into DEV schema concept table,mark deprecated concepts as deleted.
Valid start date, valid end date and invalid
reason are left empty for now.
Transfer records from CONCEPT_STAGE to DEV.CONCEPT table
$ sqlplus
MEDDRA_20120822/myPass@DEV_VOCAB @15_load_concepts.sql
1.5.4 Load new concept relationships stage, identify invalid codes information Loaded from the raw staged data RXNREL into the staged data CONCEPT_RELATIONSHIP_STAGE, identify invalid code information
$ sqlplus
MEDDRA_20120822/myPass@DEV_VOCAB @15_transform_row_relations.sql
1.5.5 Verify that number of records loaded is equivalent to prior production load $ sqlplus
MEDDRA_20120822/myPass@DEV_VOCAB
SELECT '- Num Rec in stage' AS scr, COUNT (8) cnt
FROM CONCEPT_RELATIONSHIP_STAGE d WHERE EXISTS
(SELECT 1
FROM DEV.CONCEPT c
WHERE c.VOCABULARY_ID IN (08, 15) AND c.CONCEPT_ID = CONCEPT_ID_1)
AND EXISTS
(SELECT 1
FROM DEV.CONCEPT c
WHERE c.VOCABULARY_ID IN (08, 15)
AND c.CONCEPT_ID = CONCEPT_ID_2)
AND D.RELATIONSHIP_ID IN (010, 144, 125, 239)
UNION ALL
SELECT '- Num Rec in DEV not deleted' AS scr, COUNT (8) cnt
FROM DEV.CONCEPT_RELATIONSHIP d
WHERE EXISTS
(SELECT 1
FROM DEV.CONCEPT c
WHERE c.VOCABULARY_ID IN (08, 15) AND c.CONCEPT_ID = CONCEPT_ID_1)
AND EXISTS
(SELECT 1
FROM DEV.CONCEPT c
WHERE c.VOCABULARY_ID IN (08, 15)
AND c.CONCEPT_ID = CONCEPT_ID_2)
AND NVL (d.INVALID_REASON, 'X') <> 'D'
AND D.RELATIONSHIP_ID IN (010, 144, 125, 239)
UNION ALL
SELECT '- How many records would be new in DEV added' AS scr, COUNT (8) cnt
FROM CONCEPT_RELATIONSHIP_STAGE c
WHERE NOT EXISTS
(SELECT 1
FROM DEV.CONCEPT_RELATIONSHIP d
WHERE c.CONCEPT_ID_1 = d.CONCEPT_ID_1
AND c.CONCEPT_ID_2 = d.CONCEPT_ID_2
AND c.RELATIONSHIP_ID = d.RELATIONSHIP_ID)
UNION ALL
SELECT '- How many DEV active will be marked for deletion' AS scr,
COUNT (8) cnt
FROM DEV.CONCEPT_RELATIONSHIP d
WHERE EXISTS
(SELECT 1
FROM DEV.CONCEPT c
WHERE c.VOCABULARY_ID IN (08, 15) AND c.CONCEPT_ID = CONCEPT_ID_1)
AND EXISTS
(SELECT 1
FROM DEV.CONCEPT c
WHERE c.VOCABULARY_ID IN (08, 15)
AND c.CONCEPT_ID = CONCEPT_ID_2)
AND NVL (d.INVALID_REASON, 'X') <> 'D'
AND NOT EXISTS
(SELECT 1
FROM CONCEPT_RELATIONSHIP_STAGE c
WHERE c.CONCEPT_ID_1 = d.CONCEPT_ID_1
AND c.CONCEPT_ID_2 = d.CONCEPT_ID_2
AND c.RELATIONSHIP_ID = d.RELATIONSHIP_ID)
AND D.RELATIONSHIP_ID IN (010, 144, 125, 239);
|
Current Result: - Num Rec in stage | 205842 | - Num Rec in DEV not deleted | 203438 | - How many records would be new in DEV added | 3206
| - How many DEV active will be marked for deletion | 810
|
1.5.6 Load new concept relationships into DEV schema,mark deprecated concept relatuionships as deleted Transfer records from CONCEPT_RELATIONSHIP_STAGE to DEV.CONCEPT_RELATIONSHIP table $ sqlplus
MEDDRA_20120822/myPass@DEV_VOCAB @15_load_relations.sql
|
 Updating...
Serg Vereshagin, Oct 25, 2013, 2:19 AM
Serg Vereshagin, Oct 25, 2013, 2:19 AM
Serg Vereshagin, Oct 25, 2013, 2:19 AM
15_load_to_prestage_SMQ_Content.ctl (0k) Serg Vereshagin, Oct 25, 2013, 2:20 AM
15_load_to_prestage_SMQ_List.ctl (1k) Serg Vereshagin, Oct 25, 2013, 2:20 AM
15_load_to_prestage_hlgt.ctl (0k) Serg Vereshagin, Oct 25, 2013, 2:19 AM
15_load_to_prestage_hlgt_hlt.ctl (0k) Serg Vereshagin, Oct 25, 2013, 2:19 AM
15_load_to_prestage_hlt.ctl (0k) Serg Vereshagin, Oct 25, 2013, 2:19 AM
15_load_to_prestage_hlt_pt.ctl (0k) Serg Vereshagin, Oct 25, 2013, 2:20 AM
15_load_to_prestage_intl_ord.ctl (0k) Serg Vereshagin, Oct 25, 2013, 2:20 AM
15_load_to_prestage_llt.ctl (0k) Serg Vereshagin, Oct 25, 2013, 2:20 AM
15_load_to_prestage_mdhier.ctl (0k) Serg Vereshagin, Oct 25, 2013, 2:20 AM
15_load_to_prestage_pt.ctl (0k) Serg Vereshagin, Oct 25, 2013, 2:20 AM
15_load_to_prestage_soc.ctl (0k) Serg Vereshagin, Oct 25, 2013, 2:20 AM
15_load_to_prestage_soc_hlgt.ctl (0k) Serg Vereshagin, Oct 25, 2013, 2:20 AM
Serg Vereshagin, Oct 25, 2013, 2:19 AM
Serg Vereshagin, Oct 25, 2013, 2:19 AM
|