Creation of Submission & Statistical Analysis Data Sets according to CDISC recommendations

Yvane Boudraa, Sanofi-aventis, Biostatistics & Programming department, System support


2

Agenda

Sanofi-Aventis

Objectives of the tool

Standards

Data process

System process: definition file, SAS programs, SAS macros

Demonstration

3

Sanofi-Aventis

Pharmaceutical industry: No. 1 in Europe, No. 3 worldwide

7 main therapeutic areas: cardiovascular, thrombosis, central nervous system, oncology, metabolism, internal medicine, vaccines (1st in the world)

100,000 employees worldwide

4

Sanofi-Aventis: System department functions

To support statisticians and programmers (300 users)

To be responsible for the systems and environments they work on

To follow technology evolution

To provide first-line SAS support

To train and help users

To provide and develop tools that automate repetitive tasks

To ensure compliance with regulatory requirements such as 21 CFR Part 11

5

Objectives of the tool

To produce standard Submission and Analysis Data Sets (SDS and ADS) more easily and more quickly

To satisfy health authorities

To ensure quality and gain/improve efficiency

Risk reduction: programming errors, standards-compliance issues

Facilitate the Project/Study specifications for Domains (SUPPQUAL)

Ensure that the whole derivation process is traceable

6

Objectives of the tool

Facilitate our internal QC/validation process

Ensure the uniqueness of derivations

Facilitate ADS/Domain documentation according to the CDISC and our specifications

Be easily portable from one platform to another (PC version)

Developed in SAS V8.2, but portable to SAS 9

Risk reduction: programming errors, standards-compliance issues

7

Standards - CDISC

Clinical Data Interchange Standards Consortium (CDISC)

Its mission is to develop standards, and to support the pharmaceutical industry in establishing them, for submissions to the health authorities

SDTM: Study Data Tabulation Model. Medical review. Current version 3.1. Created datasets are called Domains and, if needed, Supplemental Qualifiers (SUPPQUAL).

ADaM: Analysis Data Set Model. Statistical review. Created datasets are called Analysis Data Sets (ADS).

Structure of data sets

SDS: fixed vertical structure

ADS: mix of vertical and horizontal structure, depending on the type of analysis

8

Standards - CDISC

Name of data sets

SDS: domain name fixed

ADS: prefixed by the two letters AD; also depends on the sanofi-aventis standard

Variables (SDS)

Mainly character variables

Name and label of variables fixed

Prefixed by the domain name (xx), except for common variables

Conventions such as the application of ISO formats (country, dates)

9

Standards - CDISC

Variables (ADS)

Character variables and the corresponding numeric variables (decode and code)

Name and label of variables: if the variable is defined in the SDS, keep the same; otherwise, as defined in the naming conventions and the sanofi-aventis standard

Naming conventions

ST = Start
EN = End
DTC = Date/Time Character
  CMSTDTC = Start Date/Time of Medications
  CMENDTC = End Date/Time of Medications
  VSDTC = Date/Time of Measurements
DY = Study Day of
  DMDY = Study Day of Collection
  VSDY = Study Day of Vital Signs
TEST = Test Name
TESTCD = Test Short Name
ORRES = Result or Finding in Original Unit
STRES = Result or Finding in Standard Unit
…
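The naming conventions above combine a two-letter domain prefix with a standard fragment. The following Python sketch shows that decomposition only as an illustration. It is not part of the tool, which is written in SAS. The function name `split_variable` and the lookup table are hypothetical, and common variables that carry no domain prefix are not handled.

```python
# Fragments taken from the naming conventions listed on this slide.
FRAGMENTS = {
    "STDTC": "Start Date/Time",
    "ENDTC": "End Date/Time",
    "DTC": "Date/Time",
    "DY": "Study Day",
    "TEST": "Test Name",
    "TESTCD": "Test Short Name",
    "ORRES": "Result or Finding in Original Unit",
    "STRES": "Result or Finding in Standard Unit",
}


def split_variable(name):
    """Split a domain variable name into (domain prefix, fragment meaning).

    Assumes the first two letters are the domain prefix (hypothetical helper).
    """
    prefix, suffix = name[:2], name[2:]
    if suffix not in FRAGMENTS:
        raise ValueError("unknown fragment %r in %r" % (suffix, name))
    return prefix, FRAGMENTS[suffix]


print(split_variable("CMSTDTC"))
print(split_variable("VSDY"))
```

Each call returns the domain code and the meaning of the fragment, so `CMSTDTC` reads as the start date/time in the CM domain.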

10

Standards - Internal

The sanofi-aventis standard has defined:

SDS structures and contents: 14 domains, plus the supplemental qualifiers dataset

ADS names

ADS structures and contents: 11 ADS

11

Standards - Internal

The sanofi-aventis standard has defined:

Default statistical rules of derivation

General rules regarding partial dates, the rounding of values, the imputation of missing values, …

Derivation of variables, such as the age

Safety guideline

Regarding the safety analysis: the assignment of safety windows, …

BY FOLLOWING THE CDISC RECOMMENDATIONS
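The sanofi-aventis default rules document is not reproduced in these slides, so its rules cannot be shown here. As an illustration of what a standard derivation looks like, the sketch below implements the study day (the DY variables) following the SDTM convention, in which the reference date is day 1 and there is no day 0. It is written in Python for readability, whereas the tool's own derivation macros are in SAS. The function name `study_day` is hypothetical.

```python
from datetime import date


def study_day(event, reference):
    """Study day of `event` relative to the reference start date.

    SDTM convention: the reference date itself is day 1, the day
    before it is day -1, and day 0 does not exist.
    """
    delta = (event - reference).days
    return delta + 1 if delta >= 0 else delta


reference = date(2005, 3, 1)
print(study_day(date(2005, 3, 1), reference))   # same day as the reference
print(study_day(date(2005, 2, 28), reference))  # day before the reference
```

Holding each such rule in a single function (or, in the tool, a single macro) is what allows a derivation to be defined only once across studies.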

12

Data process

[Flow diagram. Elements: CDM database; extraction; SAS operational data sets; mapping; temporary ODS; compiled macros for standard derivation, and transpose if needed; specific derivations; ODS2 (raw data & standard derivations); keep; Domains (SDTM V3.1); Suppqual; statistical rule derivations; ADS]

13

Data process

[Flow diagram. Elements: CDM database; extraction; SAS operational data sets; ODS2 (raw data & standard derivations, specific derivations); Domains (SDTM V3.1); Suppqual; ADS]

Programs run along the flow:

ODS2: initialisation of parameters, then execution of the programs called XX.sas; output: ids_XX SAS datasets

Domains and Suppqual: execution of the program called sds.sas; output: XX SAS datasets and suppXX SAS datasets

ADS: initialisation of parameters, then execution of the programs called YYYY.sas; output: YYYY SAS datasets

14

System process

15

Definition file: definition of each SDS & ADS

One definition file exists for each standard dataset

In Excel:

SDS: cdiscvar.xls, with one sheet per domain

ADS: adsvar.xls, with one sheet per ADS

No definition file exists for the ODS2 datasets (= intermediate datasets), since they are created from the SDS definition file

16

Definition file: definition of each SDS & ADS

Structure of the Excel file is fixed

[Callouts on the definition sheet: "Must be filled"; "Filled only in case of mapping"]

17

Definition file: definition of each SDS & ADS

[Callout on the definition sheet: "mandatory"]

18

Definition file: specific definition file

Reasons: to delete permissible variables, to add specific variables in an ADS, …

For an existing standard dataset:

Same source datasets: the standard processing is done

Same source datasets plus others: the standard processing is done

Different source datasets: only the mapping on the first source dataset is done

For a new dataset: only the mapping, except for recoding, on the first source dataset is done

The dataset will contain all variables described in the definition file, filled or not according to the cases

Definition of each SDS & ADS
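The rule that the output dataset contains every variable described in the definition file, filled or not, can be sketched as follows. This is an illustrative Python sketch under stated assumptions, not the tool's SAS implementation. The field names `name`, `seq` and `source`, the function `apply_definition`, and the sample variables are all hypothetical.

```python
def apply_definition(definition, source_row):
    """Build one output record from a definition and one source record.

    Every variable in the definition appears in the output, in sequence
    order. A variable with no source mapping, or whose source is missing
    from the record, is left empty (None).
    """
    record = {}
    for var in sorted(definition, key=lambda v: v["seq"]):
        src = var.get("source")
        record[var["name"]] = source_row.get(src) if src else None
    return record


# Hypothetical extract of one definition sheet.
definition = [
    {"name": "SEX", "seq": 2, "source": "GENDER"},
    {"name": "USUBJID", "seq": 1, "source": "SUBJ_ID"},
    {"name": "AGE", "seq": 3, "source": None},  # derived later, not mapped
]
source_row = {"SUBJ_ID": "001-0001", "GENDER": "F", "VISIT": 1}

print(apply_definition(definition, source_row))
```

The source variable `VISIT` is not in the definition and is therefore absent from the output, while `AGE` is present but empty until a derivation fills it.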

19

Definition file

Comparison of the structure of the specific and standard definitions; the result is stored in a dataset

Identified differences are classified:

variable does not exist in the standard definition, or the letter case is different
variable does not exist in the specific definition, or the letter case is different
different sequence orders
different variable labels, types and lengths
different variable labels and types
different variable labels and lengths
different variable types and lengths
…

Reported for each variable: variable name, spec & std label, spec & std type, spec & std length, spec & std core, spec & std sequence order, spec & std source variable

Definition of each SDS & ADS
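The classification of differences between a specific and a standard definition can be sketched as below. This is a minimal Python illustration, assuming each definition is held as a mapping from variable name to its attributes. The function `compare_definitions`, the attribute keys and the sample data are hypothetical, and the tool itself performs this comparison in SAS.

```python
# Attributes compared for each variable, following the list on this slide.
ATTRIBUTES = ("label", "type", "length", "core", "seq", "source")


def compare_definitions(standard, specific):
    """Return a list of (variable name, classification) differences.

    Names are compared case-sensitively, so "sex" and "SEX" count as
    two different variables, one missing on each side.
    """
    findings = []
    for name in sorted(set(standard) | set(specific)):
        if name not in standard:
            findings.append((name, "not in standard definition"))
        elif name not in specific:
            findings.append((name, "not in specific definition"))
        else:
            changed = [a for a in ATTRIBUTES
                       if standard[name].get(a) != specific[name].get(a)]
            if changed:
                findings.append((name, "different " + ", ".join(changed)))
    return findings


standard = {
    "AGE": {"label": "Age", "type": "Num", "length": 8, "seq": 3},
    "SEX": {"label": "Sex", "type": "Char", "length": 1, "seq": 2},
}
specific = {
    "AGE": {"label": "Age in years", "type": "Num", "length": 8, "seq": 3},
    "sex": {"label": "Sex", "type": "Char", "length": 1, "seq": 2},
}

for finding in compare_definitions(standard, specific):
    print(finding)
```

Variables that are identical in both definitions produce no finding, so the output lists only what a reviewer needs to check.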

20

Programs: template programs

Initialization program: contains all the common parameters of the study for all standard datasets, and the needed formats

Template program: one exists for each standard dataset

ODS2/SDS/Supplemental qualifier: one program per domain XX, creating either

ODS2 alone, or ODS2 and SDS/Supplemental qualifier

Once all ODS2 have been created: creation of the SDS and Supplemental qualifiers

ADS: one program per ADS YYYY

21

Programs: template programs

XX.sas creates the ids_xx dataset

Contains all variables specified in the cdiscXX definition file, plus the variables existing in the main source dataset

If creatsds=YES, it also creates the xx and suppxx datasets

The xx dataset contains only the variables existing in the cdiscXX definition file

The suppxx dataset contains the supplemental variables of the ids_xx dataset

YYYY.sas creates the yyyy dataset

Contains only variables specified in the YYYYvar definition file
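The split of an ids_xx dataset into the xx domain and its suppxx dataset can be sketched as follows. This Python illustration is a simplification under stated assumptions: the function `split_domain` and the sample variable `VSNOTE` are hypothetical, and only the `QNAM` and `QVAL` columns of the SDTM supplemental qualifier structure are shown, whereas a real SUPPQUAL dataset carries further identifying columns.

```python
def split_domain(ids_rows, defined_vars, key_vars=("USUBJID",)):
    """Split intermediate records into domain records and supplemental rows.

    Domain records keep only the variables listed in the definition.
    Every other non-empty variable becomes one supplemental row holding
    the key variables, the variable name (QNAM) and its value (QVAL).
    """
    domain, supp = [], []
    for row in ids_rows:
        domain.append({v: row.get(v) for v in defined_vars})
        for name, value in row.items():
            if name not in defined_vars and value is not None:
                supp_row = {k: row.get(k) for k in key_vars}
                supp_row.update({"QNAM": name, "QVAL": str(value)})
                supp.append(supp_row)
    return domain, supp


ids_vs = [
    {"USUBJID": "001", "VSTESTCD": "SYSBP", "VSORRES": "120", "VSNOTE": "seated"},
    {"USUBJID": "002", "VSTESTCD": "SYSBP", "VSORRES": "135", "VSNOTE": None},
]
vs, suppvs = split_domain(ids_vs, ["USUBJID", "VSTESTCD", "VSORRES"])
print(vs)
print(suppvs)
```

The vertical layout of the supplemental rows mirrors the fixed vertical structure of the SDS, which is why extra variables can be carried without changing the domain's own structure.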

22

Programs: template programs

Addition of SAS code: to add variables, to derive variables that are defined in the standard but whose derivation is not, to drop permissible variables, to drop variables that are normally kept in the supplemental qualifier, …

Comparison

Of contents, stored in a dataset: only the differences in modified variable values are displayed

Of structure, stored in a dataset: variable name, type, length, variable number, label, and format of new variables

23

SAS Macros

Utility macros, for example: to check the type of a variable, to check whether a variable or a dataset exists

Derivation macros: one macro program per standard derived variable, except for the date variables; based on the statistical default rules document

Stored in a SAS compiled catalogue; accessible outside the tool

SAS macro programs for deriving variables

24

SAS Macros

System macros encapsulate the whole execution: mapping step, derivation step, added code, control, display of the application messages in the SAS log

Stored in a SAS compiled catalogue

SAS macro programs for deriving variables

25

Demonstration

Standard

Use of specific definition file

Addition of SAS code

26

Conclusion

The first level of standards is the CRF (Case Report Form)

Maintain good communication with the standards team, to keep up with the evolution of the standards and with new derivation rules

Cover as many derivations as possible with standard rules, to standardize the users' work across projects and across therapeutic areas

Think through beforehand the right order in which to create the ODS2, to avoid redundancies

Platform-dependent parameters must be clearly identified for the application to be portable

Provide users with very clear and well-documented template programs

Give users good training, supported by documentation