Data Science Jeanina (Nina) Worden Director, Statistical Programming

Jeanina (Nina) Worden, Director, Statistical Programming · 2017-11-28


Data Science

Jeanina (Nina) Worden

Director, Statistical Programming


Agenda

Managing Expectations

Managing Data Security

Managing the People & Processes

Managing to Change


Expectations

Four key departmental goals:

Compliant system/platform

Content security

Global access

Integrated environment for all users


Highlights

Life Science Analytics Framework (LSAF)

Cloud-based SAS environment

Built to be compliant with regulatory requirements

Integrated analytic tools

Integrated library of CDISC standards


LSAF Dashboard


Compliance

LSAF IS a compliant system because:

Two-level system access

Account locking after failed attempts

Audit log records user activities

Redundant backups maintained by SAS

Robust change controls followed by SAS


Performance Testing

Processing speed (in seconds) using a 440,000 record dataset

Program                       Server    LSAF
Simple SET                    0.7020    0.4297
PROC SQL query (with MEAN)    0.6550    0.3915
PROC FREQ                     0.4830    0.6455
PROC LOGISTIC                 8.2840    4.4321
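
One way to read these timings is as a ratio of Server time to LSAF time for each program. The sketch below is illustrative only: the figures are copied from the slide, while the variable and function names are assumptions made for this example. A ratio above 1 means LSAF finished faster in this particular test.

```python
# Timings in seconds, copied from the slide: (Server, LSAF).
timings = {
    "Simple SET": (0.7020, 0.4297),
    "PROC SQL query (with MEAN)": (0.6550, 0.3915),
    "PROC FREQ": (0.4830, 0.6455),
    "PROC LOGISTIC": (8.2840, 4.4321),
}


def server_to_lsaf_ratio(server_s, lsaf_s):
    """Return Server time divided by LSAF time; above 1 means LSAF was faster."""
    return server_s / lsaf_s


ratios = {
    name: server_to_lsaf_ratio(server_s, lsaf_s)
    for name, (server_s, lsaf_s) in timings.items()
}

for name, ratio in ratios.items():
    faster = "LSAF" if ratio > 1 else "Server"
    print(f"{name:<28} ratio={ratio:.2f}  faster: {faster}")
```

On these figures LSAF is faster for three of the four programs; PROC FREQ is the exception.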


Access Requirements

Granular access controls

Familiar SAS environment

Built-in management and audit tools

SAS maintenance relieves in-house IT needs


Controlling Access

Santen controls access by controlling account settings:

Creating user accounts appropriate to each user's function or the task to be performed

Controlling access to the files contained on the system


User Access

Santen has differing types of users based on their roles:

Programmers/Analysts

Data managers

Data “users” (e.g., executives, scientists)


User Access

Different types of users based on their employment:

FTEs

Contractors located at the Santen office

CRO partners


Group Accounts

Defined needs for each type of user

User “groups” provide consistency

Exceptions applied individually

Diagram labels: FTEs Only, Programming Group, Privileges

Access Privileges

Can the user download out of LSAF?

Will the user be allowed to change the Data Standards?

Should the user be able to lock/unlock accounts?
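
The idea of group accounts with individually applied exceptions, combined with the three privilege questions above, can be sketched as follows. This is a minimal illustration, not LSAF's actual configuration model: the group names, privilege names, and the `User` class are all hypothetical.

```python
from dataclasses import dataclass, field

# Hypothetical privilege names, one per question on the slide.
DOWNLOAD = "download"
EDIT_STANDARDS = "edit_data_standards"
LOCK_ACCOUNTS = "lock_unlock_accounts"

# Hypothetical groups; the real group definitions are not given in the deck.
GROUP_PRIVILEGES = {
    "programming": {DOWNLOAD},
    "data_management": {DOWNLOAD, EDIT_STANDARDS},
    "administrators": {DOWNLOAD, EDIT_STANDARDS, LOCK_ACCOUNTS},
    "data_users": set(),
}


@dataclass
class User:
    name: str
    group: str
    granted: set = field(default_factory=set)  # individual additions
    revoked: set = field(default_factory=set)  # individual removals


def effective_privileges(user):
    """Start from the group's privileges, then apply individual exceptions."""
    base = GROUP_PRIVILEGES[user.group]
    return (base | user.granted) - user.revoked


fte = User("fte_programmer", "programming")
cro = User("cro_programmer", "programming", revoked={DOWNLOAD})

print(effective_privileges(fte))
print(effective_privileges(cro))
```

Here both users belong to the same group, which keeps the defaults consistent, and the CRO user's download right is removed as an individual exception.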


Content Access

Content differences also exist

Types of content include:

Ongoing study files

Archived study files

Ad hoc/testing data


Content Access

Content access is not automatic

Access can be controlled at different levels:

Study level

Subfolder level

File level
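
The three levels above can be pictured as rules attached to a folder hierarchy. The sketch below assumes that the most specific rule wins and that no rule means no access; whether LSAF resolves permissions exactly this way is an assumption, and the paths and group name are hypothetical.

```python
# Hypothetical access rules: (group, path) -> allowed?
# "study01" stands for a study-level folder; deeper paths are subfolders or files.
ACCESS = {
    ("programmers", "study01"): True,
    ("programmers", "study01/unblinded"): False,
    ("programmers", "study01/unblinded/spec.xlsx"): True,
}


def can_access(principal, path, access=ACCESS):
    """Walk from the file up to the study; the most specific rule wins.

    If no rule exists at any level the answer is False, mirroring
    "content access is not automatic".
    """
    parts = path.split("/")
    while parts:
        rule = access.get((principal, "/".join(parts)))
        if rule is not None:
            return rule
        parts.pop()  # fall back to the parent folder
    return False


print(can_access("programmers", "study01/adam/adsl.sas7bdat"))
print(can_access("programmers", "study01/unblinded/rand.sas7bdat"))
print(can_access("programmers", "study01/unblinded/spec.xlsx"))
```

The first path inherits the study-level grant, the second is blocked at the subfolder level, and the third is re-opened by a file-level rule.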


Study Level


Subfolder/File Level


Access history

Account creation and access modifications are logged

Audit history used as documentation for regulatory agencies


Going Global

Key considerations in building the global team

Efficient resourcing

Effective communication

Standard processes


Where We Were


Global Resources

Single work environment

Increase our team in a cost-efficient manner

Full utilization of current resources

Capitalize on lower-cost markets

Diagram labels: US-based programmers, Japan-based programmers, New programmers

Where We Are Now

6 8 18!


Global Resources

Increase productivity by capitalizing on the differing time zones

Diagram labels: US, Japan, China

Global Communication

Communication is the primary challenge

Study communication:

Single-source “living” documents

Task status


Global Communication

Common study documents


Monitoring Progress


Global Libraries

Single environment = consistency

Global folders or libraries:

Best practice/guidance documents

Data standard implementation guides

Training templates


Global Libraries


Global Standards

Standards = efficiency

CDISC mapping documents

Standard programs

Dataset programs

Utility programs


Global Libraries


Change

Change is dependent on the starting point

Changes for Santen

UNIX environment

Jobs (Batch execution)

Working with files between areas

Storage management


Change - Jobs


Change - Jobs


Dependencies are accounted for and are access based, not user based


Change – Working with files

Check-in - Adding files

Checkout/Moving

Versioning

Tracks changes

Ability to revert to a previous version


Change – Working with Files


Copy = no change

Check-in/Checkout = changes allowed
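
The difference between copying a file and checking it out can be illustrated with a toy model. This is a sketch under stated assumptions, not the LSAF API: the `VersionedFile` class, its method names, and the user and comment strings are hypothetical.

```python
class VersionedFile:
    """Toy model of a repository file with version history; not the LSAF API."""

    def __init__(self, content):
        # Each version is stored as a (content, comment) pair.
        self.versions = [(content, "initial check-in")]
        self.checked_out_by = None

    def copy(self):
        # A copy is a detached snapshot: editing it never changes the repository.
        return self.versions[-1][0]

    def checkout(self, user):
        if self.checked_out_by is not None:
            raise RuntimeError(f"already checked out by {self.checked_out_by}")
        self.checked_out_by = user
        return self.versions[-1][0]

    def checkin(self, user, content, comment):
        if self.checked_out_by != user:
            raise PermissionError("check the file out before checking it in")
        self.versions.append((content, comment))
        self.checked_out_by = None

    def revert(self, version_number, comment):
        # Reverting stores the old content as a new version, so history is kept.
        old_content = self.versions[version_number - 1][0]
        self.versions.append((old_content, comment))


program = VersionedFile("data adsl; set raw.dm; run;")
draft = program.checkout("programmer1")
program.checkin("programmer1", draft + " /* reviewed */", "added review note")
program.revert(1, "reverted to version 1")
print(len(program.versions))
```

Every check-in and every revert appends a stored version in this model, which is one reason the comment field and storage use deserve attention.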


Change - Versioning


Guidance needed for when to version, “type” definitions and utilizing the comment field


Change – Versioning (Access)


Note: You cannot compare versions


Change - Storage Management

Storage understanding and management is key

Storage allocation between development and production areas

Each version stored


Change - Storage Management

Do’s

Check-out only what is needed

Delete unneeded files often

Don’ts

Create “archived” folders

Duplicate files in multiple places

SAS can provide reports with offenders!


Future Change

Value of data

Chart: vertical axis 0 to 100, horizontal axis 1 to 9; series: Study, Repository

Future Change

Potential to build therapeutic data repository

Add value to data beyond submission usage

Cluster analysis for marketing targets

Preferred site locations

Defining better studies


Summary

Account options provide wide access and granular security

The centralized work environment allows increased, efficient use of global resources


Summary

Changes do occur but can be managed and have the potential to add value


Questions?


Thank you to SAS for giving me the opportunity to present this topic!


Thank You!

Jeanina (Nina) Worden

Director, Statistical Programming

6400 Hollis

Emeryville, CA

[email protected]
