
Statistics Canada Research Data Centre Program*



Page 1: Statistics Canada Research Data Centre Program*

Facilities across Canada housing detailed confidential microdata and documentation files from Statistics Canada.

Statistics Canada has released data that would otherwise not be available into “secure” sites.

Page 2: Statistics Canada Research Data Centre Program*

About Statistics Canada data:

• Canadian Community Health Survey (CCHS)
• Ethnic Diversity Survey (EDS)
• General Social Survey (GSS, selected cycles)
  – Access to and Use of Information Communication Technology
  – Education, Work and Retirement
  – Family
  – Health
  – Social Engagement
  – Social Support and Aging
  – Time Use
  – Victimization
• Longitudinal Survey of Immigrants to Canada (LSIC)
• National Longitudinal Survey of Children and Youth (NLSCY)
• National Population Health Survey (NPHS)
• Survey of Labour and Income Dynamics (SLID)
• Workplace and Employee Survey (WES)
• Youth in Transition Survey and the Programme for International Student Assessment (YITS-PISA)

Page 3: Statistics Canada Research Data Centre Program*

Two ways researchers can get access to Statistics Canada survey data:

Research Data Centre
• Access restricted
• Must work with data within centre
• Results can only be released from centre after scrutiny
• Must submit proposal, be “approved,” get security clearance, sign contract

Data Liberation Initiative
• Actual data released to researchers with minimal restrictions
• Data can be downloaded via library website

Page 4: Statistics Canada Research Data Centre Program*

Two sources of StatCan data:

Data Liberation Initiative:
- At just about every university and many colleges across Canada

Research Data Centres:
- Full-time centres at larger Canadian universities (UBC, Alberta, Calgary, Toronto, Western, Waterloo, Montreal, etc.)
- Part-time centres at other universities, including UVic, SFU, Queen’s, Saskatchewan
- No centre at smaller institutions (e.g., Vancouver Island University, UNBC)

Page 5: Statistics Canada Research Data Centre Program*

DLI Restrictions

• No longitudinal data (in some cases, cross-sectional waves are available, unlinked and with unique identifiers stripped; in other cases the survey is not available at all)

• Many variables treated as “confidential” and either deleted from the dataset or coarsely categorized

Page 6: Statistics Canada Research Data Centre Program*

Files available
  RDC:
  - Those listed on RDC site
  - Other files if arrangements can be made
  DLI:
  - Those listed on DLI site (see UVic library web page under Data Acquisition)

Files not available
  DLI:
  - Any linked longitudinal dataset
  - Recent waves of NLSCY
  - Some newly available surveys

Information on files not available
  RDC:
  - Cluster numbers
  DLI:
  - Cluster numbers
  - Geographic detail
  - Demographic detail
  - Other

Page 7: Statistics Canada Research Data Centre Program*

Who may access
  RDC:
  - Faculty with approved projects
  - Graduate students with approved projects (+ faculty co-investigator)
  DLI:
  - Any member of UVic community with NetLink ID

Where files may be worked on
  RDC: In Data Centre only
  DLI: May be downloaded to be used anywhere, with agreement not to redistribute

Initial contact
  RDC: Doug Baer or Lee Grenon (StatCan Analyst located in Vancouver at UBC)
  DLI: Kathleen Matthews, Data Librarian, UVic

Page 8: Statistics Canada Research Data Centre Program*


censored variables

Full versions of datasets with censored variables + datasets not otherwise available can be worked on in a “Research Data Centre”

Page 9: Statistics Canada Research Data Centre Program*

Full versions of datasets with censored variables + datasets not otherwise available can be worked on in a “Research Data Centre”

As part of the application process, you will be asked to indicate why the DLI datasets are not sufficient.

- Dataset not available at all via DLI (some longitudinal datasets)
- Linked longitudinal file for the dataset not available via DLI
- Variables needed for the research are either (a) not available or (b) too coarsely categorized [not so with originally collected data] to be usable for the research project

Examples: exact income, exact age, city size, community in which respondent lives (esp. for multi-level analysis), religion, church attendance (some surveys)

Page 10: Statistics Canada Research Data Centre Program*

There are RDCs across Canada at most major universities with doctoral programs:

• New Brunswick (branch at Moncton), Dalhousie
• Toronto, York
• Waterloo (branch at Guelph; WLU participates)
• McMaster (Brock participates)
• Western (branch at Windsor)
• Queen’s (part-time site)
• U of Ottawa (Carleton and UQ-Gatineau participate)
• Manitoba
• U of Saskatchewan (part-time site)
• 2 Alberta sites: U of Alberta; Calgary (branch at Lethbridge)
• Consortium (U de Montreal) with branches at UQAM, Sherbrooke, Laval
• McGill
• BC universities consortium

BC consortium: UBC, SFU, UVic, Vancouver Island Univ., UNBC

Page 11: Statistics Canada Research Data Centre Program*

The UVic branch works within the British Columbia Interuniversity Research Data Centre network

• “Main” site is at UBC; open 9-5 M-F
• UVic site has more restrictive hours (arranged term-by-term in consultation with researchers).
  – Currently 15.5 hours/week (sometimes a bit less in summer)
  – Exact hours worked out in consultation with users

Page 12: Statistics Canada Research Data Centre Program*

Data used most frequently at RDCs

• Survey of Labour and Income Dynamics (SLID)
• Victimization: GSS cycles 3, 8, 13, 18
• Participation and Activity Limitation
• Family history: GSS cycles 5, 10, 15 and 20
• Homicide data (pilot)
• National Graduates Survey / Follow-up Survey of 2000 Graduates
• Canadian Health Survey 1.2
• Canadian Health Survey 2.2
• Canadian Health Survey .1

Page 13: Statistics Canada Research Data Centre Program*

Major StatCan surveys (all very well supported at RDCs):

• Workplace and Employee Survey
• Canadian Community Health Survey
• Health Services Access Survey
• General Social Survey

Longitudinal:
• National Population Health Survey
• Survey of Labour & Income Dynamics
• National Longitudinal Survey of Children and Youth
• Longitudinal Survey of Immigrants to Canada
• Youth in Transition Survey
• Workplace and Employee Survey

Census (presently 1991, 1996, 2001, 2006)

Page 14: Statistics Canada Research Data Centre Program*

Other data can be arranged

• There is presently a project involving BC Administrative Health data (to be linked to StatCan survey data)

• For a very large list of StatCan surveys, see the DLI website (UVic library): http://gateway.uvic.ca/data/default.html (click on “DLI collection”)

Future plans: see below

Page 15: Statistics Canada Research Data Centre Program*


What is the process for gaining access?

http://www.statcan.ca/english/rdc/application.htm

Page 16: Statistics Canada Research Data Centre Program*


Application process works through SSHRC

Graduate students must have faculty member as co-investigator

Page 17: Statistics Canada Research Data Centre Program*

Project proposal

• Proposal evaluation by SSHRC peer review and Statistics Canada
• Very few are turned down… though must establish that confidential data are required to complete project
  – Does project have scientific merit? Is access to confidential microdata necessary? Does researcher have expertise to conduct research?
  – Takes 5-8 weeks
• Proposals that are part of SSHRC or CIHR grants forgo the SSHRC peer review process
  – Approvals typically 3-4 weeks

Page 18: Statistics Canada Research Data Centre Program*

Process:

• Submit proposal
• Proposal approved
• Security check on applicant
• Oath; investigator becomes “deemed employee” of Statistics Canada
• Orientation session at UVic
• Issued access card for card reader

Page 19: Statistics Canada Research Data Centre Program*

UVic facilities:

6-workstation lab with room for expansion to up to 10 workstations

Workstations now have widescreen monitors or dual-screen configuration

Server for data

Most commonly used statistical software packages

Some highly specialized software packages

Hours are worked out to suit the needs of active researchers. Currently: 16 hrs/week (slightly reduced Feb 2011 due to staff transition)

We try to be open at least 4 hours on 3 different days of the week.

Page 20: Statistics Canada Research Data Centre Program*

Software

Standard stats packages: SPSS (19), SAS (9.2), Stata (11) [Stata/SE on 2 machines]
Open-source stats: R
Multilevel models: HLM, LISREL, Mplus
SEM models: LISREL, Mplus
Specialized (Bayesian, MCMC etc.): WinBUGS

Other software can be obtained if demand exists.

Page 21: Statistics Canada Research Data Centre Program*

Security process

• No output or notes can be taken out of the room
• Users have file drawers and access to printer inside the centre
• Output listings and notes (if typed into a computer file) can be released after they are “vetted” by a Statistics Canada Analyst at the main BC site
• Files are sent via high security network to Vancouver
• Files that are approved for release are emailed back to researcher
• Pass card works only during centre hours (swipe in, swipe out protocol)

Page 22: Statistics Canada Research Data Centre Program*

Can I work at other RDCs too?
Can I work with other researchers?
What about other researchers at other universities?

• Access is “network wide”
• Files are stored on a “project” basis (researchers, RAs, etc. have own account but access to shared files)
• UVic researchers are part of the BC consortium and could go to the UBC site if more intense periods of research are required (35 hrs/week vs. 15); project files can be sent to and from the branch via the security intranet

Page 23: Statistics Canada Research Data Centre Program*

Preparation:

• Check to see if dataset is one of the standard RDC datasets: check the national RDC website (see handout)
  – Extensive data documentation provided for listed datasets
  – If what you are interested in is not on the list, check with Doug Baer
• Is a public use file available? Check with Kathleen Matthews [email protected] or on library web site.
• Verify that variables needed for research are not on public use file. If possible, use public use file to explore data, etc.
• If further dataset documentation is required, ask Doug Baer
• Go to SSHRC web page to put together application. Don’t hesitate to consult Doug Baer for help. Be prepared to specify variables to be used. Where a public use version of the dataset is available, be prepared to make clear why RDC access is needed (e.g., “a needed variable is suppressed on the public use file”).

Page 24: Statistics Canada Research Data Centre Program*

Statistics Training

• Summer Institutes:
  – SPIDA (York University)
  – ICPSR (U Michigan)
  – University of Western Ontario
  – Population Health BC
• Likely workshop on Structural Equation Models June 2011
  – Seminar at the Congress for the Humanities & Social Sciences (none this year)
• Special workshops and seminars (Baer):
  – Possible: Statistics Canada bootstrap weights

Page 25: Statistics Canada Research Data Centre Program*

Contact information:

Doug Baer, Academic Director (Sociology)
[email protected]
(721) – 7581
Cornett, A365

RDC (853) 3196 ([email protected])

RDC Analyst at UBC: Geoff 604-822-0263 ([email protected])

Centre web site (shows hours): web.uvic.ca/rdc

More numbers on handout

Page 26: Statistics Canada Research Data Centre Program*

Future:

• Plans are in development to add the following to RDC dataset collection:
  – Cancer Registry (pilot project in progress at BCIRDC)
  – HRSDC administrative data
  – CPP-disability data
  – Homicide data (Cdn. Centre for Justice Statistics) [under review: pilots only]
  – Census
  – Business data: (selected datasets from Small Business & Special Surveys Division)