Doors to Data
Data Search: Major Sources
Susan Mowers, Data Librarian
Sarah Roach, Research Assistant (RTRA service)
Objectives
• Gain familiarity with the types of sources for data*
• Gain familiarity with how to “access” major data sources
*Quantitative data
Outline
Your Data Search includes …
• Doors to data
▫ “Microdata”
▫ Aggregate data
• “Microdata”
▫ Public “microdata”
▫ Confidential “microdata”
• Aggregate data
▫ Canadian
▫ International
Suggestion
• Please log on to your computer
▫ abcd###@uottawa.ca yyyyddmmsin
• Problems? I can help!
Let’s do: Open Data and Statistics Research Guide
http://biblio.uottawa.ca/en
Comparing data types …
Data
Aggregate data, or “Statistical tables”
Doors to data
•When to use microdata?
For high degree of detail
All variables at individual unit of analysis, leading to …
Many choices about subject matter Greater range of statistical analyses possible
Doors to data
•When to use aggregate data?
Microdata not available?
e.g., business survey microdata not readily available
Need macroeconomic data? e.g., at the level of the region, country, province or city
Need time-series data? e.g., comparative values already calculated across time periods
Questions?
Outline
• Doors to Data
▫ Microdata
▫ Aggregate data
• Microdata Search
▫ Public microdata
▫ Confidential microdata: RDC and RTRA
• Aggregate data
▫ Canadian: CANSIM & other
▫ International: UN, OECD, World Bank, Haver, IMF
We are here
Public Statistics Canada Microdata
Access via the Library and Odesi
Public microdata
• Confidentiality/privacy problems are resolved with PUMFs
▫ Low-risk nature of public data
▫ 24/7 access via Odesi to Statistics Canada public data*
▫ Contact point for help: GSG Centre/MRT
*& other sources, e.g., ICPSR [Link] and World Bank [Link] …
Let’s see!
•Public data file
•Personal income variable
▫[LINK to Odesi]
▫Note: What type of data? Would it be specific enough?
Let’s see (cont’d)
Screen 1
- What type of data?
- Would it be specific enough?
Let’s see!
• Public microdata file
• Cultural or racial origin variable
▫[Link to Odesi]
▫Note: Do these values reflect the actual question and the level of detail asked? Would they be specific enough?
Let’s see (cont’d)
Screen 2
Do these values reflect the actual question and the level of detail asked?
Would they be specific enough?
Let’s see!
•Public data file
▫Is there a correlation between cultural / racial origin AND income?
▫[LINK to example from Odesi]
Let’s see (cont’d)
Screen 3
Did you know?
•Odesi provides both the public microdata files and codebooks
Download both (data and codebook)
Download the data as a subset or full datafile
Always download the codebook
More info here, e.g., codebook [LINK] and topical index [LINK] or [email protected]
Hands-on: Download public data!
• Download a subset. Note also this how-to video [LINK]
• Download codebook & topical index
Questions?
Outline
• Doors to Data
▫ Microdata
▫ Aggregate data
• Microdata Search (hands-on: Odesi, SAS)
▫ Public microdata
▫ Confidential microdata: RDC and RTRA
• Aggregate data
▫ Canadian: CANSIM and other
▫ International: UN, OECD, World Bank, IMF, Haver
We are here
Confidential Statistics Canada Microdata
Access via the RDC and RTRA
Agenda
•Why use confidential microdata?
•Access via Research Data Centre (RDC)
•Access via Real Time Remote Access (RTRA)
Why use confidential microdata?
Need more specific data
Public microdata has limitations. It often …
• aggregates continuous data, like age and income, and
• suppresses detailed geography
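To make the first limitation concrete, here is a minimal Python sketch of how exact values can be collapsed into ranges before a file is released publicly. The records and the band edges are made up for illustration; real PUMF categories differ from survey to survey.

```python
# Hypothetical confidential-style records with exact values (made-up data).
records = [
    {"age": 23, "income": 18500},
    {"age": 37, "income": 52300},
    {"age": 41, "income": 76000},
    {"age": 68, "income": 31250},
]

# Assumed band edges, for illustration only. None marks an open-ended top band.
AGE_BANDS = [(15, 24), (25, 44), (45, 64), (65, None)]
INCOME_BANDS = [(0, 19999), (20000, 49999), (50000, 79999), (80000, None)]


def to_band(value, bands):
    """Return the label of the first band that contains value."""
    for low, high in bands:
        if high is None:
            if value >= low:
                return f"{low}+"
        elif low <= value <= high:
            return f"{low}-{high}"
    raise ValueError(f"{value} falls outside every band")


# The "public" version keeps only the grouped categories, not the exact values.
public_records = [
    {
        "age_group": to_band(r["age"], AGE_BANDS),
        "income_group": to_band(r["income"], INCOME_BANDS),
    }
    for r in records
]

for row in public_records:
    print(row)
```

Once values are banded this way, an analysis that needs exact age or income (for example, a regression on income in dollars) is no longer possible from the public file.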
Let’s see!
• Confidential synthetic file
▫Is there a correlation between cultural / racial origin AND income?
▫[Link to example from Odesi]
Explanation: click here for information about uses for this synthetic data file.
Let’s see (cont’d)
Screen 4
Why use confidential microdata?
Need panel data
• Panel data follow a panel of individuals over repeated cycles of a survey.
• Public data limitation: Public data files are NOT available for longitudinal data for reasons of confidentiality
Why use confidential microdata?
No public data exists
Public microdata covers only some surveys. For example, it doesn’t include …
• The Uniform Crime Reporting Survey
• The Canadian Cancer Registry
• The Canadian Forces Mental Health Survey
Questions?
Agenda
•Why use confidential microdata?
•Access via RDC
•Access via RTRA
We are here
What is the RDC?
• The Research Data Centre (RDC) provides researchers with access to confidential microdata.
• Access is provided in a secure university setting.
Where is the RDC and how is it used?
• The COOL RDC is on the uOttawa campus, on the 3rd floor of the Morisset Library!
• All work with the data must be done inside the RDC.
• Output can be released to researchers on request, pending vetting for disclosure risk.
Application Process & Survey Availability
To access the RDC there are 3 steps to follow:
1. Apply online on the SSHRC website
2. Complete a security screening
3. Sign a microdata research contract
A list of the surveys available in the RDC can be found here: http://www.rdc-cdr.ca/datasets-and-surveys
Want more information?
Zacharie Tsala Dimbuene
RDC Analyst
Office: Morisset Library 322
Email: [email protected]
Web site: [Link]
Agenda
•Why use confidential microdata?
•Access via RDC
•Access via RTRA
We are here
What is RTRA?
RTRA (Real Time Remote Access) allows remote access to confidential microdata output
Provides descriptive statistics
RTRA can be particularly useful during the proposal stage of a research project.
How does RTRA work?
•Submit code to Stats Can (online) indicating the statistics you want, and receive output within the hour.
•Code is generated in SAS.
•Training sessions are available for new RTRA researchers!
Availability of SAS and help
SAS is available …
in the Vanier Labs, or as a free browser version online
New to SAS? Training sessions are available.
RTRA Surveys
Confidential data available by remote access
RTRA Surveys | Availability via PUMF*? | Availability via RDC?
Aboriginal Children's Survey (ACS) | NO | YES
Canadian Cancer Registry (CCR) | NO | YES
Canadian Forces Mental Health Survey (CFMH) | NO | YES
Canadian Survey on Disability (CSD) | NO | YES
Health Services Access Survey (HSAS) | NO | YES
Homicide Survey | NO | YES
Life After Service Survey (LASS) | NO | YES
Longitudinal Survey of Immigrants to Canada (LSIC) | NO | YES
Maternity Experiences Survey (MES) | NO | YES
Post-Secondary Education Participation Survey (PEPS) | NO | YES
Postsecondary Student Information System (PSIS) | NO | NO
Registered Apprenticeship Information System (RAIS) | NO | NO
Survey on Living with Chronic Diseases in Canada (SLCDC): Arthritis | NO | YES
The National Apprenticeship Survey (NAS) | NO | NO
Uniform Crime Reporting Survey (UCR) | NO | YES
*PUMF=Public Use Microdata File
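As a quick way to read the table, the sketch below encodes its rows in Python and picks out the surveys that are available neither as a PUMF nor in the RDC, so that RTRA is the only route listed here. The variable names are illustrative; the YES/NO values are copied from the table.

```python
# (survey, available as PUMF?, available in the RDC?) copied from the table above.
RTRA_SURVEYS = [
    ("Aboriginal Children's Survey (ACS)", False, True),
    ("Canadian Cancer Registry (CCR)", False, True),
    ("Canadian Forces Mental Health Survey (CFMH)", False, True),
    ("Canadian Survey on Disability (CSD)", False, True),
    ("Health Services Access Survey (HSAS)", False, True),
    ("Homicide Survey", False, True),
    ("Life After Service Survey (LASS)", False, True),
    ("Longitudinal Survey of Immigrants to Canada (LSIC)", False, True),
    ("Maternity Experiences Survey (MES)", False, True),
    ("Post-Secondary Education Participation Survey (PEPS)", False, True),
    ("Postsecondary Student Information System (PSIS)", False, False),
    ("Registered Apprenticeship Information System (RAIS)", False, False),
    ("Survey on Living with Chronic Diseases in Canada (SLCDC): Arthritis", False, True),
    ("The National Apprenticeship Survey (NAS)", False, False),
    ("Uniform Crime Reporting Survey (UCR)", False, True),
]

# Surveys with no PUMF and no RDC access: RTRA is the only option in this table.
rtra_only = [name for name, pumf, rdc in RTRA_SURVEYS if not pumf and not rdc]

for name in rtra_only:
    print(name)
```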
How do I apply to RTRA?
•Fill out and sign an application form [Link | Info] indicating which survey(s) you would like access to and email it to me at [email protected]
You should have access within two weeks!
More information?
•Compare regular SAS code versus RTRA SAS code – CCHS 2012 example [Link]
More information?
•RTRA code [Link]
uOttawa RTRA Web site
[Link]
Questions?
Outline
• Doors to Data
▫ Microdata
▫ Aggregate data
• Microdata Search
▫ Public microdata
▫ Confidential microdata: RDC and RTRA
• Aggregate data
▫ Canada: CANSIM & other
▫ International: UN, OECD, World Bank, IMF, Haver
We are here
Aggregate Data
Canadian and International Sources
About aggregate data …
•Unit of analysis is at the geographic level, e.g., Canada, U.S., U.K., province/state …
•Often repeated over time, i.e., time-series (aggregate) data
Unemployment rate (sa %)* / Labour Force Surveys 1995-2014 – for U.S., U.K., Canada
Time series illustration
*Calculated from Labour force status=unemployed from repeated cycles of Labour Force Surveys
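The footnote describes how an aggregate series is derived from microdata. A minimal Python sketch of that step follows, using made-up respondent records rather than real Labour Force Survey data. It applies the standard definition (unemployed divided by the labour force, i.e., employed plus unemployed) and ignores survey weights and seasonal adjustment, both of which the published rates use.

```python
from collections import defaultdict

# Made-up respondent records for two survey cycles (not real LFS data).
respondents = (
    [{"year": 2012, "status": "employed"}] * 9
    + [{"year": 2012, "status": "unemployed"}] * 1
    + [{"year": 2012, "status": "not in labour force"}] * 2
    + [{"year": 2013, "status": "employed"}] * 8
    + [{"year": 2013, "status": "unemployed"}] * 2
)


def unemployment_rates(rows):
    """Aggregate individual records into one unemployment rate (%) per year."""
    counts = defaultdict(lambda: {"employed": 0, "unemployed": 0})
    for row in rows:
        # People outside the labour force are excluded from the denominator.
        if row["status"] in ("employed", "unemployed"):
            counts[row["year"]][row["status"]] += 1
    rates = {}
    for year, c in sorted(counts.items()):
        labour_force = c["employed"] + c["unemployed"]
        rates[year] = round(100 * c["unemployed"] / labour_force, 1)
    return rates


rates = unemployment_rates(respondents)
print(rates)
```

Each survey cycle contributes one value, and the values lined up by period form the time series.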
Questions?
Canadian aggregate data
CANSIM, Census / National Household Survey …
Canadian aggregate data
•CANSIM tables [Link]
•Odesi (see slide 57)
•Statistics Canada DLI data server! [Link]
•Conference Board of Canada e-Data (forecast data, metropolitan-level, confidence indices) [Link]
New database
CANSIM
• Official government data from numerous sources
• Parts of a CANSIM table:
▫ Title: Revenue, expenditure and budgetary balance - Provincial administration, education and health quarterly (dollars x 1,000,000)
▫ Table #: 380-0081
▫ Dimensions: Geography (1 item: Canada); Seasonal adjustment (Adjusted, unadjusted); Sub-sector accounts (3 items); Estimates (120 items)
▫ Time frame: Q1, 1980 – Current
▫ Vector: Each possible combination of categories and options in a table. Also called a series.
▫ Time series: A series (vector), measured over a number of years
▫ Footnotes: Data definitions
Source: Adapted from Kwantlen Polytechnic University. (2015). Statistics: CANSIM (Guide). http://libguides.kpu.ca/c.php?g=183875&p=1212158
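The idea of a vector as one combination of dimension members can be sketched in Python. The dimension sizes come from the example table above; the member names for sub-sectors and estimates are placeholders, and the count is only an upper bound, since a real table need not publish every combination.

```python
from itertools import product

# Dimension sizes follow the slide; member names marked "placeholder" are invented.
dimensions = {
    "Geography": ["Canada"],
    "Seasonal adjustment": ["Adjusted", "Unadjusted"],
    "Sub-sector accounts": [f"Sub-sector {i}" for i in range(1, 4)],  # placeholder
    "Estimates": [f"Estimate {i}" for i in range(1, 121)],  # placeholder
}

# One candidate vector (series) per combination of one member from each dimension.
vectors = list(product(*dimensions.values()))

print(len(vectors))  # 1 * 2 * 3 * 120 = 720
print(vectors[0])
```

Following one of these combinations across the table's time frame gives a time series.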
CANSIM
Instructions
1. Go to CANSIM. In the Search box, type “provincial expenditure.”
2. On the Search Results page, click on Table 326-0009.
3. Five tabs are located above the data table: Data table (selected by default), Add/Remove data (to narrow your filtering/search), Manipulate (time series), Download (to save the data) and Related information (other useful links), plus Help.
4. TWO OPTIONS: go to the Add/Remove data tab to narrow the search and time frame, OR go to the Download tab and download the entire table in Beyond 20/20 format (Beyond 20/20 is a data viewer you can install on your computer).
Source: Adapted from the Government of Canada. (2015). Canada Business Network Blog. http://www.canadabusiness.ca/eng/blog/entry/4005/
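If you download an entire table, the time frame can also be narrowed afterwards on your own computer. The Python sketch below assumes a CSV export with hypothetical column names (`period`, `geography`, `value`) and made-up values; an actual download will use its own column names and formats.

```python
import csv
import io

# Made-up sample standing in for a downloaded table (hypothetical layout).
SAMPLE_CSV = """period,geography,value
2009Q4,Canada,101.5
2010Q1,Canada,102.0
2010Q2,Canada,103.2
2011Q1,Canada,104.8
"""


def rows_in_years(csv_text, first_year, last_year):
    """Keep only rows whose period falls between first_year and last_year."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row for row in reader
        if first_year <= int(row["period"][:4]) <= last_year
    ]


subset = rows_in_years(SAMPLE_CSV, 2010, 2010)
for row in subset:
    print(row["period"], row["value"])
```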
How to get a CANSIM table?
to Fall 2015
Census tables
•Two types …
What is a Census table?
Browse:
Year | Census (Demographics & population) | National Household Survey (NHS)* (Social surveys)
2011 | Profiles [LINK]; Tabulations [LINK] | Profiles [LINK]; Tabulations (Data Tables) [LINK]
2006 | Profiles [LINK]; Tabulations [LINK] | –
How to get a Census table?
Method 1 – Odesi
Method 2 – New DLI data server [Link]
*Don’t forget the replacement voluntary survey, the NHS
Questions?
Outline
• Doors to Data
▫ Microdata
▫ Aggregate data
• Microdata Search (hands-on: Odesi, Stata, SAS)
▫ Public microdata
▫ Confidential microdata: RDC and RTRA
• Aggregate data (hands-on: extract)
▫ CANSIM & other
▫ International: UN, OECD, World Bank, Haver, IMF, FAO …
We are here
International aggregate data
UN, World Bank, OECD, IMF, Haver
International aggregate data
• United Nations Data [Link]
• World Bank, World Development Indicators [Link]
• International Monetary Fund (IMF) [Link]
• OECD.Stat [Link]
• Haver [Link]
UN Data
• Covers a very broad range of topics
• All countries
• From 1950 to present (various)
• Worth exploring
What are UN Data tables?
Topics
• Demographic
• Gender
• Energy
• Environment
• Population estimates and Projections
• Economic
• Health
• Human development
• Food and Agriculture
• Information and Communication Technology
• Labor
• Crime
World Development Indicators
• Covers many topics on, and related to, economics, including social development and the environment
• All countries
• Annual data from 1960 to present
• See also Africa Development Indicators, if researching Africa (some additional variables).
What are WDI tables?
How to get a WDI table?
World Development Indicators
• Country selection
1. (Optional) Pick a grouping first, e.g., region or income level, on the left
2. Click on COUNTRY in the middle
3. Click on the desired countries
• Series selection
1. You can do a keyword search
2. Or drill down under topics on the left
3. If you are still having problems, see [Link] for a browsable list of series names, then use the wording you find there
• Select years
▫ Use tick boxes
• When downloading
▫ You can download many series at a time, but
▫ Only one country at a time, so
▫ TIP: Within one session, each download to Excel contains all the countries you have downloaded so far in that session, so you can keep adding countries and keep only the latest Excel table.
International Monetary Fund
• Databases: International Financial Statistics, Direction of Trade (1980+), Balance of Payments, and Government Finance Statistics, among others
• Covers countries, regions and NGOs.
• Covers 1948 – Present, for major IMF database: International Financial Statistics
• Over 7,000 economic concepts
What are IMF tables?
Broad topics
• Balance of Payments
• External Trade and Exchange Rates
• Financial Indicators
• Fund Accounts
• Government and Public Sector Finance
• Indicators of Economic Activity
• International Investment Position
• International Reserves
• Labor Markets
• National Accounts
• Prices
Quick Links
International Monetary Fund
Regular portal [Link] | New portal [Link]
Covers many IMF databases* - includes more visualization features
No Google Chrome
To download, REGISTER, then sign in to your account
Note: Your account will work in both portals
To download, REGISTER, then sign in to your account
Recommend: Build your own query or Search (to left), then click on View data and Excel icon (top of screen)
Country filters
Help info [Link]
Help info [Link]
* Excludes Trade and Investment
(1) Customize
(2) Bulk download
How to get an IMF table?
OECD.Stat
• Covers many topics on, and related to, economics, including social development and the environment
• Country coverage usually restricted to member countries [Link]
• Different frequency options: annual, quarterly, etc.
• Great tabulation options …
• OECD.Stat lets you manipulate your tables:
▫ Pick and choose from among many variables and items/values
▫ Drag your variables to rows / columns
▫ Multiple countries and series in one single table
▫ Then download!
What are OECD tables?
Haver Analytics
• Customized to be ready for econometric analysis
▫ e.g., DLX Add-ins to all versions of Excel provide instant updates of your spreadsheets
• Many advanced functions built in
▫ e.g., calculate growth rates and n-period moving averages, create log scales, recession shading and aggregations, seasonal adjustment
• Comparable macroeconomic databases
• Additional data
▫ Stock index prices
▫ Ordering in-depth Asian (South and East Asian, and Chinese) databases in Spring
• Requires a DLX plug-in installation (Windows) [Link]
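For readers unfamiliar with two of the built-in calculations named above, here is a plain-Python sketch of a period-over-period growth rate and an n-period moving average. It does not use or imitate Haver's own functions; the function names and the series values are made up for illustration.

```python
def growth_rates(series):
    """Percentage change from each period to the next.

    The result is one item shorter than the input because the first
    period has no predecessor to compare with.
    """
    return [100 * (cur - prev) / prev for prev, cur in zip(series, series[1:])]


def moving_average(series, n):
    """Simple (unweighted) average over each window of n consecutive periods."""
    if n < 1 or n > len(series):
        raise ValueError("n must be between 1 and the length of the series")
    return [sum(series[i:i + n]) / n for i in range(len(series) - n + 1)]


# Made-up quarterly values.
values = [100.0, 110.0, 121.0, 108.9]

print(growth_rates(values))
print(moving_average(values, 2))
```

A longer window (larger n) smooths the series more but leaves fewer points.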
What are Haver tables?
Training
•Guide [Link]
▫Page 1: Intro
▫Pages 2-3: Excel spreadsheets from Haver
▫Page 4: Haver charts for PowerPoint
•On-site training by Haver Economists: March/April
How to get a Haver table?
International aggregate data
•Stocks and commodity prices
▫Bloomberg @ Telfer Financial Research and Learning Lab
▫Finance guides: Stocks [Link], commodities [Link]
▫Other sources of commodity prices: World Bank, Food and Agriculture Organization [Link]
Questions?
Appointments
Susan Mowers
Data Librarian
Office: Morisset Library 309B
Email: [email protected]
Sarah Roach
RTRA Research Assistant
Office Hours: By appointment
Email: [email protected]
Evaluate this workshop
•http://bit.ly/fss-eval-2014