GRA 5917: Input Politics and Public Opinion Data manipulation and descriptive statistics GRA 5917 Public Opinion and Input Politics. Lecture, August 26th 2010 Lars C. Monkerud, Department of Public Governance, BI Norwegian School of Management


GRA 5917: Input Politics and Public Opinion

Data manipulation and descriptive statistics

GRA 5917 Public Opinion and Input Politics. Lecture, August 26th 2010

Lars C. Monkerud, Department of Public Governance,

BI Norwegian School of Management


A few notes on data matrices… a simple cross-section

Person     Y     X1   X2   X3   . . .   Xk

person_1   2.2   1    10   0            20
person_2   1.2   0    7    1            40
person_3   7.7   0    9    1            35
.
.
.
person_n   6.1   1    6    0            23


A few notes on data matrices… a simple cross-section

Country     Y     X1   X2   X3   . . .   Xk

country_1   2.2   1    10   0            20
country_2   1.2   0    7    1            40
country_3   7.7   0    9    1            35
.
.
.
country_n   6.1   1    6    0            23

Columns: characteristics or traits (the relationships among which we are interested in), i.e. variables

Rows: observations, records, units of analysis (persons, countries, organizations…)


A few notes on data matrices… time series for one cross-sectional unit (CS)

…repeated measures for one country on one or several variables … but organized in the wide format

Country     Y1981   Y1982   Y1983   X1_1981   X1_1982   . .   Xk_1981   Xk_1982

country_1   2.2     4.5     3.5     1         1               18        20


A few notes on data matrices… panel data

…repeated measures for several countries on one or several variables … still organized in the wide format!

Country     Y1981   Y1982   Y1983   X1_1981   X1_1982   . .   Xk_1981   Xk_1982

country_1   2.2     4.5     3.5     1         1               18        20
country_2   1.2     2.3     2.4     0         0               34        40
country_3   7.7     7.8     8.1     0         0               35        35
.
.
.
country_n   6.1     5.8     6.8     1         1               23        23


A few notes on data matrices… panel data in long format

Country     Time   Y     X1   X2   X3   . . .   Xk

country_1   1981   2.2   1    10   0            18
country_1   1982   4.5   1    10   0            20
country_1   1983   3.5   1    10   1            20
country_1   1984   4.3   1    10   1            22
country_2   1981   1.2   0    7    1            34
country_2   1982   2.3   0    8    1            40
country_2   1983   2.4   0    8    1            43
country_2   1984   2.8   0    8    1            45
country_3   1981   7.7   0    9    1            35
country_3   1982   7.8   0    9    0            35
country_3   1983   8.1   0    11   1            32
country_3   1984   8.1   0    11   0            29
.
.
.
country_n   1981   6.1   1    6    0            23
country_n   1982   5.8   1    6    0            23
country_n   1983   6.8   1    6    0            23
country_n   1984   7.2   1    6    0            23


A few notes on data matrices… panel data in long format

The Standardized World Income Inequality Database (SWIID) panel data set in SPSS (*.sav) format. For most analytical purposes the data need to be organized this way, i.e. in the long format (as are all data sets in the course's PolEc Datasets collection).
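To make the difference between the two layouts concrete, the following is a minimal sketch in plain Python of a wide-to-long restructuring. It is not an SPSS procedure; the function name `wide_to_long`, the column-naming rule (variable name followed by a four-digit year) and the use of dictionaries are assumptions made for illustration. The values are taken from the wide-format example table.

```python
# Wide-format panel: one row per country, one column per variable-year.
wide = [
    {"country": "country_1", "Y1981": 2.2, "Y1982": 4.5, "X1_1981": 1, "X1_1982": 1},
    {"country": "country_2", "Y1981": 1.2, "Y1982": 2.3, "X1_1981": 0, "X1_1982": 0},
]

def wide_to_long(rows, id_col="country"):
    """Turn one row per unit into one row per unit-year (hypothetical helper)."""
    long_rows = {}
    for row in rows:
        for name, value in row.items():
            if name == id_col:
                continue
            # Assumed naming rule: the last four characters are the year,
            # whatever precedes them (minus a trailing "_") is the variable.
            var, year = name[:-4].rstrip("_"), int(name[-4:])
            key = (row[id_col], year)
            long_rows.setdefault(key, {id_col: row[id_col], "year": year})[var] = value
    # Sort by unit, then by time, as in the long-format table.
    return [long_rows[k] for k in sorted(long_rows)]

long_data = wide_to_long(wide)
for r in long_data:
    print(r)
```

Each country now contributes one record per year, with a single column per variable, which is the layout the analyses in this course expect.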


SPSS: Basic features and procedures

Procedures for data and data set manipulation, data analysis, and much else are available from the menu bar. Importantly, we will be looking at:

1. Outputting descriptives to get to "know" the data and to check whether a data manipulation succeeded: Analyze > Descriptive Statistics > Frequencies and Analyze > Descriptive Statistics > Descriptives

2. Basic procedures to recode variables: Transform > Compute Variable

3. Procedures to aggregate data: Data > Aggregate

4. Procedures to match-merge data from two sets: Data > Sort Cases and Data > Merge Files > Add Variables
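As a rough illustration of what the Frequencies and Descriptives procedures in item 1 report, here is a minimal sketch in plain Python. The scores are invented (they are not WVS data), and the code only mirrors the statistics, not the SPSS output layout.

```python
from collections import Counter
import statistics

# Hypothetical responses on a 1-10 scale (invented for illustration).
scores = [1, 3, 5, 5, 5, 6, 7, 8, 10, 10]

# Frequencies: number and percentage of cases in each category.
freq = Counter(scores)
for value in sorted(freq):
    print(value, freq[value], f"{100 * freq[value] / len(scores):.1f}%")

# Descriptives: central tendency and spread.
mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)
sd = statistics.stdev(scores)  # sample standard deviation (n - 1 denominator)
print(mean, median, mode, round(sd, 2))
```

A frequency table is the quickest way to spot codes that should not be there (for example, missing-value codes that were not declared as missing), while the summary statistics give a first impression of location and spread.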


Exercises (Ia): Descriptive statistics and aggregation

1) In PolEc Datasets on Blackboard, open the World Values Survey (WVS) *.sav file, which contains the individual question responses. In the accompanying codebook for this file you will find a variable (e033) measuring respondents' self-positioning score on the political left-right scale.

a) To check the quality of the data for this variable, perform a simple frequency analysis and also request that the output give the mean, median, mode and standard deviation of the self-positioning variable. Do the data look reasonable (in light of the information in the codebook… and otherwise)? Describe the distribution in the sample.

b) You would like to retain an aggregate measure of left-right positioning pertaining to the specific time and country for which it was measured. Aggregate the data by extracting the mean and median score of the self-positioning variable for every country-year, and save the data in a new file named lr_cy.sav. Request some simple descriptive statistics for the new aggregated variables. Do the data look reasonable? Describe the distribution in the sample.
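The logic of aggregating to country-years (Data > Aggregate, with country and year as break variables) can be sketched in plain Python as follows. The records, the country names and the output variable names `e033_mean` and `e033_median` are invented for illustration; this is not the WVS file and not SPSS syntax.

```python
from collections import defaultdict
import statistics

# Hypothetical individual-level records: (country, year, left-right score).
# None marks a missing answer; all values are invented.
records = [
    ("Norway", 1996, 5), ("Norway", 1996, 6), ("Norway", 1996, 4),
    ("Sweden", 1996, 3), ("Sweden", 1996, 7), ("Sweden", 1996, None),
    ("Sweden", 1999, 5), ("Sweden", 1999, 5),
]

# Break variables: country and year. Collect the valid scores in each group.
groups = defaultdict(list)
for country, year, score in records:
    if score is not None:  # missing values do not enter the aggregates
        groups[(country, year)].append(score)

# One output row per country-year, holding the aggregated statistics.
aggregated = [
    {"country": c, "year": y,
     "e033_mean": statistics.mean(v),
     "e033_median": statistics.median(v),
     "n": len(v)}
    for (c, y), v in sorted(groups.items())
]
for row in aggregated:
    print(row)
```

Keeping the group size `n` alongside the aggregates makes it easy to see how many valid responses each country-year estimate rests on.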


Exercises (Ib): Descriptive statistics and variable recoding

2) For convenience the WVS *_AGGR.sav file contains certain aggregates for all individual responses within country-years. Open the said file and…

a) Request that the mean and the standard deviation of x001_mean (the aggregate within country-years of respondents’ sex) be output. How would you interpret these statistics?

b) Using the above score (i.e. x001_mean), compute a variable pmale that directly shows the proportion of males sampled in each country-year. To check whether the computation went well, request that the mean and standard deviation of the pmale variable be output. If you compare the output statistics to those found in 2a), does the new pmale variable seem reasonable?
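The arithmetic behind such a recode can be sketched as follows, under the assumption that sex is coded 1 = male and 2 = female (check the codebook; the coding is assumed here, and the country-year means below are invented). If p is the proportion of males, the mean of the codes is 1·p + 2·(1 − p) = 2 − p, so p = 2 − mean.

```python
import statistics

# Hypothetical country-year means of the sex variable, ASSUMING the coding
# 1 = male, 2 = female (an assumption; verify against the codebook).
x001_mean = [1.48, 1.52, 1.55, 1.50]

# With p the share of males: mean = 1*p + 2*(1 - p) = 2 - p, so p = 2 - mean.
pmale = [2 - m for m in x001_mean]

# A linear transformation with slope -1 shifts the mean
# and leaves the standard deviation unchanged.
print(statistics.mean(pmale), 2 - statistics.mean(x001_mean))
print(statistics.stdev(pmale), statistics.stdev(x001_mean))
```

This gives a simple check on the computed variable: its mean should equal 2 minus the mean of x001_mean, its standard deviation should be identical, and every value should lie between 0 and 1.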


Exercises (II): Aggregation and computing variables

3) The median measures in the WVS *_AGGR.sav file are simply the medians of the response category codes. For some variables (e.g. x011, "number of children") this is an appropriate estimate of the substantive median. For other (continuous scale) phenomena a more reasonable median measure can be constructed. For instance, this is done in Gable and Hix (2005; see note 6) for the country-year median of the WVS e033 "left-right self positioning" variable.

a) Using the methodology of Gable and Hix (2005), calculate the median of e033 for all combinations of countries and years in the WVS surveys. Save the estimates in a file called lr_md.sav containing country-year observations for the median estimate and the identifiers (cname and year). (Tip: Work with a trivariate individual-level file, count individuals in and out of the median category, aggregate, and keep the aggregates in the file until the final stage…)

b) Again applying the logic of Gable and Hix (2005) and using x047cs in the WVS: Estimate the local currency median household income in New Zealand in 1998. (Hint: Use Select Cases and frequency analysis… and calculate…)
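One common way to construct a median for a continuous phenomenon measured in categories is linear interpolation within the median category: median = L + ((n/2 − F) / f) · w, where L is the lower bound of the median category, F the number of cases below it, f the number of cases in it, and w its width. The sketch below implements this textbook formula in plain Python. Whether it matches note 6 of Gable and Hix (2005) in every detail is an assumption, although it does fit the tip about counting individuals in and out of the median category. The function name and the data are invented.

```python
from collections import Counter

def interpolated_median(values, width=1.0):
    """Median for grouped data, interpolating linearly within the median category.

    Each category code k is treated as the interval [k - width/2, k + width/2).
    """
    counts = Counter(values)
    n = len(values)
    below = 0  # cases in categories below the current one
    for code in sorted(counts):
        inside = counts[code]  # cases in the current category
        if below + inside >= n / 2:
            lower = code - width / 2
            return lower + (n / 2 - below) / inside * width
        below += inside
    raise ValueError("no data")

# Hypothetical left-right scores for one country-year.
scores = [4, 5, 5, 5, 5, 6, 6, 7]
print(interpolated_median(scores))
```

For these invented scores the plain category median is 5, while the interpolated estimate is pulled above 5 because more respondents sit above the median category than below it. For banded income data the same logic applies, with L and w taken from the bounds of the income band rather than from the category code.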


Match-merging data in SPSS

Sort the data by the values of the identifier(s)/key variable(s), in the same order in both files to be merged


Choose the file you want merged with the active dataset


Variables that appear on both files, such as the identifier, are not duplicated

Use the identifier to match cases: highlight it, tick the matching option, and move it into the Key Variables box

Choices depend on 1) whether you are performing a one-to-one or a one-to-many merge, and 2) which records (apart from those with matches) you want to keep
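The logic of a one-to-one match-merge, including the choice of which unmatched records to keep, can be sketched in plain Python as follows. The function `match_merge`, its `keep_unmatched` parameter and all data values are hypothetical; the sketch mimics the steps above and is not SPSS syntax.

```python
# Hypothetical country-level files; names and values are invented.
active = [{"cname": "Norway", "happiness": 3.2},
          {"cname": "Chile", "happiness": 3.0},
          {"cname": "Japan", "happiness": 2.9}]
other = [{"cname": "Chile", "democ": 9},
         {"cname": "Norway", "democ": 10},
         {"cname": "India", "democ": 9}]

def match_merge(left, right, key, keep_unmatched=True):
    """One-to-one merge on `key`; the key column is not duplicated."""
    # Sorting mirrors the SPSS requirement; the dictionary lookup
    # below does not itself depend on the order.
    left = sorted(left, key=lambda r: r[key])
    right = sorted(right, key=lambda r: r[key])
    lookup = {r[key]: r for r in right}
    merged = []
    for row in left:
        match = lookup.pop(row[key], None)
        if match is not None:
            merged.append({**row, **match})
        elif keep_unmatched:
            merged.append(dict(row))  # variables from `right` stay missing
    if keep_unmatched:
        merged.extend(dict(r) for r in lookup.values())
    return sorted(merged, key=lambda r: r[key])

full = match_merge(active, other, "cname")
matched_only = match_merge(active, other, "cname", keep_unmatched=False)
print(full)
print(matched_only)
```

Keeping unmatched records yields four countries, two of them with a missing value on one variable; keeping only matches yields the two countries present in both files. In a one-to-many merge the lookup file would instead contribute its values to every record sharing the key.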


Exercises (III): Match-merging datasets

a) Match-merge the two cross-sectional datasets containing democracy and happiness scores (wvs_c.sav and PolityIV_data_c.sav) and perform a simple correlation analysis of the democracy-happiness relationship (Analyze > Correlate > Bivariate). Describe the relationship between democracy and happiness as measured in this analysis.

b) Match-merge instead the two panel datasets (PolityIV_data.sav and wvs_cy.sav), matching observations on cname and year.
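For panel data the key is the combination of cname and year, and the merged data can then be fed into a bivariate (Pearson) correlation. The sketch below shows both steps in plain Python. All values are invented, the dictionary layout is an assumption made for brevity, and `pearson_r` is a hypothetical helper rather than an SPSS function.

```python
import math

# Hypothetical country-year panels keyed on (cname, year); values are invented.
polity = {("Chile", 1990): 8, ("Chile", 1996): 8,
          ("Norway", 1990): 10, ("Norway", 1996): 10,
          ("Turkey", 1990): 9, ("Turkey", 1996): 8}
wvs = {("Chile", 1990): 2.9, ("Chile", 1996): 3.0,
       ("Norway", 1990): 3.2, ("Norway", 1996): 3.3,
       ("Turkey", 1996): 2.8}

# Keep only country-years present in both files (matched on cname AND year).
pairs = [(polity[k], wvs[k]) for k in sorted(polity.keys() & wvs.keys())]

def pearson_r(xy):
    """Pearson correlation coefficient for a list of (x, y) pairs."""
    n = len(xy)
    mx = sum(x for x, _ in xy) / n
    my = sum(y for _, y in xy) / n
    sxy = sum((x - mx) * (y - my) for x, y in xy)
    sxx = sum((x - mx) ** 2 for x, _ in xy)
    syy = sum((y - my) ** 2 for _, y in xy)
    return sxy / math.sqrt(sxx * syy)

r = pearson_r(pairs)
print(len(pairs), round(r, 3))
```

Matching on cname alone would wrongly pair observations from different years; using both identifiers ensures that each democracy score is paired with the happiness score from the same country and year.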