Young Children Job Satisfaction

8/10/2019 Young Children Job Satisfaction


Slide 2

Stage One: Define the Research Problem

In this stage, the following issues are addressed:

Relationship to be analyzed

Specifying the dependent and independent variables

Method for including independent variables

Young Children and Job Satisfaction

Relationship to be analyzed

"We are interested in examining the effect of young children on the job satisfaction of men and women involved in a variety of work and family roles to see how the presence of family responsibilities affects their happiness at work. The research is comparative. It involves contrasts between men and women in different work and marital statuses at several points in time." (page 800)



Slide 3

Specifying the dependent and independent variables

The dependent variable is job satisfaction, measured on a four-category Likert scale: 1 = Very Satisfied, 2 = Moderately Satisfied, 3 = A Little Dissatisfied, and 4 = Very Dissatisfied. Because the data do not follow a normal distribution (see pages 803-804), the authors recoded the variable to a dichotomous variable where 1 = Very Satisfied and 0 = Moderately Satisfied to Very Dissatisfied. The purpose of the analysis, then, is to determine what factors contribute to a high level of job satisfaction versus some other level of job satisfaction. With a dichotomous dependent variable, logistic regression becomes the analytic technique of choice.
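The recoding described above can be sketched in a few lines of Python. This is only an illustration of the rule (1 stays 1, codes 2 through 4 become 0). The function name and the sample responses are made up for the example and are not taken from the article or from SPSS.

```python
def recode_job_satisfaction(score):
    """Collapse the four-category scale into 1 = Very Satisfied, 0 = any other level."""
    if score is None:
        return None  # keep missing responses missing
    if score not in (1, 2, 3, 4):
        raise ValueError(f"unexpected job satisfaction code: {score!r}")
    return 1 if score == 1 else 0


# Made-up responses, for illustration only
responses = [1, 2, 4, None, 3, 1]
recoded = [recode_job_satisfaction(r) for r in responses]
print(recoded)
```

Missing responses are left as missing so that they are still dropped by listwise deletion later in the analysis.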

The independent variables are grouped into two categories:

1. Individual and family characteristics (age, race, education, spouse's work status, prestige of spouse's occupation, number of children, presence of young children, general happiness, and satisfaction with family)

2. Job characteristics (income, job prestige, job authority, job autonomy, convenience (number of hours worked per week), and past work experience).

The variable presence of young children is important to answering the main question of the article.

Other variables, which could have been included as independent variables, were used to divide the sample into subgroups which were compared with each other to answer the research questions. For example, Sex and Work Status were combined to form a composite variable WORK_SEX. We will use these variables with the SPSS "Select Cases" command to produce the results for different groups.




Slide 4

Method for including independent variables

With a dichotomous dependent variable and a variety of independent variables, the statistical technique to use is logistic regression. While we could structure the analysis to do hierarchical entry of variables (individual and family characteristics and job characteristics in block 1 and the presence of young children in block 2), we will use direct entry of all variables in a single step to conform to the authors' analysis.




Slide 5

Stage 2: Develop the Analysis Plan: Sample Size Issues


Missing data analysis

Minimum sample size requirement: 15-20 cases per independent variable




Slide 6

Missing data analysis

In the missing data analysis, we are looking for a pattern or process whereby the pattern of missing data could influence the results of the statistical analysis.

The data set for this problem is used for a large number of analyses in the article. Not all variables and cases are used in each analysis, so it makes sense to conduct the missing data analysis on the cases and variables to be included in the problem in this exercise.

We will compute the logistic regression model for 1976-77 married, full-time males as presented in table 2 on page 807. (Note: this analysis does not include the independent variables SPOCCUP 'Spouses Occupation' and EVWORK 'Ever Work as Long as One Year').

First, we will exclude the cases not used in this exercise and then we will examine missing data for the variables used in this exercise.



Slide 7

Specify the Cases to Include in this Analysis



Slide 8

Enter the Selection Criterion



Slide 9

Run the MissingDataCheck Script



Slide 10

Complete the 'Check for Missing Data' Dialog Box



Slide 11

Number of Valid and Missing Cases per Variable

Two independent variables have relatively large numbers of missing cases: JCINCOME 'Job Characteristic - Income' and AUTHORIT 'Job Characteristic - Authority'.

However, all variables have valid data for 90% or more of cases, so no variables will be excluded for an excessive number of missing cases.



Slide 12

Frequency of Cases that are Missing Variables

Next, we examine the number of missing variables per case. Of the 14 variables in the analysis (13 independent variables and 1 dependent variable), one case was missing half of the variables (7) and should be excluded from the remaining analyses.
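The two screening checks used so far (valid data per variable, missing variables per case) can be sketched in Python as follows. This is a minimal illustration and not the MissingDataCheck script itself. The four-case data set is invented, the variable names SATJOB and AGE are hypothetical stand-ins, and the cutoffs (90% valid per variable, half or more of the variables missing per case) are simply the ones applied in the text above.

```python
def valid_share_by_variable(cases, variables):
    """Share of cases with a non-missing value, for each variable."""
    n = len(cases)
    return {v: sum(c.get(v) is not None for c in cases) / n for v in variables}


def missing_count_by_case(cases, variables):
    """Number of missing variables for each case, keyed by case id."""
    return {c["id"]: sum(c.get(v) is None for v in variables) for c in cases}


# Invented data; SATJOB and AGE are hypothetical variable names
variables = ["SATJOB", "JCINCOME", "AUTHORIT", "AGE"]
cases = [
    {"id": 1, "SATJOB": 1, "JCINCOME": 12, "AUTHORIT": 0, "AGE": 34},
    {"id": 2, "SATJOB": 0, "JCINCOME": None, "AUTHORIT": 1, "AGE": 51},
    {"id": 3, "SATJOB": 1, "JCINCOME": None, "AUTHORIT": None, "AGE": 28},
    {"id": 4, "SATJOB": 0, "JCINCOME": 9, "AUTHORIT": 1, "AGE": 45},
]

shares = valid_share_by_variable(cases, variables)
low_coverage = [v for v, share in shares.items() if share < 0.90]

missing = missing_count_by_case(cases, variables)
drop_ids = [cid for cid, count in missing.items() if count >= len(variables) / 2]

print(low_coverage)
print(drop_ids)
```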




Slide 14

Correlation Matrix of Valid/Missing Dichotomous Variables

The largest correlation in the matrix of valid/missing data (not shown) is 0.363. None of the correlations for missing data values are above the weak level, so we can delete missing cases without fear that we are distorting the solution.
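A matrix of this kind can be built by recoding each variable to 1 (valid) or 0 (missing) and correlating the recoded columns. The sketch below assumes Pearson correlations on those indicators and uses invented rows, so it illustrates the idea rather than reproducing the 0.363 figure. A variable with no missing values has a constant indicator, so its correlations are undefined and returned as None.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation, or None when either series is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return None
    return sxy / math.sqrt(sxx * syy)


def missingness_correlations(rows, variables):
    """Correlate valid/missing indicators (1 = valid, 0 = missing) for each pair."""
    indicators = {v: [0 if r.get(v) is None else 1 for r in rows] for v in variables}
    return {
        (a, b): pearson(indicators[a], indicators[b])
        for a, b in combinations(variables, 2)
    }


# Invented rows, for illustration only
rows = [
    {"JCINCOME": None, "AUTHORIT": None, "PRESTIGE": 40},
    {"JCINCOME": 5, "AUTHORIT": 1, "PRESTIGE": 36},
    {"JCINCOME": None, "AUTHORIT": 0, "PRESTIGE": 51},
    {"JCINCOME": 7, "AUTHORIT": 1, "PRESTIGE": 44},
    {"JCINCOME": 8, "AUTHORIT": None, "PRESTIGE": 29},
    {"JCINCOME": 9, "AUTHORIT": 0, "PRESTIGE": 62},
]
correlations = missingness_correlations(rows, ["JCINCOME", "AUTHORIT", "PRESTIGE"])
defined = [abs(r) for r in correlations.values() if r is not None]
print(max(defined))
```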




Slide 15

Minimum sample size requirement: 15-20 cases per independent variable

If we accept the SPSS default of listwise deletion of missing data, we will have 538 cases in the analysis. The ratio of cases to independent variables is 538/13, or about 41 to 1. We meet this requirement.
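The arithmetic behind this check is small enough to verify directly. The sketch below uses the 538 cases and 13 independent variables reported above; the function name is made up for the example.

```python
def cases_per_predictor(n_cases, n_predictors):
    """Ratio of usable cases to independent variables."""
    return n_cases / n_predictors


ratio = cases_per_predictor(538, 13)
meets_minimum = ratio >= 15    # lower end of the 15-20 guideline
meets_preferred = ratio >= 20  # upper end of the 15-20 guideline
print(round(ratio, 1), meets_minimum, meets_preferred)
```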




Slide 16

Stage 2: Develop the Analysis Plan: Measurement Issues:


Incorporating nonmetric data with dummy variables

Representing Curvilinear Effects with Polynomials

Representing Interaction or Moderator Effects


Incorporating Nonmetric Data with Dummy Variables

All of the nonmetric variables have been recoded into dichotomous dummy-coded variables.

Representing Curvilinear Effects with Polynomials

We do not have any evidence of curvilinear effects at this point in the analysis.

Representing Interaction or Moderator Effects

We do not have any evidence at this point in the analysis that we should add interaction or moderator variables.



Slide 17

Stage 3: Evaluate Underlying Assumptions


Nonmetric dependent variable with two groups

Metric or dummy-coded independent variables


Nonmetric dependent variable having two groups

The dependent variable 'Job satisfaction' was recoded into dichotomous categories.

Metric or dummy-coded independent variables

Marital status, race, spouse's work status, presence of young children, job authority, job autonomy, and ever worked as long as one year are all coded as dichotomous variables.

Age of respondent, highest year of school completed, prestige of spouse's occupation, number of children, general happiness, satisfaction with family, income, job prestige, hours worked (convenience), and year of the survey can be treated as metric variables.



Slide 18

Stage 4: Estimation of Logistic Regression and Assessing Overall Fit: Model Estimation


Compute logistic regression model


Compute the logistic regression

The steps to obtain a logistic regression analysis are detailed on the following screens.

If the cases to be included in this analysis were not selected in the missing data analysis, the selection needs to be completed before proceeding.



Slide 20

Specifying the Dependent Variable




Slide 22

Specify the method for entering variables




Slide 23

Specifying Options to Include in the Output




Slide 24

Specifying the New Variables to Save




Slide 25

Complete the Logistic Regression Request




Slide 26

Stage 4: Estimation of Logistic Regression and Assessing Overall Fit: Assessing Model Fit


Significance test of the model log likelihood (Change in -2LL)

Measures Analogous to R²: Cox and Snell R² and Nagelkerke R²

Hosmer-Lemeshow Goodness-of-fit

Classification matrices

Check for Numerical Problems

Presence of outliers




Slide 27

Initial statistics before independent variables are included

The Initial Log Likelihood Function (-2 Log Likelihood or -2LL) is a statistical measure like the total sum of squares in regression. If our independent variables have a relationship to the dependent variable, we will improve our ability to predict the dependent variable accurately, and the -2LL value will decrease. The initial -2LL value is 742.850 at step 0, before any variables have been added to the model.




Slide 28

Significance test of the model log likelihood

The difference between the initial -2LL and the -2LL for the model that includes the independent variables is the model chi-square value (57.153 = 742.850 - 685.697), which is tested for statistical significance. This test is analogous to the F-test for R² or change in R² in multiple regression, which tests whether or not the improvement in the model associated with the additional variables is statistically significant.

In this problem the model chi-square value of 57.153 has a reported significance of 0.000 (that is, p < 0.001), less than 0.05, so we conclude that there is a significant relationship between the dependent variable and the set of independent variables.
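The subtraction and the significance test can be checked with a short Python sketch. It assumes 13 degrees of freedom, one per independent variable entered, which the slides do not state explicitly. The helper chi2_sf is written here for the example, using the standard series for the regularized incomplete gamma function so that no statistics package is required.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variate with df degrees of freedom."""
    a, z = df / 2.0, x / 2.0
    if z <= 0:
        return 1.0
    # Series for the lower incomplete gamma: sum of z^n / (a (a+1) ... (a+n))
    term = 1.0 / a
    total = term
    n = 1
    while abs(term) > 1e-16 * abs(total):
        term *= z / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-z + a * math.log(z) - math.lgamma(a))
    return max(0.0, 1.0 - lower)


initial_neg2ll = 742.850  # step 0, no independent variables
model_neg2ll = 685.697    # after all independent variables are entered
model_chi_square = initial_neg2ll - model_neg2ll

p_value = chi2_sf(model_chi_square, df=13)  # df = 13 is an assumption
print(round(model_chi_square, 3), p_value < 0.001)
```

SPSS prints the significance rounded to three decimals, which is why a very small p-value appears as 0.000 in the output.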




Slide 29

Measures Analogous to R²

The next SPSS outputs indicate the strength of the relationship between the dependent variable and the independent variables, analogous to the R² measures in multiple regression.

The Cox and Snell R² measure operates like R², with higher values indicating greater model fit. However, this measure is limited in that it cannot reach the maximum value of 1, so Nagelkerke proposed a modification that ranges from 0 to 1. We will rely upon Nagelkerke's measure as indicating the strength of the relationship.

Based on the interpretive criteria, we would characterize this model as weak.
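Both measures can be computed from the -2LL values reported earlier and the number of cases. The sketch below uses the usual textbook definitions: Cox and Snell R² = 1 - exp(-(initial -2LL - model -2LL) / n), and Nagelkerke R² divides that by its maximum attainable value, 1 - exp(-(initial -2LL) / n). The inputs (742.850, 685.697, and 538 cases) come from this problem, but the SPSS table itself is not reproduced in the text, so the printed values are a recomputation and not a quotation of the output.

```python
import math


def cox_snell_r2(initial_neg2ll, model_neg2ll, n_cases):
    """Cox and Snell R-square from the two -2 log likelihood values."""
    return 1.0 - math.exp(-(initial_neg2ll - model_neg2ll) / n_cases)


def nagelkerke_r2(initial_neg2ll, model_neg2ll, n_cases):
    """Cox and Snell R-square rescaled by its maximum attainable value."""
    maximum = 1.0 - math.exp(-initial_neg2ll / n_cases)
    return cox_snell_r2(initial_neg2ll, model_neg2ll, n_cases) / maximum


cs = cox_snell_r2(742.850, 685.697, 538)
nk = nagelkerke_r2(742.850, 685.697, 538)
print(round(cs, 3), round(nk, 3))
```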




Slide 30

Correspondence of Actual and Predicted Values of the Dependent Variable

The final measure of model fit is the Hosmer and Lemeshow goodness-of-fit statistic, which measures the correspondence between the actual and predicted values of the dependent variable. In this case, better model fit is indicated by a smaller difference in the observed and predicted classification. A good model fit is indicated by a nonsignificant chi-square value.

The goodness-of-fit measure has a value of 5.678, which has the desirable outcome of nonsignificance.



Slide 31

The Classification Matrices

The classification matrices in logistic regression serve the same function as the classification matrices in discriminant analysis, i.e. evaluating the accuracy of the model.

To evaluate the accuracy of the model, we compute the proportional by chance accuracy rate and the maximum by chance accuracy rate, if appropriate. Since the sizes of the groups in this problem are equal to 46% and 54%, the proportional accuracy criterion is appropriate because we do not have a dominant group.

The proportional by chance accuracy rate is equal to 0.503 (0.463^2 + 0.537^2). A 25% increase over the by chance accuracy rate would equal 0.628.

Our model accuracy rate of 63.2% meets this criterion.
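The by chance criterion is simple arithmetic and can be reproduced as follows. The group proportions, the 25% improvement rule, and the 63.2% model accuracy are the figures quoted above; the function name is made up for the example.

```python
def proportional_chance_accuracy(group_shares):
    """Sum of squared group proportions."""
    return sum(share * share for share in group_shares)


chance = proportional_chance_accuracy([0.463, 0.537])
threshold = 1.25 * chance  # 25% improvement over chance
model_accuracy = 0.632
meets_criterion = model_accuracy >= threshold
print(round(chance, 3), round(threshold, 3), meets_criterion)
```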



Slide 33

Check for Numerical Problems

There are several numerical problems that can occur in logistic regression that are not detected by SPSS or other statistical packages: multicollinearity among the independent variables, zero cells for a dummy-coded independent variable because all of the subjects have the same value for the variable, and "complete separation" whereby the two groups in the dependent variable can be perfectly separated by scores on one of the independent variables.

All of these problems produce large standard errors (over 2) for the variables included in the analysis and very often produce very large B coefficients as well. If we encounter large standard errors for the predictor variables, we should examine frequency tables, one-way ANOVAs, and correlations for the variables involved to try to identify the source of the problem.

The standard errors and B coefficients are not excessively large, so there is no evidence of a numeric problem with this analysis.




Slide 34

Presence of outliers

There are two outputs to alert us to outliers that we might consider excluding from the analysis: the listing of residuals and saving Cook's distance scores to the data set.

SPSS provides a casewise list of residuals that identifies cases whose residual is above or below a certain number of standard deviation units. As in multiple regression, there are a variety of ways to compute the residual. In logistic regression, the residual is the difference between the observed probability of the dependent variable event and the predicted probability based on the model. The standardized residual is the residual divided by an estimate of its standard deviation. The deviance is calculated by taking the square root of -2 times the log of the predicted probability for the observed group and attaching a negative sign if the event did not occur for that case. Large values for deviance indicate that the model does not fit the case well. The studentized residual for a case is the change in the model deviance if the case is excluded. Discrepancies between the deviance and the studentized residual may identify unusual cases. (See the SPSS chapter on Logistic Regression Analysis for additional details.)

In the output for our problem, SPSS listed one case that may be considered an outlier, with a studentized residual greater than 2.
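The residual definitions for a single case can be written out as a short Python sketch. It assumes the outcome is coded 1 when the event occurred and 0 otherwise, and it takes the square root of p(1 - p) as the estimated standard deviation of the residual, which is the usual choice. The studentized residual is left out because it requires refitting the model without the case. The probabilities in the example calls are invented.

```python
import math


def raw_residual(observed, predicted_probability):
    """Observed outcome (1 or 0) minus the predicted probability of the event."""
    return observed - predicted_probability


def standardized_residual(observed, predicted_probability):
    """Raw residual divided by sqrt(p * (1 - p))."""
    p = predicted_probability
    return (observed - p) / math.sqrt(p * (1.0 - p))


def deviance_residual(observed, predicted_probability):
    """sqrt(-2 * log(probability of the observed group)), negative if no event."""
    p = predicted_probability
    p_observed_group = p if observed == 1 else 1.0 - p
    magnitude = math.sqrt(-2.0 * math.log(p_observed_group))
    return magnitude if observed == 1 else -magnitude


# A case where the event occurred although the model predicted only a 20% chance
print(raw_residual(1, 0.2), standardized_residual(1, 0.2), deviance_residual(1, 0.2))
```

A poorly predicted case produces large values on all three measures, which is what the casewise listing flags.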




Slide 35

Cook's Distance

SPSS has an option to compute Cook's distance as a measure of influential cases and add the score to the data editor. I am not aware of a precise formula for determining what cutoff value should be used, so we will rely on the more traditional method for interpreting Cook's distance, which is to identify cases that either have a score of 1.0 or higher or have a Cook's distance substantially different from the others. The prescribed method for detecting unusually large Cook's distance scores is to create a scatterplot of Cook's distance scores versus case id.

SPSS Sample Problem



Slide 37

Specifying the Variables for the Scatterplot




Slide 38

The Scatterplot of Cook's Distances

Horizontal gridlines were added to the scatterplot to aid interpretation. Based on the gridlines, we can identify four cases with Cook's distances above 0.175 as influential cases.

After sorting the data set by the Cook's distance variable, we identify the four cases as having id numbers: 99, 1807, 1833, and 1953. None of these cases were included on the casewise listing for large studentized residuals.

Based on these outputs, we identify five cases out of 538 that are potential outliers. Since the number of outliers represents less than 1% of the sample and none of the outliers are really extreme, I will opt to retain them in the analysis.




Slide 39

Stage 5: Interpret the Results

In this section, we address the following issues:

Identifying the statistically significant predictor variables

Direction of relationship and contribution to dependent variable




Slide 40

Identifying the statistically significant predictor variables

The table of variables in the equation identifies for us the predictor variables that have a statistically significant individual relationship to the dependent variable. Scanning the 'Sig' column, we identify four variables that have a significance level less than 0.05: GENHAPPY 'How Happy Generally', PRESTIGE 'Job Characteristic - Prestige', CONVENIE 'Job Characteristic - Convenience', and YEAR 'GSS Year for Respondent'.




Slide 41

Direction of relationship and contribution to dependent variable - 1

The sign of the B coefficients indicates whether the predictor variable increased or decreased the likelihood of belonging to the group of respondents who were very satisfied with their jobs.




Slide 42

Direction of relationship and contribution to dependent variable - 2

The coefficient signs for the variables GENHAPPY 'How Happy Generally', PRESTIGE 'Job Characteristic - Prestige', and CONVENIE 'Job Characteristic - Convenience' were all positive, indicating that a higher score on these variables enhanced the likelihood of belonging to the group that was very satisfied with their jobs. The coefficient for YEAR was negative, indicating that job satisfaction has been declining in later years of the survey.

The magnitude of change associated with each independent variable is given in the odds ratio column labeled 'Exp (B)'. This column indicates the increased or decreased odds of belonging to the group that was very satisfied with their jobs.

For each unit increment on the measure of overall happiness, the odds that a respondent was very satisfied with his or her job were multiplied by 1.76. For each unit increment in job prestige, the odds of being very satisfied were multiplied by 1.02. For each unit increment in job convenience (or hours worked), the odds were likewise multiplied by 1.02. Finally, for each increase in year, the odds of being very satisfied were multiplied by 0.65, i.e. a respondent was less likely to be very satisfied.
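The relationship between B, Exp(B), and the resulting change in probability can be illustrated with a small Python sketch. The odds ratios 1.76 and 0.65 are the ones reported above. The baseline probability of 0.50 is an invented starting point chosen only to show how an odds ratio translates into a probability, since the size of that change depends on where the respondent starts.

```python
import math


def coefficient_from_odds_ratio(odds_ratio):
    """B is the natural log of Exp(B)."""
    return math.log(odds_ratio)


def percent_change_in_odds(odds_ratio):
    """Percentage change in the odds for a one-unit increase in the predictor."""
    return (odds_ratio - 1.0) * 100.0


def shifted_probability(baseline_probability, odds_ratio):
    """Probability after multiplying the baseline odds by the odds ratio."""
    odds = baseline_probability / (1.0 - baseline_probability) * odds_ratio
    return odds / (1.0 + odds)


print(round(coefficient_from_odds_ratio(1.76), 3))  # B implied by Exp(B) = 1.76
print(round(percent_change_in_odds(0.65), 1))       # change in odds per year
print(round(shifted_probability(0.50, 1.76), 3))    # hypothetical 0.50 baseline
```

An odds ratio above 1 corresponds to a positive B and one below 1 to a negative B, which matches the signs described above.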

Important to the research question raised by the authors is the finding that CHILDLT6 'Presence of Young Children' did not have a statistically significant impact on job satisfaction.




Slide 44

Set the Starting Point for Random Number Generation




Slide 46

Specify the Cases to Include in the First Screening Sample




Specify the Value of the Selection Variable



Slide 48

Specify the Value of the Selection Variable for the Second Validation Analysis




Generalizability of the Logistic Regression Model

Only one predictor variable, CONVENIE 'Job Characteristic - Convenience', has a stable, statistically significant relationship to the dependent variable, Job Satisfaction. In addition, the accuracy that we should evaluate in assessing our model is in the 56% to 59% range rather than in the 63% to 72% range. At this accuracy rate, the model does not represent a 25% increase over the proportional by chance accuracy rate.

In sum, we do find a relationship between one of the independent variables and job satisfaction. Our findings should be regarded as tentative or exploratory rather than definitive because we would not meet the classification accuracy rate required for a usable model.

Full Model

Split=0

Split=1

Model Chi-Square

57.153, p=.0000

54.386, p
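The split-sample validation behind the Split=0 and Split=1 columns starts from a seeded random assignment of cases to two halves. The sketch below shows one way to do that in Python. The seed value, the 0.5 cut point, and the consecutive case ids are assumptions for illustration, and the result will not match the SPSS random number stream used in the slides.

```python
import random


def assign_split(case_ids, seed):
    """Assign each case to split 0 or 1 using a seeded random number generator."""
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    return {cid: (1 if rng.random() >= 0.5 else 0) for cid in case_ids}


case_ids = list(range(1, 539))           # 538 hypothetical case ids
split = assign_split(case_ids, seed=12345)  # seed value is an assumption

split_0 = [cid for cid, s in split.items() if s == 0]
split_1 = [cid for cid, s in split.items() if s == 1]
print(len(split_0), len(split_1))
```

Each half serves once as the screening sample, with the other half held out to check the accuracy of the model, so the analysis is run twice with the selection value reversed.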
