praneeth-reddy
Descriptive Statistics & Counting Responses and Graphs
Part One: Descriptive Statistics
The dataset has 747 observations on 19 variables. Part One of the assignment focuses on three variables:
1) Age of respondents
2) Total family income in quartiles
3) RS highest degree
Statistics
             Age of Respondent   Total Family Income in quartiles   RS Highest Degree
N  Valid     744                 747                                747
   Missing   3                   0                                  0
Descriptive Statistics
                                   N     Mean    Std. Deviation   Variance
Age of Respondent                  744   40.41   11.158           124.511
RS Highest Degree                  747   1.79    1.200            1.440
Total Family Income in quartiles   747   2.57    1.144            1.309
Valid N (listwise)                 744
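Because the standard deviation is the square root of the variance, the two columns of the table above can be cross-checked against each other. The following is a minimal sketch of that check; the helper name std_from_variance is illustrative only, and the figures are copied from the table.

```python
import math

# (variable, reported std. deviation, reported variance), copied from the table above
reported = [
    ("Age of Respondent", 11.158, 124.511),
    ("RS Highest Degree", 1.200, 1.440),
    ("Total Family Income in quartiles", 1.144, 1.309),
]


def std_from_variance(variance):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(variance)


for name, std, variance in reported:
    recomputed = std_from_variance(variance)
    # Each recomputed value should agree with the reported one to three decimals.
    print(f"{name}: reported {std:.3f}, sqrt(variance) = {recomputed:.3f}")
```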
Let's look at the data and start with descriptive statistics.
Valid N (listwise) - This is the number of cases that have non-missing values on all of the variables in the table.
N - This is the number of valid observations for the variable. The total number of observations is the sum of N and the number of missing values.
Mean - This is the arithmetic mean across the observations. It is the most widely used measure of central tendency and is commonly called the average. The mean is sensitive to extremely large or small values.
Std. Deviation - The standard deviation is the square root of the variance. It measures the spread of a set of observations. The larger the standard deviation, the more spread out the observations are.
Variance - The variance is a measure of variability. It is the sum of the squared distances of the data values from the mean (the corrected sum of squares) divided by the variance divisor, which is N - 1 for a sample.
In the above sample, the means of Age of Respondent, RS Highest Degree and Total Family Income in quartiles are 40.41, 1.79 and 2.57 respectively. The standard deviations of the three variables are 11.158, 1.200 and 1.144, and the variances are 124.511, 1.440 and 1.309 respectively.
The frequency distribution for each variable is displayed below:
Age of Respondent
                 Frequency   Percent   Valid Percent   Cumulative Percent
Valid    19      1           0.1       0.1             0.1
         20      5           0.7       0.7             0.8
         21      4           0.5       0.5             1.3
         22      3           0.4       0.4             1.7
         23      14          1.9       1.9             3.6
         24      15          2.0       2.0             5.6
         25      20          2.7       2.7             8.3
         26      16          2.1       2.2             10.5
         27      14          1.9       1.9             12.4
         28      25          3.3       3.4             15.7
         29      20          2.7       2.7             18.4
         30      20          2.7       2.7             21.1
         31      23          3.1       3.1             24.2
         32      16          2.1       2.2             26.3
         33      22          2.9       3.0             29.3
         34      18          2.4       2.4             31.7
         35      32          4.3       4.3             36.0
         36      28          3.7       3.8             39.8
         37      24          3.2       3.2             43.0
         38      32          4.3       4.3             47.3
         39      28          3.7       3.8             51.1
         40      23          3.1       3.1             54.2
         41      23          3.1       3.1             57.3
         42      20          2.7       2.7             59.9
         43      27          3.6       3.6             63.6
         44      17          2.3       2.3             65.9
         45      23          3.1       3.1             69.0
         46      23          3.1       3.1             72.0
         47      25          3.3       3.4             75.4
         48      10          1.3       1.3             76.7
         49      19          2.5       2.6             79.3
         50      15          2.0       2.0             81.3
         51      21          2.8       2.8             84.1
         52      10          1.3       1.3             85.5
         53      9           1.2       1.2             86.7
         54      7           0.9       0.9             87.6
         55      15          2.0       2.0             89.7
         56      7           0.9       0.9             90.6
         57      12          1.6       1.6             92.2
         58      13          1.7       1.7             94.0
         59      5           0.7       0.7             94.6
         60      7           0.9       0.9             95.6
         61      5           0.7       0.7             96.2
         62      6           0.8       0.8             97.0
         63      6           0.8       0.8             97.8
         64      1           0.1       0.1             98.0
         65      3           0.4       0.4             98.4
         66      1           0.1       0.1             98.5
         67      1           0.1       0.1             98.7
         68      1           0.1       0.1             98.8
         72      2           0.3       0.3             99.1
         73      1           0.1       0.1             99.2
         74      1           0.1       0.1             99.3
         75      1           0.1       0.1             99.5
         77      2           0.3       0.3             99.7
         78      1           0.1       0.1             99.9
         82      1           0.1       0.1             100.0
         Total   744         99.6      100.0
Missing  NA      3           0.4
Total            747         100.0
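The mean, variance and standard deviation of age can be recomputed from the frequency table alone, which is a useful check on the summary figures. The sketch below assumes the frequencies are transcribed correctly from the table above and uses the N - 1 divisor for the variance; the variable names are illustrative.

```python
import math

# Age -> frequency for the 744 valid cases, transcribed from the table above.
age_freq = {
    19: 1, 20: 5, 21: 4, 22: 3, 23: 14, 24: 15, 25: 20, 26: 16, 27: 14,
    28: 25, 29: 20, 30: 20, 31: 23, 32: 16, 33: 22, 34: 18, 35: 32, 36: 28,
    37: 24, 38: 32, 39: 28, 40: 23, 41: 23, 42: 20, 43: 27, 44: 17, 45: 23,
    46: 23, 47: 25, 48: 10, 49: 19, 50: 15, 51: 21, 52: 10, 53: 9, 54: 7,
    55: 15, 56: 7, 57: 12, 58: 13, 59: 5, 60: 7, 61: 5, 62: 6, 63: 6,
    64: 1, 65: 3, 66: 1, 67: 1, 68: 1, 72: 2, 73: 1, 74: 1, 75: 1,
    77: 2, 78: 1, 82: 1,
}

n = sum(age_freq.values())  # number of valid cases

# Weighted mean: each age counts once per respondent of that age.
mean = sum(age * f for age, f in age_freq.items()) / n

# Corrected sum of squares, then the sample variance with the N - 1 divisor.
corrected_ss = sum(f * (age - mean) ** 2 for age, f in age_freq.items())
variance = corrected_ss / (n - 1)
std_dev = math.sqrt(variance)

print(f"N = {n}, mean = {mean:.2f}, std. deviation = {std_dev:.3f}, variance = {variance:.3f}")
```

The three missing cases are left out, which is why N is 744 rather than 747.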
Total Family Income in quartiles
                            Frequency   Percent   Valid Percent   Cumulative Percent
Valid   24,999 or less      174         23.3      23.3            23.3
        25,000 to 39,999    194         26.0      26.0            49.3
        40,000 to 59,999    156         20.9      20.9            70.1
        60,000 or more      223         29.9      29.9            100.0
        Total               747         100.0     100.0
RS Highest Degree
                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Less than HS      52          7.0       7.0             7.0
        High school       390         52.2      52.2            59.2
        Junior college    54          7.2       7.2             66.4
        Bachelor          165         22.1      22.1            88.5
        Graduate          86          11.5      11.5            100.0
        Total             747         100.0     100.0
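The means reported for the two categorical variables are averages of numeric category codes rather than of incomes or years of schooling. The sketch below assumes codes 0 to 4 for RS Highest Degree and 1 to 4 for the income quartiles; these codes are not shown in the output and are an assumption, chosen because they reproduce the reported means of 1.79 and 2.57. The helper name coded_mean is illustrative.

```python
# Frequencies copied from the two tables above, in category order.
degree_counts = [52, 390, 54, 165, 86]   # Less than HS ... Graduate
income_counts = [174, 194, 156, 223]     # 24,999 or less ... 60,000 or more


def coded_mean(counts, first_code):
    """Mean of consecutive integer codes weighted by category frequency.

    first_code is the (assumed) numeric code of the first category.
    """
    total = sum(counts)
    weighted = sum((first_code + i) * c for i, c in enumerate(counts))
    return weighted / total


degree_mean = coded_mean(degree_counts, first_code=0)  # assumed codes 0..4
income_mean = coded_mean(income_counts, first_code=1)  # assumed codes 1..4

print(f"RS Highest Degree mean code: {degree_mean:.2f}")
print(f"Income quartile mean code:   {income_mean:.2f}")
```

Since both variables are ordinal, such a mean only indicates where the distribution is centred on the code scale; the frequency tables remain the more informative summary.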
BAR CHART: A bar chart is helpful for describing (visualizing) your data graphically. It is often used in addition to inferential statistics.
From the bar chart of age, we can observe that respondents aged 35 and 38 are the most numerous, each age accounting for 4.3% of respondents, and that respondents aged 59 to 82 are few in number compared to the other ages.
From the bar chart of highest degree, we can observe that the largest share of respondents, 52.2%, have a high school degree, and that the smallest shares are for less than high school and junior college, at 7.0% and 7.2% respectively.
From the chart of family income, it can be observed that 29.9%, i.e. 223 families, have an income of 60,000 or more, and that 20.9%, i.e. 156 families, have an income between 40,000 and 59,999.
Part Two: Counting Responses and Graphs
1. Create Frequency table of importance of high income (jobinc). What are the percents of "Very important" and "Fourth"?
Statistics
Importance of HIGH INCOME
N  Valid     497
   Missing   250
Importance of HIGH INCOME
                     Frequency   Percent   Valid Percent   Cumulative Percent
Valid    MOST IMPT   101         13.5      20.3            20.3
         SECOND      145         19.4      29.2            49.5
         THIRD       151         20.2      30.4            79.9
         FOURTH      81          10.8      16.3            96.2
         FIFTH       19          2.5       3.8             100.0
         Total       497         66.5      100.0
Missing  NAP         237         31.7
         NA          13          1.7
         Total       250         33.5
Total                747         100.0
From this table it can be seen that 13.5% of all respondents, i.e. 101 people, think high income is most important (20.3% of the valid responses), and that 10.8%, i.e. 81 people, rank it fourth (16.3% of the valid responses).
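The difference between the Percent and Valid Percent columns matters here because a third of the cases are missing: Percent divides by all 747 cases, while Valid Percent and Cumulative Percent divide by the 497 valid responses. The sketch below recomputes the three columns from the counts in the jobinc table; the function name frequency_rows is illustrative.

```python
# Counts copied from the "Importance of HIGH INCOME" table, in category order.
valid_counts = {"MOST IMPT": 101, "SECOND": 145, "THIRD": 151, "FOURTH": 81, "FIFTH": 19}
missing_counts = {"NAP": 237, "NA": 13}

n_valid = sum(valid_counts.values())
n_total = n_valid + sum(missing_counts.values())


def frequency_rows(counts, n_valid, n_total):
    """Return (label, count, percent, valid percent, cumulative percent) rows."""
    rows = []
    running = 0
    for label, count in counts.items():
        running += count
        rows.append((
            label,
            count,
            round(100 * count / n_total, 1),    # Percent: share of all cases
            round(100 * count / n_valid, 1),    # Valid Percent: share of valid cases
            round(100 * running / n_valid, 1),  # Cumulative Percent over valid cases
        ))
    return rows


rows = frequency_rows(valid_counts, n_valid, n_total)
for label, count, pct, valid_pct, cum_pct in rows:
    print(f"{label:<10} {count:>4} {pct:>6} {valid_pct:>6} {cum_pct:>6}")
```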
2. Create Frequency table of level of education (RS Highest Degree), then create Pie chart and Bar chart. What level has the highest percent?
Statistics
RS Highest Degree
N  Valid     747
   Missing   0
RS Highest Degree
                          Frequency   Percent   Valid Percent   Cumulative Percent
Valid   Less than HS      52          7.0       7.0             7.0
        High school       390         52.2      52.2            59.2
        Junior college    54          7.2       7.2             66.4
        Bachelor          165         22.1      22.1            88.5
        Graduate          86          11.5      11.5            100.0
        Total             747         100.0     100.0
Both the pie chart and the bar chart show that high school is the level with the highest percent, at 52.2%.
3. Create a Histogram of ages of respondents. What is the mean? And Standard Deviation? (See answers from the histogram)
From the histogram it is observed that the mean and the standard deviation are 40.41 and 11.158 respectively.