7/29/2019 Report Draft (4)
1/23
Table of Contents
Group members and list of work
Introduction
Collecting data
Presenting and Summarizing data
  1. Subject
  2. Number of absent classes per month
  3. Average number of hours spent for self-study
  4. GPA
Regression model
  1. Descriptive statistics
  2. Regression model
Conclusion
Appendix
Group members and list of work
Name | Student ID | Tasks | % equivalent
Pham Ngoc Anh | | Collecting data; Preparing slides | 13
Dang Thi Hien | | Writing "Introduction"; Designing survey | 14
Hoang Thanh Ha | | Collecting data; Presenting | 13
Phan Duy Hung | 1001030153 | Processing and Analyzing data; Writing "Regression model" and "Average number of hours spent for self-study" | 25
Bui Kieu Dieu Linh | 1001010500 | Processing and Analyzing data; Writing "Presenting and Summarizing data" | 25
Kim Le Ha Thanh | | Designing survey; Writing "Conclusion" | 10
It should be noted that all members of the group were serious, enthusiastic and hard-working. Everyone completed the assigned tasks well and before the deadline. Each member could have contributed equally (that is, all 6 people sharing every task), but for the sake of the report's quality and to save time, we delegated the tasks according to our strengths and weaknesses. The percentages here are therefore only a relative measure; in terms of working attitude and efficiency, all members can be regarded as equivalent.
Introduction
Education is one of the most fundamental factors in an individual's success in life. It is also one of the best investments a person can make, because well-educated people have more opportunities to get promising jobs in the future. However, not many people realize its importance, and many waste their time on less meaningful things. More specifically, a variety of distractions tempt students, such as video games and social networking sites. Within the scope of this assignment, we focus only on some main factors, such as being absent from class because of the weather, registering for many courses and then skipping class several times, and spending very little time on studying, which directly affect students of Foreign Trade University (FTU), especially members of the High Quality Classes of Finance and Banking. Focusing on this group keeps the data closer to, and more accurate for, our findings.
This topic may be a popular one for research at FTU; however, no research has yet been carried out on students of the High Quality Programs in the Faculty of Finance and Banking. We therefore decided to focus on this group and chose the topic: Investigation of absent time of students and their average total marks, in order to find out how important it is for students to study hard, try their best to get good results and keep a positive attitude towards learning. After seeing the final output, a student may look back on his or her study history and change habits for the better at university. The purpose of this assignment is to illustrate the importance of education, particularly of the time spent at university.
In our research, we use five methods of business statistics, namely collecting, presenting, summarizing and analyzing data and forecasting, in order to make our study as effective and informative as possible.
Collecting data
In order to study the diligence of students in the High Quality Class of the Faculty of Finance and Banking at Foreign Trade University (abbreviated CLCTCNH), we collected data both directly and indirectly.
For the direct method, our group of 6 people randomly chose some classes and asked several students on the spot to fill in the survey. For the indirect method, we created an online version and distributed it through email and social networks such as Facebook.
CLCTCNH has about 600 students in total, and we aimed to survey at least 10% of them (about 60 students).
Here is the survey we used to collect data:
Investigation on students' studying habit
1. According to credit scale, how many subjects are you studying? *
   < 5 / 6 - 8 / > 8
2. How many classes per week do you have? *
3. How much time do you spend on self-studying at home? *
4. How many classes per month are you absent from? *
5. Are you interested in studying in the class? *
   Yes / No / Only in some certain subjects
6. Which thing(s) most make you want to skip a class? *
   Bad weather / Classes without checking attendance / Studying without understanding the lesson / Busy with other activities / Getting up late / Other:
7. Are you happy with your studying now? *
   No / Yes / Some subjects only
8. Your average GPA (for the scale of 4) at the moment *
9. Which year are you in?
   K47 / K48 / K49 / K50
The completed survey responses are shown later in the appendix.
Presenting and Summarizing data
1. Subject
After one week of collecting data, we received 75 completed surveys, more than we expected. A quick glance at the pie chart below shows that, of the four cohorts, 2nd-year and 3rd-year students were much more willing to answer the survey than 1st-year and 4th-year students. This may be because final-year students are so busy with their work and graduation essays that they did not have time to respond. As for the 1st-year students, we guess that since they have not yet taken courses such as Economics or Business Statistics, they had less motivation to respond.
[Pie chart: share of respondents by cohort (K47, K48, K49, K50). Slice labels: 11%, 38%, 47%, 4%.]
2. Number of absent classes per month
Table of distribution
xi | fi | Cumulative frequency | Relative frequency (%)
0 | 14 | 14 | 18.7%
1 | 12 | 26 | 16.0%
2 | 15 | 41 | 20.0%
3 | 4 | 45 | 5.3%
4 | 11 | 56 | 14.7%
5 | 9 | 65 | 12.0%
7 | 3 | 68 | 4.0%
8 | 2 | 70 | 2.7%
10 | 2 | 72 | 2.7%
12 | 1 | 73 | 1.3%
16 | 1 | 74 | 1.3%
25 | 1 | 75 | 1.3%
From the above data, we calculated:
The range: R = largest value - smallest value = 25 - 0 = 25
This range tells us that the difference between the largest and the smallest value of the distribution is 25.
Although the number of absent classes per month ranges from 0 to 25, there are only 12 distinct values, and most students miss fewer than 10 classes (70 of the 75 students, about 93%).
The arithmetic mean:
Mean = 253/75 = 3.37
This number tells us that the average number of absent classes per month is 3.37; put more sensibly, a CLCTCNH student misses 3 to 4 classes per month on average.
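As a quick check, the range and the mean can be recomputed from the frequency table as a weighted average. The sketch below is illustrative only and the variable names are ours. It takes the value for the row with frequency 3 to be 7, which is what the appendix lists (three students with 7 absences) and what reproduces the reported total of 253.

```python
# Frequency table: number of absent classes per month -> number of students.
# Assumption: the row with frequency 3 is taken as 7 absences, as in the appendix.
absences = {0: 14, 1: 12, 2: 15, 3: 4, 4: 11, 5: 9,
            7: 3, 8: 2, 10: 2, 12: 1, 16: 1, 25: 1}

n = sum(absences.values())                       # total number of respondents
total = sum(x * f for x, f in absences.items())  # sum of x_i * f_i
mean = total / n                                 # weighted arithmetic mean
value_range = max(absences) - min(absences)      # largest value - smallest value

print(n, total, round(mean, 2), value_range)
```

Working from the frequency table rather than the 75 raw answers keeps the calculation short and mirrors the way the table is laid out.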
The mean deviation
$d = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$
Mean deviation = 2.568
The average absolute difference between the number of absent classes and the mean is 2.568.
Mode
The mode of the data set is the value with the largest frequency. From the table above, the value that appears most often is 2, with a frequency of 15. The graph below also illustrates this.
[Bar chart: frequency (fi) of each number of absent classes per month]
In other words, among the surveyed students, the most common choice is to miss 2 classes per month.
However, it is also worth noting that 18.7% of students are never or hardly ever absent from any class, which is only 1.3 percentage points below the share at the mode of 2. This suggests that Finance and Banking students are quite hard-working.
Median
Based on the table of distribution, in which the total frequency is 75, the middle item is the 38th item, which corresponds to the value 2.
Median = 2
The variance
Variance = 63.403136
The average of the squared deviations between each number of monthly absent classes and the mean is 63.40.
The standard deviation
Standard deviation = 2.821890458
The coefficient of variation
Coefficient of variation = 83.65 %
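The coefficient of variation expresses the standard deviation as a percentage of the mean. Assuming this usual definition was used (the symbols $s$ and $\bar{x}$ are our notation), the reported figure follows from the standard deviation and the mean given above:

```latex
CV = \frac{s}{\bar{x}} \times 100\%
   = \frac{2.821890458}{253/75} \times 100\%
   \approx 83.65\%
```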
3. Average number of hours spent for self-study
Table of distribution
xi | fi | Cumulative frequency | Relative frequency (%)
0 | 6 | 6 | 8 %
0.25 | 1 | 7 | 1.3 %
0.5 | 11 | 18 | 14.7 %
1 | 15 | 33 | 20 %
1.5 | 1 | 34 | 1.3 %
2 | 24 | 58 | 32 %
3 | 6 | 64 | 8 %
4 | 6 | 70 | 8 %
5 | 4 | 74 | 5.3 %
10 | 1 | 75 | 1.3 %
The following dot plot shows the frequency of each value of time students spend on self-studying.
From the above data, we calculated:
The range: R = largest value - smallest value = 10 - 0 = 10
This range tells us that the difference between the largest and the smallest value of the distribution is 10.
Although the number of hours spent on self-studying ranges from 0 to 10, there are only 10 distinct values, and almost all students study fewer than 10 hours (74 of the 75 students, about 99%).
The arithmetic mean:
Mean = 142.25/75 = 1.897
This number tells us that the average time spent on self-studying is 1.897 hours per day; put more sensibly, a CLCTCNH student spends about 2 hours a day on average on home assignments and revision.
[Dot plot of hours spent on self-studying; horizontal axis from 0 to 10]
The mean deviation
$d = \frac{\sum_{i=1}^{n} |x_i - \bar{x}|}{n}$
Mean deviation = 1.126
The average absolute difference between the number of hours spent on self-studying and the mean is 1.126.
Median
The median of the data set is the value of the middle item when the data items are arranged in ascending order.
Based on the table of distribution, in which the total frequency is 75, the middle item is the 38th item, which corresponds to the value 2.
Median = 2
Mode
The mode of the data set is the value with the largest frequency. From the table above, the value that appears most often is 2, with a frequency of 24. The graph below also illustrates this.
In other words, among the surveyed students, the most common amount of self-study is 2 hours per day.
Mode = 2
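The median and mode for self-study hours can both be read off the frequency table mechanically. The following is a minimal sketch of that lookup; the function names are hypothetical, and the median helper assumes an odd total frequency, as is the case here with 75 respondents.

```python
# Frequency table: daily self-study hours -> number of students.
hours = {0: 6, 0.25: 1, 0.5: 11, 1: 15, 1.5: 1, 2: 24, 3: 6, 4: 6, 5: 4, 10: 1}

def median_from_table(table):
    """Return the value of the middle item, assuming an odd total frequency."""
    n = sum(table.values())
    middle = (n + 1) // 2              # the 38th item when n = 75
    cumulative = 0
    for value in sorted(table):
        cumulative += table[value]     # running cumulative frequency
        if cumulative >= middle:
            return value

def mode_from_table(table):
    """Return the value with the largest frequency."""
    return max(table, key=table.get)

print(median_from_table(hours), mode_from_table(hours))
```

The loop stops at the first value whose cumulative frequency reaches 38, which is the same reasoning used with the cumulative frequency column of the table.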
As we can see from the graph and the frequency distribution table, only 8% of students never spend time studying at home, which indicates that a large number of CLCTCNH students are still aware of their task.
The variance
Variance = 2.617
The average of the squared deviations between each number of self-study hours and the mean is 2.617.
[Bar chart: frequency (fi) of each number of self-study hours]
The standard deviation
Standard deviation = 1.618
The coefficient of variation
Coefficient of variation = 85.293 %
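The dispersion measures for self-study hours can be recomputed from the frequency table. The sketch below assumes the population formulas (dividing by n = 75), which reproduce the reported mean deviation, variance and standard deviation to three decimals; the variable names are ours. Computed from unrounded values, the coefficient of variation comes out at about 85.29%, in line with the reported 85.293% up to rounding.

```python
import math

# Frequency table: daily self-study hours -> number of students.
hours = {0: 6, 0.25: 1, 0.5: 11, 1: 15, 1.5: 1, 2: 24, 3: 6, 4: 6, 5: 4, 10: 1}

n = sum(hours.values())
mean = sum(x * f for x, f in hours.items()) / n

# Mean (absolute) deviation: average distance from the mean.
mean_dev = sum(f * abs(x - mean) for x, f in hours.items()) / n

# Population variance: average squared distance from the mean (divide by n).
variance = sum(f * (x - mean) ** 2 for x, f in hours.items()) / n

std_dev = math.sqrt(variance)
cv = std_dev / mean * 100        # coefficient of variation, in percent

print(round(mean, 3), round(mean_dev, 3), round(variance, 3), round(std_dev, 3))
```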
4. GPA
Table of distribution
xi fi xi fi xi fi
2.67 1 3.19 1 3.42 2
2.8 2 3.2 10 3.47 1
2.83 1 3.22 3 3.48 1
2.88 1 3.23 2 3.49 1
2.9 2 3.24 1 3.5 1
2.91 1 3.25 1 3.54 2
2.93 1 3.27 1 3.62 1
2.97 1 3.28 1 3.64 1
3 3 3.3 4 3.65 1
3.03 1 3.31 2 3.67 2
3.04 2 3.34 1 3.76 1
3.1 2 3.35 2 3.79 1
3.13 1 3.38 2 3.8 1
3.15 1 3.4 3 3.9 2
3.17 1 3.41 1 4 1
There are 45 distinct values in this data set, so we created a grouped frequency distribution table to simplify it and make it easier for readers to follow.
GPA | f | Relative frequency (%) | Class midpoint (x)
2.67 up to 2.75 | 1 | 1.33% | 2.71
2.76 up to 3.00 | 12 | 16.00% | 2.88
3.01 up to 3.25 | 26 | 34.67% | 3.13
3.26 up to 3.50 | 23 | 30.67% | 3.38
3.51 up to 3.75 | 7 | 9.33% | 3.63
3.76 up to 4.00 | 6 | 8.00% | 3.88
Total | 75 | 100.00% |
These can be illustrated by the following ogive:
[Ogive of GPA: cumulative relative frequency by class. 2.67 up to 2.75: 1.33%; 2.76 up to 3.00: 17.33%; 3.01 up to 3.25: 52.00%; 3.26 up to 3.50: 82.67%; 3.51 up to 3.75: 92.00%; 3.76 up to 4.00: 100.00%]
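The points plotted on the ogive are the running totals of the class frequencies, expressed as a percentage of the 75 respondents. A minimal sketch of that calculation, with variable names of our own choosing:

```python
from itertools import accumulate

classes = ["2.67 up to 2.75", "2.76 up to 3.00", "3.01 up to 3.25",
           "3.26 up to 3.50", "3.51 up to 3.75", "3.76 up to 4.00"]
freqs = [1, 12, 26, 23, 7, 6]      # class frequencies from the grouped table

n = sum(freqs)
# Running total of frequencies, converted to a percentage of all respondents.
cumulative_pct = [round(c / n * 100, 2) for c in accumulate(freqs)]

for label, pct in zip(classes, cumulative_pct):
    print(f"{label}: {pct:.2f}%")
```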
From the above statistics, we can calculate:
The range: R = largest value - smallest value = 4 - 2.67 = 1.33
Class width: C = lower limit of class N+1 - lower limit of class N = 3.01 - 2.76 = 0.25
With the available data, we grouped the values into 6 classes, with a class width of 0.25 and a total range of 1.33. It should be noted that GPA here is calculated on a scale of 4, not the usual scale of 10, so 4 is the highest possible mark.
The arithmetic mean
Mean = 245.08/75 = 3.27
From this value, we can say that the average GPA of a CLCTCNH student is 3.27.
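For grouped data, the total of 245.08 comes from weighting each class midpoint by its frequency. Written out with the values from the grouped table (the notation $f_j$, $x_j$ for class frequency and midpoint is ours):

```latex
\bar{x} = \frac{\sum_j f_j x_j}{n}
        = \frac{1(2.71) + 12(2.88) + 26(3.13) + 23(3.38) + 7(3.63) + 6(3.88)}{75}
        = \frac{245.08}{75} \approx 3.27
```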
The mean deviation
md = 0.24
So the average absolute distance between each GPA and the average GPA is 0.24.
The mode
Mode = 3.22
So the GPA that occurs most often among CLCTCNH students is 3.22.
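Assuming the standard interpolation formula for the mode of grouped data was used here, the value 3.22 can be reproduced from the grouped table: the modal class is 3.01 up to 3.25 (f = 26), its neighbouring classes have frequencies 12 and 23, and the class width is 0.25. The function name and parameters below are illustrative, not taken from the report.

```python
def grouped_mode(lower_limit, f_modal, f_prev, f_next, width):
    """Mode of grouped data: L + d1 / (d1 + d2) * class width."""
    d1 = f_modal - f_prev      # modal frequency minus the preceding class frequency
    d2 = f_modal - f_next      # modal frequency minus the following class frequency
    return lower_limit + d1 / (d1 + d2) * width

mode = grouped_mode(lower_limit=3.01, f_modal=26, f_prev=12, f_next=23, width=0.25)
print(round(mode, 2))
```

Because the class before the modal class is much smaller than the class after it, the estimate is pulled towards the upper end of the modal class.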
The median
Median = 3.16
This number tells us that the middle value of GPA, when the values are arranged in order of size, is 3.16.
The variance
Variance = 0.082
The standard deviation
Standard deviation = 0.287
The coefficient of variation
Coefficient of variation = 8.77%
[Histogram of GPA: frequency by class, from 2.67 up to 2.75 through 3.76 up to 4.00]
Regression model
Analyzing the relationship between GPA, number of absent classes and number of self-studying hours
1. Descriptive statistics
The dependent variable in this model is GPA, and the independent variables are the number of classes students are absent from and the number of hours spent on self-study.
To specify,
y : GPA
x1 : The number of classes students are absent from
x2 : The number of hours spent on self-study
2. Regression model
We can easily calculate $\hat{\beta}_0$, $\hat{\beta}_1$ and $\hat{\beta}_2$ (or b0, b1 and b2) of the regression function using Excel, and then check the result with the Gretl software.
Step 1: Calculating by Excel
We have:
The population model: $y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon$
Where, $\beta_0$ : intercept of y
$\beta_1$, $\beta_2$ : population slopes
$\varepsilon$ : random error
The estimated multiple regression model is $\hat{y} = b_0 + b_1 x_1 + b_2 x_2$
Where, $\hat{y}$ : estimated (predicted) value of y
$b_0$ : estimated intercept
$b_1$, $b_2$ : estimated slope coefficients
b0, b1 and b2 are found by solving the following equations:
$\sum y = n b_0 + b_1 \sum x_1 + b_2 \sum x_2$
$\sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2$
$\sum x_2 y = b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2$
We have:
b0 = 3.31947
b1 = -0.0323693
b2 = 0.0739471
The estimated model is $\hat{y} = 3.31947 - 0.0323693 x_1 + 0.0739471 x_2$ (*)
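To make the three equations for b0, b1 and b2 concrete, the sketch below builds them from the sums of the data and solves them by Gauss-Jordan elimination. It is only an illustration of the technique, not the Excel procedure used in the report: the function name is hypothetical, and the toy data are generated from a known equation rather than taken from the survey.

```python
def solve_normal_equations(x1, x2, y):
    """Estimate b0, b1, b2 of y = b0 + b1*x1 + b2*x2 from the normal equations."""
    n = len(y)
    s1, s2, sy = sum(x1), sum(x2), sum(y)
    s11 = sum(a * a for a in x1)
    s22 = sum(b * b for b in x2)
    s12 = sum(a * b for a, b in zip(x1, x2))
    s1y = sum(a * c for a, c in zip(x1, y))
    s2y = sum(b * c for b, c in zip(x2, y))

    # Augmented matrix: one row per normal equation, last column is the left-hand sum.
    m = [[n,  s1,  s2,  sy],
         [s1, s11, s12, s1y],
         [s2, s12, s22, s2y]]

    # Gauss-Jordan elimination with partial pivoting.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[col])]
    return tuple(m[i][3] / m[i][i] for i in range(3))

# Hypothetical toy data generated exactly from y = 3.0 - 0.5*x1 + 0.25*x2.
x1 = [0, 1, 2, 3, 4, 5]
x2 = [2, 0, 1, 3, 1, 4]
y = [3.0 - 0.5 * a + 0.25 * b for a, b in zip(x1, x2)]

b0, b1, b2 = solve_normal_equations(x1, x2, y)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

Because the toy data lie exactly on a plane, the solver recovers the generating coefficients, which is a convenient way to check the equations before applying them to real observations.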
Step 2: Testing the calculation by Gretl:
Model 1: OLS, using observations 1-75
Dependent variable: GPA_Y
Coefficient Std. Error t-ratio p-value
const 3.31947 0.133021 24.9546
R² = 0.072181 means that the variation in the independent variables explains 7.2181% of the total variation in the dependent variable; the remaining variation in GPA is due to factors outside the model.
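For reference, assuming the usual definition of the coefficient of determination (the symbols SSR, SSE and SST for the regression, error and total sums of squares are our notation and are not used elsewhere in this report):

```latex
R^2 = \frac{SSR}{SST} = 1 - \frac{SSE}{SST} = 0.072181
\quad\Longrightarrow\quad
\frac{SSE}{SST} = 1 - 0.072181 = 0.927819
```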
From the final result, we can draw the following conclusions:
- b0 = 3.31947 is the predicted GPA of a student with no absent classes and no self-study hours. It reflects that GPA is affected not only by the number of classes students are absent from and the number of hours spent on self-study but also by other factors.
- Equation (*) shows that the number of classes a student skips has a negative effect on GPA. Each additional absent class decreases GPA by 0.0323693 if other factors remain unchanged.
- According to equation (*), there is a positive relationship between the hours a student spends on self-studying and GPA. A coefficient of 0.0739471 means that when the time for self-studying increases by one hour, the student's GPA increases by 0.0739471 if other factors remain unchanged.
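The interpretation of the coefficients can be illustrated by plugging values into the estimated model. The function below is a hypothetical helper that uses the reported coefficients, and the example inputs (2 absent classes, 2 self-study hours) are chosen only for illustration.

```python
# Coefficients of the estimated model reported above.
B0, B1, B2 = 3.31947, -0.0323693, 0.0739471

def predict_gpa(absent_classes, self_study_hours):
    """Predicted GPA from the estimated regression equation."""
    return B0 + B1 * absent_classes + B2 * self_study_hours

baseline = predict_gpa(2, 2)
one_more_absence = predict_gpa(3, 2) - baseline   # effect of one extra absent class
one_more_hour = predict_gpa(2, 3) - baseline      # effect of one extra self-study hour

print(round(baseline, 4), round(one_more_absence, 7), round(one_more_hour, 7))
```

Holding one variable fixed while changing the other by one unit returns exactly the corresponding slope coefficient, which is what "if other factors remain unchanged" means in the interpretation above.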
Conclusion
Due to limited time and resources, our group chose to investigate only a small area, the High Quality Class of the Faculty of Finance and Banking, instead of the whole university or other universities. With more time, our group believes we would be able to collect more information and more data and, as a result, obtain more accurate results for this survey.
In addition, the number of hours students spend on self-studying has a positive relationship with their GPA, while the number of classes they are absent from has a negative one. The results of this survey and the data collected provide evidence that students who want to improve their GPA should be more diligent. Specifically, we students need to spend more time improving our knowledge by studying hard at school, listening carefully to the teachers' lectures and doing homework regularly; moreover, the number of hours spent on self-studying is extremely important for each student.
Finally, through this investigation, we believe we have not only produced something useful for students and teachers but also gained helpful and practical knowledge ourselves, which can be used many times afterwards.
Appendix
GPA Absent Self-study GPA Absent Self-study GPA Absent Self-study
2.88 10 0 3.2 2 1 3.27 5 4
3.2 2 2 2.67 4 0.5 3.2 5 0.5
3.54 4 3 2.93 2 2 3.76 3 10
2.91 5 2 8.5 0 3 3.04 25 3
3.22 2 5 3.22 0 1 3.8 1 0.5
3.13 12 0 3.48 1 1 3.3 2 2
3.03 5 0 3.4 5 1 3.42 2 0.5
3.2 10 0 2.9 7 2 3.2 4 2
3.34 4 1 3.28 7 2 3.79 0 4
3.1 2 0.5 3.22 4 0.5 3.23 0 0.25
3.47 4 1.00 3.38 1 0.5 3.17 4 2
3.24 0 1 3.9 1 2 3 2 2
2.8 2 1 3.19 5 1.5 3.65 1 3
2.97 3 1 3.31 7 5 3.3 8 2
3.49 2 1 3.54 8 5 3.31 4 2
3.2 1 4 2.9 5 2 3.42 0 2
3.3 2 0 3.62 0 3 2.8 5 4
3.1 3 0.5 3.5 1 0.5 3.2 2 1
3.23 0 2 3.67 2 2 3.25 4 2
3 5 2 3.38 1 1 4 0 3
3.41 3 2 3.2 4 2 3.35 2 2
3.04 1 2 3.67 0 1 3.35 16 0
2.83 1 2 3.2 1 1 3.3 0 4
3.2 1 5 3.9 4 4 3.15 0 0.5
3 2 0.5 3.64 0 2 3.4 0 1