Report Draft (4)


  • 7/29/2019 Report Draft (4)

    Table of Contents

    Group members and list of work
    Introduction
    Collecting data
    Presenting and Summarizing data
        1. Subject
        2. Number of absent classes per month
        3. Average number of hours spent for self-study
        4. GPA
    Regression model
        1. Descriptive statistics
        2. Regression model
    Conclusion
    Appendix

    Group members and list of work

    Name | Student ID | Tasks | % equivalent
    Pham Ngoc Anh | | Collecting data; Preparing slides | 13
    Dang Thi Hien | | Writing "Introduction"; Designing survey | 14
    Hoang Thanh Ha | | Collecting data; Presenting | 13
    Phan Duy Hung | 1001030153 | Processing and Analyzing data; Writing "Regression model" and "Average number of hours spent for self-study" | 25
    Bui Kieu Dieu Linh | 1001010500 | Processing and Analyzing data; Writing "Presenting and Summarizing data" | 25
    Kim Le Ha Thanh | | Designing survey; Writing "Conclusion" | 10

    It should be noted that all members of the group were serious, enthusiastic and
    hard-working. They all completed their assigned tasks well and before the deadline.
    In fact, the work could have been shared equally (that is, all 6 people working on
    every task), but for the sake of the report's quality and to save time, we delegated
    the tasks according to our strengths and weaknesses. Therefore, the percentage here
    is only a relative measure; in terms of working attitude and efficiency, all members
    can be regarded as equivalent.

    Introduction

    Education is one of the most fundamental factors in an individual's success in life.
    It is also the best investment a person can make, because well-educated people have
    more opportunities to get promising jobs in the future. However, not many people
    realize its importance, and many waste their time on less meaningful pursuits. More
    specifically, students face a variety of distractions such as video games and social
    networks. Within the scope of this assignment, we focus on a few main factors that
    directly affect students of Foreign Trade University (FTU), especially members of
    the High Quality Classes of Finance and Banking: being absent from class because of
    the weather, registering for many courses and then skipping classes several times,
    and spending very little time on studying. With this narrower focus, the data should
    be closer to, and more accurate for, the group our findings describe.

    This topic may be a popular research subject at FTU; however, no research has yet
    been done on students of the High Quality Programs in the Faculty of Finance and
    Banking. We therefore decided to focus on this group and chose the topic:
    Investigation of absent time of students and their average total marks,
    so as to find out how important it is to study hard, to try one's best to get good
    results and to keep a positive attitude towards learning. The final results may help
    students look back on their study history and change their habits for the better at
    university. The purpose of this assignment is to illustrate the importance of
    education, particularly of the time spent at university.

    In our research, we use five methods of business statistics, namely collecting,
    presenting, summarizing and analyzing data and forecasting, in order to make our
    study as effective and informative as possible.

    Collecting data

    In order to study the diligence of Foreign Trade University students of the High
    Quality Class, Faculty of Finance and Banking (a.k.a. CLCTCNH), we collected data
    both directly and indirectly.

    For the direct method, our group of 6 people randomly chose some classes and asked
    several students on the spot to fill in the survey. For the indirect method, we
    created an online version and spread it through email and social networks such as
    Facebook.

    CLCTCNH has about 600 students in total, and we tried to get responses from at
    least 10% of them for our study.

    Here is the survey we used to collect data:

    Investigation on students' studying habit

    1. According to credit scale, how many subjects are you studying? *
       < 5 / 6 - 8 / > 8
    2. How many classes per week do you have? *
    3. How much time do you spend on self-studying at home? *
    4. How many classes per month are you absent from? *
    5. Are you interested in studying in the class? *
       Yes / No / Only in some certain subjects
    6. Which thing(s) most make you want to skip a class? *
       Bad weather / Classes without checking attendance / Studying without
       understanding the lesson / Busy with other activities / Getting up late / Other:
    7. Are you happy with your studying now? *
       No / Yes / Some subjects only
    8. Your average GPA (for the scale of 4) at the moment *
    9. Which year are you in?
       K47 / K48 / K49 / K50

    The completed surveys are shown later in the appendix.

    Presenting and Summarizing data

    1. Subject

    After one week of collecting data, we received 75 completed surveys, more than we
    expected. A quick glance at the pie chart below shows that, of the four cohorts,
    2nd-year and 3rd-year students were much more willing to answer the survey than
    1st-year and 4th-year students. This may be because final-year students are so busy
    with their work and graduation essays that they did not have time to answer. As for
    the 1st-year students, we guess that since they have not yet taken courses such as
    Economics or Business Statistics, they did not have the motivation to answer.

    [Pie chart: share of respondents by cohort (K47, K48, K49, K50); slice labels 11%, 38%, 47% and 4%]

    2. Number of absent classes per month

    Table of distribution

    xi    fi    cumulative frequency    relative frequency (%)
    0     14    14                      18.7%
    1     12    26                      16.0%
    2     15    41                      20.0%
    3     4     45                      5.3%
    4     11    56                      14.7%
    5     9     65                      12.0%
    6     3     68                      4.0%
    8     2     70                      2.7%
    10    2     72                      2.7%
    12    1     73                      1.3%
    16    1     74                      1.3%
    25    1     75                      1.3%

    From the above data, we calculated:

    The range: R = largest value - smallest value = 25 - 0 = 25

    This range tells us that the difference between the largest and the smallest value
    of the distribution is 25.

    Although the number of absent classes per month ranges from 0 to 25, there are only
    12 distinct values, and most students are absent from fewer than 10 classes (70 of
    the 75 students, about 93%).

    The arithmetic mean:

    Mean = 253/75 = 3.37

    This number tells us that the average number of absent classes per month is 3.37;
    in other words, on average a CLCTCNH student misses 3 to 4 classes per month.

    The mean deviation

    $$ d = \frac{\sum_{i=1}^{n} \left| x_i - \bar{x} \right|}{n} $$

    Mean deviation = 2.568

    The average absolute difference between the number of absent classes and the mean
    is 2.568.

    Mode

    The mode of the data set is the value with the largest frequency. From the above
    table, we can see that the value which appears most often is 2, with a frequency of
    15. The graph below also illustrates this fact.

    [Bar chart: frequency (fi) of each number of absent classes per month]

    In other words, the most common choice among the surveyed students is to miss 2
    classes per month.

    However, it is also worth noting that 18.7% of students are never or hardly ever
    absent from any class, which is only 1.3 percentage points less than the share at
    the mode of 2. Therefore, we have evidence that Finance and Banking students are
    quite hard-working.

    Median

    Based on the table of distribution, in which the total frequency is 75, the middle
    item is the 38th item, which corresponds to the value 2.

    Median = 2

    The variance

    Variance = 63.403136

    The average of the squared deviations between each number of monthly absent classes
    and the mean is 63.40.

    The standard deviation

    Standard deviation = 2.821890458

    The coefficient of variation

    Coefficient of variation = 83.65 %
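    The range, mean, median and mode reported in this section can be cross-checked
    against the raw "Absent" column listed in the appendix. The snippet below is a
    minimal sketch using only the Python standard library; the list `absent` is our
    transcription of that column (read row by row), and the variable names are
    illustrative.

```python
from statistics import mean, median, mode

# Absent classes per month for the 75 respondents, transcribed from the
# "Absent" column of the appendix (three respondents per appendix row).
absent = [
    10, 2, 5,   2, 4, 5,   4, 2, 3,   5, 0, 25,  2, 0, 1,
    12, 1, 2,   5, 5, 2,   10, 7, 4,  4, 7, 0,   2, 4, 0,
    4, 1, 4,    0, 1, 2,   2, 5, 1,   3, 7, 8,   2, 8, 4,
    1, 5, 0,    2, 0, 5,   3, 1, 2,   0, 2, 4,   5, 1, 0,
    3, 4, 2,    1, 0, 16,  1, 1, 0,   1, 4, 0,   2, 0, 0,
]

n = len(absent)                          # number of respondents
total = sum(absent)                      # numerator of the arithmetic mean
data_range = max(absent) - min(absent)   # largest value - smallest value

print("n =", n, "| sum =", total)
print("range  =", data_range)
print("mean   =", round(mean(absent), 2))
print("median =", median(absent))
print("mode   =", mode(absent))
```

    With 75 observations the median is simply the 38th value in ascending order, which
    is why no interpolation is needed here.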

    3. Average number of hours spent for self-study

    Table of distribution

    xi      fi    Cumulative frequency    Relative frequency (%)
    0       6     6                       8 %
    0.25    1     7                       1.3 %
    0.5     11    18                      14.7 %
    1       15    33                      20 %
    1.5     1     34                      1.3 %
    2       24    58                      32 %
    3       6     64                      8 %
    4       6     70                      8 %
    5       4     74                      5.3 %
    10      1     75                      1.3 %

    The following dot plot shows the frequency of each value of the time students spend
    on self-studying.

    From the above data, we calculated:

    The range: R = largest value - smallest value = 10 - 0 = 10

    This range tells us that the difference between the largest and the smallest value
    of the distribution is 10.

    Although the number of hours spent on self-studying ranges from 0 to 10, there are
    only 10 distinct values, and almost all students (74 of 75, about 99%) study fewer
    than 10 hours.

    The arithmetic mean:

    Mean = 142.25/75 = 1.897

    This number tells us that the average time spent on self-studying is 1.897 hours
    per day; in other words, on average a CLCTCNH student spends about 2 hours a day on
    revision and home assignments.

    [Dot plot of hours spent on self-studying; horizontal axis from 0 to 10 hours]

    The mean deviation

    $$ d = \frac{\sum_{i=1}^{n} \left| x_i - \bar{x} \right|}{n} $$

    Mean deviation = 1.126

    The average absolute difference between the number of hours spent on self-studying
    and the mean is 1.126.

    Median

    The median of the data set is the value of the middle item when the data items are
    arranged in ascending order.

    Based on the table of distribution, in which the total frequency is 75, the middle
    item is the 38th item, which corresponds to the value 2.

    Median = 2

    Mode

    The mode of the data set is the value with the largest frequency. From the above
    table, we can see that the value which appears most often is 2, with a frequency of
    24. The graph below also illustrates this fact.

    In other words, the most common amount of self-study among the surveyed students is
    2 hours per day.

    Mode = 2

    As we can see from the graph and the table of frequency distribution, only 8% of
    students never spend time studying at home, which indicates that a large majority
    of CLCTCNH students are still aware of their duty to study.

    The variance

    Variance = 2.617

    The average of the squared deviations between each number of self-study hours and
    the mean is 2.617.

    [Bar chart: frequency (fi) of each number of self-study hours]

    The standard deviation

    Standard deviation = 1.618

    The coefficient of variation

    Coefficient of variation = 85.293 %
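    The figures in this section can be reproduced from the table of distribution
    alone. The sketch below is one way to do so in plain Python; it assumes the
    population form of the variance (dividing by n), which matches the reported 2.617,
    and the names `freq` and `value_at` are ours, not part of the original analysis.

```python
from math import sqrt

# (hours of self-study per day, number of students) from the table of distribution.
freq = [(0, 6), (0.25, 1), (0.5, 11), (1, 15), (1.5, 1),
        (2, 24), (3, 6), (4, 6), (5, 4), (10, 1)]

n = sum(f for _, f in freq)                          # total frequency
data_range = max(x for x, _ in freq) - min(x for x, _ in freq)
mean = sum(x * f for x, f in freq) / n               # frequency-weighted mean
mean_dev = sum(f * abs(x - mean) for x, f in freq) / n
variance = sum(f * (x - mean) ** 2 for x, f in freq) / n   # assumption: divide by n
std_dev = sqrt(variance)
cv = std_dev / mean * 100                            # coefficient of variation, in %
mode = max(freq, key=lambda pair: pair[1])[0]        # value with the largest frequency


def value_at(position):
    """Return the value of the item at a 1-based position in ascending order."""
    cumulative = 0
    for x, f in freq:
        cumulative += f
        if position <= cumulative:
            return x
    raise ValueError("position exceeds total frequency")


median = value_at((n + 1) // 2)                      # n is odd, so this is the 38th item

print(data_range, round(mean, 3), round(mean_dev, 3), median, mode)
print(round(variance, 3), round(std_dev, 3), round(cv, 2))
```

    The coefficient of variation computed from unrounded values is about 85.29%; the
    slightly different last digit in the text comes from dividing the rounded figures
    1.618 and 1.897.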

    4. GPA

    Table of distribution

    xi fi xi fi xi fi

    2.67 1 3.19 1 3.42 2

    2.8 2 3.2 10 3.47 1

    2.83 1 3.22 3 3.48 1

    2.88 1 3.23 2 3.49 1

    2.9 2 3.24 1 3.5 1

    2.91 1 3.25 1 3.54 2

    2.93 1 3.27 1 3.62 1

    2.97 1 3.28 1 3.64 1

    3 3 3.3 4 3.65 1

    3.03 1 3.31 2 3.67 2

    3.04 2 3.34 1 3.76 1

    3.1 2 3.35 2 3.79 1

    3.13 1 3.38 2 3.8 1

    3.15 1 3.4 3 3.9 2

    3.17 1 3.41 1 4 1

    There are 45 distinct values in this data set, so we group them into a frequency
    distribution table to make the data simpler and easier for readers to follow.

    GPA                f     relative frequency (%)    class midpoint (x)
    2.67 up to 2.75    1     1.33%                     2.71
    2.76 up to 3.00    12    16.00%                    2.88
    3.01 up to 3.25    26    34.67%                    3.13
    3.26 up to 3.50    23    30.67%                    3.38
    3.51 up to 3.75    7     9.33%                     3.63
    3.76 up to 4.00    6     8.00%                     3.88
    Total              75    100.00%

    These can be illustrated by the following ogive:

    [Ogive of GPA: cumulative relative frequency by class: 1.33% (2.67 up to 2.75),
    17.33% (2.76 up to 3.00), 52.00% (3.01 up to 3.25), 82.67% (3.26 up to 3.50),
    92.00% (3.51 up to 3.75), 100.00% (3.76 up to 4.00)]
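    The points of the ogive are the running totals of the class frequencies expressed
    as a share of all 75 students. A short sketch of that calculation, with
    illustrative variable names:

```python
from itertools import accumulate

# Class frequencies from the grouped table, in class order.
class_freq = [1, 12, 26, 23, 7, 6]

n = sum(class_freq)
cumulative = list(accumulate(class_freq))                    # running totals
cumulative_pct = [round(100 * c / n, 2) for c in cumulative]  # ogive heights in %

print(cumulative)
print(cumulative_pct)
```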

    From the above statistics, we can calculate:

    The range: R = largest value - smallest value = 4 - 2.67 = 1.33

    Class width: C = lower limit of class N+1 - lower limit of class N = 3.01 - 2.76 = 0.25

    We grouped the available data into 6 classes with a class width of 0.25; the total
    range is 1.33. It should be noted that the GPA here is calculated on a scale of 4,
    not 10 as usual. Therefore, 4 is the highest mark.

    The arithmetic mean

    Mean = 245.08/75 = 3.27

    From this value, we can take 3.27 as the average GPA of a CLCTCNH student.

    The mean deviation

    md = 0.24

    So the average absolute distance between each GPA and the average GPA is 0.24.

    The mode

    Mode = 3.22

    So the GPA that occurs most often among CLCTCNH students is 3.22.
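    The grouped mean uses the class midpoints weighted by frequency. The formula used
    for the mode is not shown in the text; the sketch below assumes the usual
    interpolation within the modal class, mode = L + d1 / (d1 + d2) * C, where L is
    the lower limit of the modal class and d1, d2 are the differences between its
    frequency and those of the neighbouring classes. Under that assumption the result
    agrees with the reported 3.22. All names are illustrative.

```python
# (lower limit, class midpoint, frequency) for each GPA class in the grouped table.
classes = [(2.67, 2.71, 1), (2.76, 2.88, 12), (3.01, 3.13, 26),
           (3.26, 3.38, 23), (3.51, 3.63, 7), (3.76, 3.88, 6)]
class_width = 0.25

n = sum(f for _, _, f in classes)
weighted_sum = sum(mid * f for _, mid, f in classes)
grouped_mean = weighted_sum / n

# Modal class: the class with the largest frequency.
i = max(range(len(classes)), key=lambda k: classes[k][2])
lower_limit, _, f_modal = classes[i]
f_prev = classes[i - 1][2] if i > 0 else 0
f_next = classes[i + 1][2] if i < len(classes) - 1 else 0

d1 = f_modal - f_prev          # excess over the class below
d2 = f_modal - f_next          # excess over the class above
grouped_mode = lower_limit + d1 / (d1 + d2) * class_width   # assumed formula

print(round(weighted_sum, 2), round(grouped_mean, 2), round(grouped_mode, 2))
```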

    The median

    Median = 3.16

    This number tells us that the middle value of GPA, in order of size, is 3.16.

    The variance

    Variance = 0.082

    The standard deviation

    Standard deviation = 0.287

    The coefficient of variation

    Coefficient of variation = 8.77%
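    The text does not say which denominator was used for the GPA variance. As a
    hedged check, the sketch below applies the sample formula (dividing by n - 1) to
    the class midpoints; this is our inference, and it gives a variance that rounds to
    the reported 0.082, with a standard deviation and coefficient of variation that
    agree with the reported values only approximately (about 0.286 and 8.76%).

```python
from math import sqrt

# (class midpoint, frequency) pairs from the grouped GPA table.
midpoint_freq = [(2.71, 1), (2.88, 12), (3.13, 26),
                 (3.38, 23), (3.63, 7), (3.88, 6)]

n = sum(f for _, f in midpoint_freq)
mean = sum(x * f for x, f in midpoint_freq) / n

# Assumption: sample variance, i.e. squared deviations divided by n - 1.
sample_variance = sum(f * (x - mean) ** 2 for x, f in midpoint_freq) / (n - 1)
std_dev = sqrt(sample_variance)
cv = std_dev / mean * 100      # coefficient of variation, in %

print(round(sample_variance, 3), round(std_dev, 3), round(cv, 2))
```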

    [Histogram of GPA: frequency of each class, from "2.67 up to 2.75" to "3.76 up to 4.00"]

    Regression model

    Analyzing the relationship between GPA, number of absent classes and number of
    self-studying hours

    1. Descriptive statistics

    The dependent variable in this model is GPA, and the independent variables are the
    number of classes students are absent from and the number of hours spent on
    self-study.

    Specifically,

    y : GPA
    x1 : The number of classes students are absent from
    x2 : The number of hours spent on self-study

    2. Regression model

    We can easily calculate $\hat{\beta}_0$, $\hat{\beta}_1$ and $\hat{\beta}_2$ (or
    b0, b1 and b2) of the regression function using Excel, and then check the result
    with the Gretl software.

    Step 1: Calculating by Excel

    We have:

    The population model:

    $$ y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \varepsilon $$

    Where, $\beta_0$ : intercept of y
    $\beta_1$, $\beta_2$ : population slope coefficients
    $\varepsilon$ : random error

    The estimated multiple regression model is

    $$ \hat{y} = b_0 + b_1 x_1 + b_2 x_2 $$

    Where, $\hat{y}$ : estimated (predicted) value of y
    $b_0$ : estimated intercept
    $b_1$, $b_2$ : estimated slope coefficients

    b0, b1 and b2 are found from the following system of equations:

    $$ \sum y = n b_0 + b_1 \sum x_1 + b_2 \sum x_2 $$
    $$ \sum x_1 y = b_0 \sum x_1 + b_1 \sum x_1^2 + b_2 \sum x_1 x_2 $$
    $$ \sum x_2 y = b_0 \sum x_2 + b_1 \sum x_1 x_2 + b_2 \sum x_2^2 $$
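    The three equations above form a 3 x 3 linear system in b0, b1 and b2. The report
    solved it in Excel; purely as an illustration, the sketch below builds the same
    sums and solves the system by Gauss-Jordan elimination in plain Python. The
    function name `fit_two_predictors` and the small data set are made up for the
    example and are not the survey data.

```python
def fit_two_predictors(y, x1, x2):
    """Solve the three normal equations and return (b0, b1, b2)."""
    def dot(a, b):
        return sum(p * q for p, q in zip(a, b))

    n = len(y)
    # Augmented matrix: one row per normal equation, last column = left-hand side.
    rows = [
        [n,       sum(x1),     sum(x2),     sum(y)],
        [sum(x1), dot(x1, x1), dot(x1, x2), dot(x1, y)],
        [sum(x2), dot(x1, x2), dot(x2, x2), dot(x2, y)],
    ]
    # Gauss-Jordan elimination with partial pivoting.
    # A zero pivot (perfectly collinear predictors) raises ZeroDivisionError.
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        pivot_value = rows[col][col]
        rows[col] = [v / pivot_value for v in rows[col]]
        for r in range(3):
            if r != col:
                factor = rows[r][col]
                rows[r] = [v - factor * p for v, p in zip(rows[r], rows[col])]
    return tuple(row[3] for row in rows)


# Made-up data generated exactly from y = 1 + 2*x1 + 3*x2, so the fit is known.
x1 = [0, 1, 2, 3, 4]
x2 = [1, 0, 2, 1, 3]
y = [1 + 2 * a + 3 * b for a, b in zip(x1, x2)]

b0, b1, b2 = fit_two_predictors(y, x1, x2)
print(round(b0, 6), round(b1, 6), round(b2, 6))
```

    Passing the GPA, Absent and Self-study columns of the appendix as `y`, `x1` and
    `x2` would apply the same procedure to the survey data.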

    We have:

    $$ b_0 = 3.31947, \quad b_1 = -0.0323693, \quad b_2 = 0.0739471 $$

    The estimated model is

    $$ \hat{y} = 3.31947 - 0.0323693\,x_1 + 0.0739471\,x_2 \qquad (*) $$

    Step 2: Testing the calculation by Gretl:

    Model 1: OLS, using observations 1-75

    Dependent variable: GPA_Y

    Coefficient Std. Error t-ratio p-value

    const 3.31947 0.133021 24.9546

    R2 = 0.072181 means that the variation in the independent variables explains
    7.2181% of the total variation in the dependent variable.
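    For reference, R2 compares the unexplained variation (SSE) with the total
    variation (SST): R2 = 1 - SSE/SST. The sketch below shows the calculation on a
    tiny made-up example; the function name and the numbers are illustrative only and
    are not taken from the survey.

```python
def r_squared(y, y_hat):
    """Coefficient of determination: 1 - SSE/SST."""
    mean_y = sum(y) / len(y)
    sse = sum((actual - fitted) ** 2 for actual, fitted in zip(y, y_hat))
    sst = sum((actual - mean_y) ** 2 for actual in y)
    return 1 - sse / sst


# Made-up observed values and fitted values.
observed = [1, 2, 3, 4]
fitted = [1.5, 1.5, 3.5, 3.5]

example_r2 = r_squared(observed, fitted)   # SSE = 1, SST = 5
print(example_r2)
```

    A value as low as 0.07 therefore means that most of the variation in GPA is left
    unexplained by the two variables in the model.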

    From the final result, we can draw the following conclusions:

    - b0 = 3.31947 reflects that GPA is affected not only by the number of classes
    students are absent from and the number of hours spent on self-study, but also by
    other factors.

    - Equation (*) shows that the number of classes a student skips has a negative
    effect on GPA. Each additional absent class per month decreases the predicted GPA
    by 0.0323693 if other factors remain unchanged.

    - According to equation (*), there is a positive relationship between the hours a
    student spends on self-studying and GPA. The coefficient of 0.0739471 means that
    when the time for self-studying increases by one hour, the student's predicted GPA
    increases by 0.0739471, other factors remaining unchanged.
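    The interpretation of the two slope coefficients can be illustrated by plugging
    values into the estimated model. In the sketch below the coefficients are the ones
    reported above; the function name and the example student (2 absent classes, 2
    hours of self-study) are hypothetical.

```python
# Estimated coefficients reported for the model.
B0, B1, B2 = 3.31947, -0.0323693, 0.0739471


def predict_gpa(absent_classes, self_study_hours):
    """Predicted GPA from the estimated regression model."""
    return B0 + B1 * absent_classes + B2 * self_study_hours


base = predict_gpa(2, 2)                      # hypothetical student
one_more_absence = predict_gpa(3, 2) - base   # change from one extra absent class
one_more_hour = predict_gpa(2, 3) - base      # change from one extra study hour

print(round(base, 4), round(one_more_absence, 7), round(one_more_hour, 7))
```

    Each difference equals the corresponding coefficient, which is what "if other
    factors remain unchanged" means in a linear model.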

    Conclusion

    Due to limited time and resources, our group chose to investigate only a small
    area, the High Quality Class of the Faculty of Finance and Banking, instead of the
    whole university or other universities. With more time, our group believes that we
    would be able to collect more information and more data and, as a result, obtain
    more accurate results for this survey.

    In addition, the hours students spend on self-studying have a positive relationship
    with their GPA, while the number of classes students are absent from has a negative
    one. The results of this survey and the data collected give evidence that if
    students want to improve their GPA, they should be more diligent. Specifically, we
    students need to spend more time improving our knowledge by studying hard at
    school, listening carefully to the teachers' lectures and doing homework regularly;
    moreover, the number of hours spent on self-studying is extremely important for
    each student.

    Finally, through this investigation, we believe we have not only produced something
    applicable for students and teachers but also gained helpful and practical
    knowledge for ourselves, which can be used many times afterward.

    Appendix

    GPA     Absent  Self-study    GPA     Absent  Self-study    GPA     Absent  Self-study
    2.88    10      0             3.2     2       1             3.27    5       4
    3.2     2       2             2.67    4       0.5           3.2     5       0.5
    3.54    4       3             2.93    2       2             3.76    3       10
    2.91    5       2             8.5     0       3             3.04    25      3
    3.22    2       5             3.22    0       1             3.8     1       0.5
    3.13    12      0             3.48    1       1             3.3     2       2
    3.03    5       0             3.4     5       1             3.42    2       0.5
    3.2     10      0             2.9     7       2             3.2     4       2
    3.34    4       1             3.28    7       2             3.79    0       4
    3.1     2       0.5           3.22    4       0.5           3.23    0       0.25
    3.47    4       1.00          3.38    1       0.5           3.17    4       2
    3.24    0       1             3.9     1       2             3       2       2
    2.8     2       1             3.19    5       1.5           3.65    1       3
    2.97    3       1             3.31    7       5             3.3     8       2
    3.49    2       1             3.54    8       5             3.31    4       2
    3.2     1       4             2.9     5       2             3.42    0       2
    3.3     2       0             3.62    0       3             2.8     5       4
    3.1     3       0.5           3.5     1       0.5           3.2     2       1
    3.23    0       2             3.67    2       2             3.25    4       2
    3       5       2             3.38    1       1             4       0       3
    3.41    3       2             3.2     4       2             3.35    2       2
    3.04    1       2             3.67    0       1             3.35    16      0
    2.83    1       2             3.2     1       1             3.3     0       4
    3.2     1       5             3.9     4       4             3.15    0       0.5
    3       2       0.5           3.64    0       2             3.4     0       1