GLOSSARY A Toolkit of Terms and Concepts for a Successful Semester
CORE Concepts Definitions
MODULE 1
Research Process
Research is an activity (just like baseball or any sport). There are many different ways to be involved and engaged in research and many methods to understand how the world works. The use of the scientific method is only one of many ways to conduct research. It’s not the only game in town. There are other important methods of research that include oral histories, field interviews and participant observation that are perhaps a bit more fun and often more intense in terms of immersion and involvement in the research topic or research question. In SOC 232, however, we will focus on using the scientific method to address our research questions.
Scientific Method
An objective and systematic approach to conducting research. Those involved in doing scientific research follow these basic steps:
1) Form the research question
2) Decide on the research design and measures
3) Collect the data
4) Enter the data into a computer database
5) Analyze the data statistically
6) Interpret the findings and write up the results into a paper or presentation
7) Point the direction for the next team of scientists to follow.
Ethnographic Method
Unlike the scientific method, the ethnographic method is both messy and wonderful at the same time! It is discussed here only to serve as a contrast and comparison to the scientific method. Unlike the scientist, ethnographic researchers typically spend months or years immersed in a culture or community and their ethnographic research is based on their extensive field notes and field observations over this lengthy time period.
Going Native: Complete immersion in their field work may sometimes lead to the researcher “going native” (a term used in anthropology), which sometimes happens to anthropologists when they spend too much time in the field. “Going native” is when the researcher tends to lose their own objectivity and the line between themselves as the researcher and the subjects that they are studying becomes increasingly blurred. This is one of the “hazards” of doing ethnographic research and being completely immersed in the people and places that you are studying. But the problem of “going native” is not usually an issue among practicing social scientists, who tend to be more immersed in data than in other cultures and other people. But, of course, there are many exceptions to this general statement!
Assumptions of Science
Empirical Measurement: The principal assumption of the scientific method is that the world can be empirically measured and that the data we collect can be statistically (and objectively) analyzed.
Objectivity: Objectivity is not a hard assumption to grasp under the principles of science: The steps in the scientific method, from forming the research question to the measurement, quantification and statistical analysis of the data, help maintain the researcher’s emotional distance and objectivity in their research questions. Consequently, objectivity is one of the core values and assumptions of science.
Replication: Another value or benefit of the scientific method is replication: Given that the steps in the scientific method are laid out clearly, another researcher can more or less replicate a research study to either validate or refute its findings. This leads to the verification and accumulation of knowledge.
Social Science
The social scientist assumes that the social world can be empirically measured and objectively understood, just the same way as the physical and biological world can be scientifically understood. Through the application of the steps in the scientific method, people, communities and cultures can be understood scientifically. Remember, not all social researchers conduct research under the assumptions of science, nor do they apply the steps in the scientific method. Think again about the ethnographic researcher, for example, and other types of researchers (historians as just another example) who are engaged in research projects across this campus and across other university campuses around the world. These researchers too are actively engaged in contributing to our understanding of human societies through methods other than the scientific method.
Survey Research Design
Many social scientists use surveys to objectively understand people’s attitudes, values and behaviors.
Properties-Dispositions Design: Survey research is often referred to as the properties-dispositions design. Using this framework, the survey researcher develops survey questions to capture people’s properties (their race, class, gender, marital status, etc.) as well as their dispositions (their values, attitudes, beliefs and behaviors). Once the survey responses have all been collected and compiled, the survey researcher is able to ask only this basic research question from their survey data:
• What is the effect of these properties on these dispositions?
• For example, does gender matter in explaining one’s position on the death penalty? Does race matter?
• If data on gender, race and views on the death penalty have been collected, then the researcher is in a position to statistically (and objectively) analyze those relationships.
• This is the level of understanding we can reach with survey research. If you are interested in other, deeper research questions, you will have to use other methods in the future.
Basic Survey Research Question: What is the effect of these properties on these dispositions? All research questions within a survey design framework are framed this way. During our semester together, this will be how you will frame your research questions too. Review carefully the above bullets as they may apply to your set of research questions, and start training yourself to think this way for this semester. But if this is boring to you, remember that you can use other research methods and research designs to conduct other kinds of research in the future!
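To make the properties-dispositions question concrete, here is a minimal sketch in Python of comparing a disposition (view on the death penalty) across a property (gender). The responses are invented for illustration and are not GSS data, and the function name `percent_by_group` is hypothetical. In this class SPSS would do this work.

```python
from collections import Counter

# Made-up coded responses: (property: gender, disposition: death penalty view).
responses = [
    ("female", "favor"), ("female", "oppose"), ("female", "oppose"),
    ("male", "favor"), ("male", "favor"), ("male", "oppose"),
]

def percent_by_group(pairs):
    """Return {group: {answer: percent of that group giving the answer}}."""
    group_totals = Counter(group for group, _ in pairs)
    cell_counts = Counter(pairs)
    table = {}
    for (group, answer), count in cell_counts.items():
        share = 100 * count / group_totals[group]
        table.setdefault(group, {})[answer] = round(share, 1)
    return table

table = percent_by_group(responses)
for group in sorted(table):
    print(group, table[group])
```

Reading across the groups shows whether the property appears to matter for the disposition. Whether a difference is statistically meaningful is a separate question that this sketch does not address.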
Experimental Research Design
Other social scientists (and medical researchers) often use experimental methods to understand and gain knowledge about people’s behaviors and their responses.
Stimulus-Response Design: Experimental research is often referred to as the stimulus-response design. Using this framework, there is usually one experimental group that receives the stimulus and another group (the control group) that does not. Data is collected and compiled across both groups and group comparisons are statistically made between the experimental and control group. Through this research design, the basic research question that is addressed is: What is the effect of the stimulus on the response? Does sleep deprivation, for example, impair judgment? By comparing two groups, one group that is sleep deprived (experimental group) and one that is not (control group), and then by giving both the same battery of tests to measure judgment and comparing the results between groups, the researcher is in a position to answer that particular research question statistically and objectively. In short, by comparing the responses of two groups (the experimental group that received the stimulus and the control group that did not) one can begin to answer this general research question: “What is the effect of the stimulus on the response?” If your future research questions can be framed this way, you might decide to use an experimental research design in the future to answer your own research questions.
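As a rough illustration of the stimulus-response comparison, the Python sketch below compares the average test score of an experimental group with that of a control group. The scores and the function name `stimulus_effect` are invented for this example. A real study would also apply a statistical test before drawing conclusions.

```python
from statistics import mean

# Made-up judgment-test scores (higher = better judgment).
experimental_group = [60, 65, 70]  # received the stimulus (sleep deprived)
control_group = [80, 85, 90]       # did not receive the stimulus

def stimulus_effect(experimental, control):
    """Difference in mean response: experimental group minus control group."""
    return mean(experimental) - mean(control)

effect = stimulus_effect(experimental_group, control_group)
print(effect)  # a negative value means the experimental group scored lower
```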
Research Questions
It should be apparent by now that your research question will determine your research design and methods. If you are interested in really understanding a people or culture or problem in some depth, you might use ethnographic or other in-depth field methods. If your question can be framed so that you are comparing two groups where one receives a “stimulus” and the other does not, and you want to compare responses or behavior changes, then perhaps you will use an experimental research design. If, on the other hand, you are interested in understanding people’s values, beliefs and behaviors and differences between people (by race, class, gender, place, etc.), then you will likely be applying a survey design to your research question. In this class, the focus will be solely on survey research design and using survey data to address our research questions. Frame your research questions accordingly.
Conceptualization
When framing your research questions, be very clear on your concepts. Even list them out separately. This will help in the next step of deciding on your measures for those concepts. For example, are you interested in Religion? Gender? Political Views? Then say so. List those concepts that interest you. Remember, survey research is the properties-dispositions design, so always make sure to also include some concepts that are demographic in nature (race, class, gender, education, etc.) when developing your research questions.
Operationalization
Once you are clear on your general research questions and sets of concepts, the next step is to decide on how you are going to measure those concepts. This stage of measurement is referred to as “operationalization.” If you are interested in the concept of “religion,” for example, you will need to decide how you are going to measure that concept. If you are using survey methods, you will need to decide what type of survey question or questions you will use to measure “religion” or how religious a person is. Will you develop questions on how often a person prays? Or how often they go to church? Or will you use several questions to measure the strength of a person’s religious convictions across several dimensions? If you think that gender is an important factor in explaining and understanding religious life, you will need to include that question on your survey too so as to measure that concept.
Survey Instrument
The survey can be considered your “measurement instrument.” Never just throw a bunch of questions on a piece of paper. Think seriously about what concepts you want to measure and then work to develop valid and reliable survey questions that measure those concepts well. Remember, the survey design is the “properties-dispositions” design, so you will need a good number of survey questions that measure those demographic factors (properties) that you think will matter in explaining differences in people’s views, opinions and behavior. In other words, be sure to include questions that measure race, class, gender, education and other demographic factors when developing any future survey.
Survey Questions
Survey questions are the specific measures that you develop and use to measure your core concepts.
Variables: Once responses to a particular set of survey questions have been entered into a database, these questions will become your “variables” that you will analyze in your later analyses. The terms “survey question” and “variable” can be used interchangeably.
Measurement Validity
When developing your survey questions to measure your concepts, you must address issues of measurement validity. You must ask yourself: is that a valid measure, for example, of strength of religious faith? Or might that survey question on church attendance be measuring something else? Is the survey question on church attendance a valid measure of strength of faith? Why or why not? Always be prepared to defend your choice of questions and your set of measures. Develop valid questions. If not, you will be collecting bad data and will have a poorly designed research study on your hands.
Measurement Reliability
Reliability implies that you are getting a consistent measure and that it will not vary from day to day. Make sure to develop or use questions that are reliable and that measure more than a person’s immediate state of mind. Questions whose answers fluctuate from day to day are poorly developed questions and unreliable measures. Instead, develop reliable questions. Otherwise, you will be collecting unreliable data and will, once again, have a poorly designed research study on your hands.
Self-Reported Design
This is another way of thinking about survey research. Our survey questions are based on respondents self-reporting on their attitudes, behaviors and demographics. This is obvious. How else can we find out and measure people’s beliefs and behaviors if we don’t ask them? But there are problems in the self-reported design with respect to issues of measurement validity and reliability. Are people responding honestly and truthfully to our questions? Or are they sometimes providing us with the “socially acceptable” answer? Consequently, are we collecting valid and reliable data?
Gaps in our Research: These are things to pay attention to when evaluating survey research. It is OK to address issues of improving survey questions and question wording so as to improve upon our measures. This is the only way we can be in a position to collect more valid and more reliable data in the future. In fact, it is your responsibility as a social scientist to point out these gaps in current measures and problems with current data. This is one of the things that you must do when paving the direction for future research to conclude your study!
Levels of Measurement
Survey questions can be measured on either a categorical or a continuous scale. Researchers need to know whether they are working with categorical or continuous variables as that will determine what type of statistical analyses can be done on those variables. When it is time to run the statistics, categorical variables are treated differently than continuous variables.
Categorical Variables
Categorical variables are those survey questions where respondents are asked to measure themselves based on a series of categories. With these questions, respondents are asked to check one box or the other. For example, do they agree or disagree? Favor or oppose? Or are they male or female? In each case, the level of measurement is categorical. To turn these types of survey questions into data, the responses have to be coded to a number. For example, code male as “1” and female as “2.” Those numbers have no true numeric meaning and are just placeholders for categories. Consequently, the statistics we use on this type of data are different from those we use when working with true, numeric data.
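The coding step can be sketched in a few lines of Python. This is an illustration only: the dictionary `GENDER_CODES` and the function `code_responses` are hypothetical names, and the codes follow the example in the text (male as 1, female as 2).

```python
# Hypothetical codebook entry: category label -> numeric placeholder code.
GENDER_CODES = {"male": 1, "female": 2}

def code_responses(answers, codes):
    """Replace each category label with its numeric placeholder code."""
    return [codes[answer] for answer in answers]

answers = ["female", "male", "female"]
coded = code_responses(answers, GENDER_CODES)
print(coded)  # [2, 1, 2]
```

Because the codes are only placeholders, computing their average would produce a number with no meaning. Counting how often each code occurs is the sensible summary.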
Likert Scales
Some categorical variables are measured on a Likert Scale. This is the term that is used for those variables or survey questions that are measured from strongly agree and agree to disagree and strongly disagree. There are other scales (strongly favor to strongly oppose) that are also Likert scales. Think of it this way: whenever views, opinions or behaviors are measured through a series of categories that range from low to high, that is a Likert-scaled (categorical) variable.
Continuous Variables
When survey questions result in a number response rather than a checked box, this type of survey question or variable is a continuous (numeric) variable. For example, how old are you? How many brothers and sisters do you have? Both of these questions are continuous in that the respondent is asked to simply place a number next to the survey question. Once again, be aware that numeric data is treated differently statistically than a set of categorical (strongly agree to strongly disagree) survey questions. With continuous numeric data, for example, the average age and average number of brothers or sisters of respondents can be calculated. In contrast, the mean and other descriptive statistics can’t be used when summarizing responses from strongly agree to strongly disagree variables. The “average” opinion on the death penalty just does not make sense. (The best way to summarize those response categories is through a frequency distribution, as we will discuss soon.)
Survey Databases
Surveys and survey questions that have been compiled and entered into a database. Survey databases, like all databases, are typically comprised of the following three basic elements:
Cases: The coded and compiled data that runs across each row of the database represents one case, or one survey respondent. The total number of rows in the database, therefore, will be the total number of cases, or total number of respondents, in the database.
• n = The total number of cases or respondents in the database is often represented statistically through the letter “n.”
Variables: The coded and compiled data from each survey question runs down each column (or variable field). The total number of columns of a database will match up with the total number of survey questions or variables that were asked on the survey.
Variable Mnemonics: On top of each variable column or field, there will be a short term that provides an abbreviation for the survey question. The mnemonics on top of each field allow the researcher to get a sense of what types of questions are in the database. However, to make full use of a database the researcher will need to reference its codebook.
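The row-and-column layout can be pictured with a small Python sketch. The mnemonics and values below are made up for illustration and are not taken from any real codebook. Each dictionary is one case (row) and each key is one variable (column).

```python
# Each row is one case (one respondent); each key is a variable mnemonic.
# Mnemonics and codes here are invented for illustration.
database = [
    {"AGE": 34, "SEX": 2, "CAPPUN": 1},
    {"AGE": 51, "SEX": 1, "CAPPUN": 2},
    {"AGE": 27, "SEX": 2, "CAPPUN": None},  # None: question not asked or not answered
]

n = len(database)                              # total number of cases (rows)
variables = list(database[0].keys())           # variable mnemonics (columns)
age_column = [case["AGE"] for case in database]  # one variable, read down its column

print(n, variables, age_column)
```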
Codebook
A codebook is a manual that helps the researcher navigate and use a database. Codebooks are typically arranged by variable mnemonic and contain the full wording of the question as it originally appeared on the survey instrument. Codebooks are especially useful when researchers are working with secondary data.
Secondary Data
Often social scientists work with secondary survey data. Secondary survey data is typically data that has been collected by a large federal agency or private foundation. These survey databases are often larger (sometimes huge) and often have a far better sample design and response rate than data that is collected by individual survey researchers. As a result, secondary survey data is often (but not always) more reliable and valid than what the researcher might collect on their own. For this reason, many researchers opt to use these larger, secondary survey databases for their research. In our class time together, we will be doing the same, and you will be using an already existing, secondary survey database for your own analyses and reports.
• The General Social Survey: In this class, we will be using the General Social Survey (or GSS) as our secondary data source. This data has been collected since 1972 (annually in most years through the early 1990s and every other year since 1994) by the National Opinion Research Center (NORC) and is a respected database of US public opinion. You will have fun reviewing the GSS Codebooks that are available!
Data Excavation
This term is specific to when researchers use secondary data. Another term for data excavation is “data mining.” When using secondary survey data, the researcher has to more or less work backwards and form their research question after the fact. Since the data has already been collected, the researcher must ask, “What kinds of research questions, what variables, what survey questions from the database can I use in my analyses?” When excavating or mining a database for interesting variables/survey questions, the codebook becomes absolutely essential. In the “data excavation stage” of the research process, the researcher spends a lot of time perusing the codebook, looking for survey questions that are solid measures of the concepts they want to understand. In your first module, you will be asked to review the available codebooks carefully and “mine” the GSS data for interesting variables and research questions for your own analysis and paper for this semester.
SPSS
This is the acronym for “Statistical Package for the Social Sciences.” SPSS is a software package that will run our statistics for us, so the good news is that we won’t have to calculate any statistics by hand or with a calculator. Using a statistical software package will allow us to move to the more important steps in the scientific process: analyzing, interpreting and communicating (writing up) our findings. SPSS is often referred to as the “industry standard” in statistical software. In fact, when you have completed this class, be sure to place on your resume that you have experience in using the statistical software package SPSS!
Module 2
Frequency Distributions
A frequency table is a very basic statistic that is always used at the start of any analysis of survey data no matter how big or small the research project. Because it is the first statistic used when starting any analysis of survey data, it is the first statistic we will use in this course. Regarding a basic definition, there are three main terms to apply when thinking about and using frequency distributions:
1. Descriptive statistics: There are a number of descriptive statistics that we will use in this class and a frequency distribution is one of them. Descriptive statistics are used to simply summarize and describe basic trends in our data. For example, how many people said “yes” or “no” to the question on smoking? How many people strongly agree, etc., on another question?
2. Univariate statistics: A frequency distribution is also defined as a univariate statistic. This means that these statistics simply summarize and describe trends across variables, one variable at a time. In other words, with frequency distributions, we cannot look at relationships between variables (those are bivariate statistics). With frequency distributions, we can look at how people responded to each survey question, one question after the other.
3. Categorical Data: As mentioned, there are other descriptive (univariate) statistics that we will use this semester, but it is very important to know that frequency distributions are really the only useful statistic to summarize and describe trends across categorical variables. Many survey questions, as we know, are measured on a categorical scale. Consequently, frequency distributions are essential tools when summarizing and describing trends across our set of survey questions. A frequency distribution will tell us how people responded across the categories of a survey question. For example, how many people favored and how many opposed the death penalty?
Sample Characteristics
Frequency distributions are an important tool in summarizing the characteristics of a sample of respondents. Before doing any higher analyses on the survey, it is important to take a look at the data and who responded to the survey. How many men? How many women? Where do most people live? How many are married? How many are single? Etc. Running frequency distributions on some of the more demographic survey questions in the database will provide you with information on the characteristics of the sample.
Survey Research, Writing Research: When writing a survey research report, it is important to share this information with your readers. In your paper for this class, you will be required to include a section on “Sample Characteristics” when writing up your results from your own analyses. To complete your section on “Sample Characteristics,” be sure to run frequency distributions on a set of selected demographic questions. The questions you select are entirely up to you and are based on what you believe are important factors to share when summarizing your sample to your readers. This is an important exercise. When writing survey research in the future, you will always include a section on “Sample Characteristics” and that is why it is a good habit to start that now.
Trends
A trend in survey research tells us what people are doing, what they are thinking and what values or opinions they hold. Frequency distributions allow us to summarize important trends in our survey data.
Raw Frequencies
There are five basic elements to a frequency table: raw frequencies, number of cases, percentages, valid percentages and cumulative percentages. The raw frequencies are the first numbers in the first column of the table and report how many people (the actual number) responded on each category or item on the survey question.
Do not report raw frequencies: This is the least useful information in the table and you must never report the raw frequencies when summarizing trends across your set of survey questions. Report the valid percentages instead.
Number of Cases
Rather than the raw frequencies, the total number of cases is very important to report when summarizing trends across each variable. It is important to share with your readers just how many people responded to that particular survey question. However, there are two totals to pay attention to, and remember, you only report the second one:
Total Number of Cases: This total is at the bottom of the raw frequency column and reports on the total number of people in the sample. Try to avoid reporting this number. The reason is that in many large national surveys (like the General Social Survey) not everyone in the sample is asked the same set of survey questions. Asking respondents different sets of questions allows the survey research team to capture a large amount of data on people’s opinions and behaviors. The point here is that not everybody in the survey is asked the same set of survey questions. Consequently, use the valid number of cases when reporting on the number of people that responded to that question.
Valid Number of Cases: The valid number of cases is the number that is reported in the raw frequency column, after the set of response categories. This total reports just who answered that survey question. When reporting the number of cases, report the valid number of cases; that is, report only those persons that were asked that particular survey question.
• n = As we know, the total number of cases in the database is often represented statistically through the letter “n.” Now we know that when reporting the “n” for each survey question, be sure to report only the “valid n”: the total who responded to that particular survey question.
• Survey Research, Writing Research: Review the “Rules for Writing Frequencies” on how to report the number of cases (n=) for each survey question.
Percentages
The percentages are the next column in a frequency table. But avoid this first set of percentages when summarizing trends in your survey questions. Why? Because these percentages are expressed as a function of the total number of cases and remember, not everyone in the database has been asked that particular question. Consequently, these percentages are wrong to report as they do not accurately summarize the pattern or distribution of responses. Always report the valid percentages instead.
Valid Percentages
Always report the valid percentage when summarizing trends in survey data. The valid percentages are based on the number of people who actually responded to that particular survey question. They give a more accurate summary of trends in the survey data and how people responded to the questions. Use the valid percentages always.
Cumulative Percentages
The cumulative percentages are the last set of percentages in the last column of the frequency table. They are the accumulated total of percentages as you move down the categories of responses. They are very useful in summarizing trends across survey questions as they help to collapse and summarize categories even better. For example, one could report that 69% of GSS respondents were either Protestant (47%) or Catholic (23%) to better present trends in your data (because the individual percentages are rounded, they may not sum exactly to the cumulative figure). Whenever it is useful, use the cumulative percentage column.
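The five elements of a frequency table can be computed by hand, which may help when reading SPSS output. The Python sketch below is an illustration under stated assumptions: the responses are made up, `None` stands for a respondent who was not asked or did not answer, and the function name `frequency_table` is hypothetical. It is not how SPSS is implemented.

```python
from collections import Counter

def frequency_table(responses, categories):
    """Return (rows, total_n, valid_n).

    Each row is (category, raw frequency, percent, valid percent,
    cumulative percent). None counts toward total_n but not valid_n.
    Assumes at least one valid response.
    """
    total_n = len(responses)
    counts = Counter(r for r in responses if r is not None)
    valid_n = sum(counts[c] for c in categories)
    rows = []
    cumulative = 0.0
    for category in categories:
        raw = counts[category]
        percent = 100 * raw / total_n        # based on everyone in the sample
        valid_percent = 100 * raw / valid_n  # based on those who answered
        cumulative += valid_percent          # running total of valid percent
        rows.append((category, raw, round(percent, 1),
                     round(valid_percent, 1), round(cumulative, 1)))
    return rows, total_n, valid_n

# Made-up data: 3 agree, 5 disagree, 2 not asked.
responses = ["agree"] * 3 + ["disagree"] * 5 + [None] * 2
rows, total_n, valid_n = frequency_table(responses, ["agree", "disagree"])
for row in rows:
    print(row)
print("total n =", total_n, "valid n =", valid_n)
```

In this made-up example the plain percentage for "agree" is 30.0 while the valid percentage is 37.5, which shows why the two columns can tell different stories when some respondents were never asked the question.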
Rules for Writing Frequencies
Survey Research, Writing Research: There are certain standards and acceptable practices when reporting on frequencies and trends across your set of survey questions. Review carefully the “Rules for Writing Frequencies” when reporting your results on your frequency distributions to gain those skills in writing survey research well.
Module 3
Data Visualization
As we know, a picture is worth a thousand words. Presenting our findings in graphical or some other easy-to-understand visual form will help us better communicate our findings to either an interested public or policy makers. Consequently, data visualization is an important element in the scientific process in that it helps scientists better communicate their findings, expand their audience, and educate and involve others (that might otherwise be excluded) in scientific discussions. It is therefore important that you gain some skills in visualizing and presenting data in tables and graphs.
Tables
When we think of scientific studies, we naturally think about tables and graphs. Scientists report and summarize their findings and statistics not only through text but also through several well-placed tables throughout their study/research report. In this module, you will make and format tables to scientific standards. In fact, you will be expected to develop several tables to include in your paper as we move through the semester together. An instructional video and tip sheet on making tables is included in this module.
Graphs
Graphs provide a visual representation of trends and relationships we have observed in our data. We will go over three basic types of graphs in this module: Pie Charts, Bar Charts and Line Charts. You will be expected to include several in your own research report. Be warned: SPSS is not a good package for producing good quality graphs; EXCEL is far better. In this class, you will make and format graphs using EXCEL. You will be expected to develop several charts and graphs in EXCEL. As with building your own tables, an instructional video and tip sheet on making graphs in EXCEL is included in this module.
EXCEL
Many of you are already familiar with EXCEL as it is part of the Microsoft suite on your computer. It is similar to SPSS in that it is data management software. While SPSS is far better for running statistics on our data, EXCEL is better at producing high quality graphs. You will be using both data management tools in this class.
Pie Charts
A pie chart is typically used when presenting frequency distributions (percentages) in chart or graph form. Many of you are familiar already with reading pie charts: Percentage breakdowns across response categories (for example, strongly agree to strongly disagree) are typically represented as a “slice of the pie” where each piece is proportionate to its percentage. Naturally, the higher the percentage represented on a category, the bigger the slice of the pie.
Bar Charts
Like pie charts, bar charts help researchers visually represent their data. Most of you are also familiar with bar charts. With a bar chart, rather than a slice of pie, the bars represent the percentages or trends in the data. The higher the percentage, the higher the bar. What is different with a bar chart is that there are two additional elements to consider when reading or building a bar graph:
• Y-Axis: The Y-axis is the vertical (up and down) axis. This axis shows the percentage scale. This helps the reader make sense of the trends (read the percentages) represented by the bars in the chart. One can also help the reader read the graph by including the percentages on each bar.
• X-Axis: The X-axis is the horizontal (side to side) axis. This axis contains the actual measurement scale: typically, the category responses to a survey question.
Longitudinal Design
Time is often treated as a variable in survey research. Many researchers want to know if people’s opinions, values or behaviors have changed over time. Asking whether there has been a change over time is an important research question. In this class, you might decide to do the same and explore changes and trends across time by using and analyzing data from different years. Findings from longitudinal studies are often represented graphically. In this class you might do the same. In fact, as you develop your graphs and charts to document changes (or no changes) across time, you will be adding an important variable (in this case, “time”) into your study and analysis, making for an even better research paper. Definitely consider using data from different years, rather than from just one year.
• Cross-sectional Design: Since we are talking about longitudinal research designs, it seems important to mention the opposite. When treating only one year of data, and therefore not addressing questions of changes over time, one is working within a cross-sectional design framework. It is A-OK to do cross-sectional research, especially if the data is only available for one year!
Line Charts
Line charts are used in longitudinal research studies. Think of a line chart as a bar chart without bars, with only the high point of each bar being represented by a dot. Those dots are then connected together by a line. Just like the bar chart, there are two elements to consider when reading or building a line graph:
• Y-Axis: It is the same as with the bar chart. The Y-axis is the vertical (up and down) axis. This axis shows the percentage scale. This helps the reader make sense of the trends (read the percentages) represented by the line on the chart. Of course, one can also help the reader read the line graph by including the percentages near each point.
• X-Axis: The X-axis is different in the line graph than with a bar chart. In this case, the horizontal (side to side) axis no longer represents the categories of possible responses, but rather the points in time. Time is the measurement scale now being represented on the X-axis. The researcher is interested in measuring points in time and changes over time.
Infographics
This is the new and exciting wave in presenting data and information. An infograph is a creative and artful way of communicating through combining effective graphs, images, icons and a small amount of text. An infograph, when done well, can easily summarize a scientific study (or even a whole body of scientific work) in an artful, single page display. In fact, in your homework assignments, you will be asked to locate infographs, screen capture them and discuss. While we won’t have time to make infographs of our own, consider gaining the software and graphic skills needed through other courses on campus as you build your future skills in data analysis and science. This is the wave of the future in presenting data, trends and scientific findings.
Rules for Making Tables
Survey Research, Making Tables: There are certain standards and acceptable practices when making tables for scientific survey reports. Review carefully the “Rules for Making Tables” when building your tables for your paper.
Rules for Making Graphs
Survey Research, Making Graphs: There are certain standards and acceptable practices when making graphs for scientific survey reports. Review carefully the “Rules for Making Graphs” when building your graphs for your paper. Remember, you will be using EXCEL, not SPSS, to make your graphs.
Module 4
Numeric Data
We have been introduced to numeric data already. Another term for numeric data is “continuous data.” We were introduced to that term in Module 1 but here it is again: When survey questions result in a number response rather than a checked box, this type of survey question or variable is a continuous (numeric) variable. For example, how old are you? How many brothers and sisters do you have? Both of these questions are continuous, numeric variables in that the respondent is asked to simply place a number next to the survey question. It is very important to remember that numeric data is treated differently statistically than categorical data. Why? Because we are dealing with real numbers rather than simply coded categories. Measures of central tendency and measures of variability are a good way to summarize and describe numeric data. In contrast, measures of center and measures of variability (that we will be learning about in this module) don’t apply when summarizing categorical data. And, as we have come to understand, most survey questions are measured on categorical scales (people are asked to check a box) and, as we know already, the best way to summarize this type of data is through a frequency distribution.
Distributions
While frequency distributions are important tools in summarizing categorical data, it is also important to use the “Frequency Distribution” command when summarizing numeric data. The frequency command will provide you with the full distribution of values and their corresponding frequencies and percentages as reported across the range of scores or values. For example, with age: The frequency table will show how many 18-year-olds there are all the way through how many 88-year-olds there are in the sample. The frequency table, in this case, provides the full age range or age distribution of respondents. In other words, the frequency distribution provides you with a full picture display of the data. For this reason, it is excellent practice, when handling numeric data, to ask first for a frequency distribution so as to get a full sense and picture of the distribution of scores or reported values. Having a full picture will help you in deciding how to present and summarize the data honestly and accurately in your reports and presentations.
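For numeric data, the same counting idea gives the full distribution of values. The Python sketch below uses a handful of made-up ages, not GSS data. It prints each value with a row of marks and its percentage, which also gives a rough text picture of the distribution.

```python
from collections import Counter

ages = [18, 19, 19, 22, 22, 22, 35, 88]  # made-up ages for illustration

# Sorted list of (value, frequency) pairs across the range of reported ages.
distribution = sorted(Counter(ages).items())

for age, count in distribution:
    percent = 100 * count / len(ages)
    print(f"{age:>3} {'#' * count:<4} {percent:.1f}%")
```

Seeing the lone 88 far from the other values is the kind of detail that a single summary number would hide.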
*New Term added
Histograms
A histogram provides an actual picture display of the frequency data. It is similar to a bar chart but rather than each bar representing a response category, each bar represents a numeric value. Like the bar chart, there are two basic elements to consider when reading or building a histogram:
• Y-Axis: As we know, the Y-axis is the vertical (up and down) axis. This axis shows the percentage scale. This helps the reader make sense of the trends represented by the bars and corresponding percentages on the histogram.
• X-Axis: The X-axis is the horizontal (side to side) axis. This axis contains the actual measurement scale. Typically, all of the potential numeric values are arranged in some form on this X-axis.
Descriptive Statistics
A frequency distribution is also considered a descriptive statistic, but in this module, we will use the term “descriptive statistics” to refer to the following statistics that help summarize and describe numeric data. In this case, there are two types of descriptive statistics that we will go over in this module:
1. Measures of Central Tendency: Statistics that summarize and describe a numeric distribution’s middle.
2. Measures of Dispersion: Statistics that summarize and describe how “spread out” (or dispersed) a distribution of scores actually is.
Measures of Central Tendency
As stated above, measures of central tendency are statistics that summarize a distribution’s middle. The best way to think of measures of central tendency is as “measures of typicalness.” By identifying and reporting the center of a distribution, the researcher is attempting to report what is typical in the distribution. As we will discuss, there are three different ways to measure a distribution’s middle: the mean, median and mode. And, in each case, the researcher is attempting to report on what is typical. For example, the mean age in a distribution might be 44 years old. In this case, the researcher might report that the “typical person (the average person) is 44 years old.” Just remember, another term for a measure of central tendency is a measure of typicalness, and there are three measures of center: the mean, median and mode, each of which we will go over in this module.
Mean
The mean is a measure of central tendency or, again, a measure of typicalness. Another term for the mean is the “average” and the average, as we know, is simply based on adding up all the values in a distribution and dividing by the number of cases.
• x̄ = This is the statistical symbol for reporting the mean of a distribution, otherwise known as X-bar.
• It is a bit of a pain to make in WORD, so use the default symbolic form M= when reporting the mean. This also represents the mean.
Median
The median is also a measure of central tendency or measure of typicalness. The median represents the perfect middle of a distribution, where half of the scores lie below the median value and the other half lie above.
• The median is an incredibly useful tool but is often ignored in favor of the mean. In this class though, when reporting the mean (the average) for a distribution, you will learn to compare it first to the true middle (median) of the distribution. This will allow you to see if you are getting a good reading on what is truly typical for the distribution. Otherwise, you might not be accurately reporting on trends within the data. If, for example, the mean is not close to the median value, then that measure of center might not be reporting well on what is typical for the distribution.
• Mdn= is the statistical symbol for summarizing the median of a distribution.
Mode
The mode is also a measure of central tendency or measure of typicalness. The mode represents the most often reported score in a distribution. If the data is represented in a histogram, the mode is the value that represents the “highest point,” or the highest frequency (most often reported score), in the distribution.
• The mode is also a useful tool but is also ignored in favor of the mean. In this class though, when reporting the mean (the average) for a distribution, you will learn to compare it first to the true middle (median) of the distribution and also consider the reported mean in comparison to the mode, the most often reported value, to determine if the mean is giving you a good reading on what is typical.
• Once again, considering all three measures of center will allow you to see if you are getting a good reading on what is truly typical for the distribution. Otherwise, you might not be accurately reporting on trends within the data.
• Md= is the statistical symbol for summarizing the mode of a distribution.
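The three measures of center can be sketched in a few lines of Python using the standard `statistics` module. This is only an illustration, not the SPSS procedure used in class; the function name `measures_of_center` and the ages are hypothetical.

```python
import statistics

def measures_of_center(values):
    """Return the mean (M), median (Mdn) and mode of a list of numbers."""
    return {
        "M": statistics.mean(values),      # add up all values, divide by number of cases
        "Mdn": statistics.median(values),  # middle value: half below, half above
        "Mode": statistics.mode(values),   # most often reported value
    }

# Hypothetical ages for seven respondents (not real survey data).
ages = [22, 25, 25, 31, 38, 44, 95]
print(measures_of_center(ages))
```

With these made-up ages the three measures give three different readings on the middle, which is exactly the situation where comparing them matters.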
Skewed Distributions
Most distributions of data in the real world are either somewhat or highly “skewed.” This means that these distributions are not “normal” and instead, a bit ‘crazy’. Review the definition of a normal distribution when going over this definition. The reason: once you understand what a normal distribution is, it will be easy to contrast a normal distribution with a skewed one. Here is the definition of a skewed distribution.
Skewed distributions: In contrast to a normal distribution, skewed distributions are not symmetrical but rather “heavy” on one side of the distribution or the other. That means that there might be more people than “normal” who are reporting high or low scores on the distribution and, subsequently, these reports are pulling the tails of the distribution toward either a set of high scores or low scores. The presence of extreme high or low scores will naturally influence the mean, moving it towards those high or low values, and so, if a distribution is skewed, the mean might be giving a bit of a false report on typicalness, in that it is being influenced by some extreme values. For this reason, skew is important to assess. We assess skew (or asymmetry in a distribution) through reviewing the frequency distribution of scores and also by comparing the mean to the median. If the mean is a lot lower or higher than the median (which is the perfect middle of the distribution), that indicates that the mean is being influenced (and pulled off center) by the presence of extreme low or high values in the distribution. From a scientific perspective, this is important to assess, especially in deciding what is “typical” and hence, accurately reporting that information to the public and other scientists and policy makers. If the distribution is skewed, then the mean might not accurately be reporting the center of the distribution. In that case, it might be important to report also the other measures of typicalness (median, mode) to accurately convey more information about the distribution and how people reported. Remember this when reviewing the “rules of writing” in preparation for your homework problems for this module.
Normal Distributions
Normal distributions are often referred to as “bell curves.” These distributions are shaped like a bell, with most cases falling under a certain defined area (or range of scores) just under the curve. These distributions are also referred to as symmetrical distributions. That means that if we cut a normal distribution in half, the left side is just a mirror image of the other side. If a distribution is symmetrical, that means that there is a perfectly defined center point of the distribution. In statistical terms, that suggests that the mean, median and mode are all equal. When we compare, therefore, the mean, median and mode and they are roughly the same, that indicates that the distribution is relatively symmetrical. If, on the other hand, the mean, median and mode are different and are, subsequently, giving us three different readings on the middle, then the distribution is relatively skewed. Comparing the measures of center is an important diagnostic tool for assessing the extent to which a distribution is either normally distributed or instead, somewhat skewed.
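The diagnostic of comparing the mean to the median can be illustrated with a short Python sketch. The function name `mean_minus_median` and both data sets are hypothetical, and the sketch deliberately sets no cutoff for how large a gap counts as “a lot,” since that remains a judgment call.

```python
import statistics

def mean_minus_median(values):
    """Gap between mean and median: near zero suggests symmetry,
    a large positive or negative gap suggests the mean is being pulled by extreme values."""
    return statistics.mean(values) - statistics.median(values)

# Hypothetical incomes in thousands of dollars (not real survey data).
symmetrical = [20, 30, 40, 50, 60]
skewed_high = [20, 30, 40, 50, 260]   # same data with one extreme high value

print(mean_minus_median(symmetrical))  # mean and median agree
print(mean_minus_median(skewed_high))  # mean pulled above the median
```

In the second list the median is unchanged, but the single extreme value moves the mean well above it, which is the pattern that signals skew toward high scores.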
Measures of Dispersion
In comparison to measures of center, which report on a distribution’s middle (or what is “typical”), measures of dispersion report on how “spread out” or dispersed that distribution of scores or values actually is. This is also important to report when summarizing a distribution of scores or values. Measures of dispersion help us determine, for example, whether the scores or values are tightly clustered around the center, or are spread out all over the place, with lots of spread or “variability” in the data. You can see already that spread is also important to include when reporting on a distribution of scores or values. There are three measures of dispersion that researchers use to report on spread or variability in their data: the minimum/maximum scores, the interquartile range and the standard deviation. In this module, we will talk about each of these three measures of dispersion separately.
Minimum and Maximum
The minimum/maximum values are our first measures of dispersion. Reporting the lowest value in the distribution (the minimum) and the highest value in a distribution (the maximum) is a good way to report on how spread out a distribution of scores is. This is a quick and easy measure of spread. In fact, when reporting on spread or variability in future distributions, it is expected that you report on the low (the minimum) and the high (the maximum) in the distribution. Be sure to review the “Rules for Writing Descriptive Statistics” for reporting the minimum and maximum values in a distribution of scores.
Min/Max= is the statistical symbol for summarizing the minimum and maximum of a distribution.
Interquartile Range
The interquartile range is another measure of dispersion. It also reports a range of scores, but not the lowest and highest scores. Rather, the interquartile range reports the middle 50 percent of scores in the distribution. By reporting the interquartile range, the researcher is reporting the middle half of cases that fall between the 25th and 75th percentiles. Subsequently, the scores between these two percentiles cover the middle half of the distribution. Review the “Rules for Writing Descriptive Statistics” for reporting the interquartile range for a distribution of scores. Remember, when reporting on spread or variability in a distribution, you must report the minimum and maximum value, but then you must decide to either report the interquartile range or the standard deviation. (You never report both; just decide on one or the other.)
IQR= is the statistical symbol for summarizing the middle half of a distribution.
Standard Deviation
The standard deviation is another measure of dispersion. It also reports a range of scores, but neither the lowest nor highest scores (minimum/maximum) or the range between the 25th and 75th percentile (interquartile range). Rather, the standard deviation, as its formula suggests, reports the average level of spread around the mean. By reporting the standard deviation, the researcher is reporting how far on average most cases fall from the mean. Unlike the interquartile range, which reports on the middle half of the distribution, the standard deviation reports on where most cases fall, on average, within a certain range within the distribution. Review the “Rules for Writing Descriptive Statistics” for reporting the standard deviation in a distribution of scores. Remember, once again, when reporting on spread or variability in a distribution, you must report the minimum and maximum value, but then you can decide to either report the interquartile range or the standard deviation. (You never report both, but decide on one or the other.) Remember also, from the rules of writing, that the standard deviation is not a “stand alone statistic” and that it must be reported with the mean. Review carefully the “Rules of Writing” for the proper way to report the standard deviation.
s= is the statistical symbol for summarizing the standard deviation of a distribution.
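The three measures of dispersion can be sketched in Python as follows. This is an illustration only: the function name `measures_of_dispersion` and the scores are hypothetical, the standard deviation shown is the sample version (dividing by n − 1), and percentile cut points can differ slightly between software packages, so the quartiles here may not match SPSS output exactly.

```python
import statistics

def measures_of_dispersion(values):
    """Return the minimum, maximum, 25th/75th percentiles and sample standard deviation."""
    q1, _median, q3 = statistics.quantiles(values, n=4)  # three quartile cut points
    return {
        "Min": min(values),
        "Max": max(values),
        "IQR": (q1, q3),                 # middle half of cases lies between these two
        "s": statistics.stdev(values),   # sample standard deviation, reported with the mean
        "M": statistics.mean(values),
    }

# Hypothetical scores (not real survey data).
scores = [2, 4, 4, 4, 5, 5, 7, 9]
print(measures_of_dispersion(scores))
```

The output keeps the mean alongside the standard deviation because the standard deviation is not reported on its own.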
Rules for Writing Descriptive Statistics
Survey Research, Writing Research: There are certain standards and acceptable practices when reporting descriptive statistics on continuous, numeric data. Review carefully the “Rules for Writing Descriptive Statistics” when reporting on a distribution’s center point and its spread and variability. By studying and applying those standards of writing, you will gain further skills in accurately writing up trends in your distributions.
Module 5
Propositions
We are finally at the most exciting place in the scientific process! We are now at the stage of testing our propositions. Propositions allow us to examine relationships between variables and test our hypotheses. The following is a standard definition of a proposition: A proposition is a clear, objective statement concerning the relationship between two variables. Remember, propositions are clear, not convoluted, statements. A person reading your research and your set of propositions should be able to follow easily what relationships between what variables you are testing and why. Your propositions should also be objective. The way to reinforce this objectivity is by setting your propositions out as questions: Does this variable or factor matter in explaining this one?
• For example, does gender matter in explaining church attendance?
Bivariate Statistics
When we set out and are ready to test our propositions, we are at the bivariate level of analysis. Bivariate statistics examine the strength (and significance) of relationships between variables. Bivariate statistics allow us to test whether there is empirical support for our propositions and hypotheses. In Module 5, we will be using bivariate statistics (one test in particular) to start to test our propositions.
Dependent Variables
When setting up a proposition, you must decide what will be your dependent variable. Your dependent variable is the variable that is the focus of your analysis and that you want to better understand. Think of it this way: if you were going to give your research paper/scientific study a title, your dependent variable would be included in that title.
Y= is the symbolic form for the dependent variable. Remember, in the future, if another researcher asks you what the “Y” is in your study, they are asking you what your dependent variable is.
Independent Variables
When setting up a proposition, you must decide what will be your independent variable. Your independent variable is the variable that you propose can help explain or account for differences in your dependent variable. Another term for the independent variable is the “Explanatory Variable.”
X= This is the symbolic form for the independent (or explanatory) variable.
*Added new term
Functional Proposition Statement
When setting up a proposition to test it statistically, there is a certain functional form that you are expected to follow in this class. This will help you in interpreting and writing your findings. This is the functional form:
Y = f(X)?: The way to read this statement is, “Is Y a function of X?” Even better is to read it backwards: “Does X matter in explaining Y?”
Example: Church attendance and gender
Y = f(X)?: ATTEND = f(SEX)?: “Is church attendance a function of gender?” Even better is to read it backwards: “Does gender matter in explaining church attendance?” There you have it! A proposition: a very clear, objective statement concerning the relationship between two variables. It is very clear which variable is your dependent variable and which variable is the explanatory (independent) variable in this statement. It is objective because you are proposing your proposition as a question!
*Added new term
Hypothesis
After setting up your proposition in the above functional form, the next step is to do more thinking about the proposed relationship. This is when you set out your hypothesis on what you might expect to find. You must ask yourself the following question: “How might X affect Y?” In the world of survey research, this translates into the following statement: “Who is more likely… and why?” With respect to the above example on church attendance and gender, when setting up your hypothesis, you must ask how gender might matter in explaining church attendance. Or, more directly, who is more likely to attend church more often (or less often) and why?
Ha is the symbolic form for the hypothesis. Let us set up an Ha for the above proposition:
Ha: It is expected that women might attend church more often because of their key role in the family with early childhood socialization. They may be more likely to attend church more regularly with their children.
*Added new term
Null Hypothesis
But wait! Before testing your hypothesis to see if there is statistical support for it, it is important to also state your null hypothesis. The null hypothesis sets out that you are also open to the possibility that there is no relationship between the variables that you are about to test. By stating your null hypothesis, you are reinforcing your objectivity as a researcher. Setting out the null hypothesis also helps with our decisions in accepting or rejecting our hypotheses.
Ho is the symbolic form for the null hypothesis. Let us set up an Ho for the above proposition on church attendance and gender:
Ho: It can also be expected that there is no relationship between church attendance and gender.
Crosstabs
A crosstab is our first bivariate statistic. Bivariate statistics allow us to test our propositions. There are certain types of bivariate statistics for certain types of data. Just remember, we only use crosstabs to test our propositions when both variables in the proposition statement (both Y and X) are categorical variables: Crosstabs are only a useful bivariate statistic when working with categorical data. But since most survey data is categorical, a crosstab is a very important tool in survey research. Crosstabs are pretty easy to interpret and report on, but it requires a little training and guidance in using this bivariate tool correctly. Pay careful attention to the instructional videos and tip sheets on constructing crosstabs and interpreting and writing up your crosstab results.
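To make the idea of a crosstab concrete, here is a minimal Python sketch that cross-tabulates two categorical variables and reports percentages. It is not the SPSS procedure taught in the videos. The function name `crosstab_percent` and the data are hypothetical, and computing percentages within each category of the independent variable (X) is an assumption made for this illustration; follow the tip sheets for the convention used in class.

```python
from collections import Counter

def crosstab_percent(pairs):
    """pairs: list of (x, y) category pairs, one per respondent.
    Returns {x: {y: percent of that x category's cases}}."""
    cell_counts = Counter(pairs)
    x_totals = Counter(x for x, _ in pairs)
    table = {}
    for (x, y), n in cell_counts.items():
        table.setdefault(x, {})[y] = round(100 * n / x_totals[x], 1)
    return table

# Hypothetical respondents (not real GSS data): gender (X) and church attendance (Y).
respondents = (
    [("Women", "Weekly")] * 6 + [("Women", "Rarely")] * 4
    + [("Men", "Weekly")] * 3 + [("Men", "Rarely")] * 7
)
print(crosstab_percent(respondents))
```

Reading across the two gender categories answers the proposition’s question, “Who is more likely…?”, by comparing the percentage of each group that reports attending weekly.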
Rules for Writing Crosstabs
Survey Research, Writing Research: There are certain standards and acceptable practices when reporting your percentages on crosstabs data. Review carefully the “Rules for Writing Crosstabs” when reporting your percentage results from the propositions that you tested through a crosstab. By studying and applying those standards of writing, you will gain incredibly important skills in writing up the results of your research findings and in writing science.
Module 6
Tests of Statistical Significance
Tests of statistical significance are often referred to as “inferential statistics” or “tests of confidence.” We can use those terms interchangeably. Tests of significance allow the researcher to infer from their sample data to the population that they are interested in understanding. Inferential statistics allow the researcher to assert with a “certain level of confidence” whether the relationships they are observing in the data are “statistically significant.” What “statistical significance” means is that the researcher is confident that the findings that they have observed in the data are generalizable to the population.
Population
The population is the group that the researcher/scientist is ultimately interested in studying. This has been implied so far in the papers that you have written. When using the General Social Survey, you have been most concerned about the US public, not GSS respondents per se. But researchers, as we know, are never really in a position to survey everyone in a population. That is why they take a sample and attempt to infer or generalize from their sample to the population.
Sample
Sampling is part of the scientific process in our efforts to understand a population, be it a watershed, birds or people. Subsequently, devising a good sampling strategy is absolutely essential in collecting good data that represents the population that is being studied.
EPSEM Sample
This is your way to evaluate whether a good sample has been taken and whether the study and its conclusions are based on good data. EPSEM stands for Equal Probability of Selection Method. Think of it this way: If everyone in the population had an equal chance of being selected to be in the sample, then there is a good chance that the sample data is somewhat representative of the population and that the findings from the sample can therefore be generalized to the population that is being studied. EPSEM is a principal assumption of inferential statistics. Before using inferential statistics, the sample data should be more or less representative of the population, with EPSEM methods used to collect the sample data. Otherwise, you are using these inferential statistics to generalize about a population with data that is not even representative of that population. That is bad science. Be aware. Bad science happens all the time. Now that you are educated in the practice and principles of good science, you have a responsibility to expose such bad practices in the future. When evaluating any scientific study in the future, review carefully the sampling methods to determine whether the data is a good representative cross section of the study population. If not, the data, subsequent findings, and conclusions can be called into question.
Response Rates
Along with the EPSEM sample design, the other factor to consider when evaluating good or bad survey data is the response rate. The response rate reports how many persons who received the survey actually took the time to complete it. Response rates can oftentimes be terribly low, reaching as low as 20%. When only 20% of those selected to participate actually complete the survey, can the survey data really be representative of the population? When evaluating survey research in the future, pay close attention to the response rate to determine whether the study is based on good or bad data.
Representativeness
When devising their sampling design and collecting the data, all researchers strive to achieve a sample that is representative of the broader population that is under study. In other words, data that is representative ‘looks like’ the population that is under investigation. Representative data is good data and is the basis of any sound scientific study.
Chi Square Tests
Inferential statistics should only be used when the sampling design is based on EPSEM probability methods and good response rates. This is because the data is more likely to represent the population, and these statistics are then useful to infer from the sample to that population. Chi Square is an inferential statistic or test of statistical significance, otherwise known as a “test of confidence.” There are other tests of significance, but Chi Square is the test of significance to use when working with categorical (crosstab) data. It is calculated based on the observed and expected frequencies in a crosstab, but we need not go over its calculation here, beyond just showing its formula. What is important for us is to know how to use and interpret this statistic: Be sure to review carefully the instructional video for this module and also the tip sheet for interpreting Chi Square. Interpretation of Chi Square results is based on the following steps:
1) Set out your level of confidence.
2) Next, compare the Chi Square reported significance value on your SPSS output to a set p-value that corresponds with your confidence level.
3) Then, based on a set of easy decision rules, you will be in a position to quickly and easily accept or reject a proposition and relationship as “statistically significant”.
Next, we will talk more about levels of confidence, p-values and the easy decision rules that you will use to determine whether the relationships that you are testing in your crosstabs are statistically significant and, subsequently, generalizable to the population (at a certain level of confidence).
χ²= is the statistical symbol for summarizing the Chi Square statistic.
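For the curious, the calculation from observed and expected frequencies can be sketched in Python. This uses the standard Pearson form of the statistic: for each cell, the expected count is (row total × column total) / N, and the statistic sums (observed − expected)² / expected over all cells. The function name `chi_square` and the counts are hypothetical, and the sketch stops at the statistic itself; the significance value that you compare to your p-value is read from the SPSS output.

```python
def chi_square(observed):
    """Pearson chi-square statistic for a crosstab given as a list of rows of counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # count expected if no relationship
            statistic += (obs - expected) ** 2 / expected
    return statistic

# Hypothetical crosstab counts (not real GSS data).
# Rows: attends weekly, attends rarely. Columns: women, men.
observed = [[6, 3],
            [4, 7]]
print(round(chi_square(observed), 3))
```

When the observed counts match what would be expected under no relationship, the statistic is zero; the further the observed counts depart from the expected ones, the larger it becomes.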
Confidence Levels
Inferential statistics are based on real assumptions of representative data but are also based on some abstract principles of probability. This is not a class in probability theory but rather a class in the application of the scientific method. For that reason, let us talk about how to apply these principles of probability theory to our own scientific studies. There are three levels of confidence that a researcher usually applies in accepting or rejecting a hypothesis as “statistically significant” or generalizable to the population. Let us discuss all three levels of confidence: the 95%, 99% and 99.9% levels of confidence. Be warned, this is a little abstract:
1. 95% Confidence Level: This is the standard level of confidence that is used in social research. To accept a relationship as statistically significant, think of it this way: If hypothetically you were able to draw 100 samples from the same population (and each time, everyone was “thrown back” so they had a chance of being sampled again) and if 95 of those 100 samples yielded more or less the same results, then the research has met the “95% confidence level.” Of course, this would never happen in the real world practice of research, but it is a good way of thinking about the 95% confidence level.
2. 99% Confidence Level: The same can be said at the 99% confidence level, but this time with 99 samples yielding the same results out of 100 samples drawn.
3. 99.9% Confidence Level: The same can be said at the 99.9% confidence level, but this time with 999 samples out of 1000 samples yielding the same results.
If a researcher can assert that their findings are “statistically significant” at either the 95%, 99% or 99.9% level of confidence, then the researcher has a very strong case that those findings are generalizable to the population. But remember, the assumption is that the data is representative and generalizable to the population from the start, and remember, this is not always the case. The data that scientists work with is not always good. It is your job as a critical consumer of research and of future scientific studies to be aware of that and ask those basic questions: How was the data collected? What was the response rate?
Admittedly, this discussion on level of confidence is still quite an abstraction until we discuss these levels of confidence in light of their p-values. You will need the p-value in reading your SPSS output.
P-Value
The p-value and confidence levels are more or less the same thing. They are the other side of the nickel, so to speak. Let us go over the p-values for each of the above confidence levels.
P-value, 95% Confidence Level: Remember the assumption at the 95% confidence level: If 100 samples were drawn and 95 yielded the same results, then you would have met your 95% confidence criterion. Now, if we express that as a proportion, that is .95, and the other side of that (the other side of the nickel) is 1 − .95 = .05.
• Subsequently, at the 95% confidence level the p-value is .05.
• This is the standard level of confidence that is used to determine statistical significance and is an important criterion for determining statistical significance. As you will see in the instructional video, based on the reported significance level of our inferential statistics, if the reported probability is less than or equal to .05, that means the relationship is statistically significant at the 95% level of confidence.
Here are your easy decision rules in using this p-value:
• If the reported Chi Square significance level is less than or equal to .05, the relationship is statistically significant.
• If the reported Chi Square significance level is greater than .05, the relationship is not statistically significant; accept the null hypothesis of no relationship.
The same standards apply for the 99% level of confidence. In this case the p-value is .01 and the decision rules are the following:
• If the reported Chi Square significance level is less than or equal to .01, the relationship is statistically significant.
• If, on the other hand, the reported Chi Square significance level is greater than .01, that means the relationship is not statistically significant; accept the null hypothesis of no relationship.
The same applies for the 99.9% level of confidence. In this case the p-value is .001 and the decision rules are the following:
• If the reported Chi Square significance level is less than or equal to .001, the relationship is statistically significant.
• If, on the other hand, the reported Chi Square significance level is greater than .001, that means the relationship is not statistically significant; accept the null hypothesis.
Be sure to review the instructional videos, tip sheets and rules for interpreting and writing up your Chi Square results. Based on the instructional video especially, you will see how the above p-values are used in easily determining statistical significance.
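The decision rules above can be written out as a short Python sketch. The names `P_VALUES` and `is_significant` are made up for this illustration; the only inputs are the significance value reported on your output and the confidence level you set out in advance.

```python
# p-value that corresponds to each confidence level (the "other side of the nickel").
P_VALUES = {"95%": 0.05, "99%": 0.01, "99.9%": 0.001}

def is_significant(reported_sig, confidence="95%"):
    """True if the reported Chi Square significance level is less than or equal to
    the p-value for the chosen confidence level; otherwise the null hypothesis is accepted."""
    return reported_sig <= P_VALUES[confidence]

# Hypothetical reported significance value of .03 (not from a real output).
for level in P_VALUES:
    verdict = "statistically significant" if is_significant(0.03, level) else "not statistically significant"
    print(f"At the {level} confidence level: {verdict}")
```

Note that the same reported value can pass at the 95% level and fail at the stricter 99% and 99.9% levels, which is why the confidence level is set out first.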
Rules for Writing Chi Square
Survey Research, Writing Research: There are certain standards and acceptable practices when writing up your chi square results. Review carefully the “Rules for Writing Chi Square” when reporting on statistical significance for the propositions that you tested through a crosstab and chi square. Once again, by studying and applying these standards of writing, you will gain incredibly important skills in writing up the results of your research findings and in writing science.