Understanding Data: An Information Literacy Perspective
Elaine M. Lasda Bergman
Associate Librarian
Dewey Graduate Library
UNL 299: Information Literacy in Mathematics and Statistics
February 18, 2016
https://upload.wikimedia.org/wikipedia/en/0/09/DataTNG.jpg
Selecting Data
• What is your question?
• What are the variables needed in the dataset that will answer this question?
• How will you need to manipulate the data to arrive at meaningful conclusions?
• What format does the dataset need to be in to manipulate the data?
Do you have to collect your own data?
• First party data
– Collected by entity doing the analysis
– Unique; often a direct relationship to data source
– Trustworthy (?)
– Smaller dataset (mostly)
• Second party data
– Access is from another platform, but you have access to it
    • Repositories
– Creator of original platform has direct relationship to data source
– Trustworthy (?)
• Third party data
– Access from another platform
– Collected anonymously; without user consent
    • “data exhaust”
– Large, aggregated datasets
– Trustworthy (?)
Data Collection
• Empirical
• Survey
• Ethnographic
• Focus Group
• Qualitative vs. Quantitative
Variable Types
• NOT all data is numerical!
– Nominal
– Ordinal
– Discrete
– Continuous
• How do these data types affect the creation of descriptive statistics about the data?
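One way to explore that question is with a small Python sketch using only the standard library. The function name `describe` and all sample values are invented for illustration and are not from the slides. The sketch assumes the usual convention: nominal data supports only the mode, ordinal data adds the median, and numeric (discrete or continuous) data also supports the mean and standard deviation.

```python
import statistics

def describe(values, level, order=None):
    """Return the descriptive statistics that make sense for a variable type.

    level: "nominal", "ordinal", "discrete", or "continuous".
    order: for ordinal data, the categories listed from lowest to highest.
    """
    summary = {}
    if level != "continuous":
        # Repeated values are expected here, so the mode is informative.
        summary["mode"] = statistics.mode(values)
    if level == "nominal":
        return summary  # categories have no order, so only the mode applies
    if level == "ordinal":
        # Rank the categories, then take the middle observation.
        ranked = sorted(values, key=order.index)
        summary["median"] = ranked[len(ranked) // 2]
        return summary
    # Discrete and continuous values are numbers, so arithmetic is meaningful.
    summary["median"] = statistics.median(values)
    summary["mean"] = statistics.mean(values)
    summary["stdev"] = statistics.stdev(values)
    return summary

# Hypothetical sample values, invented for illustration.
majors = ["math", "stats", "math", "physics"]            # nominal
ratings = ["low", "medium", "medium", "high", "medium"]  # ordinal
siblings = [0, 2, 1, 1, 3]                               # discrete
heights_cm = [160.5, 172.0, 168.2, 181.3]                # continuous

print(describe(majors, "nominal"))
print(describe(ratings, "ordinal", order=["low", "medium", "high"]))
print(describe(siblings, "discrete"))
print(describe(heights_cm, "continuous"))
```

Asking for a mean of `majors` would raise an error, which is the point: the variable type limits which summaries are meaningful.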
Dataset Architecture
• .csv/Spreadsheet
• RDBs
• Linked Datasets
• OODBs
• “Big Data”
• Others…
Metadata
• Data about data
• “Attributes” of variables
https://library.uoregon.edu/datamanagement/metadata.html
• How does Metadata aid in our use of a dataset?
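As one possible answer, the sketch below treats metadata as a data dictionary that tells us how to read each variable. The variable names, labels, units, and allowed values are all hypothetical, and `parse_row` is an illustrative helper rather than part of any library.

```python
# Hypothetical data dictionary: one metadata record per variable.
data_dictionary = {
    "age": {"label": "Age at survey", "type": int, "unit": "years"},
    "income": {"label": "Annual household income", "type": float, "unit": "USD"},
    "region": {"label": "Census region", "type": str,
               "allowed": {"NE", "MW", "S", "W"}},
}

def parse_row(raw_row, dictionary):
    """Convert raw text fields to typed values using the variable metadata."""
    parsed = {}
    for name, text in raw_row.items():
        meta = dictionary[name]
        value = meta["type"](text)  # e.g. int("34") -> 34
        if "allowed" in meta and value not in meta["allowed"]:
            raise ValueError(f"{name}: {value!r} is not an allowed value")
        parsed[name] = value
    return parsed

row = parse_row({"age": "34", "income": "52000.50", "region": "NE"},
                data_dictionary)
print(row)
```

Without the dictionary, "34" is just text; with it, we know the value is an age in years and can check that "region" holds one of the documented codes.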
Evaluation of Data
• GIGO (garbage in, garbage out): data cleaning
• Current?
• Authoritative?
• Format
• Replicable?
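A minimal sketch of what data cleaning can look like in practice, using Python's standard `csv` module. The sample records, the column names, and the 0 to 120 age range are assumptions made up for this example; real cleaning rules depend on the dataset.

```python
import csv
import io

# Hypothetical raw data with typical problems: stray spaces,
# inconsistent case, a missing value, an impossible value, a duplicate.
RAW = """id,age,region
1, 34 ,ne
2,,MW
3,210,S
1, 34 ,ne
4,29,w
"""

def clean(rows):
    """Split rows into cleaned records and rejected records."""
    seen_ids = set()
    cleaned, rejected = [], []
    for row in rows:
        row = {key: value.strip() for key, value in row.items()}
        row["region"] = row["region"].upper()  # normalize category codes
        if row["id"] in seen_ids:
            continue  # drop exact repeats of an id already processed
        seen_ids.add(row["id"])
        # Reject missing, non-numeric, or out-of-range ages.
        if not row["age"].isdigit() or not 0 <= int(row["age"]) <= 120:
            rejected.append(row)
            continue
        row["age"] = int(row["age"])
        cleaned.append(row)
    return cleaned, rejected

cleaned, rejected = clean(csv.DictReader(io.StringIO(RAW)))
print("kept:", cleaned)
print("rejected:", rejected)
```

Keeping the rejected rows, rather than silently discarding them, leaves a record of what was removed and why, which supports replicability.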
Tools for Analysis
• R
• Stata
• SPSS
• Excel
• Weka
• Hadoop
Etc., etc., etc.
Ethics
• In data collection
• In data usage
• In data storage and preservation
• In data disposal