DESCRIPTION
A talk given to final-year B.Tech. Computer Science students at Jamia Millia Islamia, New Delhi, India, with the intent of spreading awareness about web archiving and digital preservation and motivating the students to pursue research.
Web Archiving: A Brief Introduction
Sawood Alam
Department of Computer Science
Old Dominion University
Norfolk, Virginia - 23529
About Me
Sawood Alam
Lexical Signature
Web, Digital Library, Web Archiving, Ruby on Rails, PHP, XHTML, CSS, JavaScript, ExtJS, Urdu, RTL and Linux.
● BTech, Jamia Millia Islamia, India, 2008
● MSc, Old Dominion University, USA, 2013
● PhD, Old Dominion University, USA, Current
Agenda
● What is an archive?
● What is Web archiving?
● Why do we care about archiving?
● Issues and challenges
● Various archiving efforts
● Tools and techniques
● WSDL research group
● My research: Archive X-Ray!
● Research opportunities
● Higher education: how to study abroad?
What is an Archive?
● Accumulation of historical records
● Long term storage and preservation
● Less frequently used
● Physical or digital
What is Web Archiving?
● Periodic snapshots of web pages
● Preserving important events on the Web
● Making archived content accessible
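The idea of periodic snapshots can be illustrated with a toy example. The sketch below is only an illustration, not how any real archive is implemented: `SnapshotStore`, `capture` and `closest` are hypothetical names, captures are held in memory, and "replay" is reduced to returning the capture nearest to a requested date.

```python
from bisect import bisect_left
from datetime import datetime


class SnapshotStore:
    """Toy in-memory archive (hypothetical): timestamped captures per URL."""

    def __init__(self):
        self._captures = {}  # url -> list of (datetime, content), kept sorted

    def capture(self, url, when, content):
        """Record one snapshot of `url` taken at time `when`."""
        captures = self._captures.setdefault(url, [])
        captures.append((when, content))
        captures.sort(key=lambda pair: pair[0])

    def closest(self, url, when):
        """Return the (datetime, content) capture nearest to `when`, or None."""
        captures = self._captures.get(url)
        if not captures:
            return None
        times = [t for t, _ in captures]
        i = bisect_left(times, when)
        # Only the neighbours on either side of the insertion point can be nearest.
        candidates = captures[max(i - 1, 0):i + 1]
        return min(candidates, key=lambda pair: abs(pair[0] - when))


store = SnapshotStore()
store.capture("http://example.com/", datetime(2014, 1, 1), "version 1")
store.capture("http://example.com/", datetime(2014, 6, 1), "version 2")
nearest = store.closest("http://example.com/", datetime(2014, 5, 1))
```

Here `nearest` holds the June 2014 capture, since it is closer in time to the requested date than the January one.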
Why do We Care About Archiving?
Web content decays rapidly!
● To preserve the history
● To tell a story
● For evidence
● For backup
● For personal satisfaction
Issues and Challenges
● Crawling
● Storage
● Retrieval
● Replay
● Accessibility
● Completeness
● Accuracy
● Credibility
Web Archiving Efforts
● Internet Archive
● Archive-It
● Wikipedia
● UK Web Archive
● Various national and non-profit archives
● Film, music and other multimedia archives
● Scholarly archives
● Personal archiving
Tools and Techniques
● Heritrix
● Wget
● cURL
● OpenWayback
● Memento
● CarbonDate
● Warrick
● Synchronicity
● Preserve Me!
● WARCreate and WAIL
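As one concrete illustration of the tools above, Memento (RFC 7089) lets a client ask for a past version of a page by sending an `Accept-Datetime` request header in HTTP-date format. The sketch below only builds that header and makes no network request; the function name `accept_datetime_header` is a hypothetical helper, and it assumes the caller passes a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def accept_datetime_header(when):
    """Build a Memento Accept-Datetime header from a timezone-aware datetime."""
    # HTTP dates are always expressed in GMT, so normalize to UTC first.
    utc = when.astimezone(timezone.utc)
    return {"Accept-Datetime": format_datetime(utc, usegmt=True)}


headers = accept_datetime_header(datetime(2015, 1, 1, tzinfo=timezone.utc))
# headers == {"Accept-Datetime": "Thu, 01 Jan 2015 00:00:00 GMT"}
```

A client would send these headers to a Memento TimeGate, which then negotiates the archived copy closest to the requested datetime.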
WSDL Research Group
● Web Science and Digital Libraries Research Group
● Home Page: ws-dl.cs.odu.edu
● Blog: ws-dl.blogspot.com
● Twitter: @WebSciDL
● Flickr: flickr.com/photos/124419986@N07
Archive X-Ray!
● How much of the Web is archived?
● Profiling various archive services
● Predicting what they contain
● Routing Memento aggregator queries
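To make the idea of profiling and query routing concrete, here is a minimal sketch under strong assumptions. It is not the Archive X-Ray method itself: the profile here is simply a count of captured URIs per top-level domain, and `tld_of`, `build_profile` and `route` are hypothetical helpers. An aggregator could use such a profile to query only the archives likely to hold a given URI.

```python
from urllib.parse import urlsplit


def tld_of(uri):
    """Return the last label of the host name, e.g. 'uk' for bbc.co.uk."""
    host = urlsplit(uri).hostname or ""
    return host.rsplit(".", 1)[-1].lower()


def build_profile(holdings):
    """Summarize an archive's holdings as a count of URIs per TLD."""
    profile = {}
    for uri in holdings:
        tld = tld_of(uri)
        profile[tld] = profile.get(tld, 0) + 1
    return profile


def route(uri, profiles):
    """Return archives whose profile covers the URI's TLD, largest count first."""
    tld = tld_of(uri)
    hits = [(p[tld], name) for name, p in profiles.items() if tld in p]
    return [name for _, name in sorted(hits, key=lambda h: (-h[0], h[1]))]


# Made-up holdings for two hypothetical archives.
profiles = {
    "archive_a": build_profile(["http://bbc.co.uk/news", "http://example.com/"]),
    "archive_b": build_profile(["http://example.org/", "http://a.uk/", "http://b.uk/"]),
}
targets = route("http://www.gov.uk/", profiles)
```

With these made-up holdings, a `.uk` lookup is routed to `archive_b` first and `archive_a` second, while a URI under a TLD that neither profile covers is routed nowhere.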
Research Opportunities
● Information retrieval
● Information visualization
● Client and server side archiving
● Archiving dynamic content
● Distributed archiving
● Discovering alternate long term archiving techniques
● Predicting “important” events on the Web and archiving them in a timely manner
Higher Education Abroad
● Select your field of interest
● Find potential universities in your field
● Approach professors
● Approach alumni
● GRE and TOEFL
● Expenses and funding options
○ Scholarship
○ Assistantship and on-campus jobs
○ Education loan and self financing
Sawood Alam
Department of Computer Science
Old Dominion University
Norfolk, Virginia - 23529
[email protected]
Twitter: @ibnesayeed
www.cs.odu.edu/~salam