Presentation for the Society of American Archivists Web Archiving Roundtable session at the 2014 Annual Meeting.
Content Working Group
2013 NDSA Web Archiving Survey Report Highlights
Nicholas Taylor (@nullhandle)
Web Archiving Service Manager
Stanford University Libraries
SAA Annual Meeting: Web Archiving Roundtable
August 13, 2014
NDSA Web Archiving Survey Working Group
• Jefferson Bailey, Internet Archive / Archive-It
• Kristine Hanna, Internet Archive / Archive-It
• Edward McCain, University of Missouri
• Cathy Hartman, University of North Texas
• Abbie Grotke, Library of Congress
• Christie Moffatt, National Library of Medicine
• Nicholas Taylor, Stanford University
NDSA Web Archiving survey background
2011
• 78 respondents
• program info
• tools and services
• access
• policies
2013
• 92 respondents
• program info
  • staff time, metrics, skills, content concerns
• tools and services
• access and discovery
  • new discovery options
• policies
  • embargo, social media, robots.txt, resources
Respondent Characteristics
“Lego People” by Scoobay under CC BY-NC-SA 2.0
universities still make up most programs
• College or University: 47% (2011), 52% (2013)
• Archive: 13% (2011), 15% (2013)
• State Gov: 13% (2011), 13% (2013)
• Other: 12% (2011), 8% (2013)
• Fed Gov: 8% (2011), 4% (2013)
• Commercial: 3% (2011), 4% (2013)
• Public Library: 3% (2011), 2% (2013)
• Museum: 3% (2011), 1% (2013)
SAA WebArch RT tops group affiliations
group: 2011, 2013
• (group name not captured): 8%, 7%
• (group name not captured): 31%, 33%
• SAA WebArch RT: 45%
most programs are fractionally staffed
less than 25% FTE
25% FTE
40-50% FTE
1 FTE
1 to 3 FTE
3.5 to 15 FTE
Maturity and Progress
“Apple Mouse Evolution” by raneko under CC BY 2.0
programs have matured slightly since 2011
• Active: 64% (2011), 72% (2013)
• Testing: 16% (2011), 14% (2013)
• Planning: 17% (2011), 9% (2013)
• No longer collecting: 4% (2011), 2% (2013)
strong perceptions of progress since 2011
• Significant progress: 39%
• Some progress: 36%
• About the same: 20%
• Slightly worse off: 2%
• Much worse off: 2%
many new programs since 2011
Number of organizations, by year:
1996: 1; 1997: 0; 1998: 3; 1999: 0; 2000: 2; 2001: 1
2002: 2; 2003: 0; 2004: 2; 2005: 3; 2006: 8; 2007: 6
2008: 5; 2009: 4; 2010: 6; 2011: 7; 2012: 12; 2013: 19
Archiving Focus
“Ant Farm Media Van v.08 (Time Capsule) in Bellewether at Southern Exposure” by Steve Rhodes under CC BY-NC-SA 2.0
more programs are only self-archiving
• Archive other sites only: 31% (2011), 15% (2013)
• Archive both: 49% (2011), 48% (2013)
• Archive own site only: 20% (2011), 37% (2013)
concern about social media, databases, video
Number of organizations:
• Social Media: 69
• Databases: 65
• Video: 64
• Interactive Media: 49
• Audio: 40
• Blogs: 32
• Art: 16
untapped interest in collaboration
Response options: Yes; No; Not yet, but interested; Don't know
• 2011: 21%, 72%, 7%
• 2013: 17%, 47%, 33%, 2%
Tools and Services
“Photocopier” by Joriel "Joz" Jimenez under CC BY-NC-ND 2.0
web archiving as a service still most popular
• External: 60% (2011), 63% (2013)
• In-house: 25% (2011), 20% (2013)
• Both: 14% (2011), 16% (2013)
data not transferred from service provider
• Transferred: 19% (2011), 20% (2013)
• Haven't transferred: 81% (2011), 80% (2013)
increased use of tools supporting W/ARC
• Supports W/ARC: 24% (2011), 38% (2013)
• Doesn't support W/ARC: 76% (2011), 62% (2013)
less granular descriptive metadata
• URL search: 62% (2011), 66% (2013)
• Full-text search: 47% (2011), 55% (2013)
• Browse by URL: 30% (2011), 36% (2013)
• Browse by title: 54% (2011), 60% (2013)
• Collection-level description: 43% (2011), 50% (2013)
• Item-level description: 22% (2011), 18% (2013)
• Finding aids: 20%
• APIs: 5%
• Other: 20%
Archiving Policies
“Handle With Care” by ServInt under CC BY-NC-ND 2.0
most don’t notify or seek permission
• Capture: no action 42, notify 17, request permission 14
• Provide restricted access: no action 42, notify 7, request permission 13
• Provide public access: no action 45, notify 11, request permission 15
more conditional handling of robots.txt
• Always respect robots.txt: 38% (2011), 22% (2013)
• Sometimes/conditionally respect robots.txt: 33% (2011), 55% (2013)
• Never respect robots.txt: 8% (2011), 8% (2013)
• Don't know: 21% (2011), 16% (2013)
social media archiving policies are uncommon
• Has social media archiving policy: 24%
• Lacks social media archiving policy: 76%
policies based on community practices
• Other organizations: 37%
• ARL Code of Best Practices: 27%
• Section 108 Study Group: 17%
• Counsel or service provider: 7%
• Oakland Archive Policy: 4%
• Statute: 4%
• Don't know: 5%
takeaways and questions for SAA WebArch RT
• for individual organizations:
  • if you’re only self-archiving, what’s on your roadmap?
  • how are you preserving your web archive data?
  • how do you describe and enable discovery of web archives?
  • how do you handle robots.txt?
  • what are your plans for social media archiving policy?
• for the group:
  • what is this group (vs. IIPC, NDSA) best equipped to do?
  • what kind of collaboration are you interested in?
Nicholas Taylor
@nullhandle
“Thank You” by vistamommy under CC BY 2.0