Dr. Stephen Gant, CSC – April 13-16, 2010 – EPA Web Workgroup – RTP, NC
Web Analytics at EPA
Dr. Stephen P. Gant (CSC) [email protected]
• What do you know about your website?
• What comments were there about your site in the ACSI?
• What files do you have out there? Any ROT?
• What is the quality of your site? Is it getting better or worse?
• How many visitors do you have (really)?
• What metadata should you have in your page that you don’t have now?
• What queries bring people to your pages?
• Where do people come from to get to you (Google, Wikipedia, Facebook)?
• What link text did they click on to get to you?
• What did they search for – from your page?
• Where did they click on your page?
Web Analytics Center of Excellence
• Players
– OEI/OIAA Branch Chief – Jonda Byrd
– Center EPA Project Manager – Dr. Charlotte Cottrill
– Center Director – Dr. Stephen Gant (CSC)
• Goals
– Build a Web Analytics Tool Chest to measure different facets of visitor interaction with www.epa.gov
– Produce metrics about visitors’ behavior and opinions, and quality attributes of pages
– Support identifying issues and actionable Web improvements
– Answer the questions on the previous slide
– Provide infrastructure to support custom WA services through the Working Capital Fund
New Capabilities
• WA Wiki – more documentation
• Daily Top Requests Report
• Monthly Agency Dashboard
• Traffic Reports for Blogs
• Traffic Reports for Opengov
• 2nd Maxamine Server
• Traffic Reports for Oracle
Web Analytics Wiki
• The major source of information about EPA’s suite of web metrics/web analytics tools, reports, and contacts
– Where they are
– How to interpret them
• https://kestrel.rtpnc.epa.gov/webwiki/index.php/Web_Analytics
• Use your LAN ID and password to log in
Daily Top Requests Page
• Top 20 Report – generated daily, listing last 7 days, average for last 7 days, and last 30 days
• http://www.epa.gov/webmast1/top20/top20.htm
• Top Requested Pages and Documents (HTML, PDF)
• Top Pages With Links to www.epa.gov pages
• Top In-Bound Search Terms (Google, Yahoo, outside search engines, and requests from NL Search results pages)
• Top Out-Bound Search Terms (Search Terms from an EPA search box on an EPA web page)
• Top Widget and Iframe entries (Small pieces of HTML reused across EPA web pages)
• Excludes requests from epa.gov computers
• Will soon exclude bots
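A minimal sketch of how a "top requests" list that excludes epa.gov computers could be tallied. This is not the report's actual implementation: the record layout (requesting host, requested path), the function name `top_requests`, and the sample hosts are all assumptions made for illustration.

```python
from collections import Counter

def top_requests(records, n=20, internal_suffix=".epa.gov"):
    """Return the n most requested paths, skipping internal hosts.

    records: iterable of (requesting_host, requested_path) pairs.
    The pair layout is a hypothetical stand-in for a parsed log line.
    """
    counts = Counter()
    bare_domain = internal_suffix.lstrip(".")
    for host, path in records:
        host = host.lower()
        # Skip requests that come from epa.gov computers.
        if host == bare_domain or host.endswith(internal_suffix):
            continue
        counts[path] += 1
    return counts.most_common(n)

# Hypothetical sample data, not real traffic.
sample = [
    ("pc1.rtpnc.epa.gov", "/index.html"),   # internal, skipped
    ("crawl.example.com", "/index.html"),
    ("dsl.example.net", "/index.html"),
    ("dsl.example.net", "/air/report.pdf"),
]
top = top_requests(sample)
```

Running the same tally over the last 7 and the last 30 days of records would give the two windows the report lists.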
Monthly Dashboard Report
• On the Web Analytics Wiki https://kestrel.rtpnc.epa.gov/webwiki/index.php/Dashboard
• Email notification – request from [email protected]
• Working on RSS feed
• Working on automatic generation
• Wiki holds previous dashboards
• Available as PowerPoint slide
Work toward Nicer Charts
New Maxamine Capabilities
• Chocolate (Maxamine server) soon joined by Vanilla
• Significantly increases our ability to expand (Chocolate is maxed out)
• Will distribute QA and Traffic loads and custom capabilities
Maxamine Coverage
https://maxamine.epa.gov/maxcentral/
• www.epa.gov – Static web pages
• cfpub.epa.gov – ColdFusion server
• yosemite.epa.gov – Domino server
• chocolate.gov – Maxamine server
• oaspub.gov – Oracle server (NEW!) https://maxamine.epa.gov/enviro/
• sparrow.gov – Blog server (NEW!) https://maxamine.epa.gov/blog/
• Opengov – Opengov Website (NEW!) https://maxamine.epa.gov/opengov/
How the Raw Hits Data Get Turned into Maxamine Numbers
Nerlpage, December 2009 stats

PDF requests (PDF inflation removed)
• 23,416 unfiltered raw PDF requests
• Less 16,036 with status 206 (Partial Content): 7,380 left
• Less 1,493 with status 404 (File Not Found): 5,887 left
• Less 2 with status 405 (Method Not Allowed): 5,885 left
• Less 4,218 bot requests removed: 1,667 hits in Maxamine report (human views)

HTM/HTML requests
• 16,395 unfiltered raw HTML requests
• Less 40 with status 206 (Partial Content): 16,355 left
• Less 1,121 with status 404 (File Not Found): 15,234 left
• Less 5,859 bot requests removed: 9,375 hits in Maxamine report (human views)
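The filtering steps on this slide can be sketched as a small function that drops requests by HTTP status and then by user agent. This is an illustration under stated assumptions, not Maxamine's actual logic: the `(status, user_agent)` record layout, the `BOT_MARKERS` substrings, and the function names are hypothetical. The arithmetic at the end simply rechecks the slide's December 2009 figures.

```python
from collections import Counter

# Statuses the slide shows being removed from the raw counts.
EXCLUDED_STATUSES = {206, 404, 405}
# Hypothetical substrings for spotting bots; a real bot list is much longer.
BOT_MARKERS = ("bot", "crawler", "spider")

def is_bot(user_agent):
    """Crude bot check based on the user-agent string (an assumption)."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)

def filter_hits(requests):
    """Split (status, user_agent) records into kept hits and a removal tally."""
    removed = Counter()
    kept = []
    for status, user_agent in requests:
        if status in EXCLUDED_STATUSES:
            removed[f"status {status}"] += 1
        elif is_bot(user_agent):
            removed["bot"] += 1
        else:
            kept.append((status, user_agent))
    return kept, removed

# Hypothetical sample records.
sample = [
    (200, "Mozilla/5.0"),
    (206, "Mozilla/5.0"),
    (404, "Mozilla/5.0"),
    (200, "Googlebot/2.1"),
    (200, "Mozilla/4.0"),
]
kept, removed = filter_hits(sample)

# Recheck of the slide's numbers: raw requests minus each removed group.
pdf_hits = 23416 - 16036 - 1493 - 2 - 4218
html_hits = 16395 - 40 - 1121 - 5859
```

The subtraction reproduces the reported totals of 1,667 PDF hits and 9,375 HTML hits.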
External Links Report Problem
• External Link Integrity report showed errors for links to cfpub.epa.gov and yosemite.epa.gov
• Caused by an issue in the blade center switch that load balances other servers in the same blade center as the Maxamine server
• A complex enough mystery that it took some time to solve
• Now it is fixed
• Rescanned all the sites, so the QA reports for March are now correct
• Cannot redo the scans for Jan or Feb
• Updating Jan-March trend reports with data from March
Leader Boards – How are we doing?
• Maxamine metric of user experience
• Overall Score = Page Proximity (25%), Page Weight Rating (25%), Link Integrity Rating (25%), External Link Integrity (15%), and Title Integrity Rating (10%)
Chart panels: With Problem | Problem Fixed
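The weighting above can be read as a weighted average of the five component ratings. The sketch below assumes each rating is on a 0-100 scale and that the components are simply multiplied by their weights and summed; the slide does not state the scale or the exact combination rule, and the key names and sample ratings are hypothetical.

```python
# Weights from the slide, as whole percentages (they sum to 100).
WEIGHTS = {
    "page_proximity": 25,
    "page_weight": 25,
    "link_integrity": 25,
    "external_link_integrity": 15,
    "title_integrity": 10,
}

def overall_score(ratings):
    """Weighted average of component ratings (assumed to be 0-100)."""
    missing = set(WEIGHTS) - set(ratings)
    if missing:
        raise ValueError(f"missing ratings: {sorted(missing)}")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS) / 100

# Hypothetical ratings for one site, for illustration only.
example = {
    "page_proximity": 90,
    "page_weight": 40,
    "link_integrity": 80,
    "external_link_integrity": 60,
    "title_integrity": 100,
}
score = overall_score(example)
```

Under this reading, an error in one component (such as the external link problem described earlier) can move the overall score by at most that component's weight.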
Broken Links – Not Getting Better
Page Weight (size)
• Metric represents user experience with download time: bigger pages take longer
• Maxamine gives a break for cached components
• Maxamine thinks 50k is too big – based on dial-up download speeds and older broadband speeds
• Far fewer dial-up users now – 8%
• Could
– Eliminate page weight from our score
– Increase threshold (200k?)
– Leave it at 50k
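To put the threshold options in perspective, here is a rough estimate of raw transfer time for a page of a given weight. It assumes "50k" means 50 kilobytes (1 KB = 1024 bytes), a nominal 56 kbps dial-up line, and no latency, protocol overhead, compression, or caching. The function name is hypothetical, and none of this reflects how Maxamine itself rates page weight.

```python
def download_seconds(page_kb, link_kbps):
    """Idealized transfer time: page size in KB over a link speed in kbps."""
    bits = page_kb * 1024 * 8          # kilobytes -> bits
    return bits / (link_kbps * 1000)   # kbps -> bits per second

DIAL_UP_KBPS = 56  # nominal dial-up modem speed

current_threshold = download_seconds(50, DIAL_UP_KBPS)    # about 7.3 seconds
proposed_threshold = download_seconds(200, DIAL_UP_KBPS)  # about 29.3 seconds
```

On these assumptions a 50k page takes roughly 7 seconds on dial-up and a 200k page roughly 29 seconds, which is the trade-off behind raising the threshold now that dial-up users are a small share of visitors.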
3rd Party Links
• Maxamine penalizes the score of pages, especially home pages, that direct people away from the website
• Based on commercial model – funnel users to credit card stage
• Not counted in overall score
• Can we configure to exclude nlquery.epa.gov, yosemite.epa.gov, etc.?
Better Bloodhound
• Evaluation of single page – custom report
• http://epa.gov/webmast1/srchrept/dashes.htm
• Meta Information (title, keywords, description)
• In-bound Search phrases (sortable by phrase and frequency)
• In-Links Text (sortable by in-link, phrase, and frequency)
• Browser Addresses (NEW!)
• Referrer Pages (NEW!)
• Out-bound Search phrases (NEW! – now you can recreate the searches people ran from your page and see the results)
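In-bound search phrases are typically recovered from the referrer URL of a request. The sketch below shows one way that could work and is not Bloodhound's actual code: it assumes the search engine passes the phrase in a `q` parameter (as Google does) or a `p` parameter (as Yahoo does). The function name and sample URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlparse, parse_qs

# Query-string parameters assumed to carry the search phrase.
QUERY_PARAMS = ("q", "p")

def search_phrase(referrer):
    """Return the normalized search phrase from a referrer URL, or None."""
    params = parse_qs(urlparse(referrer).query)
    for name in QUERY_PARAMS:
        if name in params:
            return params[name][0].strip().lower()
    return None

# Hypothetical referrers for a single page.
referrers = [
    "http://www.google.com/search?q=cash+for+clunkers&hl=en",
    "http://search.yahoo.com/search?p=Cash+for+Clunkers",
    "http://en.wikipedia.org/wiki/Radon",   # a plain link, no search phrase
]
phrases = Counter(p for p in map(search_phrase, referrers) if p)
```

A tally like `phrases` is what makes the report sortable by phrase and by frequency.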
Metadata
• Can automatically extract from your PDFs to spreadsheet
• You fill in spreadsheet
• Can provide some Librarian support
• Can automatically populate PDFs
• Completed for the 500 most requested PDFs
Restructuring EPA
• Inform Decision making
• Information on Biggest Topics
– ACSI Topics
– Top Search Terms
– Emergent Interest – cash for clunkers
– Most Requested Sites / PDFs
• Resources Lists
– Based on biggest concepts
OIAA’s Custom Report Policy
• Content owners may not want reports at TSSMS level – need scan of smaller domain.
• Content owners can get custom QA and Traffic reports and other custom reports like CrazyEgg and Bloodhound – but they will have to pay for the contractor labor to produce them.
• It is easy to order the work through WCF service TZ – DeShelia Hall (919)-541-4469
Questions?
• Dr Charlotte Cottrill (EPA) [email protected]
• Dr Stephen P Gant (CSC) [email protected]
• Karen Litwin (CSC) [email protected]
• DeShelia Hall (EPA)