Personalisation of search: take back control
Karen Blakeman, RBA Information Services
5th June 2012
Pre-conference workshop, 11th Southern African Online Information Meeting, Sandton Convention Centre
Slides are available at http://www.rba.co.uk/as/
Twitter: @karenblakeman
http://www.rba.co.uk/
This presentation is licensed under a Creative Commons Attribution 3.0 License
General plan for the session
Getting to know one another and feedback on search issues
Slides are a basic framework for the session (download from http://www.rba.co.uk/as/ now if you wish). I'll create an addendum of key, additional information after the session.
Ask questions as we go along, or write on the notelets and we'll have Q&A slots throughout the session
Summary of "stuff" we've learned and what to take back home and to work
20/04/23 www.rba.co.uk 2
How it all started
Before 1992: priced electronic databases indexed by humans. Many still exist, for example LexisNexis, STN, Dialog
1992: the Internet can be accessed by anyone, but it was another 2-3 years before significant information started appearing on the web
Increase in amount of data and information led to the development of tools that indexed and searched the content of web pages
Lycos, Excite, AltaVista, Hotbot
How the web search tools worked (and still do in part)
"Crawl" the internet looking for new and updated pages by following links
Copies of pages and documents added to a database that is publicly searchable
Results sorted according to:
– how often the words you looked for appear in the page
– where they appear (words in the title and first few sentences given higher ranking)
– and other criteria not disclosed by the search engines
They do not cover:
– password protected sites
– databases or sites where you have to fill in a form to find the information
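The two disclosed ranking criteria above (how often a word appears, and whether it appears in the title) can be sketched in a few lines of Python. This is only a toy illustration: the page data, the function names and the title weight of 3 are all assumptions made for the example, and real search engines use many more undisclosed signals.

```python
import re

def score(page, query_terms):
    """Toy relevance score: word frequency plus a bonus for title matches."""
    title_words = re.findall(r"\w+", page["title"].lower())
    body_words = re.findall(r"\w+", page["body"].lower())
    total = 0
    for term in query_terms:
        term = term.lower()
        total += body_words.count(term)       # how often the word appears
        total += 3 * title_words.count(term)  # title matches count more (weight is arbitrary)
    return total

def rank(pages, query):
    """Return URLs of pages that match, best score first."""
    terms = query.split()
    scored = [(score(p, terms), p["url"]) for p in pages]
    return [url for s, url in sorted(scored, key=lambda x: -x[0]) if s > 0]

# Made-up pages, purely for illustration.
pages = [
    {"url": "a.example", "title": "Wind turbines", "body": "Wind power and wind turbines explained."},
    {"url": "b.example", "title": "Energy news", "body": "A short note that mentions wind once."},
    {"url": "c.example", "title": "Cooking", "body": "Recipes for soup."},
]
ranked = rank(pages, "wind")
print(ranked)
```

The page that uses the word in its title and several times in its text comes first; the page that never mentions it is left out.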
Then along came.....
11 November 1998 (from the Internet Archive www.archive.org)
How was Google different?
Links (citations) a major part of sorting search results
http://www.seobook.com/learn-seo/collateral-damage.php
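As a rough illustration of how links can be used to sort results, here is a simplified link-scoring sketch in the spirit of the published PageRank idea. The graph, the function name and the damping value of 0.85 are assumptions for the example; it is not Google's actual ranking, which combines links with many other signals.

```python
def link_scores(links, damping=0.85, iterations=50):
    """Simplified PageRank-style scores for a small link graph.

    links maps each page to the list of pages it links to.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    scores = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page passes its score on, split evenly across its links.
                share = damping * scores[page] / len(targets)
                for t in targets:
                    new[t] += share
            else:
                # A page with no outgoing links spreads its score evenly.
                for t in pages:
                    new[t] += damping * scores[page] / n
        scores = new
    return scores

# Made-up site structure, purely for illustration.
links = {
    "home": ["about", "news"],
    "about": ["home"],
    "news": ["home"],
    "orphan": ["home"],
}
scores = link_scores(links)
best = max(scores, key=scores.get)
print(best)
```

The page that everyone links to ends up with the highest score, and the page with no inbound links ends up with the lowest.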
Google 2011
Revenues $37,905 million
Net Income $9,737 million
http://investor.google.com/financial/tables.html
2011 – 96% of revenues are from advertising
Google has a problem....
Google's problem "How People Spend Their Time Online – Stephen's Lighthouse" http://stephenslighthouse.com/2012/03/14/how-people-spend-their-time-online/
Search engines and social networks need to keep you on their "properties" – captive audience
Always trying to deliver new services to stop you wandering off to other sites
To keep you on-site need to deliver content as relevant as possible to YOU
– need to know more about you
– need to know what sort of information you are interested in
– need you signed in to your account
– need to know who your contacts are on-site and elsewhere
– need to know about your activity elsewhere
Personalise results
The Filter Bubble: What The Internet Is Hiding From You
Eli Pariser
Publisher: Viking (23 Jun 2011)
ISBN-10: 067092038X
ISBN-13: 978-0670920389
Trends in search
No longer straightforward text searching of web pages and documents
Localisation
Personalisation
Social
Mobile
How far does personalisation go?
Google can seriously damage your news http://www.rba.co.uk/wordpress/2011/09/03/google-can-seriously-damage-your-news/
Is Google really filtering my news? http://www.librarianoffortune.com/librarian_of_fortune/2011/09/is-google-really-filtering-my-news.html
How far does personalisation go?
An Awfully Big Blog Adventure: The answer to your question... depends on who you are (Anne Rooney) http://awfullybigblogadventure.blogspot.co.uk/2012/04/answer-to-your-question-depends-on-who.html
Borromeo vs Borromeo
Word cloud of top 20 results for a Google search on Prague (web history and social networks switched off, cookies cleared)
Word cloud of top 20 results for a Google search on Prague (signed in to Google+ account, social networks and web history enabled)
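A word cloud is essentially a word count, so the same comparison can be tried with a short script. The sketch below counts words in two sets of result snippets and lists the words that only appear when signed in. The snippets, the stopword list and the function name are invented for illustration; they are not the actual Prague results.

```python
import re
from collections import Counter

# A tiny, arbitrary stopword list for the example.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "for", "on", "is"}

def word_counts(snippets):
    """Count words across a list of result titles or snippets."""
    counts = Counter()
    for text in snippets:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts

# Made-up snippets standing in for two sets of search results.
signed_out = ["Prague travel guide", "Prague hotels and tourism", "History of Prague"]
signed_in = ["Prague library conference", "Prague hotels", "Conference venue in Prague"]

out_counts = word_counts(signed_out)
in_counts = word_counts(signed_in)
only_signed_in = sorted(set(in_counts) - set(out_counts))
print(only_signed_in)
```

Words that appear in only one of the two sets are a quick indication of how far personalisation has shifted the results.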
Google loses the plot?
Search on goats
Dear Google, stop messing with my search http://www.rba.co.uk/wordpress/2011/11/08/dear-google-stop-messing-with-my-search/
Google introduces the “soft AND”
“When you do a multi-term query on Google (even with quoted terms), the algorithm sometimes backs-off from hard ANDing all of the terms together.......it’s clear that people will often write long queries (with anywhere from 5 to 10 terms) for which there are no results. Google will then selectively remove the terms that are the lowest frequency to give you some results (rather than none)....Soft AND is a way to reduce the overall frustration and give the searcher something to examine (and with luck, a chance to reformulate their query).”
Dan Russell
http://www.rba.co.uk/wordpress/2011/11/08/dear-google-stop-messing-with-my-search/#comments
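The back-off described in the quote can be sketched as follows: try a hard AND of all terms, and if nothing matches, drop the lowest-frequency term and try again. This is a minimal sketch based only on that description. The index, the function name and the tie-breaking are assumptions, and Google's real behaviour is not documented in this detail.

```python
def soft_and_search(terms, index):
    """Return (matching_docs, terms_used) using a 'soft AND' back-off.

    index maps each term to the set of document ids that contain it.
    """
    remaining = list(terms)
    while remaining:
        doc_sets = [index.get(t, set()) for t in remaining]
        matches = set.intersection(*doc_sets)  # hard AND of the remaining terms
        if matches:
            return matches, remaining
        # No results: drop the rarest term (fewest documents) and retry.
        rarest = min(remaining, key=lambda t: len(index.get(t, set())))
        remaining.remove(rarest)
    return set(), []

# Made-up index, purely for illustration.
index = {
    "solar": {1, 2, 3},
    "panels": {1, 2},
    "subsidy": {2, 5},
    "zimbabwe": set(),
}
docs, used = soft_and_search(["solar", "panels", "subsidy", "zimbabwe"], index)
print(docs, used)
```

Here the searcher gets a result, but it no longer mentions one of the terms they asked for, which is exactly why the change matters for research searches.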
No more '+' to force an exact match
No more automatic 'ANDing' of your terms
No more highlighting your search terms in its cached copy of a page
No more clearing search history unless you are logged in to a Google account [alternatively delete all search cookies from your browser]
Google's new Privacy Policy
"Our new Privacy Policy makes clear that, if you’re signed in, we may combine information you’ve provided from one service with information from other services. In short, we’ll treat you as a single user across all our products, which will mean a simpler, more intuitive Google experience."
Toward a simpler, more beautiful Google http://googleblog.blogspot.co.uk/2012/04/toward-simpler-more-beautiful-google.html
"we're more excited than ever to build a seamless social experience, all across Google"
What does Google know about you?
Look in your Google account dashboard http://www.google.com/dashboard/
How Google is targeting your ads http://www.google.com/ads/preferences/
Impact of Google's policy and personal information management changes on YouTube
Gives me videos based on my web search history, linked to my location (Reading) plus a long list of videos mentioned by people in my Google+ circles.
Targeted advertising?!
I wonder how Google will customise my web search based on my YouTube viewing?
Google Enables Cross-Platform Local Search (As Carrot To Relinquish Your Privacy) http://searchengineland.com/google-enables-cross-platform-local-search-as-carrot-for-web-history-113811
Introducing a new local search experience across your devices - Inside Search http://insidesearch.blogspot.co.uk/2012/03/introducing-new-local-search-experience.html
Google's new(ish) social network Google Plus (Google+)
http://plus.google.com/
Google Now Forcing All New Users To Create Google+ Enabled Accounts
http://marketingland.com/google-now-forcing-all-new-users-to-create-google-enabled-accounts-3912
Search Plus Your World (SPYW), also referred to as Search+, is now available on Google.com and is the default. It gives priority to content from people in your Google+ network if you are signed in to your account.
(And the next Google killer is….Google! http://www.rba.co.uk/wordpress/2012/01/30/and-the-next-google-killer-is-google/ )
SPYW currently being tested on Google.com
Top results (blacked out for privacy reasons) were from one of my Google+ circles and from people who restricted access to the postings.
Take care when providing information to users or incorporating data as part of a report
Google Knowledge Graph
Introducing the Knowledge Graph
http://www.google.com/insidesearch/features/search/knowledge.html
It isn't a graph!
At present only on Google.com
Only shows if you are logged in to a Google+ enabled account
No Knowledge Graph
Bing - Adapting Search to You http://www.bing.com/community/site_blogs/b/search/archive/2011/09/14/adapting-search-to-you.aspx
Bing to use Facebook, Twitter more in fight against Google | ZDNet : http://www.zdnet.com/blog/facebook/bing-to-use-facebook-twitter-more-in-fight-against-google/8631
Bing Relaunches, Features New Social Sidebar http://searchengineland.com/the-new-bing-microsoft-tries-again-with-search-meets-social-120728
"It’s not just Facebook and Twitter that get to play in the social sidebar, however. Social suggestions might also come from LinkedIn, Quora, Foursquare, Blogger and - wait for it - Google Plus"
No social side bar for me (not in the US) but social network 'stuff' appears in main results
So.cl : http://www.so.cl/
Microsoft Launches Socl Social Network: A Look Inside http://marketingland.com/microsoft-launches-so-cl-social-network-a-quick-look-12499
What I see on my screen is not what you'll see on yours!
How do search engines personalise results?
Depends on:
– your location
– past searches
– which sites you have looked at in the past
– your +1s
– your likes
– your shares
– sites blocked by you (Google)
– which social networks you are signed in to
– who is in your social networks
To allow personalisation or not?
Not necessarily a bad thing
Use a second browser with search history enabled and logged in to accounts for a different point of view
It does bias results
What I see on my screen is not what you'll see on yours
Be aware of potential privacy issues regarding friends and contacts in social networks when providing results to your users
Want to switch it off?
Disable and remove web/search history - but in Google, no option to erase 'signed out' histories
Actively manage search cookies: set your browser to delete cookies automatically when you log out or switch off the computer
– How to delete cookies: http://aboutcookies.org/Default.aspx?page=2
Log out of all search engine and social media accounts when searching
Use Chrome Incognito (Chrome owned by Google!)
In Google use Verbatim in the left hand menu on results page
Use advanced search commands if relevant
Use a search engine that doesn't track or personalise
Google search settings
Google web history
Verbatim
Forces Google to run an exact match search. Run your search first and then select Verbatim from the left hand menu on your results page
Cannot be combined with time options in the side bar
Google: Verbatim for exact match search http://www.rba.co.uk/wordpress/2011/11/18/google-verbatim-for-exact-match-search/
Signed out
Verbatim
Chrome incognito
Try a search tool with less or no personalisation
DuckDuckGo – does not track, does not personalise http://duckduckgo.com/
Yandex.com – International version of the Russian search engine http://www.yandex.com/
Blekko http://www.blekko.com/
Million Short – omits the top million most "popular" sites from results http://www.millionshort.com/
Use more than Google anyway
The Disruptive Searcher (Sanity checking Google http://disruptivesearcher.wordpress.com/2012/02/27/sanity-checking-google/)
“if I hadn’t searched across more than Google for data on a small, new company that I was asked to research recently, I would have missed out on some very significant information that Google just wasn’t showing me.”
Bing
http://www.bing.com/
Does personalise search and include social network content in results
Most of the interesting developments and features are only available in the US version
Results tend to be more consumer/retail focused unless using advanced search features
Coverage not identical to Google’s - sometimes yields important unique content, especially in research and business
Sometimes more up to date than Google
Bing
Link to minimalist advanced search options now vanished
Advanced Search Operators
http://msdn.microsoft.com/en-us/library/ff795620
Main ones:
– site:
– filetype:
– intitle:
DuckDuckGo
http://duckduckgo.com/
DuckDuckGo – silly name but a neat little search tool : http://www.rba.co.uk/wordpress/2011/11/07/duckduckgo-silly-name-but-a-neat-little-search-tool/
No tracking, no “filter bubble”
Commands:
– site:
– inbody:
– intitle:
– filetype:
– sort:date to sort by date (uses results from Blekko)
– region:cc (e.g. za) to boost a country
Syntax and keyboard shortcuts at http://duckduckgo.com/goodies.html
Yandex
http://www.yandex.com/
Yandex – advanced search
OR
Search operators http://help.yandex.com/search/?id=1113759
Blekko
http://blekko.com/
slashtags for sorting by date (/date), searching for images (/images) and videos (/videos)
Use public slashtags to search a group of web sites covering a particular topic or type of site e.g. /library or create your own to search your specified list of sites (similar to Google Custom Search Engines)
wind turbine electricity generation /karenblakeman/renewable
“Musings about librarianship: Using Blekko to search across thousands of library sites” http://musingsaboutlibrarianship.blogspot.com/2010/11/using-blekko-to-search-across-thousands.html
Blekko
Cannot do filetype, inurl, intitle searches
Drop down menu next to page in results list for:
– site search (or use /site)
– similar pages (or use /similar)
– inbound links to the page (or use /links)
Million Short
http://www.millionshort.com/
"Imagine a search engine that simply removed the top 1 million most popular web sites from its index. What would you discover?"
Can use filetype: site: intitle "..."
Can add individual sites back in
Using Google features to best advantage
Location
Country versions of Google to prioritise local content – for example google.co.za, google.fr, google.de
– usually two letter ISO code for the country
Change location in left hand menu on results page
Google.com and SPYW
SPYW – hide personal results
Personal results only
Verbatim
Google results page side bar
'Everything' does not search everything
Videos is not YouTube
Not clear how discussions are identified
'Social' is not just Google+. Look in your dashboard to see who Google has decided to include.
No Twitter option anymore and Twitter coverage is sporadic.
Google side bars
Images Videos News Books Blogs
Start using advanced search commands and Google gives up on personalisation although you may have to use Verbatim
Looking for a particular type of information, for example statistics, a research report or an expert presentation?
Use the filetype: command
For statistics
world oil consumption filetype:xls
world oil consumption filetype:xlsx
world oil consumption filetype:xlsx OR filetype:xls
For government, research, industry reports
oil consumption forecasts filetype:pdf
For conference presentations or trying to locate an expert
renewable energy UK filetype:ppt
renewable energy UK filetype:pptx
renewable energy UK filetype:ppt site:ac.uk
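If you run the same kind of search often, the query strings above can be generated rather than typed. The helper below is a small sketch that only builds the text of the query, using the filetype: and site: commands shown on this slide. The function name and parameters are made up for the example, and it does not contact any search engine.

```python
def filetype_query(terms, filetypes, site=None):
    """Build a search string such as 'oil filetype:xlsx OR filetype:xls'."""
    parts = [terms.strip()]
    # Several file types are combined with OR, as in the slide examples.
    parts.append(" OR ".join(f"filetype:{ft}" for ft in filetypes))
    if site:
        parts.append(f"site:{site}")
    return " ".join(parts)

stats_query = filetype_query("world oil consumption", ["xlsx", "xls"])
talks_query = filetype_query("renewable energy UK", ["ppt"], site="ac.uk")
print(stats_query)
print(talks_query)
```

The generated strings can be pasted into the search box as they are.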
Numerical range search
Anything to do with numbers
Use advanced search screen
or
1st number followed by two full stops followed by 2nd number followed by unit of measurement (if applicable)
– Norway oil production forecasts 2012..2020
– Norway oil production forecasts 2012..2020 filetype:xls OR filetype:xlsx
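To make the range syntax concrete, the sketch below builds a query in the form "first number, two full stops, second number" and then shows roughly what a range match means: the text contains some number between the two limits. Both function names are invented for the example, the check only handles plain whole numbers, and how Google actually matches ranges is an assumption here.

```python
import re

def range_query(terms, low, high):
    """Build a query using the 'low..high' numeric range syntax."""
    if low > high:
        raise ValueError("low must not exceed high")
    return f"{terms} {low}..{high}"

def mentions_number_in_range(text, low, high):
    """Rough check: does the text contain a whole number between low and high?"""
    numbers = [int(n) for n in re.findall(r"\b\d+\b", text)]
    return any(low <= n <= high for n in numbers)

query = range_query("Norway oil production forecasts", 2012, 2020)
hit = mentions_number_in_range("Output is expected to peak in 2015", 2012, 2020)
miss = mentions_number_in_range("Production data for 2005", 2012, 2020)
print(query, hit, miss)
```

A page that mentions 2015 would qualify for the range 2012..2020; a page that only mentions 2005 would not.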
Advanced commands continued
inurl: for example
inurl:"carbon capture" targets
intitle: for example
intitle:"carbon capture" targets
asterisk (*) to search for terms separated by 1-5 words (may have to use quotation marks)
solar * panels
"solar * panels"
Picks up solar PV panels, solar photovoltaic panels, solar water heating panels
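The "separated by 1-5 words" behaviour of the asterisk can be imitated with a regular expression, which makes it easier to see what will and will not be picked up. This is only an illustration of the rule as stated on the slide. The function name is made up, and Google does not document exactly how it treats the asterisk.

```python
import re

def wildcard_pattern(phrase, max_words=5):
    """Turn 'solar * panels' into a regex allowing 1..max_words words per *."""
    pieces = []
    for token in phrase.split():
        if token == "*":
            # One word, optionally followed by up to max_words - 1 more words.
            pieces.append(r"\w+(?:\s+\w+){0,%d}" % (max_words - 1))
        else:
            pieces.append(re.escape(token))
    return re.compile(r"\b" + r"\s+".join(pieces) + r"\b", re.IGNORECASE)

pattern = wildcard_pattern("solar * panels")
examples = [
    "solar PV panels",
    "solar photovoltaic panels",
    "solar water heating panels",
    "solar panels",
]
matched = [text for text in examples if pattern.search(text)]
print(matched)
```

Note that plain "solar panels" is not matched, because the asterisk stands for at least one word.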
Synonyms
Google often looks for variations of your terms but you cannot rely on it always happening
Use the tilde ~ before a term to look for what Google considers are synonyms
– ~energy will pick up oil, fuel, gas, electricity
No information/documentation on how synonyms are created
Very general, consumer oriented rather than scientific
Can be used with Verbatim
When you really DO want to search social media the main search engines don't make it easy! Social media is an essential part of many types of research.
Search within the network itself – means you must have an account
Use specialist search tools
– come and go
– not comprehensive
– need to use more than one
– a few examples are shown in the following slides. For more information on social media search tools see Phil Bradley's presentations http://www.slideshare.net/Philbradley/
LinkedIn.com
More on searching LinkedIn
Boolean Black Belt-Sourcing/Recruiting http://www.booleanblackbelt.com/
Mary Ellen Bates Ten Top Tips for Searching LinkedIn http://www.batesinfo.com/meb123/index.html PDF http://www.batesinfo.com/extras/assets/linkedin.pdf
Bing social http://www.bing.com/social/
http://search.twitter.com/
Topsy.com
Icerocket.com
Socialmention.com
Keeping up to date
Inside Search http://insidesearch.blogspot.com/
Official Google Blog http://googleblog.blogspot.com/
Google Scholar Blog http://googlescholar.blogspot.com/
Search Engine Land http://searchengineland.com/
Search Engine Watch http://searchenginewatch.com/
Boolean Black Belt-Sourcing/Recruiting http://www.booleanblackbelt.com/
Karen Blakeman’s Blog http://www.rba.co.uk/wordpress/
Phil Bradley's weblog http://philbradley.typepad.com/