Local Touch – Global Reach
www.us.sogeti.com
SharePoint 2010 Search Deep Dive
Corey Erkes, Manager Consultant, Sogeti USA
About Me
• Manager Consultant within Sogeti SharePoint Practice
• Worked with SharePoint since V2
• MCTS: Microsoft SharePoint 2010, Configuring
• Co-Leader of Omaha SharePoint User Group
• Coauthor of SharePoint 2010 Governance Book
• Member of UNO IS&T Alumni Board
Agenda
SharePoint 2010 Search Versions
• SharePoint Foundation 2010
• Search Server 2010 Express
• Search Server 2010
• SharePoint Server 2010
• FAST Search Server 2010 for SharePoint
Search 2010 Architecture
• How to Configure
• Crawl Component
• Query Component
• Associated Databases
How to Scale Out
SharePoint 2010 Search Versions
Wait, there are different flavors of Search?
• SharePoint Foundation 2010
• Search Server 2010 Express
• Search Server 2010
• SharePoint Server 2010
• FAST Search Server 2010 for SharePoint
Search Server 2010 Express is a separate product outside of SharePoint 2010, but when installed with SharePoint Foundation 2010, can provide a lot of functionality
SharePoint 2010 Search Functionality Breakdown
Feature | SharePoint Foundation 2010 | Search Server Express | Search Server 2010 | SharePoint Server 2010 | FAST Search Server 2010 for SharePoint
Visual Best Bets Limited Limited
Scopes
Search enhancements based on user context
Custom properties
Property extraction Limited Limited Limited
Query federation
Query suggestions
Similar results
Sort results on managed properties or rank profiles
Relevancy tuning by document or site promotions Limited Limited Limited
SharePoint 2010 Search Functionality Breakdown - Continued
Feature | SharePoint Foundation 2010 | Search Server Express | Search Server 2010 | SharePoint Server 2010 | FAST Search Server 2010 for SharePoint
Shallow results refinement
Deep results refinement
Document preview and thumbnails
Windows 7 federation
People search
Social search
Taxonomy integration
Multi-tenant hosting
Rich Web indexing support
SharePoint 2010 Index Size Capabilities
• SharePoint Foundation 2010 can be scaled out past ~10 million items by adding a search server and assigning it to crawl different content databases
Available Search Repositories
Repository | SharePoint Foundation 2010 | Search Server Express | Search Server 2010 | SharePoint Server 2010 | FAST Search Server 2010 for SharePoint
SharePoint sites
Windows file shares
Exchange public folders
Lotus Notes
Web sites
IFilters for additional systems
Structured content in databases
Search Manageability
Manageability | SharePoint Foundation 2010 | Search Server Express | Search Server 2010 | SharePoint Server 2010 | FAST Search Server 2010 for SharePoint
UI-based administration Limited
Scriptable deployment and management via PowerShell
Microsoft System Center Operations Manager Pack
Health Monitoring
Usage Reporting
So wait, Search Server Express is free?
Feature | Search Server Express | SharePoint Server 2010
Performance with sub-second response time | 10 million items* | 100 million items
Scriptable deployment and management via PowerShell
User interface–based (UI-based) administration
Relevancy tuning by document or site promotions
Common connector framework for indexing and federation
Search from Windows 7 and Windows Mobile
Metadata-based refinement panel
Metadata extraction on managed properties
Scriptable deployment and management using Windows PowerShell
Relevance improves with social behavior
Query suggestions, related searches, and improved “Did you mean?”
* - assumes SQL Server and not SQL Server Express
Really, Search Server Express is free?
Feature | Search Server Express | SharePoint Server 2010
People and expertise search
Taxonomy and term store integration
Phonetic and nickname search
Integration with My Site
That’s a lot of goodness for free!
Unfortunately, FAST Search is not free!
SharePoint 2010 Search Architecture
Goodbye SSP, Hello SharePoint Search Service!
Search Service Application
A Search Service Application and its proxy can be provisioned in one of three ways:
• Central Administration Manage Service Applications page
• Central Administration Farm Configuration Wizard
• PowerShell (how the cool kids do it!)
PowerShell walk-through for creating a Search Service Application:
http://blogs.msdn.com/b/russmax/archive/2009/10/20/sharepoint-2010-configuring-search-service-application-using-powershell.aspx
SharePoint Search Roles
Four unique roles are involved in Search:
• Web server role
  • Provides the interface for searching
• Query server role
  • Serves search results to web server(s)
• Crawl server role
  • Responsible for crawling content
• Database server role
  • Hosts the three databases associated with search
    • Property database
    • Crawl database
    • Search administration database
Search Components
[Diagram. Labels: Search Service Application Proxy; Web Front End; WCF Call; Query Server / Query Processor; Query Component; Index; Property Store Database; Search Administration Database; Index Server; Crawler; Connector(s); Crawl Database; Index Propagation; Content Data Sources (SharePoint Web Sites, Shared Folders, External Web Sites, Custom Databases, Other Systems)]
Database Role
A minimum of three databases is required to support Search:
• Property database
  • Contains metadata or associated custom properties for all crawled items
• Crawl database
  • Contains the history of the crawl
  • Manages start and stop points of crawls
  • A crawl database can have more than one crawler associated with it, but a single crawler can be associated with only one database
• Search Administration database
  • Stores search configuration data such as scopes and refiners
  • Contains security information for the crawled content
Database Sizing
Calculations for sizing databases
• Property databases
  • 0.046 x (sum of content databases)
• Crawl databases
  • 0.015 x (sum of content databases)
• Search Administration database
  • Allocate 10 GB
Database Characteristics
• Property databases
  • Write-heavy, 1:2 read/write ratio
• Crawl databases
  • Read-heavy, 3:1 read/write ratio
  • Should not be collocated with the Property DB
• Search Administration database
  • Equal read/write
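The sizing multipliers on this slide can be turned into a quick estimate. The following is a minimal Python sketch, not an official sizing tool: the function name `estimate_search_db_sizes` and the sample content database sizes are hypothetical, while the 0.046 and 0.015 multipliers and the 10 GB allocation come from the slide.

```python
# Multipliers taken from the slide; names and sample inputs are illustrative.
PROPERTY_DB_FACTOR = 0.046   # property DB is about 4.6% of total content DB size
CRAWL_DB_FACTOR = 0.015      # crawl DB is about 1.5% of total content DB size
SEARCH_ADMIN_DB_GB = 10.0    # fixed allocation for the Search Administration DB


def estimate_search_db_sizes(content_db_sizes_gb):
    """Return estimated search database sizes (GB) for the given content DB sizes (GB)."""
    total_content_gb = sum(content_db_sizes_gb)
    return {
        "property_db_gb": PROPERTY_DB_FACTOR * total_content_gb,
        "crawl_db_gb": CRAWL_DB_FACTOR * total_content_gb,
        "search_admin_db_gb": SEARCH_ADMIN_DB_GB,
    }


if __name__ == "__main__":
    # Hypothetical farm with three content databases (sizes in GB).
    sizes = estimate_search_db_sizes([200, 200, 100])
    for name, gb in sizes.items():
        print(f"{name}: {gb:.1f} GB")
```

Treat the result as a starting point for capacity planning; actual growth depends on the content being crawled.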
Crawl Role
Purpose of the crawl server is to index content
• Crawl runs under MSSearch.exe (SharePoint Server Search 14)
• Crawl server does not contain a copy of the index; the index is streamed/propagated to the Query server
  • No longer a single point of failure
• Crawler component needs to be mapped to a SQL crawl database
• Possible to create multiple Crawl databases and Crawler components
Crawl Architecture
[Diagram: same components as the Search Components diagram.]
Crawl Role – Fault Tolerance
• Can be achieved by provisioning a secondary crawl component on a secondary server
  • Can be mapped to the same SQL crawl database
  • Having more crawl databases than crawl components doesn’t make sense and wastes system resources
• Crawl database fault tolerance should be handled through SQL mirroring
Crawl Role – Performance
• Performance is improved by adding Crawl components, because two or more components crawl content instead of one
• Load is distributed across both Crawl components
• Overlapping does not occur, because items are crawled in batches by both crawlers
Crawl Role – Distribution
• Can be accomplished by doing the following:
  • Crawl Component 1 → Crawl DB 1
  • Crawl Component 2 → Crawl DB 2
• Each web application host is assigned to a crawl component, and the system attempts to distribute load evenly across crawl databases
  • sales.company.com → Crawl Component 1 → Crawl DB 1
  • hr.company.com → Crawl Component 2 → Crawl DB 2
• Distribution is based on the number of items (doc IDs) stored in each crawl DB
Crawl Role – Distribution Example
Let’s say you have two web applications
• sales.company.com → Crawl Component 1 → Crawl DB 1
• hr.company.com → Crawl Component 2 → Crawl DB 2
Crawl DB 1 contains 3,000 items
Crawl DB 2 contains 10,000 items
New web application is provisioned: finance.company.com
• No need to create an additional crawl component or crawl DB
Which crawl DB will the new host be associated with?
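One way to reason about this question: since distribution is based on the number of items stored in each crawl DB, a new host would plausibly be assigned to the crawl DB holding the fewest items. The Python sketch below assumes exactly that rule. The function name `pick_crawl_db` and the tie-breaking behaviour are hypothetical and are not SharePoint's actual implementation.

```python
def pick_crawl_db(item_counts):
    """Return the name of the crawl DB with the fewest stored items.

    item_counts maps crawl DB name -> number of items (doc IDs) stored.
    Ties are broken by name order, an arbitrary choice made for this sketch.
    """
    if not item_counts:
        raise ValueError("at least one crawl DB is required")
    # Iterating over sorted names makes min() return the first name on a tie.
    return min(sorted(item_counts), key=lambda name: item_counts[name])


if __name__ == "__main__":
    crawl_dbs = {"Crawl DB 1": 3000, "Crawl DB 2": 10000}
    print("finance.company.com ->", pick_crawl_db(crawl_dbs))
```

Under that assumption, finance.company.com lands in Crawl DB 1, which holds 3,000 items versus 10,000 in Crawl DB 2.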
Query Role
Purpose of the query server is to serve up queries to the WFE
• Index is stored on Query server(s)
• Query server(s) contain one or more Query Components
• Query Component is mapped to only one Property Store DB
• Query Component is where the index that is propagated from the Crawler resides
Query Architecture
[Diagram: same components as the Search Components diagram.]
Query Component – Fault Tolerance
It is highly recommended to make the index fault tolerant by mirroring a Query component onto another server in the farm.
Check “Fail-over Query Component” if you want only fault tolerance and not an increase in query performance.
Query Component – Sizing the Index
Index will be approximately 3.5% of the content database size
• Don’t forget about the space needed for the mirror
• Additional space is needed for master merge
Example:
• 100 GB content database
• Index partition: 100 GB x 3.5% = 3.5 GB
• Index partition mirror: 100 GB x 3.5% = 3.5 GB
• Space for master merge: all index partitions x 3
• Total space = (3.5 x 2) x 3 = 21 GB
Recommend having enough memory to fit 33% of the index in RAM.
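The worked example on this slide generalizes to any content size. Here is a minimal Python sketch of that arithmetic. The function name `estimate_index_disk_gb` and its `mirrored` parameter are hypothetical; the 3.5% ratio and the x3 master merge factor are the slide's figures.

```python
INDEX_RATIO = 0.035        # index is about 3.5% of content database size (from the slide)
MASTER_MERGE_FACTOR = 3    # space multiplier for master merge (from the slide)


def estimate_index_disk_gb(content_gb, mirrored=True):
    """Estimate disk space (GB) for the index partition, its mirror, and master merge."""
    partition_gb = content_gb * INDEX_RATIO
    copies = 2 if mirrored else 1   # one partition plus an optional mirror
    return partition_gb * copies * MASTER_MERGE_FACTOR


if __name__ == "__main__":
    # Reproduces the slide's example: 100 GB of content with a mirrored partition.
    print(f"{estimate_index_disk_gb(100):.1f} GB")
```

Passing `mirrored=False` gives the footprint when no mirror copy of the partition is kept.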
Query Component – Performance
Index size is the main bottleneck for query performance
• Index contains 10 million documents = avg. of 2 seconds per query
• Index contains 20 million documents = avg. of 4 seconds per query
Creating multiple index partitions is the key to reducing query times and reducing bottlenecks. A new index partition can be added through Search Application Topology in Central Administration.
Property Store DB – Fault Tolerance & Performance
Fault Tolerance
• SQL mirroring should be used to achieve fault tolerance.
Performance
• Add an additional Property Store DB if bottlenecks occur
• Must first create the new Property Store DB, then create a new Query component and map it to the new Property Store DB
• The additional Query component should not include a mirror if performance is the goal
• You will need to reset the index and re-crawl, because a new Query component (index partition) is created
Property Store DB – Add Query Component
Property Store DB must be created before adding Query Component so it appears in dropdown
Query Processor
• Runs under the w3wp.exe process
• Processes a query by retrieving results from the index/Query Components
• Utilizes the Property Store DB and Search Administration DB to obtain metadata and perform security trimming
• Will load balance requests if more than one Query Component (mirrored) exists within the same Index Partition
• Query Processor connects to every Property Store DB and Query Component to retrieve results
• Unlike MOSS 2007, where the Query Processor ran on the WFE, any server can run the Query Processor in SharePoint 2010
Query Processor – Fault Tolerance & Performance
• Add an additional Query Processor service to another machine in the farm
  • Doesn’t have to be a WFE
• Requests will be load balanced in a round-robin fashion to each Query Processor
The Search Query and Site Settings Service can be found in Central Administration under Services on Server
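To make "round-robin" concrete, the following Python sketch shows the general technique of rotating requests across a fixed list of hosts. It only illustrates the scheduling idea and is not how SharePoint implements it; the class name `RoundRobinBalancer` and the server names are hypothetical.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out Query Processor hosts in a fixed rotating order."""

    def __init__(self, hosts):
        hosts = list(hosts)
        if not hosts:
            raise ValueError("at least one Query Processor host is required")
        self._rotation = cycle(hosts)

    def next_host(self):
        """Return the host that should receive the next request."""
        return next(self._rotation)


if __name__ == "__main__":
    balancer = RoundRobinBalancer(["APP01", "APP02"])  # hypothetical server names
    for request_id in range(4):
        print(f"request {request_id} -> {balancer.next_host()}")
```

With two hosts, requests alternate between them, so each Query Processor receives roughly half of the load.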
Overall Search Architecture
[Diagram: same components as the Search Components diagram.]
Scale-out Decision Points
Number of items | Action
0 – 1 million | All Search roles can coexist on one or two servers.
1 – 10 million | Move crawl components to another server, while the query components remain on the Web servers.
10 – 20 million | Add a crawl server. Each crawl server has one crawler. Create another index partition with query components and distribute these across query servers.
20 – 40 million | Add index partitions with distributed query components. Add another crawl database, and then add a new associated crawler to each crawl server.
40 – 100 million | Isolate each topology layer into server groups in which each role is deployed to its own set of servers. Each server group can be scaled out to meet specific requirements for the components in that role.
http://www.microsoft.com/download/en/details.aspx?id=20066
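The decision points above amount to a threshold lookup. A minimal Python sketch follows, with the actions paraphrased from the table. The function name `scale_out_action` is hypothetical, and because the table's ranges share their boundary values, the sketch assumes a boundary count belongs to the lower tier.

```python
# (upper bound on item count, recommended action), paraphrased from the table.
SCALE_OUT_STEPS = [
    (1_000_000, "All Search roles can coexist on one or two servers."),
    (10_000_000, "Move crawl components to another server; query components stay on the Web servers."),
    (20_000_000, "Add a crawl server and another index partition; distribute query components across query servers."),
    (40_000_000, "Add index partitions, another crawl database, and a new crawler on each crawl server."),
    (100_000_000, "Isolate each topology layer into its own server group and scale each group independently."),
]


def scale_out_action(item_count):
    """Return the recommended action for an index holding item_count items."""
    if item_count < 0:
        raise ValueError("item_count must be non-negative")
    for upper_bound, action in SCALE_OUT_STEPS:
        # Assumption: a count exactly on a boundary falls in the lower tier.
        if item_count <= upper_bound:
            return action
    raise ValueError("more than 100 million items is outside the range covered by the table")


if __name__ == "__main__":
    for count in (500_000, 15_000_000, 60_000_000):
        print(f"{count:>11,} items: {scale_out_action(count)}")
```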
Performance Metrics Thoughts
To improve this metric… | Take these actions
Full crawl time and result freshness | Add crawl servers, crawlers, and crawl databases. Each crawl database contains content from independent sources. Each crawl database can have several crawl components associated with it, and those crawl components can be distributed among many crawl servers. If you have several content sources, multiple crawl components and associated crawl databases allow you to crawl the content concurrently.
Time required for results to be returned | If query latency is caused by high peak query load, add query servers and index partitions. Each index partition can contain up to ~10 million items. You can also add a mirror for each query component for a given index partition. Place the mirror copy on a different server. Query throughput increases when you add index partition instances. If query latency is caused by database load, isolate the property database from crawl databases by moving it to a separate database server.
http://www.microsoft.com/download/en/details.aspx?id=20066
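The "~10 million items per index partition" guideline gives a simple way to estimate how many partitions and query components a corpus needs. This is a rough Python sketch under that guideline only; both function names are hypothetical, and real deployments should also weigh query load and hardware.

```python
ITEMS_PER_PARTITION = 10_000_000  # "~10 million items" per index partition, per the guidance above


def index_partitions_needed(item_count):
    """Return the minimum number of index partitions for item_count items."""
    if item_count < 0:
        raise ValueError("item_count must be non-negative")
    # Integer ceiling division; an empty index still needs one partition.
    partitions = (item_count + ITEMS_PER_PARTITION - 1) // ITEMS_PER_PARTITION
    return max(1, partitions)


def query_components_needed(item_count, mirrored=True):
    """One query component per partition, doubled when each partition gets a mirror."""
    return index_partitions_needed(item_count) * (2 if mirrored else 1)


if __name__ == "__main__":
    items = 25_000_000  # hypothetical corpus size
    print("partitions:", index_partitions_needed(items))
    print("query components (mirrored):", query_components_needed(items))
```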
Small Farm Topology
http://www.microsoft.com/download/en/details.aspx?id=20066
Medium Farm Topology
http://www.microsoft.com/download/en/details.aspx?id=20066
Medium Search Farm Topology
http://www.microsoft.com/download/en/details.aspx?id=20066
Medium Dedicated Search Farm Topology
http://www.microsoft.com/download/en/details.aspx?id=20066
Large Dedicated Search Farm Topology
http://www.microsoft.com/download/en/details.aspx?id=20066
References
• Search Technologies for SharePoint 2010 Products
  http://download.microsoft.com/download/0/0/0/00015E0A-67CD-490C-9C1B-DCFA8E9BAEFC/Search%20Model%201%20of%204%20-%20Search%20Technologies.pdf
• SharePoint Brew – Search 2010 Architecture and Scale, Part 1 Crawl
  http://blogs.msdn.com/b/russmax/archive/2010/04/23/search-2010-architecture-and-scale-part-1-crawl.aspx
• SharePoint Brew – Search 2010 Architecture and Scale, Part 2 Query
  http://blogs.msdn.com/b/russmax/archive/2010/04/23/search-2010-architecture-and-scale-part-2-query.aspx