Content Enrichment in SharePoint Search #SPSBE12 Steven Van de Craen April 26th, 2014

SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search


Page 1: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

Content Enrichment in SharePoint Search

#SPSBE12

Steven Van de Craen

April 26th, 2014

Page 2: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

Thanks to our sponsors!

Gold

Silver

Page 3: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

About me

Steven Van de Craen

SharePoint enthusiast

Ventigrate

Since 2005

Page 4: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

Overview

What is it?

A FAST history

SharePoint 2013

#demo

Practical use

#demo

WCF Routing Technology (.NET 4.0)

#demo

Wrap-up

Resources

Page 5: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

What is it?

“Content Enrichment is about manipulating crawled content before it is added to the search index.”

Add or modify properties of crawled items

Add information from an external system

Advanced processing on raw data

Page 6: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

A FAST history

Enterprise-class search

SharePoint 2001 Search

SharePoint 2003 Search

MOSS 2007 Search

SharePoint 2010 Search

FAST ESP

FAST Search Server 2010 for SharePoint

SharePoint 2013 Search

Page 7: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

A FAST history

Custom pipeline extensibility

Registration via XML

Page 8: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

A FAST history

Custom pipeline extensibility

“Callout” = executable

Page 9: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

A FAST history

Custom pipeline extensibility

For each crawled item

Synchronous

Optimize for performance

Visual Studio Profiling Tools

Startup penalty

200 ms to process a single item × 10 million items ≈ 23 days
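The 23-day figure follows directly from the synchronous, one-item-at-a-time model. A minimal sketch of the arithmetic, assuming callouts never overlap (the function name is made up for illustration):

```python
from datetime import timedelta


def total_callout_time(items: int, ms_per_item: int) -> timedelta:
    """Time spent in callouts when every item is processed one after another."""
    return timedelta(milliseconds=items * ms_per_item)


# Figures from the slide: 200 ms per item, 10 million items.
total = total_callout_time(10_000_000, 200)
print(total.days, "full days")
print(round(total.total_seconds() / 86_400, 2), "days in total")
```

Halving the per-item cost halves the total, which is why the slide stresses profiling the callout code.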

Page 10: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

SharePoint 2013

Content Enrichment web service (CEWS) callout

“Callout” = web service

Conditionally via triggers

Synchronous

Process properties or raw data

High Availability / Load Balancing

Optimize for performance

Startup penalty is minimized

Page 11: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

SharePoint 2013

Content Enrichment web service (CEWS) callout

Registration via PowerShell

Page 12: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

SharePoint 2013

Content Enrichment web service (CEWS) callout

Registration via PowerShell

Configuration property: Endpoint
Description: Specifies the URL of the external web service.
Default value: Empty.

Configuration property: InputProperties
Description: The managed properties that the external web service receives.
Default value: Empty.

Configuration property: OutputProperties
Description: The managed properties that the external web service returns.
Default value: Empty.

Configuration property: Timeout
Description: The amount of time until the web service times out, in milliseconds. Depending on FailureMode, the item fails to be processed or a warning is written to the ULS log.
Default value: 5000 milliseconds; valid range [100, 30000].

Configuration property: SendRawData
Description: Enables or disables sending raw data to the web service.
Default value: False.

Configuration property: MaxRawDataSize
Description: The maximum size of raw data sent to the web service, in kilobytes (KB). If the binary data of an item exceeds this limit, the raw data is not sent. This does not prevent the InputProperties from being sent or the OutputProperties from being received.
Default value: 5120 kilobytes.

Configuration property: FailureMode
Description: Controls the behavior of the web service client when errors occur. When FailureMode is set to ERROR, any problem that occurs during content enrichment processing sends a failed callback for that particular item. When FailureMode is set to WARNING, the item is indexed without any modifications by the web service and a warning is written to the ULS log.
Default value: Error.

Configuration property: DebugMode
Description: When set to true, the content enrichment client sends all managed properties to the web service without expecting any properties in return. Any configured Trigger, InputProperties, and OutputProperties values are ignored.
Default value: False.

Configuration property: Trigger
Description: A Boolean predicate that is executed on every crawled item. If the predicate evaluates to true, the record is sent to the web service. Otherwise, the item is passed through to the search index.
Default value: Empty.
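To make the defaults and limits above easier to reason about, here is a small Python model of the configuration. It is only a sketch: `EnrichmentConfig`, `raw_data_is_sent` and `on_enrichment_error` are hypothetical names, not part of SharePoint or its PowerShell cmdlets, and treating 1 KB as 1024 bytes is an assumption.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EnrichmentConfig:
    """Hypothetical mirror of the CEWS settings, using the documented defaults."""
    endpoint: str = ""
    input_properties: List[str] = field(default_factory=list)
    output_properties: List[str] = field(default_factory=list)
    timeout_ms: int = 5000
    send_raw_data: bool = False
    max_raw_data_size_kb: int = 5120
    failure_mode: str = "Error"
    debug_mode: bool = False
    trigger: str = ""

    def __post_init__(self) -> None:
        # Valid range for Timeout is [100, 30000] milliseconds.
        if not 100 <= self.timeout_ms <= 30000:
            raise ValueError("Timeout must be within [100, 30000] ms")
        if self.failure_mode not in ("Error", "Warning"):
            raise ValueError("FailureMode must be 'Error' or 'Warning'")


def raw_data_is_sent(config: EnrichmentConfig, item_size_bytes: int) -> bool:
    """Raw data goes out only if enabled and the item does not exceed the limit."""
    limit_bytes = config.max_raw_data_size_kb * 1024  # assumption: 1 KB = 1024 bytes
    return config.send_raw_data and item_size_bytes <= limit_bytes


def on_enrichment_error(config: EnrichmentConfig) -> str:
    """Describe what happens to an item when the callout fails or times out."""
    if config.failure_mode == "Error":
        return "item fails to be processed"
    return "item indexed unmodified, warning written to ULS log"


config = EnrichmentConfig(send_raw_data=True, failure_mode="Warning")
print(raw_data_is_sent(config, 6 * 1024 * 1024))  # a 6 MB item is over the 5120 KB limit
print(on_enrichment_error(config))
```

Note that an oversized item still has its InputProperties sent; only the raw binary is withheld.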

Page 13: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

SharePoint 2013

Content Enrichment web service (CEWS) callout

Trigger conditions

Determine if a callout is needed

Uses Managed Properties, Operators, Constants and Functions

Property1 > Property2

Property1 > 600

IsNull(Property2)

StartsWith(Property1, "sample") AND Property2 != 18

IsDay(Property1, 2014, 04, 26)
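The trigger expressions above are evaluated by SharePoint itself, not by custom code. Purely to illustrate the logic, the following Python analogues show how such predicates decide whether an item gets a callout. The helper names and the dictionary-based item are assumptions, and details such as case sensitivity in the real StartsWith are not modelled.

```python
from datetime import date, datetime
from typing import Any, Dict, Optional


def is_null(value: Any) -> bool:
    """Analogue of IsNull(Property)."""
    return value is None


def starts_with(value: Optional[str], prefix: str) -> bool:
    """Analogue of StartsWith(Property, "prefix"); a missing value never matches."""
    return value is not None and value.startswith(prefix)


def is_day(value: Optional[datetime], year: int, month: int, day: int) -> bool:
    """Analogue of IsDay(Property, year, month, day)."""
    return value is not None and value.date() == date(year, month, day)


def should_call_out(item: Dict[str, Any]) -> bool:
    """Analogue of: StartsWith(Property1, "sample") AND Property2 != 18"""
    return starts_with(item.get("Property1"), "sample") and item.get("Property2") != 18


# Items are modelled as plain dictionaries of managed properties.
print(should_call_out({"Property1": "sample report", "Property2": 7}))   # sent to the service
print(should_call_out({"Property1": "sample report", "Property2": 18}))  # passed straight to the index
print(is_day(datetime(2014, 4, 26, 9, 30), 2014, 4, 26))
```

A narrow trigger keeps most items out of the synchronous callout, which is the cheapest performance optimization available.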

Page 14: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

SharePoint 2013

Content Enrichment web service (CEWS) callout

SOAP-based WCF service implementing IContentProcessingEnrichmentService

Microsoft.Office.Server.Search.ContentProcessingEnrichment.dll

C:\Program Files\Microsoft Office Servers\15.0\Search\Applications\External

Page 15: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

SharePoint 2013

Content Enrichment web service (CEWS) callout

Page 16: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

SharePoint 2013

Content Enrichment web service (CEWS) callout

Limitations

One WCF service endpoint per CEWS configuration, and one CEWS configuration per SSA (Search Service Application)

Raw data message limit

Page 17: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

#demo

A taste of CEWS

Page 18: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

Practical use

OCR and data extraction

Image recognition and tagging

Barcode scanning

BBAN/IBAN number normalization

LOB data tagging/enrichment
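As an illustration of the BBAN/IBAN use case, the sketch below shows the kind of normalization an enrichment service might apply before writing a value to an output property. It is not the code from the demo: the function names are made up, only the standard mod-97 check is applied, and country-specific length rules are left out.

```python
import re


def normalize_iban(raw: str) -> str:
    """Strip spaces, dashes and other separators, then upper-case."""
    return re.sub(r"[^0-9A-Za-z]", "", raw).upper()


def _mod97(value: str) -> int:
    """Map letters to numbers (A=10 ... Z=35) and reduce modulo 97."""
    digits = "".join(str(int(ch, 36)) for ch in value)
    return int(digits) % 97


def is_valid_iban(raw: str) -> bool:
    """Mod-97 check: move the first four characters to the end, expect remainder 1."""
    iban = normalize_iban(raw)
    return len(iban) >= 5 and _mod97(iban[4:] + iban[:4]) == 1


def belgian_bban_to_iban(bban: str) -> str:
    """Turn a 12-digit Belgian account number into its IBAN form."""
    digits = re.sub(r"\D", "", bban)
    check = 98 - _mod97(digits + "BE00")
    return f"BE{check:02d}{digits}"


print(normalize_iban("be68 5390 0754 7034"))
print(belgian_bban_to_iban("539-0075470-34"))
print(is_valid_iban("BE68 5390 0754 7034"))
```

Storing one canonical form means a search for either notation can find the same document.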

Page 19: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

#demo

A real world example

Page 20: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

WCF Routing Technology (.NET 4.0)

Enables development of complex routing logic, load-balancing, and fault tolerance.

Routing based on predefined or custom filters

Fault tolerance through backup endpoints

Load balancing through custom filters
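The WCF Routing Service is configured in .NET, but the two ideas on this slide can be sketched independently of it. The Python below is an analogue only: round-robin selection stands in for a load-balancing custom filter, and a list of backups stands in for backup endpoints. The class, the endpoint names and the `fake_send` transport are all hypothetical.

```python
from itertools import cycle
from typing import Callable, List


class RoundRobinRouter:
    """Pick primary endpoints in turn; fall back to backups when a send fails."""

    def __init__(self, primaries: List[str], backups: List[str]) -> None:
        self._primaries = cycle(primaries)
        self._backups = list(backups)

    def route(self, message: str, send: Callable[[str, str], str]) -> str:
        # Try the next primary first, then each backup in order.
        candidates = [next(self._primaries)] + self._backups
        last_error = None
        for endpoint in candidates:
            try:
                return send(endpoint, message)
            except ConnectionError as error:
                last_error = error
        raise RuntimeError("all endpoints failed") from last_error


DOWN = {"cews-a"}  # pretend this enrichment service is offline


def fake_send(endpoint: str, message: str) -> str:
    """Stand-in transport that fails for endpoints listed in DOWN."""
    if endpoint in DOWN:
        raise ConnectionError(endpoint)
    return f"{endpoint} handled {message}"


router = RoundRobinRouter(["cews-a", "cews-b"], backups=["cews-backup"])
print(router.route("item-1", fake_send))  # cews-a is down, so the backup answers
print(router.route("item-2", fake_send))  # next primary in the rotation
```

Placed behind the single configured endpoint, a router like this is what lets several enrichment services share the load.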

Page 21: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

#demo

Breaking through the limit

Page 22: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

Wrap-up

Service oriented

Raw Data and/or Managed Properties

PowerShell

Synchronous

Routing Service

Trigger Expression Syntax

Page 23: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

Resources

Custom content processing with the Content Enrichment web service callout
http://bit.ly/1j1UEvH

How to: Use the Content Enrichment web service callout for SharePoint Server
http://bit.ly/1l3wLK3

Trigger expressions syntax in SharePoint 2013
http://bit.ly/1hVSR97

Advanced Content Enrichment in SharePoint 2013 Search
http://bit.ly/1j25Ua9

Content enrichment service scaling and aggregation
http://bit.ly/1h4HZpt

Routing Service
http://bit.ly/1jKIVAP

Message Filters
http://bit.ly/1keu0ls

Page 24: SharePoint Saturday Belgium 2014 - Content Enrichment in SharePoint Search

Thank you

www.sharepointblogs.be/blogs/vandest

@vandest1