Search

Version 2007.3

Installation and Configuration Guide

Oracle ATG

One Main Street

Cambridge, MA 02142

USA

ATG Search Installation and Configuration Guide

Product version: 2007.3

Release date: 04-15-11

Document identifier: SearchInstallConfigGuide1106061639

Copyright © 1997, 2011 Oracle and/or its affiliates. All rights reserved.

This software and related documentation are provided under a license agreement containing restrictions on use and disclosure and are

protected by intellectual property laws. Except as expressly permitted in your license agreement or allowed by law, you may not use, copy,

reproduce, translate, broadcast, modify, license, transmit, distribute, exhibit, perform, publish, or display any part, in any form, or by any

means. Reverse engineering, disassembly, or decompilation of this software, unless required by law for interoperability, is prohibited.

The information contained herein is subject to change without notice and is not warranted to be error-free. If you find any errors, please

report them to us in writing. If this software or related documentation is delivered to the U.S. Government or anyone licensing it on behalf of

the U.S. Government, the following notice is applicable:

U.S. GOVERNMENT RIGHTS

Programs, software, databases, and related documentation and technical data delivered to U.S. Government customers are "commercial

computer software" or "commercial technical data" pursuant to the applicable Federal Acquisition Regulation and agency-specific

supplemental regulations. As such, the use, duplication, disclosure, modification, and adaptation shall be subject to the restrictions and

license terms set forth in the applicable Government contract, and, to the extent applicable by the terms of the Government contract, the

additional rights set forth in FAR 52.227-19, Commercial Computer Software License (December 2007). Oracle America, Inc., 500 Oracle

Parkway, Redwood City, CA 94065.

This software or hardware is developed for general use in a variety of information management applications. It is not developed or intended

for use in any inherently dangerous applications, including applications that may create a risk of personal injury. If you use this software or

hardware in dangerous applications, then you shall be responsible to take all appropriate fail-safe, backup, redundancy, and other measures

to ensure its safe use. Oracle Corporation and its affiliates disclaim any liability for any damages caused by use of this software or hardware in

dangerous applications.

Oracle and Java are registered trademarks of Oracle and/or its affiliates. Other names may be trademarks of their respective owners.

Intel and Intel Xeon are trademarks or registered trademarks of Intel Corporation. All SPARC trademarks are used under license and are

trademarks or registered trademarks of SPARC International, Inc. AMD, Opteron, the AMD logo, and the AMD Opteron logo are trademarks or

registered trademarks of Advanced Micro Devices. UNIX is a registered trademark licensed through X/Open Company, Ltd.

This software or hardware and documentation may provide access to or information on content, products, and services from third parties.

Oracle Corporation and its affiliates are not responsible for and expressly disclaim all warranties of any kind with respect to third-party

content, products, and services. Oracle Corporation and its affiliates will not be responsible for any loss, costs, or damages incurred due to

your access to or use of third-party content, products, or services.

For information about Oracle's commitment to accessibility, visit the Oracle Accessibility Program website at http://www.oracle.com/us/

corporate/accessibility/index.html.

Oracle customers have access to electronic support through My Oracle Support. For information, visit http://www.oracle.com/support/

contact.html or visit http://www.oracle.com/accessibility/support.html if you are hearing impaired.

Table of Contents

1. Introduction
    Document Conventions
    More Information
    ATG Search Installation Roadmap

2. ATG Search Overview
    How ATG Search Works
    ATG Search Architecture
    ATG Search Administration Components
        Project Administration
        Search Workbench
    Security in ATG Search

3. Installing ATG Search
    Installation Prerequisites
    Installing Search
    Installing HTMLFilter
    Installing PDF Extract
    Adding Standalone Search Engines
    Configuring for Remote Search Engine Use
    Configuring for Remote Repository Indexing
    Configuring a Multi-Server Installation
        Configuring the DeployShare Directory
        Configuring the GSAInvalidatorService
        Configuring the Lock Manager
        Configuring the IDGenerator
        Configuring a Single Search Administration for Multiple CA Environments
        Configuring SearchSQLRepository Components
        Using Incremental Loading with ATG Content Administration
        Configuring the Database
    Configuring Search for Customer Intelligence
    Building the Search Administration EAR
    Accessing Search Administration

A. Migrating from Previous ATG Search Versions
    Terminology Changes
    Procedural Changes
    Reporting Changes
    Migration Procedures

B. ATG Search Component Ports

4. Glossary

Index

1 Introduction

This guide explains how to install ATG Search. It is written for system administrators and other experienced

persons involved in planning and performing an ATG Search installation. Review this guide thoroughly before

beginning your installation.

This chapter includes the following sections:

Document Conventions (page 1)

More Information (page 1)

ATG Search Installation Roadmap (page 2)

Document Conventions

This guide uses the following conventions:

• <SearchDir> refers to the directory in which you have installed ATG Search.

• <ATG2007.3Dir> refers to your ATG 2007.3 installation directory.

More Information

For more information on ATG Search, see the following:

• ATG Search Administration Guide—Administering your Search installation.

• ATG Search Query Reference Guide—Detailed information on how content is indexed and how queries are

processed.

See the following documentation for making searchable content accessible to end-users of your ATG products:

• ATG Commerce Search Guide—Using Search with your ATG Commerce site.

• ATG Knowledge Installation and Configuration Guide—Using Search as part of Knowledge.

ATG Search Installation Roadmap

Installing and configuring ATG Search involves the following steps:

Step                                       Reference
Plan the installation                      See ATG Search Overview (page 3) if you are not familiar with ATG Search components and architecture.
Install ATG Search                         See Installing ATG Search (page 9).
Prepare, index, and deploy your content    See the ATG Search Administration Guide.

2 ATG Search Overview

ATG Search is an industry-leading search engine that lets users zero in on the information they need, regardless

of format, location or language. Multiple levels of analysis pinpoint answers and deliver them with sub-second

response times.

Features include:

• Processing natural language queries from end users

• Multiple ways to find information (browse, natural language, keyword, etc.)

• Unified access to any content type or location

• Support for faceted navigation in client applications

• Multiple language support

• Industry-specific lexicons

• Automated indexing and categorization

• Predefined reports using ATG Customer Intelligence

This chapter includes the following sections:

How ATG Search Works (page 3)

ATG Search Architecture (page 4)

ATG Search Administration Components (page 6)

How ATG Search Works

“Searching” means retrieving information from a repository using an input query. Commonly, the information

is textual, the repository is a collection of documents, and the queries are words or phrases entered by an end-

user. The search results are typically the most relevant documents to the query, plus some indication of why the

document was retrieved. In order for searching to be efficient, the document collection is indexed by its terms in

a secondary storage component, typically called the index.

ATG Search generalizes documents and other indexed content into an abstraction called an index item. An index

item consists of two elements: searchable text content and metadata. Metadata includes the title, summary, and

other associated properties. An index item can be an actual document such as an HTML file, a repository item

such as an ATG Commerce product, or a piece of structured data from a traditional database.

The index items are fed into ATG Search, which analyzes the content and stores a representation for each item

in the index. The index objects are organized into hierarchical sets much like a directory scheme. ATG Search

creates some sets from the physical directory of the item, some sets from the metadata of the item, and other

sets from topic categorization results. These item sets enable searches within subsets of the collection and

browsing the collection without query input.

Text content is processed through the natural language components, which identify the structural elements

(such as sentences, headers, and tables) as well as the terms in the content. This processing is driven by a

dictionary and other language data, which are also stored in the index. The terms are divided into statement

vectors representing the sequence of terms in a structural portion of the content. The terms are also added to an

index, similar to one found in the back of a large book. This back-of-the-book index allows for efficient searches

and provides a global view of all the content.

Rather than simple text queries, ATG Search accepts complex requests that specify what actions should be

performed. A request may include parameters, processing options, constraints, security settings, and other

information. The two primary requests are the Search Query and the View Item request. ATG Search returns

responses that contain varied information depending on the type of request. For a Search Query, the response

contains a list of results plus other information to drive the search application or user interface.

For more information on how ATG Search processes content into a searchable index, see the ATG Search Query

Reference Guide.

ATG Search Architecture

ATG Search runs within a standard ATG platform installation. Due to the heavy performance demands of

indexing content and serving search responses, you must dedicate at least one machine in your network

exclusively to Search.

ATG Search has the following components:

• Search Administration—The interface where you create projects, index content, deploy indexes, etc.

Runs within a standard ATG installation. You should have only one Search Administration instance in your

installation.

The SearchAdmin.AdminUI module is required to run Search Administration.

• Search engine—Serves answers to end-user queries. Does not require a full ATG installation. Your installation

can include any number of search engines, but should run only one search engine per processor core. A single

search engine can use up to 1.4GB of memory, assuming a 32-bit operating system. Your hardware should be

sized accordingly.

The DAF.Search.Routing module starts the Search engine(s) either locally or remotely, and coordinates

communication between the client application, Search engine(s), and the Search database. Runs within a

standard ATG installation. You can run multiple instances of this module, but one should always run along

with Search Administration.

The DAF.Search.Base module is a lightweight routing module that can be run independent of Search

Administration. DAF.Search.Base is automatically included in Routing and contains base routing

functionality. It is also used stand-alone in client applications. You can run multiple instances of this module.

• Client application—The application through which your end-users place their search queries. This application

must include the DAF.Search.Base module, and must also include DAF.Search.Query if your application

uses legacy form handlers.

• Search index—The searchable content deployed on your site. An index is composed of one or more logical

partitions, each of which is associated with a content set configured in Search Administration. Each logical

partition is composed of one or more physical partitions, each of which is served up by a search engine.

• Search database—Consists of two repositories, one of which stores information about Search engines,

index structure, and deployment information, the other of which stores information about users and Search

Administration. Search Administration requires access to both repositories. Both Search Administration and

your client application must have access to the routing repository.
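
The sizing guidance for Search engines given above (one engine per processor core, up to 1.4 GB of memory per engine on a 32-bit operating system) can be turned into a rough upper bound for a single host. The following is a minimal sketch, not an official sizing tool: the function name max_engines is hypothetical, 1.4 GB is rounded up to 1434 MB, and memory needed by the operating system and other processes is not accounted for.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rough upper bound on Search engines for one host.
# Assumes one engine per core and about 1.4 GB (1434 MB) per engine,
# the 32-bit figure quoted in this guide. OS overhead is NOT included.
max_engines() {
  local cores=$1 mem_mb=$2
  local per_engine_mb=1434
  local by_mem=$(( mem_mb / per_engine_mb ))
  # The smaller of the two limits (cores, memory) wins.
  if [ "$by_mem" -lt "$cores" ]; then
    echo "$by_mem"
  else
    echo "$cores"
  fi
}

max_engines 4 16384   # core-bound host
max_engines 8 4096    # memory-bound host
```

Treat the result as a starting point only; index size and query load determine the real requirement.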

The Search Administration, routing, and Search engines must all have access to the deployment share directory.

This is a scalable, shared directory where master copies of indexes are stored. This directory should ideally be

located on a high-performance machine separate from the Search Administration machine. Routing and Search

engine components must have access to this directory in order for indexes to be deployed and searched.

Note: The size of the index that is deployed does not bear any direct relationship to the size of the raw

information being indexed—dictionaries, topics, and other customization data can all add to the size of the

index, as can the nature of the content itself. For example, content consisting mostly of pictures with some

metadata might form a very small index relative to the raw content size, while a product catalog with many

small, unique pieces of information might be relatively large.

There are several possible configurations for the components described. The simplest option is to run all

components locally, on one machine (the possible exception being the database). This configuration is sufficient

for testing purposes and for estimating the size of your optimal configuration, but is not likely to be used in a

production environment.

You may want to consider a self-contained installation to begin with. Use this installation to estimate the size of

your index or indexes, then add routing and search engine installations as necessary. The original installation

can then remain as the Search Administration/indexing machine. In a production environment, you should

separate the searching and indexing functions. The diagram that follows shows a minimal configuration:

Ideally, resources will be available to provide redundancy, both to prevent performance bottlenecks and to

avoid failures should one component go down. Additionally, if your content set is large enough to

require multiple Search engines, you will need multiple CPUs in order to serve the index. The next diagram

shows a single CPU dedicated to indexing, and a minimum of three for searching.

ATG Search Administration Components

The ATG Search Administration interface consists of the following main components:

• Project Administration (page 7)

• Search Workbench (page 7)

For detailed information on Search Administration, see the ATG Search Administration Guide.

Project Administration

Project Administration is the part of the Search Administration interface dedicated to project-related tasks.

The Search project contains information about a set of searchable content you define, and the configuration

information for that content, such as where it comes from, what rules are followed for its indexing, and

what customization data is applied.

The main item of importance for a project is its index. The index is the searchable content version produced

by ATG Search’s processing. The index is created from the content sets, which organize the content you

define. Content refers to the original, unprocessed documents, databases, etc. that you want to make searchable.

The project links the index with Search environments, which are where indexing takes place (indexing

environment), where testing can be done (staging environment), and where the index is finally deployed so end-

users can search it (production environment).

You can also use projects to associate an index with customization data, such as topic sets or specialized

dictionaries. This customization data is created using the Workbench (see next section).

Search Workbench

The Search Workbench is used for creating and working with customization data, as well as viewing information

about already built indexes. Customizations are used to enhance the out-of-the-box configuration for features

such as languages, dictionaries, and topic sets.

Customization is the process of optimizing your ATG Search implementation to meet specific accuracy

and search usability goals. Customizations typically add information to the index itself (such as additional

dictionaries or term weights) or affect the way your content is processed or accessed by end-users (for example,

Query Rules, Topic Sets, or Facet Sets).

Note: For information on adding customization data to your Search project, see the Managing Search Projects

chapter of the ATG Search Administration Guide.

Security in ATG Search

ATG Search includes the Search Access role, which allows a user assigned that role to access Search

Administration and the Workbench. This role is granted by default to the admin user; in ATG Knowledge, by

default it is assigned to the service user.

The role can be found in the SearchAdminRoles folder of the ATG Business Control Center; for information, see

the ATG Business Control Center Administration and Programming Guide.

3 Installing ATG Search

This chapter explains how to install ATG Search and perform initial configuration. It includes the following

sections:

Installation Prerequisites (page 9)

Installing Search (page 10)

Adding Standalone Search Engines (page 11)

Configuring a Multi-Server Installation (page 13)

Configuring the Database (page 17)

Building the Search Administration EAR (page 21)

Accessing Search Administration (page 21)

Note: The default RMI port for new environments is 8860. If this port is not correct for your system, then when

you create an environment in Search Administration, Search Administration cannot connect to it until you

change that setting. To do so:

1. In Search Administration go to Browse Projects > Your Project > Environments > Host Name > Advanced

Settings.

2. Change the Default RMI Port setting.

3. Click Save.
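
Before changing the Default RMI Port setting, it can help to confirm whether anything is listening on the port on the engine host. The following is a minimal sketch, assuming a bash shell; the function name port_open and the example host name are hypothetical. It tests TCP reachability only and does not verify that an RMI service is answering.

```shell
#!/usr/bin/env bash
# port_open HOST PORT: succeed if a TCP connection can be opened.
# Relies on bash's /dev/tcp redirection, which plain sh does not provide.
port_open() {
  local host=$1 port=$2
  # The subshell opens the socket on descriptor 3 and closes it on exit.
  (exec 3<>"/dev/tcp/$host/$port") 2>/dev/null
}

# Example with a placeholder host name (an assumption, not from this guide):
# port_open searchhost.example.com 8860 && echo "reachable" || echo "not reachable"
```

A firewalled host may cause the connection attempt to wait for the system timeout rather than fail immediately.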

Installation Prerequisites

This section addresses prerequisites and preliminary configuration procedures that must be followed before

installing ATG Search.

• If you are planning to use this installation to run Search Administration, before you install ATG Search, install

your application server and the ATG platform. If you are only planning to run the standalone Search engine,

you do not need the ATG platform or an application server.

• If you are running other ATG applications along with Search, install in the following order:

• ATG Platform

• ATG Search

• Other ATG applications

• You will need a scalable shared network drive where indexes are initially created and from which they can be

deployed. All Search engines must have access to this directory. Indexes can take up a great deal of disk space,

so be sure the directory selected has at least 5-10 GB available (more if your index is large). This directory

must be writable.

Note: The deployment directory must be created and configured as a shared folder prior to installing ATG

Search. In a production site, this directory should be located on a separate, high-performance machine.
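
The deployment share requirements above (the directory exists, is writable, and has at least 5-10 GB free) can be checked before running the installer. The following is a minimal sketch for a Linux host; the function name check_share and the example mount point are hypothetical, and the threshold you pass should reflect your own index size.

```shell
#!/usr/bin/env bash
# check_share DIR MIN_GB: succeed if DIR exists, is writable, and has
# at least MIN_GB gigabytes free. Hypothetical pre-install check.
check_share() {
  local dir=$1 min_gb=$2 free_kb
  [ -d "$dir" ] || { echo "missing: $dir"; return 1; }
  [ -w "$dir" ] || { echo "not writable: $dir"; return 1; }
  # df -Pk prints one data line; column 4 is available space in 1 KB blocks.
  free_kb=$(df -Pk "$dir" | awk 'NR==2 {print $4}')
  if [ "$free_kb" -lt $(( min_gb * 1024 * 1024 )) ]; then
    echo "only $(( free_kb / 1024 / 1024 )) GB free in $dir"
    return 1
  fi
  echo "ok: $dir"
}

# Example with a placeholder mount point (an assumption):
# check_share /mnt/atg-deployshare 10
```

Run the check as the user that will run the Search engines, since write permission is evaluated for the current user.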

Installing Search

Use the ATG Search installer for all machines, regardless of whether they will be used for Search Administration,

routing, or Search engines.

Note: The screen shots shown are for a Windows installation. If you are installing on Linux, the procedure and

required information are the same, but the installer looks somewhat different.

To install ATG Search:

1. Download the installation executable.

2. Double-click the installer icon to start installing.

3. On the start page, click Next to begin the installation.

4. Read the license agreement. Select “I accept the terms of the license agreement” and click Next.

5. Select whether to install Search Administration (which includes a Search Engine) or a standalone Search

Engine only. Click Next.

6. Select the Unicode support level required for your Search installation. All languages supported by ATG

Search use UCS-2 encoding, including Chinese, Japanese, and Korean. Use UCS-4 only if you need support for

surrogate characters. Click Next.

7. Enter an installation location. The default location is C:\ATG\ATG2007.3. Click Next.

Note: If you plan to use this installation as a standalone Search engine only, you do not need an ATG platform

installation, so the C:\ATG directory may not exist.

8. Select locations for product icons and shortcuts.

9. Review your selections, and click Install to begin the installation.

10. Enter a Deployment Share folder. This is the scalable, shared directory you created before beginning the

installation.

11. When the installation is finished, click Done.

Installing HTMLFilter

If the content you want to make searchable includes documents in HTML format, download and install the HTML

Filter component for ATG Search.

1. Download the ATGSearchHtmlFilter executable.

2. Double-click the installer icon to start installing.

3. On the start page, click Next to begin the installation.

4. Accept the license agreement, and click Next.

5. Select an installation directory, or accept the default, and click Next.

6. Click Install to install the component.

No additional configuration is necessary to use the component.

Installing PDF Extract

If the content you want to make searchable includes documents in PDF format, download and install the PDF

Extract component for ATG Search.

1. Download the ATGSearchPDFExtract executable.

2. Double-click the installer icon to start installing.

3. On the start page, click Next to begin the installation.

4. Accept the license agreement, and click Next.

5. Select an installation directory, or accept the default, and click Next.

6. Click Install to install the component.

No additional configuration is necessary to use the component.

Adding Standalone Search Engines

Standalone Search engines can run without an ATG installation or application server. These engines assist with

indexing, and provide Search results from a portion of your index.

To add a standalone Search engine to your installation:

1. Download the installation executable.

2. Double-click the installer icon to start installing.

3. On the start page, click Next to begin the installation.

4. Read the license agreement. Select “I accept the terms of the license agreement” and click Next.

5. Select Search Engine Only for the installation type.

6. Select the Unicode support level required for your Search installation. Use UCS-2 unless you need full support

for Chinese and Korean content, in which case select UCS-4. Click Next.

7. Select an installation location.

8. Select locations for product icons and shortcuts.

9. Review your selections, and click Install to begin the installation.

10. Enter a Deployment Share folder. This is the scalable, shared directory you created before beginning the

installation.

11. Make sure that you have execute permissions on the following script:

<Search2007.3Dir>/SearchAdmin/bin/startRemoteLauncher.sh

12. Execute the startRemoteLauncher script.

The startRemoteLauncher script starts up a launcher service, which is then used by the routing component to

start the Search engine.

Configuring for Remote Search Engine Use

If you are running Search engines (either standalone engines or full Search installations) on machines other than

the Search Administration machine, additional configuration steps are required to ensure that these engines can

index content.

Search Engines, whether used for indexing or searching, pull their index files from a common location. If the

engine is for searching, then it pulls the files from the master deploy share. If the engine is for indexing, then

there are two cases:

• Incremental indexing. The Search engine needs to load the previously-produced index files from the master

deployment share; this is already a network share, so presents no difficulty.

• Indexing from scratch. The Search engine needs a clean partition, a file from which all indexes are created. If

the engine is not local to Search Administration, there must be a share in order for the engine to access the

clean partition.

Therefore, you must create a share for the folder containing the clean partition, with read access for the

Search engine. The clean partition is located in <SearchDir>\SearchEngine\i686-win32-vc71\data,

which could for example be shared as \\mymachine\sharedData.

Having done this, you must also create a RoutingSystemService.properties file in your

<ATG2007.3Dir>\home\localconfig\atg\search\routing directory to point to the partition in the

shared directory. The file should contain a cleanPhysicalPartitionPath property as shown:

cleanPhysicalPartitionPath=\\\\mymachine\\sharedData\\initial.index

Note: If you do not want to create a share for the clean partition, you can do the following instead:

• Copy initial.index from the unshared <SearchDir>\SearchEngine\i686-win32-vc71\data folder

into the network deployment share.

• Update the RoutingSystemService.properties file’s cleanPhysicalPartitionPath to point to this

using the full network path, such as \\server\deploymentShare\initial.index.

You will also need to share any directories that contain content to be indexed. When indexing a file system, the

Search engines receive paths to the files, for example, D:\CustomerDocs\whitepaper.doc. The Search engine

opens the file and indexes it; if the Search engine is not on the same machine as Search Administration, then the

D drive is not accessible.

Then specify this directory in <ATG2007.3Dir>\home\localconfig\atg\search\routing\LaunchingService.properties:

deployShare=\\\\mymachine\\DeployShare
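The backslashes in the cleanPhysicalPartitionPath and deployShare values above are doubled because the backslash is the escape character in Java properties files. The following Python sketch illustrates that escaping. The helper names escape_properties_path and properties_line are hypothetical and are not part of ATG Search; the share names are the example values used in this section.

```python
def escape_properties_path(path: str) -> str:
    """Double every backslash so the path survives Java .properties parsing."""
    return path.replace("\\", "\\\\")


def properties_line(key: str, path: str) -> str:
    """Build one key=value line for a .properties file (hypothetical helper)."""
    return f"{key}={escape_properties_path(path)}"


if __name__ == "__main__":
    # Example share names taken from this section; substitute your own.
    print(properties_line("cleanPhysicalPartitionPath",
                          r"\\mymachine\sharedData\initial.index"))
    print(properties_line("deployShare", r"\\mymachine\DeployShare"))
```

Paths that use forward slashes pass through unchanged, so the same helper can be applied to UNIX-style share paths.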

Configuring for Remote Repository Indexing

If you are using ATG Search to index remote repositories, such as those used by an ATG Commerce installation,

you may encounter the following error:

Connection refused to host: 127.0.0.1

If this happens, modify the application server startup script to include the argument:

-Djava.rmi.server.hostname=XXX.XXX.XXX.XXX

Replace the XXX.XXX.XXX.XXX value with an IP address which other ATG instances can communicate with on this

machine.
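As a rough illustration of the argument described above, the following Python sketch builds the -Djava.rmi.server.hostname value and rejects loopback addresses, which would reproduce the "Connection refused to host: 127.0.0.1" error. The function name rmi_hostname_argument and the sample address are assumptions made for this example; how the argument is added to the startup script depends on your application server.

```python
import ipaddress


def rmi_hostname_argument(address: str) -> str:
    """Return the JVM argument for an address other machines can reach."""
    ip = ipaddress.ip_address(address)  # raises ValueError if malformed
    if ip.is_loopback or ip.is_unspecified:
        # 127.0.0.1 or 0.0.0.0 cannot be used by remote ATG instances.
        raise ValueError(f"{address} is not reachable from other machines")
    return f"-Djava.rmi.server.hostname={ip}"


if __name__ == "__main__":
    # 192.168.1.20 is a placeholder; use the address of this machine.
    print(rmi_hostname_argument("192.168.1.20"))
```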

Configuring a Multi-Server Installation

The default configuration assumes that all Search modules are running locally. This configuration is

unlikely to be used in production, however. This section addresses configuration that must be

done in order for Search to communicate with remote Search components, or with other ATG products run in

combination.

A typical real-world Search installation includes three servers running one or more of the indicated ATG

products:

• Management Server—Search Administration, Content Administration (CA), Response Management

Administration, or Merchandising

• Agent Server—Service Administration, Customer Service Center, Knowledge, Search Routing

• Production—Commerce, Self Service, Search Routing

The following diagram shows a Search installation with all configured components. The sections that follow

explain the configuration steps that must be performed.

Configure all components in the dynamo/home/servers/servername/localconfig/ configuration layer.

Configuring the DeployShare Directory

On each server on which you plan to run the DAF.Search.Routing module, configure the shared deployment

directory. To do this, change the deployShare property in the /atg/search/routing/LaunchingService

component on each server. Make sure the deployment directory is visible to all of the servers. For example:

deployShare=/yourSharedDirectory/DeployShare

Configuring the GSAInvalidatorService

Make sure that the /atg/dynamo/service/GSAInvalidatorService is enabled on each server. See Enabling

the Cache Invalidator in the ATG Repository Guide.

Configuring the Lock Manager

Configure Server and Client lock managers between your servers. The lock manager is used when you have

multiple repositories referring to the same database and want to ensure that the repositories remain in sync. For

example, the Search routing repository is used by all Search clients within a system. Therefore, all of the servers

on which those Search clients run need to have a Client Lock Manager set up to look at the same Server Lock

Manager.

Both Search Administration and Knowledge automatically point their SearchClientLockManager

components to the ClientLockManager_production component; SearchClientLockManager requires

no further configuration. If you are using other ATG applications with Search, you may need to perform this

configuration manually. You must also configure the following components:

• ClientLockManager component on the production server

• ClientLockManager_production component on the Knowledge and Search Administration servers

In both of those components, perform the following configuration:

• Set useLockServer=true

• Set the lockServerAddress and lockServerPort properties to point to the lockServerAddress and

lockServerPort properties of the ClientLockManager component on the production server

See also the Locked Caching section of the ATG Repository Guide.
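The three lock manager properties named above can be sketched as the contents of a properties file. The Python function below only renders that text; the function name, the host name lockserver.example.com, and the port 9010 are placeholders chosen for illustration, not values confirmed by this guide.

```python
def lock_manager_properties(server_address: str, server_port: int) -> str:
    """Render the client lock manager settings described in this section."""
    lines = [
        "useLockServer=true",
        f"lockServerAddress={server_address}",
        f"lockServerPort={server_port}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values: use the address and port of the production
    # server's lock manager.
    print(lock_manager_properties("lockserver.example.com", 9010), end="")
```

The same text would apply to both the ClientLockManager and ClientLockManager_production components, since both must point to the same lock server.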

Configuring the IDGenerator

An IDGenerator generates IDs for repository items. There should be only one per repository.

On the management and agent servers, the /atg/search/service/SearchIdGenerator component points

to the /atg/dynamo/service/IdGenerator_production component, which is the ID generator for the

production server. This ensures that Search Administration and all Search client applications use the same ID

generator for the Search repositories.

Configuring a Single Search Administration for Multiple CA Environments

If you are using Search with ATG Content Administration, have more than one environment (such as Staging and

Production), and plan to use incremental indexing, perform the following steps:

1. Copy the /atg/commerce/search/IndexedItemsGroup.properties file to

IndexedItemsGroup_staging.properties.

2. In the new IndexedItemsGroup_staging.properties file, set the repository property as shown:

repository=/atg/commerce/catalog/ProductCatalog_staging

3. Copy the product-catalog-output-config.xml file to product-catalog-output-config-staging.xml.

4. In the new product-catalog-output-config-staging.xml file, change the repository-path and

repository-item-group settings as shown:

repository-path=/atg/commerce/catalog/ProductCatalog_staging

repository-item-group=/atg/commerce/search/IndexedItemsGroup_staging

5. Copy the ProductCatalogOutputConfig.properties file to

ProductCatalogOutputConfig_staging.properties.

6. In the new ProductCatalogOutputConfig_staging.properties file, set the definitionFile property as

shown:

definitionFile=product-catalog-output-config-staging.xml
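The copy-and-edit pattern in steps 1, 2, 5, and 6 can be sketched in Python as follows. This is a minimal illustration, not an ATG tool: the function name clone_with_override is hypothetical, and the sketch assumes simple one-line key=value entries with no line continuations.

```python
from pathlib import Path


def clone_with_override(source: Path, target: Path, key: str, value: str) -> None:
    """Copy a .properties file, replacing (or appending) one key=value line."""
    out = []
    replaced = False
    for line in source.read_text().splitlines():
        # Compare only the text before the first "=" with the wanted key.
        if line.split("=", 1)[0].strip() == key:
            out.append(f"{key}={value}")
            replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(f"{key}={value}")
    target.write_text("\n".join(out) + "\n")
```

For example, calling clone_with_override on IndexedItemsGroup.properties with the target IndexedItemsGroup_staging.properties, the key repository, and the value /atg/commerce/catalog/ProductCatalog_staging corresponds to steps 1 and 2. The original file is left unchanged.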

Configuring SearchSQLRepository Components

The SearchSQLRepository and SearchSQLRepositoryEventServer components are used by the

subscriber table needed for GSA distributed cache invalidation.

These components are located in the DAF.Search.Base module, in the atg/search/service component

directory. By default, these components point to the local, default JDBC connection and the related services. If

your environment has both production and internal-facing ATG instances, you must change these components

to point to the _production versions of the target components on the internally-facing machines.

Note: ATG Knowledge sets these components to point to their _production equivalents automatically, and you

do not need to configure them further.

Component | Default Setting | Production Setting

/atg/search/service/SearchSQLRepository | /atg/dynamo/service/jdbc/SQLRepository | /atg/dynamo/service/jdbc/SQLRepository_production

/atg/search/service/SearchSQLRepositoryEventServer (the SQLRepositoryEventServer used by Search; should always match the data source configuration of SearchJTDataSource) | /atg/dynamo/server/SQLRepositoryEventServer | /atg/dynamo/server/SQLRepositoryEventServer_production

Note: There should only be one SQLRepositoryEventServer per data source. It’s important that /atg/

dynamo/server/SQLRepositoryEventServer_production on the asset management server and /atg/

dynamo/server/SQLRepositoryEventServer on the production server are connected to the same database

account.

If you want to use something other than the production server as the target, clone the _production

components and configure them to point to components on your alternative server. Most installations should

not require this.

Using Incremental Loading with ATG Content Administration

If you are using incremental indexing of your versioned content, whenever you modify and redeploy your

content, the incremental loading system needs to be notified about the changes in order to reflect them in the

next incremental index. Therefore, an IndexingOutputConfig component (IOC) must run in the ATG Content

Administration environment (asset management server) and listen for change events on the deployment

repository (production server).

You can actually load the content using an IOC running on either server. If you want to load content on the

production server, configure the targetName property of the IOC on the management server to the name of

the Content Administration deployment target. However, if you are loading content on the asset management

server, you must not configure this property, or loading will not work.

If you are using a secure product catalog (for instance, as part of a B2B Commerce installation), you must change

the indexing configuration on the asset management server so that it can listen for change events on the

deployment target repository. By default, this configuration points to the non-secure catalog repository. To

change it, edit the following property in the /atg/commerce/search/IndexedItemsGroup.properties file:

repository=/atg/commerce/catalog/ProductCatalog_production

The new value should be as shown:

repository=/atg/commerce/catalog/SecureProductCatalog

Configuring the Database

Your Search Administration installation and all routing installations must have access to the Search databases.

Search components rely on two database schemas, one internal, which is used by management applications

such as Search Administration and Merchandising (called the management database), and one external, which is

used by Search client applications such as Routing, Commerce and Knowledge (called the production database).

Configuring Search Data Sources

Before you run the database import scripts, in your home/localconfig directory, make the following changes:

• Set FakeXADataSource to point to the management schema.

• Set FakeXADataSource_production to point to your production schema.

• Set the JTDataSource and JTDataSource_production components in home/servers/server_name/

localconfig to use a JNDIReference to your application server data source. On the management server, the

JTDataSource should point to the management schema and the JTDataSource_production should point

to the production schema. On the production server, you only need to configure the JTDataSource, which

must refer to the production schema.

Note: If you are running Search Administration by itself, without any other servers, point both

FakeXADataSource and FakeXADataSource_production to the management database. You should also

define the JTDataSource_production component in your home/servers/server_name/localconfig to

point to the production schema.

When you start your management or production server, reference the server_name in your start command.

Configuring Data Sources for Multi-Product Installations

As explained in the previous sections, a typical Search installation has a management, agent, and production

server. It also has three data sources:

• Agent data source – Stores versioned Knowledge solutions and internal users

• Production data source – Stores Self Service solutions, Search routing and Search Administration data, and

external users

• Management data source – Required for Knowledge installations

A number of /atg/search/service components exist to configure your installation. As provided, these

components point to the production server and the production data source; this is the recommended

configuration, and if you use it, you do not need to change these components. You do need to do the following:

• On the agent server, configure JTDataSource_production, ClientLockManager_production and

SQLRepositoryEventServer_production to point to the production server and production data source.

• On the management server, configure JTDataSource_production, ClientLockManager_production

and SQLRepositoryEventServer_production to point to the management server and management data

source.

Creating the Database Schema

After installing ATG Search, create the database schema. The following scripts are located in the

<ATG2007.3dir>\Search2007.3\SearchAdmin\install\ directory:

• create-search—Creates the DAF.Search tables and the SearchAdmin tables

• create-daf-search—Creates only the DAF.Search tables

• create-search-admin—Creates only the SearchAdmin tables

• all-create—Creates all Search tables and also tables for the ATG Business Control Center

If your Search installation is standalone, you can simply run all-create.

If your installation involves both a production and a management component, tables related to other ATG

products have most likely already been installed on both the production and management server, and you only

need to create the Search tables. This involves the following steps:

1. Set your Search database environment variables to point to the production database.

2. Set your FakeXADataSource to point to your management database.

3. Set your FakeXADataSource_production to point to your production database.

4. Run create-search, which creates both the daf-search and search-admin tables and initializes them. It

also creates the SearchAccess role in the management database so the default admin user can access the

Search Administration interface through the BCC on the management server.

Note: ATG Knowledge and ATG Self Service create the daf-search tables as part of their installation. If you have

already installed either product, you should run create-search-admin instead of create-search.

Whichever script you use, before running it, edit the search-env.bat script to set the SEARCH_DB_USER,

SEARCH_DB_PWD, and JDBC connection to the correct values for your system (you can also set these using

environment variables). Also, configure the SEARCH_MSSQL_DB_NAME and SEARCH_MSSQL_DB_HOST for

Microsoft SQL Server, SEARCH_ORACLE_TNS for Oracle, or SEARCH_DB2_DB_NAME for DB2, depending on your

database provider.
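Before running a schema script, it can help to confirm that the variables named above are set. The following Python sketch checks only the variable names given in this section; the vendor keys ("mssql", "oracle", "db2") and the function name missing_variables are labels invented for this example, and the JDBC connection setting is left out because this guide does not name its variable.

```python
import os

# Variable names as given in this section.
COMMON_VARS = ["SEARCH_DB_USER", "SEARCH_DB_PWD"]
VENDOR_VARS = {
    "mssql": ["SEARCH_MSSQL_DB_NAME", "SEARCH_MSSQL_DB_HOST"],
    "oracle": ["SEARCH_ORACLE_TNS"],
    "db2": ["SEARCH_DB2_DB_NAME"],
}


def missing_variables(vendor, env=None):
    """Return the required variables that are unset or empty for a vendor."""
    env = os.environ if env is None else env
    required = COMMON_VARS + VENDOR_VARS[vendor]
    return [name for name in required if not env.get(name)]


if __name__ == "__main__":
    for name in missing_variables("oracle"):
        print(f"Not set: {name}")
```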

Note: The schema creation scripts import data using StartSQLRepository. Make sure your JTDataSource

and FakeXADataSource are configured to permit this in home/localconfig. JTDataSource should

automatically point to FakeXADataSource. FakeXADataSource should point to the management database,

and FakeXADataSource_production should point to the production database. Note that some product

installation processes, including ATG Self Service, need FakeXADataSource pointed at the production database,

so you may need to change the FakeXADataSource between installing the different products.

Deleting Database Tables

To delete Search database tables from your installation, run one of the following scripts found in

<ATG2007.3dir>\Search2007.3\SearchAdmin\install, depending on which tables you have created:

• drop-search

• drop-search-admin

• drop-daf-search

• all-drop

Configuring Search for Customer Intelligence

If you want to use reporting with your Search installation, perform the steps in this section after you have

installed and configured Search Administration. The steps provided are for an Oracle database with JBoss as the

application server, and assume that you have already installed ACI.

1. Add a data warehouse datasource for JBoss. To do this, add the following text to the

<JBOSS_HOME>\server\atg\deploy\atg-search-ds.xml file just before the </datasources> tag, replacing the

placeholder values with the correct information for your data warehouse database.

<local-tx-datasource>
  <jndi-name>name</jndi-name>
  <connection-url>url</connection-url>
  <driver-class>class</driver-class>
  <user-name>your_dw_name</user-name>
  <password>your_dw_password</password>
  <min-pool-size>10</min-pool-size>
  <max-pool-size>10</max-pool-size>
</local-tx-datasource>

2. Configure your <ATG2007.3Dir>/home/localconfig/atg/reporting/datawarehouse/

JTDataSource.properties datasource to point to the schema in which your data warehouse tables will be

created.

3. Configure /atg/reporting/datawarehouse/loaders/JTDataSource.properties to point to your

data warehouse loader tables.

4. Make the following changes to the <ATG2007.3Dir>\Search2007.3\SearchAdmin\datawarehouse

\install\search-dw-env.bat or .sh file:

Note: This script imports data using StartSQLRepository. Make sure your JTDataSource and

FakeXADataSource are configured to permit this.

• Add the following to the beginning of the file. For the value, use your <ATG2007.3Dir>\home directory.

set DYNAMO_HOME=value

• Change the DB_TYPE from solid to your database type.

• Change the DB_USER and DB_PWD values to the correct values for your database.

• Change the ORACLE_TNS value to the correct value for your database.

5. Run the <ATG2007.3Dir>\Search2007.3\SearchAdmin\datawarehouse\install\all-create.bat

file to create the database tables.

Create files with the following folder structure and content on the server running your Search client application,

such as your Commerce store:

• /atg/dynamo/service/DWDataCollectionConfig.properties

enabled=true

If you are using Search with Commerce, create files with the following folder structure and content on the server

running your Search client application:

• /atg/commerce/search/ProductCatalogQueryFormHandler

requestLogging=true

• /atg/commerce/search/ProductCatalogQueryFormHandler

enableReportData=true

Install the Search reports:

1. Go to <ATG2007.3dir>\Search2007.3\SearchAdmin\datawarehouse\deployment.

2. Copy Search.zip to the <ATG2007.3dir>\ACI2007.3\c8\deployment directory.

3. Open the ATG Reporting Center and navigate to Tools > Content Administration.

4. Click the New Import link. The next screen lists the deployment archives that are available for importing.

5. Select Search.zip and complete the import process, accepting all the defaults.

6. Once the import is complete, you will see a folder called ATG in the home page of the Reporting Center.

When you assemble your EAR file to include Search, include the following modules:

SearchAdmin.AdminUI SearchAdmin.Datawarehouse

Building the Search Administration EAR

You must include the following modules in your EAR file:

• SearchAdmin.AdminUI

• SearchAdmin.datawarehouse if you are planning to use Search reporting

• ARF.BIZUI if you want to include a link to the Reporting Center in the ATG Business Control Center

See the ATG Programming Guide for information on using runAssembler to build and deploy ATG applications.

After creating the EAR file, see your application server documentation for information on deploying and starting

ATG Search.

Note: If you are going to deploy Search Administration as part of a standalone EAR, you must do the following:

1. Run the ATG Search installer and install a standalone Search engine (see Adding Standalone Search

Engines) on the machine to which you are going to deploy your EAR.

2. Add an engineDir property to the /atg/search/routing/LaunchingService component, and set it to

point to the new Search engine’s <Search2007.3Dir>\SearchEngine directory. For example:

engineDir=c:\\ATG\\ATG2007.3\\Search2007.3\\SearchEngine

3. In the /atg/searchadmin/atgsearchengine/LanguageConfigurationService component, configure

the languageConfigPath property to point to the Search engine’s Config.xml file. For example:

languageConfigFile=C:\\ATG\\ATG2007.3\\Search2007.3\\SearchEngine\\language\\Config.xml

Accessing Search Administration

To access Search Administration:

1. Access the ATG Business Control Center.

2. Enter your login and password.

3. In the left navigation area, click ATG Search Administration.

4. Click one of the menu options to begin working with Search.

The ATG Search Project Administration interface is loaded into the current browser window.

Appendix A. Migrating from Previous ATG Search Versions

This appendix is for users of previous versions of ATG Search. It highlights changes in the user interface and

terminology, and provides instructions for changing to the new version. It includes the following sections:

Terminology Changes

Procedural Changes

Reporting Changes

Migration Procedures

Terminology Changes

This section describes where terms used in the user interface have changed between 2006.x and 2007.3 Search

versions.

2007.3 Term | Definition | Replaces 2006.x Term

content item | A unit of indexable material. | document

content source | A location in which content is stored, such as a file system or a database. | content connector, data extractor, data renderer, web spider, db forum resource, content template

content set | A logical grouping of related content. A content set can contain any number of content sources. | document repositories/sets

customizations | Blanket term for topic sets, query rules, dictionaries, refinement configurations, languages, and parsing options. | n/a

index | Content that has been processed and can be searched. | IdeaMap

indexing | Indexing content and deploying to either a staging or production environment. | job script actions, deployment

indexing environment | Search environment that performs indexing. | indexing server

partition | A section of an index. Partition size is limited based on the Search product architecture, so it may be necessary to have more than one in an index for large amounts of content. Search automatically calculates how many partitions are required. | n/a

Search project | Projects represent the top-level organizing container for indexed content: one or more sets of content are gathered in a project and collectively referred to as the project’s Index. The project associates customization data, search environments, and physical machines with an index. | Job scripts

Search Administration | The ATG Search user interface, including both Project Administration and the Workbench. | Management Console

search engine | A standalone Search executable that both indexes content and processes end-user searches. | Answer Server

Search environment | A combination of one or more machines and search engines which either index content or search indexes. Environment types are: Indexing environment (performs the indexing of content for eventual deployment to production or staging servers); Staging environment (for demonstration and evaluation purposes, not intended for heavy load); Production environment (used to run end-user facing search services under heavy load); Workbench (dedicated environment for Workbench users to test customizations). | Content clusters

topic sets | A hierarchy of topics. | taxonomies

Workbench | Part of the user interface where users verify their indexes, and edit customization data. | Management Console (partial)

Procedural Changes

The biggest change in how Search works comes from the new concept of the Search Project and the elimination

of the old Content Template taskflow in favor of a simpler configuration process.

The process for adding content to your searchable data has been simplified. Many seldom-used UI options have

been moved so that while still accessible, if you do not need them they are easily skipped over.

ATG Search 2006.x and earlier releases only allowed one indexing job to run per ATG Search installation. This

generally required a separate ATG Search installation per site to work around this limitation. ATG Search 2007.3

allows concurrent indexing jobs.

After identifying the content you want to make searchable, the procedures can be compared thus:

Step | ATG Search 2006.x | ATG Search 2007.3

Create a project | N/A | The new Search Project provides a central place to keep track of everything you need to index and deploy content: the content itself, associated configuration, information about where and how it should be deployed, etc.

Acquire content | Create content connectors using data extractors and data renderers. | Add content directly to your Search Project.

Customize content | Build topics, dictionaries, and other tools to help end-users find answers successfully. | The new Search Workbench provides a centralized, standardized way to create and test customizations, which can then be associated with one or more projects.

Index content | Create a job script and procedures, and run the script. | Index your project, either immediately or by creating schedule rules.

Note: A customization data type in previous versions of ATG Search known as “File Handling Options”, which

specified how file extensions were interpreted with respect to file types, has been replaced by Advanced

Settings options on the content itself.

Reporting Changes

Search reports cannot be migrated to Search 2007.3. Reports in Search 2007.3 will include only data collected

after the upgrade.

The following reports, which were once included in ATG Search, have been moved to ATG Self Service:

• Ratings Trend Report

• Ratings Detail Report

• Ratings Distribution Report

See the Reporting chapter of the ATG Search Administration Guide for information on Search 2007.3 reports.

Migration Procedures

Direct migration from previous Search versions is not possible. You can export the following from your Search

2006.x installation as XML files, and then import them into your Search 2007.3 installation:

• Dictionaries

• Term Weight Sets

• Taxonomies (called Topic Sets in 2007.3)

Note: Be sure to export topic sets from 2006.x so that the topic IDs are included, and import them into 2007.3

the same way. If you do not include topic IDs, topic references are broken.

• Query Rules

• Document Parsing Options (called Text Processing Option Sets in 2007.3)

• Refinement Configurations (called Facet Sets in 2007.3)

Warning: All data must be exported using the 2006.x Management Console. Make sure to export your data

before uninstalling Search 2006.x.

Appendix B. ATG Search Component Ports

The default ports used by ATG Search components may conflict with your organization’s current port

assignments. This appendix describes the ports used.

If you are using Search with Knowledge or Self Service, the RMI IIOP port must be opened so that Search can

extract content from the Knowledge/Self Service installation. By default, the RMI port is 8860.

Knowledge, Self Service and Search Administration communicate with the search engines using HTTP. One port

per search engine is by default allocated from the 6070-6100 range.

The following table summarizes the locations where ports are set and used:

Component | Default Port(s) | Locations

Search Engines | 6072-6100 | Search Administration: Host Machine Details for each machine.

JBoss RMI WebService | 8083 | Set (file, code): jboss/server/default/conf/jboss-service.xml, //mbean[@code='org.jboss.web.WebService']/attribute[@name='Port']

JBoss HSQL | 1476 | Set (file, code): jboss/server/default/deploy/hsqldb-service.xml, //mbean[@code='org.jboss.jdbc.HypersonicDatabase']/attribute[@name='Port']

JBoss MQ (3 ports) | 8090-8092 | Set (file, code): jboss/server/default/deploy/jbossmq-service.xml, //mbean[@code='org.jboss.mq.il.oil.OILServerILService']/attribute[@name='ServerBindPort'], //mbean[@code='org.jboss.mq.il.uil.UILServerILService']/attribute[@name='ServerBindPort'], //mbean[@code='org.jboss.mq.il.uil2.UILServerILService']/attribute[@name='ServerBindPort']

JBoss HTTP | 8080 | Set (file, code): jboss3.0.7/bin/runjboss.cmd, add to JAVA_OPTS line: -Djetty.port=_PORT_. Used (file, code): ESConfig.xml, /AnswerEngine/Reporting/ReportWebAppRootURL; ESEnv.cmd, HTTP_PORT=_PORT_

JBoss SSL | 8443 | Set (file, code): jboss/server/default/deploy/jbossweb.sar/META-INF/jboss-service.xml, //Call[@name='addListener']/Arg/New[@class='org.mortbay.http.SunJsseListener']/Set[@name='Port']

JBoss AJP13 Listener | 8009 | Set (file, code): jboss/server/default/deploy/jbossweb.sar/META-INF/jboss-service.xml, //Call[@name='addListener']/Arg/New[@class='org.mortbay.http.ajp.AJP13Listener']/Set[@name='Port']
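To see whether any of the default ports listed in this appendix are already taken on a machine, a quick check along the following lines may help. This Python sketch is an illustration only: the function names are hypothetical, the JBoss MQ entry assumes the three ports are 8090, 8091, and 8092, and the Search engine range uses the 6072-6100 values from the table. A port counts as in use when a TCP connection to it succeeds.

```python
import socket

# Default ports from the table in this appendix.
DEFAULT_PORTS = {
    "Search Engines": list(range(6072, 6101)),
    "JBoss RMI WebService": [8083],
    "JBoss HSQL": [1476],
    "JBoss MQ": [8090, 8091, 8092],  # assumed expansion of the three MQ ports
    "JBoss HTTP": [8080],
    "JBoss SSL": [8443],
    "JBoss AJP13 Listener": [8009],
}


def port_in_use(port, host="127.0.0.1", timeout=0.5):
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        probe.settimeout(timeout)
        return probe.connect_ex((host, port)) == 0


def find_conflicts(ports_by_component, host="127.0.0.1"):
    """Map each component to the subset of its ports that are already in use."""
    conflicts = {}
    for component, ports in ports_by_component.items():
        busy = [p for p in ports if port_in_use(p, host)]
        if busy:
            conflicts[component] = busy
    return conflicts


if __name__ == "__main__":
    for component, busy in find_conflicts(DEFAULT_PORTS).items():
        print(f"{component}: ports already in use: {busy}")
```

Run the check before starting ATG Search or JBoss; once they are running, their own ports will naturally be reported as in use.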

4 Glossary

The glossary provides a description of the terms used in ATG Search.

Term                        Definition

content                     Raw data for indexing, such as HTML files, Word documents, or ATG repositories.
content set                 Bundles content into a single organizational unit that can be indexed as one.
customizations              Blanket term for topic sets, query rules, dictionaries, facet sets, languages, parsing options, and preferred answers.
deploying                   The action of moving indexes and/or customization data from a staging environment to a production environment.
dictionary                  Contains definitions of special words and the relationships between them. Users can add terms to improve the search responses.
expansion                   A synonym for a term, defined either in a custom dictionary or in the default language files.
facet set                   Allows end-users of your site to search within an existing result set.
index                       Refers to both the process of making content searchable and the aggregate form of the searchable content itself.
indexing environment        Search environment that indexes content.
learning files              Rules that automatically apply topics to a content collection.
partition                   An index is divided into one or more Logical Partitions, which have a one-to-one relationship with content sets. Each Logical Partition is composed of one or more Physical Partitions, depending on the amount of content.
project                     The top-level container for managing and indexing content. A project associates customization data and search environments with an index.
query rules                 Link actions in ATG Search to certain types of user queries.
Search Administration       The ATG Search user interface, including both project administration and the workbench.
search engine               A standalone Search executable that both indexes content and processes end-user searches.
search environment          A combination of one or more machines and search engines that either index content or search indexes. Environment types are:
                            indexing environment – Dedicated environment for indexing content.
                            staging environment – Largely used for demonstration and evaluation purposes. A web site can be pointed to a staging Search Environment for demo or proof-of-concept purposes, but staging sites are not intended for heavy load.
                            production environment – Used to run end-user facing search services under heavy load. These typically include a large number of multi-CPU machines to support high degrees of fault-tolerance and parallelism.
                            Workbench – Dedicated environment for performing the indexing of content for eventual movement to production or staging servers.
term                        Individual entry within a dictionary.
term weight set/stop words  A set of term/weight pairs, in which the weight describes the amount of influence a term has in a search.
topics                      User-defined classification system that can be applied to indexes, allowing end-users to refine their search by finding related information.
topic set                   A hierarchy of topics.
Workbench                   Part of the user interface where users create and edit customization data.


Index

A
ATG Search, installing, 9

B
back-of-the-book index, 4
Business Control Center, 7

C
components, 6
content, 7
Content Administration requirements, 15
content sets, 7
conventions, 1
customization data, 7

D
deployment directory, 10
dictionaries, 4
documents, 3

E
environments, 7

F
features, 3

I
index, 3, 4, 7
    customizing, 7
index item, 3
indexing environments, 7
installation roadmap, 2

N
natural language components, 4

P
production environments, 7
Project Administration, 7

R
requests, 4

S
Search projects, 7
Search Query request, 4
Search workbench (see workbench)
SearchAdmin role, 7
searching, definition, 3
SearchSQLRepository component, 16
SearchSQLRepositoryEventServer component, 16
security, 7
staging environments, 7

V
versioned repositories
    indexing incrementally, 16
View Item request, 4

W
workbench, 7
