IBM DB2 QUERY PATROLLER V7.1

Data Management Solutions White Paper

First Edition (July 2000)

© Copyright International Business Machines Corporation 2000. All rights reserved. Note to U.S. Government Users -- Documentation related to restricted rights -- Use, duplication or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.

Notices

References in this publication to IBM products, programs, or services do not imply that IBM intends to make these available in all countries in which IBM operates. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any of the intellectual property rights of IBM may be used instead of the IBM product, program, or service. The evaluation and verification of operation in conjunction with other products, except those expressly designated by IBM, are the responsibility of the user.

IBM may have patents or pending patent applications covering subject matter in this document. The furnishing of this document does not give you any license to these patents. You can send license inquiries, in writing, to:

IBM Corporation IBM Director of Licensing 208 Harbor Drive Stamford, Connecticut 06904 U.S.A.

Trademarks

AIX, DB2, NUMA-Q, and QMF are trademarks of International Business Machines Corporation.

Microsoft, Windows, and Windows NT are registered trademarks of Microsoft Corporation.

Java and all Java-based trademarks and logos, and Solaris are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

Other company, product, and service names used in this publication may be trademarks or service marks of others.

Contents

Notices
Trademarks
Contents
Large Scale Warehouse Challenges
    EMPOWERING USERS WITH INFORMATION
    ROBUST SCALABILITY
    HIGH AVAILABILITY AND STABILITY
    CORPORATE ASSET PROTECTION
DB2 Query Patroller Developed To Manage Your Workloads
    DESCRIPTION
    ARCHITECTURE OVERVIEW
        Overview
        Query Patroller Server
        Query Administrator
        System Parameters
        Query Patroller Agent
        Query Monitor
        Query Enabler
        DB2 Query Patroller Tracker
DB2 Query Patroller - Brings it all together in one product
    SETTING THE STANDARD FOR ROBUST QUERY MANAGEMENT
        Ad Hoc Query Tools
        Hardware Load Levelers
        Proprietary Query Management Tools
        DB2 Query Patroller All Management in One Tool
    HIGH AVAILABILITY AND STABILITY
        Proactive Query Capture
        Robust Query Termination
    ROBUST SCALABILITY
        Frees User Desktop, Improving Productivity
        Query Cost Analysis
        User Prioritization
        Load Balancing
        Data Warehouse Optimization
For More Information
Additional Information

Large Scale Warehouse Challenges

Today, database management personnel are facing increasing challenges. While they want to deliver information to end users as quickly as possible, they are finding that it takes an enormous amount of resources to be responsive to the growing number of users demanding information. Even worse, the users’ needs for information continue to change and evolve, as new information becomes available.

EMPOWERING USERS WITH INFORMATION

Innovative data mining, information queries, and trend analysis techniques are providing companies with much needed competitive advantage, and are enabling radical breakthroughs within industries. As a result, there is great demand for rapid, innovative query capabilities against increasingly large, mission critical data warehouses.

In the past, end users typically called upon their Information Systems (IS) department and asked for a report on the market, sales, or inventory. IS generated the report on the user’s behalf and provided it to them usually days or even weeks later. These reports were static in nature and did not allow the user to change the selection criteria for the reports. Sometimes, requirements were not communicated properly and the users waited weeks for a report that did not answer their business questions.

In the late 1980s, desktop query and reporting tools entered the market, allowing end-users to perform their own queries against the corporate or departmental database. This immediately provided the companies using these applications with a competitive advantage. While end users at other companies waited weeks for a report in order to make a decision, companies that allowed their business analysts and key managers to query directly against the database were making decisions in minutes and gaining a competitive advantage in the marketplace.

ROBUST SCALABILITY

The steady increase of end users performing queries against the corporate database presents a huge challenge for database administrators. As the number of users performing their own queries increases, the response time of the system may decline due to increased contention. Large scale data warehouses that provide breakthrough business value pose a challenge: How can a company’s data warehouse continue to provide quick response time across large amounts of data to an ever increasing number of end-users tapping the power of ad hoc query tools on their desktops?

One solution to this problem has been to add more hardware. The new symmetric multi-processing (SMP) and massively parallel processing (MPP) systems available in recent years can, in part, help handle the increased load or help spread the load over several machines to improve query performance. This is the direction many leading edge organizations have taken with their Decision Support Systems (DSS).

HIGH AVAILABILITY AND STABILITY

Even with the addition of new, more powerful hardware systems, users may still unknowingly submit “runaway” queries. When users submit these queries during peak business hours, they can bring even the largest system to a crawl.

With the advancement in query tools, it is now possible for every business analyst in a company to quickly generate a query without knowing anything about the back end database or about SQL. When too many end users submit complex “runaway” queries to a database running on a large MPP system at the same time, they can potentially bring the large, multi-terabyte system to its knees. The drop in response is due primarily to poor query management. The database load management capability is overwhelmed because all the queries reach the data warehouse at the same time. If the query submissions were controlled, the response time would not be significantly impacted.

CORPORATE ASSET PROTECTION

All of these trends point to the need to grow and protect the data warehouse as the vital corporate asset that it is. The robustness, availability, performance, and security of large data warehouses are of paramount importance since they enable radical business breakthroughs while maintaining competitive advantage in the marketplace.

DB2 Query Patroller Developed To Manage Your Workloads

DESCRIPTION

DB2 Query Patroller greatly improves the scalability of a data warehouse by allowing hundreds of users to safely submit queries on multi-terabyte class systems. The product is a true three-tier architecture solution. Its components span the client/server environment to better manage and control all aspects of query submission.

DB2 Query Patroller acts as an agent on behalf of the end user. It prioritizes and schedules queries so that query completion is more predictable and computer resources are efficiently utilized. After an end user submits a query, DB2 Query Patroller frees up the user’s desktop so they can perform other work, or even submit additional queries, while waiting for the original query results. DB2 Query Patroller is integrated with the DB2 optimizer and performs cost analysis on queries and then schedules and dispatches those queries so that the load is balanced across the database partitions.

DB2 Query Patroller sets individual user and group priorities, as well as user query limits. This enables the data warehouse to deliver the needed results to its most important users as quickly as possible. It also has the ability to limit usage of the system by stopping those “runaway” queries before they even start. If desired, an end user can choose to receive e-mail notification of scheduled query completion or query failure.

ARCHITECTURE OVERVIEW

Overview

DB2 Query Patroller consists of components running on the database server and end users’ desktops. DB2 Query Patroller is made up of several components, each having a specific task in providing query and resource management.

Figure 1 – DB2 Query Patroller Overview (diagram labels: Administrator and End User clients with DB2 CAE, Query Monitor, Query Administrator, and Query Patroller Tracker; Internet Server with DB2 CAE; Enterprise Network; DB2 EE Query Patroller Server and DB2 EEE Query Patroller Server with Agents; Datamart; Data Warehouse; platforms AIX, Solaris, NT, 2000, HP-UX, NUMA-Q)

Query Patroller Server

The Server is the core component of DB2 Query Patroller. It provides an environment for storing user profiles, storing system parameters, maintaining job lists, scheduling queries and storing node information. The Server component executes on a node with the DB2 database server.

Query Administrator

The Administrator component gives a DBA or system administrator the tools needed to manage the DB2 Query Patroller environment. This Java interface allows for the management of the Query Patroller system. The Administrator provides menus to configure user profiles, system parameters and node parameters.

System Parameters

The system administrator can set up system-wide, partition, user, or group level thresholds for governing the data warehouse, including:

• Maximum number of queries running on the system at any given time.

• Maximum cost threshold for the entire system. The cumulative cost of all queries running cannot exceed this number.

• Maximum cost threshold for each defined user or group in the system.

• Maximum number of jobs a user can initiate. This value can be configured differently for each of your users or groups.

• Specific amount of time to retain temporary result tables. When DB2 Query Patroller takes control of a query, a temporary table is created in the database to store the query results. DB2 Query Patroller will automatically clean up these tables after the period of time specified by the administrator. Query Patroller will also allow users to share result sets so that a query can be executed once and all authorized users can reuse the result set.
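The way the four governing thresholds above could interact is illustrated by the following minimal Python sketch. It is not DB2 Query Patroller code: the `Thresholds` class, the `can_run_now` function, the field names, and the order in which the checks are applied are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Thresholds:
    """Hypothetical container for the limits an administrator sets."""
    max_running_queries: int   # system-wide limit on concurrent queries
    max_system_cost: float     # cap on the cumulative cost of running queries
    max_user_cost: float       # cost cap for one user or group
    max_user_jobs: int         # number of jobs one user may initiate


def can_run_now(query_cost, user_jobs, running_costs, limits):
    """Return (allowed, reason) for a newly submitted query.

    running_costs: costs of the queries currently running on the system.
    user_jobs: number of jobs the submitting user already has in the system.
    """
    if len(running_costs) >= limits.max_running_queries:
        return False, "system query limit reached"
    if sum(running_costs) + query_cost > limits.max_system_cost:
        return False, "system cost threshold exceeded"
    if query_cost > limits.max_user_cost:
        return False, "user cost threshold exceeded"
    if user_jobs >= limits.max_user_jobs:
        return False, "user job limit reached"
    return True, "ok"


# Example values are invented for the sketch.
limits = Thresholds(max_running_queries=2, max_system_cost=1000.0,
                    max_user_cost=400.0, max_user_jobs=3)
print(can_run_now(100.0, 0, [300.0], limits))
print(can_run_now(500.0, 0, [300.0], limits))
```

In this sketch a query that fails any check is simply reported as not runnable; a scheduler built on it would presumably hold such a query for later rather than reject it.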

Query Patroller Agent

The agent component of DB2 Query Patroller resides on each of the database server nodes. It processes the database requests on behalf of the Query Patroller Server and gathers resource utilization statistics to allow for query workload balancing, as well as monitoring of the resource utilization of each partition.

On a uni-processor or non-clustered SMP machine, the agent and server components run on the same machine. On an MPP or clustered SMP machine, the server runs on one node and the agents run on all of the database nodes.

Query Monitor

The Query Monitor component of DB2 Query Patroller provides both the administrators and the end users with a Java-based interface for viewing and managing their queries. The Query Monitor component enables end users to view a job’s status, submit and cancel queries, and drop result tables. End users can only display information for their own queries and jobs running on the system, while the Query Monitor tool provides administrators with the ability to manage all queries in the system.

Figure 2 – DB2 Query Monitor

Job List

The DB2 Query Monitor job list maintains the queries submitted by end users. The job list contains information about each query submitted through Query Patroller. The system administrator or end user is able to use the job list to view information for the queries in the system including:

• Job sequence number

• Query priority

• Query status

• Query source

• Node on which the query was submitted

• Type of application submitting the query

• ID of user submitting the query

• Date and time the query was submitted

Users and administrators may also view more detailed information on any of the queries listed in the job list table, such as query run time, cost of the query, and the SQL statement.

Query Enabler

The Query Enabler component of DB2 Query Patroller executes inside of the DB2 client. This component intercepts dynamic SQL statements being passed to DB2 from any front end query tool. Query Enabler interacts with other DB2 Query Patroller components and with the user to execute or schedule the query and to return results from previously completed queries.

Query Enabler intercepts queries submitted by end users. If matching queries exist on the Query Patroller Server, Query Enabler provides the end user with a display of those queries and prompts the user to indicate whether or not a new result set should be returned. Whenever a user wants to submit a query, Query Enabler provides the option to set scheduled run times or to submit and wait for the results. If the end user does not want to wait for the query results, Query Enabler releases the desktop application and passes the query to the DB2 Query Patroller Server. Query Patroller then takes control of the query and runs it in the background on behalf of the end user. The next time the user submits that same query, the result set for that query will be returned to the application.

Query Enabler also has the ability to run in a silent mode so that the end user does not interact with the Query Enabler, but rather they can run in the same mode they have today with their end user tools submitting queries directly to the server. This also enables 3-tier or n-tier applications to utilize Query Patroller without the need for additional software on the client desktop.
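The reuse of results from matching queries can be pictured with a small Python sketch. Everything in it is hypothetical: the paper does not say how Query Enabler decides that two queries match, so the whitespace-insensitive comparison in `normalize`, the `ResultCatalog` class, and the result table name are assumptions chosen only to show the decision flow.

```python
def normalize(sql):
    """Collapse whitespace and drop a trailing semicolon.

    Assumed matching rule for this sketch only. A real matcher would also
    have to leave the contents of string literals untouched.
    """
    return " ".join(sql.split()).rstrip("; ")


class ResultCatalog:
    """Hypothetical stand-in for the result tables kept on the server."""

    def __init__(self):
        self._results = {}   # normalized SQL text -> stored result table name

    def record(self, sql, result_table):
        self._results[normalize(sql)] = result_table

    def lookup(self, sql):
        """Return the stored result table name, or None if nothing matches."""
        return self._results.get(normalize(sql))


def submit(sql, catalog, want_fresh=False):
    """Choose between reusing a stored result set and running the query."""
    existing = catalog.lookup(sql)
    if existing is not None and not want_fresh:
        return ("reuse", existing)
    return ("run", None)


catalog = ResultCatalog()
catalog.record("SELECT region, SUM(sales)\n  FROM orders GROUP BY region;",
               "RESULT_0001")
print(submit("SELECT region, SUM(sales) FROM orders GROUP BY region", catalog))
```

The `want_fresh` flag stands in for the prompt described above, where the user indicates whether a new result set should be returned.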

DB2 Query Patroller Tracker

Figure 3 – DB2 Query Patroller Tracker

The DB2 Query Patroller Tracker product enables a user to manage the databases by displaying usage history in a graphical, user-friendly format. It provides two key features that support system administrators in managing the database. First, it gives the system administrator the ability to monitor the database load and activity over time. Second, it provides the administrator with details on table and column access to assist in tuning the system. The Query Patroller server stores the historical information in DB2 tables so that administrators can drill down on whatever aspects of the database usage they desire using the query tool of their choice.

DB2 Query Patroller - Brings it all together in one product

SETTING THE STANDARD FOR ROBUST QUERY MANAGEMENT

To understand DB2 Query Patroller functionality and how it differs from query tool management systems, it is necessary to understand the problems each tool is designed to solve. At present, four technologies are available to manage queries and resources:

• Ad hoc query tools

• Three-tier proprietary tools

• Server-based query and resource managers

• Hardware resource managers

Each has its own strengths that make it appropriate for particular types of query situations. Figure 4 divides these technologies into four classes based upon the resource load leveling provided and the management of the query before it runs against the database.

                      Resource Load Balancing: No    Resource Load Balancing: Yes
Query Managed         Proprietary Query Tools        DB2 Query Patroller
Query Not Managed     Ad Hoc Query Tools             Hardware Load Leveling

Figure 4 – Classes of Query and Resource Management

Ad Hoc Query Tools

Ad hoc query tools do a good job of allowing the end user to directly ask questions of the database without having to go to IS personnel every time they have a need for additional information. Generally, with relatively small databases and few users, there is little need for query and resource management. However, as databases grow larger and the number of users increases, the strain on the data warehouse becomes evident. In some cases, a query tool may have a rudimentary scheduling facility. However, that requires the user to keep their PC powered on overnight to schedule the query or puts the burden on the end user to share resources in good faith with other users. QMF for Windows is a unique exception that provides predictive query governing. However, it requires managing a distributed environment rather than all query governing being centrally managed by the database server.

Hardware Load Levelers

Some Database Management Systems and MPP hardware companies offer system software products that spread the query load across the database-specified nodes. Queries are routed to free nodes for processing. Even though this provides a good use of the hardware resources, it does not look at the type of query being submitted. Any query that comes into this type of system is immediately run, regardless of the time or cost that the query will consume.

Proprietary Query Management Tools

Three-tier query tools have a server component that provides some capabilities for scheduling queries. This component releases the desktop and submits the query at the pre-scheduled time on behalf of the end user. Typically, these types of tools only work with their own front end and provide a canned query interface for end users. This is less adaptable for ad hoc querying. Typically, many users submit very predictable optimized queries. Three-tier query tools provide little user prioritization and resource balancing.

DB2 Query Patroller All Management in One Tool

The first three categories of query and resource management tools fail to provide end users with acceptable query response times and IS with the control they need. The DB2 Query Patroller product addresses these challenges. DB2 Query Patroller is the only product of its kind on the market today that controls and monitors queries. DB2 Query Patroller works with dynamic SQL query tools to prioritize and schedule user queries based on user profiles and cost analysis performed on each query. Large queries are put on hold and can then be scheduled for a later time during off-peak hours. Queries with high priority (based on user profiles) are promoted to the top of the schedule. In addition, DB2 Query Patroller monitors resource utilization statistics to determine which partitions are the least busy and provides load distribution functionality that evens out the workload across the system.

HIGH AVAILABILITY AND STABILITY

Proactive Query Capture

At the core of DB2 Query Patroller’s breakthrough functionality is its ability to proactively capture queries. Query Patroller’s proactive approach to query management helps it guarantee the high levels of availability and stability required in a mission-critical data warehouse. As queries are submitted against the data warehouse, Query Patroller traps the queries, assesses their cost, and prioritizes their execution. Without this proactive query trap, users could submit “runaway” queries that compromise the system availability and IS could only report in retrospect why the system failed. DB2 Query Patroller serves as a vigilant eye over vital corporate data warehouses. Since the queries are captured, should the database server fail for any reason, these queries will be automatically restarted by Query Patroller on behalf of the end user.

Robust Query Termination

The proactive query capture approach is enhanced through Query Patroller’s robust query termination. One of its strengths is its ability to effectively terminate queries. Many ad-hoc query tools give end users a terminate option, but in reality the query is just terminated on the client workstation. The processes already started on the database server may not be terminated. If the user assumes that the query has been cancelled, they might be more likely to submit other queries and repeat the same submit and terminate process. The end result could be that the server gets bogged down with multiple orphan queries that continue to run, wasting valuable resources. DB2 Query Patroller addresses this problem by truly terminating the query. It ensures that both the end user workstation and the database server are released from a terminated query. This ensures that the cycles used for processing on the database server are fully utilized by needed queries and it frees up the system administrator from having to monitor and kill the orphan queries.

ROBUST SCALABILITY

Frees User Desktop, Improving Productivity

Typically with most front-end query tools, after a user submits a query, the user’s application is “hung up” in a “pending output” state until the results of the query return. Users must wait until the query completes for their desktops to become available, which can greatly reduce their productivity. Users need to be able to perform other tasks, even submit additional queries, while earlier queries run in the background.

In many cases, users don’t need their query results back until the next day or the following Monday morning. Thus, instead of submitting a query for immediate execution, the query could be scheduled for a later time when the system load may be lower. DB2 Query Patroller frees up the user application and improves user productivity by allowing the user to submit and schedule queries based on their response requirements.

Query Cost Analysis

DB2 Query Patroller is integrated with the optimizer, which performs cost analysis of each query entering the system to determine the static cost of the query. Query Patroller enables the system administrator to modify a user’s profile and specify a query cost threshold for each user or group. After completing cost analysis, DB2 Query Patroller compares the returned value to the value in the user profile. If the returned value exceeds the user threshold, DB2 Query Patroller places the query on hold so that the query can run at a later time. Query Patroller also notifies the end user that their query is on hold for future execution.

User Prioritization

The majority of ad hoc query tools do not take into account a user’s priority with respect to other users submitting queries into the system. For example, many times the CEO of a company needs a report right away for a meeting, but the system is so overloaded that the query does not complete in time. If the CEO’s priority class level is high, the query request would automatically move to the top of the query submission queue and be executed immediately.

DB2 Query Patroller provides an environment that facilitates the prioritized completion of queries. It maintains a user profile for each user that submits queries into the system. The user profile defines a priority class, which identifies the relative priority a user has when submitting a query into the database. A higher priority class places the user’s query closer to the top of the query submission queue.

The system administrator sets individual user and group priorities, thus enabling the data warehouse to deliver the needed results to your most important users as quickly as possible.

DB2 Query Patroller also enables the system administrator to limit the number of queries that each individual user can simultaneously submit. This feature gives other users the opportunity to have their queries processed in a timely fashion.
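The effect of a priority class on the query submission queue can be sketched in a few lines of Python. The `SubmissionQueue` class and its method names are hypothetical, and the sketch assumes that a larger number means a higher priority class and that queries of equal class run in submission order; the paper specifies neither detail.

```python
import heapq
import itertools


class SubmissionQueue:
    """Hypothetical queue that orders queries by the user's priority class."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()   # tie-breaker preserving arrival order

    def submit(self, user, sql, priority_class):
        # heapq is a min-heap, so the class is negated to pop the highest first.
        entry = (-priority_class, next(self._counter), user, sql)
        heapq.heappush(self._heap, entry)

    def next_query(self):
        """Remove and return the highest-priority, earliest-submitted query."""
        _, _, user, sql = heapq.heappop(self._heap)
        return user, sql


queue = SubmissionQueue()
queue.submit("analyst1", "SELECT ... FROM sales", priority_class=100)
queue.submit("ceo", "SELECT ... FROM forecast", priority_class=900)
queue.submit("analyst2", "SELECT ... FROM inventory", priority_class=100)
print(queue.next_query()[0])   # the CEO's query is dispatched first
```

The arrival counter matters: without it, two queries in the same class would be compared on user name and SQL text, which would reorder them arbitrarily.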

Load Balancing

Ideally, query workload should be balanced across available resources. However, in an MPP environment, ad hoc query tools may only submit queries onto one or two nodes on the system, heavily using some nodes and underutilizing others. In comparison, a server-based query manager could more intelligently balance node utilization. This prevents bottlenecks at the nodes being bombarded with ad hoc query requests.

DB2 Query Patroller provides the system administrator with the ability to set system and user parameters to govern the queries entering the database. The system administrator may specify the maximum number of concurrent queries for each user or group, for each node, and for the entire system. DB2 Query Patroller provides load leveling across MPP hardware environments and clustered servers. By tracking node or server utilization, Query Patroller routes queries to idle nodes or servers and spreads the query load across the system.
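Routing queries to the least busy node under a per-node concurrency limit might look like the following Python sketch. The functions `pick_node` and `dispatch`, the node names, and the use of a running-query count as the utilization measure are assumptions for illustration; the product's actual utilization statistics are not described in this paper.

```python
def pick_node(running_per_node, max_per_node):
    """Return the least-busy node that is under its limit, or None."""
    candidates = {node: count for node, count in running_per_node.items()
                  if count < max_per_node}
    if not candidates:
        return None   # every node is at its limit, so the query must wait
    # Break ties by node name so the choice is deterministic.
    return min(candidates, key=lambda node: (candidates[node], node))


def dispatch(queries, running_per_node, max_per_node):
    """Assign queries to nodes one at a time; returns (assignments, waiting)."""
    load = dict(running_per_node)   # copy so the caller's data is not modified
    assignments, waiting = [], []
    for query in queries:
        node = pick_node(load, max_per_node)
        if node is None:
            waiting.append(query)
        else:
            load[node] += 1
            assignments.append((query, node))
    return assignments, waiting


current = {"node1": 2, "node2": 0, "node3": 1}
assigned, held = dispatch(["q1", "q2", "q3", "q4"], current, max_per_node=2)
print(assigned)
print(held)
```

With these example values, the fourth query is held because all three nodes have reached the limit of two concurrent queries.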

Data Warehouse Optimization

DB2 Query Patroller enables system administrators to monitor the database load by providing access to the following information:

• What tables are being accessed for all jobs

• Columns accessed for each table

• Number of rows returned, by table, for all jobs

• Detailed view of job activity over time

• Historical view of job activity

DB2 Tracker displays this information in an easy to view format by determining the total number of tables accessed in the database, and calculating the total number of times each specific table is accessed. For each table displayed, the user is able to drill down to view the columns accessed for queries against that table. This enables the administrator to decide if new indexes should be created on the columns used most in the table, if an Automatic Summary Table (AST) may improve performance, or if certain tables should be considered for archival.
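The table and column access counts described above amount to a simple tally over job history. The Python sketch below shows one way to compute them; the `summarize_access` function, the shape of the history records, and the table and column names are all invented for the example and do not reflect the product's actual schema.

```python
from collections import Counter, defaultdict


def summarize_access(job_history):
    """Count table accesses and, per table, column accesses.

    job_history: iterable of (table, [columns]) pairs, one per table access.
    """
    table_hits = Counter()
    column_hits = defaultdict(Counter)
    for table, columns in job_history:
        table_hits[table] += 1
        column_hits[table].update(columns)
    return table_hits, column_hits


history = [
    ("SALES", ["REGION", "AMOUNT"]),
    ("SALES", ["REGION"]),
    ("CUSTOMER", ["ID"]),
]
tables, columns = summarize_access(history)
print(tables.most_common(1))             # most frequently accessed table
print(columns["SALES"].most_common(1))   # most used column in that table
```

A heavily used column such as `REGION` in this made-up history is the kind of candidate an administrator might consider for a new index.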

DB2 Query Patroller also provides robust chargeback mechanisms. Administrators can track usage by user, group, client hostname or application submitting the query. The resources that can be accumulated for chargeback include elapsed query execution time, rows returned, or query cost. All of this information is stored in DB2 tables, which allows chargeback reporting using any query tool.
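A chargeback report of this kind reduces to summing the tracked resources per accounting key. The Python sketch below assumes job records with hypothetical field names (`user`, `group`, `elapsed_s`, `rows`, `cost`); the real layout of the DB2 tables is not given in this paper.

```python
from collections import defaultdict


def chargeback(jobs, key="user"):
    """Total elapsed seconds, rows returned and query cost per `key`.

    key may be any field present in the job records, for example
    "user" or "group" (field names are assumptions of this sketch).
    """
    totals = defaultdict(lambda: {"elapsed_s": 0.0, "rows": 0, "cost": 0.0})
    for job in jobs:
        bucket = totals[job[key]]
        bucket["elapsed_s"] += job["elapsed_s"]
        bucket["rows"] += job["rows"]
        bucket["cost"] += job["cost"]
    return dict(totals)


jobs = [
    {"user": "alice", "group": "finance", "elapsed_s": 12.5, "rows": 100, "cost": 100.0},
    {"user": "bob", "group": "finance", "elapsed_s": 30.0, "rows": 2000, "cost": 250.0},
    {"user": "alice", "group": "finance", "elapsed_s": 7.5, "rows": 50, "cost": 50.0},
]
by_user = chargeback(jobs)
by_group = chargeback(jobs, key="group")
print(by_user["alice"])
print(by_group["finance"])
```

Because the history is held in ordinary tables, the same totals could equally be produced with a grouped SQL query from any query tool.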

Additional Information

If you are interested in learning more about DB2 Query Patroller and the other products in the DB2 UDB database family, please contact your local IBM representative or visit our Web site at:

http://www.ibm.com/software/data/db2
