TPC BENCHMARK DS
Standard Specification
Version 2.5.0
June, 2017
Transaction Processing Performance Council (TPC)
www.tpc.org
© 2017 Transaction Processing Performance Council
All Rights Reserved
Legal Notice
The TPC reserves all right, title, and interest to this document and associated source code as provided under U.S. and international laws, including without limitation all patent and trademark rights therein.
Permission to copy without fee all or part of this document is granted provided that the TPC copyright notice, the title of the publication, and its date appear, and notice is given that copying is by permission of the Transaction Processing Performance Council. To copy otherwise requires specific permission.
No Warranty
TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, THE INFORMATION CONTAINED HEREIN IS PROVIDED AS IS AND WITH ALL FAULTS, AND THE AUTHORS AND DEVELOPERS OF THE WORK HEREBY DISCLAIM ALL OTHER WARRANTIES AND CONDITIONS, EITHER EXPRESS, IMPLIED OR STATUTORY, INCLUDING, BUT NOT LIMITED TO, ANY (IF ANY) IMPLIED WARRANTIES, DUTIES OR CONDITIONS OF MERCHANTABILITY, OF FITNESS FOR A PARTICULAR PURPOSE, OF ACCURACY OR COMPLETENESS OF RESPONSES, OF RESULTS, OF WORKMANLIKE EFFORT, OF LACK OF VIRUSES, AND OF LACK OF NEGLIGENCE. ALSO, THERE IS NO WARRANTY OR CONDITION OF TITLE, QUIET ENJOYMENT, QUIET POSSESSION, CORRESPONDENCE TO DESCRIPTION OR NON-INFRINGEMENT WITH REGARD TO THE WORK.
IN NO EVENT WILL ANY AUTHOR OR DEVELOPER OF THE WORK BE LIABLE TO ANY OTHER PARTY FOR ANY DAMAGES, INCLUDING BUT NOT LIMITED TO THE COST OF PROCURING SUBSTITUTE GOODS OR SERVICES, LOST PROFITS, LOSS OF USE, LOSS OF DATA, OR ANY INCIDENTAL, CONSEQUENTIAL, DIRECT, INDIRECT, OR SPECIAL DAMAGES WHETHER UNDER CONTRACT, TORT, WARRANTY, OR OTHERWISE, ARISING IN ANY WAY OUT OF THIS OR ANY OTHER AGREEMENT RELATING TO THE WORK, WHETHER OR NOT SUCH AUTHOR OR DEVELOPER HAD ADVANCE NOTICE OF THE POSSIBILITY OF SUCH DAMAGES.
Trademarks
TPC Benchmark, TPC-DS and QphDS are trademarks of the Transaction Processing Performance Council.
Acknowledgments
Developing a TPC benchmark for a new environment requires a huge effort to conceptualize, research, specify, review, prototype, and verify the benchmark. The TPC acknowledges the work and contributions of the TPC-DS subcommittee member companies in developing the TPC-DS specification.
The TPC-DS subcommittee would like to acknowledge the contributions made by the many members during the development of the benchmark specification. It has taken the dedicated efforts of people across many companies, often in addition to their regular duties. The list of significant contributors to this version includes Susanne Englert, Mary Meredith, Sreenivas Gukal, Doug Johnson, Lubor Kollar, Murali Krishna, Bob Lane, Larry Lutz, Juergen Mueller, Bob Murphy, Doug Nelson, Ernie Ostic, Raghunath Othayoth Nambiar, Meikel Poess, Haider Rizvi, Bryan Smith, Eric Speed, Cadambi Sriram, Jack Stephens, John Susag, Tricia Thomas, Dave Walrath, Shirley Wang, Guogen Zhang, Torsten Grabs, Charles Levine, Mike Nikolaiev, Alain Crolotte, Francois Raab, Yeye He, Margaret McCarthy, Indira Patel, Daniel Pol, John Galloway, Jerry Lohr, Jerry Buggert, Michael Brey, Nicholas Wakou, Vince Carbone, Wayne Smith, Dave Steinhoff, Dave Rorke, Dileep Kumar, Yanpei Chen, John Poelman, and Seetha Lakshmi.
Document Revision History

Date        Version  Description
08-28-2015  2.0.0    Mail ballot version
11-12-2015  2.1.0    Includes FogBugz entries 937, 991, 1002, 1033, 1053, 1060, 1121, 1128, 1135, 1136
06-09-2016  2.2.0    Includes FogBugz entries 1571, 1559, 1539, 1538, 1537, 1531, 1502, 1501, 1480, 1479, 1474, 1473, 1472, 1470, 1393, 1322, 1263
08-05-2016  2.3.0    Includes FogBugz entries 1676, 1627, 1531, 1501 and 616
02-24-2017  2.4.0    Includes FogBugz entries 1728, 1697, 1696 and 1654
06-08-2017  2.5.0    Includes FogBugz entries 1756, 1894, 1909, 1912, 1980 and 1981
TPC Membership (as of June 2017)
Full Members
Associate Members
Table of Contents
0 PREAMBLE
0.1 Introduction
0.2 General Implementation Guidelines
0.3 General Measurement Guidelines
0.4 Workload Independence
0.5 Associated Materials
1 Business and Benchmark Model
1.1 Overview
1.2 Business Model
1.3 Data Model and Data Access Assumptions
1.4 Query and User Model Assumptions
1.5 Data Maintenance Assumptions
2 Logical Database Design
2.1 Schema Overview
2.2 Column Definitions
2.3 Fact Table Definitions
2.4 Dimension Table Definitions
2.5 Implementation Requirements
2.6 Data Access Transparency Requirements
3 Scaling and Database Population
3.1 Scaling Model
3.2 Test Database Scaling
3.3 Qualification Database Scaling
3.4 dsdgen and Database Population
3.5 Data Validation
4 Query Overview
4.1 General Requirements and Definitions for Queries
4.2 Query Modification Methods
4.3 Substitution Parameter Generation
5 Data Maintenance
5.1 Implementation Requirements and Definitions
5.2 Refresh Data
5.3 Data Maintenance Functions
6 Data Accessibility Properties
6.1 The Data Accessibility Properties
7 Performance Metrics and Execution Rules
7.1 Definition of Terms
7.2 Configuration Rules
7.3 Query Validation
7.4 Execution Rules
7.5 Output Data
7.6 Metrics
8 SUT AND DRIVER IMPLEMENTATION
8.1 Models of Tested Configurations
8.2 System Under Test (SUT) Definition
8.3 Driver Definition
9 PRICING
9.1 Priced System
9.2 Allowable Substitution
10 FULL DISCLOSURE
10.1 Reporting Requirements
10.2 Format Guidelines
10.3 Full Disclosure Report Contents
10.4 Executive Summary
10.5 Availability of the Full Disclosure Report
10.6 Revisions to the Full Disclosure Report
10.7 Derived Results
10.8 Supporting Files Index Table
10.9 Supporting Files
11 AUDIT
11.1 General Rules
11.2 Auditor's Check List
11.3 Clause 4 Related Items
11.4 Clause 5 Related Items
11.5 Clause 6 Related Items
11.6 Clause 7 Related Items
11.7 Clause 8 Related Items
11.8 Clause 9 Related Items
11.9 Clause 10 Related Items
PREAMBLE
Introduction
The TPC Benchmark DS (TPC-DS) is a decision support benchmark that models several generally applicable aspects of a decision support system, including queries and data maintenance. The benchmark provides a representative evaluation of the System Under Test's (SUT) performance as a general-purpose decision support system.
This benchmark illustrates decision support systems that:
- Examine large volumes of data;
- Give answers to real-world business questions;
- Execute queries of various operational requirements and complexities (e.g., ad-hoc, reporting, iterative OLAP, data mining);
- Are characterized by high CPU and IO load;
- Are periodically synchronized with source OLTP databases through database maintenance functions;
- Run on Big Data solutions, such as RDBMS as well as Hadoop/Spark based systems.
A benchmark result measures query response time in single-user mode, query throughput in multi-user mode, and data maintenance performance for a given hardware, operating system, and data processing system configuration under a controlled, complex, multi-user decision support workload.
While separated from the main text for readability, comments and appendices are a part of the standard and their provisions must be enforced.
General Implementation Guidelines
The purpose of TPC benchmarks is to provide relevant, objective performance data to industry users. To achieve that purpose, TPC benchmark specifications require benchmark tests be implemented with systems, products, technologies and pricing that:
a) Are generally available to users;
b) Are relevant to the market segment that the individual TPC benchmark models or represents (e.g., TPC-DS models and represents complex, high data volume, decision support environments);
c) Would plausibly be implemented by a significant number of users in the market segment modeled or represented by the benchmark.
In keeping with these requirements, the TPC-DS database must be implemented using commercially available data processing software, and its queries must be executed via an SQL interface.
The use of new systems, products, technologies (hardware or software) and pricing is encouraged so long as they meet the requirements above. Specifically prohibited are benchmark systems, products, technologies or pricing (hereafter referred to as "implementations") whose primary purpose is performance optimization of TPC benchmark results without any corresponding applicability to real-world applications and environments. In other words, all "benchmark special" implementations, which improve benchmark results but not real-world performance or pricing, are prohibited.
A number of characteristics shall be evaluated in order to judge whether a particular implementation is a benchmark special. It is not required that each point below be met, but that the