Vertica Documentation HPE Vertica Analytics Platform Software Version: 7.2.x Document Release Date: 4/21/2016

HP Vertica Analytics Platform 7.2.x Complete Documentation Set
Software Version: 7.2.x
Document Release Date: 4/21/2016
Legal Notices
Warranty
The only warranties for Hewlett Packard Enterprise products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HPE shall not be liable for technical or editorial errors or omissions contained herein.
The information contained herein is subject to change without notice.
Restricted Rights Legend
Confidential computer software. Valid license from HPE required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license.
Copyright Notice
© Copyright 2006 - 2013 Hewlett Packard Enterprise Development LP
Trademark Notices
Adobe™ is a trademark of Adobe Systems Incorporated.
Microsoft® and Windows® are U.S. registered trademarks of Microsoft Corporation.
UNIX® is a registered trademark of The Open Group.
This product includes an interface of the 'zlib' general purpose compression library, which is Copyright © 1995-2002 Jean-loup Gailly and Mark Adler.
HPE Vertica Analytics Platform (7.2.x) Page 2 of 5309
Contents
Supported Platforms
Vertica Place
Vertica Pulse
Vertica Plug-In for Informatica
Glossary
Supported Platforms
Vertica Server and Vertica Management Console
Supported Operating Systems and Operating System Versions
Hewlett Packard Enterprise supports Vertica Analytics Platform 7.2.x running on the following 64-bit operating systems and versions on x86_64 architecture.
Important: This document reflects what has been tested with Vertica Analytics Platform 7.2.x. Be aware that operating system vendors and communities release updates and patches for all versions of their operating systems on their own schedules. Such updates to these operating systems may or may not coincide with the release schedule for Vertica. While other versions of the operating systems listed may have been successfully deployed by customers in their environments, stability and performance of these configurations may vary. If you choose to run Vertica on an operating system version not listed in this document and experience an issue, the Vertica Support team may ask you to reproduce the issue using one of the configurations described in this document to aid in troubleshooting. Depending on the details of the case, the Support team may also ask you to enter a support ticket with your operating system vendor.
Red Hat Enterprise Linux
• All versions starting at 6.0 up to and including 6.7
• Version 7.0
Important: You cannot perform an in-place upgrade of your current Vertica Analytics Platform from Red Hat Enterprise Linux 6.0 through 6.7 to Red Hat Enterprise Linux 7.0. For information on how to upgrade to Red Hat Enterprise Linux 7.0, see the Migration Guide for Red Hat 7/CentOS 7.
CentOS
• All versions starting at 6.0 up to and including 6.7
• Version 7.0
Important: You cannot perform an in-place upgrade of your current Vertica Analytics Platform from CentOS 6.0 through 6.7 to CentOS 7.0. For information on how to upgrade to CentOS 7.0, see the Migration Guide for Red Hat 7/CentOS 7.
SUSE Linux Enterprise Server
All versions starting at 11.0 up to and including 11.0 SP3.
Oracle Enterprise Linux 6
Red Hat Compatible Kernel only. Vertica is not supported on the unbreakable kernel (kernels with a uek suffix).
Debian Linux
All versions starting at 7.0 up to and including 7.7.
Ubuntu
• Version 12.04 LTS
• Version 14.04 LTS
When there are multiple minor versions supported for a major operating system release, Hewlett Packard Enterprise recommends that you run Vertica on the latest minor version listed in the supported versions list. For example, if you run Vertica on the Debian Linux 7.0 major release, Hewlett Packard Enterprise recommends you use version 7.7. However, for both Red Hat Enterprise Linux and CentOS, Hewlett Packard Enterprise recommends using any of the following versions for the 6.0 major release: 6.5, 6.6, or 6.7.
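The version lists above lend themselves to a simple lookup. The following Python sketch transcribes the Red Hat, CentOS, Debian, and Ubuntu ranges into a table and checks a version string against them. All names here (SUPPORTED_OS, is_supported, latest_listed_minor) are illustrative and not part of any Vertica tool. SUSE and Oracle Enterprise Linux are left out because their entries are not plain major.minor ranges, and the special recommendation for Red Hat and CentOS 6 (any of 6.5, 6.6, or 6.7) is not encoded.

```python
# Supported server operating systems, transcribed from the lists above.
# Each OS maps to a list of inclusive (lowest, highest) version ranges.
SUPPORTED_OS = {
    "Red Hat Enterprise Linux": [((6, 0), (6, 7)), ((7, 0), (7, 0))],
    "CentOS": [((6, 0), (6, 7)), ((7, 0), (7, 0))],
    "Debian Linux": [((7, 0), (7, 7))],
    "Ubuntu": [((12, 4), (12, 4)), ((14, 4), (14, 4))],
}


def parse_version(text):
    """Turn a dotted version such as '6.5' into a tuple of ints, e.g. (6, 5)."""
    return tuple(int(part) for part in text.split("."))


def is_supported(os_name, version):
    """Return True if the version falls inside a listed range for the OS."""
    v = parse_version(version)
    return any(low <= v <= high for low, high in SUPPORTED_OS.get(os_name, []))


def latest_listed_minor(os_name, major):
    """Return the highest listed version for a major release, or None."""
    highs = [high for low, high in SUPPORTED_OS.get(os_name, []) if low[0] == major]
    return max(highs) if highs else None


if __name__ == "__main__":
    print(is_supported("CentOS", "6.5"))
    print(latest_listed_minor("Debian Linux", 7))
```

A version outside the listed ranges is reported as unsupported here, which matches "not tested" in the sense of the note above rather than "known to fail".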
Supported File Systems
Vertica Analytics Platform Enterprise Edition has been tested on all supported Linux platforms running ext3 or ext4 file systems. For the Vertica Analytics Platform I/O profile, the ext4 file system is considerably faster than ext3.
While other file systems have been successfully deployed by some customers, Vertica Analytics Platform cannot guarantee performance or stability of the product on these file systems. In certain support situations, you may be asked to migrate off of these unsupported file systems to help you troubleshoot or fix an issue. In particular, several file corruption issues have been linked to the use of XFS with Vertica; Hewlett Packard Enterprise strongly recommends not using it in production.
Important: Vertica Analytics Platform 7.2.x does not support Linux Logical Volume Manager (LVM).
• Internet Explorer 10 and later
• Firefox 31 and later
• Google Chrome 38 and later
Vertica Server and Management Console Compatibility
Each version of Vertica Analytics Platform Management Console is compatible only with the matching version of the Vertica Analytics Platform server. For example, the Vertica Analytics Platform 7.2 server is supported with Vertica Analytics Platform 7.2 Management Console only.
Vertica 7.2.x Client Drivers
Vertica provides JDBC, ODBC, OLE DB, and ADO.NET client drivers. You can choose to download:
• Linux and UNIX-like platforms: ODBC client driver, client RPM, and vsql client. See Installing the Client Drivers on Linux and UNIX-Like Platforms.
• Windows platforms: ODBC, ADO.NET, and OLE DB client drivers, the vsql client, the Microsoft Connectivity Pack, and the Visual Studio plug-in. See Installing the Client Drivers and Tools on Windows.
• Mac OS X platforms: ODBC client driver and vsql client. See Installing the Client Drivers on Mac OS X.
• The cross-platform JDBC client driver .jar file, available for installation on all platforms.
See Vertica Driver/Server compatibility to see which server versions are compatible with Vertica 6.1.x drivers.
ADO.NET Driver
The ADO.NET driver is supported on the following platforms:
Platform Processor Supported Versions
Windows 8
Windows 10
Microsoft Windows x64 (64-bit) Windows 7
Windows 8
Windows 10
2012
JDBC Driver
All JDBC drivers are supported on any Java 5-compliant platform. (Java 5 is the minimum.)
ODBC Driver
Vertica Analytics Platform provides both 32-bit and 64-bit ODBC drivers. Vertica 7.2.x ODBC drivers are supported on the following platforms:
Platform Processor Supported Versions
Windows 8
Windows 10
Windows 8
Windows 10
unixODBC 2.3.0 or later
SUSE Linux Enterprise
x86_64 6
Ubuntu x86_64 12.04 LTS
Platform Processor Supported Versions
unixODBC 2.3.0 or later
HP-UX IA-64 11i V3 iODBC 3.52.6 or later
unixODBC 2.3.0 or later
Solaris SPARC 10
Vertica Analytics Platform Driver/Server Compatibility
The following table indicates the Vertica Analytics Platform driver versions that are supported by different Vertica Analytics Platform server versions.
Note: SHA password security is supported on client driver and server versions 7.1.x and later.
Client Driver Version    Compatible Server Versions
6.1.x                    6.1.x, 7.0.x, 7.1.x, 7.2.x
7.0.x                    7.0.x, 7.1.x, 7.2.x
7.2.x                    7.2.x
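The compatibility table can be expressed as a lookup that a deployment script might consult before connecting. The Python sketch below is a minimal illustration under stated assumptions: the names COMPATIBLE_SERVERS, release_family, and driver_supports_server are hypothetical, and only the rows that appear in the table above are included, so a driver family with no row raises an error instead of guessing.

```python
# Driver-to-server compatibility, transcribed from the table above.
# Only rows present in the table are included.
COMPATIBLE_SERVERS = {
    "6.1.x": {"6.1.x", "7.0.x", "7.1.x", "7.2.x"},
    "7.0.x": {"7.0.x", "7.1.x", "7.2.x"},
    "7.2.x": {"7.2.x"},
}


def release_family(version):
    """Map a full version such as '7.2.1' to its family label '7.2.x'."""
    major, minor = version.split(".")[:2]
    return f"{major}.{minor}.x"


def driver_supports_server(driver_version, server_version):
    """Return True if the table lists the server family for this driver family."""
    servers = COMPATIBLE_SERVERS.get(release_family(driver_version))
    if servers is None:
        # No row in the table for this driver family: refuse to guess.
        raise KeyError(f"no table row for driver {driver_version}")
    return release_family(server_version) in servers


if __name__ == "__main__":
    print(driver_supports_server("6.1.3", "7.2.1"))
    print(driver_supports_server("7.2.0", "7.1.2"))
```

The table reads one way only: an older driver can connect to a newer server, but a 7.2.x driver is listed only against a 7.2.x server.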
vsql Client
The Vertica vsql client is included in all client packages; it is not available for download separately. The vsql client is supported on the following platforms:
Operating System Processor
• Windows 2012, all variants
• Windows 7, all variants
• Windows 8.*, all variants
• Windows 10
SUSE Linux Enterprise 11 x86, x64
Oracle Enterprise Linux 6 (Red Hat Compatible Kernel only) x86, x64
CentOS 6 x86, x64
CentOS 7 x86, x64
Ubuntu 12.04LTS x86, x64
AIX 5.3, 6.1 PowerPC
Mac OS X 10.7, 10.8, 10.9 x86, x64
Perl and Python Requirements
You can use Vertica's ODBC driver to connect applications written in Perl or Python to the Vertica Analytics Platform.
Perl
To use Perl with Vertica, you must install the Perl driver modules (DBI and DBD::ODBC) and a Vertica ODBC driver on the machine where Perl is installed. The following table lists the Perl versions supported with Vertica 7.2.x.
Perl Version: 5.8, 5.10
Perl Driver Modules: DBD::ODBC version 1.22
ODBC Requirements: See Vertica 7.2.x Client Drivers.
Python
To use Python with Vertica, you must install the Vertica Python Client or the pyodbc module and a Vertica ODBC driver on the machine where Python is installed. The following table lists the Python versions supported with Vertica 7.2.x:
Python Version    Python Driver Module                  ODBC Requirements
2.4.6             pyodbc 2.1.6                          See Vertica 7.2.x Client Drivers.
2.7.x             Vertica Python Client (Linux only)
2.7.3             pyodbc 3.0.6
3.3.4             pyodbc 3.0.7
Vertica SDKs
This section details software requirements for running User Defined Extensions (UDxs) developed using the Vertica SDKs.
C++ SDK
The Vertica cluster does not have any special requirements for running UDxs written in C++.
Java SDK
Your Vertica cluster must have a Java runtime installed to run UDxs developed using the Vertica Java SDK. HPE has tested the following Java Runtime Environments (JREs) with this version of the Vertica Java SDK:
• Oracle Java Platform Standard Edition 6 (version number 1.6)
• Oracle Java Platform Standard Edition 7 (version number 1.7)
• OpenJDK 6 (version number 1.6)
• OpenJDK 7 (version number 1.7)
R Language Pack
The Vertica R Language Pack provides version 3.0.0 of the R runtime and associated libraries for interfacing with Vertica. You install the R Language Pack on the Vertica server.
Vertica Integrations for Hadoop
Vertica 7.2.x is supported with these Hadoop distributions:
Distribution Version
MapR 3.1.1
4.0
Vertica Connector for Hadoop MapReduce
HP provides a module specific to Hadoop for Vertica client machines.
Vertica provides a Connector for Apache Hadoop 2.0.0. The table below details supported software versions.
Description: Vertica Connector for Apache Hadoop MapReduce for Hadoop 2.0.0
Apache Hadoop and Pig Combinations: Apache Hadoop 2.0.0 and Pig 0.10.0
Cloudera Distribution Versions: Cloudera Distribution Including Apache Hadoop (CDH) 4
Packs, Plug-Ins, and Connectors for Vertica Client Machines
HPE provides the following optional module for Vertica client machines.
Informatica PowerCenter Plug-In
The Vertica plug-in for Informatica PowerCenter is supported on the following platforms:
Plug-in Version
Vertica Versions
9.x 6.x (limited functionality)
Plug-in Version
Vertica Versions
l Windows 7 all variants
Red Hat Enterprise Linux 5 (32 and 64 bit)
Solaris
AIX
HP-UX
enhancements)
Vertica on Amazon Web Services
HPE provides a preconfigured AMI for users who want to run Vertica Analytics Platform on Amazon Web Services (AWS). This HP-supplied AMI allows users to configure their own storage and has been configured for and tested on AWS. This AMI is the officially supported version of Vertica Analytics Platform for AWS.
Note that HPE develops AMIs for Vertica on a slightly different schedule than our product release schedule. Therefore, AMIs will be available for Vertica releases sometime following the initial release of Vertica software.
Vertica in a Virtualized Environment
Vertica runs in the following virtualization environment:
Important: Vertica does not support suspending a virtual machine while Vertica is running on it.
Host
• VMware version 5.5
• The number of virtual machines per host did not exceed the number of physical processors
• CPU frequency scaling turned off at the host level and for each virtual machine
• VMware parameters for hugepages set at version 5.5 defaults
Input/Output
• Measured by vioperf concurrently on all Vertica nodes. When running vioperf, provide the --duration=2min option and start on all nodes concurrently.
• 25 megabytes per second per core of write
• 20+20 megabytes per second per core of rewrite
• 40 megabytes per second per core of read
• 150 seeks per second of latency (SkipRead)
• Thick provisioned disk, or pass-through-storage
Network
• Dedicated 10G NIC for each Virtual Machine
• No oversubscription at the switch layer, verified with vnetperf
Processor
• Architecture of Sandy Bridge (HP Gen8 or higher)
• 8 or more virtual cores per virtual machine
• No oversubscription
• vcpuperf time of no more than 12 seconds (approximately 2.2 GHz clock speed)
Memory
• Pre-allocate and reserve memory for the VM
• 4G per virtual core of the virtual machines
HP has tested the configuration above. While other virtualization configurations may have been successfully deployed by customers in development environments, performance of these configurations may vary. If you choose to run Vertica on a different virtualization configuration and experience an issue, the Vertica Support team may ask you to reproduce the issue using the configuration described above, or in a bare-metal environment, to aid in troubleshooting. Depending on the details of the case, the Support team may also ask you to enter a support ticket with your virtualization vendor.
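The numeric items in the tested virtualization configuration can be compared against measured values. The Python sketch below is one way to do that and rests on several assumptions: it treats the listed I/O figures as minimums, it models the "20+20" rewrite figure as separate read and write components, and the class and function names (VmMeasurements, check_vm) are hypothetical. It does not run vioperf, vnetperf, or vcpuperf; you supply the numbers those tools report.

```python
from dataclasses import dataclass


@dataclass
class VmMeasurements:
    """Values you measured for one virtual machine (all names are illustrative)."""
    virtual_cores: int
    memory_gb: float
    write_mb_s_per_core: float
    rewrite_read_mb_s_per_core: float
    rewrite_write_mb_s_per_core: float
    read_mb_s_per_core: float
    seeks_per_s: float
    vcpuperf_seconds: float


def check_vm(m):
    """Return a list of findings; an empty list means every check passed."""
    problems = []
    if m.virtual_cores < 8:
        problems.append("fewer than 8 virtual cores")
    if m.memory_gb < 4 * m.virtual_cores:
        problems.append("less than 4G of memory per virtual core")
    if m.write_mb_s_per_core < 25:
        problems.append("write below 25 MB/s per core")
    if m.rewrite_read_mb_s_per_core < 20 or m.rewrite_write_mb_s_per_core < 20:
        problems.append("rewrite below 20+20 MB/s per core")
    if m.read_mb_s_per_core < 40:
        problems.append("read below 40 MB/s per core")
    if m.seeks_per_s < 150:
        problems.append("fewer than 150 seeks per second")
    if m.vcpuperf_seconds > 12:
        problems.append("vcpuperf time above 12 seconds")
    return problems


if __name__ == "__main__":
    vm = VmMeasurements(8, 32, 30, 22, 21, 45, 180, 10.5)
    print(check_vm(vm) or "all numeric checks passed")
```

Items that are not numeric, such as thick provisioning, a dedicated NIC, or reserved memory, still need to be verified by hand.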
Vertica Integration for Apache Kafka
You can use Vertica with the Apache Kafka message broker. Vertica supports the following Kafka distributions:
Apache Kafka Versions    Vertica Versions
0.8.x                    7.2 and later
0.9.0                    7.2 SP2
For more information on Kafka integration, refer to How Vertica and Apache Kafka Work Together.
New Features
New Features and Changes in Vertica 7.2.2
Read the topics in this section for information about new and changed functionality in Vertica 7.2.2.
Upgrade and Installation
This section contains information on updates to upgrading or installing Vertica Analytics Platform 7.0.x.
More Details
For information, see Installing Vertica.
Using the admintools Option --force-reinstall to Reinstall Packages
When you upgrade Vertica and restart your database, Vertica automatically upgrades packages such as the package flextable for flex tables. Vertica issues a warning if it fails to reinstall one or more packages. Failed reinstallation could occur if, for example, your user entered an incorrect password upon restart of the database after upgrade. You can force installation of packages by issuing the new admintools install_package option, --force-reinstall.
For information on the --force-reinstall option, see Upgrading and Reinstalling Packages in Installing Vertica.
Setting the pid_max Before Installation
Vertica now requires that pid_max be set to at least 524288 before installing or upgrading the database platform. This new requirement ensures that there are enough pids for all the necessary system and Vertica processes.
For more information, see pid_max Setting.
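Before you install or upgrade, you can confirm the current value on each node. The Python sketch below reads the standard Linux procfs entry /proc/sys/kernel/pid_max and compares it to the minimum stated above. The helper names are illustrative, and the sketch only reports the value; it does not change the setting.

```python
from pathlib import Path

MIN_PID_MAX = 524288  # minimum stated in the requirement above


def read_pid_max(path="/proc/sys/kernel/pid_max"):
    """Read the kernel's current pid_max value from procfs (Linux only)."""
    return int(Path(path).read_text().strip())


def pid_max_ok(value, minimum=MIN_PID_MAX):
    """Return True if the value meets the minimum."""
    return value >= minimum


if __name__ == "__main__":
    current = read_pid_max()
    status = "OK" if pid_max_ok(current) else "too low"
    print(f"pid_max = {current} ({status}, minimum {MIN_PID_MAX})")
```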
Client Connectivity
This section contains information on updates to connection information for Vertica Analytics Platform 7.0.x.
More Details For more information see .
Last Line Record Separator to vsql
You can add a record separator to the last line of vsql output. The -Q command-line option and pset option \pset trailingrecordsep enable trailing record separators.
For more information about adding a record separator, please see Command-Line Options and \pset NAME [ VALUE ].
Setting a Client Connection Label After Connecting to Vertica
You can change the client connection label after connecting to Vertica. For more information about this feature, see Setting a Client Connection Label.
Support for Square-Bracket Query Identifiers with ODBC and OLE DB Connections
When connecting to a Vertica database with an ODBC or OLE DB connection, some third-party clients require you to use square-bracket identifiers when writing queries. Previously, Vertica did not support this type of identifier. Vertica now supports square-bracket identifiers and allows queries from those third-party platforms to run successfully.
For more information, see:
• OLE DB Connection Properties
Security and Authentication
This section contains information on updates to security and authentication features for Vertica Analytics Platform 7.0.x.
More Details
For more information see Security and Authentication.
Row-Level Access Policy
You can create access policies at the row level to restrict access to sensitive information to only those users authorized to view it.
Along with the existing column access policy, the new row-level access policy provides improved data security.
For more information see Access Policies.
Data Analysis
This section contains information on updates to data analysis for Vertica Analytics Platform 7.0.x.
More Details
For more information see Analyzing Data.
Machine Learning for Predictive Analytics
Machine Learning for Predictive Analytics is an analytics package that allows you to use machine learning algorithms on existing data in your Vertica database. Machine Learning for Predictive Analytics is included with the Vertica server and does not need to be downloaded separately.
Machine Learning for Predictive Analytics features include:
• In-database predictive modeling for regression problems, using linear regression
• In-database predictive modeling for classification problems, using logistic regression
• In-database data clustering, using k-means
• Evaluation functions to determine the accuracy of your predictive models
• Normalization functions to use for data preparation
For more information see Machine Learning for Predictive Analytics: An Overview.
Query Optimization
This section contains information on updates to Optimization for Vertica Analytics Platform 7.0.x.
More Details
For more information see Optimizing Query Performance.
Controlling Inner and Outer Join Inputs
ALTER TABLE supports the FORCE OUTER option, which lets you control join inputs for specific tables. You enable this option at the database and session scopes through the configuration parameter EnableForceOuter.
For details, see Controlling Join Inputs.
DISTINCT Support in Multilevel Aggregation
Multilevel aggregation supports SELECT..DISTINCT and aggregate functions that specify the DISTINCT option (AVG, COUNT, MIN, MAX, and SUM).
For example, Vertica supports statements such as the following:
SELECT DISTINCT catid, year
FROM sales
GROUP BY ROLLUP(catid, year)
HAVING GROUPING(year) = 0;

SELECT catid, SUM(DISTINCT year), SUM(DISTINCT month), SUM(price)
FROM sales
GROUP BY ROLLUP(catid)
ORDER BY catid;
Guaranteed Uniqueness Optimization
Vertica can identify in a query certain columns that guarantee unique values, and use them to optimize various query operations. Any columns that are defined with the following attributes contain unique values:
• Columns that are defined with AUTO_INCREMENT or IDENTITY constraints
• Primary key columns where key constraints are enforced
• Columns that are constrained to unique values, either individually or as a set
• A column that is output from one of the following SELECT statement options: GROUP BY, SELECT..DISTINCT, UNION, INTERSECT, and EXCEPT
Vertica can optimize queries when these columns are included in the following operations:
• Left or right outer join
• GROUP BY clause
• ORDER BY clause
For example, a view might contain tables that are joined on columns where key constraints are enforced, and the join is a left outer join. If you query on this view, and the query contains only columns from a subset of the joined tables, Vertica can optimize view materialization by omitting the unused tables.
You enable guaranteed uniqueness optimization through the configuration parameter EnableUniquenessOptimization. You can set this parameter at database and session scopes. By default, this parameter is set to 1 (enabled).
Tables
This section contains information on updates to Tables information for Vertica Analytics Platform 7.0.x.
More Details
For more information see Managing Tables.
Changing External Table Column Types
You can change the data types of columns in external tables without deleting and re-creating the table. Because external tables do not store their data in Vertica ROS files, data does not need to be transformed. For details, see Changing Column Data Type.
You can use this feature when working with Hive tables accessed through the HCatalog Connector, which are external tables. See Synchronizing an HCatalog Schema With a Local Schema.
Loading Data
This section contains information about loading data into Vertica Analytics Platform 7.0.x.
More Details For more information see:
Vertica Library for Amazon Web Services
Using Flex Tables
COPY
Vertica Library for Amazon Web Services
The Vertica library for Amazon Web Services (AWS) is a set of functions and configurable session parameters. These parameters allow you to directly exchange data between Vertica and Amazon S3 storage without any third-party scripts or programs.
For more information, see the Vertica Library for Amazon Web Services topic.
Performance Enhancements for favroparser
This release includes significant performance improvements when loading Avro files into columnar tables with the favroparser. These improvements do not affect loading Avro files into flex tables.
For more information about using this parser, see Loading Avro Data and FAVROPARSER in the Using Flex Tables guide.
Using COPY ERROR TOLERANCE to Treat Each Source Independently
Vertica treats each source independently when loading data using the ERROR TOLERANCE parameter. Thus if a source has multiple files and one of them is invalid, then the load continues, but the invalid file does not load.
For more information about this feature, see ERROR TOLERANCE in the topic Parameters.
Management Console
This section contains information on updates to the Management Console for Vertica Analytics Platform 7.0.x.
More Details
For more information see Using Management Console.
Email Alerts in Management Console
Management Console can generate email alerts when high-priority thresholds are exceeded. When you set a threshold to Priority 1, you can subscribe users to receive alerts through email when that threshold is exceeded.
Use the Email Gateway tab in MC Settings to enable MC to send email alerts. See Set Up Email.
You can subscribe to email alerts using the Thresholds tab, which appears on your database's Settings page. See Customizing Message Thresholds.
MC Message Center Notification Menu
In Management Console 7.2.2, you can view a preview of your most recent messages without navigating away from your current page. Click the Message Center icon in the top-right corner of Management Console to view the new notification menu. From this menu, you can delete, archive, or mark your messages as read.
To visit the Message Center, click the Message Center link in the top-right corner of the notification menu.
For more information, see Monitoring Database Messages in MC.
Syntax Error Highlighting During Query Planning
Management Console can show you query plans in an easy-to-read format on the Query Plan page. If the query you submit is invalid, Management Console highlights the parts of your query that might have caused a syntax error. You can immediately identify errors and correct an invalid query.
For more information about query plans in Management Console, see Managing Queries in MC.
MC Time Information
Using the MC REST API, you can retrieve the following information about the MC server:
• The MC server's current time
• The timezone where the MC server is located
For more information, see GET mcTimeInfo.
Timestamp Range Parameters with MC Alerts
You can specify that calls to the MC API for MC alerts, their current status, and database properties return only those alerts within specific timestamp ranges.
For more information, see GET alerts.
System Table Updates
This section contains information on updates to System Tables for Vertica Analytics Platform 7.0.x.
More Details
For more information see Vertica System Tables.
OS User Name Column
The following system tables have been updated to include the CLIENT_OS_USER_NAME column:
• CURRENT_SESSION
• LOGIN_FAILURES
• SESSION_PROFILES
• SESSIONS
• SYSTEM_SESSIONS
• USER_SESSIONS
This column logs the username of any user who logged into, or attempted to log into, the database.
KEYWORDS System Table
The V_CATALOG schema includes a new system table, KEYWORDS. You can query this table to obtain a list of Vertica reserved and non-reserved keywords.
REMOTE_REPLICATION_STATUS System Table
The V_MONITOR schema includes a new system table, REMOTE_REPLICATION_STATUS. You can query this table to view the status of objects being replicated to an alternate cluster.
TRUNCATED_SCHEMATA System Table
The V_MONITOR schema includes a new system table, TRUNCATED_SCHEMATA. You can query this table to view the original names of restored schemas that were truncated due to length.
SQL Functions and Statements
This section contains information on updates to SQL Functions and Statements for Vertica Analytics Platform 7.0.x.
More Details
For more information see the SQL Reference Manual.
Regular Expression Functions
All of the Vertica regular expression functions support LONG VARCHAR strings. This capability allows you to use the Vertica regex functions on __raw__ columns in flex or columnar tables. Using any of the regex functions to pattern match in __raw__ columns requires casting.
These are the current functions:
• REGEXP_COUNT
• REGEXP_ILIKE
• REGEXP_INSTR
• REGEXP_NOT_ILIKE
• REGEXP_NOT_LIKE
• REGEXP_LIKE
• REGEXP_REPLACE
• REGEXP_SUBSTR
This Vertica version adds these functions to the documentation:
• REGEXP_ILIKE
• REGEXP_NOT_ILIKE
• REGEXP_NOT_LIKE
Backup, Restore, and Recovery
This section contains information on updates to backup and restore operations for Vertica Analytics Platform 7.0.x.
More Details
For more information see Backing Up and Restoring the Database.
Backup Integrity Check
Vertica can confirm the integrity of your database backups. For more information about this feature, see Checking Backup Integrity.
Backup Repair
Vertica can reconstruct backup manifests and remove unneeded backup objects. For more information about this feature, see Repairing Backups.
Recover Tables in Parallel
During a recovery, Vertica recovers multiple tables in parallel. For more information on this feature, see Recovery By Table.
Replicate Tables and Schemas to an Alternate Database
You can replicate Vertica tables and schemas from one database to alternate clusters in your organization. Using this strategy helps you:
• Replicate objects to a secondary site.
• Move objects between test, staging, and production clusters.
For more information on this feature, see Replicating Tables and Schemas to an Alternate Database.
Place
This section contains information on updates to Vertica Place for Vertica Analytics Platform 7.0.x.
More Details
For more information see Vertica Place.
Additional WGS84 Supported Functions
WGS84 support has been added to the following functions:
• ST_Contains
• ST_Disjoint
• ST_Intersects
• ST_Touches
• ST_Within
For information about which data types and data type combinations are supported, see each function's reference page.
Exporting Spatial Data to a Shapefile
Vertica supports exporting spatial data as a shapefile. For more information about this feature, see Exporting Spatial Data from a Table.
Use STV_MemSize to Optimize Tables for Geospatial Queries
STV_MemSize returns the amount of memory used by a spatial object. When you are optimizing your table for performance, you can use this function to get the optimal column width of your data.
For more information about this feature, see STV_MemSize.
Kafka Integration
This section contains information on updates to Kafka Integration for Vertica Analytics Platform 7.0.x.
More Details
For more information see Integrating with Apache Kafka.
Apache Kafka 0.9 Support
Vertica supports Apache Kafka 0.9 integration.
Multiple Kafka Cluster Support
Vertica can accept data from multiple Kafka clusters streaming into a single Vertica instance. For more information, refer to Kafka Cluster Utility Options in Kafka Utility Options.
Parse Custom Formats
Vertica supports the use of user-defined filters to manipulate data arriving from Kafka. You can apply these filters to data before you parse it. For more information about this feature, see Parsing Custom Formats.
Parse Kafka Messages Without Schema and Metadata
The KafkaAVROParser includes the parameter with_metadata. When set to TRUE, the KafkaAVROParser parses messages without including object and schema metadata. For more information about this feature, see Using COPY with Kafka.
kafka_clusters System Table
The kafka_config schema includes a new system table, kafka_clusters. You can query this table to view your Kafka clusters and their constituent brokers.
New Features and Changes in Vertica 7.2.1
Read the topics in this section for information about new and changed functionality in Vertica 7.2.1.
Supported Platforms
This section contains information on updates to supported platforms for Vertica Analytics Platform 7.0.x.
More Details
For complete information on platform support see Vertica 7.2.x Supported Platforms.
Software Download
For more information on changes to the operating system in the Red Hat 7 release, see the Red Hat Enterprise Linux 7 documentation.
Windows 10 Support for Client Drivers
Vertica has added Windows 10 support for the following client drivers:
• ADO.NET, both 32 and 64-bit
• JDBC
• Vertica vsql
Security and Authentication
This section contains information on updates to security and authentication features for Vertica Analytics Platform 7.0.x.
More Details
For more information see Security and Authentication.
Authentication Support for Chained Users and Roles
You can now enable authentication for a chain of users and roles, rather than enabling authentication for each user and role separately.
For more information see Implementing Client Authentication.
Restrict System Tables
The new security parameter RestrictSystemTables prohibits users from accessing sensitive information from some system tables.
For more information see System Table Restriction.
Query Optimization
This section contains information on updates to Optimization for Vertica Analytics Platform 7.0.x.
More Details
For more information see Optimizing Query Performance.
Batch Export of Directed Queries
This release provides two new meta-functions that let you batch export directed queries from one database to another. These tools are useful for saving query plans before a scheduled version upgrade:
• EXPORT_DIRECTED_QUERIES batch exports query plans as directed queries to an external SQL file.
• IMPORT_DIRECTED_QUERIES lets you selectively import query plans that were exported by EXPORT_DIRECTED_QUERIES from another database.
For more information on using these tools, see Batch Query Plan Export.
UTYPE Hint
The UTYPE hint specifies how to combine UNION ALL input.
SQL Functions and Statements This section contains information on updates to SQL Functions and Statements for Vertica Analytics Platform 7.2.x.
More Details For more information see the SQL Reference Manual.
THROW_ERROR Function This release adds the THROW_ERROR function, which lets you generate arbitrary errors. For more information, see THROW_ERROR.
Management Console This section contains information on updates to the Management Console for Vertica Analytics Platform 7.2.x.
More Details For more information see Management Console.
Management Console Message Center Redesign Management Console now brings focus to time-sensitive and high priority messages with a redesigned Message Center.
The new Recent Messages inbox displays your messages from the past week, while messages you have read are now archived in the Archived Messages inbox. You can also click Threshold Messages to view only alerts about exceeded thresholds in your database.
To help you prioritize the messages you view, Message Center displays the number of messages categorized as High Priority, Needs Attention, and Informational. You can click any of these values to filter by that priority.
For more information about Message Center, see Monitoring Database Messages in MC.
Password Requirements in Management Console Starting with Vertica 7.2.1, Management Console (MC) passwords must contain 5 - 25 characters. Passwords created in previous versions of MC continue to work regardless of length. If you change a password after updating to 7.2.1, MC enforces the new length requirement.
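The length rule above can be sketched as a simple check. This is only an illustration, not Management Console code: the function name is hypothetical, and treating both bounds as inclusive is an assumption based on the wording "5 - 25 characters".

```python
# Bounds taken from the text above; names are illustrative, not part of MC.
MC_PASSWORD_MIN = 5
MC_PASSWORD_MAX = 25

def meets_mc_length_rule(password: str) -> bool:
    """Return True if the password length is within 5 to 25 characters (inclusive, assumed)."""
    return MC_PASSWORD_MIN <= len(password) <= MC_PASSWORD_MAX

print(meets_mc_length_rule("abcd"))     # 4 characters: too short
print(meets_mc_length_rule("abcde"))    # 5 characters: allowed
print(meets_mc_length_rule("x" * 26))   # 26 characters: too long
```

A check like this would apply only when a password is created or changed, since the text notes that passwords created in earlier MC versions keep working regardless of length.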
System Table Updates This section contains information on updates to System Tables for Vertica Analytics Platform 7.2.x.
More Details For more information see Vertica System Tables.
License Audits System Table Vertica has renamed the LICENSE_AUDITS system table column LICENSE_NAME to AUDITED_DATA.
Refer to the section, LICENSE_AUDITS in the SQL Reference Manual for more information.
Licenses System Table Vertica has renamed the LICENSES system table column IS_COMMUNITY_EDITION to IS_SIZE_LIMIT_ENFORCED.
Refer to the section, LICENSES in the SQL Reference Manual for more information.
SDK Updates This section contains information on updates to the SDK for Vertica Analytics Platform 7.2.x.
More Details For more information see the Java SDK Documentation.
Enhancements for C++ User-Defined Function Parameters This release adds new functionality for C++ user-defined function parameters. For more information, see:
l Defining the Parameters Your UDx Accepts
l Specifying the Behavior of Passing Unregistered Parameters
l USER_FUNCTION_PARAMETERS
Management Console API Alert Filters New filters are available for when you use the Vertica REST API to retrieve information on alerts configured in Management Console. For information on applying these category and sub-category filters, see:
l Thresholds Category Filter
l Database Name Category Filter
Text Search Updates This section contains information on updates to Text Search for Vertica Analytics Platform 7.2.x.
More Details For more information see Using Text Search.
Text Search Features This release adds the following functionality for text search:
l Tokenizers are now polymorphic and can accept any number and type of columns.
l Text indices can now contain multiple columns from their source table.
For more information see Using Text Search.
New Features and Changes in Vertica 7.2.0 Read the topics in this section for information about new and changed functionality in Vertica 7.2.0.
Supported Platforms This section contains information on updates to supported platforms for Vertica Analytics Platform 7.2.x.
More Details For complete information on platform support see Vertica 7.2.x Supported Platforms.
Software Download For more information on changes to the operating system in the Red Hat 7 release, see the Red Hat Enterprise Linux 7 documentation.
Support for Red Hat Enterprise Linux 7 and CentOS 7 This release adds support for Red Hat Enterprise Linux 7 and CentOS 7 on HPE Vertica Analytics Platform and Management Console.
You cannot perform a direct upgrade with Vertica Analytics Platform 7.2.x from Red Hat 6.x to 7 or CentOS 6.x to 7. For information on how to upgrade to Red Hat 7 or CentOS 7, see Migration Guide for Red Hat 7/CentOS 7.
Requirements Before Upgrading or Installing This section contains updated information on the tasks you must complete before you install or upgrade Vertica.
Dialog Package Vertica now requires the dialog package to be installed on all nodes in your cluster before installing or upgrading the database platform.
See Package Dependencies for more information.
Licensing and Auditing Vertica 7.2.x has changed its licensing scheme and the way it audits licenses.
More Details For complete license information see Managing Licenses.
Vertica Enterprise Edition License Is Now Premium Edition Enterprise Edition is now Premium Edition.
l If you have a current Enterprise Edition license, it is still valid for use with Vertica 7.2.x.
l The new Premium Edition includes Flex Tables. You no longer need a separate license for Flex Zone. (The new Premium Edition does not include a Flex Table data limit; you can add Flex Table data up to your general license limit. Flex data counts as 1/10th the cost of regular data towards your license limits.)
l Premium Edition includes all Vertica functionality. Hadoop requires a separate license.
For more information see Understanding Vertica Licenses.
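As a worked example of the 1/10th weighting described above, the following sketch computes how much of a license limit a mix of regular and flex data would consume. The function name and the data volumes are made up for illustration; only the 1/10th factor comes from the text.

```python
def license_usage_gb(regular_gb: float, flex_gb: float) -> float:
    """Flex data is weighted at one tenth of regular data, per the text above."""
    return regular_gb + flex_gb / 10

# Hypothetical figures: 800 GB of regular data and 1000 GB of flex data.
usage = license_usage_gb(800, 1000)
print(usage)  # 900.0
```

Under these assumed figures, 1000 GB of flex data consumes the same license allowance as 100 GB of regular data.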
Vertica Database Audits
IMPORTANT: The changes in this section became effective with Vertica Release 7.1.2, except where noted.
Vertica has made storage changes related to licensing. As a result, the Vertica database audit size is calculated differently and is reduced from audit sizes calculated prior to Vertica Version 7.1.2.
Vertica now computes the effective size of the database based on the export size of the data.
l Vertica no longer counts a 1-byte delimiter value in the effective size of the database. Instead, the Vertica audit license size is now based solely on the data width.
l Vertica no longer adds a 1-byte value to account for each delimiter. Under the new sizing rules, null values are free. Thus, Vertica audit size may be greatly reduced from the previous version audit size.
l As of Vertica Release 7.2.x, Flex data counts as only 1/10th the cost of non-Flex data.
As a result of these changes, compression ratios show less compression than previous versions. You can find detailed information on how Vertica calculates database size in the Calculating the Database Size section of the Administrator's Guide.
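The effect of dropping the delimiter byte and treating null values as free can be illustrated with a toy calculation. This is a simplified model and not the actual audit algorithm: it assumes every value is measured as its UTF-8 string length, and that the earlier rule added exactly one byte per value, including nulls.

```python
from typing import Optional, Sequence

def size_with_delimiters(row: Sequence[Optional[str]]) -> int:
    """Earlier-style estimate (assumed model): data width plus one delimiter byte per value."""
    return sum((len(v.encode("utf-8")) if v is not None else 0) + 1 for v in row)

def size_data_width_only(row: Sequence[Optional[str]]) -> int:
    """Newer-style estimate: data width only, so a NULL contributes nothing."""
    return sum(len(v.encode("utf-8")) for v in row if v is not None)

row = ["42", None, "Boston"]        # made-up row containing one NULL
print(size_with_delimiters(row))    # (2+1) + (0+1) + (6+1) = 11
print(size_data_width_only(row))    # 2 + 6 = 8
```

In this model the difference grows with the number of columns and the share of null values, which is consistent with the statement that the audit size may be greatly reduced.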
Client Connectivity This section contains information on updates to connection information for Vertica Analytics Platform 7.2.x.
More Details For more information see Connecting to Vertica.
Vertica Client Drivers and Tools for Windows This release adds a new installer, Vertica Client Drivers and Tools for Windows, for connecting to Vertica. The installer is packaged as an .exe file. You can run the installer as a regular Windows installer or silently. The installer is compatible with both 32-bit and 64-bit machines.
The installer contains the following client drivers and tools:
l The ODBC Client Driver for Windows
l The OLE DB Client Driver for Windows
l The vsql Client for Windows
l The ADO.NET Driver for Windows
l The Microsoft Connectivity Pack for Windows
l The Visual Studio Plug-in for Windows
For more information on installing the Client Drivers and Tools for Windows, see The Vertica Client Drivers and Tools for Windows.
Download the latest Client Drivers and Tools for Windows from my.vertica.com (logon required).
Multiple Active Result Sets (MARS) Support Vertica now supports multiple active result sets when you use a JDBC connection. For more information, see Multiple Active Result Sets (MARS).
VHash Class for JDBC You can use the VHash class as an implementation of the Vertica built-in hash function when connecting to your database with a JDBC connection. For more information, see Pre-Segmenting Data Using VHash.
Binary Transfer with ADO.NET Connections When connecting between your ADO.NET client application and your Vertica database, you can now use binary transfer instead of string transfer. See ADO.NET Connection Properties for more information.
Python Client Vertica offers a Python client that allows you to interface with your database. For more information, see Vertica Python Client.
Security and Authentication This section contains information on updates to security and authentication features for Vertica Analytics Platform 7.2.x.
More Details For more information see Security and Authentication.
Admintools Remote Calls Before Vertica Analytics Platform 7.2.x, you performed admintools remote calls by using SSH to connect to a remote cluster and then running shell commands. This configuration allowed system database administration users to perform any system action without limitation.
This approach prevented organizations from auditing the complete set of actions that Admintools can perform.
Vertica Analytics Platform 7.2.x addresses this with the following remote Python module:
python -m <command>
Using this module allows organizations to limit the system database administration user to executing Python modules under the .../vertica/engine/api directory.
CFS Security You can now use the Connector Framework Service (CFS) to ingest indexed HPE IDOL data securely into the Vertica Analytics Platform. This option allows you to use the Vertica Analytics Platform to perform analytics on data indexed by HPE IDOL.
The new security features control access to specific documents using:
l Access Control Lists (ACL) indicating which users and groups can access a document.
l Security Information Strings that associate a user/group with a specific ACL.
For more information see Connector Framework Service.
Inherited Privileges The new inherited privileges feature allows you to grant privileges at the schema level. This approach automatically grants privileges to a new or existing table in the schema. By using inherited privileges, you can:
l Eliminate the need to apply the same privileges to each individual table in the schema.
l Quickly create new tables that have the necessary privileges for users to perform required tasks.
For more information see Inherited Privileges Overview in the Administrator's Guide.
LDAP Link The new LDAP Link service allows the Vertica server to tightly couple with an existing directory service such as MS Active Directory or OpenLDAP. Using the LDAP Link service, you can specify that the Vertica server synchronize:
l LDAP users to Vertica database users
l LDAP groups to Vertica groups
l LDAP user and group membership to Vertica users and roles membership
Any changes to the LDAP Link directory service are reflected in the Vertica database in near real time. For example, if you create a new user in LDAP, and LDAP Link is active, that user identity is sent to the Vertica database upon the next synchronization.
For more information see LDAP Link Service.
SYSMONITOR Role The new System Monitor (SYSMONITOR) role grants access to specific monitoring utilities without granting full DBADMIN access. This role allows the DBADMIN user to delegate administrative tasks without compromising security or exposing sensitive information. A DBADMIN user has the System Monitor role by default.
For more information see SYSMONITOR Role in the Administrator's Guide.
Database Management This section contains information on updates to database operations for Vertica Analytics Platform 7.2.x.
More Details For more information see Managing the Database.
Configuration Parameter ARCCommitPercentage This parameter sets a threshold percentage of WOS to ROS rows. The specified percentage determines when to aggregate projection row counts and commit the result to the Vertica catalog. The default value is 3 (percent).
ARCCommitPercentage provides better control over expensive commit operations, and helps reduce extended catalog locks.
For more information see General Parameters in the Administrator's Guide.
Admintools Debug Option Vertica has added a --debug option to the admintools command to assist customers and customer support. When you enable the debug option, Vertica adds additional information to associated log files.
Note: Vertica often changes the format or content of log files in subsequent releases to benefit both customers and customer support.
For information on admintools command options, refer to Writing Administration Tools Scripts in the Administrator's Guide.
Automatic Eviction of Unresponsive Nodes This release adds a new capability to detect and respond to unhealthy nodes in a Vertica database cluster.
See Automatic Eviction of Unhealthy Nodes to learn more.
Reduced Catalog Size Vertica 7.2.x reduces catalog size by consolidating statistics storage, removing unused statistics, and storing unsegmented projection metadata once per database, instead of once per node. More efficient catalog storage provides the following benefits:
l Smaller catalog footprint
l Better scalability with large clusters, facilitating faster analytics, backup, and recovery
l Reduced overhead and fewer bottlenecks associated with catalog size and usage.
For more information see Prepare Disk Storage Locations in Installing Vertica.
Unsegmented Projection Buddies Map to Single Name Before Vertica 7.2.x, all instances (buddies) of unsegmented projections that were created by CREATE PROJECTION UNSEGMENTED ALL NODES had unique identifiers:
unseg-proj-name_nodeID
In this identifier nodeID indicated the node that hosted a given projection buddy.
With the redesign of the database catalog in Vertica 7.2.x, a single name now maps to all buddies of an unsegmented projection.
Name Conversion Utility When you upgrade your database to Vertica 7.2.x, all existing projection names remain unchanged and Vertica continues to support them. You can use the function merge_projections_with_same_basename() to consolidate unsegmented projection names so they conform to the new naming convention. This function takes a single argument, one of the following:
l An empty string specifies to consolidate all unsegmented projection buddy names under their respective projection base names. For example:
=> select merge_projections_with_same_basename('');
 merge_projections_with_same_basename
--------------------------------------
                                    0
(1 row)
l The base name of the projection whose buddy names you want to convert.
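To make the naming change concrete, the sketch below groups per-node buddy names of the form unseg-proj-name_nodeID under their shared base name. It is a conceptual illustration only, not what merge_projections_with_same_basename() does internally. The suffix pattern (node followed by digits) and the sample names are assumptions.

```python
import re
from collections import defaultdict

# Assumption: node IDs look like "node0001"; the real suffix format may differ.
BUDDY_SUFFIX = re.compile(r"_node\d+$")

def group_by_basename(buddy_names):
    """Map each projection base name to the buddy names that share it."""
    groups = defaultdict(list)
    for name in buddy_names:
        groups[BUDDY_SUFFIX.sub("", name)].append(name)
    return dict(groups)

# Hypothetical buddy names for one unsegmented projection on three nodes.
names = ["dim_unseg_node0001", "dim_unseg_node0002", "dim_unseg_node0003"]
print(group_by_basename(names))
```

Each key in the result corresponds to the single name that, under the new convention, maps to all buddies of that projection.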
Query Optimization This section contains information on updates to Optimization for Vertica Analytics Platform 7.2.x.
More Details For more information see Optimizing Query Performance.
Database Designer Database Designer performance and scalability on large (>100 nodes) clusters has been significantly improved.
For more information see About Database Designer in the Administrator's Guide.
Optimizer Memory Usage Memory usage by Database Optimizer has been reduced. For more information on the Database Optimizer see Optimize Query Performance.
Directed Queries and New Query Hints Directed queries encapsulate information that the optimizer can use to create a query plan. Directed queries serve two goals:
l Preserve current query plans before a scheduled upgrade.
l Enable you to create query plans that improve optimizer performance.
For more information about directed queries, see Directed Queries in the Administrator's Guide.
Directed queries rely on a set of new optimizer hints, which you can also use in vsql queries. For information about these and other hints, see Hints in the SQL Reference Manual.
JOIN Performance Vertica 7.2.x improves the performance of hash join queries through parallel construction of the hash table.
For more information see Joins in Analyzing Data.
Terrace Routing Reduces Buffering Requirements Terrace routing is a feature that can reduce the buffer requirements of large queries. Use terrace routing in situations where you have large queries and clusters with a large number of nodes. Without terrace routing, these situations would require excessive buffer space.
For more information see Terrace Routing in the Administrator's Guide.
Changed Behavior for Live Aggregate/Top-K Projections Live aggregate and Top-K projections now conform to the behavior of other Vertica projections, as follows:
l Vertica's database optimizer automatically directs queries that specify aggregation to the appropriate live aggregate or Top-K projection. In previous releases, you could access pre-aggregated data only by querying live aggregate/Top-K projections; now you can query the anchor tables.
l Live aggregate/Top-K projections no longer require anchor projections. Vertica now loads new data directly into live aggregate/Top-K projections.
l Live aggregate/Top-K projections must now explicitly specify the same level of K-safety or greater that is configured for the Vertica database. Otherwise, Vertica does not create buddy projections and cannot update projection data.
For more information see Creating Live Aggregate Projections in Analyzing Data.
Pre-Aggregation of UDx Function Results You can create live aggregate projections that invoke user-defined transform functions (UDTFs). Vertica processes these functions in the background and stores their results on disk, to minimize overhead when you query those projections. For more information, see Pre-Aggregating UDTF Results in Analyzing Data.
A projection can also specify a user-defined scalar function like any other expression. When you load data into this projection, Vertica stores the function result set for faster access. See Support for User-Defined Scalar Functions in Analyzing Data.
Improved Queries in Flex Views This release applies flex query rewriting more broadly: rewriting now occurs whenever a __raw__ column is present. With this new functionality, a view is considered a flex view if it includes a __raw__ column, and it therefore supports rewriting.
For more information see Querying Flex Views.
JIT Support for Regular Expression Matching Vertica has upgraded the PCRE library to version 8.37. This upgrade includes Just in Time (JIT) compilation for the functions used in regular expression matching in SQL queries. For more information, see the parameter PatternMatchingUseJIT in General Parameters.
Loading Data This section contains information about loading data in flex tables and using Kafka integration for Vertica Analytics Platform 7.2.x.
Vertica 7.2.x includes two new parsers for flex tables:
l Avro file parser, favroparser
l CSV (comma-separated values) parser, fcsvparser
This release also extends flex view and __raw__ column queries. Whenever you query a flex table, Vertica calls maplookup() internally to include any virtual columns. In this release, querying flex views, or any table with a __raw__ column, also invokes maplookup() internally to improve query results.
More Details For more information see Integrating with Apache Kafka and Understanding Flex Tables.
Flex Parsers Vertica 7.2.x introduces two new parsers.
Avro Parser Use favroparser to load Avro files into flex and columnar tables.
For flex tables, this parser supports files with primitive and complex data types, as described in the Apache Avro 1.3 specification.
CSV Parser Use the flex tables CSV parser, fcsvparser, to load standard and modified CSV files into flex and columnar tables.
For more information, see Flex Parsers Reference.
Vertica and Apache Kafka Integration Vertica 7.2.x introduces the ability to integrate with Apache Kafka. This feature allows you to stream data from a Kafka message bus directly into your Vertica database. It uses the job scheduler feature included in the Vertica rpm package.
For more information, see Kafka Integration Guide.
Management Console This section contains information on updates to the Management Console for Vertica Analytics Platform 7.2.x.
More Details For more information see Management Console.
Connect to HPE IDOL Dashboard from Management Console Vertica 7.2.x Management Console introduces the ability to create a mutual link between your Vertica Management Console and HPE IDOL dashboards. The new HPE IDOL button in MC displays the number of alerts you have in HPE IDOL. The button provides a clickable shortcut to your HPE IDOL dashboard. Point MC to your HPE IDOL dashboard in MC Settings.
For more information see Connecting to HPE IDOL Dashboard in Using Management Console.
External Data Sources for Management Console Monitoring You can now use Management Console (MC) to monitor Data Collector information copied into Vertica tables, locally or remotely.
In the MC Settings page, provide mappings to local schemas or to an external database that contains the corresponding DC data. MC can then render its charts and graphs from the new repository instead of from local DC tables. This offers the benefit of loading larger sets of data faster in MC, and retaining historical data long term.
Administrators can configure MC to monitor an external data source using the new Data Source tab on the MC Settings page.
See Monitoring External Data Sources in Management Console.
Configure Resource Pools Using Management Console In this release, Management Console introduces more ways to configure your resource pools. With Management Console you can now create and remove resource pools, assign resource pool users, and assign cascading pools.
Database administrators can make changes to a database's resource pools in the Resource Pools Configuration page, accessible through the database's Settings page.
See Configuring Resource Pools in Management Console.
Threshold Monitoring Enhancements in Management Console Vertica Management Console 7.2.x introduces more detailed, configurable notifications about the health of monitored databases.
Prioritize and customize your notifications and alert thresholds on the new Thresholds tab, which appears on the Database Settings page. The new Threshold Settings widget on the Overview Page now displays your prioritized alerts.
For more information, see Monitoring Database Messages in MC and Customizing Message Thresholds.
Management Console Message Center Enhancements In this release, Management Console introduces filtering and performance enhancements to the Message Center that allow you to:
l View up to 10,000 messages by default
l Retrieve additional alerts from the past
l Use console.properties to increase the number of messages you can view in Message Center
l Delete all your alerts at once
In addition, improvements to filtering now allow you to sort messages by severity, database name, message description, and date.
For more information about messages in Management Console, see Monitoring Database Messages in MC.
Management Console REST API Vertica now provides the following API calls to interact with Management Console:
l GET alerts
l GET alertSummary
System Table Updates This section contains information on updates to System Tables for Vertica Analytics Platform 7.2.x.
More Details For more information see Vertica System Tables.
System Tables for Constraint Enforcement These system tables, under the V_CATALOG schema, include the new column IS_ENABLED:
l CONSTRAINT_COLUMNS
l TABLE_CONSTRAINTS
l PRIMARY_KEYS
Under the V_CATALOG schema, the PROJECTIONS system table includes the new column IS_KEY_CONSTRAINT_PROJECTION.
RESOURCE_POOL_MOVE System Table Vertica 7.2.x includes the following changes to the RESOURCE_POOL_MOVE system table:
l The table includes the new MOVE_CAUSE column. This column displays the reason why the query attempted to move.
l The CAP_EXCEEDED column was removed.
l The REASON column is now called RESULT_REASON.
For more information see RESOURCE_POOL_MOVE in the SQL Reference Manual.
LICENSES System Table The system table LICENSES, under the V_CATALOG schema, includes new columns:
l LICENSETYPE
l PARENT
l CONFIGURED_ID
TABLE_RECOVERIES System Table Vertica 7.2.x now includes the TABLE_RECOVERIES system table. You can query this table to view detailed progress on specific tables during a Recovery By Table.
TABLE_RECOVERY_STATUS System Table Vertica 7.2.x now includes the TABLE_RECOVERY_STATUS system table. You can query this table to view the progress of a Recovery By Table.
SQL Functions and Statements This section contains information on updates to SQL Functions and Statements for Vertica Analytics Platform 7.2.x.
More Details For more information see the SQL Reference Manual.
Analytic Functions Vertica now includes the NTH_VALUE analytic function.
NTH_VALUE is an analytic function that returns the value evaluated at the row that is the nth row of the window (counting from 1).
For more information see Analytic Functions in SQL Reference Manual.
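The counting-from-1 behavior described above can be sketched in Python. This models only the selection rule, not Vertica's window framing or SQL syntax. Returning None when the window has fewer than n rows is an assumption modeled on SQL NULL semantics.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")

def nth_value(window: Sequence[T], n: int) -> Optional[T]:
    """Return the value at the nth row of the window, counting from 1.

    Returns None when the window holds fewer than n rows (assumed behavior).
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return window[n - 1] if n <= len(window) else None

window = [10, 20, 30]          # made-up window values, already in window order
print(nth_value(window, 2))    # 20
print(nth_value(window, 5))    # None
```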
Math Functions Vertica now includes the following mathematical functions:
l COSH—Calculates the hyperbolic cosine.
l LOG10—Calculates the base 10 logarithm.
l SINH—Calculates the hyperbolic sine.
l TANH—Calculates the hyperbolic tangent.
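For reference, the quantities these functions compute can be checked against their exponential definitions. The sketch below uses Python's standard math module rather than Vertica itself, so it illustrates the mathematics only. The helper name is hypothetical.

```python
import math

def hyperbolic_from_exp(x: float):
    """Compute (sinh, cosh, tanh) of x from the exponential definitions."""
    e_pos, e_neg = math.exp(x), math.exp(-x)
    sinh_x = (e_pos - e_neg) / 2
    cosh_x = (e_pos + e_neg) / 2
    return sinh_x, cosh_x, sinh_x / cosh_x

x = 0.5
sinh_x, cosh_x, tanh_x = hyperbolic_from_exp(x)

# The definitions agree with the library implementations up to rounding.
print(math.isclose(sinh_x, math.sinh(x)))
print(math.isclose(cosh_x, math.cosh(x)))
print(math.isclose(tanh_x, math.tanh(x)))

# The base 10 logarithm inverts raising 10 to a power.
print(math.isclose(math.log10(10 ** 3), 3.0))
```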
Options for Routing Queries Vertica 7.2.x introduces new functionality for moving queries to different resource pools. Now, as the database administrator, you can use the MOVE_STATEMENT_TO_RESOURCE_POOL meta-function to specify that queries move to different resource pools mid-execution.
For more information see Manually Moving Queries to Different Resource Pools.
Session Resource Functions Vertica now includes new resource management functions:
l RESERVE_SESSION_RESOURCE
l RELEASE_SESSION_RESOURCE
Automatic Enforcement of Primary and Unique Key Constraints Vertica can now automatically enforce primary and unique key constraints. Additionally, you can enable individual constraints using CREATE TABLE or ALTER TABLE.
You also have the option of setting parameters so that new constraints you create are, by default, disabled or enabled when you create them. If you have not specifically enabled or disabled constraints using CREATE TABLE or ALTER TABLE, the parameter default settings apply.
For information on automatic enforcement of PRIMARY and UNIQUE key constraints, refer to Enforcing Primary and Unique Key Constraints Automatically in the Administrator's Guide.
When you upgrade to Vertica 7.2.x, the primary and unique key constraints in any tables you carry over are disabled. Existing constraints are not automatically enforced. To enable existing constraints and make them automatically enforceable, manually enable each constraint using the ALTER TABLE ALTER CONSTRAINT statement. This statement triggers constraint enforcement for the existing table contents. Statements roll back if one or more violations occur.
Enabling and Disabling Individual Constraints Two new modifiers allow you to set enforcement of individual constraints: ENABLED and DISABLED.
To enable or disable individual constraints, use the CREATE TABLE or ALTER TABLE statements. These syntaxes now include ENABLED and DISABLED options for PRIMARY and UNIQUE keys:
l Column-Constraint (as part of the CREATE TABLE statement)
l Table-Constraint (as part of CREATE TABLE or ALTER TABLE statements)
The ALTER TABLE statement also includes an ALTER CONSTRAINT option for enabling or disabling existing constraints.
Choosing Default Enforcement for Newly Declared or Modified Constraints Two new parameters allow you to set the default for enabling or disabling newly created constraints. You set these parameters using the ALTER DATABASE statement. Setting a constraint as enabled or disabled when you create or alter it using CREATE TABLE or ALTER TABLE overrides the parameter setting. The default value for both of these new parameters is false (disabled).
l EnableNewPrimaryKeysByDefault lets you enable or disable constraints for primary keys.
l EnableNewUniqueKeysByDefault lets you enable or disable constraints for unique keys.
For general information about configuration parameters, refer to Configuration Parameters in the Administrator's Guide. For information about these new parameters and how to set them, refer to Constraint Enforcement Parameters in the Administrator's Guide and ALTER DATABASE in the SQL Reference Manual.
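The precedence between an explicit ENABLED or DISABLED setting and the two default parameters can be summarized in a few lines. This is a conceptual sketch of the rule as described above, not Vertica code. The function name and the use of None to mean "no explicit setting" are illustrative choices.

```python
from typing import Optional

def is_enforced(explicit: Optional[bool], parameter_default: bool = False) -> bool:
    """Resolve whether a new primary or unique key constraint is enforced.

    explicit: True for ENABLED, False for DISABLED, None if the statement says neither.
    parameter_default: value of EnableNewPrimaryKeysByDefault or
    EnableNewUniqueKeysByDefault (both default to false, per the text).
    """
    return parameter_default if explicit is None else explicit

print(is_enforced(None))                          # False: parameter default applies
print(is_enforced(None, parameter_default=True))  # True: parameter turned on
print(is_enforced(False, parameter_default=True)) # False: explicit DISABLED wins
```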
Behavior Not Changed from Previous Releases NOT NULL constraints are always automatically enforced for primary keys. When you create a primary key, Vertica implicitly creates a NOT NULL constraint on the key set. It does so regardless of whether you enable or disable the key.
You can manually validate constraints using the ANALYZE_CONSTRAINTS meta-function. ANALYZE_CONSTRAINTS neither depends on nor considers the automatic enforcement settings of primary or unique keys. Thus, you can run ANALYZE_CONSTRAINTS on a table or schema that includes:
l Disabled key constraints
l A mixture of enabled and disabled key constraints
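Conceptually, validating a primary or unique key means looking for key values that occur more than once, whether or not the constraint is currently enabled. The sketch below shows that idea on in-memory rows. It does not reflect how ANALYZE_CONSTRAINTS is implemented or what it returns, and the table contents are made up.

```python
from collections import Counter

def find_key_violations(rows, key_columns):
    """Return the key values that appear more than once across the given rows."""
    counts = Counter(tuple(row[c] for c in key_columns) for row in rows)
    return sorted(key for key, n in counts.items() if n > 1)

# Hypothetical table contents with a duplicated id value.
rows = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 1, "name": "c"},
]
print(find_key_violations(rows, ["id"]))  # [(1,)]
```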
Recover by Table Functions Vertica now includes new functions to configure and perform recovery on a per-table basis:
l SET_RECOVER_BY_TABLE
l SET_TABLE_RECOVER_PRIORITY
Backup, Restore, and Recovery This section contains information on updates to backup and restore operations for Vertica Analytics Platform 7.2.x.
More Details For more information see Backing Up and Restoring the Database.
Backup to Local Host Vertica does not support the Linux variable, localhost. However, it does allow you to direct backups to the local host without using an IP address.
Direct a backup to a location on the localhost by including square brackets and a path in the following form:
[Mapping]
NodeName = []:/backup/path

[Mapping]
v_node0001 = []:/scratch_drive/archive/backupdir
v_node0002 = []:/scratch_drive/archive/backupdir
v_node0003 = []:/scratch_drive/archive/backupdir
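For clusters with many nodes, a mapping section in this form can be generated rather than typed by hand. The helper below is a hypothetical convenience script, not part of vbr. It only renders text in the NodeName = []:/backup/path form shown above.

```python
def build_mapping_section(node_names, backup_path):
    """Render a [Mapping] section that directs each node's backup to its local host."""
    lines = ["[Mapping]"]
    # The empty square brackets stand in for the host, as in the form shown above.
    lines += [f"{node} = []:{backup_path}" for node in node_names]
    return "\n".join(lines)

nodes = ["v_node0001", "v_node0002", "v_node0003"]
print(build_mapping_section(nodes, "/scratch_drive/archive/backupdir"))
```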
Restoring Individual Objects from a Full or Object-Level Backup You can now restore individual tables or schemas from any backup that contains those objects without restoring the entire backup. This option is useful if you only need to restore a few objects and want to avoid the overhead of a larger scale restore. Your database must be running and your nodes must be UP to restore individual objects.
For more information see Restoring Individual Objects from a Full or Object-Level Backup.
Lightweight Partition Copy Vertica now includes the COPY_PARTITIONS_TO_TABLE function.
Lightweight partition copy increases performance by sharing the same storage between two tables. The storage footprint does not increase as a result of shared storage. After the copy partition is complete, the tables are independent of each other. Users can perform operations on each table without impacting the other. As the tables diverge, the storage footprint may increase as a result of operations performed on these tables.
Object Restore Mode You can now specify how Vertica should handle restored objects:
Object restore mode (coexist, createOrReplace or create) (createOrReplace):
Vertica supports the following object restore modes:
l createOrReplace (default) — Vertica creates any objects that do not exist. If the object does exist, vbr overwrites it with the version from the archive.
l create — Vertica creates any objects that do not exist. If an object being restored does exist, Vertica displays an error message and skips that object.
l coexist — Vertica creates all restored objects with the form <backup>_<timestamp>_<object_name>. This approach allows existing and restored objects to exist simultaneously.
In all modes, Vertica restores data with the current epoch. Object restore mode settings do not apply to backups and full restores.
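The three modes differ only in what happens when a restored object already exists and in the name the restored object receives. The sketch below is a toy Python model of the behavior described above, not vbr code. The function name plan_restore, the action labels, and the placeholder backup and timestamp values are all assumptions made for illustration.

```python
def plan_restore(object_name, exists, mode="createOrReplace",
                 backup="backup", timestamp="timestamp"):
    """Return (action, resulting_name) for one restored object.

    The backup and timestamp arguments are placeholders; the exact strings
    that Vertica uses for them are not specified here.
    """
    if mode == "createOrReplace":
        # Default mode: create missing objects, overwrite existing ones.
        return ("overwrite" if exists else "create", object_name)
    if mode == "create":
        # Existing objects are reported as errors and skipped.
        return ("skip_with_error" if exists else "create", object_name)
    if mode == "coexist":
        # Restored objects always get a prefixed name, so nothing is replaced.
        return ("create", f"{backup}_{timestamp}_{object_name}")
    raise ValueError(f"unknown object restore mode: {mode}")
```

For example, plan_restore("sales", exists=True, mode="create") reports that the object is skipped, while the same call with mode="coexist" yields a new, prefixed object name next to the existing one.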
For more information see Restoring Object-Level Backups.
Recovery by Table
Vertica now supports node recovery on a per-table basis. Unlike a node-based recovery, recovering by table makes tables available as they recover, before the node itself is completely restored. You can prioritize your most important tables so that they become available as soon as possible. Recovered tables support all DDL and DML operations.
After a node fully recovers, it enables full Vertica functionality.
Recovery by table is enabled by default.
For more information see Recovery By Table and Prioritizing Table Recovery.
Hadoop Integration
This section contains information on updates to Hadoop integration for Vertica Analytics Platform 7.2.x.
More Details
For more information, see Integrating with Hadoop.
Hadoop HDFS Connector
The HDFS Connector is now installed with Vertica; you no longer need to download and install it separately. If you have previously downloaded and installed this connector, uninstall it before you upgrade to this release of Vertica to get the newest version.
For more information see Using the HDFS Connector in Integrating with Hadoop.
Place
This section contains information on updates to Vertica Place for Vertica Analytics Platform 7.2.x.
More Details For more information see .
WGS84 Support
WGS84 support has been added to the following functions:
l STV_Intersect Transform Function
l STV_Intersect Scalar Function
Vertica Place Functions
The following new functions have been added to Vertica Place:
l STV_AsGeoJSON
l STV_ForceLHR
l STV_Reverse
STV_Refresh_Index Removes Polygons From Spatial Indexes
STV_Refresh_Index can now remove deleted polygons from spatial indexes. For more information, see STV_Refresh_Index.
Vertica Pulse
This section contains information on updates to Pulse for Vertica Analytics Platform 7.2.x.
More Details
For more information, see Vertica Pulse.
Action Patterns
Vertica Pulse now supports the use of action patterns in white_list dictionaries. An action pattern enables Pulse to recognize phrases that denote action, intention, or interest, such as going to buy, waiting to see, and so on. Action patterns can identify behaviors associated with your sentiment analysis terms.
Action patterns can:
l Connect Word Forms to a Root Word — Vertica Pulse lemmatizes all words. Lemmatization recognizes different word forms and maps them to the root word. For example, Pulse would map bought and buying to buy. This ability extends to misspellings. For example, tryiiiing and seeeeeing taaablets would map to trying and seeing tablets.
l Create Object-Specific Queries — To identify only the attributes that are objects of action patterns, create a whitelist dictionary that contains only action patterns of interest. In your sentiment analysis query, set the actionPattern and whiteListOnly parameters to true.
Concurrent User-Defined Dictionaries
In version 7.2.x and later, users can apply dictionaries on a per-user basis. Any number of Pulse users can concurrently apply different sets of dictionaries without conflicts and without disrupting the sessions of other users. Each user can have one dictionary of each type loaded at any given time. If a user does not specify a dictionary of a given type, Pulse uses the default dictionary for that type.
For more information see Dictionary and Mapping Labels in Vertica Pulse.
Case-Sensitive Sentiment Analysis
By default, Pulse is case insensitive. ERROR produces the same results as error. You can now specify a case setting for a single word using the $Case parameter. For example, to identify Apple, rather than apple, you would add the following:
=> INSERT INTO pulse.white_list_en VALUES('$Case(Apple)');
=> COMMIT;
For more information see Sentiment Analysis Levels in Vertica Pulse.
Dictionary And Mapping Labels
You can apply a label to any user-defined dictionary or mapping when you load that object. Labels enable you to perform sentiment analysis against a predetermined set of dictionaries and mappings without having to specify a list of dictionaries. For example, you might have a set of dictionaries labeled "music" and a set labeled "movies." The default user dictionaries automatically have a label of "default."
A single dictionary or mapping can have multiple labels. For example, you might label a white list of artists as both "painters" and "renaissance." You could load the dictionary by loading either label. A label can only apply to one dictionary of each type. For example, you cannot have two dictionaries of stop words that share the same label. If you apply a label to multiple dictionaries of the same type, Pulse uses the most recently applied label.
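The labeling rules above can be pictured as a lookup keyed by label and dictionary type. The following Python sketch is a conceptual model only and does not use or reproduce any Pulse API. The class name LabelRegistry, the type names, and the dictionary names are hypothetical. It also assumes that "most recently applied" means the later dictionary replaces the earlier one for that label and type.

```python
class LabelRegistry:
    """Toy model: each (label, dictionary type) pair maps to one dictionary."""

    def __init__(self):
        self._by_label = {}  # (label, dict_type) -> dictionary name

    def apply(self, label, dict_type, dict_name):
        # Applying a label to a second dictionary of the same type
        # replaces the earlier one (assumed "most recent wins" behavior).
        self._by_label[(label, dict_type)] = dict_name

    def resolve(self, label):
        """Return the dictionaries selected by a label, keyed by type."""
        return {t: name for (lbl, t), name in self._by_label.items()
                if lbl == label}


registry = LabelRegistry()
# One dictionary can carry several labels.
registry.apply("painters", "white_list", "artists")
registry.apply("renaissance", "white_list", "artists")
# Two stop-word dictionaries share a label; only the latest is kept.
registry.apply("music", "stop_words", "stop_words_v1")
registry.apply("music", "stop_words", "stop_words_v2")
```

In this model, resolving either "painters" or "renaissance" selects the same white list, and resolving "music" selects only the stop-word dictionary that was labeled last.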
You can view the labels associated with your current dictionaries using the GetAllLoadedDictionaries() function. You can also view the label associated with your current mapping using the GetLoadedMapping() function.
For more information, see Dictionaries and Mappings in Vertica Pulse.
Pulse Functions
Vertica 7.2.x includes the following new functions:
l UnloadLabeledDictionary()
l UnloadLabeledDictionarySet()
l UnloadLabeledMapping()
SDK Updates
This section contains information on updates to the SDK for Vertica Analytics Platform 7.2.x.
More Details
For more information, see the Java SDK Documentation.
SDK Enhancements
Vertica 7.2.x introduces these enhancements to the Java SDK:
l A new class named StringUtils helps you manipulate string data. See The StringUtils Class for more information.
l The new PartitionWriter.setStringBytes method lets you set the value of BINARY, VARBINARY, and LONG VARBINARY columns using a ByteBuffer object. See the Java UDx API documentation for more details.
l The PartitionWriter class has a new set of methods for writing output including setLongValue, setStringValue, and setBooleanValue. These methods set the output column value to NULL when passed a Java null reference. When you pass these methods a value, they save the value in the column. These methods save you the steps of checking for null references and calling the separate methods to store nulls or values in the columns. For more information, see the entry for PartitionWriter in the Java API documentation.
l The StreamWriter class used with a User-Defined Parser has a new method, setRowFromMap. You can use this method to write a map of column-name/value pairs as a single operation with automatic type coercion. The JSON Parser example demonstrates this method. For more information, see UDParser and ParserFactory Java Interface, particularly the section titled "Writing Data".
l User-Defined Analytic Functions can now be written in Java in addition to C++. See Developing a User-Defined Analytic Function in Java.
l Multi-phase transform functions can now be written in Java in addition to C++. See Creating Multi-Phase UDTFs.
For more information see the Java SDK Documentation.
UDx Wildcards
Vertica now supports wildcard (*) characters in place of column names in user-defined functions.
You can use wildcards when:
l Your query contains a table in the FROM clause
l You are using a Vertica-supported development language
l Your UDx is running in fenced or unfenced mode
For more information see Using User-Defined Extensions.
User-Defined Session Parameters
This release adds support for user-defined session parameters. Vertica now supports passing session parameters to a Java or C++ UDx at construction time.
For more information, see User-Defined Session Parameters and User-Defined Session Parameters in Java.
Documentation Updates
This section contains information on updates to the product documentation for Vertica Analytics Platform 7.2.x.
More Details For complete product documentation see ® Documentation.
Documentation Changes
The following changes are effective for Vertica 7.2.x.
Document Additions and Revisions
The following additions and revisions have been made to the Vertica 7.2.x product documentation.
l Extending Vertica has been reorganized to reduce redundancy and make information easier to find. Specifically:
n The documentation for each type of User-Defined Extension first presents information that is true for all implementation languages and then presents language-specific information for C++ and Java.
n All information about developing UDxs, whether general (such as packaging libraries) or about specific APIs, is now found under Developing User-Defined Extensions (UDxs).
n The documentation for each type of UDx follows the same structure: requirements, class overview, any implementation topics related specifically to that type of UDx, deploying, and examples.
n Developing UDSFs and UDTFs in R remains a separate section.
l Vertica Concepts has been moved in the Table of Contents. It now appears after New Features and before Installing Vertica. It also has been reorganized and updated to reflect improvements to Vertica.
l The contents of the book Vertica for SQL on Hadoop have been folded into Integrating with Hadoop, which now explains both co-located and separate clusters. The book has been reorganized to make information easier to find and eliminate redundancies. The ORC Reader has been made more prominent.
For more information, see Integrating with Hadoop, and in particular Cluster Layout and Choosing Which Hadoop Interface to Use.
l A new Security and Authentication guide consolidates all client/server authentication and security topics. Authentication information was removed from the Administrator's Guide and placed in the new document.
l A standalone Management Console guide has been created. Management Console topics have been removed from the Administrator's Guide. Management Console topics remain in Installing Vertica (Installing and Configuring Management Console) and Getting Started (Using Management Console).
l A new Hints section in the SQL Reference Manual describes all supported query hints.
l A new Best Practices for DC Table Queries section in Machine Learning for Predictive Analytics describes how to optimize query performance when querying Data Collector Tables.
l A new Integrating with Apache Kafka document describes how to integrate Apache Kafka with Vertica.
l Documentation for the Microsoft Connectivity Pack now resides in the Connecting to Vertica document. See The Microsoft Connectivity Pack for Windows.
Removed from Documentation
The following documentation elements were removed from the Vertica 7.2.x product documentation.
l Documentation on partially sorted GROUPBY has been removed.
l The VSQL environment variables page (vsql Environment Variables) listed VSQL_DATABASE and SHELL. These environment variables are no longer in use and have been removed from the documentation.
l The previously deprecated function MERGE_PARTITIONS was removed from the SQL Reference Manual.
Deprecated and Retired Functionality
This section describes the two phases HPE follows to retire Vertica functionality:
l Deprecated. Vertica announces deprecated features and functionality in a major or minor release. Deprecated features remain in the product and are functional. Documentation is included in the published release documentation. Accessing the feature can result in informational messages noting that the feature will be removed in the following major or minor release. Vertica identifies deprecated features in this document.
l Removed. HPE removes a feature in the major or minor release immediately following the deprecation announcement. Users can no longer access the functionality. Vertica announces all feature removal in this document. Documentation describing the retired functionality is removed, but remains in previous documentation versions.
Deprecated Functionality in This Release
In version 7.2.x, the following Vertica functionality has been deprecated.
l System-level parameter ConstraintsEnforcedExternally, and related SQL statement SET SESSION CONSTRAINTS_ENFORCED_EXTERNALLY
l The backup and restore configuration parameter overwrite is replaced by the objectRestoreMode setting.
l Support for any Vertica Analytics Platform running on the ext3 file system
l Prejoin projections
l The --compat21 option of the admintools command
See Also
For a description of how Vertica deprecates features and functionality, see Deprecated and Retired Functionality.
Retired Functionality History
The following functionality has been deprecated or removed in the indicated versions:
Functionality Component Deprecated Version
Backup and restore overwrite configuration parameter Server 7.2
verticaConfig vbr configuration option Server 7.1
JavaClassPathForUDx configuration parameter
Server 7.0
7.0
Pload Library Server 7.0
IMPLEMENT_TEMP_DESIGN() Server, clients
UPDATE privileges on sequences Server 6.0
Query Repository, which includes:
RESOURCE_ACQUISITIONS_HISTORY system table Server 6.1
copy_vertica_database.sh Server
restore.sh Server
backup.sh Server
4.1 (Client)
5.1 (Server)
5.1 (Client)
use35CopyParameters ODBC, JDBC
Server, clients
RefreshHistoryDuration Server 4.1 5.0
Notes
l While the Vertica Geospatial package has been deprecated, it has been replaced by Vertica Place. This analytics package is available on my.vertica.com/downloads.
l LCOPY: Supported by the 5.1 server to maintain backwards compatibility with the 4.1 client drivers.
l Query Repository: You can still monitor query workloads with the following system tables:
n QUERY_PROFILES
n SESSION_PROFILES
n EXECUTION_ENGINE_PROFILES
In addition, Vertica Version 6.0 introduced new robust, stable workload-related system tables:
l QUERY_REQUESTS
l QUERY_EVENTS
l RESOURCE_ACQUISITIONS
l The RESOURCE_ACQUISITIONS system table captures historical information.
l Use the Kerberos gss method for client authentication, instead of krb5. See Configuring Kerberos Authentication.
Installing Vertica
Installation Overview and Checklist
This page provides an overview of installation tasks. Carefully review and follow the instructions in all sections in this topic.
Important Notes
l Vertica supports only one running database per cluster.
l Vertica supports installation on one, two, or multiple nodes. The steps for Installing Vertica are the same, no matter how many nodes are in the cluster.
l Prerequisites listed in Before You Install Vertica are required for all Vertica configurations.
l Only one instance of Vertica can be running on a host at any time.
l To run the install_vertica script, or to add, update, or delete nodes, you must be logged in as root or use sudo as a user with all privileges. You must run the script for all installations, including upgrades and single-node installations.
Installation Scenarios
The four main scenarios for installing Vertica on hosts are:
l A single node install, where Vertica is installed on a single host as a localhost process. This form of install cannot be expanded to more hosts later on and is typically used for development or evaluation purposes.
l Installing to a cluster of physical host hardware. This is the most common scenario when deploying Vertica in a testing or production environment.
l Installing on Amazon Web Services (AWS). When you choose the recommended Amazon Machine Image (AMI), Vertica is installed when you create your instances. For the AWS-specific installation procedure, see Installing and Running Vertica on AWS: The Detailed Procedure rather than using the steps for installation and upgrade that appear in this guide.
l Installing to a local cluster of virtual host hardware. This is similar to installing on physical hosts, but with network configuration differences.
These preliminary steps are broken into two categories:
l Configuring Hardware and Installing Linux
l Configuring the Network
Install or Upgrade Vertica
Once you have completed the steps in the Before You Install Vertica section, you are ready to run the install script.
Installing Vertica describes how to:
l Back up any existing databases.
l Download and install the Vertica RPM package.
l Install a cluster using the install_vertica script.
l [Optional] Create a properties file that lets you install Vertica silently.
Note: This guide provides additional manual procedures in case you encounter installation problems.
l Upgrading Vertica to a New Version describes the steps for upgrading to a more recent version of the software.
Note: If you are upgrading your Vertica license, refer to Managing Licenses in the Administrator's Guide.
Post-Installation Tasks
After You Install Vertica describes subsequent steps to take after you've run the installation script. Some of the steps can be skipped based on your needs:
l Install the license key.
l Verify that kernel and user parameters are correctly set.
l Use the Vertica documentation online, or download and install Vertica documentation. Find the online documentation and documentation packages to download at http://my.vertica.com/docs.
l Install client drivers.
l Extend your installation with Vertica packages.
l Install or upgrade the Management Console.
Get started!
l Read the Concepts Guide for a high-level overview of the HPE Vertica Analytics Platform.
l Proceed to the Installing and Connecting to the VMart Example Database in Getting Started, where you will be guided through setting up a database, loading sample data, and running sample queries.
About Linux Users Created by Vertica and Their Privileges
This topic describes the Linux accounts that the installer creates and configures so Vertica can run. When you install Vertica, the installation script optionally creates the following Linux user and group:
l dbadmin—Administrative user
l verticadba—Group for DBA users
dbadmin and verticadba are the default names. If you want to change what these Linux accounts are called, you can do so using the installation script. See Installing Vertica with the install_vertica Script for details.
Before You Install Vertica
See the following topics for more information:
l Installation Overview and Checklist
l General Hardware and OS Requirements and Recommendations
When You Install Vertica
The Linux dbadmin user owns the database catalog and data storage on disk. When you run the install script, Vertica creates this user on each node in the database cluster. It also adds dbadmin to the Linux dbadmin and verticadba groups, and configures the account as follows:
l Configures and authorizes dbadmin for passwordless SSH between all cluster nodes. SSH must be installed and configured to allow passwordless logins. See Enable Secure Shell (SSH) Logins.
l Sets the dbadmin user's shell to /bin/bash, which is required to run scripts such as install_vertica and the Administration Tools.
l Provides read-write-execute permissions on the following directories:
n /opt/vertica/*
n /home/dbadmin—the default directory for database data and catalog files (configurable through the install script)
Note: The Vertica installation script also creates a Vertica database superuser named dbadmin. They share the same name, but they are not the same; one is a Linux user and the other is a Vertica user. See Database Administration User in the Administrator's Guide for information about the database superuser.
After You Install Vertica
Root or sudo privileges are not required to start or run Vertica after the installation process completes.
The dbadmin user can log in and perform Vertica tasks, such as creating a database, installing/changing the license key, or installing drivers. If dbadmin wants database directories in a location that differs from the default, the root user (or a user with sudo privileges) must create the requested directories and change ownership to the dbadmin user.
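As a rough illustration of the directory step, the Python sketch below builds the shell commands that root (or a user with sudo privileges) might run to create non-default database directories and hand them to dbadmin. The helper name directory_setup_commands and the example path are hypothetical. The sketch only produces command strings for review and does not run them.

```python
import shlex


def directory_setup_commands(directories, owner="dbadmin"):
    """Return mkdir and chown commands for each requested directory."""
    commands = []
    for path in directories:
        quoted = shlex.quote(path)  # guard against spaces or shell metacharacters
        commands.append(f"mkdir -p {quoted}")
        commands.append(f"chown {owner} {quoted}")
    return commands


# "/data/vertica" is an example path, not a Vertica default.
for command in directory_setup_commands(["/data/vertica"]):
    print(command)
```

If you specified a different administrative user name during installation, pass that name as the owner argument.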
Vertica prevents administration from users other than the dbadmin user (or the user name you specified during the installation process if not dbadmin). Only this user can run Administration Tools.
See Also
l Installation Overview and Checklist
l Before You Install Vertica
l Platform Requirements and Recommendations
l Enable Secure Shell (SSH) Logins
HPE Vertica Analytics Platform (7.2.x) Page 73 of