Upload
dangthu
View
230
Download
1
Embed Size (px)
Citation preview
IIBBMM PPuurreeDDaattaa ffoorr
AAnnaallyyttiiccss ((IIPPDDAA))
PPoowweerreedd bbyy NNeetteezzzzaa tteecchhnnoollooggyy
Multiple Schema Support
in Release 7.0.3
- including Best Practices
Netezza Performance Server 9 September 9, 2013
ii
Table of Contents
1. Introduction ..................................................................................................................................................................... 1 2. Referencing Database Objects in IPDA ........................................................................................................................ 1 3. Multiple Schema Support ............................................................................................................................................... 4
3.1. Compatibility Mode (default) ................................................................................................................................ 5 3.2. Multi-Schema Mode .............................................................................................................................................. 5
4. Configuring Multiple Schema Support ........................................................................................................................ 6 4.1. Enable Multiple Schema Support ........................................................................................................................ 6 4.2. Disable Multiple Schema Support ....................................................................................................................... 7
5. Multiple Schemas and User Connection Options ....................................................................................................... 7 5.1. Connect to the default Database Schema ............................................................................................................. 8 5.2. Connect to the default User Schema ..................................................................................................................... 8 5.3. Connect to a specific User Schema (with Multiple Schema support Enabled) .................................................... 8
6. Enabling Multiple Schema support on Existing Systems .......................................................................................... 9 7. Database Schemas ......................................................................................................................................................... 11 8. Managing Multiple Schemas ....................................................................................................................................... 11
8.1. Creating Schemas ................................................................................................................................................. 11 8.2. Changing Schemas............................................................................................................................................... 12 8.3. Altering Schemas ................................................................................................................................................. 12 8.4. Display Database Schema ................................................................................................................................... 12 8.5. Dropping Schemas ............................................................................................................................................... 12
9. Abbreviated Database Object Naming Notation ...................................................................................................... 13 10. Security ...................................................................................................................................................................... 14
10.1. GRANT/REVOKE Privileges .............................................................................................................................. 14 11. Application Considerations .................................................................................................................................... 15
11.1. Migrating from other DBMS to IPDA ............................................................................................................... 15 11.2. Determining the Current Schema ...................................................................................................................... 16 11.3. Dynamically Changing Databases .................................................................................................................... 17 11.4. Resolving SQL Statement Database Objects .................................................................................................... 17
12. Application Usage Types......................................................................................................................................... 17 12.1. ETL / ELT .............................................................................................................................................................. 18 12.2. System Administration ....................................................................................................................................... 18 12.3. Database Administration .................................................................................................................................... 19 12.4. Development ........................................................................................................................................................ 20 12.5. Business Intelligence ........................................................................................................................................... 21 12.6. Scientific/Labs/Class ............................................................................................................................................ 21 12.7. HA/Replication .................................................................................................................................................... 21
13. Backup and Restore (BAR) ...................................................................................................................................... 22 14. Upgrade / Downgrade ............................................................................................................................................. 22
14.1. Upgrade ................................................................................................................................................................ 22 14.2. Downgrade ........................................................................................................................................................... 23
15. Catalog/Control Tables and other Database Objects ........................................................................................... 23 16. Stored Procedures / UDFs / UDAs ......................................................................................................................... 24
Netezza Performance Server 9 September 9, 2013
iii
16.1. Creating a Stored Procedure in a Schema ........................................................................................................ 24 16.2. Using Stored Procedures .................................................................................................................................... 25 16.3. Functions / UDFs ................................................................................................................................................. 25 16.4. Procedure/Function Privileges ........................................................................................................................... 26
17. IPDA Utilities / Interfaces ........................................................................................................................................ 26 17.1. CLIENT CONNECTIVITY.................................................................................................................................. 26 17.2. NZLOAD .............................................................................................................................................................. 26 17.3. Other IPDA Utilities ............................................................................................................................................ 27
18. Other ........................................................................................................................................................................... 27 19. Conclusion ................................................................................................................................................................. 28
Netezza Performance Server 9 September 9, 2013
1
1. Introduction
IBM Pure Data for Analytics (IPDA) appliances, formerly Netezza, supports SQL Environments as defined by the
ANSI/ISO standard for SQL. The standard defines an SQL Environment as an object hierarchy which is made
up of one or more catalogs, a.k.a databases, containing one or more schemas, containing one or more SQL
Objects such as tables, view, procedures and functions. These objects are made up of a collection of one or more
columns defined using SQL data types. Connections to an SQL environment results in a session that exist at the
SQL object level. So sessions exist within a schema that exist within a catalog.
Note: This document will use the common term database rather than the ANSI term catalog. Further schema is
often used to refer to the structure of a database, its objects (tables, views, udfs, etc), and the rules for how those
objects are related and organized in the database. In this document schema refers to the SQL object class of
schema. We will use object to refer to SQL tables, views, procedures, routines, and functions.
Prior to release 7.0.3, only one schema was supported for every database. By default, the schema was the name of
the database owner. All database objects were created within that one schema. To support ANSI SQL and be user
friendly the system allowed any value to be specified as the schema name in a two or three part name. To make
object ownership easy for users using odbc, jdbc the owner of and SQL object was returned as it’s schema.
Beginning release 7.0.3, multiple schema support can be enabled and, when enabled, multiple schemas can be
created in each database. Multiple schemas are useful to help organize database objects as well as to give users
areas within the database for development and testing. The implementation of multiple schema support enables
applications to consolidate multiple databases into a single database containing schemas for stage, production
archive and so on. Because schemas are objects within the database applications can read from any schema and
write into any schema, this make application development simpler. Application migration to IPDA from other
database offerings that support multiple schemas are made significantly easier.
IBM Pure Data for Analytics (IPDA) systems running release 7.0.3 forward will have the ability to be configured
as either supporting a single schema or multiple schema. When configured to support only 1 schema per
database the system can be configured to ignore the schema element in SQL statements.
2. Referencing Database Objects in IPDA
IPDA systems implement SQL using the ANSI/ISO standard for SQL and support may, but not all features
specified in latest standard revision. IPDA appliances implement an ANSI/ISO SQL (SQL) Environment.
SQL specifies all SQL objects be uniquely identified by a three part name that consist of the database name, the
schema name and the object name.
IPDA follows the three-level SQL naming standard to refer to objects in a database:
• The database or catalog name
• The schema name
• The object name, which is the name of the table, view, synonym. routine, and other database objects
Netezza Performance Server 9 September 9, 2013
2
An example select using the fully qualified 3 part name:
select * from database-name.schema-name.object-name
When a connection (session) is established to database default values are provided for current_catalog
(database) and current_schema. So any SQL statement that is submitted containing only using less than the full
three part name will have the missing parts completed with the session defaults. This greatly simplifies the effort
required to write SQL statements.
For example given a database sales with schema demo and table customer all of the following SQL statements
would return the same data.
Three part: sales(user1)=> SELECT * FROM sales.demo.customer;
Two part: sales(user1)=> SELECT * FROM demo.customer;
One part: sales(user1)=> SELECT * FROM customer;
In the above examples the default catalog is sales, and the default schema is demo.
IPDA supports the above naming convention with many valid SQL statements, including cross-database SQL
statements. For example, the following statements are all valid:
1. Truncate the contents of the CUSTOMER table:
sales(user1)=> TRUNCATE TABLE customer;
sales(user1)=> TRUNCATE TABLE demo.customer;
sales(user1)=> TRUNCATE TABLE sales.demo.customer;
2. Retrieve specific records from the CUSTOMER table in the SALES database:
sales(user1)=> SELECT * FROM customer WHERE sales.demo.customer.custid = 101;
sales(user1)=> SELECT * FROM sales.demo.customer WHERE custid = 101;
3. While connected to the SALES database:
• Retrieve the contents of the CUSTOMER table that is located in the ORDERS database, and insert into the
CUSTOMER table in the SALES database:
sales(user1)=> INSERT INTO customer SELECT * FROM orders.admin.customer;
• Join information in the CUSTOMER tables of both the SALES and ORDERS databases using their
CUSTID column:
sales(user1)=> SELECT * FROM customer sc, orders.admin.customer oc
WHERE sc.custid = oc.custid;
Netezza Performance Server 9 September 9, 2013
3
• Create a table named CUSTOMER in the SALES database, with the same definition as the CUSTOMER
table in the ORDERS database, and populate the new table:
sales(user1)=> CREATE TABLE customer AS SELECT * FROM orders.admin.customer;
Note that cross-database writes (DML and DDL statements) are not supported in IPDA. For example, the
following statement will return an error:
sales(user1)=> INSERT INTO orders.admin.customer SELECT * FROM customer;
Cross Database Access not supported for this type of command.
For DML statements, consider changing the query to a cross-database SELECT statement (which is supported)
while logged in to the target database. For example:
orders(user1)=> INSERT INTO customer SELECT * FROM sales.admin.customer;
As a best practice, when a query involves multiple tables, it is important to qualify column names as follows:
database.schema.table.column
or
table_alias.column where alias refers to table alias defined in the from
clause.
Select cst.custid from sales.admin.customer cst where cst.lasname like ‘Smith’;
This helps identify which column belongs to which table especially if the same names are used for tables and
columns. The use of aliases keeps the size of the SQL manageable.
SQL procedures , routines, and functions are SQL objects and they support the use of multi-part names to locate
the object. For procedures, routines and functions using the single part name, IPDA will try to resolve the
object’s location using the current_schema, then the SQL_PATH (if specified), then the
INOFORMATION_SCHEMA and finally the DEFINITON_SCHEMA to find the object. The first matching object
name is used in this search.
Note. Procedure, routine, function name resolution can be time consuming and result in performance issues.
Therefore, to avoid either incorrect resolution or search performance issues, as a best practice these objects
should always be fully qualified name – database-name.schema-name.object-name.
2.1 Double Dot notation
Since the introduction of cross database access in version 3.1 the IPDA SQL syntax has provided a place holder
syntax that allowed sql users and developers to omit schema in their SQL statements. In a single schema world names always correctly resolve, so the double dot notation saved time, space
and was appliance like. Now with support for multiple schemas objects, primarily
tables and views, can reside under many schemas. So the only way to ensure the
statement resolves as desired is to provide the schema name.
Should the double dot notation be used it’s important to understand how the missing
schema element is supplied.
This syntax consisted of a database name two dots and the object name, as shown below.
Netezza Performance Server 9 September 9, 2013
4
Sales.user1(user1)=> SELECT * from orders..customer;
Note the periods between orders and customer, these are the double dots.
When this statement executes the missing schema value will be filled in by either
the current_schema or the database default schema. When the database name matches
the current database name then the current schema is used. When the database name
does not match the current database name then the schema value is filled in with
the database default schema for the specified database.
For example: sales.user1(user1)=> SELECT * from sales..customer;
In this example the current_database sales matches so the current schema user1
fills in the missing schema, making the statement “SELECT * from
sales.user1.customer”.
When the database does not match the current database the database default schema for the specified database is
used.
For example: Given the database default schema for orders database is user5 and the
SQL:
sales.user1(user1)=> SELECT * from orders..customer;
In this example the current_database sales does not match the specified database
orders so the default schema for the orders database, user5, is used to fill in the
missing schema, making the statement “SELECT * from orders.user5.customer”.
Because the missing schema can be filled in with different values we recommend all SQL statements provide the
schema values and avoid the uses of the double dot notation. This will make migration to multiple schema easier.
3. Schema Support
All IPDA releases support SQL’s three part names.
For all releases where multiple schema support is either not available or not enabled the schema element is not
validated, because there is only 1 schema.
IBM in preparation for the default enablement of multiple schema support in IPDA has introduced the option of
enabling multiple schema support in the 7.0.3 release to allow partners, and customers access to this functionality
before IBM starts shipping systems with multiple schema support enabled by default. Nearly all the partner
applications that support IPDA already also support multiple schema, so for most partners no changes are
required.
Netezza Performance Server 9 September 9, 2013
5
When an IPDA database is created it is created with a default schema named for the owner of the database. The
system variables current_catalog and current_schema are set to reflect the database and default schema. The
database owner is also the schema owner and has full privileges on both the database and schema.
When a SQL object like a table is created it’s created in the current_catalog and current_schema combining the
object name with the schema and database name results in a unique three-part name.
Privileges can be used to restrict access to databases, schemas and the objects in a schema.
3.1. Compatibility Mode (default)
One of the requirements with the introduction of multiple schema support was seamless backward compatibility.
Customers upgrading to IPDA release 7.0.3 must be able to function without any conflicts and with no
modification to their existing application/user code.
Compatibility mode allows customers with existing single schema systems to continue using the existing
database applications and processes with no modifications, following an upgrade to releases 7.0.3 and later.
In compatibility mode schema is ignored and the new syntax supporting multiple schemas are disabled.
Compatibility mode is governed by the variable enable_schema_dbo_check in the /nz/data/postgresql.conf file –
a value of 0 enables this mode. If the variable enable_schema_dbo_check is omitted it’s default value is 0.
The variable enable_schema_dbo_check has been part of the IPDA product since release 3.1 and has been used
specify the actions that will be taken when processing a query with an invalid schema. In previous releases, a
value of 0 caused IPDA to ignore the user-specified schema and use the default schema. A value of 1 would cause
a warning, but the schema would be ignored. A value of 2 would cause the SQL statement to fail if the schema
did not match.
3.2. Multi-Schema Mode
In multiple schema mode each database can have a multiple schemas. As a result of having multiple shcema’s
there can be multiple instances of an SQL object, say a table in the database. For example in the sales database
you can have an instance of the customer table under the production schema and another instance of customer
under the stage schema. The instances of customer are unique because they are in different schema and because
the both instances of customer are in the same database both instances can be written to. Being able to write to
tables in different schema enables many applications functionality equal to cross database write.
Multi-schema mode is enabled by specifying a value of 1 or 2 for the variable enable_schema_dbo_check.
When enabled all functionality relating to this schema, including new syntax and access to the information
schema is enabled.
It is recommended that new customers or new IPDA installations with no existing data should enable multi-
schema mode with a value of 2, and begin using this feature. This will help simplify the data migration from
other systems that use multiple schemas .
Netezza Performance Server 9 September 9, 2013
6
3.3. Default schema
The ANSI standard requires users connect to a vendor defined default schema when a user connects and does not
specify a schema. IPDA provides an environment (Appliance) level selectable option for the default schema.
IPDA allows the specification of either the DBMS default schema or a schema named identically to the session
owner. The variable enable_user_schema in the /nz/data/postgresql.conf determine which schema is
selected 0 = database , 1 = user.
3.4. Upgrading existing systems to multiple schema.
After upgrading an existing appliance to release 7.0.3 multiple schema support can be enabled. We strongly
recommend existing customers phase in multiple schema support according to business requirements and with
full functional testing of existing applications and processes.
Refer to Section 6, Enable Multiple Schema support in existing Systems, for more information on migrating existing
applications to release 7.0.3.
4. Configuring Multiple Schema Support
Multiple Schema Support in IPDA is enabled at the appliance, system, level . multi-schema support cannot be
enabled at the database level!
To enable, disable multiple schema support requires access to /nz/data/postgresql.conf and a restart
of the NPS system. The Linux user nz typically owns the /nz/data/postgresql.conf file.
The admin and users granted the schema privilege can create and manage schemas within databases.
You can configure whether invalid schemas return a warning or an error. For example, you can configure IPDA
to return an error for any queries that specify an invalid/non-existent schema, or you can return a warning
message for queries that use a non-existent schema and allow the use of the default schema.
4.1. Enable Multiple Schema Support
To enable multiple schema support, use the following procedure:
1. Log in to the IPDA active host as the nz user. Change the value of the variable
enable_schema_dbo_check in the /nz/data/postgresql.conf file to one of the following:
�� 0
•• DDiissaabblleess mmuullttiippllee sscchheemmaa ssuuppppoorrtt
�� 11
• Enables multiple schema support in warn mode.
• Users can create, alter, set, and drop schemas.
• However, if a query references an invalid schema, IPDA looks for the object in the default
schema and if found continues processing and displays a WARNING message (Schema
'schema_name' does not exist) .
Netezza Performance Server 9 September 9, 2013
7
�� 22
• Enables full support in error mode.
• Users can create, alter, set, and drop schemas.
• If a query references an invalid schema, the query fails and returns an ERROR (Schema
'schema_name' does not exist).
2. Run nzstop command to stop IPDA.
3. Run nzstart command to restart the IPDA software for the change to take effect.
4.2. Disable Multiple Schema Support
To disable multiple schema support, use the following procedure:
1. Move any SQL objects not in the default schema to the default schema. Note: some names may need to be
changed to keep object names unique!
2. Change the value of the variable enable_schema_dbo_check in /nz/data/postgresql.conf to 0.
3. Run nzstop command to stop IPDA.
4. Run nzstart command to restart the IPDA software for the change to take effect.
Use caution when you disable the functionality. If you disable multiple schema support without moving all SQL
objects to the default schema they have limited accessibility. The schemas and their objects still exist and can be
referenced in SELECT statements, but they will not be accessible with insert, update, delete statements. Further
schema commands will be disabled. If multiple schema support is re-enabled , users will be able to set schemas
and manage the existing objects again.
As a best practice, you should not enable and disable multiple schema support to avoid any problems managing
and accessing schemas.
5. Multiple Schemas and User Connection Options
When multiple schema support is enabled (enable_schema_dbo_check=1 or 2), if a user does not specify a
schema name to connect to when accessing a database, they will be connected to a default schema.
IPDA allows for two default connection behaviors: Connect to the database default schema or connect to the
schema which matches the user name. The enable_user_schema variable, set in the
/nz/data/postgresql.conf, controls which of these two options will be used by IPDA.
enable_user_schema = FALSE, 0 or not defined results in a connection to the database default schema.
enable_user_schema = TRUE, 1 results in a connection to a schema named for the user. In the event a schema named for the user does not exist it will be created and the user will own the schema and be placed in it.
Netezza Performance Server 9 September 9, 2013
8
5.1. Connect to the default Database Schema
When you create a database, IPDA automatically creates a schema for that database. The name of this schema is
the name of the owner who created the database. This is known as the database default schema and each database
has one. For example, if user user1 creates a database sales1, there will be a default schema named user1 in the
sales1 database. You cannot drop the default schema in a database, but you can change the default schema using
the ALTER DATABASE command and then drop the non-default schema. For example:
sales.myschema(admin)=> ALTER DATABASE sales SET DEFAULT SCHEMA schema2;
sales.myschema(admin)=> SET SCHEMA schema2;
sales.schema2(admin)=> DROP SCHEMA myschema;
Note that in 7.0.3, the nzsql environment prompt has changed to include the schema name to which you are
currently connected. It has the format database-name.schema-name(user-name)=>. The above and earlier
examples reflect this change in the prompt.
You must own the database or have alter database privilege on the database to change the default database
schema.
Care should be taken whenever there is a need to change the default schema because of the following:
• All users who can access the database will now automatically have access to the new default schema;
however, they may not have access to the previous default schema!
• If a user requires access to the previous default schema, that user must now be explicitly granted
access to that schema and must specify the schema in their SQL.
• Any SQL that uses the Netezza database dot dot table notation will use the database default schema
when doing a cross database access, so changing the default result in SQL Errors “table not found”.
5.2. Connect to the default User Schema
If the enable_user_schema variable is set to TRUE, and a user does not specify a schema when connecting to a
database, IPDA will connect that user to a schema that matches their user name. This is similar to how many
other DBMS work under these circumstances.
If a schema name matching the user name does not exist in that database, IPDA will automatically create that schema
in that database. For example, if user user1 connects a database named mydb, and there is no schema name user1
in mydb, a new schema named user1 will automatically be created in mydb and that user will be connected to
that schema.
Note that when enable_user_schema variable is set to TRUE, users will not be connected to the database default
schema, unless it matches the user name.
5.3. Connect to a specific User Schema (with Multiple Schema support Enabled)
Starting with release 7.0.3 multi-schema enabled, you can specify the schema to which you want to connect when
using the command line tools. For example, you can enter the following nzsql command to connect to the
schema USER1 in the SALES database:
nzsql -d sales –schema user2 –u user1 -pw 123456
If you wish to avoid the –schema option with nzsql, you can set the environment variable NZ_SCHEMA to the
specific schema prior to issuing the nzsql command.
Netezza Performance Server 9 September 9, 2013
9
With multiple schema enabled connected sessions can use the SET CATALOG and SET SCHEMA commands to
change to a different database and optionally schema and change to a different schema in that current database
respectively. For example:
sales.user1(user1)=> SET CATALOG orders;
orders.user1(user1)=> SET schema user2;
orders.user2(user1)=> SET schema orders.user1
orders.user1(user1)=> SET schema sales.user1
sales.user1(user1)=> SET CATALOG orders;
orders.user1(user1)=>
If IPDA multiple schemas support is not enabled, -schema is ignored. Also, if you do not specify -schema, the
environment variable NZ_SCHEMA is used; otherwise, either the schema that matches the user name, if
enable_user_schema = TRUE is set, or the database default schema for the database, if enable_user_schema =
FALSE or not set.
There is no way to specify an initial schema other than the default when using ODBC, JDBC, OleDB , so the set
schema command needs to be used.
If you have multiple schemas in a database, as a best practice, you should always use the SET SCHEMA
command to connect to the schema that contains the objects that you want to describe. This can help avoid any
confusion or problems later.
As a best practice enable_user_schema = TRUE should be used when multiple schema support is enabled.
6. Enabling Multiple Schema support on Existing Systems
Current customers who want to enable multiple schema support on exiting system need to be aware of some
existing behaviors.
Releases 4 , 5, 6, and 7 drivers pre 7.0.3 report the object owner rather than schema by default, to make
identifying the tables owner easier. As a result, BI reports and ETL flows may well contain SQL that specified the
table owner as the schema. In single schema mode schema was ignored so the object was always found.
When multiple schema support is enabled all the SQL object are kept in the default schema named for the DBMS
owner and the SQL sent by BI and ETL tools will likely contain schema’s that designate the table owner as
schema. Often the table owner will not match the dbms owner and in these cases with a value of the SQL sent
will likely fail because the table is not found under the schema, assuming the schema exist. If
enable_user_schema is disabled and enable_schema_dbo_check = 1, there will likely not be a schema named
for the table owner, so the server will look under the default schema and find the table; it will then return the
data and issue an SQL warning about the invalid schema name and continue processing.
However as schema's are created, if and they are named for “user”, then at some point the BI, ETL object will
issue SQL that contains a valid schema at which point the server will not find the table under the schema and it
will return an error. To avoid this issue from occurring, all the objects in the database need to be moved into
schemas named for the object owners. This must be done for all databases on the system.
Below is an example script to create schemas and move objects.
Netezza Performance Server 9 September 9, 2013
10
====================================================================================================
#/bin/sh
#
# This script is provided as-is it does not cover all object types or all cases. It is provided as a starting point. @IBM 2013
#
if [ "${NZ_USER}x" != "adminx" ]; then
echo "This script must be executed by user admin. ...export NZ_USER=admin"
exit
fi
if [ "`nzsql system admin -Atc "show enable_schema_dbo_check" 2>&1 | head -1`" = "NOTICE: ENABLE_SCHEMA_DBO_CHECK is 0" ];
then
echo "Exiting! Multiple schmea mode disabled: restart system afere setting ENABLE_SCHEMA_DBO_CHECK 2 "
exit
fi
if [ "`nzsql system admin -Atc "show enable_user_schema" 2>&1 | head -1`" = "NOTICE: ENABLE_USER_SCHEMA is off" ]; then
echo "*
*
* Warning! ENABLE_USER_SCHEMA is off.
*
* Hit Control C to exit, wait to continue
*
*"
sleep 10
fi
# PROCESS ALL DATABASES
for db in `nzsql -lAt | cut -d '|' -f1 | grep -iv system`
do
# get all the elements we need to work with: tables views procedures ....
objlist=`nzsql \"$db\" admin -Atc 'select distinct class from _v_objs_owned where database = current_catalog and class != ^SCHEMA^'`
# don't process objects in the default schema
dfschem=`nzsql \"$db\" admin -Atc 'select defschema from _v_database where database = current_catalog'`
for ownr in `nzsql \"$db\" admin -Atc " select distinct owner from _V_OBJS_OWNED where database = current_catalog and class !=
^SCHEMA^ and owner != ^$dfschem^"`
do
nzsql \"$db\" admin -c "create schema $ownr"
for cls in $objlist
do
for obj in `nzsql \"$db\" admin -Atc " select distinct objname from _V_OBJS_OWNED where database = current_catalog and class
= ^${cls}^ and owner = ^$ownr^ and schema != ^$ownr^"`
do
# echo "owner: $ownr Class: $cls Object: $obj"
nzsql \"$db\" -eAtc "alter ${cls} \"$obj\" rename to \"${obj}x\"; alter ${cls} \"${obj}x\" rename to $ownr.\"${obj}\""
done
done
done
done
====================================================================================================
Netezza Performance Server 9 September 9, 2013
11
7. Schemas
Beginning with IPDA release 7.0.3, when you create a database, initialize IPDA for the first time , or upgrade to
7.0.3 two internal or system defined, schemas along with objects are automatically created in every database:
• INFORMATION_SCHEMA (system schema with internal views)
• DEFINITION_SCHEMA (system schema with all system objects as seen in previous releases)
In addition, a schema that matches the user name of the database creator (default schema) must also exist, or will
be created. Therefore, there will always be at least 2 system and 1 user schemas. Previous IPDA releases only had
1 schema.
Any user who can access the database automatically has access to the database default schema if IPDA is
configured with enable_user_schema = FALSE. Otherwise, users can only access a schema name that matches
their user name, wherein user objects can be created.
With multiple schema support, IPDA supports cross-schema write actions for schemas that reside in the same
database. For example:
sales.user1(user1)=> CREATE SEQUENCE myseq.seq1
sales.user1(user1)=> CREATE TABLE orders AS SELECT * FROM user2.orders;
sales.user1(user1)=> INSERT INTO user2.sales SELECT * FROM sales;
However, cross-database writes are still not supported in 7.0.3. You therefore cannot have a transaction that spans
multiple databases.
8. Managing Multiple Schemas
You, you can create, alter, set and drop schemas within the databases, if multiple schema support is enabled. You
must have the necessary privileges to execute the schema commands below.
8.1. Creating Schemas
To create a schema, use the new SQL standard CREATE SCHEMA command.
sales.user1(user1)=> CREATE SCHEMA myschema;
You automatically become the authorization user (owner) of the new schema.
The CREATE SCHEMA command also includes options to create tables and views, grant privileges, set user
authorizations, and search specific locations to resolve unqualified routine names. For example, the following
statement creates a new schema named s2, specifies the schema name s2 to resolve unqualified routine names,
set the owner of the schema to user1, creates the table tab1 within the schema s2, and grants the SELECT
privilege to user user2 on the new table tab1:
CREATE SCHEMA s2 AUTHORIZATION user1 PATH 's2' CREATE TABLE tab1
(c1 int, c2 int, c3 varchar(10)) GRANT SELECT ON tab1 TO user2;
Note that if any one of the SQL commands in the statement above fails, the CREATE SCHEMA command fails
and everything is rolled back.
Netezza Performance Server 9 September 9, 2013
12
8.2. Changing Schemas.
Creating a schema does not change your current schema by default. You must change your current schema using
the SET SCHEMA statement as follows:
sales.user1(user1)=> SET SCHEMA user2;
sales.user2(user1)=>
If the schema does not exist, the above command will display an error.
You have the option to specify the database name and change to it if the schema is in a different database. If you
specify database-name.schema-name to connect to a schema in a different database, the command implicitly executes
a SET CATALOG command to change to the database. If you include a database name and your in a transaction
the set schema statement will fail because you cannot change databases within a transaction.
Important: You cannot change both the database and schema using SET CATALOG - however, you can change
both with the SET SCHEMA command using the two-level naming convention (database-name.schema-name)!
8.3. Altering Schemas
You can use the new ALTER SCHEMA statement to alter a non-systems-defined schema and change its name, its
owner, or the contents of the schema path. For example:
• Change the name of the schema from user1 to user2:
sales.user1(user1)=> ALTER SCHEMA user1 RENAME TO user3;
• Change the owner of the schema from user1 to user2:
sales.user1(user1)=> ALTER SCHEMA user1 AUTHORIZATION TO user2;
• Set the schema path for schema user1 to search for objects within several other schema names:
sales.user1(user1)=> ALTER SCHEMA user1 SET PATH
'user2, user2, user3.user4';
The schema path specifies a comma-separated list of schemas that define a search path for routines such as user-
defined functions, aggregates, libraries, and stored procedures.
You cannot alter a schema if there are sessions that are connected to that schema. Also, you cannot use the ALTER
SCHEMA command to move the schema to a different database.
8.4. Display Database Schema
You can use the new SHOW SCHEMA command to list all of the schemas defined in the current database. For
example: sales.user1(user1)=> SHOW SCHEMA;
Schema owners can only see information for their schemas and any schemas that they are privileged to see. The
admin user or the database owner can see all schemas in the database, including the system-defined schemas.
8.5. Dropping Schemas
If you are logged in as the admin user or are the schema/database owner, you can drop a schema using the new
DROP SCHEMA statement.
You must specify either CASCADE (drop all objects within the schema) or RESTRICT (do not drop the schema if
it contains objects) when dropping a schema. For example:
sales.admin(admin)=> DROP SCHEMA user1 CASCADE;
Netezza Performance Server 9 September 9, 2013
13
You cannot drop the default schema , the internal schema or a schema that is currently being accessed by other
users.
Note again that when you drop a schema, you drop all the objects defined within that schema! Therefore, as a
best practice, always use the RESTRICT option to avoid the accidental drop of important objects within a schema.
9. Abbreviated Database Object Naming Notation
In IPDA the terms catalog and database are interchangeable, so the function current_catalog returns the
name of the current database.
In release 3.1, Netezza introduced cross database access and with it a shortcut notation known as DOT DOT or
double dot. This syntax made SQL statements easier to write and took advantage of the fact that Netezza had
only 1 schema per database. By design the missing schema name is filled in with the current_schema for current
database references and by the database default schema for non-current database references. In single schema
mode these are always the same. In multi-schema mode the values can be different and the behavior can change.
The following examples will help make the choice clear.
Database Default Schema
------------ ---------------------
SALES user1
ORDERS user2
A user has the current_schema set to production.
The user first executes the following command:
sales.user1(user1)=> SELECT * from SALES..CUSTOMER;
The above statement will translate into:
SELECT * from SALES.PRODUCTION.CUSTOMER.
Notice that the filled in schema value above is the current_schema! So the user will be referencing the object
CUSTOMER in the production schema of the SALES database!
The user next executes the following command:
sales.user1(user1)=> SELECT * from ORDERS..CUSTOMER;
The above statement will translate into:
SELECT * from ORDERS.USER2.CUSTOMER.
Notice now that the filled in schema is the default for the ORDERS database. So the user will actually be
referencing the object CUSTOMER in the user2 schema of the ORDERS database!
Therefore, to avoid unexpected or unintentional consequences, effective with release 7.0.3 we recommend that the
use of the DOT DOT notation be abandoned to avoid unexpected query behavior.
Netezza Performance Server 9 September 9, 2013
14
10. Security
By default, the admin user has full access to all databases, schemas, and objects. With multiple schemas in 7.0.3,
the schema owner automatically has full access to all objects within the schema - even if those objects were not
created by that owner. In essence, the schema owner becomes the superuser for that schema!
However, the schema owner cannot grant object or administrative privileges within the schema unless the user is
explicitly granted privileges to do so.
When multiple schemas are enabled, a user who is granted access to a database automatically inherits access to
the default schema of that database. The only exception is when enable_user_schema is set to TRUE. In that
case, the user always connects to a schema that matches their user name, not the database default schema, if a
schema name is not specified. However, if necessary, a user can set their schema to the default or access the default
in an SQL statement.
Be aware that if you change the default schema for a database, the users automatically inherits access to the new
default schema and may loses access to the previous/original default schema. If a user requires access to the
previous default schema, that user must now be explicitly granted access to that schema.
10.1. GRANT/REVOKE Privileges
The GRANT and REVOKE SQL commands now include multiple schema support.
You can use a fully qualified object notation (database.schema.object) with GRANT/REVOKE statements to set the
scope of object privileges from any database:
• database is a specific database name or the keyword ALL to grant the privilege across all databases.
• schema is a specific schema name or the keyword ALL for all schemas in the specified database value.
• object is an object class such as TABLE, VIEW, and so on, or a specific object name.
Important: The keyword ALL applies the privilege to all the databases, or to all schemas in the specified database
if that keyword is specified at the schema level. It also means the privilege applies to all current and future
databases/schemas! If you do not specify a schema, the scope is the current schema. As a best practice, you should
always specify a specific database and schema to restrict the privilege.
You can grant privileges on an object to a user or a group using one of the qualified object notations. For example,
grant the SELECT privilege to user user3 on the table CUSTOMER, which is located in schema user2:
sales.user1(user1)=> GRANT SELECT ON user2.customer TO user3
For the above type of privilege, the object must exist, and this privilege overrides any other defined privilege.
Since multiple schema support allows objects with the same name to reside in different schemas, the GRANT
statement now includes a TYPE clause to specify an object type such as DATABASE, SCHEMA, USER, or
GROUP. For example, to grant the SELECT privilege to all objects in the schema user1 to user user2:
sales.user1(user1)=> GRANT SELECT ON user1.customer TYPE SCHEMA TO user2
IPDA would resolve user1.customer in the above statement to the customer table only if the SCHEMA
keyword was not specified. The TYPE option can therefore help to clarify the object in cases where different
objects could have the same name, and is strongly recommended to avoid unintentional privilege grants.
Netezza Performance Server 9 September 9, 2013
15
A new CREATE SCHEMA admin privilege has also been introduced to allow users to create schemas in a
database. This, and other admin privileges can be used to define the privilege scope. Previously, when you
granted a privilege in the SYSTEM database, it automatically became a global privilege – it applied to all
databases. Now, you can set the privilege scope as before by specifying the IN database.schema clause from
any database. For example, the following command grants user user1 the Create Schema privilege in all databases
and schemas:
sales.user1(user1)=> GRANT CREATE SCHEMA IN ALL.ALL TO user1;
The power of this is that with very few statements you can grant specific privileges to everyone; then, if need be,
you can override the privileges on a specific user/schema/database basis.
You can use the REVOKE statement to remove access privileges for a user, group, or all users. The range of
syntax supported is similar to that of the GRANT statement, including the examples above. However, as a best
practice, when revoking privileges, you should sign on to the same database and schema where you granted the
privileges, or use the fully qualified name forms that match the locations in which you granted the privileges.
11. Application Considerations
11.1. Migrating from other DBMS to IPDA
For users/applications migrating to IPDA from other DBMS which support multiple schemas, there are some
differences in the behavior that must be accounted for.
IPDA automatically creates a schema in a database, if one does not exists, when multi-schema mode is enabled
and the variable enable_user_schema is set to TRUE. This helps avoid object name collisions. For example, this
feature is useful when multiple users are testing one script which executes DDL and DML statements. Since each
invocation of the script will be in the schema of the executing user, multiple users can use the same script without
any changes. IPDA will create a schema for each user, if one does not exist, and since all tables, views, or other
database objects will be created in the schema of each user, there will not be any conflicts as long as all object are
unique in that schema.
Note that IPDA is an appliance and therefore does not have an option to place certain data in specific storage
spaces/media. You therefore cannot place individual tables in different storage locations. All table records are
distributed across the pre-defined data slices of the system. However, IPDA does distribute rows with the same
distribution key value to the same data slice. Because IPDA does not support the specification of storage locations
at the object level, you cannot use schemas to distribute data, as is possible in other DBMS.
Applications migrating from other DBMS which do not support multiple schema, but wish to take advantage of
this feature, can do so relatively easily. You can logically organize the tables into different schemas. Then, to
avoid significant application changes, you can use synonyms to reference tables, in the same or different schema.
In many environments, there is a need to separate data amongst various user groups, even though the set of
tables and their definitions are the same. For example, development, quality assurance, and production would all
normally use the same table definition but different data sets. The production data and tables are considered
critical and would normally be in a separate database. Often, for cost reasons, the development and quality
assurance databases would be on the same IPDA appliance also. This containment of functionally within separate
databases ensures no accidental overlaps, thereby avoiding critical situations.
However, the overhead in maintaining multiple databases is generally more than managing multiple schemas.
Further, unlike DB2 and some other DBMS, IPDA can only be configured at the system level and not the database
Netezza Performance Server 9 September 9, 2013
16
level. Therefore, it is recommended that when migrating from other DBMS wherein many databases were
created to divide responsibilities, you should consider creating multiple schemas versus multiple databases to
reduce the management overhead. Code changes should not be required if objects were referenced using the
object-only freeform notation, since each group would still be isolated within their own schema. However, you
now would now have the added capability to perform cross-schema writes if necessary.
Note that it is not uncommon for the development environment to have a different version (definition) of some
tables to that of production. As long as developers can only access objects in their respective schema, the different
table versions will not cause a problem. It would be up to the DBA to formally deploy the different object
versions between the development, quality assurance, and production schemas.
In environments where information for many clients is hosted in separate databases, the IPDA DBA can instead
create multiple schemas, one for each client, and then assign owners for each schema – including the default
schema. Then, if any additional permissions need to be assigned, it can be done on an individual schema basis.
This way, the DBA can quickly setup an environment that already has the default permissions discussed earlier,
which are natural to multiple schema in release 7.0.3.
Important: Like Oracle and some other DBMS, switching schemas is easier than switching databases; therefore,
multiple schemas provide more flexibility. However, note that if the database becomes inaccessible, so do all
schemas in that database! Multiple schemas offer simplicity and flexibility, but at a cost of reduced availability.
As with many other DBMS, schemas provide the flexibility to add/remove users/applications easily. You can
perform the following functions when working with schemas:
• Change ownership of schema using the ALTER command
• Allow multiple database users to share a single schema
• Setup a single schema that contains objects owned by multiple database users
• Objects can be moved between schemas using the ALTER SCHEMA command, but not databases
• Permissions can be setup at the schema level instead of the object level
11.2. Determining the Current Schema
You can use one or more SET SCHEMA statements within a transaction or stored procedure . However, you will
fall back to the schema you were originally connected to once the transaction or stored procedure completes.
After executing one or more SET SCHEMA statements within a transaction or stored procedure, you can use the
new current_tx_schema built-in function to return the value of the current schema.
Note that the existing function current_schema returns the name of the schema a user is currently connected to
and working within. The difference between current_schema and current_tx_schema is that
current_tx_schema can only be used within a transaction or stored procedure, whereas current_schema
only works outside a stored procedure or transaction. Outside of their respective areas, these functions will return
null/incorrect values.
Both functions allow the user/application to query their current environment – as has been the case with
current_catalog when working with databases in previous IPDA releases. For example:
• Find the current schema within a transaction:
sales.user1(user1)=> SELECT CURRENT_TX_SCHEMA;
• Find the current schema outside a transaction:
sales.user1(user1)=> SELECT CURRENT_SCHEMA;
Netezza Performance Server 9 September 9, 2013
17
As a best practice, when working in multiple schemas, you should always check your current schema prior to
performing any major activities (loads, deletes, updates, etc). This ensures that DBMS tasks will be executed in
the correct environment; otherwise, serious and unexpected problems could result!
11.3. Dynamically Changing Databases
You cannot use the SET CATALOG statement inside of a transaction to change the current database.
11.4. Resolving SQL Statement Database Objects
When you create or alter a schema, you have the option to specify the Schema Path in the definition. This path is a
list of schemas to search in order to resolve the location of a particular object referenced in an SQL statement.
For example, to resolve a stored procedure name, IPDA will first look in current schema, then in the schema
search path, followed by the 2 internal schemas. As soon as a procedure of the same name is found in a schema,
the search will stop and that procedure will be used – irrespective if it is the correct procedure or not!
Because of this, the order specified in the schema path can become very important! Further, if you change your
schema using SET SCHEMA, you could have a completely different search path in the schema you transitioned
to. It is therefore crucial that the user/application understand these changes and account for them. Searching for
objects in an ordered fashion can not only affect performance, but it could result in the use of the wrong object,
leading to unexpected issues and incorrect results!
Be aware that the search_path variable can also be defined in postgresql.conf to resolve database objects. This
variable specifies the order in which schemas are searched when an object is referenced by a simple name with no
schema component. The paths specified by this variable are appended to every schema path of all defined
schemas. Therefore, if you have a common set of schema that you would like to be searched for every schema that
has been (will be) created, you can set this variable accordingly.
You can use wildcards such as current_user, current_database, and current_schema inside the
search path. Searches can therefore be dynamic!
Important: search_path does not change your current schema even if the object is found in a different schema.
12. Application Usage Types
With the introduction of multiple schemas in release 7.0.3, a number of intriguing possibilities arise in the areas of
database design, application processing, and security.
You can take advantage of the major benefits of using multiple schemas, including:
• Easier Administration, by organizing objects into logical groupings and managing them at the object
level, versus on a individual basis
• Effective Sharing, by grouping commonly used objects in a different schema and allowing the
respective users to access them from other schemas
• Enhanced Security, because security can now be maintained at the schema level
• Better Containment, because users in one database but in different schemas cannot interfere with
each other, resulting in fewer collisions
There are many application/user types spread out across many business environments which can benefit from
multiple schemas. Schemas can be used for duplication of object definitions and data, QA testing, developer
Netezza Performance Server 9 September 9, 2013
18
playgrounds, keeping copies of other databases locally, increasing security, logical database design, user/
application containment, and even to support different application versions.
Below are listed some of the major business environments/functions, and for each, a summary of how IPDA users
in each environment can benefit in the use of multiple schemas.
12.1. ETL / ELT
In most ETL environments, data is continuously sent from various data sources to a Enterprise Data Warehouse
or Data Mart(s). However, before the data can be loaded into the production warehouse tables, the data must be
“transformed” (scrubbed) prior to loading.
Data Flows, together with Control Flows, can move and scrub the data in the source environment, in temporary
directories or other staging areas, or within the production warehouse database environment. If the scrubbing is
performed within the data warehouse database, like IPDA, it is best practice to create a new schema to perform
these transformations. After the data is scrubbed in the non-production schema tables, it can be loaded into the
warehouse production schema tables (fact and dimension).
If multiple users concurrently scrub the data received from different data sources, each user works on a subset of
the source data. Since many transformation scripts create temp (staging) tables to load and scrub the data, if the
enable_user_schema variable is set to TRUE, table creation and scrubbing will automatically occur in the
schema of each user - prior to loading into production schema tables. This will help avoid conflicts or accidental
deletion of records, and even catastrophic drops of production tables.
You also have the option to create schemas for each data source to be transformed, allowing scrubbing activities
to be independent and concurrent.
You could even consider creating a schema for each type of ETL/production table. For example, you could create
separate schemas for the fact, dimension, staging, temp, and other tables. The benefit of this is that the schema
would clearly identify the purpose of each set of tables, the users that need permissions for each schema, and help
with troubleshooting activities.
All the above multiple schema scenarios would work because cross-schema writes are supported in release 7.0.3!
Note that you will need to GRANT the INSERT privilege on the production tables to all users involved in the ETL
process. For example, using the new syntax of the GRANT option, the following will grant user user1 INSERT
privilege to all tables in schema prodschema:
dw.prodschema(user1)=> GRANT INSERT ON user1.fact_tbl TYPE SCHEMA TO user1
However, as a best practice, you should ensure that those same users do not have other privileges on tables in the
production schema.
12.2. System Administration
Typically the system administrator does not have access to databases and their data . However, certain
institutions, such as banking, require a division of responsibilities because of government or other regulations.
Because of this, there could be some privileged and sensitive data stored in specific database tables. Division of
responsibilities requires that not even the DBA should have access to this data. It is therefore essential that these
tables are stored in the admin or an admin-only accessible schema, with no other data. Since the user admin has
full privileges and access to all database objects, you must use this account for only those individuals that require
access to that extremely sensitive data.
Netezza Performance Server 9 September 9, 2013
19
In these environments the DBA should initially use the admin account to create some IPDA users and groups,
including another administrative-level account to perform tasks such as user management, database
maintenance, and object creation and management. For example, to create an admin_users group that provides
object and administrative level privileges similar to the admin user:
1) Connect to the SYSTEM database as the admin user:
nzsql -d system -u admin -pw password
2) Create a group for your administrative users:
system.admin(admin)=> CREATE GROUP admin_users
3) Grant the group all administrative permissions:
system.admin(admin)=> GRANT ALL ADMIN TO admin_users WITH GRANT OPTION;
4) Grant the group all object permissions EXCEPT schema:
system.admin(admin)=> GRANT ALL ON DATABASE, GROUP, SEQUENCE, SYNONYM, TABLE,
EXTERNAL TABLE, FUNCTION, AGGREGATE, USER, VIEW,
PROCEDURE, LIBRARY TO admin_users WITH GRANT OPTION;
5) Add users to the group to grant them the permissions of the group:
system.admin(admin)=> ALTER GROUP admin_users WITH USER adm_user1, adm_user2
Then, once multi-schema mode has been enabled and the enable_user_schema variable set to TRUE, the DBA
should then relinquish the admin user/password account to the respective individual(s) who has a need to view
sensitive data. That privileged user can then setup an environment to house the sensitive data in an admin-only
accessible schema.
Although the above is not the most efficient solution, it does allow for the separation of responsibilities.
12.3. Database Administration
Schemas allow a database administrator (DBA) to separate database users from database object owners. The DBA
has more flexibility to protect sensitive objects in the database, and also to group logical entities together.
If you do not enable multi-scheme mode, you will default to a pre-7.0.3 release IPDA environment – a single
schema system. This is essentially a public database schema, which does not offer grouping of objects,
containment for abnormal events, nor better security. Single schema environments are recommended when there
are very few users that cooperate amongst each other, and when there is not a need to sensitize data between the
users.
As mentioned earlier, it is recommended that new IPDA release 7.0.3 installations enable multiple schema
support using the variable enable_schema_dbo_check, located in the /nz/data/postgresql.conf file. This variable is
a system level variable implying multi-schema support is enabled for the whole IPDA system – including all
databases. It does not apply on a per database level. You should also set the variable enable_user_schema to
TRUE, allowing IPDA to automatically create schema names that match user names, if it does not already exist.
Through schemas, a DBA can control access to critical objects that would otherwise be open to intentional or
unintentional, and potentially harmful, changes by users. Schemas are often considered a safety net and should
be viewed as so as a best practice. For example, scripts are often used in many database environments to perform
varied functions – including unload/load data, databases maintenance, and periodic deployments. Users allowed
to execute these scripts to accomplish these functions should have limited permissions, often only inside their
schema, so that their actions are contained. This helps prevent any problems from affecting other users/data.
Netezza Performance Server 9 September 9, 2013
20
As a best practice, you should setup shared database objects in their own schema. Shared objects can include
tables, views, stored procedures, functions. third party objects, etc. You will need to setup the necessary GRANTs
to allow users working in other schemas to access objects in the shared schema. Users accessing objects in the
shared schema will at a minimum need to specify the schema.object notation – this may require code changes, or,
more simply, include the shared schema name in their schema path or the global search_path variable.
Note that the IPDA database group named public includes all database users – they inherit all of that group’s
privileges. Therefore, once you adopt multiple schemas, if you had setup a default set of permissions for all IPDA
user accounts in a previous release, you will need to revisit that and determine if there would be any security
violations in your new multiple schema environment. You may have to revoke access to some or all users so that
they are constrained to their individual schemas.
Views can also be created in separate schemas to hide sensitive, or unnecessary, information that located in the
base tables. The base tables could be in the same or different schema, but the view creator must have access to
those tables - GRANT SELECT on table TO view_creator. Shared views that provide access to the non-
sensitive data can be very useful and easily managed. Since the underlying tables are never exposed, this setup
provides another layer of abstraction and thereby increases security.
Restricting third party or connection pooling applications to one schema using the application id is considered a
best practice. Also restricting users to different schemas based on their job title/function is another best practice
that should be considered. Both isolate the application/functions from other user activities and avoid unexpected
or catastrophic consequences.
12.4. Development
One important usages of multiple schemas in a development environment is in the form of object versioning. For
example, many developers could create their own version of an object and work on it. Each developer can first
“pull” a related set of objects from a shared schema into their own schema. They can modify their own version of
the object(s) and, when they have completed testing, etc., they can “put” the new version of the object back into
the shared schema using an established standard. This setup ensures that the original shared object definitions are
always visible to others for use, and also allows everyone to immediately see changes once object definitions have
been updated in the shared schema.
Setting the schema path and the search_path variable would allow the above functionality to work seamlessly.
Also, since each individual developer is normally isolated in most development environments, the above should
not result in any conflicts. Developers usually work on a specific area of code/application; therefore, the
likelihood of two developers working the same object is often minimal. Further, because a “put” of a new version
of an object into the shared schema would be controlled either through security (ALTER privilege) or some
internal standard, it is unlikely two developers will update the original schema at the same time.
Many development environments work using a modularized model. Creating separate schemas for each module
and ensuring developers work in their respective modules as a “team” could help avoid collisions/accidental
“changes”. Also, this would more clearly identify the objects used in each module. Any objects shared between
modules could be setup in a shared schema with the necessary privileges.
As discussed earlier, creating a schema for each team – developer, testing, QA, production – and ensuring
members of each team can only access their respective schema, can lead to efficient software development.
However, as the application/changes moves through the various software lifecycle stages, only a single individual
should have the authority to “update” the objects in each team’s schema. Further, creating a backup schema at
each stage would help facilitate “rollbacks”.
Netezza Performance Server 9 September 9, 2013
21
Note that when developing applications, the number of objects within databases can easily run into hundreds, if
not thousands. It is under these circumstances wherein multiple schemas prove extremely beneficial. They help
organize the objects logically, provide the necessary security, and avoid possible problems; their benefits
outweigh their costs.
12.5. Business Intelligence
Business Intelligence (BI) applications and tools mine data within EDWs/Marts to obtain insights into what has
happened within an organization, and also what could be ahead. Since most ETL can be found in these same
environments, it is important to segregate the ETL and BI activities.
Creating multiple schemas for ETL activity is recommended, but also separate schemas for BI tools/applications.
For example, simple reporting applications could be in their own schema, whilst applications that heavily mine
the data to predict future activity/events could be in another schema. The key is that any non-ETL activity should
be segregated so as to not interfere with each other.
Note that it is not necessarily the data that should be segregated into multiple schemas, if practical, but especially
the objects, such as stored procedures and functions, which are directly called/used by the BI tools.
Creating shared and private views, in their respective shcemas can also help segregate information for reports,
data mining, and/or loads. The use of views within multiple schemas can further enhance security where
exposure of underlying data needs to be restricted based on role.
Finally, warehouse environments that support ad-hoc mining could benefit using multiple schemas too. You
could setup a shared schema where data is simply read by all like users; however, any temp objects that are
created to support the analysis can be in the schema of the user. This avoids conflicts and allows separation of the
permanent tables/objects with the “temp” objects that may be created as part of the mining process.
12.6. Scientific/Labs/Class
Many scientific environments usually work with a suite of application packages to perform their analysis, and
record their observations/test results into a DBMS. There often are many scientists and assistants recording data.
Each would normally work in their own database environment but have access to a common set of the public
tables.
When these IPDA environments are upgraded to release 7.0.3, they should remain in compatibility mode, so that
they continue to perform their functions as before. However, if the users upload information using their own ids,
they could take advantage of multiple schemas, wherein they would simplify database management by recording
each test/observation in their own schema. This also provides them the flexibility to update/modify/delete data in
their own environments, without risk of affecting others.
Another scenario where multiple schemas would be useful is in the classroom environment. When students are
following the same lab/class exercise, they would normally create the same objects. In previous IPDA versions,
they would have to create separate databases to avoid conflicts. However, with multiple schemas in release 7.0.3,
the DBA/ADMIN could simply set enable_user_schema = TRUE, which would allow each student work to in
his/her schema without any conflicts.
12.7. HA/Replication
High-Availability (HA) replication environments often move data between a production environment to one or
more secondary (backup) servers. Multiple schemas can be used on the primary/secondary servers to differentiate
the data that is sent and received. to serve multiple business purposes.
Netezza Performance Server 9 September 9, 2013
22
For example, you could create a production and a non-production schema in a database on the primary, both
with the same tables. In real-time, you could replicate/move the transactions from the primary schema to the non-
production schema. This provides the first level of high-availability. The non-production schema can then
replicate the data asynchronously to a HA server. Best of all, this non-production schema can serve as an
operational data store which provides real-time mining of the data. This prevents conflicts in reading and
replicating the data, and isolates the production schema from unexpected problems.
The data that is replicated to the HA server(s) can be received and moved into several schemas for
transformational, data mart, or other activities.
It is important to note that even though the IPDA appliance is a warehouse appliance, there is an increased
urgency to report/analyze the data; in effect, DSS environments are becoming comparable to OLTP in their
importance, availability, and concurrency.
13. Backup and Restore (BAR)
With the introduction of multiple schemas in Release 7.0.3, the nzbackup and nzrestore commands have
changed.
Now, when you backup a database using the nzbackup command, the backup includes the data for all the
schemas in the database. However, backup of specific schema objects are not supported.
When you restore a database that includes multiple schemas, nzrestore will restore all of the schemas unless you
are restoring to a system that does not support schemas. In this case, the restore process attempts to restore all the
objects in the single, default schema for a database. If there are name conflicts, such as tables in different schemas
that use the same name, the restore reports an error.
Conversely, if you are restoring a database created on a system that did not use multiple schemas to a system that
does support multiple schemas, the restore should complete without error. The restore creates a schema that
matches the owner name of the database in the backup, and restores the objects to that new schema.
Any stored procedures that you create in any schema are backed up and restored by the IPDA backup and restore
operations. As a best practice, you should keep backup copies of your source CREATE OR REPLACE
PROCEDURE definitions in a safe location outside of the IPDA system. Make sure that you have recent backups
of your IPDA systems in the event that you need to recover from an accidental deletion or change, or to restore
IPDA services as part of a disaster recovery situation.
Note that as of the IPDA 7.0.3 release, schema level backup and restores are not supported.
14. Upgrade / Downgrade
14.1. Upgrade
To support compatibility mode, the existing postgresql.conf settings, catalog tables, and views do not change when
upgrading to IPDA release 7.0.3 from a previous release.
However, it is recommended to ensure that multiple schema support is disabled prior to the upgrade – set the
variable enable_schema_dbo_check is set to 0. This will avoid any initial issues/conflicts related to multiple
schema support in release 7.0.3, and allow you time to plan and test your applications with a phased approach as
discussed earlier.
Netezza Performance Server 9 September 9, 2013
23
As part of the upgrade process to release 7.0.3, a parent record is created for functions and aggregates. A default
schema is also created for every user database during upgrade. This schema will have the same name as the
database owner – exactly as in prior versions. However, this default schema is now an explicit object now.
If you changed the owner of the database in the prior version, the default schema name would reference the new
owner! You can see the name of the new owner in SHOW schema command. However, if you change the owner
of the database after the upgrade, name of the default schema does not change! In release 7.0.3, you can change
the name of the schema and the owner as separate entireties, but these are two different operations – one of which
does not affect the default schemas.
14.2. Downgrade
Once you create multiple schemas in a database, in release 7.0.3, you cannot automatically downgrade to a
previous release.
You must undo all your changes in release 7.0.3 to return to a single schema environment. You can either drop all
the new schemas except the default schema, which will drop all the objects, or move all objects that are in the
new schemas into the default schema and then drop the schemas.
You can use the ALTER TABLE statement to move a table from the current schema to a new schema. For
example, to move table CUSTOMER from the schema user1 to schema user2, you can execute the following
statement:
sales.user1(user1)=>ALTER TABLE customer RENAME TO user2.customer
Be aware that you cannot change the database a table exists in! Use the CREATE TABLE AS … statement in the
new database environment to move the table over. This is similar to what has been done in previous release.
Also, you must undo changes to any postgresql.conf settings made in release 7.0.3. This includes any changes to the
enable_schema_dbo_check variable. The view version of client drivers must also be reverted - if changed. All
these are manual activities and not part of the downgrade process.
Note that the two internal schemas are automatically dropped during the downgrade process.
15. Catalog/Control Tables and other Database Objects
Release 7.0.3 introduces many updates to existing tables and views to support fully qualified names, multiple
schemas, and other improvements.
Important: Unless multiple schema support is enabled, you will not see any changes to catalog tables and views.
This is to facilitate seamless backward compatibility after immediately after upgrading to IPDA release 7.0.3.
The _t_object table now has a new schema column. The table _t_database includes a column to store the
default schema.
In IPDA, all object references by name are resolved against the object table - _t_object. This table contains the
object id (oid), which can be used to join with other catalog tables to get specific details for that object. Since the
object table now also holds the schema oid, this will allow you to find more information on schemas when joining
with other catalog tables. Not that the _t_object table is the only place where the object oid and the schema oid
and their associated names are listed.
Netezza Performance Server 9 September 9, 2013
24
The security tables, _t_acl, _t_grpobj_priv, _t_usrobj_priv, all have a new column named schema.
A new class named Schema Class has been added on the _t_object table so that all the schemas will have a code
which identifies them as schemas A new table named _t_schema has been introduced to provide information on
the definition of each schema - including the sql path values.
Also, all system views now include two new columns to hold the physical schema name and schema object id
(oid).
Be aware that there are changes to system indexes in release 7.0.3. The unique index on relname in the _t_class
table has been removed since relname is no longer unique inside of a database. This is because you can have same
table name in different schemas. Also, some of the indexes on the _t_object and _t_acl tables have been
updated to include the schema column.
Note: When you execute queries against the system catalog tables, for performance reasons, you should always
use the columns defined in the indexes in the WHERE clause of the SQL statement. Using indexed columns will
force an index scan; otherwise, a full scan will be performed.
In release 7.0.3, the history configuration version has incremented from 1 to 2. The version 2 history tables and
views have numerous updates to support multiple schemas, the ability for users to change databases and
schemas within a session, and timezone offsets. When you create a history database and enable history collection,
specify version 2 to make sure that you collect the complete information for activity on the 7.0.3 or later system.
Synonyms and sequences share the same naming restrictions as tables, views, and functions - both must be
unique with a schema. However, synonyms cannot be the same as global objects such as those of databases,
users, or groups. Also, the owner of the sequence where the sequence is defined has full privileges on all user
sequences in that schema. There is no need to grant any privileges to the owner.
16. Stored Procedures / UDFs / UDAs
16.1. Creating a Stored Procedure in a Schema
When you create a stored procedure using the CREATE [OR REPLACE] PROCEDURE command, the command
adds the procedure to the database and schema to which you are connected. Starting with version 7.0.3, you can
specify a name in the format schema-name.procedure-name to create a procedure in a different schema of the current
database. For example:
• To create a new procedure called cust_name in your current schema:
sales.user1(user1)=>CREATE OR REPLACE PROCEDURE cust_name()
RETURNS INT8 LANGUAGE NZPLSQL AS BEGIN_PROC BEGIN RAISE
NOTICE 'The customer name is John Doe; END; END_PROC;
• To create a new procedure called cust_name in a different schema of the same database:
sales.user1(user1)=>CREATE OR REPLACE PROCEDURE user2.cust_name()
RETURNS INT8 LANGUAGE NZPLSQL AS BEGIN_PROC BEGIN RAISE
NOTICE 'The customer name is John Doe; END; END_PROC;
You can use the same naming convention with the ALTER/DROP PROCEDURE statements to change/drop a
procedure in the same/different schema of the current database. For example, to change the owner for a
procedure in a different schema, enter:
Netezza Performance Server 9 September 9, 2013
25
sales.user1(user1)=>ALTER PROCEDURE user2.myproc(int4) OWNER TO john;
Note that you cannot create/change/drop a procedure in a different database.
Important: With multiple schemas, if you define stored procedures in a non-default schema, dropping that
schema will drop all the objects in that schema.
16.2. Using Stored Procedures
You can execute a procedure while connected to a database and schema in which it is defined, in other schemas of
the same database, or in other databases. Assuming you have the necessary privileges, the following examples
use various notations of the fully qualified format (database-name.schema-name.object-name) when calling a
procedure object that resides within the current/different database or schema:
sales.user1(user1)=>CALL calc_sales();
sales.user1(user1)=>EXEC user1.calc_sales();
sales.user1(user1)=>EXECUTE sales.user1.calc_sales();
sales.user1(user1)=>EXECUTE PROCEDURE calc_sales();
sales.user1(user1)=>EXEC orders.user2.cals_sales ();
sales.user1(user1)=>EXEC user3.calc_sales();
You can also use the fully qualified object notation to set the scope of object privileges from any database - where
object is either the class PROCEDURE for all stored procedures, or a full signature such as
customer(VARCHAR(20)). You can do the same to set the scope of administrative privileges from any database.
16.3. Functions / UDFs
Similar in behavior to the CREATE TABLE statement, the CREATE FUNCTION statement will fail if a user-
defined function with the same name and signature already exists in the database, or in the same schema.
However, you can now also use any of the below SQL statements to change/create/drop the object in a different
schema of the current database:
• ALTER FUNCTION (schema.function)
• ALTER LIBRARY (schema.library)
• CREATE [OR REPLACE] AGGREGATE (schema.aggregate)
• CREATE [OR REPLACE] LIBRARY (schema.library)
• DROP AGGREGATE/FUCTION/LIBRARY (schema.aggregate/function/library)
None of the above SQL statements can be used across a different database.
Note that when you create an aggregate or function, its signature (that is, its name and argument type list) must
be unique within a schema. No other aggregate or function can have the same name and argument type list in the
same database or schema. As a best practice, avoid creating duplicates of the same function in different schemas
and, instead, always use fully qualified names to reference functions in the different schemas. This can help
reduce the overhead to maintain multiple copies of the same function and ensure the correct UDF is executed.
Also, the nzudxcompile command now has the new schema option that specifies the schema in the database
where you want to register the object. If you do not specify a schema, and the NZ_SCHEMA environment
variable is not set, be aware the command uses the default schema for the database, unless IPDA is configured
with enable_user_schema = TRUE, which then would place the UDX in a schema that matches user name.
Netezza Performance Server 9 September 9, 2013
26
16.4. Procedure/Function Privileges
By default, the admin user account has execute access to all stored procedures, user-defined functions (UDF), and
aggregates. The database owner has privileges to run these objects in the database. Other users can be given
privileges to run specific or all objects. Starting with release 7.0.3 the schema owner has full permission to these
objects within that schema. In addition, the object owner (user who created and registered the UDF) has
permission to manage and execute the stored procedures that he or she owns.
17. IPDA Utilities / Interfaces
17.1. CLIENT CONNECTIVITY
IPDA clients and CLIs now include support for multiple schemas.
There are multiple versions of system views for each IPDA driver. Drivers are structured in such a way that they
call different views internally. Clients on a release prior to 7.0.3 that connect to a 7.0.3 host may find some
command output confusing, because the older commands do not show schema information that can help to
uniquely identify objects.
In IPDA releases prior to 7.0.3, the default view is version 2, which returns the object owner as the schema field –
this is the name of the user who created the database. When moving to release 7.0.3, it is recommended to change
the views of the driver that ODBC and JDBC clients are using from version 2 to 3. This version will return the
schema field in the schema column attribute of those views. Since this is a system-wide change, the change will
affect all clients!
OLDB and .NET clients should move from view version 1 to 2 to take advantage of the additional schema
information.
17.2. NZLOAD
The nzload command now includes the new –schema option to support multiple schemas. This option specifies
the schema in which to load the table. If you do not specify the -schema option, IPDA uses the value of the
NZ_SCHEMA environment variable; otherwise, the default schema for the database is used.
The nzload control file, which allows you to define load operations in a text file without having to specify the
options on the command line, also includes the schema option to support schema specific table loads. For
example, the following control file options define two data sets to load into two tables with the same name but in
different schemas – john.customer and mary.customer.
DATAFILE /home/operation/data/j_customer.dat { Database sales Schema john TableName customer Delimiter '|' Logfile j_logfile.log Badfile j_customer.bad } DATAFILE /home/imports/data/m_customer.dat { Database sales Schema mary TableName customer Delimiter '#' Logfile m_logfile.log Badfile m_customer.bad }
Netezza Performance Server 9 September 9, 2013
27
The default nzlog and nzbad log files created by nzload now include a schema name such as
table.schema.database.nzlog and table.schema.database.nzbad. As a best practice, if you are running multiple nzload
jobs to load into a table, you should use unique names for your nzbad files.
Note: If your default system case is uppercase, IPDA displays lowercase table names as uppercase in nzlog files,
for example, CUSTOMER.USER1.SALES.nzlog and CUSTOMER.USER1.SALES.nzbad
As a best practice, because table names are no longer unique within a database, you should always fully qualify
the objects you are loading to avoid data insertion into the wrong tables. This is especially true given that IPDA
will use the object in the default schema if the –schema option and NZ_SCHEMA are not set.
17.3. Other IPDA Utilities
Starting with IPDA release 7.0.3, there have been several changes to many of the IPDA utilities to support
multiple schemas:
• The NZADMIN GUI interface now has a Schemas option to display all schemas within a database.
• nzdumpschema now has the new schemaList option which specifies one or more schemas within the
database to extract information for. By default, the command extracts information for all the schemas
in a database. You can specify one or more schemas in a space-separated list to extract only the
contents of the specified schemas. Note that the command ignores the NZ_SCHEMA setting.
• You can now use the schema option to specify the schema in which to groom the tables with
nzreclaim. This option is compatible with the existing –db option, but not with the -allDbs option. If
you specify -schema and -allTbls, the command grooms all the tables in the schema. If you do not
specify a schema, IPDA uses NZ_SCHEMA; otherwise, the default schema for the database is used.
• The nzsession command, to view and manage sessions, now includes schema information also.
18. Other
In release 7.0.3, many of the messages IPDA displays have been standardized. Because object names are no
longer unique, three-level names have been included in error message text (database-name.schema-name.object-
name). You therefore require all 3 names to physically identify an object.
Also, error messages are now more consistent. For example, if an object does not exist, a standard message will be
displayed, versus different flavors of the same message.
There is a new version of query history configuration – version 2. If you use the SET CATALOG / SET SCHEMA
statements, version 2 of query history will capture these statements. However, if you use version 1 and later
decide to switch to version 2, you will lose these statements as they will not be captured – in effect, you will be
losing data!
Because of this, after an upgrade, it is recommended that you switch to query history schema version 2, so that
you do not lose any data. IPDA does not migrate your version 1 data to version 2. Once you start using release
7.0.3, you will be collecting new data under version 2. It is therefore recommended to use the CREATE HISTORY
command to create a version 2 configuration for query history logging after an upgrade.
Important: As of the IPDA 7.0.2 GA release, IBM IPDA Analytics (INZA) is not supported.
Netezza Performance Server 9 September 9, 2013
28
19. Conclusion
In summary, just as folders and directories allow you to organize information, schemas allow the logical
separation of objects, but still allow the objects to work together as required. Schemas and their objects are
mutually dependent, and therefore enforce uniqueness in the database by forcing you to qualify database objects.
This makes development easier, simplifies security, permits object name reuse, avoids conflicts, and allows
objects to be manipulated independently of users.
The use of multiple schema is encouraged as the benefits often outweigh their costs – however, overuse, or the
use of schemas for the wrong reasons, can result in additional and unnecessary overhead. You should therefore
always ask how the use of multiple schemas in your situation will be beneficial, prior to implementation.