
Changing the NLS_CHARACTERSET to AL32UTF8 / UTF8 (Unicode) [ID 260192.1]

 

  Modified 01-FEB-2010     Type BULLETIN     Status PUBLISHED

 

In this Document
  Purpose
  Scope and Application
  Changing the NLS_CHARACTERSET to AL32UTF8 / UTF8 (Unicode)
    1.A) Prerequisites:
    1.B) When changing an Oracle Applications Database:
    1.C) When to use full export / import and when to use Alter Database Character Set / Csalter?
    1.D) When using Expdp/Impdp (DataPump)
    1.E) Using Alter Database Character Set on 9i
    2) Check the source database for:
    2.a) Invalid objects.
    2.b) Orphaned Datapump master tables (10g and up)
    2.c) Unneeded sample schemas/users.
    2.d) Objects in the recyclebin (10g and up)
    2.e) Leftover Temporary tables using CHAR semantics.
    3) Check the Source database for "Lossy" (invalid code points in the current source character set).
    4) Check for "Convertible" and "Truncation" data when going to AL32UTF8
    5) Dealing with "Truncation" data.
    6.a) Dealing with "Convertible" data.
    6.b) After any "Lossy" is solved, "Truncation" data is planned to be addressed and/or "Convertible" exported / truncated / addressed run Csscan again as final check.
    7) Before using Csalter / Alter Database Character Set check the database for:
    7.a) Partitions using CHAR semantics:
    7.b) Functional indexes on CHAR semantics columns.
    7.c) SYSTIMESTAMP in the DEFAULT value clause for tables using CHAR semantics.
    7.d) Clusters using CHAR semantics.
    7.e) Unused columns using CHAR semantics
    7.f) Check that you have enough room to run Csalter or to import the "Convertible" data again afterwards.
    8) Summary of steps needed to use Alter Database Character Set / Csalter:
    8.a) For 9i and lower:
    8.b) For 10g and up:
    9) Running Csalter/Alter Database Character Set
    9.a) For 8i/9i
    9.b) For 10g and up
    10) Reload the data pump packages after a change to AL32UTF8 in 10g and up.
    11) Import the exported data again.
    11.a) When using Csalter/Alter database and there was "Truncation" data in the csscan done in point 4:
    11.b) When using Full export/import and there was "Truncation" data in the csscan done in point 4:
    11.c) When using Csalter/Alter database and there was NO "Truncation" data, only "Convertible" and "Changeless" in the csscan done in point 4:
    11.d) When using full export/import and there was NO "Truncation" data, only "Convertible" and "Changeless" in the csscan done in point 4:
    12) Check your data
  References


Applies to:

Oracle Server - Enterprise Edition - Version: 8.0.3.0 to 11.2.0.1.0
Information in this document applies to any platform.

Purpose

To provide a guide to change the NLS_CHARACTERSET to AL32UTF8 or UTF8. This note deals only with the database (server side) change itself. For the implications at the client and application level when going to AL32UTF8, please see Note 788156.1 AL32UTF8 / UTF8 (Unicode) Database Character Set Implications. It's strongly recommended to read Note 788156.1 first and to make sure your application and clients are checked and ready for the change at the database level.

This note was originally specific to going FROM WE8ISO8859P1, WE8ISO8859P15 or WE8MSWIN1252 TO AL32UTF8 or UTF8. The current note, however, can be used to go from any NLS_CHARACTERSET to AL32UTF8 / UTF8 (which also means it can be used to go from UTF8 to AL32UTF8, or the inverse).

This "flow" can also be used to go from any single byte characterset (like US7ASCII, WE8DEC) to any other Multi byte characterset (ZHS16GBK, ZHT16MSWIN950, ZHT16HKSCS, ZHT16HKSCS31,KO16MSWIN949, JA16SJIS ...), simply substitute AL32UTF8 with the xx16xxxx target characterset. But in that case going to AL32UTF8 would be simply be a far better idea. Note 333489.1 Choosing a database character set means choosing Unicode.

The note is written using AL32UTF8. To use this note to go to another character set (for example UTF8), simply replace "AL32UTF8" with "UTF8" in the Csscan TOCHAR parameter and, for 9i and lower, in the ALTER DATABASE CHARACTER SET command.

Scope and Application

Any DBA changing the current NLS_CHARACTERSET to AL32UTF8 / UTF8 or another multibyte character set. In this note AL32UTF8 is used, but it is applicable to UTF8 or other multibyte character sets also.

The current NLS_CHARACTERSET is seen in NLS_DATABASE_PARAMETERS.

select value from NLS_DATABASE_PARAMETERS where parameter='NLS_CHARACTERSET';

Changing the NLS_CHARACTERSET to AL32UTF8 / UTF8 (Unicode)

1) General remarks on going to AL32UTF8

1.A) Prerequisites:

In this note the Csscan tool is used. Please install it first:

Note 458122.1 Installing and configuring CSSCAN in 8i and 9i
Note 745809.1 Installing and configuring CSSCAN in 10g and 11g

For an overview of the output and what it means, please see Note 444701.1 Csscan output explained.


1.B) When changing an Oracle Applications Database:

Please see the following note for an Oracle Applications database: Note 124721.1 Migrating an Applications Installation to a New Character Set. This is the only way supported by Oracle Applications. If you have any doubt, log an Oracle Applications SR for assistance.

1.C) When to use full export / import and when to use Alter Database Character Set / Csalter?

Full exp/imp can be used at any time. To avoid data loss, do check your source database with Csscan even when using full export / import (follow this note until point 6 and then go to step 11). Using Alter Database Character Set / Csalter has an advantage when the amount of "Convertible" data is low compared to the amount of "Changeless" data and/or when recreating the database would take a lot of time.

1.D) When using Expdp/Impdp (DataPump)

Do NOT use Expdp/Impdp when going to (AL32)UTF8 or another multibyte character set on ALL 10g versions lower than 10.2.0.4 (including 10.1.0.5); 11.1.0.6 is also affected. It will provoke data corruption unless Patch 5874989 is applied on the Impdp side. Expdp is not affected, hence the data in the dump file is correct. The "old" exp/imp tools are also not affected. This problem is fixed in the 10.2.0.4 and 11.1.0.7 patch sets. For Windows the fix is included in 10.1.0.5.0 Patch 20 (10.1.0.5.20P) or later, see Note 276548.1, and in 10.2.0.3.0 Patch 11 (10.2.0.3.11P) or later, see Note 342443.1.

1.E) Using Alter Database Character Set on 9i

For 9i systems, please make sure you are at least on Patchset 9.2.0.4; see Note 250802.1 Changing character set takes a very long time and uses lots of rollback space.

2) Check the source database for:

2.a) Invalid objects.

select owner, object_name, object_type, status from dba_objects where status ='INVALID'; 

If there are any invalid objects, resolve / drop those.
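If the invalid objects are simply out-of-date PL/SQL, recompiling them is often enough; a minimal sketch using the standard utlrp.sql script (rerun the select afterwards to see what remains):

conn / as sysdba
-- recompile all invalid objects in the database
@?/rdbms/admin/utlrp.sql
-- re-check; anything still INVALID must be fixed or dropped manually
select owner, object_name, object_type from dba_objects where status = 'INVALID';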

2.b) Orphaned Datapump master tables (10g and up)

SELECT o.status, o.object_id, o.object_type,
       o.owner||'.'||object_name "OWNER.OBJECT"
  FROM dba_objects o, dba_datapump_jobs j
 WHERE o.owner=j.owner_name AND o.object_name=j.job_name
   AND j.job_name NOT LIKE 'BIN$%'
 ORDER BY 4,2;

Note 336014.1 How To Cleanup Orphaned DataPump Jobs In DBA_DATAPUMP_JOBS ?

2.c) Unneeded sample schema's/users.

The 'HR', 'OE', 'SH', 'PM', 'IX', 'BI' and 'SCOTT' users are sample schemas. There is no point in having these sample schemas in a production system; if they exist we suggest removing them. This note is useful to identify users in your database: Note 160861.1 Oracle Created Database Users: Password, Usage and Files. Another user that might be removed is SQLTXPLAIN from Note 215187.1.
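A minimal sketch for removing one sample schema (here HR; adapt the user name, and double-check the schema really is unused before dropping):

conn / as sysdba
-- drop the sample schema and all objects it owns
DROP USER HR CASCADE;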

2.d) Objects in the recyclebin (10g and up)

conn / as sysdba
SELECT OWNER, ORIGINAL_NAME, OBJECT_NAME, TYPE from dba_recyclebin order by 1,2;

If there are objects in the recyclebin then perform

conn / as sysdba
PURGE DBA_RECYCLEBIN;

This removes the unneeded objects; otherwise an ORA-38301 will be seen during Csalter.

2.e) Leftover Temporary tables using CHAR semantics.

conn / as sysdba
select C.owner ||'.'|| C.table_name ||'.'|| C.column_name ||' ('|| C.data_type ||' '|| C.char_length ||' CHAR)'
from all_tab_columns C
where C.char_used = 'C'
and C.table_name in (select table_name from dba_tables where temporary='Y')
and C.data_type in ('VARCHAR2', 'CHAR')
order by 1;

These tables MAY (!) give the following error during Alter Database Character Set or Csalter:

ERROR at line 1:
ORA-00604: error occurred at recursive SQL level 1
ORA-14450: attempt to access a transactional temp table already in use

Temporary tables should be recreated by the application when needed, so if the above select lists tables it's a good idea to confirm the application will recreate them if needed and to drop them now (or, if the database is still in use, do this just before the final Csscan run, point 6.b in this note). If the reported tables are SYS.ORA_TEMP_X_DS_XXXX (like SYS.ORA_TEMP_1_DS_27681 or SYS.ORA_TEMP_1_DS_27686) they are leftovers of DBMS_STATS (Note 4157602.8), so they can be dropped without problems at any time.
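A minimal sketch that generates (but does not yet execute) the DROP statements for the listed temporary tables; review the output before running it, and confirm the application recreates its temporary tables on demand:

conn / as sysdba
-- generate DROP statements for global temporary tables with CHAR semantics columns
select distinct 'DROP TABLE "'|| C.owner ||'"."'|| C.table_name ||'";' "Generated DDL"
from all_tab_columns C
where C.char_used = 'C'
and C.table_name in (select table_name from dba_tables where temporary='Y')
and C.data_type in ('VARCHAR2', 'CHAR');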

3) Check the Source database for "Lossy" (invalid code points in the current source character set).

Run Csscan with the following syntax:

$ csscan \"sys/<syspassword>@<TNSalias> as sysdba\" FULL=Y FROMCHAR=<current NLS_CHARACTERSET> TOCHAR=<current NLS_CHARACTERSET> LOG=dbcheck CAPTURE=N ARRAY=1000000 PROCESS=2

* Always run Csscan connecting with a 'sysdba' connection/user; do not use the "system" or "csmig" user.

* The <current NLS_CHARACTERSET> is seen in NLS_DATABASE_PARAMETERS:
select value from NLS_DATABASE_PARAMETERS where parameter='NLS_CHARACTERSET';

* The TOCHAR=<current NLS_CHARACTERSET> is not a typo; the idea is to check the CURRENT character set for codes that are not defined in this NLS_CHARACTERSET before changing the NLS_CHARACTERSET.

* The PROCESS= parameter influences the load on your system: the higher it is (6 or 8 for example) the faster Csscan will be done, the lower it is the less impact it will have on your system. Adapt if needed.

* The Csscan SUPPRESS parameter limits the size of the .err file by limiting the amount of information logged per table. Using SUPPRESS=1000 will log at most 1000 rows for each table in the .err file. It will not affect the information in the .txt file, but it WILL affect the data logged in the .err file. This is mainly useful for the first scan of big databases: if you have no idea how much "Convertible" or "Lossy" data there is in a database, this avoids the .err file becoming hundreds of MB big, and it also limits the space used by the Csscan tables under the CSMIG schema.

This will create 3 files :

dbcheck.out - a log of the output of Csscan
dbcheck.txt - a Database Scan Summary Report
dbcheck.err - contains the rowids of the "Lossy" rows reported in dbcheck.txt (if any)

This is to check that all data is stored correctly in the current character set. Because the TOCHAR and FROMCHAR character sets are the same, there cannot be any "Convertible" or "Truncation" data reported in dbcheck.txt.

If all the data in the database is stored correctly at the moment then there is only "Changeless" data reported in dbcheck.txt. If this is the case please go to point 4).

If there is any "Lossy" data then those rows contain code points that are not currently defined correctly and they should be cleared up before you can continue. If this "Lossy" is not checked/corrected then this "Lossy" data WILL BE LOST.Please see the following note more information about "Lossy" data Note 444701.1 Csscan output explained. You can also use (after reading Note 444701.1 ) the flow in Note 225938.1 Database Character Set Healthcheck.

The most common situation is an US7ASCII/WE8ISO8859P1 database with "Lossy"; in this case changing your US7ASCII/WE8ISO8859P1 SOURCE database to WE8MSWIN1252 using Alter Database Character Set / Csalter will most likely solve the lossy data. The reason is explained in Note 252352.1 Euro Symbol Turns up as Upside-Down Questionmark. The flow to do this is found in Note 555823.1 Changing US7ASCII or WE8ISO8859P1 to WE8MSWIN1252. Note that using Csscan alone is not enough; you will need to check your whole environment to deduce the real encoding of the data.

Do not blindly assume your data is WE8MSWIN1252; this does not mean ALL lossy data can be "solved" by going to WE8MSWIN1252.

It cannot be repeated enough that:
* If "Lossy" data needs to be saved/corrected then it is NEEDED to change the NLS_CHARACTERSET FIRST to the "real" character set of the "Lossy" data in the source database BEFORE going to AL32UTF8. If your WE8ISO8859P1 database has for example Hebrew stored, you most likely NEED to go to IW8MSWIN1255 before going to AL32UTF8, seeing that WE8ISO8859P1 simply does not define Hebrew.
* In some cases it is NOT possible to "correct" the "Lossy" data, and then the best solution is to update those rows with something meaningful using SqlDeveloper.

Also do NOT use exp/imp to "correct" lossy data: for example, setting the NLS_LANG to WE8MSWIN1252 while exporting "Lossy" data from a WE8ISO8859P1 database will NOT solve the lossy, that data WILL be lost.


When preparing a test environment to debug this you can use two things: a) a restored physical copy (backup) of the database, or b) export/import of (part of) the dataset into a database with the same NLS_CHARACTERSET as the current source database.

If you use this note to go from AL32UTF8 to UTF8 (or inverse) and you have lossy then log a SR and ask for a review by the "Advanced Resolution Team".

This select will give all the lossy objects found in the last Csscan run:

Note that when using the csscan SUPPRESS parameter this select may give incomplete results (not all tables).

select distinct z.owner_name || '.' || z.table_name || '(' ||
       z.column_name || ') - ' || z.column_type || ' ' LossyColumns
from csmig.csmv$errors z
where z.error_type ='DATA_LOSS'
order by LossyColumns
/

Lossy in Data Dictionary objects

When using Csalter/Alter Database Character Set:

Most "Lossy" in the Data Dictionary objects will be corrected by correcting the database as a whole, if the only "lossy" is found in Data Dictionary objects then follow the tips for "Convertible" Data Dictionary data . For example one common thing seen is "Lossy" found only in SYS.SOURCE$,  most of the time this means some package source code contain illegal codes/bad data. You can use the selects found in Note 291858.1 "SYS.SOURCE$ marked as having Convertible or Lossy data in Csscan output" to find what objects are affected. Note that you CANNOT "fix" SYS.SOURCE$ itself, you need to recreate the objects who's text is stored in SYS.SOURCE$.

Do NOT truncate or export Data Dictionary objects themselves unless this is said to be possible in Note 258904.1.

When using full export/import into a new AL32UTF8 database:

When using export/import to a new database, "Lossy" in Data Dictionary objects is only relevant when it concerns "Application data". The thing to check is that there is no "Lossy" in tables like SYS.SOURCE$ (package source code), SYS.COM$ (comments on objects), SYS.VIEW$ (view definitions), SYS.COL$ (column names) or SYS.TRIGGER$ (triggers). The reason is simply that these Data Dictionary objects contain information about user objects or PL/SQL code; if you have "Convertible" there, that's not an issue. For most conversions, if there is "Lossy" it will be in SYS.SOURCE$.

4) Check for "Convertible" and "Truncation" data when going to AL32UTF8

Run csscan with the following syntax:

$ csscan \"sys/<syspassword>@<TNSalias> as sysdba\" FULL=Y TOCHAR=AL32UTF8 LOG=TOUTF8 CAPTURE=Y ARRAY=1000000 PROCESS=2

This will create 3 files:

toutf8.out - a log of the output of Csscan
toutf8.txt - the Database Scan Summary Report
toutf8.err - contains the rowids of the "Convertible" and "Lossy" rows reported in toutf8.txt

There should be NO entries under "Lossy" in toutf8.txt, because they should have been filtered out in step 3); if there is "Lossy" data then please redo step 3).

If there are:
* Entries under "Truncation": go to step 5).
* Entries under "Convertible" and "Changeless" but no "Truncation": go to step 6).
* NO entries under "Convertible", "Truncation" or "Lossy" and all data reported as "Changeless": proceed to step 7).

5) Dealing with "Truncation" data.

As explained in Note 788156.1, characters may use more BYTES in AL32UTF8 than in the source character set. "Truncation" data means the row won't fit in the current column definition once converted to AL32UTF8.

"Truncation" data is always also "Convertible" data, which means that whatever way you do the change, these rows have to be exported before the character set is changed and re-imported after the character set has changed. If you proceed with that without dealing with the truncation issue then the import will fail on these columns because the size of the data exceeds the maximum size of the column.

Truncation issues will always require some work. There are a number of ways to deal with them:
A) Update these rows in the source database so that they contain less data.
B) Update the table definition in the source database before exporting so that it can store more BYTES, or use CHAR length semantics instead of BYTE length semantics (only possible in Oracle9i and up).
C) Pre-create/adapt the table before the import so that it can contain 'longer' data. Again you have a choice between simply making it larger in BYTES, or switching from BYTE to CHAR length semantics.

Typically:
* when using Csalter/Alter database, the columns are changed to CHAR semantics after going to AL32UTF8 but before importing the exported "Convertible"/"Truncation" data again.
* when using full export/import, the tables are pre-created in the new database using CHAR semantics before importing the data.

Note that in some cases the expansion in BYTES is bigger than the maximum data length of the datatype, and then using CHAR semantics will not help. This limit is 2000 BYTES for CHAR and 4000 BYTES for VARCHAR2. In that case you need to either reduce the actual data or change to a datatype (like CLOB) that allows you to store that length.

Using CHAR semantics is further discussed in Note 144808.1 Examples and limits of BYTE and CHAR semantics usage. That note also has a link to a script that can change all tables from BYTE to CHAR semantics.

To know how much the data expands you can:

* Use this procedure:

Note that when using the csscan SUPPRESS parameter this procedure may give incomplete results (not all tables or not the correct minimal needed data size).

conn / as sysdba
set serveroutput on
DECLARE
  newmaxsz NUMBER;
BEGIN
  FOR rec in
    ( SELECT distinct u.owner_name, u.table_name, u.column_name,
             u.column_type, u.owner_id, u.table_id, u.column_id,
             u.column_intid
        FROM csmv$errors u
       WHERE u.error_type='EXCEED_SIZE'
       order by u.owner_name, u.table_name, u.column_name)
  LOOP
    select MAX(cnvsize) INTO newmaxsz from csm$errors
     WHERE usr#=rec.owner_id and obj#=rec.table_id
       and col#=rec.column_id and intcol#=rec.column_intid;
    DBMS_OUTPUT.PUT_LINE(rec.owner_name ||'.'|| rec.table_name ||' ('||
        rec.column_name ||') - '|| rec.column_type ||' - '||
        newmaxsz || ' Bytes');
  END LOOP;
END;
/

This will give the minimal amount of BYTES the column needs to be to accommodate the expansion.

* Or check the Csscan output. You can find that in the .err file as "Max Post Conversion Data Size". For example, check in the .txt file which table has "Truncation"; let's assume there is a row that says:

-- snip from toutf8.txt
[Distribution of Convertible, Truncated and Lossy Data by Table]

USER.TABLE            Convertible      Truncation       Lossy
--------------------- ---------------- ---------------- ----------------
...
SCOTT.TESTUTF8        69               6                0
...

then look in the toutf8.err file for "TESTUTF8" until the "Max Post Conversion Data Size" is bigger than the column size for that table:

-- snip from toutf8.err
User : SCOTT
Table : TESTUTF8
Column: ITEM_NAME
Type : VARCHAR2(80)
Number of Exceptions : 6
Max Post Conversion Data Size: 81

The maximum size after going to AL32UTF8 will be 81 bytes for this column.
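For this example either of the following would make the column large enough; a minimal sketch, with the table and sizes taken from the snip above:

-- option 1: enlarge the column in BYTES to the reported post-conversion size
ALTER TABLE scott.testutf8 MODIFY (item_name VARCHAR2(81));
-- option 2: switch to CHAR semantics so 80 characters always fit (9i and up)
ALTER TABLE scott.testutf8 MODIFY (item_name VARCHAR2(80 CHAR));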

Csalter/Alter Database Character Set has problems with functional indexes on, and partitions using, CHAR-based columns; see point 7). If you have functional indexes / partitions, you can only change those columns to CHAR semantics after the change to AL32UTF8. Any other table columns can be changed to CHAR semantics before going to AL32UTF8 if required.

Truncation in Data Dictionary objects is rare and will be solved by using the steps for "Convertible" Data Dictionary data.

While it's technically only needed to take action on the "Truncation" rows reported by Csscan, it's still a good idea to consider using CHAR semantics for every column / variable in an AL32UTF8 database.

6.a) Dealing with "Convertible" data.

Once any "Lossy" or "Truncation" is dealt with, full exp/imp to a new AL32UTF8 database can be used. Take the full export and goto point 11). The rest of this note until step 11) will deal only with using Csalter/Alter Database Character Set combined with partial export/import.

When using Csalter/Alter Database Character Set, all "Convertible" User/Application Data needs to be exported and truncated/deleted.


When using export/import (full or partial) with the "old" Exp/Imp tools, the NLS_LANG setting is simply AMERICAN_AMERICA.<source NLS_CHARACTERSET>. Expdp/Impdp does not use the NLS_LANG for data conversion; see Note 227332.1 NLS considerations in Import/Export - Frequently Asked Questions. To check for constraint definitions on the tables before exporting and truncating them, Note 1019930.6 Script: To report Table Constraints can be used.
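A minimal sketch for one "Convertible" application table using the "old" exp tool; the table name and dump file are illustrative, and the NLS_LANG value must match your current NLS_CHARACTERSET (here assumed to be WE8MSWIN1252):

$ export NLS_LANG=AMERICAN_AMERICA.WE8MSWIN1252
$ exp system/<password>@<TNSalias> FILE=conv_tables.dmp TABLES=(scott.testutf8)

Then remove the exported rows so the table becomes "Changeless":

conn / as sysdba
TRUNCATE TABLE scott.testutf8;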

The main challenge when using Csalter/Alter Database Character Set is most of the time "Convertible" data in Data Dictionary objects.

* For 8i/9i ALL "Convertible" data in the Data Dictionary objects needs to be addressed.

* For 10g and up you do not need to take action on "convertible" Data Dictionary CLOB data. Convertible CLOB in Data Dictionary objects is handled by Csalter, for CHAR, VARCHAR2 and LONG data however you do need to take action.

Please see Note 258904.1 Solving Convertible data in Data Dictionary objects when changing the NLS_CHARACTERSET for selects that give a better overview than the Csscan *.txt file output of what objects need action, and for how to solve commonly seen "Convertible" data in Data Dictionary columns. If there are Data Dictionary columns in your Csscan output that are not listed in Note 258904.1, please log an SR if you need help.

Do NOT truncate or export Data Dictionary objects themselves unless this is said to be possible in Note 258904.1.

This note may be useful to identify Oracle-created users in your database: Note 160861.1 Oracle Created Database Users: Password, Usage and Files. This note may also be useful: Note 472937.1 Information On Installed Database Components and Schema's.

To remove "Convertible" out of an Intermedia / Oracle Text Index (after it has been removed from the table) please see Note 176135.1

6.b) After any "Lossy" is solved, "Truncation" data is planned to be addressed and/or "Convertible" exported / truncated / addressed run Csscan again as final check.

$ csscan \"sys/<syspassword>@<TNSalias> as sysdba\" FULL=Y TOCHAR=AL32UTF8 LOG=TOUTF8FIN CAPTURE=Y ARRAY=1000000 PROCESS=2

6.b.1) For 8i/9i the Csscan output needs to be "Changeless" for all CHAR, VARCHAR2, CLOB and LONG data (Data Dictionary and User/Application data).

In order to use "Alter Database Character Set" you need to see in the toutf8fin.txt file under [Scan Summary] this message::All character type data in the data dictionary remain the same in the new character setAll character type application data remain the same in the new character set

If so, then continue in step 7)

6.b.2) For 10g and up the Csscan output needs to be

* "Changeless" for all CHAR VARCHAR2, and LONG data (Data Dictionary and User/Application data )* "Changeless" for all User/Application data CLOB* "Changeless" and/or "Convertible" for all Data Dictionary CLOB

And in order to run Csalter you need to see in the toutf8fin.txt file under [Scan Summary] this message:


All character type application data remain the same in the new character set

and under [Data Dictionary Conversion Summary] this message:

The data dictionary can be safely migrated using the CSALTER script

If you run Csalter without these conditions met then you will see messages like "Unrecognized convertible data found in scanner result" in the Csalter output.

Before you can run Csalter you need:
* to have the above messages in the .txt file.
* to have had that FULL=Y run completed in the 7 days prior to running Csalter; you can only run Csalter in the 7 days following the "clean" FULL=Y scan.
* to be sure the session running Csalter is the ONLY session connected to the database, otherwise Csalter will give the warning 'Sorry only one session is allowed to run this script'.

7) Before using Csalter / Alter Database Character Set check the database for:

7.a) Partitions using CHAR semantics:

conn / as sysdba
select C.owner, C.table_name, C.column_name, C.data_type, C.char_length
  from all_tab_columns C, all_tables T
 where C.owner = T.owner
   and C.table_name = T.table_name
   and C.char_used = 'C'
   and T.PARTITIONED='YES'
   and C.table_name not in (select table_name from all_external_tables)
   and C.data_type in ('VARCHAR2', 'CHAR')
 order by 1,2;

If there are, check out Note 330964.1; they will give "ORA-14265: data type or length of a table subpartitioning column may not be changed" during the change to AL32UTF8.

7.b) Functional indexes on CHAR semantics columns.

conn / as sysdba
select OWNER, INDEX_NAME, TABLE_OWNER, TABLE_NAME, STATUS, INDEX_TYPE, FUNCIDX_STATUS
  from DBA_INDEXES
 where INDEX_TYPE like 'FUNCTION-BASED%'
   and TABLE_NAME in (select unique (table_name) from dba_tab_columns where char_used ='C')
 order by 1,2;

If this gives rows back then the change to AL32UTF8 will fail with "ORA-30556: functional index is defined on the column to be modified" or with "ORA-02262: ORA-904 occurs while type-checking column default value expression". If there are functional indexes on columns using CHAR semantics (this includes NCHAR and NVARCHAR2 columns), the indexes need to be dropped and recreated after the change; note that disabling them is not enough. The DDL of all those indexes can be found using:

conn / as sysdba
SET LONG 2000000
SET PAGESIZE 0
EXECUTE DBMS_METADATA.SET_TRANSFORM_PARAM(DBMS_METADATA.SESSION_TRANSFORM,'STORAGE',false);
SELECT DBMS_METADATA.GET_DDL('INDEX',u.index_name,u.owner)
  FROM DBA_INDEXES u
 where u.INDEX_TYPE like 'FUNCTION-BASED%'
   and u.TABLE_NAME in
       (select unique (x.TABLE_NAME)
          from DBA_TAB_COLUMNS x where x.char_used ='C');
EXECUTE DBMS_METADATA.SET_TRANSFORM_PARAM(DBMS_METADATA.SESSION_TRANSFORM,'DEFAULT');


7.c) SYSTIMESTAMP in the DEFAULT value clause for tables using CHAR semantics.

conn / as sysdba
set serveroutput on
BEGIN
  FOR rec in
    ( SELECT OWNER, TABLE_NAME, COLUMN_NAME, DATA_DEFAULT
        FROM dba_tab_columns where CHAR_USED='C')
  LOOP
    IF UPPER(rec.DATA_DEFAULT) LIKE '%TIMESTAMP%' THEN
      DBMS_OUTPUT.PUT_LINE(rec.OWNER ||'.'|| rec.TABLE_NAME ||'.'|| rec.COLUMN_NAME);
    END IF;
  END LOOP;
END;
/

These will give "ORA-604 error occurred at recursive SQL level %s" and "ORA-1866 the datetime class is invalid" during the change to AL32UTF8. The workaround is to temporarily change the affected tables to use a DEFAULT NULL clause, e.g. ALTER TABLE tab MODIFY ( col ... DEFAULT NULL NOT NULL ); After the character set change the DEFAULT clause can be restored.

7.d) Clusters using CHAR semantics.

conn / as sysdba
select OWNER, OBJECT_NAME from ALL_OBJECTS
 where OBJECT_TYPE = 'CLUSTER'
   and OBJECT_NAME in (select unique (TABLE_NAME) from DBA_TAB_COLUMNS where char_used ='C')
 order by 1,2;

If this gives rows back then the change will fail with "ORA-01447: ALTER TABLE does not operate on clustered columns". Those clusters need to be dropped and recreated after the change.

7.e) Unused columns using CHAR semantics

conn / as sysdba
select OWNER, TABLE_NAME from DBA_UNUSED_COL_TABS
 where TABLE_NAME in (select unique (TABLE_NAME) from DBA_TAB_COLUMNS where char_used ='C')
 order by 1,2;

Unused columns using CHAR semantics will give an "ORA-00604: error occurred at recursive SQL level 1" with an "ORA-00904: "SYS_C00002_09031813:50:03$": invalid identifier". Note that the "SYS_C00002_09031813:50:03$" identifier will differ for each column. These unused columns need to be dropped.
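A minimal sketch for physically dropping the unused columns of one affected table; run it for each OWNER.TABLE_NAME pair returned above (the table name here is illustrative):

conn / as sysdba
-- physically remove all columns previously marked as unused on this table
ALTER TABLE scott.testutf8 DROP UNUSED COLUMNS;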

7.f) Check that you have enough room to run Csalter or to import the "Convertible" data again afterwards.

In 10g and up verify, at least in toutf8.txt/toutf8fin.txt, the "Expansion" column found under [Database Size] and check that you have at least 2 times the listed expansion free in the SYSTEM tablespace. This is the size needed for Csalter to update Data Dictionary CLOB data; otherwise you will see errors like "ORA-01691: unable to extend lob segment SYS.SYS_LOB0000058943C00039$$ by 1598 in tablespace SYSTEM" during Csalter. In general (for any version) it's a good idea to check the "Expansion" column and see that there is enough space in each listed tablespace. The "Expansion" column gives an estimation of how much more space you need in that tablespace when going to the new character set. The Tablespace Expansion for tablespace X is calculated as the grand total of the differences between the byte length of a string converted to the target character set and the original byte length of this string, over all strings scanned in tables in X. The distribution of values in blocks, PCTFREE, free extents, etc. are not taken into account.
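A minimal sketch for comparing the free space per tablespace against the "Expansion" column, using only the standard DBA_FREE_SPACE dictionary view:

conn / as sysdba
-- free space per tablespace, in MB; compare with the Csscan "Expansion" column
select tablespace_name, round(sum(bytes)/1024/1024) "FREE_MB"
  from dba_free_space
 group by tablespace_name
 order by 1;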

8) Summary of steps needed to use Alter Database Character Set / Csalter:

8.a) For 9i and lower:

8.a.1) Export all the "Convertible" User/Application Data (make sure that the character set part of the NLS_LANG is set to the current database character set during the export session).
8.a.2) If you have "Convertible" data for the SYS objects SYS.METASTYLESHEET, SYS.RULE$ or SYS.JOB$, then follow Note 258904.1 Convertible data in Data Dictionary: Workarounds when changing character set for those objects.
8.a.3) Truncate the exported tables of point 8.a.1).
8.a.4) Run Csscan again with the syntax of point 6.b) to verify you only have "Changeless" User/Application Data left.
8.a.5) If this now reports only "Changeless" data then proceed to step 9), otherwise repeat the above for the rows you missed.
8.a.6) Adapt any columns if needed to avoid "Truncation".
8.a.7) Import the exported data again.

8.b) For 10g and up:

8.b.1) Export all the "Convertible" User/Application Data (make sure that the character set part of the NLS_LANG is set to the current database character set during the export session).
8.b.2) Fix any "Convertible" in the SYS schema using Note 258904.1. All "9i only" fixes in Note 258904.1 Convertible data in Data Dictionary: Workarounds when changing character set should NOT be done in 10g and up.
8.b.3) Truncate the exported tables of point 8.b.1).
8.b.4) Run Csscan with the syntax of point 6.b) to verify you only have "Convertible" CLOB in the Data Dictionary and all other data is "Changeless".
8.b.5) If this is now correct then proceed to step 9), otherwise repeat the above for the rows you missed.

When using Csscan in 10g and up, toutf8.txt or toutf8fin.txt needs to contain the following before doing step 9):

The data dictionary can be safely migrated using the CSALTER script
and
All character type application data remain the same in the new character set

If this is NOT seen in the toutf8.txt then Csalter will NOT work; it means something was missed or not all steps in this note were followed.

8.b.6) Adapt any columns if needed to avoid "Truncation".
8.b.7) Import the exported data again.

9) Running Csalter/Alter Database Character Set

Please perform a backup of the database. Check the backup. Double-check the backup.

9.a) For 8i/9i

Shut down the listener and any application that connects locally to the database. There should be only ONE connection to the database during the WHOLE time, and that is the sqlplus session in which you do the change.

9.a.1) Make sure the PARALLEL_SERVER (8i) and CLUSTER_DATABASE parameters are set to FALSE or are not set at all. When using RAC you will need to start the database as a single instance with CLUSTER_DATABASE = FALSE.

conn / as sysdba
sho parameter CLUSTER_DATABASE
sho parameter PARALLEL_SERVER

9.a.2) Execute the following commands in sqlplus connected as "/ AS SYSDBA":

conn / as sysdba
SPOOL Nswitch.log
SHUTDOWN IMMEDIATE;
STARTUP MOUNT;
ALTER SYSTEM ENABLE RESTRICTED SESSION;
ALTER SYSTEM SET JOB_QUEUE_PROCESSES=0;
ALTER SYSTEM SET AQ_TM_PROCESSES=0;
ALTER DATABASE OPEN;
ALTER DATABASE CHARACTER SET INTERNAL_USE AL32UTF8;
SHUTDOWN IMMEDIATE;
-- in 8i you need to do another startup/shutdown
STARTUP;
SHUTDOWN;

An ALTER DATABASE CHARACTER SET typically takes only a few minutes or less; the time depends on the number of columns in the database, not the amount of data. Without the INTERNAL_USE clause you get an ORA-12712: new character set must be a superset of old character set.

9.a.3) Restore the PARALLEL_SERVER (8i) and CLUSTER_DATABASE parameters if necessary and start the database. For RAC, start the other instances.

WARNING WARNING WARNING

Do NEVER use "INTERNAL_USE" unless you did follow the guidelines STEP BY STEP here in this note and you have a good idea what you are doing.

Do NEVER use "INTERNAL_USE" to "fix" display problems, but follow Note:179133.1 The correct NLS_LANG in a Windows Environment or Note:264157.1 The correct NLS_LANG setting in Unix Environments

If you use the INTERNAL_USE clause on a database where there is data listed as convertible without exporting that data then the data will be corrupted by changing the database character set !

9.b) For 10g and up

Csalter.plb needs to be used within 7 days after the Csscan run, otherwise you will get a 'The CSSCAN result has expired' message.

Shut down the listener and any application that connects locally to the database. There should be only ONE connection to the database during the WHOLE time, and that is the sqlplus session in which the change is done. RAC systems need to be started as a single instance.

Run in sqlplus connected as "/ AS SYSDBA":

conn / as sysdba
-- Make sure the CLUSTER_DATABASE parameter is set to FALSE or is not set at all.
-- If you are using RAC you will need to start the database as a single instance
-- with CLUSTER_DATABASE = FALSE
sho parameter CLUSTER_DATABASE


-- if you are using spfile, note the values of
sho parameter job_queue_processes
sho parameter aq_tm_processes
-- (this is Bug 6005344, fixed in 11g)
-- then do

shutdown
startup restrict
SPOOL Nswitch.log

-- do this alter system or you might run into "ORA-22839: Direct updates on SYS_NC columns are disallowed"
-- This is only needed in 11.1.0.6, fixed in 11.1.0.7, not applicable to 10.2 or lower
-- ALTER SYSTEM SET EVENTS '22838 TRACE NAME CONTEXT LEVEL 1,FOREVER';

then run Csalter.plb:

@?/rdbms/admin/csalter.plb

-- Csalter will ask for confirmation - do not copy/paste all the actions at one time
-- sample Csalter output:

-- 3 rows created.
-- ...
-- This script will update the content of the Oracle Data Dictionary.
-- Please ensure you have a full backup before initiating this procedure.
-- Would you like to proceed (Y/N)?y
-- old 6: if (UPPER('&conf') <> 'Y') then
-- new 6: if (UPPER('y') <> 'Y') then
-- Checking data validity...
-- begin converting system objects

-- PL/SQL procedure successfully completed.

-- Alter the database character set...
-- CSALTER operation completed, please restart database

-- PL/SQL procedure successfully completed.
-- ...
-- Procedure dropped.

-- if you are using spfile then you need to also

-- ALTER SYSTEM SET job_queue_processes=<original value> SCOPE=BOTH;
-- ALTER SYSTEM SET aq_tm_processes=<original value> SCOPE=BOTH;

shutdown
startup

and the database will be AL32UTF8.

Note: in 10.1 Csalter asks for "Enter value for 1: ".

-- Would you like to proceed ?(Y/N)?Y
-- old 5: if (UPPER('&conf') <> 'Y') then
-- new 5: if (UPPER('Y') <> 'Y') then
-- Enter value for 1:

-> simply hit enter.

10) Reload the data pump packages after a change to AL32UTF8 in 10g and up.

For 10g and up the DataPump packages need to be reloaded after a conversion to AL32UTF8. In order to do this, run the following scripts from $ORACLE_HOME/rdbms/admin in sqlplus connected as "/ AS SYSDBA":


For 10.2.X and higher:
catnodp.sql
catdph.sql
catdpb.sql

For 10.1.X:
catnodp.sql
catdp.sql
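A minimal sketch of such a session for 10.2.X and higher (the scripts are the ones listed above, run via the @? shorthand for $ORACLE_HOME):

conn / as sysdba
-- remove and reload the DataPump packages
@?/rdbms/admin/catnodp.sql
@?/rdbms/admin/catdph.sql
@?/rdbms/admin/catdpb.sql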

In some cases exp (the original export tool) fails in 10g after changing to AL32UTF8; please see Note 339938.1 Full Export From 10.2.0.1 Aborts With EXP-56 ORA-932 (Inconsistent Datatypes) EXP-0.

11) Import the exported data again.

Note that if the Csscan done in point 4) reported ONLY "Changeless" and NO "Convertible" data (this is not often seen), then there is no data to import when using Csalter/Alter database.

11.a) When using Csalter/Alter database and there was "Truncation" data in the csscan done in point 4:

"Truncation" data is always ALSO "Convertible"; it is "Convertible" data that needs action before you can import it again. If there was "Truncation" then typically this is handled by pre-creating the tables using CHAR semantics, or an enlarged column size in bytes, after changing the database to AL32UTF8 using Csalter/Alter database and before starting the import.

Note that simply setting NLS_LENGTH_SEMANTICS=CHAR in the init.ora will NOT work to go to CHAR semantics.
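A minimal sketch of pre-creating one such table with CHAR semantics before the import; the table definition here is illustrative only, and the real DDL should come from the source database (e.g. via DBMS_METADATA.GET_DDL) with the BYTE lengths replaced by CHAR semantics:

conn / as sysdba
-- hypothetical pre-created table; the import then only loads the rows
CREATE TABLE scott.testutf8
( item_name VARCHAR2(80 CHAR) );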

Once the measures for solving the "Truncation" are in place you can then import the "Truncation/Convertible" data.

Set the parameter BLANK_TRIMMING=TRUE to avoid the problem documented in Note 779526.1 CSSCAN does not detect data truncation for CHAR datatype - ORA-12899 when importing. Use the IGNORE=Y parameter for imp, or the TABLE_EXISTS_ACTION=TRUNCATE option for Impdp, to import the data into the pre-created tables.
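A minimal sketch of the matching import into the pre-created table (dump file and table names are illustrative; IGNORE=Y makes imp load the rows into the existing table instead of failing on the CREATE TABLE):

$ export NLS_LANG=AMERICAN_AMERICA.AL32UTF8
$ imp system/<password>@<TNSalias> FILE=conv_tables.dmp FROMUSER=scott TOUSER=scott TABLES=(testutf8) IGNORE=Y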

Import the exported data back into the (now AL32UTF8) database. When using the "old" Exp/Imp tools, the NLS_LANG setting is simply AMERICAN_AMERICA.<source NLS_CHARACTERSET> OR AMERICAN_AMERICA.AL32UTF8; both are correct. Expdp/Impdp does not use the NLS_LANG for data conversion. Once the data is imported, go to step 12.

11.b) When using Full export/import and there was "Truncation" data in the csscan done in point 4:

"Truncation" data is always ALSO "Convertible"; it is "Convertible" data that needs action before you can import it again. If there was "Truncation" then typically this is handled by pre-creating the tables using CHAR semantics, or an enlarged column size in bytes, after creating the new AL32UTF8 database and before starting the import.

Note that simply setting NLS_LENGTH_SEMANTICS=CHAR in the init.ora will NOT work to go to CHAR semantics.

Once the measures for solving the "Truncation" are in place you can then import the "Truncation"/"Convertible" data.

Set the parameter BLANK_TRIMMING=TRUE to avoid the problem documented in Note 779526.1 CSSCAN does not detect data truncation for CHAR datatype - ORA-12899 when importing. Use the IGNORE=Y parameter for imp, or the TABLE_EXISTS_ACTION=TRUNCATE option for Impdp, to import the data into the pre-created tables.

Import the exported data back into the new AL32UTF8 database. When using the "old" Exp/Imp tools, the NLS_LANG setting is simply AMERICAN_AMERICA.<source NLS_CHARACTERSET> OR AMERICAN_AMERICA.AL32UTF8; both are correct. Expdp/Impdp does not use the NLS_LANG for data conversion. Once the data is imported, go to step 12.

11.c) When using Csalter/Alter database and there was NO "Truncation" data, only "Convertible" and "Changeless" in the csscan done in point 4:

Set the parameter BLANK_TRIMMING=TRUE to avoid the problem documented in Note 779526.1 CSSCAN does not detect data truncation for CHAR datatype - ORA-12899 when importing.

Import the exported data back into the (now AL32UTF8) database. When using the "old" Exp/Imp tools, the NLS_LANG setting is simply AMERICAN_AMERICA.<source NLS_CHARACTERSET> OR AMERICAN_AMERICA.AL32UTF8; both are correct. Expdp/Impdp does not use the NLS_LANG for data conversion.

Once the data is imported, go to step 12.

11.d) When using full export/import and there was NO "Truncation" data, only "Convertible" and "Changeless" in the csscan done in point 4:

Create a new AL32UTF8 database and set the parameter BLANK_TRIMMING=TRUE to avoid the problem documented in Note 779526.1 CSSCAN does not detect data truncation for CHAR datatype - ORA-12899 when importing.

Import the exported data into the new AL32UTF8 database. When using the "old" Exp/Imp tools, the NLS_LANG setting is simply AMERICAN_AMERICA.<source NLS_CHARACTERSET> OR AMERICAN_AMERICA.AL32UTF8; both are correct. Expdp/Impdp does not use the NLS_LANG for data conversion.

Once the data is imported, go to step 12.

12) Check your data

Use a correctly configured client or Oracle SQL Developer / iSQL*Plus and verify your data; see Note 788156.1 AL32UTF8 / UTF8 (Unicode) Database Character Set Implications.
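A minimal sketch for spot-checking the stored bytes of a converted column; DUMP with the 1016 format code shows the character set name and the byte values in hex (table and column names are illustrative):

conn / as sysdba
-- show the character set and hex bytes actually stored for a sample of rows
select item_name, dump(item_name, 1016) from scott.testutf8 where rownum <= 10;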

For RAC, restore the CLUSTER_DATABASE parameter, remove the BLANK_TRIMMING=TRUE parameter if needed, and restart the instance(s). The CSMIG user can also be dropped. If you did not use CHAR semantics for all CHAR and VARCHAR2 columns and PL/SQL variables, it might be an idea to consider this.

References


BUG:5172797 - DATA PUMP THROWING ORA-600 [KWQBESPAYL:PICKLE] WHEN CONVERTING TO UTF8
NOTE:179133.1 - The correct NLS_LANG in a Windows Environment
NOTE:225912.1 - Changing the Database Character Set ( NLS_CHARACTERSET )
NOTE:264157.1 - The correct NLS_LANG setting in Unix Environments
NOTE:444701.1 - Csscan output explained
NOTE:458122.1 - Installing and Configuring Csscan in 8i and 9i (Database Character Set Scanner)
NOTE:745809.1 - Installing and configuring Csscan in 10g and 11g (Database Character Set Scanner)
NOTE:788156.1 - AL32UTF8 / UTF8 (Unicode) Database Character Set Implications

Related

Products

Oracle Database Products > Oracle Database > Oracle Database > Oracle Server - Enterprise Edition

Keywords

AL32UTF8; CHARACTERSET; CSSCAN; MULTIBYTE; NLS_CHARACTERSET; UNICODE

Errors

EXP-0; EXP-56; ORA-30556; ORA-932; ORA-904; ORA-14265; ORA-38301; ORA-12712; ORA-1866; ORA-22839; ORA-2262; ORA-12899; ORA-1691; ORA-1447; ORA-604; ORA-12710; ORA-14450