วันอังคารที่ 16 กุมภาพันธ์ พ.ศ. 2553

NLS Guide for Generic Connectivity and Gateways [ID 230239.1]


 

  Modified 18-DEC-2008     Type BULLETIN     Status PUBLISHED  

PURPOSE
-------

Analysing NLS issues when HS or Gateways are involved.


SCOPE & APPLICATION
-------------------

How to analyse NLS related issues with Gateways or Generic Connectivity.


How to solve NLS related problems with generic connectivity or gateways
-----------------------------------------------------------------------

This note is devided into four parts. The first part is some useful
background information. Part two is a description of the language
parameter, part three is a guide how to solve decimal
separator or lost decimal issues and part four describes steps
to check why characters are not displayed correctly.


I. Background Information
#########################
Each database works with a special character set, language and
territory setting.

Ideally, the character set of the Oracle database server
and the non-Oracle data source are the same. Then no
conversion during data transfer is needed.
If the character set differs, Heterogeneous Services / Gateways must
translate the character set of the non-Oracle data source
to the Oracle database character set and back again.


To be able to translate between two kind of things, you must know
the source and the target data. This is true for translating from
one language to another as well as for currency ...

The Generic connectivity/Gateway translates characters and
territory settings from the remote database to the Oracle database
and vice versa.
To do so, it must be aware of the remote character and territory settings.
A parameter called HS_LANGUAGE could be set in the initialisation file
of the gateway to reflect the REMOTE settings.
HS_LANGUAGE provides Heterogeneous Services / Gateways with character set,
language and territory information of the non-Oracle data source.
While the remote settings are specified in the initialisation file
and the local settings are defined with the NLS_LANG parameter a
conversation should be successful.

HS_LANGUAGE=language[_territory.character_set]


II. Language settings and their relationship
############################################

The language part of the HS_LANGUAGE initialization parameter
determines:

Day and month names of dates
AD, BC, PM, and AM symbols for date and time
Default sorting mechanism

Note: HS_LANGUAGE does not determine the language for error messages.



III. Truncated number columns or lost decimals
##############################################

Decimal delimiters depend on country specific settings. Some
countries use a comma, others a point to separate decimals.
Local and remote databases can work with different decimal delimiters.
Without being aware of the remote decimal delimiter it might happen
that decimals are lost during conversation.

Example:

- Microsoft SQL Server with American NLS settings
- Oracle Server 9.2.0.1 with German NLS settings
- GERMAN_GERMANY.WE8MSWIN1252.
- Derived NLS_NUMERIC_CHARACTERS from German NLS_TERRITORY is ", ."

MS SQL Server with a table num and 2 columns:
a as numeric with a value 8.5
and b as char with a string 'd'
in it

SQL>select * from num@sql7;

a b
---------- ---------------
8 d

In this sample everything beyond the decimal point is truncated.
The reason is that the SQL*Plus session works with GERMAN
territory settings, and inherits this setting to the HS.
MS SQL Server works with American settings. Both regions use
different decimal delimiters - German schema uses ','; American
schema '.'-.

To resolve this issue the parameter HS_LANGUAGE in
the initialisation file of the gateway must be
changed to AMERICAN_AMERICA.WE8MSWIN1252 to reflect the
REMOTE NLS settings.

SQL>select * from num@sql7;

a b
---------- ---------------
8,5 d

Sometimes it's hard to figure out the real Territory of the
foreign database; in might be changed due to the client settings.
So another simple approach is to vary the HS_LANGUAGE parameter
and try it:
For the first try set the HS_Language to
AMERICAN_AMERICA.we8iso8859p1. This will work for foreign data stores
having a POINT as decimal delimeter. After changing any HS parameter,
please make sure you close the old connection. The new parameter will be
picked up only during the connection phase.
If you still won't be able to get the correct values back, now try
HS_LANGUAGE with an European character set like GERMAN_GERMANY.we8iso8859p1

In almost all cases the decimals could be displayed with method 1 or 2
correctly.


IV. Characters are not displayed correctly
##########################################

A) Single Byte characters
A code page, also known as character set, is a set of 256
uppercase and lowercase letters, numbers and symbols. The printable
characters of the first 128 values are the same for all character set
choices. The last 128 characters differ from set to set. All those 256
characters are single byte characters (0x00-0xFF).

Before analysing any details, make sure that the operating system
is capable of dealing with these characters:
Microsoft Windows NT contains an ASCII table.
The program is called charmap.exe and located %SystemRoot%\System32\.

Let's check out the ARIAL font; it contains for example a character
'Ä' (0xC4).
To make sure, that the operating system can display this character 'Ä',
copy it from this Ascii table and paste it into NOTEPAD.

Is it displayed correctly?
If not, you have to figure out the language YOUR character belongs to
and the Windows environment must be changed accordingly.
This is done by changing the regional settings of the operating system
( -> Start -> Settings -> Control Panel -> Regional Settings).

Once you figured out the correct settings, proceed to the next step. Do NOT
proceed until you're able to display YOUR character correctly in NOTEPAD!
If you have problems to figure out the correct settings, please contact
Microsoft Support.

The next step is to verify that SQL*Plus is capable displaying the
characters.

SQL*PLUS GUI (sqlplusw):
------------------------
At this stage we don't need a connection to the database. Start a GUI
SQL*Plus session with the nolog option.
sqlplusw /nolog

Side Note: Everything is done with the GUI based SQL*Plus.
Inserting from the Windows command line tool and selecting the
inserted characters from the GUI based SQL*Plus could cause problems,
because the DOS Box character set ( 437 or 850...) differs from the GUI
character set and characters might be interpreted wrongly.

From the ASCII table choose again YOUR character and copy paste it now into
the SQL*Plus session. Does it display correctly?

If not, there are a few parameters to change the FONT and CHARACTER set
of SQL*Plus:
The most important registry values are
SQLPLUS_FONT and SQLPLUS_FONT_CHARSET.
Both are string values (REG_SZ) and
must be added manually to the HKEY_LOCAL_MACHINE/SOFTWARE/ORACLE/HOMEx key
where x is the number of your Oracle_Home installation associated with
SQL*Plus was called from. These values are read each time SQL*Plus starts.

SQLPLUS_FONT:
Any fixed-pitch TrueType font available in Windows system such as
Courier New or Lucida Console can be used. Choosing a proportional
pitch font like Arial or Times New Roman, or entering an unavailable
font, the registry entry is ignored and the default font
Fixedsys 16, is used.

Prior Oracle 9iR2 this was hard work: A capable font that is used
by SQL*Plus and that also contains the character that should be
displayed must be figured out.

Oracle 9iR2 introduced the SQLPLUS_FONT_CHARSET and with this value it
becomes easier. The SQLPLUS_FONT can now be the default value and the
character set is managed by the
SQLPLUS_FONT_CHARSET
registry entry.
It defines the font subset used in the SQL*Plus Windows GUI.
The following values are valid:
DEFAULT
SYMBOL
SHIFTJIS
HANGEUL
GB2312
CHINESEBIG5
OEM
JOHAB
HEBREW
ARABIC
GREEK
TURKISH
VIETNAMESE
THAI
EASTEUROPE
RUSSIAN
MAC
BALTIC

While choosing one of those from the list, it should be possible to
display the character in SQL*Plus.
This step is also a preleminary step and you must not proceed before this
works.

SQL*Plus COMMAND LINE (sqlplus):
To configure the command line version of SQL*Plus please open a DOS command line
window and check out the PROPERTIES of this window. The font needs to be a TRUE
TYPE font like "Lucida Console", the default RASTER FONT is not able to display
any foreign character.

Next step is to check out the CODE PAGE of the shell by typing
"chcp" (chcp =>change codepage and it will display the active code page).
Please make sure that this code page is able to display your characters; else
change the code page to a code page that covers your characters. More
details can be found at Microsoft's Web site.
Last step is now to set NLS_LANG correctly in this COMMAND Line window.
There is NO need to change the registry key at all.



By now, the OS and SQL*Plus can display the character.
The next questions are:
Is the character inserted correctly in the remote database?
Is the Oracle database capable to deal with this character?
Is HS/Gateway capable to deal with this character?


Figuring out the correct ASCII value for a character in the remote table
depends on the database vendor:

AS/400 servers for example:
'DSPFFD' to get the codepage of the column/table and use
select hex(<col>) from table to get the dump values.
To insert dediacted hex values into an AS400 table the function char(X'<hex value>')
like char(X'b8') can be used.

MS SQL Server for example provides the ASCII function:
select ascii(<column name>) from <table>

Demonstration of the ASCII function of MS SQL Server:
select ascii('T')
reports
84
This output is the decimal value for this character 'T'; 84 decimal is the
same as 54 hexadecimal and this is accoring to the charmap.exe the capital
T.

While the inserted characters correspond to the correct values in the
charmap, the next step is to verify the capability of the Oracle database:

select dump('T') from dual;
results in:
DUMP('T')
----------------
Typ=96 Len=1: 84

The important infomation is '84'. This is decimal and represents the
'T' letter.

Both methods (dumping the remote and the Oracle side) to verify the
decimal value for a character resolve the same result.

Side Note:
Sometimes also the application might insert characters incorrectly. Thus insert the
character with SQL*Plus into a test table and afterwards again with the
application. Then compare both dumps.


The last step now is to verify the HS sub system:
Only after all the steps above are successful, the HS system should be
checked. HS depends on a well configured remote and local database.

The simplest way to verify the HS is again a test table at the remote
database with test characters inserted. Especially use one character
out of the first 128 characters from the character map. They are the
same for all character maps and this can be very helpful.
Then also insert the characters that are not being displayed correctly.

Then create the same table in the Oracle database and fill it with the
same records.

Now dump the characters of the Oracle db and also from the remote db.
select dump(<column>) from <table>@<db_link>;
select dump(<column>) from <table>;

Are the decimal values the same?
If the dumped value is the same, but the SQL*Plus output differs,
then it looks like HS is wrongly initialized.
Check out the initialisation file of the HS/Gateway and verify
the settings of HS_LANGUAGE:
The HS_LANGUAGE initialization parameter provides Heterogeneous
Services with character set, language and territory information
of the non-Oracle data source. The value of the HS_LANGUAGE
initialization parameter has to be of the following format:

<language>[_<territory>.<character_set>]

Character_set:
This should be set to the REMOTE DATABASE CHARACTER SET or
to a super set of the remote database character set.

While the character set specifies the remote database character set,
all parameters (language and territory) must be initialized!

Language:
Day and month names of dates
AD, BC, PM, and AM symbols for date and time
Default sorting mechanism

Territory:
The territory clause of the HS_LANGUAGE initialization
parameter specifies the conventions for day and week numbering,
default date format, decimal character and group separator,
and ISO and local currency symbols.

Important:
The level of National Language Support between the Oracle server
and the non-Oracle data source depends on how the driver is implemented.
See the installation and users' guide for your platform for more
information about the level of National Language Support.


B)MULTI BYTE Characters:
The same from above is also valid for MULTI BYTE as well as UNICODE
characters.
The only difference is that the dump will differ:

Example for dumping unicode characters of a MS SQL Server:
insert into <table> values (nchar(30333)...)
select unicode(<column>) from table

This value belongs to a chinese character set; thus setting the
NLS_LANG of the client to ZHT16BIG5 and dumping now this remote table
with SQL*Plus and Oracle 9i onwards:
select asciistr(<column>) from <table>@<db_link>;
results in:
ASCIISTR("B")
----------------------
\767D

The Hex value 767D is equivalent to decimal 30333 we have inserted.

To verify the retrieved value matches the same character in an Oracle db a
select unistr('\767D') from dual; should show up the same character as
it is stored in the foreign database.

If the dump values differ between the source and the target database,
make sure the NLS_LANG is set to at least a character set that contains
the remote character. If the character set match, the next reason could be
the HS subsytem not being aware of the UNICODE characters.
The initialisation file of the gateway/hs knows a parameter called:

HS_NLS_NCHAR
and this parameter must refelct the REMOTE NATIONAL CHARACTER SET;
for MS SQL Server for example: UCS2
=> HS_NLS_NCHAR=UCS2




Additional comment:
Instead of configuring SQL*Plus another approach would be to use Microsoft
Word (if installed on the machine) and use the MS Word unicode support.

There is NO need to configure SQL*Plus at all. The following procedure will
perform a select using the gateway or HSODBC and write the output directly into
a file using the unicode compliant utl_file function.
Then just open the DOC file output using MS Word and the preview will display
the characters within the DOC:

Make sure the following directory exists or adapt it according to your
preferences:
d:\test_utf8

Now log into the database and prepare UTL_FILE function:

create or replace directory TEST_UTF8 as 'd:\test_utf8';
grant read, write on directory TEST_UTF8 to public;

Assuming now we have a table at the SQL Server database called
"russian" and "col3" conatins the russian characters:

CREATE OR REPLACE procedure testUTF8_4 as
v_file UTL_FILE.FILE_TYPE;
line_org varchar2(2000);
line_conv varchar2(2000);
v_datensatz varchar2(20000);
Cursor c1 is
Select "col3" from "russia"@sqlserver;

begin
v_file := UTL_FILE.FOPEN_nchar('TEST_UTF8','RUSSIAN.doc', 'w');
open c1;
LOOP
fetch c1 into v_datensatz;
Exit when c1%notFOUND;

UTL_FILE.PUT_LINE_nchar(v_file, v_datensatz);
UTL_FILE.NEW_LINE (v_file);
end loop;
close c1;
UTL_FILE.FClose(v_file);
DBMS_OUTPUT.PUT_LINE('DONE!');
End;

This procedure does not depend on the SQL*Plus settings. So there is no
need to change the registry keys. It is using the database settings and writes a
file to d:\test_utf8 called RUSSIAN.doc. Now open this doc file using MS Word
and MS Word is displaying a preview Window where you can choose the correct
character set by simply selecting it from a list and depending on the selection
the preview window will change.


To get rid of the UTL_FILE_DIR configuration please execute:
drop directory TEST_UTF8;


RELATED DOCUMENTS
-----------------

Manual: SQL*Plus, Getting Started, Release 9.2 for Windows
Note 199926.1 NLS_LANG Explained (How does Client-Server Character Conversion Work?)
Note 226558.1 An example inserting cyrillic data into a database on west european windows
Note 69518.1 Storing and Checking Character Codepoints in a UTF8/AL32UTF8 (Unicode) database










Show Related Information Related






Products





    Keywords








    CONNECTIVITY; DECIMAL; HETEROGENEOUS; TG4MSQL; TG4SYBS




    0 ความคิดเห็น:

    แสดงความคิดเห็น