Special characters via XML

grinder

Member
Hello Everybody,

I am trying to import data via XML and I have a problem when the file contains special characters.
e.g. when the string is "äöüÄÖÜ-áéíóú-àèìòù-âêîôû" then the string in my DB is "„”Ž™š- ‚¡¢£-…Š•—-ƒˆŒ“–"
I tried to convert it with
mcCurrVal = CODEPAGE-CONVERT(mcCurrVal, SESSION:CHARSET, "UTF-8":U).
But then I get other, equally wrong characters.

Does anybody know about this?

Greetings
Phil
 

dayv2005

Member
Did you try using the CH() function? I think that's the right name. If you open your system character map and click on the character, it should show you a number; try CH(that number), which may work. A while back I had a similar issue with an XML file when inserting page breaks. If you need more info, email me and I'll send you what I did: dpipes@aimntls.com
 

grinder

Member
Hi dayv2005,

Thanks for the answer, but that didn't work. Maybe I did something wrong. I guess by CH() you mean the CHR() function?
Can you please provide me with an example of your solution (I will send you an email)? I have to confess that I am not very familiar with codepages and conversion.

Anyway I managed to convert some of the characters by using e.g. the following code:
IF INDEX(mcCurrVal, "":U) > 0 THEN
mcCurrVal = SUBSTRING(mcCurrVal, 1, INDEX(mcCurrVal, "":U) - 1) + "ä":U + SUBSTRING(mcCurrVal, INDEX(mcCurrVal, "":U) + 1, LENGTH(mcCurrVal)).
But if I use this solution I have to convert every single character.
Is there no other way like:
mcCurrVal = CODEPAGE-CONVERT(mcCurrVal, "don't know what codepage i should use":U, "UTF-8":U).
 

grinder

Member
I have found a partial solution:
mcCurrVal = CODEPAGE-CONVERT(mcCurrVal, SESSION:CHARSET, "ibm850":U).

But I still have problems with some other characters like "ß".

Any ideas?

Thanks in advance
Phil
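
The partial success with ibm850 is consistent with the garbled string from the first post: those characters are what ibm850 bytes look like when they are read as 1252. The sketch below illustrates this in Python rather than ABL, purely to show the byte-level effect. The helper names `misread` and `repair` are made up for this example, and Python's `cp850` and `cp1252` codecs are assumed to correspond to the Progress ibm850 and 1252 codepages.

```python
# Illustration only: Python codecs stand in for the Progress codepages.
# "cp850" is assumed to match ibm850 and "cp1252" the 1252 database codepage.

def misread(text: str, actual: str, assumed: str) -> str:
    """Encode text with the codepage the bytes really use, then decode
    the bytes with the codepage the reader wrongly assumes."""
    return text.encode(actual).decode(assumed, errors="replace")


def repair(garbled: str, actual: str, assumed: str) -> str:
    """Undo misread(): recover the raw bytes, then decode them correctly."""
    return garbled.encode(assumed).decode(actual)


sample = "äöÄÖÜ"

# ibm850 bytes read as 1252 give the characters reported in the first post.
as_850 = misread(sample, actual="cp850", assumed="cp1252")
print(as_850)

# UTF-8 bytes read as 1252 would look different: two characters per umlaut.
as_utf8 = misread(sample, actual="utf-8", assumed="cp1252")
print(as_utf8)

# Reversing the wrong assumption restores the original text.
print(repair(as_850, actual="cp850", assumed="cp1252"))
```

The sample leaves out "ü" because its ibm850 byte (0x81) has no character assigned in Windows-1252, which may be why it is missing from the garbled string. If the data really arrived as ibm850 bytes, the file may not have been UTF-8 in the first place, but that cannot be confirmed from the thread alone.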
 

Casper

ProgressTalk.com Moderator
Staff member
What codepage is your database and what codepage is the XML file?

Regards,

Casper.
 

grinder

Member
Hi Casper,

The codepage of my DB is 1252 and the codepage of the XML file is UTF-8. Trying ibm850 was just a shot in the dark.

Greetings
Phil
 

Casper

ProgressTalk.com Moderator
Staff member
Did you already try to convert it with the input statement... e.g.:
Code:
input stream StreamName from <xmlfilename> convert SOURCE "UTF-8"

HTH,

Casper.
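
The idea behind this suggestion is to convert while reading instead of afterwards. As a rough illustration in Python (not ABL), the sketch below decodes a file from its source codepage as it is read and then encodes the text into the target codepage. The function name `read_and_convert`, the sample XML content, and the temporary file are all assumptions made for the demonstration.

```python
import os
import tempfile


def read_and_convert(path: str, source: str, target: str) -> bytes:
    """Decode the file from its source codepage while reading,
    then encode the text into the target codepage."""
    with open(path, "r", encoding=source) as handle:
        text = handle.read()
    return text.encode(target)


# Hypothetical one-line XML file, written as UTF-8 for the demonstration.
xml_text = '<?xml version="1.0" encoding="UTF-8"?><value>äß</value>'
with tempfile.NamedTemporaryFile("wb", suffix=".xml", delete=False) as tmp:
    tmp.write(xml_text.encode("utf-8"))
    demo_path = tmp.name

try:
    converted = read_and_convert(demo_path, source="utf-8", target="cp1252")
finally:
    os.remove(demo_path)  # clean up the temporary file

print(converted)
```

This only works when the declared source codepage matches the bytes actually in the file; if the file is not really UTF-8, the decoding step fails or produces wrong characters.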
 

grinder

Member
Hi, I tried what you recommended, but it does not work.
BTW: I told my business partner to send the XML file as valid UTF-8. Characters like "ß" are NOT UTF-8 conformant, afaik.

The easiest solution for this problem is the following:
If you want to send chars like "ß" via XML, you have to convert them into a numeric character reference of the form &#<decimal character code>;

For my example with "ß" it is &#223;

This is the easiest way I found.
This might also be the way to do it correctly without conversions. I tried several combinations with CODEPAGE-CONVERT, but none of them worked with all the special characters I'd like/have to use.

This solution also works with cyrillic characters :)

Thx to dayv2005 and Casper for your help...

cu
Phil
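
The numeric character reference approach can be sketched as follows, again in Python rather than ABL and only as an illustration. The number in the reference is the character's Unicode code point in decimal (223 for "ß", which happens to match its value in 1252), so the same scheme covers Cyrillic and other characters. The helper name `to_char_refs` is made up for this example.

```python
import xml.etree.ElementTree as ET


def to_char_refs(text: str) -> str:
    """Replace every non-ASCII character with a decimal reference (&#N;).
    Note: this does not escape &, < or >, which XML also requires."""
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


escaped = to_char_refs("Straße Я")
print(escaped)

# An XML parser turns the references back into the original characters.
parsed = ET.fromstring("<value>" + escaped + "</value>").text
print(parsed)

# For comparison: "ß" also has a regular two-byte UTF-8 encoding.
utf8_bytes = "ß".encode("utf-8")
print(utf8_bytes)
```

Because the resulting file contains only ASCII bytes, it reads the same under UTF-8, 1252, and ibm850, which is probably why this approach avoids the codepage problems described above.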
 