Russian character import/storage woes

peskywinnets

Registered User.
Local time
Today, 22:18
Joined
Feb 4, 2014
Messages
578
So I'm parsing some XML that has an address like this...

ул.Плеханова
Минск,минская
Belarus

but it ends up in Access looking like this...

??????? ????????
???????,????????

Also if I do a debug.print of the variable it is the same (i.e. ????????) in the VBA immediate window

I've google this & couldn't wrap my head around the solutions spoken of....does anyone round these parts speak jive?
 
What interest is that in spite of the question marks the string actually contains the value. I copied and pasted Минск,минская into a table and then copied it to another records with this code:


Code:
Dim s As String
s = DLookup("[Question]", "[Questions]", "[QuestionID] = 3")
CurrentDb.Execute "INSERT INTO Questions (Question) VALUES ( '" & s & "')"

The point being that it was transferred via the string s without problems albeit debug.print s produces ?????,???????
 
I suspect it is to do with the character set being used by the vba editor. Perhaps it only uses the standard character set which goes up to 128 rather than the extended set

See this link

http://www.ascii-codes.com/cp855.html

I know you can change the font for the editor, but don't think you can change the character set
 
I suspect it is to do with the character set being used by the vba editor.t

Whilst I tend to agree with that thought, the data ends up as gobbledegook in the access table itself too - yet I can copy paste the same Russian Text data into the same Access table field & it pastes/reads correctly, therefore things are going awry between the incoming XML (which has the correct characters in it) parsing/inserting into an access table field via VBA.

here's how I parse the incoming XML within VBA...
Code:
ShippingAddress1 = objxmldoc.selectSingleNode("//ebay:Order[" & O & "]/ebay:ShippingAddress/ebay:Street1").text

& here's how it get inserted into my table...
Code:
DoCmd.RunSQL "INSERT INTO Customers(ShippingAddress1)VALUES('" & ShippingAddress1 & "')"

When I pipe the variable ShippingAddress1 into a text file, it's gobbledegook in there too :-(
 
It must be going gobbledegook in the objxmldoc.selectSingleNode statement as I was able to transfer them through a string variable without damage. You could confirm this by combining the statements and avoid the variable, i,e.


Code:
DoCmd.RunSQL "INSERT INTO Customers(ShippingAddress1)VALUES('" & objxmldoc.selectSingleNode("//ebay:Order[" & O & "]/ebay:ShippingAddress/ebay:Street1").text & "')"
 
You could confirm this by combining the statements and avoid the variable, i,e.
Code:
DoCmd.RunSQL "INSERT INTO Customers(ShippingAddress1)VALUES('" & objxmldoc.selectSingleNode("//ebay:Order[" & O & "]/ebay:ShippingAddress/ebay:Street1").text & "')"

Whoop...there it is - that works ! :-)

Such a relief...thanks.

That said, the incoming XML shipping address information is normally standard UTF8 ....and because the incoming text is normally in tatters from a propercase perspective, I normally massage the parsed XML in VBA with commands like strconv() (ultimately, this data ends up on a sales receipt that I email out, so I need it to be presentable!)....so I need a way of testing whether the incoming XML text contains a weird characterset or not ...if it is, leave well alone, but if not proceed & massage the text into propercase before inserting into my Access table.

So my next challenge is how to test whether the text data is normal or weird!

This link alludes to 'testing the incoming chracters'...

http://answers.microsoft.com/en-us/...-visible/5373f885-614f-4b66-8d16-93f9a0c0ab28

"Changing your data to unicode, if that is the problem, requires you to examine each character and if it has a code of between 127 and 255 deciding if it needs changing, what it should be changed to, and then changing it. I have never done this nor do I know if there are any facilities in Access, or Windows, or third party utilities that might help."


An immediate (quick 'n Dirty) thought is just not to apply ProperCase tidying to any incoming XML text that has a ShippingCountry = "Russia"...indeed if the shipping country is Russia....just insert direct to the table from the XML as you recommended (avoiding a variable). It'd be nice to use a variable though...
 
Last edited:
Well that seems to disprove my assertion but I'm glad for you that it did. Good luck.
 
Wow, & what a beast of "gotcha" that was!

So in essence (if I'm picking this up right), you can't parse XML into a VBA variable if the raw XML contains Russian Character text (Cyrillic character set).....as storing in a VBA string variable will corrupt the characters. The only way around this is to bypass using a VBA string variable altogether & just insert the XML straight into your Access table ...which makes for some very unwieldy code!

So there's a day out of my life I'll never get back trying to get to the bottom of (not to mention all the code I wrote prior to realising the above ....now in need of some major rejigging)

Anyway thanks for your input....it really helped me.
 
It would be interesting to know if a variant type would act differently but I understand if you are tired of experimenting.
 

Users who are viewing this thread

Back
Top Bottom