
BUG Report - mysql utf8 encoding problem

Printed From: Web Wiz Forums
Category: Web Wiz Web App Support Forums
Forum Name: Web Wiz Forums
Forum Description: Support forum for Web Wiz Forums application.
URL: https://forums.webwiz.net/forum_posts.asp?TID=29865
Printed Date: 01 April 2026 at 2:32am
Software Version: Web Wiz Forums 12.08 - https://www.webwizforums.com


Topic: BUG Report - mysql utf8 encoding problem
Posted By: sectioni
Subject: BUG Report - mysql utf8 encoding problem
Date Posted: 12 October 2011 at 8:19pm

Unless I'm mistaken (and from my tests I'm pretty sure I'm not), when the forums run in utf8 mode, the data is not saved to MySQL as correct utf8.

I'm referring to ODBC connector 5.1, of course, since 3.51 does not support utf8.
 
A double encoding happens: the forms on the forum already send the data as utf8, and the connector re-encodes it as utf8 a second time when inserting/updating the db.
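A minimal Python sketch of the effect described above. It assumes the connector treats the incoming bytes as latin-1; the charset the driver really assumes depends on its configuration, so `assumed_charset` is only a stand-in.

```python
def double_encode(text: str, assumed_charset: str = "latin-1") -> bytes:
    """Simulate UTF-8 bytes being mislabelled and encoded a second time."""
    utf8_bytes = text.encode("utf-8")             # what the browser posts
    misread = utf8_bytes.decode(assumed_charset)  # each byte taken as one character
    return misread.encode("utf-8")                # re-encoded on insert/update


# Hebrew "shalom", written with escapes so the file itself stays ASCII.
original = "\u05e9\u05dc\u05d5\u05dd"
stored = double_encode(original)

print(original.encode("utf-8").hex())  # bytes the forum sent
print(stored.hex())                    # bytes that end up in the table
print(len(original.encode("utf-8")), len(stored))
```

Every non-ASCII byte of the original turns into two bytes in the stored value, which is why other clients reading the table directly see garbage instead of the text.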
 
If you try to read non-English data, from the tbltopic table for example, with a MySQL client (such as SQLyog), or do a backup dump of the db, you will get gibberish because the stored bytes are not the correct utf8 encoding of the text.
 
You guys need to add a check so that, if the forums use utf8, the charset=utf8 parameter is added to the ODBC 5.1 connection string. The connector will then know the data is already being passed as utf8 and will not re-encode it.
After adding the charset, existing data will be retrieved as gibberish, since the connector will try to read it as utf8 and it is not. So a conversion fix of all existing data will need to be made first.
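The conversion fix for existing rows could look roughly like this in Python. It is a sketch only: `repair_double_encoded` is a hypothetical helper, it assumes the extra encoding step went through latin-1, and it should be tried on a copy of the data first.

```python
def repair_double_encoded(value: str, assumed_charset: str = "latin-1") -> str:
    """Undo one round of double encoding.

    Returns the value unchanged when it does not look double encoded,
    so text that is already correct is left alone.
    """
    try:
        return value.encode(assumed_charset).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return value


# Build a garbled sample the same way the double encoding would.
clean = "\u05e9\u05dc\u05d5\u05dd"  # Hebrew "shalom"
garbled = clean.encode("utf-8").decode("latin-1")

print(repair_double_encoded(garbled) == clean)
print(repair_double_encoded(clean) == clean)
```

The try/except is what makes the function safe to run over a mix of old and new rows: values that cannot be mapped back through the assumed charset, or that do not decode as utf8 afterwards, are returned as they were.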
 
 
 



Replies:
Posted By: WebWiz-Bruce
Date Posted: 13 October 2011 at 12:19pm
Not seen this issue.

The other problem is that you need to connect to the database first to read in the page encoding, so the forum does not know whether it is using utf-8 until after the connection to the database has been made.

This would have to be done by editing the connection string manually.
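As an illustration of that manual edit, here is a small Python sketch that appends `charset=utf8` to an ODBC connection string when no charset option is present. The server, database, user, and password values are made up, and the forum itself does not build its connection string this way; the helper only shows what the edited string would look like.

```python
def with_utf8_charset(conn_str: str) -> str:
    """Append charset=utf8 unless a charset option is already set.

    Splits on ';', so it assumes no value contains a semicolon.
    """
    parts = [p.strip() for p in conn_str.split(";") if p.strip()]
    has_charset = any(
        p.split("=", 1)[0].strip().lower() == "charset" for p in parts
    )
    if not has_charset:
        parts.append("charset=utf8")
    return ";".join(parts) + ";"


# Hypothetical connection details, for illustration only.
base = (
    "Driver={MySQL ODBC 5.1 Driver};Server=localhost;"
    "Database=forum;Uid=forum_user;Pwd=secret"
)
print(with_utf8_charset(base))
```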


-------------
Web Wiz Forums Hosting - https://www.webwiz.net/web-wiz-forums/forum-hosting.htm
ASP.NET Web Hosting - https://www.webwiz.net/web-hosting/windows-web-hosting.htm


Posted By: sectioni
Date Posted: 13 October 2011 at 3:14pm
You probably haven't noticed this issue because browsers know how to handle double encoding.
But clients, dumps, and everything else that tries to work against that data won't.
 
All the data entered by the user via the forms is converted to utf8 (independently of the database at this point) because the utf8 meta tag tells the browser to encode it that way during the post.
 
And you can test it yourself:
Enter some foreign-language text (in a language one of you on the development team knows) and try to view the data in one of the free MySQL clients; you'll see it as gibberish.
 
And then, if you're still not convinced, the only way to test whether it's true utf8 is to look at the hex value.
 
Do a SELECT hex(columnName) on one of the columns, copy the result, and paste it into a good hex decoder such as this one: http://software.hixie.ch/utilities/cgi/unicode-decoder/utf8-decoder
(don't forget to change the input type to hex in the combo box)
and you'll see gibberish there too.
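The same hex check can be done offline with a few lines of Python instead of the web decoder. This is a sketch: the sample hex strings are for the Hebrew word "shalom", and the latin-1 step is an assumption about what the connector did.

```python
def decode_hex(hex_value: str) -> str:
    """Decode the output of SELECT HEX(column) as UTF-8 text."""
    return bytes.fromhex(hex_value).decode("utf-8")


def looks_double_encoded(hex_value: str, assumed_charset: str = "latin-1") -> bool:
    """True if the decoded text can be unwrapped one more time into valid UTF-8."""
    text = decode_hex(hex_value)
    try:
        unwrapped = text.encode(assumed_charset).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return False
    return unwrapped != text


good_hex = "D7A9D79CD795D79D"                 # correct UTF-8 bytes
bad_hex = "C397C2A9C397C29CC397C295C397C29D"  # the same word, double encoded

print(looks_double_encoded(good_hex))
print(looks_double_encoded(bad_hex))
```

Pure ASCII values never count as double encoded here, since unwrapping them changes nothing.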
 
Then make the addition I suggested to the connection string, submit a form on the forum, and try the process again. This time you will see the words correctly in both the MySQL client and the hex decoder.
 
This is causing issues for foreign languages: we can't back up the DB to a dump file because all the text is garbled, and since it is double encoded it also takes much more space in a varchar column. For example, a column with a limit of 70 chars will only be able to hold about 20 chars and will cut off the rest.
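The space loss can be estimated with a short Python sketch. It assumes the column limit is counted in characters and that each byte of the original utf8 text ends up stored as one character (the latin-1 assumption again), so the exact numbers on a real server may differ.

```python
def chars_that_fit(text: str, column_limit: int, assumed_charset: str = "latin-1") -> int:
    """Count how many leading characters of text fit in the column
    once the text has been double encoded."""
    used = 0
    fitted = 0
    for ch in text:
        # One stored character per byte of the original UTF-8 encoding.
        cost = len(ch.encode("utf-8").decode(assumed_charset))
        if used + cost > column_limit:
            break
        used += cost
        fitted += 1
    return fitted


print(chars_that_fit("a" * 100, 70))       # ASCII is unaffected
print(chars_that_fit("\u05e9" * 100, 70))  # 2-byte characters (Hebrew)
print(chars_that_fit("\u4e2d" * 100, 70))  # 3-byte characters (CJK)
```

Under these assumptions a 70-character column holds 35 two-byte characters or 23 three-byte characters, which is in the same range as the loss described above.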
 


Posted By: WebWiz-Bruce
Date Posted: 13 October 2011 at 3:27pm
There is no official support for foreign languages as they are a complete headache to support.

If you wish to use foreign languages with MySQL then you would need to edit the connection string in the way you suggest.

As I explained earlier, the encoding type has to be retrieved from the database first, so your suggestion could not be supported out of the box: the connection needs to be made to the database before the page encoding set by the forum admin is known.




Posted By: sectioni
Date Posted: 13 October 2011 at 3:31pm
That's OK.
I can handle the language bugs on my own.
Just thought you guys should know about the problems in case you do decide to support it in the future.
 
But yeah, it's a headache :)
 
 


Posted By: WebWiz-Bruce
Date Posted: 13 October 2011 at 3:37pm
I have attempted to make the software as generic as possible to limit the number of modifications needed for foreign languages, but depending on the database type, the language, the character encoding, the server locale settings, etc., there may well be modifications that need to be made.

I personally would recommend that you use SQL Server, as MySQL does not tend to stand up too well with larger forums and the myODBC driver does tend to be rather buggy.






Forum Software by Web Wiz Forums® version 12.08 - https://www.webwizforums.com
Copyright ©2001-2026 Web Wiz Ltd. - https://www.webwiz.net