  • MySQL and Unicode using UTF-8


    Aug 10, 2009 information_schema MySQL unicode utf-8

    With globalization, plain ASCII is no longer sufficient. If one day you have to develop a project in German, Russian or even Japanese, you could adapt the charset for each of these languages, or you could simply use Unicode from the start.

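    In MySQL, Unicode text can be stored using the UTF-8 encoding.

    Note that UTF-8 characters are variable in length and that the encoding is ASCII compatible. In ASCII, 1 character = 1 byte; in UTF-8, 1 character takes between 1 and 4 bytes. MySQL's utf8 charset, however, stores at most 3 bytes per character; the 4-byte range requires the utf8mb4 charset, available from MySQL 5.5.3.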
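
    The variable width can be observed directly on the server. The following is a minimal sketch, assuming the utf8 charset is available; the characters are built from their UTF-8 byte sequences with UNHEX so that the result does not depend on the encoding of the client connection.

    ```sql
    -- UTF-8 byte sequences: 'a' = 61, 'ş' = C5 9F, '€' = E2 82 AC
    SELECT
        CHAR_LENGTH(CONVERT(UNHEX('61')     USING utf8)) AS a_chars,
        LENGTH(CONVERT(UNHEX('61')          USING utf8)) AS a_bytes,
        CHAR_LENGTH(CONVERT(UNHEX('C59F')   USING utf8)) AS s_chars,
        LENGTH(CONVERT(UNHEX('C59F')        USING utf8)) AS s_bytes,
        CHAR_LENGTH(CONVERT(UNHEX('E282AC') USING utf8)) AS euro_chars,
        LENGTH(CONVERT(UNHEX('E282AC')      USING utf8)) AS euro_bytes;
    ```

    Each of the three values is a single character, while the byte counts are 1, 2 and 3 respectively.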

    UTF-8 charset and collation on the server

    In MySQL, the charset (character set) determines which characters a string can hold and how they are encoded.

    To check whether UTF-8 is installed on the server:

    SHOW CHARSET LIKE 'utf8';
    

    or, using information_schema:

    SELECT * FROM information_schema.CHARACTER_SETS WHERE CHARACTER_SET_NAME = 'utf8';
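
    The same information_schema table also reports how many bytes a character can occupy in each charset. As a sketch, assuming a server that may also ship utf8mb4, the pattern below lists every charset whose name starts with utf8:

    ```sql
    -- MAXLEN is the maximum number of bytes a single character can occupy
    SELECT CHARACTER_SET_NAME, DEFAULT_COLLATE_NAME, MAXLEN
    FROM information_schema.CHARACTER_SETS
    WHERE CHARACTER_SET_NAME LIKE 'utf8%';
    ```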
    

    If the charset is found, we can continue.

    Alongside the charset there is the collation: the set of rules used to compare and sort strings.

    To see what collations are available on the server:

    SHOW COLLATION WHERE Charset = 'utf8';
    

    or, using information_schema:

    SELECT * FROM information_schema.COLLATIONS WHERE CHARACTER_SET_NAME = 'utf8';
    

    Collations are usually language specific, defining for example whether strings are compared with or without regard to diacritics. A “bin” collation compares strings in binary mode, by character code, so “A” and “a” are different and “A” sorts before “a”.
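
    The effect of the collation on a comparison can be checked with a quick query. This is only a sketch, assuming the utf8_general_ci and utf8_bin collations are installed (both ship with the utf8 charset); the column aliases are arbitrary.

    ```sql
    SELECT
        _utf8'A' = _utf8'a' COLLATE utf8_general_ci AS equal_ci,    -- 1: case is ignored
        _utf8'A' = _utf8'a' COLLATE utf8_bin        AS equal_bin,   -- 0: different codes
        _utf8'A' < _utf8'a' COLLATE utf8_bin        AS upper_first; -- 1: 0x41 < 0x61
    ```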

    If no collation is specified, the default collation of the charset is used.
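
    To find out which collation is the default for a charset, the IS_DEFAULT column of information_schema.COLLATIONS can be queried. A minimal sketch, assuming the charset is listed under the name utf8 as in the queries above:

    ```sql
    -- IS_DEFAULT is 'Yes' for the default collation of each charset
    SELECT COLLATION_NAME
    FROM information_schema.COLLATIONS
    WHERE CHARACTER_SET_NAME = 'utf8'
      AND IS_DEFAULT = 'Yes';
    ```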

    UTF-8 and databases

    When creating a database, you can specify the default charset, which is used for every new table that does not specify its own.

    For example:

    CREATE DATABASE db_name CHARACTER SET utf8 COLLATE utf8_romanian_ci;
    

    Or, to change the default charset of an existing database:

    ALTER DATABASE db_name CHARACTER SET utf8 COLLATE utf8_romanian_ci;
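
    To confirm that the database defaults were applied, information_schema can be used again. A short sketch, where db_name is the placeholder database name from the statements above:

    ```sql
    -- Shows the default charset and collation recorded for the database
    SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
    FROM information_schema.SCHEMATA
    WHERE SCHEMA_NAME = 'db_name';
    ```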
    

    UTF-8, tables and columns

    To modify tables which already exist, ALTER TABLE is used.

    A table can have a default charset and collation, and each column can have its own charset and collation.
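
    As an illustration of table and column level settings, here is a sketch of a table definition. The table name tab_example and its columns are hypothetical: c1 inherits the table defaults, while c2 overrides the collation.

    ```sql
    CREATE TABLE tab_example (
        -- no charset given: uses the table default (utf8, utf8_romanian_ci)
        c1 VARCHAR(200),
        -- explicit charset and collation for this column only
        c2 VARCHAR(200) CHARACTER SET utf8 COLLATE utf8_bin
    ) DEFAULT CHARACTER SET utf8 COLLATE utf8_romanian_ci;
    ```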

    For more information about the table:

    SHOW CREATE TABLE tab;
    

    To set the default charset of an existing table:

    ALTER TABLE tab CHARSET = utf8 COLLATE = utf8_romanian_ci;
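
    Changing the table default affects only columns added afterwards; existing character columns keep their current charset. If the goal is to convert the existing columns too, MySQL offers CONVERT TO. A sketch using the same placeholder table (it is prudent to back up the data first, since the stored values are re-encoded):

    ```sql
    -- Converts all character columns of the table and sets the new default
    ALTER TABLE tab CONVERT TO CHARACTER SET utf8 COLLATE utf8_romanian_ci;
    ```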
    

    To change the charset of a VARCHAR(200) column:

    ALTER TABLE tab MODIFY c1 VARCHAR(200) CHARSET utf8 COLLATE utf8_romanian_ci;
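
    The charset and collation of each column can then be verified through information_schema. A minimal sketch, assuming the table tab lives in the currently selected database:

    ```sql
    -- Non-character columns show NULL for charset and collation
    SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
    FROM information_schema.COLUMNS
    WHERE TABLE_SCHEMA = DATABASE()
      AND TABLE_NAME = 'tab';
    ```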
    

    String size

    A “problem” that may arise is related to the size of a character, which in UTF-8 can be between 1 and 4 bytes. That is why, to measure the number of characters in a string column (like VARCHAR), you must use CHAR_LENGTH(str) instead of LENGTH(str), which returns the size in bytes.

    A short example:

    SET @var = 'aşadar';
    SELECT CHAR_LENGTH(@var) AS 'Char', LENGTH(@var) AS 'Length';

    -- The output is: Char = 6 and Length = 7, because ş takes 2 bytes
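
    The same two functions can be combined to find the rows that contain multi-byte characters. A sketch, reusing the placeholder table tab and column c1 from the previous section:

    ```sql
    -- Rows where the byte count differs from the character count
    -- contain at least one character stored on more than 1 byte
    SELECT c1, CHAR_LENGTH(c1) AS chars, LENGTH(c1) AS bytes
    FROM tab
    WHERE LENGTH(c1) <> CHAR_LENGTH(c1);
    ```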
    

Claudiu Perșoiu


Copyright © 2008 - 2024 CLAUDIU PERȘOIU'S BLOG. All Rights Reserved