
How to make MySQL handle UTF-8 properly


Handling the UTF-8 character set in MySQL is crucial, especially when dealing with internationalized data. Here are several key steps to ensure MySQL correctly handles UTF-8:

1. Set the correct character set

Ensure that the database, its tables, and their columns use the correct character set. For full Unicode support, use utf8mb4 rather than utf8. MySQL's utf8 is an alias for utf8mb3, which stores at most three bytes per character and so cannot hold characters outside the Basic Multilingual Plane, such as most emoji. utf8mb4 covers the full UTF-8 range, including four-byte characters. Specify the character set when creating the database or table:

sql
CREATE DATABASE mydatabase
  CHARACTER SET utf8mb4
  COLLATE utf8mb4_unicode_ci;

CREATE TABLE mytable (
  id INT,
  text VARCHAR(255)
) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
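
To confirm that the defaults took effect, the data dictionary can be queried. This is a minimal sketch, assuming the mydatabase and mytable names used above; it only reads from information_schema and changes nothing:

```sql
-- Default character set and collation of the database
SELECT SCHEMA_NAME, DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'mydatabase';

-- Default collation of the table; its character set is the collation's prefix
SELECT TABLE_NAME, TABLE_COLLATION
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'mydatabase'
  AND TABLE_NAME = 'mytable';
```

Both queries should report utf8mb4 and utf8mb4_unicode_ci if the statements above ran as written.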

2. Connection character set configuration

The connection itself must also use utf8mb4, otherwise the server converts data to and from a different character set in transit. With the mysql command-line client, pass the --default-character-set option:

bash
mysql -u username -p -h host --default-character-set=utf8mb4

In application code, set the character set in the connection parameters. For example, with PHP's PDO, add charset to the DSN:

php
$pdo = new PDO("mysql:host=host;dbname=dbname;charset=utf8mb4", 'username', 'password');
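
Where a client or driver offers no character set option, a common fallback is to issue SET NAMES immediately after connecting. The sketch below sets the session's client, connection, and results character sets and then reads them back; it assumes nothing beyond an open session:

```sql
-- Set the client, connection, and results character sets for this session
SET NAMES utf8mb4 COLLATE utf8mb4_unicode_ci;

-- Read the session values back; all three character sets should be utf8mb4
SELECT @@character_set_client,
       @@character_set_connection,
       @@character_set_results,
       @@collation_connection;
```

When the driver has its own setting, as PDO's charset parameter does, that setting is generally preferable, because the client library then also knows which encoding is in use.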

3. Server and client configuration

Ensure the MySQL configuration file (typically my.cnf or my.ini) sets the character set and collation. Put the server defaults in the [mysqld] section and the client default in the [client] section, then restart the server so the [mysqld] settings take effect:

ini
[mysqld]
character-set-server=utf8mb4
collation-server=utf8mb4_unicode_ci

[client]
default-character-set=utf8mb4
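
After the restart, the effective values can be checked from any session. A short sketch using standard SHOW VARIABLES statements:

```sql
-- Server-wide defaults read from the [mysqld] section
SHOW GLOBAL VARIABLES LIKE 'character_set_server';
SHOW GLOBAL VARIABLES LIKE 'collation_server';

-- All character set and collation variables for the current session
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';
```

If character_set_server still shows an older value, the server may be reading a different configuration file than the one that was edited.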

4. Convert existing data

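If existing tables use another character set, convert them to utf8mb4 with ALTER TABLE ... CONVERT TO. This rewrites the table's character columns, so back up the data first. It also assumes the stored data is validly encoded in the column's declared character set; data that was stored under the wrong character set can come out garbled after conversion.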

sql
ALTER TABLE mytable CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
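
To find what still needs converting, information_schema can list every character column that is not yet utf8mb4. This is a sketch that assumes the schema is named mydatabase, as in the earlier examples:

```sql
-- Character columns in the schema that are not yet utf8mb4
SELECT TABLE_NAME, COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'mydatabase'
  AND CHARACTER_SET_NAME IS NOT NULL
  AND CHARACTER_SET_NAME <> 'utf8mb4'
ORDER BY TABLE_NAME, ORDINAL_POSITION;

-- Change the database default so that tables created later inherit utf8mb4
ALTER DATABASE mydatabase CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```

ALTER DATABASE only changes the default for new tables; existing tables still need ALTER TABLE ... CONVERT TO.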

5. Test and verify

After configuration, test that characters are stored and retrieved correctly. Insert a row containing a four-byte character and read it back:

sql
INSERT INTO mytable (id, text) VALUES (1, '😊');
SELECT text FROM mytable WHERE id = 1;

Verify that the returned value is the same character that was inserted, not a question mark or garbled text.
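
Because a terminal or client can display a character wrongly even when it is stored correctly, it can help to inspect the stored bytes directly. A minimal sketch, assuming the mytable table from the earlier steps:

```sql
-- Show each value alongside its raw bytes and its length in bytes and characters
SELECT text,
       HEX(text)         AS stored_bytes,
       LENGTH(text)      AS byte_length,
       CHAR_LENGTH(text) AS char_length
FROM mytable;
```

For a correctly stored '😊' (U+1F60A) in a utf8mb4 column, stored_bytes is F09F988A, byte_length is 4, and char_length is 1. A stored_bytes value of 3F means a literal '?' was stored, which suggests the character was lost before it reached the table, most likely because of the connection character set.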

By following these steps, you can ensure MySQL correctly handles UTF-8, enabling reliable storage and querying of multilingual content.

August 7, 2024, 00:02
