Fix MySQL ERROR 1366: Incorrect string value When Inserting Unicode or Emoji

intermediate๐Ÿ—„๏ธ MySQL2026-04-02| MySQL 5.5โ€“8.x on Linux, macOS, Windows; any application inserting emoji or multi-byte Unicode characters

Error Message

ERROR 1366 (HY000): Incorrect string value: '\xF0\x9F\x98\x80...' for column 'content' at row 1
#mysql#charset#utf8mb4#unicode#emoji#encoding

The Problem

You insert a row that contains an emoji โ€” or Chinese, Arabic, or certain Japanese kanji โ€” and MySQL hits the brakes:

ERROR 1366 (HY000): Incorrect string value: '\xF0\x9F\x98\x80...' for column 'content' at row 1

Those \xF0\x9F\x98\x80 bytes are the UTF-8 encoding of ๐Ÿ˜€. Every 4-byte UTF-8 character triggers the same error.

Root Cause

Here's the trap: MySQL's utf8 charset is not real UTF-8. It only supports 3-byte sequences โ€” the Basic Multilingual Plane โ€” so emoji and many rare CJK characters fall outside that range and get rejected flat out. The actual fix is utf8mb4, genuine 4-byte UTF-8 support that's been available since MySQL 5.5.3.

This mismatch can exist at four levels: server default, database, table, or individual column. You need to fix every level that's still set to plain utf8.

Quick Fix โ€” Alter the Affected Column

Only one column causing trouble? Change it directly:

ALTER TABLE posts
  MODIFY COLUMN content TEXT
  CHARACTER SET utf8mb4
  COLLATION utf8mb4_unicode_ci;

Retry the insert. If it goes through, you're unblocked for now โ€” but read the Permanent Fix section below. Other columns will hit the same wall eventually.

Permanent Fix

Step 1 โ€” Check current charset settings

-- Server defaults
SHOW VARIABLES LIKE 'character_set%';
SHOW VARIABLES LIKE 'collation%';

-- Database level
SHOW CREATE DATABASE your_database_name;

-- Table level
SHOW CREATE TABLE posts;

Anything showing utf8 instead of utf8mb4 needs to change.

Step 2 โ€” Update the MySQL server config

Edit /etc/mysql/mysql.conf.d/mysqld.cnf (or /etc/my.cnf on older setups):

[mysqld]
character-set-server  = utf8mb4
collation-server      = utf8mb4_unicode_ci

[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

Then restart:

sudo systemctl restart mysql

Step 3 โ€” Convert the database

ALTER DATABASE your_database_name
  CHARACTER SET utf8mb4
  COLLATION utf8mb4_unicode_ci;

Step 4 โ€” Convert all tables and columns at once

CONVERT TO changes every text column in a table in one shot:

ALTER TABLE posts
  CONVERT TO CHARACTER SET utf8mb4
  COLLATE utf8mb4_unicode_ci;

ALTER TABLE users
  CONVERT TO CHARACTER SET utf8mb4
  COLLATE utf8mb4_unicode_ci;

-- Repeat for every table

Got 20+ tables? Let MySQL generate the statements for you:

SELECT CONCAT(
  'ALTER TABLE `', TABLE_NAME, '` CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;'
)
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = 'your_database_name'
  AND TABLE_TYPE = 'BASE TABLE';

Copy the output, paste it into a client, and run it all at once.

Step 5 โ€” Fix your application connection string

Database converted โ€” but still seeing errors? If the client connection negotiates utf8, the problem comes straight back. Set the charset explicitly when connecting.

PHP (PDO):

$pdo = new PDO(
  'mysql:host=localhost;dbname=mydb;charset=utf8mb4',
  $user, $pass
);

PHP (MySQLi):

$conn = new mysqli('localhost', $user, $pass, 'mydb');
$conn->set_charset('utf8mb4');

Python (mysql-connector):

cnx = mysql.connector.connect(
  host='localhost', database='mydb',
  user='user', password='pass',
  charset='utf8mb4'
)

Node.js (mysql2):

const pool = mysql.createPool({
  host: 'localhost',
  database: 'mydb',
  charset: 'utf8mb4'
});

SQLAlchemy (Python):

engine = create_engine(
  'mysql+pymysql://user:pass@localhost/mydb?charset=utf8mb4'
)

Step 6 โ€” Handle the innodb_file_format issue (MySQL 5.5โ€“5.7 only)

On MySQL 5.x, utf8mb4 with long index keys can slam into a 767-byte index prefix limit. If you see ERROR 1071: Specified key was too long right after converting, add these lines to my.cnf:

[mysqld]
innodb_file_format    = Barracuda
innodb_file_per_table = 1
innodb_large_prefix   = 1

MySQL 8.0 ships with this enabled by default. Nothing to do there.

Verify the Fix

Three quick checks to confirm it all landed correctly:

-- 1. Check server charset
SHOW VARIABLES LIKE 'character_set_server'; -- should be utf8mb4

-- 2. Check the specific column
SHOW FULL COLUMNS FROM posts WHERE Field = 'content';
-- Collation column should show utf8mb4_unicode_ci

-- 3. Try inserting an emoji
INSERT INTO posts (content) VALUES ('Hello ๐Ÿ˜€๐ŸŽ‰');
SELECT content FROM posts ORDER BY id DESC LIMIT 1;

Emoji comes back intact? You're done.

Tips

Tracking down encoding bugs in a data pipeline can feel like detective work. When the error log only shows truncated hex like \xF0\x9F\x98\x80..., you need a way to inspect the raw bytes without uploading anything sensitive. I use the Base64 Encoder at toolcraft.app/en/tools/developer/base64-encoder โ€” paste the problematic string, see its byte representation, and confirm whether it has 4-byte sequences. A \xF0 prefix always means a 4-byte codepoint. Fast way to rule out guesswork.

utf8mb4_unicode_ci vs utf8mb4_general_ci

Both collations store emoji just fine. The difference shows up in sorting:

  • utf8mb4_unicode_ci โ€” follows the Unicode sorting standard. Better for multilingual content. Use this as your default.
  • utf8mb4_general_ci โ€” marginally faster on old hardware, but sorting accuracy suffers. Skip it for new projects.
  • utf8mb4_0900_ai_ci โ€” MySQL 8.0+ default. Most accurate option. Fine to use if you're on MySQL 8.

Summary

  • MySQL's utf8 charset rejects 4-byte characters โ€” emoji, rare CJK glyphs, anything above U+FFFF. Switch to utf8mb4.
  • Apply the change at all four levels: server config, database, table/column, and application connection string.
  • CONVERT TO CHARACTER SET utf8mb4 converts every column in a table in one command.
  • MySQL 8.0 defaults to utf8mb4. On fresh installs, this error usually only appears when the schema was created with an explicit utf8 setting.

Related Error Notes