[T0002] RDS MariaDB character set

2024-08-05

Problem

When I was developing locally, Korean text worked fine. But when I switched to AWS RDS, Korean text did not work correctly.

Environment

DB: MariaDB in docker

version: 11.4 (latest)

Tracking

I searched the internet. It seemed to be a character encoding problem.

As for the version, the latest Docker image was 11.4 while RDS was on 10.11, so I changed the local MariaDB version to 10.11.

I checked the current encoding settings.

In mysql client
show variables where variable_name like 'c%';
|                | character_set_server | collation_server   |
|----------------|----------------------|--------------------|
| Docker MariaDB | utf8mb4              | utf8mb4_general_ci |
| RDS MariaDB    | latin1               | latin1_swedish_ci  |
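To look at just the two values compared above, rather than every variable starting with c, a narrower query can be used. This is a small sketch that should behave the same way on both servers:

```sql
-- Show only the two server-level settings compared above.
SHOW VARIABLES
WHERE Variable_name IN ('character_set_server', 'collation_server');

-- Equivalent one-row form using system variable syntax.
SELECT @@character_set_server, @@collation_server;
```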

Why do they have different settings without any manual change?

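RDS uses MariaDB's built-in default values (latin1 and latin1_swedish_ci on 10.11). The Debian and Ubuntu packages, which the Docker image appears to follow, ship a different default (utf8mb4).

https://mariadb.com/kb/en/differences-in-mariadb-in-debian-and-ubuntu/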

Solution

I created a parameter group in RDS and changed the values of character_set_server and collation_server in it.

Then I created the database instance using that parameter group.

Some blogs say that the other parameters with the character_set_ prefix should also be changed. But it was enough to change the two, character_set_server and collation_server. Refer to https://mariadb.com/kb/en/server-system-variables/#character_set_server

Changing these parameters on an RDS instance that is already running requires a DB reboot.
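One way to confirm that the parameter group took effect is to query the server variables and do a round trip with Korean text. The following is a minimal sketch, assuming the client connection and terminal also use UTF-8; the database name charset_check and the table greeting are hypothetical and exist only for this check.

```sql
-- Make the client connection use utf8mb4 so the literal below is sent intact.
SET NAMES utf8mb4;

-- Both values should now reflect the parameter group, not the latin1 defaults.
SELECT @@character_set_server, @@collation_server;

-- A database created without an explicit character set inherits the server default.
CREATE DATABASE charset_check;  -- hypothetical name
SELECT DEFAULT_CHARACTER_SET_NAME, DEFAULT_COLLATION_NAME
FROM information_schema.SCHEMATA
WHERE SCHEMA_NAME = 'charset_check';

-- Round trip: the Korean text should come back unchanged.
CREATE TABLE charset_check.greeting (msg VARCHAR(50));  -- hypothetical table
INSERT INTO charset_check.greeting (msg) VALUES ('안녕하세요');
SELECT msg FROM charset_check.greeting;

-- Clean up the throwaway database.
DROP DATABASE charset_check;
```

Databases and tables that existed before the change keep the character set they were created with, so only newly created ones pick up the new server default.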

Further


The character set can be set at the server level, database level, table level, or column level. I chose the server level.

https://mariadb.com/kb/en/setting-character-sets-and-collations/
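For comparison, the other three levels can be sketched in SQL as follows. This is only an illustration, not what I used: the names app_db, posts, title, and legacy_note are hypothetical, and each narrower level overrides the broader one.

```sql
-- Database level: default for tables created in this database.
CREATE DATABASE app_db
  CHARACTER SET utf8mb4
  COLLATE utf8mb4_general_ci;

-- Table level: default for columns in this table.
-- Column level: legacy_note overrides the table default.
CREATE TABLE app_db.posts (
  id          INT PRIMARY KEY,
  title       VARCHAR(200),  -- inherits utf8mb4 from the table
  legacy_note VARCHAR(200) CHARACTER SET latin1 COLLATE latin1_swedish_ci
) CHARACTER SET utf8mb4 COLLATE utf8mb4_general_ci;

-- Inspect what each column ended up with.
SELECT COLUMN_NAME, CHARACTER_SET_NAME, COLLATION_NAME
FROM information_schema.COLUMNS
WHERE TABLE_SCHEMA = 'app_db' AND TABLE_NAME = 'posts';
```

Setting the value once at the server level avoids repeating these clauses on every database and table.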


What is collation?

Collate means "collect and combine (texts, information, or sets of figures) in proper order".

A character set is a set of characters while a collation is the rules for comparing and sorting a particular character set.

https://mariadb.com/kb/en/character-set-and-collation-overview/

MariaDB provides several collations for each character set. For example, utf8_general_ci and utf8_unicode_ci compare and sort strings differently.

https://stackoverflow.com/questions/766809/whats-the-difference-between-utf8-general-ci-and-utf8-unicode-ci
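The effect of a collation can be seen by comparing the same strings under different collations. The sketch below uses the utf8mb4 variants of the collations named above, on the assumption that the server is set to utf8mb4; the column aliases are arbitrary.

```sql
-- Interpret the string literals below as utf8mb4.
SET NAMES utf8mb4;

-- List the collations whose names start with utf8mb4.
SHOW COLLATION LIKE 'utf8mb4%';

-- Same strings, different rules: case-insensitive vs. binary comparison.
SELECT 'a' = 'A' COLLATE utf8mb4_general_ci AS with_general_ci,  -- 1
       'a' = 'A' COLLATE utf8mb4_bin        AS with_bin;         -- 0

-- general_ci and unicode_ci can also disagree with each other,
-- for example on how the German sharp s relates to 'ss'.
SELECT 'ß' = 'ss' COLLATE utf8mb4_general_ci AS with_general_ci,
       'ß' = 'ss' COLLATE utf8mb4_unicode_ci AS with_unicode_ci;
```

The last query is worth running on the actual server version, since the available collations and their behavior vary between releases.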

In MySQL, from 8.0.1, the default character set is utf8mb4 and the default collation is utf8mb4_0900_ai_ci.

https://rastalion.dev/mysql-8-0-1-%EB%B2%84%EC%A0%84%EB%B6%80%ED%84%B0-%EC%B1%84%ED%83%9D%EB%90%9C-utf8mb4_0900_ai_ci%EC%9D%98-%ED%95%9C%EA%B8%80-%EC%82%AC%EC%9A%A9%EC%97%90-%EB%8C%80%ED%95%9C-%EB%AC%B8%EC%A0%9C%EC%A0%90/
