Introduction to Some MySQL Storage Engines
Introduction
MySQL is one of the most popular relational database systems in the world, used by most large websites. Therefore, mastering MySQL is an essential requirement for any webmaster.
The logical architecture of MySQL can be broadly described as shown below:
We can see that MySQL has basic components as listed below:
- Connection/thread handling.
- Query cache.
- Parser.
- Optimizer.
- Storage engine.
Understanding the function and role of each component is essential to using MySQL effectively. This article will focus on introducing the bottom-most component in the above model: the Storage engine.
Storage Engine
The Storage Engine is essentially how MySQL stores data on disk. MySQL stores each database as a subdirectory under the data directory. When a table is created, MySQL stores the table definition in a file with the .frm extension and the same name as the created table. Managing the table definition is the responsibility of the MySQL server, even though each storage engine stores and indexes data differently.
For example, if I specify –datadir as /usr/local/mysql/data and define the users table in a database named test as follows:
create table users (
id int not null auto_increment,
name varchar(30),
password varchar(20),
primary key(id)
);
Then, in the /usr/local/mysql/data directory, there will be a subdirectory named test, and under test, there will be the following files:
-rw-rw---- 1 _mysql wheel 8624 5 7 17:35 users.frm
-rw-rw---- 1 _mysql wheel 98304 5 7 17:35 users.ibd
To check the current storage engine of a table, you can use the command SHOW TABLE STATUS:
mysql> show table status like 'users' \G
*************************** 1. row ***************************
Name: users
Engine: InnoDB
Version: 10
Row_format: Compact
Rows: 2
Avg_row_length: 8192
Data_length: 16384
Max_data_length: 0
Index_length: 0
Data_free: 0
Auto_increment: 3
Create_time: 2013-05-07 17:35:09
Update_time: NULL
Check_time: NULL
Collation: latin1_swedish_ci
Checksum: NULL
Create_options:
Comment:
1 row in set (0.01 sec)
In this case, the storage engine is: InnoDB, row format: Compact, the number of data rows: 2, and the next auto increment value: 3.
Overview of Engines
1. MyISAM engine
Features:
- Full-text indexing.
- Compression.
- Spatial functions (GIS).
- No support for transactions.
- No support for row-level locking.
Storage:
MyISAM stores each table’s data in 2 files: .MYD for data and .MYI for indexes. Rows have two types: dynamic and static (depending on whether the data length changes). The maximum number of rows stored is limited by the operating system and disk capacity. By default, MyISAM uses a 6-byte pointer to point to the data record, which limits the data size to 256TB.
Features:
- MyISAM locks the entire table. The user (MySQL server) acquires a shared lock for reading and an exclusive lock for writing. However, reading and writing can occur concurrently!
- MyISAM has the ability to automatically repair and recover data after a system crash.
- Use the check table/repair table command to check for errors and recover after failure.
- MyISAM can index full-text, supporting full-text search.
- MyISAM does not write data directly to the hard disk, but to a buffer in memory (and only writes to the disk after a certain period), which speeds up writes. However, after a server crash, data recovery must be done using myisamchk.
- MyISAM supports data compression, which helps speed up data reading. However, compressed data cannot be updated.
2. InnoDB engine
Features:
- It is the most complex engine in MySQL.
- Supports transactions.
- Good recovery and repair support.
Storage:
InnoDB stores data in a single file (referred to as tablespace).
Features:
- InnoDB supports MVCC (Multiversion Concurrency Control) to improve concurrent access and supports next-key locking strategies.
- InnoDB is built on a clustered index, so searching by primary key is very efficient. However, InnoDB does not support sorting indexes, so changes to the table structure will cause all data to be re-indexed from scratch (SLOW with large tables).
3. Memory engine
Features:
- Also known as HEAP tables.
Storage:
All data is stored in memory.
Features:
- After the server restarts, the table structure is preserved, but data is lost.
- The Memory engine uses HASH indexes, which makes query lookup very fast.
- The Memory engine uses table-level locking, so concurrency is not high.
4. Archive engine
Features
- Only supports Insert and Select.
- No indexing.
- Data is buffered and compressed using zlib, which reduces I/O, thus increasing write speed.
Features:
- High write speed, suitable for logging applications.
5. CSV engine
Features:
- Treats CSV files as a table.
- No indexing support.
Features:
- If the task is to extract information from a CSV file and write it to the database, and also need the CSV result directly from the DB, this engine seems appropriate.
6. Falcon engine
Features:
- Designed for modern hardware: 64-bit servers, large memory.
- Still quite new, with few user cases.
7. Maria engine (Related Database: MariaDB)
Features:
- Designed by seasoned MySQL engineers, aiming to replace MyISAM.
- Supports transactions as an option.
- Error recovery.
- Row-level locking and MVCC.
- Better support for BLOBs.
Criteria for Selecting an Engine
- Transactions: If the application requires transactions, InnoDB is the only choice. If transactions are not needed, MyISAM is a good choice.
- Concurrency: If high load is required and transactions are not necessary, MyISAM is the number one choice.
- Backup: All engines somewhat support backups. Additionally, system design support for backups is necessary. For example: If you design a database server with a master and slave, the master requires transactions, so InnoDB should be used; the slave needs backups and reading, so MyISAM can be used. The master-slave synchronization mechanism will help manage the differences between engines while ensuring backups. This criterion has a small weight.
- Crash recovery: MyISAM has weaker crash recovery compared to InnoDB.
- Feature requirements: If the requirement is logging, MyISAM or Archive is a reasonable choice. If direct CSV storage is needed, CSV engine should be considered. If the application rarely modifies data (e.g., a book database), MyISAM with compression is a suitable option.
Conclusion
This article introduced an overview of storage engines, an important component of the database system. Several prominent engines and their features were presented. The criteria for selecting each type of engine were also discussed.
Hopefully, through this article, you gained an overview of database storage engines in general and MySQL storage engines in particular, while understanding the importance of choosing the right storage engine.
Source: ktmt.github.io