Introduction to MySQL’s Innodb Memcached Interface
With MySQL 5.6 a memcache-compatible innodb-backed key-value store was added to MySQL.
The InnoDB Memcache Daemon gives you the permanence of innodb for key-value data, that can be accessed via the much faster, optimized memcached protocol — using this will skip the query parser, optimizer and other parts of engine that are unnecessary.
With mysqlnd_memcache, you can transparently route queries to this memcache-compatible interface.
Installation
The default MySQL 5.6 packages that ship with Ubuntu (Trusty) do not include the memcache plugin, to use it you should install from the official MySQL apt repository (for Debian 7.x Wheezy, Ubuntu 12.04 Precise, and Ubuntu 14.04 Trusty)
Once you have MySQL 5.6 (or later) installed, you should then login to MySQL as a super user and run:
SOURCE /usr/share/mysql/innodb_memcached_config.sql;
INSTALL PLUGIN daemon_memcached SONAME "libmemcached.so";
The innodb_memcached_config.sql
does several things, the first is to create a innodb_memcache
database, which contains three tables:
cache_policies
: This table contains policies determining howGET
,SET
,DELETE
, andFLUSH
commands are executed.containers
: This table contains a list of tables to expose via memcacheconfig_options
— This table contains memcache configuration options — specifically the multi-column valueseparator
(defaults to a pipe “|
”) and thetable_map_delimiter
(defaults to a period “.
”)
The second command loads the actual plugin and starts the memcache daemon within MySQL.
Now you are ready to start using the plugin.
Creating a Memcache Store
Because the memcache engine sets certain requirements on tables, not every innodb table is automatically exposed. Instead, you must setup what is known as a collection.
Each collection has a name, which is used to refer to it via memcache, and then maps out a number of columns:
name
: The name used to refer to the collection via memcachedb_schema
: The database namedb_table
: The database table namekey_columns
: The column name containing the key (don’t let the plural fool you, it’s a single column)value_columns
: The columns that contain values, specify them as comma separated values. In memcache column values are separated by a pipe (as per theconfig_options
table)flags
: The memcache flags to setcas_column
: The CAS value assigned by memcacheexpire_time_column
: The expiration time (in seconds), or 0 to never expireunique_idx_name_on_key
: The name of the index which places aUNIQUE
constraint on the key. If the key is the primary key, specifyPRIMARY
To create our data store, we will create a new database kv_data
, and table kv_store
:
CREATE DATABASE kv_data;
USE kv_data;
CREATE TABLE kv_store (
`key` VARCHAR(255),
`value` VARCHAR(1024),
`flags` INT,
`cas` BIGINT UNSIGNED,
`exp` INT,
primary key(`key`)
) ENGINE = INNODB;
We then tell the plugin about the container:
INSERT INTO innodb_memcache.containers(
name,
db_schema,
db_table,
key_columns,
value_columns,
flags,
cas_column,
expire_time_column,
unique_idx_name_on_key
) VALUES (
'kv_data',
'kv_data',
'kv_store',
'key',
'value',
'flags',
'cas',
'exp',
'PRIMARY'
);
Once you have done this, it may take a short while for this to apply, or you can restart MySQL.
Using the Memcache Interface
Now that you have the innodb memcache interface up and running, you can insert data just like any other database table, or of course via the memcache protocol — which can be done by hand using telnet.
When using the memcache protocol — by default, and the minimum — number of operations before the data is committed to innodb is 32, as determined by the daemon_memcached_w_batch_size
setting — this means that data is only visible via MySQL every 32 operations. This is a performance trade-off. One exception to the rule is when using binlog replication, when it is always set to 1.
To have MySQL changes available immediately via memcache, you must run:
SET TRANSACTION ISOLATION TO READ-UNCOMMITTED;
To use the interface using telnet, you need to connect using:
$ telnet localhost 11211
telnet> set test.key 0 0 11
Hello World
STORED
telnet> get test.key
VALUE test.key 0 11
Hello World
END
Using Multiple Collections
By default the data is stored in either the collection named default
, or if that does not exist, the first collection in the table.
To access data in a different collection you have two options. The first is closer to MySQL’s USE
statement, you simply get the collection itself, and then any subsequent commands on that connection are performed on that collection until it is changed again.
Collections names are prefixed with double @ signs to distinguish them:
telnet> get @@kv_data
VALUE @@kv_data 0 16
kv_data/kv_store
END
Alternatively (and my personal preference, as someone who also hates MySQL table aliases) is to use the fully qualified name. This is where the table_map_delimeter
comes into play, whereby we just prefix the key with the @@collection
and the table_map_delimeter
, so test.key
becomes @@kv_data.test.key
telnet> get @@kv_data.test.key
VALUE @@kv_data.test.key 0 11
Hello World
END
I recommend always using the fully qualified names as you may accidentally change collections in some called code without realizing it.
Using the mysqlnd_memcache Plugin
At this point, you can simply interface with the memcache interface using the memcached or memcache PHP extensions — including using it for the default session handler (memcached extension, memcache extension), or using regular SQL queries. However, with the mysqlnd_memcache plugin, you can transparently route SQL queries to the memcache interface when applicable.
By default, queries are matched against the regular expression exposed via the MYSQLND_MEMCACHE_DEFAULT_REGEXP
constant:
/^\s*SELECT\s*(.+?)\s*FROM\s*`?([a-z0-9_]+)`?\s*WHERE\s*`?([a-z0-9_]+)`?\s*=\s*(?(?=["'])["']([^"']*)["']|([0-9e\.]*))\s*$/is
If there is a match, the plugin then verifies that only columns mapped in the collection setup, and that only a single WHERE clause exists, with a simple comparison on the key column.
In our example this will match:
SELECT `value` FROM kv_store WHERE `key` = 'test.key';
However, none of these would:
SELECT * FROM kv_store WHERE `key` = 'test.key';
SELECT `value` FROM kv_store WHERE `key` = `test.key` AND value LIKE '%foo%';
SELECT `key` FROM kv_store WHERE value LIKE '%foo%';
These queries may be issued using any of the mysql, mysqli, or pdo query execution mechanisms, and will be intercepted transparently.
The mysqlnd_memcache plugin does not handle writes.
While this is of potentially limited use, it can definitely improve your code by not introducing the memcache/memcached APIs, and with possibly large performance wins for very little work.
Writing, Replication and Memcache Pools
Memcached is known for it’s ease of setting up memcached pools for load balancing and high availability, but what about the InnoDB memcache interface? Somewhat similar behavior is available with standard MySQL replication in that each slave can be used as a read-only memcached server, however the same rules of read/write splitting applies to your memcache reads and writes.
You could ensure that you only use the mysqlnd_memcache plugin to access your memcached pool, and pair with the mysqlnd_ms plugin to handle read/write splitting — however this means that you miss out on the high performance of the memcached interface for writes.
As there is no way to indicate to the memcache(d) extensions that servers are read-only it would not be possible to use a standard master+slaves replication topology as a memcached pool with them.
Conclusion
The InnoDB memcached interface is a much faster way of using MySQL for a simple key-value store, and is still backed by the reliability and persistence of the excellent InnoDB table engine.
While it is typically not quite as fast as actual memcached, it allows you to eliminate “another moving part” in your infrastructure while gaining persistence, and is a drop-in replacement for memcached.
Additionally, the have been some drastic innodb memcached engine performance gains in the upcoming MySQL 5.7 release especially for larger multi-core systems.
In the next part in this series, we will look at query caching for further performance gains.
P.S. Are you using memcached? Would you consider using the InnoDB memcached interface to add persistence? How are you currently accessing the InnoDB memcached interface? We’d love to hear your experiences.
Share your thoughts with @engineyard on Twitter
OR
Talk about it on reddit