diff --git a/content/posts/2024/lmdb/lmdb-with-c.md b/content/posts/2024/lmdb/lmdb-with-c.md
new file mode 100644
index 0000000..897baa4
--- /dev/null
+++ b/content/posts/2024/lmdb/lmdb-with-c.md
@@ -0,0 +1,201 @@
+---
+title: "🗄️ Basics of LMDB with C"
+date: 2024-06-25T22:57:31+03:00
+draft: true
+tags: [db, lmdb, tutorial]
+---
+
+https://daemon.pizza/posts/lmdb-with-c/
+
+LMDB is a neat embedded key-value database. The name stands for Lightning Memory-Mapped Database. It is useful for small applications where you don’t have to worry about schemas or relations. Think of the use case where you might use a hashmap but want the state to be persistent. LMDB is written in C and can be linked via pkg-config. A lot of the time people use a wrapper, but I wanted to see what it was like to use it directly from C. It isn’t difficult, but I found basic tutorials hard to come by. So in this post I’ll show you how to store and retrieve string key-value pairs.
+
+## Installation
+
+The library has been around for a while, so most distributions package it. Run whichever line matches yours:
+
+```sh
+sudo pacman -S lmdb # Arch Linux
+sudo apt install liblmdb-dev # Debian / Ubuntu
+```
+
+In my `meson.build` I declare the dependency like so:
+
+```meson
+lmdb = dependency('lmdb')
+```
+
+Add the header to your source file like so:
+
+```c
+#include <lmdb.h>
+```
+
+
+## Usage
+
+LMDB has its own terminology, but if you are familiar with databases it shouldn’t feel that foreign. [BoltDB](https://github.com/boltdb/bolt), which is written in Go, started out as a port of LMDB, and I found that its model helped me understand how the LMDB API works.
+
+
+## Environment
+
+The first thing to set up is the environment. I find it easier to think of the environment as the “database”, but LMDB has a specific use for the word database, which we will see later. Opening the environment points LMDB at the folder where the data is stored. Emphasis on folder, not file, and the folder must already exist because LMDB will not create it. Once the environment is open, the folder contains a `lock.mdb` and a `data.mdb` file, which LMDB uses to persist data. Initializing the environment looks like this:
+
+```c
+MDB_env *env;
+mdb_env_create(&env);
+
+mdb_env_open(env, "./testdb", 0, 0664); // ./testdb must already exist
+```
+
+The more important call is `mdb_env_open`. Here we pass the environment handle, the path to the folder, `0` for the flags (no special options, which gives the default read-write environment), and the permission mode for the files LMDB creates, which you will recognize from UNIX `chmod` calls.
+
+Once the environment is open we can start carrying out transactions to read and write data.
+
+
+## Put
+
+To put key-value pairs into the database we use `mdb_put` inside of a transaction.
+
+Let’s start by creating a transaction.
+
+```c
+MDB_txn *txn;
+mdb_txn_begin(env, NULL, 0, &txn);
+```
+
+Here we use the environment handle from earlier, pass `NULL` for the parent transaction (it is possible to have nested transactions), `0` for the flags, which gives a read-write transaction, and finally the address of the transaction handle to fill in.
+
+The next thing to do is open the database, which I think is better understood as a “table” or, as BoltDB calls them, a “bucket”. We open the database inside of a transaction.
+
+```c
+MDB_dbi dbi;
+mdb_dbi_open(txn, NULL, 0, &dbi);
+```
+
+The parameters are similar to those of the transaction, except that `NULL` means use the default database. (Older code calls this function `mdb_open`; `mdb_dbi_open` is the current name for the same call.) If we specified a name it would open up a specific database, which I find easier to think about as a table. In this tutorial I won’t be showing how to do that. You can find more info on that in the [official documentation](http://www.lmdb.tech/doc/starting.html).
+
+After all that ceremony we are finally ready to put the key-value pair in. We begin by setting up our key and value, which are both of type `MDB_val`.
+
+```c
+char *skey = "foo";
+char *sval = "bar";
+
+MDB_val key, val;
+
+// Use +1 to include the \0 character (strlen needs <string.h>)
+key.mv_size = strlen(skey) + 1;
+key.mv_data = skey;
+
+val.mv_size = strlen(sval) + 1;
+val.mv_data = sval;
+```
+
+Once the key and value are set up we put them in the database.
+
+```c
+mdb_put(txn, dbi, &key, &val, 0);
+```
+
+Make sure to commit the transaction, which persists the write and frees the transaction handle. The database handle is just an integer ID and needs no freeing.
+
+```c
+mdb_txn_commit(txn);
+```
+
+
+## Get
+
+Once we have stored the value we want to be able to get it back. This works similarly to put, in that we begin a transaction and open the database.
+
+```c
+MDB_txn *txn;
+MDB_dbi dbi;
+
+mdb_txn_begin(env, NULL, MDB_RDONLY, &txn);
+mdb_dbi_open(txn, NULL, 0, &dbi);
+```
+
+Here we only have to set up the key:
+
+```c
+char *skey = "foo";
+MDB_val key;
+key.mv_size = strlen(skey) + 1;
+key.mv_data = skey;
+```
+
+Then we can finally get the value:
+
+```c
+MDB_val val;
+mdb_get(txn, dbi, &key, &val);
+```
+
+The data is in the `mv_data` field, a `void *` that points into LMDB’s memory map and is only valid until the transaction ends:
+
+```c
+printf("%s\n", (char *)val.mv_data);
+```
+
+Once we are done, make sure to end the transaction. Nothing was written, so aborting is fine:
+
+```c
+mdb_txn_abort(txn);
+```
+
+
+## Cursor
+
+Now that we have covered the basics, the last thing we are going to go over is cursors. Here we will see how to list all the key-value pairs in the database.
+
+By this point you know the drill: set up a transaction and a database.
+
+```c
+MDB_txn *txn;
+MDB_dbi dbi;
+
+mdb_txn_begin(env, NULL, MDB_RDONLY, &txn);
+mdb_dbi_open(txn, NULL, 0, &dbi);
+```
+
+The one thing that is different is that we open a cursor:
+
+```c
+MDB_cursor *cursor;
+mdb_cursor_open(txn, dbi, &cursor);
+```
+
+We also need to declare the key and value that the cursor will fill in.
+
+```c
+MDB_val key, data;
+```
+
+Then we iterate through the database with the cursor. The loop ends when `mdb_cursor_get` returns `MDB_NOTFOUND`:
+
+```c
+int rc;
+while ((rc = mdb_cursor_get(cursor, &key, &data, MDB_NEXT)) == 0) {
+    printf("key: %s, value: %s\n", (char *)key.mv_data, (char *)data.mv_data);
+}
+```
+
+Finally we close the cursor and the transaction:
+
+```c
+mdb_cursor_close(cursor);
+mdb_txn_abort(txn);
+```
+
+
+## Final Remarks
+
+One thing I failed to show is that almost every `mdb_` call returns an error code, with `0` meaning success.
+
+To get a meaningful error message from a code `rc` we can use `mdb_strerror`:
+
+```c
+fprintf(stderr, "mdb_txn_commit: (%d) %s\n", rc, mdb_strerror(rc));
+```
+
+Hopefully this helps on your journey to use LMDB in your cool projects!