This blog post describes my thought process in implementing a transaction log for my database. My implementation may not follow best practices, as I am still learning. If this blog post sparks ideas on what I could do better, please let me know!
JameSQL, my NoSQL database, was designed to be an in-memory data store. I use the tool for my blog search engine. The current architecture is as follows:
- A JameSQL server loads an archive of my blog posts in memory, creates indices to allow for efficient searching, and offers an API through which queries can be made;
- A user makes a query on my web page using the string query syntax available in JameSQL, which supports keyword search, negation, searching by field (e.g. blog category), and more. This query is then turned into the JSON syntax that the database understands.
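To make the translation concrete, here is a rough sketch of what it might look like. This is purely illustrative: the query string and the JSON shape below are my own approximation, not JameSQL's exact grammar.

```python
# Hypothetical illustration of translating a string query into a
# structured query; the operators and JSON shape are approximations,
# not JameSQL's exact schema.
string_query = 'coffee -decaf category:"brewing"'

json_query = {
    "query": {
        "and": [
            {"content": {"contains": "coffee"}},      # keyword search
            {"content": {"not_contains": "decaf"}},   # negation
            {"category": {"equals": "brewing"}},      # search by field
        ]
    }
}
```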
Because the database server loads all my blog posts in memory on initialisation, I need a way to update it with new posts. I wrote a script that subscribes to my RSS feeds and submits a POST request to add a new post to the server.
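The script itself is simple. Here is a minimal sketch of the idea, assuming a hypothetical /add endpoint and illustrative field names (the real endpoint and payload may differ):

```python
import feedparser
import requests

FEED_URL = "https://example.com/feed.xml"   # placeholder feed URL
SERVER_URL = "http://localhost:5000/add"    # hypothetical endpoint

# Read the RSS feed and submit each entry to the database server.
feed = feedparser.parse(FEED_URL)

for entry in feed.entries:
    document = {
        "title": entry.title,
        "url": entry.link,
        "content": entry.get("summary", ""),
    }
    requests.post(SERVER_URL, json=document)
```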
While this solves the problem of keeping the index up to date, it introduces a new problem: because the database is entirely in memory, a server crash will inevitably leave the data inconsistent. For example, if a post is newly published and the server fails, it may take several minutes for the post to be re-indexed.
I needed a storage solution that would let me persist data as it was submitted to the database.
To address this need, I implemented two changes:
- All documents in the database are stored in an index file.
- All database modification requests (add, update, delete) are made following the transaction log paradigm.
In this post, I am going to talk about how I implemented these features. My implementation only works for adding documents. I have not yet figured out an efficient way to delete documents.
Of note, the server is single-threaded. Supporting multiple threads is not yet required for my use cases, and I have significant research to do in the area of threading to better understand how to make it work.
Durable data
A JameSQL server loads and processes data in memory. I needed a way to efficiently save data to disk. For this, I decided to create a single JSONL file that stores all records in the database. This file can act as the single source of truth about the state of the database.
The JSONL file stores one JSON document per line.
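For illustration, an index file with two documents might look like this (the field names are examples):

```
{"title": "How I make coffee", "category": "coffee", "published": "2024-01-01"}
{"title": "Writing a search engine", "category": "code", "published": "2024-02-14"}
```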
With this implementation, I would have a log of all of the data. I could open the file in append mode so new records are always written to the end. The data in the file could then be used to reconstruct the database. With that said, there was a big limitation: what would happen if the server failed during an indexing request? How could I know if 50% of the documents had been indexed but not the other 50%?
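Here is a minimal sketch of the append writer and the reconstruction step, assuming a plain JSONL file; the function names are mine, not JameSQL's:

```python
import json

def append_record(index_path: str, record: dict) -> None:
    # Opening in "a" mode means writes always go to the end of the file.
    with open(index_path, "a") as f:
        f.write(json.dumps(record) + "\n")

def load_records(index_path: str) -> list:
    # Rebuild the in-memory database by replaying one JSON document per line.
    with open(index_path) as f:
        return [json.loads(line) for line in f if line.strip()]
```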
To answer those failure questions, I needed a transaction log.
Atomic transactions
A transaction log lists database operations that are going to be executed. For example, consider a scenario where I am indexing a document. I could add an entry to the transaction log that describes the pending operation, including the data I want to add. The transaction log is saved to disk. This happens before any data is indexed in memory or saved in the primary index.
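Concretely, a journal entry could be a single JSON line that names the operation and carries its data. The exact shape below is my own illustration:

```
{"type": "add", "document": {"title": "My new post", "category": "coffee"}}
```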
Then, when the transaction is complete, the log can be erased, indicating that all of the operations in the transaction have been completed and that the main index is up to date. If a transaction fails partway through, the database server can restart, read the index and transaction log, then apply all transactions before declaring the index up to date.
If a transaction log is present when a server instance starts, it means a transaction failed. The database will first load the index from disk, then load the transaction log and apply all changes according to whether they were add or remove operations.
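Here is a sketch of what that startup check could look like, assuming hypothetical file names and the add-only journal format shown above. Note that a real implementation would also need to deduplicate documents that reached the index file before the crash:

```python
import json
import os

def recover(index_path="index.jsonl", journal_path="journal.jsonl"):
    # Load the on-disk index first.
    documents = []
    if os.path.exists(index_path):
        with open(index_path) as f:
            documents = [json.loads(line) for line in f if line.strip()]

    # A leftover journal means the last transaction never finished:
    # replay its operations on top of the loaded index.
    if os.path.exists(journal_path):
        with open(journal_path) as f:
            for line in f:
                operation = json.loads(line)
                if operation["type"] == "add":
                    documents.append(operation["document"])
                # a "remove" branch would go here once deletes are supported
        os.remove(journal_path)

    return documents
```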
Transactions satisfy the atomicity criterion in the ACID database paradigm. Every change or series of changes is treated as a single transaction: either every change in the transaction is applied, or none of them are.
Because transaction data is persisted on disk, all operations can be replayed to reconcile the main database stored on the file system in the case of any issues (e.g. a server crash or system error).
The new system
When a document is added to a JameSQL database, the following process occurs:
- The record to save is added to a journal file (another term for a transaction log).
- The record is indexed using the relevant data structure.
- The record is written to the file system.
- The journal file is deleted.
If the process fails at any step, the database can crash and, on restart, fully restore its previous state.
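Putting the four steps together, the add path could look something like this. The file names and the index_in_memory helper are illustrative stand-ins, not JameSQL's actual API:

```python
import json
import os

def index_in_memory(document):
    """Placeholder for Step 2: updating the in-memory data structures."""
    ...

def add_document(document, index_path="index.jsonl", journal_path="journal.jsonl"):
    # Step 1: record the intent in the journal before anything else.
    with open(journal_path, "a") as f:
        f.write(json.dumps({"type": "add", "document": document}) + "\n")

    # Step 2: index the record in memory.
    index_in_memory(document)

    # Step 3: persist the record to the on-disk index file.
    with open(index_path, "a") as f:
        f.write(json.dumps(document) + "\n")

    # Step 4: the transaction is complete, so the journal can be cleared.
    os.remove(journal_path)
```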
It is presently unclear whether Step #2 should come last. The presence of a journal file could be seen as a lock: if a journal file exists in the above configuration, a transaction is ongoing, and, because indexing happens before the record is written to the file system and the journal is deleted, in-memory state may be mid-update. Treating the journal as a lock in this way would be useful for multithreading. I need to think more about this.
In addition, there are a few limitations with my implementation:
- Only source documents are stored. The underlying indices are not stored. This means that indices have to be rebuilt on restart. For the scale of data I am working with, this is acceptable; building the index takes under a second. With that said, this would have to be revisited if the database were used at scale. If a server crashes and it takes a minute to rebuild the index, that is a minute of downtime.
- Delete operations are not yet supported. This is because I have been thinking through how to efficiently delete records from the index file. I need an efficient way to access exactly the part of the file that contains a record and delete it.
I plan to explore delete operations in more depth. If you know of any useful resources, let me know! The Bitcask model comes to mind: it is an efficient key-value store that keeps keys in memory while storing values in append-only files on disk, and it handles deletes by appending tombstone records that are cleaned up during compaction. That could be the solution to this problem.
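If I go down that road, a minimal, speculative version of the idea would be to append a tombstone record instead of editing the index file in place, then drop tombstoned documents when the index is rebuilt. The id field below is hypothetical:

```python
import json

def delete_document(doc_id, index_path="index.jsonl"):
    # Append a tombstone rather than rewriting the file in place.
    with open(index_path, "a") as f:
        f.write(json.dumps({"id": doc_id, "tombstone": True}) + "\n")

def load_live_documents(index_path="index.jsonl"):
    documents = {}
    with open(index_path) as f:
        for line in f:
            record = json.loads(line)
            if record.get("tombstone"):
                documents.pop(record["id"], None)  # later tombstone wins
            else:
                documents[record["id"]] = record
    return list(documents.values())
```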
In my explorations, I learned that transaction logs (journals) allow me to enforce atomic transactions. Storing a transaction on disk before it is carried out ensures that the server can reconstruct its state in the event of a crash. I have many other avenues of research: how to efficiently store indices and how to efficiently remove data.