Brians Technology Blog: august 2021

Background - why another flash storage system ?

In Bluetooth LE applications there is a need for persistent storage of data as soon as bonding is being used. This is because the standard requires an LE device (either central or peripheral) to persistently store at least two things: For each device that is bonded, the identity of that device must be stored together with its Long Term Key (LTK), and for each connection that is created with a known bonded deivce, the configuration of server and client characteristic descriptors must be associated with that device and stored so that it can be reused upon the next connection.

These requirements calls for a very simple relational database, where a number of device identities for bonded devices (0..N) can be stored and associated 1:1 with a set of configuration data that can be updated at each connection. Such a structure has been implemented in the Flash DataBase (FDB) that I have written in conjunction with the BLE from scratch. This simple database addresses a number of needs unique to small embedded applications:

It is small and lightweight compared to e.g. SQLite
It stores data in a way that preserves flash health by not erasing flash pages unless absolutely necessary.
It only consumes program code for things that are actually used – e.g. search terms are expressed as C++ lambda expresssions rather than with an interpreted language.

FDB does not try to be a solution to bigger database needs – in fact in its default configuration it only allows for storing up to 4 kByte of data, but that is also enough for most needs in an embedded BLE device.

This is a long blogpost, so TL;DR: The source code for FDB can be found here: https://github.com/bdpedersen/ble-from-scratch/tree/master/fdb. The test code shows how a simple 1:1 relation can be set up and how to use the data type system.

General design

FDB stores data in data records that each have a unique 32-bit ID, an 8-bit data type, a length of up to 256 4-byte words and a checksum allowing FDB to verify each record individually when loading it. There are two reserved IDs in the system: 0xFFFFFFFF which means not found and 0x00000000 which means that the record has been deleted. These two IDs are chosen due to the unique properties of flash memory:

Flash memory are special in that you can’t write 1s to it. When you want to write to flash memory it first needs to be erased. Erase means setting all values to 1. After this operation each bit can be changed from a 1 to a 0, but not back to 1 without a new erase. Flash memory is page oriented, which means that individual bits or bytes can’t be erased independently. You have to erase at least a full page, which in nRF52 is 4 kByte big.

Typically this leads to data being written to flash by first reading the full page to a RAM buffer, then modify some data before the page is erased and data is written back. This is called read-modify-write. The problem with this approach is that if a power loss occurs between the erase and writeback, not only will the new data being written be lost – all data that resided at that page would be lost. And for an embedded application with not much data available this could mean all data that is stored at all.

Furthermore most people have heard that “flash has a limited lifetime and can only endure so many writes before it fails”. Actually it is not (so much) writing that wears out the flash. It is the erase cycles, so if these can be avoided, the flash will have a much longer lifespan.

FDB gets around the problems with read-modify-write in the way data data records are written and deleted. Note you can not update data in a data record, you can only write a new one and delete the old one.

FDB uses two pages of flash at all time. One is the active page, and the other is put to use in case the first one gets filled up.

When a new data record is to be written to the system, FDB looks for the first free area in the active page. Since this area is always erased and not written, it will look like a data record with an ID of 0xFFFFFFFF. This is why this value is used to signify not found. If there is enough free space in the active page, the new data record is simply written and then given an ID that is one bigger than the largest ID that was already in use (to keep it unique).

If we need to delete a data record, the ID is simply set to 0. This is possible because any 1s in the ID can be changed to 0 by a new write. This is why 0x00000000 signify a deleted record. The length and checksum fields are kept intact so that the beginning of the next record can be found even if the item was deleted. FDB has no central registry of the locations of each of the data records, so it needs to be able to trust the length fields of the records that are in the active page.

If a write fails due to missing space, FDB be will compact the record space. This is done by verifying and then copying all records that are not deleted to the unused flash page and then erase the previous page. An active page starts with a 32 bit page counter that will increment every time it is activated (meaning all data was successfully copied to it). That way the system will be able to deduce which state it is in even if power is lost at any point in time. Pages are only erased during compaction which greatly increases the lifetime of the flash pages. Apart from allowing FDB to keep track of the data state, the active page counter can also be used as an indicator of flash health. When it approaches 10000, health can be compromised, and no flash will make it all the way to the 4 billion erases that it can indicate.

FDB consists of three layers:

A storage layer, that handles the physical interface to the flash
A record layout layer that allows read, write and verify of data records, as well as low-overhead deletion without flash page erases in most cases. This layer implements the logic described above.
An index layer that allows fast retrieval of data and fast searching for records.

These three layers are described in the following. Note that only the interfaces will be shown – the actual implementation can be found in the source repository.

Storage layer

The storage provides a basic interface to the physical flash device. It makes a few assumptions about the hardware (which could be relaxed if needed):

Flash can be read as memory mapped data
Both the active and the spare flash page is memory mapped as one contiguous memory region
Write can take place without erasing
Page erase can be initiated without writing

The nRF52 internal flash – both raw hardware and when running via the SoftDevice API - fulfills the above requirements. It will be possible also to write a backend for other devices with internal flash, and for STM32 devices with external SPI flash using DCR memory mapping.

The interface is quite simple and can be seen here:

Note that all data is written as 32-bit words and the offset in the Write() method is also in 32-bit words. This is because nRF52 (and possibly other chips) allows 4-byte aligned writes only.

Layout layer

A flash page containing data will always start with the 4 byte page counter followed by one or more data records. A record starts with a header, whose format can be seen below:

Adler16 is the Fletcher16 checksum but with 251 (largest 8-bit prime) as modulo rather than 255. This allows slightly better detection of single bit and small burst errors in small amounts of data than Fletcher-16, which is what we need in this application.

The header contains the ID, the data type (where semantics is decided by the application) and the length in 32 bit words. If byte data needs to be written, the byte length must be stored explicitly by the application.

By traversing these RecordHdr structures, all records can be located in the data store layout. An application should never use the RecordHdr directly, instead it should use the Layout class to query the data structure and read/manipulate records.

When searching for records they can be found in two ways:

By searching directly by ID. This is useful for reverse relations (you have a record that is related to another by ID and want the related record). It is also used internally to re-acquire a record that has moved in memory.
By searching for records with a specific data type. This is used for enumerating records of a given type.

Finding by ID is done via the following method

Enumeration is done by these two methods:

They return a Record struct. Only the ID field is meant to be accessed publicly, the other two are for keeping track of the data in memory. If a query doesn’t return any records (e.g. if a findNext() can’t find any more records) then the ID field is set to NOT_FOUND.

It is possible to delete a record using the Delete method:

.. and insert it using the Insert method

Note that the insert method allows that one and optionally two different data buffers are written into the record. This API is constructed this way to minimize copying operations. Using this approach it is possible to have a record type with a fixed header (e.g. an ID) and a variable size payload (e.g. a string or data buffer) so that variable size data can be associated with other records. Insert() may fail due to lack of free space in the flash page, in which case the ID field in the returned record will be set to NOT_FOUND. Otherwise the Record will refer to the newly inserted data.

Due to the way IDs work in fdb, the ID of the most recently inserted record can be retrieved by GetLastUsedID(), which will return NOT_FOUND if the database is empty.

Data is accessed by first getting the length and then a pointer to the data using these two methods

Note that you should never access the item field in the Record directly but use these methods instead. They will retrieve the the information using the fastest method possible, also taking into account that data may have moved since last time it was queried. Never use the pointer returned by GetData() outside the calling scope, i.e. never store the pointer permanently in an object or static variable. The data it points to may move during a Compact() call so it may be invalidated. Note that the pointer will point to the first of the two possible data buffers that were inserted, and the length will be the joint length of the two. It is up to a higher protocol to determine or store the boundary between the two.

The Compact() method:

can be called to purge deleted records and free up space. It will cause a flash page delete so it should only be used if an Insert() operation fails.

It is possible to find the amount of free space and get a debug status by these methods:

Index layer

While it is perfectly possible to use FDB via the Layout class only, a number of operations are made easier by the Index template class. An index can be seen as a searchable variable-sized array of data records, each record represented by a struct or other data type with a size divisible by 4. The class maintains its relation to the underlying persistent store automatically

Index is a template class shown here:

Data is copied out of an Index so that the result of an operation can be passed around. To get the record at a given numbered offset you use GetEntry which will return a Result with the data. A result contains the offset of the data, as well as the data itself. If you need the ID of the data at a given offset, you call GetEntryId(offset). The Index is searchable using GetEntryMatching() where a functor returning bool can be used to filter. Start can be set explicitly if a search needs to be re-started at some offset.

The data in a Result is always valid but the index is only valid until Delete() or DeleteIndex() is called

It is also possibly to search by ID using GetEntryWithID()

To create an Index you first declare the underlying datatype and decides which number from 0-255 should represent it in the underlying records datatype field. Suppose we have a index of paired devices identified by their Bluetooth address (for examples sake – real world is not so simple) we could do a

typedef struct {
uint8_t bt_addr[16];
} device_t;

const uint8_t device_t_dtype=1;
const int pairing_table_size=5;

And then build a pairing table of up to 5 devices by instantiating an index like this (given a Layout instance in layout):

Index<device_t,pairing_table_size,device_t_dtype> pairing_table(layout);

We could then also make a cache of auxiliary data associated to each device by ID. That index would look like this:

const uint8_t aux_dtype=2;

Index<uint32_t,pairing_table_size,aux_type> aux_data(layout);

We can append a new device after pairing by calling Append() with a device_t as argument, and we can store auxiliary data associated with it by calling Append taking both it ID and the auxiliary data buffer as argument.

Last we can copy out the variable part of the data by first allocating a buffer (on heap or stack by e.g. bfs_alloca()) with the size returned by GetExtendedSize() and then copy it out using GetExtendedData()

Using these primitives a simple relational database can be built. Chained deletes needs to be done manually but are easy to do since Delete methods returns the ID of the deleted records, so a relation can be looked up in other Index instances by GetEntryMatching()

Brians Technology Blog

tirsdag den 3. august 2021

Flash DB - a small C++ database for flash memory

Background - why another flash storage system ?

General design

Storage layer

Layout layer

Index layer