ETFS implements a high-reliability filesystem for use with
embedded solid-state memory devices, particularly NAND flash
memory.
The filesystem supports a fully hierarchical
directory structure with POSIX semantics as shown in the
table above.
ETFS is a filesystem composed entirely of
transactions. Every write operation, whether of
user data or filesystem metadata, is performed as a transaction.
A transaction either succeeds or is treated as if it never
occurred.
Transactions never overwrite live data. A write in the
middle of a file or a directory update always writes to a
new unused area. In this way, if the operation fails part
way through (due to a crash or power failure), the old data
is still intact.
Some log-based filesystems also operate under the principle
that live data is never overwritten. But ETFS takes this to
the extreme by turning everything into a log of
transactions. The filesystem hierarchy is built on the fly
by processing the log of transactions in the device. This
scan occurs at startup, but is designed such that only a
small subset of the data is read and CRC-checked, resulting
in faster startup times without sacrificing reliability.
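The idea of building file contents by processing a log can be illustrated with a minimal sketch. It is not the ETFS implementation: the `Transaction` class and `replay` function are hypothetical names, and the sketch reads every transaction in full, whereas ETFS reads and CRC-checks only a small subset of the data at startup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    fid: int        # which file the data belongs to
    offset: int     # byte offset of the data within the file
    sequence: int   # monotonically increasing; gives time ordering
    data: bytes


def replay(log):
    """Rebuild file contents from a log; later sequence numbers win."""
    files = {}
    for txn in sorted(log, key=lambda t: t.sequence):
        buf = files.setdefault(txn.fid, bytearray())
        end = txn.offset + len(txn.data)
        if len(buf) < end:
            # Grow the in-memory image so the write fits.
            buf.extend(b"\x00" * (end - len(buf)))
        buf[txn.offset:end] = txn.data
    return {fid: bytes(buf) for fid, buf in files.items()}


log = [
    Transaction(fid=1, offset=0, sequence=1, data=b"hello world"),
    # A mid-file update is a new log entry; the first entry is untouched.
    Transaction(fid=1, offset=6, sequence=2, data=b"ETFS!"),
]
files = replay(log)
```

Because the update is appended rather than written over the original entry, discarding the second transaction (for example, after a power failure) leaves the first one fully intact.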
Transactions are position-independent in the device and may
occur in any order. You could read the transactions from one
device and write them in a different order to another
device. This is important because it allows bulk programming
of devices containing bad blocks that may be at arbitrary
locations.
This design is well-suited for NAND flash memory. NAND flash
is shipped with factory-marked bad blocks that may occur in
any location.
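The following sketch illustrates why position independence helps with bulk programming. It assumes, purely for simplicity, one transaction per block and a dictionary per transaction; `program_device` and `scan` are hypothetical helpers, not part of any ETFS tool.

```python
def program_device(transactions, num_blocks, bad_blocks):
    """Place one transaction per good block, skipping bad blocks."""
    device = [None] * num_blocks
    good = (i for i in range(num_blocks) if i not in bad_blocks)
    for txn in transactions:
        try:
            device[next(good)] = txn
        except StopIteration:
            raise ValueError("not enough good blocks") from None
    return device


def scan(device):
    """Recover the transactions in time order, wherever they landed."""
    found = [t for t in device if t is not None]
    return sorted(found, key=lambda t: t["sequence"])


image = [{"sequence": n, "data": f"txn-{n}"} for n in range(4)]

# Two devices with different bad blocks, programmed in different orders.
dev_a = program_device(image, num_blocks=8, bad_blocks={1, 2})
dev_b = program_device(reversed(image), num_blocks=8, bad_blocks={0, 5, 6})
```

Both devices yield the same ordered set of transactions when scanned, even though the physical placement differs.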
Figure 1. ETFS is a filesystem composed entirely of transactions.
Inside a transaction
Each transaction consists of a header followed by data.
The header contains the following:
- FID: A unique file ID that identifies which file the transaction belongs to.
- Offset: The offset of the data portion within the file.
- Size: The size of the data portion.
- Sequence: A monotonically increasing number (to enable time ordering).
- CRCs: Data integrity checks (for NAND, NOR, SRAM).
- ECCs: Error correction (for NAND).
- Other: Reserved for future expansion.
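As a rough illustration of how such a header can protect a transaction, the sketch below packs the FID, offset, size, and sequence fields ahead of the data and guards them with a CRC-32. The field widths, byte order, and single CRC are assumptions made for the example; the actual on-device ETFS layout is not described here, and the ECC and reserved fields are omitted.

```python
import struct
import zlib

# Hypothetical layout: fid, offset, size, sequence as little-endian 32-bit
# unsigned integers, followed by a CRC-32 over those fields plus the data.
FIELDS = struct.Struct("<IIII")
CRC = struct.Struct("<I")


def encode(fid, offset, sequence, data):
    fields = FIELDS.pack(fid, offset, len(data), sequence)
    crc = zlib.crc32(fields + data)
    return fields + CRC.pack(crc) + data


def decode(raw):
    """Return (fid, offset, sequence, data), or None if the check fails."""
    start = FIELDS.size + CRC.size
    if len(raw) < start:
        return None
    fid, offset, size, sequence = FIELDS.unpack_from(raw, 0)
    (crc,) = CRC.unpack_from(raw, FIELDS.size)
    data = raw[start:start + size]
    if len(data) != size or zlib.crc32(raw[:FIELDS.size] + data) != crc:
        return None
    return fid, offset, sequence, data


raw = encode(fid=7, offset=1024, sequence=42, data=b"payload")
damaged = bytearray(raw)
damaged[-1] ^= 0x01  # flip one bit in the data portion
```

A transaction that fails the check, whether because of a flipped bit or because it was only partially written, is simply treated as if it never occurred.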
Types of storage media
Although best suited to NAND devices, ETFS also supports other
types of embedded storage media through the following driver classes:
Class | CRC | ECC | Wear-leveling erase | Wear-leveling read | Cluster size |
NAND 512+16 | Yes | Yes | Yes | Yes | 1 KB |
NAND 2048+64 | Yes | Yes | Yes | Yes | 2 KB |
RAM | No | No | No | No | 1 KB |
SRAM | Yes | No | No | No | 1 KB |
NOR | Yes | No | Yes | No | 1 KB |
Note:
Although ETFS can support NOR flash, we recommend instead
the FFS3 filesystem (devf-*), which is designed
explicitly for NOR flash devices.
Reliability features
ETFS is designed to survive a power failure, even
during an active flash write or block erase. The following
features contribute to its reliability:
- dynamic wear-leveling
- static wear-leveling
- CRC error detection
- ECC error correction
- read degradation monitoring with automatic refresh
- transaction rollback
- atomic file operations
- automatic file defragmentation
- Dynamic wear-leveling: Flash memory allows a limited number of erase cycles on a
flash block before the block fails.
This number can be as low as 100,000. ETFS tracks the number of erases on each block.
When selecting a block to use, ETFS attempts to spread the erase cycles evenly
over the device, dramatically increasing its life.
The difference can be extreme: in some usage scenarios, a device that would fail
within a few days without wear-leveling can last over 40 years with it.
- Static wear-leveling: Filesystems often consist of a large number of static files
that are read but not written.
These files occupy flash blocks that have no reason to be erased.
If the majority of the files in flash are static, the remaining
blocks, which contain dynamic data, wear at a dramatically increased rate.
ETFS notices these underworked static blocks and forces
them into service by copying their data to an overworked block.
This solves two problems: it gives the overworked
block a rest, since it now contains static data, and it
forces the underworked static block into the dynamic pool of blocks.
- CRC error detection: Each transaction is protected by a cyclic redundancy check (CRC).
This ensures quick detection of corrupted data, and
forms the basis for the rollback operation of damaged or
incomplete transactions at startup.
The CRC can detect multiple bit errors that may occur during a power failure.
- ECC error correction: On a CRC error, ETFS can apply error correction coding (ECC)
to attempt to recover the data.
This is suitable for NAND flash memory, in which single-bit errors may occur
during normal usage.
An ECC error is a warning that the flash block in which it occurred
may be getting weak (that is, losing charge).
ETFS marks the weak block for a refresh operation, which copies the
data to a new flash block and erases the weak block.
The erasure recharges the flash block.
- Read degradation monitoring with automatic refresh: Each read operation within a
NAND flash block weakens the charge maintaining the data bits.
Most devices support about 100,000 reads before there's danger of losing a bit.
The ECC recovers a single-bit error, but may not be able to recover multi-bit errors.
ETFS solves this by tracking reads and marking blocks for
refresh before the 100,000-read limit is reached.
- Transaction rollback: When ETFS starts, it processes all transactions and rolls
back (discards) the last partial or damaged transaction.
The rollback code is designed to handle a power failure during a
rollback operation, thus allowing the system to recover from multiple nested faults.
The validity of each transaction is verified by its CRCs.
- Atomic file operations: ETFS implements a very simple directory structure on the
device, allowing significant modifications with a single flash write.
For example, moving a file or directory to another directory is
a multistage operation in most filesystems.
In ETFS, a move is accomplished with a single flash write.
- Automatic file defragmentation: Log-based filesystems often suffer from fragmentation, since
each update or write to an existing file causes a new transaction to be created.
ETFS uses write-buffering to combine small writes into larger write transactions in an
attempt to minimize the fragmentation caused by many very small transactions.
ETFS also monitors the fragmentation level of each file and defragments
badly fragmented files in the background.
This background activity is always preempted by a user data
request in order to ensure immediate access to the file being defragmented.
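Three of the features above (dynamic wear-leveling, static wear-leveling, and read degradation monitoring) come down to simple bookkeeping on per-block counters. The sketch below shows one way such policies could look. The `Block` class, the function names, and the `max_spread` and `margin` parameters are all hypothetical; only the figure of about 100,000 reads comes from the description above, and the actual ETFS thresholds and selection rules may differ.

```python
from dataclasses import dataclass


@dataclass
class Block:
    index: int
    erase_count: int = 0
    read_count: int = 0
    free: bool = True


def pick_block(blocks):
    """Dynamic wear-leveling: use the free block erased the fewest times."""
    candidates = [b for b in blocks if b.free]
    if not candidates:
        raise RuntimeError("no free blocks")
    return min(candidates, key=lambda b: b.erase_count)


def static_candidate(blocks, max_spread):
    """Static wear-leveling: find an underworked in-use block to relocate.

    Returns the least-erased in-use block if it lags the most-erased block
    by more than max_spread erases; otherwise returns None.
    """
    in_use = [b for b in blocks if not b.free]
    if not in_use:
        return None
    coldest = min(in_use, key=lambda b: b.erase_count)
    hottest = max(blocks, key=lambda b: b.erase_count)
    if hottest.erase_count - coldest.erase_count > max_spread:
        return coldest
    return None


def needs_refresh(block, read_limit=100_000, margin=0.9):
    """Read degradation: flag a block before it reaches the read limit."""
    return block.read_count >= read_limit * margin


blocks = [
    Block(0, erase_count=900, free=True),
    Block(1, erase_count=12, free=False, read_count=95_000),  # static data
    Block(2, erase_count=850, free=True),
    Block(3, erase_count=880, free=False),
]
```

With these sample counters, a new write would go to block 2, block 1 would be selected for relocation under a spread limit of 500 erases, and block 1 would also be marked for refresh because it is close to the read limit.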