@family/* | limit 10
Embeddable JSON database engine
Highlights
Document oriented storage engine
Simple yet powerful Query language (JQL)
Data durability, read uncommitted TX isolation.
Supports indexes and collections joins.
Online backups support
Lightweight library under ~1Mb in size.
EJDB2 query language (JQL) syntax inspired by ideas behind XPath and Unix shell pipes. It designed for easy querying and updating sets of JSON documents.
JSON documents are stored in collections. Collection also contains a set of Indexes for document querying.
Get up to ten documents from family collection
@family/* | limit 10
{
"firstName": "John",
"lastName": "Doe",
"age": 28,
"pets": [
{ "name": "Rexy rex", "kind": "dog", "likes": ["bones", "jumping", "toys"] },
{ "name": "Grenny", "kind": "parrot", "likes": ["green color", "night", "toys"] }
]
}
Gets documents from family collection where pets array object has likes
array having non null object at index 1
@family/pets/*/likes/1
{
"firstName": "John",
"lastName": "Doe",
"age": 28,
"pets": [
{ "name": "Rexy rex", "kind": "dog", "likes": ["bones", "jumping", "toys"] },
{ "name": "Grenny", "kind": "parrot", "likes": ["green color", "night", "toys"] }
]
}
Get all documents where owner age greater than 20 and have some pet who like bones or toys.
@family/[age > 20] and /pets/*/likes/[** in ["bones", "toys"]]
Here ** denotes some element in likes array.
Games
Mobile applications
Self-contained single binary applications
Network accessible applications over HTTP protocol (REST API and Websockets)
Requires Memory Management Unit (MMU) hardware and relevant virtual memory support by OS (mmap)
Linux, MacOS, FreeBSD, Win32 support
No network support on Windows since async IO layer is not ported to Windows IOCP
File size limit 0x7fffffff80
bytes ~512Gb
Size of single JSON document is limited to ~255Mb
Data file is memory mapped (mmap) into the address space of process
Data blocks allocation/deallocation is managed by bitmap index
Data block granularity is 128 bytes
Persistent index is based on Skiplist data structure
Data file contains multiple skiplist indexes (sub-databases).
Concurrent access to every persistent index is managed by RW lock (single writer, multiple readers) per collection.
Consistency, durability, online backups provided by WAL
O(log(n)) cost of search and insert
O(n) worst case (practcally imposible)
O(log(n)) Levels
Nodes are randomly spreaded across as data file what’s not good is we have a High cost of random data access.
Poor data locality when accessing keys. Many page faults.
Huge storage overhead on pointers
A key-value storage
Keys kept together with values!
This approach provides good data locality for up to 32 adjusted key-values pairs for every skiplist node.
Data file may contains up to ~10K independent skiplist indexes (KV storages)
Write ahead log (WAL) used to provide database data fault tolerance in the case of system/hardware failures. WAL recovery logs used in automatic recovery procedure get the most actual and consistent database state before a crash.
Also WAL is used to provide online backups feature and network data replication*
*Implementation pending
Savepoint
Flushing of WAL file to filesystem at consistent database state. In the case of failure it can be rolled forward to restore database state to the latest savepoint time stamp.
Checkpoint
All pending changes persisted in WAL are applied to the database file. WAL file data truncated to zero following by savepoint `fsync`. All durty pages flushed to disk.
Documents a represented as binary blobs in compact BINN format
Every document associated with unique primary key number (int64 serial value)
EJDB operates on set of pairs: {PK, JSON}
PK is generated sequentially
Actually persisting a JSON document is a placing its blob under PK
in KV database (collection)
JSON fields indexes are stored in separate KV database
Primitive types:
null
boolean
integer (up to 64 bits signed)
floating point numbers (IEEE single and double precision)
string
blob (binary data)
Composite types:
list
map (numeric key associative array)
object (text key associative array)
The following types eligible for indexing
string value
8 bytes width signed integer
8 bytes width floating point number
Values in JSON document arrays can be indexed as well
Indexes may be unique (won’t allow duplicates)
Note | Index is a separate KV storage which maps index values to JSON document primary key. |
Note | JSON RFC6901 Path used to identify particular point at JSON structure to index. |
{
lastName: "..."
"employment": {
"company": {
"name": "..."
}
Ensure string index on company name:
iwrc rc = ejdb_ensure_index(db, "persons", "/employment/company/name", EJDB_IDX_STR);
Matching patterns supported by indexes
Equality (eq)
Comparison (gt, gte, lt, lte)
Presence in a set (in)
Prefix matching (~)
JQL query which use index:
@persons/employment/company/[name in ["Huawei", "PBC"]] and /[age > 30]
Only one index is used for particular query execution
Index is selected by set of heuristic rules.
Actual data stats (eg. selectivity) is not taken for index selection.
If query consist of OR part at top level or contains NOT expressions at the top level of query expression - indexes will not be in use at all.
ORDERBY clauses may use indexes to avoid result set sorting.
Competitive to LevelDB, LMDB, WiredTiger, RocksDB
1024
byte keys
Random values of [0..40]
bytes length
filrandom2 - insert 2M
random distributed records
readrandom - read 2M
records in random order
deleterandom - delete 2M
random keys