EJDB2

Embeddable JSON database engine

Highlights

JQL

EJDB2 query language (JQL) syntax inspired by ideas behind XPath and Unix shell pipes. It designed for easy querying and updating sets of JSON documents.

JSON documents are stored in collections. Collection also contains a set of Indexes for document querying.

JQL

Get up to ten documents from family collection

@family/* | limit 10
{
  "firstName": "John",
  "lastName": "Doe",
  "age": 28,
  "pets": [
    { "name": "Rexy rex", "kind": "dog", "likes": ["bones", "jumping", "toys"] },
    { "name": "Grenny", "kind": "parrot", "likes": ["green color", "night", "toys"] }
  ]
}

Gets documents from family collection where pets array object has likes array having non null object at index 1

@family/pets/*/likes/1
{
  "firstName": "John",
  "lastName": "Doe",
  "age": 28,
  "pets": [
    { "name": "Rexy rex", "kind": "dog", "likes": ["bones", "jumping", "toys"] },
    { "name": "Grenny", "kind": "parrot", "likes": ["green color", "night", "toys"] }
  ]
}

Get all documents where owner age greater than 20 and have some pet who like bones or toys.

@family/[age > 20] and /pets/*/likes/[** in ["bones", "toys"]]

Here ** denotes some element in likes array.

Typical use cases of EJDB2

Native applications comparable with C ABI

  • Games

  • Mobile applications

  • Self-contained single binary applications

  • Network accessible applications over HTTP protocol (REST API and Websockets)

Requirements and limitations

  • Requires Memory Management Unit (MMU) hardware and relevant virtual memory support by OS (mmap)

  • Linux, MacOS, FreeBSD, Win32 support

  • No network support on Windows since async IO layer is not ported to Windows IOCP

  • File size limit 0x7fffffff80 bytes ~512Gb

  • Size of single JSON document is limited to ~255Mb

EJDB2 Internals

  • Data file is memory mapped (mmap) into the address space of process

  • Data blocks allocation/deallocation is managed by bitmap index

  • Data block granularity is 128 bytes

  • Persistent index is based on Skiplist data structure

  • Data file contains multiple skiplist indexes (sub-databases).

  • Concurrent access to every persistent index is managed by RW lock (single writer, multiple readers) per collection.

  • Consistency, durability, online backups provided by WAL

Skiplist used as main index structure

skiplist1
  • O(log(n)) cost of search and insert

  • O(n) worst case (practcally imposible)

  • O(log(n)) Levels

Making Skiplist persistent

Naive approach

  • Nodes are randomly spreaded across as data file what’s not good is we have a High cost of random data access.

  • Poor data locality when accessing keys. Many page faults.

  • Huge storage overhead on pointers

Skiplist optimizations

skiplist2

Actual KV storage layout

A key-value storage

skiplist3
  • Keys kept together with values!

  • This approach provides good data locality for up to 32 adjusted key-values pairs for every skiplist node.

  • Data file may contains up to ~10K independent skiplist indexes (KV storages)

EJDB2 WAL

Write ahead log (WAL) used to provide database data fault tolerance in the case of system/hardware failures. WAL recovery logs used in automatic recovery procedure get the most actual and consistent database state before a crash.

Also WAL is used to provide online backups feature and network data replication*

*Implementation pending

EJDB2 WAL

Savepoint

Flushing of WAL file to filesystem at consistent database state.
In the case of failure it can be rolled forward to restore database state
to the latest savepoint time stamp.

Checkpoint

All pending changes persisted in WAL are applied to the database file.
WAL file data truncated to zero following by savepoint `fsync`.
All durty pages flushed to disk.
wal1
wal2

How JSON documents are stored?

  • Documents a represented as binary blobs in compact BINN format

  • Every document associated with unique primary key number (int64 serial value)

  • EJDB operates on set of pairs: {PK, JSON}

  • PK is generated sequentially

  • Actually persisting a JSON document is a placing its blob under PK in KV database (collection)

  • JSON fields indexes are stored in separate KV database

Supported data types

Primitive types:

  • null

  • boolean

  • integer (up to 64 bits signed)

  • floating point numbers (IEEE single and double precision)

  • string

  • blob (binary data)

Composite types:

  • list

  • map (numeric key associative array)

  • object (text key associative array)

How JSON documents indexed?

The following types eligible for indexing

  • string value

  • 8 bytes width signed integer

  • 8 bytes width floating point number

  • Values in JSON document arrays can be indexed as well

Indexes may be unique (won’t allow duplicates)

Note
Index is a separate KV storage which maps index values to JSON document primary key.
Note
JSON RFC6901 Path used to identify particular point at JSON structure to index.
{
  lastName: "..."
  "employment": {
    "company": {
      "name": "..."
    }

Ensure string index on company name:

iwrc rc = ejdb_ensure_index(db, "persons", "/employment/company/name", EJDB_IDX_STR);

Matching patterns supported by indexes

  • Equality (eq)

  • Comparison (gt, gte, lt, lte)

  • Presence in a set (in)

  • Prefix matching (~)

JQL query which use index:

@persons/employment/company/[name in ["Huawei", "PBC"]] and /[age > 30]

Some facts about indexing

  • Only one index is used for particular query execution

  • Index is selected by set of heuristic rules.

  • Actual data stats (eg. selectivity) is not taken for index selection.

  • If query consist of OR part at top level or contains NOT expressions at the top level of query expression - indexes will not be in use at all.

  • ORDERBY clauses may use indexes to avoid result set sorting.

Performance

b3
  • 1024 byte keys

  • Random values of [0..40] bytes length

    • filrandom2 - insert 2M random distributed records

    • readrandom - read 2M records in random order

    • deleterandom - delete 2M random keys

Plans

  • Intelligent query optimization index selection algorithm

  • Implementation of istributed network cluster library powered by ZAB or Raft protocols (Like a Etcd or Consul)

  • Introducing data compression options