Use '''Major Objects''' for fast, thread-safe, in-memory handling of large amounts of data that must be persisted.  They support complex objects with simple keys, fast lookup by multiple keys, serialization from/to multiple persistence layers, and fast DRY schema changes.


=== Major Objects ===

  • Use an unordered_set of const pointers to objects derived from PersistentIDObject (see below); a minimal sketch of this layout follows this list
  • The main container's primary key is always db_id
  • Always use the db_id for foreign keys
  • Other containers can be created using other members as keys; the only cost is for a new set of pointers (not objects!)
    • For other unordered_sets, just define new hash functions
    • Other useful containers are sorted_vector and map (<key,value> pair)
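
A minimal sketch of this container layout; StockRun is borrowed from the reference code named further below, but its members and the index names here are illustrative assumptions:

 #include <algorithm>
 #include <cstdint>
 #include <functional>
 #include <unordered_set>
 #include <vector>
 
 struct StockRun {                       // stands in for a PersistentIDObject subclass
     int64_t db_id;
     int     rank;
 };
 
 // Hash and equality functions keyed on the primary key, db_id.
 struct RunHash  { size_t operator()(const StockRun* p) const { return std::hash<int64_t>()(p->db_id); } };
 struct RunEqual { bool operator()(const StockRun* a, const StockRun* b) const { return a->db_id == b->db_id; } };
 
 // Primary container: const pointers to the objects, hashed on db_id.
 std::unordered_set<const StockRun*, RunHash, RunEqual> runsById_;
 
 // Supplemental index over the same objects; the only cost is a new set of
 // pointers, never a copy of the objects themselves.
 std::vector<const StockRun*> runsByRank_;   // kept sorted by rank
 
 void addRun(const StockRun* p) {
     runsById_.insert(p);
     runsByRank_.insert(
         std::lower_bound(runsByRank_.begin(), runsByRank_.end(), p,
             [](const StockRun* a, const StockRun* b) { return a->rank < b->rank; }),
         p);
 }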

=== PersistentIDObject ===

  • Add a dirty flag to all objects, set to true on any change that must be persisted
  • Use an internal in-memory counter to generate the next db_id for a newly created object
  • This means that creating new objects requires NO database access, which is VERY IMPORTANT!
  • Use delayed-write tactics to write all dirty objects during idle time; a minimal sketch of the base class follows this list
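
A minimal sketch of the base class, assuming the counter is seeded at startup from the persistence layer's maximum id; assignNewDbId() and the other member names are illustrative:

 #include <atomic>
 #include <cstdint>
 
 class PersistentIDObject {
 public:
     static constexpr int64_t UNSAVED = -1;
 
     // Single constructor with db_id defaulted to UNSAVED (see Schema Driven below).
     explicit PersistentIDObject(int64_t db_id = UNSAVED)
         : db_id_(db_id), b_dirty_(false) {}
     virtual ~PersistentIDObject() {}
 
     // For genuinely new objects: pull the next id from the in-memory counter.
     // No database access is needed at creation time - VERY IMPORTANT.
     void assignNewDbId()  { db_id_ = next_db_id_++; setDirty(); }
 
     int64_t dbId() const  { return db_id_; }
     bool bDirty() const   { return b_dirty_; }
     void setDirty()       { b_dirty_ = true; }    // call on any change that must be persisted
     void clearDirty()     { b_dirty_ = false; }   // called by the delayed writer after saving
 
     // Seed once at startup with (max persisted id + 1).
     static void seedNextDbId(int64_t next) { next_db_id_ = next; }
 
 protected:
     int64_t db_id_;
     bool    b_dirty_;
     static std::atomic<int64_t> next_db_id_;
 };
 
 std::atomic<int64_t> PersistentIDObject::next_db_id_{1};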

=== Memory Model ===

  • Use a Datastore manager (aka "MemoryModel") to hold sets
  • It can look up objects by any key, and strip away const to return a mutable object (see the sketch after this list). NOTE that the user must not damage the key values!
  • Derive a class from the memory model for persistence; it can use any persistence method (local, remote, SQL, NoSQL, etc.).
  • Make sure that the base MemoryModel class is concrete (not abstract), thread-safe, and self-contained; this makes parallel calculations trivial, helps scalability, etc.
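
A minimal sketch of such a manager, with stub versions of the earlier types repeated so the example stands alone; the find() signature and locking strategy are illustrative assumptions:

 #include <cstdint>
 #include <functional>
 #include <mutex>
 #include <unordered_set>
 
 class PersistentIDObject {   // stub of the earlier sketch
 public:
     explicit PersistentIDObject(int64_t db_id = -1) : db_id_(db_id) {}
     virtual ~PersistentIDObject() {}
     int64_t dbId() const { return db_id_; }
 protected:
     int64_t db_id_;
 };
 struct DbIdHash  { size_t operator()(const PersistentIDObject* p) const { return std::hash<int64_t>()(p->dbId()); } };
 struct DbIdEqual { bool operator()(const PersistentIDObject* a, const PersistentIDObject* b) const { return a->dbId() == b->dbId(); } };
 
 // Concrete (not abstract), thread-safe, self-contained base class.
 class MemoryModel {
 public:
     virtual ~MemoryModel() {}
 
     // Look up by primary key using a stack stub, then strip away const.
     // NOTE: callers must not modify any member that is used as an index key!
     PersistentIDObject* find(int64_t db_id) {
         std::lock_guard<std::mutex> lock(mutex_);
         PersistentIDObject key(db_id);
         auto it = objects_.find(&key);
         return it == objects_.end() ? nullptr
                                     : const_cast<PersistentIDObject*>(*it);
     }
 
     // Persistence-layer subclasses (local, remote, SQL, NoSQL, ...) override this.
     virtual void saveDirtyObjectsAsNeeded() {}
 
 protected:
     std::mutex mutex_;
     std::unordered_set<const PersistentIDObject*, DbIdHash, DbIdEqual> objects_;
 };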

=== Schema Driven ===

  • Schema is shared across memory objects and every persistence layer. Follow a DRY pattern by generating all code from a common schema definition.
  • JSON Schema is elegant: as simple as possible, but no simpler; always use it. Essential tools include quicktype, which generates class code from the schema.
  • Always provide a constructor with a single db_id parameter, defaulted to UNSAVED; this serves two purposes (see the sketch after this list):
    • no-param constructor for reflection via quicktype
    • id constructor for use as a key for unordered_set::find()
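
A sketch of the dual-purpose constructor; the StockRun body is illustrative, and in practice its fields would be generated from the shared schema:

 #include <cstdint>
 #include <functional>
 #include <unordered_set>
 
 constexpr int64_t UNSAVED = -1;
 
 class StockRun {
 public:
     // One constructor serves both purposes:
     //   StockRun()      - no-param form, used by quicktype-style reflection
     //   StockRun(id)    - cheap stack stub, used only as a find() key
     explicit StockRun(int64_t db_id = UNSAVED) : db_id_(db_id) {}
     int64_t dbId() const { return db_id_; }
 private:
     int64_t db_id_;
     // ... fields generated from the shared JSON schema ...
 };
 
 struct RunHash  { size_t operator()(const StockRun* p) const { return std::hash<int64_t>()(p->dbId()); } };
 struct RunEqual { bool operator()(const StockRun* a, const StockRun* b) const { return a->dbId() == b->dbId(); } };
 
 bool bExists(const std::unordered_set<const StockRun*, RunHash, RunEqual>& runs, int64_t db_id) {
     StockRun key(db_id);                  // no heap allocation, no db access
     return runs.find(&key) != runs.end();
 }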

=== The Result ===

  • Fast, efficient unordered_set<PersistentIDObject*> containers that can be automatically serialized to/from any database layer
  • Fast supplemental in-memory indexes into objects wherever needed
  • DRY code that allows rapid, complete schema changes within a simplified codebase

=== Delayed delete pattern ===

           1) to dynamically delete an object: 
               a) ba.setDeleted();
               b) do not remove from any container indexes
                c) but fix the index sorting, flags, etc., as if the object were gone, so the program will function as if it is!
                   e.g., do not remove from runsByRank_, but adjust all other rankings as if the run were gone
           2) include deleted status in active check, etc.:
               // NOTE use the direct function rather than !bFunc(), as deleted objects return false for both.
               bool bActive() const        { return b_active_ && !bDeleted();  }
               bool bInactive() const      { return !b_active_ && !bDeleted(); }
               ---
               for (auto& psr: runsByRank_) {
                 if (psr->bDeleted()) continue;
                 ...
           3) all deletion work is done in MemoryModel::saveDirtyObjectsAsNeeded(), see that code
               a) deletion check should happen in delayed write check:
                   if (pau->bDirtyOrDeleted())
                       bNeeded = true;
                b) if bNeeded, always do deletions first, starting with the outermost (greatest-grandparent) container, to minimize work
                c) use the erase-remove pattern to remove all deleted items in one loop
                    see the reference implementation in BrokerAccount::removeDeletedStockRuns(), and the sketch after this list
                     i) iterate and remove item from all secondary indices
                     ii) iterate primary index, and use the lambda of the erase-remove operation to delete memory allocation and remove db record
                     iii) associative container iterators can be safely deleted directly
                          sequential containers like vector require use of erase-remove idiom
                          see reference implementation for example code!
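
For reference, a minimal sketch of such a sweep, modeled on the description of BrokerAccount::removeDeletedStockRuns() above; the containers, StockRun members, and removeDbRecord() helper are illustrative stand-ins, not the actual code:

 #include <algorithm>
 #include <cstdint>
 #include <iterator>
 #include <map>
 #include <string>
 #include <vector>
 
 struct StockRun {
     int64_t db_id = -1;
     std::string name;
     bool b_deleted_ = false;
     bool bDeleted() const { return b_deleted_; }
 };
 
 std::vector<const StockRun*>           runsByRank_;   // primary index here (sequential)
 std::map<std::string, const StockRun*> runsByName_;   // supplemental index (associative)
 
 void removeDbRecord(int64_t /*db_id*/) { /* persistence-layer call */ }
 
 void removeDeletedStockRuns() {
     // i) strip deleted items from all secondary indices first;
     //    associative-container iterators can be erased directly while iterating
     for (auto it = runsByName_.begin(); it != runsByName_.end(); )
         it = it->second->bDeleted() ? runsByName_.erase(it) : std::next(it);
 
     // ii) sweep the sequential primary index with the erase-remove idiom;
     //     the lambda frees the memory allocation and removes the db record
     //     in the same single pass
     runsByRank_.erase(
         std::remove_if(runsByRank_.begin(), runsByRank_.end(),
             [](const StockRun* p) {
                 if (!p->bDeleted()) return false;
                 removeDbRecord(p->db_id);
                 delete p;
                 return true;
             }),
         runsByRank_.end());
 }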