Major Objects: Difference between revisions

From Bitpost wiki

Revision as of 16:41, 30 September 2018

Major Objects

  • Use Major Objects for fast, thread-safe, in-memory handling of large amounts of data that must be persisted
  • We must support complex objects with simple keys, CRUD, and fast lookup by multiple keys.
  • Use an unordered_set of const pointers to objects derived from PersistentIDObject (see below)
  • The main container's primary key is always db_id
  • Always use the db_id for foreign keys
  • Other containers can be created using other members as keys; the only cost is for a new set of pointers (not objects!)
    • For other unordered_sets, just define new hash functions
    • Other useful containers are sorted_vector and map (<key,value> pair)
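
The container layout described above can be sketched as follows. This is a minimal illustration, not the wiki author's code: the Stock type, its fields, and the findBySymbol helper are hypothetical names chosen for the example. It shows a primary index keyed on db_id and a secondary index over the same pointers keyed on another member, where the only extra cost is the second set of pointers.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical object type; field names are illustrative only.
struct Stock {
    std::int64_t db_id_;
    std::string symbol_;
};

// Primary index: hash and compare on db_id only.
struct HashById {
    std::size_t operator()(const Stock* p) const {
        return std::hash<std::int64_t>()(p->db_id_);
    }
};
struct EqualById {
    bool operator()(const Stock* a, const Stock* b) const {
        return a->db_id_ == b->db_id_;
    }
};

// Secondary index: the same pointers, keyed on a different member.
struct HashBySymbol {
    std::size_t operator()(const Stock* p) const {
        return std::hash<std::string>()(p->symbol_);
    }
};
struct EqualBySymbol {
    bool operator()(const Stock* a, const Stock* b) const {
        return a->symbol_ == b->symbol_;
    }
};

using StocksById     = std::unordered_set<const Stock*, HashById, EqualById>;
using StocksBySymbol = std::unordered_set<const Stock*, HashBySymbol, EqualBySymbol>;

// Look up by symbol with a temporary key object; returns nullptr if absent.
inline const Stock* findBySymbol(const StocksBySymbol& index, const std::string& symbol) {
    Stock key{0, symbol};
    auto it = index.find(&key);
    return it == index.end() ? nullptr : *it;
}
```

Both sets hold pointers to the same objects, so adding an index never copies an object.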

PersistentIDObject

  • Add a dirty flag to all objects, set to true on any change that must be persisted
  • Use an internal in-memory counter to generate the next db_id for a newly created object
  • This means that when creating new objects, there is NO NEED to access db, VERY IMPORTANT!
  • Use delayed-write tactics to write all dirty objects on idle time
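
A minimal sketch of such a base class is shown below. The member names (assignNewId, seedNextId, setClean), the value of UNSAVED, and the use of std::atomic for the counter are assumptions made for illustration; the original page does not show the real class.

```cpp
#include <atomic>
#include <cstdint>

// Sketch of a base class with a dirty flag and an in-memory id counter.
class PersistentIDObject {
public:
    static constexpr std::int64_t UNSAVED = -1;  // assumed sentinel value

    explicit PersistentIDObject(std::int64_t db_id = UNSAVED) : db_id_(db_id) {}
    virtual ~PersistentIDObject() = default;

    // Take the next id from the in-memory counter; no database access needed.
    void assignNewId() {
        db_id_ = next_db_id_.fetch_add(1);
        dirty_ = true;  // a new object must be persisted later
    }

    // Seed the counter once after loading, e.g. with the largest stored id + 1.
    static void seedNextId(std::int64_t next) { next_db_id_.store(next); }

    std::int64_t db_id() const { return db_id_; }
    bool bDirty() const { return dirty_; }
    void setDirty() { dirty_ = true; }
    void setClean() { dirty_ = false; }  // called by the delayed writer after saving

private:
    std::int64_t db_id_;
    bool dirty_ = false;
    static std::atomic<std::int64_t> next_db_id_;
};

std::atomic<std::int64_t> PersistentIDObject::next_db_id_{1};
```

A delayed writer would walk the containers during idle time, persist every object whose bDirty() is true, and then call setClean().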

Memory Model

  • Use a Datastore manager (aka "MemoryModel") to hold sets
  • It can look up objects by any key, and strip away const to return a mutable object. NOTE that the user must not damage the key values!
  • Derive a class from the memory model for persistence; it can use any persistence method (local, remote, sql, nosql, etc.).
  • Make sure that the base MemoryModel class is concrete (not abstract), thread-safe, and self-contained; this makes parallel calculations trivial, helps scalability, etc.
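
The points above might look roughly like this in code. It is a sketch under stated assumptions: the Item type, the add/find/writeObject names, and the single std::mutex are hypothetical, and only saveDirtyObjectsAsNeeded() is a name taken from this page. The base class is concrete and works purely in memory; a derived class would override writeObject() to talk to a real store.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <mutex>
#include <unordered_set>

struct Item {  // hypothetical stored object
    std::int64_t db_id_;
    double price_;
    bool dirty_;
};
struct ItemHash {
    std::size_t operator()(const Item* p) const { return std::hash<std::int64_t>()(p->db_id_); }
};
struct ItemEqual {
    bool operator()(const Item* a, const Item* b) const { return a->db_id_ == b->db_id_; }
};

class MemoryModel {
public:
    virtual ~MemoryModel() { for (const Item* p : items_) delete p; }

    void add(std::int64_t db_id, double price) {
        std::lock_guard<std::mutex> lock(mutex_);
        const Item* p = new Item{db_id, price, true};
        if (!items_.insert(p).second) delete p;  // id already present
    }

    // Strips const so the caller can modify the object.
    // The caller must not change db_id_, since it is the hash key.
    Item* find(std::int64_t db_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        Item key{db_id, 0.0, false};
        auto it = items_.find(&key);
        return it == items_.end() ? nullptr : const_cast<Item*>(*it);
    }

    // Writes every dirty object and returns how many were written.
    virtual std::size_t saveDirtyObjectsAsNeeded() {
        std::lock_guard<std::mutex> lock(mutex_);
        std::size_t written = 0;
        for (const Item* p : items_) {
            if (!p->dirty_) continue;
            writeObject(*p);
            const_cast<Item*>(p)->dirty_ = false;  // not a key field, safe to change
            ++written;
        }
        return written;
    }

protected:
    virtual void writeObject(const Item&) {}  // no-op in the memory-only base

private:
    std::mutex mutex_;
    std::unordered_set<const Item*, ItemHash, ItemEqual> items_;
};
```

Note that the lock only protects the containers; changes made through the returned pointer need their own synchronization if several threads touch the same object.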

Schema Driven

  • Schema is shared across memory objects and every persistence layer. Follow a DRY pattern by generating all code from a common schema definition.
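  • JSON Schema is elegant: it is as simple as possible, but no simpler. Always use it. Essential tools:
    • quicktype (https://quicktype.io/) - use npm to install it and to incorporate code generation into CI
    • JSON formatter (https://jsonformatter.curiousconcept.com/)
  • Always provide a constructor with a single db_id parameter, defaulted to UNSAVED; this serves two purposes:
    • no-param constructor for reflection via quicktype
    • id constructor for use as key for unordered_set::find()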
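
The dual-purpose constructor can be illustrated with a short sketch. The Account type, the findById helper, and the value chosen for UNSAVED are assumptions for the example only. One constructor with a defaulted db_id acts both as the default constructor and as a cheap way to build a lookup key.

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_set>

constexpr std::int64_t UNSAVED = -1;  // assumed sentinel value

// Hypothetical schema object with a single defaulted db_id parameter.
struct Account {
    explicit Account(std::int64_t db_id = UNSAVED) : db_id_(db_id) {}
    std::int64_t db_id_;
    std::string name_;
};

struct AccountHash {
    std::size_t operator()(const Account* p) const {
        return std::hash<std::int64_t>()(p->db_id_);
    }
};
struct AccountEqual {
    bool operator()(const Account* a, const Account* b) const {
        return a->db_id_ == b->db_id_;
    }
};
using Accounts = std::unordered_set<const Account*, AccountHash, AccountEqual>;

// Build a key object from the id alone and use it for find().
inline const Account* findById(const Accounts& accounts, std::int64_t db_id) {
    Account key(db_id);
    auto it = accounts.find(&key);
    return it == accounts.end() ? nullptr : *it;
}
```

Because only db_id_ takes part in hashing and comparison, the key object does not need any other field filled in.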

THE RESULT

  • Fast, efficient unordered_set<PersistentIDObject*> containers that can be automatically serialized to/from any database layer
  • Fast supplemental in-memory indexes into objects wherever needed
  • DRY code that allows schema changes within a rapid, complete, and simplified codebase

Delayed delete pattern

           1) to dynamically delete an object: 
               a) ba.setDeleted();
               b) do not remove from any container indexes
               c) but fix the index sorting, flags, etc, as if the object were gone, so the program will function as if it is!
                  eg: do not remove from runsByRank_, but adjust all other ranking as if the run was gone
           2) include deleted status in active check, etc.:
               // NOTE use the direct function rather than !bFunc(), as deleted objects return false for both.
               bool bActive() const        { return b_active_ && !bDeleted();  }
               bool bInactive() const      { return !b_active_ && !bDeleted(); }
               ---
               for (auto& psr: runsByRank_) {
                 if (psr->bDeleted()) continue;
                 ...
           3) all deletion work is done in MemoryModel::saveDirtyObjectsAsNeeded(), see that code
               a) deletion check should happen in delayed write check:
                   if (pau->bDirtyOrDeleted())
                       bNeeded = true;
               b) if bNeeded, always do deletions first, starting with greatest grandparent container, to minimize work
               c) use the erase-remove pattern to remove all deleted items in one loop
                   see code here for reference implementation: BrokerAccount::removeDeletedStockRuns()
                     i) iterate and remove item from all secondary indices
                     ii) iterate primary index, and use the lambda of the erase-remove operation to delete memory allocation and remove db record
                     iii) elements of associative containers can be safely erased directly through their iterators;
                          sequential containers like vector require use of the erase-remove idiom
                          see reference implementation for example code!
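
Step 3c can be sketched as below. This is not the referenced BrokerAccount::removeDeletedStockRuns() implementation, which is not shown on this page; the Run and RunStore types and the removeDeletedRuns name are hypothetical. It applies the erase-remove idiom to a vector index first, then erases from the owning unordered_set through its iterators and frees each object.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <vector>

struct Run {  // hypothetical object with a deleted flag
    std::int64_t db_id_;
    bool deleted_;
    bool bDeleted() const { return deleted_; }
};
struct RunHash {
    std::size_t operator()(const Run* p) const { return std::hash<std::int64_t>()(p->db_id_); }
};
struct RunEqual {
    bool operator()(const Run* a, const Run* b) const { return a->db_id_ == b->db_id_; }
};

struct RunStore {
    std::unordered_set<const Run*, RunHash, RunEqual> runsById_;  // primary index, owns the objects
    std::vector<const Run*> runsByRank_;                          // secondary index

    ~RunStore() { for (const Run* p : runsById_) delete p; }

    // Remove all objects flagged as deleted; returns how many were freed.
    std::size_t removeDeletedRuns() {
        // i) Secondary index first: erase-remove on the sequential container.
        runsByRank_.erase(
            std::remove_if(runsByRank_.begin(), runsByRank_.end(),
                           [](const Run* p) { return p->bDeleted(); }),
            runsByRank_.end());

        // ii) Primary index: erase() returns the next valid iterator.
        std::size_t removed = 0;
        for (auto it = runsById_.begin(); it != runsById_.end();) {
            if ((*it)->bDeleted()) {
                const Run* p = *it;
                it = runsById_.erase(it);  // erase before freeing the object
                delete p;                  // the db record would be removed here too
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }
};
```

The secondary index is cleaned before the primary one so that no index ever holds a pointer to freed memory.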