Major Objects: Difference between revisions

Latest revision as of 02:44, 3 August 2020

Use Major Objects for fast, thread-safe, in-memory handling of large amounts of data that must also be persisted. The pattern supports complex objects with simple keys, fast lookup by multiple keys, serialization to and from multiple persistence layers, and fast DRY schema changes.

Major Objects

Use an unordered_set of const pointers to objects derived from PersistentIDObject (see below)
The main container's primary key is always db_id
Always use the db_id for foreign keys
Other containers can be created using other members as keys; the only cost is for a new set of pointers (not objects!)
- For other unordered_sets, just define new hash functions
- Other useful containers are sorted_vector and map (<key,value> pair)
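
As a minimal sketch of the container layout described above: one set of const pointers hashed on db_id, plus a second set over the same pointers hashed on another member. AppUser here is a hypothetical two-field stand-in for a class derived from PersistentIDObject, and the functor and helper names (DbIdHash, NameHash, findById, findByName) are illustrative assumptions, not names from the original codebase.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for a class derived from PersistentIDObject.
struct AppUser {
    int64_t db_id;
    std::string name;
};

// Primary index: hash and compare the pointed-to object's db_id.
struct DbIdHash {
    std::size_t operator()(const AppUser* p) const { return std::hash<int64_t>{}(p->db_id); }
};
struct DbIdEqual {
    bool operator()(const AppUser* a, const AppUser* b) const { return a->db_id == b->db_id; }
};

// Secondary index: the same pointers, keyed by name instead.
struct NameHash {
    std::size_t operator()(const AppUser* p) const { return std::hash<std::string>{}(p->name); }
};
struct NameEqual {
    bool operator()(const AppUser* a, const AppUser* b) const { return a->name == b->name; }
};

using UsersById   = std::unordered_set<const AppUser*, DbIdHash, DbIdEqual>;
using UsersByName = std::unordered_set<const AppUser*, NameHash, NameEqual>;

// Look up by db_id using a throwaway key object that is never stored.
inline const AppUser* findById(const UsersById& users, int64_t id) {
    AppUser key{id, ""};
    auto it = users.find(&key);
    return it == users.end() ? nullptr : *it;
}

// Look up by name; only the name field of the key object matters here.
inline const AppUser* findByName(const UsersByName& users, const std::string& name) {
    AppUser key{0, name};
    auto it = users.find(&key);
    return it == users.end() ? nullptr : *it;
}
```

Each extra index costs one pointer per object, so adding a lookup key does not duplicate the objects themselves.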

PersistentIDObject

Add a dirty flag to all objects, set to true on any change that must be persisted
Use an internal in-memory max_db_id counter in the parent to generate the next db_id for a newly created object
This means that when creating new objects, there is NO NEED to access the db. VERY IMPORTANT!
Use delayed-write tactics to write all dirty objects on idle time
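
The points above can be sketched roughly as follows. This is not the real PersistentIDObject: the class name SketchIDObject, the DBID_NEW sentinel, the numeric value chosen for DBID_DO_NOT_SAVE, and the saveDirtyObjects helper are all assumptions made for illustration. Only the ideas (a dirty flag, a parent-owned max_db_id counter passed into the constructor, and a delayed write of dirty objects) come from the text.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

class SketchIDObject {
public:
    static constexpr int64_t DBID_DO_NOT_SAVE = -1;  // temporary object, never written (value assumed)
    static constexpr int64_t DBID_NEW = 0;           // hypothetical sentinel: request a fresh id

    // max_db_id is the in-memory counter owned by the parent container.
    SketchIDObject(int64_t& max_db_id, int64_t db_id = DBID_DO_NOT_SAVE)
        : db_id_(db_id), dirty_(false)
    {
        if (db_id == DBID_NEW) {
            db_id_ = ++max_db_id;   // new id without touching the database
            dirty_ = true;          // a new object must be written eventually
        } else if (db_id > max_db_id) {
            max_db_id = db_id;      // loaded object: keep the counter ahead of known ids
        }
    }

    int64_t dbId() const { return db_id_; }
    // Temporary objects never report dirty, so they are never persisted.
    bool bDirty() const { return dirty_ && db_id_ != DBID_DO_NOT_SAVE; }
    void setDirty() { dirty_ = true; }
    void setSaved() { dirty_ = false; }

private:
    int64_t db_id_;
    bool dirty_;
};

// Delayed write: call during idle time. 'write' is any callable that persists one object.
// Returns the number of objects written.
template <typename Writer>
int saveDirtyObjects(std::vector<SketchIDObject*>& objects, Writer write) {
    int written = 0;
    for (SketchIDObject* p : objects) {
        if (!p->bDirty()) continue;
        write(*p);
        p->setSaved();
        ++written;
    }
    return written;
}
```

In this sketch, constructing an object with the default id yields a throwaway object, which is convenient for use as a lookup key.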

Memory Model

Use a Datastore manager (aka "MemoryModel") to hold sets
It can look up objects by any key, and strip away const to return a mutable object. NOTE that the user must not damage the key values!
Derive a class from the memory model for persistence; it can use any persistence method (local, remote, sql, nosql, etc.).
Make sure that the base MemoryModel class is concrete (not abstract), thread-safe, and self-contained; this makes parallel calculations trivial, helps scalability, etc.
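
A rough sketch of such a datastore manager, under stated assumptions: SketchMemoryModel, Quote, addQuote, findQuote, and the write hook are hypothetical names, and a single std::mutex guarding the container is only one possible way to get thread safety. Note that the mutex here protects the set itself, not later changes the caller makes through the returned pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <mutex>
#include <unordered_set>

// Hypothetical persisted object; db_id is the key and must not be modified by callers.
struct Quote {
    int64_t db_id;
    double price;
};
struct QuoteHash {
    std::size_t operator()(const Quote* p) const { return std::hash<int64_t>{}(p->db_id); }
};
struct QuoteEqual {
    bool operator()(const Quote* a, const Quote* b) const { return a->db_id == b->db_id; }
};

// Concrete base: fully usable in memory with no persistence layer attached.
class SketchMemoryModel {
public:
    SketchMemoryModel() = default;
    SketchMemoryModel(const SketchMemoryModel&) = delete;
    SketchMemoryModel& operator=(const SketchMemoryModel&) = delete;
    virtual ~SketchMemoryModel() {
        for (const Quote* p : quotes_) delete p;
    }

    // Takes ownership on success; returns false if the db_id is already present.
    bool addQuote(std::unique_ptr<Quote> q) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!quotes_.insert(q.get()).second) return false;  // q frees the duplicate on return
        q.release();
        return true;
    }

    // Strip const so the caller can change non-key fields (here: price, never db_id).
    Quote* findQuote(int64_t db_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        Quote key{db_id, 0.0};
        auto it = quotes_.find(&key);
        return it == quotes_.end() ? nullptr : const_cast<Quote*>(*it);
    }

    // Persistence hook: a no-op in the base class; a derived class would override it
    // with sqlite, a remote store, etc.
    virtual bool write(const Quote&) { return true; }

private:
    std::mutex mutex_;
    std::unordered_set<const Quote*, QuoteHash, QuoteEqual> quotes_;
};
```

Because the base class is concrete, each worker thread or calculation can hold its own in-memory instance without pulling in any persistence code.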

Schema Driven

Schema is shared across memory objects and every persistence layer. Follow a DRY pattern by generating all code from a common schema definition.
JSON Schema is elegant: it is as simple as possible, but no simpler. Always use it. Essential tools:
- quicktype - use npm to install it and to incorporate code generation into CI; this gives us reflection in C++!
- JSON formatter
Always provide a constructor following this format, that by default creates temporary objects that can be safely thrown away:

   // This constructor serves several purposes:
   //  1) standard full-param constructor, efficient for both deserializing and initializing
   //  2) no-param constructor for reflection via quicktype
   //  3) id constructor for loading via id + quicktype fields
   //  4) id constructor for use as key for unordered_set::find()
   StockQuoteDetails(
       int64_t db_id = PersistentIDObject::DBID_DO_NOT_SAVE,
       double quote = -1.0,
       time_t timestamp = 0,               // Set invalid by default until we get a live one
       AutotradeParameterSet* p_aps = 0    // For global autoanalysis results
   ) :
       // Call base class
       inherited(sq_max_db_id_,db_id),

       // internal members
       n_refcount_(1)
   {
       // persistent members
       quote_ = quote;
       timestamp_ = timestamp;
   }

Use object pointers in memory, and provide code-friendly accessors, eg:

   // EXTERNAL REFERENCES
   // NOTE we are not responsible for these allocations.
   // Access pointers to parents as references.
   void setParent(BrokerAccount& ba) { pba_ = &ba; }
   BrokerAccount& ba() { assert(pba_ != 0); return *pba_; }
   BrokerAccount& ba() const { assert(pba_ != 0); return *pba_; }

   void setStockQuote(StockQuote& sq) { psq_ = &sq; }
   StockQuote& sq() { assert(psq_ != 0); return *psq_; }
   StockQuote& sq() const { assert(psq_ != 0); return *psq_; }

   // Access nullable pointers directly.
   AnalysisData* pad_;     // the analysis parameters used to run the last analysis

Use simple JSON types in memory, and provide code-friendly accessors, eg:

       // Cast database JSON types to code-friendly types
       RANK_TYPE rt() const { return (RANK_TYPE)rank_type_; }
       APS_SCOPE scope() const { return (APS_SCOPE)aps_scope_; }
       ORDER_STATUS status() const { return (ORDER_STATUS)order_status_; }
       time_t tFirstSellDate() const { return (time_t)first_sell_date_; }
       double  getPctGain(GAIN_TIMEINTERVAL_TYPE gtt) const;
       void    setPctGain(GAIN_TIMEINTERVAL_TYPE gtt, const double& gain);
       int64_t getSellsCount(GAIN_TIMEINTERVAL_TYPE gtt) const;
       void    setSellsCount(GAIN_TIMEINTERVAL_TYPE gtt, int64_t count);

Read pattern: read entire set, then loop the set and patch/distribute each object as needed; eg:

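#include <exception>   // std::exception

bool SqliteLocalModel::readUsers() {
   int_fast32_t count = 0;     // number of users loaded (e.g. for logging)
   try {
       // Read the entire AppUsers set in one pass
       if (!read("AppUsers",AppUser(),users_))
           return false;
       // Then patch/distribute each object as needed
       for (auto& u : users_)
       {
           addAppUserToMemory(u);
           u->setSaved();      // freshly loaded objects are not dirty
           ++count;
       }
   }
   catch (const std::exception&) {
       // NOTE the original excerpt ends before its handler; failing the read here is an assumption
       return false;
   }
   return true;
}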

Write pattern: hopefully simple, eg:

bool SqliteLocalModel::write(AppUser& obj) { 
    json j; to_json(j,obj); return write("AppUsers",obj,j,true);
}

THE RESULT

Fast, efficient unordered_set<PersistentIDObject*> containers that can be automatically serialized to/from any database layer
Fast supplemental in-memory indexes into objects wherever needed
DRY code that allows rapid schema changes within a complete yet simplified codebase

Delayed delete pattern

           1) to dynamically delete an object: 
               a) ba.setDeleted();
               b) do not remove it from any container indexes
               c) but fix the index sorting, flags, etc., as if the object were gone, so the program behaves as if it is!
                  e.g.: do not remove a run from runsByRank_, but adjust all other rankings as if the run were gone
           2) include deleted status in active check, etc.:
               // NOTE use the direct function rather than !bFunc(), as deleted objects return false for both.
               bool bActive() const        { return b_active_ && !bDeleted();  }
               bool bInactive() const      { return !b_active_ && !bDeleted(); }
               ---
               for (auto& psr: runsByRank_) {
                 if (psr->bDeleted()) continue;
                 ...
           3) all deletion work is done in MemoryModel::saveDirtyObjectsAsNeeded(), see that code
               a) deletion check should happen in delayed write check:
                   if (pau->bDirtyOrDeleted())
                       bNeeded = true;
               b) if bNeeded, always do deletions first, starting with the greatest-grandparent container, to minimize work
               c) use the erase-remove idiom to remove all deleted items in one pass
                   see this code for a reference implementation: BrokerAccount::removeDeletedStockRuns()
                     i) iterate and remove the item from all secondary indices
                     ii) iterate the primary index, and use the lambda of the erase-remove operation to free the memory allocation and remove the db record
                     iii) elements of associative containers can be erased directly through their iterators;
                          sequential containers like vector require the erase-remove idiom
                          see the reference implementation for example code!
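
The cleanup in step 3c might look something like the sketch below. This is not BrokerAccount::removeDeletedStockRuns(); Run, Account, and removeDeletedRuns are hypothetical names. The layout is also an assumption: the primary index is an unordered_set hashed by pointer value (for brevity) that owns the allocations, and runsByRank is a sequential secondary index. The vector uses the erase-remove idiom, while the associative container is erased directly through its iterator.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical object with a deleted flag, as set by setDeleted() in step 1.
struct Run {
    explicit Run(int64_t id) : db_id(id), deleted(false) {}
    int64_t db_id;
    bool deleted;
    bool bDeleted() const { return deleted; }
    void setDeleted() { deleted = true; }
};

struct Account {
    Account() = default;
    Account(const Account&) = delete;
    Account& operator=(const Account&) = delete;
    ~Account() { for (Run* p : runs) delete p; }

    std::unordered_set<Run*> runs;   // primary index, owns the allocations
    std::vector<Run*> runsByRank;    // secondary index, pointers only

    // Removes every run flagged as deleted; returns how many were removed.
    std::size_t removeDeletedRuns() {
        // i) secondary (sequential) index first: erase-remove idiom, nothing is freed yet
        runsByRank.erase(
            std::remove_if(runsByRank.begin(), runsByRank.end(),
                           [](const Run* p) { return p->bDeleted(); }),
            runsByRank.end());

        // ii) primary (associative) index: erase through the iterator, then free the object
        std::size_t removed = 0;
        for (auto it = runs.begin(); it != runs.end(); ) {
            if ((*it)->bDeleted()) {
                Run* p = *it;
                it = runs.erase(it);   // erase returns the next valid iterator
                delete p;              // the db record would also be removed at this point
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }
};
```

Secondary indices are cleaned before the primary one so that no index ever holds a pointer to freed memory.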
