Major Objects: Difference between revisions
Latest revision as of 02:44, 3 August 2020
Use Major Objects for fast, thread-safe in-memory handling of large amounts of data that must also be persisted. The pattern supports complex objects with simple keys, fast lookup by multiple keys, serialization to and from multiple persistence layers, and fast DRY schema changes.
Major Objects
- Use an unordered_set of const pointers to objects derived from PersistentIDObject (see below)
- The main container's primary key is always db_id
- Always use the db_id for foreign keys
- Other containers can be created using other members as keys; the only cost is for a new set of pointers (not objects!)
  - For other unordered_sets, just define new hash functions
  - Other useful containers are sorted_vector and map (<key,value> pair)
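The container layout above can be sketched as follows. This is a minimal illustration, not the reference implementation: AppUser, name_, the hash/equality functors and the two find helpers are hypothetical names, and the object is a plain struct rather than a real PersistentIDObject subclass.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_set>

// Hypothetical stand-in for an object derived from PersistentIDObject.
struct AppUser {
    int64_t db_id_;
    std::string name_;
};

// Primary index: hash and compare by db_id only.
struct IdHash {
    std::size_t operator()(const AppUser* p) const { return std::hash<int64_t>()(p->db_id_); }
};
struct IdEqual {
    bool operator()(const AppUser* a, const AppUser* b) const { return a->db_id_ == b->db_id_; }
};

// Secondary index: the same pointers, hashed and compared by name.
struct NameHash {
    std::size_t operator()(const AppUser* p) const { return std::hash<std::string>()(p->name_); }
};
struct NameEqual {
    bool operator()(const AppUser* a, const AppUser* b) const { return a->name_ == b->name_; }
};

using UsersById   = std::unordered_set<const AppUser*, IdHash, IdEqual>;
using UsersByName = std::unordered_set<const AppUser*, NameHash, NameEqual>;

// Look up by db_id using a throwaway key object; returns nullptr if absent.
inline const AppUser* findById(const UsersById& users, int64_t db_id) {
    AppUser key{db_id, ""};
    auto it = users.find(&key);
    return it == users.end() ? nullptr : *it;
}

// Look up by name through the secondary index; returns nullptr if absent.
inline const AppUser* findByName(const UsersByName& users, const std::string& name) {
    AppUser key{0, name};
    auto it = users.find(&key);
    return it == users.end() ? nullptr : *it;
}
```

Both sets store pointers to the same objects, so a second index costs one pointer per object. The throwaway key object is why a cheap default-constructible form of each class is useful.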
PersistentIDObject
- Add a dirty flag to all objects, set to true on any change that must be persisted
- Use an internal in-memory max_db_id counter in the parent to generate the next db_id for a newly created object
- This means that creating new objects requires NO database access. VERY IMPORTANT!
- Use delayed-write tactics to write all dirty objects during idle time
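A minimal sketch of such a base class follows, assuming a constructor that takes the parent's counter by reference (as the inherited(sq_max_db_id_,db_id) call further down suggests). The sentinel values, DBID_ASSIGN_NEXT, bDirty() and setDirty() are assumptions made for illustration; only DBID_DO_NOT_SAVE and setSaved() appear in the original text. The counter is not synchronized here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical sketch; the real PersistentIDObject may differ.
class PersistentIDObject {
public:
    // The name DBID_DO_NOT_SAVE is from the source; both values are assumptions.
    static constexpr int64_t DBID_DO_NOT_SAVE = -2;  // temporary object, never persisted
    static constexpr int64_t DBID_ASSIGN_NEXT = -1;  // hypothetical: request a fresh id

    PersistentIDObject(int64_t& max_db_id, int64_t db_id)
        : db_id_(db_id), dirty_(false) {
        if (db_id == DBID_ASSIGN_NEXT) {
            db_id_ = ++max_db_id;  // new object: id comes from memory, no db access
            dirty_ = true;         // and it must be written eventually
        } else if (db_id > max_db_id) {
            max_db_id = db_id;     // loaded object: keep the counter ahead of known ids
        }
    }

    int64_t db_id() const { return db_id_; }
    bool bDirty() const { return dirty_; }

    // Call on any change that must be persisted; temporaries never become dirty.
    void setDirty() { if (db_id_ != DBID_DO_NOT_SAVE) dirty_ = true; }

    // Call after the object has been read from or written to the database.
    void setSaved() { dirty_ = false; }

private:
    int64_t db_id_;
    bool dirty_;
};
```

A delayed writer then only needs to walk the containers during idle time and write whatever reports bDirty().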
Memory Model
- Use a Datastore manager (aka "MemoryModel") to hold sets
- It can look up objects by any key and strip away const to return a mutable object. NOTE that callers must not change the key values, or the containers holding the object will be corrupted!
- Derive a class from the memory model for persistence; it can use any persistence method (local, remote, sql, nosql, etc.).
- Make sure that the base MemoryModel class is concrete (not abstract), thread-safe, and self-contained; this makes parallel calculations trivial, helps scalability, etc.
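The split between a concrete in-memory base and a derived persistence class might look roughly like this. Everything here is a hypothetical sketch: AppUser is a plain struct, addUser/findUser/saveDirtyObjects and the virtual write hook are invented names, and LoggingModel stands in for a real database-backed class by recording what it would write.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <mutex>
#include <string>
#include <unordered_set>
#include <vector>

struct AppUser {  // hypothetical stand-in for a PersistentIDObject subclass
    int64_t db_id_;
    std::string email_;
    bool dirty_;
};
struct IdHash {
    std::size_t operator()(const AppUser* p) const { return std::hash<int64_t>()(p->db_id_); }
};
struct IdEqual {
    bool operator()(const AppUser* a, const AppUser* b) const { return a->db_id_ == b->db_id_; }
};

// Concrete, self-contained base: usable on its own for pure in-memory work.
class MemoryModel {
public:
    virtual ~MemoryModel() { for (const AppUser* p : users_) delete p; }

    // Creates and owns a new user; returns false if the db_id already exists.
    bool addUser(int64_t db_id, const std::string& email) {
        std::lock_guard<std::mutex> lock(mutex_);
        AppUser* p = new AppUser{db_id, email, true};
        if (!users_.insert(p).second) { delete p; return false; }
        return true;
    }

    // Returns a mutable pointer, or nullptr. Callers may change any field
    // except the key (db_id_), since that would corrupt the hash table.
    AppUser* findUser(int64_t db_id) {
        std::lock_guard<std::mutex> lock(mutex_);
        AppUser key{db_id, "", false};
        auto it = users_.find(&key);
        return it == users_.end() ? nullptr : const_cast<AppUser*>(*it);
    }

    // Passes every dirty object to the persistence hook; returns how many were written.
    int saveDirtyObjects() {
        std::lock_guard<std::mutex> lock(mutex_);
        int written = 0;
        for (const AppUser* cp : users_) {
            AppUser* p = const_cast<AppUser*>(cp);
            if (p->dirty_ && write(*p)) { p->dirty_ = false; ++written; }
        }
        return written;
    }

protected:
    // Base class has no persistence layer, so nothing is written.
    virtual bool write(const AppUser&) { return false; }

private:
    std::mutex mutex_;
    std::unordered_set<const AppUser*, IdHash, IdEqual> users_;
};

// Derived class supplies persistence; here it only records what it would write.
class LoggingModel : public MemoryModel {
public:
    std::vector<std::string> written_;
protected:
    bool write(const AppUser& u) override { written_.push_back(u.email_); return true; }
};
```

The const_cast is well-defined because the objects were created non-const. Note that the lock only protects the container; changes made through a pointer returned by findUser would need their own synchronization in a multithreaded program.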
Schema Driven
- Schema is shared across memory objects and every persistence layer. Follow a DRY pattern by generating all code from a common schema definition.
- JSON Schema is elegant: it is as simple as possible, but no simpler. Always use it. Essential tools:
  - quicktype - use npm to install it and to incorporate code generation into CI; this gives us reflection in C++!
  - JSON formatter
- Always provide a constructor in the following format, which by default creates temporary objects that can be safely thrown away:
    // This constructor serves several purposes:
    //  1) standard full-param constructor, efficient for both deserializing and initializing
    //  2) no-param constructor for reflection via quicktype
    //  3) id constructor for loading via id + quicktype fields
    //  4) id constructor for use as key for unordered_set::find()
    StockQuoteDetails(
        int64_t db_id = PersistentIDObject::DBID_DO_NOT_SAVE,
        double quote = -1.0,
        time_t timestamp = 0,             // Set invalid by default until we get a live one
        AutotradeParameterSet* p_aps = 0  // For global autoanalysis results
    ) :
        // Call base class
        inherited(sq_max_db_id_,db_id),
        // internal members
        n_refcount_(1)
    {
        // persistent members
        quote_ = quote;
        timestamp_ = timestamp;
    }
- Use object pointers in memory, and provide code-friendly accessors, eg:
    // EXTERNAL REFERENCES
    // NOTE we are not responsible for these allocations.
    // Access pointers to parents as references.
    void setParent(BrokerAccount& ba) { pba_ = &ba; }
    BrokerAccount& ba() { assert(pba_ != 0); return *pba_; }
    BrokerAccount& ba() const { assert(pba_ != 0); return *pba_; }
    void setStockQuote(StockQuote& sq) { psq_ = &sq; }
    StockQuote& sq() { assert(psq_ != 0); return *psq_; }
    StockQuote& sq() const { assert(psq_ != 0); return *psq_; }
    // Access nullable pointers directly.
    AnalysisData* pad_; // the analysis parameters used to run the last analysis
- Use simple JSON types in memory, and provide code-friendly accessors, eg:
    // Cast database JSON types to code-friendly types
    RANK_TYPE rt() const { return (RANK_TYPE)rank_type_; }
    APS_SCOPE scope() const { return (APS_SCOPE)aps_scope_; }
    ORDER_STATUS status() const { return (ORDER_STATUS)order_status_; }
    time_t tFirstSellDate() const { return (time_t)first_sell_date_; }
    double getPctGain(GAIN_TIMEINTERVAL_TYPE gtt) const;
    void setPctGain(GAIN_TIMEINTERVAL_TYPE gtt, const double& gain);
    int64_t getSellsCount(GAIN_TIMEINTERVAL_TYPE gtt) const;
    void setSellsCount(GAIN_TIMEINTERVAL_TYPE gtt, int64_t count);
- Read pattern: read entire set, then loop the set and patch/distribute each object as needed; eg:
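    bool SqliteLocalModel::readUsers() {
        int_fast32_t count = 0;
        try {
            if (!read("AppUsers",AppUser(),users_))
                return false;
            for (auto& u : users_)
            {
                addAppUserToMemory(u);
                u->setSaved();
                ++count;
            }
        }
        catch (const std::exception&) {
            // NOTE the original excerpt stops after the try block;
            // this minimal handler and the final return are assumed.
            return false;
        }
        return true;
    }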
- Write pattern: hopefully simple, eg:
    bool SqliteLocalModel::write(AppUser& obj) {
        json j;
        to_json(j,obj);
        return write("AppUsers",obj,j,true);
    }
THE RESULT
- Fast, efficient unordered_set<PersistentIDObject*> containers that can be automatically serialized to/from any database layer
- Fast supplemental in-memory indexes into objects wherever needed
- DRY code that allows rapid schema changes within a complete yet simplified codebase
Delayed delete pattern
1) to dynamically delete an object:
  a) ba.setDeleted();
  b) do not remove from any container indexes
  c) but fix the index sorting, flags, etc, as if the object were gone, so the program will function as if it is!
     eg: do not remove from runsByRank_, but adjust all other ranking as if the run was gone
2) include deleted status in active check, etc.:
    // NOTE use the direct function rather than !bFunc(), as deleted objects return false for both.
    bool bActive() const { return b_active_ && !bDeleted(); }
    bool bInactive() const { return !b_active_ && !bDeleted(); }
    ---
    for (auto& psr: runsByRank_) {
        if (psr->bDeleted())
            continue;
        ...
3) all deletion work is done in MemoryModel::saveDirtyObjectsAsNeeded(), see that code
  a) deletion check should happen in delayed write check:
       if (pau->bDirtyOrDeleted()) bNeeded = true;
  b) if bNeeded, always do deletions first, starting with greatest grandparent container, to minimize work
  c) use the erase-remove pattern to remove all deleted items in one loop
     see code here for reference implementation: BrokerAccount::removeDeletedStockRuns()
    i) iterate and remove item from all secondary indices
    ii) iterate primary index, and use the lambda of the erase-remove operation to delete memory allocation and remove db record
    iii) associative container iterators can be safely deleted directly
         sequential containers like vector require use of erase-remove idiom
     see reference implementation for example code!
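Step 3c can be sketched as below. This is not the reference implementation of BrokerAccount::removeDeletedStockRuns(), which is not shown here; it is a guess at its general shape. StockRun's fields, the runs_ primary index and the hash functors are hypothetical, and the database delete is reduced to a comment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <functional>
#include <unordered_set>
#include <vector>

struct StockRun {  // hypothetical minimal object
    int64_t db_id_;
    int rank_;
    bool deleted_;
    bool bDeleted() const { return deleted_; }
};
struct RunIdHash {
    std::size_t operator()(const StockRun* p) const { return std::hash<int64_t>()(p->db_id_); }
};
struct RunIdEqual {
    bool operator()(const StockRun* a, const StockRun* b) const { return a->db_id_ == b->db_id_; }
};

struct BrokerAccount {
    std::unordered_set<StockRun*, RunIdHash, RunIdEqual> runs_;  // primary index, owns the objects
    std::vector<StockRun*> runsByRank_;                          // secondary index, pointers only

    BrokerAccount() = default;
    BrokerAccount(const BrokerAccount&) = delete;             // owning raw pointers: no copies
    BrokerAccount& operator=(const BrokerAccount&) = delete;
    ~BrokerAccount() { for (StockRun* p : runs_) delete p; }

    // Removes every run flagged as deleted; returns how many were freed.
    std::size_t removeDeletedStockRuns() {
        // i) Secondary (sequential) index: erase-remove idiom, pointers only.
        runsByRank_.erase(
            std::remove_if(runsByRank_.begin(), runsByRank_.end(),
                           [](const StockRun* p) { return p->bDeleted(); }),
            runsByRank_.end());

        // ii) Primary (associative) index: erase through the iterator, then free.
        std::size_t removed = 0;
        for (auto it = runs_.begin(); it != runs_.end();) {
            if ((*it)->bDeleted()) {
                StockRun* p = *it;
                it = runs_.erase(it);  // erase first: the set may still hash *p
                delete p;              // a real model would also remove the db record here
                ++removed;
            } else {
                ++it;
            }
        }
        return removed;
    }
};
```

Secondary indices are cleaned before the primary one so that no index ever holds a pointer to freed memory.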