I spent this morning exploring available tech to address this goal:

Add a bigdata database to my application, to archive older data out of the realtime local model

This is for my stock app, which deals with realtime in-memory data during market hours, with a delayed write to local storage.  At the end of the day, it can then archive most of what it collected.

Because I have not achieved “success” in life yet, at least enough to allow me to pursue my larger goals uninhibited, I have to be very careful about how I apply my limited resources.  To be more precise:

  • Follow patterns that are as simple as possible (but no simpler), and sustainable into the next decade.
  • For my projects, limit languages, libraries and tools to those that
    • are well maintained
    • solve difficult problems more elegantly than I could with a medium level of effort

The result of today’s philosophically-informed research:

  • The primary languages of my software projects should be JavaScript and C++
  • All data should be defined by JSON Schema, which is used to generate code via quicktype (a small example follows this list)
  • Long-term libraries and tools include boost, jquery, bootstrap, accounting.js, moment.js, nlohmann::json, sqlite, postgres
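To make that concrete, here is roughly what the pipeline looks like for one small type.  The Quote schema and file names are made up for illustration; quicktype's C++ output is built on nlohmann::json, which is what enables the trick described next.

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Quote",
  "type": "object",
  "properties": {
    "symbol": { "type": "string" },
    "price":  { "type": "number" },
    "volume": { "type": "integer" }
  },
  "required": ["symbol", "price", "volume"]
}

quicktype --src-lang schema --lang c++ -o Quote.hpp quote.schema.json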

Note that using quicktype with nlohmann::json is an elegant way to effectively get C++ reflection.  Once you serialize an object to JSON you can walk all its fields.  Then you can do things like automatically build SQL queries for your classes based on the JSON schema.  Beautiful.
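Here is a minimal sketch of that trick.  The Quote struct and its to_json hook stand in for what quicktype would generate; the table name and the buildInsert helper are hypothetical, and real code would bind parameters instead of splicing strings:

#include <sstream>
#include <string>
#include <nlohmann/json.hpp>

// Stand-in for a quicktype-generated class.
struct Quote {
    std::string symbol;
    double price;
    long long volume;
};

// nlohmann::json hook (quicktype generates the equivalent for you).
void to_json(nlohmann::json& j, const Quote& q) {
    j = nlohmann::json{{"symbol", q.symbol}, {"price", q.price}, {"volume", q.volume}};
}

// Walk the serialized fields to build an INSERT statement --
// the poor man's reflection described above.
std::string buildInsert(const std::string& table, const nlohmann::json& j) {
    std::ostringstream cols, vals;
    bool first = true;
    for (auto& [key, value] : j.items()) {
        if (!first) { cols << ", "; vals << ", "; }
        first = false;
        cols << key;
        vals << value.dump();  // JSON-style quoting; use bound parameters in real code
    }
    return "INSERT INTO " + table + " (" + cols.str() + ") VALUES (" + vals.str() + ");";
}

// buildInsert("Quotes", nlohmann::json(Quote{"AAPL", 150.25, 1000}))
//   -> INSERT INTO Quotes (price, symbol, volume) VALUES (150.25,"AAPL",1000);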

PS. I avoided spotify-json, StaticJson/autojsoncxx, Google Protocol Buffers, Code Synthesis’s ODB, the SQLite JSON1 extension, C++ reflection libraries like RTTR, lots of code from Stiffstream and Chilkat, etc. because while they are all brilliant and compelling, they bring extra weight.  The world keeps churning though, so keep searching.  Also, there are cases where my choices do not fit, most obviously cross-platform mobile apps, which will have to be saved for another post… 🙂

A friend of mine asked for help to do a complicated data pivot on a few thousand rows of data.  We discussed it and agreed that we probably needed to walk each row and do an algorithmic summation on the fly.

My first instinct was to fire up Windows and dust off the old VBA cobwebs in my mind.  Fortunately that didn’t last long, and I reached for a far more comfortable friend, SQL.  The final SQL wasn’t terribly simple, but it wasn’t terribly complicated either.  And it was a total Linux solution, always the more pleasant route.  And totally GUI, too, for all you folks who get hives when you have to bang at a terminal.  🙂

  • Open the Excel sheet in LibreOffice
  • Copy columns of interest into a new sheet
  • Save As -> Text CSV
  • Create a sqlitebrowser database
  • Import the CSV file as a new table

BAM, easy, powerful, free, cheap, fast, win, win, win!
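For anyone who doesn’t mind the terminal after all, the same import can be scripted with the sqlite3 shell.  A sketch, assuming the CSV was saved as TestData.csv (the database name is made up):

$ sqlite3 pivot.db
sqlite> .mode csv
sqlite> .import TestData.csv TestData

When the table doesn’t already exist, .import creates it from the CSV header row, with every column as TEXT — which is all this pivot needs.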

I ended up using this SQL; the key is SQLite’s group_concat(), which just jams a whole column of grouped values into one field, ha, boom.

SELECT a.email
 , group_concat(d.Dates) AS Dates
 , group_concat(d."Values") AS "Values"
FROM (SELECT * FROM TestData ORDER BY email, Dates) AS a
INNER JOIN TestData AS d ON d.email = a.email
GROUP BY a.email;

Then I copied the tabular result of the query and pasted it into the LibreOffice sheet, which asked me how to import it.  Delimit by tab and comma, then always recheck “select text delimiter” and use a comma.  And the cells fill right up.
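Aside: if you would rather skip the copy-paste, the sqlite3 shell can also write the query result straight to a CSV file that LibreOffice opens cleanly.  A sketch, with a made-up output name:

sqlite> .headers on
sqlite> .mode csv
sqlite> .once pivot.csv

Run the group_concat query after that and its output lands in pivot.csv instead of on the screen.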

All is full of light.