Thursday 14 July 2022

Merging two SQLite DBs

We've got an MVAS in Witchford (well, we've got two, but only the one team to manage them), and I'm part of that team.

I hold ladders and carry batteries; it's not all that hard a task, and it's quite good fun. One chap downloads the data and lets me cast my eye over it (it's stored in an SQLite database). He went on holiday a little while back, so I had to collect the data, which meant his copy of the DB and mine ended up out of sync. Thus I was left needing to merge two SQLite DBs; with the help of SQLiteStudio, it was pretty easy to resolve. Each DB has three tables: campaign, campaign_details and measure.

Thinking about it after going through this process, I realised I could have just copied the entirety of my DB over to him; it had the same data apart from the latest readings... but hindsight is a beautiful thing.

Anyway, there was an extra row in campaign - that was easy to copy using some SQL. There were an additional 13 rows in campaign_details - again, dead easy to copy over with a bit of SQL. But there were an extra 17K+ rows in measure...

I got in touch with my mate Oliwia to see if she could help, but she was busy, so I had to cobble together my own SQL. With both databases attached to the same connection - as original and new - this is what I came up with:

INSERT INTO 
  original.measure
SELECT 
  *
FROM 
  new.measure a 
WHERE 
  a.meas_id 
NOT IN (
  SELECT 
    meas_id
  FROM 
    original.measure b
);

Not pretty, but it worked! I blame the heat for not realising sooner.

EDIT

Well, this happened again, so I thought I'd better take some notes and use some proper SQL (note that `TARGET_TABLE` and `SOURCE_TABLE` below are placeholders for the names of the target and source databases, despite what the names suggest):

INSERT INTO
  `TARGET_TABLE`.campaign (
    product_prod_id,
    camp_date_begin,
    camp_date_end,
    date_created,
    camp_store_mode,
    camp_title,
    camp_comment
  )
SELECT
  product_prod_id,
  camp_date_begin,
  camp_date_end,
  date_created,
  camp_store_mode,
  camp_title,
  camp_comment
FROM
  `SOURCE_TABLE`.campaign a
WHERE
  a.camp_date_begin
    NOT IN (
    SELECT
      camp_date_begin
    FROM
      `TARGET_TABLE`.campaign b
    );

INSERT INTO
  `TARGET_TABLE`.campaign_details (
    product_prod_id,
    cpd_date,
    campaign_camp_id,
    cpd_status,
    cpd_attempt,
    last_updated,
    cpd_max_time
  )
SELECT
  product_prod_id,
  cpd_date,
  51,
  cpd_status,
  cpd_attempt,
  last_updated,
  cpd_max_time
FROM
  `SOURCE_TABLE`.campaign_details a
WHERE
  a.cpd_date
  NOT IN (
    SELECT
      cpd_date
    FROM
      `TARGET_TABLE`.campaign_details b
    );

INSERT INTO
  `TARGET_TABLE`.measure (
    meas_date,
    meas_speed,
    meas_direction
  )
SELECT
  meas_date,
  meas_speed,
  meas_direction
FROM
  `SOURCE_TABLE`.measure a
WHERE
  a.meas_date
  NOT IN (
    SELECT
      meas_date
    FROM
      `TARGET_TABLE`.measure b
    );

Please note that the 51 is hardcoded: it's the camp_id of the row that the first query inserted into the campaign table, so it will need changing to match whatever ID that row ends up with in the target database.

Tuesday 5 July 2022

Donut bugs

I noted before that my Donut Component has a bug in it when it comes to drawing segments with a value over 50% of the total - such a pain!

I was hoping to avoid using SVGs for this component, but I think that is what I'll have to do - not such a bad thing, TBH, but still annoying!

Comparing JSON (gzipped or not) in localStorage with IndexedDB (using PouchDB)

I wrote last week about GZipping JSON for localStorage and, over the weekend, I decided to do some checks on the efficacy of this approach. I noted that the larger the payload, the more significant the space saving; for smaller payloads the saving was often negated, with the gzipped output sometimes being significantly larger than the original JSON.
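
For reference, the localStorage + Pako side boils down to something like this sketch (not my original code: the key names and the base64 step are illustrative choices here; gzip output is binary, so it needs encoding into something localStorage can hold as a string):

import { gzip, ungzip } from "pako";

// Store a gzipped JSON payload under a localStorage key.
function saveGzipped(key, obj) {
  const bytes = gzip(JSON.stringify(obj)); // Uint8Array
  let binary = "";
  bytes.forEach((b) => { binary += String.fromCharCode(b); }); // avoid spreading huge arrays
  localStorage.setItem(key, btoa(binary)); // base64 so the bytes survive as a string
}

// Read it back, inflate it and parse it.
function loadGzipped(key) {
  const base64 = localStorage.getItem(key);
  if (base64 === null) return null;
  const bytes = Uint8Array.from(atob(base64), (c) => c.charCodeAt(0));
  return JSON.parse(ungzip(bytes, { to: "string" }));
}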

I didn't only compare the original JSON and the gzipped JSON, though; I also decided to take a wee look at IndexedDB, with the help of PouchDB.
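
The PouchDB side is roughly this (again a sketch, with made-up database and document names; in the browser PouchDB stores the document in IndexedDB under the bonnet):

import PouchDB from "pouchdb";

const db = new PouchDB("json-benchmark");

// Upsert the payload as a single document; a repeated run needs the
// existing _rev or PouchDB will reject the put with a 409 conflict.
async function savePouch(id, obj) {
  const doc = { _id: id, payload: obj };
  try {
    doc._rev = (await db.get(id))._rev;
  } catch (err) {
    if (err.status !== 404) throw err;
  }
  await db.put(doc);
}

async function getPouch(id) {
  return (await db.get(id)).payload;
}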

I worked with three JSON files:

  1. bodies.json (330 KB)
  2. employees.json (2284 KB)
  3. family.json (2 KB)

As you'll be able to see from the code, I do a fair bit of timing using both console.time() and performance.now().

The results are output to the developer console (I did try using a Donut Chart - but I discovered a bug in my Component - I'll fix it this week, hopefully!).
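
The timing itself is just performance.now() wrapped around each call, along these lines (a rough sketch using the helper functions above, assuming an ES module so top-level await works; the labels match the output below):

// Time one operation and log it in the same format as the results below.
async function timeIt(label, fn) {
  const start = performance.now();
  await fn();
  console.log(`${label}: ${(performance.now() - start).toFixed(2)}ms`);
}

const bodies = await (await fetch("bodies.json")).json();

await timeIt("Time taken to save bodies using localStorage", () =>
  localStorage.setItem("bodies", JSON.stringify(bodies))
);
await timeIt("Time taken to save bodies with Pako", () => saveGzipped("bodies-gz", bodies));
await timeIt("Time taken to save bodies with PouchDB", () => savePouch("bodies", bodies));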

The following are my results for saving on Firefox:

    1. Time taken to save bodies using localStorage: 0ms
    2. Time taken to save bodies with Pako: 20ms
    3. Time taken to save bodies with PouchDB: 34ms
    1. Time taken to save employees using localStorage: 0.002ms
    2. Time taken to save employees with Pako: 69ms
    3. Time taken to save employees with PouchDB: 149ms
    1. Time taken to save family using localStorage: 0ms
    2. Time taken to save family with Pako: 2ms
    3. Time taken to save family with PouchDB: 157ms

And these are my results for retrieving the data on Firefox:

    1. Time taken to get bodies using localStorage: 2ms
    2. Time taken to get bodies with Pako: 10ms
    3. Time taken to get bodies with PouchDB: 45ms
    1. Time taken to get employees using localStorage: 12ms
    2. Time taken to get employees with Pako: 25ms
    3. Time taken to get employees with PouchDB: 22ms
    1. Time taken to get family using localStorage: 1ms
    2. Time taken to get family with Pako: 1ms
    3. Time taken to get family with PouchDB: 43ms

The following are my results for saving on Chrome:

    1. Time taken to save bodies using localStorage: 0ms
    2. Time taken to save bodies with Pako: 6.5ms
    3. Time taken to save bodies with PouchDB: 185.1ms
    1. Time taken to save employees using localStorage: 0ms
    2. Time taken to save employees with Pako: 54ms
    3. Time taken to save employees with PouchDB: 143.89ms
    1. Time taken to save family using localStorage: 0ms
    2. Time taken to save family with Pako: 0.4ms
    3. Time taken to save family with PouchDB: 178.5ms

And these are my results for retrieving the data on Chrome:

    1. Time taken to get bodies using localStorage: 0.9ms
    2. Time taken to get bodies with Pako: 4.29ms
    3. Time taken to get bodies with PouchDB: 39.6ms
    1. Time taken to get employees using localStorage: 5.19ms
    2. Time taken to get employees with Pako: 27.9ms
    3. Time taken to get employees with PouchDB: 19.2ms
    1. Time taken to get family using localStorage: 0ms
    2. Time taken to get family with Pako: 0.19ms
    3. Time taken to get family with PouchDB: 34.69ms

The results (in chart form) are here:

I think we can probably all tell that localStorage wins, and wins handsomely. The space limitations of localStorage mean that IndexedDB still has a place, though; perhaps not for smaller data sets.

I'm happy that I've taken the time to find this out; it certainly means that I have evidence for future decisions - I'd be interested in analysing it further, though, especially as there are discrepancies between the different browsers. Saving to IndexedDB on Firefox seems much faster than on Chrome, though the situation is reversed when retrieving data from IndexedDB.