KeyError: 'accuracy' when processing Location History #7

Open
davidwilemski opened this issue Jun 27, 2021 · 0 comments

I'm new to both the Dogsheep tools and Datasette, but I've been experimenting a bit over the last few days and these are really cool tools!

I encountered a problem running my Google Location History through this tool (the latest release, running in a Docker container):

Traceback (most recent call last):
  File "/usr/local/bin/google-takeout-to-sqlite", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.9/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/google_takeout_to_sqlite/cli.py", line 49, in my_activity
    utils.save_location_history(db, zf)
  File "/usr/local/lib/python3.9/site-packages/google_takeout_to_sqlite/utils.py", line 27, in save_location_history
    db["location_history"].upsert_all(
  File "/usr/local/lib/python3.9/site-packages/sqlite_utils/db.py", line 1105, in upsert_all
    return self.insert_all(
  File "/usr/local/lib/python3.9/site-packages/sqlite_utils/db.py", line 990, in insert_all
    chunk = list(chunk)
  File "/usr/local/lib/python3.9/site-packages/google_takeout_to_sqlite/utils.py", line 33, in <genexpr>
    "accuracy": row["accuracy"],
KeyError: 'accuracy'

It looks like the tool assumes the accuracy key will be in every location history entry.

My first attempt at a local patch, just to get myself going, was to access the accuracy key with .get instead, hoping to make the column nullable, though I wasn't sure how sqlite_utils would handle that. It did work in the sense that the import completed, so I was going to propose that change as a patch. However, when I updated the existing test to include an entry with a missing accuracy value, I noticed the expected type of the field changed to a string in the test (and, from a quick scan through the sqlite_utils code, probably to TEXT in the database). Given that change in column type, opening an issue before proposing a fix seemed warranted. It looks like the schema would need to be specified explicitly if you wanted a nullable integer column.
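For concreteness, this is roughly what my local patch looks like. It's a minimal sketch of save_location_history rather than the actual code in utils.py; the JSON path inside the zip, the other column names, and the primary key are assumptions based on the Takeout export format and the traceback above.

    import json

    def save_location_history(db, zf):
        # Sketch only: field names and pk are assumptions, not the tool's real schema.
        data = json.load(zf.open("Takeout/Location History/Location History.json"))
        db["location_history"].upsert_all(
            (
                {
                    "latitude": row["latitudeE7"] / 1e7,
                    "longitude": row["longitudeE7"] / 1e7,
                    # .get() returns None when "accuracy" is missing,
                    # instead of raising KeyError like row["accuracy"] does.
                    "accuracy": row.get("accuracy"),
                    "timestampMs": row["timestampMs"],
                }
                for row in data["locations"]
            ),
            pk="timestampMs",
        )

And if a nullable integer column is the goal, I believe the table could be created explicitly before the upsert so sqlite_utils doesn't infer the accuracy type from the first batch of rows. Again just a sketch, and the column list is my guess:

    if "location_history" not in db.table_names():
        db["location_history"].create(
            {
                "latitude": float,
                "longitude": float,
                "accuracy": int,  # nullable INTEGER; missing values stored as NULL
                "timestampMs": str,
            },
            pk="timestampMs",
        )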

Now that I've done a successful import run using my initial fix of calling .get on the row dict, I can see with Datasette that I only have 7 data points (out of ~250k) with a null accuracy column. They are all from 2011-2012, in an import spanning roughly 2010-2016, so perhaps another approach would be to filter those entries out during import if missing accuracy really is that infrequent?
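If filtering is the preferred direction, the change would be equally small. A hedged sketch, reusing the same hypothetical generator shape as above rather than the tool's actual code:

    # Alternative sketch: skip the handful of entries that have no "accuracy" value.
    rows = (
        {
            "latitude": row["latitudeE7"] / 1e7,
            "longitude": row["longitudeE7"] / 1e7,
            "accuracy": row["accuracy"],
            "timestampMs": row["timestampMs"],
        }
        for row in data["locations"]
        if "accuracy" in row
    )
    db["location_history"].upsert_all(rows, pk="timestampMs")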

I'm happy to provide a PR for a fix but figured I'd ask about which direction is preferred first.
