Getting the number of characters from all documents in a Google Drive folder

I should be writing my PhD thesis, but instead I wrote a fun piece of code. I wanted to track the number of characters in my thesis documents, which live in a folder on my Google Drive. Google Drive has a useful API, but its documentation seemed outdated and hard to navigate, and I did not have the time to scan through all of it, so I still had quite a rough time setting up what I wanted to do. Your starting point is this page. Another great resource is the Core Python Programming blog.

I even had to use both the ‘v2’ and ‘v3’ API, based on code snippets I found. There is definitely a better way to do parts of this, so your feedback is welcome. The full code is at the bottom of the post. The code uses a client_secret.json file, which you can obtain after activating the API and setting the proper permissions in your Google account. The first time the code is executed, a browser window opens for authorizing the application. This generates credentials that are stored in credentials.json for future use. Note that every time you change something (e.g. the permission scope) you should delete this credentials.json file and repeat the authorisation. There will be dragons.
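For illustration, here is a minimal sketch of that authorisation step. It is not the code from this post: it assumes the google-auth and google-auth-oauthlib packages, a read-only Drive scope, and a function name (get_credentials) that I made up. The credentials.json it writes may therefore be in a different format from the one the original script produced.

```python
import os

# Assumption: read-only access is enough for listing and exporting documents.
SCOPES = ["https://www.googleapis.com/auth/drive.readonly"]


def get_credentials(secret_path="client_secret.json", token_path="credentials.json"):
    """Return cached credentials, running the browser flow on first use."""
    # Imported here so this module loads even without the Google libraries.
    from google.auth.transport.requests import Request
    from google.oauth2.credentials import Credentials
    from google_auth_oauthlib.flow import InstalledAppFlow

    creds = None
    if os.path.exists(token_path):
        # Reuse the token saved by an earlier run.
        creds = Credentials.from_authorized_user_file(token_path, SCOPES)
    if creds and creds.expired and creds.refresh_token:
        # Silently refresh an expired token.
        creds.refresh(Request())
    if not creds or not creds.valid:
        # First run (or unusable token): open a browser window to authorise.
        flow = InstalledAppFlow.from_client_secrets_file(secret_path, SCOPES)
        creds = flow.run_local_server(port=0)
    # Cache the token for the next run.
    with open(token_path, "w") as fh:
        fh.write(creds.to_json())
    return creds
```

If you change SCOPES, delete credentials.json first so that the browser flow runs again with the new permissions.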

A seemingly trivial but actually rather tricky part was listing the files in a Google Drive folder. Perhaps I used some weak search terms, but it took most of my Saturday afternoon. Finally, the script downloads the documents in my “thesis” folder as plain text and sums all characters except newlines and underscores (which are generated from titles).
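The listing and counting steps could look roughly like the sketch below. It uses only the ‘v3’ API, so it is not necessarily what my script does. The function names and the folder_id parameter are hypothetical, and service is assumed to be a Drive client created with googleapiclient.discovery.build("drive", "v3", credentials=...). I also assume that only native Google Docs should be counted, and that carriage returns count as part of a newline.

```python
def count_chars(text):
    """Count characters, ignoring newlines (\\r, \\n) and underscores."""
    return sum(1 for ch in text if ch not in "\r\n_")


def list_folder_files(service, folder_id):
    """Yield (id, name) for every Google Doc directly inside folder_id."""
    query = (
        f"'{folder_id}' in parents and trashed = false "
        "and mimeType = 'application/vnd.google-apps.document'"
    )
    page_token = None
    while True:
        response = service.files().list(
            q=query,
            fields="nextPageToken, files(id, name)",
            pageToken=page_token,
        ).execute()
        for item in response.get("files", []):
            yield item["id"], item["name"]
        # Results are paginated; stop when there is no further page.
        page_token = response.get("nextPageToken")
        if page_token is None:
            break


def folder_char_count(service, folder_id):
    """Export each document as plain text and sum the character counts."""
    total = 0
    for file_id, _name in list_folder_files(service, folder_id):
        data = service.files().export(
            fileId=file_id, mimeType="text/plain"
        ).execute()
        # utf-8-sig drops a leading byte-order mark if the export has one.
        total += count_chars(data.decode("utf-8-sig"))
    return total
```

Printing folder_char_count(service, folder_id) gives the single number per run that the cron job needs. The folder ID is the last part of the folder's URL in the Drive web interface.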

I’ve put this script in my crontab to append the number of characters to a file every hour:

@hourly python thesis_charcount.py >> char_count_thesis.txt

This should give me some data for plotting, but that code is for another moment of glorious procrastination.
