Quantified self, data science, python, paranoia

It was a little past four in the afternoon, late spring, and the corridor of building ██ was almost empty. I was walking to room ███.

Near the entrance I noticed her. She was standing strangely, not outright strange, just differently from how I was used to seeing her stand. As someone with -5 diopter myopia, I always pay attention to gait and posture, since it is easier to recognize a person from afar by those than by the face, and on seeing her I tensed up at once: something was odd, something was off.

I came closer. She was standing rather unsteadily and looking at me with dilated pupils. A faint smell of cheap, bad red wine hung in the air.

“Excuse me, █████ █████а, I wanted to ask whether I could write the admission test for the ██████ ██████ exam.”
“Who … are you, anyway?”
“I’m your student, group ██-██.”
“How many points do you have right now?”
For God’s sake, don’t mention that you have 7 points out of the 30 needed for admission…
“I thought you remembered me?”
“Points. H… how mny? A… actully, doesn’t maatter, come in.”

Inside the room the smell was even stronger than outside. On the table stood empty plastic cups on napkins.
“…if I open the window, it’ll get hottt, if I leave it closed, the smll will stay. What do I … do?”
“What will I be writing?”
“Here’s …. your variant.”
“Thanks.”

She left.

I went up to her about 20 minutes later.

“I’m done, here …”
“…y…yes, I’ll check it now. A… actully, doesn’t matterrr, gv me yr rcrd bk.”
“What?”
“Gve me yr rcrd bok”
“Thanks!” [1]

The above may or may not describe the last time SWIM from another university had anything to do with data analysis, statistics, and empirical methods of software engineering.

This post briefly describes my current understanding of the basics of Quantified self / self tracking, how exactly I [am planning to] do this, how I am planning to analyze it, and what I am hoping to get from it.

Quantified self

EDIT: I do things a bit differently now, and I’ve split the big spreadsheet file into multiple smaller ones; also I’ve ditched BWT in favour of classic gym workouts. But mostly this is still relevant.

For me, it all started with this excellent post by the creator of Wolfram Alpha, about how he tracked a lot of things for decades. It’s a very interesting read.

Quantified self (QS) is a movement / subculture / set of methods and approaches dealing with tracking various things about oneself, and then, hopefully, analyzing them and getting interesting insights. One can think of QS as a more targeted variant of a typical scientific study. I can read that “Scientific study confirms 5-HTP taken at bedtime improves sleep quality” or see “man, I swear, since I started doing Image streaming my memory, attention and weight have improved, girls started smiling at me, and I started speaking French!” on Reddit, or I can actually do an experiment and see if it improves my sleep (or French).

Some things don’t generalize, and if I find something that works for me (even though it doesn’t work for anyone else, isn’t scientifically sound, or doesn’t even make sense), it is still a very interesting discovery, if I can use it to make my life better. This guy discovered that standing on one leg until exhaustion improves his sleep, for example.

I have been doing this for a couple of weeks, and one of the more interesting discoveries is that my subjective estimates of my concentration, memory etc. have very little to do with how well I can perform tasks that require them. Just now, I logged the following:

[Screenshot: 2017-10-21-174835_574x283_scrot]

This is much worse than average, almost all of it. Then I took a typing test, and got the absolute best typing score of my entire life:

[Screenshot: 2017-10-21-174345_700x571_scrot]

even though I was completely out of shape the last hour, or at least felt that way. The (lack of) correlation is even more strongly noticeable on tests which target memory / language / attention directly, such as Cambridge Brain Sciences (which I really recommend; short tests taking 10 minutes a day to get a feel of how your brain is doing, in three key areas).

Gathering data

I am currently doing it via a couple of methods.

The Spreadsheet

[Screenshot: 2017-10-21-192211_912x692_scrot]
This is the monster in its entirety

It has two goals: productivity and self-tracking.

Basics

The top part is basic information about the day, such as number of pages read, time spent in bed, number of steps etc etc. I think I’ll be slowly removing it and using something else instead.

The first column uses the old method of dividing the day into 100 10-minute intervals and writing down what you are doing in each of them. Firstly, it lets me see what I spend my time on, and secondly, it works really well as motivation. If you write that you are “procrastinating” six intervals in a row, you really start feeling bad. You don’t have the “Oh, it’s 10pm already, where did my day go?” moment at the end of the day; you keep track of your activities every-damn-10-to-30-minutes. It’s wonderful. (Sometimes, I use this timer video for this)
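As a rough illustration of what that grid looks like, here is a minimal Python sketch that generates the 100 slots. The 07:00 start time and the function name `day_intervals` are my assumptions for the example; the spreadsheet itself is filled in by hand.

```python
from datetime import datetime, timedelta

def day_intervals(start, count=100, minutes=10):
    """Return `count` consecutive (start, end) slots of `minutes` each."""
    step = timedelta(minutes=minutes)
    return [(start + i * step, start + (i + 1) * step) for i in range(count)]

# Assumed wake-up time; the first interval could just as well start elsewhere.
slots = day_intervals(datetime(2017, 10, 21, 7, 0))
print(len(slots), slots[0][0].time(), slots[-1][1].time())
```

100 intervals of 10 minutes cover 1000 minutes, so roughly the waking part of a day.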

Besides the column where I write keywords about my activities, separated by semicolons (not “I was talking with X in his room”, but “social; talking; real-life”, to make it easier to analyze later), in the first part I also log:

  • Location
  • People with whom I did this
  • Food eaten
  • Its nutritional values, if I feel like it
  • Medicines / supplements consumed.
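The semicolon-separated keywords are what should make the later analysis easy. A minimal sketch of how such cells could be turned into minutes per keyword, in plain Python to keep it dependency-free; the sample cells and the function names are made up for illustration.

```python
from collections import Counter

INTERVAL_MINUTES = 10  # one spreadsheet row per 10-minute interval

def parse_tags(cell):
    """Split a 'social; talking; real-life' cell into clean keyword tags."""
    return [tag.strip().lower() for tag in cell.split(";") if tag.strip()]

def minutes_per_tag(cells):
    """Sum the minutes spent on each tag, one cell per interval."""
    totals = Counter()
    for cell in cells:
        for tag in set(parse_tags(cell)):  # count a tag once per interval
            totals[tag] += INTERVAL_MINUTES
    return totals

# Made-up activity cells, purely for illustration.
cells = ["social; talking; real-life", "procrastinating", "procrastinating; reddit", ""]
totals = minutes_per_tag(cells)
print(totals.most_common(3))
```

Empty cells (intervals I forgot to fill in) are simply skipped here; whether that is the right choice depends on the analysis.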

Melancholy

The next part, in blue, covers some typical depressive symptoms, mostly inspired by this “Depression progress tracker”, with a couple of additions I felt were interesting or missing. Another source was the Immediate Mood Scale; linked is the appendix with the complete list of items, but the entire paper is interesting. Also, here are some interesting theoretical considerations about measuring mood. It’s true: the deeper you get into something, the more complicated it is.

  • Good spirits
  • Good about myself
  • Calm and relaxed
  • Active and vigorous
  • Appetite
  • Interested in activities
  • Focus & Concentration
  • Aggressive
  • Impulsive
  • Guilty
  • Withdrawn
  • Manic
  • Fidgety

There I also track my top physical symptoms:

  • Headache
  • Fatigue
  • Stomach problems
  • Brain fog

All this on scales from 0 to 100. When I feel like it, or remember about it, or have nothing to do and feel like procrastinating.

Money

How much, on what, and in which category. Additionally, I recently added “how”, to differentiate shopping in real life from Amazon and similar online shops. An inspiration for this was this blog post about the tracking, especially the idea of using the United Nations Classification of Individual Consumption According to Purpose for categorizing the expenses. It’s very well thought-out and definitely better than anything I could have made myself. It’s a nice reminder of not having to reinvent the wheel for everything I do, hah.

Details

Anything I want to add about the 10-min interval. Just like a mini twitter-like diary.

Dual-N-Back

In this area, I track the DNB level and what percentage I got right.

I wrote just a bit about it here, but, of course, Gwern’s FAQ is the ultimate resource on the topic. Dual-N-Back is pretty much the only brain-training exercise which has been shown to transfer to other areas — that is, getting better at DNB actually means that the working memory in real life is getting better too.

…General Dreedle wants his [pilots] to spend as much time on the skeet-shooting range as the facilities and their flight schedule would allow. Shooting skeet eight hours a month was excellent training for them. It trained them to shoot skeet.
(From Catch-22, as quoted in the FAQ)

For me, it’s an exercise that hits right where it hurts: memory (short-term and working; I know the two are different, but both are bad for me) and concentration (where else do I get a 5-minute span where I can’t get distracted? Oh, meditation, about this later). Speaking of this, I knew that my memory was bad, but then comes Cambridge Brain Sciences and shows me daily workouts looking like this:

CBS on an unusually good day; about their objectivity I have my doubts.

The “reasoning” and “verbal” subdomains float quite a lot, but the memory one stays low, always. It’s nice to know what you fail at.

I hope DNB will help me. But I fully agree with Gwern on this: DNB is really interesting as an exercise by itself, even if it’s hard. Anything which puts you at your limits and lets you stay there, to explore them and get a feel for them, is interesting.

I have been doing DNB for a couple of months, got from Dual-2-Back to 70% right on Dual-3-Back, but have no way of measuring if it affected my memory during that time. I hope I’ll be able to do it more regularly now.

For DNB, my favourite program for the desktop is this web app, and on Android there are a couple of apps, but I really like Brain-N-Back, for its immense configurability and lack of disrupting adverts.

Schulte table

I got into this back when I was learning speedreading. It’s used for development of speed reading, peripheral vision, attention and visual perception [2], but for me it’s another way to stay concentrated. And to use my time, say, in the Metro, productively. I track grid size and time in seconds. This is the app I use and it’s awesome, also very configurable and without ads.

Cambridge Brain Sciences Tests

I log my C-Score and the scores in the three main domains, and the scores on the individual tests.

[Screenshot: 2017-10-21-201829_477x242_scrot]

This is because I don’t really trust the C-Score as an objective method of assessment. I know that just a couple of paragraphs above I have written that it’s nice not to reinvent the wheel, but there are a couple of things which I find illogical or just don’t understand. The main one would be that the tests change day-to-day, and I’m much better at some tests than at others. I don’t like that this (pretty predictable) variation is shown as, say, my “reasoning” score for the day. Just how much it fluctuates can be seen here:

It’s clear on which days the tests are the ones I’m good at.

We’re talking about 6-7 points out of 23 max, which is a lot. On the “good” days, there must be one (or two) “good” tests, I’ll look later which ones.
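Finding the “good” tests could be as simple as grouping the daily domain score by the test that was served that day. A minimal sketch, assuming I log one row per day as (date, test, domain score); the test names and numbers below are placeholders, not my real results.

```python
from collections import defaultdict
from statistics import mean

def mean_score_by_test(results):
    """Average the daily domain score per test from (date, test, score) rows."""
    by_test = defaultdict(list)
    for _date, test, score in results:
        by_test[test].append(score)
    return {test: mean(scores) for test, scores in by_test.items()}

# Placeholder rows: which reasoning test came up and the reasoning score that day.
results = [
    ("2017-10-15", "test_a", 15),
    ("2017-10-16", "test_b", 9),
    ("2017-10-17", "test_a", 16),
    ("2017-10-18", "test_c", 10),
    ("2017-10-19", "test_b", 8),
]
averages = mean_score_by_test(results)
ranking = sorted(averages, key=averages.get, reverse=True)
print(ranking)
```

If one test sits consistently at the top of the ranking, the daily fluctuation says more about the test schedule than about my brain.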

Regardless, I think the scores on the individual tests are indicative of many things. It would be nice if CBS would offer their data in .csv format for easier analysis, but for now I’m stuck with entering the results manually.

Typing speed

I do the tests on my absolute favourite, aoeu.eu, the one that taught me Dvorak, and sometimes on typeracer.com. I take it as a measure of how in shape I am overall, but just how meaningful this is is one of the things I’d like to discover using QS. It’s easy to do and takes one minute, so no harm done either way.

Weight

Why not?

Body weight training

Partly motivational accountability, partly because it’s actually nice and interesting to track.

The relevant subreddit, and the cheatsheet I use for it all. I log at which exercise I am in the progression and how many repetitions (4-6) I did.

Other ways to gather data

Not everything is practical to track in this spreadsheet. I use a couple of other programs, especially focusing on ones that allow exporting to .csv.

Nomie. Nomie is awesome. Like, really awesome. It has a number of trackers, which you can activate just by clicking on them (if needed — also set a value). Also the trackers can be set as positive/negative, and it shows the tally for the day. Examples of such trackers: Healthy snack, unhealthy snack, drinking alcohol, Manic, Anxious, peed, etc etc. And, of course, you can add your own. And it allows exporting to csv. It’s really great.

Selfspy — excellent Linux daemon which records everything you do, keystrokes, window names, etc etc etc.

Simple data logger with graph. For when I’m on the go, and when it’s something numerically measured, like weight, where it’s okay if the same value also ends up in the spreadsheet. (Measuring weight one time more is not as bad as marking “showered” both on Nomie and on the spreadsheet, for example.) Also allows exporting to csv.

Google Fit (previously — Pedometer) — number of steps, geographic location, basic movements etc. Pretty neat.

MyAddictometer — no export; it gives details about your phone usage, which can be enlightening.

Google Spreadsheets for Android: when possible, I try to upload the spreadsheets to Google Drive, to be able to edit them on my phone. This Google app works really well for reading and editing LibreOffice Calc’s files, with all conditional formatting etc.

LastFM to CSV

In the future, I’d love to add a dump of my Telegram messages. (Here is a nice writeup on how to do that.) It’s data that already exists, and I could use it to infer basic stuff (sleeping patterns as one of them) without too much trouble. Probably the next post will be dedicated to it? Seeing when I write the biggest number of messages, to whom, etc. would be a very interesting exercise in visualization and data analysis.
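To give an idea of the kind of inference I mean, here is a minimal sketch that counts messages per hour of the day; hours with no messages at all are a crude hint at when I sleep. The list of ISO timestamps is a made-up stand-in, since I don’t have a Telegram dump yet and its real format may differ.

```python
from collections import Counter
from datetime import datetime

def messages_per_hour(timestamps):
    """Count messages for each hour of the day (0-23) from ISO timestamps."""
    counts = Counter(datetime.fromisoformat(ts).hour for ts in timestamps)
    return [counts.get(hour, 0) for hour in range(24)]

# Hypothetical timestamps of sent messages.
sent = [
    "2017-10-20T23:55:00",
    "2017-10-21T00:10:00",
    "2017-10-21T09:05:00",
    "2017-10-21T09:40:00",
]
per_hour = messages_per_hour(sent)
quiet_hours = [hour for hour, n in enumerate(per_hour) if n == 0]
print(per_hour, quiet_hours)
```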

 

The last hour, this happened.

Putting it all together

This is a big work in progress.

I’m planning to use it as a playground to refresh my Python and get a bit into data analysis and statistics; I’ll need both in a couple of months.

Currently I have a shell script that converts all ODS files to csv. Then I have a Python script (using the Pandas library, which is pretty neat) that separates the two parts of the spreadsheet: the “head” one and the tracking one. In the tracking one, it converts the times to datetime timestamps, and writes the result to a separate file, %%timestamp%%-data.csv.

[Screenshot: 2017-10-21-223014_407x333_scrot]
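A stripped-down sketch of that splitting step, written with the standard `csv` module instead of Pandas so that it runs anywhere. It is not my actual script: the number of head rows, the column names and the HH:MM time format are assumptions for the example.

```python
import csv
import io
from datetime import datetime

HEAD_ROWS = 2  # assumed number of daily-summary rows above the interval table

def split_sheet(csv_text, day):
    """Split one day's CSV into (head_rows, tracking_rows with ISO timestamps)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    head, tracking = rows[:HEAD_ROWS], rows[HEAD_ROWS:]
    header, body = tracking[0], tracking[1:]
    stamped = []
    for row in body:
        # Combine the sheet's date with the interval's HH:MM start time.
        ts = datetime.strptime(f"{day} {row[0]}", "%Y-%m-%d %H:%M")
        stamped.append([ts.isoformat()] + row[1:])
    return head, [header] + stamped

# Made-up miniature sheet: two head rows, then the interval table.
sample = """pages_read,steps
12,8000
time,activity
07:00,breakfast; food
07:10,reading
"""
head, tracking = split_sheet(sample, "2017-10-21")
print(tracking)
```

Writing `tracking` out with `csv.writer` to a per-day file would then complete the step.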

The next step (about an hour of work, given my three hours’ experience with Pandas) would be to append the data csv to a master data CSV file, if it hasn’t already been appended to it, and to make a master head-file, where rows would be days and columns would be the things I track daily, not in 10-minute intervals.
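The “append only if it hasn’t been appended yet” part could look roughly like this. It is a sketch under the assumption that the first column of every row is a unique timestamp, which is how I would recognize rows that are already in the master file; the function name is hypothetical.

```python
def append_new_rows(master_rows, day_rows):
    """Append rows whose timestamp (first column) is not yet in the master list."""
    seen = {row[0] for row in master_rows}
    added = 0
    for row in day_rows:
        if row[0] not in seen:
            master_rows.append(row)
            seen.add(row[0])
            added += 1
    return added

# Made-up rows: the master already contains the 07:00 interval.
master = [["2017-10-21T07:00:00", "breakfast"]]
day = [["2017-10-21T07:00:00", "breakfast"], ["2017-10-21T07:10:00", "reading"]]
first = append_new_rows(master, day)   # adds only the 07:10 row
second = append_new_rows(master, day)  # running it again adds nothing
print(first, second, len(master))
```

Keying on the timestamp makes the step safe to run repeatedly on the same day file.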

At the end, I should have a long CSV file with rows as intervals, and columns as data. And a couple of CSVs exported from various services.

Analyzing it all and actually making it useful

I really have no idea. At all.

TODO: remember my statistics course, remember my mad Python skillz, and start learning data science? (What better excuse could there be?)

Graphing it all would be pretty trivial, and maybe visually I could start seeing relationships or correlations. For this, there are a couple of web services: Fluxtream seems promising. It’s not the only one that can connect to various sources, get your csv input, and start doing magic. It’s a topic I really do need to research better.

I could start doing experiments, in their classic sense, but for that I’d like to establish a baseline, which is what I’ll do this month. Then start adding/removing supplements (topic for another post), BWT, changing diet, etc.

Ideas for possible correlations:

  • Sleep and pretty much everything, esp. C-scores and mood
  • Subjective feelings and C-Scores
  • C-scores and typing speed
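For pairs like these, the first thing I would probably compute is a plain Pearson correlation coefficient. A minimal sketch, hand-rolled to stay dependency-free; the numbers are invented, and with a couple of weeks of data any result would be a hint at best.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)

# Made-up numbers: hours slept and next-day typing speed (words per minute).
sleep_hours = [6.0, 7.5, 8.0, 5.5, 7.0]
typing_wpm = [58, 66, 70, 55, 63]
r = pearson(sleep_hours, typing_wpm)
print(round(r, 3))
```

A value near +1 or -1 suggests a strong linear relationship, a value near 0 suggests none; it says nothing about which way the causality goes.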

Ideas for possible experiments:

  • Caffeine & L-theanine and C-scores and typing speed
  • Caffeine (~tea) after 6 pm and sleep
  • Alcohol and sleep
  • Running|BWT and sleep
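The simplest way to read such an experiment would be to compare the average outcome on days with and without the intervention. A sketch, assuming one record per day with a yes/no flag and a measured outcome; the field names and values are invented, and a real analysis would also need some test of significance.

```python
from statistics import mean

def mean_difference(records, flag, outcome):
    """Mean of `outcome` on days where `flag` is true minus days where it is false."""
    with_flag = [r[outcome] for r in records if r[flag]]
    without = [r[outcome] for r in records if not r[flag]]
    if not with_flag or not without:
        raise ValueError("need at least one day in each condition")
    return mean(with_flag) - mean(without)

# Invented daily records for the "caffeine after 6 pm and sleep" idea.
days = [
    {"late_caffeine": True, "sleep_hours": 6.0},
    {"late_caffeine": False, "sleep_hours": 7.5},
    {"late_caffeine": True, "sleep_hours": 6.5},
    {"late_caffeine": False, "sleep_hours": 8.0},
]
diff = mean_difference(days, "late_caffeine", "sleep_hours")
print(diff)
```

A negative difference here would mean less sleep on the late-caffeine days.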

For now, let’s focus on the tracking, and do it consistently (as strange as it sounds, I have actually been able to fill in the spreadsheet every 10-40 minutes for the last 1.5 weeks; the apps will do their job by themselves, mostly).

I should also really play with visualization (find my old Conky configs and fill them with TODOs and reminders?), since it works really well for motivation. Especially the once-a-day tracking. It’s part of a well-known effect, but I was surprised to learn there’s a subreddit dedicated to it. And something like this on my background would really motivate me:

[Image: simpsons-radioactive]

And, lastly, just a dashboard on a terminal that I can switch to would also work wonders. … if I need a next step, research the motivation parts of it all, what visualizations work and why, what motivates me (is meta-quantifiedSelf a thing?), but we all know this won’t be happening, probably.

But again, all of this is far away.

If someone wants the current iteration of the spreadsheet — feel free to contact me, you’ll find how.

And, as usual, this post reflects my current view and understanding of the topic, everything subject to change (as it usually does). But that’s the beauty of this blog. Just opening the archives and seeing what I was up to two years ago is priceless. As Stephen Wolfram says on his blog:

And as I think about it all, I suppose my greatest regret is that I did not start collecting more data earlier.

I can say the same about this blog. I’m doing it not for the twelve and a half individual visitors a month that I’m getting; I’m doing it as writing practice (immensely therapeutic!) and as a way to record my very fleeting states of mind and interests for myself. And this has been going on for more than five years. Having the possibility of looking back is priceless and very gratifying. Given my awesome memory especially.

See you later with a full analysis of my Telegram messages!

(Y),
SH

 

Footnotes

1. This is the age of postmodernism; references to other authors, or to oneself, are a sacred thing: https://www.pchr8.net/blog/2016/09/03/%d0%b4%d0%bd%d0%b5%d0%b2%d0%bd%d0%b8%d0%ba-%d1%81%d1%8d%d1%80%d0%b0-%d0%b4%d0%b6%d0%be%d0%bd%d0%b0-%d1%84%d1%83%d1%80%d1%82%d0%b2%d0%b5%d1%80%d0%ba%d0%b0-%d1%8d%d1%81%d0%ba%d0%b2-the-diaries-of-s/
2. Wikipedia
