21 Feb 2020

Learning topics that are overwhelming

Problems

I want to learn a new topic but it feels overwhelming. How can I do it?

When I wanted to learn about artificial general intelligence (AGI), I knew almost nothing about the field. I had a background in software engineering and I was curious about intelligence as a whole, but I had no clue what I would need to know in order to work on artificial general intelligence.

My initial approach was to read a lot about the topic. This taught me a variety of concepts and the vocabulary associated with AGI. However, as I read, I felt overwhelmed by the amount of information I would have to learn. In 2014 I was reading Artificial Intelligence: A Modern Approach, then in 2017 the Deep Learning book. It was hard to reconcile the new concepts I was learning and to see how they were all connected.

At the time I already knew of mind maps. I thought it would make sense to map out all the knowledge I had acquired so that I could make sense of it and refer to it as I worked.

I initially built a very large concept map using yEd that included everything I could think of that touched on the topic of AGI: machine learning, mathematics, computer science, neuroscience, etc. As the map got larger and larger, I started thinking about how to organize it so that it was easier to understand and so that someone else who wanted to learn from my experience could progress through it. Thus, I extracted the machine learning, computer science and mathematics concepts into their own respective concept maps.

The approach has been to spend time writing down what I know, create a node in the graph for each term, then try to associate it with other terms already in the graph. As I learn new terms, I add them to the graph. I also dedicate a bit of time during the week to brain dump whatever I have learned or already know that could be added to the graph. This is an iterative process, so the graph itself is never really complete. However, given this graph, it is now possible to see what I have already explored and which areas need more exploration because my understanding is lacking.
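The node-and-association routine described above can be sketched in a few lines of Python. This is only an illustration, not how yEd stores its maps: the `ConceptMap` class, its method names and the sample terms are all made up for the example.

```python
from collections import defaultdict


class ConceptMap:
    """Hypothetical in-memory concept map: an undirected graph of terms."""

    def __init__(self):
        # term -> set of related terms
        self.neighbors = defaultdict(set)

    def add_term(self, term, related_to=()):
        """Add a term and link it to terms that are already in the map."""
        self.neighbors[term]  # touching the key creates the node
        for other in related_to:
            if other in self.neighbors and other != term:
                self.neighbors[term].add(other)
                self.neighbors[other].add(term)

    def isolated_terms(self):
        """Terms without any link: candidates for more exploration."""
        return sorted(t for t, linked in self.neighbors.items() if not linked)


cmap = ConceptMap()
cmap.add_term("machine learning")
cmap.add_term("gradient descent", related_to=["machine learning"])
cmap.add_term("neuroscience")
print(cmap.isolated_terms())  # ['neuroscience']
```

Terms that stay isolated after a few brain dumps are a cheap signal of where understanding is still thin.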

20 Feb 2020

Omnifocus "ofocus" format

Problems

I use Omnifocus but I'd like to have access to the underlying data and use it in another application. How can I do that?

The first step in solving this problem is figuring out whether we can access the underlying data easily. Sometimes we're lucky and the data is a single file in a format that only needs a little manipulation. In other cases, like the ofocus format, files are organized in a single directory that contains zip archives. Unzipping some of those files, we can observe that there are two kinds: a master file and many transaction files. By playing around a bit with Omnifocus, we can make all the transaction files disappear, effectively merging them into the master file.
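A first look at such a directory could be done with a short Python script like the one below. It is a minimal sketch: it only assumes that the directory holds zip archives, as described above. The function name is made up, and nothing here relies on how Omnifocus actually names its master and transaction files.

```python
import zipfile
from pathlib import Path


def list_archive_members(ofocus_dir):
    """Map each zip archive in the directory to the names of the files it holds."""
    members = {}
    for path in sorted(Path(ofocus_dir).iterdir()):
        # is_zipfile checks the file content, so no file extension is assumed
        if path.is_file() and zipfile.is_zipfile(path):
            with zipfile.ZipFile(path) as archive:
                members[path.name] = archive.namelist()
    return members
```

Calling `list_archive_members` on a copy of the database directory gives a quick inventory of what each archive contains before anything is extracted.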

The next step is to look at the content of the master file. This file contains most of what we would expect to find when we use Omnifocus, namely contexts, folders and tasks. By inspecting multiple entries of the same type, it is possible to collect the different attributes that constitute them and to determine which are optional and which are always present.
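That inspection can be automated. The Python sketch below assumes, purely for illustration, that entries are direct children of the root element and that their attributes are stored as child elements. The sample document and its tag names are invented and are not the real ofocus schema.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict


def attribute_presence(xml_text):
    """For each entry type, split child element names into always-present and optional."""
    root = ET.fromstring(xml_text)
    totals = Counter()            # entry type -> number of entries seen
    seen = defaultdict(Counter)   # entry type -> attribute name -> entries having it
    for entry in root:
        totals[entry.tag] += 1
        for name in {child.tag for child in entry}:
            seen[entry.tag][name] += 1
    report = {}
    for entry_type, count in totals.items():
        always = sorted(n for n, c in seen[entry_type].items() if c == count)
        optional = sorted(n for n, c in seen[entry_type].items() if c < count)
        report[entry_type] = {"always": always, "optional": optional}
    return report


# Invented sample data, only to show the shape of the report.
sample = """
<database>
  <task><name>Buy milk</name><due>2020-02-21</due></task>
  <task><name>Call Bob</name></task>
  <folder><name>Home</name></folder>
</database>
"""
print(attribute_presence(sample))
```

An attribute listed as "always" is only always present in the entries that were inspected, which is why a database with many entries gives a more trustworthy report.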

When I tried to reverse-engineer the format of ofocus files, I used my own database, which contained many entries. This allowed me to quickly find most of the attributes of each type of entry. Another approach would have been to start with a clean database and create two variants of each type of entry: one with all the attributes defined, the other with the fewest attributes necessary.

Once the structure of those entries is identified, it is not too difficult to reason about where certain values come from. In some cases, values are references to other entries, similar to how you would have foreign keys in a database.
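The foreign-key analogy can be made concrete with a small Python sketch. The field names `id` and `parent` and the sample entries are hypothetical. They stand in for whatever identifier and reference attributes the real entries carry.

```python
def resolve_references(entries, reference_field):
    """Replace the id stored in `reference_field` with the entry it points to."""
    by_id = {entry["id"]: entry for entry in entries}
    resolved = []
    for entry in entries:
        copy = dict(entry)
        # None when the field is absent or points to an unknown id
        copy[reference_field] = by_id.get(entry.get(reference_field))
        resolved.append(copy)
    return resolved


# Hypothetical entries: a task that references its folder by id.
entries = [
    {"id": "f1", "name": "Home", "parent": None},
    {"id": "t1", "name": "Buy milk", "parent": "f1"},
]
resolved = resolve_references(entries, "parent")
print(resolved[1]["parent"]["name"])  # Home
```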

With all this knowledge available, it is easy to convert the XML file into a JSON file. Once the data is available in JSON, it is slightly easier to work with in PHP than XML. By understanding the structure and relationships within the data, it is possible to build your own application that reproduces the hierarchy of folders and tasks displayed inside Omnifocus.
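As an illustration of the XML to JSON step, here is a generic converter in Python. It is one possible mapping among many, not the one used for the original conversion: XML attributes are prefixed with `@`, repeated child elements become lists, and the sample document is invented.

```python
import json
import xml.etree.ElementTree as ET


def element_to_dict(element):
    """Recursively convert an XML element into plain dicts, lists and strings."""
    node = {"@" + key: value for key, value in element.attrib.items()}
    children = list(element)
    if not children:
        text = (element.text or "").strip()
        if not node:
            return text           # leaf without attributes: just its text
        if text:
            node["text"] = text
        return node
    for child in children:
        # Children are always stored in a list so repeated tags are preserved.
        node.setdefault(child.tag, []).append(element_to_dict(child))
    return node


# Invented sample, not the real ofocus schema.
xml_text = '<database><task id="t1"><name>Buy milk</name><folder idref="f1"/></task></database>'
converted = element_to_dict(ET.fromstring(xml_text))
print(json.dumps(converted, indent=2))
```

The resulting JSON can then be loaded with `json_decode` in PHP or any other JSON parser.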

19 Feb 2020

PHP semantic versioning checker

Problems PHP

I use semantic versioning 2.0.0 for my PHP libraries and I'd like to know when I introduce breaking changes in my code. I would like a tool that tells me as soon as that happens so I can avoid shipping backward-incompatible changes.

In 2015 I was working on a lot of PHP code, releasing various libraries and using many libraries that did not always respect the semantic versioning 2.0.0 rules. I understood that it was difficult to keep track of all the changes made to a codebase and that you needed a certain level of expertise to tell what semantic versioning impact a change had. However, I knew that most of the semantic versioning rules could be codified, such that a program could compare a before and an after snapshot of a piece of code and tell you how it changed. Those changes could then be categorized according to what they changed and, based on the rules of semantic versioning, a version change could be derived for the code change itself.

Thus PHP Semantic Versioning Checker was born. It analyzes a before and after snapshot of a directory of source code and generates a report of all the changes that occurred. It is then possible to review all the changes by their type (class, function, method, trait, interface) and their semantic impact (major, minor, patch, none). It also computes a suggested versioning change, which is the highest semantic impact found amongst the different types inspected.
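The "highest semantic impact wins" rule is easy to express in code. The Python sketch below is not the tool's implementation. The `Impact` enum, the function names and the sample changes are assumptions made for the example, and the bump rules follow semantic versioning 2.0.0 for versions at or above 1.0.0.

```python
from enum import IntEnum


class Impact(IntEnum):
    """Semantic impact levels, ordered from lowest to highest."""
    NONE = 0
    PATCH = 1
    MINOR = 2
    MAJOR = 3


def suggested_impact(changes):
    """Return the highest impact among (description, Impact) pairs."""
    return max((impact for _, impact in changes), default=Impact.NONE)


def bump(version, impact):
    """Apply an impact to a MAJOR.MINOR.PATCH version string."""
    major, minor, patch = (int(part) for part in version.split("."))
    if impact == Impact.MAJOR:
        return f"{major + 1}.0.0"
    if impact == Impact.MINOR:
        return f"{major}.{minor + 1}.0"
    if impact == Impact.PATCH:
        return f"{major}.{minor}.{patch + 1}"
    return version


# Hypothetical report entries; the real tool derives these from the two snapshots.
changes = [
    ("public method removed", Impact.MAJOR),
    ("function added", Impact.MINOR),
    ("function body changed", Impact.PATCH),
]
level = suggested_impact(changes)
print(level.name, bump("1.4.2", level))  # MAJOR 2.0.0
```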

I hope that this library and the concept of scanning source code for semantic versioning changes become more mainstream, such that every project runs this kind of tool as part of its CI pipeline.

18 Feb 2020

Anki recorder

Problems Anki

I'm learning a language using Anki flash cards and I'd like to record my progress over time. I'd like to be able to hear what I sounded like when I started learning.

In 2017 I started learning Chinese. It was quite difficult for me given that I had very little prior experience with Asian languages, other than learning a bit of Japanese through a Rosetta Stone program where I couldn't figure out what they wanted me to learn based on images alone (they were trying to teach me colors, but it was not obvious).

I imported the Memrise Mandarin Chinese flashcards level 1, 2 and 3 through the use of an Anki extension (https://github.com/wilddom/memrise2anki-extension) and I started my journey to learn. After practicing for a few weeks, I wanted some form of recording of my progress, so I thought I could simply record my voice whenever I was asked to recall a word. Anki already had a recorder as part of its features, so I piggybacked on it to implement an Anki extension, which I called the Anki recorder.

When a new card is shown to you, the recorder starts right away. As you recall the word, you have to pronounce it. Once you've said the word, you can then check if you were correct or not, at which point the recording is stopped. Each recording is timestamped, which allows you to listen to how you pronounced any word over time. It's rather funny to compare how you sounded when you started learning a language with how you sound a few years later.

With this tool I was able to record over 193k audio samples over 3 years. A good chunk of those recordings contain only silence because the recorder was also triggered on text-only cards (where you had to remember how to write the word, or what the word meant).

Hopefully this tool can allow you to record your language learning progression and have fun after a few months of practice!

17 Feb 2020

PHP code coverage verifier

Problems PHP

I wrote some new code or edited code in my PHP application and I'd like to know if this change is covered by tests.

The approach I thought of to solve this problem was to write the change, run the tests (most likely using PHPUnit), then create a diff/patch of the change and use it to determine whether the lines that are part of the diff are covered. It should be relatively straightforward to automate generating a patch from within a git repository as well as running PHPUnit within a project's directory (generating a clover.xml report). With this information in hand, a script can compute what is and isn't covered by tests in the changes that you are about to commit.

I wrote PHP Code Coverage Verifier, which reads the PHPUnit output (a clover.xml file) and a generated patch file and produces a report of the lines that are and aren't covered.
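The core idea of cross-referencing a diff with a coverage report can be sketched in Python as follows. This is not the verifier's actual code. It assumes a Clover report made of `file` elements (with a `name` attribute) containing `line` elements (with `num` and `count` attributes), and it assumes that file paths in the report match those in the diff, which in practice often requires stripping an absolute prefix. The diff parsing is deliberately simplistic, and it works line by line rather than by ranges.

```python
import re
import xml.etree.ElementTree as ET

HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def added_lines(diff_text):
    """Map each file in a unified diff to the set of line numbers it adds.

    Simplification: added content that itself starts with '++ ' would be
    mistaken for a file header.
    """
    added, current, line_no = {}, None, 0
    for line in diff_text.splitlines():
        if line.startswith("+++ "):
            path = line[4:].strip()
            current = path[2:] if path.startswith("b/") else path
            added.setdefault(current, set())
        elif line.startswith("--- "):
            continue
        elif (match := HUNK.match(line)):
            line_no = int(match.group(1))  # first line number in the new file
        elif current is not None and line.startswith("+"):
            added[current].add(line_no)
            line_no += 1
        elif current is not None and line.startswith(" "):
            line_no += 1                   # context line: present in the new file
    return added


def line_counts(clover_xml):
    """Map each file in the report to {line number: execution count}."""
    counts = {}
    for file_el in ET.fromstring(clover_xml).iter("file"):
        per_line = counts.setdefault(file_el.get("name"), {})
        for line_el in file_el.iter("line"):
            per_line[int(line_el.get("num"))] = int(line_el.get("count"))
    return counts


def verify(diff_text, clover_xml):
    """Return (covered, not_covered) lists of (path, line number) pairs."""
    counts = line_counts(clover_xml)
    covered, not_covered = [], []
    for path, lines in added_lines(diff_text).items():
        per_line = counts.get(path, {})
        for number in sorted(lines):
            if number not in per_line:
                continue  # not executable, or file absent from the report
            (covered if per_line[number] > 0 else not_covered).append((path, number))
    return covered, not_covered
```

Lines that the report does not mention at all (comments, blank lines, files outside the test run) are skipped rather than counted as uncovered, which is a design choice of this sketch.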

An example of the script output is as follows:

php vendor/bin/php-code-coverage-verifier verify my-clover.xml my-diff.patch
Using clover-xml file: my-clover.xml
With diff file: my-diff.patch

Covered:
controller/admin/stocks.php line 15 - 21
controller/admin/stocks.php line 91 - 97
controller/search.php line 26 - 32
controller/search.php line 376 - 384
model/user.php line 34 - 41
model/user.php line 44 - 51

Not covered:
controller/account.php line 39 - 45
controller/admin/stocks.php line 27 - 33
controller/search.php line 36 - 42
controller/search.php line 187 - 193
model/user.php line 533 - 540

Ignored:
application/composer.json

Coverage: 40 covered (56.338%), 31 not covered (43.662%)
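The percentages on the last line follow directly from the two counts: 40 / (40 + 31) ≈ 56.338% and 31 / (40 + 31) ≈ 43.662%. A small Python sketch reproduces that summary line. The function name is made up, and it makes no assumption about what unit the tool counts.

```python
def coverage_summary(covered, not_covered):
    """Format a summary line from the two counts."""
    total = covered + not_covered
    if total == 0:
        return "Coverage: nothing to verify"
    return (
        f"Coverage: {covered} covered ({100 * covered / total:.3f}%), "
        f"{not_covered} not covered ({100 * not_covered / total:.3f}%)"
    )


print(coverage_summary(40, 31))
# Coverage: 40 covered (56.338%), 31 not covered (43.662%)
```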