On being a polyglot

I’m kind of known as a polyglot among coworkers. We would often argue that instead of hiring great Java/Python/C++ developers, we should strive to hire great engineers with strong CS fundamentals who can pick up any language easily. I came from a scientific computing background, doing mostly C/C++/Python many years ago. Over the last three years at my current job I have coded in seven languages professionally, some out of interest and some out of necessity. I enjoyed learning all these different things and want to share here what I learned from each of them and how it helped me become a better engineer.

C

The first language I used seriously, apart from LOGO & BASIC when I was a kid of course. It’s probably the closest one can get to the operating system and bare metal without dropping down to assembly (which you can still do from within C). It’s a simple language whose syntax served as the basis for many successors like C++ & Java. It doesn’t offer any fancy features like OOP or namespaces, but rather depends on the developer’s skill for organizing a large code base (think …


How many copies

One topic that comes up a lot when optimizing Scala data applications is the performance of the standard collections, or more precisely the hidden cost of temporary copies. The collections API is easy to learn and maps well to many Python concepts that a lot of data engineers are familiar with. But the performance penalty can be pretty big when it’s repeated over millions of records in a JVM with limited heap.

Mapping values

Let’s look at the most naive example first: mapping the values of a Map.

val m = Map("A" -> 1, "B" -> 2, "C" -> 3)
m.toList.map(t => (t._1, t._2 + 1)).toMap

Looks simple enough, but it is obviously not optimal. Two temporary List[(String, Int)] are created, one by toList and one by map. map also allocates 3 new (String, Int) tuples.
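As a rough illustration of where the two lists come from, here is a minimal sketch that goes through an iterator instead, so no intermediate List is built before toMap. The object name MapCopiesSketch is made up for this example, and one new tuple per entry is still allocated either way.

```scala
object MapCopiesSketch {
  val m = Map("A" -> 1, "B" -> 2, "C" -> 3)

  // Original version: toList and map each build a temporary List.
  val viaList: Map[String, Int] =
    m.toList.map(t => (t._1, t._2 + 1)).toMap

  // Iterator version: entries are transformed one at a time and fed
  // straight into toMap, so no temporary List is materialized.
  // A new (String, Int) tuple is still created for every entry.
  val viaIterator: Map[String, Int] =
    m.iterator.map { case (k, v) => (k, v + 1) }.toMap

  def main(args: Array[String]): Unit = {
    // Both approaches should produce the same Map.
    println(viaList == viaIterator)
  }
}
```

Whether this matters in practice depends on the size of the map and how often the code runs, so it is worth measuring before rewriting anything.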

There are a few commonly seen variations. These don’t create temporary collections, but they still allocate a new key-value tuple for every entry.

for ((k, v) <- m) yield k -> (v + 1)
m.map { case (k, v) => k -> (v + 1) }

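If one reads the ScalaDoc closely, there’s already a mapValues method, and it is probably the shortest and most performant option. One caveat: in Scala 2.x it returns a view over the original map rather than a strict copy, so the function is reapplied on every lookup.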

m.mapValues(_ + 1)
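To make the behavior of mapValues a bit more concrete, here is a small sketch that counts how often the mapping function runs. It assumes the Scala 2.12 or 2.13 standard library (on 2.13 mapValues still compiles but emits a deprecation warning), and the names MapValuesSketch, calls and viewed are made up for this example. transform is shown next to it as a strict alternative that builds the new Map once.

```scala
object MapValuesSketch {
  val m = Map("A" -> 1, "B" -> 2, "C" -> 3)

  // Strict alternative: transform builds a new Map once, up front.
  val strict: Map[String, Int] = m.transform((_, v) => v + 1)

  // Counter used only to observe how often the function is invoked.
  var calls = 0

  // mapValues wraps the original map; nothing is computed here yet.
  val viewed = m.mapValues { v => calls += 1; v + 1 }

  def main(args: Array[String]): Unit = {
    // Each lookup goes through the wrapped function again.
    viewed("A")
    viewed("A")
    println(s"strict = $strict, calls after two lookups = $calls")
  }
}
```

If the mapped values are read many times, or the function is expensive, a strict result may be the better trade-off despite the one-time copy.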

Java conversion

Similar problem exists …



Light Table

I recently picked up Light Table for Clojure development and liked it. Form evaluation works out of the box and indentation is better than in the La Clojure plugin for IntelliJ IDEA.

I particularly like the command bar, which lets you search for Light Table commands by name and execute them quickly. I was already used to IDEA’s keymap though (the Mac OS X 10.5+ one, which feels more natural to Mac users than the default Mac OS X keymap), and wanted something similar. The settings files are in Clojure so it’s easy to customize. This is what I have so far for user.keymap:

{:+ {:app {"alt-space" [:show-commandbar-transient]}

     :editor {"alt-w" [:editor.watch.watch-selection]
              "alt-shift-w" [:editor.watch.unwatch]
              "ctrl-alt-i" [:smart-indent-selection]
              "ctrl-alt-c" [:toggle-console]
              "ctrl-shift-j" [:editor.sublime.joinLines]
              "pmeta-d" [:editor.sublime.duplicateLine]
              "pmeta-shift-up" [:editor.sublime.swapLineUp]
              "pmeta-shift-down" [:editor.sublime.swapLineDown]
              "pmeta-/" [:toggle-comment-selection :editor.line-down]}}}

Apart from these, I found myself using "pmeta-enter" [:eval-editor-form] and "ctrl-d" [:editor.doc.toggle] most when writing Clojure code. After all they are probably the most essential ones no matter what editor you use :)


dotfiles

My dotfiles are probably the most copied code among my coworkers, and today I will give a short breakdown of the code base.

zsh

I switched to zsh 3 years ago and never looked back. There’s also oh-my-zsh, a framework for managing zsh configuration. The features I found most useful are:

  • Tab completion, including hostnames and arguments
  • History across multiple sessions
  • Plugins and themes

My .zshrc is mostly out of the box with some aliases and a few plugins thrown in, but I decided to create my own theme. I use colors for the hostname in the prompt, green for local and red for remote. I also tweaked git status a bit to show untracked files (red dots), unstaged (yellow) & staged (green) changes, plus the number of stashed changes, since stashing is the one thing I keep doing and forgetting about.

git

I use git both for work and personal projects, plus contributing to open source projects on GitHub. My .gitconfig includes both a global gitignore file and a templatedir, which includes hooks for ctags and Gerrit. The hooks are installed automatically for every repo.

Since I type hundreds of git commands on a daily basis, I aliased git to simply g …


First (real) post with Pelican

Finally decided to jump (back) on the blogging bandwagon. This time I decided to use a static site generator, since that seems the cool thing to do these days, and found this site. I wanted something in a language I know well, so Ruby and JavaScript were out. It should also be actively maintained, so Scala was out since monkeyman, the only entry there, seems abandoned. I eventually settled on Pelican, the top ranked Python framework.

I set up a new virtualenv with virtualenvwrapper and also discovered autoenv along the way. It was easy to get started with the pelican-quickstart script, and in a few minutes I had a working site. Next I went shopping for themes in pelican-themes and picked pelican-bootstrap3. It turned out it didn’t support a Spotify icon yet, so I forked the repo and made a quick PR.

After some further tweaking with the settings I was pretty happy with the results. I went on to set up Disqus and Google Analytics for the site, and published it to my Linode with make ssh_upload.
