New introduction to SciPy

The Python stack for scientific computing is more modular than, say, R or Mathematica. Python is a general-purpose programming language that has libraries for scientific computing. R and Mathematica are statistical and mathematical programming languages that have general-purpose features. The Python approach has its advantages (I’d rather do math in a general language than do general programming in a mathematical language), but it takes longer to learn. The components of the Python stack work well together, but someone new to Python has to discover what components they’ll need.

Several books have come out recently to help someone learn Python and the components for numerical computing. The latest is Learning SciPy for Numerical and Scientific Computing by Francisco J. Blanco-Silva.

This book covers the things you’d expect, including SciPy, NumPy, and Matplotlib. The only exception may be IPython. But no book can cover everything. And since IPython is an environment more than a library, it makes some sense to leave it out.

In addition to the usual topics, the book includes several important topics that are not as commonly covered. For example, it devotes a good amount of space to special functions and integration in a chapter on numerical analysis. I share the author’s assessment that this is “one of the most interesting chapters in the book.”

There are three chapters on more specific applications: signal processing, data mining, and computational geometry. These chapters give an introduction to their topics as well as how to carry out computations in SciPy.

The final chapter of the book is on integration with other programming languages.

Learning SciPy for Numerical and Scientific Computing covers the material you need to get going. It’s easy to read, and still fairly small: 150 pages total, about 130 pages of content. This is the right size for such a book in my opinion. There’s plenty of online documentation for more details, but it helps to have a good overview such as this book before diving into reference material.

* * *

For daily tips on Python and scientific computing, follow @SciPyTip on Twitter.

Putting people in boxes

When I was in high school I had a conversation with a singer, a man with an incredible range. I was sitting at a piano, and he demonstrated that he could sing notes well below the bass clef. Then he said “Not bad for a tenor, huh?”

That struck me as bizarre. He was obviously a bass, but he called himself a tenor. Or rather, he was capable of singing bass. He was capable of singing tenor as well. But you don’t usually call someone a tenor and a bass. Singers fit into four boxes: soprano, alto, tenor, and bass. Maybe you subdivide the boxes, such as first and second tenor, but you only get to pick one box.

We all tend to put people in boxes, but the urge is particularly strong with children and bureaucrats: children because they have a limited view of the world, and bureaucrats because, well, pretty much the same reason.

Imagine this singer trying out for a choir. Suppose he’s given a form to check which part he sings and he checks both bass and tenor. A director might ask what he means, and respond “Great, we can put you on different parts depending on what we need.” But an audition coordinator might say “Look, Mack. Don’t try to be funny or imagine that you’re some unique snowflake. Which part do you sing?”

It’s not that hard to explain that you can sing two voice parts. Other combinations of abilities are harder to explain. Here’s an example from an earlier post:

Take an expert programmer back in time 100 years. What are his skills? Maybe he’s pretty good at math. He has good general problem solving skills, especially logic. He has dabbled a little in linguistics, physics, psychology, business, and art. He has an interesting assortment of knowledge, but he’s not a master of any recognized trade.

If you’re a programmer but there isn’t a box for programmer, you have to pick another box, but you might not fare well by the evaluation criteria of that box.

Not fitting into traditional categories makes you stand out, for better or for worse. It can make you highly valued. It can also make you a thorn in an administrator’s side, or simply someone to be ignored.

James Scott uses the term illegible for people who don’t fit into boxes. Venkat Rao summarizes the idea in the glossary of his blog. His summary is a bit dense, but it’s worth reading carefully.

A system is legible if it is comprehensible to a calculative-rational observer looking to optimize the system from the point of view of narrow utilitarian concerns and eliminate other phenomenology. It is illegible if it serves many functions and purposes in complex ways, such that no single participant can easily comprehend the whole. The terms were coined by James Scott in Seeing Like a State. Illegible systems are generally more robust than legible ones, and Scott’s model is mainly about the failures caused by imposing legibility on an initially illegible reality.

I’d like to hear the terms legible and illegible more widely used. I’ve had conversations with Daniel Lemire, for example, where he would use one of these terms and immediately clarify a discussion.

Abelian consulting and Lévy consulting

Eric Jonas once asked me on Twitter whether I was an Abelian consultant. The pun is an allusion to Abelian groups, groups in which the group operation commutes.

No, I’m not an Abelian consultant. I don’t have a regular commute. I’m more of a Lévy consultant. A Lévy distribution has heavy tails. That is, a Lévy random variable is usually near the origin, but occasionally takes very long excursions.
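To make "heavy tails" concrete, here is a minimal sketch that draws samples from the standard Lévy distribution using `scipy.stats.levy`. The sample size, the seed, and the cutoff of 10 are arbitrary choices for illustration, not anything from the discussion above.

```python
import numpy as np
from scipy import stats

# Seed chosen arbitrarily so the run is reproducible.
rng = np.random.default_rng(seed=0)

# Draw from the standard Lévy distribution (location 0, scale 1).
samples = stats.levy.rvs(size=10_000, random_state=rng)

median = np.median(samples)
largest = samples.max()

# Most draws stay close to the origin ...
print(f"median of samples:  {median:.2f}")
print(f"fraction below 10:  {np.mean(samples < 10):.2f}")

# ... but the occasional excursion is enormous by comparison.
print(f"largest sample:     {largest:.3g}")
```

The typical draw is small, while the largest draw is many orders of magnitude bigger, which is the "long excursion" behavior described above.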

I vaguely remember a couple papers about the Lévy distribution, one saying that whale migration follows such a distribution, and another saying that human movements do too.

Preparing for Google Reader going away

As you’ve probably heard, Google has announced that they’re discontinuing Google Reader on July 1. Most of you who subscribe to this blog use Google Reader or use an RSS reader that depends on Google’s Feedfetcher. Here’s a snapshot from before Google announced the end of Reader. The proportions have changed slightly since then as people are starting to leave Google Reader.

If you use Google Reader, I suggest you bookmark it. Google has been tinkering with the menu you see when you log into their home page. Sometimes Reader is in the list under “More” and sometimes it’s not.

Try out a few RSS readers. You may want to start with Feedly as it appears to be the most popular alternative. A half million people signed up for Feedly within 48 hours of the Google announcement. Feedly is available through your browser and as a mobile app. It will synchronize across multiple devices like Google Reader, but has a very different user interface.

There are a lot of other alternatives, and I imagine more will appear over the next three months. Here’s a list of 18 RSS readers. That post started as a list of readers available on Linux, and all do run there, but I added notes on what platforms each runs on. Most of the readers run on multiple platforms.

You can subscribe to this blog via email. If you go to the web page for the blog, you’ll see a box on the right side where you can enter your email to subscribe. You may also be able to use your email client as an RSS reader directly. At least Outlook and Thunderbird are RSS readers, and I imagine other email clients are as well.

A web built on LaTeX

The other day on TeXtip, I threw this out:

Imagine if the web had been built on LaTeX instead of HTML …

Here are some of the responses I got:

  • It would have been more pretty looking.
  • Frightening.
  • Single tear down the cheek.
  • No crap amateurish content because of the steep learning curve, and beautiful rendering … What a dream!
  • Shiny math, crappy picture placement: glad it did not!
  • Overfull hboxes EVERYWHERE.
  • LaTeX would have become bloated, and people would be tweeting about HTML being so much better.
  • Noooo! LaTeX would have been “standardised”, “extended” and would by now be a useless pile of complexity.

* * *

For daily tips on LaTeX and typography, follow @TeXtip on Twitter.

Interview with Sacha Chua

I spoke with Sacha Chua last week. We talked about entrepreneurship, Emacs, having eclectic interests, delegation, and more.

J: I ran into you by searching on Emacs topics. When I look at your blog, I see that you do a lot of interesting things, but it’s a little hard to get a handle on exactly what you do.

S: Oh, the dreaded networking quirky question. What exactly do you do?

J: Yeah, people have said the same thing to me. Not to put you in a box, but I was curious. I see from your site that you do graphic art — sketching and such — and it doesn’t create the impression that you’re someone who would spend a lot of time in front of Emacs. So I’m curious how these things fit together, how you got started using Emacs and how you use it now.

Image by Sacha Chua

S: So my background is actually fairly technical. I’ve been doing computer programming for ages and ages. In high school I came across a book Unix Power Tools, which is how I got interested in Emacs. And because I was interested in programming, in open source, a little bit of wearable computing as well, I got to know Emacs and all these different modules it had. For example, Emacspeak is amazing! It’s been around since the 1990s and it’s a great way to use the computer while you’re walking around. Because I love programming and because I wanted to find a way to help out, I ended up maintaining PlannerMode and later EmacsWiki mode as well.

When I went to university, I took up computer science. After I finished, I taught. Then I took my masters in Toronto, where I am now. Emacs was super helpful — being able to do everything in one place. After I finished my masters, I did a lot of software consulting with IBM. I did business consulting as well. Then in 2012, after saving up, I decided to go on pretty much the same adventure you’re on. I’m completely unhirable for the next five years! Most businesses struggle for the first five years, so I saved up enough to not worry too much about my expenses for the next five years. I’m one year in, four years to go, and that’s where I am.

At networking events, I like to shake people up a bit by telling them I’m semi-retired. I’m in this five-year experiment to see how awesome life can be and what I can do to make things better. I’ve done technical consulting, business consulting, sketching, illustration, writing, all sorts of things. Basically, my job description is context-dependent.

J: I understand that.

S: I use Emacs across all the things I do. When I’m doing technical and business consulting, I use Emacs to edit code, to draft documents, even to outline comic strips. And when I’m doing illustration, Emacs — especially Org Mode — helps me keep track of clients and deliverables, things to do, agenda, calendar, deadlines.

J: I’m basically running my life through Org Mode right now. When you say you use Emacs to draft documents, are you using LaTeX?

S: I used LaTeX when I was working on my master’s thesis and other papers, I think. Now I mostly use org mode and export from there.

J: Are you using Emacs for email?

S: I used to. But I’m stuck on Windows to use drawing programs like Sketchbook Pro on my Tablet PC. So it’s harder to set up my email like I had it set up when I used Ubuntu. Back when I used Ubuntu, I was very happy with Gnus.

J: Do you work entirely on Windows, or do you go back and forth between operating systems?

S: I have a private server that runs Linux. On Windows I run Cygwin, but I miss some of the conveniences I had when I had a nicely set-up Linux installation.

J: When you’re running Emacs on Windows, I’m sure you run into things that don’t quite work. What do you do about that?

S: Most things work OK if they’re just Emacs Lisp, but some things call a shell command or use some library that hasn’t been ported over yet. Then I basically wail and gnash my teeth. Sometimes I get things working by using Cygwin, but sometimes it’s a bit of a mess. I don’t use Emacs under Cygwin because I prefer how it works natively. I don’t run into much that doesn’t work.

J: So what programming languages do you use when you’re writing code?

S: I do a lot of quick-and-dirty things in Emacs Lisp. When I need to do some XML parsing or web development, I’ll use Ruby because a lot of people can read it and there are a lot of useful gems. Sometimes I’ll do some miscellaneous things in Perl.

I love doing programming and putting together tools. And I quite enjoy drawing, helping people with presentation and design. So this is left brain plus right brain.

It does boggle people that you can have more than one passion, but others are, like, “Yeah, I know, I’m like that too.”

J: I think having an interest in multiple things is a healthier lifestyle, but it’s a little harder to market.

S: Actually, no. I finally figured out a name for my company, ExperiVis, after a year of playing with it. People reach out to me and we figure out whether it’s a good fit. I don’t need to necessarily guide people to just this aspect or another of my work. I like the fact that people bump into these different things.

J: When we scheduled this call, I went through your virtual assistant. How do you use a virtual assistant?

S: One of the things I don’t like to do is scheduling. I used to get stressed out about scheduling when I did it myself. I’ve always been interested in delegating and taking advantage of what other people enjoy and are good at. I work with an assistant — Criselda. She lives in the Philippines. I found her on oDesk. She works one to four hours a week, more or less, and keeps track of her time.

J: What else might you ask a VA to do?

S: I’ve asked people to do web research. I’ve had someone do a little bit of illustration for me. I’ve had someone do a little bit of programming for me because I want to learn how to delegate technical tasks. He does some Rails prototyping for me. I have someone doing data entry and transcription. It’s fascinating to see how you can swap money for time, especially for things that stress me out, or bore me, or things I can’t do.

Every week I go over my task list with my VA to see which of the tasks I should have delegated. Still working on it!

* * *

Later on in the conversation Sacha asked about my new career and had this gem of advice:

Treating this as a grand experiment makes it much easier for me to try different approaches and not be so scared, to not treat it as a personal rejection if something doesn’t work.

Related post: People I’ve interviewed and people who have interviewed me

Was Betteridge right?

Betteridge’s law says

Any headline which ends in a question mark can be answered by the word no.

If Betteridge was right, then the answer to my headline question should be no, in which case Betteridge was wrong. But if Betteridge was wrong, then the answer to the question in my headline is yes.

This isn’t quite like Russell’s paradox. Russell asked whether the set of all sets that do not contain themselves contains itself. If it does, it doesn’t. If it doesn’t, it does. This logical contradiction led to a more rigorous construction of set theory that avoids the paradox.

My observation about Betteridge’s law isn’t a paradox, though it resembles one. If Betteridge was wrong, there’s no contradiction in saying that he was sometimes but not always right.

Betteridge’s law was an aphorism, not a logical absolute, and so was never intended to be a rigorous statement. I’m sure Betteridge was quite aware that there had been exceptions, or at least that one could easily create an exception. He did so himself. But as is often the case with yes/no statements that are not always true, it can be turned into a rigorous statement using probability.

Betteridge could have said that if a headline ends in a question mark, the probability that the answer is no is large. Then my headline, added to the vast collection of headlines, would ever-so-slightly lower the proportion of headlines that ask questions that can be answered negatively, without contradicting Betteridge, if he was right.

We could call Betteridge’s constant the probability that a headline asks a question that could be answered no. But then it probably isn’t a constant. Maybe knowledge of Betteridge’s law influences how people write headlines …

* * *

Thanks to Don Sizemore for pointing out Betteridge’s law.

An incomplete post about sphere volumes

This is an incomplete blog post. Maybe you can help finish it.

One of the formulas I’ve looked up the most is the volume of a ball in n dimensions. I needed it often enough to be aware of it, but not often enough to remember it. Here’s the formula:

\frac{\pi^{\frac{n}{2}}}{\Gamma\left(\frac{n}{2} + 1\right)} r^n

The factor of r^n is no surprise: of course the volume as a function of radius has to be proportional to r^n. So we can make the formula a little simpler by just remembering the formula for the volume of a unit ball.

Next, we can make the formula simpler still by using factorials instead of the gamma function. If n is a non-negative integer, n! = Γ(n+1). We can use that to define factorial for non-integers. Then the volume of a unit ball is

\frac{\pi^{\frac{n}{2}}}{\left(\frac{n}{2}\right)!}

That’s easier to remember.
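For anyone who would rather compute the volume than remember the formula, here is a minimal sketch using `scipy.special.gamma`. The function name `ball_volume` is just an illustrative choice.

```python
import math
from scipy.special import gamma

def ball_volume(n, r=1.0):
    """Volume of a ball of radius r in n dimensions.

    Implements pi^(n/2) / Gamma(n/2 + 1) * r^n.
    """
    return math.pi ** (n / 2) / gamma(n / 2 + 1) * r ** n

# Volumes of the unit ball in the first few dimensions.
for n in range(1, 6):
    print(n, ball_volume(n))
```

In two and three dimensions this reduces to the familiar πr² and (4/3)πr³, which makes a handy sanity check.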

It’s also curious. The nth term in the series for e^x is x^n/n!, so the volumes of unit balls look like the terms of the series for e^π except compressed, with each index n cut in half. The volumes are not the coefficients in the series for e^x, but could they be the coefficients in the series for another familiar function? To find out, let’s stick back in the factor of r^n and sum.

\sum_{n=0}^\infty \frac{\pi^{\frac{n}{2}}}{\left(\frac{n}{2}\right)!} \, r^n

This is the sum of the volumes of balls of radius r in all dimensions. That doesn’t make sense by itself, but you could also think of this as the generating function for the volumes of unit balls. So can we find a closed-form expression for the generating function? Yes:

\sum_{n=0}^\infty \frac{\pi^{\frac{n}{2}}}{\left(\frac{n}{2}\right)!} \, r^n = \exp(\pi r^2) \left(\mbox{erf}(\sqrt{\pi} r) + 1\right)

If you work with probability, you probably find Φ more familiar than the error function (see notes relating these) and find exp(x^2/2) more familiar than exp(x^2). So you could rewrite the generating function as f(√(2π) r) where

f(x) = 2 \exp(x^2/2) \Phi(x)

That looks familiar, but I don’t know what to do with it.
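A quick numerical check can help when playing with identities like this. The sketch below compares a partial sum of the series with the expression exp(πr²)(erf(√π r) + 1). The function names, the number of terms, and the test radii are arbitrary choices for illustration.

```python
import math
from scipy.special import erf, gamma

def partial_sum(r, terms=120):
    """Partial sum of pi^(n/2) r^n / Gamma(n/2 + 1) for n = 0, ..., terms - 1."""
    x = math.sqrt(math.pi) * r
    return sum(x ** n / gamma(n / 2 + 1) for n in range(terms))

def closed_form(r):
    """exp(pi r^2) * (erf(sqrt(pi) r) + 1)."""
    return math.exp(math.pi * r ** 2) * (erf(math.sqrt(math.pi) * r) + 1)

# Print both values side by side for a few radii.
for r in (0.0, 0.25, 0.5, 1.0):
    print(r, partial_sum(r), closed_form(r))
```

At r = 0 both sides equal 1, the volume of the zero-dimensional unit ball, which is an easy way to catch a stray factor.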

I warned you this would be an incomplete post. I feel like there’s an interesting connection to be made, but I’m not quite there. Any suggestions?

RSS readers on Linux

This afternoon I asked on UnixToolTip for suggestions of RSS readers on Linux. Here are the suggestions I got, in order of popularity.


Some other readers available on Linux:

For daily tips on using Unix, follow @UnixToolTip on Twitter.
