Bringing bash and PowerShell a little closer together

I recently ran across PSReadLine, a project that makes the PowerShell console act more like a bash shell. I’ve just started using it, but it seems promising. I’m switching between Linux and Windows frequently these days and it’s nice to have a little more in common between the two.

I’d rather write a PowerShell script than a bash script, but I’d rather use the bash console interactively. The PowerShell console is essentially the old cmd.exe console. (I haven’t kept up with PowerShell in a while, so maybe there have been some improvements, but it’s my impression that the scripting language has moved forward and the console has not.) PSReadLine adds some bash-like console conveniences such as Emacs-like editing at the command prompt.

Update: Thanks to Will for pointing out Clink in the comments. Clink sounds like it may be even better than PSReadLine.



Haskell is to math as Perl is to English?

Fortran superficially looks like mathematics. Its name comes from “FORmula TRANslation,” and the language does provide a fairly straightforward way to turn formulas into code. But the resemblance to mathematics ends there, at the lowest level of abstraction.

Haskell, on the other hand, is more deeply mathematical. Fortran resembles math in the small, but Haskell resembles math in the large: it has mathematical structures at a higher level of abstraction than just formulas. As a recent article put it:

While Fortran provides a comfortable translation of mathematical formulas, Haskell code begins to resemble mathematics itself.

On its surface, Perl is one of the least English-like programming languages. It is often criticized as looking like “line noise” because of its heavy use of operators. By contrast, Python has been called “executable pseudocode” because the source is easier to read, particularly for novice programmers. And yet at a deeper level, Perl is more English-like than other programming languages such as Python.

Larry Wall explains in Natural Language Principles in Perl that he designed Perl to incorporate features of spoken language. For example, Perl has analogs of articles and pronouns. (Larry Wall studied both linguistics and computer science in college.) Opinions differ on how well his experiment with incorporating natural language features into programming languages has worked out, but it was an interesting idea.

Related posts:

Haskell / Category theory decoder ring

Programming languages and magic

Extreme syntax

Three-hour-a-week language


Where combinator names come from

Today I found out where the one-letter names of some functions in combinatory logic come from. I’d seen these before (for example, in To Mock a Mockingbird) but I had no idea what inspired the names.

These functions — I, K, S, T, and Z — are known as the Schönfinkel combinators, and their names are somewhat mnemonic in German. (Only somewhat. Don’t get your hopes up.)

Definition        Name   Name origin
λx. x             I      Identitätsfunktion (identity function)
λx,y. x           K      Konstanzfunktion (constant function)
λx,y,z. xz(yz)    S      Verschmelzungsfunktion (amalgamation function)
λx,y,z. xzy       T      Vertauschungsfunktion (exchange function)
λx,y,z. x(yz)     Z      Zusammensetzungsfunktion (composition function)
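
As a quick example of how these definitions combine (my own illustration, not from the source above), the S and K combinators alone can reproduce the identity combinator I:

S\,K\,K\,x = K\,x\,(K\,x) = x

The first step applies the definition of S (with x, y, z replaced by K, K, x), and the second applies K, which discards its second argument. So SKK behaves exactly like I.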

Source: Practical Foundations of Mathematics, footnote on page 89. Available online here.

If you’re not familiar with the notation in the function definitions, see this introduction to lambda calculus.


Comparing Windows and Linux stability

In my experience, Ubuntu 12.04 is less stable than Windows 8. Ubuntu is more likely to freeze, crash, or otherwise act badly than Windows.

When I say this, people point out that Ubuntu is not the Linux kernel and that the problems I see come from the GUI or from Ubuntu itself. I can believe that.

So Linux is really stable when you don’t run it on a desktop. But the same is true of Windows. If you set up a Windows server and treat it like a server — don’t install random consumer software, don’t browse the web on it, don’t play games on it, etc. — then Windows is really stable too.

When people compare the stability of Linux and Windows, they may be biased in a couple of ways. First, Linux is more often deployed on servers and Windows more often on desktops. So they may unintentionally be comparing Linux servers to Windows desktops. Second, they may be thinking that Linux users’ computers are more stable than Windows users’ computers, which is probably true. Linux users typically know more about computers than Windows users. They are more able to avoid problems and more able to fix them when they occur.

I suspect the most important factors for a computer’s stability are how it is being used and who is using it, not what OS it is running.


Calculating entropy

For a set of positive probabilities p_i summing to 1, their entropy is defined as

- \sum p_i \log p_i

(For this post, log will mean log base 2, not natural log.)

This post looks at a couple of questions about computing entropy. First, are there any numerical problems computing entropy directly from the equation above?

Second, imagine you don’t have the p_i values directly but rather counts n_i that sum to N. Then p_i = n_i/N. To apply the equation directly, you’d first need to compute N, then go back through the data again to compute entropy. If you have a large amount of data, could you compute the entropy in one pass?

To address the second question, note that

- \sum \frac{n_i}{N} \log \frac{n_i}{N} = -\frac{1}{N} \sum n_i (\log n_i - \log N) = \log N - \frac{1}{N} \sum n_i \log n_i

so you can sum n_i and n_i log n_i in the same pass.
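
Here’s a minimal C++ sketch of the one-pass calculation (the function name and the use of double precision are my choices, not from the derivation above):

#include <cmath>
#include <vector>

// One-pass entropy from counts, using the identity above:
// entropy = log2(N) - (1/N) * sum of n_i * log2(n_i).
// Assumes all counts are positive.
double entropy_from_counts(const std::vector<long long>& counts)
{
    double total = 0.0;     // running value of N
    double weighted = 0.0;  // running sum of n_i * log2(n_i)
    for (long long n : counts) {
        total += n;
        weighted += n * std::log2(static_cast<double>(n));
    }
    return std::log2(total) - weighted / total;
}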

One of the things you learn in numerical analysis is to look carefully at subtractions. Subtracting two nearly equal numbers can result in a loss of precision. Could the numbers above be nearly equal? Maybe if the n_i are ridiculously large. Not just astronomically large — astronomically large numbers like the number of particles in the universe are fine — but ridiculously large, numbers whose logarithms approach the limits of machine-representable numbers. (If we’re only talking about numbers as big as the number of particles in the universe, their logs will be at most three-digit numbers.)

Now to the problem of computing the sum of n_i log n_i. Could the order of the terms matter? This also applies to the first question of the post if we look at summing p_i log p_i. In general, you’ll get better accuracy summing a lot of positive numbers by sorting them and adding from smallest to largest, and worse accuracy by summing largest to smallest. If adding a sorted list gives essentially the same result when summed in either direction, summing the list in any other order should give essentially the same result too.
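
One simple empirical check (a sketch of my own, not code from the post) is to sum the same terms in both orders and compare the results:

#include <algorithm>
#include <vector>

// Sum terms from smallest to largest or largest to smallest,
// to see whether ordering affects the total.
double ordered_sum(std::vector<double> terms, bool ascending)
{
    std::sort(terms.begin(), terms.end());
    if (!ascending)
        std::reverse(terms.begin(), terms.end());
    double sum = 0.0;
    for (double t : terms)
        sum += t;
    return sum;
}

If ordered_sum(terms, true) and ordered_sum(terms, false) agree to the precision you care about, the order of summation is not a concern for your data.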

To test the methods discussed here, I used two sets of count data, one on the order of a million counts and the other on the order of a billion counts. Both data sets had approximately a power law distribution, with counts varying over seven or eight orders of magnitude. For each data set I computed the entropy four ways: two equations times two orders. That is, I either converted the counts to probabilities or used the counts directly, and I summed either smallest to largest or largest to smallest.

For the smaller data set, all four methods produced the same answer to nine significant figures. For the larger data set, all four methods produced the same answer to seven significant figures. So at least for the kind of data I’m looking at, it doesn’t matter how you calculate entropy, and you might as well use the one-pass algorithm to get the result faster.


Leading digits and quadmath

My previous post looked at a problem that requires repeatedly finding the first digit of k^n where k is a single digit but n may be on the order of millions or billions.

The most direct approach would be to first compute k^n as a very large integer, then find its first digit. That approach is slow, and gets slower as n increases. A faster way is to look at the fractional part of log k^n = n log k (log base 10) and see which digit it corresponds to: the leading digit is d exactly when this fractional part lies between log d and log (d+1).

If n is not terribly big, this can be done in ordinary precision. But when n is large, multiplying log k by n and taking the fractional part brings less significant digits into significance. So for very large n, you need extra precision. I first did this in Python using SymPy, then switched to C++ for more speed. There I used the quadmath library for gcc. (If n is big enough, even quadruple precision isn’t enough. An advantage of SymPy over quadmath is that the former has arbitrary precision. You could, for example, set the precision to be 10 more than the number of decimal places in n in order to retain 10 significant figures in the fractional part of n log k.)

The quadmath.h header file needs to be wrapped in an extern C declaration. Otherwise gcc will give you misleading error messages.

The 128-bit floating point type __float128 has twice as many bits as a double. The quadmath functions have the same name as their standard math.h counterparts, but with a q added on the end, such as log10q and fmodq below.

Here’s code for computing the leading digit of k^n that illustrates using quadmath.

#include <cmath>
extern "C" {
#include <quadmath.h>
}

// Cached base-10 logs of 2 through 10; note logs[10] = 1 exactly.
__float128 logs[11];

void init_logs()
{
    for (int i = 2; i <= 10; i++)
        logs[i] = log10q(i + 0.0);
}

int first_digit(int base, long long exponent)
{
    // Fractional part of exponent * log10(base)
    __float128 t = fmodq(exponent*logs[base], 1.0);
    for (int i = 2; i <= 10; i++)
        if (t < logs[i])
            return i - 1;
    return 9; // not reached: t < 1 = logs[10]
}

The search always returns a digit because t is less than 1, which is the value of logs[10].
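
For example, a minimal driver (mine, not from the original post) might look like this:

#include <cstdio>

int main()
{
    init_logs();
    // Leading digit of 2^1000000
    printf("%d\n", first_digit(2, 1000000LL));
    return 0;
}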

Caching the values of log10q saves repeated calls to a relatively expensive function. So does using the search at the bottom rather than computing powq(10, t).

The linear search at the end is more efficient than it may seem. First, it only searches a list of length 9. Second, because of Benford’s law, the leading digits are searched in order of decreasing frequency, i.e., most inputs will cause first_digit to return early in the search.

When you compile code using quadmath, be sure to add -lquadmath to the compile command.
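
For example, assuming the source file is named leading_digit.cpp (the file name here is hypothetical):

g++ leading_digit.cpp -o leading_digit -lquadmath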

Related posts

Benford’s law and SciPy
Leading digits of factorials


Baroque computers

From an interview with Neal Stephenson, giving some background for his Baroque Cycle:

Leibniz [1646-1716] actually thought about symbolic logic and why it was powerful and how it could be put to use. He went from that to building a machine that could carry out logical operations on bits. He knew about binary arithmetic. I found that quite startling. Up till then I hadn’t been that well informed about the history of logic and computing. I hadn’t been aware that anyone was thinking about those things so far in the past. I thought it all started with [Alan] Turing. So, I had computers in the 17th century.


Preparing for Google Reader going away

As you’ve probably heard, Google has announced that they’re discontinuing Google Reader on July 1. Most of you who subscribe to this blog use Google Reader or use an RSS reader that depends on Google’s Feedfetcher. Here’s a snapshot of my subscriber statistics from before Google announced the end of Reader. The proportions have changed slightly since then as people are starting to leave Google Reader.

If you use Google Reader, I suggest you bookmark https://www.google.com/reader. Google has been tinkering with the menu you see when you log into their home page. Sometimes Reader is in the list under “More” and sometimes it’s not.

Try out a few RSS readers. You may want to start with Feedly, as it appears to be the most popular alternative. A half million people signed up for Feedly within 48 hours of the Google announcement. Feedly is available through your browser and as a mobile app. It will synchronize across multiple devices like Google Reader, but it has a very different user interface.

There are a lot of other alternatives, and I imagine more will appear over the next three months. Here’s a list of 18 RSS readers. That post started as a list of readers available on Linux, and all do run there, but I added notes on what platforms each runs on. Most of the readers run on multiple platforms.

You can subscribe to this blog via email. If you go to the web page for the blog, you’ll see a box on the right side where you can enter your email to subscribe. You may also be able to use your email client as an RSS reader directly. At least Outlook and Thunderbird are RSS readers, and I imagine other email clients are as well.


Wonky but free

Rachel Kroll wrote a blog post last Friday entitled I mortgaged my future with a Mac. The part I found most interesting is near the end of the post.

Instead of staying with my wonky-but-free ways of doing things, I shifted all of my stuff over to the Mac. …  Now when I want to get back out, I have to do all of the work I thought I had managed to avoid by using a Mac in the first place.

I’m not interested in the pros and cons of using a Mac, but I really liked the phrase “wonky but free.” I have a growing appreciation for things that are wonky-but-free, technologies that make a poor first impression but have long-term benefits.


Seven John McCarthy papers in seven weeks

I recently ran across a series of articles from Carin Meier going through seven papers by the late computer science pioneer John McCarthy in seven weeks. Published so far:

Prologue
#1: Ascribing Mental Qualities to Machines
#2: Towards a Mathematical Science of Computation

Carin has announced that the next paper will be “First Order Theories of Individual Concepts and Propositions” but she hasn’t posted a commentary on it yet.


No technology can ever be too arcane

In this fake interview, Linux creator Linus Torvalds says Linux has gotten too easy to use and that’s why people use Git:

Git has taken over where Linux left off separating the geeks into know-nothings and know-it-alls. I didn’t really expect anyone to use it because it’s so hard to use, but that turns out to be its big appeal. No technology can ever be too arcane or complicated for the black t-shirt crowd.

Emphasis added.

Note: If you want to leave a comment saying Linux or Git really aren’t hard to use, lighten up. This is satire.

Related post: Advanced or just obscure?


It's not the text editor, it's text

Vivek Haldar had a nice rant about editors a couple of days ago. In response to complaints that some editors are ugly, he writes:

The primary factor in looking good should be the choice of a good font at a comfortable size, and a syntax coloring theme that you like. And that is not something specific to an editor. Editors like Emacs and vi have almost no UI!

To illustrate his point, consider what my Emacs looks like without text: there’s just not much there, not enough to say it’s pretty or ugly.

When people say that Emacs is not pretty, I think they mean that plain text is not pretty.

For better and for worse, everything in Emacs is text. The advantage of this approach is consistency. Everything uses the same commands for navigation and editing: source code, error messages, directory listings, … Everything is just text. The disadvantage is that you don’t have nicely designed special windows for each of these things, and it does get a little monotonous.

When people say they love their text editor, I think they love text. They prefer an environment that allows them to solve problems by editing text files rather than clicking buttons. And as Vivek says, that is not something specific to a particular editor.

Related posts:

Personal organization software
Complexity and unity
