Numerical computing in IronPython with Ironclad

In a previous post, I discussed my difficulties calling some Python modules from IronPython. In particular I wanted to call SciPy from IronPython and couldn’t. The discussion following that post brought up Ironclad as a possible solution. I wanted to learn more about Ironclad, and so I invited William Reade to write a guest post about the project. I want to thank William for responding to my request with a very helpful article. — John


Hi! My name’s William Reade, and I’ve spent the last year or so working on Ironclad, an open-source project which helps IronPython to inter-operate better with CPython. Michael Foord recently introduced  me to our host John, who kindly offered me the opportunity to write a bit about  my work and, er, how well it works. So, here I am.

To give you a little bit of context, I’ve been working at Resolver Systems for several years now; our main product, Resolver  One, is a spreadsheet with very tight IronPython integration. We like to describe  it as a “Pythonic spreadsheet”, and that’s clearly a concept that people like.  However, when people think of a “Pythonic spreadsheet”, they apparently expect it  to work with popular Python libraries — such as NumPy and SciPy — and we found that IronPython’s incompatibility put us at a serious disadvantage. And, for some reason, nobody seemed very keen to  solve the problem for us, so we had to do it ourselves.

The purpose of Ironclad is to allow you to use Python C extensions (of which there are many) from inside IronPython without recompiling anything. The secret purpose  has always been to get NumPy working in Resolver One, and in release 1.4 we finally  achieved this goal. Although the integration is still alpha level, you can import  and use NumPy inside the spreadsheet grid and user code: you can see a screencast  about the integration here.

However, while Resolver One is a great tool, you aren’t required to use it to get the benefits: Ironclad has been developed completely separately, has no external  dependencies, and is available under an open source license. If you consider  yourself adequately teased, keep reading for a discussion of what Ironclad actually  does, what it enables you to do, and where it’s headed.

As you may know, Python is written in C and IronPython is written in C#. While IronPython is an excellent implementation of Python, it works very differently under the hood, and it certainly doesn’t have anything resembling Python’s API for writing C extensions. However, Ironclad can work around this problem by loading a stub DLL into an IronPython process which impersonates the real python25.dll, and hence allows us to intercept the CPython API calls. We can then ensure that the appropriate things happen in response to those calls … except that we use IronPython objects instead of CPython ones.

So long as we wrap IronPython objects for consumption by CPython, and vice versa, the two systems can coexist and inter-operate quite happily. Of course, the mix of  deterministic and non-deterministic garbage collection makes it a little tricky [1] to  ensure that unreferenced objects — and only unreferenced objects — die in a  timely manner, and there are a number of other dark corners, but I’ve done enough  work to confidently state that the problem is “just” complex and fiddly. While it’s  not the sort of project that will ever be finished, it hopefully is the sort that  can be useful without being perfect.

The upshot of my recent work is that you can now download Ironclad, type ‘import ironclad; import scipy’ in an IronPython console, and it will Just Work [2]. I am programmer, hear me roar!

Hundreds of tests now pass in both NumPy and SciPy, and I hope that some of you will be inspired to test it against your own requirements. For example, the Gaussian error function has been mentioned a few times on this blog (and, crucially, I have a  vague idea of what it actually is), and I can demonstrate that scipy.special.erf works perfectly under Ironclad:

C:\dev\ironclad-head\build>ipy
IronPython 2.0 (2.0.0.0) on .NET 2.0.50727.3053
Type "help", "copyright", "credits" or "license" for more information.
>>> import ironclad
>>> from scipy.special import erf
Detected scipy import
  faking out numpy._import_tools.PackageLoader
Detected numpy import
  faking out modules: mmap, nosetester, parser
>>> erf(0)
0.0
>>> erf(0.1)
0.1124629160182849
>>> erf(1)
0.84270079294971478
>>> erf(10)
1.0

Numerical integration also seems to work pretty well, even for tricky cases (note that the quad function returns a tuple of (result, error)):

>>> from scipy.integrate import quad
>>> quad(erf, 0, 1)
(0.48606495811225592, 5.3964050795968879e-015)
>>> quad(erf, -1, 1)
(0.0, 1.0746071094349994e-014)
>>> from scipy import inf
>>> quad(erf, -inf, inf)
(0.0, 0.0)
>>> quad(erf, 0, inf) # ok, this one is probably more of a 'stupid' case
Warning: The integral is probably divergent, or slowly convergent.
(-1.564189583542768, 3.2898350710297564e-010)

And, while this exposes a little import-order wart, we can re-implement erf in terms of the normal CDF, and see that we get pretty similar results:

>>> from scipy import misc # shouldn't really be necessary - sorry :)
>>> from scipy.stats.distributions import norm
>>> import numpy as np
>>> def my_erf(x):
...   y = norm.cdf(x * np.sqrt(2))
...   return (2 * y) - 1
...
>>> my_erf(0.1)
0.11246291601828484
>>> my_erf(1)
0.84270079294971501
>>> quad(my_erf, 0, 1)
(0.48606495811225597, 5.3964050795968887e-015)
>>> quad(my_erf, -inf, inf)
(2.8756927650058737e-016, 6.1925307417506635e-016)

I also know that it’s possible to run through the whole Tentative NumPy Tutorial [3] with identical output on  CPython and IronPython [4], and the SciPy tutorial appears to work equally well in both  environments [5]. In short, if you’re trying to do scientific computing with  IronPython, Ironclad is now probably mature enough to let you get significant value  out of SciPy/NumPy.

However, I can’t claim that everything is rosy: Ironclad has a number of flaws which may impact you.

  • It won’t currently work outside Windows, and it won’t work in 64-bit processes.  However, NumPy itself doesn’t yet take advantage of 64-bit Windows. I’ll start work  on this  as soon as it’s practical; for now, it should be possible to run in 32-bit  mode without problems.
  • Performance is generally poor compared to CPython. In many places it’s only a matter of a few errant microseconds — and we’ve seen NumPy integration deliver some great performance benefits for Resolver One — but in pathological cases it’s  worse by many orders of magnitude. This is another area where I would really like  to hear back from users with examples of what needs to be faster.
  • Unicode data doesn’t work, and I don’t plan to work on this problem because it’ll disappear when IronPython catches up to Python 3000. At that point both systems will have Unicode strings only, instead of the current situation where I would have to map one string type on one side to two string types on the other.
  • NumPy’s distutils and f2py subpackages don’t currently work at all, nor do memory-mapped files.
  • Plenty of other CPython extensions work, to a greater or lesser extent, but lots won’t even import.

However, just about every problem with Ironclad is fixable, at least in theory: if you need it to do something that it can’t, please talk to me about it (or even send me a patch!).

Footnotes

[1] CPython uses reference counting to track objects’ states, and deletes them deterministically the moment they become unreferenced, while .NET uses a more advanced garbage collection strategy which unfortunately leads to non-deterministic finalization.

[2] Assuming you have the directories containing the ironclad, numpy and scipy packages already on your sys.path, at any rate. I personally just install everything for Python 2.5, and have added the CPython install’s ‘DLLs’ and ‘Lib/site-packages’ subdirectories to my IRONPYTHONPATH.
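
For example, a startup script along these lines should work (the paths here are hypothetical; substitute the locations of your own CPython 2.5 install):

# Hypothetical path setup; equivalent to extending IRONPYTHONPATH.
import sys
sys.path.append(r"C:\Python25\DLLs")
sys.path.append(r"C:\Python25\Lib\site-packages")

import ironclad   # import ironclad first, then the C extension packages
import scipy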

[3] Apart from the matplotlib/pylab bit, but even that should be workable with a little extra setup if you don’t mind using a non-interactive back-end.

[4] Modulo PRNG output, of course.

[5] That is to say, not very well at all, but at least they go wrong in similar ways.

Where does the programming effort go?

I stumbled across this quote from Mary Shaw.

Less than 10% of the code has to do with the ostensible purpose of the system; the rest deals with input-output, data validation, data structure maintenance, and other housekeeping.

I don’t know the context of her quote, but she could be talking about any software project.

When I began working as a professional programmer, I was surprised that I spent most of my time on work that wasn’t directly related to what I wanted to accomplish. Computer science classes and writing software for my own research had not prepared me for this. I kept thinking that as I got more experience, the proportion of my effort going directly toward what I wanted to accomplish would increase. It did, but very slowly, and never by very much. Only later did the reason occur to me: the vast majority of the work that needs to be done isn’t directly related to the purpose of the project.

People who know a little bit about programming can make difficult clients because they can imagine how they might write the core 10% (or maybe core 2%) of a large project.

Every once in a while someone will claim to have a solution that will radically change things: a framework, a language, etc. The sales pitch is “You spend most of your time doing low-level stuff. Use my product and your programmers can focus on the value-added part and not do so much plumbing.” But plumbing is value-added work. (Call it “infrastructure” if you like; that sounds more important.) Sometimes plumbing work becomes repetitive and can be reduced by reusing code, but there’s always new plumbing to work on. Most of the work to be done is invisible, and I don’t foresee this changing any time soon.

Update: See this list of non-functional requirements by Mike Griffiths. This list gives some specific examples of where development effort goes, things that must be done but are not obvious until you mention them.

How to preserve documents

A conversation at work this morning inspired a post on the Reproducible Ideas blog about preserving documents. Physically preserving documents may be the easy part. Keeping alive the memory that the documents exist can be much harder.

photo of the Rosetta disk from the Rosetta Project

[The image above is a photograph of the Rosetta Disk, a disk preserving 13,000 pages of documentation about 1,500 human languages. The information is not encoded as bits; it is engraved. A thousand years from now, a scholar could read the disk using only a microscope. Click on the image for more information about the Rosetta Disk.]

Don’t forget the salt

The following story comes from Alton Brown’s book I’m Just Here for More Food, page 76.

I’m Just Here for More Food by Alton Brown

Years ago Alton Brown ruined 50 pounds of dough by forgetting to add salt. By the time he remembered, it was too late. He threw the dough in a dumpster behind the restaurant where he was working.

By noon, the temperature in the parking lot was edging toward 101°F. By 2:00 P.M. the lid of the dumpster started to rise. By 3:00 P.M., grandson of the Blob was oozing out over the top of the dumpster and onto the pavement. … In the end, that 50-pound batch of dough completely filled the dumpster, and when the metal got hot enough, it baked onto the inside, making a real chore for those who had to clean it (meaning me).

He goes on to explain that throwing the dough out because he’d forgotten the salt made things worse: salt would have slowed down the fermentation in the dumpster. He concludes “So don’t forget the salt.”

Two perspectives on the design of C++

Here are two complementary (but not entirely complimentary!) blog posts about C++.

Roshan James has a scathing article about C++. When asked to recommend books on C++, he replied that he doesn’t recommend C++. He explains that the best C++ books may be Scott Meyers’ Effective C++ series, but argues that they should be called “Defective C++.” He isn’t criticizing Scott Meyers, only the aspects of the C++ language that made it necessary for Scott Meyers to write such books. Effective C++ explains how to get around problems that don’t exist in more recent languages.

Bruce Eckel’s article The Positive Legacy of C++ and Java focuses more on what C++ did well. C++ was designed to be backward compatible with C. Bjarne Stroustrup, the original author of C++, realized that the decision to be compatible with C would cause major difficulties, but he also thought (correctly) that without such compatibility no one would move over to the new language. Given this severe constraint, C++ has been remarkably well designed.

Update: Check out the C++ FQA site Alessandro Gentilini mentions in the comments. “FQA” is not a typo. It stands for Frequently Questioned Answers.

Quasi-random sequences in art and integration

Sometimes when people say they want random points, that’s not what they really want. Random points clump more than most people expect. Quasi-random sequences are not random in any mathematical sense, but they might match popular expectations of randomness better than the real thing, especially for aesthetic applications. If by “random” someone means “scattered around without a clear pattern and not clumped together” then quasi-random sequences might do the trick.

Here are the first 50 points in a quasi-random sequence of points in the unit square.

quasi-random points

By contrast, here are 50 points in a unit square whose coordinates are uniform random samples.

random points

The truly random points clump together. Notice the cluster of three points in the top right corner. There are a few other instances of pairs of points being very close together. Also, there are fairly large areas that don’t contain any random points. The quasi-random points, by contrast, are better spread out. They have a self-avoiding property that keeps them from clustering, and they fill the space more efficiently.

Quasi-random sequences could be confused with pseudo-random sequences. They’re not at all the same thing. Pseudo-random sequences are computer-generated sequences that in many ways behave as if they were truly random, even though they were produced by deterministic algorithms. For many practical purposes, including sensitive statistical tests, pseudo-random sequences can stand in for truly random sequences. (The “truly” random points above were technically pseudo-random points.)

The quasi-random points above were part of a Sobol sequence, a common quasi-random sequence. Other quasi-random sequences include the Halton sequence and the Hammersley sequence. Mathematically, these sequences are defined as having low discrepancy. Roughly speaking, this means the “discrepancy” between the number of points that actually fall in a volume and the number of points you’d expect to fall in the same volume is small. See the Wikipedia article on quasi-random sequences for more mathematical details.
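
If you’d like to play with these yourself, here’s a minimal sketch of a Halton-style generator in pure Python. (The Sobol sequence used for the plots above takes more machinery; the Halton sequence is the easiest of the three to write by hand.)

def halton(index, base):
    # Element number `index` (counting from 1) of the van der Corput
    # sequence in the given base; a d-dimensional Halton point pairs
    # up these values for d distinct prime bases.
    result, f = 0.0, 1.0 / base
    while index > 0:
        result += f * (index % base)
        index //= base
        f /= base
    return result

# The first 50 quasi-random points in the unit square, bases 2 and 3.
points = [(halton(i, 2), halton(i, 3)) for i in range(1, 51)]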

Besides being aesthetically useful, quasi-random sequences are useful in applied mathematics. Because these sequences explore a space more efficiently than random sequences, quasi-random sequences sometimes lead to more efficient high-dimensional integration algorithms than Monte Carlo integration. Quasi-Monte Carlo integration, i.e. integration based on quasi-random sequences rather than random sequences, is popular in financial applications. Art Owen has written insightful papers on Quasi-Monte Carlo integration (QMC). He has classified which integration problems can be efficiently computed via QMC methods and which cannot. In a nutshell, QMC works well when the effective dimension of a problem is significantly lower than the actual dimension. For example, a financial model might ostensibly depend on 1000 variables, but 50 of those variables contribute far more to the integrand than all the other variables. The integrand might essentially be a function of only 50 variables. In that case, QMC will work well. Note that it is not necessary to identify these 50 variables or do any change of variables. QMC just magically takes advantage of the situation.
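
As a toy illustration (mine, not an example from Owen’s papers), here’s a comparison of plain Monte Carlo and quasi-Monte Carlo on a two-dimensional integral, reusing the halton function sketched above. The integral of xy over the unit square is exactly 1/4.

import random

f = lambda x, y: x * y   # exact integral over the unit square: 0.25
N = 1000

mc  = sum(f(random.random(), random.random()) for _ in range(N)) / N
qmc = sum(f(halton(i, 2), halton(i, 3)) for i in range(1, N + 1)) / N

print(abs(mc - 0.25))    # Monte Carlo error
print(abs(qmc - 0.25))   # QMC error, typically much smaller for smooth integrands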

One disadvantage of QMC integration is that it doesn’t naturally lead to an estimate of its own accuracy, unlike Monte Carlo integration. Several hybrid approaches have been proposed to combine QMC integration and Monte Carlo integration to get the efficiency of the former and the error estimates of the latter. For example, one could randomly jitter the quasi-random points or randomly permute their components. Some of these results are in Art Owen’s papers.

To read more about quasi-random sequences, see the book Random Number Generation and Quasi-Monte Carlo Methods.

What happened to XSLT?

Around 2000, some people believed that nearly all programming would become a matter of transporting and transforming XML: XML would be the universal data format, and software would exist largely to transform it. If that were the case, then it would be very handy to use a language designed especially for transforming XML. That language was XSLT.

I’m sure some programmers write XSLT on a daily basis, and there’s probably a large amount of machine-generated XSLT out there. But it’s safe to say XSLT never became extremely popular. Maybe the syntax is awkward; not many developers like to write lots of angle brackets. Or maybe most developers find the language’s declarative/functional style less comfortable than a more imperative style of programming. I don’t really know. I thought XSLT sounded like a good idea, but I never learned it. My experience with XSLT consists of a few dozen lines I wrote six or seven years ago.

Along these lines, I ran across a blog post entitled A picture is worth a thousand words: is XML dying? containing the following photo:

Sharps and flats in HTML

Apparently there’s no HTML entity for the flat symbol, ♭. In my previous post, I just spelled out B-flat because I thought that was safer; it’s possible not everyone would have the fonts installed to display B♭ correctly.

So how do you display music symbols for flat, sharp, and natural in HTML? You can insert any symbol if you know its Unicode value, though you run the risk that someone viewing the page may not have the necessary fonts installed to view the symbol. Here are the Unicode values for flat, natural, and sharp.

flat ♭ U+266D
natural ♮ U+266E
sharp ♯ U+266F

Since the flat sign has Unicode value U+266D, you could enter &#x266D; into HTML to display that symbol.

The sharp sign raises an interesting question. I’m sure most web pages referring to G-sharp would use the number sign # (U+0023) rather than the sharp sign ♯ (U+266F). And why not? The number sign is conveniently located on a standard keyboard and the sharp sign isn’t. It would be nice if people used sharp symbols rather than number signs. It would make it easier to search on specifically musical terms. But it’s not going to happen.

Update: See this post on font support for Unicode. Most people can see all three symbols, but some, especially Android users, might not see the natural sign.

Related posts:

Typesetting music in LaTeX and LilyPond

I tried typesetting music in LaTeX some time ago and gave up. The packages I found were hard to install, the examples didn’t work, etc. This weekend I decided to try again. I tried plowing through the MusiXTeX documentation and got no further than I did last time.

I posted a note on StackOverflow and got some good responses. Nikhil Chelliah suggested I look at LilyPond. I had looked at LilyPond before, and @jleedev explained how to integrate LaTeX and LilyPond.

Here’s some sheet music I included in my previous post, March in 7/4 time.

sheet music example

Here’s a full-sized PDF file version of the music above. And here’s the LilyPond source code used to create the music.

\relative c' {
\time 7/4
\key f \major
\clef treble
f g f \times 2/3{ c8 c c} f4 g a
g a8. bes16 a4 g f g c,
f g f \times 2/3{ c8 c c} f4 g a
g a8. bes16 a4 g f e f
}

The notation looks cryptic at first, but it makes sense after a few minutes. The command \relative c' means that the following pitches will be relative to middle C. For example, the first note, F, is the F closest to middle C. Each note is the same length as the previous note by default, and the first note is a quarter note by default. The notation c8 means that the C is an eighth note, except that it’s in the context of a triplet (\times 2/3) and so it’s an eighth-note triplet. The next F is denoted f4 to indicate that we’re back to quarter notes.

The notation a8. says that the A is a dotted eighth note. For the next note, bes16 means a B-flat sixteenth note. The suffix “es” stands for “flat” and “is” stands for “sharp.” (The documentation says it’s Dutch. I’ve never seen it before.) I don’t understand why I had to tell it that the B was flat. The code specified earlier that the key was F major, which implies B’s are flat. I suppose the code for individual notes is decoupled from the code to draw the key signature. That would make entering music painful in keys that have lots of sharps or flats. Maybe there’s a way to specify default sharps or flats.

The comma in c, gives the absolute pitch of the C. In relative mode, LilyPond assumes by default that each pitch name refers to the pitch closest to its predecessor. The C closest to the previous note, F, would have been the C up one fourth rather than down one fifth, so the comma was necessary to tell LilyPond to go down.

If I were to do a lot of music processing, I’d probably look at a commercial package such as Sibelius. But for now I’m just interested in producing small excerpts like that above, and it looks like LilyPond may be fine.

Update: I double checked the rules about flats etc. Yes, I do have to specify explicitly that the B in this example is B-flat. If I just say b rather than bes, LilyPond will add a natural sign in front of the B! It’s strange. It is aware of the key signature: when I tell it the B is flat, it says “OK, then I don’t have to mark that specially since it’s implicit in the key signature.” And if I don’t tell it the B is flat, it says “Oh, that’s an exception to the key signature. Better mark it with a natural sign.”

March in 7/4 time

After writing my post on music in 5/4 time, I remembered a march in 7/4 time that I played in band many years ago. Here’s an excerpt, about all I can remember.

sheet music example

In case the music above is too hard to read, here’s a full-sized PDF file version.

Marches are always in even meters: your left foot has to come down on the first beat of every measure when you’re marching. And yet this odd meter tune comes across as a convincing march. (It was a concert march. Actually marching to it would have been odd, pun intended.)

This march had a 4/4 + 3/4 feel, with emphasis on the first and fifth beats of each 7/4 measure.

I’ve just started blogging about music recently, and I’ve got a lot to learn. I’m not set up to record audio clips. My next post will describe the software I used to post the sheet music above.

Update: Many thanks to Nikhil Chelliah for identifying the march. It’s the first movement from Third Suite by Robert Jager. The sheet music and a sound clip are available here.

Related post: Blue Rondo à la Turk

Beatbox flute

Greg Pattillo plays beatbox flute. It’s hard to imagine what that means until you hear it. Here’s a video of Pattillo playing the theme from Sesame Street.

And here’s a video of Pattillo playing Peter and the Wolf.

Fractional derivatives

Is there a way to make sense of the nth derivative of a function when n is not a positive integer?

The notation f^(n) is usually introduced in calculus classes in order to make Taylor’s theorem easier to state:

f(x) = \sum_{n=0}^\infty f^{(n)}(0) \frac{x^n}{n!}

To make the above statement work, the 0th derivative is defined to be the function itself, i.e. don’t take any derivatives. This is a modest extension of the notation f^(n), from requiring n to be a positive integer to allowing n to be any non-negative integer. Can we make sense of the case when n is negative or non-integer? Before answering that question, let’s think about what the fractional derivative might be in some special cases if we could define it.

When we take derivatives of powers of x, we get factorial-like coefficients. The first derivative of x^m is m x^(m-1), the second derivative is m(m-1) x^(m-2), the third derivative is m(m-1)(m-2) x^(m-3), etc. We can use the gamma function to extend the factorial function to non-integer arguments, so maybe we could do the same to compute non-integer derivatives. If m > -1 and n is a positive integer, the nth derivative of x^m is (m!/(m-n)!) x^(m-n). We could rewrite this as (Γ(m+1) / Γ(m-n+1)) x^(m-n). The result holds for integer values of n, and so we could hope it holds for non-integer values of n.
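
To make that concrete, here’s a quick numerical check of the formula (my example; I’ll use SciPy’s gamma function since it’s already on hand). A classic special case is that the half-derivative of x is 2√(x/π).

from math import pi, sqrt
from scipy.special import gamma

def frac_deriv_power(m, n, x):
    # d^n/dx^n of x^m, using the gamma-function formula above:
    # Gamma(m+1) / Gamma(m-n+1) * x^(m-n)
    return gamma(m + 1) / gamma(m - n + 1) * x ** (m - n)

print(frac_deriv_power(1, 0.5, 2.0))   # half-derivative of x at x = 2
print(2 * sqrt(2.0 / pi))              # closed form; both print 1.5957691...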

If n is an integer and we take the nth derivative of e^(bx), we get b^n e^(bx). We might guess that for non-integer values of n the same formula holds.

It is indeed possible to define derivatives of order n for non-integer values of n, and the speculations above are correct, subject to some conditions. In fact there are several ways to define non-integer derivatives and the differences can be complicated.

What about negative derivatives? Well, it makes sense that these could be anti-derivatives, i.e. integrals. We could define, for example, the -3rd derivative of f(x) to be a function whose third derivative is f(x). However, anti-derivatives are only determined up to a constant. We could use the Fundamental Theorem of Calculus to uniquely specify anti-derivatives if we agree on a lower limit of integration c, such as c = 0 or maybe c = -∞.

f^{(-1)}(x) = \int_c^x f(t) \, dt

Here’s one way fractional derivatives could be defined. Suppose the Fourier transform of f(x) is g(ξ). Then for positive integer n, the nth derivative of f(x) has Fourier transform (2π i ξ)^n g(ξ). So you could take the nth derivative of f(x) as follows: take the Fourier transform, multiply by (2π i ξ)^n, and take the inverse Fourier transform. This suggests the same procedure could be used to define the nth derivative when n is not an integer.
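
Here’s a rough numerical illustration of that procedure (my sketch, assuming a smooth periodic function sampled on a uniform grid, and using NumPy’s FFT). Since the nth derivative of sin x is sin(x + nπ/2), the half-derivative should be sin(x + π/4).

import numpy as np

N = 256
x = np.linspace(0, 2 * np.pi, N, endpoint=False)
f = np.sin(x)

n = 0.5                                  # order of the derivative
xi = np.fft.fftfreq(N, d=x[1] - x[0])    # frequencies in cycles per unit length
deriv = np.fft.ifft((2j * np.pi * xi) ** n * np.fft.fft(f)).real

print(np.max(np.abs(deriv - np.sin(x + np.pi / 4))))   # agrees to ~1e-14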

Fractional derivatives have practical uses. The book An Atlas of Functions makes frequent use of fractional derivatives, especially derivatives of order 1/2 and -1/2, to show connections between different classes of functions.

Related article: Generalizing binomial coefficients

Shorter URLs by using Unicode

Tinyarro.ws is a service like tinyurl.com and others that shorten URLs. However, unlike similar services, Tinyarro.ws uses Unicode characters, allowing it to encode more possibilities into each character. These sub-compact URLs may contain Chinese characters, for example, or other symbols unfamiliar to many users. They’re no good for reading aloud, say over the phone or on a podcast. But they’re ideal for Twitter because you only have to click on the link, not type it into a browser.

Here’s a URL I got when I tried the Tinyarro.ws site:

screen shot from tinyarro.ws

The resulting URL may not display correctly in your browser depending on what fonts you have installed: http://➡.ws/㣸.

I pasted the URL into Microsoft Word and used Alt-x to see what the Unicode characters were. (See Three ways to enter Unicode characters in Windows.) The arrow is code point U+27A1 and the final character is code point U+38F8. I have no idea what that character means. I would appreciate someone letting me know in the comments.

Unicode character U+38F8
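
Incidentally, if you have Python handy, you can find a character’s code point without leaving the console:

# Print the code point of each character in the shortened URL.
for ch in u"➡㣸":
    print(hex(ord(ch)))   # 0x27a1, then 0x38f8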

Related post: How to insert graphics in Twitter messages