Posts tagged as:

Python

This is a continuation of my previous post, Running Python and R inside Emacs. That post shows how to execute independent code blocks in Emacs org-mode. This post illustrates calling one code block from another, each written in a different language. [click to continue...]

{ 1 comment }

Running Python and R inside Emacs

by John on February 9, 2012

Emacs org-mode lets you manage blocks of source code inside a text file. You can execute these blocks and have the output display in your text file. Or you could export the file, say to HTML or PDF, and show the code and/or the results of executing the code.

Here I’ll show some of the most basic possibilities. For much more information, see  orgmod.org. And for the use of org-mode in research, see A Multi-Language Computing Environment for Literate Programming and Reproducible Research.

[click to continue...]

{ 3 comments }

People are invariably surprised when they hear it’s hardly ever necessary to invert a matrix. It’s very often necessary solve linear systems of the form Ax = b, but in practice you almost never do this by inverting A. This post will give an example of avoiding matrix inversion. I will explain how the Newton-Conjugate Gradient method works, implemented in SciPy by the function fmin_ncg.

[click to continue...]

{ 2 comments }

How to compute jinc(x)

by John on February 2, 2012

The function jinc(x) that I wrote about yesterday is almost trivial to implement, but not quite. I’ll explain why it’s not quite as easy as it looks and how one might implement it in C and Python. [click to continue...]

{ 8 comments }

The Python ecosystem

by John on December 7, 2011

The hard part about getting started with Python is not the language but the ecosystem. It’s easy to find good references on the Python language, but it’s harder to learn what packages are available, how to install them, etc. That was my experience, and Miz Nazim started with a similar observation in his article Python Ecosystem: An Introduction.

Maybe its always harder to learn a language’s ecosystem than the language itself. But I think this was the case for me with Python more than with other languages I’ve used. I wish I’d found Nazim’s article or something like it when I was learning Python.

Related links:

Getting started with SciPy
Sage beginner’s guide
Python is a voluntary language
SciPyTip on Twitter

{ 13 comments }

Python is a voluntary language

by John on October 26, 2011

People who write Python choose to write Python.

I don’t hear people say “I use Python at work because I have to, but I’d rather be writing Java.” But often I do hear people say they’d like to use Python if their job would allow it. There must be someone out there writing Python who would rather not, but I think that’s more common with other languages.

My point isn’t that everyone loves Python, but rather that those who don’t care for Python simply don’t write it.

Since Python isn’t a common choice for enterprise software projects, it can resist the pressure to be all things to all people. Having a “Benevolent Dictator for Life” also helps Python maintain conceptual integrity. Python is popular enough to have a critical mass of users, but not so popular that it is under pressure to lose its uniqueness.

I don’t know much about the Ruby world, but I wonder whether the increasing popularity of Ruby for web development has created pressure for Ruby to compromise its original philosophy. And I wonder whether Ruby’s creator Yukihiro Matsumoto has “dictatorial” control over his language analogous to the control Guido van Rossum has over Python.

Related posts:

Plain Python
Ruby, Python, and science

{ 28 comments }

Leading digits of factorials

by John on October 19, 2011

Suppose you take factorials of a lot of numbers and look at the leading digit of each result. You could argue that there’s no apparent reason that any digit would be more common than any other, so you’d expect each of the digits 1 through 9 would come up 1/9 of the time. Sounds plausible, but it’s wrong.

The leading digits of factorials follow Benford’s law as described in the previous post. In fact, factorials follow Benford’s law even better than physical constants do. Here’s a graph of the leading digits of the factorials of 1 through 500.

In the remainder of this post, I’ll explain why Benford’s law should apply to factorials, make an aside on statistics, and point out an interesting feature of the Python code used to generate the chart above.

Why Benford’s law applies

Here’s a hand-waving explanation. One way to justify Benford’s law is to say that physical constants are uniformly distributed, but on a logarithmic scale. The same is true for factorials, and it’s easier to see why.

The leading digits of the logarithms depend on on their logarithms in base 10. The gamma function extends the factorial function and it is log-convex. The logarithm of the gamma function is fairly flat (see plot here), and so the leading digits of the log-gamma function applied to integers are uniformly distributed on a logarithmic scale.  (I’ve mixed logs base 10 and natural logs here, but that doesn’t matter. All logarithms are the same up to a multiplicative constant. So if a plot is nearly linear on a log10 scale, it’s nearly linear on a natural log scale.)

Update: Graham gives a link in the comments below to a paper proving that factorials satisfy Benford’s law exactly in the limit.

Uniform on what scale?

This example brings up an important principle in statistics. Some say that if you don’t have a reason to assume anything else, use a uniform distribution. For example, some say that a uniform prior is the ideal uninformative prior for Bayesian statistics. But you have to ask “Uniform on what scale?” It turns out that the leading digits of physical constants and factorials are indeed uniformly distributed, but on a logarithmic scale.

Python integers and floating point

I used nearly the same code to produce the chart above as I used in its counterpart in the previous post. However, one thing had to change: I couldn’t compute the leading digits of the factorials the same way. Python has extended precision integers, so I can compute 500! factorial without overflowing. Using floating point numbers, I could only go up to 170!. But when I used my previous code to find the leading digit, it first tried to apply log10 to an integer larger than the largest representable floating point number and failed. Converting numbers such as 500! to floating point numbers will overflow. (See Floating point numbers are a leaky abstraction.)

The solution was to find the leading digit using only integer operations.

def leading_digit_int(n):
    while n > 9:
        n = n/10
    return n

This code works fine for numbers like 500! or even larger.

Related posts:

Benford’s law and SciPy
Physical constants and factorials
Anatomy of a floating point number

{ 13 comments }

Benford’s law and SciPy

by John on October 19, 2011

Imagine you picked up a dictionary and found that the pages with A’s were dirty and the Z’s were clean. In between there was a gradual transition with the pages becoming cleaner as you progressed through the alphabet. You might conclude that people have been looking up a lot of words that begin with letters near the beginning of the alphabet and not many near the end.

That’s what Simon Newcomb did in 1881, only he was looking at tables of logarithms. He concluded that people were most interested in looking up the logarithms of numbers that began with 1 and progressively less interested in logarithms of numbers beginning with larger digits. This sounds absolutely bizarre, but he was right. The pattern he described has been repeatedly observed and is called Benford’s law. (Benford re-discovered the the same principle in 1938, and per Stigler’s law, Newcomb’s observation was named after Benford.)

Benford’s law predicts that for data sets such as collections of physical constants, about 30% of the numbers will begin with 1 down to about 5% starting with 8 or 9. To be precise, it says the leading digit will be d with probability log10(1 + 1/d). For a good explanation of Benford’s law, see TAOCP volume 2.

A couple days ago I blogged about using SciPy’s collection of physical constants to look for values that were approximately factorials. Let’s look at that set of constants again and see whether the most significant digits of these constants follows Benford’s law.

Here’s a bar chart comparing the actual number of constants starting with each digit to the results we would expect from Benford’s law.

Here’s the code that was used to create the data for the chart.

from math import log10, floor
from scipy.constants import codata

def most_significant_digit(x):
    e = floor(log10(x))
    return int(x*10**-e)

# count how many constants have each leading digit
count = [0]*10
d = codata.physical_constants
for c in d:
    (value, unit, uncertainty) = d[c]
    x = abs(value)
    count[ most_significant_digit(x) ] += 1
total = sum(count)

# expected number of each leading digit per Benford's law
benford = [total*log10(1 + 1./i) for i in range(1, 10)]

The chart itself was produced using matplotlib, starting with this sample code.

The actual counts we see in scipy.constants line up fairly well with the predictions from Benford’s law. The results are much closer to Benford’s prediction than to the uniform distribution that you might have expected before hearing of Benford’s law.

Update: See the next post for an explanation of why factorials also follow Benford’s law.

Related posts:

Physical constants and factorials
Slide rules

{ 6 comments }

Physical constants and factorials

by John on October 17, 2011

The previous post mentioned that Avogadro’s constant is approximately 24!. Are there other physical constants that are nearly factorials?

[click to continue...]

{ 11 comments }

Code to slice open a Menger sponge

by John on August 30, 2011

Last month the New York Times ran a story about a sculpture based on cutting open a “Menger sponge,” a shape formed by recursively cutting holes through a cube. All the holes are rectangular, but when you cut the sponge open at an angle, you see six-pointed stars.

Here are some better photos, including both a physical model and a computer animation. Thanks to Mike Croucher for the link.

I’ve written some Python code to take slices of a Menger sponge. Here’s a sample output.

[click to continue...]

{ 3 comments }

A couple Python-like features in C++11

by John on August 17, 2011

The new C++ standard includes a couple Python-like features that I ran across recently. There are other Python-like features in the new standard, but here I’ll discuss range-based for-loops and raw strings.

[click to continue...]

{ 9 comments }

Sage Beginner’s Guide

by John on July 12, 2011

I like books. Given a choice, I’d much rather read a book than online documentation. Typically a book speaks with one voice and has been more carefully edited. Unfortunately, it can be hard to find books on specialized software. That’s why I was glad to hear there’s a book on Sage, a project that integrates many Python libraries for mathematical computing into a single context.

Craig Finch’s book Sage Beginner’s Guide provides an easy-to-read overview of Sage. The book is filled with examples. In fact, every topic is introduced by an example. Explanations follow the examples in sections entitled “What just happened?”. Follow-up exercises are provided to solidify the material after the example and explanation.

Someone could begin using Sage without knowing Python. They could think of Sage as an open source Mathematica-like application. But one of the strengths of Sage is that its underlying language is Python. And the Sage Beginner’s Guide has two chapters devoted to the Python language, one basic and one advanced.

Finch’s book is primarily focused on Sage as a whole, not the Python components it integrates. However, it does point out the component libraries that provide certain functionality when either the constituent library conflicts with Sage or can be used to refine Sage functionality.

Sage can be challenging to install. It is not yet directly supported on Windows; the recommended way to use Sage on Windows is to download a Linux virtual machine with Sage installed. I was able to install Sage on Ubuntu but not yet on OS X. However, you can try out Sage without installing it by using Sage online notebooks.

I don’t have as much experience with Sage as with some of its components. As far as I can tell, it’s easy to take your experience from component libraries — say NumPy — and bring it over to Sage. It would be harder to take functionality you discovered while using Sage and use it outside of Sage since, though that is to be expected since part of the value Sage adds is smoothing over the peculiarities of each component library.

{ 7 comments }

How to fit an elephant

by John on June 21, 2011

John von Neumann famously said

With four parameters I can fit an elephant, and with five I can make him wiggle his trunk.

By this he meant that one should not be impressed when a complex model fits a data set well. With enough parameters, you can fit any data set.

It turns out you can literally fit an elephant with four parameters if you allow the parameters to be complex numbers.

I mentioned von Neumann’s quote on StatFact last week and Piotr Zolnierczuk replied with reference to a paper explaining how to fit an elephant:

“Drawing an elephant with four complex parameters” by Jurgen Mayer, Khaled Khairy, and Jonathon Howard,  Am. J. Phys. 78, 648 (2010), DOI:10.1119/1.3254017.

Piotr also sent me the following Python code he’d written to implement the method in the paper. This code produced the image above.

"""
Author: Piotr A. Zolnierczuk (zolnierczukp at ornl dot gov)

Based on a paper by:
Drawing an elephant with four complex parameters
Jurgen Mayer, Khaled Khairy, and Jonathon Howard,
Am. J. Phys. 78, 648 (2010), DOI:10.1119/1.3254017
"""
import numpy as np
import pylab

# elephant parameters
p1, p2, p3, p4 = (50 - 30j, 18 +  8j, 12 - 10j, -14 - 60j )
p5 = 40 + 20j # eyepiece

def fourier(t, C):
    f = np.zeros(t.shape)
    A, B = C.real, C.imag
    for k in range(len(C)):
        f = f + A[k]*np.cos(k*t) + B[k]*np.sin(k*t)
    return f

def elephant(t, p1, p2, p3, p4, p5):
    npar = 6
    Cx = np.zeros((npar,), dtype='complex')
    Cy = np.zeros((npar,), dtype='complex')

    Cx[1] = p1.real*1j
    Cx[2] = p2.real*1j
    Cx[3] = p3.real
    Cx[5] = p4.real

    Cy[1] = p4.imag + p1.imag*1j
    Cy[2] = p2.imag*1j
    Cy[3] = p3.imag*1j

    x = np.append(fourier(t,Cx), [-p5.imag])
    y = np.append(fourier(t,Cy), [p5.imag])

    return x,y

x, y = elephant(np.linspace(0,2*np.pi,1000), p1, p2, p3, p4, p5)
pylab.plot(y,-x,'.')
pylab.show()

Related posts:

Advantages of crude models
Occam’s razor and Bayes theorem

{ 8 comments }

Stand-alone scientific code

by John on June 7, 2011

Sometimes you need one or two scientific functions not included in your programming environment. For a number of possible reasons, you do not want to depend on an external library. For example, maybe you don’t want to take the time to evaluate libraries. Or maybe you want to give someone else a small amount of self-contained code. Here is a collection of code for these situations.

Stand-alone code for numerical computing

This page contains C++, Python, and C# code for special functions and random number generation with no external dependencies. Do whatever you want with it, no strings attached. Use at your own risk. I recently added software for gamma and log gamma functions, as well as a few random number generators. (Why separate functions for the gamma function and its logarithm? See explanation here.)

I don’t recommend using this code as a way to avoid learning a good library. If you’re writing Python, for example, I’d recommend using SciPy. But there are times when the advantages of being self-contained outweigh the advantages of using high-quality libraries.

Related posts:

Mathematical functions that seem unnecessary
SciPyTip: Daily tips on using scientific computing in Python
C# math gotchas

{ 8 comments }

Eric Floehr is the owner of ForecastWatch, a company that evaluates the accuracy of weather forecasts. In this interview Eric explains what his business does, how he got started, and some of the technology he uses.

[click to continue...]

{ 10 comments }