There’s a new math blog, Numerical Recipes. The initial post says the blog will be understandable to anyone with a high school math background, will solve all problems in Python, and will not use any external libraries. The blog will also be available in Spanish.
Posts tagged as:
Python
It’s difficult to use SciPy from IronPython because much of SciPy is implemented in C rather than in Python itself. I wrote an article on CodeProject summarizing some things I’d learned about using Python modules with IronPython. (Many thanks to folks who left helpful comments here and answered my questions on StackOverflow.) The article gives stand-alone code for computing normal probabilities and explains why I wrote my own code rather than using SciPy.
Here’s the article: Computing Normal Probabilities in IronPython
Here’s some more stand-alone Python code for computing mathematical functions and generating random numbers. The code should work in Python 2.5, Python 3.0, and IronPython.
Related post: IronPython is a one-way gate
{ 4 comments }
I often find it strange how some programming books arbitrarily classify content as either “beginner” or “advanced.” The advanced material may not be advanced at all. Maybe by “advanced” the author means “the stuff I didn’t learn right away.” However, Expert Python Programming by Tarek Ziadé objectively is an advanced book. I had heard good things about the book from the Python 411 podcast and had it on my list of books to buy when the book’s publisher gave me a copy to review.
Expert Python Programming assumes the reader has a solid knowledge of Python. In some ways, it is the Python counterpart to Scott Meyers’ Effective C++ series, giving advice about recommended practice. But Ziadé’s Python book is broader than Meyer’s C++ books because Expert Python Programming goes beyond best practices for language use. The latter chapters discuss how to package and distribute Python applications, software life cycle management, documentation, test-driven development, optimization, and design patterns. I’d like to see more books follow this pattern, covering these important topics in the context of a particular language or tool set. I was particularly pleased to see 27 pages devoted to documentation.
True to its name, Expert Python is not for beginners; I’d recommend Wesley Chun’s Core Python for a first book. But I’d heartily recommend Expert Python Programming as a second Python book. The book has an impressive range of practical material including numerous links to even more resources.
Update: Thomas Guest also reviewed this book.
{ 4 comments }
Sergey Fomel left a valuable comment on my post about computing the error function erf(x). He gave some sample code for exposing C functions to Python via SWIG on Linux. I haven’t used SWIG, but I’ve heard good things about it. I believe Google uses SWIG extensively to make C++ code callable from Python.
Here’s Sergey’s code.
bash$ cat erf.i %module erf #include double erf(double); bash$ swig -o erf_wrap.c -python erf.i bash$ gcc -o erf_wrap.os -c -fPIC -I/usr/include/python2.4 erf_wrap.c bash$ gcc -o _erf.so -shared erf_wrap.os bash$ python >>> from erf import erf >>> erf(1) 0.84270079294971489
{ 1 comment }
I’ve seen several people ask lately how to compute the distribution (CDF) function for a standard normal random variable, often denoted Φ(x). They want to know how to compute it in Java, or Python, or C++, etc. Every language has its own standard libraries, and in general I recommend using standard libraries. However, sometimes you want to minimize dependencies. Or maybe you want more transparency than your library allows. The code given here is in Python, but it is so compact that it could easily be ported to any other language.
I just posted Python code for computing the error function, erf(x). The normal density Φ(x) is a simple transformation of erf(x). Given code for erf(x), here’s code for Φ(x).
def phi(x):
return 0.5*( 1.0 + erf(x/math.sqrt(2)) )
After deriving the transformations between erf(x) and Φ(x) several times, including their complements and inverses, I wrote them down to save. See the PDF file Relating Φ and erf.
See also stand alone code for computing the inverse of the standard normal CDF.
{ 1 comment }
The question came up on StackOverflow this morning how to compute the error function erf(x) in Python. The standard answer for how to compute anything numerical in Python is “Look in SciPy.” However, this person didn’t want to take on the dependence on SciPy. I’ve seen variations on this question come up in several different contexts lately, including questions about computing the normal distribution function, so I thought I’d write up a solution.
Here’s a Python implementation of erf(x) based on formula 7.1.26 from A&S. The maximum error is below 1.5 × 10-7.
import math
def erf(x):
# constants
a1 = 0.254829592
a2 = -0.284496736
a3 = 1.421413741
a4 = -1.453152027
a5 = 1.061405429
p = 0.3275911
# Save the sign of x
sign = 1
if x < 0:
sign = -1
x = abs(x)
# A & S 7.1.26
t = 1.0/(1.0 + p*x)
y = 1.0 - (((((a5*t + a4)*t) + a3)*t + a2)*t + a1)*t*math.exp(-x*x)
return sign*y
This problem is typical in two ways: A&S has a solution, and you’ve got to know a little background before you can use it.
The formula given in A&S is only good for x ≥ 0. That’s no problem if you know that the error function is an odd function, i.e. erf(-x) = -erf(x). But if you’re an engineer who has never heard of the error function but needs to use it, it may take a while to figure out how to handle negative inputs.
One other thing that someone just picking up A&S might not know is the best way to evaluate polynomials. The formula appears as 1 – (a1t1 + a2t2 + a3t3 + a4t4 + a5t5)exp(-x2), which is absolutely correct. But directly evaluating an nth order polynomial takes O(n2) operations, while the factorization used in the code above uses O(n) operations. This technique is known as Horner’s method.
{ 8 comments }
My previous post showed how it is possible to do partial function application in C++ by using function objects. Here I’ll show how much simpler this can be done in Python.
As before, we want to evaluate a function of one variable and three parameters: f(x; a, b, c) = 1000a + 100b + 10c + x. Here’s an example how we could do this in Python using the functools module.
from functools import partial
def f(a, b, c, x):
return 1000*a + 100*b + 10*c + x
g = partial(f, 3, 1, 4)
print g(5)
The code will print 3145.
The function f above is so simple that it may be hard to imagine why you would want to do such a thing. The earlier post gives a more realistic application, using partial function application in numerical integration and root-finding.
{ 0 comments }
Fonts, translations, and programming languages have one thing in common: they work best when you don’t notice them.
If someone says “Hey, look at this cool font I just found!” you probably wouldn’t want to read a book set in that font. At least to an untrained eye, a great font will not stand out in a list of small samples. You have to see large blocks of text set in a font to appreciate it. Even then, most people will not consciously appreciate a very readable font.
Translations are similar. If you find yourself saying “What an interesting translation!” then the translator has probably fallen down on the job. A good translation is neither archaic nor trendy. It does not draw attention to itself but allows you to focus on the original content. I believe the English Standard Version achieves that with Bible translation.
Python is like a good font or a good translation. For years I’d look into Python briefly when someone would recommend it. I’d thumb through a Python book, but it all looked rather plain. Only later did I come to appreciate that the beauty of Python is that it is rather plain. It doesn’t call attention to itself. It just gets out of your way and lets you write programs. It seems to me that compared to other programming language communities, the Python community brags less about their language per se and more about what they’re able to do with it.
{ 4 comments }
I needed to find a spell checker I could call from Python, so I did a Google search and ran across GNU aspell. I tried installing it but got contradictory warning messages: aspell not installed, aspell already installed, etc. Then I remembered what an awful time I’d had before when I’d tried to use aspell and gave up.
Next I tried Ryan Kelly’s PyEnchant and it worked like a charm. I downloaded the installer for Windows and ran it. Then I opened up a Python console and typed an example following the online tutorial.
>>> import enchant
>>> d = enchant.Dict("en_US")
>>> d.check("Potatoe")
False
>>> d.check("Potato")
True
It just works.
{ 3 comments }
Sometimes code is easier to understand than abstract math. A few days ago I was having a hard time conveying bias, consistency, and efficiency in a statistics class. Writing some pseudo-code on the board seemed to help clear things up. Loops and calls to random number generation routines are more tangible than discussions about random samples.
Later I turned the pseudo-code into Python code (after all, Python is supposed to be “executable pseudo-code”) and fancied it up a bit. The following page gives some explanation, some plots of the output, and the source code.
The difference between an unbiased estimator and a consistent estimator
{ 0 comments }
Symmetric APIs are easier to use. I was reminded of this when doing some regular expression programming in Python and comparing it to Perl. Perl’s regular expression operators for search and replace are symmetric in a way that their Python counterparts are not.
Perl uses m/pattern/ for matching and s/pattern/replacement/ for substitution. Both apply to the first instance of a pattern in a string by default. The g option following a match or substitute operator causes the command to apply to all instances of the pattern. The i option after either a match or substitute command causes the pattern to apply in a case-insensitive manner. Matching and substitution are symmetric.
Python uses re.search() for matching and re.sub() for substitution. The search function can only apply to the first instance of a pattern; to match all instances of a pattern, use re.findall(). The function re.sub() applies to all instances by default, but it has a max parameter that can be set to limit the number of instances it applies to. To make a search pattern case-insensitive, pass in re.IGNORECASE flag. To make a substitution case-insensitive, modify the regular expression itself by adding (?i).
In general, I find Python syntax much cleaner than Perl, but regular expressions are implemented more elegantly in Perl.
{ 2 comments }
Some programming languages are much easier to come back to than others. In my previous post I mentioned that Mathematica is easy to come back to, put Perl is not.
I found it easy to come back LaTeX after not using it for a while. It has a few quirks, but it’s basically consistent. The LaTeX commands for Greek letters are their names, lower case names for lower case letters, upper case names for upper case letters. The command for a mathematical symbol is usually the name a mathematician would give the symbol. Modes always begin with \begin and end with \end.
Python also has a consistent syntax that make it easier to come back to the language after a break. Someone has said that Python is similar to Perl, except that the word “except” does not appear nearly so often in the Python documentation.
It’s more important that a language be internally consistent than conventional. Each of the languages I mentioned have their peculiarities. Mathematica uses square brackets for function argument arguments. LaTeX uses percent signs for comments. Python uses indention to denote blocks. Each of these take a little getting used to, but each makes sense in its own context.
A special case of consistency is using full names for keywords. Mathematica always spells out words in full. For example, the gamma distribution object is named GammaDistribution. I don’t mind a little extra typing. I’d rather optimize for recall and readability than minimize keystrokes since I spend more time recalling and reading than typing. (One flaw in LaTeX is that it occasionally uses unnecessary abbreviations. For example, \infty for infinity. The corresponding Mathematica keyword is Infinity.)
{ 4 comments }
Here are a few lessons learned from porting a numerical library recently from Windows/Visual C++ to Linux/gcc.
Some of our code only runs on Windows, and only needs to run on Windows. Our first thought was to put #ifdef WIN32 directives around that code. Clift Norris came up with the clever idea of using #ifndef EXCLUDE_WINDOWS_ONLY_CODE instead. That way we could do a preliminary test of the portable subset of the code while still working on Windows where we’re more comfortable. Also, by not referring specifically to 32-bit Windows, we’re OK moving the code to 64-bit Windows.
Visual C++ does not require source and header files to end with newline characters, but gcc does. We got hundreds of warnings of the form warning: no newline at end of file when we first attempted to compile our code on Linux. Apparently there’s no gcc switch to turn this off, and it may not be prudent to turn it off if you could. As I understand it, Visual Studio inserts a linebreak after including header files, but gcc may not and so gcc needs to issue a warning in this case while Visual Studio does not. We copy our source tree to a Linux box then run the following Python code on that box to insert the extra newline characters when needed.
import os
# List of directories of files to add newlines to.
# Script must be in the same location as these directories.
directories = ["Banana", "Apple", "Peach"]
for dir in directories:
for file in os.listdir(dir):
if file.endswith(".h") or file.endswith(".cpp"):
path = dir + "/" + file
handle = open(path, "r")
slurp = handle.read()
handle.close()
if not slurp.endswith("\n"):
retcode = os.system("chmod +w " + path)
if retcode != 0:
print "chmod returned " + retcode + " on " + path
else:
handle = open(path, "a")
handle.write("\n")
handle.close()
There were several places in our code where a variable was deliberately unused but retained in a function signature. Suppose a function has signature void foo(int a, int b) but b is unused. We had usually handled that by making b; the first line of the implementation. That would suppress unused variable warnings in Visual C++, but not in gcc. When we changed the function signature to void foo(int a, int /* b */), that made both compilers happy.
We started out using autoconf, but that was overkill for our project. Our build process became two orders of magnitude simpler when we switched over to a crude, old-fashioned make file. After porting the library to Linux, we built it on OS X without any issue.
This wasn’t an issue for us, but a potential problem when moving numerical code between Unix-like systems is that the function gamma computes different things on different systems. On Linux it computes the logarithm of the mathematical gamma function but on OS X it computes the gamma function itself. See the last two paragraphs of how to calculate binomial probabilities and Thomas Guest’s comment on that post for a full explanation.
This library had been ported to Linux years ago, but nobody used it on Linux and so development continued only on Windows. When we first ported the code, gcc and Visual C++ seemed to have incompatible requirements, especially with templates. The more recent port described here was much easier now that both compilers are more compliant with the C++ standard.
{ 3 comments }
Here’s some Python code to create a sitemap in the format specified by sitemaps.org and read by search engines. Download the file sitemapmaker.txt and change the extension from .txt to .py.
Change the url variable in the script before running it or else you’ll point search engines to my web site rather than yours. Also, edit the file extensions_to_keep variable if you want to index any file types besides HTML and PDF.
Copy the file sitemapmaker.py to the directory on your computer where you have your files. Run the script and direct its output to a file, sitemapmaker.py > sitemap.xml. See sitemaps.org for instructions on how to let search engines know about your sitemap.
This code assumes all the files to index in your sitemap are in one directory, the directory you run the script from. It also assumes the timestamps on your computer match those on your web server. Optional fields are left out of the sitemap.
{ 0 comments }

