The success of OOP

Allen Wirfs-Brock gave the following defense of OOP a few days ago in a series of six posts on Twitter:

A young developer approached me after a conf talk and said, “You must feel really bad about the failure of object-oriented programming.” I was confused. I said, “What do you mean that object-orient programming was a failure. Why do you think that?”

He said, “OOP was supposed to fix all of our software engineering problems and it clearly hasn’t. Building software today is just as hard as it was before OOP. came along.”

“Have you ever look at the programs we were building in the early 1980s? At how limited their functionality and UIs were? OOP has been an incredible success. It enabled us to manage complexity as we grew from 100KB applications to today’s 100MB applications.”

Of course OOP hasn’t solved all software engineering problems. Neither has anything else. But OOP has been enormously successful in allowing ordinary programmers to write much larger applications. It has become so pervasive that few programmers consciously think about it; it’s simply how you write software.

I’ve written several posts poking fun at the excesses of OOP and expressing moderate enthusiasm for functional programming, but I appreciate OOP. I believe functional programming will influence object oriented programming, but not replace it.

Related:

Life lessons from differential equations

Ten life lessons from differential equations:

  1. Some problems simply have no solution.
  2. Some problems have no simple solution.
  3. Some problems have many solutions.
  4. Determining that a solution exists may be half the work of finding it.
  5. Solutions that work well locally may blow up when extended too far.

More differential equations posts

 

Numerators of harmonic numbers

Harmonic numbers

The nth harmonic number, Hn, is the sum of the reciprocals of the integers up to and including n. For example,

H4 = 1 + 1/2 + 1/3 + 1/4 = 25/12.

Here’s a curious fact about harmonic numbers, known as Wolstenholme’s theorem:

For a prime p > 3, the numerator of Hp−1 is divisible by p2.

The example above shows this for p = 5. In that case, the numerator is not just divisible by p2, it is p2, though this doesn’t hold in general. For example, H10 = 7381/2520. The numerator 7381 is divisible by 112 = 121, but it’s not equal to 121.

Generalized harmonic numbers

The generalized harmonic numbers Hn,m are the sums of the reciprocals of the first n positive integers, each raised to the power m. Wolstenholme’s theorem also says something about these numbers too:

For a prime p > 3, the numerator of Hp−1,2 is divisible by p.

For example, H4,2 = 205/144, and the numerator is clearly divisible by 5.

Computing with Python

You can play with harmonic numbers and generalized harmonic numbers in Python using SymPy. Start with the import statement

from sympy.functions.combinatorial.numbers import harmonic

Then you can get the nth harmonic number with harmonic(n) and generalized harmonic numbers with harmonic(n, m).

To extract the numerators, you can use the method as_numer_denom to turn the fractions into (numerator, denominator) pairs. For example, you can create a list of the numerators of the first 10 harmonic numbers with

[harmonic(n).as_numer_denom()[0] for n in range(10)]

What about 0?

You might notice that harmonic(0) returns 0, as it should. The sum defining the harmonic numbers is empty in this case, and empty sums are defined to be zero.

 

High-dimensional integration

Numerically integrating functions of many variables almost always requires Monte Carlo integration or some variation. Numerical analysis textbooks often emphasize one-dimensional integration and, almost as a footnote, say that you can use a product scheme to bootstrap one-dimensional methods to higher dimensions. True, but impractical.

Traditional numerical integration routines work well in low dimensions. (“Low” depends on context, but let’s say less than 10 dimensions.) When you have the choice of a well-understood, deterministic method, and it works well, by all means use it. I’ve seen people who are so used to problems requiring Monte Carlo methods that they use these methods on low-dimensional problems where they’re not the most efficient (or robust) alternative.

Monte Carlo

Except for very special integrands, high dimensional integration requires either Monte Carlo integration or some variation such as quasi-Monte Carlo methods.

As I wrote about here, naive Monte Carlo integration isn’t practical for high dimensions. It’s unlikely that any of your integration points will fall in the region of interest without some sort of importance sampler. Constructing a good importance sampler requires some understanding of your particular problem. (Sorry. No one-size-fits-all black-box methods are available.)

Quasi-Monte Carlo (QMC)

Quasi-Monte Carlo methods (QMC) are interesting because they rely on something like a random sequence, but “more random” in a sense, i.e. more effective at exploring a high-dimensional space. These methods work well when an integrand has low effective dimension. The effective dimension is low when nearly all the action takes place on a lower dimensional manifold. For example, a function may ostensibly depend on 10,000 variables, while for practical purposes 12 of these are by far the most important. There are some papers by Art Owen that say, roughly, that Quasi-Monte Carlo methods work well if and only if the effective dimension is much smaller than the nominal dimension. (QMC methods are especially popular in financial applications.)

While QMC methods can be more efficient than Monte Carlo methods, it’s hard to get bounds on the integration error with these methods. Hybrid approaches, such as randomized QMC can perform remarkably well and give theoretical error bounds.

Markov Chain Monte Carlo (MCMC)

Markov Chain Monte Carlo (MCMC) is a common variation on Monte Carlo integration that uses dependent random samples. In many applications, such as Bayesian statistics, you’d like to be able to create independent random samples from some probability distribution, but this is not practical. MCMC is a compromise. It produces a Markov chain of dependent samples, and asymptotically these samples have the desired distribution. The dependent nature of the samples makes it difficult to estimate error and to determine how well the integration estimates using the Markov chain have converged.

(Some people speak of the Markov chain itself converging. It doesn’t. The estimates using the Markov chain may converge. Speaking about the chain converging may just be using informal language, or it may betray a lack of understanding.)

There is little theory regarding the convergence of estimates from MCMC, aside from toy problems. An estimate can appear to have converged, while an important region of the integration problem remains unexplored. Despite its drawbacks, MCMC is the only game in town for many applications.

***

Numerical integration consulting

 

Scientific computing in Python

Scientific computing in Python is expanding and maturing rapidly. Last week at the SciPy 2015 conference there were about twice as many people as when I’d last gone to the conference in 2013.

You can get some idea of the rapid develop of the scientific Python stack and its future direction by watching the final keynote of the conference by Jake VanderPlas.

I used Python for a while before I discovered that there were so many Python libraries for scientific computing. At the time I was considering learning Ruby or some other scripting language, but I committed to Python when I found out that Python has far more libraries for the kind of work I do than other languages do. It felt like I’d discovered a secret hoard of code. I expect it would be easier today to discover the scientific Python stack. (It really is becoming more of an integrated stack, not just a collection of isolated libraries. This is one of the themes in the keynote above.)

When people ask me why I use Python, rather than languages like Matlab or R, my response is that I do a mixture of mathematical programming and general programming. I’d rather do mathematics in a general programming language than do general programming in a mathematical language.

One of the drawbacks of Python, relative to C++ and related languages, is speed. This is a problem in languages like R as well. However, with Python there are ways to speed up code without completely rewriting it, such as Cython and Numba. The only reliable way I’ve found to make R much faster, is to rewrite it in another language.

Another drawback of Python until recently was that data manipulation and exploration were not as convenient as one would hope. However, that has changed due to developments such as Pandas, initiated by Wes McKinney. For more on how that came to be and where it’s going, see his keynote from the second day of SciPy 2015.

It’s not clear why Python has become the language of choice for so many people in scientific computing. Maybe if people like Travis Oliphant had decided to use some other language for scientific programming years ado, we’d all be using that language now. Python wasn’t intended to be a scientific programming language. And as Jake VanderPlas points out in his keynote, Python still is not a scientific programming language, but the foundation for a scientific programming stack. Maybe Python’s strength is that it’s not a scientific language. It has drawn more computer scientists to contribute to the core language than it would have if it had been more of a domain-specific language.

* * *

If you’d like help moving to the Python stack, please let me know.

Contact info diagram

My contact information arranged into a diagram:

Yesterday at SciPy 2015 Allen Downey did something similar for his contact info and gave me the idea for the image above.

LinkedIn doesn’t quite fit; you have to know that LinkedIn profiles stick linkedin.com/in/ before the username.

When the last digits of powers don’t change

If you raise any integer to the fifth power, its last digit doesn’t change. For example, 25 = 32.

It’s easy to prove this assertion by brute force. Since the last digit of bn only depends on the last digit of b, it’s enough to verify that the statement above holds for 0, 1, 2, …, 9.

Although that’s a valid proof, it’s not very satisfying. Why fifth powers? We’re implicitly working in base 10, as humans usually do, so what is the relation between 5 and 10 in this context? How might we generalize it to other bases?

The key is that 5 = φ(10) + 1 where φ(n) is the number of positive integers less than n and relatively prime to n.

Euler discovered the φ function and proved that if a and m are relatively prime, then

aφ(m) = 1 (mod m)

This means that aφ(m) − 1 is divisible by m. (Technically I should use the symbol ≡ (U+2261) rather than = above since the statement is a congruence, not an equation. But since Unicode font support on various devices is disappointing, not everyone could see the proper symbol.)

Euler’s theorem tells us that if a is relatively prime to 10 then a4 ends in 1, so a5 ends in the same digit as a. That proves our original statement for numbers ending in 1, 3, 7, and 9. But what about the rest, numbers that are divisible by 2 or 5? We’ll answer that question and a little more general one at the same time. Let m = αβ where α and β are distinct primes. In particular, we could choose α = 2 and β = 5; We will show that

aφ(m)+1 = a (mod m)

for all a, whether relatively prime to m or not. This would show in addition, for example, that in base 15, every number keeps the same last “digit” when raised to the 9th power since φ(15) = 8.

We only need to be concerned with the case of a being a multiple of α or β since Euler’s theorem proves our theorem for the rest of the cases. If a = αβ our theorem obviously holds, so assume a is some multiple of α, a = kα with k less than β (and hence relatively prime to β).

We need to show that αβ divides

(kα)φ(αβ)+1kα.

This expression is clearly divisible by α, so the remaining task is to show that it is divisible by β. We’ll show that

(kα)φ(αβ) − 1

is divisible by β.

Since α and k are relatively prime to β, Euler’s theorem tells us that αφ(β) and kφ(β) are congruent to 1 (mod β). This implies that kαφ(β) is congruent to 1, and so kαφ(α)φ(β) is also congruent to 1 (mod β). One of the basic properties of φ is that for relatively prime arguments α and β, φ(αβ) = φ(α)φ(β) and so we’re done.

Exercise: How much more can you generalize this? Does the base have to be the product of two distinct primes?

Related: Applied number theory

No, I’m not a bot.

Periodically someone on Twitter will suggest that one of my Twitter accounts is a bot. Others will reply in the second person plural, suggesting that there’s a group of people behind one of the accounts. These accounts aren’t run by a bot or a committee, just me.

I do use software to schedule my tweets in advance. Most of the tweets from my personal account are live. Most of the tweets from my topic accounts are scheduled, though some are live. All replies are manual, not automated, and I don’t scrape content from anywhere.

Occasionally I read the responses to these accounts and sometimes I reply. But with over half a million followers (total, not unique) I don’t try to keep up with all the responses. If you’d like to contact me, you can do so here. That I do keep up with.

claimtoken-5595a093e11d8