Iterative linear solvers as metaphor

Gaussian elimination is systematic way to solve systems of linear equations in a finite number of steps. Iterative methods for solving linear systems require an infinite number of steps in theory, but may find solutions faster in practice.

Gaussian elimination tells you nothing about the final solution until it’s almost done. The first phase, factorization, takes O(n^3) steps, where n is the number of unknowns. This is followed by the back-substitution phase which takes O(n^2) steps. The factorization phase tells you nothing about the solution. The back-substitution phase starts filling in the components of the solution one at a time. In application n is often so large that the time required for back-substitution is negligible compared to factorization.

Iterative methods start by taking a guess at the final solution. In some contexts, this guess may be fairly good. For example, when solving differential equations, the solution from one time step gives a good initial guess at the solution for the next time step. Similarly, in sequential Bayesian analysis the posterior distribution mode doesn’t move much as each observation arrives. Iterative methods can take advantage of a good starting guess while methods like Gaussian elimination cannot.

Iterative methods take an initial guess and refine it to a better approximation to the solution. This sequence of approximations converges to the exact solution. In theory, Gaussian elimination produces an exact answer in a finite number of steps, but iterative methods never produce an exact solution after any finite number of steps. But in actual computation with finite precision arithmetic, no method, iterative or not, ever produces an exact answer. The question is not which method is exact but which method produces an acceptably accurate answer first. Often the iterative method wins.

Successful projects often work like iterative numerical methods. They start with an approximation solution and iteratively refine it. All along the way they provide a useful approximation to the final product. Even if, in theory, there is a more direct approach to a final product, the iterative approach may work better in practice.

Algorithms iterate toward a solution because that approach may reach a sufficiently accurate result sooner. That may apply to people, but more important for people is the psychological benefit of having something to show for yourself along the way. Also, iterative methods, whether for linear systems or human projects, are robust to changes in requirements because they are able to take advantage of progress made toward a slightly different goal.

Related post: Ten surprises from numerical linear algebra

Read More

Multiple zeta

The Riemann zeta function, introduced by Leonard Euler, is defined by

\zeta(k) = \sum n^{-k}

where the sum is over all positive integers n.

Euler also introduced a multivariate generalization of the zeta function

\zeta(k_1, \ldots, k_r) = \sum n_1^{-k_1}\cdots n_r^{-k_r}

where the sum is over all decreasing k-tuples of positive integers. This generalized zeta function satisfies the following beautiful identity:

 \zeta(a)\,\zeta(b) = \zeta(a, b) + \zeta(b, a) + \zeta(a+b)

The multivariate zeta function and identities such as the one above are important in number theory and are the subject of open conjectures.

Source

Read More

Benchmarking C++, Python, R, etc.

The other day Travis Oliphant pointed out an interesting paper: A Comparison of Programming Languages in Economics. The paper benchmarks several programming languages on a computational problem in economics.

All the usual disclaimers about benchmarks apply, your mileage may vary, etc. See the paper for details.

Here I give my summary of their summary of their results. The authors ran separate benchmarks on Mac and Windows. The results were qualitatively the same, so I just report the Windows results here.

Times in the table below are relative to the fastest C++ run.

Language Time
C++ 1.00
Java 2.10
Julia 2.70
CPython 155.31
Python with Numba 1.57
R 505.09
R using compiler package 243.38

 

The most striking result is that the authors were able to run their Python code 100x faster, achieving performance comparable to C++, by using Numba.

Read More

Weak static type systems

Comment by Simon Peyton Jones in an interview:

People often dislike static type systems because they’ve only met weak ones. A weak or not very expressive type system gets in your way all the time. It prevents you from writing functions you want to write that you know are fine. … The solution is not to abandon the type system but to make the type system more expressive.

In particular, he mentions Haskell’s polymorphic types and type inference as ways to make strong static typing convenient to use.

Read More

Closed-world expertise

Venkat Rao has an interesting take on the ideas of deliberate practice, flow, and the 10,000 hour rule. In The Deliberate Practice of Disruption he points out that these ideas of expertise are always presented in closed worlds.

The real problem is that research on expertise focuses on fields where “expertise” is a well-posed and objectively codified notion. This means mature fields that are closed and bounded, and can be easily observed, modeled and studied under laboratory conditions. So it is not surprising that the work … is based on fields like “medicine, music, chess and sports” … all sharply circumscribed and regulated domains.

Rao’s essay may be a bit too harsh on closed-world domains and a bit too romantic about open-world domains, but the distinction between the domains is important. You don’t become a successful entrepreneur, for example, the same way you become a successful violinist. Closed worlds place a much higher emphasis on error-elimination, at least initially, than do open worlds. In an open world, the concept of an error may not even make sense. Where there is no law there is no sin.

Consulting is a more open world than academia. As Rao notes, academia can close off an otherwise open world through “bureaucratic productivity measures like publications and citations.” Clients are happy if you solve their problems. They could not care less whether your solutions are publishable and all that implies. Original and thoroughly footnooted work that doesn’t solve their problems is not appreciated.

Clients are not going to give you an oral exam to see whether you’ve mastered some canon. And they don’t care if you cross academic boundary lines to use something “outside your field.” They do care about credentials sometimes, but in a pragmatic way: they may need someone with the right credentials to review something. In that case, your credentials are part of the solution.

Related posts:

How to know it all

Evaluate people at their best or at their worst?

Read More

Every exercise in the book

When I did an independent study course with Ted Odell, he told me to get a copy of De Vito’s Functional Analysis and work every exercise. I don’t recall whether I actually worked every problem, though I believe I at least did most of them. I heard of someone who learned algebraic geometry by working every problem in Hartshorne.

Doing all the exercises in a book isn’t a bad way to learn something, though it depends on the book, what you’re trying to accomplish, and on the quality and quantity of the exercises.

Have you ever gone through a book working every exercise? If so, what book? How was your experience?

 

 

Read More

Swedish superellipse

Sergels torg, Stockholm, Sweden. Photo via Wikipedia.

The fountain in Sergels torg (English: Sergel’s Square) in Sockholm is in the shape of a “superellipse.” Piet Hein proposed this shape as a practical and aesthetically pleasing compromise between a circle and a rectangle.

What Piet Hein dubbed a superellipse is a curve satisfying

|x/a|2.5 + |y/b|2.5 = 1

In the case of Sergels torg, a/b = 6/5. A superellipse is what an analyst would call an Lp ball with p = 2.5 and rescaled axes.

Incidentally, I used curves of this form in a dose-finding method a few years ago. A clinical trial design would pick a, b, and p to match a physician’s desired trade-off between probability of efficacy and probability of toxicity.

Update: A few minutes after I posted this, I realized that there was a bowl on my desk shaped like a superellipse:

The bowl came from Ikea, so there’s another connection to Sweden.

Related post: Volumes of Lp balls

Read More

The door was unlocked

From the dedication of C. S. Lewis’ A Preface to Paradise Lost:

Apparently the door of the prison was really unlocked all the time; but it was only you who thought of trying the handle.

In context he was saying that Charles Williams had recovered a clear understanding of Milton that had been obfuscated for over a century.

That line made me think of a quote from Alexander Grothendieck (that I haven’t been able to find this morning) to the effect that progress in mathematics has often waited for someone to be bold enough to ask a simple question or introduce a simple concept.

Update: Thanks to Roland Elliot for providing the quote I was missing. Ronald Brown says that Grothendieck said in a letter to him in 1982:

The introduction of the cipher 0 or the group concept was general nonsense too, and mathematics was more or less stagnating for thousands of years because nobody was around to take such childish steps …

Read More

New theme

I’ve changed the appearance of this blog. If you find anything that’s broken, or if you have suggestions for improvement, please let me know. As far as I can tell, it seems to work OK on multiple platforms.

I’m more likely to implement suggestions that come with specific details. For example, if you say “I don’t like your theme” then there’s not much I can do unless you suggest an alternative. But if you say something like “I’d suggest changing this line of CSS because …” I can at least try it out.

Read More

Haskell is to math as Perl is to English?

Fortran superficially looks like mathematics. Its name comes from “FORmula TRANslation,” and the language does provide a fairly straight-forward way to turn formulas into code. But the similarity to mathematics ends there at the lowest level of abstraction.

Haskell, on the other hand, is more deeply mathematical. Fortran resembles math in the small, but Haskell resembles math in the large. Haskell has mathematical structures at a higher level of abstraction than just formulas. As a recent article put it

While Fortran provides a comfortable translation of mathematical formulas, Haskell code begins to resemble mathematics itself.

On its surface, Perl is one of the least English-like programming languages. It is often criticized as looking like “line noise” because of its heavy use of operators. By contrast, Python has been called “executable pseudocode” because the source is easier to read, particularly for novice programmers. And yet at a deeper level, Perl is more English-like than other programming languages such as Python.

Larry Wall explains in Natural Language Principles in Perl that he designed Perl to incorporate features of spoken language. For example, Perl has analogs of articles and pronouns. (Larry Wall studied both linguistics and computer science in college.) Opinions differ on how well his experiment with incorporating natural language features into programming languages has worked out, but it was an interesting idea.

Related posts:

Haskell / Category theory decoder ring

Programming languages and magic

Extreme syntax

Three-hour-a-week language

Read More

New trade-offs

Technology is all about trade-offs.

I find it more plausible when someone says a new technology has new trade-offs than when someone says a new technology is “better.” Rarely does one thing improve on another by all criteria, but often one thing is an improvement on another by the criteria you value most.

If you want to persuade me to adopt something new, you’ll gain credibility by being candid about its drawbacks. Explain by what criteria you think the new thing is better, by what criteria it is worse, and why the former should matter more to me in my circumstances.

 

Read More

Sometimes definitions are enough

Sometimes you can apply math just by raiding it for vocabulary. You may not need to apply a single theorem.

This has been a surprise to me. I’m more used to creating a mathematical model so you can compute something or apply some theorem. But sometimes you can move a project along just by providing a name for a concept. A meandering discussion can snap into focus because someone has a name for an idea everyone vaguely understands.

Sometimes it may be clear that only part of a mathematical definition applies. In this case math can guide the discussion by asking whether the rest of the definition applies. “It sounds like we’ve got a widget here. A widget has to have these five properties and clearly we have the first three. Let’s think about whether the last two hold.” The answers don’t have to be positive to be useful. You might realize something important in the process of explaining why your thing is not a widget.

Sometimes a definition may not apply at all and still be useful! “This reminds me of a widget. It’s not a widget in any strict sense. But if it were, this is what we’d do next. I wonder whether we can do something like that.”

 

Read More