High-dimensional integration

Numerically integrating functions of many variables almost always requires Monte Carlo integration or some variation. Numerical analysis textbooks often emphasize one-dimensional integration and, almost as a footnote, say that you can use a product scheme to bootstrap one-dimensional methods to higher dimensions. True, but impractical.

Traditional numerical integration routines work well in low dimensions. (“Low” depends on context, but let’s say less than 10 dimensions.) When you have the choice of a well-understood, deterministic method, and it works well, by all means use it. I’ve seen people who are so used to problems requiring Monte Carlo methods that they use these methods on low-dimensional problems where they’re not the most efficient (or robust) alternative.

Monte Carlo

Except for very special integrands, high-dimensional integration requires either Monte Carlo integration or some variation, such as quasi-Monte Carlo methods.

As I wrote about here, naive Monte Carlo integration isn’t practical for high dimensions. It’s unlikely that any of your integration points will fall in the region of interest without some sort of importance sampler. Constructing a good importance sampler requires some understanding of your particular problem. (Sorry. No one-size-fits-all black-box methods are available.)
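
To make this concrete, here’s a sketch in Python. The sharply peaked integrand and the Gaussian proposal are examples I made up for illustration; a real problem needs a proposal shaped to wherever its integrand concentrates its mass.

    import numpy as np

    rng = np.random.default_rng(42)
    d = 20                 # dimension of the integral
    mu = np.full(d, 0.5)   # the integrand is concentrated near this point

    def f(x):
        # Sharply peaked integrand on the unit cube [0,1]^d.
        return np.exp(-100 * np.sum((x - mu)**2, axis=-1))

    n = 100_000

    # Naive Monte Carlo: uniform points almost never land near the peak.
    x = rng.uniform(size=(n, d))
    naive = f(x).mean()

    # Importance sampling: draw from a Gaussian centered on the peak and
    # reweight by 1/density. (The uniform density on the cube is 1.)
    sigma = 0.1
    y = rng.normal(mu, sigma, size=(n, d))
    inside = np.all((y >= 0) & (y <= 1), axis=1)  # restrict to the cube
    density = np.prod(
        np.exp(-((y - mu)**2) / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2),
        axis=1)
    importance = np.mean(f(y) * inside / density)

    print(naive, importance)

Here the naive estimate comes out many orders of magnitude too small because no sample lands near the peak, while the importance sampling estimate is within a few percent of the true value, which is on the order of 10^-15.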

Quasi-Monte Carlo (QMC)

Quasi-Monte Carlo (QMC) methods are interesting because they rely on low-discrepancy sequences: deterministic sequences that look something like random sequences but are more evenly distributed, and hence more effective at exploring a high-dimensional space. These methods work well when an integrand has low effective dimension. The effective dimension is low when nearly all the action takes place on a lower-dimensional manifold. For example, a function may ostensibly depend on 10,000 variables, while for practical purposes 12 of these are by far the most important. There are some papers by Art Owen that say, roughly, that quasi-Monte Carlo methods work well if and only if the effective dimension is much smaller than the nominal dimension. (QMC methods are especially popular in financial applications.)

While QMC methods can be more efficient than Monte Carlo methods, it’s hard to get bounds on the integration error with them. Hybrid approaches, such as randomized QMC, can perform remarkably well and give theoretical error bounds.
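
Here’s a sketch of the difference using SciPy’s scrambled Sobol generator, a form of randomized QMC. The integrand is a toy of my own: it nominally takes 10 variables but only looks at two of them, so its effective dimension is 2.

    import numpy as np
    from scipy.stats import qmc

    d, m = 10, 12          # nominal dimension; 2^12 = 4096 points
    n = 2**m

    def f(x):
        # Nominally a function of 10 variables, but only the first two
        # matter, so the effective dimension is 2.
        return np.exp(x[:, 0] + x[:, 1])

    exact = (np.e - 1)**2  # exact integral over the unit cube

    # Plain Monte Carlo
    rng = np.random.default_rng(0)
    mc = f(rng.uniform(size=(n, d))).mean()

    # Randomized QMC: a scrambled Sobol sequence
    sobol = qmc.Sobol(d=d, scramble=True, seed=0)
    rqmc = f(sobol.random_base2(m=m)).mean()

    print(abs(mc - exact), abs(rqmc - exact))

On this example the scrambled Sobol estimate is typically two or three orders of magnitude more accurate than plain Monte Carlo with the same number of points. And repeating the computation with several independently scrambled sequences turns the spread of the estimates into a statistical error bound, which is the point of randomizing.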

Markov Chain Monte Carlo (MCMC)

Markov Chain Monte Carlo (MCMC) is a common variation on Monte Carlo integration that uses dependent random samples. In many applications, such as Bayesian statistics, you’d like to be able to create independent random samples from some probability distribution, but this is not practical. MCMC is a compromise. It produces a Markov chain of dependent samples, and asymptotically these samples have the desired distribution. The dependent nature of the samples makes it difficult to estimate error and to determine how well the integration estimates using the Markov chain have converged.
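
Here’s a minimal random-walk Metropolis sketch in Python. The standard Gaussian target is a stand-in of mine so the answer can be checked; in a real application log_target would be an unnormalized log posterior.

    import numpy as np

    rng = np.random.default_rng(1)

    def log_target(x):
        # Unnormalized log density; here a standard Gaussian in d dimensions.
        return -0.5 * np.sum(x**2)

    d, n_steps, step = 5, 50_000, 0.5
    x = np.zeros(d)
    samples = np.empty((n_steps, d))

    for i in range(n_steps):
        proposal = x + step * rng.normal(size=d)
        # Accept with probability min(1, target(proposal) / target(x)).
        if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
            x = proposal
        samples[i] = x  # successive samples are dependent, not independent

    # Estimate E[ ||X||^2 ], whose exact value is d for a standard Gaussian.
    print(np.sum(samples**2, axis=1).mean())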

(Some people speak of the Markov chain itself converging. It doesn’t. The estimates using the Markov chain may converge. Speaking about the chain converging may just be using informal language, or it may betray a lack of understanding.)

There is little theory regarding the convergence of estimates from MCMC, aside from toy problems. An estimate can appear to have converged, while an important region of the integration problem remains unexplored. Despite its drawbacks, MCMC is the only game in town for many applications.

Consulting

If you’d like help with high-dimensional integration, or any other kind of numerical integration, please contact me to discuss how we might work together.

Algorithmic wizardry

Last week I wrote a short commentary on James Hague’s blog post Organization skills beat algorithmic wizardry. This week that post got more traffic than my server could handle. I believe it struck a chord with experienced software developers who know that the challenges they face now are not like the challenges they prepared for in school.

Although I completely agree that “algorithmic wizardry” is over-rated in general, my personal experience has been a little different. My role on projects has frequently been to supply a little bit of algorithmic wizardry. I’ve often been asked to look into a program that is taking too long to run and been able to speed it up by an order of magnitude or two by improving a numerical algorithm. (See an example here.)

James Hague says that “rarely is there some … algorithm that casts a looming shadow over everything else.” I believe he is right, though I’ve been called into projects precisely on those rare occasions when an algorithm does cast a shadow over everything else.

The most important skill in software development

Here’s an insightful paragraph from James Hague’s blog post Organization skills beat algorithmic wizardry:

When it comes to writing code, the number one most important skill is how to keep a tangle of features from collapsing under the weight of its own complexity. I’ve worked on large telecommunications systems, console games, blogging software, a bunch of personal tools, and very rarely is there some tricky data structure or algorithm that casts a looming shadow over everything else. But there’s always lots of state to keep track of, rearranging of values, handling special cases, and carefully working out how all the pieces of a system interact. To a great extent the act of coding is one of organization. Refactoring. Simplifying. Figuring out how to remove extraneous manipulations here and there.

Algorithmic wizardry is easier to teach and easier to blog about than organizational skill, so we teach and blog about it instead. A one-hour class, or a blog post, can showcase a clever algorithm. But how do you present a clever bit of organization? If you jump to the solution, it’s unimpressive. “Here’s something simple I came up with. It may not look like much, but trust me, it was really hard to realize this was all I needed to do.” Or worse, “Here’s a moderately complicated pile of code, but you should have seen how much more complicated it was before. At least now someone stands a shot at understanding it.” Ho hum. I guess you had to be there.

You can’t appreciate a feat of organization until you experience the disorganization. But it’s hard to have the patience to wrap your head around a disorganized mess that you don’t care about. Only if the disorganized mess is your responsibility, something that means more to you than a case study, can you wrap your head around it and appreciate improvements. This means that while you can learn algorithmic wizardry through homework assignments, you’re unlikely to learn organization skills unless you work on a large project you care about, most likely because you’re paid to care about it.


Launching missiles with Haskell

Haskell advocates are fond of saying that a Haskell function cannot launch missiles without you knowing it. Pure functions have no side effects, so they can only do what they purport to do. In a language that does not enforce functional purity, calling a function could have arbitrary side effects, including launching missiles. But this cannot happen in Haskell.

The difference between pure functional languages and traditional imperative languages is not quite that simple in practice.

Programming with pure functions is conceptually easy but can be awkward in practice. You could pass each function the state of the world before the call and have it return the state of the world after the call. But it’s unrealistic to pass a program’s entire state as an argument each time, so you’d like to pass around just the state you need, and to have a convenient way of doing so. You’d also like the compiler to verify that you’re passing around only a limited slice of the world. That’s where monads come in.

Suppose you want a function to compute square roots and to log its calls. Your square root function would have to take two arguments: the number to find the root of, and the state of the log before the function call. It would also have to return two values: the square root and the updated log. This is a pain, and it makes function composition difficult.
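
Here’s what that bookkeeping looks like in a small Python sketch of my own; Haskell’s actual machinery differs, but the pain is the same.

    import math

    def logged_sqrt(x, log):
        # Pure function: the log comes in as an argument, and the
        # updated log goes out as part of the return value.
        root = math.sqrt(x)
        return root, log + [f"sqrt({x}) = {root}"]

    # Composing two logged calls means threading the log by hand.
    log = []
    y, log = logged_sqrt(16.0, log)
    z, log = logged_sqrt(y, log)
    print(z, log)  # 2.0 ['sqrt(16.0) = 4.0', 'sqrt(4.0) = 2.0']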

Monads provide a sort of side-band for passing state around, things like our function call log. You’re still passing around the log, but you can do it implicitly using monads. This makes it easier to call and compose functions that do logging. It also lets the compiler check that you’re passing around a log but not arbitrary state. A function that updates a log, for example, can affect the state of the log, but it can’t do anything else. It can’t launch missiles.

Once monads get large and complicated, it’s hard to know what side effects they hide. Maybe they can launch missiles after all. You can only be sure by studying the source code. Now how do you know that calling a C function, for example, doesn’t launch missiles? You study the source code. In that sense Haskell and C aren’t entirely different.

The Haskell compiler does give you assurances that a C compiler does not. But ultimately you have to study source code to know what a function does and does not do.

Related post: Monads are hard because …

Information hiding

One of the basic principles of software development is information hiding. People agree that it’s desirable, but they may not realize they have different ideas of what it means. And when done poorly, well-meaning attempts to make software more maintainable backfire. Leo Brodie cautions:

… we should clarify. From what, or whom, are we hiding information?

[T]raditional languages … bend over backwards to ensure that modules hide internal routines and data structures from other modules. The goal is to achieve module independence (a minimum coupling). The fear seems to be that modules strive to attack each other like alien antibodies. Or else, that evil bands of marauding modules are out to clobber the precious family data structures.

This is not what we’re concerned about. The purpose of hiding information, as we mean it, is simply to minimize the effects of a possible design-change by localizing things that might change within each component.

Quote from Thinking Forth. Emphasis added.


Managing Emacs windows

When you have Emacs split into multiple windows, how do you move your cursor between windows? How do you move the windows around?

Update (27 Jan 2015): You may want to skip the post below and use ace-window instead; I plan to use it rather than my solution. Also see Sacha Chua’s video on window management, including using ace-window.

Moving the cursor between windows

You can use C-x o to move the cursor to the “other” window. That works great when you only have two windows: C-x o toggles between them.

When you have more windows, C-x o moves to the next window in top-to-bottom, left-to-right order. So if you have five windows, for example, you have to go to the “other” window up to four times to move to each of the windows.

For example, suppose Emacs is split into four windows labeled A through D, with the cursor in window A. Repeatedly executing C-x o moves the cursor through the remaining windows in layout order; in the arrangement I had in mind, that meant C, then B, then D.

There are more direct commands:

  • windmove-up
  • windmove-down
  • windmove-left
  • windmove-right

You might map these to Control or Alt plus the arrow keys to have a memorable key sequence to navigate windows. However, such keybindings may conflict with existing keybindings, particularly in org-mode.

Moving windows

You can move windows around with the buffer-move package. (Technically the windows stay fixed and the buffer contents move around.)

Super and Hyper keys

Since the various Control arrow keys and Alt arrow keys are spoken for in my configuration, I thought I’d create Super and Hyper keys so that Super-up would map to windmove-up etc.

Several articles recommend mapping the Windows key to Super and the Menu key to Hyper on Windows. The former is problematic because so many of the Windows key combinations are already used by the OS. I did get the latter to work by adding the following to the branch of my init.el that runs on Windows.

        ;; Keep Windows from intercepting the Menu (apps) key
        ;; and treat it as the Hyper modifier inside Emacs.
        (setq w32-pass-apps-to-system nil
              w32-apps-modifier 'hyper)

        ;; Navigate between windows with Hyper plus the arrow keys.
        (global-set-key (kbd "<H-up>")    'windmove-up)
        (global-set-key (kbd "<H-down>")  'windmove-down)
        (global-set-key (kbd "<H-left>")  'windmove-left)
        (global-set-key (kbd "<H-right>") 'windmove-right)

I haven’t figured out yet how to make Super or Hyper keys work on Linux. Suggestions for setting up Super and Hyper on Linux or Windows would be much appreciated.

Striving for simplicity, arriving at complexity

This post is a riff on a line from Mathematics without Apologies, the book I quoted yesterday.

In an all too familiar trade-off, the result of striving for ultimate simplicity is intolerable complexity; to eliminate too-long proofs we find ourselves “hopelessly lost” among the too-long definitions. [emphasis added]

It’s as if there’s some sort of conservation of complexity, but not quite in the sense of a physical conservation law. Conservation of momentum, for example, means that if one part of a system loses 5 units of momentum, other parts of the system have to absorb exactly 5 units of momentum. But perceived complexity is psychological, not physical, and the accounting is not the same. By moving complexity around we might increase or decrease the overall complexity.

The opening quote suggests that complexity is an optimization problem, not an accounting problem. It also suggests that driving the complexity of one part of a system to its minimum may disproportionately increase the complexity of another part. Striving for the simplest possible proofs, for example, could make the definitions much harder to digest. There’s a similar dynamic in programming languages and programs.

Larry Wall said that Scheme is a beautiful programming language, but every Scheme program is ugly. Perl, on the other hand, is ugly, but it lets you write beautiful programs. Scheme can be simple because it requires libraries and applications to implement functionality that is built into more complex languages. I had similar thoughts about COM. It was an elegant object system that led to hideous programs.

Scheme is a minimalist programming language, and COM is a minimalist object framework. By and large the software development community prefers complex languages and frameworks in hopes of writing smaller programs. Additional complexity in languages and frameworks isn’t felt as strongly as additional complexity in application code. (Until something breaks. Then you might have to explore parts of the language or framework that you had blissfully ignored before.)

The opening quote deals specifically with the complexity of theorems and proofs. In context, the author was saying that the price of Grothendieck’s elegant proofs was a daunting edifice of definitions. (More on that here.) Grothendieck may have taken this to extremes, but many mathematicians agree with the general approach of pushing complexity out of theorems and into definitions. Michael Spivak defends this approach in the preface to his book Calculus on Manifolds.

… the proof of [Stokes’] theorem is, in the mathematician’s sense, an utter triviality — a straight-forward calculation. On the other hand, even the statement of this triviality cannot be understood without a horde of definitions … There are good reasons why the theorems should all be easy and the definitions hard. As the evolution of Stokes’ theorem revealed, a single simple principle can masquerade as several difficult results; the proofs of many theorems involve merely stripping away the disguise. The definitions, on the other hand, serve a twofold purpose: they are rigorous replacements for vague notions, and machinery for elegant proofs. [emphasis added]

Mathematicians like to push complexity into definitions, just as software developers like to push it into languages and frameworks. Both strategies can make life easier on professionals while making it harder on beginners.

Related post: A little simplicity goes a long way

Regular expression resources

Continuing the series of resource posts each Wednesday, this week we have notes on regular expressions.

See also blog posts tagged regular expressions and the RegexTip Twitter account.

Last week: Probability resources

Next week: Numerical computing resources

Four brief reviews

Princeton University Press and No Starch Press both sent me a couple books this week. Here are a few brief words about each.

The first from Princeton was The Best Writing on Mathematics 2014 (ISBN 0691164177). My favorite chapters were The Beauty of Bounded Gaps by Jordan Ellenberg and The Lesson of Grace in Teaching by Francis Su. The former is a very high-level overview of recent results regarding gaps between prime numbers. The latter is taken from Su’s Haimo Teaching Award lecture. A recording of the lecture and a transcript are available here.

The second book from Princeton was a new edition of Andrew Hodges’ book Alan Turing: The Enigma (ISBN 069116472X). This edition has a new cover and the new subtitle “The Book That Inspired the Film ‘The Imitation Game.’” Unfortunately I’m not up to reading a 768-page biography right now.

The first book from No Starch Press was a new edition of The Book of CSS3: A Developer’s Guide to the Future of Web Design by Peter Gasston (ISBN 1593275803). The book says from the beginning that it is intended for people who have a lot of experience with CSS, including some experience with CSS 3. I tend to ignore such warnings; many books are more accessible to beginners than they let on. But in this case I do think that someone with more CSS experience would get more out of the book. This looks like a good book, and I expect I’ll get more out of it later.

The final book was a new edition of How Linux Works: What Every Superuser Should Know by Brian Ward (ISBN 1593275676). I’ve skimmed through this book and would like to go back and read it carefully, a little at a time. Most Unix/Linux books I’ve seen either dwell on shell commands or dive into system APIs. This one, however, seems to live up to its title and give the reader an introduction to how Linux works.

“Hello world” is the hard part

Kernighan and Ritchie’s classic book The C Programming Language began with a sample C program that printed “hello world.” Since then “hello world” has come to describe the first program you write with any technology, even if it doesn’t literally print “hello world.”

Hello-world programs are often intimidating. People think “I must be a dufus because I find hello-world hard. At this rate I’ll never get to anything interesting.”

The problem is that we confuse the first task with the easiest task. Hello-world programs are almost completely arbitrary. You can’t deduce what a compiler is named, where files must be located, how they must be formatted, etc. You have to be told. The amount of arbitrary material you need to learn is greatest up-front and slowly decreases.

When I started programming I thought I’d quickly get past the hello-world stage and only write substantial programs from then on. Instead, it seems I’ve spent a good chunk of my career writing hello-world programs with no end in sight.

* * *

No discussion of hello-world programs would be complete without mentioning possibly the most intimidating hello-world program of all: the first Windows program in Charles Petzold’s Programming Windows book. I could only find the version from the Windows 98 edition of his book. I don’t recall exactly how it differs from the program in the first edition, but I vaguely remember the original being worse.


/*------------------------------------------------------------
   HELLOWIN.C -- Displays "Hello, Windows 98!" in client area
                 (c) Charles Petzold, 1998
  ------------------------------------------------------------*/

#include <windows.h>

LRESULT CALLBACK WndProc (HWND, UINT, WPARAM, LPARAM) ;

int WINAPI WinMain (HINSTANCE hInstance, HINSTANCE hPrevInstance,
                    PSTR szCmdLine, int iCmdShow)
{
     static TCHAR szAppName[] = TEXT ("HelloWin") ;
     HWND         hwnd ;
     MSG          msg ;
     WNDCLASS     wndclass ;

     wndclass.style         = CS_HREDRAW | CS_VREDRAW ;
     wndclass.lpfnWndProc   = WndProc ;
     wndclass.cbClsExtra    = 0 ;
     wndclass.cbWndExtra    = 0 ;
     wndclass.hInstance     = hInstance ;
     wndclass.hIcon         = LoadIcon (NULL, IDI_APPLICATION) ;
     wndclass.hCursor       = LoadCursor (NULL, IDC_ARROW) ;
     wndclass.hbrBackground = (HBRUSH) GetStockObject (WHITE_BRUSH) ;
     wndclass.lpszMenuName  = NULL ;
     wndclass.lpszClassName = szAppName ;

     if (!RegisterClass (&wndclass))
     {
          MessageBox (NULL, TEXT ("This program requires Windows NT!"), 
                      szAppName, MB_ICONERROR) ;
          return 0 ;
     }
     
     hwnd = CreateWindow (szAppName,                  // window class name
                          TEXT ("The Hello Program"), // window caption
                          WS_OVERLAPPEDWINDOW,        // window style
                          CW_USEDEFAULT,              // initial x position
                          CW_USEDEFAULT,              // initial y position
                          CW_USEDEFAULT,              // initial x size
                          CW_USEDEFAULT,              // initial y size
                          NULL,                       // parent window handle
                          NULL,                       // window menu handle
                          hInstance,                  // program instance handle
                          NULL) ;                     // creation parameters
     
     ShowWindow (hwnd, iCmdShow) ;
     UpdateWindow (hwnd) ;
     
     while (GetMessage (&msg, NULL, 0, 0))
     {
          TranslateMessage (&msg) ;
          DispatchMessage (&msg) ;
     }
     return msg.wParam ;
}

LRESULT CALLBACK WndProc (HWND hwnd, UINT message, WPARAM wParam, LPARAM lParam)
{
     HDC         hdc ;
     PAINTSTRUCT ps ;
     RECT        rect ;
     
     switch (message)
     {
     case WM_CREATE:
          PlaySound (TEXT ("hellowin.wav"), NULL, SND_FILENAME | SND_ASYNC) ;
          return 0 ;
          
     case WM_PAINT:
          hdc = BeginPaint (hwnd, &ps) ;
          
          GetClientRect (hwnd, &rect) ;
          
          DrawText (hdc, TEXT ("Hello, Windows 98!"), -1, &rect,
                    DT_SINGLELINE | DT_CENTER | DT_VCENTER) ;
          
          EndPaint (hwnd, &ps) ;
          return 0 ;
          
     case WM_DESTROY:
          PostQuitMessage (0) ;
          return 0 ;
     }
     return DefWindowProc (hwnd, message, wParam, lParam) ;
}

Software development becoming less mature?

Michael Fogus posted on Twitter this morning:

Computing: the only industry that becomes less mature as more time passes.

The immaturity of computing is used to excuse every ignorance. There’s an enormous body of existing wisdom but we don’t care.

I don’t know whether computing is becoming less mature, though it may well be on average, even as individual developers become more mature.

One reason is that computing is a growing profession, so people are entering the field faster than they are leaving. That lowers average maturity.

Another reason is chronological snobbery, alluded to in Fogus’s second tweet. Chronological snobbery is pervasive in contemporary culture, but especially in computing. Tremendous hardware advances give the illusion that software development has advanced more than it has. What could I possibly learn from someone who programmed back when computers were 100x slower? Maybe a lot.
