Interview with author Cliff Pickover

A few weeks ago, Sterling Publishing sent me a copy of Cliff Pickover's new book The Math Book. I enjoyed reading the book (see my review) and set up the interview that follows.

Clifford Pickover photo

JC: The Math Book is your first book that I’ve read. Is it typical of your writing? How would you summarize the topics you’ve written about?

CP: My past 40 books cover many different topics. A number of these books concern the beauty of mathematics. Others cover topics at the borderlands of science, roaming far and wide on topics ranging from creativity, art, mathematics, and human intelligence, to higher dimensions, religion, strange realities, time travel, alien life, and science fiction. You can see a listing of my other books here. This should give your readers a flavor for the kinds of topics on which I enjoy writing.

Of course, The Math Book is serious mathematics, but I hope I’ve introduced an element of art and playfulness as well — the topics flow from fractals, to Rubik’s cube robots, to the infinite monkey theorem! For me, mathematics cultivates a perpetual state of wonder about the nature of mind, the limits of thoughts, and our place in this vast cosmos.

With respect to my other books, some of which may be more at the fringes of science, I’d point out that “fringe” research is crucial — not just for its educational value but because significant discoveries can come from such study. At first glance, some topics in science or sociology in my other works may appear to be curiosities, with little practical application or purpose. However, I have found these experiments useful and educational, as have the many students, educators, artists, and scientists who have written to me. In fact, science is filled with hundreds of great discoveries that have emerged through chance happenings and serendipity, for example: Velcro, Teflon, X-rays, penicillin, nylon, safety glass, sugar substitutes, dynamite, and polyethylene plastics.

Several of my past books explore a variety of topics to test your curiosity and powers of lateral thinking. Robert Pirsig wrote in Zen and the Art of Motorcycle Maintenance, “It’s the sides of the mountain which sustain life, not the top. Here’s where things grow.” This also applies to the joy that writers experience when letting their minds drift and when wondering about humanity’s place in the universe.


Beltrami’s pseudosphere by Paul Nylander, included in The Math Book

JC: You’ve written a lot of books, especially for someone who has a full-time job in addition to writing. How do you manage your time?

CP: When people ask me how I manage my time, I reply: “Some people play golf on the weekends. Instead, I prefer to write.” Of course, my prolific writing pales in comparison to American novelist, lawyer, and workaholic Erle Stanley Gardner (1889-1970), who once worked on seven novels simultaneously and dictated 66,000 words a week! Gardner would never start to dictate until he had worked out the entire plot of his novel. He actually hired six secretaries to handle his dictation, which he found more efficient than typing. His best-known works focus on the lawyer-detective Perry Mason.

I don’t know how writers like Isaac Asimov were so prolific before the age of the computer. I would have a very difficult time writing books, and doing all the necessary text rearrangements and editing, without a word processor. According to the New York Public Library Desk Reference (4th ed.), Isaac Asimov wrote over 400 books and is the only author with a book included in every major Dewey-decimal category. I sit in awe of Asimov, but a few people have exceeded his book output. Lauran Paine (b. 1916) has published over 900 books under more than 90 pen names. Paine spent his youth working as a cowboy, and today at least 500 of his books are Westerns.

JC: How do you write? Do you have a set schedule and place for writing? Anything unusual about your environment or equipment?

CP: French writer Marcel Proust composed his books in a haphazard fashion. He did not start at the beginning and finish at the end. He did not write linearly. Instead, ideas came to him in flashes as he went about his daily routine. Most of my own books are composed in the same way. As ideas come to me during the day or in the realm between sleep and wakefulness, I jot them down and continue to fill in details in the book. For me, writing is exactly like painting, adding a spot of color here, a detail there, a twig on this tree, a bit of foam on that ocean wave… No painter starts at the top of the painting and finishes at the bottom.

My approach to filling in detail, like a painter dabbing paint, is fine in the age of word processors, but it was amazing that Proust used the same approach so well. He would dictate to his stenographers who would type an initial manuscript. Then, he would crowd the margins with additional details and establish links between scenes and characters. He would paste in new pages and have the new work typed again and again. Edmund White notes in his biography of Proust, “If any writer would have benefited from a word processor, it would have been Proust, whose entire method consisted of adding details here and there and of working on all parts of his book at once.” As for my books, there’s nothing special about the tools I use and nothing special about my environment. These days, I use Microsoft Word.

JC: Are you writing a book now?

CP: I am finishing a book in the style of The Math Book — one page of text facing one page of illustration. Entries are in chronological order. Let’s wait to see how well The Math Book sells. If it sells a sufficient number of copies, perhaps I can convince a publisher to consider this newer work that covers a particular array of topics in science, art, history, and popular culture.

JC: Would you be interested in writing a computer science analog of The Math Book?

CP: I very much enjoyed creating The Math Book with my publisher, Sterling, and the $19 price offered by Amazon.com is amazing for a 528-page, all-color hardcover. I would welcome doing another book of this kind if we feel that such a book has not been done before and that it is marketable.

JC: Who are some of your favorite authors, either for content or style?

CP: My favorite tales of parallel worlds are those of Robert Heinlein. For example, in his science-fiction novel The Number of the Beast there is a parallel world that appears identical to ours in every respect except that the letter "J" does not appear in the English language. Luckily, the protagonists in the book have built a device that lets them perform controlled explorations of parallel worlds from the safety of their high-tech car. This is my favorite novel, and the only one that I've read over five times — although I could never finish it the first few times. It's a novel that many readers dislike, can't finish, or don't understand. The final section is nearly incomprehensible. But for me, it provides a sense of mystic transport as the brainy characters enter parallel worlds, fleeing from danger.

JC: There is a scene in the movie Good Will Hunting where Robin Williams’ character, Sean, asks Matt Damon’s character, Will, what he likes to read. Will’s response is “Hey, whatever blows your hair back.” What blows your hair back? Any books, blogs, podcasts, etc. that you turn to for inspiration?

CP: These days, I’m enjoying CDs and DVDs from The Teaching Company – on subjects ranging from the history of mathematics, to the history of the world, to an introduction to Judaism. Some of their classes on the history of mathematics are awesome mind-bogglers.

My most popular blog, Reality Carnival, highlights the kinds of topics and stories that interest me.

JC: Your writing indicates you have broad interests. Have you struggled to find where you want to be along the continuum between Renaissance man and specialist?

CP: I prefer to be a generalist. In fact, if I had to manage a foundation that gives money to scientists, I would also consider high-quality "generalists" as recipients. Experts have become very specialized, and science popularizers are often frowned upon by their more "serious" colleagues. Sometimes, specialists develop blind spots after years of intense focus on a single topic. Thus, I would devote a portion of my money to training "generalists" who traverse several fields and then bring together ideas in ways that specialists may be unable to do. They will also look for overlaps between different domains of research and try to solve shared problems with a single approach. As our rate of technological progress skyrockets in the 21st century, these facilitators will study the multidisciplinary implications of this acceleration and work on technologies or new ways of seeing that help humanity assimilate advances that outstrip our comprehension and the restrictions of our intuition.


Work expands to the time allowed

Yesterday I found a copy of Parkinson's Law for $1 at a library book sale. This book is best known for its opening line: Work expands so as to fill the time available for its completion.

Dust jacket of the book Parkinson's Law and Other Studies in Administration

The name “Parkinson’s law” can mean at least four different things:

  1. The 1957 book by C. Northcote Parkinson
  2. The first chapter of Parkinson’s book
  3. The principle expressed in the book’s opening line, as understood by Parkinson
  4. The principle in the opening line as understood today

I’d heard of the general principle of Parkinson’s law a few years ago. I only found out about the book more recently. I didn’t know until last night that Parkinson intended his principle to be applied more narrowly than it is applied now.

The full title of the first chapter of the book is "Parkinson's Law, or The Rising Pyramid." This chapter explains how work expands to fill the available resources within a bureaucracy and why bureaucracies grow at a compounding rate of around 5% per year. The subtitle addresses the mechanism for this growth: bureaucrats creating a pyramid of subordinates. Parkinson derives his law from "two almost axiomatic statements":

  1. An official wants to multiply subordinates, not rivals.
  2. Officials make work for each other.

Nowadays Parkinson's law is usually condensed to saying work expands to the time allowed. It is applied to individuals as well as burgeoning bureaucracies. Parkinson discusses this interpretation in his opening paragraph but then limits his attention to organizations.

The total effort that would occupy a busy man for three minutes all told may in this fashion leave another person prostrate after a day of doubt, anxiety, and toil.

Chapter 3 of Parkinson's Law is "High Finance, or The Point of Vanishing Interest." This chapter is the source of the phrase "bike shed arguments." In it, Parkinson states what he calls the Law of Triviality:

… the time spent on any item of the agenda will be in inverse proportion to the sum involved.

The idea is that people are more likely to contribute to the discussion of things they understand. A nuclear reactor will sail through the finance committee, but a bicycle shed will cause endless debate because everyone can understand it and everyone has an opinion.

I picked up a copy of Mrs. Parkinson’s Law at the same book sale, also for $1. I’d never heard of it before, but I imagine it will be entertaining.


Mortgages, banks, and Jensen’s inequality

Sam Savage's new book The Flaw of Averages has a brilliantly simple explanation of why volatility in the housing market caused such problems for banks recently. When housing prices drop, more people default on their mortgages, and obviously that hurts banks. But banks are also in trouble when the housing market is volatile, even if house prices are good on average.

Suppose there’s no change in the housing market on average. Prices go up in some areas and down in other areas. As long as the ups and downs average out, there should be no change in bank profits, right? Wrong!

When the housing market goes up a little bit, mortgage defaults drop a little bit and bank profits go up a little bit. But when the market goes down a little bit, defaults go up more than just a little bit, and bank profits go down more than a little bit. There is much more down-side potential than up-side potential. Say 95% of homeowners pay their mortgages. Then a good housing market can only improve repayments by 5%. But a bad housing market could decrease repayments by much more.

In mathematical terminology, the bank profits are a concave function of house prices. Jensen’s inequality says that if f() is a concave function (say bank profits) and X is a random variable (say, house prices) then the average of f(X) is less than f(average of X). Average profit is less than the profit from the average.
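The concavity can be made concrete with a toy model. In the sketch below (my own simplification, not Savage's), the repayment rate when the market moves by x is f(x) = min(0.95 + x, 1), so the upside is capped at full repayment while the downside is not:

```python
# Toy illustration of Jensen's inequality for a concave payoff.
# f(x) = fraction of mortgages repaid when the housing market moves by x.
# The min() cap makes f concave: upside is limited, downside is not.

def repayment_rate(x):
    return min(0.95 + x, 1.0)

# Market goes up 10% or down 10% with equal probability, so the average move is 0.
up, down = 0.10, -0.10
average_of_f = 0.5 * (repayment_rate(up) + repayment_rate(down))
f_of_average = repayment_rate(0.5 * (up + down))

print(average_of_f)   # 0.925
print(f_of_average)   # 0.95
```

Even though the market is unchanged on average, average repayments fall from 95% to 92.5%, exactly as Jensen's inequality predicts.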

The Flaw of Averages: Why We Underestimate Risk in the Face of Uncertainty by Sam Savage


Book review: The Math Book

The Math Book by Cliff Pickover proves you can't judge a book by its cover.

The cover has three strikes against it.

  1. The title could hardly be less interesting.
  2. The book cover substitutes a couple Greek letters for Roman letters, a gimmick I find annoying.
  3. At first glance, the dust jacket is just text on a black background.

In short, there’s nothing about this book that would make it jump out at you in a book store. That is a shame, because the book is quite interesting inside. Not only is the content well written, the images are stunning. Aside from its dull cover, it would make a great coffee table book.

The Math Book contains 250 one-page articles on milestones in the history of math. Each article is followed by a related full-page color image. The articles are short and self-contained, so the book is easy to read a few minutes at a time. It’s also fun just to thumb through enjoying the photography and computer-generated images. The book would be a good wedding gift for a mathematician marrying an artist.

The Math Book is written for a general audience; much of the book would be accessible to someone without much background in math. But it also holds a few surprises even for experienced mathematicians.

Update: See Interview with Cliff Pickover

More on colors and grayscale

My previous post gave three algorithms for converting color to grayscale. This post gives more examples and details.

The image below is a screenshot from an Excel spreadsheet illustrating color values and how they convert to grayscale. The R, G, and B columns are the red, green, and blue component values of the color sample in the leftmost column. The columns labeled "Li", "Lu", and "Avg" are the grayscale values of the color using the lightness, luminosity, and average algorithms from the previous post.

The grayscale color samples were created by asking Excel to set the background color to (X, X, X) where X is the grayscale value. For example, the background color for the “Lu” column of the first row is (54, 54, 54) since 54 is the luminosity value for pure red.

To verify the algorithms, I converted the screenshot above to a grayscale image using GIMP. The gray cells remain unchanged because all three algorithms leave gray alone; when all three RGB values are equal, it's clear from the formulas that the grayscale value becomes the common value. The color cells in the first column become the shade of gray predicted and hence match the column of gray cells for that algorithm.
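As a check on the spreadsheet values, the three formulas from the previous post are easy to compute directly. A minimal sketch, reproducing the row for pure red (including the 54 in the "Lu" column):

```python
# Compute the three grayscale values for an (R, G, B) color,
# rounded to the nearest integer as in the spreadsheet.

def lightness(r, g, b):
    return round((max(r, g, b) + min(r, g, b)) / 2)

def luminosity(r, g, b):
    return round(0.21 * r + 0.72 * g + 0.07 * b)

def average(r, g, b):
    return round((r + g + b) / 3)

red = (255, 0, 0)
print(lightness(*red), luminosity(*red), average(*red))  # 128 54 85
```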

Using lightness:

image converted using the lightness algorithm

Using luminosity:

image converted using the luminosity algorithm

Using average:

image converted using the average algorithm

Related post: Three algorithms for converting color to grayscale

Three algorithms for converting color to grayscale

How do you convert a color image to grayscale? If each color pixel is described by a triple (R, G, B) of intensities for red, green, and blue, how do you map that to a single number giving a grayscale value? The GIMP image software has three algorithms.

The lightness method averages the most prominent and least prominent colors: (max(R, G, B) + min(R, G, B)) / 2.

The average method simply averages the values: (R + G + B) / 3.

The luminosity method is a more sophisticated version of the average method. It also averages the values, but it forms a weighted average to account for human perception. We’re more sensitive to green than other colors, so green is weighted most heavily. The formula for luminosity is 0.21 R + 0.72 G + 0.07 B.
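For readers who want to experiment, all three formulas are easy to apply per pixel. Here is a sketch (my own code, not GIMP's implementation) using NumPy on an array of RGB values:

```python
import numpy as np

def to_grayscale(img, method="luminosity"):
    """img: array of shape (..., 3) holding R, G, B channels."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    if method == "lightness":
        # average of the most and least prominent channels
        return (np.maximum(np.maximum(r, g), b) + np.minimum(np.minimum(r, g), b)) / 2
    if method == "average":
        return (r + g + b) / 3
    # luminosity: weighted toward green to match human perception
    return 0.21 * r + 0.72 * g + 0.07 * b

pixel = np.array([[0.0, 255.0, 0.0]])  # pure green
print(to_grayscale(pixel, "lightness"))   # [127.5]
print(to_grayscale(pixel, "luminosity"))  # ≈ [183.6]
```

Note how differently the two methods treat pure green: lightness ignores which channel is bright, while luminosity weights green heavily.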

The example sunflower images below come from the GIMP documentation.

Original image: color photo of a sunflower
Lightness: the sunflower converted to grayscale using the lightness algorithm
Average: the sunflower converted to grayscale using the average algorithm
Luminosity: the sunflower converted to grayscale using the luminosity algorithm

The lightness method tends to reduce contrast. The luminosity method works best overall and is the default method used if you ask GIMP to change an image from RGB to grayscale from the Image -> Mode menu. However, some images look better using one of the other algorithms. And sometimes the three methods produce very similar results.

Update: See More on colors and grayscale for more details and more examples.

Magic, stupidity, and malice

When you mix this quote from Arthur C. Clarke

Any sufficiently advanced technology is indistinguishable from magic.

with Hanlon's razor

Never attribute to malice that which can be adequately explained by stupidity.

you get Grey’s law

Any sufficiently advanced incompetence is indistinguishable from malice.

Update: Thanks to Wedge for leaving a comment identifying the last quote as Grey’s law.

Saving up for an avocado

Ellen Finn describes how she quit her job and exhausted her retirement savings to become a musician when she was around 50 years old.

I was totally broke. I was living on beans and I know thousands of bean recipes. It’s scary at any age, but it’s particularly scary in your fifties when all my friends are retiring and my goal is to save up for an avocado.

The quote comes from the BrightSideBroadcast podcast featuring her music.


Power laws and the generalized CLT

Here's an excerpt from a recent ACM Ubiquity interview with David Alderson that raises a few questions.

Actually, they [power laws] aren’t special at all. They can arise as natural consequences of aggregation of high variance data. You know from statistics that the Central Limit Theorem says distributions of data with limited variability tend to follow the Normal (bell-shaped, or Gaussian) curve. There is a less well-known version of the theorem that shows aggregation of high (or infinite) variance data leads to power laws. Thus, the bell curve is normal for low-variance data and the power law curve is normal for high-variance data. In many cases, I don’t think anything deeper than that is going on.

In this post I will explain the theory I believe Alderson is alluding to in his informal remarks. I’ll also explain some restrictions necessary for this theory to hold.

I don’t understand what Alderson has in mind when he refers to data with high but finite variance. If the variance is large but finite, the classical Central Limit Theorem (CLT) holds. If the variance is infinite, the classical CLT does not apply but a Generalized Central Limit Theorem might (or might not) apply.

The Generalized CLT says that if the “aggregation” converges to a non-degenerate distribution, that distribution must be a stable distribution. Also, stable distributions (except for normal distributions) have tails that are asymptotically proportional to the tails of a power law distribution. Note that this does not say under what conditions the aggregation has a non-degenerate limit. It only says something about what that limit must be like if it exists. Also, this does not say that the limit is a power law, only that it is a distribution whose tails are eventually proportional to those of a power law distribution.

In order to better understand what’s going on, there are several gaps to fill in.

  1. What are stable distributions?
  2. What do we mean by aggregation?
  3. What conditions ensure that a non-degenerate limiting distribution exists?

Let X0, X1, and X2 be independent, identically distributed (iid) random variables. The distribution of these random variables is called stable if for every pair of positive real numbers a and b, there exists a positive c and a real d such that cX0 + d has the same distribution as aX1 + bX2.

Stable distributions can be specified by four parameters. One of the four is the exponent parameter 0 < α ≤ 2, which controls the thickness of the distribution tails. The distributions with α = 2 are the normal (Gaussian) distributions. For α < 2, the PDF is asymptotically proportional to |x|^(−α−1) and the tails of the CDF are asymptotically proportional to |x|^(−α) as x → ±∞. And so except for the normal distribution, all stable distributions have thick tails.

A stable distribution can be described in terms of its characteristic function, the Fourier transform of its PDF.  The description of the characteristic function is a little complicated, but it can be written down in closed form. (See John P. Nolan’s notes on stable distributions for much more information.) However, the PDFs can only be written down in closed form in three special cases: the normal, Cauchy, and Lévy distributions. These three distributions correspond to α = 2, 1, and 1/2 respectively.
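The stability property is easy to check numerically in the Cauchy case (α = 1), where aX1 + bX2 has the same distribution as (a + b)X0 for centered variables. A simulation sketch (the quartile comparison is my own choice of check; a standard Cauchy distribution has quartiles at ±1):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1_000_000
x1 = rng.standard_cauchy(n)
x2 = rng.standard_cauchy(n)

a, b = 2.0, 3.0
combo = a * x1 + b * x2  # should be distributed as (a + b) * standard Cauchy

# Standard Cauchy quartiles are -1 and +1, so the IQR of combo should be
# close to (a + b) * 2 = 10. Quantiles are well behaved even though the
# mean and variance of the Cauchy distribution do not exist.
q25, q75 = np.percentile(combo, [25, 75])
print(q75 - q25)  # ≈ 10
```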

The Generalized CLT holds if there are sequences of constants an and bn such that (X1 + X2 + … + Xn − bn) / an converges to a stable distribution. This is what is meant by the "aggregation" of the X's. The factors an are necessarily asymptotically equal to n^(1/α) where α is the exponent parameter of the limiting distribution.

We now get to the most critical question: what kinds of random variables lead to stable distributions when aggregated? They must have tails something like the tails of the limiting distribution. In this sense the Generalized CLT is not as magical as the classical CLT. The classical CLT says you can aggregate random variables quite unlike a normal distribution and get a normal distribution out in the limit. But the Generalized CLT requires that the distribution of the X's be somewhat similar to the limiting distribution. The specific requirements are given below.

Let F(x) be the CDF for the random variables Xi. The following conditions on F are necessary and sufficient for the aggregation of the X‘s to converge to a stable distribution with exponent α < 2.

  1. F(x) = (c1 + o(1)) |x|^(−α) h(|x|) as x → −∞, and
  2. 1 − F(x) = (c2 + o(1)) x^(−α) h(x) as x → ∞

where h(x) is a slowly varying function. The notation o(1) is explained in these notes on asymptotic notation. A slowly varying function h(x) is one such that h(cx) / h(x) → 1 as x → ∞ for all c > 0. Roughly speaking, this means F(x) has to have tails something like |x|^(−α) on both the left and the right, and so the X's must be distributed something like the limiting distribution.

Power laws do not fall out of the Generalized CLT as easily as the normal distribution falls out of the classical CLT. The aggregation of random variables with infinite variance might not converge to any distribution, or it might converge to a degenerate distribution. And if the aggregation converges to a non-degenerate distribution, this distribution is not strictly a power law but rather has tails like a power law.
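The scaling factor an ~ n^(1/α) is also easy to see in simulation. For finite-variance data, an ~ n^(1/2), so sample means concentrate; for Cauchy data (α = 1), an ~ n, so the mean of ten thousand draws is as spread out as a single draw. A sketch of the contrast (my own illustration, not from the interview):

```python
import numpy as np

rng = np.random.default_rng(0)

def iqr_of_means(sampler, n, reps=2000):
    """Interquartile range of the sample mean of n draws, over many replications."""
    means = sampler((reps, n)).mean(axis=1)
    q25, q75 = np.percentile(means, [25, 75])
    return q75 - q25

n = 10_000
# Finite variance (uniform): means tighten like n^(-1/2).
uniform_ratio = iqr_of_means(rng.random, n) / iqr_of_means(rng.random, 1)
# Infinite variance (Cauchy, alpha = 1): a_n ~ n, so the mean of n draws
# is spread out just like a single draw.
cauchy_ratio = iqr_of_means(rng.standard_cauchy, n) / iqr_of_means(rng.standard_cauchy, 1)

print(uniform_ratio)  # small, on the order of n^(-1/2)
print(cauchy_ratio)   # close to 1
```

Averaging does nothing to tame the Cauchy data, which is exactly why the classical CLT fails and the Generalized CLT takes over.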


Has C++ jumped the shark?

Bjarne Stroustrup, creator of C++, wrote an article for Dr. Dobb's recently lamenting the decision to cut "concepts" from the upcoming revision of the C++ standard. His article left me with the feeling that C++ had jumped the shark.

The upcoming standard has been called "C++0x" based on the assumption (or at least the hope) that the standard would come out in the first decade of this century. But there will be no C++0x; now it will have to be C++1x. Part of the reason for removing concepts was to avoid pushing the release of the standard out even further. Stroustrup says he expects concepts will be added to the standard in five years. How many people will care by the time the standard is finished?

I’ve written C++ for a long time and I still use C++ for some problems. I like the language, but it has gone about as far as it can go. It’s safe to say the language has essentially stopped evolving if new standards are only going to come out every decade or two.

I have great respect for the people working on the C++ standard committee. The design of C++ is an engineering marvel given its constraints. But if it takes such Herculean efforts to make changes, maybe it’s time to call the language finished.

I’m content with the current version of C++. If I used C++ for everything I do, maybe I’d be anxious for new features. But if something is hard to do in C++, I just don’t use C++. I don’t see a new version of C++ changing my decisions about what language to use for various tasks. If something is easier to do in Python than in C++ now, for example, that will probably still be the case when the new standard is implemented.

Update: The ISO committee approved the final draft of the C++ 2011 standard on 25 March 2011.


Good enough for Google and NASA

Leo Laporte’s comment on Python in the latest FLOSS Weekly podcast:

If it’s good enough for Google and NASA, it’s good enough for me, baby.

The podcast is an interesting interview with Michael Foord on IronPython. Leo Laporte’s comment comes near the end of the show, around 51:20.
