Using classical statistics to avoid regulatory burden

On June 29 this year I said on Twitter that companies would start avoiding AI to avoid regulation.

Companies are advertising that their products contain AI. Soon companies may advertise that their products are AI-free and thus exempt from AI regulations.

I followed that up with an article Three advantages of non-AI models. The third advantage I listed was

Statistical models are not subject to legislation hastily written in response to recent improvements in AI. The chances that such legislation will have unintended consequences are roughly 100%.

Fast forward four months and we now have a long, highly detailed executive order, Executive Order 14110, affecting all things related to artificial intelligence. Here’s an excerpt:

… the Secretary [of Commerce] shall require compliance with these reporting requirements for: any model that was trained using a quantity of computing power greater than 10²⁶ integer or floating-point operations, or using primarily biological sequence data and using a quantity of computing power greater than 10²³ integer or floating-point operations; and any computing cluster that has a set of machines physically co-located in a single datacenter, transitively connected by data center networking of over 100 Gbit/s, and having a theoretical maximum computing capacity of 10²⁰ integer or floating-point operations per second for training AI.

If a classical model can do what you need, you are not subject to any regulations that will flow out of the executive order above, at least if these regulations use definitions similar to those in the executive order.

How many floating point operations does it take to train, say, a logistic regression model? It depends on the complexity of the model and the amount of data fed into the model, but it’s nowhere near the 10²⁶ operations in the excerpt above.
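To put a rough number on that, here is a back-of-envelope sketch. It assumes the model is fit by plain full-batch gradient descent and counts about 4np operations per iteration for n rows and p parameters. The function name and the workload sizes (a million rows, a dozen parameters, a thousand iterations) are made up for illustration. Under those assumptions the cost comes to 4.8 × 10¹⁰ operations, more than fifteen orders of magnitude below the 10²⁶ threshold quoted in the excerpt.

```python
# Back-of-envelope operation count for training logistic regression by
# full-batch gradient descent. All workload sizes are illustrative assumptions.

def logistic_training_flops(n_rows: int, n_params: int, n_iters: int) -> int:
    """Rough count: each iteration forms X @ w (about 2*n*p operations)
    and the gradient X.T @ r (about 2*n*p more). Sigmoid evaluations and
    the weight update are lower-order terms and are ignored here."""
    return 4 * n_rows * n_params * n_iters

# Model-training threshold quoted in the executive order excerpt.
EO_MODEL_THRESHOLD = 10**26

# Hypothetical workload: a million rows, a dozen parameters, 1,000 iterations.
flops = logistic_training_flops(n_rows=10**6, n_params=12, n_iters=1_000)
ratio = EO_MODEL_THRESHOLD // flops

print(f"estimated training cost: {flops:.1e} operations")
print(f"the threshold is about {ratio:.1e} times larger")
```

A second-order method such as Newton's method would cost more per iteration but need far fewer iterations, so the conclusion would not change.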

Can you replace an AI model with something more classical like a logistic regression model or a Bayesian hierarchical model? Very often. I wouldn’t try to compete with Midjourney for image generation that way, but classical models can work very well on many problems. These models are much simpler, with maybe a dozen parameters rather than a billion, and so are much better understood. Better understanding in turn means less of the fear that leads to regulation.
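As a small illustration of how little machinery such a model needs, the sketch below fits a three-parameter logistic regression in pure Python. The data are synthetic, generated from coefficients chosen arbitrarily for this example, and the helper names (`sigmoid`, `fit_logistic`) are my own. It is not the model from any client project. Every fitted parameter can be printed and read directly.

```python
import math
import random

def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(X, y, lr=0.5, n_iters=500):
    """Fit weights by full-batch gradient descent on the mean log loss."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iters):
        grad = [0.0] * p
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j in range(p):
                grad[j] += err * xi[j]
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w

# Synthetic data from a known model (illustrative values, not real data).
random.seed(0)
true_w = [-0.5, 2.0, -1.0]  # intercept, feature 1, feature 2
X = [[1.0, random.gauss(0, 1), random.gauss(0, 1)] for _ in range(400)]
y = [1 if random.random() < sigmoid(sum(t * x for t, x in zip(true_w, xi))) else 0
     for xi in X]

w = fit_logistic(X, y)

# Fraction of training rows where the predicted class matches the label.
accuracy = sum(
    (sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) > 0.5) == (yi == 1)
    for xi, yi in zip(X, y)
) / len(X)

print("fitted weights:", [round(wj, 2) for wj in w])
print("training accuracy:", round(accuracy, 3))
```

Each weight is a log-odds effect for one input, so the whole model fits on one line of output.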

I had a client that was using some complicated models to predict biological outcomes. I replaced their previous models with a classical logistic regression model and got better results. The company was so impressed with the improvement that they filed a patent on my model.

If you’d like to discuss whether I could help your company replace a complicated AI model with a simpler statistical model, let’s talk.

3 thoughts on “Using classical statistics to avoid regulatory burden”

Andrew Gelman

1 November 2023 at 22:11

John:

You write, “I wouldn’t try to compete with Midjourney for image generation that way, but classical models can work very well on many problems.”

It’s not just that. Classical models (in which category you include Bayesian hierarchical models) can do things like fit latent-parameter models in pharmacology. You can’t do that with Midjourney at all!

Denny

2 November 2023 at 19:55

Hi John:

I read your post https://www.johndcook.com/blog/applied-linear-regression/ last night as I was searching for a textbook that describes the underlying linear regression model, e.g., the F-test for fit, p-value, etc.

Do you know if there is a good textbook that describes the underlying theoretical model for linear regression, the parts that are not taught in a Statistics 101 class?

John

3 November 2023 at 17:28

You might like one of the two regression books Andrew Gelman and Jennifer Hill wrote together.
