Use information theory to clarify and quantify goals


Discussions about data and information are often fuzzier than they need to be. Information theory lets you quantify information content, so you can make precise statements where conversation would otherwise stay vague. For example, trade-off decisions in machine learning are often based on informal discussion of how much you expect to learn under different circumstances. Sometimes an informal discussion is appropriate, either because many of the factors are difficult to quantify or because a formal treatment would be impractically complex. However, it is often possible to use information theory to be more precise.
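As a minimal sketch of what "quantifying information content" can look like, the following computes Shannon entropy in bits for two hypothetical outcome distributions. The function name `entropy_bits` and the example probabilities are illustrative assumptions, not part of any particular project.

```python
import math

def entropy_bits(probs):
    """Shannon entropy H(p) = -sum(p_i * log2(p_i)), in bits.

    Outcomes with zero probability contribute nothing to the sum.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Hypothetical example distributions (assumed for illustration only).
fair = [0.5, 0.5]     # maximally uncertain binary outcome
biased = [0.9, 0.1]   # outcome you can mostly guess in advance

print(entropy_bits(fair))    # 1.0 bit per observation
print(entropy_bits(biased))  # about 0.469 bits per observation
```

Under these assumptions, each observation of the fair source carries about twice as much information as an observation of the biased one, which is the kind of statement an informal discussion of "how much you expect to learn" usually leaves implicit.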

Conversations about the degree of order or disorder in a system can be derailed by misleading appeals to randomness. People commonly mean several different things when they say something is “random,” and two people may use the same word without realizing that they have different concepts in mind. Information theory helps clarify such conversations, making it possible to distinguish and measure the different ideas associated with randomness.

Here are some of the ideas associated with discussions of randomness that may need to be clarified:

  • Chaotic vs ordered
  • Deterministic vs nondeterministic
  • Predictable vs unpredictable
  • Reproducible vs irreproducible
  • Dispersed vs clustered
  • Uniformly or normally distributed vs other distributions
  • Thin-tailed distributions vs thick-tailed distributions
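To illustrate how two of the ideas above can come apart, here is a small sketch that estimates two different quantities for the same sequence: the entropy of its symbol frequencies and the entropy of the next symbol given the previous one. The sequence, the helper names, and the simple plug-in estimates are all assumptions chosen for illustration.

```python
import math
from collections import Counter

def entropy_bits(counts):
    """Plug-in estimate of Shannon entropy, in bits, from a Counter of observations."""
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())

def conditional_entropy_bits(seq):
    """Estimate H(next | previous) as H(previous, next) - H(previous)."""
    pairs = Counter(zip(seq, seq[1:]))   # adjacent (previous, next) pairs
    previous = Counter(seq[:-1])         # every symbol that has a successor
    return entropy_bits(pairs) - entropy_bits(previous)

# A deterministic, hypothetical sequence: 0, 1, 2, 3, 0, 1, 2, 3, ...
seq = [i % 4 for i in range(4000)]

marginal = entropy_bits(Counter(seq))        # symbol frequencies are uniform
conditional = conditional_entropy_bits(seq)  # next symbol given the previous one

print(marginal)     # 2.0 bits: the values are uniformly distributed
print(conditional)  # essentially 0 bits: each value is fully predictable
```

In this sketch the values are uniformly distributed, yet the sequence is deterministic and perfectly predictable. Someone calling it "random" because of the uniform frequencies and someone calling it "not random" because of the predictability would both be right about the property they have in mind.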

Information theory, along with probability and statistics, can improve communication, making it easier to agree on goals and measure progress toward those goals. Stating these goals in precise terms often suggests a path toward achieving them.

If you would like to clarify and quantify your goals in collecting and understanding data, please call or email to discuss your project.


Trusted consultants to some of the world’s leading companies

Amazon, Facebook, Google, US Army Corps of Engineers, Amgen, Microsoft, Hitachi Data Systems