Ingermanson dot com. Created by Randy Ingermanson, deranged physicist and award winning author.

The Proximity Issue in the Bible Code

In the Great Rabbis Experiment, Doron Witztum, Eliyahu Rips, and Yoav Rosenberg (hereafter I'll call them WRR) reported what appeared to be an amazing result. They found the names and birthdates of a number of rabbis encoded in Genesis as ELSs (equidistant letter sequences: words spelled out by reading every n-th letter of the text, where n is called the "skip"). Using a complicated definition of distance, they showed that the names and birthdates were encoded "nearer" to each other than one would expect by random chance. This finding has provoked a heated and complex argument that continues to the present.

In my book, Who Wrote the Bible Code?, I looked at the issue from an entirely different point of view. First, I noted that the Bible code proponents all assume that the Torah has a large amount of information encoded -- something on every man, woman, child, animal, and plant. Second, I used some statistical measures (entropy and chi-squared analyses) to estimate just how much raw information is encoded in the Torah. The answer is very little. If even 200 letters were encoded in Genesis at each skip, my software could find them very easily. But my software reports nothing unusual. My conclusion is that the Torah does not have much information encoded. Therefore, the usual understanding of the Bible code (assuming lots of encoded information) is incorrect. Possibly, a small amount of information is encoded. But not encyclopedias full. The ELSs of the Torah appear to be distributed randomly, without linguistic content.
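To make "the entropy of a skip-text" concrete, here is a minimal Python sketch. It is not the software described in the book. The function names skip_text and ngram_entropy are made up for illustration, and the sketch assumes the text is a plain string of letters with spaces and punctuation already removed. A skip-text here means the letters read off at one fixed skip, and the entropy is computed from the frequencies of overlapping blocks of n letters.

```python
import math
from collections import Counter


def skip_text(text, skip, offset=0):
    """Return the letters of `text` at positions offset, offset+skip, offset+2*skip, ..."""
    return text[offset::skip]


def ngram_entropy(letters, n=2):
    """Shannon entropy, in bits, of the overlapping n-letter blocks in `letters`."""
    blocks = Counter(letters[i:i + n] for i in range(len(letters) - n + 1))
    total = sum(blocks.values())
    # H = -sum(p * log2(p)) over the observed block frequencies p.
    return -sum((c / total) * math.log2(c / total) for c in blocks.values())


if __name__ == "__main__":
    sample = "inthebeginninggodcreatedtheheavenandtheearth"
    for skip in (1, 2, 3):
        print(skip, round(ngram_entropy(skip_text(sample, skip), n=2), 3))
```

For a long enough text, uniformly random letters give a block entropy close to the maximum the alphabet allows, while real language, with its favored letter combinations, gives a lower value. That gap is the kind of "signature" an entropy measurement looks for.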

Now, my method of analysis doesn't look for word-pairs that are "close" to each other. It just looks for the signature of linguistic information. Some have asked if I may have missed real codes by ignoring the "proximity" issue.

This is certainly a valid question to ask. I believe that the answer is no. The purpose of this article is to explain why. I'll have to get a bit technical here, but I'll avoid the full mathematical rigor. If you're a real mathematician, you can easily supply the precise statement of my line of reasoning. If you're not a mathematician, you shouldn't have to plow through that stuff.

A recent reader suggested the following analogy. The ELSs are a little like refrigerator magnets, each bearing a letter of the alphabet. Isn't it possible to have equal numbers of each letter on the refrigerator and arrange them "near" each other to spell out meaningful patterns? Since the letters occur in equal numbers, a simple count of letters would conclude that the letters are random. But anybody who sees a long sequence of letters organized near each other to form words would see that an intelligent being has created a pattern. Isn't this a good analogy?

The answer is no. It's not a good analogy. To see this, I'll analyze in detail exactly the process a human might use to create order out of chaos using refrigerator magnets. Then I'll analyze the analogous process for putting ELSs "near" each other in a text. We'll see that the processes are very different. The analogy simply doesn't hold.

Let's take the refrigerator first. Suppose I have a random collection of magnets on my refrigerator, and I want to spell out the word "Hillary." I scan the refrigerator and find that I have many of each of the required letters, but none of them are "near" each other. I'd like to rearrange them with the least amount of work.

So I find a convenient "H" and choose that as my starting point. As it turns out, the letter next to "H" is a "Z". I don't want that, so I pick it up and swap it with the nearest "I". I've now spelled out "HI" and am well on my way. Doing similar swaps, I can move the other five letters into place to spell out "Hillary." Done! The letters still occur with the same frequencies. But now, there's definitely a pattern. Anybody who looked at the refrigerator would agree there's a pattern there.

Now let's turn to the case of ELSs. We'll assume that we have a large collection of random letters that make up a text. Since the letters are chosen randomly, we'll find words occurring as ELSs at random. Now we want to "move the ELSs around" in order to put ELSs with related meanings "near" each other.

What does "near" mean? It really doesn't matter in detail. It might be defined by the distance function that WRR used. It might be some other definition of nearness. The only thing that matters is that we pick some way of defining "near" and stick with it.

Let's say that we want to find the three ELSs "William," "Jefferson," and "Clinton" and put them near each other. But our goal is to do this in such a way that we don't get an unusually high number of occurrences of these words. Our goal is to encode information using proximity only, while maintaining a random distribution of each individual word. Our goal is to keep the entropy of the "skip-texts" high.

As usual, we'll also want to focus on the ELSs which occur at minimal skip. Each of these "minimal-skip" ELSs will generally be found at a different skip. However, note that there's a slight problem. None of these ELSs is guaranteed to occur in the text. This, in fact, was an issue in the Great Rabbis Experiment. A substantial fraction of the rabbis' names didn't occur at any skip. WRR resolved this problem by looking for multiple "appellations" for each rabbi.
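The search for a minimal-skip ELS can be sketched in a few lines of Python. This is only an illustration: the name find_els and the max_skip cutoff are hypothetical, only forward (positive) skips are searched, and WRR's actual procedure is not reproduced here. Notice that the function can come back empty-handed, which is exactly the problem just mentioned.

```python
def find_els(text, word, max_skip):
    """Return (skip, start) for the smallest skip at which `word` appears
    as an ELS in `text`, or None if it appears at no skip up to `max_skip`."""
    n = len(word)
    for skip in range(1, max_skip + 1):
        span = (n - 1) * skip  # distance from the first letter to the last
        for start in range(len(text) - span):
            # Read n letters starting at `start`, stepping by `skip`.
            if text[start:start + span + 1:skip] == word:
                return skip, start
    return None


if __name__ == "__main__":
    print(find_els("xaxbxcx", "abc", 5))
    print(find_els("xaxbxcx", "zzz", 5))
```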

Let's continue with our example. I want to put my three ELSs near each other. Just to make this example interesting, and to illustrate what can happen, let's assume the following. Assume that I find "William" occurring at a minimal skip of 10. Assume I find "Clinton" occurring at a minimal skip of 543, but assume that it's "far" from "William". And assume I don't find "Jefferson" at all.

I proceed in stages. First, I decide to leave "William" where it is, at a skip of 10. Second, I have to create an ELS of "Clinton" near to "William". I can choose any skip I like, but since I'm trying to make the ELSs appear to be randomly distributed, I'd better leave it at a skip of 543.

(Why do I need to do that? -- you may ask. Suppose I don't. If I encode very many sets of ELSs to be near each other, there will soon be "too many" occurrences of real words at low skips. The ELSs won't be distributed randomly anymore. The entropy will become smaller and smaller at low skips. But I'm trying to avoid that. So I'd better create my new copy of "Clinton" at the same skip I found the old one.

For the same reason, I have to get rid of the old occurrence of "Clinton," the one that occurred far away from "William." If I don't destroy it, I'll now have two copies of "Clinton" at the same skip. If I play this game too many times, I'll be caught. And my goal is to encode lots of information while maintaining a random distribution of frequencies for all words.)

As you can see, you've got to work extra hard to do this sort of encoding. Not only do you have to work hard to encode ELSs where you need them. You've also got to work just as hard to remove ELSs where you don't need them.
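Here is a toy Python sketch of that double work, under some loud assumptions: the helper names (count_els, write_els, move_els) are invented for this example, the old and new copies are assumed not to share any positions, and the filler letter is assumed to differ from the first letter of the word. The new copy is written at the same skip as the old one, and the old copy is destroyed by overwriting one of its letters.

```python
def count_els(text, word, skip):
    """Count how many times `word` occurs in `text` as an ELS at this skip."""
    span = (len(word) - 1) * skip
    return sum(text[s:s + span + 1:skip] == word
               for s in range(len(text) - span))


def write_els(text, word, start, skip):
    """Overwrite letters of `text` so that `word` appears at `start` with `skip`."""
    letters = list(text)
    for i, ch in enumerate(word):
        letters[start + i * skip] = ch
    return "".join(letters)


def move_els(text, word, skip, old_start, new_start, filler="x"):
    """Plant a new copy of `word` and break the old one (assumes no overlap)."""
    letters = list(write_els(text, word, new_start, skip))
    letters[old_start] = filler  # destroy the old copy by changing its first letter
    return "".join(letters)


if __name__ == "__main__":
    text = "a.b.c" + "." * 10
    moved = move_els(text, "abc", skip=2, old_start=0, new_start=8)
    print(moved, count_els(text, "abc", 2), count_els(moved, "abc", 2))
```

The word count at that skip is unchanged, but the text itself is not: moving a seven-letter word this way can change up to eight letters, and every one of those letters also belongs to the skip-texts at every other skip.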

But it gets worse. We still need to encode "Jefferson" near "William" and "Clinton." Here, we're in a bit of a bind, because "Jefferson" doesn't occur at all in the text as an ELS at any skip. So we need to pick some skip and encode "Jefferson" at that skip.

Are we in trouble? Not if we're only encoding "William Jefferson Clinton" and nothing more. If we stop here, we've got one real code in a great vast sea of random ELSs. Nobody will notice that "Jefferson" occurs once instead of zero times. Nobody will notice a change in the entropy of the skip-texts.

The problem is that probably nobody will notice "William Jefferson Clinton" as a code at all. Well, OK, Michael Drosnin might notice and make a big deal out of it. But Witztum, Rips, and Rosenberg would look at that single code and say it's not a big deal. Rightly so. It might be slightly improbable, but it's not so amazing that you could claim statistical significance.

To make something really amazing, I've got to repeat the operation many times. I have to do that for two reasons. First, I want to construct a whole set of codes which are improbable, taken together as a set. (That's what was claimed to be remarkable about the Great Rabbis Experiment.) Second, I have to make enough codes so that people will stumble across them. (If I'm encoding only US Presidents, and you're looking only for baseball players, then you'll never find my codes.)

So what happens when I repeat this whole operation many times, encoding every US President, every world leader past and present, every baseball player, football player, man, woman, child, and chihuahua on the planet?

When I do that, suddenly it starts getting hard. I have to keep encoding information, which tends to lower the entropy of various skip-texts. To counter that, I have to try to raise the entropy in each skip-text artificially.

There are two questions that arise here.

  1. Why would I want to raise the entropy?
  2. How much information can I encode while keeping the entropy high?

Question (1) is essentially a theological question, when we note that God is alleged to be the author of the Bible code. Is it really plausible that God is encoding information and yet consciously keeping the entropy at an artificially high level? Why would He do that? Why reveal Himself and conceal Himself simultaneously? Why would anybody believe that He would do that?

I suppose it's possible to imagine that God would do this. But none of the Bible code believers have offered me a plausible reason why God would want to do so.

But let's lay aside that theological question, since it's not really decidable by mere mortals such as ourselves.

We still have the technical question. How much information can you encode in close proximity, without lowering the entropy level?

It is obvious that you can't fill up each skip-text so that it consists 100 percent of encoded ELSs. If you do that, the entropy of each skip-text will decrease by hundreds of standard deviations.
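A toy simulation gives a feel for what "standard deviations" means here. This is a sketch under stated assumptions, not a reproduction of my analysis: the alphabet is 26 English letters rather than Hebrew, the four-word vocabulary is invented, and the names make_skip_text and entropy_z_score are hypothetical. It measures the letter-pair entropy of many purely random skip-texts, then reports how many standard deviations away a skip-text falls when a given fraction of its letters belongs to planted words.

```python
import math
import random
import statistics
from collections import Counter

ALPHABET = "abcdefghijklmnopqrstuvwxyz"
WORDS = ["william", "jefferson", "clinton", "hillary"]  # toy vocabulary (assumption)


def digram_entropy(letters):
    """Entropy, in bits, of the overlapping letter-pair frequencies."""
    counts = Counter(letters[i:i + 2] for i in range(len(letters) - 1))
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def make_skip_text(length, fraction, rng):
    """Build a toy skip-text in which about `fraction` of the letters are planted words."""
    pieces, size = [], 0
    while size < length:
        word = rng.choice(WORDS)
        if rng.random() >= fraction:
            # Not a planted word: use random letters of the same length instead.
            word = "".join(rng.choice(ALPHABET) for _ in word)
        pieces.append(word)
        size += len(word)
    return "".join(pieces)[:length]


def entropy_z_score(fraction, length=5000, trials=50, seed=0):
    """Standard deviations between a planted skip-text and purely random ones."""
    rng = random.Random(seed)
    baseline = [digram_entropy(make_skip_text(length, 0.0, rng)) for _ in range(trials)]
    mean, stdev = statistics.mean(baseline), statistics.stdev(baseline)
    return (digram_entropy(make_skip_text(length, fraction, rng)) - mean) / stdev


if __name__ == "__main__":
    for fraction in (1.0, 0.5, 0.01):
        print(fraction, round(entropy_z_score(fraction), 1))
```

Trying smaller and smaller fractions shows how the signal fades toward the noise in this toy model. The numbers it prints say nothing precise about Genesis itself.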

So you can only allow a small fraction of any given skip-text to be meaningful encoded words. In my book Who Wrote the Bible Code?, I estimated this fraction to be much smaller than one percent. My more recent work estimates this fraction to be much smaller than a third of a percent.

Years ago, I discussed this issue at some length with Robert Haralick, who had challenged the interpretation of my work. We agreed in principle to investigate this question further. His job was to encode progressively larger amounts of information into a large number of texts designed to look as random as possible. My job was to find the point at which these "random-looking" texts are no longer random, using the entropy and chi-squared analyses described in my book.

We never got around to that. It would have been an interesting experiment, but there's only so much time in the day. Lacking any hard data, you can amuse yourself by mulling that knotty theological question again: Why would God try to hide the evidence that He encoded something? If you find that question too hard, you might want to ask one about human psychology: Why would anybody suggest that God would do such a thing?

About Randy Ingermanson

Randy earned a Ph.D. in physics at U.C. Berkeley and is the award-winning author of six novels and one non-fiction book. He writes about "The Intersection of Faith Avenue and Science Boulevard."

Randy publishes the world’s largest electronic magazine on the craft of writing fiction, the FREE monthly Advanced Fiction Writing E-zine. His ultimate goal is to become Supreme Dictator for Life and First Tiger and to achieve Total World Domination.
