This is the Message Centre for droid

Human Genome Project / Random Mutations in cells

Post 1

droid

I was reading the very interesting evolutionary timeline over at http://www.pbs.org/wgbh/evolution/religion/revolution/2000.html.

On the last page of the timeline, it says that the human genome project supports evolution, since it shows that humans share many common traits with animals, plants and most anything else that's alive on this planet.

I don't get how this supports evolution.

Here's why: When I write software, I don't rewrite all the code from scratch every time. Who would do that? How difficult is it to code life anyway? I'm guessing it's quite tricky. Why re-invent the wheel once you have the core code in place? I would have been quite surprised if the code was vastly different between species. Just think of all the extra work that would have required.

A few days ago, I read at http://www.howstuffworks.com/evolution4.htm that (according to Carl Sagan in "The Dragons of Eden") large organisms like humans have roughly a one in ten chance that their offspring will contain a mutation, assuming they aren't exposed to strong radiation for extended periods. Radiation would dramatically increase the number of mutations, but it would also mess up the experiment: introducing multiple mutations in a single organism dramatically reduces the already small chance that you won't damage something important.

There are also some numbers showing that a population of 10 000 creatures, over a period of a million years, could have gone through every possible 'nucleotide substitution' 50 times. (This is from the book "Molecular Biology of the Cell".)

So, apparently, all we need is time.

The thing is, thinking back to the computer code analogy, there is a bit of a problem. If a single byte of code could turn complicated system functions on and off, like a switch, all the nucleotide swapping might be pretty handy. Unfortunately, the more complicated a function (gene) is, the more code tends to be needed to write it. Now with computers, we have CRC checking to make sure that our code isn't messed up. Evolutionary CRC checking involves a function called 'Survival of the Fittest'.
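To make the CRC comparison concrete, here is a minimal sketch using Python's standard zlib.crc32. The byte string and the flip_bit helper are made up for illustration (the string is not a real gene); the point is only that a single flipped bit changes the checksum, so the corruption gets noticed.

```python
import zlib


def flip_bit(data: bytes, bit_index: int) -> bytes:
    """Return a copy of data with exactly one bit inverted (hypothetical helper)."""
    corrupted = bytearray(data)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


# Arbitrary example payload, standing in for a stretch of "code".
original = b"ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"
mutated = flip_bit(original, 10)

checksum_original = zlib.crc32(original)
checksum_mutated = zlib.crc32(mutated)

# A CRC detects every single-bit error, so the two checksums differ.
print(checksum_original != checksum_mutated)  # prints True
```

Unlike 'Survival of the Fittest', a CRC rejects every change, harmless or not, which is why the comparison only goes so far.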

As a code integrity check, it works ok, although not 100%, because some creatures survive even with mutations. It makes sense really. If the mutation is not harmful, the creature won't die. Of course, the mutation not being harmful implies that the creature has at least the same chance of survival as those around it.

Funnily enough, we have an analogy which makes this all very much easier to understand right now: the internet. Files are being transferred day and night between millions of computers around the globe. We'd have to increase the average file size to compare correctly with DNA, but this is just a theoretical model, so we can do that without damaging our bandwidth.

Human DNA has about 3 billion base pairs, but only 20 amino acids (instructions) need to be represented. That's because the code for writing all the little enzymes (functions) performing different tasks inside the cell (program) can be programmed with only 20 instructions.

With computers we take 8 bits in a clump to represent one byte. Our genes get read 3 base pairs in a clump, to represent one codon. Every codon (that is, every set of three base pairs) maps to a single amino acid molecule, which is the basic building block of enzymes. The point is that we can compare every set of three base pairs to a set of eight bits on computers.

Using only 20 instructions is pretty low-level stuff. We could compare it to assembly language, except that assembly language (depending on the processor type involved) generally has more instructions. For instance, for the 8086/8088 processor instruction set, there are about 116 base commands, which can translate to 180 different machine language codes. ("Using Assembly Language - 3rd Edition" - QUE/Allen L. Wyatt - p 32)

That's why computers need 8 bits per instruction, and not 6. In other words, that's why we use a system which allows for 256 possible instructions, while life only allows for 64 (three base pairs with four possible bases each: 4 x 4 x 4 = 64).
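For anyone who wants to check the 64 versus 256 figures, here is a small sketch. It assumes the usual four-letter DNA alphabet (A, C, G, T) and 2 bits per base; the variable names are just for illustration.

```python
from itertools import product

BASES = "ACGT"            # four possible bases
BITS_PER_BASE = 2         # 2 bits are enough to tell four bases apart
BASES_PER_CODON = 3       # genes are read three base pairs at a time

# Enumerate every possible codon, e.g. "AAA", "AAC", ..., "TTT".
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=BASES_PER_CODON)]

codon_count = len(all_codons)                      # 4 ** 3
bits_per_codon = BITS_PER_BASE * BASES_PER_CODON   # 2 * 3
byte_values = 2 ** 8                               # values one 8-bit byte can hold

print(codon_count, bits_per_codon, byte_values)  # prints 64 6 256
```

So a codon carries 6 bits' worth of possibilities, which is plenty for 20 amino acids but not for the 180 machine codes of the 8086/8088.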

Now for the fun part. Let's work out how many MB of data are stored inside different organisms!

A simple bacterium has around 4 million base pairs. Divide by three to get the number of instructions (codons), and by a million (or 1024^2 if you feel like being picky) to get the storage capacity in MB, counting one codon as one byte. This reveals that your average bacterium stores about 1.3 MB of data within its DNA.

Humans have around 3 billion base pairs, which by the same sum means we have about 1 GB of instructions in our DNA. Wow. That really shows the jump quite graphically. From 1.3 MB to 1 GB. That's a lot of code.
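The sums above can be written out as a short sketch. It follows the convention used in this post of counting one codon as one byte, which is part of the analogy and not a literal bit count; the function name codon_megabytes is made up for the example.

```python
def codon_megabytes(base_pairs: int, binary: bool = False) -> float:
    """Storage in MB, treating each codon (3 base pairs) as one byte."""
    codons = base_pairs / 3
    bytes_per_mb = 1024 ** 2 if binary else 1_000_000
    return codons / bytes_per_mb


bacterium_mb = codon_megabytes(4_000_000)       # roughly 1.33 MB
human_mb = codon_megabytes(3_000_000_000)       # 1000 MB, i.e. about 1 GB

print(round(bacterium_mb, 2), human_mb)  # prints 1.33 1000.0
```

Passing binary=True gives the picky 1024^2 version, which brings the bacterium down to roughly 1.27 MB.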

So at least 1.3 MB of our 1 GB of DNA is shared with bacteria. The jump from single celled to multi-celled organisms is quite big. I'll talk about that some time in the future. (Maybe)


For now, here's the clincher. Random mutations can't write new functions. They can modify existing ones. Just the same as random changes within a binary file being transferred over a network can't add new sections to its code. The only way a specific function can be useful is if it works. Introducing a single random mutation cannot produce a new working function.
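Here is a rough sketch of the file-transfer side of that comparison. It only models in-place substitution (a byte being overwritten with a random value), which is the kind of change described above, and says nothing about any other kind of change. The function name, the seed and the 64-byte "file" are all invented for the example.

```python
import random


def substitute_random_bytes(data: bytes, count: int, seed: int = 0) -> bytes:
    """Overwrite `count` randomly chosen positions with random byte values."""
    rng = random.Random(seed)          # seeded so the run is repeatable
    mutated = bytearray(data)
    for _ in range(count):
        position = rng.randrange(len(mutated))
        mutated[position] = rng.randrange(256)
    return bytes(mutated)


original = bytes(range(64))            # stand-in for a small binary file
mutated = substitute_random_bytes(original, count=5)

changed = sum(a != b for a, b in zip(original, mutated))

# Substitution alters existing bytes but never the length of the file.
print(len(original), len(mutated))  # prints 64 64
```

At most five bytes differ between the two copies (the value of changed), and the file is exactly as long as it was before.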

And any individual function can only be modified to a certain extent. This explains, for instance, why bacteria can become immune to certain toxins. Differences between the way specific functions operate can allow some cells to survive an attack which relies on a certain vulnerability. But there are only so many ways to perform a given function. Which means it is *not* possible to become immune to any and all attacks.

In other words, a new poison might target cells that are immune to an older one, by attacking the precise function which gives them immunity. The only cells which might then be able to survive are the ones which aren't immune to the original poison, and haven't got the feature being targeted by the new one. So these changes within the bacteria population at large do not necessarily move in any given direction, except that of survival. Survival at any one moment does not imply that the same cell would also survive all the harsh conditions its 'ancestors' were exposed to.

The problem with trying to create useful functions with random mutations in existing functions (hopefully redundant ones, perhaps repeated in the DNA as in certain bacteria), is that there is nothing to guide the formation of the function. If it doesn't work, it doesn't work. There's no way of checking whether it *almost* works. And these functions cannot be excluded from the processes within the cell. Which means that a function which no longer does its original job *will* lead to problems, thereby substantially decreasing the chances of the cell's survival.

Back to the analogy one more time. To imagine the evolution of life from a single-celled organism up to a human, you must try to imagine a 1.3 MB file gradually acquiring new code and functions *randomly*, until it contains 1 GB of instructions. All functions must be accessed within the program, as this is the only way to make sure they don't become filled with garbage. The smallest of these functions are over 100 instructions long. The largest (and equally necessary for life) consist of thousands of instructions. All the while, the program may never crash. Crashing indicates death of the cell. As a programmer, I find this hard to imagine. Over any stretch of time.

What's so hard about accepting that other life could have existed before ours, anyway?
(more next time)

