DNA: how many bytes?




















Estimates for the number of cells in the human body range between 10 trillion and trillion. Let us take trillion cells as the generally accepted estimate. So, given that each diploid cell contains 1. Along the same lines, how much genetic data is exchanged during human reproduction? Each sperm cell in a human male is heterogametic and haploid, meaning that it contains only one of two sex chromosomes X or Y and only one set of the 22 autosomal chromosomes. Thus, each sperm contains about 3 billion bases of genetic information, representing Mbytes of digital information.

The average human ejaculate contains around million sperm cells. Following this idea even further, while Tbytes are transferred, only one sperm cell will fuse with an egg, using only Mbytes of data, combining it with another Mbytes of data from the egg. Thus, essentially Having worked out the above numbers, a whole bunch of other curious questions can be asked.

Have you ever wondered about the data capacity of our biological organism? What is the rate of data transmission during cell division? The rate of data transmission during gamete fusion? The rate of data transmission when human lymphocytes circulate through the bloodstream?

What amount of data is destroyed daily by apoptosis? What amount of data is created daily? How does this compare to the rate of data transfer via an optical fiber?

The Information Paradox

Consider that the human brain contains over one hundred trillion electrical connections (synapses) in a three-dimensional space. If you were given the task of building a brain, you would need at least a three-dimensional plan showing the electrical connections.

Mathematically those would represent points in three dimensional space bounded by the physical dimension of a synapse. Since a point in three dimensions can be represented as a combination of unit vectors each with a matrix of three bits of information, the theoretical minimum of information for specifying one point is 9 bits or slightly more than one byte. Since there are over one hundred trillion locations for the electrical connections, more information is required to specify building a brain than is contained in the entire genome which is on the order of billions of bytes, not trillions of bytes.
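The arithmetic behind this comment can be sketched as follows. This takes the comment's own premises at face value (9 bits per point, one hundred trillion points) without endorsing them; the 3-billion-byte genome figure is an assumption standing in for "on the order of billions of bytes", and the function names are made up for illustration.

```python
# Figures taken from the comment above; not independently verified.
SYNAPSES = 100 * 10**12          # "over one hundred trillion locations"
BITS_PER_POINT = 9               # the comment's claimed minimum per 3D point

# Assumption: one byte per base for ~3 billion bases (plain-text storage).
GENOME_BYTES_ASSUMED = 3 * 10**9


def wiring_plan_bytes(n_points: int = SYNAPSES,
                      bits_per_point: int = BITS_PER_POINT) -> int:
    """Bytes needed to list every point at the claimed bits-per-point."""
    return n_points * bits_per_point // 8


def ratio_to_genome(genome_bytes: int = GENOME_BYTES_ASSUMED) -> float:
    """How many times larger the wiring plan is than the assumed genome size."""
    return wiring_plan_bytes() / genome_bytes


if __name__ == "__main__":
    print(wiring_plan_bytes())   # bytes for the point list
    print(ratio_to_genome())     # size relative to the assumed genome
```

Under these assumptions the point list comes to roughly 112 terabytes, which is where the "trillions of bytes versus billions of bytes" comparison comes from.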

Very interesting! This makes me think about the potential to store digital data chemically. I wonder if in the future we will be able to manipulate these molecules so finely that we could reliably use DNA to store data in servers. Idk just seems true???????? If we have to admit that someone (many someones) is needed to program a computer to work, then who is the programmer of all species in this world?

Who wrote my gene code? Life is obviously not a. Think about it. Your brain and your body are created systems. The program to build you is encoded in every cell. Do you really believe this evolved over time? Only a master programmer beyond our imagination could do this. So they are without excuse.

Male 23rd chromosome is X, Y and makes 46 in total. Females 23rd chrom. The X chrom.

Zipped or not, nucleotide sequences can hardly be compressed.

For sequence searching this was also done with the query. If "coded nucleotide" storage would be 2-bit per letter then you get for a byte:. Only this way you fully profit from positions 1,2,3,4,5,6,7 and 8 for 1 byte of coding. For example the combination This alone is responsible for a four times reduction in file-size as we see in other answers. Thus 3. Unzipped you can still read it. If this byte filling was used it becomes harder to read the data. That's why fasta-files are plain-text files in reality.
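A minimal sketch of the 2-bit "byte filling" described above, packing four bases into each byte. The particular bit assignment (A=00, C=01, G=10, T=11) and the function names are assumptions made for illustration; any fixed assignment works the same way. Real FASTA data also contains symbols such as N, which this sketch does not handle.

```python
# Assumed mapping; the source does not say which base gets which bit pattern.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BASES = "ACGT"  # index i is the base whose code is i


def pack(seq: str) -> bytes:
    """Pack a string of A/C/G/T into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a short final chunk so unpacking reads from the top bits.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, n_bases: int) -> str:
    """Reverse pack(); n_bases is needed because the last byte may be padded."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 0b11])
    return "".join(bases[:n_bases])


if __name__ == "__main__":
    packed = pack("ACGTAC")
    print(len(packed))          # 6 bases fit in 2 bytes
    print(unpack(packed, 6))    # round-trips to the original string
```

The packed form is a quarter the size of the one-letter-per-byte text, but it is no longer human-readable, which matches the trade-off the answer describes.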

Bits do not represent information by themselves; it is the combination of bits that represents information. So in the case of nuDNA and mtDNA, the bits are encoded (not to be confused with compressed) to represent proteins and enzymes that in themselves would require many MBs of raw data to represent, especially in terms of functionality.

One base (T, C, A, or G; in the base-4 number system: 0, 1, 2, 3) is encoded as two bits (not one), so one base pair is encoded by four bits. There are only 2 types of base pairs: cytosine can only bind to guanine, and adenine can only bind to thymine, so each base pair can be considered a single bit.
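One way to check the competing counts in these comments is to enumerate the base pairs as ordered (strand, complement) tuples. This is only a sketch with made-up function names. It rests on the fact that A-T and T-A at a given position are different sequences, so there are four distinguishable pairs, and that the second base adds nothing once the first is known.

```python
import math

# Watson-Crick pairing: each base determines its partner.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def base_pairs() -> list[tuple[str, str]]:
    """All ordered (strand base, complementary base) pairs."""
    return [(base, COMPLEMENT[base]) for base in "ACGT"]


def bits_per_base_pair() -> float:
    """Bits needed to pick one of the distinguishable ordered pairs."""
    return math.log2(len(set(base_pairs())))


if __name__ == "__main__":
    print(base_pairs())
    print(bits_per_base_pair())
```

On this counting a base pair carries two bits: not four, because the complement is redundant, and not one, because orientation matters.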

How much storage would be required to store a human genome?

Asked 9 years, 9 months ago by Elijah.

As for the number of atoms, this depends on the composition. A and T are smaller molecules than G and C.

The structure of the molecule is the beef, though, not its atomic composition, so this isn't really a very useful calculation. For what it's worth, e. See also biostars.

Except for users slayton, Paul Amstrong and rauchen, all other answers given are dead wrong in essence or far from complete.

The answers either fail to mention compression methods or explain them poorly. See my answer to clarify the 4 times downsizing of the genome as seen in many answers.

I'm voting to close this question as off-topic because it is off-topic here; it should be on bioinformatics.

Here is a link to a repository containing it in a file called: hg

Oliver Charlesworth

Just to add some biological commentary, "haploid" here means only one copy of each chromosome.

The human reference assembly is haploid and a mosaic of multiple people. An actual individual genome will be diploid (2 copies of each chromosome, except X and Y), but again the two copies vary only at a small subset of sites.

Thought about it for a day, and realized this: if you stored some base case human DNA, any subsequent human's DNA would only need to be stored as the diff between it and the base case.
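The "store only the diff" idea can be sketched as below. This is a toy under loud assumptions: both sequences have the same length and differ only by single-base substitutions, whereas real genomes also differ by insertions and deletions. The function names are hypothetical, not taken from any genomics tool.

```python
def diff_against_reference(reference: str, sample: str) -> list[tuple[int, str]]:
    """Return (position, base) for every site where sample differs.

    Assumption: substitutions only, so both strings have equal length.
    """
    if len(reference) != len(sample):
        raise ValueError("this sketch only handles equal-length sequences")
    return [(i, s) for i, (r, s) in enumerate(zip(reference, sample)) if r != s]


def apply_diff(reference: str, diff: list[tuple[int, str]]) -> str:
    """Rebuild the sample from the reference plus its list of substitutions."""
    seq = list(reference)
    for pos, base in diff:
        seq[pos] = base
    return "".join(seq)


if __name__ == "__main__":
    ref = "ACGTACGT"
    sample = "ACGTTCGA"
    d = diff_against_reference(ref, sample)
    print(d)                           # only the differing sites are stored
    print(apply_diff(ref, d) == sample)
```

The saving comes from the diff list growing with the number of differing sites rather than with the genome length, at the cost of needing the reference on hand to reconstruct anything.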

For same sex examples DNA is And across sexes it's like It is also worth remembering that not all information is encoded within DNA base pairs; there is also epigenetic information. When the 4 bases are packed into one byte. I'm going to be a pedant and point out that it cannot be both 3e9 bytes and bases, because a byte is 8 bits but a base is two bits. So a genome is 6e9 bits, or 7.
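The pedantic point above is plain arithmetic, worked through here as a short sketch. The 3e9 base count and the 2 bits per base come from the comment; the constant and function names are illustrative only.

```python
BASES_IN_GENOME = 3_000_000_000   # "3e9 bases", as stated in the comment
BITS_PER_BASE = 2                 # four possible bases -> 2 bits each
BITS_PER_BYTE = 8


def genome_bits(n_bases: int = BASES_IN_GENOME,
                bits_per_base: int = BITS_PER_BASE) -> int:
    """Total bits when every base is stored in bits_per_base bits."""
    return n_bases * bits_per_base


def genome_bytes(n_bases: int = BASES_IN_GENOME,
                 bits_per_base: int = BITS_PER_BASE) -> int:
    """Total bytes for the same encoding (exact when divisible by 8)."""
    return genome_bits(n_bases, bits_per_base) // BITS_PER_BYTE


if __name__ == "__main__":
    print(genome_bits())              # packed, in bits
    print(genome_bytes())             # packed, in bytes
    print(genome_bytes(bits_per_base=8))  # one ASCII letter per base
```

So 3e9 bytes is the size of the one-letter-per-base text form, while the 2-bit form is 6e9 bits, a quarter of that in bytes.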

This, of course, ignores methylation though. A base may be two bits, though doesn't it need a position too, to be represented? I suppose you could use the byte offset in the binary files to represent position, though that seems potentially more restrictive than explicitly storing the position. That's precisely how images are stored, because it would take many times the amount of information to explicitly store the position.

In a p image, there are 2,, pixels (assuming 1 channel); to store the position of each 8-bit pixel, you'd need to include 21 extra bits of information per pixel (actually 42, because it's a 2D array, not 1D). Now, this is images, and a genome would be a 1D array, not 2D, and it depends on the number of bases per gene, so there's a lot of variability there, but my general point stands. See my comment on Pierre's answer that a genome is 3e9 bases, which is 6e9 bits uncompressed.
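The same reasoning applied to a genome, as a rough sketch: compare 2 bits per base with implicit positions (the offset in the file) against 2 bits plus an explicit index per base. It assumes a flat 1D array of 3e9 bases, as in the comment, and the function names are made up for illustration.

```python
def position_bits(n_positions: int) -> int:
    """Bits needed to give each of n_positions a distinct index (0..n-1)."""
    if n_positions < 1:
        raise ValueError("need at least one position")
    return max(1, (n_positions - 1).bit_length())


def explicit_overhead(n_bases: int, bits_per_base: int = 2) -> float:
    """Size ratio of (base + explicit index) storage to offset-only storage."""
    per_base_explicit = bits_per_base + position_bits(n_bases)
    return per_base_explicit / bits_per_base


if __name__ == "__main__":
    n = 3_000_000_000                 # "3e9 bases" from the comment
    print(position_bits(n))           # index width in bits
    print(explicit_overhead(n))       # how many times larger explicit storage is
```

With these numbers each base would need a 32-bit index next to its 2 bits of content, which is why using the offset as the position is the sensible default.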

Your question asks what the best method of sending whole human genomes is. This would be best done as a diff to the reference genome. About 0.


