Abstract

The Width of the Human Plasma Proteome Compared With a Cancer Cell Line and Bacteria

Andrey V Lisitsa, Ekaterina V Poverennaya, Elena A Ponomarenko and Alexander I Archakov

Whole-genome sequencing has revealed the number of protein-encoding genes in a given organism, which can be considered a first approximation of its molecular complexity. Because of sequence polymorphisms, post-transcriptional events such as RNA splicing, and post-translational events such as covalent modification and degradation, the total number of different protein species (the proteome) can be much larger than the number of protein-encoding genes. 2-D gel electrophoresis can be used to estimate the width of the human proteome: the number of spots obtained with different stains (dyes) under different protein loading conditions gives a rough idea of the number of different proteins in a sample. Data on human plasma, human cell lines and bacterial cells were analyzed to determine how the number of spots depends on dye sensitivity. Assuming that each spot represents a different protein species, this dependence was used to estimate the width of the proteome. The resulting theoretical estimates are 1.75 million proteoforms in 1 L of blood plasma, 18 thousand species per individual HepG2 cell, and 6700 species per bacterium.
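The extrapolation from spot counts to proteome width can be illustrated with a minimal sketch. The abstract does not state the functional form of the dependence, so the power-law (log-log linear) model below is an assumption, the function names `fit_power_law` and `predict_spots` are hypothetical, and the detection limits and spot counts are made-up illustrative values, not data from the study.

```python
import math

def fit_power_law(detection_limits, spot_counts):
    """Least-squares fit of log10(spots) = intercept + slope * log10(limit).

    The power-law form is an assumption made for illustration only.
    """
    xs = [math.log10(d) for d in detection_limits]
    ys = [math.log10(n) for n in spot_counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

def predict_spots(intercept, slope, detection_limit):
    """Extrapolate the fitted line to a given detection limit."""
    return 10 ** (intercept + slope * math.log10(detection_limit))

# Hypothetical values: dye detection limit (ng of protein per spot)
# and the number of spots counted on a gel stained with that dye.
limits_ng = [100.0, 10.0, 1.0, 0.1]
spots = [200, 500, 1250, 3125]

intercept, slope = fit_power_law(limits_ng, spots)

# Spot count expected for a hypothetical dye ten times more sensitive
# than the most sensitive one in the made-up series.
estimate = predict_spots(intercept, slope, 0.01)
print(f"slope = {slope:.3f}, extrapolated spots = {estimate:.0f}")
```

Under the one-spot-one-species assumption, the extrapolated spot count at the lowest physically meaningful detection limit serves as the estimate of proteome width; the choice of that limit is what distinguishes a per-litre, per-cell, or per-bacterium figure.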