Q: What is the Human Proteome Project (HPP)?
A: The HPP is the flagship research project of HUPO. Its goals are (1) to establish proteomics as a complement to genomics in integrated omics research and life sciences more broadly; and (2) to progressively complete the protein parts list, with at least one protein product from every protein-coding gene that is expressed, along with sequence variants, post-translational modifications (including N-termini), and splice isoforms.
Q: Why don’t we just rely on genomics or gene expression to understand the molecular biology of the cell?
A: It is impossible to predict the dynamic features of the proteome or the post-translational modifications and splice isoforms without direct measurement and quantitation of the proteins and proteoforms critical to cell functions.
Q: How many protein-coding genes are there?
A: The current estimate from neXtProt and UniProt is 20,055. This estimate includes a category (PE5) called “uncertain or dubious”, which is dominated by pseudogenes (degraded gene sequences) that are very unlikely to produce proteins. Therefore, the HPP has adopted the neXtProt protein evidence categories PE1, PE2, PE3, and PE4, excluding the PE5 set; that leaves 19,467 as of 2016.
Q: For how many of these genes is there highly confident protein-level evidence of at least one protein product?
A: According to the HUPO HPP metrics and the core databases PeptideAtlas and neXtProt, there were 16,518 proteins at protein evidence level PE1, of which 14,629 were identified by mass spectrometry (canonical in PeptideAtlas) and 1,860 by other biochemical and physical methods, according to neXtProt curators. That leaves 2,949 “missing proteins” (PE2, PE3, and PE4).
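The missing-protein count is simply the difference between two of the figures quoted above. A minimal sketch of that arithmetic in Python follows; the function and constant names are illustrative assumptions and not part of any HPP, neXtProt, or PeptideAtlas tool.

```python
# Counts quoted in the answers above (neXtProt / PeptideAtlas).
# The names below are hypothetical, chosen only for this illustration.
PE1_TO_PE4_GENES = 19_467  # predicted protein-coding genes, PE5 excluded
PE1_PROTEINS = 16_518      # genes with confident protein-level evidence


def missing_proteins(total_genes: int, pe1: int) -> int:
    """Genes still lacking confident protein-level evidence (PE2, PE3, PE4)."""
    return total_genes - pe1


def pe1_fraction(total_genes: int, pe1: int) -> float:
    """Share of the PE1-PE4 gene set that already has PE1 evidence."""
    return pe1 / total_genes


if __name__ == "__main__":
    print(missing_proteins(PE1_TO_PE4_GENES, PE1_PROTEINS))  # 2949
    print(f"{pe1_fraction(PE1_TO_PE4_GENES, PE1_PROTEINS):.1%}")
```

The counts change with each database release, so the constants would need to be updated from the current HPP metrics before reuse.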
Q: What are the strategies for finding those “missing proteins”?
A: Protein evidence level PE2 means that there is definite expression of transcripts, but no or insufficient evidence of the protein product. Knowing in which organ or cell type the mRNA is most highly expressed is a strong clue to where to look for the protein. Knowing that the protein sequence will yield suitable peptide fragments on digestion with the proteolytic enzyme trypsin, and that the protein is not embedded in membrane structures, is likewise useful. A new resource, www.missingproteins.org, provides links to relevant literature. In other cases, it may be necessary to test specimens from early development or under the stress of infection or inflammation.
Q: Are there some proteins that simply will not be detectable?
A: Yes. The denominator of expressed proteins is probably about 90% of the 19,467 predicted protein-coding genes. A few proteins (perhaps 22) generate no tryptic peptides on digestion. Many membrane proteins are highly hydrophobic and hard to solubilize. There are many genes for which no transcripts can be detected; some chromosome segments have been shown to have inaccessible chromatin. Of course, if the level of expression of transcripts or proteins is below the sensitivity of current instruments, mass spectrometry will miss them. Other methods are needed, as we already know for 1,860 of the 16,518 PE1 proteins in neXtProt.
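To make the "about 90%" estimate concrete, the following rough sketch in Python applies it to the gene count quoted above. The 0.90 factor is the approximate figure from the answer, not a measured value, and all names are illustrative assumptions.

```python
# Rough arithmetic only: the 90% factor is the approximate estimate
# quoted in the answer above, and the names are hypothetical.
PE1_TO_PE4_GENES = 19_467     # predicted protein-coding genes, PE5 excluded
PE1_PROTEINS = 16_518         # genes with confident protein-level evidence
EXPRESSED_FRACTION = 0.90     # "probably about 90%" (approximate)


def expected_expressed(total_genes: int, fraction: float) -> int:
    """Approximate number of genes expected to yield a detectable protein."""
    return round(total_genes * fraction)


def still_findable(total_genes: int, fraction: float, pe1: int) -> int:
    """Approximate number of expressed proteins not yet at PE1."""
    return expected_expressed(total_genes, fraction) - pe1


if __name__ == "__main__":
    print(expected_expressed(PE1_TO_PE4_GENES, EXPRESSED_FRACTION))
    print(still_findable(PE1_TO_PE4_GENES, EXPRESSED_FRACTION, PE1_PROTEINS))
```

Under that assumption, roughly 17,500 genes would be expected to produce a detectable protein, so only about a thousand of the 2,949 missing proteins might realistically be found; the result is an order-of-magnitude guide, since the 90% figure is itself uncertain.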
Q: How can independent scientists participate in the HPP?
A: From the beginning, the HPP has welcomed voluntary participation of HUPO members and prospective HUPO members in the teams and meetings of the HPP. Since the first annual C-HPP special issue of the Journal of Proteome Research in 2013, manuscripts have been invited from independent laboratories; that invitation is explicitly repeated in the Call for Papers for the 5th annual special issue in 2017 (link to JPR call for papers). Many scientists become interested in the HPP while reading the publications or participating in HUPO, regional HUPO, or other scientific meetings. All of the HPP teams have expanded and been refreshed over the years. The B/D-HPP has grown from a handful of original projects to the present array of 22 teams.
Q: What are the opportunities specifically for early career researchers?
A: HUPO and the HPP warmly welcome early career researchers. The B/D-HPP has put the spotlight on early career researchers with an all-day Mentoring Day associated with the HUPO World Congress, a manuscript competition with awards, travel awards, and engagement of early career researchers on the B/D-HPP committees, including the Executive Committee. The HPP also sponsors travel awards for clinician-scientists to become more knowledgeable about proteomics. In addition, the Knowledgebase Resource Pillar offers a Bioinformatics Hub for discussions of pre-planned topics and for open consultations during the HUPO Congress.
Have more questions? Please contact office(at)hupo.org.