Quantitative assessment of protein function prediction from metagenomics shotgun sequences.

Abstract

To assess the potential of protein function prediction in environmental genomics data, we analyzed shotgun sequences from four diverse and complex habitats. Using homology searches as well as customized gene neighborhood methods that incorporate intergenic and evolutionary distances, we inferred specific functions for 76% of the 1.4 million predicted ORFs in these samples (83% when nonspecific functions are considered). Surprisingly, these fractions are only slightly smaller than the corresponding ones in completely sequenced genomes (83% and 86%, respectively, by using the same methodology) and considerably higher than previously thought. For as many as 75,448 ORFs (5% of the total), only neighborhood methods can assign functions, illustrated here by a previously undescribed gene associated with the well-characterized heme biosynthesis operon and a potential transcription factor that might regulate a coupling between fatty acid biosynthesis and degradation. Our results further suggest that, although functions can be inferred for most proteins on Earth, many functions remain to be discovered in numerous small, rare protein families.

Authors and Affiliations

E D Harrington, A H Singh, T Doerks, I Letunic, C von Mering, L J Jensen, J Raes, P Bork

How To Cite

E D Harrington, A H Singh, T Doerks, I Letunic, C von Mering, L J Jensen, J Raes, P Bork (2007). Quantitative assessment of protein function prediction from metagenomics shotgun sequences. Proceedings of the National Academy of Sciences of the United States of America, 104(35), 13913-13918. https://europub.co.uk./articles/-A-82726