Proteomics research in many organisms is hampered by the lack of an appropriate reference genome sequence for interpreting
tandem mass spectrometry data to identify proteins. Public DNA sequence repositories
have grown to considerable size and can, in most cases, serve to provide at least partial interpretation of a large-scale
proteomics dataset. However, when species-specific sequences or sequences from a closely related species are available, a
boutique sequence database can provide considerable increases in specificity, confidence, and completeness of protein identification.
Here, we describe the development of a protein database from a large-scale expressed sequence tag and full-length complementary
DNA sequencing project in the economically and ecologically important spruce (Picea) genus.
Keywords
Picea - Proteome - Database
Communicated by J. Dean