Chicken proteomics - beyond the Genome

B. P. Singh
Central Avian Research Institute
Email: ubstomar@rediffmail.com

The large-scale study of proteins, with particular reference to their structures and functions, is known as proteomics.

  1. The term "proteomics" was coined by analogy with genomics, and the field is often viewed as the "next step", but proteomics is much more complicated than genomics.
  2. Most importantly, while the genome is a rather constant entity, the proteome is constantly changing through its biochemical interactions with the genome.
  3. One organism will have radically different protein expression in different parts of its body and in different stages of its life cycle.
  4. The entirety of proteins in existence in an organism throughout its life cycle, or on a smaller scale the entirety of proteins found in a particular cell type under a particular type of stimulation, is referred to as the proteome of the organism or cell type, respectively.

With the completion of a rough draft of the chicken genome, many researchers are looking at how genes and proteins interact to form other proteins. Cataloguing all chicken proteins and ascertaining their functions and interactions presents a daunting challenge for scientists.

  1. The availability of a draft chicken genome sequence has made the chicken the first production animal species to enter the post-genomic era.
  2. The challenge for research workers now is to use this information to understand the molecular basis of chicken normo- and patho-physiology (Burgess, 2004).
  3. The integration of all current molecular resources, such as genetic and physical maps, QTL markers, the whole genome sequence, micro-arrays, EST libraries and proteomics, will be key to unraveling the molecular mechanisms that control complex biological systems (Wong, 2004).
  4. Proteomics will soon become as important to chicken researchers as it has to human researchers. This is because, due to its biomedical utility, the chicken will imminently join those vertebrates that have representative genomes sequenced (Burt and Pourquie, 2003).

The proteome and proteomics - Key concepts

  1. Plasma is the liquid part of the blood and the only tissue in which every single protein in the body (estimated to be at least 500,000 proteins; Anderson and Anderson, 2002) could potentially be present.
  2. Blood plasma is currently the most common tissue used for diagnosis of disease and nutritional status, and it holds immense potential for diagnosing, understanding and treating human and animal diseases (Yudell et al., 2002; Tirumalai et al., 2003). However, very little information is currently extracted from the storehouse of plasma, because nucleic acid techniques such as cDNA micro-array analysis, PCR and serial analysis of gene expression are not appropriate for blood plasma. Plasma is a collection of proteins; any nucleic acids present are there through leakage. Therefore, functional genomic analysis of plasma can only be done by proteomics (Corzo et al., 2004).
    1. Proteomics is the study of the entire protein complement of an organism. Marc Wilkins (1994) coined the term "proteome" at the two-dimensional (2-D) gel electrophoresis meeting in Siena, Italy. The original definition of "proteomics" was the study of the proteome (Wilkins et al., 1996). Since then, a number of definitions have been suggested.
    2. Proteomics has been considered an immature discipline that uses relatively immature technology (Patterson and Aebersold, 2003), but it is at present advancing rapidly (Pardanani et al., 2002).
    3. The proteome is context dependent; it includes quantity, environment, time, stoichiometry, post-translational modifications (PTMs) such as glycosylation, and interacting partners (Burgess, 2004). Moreover, the proteome is diverse and complex and may be infinite (Huber, 2003).
    4. The proteome is enormously complex. Estimates of the total potential number of proteins produced by the human genome's complement of 40,000 genes are in the vicinity of 500,000 [one human gene alone produces in excess of 1,000 proteins (Ullrich et al., 1995)]. This number of genes would not provide the complexity required for vertebrate growth, maintenance, and function if only a single protein were produced from each gene. Further complexity is provided by alternative mRNA splicing followed by co- and post-translational modifications; in vertebrates, more than 200 such separate protein modifications have been noted. More than one of these modifications routinely occurs on most proteins (Gooley and Packer, 1997).
    5. It would be impossible for scientists with a biology background to undertake investigations in proteomics in isolation; teams that also include expertise in computing, bioinformatics, mathematics and physics, at least to some degree, must collaborate in an organized manner. This means that proteomics will require huge investments in new technologies and methodologies to produce large amounts of data, along with new ways of analyzing this information (Hanash and Celis, 2002).
    6. Burgess (2004) has divided proteomics into three main areas:
      1. The first area is termed "Expression Proteomics". This area measures the relative abundances of proteins and is conceptually equivalent to differential gene expression experiments using cDNA micro-arrays.
      2. The second area is "Cellular Proteomics". The aim of this area is to identify protein-protein interactions and to show the complex interacting networks that are the components of biological machines.
      3. The third area is "Structural Proteomics". In this area the main goal is to predict the three-dimensional structures of proteins on a genome-wide scale.

Categories of Proteomics

Activity Based Proteomics, Applied Proteomics, Cell Signaling Proteomics, Chemical Proteomics, Clinical Proteomics, Comparative Proteomics, Computational Proteomics, Environmental Proteomics, Expression Proteomics, Functional Proteomics, High-Throughput Proteomics, Human Proteomics Initiative, In Silico Proteomics, Interaction Proteomics, Microbial Proteomics, Phyloproteomics, Physiological Proteomics, Post-Proteomics, Proteomic Technologies, Reverse Proteomics, Riboproteomics, Shotgun Proteomics, Structural Proteomics, Targeted Proteomics, Tissue Proteomics, Topological Proteomics, Toxicoproteomics

Toolbox of Proteomics

  1. The obsolescence rate of proteomics equipment is very rapid.
  2. The genomic sequence data is linear, whereas the data set in proteomics is nonlinear and relies heavily on human intervention and prompt decision making.
  3. Appropriate interpretation of proteomics data needs understanding of proteomics technologies, their functions and limitations.
  4. Proteomics has relatively poor sensitivity.
  5. The target-specific exponential amplification offered by PCR for nucleic acids has no equivalent for proteins (Fredriksson et al., 2002; Gullberg et al., 2003). It should be noted that the individual technologies are well established in their respective areas; what is required in the near future, to achieve the goals of proteomics, is their integration.
  6. A complete genomic sequence facilitates proteomics, but its availability is not absolutely essential. It is better if the genome data are well annotated and in a usable format (e.g. Swiss-Prot and NRDB), so that proteins can be properly identified by matching against mass spectrometry (MS) data. This fundamental requirement was immediately recognized for the chicken (Burt and Pourquie, 2003).
  7. Well-annotated pathogen genome sequences, in addition to the chicken genome sequence, will facilitate the proteomics of immunity and disease.
  8. Bioinformatics and computational technology are critical to proteomics for the development of future knowledge. The genomes of higher vertebrates are now estimated to have some 30,000-40,000 genes, and identifying the genes associated with specific traits is a formidable challenge. Contemporary genomics is a data-driven information science, and the amounts of data generated in proteomics are exponentially larger than those of genomics. The proteome is immensely complex, and there may be large inherent variation in proteomics data. Those familiar with cDNA micro-arrays will be aware that similar issues face transcriptome analyses (Bhamot et al., 2003). Although they are relatively new areas of science, bioinformatics and computational biology will be centrally important to modern proteomics.
  9. As proteomic technology has advanced over the last five years, the ability to generate data has rapidly outpaced our ability to analyze it. The only way that we can hope to make sense of the flood of proteomics-related data is through the intelligent application of computing power. Since bioinformatics and computational biology are essential to proteomics, they need to be defined. The definitions proposed by the US National Institutes of Health (Huerta et al., 2000) are given below.
    Bioinformatics: research, development, or application of computational tools and approaches for expanding the use of biological, medical, behavioral or health data, including those to acquire, store, organize, archive, analyze, or visualize such data.
    Computational biology: the development and application of data-analytical and theoretical methods, mathematical modeling, and computational simulation techniques in the study of biological, behavioral, and social systems.
  10. Bioinformatics is therefore essential in generating, storing and manipulating proteomics data and, more importantly, in extracting knowledge from it. However, the accumulation of proteomic data would be pointless without an understanding of the functioning of the proteome(s) and of the structure and expression of proteins.
  11. Computational technology is an exciting development in the computing world. In general, it involves the aggregation of computing capacity or data storage into a virtual entity that is then available as a service across a network.
  12. The term for data describing proteomics results is metadata, i.e. "data about data". From the perspective of a biological proteomicist, metadata include sample preparation methods, liquid chromatography gradients, mass spectrometer settings and database search algorithm settings (Burgess, 2004).
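
The four kinds of metadata listed in the last point can be kept together as one record per run. The following is only a minimal sketch in Python: the class name, field names and example values are hypothetical and illustrative, and do not correspond to any published metadata standard or to settings reported by the cited authors.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class RunMetadata:
    """Hypothetical "data about data" record for a single proteomics run."""
    sample_preparation: str                               # how the sample was prepared
    lc_gradient: str                                      # liquid chromatography gradient
    ms_settings: dict = field(default_factory=dict)       # mass spectrometer settings
    search_settings: dict = field(default_factory=dict)   # database search settings


def to_json(meta: RunMetadata) -> str:
    """Serialise the record so it can be stored next to the raw data."""
    return json.dumps(asdict(meta), sort_keys=True, indent=2)


# Illustrative values only (assumptions, not measured settings).
example = RunMetadata(
    sample_preparation="pooled plasma, in-gel trypsin digest",
    lc_gradient="not specified in the text",
    ms_settings={"ion_source": "ESI"},
    search_settings={"database": "non-redundant chicken protein database"},
)

print(to_json(example))
```

Storing such a record with every result file is one simple way to keep results interpretable later, since, as noted above, interpretation depends on knowing how the data were produced.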

Equipment

1. Electrophoresis-based Proteomics

  1. One- or two-dimensional, using SDS-PAGE
  2. Liquid and gel-based micro-capillary systems.
    (Capillary Electrophoresis and Capillary Electro-Chromatography)
  3. The aim is to deconvolute a complex protein or peptide mixture by taking advantage of the physicochemical properties of its components.
  4. With special reference to proteomics, one-dimensional SDS-PAGE can be used for sub-cellular protein complexes, provided it is combined with other technologies.
  5. 2-D PAGE is the electrophoresis-based technology most commonly associated with proteomics. This technique separates proteins in 2 dimensions. The first dimension is separation by iso-electric point (pI) in a polyacrylamide gel that has a pH gradient, in an electric field. The process is known as iso-electric focusing (IEF). The second dimension is separation by molecular mass using SDS-PAGE.
  6. A recent version of 2-D PAGE is fluorescence 2-D difference gel electrophoresis (DIGE). The advantage of 2-D DIGE is that proteins from 2 different sources may be analyzed and compared in a single gel, which significantly improves experimental accuracy and repeatability. Briefly, proteins are harvested and total protein is quantified.
  7. Post-electrophoresis, the gel is imaged into its 3 component images. The images are superimposed, and differences are critically analyzed by software.

2. Electrophoresis-free Proteomics

  1. This approach also relies on methods of deconvoluting biological samples prior to identification, mainly multidimensional HPLC.
  2. The HPLC may be done off-line or directly in-line with a mass spectrometer fitted with an electrospray ionization (ESI) source.
  3. Integrated proteomics methods that use multidimensional LC have been called multidimensional protein identification technology, MudPIT (Wolters et al., 2001), or direct analysis of large protein complexes.

3. Mass Spectrometers

  1. After deconvolution of the proteome, MS combined with computer algorithms and soft ionization techniques is critical for identifying proteins and their PTMs.
  2. The ability of mass spectrometers to weigh very small things very accurately is critical to proteomics.
  3. To differentiate between peptides with the same mass but different sequences, molecules of the peptide of interest are physically selected by the MS for fragmentation analysis.
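
The last point can be illustrated with a short calculation. The sketch below, in Python, uses the standard monoisotopic residue masses; the two peptide sequences are hypothetical and were chosen only because they contain the same amino acids in a different order. It shows that the intact masses are identical, whereas the masses of the N-terminal fragments (b-ions) differ, which is why fragmentation is needed to tell such peptides apart. Real instruments and search software handle many more details (charge states, other ion series, modifications) that are left out here.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # one water is added for the free N- and C-termini
PROTON = 1.00728   # mass of the charge carrier for a singly charged ion


def peptide_mass(sequence):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def b_ions(sequence):
    """m/z of singly charged b-ions (N-terminal fragments), shortest first."""
    values = []
    running = 0.0
    for aa in sequence[:-1]:          # the full-length peptide is not a fragment
        running += RESIDUE_MASS[aa]
        values.append(running + PROTON)
    return values


# Two hypothetical peptides with the same composition but different sequence.
first, second = "SAMPLER", "PLAMSER"
print(peptide_mass(first), peptide_mass(second))   # same intact mass
print(b_ions(first))
print(b_ions(second))                              # different fragment masses
```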

4. Emerging technologies

  1. Three emerging proteomics technologies are quantitative methods for electrophoresis-free proteomics, protein/peptide arrays, and imaging MS.

Immunoproteomics

When proteomics is applied to immunology and immunity, it is termed "immunoproteomics" (Jungblut, 2001).

  1. Immunoproteomics is in its infancy.
  2. Proteomics relies on a number of technologies and has applications in many disciplines. Burgess (2004) has enumerated the applications of proteomics for defining disease pathogenesis and immunity and for providing insights that suggest novel hypotheses for improving health. The general groups for research are as follows:
    1. Definition of immune system function and dysfunction from primary structure to higher order structure, to a global level.
    2. Definition of pathogens from primary structure to higher order structure, to a global functional molecular level.
    3. Identification of immunogens and epitopes, particularly in the context of the genetic basis of host disease resistance and susceptibility, and identification of the virulence and attenuation determinants of pathogens and related vaccine strains; and
    4. Definition of molecular machines used in host immunity and pathogen immune evasion.
  3. Proteome maps aimed at understanding gene expression in immune system development, disease, physiology, and pharmaco-therapeutics have been established but are not yet comprehensive (Vuadens et al., 2002).
  4. The proteome of all pathologies is rich in potential immunogens, and differential proteome mapping has been done for many pathologies with special reference to cancer.
  5. The impact of pathogen proteomics will be greater than just identifying potential vaccine antigens.
  6. One of the important objectives of proteomics is to identify functional protein complexes.
  7. Mittelman et al. (2002) have reported a computational approach in human cancer immunology to understand the correlation between the immunogenicity of peptides derived from tumor-associated antigens and their level of similarity to the host's proteome.

Immunoproteomics in chicken

  1. The fundamental requirement of chicken immunoproteomics is to sequence pathogens in addition to the chicken genome.
  2. The biggest impediment to chicken proteomics has been removed with the very recent availability of the draft chicken genome sequence (Wong, 2004).
  3. It is essential to have basic maps of cells and tissues in health and disease, as well as those of pathogens, to define the functional capacity of the chicken genome.
  4. The chicken is considered an excellent biomedical model for immunology research. The chicken is beginning to prove itself uniquely positioned to elucidate critical sequences in vertebrate genomes (Burt and Pourquie, 2003).
  5. Currently, with two notable exceptions, one from the Beynon group at the University of Liverpool Veterinary School, U.K. (Hayter et al., 2003) and the second from the Department of Poultry Science, Mississippi State University, USA (Corzo et al., 2004), no work on proteomics in the chicken has been published.
  6. Corzo et al. (2004) have reported the first results toward mapping the broiler plasma proteome. Blood was taken from eight representative 18-day-old commercial broiler chickens. Plasma was isolated from each sample and pooled. For initial sample fractionation, a 0.4 μl aliquot of the pooled plasma was run on one-dimensional sodium dodecyl sulfate polyacrylamide gel electrophoresis. Based on relative amounts of protein, the gel was divided into three fractions. The proteins were in-gel digested with trypsin. Two-dimensional liquid chromatography in-line with electrospray ionization tandem mass spectrometry was then used for "shotgun" qualitative plasma proteomics. The resulting tandem mass spectra were searched against the non-redundant chicken protein database, using generally accepted high-stringency statistical criteria for protein identification. Eighty-four chicken proteins were identified. This is perhaps the largest number of proteins ever identified and reported in a single assay from chicken plasma. Coagulation factors and transport and structural proteins were among the proteins found in the plasma.
  7. Therefore, like clinical pathology, one of the aims of proteomics is to identify the efficient bio-markers of disease and production in chickens.
  8. Immunogenetics is essential to chicken production. Not all chicks are hatched immunologically equal. There are extreme differences in the abilities of chicken genotypes to mount immune responses to both pathogens and vaccines.
  9. The egg and broiler industries use completely different genotypes. The chickens in each industry live in different environments with diverse conditions and suffer from diseases that may sometimes differ. The progenitor of all domestic fowl, the Red Jungle Fowl, lives in yet another set of conditions. Layers and broilers have experienced intensive selection to survive in different pathogen environments. Therefore, the immunogenetics of these three are different; their immunoproteomes are probably even more different. The presence of inbred lines of chicken may facilitate proteomics analysis.
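
The trypsin digestion step in the broiler plasma study described above can be mimicked in silico, which is how the candidate peptides for a database search are usually generated. The following Python sketch assumes the commonly used rule that trypsin cleaves after lysine (K) or arginine (R) except when the next residue is proline (P); the text does not state which rule or software the cited authors used, and the example sequence is hypothetical, not a real chicken protein.

```python
def tryptic_digest(protein, missed_cleavages=0):
    """Return peptides from an in-silico trypsin digest.

    Assumed rule: cleave C-terminal to K or R unless followed by P.
    `missed_cleavages` also returns peptides that span up to that many
    uncut sites, since real digests are rarely complete.
    """
    if not protein:
        return []

    # Positions where the chain is cut, including both ends of the protein.
    sites = [0]
    for i in range(len(protein) - 1):
        if protein[i] in "KR" and protein[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(protein))

    peptides = []
    for start in range(len(sites) - 1):
        for extra in range(missed_cleavages + 1):
            end = start + 1 + extra
            if end < len(sites):
                peptides.append(protein[sites[start]:sites[end]])
    return peptides


# Hypothetical sequence: the R followed by P is not cleaved.
print(tryptic_digest("AAKGGRPTTKCC"))
print(tryptic_digest("AAKGGRPTTKCC", missed_cleavages=1))
```

The masses of the resulting peptides would then be compared against the measured spectra; that matching step is outside the scope of this sketch.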

The Future:

  1. Finally, chicken proteomics is an exciting and organised collection of technologies for understanding proteomes in order to answer biological questions about birds.
  2. The availability of the chicken genome sequence has enabled the exploration and exploitation of the chicken genome and proteome to begin. Research will now be focussed on the annotation of the genome and, in particular, of the proteome. This means providing, for each known protein, a wealth of information that includes a description of its function, its domain structure, sub-cellular location, post-translational modifications, variants, similarities to other proteins, etc.
  3. Chicken proteomics is likely to involve efforts in the following research areas in the future.

    1. Exploiting genome information for cross-species identification strategies in proteomics
    2. Metabolic turnover of the proteome
    3. Development of experimental and bioinformatic tools for chicken proteomics
    4. Novel proteomic technologies for protein: protein interactions
    5. Proteomic protocols for genome-wide analysis of protein complexes
    6. Genome-wide protein: protein interaction studies via chemical cross-linking proteomics
    7. Bioinformatic data warehousing for proteomics.

References: on request

Source : IPSACON-2005
