Protein Characterization

In-depth characterization of complex protein molecules

Intact mass analysis

This method measures the molecular weight of an intact protein. The protein sample is desalted on a short reverse phase column and analyzed on a Q-TOF mass spectrometer. The protein picks up multiple charges during electrospray and forms a so-called charge envelope, a series of peaks that each correspond to a different charge state. A maximum entropy algorithm is applied to deconvolute the multiply charged mass peaks into the molecular mass of the original protein. Comparing the measured mass with the value predicted from the protein sequence can confirm the sequence identity. Our method can readily measure proteins up to 300 kDa in size and routinely achieves mass accuracy within +/- 1 Da for proteins around 50 kDa, which is sufficient to detect the majority of point mutations and post-translational modifications. Many therapeutic proteins (e.g., IgG) are glycosylated, and the mass spectra of these proteins show branching patterns resulting from the heterogeneity in glycan structures. Our method can resolve such heterogeneity for proteins with up to two glycosylation sites. When the protein has more than two glycosylation sites, the pattern becomes overly crowded and the masses of individual glycoforms can no longer be determined. For such proteins, deglycosylation may be necessary to simplify the spectrum.
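The relationship between the charge envelope and the protein mass can be illustrated with a small sketch. It is not the maximum entropy algorithm mentioned above; it is the simpler textbook calculation that infers the charge from two adjacent peaks and averages the implied masses. The function names and the synthetic 50 kDa envelope are assumptions made for illustration only.

```python
PROTON_MASS = 1.007276  # Da, mass added per positive charge in electrospray


def mz_of(neutral_mass, charge):
    """m/z of a protein with the given neutral mass carrying `charge` protons."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def charge_of_higher_peak(mz_high, mz_low):
    """Charge of the peak at mz_high, assuming the adjacent peak at mz_low
    carries exactly one more proton."""
    return round((mz_low - PROTON_MASS) / (mz_high - mz_low))


def deconvolute(envelope_mz):
    """Average the neutral mass implied by each adjacent pair of peaks."""
    peaks = sorted(envelope_mz, reverse=True)  # highest m/z = lowest charge
    masses = []
    for mz_high, mz_low in zip(peaks, peaks[1:]):
        z = charge_of_higher_peak(mz_high, mz_low)
        masses.append(z * (mz_high - PROTON_MASS))
    return sum(masses) / len(masses)


def ppm_error(measured, predicted):
    """Mass error in parts per million."""
    return (measured - predicted) / predicted * 1e6


# Synthetic envelope for a hypothetical 50 kDa protein, charge states 30 to 35
envelope = [mz_of(50000.0, z) for z in range(30, 36)]
estimate = deconvolute(envelope)
```

With these definitions, an error of 1 Da on a 50 kDa protein corresponds to about 20 ppm, which is the accuracy range quoted above. Real spectra contain noise, adducts and overlapping species, which is why a statistical deconvolution method is used in practice.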

Peptide mapping

This method is also referred to as the bottom-up approach. Proteins are first denatured, reduced and alkylated. The alkylated proteins are then digested into peptides with a protease such as trypsin. The resultant peptide mixture is separated on a UPLC reverse phase column and subsequently analyzed on a Q-TOF mass spectrometer. The m/z values of the peptides are compared with the values from an in-silico digest of the protein sequence. The identities of the peptides are further confirmed by mass spectrometric isolation of individual peptide ions and fragmentation analysis. The fragmentation data are compared with the product ions predicted in silico from the precursor peptide sequence. For peptides that return low scores from spectral matching, we manually inspect the spectra to validate the match based on empirical knowledge of peptide fragmentation, e.g., enhanced peptide cleavage near proline and histidine residues. All identified peptides are compiled into a sequence coverage map. We routinely achieve greater than 95% sequence coverage of recombinant proteins with trypsin alone or in conjunction with additional proteases, such as Glu-C and Asp-N. For proteins that are rich in hydrophobic residues and lack basic or acidic residues, such as membrane proteins, we use pepsin, which usually cuts well at hydrophobic sequences.
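The in-silico digest and the sequence coverage figure can be sketched as follows. This is a minimal illustration, assuming the common trypsin rule (cleave after K or R unless the next residue is P) with no missed cleavages; the function names and the short test sequence are hypothetical and not part of any specific software.

```python
def trypsin_digest(sequence):
    """Cleave C-terminal to K or R, except when the next residue is P.
    Missed cleavages are not modeled in this sketch."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def sequence_coverage(sequence, identified_peptides):
    """Percentage of residues covered by at least one identified peptide."""
    covered = [False] * len(sequence)
    for peptide in identified_peptides:
        start = sequence.find(peptide)
        while start != -1:  # mark every occurrence of the peptide
            for i in range(start, start + len(peptide)):
                covered[i] = True
            start = sequence.find(peptide, start + 1)
    return 100.0 * sum(covered) / len(sequence)


# Hypothetical 10-residue sequence; the KR-P bond is not cleaved
protein = "AAKRPGGRCC"
peptides = trypsin_digest(protein)
coverage = sequence_coverage(protein, ["AAK", "RPGGR"])
```

Here the digest yields three peptides, and identifying the first two covers 8 of 10 residues. Peptides missed by one protease are the reason a second enzyme with a different specificity can raise the coverage.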

An important use of peptide mapping is to identify the locations of post-translational modifications. We have extensive expertise in analyzing modifications including:

  • Oxidation and deamidation
  • Phosphorylation
  • Disulfide bond
  • Glycosylation
  • Chemical crosslinking
  • PEGylation
  • Drug conjugation

Depending on the nature of the modification, we adopt different approaches to sample preparation and mass spectrometric data collection. For stoichiometric modifications such as PEGylation and drug conjugation, where the majority of the protein is expected to be modified, we start with intact mass analysis to capture the global profile of the molecular species and then follow with peptide mapping. For low-abundance modifications, we rely on specific enrichment techniques to harvest the modified species, e.g., metal affinity for phosphopeptides. We also take advantage of specific fragmentation signals from modified peptides, e.g., loss of phosphoric acid from phosphopeptides, HexNAcHex ions from glycopeptides, or linker cleavage from drug conjugates, to facilitate discovery. Once a modification is identified, the relative abundance of the modified species can be estimated from the intensity ratio of the modified vs. native peptides. It should be noted that a modification can alter the ionization efficiency of the peptide, so more accurate quantitation requires synthetic peptide standards. For monoclonal antibodies, we identify and report the relative abundance of commonly occurring modifications including pyroglutamate formation, glycosylation, lysine deletion, deamidation and oxidation. The results are suitable for regulatory filing purposes.
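Two of the calculations described above can be sketched in a few lines: the relative abundance estimate from peak intensities, and the m/z at which a phosphopeptide is expected to appear after losing phosphoric acid. This is an illustrative sketch only. It assumes equal ionization efficiency for the modified and native forms, which the text notes is an approximation, and the function names and intensity values are hypothetical.

```python
H3PO4_MASS = 97.976897  # Da, monoisotopic mass of phosphoric acid


def percent_modified(modified_intensity, native_intensity):
    """Relative abundance of the modified peptide, assuming both forms
    ionize equally well."""
    total = modified_intensity + native_intensity
    if total <= 0:
        raise ValueError("no signal for either peptide form")
    return 100.0 * modified_intensity / total


def phospho_neutral_loss_mz(precursor_mz, charge):
    """m/z of the fragment formed when a phosphopeptide precursor loses
    H3PO4 while keeping its charge."""
    return precursor_mz - H3PO4_MASS / charge


# Hypothetical extracted-ion intensities for an oxidized and a native peptide
oxidized_percent = percent_modified(2.0e5, 1.8e6)

# Hypothetical doubly charged phosphopeptide precursor at m/z 800.0
loss_mz = phospho_neutral_loss_mz(800.0, 2)
```

For a doubly charged precursor the neutral loss appears about 49 m/z units below the precursor, since the mass difference is divided by the charge.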

Protein identification

Gel-separated proteins can be readily identified with our nanoLC-nanospray system, which provides sensitivity sufficient for silver-stained gels. The gel can be an SDS-PAGE gel from an immunoprecipitation experiment or a 2-D gel stained with Coomassie blue or silver. Complex protein samples such as cell and tissue extracts can be fractionated prior to LC-MS analysis, using either SDS-PAGE at the protein level or high pH reverse phase chromatography at the peptide level. Thousands of proteins can be confidently identified and reported.

Glycosylation analysis

Many protein drugs and drug targets are glycosylated proteins. The glycosylation profile has a significant impact on protein biophysical properties, such as hydrophobicity, thermal and protease stability, and structural flexibility. For monoclonal antibodies, the conserved Fc region glycan affects pharmacological activity, e.g., the degree of fucosylation affects antibody-dependent cell-mediated cytotoxicity (ADCC), and gal-gal can lead to immunogenicity. Glycoproteins present a challenge for mass spectrometry analysis because of the multiple glycosylation sites and the heterogeneity of the glycans at each site. We have developed a platform for efficient detection and characterization of glycoproteins. We employ hydrophilic interaction chromatography (HILIC) to enrich glycopeptides, and the Q-TOF mass spectrometer alternates between normal mode and an in-source fragmentation mode that induces fragmentation of the glycans. The characteristic ions produced by glycopeptides help locate the glycopeptides in the chromatogram. Each candidate glycopeptide is searched against a glycosylation database to identify the site. The glycan profile at each occupied site is reported based on the intensity of each glycoform. For IgGs, we report the abundance of Man5, G0, G0F, G1F, G2F and other minor glycan species detected. We have also developed a high-throughput technique that can rapidly screen the glycan profiles of IgGs for cell line clone selection. Often a glycosylation site is not fully occupied. The site occupancy is determined by removing the glycans with PNGase F, which generates a deamidated peptide at each previously occupied site. The structures of the released glycans are characterized on a nanospray source, which increases the sensitivity for carbohydrates. Fragmentation spectra of the glycans are analyzed to confirm the structure proposed based on the glycan mass. The glycosidic linkages are determined by digestion with exoglycosidases that are specific for different glycan linkages.
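The glycan masses, the glycoform profile and the site occupancy can be illustrated with a short sketch. The compositions below are the commonly used ones for the named IgG glycans, and the residue masses are standard monoisotopic values. The function names and intensities are hypothetical. The occupancy estimate assumes that all deamidated signal comes from PNGase F treatment rather than spontaneous deamidation, which is a simplification.

```python
# Monoisotopic residue masses in Da (monosaccharide minus water)
RESIDUE_MASS = {"Hex": 162.0528, "HexNAc": 203.0794, "Fuc": 146.0579}
WATER = 18.0106  # Da, added for a free (released) glycan

IGG_GLYCANS = {
    "Man5": {"HexNAc": 2, "Hex": 5},
    "G0": {"HexNAc": 4, "Hex": 3},
    "G0F": {"HexNAc": 4, "Hex": 3, "Fuc": 1},
    "G1F": {"HexNAc": 4, "Hex": 4, "Fuc": 1},
    "G2F": {"HexNAc": 4, "Hex": 5, "Fuc": 1},
}


def glycan_residue_mass(composition):
    """Mass the glycan adds to a peptide or protein."""
    return sum(RESIDUE_MASS[unit] * n for unit, n in composition.items())


def free_glycan_mass(composition):
    """Mass of the glycan after enzymatic release."""
    return glycan_residue_mass(composition) + WATER


def glycan_profile(intensities):
    """Convert glycoform intensities into percentages of the total."""
    total = sum(intensities.values())
    return {name: 100.0 * value / total for name, value in intensities.items()}


def site_occupancy(deamidated_intensity, unmodified_intensity):
    """Percent occupancy after PNGase F treatment: occupied sites are read
    out as the deamidated peptide, unoccupied sites as the unmodified one."""
    total = deamidated_intensity + unmodified_intensity
    return 100.0 * deamidated_intensity / total


# Hypothetical glycopeptide intensities for one IgG sample
profile = glycan_profile({"G0F": 60.0, "G1F": 30.0, "G2F": 5.0, "Man5": 5.0})
g0f_mass = glycan_residue_mass(IGG_GLYCANS["G0F"])
occupancy = site_occupancy(9.0, 1.0)
```

Each added galactose shifts the mass by one hexose residue (162.05 Da), which produces the regular spacing between G0F, G1F and G2F in the spectrum.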