Inadequate Reference Datasets Biased toward Short Non-epitopes Confound B-cell Epitope Prediction

Rahman, Kh. Shamsur; Chowdhury, Erfan U.; Sachse, Konrad; Kaltenboeck, Bernhard



http://orcid.org/0000-0001-7243-8501


Abstract

X-ray crystallography has shown that an antibody paratope typically binds 15-22 amino acids (aa) of an epitope, of which 2-5 randomly distributed amino acids contribute most of the binding energy. In contrast, researchers typically choose short peptide antigens for B-cell epitope mapping in antibody binding assays. Furthermore, short 6-11-aa epitopes, and in particular non-epitopes, are over-represented in the published B-cell epitope datasets that are commonly used to develop approaches for predicting B-cell epitopes from protein antigen sequences. We hypothesized that such peptides of suboptimal length result in weak antibody binding and cause false-negative results. We tested the influence of peptide antigen length on antibody binding by analyzing data on more than 900 peptides used for B-cell epitope mapping of immunodominant proteins of Chlamydia spp. We demonstrate that short 7-12-aa peptides of B-cell epitopes bind antibodies poorly; thus, epitope mapping with short peptide antigens falsely classifies many B-cell epitopes as non-epitopes. We also show, in published datasets of confirmed epitopes and non-epitopes, a direct correlation between the length of peptide antigens and antibody binding. Elimination of short, ≤11-aa epitope/non-epitope sequences improved datasets for evaluation of in silico B-cell epitope prediction. With up to 86% accuracy, protein disorder tendency is the best indicator of B-cell epitope regions in both the chlamydial and the published datasets. For B-cell epitope prediction, the most effective approach is plotting disorder of protein sequences with the IUPred-L scale, followed by antibody reactivity testing of 16-30-aa peptides from peak regions. This strategy overcomes the well-known inaccuracy of in silico B-cell epitope prediction from primary protein sequences.
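The peptide selection step described in the abstract (disorder peaks followed by 16-30-aa peptides) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' procedure: it assumes the per-residue disorder scores have already been computed elsewhere (for example with IUPred in long mode) and are passed in as a plain list. The 0.5 threshold, the function names, and the rule for widening or trimming a peak region are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple


def disordered_runs(scores: Sequence[float], threshold: float) -> List[Tuple[int, int]]:
    """Return half-open (start, end) index ranges where score >= threshold."""
    runs = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(scores)))
    return runs


def fit_window(start: int, end: int, scores: Sequence[float],
               min_len: int, max_len: int) -> Tuple[int, int]:
    """Widen or trim a run so its length falls within [min_len, max_len].

    Assumed rule: long runs are trimmed to the max_len window with the
    highest summed score; short runs are padded on both sides, clamped
    to the sequence ends.
    """
    n = len(scores)
    length = end - start
    if length > max_len:
        best = max(range(start, end - max_len + 1),
                   key=lambda i: sum(scores[i:i + max_len]))
        return best, best + max_len
    if length < min_len:
        pad = min_len - length
        new_start = max(0, start - pad // 2)
        new_end = min(n, new_start + min_len)
        new_start = max(0, new_end - min_len)
        return new_start, new_end
    return start, end


def candidate_peptides(sequence: str, scores: Sequence[float],
                       threshold: float = 0.5,   # assumed cutoff
                       min_len: int = 16, max_len: int = 30
                       ) -> List[Tuple[int, int, str]]:
    """Return (first_residue, last_residue, peptide) with 1-based positions."""
    if len(sequence) != len(scores):
        raise ValueError("sequence and scores must have the same length")
    peptides = []
    for start, end in disordered_runs(scores, threshold):
        s, e = fit_window(start, end, scores, min_len, max_len)
        peptides.append((s + 1, e, sequence[s:e]))
    return peptides
```

Calling `candidate_peptides(sequence, scores)` yields one candidate peptide per above-threshold region. Windows from neighboring short regions may overlap, and a protein shorter than 16 aa yields a peptide shorter than the minimum; a real workflow would need to decide how to handle both cases. The candidates would then go to antibody reactivity testing, as the abstract describes.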

Related items

Active Site-labeled Prothrombin Inhibits Prothrombinase in Vitro and Thrombosis in Vivo

Response of Channel Catfish to Variable Concentrations of Dietary Protein

Discovery and evolution of novel hemerythrin genes in annelid worms
