i-mAB Analysis Pipeline:

Integrated machine learning for aberrant biomarker enrichment (i-mAB) of clusters of differentiation

Integrated machine learning pipeline for aberrant biomarker enrichment (i-mAB): characterizing clusters of differentiation within a compendium of systemic lupus erythematosus patients

Trang T. Le, Nigel O. Blackwood, Jaclyn N. Taroni, Weixuan Fu, Matthew K. Breitenstein

Department of Biostatistics, Epidemiology, and Informatics; Department of Systems Pharmacology and Translational Therapeutics; Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA

Abstract

Clusters of differentiation (CD) are cell surface biomarkers that denote key biological differences between cell types and disease states. CD-targeting therapeutic monoclonal antibodies (mABs) afford rich trans-disease repositioning opportunities. Within a compendium of systemic lupus erythematosus (SLE) patients, we applied the Integrated machine learning pipeline for aberrant biomarker enrichment (i-mAB) to profile de novo gene expression features affecting CD20, CD22, and CD30 gene aberrance. First, a novel Relief-based algorithm identified interdependent features (p=681) predicting treatment-naïve SLE patients (balanced accuracy=0.822). We then compiled CD-associated expression profiles using regularized logistic regression and pathway enrichment analyses. In data from an independent general cell line model system, we replicated (in silico) associations of BCL7A (padj=1.69e-9) and STRBP (padj=4.63e-8) with CD22; NCOA2 (padj=7.00e-4), ATN1 (padj=1.71e-2), and HOXC4 (padj=3.34e-2) with CD30; and PHOSPHO1, a phosphatase linked to bone mineralization, with both CD22 (padj=4.37e-2) and CD30 (padj=7.40e-3). Utilizing carefully aggregated secondary data and leveraging a priori hypotheses, i-mAB fostered robust biomarker profiling among interdependent biological features.
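For readers unfamiliar with the balanced accuracy metric reported in the abstract, the following is a minimal sketch of how it is usually computed for a binary outcome: the mean of sensitivity and specificity. The function name `balanced_accuracy` and the toy labels are hypothetical and for illustration only; this is not code from the i-mAB pipeline, and the labels are not study data.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (1 = case, 0 = control)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # cases predicted as cases
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # cases predicted as controls
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # controls predicted as controls
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # controls predicted as cases
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical toy labels (not study data): 4 cases, 2 controls.
y_true = [1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1]
score = balanced_accuracy(y_true, y_pred)  # sensitivity 0.75, specificity 0.5
print(score)
```

Unlike plain accuracy, this metric weights both classes equally, so it is not inflated when one class (for example, cases versus controls) dominates the sample.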

Please cite as: Le TT, Blackwood NO, Taroni JN, Fu W, Breitenstein MK†. Integrated machine learning pipeline for aberrant biomarker enrichment (i-mAB): characterizing clusters of differentiation within a compendium of systemic lupus erythematosus patients. AMIA 2018 Annual Symposium Proceedings – accepted 6/12/2018. Link to supplemental information: https://arxiv.org/abs/1803.04487

Data: https://upenn.box.com/s/8ywe6fc8uwl0qwzugxq8osfbdv0axs7z (note: link updated 11/2/2018)

Pre-print (with supplemental tables/figures): Le TT, Blackwood NO, Taroni JN, Fu W, Breitenstein MK. Integrated machine learning pipeline for aberrant biomarker enrichment (i-mAB): characterizing clusters of differentiation within a compendium of systemic lupus erythematosus patients. arXiv preprint arXiv:1803.04487. 2018 Mar 8.