000 02267 am a22002293u 4500
042 _adc
100 1 0 _aCrook, Oliver M.
_eauthor
_91934
700 1 0 _aGatto, Laurent
_eauthor
_91935
700 1 0 _aKirk, Paul D.W.
_eauthor
_91936
245 0 0 _aFast approximate inference for variable selection in Dirichlet process mixtures, with an application to pan-cancer proteomics
260 _c2019-12-12.
500 _a/pmc/articles/PMC7614016/
500 _a/pubmed/31829970
520 _aThe Dirichlet Process (DP) mixture model has become a popular choice for model-based clustering, largely because it allows the number of clusters to be inferred. The sequential updating and greedy search (SUGS) algorithm (Wang & Dunson, 2011) was proposed as a fast method for performing approximate Bayesian inference in DP mixture models, by posing clustering as a Bayesian model selection (BMS) problem and avoiding the use of computationally costly Markov chain Monte Carlo methods. Here we consider how this approach may be extended to permit variable selection for clustering, and also demonstrate the benefits of Bayesian model averaging (BMA) in place of BMS. Through an array of simulation examples and well-studied examples from cancer transcriptomics, we show that our method performs competitively with the current state-of-the-art, while also offering computational benefits. We apply our approach to reverse-phase protein array (RPPA) data from The Cancer Genome Atlas (TCGA) in order to perform a pan-cancer proteomic characterisation of 5157 tumour samples. We have implemented our approach, together with the original SUGS algorithm, in an open-source R package named sugsvarsel, which accelerates analysis by performing intensive computations in C++ and provides automated parallel processing. The R package is freely available from: https://github.com/ococrook/sugsvarsel
540 _aThis work is licensed under the Creative Commons Attribution 4.0 Public License: https://creativecommons.org/licenses/by/4.0/
546 _aen
690 _aArticle
655 7 _aText
_2local
786 0 _nStat Appl Genet Mol Biol
856 4 1 _uhttp://dx.doi.org/10.1515/sagmb-2018-0065
_zConnect to this object online.
999 _c1946
_d1946