Variable Selection in Covariate Dependent Random Partition Models: An Application to Urinary Tract Infection


Lower urinary tract symptoms (LUTS) can indicate the presence of urinary tract infection (UTI), a condition that, if it becomes chronic, requires expensive and time-consuming care and leads to reduced quality of life. Detecting the presence and severity of an infection from the earliest symptoms is therefore highly valuable. Typically, white blood cell count (WBC) measured in a sample of urine is used to assess UTI. We consider clinical data from 1341 patients at their first visit in which UTI (i.e. WBC $\geq 1$) is diagnosed. In addition, for each patient, a clinical profile of 34 symptoms was recorded. In this paper we propose a Bayesian nonparametric regression model based on the Dirichlet Process (DP) prior, aimed at providing clinicians with a meaningful clustering of the patients based on both the WBC (response variable) and possible patterns within the symptom profiles (covariates). This is achieved by assuming a probability model for the symptoms as well as for the response variable. To identify the symptoms most associated with UTI, we specify a spike and slab base measure for the regression coefficients: this induces dependence of symptom selection on cluster assignment. Posterior inference is performed through Markov chain Monte Carlo methods.