Hemant Ishwaran

Professor, Graduate Program Director, Director of Statistical Methodology, Division of Biostatistics, University of Miami

Random Forest Big Data Grant (article appeared in UHealth news)

Favorite current quotes:

"Just say no to p-hacking!"
"We are drowning in data but starving for information"
"All models are wrong, but some are useful"

Research Interests

Cancer Staging, Trees, Forests, Ensembles, Bayesian and Frequentist Variable Selection, Bioinformatics, High Dimensions, Nonparametric Bayes

External Activities

Editor, Sankhya A, Sankhya B, 12-15
Deputy Statistical Editor, J. Thoracic and Cardiovascular Surgery
Associate Editor, JASA, Theory and Methods, 05-11
Associate Editor, Electronic Journal of Statistics, 07-13
Associate Editor, Statistics and Probability Letters, 07-10
Web Editor, Inst. of Mathematical Statistics, 03-05

Brief Biography

PhD Statistics, Yale University, 1993
MSc Applied Statistics, Oxford University, 1988
BSc Mathematical Statistics, Univ. of Toronto, 1987

Selected Papers [Full List]

Mantero A. and Ishwaran H. (2017). Unsupervised random forests.

Lu M., Sadiq S., Feaster D.J. and Ishwaran H. (2017). Estimating individual treatment effect in observational data using random forest methods. To appear in J. Comp. Graph. Statist. arXiv:1701.05306

Ishwaran H. (2015). The effect of splitting on random forests. Machine Learning, 99, 75-118. [pdf]

Ehrlinger J. and Ishwaran H. (2012). Characterizing L2Boosting. Ann. Statist., 40, 1074-1101. [pdf]

Ishwaran H., Kogalur U.B., Gorodeski E.Z., Minn A.J. and Lauer M.S. (2010). High-dimensional variable selection for survival data. J. Amer. Stat. Assoc., 105, 205-217. [pdf]

Ishwaran H., Blackstone E.H., Hansen C.A. and Rice T.W. (2009). A novel approach to cancer staging: application to esophageal cancer. Biostatistics, 10, 603-620. [pdf]

Ishwaran H., James L.F. and Zarepour M. (2009). An alternative to the m out of n bootstrap. J. Stat. Plann. Inference, 139, 788-801. [pdf]

Ishwaran H., Kogalur U.B., Blackstone E.H. and Lauer M.S. (2008). Random survival forests. Ann. Appl. Statist., 2, 841-860. [pdf]

Ishwaran H. (2007). Variable importance in binary regression trees and forests. Electronic J. Statist., 1, 519-537.

Ishwaran H. and Rao J.S. (2005). Spike and slab variable selection: frequentist and Bayesian strategies. Ann. Statist., 33, 730-773. [pdf]

Ishwaran H. and Rao J.S. (2003). Detecting differentially expressed genes in microarrays using Bayesian model selection. J. Amer. Stat. Assoc., 98, 438-455. [pdf]

Ishwaran H. and James L.F. (2003). Generalized weighted Chinese restaurant processes for species sampling mixture models. Statistica Sinica, 13, 1211-1235. [pdf]

Ishwaran H. and Zarepour M. (2002). Exact and approximate sum-representations for the Dirichlet process. Can. J. Statist., 30, 269-283. [pdf]

Ishwaran H. and James L.F. (2001). Gibbs sampling methods for stick-breaking priors. J. Amer. Stat. Assoc., 96, 161-173. [pdf]

Ishwaran H., James L.F. and Sun J. (2001). Bayesian model selection in finite mixtures by marginal density decompositions. J. Amer. Stat. Assoc., 96, 1316-1332. [pdf]

Ishwaran H. and Gatsonis C. (2000). A general class of hierarchical ordinal regression models with applications to correlated ROC analysis. Can. J. Statist., 28, 731-750. [pdf]

Ishwaran H. (1999). Information in semiparametric mixtures of exponential families. Ann. Statist., 27, 159-177. [pdf]

Ishwaran H. (1996). Identifiability and rates of estimation for scale parameters in location mixture models. Ann. Statist., 24, 1560-1571. [pdf]


randomForestSRC

R package unifying Breiman's random forests for survival, regression, and classification problems, based on Ishwaran and Kogalur's random survival forests (RSF) package. Now includes multivariate, unsupervised, and quantile regression forests. Runs in both serial and parallel (OpenMP) modes.
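As a minimal usage sketch, a random survival forest can be grown with the package's main `rfsrc()` function; the example below uses the veteran lung-cancer data bundled with the package (function and dataset names follow the package documentation, though defaults may differ across versions):

```r
library(randomForestSRC)

# Veteran's Administration lung cancer trial data shipped with the package
data(veteran, package = "randomForestSRC")

# A survival-type formula (Surv outcome) selects random survival forest mode
v.obj <- rfsrc(Surv(time, status) ~ ., data = veteran, ntree = 100)

# Out-of-bag error rate and variable importance
print(v.obj)
vimp(v.obj)$importance
```

The same `rfsrc()` call dispatches to regression or classification mode automatically when the formula has a continuous or factor outcome instead.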

  • Instructions for installing the OpenMP parallel processing package

  • New GitHub repository (coming soon: Spark and Java builds): code and documentation


spikeslab

Spike and slab R package for high-dimensional linear regression models. Uses a generalized elastic net for variable selection. Parallel processing enabled. [pdf]
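A minimal sketch of fitting a spike-and-slab regression on simulated high-dimensional data with the package's `spikeslab()` function (the simulated design and coefficients below are illustrative, not from the package):

```r
library(spikeslab)

# Simulated data: n = 100 observations, p = 250 predictors,
# with only the first two predictors truly informative
set.seed(1)
x <- matrix(rnorm(100 * 250), nrow = 100, ncol = 250)
y <- x[, 1] + 2 * x[, 2] + rnorm(100)

# Spike-and-slab regression; variable selection is read off the
# generalized elastic net (gnet) solution
obj <- spikeslab(x = x, y = y)
print(obj)
```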


BAMarray (3.0)

Java software for microarray data using Bayesian Analysis of Variance for Microarrays (BAM). [pdf]

boostmtree

Boosted multivariate trees for longitudinal data. R package implementing Friedman's gradient descent boosting algorithm using multivariate tree base learners. A time-covariate interaction effect is modeled using penalized B-splines (P-splines) with an estimated adaptive smoothing parameter. [pdf]
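A minimal sketch of a boosted multivariate tree fit with the package's `boostmtree()` function, using its built-in longitudinal data simulator `simLong()` (argument names follow the package documentation; the simulation settings here are illustrative):

```r
library(boostmtree)

# Simulate longitudinal data: n = 50 subjects, up to N = 5 visits each,
# with within-subject correlation rho
dta <- simLong(n = 50, N = 5, rho = 0.8, model = 2)$dtaL

# Boosted multivariate trees: features = covariates, time = measurement
# times, id = subject identifier, y = longitudinal response
fit <- boostmtree(dta$features, dta$time, dta$id, dta$y, M = 100)
plot(fit)
```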