Identifying the amino acid substitutions associated with low survival after hematopoietic cell transplantation (HCT) is hampered by the large number of observed substitutions relative to the small number of patients available for analysis. Random forest analysis is designed to address this limitation. We studied 2,107 HCT recipients with good- or intermediate-risk hematologic malignancies to identify HLA class I amino acid substitutions associated with reduced survival at day 100 post-transplant, using random forest analysis alongside traditional univariate and multivariate analyses. Random forest analysis identified amino acid substitutions at 33 positions that were associated with reduced day 100 survival: HLA-A 9, 43, 62, 63, 76, 77, 95, 97, 114, 116, 152, 156, 166, and 167; HLA-B 97, 109, 116, and 156; and HLA-C 6, 9, 11, 14, 21, 66, 77, 80, 95, 97, 99, 116, 156, 163, and 173. Thirteen of these positions had been previously reported by other investigators using classical biostatistical approaches. In the same dataset, traditional multivariate logistic regression identified only 5 amino acid substitutions associated with lower day 100 survival. Random forest analysis is a novel statistical methodology for studies of HLA mismatching and outcome, capable of identifying important amino acid substitutions missed by other methods.