Automated Machine Learning For Risk Prediction Of Incisional Hernia In Abdominal Surgery Patients
Ankoor A. Talwar, MBA, Abhishek A. Desai, MD, Phoebe B. McAuliffe, BS, Tony Liu, BS, Vivek James, BS, Ivona Percec, MD PhD, Robyn B. Broach, PhD, Lyle Ungar, PhD, John P. Fischer, MD MPH.
University of Pennsylvania, Philadelphia, PA, USA.
PURPOSE: Incisional hernia (IH) is a morbid long-term complication following abdominal surgery, with an incidence of 500,000 cases annually. It is important for surgeons to assess a patient's risk of IH. Our group previously developed a logistic regression model to stratify patient risk of IH. The literature is mixed on whether automated machine learning models provide benefit over logistic regression in clinical settings. The purpose of this study was to determine whether automated machine learning (AutoML) is superior to logistic regression (LR) for assessing risk of incisional hernia, and to identify which clinical features are salient for IH formation.
METHODS: This retrospective cohort study reviewed adult patients who underwent intra-abdominal, urologic, or gynecologic surgery at our institution from January 2005 to June 2016. Any IH repair following the index operation was noted. Two sets of clinical features were tested. A limited set included 18 previously studied features. An expanded set included a total of 246 clinical features, including those in the limited set. Four models were generated: LR with limited features, LR with expanded features, AutoML with limited features, and AutoML with expanded features. The machine learning model deployed was a random forest model. The primary outcome was the AUC generated by each model. Secondary outcomes included differences in predictions at varying true positive rates and Shapley values for feature importance.
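To make the feature-importance outcome concrete, the sketch below computes exact Shapley values for a toy additive risk score. It illustrates the definition only and is not the study's pipeline: the abstract does not state how Shapley values were estimated for the random forest, and the function names, feature names, and risk increments here are all hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(features, value):
    """Exact Shapley values for a set function `value` over `features`.

    Each feature's value is its marginal contribution, averaged over
    every order in which features could be added to the model.
    """
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain f.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                total += weight * (value(s | {f}) - value(s))
        phi[f] = total
    return phi


def toy_value(subset):
    """Hypothetical risk score given which features are 'known'.

    The increments are made up for illustration, not taken from the study.
    """
    score = 0.0
    if "age" in subset:
        score += 0.10
    if "bmi" in subset:
        score += 0.06
    if "age" in subset and "bmi" in subset:
        score += 0.04  # interaction term, split evenly by the Shapley rule
    return score


phi = shapley_values(["age", "bmi", "smoking"], toy_value)
for name, contribution in phi.items():
    print(f"{name}: {contribution:.3f}")
```

The contributions sum to the score with all features present minus the score with none, which is the property that makes Shapley values attractive for ranking features. Exact enumeration grows exponentially with the number of features, so a set of 246 features would need an approximation in practice.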
RESULTS: A total of 20,516 patients were included, of whom 12.3% developed IH (n=2,519). 67% of patients were used to train the models (n=12,871). The other 33% formed the test cohort (n=6,340). AUCs were: LR limited 0.599, LR expanded 0.682, AutoML limited 0.706, AutoML expanded 0.747 (Figure 1A). Average precisions of the models were: LR limited 0.17, LR expanded 0.34, AutoML limited 0.26, AutoML expanded 0.41 (Figure 1B). At a true positive rate of 0.8, the AutoML expanded model had a false positive rate (FPR) of 0.64, compared to an FPR of 0.71 for AutoML limited, 0.78 for LR expanded, and 0.82 for LR limited (all p < 0.0001). Shapley values of the limited feature set revealed that the most critical factors predicting risk of IH were age, followed by BMI and others (Figure 2A and 2C). Shapley values of the expanded feature set revealed that the most critical factors were similar but shifted, with BMI most important, then age and others (Figure 2B and 2D).
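The two ROC-based metrics reported above, AUC and FPR at a fixed true positive rate, can be illustrated with a short sketch. It uses ten made-up labels and scores rather than study data, and the helper names (`roc_points`, `auc`, `fpr_at_tpr`) are hypothetical.

```python
def roc_points(y_true, scores):
    """Return (fpr, tpr) operating points, from strictest threshold to loosest."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    order = sorted(zip(scores, y_true), key=lambda p: -p[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(order):
        threshold = order[i][0]
        # Consume every example tied at this score before emitting a point.
        while i < len(order) and order[i][0] == threshold:
            if order[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / neg, tp / pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


def fpr_at_tpr(points, target_tpr):
    """Smallest FPR among operating points whose TPR reaches the target."""
    return min(fpr for fpr, tpr in points if tpr >= target_tpr)


# Made-up labels (1 = developed IH) and model scores, for illustration only.
toy_labels = [1, 1, 0, 1, 0, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.5, 0.4, 0.3, 0.2, 0.1]

toy_points = roc_points(toy_labels, toy_scores)
toy_auc = auc(toy_points)
toy_fpr = fpr_at_tpr(toy_points, 0.8)
print(f"AUC = {toy_auc:.2f}, FPR at TPR 0.8 = {toy_fpr:.2f}")
```

Fixing the true positive rate and comparing false positive rates shows how many patients each model would flag unnecessarily while catching the same share of hernias. That is often easier to interpret clinically than a difference in AUC.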
CONCLUSION: An automated machine learning algorithm with more clinical features provides better predictive capacity for IH development than logistic regression. Further, predictions generated by the AutoML model with expanded features differ from those generated with limited features or by the LR models. Finally, the importance of individual clinical features shifts when a larger feature set is used. More work is needed to develop a more robust model of IH prediction using automated machine learning.