Expert Vs. Crowdsourced Assessment Of Aesthetic Outcomes After Breast Reconstruction
Yash Kadakia, BA1, Jake A. Alford, MD1, Julie L. Cooper, BS1, Ricardo Garza, BS1, Sami U. Khan, MD2, Sumeet S. Teotia, MD1, Nicholas T. Haddock, MD1.
1University of Texas Southwestern Medical Center, Dallas, TX, USA, 2Stony Brook University, Stony Brook, NY, USA.
PURPOSE: Evaluating the aesthetic success of breast reconstruction can be difficult. Patients, surgeons, and the general population may differ in what constitutes a successful outcome. Recently, crowdsourcing has emerged as a powerful tool for accumulating and analyzing data on a massive scale. The purpose of this study was to determine whether crowdsourcing can be used to reliably evaluate aesthetic outcomes of breast reconstruction.
METHODS: 101 de-identified photographs of patients undergoing various types of breast reconstruction at various stages were gathered. Anonymous crowd workers and a group of expert reconstructive surgeons rated this identical set of photographs on a 5-point Likert scale. Breast fullness, nipple-areola complex, shape and contour, scar appearance, size and fullness, and overall breast appearance were assessed in this manner. Inter-rater and intra-rater reliability were determined by Cohen's kappa coefficient (k) and Pearson's correlation coefficient (r), respectively. Pearson's correlation coefficient was also used to determine the correlation between crowdsourced evaluations of aesthetic subcomponents (e.g., breast fullness) and overall outcome.
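The two statistics named in the methods can be illustrated with a minimal sketch. It assumes each rater (or rater group) has been reduced to one integer Likert score per photograph. The abstract does not state how scores were aggregated or which software was used, so the function names and the sample ratings below are hypothetical, not the authors' analysis.

```python
from collections import Counter
from math import sqrt


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters scoring the same items."""
    n = len(ratings_a)
    # Observed agreement: fraction of items given the same score by both raters.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of each rater's marginal frequency per category.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical category
    return (observed - expected) / (1.0 - expected)


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical scores for six photographs (illustration only, not study data).
crowd_scores = [4, 3, 5, 2, 4, 1]
expert_scores = [4, 3, 5, 2, 3, 1]
first_pass = [4, 3, 5, 2, 4, 1]
second_pass = [4, 3, 4, 2, 4, 1]

inter_rater_k = cohen_kappa(crowd_scores, expert_scores)
intra_rater_r = pearson_r(first_pass, second_pass)
```

Here `inter_rater_k` compares two raters on the same photographs, while `intra_rater_r` compares one rater's repeated scores. The same `pearson_r` call could relate a subcomponent score to the overall score.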
RESULTS: The authors obtained 2500 anonymous layperson evaluations and 5 expert surgeon evaluations. Crowdsourced data collection took 12 hours. Expert assessment took 14 months and required multiple reminders and prompts. Expert and crowdsourced scores were equivalent across all domains on the Likert scale (k>0.95) (Figure 1). Intra-rater reliability was high for both crowd (r>0.96) and experts (r>0.90) across all domains on the Likert scale. Within the crowd worker population, scar appearance (r=0.68) and nipple/areola (r=0.74) were least predictive of overall aesthetic appearance, while breast contour (r=0.88) and position of breast (r=0.92) were most predictive of overall aesthetic appearance (Table 1).
CONCLUSION: Aesthetic outcomes rated by crowds were reliable and correlated closely with those rated by expert surgeons. Crowdsourcing is a rapid, reliable, and valid way to assess aesthetic outcomes in breast reconstruction patients. Furthermore, layperson evaluations suggest that reconstruction should prioritize breast contour and positioning over superficial appearance.