An investigation into the true strike zone called by MLB home plate umpires and the consistent biases in their strike calling.


  • Collected and cleaned data from over 520K pitches across 12 MLB seasons
  • Built a generalized additive model (GAM) to estimate the expected strike/ball call at each location where a pitch crosses the plate
  • Used contour plots of the GAM outputs to visualize the true umpire strike zone
  • Segmented data into specific game situations to measure bias among MLB umpires
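
The segmentation step can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the record fields (`balls`, `strikes`, `called_strike`), the helper names, and the handful of sample pitches are all hypothetical and made up for the example. It groups called pitches by ball-strike count and compares the raw called-strike rate in each group. A real analysis would compare calls against the GAM's expected call at the same location, since pitch locations differ from one situation to another.

```python
from collections import defaultdict


def strike_rate_by_situation(pitches, situation_of):
    """Return the share of called strikes within each game situation.

    `pitches` is an iterable of dicts with a boolean "called_strike" field
    (a hypothetical schema), and `situation_of` maps a pitch to a label.
    """
    tallies = defaultdict(lambda: [0, 0])  # label -> [called strikes, total pitches]
    for pitch in pitches:
        tally = tallies[situation_of(pitch)]
        if pitch["called_strike"]:
            tally[0] += 1
        tally[1] += 1
    return {label: strikes / total for label, (strikes, total) in tallies.items()}


def count_label(pitch):
    """Label a pitch by its ball-strike count, e.g. "3-0"."""
    return f'{pitch["balls"]}-{pitch["strikes"]}'


# Made-up sample records for illustration only; not real MLB data.
pitches = [
    {"balls": 3, "strikes": 0, "called_strike": True},
    {"balls": 3, "strikes": 0, "called_strike": True},
    {"balls": 3, "strikes": 0, "called_strike": False},
    {"balls": 0, "strikes": 2, "called_strike": False},
    {"balls": 0, "strikes": 2, "called_strike": True},
    {"balls": 0, "strikes": 2, "called_strike": False},
    {"balls": 0, "strikes": 2, "called_strike": False},
]

rates = strike_rate_by_situation(pitches, count_label)
for label in sorted(rates):
    print(label, round(rates[label], 3))
```

Passing the labeling function as an argument keeps the same tally reusable for other situations, such as inning or home versus away batter, by swapping in a different `situation_of`.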

This project’s GitHub repository