Baseball in the modern age is really the marriage of two games: the game played on the field, and the big data game played out behind the scenes. The sport has undergone a sea change in recent years as teams, players and fans have gained access to billions of data points that have permanently altered the way teams are built, the way the game is played, and the ways the public consumes the sport. Two elements of big data, in particular, have played especially key roles in the changing landscape of professional baseball: sabermetrics and Statcast.
SABERMETRICS: NOT YOUR FATHER’S STATS
Sabermetrics is the application of advanced statistical analysis to baseball teams, players and outcomes, and the practice has become standard for all 30 major league organizations. Baseball is unusual in the volume of data it makes available for analysis: each team plays 162 games in a season, and players often have long careers relative to other sports. The traditional ways of measuring player performance (stats like Batting Average and On-Base Percentage, which are fairly straightforward, even for casual fans) have been overtaken by stats like Weighted Runs Created Plus (wRC+), Wins Above Replacement (WAR) and Expected Fielding-Independent Pitching (xFIP). The goal of many of these statistics is to strip out the effects of randomness and “luck” on the performance of individual players, and to determine which metrics will best predict future performance.
None of this in-depth analysis would be possible without the collection and organization of massive amounts of data, tracked on a pitch-by-pitch and play-by-play basis so that randomness and key contextual elements (e.g. the park the game is being played in, or the quality of the competition) can be accounted for in the analysis.  The strengths and weaknesses of individual players and decisions can be determined with much more certainty, and the result has been a seismic shift in the ways teams value individual players, the ways managers approach in-game situations and personnel usage, and even in the voting for awards like MVP (Most Valuable Player) and Cy Young (awarded to the best pitcher in each league).
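For readers who want to see how simple the traditional stats are, here is a minimal sketch of Batting Average and On-Base Percentage using their standard definitions. The function names and the season line are made up for illustration; they are not taken from any team's or league's tooling.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """Hits divided by at-bats."""
    return hits / at_bats if at_bats else 0.0


def on_base_percentage(hits: int, walks: int, hit_by_pitch: int,
                       at_bats: int, sacrifice_flies: int) -> float:
    """Times on base divided by the plate appearances that count toward OBP."""
    denominator = at_bats + walks + hit_by_pitch + sacrifice_flies
    return (hits + walks + hit_by_pitch) / denominator if denominator else 0.0


# Hypothetical season line, invented for illustration only.
avg = batting_average(hits=150, at_bats=500)
obp = on_base_percentage(hits=150, walks=60, hit_by_pitch=5,
                         at_bats=500, sacrifice_flies=5)
print(f"AVG {avg:.3f}  OBP {obp:.3f}")
```

Neither number knows anything about the ballpark or the quality of the opposing pitcher, which is exactly the context the newer metrics try to account for.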
STATCAST: DAMN, THAT’S FUTURISTIC
Statcast is a natural extension of Major League Baseball’s newfound obsession with data and sabermetrics. MLB purchased a technology called TrackMan (which is based on Doppler radar, and was originally used to track golf swings), and installed it in its stadiums to track both baseballs and players. Used in conjunction with PITCHf/x (which tracks individual pitches) and HITf/x (which tracks batted balls), the Statcast system suddenly made available an entirely new world of data. Now, for the first time, teams could get at the root causes of performance trends, and capture with hard data what the naked eye could only guess at.
Every baseball fan knows, for example, that Los Angeles Dodgers pitcher Kenley Jansen has a nasty “cutter” (a certain type of pitch); but with Statcast and PITCHf/x, we know that the reason his cutter is so filthy is that it spins at a rate of 2,555 RPM, nearly 17% faster than the league average of 2,185 RPM on that pitch type. Teams can create “heat maps” for individual batters to see where their “hot” and “cold” zones are, and they can pitch to opposing players accordingly, trying to exploit a weakness they found through analyzing that player’s data. To grade defenders, teams can look at the Statcast data to determine how quickly an outfielder reacted to a batted ball (“first step”) and how efficient a route they took to track down that ball (“route efficiency”). Statcast has even made its way to television broadcasts, giving fans a taste of the data’s bounty with graphics showing the speed of a ball off the bat (“exit velocity”) or the specific amount of vertical and horizontal movement on a given pitch.
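Two of the numbers above are easy to reproduce. The sketch below checks the spin-rate comparison using the figures quoted for Jansen's cutter, and then computes a route efficiency for an outfielder. It assumes route efficiency means the straight-line distance to the ball divided by the distance the fielder actually ran; the function names and the sample path are hypothetical and are not Statcast's own code or data.

```python
import math


def percent_above(value: float, baseline: float) -> float:
    """How far value sits above baseline, as a percentage of baseline."""
    return (value - baseline) / baseline * 100.0


def route_efficiency(path: list[tuple[float, float]]) -> float:
    """Straight-line distance from start to finish divided by distance run.

    Assumed definition: 1.0 is a perfectly direct route, lower is less direct.
    """
    traveled = sum(math.dist(a, b) for a, b in zip(path, path[1:]))
    direct = math.dist(path[0], path[-1])
    return direct / traveled if traveled else 1.0


# Spin rates quoted in the article: 2,555 RPM versus a 2,185 RPM average.
spin_gap = percent_above(2555, 2185)

# Hypothetical fielder positions in feet, invented for illustration only.
path = [(0.0, 0.0), (30.0, 40.0), (60.0, 40.0)]
efficiency = route_efficiency(path)

print(f"Spin rate above average: {spin_gap:.1f}%")
print(f"Route efficiency: {efficiency:.1%}")
```

The spin-rate gap comes out just under 17%, matching the figure above. In the made-up route, the fielder runs 80 feet to cover a straight-line distance of about 72 feet, so the route is roughly 90% efficient.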
Baseball’s big data revolution is still in its infancy, and with 2015 as the first full season with Statcast in all 30 MLB parks, teams and analysts have barely scratched the surface of what it will be possible to learn from the data. Next time you flip to a baseball game on TV, take a moment to think about what’s happening behind the scenes: teams of analysts poring over every pitch and wrinkle of the game, looking for the next edge, the next market inefficiency to exploit. The quality of the game and the competition will only continue to improve as teams and players use data to improve their own skills and identify weaknesses in their opponents, and we can all thank technology and big data for making it happen.