Philosophy, Stats, and Pattern Recognition

Q.  Love the site.  Philosophy over numbers, right?

A.   :daps: good to meetcha, mate.

As information proliferates in the internet age, the challenge is to extract what is essential.

That's true in baseball, sales, climate change, political polling or anything else.  99% of the information is noise and 1% helps you understand reality -- and then control your life.

Stats are necessary but backwards-looking by their nature.  August 12, 2009 is completely irrelevant to your life and mine.

Any number of places run Fangraphs numbers and split them out, LH/RH, 1H/2H, etc. The good folks at Fangraphs, Hardball Times, Baseball Prospectus etc. have refined Bill James' trend analysis and put it into daily-updating stacks and columns that everybody can understand.

Looking up the 50th-percentile PECOTA projection for 2010, looking up the expected $alary, and comparing the actual salary is fine, but you can do that with two clicks in 2010.  What does it really gain us, to know that Felix Hernandez saved 66 runs last year and Justin Verlander saved 76?


=== A is for Aardsma ===

We hope that the knowledge of David Aardsma's 2009 +19 runs saved is a given. 

The mission is to look beyond "+19 runs in 2009," to find the variables that are out of alignment in the statistics, and use them to, hopefully, draw a bead on the future.

In Aardsma's case, one of these two sets of variables is probably the one that is out of alignment:

  • PESSIMIST:  too many FB's thrown, misleading BB rate, lucky HR rate, "here's-the-next-FB" not sustainable strategy
  • OPTIMIST:  2004-08 stats out of alignment with New Pitcher & New Context

So at SSI, we argue about those two specific sets of variables (statistics).  Not usually about his salary-vs-RAR.  

Fastball percentage is a number, a statistic.  Zeroing in on the fact that Aardsma reduced his BB rate by throwing 87.1% fastballs, 3rd in the majors, is a question of [human judgment + statistics], rather than [philosophy - statistics].

I suspect that Capt Jack is also a little nervous about a 2009 bullpen that, in retro, performed beautifully.  He gave up a whale of a lot to go get a genuinely dominating short reliever.  Philosophy over stats?  No, Capt Jack's insight over a superficial "SP = 2.5 WAR, RP = 1.5 WAR" assessment, in my humble opinion.


=== Area 51 Dept. ===

What are the numbers on Ichiro, and what is the philosophy?  :- )

:shrug: the numbers are simple.  He's worth +50 runs a year.  He's signed up for several more years.  He gets you 20 runs above average with the bat, 20 more above a benchie, and 10 runs with the glove.  Okay, now what?

Assign yourself an interesting problem:  will Ichiro have a good year in 2010?  Fangraphs doesn't have a column for that.  Everything on Ichiro's player page talks about yesterday.

Templates and pattern recognition again.  Not a huge boatload of numbers in that SSI assessment, but the numbers that you zero in on are very interesting.  HOF leadoff hitters like Ichiro lose first their SLG, and many years later, their AVG.

Ichiro will be year-to-year on his SLG, and then will start serving-and-volleying the ball through the hole.  You could run a lot more numbers, but they wouldn't change the picture.

Philosophy doesn't help us here.  What does, is choosing the SLG column when reviewing the baseball cards for Lofton, Brock, Raines, Rickey, and Carew...


Part 2



Do you know if ROE captures the bases gained by all runners when the ball goes down the RF line?
Does it surprise you that Ichiro's infield singles amount to only -2 runs per year on runner advancement?
Interesting that Ichiro knocks in more runners, per on-base, than the average...
Will follow on soon, hopefully...


Players who don't hit the ball over the wall, the smaller guys, get a GOOD amount of extra runs from motoring around the bases.  Little guys and big guys have been in balance since the days of Ruth.
Big guys playing the outfield make up for a lot of defense lost by footspeed ... when they throw runners out on the bases.
60 feet was just about the perfect distance to set, as was 90 feet on the bases.  If the bases were 85 feet or 95 feet, the entire game would morph wildly out of its current ballet symphony.
I don't think that the game just seems to be in balance because we're used to it.  I think the guys who set the game up to start with, had a great feel for a balanced fight between the fielders and the hitters... lucky for us...


The way baseball was created was rather akin to the way Dungeons & Dragons or the perfect American novel or the best-tasting cheesecake are made... we tried a bunch of different configurations and the ones that were the least annoying stuck. :)
The great PBS series "Ken Burns' Baseball" commented at length about how the distances seem to have been picked by God... "a choice from heaven..." they called the 90 foot distance between the bags, saying "if the bases were 91 feet apart, the league might hit .200 in a good year... think of all the SB lost, all the infield singles gone, all the extra outs on ground balls to shallow RF and LF.  If they were 89 feet apart, fielders would be defenseless... most grounders left of second base if hit slowly would be singles!"
The reason the game works is that, for the first 15 years or so (from the late 1840s to the early 1860s), we played it in dozens of different ways, and the men who developed a passion for it carefully recorded the outcomes and decided which ones they liked.  More thought, IMHO, was put into inventing baseball than any other organized team sport.


The original pitching distance was what, 55 feet or so... a lot of tinkering went on ...
What befuddles me is, between these two points
(A) The first game in which 60'6" and 90 etc were played (1870?)
(B) Felix and Ichiro, 2010
The players have changed incredibly, and yet none of the big, small, fast, slow, fastball, curveball, etc. types of player has emerged dominant...
God certainly didn't invent the rules of baseball :- ) but it's quite a boggle how well it has stayed in balance through the evolution of society.   It was ballet then and is ballet now...


The robustness of the 1870s-1890s rules is impressive (BTW, the pitcher originally threw from a flat box 45 feet from home plate; the box moved back to 50 feet in 1881, and the 60'6" distance arrived with a rules change in 1893).  Granted, we've chosen carefully what kinds of athletes we think can help us on the ballfield... but still, we're talking about a wide range of different types of athletes and a stunningly small variation in the results of games under those varying conditions.  It's an offensive "explosion" when the league R/G goes from 3.8 to 5.0.
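To put a number on that "explosion," here is a quick Python sketch of the percentage swing. The 3.8 and 5.0 R/G figures are the ones quoted above; the function name is just for illustration.

```python
def pct_change(old, new):
    """Percentage change going from `old` to `new`."""
    return (new - old) / old * 100.0

# League runs per game, as quoted in the comment above.
low_rg, high_rg = 3.8, 5.0

swing = pct_change(low_rg, high_rg)
print(f"{swing:.1f}% more runs per game")
```

That works out to roughly a 32% jump, which is about as wild as the league-wide scoring environment gets under these rules.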

Taro's picture

ROE does capture the runs gained when the player advances an extra base or two. It doesn't separate those outcomes, though; the run value of an ROE is higher than that of a single for that reason.
wOBA already calculates this, but what isn't calculated are post-play errors. SABRMatt covers that a little bit in the MC thread.

Taro's picture

I am a bit surprised, although I expected it to be 3-4 runs even at the higher end.
The actual penalty for Ichiro's infield hits is closer to 1 run a year. I tacked on another run to estimate the difference in advancement value between his non-infield singles and doubles+triples and those of a league-average hitter.
This is conservative; the actual run-value penalty for Ichiro's speed hits relative to league average is likely somewhere in the 1.5-2 run range per year. This will likely cancel out with the value of Ichiro's post-play errors, as Matt showed in the thread (although we don't have multi-year data for that stat in that thread).
The difference in advancement value between an infield hit and an average single is just really small. Even if you were to assume that an infield hit would NEVER advance a runner more than one base, the value would still only be 0.08 runs below average. Since this is unrealistic, I set the value at 0.05 runs. I was a little surprised at how little it added up to, but it isn't too surprising, since you are comparing two POSITIVE, nearly identical outcomes. When you take into consideration that the on-base portion of a single has a higher run value than the advancement portion... well, you can see that the difference just isn't going to be very much at all.
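A minimal Python sketch of that arithmetic, for anyone who wants to plug in their own numbers. The per-hit penalties (0.05 as used, 0.08 as the never-advances worst case) come from the comment above. The infield-hit count is a hypothetical placeholder, chosen only because it reproduces the "closer to 1 run a year" figure at 0.05 per hit; it is not Ichiro's actual total. The function name is also made up for illustration.

```python
def advancement_penalty(infield_hits, penalty_per_hit=0.05):
    """Runs lost per season because infield hits advance runners
    less than an average single does."""
    return infield_hits * penalty_per_hit

# HYPOTHETICAL count, not real data: picked so 0.05/hit gives ~1 run.
HYPOTHETICAL_INFIELD_HITS = 20

as_used = advancement_penalty(HYPOTHETICAL_INFIELD_HITS)            # 0.05 per hit
worst_case = advancement_penalty(HYPOTHETICAL_INFIELD_HITS, 0.08)   # never advances extra

print(f"penalty at 0.05/hit: {as_used:.1f} runs")
print(f"penalty at 0.08/hit: {worst_case:.1f} runs")
```

Even under the unrealistic worst-case assumption, the gap between the two settings stays well under a run per season at this volume.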

Taro's picture

GIDP has a MASSIVE run differential in comparison to "speed hits vs power hits".
A single GIDP costs around 0.45-0.47 runs more than the average "single" out. You are creating two outs at once and erasing a baserunner.
Baserunning is also more impactful since, on average, 55% of the time you're batting with no one on base.
Ultimately it's because you are comparing two positive outcomes (hits) vs comparing a positive outcome to a negative outcome in the baserunning+GIDP stats.
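For scale, here is a rough Python sketch that sets the GIDP cost next to the infield-hit advancement penalty. The 0.45-0.47 run range is from this comment and the 0.05 runs per infield hit is the value set in the earlier comment; the function name is hypothetical.

```python
def infield_hits_per_gidp(gidp_cost, hit_penalty=0.05):
    """How many infield-hit advancement penalties add up to one GIDP."""
    return gidp_cost / hit_penalty

# GIDP cost range (runs lost vs. an ordinary single out), as quoted above.
GIDP_COST_RANGE = (0.45, 0.47)

low, high = (infield_hits_per_gidp(cost) for cost in GIDP_COST_RANGE)
print(f"one GIDP costs about as much as {low:.1f} to {high:.1f} infield-hit penalties")
```

In other words, avoiding a single double play is worth on the order of nine infield hits' worth of lost runner advancement.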
