Fantasy Baseball Calculator

Calculating Expected Quality Starts

For instance, three earned runs over six innings (a 4.50 ERA) counts as a quality start, whereas four earned runs over nine innings (a 4.00 ERA) does not.
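To make that comparison concrete, here is a minimal sketch that uses the standard quality start definition (at least six innings pitched and no more than three earned runs). The function names are my own for illustration and are not part of the Fantasy Baseball Calculator.

```python
def era(earned_runs: int, innings_pitched: float) -> float:
    """Earned run average: earned runs allowed per nine innings.

    innings_pitched is a true decimal here (6.333 for six and a third),
    not the box-score "6.1" notation.
    """
    return 9 * earned_runs / innings_pitched


def is_quality_start(earned_runs: int, innings_pitched: float) -> bool:
    """Standard definition: at least 6 innings pitched, at most 3 earned runs."""
    return innings_pitched >= 6 and earned_runs <= 3


# Three earned runs in six innings: 4.50 ERA, and a quality start.
print(era(3, 6), is_quality_start(3, 6))  # 4.5 True
# Four earned runs in nine innings: 4.00 ERA, but not a quality start.
print(era(4, 9), is_quality_start(4, 9))  # 4.0 False
```

The lower ERA loses out because the quality start rule caps earned runs at three, regardless of how deep the pitcher goes into the game.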

To solve this, when I first tackled the problem in 2009, I threw together what I thought were the pertinent driving factors behind a quality start. These factors are basic stats that are available everywhere, like Games Started, Innings Pitched, Earned Runs, etc.

I have been constantly tinkering with this stat since then, and I feel I have come across a good algorithm to predict one of the hardest stats to project.

The algorithm is as follows:

The basic idea behind this calculation is to compare Games Started with the product of the earned runs and innings pitched that were accumulated in those starts. The extra bit where you square Games Started + Games Played is there to take into account that relief appearances typically produce fewer innings pitched, and thus fewer earned runs, than starts do. As a result, a player with 6 starts in 16 appearances will have his stats adjusted so that ER and IP are weighted more heavily than a straight ratio would give you.

Note: there will be an error in this calculation when a pitcher has 0 earned runs. In an effort to not destroy the world, you simply omit the ER variable and divide the GS by the ratio of GS/GP. This is a rare error. It has only happened 3 times in the last 5 years: Nick Adenhart (RIP) in 2009, Jarrod Parker in 2011, and Brad Mills in 2012.

Now, calculating this will not give you the exact number of QS that is expected. The xQScore is simply a number without a scale. It would function fine within the Fantasy Baseball Calculator as-is, because the Fantasy Baseball Calculator deals with distributions, and the distribution would remain the same if we put it on a scale comparable with QS.

Out of curiosity, I did put this on a QS comparison scale by taking the ratio of xQScore to historic real QS. This factor came out to be 4.115666 for the last 5 years. To calculate expected quality starts (xQS), you just divide the xQScore by this factor:

xQS = xQScore / 4.115666
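The scaling step can be sketched in a few lines of Python. The 4.115666 factor comes from the text; the function names are hypothetical, and `fit_scale_factor` assumes the factor is total xQScore divided by total real QS across the sample, which is only one reading of "the ratio of xQScore to historic real QS".

```python
SCALE_FACTOR = 4.115666  # ratio of xQScore to real QS quoted for 2008 to 2012


def expected_quality_starts(xq_score: float, factor: float = SCALE_FACTOR) -> float:
    """Put a scale-free xQScore on the quality start scale."""
    return xq_score / factor


def fit_scale_factor(xq_scores, real_qs) -> float:
    """Re-derive a factor from history.

    Assumption: factor = sum of xQScores / sum of real QS. Averaging the
    per-player ratios instead would give a different number.
    """
    total_qs = sum(real_qs)
    if total_qs == 0:
        raise ValueError("need at least one real quality start to fit a factor")
    return sum(xq_scores) / total_qs


# Hypothetical pitcher with an xQScore of 82.3: roughly 20 expected quality starts.
print(round(expected_quality_starts(82.3), 1))
```

Because the factor is a single constant, dividing by it changes the scale of the numbers but not the shape of their distribution.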

A correlation coefficient of 0.94 is strong, but it isn't perfect. If you want to learn more about the math, which is way over my head, check out the Wikipedia article on the Pearson Product-Moment Correlation Coefficient.
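For anyone who wants to check a correlation like this themselves, here is a minimal sketch of the Pearson coefficient in plain Python. I am assuming the two series being compared are each pitcher's xQS and real QS; the function name and the sample numbers are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys) -> float:
    """Pearson product-moment correlation between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length, at least 2 points")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a series is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical xQS and real QS values for five pitchers.
xqs = [18.2, 21.5, 9.7, 14.1, 24.0]
real_qs = [17, 23, 10, 12, 25]
print(pearson_r(xqs, real_qs))
```

A value near 1 means the two series rise and fall together; it says nothing about whether they sit on the same scale, which is why the scaling factor is a separate step.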

To see how accurate my equation was, I calculated the expected Quality Starts from the xQScore and held them against certain criteria (the green box below).

The first column counts the players that had a given number of starts within the last 5 years. Let's look at the 26 to 30 category. In the last 5 years (2008 to 2012), there were 173 pitchers that had 26, 27, 28, 29, or 30 starts.

66 of those players' xQS fell within -0.999 to 0.999 of their real QS (38.2%)

48 of those players' xQS fell within -1.999 to -1.000 or 1.000 to 1.999 of their real QS (27.7%)
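The bucket counts above can be reproduced outside of Excel with something like the following. This is a sketch of the same idea rather than the workbook's actual formulas; the function names are my own, and the only real numbers are the 66, 48, and 173 quoted in the text.

```python
from collections import Counter


def error_buckets(xqs_values, real_qs_values, max_bucket=5):
    """Count players by whole-number absolute error |xQS - QS|.

    Bucket 0 holds errors below 1, bucket 1 holds errors from 1 up to
    (but not including) 2, and so on. Errors of max_bucket or more are
    pooled into the last bucket.
    """
    counts = Counter()
    for xqs, qs in zip(xqs_values, real_qs_values):
        bucket = min(int(abs(xqs - qs)), max_bucket)
        counts[bucket] += 1
    return counts


def share(count, total):
    """Percentage of the group, rounded to one decimal place."""
    return round(100 * count / total, 1)


# The 26 to 30 starts group from the text: 173 pitchers.
print(share(66, 173))       # 38.2
print(share(48, 173))       # 27.7
print(share(66 + 48, 173))  # 65.9
```

Bucketing on the absolute error treats an overshoot and an undershoot of the same size identically, which matches the paired ranges used above.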

This means that xQS came within 2 QS of the real total for 65.9% of the players that had 26 to 30 starts. Why did I break it down like this? To be honest, I don't really know. I thought it would be interesting to see how the accuracy varied with games started, and I did make a couple of observations.

Obviously, it is easier to predict the Quality Starts of a pitcher that has very few starts. If a player only has 1 start, you're guaranteed to be within 1. Interestingly, I found a flaw while writing this up. How can the prediction be 5 or more away from the real number when there are a maximum of only 5 starts? This happened 2 times: Stephen Strasburg in 2011 and Chris Young in 2010. These two gave up very few Earned Runs over their limited starts, which at least gives me a potential point to improve upon.

Anyways, we are less concerned with the pitchers that have fewer starts. The pitchers that have 20+ starts are the ones of real interest to the fantasy community. Looking just at players with 21 or more starts, 63.1% of the 602 players were within 2 Quality Starts of expected, 80.4% were within 3, 92.1% were within 4, and over 96% were within 5. Being an amateur statistician, I am pretty proud of myself for this!

If anybody reading this wants to see how I did this, or wants to try to improve upon it (which I HIGHLY encourage), you can find the workbook here. Unfortunately, I used the COUNTIFS() function, which is not compatible with the 97-03 versions of Excel, so my pretty green box will display errors if you are using an old Excel version. Fear not though - if any of you actually want to play with this and would like me to update the functions, just post on the forums, shoot me an email, or find some way to get ahold of me and I will find a way to update the code.

Happy Baseball! I hope you all enjoy!
