Total Classics Rating System

Jayr2112 · May 20, 2009

Hey guys, I'm just wondering what method was used in the ratings system of the Total Classics mod. It just made me wonder when I noticed the 1939 Yankees were ranked 11th in the CLS-4 league behind teams like the 2001 D'backs and 1980 Astros when the 39 Yanks posted a +411 run differential, the 1980 Astros had a +48 and the 2001 D'Backs posted a +141. I know that rating systems are highly subjective, but I am just wanting to know if the modders just "plugged in" yearly stats or if there was an element of opinion involved which would hold that the 39 Yankees are clearly better than both of these teams. Maybe it's just the MVP ranking system that "misranks" these teams, but I'm just curious. Note: this is just for my personal info because I'm formulating an All-Time tournament, and I love the mod by the way.

**tebjr** · May 20, 2009

The TC guys have a process that they use. Stecropper posted this awhile back.

http://www.mvpmods.com/index.php?showtopic=32568&hl=

**Jim825** · May 20, 2009

We use Stecropper's process for the single season mods and apply it to every team so that they all use the same criteria. TC9 is a bit different in that no one person worked on the rosters for all 120 teams. When we started Total Classics, we actually started with a number of classic team "packages" that were developed for MVP2004. The rosters in those packages were developed by a number of different people and they were created before Stecropper developed his Global Tweaks, so there isn't as much uniformity in the ratings.

Some of the teams simply use the ratings that were pulled in from the Lahman database, while others were tweaked, depending upon who created them. To try to go back at this point and "re-tweak" all of the rosters would be a monumental task and wouldn't necessarily be worth the effort to do the work.

Jayr2112 · May 20, 2009

Coolness Jim. It's definitely an epic mod, but I was just wanting to know the process. I'm getting good results nonetheless with the 39 Yankees easily handling the 86 Mets 4 games to 1 and the 27 Yankees took down the 64 Cardinals 4 games to 2. My All-Time Tournament has 32 of the best W.S. winners playing best of 7 series so it will take awhile to play out by watching all the games and keeping stats.

**paulw** · June 6, 2009

Don't take EA's rankings too seriously either. They are not particularly reliable. Your results with the '39 Yanks demonstrate this. Jim and Don did an outstanding job - especially considering the enormous number of teams.

It would be nice to have a standardized rating system for Total Classics but..

I'm not sure anyone would agree on what is best and, more importantly, the ratings are context sensitive. A 90 Power and 78 Contact would play very differently in, say, a 1968 mod versus a 2001 mod. Because of the scope of Total Classics it would be next to impossible to approximate the right context. This is no different than baseball historians and their many attempts to interpret what .320 meant in 1931 vs. for example 1979. There are also ballpark effects involved. It's fun to speculate about taking Babe Ruth from 1921 and placing him into baseball today. Would he hit 80 home runs or 40? What would Barry Bonds have done in 1942?

My particular bias to a certain era would love to see Jackie Robinson and Mickey Mantle playing in 2009. I think they'd be a head above everyone else.

patsen · June 6, 2009

Hard to say. BP ran a study on league difficulty year to year, and it shows that the strength of competition has gradually increased throughout history. Babe Ruth would be compared to someone like Brian Giles, rather than Barry Bonds.

Which makes sense. What was considered a gold medal performance in swimming 100 years ago is usually achieved by the best high school girls today.

Of course, that assumes the player hops in a time machine and plays the modern game without any other adjustments. Usually, we assume something like an average player keeps the same quality, so these comparisons can be kept reasonable.

The issue this causes with ratings is that MVP was made for modern play. If an average batter and an average pitcher (say, ratings of 70) faced each other in 2004, the hitter would hit about .266. In 1968, he'd hit .237. To get that result, we'd need to either dock the hitter's rating (making an average hitter 60), boost pitching (making an average pitcher 80), or do a combination of both.

This technique is usually done in tabletop games, where they don't try to translate the ratings from era to era. Each player is inherently rated with the environment he's in. So, a team of 1968 players would come equipped with ratings to depress run scoring. If they faced a modern team, their offense will increase due to facing liveball pitchers, and their pitchers will depress modern hitting due to bringing some of their high mounds with them. The effect will be the average players hitting a hybrid of both leagues, or about .252.

However, to pull this off, we need a much better understanding of how ratings convert to stats. But, it will allow us to have an accurate depiction of cross-era play, as long as we don't start trading players.

**paulw** · June 6, 2009

Awesome post, Patsen. One other huge factor is throwing in baseball's huge stain - no African American ballplayers before 1947 and Jackie Robinson. Are the stats from before 1947 even valid? When did integration make the playing field finally equal / comparable? My guess is the mid 1960's.

I think my love for the 1950's comes from the fact that it's an era where those who were denied the opportunity to play finally were given their chance. Players like Mays, Aaron, Frank and Jackie Robinson absolutely shined - the bright new stars of baseball. And what a sad context this puts American history really in. A couple of other players who also fascinate me were older Negro League players - those who got a chance but were past their prime. Luke Easter and Monte Irvin come to mind.

patsen · June 6, 2009

Integration is only one of many reasons baseball is a more competitive environment today. Generally speaking, integration added to the amount of talent in the league, while more teams increased the number of jobs. That's why integration years are when you see huge surges in individual stats: the league just added a lot of players who wouldn't be there otherwise, and the stars can feast on them.

But, generally speaking, it's a safe assumption to say that average talent year in and year out is about equivalent. WW2 was probably the biggest deviation from the norm for that.

**paulw** · June 6, 2009

In terms of classic teams, I think the only time when you can actually get realistic (statistical) results is when you do a season mod. Hitting and pitching have to be balanced with each other.

I remember way, way back when I was about 12 or so and we tried to add the 1950 Detroit Tigers to our Classic Strat-o-Matic league. We used some older cards from around 1970 or so. The Tiger pitchers were absolutely hammered. These player cards were made to play in the context of 1970 - not 1950. This doesn't say anything about the quality of players in 1950 vs. 1970. It only relates to the context within which the cards were made to play accurately.

patsen · June 6, 2009

Generally speaking, that's true. But, for a zero-sum game where both players contribute (such as batter vs pitcher), there are a lot of ways to set the ratings... 50 vs 50 and 70 vs 70 will both give you league average performance. What you can do is set up how they face other players.

So, let's do something simple. Avg = .266 + .003*(Bat-Pit), where Bat is the batter rating, and Pit is the pitcher rating. Higher batter ratings increase Avg, higher pitcher ratings decrease it. If both are the same, then you get league average for 2004 (.266). If you have Bat 100 vs Pit 50, you get .416, and so forth.
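A minimal Python sketch of that toy formula, for anyone who wants to play with the numbers. The function and constant names are made up for illustration; only the .266 baseline and the .003 slope come from the post, and nothing here reflects how MVP actually turns ratings into results.

```python
# Toy linear model from the post: Avg = .266 + .003 * (Bat - Pit).
LEAGUE_AVG_2004 = 0.266     # league average used as the baseline in the post
AVG_PER_RATING_POINT = 0.003  # slope used in the post

def expected_avg(bat, pit):
    """Expected batting average for a batter rating vs a pitcher rating."""
    return LEAGUE_AVG_2004 + AVG_PER_RATING_POINT * (bat - pit)

print(round(expected_avg(70, 70), 3))   # 0.266 (equal ratings give league average)
print(round(expected_avg(50, 50), 3))   # 0.266
print(round(expected_avg(100, 50), 3))  # 0.416
```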

So, let's say the average hitter and pitcher in 2004 is 70 and 70. We want to know how to rate the average 1968 player, who went .237. So, we face him vs an average 2004 pitcher (70). By the theory I stated earlier, such a matchup should be the midpoint of the two, or .252. So, if an average 1968 hitter (?) faces an average 2004 pitcher (70), what rating does the hitter need to get a result of .252? Algebra tells us he should be rated 65, and, an average pitcher in 1968 would be a 75.
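The algebra can be checked with a short Python sketch. It assumes the same toy formula, Avg = .266 + .003*(Bat-Pit), and the midpoint theory from earlier in the thread; the helper names are hypothetical.

```python
BASE_AVG = 0.266  # 2004 league average from the post
SLOPE = 0.003     # average points per rating point from the post

def batter_rating_for(target_avg, pit):
    """Solve Avg = BASE_AVG + SLOPE * (Bat - Pit) for Bat."""
    return pit + (target_avg - BASE_AVG) / SLOPE

def pitcher_rating_for(target_avg, bat):
    """Solve the same equation for Pit."""
    return bat - (target_avg - BASE_AVG) / SLOPE

# Midpoint of the 2004 (.266) and 1968 (.237) league averages, about .252.
hybrid = (0.266 + 0.237) / 2

hitter_1968 = batter_rating_for(hybrid, pit=70)    # avg 1968 hitter vs avg 2004 pitcher
pitcher_1968 = pitcher_rating_for(hybrid, bat=70)  # avg 2004 hitter vs avg 1968 pitcher

print(round(hitter_1968), round(pitcher_1968))  # 65 75

# Sanity check: the two 1968 ratings facing each other land back on .237.
print(round(BASE_AVG + SLOPE * (hitter_1968 - pitcher_1968), 3))  # 0.237
```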

This is the effect of era. So, every 1968 hitter will bring a little low offense with him, but so will every 1968 pitcher. The real part will be to rate players in their own era.

Carl Yastrzemski led the 1968 AL with a batting average of .301. Knowing the average 1968 pitcher was a 75, we can figure out Yaz should be rated about an 87. (.266 + .003*[87-75] = .302)

A modern .302 hitter would rate as an 82, so when Yaz goes to the modern era, he'll hit closer to .317, but the modern hitters would also lose .015 to average, because of the same effect to pitchers.
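Here is the Yaz example as a quick Python sketch, again assuming the toy formula Avg = .266 + .003*(Bat-Pit) and the 70 (2004) and 75 (1968) average pitcher ratings worked out above. The function names are only for illustration.

```python
BASE_AVG = 0.266
SLOPE = 0.003

def expected_avg(bat, pit):
    return BASE_AVG + SLOPE * (bat - pit)

def batter_rating_for(avg, pit):
    """Rating a hitter needs to hit `avg` against a pitcher rated `pit`."""
    return pit + (avg - BASE_AVG) / SLOPE

AVG_PITCHER_2004 = 70
AVG_PITCHER_1968 = 75

# Rate Yaz inside his own era: .301 against the average 1968 pitcher.
yaz = round(batter_rating_for(0.301, AVG_PITCHER_1968))
print(yaz)  # 87

# A modern hitter with the same raw average needs a lower rating.
print(round(batter_rating_for(0.302, AVG_PITCHER_2004)))  # 82

# Yaz against average modern pitching, and an average modern hitter in 1968.
print(round(expected_avg(yaz, AVG_PITCHER_2004), 3))  # 0.317
print(round(expected_avg(70, AVG_PITCHER_1968), 3))   # 0.251
```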

Where this is a problem is other factors, like defense. If you set Honus Wagner to a low defense so he'll field .926 in his own era, he'll field that in every era, so we may have to normalize things like fielding and stamina (how would we compare 250 IP workhorses with the 400 IP workhorses of yesteryear?)

**paulw** · June 6, 2009

Once again, great info - and quite thought provoking. Something I've always thought about is setting an all time baseline - and using some of your math above - matching ratings up against the baseline. I guess the big question to answer then would be what is an all-time historical baseline? My very rough idea is around 60 on the EA System. An example of a 60-60-60 (Power, Contact, Speed) player would be - from my time of interest - someone like Earl Torgeson. This is just my "feel" for the game and not based on anything empirical.

Off the topic, slightly (but relating to statistical baselines), I always use a very simple way of telling folks how to interpret OPS (On Base + Slugging) at basically any time in history. Here's my scale:

.967 and above A+

.933 to .966 A

.900 to .932 A-

.867 to .899 B+

.834 to .866 B

.800 to .833 B-

.767 to .799 C+

.734 to .766 C

.700 to .733 C-

etc.

Simplicity is its big selling point - it basically uses the old grading scale we had back in High School. And I think it fits fairly nicely. I guess someone playing in 1967 or '68 had as an analogy a very tough teacher.

On my home team now Prince Fielder is an A+ (.996) with Ryan Braun at A (.951). Others would be Corey Hart (C-), Jason Kendall (F), J.J. Hardy (D+) and Craig Counsell a solid C (.754). To me at least it has some face validity.

For the 1951 Dodgers you'd have Roy Campanella A+, Jackie Robinson A, Duke Snider B-, Carl Furillo C+, Pee Wee Reese and Billy Cox C.
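For anyone who wants to run a roster through the OPS scale, here is a small Python sketch. It makes a few assumptions: each row's lower bound is used as the cutoff, the .767 to .799 row is read as C+, and anything under .700 returns nothing because the scale above trails off at "etc." without listing those rows.

```python
# (lower bound, grade), highest first. Rows below C- are not listed in the post.
OPS_GRADES = [
    (0.967, "A+"), (0.933, "A"), (0.900, "A-"),
    (0.867, "B+"), (0.834, "B"), (0.800, "B-"),
    (0.767, "C+"), (0.734, "C"), (0.700, "C-"),
]

def ops_grade(ops):
    """Return the letter grade for an OPS, or None below the listed rows."""
    for floor, grade in OPS_GRADES:
        if ops >= floor:
            return grade
    return None

print(ops_grade(0.996))  # A+  (Fielder)
print(ops_grade(0.951))  # A   (Braun)
print(ops_grade(0.754))  # C   (Counsell)
```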

patsen · June 6, 2009

Well, there is a quick measure of hitting that counts league and park effects, based on OPS. It's called OPS+, and it's on every B-R page.

So, Yaz circa 1968 had an OPS+ of 170, making him much more comparable to Manny Ramirez than Nick Markakis (hitting-wise).

As for all-time ratings, I think I'd agree with a lower baseline. Average of 70 is comparable to the base game, but the base game also has to be able to cover minor leaguers down to A ball. Similar to the WBC mod, I wouldn't be offended to see players average in the 60s. So, the Julie Weras of the world would get the poor ratings.

The problem is, the game has a hard cap of 100, so we need to make sure that we don't hit too many scenarios where someone needs a higher rating. I know in Hardball 5, you could use a hex editor to get ratings higher than 100, but entering the (easy to enter) player editor would revert any überstats to 100.

People did this to make an All-time classics mod, as 100 power or contact was usually not enough for some extreme seasons. Hopefully regression can cover some of those (so we can assume no hitter is capable of hitting .400 as a true talent rating,) but it's something we'll need to look at.

**paulw** · June 7, 2009

I'm familiar with OPS+ - I think the only reason I never really used it was that my OPS system gave me a very simple way to give a quick offensive performance rating. OPS+ is much, much better for making comparisons - for many reasons. Anyhow, I calculated the standard deviation of OPS+ for all major league players from last season with more than 250 at bats. The new baseball-reference.com is wonderful - it took me 10 minutes. It turned out to be approximately 23. I then translated this to a Standard Score Scale and came up with the following approximate subjective values:

146 and above A+

139-145 A

132-138 A-

125-131 B+

118-124 B

111-117 B-

103-110 C+

96-102 C

89-95 C-

82-88 D+

75-81 D

68-74 D-

61-67 F+

60 or lower F

Using it on this year's Brewers I get Braun and Fielder both at A+, Kendall still at a solid F, Cameron at A-, Counsell at C, Hardy at D, Hart at D+, Hall at F+.

Here's the '51 Dodgers: Campanella and Robinson both at A+, Gil Hodges at A-, Duke Snider and Andy Pafko at B, Reese at C+, Cox at C and Furillo at C+.
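The OPS+ scale works the same way as a lookup. A minimal Python sketch, using each row's lower bound as the cutoff (the function name is just for illustration):

```python
# (lower bound, grade), highest first, copied from the OPS+ scale above.
OPS_PLUS_GRADES = [
    (146, "A+"), (139, "A"), (132, "A-"),
    (125, "B+"), (118, "B"), (111, "B-"),
    (103, "C+"), (96, "C"), (89, "C-"),
    (82, "D+"), (75, "D"), (68, "D-"),
    (61, "F+"),
]

def ops_plus_grade(ops_plus):
    """Return the letter grade for an OPS+ value; 60 or lower is an F."""
    for floor, grade in OPS_PLUS_GRADES:
        if ops_plus >= floor:
            return grade
    return "F"

print(ops_plus_grade(170))  # A+  (Yaz in 1968, per the earlier post)
print(ops_plus_grade(100))  # C   (league average)
print(ops_plus_grade(60))   # F
```

Most steps are 7 points wide, which is roughly 0.3 of the standard deviation of 23.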

For MVP you could create Contact+ and Power+ Ratings the same way as OPS+ is calculated. Find their Standard Deviations and then instead of creating grades create MVP Ratings. Same kind of thing with pitcher ratings. What this would do is give all players at any time period ratings based on how they did with their competition. An absolute avalanche of work but an interesting *hypothetical* system.
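A very rough Python sketch of what such a system might look like. Almost everything in it is an assumption for illustration: the "plus" stat is simplified to 100 times the player's value over the league mean (real OPS+ also adjusts for park), the baseline of 60 is the rough all-time figure floated earlier in the thread, 15 rating points per standard deviation is an arbitrary choice, and the sample averages are made up. Results are clamped to the game's 0 to 100 range.

```python
from statistics import mean, pstdev

BASELINE = 60       # assumption: rough all-time baseline suggested in the thread
POINTS_PER_SD = 15  # assumption: arbitrary spread, not derived from MVP
RATING_CAP = 100    # the game's hard cap

def ratings_from_stat(values):
    """Turn one season's raw stat (e.g. batting average) into 0-100 ratings."""
    league_avg = mean(values)
    plus = [100 * v / league_avg for v in values]  # simplified "Contact+" index
    sd = pstdev(plus)
    ratings = []
    for p in plus:
        z = (p - 100) / sd if sd else 0.0          # standard score vs the league
        raw = BASELINE + POINTS_PER_SD * z
        ratings.append(max(0, min(RATING_CAP, round(raw))))
    return ratings

# Made-up three-player "league", purely to show the mechanics.
print(ratings_from_stat([0.240, 0.260, 0.280]))  # [42, 60, 78]
```

Because each season is scored against its own league mean and spread, a 1968 hitter and a 2004 hitter who stood equally far above their peers would come out with the same rating, which is the point of the idea. An extreme season can still push past 100 and get clipped by the cap.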

I'm going to rest my brain now and watch something silly on T.V. :wacko:
