# Benchmark Suggestions?

 0 votes Today I updated the site to allow for multiple benchmarks on the leaderboard, to help give you an idea of where your submissions rank relative to some basic ideas (e.g. submitting the example entry, submitting an entry with all zeros, etc.). In addition, each benchmark has a special graphical designation (to stand out) and a brief description of how you can get that benchmark score. Do you find these benchmarks helpful? If so, can you think of additional ones that I should add? I'm definitely not looking for anything that would give away a big discovery of yours, just some basic techniques to fill a few more spots on the leaderboard. This question is slightly related to the "Interesting Scores" forum topic, but in this case I'm looking for basic techniques that can be completely described in a sentence or two and are easily reproducible using a variety of tools (Excel, R, etc.). #1 | Posted 3 years ago Jeff Moser Kaggle Admin Posts 404 | Votes 216 Joined 21 Aug '10 | Email User
 0 votes Maybe: All 1's; All 15's; 0.509697 (there are currently three on the leaderboard with this score) from: http://www.heritagehealthprize.com/c/hhp/forums/t/538/grrr-if-you-have-0-509697-as-your-score-here-is-what-you-screwed-up Or are you looking for perhaps something more specific, like the methods on page three of http://www.netflixprize.com/assets/NetflixPrizeKDD_to_appear.pdf ? If so, Allan posted in that "Interesting Scores" thread: A simple linear model on Sex and AgeAtFirstClaim gives a public score of 0.478118. As far as something that is easy to reproduce in R, I posted easy-to-reproduce code (including loading and saving the files) in post 10 here: http://www.heritagehealthprize.com/c/hhp/forums/t/607/r-questions Hopefully it scores somewhere near Allan's, but I don't want to use one of my daily submissions on it :) #2 | Posted 3 years ago Competition 20th Posts 194 | Votes 91 Joined 9 Jul '10 | Email User
 0 votes How about the following benchmarks?
- All predictions 0.209179.
- Linear regression on the previous year's number of claims.
- Linear regression on the previous year's number of claims and days in hospital.
- Log1p-based averages of all age-sex combinations.
#3 | Posted 3 years ago Overall 140th Posts 128 | Votes 60 Joined 21 Jul '10 | Email User
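The linear-regression benchmarks above can be reproduced in a few lines of any tool. A minimal sketch in Python with a single predictor (the toy claim counts and DaysInHospital values below are made up for illustration; a real run would use the competition's claims and outcomes tables):

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ~ a + b*x with a single predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

# Made-up data: previous year's claim count -> this year's DaysInHospital
claims = [1, 2, 3, 4]
dih = [0, 1, 2, 3]
a, b = fit_line(claims, dih)  # prediction for a member is a + b * claims
```

Adding days in hospital as a second predictor just means moving to two-variable least squares, which Excel's LINEST or R's lm() handles directly.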
 0 votes Here are a few more simple ones: 1) Predict the Member's DaysInHospital just from the previous year. 2) Predict the average of the Member's DaysInHospital values over all available previous years. #4 | Posted 3 years ago Posts 23 | Votes 10 Joined 2 Dec '10 | Email User
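Suggestion 2 above is just a per-member mean over the available years (and suggestion 1 is the single-year special case). A minimal Python sketch, with made-up (member_id, DaysInHospital) pairs standing in for the real outcomes tables:

```python
from collections import defaultdict

def multi_year_average(history):
    """history: iterable of (member_id, days_in_hospital) pairs drawn from
    all available previous years. Returns each member's mean as the prediction."""
    totals = defaultdict(float)
    years = defaultdict(int)
    for member_id, dih in history:
        totals[member_id] += dih
        years[member_id] += 1
    return {m: totals[m] / years[m] for m in totals}

# Made-up history: member 1 appears in two years, member 2 in one
history = [(1, 2), (1, 0), (2, 5)]
preds = multi_year_average(history)
```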
 0 votes Jose H. Solorzano wrote: All predictions 0.209179. @Jose: Can you give a very clear explanation of where the 0.209179 number came from? I know it was mentioned in this previous topic, but I wasn't able to reproduce it quickly in Excel. I'd especially be interested in a reasonable proof of why this should be the best constant value. @Chris: Right now I'm just looking for some techniques that are easy to verify in a variety of tools without much training, to give people a basic starting point. Anything beyond what Excel can do should be writable in a modern programming language in a few minutes to qualify as a basic benchmark. Later on, it might be fun to have more sophisticated benchmarks, like Random Forest ones, that are tied to things like a quick screencast on how to replicate the result. Please keep the ideas coming :) Thanks! #5 | Posted 3 years ago Jeff Moser Kaggle Admin Posts 404 | Votes 216 Joined 21 Aug '10 | Email User
 0 votes Jeff Moser wrote: Jose H. Solorzano wrote: All predictions 0.209179. @Jose: Can you give a very clear explanation of where the 0.209179 number came from? I know it was mentioned in this previous topic, but I wasn't able to reproduce it quickly in Excel. I'd especially be interested in a reasonable proof of why this should be the best constant value. Allan Engelhardt wrote up the calculation here: It's based on 2 submissions: one with all zeroes, and a second one which could be anything. The all-zeroes submission gives you the mean of A squared. Then you can calculate the mean of A as a function of the mean of A squared and the information from the second submission. #6 | Posted 3 years ago Overall 140th Posts 128 | Votes 60 Joined 21 Jul '10 | Email User
 0 votes I would like to see "mean of Y3 hospitalization" as a benchmark #7 | Posted 3 years ago Competition 9th | Overall 464th Posts 366 | Votes 101 Joined 2 Mar '11 | Email User
 0 votes Jose H. Solorzano wrote: It's based on 2 submissions: One with all zeroes, and a second one which could be anything. The all-zeroes submission gives you the mean of A squared. Then you can calculate the mean of A as a function of the mean of A squared, and the information from the second submission. Ah ha! I didn't realize the other submission could be any constant value. Thanks for the clarification. I've created that benchmark and wrote up more about it here: http://www.heritagehealthprize.com/c/hhp/forums/t/661/the-optimized-constant-value-benchmark/4330#post4330 #8 | Posted 3 years ago Jeff Moser Kaggle Admin Posts 404 | Votes 216 Joined 21 Aug '10 | Email User
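For anyone who wants to check the algebra behind that benchmark: if the leaderboard score is a root-mean-square error over some transformed actuals A (for this competition, A = log(actual + 1)), then an all-zeros submission gives S0^2 = mean(A^2), and an all-c constant submission gives Sc^2 = mean(A^2) - 2*c*mean(A) + c^2, so mean(A) = (S0^2 + c^2 - Sc^2) / (2*c). A small Python sketch with synthetic actuals:

```python
import math

def recover_mean(score_zero, score_c, c):
    """Recover mean(A) from two leaderboard scores:
    score_zero = RMSE of an all-zeros submission  => score_zero**2 = mean(A**2)
    score_c    = RMSE of an all-c submission      => score_c**2 = mean(A**2) - 2*c*mean(A) + c**2
    """
    return (score_zero ** 2 + c ** 2 - score_c ** 2) / (2 * c)

# Synthetic check with known actuals A (in the real setting A = log(actual + 1)
# and the scores come from the leaderboard rather than being computed locally):
A = [0.0, 1.0, 2.0]
c = 1.0
score_zero = math.sqrt(sum(a * a for a in A) / len(A))
score_c = math.sqrt(sum((a - c) ** 2 for a in A) / len(A))
mean_A = recover_mean(score_zero, score_c, c)
```

Since mean squared error as a function of a constant prediction is minimized at the mean, mean(A) is the best constant in the transformed space.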
 0 votes Chris Raimondi wrote: Maybe:  All 15's I decided to do an "All 15's" benchmark because its score is officially at the bottom of the leaderboard and gives an easy score to beat :) #9 | Posted 3 years ago Jeff Moser Kaggle Admin Posts 404 | Votes 216 Joined 21 Aug '10 | Email User
 0 votes Jeff Moser wrote: Chris Raimondi wrote: Maybe:  All 15's I decided to do an "All 15's" benchmark because its score is officially at the bottom of the leaderboard and gives an easy score to beat :) Ha! I managed to (barely) beat it with an entry that gets 2.628085. Too bad the leaderboard doesn't show your lowest score as well. #10 | Posted 3 years ago Posts 16 | Votes 4 Joined 6 May '10 | Email User
 1 vote Chris, your posted R code gives a score of 0.478246 #11 | Posted 3 years ago Posts 5 | Votes 1 Joined 20 Mar '11 | Email User
 0 votes You know, I'd like to see an entry or two for basic ensemble scores across teams -- I'd be curious to see what they'd look like. For example: a straight average (by MemberID) of submissions across the top 10 teams, or top 100 (or ALL team submissions?!). Would this outperform the #1 team? If the Kaggle team was bored and wanted to get creative, they could run numbers for the BEST ensemble (via averaging) across any set of, say, 2, 5, or 10 teams; that shouldn't be too hard to calculate since you have all the data. The contributing teams wouldn't have to be revealed, nor any real additional information, but if you could easily outpace the #1 leader (and I'm guessing you could, at least slightly), it may provide a nice score for people to target, which may increase motivation during the lull. Just a thought. #12 | Posted 2 years ago Competition 63rd | Overall 455th Posts 83 | Votes 22 Joined 20 Mar '11 | Email User
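The straight-average ensemble described above is easy to state precisely. A minimal Python sketch (the dict-of-predictions format is an assumption for illustration; real submissions are CSV files keyed by MemberID):

```python
def average_submissions(submissions):
    """submissions: list of dicts mapping MemberID -> predicted DaysInHospital.
    Returns the straight per-member average across all submissions."""
    members = submissions[0].keys()
    return {m: sum(s[m] for s in submissions) / len(submissions)
            for m in members}

# Two made-up team submissions over the same two members
subs = [{1: 0.0, 2: 2.0}, {1: 1.0, 2: 4.0}]
blend = average_submissions(subs)
```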
 0 votes I think it would be good to have some benchmarks higher up the leaderboard. Here's one simple suggestion. It's a bit like the optimised constant value, but split by gender. It can be achieved with some really simple SQL:

SELECT Gender, EXP(SUM(LOG(dih + 1)) / COUNT(*)) - 1
FROM members, outcomesy3
WHERE members.memberid = outcomesy3.memberid
GROUP BY Gender;

which to 6 d.p. gives 0.405301, 0.127749 and 0.164897 for blank gender, males and females respectively. This just gave me 0.484838, which admittedly isn't a lot better than the optimised constant value. It would be easy to further subdivide by age group, which should give a better score: I think I got 0.47744 doing this, and 0.474268 if those with truncated claims are treated separately (i.e. with 31 constant values). It's remarkable really that such a simple approach put me ahead of half of the teams on the leaderboard in my first week, and also that I haven't improved much on it since! I'd quite like some benchmarks in the 0.46 - 0.47 region to aim at, but my best score isn't that good yet, so I can't offer any suggestions about that. #13 | Posted 2 years ago | Edited 2 years ago Posts 19 Joined 30 Nov '11 | Email User
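The same geometric-mean-per-group calculation as the SQL above, including the suggested subdivision by age group, can be sketched in Python (the made-up rows below stand in for the joined members/outcomes tables; the log1p/expm1 transform mirrors the query's EXP(SUM(LOG(dih+1))/COUNT(*)) - 1):

```python
import math
from collections import defaultdict

def log1p_group_means(rows):
    """For each (age_group, sex) key, return exp(mean(log(dih + 1))) - 1,
    i.e. the constant prediction that minimizes squared error in
    log1p space for that group."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for age, sex, dih in rows:
        sums[(age, sex)] += math.log1p(dih)
        counts[(age, sex)] += 1
    return {key: math.expm1(sums[key] / counts[key]) for key in sums}

# Made-up rows: (AgeAtFirstClaim, Sex, DaysInHospital)
rows = [("0-9", "M", 0), ("0-9", "M", 3), ("70-79", "F", 4)]
preds = log1p_group_means(rows)
```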
 0 votes I would like to see benchmarks for the public board score for the milestone prize winners. This would make a good target to judge progress. #14 | Posted 2 years ago Posts 57 | Votes 45 Joined 4 Apr '11 | Email User
 0 votes How about all constant value predictions from 0 to 15?

FOR I = 0 TO 15
  AddConstantValueBenchmark(I)
NEXT I

OK, I haven't written BASIC for a while, so the code might not compile. #15 | Posted 2 years ago Competition 2nd | Overall 109th Posts 258 | Votes 71 Joined 12 Nov '10 | Email User
 0 votes At this point, the only benchmarks that really matter are the current in-the-money or maybe top-5 leaders. But then there's always this: Round 1 Milestone Prize - How We Did It - Team 'Market Makers' wrote: Multiple models were built on the two data sets using various parameter settings and variable subsets. Gradient Boosting Machines were the most powerful individual algorithm, with a leaderboard score around 0.461 being consistently achievable. (emphasis added.) They also reported their best individual model scoring 0.460 when used in conjunction with the data set containing only one year of history. Bagged Trees and Neural Network ensembles gave leaderboard errors of the order of 0.463, with linear regression the poorest individual performer at 0.466. Takeaways: if your best candidate models aren't in the 0.461-or-better range, work on them. If they are, work on ensembling, (anti-)overfitting and post-model "calibration." HTH #16 | Posted 2 years ago Posts 95 | Votes 25 Joined 8 Apr '11 | Email User