Jeff Moser
Kaggle Admin
Posts 404
Thanks 215
Joined 21 Aug '10
From Kaggle

Today I updated the site to allow for multiple benchmarks on the leaderboard, to help give you an idea of where your submissions rank relative to some basic approaches (e.g., submitting the example entry, or submitting an entry with all zeros). In addition, each benchmark has a special graphical designation so it stands out, along with a brief description of how you can reproduce that benchmark score.

Do you find these benchmarks to be helpful? If so, can you think of additional ones that I should add? I'm definitely not looking for anything that would give away a big discovery of yours, but just some basic techniques to fill a few more spots on the leaderboard.

This question is slightly related to the "Interesting Scores" forum topic, but in this case I'm looking for basic techniques that can be completely described in a sentence or two and are easily reproducible using a variety of tools (Excel, R, etc.).

 
Chris Raimondi
Rank 20th
Posts 194
Thanks 91
Joined 9 Jul '10

Maybe:

All Ones

All 15s

0.509697 (there are currently three teams on the leaderboard with this score), from:
http://www.heritagehealthprize.com/c/hhp/forums/t/538/grrr-if-you-have-0-509697-as-your-score-here-is-what-you-screwed-up

Or are you perhaps looking for something more specific, like the methods on page three of:
http://www.netflixprize.com/assets/NetflixPrizeKDD_to_appear.pdf

If so, Allan posted this in that "Interesting Scores" thread:

A simple linear model on Sex and AgeAtFirstClaim gives a public score of 0.478118

As far as something that is easy to reproduce in R, I posted code (including loading and saving the files) in post 10 here:

http://www.heritagehealthprize.com/c/hhp/forums/t/607/r-questions

Hopefully it scores somewhere near Allan's, but I don't want to use one of my daily submissions on it :)
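
For reference, a model like the one Allan describes is only a few lines of R. A rough, untested sketch, assuming the standard Members.csv and DaysInHospital_Y2.csv layouts from the competition download:

# Untested sketch of a linear model on Sex and AgeAtFirstClaim.
members <- read.csv("Members.csv")
dih <- read.csv("DaysInHospital_Y2.csv")

# Join Y2 outcomes onto member demographics.
train <- merge(dih, members, by = "MemberID")

# Fit on log1p(DaysInHospital), since the metric works in log space.
fit <- lm(log1p(DaysInHospital) ~ Sex + AgeAtFirstClaim, data = train)

# Predict for every member and clip to the valid [0, 15] range.
pred <- pmax(0, pmin(15, expm1(predict(fit, newdata = members))))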

 
José Solórzano
Posts 128
Thanks 60
Joined 21 Jul '10

How about the following benchmarks?

  • All predictions 0.209179.
  • Linear regression on the previous year's number of claims.
  • Linear regression on the previous year's number of claims and days in hospital.
  • Log1p-based averages of all age-sex combinations (rough R sketches of two of these below).
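
Two of these are quick to sketch in R. A rough, untested sketch, assuming the standard Claims.csv, DaysInHospital_Y2.csv and Members.csv layouts from the competition download:

# Untested sketch; assumes the standard competition file layouts.
claims <- read.csv("Claims.csv")
dih_y2 <- read.csv("DaysInHospital_Y2.csv")
members <- read.csv("Members.csv")

# --- Linear regression on the previous year's number of claims ---
y1 <- subset(claims, Year == "Y1")
counts <- as.data.frame(table(MemberID = y1$MemberID), responseName = "NumClaims")
counts$MemberID <- as.integer(as.character(counts$MemberID))
train <- merge(dih_y2, counts, by = "MemberID", all.x = TRUE)
train$NumClaims[is.na(train$NumClaims)] <- 0
fit <- lm(log1p(DaysInHospital) ~ NumClaims, data = train)

# --- Log1p-based averages of all age-sex combinations ---
m <- merge(dih_y2, members, by = "MemberID")
avg <- aggregate(log1p(DaysInHospital) ~ Sex + AgeAtFirstClaim, data = m, FUN = mean)
avg$Constant <- expm1(avg[["log1p(DaysInHospital)"]])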
 
Ford Prefect
Posts 23
Thanks 10
Joined 2 Dec '10

Here are a few more simple ones:

1) Predict the Member's DaysInHospital just from the previous year.

2) Predict the average of the Member's DaysInHospital values over all available previous years.
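
Both are a few lines of R once the DaysInHospital files are loaded. A rough, untested sketch, assuming the standard DaysInHospital_Y2.csv and DaysInHospital_Y3.csv layouts:

# Untested sketch; assumes the standard DaysInHospital file layouts.
y2 <- read.csv("DaysInHospital_Y2.csv")
y3 <- read.csv("DaysInHospital_Y3.csv")

# 1) Carry each member's previous-year (Y3) value forward.
pred1 <- y3[, c("MemberID", "DaysInHospital")]

# 2) Average each member's values over all available previous years.
both <- merge(y2, y3, by = "MemberID", suffixes = c(".y2", ".y3"), all = TRUE)
pred2 <- data.frame(
  MemberID = both$MemberID,
  DaysInHospital = rowMeans(both[, c("DaysInHospital.y2", "DaysInHospital.y3")],
                            na.rm = TRUE))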

 
Jeff Moser
Kaggle Admin
Posts 404
Thanks 215
Joined 21 Aug '10
From Kaggle

Jose H. Solorzano wrote:
  • All predictions 0.209179.

@Jose: Can you give a very clear explanation of where the 0.209179 number came from? I know it was mentioned in this previous topic, but I wasn't able to reproduce it quickly in Excel. I'd especially be interested in a reasonable proof of why this should be the best constant value.

@Chris: Right now I'm just looking for some techniques that are easy to verify in a variety of tools, without much training, to give people a basic starting point. Anything beyond what Excel can do should be writable in a modern programming language in a few minutes to qualify as a basic benchmark. Later on, it might be fun to have more sophisticated benchmarks, like Random Forest ones, tied to things like a quick screencast on how to replicate the result.

Please keep the ideas coming :) Thanks!

 
José Solórzano
Posts 128
Thanks 60
Joined 21 Jul '10

Jeff Moser wrote:

Jose H. Solorzano wrote:
  • All predictions 0.209179.

@Jose: Can you give a very clear explanation of where the 0.209179 number came from? I know it was mentioned in this previous topic, but I wasn't able to reproduce it quickly in Excel. I'd especially be interested in a reasonable proof of why this should be the best constant value.

Allan Engelhardt wrote up the calculation here:

http://www.heritagehealthprize.com/c/hhp/forums/t/523/interesting-submissions-with-scores/3695#post3695

It's based on two submissions: one with all zeroes, and a second one which can be any other constant. The all-zeroes submission gives you the mean of A squared, where A = log(actual + 1). Then you can calculate the mean of A from the mean of A squared and the information from the second submission.
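
Spelling out the algebra: the score of a constant submission c is sqrt(mean((log(c + 1) - A)^2)). The all-zeroes submission gives

s0^2 = mean(A^2)

and a second constant submission k gives

sk^2 = log(k + 1)^2 - 2 * log(k + 1) * mean(A) + mean(A^2)

so

mean(A) = (log(k + 1)^2 + s0^2 - sk^2) / (2 * log(k + 1))

and the score-minimizing constant is exp(mean(A)) - 1.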

 
Zach
Rank 9th
Posts 363
Thanks 96
Joined 2 Mar '11

I would like to see "mean of Y3 hospitalization" as a benchmark.
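
That constant is a one-liner in R (assuming the standard DaysInHospital_Y3.csv layout):

# Arithmetic mean of the Y3 outcome, used as a constant prediction.
pred <- mean(read.csv("DaysInHospital_Y3.csv")$DaysInHospital)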

 
Jeff Moser
Kaggle Admin
Posts 404
Thanks 215
Joined 21 Aug '10
From Kaggle

Jose H. Solorzano wrote:

It's based on two submissions: one with all zeroes, and a second one which can be any other constant. The all-zeroes submission gives you the mean of A squared, where A = log(actual + 1). Then you can calculate the mean of A from the mean of A squared and the information from the second submission.

Ah ha! I didn't realize the other submission could be any constant value. Thanks for the clarification. I've created that benchmark and written up more about it here: http://www.heritagehealthprize.com/c/hhp/forums/t/661/the-optimized-constant-value-benchmark/4330#post4330

 
Jeff Moser
Kaggle Admin
Posts 404
Thanks 215
Joined 21 Aug '10
From Kaggle

Chris Raimondi wrote:

Maybe: All 15s

I decided to do an "All 15s" benchmark because its score officially sits at the bottom of the leaderboard and gives everyone an easy score to beat :)

 
Valentin Tiriac
Posts 16
Thanks 4
Joined 6 May '10

Jeff Moser wrote:

Chris Raimondi wrote:

Maybe: All 15s

I decided to do an "All 15s" benchmark because its score officially sits at the bottom of the leaderboard and gives everyone an easy score to beat :)

Ha! I managed to (barely) beat it with an entry that gets 2.628085. Too bad the leaderboard doesn't show your lowest score as well. 

 
Strangelove
Posts 5
Thanks 1
Joined 20 Mar '11

Chris, your posted R code gives a score of 0.478246.

Thanked by Chris Raimondi
 
ChipMonkey
Rank 63rd
Posts 83
Thanks 22
Joined 20 Mar '11

You know, I'd like to see an entry or two for basic ensemble scores across teams -- I'd be curious to see what they'd look like.

For example: a straight average (by MemberID) of submissions across the top 10 teams, or the top 100 (or ALL team submissions?!).

Would this outperform the #1 team? If the Kaggle team were bored and wanted to get creative, they could run the numbers for the BEST ensemble (via averaging) across any set of, say, 2, 5, or 10 teams; that shouldn't be too hard to calculate since you have all the data. The contributing teams wouldn't have to be revealed, nor any real additional information, but if you could easily outpace the #1 leader (and I'm guessing you could, at least slightly), it would provide a nice score for people to target, which might increase motivation during the lull.

Just a thought.
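
The averaging itself would be trivial to script. A rough, untested R sketch, assuming each submission file uses the standard MemberID / DaysInHospital columns (the file names here are made up):

# Untested sketch: average several submission files by MemberID.
files <- c("team1.csv", "team2.csv", "team3.csv")  # hypothetical file names
subs <- lapply(files, read.csv)
all_preds <- do.call(rbind, subs)
ensemble <- aggregate(DaysInHospital ~ MemberID, data = all_preds, FUN = mean)
write.csv(ensemble, "ensemble.csv", row.names = FALSE)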

 
Outsider
Posts 19
Joined 30 Nov '11

I think it would be good to have some benchmarks higher up the leaderboard. Here's one simple suggestion. It's a bit like the optimised constant value but split by gender. It can be achieved with some really simple SQL...

-- Per-gender optimised constant: exp(mean(log(dih + 1))) - 1
SELECT m.Gender,
       EXP(SUM(LOG(o.dih + 1)) / COUNT(*)) - 1
FROM members m
JOIN outcomesy3 o ON o.memberid = m.memberid
GROUP BY m.Gender;

which to 6 d.p. gives, for blank gender, males, and females respectively:

0.405301
0.127749
0.164897

This just gave me 0.484838, which admittedly isn't a lot better than the optimised constant value. It would be easy to further subdivide by age group, which should give a better score: I think I got 0.47744 doing this, and 0.474268 if those with claims truncated are treated separately (i.e. with 31 constant values).

It's remarkable, really, that such a simple approach put me ahead of half the teams on the leaderboard in my first week, and also that I haven't improved much on it since! I'd quite like some benchmarks in the 0.46-0.47 region to aim at, but my best score isn't that good yet, so I can't offer any suggestions about that.
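
For anyone without a database handy, the same subdivision is a short R sketch (untested; assumes the standard Members.csv and DaysInHospital_Y3.csv layouts):

# Untested sketch: group constants by Sex, AgeAtFirstClaim and ClaimsTruncated,
# using exp(mean(log(dih + 1))) - 1 within each group, as in the SQL above.
members <- read.csv("Members.csv")
y3 <- read.csv("DaysInHospital_Y3.csv")
m <- merge(y3, members, by = "MemberID")
consts <- aggregate(log1p(DaysInHospital) ~ Sex + AgeAtFirstClaim + ClaimsTruncated,
                    data = m, FUN = mean)
consts$Prediction <- expm1(consts[["log1p(DaysInHospital)"]])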

 
Neil Schneider
Posts 57
Thanks 45
Joined 4 Apr '11

I would like to see benchmarks for the public leaderboard scores of the milestone prize winners. These would make good targets for judging progress.

 
B Yang
Rank 2nd
Posts 255
Thanks 71
Joined 12 Nov '10

How about all constant-value predictions from 0 to 15?

FOR I=0 TO 15
AddConstantValueBenchmark(I)
NEXT I

OK, I haven't written BASIC for a while, so the code might not compile.

 
Signipinnis
Posts 95
Thanks 25
Joined 8 Apr '11

At this point, the only benchmarks that really matter are the current In-The-Money teams, or maybe the Top 5 leaders.

But then there's always this:

Round 1 Milestone Prize - How We Did It - Team 'Market Makers' wrote:

Multiple models were built on the two data sets using various parameter settings and variable subsets. Gradient Boosting Machines were the most powerful individual algorithm, with a leaderboard score around 0.461 being consistently achievable (emphasis added). They also gave the best individual model of 0.460 when used in conjunction with the data set containing only one year of history. Bagged Trees and Neural Network ensembles gave leaderboard errors of the order of 0.463, with linear regression the poorest individual performer at 0.466.

Takeaways:

  • If your best candidate models aren't in the .461 or better range, work on them.
  • If they are, work on Ensembling, (anti-) Overfitting and post-model "calibration."

HTH

 
