DaveC
Posts 14
Thanks 3
Joined 16 Feb '11

Hi.

Until recently, I've been thinking 'wow, there is a US$3M prize' for first place! However now I'm telling myself to revise my enthusiasm downwards... since it seems to me likely that no-one is going to beat the required 0.4 accuracy threshold. Hence the prize is really just US$500k. Not bad.... but not enough to retire on :(

After nearly a month of submissions, the rate of improvement is already heading towards an asymptote that isn't 0.4. The best today is 0.461113. Admittedly we are still to get some more data supplied. But realistically I don't think that we're going to be able to get down to 0.4. That target is just too hard...

Any other opinions? Personally I'd be happy to bet at 5-1 odds today that by the end of the competition no-one beats 0.4. Any takers? (e.g., I put up $50, you put up $10)

Dave
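For anyone weighing the bet, here is a minimal sketch of the break-even arithmetic. The function name `break_even_probability` is made up for illustration; the $50 and $10 stakes are the ones from the example above.

```python
def break_even_probability(own_stake, opponent_stake):
    """Probability of winning at which the bettor's expected profit is zero.

    Expected profit = p * opponent_stake - (1 - p) * own_stake,
    which is zero when p = own_stake / (own_stake + opponent_stake).
    """
    if own_stake <= 0 or opponent_stake <= 0:
        raise ValueError("stakes must be positive")
    return own_stake / (own_stake + opponent_stake)


# Stakes from the post: Dave risks $50 to win $10, the taker risks $10 to win $50.
dave = break_even_probability(50, 10)
taker = break_even_probability(10, 50)
print(f"Dave needs P(nobody beats 0.4) above {dave:.3f}")
print(f"A taker needs P(somebody beats 0.4) above {taker:.3f}")
```

In other words, offering 5-1 only makes sense for Dave if he believes the chance that nobody beats 0.4 is better than 5 in 6.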

 

 

 
ChipMonkey
Rank 84th
Posts 60
Thanks 14
Joined 20 Mar '11

I think the release of Labs and Rx data will have an impact on the predictions, but yes, getting to 0.4 seems rough. From what I can tell, the leaders are barely 0.02 RMSLE better than an entry based solely on Age and Gender. If Labs and Rx alone improve things by another 0.02 (I'm actually guessing that Labs and Rx will be relatively less of an improvement than the initial data release, but who knows), we'd still need an algorithm change to move another 0.04. Judging from some of the graphs in the Netflix and other Kaggle competitions, that's not unheard of, particularly given the long time period, but it's definitely not easy.
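For readers unfamiliar with the metric, here is a rough sketch of RMSLE (root mean squared logarithmic error) as it is usually defined, with log(x + 1) applied to both prediction and actual value. The function name `rmsle` and the sample numbers are assumptions for illustration only; they are not competition data.

```python
import math


def rmsle(predicted, actual):
    """Root mean squared logarithmic error: sqrt(mean((log(p+1) - log(a+1))^2))."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    total = sum(
        (math.log1p(p) - math.log1p(a)) ** 2
        for p, a in zip(predicted, actual)
    )
    return math.sqrt(total / len(predicted))


# Hypothetical days-in-hospital values, invented for this example.
actual_days = [0, 0, 1, 0, 5]
constant_guess = [0.2] * len(actual_days)
print(rmsle(constant_guess, actual_days))
```

Because of the logarithm, a miss on a patient with many days in hospital is penalized far less than it would be under plain RMSE, which is part of why small gains such as 0.02 are hard to come by.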

5:1 is probably good odds, but I wouldn't make a bet until a while after the next data release.

0.4 seems like a good benchmark from Heritage's point of view -- it's certainly worth $2.5mil for the attempt (legal issues notwithstanding); I wonder how much effort they put into picking the number...?

---Chip

 
Uri Blass
Posts 253
Thanks 4
Joined 5 Aug '10

I think that the leaders are not even 0.02 better than a submission that is based only on age and gender.

 
Zach
Rank 31st
Posts 292
Thanks 64
Joined 2 Mar '11
I'll take the other side of that 5-1 bet.
 
Justin Washtell
Posts 48
Thanks 15
Joined 26 Aug '10
Me too!
 
Chris Raimondi
Rank 38th
Posts 194
Thanks 90
Joined 9 Jul '10
I think 0.40000 will be rough as well, but I will try to make do with $500k :) I figure I will have to win the progress prizes as well to supplement my income. I tend to think everything will be little baby steps, but we haven't seen the labs data yet (I agree with ChipMonkey's guess that it will be less helpful than what we already have), and I am optimistic that over two years we can figure out quite a few ways to improve things.
 
William Cukierski
Kaggle Admin
Posts 339
Thanks 165
Joined 13 Oct '10
From Kaggle

I wonder whether the bottleneck on this competition will end up being the anonymization/granularity of the data. If you bin the data enough, you create an artificial, information-theoretic best error that won't be surpassed by anything but luck (think of the limiting case, where all you are given is that every patient was hospitalized between 0 and 365 days).


The big question here is whether contestants are approaching such an artificial limit, or whether there is more to be gleaned from clever algorithms. In other words, is the inherent stochasticity of the system greater than the noise introduced by fuzzing the data? Either way, I think it's far too soon to call it here. The Netflix Prize was looking like a dubious threshold, until it wasn't.
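The binning argument can be illustrated with a toy sketch. Everything here is hypothetical: the feature, the target, and the helper `best_rmse_given_bins` are invented for illustration, and plain squared error is used instead of the competition's RMSLE to keep the arithmetic simple. The target is fully determined by the exact feature, so any remaining error comes purely from coarsening.

```python
import math
from collections import defaultdict


def best_rmse_given_bins(xs, ys, bin_width):
    """RMSE of the best predictor that sees only the binned feature.

    Under squared error, the best guess for everyone in a bin is the bin's
    mean target, so the leftover error is the within-bin spread.
    """
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[x // bin_width].append(y)
    squared_error = 0.0
    for values in groups.values():
        mean = sum(values) / len(values)
        squared_error += sum((v - mean) ** 2 for v in values)
    return math.sqrt(squared_error / len(ys))


xs = list(range(100))      # hypothetical exact feature (e.g. an age in years)
ys = [x / 10 for x in xs]  # hypothetical target, fully determined by xs

for width in (1, 10, 100):
    print(width, best_rmse_given_bins(xs, ys, width))
```

With a bin width of 1 nothing is lost; with wider bins the floor rises, and no algorithm that only sees the binned feature can get below it. Whether the real data is anywhere near such a floor is exactly the open question.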

 
Anthony Goldbloom (Kaggle)
Competition Admin
Kaggle Admin
Posts 382
Thanks 72
Joined 20 Jan '10
From Kaggle

Further to Will's point, those who followed the Netflix Prize will remember the jump from the Simon Funk discovery. 

 
oregano
Posts 3
Joined 27 Apr '11

DaveC wrote:

Any other opinions? Personally I'd be happy to bet at 5-1 odds today that by the end of the competition no-one beats 0.4. Any takers? (e.g., I put up $50, you put up $10)

Dave

Dave, I'll bet you $3mil I win the grand prize... ;)

 

Kidding.  I wish : (

 
Alexander Larko
Posts 65
Thanks 34
Joined 14 May '10

Hi all!

Nobody promised that earning three million would be easy!

Good luck to all!

 
alexanderr
Posts 42
Thanks 2
Joined 5 Apr '11
I am only putting in another entry when I am sure I will go below 0.4000. More thought, less calculation: that's my motto.
 
Karan Sarao
Posts 52
Thanks 2
Joined 14 Mar '11

Alexanderr,

 

You may want to win the midway prizes, the first of which is at the end of this August! You don't need to be below 0.4, just in the top 2 of the leaderboard. So keep uploading!

 
DaveC
Posts 14
Thanks 3
Joined 16 Feb '11

Considering the state of the lab and prescription data that we've now been given (which is to say the lack of any lab results and the lack of any prescription data), I think my 5-1 wager is on the conservative side!

The chance that anyone will collect the $3M now strikes me as << 1%.

Good luck everyone - slave away for 2 years on this endeavour, but remember the prize is $500K, not $3M.

(although true enough, I'm only saying that to discourage as many of you as possible from competing, so as to improve my own chances of winning $500k :)

DaveC

 
Jose H. Solorzano
Posts 103
Thanks 47
Joined 21 Jul '10

DaveC wrote:

Considering the state of the lab and prescription data that we've now been given (which is to say the lack of any lab results and the lack of any prescription data), I think my 5-1 wager is on the conservative side!

The chance that anyone will collect the $3M now strikes me as << 1%.

Good luck everyone - slave away for 2 years on this endeavour, but remember the prize is $500K, not $3M.

(although true enough, I'm only saying that to discourage as many of you as possible from competing, so as to improve my own chances of winning $500k :)

DaveC

I tentatively agree. The biggest bummer, in my view, is that this is a competition that is entirely about money -- in all likelihood. It won't change healthcare as was hoped.

Perhaps it's too ambitious to try to predict a largely random event one year in advance. Maybe predicting one claim in advance would be more realistic.

 
MGNute
Posts 2
Joined 21 Apr '11
Dave, I'm good for that bet if you're still taking action. Mike N.
 