Sarkis
Posts 41
Thanks 5
Joined 5 Apr '11

I don't want to bet on this either way, but I'd like to note that Leaderboard scores are computed using 30% of the data, whereas the Accuracy Threshold defined under Information > Evaluation is computed using the total number of members (100%). This makes it even harder to beat 0.4 accuracy. In any case, Round 1 Milestone Prizes are coming soon. Good luck, fellow competitors!

 
Zach
Rank 31st
Posts 292
Thanks 64
Joined 2 Mar '11

Sarkis wrote:

I don't want to bet on this either way, but I'd like to note that Leaderboard scores are computed using 30% of the data, whereas the Accuracy Threshold defined under Information > Evaluation is computed using the total number of members (100%). This makes it even harder to beat 0.4 accuracy. In any case, Round 1 Milestone Prizes are coming soon. Good luck, fellow competitors!

 

This is to prevent overfitting.

A good algorithm can be expected to get similar scores on both the entire dataset and the 30% subset. A bad algorithm will overfit the subset and score poorly on the holdout.
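To make the overfitting point concrete, here is a minimal simulation sketch. It is not the competition's actual scoring code: the RMSE metric, the member count, the noise levels, and the function names are all assumptions chosen for illustration, and the holdout is taken to be the remaining 70% of members. Every simulated submission is equally good by construction, so picking the one with the best score on the 30% subset amounts to pure selection on noise.

```python
import random


def rmse(pred, truth, idx):
    """Root mean squared error over the members listed in idx."""
    return (sum((pred[i] - truth[i]) ** 2 for i in idx) / len(idx)) ** 0.5


def simulate(n_members=2000, n_submissions=200, public_frac=0.30, seed=0):
    """Return (public_score, holdout_score) of the submission that looks
    best on the public subset. All sizes here are hypothetical."""
    rng = random.Random(seed)
    truth = [rng.gauss(0, 1) for _ in range(n_members)]

    # Random split: 30% drives the leaderboard, the rest is the holdout.
    indices = list(range(n_members))
    rng.shuffle(indices)
    cut = int(public_frac * n_members)
    public, holdout = indices[:cut], indices[cut:]

    best = None
    for _ in range(n_submissions):
        # Each submission has the same true quality (truth plus unit noise).
        pred = [t + rng.gauss(0, 1) for t in truth]
        pub = rmse(pred, truth, public)
        if best is None or pub < best[0]:
            # Keep the submission with the lowest public score.
            best = (pub, rmse(pred, truth, holdout))
    return best


if __name__ == "__main__":
    pub, hold = simulate()
    print(f"best public RMSE: {pub:.3f}, same submission on holdout: {hold:.3f}")
```

Under these assumptions, the submission chosen for its leaderboard score tends to look better on the 30% subset than it does on the holdout, which is the gap a holdout-based prize score is meant to expose.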

 
Chris Raimondi
Rank 38th
Posts 194
Thanks 90
Joined 9 Jul '10

Evaluation is computed using total number of members (100%)

I was under the impression that the 30% is not included in the scoring set - based on the graphic at the bottom of this page:

http://www.heritagehealthprize.com/c/hhp/Data

My reading of that suggests that only the unknowable data is going to be used for prize scoring (I hope that is the case).

 