I don't want to bet on this either way, but I'd like to note that Leaderboard scores are computed using 30% of the data, whereas the Accuracy Threshold defined under Information > Evaluation is computed using the total number of members (100%). This makes it even harder to beat 0.4 accuracy. In any case, Round 1 Milestone Prizes are coming soon. Good luck, fellow competitors!
Thanks 5 Joined 5 Apr '11 Email user |
Posts 292 Thanks 64 Joined 2 Mar '11 Email user |
Sarkis wrote: I don't want to bet on this either way, but I'd like to note that Leaderboard scores are computed using 30% of the data, whereas the Accuracy Threshold defined under Information > Evaluation is computed using the total number of members (100%). This makes it even harder to beat 0.4 accuracy. In any case, Round 1 Milestone Prizes are coming soon. Good luck, fellow competitors!
This is to prevent overfitting. A good algorithm can be expected to get similar scores on both the entire dataset and the 30% subset. A bad algorithm will overfit the subset and score poorly on the holdout. |
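A toy sketch of that point, not the competition's actual scoring code. It assumes an RMSLE-style error metric, synthetic targets, and a random 30% "public" split; the names `rmsle`, `score`, `honest`, and `overfit` are made up for illustration. The "overfit" model is an extreme caricature that has memorized the public rows.

```python
import math
import random


def rmsle(predicted, actual):
    """Root mean squared logarithmic error (assumed metric for this sketch)."""
    total = sum((math.log1p(p) - math.log1p(a)) ** 2
                for p, a in zip(predicted, actual))
    return math.sqrt(total / len(actual))


def score(predictions, rows):
    """Score a list of predictions on the given row indices only."""
    return rmsle([predictions[i] for i in rows], [actual[i] for i in rows])


random.seed(0)
n = 10_000

# Synthetic targets: most members have 0, a few have larger values.
actual = [random.choice([0, 0, 0, 0, 1, 2, 5]) for _ in range(n)]

# Random 30% "public leaderboard" subset; the remaining 70% is the holdout.
everyone = list(range(n))
shuffled = everyone[:]
random.shuffle(shuffled)
public = shuffled[: int(0.3 * n)]
holdout = shuffled[int(0.3 * n):]

# "Honest" model: one constant guess, never tuned against the public subset.
honest = [0.2] * n

# "Overfit" model: has effectively memorized the public rows
# (an exaggerated stand-in for heavy leaderboard tuning).
overfit = [0.2] * n
for i in public:
    overfit[i] = actual[i]

print("honest : public", round(score(honest, public), 3),
      "| all", round(score(honest, everyone), 3))
print("overfit: public", round(score(overfit, public), 3),
      "| holdout", round(score(overfit, holdout), 3))
```

In this sketch the honest model's public score tracks its score on all rows closely, while the overfit model looks perfect on the public subset and no better than the constant guess on the holdout.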
Posts 194 Thanks 90 Joined 9 Jul '10 Email user |
I was under the impression that the 30% is not included in the scoring set - based on the graphic at the bottom of this page: http://www.heritagehealthprize.com/c/hhp/Data My reading of that suggests that only the unknowable data is going to be used for prize scoring (I hope that is the case). |