# Clarifying Rule #13 for Milestone 1

 I'm not Anthony.
 I'm very interested in the answers to these questions as well. The answers will be make-or-break for a lot of contestants.
 @Signipinnis I don't mean this thread to be exclusionary -- sorry if it came across that way. I addressed Anthony because I specifically want to get Kaggle's official comments on these items in addition to any other replies. Please feel free to share your reply.
 I believe some of these points were addressed in an early post by Jeremy Howard (at Kaggle): "Only the paper describing the algorithm will be posted publicly. The paper must fully describe the algorithm. If other competitors find that it's missing key information, or doesn't behave as advertised, then they can appeal. The idea of course is that progress prize winners will fully share the results they've used to that point, so that all competitors can benefit for the remainder of the comp, and so that the overall outcome for health care is improved."

 Also, I think you won't be forced to share your results, even if you're in the #1 position – but then again, you won't be able to claim the $30,000 or $20,000 either, unfortunately. Those are the rules, and they certainly do create a dilemma for top competitors.

 Whether or not this structure is "fair" I think might be a question for philosophers. As a practical matter, it will spur innovation as people build off of others' ideas, trying to stay competitive. Also, note that there have already been some great disclosures in the Forums (some with code!) posted by top competitors (Chris R in particular), which have already helped others.

 Next, I should point out that the Netflix Prize had the same type of milestone prize structure & disclosure requirement. One team -- Team BellKor -- won milestone / 'progress' prizes, disclosed their methods along the way, and was still able to be part of the team that won the $1MM Grand Prize. Yes, other people built on the techniques they disclosed (but then again, BellKor's approach built on techniques that other teams had disclosed...). My point is that in at least that case, it was possible for the leaders to disclose their methods & still remain competitive.

 About the level of detail required: I would hope that the detail would strive to match the standards set by the Netflix Prize's "Progress Prize" papers.
See the solution papers referenced in these posts: [ EDIT, to address ChrisR's point below ] There's a lot of 'fancy' math in these papers, but I don't want to imply that that's necessary. In fact, too many equations can hinder understanding, and clear text or pseudocode might be better at times. My point is that these documents do not try to gloss over any details or hide critical parameters in footnotes, etc. [ /EDIT ] Finally, just to be clear, much of the above is my own opinion (as a humble Kaggle competitor), not to be confused with any 'official' response to your questions.

Bobby wrote: "The idea of course is that progress prize winners will fully share the results they've used to that point, so that all competitors can benefit for the remainder of the comp, and so that the overall outcome for health care is improved."

Unacceptable. This is a contest, not a group collaboration.

"I think you won't be forced to share your results, even if you're in the #1 position – but then again, you won't be able to claim the $30,000 or $20,000 either, unfortunately."

That is very unfortunate, and hopefully not true (I still hope a moderator will step in and inform us of the level of detail required). My only motive for this competition is the money, not to help others win money.

"Whether or not this structure is "fair" I think might be a question for philosophers -- but as a practical matter, it will spur innovation as people build off of others' ideas."

It's not fair to anyone, and copy-cats stand to benefit. The idea I'm implementing has taken me my ENTIRE LIFE of research to get to. There's not a chance in hell I would willingly give it away for others to get a free shortcut/cheat. I'm standing by to hear the official response before I even make my submission to the leaderboard.

Signipinnis wrote: Bobby wrote: Unacceptable. This is a contest not a group collaboration.

Not really, this is a hybrid model of a crowd-sourced search for a problem solution. There are two preliminary phases, incentivized by cash awards specifically for collaboration. Then there's the gold rush for the best ultimate solution, arising from the previous shared benchmark/algorithm/methodology.

Bobby wrote: The idea I'm implementing has taken me my ENTIRE LIFE of research to get to. There's not a chance in hell I would willingly give it away for others to get a free shortcut/cheat. I'm standing by to hear the official response before I even make my submission to the leader board.

So wait until the 3rd phase starts.

The way I see it, there are (likely) a number of people here with a proprietary approach/tool that they think will absolutely, unquestionably, blow the doors off everyone else. And needless to say, if one has that kind of a competitive edge (esp. if based on one's own intellectual property developed from years of work), one would be extremely reluctant to give it up for a few pieces of silver.

But here's the thing: many may THINK they exclusively have an unbeatable super-algorithm, but by definition, when all is said and done, only one of them can be the Bob Beamon of this contest. And there are a LOT of excellent data miners, using the best existing tools and a lot of time & ingenuity, working the solution space. Think of it as a genius ensemble, with a huge amount of available computational time. Odds are very good that a hard-working data miner or health care analyst using existing tools will ultimately barely edge out another hard-working data miner.

But the easy cure for anyone with "I have proprietary secrets that are worth more than $x0,000" sentiments is to simply sandbag or wait on the sidelines until Phase 3 starts. Then take the Big Prize if you are able.
 DanB wrote: It seems a lot of people are concerned they might win a progress prize they don't want. My understanding is that you can choose which submission is considered for the prize. If you don't want the progress prize (and everything that comes with it), make a submission where every prediction is 1. For anyone concerned, NOT winning the prize should be very easy. Personally, I'd be surprised if those at risk of winning a progress prize did this.
 Signipinnis wrote: DanB wrote: Personally, I'd be surprised if those at risk of winning a progress prize did this. Nice phrase. Personally, being at risk for winning a prize is something I'm looking forward to.
 Speaking of (gasp) "collaboration": I hope it has not escaped anyone's attention that +/- 18 days ago, DanB announced "I don't have time for this anymore, here's what I've done so far, hope it helps somebody" ... and dumped various parts of his algorithm in forum posts for all to see. Various questions and answers then followed. Now DanB is in 5th Place on the Leaderboard. I could be wrong, but I don't think he was Top 10 before. Collaboration works! Sometimes in unexpected ways!!! Thanks DanB. Hope you're able to stay in after all.
 Signipinnis wrote: Thanks DanB. Hope you're able to stay in after all. Me too!  And keep sharing ideas =)
 Anthony Goldbloom wrote: Hi all, Not ignoring this thread. Just seeking clarification from HPN on one issue. Anthony