# Call to Boycott Heritage Health Prize

Posts 3 Joined 31 Mar '11 Wouldn't this also stifle legitimate discussion of algorithms? As an example, with the Netflix Prize I seem to recall Simon Funk publishing his algorithm online during the contest, which prompted a surge of activity along similar lines, and his was one of many models incorporated by the winning team. How can teams progress without at least some input from previously published algorithms to which they are unable to grant "exclusive" rights? #2 / Posted 2 years ago
Posts 27 Thanks 6 Joined 4 Apr '11 Please quit. It gives me a better chance to win! Thanked by OpenNetwork Optimization Service and LuckyLindy #3 / Posted 2 years ago
Posts 27 Thanks 6 Joined 4 Apr '11 hotelaftershow wrote: Wouldn't this also stifle legitimate discussion of algorithms too? How can teams progress without at least some input from previously published algorithms to which they are unable to grant "exclusive" rights? I was under the impression that the purpose of this competition was to create an algorithm! One of the things I discussed with a rival this morning is the fact that if you want to use an algorithm that is publicly published, say Support Vector Machines, you will need to check out a book with the mathematically relevant formulae, use it to create your own SVM, verify it against known datasets, and then use that SVM in your algorithm. It takes two years for a reason, people. #4 / Posted 2 years ago
Posts 2 Joined 27 Jan '11 This thread is getting absurd. The organizers need to clarify what part of the programming infrastructure is off limits. This part of the licensing agreement is part (but not all) of the problem: "as well as any other algorithm, data or other information whatsoever developed or produced at any time using the data provided to Entrant in this Competition". Let me point out what a broad reading of this implies: 1. No algorithm in an R library like SVM may be used. 2. Hadoop can't be used. 3. No support libraries in R can be used. R can't be used. 4. No libraries in C may be used, including libraries like system. 5. MySQL can't be used. 6. MATLAB can't be used. Clearly this can't be correct, or we'll all be writing in assembly language or C without any libraries. No, maybe I shouldn't complain, since I'm very happy to work inside these constraints. All C all the time, and I'll find some method to do I/O that doesn't infringe, but it's kind of silly and I'd rather use all those tools that have been built in the last 40 years. #5 / Posted 2 years ago
Jeremy Howard (Kaggle) Kaggle Admin Posts 166 Thanks 58 Joined 13 Oct '10 As I said in the other thread, HPN do not intend for the language to be construed in this way, and are actively working right now on resolving this issue with their legal team. I'll let you know as soon as I have more information. #6 / Posted 2 years ago
Posts 66 Thanks 25 Joined 4 Apr '11 djweiss wrote: 2. Give the Sponsors license to sell whatever work I just SUBMIT to them, even if they pay me nothing or I get nothing in return. Dear David, If I am not mistaken, when your supervisor, Michael Kearns, does consulting work for Fortune 500 companies (directly or through Wharton), he signs a standard agreement which is practically identical to the one in this competition. That is, all algorithms that he develops for a Fortune 500 company belong to the company, and he cannot publish about those algorithms. Please double-check with him. Yours truly, Igor #7 / Posted 2 years ago
Posts 4 Thanks 3 Joined 20 Mar '11 scarrico wrote: Let me point out what a broad reading of this implies: 1. No algorithm in an R library like SVM may be used. 2. Hadoop can't be used. 3. No support libraries in R can be used. R can't be used. 4. No libraries in C may be used, including libraries like system. 5. MySQL can't be used. 6. MATLAB can't be used. I must admit, IANAL, but I don't think any of those things were "developed or produced at any time using the data provided to Entrant in this Competition". There may be issues in the rules, and we hear that clarifications are forthcoming. But I think your reading of the passage you cite would have to include things that simply aren't there for it to apply to your examples. #8 / Posted 2 years ago
Posts 27 Thanks 6 Joined 4 Apr '11 The original issue is that the code we produce is licensed exclusively to HPN, and you cannot re-license it. You CAN reuse your own code, but you can't relicense it to others. The complaint arises because a researcher claims that he's being reduced to an HPN "lackey" and is being paid $3 million for R&D. Um...Yes, that's correct! Think of what you'd get if you contracted your research lab to HPN to do the work outright. One-tenth, and that's if you're lucky. #9 / Posted 2 years ago
Posts 27 Thanks 6 Joined 4 Apr '11 Apparently, IE 9 only allows for the quick reply. Earlier replies were done on IE8. However the RTF posts are encoded, IE 9 doesn't like it. Sadness. I'll start again: I'm beginning to come around to the point of view of many of the posts and threads here arguing for a "boycott" of the competition. I likely wouldn't boycott, but rather I'd leave altogether and never return. I've considered the arguments of most posters through these threads and have imagined a number of scenarios. Originally, I thought the arguments to boycott were baseless and pointless. I'm quite industrious and I could figure out a way to use my work commercially. I'm also probably one-up on most people in that I saved a copy of the rules BEFORE I clicked "I Accept", and that I did so before HPN changed the rules without notice. The rules I have saved on my drive say that the work product is non-exclusive, meaning I could resell/repackage it if I wanted. I figure if I wanted to, I could "force" HPN to abide by the rules that I agreed to when I signed up. Then I remembered...I'm Christian. By my faith, I am required to follow not only the letter of the law, but also the spirit of it. Unfortunately, the spirit of the law here, meaning the rules for the competition, is to use bait-and-switch tactics to say the exact opposite of what I agreed to when I signed up. Few courts in the U.S. would allow for such a drastic switch in rules, from non-exclusive to exclusive, without asking me to sign, or confirm, that I agree to such a drastic rule change. Then I started thinking about the contest as a whole. What first got me interested in the contest was the video description which said, in part, that the goal of the competition is to determine if certain people might be hospitalized over the next year, so those patients could be seen in a preventative manner. The question was: "Is this patient going to the hospital?" That was *EXACTLY* the work in my doctoral dissertation. When I signed up for the contest, however, the question was changed to "How many days will a person spend in the hospital?" That is a VERY DIFFERENT question, for which my dissertation doesn't apply at all. While I believe I've figured out how to answer the new question, the fact remains that I was attracted using one set of rules (the question), and those rules were changed. When I clicked "I Accept", I was attracted by one set of rules (non-exclusive) and then the rules were changed (to exclusive). I have that part documented and could raise all kinds of legal nightmares. But, I'm a Christian. The work I would have done for this contest would have been significant, because I think quite differently from the people here. However, I can apply the same work in other areas with other, public data sets. To be short and a bit arrogant: I don't need you, you need me. I will pray for a couple of days, maybe a week or so while I finish final edits on my dissertation. Afterwards, I will see what the contest organizers do with regard to the rules for the competition and the exclusivity of the work product. If I believe the spirit of the rules is evil, by the use of bait-and-switch, I'll delete the dataset, request removal from both this competition and Kaggle in general (if they're supporting HPN's position...) and carry on with trying to find a post-doctorate position. I hope HPN realizes that there are quite a few industrious, driven people here, certainly not just myself. While winning the prize would be nice, the work product is far more valuable to most, and losing the rights to it would be cause to resign. I wouldn't discourage my rival from competing, but I will make her aware that she could be forced to give up any and all work product. Since she is at the start of her academic career, I'm certain she will not like that she can't use the work here towards her doctorate. All the best in all you do while staying true #10 / Posted 2 years ago
Posts 6 Thanks 3 Joined 5 Apr '11 (Thought it was worth adding my view to this thread) I'm a statistical machine learning researcher and a big part of my motivation for the HHP is to develop new approaches to this problem and publish them. If we can't do this, it would be a shame (and I'd probably not bother competing; there are other things I can be getting on with). I agree with the sentiments people have expressed about wanting this competition to be 'open', in the sense that the Netflix Prize was. If I come up with a new method that has some merit, but that isn't going to win, I want to be able to write it up and share it with the community. It seemed to me that one of the great things about the Netflix Prize was the open research community that sprang up around it. From HPN's perspective, I guess it depends on whether they're after cool solutions to their problem, or if they want us to make them a product that they can then license and sell. Entirely up to them, of course (their data and their money), but I won't bother with the latter, whereas the former really interests me. #11 / Posted 2 years ago
Posts 25 Joined 18 Mar '11 scarrico wrote: This thread is getting absurd. The organizers need to clarify what part of the programming infrastructure is off limits. This part of the licensing agreement is part (but not all) of the problem: "as well as any other algorithm, data or other information whatsoever developed or produced at any time using the data provided to Entrant in this Competition". Let me point out what a broad reading of this implies: 1. No algorithm in an R library like SVM may be used. 2. Hadoop can't be used. 3. No support libraries in R can be used. R can't be used. 4. No libraries in C may be used, including libraries like system. 5. MySQL can't be used. 6. MATLAB can't be used. Clearly this can't be correct, or we'll all be writing in assembly language or C without any libraries. No, maybe I shouldn't complain, since I'm very happy to work inside these constraints. All C all the time, and I'll find some method to do I/O that doesn't infringe, but it's kind of silly and I'd rather use all those tools that have been built in the last 40 years. I don't know about all the tools, but I AM SURE you can use MATLAB if you develop your own algorithm. #12 / Posted 2 years ago
Posts 3 Joined 3 Dec '10 I wonder how they are going to get an exclusive license for MATLAB in that case. Also, what should you do if your code is based on open source products? It's all very nice to hand over your code for a round sum of $3m, but if I'm honest with myself, the chances of me winning are rather close to nil. In that case, what am I transferring my code for? And if they don't have any way to enforce a rule, why put it there at all? #13 / Posted 2 years ago
Posts 4 Joined 5 Apr '11 The real question is this: is Heritage more interested in really solving the problem as well as possible, or are they interested in trying to get their R&D done cheaply without sacrificing any ownership rights? A solution to this problem that works well could be worth many billions of dollars (obviously, to a corporation or government entity that could leverage it properly). If they really want to solve this problem, and drive down their costs maximally (even at the "cost" of driving down other people's costs as well), they should either stipulate that a winning solution needs to be put in the public domain, or (depending on who they want to attract) that they get a full *non-exclusive* license to the solution. As it is, they show they have low expectations for the competition, and the license stipulations increase the chance that these will be self-fulfilling. They are excluding participants who already try to solve this type of problem professionally, unless those participants value their possible lifetime earnings at less than $3M. This certainly rules out the most skilled practitioners and almost all corporate teams as well. I'm not inclined to attack them for this, though it's sort of a shame. They should just realize that they will get what they are paying for. #14 / Posted 2 years ago
Posts 6 Joined 6 May '11 I am a graduate student using new techniques on meteorological data for storm prediction. I have been developing a new intrinsic geometric approach based on Lin's 2008 paper "Riemannian Manifold Learning." Once my code is finished, this data set would be a good hobby test case. However, since academia, and my Ph.D., are dependent on publishing, I will not be able to participate in this competition due to the extremely restrictive IP rules. The overly zealous licensing restrictions are precluding many of the people the prize seemed to be intended to recruit, those few who actually know what they are doing. Indeed, these restrictions are going to exclude nearly all academics and corporate researchers (e.g., Googlers, who have a huge amount of expertise in this area). Perhaps we should fund a better prize through something like Kickstarter; the problem is, where do we get the health data? #15 / Posted 2 years ago / Edited 2 years ago