Call to Boycott Heritage Health Prize

Tristan's image Posts 4
Thanks 10
Joined 6 Jul '10 Email user

Researchers,

Heritage recently changed the license terms to demand complete exclusivity:

By registering for the Competition, each Entrant (a) grants to Sponsor and its designees a worldwide, exclusive (except with respect to Entrant), sub-licensable (through multiple tiers), transferable, fully paid-up, royalty-free, perpetual, irrevocable right to use, not use, reproduce, distribute (through multiple tiers), create derivative works of, publicly perform, publicly display, digitally perform, make, have made, sell, offer for sale and import the entry and the algorithm used to produce the entry, as well as any other algorithm, data or other information whatsoever developed or produced at any time using the data provided to Entrant in this Competition (collectively, the "Licensed Materials"), in any media now known or hereafter developed, for any purpose whatsoever, commercial or otherwise, without further approval by or payment to Entrant (the "License") and (b) represents that he/she/it has the unrestricted right to grant the License. Entrant understands and agrees that the License is exclusive except with respect to Entrant: Entrant may use the Licensed Materials solely for his/her/its own patient management and other internal business purposes but may not grant or otherwise transfer to any third party any rights to or interests in the Licensed Materials whatsoever.

Academics should also note that they cannot freely publish their results, even if journals accept publishing proprietary algorithms:

Rule 20: "entry (i) was not previously published"

Rule 22: "The Data Sets may not be used for any purpose other than participation in the Competition without Sponsor's prior written approval. If you wish to use the Data Sets for research purposes, please contact Sponsor via the Website's "Contact Us" form, including a reasonably detailed description of the proposed research. All such requests will be given careful consideration."

This competition is now a shortsighted R&D effort for Heritage. I cannot see how any company or academic can submit results under terms remotely like these. Companies are bought and sold and have many assets that intermix, and academics require ownership of their creations, both for publications and to build future work.

I urge Heritage to quickly change these rules and to follow the Netflix guidelines. Entries should require no license except being described in enough detail that a competent user can recreate the solution. Researchers should be free to publish results as they see fit. This will benefit Heritage commercially by ensuring the algorithms that are developed are far better than they would be otherwise, while protecting the interests of all involved.

It is sad that this prize, which has significant potential, will have little impact on academic research or public health under the current terms.

I encourage anybody who feels similarly to post their support.

Thanked by B Yang , ruediger , Carlin Eng , Henk , Will Dwinnell , and 4 others
 
Phil's image Posts 3
Joined 31 Mar '11 Email user
Wouldn't this also stifle legitimate discussion of algorithms too? As an example, with the Netflix Prize I seem to recall Simon Funk publishing his algorithm online during the contest, which prompted a surge of activity along similar lines, and his was one of many models incorporated by the winning team. How can teams progress without at least some input from previously published algorithms to which they are unable to grant "exclusive" rights?
 
FineLineSysDes's image Posts 27
Thanks 6
Joined 4 Apr '11 Email user

Please quit.  It gives me a better chance to win!

 
FineLineSysDes's image Posts 27
Thanks 6
Joined 4 Apr '11 Email user

hotelaftershow wrote:
Wouldn't this also stifle legitimate discussion of algorithms too?

How can teams progress without at least some input from previously published algorithms to which they are unable to grant "exclusive" rights?

I was under the impression that the purpose of this competition was to create an algorithm!  One of the things I discussed with a rival this morning is the fact that if you want to use an algorithm that is publicly published, say Support Vector Machines, you will need to check out a book with the mathematically relevant formulae, use it to create your own SVM, verify against known datasets, and then use that SVM in your algorithm.

It takes two years for a reason, people.

 
Sandra Carrico's image Posts 2
Joined 27 Jan '11 Email user

This thread is getting absurd.  The organizers need to clarify what part of the programming infrastructure is off limits.  This part of the licensing agreement is part (but not all) of the problem:

 

as well as any other algorithm, data or other information whatsoever developed or produced at any time using the data provided to Entrant in this Competition

 

Let me point out what a broad reading of this implies:

1. No algorithm in an R library like SVM may be used.

2. Hadoop can't be used 

3. No support libraries in R can be used.  R can't be used.

4. No libraries in C may be used including libraries like system

5. MySQL can't be used

6. MatLAB can't be used

Clearly this can't be correct or we'll all be writing in assembly language or C without any libraries.  No maybe I shouldn't complain since I'm very happy to work inside these constraints.  All C all the time, and I'll find some method to do I/O that doesn't infringe, but it's kind of silly and I'd rather use all those tools that have been built in the last 40 years. 

 
Jeremy Howard (Kaggle)'s image Posts 166
Thanks 58
Joined 13 Oct '10 Email user
From Kaggle
As I said in the other thread, HPN do not intend for the language to be construed in this way, and are actively working right now on resolving this issue with their legal team. I'll let you know as soon as I have more information.
 
Igor's image Posts 66
Thanks 25
Joined 4 Apr '11 Email user

djweiss wrote:

2. Give the Sponsors license to sell whatever work I just SUBMIT to them, even if they pay me nothing or I get nothing in return.

Dear David,

If I am not mistaken, when your supervisor, Michael Kearns, does consulting work for Fortune 500 companies (directly or through Wharton), I believe he signs a standard agreement which is practically identical to the one in this competition. I.e., all algorithms that he develops for a Fortune 500 company belong to the company, and he cannot publish about those algorithms.

Please, double check with him.

Yours truly,

Igor

 

 
Todd Trimble's image Posts 4
Thanks 3
Joined 20 Mar '11 Email user

scarrico wrote:

Let me point out what a broad reading of this implies:

1. No algorithm in an R library like SVM may be used.

2. Hadoop can't be used 

3. No support libraries in R can be used.  R can't be used.

4. No libraries in C may be used including libraries like system

5. MySQL can't be used

6. MatLAB can't be used

I must admit, IANAL, but I don't think any of those things were "developed or produced at any time using the data provided to Entrant in this Competition". There may be issues in the rules, and we hear that clarifications are forthcoming. But, I think your reading of your cited passage would have to include things that simply aren't there to get it to apply to your examples.

 
FineLineSysDes's image Posts 27
Thanks 6
Joined 4 Apr '11 Email user

The original issue is that the code we produce is only licensed to HPN exclusively and you cannot re-license your code. You CAN reuse your own code, but you can't relicense it to others. The complaint is because a researcher claims that he's being reduced to an HPN "lackey" and is being paid $3 million for R&D.

Um...Yes, that's correct! Think of what you'd get if you contracted your research lab to HPN to do the work outright. One-tenth, and that's if you're lucky.

 
FineLineSysDes's image Posts 27
Thanks 6
Joined 4 Apr '11 Email user
Apparently, IE 9 only allows for the quick reply. Earlier replies were done on IE8. However the RTF posts are encoded, IE 9 doesn't like it. Sadness. I'll start again:

I'm beginning to come around to the point of view of many of the posts and threads here arguing for a "boycott" of the competition. I likely wouldn't boycott, but rather I'd leave altogether and never return. I've considered the arguments of most posters through these threads and have imagined a number of scenarios. Originally, I thought the arguments to boycott were baseless and pointless.

I'm quite industrious and I could figure out a way to use my work commercially. I'm also probably one-up on most people in that I saved a copy of the rules BEFORE I clicked "I Accept", and that I did so before HPN changed the rules without notice. The rules I have saved on my drive say that the work product is non-exclusive, meaning I could resell/repackage it if I wanted. I figure if I wanted to, I could "force" HPN to abide by the rules that I agreed to when I signed up.

Then I remembered...I'm Christian. By my faith, I not only am required to follow the letter of the law, but also the spirit of it. Unfortunately, the spirit of the law, the rules for the competition, are to use bait-and-switch tactics to say the exact opposite of what I agreed to when I signed up. Few courts in the U.S. would allow for such a drastic switch in rules, from non-exclusive to exclusive, without asking me to sign, or confirm, that I agree to such a drastic rule change.

Then I started thinking about the contest as a whole. What first got me interested in the contest was the video description which said, in part, that the goal of the competition is to determine if certain people might be hospitalized over the next year, so those patients could be seen in a preventative manner. The question was: "Is this patient going to the hospital?" That was *EXACTLY* the work in my doctoral dissertation. When I signed up for the contest, however, the question was changed to "how many days will a person spend in the hospital?" That is a VERY DIFFERENT question, for which my dissertation doesn't apply at all. While I believe I've figured out how to answer the new question, the fact remains that I was attracted using one set of rules (the question), and those rules were changed. When I clicked "I Accept", I was attracted by one set of rules (non-exclusive) and then the rules were changed (to exclusive). I have that part documented and could raise all kinds of legal nightmares. But, I'm a Christian.

The work I would have done for this contest would have been significant, because I think quite differently from the people here. However, I can apply the same work in other areas with other, public data sets. To be short and a bit arrogant: I don't need you, you need me.

I will pray for a couple of days, maybe a week or so while I finish final edits on my dissertation. Afterwards, I will see what the contest organizers do with regards to the rules for the competition and the exclusivity of the work product. If I believe the spirit of the rules are evil, by the use of bait-and-switch, I'll delete the dataset, request removal from both this competition and Kaggle in general (if they're supporting HPN's position...) and carry on with trying to find a post-doctorate position.

I hope HPN realizes that there are quite a few industrious, driven people here, certainly not just myself. While winning the prize would be nice, the work product is far more valuable to most and would be cause to resign. I wouldn't discourage my rival from competing, but I will make her aware that she could be forced to give up any and all work product. Since she is at the start of her academic career, I'm certain she will not like that she can't use the work here towards her doctorate. All the best in all you do while staying true
 
PatternEngine's image Posts 6
Thanks 3
Joined 5 Apr '11 Email user

(Thought it was worth adding my view to this thread)

I'm a statistical machine learning researcher and a big part of my motivation for the HHP is to develop new approaches to this problem and publish them.  If we can't do this, it would be a shame (and I'd probably not bother competing - there are other things I can be getting on with).    

I agree with the sentiments people have expressed about wanting this competition to be 'open', in the sense that the Netflix Prize was.  If I come up with a new method that has some merit, but that isn't going to win, I want to be able to write it up and share it with the community.  It seemed to me that one of the great things about the Netflix Prize was the open research community that sprang up around it.

From HPN's perspective, I guess it depends on whether they're after cool solutions to their problem, or if they want us to make them a product that they can then licence and sell.  Entirely up to them, of course (their data and their money), but I won't bother with the latter, whereas the former really interests me.

 

 

 
Toulouse's image Posts 25
Joined 18 Mar '11 Email user

scarrico wrote:

This thread is getting absurd.  The organizers need to clarify what part of the programming infrastructure is off limits.  This part of the licensing agreement is part (but not all) of the problem:

 

as well as any other algorithm, data or other information whatsoever developed or produced at any time using the data provided to Entrant in this Competition

 

Let me point out what a broad reading of this implies:

1. No algorithm in an R library like SVM may be used.

2. Hadoop can't be used 

3. No support libraries in R can be used.  R can't be used.

4. No libraries in C may be used including libraries like system

5. MySQL can't be used

6. MatLAB can't be used

Clearly this can't be correct or we'll all be writing in assembly language or C without any libraries.  No maybe I shouldn't complain since I'm very happy to work inside these constraints.  All C all the time, and I'll find some method to do I/O that doesn't infringe, but it's kind of silly and I'd rather use all those tools that have been built in the last 40 years. 

 

I don't know about all the tools, but I AM SURE you can use MATLAB if you develop your own algorithm.

 
Igor's image Posts 3
Joined 3 Dec '10 Email user

I wonder how they are going to get an exclusive license for MATLAB in that case.

Also, what to do if your code is based on open source products?

It's all very nice to hand over your code for a round sum of $3m, but if I'm honest with myself the chances of me winning are rather close to nil. In that case, what am I transferring my code for? And if they don't have any way to enforce a rule, why put it there at all?

 
factfiber's image Posts 4
Joined 5 Apr '11 Email user
The real question is this: is Heritage more interested in really solving the problem as well as possible, or are they interested in trying to get their R&D done cheaply without sacrificing any ownership rights? A solution to this problem that works well could be worth many billions of dollars (obviously -- to a corporation or government entity that could leverage it properly). If they really want to solve this problem, and drive down their costs maximally (even at the "cost" of driving down other people's costs as well), they should either stipulate that a winning solution needs to be put in the public domain, or (depending on who they want to attract) that they get a full *non-exclusive* license to the solution. As it is, they show they have low expectations for the competition, and the license stipulations increase the chance that these will be self-fulfilling. They are excluding participants who already try to solve this type of problem professionally, unless those participants value their possible lifetime earnings at less than 3M. This certainly rules out the most skilled practitioners and almost all corporate teams as well. I'm not inclined to attack them for this, though it's sort of a shame. They should just realize that they will get what they are paying for.
 
ijvaughn's image Posts 6
Joined 6 May '11 Email user

I am a graduate student using new techniques on meteorological data for storm prediction. I have been developing a new intrinsic geometric approach based on Lin's 2008 paper "Riemannian Manifold Learning."

Once my code is finished, this data set would be a good hobby test case. However, since academia, and my Ph.D., are dependent on publishing, I will not be able to participate in this competition due to the extremely restrictive IP rules. The overly zealous licensing restrictions are precluding many of the people the prize seemed to be intended to recruit, those few who actually know what they are doing.

Indeed, these restrictions are going to exclude nearly all academics and corporate researchers (e.g. Googlers, who have a huge amount of expertise in this area). Perhaps we should fund a better prize through something like Kickstarter. The problem is: where do we get the health data?

 