Before you get to the point of protecting any IP you might create, there is this one little roadblock in the Data section of the Rules:
"Entrants must use the Data Sets provided to them solely for purposes of the Competition, including but not limited to preparing their Entries, developing and testing their Prediction Algorithms
(as defined below) for accurately predicting the number of days that the members will spend in a hospital (inpatient or emergency room visit) during the 12-month period following the Data Set cut-off date and participating in the forum discussions on the Website."
So re-purposing (stealing) the data for some private research effort of your own is not kosher. Many R "vignettes," and many peer-reviewed papers on machine learning, particularly those that evaluate or compare various algorithms, point to some publicly accessible datasets
of various sizes and complexities, which can be used for personal research. Entering this competition, and developing something new in a good-faith pursuit of this contest's prize, would be a permissible use of this data.
As a practical matter, AFAIK, there has never been any indication that a Sponsor of a Kaggle competition has ever sought the IP / methodology of a non-prize-winning entry. And presumably, the prize money is compensation for the IP the leaders are giving up.
As a legal matter, in terms of whether the contest rules allow a contest's sponsor to slurp up the IP of all registered entrants, I'm not a lawyer. (I'm also not a spokesperson for Kaggle, nor more privy to their thinking than anyone else on these boards.)
As a not-lawyer, I have always found that interpretation to be pretty far out on the paranoid/ridiculous scale (those points are both at the same end), and, in practice, seemingly unenforceable. Pointedly, it would also be catastrophic for Kaggle's business
model, so they would have to come down on the side of competitors, rather than the sponsor, should a sponsor ever mount such an effort to grab all contest IP, real, imagined, or prior art. And Kaggle owns and controls the registration information on all contestants.
Winning $3 million would be a pretty good way to cash in on your newly developed predictive tech IP. And become a famous consultant. Just sayin'!