baller
Posts 1
Joined 25 Aug '12

Hi David,

I have the same question as Mercicle and have read through the whole forum. I think it is a little disorganized as a means of declaring which external data people are using and what has been approved. I am sure I can pick through it and pull out what I think fits the bill. I do have a couple of questions:

1) If I do not see that a Kaggle admin has explicitly said not to use a posted source, is it fair game? This assumes you have all actually checked the sources out at this point. Don't get me wrong, I will check them myself, but I wanted to see if this assumption was correct.

2) I see there was a reply posted to theafh about the Census Bureau data that was never fully confirmed, and it was stated that the data most likely cannot be used. This is a little confusing because the rules just say "You may not, however, link the Data Sets to records in other external databases such that new demographic, socioeconomic or clinical information about the members in the Data Sets is gained." But Census Bureau data is anonymous and should not give insight into demographic, socioeconomic or clinical information about an individual member. I would think the rule is meant to protect the privacy of individuals in the data, but maybe you mean it to cover people as a whole?

 
DavidChudzicki
Kaggle Admin
Posts 447
Thanks 107
Joined 21 Nov '10
From Kaggle

(1) According to Rule 7, you don't need special permission for external data, as long as you satisfy the requirements. In some cases, we've clarified that certain external data isn't allowed. If there are particular ones you're still wondering about, feel free to ask.

(2) It's a good point, but I guess the sponsor just wanted to be totally safe.

 
K-czar
Rank 18th
Posts 29
Thanks 6
Joined 18 Sep '12

Becky, was all of the information you listed approved? The post says it was made 6 months ago (no exact date), which is right around the deadline, so I'm not sure whether it's usable. I am also new to the competition and still figuring out how things work. Like others, I think it would be nice if someone could summarize all of the approved sources that made it in before the deadline. I guess someone could go through and compile the list, then double-check with others and/or the admins to verify that everything is approved and nothing is missing. I might give that a shot later.

Hi,

We are planning to leverage the following data and information which is free to the public:

http://www.dartmouthatlas.org

http://www.cdc.gov/nchs/data/nvsr/nvsr59/nvsr59_09.pdf

ftp://ftp.cdc.gov/pub/Health_Statistics/NCHS/Dataset_Documentation/NHDS/NHDS_2009_Documentation.pdf

http://www.cdc.gov/nchs/data/nvsr/nvsr60/nvsr60_04.pdf

http://www.statehealthfacts.org

http://www.cdc.gov/nchs/fastats/hospital.htm

www.ahrq.gov

Thanks!

 
Nicholas Hamilton
Rank 53rd
Posts 14
Thanks 2
Joined 21 Aug '11

None of my submissions to date have used external data. If another competitor has requested (or stated prior to the deadline) that they have used external data, and provided the source of the data, am I free to also use that data at my discretion?

 
Sajid Zaidi
Rank 56th
Posts 5
Joined 4 Feb '12

From some of the links here, it seems that people are trying to link up publicly available provider-specific and hospital-specific information with the HPN data. I have two questions:

1) Is this legal, according to the rules? I know the rules explicitly ban trying to match up patient data.

2) Can anyone share how they are matching up this data, since the provider ids are masked?

Thanks

 
bhubhu123
Posts 1
Joined 26 Jul '13

bhubhu

 