Physician with Machine Learning skills will team up.

David J. Slate's image Rank 13th
Posts 65
Thanks 25
Joined 5 Aug '10 Email user

Thanks, Chaos::Decoded, for posting these ideas.  Some of your suggestions for features, such as the following, seem to depend on information that is just not provided with the data sets for this contest (the paucity of data regarding meds and tests has been discussed in the forum):

so the number of expensive meds is a feature
number of meds physicians don't feel very comfortable to refill over the phone is a feature
number of meds which are not filled as they should is a feature
number of meds which are used in higher doses than normal
sum of cost of meds patient gets vs their insurance, copay
 
 
Chaos::Decoded's image Posts 80
Joined 18 May '12 Email user
Must admit I haven't reviewed the variables / data available yet. This was just off the top of my head. I will review later and add more features: I'm trying to sell myself here, but not to give away the best ones yet :) Some insight for free: for some diseases, each patient has his own threshold of how bad he has to feel before he goes to the ER, calls 911, or calls his doc. The rest is up to the docs. I'll help the team understand this mechanism and build it into the algorithm properly. In some diseases it doesn't matter how you feel; your wife will call....
 
Chaos::Decoded's image Posts 80
Joined 18 May '12 Email user

Patient A fills a prescription for meds with a given number of refills, gets a given number of meds in each therapeutic group, and uses them up (see the Practice Fusion contest for drug group codes).

The number of drugs in each group goes down day by day as the patient uses them.

The number of meds he has on hand in each therapeutic group changes over time. If it gets low at any given time, chances are that what he has is not enough for his needs; Patient A gets sick and is admitted when he runs out of meds.

When you draw a curve, for each patient, of the meds he has on hand, there will be his own minimum threshold for each group, below which he is likely to be admitted.
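A rough sketch of that supply curve in Python, in case it helps. Everything here is an assumption for illustration: the fill records `(day, group, units)`, a constant daily use per group, and the function names. As noted earlier in the thread, the contest data may not carry this level of detail about meds.

```python
from collections import defaultdict

def supply_curve(fills, daily_use, horizon):
    """Return {group: [units on hand at the end of each day]}.

    fills:     list of (day, group, units) tuples, one per filled Rx (hypothetical format)
    daily_use: {group: units consumed per day}, assumed constant
    horizon:   number of days to simulate, starting at day 0
    """
    added = defaultdict(lambda: [0.0] * horizon)
    for day, group, units in fills:
        if 0 <= day < horizon:
            added[group][day] += units

    curves = {}
    for group, use in daily_use.items():
        on_hand, curve = 0.0, []
        for day in range(horizon):
            on_hand += added[group][day]         # refill arrives
            on_hand = max(0.0, on_hand - use)    # patient uses today's dose
            curve.append(on_hand)
        curves[group] = curve
    return curves

def days_without_supply(curve):
    """Count days that end with nothing left in the group."""
    return sum(1 for units in curve if units == 0.0)

def level_before_admission(curve, admit_day):
    """Units on hand the day before an admission (a candidate threshold)."""
    return curve[admit_day - 1] if admit_day > 0 else None

fills = [(0, "statin", 3), (5, "statin", 2)]
curves = supply_curve(fills, {"statin": 1}, horizon=8)
print(curves["statin"])
```

Collecting `level_before_admission` over a patient's past admissions would give one crude estimate of his personal threshold for each group.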

 
Chaos::Decoded's image Posts 80
Joined 18 May '12 Email user

Now consider this. As I mentioned before, it is really the docs who decide whom to admit.
Each doc admits only so many of the sickest patients he sees in a given place and time period: a day, a month, a year.

Some patients in an ER where everybody else is sicker don't get admitted, just because they are not "considered that sick" compared to everybody else. On the other hand, the same patient may easily get admitted to a quiet hospital where there are no other patients in the ER at all, or where everybody else is healthier...

Now, for each feature in your dataset, compute the percentile of where each patient stands compared to others within the same ER, same time, same month, same year, and not only by date but also by time relative to when they come to the ER. It is always the lowest Nf percentile that gets admitted (with a different Nf for each feature).
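One way to sketch the within-ER percentile in Python. The record layout (`er`, `month`, and a numeric feature) and the function name are assumptions for illustration, not fields from the contest data. Ties get a mid-rank, so equal values share the same percentile.

```python
from collections import defaultdict

def percentile_within_group(records, feature, keys=("er", "month")):
    """For each record, return its percentile (0..1) on `feature`
    among records sharing the same values for `keys`."""
    groups = defaultdict(list)
    for rec in records:
        groups[tuple(rec[k] for k in keys)].append(rec[feature])

    ranks = []
    for rec in records:
        values = groups[tuple(rec[k] for k in keys)]
        below = sum(1 for v in values if v < rec[feature])
        equal = sum(1 for v in values if v == rec[feature])
        ranks.append((below + 0.5 * equal) / len(values))
    return ranks

records = [
    {"er": "A", "month": 1, "score": 10},
    {"er": "A", "month": 1, "score": 20},
    {"er": "A", "month": 1, "score": 30},
    {"er": "B", "month": 1, "score": 20},  # alone in a quiet ER
]
print(percentile_within_group(records, "score"))
```

The same score of 20 sits in the middle of busy ER "A" and is the only case in ER "B", which is the relative effect described above. Flagging the lowest Nf percentile is then a comparison of each rank against a cutoff chosen per feature.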

 
Chaos::Decoded's image Posts 80
Joined 18 May '12 Email user

Now for each feature you have, draw a curve over time.

See how it goes up and down. Count the local minima (dips) in each feature's curve within different time windows.

The number of minima for each feature is itself a feature,
as is the number of minima for each feature within subgroups by ICD code, gender, state, and ICD code combinations.
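Counting the dips could look something like this in Python. It is a minimal sketch: the function name is made up, and it counts only strict local minima, so flat stretches and the endpoints of the window are ignored. That is one possible convention among several.

```python
def count_local_minima(series, start=0, end=None):
    """Count points strictly lower than both neighbours within series[start:end]."""
    window = series[start:end]
    count = 0
    for i in range(1, len(window) - 1):
        if window[i] < window[i - 1] and window[i] < window[i + 1]:
            count += 1
    return count

curve = [5, 3, 4, 2, 2, 6, 1, 7]
print(count_local_minima(curve))        # whole curve
print(count_local_minima(curve, 0, 4))  # first four points only
```

Calling it with several `start`/`end` pairs gives one feature per time window.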

 
Chaos::Decoded's image Posts 80
Joined 18 May '12 Email user

Now imagine this (this works; I did it before in one of the contests).

First take 95% of your training data with all the features, train your method, and try to predict the remaining 5% as test data. Do it R times.

Now record your DNumberOfAdmissions: the difference between how many times somebody actually gets admitted and what you predicted.

Create a new matrix, TunningDataset, holding for each patient: DNumberOfAdmissions; the patient's data; quantiles (e.g. 0.05, 0.1, 0.2, 0.45, 0.5, 0.55, and so on) for each variable in the testing dataset, the training dataset, and both; and your OOB errors in each run Ri.

Now get your predictions for the submission dataset.
But before you submit, correct them using, e.g., randomForest trained on the TunningDataset.

In other words, learn when your learning method makes mistakes, and correct them before submission.
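A toy Python sketch of the loop, only to show the shape of the idea. Both models are stand-ins chosen to keep the example dependency-free: the base learner just predicts the training mean, and a per-group mean of recorded errors takes the place of the randomForest corrector. The row layout and all function names are assumptions.

```python
import random
from collections import defaultdict
from statistics import mean

def fit_mean_model(rows):
    """Stand-in base learner: always predicts the training mean of y."""
    mu = mean(r["y"] for r in rows)
    return lambda row: mu

def collect_errors(rows, fit, runs=20, holdout=0.05, seed=0):
    """Repeat a random train/holdout split; record (row, actual - predicted)."""
    rng = random.Random(seed)
    n_hold = max(1, int(len(rows) * holdout))
    errors = []
    for _ in range(runs):
        shuffled = rows[:]
        rng.shuffle(shuffled)
        held, train = shuffled[:n_hold], shuffled[n_hold:]
        model = fit(train)
        errors.extend((r, r["y"] - model(r)) for r in held)
    return errors

def fit_corrector(errors, key):
    """Stand-in for randomForest: mean recorded error per value of one feature."""
    by_key = defaultdict(list)
    for row, err in errors:
        by_key[row[key]].append(err)
    table = {k: mean(v) for k, v in by_key.items()}
    return lambda row: table.get(row[key], 0.0)

# Toy data: group "a" always has y = 1, group "b" always has y = 3.
rows = [{"group": g, "y": y} for g, y in (("a", 1.0), ("b", 3.0)) for _ in range(20)]
errors = collect_errors(rows, fit_mean_model)
base = fit_mean_model(rows)
correct = fit_corrector(errors, "group")
corrected = [base(r) + correct(r) for r in rows]
```

On this toy data the base model is off by about 1 for every row and the correction removes most of that. On real data the corrector can overfit the recorded errors, so it should itself be checked on held-out cases.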

 
Chaos::Decoded's image Posts 80
Joined 18 May '12 Email user

Can't believe there's still no proper invitation to a team...

 
Chaos::Decoded's image Posts 80
Joined 18 May '12 Email user

Learn when you make mistakes.

Take 90% of the training data and predict the remaining 10%.
Do it 1000 times.
Each time, record the characteristics of the 90% part and the remaining 10% part: percentiles (0.01, 0.1, 0.15 ... 0.45, 0.49, 0.5, 0.51, 0.55 ... 0.9, 0.99), mean, median, min, and max for each variable in the set, plus the ratios and differences between these values for each variable in the 90% part vs. the 10% part. Add the ratios of each variable in each set to the number of admissions for each case, and record when you made a mistake and how big it was (DeltaDays = the difference between the true and the predicted number of days in hospital for each case in the 10% set). Call it all the Tuning-set: a set of data that makes it possible to adjust the final predictions of days in hospital for the submission set. This set will hold all the mistakes you make based on differences between the training set and the testing set.

After you predict your final submission answers, run them through the Tuning-set model with the same ensemble, or a simple randomForest. You will get your best answers by knowing when you made mistakes and adjusting all the final answers by "DeltaDays".

This should make your submission the best it can be on the final data set that the organizers keep hidden.
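For the "characteristics of the 90% vs. the 10%" part, here is a minimal Python sketch for a single variable. The function names, the short quantile list, and the nearest-rank percentile are all assumptions made to keep it brief; a real Tuning-set would repeat this for every variable and every split, and attach DeltaDays for each held-out case.

```python
from statistics import mean, median

def percentile(values, q):
    """Simple nearest-rank style percentile, q in [0, 1]."""
    ordered = sorted(values)
    idx = min(len(ordered) - 1, int(q * len(ordered)))
    return ordered[idx]

def describe(values, qs=(0.1, 0.5, 0.9)):
    """Summary statistics for one variable in one part of the split."""
    stats = {"mean": mean(values), "median": median(values),
             "min": min(values), "max": max(values)}
    for q in qs:
        stats[f"p{round(q * 100)}"] = percentile(values, q)
    return stats

def split_features(train_values, test_values):
    """Differences and ratios of each statistic, 90% part vs. 10% part."""
    tr, te = describe(train_values), describe(test_values)
    row = {}
    for name in tr:
        row[f"{name}_diff"] = tr[name] - te[name]
        # A ratio is undefined when the held-out statistic is zero.
        row[f"{name}_ratio"] = tr[name] / te[name] if te[name] else None
    return row

print(split_features([1, 2, 3, 4, 5, 6, 7, 8, 9], [2]))
```

Each call yields one row of split descriptors. Those rows, joined with the per-case errors, are what the correcting model would be trained on.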

 
Chaos::Decoded's image Posts 80
Joined 18 May '12 Email user

Update on email, it is lukaszkiljanek@NOSPAM@gmail.com

 
CreativSolutions's image Posts 4
Joined 7 Jan '12 Email user

Hi,

I may be interested in teaming up.  My team name is CreativSolutions.  I got up to a ranking of 75 or so, but haven't worked on this in a while.

Can you tell me a bit about your background? You can send it to jhtrdllc@gmail.com.  I tried your email above, but it looks incorrect, as it has two @ symbols.

John

 
CreativSolutions's image Posts 4
Joined 7 Jan '12 Email user

I just double-checked the contest rules: Oct 3 was the deadline for team mergers.

 
Chaos::Decoded's image Posts 80
Joined 18 May '12 Email user

Well, I am sorry that you waited so long. I asked on the forum whether it is still possible for me to merge with a single contestant or a team, given that I have never made a submission....

 

I can still offer consulting help for $$$ or the promise of a share in the prize?

Would this be legal, "Kaggle"?

 

You can't share code outside of the teams...

But I don't think there is a ban on sharing ideas?

What do you say, "Kaggle"?

 