Calculating the probability that a patient associated with an entity will visit the hospital

Zach (Rank 31st, Posts 292, Thanks 64, Joined 2 Mar '11)

In their milestone 1 paper, the team "market makers" say the following on page 13:

For each Primary Care Physician (PCP), Vendor and Provider, a value was calculated that was the 
probability that a patient associated with the entity would visit hospital. Each patient was then
allocated the highest probability of all the PCPs (Vendors or Providers) that they were associated
with, generating 3 fields in total.

I'm trying to replicate their methodology.  Is this probability only calculated using the claims data, or is the actual days in hospital for the next year merged in as well?  Here's my first shot at this.  I've already read in the claims data, converted LengthOfStay to numeric, and replaced missing values with zero:

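library(plyr)
# Flag each claim as involving a hospital stay (1) or not (0)
Claims$visit <- as.numeric(Claims$LengthOfStay > 0)
# Share of each provider's claims, per year, that involved a hospital stay
providerProbs <- ddply(Claims, c('ProviderID', 'Year'), function(x) c(prob = mean(x$visit)), .progress = 'text')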

The result is the proportion of claims for a given provider, in a given year, that involved a hospital stay:

> head(providerProbs[providerProbs$prob>0,],10)
    ProviderID Year      prob
29       12890   Y1 1.0000000
56       23379   Y1 0.5294118
57       23379   Y2 0.7222222
58       23379   Y3 0.4166667
97       40154   Y1 0.4285714
98       40154   Y2 0.5714286
99       40154   Y3 0.5000000
466     173881   Y1 1.0000000
467     173881   Y2 0.9375000
468     173881   Y3 0.5600000

[Figure: histogram]

Am I on the right track?  Or should I be using the "DaysInHospital" table, rather than "LengthOfStay" in the claims table?

 
Sali Mali (Rank 4th, Posts 292, Thanks 113, Joined 22 Jun '10)

Hi Zach,

I've just been reading what I wrote, and it is not 100% clear, is it!

When I say

'probability that a patient associated with the entity would visit hospital'

it should probably read

'probability patients associated with the entity would have at least one day in hospital in the following year'

So basically: get all the patients who have visited a particular entity, get the DaysInHospital for those patients, and cap anything above 1 at 1. The average of these capped DaysInHospital values is then the probability for that entity.
It is nothing to do with the length of stay field in the claims data. 
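
A minimal sketch of those steps in R, on toy data. Everything here is an assumption for illustration: the column names (MemberID, ProviderID, DaysInHospital), the idea that dih already holds the following year's outcome for the claims year in question, and the choice to count each patient once per provider no matter how many claims they have. The same pattern would apply to PCP and Vendor.

```r
# Toy data; column names and values are hypothetical
claims <- data.frame(
  MemberID   = c(1, 1, 2, 3, 3, 4, 5),
  ProviderID = c("A", "B", "A", "B", "B", "C", "A"),
  stringsAsFactors = FALSE
)
# Assumed to be the following year's days in hospital for each patient
dih <- data.frame(
  MemberID       = c(1, 2, 3, 4, 5),
  DaysInHospital = c(0, 3, 1, 0, 2)
)

# 1. Cap DaysInHospital at 1: did the patient have at least one day in hospital?
dih$anyDays <- pmin(dih$DaysInHospital, 1)

# 2. One row per (patient, provider) pair, so repeat claims are not double counted
pairs <- unique(claims[, c("MemberID", "ProviderID")])

# 3. Attach the capped outcome to each pair
pairs <- merge(pairs, dih[, c("MemberID", "anyDays")], by = "MemberID")

# 4. Provider probability = mean of the capped outcome over its patients
providerProbs <- aggregate(anyDays ~ ProviderID, data = pairs, FUN = mean)
names(providerProbs)[2] <- "prob"

# 5. Each patient is allocated the highest probability among their providers
pairs <- merge(pairs, providerProbs, by = "ProviderID")
patientMax <- aggregate(prob ~ MemberID, data = pairs, FUN = max)
```

In this toy example provider A has patients 1, 2 and 5, two of whom have at least one day in hospital, so its probability is 2/3. Patient 1 saw both A (2/3) and B (1/2) and is therefore allocated 2/3.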




 
Zach (Rank 31st, Posts 292, Thanks 64, Joined 2 Mar '11)

Great, thank you. What function did you use to compute probabilities (if it wasn't the mean)?

 
