# Missing values of ordered or numeric variables

Rank 31st Posts 292 Thanks 64 Joined 2 Mar '11

How are people dealing with missing values of numeric or ordered variables, such as DSFS or Length of Stay? For now, I am recoding those missing values to zero, but I was wondering if there is a better solution.

#1 / Posted 2 years ago
Posts 77 Thanks 29 Joined 28 May '10

I set missing LengthOfStay to zero and am looking at dropping observations with missing DSFS, but zero is a good value if you want to keep them.

#2 / Posted 2 years ago
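The two options mentioned so far (recode missing values to zero, or drop the rows) can be sketched as below. This is only a rough illustration: the field names `dsfs_months` and `length_of_stay_days`, the numeric codings, and the three records are made up for the example and are not the actual data layout. The sketch also adds a `_missing` flag, which is an extra beyond what the posts describe, so that a model can still tell a recoded zero from a real zero.

```python
# Hypothetical claim records; None marks a missing value.
# Field names and numeric codings are illustrative assumptions only.
claims = [
    {"member_id": 1, "dsfs_months": 0, "length_of_stay_days": 2},
    {"member_id": 2, "dsfs_months": None, "length_of_stay_days": None},
    {"member_id": 3, "dsfs_months": 3, "length_of_stay_days": None},
]


def zero_fill(rows, field):
    """Return copies of rows with missing `field` recoded to 0.

    A companion `<field>_missing` flag (1 or 0) records which values
    were originally missing.
    """
    out = []
    for row in rows:
        new_row = dict(row)  # copy so the input rows are left untouched
        was_missing = row[field] is None
        new_row[field + "_missing"] = 1 if was_missing else 0
        if was_missing:
            new_row[field] = 0
        out.append(new_row)
    return out


def drop_missing(rows, field):
    """Return copies of only those rows where `field` is present."""
    return [dict(row) for row in rows if row[field] is not None]


# Recode missing length of stay to zero, then drop rows with missing DSFS.
filled = zero_fill(claims, "length_of_stay_days")
kept = drop_missing(filled, "dsfs_months")

for row in kept:
    print(row)
```

Dropping rows shrinks the training set, so it is worth checking how many observations are lost before committing to it.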
arbuckle HHP Advisor Posts 38 Thanks 21 Joined 5 May '11

See imputation and alternatives: http://en.wikipedia.org/wiki/Imputation_(statistics)

#3 / Posted 2 years ago
Rank 19th Posts 144 Thanks 21 Joined 27 Jan '11

So the data preparation phase can include a script that assigns values to the missing entries where they can be deduced or are highly probable, and that corrects errors that are visible from direct inspection of the data?

#4 / Posted 19 months ago
Posts 94 Thanks 25 Joined 8 Apr '11

Blind Ape wrote: So the data preparation phase can include a script that assigns values to the missing entries where they can be deduced or are highly probable, and that corrects errors that are visible from direct inspection of the data?

For claims-based utilization such as this, it will be GENERALLY true that utilization is zero, unless there are one or more claims showing that utilization is more than zero. By analogy, what was your average cost of lunch at McDonald's yesterday, if you didn't eat at McDonald's and thus don't have a receipt?

There are some exceptions, however. For example, pharmacy utilization may be understated if some of a person's prescriptions are available cheaper for cash at a retail outlet than the person's normal co-pay would be. In my area, several chains have a list of prescriptions that are available for $4, which I'm sure leaves an info hole in the records of insurers/payers.

Such gaps do not necessarily cause predictive biases, however, particularly if you're looking at a commercially insured population. It would probably be more of a confusion factor if we had Medicaid + SelfPay + Commercial + Medicare all mixed together, but happily for us, we don't.

With plans that have an annual deductible, i.e. some amount ($500 or $1000 perhaps) that the covered person must pay first before the insurance starts paying the rest of the bills during the year, some people have a tendency to sit on the claims they paid themselves until they get close to satisfying the deductible. If they never get close to that limit, there's no economic benefit to them in filing the paperwork, and some never do. That's just more random noise in the system. (But it means we have less data on the people we would be projecting to have low utilization anyway.)

But as a sharp wit said in one of the previous posts somewhere on this forum, you can assume/impute anything you like, if it makes your predictions better. HTH

#5 / Posted 19 months ago
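One way to act on "impute whatever makes your predictions better" is to score each candidate fill value on held-out data and keep the best one. The following is a minimal sketch under heavy assumptions: a single predictor, a plain least-squares line as the model, and made-up `(x, target)` pairs. The toy numbers are deliberately constructed so that a missing value really does mean zero utilization, so zero fill wins here by design; real data may well behave differently.

```python
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with one predictor."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx if sxx else 0.0
    return my - b * mx, b


def rmse(a, b, xs, ys):
    """Root mean squared error of the line a + b*x on (xs, ys)."""
    return (sum((a + b * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)) ** 0.5


def impute(xs, fill):
    """Replace None entries with the constant `fill`."""
    return [fill if x is None else x for x in xs]


def score_fill(train, valid, fill):
    """Fit on imputed training data, return RMSE on imputed validation data."""
    a, b = fit_line(impute([x for x, _ in train], fill), [y for _, y in train])
    return rmse(a, b, impute([x for x, _ in valid], fill), [y for _, y in valid])


# Made-up (predictor, target) pairs; None marks a missing predictor value.
train = [(1, 1.0), (None, 0.0), (3, 3.0), (None, 0.0), (2, 2.0), (4, 4.0)]
valid = [(2, 2.0), (None, 0.0), (5, 5.0)]

# Candidate fill values are computed from the training data only.
observed = [x for x, _ in train if x is not None]
candidates = {
    "zero": 0.0,
    "median": statistics.median(observed),
    "mean": statistics.fmean(observed),
}

scores = {name: score_fill(train, valid, fill) for name, fill in candidates.items()}
best = min(scores, key=scores.get)
print(scores)
print("best fill:", best)
```

The same loop works with any model and any error metric; the only important point is that the fill values and the model are fitted on the training split and judged on data they have not seen.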