cbusch's image Posts 7
Joined 31 Aug '11

What is the accuracy percentage of a model with an RMSLE of .4?

 
Momchil Georgiev's image Posts 129
Thanks 71
Joined 6 Apr '11

Predictions are real numbers from 0 to 15 days, so unless rounding is applied, the answer to your question is undefined. The trouble is that as soon as predictions are rounded, the RMSLE will most likely be worse than 0.4.
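
To make the point concrete, here is a minimal sketch of RMSLE, sqrt(mean((ln(p+1) - ln(a+1))^2)), next to an "exact match" accuracy. The toy values and the variable names are hypothetical and were picked only for illustration; how much rounding hurts on the real data depends on that data.

```python
import math


def rmsle(predicted, actual):
    """Root mean squared logarithmic error over paired predictions and actuals."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    total = sum(
        (math.log1p(p) - math.log1p(a)) ** 2 for p, a in zip(predicted, actual)
    )
    return math.sqrt(total / len(actual))


# Hypothetical toy data: mostly zero days, with small real-valued predictions.
actual = [0, 0, 0, 0, 2, 5]
predicted = [0.1, 0.2, 0.1, 0.3, 0.4, 0.45]

# Rounding turns every prediction here into 0.
rounded = [round(p) for p in predicted]

score_raw = rmsle(predicted, actual)
score_rounded = rmsle(rounded, actual)

# Fraction of predictions that equal the actual value exactly.
exact_matches = sum(p == a for p, a in zip(predicted, actual)) / len(actual)

print(score_raw, score_rounded, exact_matches)
```

On this toy set none of the real-valued predictions equals an actual value exactly, and rounding raises the RMSLE because the two non-zero members are now predicted as 0.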

 
cbusch's image Posts 7
Joined 31 Aug '11

That didn't really address my question. Could anyone else provide an answer? Would a model with an RMSLE of .4 be correct 60% of the time?

 
Chris Raimondi's image Rank 10th
Posts 132
Thanks 55
Joined 9 Jul '10

It would probably be "correct" 0% of the time.

You do realize that the predictions are not required to be whole numbers?

Some people may be submitting predictions that contain zeros, most likely because values were cut off and scaled to the 0 - 15 range, but I doubt many people are submitting predictions that consist of exactly 0, 1, 2 ....

Usually terms like "accuracy" are only applied to classification type problems - where you are required to submit a discrete prediction.

 

You might want to look at something like r^2 - which may be closer to what you are looking for - it gives you a percentage - with a max of 100% (I think) - and I believe someone posted earlier that the r^2 of a model that scored .40000 would be around 40% (if memory serves). I do not know how to calculate this, but I believe the r^2 scores I have seen for some random forest models I have run (that score in the neighborhood of 0.461xxx) are around 12.6%.
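
For reference, a minimal sketch of the usual r^2 (coefficient of determination), 1 - SS_res / SS_tot. Computing it on the ln(x+1) scale is an assumption made here to match the RMSLE metric; the thread does not say which scale the quoted r^2 figures used. The function names and toy data are hypothetical.

```python
import math


def r_squared(predicted, actual):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    n = len(actual)
    mean_actual = sum(actual) / n
    ss_res = sum((a - p) ** 2 for p, a in zip(predicted, actual))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("r^2 is undefined when all actual values are identical")
    return 1.0 - ss_res / ss_tot


def r_squared_log_scale(predicted, actual):
    """r^2 after applying ln(x + 1) to both sides (assumed scale, see above)."""
    return r_squared(
        [math.log1p(p) for p in predicted],
        [math.log1p(a) for a in actual],
    )


# Hypothetical toy data, not taken from the competition files.
actual = [0, 0, 0, 0, 2, 5]
predicted = [0.1, 0.2, 0.1, 0.3, 0.4, 0.45]

r2_log = r_squared_log_scale(predicted, actual)
print(r2_log)
```

On the log scale SS_res / n is RMSLE^2, so r^2 = 1 - RMSLE^2 / Var(ln(actual + 1)). An RMSLE of .4 therefore does not fix r^2 by itself; the variance of the actual values is also needed.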

 
cbusch's image Posts 7
Joined 31 Aug '11

Could anyone provide an r^2 value for an RMSLE of .4?

 
BubaLulu's image Posts 2
Joined 31 Dec '11

Help ! -

I took  Y3 DaysInHospital (actual) for the Y4 target member list (i.e. the 70942 Y4 members)

and calculated the RMSLE using Zero for every member's DaysInHospital prediction.

I got 0.416123.

Looking at the leader board, the leaders are at ~0.4520...

What am I missing? 

 
Daniel Hartmeier's image Posts 8
Thanks 2
Joined 24 Nov '10

Out of the 70943 members in Target.csv, 21260 do not have an entry in DaysInHospital_Y3.csv.

If you simply assume a value of 0 for those, you end up with RMSLE = sqrt(12284.17224 / 70943.0) = 0.4161196012, explaining your result.
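
The arithmetic above can be reproduced with a short sketch. The two constants are the figures quoted in this post; the helper name, the dictionary layout and the toy member IDs are hypothetical, and filling missing members with 0 is exactly the assumption being questioned here.

```python
import math


def all_zero_rmsle(member_ids, days_by_member, missing_value=0):
    """RMSLE of predicting 0 for everyone.

    Members absent from days_by_member are filled with missing_value,
    which is an assumption and not something the data guarantees.
    """
    total = 0.0
    for member in member_ids:
        days = days_by_member.get(member, missing_value)
        # With a prediction of 0, ln(0 + 1) = 0, so the error is ln(days + 1).
        total += math.log1p(days) ** 2
    return math.sqrt(total / len(member_ids))


# Figures quoted in the post: sum of squared log errors and member count.
sum_squared_log_error = 12284.17224
member_count = 70943
rmsle_quoted = math.sqrt(sum_squared_log_error / member_count)
print(rmsle_quoted)

# Hypothetical toy example: member "c" has no entry and is filled with 0.
toy_members = ["a", "b", "c"]
toy_days = {"a": 0, "b": 3}
toy_score = all_zero_rmsle(toy_members, toy_days)
print(toy_score)
```

Every member filled in with 0 adds nothing to the numerator but still counts in the denominator, which pulls the all-zero score down.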

However, absence from DaysInHospital_Y3.csv does not mean the member spent 0 days in hospital (what would be the point of the 60706 lines in there with value 0, otherwise?). Search the archive for possible explanations; I think one is that the member joined between Y3 and Y4.

HTH,

Daniel

Thanked by BubaLulu
 
BubaLulu's image Posts 2
Joined 31 Dec '11

Thanks so much Daniel - this gives me something to work on...

 
JeremyA's image Posts 23
Thanks 6
Joined 5 Apr '11

In healthcare patient LoS/admit/re-admit analysis, a non-admission (or the length of time between admissions) is an important outcome measure, which is probably why it's included in the training data set.

~jba

 