Did anybody do any interesting submissions they want to share?
I submitted $p_i = 0.18584427052136$ for all $i$ giving a public score of 0.486849.
If anybody has submitted all zeros, then we can calculate the mean of $a_i$ for the sample.
|
Thanks 29 Joined 28 May '10 Email user |
Thanked by
Domcastro
|
|
Thanks 4 Joined 6 May '10 Email user |
Allan Engelhardt wrote: Valentin Tiriac wrote: I calculate the score for the mean should be 0.486435. Can anyone confirm?
Hmm, isn't it
> print(sqrt(0.522226^2 - 0.189941^2))
Or maybe I am just sleepy again, but it does agree with two submissions on the leaderboard.
Yep. The way I calculated it was by fitting a parabola in Excel, so it had rounding errors, but your way is better. |
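The one-liner above works because, for a constant prediction, the squared score is smallest when the constant equals the mean of log(a_i + 1), and at that point it equals the mean square minus the squared mean. A minimal Python sketch of that identity, using the two figures from the R expression as inputs; the function name best_constant_score is made up for illustration:

```python
import math

def best_constant_score(zero_score: float, mean_log: float) -> float:
    """Score of the best constant submission.

    zero_score: score of an all-zeros submission, i.e. sqrt(mean(A_i^2))
    mean_log:   mean(A_i), where A_i = log(a_i + 1)
    """
    # mean((A_i - c)^2) is minimised at c = mean(A_i), where it equals
    # mean(A_i^2) - mean(A_i)^2, i.e. the variance of A_i.
    return math.sqrt(zero_score**2 - mean_log**2)

score = best_constant_score(0.522226, 0.189941)
print(round(score, 6))
```

The result lands near 0.48646, slightly above the 0.486435 obtained from the parabola fit.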
|
Posts 292 Thanks 64 Joined 2 Mar '11 Email user |
Tatiana McClintock wrote: What am I doing here? You are speaking some odd language to me. Submitting a constant value? What the h is that?
Predicting that EVERYONE in the dataset will be hospitalized for about .19 days.
Tatiana McClintock wrote: And getting the score hmm who is scoring?
Kaggle is: http://www.heritagehealthprize.com/c/hhp/Leaderboard |
|
Thanks 9 Joined 9 Sep '10 Email user |
Allan Engelhardt wrote: If those submissions at 0.522226 on the leaderboard are from constant zero submissions, then I make $mean(\log(a_i+1)) = 0.189941$ for the 30% sample, compared with 0.1863221 and 0.178212 in Y2 and Y3, respectively. Can anybody check my math, please?
Hmm, I get 0.188965 using your figures (0.522226 RMSE for constant zero submission, 0.486849 for constant 0.18584427 submission). |
|
Thanks 29 Joined 28 May '10 Email user |
Eric Jackson wrote: Allan Engelhardt wrote: If those submissions at 0.522226 on the leaderboard are from constant zero submissions, then I make \\(mean(\log(a_i+1)) = 0.189941\\) for the 30% sample, compared with 0.1863221 and 0.178212 in Y2 and Y3, respectively. Can anybody check my math, please?
Hmm, I get 0.188965 using your figures (0.522226 RMSE for constant zero submission, 0.486849 for constant 0.18584427 submission).
Too much maths: my head will hurt :-) Anyhow, my calculations and reasoning are:
Let \\(A_i = \log(a_i + 1)\\). For a submission of \\(p_i = 0\\) the score is \\(\epsilon = \sqrt{ \operatorname{mean}(A_{i}^2) }\\) and we have the value \\(0.522226\\) for this.
In R, I do the calculation as
P E MAI2 (P^2 + MAI2 - E^2)/(2*P)
I still can't see my mistake, but that certainly doesn't mean there isn't one! |
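The step from the zero-submission score to the final R expression can be written out as follows. This is a reconstruction from the definitions in the post, not the author's own text: it assumes P is the log-transformed constant prediction, E is the score of that constant submission, and MAI2 stands for mean(A_i^2), since the assignments did not survive in the post.

```latex
\begin{align*}
E^2 &= \operatorname{mean}\big((P - A_i)^2\big)
     = P^2 - 2P\,\operatorname{mean}(A_i) + \operatorname{mean}(A_i^2), \\
\operatorname{mean}(A_i) &= \frac{P^2 + \operatorname{mean}(A_i^2) - E^2}{2P},
\qquad P = \log(1 + p), \quad \operatorname{mean}(A_i^2) = 0.522226^2 .
\end{align*}
```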
|
Thanks 4 Joined 5 Aug '10 Email user |
Eric Jackson, your mistake is probably that you used 0.18584427052136 in your formula instead of log(1.18584427052136). Allan Engelhardt seems to be right: we get log(a+1) = 0.189941, which means a = 0.209179, so the best constant submission should be 0.209179 days for everybody (I hope that I have no errors).
Thanked by
Eric Jackson
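The correction can be checked numerically. A small Python sketch, assuming the two leaderboard figures quoted in the thread (0.522226 for all zeros, 0.486849 for the constant 0.18584427052136); the helper name mean_log_from_scores is hypothetical:

```python
import math

def mean_log_from_scores(p_log: float, score: float, zero_score: float) -> float:
    """Solve score^2 = p_log^2 - 2*p_log*m + zero_score^2 for m = mean(log(a_i + 1))."""
    return (p_log**2 + zero_score**2 - score**2) / (2 * p_log)

p = 0.18584427052136               # constant submitted for every member
score, zero_score = 0.486849, 0.522226

# Mistaken version: the raw prediction is plugged in where log(1 + p) belongs.
wrong = mean_log_from_scores(p, score, zero_score)
# Corrected version: the prediction is log-transformed first.
right = mean_log_from_scores(math.log1p(p), score, zero_score)

# The best constant in days undoes the log transform of the mean.
best_constant = math.expm1(right)

print(f"{wrong:.4f} {right:.4f} {best_constant:.4f}")
```

The three printed values should come out close to 0.1890, 0.1899 and 0.2092, matching the figures reported by the posters.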
|
|
Thanks 29 Joined 28 May '10 Email user |
Allan Engelhardt wrote: [...]In R, I do the calculation as P E MAI2 (P^2 + MAI2 - E^2)/(2*P) I still can't see my mistake, but that certainly doesn't mean there isn't one!
Hmm, Jeff Moser edited that after I posted so it now makes no sense at all. Let me try again and see if Jeff can keep his fingers off the edit button this time....:
P <- log1p(0.18584427052136) |
|
Thanks 178 Joined 21 Aug '10 Email user |
Allan Engelhardt wrote: Hmm, Jeff Moser edited that after I posted so it now makes no sense at all. Let me try again and see if Jeff can keep his fingers off the edit button this time....:
Sorry about that. Your post has shown me that I need to tweak how inline MathJax is rendered; that's what I was experimenting with. Currently only display-mode math works (i.e. math surrounded by double dollar signs on each side). Display mode puts equations on a separate line, which looks odd. I'd like to get inline math to work as well (i.e. with single dollar sign delimiters). The problem is that some programming languages use dollar signs and confuse MathJax. Just wanted to give an update on what I was doing. |
|
Thanks 9 Joined 9 Sep '10 Email user |
Uri was absolutely right on my error. I now get 0.189941 like Allan.
More significantly for me, what Uri's correction made me realize was that I had been mistakenly producing predictions in the log domain rather than the real domain. In other words, I have been submitting files with predictions for log(D+1) rather than predictions for D. It certainly helped me to fix that problem, although not as much as you might have thought - my leaderboard score improved by only 0.001. |
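The log-domain mix-up described above is easy to reproduce. A hedged Python sketch, assuming the metric is the root mean squared error between log(p_i + 1) and log(a_i + 1) as used throughout this thread; the rmsle helper and the toy data are made up for illustration and are not contest data:

```python
import math

def rmsle(pred, actual):
    """Root mean squared error between log(pred + 1) and log(actual + 1)."""
    sq = [(math.log1p(p) - math.log1p(a)) ** 2 for p, a in zip(pred, actual)]
    return math.sqrt(sum(sq) / len(sq))

# Toy days-in-hospital values (illustrative only).
actual = [0, 0, 0, 1, 0, 2, 0, 0, 5, 0]

# A model fitted on log(D + 1) outputs log-domain values; here it is
# simply the mean of log(D + 1), repeated for every member.
mean_log = sum(math.log1p(a) for a in actual) / len(actual)
log_pred = [mean_log] * len(actual)
# Those values must be mapped back to days before being submitted.
day_pred = [math.expm1(v) for v in log_pred]

wrong_score = rmsle(log_pred, actual)   # log-domain values submitted as-is
right_score = rmsle(day_pred, actual)   # converted back with expm1 first
print(wrong_score, right_score)
```

On this toy data the converted predictions score better, but the gap is small because the predictions themselves are small.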
|
Thanks 178 Joined 21 Aug '10 Email user |
More details on making beautiful math posts in these forums can be found at http://www.kaggle.com/forums/t/581/tips-for-beautiful-math-posts
Thanked by
inf2207
|
|
Thanks 29 Joined 28 May '10 Email user |
Eric Jackson wrote: More significantly for me, what Uri's correction made me realize was that I had been mistakenly producing predictions in the log domain rather than the real domain. In other words, I have been submitting files with predictions for log(D+1) rather than predictions for D. It certainly helped me to fix that problem, although not as much as you might have thought - my leaderboard score improved by only 0.001.
For small numbers \\(\log(1+x) \approx x\\) so your score wouldn’t change much. For example, \\(\log(1+0.189941) = 0.1739037\\). |
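How close log(1 + x) stays to x can be tabulated directly. A brief Python sketch; the values of x are illustrative, and only 0.189941 comes from the thread:

```python
import math

# Compare x with log(1 + x); the gap is roughly x**2 / 2 for small x.
for x in (0.01, 0.05, 0.189941, 1.0):
    gap = x - math.log1p(x)
    print(f"x={x:.6f}  log1p(x)={math.log1p(x):.7f}  gap={gap:.7f}")

value = math.log1p(0.189941)
```

The gap grows quickly with x, so the approximation only holds while predictions stay well below one day.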
|