Dear Anthony Goldbloom,
Firstly, thanks for a great initiative; Kaggle is awesome!
We believe that Y1DaysInHospital is critically important, as it then becomes possible (Option A) to
train on forecasting Y2 with: claimsY1 and Y1DaysInHospitalY1, and
train on forecasting Y3 with: claimsY2 and Y1DaysInHospitalY2,
to forecast Y4 with: claimsY3 and Y1DaysInHospitalY3,
or (Option B) to train on forecasting Y3 with:
claimsY1 and Y1DaysInHospitalY1 and
claimsY2 and Y1DaysInHospitalY2,
to forecast Y4 with:
claimsY2 and Y1DaysInHospitalY2 and
claimsY3 and Y1DaysInHospitalY3.
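To make the two schemes concrete, here is a minimal Python sketch of how the training rows might be assembled. It assumes the first scheme (one prior year of features) is Option A and the second (two prior years) is Option B. The toy data, the dictionary layout, and the function names `option_a_rows` and `option_b_rows` are all hypothetical and are not taken from the competition files.

```python
# Hypothetical toy data: year -> {member id -> value}. Illustrative only.
claims = {
    "Y1": {"m1": 3, "m2": 0},
    "Y2": {"m1": 5, "m2": 1},
    "Y3": {"m1": 2, "m2": 4},
}
days = {
    "Y1": {"m1": 0, "m2": 0},  # the Y1 days-in-hospital values being requested
    "Y2": {"m1": 2, "m2": 0},
    "Y3": {"m1": 1, "m2": 3},
}


def option_a_rows(claims, days):
    """One prior year per row: (Y1 -> Y2) and (Y2 -> Y3)."""
    rows = []
    for prev, target in (("Y1", "Y2"), ("Y2", "Y3")):
        if prev not in days or target not in days:
            continue  # this year pair is unusable without the prior-year outcome
        for member in sorted(claims[prev]):
            features = (claims[prev][member], days[prev][member])
            rows.append((features, days[target][member]))
    return rows


def option_b_rows(claims, days):
    """Two prior years per row: (Y1, Y2 -> Y3)."""
    if any(year not in days for year in ("Y1", "Y2", "Y3")):
        return []  # no training rows at all if any year's outcome is missing
    rows = []
    for member in sorted(claims["Y1"]):
        features = (
            claims["Y1"][member], days["Y1"][member],
            claims["Y2"][member], days["Y2"][member],
        )
        rows.append((features, days["Y3"][member]))
    return rows


# The same data with the Y1 days-in-hospital values withheld.
days_without_y1 = {year: v for year, v in days.items() if year != "Y1"}

print(len(option_a_rows(claims, days)), len(option_b_rows(claims, days)))
print(len(option_a_rows(claims, days_without_y1)),
      len(option_b_rows(claims, days_without_y1)))
```

On this toy data, withholding the Y1 values cuts the Option A training rows from 4 to 2 and leaves Option B with none.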
So without Y1DaysInHospital, the whole of Option B falls away, and Option A has only half of the complete data to train on.
We feel that without this information it will be hard, if not impossible, to construct a truly good model that can have real use for the sponsors,
and that it might make reaching the 0.4 mark unachievable.