I am facing a puzzle and am looking for suggestions about what the issue could be. I am using random forests, and the table below shows how the train/test sets are organized along with the corresponding results. Sets 1 and 2 are split into training/testing subsets, and set 3 is trained on the Y1/Y2 combination and verified against Y3 days. I get pretty consistent ~0.45 results, but when I submit set 4 for verification I get ~0.47.

Cases 3 and 4 should be symmetrical, so I am looking for suggestions about what might generally be wrong. The data sets are prepared the same way, and the same code is used in all 4 instances.

#  | Train data (drugs, labs, claims) | Days in hospital for training | Test days | Train / Test set size | Result
1  | Y1                               | Y2                            | Y2        | 80% / 20%             | ~0.45
2  | Y2                               | Y3                            | Y3        | 80% / 20%             | ~0.45
3  | Y1                               | Y2                            | Y3        | 100%                  | ~0.45
4  | Y2                               | Y3                            | Target    | 100%                  | ~0.47
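
To make the two evaluation layouts in the table concrete, here is a minimal sketch, not my actual code. It rests on several assumptions: the score is taken to be RMSLE (the metric is not stated above), a constant predictor stands in for the random forest so that the sketch has no dependencies, and the day counts are synthetic. The names `rmsle`, `fit_constant`, `holdout_score` and `next_year_score` are hypothetical helpers made up for illustration.

```python
import math
import random


def rmsle(predicted, actual):
    """Root mean squared logarithmic error (assumed metric, not stated in the post)."""
    squared = [(math.log1p(p) - math.log1p(a)) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def fit_constant(days):
    """Placeholder for the random forest: the constant that minimises RMSLE on the labels."""
    mean_log = sum(math.log1p(d) for d in days) / len(days)
    return math.expm1(mean_log)


def holdout_score(days, test_fraction=0.2, seed=0):
    """Cases 1 and 2: random 80% / 20% split inside one year of labels."""
    shuffled = days[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * (1 - test_fraction))
    train, test = shuffled[:cut], shuffled[cut:]
    guess = fit_constant(train)
    return rmsle([guess] * len(test), test)


def next_year_score(train_days, test_days):
    """Cases 3 and 4: fit on 100% of one year, score on the following year's days."""
    guess = fit_constant(train_days)
    return rmsle([guess] * len(test_days), test_days)


# Synthetic labels only; these are NOT the real Y2 / Y3 days in hospital.
rng = random.Random(42)
days_y2 = [rng.choice([0, 0, 0, 0, 1, 2, 5]) for _ in range(1000)]
days_y3 = [rng.choice([0, 0, 0, 0, 1, 2, 5]) for _ in range(1000)]

print("case 1 style (80/20 within a year):", round(holdout_score(days_y2), 3))
print("case 3 style (train on one year, test on next):", round(next_year_score(days_y2, days_y3), 3))
```

In the real setup the constant predictor would be replaced by the fitted forest. The point is only that cases 1 and 2 score on rows drawn from the same year as the training rows, while cases 3 and 4 score on a different year.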

Thanks in advance...

Mirko