Congratulations to the winners of the first milestone prize!
According to the rules, we should be able to reproduce the scores of the two winners, and combining their approaches will probably result in an even better score.
To reproduce the scores I am missing some vital information in the Market Makers document, so I have some questions:
1) Thanks for the example software, but shouldn't we get models and parameters in order to reproduce the results? Is this one of
the models used in the ensembling, or just an example?
2) This paper suggests that they improved their results by assigning a score to each PCG based on medical knowledge. I think the actual values used should be published
as this could be considered additional data and is absolutely required to reproduce their results.
3) For the Neural Networks it looks like some proprietary software (Tiberius) has been used. Is this correct? What are the models used, the number of hidden layers, functions,
number of neurons, parameters, etc.? None of this has been clarified.
4) For the other algorithms (GBM, Bagged trees and Linear regression), many models were also derived (20 is mentioned, but some of these models consist of even more
models), but the models themselves are unclear to me. Note that many models were generated (up to 60? The exact number is unknown), so there are many ways to incorporate different models with different subsets, different
parameters, and possibly different initialisations of weights. I cannot guess what these subsets or parameters are, so I am not able to reproduce the results. A good description of every model, including the subsets used,
is necessary for me to reproduce the results.
5) Combinations of each PCG, Specialty and PG resulted in many fields which were reduced. Where is the model used for this? This is unclear. "Building
classification models" seems to be a very vague description for this.
6) What are the weights used for the linear blending of the models?
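To illustrate why the blending weights matter: a linear blend is just a weighted sum of the individual model predictions, so the same models give different final predictions under different weights. Below is a minimal sketch in Python. The predictions, the weights and the helper name `blend` are all made up for illustration and are not taken from the paper.

```python
import numpy as np

def blend(predictions, weights):
    """Weighted sum of per-model predictions (rows = members, columns = models)."""
    predictions = np.asarray(predictions, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return predictions @ weights

# Hypothetical predictions from three models for two members (not real data).
preds = [[0.2, 0.4, 0.3],
         [1.0, 0.5, 0.8]]

# Two hypothetical weight vectors; the real ones are what the question asks for.
weights_a = [0.5, 0.3, 0.2]
weights_b = [0.2, 0.2, 0.6]

blend_a = blend(preds, weights_a)
blend_b = blend(preds, weights_b)
print(blend_a, blend_b)
```

The two blends differ even though the underlying models are identical, so the exact weights are needed to reproduce the submitted score.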