Chris Raimondi's image Rank 38th
Posts 194
Thanks 90
Joined 9 Jul '10

JC36 wrote:

Thanks guys (ChipMonkey, DavidC and Chris Raimondi) for your comments. Let me say right away that, of course, I accept Kaggle Admin's ruling as to whether the milestone winners' methods comply with the rules. (I might use the same techniques myself if my methods get good enough.)
However, I still disagree that a combination of several algorithms should be considered "AN algorithm". It should be called "a method". Have a look at Market Makers' paper describing their method for their winning round 1 milestone.
Here is what they write:-


There were four underlying algorithms used in our models, all of which are freely available in the R language for statistical computing. Online references for each algorithm are given in the hyperlinks below.

  1. Gradient Boosting Machines ...
  2. Neural Networks...
  3. Bagged Trees...
  4. Linear Models...

I have deleted the hyperlinks for clarity. Market Makers go on to use "ensembling" which blends the results of the different algorithms.
Surely you guys wouldn't argue that a combination of four such different algorithms is AN algorithm. If you do I would have to give away my geographical location and say you are a mob of "bush lawyers"!

Surely you guys wouldn't argue that a combination of four such different algorithms is AN algorithm

Yes, I would. As mentioned before, many things thought of as a single algo, or even a single equation, are actually a combination or linear blend.

Is the Pythagorean theorem an algo?  Or is it a linear ensemble of A^2 plus B^2?

The fact that you are calling Bagged Trees, for example, AN ALGO shows the problem with this approach.  The R package randomForest is simply a combination of CART trees. So by what you are stating, even a single random forest wouldn't count as a single algo, since it was someone putting together a bunch of CART trees in a clever manner.

I understand what you are saying, and similar objections about practicality were raised during the Netflix competition.  Google uses over 200 different signals (what we call features) and a combination of algos, but their overall method, as you would call it, is still referred to as "The Google Algorithm".  See here for example - the singular is used eight times, the plural never:

http://bits.blogs.nytimes.com/2011/11/14/google-reveals-tweaks-to-its-search-algorithm/

I do not disagree that MM and W&E used a combination of algorithms - I just disagree that you can't call that combination an algorithm as well.

I think you can combine four movies (in some cases) and still consider it A MOVIE, just as you can put up hundreds or thousands of orange pieces of cloth in Central Park and call it A PIECE OF ART.  Should a cheeseburger be disqualified as the most delicious piece of food on the planet simply because it combines cheese and a hamburger?  Can the United Kingdom not be considered A COUNTRY because it contains the countries of England, Scotland, et al.?

Not trying to be a smart ass - ok maybe a little bit :)

Thanked by ChipMonkey
 
Sali Mali's image Rank 4th
Posts 292
Thanks 113
Joined 22 Jun '10

Wikipedia says...

While there is no generally accepted formal definition of "algorithm," an informal definition could be "a set of rules that precisely defines a sequence of operations."

RandomForest = (tree1 + tree2) / 2

Is that an algorithm?

 

myNewAlgorithm = (RF + GBM + NN + LINREG) / 4

Is that a set of rules that precisely defines a sequence of operations? Is that an algorithm?
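
To make the second formula concrete, here is a minimal Python sketch, assuming each model is just a function from a feature vector to a single prediction. The four model functions and the `blend` helper are hypothetical stand-ins that return fixed values; they are not the milestone winners' actual models or code.

```python
from typing import Callable, Sequence

# A "model" here is any function mapping a feature vector to one prediction.
Model = Callable[[Sequence[float]], float]


def blend(models: Sequence[Model]) -> Model:
    """Return a single model that predicts the unweighted mean of the given models."""
    if not models:
        raise ValueError("need at least one model to blend")

    def blended(x: Sequence[float]) -> float:
        return sum(m(x) for m in models) / len(models)

    return blended


# Hypothetical stand-ins for RF, GBM, NN and LINREG (constant predictions).
def rf(x: Sequence[float]) -> float:
    return 0.25


def gbm(x: Sequence[float]) -> float:
    return 0.5


def nn(x: Sequence[float]) -> float:
    return 0.75


def linreg(x: Sequence[float]) -> float:
    return 0.5


# myNewAlgorithm = (RF + GBM + NN + LINREG) / 4
my_new_algorithm = blend([rf, gbm, nn, linreg])
```

From the caller's side, `my_new_algorithm` has the same shape as any one of its parts: features in, prediction out, via a fixed sequence of operations.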

 
Oleg Vasilyev's image Rank 6th
Posts 18
Thanks 1
Joined 4 Jun '11

JC36, can you please give us your definition of algorithm? (At least a crude definition, a description?)
It seems that your definition should be very different from what one would find in a typical computer science textbook or, say, in Wikipedia.
And, please, could you provide a proof (based on your definition) that a combination of several "different" algorithms cannot be an algorithm? That is your claim, right? Sorry if I misunderstood something, but here are two quotations from your comment:

(1) I still disagree that a combination of several algorithms should be considered "AN algorithm". It should be called "a method".

(2) Surely you guys wouldn't argue that a combination of four such different algorithms is AN algorithm.

 
JC36's image Posts 23
Thanks 1
Joined 11 Dec '11

Oleg, my definition of an algorithm is the same as the dictionary definitions I have seen, which are all singular, i.e. they refer to "a procedure" or "a method". OK, I can see that some people could classify a procedure that embodies the use of four totally distinct algorithms (e.g. Market Makers' milestone 1 solution) as fitting this definition and therefore as "an algorithm". It is just that that is not my position.

 
Karan Sarao's image Posts 52
Thanks 2
Joined 14 Mar '11

Has the data prep and modelling code been released? I recall Phil posted replicable code for the previous milestone. Is there any such posting for this one so that we can test it out, or does it need to be gleaned from the papers themselves?

 