leaving competition - can't query large amounts of data on my server

alexanderr
Posts 42
Thanks 2
Joined 5 Apr '11

I am quitting the competition because it is so difficult to query large amounts of data on the server I use. I haven't got one entry in yet because I spend all my time breaking up files into smaller units and piecing them together again.

I can't even export the entry form from my database in one piece.

 
Domcastro
Posts 63
Thanks 13
Joined 8 Aug '10

Hi

No - keep going. I just put the data into MS Access and have been fiddling with it. I can import and export on a Windows PC with 4 GB of RAM.

 
Jason Morris
Posts 11
Thanks 3
Joined 2 Apr '11

Alexanderr, if you want, I can loan you some time on my database. I am running MySQL, and if you are using PHP, a simple cron job should do the trick for any lengthy data processing. As a matter of fact, I already have the data uploaded, and I am more than happy to provide you with a login. Let me know if you are interested. Processing on a laptop will only go so far in this competition; server power is definitely helpful.

Shoot me an email if you want to discuss details: brixtoninternet@gmail.com

 

For those who are using Access, I found this book really helpful for doing large amounts of number crunching in MS Access:

http://www.amazon.com/Microsoft-Access-Data-Analysis-Unleashing/dp/076459978X

While it is for an earlier version of Access, the fundamentals are still helpful if you are using 2010.

Thanked by Domcastro
 
arbuckle
HHP Advisor
Posts 38
Thanks 21
Joined 5 May '11

I would not share your data with other people, even if they have signed up for the competition (note that the forums are open to the public, so anyone could email you). I haven't read the legalese, but the intent of the usage agreement is to make you responsible for keeping the data private. Contact Jeff Moser, or anyone else at Kaggle, to see if they can give you the data in a different format. http://www.heritagehealthprize.com/c/hhp/forums/t/573/any-interest-in-a-sqlite-version-of-the-dataset
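
If a SQLite copy of the data would help, one can also be built locally from the CSV files, so nothing has to leave your own machine. Below is a minimal sketch using Python's built-in csv and sqlite3 modules. The table name, the column names and the sample rows are hypothetical stand-ins, not the real competition schema, and the in-memory database would be a file path in practice.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(conn, lines, table):
    """Create a table from the CSV header row and insert every data row."""
    reader = csv.reader(lines)
    header = next(reader)
    cols = ", ".join(f'"{name}"' for name in header)
    marks = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table}" ({cols})')
    # executemany consumes the reader lazily, so the file is never fully in memory
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', reader)
    conn.commit()


# Hypothetical sample data; in practice pass open("some_file.csv", newline="")
sample = io.StringIO("member_id,days\n1,3\n1,2\n2,0\n")

conn = sqlite3.connect(":memory:")  # assumption: use a file path for real data
load_csv_into_sqlite(conn, sample, "claims")

# An index on the grouping column keeps later queries fast on large tables
conn.execute("CREATE INDEX idx_claims_member ON claims (member_id)")

# Values arrive as text, so cast before summing
rows = conn.execute(
    "SELECT member_id, SUM(CAST(days AS INTEGER)) "
    "FROM claims GROUP BY member_id ORDER BY member_id"
).fetchall()
print(rows)
```

Because SQLite stores the whole database in a single file, this avoids the import and export steps that a server-hosted database requires.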

 
Jason Morris
Posts 11
Thanks 3
Joined 2 Apr '11

arbuckle, thanks for the friendly warning. I've been doing online work for a while now, and precautions are in place. For my daily business I actually farm out database usage to customers, and the same security and user authentication apply.

J

 
Zach
Rank 31st
Posts 292
Thanks 64
Joined 2 Mar '11

alexanderr wrote:

I am quitting the competition because it is so difficult to query large amounts of data on the server I use. I haven't got one entry in yet because I spend all my time breaking up files into smaller units and piecing them together again.

I can't even export the entry form from my database in one piece.

 

I've been doing all my work in R and storing all my data in flat files.  It's not lightning fast, but it works on a laptop with 4 GB of RAM.  Take a look around at other tools besides what you already know and see if any of them are a better fit for the problem.
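
For anyone who wants to try the flat-file route without a database, here is a minimal sketch of the idea. It is written in Python rather than R, and it is only an illustration, not the code used for the post above. It reads a CSV one row at a time, so memory use stays small regardless of file size. The column names and sample rows are hypothetical.

```python
import csv
import io
from collections import Counter


def count_rows_per_key(lines, key_column):
    """Stream CSV rows and count how many rows share each value of key_column."""
    counts = Counter()
    reader = csv.DictReader(lines)
    for row in reader:  # only the current row is held in memory
        counts[row[key_column]] += 1
    return counts


# Hypothetical sample data; in practice pass open("some_file.csv", newline="")
sample = io.StringIO("member_id,claim_code\n1,A\n1,B\n2,A\n")

counts = count_rows_per_key(sample, "member_id")
print(counts.most_common())
```

The same pattern extends to sums or other running aggregates, as long as the result per key is small enough to hold in memory.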

 

 
Sali Mali
Rank 4th
Posts 292
Thanks 113
Joined 22 Jun '10

alexanderr wrote:

I am quitting the competition because it is so difficult to query large amounts of data on the server I use. I haven't got one entry in yet because I spend all my time breaking up files into smaller units and piecing them together again.

I can't even export the entry form from my database in one piece.

Don't give up - finding your way around these issues is all part of the learning. What database are you using? There are many available. The way I've proceeded so far is documented below...

http://anotherdataminingblog.blogspot.com/2011/05/progress-loading-hhp-data.html

 
