Tuesday, July 17, 2012

Re: full text search

It was too slow for me as well. I tried this and ended up building a custom solution.

What I did: 

- Get an Amazon EC2 Linux instance, install Sphinx, and create a snapshot image (AMI) of the machine.
- Create a web service that exposes your indexed data in XML format with a kill-list (the XML is saved to the Blobstore but served through a password-protected App Engine handler).
- Create a cron job on the instance that fetches the XML every n hours and runs the indexer.
- Create a web service on the instance (I used Python's SimpleHTTPServer) to expose the search server. Re-save the AMI.
- Query the search service using App Engine's URL Fetch service and get the actual objects from MySQL by ID.
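
The XML feed from the second step could look roughly like the sketch below. It assumes Sphinx's xmlpipe2 format (a schema, one element per document, and a kill-list of IDs to suppress). The function name build_xmlpipe2 and the field names "title" and "body" are hypothetical, chosen only for illustration, and are not taken from the original setup.

```python
from xml.sax.saxutils import escape, quoteattr


def build_xmlpipe2(rows, killed_ids, fields=("title", "body")):
    """Return a Sphinx xmlpipe2 document as a string.

    rows: iterable of dicts, each with an integer "id" plus the text fields.
    killed_ids: IDs of documents deleted or changed since the last full index.
    The field names are placeholders; use whatever columns you search on.
    """
    out = ['<?xml version="1.0" encoding="utf-8"?>', "<sphinx:docset>"]

    # Declare the full-text fields once, before any document.
    out.append("<sphinx:schema>")
    for name in fields:
        out.append("<sphinx:field name=%s/>" % quoteattr(name))
    out.append("</sphinx:schema>")

    # One element per row; text is escaped so "&" and "<" stay valid XML.
    for row in rows:
        out.append('<sphinx:document id="%d">' % row["id"])
        for name in fields:
            out.append("<%s>%s</%s>" % (name, escape(row.get(name, "")), name))
        out.append("</sphinx:document>")

    # Kill-list: IDs that older indexes should no longer return.
    out.append("<sphinx:killlist>")
    for doc_id in killed_ids:
        out.append("<id>%d</id>" % doc_id)
    out.append("</sphinx:killlist>")

    out.append("</sphinx:docset>")
    return "\n".join(out)


if __name__ == "__main__":
    sample = [{"id": 1, "title": "Hello", "body": "full text & more"}]
    print(build_xmlpipe2(sample, killed_ids=[42]))
```

The App Engine handler would serve this string. On the Sphinx side, the index source would presumably be declared with type = xmlpipe2 and an xmlpipe_command that downloads the feed, but the exact configuration depends on your setup.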

Actually, once I got to this point I switched to the Datastore completely because it is faster and more reliable. I was only using MySQL for the full-text searches anyway ...



On Tue, Jul 17, 2012 at 2:55 AM, amber <tekipconsulting@gmail.com> wrote:
Hi,

I imported a DB (~100k rows) on which I am doing full text searches.
In Cloud SQL they are taking more than 30 seconds, causing
App Engine to exceed the request time limit.

Any suggestions? Or are full text searches just too slow to be done in
Google Cloud SQL?

