Full text search with App Engine Search API and Django
Kunal Grover
Why use Google App Engine?
- Super-fast prototyping and time to market
- Auto scaling
- Supports Django very well
- MySQL, NoSQL, Memcache and Cloud Storage ready to set up
- Cheap
- Lots of Google APIs to use
Adding Full Text Search
- MySQL: not recommended
- External Solr server: great performance, but $$$$
- Google App Engine Search
[Diagram: a GAE instance backed by an external Solr server (AWS/Google Cloud Platform), compared with a GAE instance backed by the Google Search Index Service]
The GAE Search API
- Powerful expression-based search
- Supports TextField, DateField, HtmlField and NumberField
- Fuzzy-logic-based ranking; add your own ranking setup!
- But?
[Diagram: the GAE instance has to tell the Google Search Index Service whenever an object is created, updated or deleted]
What’s missing?
- Batch updates are recommended, so the pending updates for each model have to be stored in the database.
- You need to create a service that updates the search documents whenever objects are created, updated or deleted. (Visualize it as a lot of repeated code.)
- Using incorrect field types makes searching harder.
- Let’s bring it to Django-Land.
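The batching idea above can be sketched in plain Python. This is only an illustration under stated assumptions: `FakeIndex` and `PendingChanges` are hypothetical names, the queue lives in memory rather than in the database, and the batch size of 200 is an assumption about the per-call document limit of the Search API.

```python
from collections import deque

BATCH_SIZE = 200  # assumption: per-call document limit of the search index


class FakeIndex:
    """In-memory stand-in for a search index (hypothetical, for illustration)."""

    def __init__(self):
        self.docs = {}
        self.put_calls = 0

    def put(self, docs):
        # One call stores a whole batch of (doc_id, fields) pairs.
        self.put_calls += 1
        for doc_id, fields in docs:
            self.docs[doc_id] = fields

    def delete(self, doc_ids):
        for doc_id in doc_ids:
            self.docs.pop(doc_id, None)


class PendingChanges:
    """Changes recorded at save/delete time and applied later in batches."""

    def __init__(self):
        self.updates = deque()
        self.deletes = deque()

    def record_update(self, doc_id, fields):
        self.updates.append((doc_id, fields))

    def record_delete(self, doc_id):
        self.deletes.append(doc_id)

    def flush(self, index, batch_size=BATCH_SIZE):
        """Drain both queues; intended to be triggered periodically."""
        while self.updates:
            n = min(batch_size, len(self.updates))
            index.put([self.updates.popleft() for _ in range(n)])
        while self.deletes:
            n = min(batch_size, len(self.deletes))
            index.delete([self.deletes.popleft() for _ in range(n)])


index = FakeIndex()
pending = PendingChanges()
for i in range(450):
    pending.record_update("book-%d" % i, {"title": "Book %d" % i})
pending.record_delete("book-0")
pending.flush(index)
```

Recording a change is cheap and happens on every save, while the expensive index calls are grouped: 450 queued updates become three `put` calls instead of 450.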
- Install google-appengine-django-search
- Connect to a model:
    siteIndex.register(ModelName, field_list, rank_function)
- Set up a URL handler for the indexer:
    handlers:
    - url: /index
      script: searchApp.apps.app
- And set up a cron job:
    cron:
    - description: Index ranking
      url: /index
      schedule: every 5 minutes
siteIndex.register(ModelName, field_list, rank_function)
- ModelName -> any Model class
- field_list -> ['fieldname', 'fk.fieldname', 'fk.fk.fieldname', 'manyToManyField']
- rank_function -> score for an object
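To make the dotted entries in field_list concrete, here is a minimal sketch of how such a path could be followed across foreign keys. It is not the package's actual code: `resolve_field` and `document_key` are hypothetical helpers, the plain classes stand in for Django models, the book title is made up, and turning dots into underscores is an assumption inferred from the field names used in the demo queries.

```python
class Publisher:
    def __init__(self, name):
        self.name = name


class Author:
    def __init__(self, name, publisher):
        self.name = name
        self.publisher = publisher


class Book:
    def __init__(self, title, author):
        self.title = title
        self.author = author


def resolve_field(obj, path):
    """Follow a dotted path such as 'author.publisher.name' one attribute at a time."""
    value = obj
    for part in path.split("."):
        if value is None:  # a null foreign key ends the walk early
            return None
        value = getattr(value, part)
    return value


def document_key(path):
    """Flatten a dotted path into one field name (assumed naming convention)."""
    return path.replace(".", "_")


book = Book("Example Title", Author("Jean", Publisher("Atlas Press")))
field_list = ["title", "author.name", "author.publisher.name"]
document = {document_key(p): resolve_field(book, p) for p in field_list}
```

Under these assumptions the resulting document has the keys `title`, `author_name` and `author_publisher_name`.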
Default behaviour:
- Model creates/updates -> added to the search update queue
- Model deletes -> added to the search delete queue
- IntegerField is stored as search.NumberField, DateField as search.DateField.
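The default type mapping can be pictured as a small lookup on the Python value. This is a sketch, not the package's implementation: `default_field_type` is a hypothetical helper, it returns type names as strings instead of real search field objects, and falling back to TextField for everything else is an assumption.

```python
import datetime


def default_field_type(value):
    """Pick a search field type name for a model value (illustrative only)."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; keep it out of the numeric branch
        return "TextField"
    if isinstance(value, (int, float)):
        return "NumberField"
    if isinstance(value, datetime.date):  # also matches datetime.datetime
        return "DateField"
    return "TextField"  # assumed fallback for all other values


types = {
    "pages": default_field_type(320),
    "date": default_field_type(datetime.date(2005, 4, 1)),
    "description": default_field_type("A novel"),
}
```

Getting the type right matters for queries: a range comparison on a date only makes sense if the value was indexed as a date rather than as text.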
For the Power User
- Option to set a field's type to search.HtmlField, so that only the text is searched
- Using soft deletes? -> Publish your custom delete signal:
    siteIndex.register(ModelName, field_list, rank_function, deleteSignal=customDeleteSignal)
- Changes to foreign-key objects don’t update search documents by default (sorry, a limitation on my side) -> Publish your custom update signal:
    siteIndex.register(ModelName, field_list, rank_function, updateSignal=customUpdateSignal)
- Don’t want to use the cron setup? -> Use index_create_single(obj) and index_delete_single(obj)
What’s in store for the future?
- Better ManyToManyField support?
- Attach listeners to child models' auto save too.
- Make it work with the Google Managed VM environment too.
- Optimize index_create_single(obj) to use Task Queues.
Demo
- tags: Horror NOT tags: Romance
  https://django-test-143704.appspot.com/search?search_type=Book&query=tags%3A+Horror+NOT+tags%3A+Romance&start=0&end=5
- date < 2005-04-01
  https://django-test-143704.appspot.com/search?search_type=Book&query=date+%3C+2005-04-01&start=0&end=5
- description: novel
  https://django-test-143704.appspot.com/search?search_type=Book&query=description%3A+novel&start=0&end=5
- author_name=Jean
  https://django-test-143704.appspot.com/search?search_type=Book&query=author_name%3A+Jean&start=0&end=5
- author_publisher_name=Atlas Press
  https://django-test-143704.appspot.com/search?search_type=Book&query=author_publisher_name%3A+Atlas+Press&start=0&end=5
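The demo links all follow the same pattern, so they can be rebuilt from the query string with the standard library. A small sketch: the parameter names (search_type, query, start, end) are taken from the URLs above, while `demo_url` itself is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE_URL = "https://django-test-143704.appspot.com/search"


def demo_url(query, search_type="Book", start=0, end=5):
    """Build a demo search URL; urlencode escapes ':' and '<' and turns spaces into '+'."""
    params = [
        ("search_type", search_type),
        ("query", query),
        ("start", start),
        ("end", end),
    ]
    return BASE_URL + "?" + urlencode(params)


horror_url = demo_url("tags: Horror NOT tags: Romance")
date_url = demo_url("date < 2005-04-01")
```

Passing the parameters as a list of pairs keeps them in the same order as in the demo links.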