Escaping search terms for Google Full Text Search

This is a cross-post of https://groups.google.com/d/topic/google-appengine/97LY3Yfd_14/discussion

I am working with the new full-text search service in GAE 1.6.6, and I am having trouble figuring out how to escape query strings correctly before I pass them to the search index. The docs mention that some characters must be escaped (namely the numeric operators); however, they do not say how the query parser expects the string to be escaped.

The problem I am facing is twofold:

  • Failing to escape many characters (more than those hinted at in the docs) causes the parser to raise QueryException.
  • Once I have escaped the query to the point where it no longer raises, the numeric operators (>, <, >=, <=) are not parsed correctly (they are not taken into account in the search).

I set up a test in which I pass string.printable to my_index.search(), and found that it raises QueryException for each of the "printable" characters that I am now stripping out, as well as for characters that seem innocent, such as the asterisk, comma, brackets, curly braces and tilde. None of these are mentioned in the docs as needing to be escaped.
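A probe of this kind could be sketched roughly as follows. This is only an illustration: find_rejected_chars and its search_fn parameter are hypothetical names, search_fn stands in for a call such as my_index.search, and the sketch catches any exception instead of importing the real QueryException.

```python
import string


def find_rejected_chars(search_fn, candidates=string.printable):
    """Return the characters for which search_fn raises an exception.

    search_fn is a hypothetical stand-in for a call such as my_index.search;
    any exception it raises is treated as a rejection of that character.
    """
    rejected = []
    for ch in candidates:
        try:
            search_fn(ch)
        except Exception:  # a real test would catch the search API's QueryException
            rejected.append(ch)
    return rejected
```

Calling find_rejected_chars(my_index.search) inside the App Engine environment would then list the single characters that the parser refuses.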

So far I have tried:

  • cgi.escape()
  • saxutils.escape() with a mapping of ASCII characters to their URL-encoded equivalents (e.g. a comma becomes %2C)
  • saxutils.escape() with a mapping of ASCII characters to HTML entities built from their ASCII codes (e.g. &#123;)
  • urllib.quote_plus()
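For reference, the snippet below shows what these kinds of escaping do to a query that contains a numeric operator. It is a sketch under assumptions: it uses the Python 3 standard library (html.escape in place of cgi.escape, urllib.parse.quote_plus in place of urllib.quote_plus), and the helper name escaped_variants and the sample query are made up for illustration.

```python
from html import escape as html_escape
from urllib.parse import quote_plus
from xml.sax.saxutils import escape as sax_escape


def escaped_variants(query):
    """Apply each attempted escaping scheme to query (hypothetical helper)."""
    return {
        # HTML-style escaping of &, < and >
        "html": html_escape(query),
        # saxutils with an extra mapping to a URL-encoded form
        "sax_url": sax_escape(query, {",": "%2C"}),
        # saxutils with an extra mapping to a numeric HTML entity
        "sax_entity": sax_escape(query, {"{": "&#123;"}),
        # URL encoding of the whole string
        "quote_plus": quote_plus(query),
    }


variants = escaped_variants("price >= 5")
```

In every variant the literal > character has been rewritten, which would be consistent with the numeric operators no longer being recognized, although nothing here confirms that this is the cause.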

, url- (%NN), > , <, >= <= continue . , , , , NOT field = value, , , .

TL;DR

How do I escape a query string so that it does not raise QueryException?


, - , . .

Enclosing the query string in double quotes and removing any " and \ characters from it seems to be enough:

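import string
from google.appengine.api.search import Query

# Every printable character, wrapped in double quotes, with " and \ removed
Query('"%s"' % string.printable.replace('"', '').replace('\\', ''))

# The same check for every 7-bit ASCII character (Python 2: xrange)
Query('"%s"' % ''.join(chr(i) for i in xrange(128)).replace('"','').replace('\\', ''))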
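The same transformation can be wrapped in a small helper. This is a minimal sketch: quote_query is a hypothetical name, not part of the search API, and it only reproduces the string handling from the two Query(...) calls in this answer.

```python
def quote_query(text):
    """Drop double quotes and backslashes, then wrap the text in double quotes."""
    cleaned = text.replace('"', '').replace('\\', '')
    return '"%s"' % cleaned
```

It would be used as Query(quote_query(user_input)), with user_input being whatever string comes from the user.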

EDIT: Note that quoting the whole string appears to turn it into a phrase search, so "foo bar" matches ...foo bar... but not ...bar foo...
