Issue #272 new

Google app engine for reals

Matt Chaput
repo owner created an issue

With the new codec layer, it would be easier to implement Whoosh on top of Google App Engine's db.

  • Store posting blocks as blobs under {{{segment_# > term > block_#}}} keys.

  • Only allow bytes columns, and store column values under {{{segment_# fieldname > doc_#}}} keys? Or store each column as a single blob? (The 1 MB limit is a problem for that.)
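
A minimal sketch of the posting-block key layout from the first bullet, for illustration only. It is not Whoosh's codec API or the App Engine db API: a plain dict stands in for the datastore, and {{{BlobStore}}}, {{{posting_key}}}, {{{read_postings}}} and {{{MAX_ENTITY_BYTES}}} are hypothetical names. The size check assumes the "1 MB limit" applies per stored entity.

```python
# Hypothetical sketch: a dict stands in for the App Engine datastore.
# None of these names come from Whoosh or the App Engine SDK.

MAX_ENTITY_BYTES = 1024 * 1024  # assumed per-entity size limit ("1 MB")


def posting_key(segment, term, block):
    """Build a 'segment > term > block' key for one posting block.

    Assumes the term itself never contains the ' > ' separator.
    """
    return "%s > %s > %s" % (segment, term, block)


class BlobStore(object):
    """In-memory stand-in for a key/value entity store."""

    def __init__(self):
        self._blobs = {}

    def put(self, key, data):
        # Refuse blobs that would exceed the assumed entity size limit.
        if len(data) > MAX_ENTITY_BYTES:
            raise ValueError("blob for %r exceeds %d bytes"
                             % (key, MAX_ENTITY_BYTES))
        self._blobs[key] = bytes(data)

    def get(self, key):
        # Direct retrieval by key; returns None when the key is absent.
        return self._blobs.get(key)


def read_postings(store, segment, term):
    """Yield a term's posting blocks in order until one is missing."""
    block = 0
    while True:
        data = store.get(posting_key(segment, term, block))
        if data is None:
            return
        yield data
        block += 1


store = BlobStore()
store.put(posting_key(0, "whoosh", 0), b"block-0")
store.put(posting_key(0, "whoosh", 1), b"block-1")
blocks = list(read_postings(store, 0, "whoosh"))
```

Each block is fetched with one lookup by key, so reading a long posting list costs one round trip per block. That is probably part of why this would not be fast.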

It probably wouldn't be fast. And I don't actually think using Google App Engine is a good idea. So there's that.

Comments (2)

  1. Steven Ourada

    For those of us who think that GAE is a good idea, about how much work do you think this would be? In this case, I want to use it for a fairly small collection (50k) of tiny documents (500 chars). It looks like gae.py is a bit out of sync with the rest of the system in 2.5.4: it was choking when I tried to use it. I was thinking along the lines you outline above, since in the GAE world, small entities indexed for direct retrieval by key are the way to go... I haven't looked deeply enough at the storage architecture to know exactly where to plug in, but your description above might give me the pointers I need.

    Thanks for putting Whoosh out there!

  2. Steven Ourada

    (I take back what I said about gae.py. I was able to get it working on a small sample, after I fixed my code and did a workaround for some other little issue. I'll keep working on that path. But I still like the idea of storing the postings in separate entities.)
