’Sup. I’m a fourth year computer science major at UCLA. I like web applications and distributed systems and I care about using them to solve problems and make people happy. I interned at Khan Academy last summer and will be returning there as a full time developer when I graduate. This is my website.
The App Engine NDB documentation does a good job of explaining the benefits of the new interface, but it doesn't really have anything for application developers who want to upgrade their existing models. As I discussed in my previous post, the Khan Academy engineering team recently went through this process and came out with a lot of experience about what works and what doesn't for an established codebase. What follows is our refined plan of attack, distilled into a field guide that you can use to transition over your own application.
N.B.: The difficulty of making this transition is proportional to the size and complexity of your application. If it's small enough that you can convert everything at once, great. If not, be prepared to do a fair bit of debugging to smooth things out. Our experience taught us that some parts of this are going to be rocky no matter what, but that the flexibility offered by the upgrade is worth it.
1. Change your models to subclass from ndb.Model and use NDB properties and APIs
- class Video(db.Model) becomes class Video(ndb.Model). If only this were all it took!
- PhoneNumberProperty and PostalAddressProperty are now simply StringProperty.
- There is no ListProperty. Instead, add repeated=True to the property constructor. For example, what was once db.ListProperty(bool) will now be ndb.BooleanProperty(repeated=True).
- Use KeyProperty instead of ReferenceProperty. Note that KeyProperty does not automatically fetch the referred-to entity from the datastore. You could write a custom ndb.Property subclass to emulate the old ReferenceProperty:

from google.appengine.ext import ndb

class ReferenceProperty(ndb.KeyProperty):
    def _validate(self, value):
        # Callers assign model instances, not keys.
        if not isinstance(value, ndb.Model):
            raise TypeError('expected an ndb.Model, got %r' % (value,))

    def _to_base_type(self, value):
        # Store only the entity's key in the datastore.
        return value.key

    def _from_base_type(self, value):
        # Fetch the referred-to entity when the stored key is read back.
        return value.get()
- Custom properties need to be rewritten as subclasses of ndb.Property. Good news: it's pretty trivial, and writing custom properties is vastly simpler in NDB. See my custom ReferenceProperty example above.
- db.get(key) becomes key.get(), entity.delete() becomes entity.key.delete(), etc. Refer to the cheat sheet. Making sure you've covered these changes everywhere they need to happen is the most difficult part of the conversion process.
- Entities no longer have a delete() method; call myentity.key.delete() instead.
- To opt out of NDB's automatic in-context caching, add a _use_cache = False class variable to each model as necessary. More sophisticated policy functions are available as well, but those are best left for final tweaking. This is less of a “I need to be afraid of a potential slowdown” thing and more of a “I want to preserve my existing performance characteristics at the risk of not getting potential improvements, because it will make me feel safer about this” thing.

2. Change all code that uses the newly converted models to use the NDB interface
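Since the hard part is finding every call site that still speaks the old interface, a rough text scan can help build a to-do list. The sketch below is only a heuristic: find_legacy_calls and the pattern list are hypothetical names made up for this illustration, the patterns will produce false positives (any .all() call matches, for instance), and you'd want to extend them for your own codebase.

```python
import re

# Hypothetical patterns for old db-style calls, each paired with a hint.
# \bdb\. does not match inside "ndb.", so converted calls are ignored.
LEGACY_PATTERNS = [
    (re.compile(r"\bdb\.get\("), "db.get(key) -> key.get()"),
    (re.compile(r"\bdb\.put\("), "db.put(...) -> entity.put() / ndb.put_multi(...)"),
    (re.compile(r"\bdb\.delete\("), "db.delete(...) -> key.delete() / ndb.delete_multi(...)"),
    # entity.delete() is flagged unless it is already entity.key.delete().
    (re.compile(r"(?<!\.key)\.delete\(\)"), "entity.delete() -> entity.key.delete()"),
    (re.compile(r"\.all\(\)"), "Model.all() -> Model.query()"),
    (re.compile(r"\bdb\.\w*Property\("), "db.*Property -> ndb.*Property"),
]


def find_legacy_calls(source):
    """Return (line_number, hint, stripped_line) for each suspicious line."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern, hint in LEGACY_PATTERNS:
            if pattern.search(line):
                hits.append((lineno, hint, line.strip()))
    return hits


sample = (
    "video = db.get(key)\n"
    "video.key.delete()\n"
    "old.delete()\n"
    "q = Video.all()\n"
)
for lineno, hint, text in find_legacy_calls(sample):
    print("%d: %s   [%s]" % (lineno, text, hint))
```

A scan like this only narrows the search. It can't see calls made through aliases or helper functions, so it complements testing rather than replacing it.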
Queries become ndb.Query instances. You could punt on rewriting them, but I don't recommend it: the NDB query syntax is pretty sexy and this is one of the least error-prone parts of the conversion. Do note, however, that calling methods like order and filter on a query instance doesn't modify it in place; each call returns a new query, so you need to reassign the result:

from google.appengine.ext import db, ndb

class OldBananaStand(db.Model):
    contains_money = db.BooleanProperty()

class NewBananaStand(ndb.Model):
    contains_money = ndb.BooleanProperty()

old_ones = OldBananaStand.all()
old_ones.filter('contains_money =', True)  # => ok! db queries are modified in place

new_ones = NewBananaStand.query()
new_ones.filter(NewBananaStand.contains_money == True)  # => nope, the result is discarded
new_ones = new_ones.filter(NewBananaStand.contains_money == True)  # => ok!
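The reassignment rule is easier to see without the datastore in the way. Here is a toy stand-in (ToyQuery is a hypothetical class written for this illustration, not NDB's implementation) whose filter method returns a new object instead of mutating the one it was called on:

```python
class ToyQuery(object):
    """Hypothetical stand-in for a query whose filter() returns a new query."""

    def __init__(self, filters=()):
        self.filters = tuple(filters)

    def filter(self, condition):
        # Build and return a new query; self is left untouched.
        return ToyQuery(self.filters + (condition,))


q = ToyQuery()
q.filter("contains_money == True")      # return value discarded
after_discarded_call = q.filters        # still empty

q = q.filter("contains_money == True")  # reassign to keep the filter
after_reassignment = q.filters
```

The upside of this style is that a base query can be shared and refined in several directions without the refinements stepping on each other.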
Serializing entities to and from protocol buffers changes as well. These helpers show the db and ndb versions side by side:

from google.appengine.ext import db, ndb
from google.appengine.datastore import entity_pb

def db_entity_to_protobuf(e):
    return db.model_to_protobuf(e).Encode()

def protobuf_to_db_entity(pb):
    # precondition: model class must be imported
    return db.model_from_protobuf(entity_pb.EntityProto(pb))

def ndb_entity_to_protobuf(e):
    return ndb.ModelAdapter().entity_to_pb(e).Encode()

def protobuf_to_ndb_entity(pb):
    # precondition: model class must be imported
    return ndb.ModelAdapter().pb_to_entity(entity_pb.EntityProto(pb))
3. Test, test, test
Watch for mismatches at the boundaries, such as a db.Key in your client-side template when the corresponding server-side endpoint is querying for an ndb.Key.

4. Deploy the mechanical translation and squash any remaining bugs
If you've made it this far, you're in great shape. You have a solid NDB foundation and now the more advanced features are available for you to play with.
5. Start using the asynchronous API
Making a function asynchronous means decorating it with @ndb.tasklet. A tasklet returns a future, whose result you get by calling get_result, naturally. By convention, I append _async to the names of newly tasklet-ized functions. But what if that function needs to be called from existing synchronous code? A future is of little use there. You could upgrade your synchronous code to always call get_result after calling a tasklet, but a slightly nicer solution is this conditionally async decorator that introduces a make_sync keyword argument:

from google.appengine.ext import ndb

def tasklet(func):
    """Tasklet decorator that lets the caller specify either async or sync
    behavior at runtime.

    If make_sync is False (the default), the tasklet returns a future and
    can be used in asynchronous control flow from within other tasklets
    (like ndb.tasklet). If make_sync is True, the tasklet will wait for its
    results and return them, allowing you to call the tasklet from synchronous
    code (like ndb.synctasklet).
    """
    @ndb.utils.wrapping(func)
    def tasklet_wrapper(*args, **kwds):
        arg_name = "make_sync"
        sync_by_default = False
        # Remove make_sync so it is never passed through to func.
        make_sync = kwds.pop(arg_name, sync_by_default)
        if make_sync:
            taskletfunc = ndb.synctasklet(func)
        else:
            taskletfunc = ndb.tasklet(func)
        return taskletfunc(*args, **kwds)
    return tasklet_wrapper
To return a value from a tasklet, raise an ndb.Return exception. This is a good example from the App Engine documentation:

# from https://developers.google.com/appengine/docs/python/ndb/async
@ndb.tasklet
def get_cart_async(acct):
    cart = yield CartItem.query(CartItem.account == acct.key).fetch_async()
    yield ndb.get_multi_async([item.inventory for item in cart])
    raise ndb.Return(cart)

@ndb.tasklet
def get_offers_async(acct):
    offers = yield SpecialOffer.query().fetch_async(10)
    yield ndb.get_multi_async([offer.inventory for offer in offers])
    raise ndb.Return(offers)

@ndb.tasklet
def get_cart_plus_offers(acct):
    cart, offers = yield get_cart_async(acct), get_offers_async(acct)
    raise ndb.Return((cart, offers))
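If the yield-a-future, raise-a-result control flow looks like magic, it may help to see it with the datastore stripped away. What follows is a toy model, not NDB's implementation: Return, Future, and toy_tasklet are hypothetical stand-ins written for this illustration, every future is already resolved when created, and everything runs eagerly with no event loop. Real tasklets overlap their RPCs; this sketch only shows how values travel between the generator and its driver.

```python
import functools


class Return(Exception):
    """Toy counterpart of ndb.Return: carries a tasklet's result."""

    def __init__(self, value=None):
        super(Return, self).__init__(value)
        self.value = value


class Future(object):
    """Toy future that is already resolved when created."""

    def __init__(self, value):
        self._value = value

    def get_result(self):
        return self._value


def _resolve(yielded):
    # A yielded tuple of futures resolves to a tuple of results.
    if isinstance(yielded, tuple):
        return tuple(f.get_result() for f in yielded)
    return yielded.get_result()


def toy_tasklet(func):
    """Drive a generator function to completion; wrap its result in a Future."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        gen = func(*args, **kwargs)
        try:
            yielded = next(gen)
            while True:
                # Send each resolved value back in as the yield's result.
                yielded = gen.send(_resolve(yielded))
        except Return as ret:
            return Future(ret.value)
        except StopIteration:
            # The generator finished without raising Return.
            return Future(None)
    return wrapper


@toy_tasklet
def get_cart_async(acct):
    cart = yield Future([acct + "-item"])  # stands in for fetch_async()
    raise Return(cart)


@toy_tasklet
def get_offers_async(acct):
    offers = yield Future([acct + "-offer"])
    raise Return(offers)


@toy_tasklet
def get_cart_plus_offers(acct):
    cart, offers = yield get_cart_async(acct), get_offers_async(acct)
    raise Return((cart, offers))


result = get_cart_plus_offers("alice").get_result()
```

Synchronous callers only ever see the future and its get_result method, which is the seam the make_sync idea from step 5 papers over.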
Finally, here are a couple of additional anecdotes that are somewhat specific to our codebase but worth sharing nonetheless:
- In principle, code that has to handle both db and ndb objects can use isinstance checks to deal with both types appropriately, but in practice that's really ugly.
- A bug caused yield op.db.Put(entity) to accumulate too many NDB entities and fail with a “datastore RPC too large” error. Luckily for you, this bug has since been fixed and doesn't exist in later revisions.

Don't panic. Welcome to the future!