Azure - Querying 200 million entities


I need to query a store of 200 million entities in Windows Azure. Ideally, I would like to use the Table service, rather than SQL Azure, for this task.

The use case is this: a POST containing a new entity arrives at a web-facing API. We must query the 200 million entities to determine whether or not we may accept the new entity.

With the entity limit of 1,000: does it apply to this type of query, i.e. do I have to query 1,000 at a time and perform my comparisons / business rules, or can I query all 200 million entities in one shot? I think I would hit a timeout in the latter case.

Any ideas?

Expanding on Shiraz's comment about Table storage: tables are organized into partitions, and within a partition the entities are indexed by a row key. So each row can be found extremely fast using the combination of partition key + row key. The trick is to choose the best possible partition key and row key for your particular application.

For the example above, where you're searching by telephone number, you can make TelephoneNumber the partition key. You could then find all rows related to that telephone number (though, not knowing your application, I don't know how many rows you'd be expecting). To refine things further, you'd want to define a row key that you can index into, within the partition key. That would give you a very fast response to let you know whether a record exists.
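To make the partition key + row key lookup concrete, here is a minimal in-memory sketch. It does not call the Azure Table service; a plain dict stands in for the table, and the names `insert_entity`, `entity_exists`, and the sample keys are hypothetical, chosen only for illustration. The point is that the accept/reject check becomes a single keyed lookup rather than a scan over all entities.

```python
from typing import Dict, Tuple

# In-memory stand-in for a table: (PartitionKey, RowKey) -> entity properties.
# This is an illustration only, not the Azure Table service API.
Table = Dict[Tuple[str, str], dict]


def entity_exists(table: Table, partition_key: str, row_key: str) -> bool:
    """Point lookup: one keyed access, no scan over other entities."""
    return (partition_key, row_key) in table


def insert_entity(table: Table, partition_key: str, row_key: str, props: dict) -> bool:
    """Accept the new entity only if the key pair is not already taken."""
    if entity_exists(table, partition_key, row_key):
        return False
    table[(partition_key, row_key)] = dict(props)
    return True


table: Table = {}
# Hypothetical keys: telephone number as partition key, a record id as row key.
first = insert_entity(table, "555-0100", "record-0001", {"status": "accepted"})
duplicate = insert_entity(table, "555-0100", "record-0001", {"status": "accepted"})
print(first, duplicate)
```

Whether a telephone number plus a record id is the right key pair depends on the business rule that decides whether a new entity is acceptable; the sketch only shows the shape of the check.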

Table storage (actually Azure Storage in general: tables, blobs, queues) has a well-known SLA. You can execute up to 500 transactions per second on a given partition. With the example above, a query for all rows for a given telephone number would equate to one transaction (unless you exceed 1,000 rows returned, in which case you'd need additional fetches to see all the rows); adding a row key to narrow the search would, indeed, yield a single transaction. The same goes for inserting a new row. You can also batch multiple row inserts, within a single partition, and save them in a single transaction.
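The transaction counting above can be sketched with a small in-memory model. This is not the real storage client: `FakeTable`, `query_page`, `insert_batch`, and `query_all` are hypothetical names, and using the last row key as a continuation token is a simplification of whatever token the service actually returns. The cap of 100 entities per batch is not stated above; it is my understanding of the service limit, so treat it as an assumption.

```python
from typing import Dict, List, Optional, Tuple

PAGE_LIMIT = 1000  # max entities returned per query response (from the discussion above)
BATCH_LIMIT = 100  # assumed cap on entities per single-partition batch

Entity = Dict[str, str]


class FakeTable:
    """In-memory model that counts one transaction per fetch or batch."""

    def __init__(self) -> None:
        self._partitions: Dict[str, Dict[str, Entity]] = {}
        self.transactions = 0

    def insert_batch(self, entities: List[Entity]) -> None:
        """Insert several rows as one all-or-nothing transaction."""
        if not entities or len(entities) > BATCH_LIMIT:
            raise ValueError("batch must contain 1..%d entities" % BATCH_LIMIT)
        partition_keys = {e["PartitionKey"] for e in entities}
        if len(partition_keys) != 1:
            raise ValueError("all entities in a batch must share one partition key")
        partition = self._partitions.setdefault(partition_keys.pop(), {})
        row_keys = [e["RowKey"] for e in entities]
        # Validate everything first so a failure leaves the partition untouched.
        if len(set(row_keys)) != len(row_keys) or any(rk in partition for rk in row_keys):
            raise ValueError("duplicate row key in batch or partition")
        for e in entities:
            partition[e["RowKey"]] = e
        self.transactions += 1

    def query_page(self, partition_key: str,
                   continuation: Optional[str] = None) -> Tuple[List[Entity], Optional[str]]:
        """Return at most PAGE_LIMIT rows plus a token if more rows remain."""
        self.transactions += 1
        rows = sorted(self._partitions.get(partition_key, {}).items())
        if continuation is not None:
            rows = [(rk, e) for rk, e in rows if rk > continuation]
        page = rows[:PAGE_LIMIT]
        token = page[-1][0] if len(rows) > PAGE_LIMIT else None
        return [e for _, e in page], token


def query_all(table: FakeTable, partition_key: str) -> List[Entity]:
    """Follow continuation tokens until the partition is exhausted."""
    results: List[Entity] = []
    token: Optional[str] = None
    while True:
        page, token = table.query_page(partition_key, token)
        results.extend(page)
        if token is None:
            return results


table = FakeTable()
rows = [{"PartitionKey": "555-0100", "RowKey": "%06d" % i} for i in range(2500)]
for start in range(0, len(rows), BATCH_LIMIT):
    table.insert_batch(rows[start:start + BATCH_LIMIT])  # 25 batches
fetched = query_all(table, "555-0100")                   # 3 fetches
print(len(fetched), table.transactions)
```

In this model, 2,500 rows in one partition cost 25 batch inserts and then three fetches to read back, which is why narrowing the query with a row key is preferable when all you need is an existence check.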

For a nice overview of Azure Table storage, with some good labs, check out the Platform Training Kit.

For more info on transactions within tables, see this MSDN blog post.

