ShardQuery is a PHP class intended to make working with a partitioned dataset easier.
- QueryBroadcast – Sends queries to all shards in a parallel pipeline
- ParallelPipelining – Runs parts of queries in parallel and combines results
- QueryRouting – Sends queries only to the shard containing the requested data
- ConditionPushdown – Aggregation, joins and filtering are always performed at the shard level (in the pipeline), which distributes the work
- Gearman Workers – PHP is not threaded, so Gearman (Net_Gearman) is used to create each pipeline process
via shard-query.
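
To make the routing, broadcast and pushdown ideas above a little more concrete, here is a minimal sketch in Python. It is not ShardQuery's API and not how ShardQuery is implemented: the in-memory `SHARDS` list, the `customer_id % 3` partitioning rule and the function names are all assumptions made for illustration, and a thread pool stands in for the Gearman workers. Each "shard" filters and aggregates its own rows and returns only a partial (sum, count), which the caller then combines.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "shards": rows partitioned by customer_id % 3.
SHARDS = [
    [{"customer_id": 3, "amount": 10.0}, {"customer_id": 6, "amount": 30.0}],
    [{"customer_id": 1, "amount": 5.0}, {"customer_id": 4, "amount": 15.0}],
    [{"customer_id": 2, "amount": 20.0}],
]


def shard_for(customer_id):
    """Query routing: map a shard key to the index of the shard that holds it."""
    return customer_id % len(SHARDS)


def partial_aggregate(rows, predicate):
    """Work pushed down to one shard: filter locally, return a partial (sum, count)."""
    matching = [row["amount"] for row in rows if predicate(row)]
    return sum(matching), len(matching)


def average_amount(predicate, customer_id=None):
    """Average of matching amounts, combined from per-shard partial results.

    If customer_id is given, only the shard that owns that key is queried;
    otherwise the query is broadcast to every shard in parallel.
    """
    if customer_id is None:
        targets = SHARDS
    else:
        targets = [SHARDS[shard_for(customer_id)]]

    # Threads stand in for the worker processes here (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=len(targets)) as pool:
        partials = list(
            pool.map(lambda rows: partial_aggregate(rows, predicate), targets)
        )

    # Combine step: an average cannot be averaged again, so merge sums and counts.
    total = sum(part_sum for part_sum, _ in partials)
    count = sum(part_count for _, part_count in partials)
    return total / count if count else None
```

Calling `average_amount(lambda r: r["amount"] >= 10)` broadcasts to all three shards, while `average_amount(lambda r: r["customer_id"] == 4, customer_id=4)` touches only the shard that owns key 4. Returning (sum, count) pairs rather than per-shard averages is what lets the partial results be combined correctly.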
There are some limitations, but despite that I think we’ll see more tools like this in the future.
One reply on “shard-query”
Some of the limitations have now been lifted. It supports subqueries in the FROM clause (and can run them in parallel), and it supports UNION and UNION ALL.