Drift
Drift is an experimental proactive information service designed for scalable operation in high-performance environments. Drift accomplishes this by using proactive delivery of information updates to clients instead of forcing them to poll the server for updates. This reduces CPU load on clients and servers, and user-specified filtering of updates allows clients to assert control over incoming communications.
Drift has its roots in the Proactive Directory Service (“Scalable directory services using proactivity”, Bustamante, Widener, Schwan, Proc. Supercomputing 2002).
Design commentary
Drift is primarily intended to address some lingering PDS issues. There are some unrelated new wrinkles, however.
PDS client-server communication was explicitly session-based and built on an RPC layer. Drift is not session-based, and its semantics are message-based rather than RPC-based.
PDS data items were of a limited set of types (basically the type tags supported by the ATL attribute-list package). Drift data items are explicitly fully typed as FFS formats. I would like to support simple ATL typing as well. Also, so-called “blob” (binary large object) types have a MIME type in Drift.
PDS would give you its latest value for a particular item when you asked (as part of an RPC) or as an asynchronous update if you subscribed. Drift provides only the latter semantics. When you query for a data item, you are implicitly subscribed to an update channel. Options on the query message can request periodic updates even if the value hasn't changed, or only a single update (which can be wrapped to provide an RPC-style interface if that's really what you want). Asynchronous operation is intended to be the norm.
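The query-as-subscription idea, and the single-update option wrapped into an RPC-style call, can be sketched roughly as follows. This is only an illustration in Python, not Drift's actual interface: `ToyServer`, `query`, `publish`, `single_update`, and `fetch` are all hypothetical names, and the sketch assumes that a query for an item that already has a value delivers that value as its first update.

```python
import queue
from collections import defaultdict


class ToyServer:
    """Hypothetical in-process stand-in for a Drift server (not the real API)."""

    def __init__(self):
        self._values = {}
        # item name -> list of (callback, single_update) pairs
        self._subscribers = defaultdict(list)

    def query(self, name, callback, single_update=False):
        # Assumption: a query implicitly subscribes the caller, and the
        # current value (if any) is delivered as the first update.
        if name in self._values:
            callback(name, self._values[name])
            if single_update:
                return  # the one requested update has been delivered
        self._subscribers[name].append((callback, single_update))

    def publish(self, name, value):
        # Push the new value to every subscriber; drop one-shot subscribers.
        self._values[name] = value
        remaining = []
        for callback, single_update in self._subscribers[name]:
            callback(name, value)
            if not single_update:
                remaining.append((callback, single_update))
        self._subscribers[name] = remaining


def fetch(server, name, timeout=1.0):
    """RPC-style wrapper: block until exactly one update arrives."""
    box = queue.Queue(maxsize=1)
    server.query(name, lambda _name, value: box.put(value), single_update=True)
    return box.get(timeout=timeout)  # raises queue.Empty on timeout
```

A caller that wants the normal asynchronous behaviour passes a callback to `query` and keeps receiving updates; a caller that really wants request/response semantics uses `fetch`, which is just a single-update subscription plus a blocking wait.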
Where PDS stored data internally in a hash-map and supported a rudimentary XML serialization capability, Drift uses Berkeley DB to store data. Each data item has a UUID to uniquely identify it in the data store (the “warehouse”).
PDS supported a single naming tree for naming data. Drift will support multiple indexing strategies through a plugin approach. Each query plugin will resolve queries to a UUID or set of UUIDs, which can then be satisfied from the data warehouse. This is intended for maximal flexibility to meet client needs or desired query semantics. I'm sort of trying for a model-view-controller approach.
Note that the query spaces are not symmetric; that is, not every data item is indexable through each query plugin. I haven't decided yet whether every data item in a given Drift warehouse must be indexable through at least one query plugin :). At a minimum, I'd like to support PDS-style hierarchical naming and XML Query. DHT or smart-hashing approaches would also be interesting.
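To make the plugin idea concrete, here is a minimal sketch in Python of two query plugins resolving queries to sets of UUIDs that are then satisfied from the warehouse. Everything in it is an assumption for illustration: the `Warehouse` class uses a plain dict in place of Berkeley DB, and `PathPlugin`, `TagPlugin`, `resolve`, and `lookup` are hypothetical names rather than Drift's plugin interface.

```python
import uuid


class Warehouse:
    """Dict-backed stand-in for the UUID-keyed data store."""

    def __init__(self):
        self._items = {}

    def put(self, value):
        item_id = uuid.uuid4()
        self._items[item_id] = value
        return item_id

    def get(self, item_id):
        return self._items[item_id]


class PathPlugin:
    """Hypothetical PDS-style hierarchical naming plugin."""

    def __init__(self):
        self._paths = {}

    def index(self, path, item_id):
        self._paths[path] = item_id

    def resolve(self, query):
        # A query matches the exact path or anything beneath it.
        prefix = query.rstrip("/") + "/"
        return {i for p, i in self._paths.items()
                if p == query or p.startswith(prefix)}


class TagPlugin:
    """Hypothetical flat tag index, to show a second query space."""

    def __init__(self):
        self._tags = {}

    def index(self, tag, item_id):
        self._tags.setdefault(tag, set()).add(item_id)

    def resolve(self, query):
        return set(self._tags.get(query, set()))


def lookup(warehouse, plugin, query):
    """Resolve a query to UUIDs via one plugin, then read the warehouse."""
    return [warehouse.get(item_id) for item_id in plugin.resolve(query)]
```

Because each plugin keeps its own index, an item added to `PathPlugin` but never to `TagPlugin` is simply invisible to tag queries, which is the asymmetry described above.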
I have also been thinking about an “advertisement” channel for Drift servers. This would be a well-known channel with periodic events containing the set of “public” data items the server knows about. Users could then query by data format or UUID to subscribe to updates for a particular data item.
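One possible shape for an advertisement event, sketched in Python under stated assumptions: the event is a list of (UUID, format name) records for the public items only, and a client picks out the UUIDs it wants by format. The names `Advert`, `build_advertisement`, and `match_by_format` are made up for this sketch, and nothing here is a decided design.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Advert:
    """One advertised item: its UUID and the name of its data format."""
    item_id: uuid.UUID
    data_format: str


def build_advertisement(items):
    """Build one advertisement event.

    items: iterable of (item_id, data_format, is_public) tuples.
    Only public items are included in the event.
    """
    return [Advert(item_id, data_format)
            for item_id, data_format, is_public in items
            if is_public]


def match_by_format(advertisement, data_format):
    """Client side: UUIDs of advertised items with the given format."""
    return [ad.item_id for ad in advertisement
            if ad.data_format == data_format]
```

The UUIDs returned by `match_by_format` are what a client would then put in its query messages to subscribe.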
