Hermes is really geared towards a pluggable, distributed pub/sub interface with stateful subscriptions. In the model, the user provides a utility-based subscription (things like a weighting of relevance, timeliness, and thoroughness of data), which then goes through a pluggable parsing interface. The output of this parser is a logical dependency graph for the trail of functions needed to achieve that subscription, but with the functions only resolved down to a symbolic level (i.e. sum() instead of a literal chunk of COD code).
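To make the parser output concrete, here is a minimal sketch of a utility-based subscription being turned into a symbolic dependency graph. Everything named here (`Subscription`, `SymbolicGraph`, `parse`, the node names, and the 0.5 threshold) is a hypothetical illustration, not the Hermes interface.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # Python 3.9+


@dataclass(frozen=True)
class Subscription:
    """Hypothetical utility weights, each assumed to lie in [0, 1]."""
    relevance: float
    timeliness: float
    thoroughness: float


@dataclass
class SymbolicGraph:
    """Maps each symbolic function name to the names it depends on."""
    deps: dict = field(default_factory=dict)

    def add(self, node, *depends_on):
        self.deps.setdefault(node, set()).update(depends_on)
        for dep in depends_on:
            self.deps.setdefault(dep, set())

    def order(self):
        # Dependencies come before the functions that consume them.
        return list(TopologicalSorter(self.deps).static_order())


def parse(sub):
    """Toy parser: nodes stay symbolic ("sum"), never concrete code."""
    graph = SymbolicGraph()
    graph.add("filter", "source")
    if sub.thoroughness < 0.5:
        # Assumed rule: low thoroughness permits sampling before the sum.
        graph.add("sample", "filter")
        graph.add("sum", "sample")
    else:
        graph.add("sum", "filter")
    return graph
```

Calling `parse(Subscription(0.9, 0.5, 0.9)).order()` yields the chain source, filter, sum; nothing in the graph says yet how `sum` will actually be implemented.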
This graph form then gets sent to a planner/mapper. The planner interacts with a library repository in order to expand symbolic functions into a list of possible instantiations (COD with or without specializations, a dll for a particular platform, a wrapper for a web service call, etc.). These may be graphs in their own right as well (enabling the application to provide preformed “widget” functionality). The planner also queries an execution metadata service that monitors the current execution environment – both the locations of worker nodes and the particular gateways and transformation/annotation functions on them. The planner is then responsible for choosing an ordering of the graph, potentially re-using existing functions, and uses the amaze layer to place service requests.
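A rough sketch of that expansion step follows, assuming an in-memory stand-in for the library repository and a single made-up cost number per instantiation. `Instantiation`, `LIBRARY`, `candidates`, `plan`, and the platform strings are all hypothetical; the real planner would also consult the execution metadata service and place requests through amaze, neither of which is modeled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instantiation:
    symbol: str    # symbolic function this implements, e.g. "sum"
    kind: str      # "cod", "dll", or "webservice"
    platform: str  # "any", or a specific platform tag
    cost: float    # assumed planner cost estimate; lower is better


# Hypothetical stand-in for the library repository.
LIBRARY = {
    "filter": [Instantiation("filter", "cod", "any", 1.5)],
    "sum": [
        Instantiation("sum", "cod", "any", 2.0),
        Instantiation("sum", "dll", "linux-x86", 1.0),
        Instantiation("sum", "webservice", "any", 5.0),
    ],
}


def candidates(symbol, platform):
    """Expand one symbolic function into the instantiations usable on a platform."""
    return [inst for inst in LIBRARY.get(symbol, [])
            if inst.platform in ("any", platform)]


def plan(ordered_symbols, platform):
    """Pick the cheapest usable instantiation for each symbol, in graph order."""
    chosen = {}
    for symbol in ordered_symbols:
        options = candidates(symbol, platform)
        if not options:
            raise LookupError(f"no instantiation of {symbol!r} for {platform!r}")
        chosen[symbol] = min(options, key=lambda inst: inst.cost)
    return chosen
```

With these made-up entries, a node reporting `linux-x86` gets the dll version of `sum`, while any other platform falls back to the generic COD version.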
The big hiccup to keep in mind is that we're intending this to be done in a cloud/p2p type model. There will be many planner nodes in the system, and the planners will have only partial metadata access between them. Ideally, planning is a service which can migrate dynamically, using some sort of leader-election scheme (although that's not a first-paper priority), and it should integrate tightly with a distributed metadata access layer that has some trade-offs in availability vs accuracy of data (i.e. Chord-like guarantees of eventual data return, overlaid with some FastBit-like data-rich quick indexing).
So… that said, I think rockhound could be used within any or all of the external services that the planner needs. The library function maintenance is a clear example of something that we did with PDS and could do better now. It demands a hierarchical name space, but the ability to farm out particular specialized versions of code to locations which are “near” where those specializations are needed would be very interesting. There's also potential for doing some of the precached messaging Karsten's been so interested in motivating.
Also, the execution metadata service is a clear winner… it's the user-space monitoring solution all over again. So that fits cleanly in some of your discussion around rockhound, too. If we can figure out a way to tie it together cleanly with amaze (which is in cvs, if I didn't mention that to you already) so that you can have multiple agents issuing the graph configuration calls while maintaining a synchronizable view of the total environment… I think that gets awfully close to where you were proposing to go with rockhound, and pretty close to what Vibhore said he wanted to do as well.
So, here are some (more) technical questions. You could play most all of these games with some suitably complex naming schemes – distinguishable provenance-preserving names for each emitted data element, prefix searches for planning, etc. Or you could take a more OO stance and talk about distributed object instantiation, etc., which also boils down to naming, but in a more controllable fashion. Or you could go all the way to DHTs and flat name spaces and just focus on limited categories of metadata tagging, with anything fancier having to boot out on a web service call to a much heavier-duty, XML-based coordinator. Do you have thoughts on both (a) general performance guides and (b) personal interest in any of these approaches?
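For the first option, a small sketch of what prefix search over hierarchical, provenance-preserving names might look like on a single node. The `NameIndex` class and the `/source/function/sequence` naming layout are assumptions made up for illustration; a distributed version would need something quite different underneath.

```python
import bisect


class NameIndex:
    """Sorted list of hierarchical names supporting prefix lookup."""

    def __init__(self):
        self._names = []

    def add(self, name):
        # Keep the list sorted and free of duplicates.
        i = bisect.bisect_left(self._names, name)
        if i == len(self._names) or self._names[i] != name:
            self._names.insert(i, name)

    def with_prefix(self, prefix):
        # All names sharing a prefix are contiguous in sorted order,
        # so binary-search to the first one and scan until the prefix ends.
        start = bisect.bisect_left(self._names, prefix)
        matches = []
        for name in self._names[start:]:
            if not name.startswith(prefix):
                break
            matches.append(name)
        return matches


index = NameIndex()
# Hypothetical layout: /<source>/<function>/<sequence number>
index.add("/sensors/b/sum/0001")
index.add("/sensors/a/sum/0001")
index.add("/sensors/a/filter/0001")
```

Here `index.with_prefix("/sensors/a/")` returns the two names emitted under source `a`, which is the kind of query a planner could use to find existing functions to re-use. A flat DHT name space gives up exactly this ordering, which is part of the trade-off in the question above.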
~~DISCUSSION~~