First of all, the code’s had a slight polish, although it’s still some way from gleaming perfection. What it really needs is unit testing, and possibly some profiling as well. Oh, and some document strings. Stuff like that. I’m afraid that the more interested I am in something, the louder my inner cowboy shouts yeehaa!
. Process is fine when the work you’re doing is essentially boring: at least then it can be boring and respectable.
I’ve based the design so far on a number of untested assumptions about how efficient certain operations are. In fact I’m pretty certain that a naive
approach - just storing the tuples in a list and running along the list looking for matches - would perform better for smallish tuple sets. By indexing absolutely everything, I’m making tuple selection over large sets faster at the cost of making everything else slower (but slower by an amount that increases slowly relative to the size of the tuple set; at least, assuming my assumptions about dictionary lookup and tests for set membership in Python are correct).
It might be that a hybrid approach would be best - index on the first item in the tuple, then do a linear-time search through an unindexed list after that. Best of all would be to make this configurable.
Meanwhile, it works well enough for me to be able to try out some co-ordination scenarios. The new version of the code has three reader
processes simultaneously trying to access items produced by a worker
process. No synchronization primitives required: they just pull tuples out of the tuple space until there are none left. But how do they know when there are none left? Because they also update a shared count
tuple.
What makes it all so much simpler than it otherwise might have to be is the atomicity of the tuple space operations. The cost of this, in my implementation at least, is that these operations take place sequentially, on a single thread. Calls from other threads are marshalled to this thread via a queue, and block until the worker
thread calls them back. Again, this is almost certainly not the most efficient scheme. Simultaneous non-destructive reads should presumably be able to be processed concurrently.