<<Understanding JavaSpaces
JavaSpaces has been a bit of an unknown technology for a long time.
It's one of those technologies that programmers know is out there, but
haven't actually used enough to say they understand what it's for or
what it can do for them.
JavaSpaces is, in very simple terms, a kind of client/server map, a
grid in which data lives. This map has no distinct key; instead, it uses
a sort of query-by-example to retrieve data. It also has notification
capabilities.
In concept, that's really it. If you can wrap your head around the
idea that it's a map in which an entry's data determines how that
entry is accessed, you've mastered most of JavaSpaces already – the
rest is simple implementation.
JavaSpaces has a number of uses, especially in massively parallel
applications. One example is that of a job producer, e.g. "calculate
this" applications where the calculations vary widely in complexity;
JavaSpaces allows a high-powered CPU to grab certain computations
while letting lower-powered CPUs select others. Also, if more
computing power is needed, adding more power is only a matter of
running more clients to select tasks from the JavaSpace. Another
example is that of queueing updates to a datastore; storing data into
a JavaSpace is not only very fast (subject to network throughput, of
course, which would also affect direct data storage), but provides
easy audit capabilities (through notification events) and also means
that the persistence engine's speed can't impact the application.
(This is a common requirement for financial applications, where
milliseconds count.)
Using JavaSpaces
JavaSpaces requires an implementation. It's built on JINI, so the
beginning of a JavaSpaces exploration should start with a download of
the JINI starter kit, which includes Outrigger, Sun's implementation
of JavaSpaces.
The JINI starter kit isn't documented very well, and Outrigger isn't
an especially good implementation either, so an easy and convenient
addition to the JINI download is Blitz [1]. Blitz replaces Outrigger
and provides some easy startup scripts and some diagnostic tools.
A commercial implementation with the ability to distribute the Space
is GigaSpaces [2]. (These are merely some candidates, not meant to be
representative of the entire JINI or JavaSpaces communities.)
Running a Blitz instance is as simple as installing Blitz, going to
its directory, and typing ".\blitz.bat". This will show a decent
amount of information in the log as various services start, including
registration services and an HTTP server (defaulting to port 8080).
When initialization is complete, you have a working JavaSpace in
action. The diagnostics tool ("dashboard.bat") shows information about
memory usage, transactions, and entry counts in the space.
Entries in a JavaSpace are basically simple Java Objects that follow a
few simple rules:
* All data persisted in the space must be exposed in public fields.
* The Entry interface must be implemented. This is a marker
interface, requiring no methods to conform to the interface contract.
* Objects must be used for the properties (i.e., no primitive
fields). This makes sense especially in light of query-by-example,
which uses nulls to indicate wildcards.
An entry can contain data and/or functionality, and can implement any
interface or extend any class that conforms to the requirements for
JavaSpaces. Thus, a fairly common pattern is:
* Implement a Command interface in an Entry, along with any data
required to perform the command
* Store the Command into the JavaSpace
* Have a computing resource retrieve the Command and execute it,
storing any results back into the JavaSpace for consumption if necessary
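The pattern above can be sketched without a running space. This is a minimal in-memory illustration, not JavaSpaces code: the two queues stand in for the JavaSpace, and Command, SquareCommand, and CommandPatternSketch are hypothetical names. In a real space the command class would also implement Entry.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical command contract; a real version would also implement Entry.
interface Command {
    void execute();
}

// Carries its input and a slot for the result in public, object-typed fields.
class SquareCommand implements Command {
    public Integer input;
    public Integer result;

    public SquareCommand() {}  // entries need a public no-arg constructor

    public SquareCommand(Integer input) {
        this.input = input;
    }

    @Override
    public void execute() {
        result = input * input;
    }
}

class CommandPatternSketch {
    // Takes one pending command, executes it, and stores it with the finished ones.
    // Returns false when there was nothing to do.
    static boolean processOne(Queue<Command> pending, Queue<Command> finished) {
        Command command = pending.poll();
        if (command == null) {
            return false;
        }
        command.execute();
        finished.add(command);
        return true;
    }

    public static void main(String[] args) {
        Queue<Command> pending = new ConcurrentLinkedQueue<>();
        Queue<Command> finished = new ConcurrentLinkedQueue<>();
        pending.add(new SquareCommand(7));             // producer stores the command
        processOne(pending, finished);                 // computing resource runs it
        SquareCommand done = (SquareCommand) finished.poll();
        System.out.println("result=" + done.result);   // consumer reads the result
    }
}
```

The worker never needs to know what a command does; it only calls execute() and hands the object back.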
Connecting to a JavaSpace is fairly simple and flexible. The
Lookup.java class provided in the Blitz examples connects to every
service registrar it can find, which is acceptable development
behavior but not likely to be good in production; however, changing
this is fairly easy.
Using the Lookup class to search for an implementation of
JavaSpace05.class (suggested, if it's available) returns a
reference to the first JavaSpace it finds, provided one is available [3].
There are five basic operations associated with JavaSpaces. They are:
1. Read an entry matching a template, leaving it in the JavaSpace
2. Take an entry matching a template from the space, removing it
from the JavaSpace
3. Write an entry into the JavaSpace
4. Register a callback for events in the JavaSpace
5. Issue an event in the JavaSpace
Each of these can be associated with a Transaction. The read and take
operations can block for up to a caller-supplied timeout while waiting
for a matching entry. The "readIfExists" variant returns immediately
when no matching entry exists, and blocks only if there's an entry that
might still become available to satisfy the operation (for example, one
locked under another transaction) [4].
("takeIfExists" is also in the API.)
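The difference between reading and taking can be shown with a small in-memory stand-in. This is a sketch under stated assumptions, not the JavaSpace interface: ReadTakeSketch is a hypothetical class, a Predicate plays the role of the template, and both lookups return null immediately when nothing matches (there are no transactions here, so nothing can ever be locked).

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical in-memory stand-in for a space holding entries of one type.
class ReadTakeSketch<T> {
    private final List<T> entries = new ArrayList<>();

    synchronized void write(T entry) {
        entries.add(entry);
    }

    // Returns the first matching entry and leaves it in the store.
    synchronized T readIfExists(Predicate<T> template) {
        for (T entry : entries) {
            if (template.test(entry)) {
                return entry;
            }
        }
        return null;
    }

    // Returns the first matching entry and removes it from the store.
    synchronized T takeIfExists(Predicate<T> template) {
        for (Iterator<T> it = entries.iterator(); it.hasNext();) {
            T entry = it.next();
            if (template.test(entry)) {
                it.remove();
                return entry;
            }
        }
        return null;
    }

    synchronized int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        ReadTakeSketch<String> space = new ReadTakeSketch<>();
        space.write("job-1");
        space.readIfExists(e -> e.startsWith("job"));
        System.out.println("after read: " + space.size());
        space.takeIfExists(e -> e.startsWith("job"));
        System.out.println("after take: " + space.size());
    }
}
```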
In addition, if JavaSpace05 is used, the "take" operation can populate
a list of entries matching the template, for bulk processing of
entries in the JavaSpace, as can the "write" operation (for writing
sets of data). JavaSpace05 also includes a way to iterate over
sets of matching entries (the "contents" method).
Templates in JavaSpaces are instances of classes implementing the
Entry interface (i.e., they're entries) that use "null" as a wildcard.
Thus, if an Entry implementation has string properties A, B, and C, a
template might populate A with "data" while leaving B and C null; a
take operation would retrieve the first entry of that class that had
"data" in the "A" property. (The list form of take() would return all
entries that had "data" in the "A" property [5].)
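The null-as-wildcard rule can be illustrated without a running space. The following is a minimal in-memory sketch, not how any JavaSpaces implementation works internally: the Entry interface here is a local stand-in for net.jini.core.entry.Entry, and Message and TemplateMatcher are hypothetical names. It also compares fields with equals(), which is an assumption made for simplicity.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

// Local stand-in for net.jini.core.entry.Entry (a marker interface).
interface Entry {}

// Hypothetical entry with three public, object-typed fields.
class Message implements Entry {
    public String a;
    public String b;
    public String c;

    public Message() {}  // entries need a public no-arg constructor

    public Message(String a, String b, String c) {
        this.a = a;
        this.b = b;
        this.c = c;
    }
}

class TemplateMatcher {
    // True if every non-null public field of the template equals the candidate's field.
    static boolean matches(Entry template, Entry candidate) {
        if (!template.getClass().isInstance(candidate)) {
            return false;  // candidate must be the template's type or a subtype
        }
        try {
            for (Field field : template.getClass().getFields()) {
                if (Modifier.isStatic(field.getModifiers())) {
                    continue;  // only instance fields take part in matching
                }
                Object wanted = field.get(template);
                if (wanted != null && !wanted.equals(field.get(candidate))) {
                    return false;  // a populated template field must match exactly
                }
            }
            return true;  // null template fields are wildcards
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    // Analogue of the list form of take(): collects every matching entry.
    static List<Entry> matchAll(Entry template, List<? extends Entry> entries) {
        List<Entry> hits = new ArrayList<>();
        for (Entry entry : entries) {
            if (matches(template, entry)) {
                hits.add(entry);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Message> stored = List.of(
                new Message("data", "x", "y"),
                new Message("other", "x", "y"),
                new Message("data", "p", "q"));
        Message template = new Message("data", null, null);
        System.out.println(matchAll(template, stored).size() + " entries match");
    }
}
```

A template with every field left null matches any entry of its class, which is why primitives (which cannot be null) are not allowed as entry fields.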
Believe it or not, this summarizes JavaSpaces. JavaSpaces can
legitimately be used as a datastore (i.e., sets of heterogeneous or
homogeneous Entries), as a queue (i.e., lists of Entries containing
data, consumed by external processing agents), and as a distributed
processing facility (i.e., Entries containing data and processing
capabilities, offloaded to external computing devices).
An Actual Application
Discussing JavaSpaces is all well and good, but that leaves it still
in the realm of theory and not practice. Let's change that, by
implementing a compute server. Our compute server will be functional,
but not complete – it won't implement security, computational limits,
or account tracking, but will allow us to signal processes and gather
results.
We'll use GigaSpaces for the implementation, although any JavaSpaces
implementation should work.
Our requirements are that the external clients will provide a subclass
of our ComputeTask class. Our ComputeTask class will contain a status
and a UUID for identification. The internal clients – which will be
executing the processes – will call a method in the ComputeTask.
The key value proposition of JavaSpaces in this situation is that if
the internal client process becomes overloaded for any reason, it is
very simple to connect another "internal client" to the JavaSpace to
double computational power; each added client scales the grid nearly
linearly. It's possible to get the same kind of linear scaling from
other technologies, but the nearly transparent nature of the JavaSpace
model, along with the simplicity of the clients, is an advantage.
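To make the scaling argument concrete, here is a minimal sketch in which threads stand in for "internal clients" and a shared queue stands in for the JavaSpace. WorkerScalingSketch and drain are hypothetical names, and the sketch says nothing about network cost or real throughput; it only shows that adding capacity means starting more consumers of the same shared store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class WorkerScalingSketch {
    // Starts workerCount threads that drain the shared queue.
    // Returns how many tasks each worker completed.
    static int[] drain(Queue<Runnable> sharedTasks, int workerCount) {
        int[] completed = new int[workerCount];
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            final int id = i;
            Thread worker = new Thread(() -> {
                Runnable task;
                // Each worker keeps taking tasks until none are left.
                while ((task = sharedTasks.poll()) != null) {
                    task.run();
                    completed[id]++;  // each thread writes only its own slot
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();  // join makes the worker's counts visible here
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return completed;
    }

    public static void main(String[] args) {
        Queue<Runnable> tasks = new ConcurrentLinkedQueue<>();
        for (int i = 0; i < 100; i++) {
            tasks.add(() -> Math.sqrt(12345.678));
        }
        int[] perWorker = drain(tasks, 4);
        for (int i = 0; i < perWorker.length; i++) {
            System.out.println("worker " + i + " ran " + perWorker[i] + " tasks");
        }
    }
}
```

How the tasks are divided among the workers varies from run to run; only the total is fixed.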
The first step is to download GigaSpaces' Community Edition from
http://gigaspaces.com/. This requires registration; if this is a
concern for you, feel free to substitute Blitz, Outrigger from the
JSTK, or any other compliant JavaSpace implementation, any of which
provide the JavaSpaces API features we'll be using. The compute server
shown here has no dependency on any specific JavaSpaces implementation.
The next step is to consider our basic data model. The entities are
fairly simple: a ComputeTask class (which does nothing in and of
itself, but provides UUID and status for descendants), a SubmitTask
class (and a parent) which provide some handy utility methods for
adding ComputeTasks to the JavaSpace, and a multithreaded
ComputeClient, which actually does the work of executing the ComputeTask.
It's worth reiterating that this engine, as written, is insecure. This
article does not discuss the security manager or security policies; if
you'd like, please read "Discovering a Java Application's Security
Requirements" [6] for one method of determining required security
policy entries.
In addition, the "multithreaded client" is very primitive, as is the
ComputeTask itself. The codebase monitors nothing, doesn't mark a task
as "running" (instead, the task is removed from the space entirely
while it executes), and offers no status information. Only one hundred
tasks will be run (although it'd be trivial to change it to an infinite
number of tasks). That said, this task server does work, and is a
workable starting point for creating a more capable task server.
(Incidentally, while this code is entirely written from scratch, except
for Dan Creswell's service locator classes, there are other similar
implementations of the same kind of structure elsewhere.)
Here's the run() method of a ComputeClient worker thread. It
initializes a counter (to limit the number of tasks each thread runs to
20), then creates a template to use to look for available tasks (i.e.,
tasks whose status is STATUS_NEWTASK). Then it waits up to 5000 ms for
a matching task; if it finds one, it removes it from the space,
executes the task (which is expected to populate a "result" field in
the ComputeTask) and stores it back into the JavaSpace for retrieval by
any external clients. The code is fairly straightforward, if not very
robust. (Adding transaction support wouldn't involve much, actually.)
public void run() {
    int tasksRun = 0;
    // Template: a null UUID is a wildcard, so any new task matches.
    ComputeTaskImpl template = new ComputeTaskImpl();
    template.setUuid(null);
    template.setStatus(ComputeTask.STATUS_NEWTASK);
    while (tasksRun < TASKS_PER_THREAD) {
        try {
            // Remove a matching task, waiting up to SCAN_TIME_MILLIS.
            ComputeTask task = (ComputeTask) getSpace().take(template,
                    null,
                    SCAN_TIME_MILLIS);
            if (task != null) {
                task.execute();
                task.setStatus(ComputeTask.STATUS_FINISHEDTASK);
                // Write the finished task back for the submitter to collect.
                submit(task, null, Lease.FOREVER);
                tasksRun++;
            }
        } catch (Exception e) {
            log.log(Level.SEVERE, e.getMessage(), e);
        }
    }
}
The actual task itself is very, very simple. Here's an addition task,
for example. It's vast overkill to use a JavaSpace for simple
addition, but it's a proof of concept – more complex tasks would
follow the same pattern.
public class SimpleAdditionTask extends ComputeTaskImpl {
    private transient Logger log =
            Logger.getLogger(this.getClass().getName());
    public Integer firstNumber = null;
    public Integer secondNumber = null;

    // fulfills Entry requirements to have public no-arg constructor
    public SimpleAdditionTask() {
    }

    public SimpleAdditionTask(int f, int s) {
        firstNumber = f;
        secondNumber = s;
    }

    public void execute() {
        result = firstNumber + secondNumber;
        log.info("SimpleAddition has run! (result=" + result + ")");
    }
}
The execute() method is the one of most interest – note that it simply
executes the task (the addition) and stores the result into the
"result" property which is contained in ComputeTaskImpl.
(Incidentally, the result here is an Integer, not an int; the
JavaSpace API mandates that persisted fields in a JavaSpace are
public, Serializable objects.)
Source code for the entire server is available on TheServerSide.com.
The last step is to run the actual application. Start GigaSpaces, then
set your classpath to include jsk-platform.jar and jsk-lib.jar.
Execute "java -Djava.security.policy=policy.all
com.tss.javaspaces.compute.internal.ComputeClient"; it will block the
command shell until it completes. The security policy in "policy.all"
turns off all security restrictions.
With the same classpath and JVM option, execute the
com.tss.javaspaces.compute.util.SubmitTask class; it has a main()
method that creates a single SimpleAddition task, submits it, then
watches for the result for the task it submitted (which it looks for
by UUID and status).
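The full round trip (submit a task, let a client run it, collect the result by UUID and status) can be sketched in memory. This is an illustration under stated assumptions, not the article's source code: MiniSpace is a hypothetical stand-in for the JavaSpace with only write and a non-blocking take, TaskSketch is a simplified stand-in for ComputeTask, and the values given to the status constants are invented.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.UUID;

// Simplified stand-in for ComputeTask; the constant values are assumptions.
class TaskSketch {
    static final Integer STATUS_NEWTASK = 0;
    static final Integer STATUS_FINISHEDTASK = 1;

    public String uuid;
    public Integer status;
    public Integer first;
    public Integer second;
    public Integer result;

    void execute() {
        result = first + second;
    }

    // Null template fields act as wildcards, as in a JavaSpace template.
    boolean matches(TaskSketch template) {
        return (template.uuid == null || template.uuid.equals(uuid))
                && (template.status == null || template.status.equals(status));
    }
}

// Hypothetical in-memory space: write, plus a take that never blocks.
class MiniSpace {
    private final List<TaskSketch> entries = new ArrayList<>();

    synchronized void write(TaskSketch task) {
        entries.add(task);
    }

    synchronized TaskSketch take(TaskSketch template) {
        for (Iterator<TaskSketch> it = entries.iterator(); it.hasNext();) {
            TaskSketch entry = it.next();
            if (entry.matches(template)) {
                it.remove();
                return entry;
            }
        }
        return null;
    }
}

class SubmitAndCollectSketch {
    // External client: writes a new task and remembers its UUID.
    static String submit(MiniSpace space, int a, int b) {
        TaskSketch task = new TaskSketch();
        task.uuid = UUID.randomUUID().toString();
        task.status = TaskSketch.STATUS_NEWTASK;
        task.first = a;
        task.second = b;
        space.write(task);
        return task.uuid;
    }

    // Internal client: takes any new task, runs it, writes it back as finished.
    static boolean workOnce(MiniSpace space) {
        TaskSketch template = new TaskSketch();
        template.status = TaskSketch.STATUS_NEWTASK;
        TaskSketch task = space.take(template);
        if (task == null) {
            return false;
        }
        task.execute();
        task.status = TaskSketch.STATUS_FINISHEDTASK;
        space.write(task);
        return true;
    }

    // External client: takes its own finished task; null if not finished yet.
    static Integer collect(MiniSpace space, String uuid) {
        TaskSketch template = new TaskSketch();
        template.uuid = uuid;
        template.status = TaskSketch.STATUS_FINISHEDTASK;
        TaskSketch done = space.take(template);
        return done == null ? null : done.result;
    }

    public static void main(String[] args) {
        MiniSpace space = new MiniSpace();
        String id = submit(space, 2, 3);
        workOnce(space);
        System.out.println("result=" + collect(space, id));
    }
}
```

Matching on both UUID and status is what keeps a submitter from picking up its own task before it has been executed.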
You should see the ComputeClient class show the SimpleAddition's log
statement, and then the SubmitTask should get the result itself. You
now have a simple, but working, Compute Server.
Hopefully this has explained JavaSpaces clearly enough that you can
see some potential applications for it, and be confident that it can
be put to work efficiently and properly.>>
You can read this at:
http://www.theserverside.com/tt/articles/article.tss?l=UsingJavaSpaces
Gervas