Data sharing in Java

Hello all,

I'm currently involved in supervising a couple of student projects that
will involve formal specification and refinement into Java code. But I
have run into a BIG problem: data sharing. 

Assignment statements in Java are not what they seem. Suppose that
variables A and B have type Circle. Then the statement

A = B;

makes A a reference to the same object that B references.
So if the next line of code is...

A.radius = 24.0;

this has the effect of setting B.radius as well!
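
Here is a minimal sketch of that behaviour. The Circle class below is
my own assumption (a single public field called radius), and AliasDemo
and radiusSeenThroughB are names made up for the illustration:

```java
// Hypothetical Circle class: assumed to have just a mutable 'radius' field.
class Circle {
    double radius;

    Circle(double radius) {
        this.radius = radius;
    }
}

class AliasDemo {
    // Returns the radius read through B after writing through A.
    static double radiusSeenThroughB() {
        Circle B = new Circle(1.0);
        Circle A = B;      // copies the reference, not the Circle object
        A.radius = 24.0;   // writes to the single object both names refer to
        return B.radius;
    }

    public static void main(String[] args) {
        System.out.println(radiusSeenThroughB()); // prints 24.0
    }
}
```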

One of the main advantages of object-oriented programming is supposed
to be data encapsulation, i.e. nobody should be able to mess with
object B's private data except for B itself. But if A and B are
references to the same object then this property gets wiped out: B's
data can be changed by fiddling with A!

Specification languages like Z allow you to define variables of
abstract mathematical types such as `set' and `graph'. These can then
be gradually refined to code. Using the object-oriented model we can
think of each of these mathematical types as corresponding to a
particular Class.

If it were possible to implement each class independently from the
others then we would have a `scalable' formal method for code development,
thanks to the object-oriented model.

And that would be fine if there were no data-sharing, but in practice
forbidding sharing can lead to horribly inefficient code. For instance we might
wish to use several Objects to represent a number of overlapping sets of
large items of data. It would be more efficient for each set to be stored
as a collection of pointers to the various set members rather than to
duplicate the data. Java programs tend to have data-sharing dependencies
all over the place, including cyclic dependencies. So whenever the value
of a particular Object gets updated there may be numerous other Objects
whose values change as a side effect.  
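
As a rough illustration of the overlapping-sets case, assuming a
hypothetical Item class standing in for a large item of data (Item,
SharedSetsDemo and the sizes are invented for this sketch):

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical large data item. It does not override equals/hashCode,
// so HashSet membership is by object identity and survives mutation.
class Item {
    int[] payload;

    Item(int size) {
        payload = new int[size];
    }
}

class SharedSetsDemo {
    // Two sets overlap in one Item; an update made via the first set
    // is visible through the second, because both hold the same reference.
    static boolean updateVisibleInOtherSet() {
        Item shared = new Item(1000);

        Set<Item> first = new HashSet<>();
        first.add(shared);
        first.add(new Item(1000));

        Set<Item> second = new HashSet<>();
        second.add(shared);   // stores a reference; the payload is not duplicated

        for (Item item : first) {
            item.payload[0] = 7;   // update every member of the first set
        }

        // The only member of 'second' is the shared Item.
        return second.iterator().next().payload[0] == 7;
    }

    public static void main(String[] args) {
        System.out.println(updateVisibleInOtherSet()); // prints true
    }
}
```

Nothing in the code for the second set mentions the first one, yet its
value has changed as a side effect.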

This mailing list was originally set up for discussion of the JCSP
library which was intended to help eliminate side effects of
multithreading over shared data. So what I'm asking now is "how can we
eliminate the risks that arise because of data-sharing of objects?".

In practice I think that good programmers `think' in terms of references
or pointers and so do not tend to make these kinds of errors. But I would
like to have some kind of design rules that give a formal guarantee
that no errors can arise due to data sharing, i.e. if the Java program
were transformed so that every assignment

 A = B;

were changed to

 A = (B).clone();

then the answer would be the same.
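
To make the property concrete, here is a small sketch that runs the
same two statements once with plain assignment and once with clone(),
and compares the results. CloneableCircle, CloneCheck and run are
hypothetical names, and the class is assumed to hold only a primitive
field, so the default shallow clone() is a full copy:

```java
// Hypothetical Circle variant that supports clone().
class CloneableCircle implements Cloneable {
    double radius;

    CloneableCircle(double radius) {
        this.radius = radius;
    }

    @Override
    public CloneableCircle clone() {
        try {
            // Object.clone() makes a field-by-field (shallow) copy.
            return (CloneableCircle) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // not reachable: the class is Cloneable
        }
    }
}

class CloneCheck {
    // Runs "A = B; A.radius = 24.0;" and returns B.radius.
    // With copyOnAssign set, the assignment becomes "A = (B).clone();".
    static double run(boolean copyOnAssign) {
        CloneableCircle B = new CloneableCircle(1.0);
        CloneableCircle A = copyOnAssign ? B.clone() : B;
        A.radius = 24.0;
        return B.radius;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints 24.0
        System.out.println(run(true));  // prints 1.0
    }
}
```

The two runs disagree, so this fragment depends on data sharing and
would fail the test I have in mind.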

Does anybody know how to do this?


                            Dr J. M. R. Martin                
 Oxford Supercomputing Centre                          Tel: (01865) 273849
 Wolfson Building                                      Fax: (01865) 273839
 Parks Road                           Email: jeremy.martin@xxxxxxxxxxxxxxx
 Oxford OX1 3QD                         WWW: http://users.ox.ac.uk/~jeremy