Saturday, 10 July 2010

I wouldn't start from here... (solving serialization problems)

Last week we had a problem with a live system, therefore it was an emergency. The problem was around serialization. The application uses Spring Webflow and, to allow the user to save where they were up to and restore it later, we serialize the Webflow conversation to the database. And there lay the problem.

We used object serialization which, as you probably know, stores a binary representation of the Java object. The Webflow conversation object has a bunch of other objects attached to it, including application objects. It works fine as long as those objects do not change, and there lay the problem.

A new version of the application was deployed and one of those objects changed. So immediately any of the older conversations refused to restore, causing all kinds of problems.

The way you're supposed to handle this is to ensure that all your Serializable objects have a serialVersionUID field added. Eclipse flags classes that implement the Serializable interface for this reason and you really ought to take the hint and add a serialVersionUID. Java uses the serialVersionUID to figure out that even though the class has changed you still want it treated the same because the serialVersionUID is the same (unless you changed it deliberately).
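
As a minimal sketch of what that looks like (the Customer class, its field and the helper methods are invented for illustration; they are not from the application in question):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialVersionExample {

    // Hypothetical application object attached to the conversation.
    static class Customer implements Serializable {
        // Declared explicitly, so the version Java writes to the stream
        // stays the same when a compatible change is made to the class.
        private static final long serialVersionUID = 1L;

        String name;

        Customer(String name) {
            this.name = name;
        }
    }

    // Serialize to a byte array, much as you would before writing a blob.
    static byte[] toBytes(Object obj) {
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
             ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(obj);
            oos.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Restore an object from the bytes produced by toBytes.
    static Object fromBytes(byte[] bytes) {
        try (ObjectInputStream ois =
                 new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Customer restored = (Customer) fromBytes(toBytes(new Customer("Alice")));
        System.out.println(restored.name);
        // Reports the declared value rather than a calculated one.
        System.out.println(
            ObjectStreamClass.lookup(Customer.class).getSerialVersionUID());
    }
}
```

If a field is later added to Customer and the serialVersionUID is left at 1L, older streams should still restore, with the new field simply left at its default value.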

Now, while that is all fine, it is not much help if you've already serialized the objects. At this point I'd like to point out that I had nothing to do with this part of the design, just so you know. I only got involved with solving the problem. But it is a bit like the old joke where you ask an Irishman how to get to Dublin and he says 'Well, I wouldn't start from here!'

To solve this I figured we'd need to restore the objects into the old version, then write them to some neutral format, then serialise from there to the new format. Initially I was thinking I would need to run two passes on this. The first to write the neutral format to a file, and second to read the file. The two passes would be implemented in two different jar files so that the two versions of the same objects would not get mixed up. But someone suggested we could make this simpler by messing about with the class loader and I remembered madura-bundle which would provide a neat way to manage this.

Madura-bundle is kind of a simplified OSGi that is closely tied with Spring. You can create a bundle containing your objects and a spring beans file specifying some of them as beans. Then you have a main program with a main spring beans file, and you inject bundled beans into that. Then, and this is the interesting bit, you can switch from one bundle to another. The bundled beans injected into your main program change automatically.

So for this problem I loaded the objects from the two versions of the application into two different bundles, then I could restore the serialised objects from the database under bundle-1, switch to bundle-2 and serialise them back. Easy.

I used xstream for my neutral format. So, having unpacked the objects from the database I called xstream to get them into an XML string. Then, after I switched bundles, I used xstream to create the new version objects which I could then serialise to the database.

Xstream worked pretty much first time and it handles complex structures where an object is cross referenced several times. Where XML would typically repeat the object definition, xstream stores an XPath reference when it finds a repeated object. I did have a small issue with one of the webflow objects, which implements Externalizable rather than Serializable. The xx object has a protected constructor which, naturally, gives xstream a problem. To get around this I had to implement a converter for the org.springframework.webflow.engine.impl.FlowSessionImpl class.

Xstream's architecture uses a bunch of standard converter classes which handle pretty much everything as far as I can see. But when necessary you can add your own converter. In this case all I had to do was extend
and override the unmarshal method to instantiate the FlowSessionImpl class explicitly. I put the converter in the same package as FlowSessionImpl, which gets around the protected constructor.

The code ended up like this. To unpack from the blob I used this method:
public String getXMLFromBlob(InputStream is) {
    XStream xstream = new XStream();
    Object ret = null;
    if (is != null) {
        try {
            ObjectInputStream oji =
                new ClassLoaderObjectInputStream(/* arguments missing from the original post */);
            ret = oji.readObject();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }
    return xstream.toXML(ret);
}
I put this into the first madura bundle, and all it does is take the input stream which comes from the blob and converts it. I did have to implement an extension of the standard ObjectInputStream because the standard one locked on to the main class loader. The extension just uses the following to override the resolveClass method:
protected Class<?> resolveClass(ObjectStreamClass desc)
    throws IOException, ClassNotFoundException {
    String name = desc.getName();
    try {
        // try the class loader we were given first
        return Class.forName(name, false, classLoader);
    } catch (ClassNotFoundException e) {
        // fall back to the standard lookup
        return super.resolveClass(desc);
    }
}
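
For context, here is a rough sketch of how the whole extension might fit together, using only the JDK. The constructor signature, the classLoader field declaration and the toBytes and readWith helpers are my assumptions for illustration; the listing above only shows the resolveClass override.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.util.ArrayList;
import java.util.Arrays;

public class ClassLoaderStreamExample {

    // Assumed shape of the ObjectInputStream extension.
    static class ClassLoaderObjectInputStream extends ObjectInputStream {
        private final ClassLoader classLoader;

        // Assumption: the loader is handed in alongside the stream.
        ClassLoaderObjectInputStream(ClassLoader classLoader, InputStream in)
                throws IOException {
            super(in);
            this.classLoader = classLoader;
        }

        @Override
        protected Class<?> resolveClass(ObjectStreamClass desc)
                throws IOException, ClassNotFoundException {
            try {
                // Ask the supplied loader first (the bundle's loader in the post).
                return Class.forName(desc.getName(), false, classLoader);
            } catch (ClassNotFoundException e) {
                // Fall back to the standard lookup.
                return super.resolveClass(desc);
            }
        }
    }

    // Hypothetical helper: serialize an object to bytes.
    static byte[] toBytes(Object obj) {
        try (ByteArrayOutputStream bos = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bos)) {
            out.writeObject(obj);
            out.flush();
            return bos.toByteArray();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Hypothetical helper: restore an object, resolving classes via loader.
    static Object readWith(ClassLoader loader, byte[] bytes) {
        try (ObjectInputStream in = new ClassLoaderObjectInputStream(
                loader, new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = toBytes(new ArrayList<>(Arrays.asList("a", "b")));
        Object restored = readWith(
            ClassLoaderStreamExample.class.getClassLoader(), bytes);
        System.out.println(restored);
    }
}
```

In the real program the loader passed in would be whichever one holds the old version of the application classes, rather than the loader of the example class itself.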

To get the new objects from the XML I switch to the other bundle and then call this method:
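public Object getObjectFromXML(String xml) {
    XStream xstream = new XStream();
    // register the custom converter for FlowSessionImpl
    xstream.registerConverter(
        new org.springframework.webflow.engine.impl.FlowSessionConverter(/* arguments missing from the original post */));
    return xstream.fromXML(xml);
}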
This also shows how the custom converter is registered with xstream. The object returned is always the Webflow Conversation object, plus the other objects that were attached to it.

So that's how things are serialized at the detail level. The process is controlled by this code:
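// a setBundle call (not preserved in this listing) selects the old-version bundle here
InputStream is = new; // the rest of this statement is missing from the original post
String xml = getTranslator().getXMLFromBlob(is);
// a second setBundle call (also not preserved) switches to the new-version bundle here
Object obj = getTranslator().getObjectFromXML(xml);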
Not very much to it really. The setBundle call, obviously, selects a different bundle and this switches back and forth between them as we process each record. I have not shown reading and writing the blob in the database but that is easy enough to figure out.

Although this does provide a solution to the problem you should not get into this situation in the first place.
  1. Do not serialise objects to a database like this. Put them into a neutral format such as XML (perhaps generated from xstream) instead. This feature of Java is best confined to sending objects between applications rather than persisting them.
  2. If you must serialize binary objects then declare the serialVersionUID field in every class. This is not completely foolproof. You can change the object enough to break it. But you'll probably be okay.
As part of this exercise I needed to find the serial version of a class so that I could tell whether I had an old one or a new one. My understanding is that Java uses the serialVersionUID if one is declared; if not, it calculates a value from the structure of the class (its name, interfaces, fields and methods). In my case there was no serialVersionUID, so I needed some way to find out what the calculated value was. I used this:
ObjectStreamClass objStreamClass = ObjectStreamClass.lookup(obj.getClass());
long serialVersion = objStreamClass.getSerialVersionUID();
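
To make that a bit more concrete, here is a small self-contained sketch. The WithUid and WithoutUid classes and both helper methods are hypothetical; the only real API used is ObjectStreamClass.lookup, getSerialVersionUID and a reflection check for a field named serialVersionUID.

```java
import java.io.ObjectStreamClass;
import java.io.Serializable;

public class SerialVersionLookup {

    // Hypothetical class that declares its version explicitly.
    static class WithUid implements Serializable {
        private static final long serialVersionUID = 42L;
        int value;
    }

    // Hypothetical class that leaves the version for Java to calculate.
    static class WithoutUid implements Serializable {
        int value;
    }

    // The version Java would write to the stream for this class.
    static long serialVersionOf(Class<?> type) {
        ObjectStreamClass osc = ObjectStreamClass.lookup(type);
        if (osc == null) {
            // lookup returns null for classes that are not serializable
            throw new IllegalArgumentException(type + " is not serializable");
        }
        return osc.getSerialVersionUID();
    }

    // True when the class itself declares a field named serialVersionUID.
    static boolean declaresSerialVersion(Class<?> type) {
        try {
            type.getDeclaredField("serialVersionUID");
            return true;
        } catch (NoSuchFieldException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        for (Class<?> type : new Class<?>[] {WithUid.class, WithoutUid.class}) {
            System.out.println(type.getSimpleName()
                + " declared=" + declaresSerialVersion(type)
                + " serialVersion=" + serialVersionOf(type));
        }
    }
}
```

Running the same lookup against the old and new builds of a class should give two different values when no serialVersionUID is declared and the class structure has changed, which is enough to tell the versions apart.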