Topic: Followup from meeting: Modularity

As discussed on the meeting in Prague we really need to keep things modular in order to be able to keep things up-do-date and easy to maintain.
There is currently a few things that still hinder this, which we should try to solve, one of them is classType.

his is currently only used by the load balancer and the context files. For both of these it should be very simple to just drop in the class name (string) as a replacement for the classtype (enum). This would remove the need to update this unwieldy enum. 
This means that there will be a few more bytes per element/node to transfer. I think it will be worth the benefits of modularity.

We already have

virtual const char *FEMComponent :: giveInputRecordName() const { }

but a lot of elements haven't overloaded this function when they should (default implementation just returns className() which only occasionally matches).

For Domain it should be straight forward to replace the classtype by this record name:

#define SAVE_COMPONENTS(size, type, giveMethod)    \
    {                                               \
        type *obj;                                  \
        for ( i = 1; i <= size; i++ ) {             \
            obj = giveMethod(i);                    \
            if ( ( mode & CM_Definition ) ) {       \
                if ( !stream->write(obj->giveRecordName()) ) {    \
                    THROW_CIOERR(CIO_IOERR);        \
                }                                   \
            }                                       \
            if ( ( iores = obj->saveContext(stream, mode) ) != CIO_OK ) { \
                THROW_CIOERR(iores);                \
            }                                       \
        }                                           \
    }

#define RESTORE_COMPONENTS(size, type, resizeMethod, creator, giveMethod, setMethod) \
    {                                           \
        type *obj;                              \
        if ( mode & CM_Definition ) {           \
            resizeMethod(size);                 \
        }                                       \
        for ( i = 1; i <= size; i++ ) {         \
            if ( mode & CM_Definition ) {       \
                std::string name;
                if ( !stream->read(name) ) { \
                    THROW_CIOERR(CIO_IOERR);    \
                }                               \
                obj = creator(name.c_str(), 0, this); \
            } else {                            \
                obj = giveMethod(i);            \
            }                                   \
            if ( ( iores = obj->restoreContext(stream, mode) ) != CIO_OK ) { \
                THROW_CIOERR(iores);            \
            }                                   \
            if ( mode & CM_Definition ) {       \
                setMethod(i, obj);              \
            }                                   \
        }                                       \
    }

Borek, could you take a look at the load-balancing/context files?
I don't have any files to try out the load balancing, so I'm reluctant to touch any of these changes in fear of breaking stuff.

2

Re: Followup from meeting: Modularity

Many classes store their classID into context file, which was motivated by ability to check consistency of context file - simply in the context file there were values (classIDs of individual objects) that must agree and could not change. This is used to make sure, that context file information is compatible (created by the same oofem version, etc.).

If we can ensure the compatibility of context files by other means (storing context version in the header, for example), or not storing classID but rather classRecordBeginID (which could be a constant), we can get rid of classIDs completely. I would still prefer the sesend approach, as it is independent on versions and somehow self-consistent, while the first one requires manual context version management.

Re: Followup from meeting: Modularity

Hmm. I must admit that I disgree here.

1. Checking classType is in no way an indiciation that you have binary compatibility. The implementation of a material model could change completely without even touching the classType.
classType would only ever  change when you add a new class, and if you add it in the end, it still wouldn't be noticed.
And adding a new element shouldn't even effect compatibility (it would definitely work if you added to the end of the classType).
So, in the end, it would give you an error message when things probably would have worked, and happily carry out the analysis when it shouldn't.

I'm not sure what classRecordBeginID() would return, but it still definitely won't be enough to verify binary compatibility (e.g. internal changes won't be detected, and those are the only ones that matter for compatibility).

2. I actually would like to have the ability to add a new element, and still continue with old context files, since a new element wouldn't effect anything else. It would almost always work (if you add the classtype to the end, or if we dropped classtype altogether).

3. The reason classID is stored because it's used in the classfactory for restoring. Since we construct elements according to the classID, it would be pointless to compare to it. So it's impossible for constructed elements to check consistency.

Summary of my suggestions:
* For saving and restoring, we need to use a unique identifier from elements. I suggest we use the same idenfitier that we use in the input files: The input record string.
* For checking binary compatibility, i suggest we use the OOFEM version itself, or nothing at all. We can never ensure compatibility anyway (even if we were to use something really fine-grained like the git-hash since local changes aren't reflected). If you modify the source, then you can never expect compatibility.

4

Re: Followup from meeting: Modularity

I fully agree that unique ID has to be stored for almost every component. The point I would like to stress out, that this ID has to be stored by a class maintaining a component (a domain, for example), as this this has to be read by the maintainer in advance to create the component and then restore its state, which is handled by component store/restore method.

However, at present this ID is stored by component itself, and this was the point I mentioned. To check binary compatibility, I think that oofem version is not enough, as it should be changed after adding any new component or changing component store/restore implementation. And this version change has to be made manually. So I think there is a good reason to start every component binary record with a marker, that could be checked in runtime. This will certainly improve the ability to check the consistency in runtime. Not only useful for debugging, but it can detect the incompatibility, when some records have been extended or modified. Of course it will not detect some changes, for example when the order of values stored by component changes. But its better to have at least some checks.

Re: Followup from meeting: Modularity

I think this whole discussion makes it clear how badly classType needs to be removed. It is the very source of all the problems.
Let me make a really really short summary of my thoughts:
Current situation: Changing IDs -> Almost never keep compatibility -> Sort-of-check if ID has changed.
My proposed change: Static IDs -> Always keep compatibility -> No need to check versions.
Because the classTypes values are not constant, it is the very source of the binary incompatibilities you want to check for.
If we change to the input string (or anything with fixed value), all these issues would just disappear right away.

The maintaning classes wouldn't need to know of all potential IDs. The component knows its ID: FEMComponent::giveInputRecordName().
Context version shouldn't need to change after adding a new component. The only reason it would break now is because of the classType to begin with (since all its values are likely offset). By using the static record name (which remains constant even when adding a new component) this binary incompatibility is removed completely!
So, dropping the ever-changing classType for ID and using the (static) input record name is a win-win. It would vastly increase the probability that context files are binary compatible with newer versions of OOFEM.

Technically, the same could be achieved if we hardcoded values for each entry in the classtype, but the huge enum has some huge downsides:

  • Users can't link in external elements/materials etc (drastically limits OOFEMs abilities as a library)

  • Duplicated information means that one version is likely wrong (e.g. the InputFieldType enum was used incorrectly in over 20 places)

  • Forces recompilation of almost everything when you add a new component

  • Large risk for annoying merge conflicts.

  • Much more difficult to package and send a new element to someone (just packaing your new *.C and *.h file isn't enough).

  • More tedious instructions for new users on how to add new components to oofem (change here and there).

  • We either have binary incompatibility (classtype values are not fixed when adding new components) or always have merge conflicts whenever we add something new (having to update numbering manually).

and any one of these downsides is serious enough to get rid of classType.

--------------------------------------------------------------------------------------------------------------------------

Instead of trying to explaining the code, i'll just show the code instead. Today I filled in the last few gaps, making it possible to save and restore context based on the input record name. I tried it out with a context file, and it seems to be working (I made it save and restore the definition as well for testing purposes). I had to change a few things (like restoring material models first, since restoring elements+integration points needs access to the material model), but other than that, it was a very small modification of the SAVE_COMPONENTS and RESTORE_COMPONENTS macros.
Those contexts files will now actually be compatible with new versions of OOFEM (unless, as always, something changes internally).

The load balancing code still uses the classType, but I don't have any samples to run. The change should be trivial but I'm reluctant to do it without being able to check that nothing crashes.

6

Re: Followup from meeting: Modularity

Consider the following scenario:
- someone discovers the missing data in the context file for some specific class
- when this is fixed, the context file becomes incompatible with the previos versions, and this wold require to manually change the context version somewhere, forgetting this would lead to strange behavour.
- someone can also implement incompatible save/restore functions (saving and restoring attributes in different order, saving of one variable without matching restore)

- many of these potential problems can be checked on runtime, when each data record starts with a special predefined new_record value (zero, for example) , that is stored and restored and could be checked. This could detect data alinment errors, incompatibility, etc.

I consider this self-consistency chacks as extremely usefull. In previous versions, the class ID served partially this purpose.

Re: Followup from meeting: Modularity

I'm very well aware of that scenario, and I'm very puzzled by this statement

In previous versions, the class ID served partially this purpose.

since the classID-checks definitely never caught any of these scenarios you decribe. If it ever happened to abort, it would have been by pure chance.
The classID wouldn't change if you added or changed something with restoreContext/saveContext as you describe, so OOFEM would happily carry on with the analysis (but most likely crashing).
The only incompatibility classID could ever possibly catch is when the classID itself had changed (and not even always that). I've simply removed this particular incompatibility by using a type of ID that doesn't change all the time.


I'd also like to point out that this change I'm discussing now is just about the components that are saved/sent. None of these elements/materials/nodes/etc can (or ever have) checked the classID (only static stuff like TimeStep could possibly do that, nothing in the classFactory could possibly work). What would they do, compare their own classType to themselves? It will always by true, because we wouldn't know which was the correct element to begin with. Technically, you could still have that classType-check (which is now pointless because there is no possible classType incompatibilities anymore) if you really really wanted, even with my changes.

You said earlier that you wanted to avoid manual context versions (i agree), but this is exactly what classType is. You want to keep updating it manually (but not always!) and use it to check if OOFEM *might* be incompatible. In fact, if I had to pick, I'd much rather just have a single value to increment than to "increment" the classType with new entries. At least then we wouldn't have all those other negative side-effects from the global enum.

- many of these potential problems can be checked on runtime, when each data record starts with a special predefined new_record value (zero, for example) , that is stored and restored and could be checked. This could detect data alinment errors, incompatibility, etc.

Yes (but zero might be to common to use for alignment checking). And, by using the inputRecordName, if something has changed, the names will be off-aligned and classFactory will definitely fail (since you end up with garbage data in your input string), so the new code actually does this.
But checking classType however, won't do any good. Because the only thing it would ever notice isn't an issue anymore.

So, to sum it up, using the inputRecordName improves binary compatibility, and improves upon the error checking capabilities when things change. Win-win.

Re: Followup from meeting: Modularity

I meant to send this message yesterday, but it seems i forgot:

I pushed some code today and I just wanted the emphasize that none of these changes actually effects whether or we do versioning.
These changes just effect what ID is send/stored in the datastream. This lets us have less duplicated code, and improve binary compatibility.

We still could use classID inside TimeStep to check if we "need" to throw a CIO_BADVERSION error, nothing has changed in this regard.
At this point, that would be the *only* usage of classID, and classType is nothing more than the manually updated version information you also  wanted to avoid.


Some changes affected the load balancing. I worked through those as careful as I could, triple-checking everything, and I think I managed to implement it correctly (if not, then at least it should be simple to fix).

The code now also throws badversion error whenever the classfactory fails to create the requested component (which most likely means we're trying to restore a nonexistent element, or that some internal changes offsets the data and the string contains only garbage). This check alone  should be a lot more robust than TimeSteps classType-checking ever was.