Topic: Preferring string constants to enums

In many places oofem uses enums externally (i.e as values in input files), and possible also saves them in context files as so on.
This makes it hard to ever have binary compability, and forces users to put in awkward constants into input files.

I think it would be much preferable to use constant strings in most places
I propose changing

enum LinSystSolverType {
    ST_Direct = 0,
    ST_IML    = 1,
    ST_Spooles= 2,
    ST_Petsc  = 3,
    ST_DSS    = 4,
    ST_Feti   = 5
};

to

typedef const char * LinSystSolverType; // Optional?

LinSystSolverType ST_Direct = "direct";
LinSystSolverType ST_IML = "iml";
LinSystSolverType ST_Spooles = "spooles";
LinSystSolverType ST_Petsc = "petsc";
LinSystSolverType ST_DSS = "dss";
LinSystSolverType ST_Feti = "feti";

It is more flexible, and more intuitive to use.
The downside is backwards compability with existing scripts and such, but I still think this is worthwhile.

Special case:
In particular, one enum has been used for this a lot, and causes quite some problems; InputFieldType
Right now, all the macros support 2 inputs, one enum and one string, and for a long time, the enum has been ignored.
I rather recently introduced DynamicInputRecord, which actually made use of the enums, but I regret this choice.
I suggest we get rid of the enum, and just introduce a typedef (possible also a namespace)

typedef const char * InputFieldType;
namespace IFT {
Element_nodes = "nodes";
...
}

This would make it possible to create a dynamic input record, than save it to a file, e.g. building a pre-processor helper library for creating OOFEM input files.
Since these IFT-enums are used in-code only, there is no downside to changing this enum to string literals (no backwards compability to brake here).

2

Re: Preferring string constants to enums

I will have a problem to accept this. The main reason is the efficiency, particularly when equality operator is called. Some enums are used very frequently in the code (to test material mode, for example, and this happens in every iteration for every IP) and string based comparison will be very slow, compared to fast enum value equality check, which is basically byte (or integer) comparison, that can be done in one cpu instruction. Perhaps in case of LinSystSolverType this is possible, but not for all enums. Another advantage of enum is the capability of checking values at compile time.
And we can can automatically  generate enum2string and string2enum methods as in case of materialMode enum, for example, where  MaterialModeToString already exist and is generated from macro definition by compiler. The reverse function can be added easily.

Re: Preferring string constants to enums

MateriaMode is not an externally used enum, so I'm not suggesting we change that.

What I mean are the enums values used in input files, which are

  • InputFieldType  - Isn't really used at all right now. It's all strings anyway in input files (since we couldn't possible use the enum values that are constantly changing).

  • LinSystSolverType - Only rarely used, and almost only with input files. Definitely preferable to have strings literals.

  • DofIDItem - Here is one case in which I would like to keep the enum, since its frequently used within the code.

  • NonLinearStatic_controlType, NonLinearStatic_stiffnessMode, (and possibly other enums used in a similar way).

InputFieldType seems like a no-brainer to me. It would let me extend the possibilities of DynamicInputRecord+DynamicDataReader to actually save valid OOFEM input file.

4

Re: Preferring string constants to enums

ok, for the enums that are not frequently tested and there is a good reason we can perhaps switch. However, I still see the use of enums + generated string conversion methods as more natural and safer (compile time checking, compiler can detect unhandled values in if/case constructs, etc.). What is the advantage of using string literals compared to enums with string conversion ?

Re: Preferring string constants to enums

I've considered the String -> Enum approach, but its precisely because of the reason that everything needs to be supported within OOFEM at compile time that I feel is a limitation, it basically breaks the modularity of the code.
If a user were to create his own element in python, and wants to send some input to this new element, it would not be possible to convert any "SomeElementParameter" to an IFT_SomeElement_parameter enum, since it wouldn't exist.
Similar external extensions could be possible if we go with LGPL, if some vendor would like to link with their own internal sparse matrix or solver.
Anything that goes into the classfactory has little to nothing to gain from enums.

For things like NonLinearStatic_controlType, yes, String->Enum would certainly be possible (these are more options, rather than referring to specific objects). These are the types of enums that actually benefit from the typechecking and such (althought that is currently avoided by casting all the controltypes to ints).

It was primarily InputFieldType which I had intention of changing (since I want to make DynamicInputRecord more useful). I also think we could do without "classtype" completely (almost all cases where classtype is used should/could be replaced easily).
These both massively break modularity of the code by collecting everything.

6

Re: Preferring string constants to enums

The InputFieldType was introduced to support future binary format, so that each parameter keyword could be represented by its key (InputFieldType). This would make the parameter value query fast. As you pointed out, it is not used now. However, I think we need binary input in the future and entity record parsing should be efficient, without storing parameter name strings and doing costly string comparisons, while still using existing initializeFrom methods.
What we can do:
- favor flexibility over speed in binary input and use strings only for parameter identification,
- define the keys corresponding to parameter names only locally for each entity, so that they will be assigned (defined) by the entity itself. This will make fast binary i/o possible, while maintaining flexibility. What do you think about this?

Regarding classtype, this was introduced at times, when not all compilers provide run-time type checking functionality (typeid). Hopefully, now its more standardized, and we can get rid of it.

Re: Preferring string constants to enums

I'm looking over the code where giveClassID is used. Many of them can trivially be changed to a dynamic_cast, and some of them shouldn't really be using classtype to begin with (rather they should rely on some suitable virtual method, like "Dof::isPrimaryDof()").
I've made some effort to get rid of the code that relies on classtype (since we already use dynamic_cast in many places), and I believe I've actually fixed a few potential bugs on the way.

As for InputFieldType; the problem i see with the current enum is that it's monolithic and binary-incompatible.
Introducing integer keys locally could work, but I'm not really convinced that it's necessary since it's only ever done during initialization. I think it's just suboptimization.
Just tried a few examples with valgrind. In the worst case scenario (single time step, linear problem), strncasecmp+strcmp only accounted for a tiny 0.27% of the execution time.

Either way, the monolithic enums really needs to go.

Re: Preferring string constants to enums

I've starting adding a few string literals to header files (engngm.h and element.h)
However we decide on enums and efficiency, this should be done anyway.
Coding string literals in C files is no good for libraries, other applications would have to hardcode strings as well. It's just safer to rely on this defined string.

I've done it using this style

///@name Input fields for general Engineering models.
//@{
#define _IFT_EngngModel_nsteps "nsteps"
#define _IFT_EngngModel_contextoutputstep "contextoutputstep"
#define _IFT_EngngModel_renumberFlag "renumber"
#define _IFT_EngngModel_profileOpt "profileopt"
#define _IFT_EngngModel_nmsteps "nmsteps"
#define _IFT_EngngModel_nxfemman "nxfemman"
#define _IFT_EngngModel_nonLinFormulation "nonlinform"
#define _IFT_EngngModel_eetype "eetype"
#define _IFT_EngngModel_parallelflag "parallelflag"
#define _IFT_EngngModel_loadBalancingFlag "lbflag"
#define _IFT_EngngModel_forceloadBalancingFlag "forcelb1"
//@}

which should work fine both for documentation and uniqueness of strings.

Re: Preferring string constants to enums

I've gone through pretty much everything now and added string literals like this.
There are undoubtedly mistakes spread out, not at least because many many classes weren't actually using correct enum values, or the enum values were called something completely different. Plenty of missing or unused enums as well. This is definitely a work in progress.
Make no mistake about it, InputFieldType was (and quite likely still is) riddled with errors and inconsistencies, and would never had worked if it was used for anything in a larger scale. Such a huge global enum which basically contain duplicated information (strings + enum) is doomed to be forgotten and left in a broken state (like it has been for several years now).

I'm about to retire the InputFieldType enum now. Naturally, I'll make sure the tests and benchmarks still run, but there could be an a couple of input fields which didn't get their correct string values.
None of these changes are intended to break any backwards compability in input files, if they do, then it is a bug.

Re: Preferring string constants to enums

I've moved it one step closer to deprecating the huge enum variable.
The next step would be to search&replace the following

IR_GIVE_OPTIONAL_FIELD(ir, vof, IFT_Tr1SUPG_pvof, "pvof");

to

IR_GIVE_OPTIONAL_FIELD(ir, vof, _IFT_Tr1SUPG_pvof);

where _IFT_Tr1SUPG_pvof is defined in the header file. By not retyping the strings in the source files, avoids redundancy which might otherwise lead to inconsistency (especially troubling it OOFEM is used as a dynamic library).
I would also suggest that we introduce

#define _IFT_Tr1SUPG_Record "tr1supg"

in each header file, which the element/material/.../ factory can rely on. That way we can isolate everything about the new component into one single C+h combination for maximum modularity.