modelutil: Field, State, and StateSet Objects



The Field, State, and StateSet classes provide the objects that hold and manage model variables. Specifically:

  • Field: This container object holds the variable data as an array. Slices/subarrays of data are specified using Numeric/numarray syntax, and many key Numeric/numarray and MA/ma functions and methods are defined for Field objects.
  • State: This container object is a mapping holding any number of Field objects, all at the same moment in time or the same "tendency" in time (actually the time differential tendency multiplied by one timestep, i.e. the value to add to the variable at the current time to increment one timestep forward). The keys for the State objects are the id attributes from the Field objects.
  • StateSet: This container object is a mapping holding any number of State objects, each at different time steps. The keys for the StateSet objects are the id attributes from the State objects.

For containers that are mappings, to add or delete an object, you can use mapping syntax (this is just like dictionaries). For example, say you have Field class instances f1, f2, and f3, and State class instance s1 which holds f1 and f2.

To delete f2 from s1:

del s1[f2.id]

So, if id in f2 equaled "u", the above line is equivalent to del s1["u"].

To add f3 to s1:

s1[f3.id] = f3

Note that if there is already a Field object in s1 with the same id as f3, the above line will associate that key with f3 instead. This is just like dictionaries. (The add method is actually the preferred way to add new objects to State and StateSet; it's described below.)

Most standard mapping methods (e.g. clear and has_key, as found in dictionaries) are also defined for containers that are mappings.
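The mapping behaviour described above can be sketched with plain Python stand-ins. MiniField and MiniState below are hypothetical classes written only for illustration; they are not the modelutil classes and only mimic the keyed-by-id behaviour.

```python
# Hypothetical stand-ins for illustration only; not the modelutil classes.
class MiniField(object):
    def __init__(self, data, id=None):
        self.id = id
        self._data = data


class MiniState(object):
    """Mapping of MiniField objects keyed by each object's id attribute."""

    def __init__(self, *fields):
        self._data = {}
        for f in fields:
            self._data[f.id] = f

    def __getitem__(self, key):
        return self._data[key]

    def __setitem__(self, key, value):
        self._data[key] = value

    def __delitem__(self, key):
        del self._data[key]

    def __contains__(self, key):
        return key in self._data

    def keys(self):
        return sorted(self._data.keys())


f1 = MiniField([1.0, 2.0], id="T")
f2 = MiniField([3.0, 4.0], id="u")
f3 = MiniField([5.0, 6.0], id="v")
s1 = MiniState(f1, f2)

del s1[f2.id]       # same as del s1["u"]
s1[f3.id] = f3      # adds f3 under the key "v"
print(s1.keys())    # ['T', 'v']
```

The special methods simply forward to the private dictionary, which is why the container behaves like a dictionary from the outside.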

The pydoc generated documentation for each class (Field, State, StateSet) gives detailed information about the API for these classes. The sections below highlight the coding principles used, and the more important methods, attributes, and functions.

Coding Principles

Attribute initialization and addition: Data are passed in via positional arguments. Data for a Field object is passed in via a single positional input argument (Field data is an array). An assortment of Field objects are passed in to a State object as positional arguments, and likewise for a StateSet collection of State objects.

Public attributes are set via keywords on object instantiation. The keywords are the attribute names and their values are the attribute values. Public attributes not given in the keyword argument list are set to None on instantiation.

For Field objects, the list of allowed attributes (both public and private) is limited to the following: id, long_name, units, extra_meta, _data. These limits are enforced, and any attempt to add an attribute not on this list will throw an exception. The list is given in the private variable _ok_field_attrib of the field module. All these attributes are set on object instantiation (to a default value if not specified in the argument list).
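One way such a restriction could be enforced is by overriding __setattr__. The sketch below is an assumption about the general technique, not the modelutil source: RestrictedField is a hypothetical class, the exception type is a guess, and extra_meta defaults to None here only for brevity.

```python
# Hypothetical sketch; the real Field class may enforce its limits differently.
_ok_field_attrib = ("id", "long_name", "units", "extra_meta", "_data")


class RestrictedField(object):
    def __init__(self, data, **kwds):
        self._data = data
        # Every allowed public attribute gets a value, None by default.
        for name in ("id", "long_name", "units", "extra_meta"):
            setattr(self, name, kwds.pop(name, None))
        if kwds:
            raise AttributeError("unknown attributes: %s" % sorted(kwds))

    def __setattr__(self, name, value):
        # Refuse any attribute name that is not on the allowed list.
        if name not in _ok_field_attrib:
            raise AttributeError("attribute %r is not allowed" % name)
        object.__setattr__(self, name, value)


f = RestrictedField([1.0, 2.0], id="u", units="m/s")
print(f.long_name)      # None

try:
    f.colour = "red"    # not on the allowed list
    rejected = False
except AttributeError:
    rejected = True
print(rejected)         # True
```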

For State and StateSet objects, the rules are more lenient. There is a limited number of attributes that are set on object instantiation, but after instantiation there are no limits on the setting or changing of attributes. The list of public attributes set on object instantiation (undefined values are set to None) is given in private instance attribute _ok_init_kwds. See the pydoc documentation and source code for details.

Container data: The data for all three container classes (Field, State, StateSet) is stored in the private variable _data. In Field objects _data is set by reference to the array passed in on object instantiation. (Note that if the value is a scalar, list, or tuple, _data is instead set to a Numeric/numarray copy of the input data.) In State and StateSet objects _data is a dictionary (empty if no objects are placed in it on instantiation).

Most special container methods (e.g. __contains__, __getitem__) are defined for the objects and act on _data.

Common Attributes/Methods

This section discusses methods that are common to more than one of the Field, State, and StateSet classes.

Object name: The id attribute specifies the name of each class of object. For Field and State objects, the id attribute is also the key to the object, if it is part of a State or StateSet object (respectively). Both Field and State objects have the long_name attribute which gives additional information on the object.

Adding several objects to State and StateSet objects: State and StateSet provide the add method to add any number of additional Field and State (respectively) objects to the existing State and StateSet object. For both classes, the syntax of this method is:

s.add(a, b)

If a and b are Field objects, and s is a State object, a and b will be added to the existing Field objects in s. Field objects in State objects, and State objects in StateSet objects, are referenced by the id attribute in the constituent Field and State objects, respectively.

The add method adds the objects left-to-right in the order given in the argument list. Each member of the argument list must have a unique id attribute; otherwise an exception is thrown. If the key associated with any of those objects already exists in the container, the pre-existing value associated with that key is overwritten with the new value. Thus, in the above example, if b has the same id as an existing Field object in s, b will become associated with that key in s, replacing the prior value.

This method exists because it can be cumbersome to add each member object individually using dictionary slicing syntax.
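The add semantics described above (unique ids within one call, left-to-right insertion, overwrite of existing keys) can be sketched as follows. Item and Container are hypothetical stand-ins, and the use of ValueError is an assumption; the text only says that an exception is thrown.

```python
# Hypothetical sketch of the add semantics; not the modelutil source.
class Item(object):
    def __init__(self, id, value):
        self.id = id
        self.value = value


class Container(object):
    def __init__(self):
        self._data = {}

    def add(self, *args):
        ids = [obj.id for obj in args]
        # Ids within one call must be unique.
        if len(set(ids)) != len(ids):
            raise ValueError("duplicate id in argument list")
        # Added left to right; an existing key is overwritten.
        for obj in args:
            self._data[obj.id] = obj


s = Container()
s.add(Item("u", 1), Item("v", 2))
s.add(Item("v", 20), Item("w", 3))   # "v" is replaced
print(sorted(s._data))               # ['u', 'v', 'w']
print(s._data["v"].value)            # 20
```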

Copying: All three container classes (Field, State, and StateSet) provide two methods for copying all or parts of themselves. Each method returns objects of the same class as the source object, and duplicated values do not share memory with the source values. In general the copied parts are deep copies:

  • copy(): Returns a copy of the entire object.
  • copymeta(): Returns a copy of the metadata of the entire object. The data in the object is not copied; thus for a Field object an empty array is returned for the data while for State and StateSet objects an empty dictionary is returned for the data.

Note that in the copymeta description, by "metadata" we mean metadata for that object. Thus the "metadata" for a State object does not include the metadata attached to the Field objects making up the data of the State object.
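The difference between the two methods can be illustrated with a small stand-in. MetaField below is a hypothetical class, not the modelutil Field class, and a plain list stands in for the array data.

```python
import copy


# Hypothetical stand-in; not the modelutil Field class.
class MetaField(object):
    def __init__(self, data, id=None, units=None):
        self._data = data
        self.id = id
        self.units = units

    def copy(self):
        # Deep copy: the duplicate shares no memory with the source.
        return copy.deepcopy(self)

    def copymeta(self):
        # Metadata only; the data container of the result is empty.
        return MetaField([], id=self.id, units=self.units)


a = MetaField([1.0, 2.0], id="u", units="m/s")
b = a.copy()
c = a.copymeta()

b._data[0] = 99.0
print(a._data)    # [1.0, 2.0]  (unchanged by the edit to b)
print(c.units)    # m/s
print(c._data)    # []
```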

Field and State also each include unique methods for copying other parts of themselves. Those are described in the "Class Specific Attributes/Methods" section below. The field module also includes functions that make copies of all or part of a Field object. For instance, copymeta_largest returns a copy of the metadata of the "largest" Field object in a list of Field objects.

Setting/referencing data: In all three container classes (Field, State, and StateSet) setting and referencing (as opposed to slicing) of the data stored in _data are done by the following public methods:

  • refdata(): Reference the _data object.
  • setdata(arg): Set _data, by reference, to the argument arg for Numeric/numarray and MA/ma arrays, and by value for lists, tuples, and scalars.

Note that plain assignment does not do the same thing as setdata. If a is a Field variable, then:

a = b

will rebind the entire object a to b. If b is a numarray array, a will not be a Field object anymore but a numarray array.

To bind the a data container to the b array, instead use:

a.setdata(b)

a is still a Field object, but its private attribute _data is now bound to b.

refdata and setdata are very powerful methods and must be used carefully! With these methods you are accessing the entire container, not elements in that container. You could redefine the container if you're not careful. If you use setdata in a State variable, and you provide an argument that is not of dictionary type, you will break nearly all of State's container mapping special methods. (If you want to add a Field variable to an existing State object you should use the add method.)
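The by-reference and by-value rules can be seen in a minimal sketch. RefField is a hypothetical stand-in, and the standard library array type is used here in place of a Numeric/numarray array purely so the example is self-contained.

```python
from array import array


# Hypothetical stand-in; not the modelutil Field class.
class RefField(object):
    def __init__(self, data, id=None):
        self.id = id
        self.setdata(data)

    def refdata(self):
        return self._data

    def setdata(self, arg):
        if isinstance(arg, array):
            self._data = arg                 # by reference
        elif isinstance(arg, (list, tuple)):
            self._data = array("d", arg)     # by value (a copy)
        else:
            self._data = array("d", [arg])   # scalar, by value


a = RefField([1.0, 2.0], id="u")
b = array("d", [7.0, 8.0, 9.0])

a.setdata(b)             # a is still a RefField; its data is now b
print(a.refdata() is b)  # True

b[0] = -1.0              # shared memory: the change is visible through a
print(a.refdata()[0])    # -1.0

a = b                    # plain assignment rebinds the name a
print(type(a) is array)  # True
```

Because the container and the array share memory after setdata, later in-place changes to the array also change the container's data.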

The use of these public methods also occurs in the internals of the three variable classes. For instance, in the code defining Field objects, all statements that set _data by reference to specified data use setdata, and all references to _data use refdata. Other types of set commands (e.g. in-place operations) in the Field internals may not use setdata, instead operating on _data directly. For State and StateSet code, some of the class definition internals use setdata and refdata, but these classes are not uniform in whether the public methods are used or _data is manipulated directly.

Managing/manipulating data: Because the container data stored in the Field object is a Numeric/numarray array, masked or unmasked, manipulation of Field data follows the rules and uses the methods of Numeric/numarray. Because the container data in State and StateSet objects is stored as a dictionary, these two classes also define a number of the key dictionary methods (e.g. has_key, items, keys, and values).

Checking metadata: One common problem in modeling is ensuring that "interface variables," i.e. variables that are passed between models/submodels, are used consistently in all applicable scopes. The meta_ok_to_interface method (which is defined for Field, State, and StateSet objects) fulfills this task, returning True if the object's metadata passes certain consistency checks. More information is found in this description of checking Field metadata with meta_ok_to_interface and this description of the InterfaceMeta class.

Simple parallelization: One simple way of parallelizing a model is by breaking up the model spatial domain into parts (assuming that each element of the domain does not communicate with any other element in the domain), and assigning computations on those parts to different processors. At the end of those computations the parts are reintegrated into the whole. This is the function of the wrap method, defined in all three model variable container classes (Field, State, and StateSet). Usually, the wrap method for a Field object is called right before the model adds the Field object to the model's tendency dictionary (via add_tend1). The code for the semtner0 model illustrates this usage.

This type of simple parallelization is not yet implemented in modelutil; the wrap method is currently a stub and does nothing.

Class Specific Attributes/Methods

This section discusses the use of some of the more important methods and attributes specific to each class. The classes have other methods and attributes you might be interested in; see their pydoc documentation (Field, State, StateSet) for details.

Field Objects

Metadata and metadata management: One key feature of Field objects is that they contain all the necessary metadata to properly describe the variable for the purposes of use in a model. In a Field object key metadata are specified as attributes while secondary metadata is stored in an ExtraMeta object in the Field object's extra_meta attribute. Details regarding how and what metadata are specified are found here.

Metadata management in a Field object is provided by the following methods:

  • clear_all_meta(): Sets metadata attributes to None. Sets the extra_meta attribute to an empty ExtraMeta object.
  • clear_extra_meta(): Sets the extra_meta attribute to an empty ExtraMeta object.
  • copymeta(): Returns a copy of the metadata of the entire object. The data in the object is not copied; thus for a Field object an empty array is returned for the data.
  • replace_all_meta(arg, **kwds): Erases all metadata values. Sets metadata to the values in the Field object given by arg or to the keyword parameters given in **kwds. The data in the Field container object is unaltered.

The replace_all_meta method is particularly useful when you wish to copy all the metadata from one Field object to another. This often happens at the end of a series of calculations.

Operations: In many ways, Field objects can be used as Numeric/numarray or MA/ma arrays. Many mathematical operations (e.g. cos, exp, sqrt) and key array manipulation functions (e.g. ravel, reshape, where) are implemented as field module functions that operate on Field objects. Standard operators (e.g. +, **) and comparison operators (e.g. >, <=), as well as select array methods (e.g. astype, typecode), are defined for Field objects. Some of these methods are described below; see also the Field pydoc documentation for details.

When using operators (e.g. +, -) with Field objects, execution order can be important. Consider three variables a, b, and c. a is a Field object, and b and c are products of a with a numarray array. The only difference between b and c is the order of multiplication:

>>> from modelutil.field import Field
>>> import numarray as N
>>> a = Field([3.2, 4.4], id='a')
>>> b = a * N.array(1.1)
>>> c = N.array(1.1) * a

In the case of b, because a comes first in the product, the multiplication is done using the __mul__ method in Field. As a result, b is a Field object:

>>> type(b)
<class 'modelutil.field.Field'>

In contrast, for c, because the numarray array comes first in the product, the multiplication is done using the __mul__ method in numarray. As a result, c is a numarray object:

>>> type(c)
<class 'numarray.numarraycore.NumArray'>

In the above example, the calculation still worked even though the operation order was reversed. However, the effect of operation order can be significant, particularly if a Field variable's data is masked. To prevent errors of this type, order operations with Field objects so that the result is also a Field object.
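The dispatch rule behind this behaviour is a general Python one: for x * y, Python first tries the __mul__ method of the left operand. The sketch below uses two hypothetical classes, Wrapped (standing in for Field) and PlainArray (standing in for an array type), to show the effect without needing modelutil or numarray.

```python
# Hypothetical classes for illustration; not modelutil or numarray types.
class PlainArray(object):
    def __init__(self, values):
        self.values = list(values)

    def __mul__(self, other):
        # Accepts anything exposing .values and returns a PlainArray.
        return PlainArray(x * y for x, y in zip(self.values, other.values))


class Wrapped(object):
    def __init__(self, values, id=None):
        self.values = list(values)
        self.id = id

    def __mul__(self, other):
        # The result keeps the wrapper type.
        return Wrapped(x * y for x, y in zip(self.values, other.values))


a = Wrapped([3.0, 4.0], id="a")
n = PlainArray([2.0, 2.0])

b = a * n   # Wrapped.__mul__ is tried first
c = n * a   # PlainArray.__mul__ is tried first
print(type(b).__name__)        # Wrapped
print(type(c).__name__)        # PlainArray
print(b.values == c.values)    # True
```

The numbers agree, but only b keeps the wrapper type and whatever extra behaviour that type carries.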

Copying: Besides the copy and copymeta methods, Field objects can also be copied with the following methods:

  • asMA(): Returns a copy of the entire object except the copied data is a masked MA/ma array.
  • astype(typecode): Returns a copy of the entire object with the data cast to the type specified by typecode.
  • filled(fill_value=None): Returns a copy of the entire object except the copied data is a plain unmasked Numeric/numarray array, with masked values filled by keyword argument fill_value (or by self.fill_value() if fill_value is None).

Slicing: Field objects follow standard Python array slicing syntax and rules. However, slices of Field objects return regular array objects:

>>> from modelutil.field import Field
>>> a = Field([[ 2.4, 5.5, -2.0] \
              ,[-1.2, 3.6,  9.2]], id='u')
>>> print a
array([[ 2.4,  5.5, -2. ],
       [-1.2,  3.6,  9.2]])
>>> b = a[:, 0:2]
>>> b
array([[ 2.4,  5.5],
       [-1.2,  3.6]])
>>> type(a)
<class 'modelutil.field.Field'>
>>> type(b)
<class 'numarray.numarraycore.NumArray'>

Standard slicing syntax, however, assumes the array has a pre-determined dimension structure. To make models truly flexible you want to slice axes based upon the value along a dimension (e.g. a latitude or height value) as opposed to the index along that dimension. The slice_fixed method provides a way to make such a slice of a Field object: it both slices the data and adjusts the metadata accordingly. The slice_fixed_setdata method functions similarly except that it replaces the elements selected by slice_fixed with values from another array or a scalar. The section on writing flexible models as well as the pydoc documentation for slice_fixed and slice_fixed_setdata provide more information.
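The idea of slicing by value rather than by index can be sketched in a few lines. The function slice_by_value below is hypothetical and deliberately does not reproduce the signature of slice_fixed (see the pydoc documentation for that); it only shows the principle for a one-dimensional axis, with plain lists standing in for arrays.

```python
def slice_by_value(data, axis_values, lower, upper):
    """Keep elements whose axis value lies in [lower, upper]."""
    keep = [i for i, v in enumerate(axis_values) if lower <= v <= upper]
    new_data = [data[i] for i in keep]
    new_axis = [axis_values[i] for i in keep]
    return new_data, new_axis


lat = [-60.0, -30.0, 0.0, 30.0, 60.0]
temp = [270.0, 285.0, 300.0, 286.0, 271.0]

# Select the tropics by latitude value, not by index position.
sub_temp, sub_lat = slice_by_value(temp, lat, -30.0, 30.0)
print(sub_lat)    # [-30.0, 0.0, 30.0]
print(sub_temp)   # [285.0, 300.0, 286.0]
```

Returning the trimmed axis along with the data mirrors the point made above: a value-based slice has to keep the coordinate metadata consistent with the selected data.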

State Objects

Error checking: Besides the meta_ok_to_interface metadata checking method, the State class provides the following methods to use in checking for (potential) errors:

  • is_conformable(ignore=None, include=None): Returns True if the data in all Field objects in the State object are conformable. The ignore and include keywords enable you to list Field objects to ignore or include (respectively) in the conformability test. Use this to prevent executing models where you know from the outset that the input has non-conformable array sizes (and will throw an error when the arrays are used together in operations).
  • max_shape(): Returns the shape of the array that is large enough that the data for all non-axis Field objects in the State object can be stored in that array.

Accessing subsets: These methods enable you to access subsets of the set of Field objects making up the State object:

  • copysubset(keylist): Creates a new State object that is a deep copy of the list of Field objects in keylist.
  • subset(keylist): Creates a new State object that is a copy of the list of Field objects in keylist. In this copy, the data of the Field object are specified by reference to the original data.
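The practical difference between the two methods is whether later changes to the original data show up in the subset. In the sketch below a plain dictionary of lists stands in for a State object holding Field data, and the two functions are hypothetical stand-ins for the methods of the same names.

```python
import copy


# Hypothetical stand-ins operating on a plain dictionary of lists.
def subset(state, keylist):
    # New container, but the values are the very same objects.
    return dict((k, state[k]) for k in keylist)


def copysubset(state, keylist):
    # New container holding deep copies of the selected values.
    return dict((k, copy.deepcopy(state[k])) for k in keylist)


state = {"u": [1.0, 2.0], "v": [3.0, 4.0], "T": [280.0, 281.0]}
shared = subset(state, ["u", "v"])
independent = copysubset(state, ["u", "v"])

state["u"][0] = -5.0
print(shared["u"][0])        # -5.0
print(independent["u"][0])   # 1.0
```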

StateSet Objects

Timestep attribute: The StateSet timestep (the difference between the two consecutive States "t0" and "tp1") is stored in the instance attribute delt.

Error checking: Besides the meta_ok_to_interface metadata checking method, the StateSet class provides the time_ok method to check if the time-related parameters in the StateSet object pass consistency checks.

Informational methods: These methods provide, in handy formats, information about the State objects that make up the StateSet object (e.g. ordered lists of State ids):

  • sort_stateid(): Returns a list of the State ids in time-ascending order.
  • state_list(): Returns a list of 2-element tuples, each of which gives the id and long_name of each State variable in the StateSet object.
  • stateid2int(): Translates all non-"tendency" State ids to integers and returns these integers as a list. Note the similarities and differences between this method and the stateid2int module function in stateset.

Management of constituent State variables: Besides the add method which adds State variables to a StateSet object, the following methods are provided to delete or re-order the State variables in a StateSet object:

  • shift_time_stateid(nstep): Shift State objects nstep timesteps and delete the "tendencies" in the StateSet object.
  • del_earliest: Removes the earliest State object from the StateSet container.
  • del_latest: Removes the latest State object from the StateSet container.

Functions For Use With These Classes

There are two types of functions described here: (1) Functions that manipulate and use instances of Field, State, or StateSet; (2) Utility functions useful for the definition of these classes.

Numerical functions for Field objects: Many standard mathematical and array functions and operations are defined for Field objects. Most of them are defined using one of the following support classes found in the field module: BinaryFunction, BinaryOperation, NullaryOperation, UnaryFunction, and UnaryOperation.

Most numerical functions and operations return a Field object with all non-data attributes set to empty (either None or an empty instance of ExtraMeta). This is done because such functions and operations can often change the metadata of a Field object, such as the units. Details are given in the module docstring documentation for the field module.

Function to select and copy the metadata of the "largest" object in a set of Field objects: The copymeta_largest function in the field module takes a set of Field objects, finds the "largest" Field object in that set (where "largest" can roughly be described as the Field object with the largest array), and copies the metadata of that Field object. A discussion of how this function is used is found here.

Informational functions for State objects: These functions are used to help obtain and manipulate information, in a handy format, on lists of State variables:

  • int2stateid(inlist): For inlist list of integers, translates the values into the ids of State objects. This is essentially the inverse of stateid2int.
  • stateid2int(inlist): For inlist list of non-"tendency" ids of State objects, translates the ids into a list of integers.
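The inverse relationship between these two functions can be sketched as below. The id format used here ("t0" for the current step, "tpN" for N steps ahead, "tmN" for N steps back) is an assumption extrapolated from the "t0" and "tp1" ids mentioned earlier; the function names carry a _sketch suffix to make clear they are not the stateset module functions. Check the pydoc documentation for the actual convention.

```python
# Hypothetical sketch; the id convention ("tmN", "t0", "tpN") is assumed.
def stateid2int_sketch(inlist):
    out = []
    for sid in inlist:
        if sid == "t0":
            out.append(0)
        elif sid.startswith("tp"):
            out.append(int(sid[2:]))
        elif sid.startswith("tm"):
            out.append(-int(sid[2:]))
        else:
            raise ValueError("not a time-step id: %r" % sid)
    return out


def int2stateid_sketch(inlist):
    out = []
    for n in inlist:
        if n == 0:
            out.append("t0")
        elif n > 0:
            out.append("tp%d" % n)
        else:
            out.append("tm%d" % -n)
    return out


ids = ["tm1", "t0", "tp1"]
ints = stateid2int_sketch(ids)
print(ints)                       # [-1, 0, 1]
print(int2stateid_sketch(ints))   # ['tm1', 't0', 'tp1']
```

Integers sort naturally in time order, which is one reason a translation of this kind is handy when ordering State ids.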

