modelutil: Field, State, and StateSet Objects
The Field, State, and StateSet
classes provide the objects that hold and manage model variables.
Specifically:
Field:
This container object holds the variable data as an array.
Slices/subarrays of data are specified using
Numeric/numarray syntax, and many key
Numeric/numarray and
MA/ma functions and methods are
defined for Field objects.
State:
This container object is a mapping holding any number of
Field objects, all at the same moment in time
or the same "tendency" in time
(actually the time differential tendency
multiplied by one timestep, i.e. the value to add to
the variable at the current time to increment
one timestep forward).
The keys for the State
objects are the id attributes from the
Field objects.
StateSet:
This container object is a mapping holding any number of
State objects, each at different time steps.
The keys for the StateSet
objects are the id attributes from the
State objects.
For containers that are mappings, to add or delete an object,
you can use mapping syntax (this is just like dictionaries).
For example, say you have Field
class instances f1, f2, and f3,
and State class instance s1 which holds
f1 and f2.
To delete f2 from s1:
del s1[f2.id]
So, if id in f2 equaled "u",
the above line is equivalent to del s1["u"].
To add f3 to s1:
s1[f3.id] = f3
Note that if s1 already holds a Field object whose
id equals f3.id, the above line rebinds that key
to f3, replacing the earlier object. This is just like
dictionaries.
(The add method is the preferred way to add new objects
to State and StateSet; it is described
below.)
Most standard mapping methods found in dictionaries
(e.g. clear, has_key)
are also defined for containers that are mappings.
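The mapping behaviour described above can be illustrated with a plain Python dictionary standing in for a State object. This is only a sketch: FakeField is a hypothetical stand-in class, not the modelutil Field class, and the dictionary s1 merely mimics the keyed-by-id behaviour.

```python
class FakeField:
    """Hypothetical stand-in for a Field: only carries data and an id."""
    def __init__(self, data, id=None):
        self.data = data
        self.id = id


f1 = FakeField([1.0, 2.0], id="T")
f2 = FakeField([3.0, 4.0], id="u")
f3 = FakeField([5.0, 6.0], id="v")

# A plain dict plays the role of the State object s1, keyed by each id.
s1 = {f1.id: f1, f2.id: f2}

del s1[f2.id]       # same as: del s1["u"]
s1[f3.id] = f3      # add f3 under the key "v"

# Assigning to a key that already exists replaces the earlier object.
f3_new = FakeField([7.0, 8.0], id="v")
s1[f3_new.id] = f3_new

print(sorted(s1))           # ['T', 'v']
print(s1["v"] is f3_new)    # True
```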
The pydoc generated documentation for each class
(Field,
State,
StateSet)
gives detailed information about the API for these classes.
The sections below highlight the coding principles used,
and the more important methods, attributes, and functions.
Attribute initialization and addition:
Data are passed in via positional arguments. Data
for a Field object is passed in via a single
positional input argument (Field data is an
array).
An assortment of Field
objects are passed in to a State object as
positional arguments, and likewise for a StateSet
collection of State objects.
Public attributes are set via keywords on object instantiation.
Each keyword is the attribute name and its
value is the attribute value. Public attributes not given in
the keyword argument list are set to None
on instantiation.
For Field objects, the list of attributes
(both public and private) which are allowed is limited
to the following:
id,
long_name,
units,
extra_meta,
_data.
These limits are enforced, and any attempt to add an attribute
not on this list will throw an exception.
This list is
given in the private variable
_ok_field_attrib of the field module.
All these attributes are set on object instantiation (to a
default value if not specified in the argument list).
For State and StateSet objects,
the rules are more lenient. There is a limited
number of attributes that are set on object instantiation,
but after instantiation there are no limits on the setting or
changing of attributes. The list of public attributes set on
object instantiation (undefined values are set to None)
is given in private instance attribute _ok_init_kwds.
See the pydoc documentation and source code for details.
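One way such an attribute restriction could be enforced is sketched below. RestrictedField is a hypothetical class written for illustration, not the modelutil implementation; the tuple _ok_attrib is assumed to mirror the list above, and the choice of AttributeError is an assumption, since the text only says that an exception is thrown.

```python
class RestrictedField:
    """Hypothetical sketch of a class that allows only a fixed attribute set."""

    # Assumed to mirror the list given in the text; in modelutil the list
    # lives in the field module's private variable _ok_field_attrib.
    _ok_attrib = ("id", "long_name", "units", "extra_meta", "_data")

    def __init__(self, data, **kwds):
        self._data = data
        # Public attributes not passed as keywords default to None.
        for name in self._ok_attrib:
            if name != "_data":
                setattr(self, name, kwds.pop(name, None))
        if kwds:
            raise AttributeError("unknown attributes: %s" % sorted(kwds))

    def __setattr__(self, name, value):
        # Reject any attribute that is not on the allowed list.
        if name not in self._ok_attrib:
            raise AttributeError("attribute %r is not allowed" % name)
        object.__setattr__(self, name, value)


f = RestrictedField([1.0, 2.0], id="u", units="m/s")
print(f.id, f.units, f.long_name)   # u m/s None

try:
    f.colour = "red"
except AttributeError as err:
    print("rejected:", err)
```

For State and StateSet, which the text describes as more lenient, no such __setattr__ guard would be needed after instantiation.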
Container data:
The data for all three container classes
(Field, State, StateSet)
is stored in the private variable _data.
In Field objects
_data is set by reference to the array passed in on object
instantiation. (Note if the value is a scalar, list, or tuple,
_data is set to a Numeric/numarray
copy of the input data.)
In State and StateSet objects
_data is a dictionary (empty if there are no
objects placed in it on instantiation).
Most special container methods
(e.g. __contains__, __getitem__)
are defined for the objects and act on
_data.
This section discusses methods that are common to more than
one of the
Field, State, and
StateSet classes.
Object name:
The id attribute specifies the name of each
class of object. For Field and State
objects, the id attribute is also the key to the object,
if it is part of a State or StateSet
object (respectively). Both Field and State
objects have the long_name attribute which gives additional
information on the object.
Adding several objects to
State and StateSet objects:
State and StateSet
provide the add method to add any number of additional
Field and State (respectively) objects
to the existing State and StateSet object.
For both classes, the syntax of this method is:
s.add(a, b)
If a and b are Field
objects, and s is a State object,
a and b will be added to the existing
Field objects in s.
Field objects in State objects, and
State objects in StateSet objects,
are referenced by the id attribute
in the constituent Field and State objects,
respectively.
The add method adds the objects left-to-right
in the order given in the argument list. The members of the
argument list must have id attributes that are unique among themselves;
otherwise an exception is thrown.
If the key associated with
any of those objects already exists in the container, the pre-existing value associated
with that key is overwritten with the new value. Thus, in the
above example, if b has the same id as
an existing Field object in s,
b will become associated
with that key in s, replacing the prior value.
This method exists because it can be cumbersome to add each member object individually using mapping assignment syntax.
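The rules for add can be sketched with a small dict-backed container. MiniState and Item are hypothetical stand-ins, not modelutil classes, and the use of ValueError for duplicate ids is an assumption (the text does not name the exception type).

```python
class Item:
    """Hypothetical stand-in for a Field: just an id and a value."""
    def __init__(self, id, value):
        self.id = id
        self.value = value


class MiniState:
    """Hypothetical dict-backed container keyed by each member's id."""

    def __init__(self, *members):
        self._data = {}
        self.add(*members)

    def add(self, *members):
        ids = [m.id for m in members]
        # The arguments must not share an id among themselves.
        if len(set(ids)) != len(ids):
            raise ValueError("duplicate id in argument list")
        # Members are added left to right; an existing key is overwritten.
        for m in members:
            self._data[m.id] = m

    def __getitem__(self, key):
        return self._data[key]

    def keys(self):
        return list(self._data.keys())


s = MiniState(Item("u", 1.0))
a, b = Item("T", 280.0), Item("u", 2.0)
s.add(a, b)                 # "u" already exists in s, so b replaces it

print(sorted(s.keys()))     # ['T', 'u']
print(s["u"].value)         # 2.0
```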
Copying:
All three container classes
(Field, State, and StateSet)
provide two methods for copying all or parts of themselves.
Each method returns objects of the same class as the source
object, and duplicated values do not share memory with the source
values. In general the copied parts are deep copies:
copy(): Returns a copy of the entire object.
copymeta(): Returns a copy of the metadata of
the entire object. The data in the object is not copied; thus
for a Field object an empty array is returned
for the data while for State and
StateSet objects an
empty dictionary is returned for the data.
Note that in the copymeta description,
by "metadata" we mean metadata for that object.
Thus the "metadata" for a State object
does not include the metadata attached to the Field
objects making up the data of the State object.
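The difference between the two copy methods can be sketched as follows. MiniField is a hypothetical stand-in that uses a list for its data and the standard copy module for deep copies; it is not the modelutil Field class.

```python
import copy


class MiniField:
    """Hypothetical stand-in: a list as data plus two metadata attributes."""

    def __init__(self, data, id=None, units=None):
        self._data = data
        self.id = id
        self.units = units

    def copy(self):
        # Deep copy: the duplicate shares no memory with the source.
        return copy.deepcopy(self)

    def copymeta(self):
        # Same metadata, but the data container is left empty.
        return MiniField([], id=copy.deepcopy(self.id),
                         units=copy.deepcopy(self.units))


src = MiniField([1.0, 2.0, 3.0], id="u", units="m/s")
full = src.copy()
meta = src.copymeta()

full._data[0] = 99.0                      # does not touch src
print(src._data)                          # [1.0, 2.0, 3.0]
print(meta.id, meta.units, meta._data)    # u m/s []
```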
Field and State also each include
unique methods for copying other parts of themselves. Those are
described in the "Class Specific Attributes/Methods" section below.
The field module
also includes functions that make copies of all or part of a
Field object.
For instance,
copymeta_largest
returns a copy of the metadata of the
"largest" Field object in a list of
Field objects.
Setting/referencing data:
In all three container classes
(Field, State, and
StateSet)
setting and referencing (as opposed to slicing) of the data
stored in _data are done by the following
public methods:
refdata(): Reference the _data object.
setdata(arg):
Set _data, by reference, to the argument arg
for Numeric/numarray and
MA/ma arrays, and by value
for lists, tuples, and scalars.
Note that plain assignment does not do the same thing as
setdata. If a is a Field
variable, then:
a = b
will rebind the name a to b.
If b is a numarray array,
a will not be a Field object anymore but
a numarray array.
To bind the a data container to the
b array, instead use:
a.setdata(b)
a is now a Field object with
private attribute _data bound to b.
refdata and setdata
are very powerful methods and must be used carefully!
With these methods you are accessing the entire container,
not elements in that container. You could redefine the container
if you're not careful. If you use setdata in a
State variable, and you provide an argument that is not
of dictionary type, you will break nearly all of State's
container mapping special methods. (If you want to add a
Field variable to an existing State object
you should use the add method.)
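The contrast between setdata and plain assignment can be sketched with the standard library alone. Here MiniField is a hypothetical stand-in for Field, and array.array plays the role of a Numeric/numarray array; the by-reference and by-value branches follow the description above but are not the modelutil source.

```python
from array import array


class MiniField:
    """Hypothetical stand-in for Field; array.array stands in for a
    Numeric/numarray array."""

    def __init__(self, data):
        self.setdata(data)

    def setdata(self, arg):
        if isinstance(arg, array):
            self._data = arg                 # arrays: set by reference
        elif isinstance(arg, (list, tuple)):
            self._data = array("d", arg)     # lists/tuples: set by value
        else:
            self._data = array("d", [arg])   # scalars: set by value

    def refdata(self):
        return self._data


a = MiniField([0.0, 0.0])
b = array("d", [1.0, 2.0])

a.setdata(b)                  # a is still a MiniField; its data is b itself
print(type(a).__name__)       # MiniField
print(a.refdata() is b)       # True

b[0] = 5.0                    # visible through a, since the data is shared
print(a.refdata()[0])         # 5.0

a = b                         # plain assignment rebinds the name a
print(type(a).__name__)       # array
```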
The use of these public methods also occurs in the internals of
the three variable classes. For instance,
in the code defining Field objects, all statements
that set _data by reference to specified data use
setdata, and all references to _data
use refdata.
Other types of set commands (e.g. in-place operations)
in the Field internals may not use
setdata, instead operating on
_data directly.
For State and StateSet code, some of
the class definition internals use setdata and
refdata, but it's not uniform in these classes
whether these public methods are used or _data
is manipulated directly.
Managing/manipulating data:
Because the container data stored in the Field object
is a Numeric/numarray
masked or unmasked array, manipulation of Field
data follows rules and uses methods given in Numeric/numarray.
Because the container data in
State and StateSet objects is
stored as a dictionary, these two classes also define a number of the
key dictionary methods (e.g. has_key,
items, keys, and values).
Checking metadata:
One common problem in modeling is ensuring that "interface variables,"
i.e. variables that are passed between models/submodels, are used
consistently in all applicable scopes.
The meta_ok_to_interface method (which is defined
for Field, State, and StateSet
objects) fulfills this task, returning True if the
object's metadata passes certain consistency checks. More information
is found in the description of
checking Field
metadata with meta_ok_to_interface and in the
description of the InterfaceMeta
class.
Simple parallelization:
One simple way of parallelizing a model is by breaking up the
model spatial domain into parts (assuming that each element of
the domain does not communicate with any other element in the
domain), and assigning computations on those parts to
different processors. At the end of those computations
the parts are reintegrated into the whole.
This is the function of the wrap method,
defined in all three model variable container classes
(Field, State, and StateSet).
Usually, the wrap method for a Field
object is called right before the model adds the Field
object to the model's tendency dictionary (via add_tend1).
The
code for the semtner0 model
illustrates this usage.
This type of simple parallelization is not yet implemented in
modelutil; the wrap method is currently
a stub and does nothing.
This section discusses the use of some of the more important
methods and attributes specific to each class. The classes
have other methods and attributes you might be interested in;
see their pydoc documentation
(Field,
State,
StateSet)
for details.
Field Objects
Metadata and metadata management:
One key feature of Field objects is that they contain
all the necessary metadata to properly describe the variable for the
purposes of use in a model. In a Field object key
metadata are specified as attributes while secondary metadata is
stored in an ExtraMeta object in the Field
object's extra_meta attribute.
Details regarding how and what metadata
are specified are found here.
Metadata management in a Field object is provided by
the following methods:
clear_all_meta():
Sets metadata attributes to None.
Sets the extra_meta attribute to an empty
ExtraMeta object.
clear_extra_meta():
Sets the extra_meta attribute to an empty
ExtraMeta object.
copymeta():
Returns a copy of the metadata of the entire object.
The data in the object is not copied; thus for a Field
object an empty array is returned for the data.
replace_all_meta(arg, **kwds):
Erases all metadata values.
Sets metadata to the values in the Field object
given by arg or to the keyword parameters given in
**kwds.
The data in the Field container object is unaltered.
The replace_all_meta method is particularly useful
when you wish to copy all the metadata from one Field
object to another. This often happens at the end of a series
of calculations.
Operations:
In many ways, Field objects can be used as
Numeric/numarray or
MA/ma arrays.
Many mathematical operations
(e.g. cos, exp, sqrt)
and key array manipulation functions
(e.g. ravel, reshape, where)
are implemented as field module functions that operate
on Field objects.
Standard operators
(e.g. +, **)
and comparison operators
(e.g. >, <=),
as well as select array methods
(e.g. astype, typecode),
are defined for
Field objects.
Some of these methods are described below; see also the
Field pydoc documentation
for details.
When using operators (e.g. +, -)
with Field objects, execution order can be important.
Consider three variables a, b,
and c. a is a Field object,
and b and c are products of a
with a numarray array.
The only difference between b and c
is the order of multiplication:
>>> from modelutil.field import Field
>>> import numarray as N
>>> a = Field([3.2, 4.4], id='a')
>>> b = a * N.array(1.1)
>>> c = N.array(1.1) * a
In the case of b, because a
comes first in the product, the multiplication is done using the
__mul__ method in Field.
As a result, b is a Field object:
>>> type(b)
<class 'modelutil.field.Field'>
In contrast, for c,
because the numarray array comes first
in the product, the multiplication is done using the
__mul__ method in numarray.
As a result, c is a numarray object:
>>> type(c)
<class 'numarray.numarraycore.NumArray'>
In the above example, the calculation still worked even though the
operation order was reversed. However, this effect of operation
order can be significant, particularly if a Field
variable's data is masked. To prevent errors of this type from
occurring, order operations with Field objects in
such a way that the result is also a Field object.
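The reason operand order matters is that Python tries the left operand's __mul__ first. The sketch below shows this with two hypothetical stand-in classes, PlainArray (in place of a numarray array) and MiniField (in place of Field); neither is the real class, and both simply multiply element by element.

```python
class PlainArray:
    """Hypothetical stand-in for a numarray array."""

    def __init__(self, values):
        self.values = list(values)

    def __mul__(self, other):
        # Accepts anything exposing .values and returns a PlainArray,
        # so any wrapper around the right operand is lost.
        return PlainArray(x * y for x, y in zip(self.values, other.values))


class MiniField:
    """Hypothetical stand-in for a Field holding values and an id."""

    def __init__(self, values, id=None):
        self.values = list(values)
        self.id = id

    def __mul__(self, other):
        return MiniField(x * y for x, y in zip(self.values, other.values))


a = MiniField([3.2, 4.4], id="a")
w = PlainArray([1.1, 1.1])

b = a * w     # left operand is a MiniField, so MiniField.__mul__ runs
c = w * a     # left operand is a PlainArray, so PlainArray.__mul__ runs

print(type(b).__name__)   # MiniField
print(type(c).__name__)   # PlainArray
```

Both products hold the same numbers; only the type of the result differs, which is why keeping the Field object on the left preserves Field behaviour.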
Copying:
Besides the
copy
and copymeta methods,
Field objects can also be copied with the following
methods:
asMA():
Returns a copy of the entire object except the copied data
is a masked MA/ma array.
astype(typecode):
Returns a copy of the entire object
with the data cast to the type specified by typecode.
filled(fill_value=None):
Returns a copy of the entire object
except the copied data is a plain unmasked
Numeric/numarray array, with masked
values filled by keyword argument fill_value
(or by self.fill_value() if fill_value
is None).
Slicing:
Field objects follow standard Python array slicing syntax
and rules. However, slices of Field objects return regular
array objects:
>>> from modelutil.field import Field
>>> a = Field([[ 2.4, 5.5, -2.0],
...            [-1.2, 3.6, 9.2]], id='u')
>>> print a
array([[ 2.4, 5.5, -2. ],
[-1.2, 3.6, 9.2]])
>>> b = a[:, 0:2]
>>> b
array([[ 2.4, 5.5],
[-1.2, 3.6]])
>>> type(a)
<class 'modelutil.field.Field'>
>>> type(b)
<class 'numarray.numarraycore.NumArray'>
Standard slicing syntax, however, assumes the array has a
pre-determined dimension structure.
To make models truly flexible you want to slice axes based upon
the value along a dimension (e.g. a latitude or height value)
as opposed to the index of a dimension.
The slice_fixed method provides a way to make such
a slice of a Field object: it both slices the data
and adjusts the metadata accordingly.
The slice_fixed_setdata method functions similarly
except that it replaces the elements selected by
slice_fixed with values from another array or a scalar.
The section on writing flexible models
as well as the pydoc documentation for
slice_fixed
and
slice_fixed_setdata
provide more information.
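The idea of slicing by coordinate value rather than by index can be sketched in plain Python. The helper select_by_value below is hypothetical and does not reproduce the slice_fixed signature or its metadata handling; it only shows how a range of values along one dimension selects rows and how the coordinate list is trimmed to match.

```python
def select_by_value(data, coord, lower, upper):
    """Return the rows of data whose coordinate lies in [lower, upper].

    Hypothetical helper, not the slice_fixed API: data is a list of rows
    and coord holds one coordinate value (e.g. latitude) per row.
    """
    keep = [i for i, value in enumerate(coord) if lower <= value <= upper]
    sliced_data = [data[i] for i in keep]
    sliced_coord = [coord[i] for i in keep]   # coordinate trimmed to match
    return sliced_data, sliced_coord


lat = [-60.0, -30.0, 0.0, 30.0, 60.0]
temp = [[250.0, 251.0], [270.0, 271.0], [300.0, 301.0],
        [272.0, 273.0], [252.0, 253.0]]

tropics, tropics_lat = select_by_value(temp, lat, -30.0, 30.0)
print(tropics_lat)    # [-30.0, 0.0, 30.0]
print(tropics)        # [[270.0, 271.0], [300.0, 301.0], [272.0, 273.0]]
```

Because the selection is driven by the latitude values, the same call works unchanged if the grid resolution or the number of rows changes.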
State Objects
Error checking:
Besides the
meta_ok_to_interface
metadata checking method, the State class
provides the following methods to use in
checking for (potential) errors:
is_conformable(ignore=None, include=None):
Returns True if the data in all Field objects
in the State object are conformable. The ignore
and include keywords enable you to list Field
objects to ignore or include (respectively) in the conformability test.
Use this to prevent executing models where you know from the
outset that the input has non-conformable array sizes (and will
throw an error when the arrays are used together in operations).
max_shape():
Returns the shape of the array that is large enough that
the data for all non-axis Field objects
in the State object can be stored in that array.
Accessing subsets:
These methods enable you to access subsets of the set of
Field objects making up the State object:
copysubset(keylist):
Creates a new State object that is a
deep copy of the list of Field objects
in keylist.
subset(keylist):
Creates a new State object that is a
copy of the list of Field objects
in keylist. In this copy, the data of the Field
object are specified by reference to the original data.
StateSet Objects
Timestep attribute:
The StateSet timestep (the difference between
the two consecutive States "t0" and
"tp1")
is stored in the instance attribute delt.
Error checking:
Besides the
meta_ok_to_interface
metadata checking method, the StateSet class
provides the
time_ok method
to check if the time-related parameters in the StateSet
object pass consistency checks.
Informational methods:
These methods provide, in handy formats,
information about the State objects
that make up the StateSet object
(e.g. ordered lists of State ids):
sort_stateid():
Returns a list of the State ids
in time-ascending order.
state_list():
Returns a list of 2-element tuples, each of which gives the
id and long_name
of each State variable in the StateSet
object.
stateid2int():
Translates all non-"tendency" State ids to integers and
returns these integers as a list.
Note the similarities and differences between this method and the
stateid2int module function in stateset.
Management of constituent State variables:
Besides the
add method
which adds State variables
to a StateSet object, the following methods are provided
to delete or re-order the State variables
in a StateSet object:
shift_time_stateid(nstep):
Shift State objects nstep timesteps and
delete the "tendencies" in the StateSet object.
del_earliest:
Removes the earliest State object from the
StateSet container.
del_latest:
Removes the latest State object from the
StateSet container.
There are two types of functions described here:
(1) Functions that manipulate and use instances of
Field,
State, or
StateSet;
(2) Utility functions useful for the definition of these classes.
Numerical functions for
Field objects:
Many standard mathematical and array functions and operations
are defined for Field objects. Most of them are
defined using one of the following support classes found in the
field module:
BinaryFunction,
BinaryOperation,
NullaryOperation,
UnaryFunction, and
UnaryOperation.
Most numerical functions and operations return a
Field object with all non-data attributes set
to empty (either None or an empty instance of
ExtraMeta).
This is done because such functions and operations can often change
the metadata of a Field object, such as the units.
Details are given in the
module docstring documentation for
the field module.
Function to select and copy the metadata of
the "largest" object in a set of Field objects:
The copymeta_largest function in the field
module takes a set of Field objects, finds the
"largest" Field object in that set
(where "largest" can roughly be described as the
Field object with the largest array),
and copies the metadata of that Field object.
A discussion of how this function is used is found
here.
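A rough sketch of the selection step is given below. MiniField and copymeta_of_largest are hypothetical names, and "largest" is assumed here to mean the greatest number of elements in a flat list; the real copymeta_largest function may apply a different rule.

```python
import copy


class MiniField:
    """Hypothetical stand-in for Field: flat list data plus metadata."""

    def __init__(self, data, id=None, units=None):
        self._data = data
        self.id = id
        self.units = units


def copymeta_of_largest(fields):
    """Copy the metadata of the field holding the most elements.

    Assumption: "largest" means the greatest number of elements; the real
    copymeta_largest may use a different rule.
    """
    largest = max(fields, key=lambda f: len(f._data))
    # Metadata is copied; the data container is left empty.
    return MiniField([], id=copy.deepcopy(largest.id),
                     units=copy.deepcopy(largest.units))


u = MiniField([1.0, 2.0, 3.0, 4.0], id="u", units="m/s")
lat = MiniField([10.0, 20.0], id="lat", units="degrees_north")

result = copymeta_of_largest([lat, u])
print(result.id, result.units, result._data)   # u m/s []
```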
Informational functions for State objects:
These functions are used to help obtain and manipulate
information, in a handy format, on lists of State
variables:
int2stateid(inlist):
For a list inlist of integers, translates the values into the
ids of State objects. This is essentially
the inverse of stateid2int.
stateid2int(inlist):
For a list inlist of non-"tendency" ids of
State objects, translates the ids
into a list of integers.
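The round trip between ids and integers might look like the sketch below. The id naming scheme is an assumption: only "t0" and "tp1" appear in this document, and the pattern "tpN" for N steps forward and "tmN" for N steps back is a guess made for illustration. The function names stateid_to_int and int_to_stateid are likewise hypothetical and are not the modelutil functions.

```python
import re


def stateid_to_int(ids):
    """Hypothetical mapping: 't0' -> 0, 'tpN' -> +N, 'tmN' -> -N."""
    out = []
    for sid in ids:
        if sid == "t0":
            out.append(0)
            continue
        match = re.fullmatch(r"t([pm])(\d+)", sid)
        if match is None:
            raise ValueError("unrecognised id: %r" % sid)
        sign = 1 if match.group(1) == "p" else -1
        out.append(sign * int(match.group(2)))
    return out


def int_to_stateid(values):
    """Hypothetical inverse of stateid_to_int."""
    out = []
    for n in values:
        if n == 0:
            out.append("t0")
        elif n > 0:
            out.append("tp%d" % n)
        else:
            out.append("tm%d" % -n)
    return out


ids = ["tm1", "t0", "tp1"]
ints = stateid_to_int(ids)
print(ints)                    # [-1, 0, 1]
print(int_to_stateid(ints))    # ['tm1', 't0', 'tp1']
```

Under the same assumed naming scheme, sorting ids in time-ascending order reduces to sorting by the integer each id maps to.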