1.5.4. Using the generic data classes¶
These classes are all inherited from Dataset
.
1.5.4.1. Goal¶
Theses classes are generic interfaces to gridded data stored in CF compliant netcdf files. They provide standard method for accessing simple or complex quantities.
For instance, to access sea surface temperature, whatever the data are satellite observations or results from models, you use:
>>> mydata = MyData(myfile)
>>> ss = mydata.get_sst()
MyData
will access the SST directly in the netcdf file,
or extract a surface slice from 3D temperature.
Specialized classes inherited from Dataset
are implemented to manage the interface to the different sources of data.
- There are some rather generic interfaces:
OceanDataset
,OceanSurfaceDataset
,AtmosDataset
,AtmosSurfaceDataset
,GenericDataset
. - There are some very specialized ones:
Mars3D
,Nemo
,HYCOM
,SWAN
,CFSR
, and more. This list will increase with time.
1.5.4.2. Usage¶
You must provide at least a file name:
>>> from vacumm.data.model.mars3d import Mars3D
>>> mydata = Mars3D(ncfile)
You can also use the vacumm.data.DS()
function:
>>> mydata = DS(ncfile, 'mars')
In this latter case, if you don’t specify the dataset type (here “mars”),
the generic class GenericDataset
is used.
Some options are supported at initialization. For example:
>>> mydata = Mars3D(ncfile, lon=(-10, 4), logger_level='debug')
Then, numerous methods are availables:
>>> lon = mydata.get_lon_u()
>>> mld = mydata.get_mld(mode='deltadens')
1.5.4.3. Sample of the capabilities¶
1.5.4.3.1. Reading multiple files¶
These classes fully take advantage of the (see documentation). For example, you can provide sereval files or a file name with date patterns.
>>> Mars3D([ncfile1, ncfile2, ...])
>>> Mars3D('myfile.%Y.nc', time=('2000', '2010'))
It also accepts url, global unix patterns (*
, ?
, etc),
and date patterns (%Y
, %m
, etc, see strftime()
).
For more information on how to read a set of files, please read the
list_forecast_files()
function for options.
Note
Files are sorted by alphabetically, unless you you pass
the sort
keyword to prevent or tune sorting.
1.5.4.3.2. Searching for variables and axes¶
The search algorithm typically exploit lists of standard names and names.
These lists are defined in the cf
module,
and can be completed within the specialized classes.
The functions that search for objects are
ncfind_axis()
.
1.5.4.3.3. Simple variables¶
You can access these variables using the generic way like the method
get_temp()
.
If the variable cannot be found directly, the methods searches for it
using a name at another similar location name (no location or its physical
location according to vacumm.data.cf.var_specs
).
Finally, the method try to get it by reading it at different
location and interpolate it to the right one.
The method provide the mode keyword to choose the retreiving mode.
"var"
: Read it directly from the file."stag"
: Get it from staggered location thanks to interpolation.None
: Try them all!
1.5.4.3.4. Advanced variables¶
Generally, a method will search for a variable which is already in the dataset. Other methods can also be implemented manually to retreive the variable, typically from other physical variables. You can choose which retreiving method to use thanks to the mode keyword:
>>> mld = ds.get_mld(mode='var') # read it
>>> mld = ds.get_mld(mode='deltadens') # compute it
In the second case, it is computed from the temperature, the salinity and the depths.
1.5.4.3.4.1. From slices¶
A variable can be accesed as a slice of another variable. For instance, SST can be defined as the surface slice of temperature in 3D models. This is setup within the specification of the class.
1.5.4.3.4.2. From algorithms¶
Sometimes, variable can also be retreived as computation
using existing variables.
For instance, if the mixed layer depth is not store in
the netcdf file, it can be computed using temperature, density or Kz
with the vacumm.diag.thermdyn.mixed_layer_depth()
.
The get_mld()
method
will read all needed variables to compute it.
1.5.4.3.5. Spatial and temporal selections¶
Spatial and temporal selections are availablesat initialization
or running time, using the lon
, lat
, level
and time
generic keywords:
>>> mars = Mars3d(ncfile, time=('2000', '2010'))
>>> sst = mars.get_sst(lon=(-10, 1), lat=(43, 50))
Selection argument are those of CDAT, and works well even on curvilinear grids.
1.5.4.3.6. Reshaping variables¶
It can be useful sometimes to reshape a variable so that it has the same shape as another variable, for instance to mathematical combinations or statistics. You can use the asvar keyword for such task.
>>> bathy3d = ds.get_bathy(asvar='temp')
1.5.4.3.7. Interpolating a variable to another grid location¶
When a dataset is on an staggered Arakawa grid (such as a C grid), you can use the at keyword to interpolate a variable at another location. For instance, to put sst at a U location, just use:
>>> sst_u = ds.get_sst(at='u')
Note
Note that sometimes a method already exists to do it:
>>> sst_u = ds.get_sst_u()
1.5.4.3.8. Vertical coordinates¶
Depth can be either a 1D axis, a volumique variable, or
estimated from sigma coordinates (using the
sigma
module) or
by integrated layer thicknesses.
1.5.4.3.9. Logging and verbosity¶
Data classes integrate the logging system of module log
.
You can for instance define a logging level and print messages
of different level of importance:
>>> mars = Mars3D(ncfile, logger_level='warning', logger_file='mars.log')
>>> mars.debug('debug message') # nothing is printed
>>> mars.warning('warning message')
1.5.4.4. Development¶
1.5.4.4.1. Adding a new get method¶
This is useful to …
1.5.4.4.1.1. Simple variable to read¶
In this case, you just define an interface to read the variable in a netcdf file.
First, define a new entry in the var_specs
dictionary
(see section Structure of the specification dictionaries).
Let’s name it myvar
.
Then you can choose between two methods:
Add the entry name
myvar
to the list defined by theauto_generic_var_names
class attribute (create it if no present). This method also makes the mode keyword available to choose the way you will retreive your variable Simple variables.Or define the method manually with the decorator
getvar_fmtdoc()
:@getvar_fmtdoc def get_myvar(self, **kwargs): return self.get_variable('myvar', **kwargs)
1.5.4.4.1.2. Advanced variable¶
In this case, you want to be able to read the variable or compute it (see paragraph Advanced variables).
You must define the method manually and introduce the mode keyword,
and use the vacumm.data.misc.dataset.check_mode()
function to
verify which mode the user wants.
The vacumm.data.misc.dataset.getvar_fmtdoc()
format the docstring
of the method.
In the follwing example, we want to be able to read the zonal wind stress,
or compute it from the wind speed components using function wind_stress()
:
# Define the method
def get_taux(self, mode=None, **kwargs):
# Params
kwvar = kwfilter(kwargs, ['lon','lat','time','level','torect'], warn=False)
kwfinal = kwfilter(kwargs, ['squeeze','order','asvar', 'at'])
kwspeed = kwfilter(kwargs, 'speed_') #
# First, try to find a taux variable
if check_mode('var', mode):
taux = self.get_variable('taux', **kwvar)
if taux is not None or check_mode('var', mode, strict=True):
return self.finalize_object(taux, **kwfinal)
# Estimate it from wind speed components
if check_mode('speed', mode):
u10m = self.get_u10m(**kwvar)
v10m = self.get_v10m(**kwvar)
if u10m is not None and v10m is not None:
taux,_ = wind_stress(u10m, v10m, format_axes=True, **kwspeed)
if taux is not None or check_mode('speed', mode, strict=True):
return self.finalize_object(taux, depthup=False, **kwfinal)
# Generate the docstring
getvar_fmtdoc(get_taux,
mode="""Retreiving mode
- ``None``: Try all modes, in the following order.
- ``"var"``: Read it from a variable.
- ``"speed"``: Compute it from wind speed.
""",
speed_rhoa="Air density (in kg.m-3)",
speed_cd="Drag coefficient",
)
1.5.4.4.2. Adding a new class for a new dataset type¶
This can be useful for example to create an interface to a gridded product such as model outputs or satellite data.
Create a module somewhere in package
vacumm.data
that contains a class that is subclass of one or several classes ofvacumm.data.misc.dataset
:class MyClass(OceanDataset): ...
Define important class attributes:
arakawa_grid_type
if the dataset is on a special grid.positive
if you know that your dataset is positive up or down for levels.ncobj_specs
if you want to redefine the specification of some variables in the same form as invar_specs
, but with two other possible keys:- select: To perform a systematic selection into the variable.
- squeeze: To systematically squeeze out one or more dimensions.
For instance, the SST in
Mars3D
is redefined so that it is read from the last level of 3D temperature, squeezing out the vertical dimension:ncobj_specs = { # sea surface temperature 'sst':{ 'inherit':'temp', 'select':{'level':slice(-1, None)}, 'squeeze':'z', } }
- Define secondary class attributes and other special methods if needed.
- Add an entry in
vacumm.data.dataset_specs
to create a shortcut to this dataset whe usingvacumm.data.DS()
.