netcdf4-python-1.3.1rel/.travis.yml

language: python
sudo: false

addons:
  apt:
    packages:
      - libhdf5-serial-dev
      - netcdf-bin
      - libnetcdf-dev

env:
  global:
    - DEPENDS="numpy>=1.9.0 cython>=0.21 setuptools>=18.0"
    - NO_NET=1
    - MPI=0

python:
  - "2.7"
  - "3.5"
  - "3.6"

matrix:
  allow_failures:
    - python: "3.7-dev"
  include:
    # Absolute minimum dependencies.
    - python: 2.7
      env:
        - DEPENDS="numpy==1.9.0 cython==0.21 ordereddict==1.1 setuptools==18.0"
    # test MPI
    - python: 2.7
      env:
        - MPI=1
        - CC=mpicc
        - DEPENDS="numpy>=1.9.0 cython>=0.21 setuptools>=18.0 mpi4py>=1.3.1"
        - NETCDF_VERSION=4.4.1.1
        - NETCDF_DIR=$HOME
        - PATH=${NETCDF_DIR}/bin:${PATH} # pick up nc-config here
      addons:
        apt:
          packages:
            - openmpi-bin
            - libopenmpi-dev
            - libhdf5-openmpi-dev

notifications:
  email: false

before_install:
  - pip install $DEPENDS

install:
  - if [ $MPI -eq 1 ] ; then ci/travis/build-parallel-netcdf.sh; fi
  - python setup.py build
  - python setup.py install

script:
  - |
    if [ $MPI -eq 1 ] ; then
      cd examples
      mpirun -np 4 python mpi_example.py
      cd ..
    fi
  - cd test
  - python run_all.py

netcdf4-python-1.3.1rel/COPYING

copyright: 2008 by Jeffrey Whitaker.

Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted, provided
that the above copyright notice appear in all copies and that both the
copyright notice and this permission notice appear in supporting
documentation.

THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING
ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE
AUTHOR BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY
DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN
AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

parts of pyiso8601 are included in netcdftime under the following license:

Copyright (c) 2007 Michael Twomey

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to
deal in the Software without restriction, including without limitation the
rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
sell copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
IN THE SOFTWARE.
netcdf4-python-1.3.1rel/Changelog

version 1.3.1 (tag v1.3.1rel)
=============================
 * add parallel IO capabilities. netcdf-c and hdf5 must be compiled with MPI
   support, and mpi4py must be installed. To open a file for parallel access,
   use `parallel=True` in `Dataset.__init__` and optionally pass the mpi4py
   Comm instance using the `comm` kwarg and the mpi4py Info instance using
   the `info` kwarg. IO can be toggled between collective and independent
   using `Variable.set_collective`. See `examples/mpi_example.py`; a brief
   usage sketch also follows the 1.2.9 entries below. Issue #717, pull
   request #716. Minimum cython dependency bumped from 0.19 to 0.21.
 * Add optional `MFTime` calendar overload to use across all files, for
   example, `'standard'` or `'gregorian'`. If `None` (the default), check
   that the calendar attribute is present on each variable and values are
   unique across files, raising a `ValueError` otherwise.
 * Allow _FillValue to be set for vlen string variables (issue #730).

version 1.3.0 (tag v1.3.0rel)
==============================
 * always search for HDF5 headers when building, even when nc-config is used
   (since nc-config does not always include the path to the HDF5 headers).
   Also use H5get_libversion to obtain HDF5 version info instead of
   H5public.h. Fixes issue #677.
 * encoding kwarg added to Dataset.__init__ and Dataset.filepath (default is
   to use sys.getfilesystemencoding()) so that oddball encodings (such as
   cp1252 on windows) can be handled in Dataset filepaths (issue #686).
 * Calls to nc_get_vars are avoided, since nc_get_vars is very slow
   (issue #680). Strided slices are now converted to multiple calls to
   nc_get_vara. This speeds up strided slice reads by a factor of 10-100
   (especially for NETCDF4/HDF5 files) in most cases. In some cases, strided
   reads using nc_get_vars are faster (e.g. strided reads over many
   dimensions such as var[:,::2,::2,::2]), so a variable method
   use_nc_get_vars was added. var.use_nc_get_vars(True) will tell the library
   to use nc_get_vars instead of multiple calls to nc_get_vara, which was the
   default behaviour previous to this change.
 * fix utc offset time zone conversion in netcdftime - it was being done
   exactly backwards (issue #685 - thanks to @pgamez and @mdecker).
 * Fix error message for illegal ellipsis slicing, add test (issue #701).
 * Improve timezone format parsing in netcdftime
   (https://github.com/Unidata/netcdftime/issues/17).
 * make sure numpy datatypes used to define CompoundTypes have the
   isalignedstruct flag set to True (issue #705), otherwise segfaults can
   occur. The fix required raising the minimum numpy requirement from 1.7.0
   to 1.9.0.
 * ignore missing_value, _FillValue, valid_range, valid_min and valid_max
   when creating masked arrays if the attribute cannot be safely cast to the
   variable data type (and issue a warning). When setting these attributes,
   don't cast to the variable dtype unless it can be done safely, and issue
   a warning. Issue #707.

version 1.2.9 (tag v1.2.9rel)
==============================
 * Fix for auto scaling and masking when _Unsigned attribute set (create
   view as unsigned type after scaling and masking). Issue #671.
 * Always mask values outside valid_min, valid_max (not just when
   missing_value attribute present). Issue #672.
 * Fix setup.py so pip install doesn't fail if cython not installed.
   setuptools >= 18.0 now required for installation (Issue #666).
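A minimal sketch of the parallel IO usage described in the 1.3.1 entry above,
closely following examples/mpi_example.py. It assumes netcdf-c and hdf5 were
built with MPI support, mpi4py is installed, and the script is launched under
mpirun (e.g. with 4 ranks); the file name 'parallel_test.nc' is illustrative.

    # run with e.g.: mpirun -np 4 python mpi_sketch.py
    from mpi4py import MPI
    from netCDF4 import Dataset

    rank = MPI.COMM_WORLD.rank
    # open the file for parallel access; comm and info kwargs are optional
    nc = Dataset('parallel_test.nc', 'w', parallel=True,
                 comm=MPI.COMM_WORLD, info=MPI.Info())
    nc.createDimension('dim', 4)
    v = nc.createVariable('var', 'i4', ('dim',))
    v.set_collective(True)   # toggle collective (vs independent) IO
    v[rank] = rank           # each rank writes its own element
    nc.close()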
version 1.2.8 (tag v1.2.8rel)
==============================
 * recognize _Unsigned attribute used by netcdf-java to designate unsigned
   integer data stored with a signed integer type in netcdf-3 (issue #656).
 * add Dataset init memory parameter to allow loading a file from memory
   (pull request #652, issues #406 and #295).
 * fix for negative times in num2date (issue #659).
 * fix for failing tests in numpy 1.13 due to changes in numpy.ma
   (issue #662).
 * Check for the _Encoding attribute for NC_STRING variables, otherwise use
   'utf-8'. 'utf-8' is used everywhere else; the 'default_encoding' global
   module variable is no longer used. The getncattr method now takes an
   optional kwarg 'encoding' (default 'utf-8') so the encoding of attributes
   can be specified if desired. If _Encoding is specified for an NC_CHAR
   ('S1') variable, the chartostring utility function is used to convert the
   array of characters to an array of strings with one less dimension (the
   last dimension is interpreted as the length of each string) when reading
   the data. When writing the data, stringtochar is used to convert a numpy
   array of fixed length strings to an array of characters with one more
   dimension. chartostring and stringtochar now also have an 'encoding'
   kwarg. Automatic conversion to/from character to string arrays can be
   turned off via a new set_auto_chartostring Dataset and Variable method
   (default is True). Addresses issue #654.
 * Cython >= 0.19 now required, _netCDF4.c and _netcdftime.c removed from
   repository.

version 1.2.7 (tag v1.2.7rel)
==============================
 * fix for issue #624 (error in conversion to masked array when variable
   slice returns a scalar). This is a regression introduced in 1.2.5
   associated with support for vector missing_values. Test (tst_masked5.py)
   added for vector missing_values.
 * fix for python 3.6 compatibility (error retrieving character _FillValue
   attribute, issue #626). Test with python 3.6 using travis CI.

version 1.2.6 (tag v1.2.6rel)
==============================
 * fix some test failures on big endian PPC64 that were due to errors in
   byte-swapping logic. Also fixed bug in enum code exposed on PPC64
   (issue #608).
 * remove support for python 2.6 (it probably still will work for a while
   though).
 * Sometimes checking that data being assigned to a variable has an 'ndim'
   attribute is not sufficient; instead check to see that the object
   supports the buffer interface (issue #613).
 * make get_variables_by_attributes work in MFDataset (issue #610). The hack
   is also applied for set_auto_maskandscale, set_auto_scale and
   set_auto_mask, so these don't have to be duplicated in MFDataset
   (pull request #571).

version 1.2.5 (tag v1.2.5rel)
==============================
 * Add MFDataset.set_auto_maskandscale (plus set_auto_scale, set_auto_mask).
   Fixes issue #570.
 * Use valid_min/valid_max/valid_range attributes when defining mask
   (issue #576). Values outside the valid range are considered to be missing
   when defining the mask.
 * Fix for issue #584 (add support for dates before -4712-1-1 in 360_day and
   365_day calendars to netcdftime.utime).
 * Fix for issue #593: add support for datetime.timedelta operations (adding
   and subtracting timedelta, subtracting two datetime instances to compute
   the time duration between them), implement datetime.replace() and
   datetime.__str__(). datetime.__repr__() includes the full state of an
   instance. Add datetime.calendar. datetime comparison operators have full
   accuracy now.
 * Fix for issue #585 by increasing the size of the buffer used to store the
   filepath.
* Fix for issue #592: Add support for string array attributes. (When reading, a vlen string array attribute is returned as a list of strings. To write, use var.setncattr_string("name", ["two", "strings"]).) * Fix for issue #596 - julian day calculations wrong for negative years, caused incorrect rountrip num2date(date2num(date)) roundtrip for dates with year < 0. * Make sure negative years work in utime.num2date (issue #596). * raise NotImplementedError when trying to pickle Dataset, Variable, CompoundType, VLType, EnumType and MFDataset (issue #602). * Fix for issue #527: initialize vldata[i].p in Variable._get(...). version 1.2.4 (tag v1.2.4rel) ============================== * Fix for issue #554. It is now ensured that data is in native endian byte order before passing to netcdf-c library. Data read from variable with non-native byte order is also byte-swapped, so that dtype remains consistent with netcdf variable. Behavior now consistent with h5py. * raise warning for HDF5 1.10.x (issue #549), since backwards incompatible files may be created. * raise AttributeError instead of RuntimeError when attribute operation fails. raise IOError instead of RuntimeError when nc_create or nc_open fails (issue #546). * Use NamedTemporaryFile instead of deprecated mktemp in tests (pull request #543). * add AppVeyor automated windows tests (pull request #540). version 1.2.3.1 (tag v1.2.3.1rel) ================================== * fix bug in setup.py (pull request #539, introduced in issue #518). version 1.2.3 (tag v1.2.3rel) ============================== * try to avoid writing NC_STRING attributes if possible, by trying to convert unicode strings to ascii and write as NC_CHAR (issue #529). This preserves compatibility with clients (like Matlab) that can't deal with NC_STRING attributes. A 'setncattr_string' method was added for Dataset and Variable to that users can force attributes to be written as NC_STRING if necessary. * fix failing tests with numpy 1.11 (issues #521 and #522). * fix indentation bug in nc4tonc3 utility (issue #519). * add the capability in setup.py to use pkg-config instead of nc-config (pull request #518). * make sure slices which return scalar masked arrays are consistent with numpy.ma (issue #515). * add test/tst_cdf5.py and test/tst_filepath.py (to test new NETCDF3_64BIT_DATA format and filepath Dataset method). * expose netcdftime.__version__ (issue #504). * fix potential memory leak in Dataset.filepath in attempt to fix mysterious segfaults on CentOS6 (issue #506). Segfaults can apparently still occur on systems like CentOS6 with old versions of glibc. version 1.2.2 (tag v1.2.2rel) ============================= * fix failing tests on python 2.6 (issue #497). Change minimum required python from 2.5 to 2.6. * Potential memory leaks fixed by freeing string pointers internally allocated in netcdf-c using nc_free_string. Also use nc_free_vlens to free space allocated for vlens inside netcdf-c (issue #495). * invoke str on filename argument to Dataset constructor, so pathlib instances can be used (issue #489). * don't use hardwired NC_MAX_DIMS or NC_MAX_VARS internally to allocate space for dimension or variable ids. Instead, find out the number of dims and vars and use malloc. NC_MAX_NAME is still used to allocate space for attribute and variable names, since there is no obvious way to determine the length of these names. * if trying to write a unicode attribute, check to see if it exists first and is NC_CHAR, and if so, delete it and recreate it. 
Workaround for C lib bug discovered in issue #485. * support for NETCDF3_64BIT_DATA format supported in netcdf-c 4.4.0. Similar to NETCDF3_64BIT (now NETCDF3_64BIT_OFFSET), but includes 64 bit dimensions and sizes, plus unsigned and 64 bit integer data types. * make sure chunksize does not exceed dimension size (for non-unlimited dimensions) on variable creation (issue #480). * add 'size' attribute to Dimension (same as len(d), where d is a Dimension instance, issue #477). * fix bug in nc3tonc4 with --unpackshort=1 (issue #474). * dates do not have to be contiguous, i.e. can be before and after the missing dates in Gregorian calendar (pull request #476). version 1.2.1 (tag v1.2.1rel) ============================= * add the capability to slice variables with unsorted integer sequences, or integer sequences with duplicates (issue #467). This was done by converting boolean array slices to integer array slices internally, instead of the other way around. * raise TypeError if masked array assigned to a VLEN str variable slice (issue #464). * Ellipsis now can be used with scalar VLEN str variables (issue #458). Slicing of scalar VLEN (non-str) variables now works. * Allow non-positive reference years in non-real-world calendars (issue #442). version 1.2.0 (tag v1.2.0rel) ============================= * Fixes to setup.py for building on windows (issue #460). * warnings now issued if file being read contains unsupported variables or data types (they were previously being silently skipped). * added 'get_variables_by_attributes' method (issue #454). * check for 'units' attribute in date2index (issue #453). * added support for enum types (issue #452). * added 'isopen' Dataset method (issue #450). * raise ValueError if year 0 or negative year used in time units string. The year 0 does not exist in the Julian and Gregorian calendars (issue #442). version 1.1.9 (tag v1.1.9rel) ============================= * fix for issue #391 (data is already byte-swapped to native endian format by the HDF4 library). * fix for issue #415 (copy.deepcopy does not work on netcdftime datetime object). * fix for issue #420 - len(v) where v is a scalar variable returned unexpected IndexError, now returns "TypeError: len() on unsized object" (same as numpy does for len() on a scalar array). * translate docstrings from epydoc markup to markdown, so pdoc can be used (epydoc is dead). * add small offset in conversion to Julian date for numerical stability (more accurate round trip calculations). This offset is removed in back conversion only from microseconds. Pull request #433. * add detection of unsigned integers to handling of automatic packing (set_auto_scale and set_auto_maskandscale) when writing. Pull request #435. * use USE_SETUPCFG env var to over-ride use of setup.cfg. If USE_SETUPCFG evaluates to false, setup.cfg will not be used and all configuration variables can be set from environment variables. Useful when using 'pip install' and nc-config is broken (issue #438). * fix for integer overflow in date2index (issue #444). version 1.1.8 (tag v1.1.8rel) ============================= * v[...] now returns a numpy scalar array (not just a scalar) when v is a scalar netcdf variable (issue #413). * unix-like paths can now be used in createVariable and createGroup. v = nc.createVariable('/path/to/var1',('xdim','ydim'),float) will create a Variable named 'var1', while also creating the Groups 'path' and 'path/to' if they do not already exist. 
Similarly, g = nc.createGroup('/path/to') acts like 'mkdir -p' in unix, creating the Groups 'path' and '/path/to', if they don't already exist. Users who relied on nc.createGroup(groupname) failing when the group already exists will have to modify their code, since nc.createGroup will now return the existing group instance. Dataset.__getitem__ also added. nc['/path/to'] returns a Group instance, and nc['/path/to/var1'] returns a Variable instance. * change minimum required numpy to 1.7.0, fix so all tests pass with 1.7.0. Added travis tests for minimum required cython, numpy (issue #404). * enable abbreviations to time units specification, as allowed in CF (issue #402). Now, instead of just 'seconds' and 'seconds', 'secs', 'sec' and 's' are also allowed (similar to minutes, days and hours). * install utility scripts in utils directory with setuptools entry points (pull request #392 from @mindw). Code for utilities moved to netCDF4_utils.py - makes utilities more windows-friendly. * make sure booleans are treated correctly in setup.cfg. Add use_cython (default True) to setup.cfg. If set to False, then cython will not be used to compile netCDF4.pyx (existing netCDF4.c will be used instead). * use "from Cython.Build import cythonize" instead of "from Cython.Distutils import build_ext" in setup.py (issue #393) to conform to new cython build mechanism (CEP 201, described at https://github.com/cython/cython/wiki/enhancements-distutils_preprocessing). * unicode attributes now written as strings, not bytes (using nc_put_att_string instead of nc_put_att_text, issue #388). * add __orthogonal_indexing__ attribute to Variable, Dataset and Group (issue #385) to denote that Variable objects do not follow numpy indexing semantics for integer and boolean array indices. * make sure application of scale_factor and add_offset works correctly when scale_factor not given (issue #381). * add man pages for nc3tonc4, nc4tonc3, ncinfo in man directory. Not installed by setup.py (contributed by Ross Gammon, issue #383). * replace tabs with spaces by running reindent.py on all *.py and *.pyx files (issue #378). * refactor netCDF4_utils and netCDF4 module into netCDF4 package. Refactoring effectively removes netCDF4 utils private attributes from netCDF4 namespace, so has the potential to break code using private attributes (issue #409). version 1.1.7 (tag v1.1.7rel) ============================= * check to make sure cython >= 0.19 is available before trying to use it (otherwise compilation with fail). Issue 367. * add ipython notebooks from Unidata workshop in examples directory. * fix ellipsis variable slicing regression (issue 371). * release the Global Interpreter Lock (GIL) when calling the C library for read operations. Speeds up multi-threaded reads (issue 369). Caution - the HDF5 library may need to be compiled with the threadsafe option to ensure that global data structures are not corrupted by simultaneous manipulation by different threads. * Make sure USE_NCCONFIG environment variable takes precedence over value of use_ncconfig in setup.cfg. With this change, 'pip install netCDF4' with USE_NCCONFIG=1 will use environment variables to find paths to libraries and include files, instead of relying on nc-config (issue #341). version 1.1.6 (tag v1.1.6rel) ============================= * fix for issue 353 (num2date can no longer handle units like 'hours since 2000-01-01 0'). * fix for issue 354 (num2date no longer supports multi-dimensional arrays). 
* fix for spurious UserWarning about endian-ness mismatch (issue 364). * make calendar name keyword for num2date/date2num case insensitive (issue 362). * make sure units parser returns time-zone naive datetime instance that includes UTC offset (issue 357). UTC offset was applied incorrectly in netcdftime.date2num and num2date. No longer need to depend on python-dateutil. version 1.1.5 (tag v1.1.5rel) ============================= * add dependency on python-dateutil in setup.py and install docs. * use python datetime in num2date and date2num whenever possible. Remove duplicate num2date and date2num functions from netcdftime. Addresses issue #344. Add microsecond capability to netcdftime.datetime. Roundtrip accuracy of num2date/date2num now down to less than a millisecond. * use nc-config by default to find dependencies. setup.py modified to handle failure to find nc-config more gracefully (issue #340). If you wish to use env vars to point to the libs, you must first move the setup.cfg file out of the way (rename it to setup.cfg.save), or set USE_NCCONFIG to 0. * if endian-ness of variable is specified, adjust datatype to reflect this when opening a file (issue 346). * fix for issue #349 (seconds outside the range 0-59 in netcdftime.num2date). version 1.1.4 (tag v1.1.4rel) ============================= * speedup conversion of array indices to slices (issue #325). * fix for issue #330 (incorrect values for seconds returned by netcdftime). * fix reading of scalar vlen variables (issue #333). * setting fill_value=False in createVariable for vlen and compound variables now does nothing, instead of causing an error when the Dataset is closed (issue #331). * cython will regenerate netCDF4.c when install is run, not just build. Makes 'pip install' do the right thing when cython is installed (issue #263). version 1.1.3 (tag v1.1.3rel) ============================= * checked in _datetime.c to git (resolves issue #315). Note - _datetime.c was *not* included in the 1.1.2 release. * Changed __str__ to __repr__ in MFDataset, to be consistent with Dataset (issue #317). IPython uses __repr__ to make use-friendly human-readable summaries of objects in the terminal. version 1.1.2 (tag v1.1.2rel) ============================= * fix for issue 312 (allow slicing with objects that can be cast to ints). * indexing netCDF variables with integer sequences and boolean arrays now behave the same way (integer sequences are converted to boolean arrays internally). Addresses issue #300. Since indexing using integer sequences does not behave exactly as before, some client code may break. For example, previously when integer index arrays had the same length, and that length was equal to the number of dimensions of the array being indexed, netcdf4-python mirrored the numpy indexing behavior and treated the elements of the index arrays as individual sets of integer indices. This special case has been removed. An IndexError is now raised when the new behavior would produce a different result than the old, i.e. when the indices in an integer sequence are not sorted, or there are duplicate indices in the sequence. * fix for issue #310 (masked arrays not returned correctly when variable has non native endian-ness). * fix for issue #306 (slicing variable with "-1" when there is only one element along that dimension). * Improved speed of num2date and date2num for standard, julian, gregorian and proleptic gregorian calendars by vectorizing the functions. 
See Issue #296 * Fix for issue #301 ("Datestring parser chokes on years with extra space"). * Add name property for Dimension, Variable and Group instances (to access string name associated with instance). * Allow for null byte attributes (so _FillValue='\x00' can be set manually). Issue 273. * Added __repr__ (matching __str__) for all types (pull request #291). IPython uses __repr__ to make use-friendly human-readable summaries of objects in the terminal. version 1.1.1 (tag v1.1.1rel) ============================== * make sure _FillValue is a byte for character arrays in Python 3 (issue 271). * add numpy to install_requires in setup.py (issue #282, fixes issue #211). 'pip install netcdf4-python' will no longer fail if numpy not installed. * Fix for issue 278 (UnicodeDecodeError reading netcdf.h from setup.py with Python 3.4). * Make netcdftime.datetime immutable and hashable (issue 255). * Fix issue with slicing of scalar VLEN arrays (issue 270). * Add set_auto_mask and set_auto_scale methods to control auto scaling and auto masking separately. (issue 269). Also added set_auto_maskandscale, set_auto_scale, set_auto_mask Dataset/Group methods that recursively walk through all variables in the Dataset/Group. * Make sure file_format attribute propagated to Group instances (issue 265). * Fix for issue #259 ("Cannot use broadcasting to set all elements of a Variable to a given value"). version 1.1.0 (tag v1.1.0rel) ============================= * revert weakref change, so that previous behaviour (Dimensions and Variables keep strong references to parent Dataset) is the default. New keyword argument 'keepweakref' for Dataset.__init__ can be set to true to get weak references. version 1.0.9 (tag v1.0.9rel) ============================= * speed up the creation of new Group instances (issue 239). * fix logic errors in setup.py (issue 236). * it is now possible to create and set variable length string variables with numpy string datatypes (pull request 224). * add .travis.yml (for travis-ci testing on github), silence warnings from test output (issue 225). * modify __unicode__ for Variable and Dimension to return more useful error message when Dataset object has been garbage collected. * use weak references to group instances when creating Dimension and Variable objects. This prevents cyclic references messing up garbage collection (issue 218, pull request 219). * accessing values from a 0-dimensional Variable now returns a 0-dimensional numpy array, not a 1-dimensional array (issue 220). To write code compatible with both the old and new (fixed) behavior, wrap values accessed from a 0-dimensional Variable with numpy.asscalar. * add an __array__ method to Variable to make numpy ufuncs faster (issue 216). * change download_url in setup.py to point to pypi instead of googlecode. * fix for date2index error when time variable has only one entry (issue 215). * silence warnings ("Non-trivial type declarators in shared declaration (e.g. mix of pointers and values). Each pointer declaration should be on its own line") with Cython 0.2. * reduced memory usage for Variable.__getitem__ under Python 2. version 1.0.8 (tag v1.0.8rel) ============================= * change file_format Dataset attribute to data_model (keeping file_format for backward compatibility). Add disk_format attribute (underlying disk format, one of NETCDF3, HDF4, HDF5, DAP2, DAP4, PNETCDF or UNDEFINED). Uses nc_inq_format_extended, added in version 4.3.1 of library. If using earlier version of lib, disk_format will be set to UNDEFINED. 
* default _FillValue now ignored for byte data types (int8 and uint8) as per http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c/Fill-Values.html#Fill-Values "If you need a fill value for a byte variable, it is recommended that you explicitly define an appropriate _FillValue attribute, as generic utilities such as ncdump will not assume a default fill value for byte variables". ncinfo now returns fill mode information (issue 209). * check to see if filling was disabled before masking data equal to default fill value (issue 209). * add variable type information to Dataset.__repr__ (output of ncinfo). version 1.0.7 (tag v1.0.7rel) ============================= * add the ability to specify the locations of hdf4,jpeg and curl libs, in case netCDF4 was built statically with HDF4 and/or OpenDAP support (issue 207). * add 'ncinfo' utility (like 'ncdump -h' but less verbose). * more information displayed when Dataset or Group instance is printed. * fix for issue 194 (versions after 1.0.5 fail for netcdf 4.1.1, due to call to nc_inq_path, which was added in netcdf 4.1.2). Fixed by adding compile time API check similar to what was done for nc_rename_grp. If filepath Dataset method is called an exception will be raised at runtime if the module was built with netcdf < 4.1.2, or cython was not installed at build time. * fix for issues 202 and 206 (exception raised by numpy.isnan for character data types). * if dateutils not installed and time unit accuracy < 1 second requested, have netcdftime raise an ImportError. version 1.0.6 (svn revision 1312) ================================ * issue warning of endian-ness of dtype argument does not match endian kwarg in createVariable. * make sure netcdf type NC_CHAR always returned in numpy array dtype 'S1' (sometimes arrays of type 'U1' were being returned). Fixes intermittently failing test tst_compoundatt.py on python 3.3. * fix for issue 201 (if data associated with numpy array not the same endian-ness as dtype, data was written incorrectly). Now bytes are swapped if necessary. Variable.endian() now returns 'native' instead of None for NETCDF3 formatted files. createVariable now enforces endian='native' for NETCDF3 files. Added tst_endian.py test case. * fix for issue 200 (library version detection failed on cygwin). * fix for issue 199 (nc4tonc3 utility not copying global attributes). * fix for issue 198 (setup.py chokes when no arguments given). * fix for issue 197 (slicing of netCDF variables using lists of integers). * create 'path' attribute for group instance using posixpath, instead of os.path (to ensure the unix path is used on all platforms). Issue 196. * fix for issue 196 (test failures on win32 due to files being deleted before they are closed). version 1.0.5 (svn revision 1278) ================================ * change setup.py to compile the Cython sources directly, if cython is available. This allows for "ifdef" like capability to modify source at compile time to account for changes in netcdf API (e.g. the forthcoming addition of the nc_rename_grp in version 4.3.1). * added a "renameGroup" method, which raises an exception if the netcdf lib version linked does not support it. Requires netcdf >= 4.3.1. * support for more than one missing value (missing_value attribute is a vector) when converting to masked array. * add 'renameAttribute' method to Dataset, Group and Variable. * fix so that var[:] = x works if x is a scalar, and var is a netcdf variable with an unlimited dimension that has shape () - i.e. no data has been written to it yet. 
Before this change, var[:] = x did not write any data. Now the scalar x will be written as the first entry in var along the unlimited dimension. * remove dos line feeds from nc3tonc4 (issue 181). * add datatype property for Variable that returns numpy dtype for primitive datatypes (same as dtype attribute) but returns CompoundType or VLType instance for compound or vlen variables (issue 178). * fix logic for deciding where to look for nc-config in setup.py (issue 177). * issue a warning and don't try to apply scale_factor or add_offset if these attributes are not convertible to floats (issue 176). * add filepath method to Dataset instance to return file path (or opendap URL) used to create Dataset (issue 172). * fix for issue 170 (opening a remote DAP dataset fails after creating a NETCDF4 formatted file). * fix for issue 169 (error in chartostring function on 64-bit windows). * add support for missing_value or _FillValue == NaN (issue 168). * added a Dimension.group() method (issue 165). version 1.0.4 (svn revision 1229) ================================= * fixed alignment bug that could cause memory corruption when reading compound type variables. All users of compound types should upgrade. version 1.0.3 (svn revision 1219) ================================= * don't try to write empty data array to netcdf file (fixed failing test with netcdf 4.3.0rc2). * date2num, num2date and date2index now can handle units of microseconds and milliseconds (for proleptic_gregorian calendar, or gregorian and standard calendars as long as the time origin is after 1582-10-15). Issue 159. * Added a _grp attribute to Dimension (issue 165). * don't bundle ordereddict (issue 164). * support reading of vlen string attributes (issue 156). * add --vars option to nc3tonc4 (issue 154). * Don't try to set fletcher32 checksum on scalar variables (it causes HDF5 to crash). Fixes issue 150. * Add --istart/--istop options to nc3tonc4 (issue 148, courtesy of Rich Signell). * fix for proleptic_gregorian in netcdftime.py (courtesy of Matthias Cuntz). version 1.0.2 (svn revision 1196) ================================= * disable version check for HDF5, which is broken by hdf5 1.8.10. * make sure all files have a calendar attribute in MFTime (issue 144). * more robust fix to issue 90 (array shape modified by assignment to a netCDF variable with one more dimension), including test case. version 1.0.1 (svn revision 1190) ================================= * fix error that occurred when retrieving data from a variable that has a missing_value attribute specified as a string (issue 142). * automatically close netcdf files when there are no references left to Dataset object (using __dealloc__ method). Fixes issue 137. * fix for slicing of scalar vlen string variables (issue 140). * fix to allow writing of unicode data to a NC_CHAR variable. * allow for writing of large variables (> 2**32 elements). Fixes issue 130. version 1.0fix1 =============== * fix python 3 incompatibility in setup.py (issue 125). version 1.0 (svn revision 1164) =============================== * add 'aggdim' keyword to MFDataset, so the name of the dimension to aggregate over can be specified (instead of using the unlimited dimension). aggdim=None by default, which results in the previous behavior. aggdim must be the leftmost dimension of all the variables to be aggregated. 
* raise IndexError when indexing a netcdf variable out of range so iterating over a variable in a for loop behaves as expected (as described in http://effbot.org/zone/python-for-statement.htm). Fixes issue 121. * added MacPorts portfile (so it can be installed via MacPorts on macosx using a "local Portfile repository"). Installs from svn HEAD using 'port install netcdf4-python'. * added experimental 'diskless' file capability (only added to the C lib after the 4.2 release). Controlled by kwarg 'diskless' to netCDF4.Dataset (default False). diskless=True when creating a file results in a file that exists only in memory, closing the file makes the data disapper, except if persist=True keyword given in which case it is persisted to a disk file on close. diskless=True when opening a file creates an in-memory copy of the file for faster access. * add the ability to specify the location of the required libs (and whether to use nc-config) with setup.cfg, instead of using environment variables. * fix ISO9601 date parser so it recognizes time zone offsets in time unit strings (contributed by David Hassel, issue 114, r1117). * add setncatts Dataset,Group and Variable method to add a bunch of attributes (given in a python dictionary) at once. Speeds things up for NETCDF3 and NETCDF4_CLASSIC files a lot, since nc_redef/nc_enddef not need to be called for each attribute (issue 85, r1113). Adding 1000 attributes is about 35 times faster using setncatts to add them all at once. Makes no difference for NETCDF4 formatted files, since nc_redef/nc_enddef is not called. * only round after apply scale_factor and add_offset if variable type is integer (issue 111, r1109). * Fixed bug with all False Boolean index (r1107). * added support for after, before and nearest selection method to date2index fast "first guess" indexing (r1106). * Remove white space in time units string (netcdftime._parse_date). An extra space in the time units of one CMIP3 model caused an error (r1105). * based on results with examples/bench_compress2.py, change default complevel for zlib compression from 6 to 4. If complevel=0, turn compression off entirely (set zlib=False) (r1102). version 0.9.9 (svn revision 1099) ================================ * changed default unicode encoding from "latin-1" to "utf-8", since this is the python 3 default, and the only encoding that appears to work for dimension and variable names. * added test case for unicode attributes, variable and dimension names. * fixes for unicode variable, dimension and group names. * fix for unicode attributes in python3 (ncdump did not intrepret them as text strings). Issue 107. * add --format option to nc4tonc3 utility (can be either NETCDF3_CLASSIC or NETCDF3_64BIT). Fixes issue 104. version 0.9.8 (svn revision 1080) ================================ * use numpy.ma.isMA to check for masked array (instead of checking for presence of 'mask' attribute). * fixes for AIX with ibm xlc compiler. * make sure unicode attributes don't get converted to ascii strings (issue 98). version 0.9.7 (svn revision 1073) ================================ * Added __str__ methods to Dataset, Variable, Dimension, CompoundType, VLType and MFDataset, so useful human-readable information is provided when these objects are printed in an interactive session. * don't try to apply scale_factor and offset if scale_factor=1 and add_offset=0 (to avoid making copies of large arrays). * changed netCDF4._default_fillvals to netCDF4.default_fillvals (to make part of public API). Added to docs (issue 94). 
version 0.9.6 (svn revision 1043) ================================= * changed default unicode encoding from "ascii" to "latin-1" (iso-8859-1). * add "unicode_error" module variable to control what happens when characters cannot be decoded by the encoding specified by the "default_encoding" module variable (which is "ascii" by default). unicode_error = "replace" by default which means bad characters are replace by "?". Previously an error was raised, the old behavior can be obtained by setting unicode_error = 'strict'. Fixes issue 92. * add __enter__ and __exit__ methods so you can do "with Dataset(url) as f:" (issue 89). * don't add extra singleton dimensions to rhs numpy arrays when assigning to a netcdf variable. Fixes issue 90. * coerce missing_value attribute to same type as variable (for primitive types). Fixes issue 91. version 0.9.5 (svn revision 1031) ================================ * fix for compound variables on python 3.2. * fix slicing of masked MFDataset variables (issue 83). * round to nearest integer after packing with scale_factor and add_offset (instead of truncation) (issue 84). * if add_offset missing, but scale_factor present, assume add_offset zero. if scale_factor missing, but add_offset present, assume scale_factor one. (this is consistent with unidata recommendations - issue 86). * only try to convert strings to bytes for python 3 so Dataset can be subclassed (issue 87). version 0.9.4 (svn revision 1018) ================================ * tested with python 2.7.1/3.1.3 using netcdf 4.1.2 and hdf5 1.8.6. * Added a 'default_encoding' module variable that controls how unicode strings are encoded into bytes. Default is 'ascii'. * now works on Python 3. * netCDF3 module removed. If you still need it, get it from netCDF4 0.9.3. * regenerated C source with Cython 0.14.1. * Added a MFTime class. Provide a unified interface to MFDataset time variable using different time units. * Fixed bug in netcdftime (issue 75) that occurs when time specified is within one second of the end of the month. * on unix-like systems, the environment variable USE_NCCONFIG can be set to tell setup.py to use the nc-config script installed by netcdf to figure out where all the libs and headers are (without having to specify NETCDF_DIR, HDF5_DIR, etc). Only works with netcdf 4.1.2. version 0.9.3 (svn revision 930) ================================ * fix chunk sizes bug (chunk sizes pointer should be size_t, not int). Fixes issue 66. Added test in tst_compression.py * fixed writing of data with missing values with scale/offset packing. Added test (tst_masked2.py). * fix iso8601 regex in netcdftime date parser so it can parse 'hours since 1-1-1 ...' (year had to be 4 digits previously) version 0.9.2 (svn revision 907) ================================ * fix netcdftime bug with '360_day' calendar. Fixes issue 59. * make sure scalar slice of 1d variable returns array scalar (not array of shape (1,)). Fixes issue 57. * updated date parser in netcdftime. Can now handle units like "seconds since 1970-01-01T00:00:00Z". * added support in setup.py for specifying the locations of the HDF5/netcdf-4 headers and libs separately with environment variables (HDF5_INCDIR, HDF5_LIBDIR).i Patch contributed by Patrice Dumas. * add masked array support to num2date (dates for missing times set to None). * add chunk_cache keyword to createVariable. HDF5 default is 1mb, which can cause problems when creating 1000's of variables. In such cases, chunk_cache can be reduced, or set to zero. 
* add set_var_chunk_cache and get_var_chunk_cache Variable methods. * raise AttributeError when trying to set _FillValue attribute (it can only be reliably set on variable creation, using the fill_value keyword to createVariable). version 0.9.1 (svn revision 879) ================================ * raise ImportError if netcdf-4 < 4.1.1 or hdf5 <= 1.8.4. * add __netcdf4libversion__ and __hdf5libversion__ module variables. * make sure data is not truncated to integers before scale_factor and add_offset is applied (issue 46). * fix bug in date2num with noleap calendar in netcdftime (issue 45). * fix bug in 360day calendar in netcdftime (issue 44). * python 2.4 compatibility restored (by modifying OrderedDict). Fixes issue 37. * make sure int64 attributes cast to int32 when format=NETCDF4_CLASSIC. This was causing tst_multifile.py to fail on 64-bit platforms. * fix tutorial.py to cast 64 bit integers to 32 bit when writing to 32-bit integer vlen (was causing tutorial.py to fail on 64-bit platforms). * remove nose dependency from tst_netcdftime.py. version 0.9 (svn revision 846) ============================== * fixed bug (issue 30) with date2index occurring with dates outside the support. * make sure that auto masking works with MFDataset. * fix bug (issue 34) when slicing MFDataset variables with dimensions of length 1. * used ordered dictionaries for variables, dimensions, groups etc to preserve creation order (makes it easier to copy files, fixes issue 28). * change auto_maskandscale default to True. This means data will automatically be converted to and from masked arrays. Data scaled as short integers using the scale_factor and add_offset attributes will also be automatically converted to/from float arrays. * add setncattr, getncattr, delncattr methods (for setting/getting/deleting netcdf attributes with names that clash with the reserved python attributes). version 0.8.2 (svn revision 769) ================================ * compound type tests re-enabled. Compound and vlen types now fully supported in netcdf-4.1-beta2. * make sure data retrieved from a netCDF variable is not coerced to a python scalar (it should remain a numpy scalar array). * fix docs to point out that an unlimited dimension can be created by setting size to *either* None or 0 in createDimension. * fix another slicing corner case. * remove auto pickling/unpickling into vlen strings (too cute, sometimes produced surprising results). version 0.8.1 (svn revision 744) ================================ * added 'cmptypes' and 'vltypes' Group/Dataset attributes, which contain dictionaries that map the names of compound and vlen types to CompoundType and VLType instances. * Experimental variable-length (vlen) data type support added. * changes to accomodate compound types in netcdf-4.1-beta snapshots. Compound types now work correctly for snapshots >= 20090603. * Added __len__ method and 'size' property to Variable class. * In date2index, replaced the brute force method by the bisection method and added a 'select' keyword to find the index of the date before, after or nearest the given date. * Fixed bug occurring when indexing with a numpy array of length 1. * Fixed bug that occured when -1 was used as a variable index. * enabled 'shared access' mode for NETCDF3 formatted files (mode='ws', 'r+s' or 'as'). Writes in shared mode are unbuffered, which can improve performance for non-sequential access. 
* fixed bug in renameVariable that caused failure when new name is longer than old name, and file format is NETCDF3_64BIT or NETCDF3_CLASSIC. version 0.8 (svn revision 685) ============================== * added 'stringtoarr' utility function for converting python strings to numpy character arrays of a specified size. * initial support for compound data types (which are mapped to structured numpy arrays). Compound data types are created with the createCompoundTYpe Dataset or Group method. Both attributes and variables can be compound types. * make sure 64-bit integer attributes converted to 32 bits when writing to a NETCDF3 formatted file. * added nc4tonc3 utility for converted NETCDF4_CLASSIC files to NETCDF3_64BIT files (useful for sharing data with colleagues that don't have netcdf-4 capable clients). version 0.7.7 (svn revision 626) ================================ * David Huard reworked fancy indexing - it is now much more efficient and less of a memory hog. Now works differently than numpy fancy indexing - 1d arrays of boolean or integer indices work independently on each dimension. This enables things like: >>> tempdat = temp[[0,1,3],lats>0,lons>0] (retrieves 1st, 2nd and 4th levels, all Northern Hem. and Eastern Hem. grid points - note that this would raise an IndexError in numpy) * added opendap test (tst_dap.py). * bugfix for nc3tonc4 utility. * fix MFDataset.Variable. __getattr__ to raise AttributeError instead of KeyError when attribute not found. * netcdftime version number upped to 0.7. version 0.7.6 (svn revision 574) ================================ * added date2index function, courtesy of David Huard, which finds the indices in a netCDF time variable corresponding to a sequence of datetime instances. * make _get_att/_set_att raise AttributeError instead of RuntimeError, so that getattr(object, 'nonexistantattribute', None) works. (thanks David Huard) * v[:] = data now works along unlim dim, i.e. you can do this: file = Dataset('test.nc', "w") file.createDimension("time", None) # unlimited dimension var = file.createVariable("var", 'd', ("time",)) # you used to have to do this #var[0:10] = numpy.arange(10) # but now you can simply do this var[:] = numpy.arange(10) version 0.7.5 (svn revision 549) ================================ * return a scalar array, not a python scalar, when a slice returns a single number. This is more consistent with numpy behavior, and fixes a bug in MFDataset slicing. * added 'exclude' parameter to MFDataset.__init__ * added set_auto_maskandscale method to MFDataset variables. version 0.7.4 (svn revision 540) ================================ * ensure all arithmetic is done with float64 in netcdftime (Rob Hetland). * fixes for netcdf-4.0-beta2 ('chunking' keyword to createVariable replaced by 'contiguous'). Now works with netcdf-4.0-beta2 and hdf5-1.8.0 final, but is incompatible with netcdf-4.0-beta1. version 0.7.3.1 (svn revision 507) ================================== * netCDF3 docs were missing from 0.7.3. * make sure quantization function preserves fill_value of masked arrays. version 0.7.3 (svn revision 501) ================================ * MFnetCDF4 module merged into netCDF4 and netCDF3 (now called MFDataset). * added netCDF3 module for those who can't install the netCDF 4 lib. 
* added set_auto_maskandscale Variable method to enable automatic packing and unpacking of short integers (using scale_factor and add_offset attributes) and automatic conversion to/from masked arrays (using missing_value or _FillValue attribute) on a per-variable basis. var.set_auto_maskandscale(True) turns automatic conversion on (it is off by default). * automatically pack/unpack short integer variables if scale_factor and add_offset variable attributes are set. * added support for masked arrays. If you try to write a masked array to a variable with the missing_value or _FillValue attributes set, the masked array is filled with that value before being written to the file. If you read data from a variable with the missing_value or _FillValue attribute set, a masked array is returned with the appropriate values masked. * added date2num and num2date functions. * added capability to use 'fancy indexing' with variable objects (i.e. using sequences of integers or booleans in slices). WARNING: if a sequence of integers or booleans is used to slice a netCDF4 variable, all of the data in that dimension is read into a numpy array, and then the sequence is used to slice the numpy array, returning just the requested elements to the user. This can potentially gobble a lot of memory and degrade performance (especially if 'fancy indexing' is done on the left-most dimension). * added convenience functions stringtochar and chartostring for converting character arrays to arrays of fixed-length strings and vice-versa. Example usage in examples/test_stringarr.py. 20070826 - version 0.7.1 (svn revision 400) =========================================== * added 'endian()' and 'chunking()' Variable methods (to inquire about endian and chunking variable settings). * 'ndim' attribute was not public (so it couldn't be accessed from python). Fixed. * added 'endian' kwarg to createVariable (to set the endian-ness used in the HDF5 file). * can now manually set HDF5 chunksizes for each dimension at variable creation, using 'chunksizes' kwarg to createVariable. * added "getlibversion()" function to get info about version of netcdf-4 library used to build module. * if a variable has an unsupported datatype (such as 'compound', or 'vlen'), then instead of raising an exception, just skip it. Print a useful error message when an attribute with an unsupported datatype is accessed. * if variable dimension is specified as 'dimname' or ('dimname') in createVariable, it is automatically converted to a tuple ('dimname',). Better error messages when specified dimension can't be found. * createVariable accepts numpy dtype object as datatype. dtype variable attribute is now a numpy dtype object. 20070723 - version 0.7 (svn revision 361) ========================================= * renamed MFnetCDF4_classic --> MFnetCDF4. * eliminated netCDF4_classic module (all file formats handled by netCDF4 module now). * removed all user-defined data type stuff (it was hacky and made the code too complex - wait till there is a real use case to refactor and put back in). * added 'ndim' variable attribute (number of variable dimensions). 20070424 - version 0.6.3 (svn revision 302) =========================================== * passes all tests with netcdf-4.0-beta1/hdf5-1.8.0-beta1. * if slice index is not a slice object, assume it's an integer (and try to convert to one if it is not). This allows numpy scalar arrays to work as slice indices. * (netCDF4_classic only) try to make sure file is not left in 'define mode' when execption is raised. 
* if slicing a variable results in a array with shape (1,), just return a scalar (except for compound types). * added instructions for using the netCDF4_classic module to serve data over http with the DAP using pydap (http://pydap.org). * added --quiet and --chunk options to nc3tonc4. * Turned off zlib compression by default so as not to violate the 'principle of least surprise'. Shuffle filter still activated by default when zlib compression turned on. * Fixed bug in fletcher32 checksum activation call. Renamed compression() variable method to filters(), include fletcher32 checksum flag in output. * added utility for converting GRIB1 files to compressed NETCDF4_CLASSIC files (requires PyNIO). * added 'compression()' variable method that returns a dict with compression filter parameter settings for that variable. (rev 237) * reimplemented 'shape' and 'dimensions' variable attributes as properties. * fixed bug when 'chunking' keyword in createVariable was set to 'sub' (caused Bus Error on MacOS X). * Setting 'shuffle=0' keyword in createVariable was turning off zlib compression filter instead of shuffle filter. Fixed. 20070213 - version 0.6.2 ======================== * updated for compatibility with netcdf-4.0-alpha18 and hdf5 1.8.0alpha5 (shared dimensions actually work now). * netCDF4.createVariable can now use old single character Numeric typecodes for datatype specification. * Improvements to MFDataset (now called MFnetCDF4_classic) by Rob Hetland. 20061121 - version 0.6.1 ======================== * bugfixes for negative strides. * bugfix for empty string attributes. * support for shared dimensions (variables can use dimensions defined only in a parent group). This doesn't actually work yet, because of a bug in netcdf-4.0-alpha17. * now requires Pyrex (C source files generated on the fly when setup.py is run). 20061003 - version 0.6 ====================== * if fill_value keyword to createVariable is set to the Boolean False (not an integer that evaluates to False), no pre-filling is done for that variable. * updated to be compatible with netcdf-4.0-alpha17. Can now install pure-python netcdftime separately with setup-netcdftime.py. netcdftime will try to use numpy, but fall back to Numeric if numpy not installed. * generated source files with a version of pyrex (from http://codespeak.net/svn/lxml/pyrex/) that produces extensions compatible with python 2.5. * added new module for multi-file access of NETCDF3 and NETCDF4_CLASSIC files (MFDataset). Based on CDFMF from pycdf. * implement negative strides in variable slicing (feature missing from Scientific.IO.NetCDF). Now variables support full python extended slicing syntax. 20060925 - version 0.5.1 ======================== * on 64-bit systems integer attributes in netCDF4_classic failed, since there is no 64-bit integer data type. Fixed by downcasting to 32-bit integer. 20060920 - version 0.5 ====================== * Compound type support! (members must be fixed data primitive types - no user-defined types or NC_STRING variables allowed). Attributes are still restricted to primitive data types (no vlen or compound type attributes). * Assigning single values to a slice now does the Right Thing, i.e. >>> data[:] = 1 fills all the elements with 1 (instead of raising an IndexError). * Tested with numpy 1.0b5, netcdf-4.0-alpha16, HDF5 1.7.52 alpha. * Added renameDimension and renameVariable methods to Dataset and Group classes. * netCDF attributes can be deleted using python del (i.e. 'del dset.foo'). 
* Moved examples from test and test_classic to examples and examples_classic directories. * Added proper unit tests (in test and test_classic directories). * NULL characters are removed from text attributes. * Variable _FillValue can be set using new keyword argument 'fill_value' to createVariable Dataset and Group method. * docstrings now formatted with epydoc (http://epydoc.sf.net). * improved Scientific.IO.NetCDF compatibility for netCDF4_classic (typecode method, ability to use old Numeric typecodes). * zlib=False or complevel=0 disables shuffle filter in createVariable. * subversion repository hosted on Google projects (http://code.google.com/p/netcdf4-python/). * examples_classic/bench2.py is a performance comparison with Scientific.IO.NetCDF (the numpy version provided by pynetcdf). * __dict__ attribute of Dataset, Group or Variable provides a python dictionary with all netCDF attribute name/value pairs (just like Scientific.IO.NetCDF). 20060710 - version 0.4.5 ======================== * fixed to work with recent svn versions of numpy * Now requires at least numpy 0.9.8. * Raise a AttributeError if user tries to rebind a private attribute (like 'variables', 'dimensions' or 'dtype'). 20060629 - version 0.4.4 ======================== * fixed to work with netcdf-4.0-alpha14. * automatically cast _FillValue attribute to variable type, to avoid surprising error message. 20060320 - version 0.4.3 ======================== updated netcdftime module yet again added 'all_leap'/'366_day' and '360_day' calendars. netCDFTime class renamed utime, fwd and inv methods renamed date2num and num2date. These methods can now handle numpy arrays as well as scalars. a 'real' python datetime instance is returned if calendar is gregorian, otherwise a 'datetime-like' instance is returned (python datetime can't handle funky dates in 'all_leap' and '360_day' calendars). 20060316 - version 0.4.2 ======================== udunits module replaced by pure python version, renamed 'netcdftime' No longer requires udunits library. Includes 4 calendars ('julian','standard'/'gregorian','proleptic_gregorian','noleap'/'365_day'). Calendar names and their interpretations follow the CF metadata convention. 20060310 - version 0.4.1 ======================== udunits module included for doing time conversions. 20060306 - version 0.4 ====================== netCDF4_classic module can now write NETCDF3_CLASSIC, NETCDF4_64BIT as well as NETCDF4_CLASSIC files. The file format is given as an optional keyword to the Dataset constructor ('NETCDF4_CLASSIC' is the default). Preliminary work on compound types done - but awaiting the next alpha of the netCDF 4 library to complete (bugs in alpha12 prevent it from working properly if the compound type has fields which are arrays). 20060217 - version 0.3.1 ======================== refactored user-defined data type support - user-defined data types are now described by an instance of the class UserType. usertype and usertype_name keyword args eliminated from createVariable. 20060214 - version 0.3 ====================== support for variable length strengths (typecode = 'S') and variable-length, or 'ragged' arrays (vlen user-defined datatype). Arrays of python objects can be saved as pickled strings with datatype = 'S'. 20050128 - version 0.2.5 ======================== added support for scalar variables (and assignValue, getValue Variable methods for Scientific.IO.NetCDF compatibility). 
20051123 - version 0.2.4 ======================== numpy 0.9.4 compatibility Changed data type codes from ('d', 'f', 'i', 'h', ...) to ('f8', 'f4', 'i4', 'i2', ...). 20050110 - version 0.2.3 ======================== added ellipsis slicing capability 20050106 - version 0.2.2 ======================== changed scipy_core to numpy. 20051228 - version 0.2.1 ======================== bugfixes, added 'nc3tonc4' utility to convert netCDF version 3 files to NETCDF4_CLASSIC files (with compression). The converted files can be read from netCDF 3 clients that have been re-linked to the netCDF 4 library. 'chunking' keyword added to createVariable in netCDF4 module. 20051224 - version 0.2 ====================== Added netCDF4_classic module - which creates files in NETCDF4_CLASSIC format. These files are compatible with netCDF 3 clients which have been linked against the netCDF 4 lib. This module does not use any new features of the netCDF 4 API except zlib compression. Unlike any other netCDF 3 python client, it can transparently compress data with zlib compression and the HDF5 shuffle filter. 20051222 - version 0.1 ====================== First release. Supports groups, multiple unlimited dimensions, zlib compression (plus shuffle filter and fletcher32 checksum) and all new primitive data types. No support for user-defined data types yet. netcdf4-python-1.3.1rel/MANIFEST.in000066400000000000000000000007571317565303700166460ustar00rootroot00000000000000recursive-include docs * recursive-include man * recursive-include conda.recipe * include MANIFEST.in include README.md include COPYING include Changelog include appveyor.yml include .travis.yml include setup.cfg include examples/*py include examples/*ipynb include examples/README.md include test/*py include test/*nc include netcdftime/__init__.py include netcdftime/_netcdftime.pyx include netCDF4/__init__.py include netCDF4/_netCDF4.pyx include netCDF4/utils.py include include/netCDF4.pxi netcdf4-python-1.3.1rel/PKG-INFO000066400000000000000000000041561317565303700162020ustar00rootroot00000000000000Metadata-Version: 1.1 Name: netCDF4 Version: 1.3.1 Author: Jeff Whitaker Author-email: jeffrey s whitaker at noaa gov Home-page: https://github.com/Unidata/netcdf4-python Summary: python/numpy interface to netCDF library (versions 3 and 4) License: OSI Approved Description: netCDF version 4 has many features not found in earlier versions of the library and is implemented on top of HDF5. This module can read and write files in both the new netCDF 4 and the old netCDF 3 format, and can create files that are readable by HDF5 clients. The API modelled after Scientific.IO.NetCDF, and should be familiar to users of that module. Most new features of netCDF 4 are implemented, such as multiple unlimited dimensions, groups and zlib data compression. All the new numeric data types (such as 64 bit and unsigned integer types) are implemented. Compound, variable length (vlen), and enumerated (enum) data types are supported, but the opaque type is not. Mixtures of compound, vlen and/or enum data types are not supported. This project has a `Github repository `_ where you may access the most up-to-date source. `Documentation `_ `Changelog `_ Also available in the `Anaconda scientific python distribution `_ Download source tarball and binary wheels below... 
Keywords: numpy netcdf data science network oceanography meteorology climate Platform: any Classifier: Intended Audience :: Science/Research Classifier: Programming Language :: Python :: 2 Classifier: Programming Language :: Python :: 2.7 Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.3 Classifier: Programming Language :: Python :: 3.4 Classifier: Programming Language :: Python :: 3.5 Classifier: Topic :: Scientific/Engineering netcdf4-python-1.3.1rel/README.gh-pages000066400000000000000000000007661317565303700174620ustar00rootroot00000000000000To update web docs at http://github.unidata.io/netcdf4-python: First install fork of pdoc from https://github.com/jswhit/pdoc (requires mako, markdown, pygments and future). Then in netcdf4-python github clone directory (after building and installing github master), * generate docs (sh create_docs.sh) * copy docs/netCDF4/index.html up one level (cp docs/netCDF4/index.html ..) * git checkout gh-pages * cp ../index.html . * git commit index.html * git push origin gh-pages * git checkout master netcdf4-python-1.3.1rel/README.md000066400000000000000000000236571317565303700163730ustar00rootroot00000000000000# netcdf4-python [Python](http://python.org)/[numpy](http://numpy.org) interface to the netCDF [C library](https://github.com/Unidata/netcdf-c). [![Linux Build Status](https://travis-ci.org/Unidata/netcdf4-python.svg?branch=master)](https://travis-ci.org/Unidata/netcdf4-python) [![Windows Build Status](https://ci.appveyor.com/api/projects/status/fl9taa9je4e6wi7n/branch/master?svg=true)](https://ci.appveyor.com/project/jswhit/netcdf4-python/branch/master) [![PyPI package](https://badge.fury.io/py/netCDF4.svg)](http://python.org/pypi/netCDF4) ## News For details on the latest updates, see the [Changelog](https://github.com/Unidata/netcdf4-python/blob/master/Changelog). 11/01/2017: Version 1.3.1 released. Parallel IO support with MPI! Requires that netcdf-c and hdf5 be built with MPI support, and [mpi4py](http://mpi4py.readthedocs.io/en/stable). To open a file for parallel access in a program running in an MPI environment using mpi4py, just use `parallel=True` when creating the `Dataset` instance. See [`examples/mpi_example.py`](https://github.com/Unidata/netcdf4-python/blob/master/examples/mpi_example.py) for a demonstration. For more info, see the tutorial [section](http://unidata.github.io/netcdf4-python/#section13). 9/25/2017: Version [1.3.0](https://pypi.python.org/pypi/netCDF4/1.3.0) released. Bug fixes for `netcdftime` and optimizations for reading strided slices. `encoding` kwarg added to `Dataset.__init__` and `Dataset.filepath` to deal with oddball encodings in filename paths (`sys.getfilesystemencoding()` is used by default to determine encoding). Make sure numpy datatypes used to define CompoundTypes have `isalignedstruct` flag set to avoid segfaults - which required bumping the minimum required numpy from 1.7.0 to 1.9.0. In cases where `missing_value/valid_min/valid_max/_FillValue` cannot be safely cast to the variable's dtype, they are no longer be used to automatically mask the data and a warning message is issued. 6/10/2017: Version [1.2.9](https://pypi.python.org/pypi/netCDF4/1.2.9) released. Fixes for auto-scaling and masking when `_Unsigned` and/or `valid_min`, `valid_max` attributes present. setup.py updated so that `pip install` works if cython not installed. Now requires [setuptools](https://pypi.python.org/pypi/setuptools) version 18.0 or greater. 
6/1/2017: Version [1.2.8](https://pypi.python.org/pypi/netCDF4/1.2.8) released. From Changelog: * recognize `_Unsigned` attribute used by [netcdf-java](http://www.unidata.ucar.edu/software/thredds/current/netcdf-java/) to designate unsigned integer data stored with a signed integer type in netcdf-3 [issue #656](https://github.com/Unidata/netcdf4-python/issues/656). * add Dataset init memory parameter to allow loading a file from memory [pull request #652](https://github.com/Unidata/netcdf4-python/pull/652), [issue #406](https://github.com/Unidata/netcdf4-python/issues/406) and [issue #295](https://github.com/Unidata/netcdf4-python/issues/295). * fix for negative times in num2date [issue #659](https://github.com/Unidata/netcdf4-python/pull/659). * fix for failing tests in numpy 1.13 due to changes in `numpy.ma` [issue #662](https://github.com/Unidata/netcdf4-python/issues/662). * Checking for `_Encoding` attribute for `NC_STRING` variables, otherwise use 'utf-8'. 'utf-8' is used everywhere else, 'default_encoding' global module variable is no longer used. getncattr method now takes optional kwarg 'encoding' (default 'utf-8') so encoding of attributes can be specified if desired. If `_Encoding` is specified for an `NC_CHAR` (`'S1'`) variable, the chartostring utility function is used to convert the array of characters to an array of strings with one less dimension (the last dimension is interpreted as the length of each string) when reading the data. When writing the data, stringtochar is used to convert a numpy array of fixed length strings to an array of characters with one more dimension. chartostring and stringtochar now also have an 'encoding' kwarg. Automatic conversion to/from character to string arrays can be turned off via a new `set_auto_chartostring` Dataset and Variable method (default is `True`). Addresses [issue #654](https://github.com/Unidata/netcdf4-python/issues/654) * [Cython](http://cython.org) >= 0.19 now required, `_netCDF4.c` and `_netcdftime.c` removed from repository. 1/8/2017: Version [1.2.7](https://pypi.python.org/pypi/netCDF4/1.2.7) released. Python 3.6 compatibility, and fix for vector missing_values. 12/10/2016: Version [1.2.6](https://pypi.python.org/pypi/netCDF4/1.2.6) released. Bug fixes for Enum data type, and _FillValue/missing_value usage when data is stored in non-native endian format. Add get_variables_by_attributes to MFDataset. Support for python 2.6 removed. 12/1/2016: Version [1.2.5](https://pypi.python.org/pypi/netCDF4/1.2.5) released. See the [Changelog](https://github.com/Unidata/netcdf4-python/blob/master/Changelog) for changes. 4/15/2016: Version [1.2.4](https://pypi.python.org/pypi/netCDF4/1.2.4) released. Bugs in handling of variables with specified non-native "endian-ness" (byte-order) fixed ([issue #554] (https://github.com/Unidata/netcdf4-python/issues/554)). Build instructions updated and warning issued to deal with potential backwards incompatibility introduced when using HDF5 1.10.x (see [Unidata/netcdf-c/issue#250](https://github.com/Unidata/netcdf-c/issues/250)). 3/10/2016: Version [1.2.3](https://pypi.python.org/pypi/netCDF4/1.2.3) released. Various bug fixes. All text attributes in ``NETCDF4`` formatted files are now written as type ``NC_CHAR``, unless they contain unicode characters that cannot be encoded in ascii, in which case they are written as ``NC_STRING``. Previously, all unicode strings were written as ``NC_STRING``. 
This change preserves compatibility with clients, like Matlab, that can't deal with ``NC_STRING`` attributes. A ``setncattr_string`` method was added to force attributes to be written as ``NC_STRING``. 1/1/2016: Version [1.2.2](https://pypi.python.org/pypi/netCDF4/1.2.2) released. Mostly bugfixes, but with two new features. * support for the new ``NETCDF3_64BIT_DATA`` format introduced in netcdf-c 4.4.0. Similar to ``NETCDF3_64BIT`` (now ``NETCDF3_64BIT_OFFSET``), but includes 64 bit dimension sizes (> 2 billion), plus unsigned and 64 bit integer data types. Uses the classic (netcdf-3) data model, and does not use HDF5 as the underlying storage format. * Dimension objects now have a ``size`` attribute, which is the current length of the dimension (same as invoking ``len`` on the Dimension instance). The minimum required python version has now been increased from 2.5 to 2.6. 10/15/2015: Version [1.2.1](https://pypi.python.org/pypi/netCDF4/1.2.1) released. Adds the ability to slice Variables with unsorted integer sequences, and integer sequences with duplicates. 9/23/2015: Version [1.2.0](https://pypi.python.org/pypi/netCDF4/1.2.0) released. New features: * [get_variables_by_attributes](http://unidata.github.io/netcdf4-python/#netCDF4.Dataset.get_variables_by_attributes) ``Dataset`` and ``Group`` method for retrieving variables that have matching attributes. * Support for [Enum](http://unidata.github.io/netcdf4-python/#section12) data types. * [isopen](http://unidata.github.io/netcdf4-python/#netCDF4.Dataset.isopen) `Dataset` method. 7/28/2015: Version [1.1.9](https://pypi.python.org/pypi/netCDF4/1.1.9) bugfix release. 5/14/2015: Version [1.1.8](https://pypi.python.org/pypi/netCDF4/1.1.8) released. Unix-like paths can now be used in `createVariable` and `createGroup`. ```python v = nc.createVariable('/path/to/var1', ('xdim', 'ydim'), float) ``` will create a variable named 'var1', while also creating the groups 'path' and 'path/to' if they do not already exist. Similarly, ```python g = nc.createGroup('/path/to') ``` now acts like `mkdir -p` in unix, creating groups 'path' and '/path/to', if they don't already exist. Users who relied on `nc.createGroup(groupname)` failing when the group already exists will have to modify their code, since `nc.createGroup` will now return the existing group instance. `Dataset.__getitem__` was also added. `nc['/path/to']` now returns a group instance, and `nc['/path/to/var1']` now returns a variable instance. 3/19/2015: Version [1.1.7](https://pypi.python.org/pypi/netCDF4/1.1.7) released. Global Interpreter Lock (GIL) now released when extension module calls C library for read operations. This speeds up concurrent reads when using threads. Users who wish to use netcdf4-python inside threads should read http://www.hdfgroup.org/hdf5-quest.html#gconc regarding thread-safety in the HDF5 C library. Fixes to `setup.py` now ensure that `pip install netCDF4` with `export USE_NCCONFIG=0` will use environment variables to find paths to libraries and include files, instead of relying exclusively on the nc-config utility. ## Quick Start * Clone GitHub repository (`git clone https://github.com/Unidata/netcdf4-python.git`), or get source tarball from [PyPI](https://pypi.python.org/pypi/netCDF4). Links to Windows and OS X precompiled binary packages are also available on [PyPI](https://pypi.python.org/pypi/netCDF4). * Make sure [numpy](http://www.numpy.org/) and [Cython](http://cython.org/) are installed and you have [Python](https://www.python.org) 2.7 or newer. 
* Make sure [HDF5](http://www.h5py.org/) and netcdf-4 are installed, and the `nc-config` utility is in your Unix PATH. * Run `python setup.py build`, then `python setup.py install` (with `sudo` if necessary). * To run all the tests, execute `cd test && python run_all.py`. ## Documentation See the online [docs](http://unidata.github.io/netcdf4-python) for more details. ## Usage ###### Sample [iPython](http://ipython.org/) notebooks available in the examples directory on [reading](http://nbviewer.ipython.org/github/Unidata/netcdf4-python/blob/master/examples/reading_netCDF.ipynb) and [writing](http://nbviewer.ipython.org/github/Unidata/netcdf4-python/blob/master/examples/writing_netCDF.ipynb) netCDF data with Python. netcdf4-python-1.3.1rel/README.release000066400000000000000000000026611317565303700174030ustar00rootroot00000000000000* create a release branch ('vX.Y.Zrel'). In the release branch... * make sure version number in setup.py and netCDF4/_netCDF4.pyx are up to date (in _netCDF4.pyx, change 'Version' in first line of docstring at top of file, and __version__ variable). If netcdftime module has any updates, increment __version__ in netcdftime/_netcdftime.pyx. Update version number in PKG_INFO. * update Changelog and README.md as needed. * commit and push all of the above changes. * install the module (python setup.py install), then run 'sh create_docs.sh' to update html docs. Commit and push the update to docs/netCDF4/index.html. * create a pull request for the release branch. * After release branch has been merged, tag a release git tag -a vX.Y.Zrel -m "version X.Y.Z release" git push origin --tags * push an empty commit to the netcdf4-python-wheels repo to trigger new builds. (e.g. git commit --allow-empty -m "Trigger build") You will likely want to edit the .travis.yml file at https://github.com/MacPython/netcdf4-python-wheels to specify the BUILD_COMMIT before triggering a build. * update the pypi entry, upload the wheels from wheels.scipy.org. Lastly, create a source tarball using 'python setup.py sdist' and upload to pypi. * update web docs by copying docs/netCDF4/index.html somewhere, switch to the gh-pages branch, copy the index.html file back, commit and push the updated index.html file (see README.gh-pages). netcdf4-python-1.3.1rel/README.wheels.md000066400000000000000000000073231317565303700176510ustar00rootroot00000000000000# Building and uploading wheels ## For OSX We automate OSX wheel building using a custom github repository that builds on the travis-ci OSX machines. The travis-ci interface for the builds is : https://travis-ci.org/MacPython/netcdf4-python-wheels The driving github repository is : https://github.com/MacPython/netcdf4-python-wheels ### How it works The wheel-building repository: * does a fresh build of the required C / C++ libraries; * builds a netcdf4-python wheel, linking against these fresh builds; * processes the wheel using [delocate](https://pypi.python.org/pypi/delocate). `delocate` copies the required dynamic libraries into the wheel and relinks the extension modules against the copied libraries; * uploads the built wheel to http://wheels.scipy.org (a Rackspace container kindly donated by Rackspace to scikit-learn). The resulting wheel is therefore self-contained and does not need any external dynamic libraries apart from those provided as standard by OSX. ### Triggering a build You will need write permision to the github repository to trigger new builds on the travis-ci interface. Contact us on the mailing list if you need this. 
You can trigger a build by: * making a commit to the `netcdf4-python-wheels` repository (e.g. with `git commit --allow-empty`); or * clicking on the circular arrow icon towards the top right of the travis-ci page, to rerun the previous build. In general, it is better to trigger a build with a commit, because this makes a new set of build products and logs, keeping the old ones for reference. Keeping the old build logs helps us keep track of previous problems and successful builds. ### Which netcdf4-python commit does the repository build? By default, the `netcd4-python-wheels` repository is usually set up to build the latest git tag. To check whether this is so have a look around line 5 of `.travis.yml` in the `netcdf4-python-wheels` repository. You should see something like: ``` - BUILD_COMMIT='latest-tag' ``` If this is commented out, then the repository is set up to build the current commit in the `netcdf4-python` submodule of the repository. If it is set to another value then it will be specifying a commit to build. You can therefore build any arbitrary commit by specificying the commit hash or branch name or tag name in this line of the `.travis.yml` file. ### Uploading the built wheels to pypi Be careful, http://wheels.scipy.org points to a container on a distributed content delivery network. It can take up to 15 minutes for the new wheel file to get updated into the container at http://wheels.scipy.org. When the wheels are updated, you can of course just download them to your machine manually, and then upload them manually to pypi, or by using [twine][twine]. You can also use a script for doing this, housed at : https://github.com/MacPython/terryfy/blob/master/wheel-uploader You'll need [twine][twine] and [beautiful soup 4][bs4]. You will typically have a directory on your machine where you store wheels, called a `wheelhouse`. The typical call for `wheel-uploader` would then be something like: ``` wheel-uploader -v -w ~/wheelhouse netCDF4 1.1.8 ``` where: * `-v` means give verbose messages; * `-w ~/wheelhouse` means download the wheels from https://wheels.scipy.org to the directory `~/wheelhouse`; * `netCDF4` is the root name of the wheel(s) to download / upload; * `1.1.8` is the version to download / upload. So, in this case, `wheel-uploader` will download all wheels starting with `netCDF4-1.1.8-` from http://wheels.scipy.org to `~/wheelhouse`, then upload them to pypi. Of course, you will need permissions to upload to pypi, for this to work. [twine]: https://pypi.python.org/pypi/twine [bs4]: https://pypi.python.org/pypi/beautifulsoup4 netcdf4-python-1.3.1rel/appveyor.yml000066400000000000000000000044011317565303700174660ustar00rootroot00000000000000environment: # SDK v7.0 MSVC Express 2008's SetEnv.cmd script will fail if the # /E:ON and /V:ON options are not enabled in the batch script interpreter # See: http://stackoverflow.com/a/13751649/163740 CMD_IN_ENV: "cmd /E:ON /V:ON /C obvci_appveyor_python_build_env.cmd" matrix: - TARGET_ARCH: x64 CONDA_NPY: 111 CONDA_PY: 27 CONDA_INSTALL_LOCN: C:\\Miniconda-x64 - TARGET_ARCH: x64 CONDA_NPY: 111 CONDA_PY: 36 CONDA_INSTALL_LOCN: C:\\Miniconda35-x64 # We always use a 64-bit machine, but can build x86 distributions # with the TARGET_ARCH variable. platform: - x64 install: # If there is a newer build queued for the same PR, cancel this one. 
# The AppVeyor 'rollout builds' option is supposed to serve the same # purpose but it is problematic because it tends to cancel builds pushed # directly to master instead of just PR builds (or the converse). # credits: JuliaLang developers. - ps: if ($env:APPVEYOR_PULL_REQUEST_NUMBER -and $env:APPVEYOR_BUILD_NUMBER -ne ((Invoke-RestMethod ` https://ci.appveyor.com/api/projects/$env:APPVEYOR_ACCOUNT_NAME/$env:APPVEYOR_PROJECT_SLUG/history?recordsNumber=50).builds | ` Where-Object pullRequestId -eq $env:APPVEYOR_PULL_REQUEST_NUMBER)[0].buildNumber) { ` throw "There are newer queued builds for this pull request, failing early." } # Add path, activate `conda` and update conda. - cmd: set "PATH=%CONDA_INSTALL_LOCN%\\Scripts;%CONDA_INSTALL_LOCN%\\Library\\bin;%PATH%" - cmd: set PYTHONUNBUFFERED=1 - cmd: call %CONDA_INSTALL_LOCN%\Scripts\activate.bat # for obvci_appveyor_python_build_env.cmd - cmd: conda update --all --yes - cmd: conda install anaconda-client=1.6.3 --yes - cmd: conda install -c conda-forge --yes obvious-ci # for msinttypes and newer stuff - cmd: conda config --prepend channels conda-forge - cmd: conda config --set show_channel_urls yes - cmd: conda config --set always_yes true # For building conda packages - cmd: conda install --yes conda-build jinja2 anaconda-client # this is now the downloaded conda... - cmd: conda info -a # Skip .NET project specific build phase. build: off test_script: - "%CMD_IN_ENV% conda build conda.recipe --quiet" netcdf4-python-1.3.1rel/checkversion.py000066400000000000000000000005001317565303700201270ustar00rootroot00000000000000import netCDF4, sys, numpy sys.stdout.write('netcdf4-python version: %s\n'%netCDF4.__version__) sys.stdout.write('HDF5 lib version: %s\n'%netCDF4.__hdf5libversion__) sys.stdout.write('netcdf lib version: %s\n'%netCDF4.__netcdf4libversion__) sys.stdout.write('numpy version %s\n' % numpy.__version__) netcdf4-python-1.3.1rel/ci/000077500000000000000000000000001317565303700154725ustar00rootroot00000000000000netcdf4-python-1.3.1rel/ci/travis/000077500000000000000000000000001317565303700170025ustar00rootroot00000000000000netcdf4-python-1.3.1rel/ci/travis/build-parallel-netcdf.sh000077500000000000000000000006201317565303700234710ustar00rootroot00000000000000#!/bin/bash set -e echo "Using downloaded netCDF version ${NETCDF_VERSION} with parallel capabilities enabled" pushd /tmp wget ftp://ftp.unidata.ucar.edu/pub/netcdf/netcdf-${NETCDF_VERSION}.tar.gz tar -xzvf netcdf-${NETCDF_VERSION}.tar.gz pushd netcdf-${NETCDF_VERSION} ./configure --prefix $NETCDF_DIR --enable-netcdf-4 --enable-shared --disable-dap --enable-parallel make -j 2 make install popd netcdf4-python-1.3.1rel/conda.recipe/000077500000000000000000000000001317565303700174315ustar00rootroot00000000000000netcdf4-python-1.3.1rel/conda.recipe/bld.bat000066400000000000000000000006071317565303700206650ustar00rootroot00000000000000set SITECFG=%SRC_DIR%/setup.cfg echo [options] > %SITECFG% echo [directories] >> %SITECFG% echo HDF5_libdir = %LIBRARY_LIB% >> %SITECFG% echo HDF5_incdir = %LIBRARY_INC% >> %SITECFG% echo netCDF4_libdir = %LIBRARY_LIB% >> %SITECFG% echo netCDF4_incdir = %LIBRARY_INC% >> %SITECFG% "%PYTHON%" setup.py install --single-version-externally-managed --record record.txt if errorlevel 1 exit 1 netcdf4-python-1.3.1rel/conda.recipe/build.sh000066400000000000000000000003471317565303700210700ustar00rootroot00000000000000#!/bin/bash SETUPCFG=$SRC_DIR\setup.cfg echo "[options]" > $SETUPCFG echo "[directories]" >> $SETUPCFG echo "netCDF4_dir = $PREFIX" >> $SETUPCFG 
${PYTHON} setup.py install --single-version-externally-managed --record record.txt netcdf4-python-1.3.1rel/conda.recipe/meta.yaml000066400000000000000000000015011317565303700212400ustar00rootroot00000000000000{% set version = "dev" %} package: name: netcdf4 version: {{ version }} source: path: ../ build: number: 0 entry_points: - ncinfo = netCDF4.utils:ncinfo - nc4tonc3 = netCDF4.utils:nc4tonc3 - nc3tonc4 = netCDF4.utils:nc3tonc4 requirements: build: - python - setuptools - cython - numpy x.x - msinttypes # [win and py<35] - hdf5 1.8.17|1.8.17.* - libnetcdf 4.4.* run: - python - setuptools - numpy x.x - hdf5 1.8.17|1.8.17.* - libnetcdf 4.4.* test: source_files: - test imports: - netCDF4 - netcdftime commands: - ncinfo -h - nc4tonc3 -h - nc3tonc4 -h about: home: http://github.com/Unidata/netcdf4-python license: OSI Approved summary: 'Provides an object-oriented python interface to the netCDF version 4 library..' netcdf4-python-1.3.1rel/conda.recipe/run_test.py000066400000000000000000000002651317565303700216510ustar00rootroot00000000000000import os import netCDF4 # Run the unittests, skipping the opendap test. test_dir = os.path.join('test') os.chdir(test_dir) os.environ['NO_NET']='1' os.system('python run_all.py') netcdf4-python-1.3.1rel/create_docs.sh000066400000000000000000000004461317565303700177120ustar00rootroot00000000000000# Uses pdoc (https://github.com/BurntSushi/pdoc) # to create html docs from docstrings in Cython source. # Use hacked version at https://github.com/jswhit/pdoc # which extracts cython method docstrings and function signatures. pdoc --html --html-no-source --overwrite --html-dir 'docs' netCDF4 netcdf4-python-1.3.1rel/docs/000077500000000000000000000000001317565303700160275ustar00rootroot00000000000000netcdf4-python-1.3.1rel/docs/netCDF4/000077500000000000000000000000001317565303700172165ustar00rootroot00000000000000netcdf4-python-1.3.1rel/docs/netCDF4/index.html000066400000000000000000012750161317565303700212270ustar00rootroot00000000000000 netCDF4 API documentation Top

netCDF4 module

Version 1.3.1


Introduction

netcdf4-python is a Python interface to the netCDF C library.

netCDF version 4 has many features not found in earlier versions of the library and is implemented on top of HDF5. This module can read and write files in both the new netCDF 4 and the old netCDF 3 format, and can create files that are readable by HDF5 clients. The API is modelled after Scientific.IO.NetCDF, and should be familiar to users of that module.

Most new features of netCDF 4 are implemented, such as multiple unlimited dimensions, groups and zlib data compression. All the new numeric data types (such as 64 bit and unsigned integer types) are implemented. Compound (struct), variable length (vlen) and enumerated (enum) data types are supported, but not the opaque data type. Mixtures of compound, vlen and enum data types (such as compound types containing enums, or vlens containing compound types) are not supported.

Download

Requires

  • Python 2.7 or later (python 3 works too).
  • numpy array module, version 1.9.0 or later.
  • Cython, version 0.21 or later.
  • setuptools, version 18.0 or later.
  • The HDF5 C library version 1.8.4-patch1 or higher (1.8.x recommended) from the HDF Group. netCDF version 4.4.1 or higher is recommended if using HDF5 1.10.x - otherwise resulting files may be unreadable by clients using earlier versions of HDF5. For netCDF < 4.4.1, HDF5 version 1.8.x is recommended. Be sure to build with --enable-hl --enable-shared.
  • Libcurl, if you want OPeNDAP support.
  • HDF4, if you want to be able to read HDF4 "Scientific Dataset" (SD) files.
  • The netCDF-4 C library from the github releases page. Version 4.1.1 or higher is required (4.2 or higher recommended). Be sure to build with --enable-netcdf-4 --enable-shared, and set CPPFLAGS="-I $HDF5_DIR/include" and LDFLAGS="-L $HDF5_DIR/lib", where $HDF5_DIR is the directory where HDF5 was installed. If you want OPeNDAP support, add --enable-dap. If you want HDF4 SD support, add --enable-hdf4 and add the location of the HDF4 headers and library to $CPPFLAGS and $LDFLAGS.
  • for MPI parallel IO support, MPI-enabled versions of the HDF5 and netcdf libraries are required, as is the mpi4py python module.

Install

  • install the requisite python modules and C libraries (see above). It's easiest if all the C libs are built as shared libraries.
  • By default, the utility nc-config, installed with netcdf 4.1.2 or higher, will be used to determine where all the dependencies live.
  • If nc-config is not in your default $PATH edit the setup.cfg file in a text editor and follow the instructions in the comments. In addition to specifying the path to nc-config, you can manually set the paths to all the libraries and their include files (in case nc-config does not do the right thing).
  • run python setup.py build, then python setup.py install (as root if necessary).
  • pip install can also be used, with library paths set with environment variables. To make this work, the USE_SETUPCFG environment variable must be used to tell setup.py not to use setup.cfg. For example, USE_SETUPCFG=0 HDF5_INCDIR=/usr/include/hdf5/serial HDF5_LIBDIR=/usr/lib/x86_64-linux-gnu/hdf5/serial pip install has been shown to work on an Ubuntu/Debian linux system. Similarly, environment variables (all capitalized) can be used to set the include and library paths for hdf5, netCDF4, hdf4, szip, jpeg, curl and zlib. If the libraries are installed in standard places (e.g. /usr or /usr/local), the environment variables do not need to be set.
  • run the tests in the 'test' directory by running python run_all.py.

Tutorial

  1. Creating/Opening/Closing a netCDF file.
  2. Groups in a netCDF file.
  3. Dimensions in a netCDF file.
  4. Variables in a netCDF file.
  5. Attributes in a netCDF file.
  6. Writing data to and retrieving data from a netCDF variable.
  7. Dealing with time coordinates.
  8. Reading data from a multi-file netCDF dataset.
  9. Efficient compression of netCDF variables.
  10. Beyond homogeneous arrays of a fixed type - compound data types.
  11. Variable-length (vlen) data types.
  12. Enum data type.
  13. Parallel IO.

1) Creating/Opening/Closing a netCDF file.

To create a netCDF file from python, you simply call the Dataset constructor. This is also the method used to open an existing netCDF file. If the file is open for write access (mode='w', 'r+' or 'a'), you may write any type of data including new dimensions, groups, variables and attributes. netCDF files come in five flavors (NETCDF3_CLASSIC, NETCDF3_64BIT_OFFSET, NETCDF3_64BIT_DATA, NETCDF4_CLASSIC, and NETCDF4). NETCDF3_CLASSIC was the original netcdf binary format, and was limited to file sizes less than 2 Gb. NETCDF3_64BIT_OFFSET was introduced in version 3.6.0 of the library, and extended the original binary format to allow for file sizes greater than 2 Gb. NETCDF3_64BIT_DATA is a new format that requires version 4.4.0 of the C library - it extends the NETCDF3_64BIT_OFFSET binary format to allow for unsigned/64 bit integer data types and 64-bit dimension sizes. NETCDF3_64BIT is an alias for NETCDF3_64BIT_OFFSET. NETCDF4_CLASSIC files use the version 4 disk format (HDF5), but omit features not found in the version 3 API. They can be read by netCDF 3 clients only if they have been relinked against the netCDF 4 library. They can also be read by HDF5 clients. NETCDF4 files use the version 4 disk format (HDF5) and use the new features of the version 4 API. The netCDF4 module can read and write files in any of these formats. When creating a new file, the format may be specified using the format keyword in the Dataset constructor. The default format is NETCDF4. To see how a given file is formatted, you can examine the data_model attribute. Closing the netCDF file is accomplished via the close method of the Dataset instance.

Here's an example:

>>> from netCDF4 import Dataset
>>> rootgrp = Dataset("test.nc", "w", format="NETCDF4")
>>> print rootgrp.data_model
NETCDF4
>>> rootgrp.close()

Remote OPeNDAP-hosted datasets can be accessed for reading over http if a URL is provided to the Dataset constructor instead of a filename. However, this requires that the netCDF library be built with OPeNDAP support, via the --enable-dap configure option (added in version 4.0.1).
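
For instance (a minimal sketch, assuming the library was built with OPeNDAP support; the URL below is a hypothetical placeholder for a real OPeNDAP endpoint):

>>> from netCDF4 import Dataset
>>> url = "http://example.com/thredds/dodsC/somedata.nc" # hypothetical OPeNDAP URL
>>> remote = Dataset(url) # opened read-only over http
>>> print remote.variables.keys()
>>> remote.close()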

2) Groups in a netCDF file.

netCDF version 4 added support for organizing data in hierarchical groups, which are analogous to directories in a filesystem. Groups serve as containers for variables, dimensions and attributes, as well as other groups. A Dataset creates a special group, called the 'root group', which is similar to the root directory in a unix filesystem. To create Group instances, use the createGroup method of a Dataset or Group instance. createGroup takes a single argument, a python string containing the name of the new group. The new Group instances contained within the root group can be accessed by name using the groups dictionary attribute of the Dataset instance. Only NETCDF4 formatted files support Groups; if you try to create a Group in a netCDF 3 file you will get an error message.

>>> rootgrp = Dataset("test.nc", "a")
>>> fcstgrp = rootgrp.createGroup("forecasts")
>>> analgrp = rootgrp.createGroup("analyses")
>>> print rootgrp.groups
OrderedDict([("forecasts", 
              <netCDF4._netCDF4.Group object at 0x1b4b7b0>),
             ("analyses", 
              <netCDF4._netCDF4.Group object at 0x1b4b970>)])

Groups can exist within groups in a Dataset, just as directories exist within directories in a unix filesystem. Each Group instance has a groups attribute dictionary containing all of the group instances contained within that group. Each Group instance also has a path attribute that contains a simulated unix directory path to that group. To simplify the creation of nested groups, you can use a unix-like path as an argument to createGroup.

>>> fcstgrp1 = rootgrp.createGroup("/forecasts/model1")
>>> fcstgrp2 = rootgrp.createGroup("/forecasts/model2")

If any of the intermediate elements of the path do not exist, they are created, just as with the unix command 'mkdir -p'. If you try to create a group that already exists, no error will be raised, and the existing group will be returned.

Here's an example that shows how to navigate all the groups in a Dataset. The function walktree is a Python generator that is used to walk the directory tree. Note that printing the Dataset or Group object yields summary information about its contents.

>>> def walktree(top):
>>>     values = top.groups.values()
>>>     yield values
>>>     for value in top.groups.values():
>>>         for children in walktree(value):
>>>             yield children
>>> print rootgrp
>>> for children in walktree(rootgrp):
>>>      for child in children:
>>>          print child
<type "netCDF4._netCDF4.Dataset">
root group (NETCDF4 file format):
    dimensions:
    variables:
    groups: forecasts, analyses
<type "netCDF4._netCDF4.Group">
group /forecasts:
    dimensions:
    variables:
    groups: model1, model2
<type "netCDF4._netCDF4.Group">
group /analyses:
    dimensions:
    variables:
    groups:
<type "netCDF4._netCDF4.Group">
group /forecasts/model1:
    dimensions:
    variables:
    groups:
<type "netCDF4._netCDF4.Group">
group /forecasts/model2:
    dimensions:
    variables:
    groups:

3) Dimensions in a netCDF file.

netCDF defines the sizes of all variables in terms of dimensions, so before any variables can be created the dimensions they use must be created first. A special case, not often used in practice, is that of a scalar variable, which has no dimensions. A dimension is created using the createDimension method of a Dataset or Group instance. A Python string is used to set the name of the dimension, and an integer value is used to set the size. To create an unlimited dimension (a dimension that can be appended to), the size value is set to None or 0. In this example, both the time and level dimensions are unlimited. Having more than one unlimited dimension is a new netCDF 4 feature; in netCDF 3 files there may be only one, and it must be the first (leftmost) dimension of the variable.

>>> level = rootgrp.createDimension("level", None)
>>> time = rootgrp.createDimension("time", None)
>>> lat = rootgrp.createDimension("lat", 73)
>>> lon = rootgrp.createDimension("lon", 144)

All of the Dimension instances are stored in a python dictionary.

>>> print rootgrp.dimensions
OrderedDict([("level", <netCDF4._netCDF4.Dimension object at 0x1b48030>),
             ("time", <netCDF4._netCDF4.Dimension object at 0x1b481c0>),
             ("lat", <netCDF4._netCDF4.Dimension object at 0x1b480f8>),
             ("lon", <netCDF4._netCDF4.Dimension object at 0x1b48a08>)])

Calling the python len function with a Dimension instance returns the current size of that dimension. The isunlimited method of a Dimension instance can be used to determine if the dimension is unlimited, or appendable.

>>> print len(lon)
144
>>> print lon.isunlimited()
False
>>> print time.isunlimited()
True

Printing the Dimension object provides useful summary info, including the name and length of the dimension, and whether it is unlimited.

>>> for dimobj in rootgrp.dimensions.values():
>>>    print dimobj
<type "netCDF4._netCDF4.Dimension"> (unlimited): name = "level", size = 0
<type "netCDF4._netCDF4.Dimension"> (unlimited): name = "time", size = 0
<type "netCDF4._netCDF4.Dimension">: name = "lat", size = 73
<type "netCDF4._netCDF4.Dimension">: name = "lon", size = 144

Dimension names can be changed using the netCDF4.Dataset.renameDimension method of a Dataset or Group instance.
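
For example (a short sketch; the dimension is renamed and then renamed back so the rest of this tutorial is unaffected):

>>> rootgrp.renameDimension("lat","latitude")
>>> print rootgrp.dimensions.keys()
>>> rootgrp.renameDimension("latitude","lat")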

4) Variables in a netCDF file.

netCDF variables behave much like python multidimensional array objects supplied by the numpy module. However, unlike numpy arrays, netCDF4 variables can be appended to along one or more 'unlimited' dimensions. To create a netCDF variable, use the createVariable method of a Dataset or Group instance. The createVariable method has two mandatory arguments, the variable name (a Python string), and the variable datatype. The variable's dimensions are given by a tuple containing the dimension names (defined previously with createDimension). To create a scalar variable, simply leave out the dimensions keyword. The variable primitive datatypes correspond to the dtype attribute of a numpy array. You can specify the datatype as a numpy dtype object, or anything that can be converted to a numpy dtype object. Valid datatype specifiers include: 'f4' (32-bit floating point), 'f8' (64-bit floating point), 'i4' (32-bit signed integer), 'i2' (16-bit signed integer), 'i8' (64-bit signed integer), 'i1' (8-bit signed integer), 'u1' (8-bit unsigned integer), 'u2' (16-bit unsigned integer), 'u4' (32-bit unsigned integer), 'u8' (64-bit unsigned integer), or 'S1' (single-character string). The old Numeric single-character typecodes ('f','d','h', 's','b','B','c','i','l'), corresponding to ('f4','f8','i2','i2','i1','i1','S1','i4','i4'), will also work. The unsigned integer types and the 64-bit integer type can only be used if the file format is NETCDF4.

The dimensions themselves are usually also defined as variables, called coordinate variables. The createVariable method returns an instance of the Variable class whose methods can be used later to access and set variable data and attributes.

>>> times = rootgrp.createVariable("time","f8",("time",))
>>> levels = rootgrp.createVariable("level","i4",("level",))
>>> latitudes = rootgrp.createVariable("lat","f4",("lat",))
>>> longitudes = rootgrp.createVariable("lon","f4",("lon",))
>>> # two dimensions unlimited
>>> temp = rootgrp.createVariable("temp","f4",("time","level","lat","lon",))

To get summary info on a Variable instance in an interactive session, just print it.

>>> print temp
<type "netCDF4._netCDF4.Variable">
float32 temp(time, level, lat, lon)
    least_significant_digit: 3
    units: K
unlimited dimensions: time, level
current shape = (0, 0, 73, 144)

You can use a path to create a Variable inside a hierarchy of groups.

>>> ftemp = rootgrp.createVariable("/forecasts/model1/temp","f4",("time","level","lat","lon",))

If the intermediate groups do not yet exist, they will be created.

You can also query a Dataset or Group instance directly to obtain Group or Variable instances using paths.

>>> print rootgrp["/forecasts/model1"] # a Group instance
<type "netCDF4._netCDF4.Group">
group /forecasts/model1:
    dimensions(sizes):
    variables(dimensions): float32 temp(time,level,lat,lon)
    groups:
>>> print rootgrp["/forecasts/model1/temp"] # a Variable instance
<type "netCDF4._netCDF4.Variable">
float32 temp(time, level, lat, lon)
path = /forecasts/model1
unlimited dimensions: time, level
current shape = (0, 0, 73, 144)
filling on, default _FillValue of 9.96920996839e+36 used

All of the variables in the Dataset or Group are stored in a Python dictionary, in the same way as the dimensions:

>>> print rootgrp.variables
OrderedDict([("time", <netCDF4.Variable object at 0x1b4ba70>),
             ("level", <netCDF4.Variable object at 0x1b4bab0>),
             ("lat", <netCDF4.Variable object at 0x1b4baf0>),
             ("lon", <netCDF4.Variable object at 0x1b4bb30>),
             ("temp", <netCDF4.Variable object at 0x1b4bb70>)])

Variable names can be changed using the renameVariable method of a Dataset instance.
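
For example (again a short sketch, renaming the variable and then renaming it back):

>>> rootgrp.renameVariable("temp","temperature")
>>> print rootgrp.variables.keys()
>>> rootgrp.renameVariable("temperature","temp")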

5) Attributes in a netCDF file.

There are two types of attributes in a netCDF file, global and variable. Global attributes provide information about a group, or the entire dataset, as a whole. Variable attributes provide information about one of the variables in a group. Global attributes are set by assigning values to Dataset or Group instance variables. Variable attributes are set by assigning values to Variable instance variables. Attributes can be strings, numbers or sequences. Returning to our example,

>>> import time
>>> rootgrp.description = "bogus example script"
>>> rootgrp.history = "Created " + time.ctime(time.time())
>>> rootgrp.source = "netCDF4 python module tutorial"
>>> latitudes.units = "degrees north"
>>> longitudes.units = "degrees east"
>>> levels.units = "hPa"
>>> temp.units = "K"
>>> times.units = "hours since 0001-01-01 00:00:00.0"
>>> times.calendar = "gregorian"

The ncattrs method of a Dataset, Group or Variable instance can be used to retrieve the names of all the netCDF attributes. This method is provided as a convenience, since using the built-in dir Python function will return a bunch of private methods and attributes that cannot (or should not) be modified by the user.

>>> for name in rootgrp.ncattrs():
>>>     print "Global attr", name, "=", getattr(rootgrp,name)
Global attr description = bogus example script
Global attr history = Created Mon Nov  7 10:30:56 2005
Global attr source = netCDF4 python module tutorial

The __dict__ attribute of a Dataset, Group or Variable instance provides all the netCDF attribute name/value pairs in a python dictionary:

>>> print rootgrp.__dict__
OrderedDict([(u"description", u"bogus example script"),
             (u"history", u"Created Thu Mar  3 19:30:33 2011"),
             (u"source", u"netCDF4 python module tutorial")])

Attributes can be deleted from a netCDF Dataset, Group or Variable using the python del statement (i.e. del grp.foo removes the attribute foo from the group grp).

6) Writing data to and retrieving data from a netCDF variable.

Now that you have a netCDF Variable instance, how do you put data into it? You can just treat it like an array and assign data to a slice.

>>> import numpy
>>> lats =  numpy.arange(-90,91,2.5)
>>> lons =  numpy.arange(-180,180,2.5)
>>> latitudes[:] = lats
>>> longitudes[:] = lons
>>> print "latitudes =\n",latitudes[:]
latitudes =
[-90.  -87.5 -85.  -82.5 -80.  -77.5 -75.  -72.5 -70.  -67.5 -65.  -62.5
 -60.  -57.5 -55.  -52.5 -50.  -47.5 -45.  -42.5 -40.  -37.5 -35.  -32.5
 -30.  -27.5 -25.  -22.5 -20.  -17.5 -15.  -12.5 -10.   -7.5  -5.   -2.5
   0.    2.5   5.    7.5  10.   12.5  15.   17.5  20.   22.5  25.   27.5
  30.   32.5  35.   37.5  40.   42.5  45.   47.5  50.   52.5  55.   57.5
  60.   62.5  65.   67.5  70.   72.5  75.   77.5  80.   82.5  85.   87.5
  90. ]

Unlike NumPy's array objects, netCDF Variable objects with unlimited dimensions will grow along those dimensions if you assign data outside the currently defined range of indices.

>>> # append along two unlimited dimensions by assigning to slice.
>>> nlats = len(rootgrp.dimensions["lat"])
>>> nlons = len(rootgrp.dimensions["lon"])
>>> print "temp shape before adding data = ",temp.shape
temp shape before adding data =  (0, 0, 73, 144)
>>>
>>> from numpy.random import uniform
>>> temp[0:5,0:10,:,:] = uniform(size=(5,10,nlats,nlons))
>>> print "temp shape after adding data = ",temp.shape
temp shape after adding data =  (5, 10, 73, 144)
>>>
>>> # levels have grown, but no values yet assigned.
>>> print "levels shape after adding pressure data = ",levels.shape
levels shape after adding pressure data =  (10,)

Note that the size of the levels variable grows when data is appended along the level dimension of the variable temp, even though no data has yet been assigned to levels.

>>> # now, assign data to levels dimension variable.
>>> levels[:] =  [1000.,850.,700.,500.,300.,250.,200.,150.,100.,50.]

However, there are some differences between NumPy and netCDF variable slicing rules. Slices behave as usual, being specified as a start:stop:step triplet. Using a scalar integer index i takes the ith element and reduces the rank of the output array by one. Boolean array and integer sequence indexing behaves differently for netCDF variables than for numpy arrays. Only 1-d boolean arrays and integer sequences are allowed, and these indices work independently along each dimension (similar to the way vector subscripts work in fortran). This means that

>>> temp[0, 0, [0,1,2,3], [0,1,2,3]]

returns an array of shape (4,4) when slicing a netCDF variable, but for a numpy array it returns an array of shape (4,). Similarly, a netCDF variable of shape (2,3,4,5) indexed with [0, array([True, False, True]), array([False, True, True, True]), :] would return a (2, 3, 5) array. In NumPy, this would raise an error since it would be equivalent to [0, [0,1], [1,2,3], :]. When slicing with integer sequences, the indices need not be sorted and may contain duplicates (both of these are new features in version 1.2.1). While this behaviour may cause some confusion for those used to NumPy's 'fancy indexing' rules, it provides a very powerful way to extract data from multidimensional netCDF variables by using logical operations on the dimension arrays to create slices.

For example,

>>> tempdat = temp[::2, [1,3,6], lats>0, lons>0]

will extract time indices 0,2 and 4, pressure levels 850, 500 and 200 hPa, all Northern Hemisphere latitudes and Eastern Hemisphere longitudes, resulting in a numpy array of shape (3, 3, 36, 71).

>>> print "shape of fancy temp slice = ",tempdat.shape
shape of fancy temp slice =  (3, 3, 36, 71)

Special note for scalar variables: To extract data from a scalar variable v with no associated dimensions, use np.asarray(v) or v[...]. The result will be a numpy scalar array.
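
A brief sketch (the variable name scalarv is purely illustrative): a scalar variable is created by omitting the dimensions argument, a value can be written with assignValue (or read back with getValue), and np.asarray or ellipsis indexing returns a numpy scalar array.

>>> import numpy as np
>>> scalarv = rootgrp.createVariable("scalarv","f4") # no dimensions -> scalar variable
>>> scalarv.assignValue(3.14) # write a single value
>>> print np.asarray(scalarv), scalarv[...]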

7) Dealing with time coordinates.

Time coordinate values pose a special challenge to netCDF users. Most metadata standards (such as CF) specify that time should be measured relative to a fixed date using a certain calendar, with units specified like hours since YY-MM-DD hh:mm:ss. These units can be awkward to deal with, without a utility to convert the values to and from calendar dates. The functions num2date and date2num are provided with this package to do just that. Here's an example of how they can be used:

>>> # fill in times.
>>> from datetime import datetime, timedelta
>>> from netCDF4 import num2date, date2num
>>> dates = [datetime(2001,3,1)+n*timedelta(hours=12) for n in range(temp.shape[0])]
>>> times[:] = date2num(dates,units=times.units,calendar=times.calendar)
>>> print "time values (in units %s): " % times.units+"\n",times[:]
time values (in units hours since January 1, 0001):
[ 17533056.  17533068.  17533080.  17533092.  17533104.]
>>> dates = num2date(times[:],units=times.units,calendar=times.calendar)
>>> print "dates corresponding to time values:\n",dates
dates corresponding to time values:
[2001-03-01 00:00:00 2001-03-01 12:00:00 2001-03-02 00:00:00
 2001-03-02 12:00:00 2001-03-03 00:00:00]

num2date converts numeric values of time in the specified units and calendar to datetime objects, and date2num does the reverse. All the calendars currently defined in the CF metadata convention are supported. A function called date2index is also provided which returns the indices of a netCDF time variable corresponding to a sequence of datetime instances.
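
A short sketch of date2index usage, continuing the session above (select="nearest" is used here; "exact", "before" and "after" are other choices the library accepts):

>>> from netCDF4 import date2index
>>> from datetime import datetime
>>> index = date2index(datetime(2001,3,1,12), times, select="nearest")
>>> print index
1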

8) Reading data from a multi-file netCDF dataset.

If you want to read data from a variable that spans multiple netCDF files, you can use the MFDataset class to read the data as if it were contained in a single file. Instead of using a single filename to create a Dataset instance, create a MFDataset instance with either a list of filenames, or a string with a wildcard (which is then converted to a sorted list of files using the python glob module). Variables in the list of files that share the same unlimited dimension are aggregated together, and can be sliced across multiple files. To illustrate this, let's first create a bunch of netCDF files with the same variable (with the same unlimited dimension). The files must be in NETCDF3_64BIT_OFFSET, NETCDF3_64BIT_DATA, NETCDF3_CLASSIC or NETCDF4_CLASSIC format (NETCDF4 formatted multi-file datasets are not supported).

>>> for nf in range(10):
>>>     f = Dataset("mftest%s.nc" % nf,"w")
>>>     f.createDimension("x",None)
>>>     x = f.createVariable("x","i",("x",))
>>>     x[0:10] = numpy.arange(nf*10,10*(nf+1))
>>>     f.close()

Now read all the files back in at once with MFDataset

>>> from netCDF4 import MFDataset
>>> f = MFDataset("mftest*nc")
>>> print f.variables["x"][:]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74
 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99]

Note that MFDataset can only be used to read, not write, multi-file datasets.

9) Efficient compression of netCDF variables.

Data stored in netCDF 4 Variable objects can be compressed and decompressed on the fly. The parameters for the compression are determined by the zlib, complevel and shuffle keyword arguments to the createVariable method. To turn on compression, set zlib=True. The complevel keyword regulates the speed and efficiency of the compression (1 being fastest, but lowest compression ratio, 9 being slowest but best compression ratio). The default value of complevel is 4. Setting shuffle=False will turn off the HDF5 shuffle filter, which de-interlaces a block of data before compression by reordering the bytes. The shuffle filter can significantly improve compression ratios, and is on by default. Setting the fletcher32 keyword argument to createVariable to True (it's False by default) enables the Fletcher32 checksum algorithm for error detection. It's also possible to set the HDF5 chunking parameters and endian-ness of the binary data stored in the HDF5 file with the chunksizes and endian keyword arguments to createVariable. These keyword arguments are only relevant for NETCDF4 and NETCDF4_CLASSIC files (where the underlying file format is HDF5) and are silently ignored if the file format is NETCDF3_CLASSIC, NETCDF3_64BIT_OFFSET or NETCDF3_64BIT_DATA.
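
For instance, checksums and explicit chunking could be requested like this (a sketch only; the variable name and chunk sizes are illustrative and should be tuned to the expected access pattern):

>>> temp2 = rootgrp.createVariable("temp2","f4",("time","level","lat","lon",),zlib=True,complevel=4,shuffle=True,fletcher32=True,chunksizes=(1,1,73,144))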

If your data only has a certain number of digits of precision (say for example, it is temperature data that was measured with a precision of 0.1 degrees), you can dramatically improve zlib compression by quantizing (or truncating) the data using the least_significant_digit keyword argument to createVariable. The least significant digit is the power of ten of the smallest decimal place in the data that is a reliable value. For example if the data has a precision of 0.1, then setting least_significant_digit=1 will cause the data to be quantized using numpy.around(scale*data)/scale, where scale = 2**bits, and bits is determined so that a precision of 0.1 is retained (in this case bits=4). Effectively, this makes the compression 'lossy' instead of 'lossless', that is some precision in the data is sacrificed for the sake of disk space.

In our example, try replacing the line

>>> temp = rootgrp.createVariable("temp","f4",("time","level","lat","lon",))

with

>>> temp = rootgrp.createVariable("temp","f4",("time","level","lat","lon",),zlib=True)

and then

>>> temp = rootgrp.createVariable("temp","f4",("time","level","lat","lon",),zlib=True,least_significant_digit=3)

and see how much smaller the resulting files are.

10) Beyond homogeneous arrays of a fixed type - compound data types.

Compound data types map directly to numpy structured (a.k.a 'record') arrays. Structured arrays are akin to C structs, or derived types in Fortran. They allow for the construction of table-like structures composed of combinations of other data types, including other compound types. Compound types might be useful for representing multiple parameter values at each point on a grid, or at each time and space location for scattered (point) data. You can then access all the information for a point by reading one variable, instead of reading different parameters from different variables. Compound data types are created from the corresponding numpy data type using the createCompoundType method of a Dataset or Group instance. Since there is no native complex data type in netcdf, compound types are handy for storing numpy complex arrays. Here's an example:

>>> f = Dataset("complex.nc","w")
>>> size = 3 # length of 1-d complex array
>>> # create sample complex data.
>>> datac = numpy.exp(1j*(1.+numpy.linspace(0, numpy.pi, size)))
>>> # create complex128 compound data type.
>>> complex128 = numpy.dtype([("real",numpy.float64),("imag",numpy.float64)])
>>> complex128_t = f.createCompoundType(complex128,"complex128")
>>> # create a variable with this data type, write some data to it.
>>> f.createDimension("x_dim",None)
>>> v = f.createVariable("cmplx_var",complex128_t,"x_dim")
>>> data = numpy.empty(size,complex128) # numpy structured array
>>> data["real"] = datac.real; data["imag"] = datac.imag
>>> v[:] = data # write numpy structured array to netcdf compound var
>>> # close and reopen the file, check the contents.
>>> f.close(); f = Dataset("complex.nc")
>>> v = f.variables["cmplx_var"]
>>> datain = v[:] # read in all the data into a numpy structured array
>>> # create an empty numpy complex array
>>> datac2 = numpy.empty(datain.shape,numpy.complex128)
>>> # .. fill it with contents of structured array.
>>> datac2.real = datain["real"]; datac2.imag = datain["imag"]
>>> print datac.dtype,datac # original data
complex128 [ 0.54030231+0.84147098j -0.84147098+0.54030231j  -0.54030231-0.84147098j]
>>>
>>> print datac2.dtype,datac2 # data from file
complex128 [ 0.54030231+0.84147098j -0.84147098+0.54030231j  -0.54030231-0.84147098j]

Compound types can be nested, but you must create the 'inner' ones first. Not all possible numpy structured arrays can be represented as Compound variables - an error message will be raised if you try to create one that is not supported. All of the compound types defined for a Dataset or Group are stored in a Python dictionary, just like variables and dimensions. As always, printing objects gives useful summary information in an interactive session:

>>> print f
<type "netCDF4._netCDF4.Dataset">
root group (NETCDF4 file format):
    dimensions: x_dim
    variables: cmplx_var
    groups:
>>> print f.variables["cmplx_var"]
<type "netCDF4._netCDF4.Variable">
compound cmplx_var(x_dim)
compound data type: [("real", "<f8"), ("imag", "<f8")]
unlimited dimensions: x_dim
current shape = (3,)
>>> print f.cmptypes
OrderedDict([("complex128", <netCDF4.CompoundType object at 0x1029eb7e8>)])
>>> print f.cmptypes["complex128"]
<type "netCDF4._netCDF4.CompoundType">: name = "complex128", numpy dtype = [(u"real","<f8"), (u"imag", "<f8")]
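
As noted above, compound types can be nested as long as the 'inner' type is created first. A minimal sketch (the file, type and field names here are purely illustrative):

>>> f2 = Dataset("nested_compound.nc","w") # hypothetical file name
>>> winds = numpy.dtype([("speed",numpy.float32),("direction",numpy.float32)])
>>> winds_t = f2.createCompoundType(winds,"winds_t") # inner type first
>>> obs = numpy.dtype([("pressure",numpy.float64),("wind",winds)])
>>> obs_t = f2.createCompoundType(obs,"obs_t") # outer type refers to the inner dtype
>>> f2.close()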

11) Variable-length (vlen) data types.

NetCDF 4 has support for variable-length or "ragged" arrays. These are arrays of variable length sequences having the same type. To create a variable-length data type, use the createVLType method of a Dataset or Group instance.

>>> f = Dataset("tst_vlen.nc","w")
>>> vlen_t = f.createVLType(numpy.int32, "phony_vlen")

The numpy datatype of the variable-length sequences and the name of the new datatype must be specified. Any of the primitive datatypes can be used (signed and unsigned integers, 32 and 64 bit floats, and characters), but compound data types cannot. A new variable can then be created using this datatype.

>>> x = f.createDimension("x",3)
>>> y = f.createDimension("y",4)
>>> vlvar = f.createVariable("phony_vlen_var", vlen_t, ("y","x"))

Since there is no native vlen datatype in numpy, vlen arrays are represented in python as object arrays (arrays of dtype object). These are arrays whose elements are Python object pointers, and can contain any type of python object. For this application, they must contain 1-D numpy arrays all of the same type but of varying length. In this case, they contain 1-D numpy int32 arrays of random length between 1 and 10.

>>> import random
>>> data = numpy.empty(len(y)*len(x),object)
>>> for n in range(len(y)*len(x)):
>>>    data[n] = numpy.arange(random.randint(1,10),dtype="int32")+1
>>> data = numpy.reshape(data,(len(y),len(x)))
>>> vlvar[:] = data
>>> print "vlen variable =\n",vlvar[:]
vlen variable =
[[[ 1  2  3  4  5  6  7  8  9 10] [1 2 3 4 5] [1 2 3 4 5 6 7 8]]
 [[1 2 3 4 5 6 7] [1 2 3 4 5 6] [1 2 3 4 5]]
 [[1 2 3 4 5] [1 2 3 4] [1]]
 [[ 1  2  3  4  5  6  7  8  9 10] [ 1  2  3  4  5  6  7  8  9 10]
  [1 2 3 4 5 6 7 8]]]
>>> print f
<type "netCDF4._netCDF4.Dataset">
root group (NETCDF4 file format):
    dimensions: x, y
    variables: phony_vlen_var
    groups:
>>> print f.variables["phony_vlen_var"]
<type "netCDF4._netCDF4.Variable">
vlen phony_vlen_var(y, x)
vlen data type: int32
unlimited dimensions:
current shape = (4, 3)
>>> print f.vltypes["phony_vlen"]
<type "netCDF4._netCDF4.VLType">: name = "phony_vlen", numpy dtype = int32

Numpy object arrays containing python strings can also be written as vlen variables. For vlen strings, you don't need to create a vlen data type. Instead, simply use the python str builtin (or a numpy string datatype with fixed length greater than 1) when calling the createVariable method.

>>> z = f.createDimension("z",10)
>>> strvar = f.createVariable("strvar", str, "z")

In this example, an object array is filled with random python strings with random lengths between 2 and 12 characters, and the data in the object array is assigned to the vlen string variable.

>>> chars = "1234567890aabcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
>>> data = numpy.empty(10,"O")
>>> for n in range(10):
>>>     stringlen = random.randint(2,12)
>>>     data[n] = "".join([random.choice(chars) for i in range(stringlen)])
>>> strvar[:] = data
>>> print "variable-length string variable:\n",strvar[:]
variable-length string variable:
[aDy29jPt 5DS9X8 jd7aplD b8t4RM jHh8hq KtaPWF9cQj Q1hHN5WoXSiT MMxsVeq tdLUzvVTzj]
>>> print f
<type "netCDF4._netCDF4.Dataset">
root group (NETCDF4 file format):
    dimensions: x, y, z
    variables: phony_vlen_var, strvar
    groups:
>>> print f.variables["strvar"]
<type "netCDF4._netCDF4.Variable">
vlen strvar(z)
vlen data type: <type "str">
unlimited dimensions:
current shape = (10,)

It is also possible to set contents of vlen string variables with numpy arrays of any string or unicode data type. Note, however, that accessing the contents of such variables will always return numpy arrays with dtype object.
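
For example, a minimal sketch (continuing the example above, and assuming the z dimension has length 10):

>>> # assign a fixed-width numpy string array; reading it back still
>>> # returns a numpy object array of python strings.
>>> strvar[:] = numpy.array(["foo","bar","baz","qux","quux","corge",
>>>     "grault","garply","waldo","fred"], dtype="S6")
>>> print strvar[:].dtype
object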

12) Enum data type.

netCDF4 has an enumerated data type, which is an integer datatype that is restricted to certain named values. Since Enums don't map directly to a numpy data type, they are read and written as integer arrays.

Here's an example of using an Enum type to hold cloud type data. The base integer data type and a python dictionary describing the allowed values and their names are used to define an Enum data type using createEnumType.

>>> nc = Dataset('clouds.nc','w')
>>> # python dict with allowed values and their names.
>>> enum_dict = {u'Altocumulus': 7, u'Missing': 255, 
>>> u'Stratus': 2, u'Clear': 0,
>>> u'Nimbostratus': 6, u'Cumulus': 4, u'Altostratus': 5,
>>> u'Cumulonimbus': 1, u'Stratocumulus': 3}
>>> # create the Enum type called 'cloud_t'.
>>> cloud_type = nc.createEnumType(numpy.uint8,'cloud_t',enum_dict)
>>> print cloud_type
<type 'netCDF4._netCDF4.EnumType'>: name = 'cloud_t',
numpy dtype = uint8, fields/values ={u'Cumulus': 4,
u'Altocumulus': 7, u'Missing': 255,
u'Stratus': 2, u'Clear': 0,
u'Cumulonimbus': 1, u'Stratocumulus': 3,
u'Nimbostratus': 6, u'Altostratus': 5}

A new variable can be created in the usual way using this data type. Integer data is written to the variable that represents the named cloud types in enum_dict. A ValueError will be raised if an attempt is made to write an integer value not associated with one of the specified names.

>>> time = nc.createDimension('time',None)
>>> # create a 1d variable of type 'cloud_type'.
>>> # The fill_value is set to the 'Missing' named value.
>>> cloud_var = nc.createVariable('primary_cloud',cloud_type,'time',
>>>     fill_value=enum_dict['Missing'])
>>> # write some data to the variable.
>>> cloud_var[:] = [enum_dict['Clear'],enum_dict['Stratus'],
>>> enum_dict['Cumulus'],enum_dict['Missing'],
>>> enum_dict['Cumulonimbus']]
>>> nc.close()
>>> # reopen the file, read the data.
>>> nc = Dataset('clouds.nc')
>>> cloud_var = nc.variables['primary_cloud']
>>> print cloud_var
<type 'netCDF4._netCDF4.Variable'>
enum primary_cloud(time)
    _FillValue: 255
enum data type: uint8
unlimited dimensions: time
current shape = (5,)
>>> print cloud_var.datatype.enum_dict
{u'Altocumulus': 7, u'Missing': 255, u'Stratus': 2,
u'Clear': 0, u'Nimbostratus': 6, u'Cumulus': 4,
u'Altostratus': 5, u'Cumulonimbus': 1,
u'Stratocumulus': 3}
>>> print cloud_var[:]
[0 2 4 -- 1]
>>> nc.close()

13) Parallel IO.

If MPI parallel enabled versions of netcdf and hdf5 are detected, and mpi4py is installed, netcdf4-python will be built with parallel IO capabilities enabled. To use parallel IO, your program must be running in an MPI environment using mpi4py.

>>> from mpi4py import MPI
>>> import numpy as np
>>> from netCDF4 import Dataset
>>> rank = MPI.COMM_WORLD.rank  # The process ID (integer 0-3 for 4-process run)

To run an MPI-based parallel program like this, you must use mpiexec to launch several parallel instances of Python (for example, using mpiexec -np 4 python mpi_example.py). The parallel features of netcdf4-python are mostly transparent - when a new dataset is created or an existing dataset is opened, use the parallel keyword to enable parallel access.

>>> nc = Dataset('parallel_tst.nc','w',parallel=True)

The optional comm keyword may be used to specify a particular MPI communicator (MPI_COMM_WORLD is used by default). Each process (or rank) can now write to the file independently. In this example, the process rank is written to a different variable index on each task.

>>> d = nc.createDimension('dim',4)
>>> v = nc.createVariable('var', np.int, 'dim')
>>> v[rank] = rank
>>> nc.close()

% ncdump parallel_tst.nc
netcdf parallel_tst {
dimensions:
    dim = 4 ;
variables:
    int64 var(dim) ;
data:

    var = 0, 1, 2, 3 ;
}

There are two types of parallel IO, independent (the default) and collective. Independent IO means that each process can do IO independently; it should not depend on or be affected by other processes. Collective IO is a way of doing IO defined in the MPI-IO standard; unlike independent IO, all processes must participate in doing IO. To toggle back and forth between the two types of IO, use the set_collective Variable method (a short sketch follows the list below). All metadata operations (such as creation of groups, types, variables, dimensions, or attributes) are collective. There are a couple of important limitations of parallel IO:

  • If a variable has an unlimited dimension, appending data must be done in collective mode. If the write is done in independent mode, the operation will fail with a generic "HDF Error".
  • You cannot write compressed data in parallel (although you can read it).
  • You cannot use variable-length (VLEN) data types.
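
For example, here is a minimal sketch (with hypothetical file and variable names, reusing the MPI setup above) of switching a variable to collective mode before appending along an unlimited dimension:

>>> nc = Dataset('parallel_unlim.nc', 'w', parallel=True)
>>> d = nc.createDimension('time', None)          # unlimited dimension
>>> v = nc.createVariable('var', np.int, 'time')
>>> v.set_collective(True)   # appending along an unlimited dimension must be collective
>>> v[rank] = rank           # every rank participates in the write
>>> nc.close()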

All of the code in this tutorial is available in examples/tutorial.py, except the parallel IO example, which is in examples/mpi_example.py. Unit tests are in the test directory.

contact: Jeffrey Whitaker jeffrey.s.whitaker@noaa.gov

copyright: 2008 by Jeffrey Whitaker.

license: Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both the copyright notice and this permission notice appear in supporting documentation. THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.


Functions

def chartostring(

b,encoding='utf-8')

convert a character array to a string array with one less dimension.

b: Input character array (numpy datatype 'S1' or 'U1'). Will be converted to an array of strings, where each string has a fixed length of b.shape[-1] characters.

optional kwarg encoding can be used to specify character encoding (default utf-8).

returns a numpy string array with datatype 'UN' and shape b.shape[:-1], where N=b.shape[-1].
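
A minimal usage sketch (hypothetical data):

>>> from netCDF4 import chartostring
>>> import numpy
>>> b = numpy.array([list("hello"), list("world")], dtype="S1")  # shape (2, 5)
>>> s = chartostring(b)   # each element is a 5-character string
>>> print s.shape
(2,)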

def date2index(

dates, nctime, calendar=None, select='exact')

Return indices of a netCDF time variable corresponding to the given dates.

dates: A datetime object or a sequence of datetime objects. The datetime objects should not include a time-zone offset.

nctime: A netCDF time variable object. The nctime object must have a units attribute.

calendar: describes the calendar used in the time calculations. All the values currently defined in the CF metadata convention are supported. Valid calendars: 'standard', 'gregorian', 'proleptic_gregorian', 'noleap', '365_day', '360_day', 'julian', 'all_leap', '366_day'. Default is 'standard', which is a mixed Julian/Gregorian calendar. If calendar is None, its value is given by nctime.calendar, or 'standard' if no such attribute exists.

select: 'exact', 'before', 'after', 'nearest' The index selection method. exact will return the indices perfectly matching the dates given. before and after will return the indices corresponding to the dates just before or just after the given dates if an exact match cannot be found. nearest will return the indices that correspond to the closest dates.

returns an index (indices) of the netCDF time variable corresponding to the given datetime object(s).
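
A minimal usage sketch (assuming a file with a CF-style time variable named 'time' that has a units attribute; the file name is hypothetical):

>>> from datetime import datetime
>>> from netCDF4 import Dataset, date2index
>>> nc = Dataset("example.nc")            # hypothetical file
>>> timevar = nc.variables["time"]
>>> i = date2index(datetime(2001, 3, 1), timevar, select="nearest")
>>> print timevar[i]                      # time value closest to 2001-03-01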

def date2num(

dates,units,calendar='standard')

Return numeric time values given datetime objects. The units of the numeric time values are described by the units argument and the calendar keyword. The datetime objects must be in UTC with no time-zone offset. If there is a time-zone offset in units, it will be applied to the returned numeric values.

dates: A datetime object or a sequence of datetime objects. The datetime objects should not include a time-zone offset.

units: a string of the form <time units> since <reference time> describing the time units. <time units> can be days, hours, minutes, seconds, milliseconds or microseconds. <reference time> is the time origin.

calendar: describes the calendar used in the time calculations. All the values currently defined in the CF metadata convention are supported. Valid calendars: 'standard', 'gregorian', 'proleptic_gregorian', 'noleap', '365_day', '360_day', 'julian', 'all_leap', '366_day'. Default is 'standard', which is a mixed Julian/Gregorian calendar.

returns a numeric time value, or an array of numeric time values with approximately millisecond accuracy.
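
A minimal usage sketch (the two dates below fall 59 and 60 days after the reference time, i.e. 1416 and 1440 hours):

>>> from datetime import datetime
>>> from netCDF4 import date2num
>>> times = date2num([datetime(2001, 3, 1), datetime(2001, 3, 2)],
>>>     units="hours since 2001-01-01 00:00:00", calendar="standard")
>>> print times
[ 1416.  1440.]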

def getlibversion(

)

returns a string describing the version of the netcdf library used to build the module, and when it was built.

def num2date(

times,units,calendar='standard')

Return datetime objects given numeric time values. The units of the numeric time values are described by the units argument and the calendar keyword. The returned datetime objects represent UTC with no time-zone offset, even if the specified units contain a time-zone offset.

times: numeric time values.

units: a string of the form <time units> since <reference time> describing the time units. <time units> can be days, hours, minutes, seconds, milliseconds or microseconds. <reference time> is the time origin.

calendar: describes the calendar used in the time calculations. All the values currently defined in the CF metadata convention are supported. Valid calendars: 'standard', 'gregorian', 'proleptic_gregorian', 'noleap', '365_day', '360_day', 'julian', 'all_leap', '366_day'. Default is 'standard', which is a mixed Julian/Gregorian calendar.

returns a datetime instance, or an array of datetime instances with approximately millisecond accuracy.

Note: The datetime instances returned are 'real' python datetime objects if calendar='proleptic_gregorian', or calendar='standard' or 'gregorian' and the date is after the breakpoint between the Julian and Gregorian calendars (1582-10-15). Otherwise, they are 'phony' datetime objects which support some but not all the methods of 'real' python datetime objects. The datetime instances do not contain a time-zone offset, even if the specified units contains one.
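
A minimal usage sketch (the reverse of the date2num example above; the printed form of the returned array is illustrative):

>>> from netCDF4 import num2date
>>> dates = num2date([1416., 1440.], units="hours since 2001-01-01 00:00:00")
>>> print dates
[datetime.datetime(2001, 3, 1, 0, 0) datetime.datetime(2001, 3, 2, 0, 0)]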

def stringtoarr(

a, NUMCHARS,dtype='S')

convert a string to a character array of length NUMCHARS

a: Input python string.

NUMCHARS: number of characters used to represent string (if len(a) < NUMCHARS, it will be padded on the right with blanks).

dtype: type of numpy array to return. Default is 'S', which means an array of dtype 'S1' will be returned. If dtype='U', a unicode array (dtype = 'U1') will be returned.

returns a rank 1 numpy character array of length NUMCHARS with datatype 'S1' (default) or 'U1' (if dtype='U')
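
A minimal usage sketch (output shown for the default dtype='S'):

>>> from netCDF4 import stringtoarr
>>> c = stringtoarr("hi", 4)   # 'hi' padded with blanks to 4 characters
>>> print c.shape, c.dtype
(4,) |S1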

def stringtochar(

a,encoding='utf-8')

convert a string array to a character array with one extra dimension

a: Input numpy string array with numpy datatype 'SN' or 'UN', where N is the number of characters in each string. Will be converted to an array of characters (datatype 'S1' or 'U1') of shape a.shape + (N,).

optional kwarg encoding can be used to specify character encoding (default utf-8).

returns a numpy character array with datatype 'S1' or 'U1' and shape a.shape + (N,), where N is the length of each string in a.
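
A minimal usage sketch (hypothetical data):

>>> from netCDF4 import stringtochar
>>> import numpy
>>> a = numpy.array(["ab", "cd"], dtype="S2")
>>> c = stringtochar(a)    # dtype 'S1', one extra trailing dimension
>>> print c.shape
(2, 2)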

Classes

class CompoundType

A CompoundType instance is used to describe a compound data type, and can be passed to the createVariable method of a Dataset or Group instance. Compound data types map to numpy structured arrays. See __init__ for more details.

The instance variables dtype and name should not be modified by the user.

Ancestors (in MRO)

Class variables

var dtype

A numpy dtype object describing the compound data type.

var name

String name.

Static methods

def __init__(

group, datatype, datatype_name)

CompoundType constructor.

group: Group instance to associate with the compound datatype.

datatype: A numpy dtype object describing a structured (a.k.a record) array. Can be composed of homogeneous numeric or character data types, or other structured array data types.

datatype_name: a Python string containing a description of the compound data type.

Note 1: When creating nested compound data types, the inner compound data types must already be associated with CompoundType instances (so create CompoundType instances for the innermost structures first).

Note 2: CompoundType instances should be created using the createCompoundType method of a Dataset or Group instance, not using this class directly.

class Dataset

A netCDF Dataset is a collection of dimensions, groups, variables and attributes. Together they describe the meaning of data and relations among data fields stored in a netCDF file. See __init__ for more details.

A list of attribute names corresponding to global netCDF attributes defined for the Dataset can be obtained with the ncattrs method. These attributes can be created by assigning to an attribute of the Dataset instance. A dictionary containing all the netCDF attribute name/value pairs is provided by the __dict__ attribute of a Dataset instance.

The following class variables are read-only and should not be modified by the user.

dimensions: The dimensions dictionary maps the names of dimensions defined for the Group or Dataset to instances of the Dimension class.

variables: The variables dictionary maps the names of variables defined for this Dataset or Group to instances of the Variable class.

groups: The groups dictionary maps the names of groups created for this Dataset or Group to instances of the Group class (the Dataset class is simply a special case of the Group class which describes the root group in the netCDF4 file).

cmptypes: The cmptypes dictionary maps the names of compound types defined for the Group or Dataset to instances of the CompoundType class.

vltypes: The vltypes dictionary maps the names of variable-length types defined for the Group or Dataset to instances of the VLType class.

enumtypes: The enumtypes dictionary maps the names of Enum types defined for the Group or Dataset to instances of the EnumType class.

data_model: data_model describes the netCDF data model version, one of NETCDF3_CLASSIC, NETCDF4, NETCDF4_CLASSIC, NETCDF3_64BIT_OFFSET or NETCDF3_64BIT_DATA.

file_format: same as data_model, retained for backwards compatibility.

disk_format: disk_format describes the underlying file format, one of NETCDF3, HDF5, HDF4, PNETCDF, DAP2, DAP4 or UNDEFINED. Only available if using netcdf C library version >= 4.3.1, otherwise will always return UNDEFINED.

parent: parent is a reference to the parent Group instance. None for the root group or Dataset instance.

path: path shows the location of the Group in the Dataset in a unix directory format (the names of groups in the hierarchy separated by forward slashes). A Dataset instance is the root group, so the path is simply '/'.

keepweakref: If True, child Dimension and Variables objects only keep weak references to the parent Dataset or Group.

Ancestors (in MRO)

Class variables

var cmptypes

The cmptypes dictionary maps the names of compound types defined for the Group or Dataset to instances of the CompoundType class.

var data_model

data_model describes the netCDF data model version, one of NETCDF3_CLASSIC, NETCDF4, NETCDF4_CLASSIC, NETCDF3_64BIT_OFFSET or NETCDF3_64BIT_DATA.

var dimensions

The dimensions dictionary maps the names of dimensions defined for the Group or Dataset to instances of the Dimension class.

var disk_format

disk_format describes the underlying file format, one of NETCDF3, HDF5, HDF4, PNETCDF, DAP2, DAP4 or UNDEFINED. Only available if using netcdf C library version >= 4.3.1, otherwise will always return UNDEFINED.

var enumtypes

The enumtypes dictionary maps the names of Enum types defined for the Group or Dataset to instances of the EnumType class.

var file_format

same as data_model, retained for backwards compatibility.

var groups

The groups dictionary maps the names of groups created for this Dataset or Group to instances of the Group class (the Dataset class is simply a special case of the Group class which describes the root group in the netCDF4 file).

var keepweakref

If True, child Dimension and Variables objects only keep weak references to the parent Dataset or Group.

var parent

parent is a reference to the parent Group instance. None for the root group or Dataset instance

var path

path shows the location of the Group in the Dataset in a unix directory format (the names of groups in the hierarchy separated by forward slashes). A Dataset instance is the root group, so the path is simply '/'.

var variables

The variables dictionary maps the names of variables defined for this Dataset or Group to instances of the Variable class.

var vltypes

The vltypes dictionary maps the names of variable-length types defined for the Group or Dataset to instances of the VLType class.

Static methods

def __init__(

self, filename, mode="r", clobber=True, diskless=False, persist=False, keepweakref=False, format='NETCDF4')

Dataset constructor.

filename: Name of netCDF file to hold dataset. Can also be a python 3 pathlib instance or the URL of an OpenDAP dataset. When memory is set this is just used to set the filepath().

mode: access mode. r means read-only; no data can be modified. w means write; a new file is created, an existing file with the same name is deleted. a and r+ mean append (in analogy with serial files); an existing file is opened for reading and writing. Appending s to modes w, r+ or a will enable unbuffered shared access to NETCDF3_CLASSIC, NETCDF3_64BIT_OFFSET or NETCDF3_64BIT_DATA formatted files. Unbuffered access may be useful even if you don't need shared access, since it may be faster for programs that don't access data sequentially. This option is ignored for NETCDF4 and NETCDF4_CLASSIC formatted files.

clobber: if True (default), opening a file with mode='w' will clobber an existing file with the same name. if False, an exception will be raised if a file with the same name already exists.

format: underlying file format (one of 'NETCDF4', 'NETCDF4_CLASSIC', 'NETCDF3_CLASSIC', 'NETCDF3_64BIT_OFFSET' or 'NETCDF3_64BIT_DATA'. Only relevant if mode = 'w' (if mode = 'r','a' or 'r+' the file format is automatically detected). Default 'NETCDF4', which means the data is stored in an HDF5 file, using netCDF 4 API features. Setting format='NETCDF4_CLASSIC' will create an HDF5 file, using only netCDF 3 compatible API features. netCDF 3 clients must be recompiled and linked against the netCDF 4 library to read files in NETCDF4_CLASSIC format. 'NETCDF3_CLASSIC' is the classic netCDF 3 file format that does not handle 2+ Gb files. 'NETCDF3_64BIT_OFFSET' is the 64-bit offset version of the netCDF 3 file format, which fully supports 2+ GB files, but is only compatible with clients linked against netCDF version 3.6.0 or later. 'NETCDF3_64BIT_DATA' is the 64-bit data version of the netCDF 3 file format, which supports 64-bit dimension sizes plus unsigned and 64 bit integer data types, but is only compatible with clients linked against netCDF version 4.4.0 or later.

diskless: If True, create diskless (in memory) file.
This is an experimental feature added to the C library after the netcdf-4.2 release.

persist: if diskless=True, persist file to disk when closed (default False).

keepweakref: if True, child Dimension and Variable instances will keep weak references to the parent Dataset or Group object. Default is False, which means strong references will be kept. Having Dimension and Variable instances keep a strong reference to the parent Dataset instance, which in turn keeps a reference to child Dimension and Variable instances, creates circular references. Circular references complicate garbage collection, which may mean increased memory usage for programs that create many Dataset instances with lots of Variables. It also will result in the Dataset object never being deleted, which means it may keep open files alive as well. Setting keepweakref=True allows Dataset instances to be garbage collected as soon as they go out of scope, potentially reducing memory usage and open file handles. However, in many cases this is not desirable, since the associated Variable instances may still be needed, but are rendered unusable when the parent Dataset instance is garbage collected.

memory: if not None, open file with contents taken from this block of memory. Must be a sequence of bytes. Note this only works with "r" mode.

encoding: encoding used to encode filename string into bytes. Default is None (sys.getfilesystemencoding() is used).

parallel: open for parallel access using MPI (requires mpi4py and parallel-enabled netcdf-c and hdf5 libraries). Default is False. If True, comm and info kwargs may also be specified.

comm: MPI_Comm object for parallel access. Default None, which means MPI_COMM_WORLD will be used. Ignored if parallel=False.

info: MPI_Info object for parallel access. Default None, which means MPI_INFO_NULL will be used. Ignored if parallel=False.

def close(

self)

Close the Dataset.

def createCompoundType(

self, datatype, datatype_name)

Creates a new compound data type named datatype_name from the numpy dtype object datatype.

Note: If the new compound data type contains other compound data types (i.e. it is a 'nested' compound type, where not all of the elements are homogeneous numeric data types), then the 'inner' compound types must be created first.

The return value is the CompoundType class instance describing the new datatype.

def createDimension(

self, dimname, size=None)

Creates a new dimension with the given dimname and size.

size must be a positive integer or None, which stands for "unlimited" (default is None). Specifying a size of 0 also results in an unlimited dimension. The return value is the Dimension class instance describing the new dimension. To determine the current maximum size of the dimension, use the len function on the Dimension instance. To determine if a dimension is 'unlimited', use the isunlimited method of the Dimension instance.
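
A minimal sketch (assuming f is a Dataset open for writing; dimension names are hypothetical):

>>> level = f.createDimension("level", 10)   # fixed-size dimension
>>> time = f.createDimension("time", None)   # unlimited dimension
>>> print len(level), level.isunlimited()
10 False
>>> print len(time), time.isunlimited()
0 True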

def createEnumType(

self, datatype, datatype_name, enum_dict)

Creates a new Enum data type named datatype_name from a numpy integer dtype object datatype, and a python dictionary defining the enum fields and values.

The return value is the EnumType class instance describing the new datatype.

def createGroup(

self, groupname)

Creates a new Group with the given groupname.

If groupname is specified as a path, using forward slashes as in unix to separate components, then intermediate groups will be created as necessary (analogous to mkdir -p in unix). For example, createGroup('/GroupA/GroupB/GroupC') will create GroupA, GroupA/GroupB, and GroupA/GroupB/GroupC, if they don't already exist. If the specified path describes a group that already exists, no error is raised.

The return value is a Group class instance.
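
A minimal sketch (assuming f is a Dataset open for writing; group names are hypothetical):

>>> analyses = f.createGroup("analyses")
>>> run1 = f.createGroup("/analyses/run1")   # 'analyses' already exists; no error is raised
>>> print f.groups["analyses"].groups["run1"].path
/analyses/run1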

def createVLType(

self, datatype, datatype_name)

Creates a new VLEN data type named datatype_name from a numpy dtype object datatype.

The return value is the VLType class instance describing the new datatype.

def createVariable(

self, varname, datatype, dimensions=(), zlib=False, complevel=4, shuffle=True, fletcher32=False, contiguous=False, chunksizes=None, endian='native', least_significant_digit=None, fill_value=None)

Creates a new variable with the given varname, datatype, and dimensions. If dimensions are not given, the variable is assumed to be a scalar.

If varname is specified as a path, using forward slashes as in unix to separate components, then intermediate groups will be created as necessary. For example, createVariable('/GroupA/GroupB/VarC', float, ('x','y')) will create groups GroupA and GroupA/GroupB, plus the variable GroupA/GroupB/VarC, if the preceding groups don't already exist.

The datatype can be a numpy datatype object, or a string that describes a numpy dtype object (like the dtype.str attribute of a numpy array). Supported specifiers include: 'S1' or 'c' (NC_CHAR), 'i1' or 'b' or 'B' (NC_BYTE), 'u1' (NC_UBYTE), 'i2' or 'h' or 's' (NC_SHORT), 'u2' (NC_USHORT), 'i4' or 'i' or 'l' (NC_INT), 'u4' (NC_UINT), 'i8' (NC_INT64), 'u8' (NC_UINT64), 'f4' or 'f' (NC_FLOAT), 'f8' or 'd' (NC_DOUBLE). datatype can also be a CompoundType instance (for a structured, or compound array), a VLType instance (for a variable-length array), or the python str builtin (for a variable-length string array). Numpy string and unicode datatypes with length greater than one are aliases for str.

Data from netCDF variables is presented to python as numpy arrays with the corresponding data type.

dimensions must be a tuple containing dimension names (strings) that have been defined previously using createDimension. The default value is an empty tuple, which means the variable is a scalar.

If the optional keyword zlib is True, the data will be compressed in the netCDF file using gzip compression (default False).

The optional keyword complevel is an integer between 1 and 9 describing the level of compression desired (default 4). Ignored if zlib=False.

If the optional keyword shuffle is True, the HDF5 shuffle filter will be applied before compressing the data (default True). This significantly improves compression. Ignored if zlib=False.

If the optional keyword fletcher32 is True, the Fletcher32 HDF5 checksum algorithm is activated to detect errors. Default False.

If the optional keyword contiguous is True, the variable data is stored contiguously on disk. Default False. Setting to True for a variable with an unlimited dimension will trigger an error.

The optional keyword chunksizes can be used to manually specify the HDF5 chunksizes for each dimension of the variable. A detailed discussion of HDF chunking and I/O performance is available in the HDF5 documentation. Basically, you want the chunk size for each dimension to match as closely as possible the size of the data block that users will read from the file. chunksizes cannot be set if contiguous=True.

The optional keyword endian can be used to control whether the data is stored in little or big endian format on disk. Possible values are little, big or native (default). The library will automatically handle endian conversions when the data is read, but if the data is always going to be read on a computer with the opposite format as the one used to create the file, there may be some performance advantage to be gained by setting the endian-ness.

The zlib, complevel, shuffle, fletcher32, contiguous, chunksizes and endian keywords are silently ignored for netCDF 3 files that do not use HDF5.

The optional keyword fill_value can be used to override the default netCDF _FillValue (the value that the variable gets filled with before any data is written to it, defaults given in netCDF4.default_fillvals). If fill_value is set to False, then the variable is not pre-filled.

If the optional keyword parameter least_significant_digit is specified, variable data will be truncated (quantized). In conjunction with zlib=True this produces 'lossy', but significantly more efficient compression. For example, if least_significant_digit=1, data will be quantized using numpy.around(scale*data)/scale, where scale = 2**bits, and bits is determined so that a precision of 0.1 is retained (in this case bits=4). From the PSD metadata conventions: "least_significant_digit -- power of ten of the smallest decimal place in unpacked data that is a reliable value." Default is None, or no quantization, or 'lossless' compression.

When creating variables in a NETCDF4 or NETCDF4_CLASSIC formatted file, HDF5 creates something called a 'chunk cache' for each variable. The default size of the chunk cache may be large enough to completely fill available memory when creating thousands of variables. The optional keyword chunk_cache allows you to reduce (or increase) the size of the default chunk cache when creating a variable. The setting only persists as long as the Dataset is open - you can use the set_var_chunk_cache method to change it the next time the Dataset is opened. Warning - messing with this parameter can seriously degrade performance.

The return value is the Variable class instance describing the new variable.

A list of names corresponding to netCDF variable attributes can be obtained with the Variable method ncattrs. A dictionary containing all the netCDF attribute name/value pairs is provided by the __dict__ attribute of a Variable instance.

Variable instances behave much like array objects. Data can be assigned to or retrieved from a variable with indexing and slicing operations on the Variable instance. A Variable instance has six standard attributes: dimensions, dtype, shape, ndim, name and least_significant_digit. Application programs should never modify these attributes. The dimensions attribute is a tuple containing the names of the dimensions associated with this variable. The dtype attribute is a string describing the variable's data type (i4, f8, S1, etc). The shape attribute is a tuple describing the current sizes of all the variable's dimensions. The name attribute is a string containing the name of the Variable instance. The least_significant_digit attribute describes the power of ten of the smallest decimal place in the data that contains a reliable value. If None, the data is not truncated. The ndim attribute is the number of variable dimensions.

def delncattr(

self,name,value)

delete a netCDF dataset or group attribute. Use if you need to delete a netCDF attribute with the same name as one of the reserved python attributes.

def filepath(

self,encoding=None)

Get the file system path (or the opendap URL) which was used to open/create the Dataset. Requires netcdf >= 4.1.2. The path is decoded into a string using sys.getfilesystemencoding() by default; this can be changed using the encoding kwarg.

def get_variables_by_attributes(

...)

Returns a list of variables that match specific conditions.

Can pass in key=value parameters and variables are returned that contain all of the matches. For example,

>>> # Get variables with x-axis attribute.
>>> vs = nc.get_variables_by_attributes(axis='X')
>>> # Get variables with matching "standard_name" attribute
>>> vs = nc.get_variables_by_attributes(standard_name='northward_sea_water_velocity')

Can pass in key=callable parameter and variables are returned if the callable returns True. The callable should accept a single parameter, the attribute value. None is given as the attribute value when the attribute does not exist on the variable. For example,

>>> # Get Axis variables
>>> vs = nc.get_variables_by_attributes(axis=lambda v: v in ['X', 'Y', 'Z', 'T'])
>>> # Get variables that don't have an "axis" attribute
>>> vs = nc.get_variables_by_attributes(axis=lambda v: v is None)
>>> # Get variables that have a "grid_mapping" attribute
>>> vs = nc.get_variables_by_attributes(grid_mapping=lambda v: v is not None)

def getncattr(

self,name)

retrieve a netCDF dataset or group attribute. Use if you need to get a netCDF attribute with the same name as one of the reserved python attributes.

optional kwarg encoding can be used to specify the character encoding of a string attribute (default is utf-8).

def isopen(

...)

is the Dataset open or closed?

def ncattrs(

self)

return netCDF global attribute names for this Dataset or Group in a list.

def renameAttribute(

self, oldname, newname)

rename a Dataset or Group attribute named oldname to newname.

def renameDimension(

self, oldname, newname)

rename a Dimension named oldname to newname.

def renameGroup(

self, oldname, newname)

rename a Group named oldname to newname (requires netcdf >= 4.3.1).

def renameVariable(

self, oldname, newname)

rename a Variable named oldname to newname

def set_auto_chartostring(

self, True_or_False)

Call set_auto_chartostring for all variables contained in this Dataset or Group, as well as for all variables in all its subgroups.

True_or_False: Boolean determining if automatic conversion of all character arrays <--> string arrays should be performed for character variables (variables of type NC_CHAR or S1) with the _Encoding attribute set.

Note: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

def set_auto_mask(

self, True_or_False)

Call set_auto_mask for all variables contained in this Dataset or Group, as well as for all variables in all its subgroups.

True_or_False: Boolean determining if automatic conversion to masked arrays shall be applied for all variables.

Note: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

def set_auto_maskandscale(

self, True_or_False)

Call set_auto_maskandscale for all variables contained in this Dataset or Group, as well as for all variables in all its subgroups.

True_or_False: Boolean determining if automatic conversion to masked arrays and variable scaling shall be applied for all variables.

Note: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

def set_auto_scale(

self, True_or_False)

Call set_auto_scale for all variables contained in this Dataset or Group, as well as for all variables in all its subgroups.

True_or_False: Boolean determining if automatic variable scaling shall be applied for all variables.

Note: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

def set_fill_off(

self)

Sets the fill mode for a Dataset open for writing to off.

This will prevent the data from being pre-filled with fill values, which may result in some performance improvements. However, you must then make sure the data is actually written before being read.

def set_fill_on(

self)

Sets the fill mode for a Dataset open for writing to on.

This causes data to be pre-filled with fill values. The fill values can be controlled by the variable's _FillValue attribute, but it is usually sufficient to use the netCDF default _FillValue (defined separately for each variable type). The default behavior of the netCDF library corresponds to set_fill_on. Data which are equal to the _FillValue indicate that the variable was created, but never written to.

def setncattr(

self,name,value)

set a netCDF dataset or group attribute using name,value pair. Use if you need to set a netCDF attribute with the same name as one of the reserved python attributes.

def setncattr_string(

self,name,value)

set a netCDF dataset or group string attribute using name,value pair. Use if you need to ensure that a netCDF attribute is created with type NC_STRING if the file format is NETCDF4. Use if you need to set an attribute to an array of variable-length strings.

def setncatts(

self,attdict)

set a bunch of netCDF dataset or group attributes at once using a python dictionary. This may be faster when setting a lot of attributes for a NETCDF3 formatted file, since nc_redef/nc_enddef is not called in between setting each attribute.
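
A minimal sketch (assuming f is a Dataset open for writing; attribute names and values are hypothetical):

>>> f.setncatts({"title": "test file", "institution": "somewhere", "history": "created for testing"})
>>> print f.title
test file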

def sync(

self)

Writes all buffered data in the Dataset to the disk file.

class Dimension

A netCDF Dimension is used to describe the coordinates of a Variable. See __init__ for more details.

The current maximum size of a Dimension instance can be obtained by calling the python len function on the Dimension instance. The isunlimited method of a Dimension instance can be used to determine if the dimension is unlimited.

Read-only class variables:

name: String name, used when creating a Variable with createVariable.

size: Current Dimension size (same as len(d), where d is a Dimension instance).

Ancestors (in MRO)

Class variables

var name

A string describing the name of the Dimension - used when creating a Variable instance with createVariable.

var size

Current Dimension size (same as len(d), where d is a Dimension instance).

Static methods

def __init__(

self, group, name, size=None)

Dimension constructor.

group: Group instance to associate with dimension.

name: Name of the dimension.

size: Size of the dimension. None or 0 means unlimited. (Default None).

Note: Dimension instances should be created using the createDimension method of a Group or Dataset instance, not using __init__ directly.

def group(

self)

return the group that this Dimension is a member of.

def isunlimited(

self)

returns True if the Dimension instance is unlimited, False otherwise.

class EnumType

An EnumType instance is used to describe an Enum data type, and can be passed to the createVariable method of a Dataset or Group instance. See __init__ for more details.

The instance variables dtype, name and enum_dict should not be modified by the user.

Ancestors (in MRO)

Class variables

var dtype

A numpy integer dtype object describing the base type for the Enum.

var enum_dict

A python dictionary describing the enum fields and values.

var name

String name.

Static methods

def __init__(

group, datatype, datatype_name, enum_dict)

EnumType constructor.

group: Group instance to associate with the Enum datatype.

datatype: A numpy integer dtype object describing the base type for the Enum.

datatype_name: a Python string containing a description of the Enum data type.

enum_dict: a Python dictionary containing the Enum field/value pairs.

Note: EnumType instances should be created using the createEnumType method of a Dataset or Group instance, not using this class directly.

class Group

Groups define a hierarchical namespace within a netCDF file. They are analogous to directories in a unix filesystem. Each Group behaves like a Dataset within a Dataset, and can contain its own variables, dimensions and attributes (and other Groups). See __init__ for more details.

Group inherits from Dataset, so all the Dataset class methods and variables are available to a Group instance (except the close method).

Additional read-only class variables:

name: String describing the group name.

Ancestors (in MRO)

Class variables

var cmptypes

Inheritance: Dataset.cmptypes

The cmptypes dictionary maps the names of compound types defined for the Group or Dataset to instances of the CompoundType class.

var data_model

Inheritance: Dataset.data_model

data_model describes the netCDF data model version, one of NETCDF3_CLASSIC, NETCDF4, NETCDF4_CLASSIC, NETCDF3_64BIT_OFFSET or NETCDF3_64BIT_DATA.

var dimensions

Inheritance: Dataset.dimensions

The dimensions dictionary maps the names of dimensions defined for the Group or Dataset to instances of the Dimension class.

var disk_format

Inheritance: Dataset.disk_format

disk_format describes the underlying file format, one of NETCDF3, HDF5, HDF4, PNETCDF, DAP2, DAP4 or UNDEFINED. Only available if using netcdf C library version >= 4.3.1, otherwise will always return UNDEFINED.

var enumtypes

Inheritance: Dataset.enumtypes

The enumtypes dictionary maps the names of Enum types defined for the Group or Dataset to instances of the EnumType class.

var file_format

Inheritance: Dataset.file_format

same as data_model, retained for backwards compatibility.

var groups

Inheritance: Dataset.groups

The groups dictionary maps the names of groups created for this Dataset or Group to instances of the Group class (the Dataset class is simply a special case of the Group class which describes the root group in the netCDF4 file).

var keepweakref

Inheritance: Dataset.keepweakref

If True, child Dimension and Variables objects only keep weak references to the parent Dataset or Group.

var name

A string describing the name of the Group.

var parent

Inheritance: Dataset.parent

parent is a reference to the parent Group instance. None for the root group or Dataset instance

var path

Inheritance: Dataset.path

path shows the location of the Group in the Dataset in a unix directory format (the names of groups in the hierarchy separated by forward slashes). A Dataset instance is the root group, so the path is simply '/'.

var variables

Inheritance: Dataset.variables

The variables dictionary maps the names of variables defined for this Dataset or Group to instances of the Variable class.

var vltypes

Inheritance: Dataset.vltypes

The vltypes dictionary maps the names of variable-length types defined for the Group or Dataset to instances of the VLType class.

Static methods

def __init__(

self, parent, name)

Inheritance: Dataset.__init__

Group constructor.

parent: Group instance for the parent group. If being created in the root group, use a Dataset instance.

name: - Name of the group.

Note: Group instances should be created using the createGroup method of a Dataset instance, or another Group instance, not using this class directly.

def close(

self)

Inheritance: Dataset.close

overrides Dataset close method which does not apply to Group instances, raises IOError.

def createCompoundType(

self, datatype, datatype_name)

Inheritance: Dataset.createCompoundType

Creates a new compound data type named datatype_name from the numpy dtype object datatype.

Note: If the new compound data type contains other compound data types (i.e. it is a 'nested' compound type, where not all of the elements are homogeneous numeric data types), then the 'inner' compound types must be created first.

The return value is the CompoundType class instance describing the new datatype.

def createDimension(

self, dimname, size=None)

Inheritance: Dataset.createDimension

Creates a new dimension with the given dimname and size.

size must be a positive integer or None, which stands for "unlimited" (default is None). Specifying a size of 0 also results in an unlimited dimension. The return value is the Dimension class instance describing the new dimension. To determine the current maximum size of the dimension, use the len function on the Dimension instance. To determine if a dimension is 'unlimited', use the isunlimited method of the Dimension instance.

def createEnumType(

self, datatype, datatype_name, enum_dict)

Inheritance: Dataset.createEnumType

Creates a new Enum data type named datatype_name from a numpy integer dtype object datatype, and a python dictionary defining the enum fields and values.

The return value is the EnumType class instance describing the new datatype.

def createGroup(

self, groupname)

Inheritance: Dataset.createGroup

Creates a new Group with the given groupname.

If groupname is specified as a path, using forward slashes as in unix to separate components, then intermediate groups will be created as necessary (analogous to mkdir -p in unix). For example, createGroup('/GroupA/GroupB/GroupC') will create GroupA, GroupA/GroupB, and GroupA/GroupB/GroupC, if they don't already exist. If the specified path describes a group that already exists, no error is raised.

The return value is a Group class instance.

def createVLType(

self, datatype, datatype_name)

Inheritance: Dataset.createVLType

Creates a new VLEN data type named datatype_name from a numpy dtype object datatype.

The return value is the VLType class instance describing the new datatype.

def createVariable(

self, varname, datatype, dimensions=(), zlib=False, complevel=4, shuffle=True, fletcher32=False, contiguous=False, chunksizes=None, endian='native', least_significant_digit=None, fill_value=None)

Inheritance: Dataset.createVariable

Creates a new variable with the given varname, datatype, and dimensions. If dimensions are not given, the variable is assumed to be a scalar.

If varname is specified as a path, using forward slashes as in unix to separate components, then intermediate groups will be created as necessary. For example, createVariable('/GroupA/GroupB/VarC', float, ('x','y')) will create groups GroupA and GroupA/GroupB, plus the variable GroupA/GroupB/VarC, if the preceding groups don't already exist.

The datatype can be a numpy datatype object, or a string that describes a numpy dtype object (like the dtype.str attribute of a numpy array). Supported specifiers include: 'S1' or 'c' (NC_CHAR), 'i1' or 'b' or 'B' (NC_BYTE), 'u1' (NC_UBYTE), 'i2' or 'h' or 's' (NC_SHORT), 'u2' (NC_USHORT), 'i4' or 'i' or 'l' (NC_INT), 'u4' (NC_UINT), 'i8' (NC_INT64), 'u8' (NC_UINT64), 'f4' or 'f' (NC_FLOAT), 'f8' or 'd' (NC_DOUBLE). datatype can also be a CompoundType instance (for a structured, or compound array), a VLType instance (for a variable-length array), or the python str builtin (for a variable-length string array). Numpy string and unicode datatypes with length greater than one are aliases for str.

Data from netCDF variables is presented to python as numpy arrays with the corresponding data type.

dimensions must be a tuple containing dimension names (strings) that have been defined previously using createDimension. The default value is an empty tuple, which means the variable is a scalar.

If the optional keyword zlib is True, the data will be compressed in the netCDF file using gzip compression (default False).

The optional keyword complevel is an integer between 1 and 9 describing the level of compression desired (default 4). Ignored if zlib=False.

If the optional keyword shuffle is True, the HDF5 shuffle filter will be applied before compressing the data (default True). This significantly improves compression. Ignored if zlib=False.

If the optional keyword fletcher32 is True, the Fletcher32 HDF5 checksum algorithm is activated to detect errors. Default False.

If the optional keyword contiguous is True, the variable data is stored contiguously on disk. Default False. Setting to True for a variable with an unlimited dimension will trigger an error.

The optional keyword chunksizes can be used to manually specify the HDF5 chunksizes for each dimension of the variable. A detailed discussion of HDF chunking and I/O performance is available in the HDF5 documentation. Basically, you want the chunk size for each dimension to match as closely as possible the size of the data block that users will read from the file. chunksizes cannot be set if contiguous=True.

The optional keyword endian can be used to control whether the data is stored in little or big endian format on disk. Possible values are little, big or native (default). The library will automatically handle endian conversions when the data is read, but if the data is always going to be read on a computer with the opposite format as the one used to create the file, there may be some performance advantage to be gained by setting the endian-ness.

The zlib, complevel, shuffle, fletcher32, contiguous, chunksizes and endian keywords are silently ignored for netCDF 3 files that do not use HDF5.

The optional keyword fill_value can be used to override the default netCDF _FillValue (the value that the variable gets filled with before any data is written to it, defaults given in netCDF4.default_fillvals). If fill_value is set to False, then the variable is not pre-filled.

If the optional keyword parameter least_significant_digit is specified, variable data will be truncated (quantized). In conjunction with zlib=True this produces 'lossy', but significantly more efficient compression. For example, if least_significant_digit=1, data will be quantized using numpy.around(scale*data)/scale, where scale = 2**bits, and bits is determined so that a precision of 0.1 is retained (in this case bits=4). From the PSD metadata conventions: "least_significant_digit -- power of ten of the smallest decimal place in unpacked data that is a reliable value." Default is None, or no quantization, or 'lossless' compression.

When creating variables in a NETCDF4 or NETCDF4_CLASSIC formatted file, HDF5 creates something called a 'chunk cache' for each variable. The default size of the chunk cache may be large enough to completely fill available memory when creating thousands of variables. The optional keyword chunk_cache allows you to reduce (or increase) the size of the default chunk cache when creating a variable. The setting only persists as long as the Dataset is open - you can use the set_var_chunk_cache method to change it the next time the Dataset is opened. Warning - messing with this parameter can seriously degrade performance.

The return value is the Variable class instance describing the new variable.

A list of names corresponding to netCDF variable attributes can be obtained with the Variable method ncattrs. A dictionary containing all the netCDF attribute name/value pairs is provided by the __dict__ attribute of a Variable instance.

Variable instances behave much like array objects. Data can be assigned to or retrieved from a variable with indexing and slicing operations on the Variable instance. A Variable instance has six standard attributes: dimensions, dtype, shape, ndim, name and least_significant_digit. Application programs should never modify these attributes. The dimensions attribute is a tuple containing the names of the dimensions associated with this variable. The dtype attribute is a string describing the variable's data type (i4, f8, S1, etc). The shape attribute is a tuple describing the current sizes of all the variable's dimensions. The name attribute is a string containing the name of the Variable instance. The least_significant_digit attribute describes the power of ten of the smallest decimal place in the data that contains a reliable value. If None, the data is not truncated. The ndim attribute is the number of variable dimensions.

def delncattr(

self,name,value)

Inheritance: Dataset.delncattr

delete a netCDF dataset or group attribute. Use if you need to delete a netCDF attribute with the same name as one of the reserved python attributes.

def filepath(

self,encoding=None)

Inheritance: Dataset.filepath

Get the file system path (or the opendap URL) which was used to open/create the Dataset. Requires netcdf >= 4.1.2. The path is decoded into a string using sys.getfilesystemencoding() by default; this can be changed using the encoding kwarg.

def get_variables_by_attributes(

...)

Inheritance: Dataset.get_variables_by_attributes

Returns a list of variables that match specific conditions.

Can pass in key=value parameters and variables are returned that contain all of the matches. For example,

>>> # Get variables with x-axis attribute.
>>> vs = nc.get_variables_by_attributes(axis='X')
>>> # Get variables with matching "standard_name" attribute
>>> vs = nc.get_variables_by_attributes(standard_name='northward_sea_water_velocity')

Can pass in key=callable parameter and variables are returned if the callable returns True. The callable should accept a single parameter, the attribute value. None is given as the attribute value when the attribute does not exist on the variable. For example,

>>> # Get Axis variables
>>> vs = nc.get_variables_by_attributes(axis=lambda v: v in ['X', 'Y', 'Z', 'T'])
>>> # Get variables that don't have an "axis" attribute
>>> vs = nc.get_variables_by_attributes(axis=lambda v: v is None)
>>> # Get variables that have a "grid_mapping" attribute
>>> vs = nc.get_variables_by_attributes(grid_mapping=lambda v: v is not None)

def getncattr(

self,name)

Inheritance: Dataset.getncattr

retrieve a netCDF dataset or group attribute. Use if you need to get a netCDF attribute with the same name as one of the reserved python attributes.

optional kwarg encoding can be used to specify the character encoding of a string attribute (default is utf-8).

def isopen(

...)

Inheritance: Dataset.isopen

is the Dataset open or closed?

def ncattrs(

self)

Inheritance: Dataset.ncattrs

return netCDF global attribute names for this Dataset or Group in a list.

def renameAttribute(

self, oldname, newname)

Inheritance: Dataset.renameAttribute

rename a Dataset or Group attribute named oldname to newname.

def renameDimension(

self, oldname, newname)

Inheritance: Dataset.renameDimension

rename a Dimension named oldname to newname.

def renameGroup(

self, oldname, newname)

Inheritance: Dataset.renameGroup

rename a Group named oldname to newname (requires netcdf >= 4.3.1).

def renameVariable(

self, oldname, newname)

Inheritance: Dataset.renameVariable

rename a Variable named oldname to newname

def set_auto_chartostring(

self, True_or_False)

Inheritance: Dataset.set_auto_chartostring

Call set_auto_chartostring for all variables contained in this Dataset or Group, as well as for all variables in all its subgroups.

True_or_False: Boolean determining if automatic conversion of all character arrays <--> string arrays should be performed for character variables (variables of type NC_CHAR or S1) with the _Encoding attribute set.

Note: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

def set_auto_mask(

self, True_or_False)

Inheritance: Dataset.set_auto_mask

Call set_auto_mask for all variables contained in this Dataset or Group, as well as for all variables in all its subgroups.

True_or_False: Boolean determining if automatic conversion to masked arrays shall be applied for all variables.

Note: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

def set_auto_maskandscale(

self, True_or_False)

Inheritance: Dataset.set_auto_maskandscale

Call set_auto_maskandscale for all variables contained in this Dataset or Group, as well as for all variables in all its subgroups.

True_or_False: Boolean determining if automatic conversion to masked arrays and variable scaling shall be applied for all variables.

Note: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

def set_auto_scale(

self, True_or_False)

Inheritance: Dataset.set_auto_scale

Call set_auto_scale for all variables contained in this Dataset or Group, as well as for all variables in all its subgroups.

True_or_False: Boolean determining if automatic variable scaling shall be applied for all variables.

Note: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

def set_fill_off(

self)

Inheritance: Dataset.set_fill_off

Sets the fill mode for a Dataset open for writing to off.

This will prevent the data from being pre-filled with fill values, which may result in some performance improvements. However, you must then make sure the data is actually written before being read.

def set_fill_on(

self)

Inheritance: Dataset.set_fill_on

Sets the fill mode for a Dataset open for writing to on.

This causes data to be pre-filled with fill values. The fill values can be controlled by the variable's _FillValue attribute, but it is usually sufficient to use the netCDF default _FillValue (defined separately for each variable type). The default behavior of the netCDF library corresponds to set_fill_on. Data which are equal to the _FillValue indicate that the variable was created, but never written to.
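
A minimal sketch of toggling the fill mode described above (file and variable names are illustrative):

>>> import numpy
>>> from netCDF4 import Dataset
>>> nc = Dataset("fill_demo.nc", "w")
>>> nc.set_fill_off()                      # new variables will not be pre-filled
>>> x = nc.createDimension("x", 10)
>>> v = nc.createVariable("v", "f4", ("x",))
>>> v[:] = numpy.arange(10, dtype="f4")    # write every element before reading it back
>>> nc.set_fill_on()                       # restore the default fill mode
>>> nc.close()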

def setncattr(

self,name,value)

Inheritance: Dataset.setncattr

set a netCDF dataset or group attribute using name,value pair. Use if you need to set a netCDF attribute with the same name as one of the reserved python attributes.

def setncattr_string(

self,name,value)

Inheritance: Dataset.setncattr_string

set a netCDF dataset or group string attribute using name,value pair. Use if you need to ensure that a netCDF attribute is created with type NC_STRING if the file format is NETCDF4. Use if you need to set an attribute to an array of variable-length strings.

def setncatts(

self,attdict)

Inheritance: Dataset.setncatts

set a bunch of netCDF dataset or group attributes at once using a python dictionary. This may be faster when setting a lot of attributes for a NETCDF3 formatted file, since nc_redef/nc_enddef is not called in between setting each attribute.
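
For example, a sketch that sets several global attributes in one call (attribute values are illustrative):

>>> from netCDF4 import Dataset
>>> nc = Dataset("atts_demo.nc", "w", format="NETCDF3_CLASSIC")
>>> nc.setncatts({"title": "demo file", "institution": "example institute", "history": "created for demonstration"})
>>> print(sorted(nc.ncattrs()))
>>> nc.close()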

def sync(

self)

Inheritance: Dataset.sync

Writes all buffered data in the Dataset to the disk file.

class MFDataset

Class for reading multi-file netCDF Datasets, making variables spanning multiple files appear as if they were in one file. Datasets must be in NETCDF4_CLASSIC, NETCDF3_CLASSIC, NETCDF3_64BIT_OFFSET or NETCDF3_64BIT_DATA format (NETCDF4 Datasets won't work).

Adapted from pycdf by Andre Gosselin.

Example usage (See __init__ for more details):

>>> import numpy
>>> # create a series of netCDF files with a variable sharing
>>> # the same unlimited dimension.
>>> for nf in range(10):
>>>     f = Dataset("mftest%s.nc" % nf,"w")
>>>     f.createDimension("x",None)
>>>     x = f.createVariable("x","i",("x",))
>>>     x[0:10] = numpy.arange(nf*10,10*(nf+1))
>>>     f.close()
>>> # now read all those files in at once, in one Dataset.
>>> f = MFDataset("mftest*nc")
>>> print f.variables["x"][:]
[ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49
 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74
 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99]

Ancestors (in MRO)

Class variables

var cmptypes

Inheritance: Dataset.cmptypes

The cmptypes dictionary maps the names of compound types defined for the Group or Dataset to instances of the CompoundType class.

var data_model

Inheritance: Dataset.data_model

data_model describes the netCDF data model version, one of NETCDF3_CLASSIC, NETCDF4, NETCDF4_CLASSIC, NETCDF3_64BIT_OFFSET or NETCDF3_64BIT_DATA.

var dimensions

Inheritance: Dataset.dimensions

The dimensions dictionary maps the names of dimensions defined for the Group or Dataset to instances of the Dimension class.

var disk_format

Inheritance: Dataset.disk_format

disk_format describes the underlying file format, one of NETCDF3, HDF5, HDF4, PNETCDF, DAP2, DAP4 or UNDEFINED. Only available if using netcdf C library version >= 4.3.1, otherwise will always return UNDEFINED.

var enumtypes

Inheritance: Dataset.enumtypes

The enumtypes dictionary maps the names of Enum types defined for the Group or Dataset to instances of the EnumType class.

var file_format

Inheritance: Dataset.file_format

same as data_model, retained for backwards compatibility.

var groups

Inheritance: Dataset.groups

The groups dictionary maps the names of groups created for this Dataset or Group to instances of the Group class (the Dataset class is simply a special case of the Group class which describes the root group in the netCDF4 file).

var keepweakref

Inheritance: Dataset.keepweakref

If True, child Dimension and Variables objects only keep weak references to the parent Dataset or Group.

var parent

Inheritance: Dataset.parent

parent is a reference to the parent Group instance. None for the root group or Dataset instance

var path

Inheritance: Dataset.path

path shows the location of the Group in the Dataset in a unix directory format (the names of groups in the hierarchy separated by forward slashes). A Dataset instance is the root group, so the path is simply '/'.

var variables

Inheritance: Dataset.variables

The variables dictionary maps the names of variables defined for this Dataset or Group to instances of the Variable class.

var vltypes

Inheritance: Dataset.vltypes

The vltypes dictionary maps the names of variable-length types defined for the Group or Dataset to instances of the VLType class.

Static methods

def createCompoundType(

self, datatype, datatype_name)

Inheritance: Dataset.createCompoundType

Creates a new compound data type named datatype_name from the numpy dtype object datatype.

Note: If the new compound data type contains other compound data types (i.e. it is a 'nested' compound type, where not all of the elements are homogeneous numeric data types), then the 'inner' compound types must be created first.

The return value is the CompoundType class instance describing the new datatype.

def createDimension(

self, dimname, size=None)

Inheritance: Dataset.createDimension

Creates a new dimension with the given dimname and size.

size must be a positive integer or None, which stands for "unlimited" (default is None). Specifying a size of 0 also results in an unlimited dimension. The return value is the Dimension class instance describing the new dimension. To determine the current maximum size of the dimension, use the len function on the Dimension instance. To determine if a dimension is 'unlimited', use the isunlimited method of the Dimension instance.
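
A minimal sketch (file and dimension names are illustrative):

>>> from netCDF4 import Dataset
>>> nc = Dataset("dims_demo.nc", "w")
>>> time = nc.createDimension("time", None)   # unlimited dimension
>>> lat = nc.createDimension("lat", 73)
>>> print(len(nc.dimensions["lat"]))
73
>>> print(nc.dimensions["time"].isunlimited())
True
>>> nc.close()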

def createEnumType(

self, datatype, datatype_name, enum_dict)

Inheritance: Dataset.createEnumType

Creates a new Enum data type named datatype_name from a numpy integer dtype object datatype, and a python dictionary defining the enum fields and values.

The return value is the EnumType class instance describing the new datatype.
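
A minimal sketch, loosely following the Enum example in the online tutorial (names and values are illustrative):

>>> import numpy
>>> from netCDF4 import Dataset
>>> nc = Dataset("enum_demo.nc", "w", format="NETCDF4")
>>> time = nc.createDimension("time", None)
>>> enum_dict = {"clear": 0, "cloudy": 1, "missing": 255}
>>> cloud_type = nc.createEnumType(numpy.uint8, "cloud_t", enum_dict)
>>> v = nc.createVariable("primary_cloud", cloud_type, ("time",), fill_value=enum_dict["missing"])
>>> v[0] = enum_dict["cloudy"]
>>> nc.close()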

def createGroup(

self, groupname)

Inheritance: Dataset.createGroup

Creates a new Group with the given groupname.

If groupname is specified as a path, using forward slashes as in unix to separate components, then intermediate groups will be created as necessary (analogous to mkdir -p in unix). For example, createGroup('/GroupA/GroupB/GroupC') will create GroupA, GroupA/GroupB, and GroupA/GroupB/GroupC, if they don't already exist. If the specified path describes a group that already exists, no error is raised.

The return value is a Group class instance.
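
A minimal sketch (file and group names are illustrative):

>>> from netCDF4 import Dataset
>>> nc = Dataset("groups_demo.nc", "w", format="NETCDF4")
>>> run1 = nc.createGroup("/forecasts/model1/run1")   # intermediate groups created as needed
>>> print(run1.path)
/forecasts/model1/run1
>>> nc.close()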

def createVLType(

self, datatype, datatype_name)

Inheritance: Dataset.createVLType

Creates a new VLEN data type named datatype_name from a numpy dtype object datatype.

The return value is the VLType class instance describing the new datatype.
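
A minimal sketch of creating a VLEN type and writing ragged rows (names are illustrative):

>>> import numpy
>>> from netCDF4 import Dataset
>>> nc = Dataset("vlen_demo.nc", "w", format="NETCDF4")
>>> y = nc.createDimension("y", 2)
>>> vlen_t = nc.createVLType(numpy.int32, "phony_vlen")
>>> ragged = nc.createVariable("ragged", vlen_t, ("y",))
>>> ragged[0] = numpy.array([1, 2, 3], dtype=numpy.int32)   # rows may have different lengths
>>> ragged[1] = numpy.array([4, 5], dtype=numpy.int32)
>>> nc.close()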

def createVariable(

self, varname, datatype, dimensions=(), zlib=False, complevel=4, shuffle=True, fletcher32=False, contiguous=False, chunksizes=None, endian='native', least_significant_digit=None, fill_value=None)

Inheritance: Dataset.createVariable

Creates a new variable with the given varname, datatype, and dimensions. If dimensions are not given, the variable is assumed to be a scalar.

If varname is specified as a path, using forward slashes as in unix to separate components, then intermediate groups will be created as necessary. For example, createVariable('/GroupA/GroupB/VarC', float, ('x','y')) will create groups GroupA and GroupA/GroupB, plus the variable GroupA/GroupB/VarC, if the preceding groups don't already exist.

The datatype can be a numpy datatype object, or a string that describes a numpy dtype object (like the dtype.str attribute of a numpy array). Supported specifiers include: 'S1' or 'c' (NC_CHAR), 'i1' or 'b' or 'B' (NC_BYTE), 'u1' (NC_UBYTE), 'i2' or 'h' or 's' (NC_SHORT), 'u2' (NC_USHORT), 'i4' or 'i' or 'l' (NC_INT), 'u4' (NC_UINT), 'i8' (NC_INT64), 'u8' (NC_UINT64), 'f4' or 'f' (NC_FLOAT), 'f8' or 'd' (NC_DOUBLE). datatype can also be a CompoundType instance (for a structured, or compound array), a VLType instance (for a variable-length array), or the python str builtin (for a variable-length string array). Numpy string and unicode datatypes with length greater than one are aliases for str.

Data from netCDF variables is presented to python as numpy arrays with the corresponding data type.

dimensions must be a tuple containing dimension names (strings) that have been defined previously using createDimension. The default value is an empty tuple, which means the variable is a scalar.

If the optional keyword zlib is True, the data will be compressed in the netCDF file using gzip compression (default False).

The optional keyword complevel is an integer between 1 and 9 describing the level of compression desired (default 4). Ignored if zlib=False.

If the optional keyword shuffle is True, the HDF5 shuffle filter will be applied before compressing the data (default True). This significantly improves compression. Ignored if zlib=False.

If the optional keyword fletcher32 is True, the Fletcher32 HDF5 checksum algorithm is activated to detect errors. Default False.

If the optional keyword contiguous is True, the variable data is stored contiguously on disk. Default False. Setting to True for a variable with an unlimited dimension will trigger an error.

The optional keyword chunksizes can be used to manually specify the HDF5 chunksizes for each dimension of the variable. A detailed discussion of HDF chunking and I/O performance is available here. Basically, you want the chunk size for each dimension to match as closely as possible the size of the data block that users will read from the file. chunksizes cannot be set if contiguous=True.

The optional keyword endian can be used to control whether the data is stored in little or big endian format on disk. Possible values are little, big or native (default). The library will automatically handle endian conversions when the data is read, but if the data is always going to be read on a computer with the opposite format as the one used to create the file, there may be some performance advantage to be gained by setting the endian-ness.

The zlib, complevel, shuffle, fletcher32, contiguous, chunksizes and endian keywords are silently ignored for netCDF 3 files that do not use HDF5.

The optional keyword fill_value can be used to override the default netCDF _FillValue (the value that the variable gets filled with before any data is written to it, defaults given in netCDF4.default_fillvals). If fill_value is set to False, then the variable is not pre-filled.

If the optional keyword parameter least_significant_digit is specified, variable data will be truncated (quantized). In conjunction with zlib=True this produces 'lossy', but significantly more efficient compression. For example, if least_significant_digit=1, data will be quantized using numpy.around(scale*data)/scale, where scale = 2**bits, and bits is determined so that a precision of 0.1 is retained (in this case bits=4). From the PSD metadata conventions: "least_significant_digit -- power of ten of the smallest decimal place in unpacked data that is a reliable value." Default is None, or no quantization, or 'lossless' compression.

When creating variables in a NETCDF4 or NETCDF4_CLASSIC formatted file, HDF5 creates something called a 'chunk cache' for each variable. The default size of the chunk cache may be large enough to completely fill available memory when creating thousands of variables. The optional keyword chunk_cache allows you to reduce (or increase) the size of the default chunk cache when creating a variable. The setting only persists as long as the Dataset is open - you can use the set_var_chunk_cache method to change it the next time the Dataset is opened. Warning - messing with this parameter can seriously degrade performance.

The return value is the Variable class instance describing the new variable.

A list of names corresponding to netCDF variable attributes can be obtained with the Variable method ncattrs. A dictionary containing all the netCDF attribute name/value pairs is provided by the __dict__ attribute of a Variable instance.

Variable instances behave much like array objects. Data can be assigned to or retrieved from a variable with indexing and slicing operations on the Variable instance. A Variable instance has six Dataset standard attributes: dimensions, dtype, shape, ndim, name and least_significant_digit. Application programs should never modify these attributes. The dimensions attribute is a tuple containing the names of the dimensions associated with this variable. The dtype attribute is a string describing the variable's data type (i4, f8, S1, etc). The shape attribute is a tuple describing the current sizes of all the variable's dimensions. The name attribute is a string containing the name of the Variable instance. The least_significant_digit attribute describes the power of ten of the smallest decimal place in the data that contains a reliable value; data is truncated to this decimal place when it is assigned to the Variable instance. If None, the data is not truncated. The ndim attribute is the number of variable dimensions.
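
For example, a sketch of creating a compressed variable and writing to it (file name, dimension sizes and attribute values are illustrative):

>>> import numpy
>>> from netCDF4 import Dataset
>>> nc = Dataset("createvar_demo.nc", "w", format="NETCDF4")
>>> time = nc.createDimension("time", None)
>>> lat = nc.createDimension("lat", 73)
>>> lon = nc.createDimension("lon", 144)
>>> temp = nc.createVariable("temp", "f4", ("time", "lat", "lon"), zlib=True, complevel=4, least_significant_digit=2, fill_value=-9999.0)
>>> temp.units = "K"
>>> temp[0, :, :] = 280.0 + numpy.random.uniform(size=(73, 144))
>>> print(temp.shape)
(1, 73, 144)
>>> nc.close()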

def delncattr(

self,name,value)

Inheritance: Dataset.delncattr

delete a netCDF dataset or group attribute. Use if you need to delete a netCDF attribute with the same name as one of the reserved python attributes.

def filepath(

self,encoding=None)

Inheritance: Dataset.filepath

Get the file system path (or the opendap URL) which was used to open/create the Dataset. Requires netcdf >= 4.1.2. The path is decoded into a string using sys.getfilesystemencoding() by default, this can be changed using the encoding kwarg.

def get_variables_by_attributes(

...)

Inheritance: Dataset.get_variables_by_attributes

Returns a list of variables that match specific conditions.

Can pass in key=value parameters and variables are returned that contain all of the matches. For example,

>>> # Get variables with x-axis attribute.
>>> vs = nc.get_variables_by_attributes(axis='X')
>>> # Get variables with matching "standard_name" attribute
>>> vs = nc.get_variables_by_attributes(standard_name='northward_sea_water_velocity')

Can pass in key=callable parameter and variables are returned if the callable returns True. The callable should accept a single parameter, the attribute value. None is given as the attribute value when the attribute does not exist on the variable. For example,

>>> # Get Axis variables
>>> vs = nc.get_variables_by_attributes(axis=lambda v: v in ['X', 'Y', 'Z', 'T'])
>>> # Get variables that don't have an "axis" attribute
>>> vs = nc.get_variables_by_attributes(axis=lambda v: v is None)
>>> # Get variables that have a "grid_mapping" attribute
>>> vs = nc.get_variables_by_attributes(grid_mapping=lambda v: v is not None)

def getncattr(

self,name)

Inheritance: Dataset.getncattr

retrieve a netCDF dataset or group attribute. Use if you need to get a netCDF attribute with the same name as one of the reserved python attributes.

The optional kwarg encoding can be used to specify the character encoding of a string attribute (default is utf-8).

def isopen(

...)

Inheritance: Dataset.isopen

is the Dataset open or closed?

def renameAttribute(

self, oldname, newname)

Inheritance: Dataset.renameAttribute

rename a Dataset or Group attribute named oldname to newname.

def renameDimension(

self, oldname, newname)

Inheritance: Dataset.renameDimension

rename a Dimension named oldname to newname.

def renameGroup(

self, oldname, newname)

Inheritance: Dataset.renameGroup

rename a Group named oldname to newname (requires netcdf >= 4.3.1).

def renameVariable(

self, oldname, newname)

Inheritance: Dataset.renameVariable

rename a Variable named oldname to newname.

def set_auto_chartostring(

self, True_or_False)

Inheritance: Dataset.set_auto_chartostring

Call set_auto_chartostring for all variables contained in this Dataset or Group, as well as for all variables in all its subgroups.

True_or_False: Boolean determining if automatic conversion of all character arrays <--> string arrays should be performed for character variables (variables of type NC_CHAR or S1) with the _Encoding attribute set.

Note: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

def set_auto_mask(

self, True_or_False)

Inheritance: Dataset.set_auto_mask

Call set_auto_mask for all variables contained in this Dataset or Group, as well as for all variables in all its subgroups.

True_or_False: Boolean determining if automatic conversion to masked arrays shall be applied for all variables.

Note: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

def set_auto_maskandscale(

self, True_or_False)

Inheritance: Dataset.set_auto_maskandscale

Call set_auto_maskandscale for all variables contained in this Dataset or Group, as well as for all variables in all its subgroups.

True_or_False: Boolean determining if automatic conversion to masked arrays and variable scaling shall be applied for all variables.

Note: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

def set_auto_scale(

self, True_or_False)

Inheritance: Dataset.set_auto_scale

Call set_auto_scale for all variables contained in this Dataset or Group, as well as for all variables in all its subgroups.

True_or_False: Boolean determining if automatic variable scaling shall be applied for all variables.

Note: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour.

def set_fill_off(

self)

Inheritance: Dataset.set_fill_off

Sets the fill mode for a Dataset open for writing to off.

This will prevent the data from being pre-filled with fill values, which may result in some performance improvements. However, you must then make sure the data is actually written before being read.

def set_fill_on(

self)

Inheritance: Dataset.set_fill_on

Sets the fill mode for a Dataset open for writing to on.

This causes data to be pre-filled with fill values. The fill values can be controlled by the variable's _FillValue attribute, but it is usually sufficient to use the netCDF default _FillValue (defined separately for each variable type). The default behavior of the netCDF library corresponds to set_fill_on. Data which are equal to the _FillValue indicate that the variable was created, but never written to.

def setncattr(

self,name,value)

Inheritance: Dataset.setncattr

set a netCDF dataset or group attribute using name,value pair. Use if you need to set a netCDF attribute with the same name as one of the reserved python attributes.

def setncattr_string(

self,name,value)

Inheritance: Dataset.setncattr_string

set a netCDF dataset or group string attribute using name,value pair. Use if you need to ensure that a netCDF attribute is created with type NC_STRING if the file format is NETCDF4. Use if you need to set an attribute to an array of variable-length strings.

def setncatts(

self,attdict)

Inheritance: Dataset.setncatts

set a bunch of netCDF dataset or group attributes at once using a python dictionary. This may be faster when setting a lot of attributes for a NETCDF3 formatted file, since nc_redef/nc_enddef is not called in between setting each attribute.

def sync(

self)

Inheritance: Dataset.sync

Writes all buffered data in the Dataset to the disk file.

Methods

def __init__(

self, files, check=False, aggdim=None, exclude=[])

Inheritance: Dataset.__init__

Open a Dataset spanning multiple files, making it look as if it were a single file. Variables in the list of files that share the same dimension (specified with the keyword aggdim) are aggregated. If aggdim is not specified, the unlimited dimension is aggregated. Currently, aggdim must be the leftmost (slowest varying) dimension of each of the variables to be aggregated.

files: either a sequence of netCDF files or a string with a wildcard (converted to a sorted list of files using glob). The first file in the list will become the "master" file, defining all the variables with an aggregation dimension which may span subsequent files. Attribute access returns attributes only from the "master" file. The files are always opened in read-only mode.

check: True if you want to do consistency checking to ensure the correct variables structure for all of the netcdf files. Checking makes the initialization of the MFDataset instance much slower. Default is False.

aggdim: The name of the dimension to aggregate over (must be the leftmost dimension of each of the variables to be aggregated). If None (default), aggregate over the unlimited dimension.

exclude: A list of variable names to exclude from aggregation. Default is an empty list.
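
For example, a sketch that reuses the mftest files created in the class example above, passing an explicit file list and aggregation dimension (both are illustrative):

>>> from netCDF4 import MFDataset
>>> f = MFDataset(["mftest0.nc", "mftest1.nc", "mftest2.nc"], aggdim="x", exclude=[])
>>> print(f.variables["x"][:].shape)
(30,)
>>> f.close()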

def close(

self)

Inheritance: Dataset.close

close all the open files.

def ncattrs(

self)

Inheritance: Dataset.ncattrs

return the netcdf attribute names from the master file.

class MFTime

Class providing an interface to a MFDataset time Variable by imposing a unique common time unit on all files.

Example usage (See __init__ for more details):

>>> import numpy
>>> f1 = Dataset("mftest_1.nc","w", format="NETCDF4_CLASSIC")
>>> f2 = Dataset("mftest_2.nc","w", format="NETCDF4_CLASSIC")
>>> f1.createDimension("time",None)
>>> f2.createDimension("time",None)
>>> t1 = f1.createVariable("time","i",("time",))
>>> t2 = f2.createVariable("time","i",("time",))
>>> t1.units = "days since 2000-01-01"
>>> t2.units = "days since 2000-02-01"
>>> t1.calendar = "standard"
>>> t2.calendar = "standard"
>>> t1[:] = numpy.arange(31)
>>> t2[:] = numpy.arange(30)
>>> f1.close()
>>> f2.close()
>>> # Read the two files in at once, in one Dataset.
>>> f = MFDataset("mftest*nc")
>>> t = f.variables["time"]
>>> print t.units
days since 2000-01-01
>>> print t[32] # The value written in the file, inconsistent with the MF time units.
1
>>> T = MFTime(t)
>>> print T[32]
32

Ancestors (in MRO)

  • MFTime
  • netCDF4._netCDF4._Variable
  • __builtin__.object

Methods

def __init__(

self, time, units=None)

Create a time Variable with units consistent across a multifile dataset.

time: Time variable from a MFDataset.

units: Time units, for example, days since 1979-01-01. If None, use the units from the master variable.
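
A sketch that builds on the class example above, imposing explicit target units (the units string is illustrative):

>>> from netCDF4 import MFDataset, MFTime
>>> f = MFDataset("mftest_*.nc")
>>> t = f.variables["time"]
>>> T = MFTime(t, units="days since 2000-01-01")
>>> print(T.units)
days since 2000-01-01
>>> print(T[32])   # value converted to the common units
32
>>> f.close()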

def ncattrs(

...)

def set_auto_chartostring(

...)

def set_auto_mask(

...)

def set_auto_maskandscale(

...)

def set_auto_scale(

...)

def typecode(

...)

class VLType

A VLType instance is used to describe a variable length (VLEN) data type, and can be passed to the createVariable method of a Dataset or Group instance. See __init__ for more details.

The instance variables dtype and name should not be modified by the user.

Ancestors (in MRO)

Class variables

var dtype

A numpy dtype object describing the component type for the VLEN.

var name

String name.

Static methods

def __init__(

group, datatype, datatype_name)

VLType constructor.

group: Group instance to associate with the VLEN datatype.

datatype: A numpy dtype object describing the component type for the variable length array.

datatype_name: a Python string containing a description of the VLEN data type.

Note: VLType instances should be created using the createVLType method of a Dataset or Group instance, not using this class directly.

class Variable

A netCDF Variable is used to read and write netCDF data. Variables are analogous to numpy array objects. See __init__ for more details.

A list of attribute names corresponding to netCDF attributes defined for the variable can be obtained with the ncattrs method. These attributes can be created by assigning to an attribute of the Variable instance. A dictionary containing all the netCDF attribute name/value pairs is provided by the __dict__ attribute of a Variable instance.

The following class variables are read-only:

dimensions: A tuple containing the names of the dimensions associated with this variable.

dtype: A numpy dtype object describing the variable's data type.

ndim: The number of variable dimensions.

shape: A tuple with the current shape (length of all dimensions).

scale: If True, scale_factor and add_offset are applied, and signed integer data is automatically converted to unsigned integer data if the _Unsigned attribute is set. Default is True, can be reset using set_auto_scale and set_auto_maskandscale methods.

mask: If True, data is automatically converted to/from masked arrays when missing values or fill values are present. Default is True, can be reset using set_auto_mask and set_auto_maskandscale methods.

chartostring: If True, data is automatically converted to/from character arrays to string arrays when the _Encoding variable attribute is set. Default is True, can be reset using set_auto_chartostring method.

least_significant_digit: Describes the power of ten of the smallest decimal place in the data that contains a reliable value. Data is truncated to this decimal place when it is assigned to the Variable instance. If None, the data is not truncated.

__orthogonal_indexing__: Always True. Indicates to client code that the object supports 'orthogonal indexing', which means that slices that are 1d arrays or lists slice along each dimension independently. This behavior is similar to Fortran or Matlab, but different than numpy.

datatype: numpy data type (for primitive data types) or VLType/CompoundType instance (for compound or vlen data types).

name: String name.

size: The number of stored elements.

Ancestors (in MRO)

Class variables

var chartostring

If True, data is automatically converted to/from character arrays to string arrays when _Encoding variable attribute is set. Default is True, can be reset using set_auto_chartostring method.

var datatype

numpy data type (for primitive data types) or VLType/CompoundType/EnumType instance (for compound, vlen or enum data types).

var dimensions

A tuple containing the names of the dimensions associated with this variable.

var dtype

A numpy dtype object describing the variable's data type.

var mask

If True, data is automatically converted to/from masked arrays when missing values or fill values are present. Default is True, can be reset using set_auto_mask and set_auto_maskandscale methods.

var name

String name.

var ndim

The number of variable dimensions.

var scale

if True, scale_factor and add_offset are applied, and signed integer data is converted to unsigned integer data if the _Unsigned attribute is set. Default is True, can be reset using set_auto_scale and set_auto_maskandscale methods.

var shape

A tuple with the current shape (length of all dimensions).

var size

The number of stored elements.

Static methods

def __init__(

self, group, name, datatype, dimensions=(), zlib=False, complevel=4, shuffle=True, fletcher32=False, contiguous=False, chunksizes=None, endian='native', least_significant_digit=None,fill_value=None)

Variable constructor.

group: Group or Dataset instance to associate with variable.

name: Name of the variable.

datatype: Variable data type. Can be specified by providing a numpy dtype object, or a string that describes a numpy dtype object. Supported values, corresponding to str attribute of numpy dtype objects, include 'f4' (32-bit floating point), 'f8' (64-bit floating point), 'i4' (32-bit signed integer), 'i2' (16-bit signed integer), 'i8' (64-bit signed integer), 'i1' (8-bit signed integer), 'u1' (8-bit unsigned integer), 'u2' (16-bit unsigned integer), 'u4' (32-bit unsigned integer), 'u8' (64-bit unsigned integer), or 'S1' (single-character string). For compatibility with Scientific.IO.NetCDF, the old Numeric single character typecodes can also be used ('f' instead of 'f4', 'd' instead of 'f8', 'h' or 's' instead of 'i2', 'b' or 'B' instead of 'i1', 'c' instead of 'S1', and 'i' or 'l' instead of 'i4'). datatype can also be a CompoundType instance (for a structured, or compound array), a VLType instance (for a variable-length array), or the python str builtin (for a variable-length string array). Numpy string and unicode datatypes with length greater than one are aliases for str.

dimensions: a tuple containing the variable's dimension names (defined previously with createDimension). Default is an empty tuple which means the variable is a scalar (and therefore has no dimensions).

zlib: if True, data assigned to the Variable instance is compressed on disk. Default False.

complevel: the level of zlib compression to use (1 is the fastest, but poorest compression, 9 is the slowest but best compression). Default 4. Ignored if zlib=False.

shuffle: if True, the HDF5 shuffle filter is applied to improve compression. Default True. Ignored if zlib=False.

fletcher32: if True (default False), the Fletcher32 checksum algorithm is used for error detection.

contiguous: if True (default False), the variable data is stored contiguously on disk. Setting to True for a variable with an unlimited dimension will trigger an error.

chunksizes: Can be used to specify the HDF5 chunksizes for each dimension of the variable. A detailed discussion of HDF chunking and I/O performance is available here. Basically, you want the chunk size for each dimension to match as closely as possible the size of the data block that users will read from the file. chunksizes cannot be set if contiguous=True.

endian: Can be used to control whether the data is stored in little or big endian format on disk. Possible values are little, big or native (default). The library will automatically handle endian conversions when the data is read, but if the data is always going to be read on a computer with the opposite format as the one used to create the file, there may be some performance advantage to be gained by setting the endian-ness. For netCDF 3 files (that don't use HDF5), only endian='native' is allowed.

The zlib, complevel, shuffle, fletcher32, contiguous and chunksizes keywords are silently ignored for netCDF 3 files that do not use HDF5.

least_significant_digit: If specified, variable data will be truncated (quantized). In conjunction with zlib=True this produces 'lossy', but significantly more efficient compression. For example, if least_significant_digit=1, data will be quantized using numpy.around(scale*data)/scale, where scale = 2**bits, and bits is determined so that a precision of 0.1 is retained (in this case bits=4). Default is None, or no quantization.

fill_value: If specified, the default netCDF _FillValue (the value that the variable gets filled with before any data is written to it) is replaced with this value. If fill_value is set to False, then the variable is not pre-filled. The default netCDF fill values can be found in netCDF4.default_fillvals.

Note: Variable instances should be created using the createVariable method of a Dataset or Group instance, not using this class directly.

def assignValue(

self, val)

assign a value to a scalar variable. Provided for compatibility with Scientific.IO.NetCDF, can also be done by assigning to an Ellipsis slice ([...]).

def chunking(

self)

return variable chunking information. If the dataset is defined to be contiguous (and hence there is no chunking) the word 'contiguous' is returned. Otherwise, a sequence with the chunksize for each dimension is returned.

def delncattr(

self,name,value)

delete a netCDF variable attribute. Use if you need to delete a netCDF attribute with the same name as one of the reserved python attributes.

def endian(

self)

return endian-ness (little,big,native) of variable (as stored in HDF5 file).

def filters(

self)

return dictionary containing HDF5 filter parameters.

def getValue(

self)

get the value of a scalar variable. Provided for compatibility with Scientific.IO.NetCDF, can also be done by slicing with an Ellipsis ([...]).

def get_var_chunk_cache(

self)

return variable chunk cache information in a tuple (size,nelems,preemption). See netcdf C library documentation for nc_get_var_chunk_cache for details.

def getncattr(

self,name)

retrieve a netCDF variable attribute. Use if you need to get a netCDF attribute with the same name as one of the reserved python attributes.

The optional kwarg encoding can be used to specify the character encoding of a string attribute (default is utf-8).

def group(

self)

return the group that this Variable is a member of.

def ncattrs(

self)

return netCDF attribute names for this Variable in a list.

def renameAttribute(

self, oldname, newname)

rename a Variable attribute named oldname to newname.

def set_auto_chartostring(

self,chartostring)

turn on or off automatic conversion of character variable data to and from numpy fixed length string arrays when the _Encoding variable attribute is set.

If chartostring is set to True, when data is read from a character variable (dtype = S1) that has an _Encoding attribute, it is converted to a numpy fixed length unicode string array (dtype = UN, where N is the length of the rightmost dimension of the variable). The value of _Encoding is the unicode encoding that is used to decode the bytes into strings.

When numpy string data is written to a variable it is converted back to individual bytes, with the number of bytes in each string equalling the rightmost dimension of the variable.

The default value of chartostring is True (automatic conversions are performed).
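
A minimal sketch of the conversion (file and variable names are illustrative):

>>> import numpy
>>> from netCDF4 import Dataset
>>> nc = Dataset("chars_demo.nc", "w")
>>> nstrings = nc.createDimension("nstrings", None)
>>> nchars = nc.createDimension("nchars", 3)
>>> v = nc.createVariable("strings", "S1", ("nstrings", "nchars"))
>>> v._Encoding = "ascii"                          # enables automatic char <--> string conversion
>>> v[:] = numpy.array(["foo", "bar"], dtype="S3") # strings are split into characters on write
>>> print(v[:])                                    # read back as fixed-length strings
>>> v.set_auto_chartostring(False)
>>> print(v[:])                                    # raw array of single characters
>>> nc.close()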

def set_auto_mask(

self,mask)

turn on or off automatic conversion of variable data to and from masked arrays.

If mask is set to True, when data is read from a variable it is converted to a masked array if any of the values are exactly equal to either the netCDF _FillValue or the value specified by the missing_value variable attribute. The fill_value of the masked array is set to the missing_value attribute (if it exists), otherwise the netCDF _FillValue attribute (which has a default value for each data type). When data is written to a variable, the masked array is converted back to a regular numpy array by replacing all the masked values by the missing_value attribute of the variable (if it exists). If the variable has no missing_value attribute, the _FillValue is used instead. If the variable has valid_min/valid_max and missing_value attributes, data outside the specified range will be set to missing_value.

The default value of mask is True (automatic conversions are performed).
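
A minimal sketch (file and variable names, and the fill value, are illustrative):

>>> import numpy
>>> from netCDF4 import Dataset
>>> nc = Dataset("mask_demo.nc", "w")
>>> x = nc.createDimension("x", 4)
>>> v = nc.createVariable("v", "f4", ("x",), fill_value=-999.0)
>>> v[0:2] = [1.0, 2.0]          # elements 2 and 3 keep the fill value
>>> nc.close()
>>> nc = Dataset("mask_demo.nc")
>>> v = nc.variables["v"]
>>> print(v[:])                  # masked array, fill values masked (default mask=True)
>>> v.set_auto_mask(False)
>>> print(v[:])                  # plain numpy array containing -999.0
>>> nc.close()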

def set_auto_maskandscale(

self,maskandscale)

turn on or off automatic conversion of variable data to and from masked arrays, automatic packing/unpacking of variable data using scale_factor and add_offset attributes and automatic conversion of signed integer data to unsigned integer data if the _Unsigned attribute exists.

If maskandscale is set to True, when data is read from a variable it is converted to a masked array if any of the values are exactly equal to either the netCDF _FillValue or the value specified by the missing_value variable attribute. The fill_value of the masked array is set to the missing_value attribute (if it exists), otherwise the netCDF _FillValue attribute (which has a default value for each data type). When data is written to a variable, the masked array is converted back to a regular numpy array by replacing all the masked values by the missing_value attribute of the variable (if it exists). If the variable has no missing_value attribute, the _FillValue is used instead. If the variable has valid_min/valid_max and missing_value attributes, data outside the specified range will be set to missing_value.

If maskandscale is set to True, and the variable has a scale_factor or an add_offset attribute, then data read from that variable is unpacked using::

data = self.scale_factor*data + self.add_offset

When data is written to a variable it is packed using::

data = (data - self.add_offset)/self.scale_factor

If scale_factor is present but add_offset is missing, add_offset is assumed to be zero. If add_offset is present but scale_factor is missing, scale_factor is assumed to be one. For more information on how scale_factor and add_offset can be used to provide simple compression, see the PSD metadata conventions.

In addition, if maskandscale is set to True, and if the variable has an attribute _Unsigned set, and the variable has a signed integer data type, a view to the data is returned with the corresponding unsigned integer data type. This convention is used by the netcdf-java library to save unsigned integer data in NETCDF3 or NETCDF4_CLASSIC files (since the NETCDF3 data model does not have unsigned integer data types).

The default value of maskandscale is True (automatic conversions are performed).
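
A minimal sketch of packing and unpacking with scale_factor and add_offset (names and values are illustrative):

>>> import numpy
>>> from netCDF4 import Dataset
>>> nc = Dataset("packed_demo.nc", "w")
>>> x = nc.createDimension("x", 3)
>>> t = nc.createVariable("t", "i2", ("x",))
>>> t.scale_factor = 0.01              # set packing attributes before writing any data
>>> t.add_offset = 273.15
>>> t[:] = [273.15, 274.0, 275.5]      # floats are packed to 16-bit integers on write
>>> nc.close()
>>> nc = Dataset("packed_demo.nc")
>>> t = nc.variables["t"]
>>> print(t[:])                        # unpacked floats (maskandscale is True by default)
>>> t.set_auto_maskandscale(False)
>>> print(t[:])                        # the raw packed int16 values
>>> nc.close()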

def set_auto_scale(

self,scale)

turn on or off automatic packing/unpacking of variable data using scale_factor and add_offset attributes. Also turns on and off automatic conversion of signed integer data to unsigned integer data if the variable has an _Unsigned attribute.

If scale is set to True, and the variable has a scale_factor or an add_offset attribute, then data read from that variable is unpacked using::

data = self.scale_factor*data + self.add_offset

When data is written to a variable it is packed using::

data = (data - self.add_offset)/self.scale_factor

If scale_factor is present but add_offset is missing, add_offset is assumed to be zero. If add_offset is present but scale_factor is missing, scale_factor is assumed to be one. For more information on how scale_factor and add_offset can be used to provide simple compression, see the PSD metadata conventions.

In addition, if scale is set to True, and if the variable has an attribute _Unsigned set, and the variable has a signed integer data type, a view to the data is returned with the corresponding unsigned integer datatype. This convention is used by the netcdf-java library to save unsigned integer data in NETCDF3 or NETCDF4_CLASSIC files (since the NETCDF3 data model does not have unsigned integer data types).

The default value of scale is True (automatic conversions are performed).

def set_collective(

self,True_or_False)

turn on or off collective parallel IO access. Ignored if file is not open for parallel access.

def set_var_chunk_cache(

self,size=None,nelems=None,preemption=None)

change variable chunk cache settings. See netcdf C library documentation for nc_set_var_chunk_cache for details.
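
For example, a sketch that changes the per-variable chunk cache (the cache settings below are illustrative, not recommendations):

>>> from netCDF4 import Dataset
>>> nc = Dataset("cache_demo.nc", "w", format="NETCDF4")
>>> x = nc.createDimension("x", 1000)
>>> v = nc.createVariable("v", "f4", ("x",), chunksizes=(100,))
>>> v.set_var_chunk_cache(size=1048576, nelems=521, preemption=0.75)
>>> print(v.get_var_chunk_cache())
>>> nc.close()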

def setncattr(

self,name,value)

set a netCDF variable attribute using name,value pair. Use if you need to set a netCDF attribute with the same name as one of the reserved python attributes.

def setncattr_string(

self,name,value)

set a netCDF variable string attribute using name,value pair. Use if you need to ensure that a netCDF attribute is created with type NC_STRING if the file format is NETCDF4. Use if you need to set an attribute to an array of variable-length strings.

def setncatts(

self,attdict)

set a bunch of netCDF variable attributes at once using a python dictionary. This may be faster when setting a lot of attributes for a NETCDF3 formatted file, since nc_redef/nc_enddef is not called in between setting each attribute.

def use_nc_get_vars(

self,_no_get_vars)

enable or disable the use of the netcdf library routine nc_get_vars to retrieve strided variable slices. By default, nc_get_vars is not used, since it is slower than multiple calls to the unstrided read routine nc_get_vara in most cases.
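
A sketch of switching between the two strided-read strategies (the file and variable names are hypothetical):

>>> from netCDF4 import Dataset
>>> nc = Dataset("some_existing_file.nc")      # hypothetical file with a 4-d variable 'data'
>>> v = nc.variables["data"]
>>> v.use_nc_get_vars(True)                    # use a single nc_get_vars call for strided reads
>>> subset = v[:, ::2, ::2, ::2]
>>> v.use_nc_get_vars(False)                   # revert to multiple nc_get_vara calls (the default)
>>> subset = v[:, ::2, ::2, ::2]
>>> nc.close()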

netcdf4-python-1.3.1rel/examples/000077500000000000000000000000001317565303700167155ustar00rootroot00000000000000netcdf4-python-1.3.1rel/examples/README.md000066400000000000000000000016551317565303700202030ustar00rootroot00000000000000* `tutorial.py`: code from introduction section of documentation. * `json_att.py`: shows to to use json to serialize python objects, save them as netcdf attributes, and then convert them back to python objects. * `subset.py`: shows how to use 'orthogonal indexing' to select geographic regions. * `reading_netcdf.ipynb`: ipython notebook from Unidata python workshop. * `writing_netcdf.ipynb`: ipython notebook from Unidata python workshop. * `threaded_read.py`: test script for concurrent threaded reads. * `bench.py`: benchmarks for reading/writing using different formats. * `bench_compress*.py``: benchmarks for reading/writing with compression. * `bench_diskless.py`: benchmarks for 'diskless' IO. * `test_stringarr.py`: test utilities for converting arrays of fixed-length strings to arrays of characters (with an extra dimension), and vice-versa. Useful since netcdf does not have a datatype for fixed-length string arrays. netcdf4-python-1.3.1rel/examples/bench.py000066400000000000000000000031071317565303700203470ustar00rootroot00000000000000# benchmark reads and writes, with and without compression. # tests all four supported file formats. from numpy.random.mtrand import uniform import netCDF4 from timeit import Timer import os, sys # create an n1dim by n2dim by n3dim random array. n1dim = 30 n2dim = 15 n3dim = 73 n4dim = 144 ntrials = 10 sys.stdout.write('reading and writing a %s by %s by %s by %s random array ..\n'%(n1dim,n2dim,n3dim,n4dim)) array = uniform(size=(n1dim,n2dim,n3dim,n4dim)) def write_netcdf(filename,zlib=False,least_significant_digit=None,format='NETCDF4'): file = netCDF4.Dataset(filename,'w',format=format) file.createDimension('n1', n1dim) file.createDimension('n2', n2dim) file.createDimension('n3', n3dim) file.createDimension('n4', n4dim) foo = file.createVariable('data', 'f8',('n1','n2','n3','n4'),zlib=zlib,least_significant_digit=least_significant_digit) foo[:] = array file.close() def read_netcdf(filename): file = netCDF4.Dataset(filename) data = file.variables['data'][:] file.close() for format in ['NETCDF3_CLASSIC','NETCDF3_64BIT','NETCDF4_CLASSIC','NETCDF4']: sys.stdout.write('testing file format %s ...\n' % format) # writing, no compression. t = Timer("write_netcdf('test1.nc',format='%s')" % format,"from __main__ import write_netcdf") sys.stdout.write('writing took %s seconds\n' %\ repr(sum(t.repeat(ntrials,1))/ntrials)) # test reading. t = Timer("read_netcdf('test1.nc')","from __main__ import read_netcdf") sys.stdout.write('reading took %s seconds\n' % repr(sum(t.repeat(ntrials,1))/ntrials)) netcdf4-python-1.3.1rel/examples/bench_compress.py000066400000000000000000000035501317565303700222640ustar00rootroot00000000000000# benchmark reads and writes, with and without compression. # tests all four supported file formats. from numpy.random.mtrand import uniform import netCDF4 from timeit import Timer import os, sys # create an n1dim by n2dim by n3dim random array. 
n1dim = 30 n2dim = 15 n3dim = 73 n4dim = 144 ntrials = 10 sys.stdout.write('reading and writing a %s by %s by %s by %s random array ..\n'%(n1dim,n2dim,n3dim,n4dim)) sys.stdout.write('(average of %s trials)\n' % ntrials) array = netCDF4.utils._quantize(uniform(size=(n1dim,n2dim,n3dim,n4dim)),4) def write_netcdf(filename,zlib=False,shuffle=False,complevel=6): file = netCDF4.Dataset(filename,'w',format='NETCDF4') file.createDimension('n1', n1dim) file.createDimension('n2', n2dim) file.createDimension('n3', n3dim) file.createDimension('n4', n4dim) foo = file.createVariable('data',\ 'f8',('n1','n2','n3','n4'),zlib=zlib,shuffle=shuffle,complevel=complevel) foo[:] = array file.close() def read_netcdf(filename): file = netCDF4.Dataset(filename) data = file.variables['data'][:] file.close() for compress_kwargs in ["zlib=False,shuffle=False","zlib=True,shuffle=False", "zlib=True,shuffle=True","zlib=True,shuffle=True,complevel=2"]: sys.stdout.write('testing compression %s...\n' % repr(compress_kwargs)) # writing. t = Timer("write_netcdf('test.nc',%s)" % compress_kwargs,"from __main__ import write_netcdf") sys.stdout.write('writing took %s seconds\n' %\ repr(sum(t.repeat(ntrials,1))/ntrials)) # test reading. t = Timer("read_netcdf('test.nc')","from __main__ import read_netcdf") sys.stdout.write('reading took %s seconds\n' % repr(sum(t.repeat(ntrials,1))/ntrials)) # print out size of resulting files. sys.stdout.write('size of test.nc = %s\n'%repr(os.stat('test.nc').st_size)) netcdf4-python-1.3.1rel/examples/bench_compress2.py000066400000000000000000000050541317565303700223470ustar00rootroot00000000000000# benchmark reads and writes, with and without compression. # tests all four supported file formats. from numpy.random.mtrand import uniform import netCDF4 from timeit import Timer import os, sys # create an n1dim by n2dim by n3dim random array. n1dim = 30 n2dim = 15 n3dim = 73 n4dim = 144 ntrials = 10 sys.stdout.write('reading and writing a %s by %s by %s by %s random array ..\n'%(n1dim,n2dim,n3dim,n4dim)) sys.stdout.write('(average of %s trials)\n\n' % ntrials) array = uniform(size=(n1dim,n2dim,n3dim,n4dim)) def write_netcdf(filename,complevel,lsd): file = netCDF4.Dataset(filename,'w',format='NETCDF4') file.createDimension('n1', n1dim) file.createDimension('n2', n2dim) file.createDimension('n3', n3dim) file.createDimension('n4', n4dim) foo = file.createVariable('data',\ 'f8',('n1','n2','n3','n4'),\ zlib=True,shuffle=True,complevel=complevel,\ least_significant_digit=lsd) foo[:] = array file.close() def read_netcdf(filename): file = netCDF4.Dataset(filename) data = file.variables['data'][:] file.close() lsd = None sys.stdout.write('using least_significant_digit %s\n\n' % lsd) for complevel in range(0,10,2): sys.stdout.write('testing compression with complevel %s...\n' % complevel) # writing. t = Timer("write_netcdf('test.nc',%s,%s)" % (complevel,lsd),"from __main__ import write_netcdf") sys.stdout.write('writing took %s seconds\n' %\ repr(sum(t.repeat(ntrials,1))/ntrials)) # test reading. t = Timer("read_netcdf('test.nc')","from __main__ import read_netcdf") sys.stdout.write('reading took %s seconds\n' % repr(sum(t.repeat(ntrials,1))/ntrials)) # print out size of resulting files. sys.stdout.write('size of test.nc = %s\n'%repr(os.stat('test.nc').st_size)) complevel = 4 sys.stdout.write('\nusing complevel %s\n\n' % complevel) for lsd in range(1,6): sys.stdout.write('testing compression with least_significant_digit %s...\n' % lsd) # writing. 
t = Timer("write_netcdf('test.nc',%s,%s)" % (complevel,lsd),"from __main__ import write_netcdf") sys.stdout.write('writing took %s seconds\n' %\ repr(sum(t.repeat(ntrials,1))/ntrials)) # test reading. t = Timer("read_netcdf('test.nc')","from __main__ import read_netcdf") sys.stdout.write('reading took %s seconds\n' % repr(sum(t.repeat(ntrials,1))/ntrials)) # print out size of resulting files. sys.stdout.write('size of test.nc = %s\n'%repr(os.stat('test.nc').st_size)) netcdf4-python-1.3.1rel/examples/bench_compress3.py000066400000000000000000000054121317565303700223460ustar00rootroot00000000000000from __future__ import print_function # benchmark reads and writes, with and without compression. # tests all four supported file formats. from numpy.random.mtrand import uniform import netCDF4 from timeit import Timer import os, sys # use real data. URL="http://www.esrl.noaa.gov/psd/thredds/dodsC/Datasets/ncep.reanalysis/pressure/hgt.1990.nc" nc = netCDF4.Dataset(URL) # use real 500 hPa geopotential height data. n1dim = 100 n3dim = 73 n4dim = 144 ntrials = 10 sys.stdout.write('reading and writing a %s by %s by %s random array ..\n'%(n1dim,n3dim,n4dim)) sys.stdout.write('(average of %s trials)\n\n' % ntrials) print(nc) print(nc.variables['hgt']) array = nc.variables['hgt'][0:n1dim,5,:,:] print(array.min(), array.max(), array.shape, array.dtype) def write_netcdf(filename,complevel,lsd): file = netCDF4.Dataset(filename,'w',format='NETCDF4') file.createDimension('n1', None) file.createDimension('n3', n3dim) file.createDimension('n4', n4dim) foo = file.createVariable('data',\ 'f4',('n1','n3','n4'),\ zlib=True,shuffle=True,complevel=complevel,\ least_significant_digit=lsd) foo[:] = array file.close() def read_netcdf(filename): file = netCDF4.Dataset(filename) data = file.variables['data'][:] file.close() lsd = None sys.stdout.write('using least_significant_digit %s\n\n' % lsd) for complevel in range(0,10,2): sys.stdout.write('testing compression with complevel %s...\n' % complevel) # writing. t = Timer("write_netcdf('test.nc',%s,%s)" % (complevel,lsd),"from __main__ import write_netcdf") sys.stdout.write('writing took %s seconds\n' %\ repr(sum(t.repeat(ntrials,1))/ntrials)) # test reading. t = Timer("read_netcdf('test.nc')","from __main__ import read_netcdf") sys.stdout.write('reading took %s seconds\n' % repr(sum(t.repeat(ntrials,1))/ntrials)) # print out size of resulting files. sys.stdout.write('size of test.nc = %s\n'%repr(os.stat('test.nc').st_size)) complevel = 4 complevel = 4 sys.stdout.write('\nusing complevel %s\n\n' % complevel) for lsd in range(0,6): sys.stdout.write('testing compression with least_significant_digit %s..\n'\ % lsd) # writing. t = Timer("write_netcdf('test.nc',%s,%s)" % (complevel,lsd),"from __main__ import write_netcdf") sys.stdout.write('writing took %s seconds\n' %\ repr(sum(t.repeat(ntrials,1))/ntrials)) # test reading. t = Timer("read_netcdf('test.nc')","from __main__ import read_netcdf") sys.stdout.write('reading took %s seconds\n' % repr(sum(t.repeat(ntrials,1))/ntrials)) # print out size of resulting files. sys.stdout.write('size of test.nc = %s\n'%repr(os.stat('test.nc').st_size)) netcdf4-python-1.3.1rel/examples/bench_diskless.py000066400000000000000000000052761317565303700222610ustar00rootroot00000000000000# benchmark reads and writes, with and without compression. # tests all four supported file formats. from numpy.random.mtrand import uniform import netCDF4 from timeit import Timer import os, sys # create an n1dim by n2dim by n3dim random array. 
n1dim = 30 n2dim = 15 n3dim = 73 n4dim = 144 ntrials = 10 sys.stdout.write('reading and writing a %s by %s by %s by %s random array ..\n'%(n1dim,n2dim,n3dim,n4dim)) array = uniform(size=(n1dim,n2dim,n3dim,n4dim)) def write_netcdf(filename,zlib=False,least_significant_digit=None,format='NETCDF4',closeit=False): file = netCDF4.Dataset(filename,'w',format=format,diskless=True,persist=True) file.createDimension('n1', n1dim) file.createDimension('n2', n2dim) file.createDimension('n3', n3dim) file.createDimension('n4', n4dim) foo = file.createVariable('data',\ 'f8',('n1','n2','n3','n4'),zlib=zlib,least_significant_digit=None) foo.testme="hi I am an attribute" foo.testme1="hi I am an attribute" foo.testme2="hi I am an attribute" foo.testme3="hi I am an attribute" foo.testme4="hi I am an attribute" foo.testme5="hi I am an attribute" foo[:] = array if closeit: file.close() return file def read_netcdf(ncfile): data = ncfile.variables['data'][:] for format in ['NETCDF4','NETCDF3_CLASSIC','NETCDF3_64BIT']: sys.stdout.write('testing file format %s ...\n' % format) # writing, no compression. t = Timer("write_netcdf('test1.nc',closeit=True,format='%s')" % format,"from __main__ import write_netcdf") sys.stdout.write('writing took %s seconds\n' %\ repr(sum(t.repeat(ntrials,1))/ntrials)) # test reading. ncfile = write_netcdf('test1.nc',format=format) t = Timer("read_netcdf(ncfile)","from __main__ import read_netcdf,ncfile") sys.stdout.write('reading took %s seconds\n' % repr(sum(t.repeat(ntrials,1))/ntrials)) # test diskless=True in nc_open format='NETCDF3_CLASSIC' trials=50 sys.stdout.write('test caching of file in memory on open for %s\n' % format) sys.stdout.write('testing file format %s ...\n' % format) write_netcdf('test1.nc',format=format,closeit=True) ncfile = netCDF4.Dataset('test1.nc',diskless=False) t = Timer("read_netcdf(ncfile)","from __main__ import read_netcdf,ncfile") sys.stdout.write('reading (from disk) took %s seconds\n' % repr(sum(t.repeat(ntrials,1))/ntrials)) ncfile.close() ncfile = netCDF4.Dataset('test1.nc',diskless=True) # setting diskless=True should cache the file in memory, # resulting in faster reads. t = Timer("read_netcdf(ncfile)","from __main__ import read_netcdf,ncfile") sys.stdout.write('reading (cached in memory) took %s seconds\n' % repr(sum(t.repeat(ntrials,1))/ntrials)) ncfile.close() netcdf4-python-1.3.1rel/examples/json_att.py000066400000000000000000000012631317565303700211120ustar00rootroot00000000000000from netCDF4 import Dataset import json # example showing how python objects (lists, dicts, None, True) # can be serialized as strings, saved as netCDF attributes, # and then converted back to python objects using json. 
ds = Dataset('json.nc', 'w') ds.pythonatt1 = json.dumps([u'foo', {u'bar': [u'baz', None, 1.0, 2]}]) ds.pythonatt2 = "true" # converted to bool ds.pythonatt3 = "null" # converted to None print(ds) ds.close() ds = Dataset('json.nc') def convert_json(s): try: a = json.loads(s) return a except: return s x = convert_json(ds.pythonatt1) print(type(x)) print(x) print(convert_json(ds.pythonatt2)) print(convert_json(ds.pythonatt3)) ds.close() netcdf4-python-1.3.1rel/examples/mpi_example.py000066400000000000000000000024211317565303700215660ustar00rootroot00000000000000# to run: mpirun -np 4 python mpi_example.py from mpi4py import MPI import numpy as np from netCDF4 import Dataset rank = MPI.COMM_WORLD.rank # The process ID (integer 0-3 for 4-process run) nc = Dataset('parallel_test.nc', 'w', parallel=True, comm=MPI.COMM_WORLD, info=MPI.Info()) # below should work also - MPI_COMM_WORLD and MPI_INFO_NULL will be used. #nc = Dataset('parallel_test.nc', 'w', parallel=True) d = nc.createDimension('dim',4) v = nc.createVariable('var', np.int, 'dim') v[rank] = rank # switch to collective mode, rewrite the data. v.set_collective(True) v[rank] = rank nc.close() # reopen the file read-only, check the data nc = Dataset('parallel_test.nc', parallel=True, comm=MPI.COMM_WORLD, info=MPI.Info()) assert rank==nc['var'][rank] nc.close() # reopen the file in append mode, modify the data on the last rank. nc = Dataset('parallel_test.nc', 'a',parallel=True, comm=MPI.COMM_WORLD, info=MPI.Info()) if rank == 3: v[rank] = 2*rank nc.close() # reopen the file read-only again, check the data. # leave out the comm and info kwargs to check that the defaults # (MPI_COMM_WORLD and MPI_INFO_NULL) work. nc = Dataset('parallel_test.nc', parallel=True) if rank == 3: assert 2*rank==nc['var'][rank] else: assert rank==nc['var'][rank] nc.close() netcdf4-python-1.3.1rel/examples/reading_netCDF.ipynb000066400000000000000000001451251317565303700225640ustar00rootroot00000000000000{ "cells": [ { "cell_type": "markdown", "metadata": { "internals": { "slide_helper": "subslide_end", "slide_type": "subslide" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "slide" } }, "source": [ "# Reading netCDF data\n", "- requires [numpy](http://numpy.scipy.org) and netCDF/HDF5 C libraries.\n", "- Github site: https://github.com/Unidata/netcdf4-python\n", "- Online docs: http://unidata.github.io/netcdf4-python/\n", "- Based on Konrad Hinsen's old [Scientific.IO.NetCDF](http://dirac.cnrs-orleans.fr/plone/software/scientificpython/) API, with lots of added netcdf version 4 features.\n", "- Developed by Jeff Whitaker at NOAA, with many contributions from users." 
] }, { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Interactively exploring a netCDF File\n", "\n", "Let's explore a netCDF file from the *Atlantic Real-Time Ocean Forecast System*\n", "\n", "first, import netcdf4-python and numpy" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false, "internals": { "frag_number": 2, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "import netCDF4\n", "import numpy as np" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 2, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Create a netCDF4.Dataset object\n", "- **`f`** is a `Dataset` object, representing an open netCDF file.\n", "- printing the object gives you summary information, similar to *`ncdump -h`*." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 4, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "root group (NETCDF4_CLASSIC data model, file format HDF5):\n", " Conventions: CF-1.0\n", " title: HYCOM ATLb2.00\n", " institution: National Centers for Environmental Prediction\n", " source: HYCOM archive file\n", " experiment: 90.9\n", " history: archv2ncdf3z\n", " dimensions(sizes): MT(1), Y(850), X(712), Depth(10)\n", " variables(dimensions): float64 \u001b[4mMT\u001b[0m(MT), float64 \u001b[4mDate\u001b[0m(MT), float32 \u001b[4mDepth\u001b[0m(Depth), int32 \u001b[4mY\u001b[0m(Y), int32 \u001b[4mX\u001b[0m(X), float32 \u001b[4mLatitude\u001b[0m(Y,X), float32 \u001b[4mLongitude\u001b[0m(Y,X), float32 \u001b[4mu\u001b[0m(MT,Depth,Y,X), float32 \u001b[4mv\u001b[0m(MT,Depth,Y,X), float32 \u001b[4mtemperature\u001b[0m(MT,Depth,Y,X), float32 \u001b[4msalinity\u001b[0m(MT,Depth,Y,X)\n", " groups: \n", "\n" ] } ], "source": [ "f = netCDF4.Dataset('data/rtofs_glo_3dz_f006_6hrly_reg3.nc')\n", "print(f) " ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 4, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Access a netCDF variable\n", "- variable objects stored by name in **`variables`** dict.\n", "- print the variable yields summary info (including all the attributes).\n", "- no actual data read yet (just have a reference to the variable object with metadata)." 
] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 6, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[u'MT', u'Date', u'Depth', u'Y', u'X', u'Latitude', u'Longitude', u'u', u'v', u'temperature', u'salinity']\n", "\n", "float32 temperature(MT, Depth, Y, X)\n", " coordinates: Longitude Latitude Date\n", " standard_name: sea_water_potential_temperature\n", " units: degC\n", " _FillValue: 1.26765e+30\n", " valid_range: [ -5.07860279 11.14989948]\n", " long_name: temp [90.9H]\n", "unlimited dimensions: MT\n", "current shape = (1, 10, 850, 712)\n", "filling on\n" ] } ], "source": [ "print(f.variables.keys()) # get all variable names\n", "temp = f.variables['temperature'] # temperature variable\n", "print(temp) " ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 6, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## List the Dimensions\n", "\n", "- All variables in a netCDF file have an associated shape, specified by a list of dimensions.\n", "- Let's list all the dimensions in this netCDF file.\n", "- Note that the **`MT`** dimension is special (*`unlimited`*), which means it can be appended to." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 8 }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(u'MT', (unlimited): name = 'MT', size = 1\n", ")\n", "(u'Y', : name = 'Y', size = 850\n", ")\n", "(u'X', : name = 'X', size = 712\n", ")\n", "(u'Depth', : name = 'Depth', size = 10\n", ")\n" ] } ], "source": [ "for d in f.dimensions.items():\n", " print(d)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 9 }, "slideshow": { "slide_type": "fragment" } }, "source": [ "Each variable has a **`dimensions`** and a **`shape`** attribute." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 10 }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "(u'MT', u'Depth', u'Y', u'X')" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp.dimensions" ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 11, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "(1, 10, 850, 712)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "temp.shape" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 11, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "### Each dimension typically has a variable associated with it (called a *coordinate* variable).\n", "- *Coordinate variables* are 1D variables that have the same name as dimensions.\n", "- Coordinate variables and *auxiliary coordinate variables* (named by the *coordinates* attribute) locate values in time and space." 
] }, { "cell_type": "code", "execution_count": 11, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 13, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "float64 MT(MT)\n", " long_name: time\n", " units: days since 1900-12-31 00:00:00\n", " calendar: standard\n", " axis: T\n", "unlimited dimensions: MT\n", "current shape = (1,)\n", "filling on, default _FillValue of 9.96920996839e+36 used\n", "\n", "\n", "int32 X(X)\n", " point_spacing: even\n", " axis: X\n", "unlimited dimensions: \n", "current shape = (712,)\n", "filling on, default _FillValue of -2147483647 used\n", "\n" ] } ], "source": [ "mt = f.variables['MT']\n", "depth = f.variables['Depth']\n", "x,y = f.variables['X'], f.variables['Y']\n", "print(mt)\n", "print(x) " ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 13, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Accessing data from a netCDF variable object\n", "\n", "- netCDF variables objects behave much like numpy arrays.\n", "- slicing a netCDF variable object returns a numpy array with the data.\n", "- Boolean array and integer sequence indexing behaves differently for netCDF variables than for numpy arrays. Only 1-d boolean arrays and integer sequences are allowed, and these indices work independently along each dimension (similar to the way vector subscripts work in fortran)." ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 15 }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 41023.25]\n" ] } ], "source": [ "time = mt[:] # Reads the netCDF variable MT, array of one element\n", "print(time) " ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 16 }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 0. 100. 200. 400. 700. 1000. 2000. 3000. 4000. 
5000.]\n" ] } ], "source": [ "dpth = depth[:] # examine depth array\n", "print(dpth) " ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 17, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "shape of temp variable: (1, 10, 850, 712)\n", "shape of temp slice: (6, 425, 356)\n" ] } ], "source": [ "xx,yy = x[:],y[:]\n", "print('shape of temp variable: %s' % repr(temp.shape))\n", "tempslice = temp[0, dpth > 400, yy > yy.max()/2, xx > xx.max()/2]\n", "print('shape of temp slice: %s' % repr(tempslice.shape))" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 17, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## What is the sea surface temperature and salinity at 50N, 140W?\n", "### Finding the latitude and longitude indices of 50N, 140W\n", "\n", "- The `X` and `Y` dimensions don't look like longitudes and latitudes\n", "- Use the auxilary coordinate variables named in the `coordinates` variable attribute, `Latitude` and `Longitude`" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 19 }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "float32 Latitude(Y, X)\n", " standard_name: latitude\n", " units: degrees_north\n", "unlimited dimensions: \n", "current shape = (850, 712)\n", "filling on, default _FillValue of 9.96920996839e+36 used\n", "\n" ] } ], "source": [ "lat, lon = f.variables['Latitude'], f.variables['Longitude']\n", "print(lat)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 20, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "source": [ "Aha! So we need to find array indices `iy` and `ix` such that `Latitude[iy, ix]` is close to 50.0 and `Longitude[iy, ix]` is close to -140.0 ..." 
] }, { "cell_type": "code", "execution_count": 16, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 20, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "# extract lat/lon values (in degrees) to numpy arrays\n", "latvals = lat[:]; lonvals = lon[:] \n", "# a function to find the index of the point closest pt\n", "# (in squared distance) to give lat/lon value.\n", "def getclosest_ij(lats,lons,latpt,lonpt):\n", " # find squared distance of every point on grid\n", " dist_sq = (lats-latpt)**2 + (lons-lonpt)**2 \n", " # 1D index of minimum dist_sq element\n", " minindex_flattened = dist_sq.argmin() \n", " # Get 2D index for latvals and lonvals arrays from 1D index\n", " return np.unravel_index(minindex_flattened, lats.shape)\n", "iy_min, ix_min = getclosest_ij(latvals, lonvals, 50., -140)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 22 }, "slideshow": { "slide_type": "fragment" } }, "source": [ "### Now we have all the information we need to find our answer.\n" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 23 }, "slideshow": { "slide_type": "fragment" } }, "source": [ "```\n", "|----------+--------|\n", "| Variable | Index |\n", "|----------+--------|\n", "| MT | 0 |\n", "| Depth | 0 |\n", "| Y | iy_min |\n", "| X | ix_min |\n", "|----------+--------|\n", "```" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 24 }, "slideshow": { "slide_type": "fragment" } }, "source": [ "### What is the sea surface temperature and salinity at the specified point?" ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 25, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " 6.4631 degC\n", "32.6572 psu\n" ] } ], "source": [ "sal = f.variables['salinity']\n", "# Read values out of the netCDF file for temperature and salinity\n", "print('%7.4f %s' % (temp[0,0,iy_min,ix_min], temp.units))\n", "print('%7.4f %s' % (sal[0,0,iy_min,ix_min], sal.units))" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 25, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Remote data access via openDAP\n", "\n", "- Remote data can be accessed seamlessly with the netcdf4-python API\n", "- Access happens via the DAP protocol and DAP servers, such as TDS.\n", "- many formats supported, like GRIB, are supported \"under the hood\"." ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 27 }, "slideshow": { "slide_type": "fragment" } }, "source": [ "The following example showcases some nice netCDF features:\n", "\n", "1. We are seamlessly accessing **remote** data, from a TDS server.\n", "2. We are seamlessly accessing **GRIB2** data, as if it were netCDF data.\n", "3. We are generating **metadata** on-the-fly." 
] }, { "cell_type": "code", "execution_count": 19, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 28, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "http://thredds.ucar.edu/thredds/dodsC/grib/NCEP/GFS/Global_0p5deg/GFS_Global_0p5deg_20150711_0600.grib2/GC\n" ] } ], "source": [ "import datetime\n", "date = datetime.datetime.now()\n", "# build URL for latest synoptic analysis time\n", "URL = 'http://thredds.ucar.edu/thredds/dodsC/grib/NCEP/GFS/Global_0p5deg/GFS_Global_0p5deg_%04i%02i%02i_%02i%02i.grib2/GC' %\\\n", "(date.year,date.month,date.day,6*(date.hour//6),0)\n", "# keep moving back 6 hours until a valid URL found\n", "validURL = False; ncount = 0\n", "while (not validURL and ncount < 10):\n", " print(URL)\n", " try:\n", " gfs = netCDF4.Dataset(URL)\n", " validURL = True\n", " except RuntimeError:\n", " date -= datetime.timedelta(hours=6)\n", " ncount += 1 " ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 28, "slide_helper": "subslide_end", "slide_type": "subslide" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "float32 Temperature_surface(time2, lat, lon)\n", " long_name: Temperature @ Ground or water surface\n", " units: K\n", " abbreviation: TMP\n", " missing_value: nan\n", " grid_mapping: LatLon_Projection\n", " coordinates: reftime time2 lat lon \n", " Grib_Variable_Id: VAR_0-0-0_L1\n", " Grib2_Parameter: [0 0 0]\n", " Grib2_Parameter_Discipline: Meteorological products\n", " Grib2_Parameter_Category: Temperature\n", " Grib2_Parameter_Name: Temperature\n", " Grib2_Level_Type: Ground or water surface\n", " Grib2_Generating_Process_Type: Forecast\n", "unlimited dimensions: \n", "current shape = (93, 361, 720)\n", "filling off\n", "\n", "\n", "float64 time2(time2)\n", " units: Hour since 2015-07-11T06:00:00Z\n", " standard_name: time\n", " long_name: GRIB forecast or observation time\n", " calendar: proleptic_gregorian\n", " _CoordinateAxisType: Time\n", "unlimited dimensions: \n", "current shape = (93,)\n", "filling off\n", "\n", "\n", "float32 lat(lat)\n", " units: degrees_north\n", " _CoordinateAxisType: Lat\n", "unlimited dimensions: \n", "current shape = (361,)\n", "filling off\n", "\n", "\n", "float32 lon(lon)\n", " units: degrees_east\n", " _CoordinateAxisType: Lon\n", "unlimited dimensions: \n", "current shape = (720,)\n", "filling off\n", "\n" ] } ], "source": [ "# Look at metadata for a specific variable\n", "# gfs.variables.keys() will show all available variables.\n", "sfctmp = gfs.variables['Temperature_surface']\n", "# get info about sfctmp\n", "print(sfctmp)\n", "# print coord vars associated with this variable\n", "for dname in sfctmp.dimensions: \n", " print(gfs.variables[dname])" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 28, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "##Missing values\n", "- when `data == var.missing_value` somewhere, a masked array is returned.\n", "- illustrate with soil moisture data (only defined over land)\n", "- white areas on plot are masked values over water." 
] }, { "cell_type": "code", "execution_count": 21, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 31 }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "shape=(361, 720), type=, missing_value=nan\n" ] }, { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAXIAAAD7CAYAAAB37B+tAAAABHNCSVQICAgIfAhkiAAAAAlwSFlz\nAAALEgAACxIB0t1+/AAAIABJREFUeJztnX+MJkeZ378Ptmf5dbPLgM8/N7dWwCc23NhGwr6EO7Fc\nwNhSgi9ShNlVyAlOEQpZQAQns2sHZ+OLHQb5yEm3AkXHDzmEmeCYHwIFDtvEm3CKzE/bY1gc2zpW\nYn322vHCDhd0s177yR/dNVNvvVXV1d3V3VX9Ph9pNO/bb/+o7q769tNPPfUUMTMEQRCEfHnR0AUQ\nBEEQ2iFCLgiCkDki5IIgCJkjQi4IgpA5IuSCIAiZI0IuCIKQOWcPcVAikphHQRCEBjAz2RY6/wC8\nGMB3ADwI4EcADpXLDwE4DuCB8u9abZuDAB4D8AiAqx37Zd9xU/1T55/jn5Rdyj0LZc+13KFld2mn\n1yJn5r8hojcz86+I6GwAf0FE3wDAAD7OzB/X1yei3QCuB7AbwEUA7iWiS5n5hZAnjSAIglCfSh85\nM/+q/DgH4BwUIg4A0+Y9cB2AVWZ+jpmPAXgcwJURyikIgiA4qBRyInoRET0I4ASAu5n5u+VP7yei\nh4jo00S0o1x2IQqXi+I4Cst8LBwZugAtODJ0AVpwZOgCNOTI0AVowZGhC9CQI0MXoAVHmm4YYpG/\nwMyXA7gYwFVE9HcAfBLAJQAuB/AkgD/27aJp4VKDmY8MXYamSNn7J9dyA/mWPddyA+3KHhy1wsyn\niOg+ANcw86ZwE9GnAHyt/PoEgJ3aZheXy6YgokPa1yM53wBBEIQuIKI9APZUrlf2hLp28ioAZ5j5\nF0T0EgDfBPBRAD9k5qfKdT4E4A3MvK/s7FxB4Re/CMC9AF7NxkGIiNkWQiMIgpAg206ub2rYxsI8\nuZZ1jUs7qyzyCwDcQURnoXDDfIGZv05E/5mILkfhNvkpgPcCADMfJaI7ARwFcAbA+0wRFwRBAADa\nB547vN56PzFEtEqozXX1Yy4tLOPm0oW8fHKpN1HX8VrknR1ULHJBmBlo31Y/WQzhNtGF03UsU1xN\n4XaJtsnSwjJuueE24PYt/VrTBjgudqxrLu0UIR8ZekXmFWuIqCB0gl73gK36p0Ty9P75aEKuhLnN\nQ+KZ7ec6f7v9rBs2P9/w/O0Tvx07+3TlvheZ6WbcxDc8fzvmz9rYbIdt3TEi5DOCaVkM8ZonzB5K\nUE0xNeufWs80MlS9fWb7uZhfKoRyfXkOAHDuqWec+9O3A7C5rULtw0aIIMckhrU+GiE3X2PWLHlb\nun69SRnXK6IIutA1Pp/36f3zU8ts637vldsnvle15ZtxEwPAP6bbwgs6ALE0KVshV0KtLoQp5Eq4\nRKgmMS2k0688Vvywd1FcLkJtaB84pN64LHObkAPAR1ZuAjApxKboTblsLiSs+UauJMjMC7nCaXnf\nUC6/fXatcEGIgcvtUWf7qm1NUX5odXp1m5BPuWxu256FmMf2DoxOyGfZfdIUsxFhdQ3Mi3IdhU6x\nWemq47OOK8UUdHPbFHnDs6eieguaxpEngwh3GKZVNSXelvXnDq9PvfqK+2WchLpIYhwHcNejucPr\neGb7uTimLatq47wC2nZyqz7fxTfihudv773TMpTFD/d3rGws8tQZstO1SqxD2PSla4LeVYPvS0yE\nMFyiW3WffGJt1km9foW4U9pia49D8IZnTwGI14eXvWslZYZw+8QQ7ypEbMcD7QP36Urz1c9Tn9s2\nFSYYu4/rZtzEt9yw1YE6pD89ph6IkHdEn6O6dGILuYj2OKmqJzHvu2tA0JCoqLah/Om7zsxNDAhq\nS/Y+8uS4gVh/yncl4n1Y3nWPozfQtpEOQsesrgF7F50/V90/30jhuiGJLqoGEeWGPr5l+7s2wCvd\nH1Ms8oasP7+Nj519uujQqPlaaLNc9AbVl3h3QR1BDzlPeUA0x3V9m9Yxm5Db9ml70NdBRbNEGQlZ\nduYvLSwD6H7g0K4zc9j+rg3MHV7v5IHk0s7KiSUEO2rYsG8IsA2fiNt+19erX8r+Mcsfcm5CN/AK\nyKw3Vd+b7N8n8E1QogsUrpHQhFYuTu+fxy24lZZPLuEuvhF38Y1tdudkkZmULrgGQHXFaCzyIQRw\n/fltfPtZN+AW3Bp0jK6FzBZGOAuk+pDr0z9dlzZvgCFRKm3EXYUmzp+1semiWPwwQH/FzuOHsjkS\nvIMBReoNokt346g7O32VJKVGbitnDFeKHtplG3RhivsYBT/0PvcR+tjqfpaRJURrXCfCpGlHo2sc\ngY8mQu5b14USdKDIRrh8cgmn9883atP6fe9SzHedmcO5p57pzM8/qs7OOg1lCJGv45uMaaWHpvEc\nm4gD9fztXdWJKPdy72KxH/VfZ3Vtcx2FyxBwlUUXbfNcQwTdZWmb1zZU2H2W++n989iOjYnyNWWy\nfPPgC7t5luvpb/skK4u8S9dEVQVVy2wDZ0LwNZKYVrkqW9Ny9kFObwR1LNvOClEReVIH/dqrOuKy\nHl1vkFXr2HA9RNtY7rU61mmt2NfeRXxk5aZonZ6LHwZuvv3GYPdqW7J3rYRUmBgCoVc4s/KFZnTz\nlc01HL5LIUhZ1HOgSzEf+t70+VC11fNQC99Ev261QiBX1/AQLmtS/E2GTBeSrWslpsBVCaarUrle\n6WzLbbOgmBaQuY8if0R3jUkEvB0hVmTM8FFXveqCPutG3XDT0OtYq99j7yJ2fW6udn6WXWfqRaf1\nTfJCXgeXYKrlVbkhYuYsUcedO7zuFHDz+9gF13WOuZ97mxBLX0d0zPkt28Z2d1GONmxO9Yb653Lu\nqWfwPYSP9FTzHtTtwPQZgk07bZ3HSt210mTQiC3uVG8cthviO05Ig9L3rzdOU9irlgvD4xO9kIiN\ntuhuOGA89aOucMXulDaF1TdsP4b7ZNvJdT69f6tjVUXIqMgWoP4o1iwHBOk30mXZ6uvqVrVvail9\nPdoH74CDkMbks7yrGmGXjbSLGcubklJZXLgG0FSJhj4wJmR9E73u6P9P759PR8RX16b/GmC2vabF\nafMGra5pVwODFBsL8zR3eH0qxFElDIuap9xnkRPRiwH8TwDbULhh7mLmQ0S0AOALAH4DwDEA72Dm\nX5TbHATwHgDPA/
gAM99t2a/TInf5Gn1pVpt2Mrlm9TY7Jc1j2ywml+D36e80yd1l0Qf6/YnZsEJn\njq+qO643vSa06awHYBdvFU1jiayJ5TqIaZnrkWdLC8vO6JWYHZoqnUeMBFqNLHJm/hsAb2bmywFc\nDuAaIroKwAEA9zDzpQC+VX4HEe0GcD2A3QCuAfAJIgqy+s0nrHnzXBMftH0q+ypzEyu7dWNx7KcJ\nqYt41VtWH8ePhRpKHjqk3OYH93WS20Q+5Pp1fn1bWOcmLis7pi+5al+LH44flTJ/1gYtMlPMLIgm\nlZ2dzPyr8uMcgHNQdC68HcCbyuV3ADiCQsyvA7DKzM8BOEZEjwO4EsD9vmOEirFrIEMTfBXcF6Wi\nW+uhQlnVeVV1vLHieuPpki4y7RVW3vTyKmvchms8gHmN6u67ahwDENAOTevbI+AxB1Y1cVf52FiY\nJ9fDdn15rsP4se6o7OwsLeofAvjbAA4z80Ei+jkzv6L8nQCcZOZXENGfArifmT9f/vYpAN9g5i8a\n+5x4PWib68HXuamvH2optbViXA0mNFwxdH8x6KvTtYuHktkpGSJIVYNgmuKqg1VuFeVCLKYxW596\nC43xhucrw8bCfJwxDJqoN528ou/cNHpmROViuYv7G9zThMadncz8QulauRjAVUT0OuN3hj8EKHpY\nTN1IljoZ1Lp4FW1jcbYRv7qv3V27OroKp1Pf1X02j6Ove3r//KD5rtWbnBJpvV6qcsXs5Ky65m0z\nC26ydxHYu5jVZN68AtpYmKdbcCtdtpdx2V5OWsR9BMeRM/MpIroPwNsAnCCi85n5KSK6AMDT5WpP\nANipbXZxuWwKIjqkfX0z9vJ9QeWwPJX1hqkqZtGgQ/YYH1fcuC+Spsl+ba4J/bPveC6hSM2VY7Oy\nbXVgShCNZP5dJ8qy1UHbOtsOb8Uj07754i1iZXKdybJPjoFQhHSQtiGHvPgxswymlFxPh4j2ANhT\nuV5F1MqrAJxh5l8Q0UsAfBPAR8sdP8vMy0R0AMAOZj5QdnauoPCLXwTgXgCvZuMgtteDqjzdQPVr\ncV3rwvfq28WgjLq4ImNCtnMxpBvHLEOsPg89EgEYfoYZXz1sMxDEPE+1Px9D3e82cd4h+26a7TEn\nNrM0avW56RD9CwDcQURnoXDDfIGZv05E9wO4k4j+EGX4IQAw81EiuhPAUQBnALzPFHEXthth3rDN\nSmdYW1Wjrlwi2KTHP4YPvQ5NHiipDjbyhZA2IVWL0ayLurC3ncihSOfQjBh1V2+TsTupQ98CXNke\nxyjmQNi5eYWcmR8G8HrL8pMA3uLY5jYArVKL+V6ZXB2dIYN6Qmlj0aaALv5mpE0qjLXRuYj1ljD1\nkDi8Fa9uw1YX1PcYuI5b5QoT/Oh6Vtwrf9tNbmTnUFaW2QmVyvDoJg3O1olZtS/z3Ju8wVQRsxNP\n79RM9eG67eQ60z7UztFRB3PfVR3W5u91r531DTlgm1TfnlKlbp1JTshduGJJ9cpkVixdnH0i4sqB\nYvs+BE3FvOp8becdep2aEjMm2IzpHzoixWRjYT5q/LMNZbn5xiK0iWWPSdeCPuaHRVX/X5JJs1QS\neFsoU9XNcg2oCInpNbcfgjox6G2OYRsY00dDiCVsesWOOVAsZ2ydoaF0Ff1SRdsR2iH7zgkz+mpq\nbMErtyP5iSVi53H2WdopEtM378sj49pfVw03ZFRhXcwKnpIl3id6ZIPKtgf4O+pjDTSKQZdCru8/\nB1x1Ws/bg1WyCnkyrpXYN7JuxfS5aOruw+bSGaKh1HGTdFG+0Nd65UtuepxZFXGgOPfQ8+96dG0K\n+xkb6uFcdd+SsMiJ1jjWnIQumligoVZLnY7Rqrj1utuF7qfPeOKQa21bp64gN0n2P3ZsGUN1unhg\ntx081JVVPnfYPvdAqkxY3nBcV4dFPqiQ225cH765VCJSTJqO/jRdSGbfgG9fXVnivjLq6+h+wCYz\nsIglN4kpBn260JrQdVscQ/2Y0MnUhBx73ce1+fFSqnxdYqvYIZ20IftRy6t88XWvdZP+B3V/2wi5\nYCdUzLtoX6HC3KVBoSfJG1udymry5Sr/bRIWtcr21rFLCKjuuFQ0DSF0hVw2cdOEjh7cfGCsVK4q\n1MTWflydnF0fuyuqLO1oycAyITmL3BSRkERRg6DnYu5BzIHq1+SmbimbZWbLh+KaTLaJO0ihLPKx\nWU4poI+QtoVrAsOmc2hybFtKalvdGWNUE+0DJx+1Akze2CYDGTon0kwoMbFZ0+afa11FiPWuRlJW\nNb6QdYR+sHUi6g/qLowhV7u1DQSrGx48d3h9KiV1iIjnjDkXsXO9VCzytlEZvWIKus0ij+R6qftK\n3LaTK3QAlbk/vZyhDXQMVlLqmHOHVrkp24q7+RDXJ85Qy0NT8vr2raiqQ2Ppf9m8ZjlY5FU3MmTA\njEs4msyObtsHr4CYF0kl0rcJNa+AXL/59u0qq0uUQyxiff0qzAbl26aOD95G7g0rR0JCQOsIeEgn\nvBIg3bI03xLNuqzXYf3NssnDpU6cfcq0yn7YF3Vvkm193YI0f1cXwZfjvOr4XYcxFf6vwoonLHpf\nk0JFtO51DWnoIdtWvUWMoWHliK8+NHVjmvsMsex1S91X50IHk+n7bHAK2cArIFq1u1iSda2YQhwy\nWMBVecxp34Dp4a+uY5vb+8oRul4IbV43Q7YLPX7d/fpe3wG3T1PEvRtc/uJQ11nbDnRfO7a1F1+9\nM+uR2YE7diEvclBdln4cucvPanba1PHjhdxcr5ivrk0k7woR8tg5Y2yEhiTGILTz0vSv28ro65wS\nMY9PlaFi3lvbd58VXyfaJGQWMNf+9bqlt7Oxi7eOL2olKSH3WbT6zYst5EB1ha8l0qtrhX88pFNU\nI1aUTmwxb/t2oJiVMLGUsA3Zd4Ws1nkLcz3cQ+qKWkfPuBly7FkQ7coHW2rZD5WQ+26OLtpmAw8R\n1Lo33vcaau7Le3xXmKIm5K7oEFW5QwfddBnFExIvbltXjz0POY5Y5N2h6pKtztSJSqqD603RjGcP\nmbBa37ZRYTKjUtdyiFox4RWQ6tE2YyltkR2+ikn7wCoG1YxFdVG7Iq+u+a1wi8CbZdZnvnGVp7GI\n14iDt4m4q1y2WPY6DW8skQUpYovmCnXdNcUWUaViwM11q9piEuHGQ1BzzMpgUSuuhm5OKmG6NHS/\nWIjFqCySucOWYxluGn0dMwSvsT/OEU9edzBESEhgbHyRQDEQS3xYTCvdFhYY0j/S1CVYZUzNkiXe\nljQtcldstgc1rZYeH+57TXGFRbnis6dQAt1gtKcZJ2uWRy9HcHksx5iiweAk2/F9I/TEuk4LVz03\nqeNG6wsR8XCSiCOfQHUSrjhGSzrEyBfC5qq4TUXSWq4AbC4IszzW/aprYjl3n5slZlSL6fc2O6Y3\n9ytJsJKi8EsX98fmXtHddXXfEhUhoYx1mNW6pLwRAGobXV6LnIh2EtF9RPR
jIvoREX2gXH6IiI4T\n0QPl37XaNgeJ6DEieoSIrq53KuXoSctcnUDhbtEFZGNhnkJcDjYBnVhf+bZX1+xWgMX3zbxIdYfh\nVz04JsqpH8+w/qv6A2JgK2eVheTygwrDYrYRV3toWq+6qItijdfDG7VCROcDOJ+ZHySilwP4AYDf\nB/AOAL9k5o8b6+9G8Rx9A4CLANwL4FJmfsFYzzv5srUsDh+13itvDvyJ+npoxJOrY1dts4kh9iHD\nm13bh4RqBe2/Aa57oB8zxLUi/vF+8WWuDBXi0Le+tsxynZiwygFLsIR9QJDXtcLMTwF4qvz810T0\nExQCDcD6xLwOwCozPwfgGBE9DuBKAPdPFThS+GAxqGB+SuQ3FuaJsCXyXmxuC8sy2/B+575t7haP\na8jsWAQAHN6lLa/XYKoaaNPOSz01qlq2lVq0+M3n5hKLfVhsYYjmwJ++BpoJ0yhjcUrQS0OSyK6b\nwT5yItoF4AoUovxGAO8non8K4PsAPszMvwBwISZF+zi2hL8VPlF3/VYsn7f62rayiRmdluYT0OGX\n9lZsVwiixwUTGtrnI6Z15POZVllMIuLpEyLMrqil2GMYmvrmx8yUe9nWZ6gRNCCodKscAfDvmfkr\nRPTrAJ4pf/4jABcw8x8S0Z8CuJ+ZP19u9ykAX2fmLxn7Y+ChSt9yl36yKSH3UTfaI2DSidAKu7Sw\njOWTS/WOH0DdRhiSLyX0d996QneE5iuqGnDWScemhvjHHffK41qpDD8konMAfBHAf2HmrwAAMz/N\nJQA+hcJ9AgBPANipbX5xuczCJ4HVfw48fAg4cWS6wH1M4hB6DK0z1LptwEhOk9CImToibu6vTeii\nKxzNtT8VdugaraeW29YT0qDJ4KG2A4dMRMQNThwpNPLhQ8DrvuRcraqzkwDcAeBZZv6QtvwCZn6y\n/PwhAG9g5n1aZ+eV2OrsfDUbB9m0yG0YOUpcESxtmLDGldjWfXDYRmzqYYKeiBbXIIy21N2fz9ry\nve7KQI28qcrnYcOXT6VtndORejWdH2cy+KFB9kMi+h0A/wvAGrC58xsB7AVwebnspwDey8wnym1u\nBPAeAGcAfJCZv2nZr1vIHbgEXe8UCBV9Z0WusqxdfnS1zLa9R9Rto+dcVK3X9qFQZ9SeCPk4qCvo\nsUZ2uvYz63VqU8uMCLXT++c1bUktjW1NIcfexenEVWbPLsLEvDMh9+3D3Beqc3cPRZWlJW6RceBq\nB02jVprmcJl1AVcQrbHSB/cAwQbhh6mhz6ITYqF70YW3qjMzxOq2LXOIuqsjKYZlHgvfMWQiiJFh\nGCM2N0qMXDsSwljB3kV3u6swENMXciWISnDLyhYs2KGE+MirQhWrsDQUnRCRHlrEhfGgT+zAvEim\nb1YRo0NT6lQYzjQdFaQr5LqAd0FI1InGZqC+/jpqE/aWybRyYGlhGcCtQxdDiACvgKrmiK0TQ25G\nLW25TSbHc7SdRWtsTOmK543eRprZDwH3yMiU8ZXP5n5pcT5tsyM2pRBxYVSsrgXHmPsw66HP/SZ+\n8S3cMeMGHhdwuha5i5ZiXgxzremW2bsIojWe8su7OkGryhjhgTSU9a7i2m9ZGOTwQodUzaHpMxhE\nmJuhd3Bu4rLGPbqRjkXeIFd2E6yCHEJZPj3fuflbG3JxqwjjZajJjGfezRLBhZxe+GHTATo1aGqV\n2zIgKir3V5GUKychl4iVcaInRDNTKlTlFmpiHPmmbpwFmgVsNByi3zlaJEpfVjkQf8To5v7U+diE\nWxAywXxYe3OVN2y3syjeithRd8MLucKM+OhQ1BtdxLJs5rZEa6zPM6pPN8e8OPE3sb+K80s15nZj\nYX7KWhPGievNyxT1NoJsdVUKtRnWtRI5kqMOzry/LfdXB1t+85RRDVsGBM0Oroe2bmiIENejneak\n6loZmAmXSAt0y7xy3X1gWwePaYWnFOonwj2b+CZv7mPKQSGM4SzyveVxB/Qdx7bKzf36sI2iazoK\nrkt3hwi4YE4TZ45AbltHZmnav/Zak6JFnlIHYCSffKiLxZXv24VtsEWsvN7mPiRnuKDDK6A2MwS5\n3kAVUtfak59rJWIn6MTTcYCHSsxp3No2BH3ihzb7EcaHbpHbEmr53gbrxogr0Z/52PKaDCfkDWbV\nmdguoqDH8pPX6fBUVkjosGd9OH5MsdX3JSIuVNG1T3zsPvc2gxF9pGeR15l+LSKxxLwuSjyrJo1Q\n+CIE6gqxvt9tJ9dZwgoFG7wCqnKj2OpOk/q0sTBPekiiqpdSP/2kJ+R9snfRns2wAW0HGOkNpa9Z\nxZvuWxrV7BESYugT3aYhiq52MTMEZkGcbSHHpJg1tcpjjBKtFNXVtaDGYLPKzcZgWuJqO3GtCD6a\ndIK3DVHkFWweU+LV3aSXa6UPLEKtKslENrIaecrbsO3kOsealLbKUraFONYV8FkKFxPc1Hkrk7qy\nxeZMZ74BkS5jcpUyn7MzBo4Z7fXJhEOFPGauFr1BtJ3T0BTZ0MYW2tBExAUTVSdUO1LflxaWN9Me\nm8xq/WkcjVMx+fJwrhVdVG1JpuruoyWhF9iaN6UlMRPwu159Y01CMasNUHDjcn24RBxo1hE601Ro\n5LA+8hABd/1uy5jo25fxm813R/swmavc2Ca2gJt01ZkT0nkqDUuIhTzs+8cr5ES0k4juI6IfE9GP\niOgD5fIFIrqHiB4loruJaIe2zUEieoyIHiGiq2MUspaAOn1LbjdJVWdMF1Z4k3LExHxo1HmISEMV\nqlBW+kxGmgxA1VRvzwH4EDM/SEQvB/ADIroHwLsB3MPMHyOiJQAHABwgot0ArgewG8BFAO4lokuZ\n+QXnEQKmRZuYHDZ0wJBnXZ9gbnZEoHyArPQbVx6bjYV52nZ4ujN17AMvhDQo3C1F3aN94Fmvd7Z8\nNbY35roPQK+QM/NTAJ4qP/81Ef0EhUC/HcCbytXuAHAEhZhfB2CVmZ8DcIyIHgdwJYD7nQep4+f2\nuVkCp0uyVSTzojlnAdJmUKk8UE1U4it9GHTM42wNsLD3BRTXRawnoR9C3+oaT82YKLYEdy5BryPm\nwT5yItoF4AoA3wFwHjOfKH86AeC88vOFAI5rmx1HIfztadKxGTjAZ8pXbk4eMaLcD64GJO4SoWvq\nGibbTq6zGrQ3phwsrtQc+rK6FnmQkJdulS8C+CAz/1L/jYv4Rd/F7efC61OsuaZbq7EvZ8XpMLlW\nV2Jqy26oVxQRcaFvqhJtmalz9f9q5Cjtw+b/Pso8FCHuqCofOYjoHBQi/jlm/kq5+AQRnc/MTxHR\nBQCeLpc/AWCntvnF5bJpHj609fnX9wDn7aksbFt8FyT0adjXa14st4oabGTuj1dAtG9+kFnThdkl\nJHe+q536ltO+eZ47vJ6NUVJ1HZRr5ex3/necWf4L4OFt3v15BwQREaHwgT/LzB/Sln+sXLZMRAcA\n7GBm1dm5gsIvfhGAewG8mo2DTE
wsgTAHf9NOANs+XEzs2zbyquFs4XXQ5/+M4ZPvegCPXhlzaUTC\n8Oj1Uk0d6JtWztV2zd9sRkvKKEPL1Xc3pXuOkZ1VrpU3AvgnAN5MRA+Uf9cA+CiAtxLRowB+r/wO\nZj4K4E4ARwF8A8D7TBE30U/A5Tdqkrc7yjRU+mTQPYg4MBnqGKOHX8RVSBF94FpVHa37Jp2bq6Xu\nG4iNwad6M29ELatZI2S70FSxmxj+8L57z3MYDp9DGYU8iDkoLaf6WOe8T79ye2JD9EvqukraWKk5\nDk6IVeY2qWclba3QBzHFN6f6GuO8B7PI9adKrSdSgNXu2842MMYplppFnnssa1NftnlvcrJ0hPzo\nUoBTr7vmudu0LlmLHJiebqyPnCO+ZWPATPDf5DxzsmoEAfDX89Trs6mDei72qjzwSQg5UG9ig6bi\nW9kBasaI9zztW0z0a9Sk88dW6VO3aIRxsbSwXHubqrf11DtCm07wUhlHPgRFTuNELvjexazCmRT6\nNVTxtbRvvrimK/5tRcSFITDrnS8NblPGmooiGYu8DjFiyq14Zg7KEV18Qy0R1+tn6q+lwjhpYpXP\nIkl0dtrwDQ4wCe38dGUcs5GzgJuYMxDZzs0cEm3DnHXIZqWrwR0Rii3MEH0bCrnWUZd2JulaCRFx\n20CAWCkyxyTiodTNtiYIuZKriPvI0rViI2RAkC7QUUZ+ZkLIDEFVYZ1mj3poFkVxyQhV9F1Hxjgu\nIkmL3EabGW1s27ks0DFa42bnsc39EdrBXCceXa0r7hYhRcY0KjlJi7yJSIdMLKx3+M2aG8E836rc\nzjEmarZtb8a3C8LQjKE+JtvZaU4LtZX9q9lUbiGiNEZrXMfs9AQmz9kUdnUdzU5OfVkVIdN7jcEi\nEtqRipDqOpGiHri0M0mLHCguorIKJ0RcF28jO6GJ2k5EvCAkUkf/c60fW3jHYBEJzaF9mJpTdmiK\nHOeJjGUccxLpAAARaUlEQVQJIFmLHHDEPttm6FFzdqrPLjzrzIKQA2GhiF0f14dY57OFemNrmkMp\nNrZ5M1PShuwscsAirqYQq+nc6k6/1uF0banTViibWimhQ49tgj+WuRoFO6lY467orhzqXtJC3giX\nSFuWq5uV0hO3D1RisiHOOyQpmi7m206uy3R0I0bd26Gs8Sn3rbZcJ3VjIn0h91nPum+8QYKrXGcU\niYFqQOYM5VUVNoao1tmHsuJFzIVQQi38qvVsKa9TJX0hB6bF3JWl0CXmeiepcsf49jej9FVRVedW\njBBHIX/0wIa+aBrinCrJDwhiXiSiNQ4SWyXYtomTAbfQZ5yuti2T4Yfzg76Z6BPwyiCi2SRWp6cr\nbUebFNgFaYp5Hha5z5oGqvOIh0SyCBPhh30eR7fGmmRsFPInZp3T02/k5B5pQ/IW+SZKjH3Ca4sx\nn2Fruw9oH1p1Rtq21S1zQWhKLNE2B8alSB4WuaKJ9RzokiFaE+FoQFfWu5rqqot9C4kS4e3Y5ToJ\n7pPJ9A29UsiJ6DNEdIKIHtaWHSKi40T0QPl3rfbbQSJ6jIgeIaKro5fY5V5pup4gCEmgJjiP2fFZ\naz9G35pu0adsjQNhFvlnAVxjLGMAH2fmK8q/bwAAEe0GcD2A3eU2nyCieFZ/E193JDEXf209iNaY\naK0IZ1T/HddQrdt3GYUE0dpxXUG3peQIdq849ENNkxhciIGoFFlm/jaAn1t+sp3cdQBWmfk5Zj4G\n4HEAV7Yp4EQDV3HjNQb9VFJD6EXMBaFblFVeB9eAHpegh8xFkINfXKeNtfx+InqIiD5NRDvKZRcC\nOK6tcxzARS2OUeATb9e6dbapEPOQDH7CFlbrus79EGYbo574kt/pYYamQKtcKepPj2Q5vX9+OpNq\nxq7YpkL+SQCXALgcwJMA/tiz7rBWbEQBETG3s+lGsblIQq592YjEvSK4aDOAxzlPb8bCbdJIyJn5\naS4B8ClsuU+eALBTW/XictkUZYep+tvTpBxebKM3Q0MXhTjINRUawLxIrrbqiguPPTI0lVGcRLRH\n10rXeo2EnIgu0L7+IwAqouWrAN5JRHNEdAmA1wD4rm0fzHxI+zvSpBxBBIqJGeq22UlHa5ybv6xP\n6lrR5voT/Q4i/EJJEzG3/eZaVkUqk5Ez8xFdK13rVQ4IIqJVAG8C8Coi+hmAfwtgDxFdjsJt8lMA\n7y0PepSI7gRwFMAZAO/jIRKeA/aZhBwVwxmvvCks4lIx2RRkVxphV/4bLWbf1rElQ/MFhaofRGts\n1iPbUH7faM6mYp7qkHyTpCeWKNaN4Dd1WHq+ASe6pSgW+TSVQq5wpVTwTNMn11kwcUWM6eJdJ69K\nqLinNigty4kloiERE91RdV1Dk52VpPA6KySIx81iRrX4IlnU8pDkXKmJuI98cq00pYE1Lvip/ZZk\nm9nJtk65vG3+FmGEBPafhGZPNMMUzW1zMyiSt8ibDBBwUgpKHZGQkMMtWo/A9I2+NRKeyWhPwUbs\n9ujaX27tPnkhb4XD8qsSCP13W3rVWWJi1iAluF1GlyjLPPMBGkJkSiNgaWG5schWhSrmlFvFJAsh\nb2SVRxKB3J7MMfE+8ERkhYxQ7djlN9fJza0CZCLkQqI0EfOanc6S30bQWT65BKBbAyvH/pn8hDxU\nPOrmW/GQ4xO6LZVD7dtc25rhiiLmgs7SwnL0fW4szJP6i77zHshPyOuixzk3dAfk+IRuTdV0eVXT\n70VGxFz4yMpNmyIeS8xzFm+dvIS8iXhYrL8qn/vm8ODVtZm0xoHAh5d+D0TMhS4x6pdysQgFecWR\ne0YEOlEWuQo9DOw4jRr2OAv4JsCWwVhCW1bXgJW4uxyDJa7IxiLfFFZbDhUbutUuoWzdEeIr18MJ\nG96LWY4eEgp0K3zqTXnGjYW8LPJQRLT7QX9DsvVDtOyb0JlVF5cwzR/tuxVAOhkKUyAbixwIdHd4\nREPcJQ2oM8uS7drLQ1VoCdEazz27C0CFVY7ZfXPLSsitiPtkOKqud8P7YcuDIQiA582srGuhFvqY\n/ONAhkJutaoDBGMmQwhbEu0Npqb/0pa8KGTCXGEGiDg+ZExkJ+RCz8R4y4lgmQszTM3645useSxx\n4yZZCnld61qs8bywWePA+F6HhQYEhrXOWifoOKNWNETE+yF0tqUQZq0RChWsrgGHd7XezZgNgSwt\n8lBExPuh6jq3vQ+n98+PuhEKflTEygSO4AZXR/nY68/ohJxXQOpv6LKMFf0ah15ntZ5rffGHC0EE\ndHLWmbtzLCQ/+bIwPMo1EuvhaHO1uPzip/fPy5vVjLPt5PrkRC+e0d1jrzcu7RQhFwZBF3PdgrLN\nuTj212LBjS7igGFdV+RemiUhr3StENFniOgEET2sLVsgonuI6FEiupuIdmi/HSSix4joESK6Ot4p\nCGPCbGRq+i1xsQhebHnxHcxStswQH/lnAVxjLDsA4B5mvhTAt8rvIKLdAK4HsLvc5hNENDo
/vBAH\nU7Rtvs0u/ZvbTq6zafEJ6aFyj5/eP18rdfIYLXIXleGHzPxtItplLH47gDeVn+8AcASFmF8HYJWZ\nnwNwjIgeB3AlgPsjlVcYIfoADj0RUtcNUR2HELcPQBD6pmkc+XnMfKL8fALAeeXnCzEp2scBXNTw\nGMLI2ViYp+kc0/PR8067MDtYhUzRsmzO6sO49YAgZmYi8r2eyqurkDRzh9elQzVBtp1c59Ap3WY9\ns2lTIT9BROcz81NEdAGAp8vlTwDYqa13cblsCiI6pH09wsxHGpZFEBqxsTBP4iNPm+WTS51MtpwL\nRLQHwJ7K9ULCD0sf+deY+bfK7x8D8CwzLxPRAQA7mPlA2dm5gsIvfhGAewG8mo2DSPih4EIJa5dx\nwOoYSsjFGk8T10PWNjvQrFjkbcIPVwH8bwC/SUQ/I6J3A/gogLcS0aMAfq/8DmY+CuBOAEcBfAPA\n+0wRF4QhMcVBRDxd1L3xWeS8ApoVEfcRErWy1/HTWxzr3wbgtjaFEmYXFbnShTWui7gIeJ5Ix7Qd\nifEWkiN2Y9XjxWfZ3zoGZMCYndGnsRXyouvwseWTS2KNC6NDcq0Io0dcKnni6+yc1Xjxxp2dgiAI\nfSOusHqIa0UYPWKF54cvxr/wk0unp45Y5IIgJItrQJAM5JpELHJBEJJCF2mXa0XesiYRi1wQhKSo\nEmkR8WlEyAVBEDJHwg8FQUgOlw981q1xCT8UBCEbNhbmaWNhnpYWlmuFIM7S9G46YpELgpAsNst8\nlq1yl3ZK1IogCMmzsTBPN+OmUtRvHbYwCSIWuSAIySK+8knERy4IQnYoX/nQ5UgdscgFQRAyQSxy\nQRCEkSJCLgiCkDki5IIgCJkjQi4IgpA5IuSCIAiZI0IuCIKQOa1GdhLRMQDrAJ4H8BwzX0lECwC+\nAOA3ABwD8A5m/kXLcgqCIAgO2lrkDGAPM1/BzFeWyw4AuIeZLwXwrfK7IAiC0BExXCtmcPrbAdxR\nfr4DwO9HOIYgCILgoNXITiL6SwCnULhW/hMz/xkR/ZyZX1H+TgBOqu/adjKyUxCEqKwRTYjZ4gg1\npqvsh29k5ieJ6FwA9xDRI/qPzMxkXFxBEISYmAIOjFPEfbQScmZ+svz/DBF9GcCVAE4Q0fnM/BQR\nXQDgadu2RHRI+3qEmY+0KYuwhV6xVYVeI+JFZrL9Jgi5MnYRJ6I9APZUrtfUtUJELwVwFjP/kohe\nBuBuAP8OwFsAPMvMy0R0AMAOZj5gbJula8VWaRRdVp66x/Wt79uHa7s6xzAfFlXHFISmjF3Ebbi0\ns42QXwLgy+XXswF8npn/Qxl+eCeAvwVH+GFOQl5XFHWUqLWpXG2Onytjb4xCc6raw9jrTnQh76Iw\nqdCHeIZUuFkUcZOxN0yhPkO9GaeACDm2KkAMd0Rs9DINXZYcGHuDFbYw2+0sW+VZC7lPgEO3tSHi\nOR7G3HhnGbNzfpZFHEhQyB/q/ajCrDD2xjwrNDGuxnDvfed9GYAu4sgFITnqRN8IaTIrb8ixzlOE\nXJgZ2kYQCQXyoIxDzIeVCLkgCMH4xCcFKzrFh3Uf10WEXBg9qTXsronlW7aJYuiAr6FI7V73da1E\nyIXRklqj7oJYQuFKOCUpHeozxINOolaEUZKr6KRs7eZG33Wgj3snUSvCzJCjiIuAx2eW3iZEyAWh\nY2wD2kS4+yV2rvLU7p+4VgTBgiQ6E/SO3VQ6eV2uFRFyQRCETHAJeYw5OwVBEIQBESEXBEHIHBFy\nQRCEzBEhFwRByBwRckEQhMwRIRcEQcgcEXJBEITMESEXBEHIHBFyQRCEzOlEyInoGiJ6hIgeI6Kl\nLo4hCIIgFEQXciI6C8BhANcA2A1gLxG9NvZxhuB7QxegBVL2/sm13EC+Zc+13EC7sndhkV8J4HFm\nPsbMzwH4rwCu6+A4vfP9oQvQAil7/+RabiDfsudabqBd2bsQ8osA/Ez7frxcJgiCIHRAF0I+eKpH\nQRCEWSJ6Glsi+m0Ah5j5mvL7QQAvMPOyto6IvSAIQgN6yUdORGcD+D8A/j6AvwLwXQB7mfknUQ8k\nCIIgAOhgqjdmPkNE+wF8E8BZAD4tIi4IgtAdg8wQJAiCIMSj15GdqQ8UIqLPENEJInpYW7ZARPcQ\n0aNEdDcR7dB+O1ieyyNEdPUwpQaIaCcR3UdEPyaiHxHRBzIq+4uJ6DtE9GBZ9kO5lL0sy1lE9AAR\nfa38nku5jxHRWln275bLki87Ee0goruI6CdEdJSIrsqk3L9ZXmv1d4qIPhCt7Mzcyx8KN8vjAHYB\nOAfAgwBe29fxA8v4uwCuAPCwtuxjAP51+XkJwEfLz7vLczinPKfHAbxooHKfD+Dy8vPLUfRRvDaH\nspfleWn5/2wA9wO4KqOy/0sAnwfw1VzqS1menwJYMJYlX3YAdwB4j1ZftudQbuMcXgTgSQA7Y5W9\nz8L/XQB/rn0/AODA0BfVUs5dmBTyRwCcV34+H8Aj5eeDAJa09f4cwG8PXf6yLF8B8Jbcyg7gpQB+\ngGJQWfJlB3AxgHsBvBnA13KqL6WQv9JYlnTZS9H+S8vypMttKe/VAL4ds+x9ulZyHSh0HjOfKD+f\nAHBe+flCFOegSOJ8iGgXireK7yCTshPRi4joQRRlvJuZv4s8yv4fAfwrAC9oy3IoN1CM97iXiL5P\nRP+sXJZ62S8B8AwRfZaIfkhEf0ZEL0P65TZ5J4DV8nOUsvcp5Nn3qnLxaPSdx6DnSEQvB/BFAB9k\n5l/qv6VcdmZ+gZkvR2HhXkVErzN+T67sRPQPADzNzA8AmIrrBdIst8YbmfkKANcC+BdE9Lv6j4mW\n/WwArwfwCWZ+PYD/h+LNfqtQaZZ7EyKaA/APAfw387c2Ze9TyJ9A4RNS7MTkEydVThDR+QBARBcA\neLpcbp7PxeWyQSCic1CI+OeY+Svl4izKrmDmUwDuA/A2pF/2vwfg7UT0UxTW1e8R0eeQfrkBAMz8\nZPn/GQBfRuHOSr3sxwEcZ2aVX+ouFML+VOLl1rkWwA/K6w5EuuZ9Cvn3AbyGiHaVT6XrAXy1x+M3\n5asA/qD8/Aco/M9q+TuJaI6ILgHwGhSDn3qHiAjApwEcZeY/0X7KoeyvUj31RPQSAG8F8BMkXnZm\nvpGZdzLzJShelf8HM78r9XIDABG9lIh+rfz8MhQ+24eReNmZ+SkAPyOiS8tFbwHwYwBfQ8LlNtiL\nLbcKEOua9+zkvxZFRMXjAA4O3elgKd8qitGop1H4898NYAFFh9ajAO4GsENb/8byXB4B8LYBy/07\nKPy0DwJ4oPy7JpOy/xaAHwJ4CIWY/JtyefJl18rzJmxFrSRfbhS+5gfLvx+ptphJ2S9DkfH1IQBf\nQtEBmny5y7K8DMD/BfBr2rIoZZcBQYIgCJkjU70JgiBkjgi5IAhC5oiQC4IgZI4IuSAIQuaIkAuC\nIGSOCLkgCELmiJALgiBkjgi5IAhC5vx/oWJ9OH
x0YTwAAAAASUVORK5CYII=\n", "text/plain": [ "" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "soilmvar = gfs.variables['Volumetric_Soil_Moisture_Content_depth_below_surface_layer']\n", "# flip the data in latitude so North Hemisphere is up on the plot\n", "soilm = soilmvar[0,0,::-1,:] \n", "print('shape=%s, type=%s, missing_value=%s' % \\\n", " (soilm.shape, type(soilm), soilmvar.missing_value))\n", "import matplotlib.pyplot as plt\n", "%matplotlib inline\n", "cs = plt.contourf(soilm)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 32, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "source": [ "##Packed integer data\n", "There is a similar feature for variables with `scale_factor` and `add_offset` attributes.\n", "\n", "- short integer data will automatically be returned as float data, with the scale and offset applied. " ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 32, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Dealing with dates and times\n", "- time variables usually measure relative to a fixed date using a certain calendar, with units specified like ***`hours since YY:MM:DD hh-mm-ss`***.\n", "- **`num2date`** and **`date2num`** convenience functions provided to convert between these numeric time coordinates and handy python datetime instances. \n", "- **`date2index`** finds the time index corresponding to a datetime instance." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 34 }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "name of time dimension = time2\n", "units = Hour since 2015-07-11T06:00:00Z, values = [ 0. 3. 6. 9. 12. 15. 18. 21. 24. 27. 30. 33.\n", " 36. 39. 42. 45. 48. 51. 54. 57. 60. 63. 66. 69.\n", " 72. 75. 78. 81. 84. 87. 90. 93. 96. 99. 102. 105.\n", " 108. 111. 114. 117. 120. 123. 126. 129. 132. 135. 138. 141.\n", " 144. 147. 150. 153. 156. 159. 162. 165. 168. 171. 174. 177.\n", " 180. 183. 186. 189. 192. 195. 198. 201. 204. 207. 210. 213.\n", " 216. 219. 222. 225. 228. 231. 234. 237. 240. 252. 264. 276.\n", " 288. 300. 312. 324. 336. 348. 360. 372. 384.]\n" ] } ], "source": [ "from netCDF4 import num2date, date2num, date2index\n", "timedim = sfctmp.dimensions[0] # time dim name\n", "print('name of time dimension = %s' % timedim)\n", "times = gfs.variables[timedim] # time coord var\n", "print('units = %s, values = %s' % (times.units, times[:]))" ] }, { "cell_type": "code", "execution_count": 23, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 35, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "['2015-07-11 06:00:00', '2015-07-11 09:00:00', '2015-07-11 12:00:00', '2015-07-11 15:00:00', '2015-07-11 18:00:00', '2015-07-11 21:00:00', '2015-07-12 00:00:00', '2015-07-12 03:00:00', '2015-07-12 06:00:00', '2015-07-12 09:00:00']\n" ] } ], "source": [ "dates = num2date(times[:], times.units)\n", "print([date.strftime('%Y-%m-%d %H:%M:%S') for date in dates[:10]]) # print only first ten..." 
] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 35, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "###Get index associated with a specified date, extract forecast data for that date." ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 37 }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2015-07-14 07:22:39.579246\n", "index = 24, date = 2015-07-14 06:00:00\n" ] } ], "source": [ "from datetime import datetime, timedelta\n", "date = datetime.now() + timedelta(days=3)\n", "print(date)\n", "ntime = date2index(date,times,select='nearest')\n", "print('index = %s, date = %s' % (ntime, dates[ntime]))" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 38 }, "slideshow": { "slide_type": "fragment" } }, "source": [ "###Get temp forecast for Boulder (near 40N, -105W)\n", "- use function **`getcloses_ij`** we created before..." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 39, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Boulder forecast valid at 2015-07-14 06:00:00 UTC = 296.8 K\n" ] } ], "source": [ "lats, lons = gfs.variables['lat'][:], gfs.variables['lon'][:]\n", "# lats, lons are 1-d. Make them 2-d using numpy.meshgrid.\n", "lons, lats = np.meshgrid(lons,lats)\n", "j, i = getclosest_ij(lats,lons,40,-105)\n", "fcst_temp = sfctmp[ntime,j,i]\n", "print('Boulder forecast valid at %s UTC = %5.1f %s' % \\\n", " (dates[ntime],fcst_temp,sfctmp.units))" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 39, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "##Simple multi-file aggregation\n", "\n", "What if you have a bunch of netcdf files, each with data for a different year, and you want to access all the data as if it were in one file?" 
] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 41 }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-rw-r--r-- 1 jwhitaker staff 8985332 Jul 10 06:43 data/prmsl.2000.nc\r\n", "-rw-r--r-- 1 jwhitaker staff 8968789 Jul 10 06:43 data/prmsl.2001.nc\r\n", "-rw-r--r-- 1 jwhitaker staff 8972796 Jul 10 06:43 data/prmsl.2002.nc\r\n", "-rw-r--r-- 1 jwhitaker staff 8974435 Jul 10 06:43 data/prmsl.2003.nc\r\n", "-rw-r--r-- 1 jwhitaker staff 8997438 Jul 10 06:43 data/prmsl.2004.nc\r\n", "-rw-r--r-- 1 jwhitaker staff 8976678 Jul 10 06:43 data/prmsl.2005.nc\r\n", "-rw-r--r-- 1 jwhitaker staff 8969714 Jul 10 06:43 data/prmsl.2006.nc\r\n", "-rw-r--r-- 1 jwhitaker staff 8974360 Jul 10 06:43 data/prmsl.2007.nc\r\n", "-rw-r--r-- 1 jwhitaker staff 8994260 Jul 10 06:43 data/prmsl.2008.nc\r\n", "-rw-r--r-- 1 jwhitaker staff 8974678 Jul 10 06:43 data/prmsl.2009.nc\r\n", "-rw-r--r-- 1 jwhitaker staff 8970732 Jul 10 06:43 data/prmsl.2010.nc\r\n", "-rw-r--r-- 1 jwhitaker staff 8976285 Jul 10 06:43 data/prmsl.2011.nc\r\n" ] } ], "source": [ "!ls -l data/prmsl*nc" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 42 }, "slideshow": { "slide_type": "fragment" } }, "source": [ "**`MFDataset`** uses file globbing to patch together all the files into one big Dataset.\n", "You can also pass it a list of specific files.\n", "\n", "Limitations:\n", "\n", "- It can only aggregate the data along the leftmost dimension of each variable.\n", "- only works with `NETCDF3`, or `NETCDF4_CLASSIC` formatted files.\n", "- kind of slow." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 43, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "starting date = 2000-01-01 00:00:00\n", "ending date = 2011-12-31 00:00:00\n", "times shape = 4383\n", "prmsl dimensions = (u'time', u'lat', u'lon'), prmsl shape = (4383, 91, 180)\n" ] } ], "source": [ "mf = netCDF4.MFDataset('data/prmsl*nc')\n", "times = mf.variables['time']\n", "dates = num2date(times[:],times.units)\n", "print('starting date = %s' % dates[0])\n", "print('ending date = %s'% dates[-1])\n", "prmsl = mf.variables['prmsl']\n", "print('times shape = %s' % times.shape)\n", "print('prmsl dimensions = %s, prmsl shape = %s' %\\\n", " (prmsl.dimensions, prmsl.shape))" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 43, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Closing your netCDF file\n", "\n", "It's good to close netCDF files, but not actually necessary when Dataset is open for read access only.\n" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 45 }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "f.close()\n", "gfs.close()" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 45, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "-" } }, "source": [ "##That's it!\n", "\n", "Now you're ready to start exploring your data 
interactively.\n", "\n", "To be continued with **Writing netCDF data** ...." ] } ], "metadata": { "celltoolbar": "Raw Cell Format", "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.9" } }, "nbformat": 4, "nbformat_minor": 0 } netcdf4-python-1.3.1rel/examples/subset.py000066400000000000000000000012061317565303700205730ustar00rootroot00000000000000# use 'orthogonal indexing' feature to subselect data over CONUS. import netCDF4 import numpy as np import matplotlib.pyplot as plt # use real data from CFS reanlysis. # note: we're reading GRIB2 data! URL="http://nomads.ncdc.noaa.gov/thredds/dodsC/modeldata/cmd_flxf/2010/201007/20100701/flxf00.gdas.2010070100.grb2" nc = netCDF4.Dataset(URL) lats = nc.variables['lat'][:]; lons = nc.variables['lon'][:] latselect = np.logical_and(lats>25,lats<50) lonselect = np.logical_and(lons>230,lons<305) data = nc.variables['Soil_moisture_content'][0,0,latselect,lonselect] plt.contourf(data[::-1]) # flip latitudes so they go south -> north plt.show() netcdf4-python-1.3.1rel/examples/test_stringarr.py000066400000000000000000000035331317565303700223450ustar00rootroot00000000000000from netCDF4 import Dataset, stringtochar, chartostring import random, numpy # test utilities for converting arrays of fixed-length strings # to arrays of characters (with an extra dimension), and vice-versa. # netCDF does not have a fixed-length string data-type (only characters # and variable length strings). The convenience function chartostring # converts an array of characters to an array of fixed-length strings. # The array of fixed length strings has one less dimension, and the # length of the strings is equal to the rightmost dimension of the # array of characters. The convenience function stringtochar goes # the other way, converting an array of fixed-length strings to an # array of characters with an extra dimension (the number of characters # per string) appended on the right. FILE_NAME = 'tst_stringarr.nc' FILE_FORMAT = 'NETCDF4_CLASSIC' chars = '1234567890aabcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ' nc = Dataset(FILE_NAME,'w',format=FILE_FORMAT) n2 = 10; nchar = 12; nrecs = 4 nc.createDimension('n1',None) nc.createDimension('n2',n2) nc.createDimension('nchar',nchar) v = nc.createVariable('strings','S1',('n1','n2','nchar')) for nrec in range(nrecs): data = [] data = numpy.empty((n2,),'S'+repr(nchar)) # fill data with random nchar character strings for n in range(n2): data[n] = ''.join([random.choice(chars) for i in range(nchar)]) print(nrec,data) # convert data to array of characters with an extra dimension # (the number of characters per string) added to the right. datac = stringtochar(data) v[nrec] = datac nc.close() nc = Dataset(FILE_NAME) v = nc.variables['strings'] print(v.shape, v.dtype) for nrec in range(nrecs): # read character array back, convert to an array of strings # of length equal to the rightmost dimension. 
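# Note: chartostring collapses the trailing 'nchar' dimension, so the
# (n2, nchar) array of single characters read back from the variable becomes
# an (n2,) array of fixed-length 12-character strings; as a quick check,
#   print(chartostring(v[nrec]).shape)
# should show (10,) for this file.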
print(nrec, chartostring(v[nrec])) nc.close() netcdf4-python-1.3.1rel/examples/threaded_read.py000066400000000000000000000033001317565303700220360ustar00rootroot00000000000000from __future__ import print_function from netCDF4 import Dataset from numpy.testing import assert_array_equal, assert_array_almost_equal import numpy as np import threading import queue import time # demonstrate reading of different files from different threads. # Releasing the Global Interpreter Lock (GIL) when calling the # netcdf C library for read operations speeds up the reads # when threads are used (issue 369). # Test script contributed by Ryan May of Unidata. # Make some files nfiles = 4 fnames = []; datal = [] for i in range(nfiles): fname = 'test%d.nc' % i fnames.append(fname) nc = Dataset(fname, 'w') data = np.random.randn(500, 500, 500) datal.append(data) nc.createDimension('x', 500) nc.createDimension('y', 500) nc.createDimension('z', 500) var = nc.createVariable('grid', 'f', ('x', 'y', 'z')) var[:] = data nc.close() # Queue them up items = queue.Queue() for data,fname in zip(datal,fnames): items.put(fname) # Function for threads to use def get_data(serial=None): if serial is None: # if not called from a thread fname = items.get() else: fname = fnames[serial] nc = Dataset(fname, 'r') data2 = nc.variables['grid'][:] # make sure the data is correct #assert_array_almost_equal(data2,datal[int(fname[4])]) nc.close() if serial is None: items.task_done() # Time it (no threading). start = time.time() for i in range(nfiles): get_data(serial=i) end = time.time() print('no threads, time = ',end - start) # with threading. start = time.time() for i in range(nfiles): threading.Thread(target=get_data).start() items.join() end = time.time() print('with threading, time = ',end - start) netcdf4-python-1.3.1rel/examples/tutorial.py000066400000000000000000000277531317565303700211500ustar00rootroot00000000000000from netCDF4 import Dataset # code from tutorial. # create a file (Dataset object, also the root group). rootgrp = Dataset('test.nc', 'w', format='NETCDF4') print(rootgrp.file_format) rootgrp.close() # create some groups. rootgrp = Dataset('test.nc', 'a') fcstgrp = rootgrp.createGroup('forecasts') analgrp = rootgrp.createGroup('analyses') fcstgrp1 = rootgrp.createGroup('/forecasts/model1') fcstgrp2 = rootgrp.createGroup('/forecasts/model2') # walk the group tree using a Python generator. def walktree(top): values = top.groups.values() yield values for value in top.groups.values(): for children in walktree(value): yield children print(rootgrp) for children in walktree(rootgrp): for child in children: print(child) # dimensions. level = rootgrp.createDimension('level', None) time = rootgrp.createDimension('time', None) lat = rootgrp.createDimension('lat', 73) lon = rootgrp.createDimension('lon', 144) print(rootgrp.dimensions) print(len(lon)) print(lon.isunlimited()) print(time.isunlimited()) for dimobj in rootgrp.dimensions.values(): print(dimobj) print(time) # variables. times = rootgrp.createVariable('time','f8',('time',)) levels = rootgrp.createVariable('level','i4',('level',)) latitudes = rootgrp.createVariable('lat','f4',('lat',)) longitudes = rootgrp.createVariable('lon','f4',('lon',)) # 2 unlimited dimensions. #temp = rootgrp.createVariable('temp','f4',('time','level','lat','lon',)) # this makes the compression 'lossy' (preserving a precision of 1/1000) # try it and see how much smaller the file gets. 
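# As a rough sketch (not exercised in this script), the quantization from
# least_significant_digit mainly pays off when combined with deflate
# compression, since the quantized values compress better; a compressed
# variant of the call below could look like:
#temp = rootgrp.createVariable('temp','f4',('time','level','lat','lon',),
#                              zlib=True, complevel=4, least_significant_digit=3)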
temp = rootgrp.createVariable('temp','f4',('time','level','lat','lon',),least_significant_digit=3) print(temp) # create variable in a group using a path. temp = rootgrp.createVariable('/forecasts/model1/temp','f4',('time','level','lat','lon',)) print(rootgrp['/forecasts/model1']) # print the Group instance print(rootgrp['/forecasts/model1/temp']) # print the Variable instance # attributes. import time rootgrp.description = 'bogus example script' rootgrp.history = 'Created ' + time.ctime(time.time()) rootgrp.source = 'netCDF4 python module tutorial' latitudes.units = 'degrees north' longitudes.units = 'degrees east' levels.units = 'hPa' temp.units = 'K' times.units = 'hours since 0001-01-01 00:00:00.0' times.calendar = 'gregorian' for name in rootgrp.ncattrs(): print('Global attr', name, '=', getattr(rootgrp,name)) print(rootgrp) print(rootgrp.__dict__) print(rootgrp.variables) import numpy # no unlimited dimension, just assign to slice. lats = numpy.arange(-90,91,2.5) lons = numpy.arange(-180,180,2.5) latitudes[:] = lats longitudes[:] = lons print('latitudes =\n',latitudes[:]) print('longitudes =\n',longitudes[:]) # append along two unlimited dimensions by assigning to slice. nlats = len(rootgrp.dimensions['lat']) nlons = len(rootgrp.dimensions['lon']) print('temp shape before adding data = ',temp.shape) from numpy.random.mtrand import uniform # random number generator. temp[0:5,0:10,:,:] = uniform(size=(5,10,nlats,nlons)) print('temp shape after adding data = ',temp.shape) # levels have grown, but no values yet assigned. print('levels shape after adding pressure data = ',levels.shape) # assign values to levels dimension variable. levels[:] = [1000.,850.,700.,500.,300.,250.,200.,150.,100.,50.] # fancy slicing tempdat = temp[::2, [1,3,6], lats>0, lons>0] print('shape of fancy temp slice = ',tempdat.shape) print(temp[0, 0, [0,1,2,3], [0,1,2,3]].shape) # fill in times. from datetime import datetime, timedelta from netCDF4 import num2date, date2num, date2index dates = [datetime(2001,3,1)+n*timedelta(hours=12) for n in range(temp.shape[0])] times[:] = date2num(dates,units=times.units,calendar=times.calendar) print('time values (in units %s): ' % times.units+'\\n',times[:]) dates = num2date(times[:],units=times.units,calendar=times.calendar) print('dates corresponding to time values:\\n',dates) rootgrp.close() # create a series of netCDF files with a variable sharing # the same unlimited dimension. for nfile in range(10): f = Dataset('mftest'+repr(nfile)+'.nc','w',format='NETCDF4_CLASSIC') f.createDimension('x',None) x = f.createVariable('x','i',('x',)) x[0:10] = numpy.arange(nfile*10,10*(nfile+1)) f.close() # now read all those files in at once, in one Dataset. from netCDF4 import MFDataset f = MFDataset('mftest*nc') print(f.variables['x'][:]) # example showing how to save numpy complex arrays using compound types. f = Dataset('complex.nc','w') size = 3 # length of 1-d complex array # create sample complex data. datac = numpy.exp(1j*(1.+numpy.linspace(0, numpy.pi, size))) print(datac.dtype) # create complex128 compound data type. complex128 = numpy.dtype([('real',numpy.float64),('imag',numpy.float64)]) complex128_t = f.createCompoundType(complex128,'complex128') # create a variable with this data type, write some data to it. f.createDimension('x_dim',None) v = f.createVariable('cmplx_var',complex128_t,'x_dim') data = numpy.empty(size,complex128) # numpy structured array data['real'] = datac.real; data['imag'] = datac.imag v[:] = data # close and reopen the file, check the contents. 
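# Aside (untested sketch): a numpy complex128 value is laid out as two
# contiguous float64s, so the field-by-field copy into the structured array
# above could probably also be done with a zero-copy view of the complex data
# as the compound dtype:
#v[:] = datac.view(complex128)  # 'complex128' is the structured dtype defined above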
f.close() f = Dataset('complex.nc') print(f) print(f.variables['cmplx_var']) print(f.cmptypes) print(f.cmptypes['complex128']) v = f.variables['cmplx_var'] print(v.shape) datain = v[:] # read in all the data into a numpy structured array # create an empty numpy complex array datac2 = numpy.empty(datain.shape,numpy.complex128) # .. fill it with contents of structured array. datac2.real = datain['real'] datac2.imag = datain['imag'] print(datac.dtype,datac) print(datac2.dtype,datac2) # more complex compound type example. from netCDF4 import chartostring, stringtoarr f = Dataset('compound_example.nc','w') # create a new dataset. # create an unlimited dimension call 'station' f.createDimension('station',None) # define a compound data type (can contain arrays, or nested compound types). NUMCHARS = 80 # number of characters to use in fixed-length strings. winddtype = numpy.dtype([('speed','f4'),('direction','i4')]) statdtype = numpy.dtype([('latitude', 'f4'), ('longitude', 'f4'), ('surface_wind',winddtype), ('temp_sounding','f4',10),('press_sounding','i4',10), ('location_name','S1',NUMCHARS)]) # use this data type definitions to create a compound data types # called using the createCompoundType Dataset method. # create a compound type for vector wind which will be nested inside # the station data type. This must be done first! wind_data_t = f.createCompoundType(winddtype,'wind_data') # now that wind_data_t is defined, create the station data type. station_data_t = f.createCompoundType(statdtype,'station_data') # create nested compound data types to hold the units variable attribute. winddtype_units = numpy.dtype([('speed','S1',NUMCHARS),('direction','S1',NUMCHARS)]) statdtype_units = numpy.dtype([('latitude', 'S1',NUMCHARS), ('longitude', 'S1',NUMCHARS), ('surface_wind',winddtype_units), ('temp_sounding','S1',NUMCHARS), ('location_name','S1',NUMCHARS), ('press_sounding','S1',NUMCHARS)]) # create the wind_data_units type first, since it will nested inside # the station_data_units data type. wind_data_units_t = f.createCompoundType(winddtype_units,'wind_data_units') station_data_units_t =\ f.createCompoundType(statdtype_units,'station_data_units') # create a variable of of type 'station_data_t' statdat = f.createVariable('station_obs', station_data_t, ('station',)) # create a numpy structured array, assign data to it. data = numpy.empty(1,station_data_t) data['latitude'] = 40. data['longitude'] = -105. data['surface_wind']['speed'] = 12.5 data['surface_wind']['direction'] = 270 data['temp_sounding'] = (280.3,272.,270.,269.,266.,258.,254.1,250.,245.5,240.) data['press_sounding'] = range(800,300,-50) # variable-length string datatypes are not supported inside compound types, so # to store strings in a compound data type, each string must be # stored as fixed-size (in this case 80) array of characters. data['location_name'] = stringtoarr('Boulder, Colorado, USA',NUMCHARS) # assign structured array to variable slice. statdat[0] = data # or just assign a tuple of values to variable slice # (will automatically be converted to a structured array). 
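# Note: when assigning a plain tuple, the values must be supplied in the same
# order as the fields of the compound dtype (latitude, longitude, surface_wind,
# temp_sounding, press_sounding, location_name), as in the assignment below.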
statdat[1] = (40.78,-73.99,(-12.5,90), (290.2,282.5,279.,277.9,276.,266.,264.1,260.,255.5,243.), range(900,400,-50),stringtoarr('New York, New York, USA',NUMCHARS)) print(f.cmptypes) windunits = numpy.empty(1,winddtype_units) stationobs_units = numpy.empty(1,statdtype_units) windunits['speed'] = stringtoarr('m/s',NUMCHARS) windunits['direction'] = stringtoarr('degrees',NUMCHARS) stationobs_units['latitude'] = stringtoarr('degrees north',NUMCHARS) stationobs_units['longitude'] = stringtoarr('degrees west',NUMCHARS) stationobs_units['surface_wind'] = windunits stationobs_units['location_name'] = stringtoarr('None', NUMCHARS) stationobs_units['temp_sounding'] = stringtoarr('Kelvin',NUMCHARS) stationobs_units['press_sounding'] = stringtoarr('hPa',NUMCHARS) statdat.units = stationobs_units # close and reopen the file. f.close() f = Dataset('compound_example.nc') print(f) statdat = f.variables['station_obs'] print(statdat) # print out data in variable. print('data in a variable of compound type:') print('----') for data in statdat[:]: for name in statdat.dtype.names: if data[name].dtype.kind == 'S': # a string # convert array of characters back to a string for display. units = chartostring(statdat.units[name]) print(name,': value =',chartostring(data[name]),\ ': units=',units) elif data[name].dtype.kind == 'V': # a nested compound type units_list = [chartostring(s) for s in tuple(statdat.units[name])] print(name,data[name].dtype.names,': value=',data[name],': units=',\ units_list) else: # a numeric type. units = chartostring(statdat.units[name]) print(name,': value=',data[name],': units=',units) print('----') f.close() f = Dataset('tst_vlen.nc','w') vlen_t = f.createVLType(numpy.int32, 'phony_vlen') x = f.createDimension('x',3) y = f.createDimension('y',4) vlvar = f.createVariable('phony_vlen_var', vlen_t, ('y','x')) import random data = numpy.empty(len(y)*len(x),object) for n in range(len(y)*len(x)): data[n] = numpy.arange(random.randint(1,10),dtype='int32')+1 data = numpy.reshape(data,(len(y),len(x))) vlvar[:] = data print(vlvar) print('vlen variable =\n',vlvar[:]) print(f) print(f.variables['phony_vlen_var']) print(f.vltypes['phony_vlen']) z = f.createDimension('z', 10) strvar = f.createVariable('strvar',str,'z') chars = '1234567890aabcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ' data = numpy.empty(10,object) for n in range(10): stringlen = random.randint(2,12) data[n] = ''.join([random.choice(chars) for i in range(stringlen)]) strvar[:] = data print('variable-length string variable:\n',strvar[:]) print(f) print(f.variables['strvar']) f.close() # Enum type example. f = Dataset('clouds.nc','w') # python dict describing the allowed values and their names. enum_dict = {u'Altocumulus': 7, u'Missing': 255, u'Stratus': 2, u'Clear': 0, u'Nimbostratus': 6, u'Cumulus': 4, u'Altostratus': 5, u'Cumulonimbus': 1, u'Stratocumulus': 3} # create the Enum type called 'cloud_t'. cloud_type = f.createEnumType(numpy.uint8,'cloud_t',enum_dict) print(cloud_type) time = f.createDimension('time',None) # create a 1d variable of type 'cloud_type' called 'primary_clouds'. # The fill_value is set to the 'Missing' named value. cloud_var = f.createVariable('primary_cloud',cloud_type,'time',\ fill_value=enum_dict['Missing']) # write some data to the variable. cloud_var[:] = [enum_dict['Clear'],enum_dict['Stratus'],enum_dict['Cumulus'],\ enum_dict['Missing'],enum_dict['Cumulonimbus']] # close file, reopen it. 
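# Aside (untested sketch): values written to an Enum variable are checked
# against enum_dict, so an assignment with an undefined member such as
#cloud_var[0] = 99
# should raise a ValueError rather than be stored.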
f.close() f = Dataset('clouds.nc') cloud_var = f.variables['primary_cloud'] print(cloud_var) print(cloud_var.datatype.enum_dict) print(cloud_var[:]) f.close() netcdf4-python-1.3.1rel/examples/writing_netCDF.ipynb000066400000000000000000001074161317565303700226370ustar00rootroot00000000000000{ "cells": [ { "cell_type": "markdown", "metadata": { "internals": { "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "# Writing netCDF data\n", "\n", "**Important Note**: when running this notebook interactively in a browser, you probably will not be able to execute individual cells out of order without getting an error. Instead, choose \"Run All\" from the Cell menu after you modify a cell." ] }, { "cell_type": "code", "execution_count": 25, "metadata": { "collapsed": false, "internals": { "frag_number": 1, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "import netCDF4 # Note: python is case-sensitive!\n", "import numpy as np" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 1, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Opening a file, creating a new Dataset\n", "\n", "Let's create a new, empty netCDF file named 'data/new.nc', opened for writing.\n", "\n", "Be careful, opening a file with 'w' will clobber any existing data (unless `clobber=False` is used, in which case an exception is raised if the file already exists).\n", "\n", "- `mode='r'` is the default.\n", "- `mode='a'` opens an existing file and allows for appending (does not clobber existing data)\n", "- `format` can be one of `NETCDF3_CLASSIC`, `NETCDF3_64BIT`, `NETCDF4_CLASSIC` or `NETCDF4` (default). `NETCDF4_CLASSIC` uses HDF5 for the underlying storage layer (as does `NETCDF4`) but enforces the classic netCDF 3 data model so data can be read with older clients. " ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 3, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "root group (NETCDF4_CLASSIC data model, file format HDF5):\n", " dimensions(sizes): \n", " variables(dimensions): \n", " groups: \n", "\n" ] } ], "source": [ "try: ncfile.close() # just to be safe, make sure dataset is not already open.\n", "except: pass\n", "ncfile = netCDF4.Dataset('data/new.nc',mode='w',format='NETCDF4_CLASSIC') \n", "print(ncfile)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 3, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Creating dimensions\n", "\n", "The **ncfile** object we created is a container for _dimensions_, _variables_, and _attributes_. First, let's create some dimensions using the [`createDimension`](http://unidata.github.io/netcdf4-python/netCDF4.Dataset-class.html#createDimension) method. \n", "\n", "- Every dimension has a name and a length. \n", "- The name is a string that is used to specify the dimension to be used when creating a variable, and as a key to access the dimension object in the `ncfile.dimensions` dictionary.\n", "\n", "Setting the dimension length to `0` or `None` makes it unlimited, so it can grow. \n", "\n", "- For `NETCDF4` files, any variable's dimension can be unlimited. 
\n", "- For `NETCDF4_CLASSIC` and `NETCDF3*` files, only one per variable can be unlimited, and it must be the leftmost (fastest varying) dimension." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 5, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('lat', : name = 'lat', size = 73\n", ")\n", "('lon', : name = 'lon', size = 144\n", ")\n", "('time', (unlimited): name = 'time', size = 0\n", ")\n" ] } ], "source": [ "lat_dim = ncfile.createDimension('lat', 73) # latitude axis\n", "lon_dim = ncfile.createDimension('lon', 144) # longitude axis\n", "time_dim = ncfile.createDimension('time', None) # unlimited axis (can be appended to).\n", "for dim in ncfile.dimensions.items():\n", " print(dim)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 5, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Creating attributes\n", "\n", "netCDF attributes can be created just like you would for any python object. \n", "\n", "- Best to adhere to established conventions (like the [CF](http://cfconventions.org/) conventions)\n", "- We won't try to adhere to any specific convention here though." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 7 }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "My model data\n" ] } ], "source": [ "ncfile.title='My model data'\n", "print(ncfile.title)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 8, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "source": [ "Try adding some more attributes..." ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 8, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Creating variables\n", "\n", "Now let's add some variables and store some data in them. \n", "\n", "- A variable has a name, a type, a shape, and some data values. \n", "- The shape of a variable is specified by a tuple of dimension names. \n", "- A variable should also have some named attributes, such as 'units', that describe the data.\n", "\n", "The [`createVariable`](http://unidata.github.io/netcdf4-python/netCDF4.Dataset-class.html#createVariable) method takes 3 mandatory args.\n", "\n", "- the 1st argument is the variable name (a string). This is used as the key to access the variable object from the `variables` dictionary.\n", "- the 2nd argument is the datatype (most numpy datatypes supported). \n", "- the third argument is a tuple containing the dimension names (the dimensions must be created first). 
Unless this is a `NETCDF4` file, any unlimited dimension must be the leftmost one.\n", "- there are lots of optional arguments (many of which are only relevant when `format='NETCDF4'`) to control compression, chunking, fill_value, etc.\n" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 10, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "float64 temp(time, lat, lon)\n", " units: K\n", " standard_name: air_temperature\n", "unlimited dimensions: time\n", "current shape = (0, 73, 144)\n", "filling on, default _FillValue of 9.96920996839e+36 used\n", "\n" ] } ], "source": [ "# Define two variables with the same names as dimensions,\n", "# a conventional way to define \"coordinate variables\".\n", "lat = ncfile.createVariable('lat', np.float32, ('lat',))\n", "lat.units = 'degrees_north'\n", "lat.long_name = 'latitude'\n", "lon = ncfile.createVariable('lon', np.float32, ('lon',))\n", "lon.units = 'degrees_east'\n", "lon.long_name = 'longitude'\n", "time = ncfile.createVariable('time', np.float64, ('time',))\n", "time.units = 'hours since 1800-01-01'\n", "time.long_name = 'time'\n", "# Define a 3D variable to hold the data\n", "temp = ncfile.createVariable('temp',np.float64,('time','lat','lon')) # note: unlimited dimension is leftmost\n", "temp.units = 'K' # degrees Kelvin\n", "temp.standard_name = 'air_temperature' # this is a CF standard name\n", "print(temp)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 10, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Pre-defined variable attributes (read only)\n", "\n", "The netCDF4 module provides some useful pre-defined Python attributes for netCDF variables, such as dimensions, shape, dtype, ndim. \n", "\n", "Note: since no data has been written yet, the length of the 'time' dimension is 0." ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 12, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "-- Some pre-defined attributes for variable temp:\n", "('temp.dimensions:', (u'time', u'lat', u'lon'))\n", "('temp.shape:', (0, 73, 144))\n", "('temp.dtype:', dtype('float64'))\n", "('temp.ndim:', 3)\n" ] } ], "source": [ "print(\"-- Some pre-defined attributes for variable temp:\")\n", "print(\"temp.dimensions:\", temp.dimensions)\n", "print(\"temp.shape:\", temp.shape)\n", "print(\"temp.dtype:\", temp.dtype)\n", "print(\"temp.ndim:\", temp.ndim)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 12, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Writing data\n", "\n", "To write data a netCDF variable object, just treat it like a numpy array and assign values to a slice." 
] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 14 }, "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('-- Wrote data, temp.shape is now ', (3, 73, 144))\n", "('-- Min/Max values:', 280.00283562143028, 329.99987991477548)\n" ] } ], "source": [ "nlats = len(lat_dim); nlons = len(lon_dim); ntimes = 3\n", "# Write latitudes, longitudes.\n", "# Note: the \":\" is necessary in these \"write\" statements\n", "lat[:] = -90. + (180./nlats)*np.arange(nlats) # south pole to north pole\n", "lon[:] = (180./nlats)*np.arange(nlons) # Greenwich meridian eastward\n", "# create a 3D array of random numbers\n", "data_arr = np.random.uniform(low=280,high=330,size=(ntimes,nlats,nlons))\n", "# Write the data. This writes the whole 3D netCDF variable all at once.\n", "temp[:,:,:] = data_arr # Appends data along unlimited dimension\n", "print(\"-- Wrote data, temp.shape is now \", temp.shape)\n", "# read data back from variable (by slicing it), print min and max\n", "print(\"-- Min/Max values:\", temp[:,:,:].min(), temp[:,:,:].max())" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 15, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "source": [ "- You can just treat a netCDF Variable object like a numpy array and assign values to it.\n", "- Variables automatically grow along unlimited dimensions (unlike numpy arrays)\n", "- The above writes the whole 3D variable all at once, but you can write it a slice at a time instead.\n", "\n", "Let's add another time slice....\n" ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 15, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('-- Wrote more data, temp.shape is now ', (4, 73, 144))\n" ] } ], "source": [ "# create a 2D array of random numbers\n", "data_slice = np.random.uniform(low=280,high=330,size=(nlats,nlons))\n", "temp[3,:,:] = data_slice # Appends the 4th time slice\n", "print(\"-- Wrote more data, temp.shape is now \", temp.shape)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 17 }, "slideshow": { "slide_type": "fragment" } }, "source": [ "Note that we have not yet written any data to the time variable. It automatically grew as we appended data along the time dimension to the variable `temp`, but the data is missing." 
] }, { "cell_type": "code", "execution_count": 33, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 18, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "float64 time(time)\n", " units: hours since 1800-01-01\n", " long_name: time\n", "unlimited dimensions: time\n", "current shape = (4,)\n", "filling on, default _FillValue of 9.96920996839e+36 used\n", "\n", "(, masked_array(data = [-- -- -- --],\n", " mask = [ True True True True],\n", " fill_value = 9.96920996839e+36)\n", ")\n" ] } ], "source": [ "print(time)\n", "times_arr = time[:]\n", "print(type(times_arr),times_arr) # dashes indicate masked values (where data has not yet been written)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 18, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "Let's add write some data into the time variable. \n", "\n", "- Given a set of datetime instances, use date2num to convert to numeric time values and then write that data to the variable." ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 20, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[datetime.datetime(2014, 10, 1, 0, 0), datetime.datetime(2014, 10, 2, 0, 0), datetime.datetime(2014, 10, 3, 0, 0), datetime.datetime(2014, 10, 4, 0, 0)]\n", "(array([ 1882440., 1882464., 1882488., 1882512.]), u'hours since 1800-01-01')\n", "[datetime.datetime(2014, 10, 1, 0, 0) datetime.datetime(2014, 10, 2, 0, 0)\n", " datetime.datetime(2014, 10, 3, 0, 0) datetime.datetime(2014, 10, 4, 0, 0)]\n" ] } ], "source": [ "from datetime import datetime\n", "from netCDF4 import date2num,num2date\n", "# 1st 4 days of October.\n", "dates = [datetime(2014,10,1,0),datetime(2014,10,2,0),datetime(2014,10,3,0),datetime(2014,10,4,0)]\n", "print(dates)\n", "times = date2num(dates, time.units)\n", "print(times, time.units) # numeric values\n", "time[:] = times\n", "# read time data back, convert to datetime instances, check values.\n", "print(num2date(time[:],time.units))" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 20, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Closing a netCDF file\n", "\n", "It's **important** to close a netCDF file you opened for writing:\n", "\n", "- flushes buffers to make sure all data gets written\n", "- releases memory resources used by open netCDF files" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 22, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "root group (NETCDF4_CLASSIC data model, file format HDF5):\n", " title: My model data\n", " dimensions(sizes): lat(73), lon(144), time(4)\n", " variables(dimensions): float32 \u001b[4mlat\u001b[0m(lat), float32 \u001b[4mlon\u001b[0m(lon), float64 \u001b[4mtime\u001b[0m(time), float64 \u001b[4mtemp\u001b[0m(time,lat,lon)\n", " groups: \n", "\n", "Dataset is closed!\n" ] } ], "source": [ "# 
first print the Dataset object to see what we've got\n", "print(ncfile)\n", "# close the Dataset.\n", "ncfile.close(); print('Dataset is closed!')" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 22, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "# Advanced features\n", "\n", "So far we've only exercised features associated with the old netCDF version 3 data model. netCDF version 4 adds a lot of new functionality that comes with the more flexible HDF5 storage layer. \n", "\n", "Let's create a new file with `format='NETCDF4'` so we can try out some of these features." ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 25, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "root group (NETCDF4 data model, file format HDF5):\n", " dimensions(sizes): \n", " variables(dimensions): \n", " groups: \n", "\n" ] } ], "source": [ "ncfile = netCDF4.Dataset('data/new2.nc','w',format='NETCDF4')\n", "print(ncfile)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 25, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "## Creating Groups\n", "\n", "netCDF version 4 added support for organizing data in hierarchical groups.\n", "\n", "- analagous to directories in a filesystem. \n", "- Groups serve as containers for variables, dimensions and attributes, as well as other groups. \n", "- A `netCDF4.Dataset` creates a special group, called the 'root group', which is similar to the root directory in a unix filesystem. \n", "\n", "- groups are created using the [`createGroup`](http://unidata.github.io/netcdf4-python/netCDF4.Dataset-class.html#createGroup) method.\n", "- takes a single argument (a string, which is the name of the Group instance). This string is used as a key to access the group instances in the `groups` dictionary.\n", "\n", "Here we create two groups to hold data for two different model runs." ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 27, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('model_run1', \n", "group /model_run1:\n", " dimensions(sizes): \n", " variables(dimensions): \n", " groups: \n", ")\n", "('model_run2', \n", "group /model_run2:\n", " dimensions(sizes): \n", " variables(dimensions): \n", " groups: \n", ")\n" ] } ], "source": [ "grp1 = ncfile.createGroup('model_run1')\n", "grp2 = ncfile.createGroup('model_run2')\n", "for grp in ncfile.groups.items():\n", " print(grp)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 27, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "Create some dimensions in the root group." 
] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 29 }, "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "lat_dim = ncfile.createDimension('lat', 73) # latitude axis\n", "lon_dim = ncfile.createDimension('lon', 144) # longitude axis\n", "time_dim = ncfile.createDimension('time', None) # unlimited axis (can be appended to)." ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 30 }, "slideshow": { "slide_type": "fragment" } }, "source": [ "Now create a variable in grp1 and grp2. The library will search recursively upwards in the group tree to find the dimensions (which in this case are defined one level up).\n", "\n", "- These variables are create with **zlib compression**, another nifty feature of netCDF 4. \n", "- The data are automatically compressed when data is written to the file, and uncompressed when the data is read. \n", "- This can really save disk space, especially when used in conjunction with the [**least_significant_digit**](http://unidata.github.io/netcdf4-python/netCDF4.Dataset-class.html#createVariable) keyword argument, which causes the data to be quantized (truncated) before compression. This makes the compression lossy, but more efficient." ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 31, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "('model_run1', \n", "group /model_run1:\n", " dimensions(sizes): \n", " variables(dimensions): float64 \u001b[4mtemp\u001b[0m(time,lat,lon)\n", " groups: \n", ")\n", "('model_run2', \n", "group /model_run2:\n", " dimensions(sizes): \n", " variables(dimensions): float64 \u001b[4mtemp\u001b[0m(time,lat,lon)\n", " groups: \n", ")\n" ] } ], "source": [ "temp1 = grp1.createVariable('temp',np.float64,('time','lat','lon'),zlib=True)\n", "temp2 = grp2.createVariable('temp',np.float64,('time','lat','lon'),zlib=True)\n", "for grp in ncfile.groups.items(): # shows that each group now contains 1 variable\n", " print(grp)" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 31, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "##Creating a variable with a compound data type\n", "\n", "- Compound data types map directly to numpy structured (a.k.a 'record' arrays). \n", "- Structured arrays are akin to C structs, or derived types in Fortran. \n", "- They allow for the construction of table-like structures composed of combinations of other data types, including other compound types. \n", "- Might be useful for representing multiple parameter values at each point on a grid, or at each time and space location for scattered (point) data. \n", "\n", "Here we create a variable with a compound data type to represent complex data (there is no native complex data type in netCDF). \n", "\n", "- The compound data type is created with the [`createCompoundType`](http://unidata.github.io/netcdf4-python/netCDF4.Dataset-class.html#createCompoundType) method." 
] }, { "cell_type": "code", "execution_count": 40, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 33, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "compound cmplx_var(time, lat, lon)\n", "compound data type: [('real', '\n", "vlen phony_vlen_var(time, lat, lon)\n", "vlen data type: int64\n", "path = /model_run2\n", "unlimited dimensions: time\n", "current shape = (1, 73, 144)\n", "\n", "('data =\\n', array([[[array([0, 4, 0, 9, 2, 2, 2, 4, 2]), array([7, 5, 4, 4, 9, 8, 0]),\n", " array([3, 6, 6, 8, 2, 7]), ..., array([5, 0, 0, 8, 8, 1, 5, 3]),\n", " array([4, 2, 7]), array([0])],\n", " [array([5, 6, 6, 6, 1, 0, 7]), array([7]),\n", " array([7, 5, 8, 9, 6, 9, 3]), ..., array([0, 6, 5, 4]),\n", " array([7, 1, 9, 7, 7, 2]), array([1, 4, 0])],\n", " [array([4, 3, 1]), array([6, 3, 9, 7, 8]), array([8]), ...,\n", " array([6, 5, 8, 0]), array([0]), array([0, 9, 6, 2, 4])],\n", " ..., \n", " [array([8, 4, 4]), array([4, 1, 6]), array([1, 4, 2, 3, 9]), ...,\n", " array([9, 1]), array([7, 2, 5, 1, 5, 8, 2]),\n", " array([2, 9, 9, 1, 4, 6, 3, 5, 2])],\n", " [array([4, 7, 9, 8, 2, 3, 6, 6]),\n", " array([1, 4, 1, 6, 1, 1, 2, 3, 9]),\n", " array([9, 5, 6, 2, 4, 3, 8, 2, 9]), ..., array([9, 5, 7]),\n", " array([3, 9]), array([4, 2, 6, 9])],\n", " [array([8, 9, 9, 2, 2, 8, 8, 5]), array([3]),\n", " array([8, 8, 0, 2, 9, 2, 3, 0, 9]), ..., array([7]),\n", " array([5, 1, 0, 6, 8, 6]), array([8, 6, 3, 6, 9, 8, 4, 2, 5])]]], dtype=object))\n" ] } ], "source": [ "vlen_data = np.empty((nlats,nlons),object)\n", "for i in range(nlons):\n", " for j in range(nlats):\n", " size = np.random.randint(1,10,size=1) # random length of sequence\n", " vlen_data[j,i] = np.random.randint(0,10,size=size)# generate random sequence\n", "vlvar[0] = vlen_data # append along unlimited dimension (time)\n", "print(vlvar)\n", "print('data =\\n',vlvar[:])" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 39, "slide_type": "subslide" }, "slideshow": { "slide_type": "slide" } }, "source": [ "Close the Dataset and examine the contents with ncdump." 
] }, { "cell_type": "code", "execution_count": 44, "metadata": { "collapsed": false, "internals": { "frag_helper": "fragment_end", "frag_number": 41, "slide_helper": "subslide_end" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "netcdf new2 {\r\n", "types:\r\n", " compound complex128 {\r\n", " double real ;\r\n", " double imag ;\r\n", " }; // complex128\r\n", " int64(*) phony_vlen ;\r\n", "dimensions:\r\n", "\tlat = 73 ;\r\n", "\tlon = 144 ;\r\n", "\ttime = UNLIMITED ; // (1 currently)\r\n", "\r\n", "group: model_run1 {\r\n", " variables:\r\n", " \tdouble temp(time, lat, lon) ;\r\n", " \tcomplex128 cmplx_var(time, lat, lon) ;\r\n", " } // group model_run1\r\n", "\r\n", "group: model_run2 {\r\n", " variables:\r\n", " \tdouble temp(time, lat, lon) ;\r\n", " \tphony_vlen phony_vlen_var(time, lat, lon) ;\r\n", " } // group model_run2\r\n", "}\r\n" ] } ], "source": [ "ncfile.close()\n", "!ncdump -h data/new2.nc" ] }, { "cell_type": "markdown", "metadata": { "internals": { "frag_helper": "fragment_end", "frag_number": 41, "slide_helper": "subslide_end", "slide_type": "subslide" }, "slide_helper": "slide_end", "slideshow": { "slide_type": "slide" } }, "source": [ "##Other interesting and useful projects using netcdf4-python\n", "\n", "- [Xray](http://xray.readthedocs.org/en/stable/): N-dimensional variant of the core [pandas](http://pandas.pydata.org) data structure that can operate on netcdf variables.\n", "- [Iris](http://scitools.org.uk/iris/): a data model to create a data abstraction layer which isolates analysis and visualisation code from data format specifics. Uses netcdf4-python to access netcdf data (can also handle GRIB).\n", "- [Biggus](https://github.com/SciTools/biggus): Virtual large arrays (from netcdf variables) with lazy evaluation.\n", "- [cf-python](http://cfpython.bitbucket.org/): Implements the [CF](http://cfconventions.org) data model for the reading, writing and processing of data and metadata. " ] } ], "metadata": { "kernelspec": { "display_name": "Python 2", "language": "python", "name": "python2" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 2 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython2", "version": "2.7.9" } }, "nbformat": 4, "nbformat_minor": 0 } netcdf4-python-1.3.1rel/include/000077500000000000000000000000001317565303700165225ustar00rootroot00000000000000netcdf4-python-1.3.1rel/include/mpi-compat.h000066400000000000000000000004401317565303700207370ustar00rootroot00000000000000/* Author: Lisandro Dalcin */ /* Contact: dalcinl@gmail.com */ #ifndef MPI_COMPAT_H #define MPI_COMPAT_H #include #if (MPI_VERSION < 3) && !defined(PyMPI_HAVE_MPI_Message) typedef void *PyMPI_MPI_Message; #define MPI_Message PyMPI_MPI_Message #endif #endif/*MPI_COMPAT_H*/ netcdf4-python-1.3.1rel/include/netCDF4.pxi000066400000000000000000001073231317565303700204410ustar00rootroot00000000000000# size_t, ptrdiff_t are defined in stdlib.h cdef extern from "stdlib.h": ctypedef long size_t ctypedef long ptrdiff_t # hdf5 version info. cdef extern from "H5public.h": ctypedef int herr_t int H5get_libversion( unsigned int *majnum, unsigned int *minnum, unsigned int *relnum ) cdef extern from *: ctypedef char* const_char_ptr "const char*" # netcdf functions. 
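# The block below re-declares, for Cython, the subset of constants and C
# prototypes from netcdf.h that the extension module calls directly; symbols
# not listed here are simply invisible to the Cython code.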
cdef extern from "netcdf.h": ctypedef int nclong ctypedef int nc_type ctypedef struct nc_vlen_t: size_t len # Length of VL data (in base type units) void *p # Pointer to VL data float NC_FILL_FLOAT long NC_FILL_INT double NC_FILL_DOUBLE char NC_FILL_CHAR long long NC_FILL_INT64 unsigned long NC_FILL_UINT unsigned long long NC_FILL_UINT64 cdef enum: NC_NAT # NAT = 'Not A Type' (c.f. NaN) NC_BYTE # signed 1 byte integer NC_CHAR # ISO/ASCII character NC_SHORT # signed 2 byte integer NC_INT # signed 4 byte integer NC_LONG # deprecated, but required for backward compatibility. NC_FLOAT # single precision floating point number NC_DOUBLE # double precision floating point number NC_UBYTE # unsigned 1 byte int NC_USHORT # unsigned 2-byte int NC_UINT # unsigned 4-byte int NC_INT64 # signed 8-byte int NC_UINT64 # unsigned 8-byte int NC_STRING # string NC_VLEN # used internally for vlen types NC_OPAQUE # used internally for opaque types NC_COMPOUND # used internally for compound types NC_ENUM # used internally for enum types. # Use these 'mode' flags for nc_open. NC_NOWRITE # default is read only NC_WRITE # read & write # Use these 'mode' flags for nc_create. NC_CLOBBER NC_NOCLOBBER # Don't destroy existing file on create NC_64BIT_OFFSET # Use large (64-bit) file offsets NC_NETCDF4 # Use netCDF-4/HDF5 format NC_CLASSIC_MODEL # Enforce strict netcdf-3 rules. # Use these 'mode' flags for both nc_create and nc_open. NC_SHARE # Share updates, limit cacheing NC_MPIIO NC_MPIPOSIX # The following flag currently is ignored, but use in # nc_open() or nc_create() may someday support use of advisory # locking to prevent multiple writers from clobbering a file NC_LOCK # Use locking if available # Default fill values, used unless _FillValue attribute is set. # These values are stuffed into newly allocated space as appropriate. # The hope is that one might use these to notice that a particular datum # has not been set. NC_FILL_BYTE #NC_FILL_CHAR NC_FILL_SHORT #NC_FILL_INT #NC_FILL_FLOAT #NC_FILL_DOUBLE NC_FILL_UBYTE NC_FILL_USHORT #NC_FILL_UINT #NC_FILL_INT64 #NC_FILL_UINT64 # These represent the max and min values that can be stored in a # netCDF file for their associated types. Recall that a C compiler # may define int to be any length it wants, but a NC_INT is *always* # a 4 byte signed int. On a platform with has 64 bit ints, there will # be many ints which are outside the range supported by NC_INT. But # since NC_INT is an external format, it has to mean the same thing # everywhere. NC_MAX_BYTE NC_MIN_BYTE NC_MAX_CHAR NC_MAX_SHORT NC_MIN_SHORT NC_MAX_INT NC_MIN_INT NC_MAX_FLOAT NC_MIN_FLOAT NC_MAX_DOUBLE8 NC_MIN_DOUBLE NC_MAX_UBYTE NC_MAX_USHORT NC_MAX_UINT NC_MAX_INT64 NC_MIN_INT64 NC_MAX_UINT64 X_INT64_MAX X_INT64_MIN X_UINT64_MAX # The above values are defaults. # If you wish a variable to use a different value than the above # defaults, create an attribute with the same type as the variable # and the following reserved name. The value you give the attribute # will be used as the fill value for that variable. _FillValue NC_FILL NC_NOFILL # Starting with version 3.6, there are different format netCDF # files. 4.0 instroduces the third one. These defines are only for # the nc_set_default_format function. NC_FORMAT_CLASSIC NC_FORMAT_64BIT NC_FORMAT_64BIT_OFFSET NC_FORMAT_64BIT_DATA NC_FORMAT_NETCDF4 NC_FORMAT_NETCDF4_CLASSIC NC_FORMAT_NC3 NC_FORMAT_NC_HDF4 NC_FORMAT_NC_HDF5 NC_FORMAT_DAP2 NC_FORMAT_DAP4 NC_FORMAT_PNETCDF NC_FORMAT_UNDEFINED # Let nc__create() or nc__open() figure out # as suitable chunk size. 
NC_SIZEHINT_DEFAULT # In nc__enddef(), align to the chunk size. NC_ALIGN_CHUNK # 'size' argument to ncdimdef for an unlimited dimension NC_UNLIMITED # attribute id to put/get a global attribute NC_GLOBAL # These maximums are enforced by the interface, to facilitate writing # applications and utilities. However, nothing is statically allocated to # these sizes internally. NC_MAX_DIMS NC_MAX_ATTRS NC_MAX_VARS NC_MAX_NAME NC_MAX_VAR_DIMS # Algorithms for netcdf-4 chunking. NC_CHUNK_SEQ NC_CHUNK_SUB NC_CHUNK_SIZES NC_CHUNKED NC_CONTIGUOUS # The netcdf version 3 functions all return integer error status. # These are the possible values, in addition to certain # values from the system errno.h. NC_ISSYSERR NC_NOERR NC2_ERR NC_EBADID NC_ENFILE NC_EEXIST NC_EINVAL NC_EPERM NC_ENOTINDEFINE NC_EINDEFINE NC_EINVALCOORDS NC_EMAXDIMS NC_ENAMEINUSE NC_ENOTATT NC_EMAXATTS NC_EBADTYPE NC_EBADDIM NC_EUNLIMPOS NC_EMAXVARS NC_ENOTVAR NC_EGLOBAL NC_ENOTNC NC_ESTS NC_EMAXNAME NC_EUNLIMIT NC_ENORECVARS NC_ECHAR NC_EEDGE NC_ESTRIDE NC_EBADNAME # N.B. following must match value in ncx.h NC_ERANGE # Math result not representable NC_ENOMEM # Memory allocation (malloc) failure NC_EVARSIZE # One or more variable sizes violate format constraints NC_EDIMSIZE # Invalid dimension size NC_ETRUNC # NetCDFFile likely truncated or possibly corrupted # The following was added in support of netcdf-4. Make all netcdf-4 # error codes < -100 so that errors can be added to netcdf-3 if # needed. NC4_FIRST_ERROR NC_EHDFERR NC_ECANTREAD NC_ECANTWRITE NC_ECANTCREATE NC_EFILEMETA NC_EDIMMETA NC_EATTMETA NC_EVARMETA NC_ENOCOMPOUND NC_EATTEXISTS NC_ENOTNC4 NC_ESTRICTNC3 NC_ENOTNC3 NC_ENOPAR NC_EPARINIT NC_EBADGRPID NC_EBADTYPID NC_ETYPDEFINED NC_EBADFIELD NC_EBADCLASS NC4_LAST_ERROR NC_ENDIAN_NATIVE NC_ENDIAN_LITTLE NC_ENDIAN_BIG NC_SZIP_EC_OPTION_MASK # entropy encoding NC_SZIP_NN_OPTION_MASK # nearest neighbor encoding const_char_ptr *nc_inq_libvers() nogil const_char_ptr *nc_strerror(int ncerr) int nc_create(char *path, int cmode, int *ncidp) int nc__create(char *path, int cmode, size_t initialsz, size_t *chunksizehintp, int *ncidp) int nc_open(char *path, int mode, int *ncidp) int nc__open(char *path, int mode, size_t *chunksizehintp, int *ncidp) int nc_inq_path(int ncid, size_t *pathlen, char *path) nogil int nc_inq_format_extended(int ncid, int *formatp, int* modep) nogil int nc_inq_ncid(int ncid, char *name, int *grp_ncid) nogil int nc_inq_grps(int ncid, int *numgrps, int *ncids) nogil int nc_inq_grpname(int ncid, char *name) nogil int nc_inq_grp_parent(int ncid, int *parent_ncid) nogil int nc_inq_varids(int ncid, int *nvars, int *varids) nogil int nc_inq_dimids(int ncid, int *ndims, int *dimids, int include_parents) nogil int nc_def_grp(int parent_ncid, char *name, int *new_ncid) int nc_def_compound(int ncid, size_t size, char *name, nc_type *typeidp) int nc_insert_compound(int ncid, nc_type xtype, char *name, size_t offset, nc_type field_typeid) int nc_insert_array_compound(int ncid, nc_type xtype, char *name, size_t offset, nc_type field_typeid, int ndims, int *dim_sizes) int nc_inq_type(int ncid, nc_type xtype, char *name, size_t *size) nogil int nc_inq_compound(int ncid, nc_type xtype, char *name, size_t *size, size_t *nfieldsp) nogil int nc_inq_compound_name(int ncid, nc_type xtype, char *name) nogil int nc_inq_compound_size(int ncid, nc_type xtype, size_t *size) nogil int nc_inq_compound_nfields(int ncid, nc_type xtype, size_t *nfieldsp) nogil int nc_inq_compound_field(int ncid, nc_type xtype, int fieldid, char *name, size_t 
*offsetp, nc_type *field_typeidp, int *ndimsp, int *dim_sizesp) nogil int nc_inq_compound_fieldname(int ncid, nc_type xtype, int fieldid, char *name) nogil int nc_inq_compound_fieldindex(int ncid, nc_type xtype, char *name, int *fieldidp) nogil int nc_inq_compound_fieldoffset(int ncid, nc_type xtype, int fieldid, size_t *offsetp) nogil int nc_inq_compound_fieldtype(int ncid, nc_type xtype, int fieldid, nc_type *field_typeidp) nogil int nc_inq_compound_fieldndims(int ncid, nc_type xtype, int fieldid, int *ndimsp) nogil int nc_inq_compound_fielddim_sizes(int ncid, nc_type xtype, int fieldid, int *dim_sizes) nogil int nc_def_vlen(int ncid, char *name, nc_type base_typeid, nc_type *xtypep) int nc_inq_vlen(int ncid, nc_type xtype, char *name, size_t *datum_sizep, nc_type *base_nc_typep) nogil int nc_inq_user_type(int ncid, nc_type xtype, char *name, size_t *size, nc_type *base_nc_typep, size_t *nfieldsp, int *classp) nogil int nc_inq_typeids(int ncid, int *ntypes, int *typeids) nogil int nc_put_att(int ncid, int varid, char *name, nc_type xtype, size_t len, void *op) int nc_get_att(int ncid, int varid, char *name, void *ip) nogil int nc_get_att_string(int ncid, int varid, char *name, char **ip) nogil int nc_put_att_string(int ncid, int varid, char *name, size_t len, char **op) nogil int nc_def_opaque(int ncid, size_t size, char *name, nc_type *xtypep) int nc_inq_opaque(int ncid, nc_type xtype, char *name, size_t *sizep) int nc_put_att_opaque(int ncid, int varid, char *name, size_t len, void *op) int nc_get_att_opaque(int ncid, int varid, char *name, void *ip) int nc_put_cmp_att_opaque(int ncid, nc_type xtype, int fieldid, char *name, size_t len, void *op) int nc_get_cmp_att_opaque(int ncid, nc_type xtype, int fieldid, char *name, void *ip) int nc_put_var1(int ncid, int varid, size_t *indexp, void *op) int nc_get_var1(int ncid, int varid, size_t *indexp, void *ip) int nc_put_vara(int ncid, int varid, size_t *startp, size_t *countp, void *op) int nc_get_vara(int ncid, int varid, size_t *startp, size_t *countp, void *ip) nogil int nc_put_vars(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, void *op) int nc_get_vars(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, void *ip) nogil int nc_put_varm(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, void *op) int nc_get_varm(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, void *ip) int nc_put_var(int ncid, int varid, void *op) int nc_get_var(int ncid, int varid, void *ip) int nc_def_var_deflate(int ncid, int varid, int shuffle, int deflate, int deflate_level) int nc_def_var_fletcher32(int ncid, int varid, int fletcher32) int nc_inq_var_fletcher32(int ncid, int varid, int *fletcher32p) nogil int nc_def_var_chunking(int ncid, int varid, int contiguous, size_t *chunksizesp) int nc_def_var_fill(int ncid, int varid, int no_fill, void *fill_value) int nc_def_var_endian(int ncid, int varid, int endian) int nc_inq_var_chunking(int ncid, int varid, int *contiguousp, size_t *chunksizesp) nogil int nc_inq_var_deflate(int ncid, int varid, int *shufflep, int *deflatep, int *deflate_levelp) nogil int nc_inq_var_fill(int ncid, int varid, int *no_fill, void *fill_value) nogil int nc_inq_var_endian(int ncid, int varid, int *endianp) nogil int nc_set_fill(int ncid, int fillmode, int *old_modep) int nc_set_default_format(int format, int *old_formatp) int nc_redef(int ncid) int nc__enddef(int ncid, size_t h_minfree, size_t v_align, size_t 
v_minfree, size_t r_align) int nc_enddef(int ncid) int nc_sync(int ncid) int nc_abort(int ncid) int nc_close(int ncid) int nc_inq(int ncid, int *ndimsp, int *nvarsp, int *nattsp, int *unlimdimidp) nogil int nc_inq_ndims(int ncid, int *ndimsp) nogil int nc_inq_nvars(int ncid, int *nvarsp) nogil int nc_inq_natts(int ncid, int *nattsp) nogil int nc_inq_unlimdim(int ncid, int *unlimdimidp) nogil int nc_inq_unlimdims(int ncid, int *nunlimdimsp, int *unlimdimidsp) nogil int nc_inq_format(int ncid, int *formatp) nogil int nc_def_dim(int ncid, char *name, size_t len, int *idp) int nc_inq_dimid(int ncid, char *name, int *idp) nogil int nc_inq_dim(int ncid, int dimid, char *name, size_t *lenp) nogil int nc_inq_dimname(int ncid, int dimid, char *name) nogil int nc_inq_dimlen(int ncid, int dimid, size_t *lenp) nogil int nc_rename_dim(int ncid, int dimid, char *name) int nc_inq_att(int ncid, int varid, char *name, nc_type *xtypep, size_t *lenp) nogil int nc_inq_attid(int ncid, int varid, char *name, int *idp) nogil int nc_inq_atttype(int ncid, int varid, char *name, nc_type *xtypep) nogil int nc_inq_attlen(int ncid, int varid, char *name, size_t *lenp) nogil int nc_inq_attname(int ncid, int varid, int attnum, char *name) nogil int nc_copy_att(int ncid_in, int varid_in, char *name, int ncid_out, int varid_out) int nc_rename_att(int ncid, int varid, char *name, char *newname) int nc_del_att(int ncid, int varid, char *name) int nc_put_att_text(int ncid, int varid, char *name, size_t len, char *op) int nc_get_att_text(int ncid, int varid, char *name, char *ip) nogil int nc_put_att_uchar(int ncid, int varid, char *name, nc_type xtype, size_t len, unsigned char *op) int nc_get_att_uchar(int ncid, int varid, char *name, unsigned char *ip) int nc_put_att_schar(int ncid, int varid, char *name, nc_type xtype, size_t len, signed char *op) int nc_get_att_schar(int ncid, int varid, char *name, signed char *ip) int nc_put_att_short(int ncid, int varid, char *name, nc_type xtype, size_t len, short *op) int nc_get_att_short(int ncid, int varid, char *name, short *ip) int nc_put_att_int(int ncid, int varid, char *name, nc_type xtype, size_t len, int *op) int nc_get_att_int(int ncid, int varid, char *name, int *ip) int nc_put_att_long(int ncid, int varid, char *name, nc_type xtype, size_t len, long *op) int nc_get_att_long(int ncid, int varid, char *name, long *ip) int nc_put_att_float(int ncid, int varid, char *name, nc_type xtype, size_t len, float *op) int nc_get_att_float(int ncid, int varid, char *name, float *ip) int nc_put_att_double(int ncid, int varid, char *name, nc_type xtype, size_t len, double *op) int nc_get_att_double(int ncid, int varid, char *name, double *ip) int nc_put_att_ushort(int ncid, int varid, char *name, nc_type xtype, size_t len, unsigned short *op) int nc_get_att_ushort(int ncid, int varid, char *name, unsigned short *ip) int nc_put_att_uint(int ncid, int varid, char *name, nc_type xtype, size_t len, unsigned int *op) int nc_get_att_uint(int ncid, int varid, char *name, unsigned int *ip) int nc_put_att_longlong(int ncid, int varid, char *name, nc_type xtype, size_t len, long long *op) int nc_get_att_longlong(int ncid, int varid, char *name, long long *ip) int nc_put_att_ulonglong(int ncid, int varid, char *name, nc_type xtype, size_t len, unsigned long long *op) int nc_get_att_ulonglong(int ncid, int varid, char *name, unsigned long long *ip) int nc_def_var(int ncid, char *name, nc_type xtype, int ndims, int *dimidsp, int *varidp) int nc_inq_var(int ncid, int varid, char *name, nc_type 
*xtypep, int *ndimsp, int *dimidsp, int *nattsp) nogil int nc_inq_varid(int ncid, char *name, int *varidp) nogil int nc_inq_varname(int ncid, int varid, char *name) nogil int nc_inq_vartype(int ncid, int varid, nc_type *xtypep) nogil int nc_inq_varndims(int ncid, int varid, int *ndimsp) nogil int nc_inq_vardimid(int ncid, int varid, int *dimidsp) nogil int nc_inq_varnatts(int ncid, int varid, int *nattsp) nogil int nc_rename_var(int ncid, int varid, char *name) int nc_copy_var(int ncid_in, int varid, int ncid_out) int nc_put_var1_text(int ncid, int varid, size_t *indexp, char *op) int nc_get_var1_text(int ncid, int varid, size_t *indexp, char *ip) int nc_put_var1_uchar(int ncid, int varid, size_t *indexp, unsigned char *op) int nc_get_var1_uchar(int ncid, int varid, size_t *indexp, unsigned char *ip) int nc_put_var1_schar(int ncid, int varid, size_t *indexp, signed char *op) int nc_get_var1_schar(int ncid, int varid, size_t *indexp, signed char *ip) int nc_put_var1_short(int ncid, int varid, size_t *indexp, short *op) int nc_get_var1_short(int ncid, int varid, size_t *indexp, short *ip) int nc_put_var1_int(int ncid, int varid, size_t *indexp, int *op) int nc_get_var1_int(int ncid, int varid, size_t *indexp, int *ip) int nc_put_var1_long(int ncid, int varid, size_t *indexp, long *op) int nc_get_var1_long(int ncid, int varid, size_t *indexp, long *ip) int nc_put_var1_float(int ncid, int varid, size_t *indexp, float *op) int nc_get_var1_float(int ncid, int varid, size_t *indexp, float *ip) int nc_put_var1_double(int ncid, int varid, size_t *indexp, double *op) int nc_get_var1_double(int ncid, int varid, size_t *indexp, double *ip) int nc_put_var1_ubyte(int ncid, int varid, size_t *indexp, unsigned char *op) int nc_get_var1_ubyte(int ncid, int varid, size_t *indexp, unsigned char *ip) int nc_put_var1_ushort(int ncid, int varid, size_t *indexp, unsigned short *op) int nc_get_var1_ushort(int ncid, int varid, size_t *indexp, unsigned short *ip) int nc_put_var1_uint(int ncid, int varid, size_t *indexp, unsigned int *op) int nc_get_var1_uint(int ncid, int varid, size_t *indexp, unsigned int *ip) int nc_put_var1_longlong(int ncid, int varid, size_t *indexp, long long *op) int nc_get_var1_longlong(int ncid, int varid, size_t *indexp, long long *ip) int nc_put_var1_ulonglong(int ncid, int varid, size_t *indexp, unsigned long long *op) int nc_get_var1_ulonglong(int ncid, int varid, size_t *indexp, unsigned long long *ip) int nc_put_vara_text(int ncid, int varid, size_t *startp, size_t *countp, char *op) int nc_get_vara_text(int ncid, int varid, size_t *startp, size_t *countp, char *ip) int nc_put_vara_uchar(int ncid, int varid, size_t *startp, size_t *countp, unsigned char *op) int nc_get_vara_uchar(int ncid, int varid, size_t *startp, size_t *countp, unsigned char *ip) int nc_put_vara_schar(int ncid, int varid, size_t *startp, size_t *countp, signed char *op) int nc_get_vara_schar(int ncid, int varid, size_t *startp, size_t *countp, signed char *ip) int nc_put_vara_short(int ncid, int varid, size_t *startp, size_t *countp, short *op) int nc_get_vara_short(int ncid, int varid, size_t *startp, size_t *countp, short *ip) int nc_put_vara_int(int ncid, int varid, size_t *startp, size_t *countp, int *op) int nc_get_vara_int(int ncid, int varid, size_t *startp, size_t *countp, int *ip) int nc_put_vara_long(int ncid, int varid, size_t *startp, size_t *countp, long *op) int nc_get_vara_long(int ncid, int varid, size_t *startp, size_t *countp, long *ip) int nc_put_vara_float(int ncid, int varid, size_t *startp, 
size_t *countp, float *op) int nc_get_vara_float(int ncid, int varid, size_t *startp, size_t *countp, float *ip) int nc_put_vara_double(int ncid, int varid, size_t *startp, size_t *countp, double *op) int nc_get_vara_double(int ncid, int varid, size_t *startp, size_t *countp, double *ip) int nc_put_vara_ubyte(int ncid, int varid, size_t *startp, size_t *countp, unsigned char *op) int nc_get_vara_ubyte(int ncid, int varid, size_t *startp, size_t *countp, unsigned char *ip) int nc_put_vara_ushort(int ncid, int varid, size_t *startp, size_t *countp, unsigned short *op) int nc_get_vara_ushort(int ncid, int varid, size_t *startp, size_t *countp, unsigned short *ip) int nc_put_vara_uint(int ncid, int varid, size_t *startp, size_t *countp, unsigned int *op) int nc_get_vara_uint(int ncid, int varid, size_t *startp, size_t *countp, unsigned int *ip) int nc_put_vara_longlong(int ncid, int varid, size_t *startp, size_t *countp, long long *op) int nc_get_vara_longlong(int ncid, int varid, size_t *startp, size_t *countp, long long *ip) int nc_put_vara_ulonglong(int ncid, int varid, size_t *startp, size_t *countp, unsigned long long *op) int nc_get_vara_ulonglong(int ncid, int varid, size_t *startp, size_t *countp, unsigned long long *ip) int nc_put_vars_text(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, char *op) int nc_get_vars_text(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, char *ip) int nc_put_vars_uchar(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, unsigned char *op) int nc_get_vars_uchar(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, unsigned char *ip) int nc_put_vars_schar(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, signed char *op) int nc_get_vars_schar(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, signed char *ip) int nc_put_vars_short(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, short *op) int nc_get_vars_short(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, short *ip) int nc_put_vars_int(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, int *op) int nc_get_vars_int(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, int *ip) int nc_put_vars_long(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, long *op) int nc_get_vars_long(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, long *ip) int nc_put_vars_float(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, float *op) int nc_get_vars_float(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, float *ip) int nc_put_vars_double(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, double *op) int nc_get_vars_double(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, double *ip) int nc_put_vars_ubyte(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, unsigned char *op) int nc_get_vars_ubyte(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, unsigned char *ip) int nc_put_vars_ushort(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, unsigned short *op) int nc_get_vars_ushort(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, unsigned short *ip) int nc_put_vars_uint(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, unsigned int *op) int nc_get_vars_uint(int 
ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, unsigned int *ip) int nc_put_vars_longlong(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, long long *op) int nc_get_vars_longlong(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, long long *ip) int nc_put_vars_ulonglong(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, unsigned long long *op) int nc_get_vars_ulonglong(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, unsigned long long *ip) int nc_put_varm_text(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, char *op) int nc_get_varm_text(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, char *ip) int nc_put_varm_uchar(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, unsigned char *op) int nc_get_varm_uchar(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, unsigned char *ip) int nc_put_varm_schar(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, signed char *op) int nc_get_varm_schar(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, signed char *ip) int nc_put_varm_short(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, short *op) int nc_get_varm_short(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, short *ip) int nc_put_varm_int(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, int *op) int nc_get_varm_int(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, int *ip) int nc_put_varm_long(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, long *op) int nc_get_varm_long(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, long *ip) int nc_put_varm_float(int ncid, int varid,size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, float *op) int nc_get_varm_float(int ncid, int varid,size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, float *ip) int nc_put_varm_double(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t *imapp, double *op) int nc_get_varm_double(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t * imapp, double *ip) int nc_put_varm_ubyte(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t * imapp, unsigned char *op) int nc_get_varm_ubyte(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t * imapp, unsigned char *ip) int nc_put_varm_ushort(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t * imapp, unsigned short *op) int nc_get_varm_ushort(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t * imapp, unsigned short *ip) int nc_put_varm_uint(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t * imapp, unsigned int *op) int nc_get_varm_uint(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t * imapp, unsigned int *ip) int nc_put_varm_longlong(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t * imapp, long long *op) int nc_get_varm_longlong(int ncid, int varid, size_t *startp, 
size_t *countp, ptrdiff_t *stridep, ptrdiff_t * imapp, long long *ip) int nc_put_varm_ulonglong(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t * imapp, unsigned long long *op) int nc_get_varm_ulonglong(int ncid, int varid, size_t *startp, size_t *countp, ptrdiff_t *stridep, ptrdiff_t * imapp, unsigned long long *ip) int nc_put_var_text(int ncid, int varid, char *op) int nc_get_var_text(int ncid, int varid, char *ip) int nc_put_var_uchar(int ncid, int varid, unsigned char *op) int nc_get_var_uchar(int ncid, int varid, unsigned char *ip) int nc_put_var_schar(int ncid, int varid, signed char *op) int nc_get_var_schar(int ncid, int varid, signed char *ip) int nc_put_var_short(int ncid, int varid, short *op) int nc_get_var_short(int ncid, int varid, short *ip) int nc_put_var_int(int ncid, int varid, int *op) int nc_get_var_int(int ncid, int varid, int *ip) int nc_put_var_long(int ncid, int varid, long *op) int nc_get_var_long(int ncid, int varid, long *ip) int nc_put_var_float(int ncid, int varid, float *op) int nc_get_var_float(int ncid, int varid, float *ip) int nc_put_var_double(int ncid, int varid, double *op) int nc_get_var_double(int ncid, int varid, double *ip) int nc_put_var_ubyte(int ncid, int varid, unsigned char *op) int nc_get_var_ubyte(int ncid, int varid, unsigned char *ip) int nc_put_var_ushort(int ncid, int varid, unsigned short *op) int nc_get_var_ushort(int ncid, int varid, unsigned short *ip) int nc_put_var_uint(int ncid, int varid, unsigned int *op) int nc_get_var_uint(int ncid, int varid, unsigned int *ip) int nc_put_var_longlong(int ncid, int varid, long long *op) int nc_get_var_longlong(int ncid, int varid, long long *ip) int nc_put_var_ulonglong(int ncid, int varid, unsigned long long *op) int nc_get_var_ulonglong(int ncid, int varid, unsigned long long *ip) # set logging verbosity level. void nc_set_log_level(int new_level) int nc_show_metadata(int ncid) int nc_free_vlen(nc_vlen_t *vl) int nc_free_vlens(size_t len, nc_vlen_t *vl) int nc_free_string(size_t len, char **data) int nc_set_chunk_cache(size_t size, size_t nelems, float preemption) int nc_get_chunk_cache(size_t *sizep, size_t *nelemsp, float *preemptionp) int nc_set_var_chunk_cache(int ncid, int varid, size_t size, size_t nelems, float preemption) int nc_get_var_chunk_cache(int ncid, int varid, size_t *sizep, size_t *nelemsp, float *preemptionp) nogil int nc_rename_grp(int grpid, char *name) int nc_def_enum(int ncid, nc_type base_typeid, char *name, nc_type *typeidp) int nc_insert_enum(int ncid, nc_type xtype, char *name, void *value) int nc_inq_enum(int ncid, nc_type xtype, char *name, nc_type *base_nc_typep,\ size_t *base_sizep, size_t *num_membersp) nogil int nc_inq_enum_member(int ncid, nc_type xtype, int idx, char *name, void *value) nogil int nc_inq_enum_ident(int ncid, nc_type xtype, long long value, char *identifier) nogil IF HAS_NC_OPEN_MEM: cdef extern from "netcdf_mem.h": int nc_open_mem(const char *path, int mode, size_t size, void* memory, int *ncidp) IF HAS_NC_PAR: cdef extern from "mpi-compat.h": pass cdef extern from "netcdf_par.h": ctypedef int MPI_Comm ctypedef int MPI_Info int nc_create_par(char *path, int cmode, MPI_Comm comm, MPI_Info info, int *ncidp); int nc_open_par(char *path, int mode, MPI_Comm comm, MPI_Info info, int *ncidp); int nc_var_par_access(int ncid, int varid, int par_access); cdef enum: NC_COLLECTIVE NC_INDEPENDENT cdef extern from "netcdf.h": cdef enum: NC_MPIIO NC_PNETCDF # taken from numpy.pxi in numpy 1.0rc2. 
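# These minimal ndarray declarations expose just the fields and macros
# (data pointer, shape/strides, contiguity/alignment checks) that are
# presumably needed to hand numpy array buffers directly to the netCDF C
# routines declared above.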
cdef extern from "numpy/arrayobject.h": ctypedef int npy_intp ctypedef extern class numpy.ndarray [object PyArrayObject]: cdef char *data cdef int nd cdef npy_intp *dimensions cdef npy_intp *strides cdef object base # cdef dtype descr cdef int flags npy_intp PyArray_SIZE(ndarray arr) npy_intp PyArray_ISCONTIGUOUS(ndarray arr) npy_intp PyArray_ISALIGNED(ndarray arr) void import_array() netcdf4-python-1.3.1rel/man/000077500000000000000000000000001317565303700156525ustar00rootroot00000000000000netcdf4-python-1.3.1rel/man/nc3tonc4.1000066400000000000000000000061641317565303700173760ustar00rootroot00000000000000.\" (C) Copyright 2015, Ross Gammon , .\" .TH NC3TONC4 1 "22 Mar 2015" .\" .SH NAME nc3tonc4 \- a program to convert netCDF 3 files to netCDF 4 format files .SH SYNOPSIS .B nc3tonc4 .RB [ \-h ] .RB [ \-o ] .RB [ \-\-vars=\fIvar1,var2,..\fR ] .RB [ \-\-zlib=\fI(0|1)\fR ] .RB [ \-\-complevel=\fI(1\-9)\fR ] .RB [ \-\-shuffle=\fI(0|1)\fR ] .RB [ \-\-fletcher32=\fI(0|1)\fR ] .RB [ \-\-unpackshort=\fI(0|1)\fR ] .RB [ \-\-quantize=\fIvar1=n1,var2=n2,..\fR ] .I netcdf3filename .I netcdf4filename .br .SH DESCRIPTION This manual page documents briefly the .B nc3tonc4 command. .PP \fBnc3tonc4\fP is a program that converts a netCDF 3 file into netCDF 4 format, optionally unpacking variables packed as short integers (with scale_factor and add_offset) to floats, and adding zlib compression (with the HDF5 shuffle filter and fletcher32 checksum). Data may also be quantized (truncated) to a specified precision to improve compression. .SH OPTIONS These programs follow the usual GNU command line syntax, with long options starting with two dashes (`-'). A summary of options is included below. .TP .B \-h Shows a summary of the available options. .TP .B \-o Overwrite destination file (default is to raise an error if output file already exists). .TP .B \-\-vars A comma separated list of variable names to copy (default is to copy all variables). .TP .B \-\-classic=(0|1) Use NETCDF4_CLASSIC format instead of NETCDF4 (default = 1). .TP .B \-\-zlib=(0|1) Activate (or disable) zlib compression (the default is to activate). .TP .B \-\-complevel=(1-9) Set the zlib compression level (6 is default). .TP .B \-\-shuffle=(0|1) Activate (or disable) the shuffle filter (it is active by default). .TP .B \-\-fletcher32=(0|1) Activate (or disable) the fletcher32 checksum (it is not active by default). .TP .B \-\-unpackshort=(0|1) Unpack short integer variables to float variables using scale_factor and add_offset netCDF variable attributes (it is active by default). .TP .B \-\-quantize=(comma separated list of "variable name=integer" pairs) Truncate the data in the specified variables to a given decimal precision. For example, 'speed=2, height=-2, temp=0' will cause the variable 'speed' to be truncated to a precision of 0.01, 'height' to a precision of 100 and 'temp' to 1. This can significantly improve compression. The default is not to quantize any of the variables. .TP .B \-\-quiet=(0|1) If set to 1, don't print any diagnostic information. .TP .B \-\-chunk=(integer) The number of records along unlimited dimension to write at once. The default is 10. It is ignored if there is no unlimited dimension. If chunk=0, it means write all the data at once. .TP .B \-\-istart=(integer) The number of the record to start at along unlimited dimension. The default is 0. This option is ignored if there is no unlimited dimension. .TP .B \-\-istop=(integer) The number of the record to stop at along unlimited dimension. The default is 1. 
This option is ignored if there is no unlimited dimension. .SH SEE ALSO .BR ncinfo (1), .BR nc4tonc3 (1). .br .SH AUTHOR This manual page was written by Ross Gammon based on the options displayed by nc3tonc4 \-h. netcdf4-python-1.3.1rel/man/nc4tonc3.1000066400000000000000000000025361317565303700173750ustar00rootroot00000000000000.\" (C) Copyright 2015, Ross Gammon , .\" .TH NC4TONC3 1 "22 Mar 2015" .\" .SH NAME nc4tonc3 \- a program to convert a classic netCDF 4 file to netCDF 3 format .SH SYNOPSIS .B nc4tonc3 .RB [ \-h ] .RB [ \-o ] .RB [ \-\-chunk ] .I netcdf4filename .I netcdf3filename .br .SH DESCRIPTION This manual page documents briefly the .B nc4tonc3 command. .PP \fBnc4tonc3\fP is a program that converts a netCDF 4 file (in NETCDF4_CLASSIC format) to netCDF 3 format. .SH OPTIONS These programs follow the usual GNU command line syntax, with long options starting with two dashes (`-'). A summary of options is included below. .TP .B \-h Shows a summary of the available options. .TP .B \-o Overwrite destination file (default is to raise an error if output file already exists). .TP .B \-\-quiet=(0|1) If set to 1, don't print any diagnostic information. .TP .B \-\-format Choose the netcdf3 format to use. NETCDF3_64BIT is used by default, or it can be set to NETCDF3_CLASSIC. .TP .B \-\-chunk=(integer) The number of records along unlimited dimension to write at once. The default is 10. It is ignored if there is no unlimited dimension. If chunk=0, this means write all the data at once. .SH SEE ALSO .BR ncinfo (1), .BR nc3tonc4 (1). .br .SH AUTHOR This manual page was written by Ross Gammon based on the options displayed by nc3tonc4 \-h. netcdf4-python-1.3.1rel/man/ncinfo.1000066400000000000000000000024401317565303700172100ustar00rootroot00000000000000.\" (C) Copyright 2015, Ross Gammon , .\" .TH NCINFO 1 "22 Mar 2015" .\" .SH NAME ncinfo \- a program to print summary information about a netCDF file .SH SYNOPSIS .B ncinfo .RB [ \-h ] .RB [ \-g|\-\-group=\fIgrp\fR ] .RB [ \-v|\-\-variable=\fIvar\fR ] .RB [ \-d|\-\-dimension=\fIdim\fR ] .I filename .br .SH DESCRIPTION This manual page documents briefly the .B ncinfo command. .PP \fBncinfo\fP is a program that prints summary information about a netCDF file .SH OPTIONS These programs follow the usual GNU command line syntax, with long options starting with two dashes (`-'). A summary of options is included below. .TP .B \-h Shows a summary of the available options. .TP .B \-g grp, \-\-group=grp Prints information for this group. The default group is the root group. Nested groups are specified using posix paths e.g. group1/group2/group3. .TP .B \-v , \-\-variable= Prints information for this variable. .TP .B \-d , \-\-dimension= Prints information for this dimension. .TP The filename of the netCDF file must be supplied as the last argument. .SH SEE ALSO .BR nc3tonc4 (1), .BR nc4tonc3 (1). .br .SH AUTHOR This manual page was written by Ross Gammon based on the options displayed by ncinfo \-h. netcdf4-python-1.3.1rel/netCDF4/000077500000000000000000000000001317565303700162665ustar00rootroot00000000000000netcdf4-python-1.3.1rel/netCDF4/__init__.py000066400000000000000000000012561317565303700204030ustar00rootroot00000000000000# init for netCDF4. package # Docstring comes from extension module _netCDF4. 
from ._netCDF4 import * # Need explicit imports for names beginning with underscores from ._netCDF4 import __doc__, __pdoc__ from ._netCDF4 import (__version__, __netcdf4libversion__, __hdf5libversion__, __has_rename_grp__, __has_nc_inq_path__, __has_nc_inq_format_extended__, __has_nc_open_mem__, __has_cdf5_format__,__has_nc_par__) __all__ =\ ['Dataset','Variable','Dimension','Group','MFDataset','MFTime','CompoundType','VLType','date2num','num2date','date2index','stringtochar','chartostring','stringtoarr','getlibversion','EnumType'] netcdf4-python-1.3.1rel/netCDF4/_netCDF4.pyx000066400000000000000000010330241317565303700203610ustar00rootroot00000000000000""" Version 1.3.1 ------------- - - - Introduction ============ netcdf4-python is a Python interface to the netCDF C library. [netCDF](http://www.unidata.ucar.edu/software/netcdf/) version 4 has many features not found in earlier versions of the library and is implemented on top of [HDF5](http://www.hdfgroup.org/HDF5). This module can read and write files in both the new netCDF 4 and the old netCDF 3 format, and can create files that are readable by HDF5 clients. The API modelled after [Scientific.IO.NetCDF](http://dirac.cnrs-orleans.fr/ScientificPython/), and should be familiar to users of that module. Most new features of netCDF 4 are implemented, such as multiple unlimited dimensions, groups and zlib data compression. All the new numeric data types (such as 64 bit and unsigned integer types) are implemented. Compound (struct), variable length (vlen) and enumerated (enum) data types are supported, but not the opaque data type. Mixtures of compound, vlen and enum data types (such as compound types containing enums, or vlens containing compound types) are not supported. Download ======== - Latest bleeding-edge code from the [github repository](http://github.com/Unidata/netcdf4-python). - Latest [releases](https://pypi.python.org/pypi/netCDF4) (source code and windows installers). Requires ======== - Python 2.7 or later (python 3 works too). - [numpy array module](http://numpy.scipy.org), version 1.9.0 or later. - [Cython](http://cython.org), version 0.21 or later. - [setuptools](https://pypi.python.org/pypi/setuptools), version 18.0 or later. - The HDF5 C library version 1.8.4-patch1 or higher (1.8.x recommended) from [](ftp://ftp.hdfgroup.org/HDF5/current/src). ***netCDF version 4.4.1 or higher is recommended if using HDF5 1.10.x - otherwise resulting files may be unreadable by clients using earlier versions of HDF5. For netCDF < 4.4.1, HDF5 version 1.8.x is recommended.*** Be sure to build with `--enable-hl --enable-shared`. - [Libcurl](http://curl.haxx.se/libcurl), if you want [OPeNDAP](http://opendap.org) support. - [HDF4](http://www.hdfgroup.org/products/hdf4), if you want to be able to read HDF4 "Scientific Dataset" (SD) files. - The netCDF-4 C library from the [github releases page](https://github.com/Unidata/netcdf-c/releases). Version 4.1.1 or higher is required (4.2 or higher recommended). Be sure to build with `--enable-netcdf-4 --enable-shared`, and set `CPPFLAGS="-I $HDF5_DIR/include"` and `LDFLAGS="-L $HDF5_DIR/lib"`, where `$HDF5_DIR` is the directory where HDF5 was installed. If you want [OPeNDAP](http://opendap.org) support, add `--enable-dap`. If you want HDF4 SD support, add `--enable-hdf4` and add the location of the HDF4 headers and library to `$CPPFLAGS` and `$LDFLAGS`. 
- for MPI parallel IO support, MPI-enabled versions of the HDF5 and netcdf libraries are required, as is the [mpi4py](http://mpi4py.scipy.org) python module. Install ======= - install the requisite python modules and C libraries (see above). It's easiest if all the C libs are built as shared libraries. - By default, the utility `nc-config`, installed with netcdf 4.1.2 or higher, will be run used to determine where all the dependencies live. - If `nc-config` is not in your default `$PATH` edit the `setup.cfg` file in a text editor and follow the instructions in the comments. In addition to specifying the path to `nc-config`, you can manually set the paths to all the libraries and their include files (in case `nc-config` does not do the right thing). - run `python setup.py build`, then `python setup.py install` (as root if necessary). - [`pip install`](https://pip.pypa.io/en/latest/reference/pip_install.html) can also be used, with library paths set with environment variables. To make this work, the `USE_SETUPCFG` environment variable must be used to tell setup.py not to use `setup.cfg`. For example, `USE_SETUPCFG=0 HDF5_INCDIR=/usr/include/hdf5/serial HDF5_LIBDIR=/usr/lib/x86_64-linux-gnu/hdf5/serial pip install` has been shown to work on an Ubuntu/Debian linux system. Similarly, environment variables (all capitalized) can be used to set the include and library paths for `hdf5`, `netCDF4`, `hdf4`, `szip`, `jpeg`, `curl` and `zlib`. If the libraries are installed in standard places (e.g. `/usr` or `/usr/local`), the environment variables do not need to be set. - run the tests in the 'test' directory by running `python run_all.py`. Tutorial ======== 1. [Creating/Opening/Closing a netCDF file.](#section1) 2. [Groups in a netCDF file.](#section2) 3. [Dimensions in a netCDF file.](#section3) 4. [Variables in a netCDF file.](#section4) 5. [Attributes in a netCDF file.](#section5) 6. [Writing data to and retrieving data from a netCDF variable.](#section6) 7. [Dealing with time coordinates.](#section7) 8. [Reading data from a multi-file netCDF dataset.](#section8) 9. [Efficient compression of netCDF variables.](#section9) 10. [Beyond homogeneous arrays of a fixed type - compound data types.](#section10) 11. [Variable-length (vlen) data types.](#section11) 12. [Enum data type.](#section12) 13. [Parallel IO.](#section13) ##
1) Creating/Opening/Closing a netCDF file. To create a netCDF file from python, you simply call the `netCDF4.Dataset` constructor. This is also the method used to open an existing netCDF file. If the file is open for write access (`mode='w', 'r+'` or `'a'`), you may write any type of data including new dimensions, groups, variables and attributes. netCDF files come in five flavors (`NETCDF3_CLASSIC, NETCDF3_64BIT_OFFSET, NETCDF3_64BIT_DATA, NETCDF4_CLASSIC`, and `NETCDF4`). `NETCDF3_CLASSIC` was the original netcdf binary format, and was limited to file sizes less than 2 Gb. `NETCDF3_64BIT_OFFSET` was introduced in version 3.6.0 of the library, and extended the original binary format to allow for file sizes greater than 2 Gb. `NETCDF3_64BIT_DATA` is a new format that requires version 4.4.0 of the C library - it extends the `NETCDF3_64BIT_OFFSET` binary format to allow for unsigned/64 bit integer data types and 64-bit dimension sizes. `NETCDF3_64BIT` is an alias for `NETCDF3_64BIT_OFFSET`. `NETCDF4_CLASSIC` files use the version 4 disk format (HDF5), but omits features not found in the version 3 API. They can be read by netCDF 3 clients only if they have been relinked against the netCDF 4 library. They can also be read by HDF5 clients. `NETCDF4` files use the version 4 disk format (HDF5) and use the new features of the version 4 API. The `netCDF4` module can read and write files in any of these formats. When creating a new file, the format may be specified using the `format` keyword in the `Dataset` constructor. The default format is `NETCDF4`. To see how a given file is formatted, you can examine the `data_model` attribute. Closing the netCDF file is accomplished via the `netCDF4.Dataset.close` method of the `netCDF4.Dataset` instance. Here's an example: :::python >>> from netCDF4 import Dataset >>> rootgrp = Dataset("test.nc", "w", format="NETCDF4") >>> print rootgrp.data_model NETCDF4 >>> rootgrp.close() Remote [OPeNDAP](http://opendap.org)-hosted datasets can be accessed for reading over http if a URL is provided to the `netCDF4.Dataset` constructor instead of a filename. However, this requires that the netCDF library be built with OPenDAP support, via the `--enable-dap` configure option (added in version 4.0.1). ##
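As a read-side counterpart (a minimal sketch continuing the session above; the OPeNDAP URL below is a hypothetical placeholder, not a real dataset), the file just created can be reopened read-only, since `mode='r'` is the default:

    :::python
    >>> # reopen the file created above; mode="r" (read-only) is the default.
    >>> nc = Dataset("test.nc")
    >>> print nc.data_model
    NETCDF4
    >>> nc.close()
    >>> # a remote OPeNDAP dataset would be opened the same way, e.g.
    >>> # nc = Dataset("http://example.com/thredds/dodsC/some/dataset")
    >>> # (requires a netCDF C library built with --enable-dap).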
2) Groups in a netCDF file. netCDF version 4 added support for organizing data in hierarchical groups, which are analogous to directories in a filesystem. Groups serve as containers for variables, dimensions and attributes, as well as other groups. A `netCDF4.Dataset` creates a special group, called the 'root group', which is similar to the root directory in a unix filesystem. To create `netCDF4.Group` instances, use the `netCDF4.Dataset.createGroup` method of a `netCDF4.Dataset` or `netCDF4.Group` instance. `netCDF4.Dataset.createGroup` takes a single argument, a python string containing the name of the new group. The new `netCDF4.Group` instances contained within the root group can be accessed by name using the `groups` dictionary attribute of the `netCDF4.Dataset` instance. Only `NETCDF4` formatted files support Groups, if you try to create a Group in a netCDF 3 file you will get an error message. :::python >>> rootgrp = Dataset("test.nc", "a") >>> fcstgrp = rootgrp.createGroup("forecasts") >>> analgrp = rootgrp.createGroup("analyses") >>> print rootgrp.groups OrderedDict([("forecasts", ), ("analyses", )]) Groups can exist within groups in a `netCDF4.Dataset`, just as directories exist within directories in a unix filesystem. Each `netCDF4.Group` instance has a `groups` attribute dictionary containing all of the group instances contained within that group. Each `netCDF4.Group` instance also has a `path` attribute that contains a simulated unix directory path to that group. To simplify the creation of nested groups, you can use a unix-like path as an argument to `netCDF4.Dataset.createGroup`. :::python >>> fcstgrp1 = rootgrp.createGroup("/forecasts/model1") >>> fcstgrp2 = rootgrp.createGroup("/forecasts/model2") If any of the intermediate elements of the path do not exist, they are created, just as with the unix command `'mkdir -p'`. If you try to create a group that already exists, no error will be raised, and the existing group will be returned. Here's an example that shows how to navigate all the groups in a `netCDF4.Dataset`. The function `walktree` is a Python generator that is used to walk the directory tree. Note that printing the `netCDF4.Dataset` or `netCDF4.Group` object yields summary information about it's contents. :::python >>> def walktree(top): >>> values = top.groups.values() >>> yield values >>> for value in top.groups.values(): >>> for children in walktree(value): >>> yield children >>> print rootgrp >>> for children in walktree(rootgrp): >>> for child in children: >>> print child root group (NETCDF4 file format): dimensions: variables: groups: forecasts, analyses group /forecasts: dimensions: variables: groups: model1, model2 group /analyses: dimensions: variables: groups: group /forecasts/model1: dimensions: variables: groups: group /forecasts/model2: dimensions: variables: groups: ##
3) Dimensions in a netCDF file.

netCDF defines the sizes of all variables in terms of dimensions, so before any variables can be created the dimensions they use must be created first. A special case, not often used in practice, is that of a scalar variable, which has no dimensions. A dimension is created using the `netCDF4.Dataset.createDimension` method of a `netCDF4.Dataset` or `netCDF4.Group` instance. A Python string is used to set the name of the dimension, and an integer value is used to set the size. To create an unlimited dimension (a dimension that can be appended to), the size value is set to `None` or 0. In this example, both the `time` and `level` dimensions are unlimited. Having more than one unlimited dimension is a new netCDF 4 feature; in netCDF 3 files there may be only one, and it must be the first (leftmost) dimension of the variable.

    :::python
    >>> level = rootgrp.createDimension("level", None)
    >>> time = rootgrp.createDimension("time", None)
    >>> lat = rootgrp.createDimension("lat", 73)
    >>> lon = rootgrp.createDimension("lon", 144)

All of the `netCDF4.Dimension` instances are stored in a python dictionary.

    :::python
    >>> print rootgrp.dimensions
    OrderedDict([("level", ), ("time", ), ("lat", ), ("lon", )])

Calling the python `len` function with a `netCDF4.Dimension` instance returns the current size of that dimension. The `netCDF4.Dimension.isunlimited` method of a `netCDF4.Dimension` instance can be used to determine if the dimension is unlimited, or appendable.

    :::python
    >>> print len(lon)
    144
    >>> print lon.isunlimited()
    False
    >>> print time.isunlimited()
    True

Printing the `netCDF4.Dimension` object provides useful summary info, including the name and length of the dimension, and whether it is unlimited.

    :::python
    >>> for dimobj in rootgrp.dimensions.values():
    >>>     print dimobj
    (unlimited): name = "level", size = 0
    (unlimited): name = "time", size = 0
    : name = "lat", size = 73
    : name = "lon", size = 144

`netCDF4.Dimension` names can be changed using the `netCDF4.Dataset.renameDimension` method of a `netCDF4.Dataset` or `netCDF4.Group` instance. ##
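For example (a minimal sketch continuing the session above; the name `longitude` is only for illustration), a dimension can be renamed and then renamed back so the rest of the tutorial is unchanged:

    :::python
    >>> rootgrp.renameDimension("lon", "longitude")
    >>> print "longitude" in rootgrp.dimensions, "lon" in rootgrp.dimensions
    True False
    >>> rootgrp.renameDimension("longitude", "lon") # undo the rename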
4) Variables in a netCDF file. netCDF variables behave much like python multidimensional array objects supplied by the [numpy module](http://numpy.scipy.org). However, unlike numpy arrays, netCDF4 variables can be appended to along one or more 'unlimited' dimensions. To create a netCDF variable, use the `netCDF4.Dataset.createVariable` method of a `netCDF4.Dataset` or `netCDF4.Group` instance. The `netCDF4.Dataset.createVariable` method has two mandatory arguments, the variable name (a Python string), and the variable datatype. The variable's dimensions are given by a tuple containing the dimension names (defined previously with `netCDF4.Dataset.createDimension`). To create a scalar variable, simply leave out the dimensions keyword. The variable primitive datatypes correspond to the dtype attribute of a numpy array. You can specify the datatype as a numpy dtype object, or anything that can be converted to a numpy dtype object. Valid datatype specifiers include: `'f4'` (32-bit floating point), `'f8'` (64-bit floating point), `'i4'` (32-bit signed integer), `'i2'` (16-bit signed integer), `'i8'` (64-bit signed integer), `'i1'` (8-bit signed integer), `'u1'` (8-bit unsigned integer), `'u2'` (16-bit unsigned integer), `'u4'` (32-bit unsigned integer), `'u8'` (64-bit unsigned integer), or `'S1'` (single-character string). The old Numeric single-character typecodes (`'f'`,`'d'`,`'h'`, `'s'`,`'b'`,`'B'`,`'c'`,`'i'`,`'l'`), corresponding to (`'f4'`,`'f8'`,`'i2'`,`'i2'`,`'i1'`,`'i1'`,`'S1'`,`'i4'`,`'i4'`), will also work. The unsigned integer types and the 64-bit integer type can only be used if the file format is `NETCDF4`. The dimensions themselves are usually also defined as variables, called coordinate variables. The `netCDF4.Dataset.createVariable` method returns an instance of the `netCDF4.Variable` class whose methods can be used later to access and set variable data and attributes. :::python >>> times = rootgrp.createVariable("time","f8",("time",)) >>> levels = rootgrp.createVariable("level","i4",("level",)) >>> latitudes = rootgrp.createVariable("lat","f4",("lat",)) >>> longitudes = rootgrp.createVariable("lon","f4",("lon",)) >>> # two dimensions unlimited >>> temp = rootgrp.createVariable("temp","f4",("time","level","lat","lon",)) To get summary info on a `netCDF4.Variable` instance in an interactive session, just print it. :::python >>> print temp float32 temp(time, level, lat, lon) least_significant_digit: 3 units: K unlimited dimensions: time, level current shape = (0, 0, 73, 144) You can use a path to create a Variable inside a hierarchy of groups. :::python >>> ftemp = rootgrp.createVariable("/forecasts/model1/temp","f4",("time","level","lat","lon",)) If the intermediate groups do not yet exist, they will be created. You can also query a `netCDF4.Dataset` or `netCDF4.Group` instance directly to obtain `netCDF4.Group` or `netCDF4.Variable` instances using paths. 
:::python >>> print rootgrp["/forecasts/model1"] # a Group instance group /forecasts/model1: dimensions(sizes): variables(dimensions): float32 temp(time,level,lat,lon) groups: >>> print rootgrp["/forecasts/model1/temp"] # a Variable instance float32 temp(time, level, lat, lon) path = /forecasts/model1 unlimited dimensions: time, level current shape = (0, 0, 73, 144) filling on, default _FillValue of 9.96920996839e+36 used All of the variables in the `netCDF4.Dataset` or `netCDF4.Group` are stored in a Python dictionary, in the same way as the dimensions: :::python >>> print rootgrp.variables OrderedDict([("time", ), ("level", ), ("lat", ), ("lon", ), ("temp", )]) `netCDF4.Variable` names can be changed using the `netCDF4.Dataset.renameVariable` method of a `netCDF4.Dataset` instance. ##
5) Attributes in a netCDF file.

There are two types of attributes in a netCDF file, global and variable. Global attributes provide information about a group, or the entire dataset, as a whole. `netCDF4.Variable` attributes provide information about one of the variables in a group. Global attributes are set by assigning values to `netCDF4.Dataset` or `netCDF4.Group` instance variables. `netCDF4.Variable` attributes are set by assigning values to `netCDF4.Variable` instance variables. Attributes can be strings, numbers or sequences. Returning to our example,

    :::python
    >>> import time
    >>> rootgrp.description = "bogus example script"
    >>> rootgrp.history = "Created " + time.ctime(time.time())
    >>> rootgrp.source = "netCDF4 python module tutorial"
    >>> latitudes.units = "degrees north"
    >>> longitudes.units = "degrees east"
    >>> levels.units = "hPa"
    >>> temp.units = "K"
    >>> times.units = "hours since 0001-01-01 00:00:00.0"
    >>> times.calendar = "gregorian"

The `netCDF4.Dataset.ncattrs` method of a `netCDF4.Dataset`, `netCDF4.Group` or `netCDF4.Variable` instance can be used to retrieve the names of all the netCDF attributes. This method is provided as a convenience, since using the built-in `dir` Python function will return a bunch of private methods and attributes that cannot (or should not) be modified by the user.

    :::python
    >>> for name in rootgrp.ncattrs():
    >>>     print "Global attr", name, "=", getattr(rootgrp,name)
    Global attr description = bogus example script
    Global attr history = Created Mon Nov  7 10.30:56 2005
    Global attr source = netCDF4 python module tutorial

The `__dict__` attribute of a `netCDF4.Dataset`, `netCDF4.Group` or `netCDF4.Variable` instance provides all the netCDF attribute name/value pairs in a python dictionary:

    :::python
    >>> print rootgrp.__dict__
    OrderedDict([(u"description", u"bogus example script"), (u"history", u"Created Thu Mar  3 19:30:33 2011"), (u"source", u"netCDF4 python module tutorial")])

Attributes can be deleted from a netCDF `netCDF4.Dataset`, `netCDF4.Group` or `netCDF4.Variable` using the python `del` statement (i.e. `del grp.foo` removes the attribute `foo` from the group `grp`). ##
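Putting these pieces together (a minimal sketch continuing the session above; the attribute name `foo` is a throwaway), a global attribute can be created, listed with `ncattrs`, and deleted again:

    :::python
    >>> rootgrp.foo = "bar"        # create a throwaway global attribute
    >>> print "foo" in rootgrp.ncattrs()
    True
    >>> del rootgrp.foo            # and delete it again
    >>> print "foo" in rootgrp.ncattrs()
    False

The `setncattr`, `getncattr` and `delncattr` methods do the same job when the attribute name is not a valid Python attribute name (for example, a name containing spaces or clashing with one of the reserved names listed above).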
6) Writing data to and retrieving data from a netCDF variable. Now that you have a netCDF `netCDF4.Variable` instance, how do you put data into it? You can just treat it like an array and assign data to a slice. :::python >>> import numpy >>> lats = numpy.arange(-90,91,2.5) >>> lons = numpy.arange(-180,180,2.5) >>> latitudes[:] = lats >>> longitudes[:] = lons >>> print "latitudes =\\n",latitudes[:] latitudes = [-90. -87.5 -85. -82.5 -80. -77.5 -75. -72.5 -70. -67.5 -65. -62.5 -60. -57.5 -55. -52.5 -50. -47.5 -45. -42.5 -40. -37.5 -35. -32.5 -30. -27.5 -25. -22.5 -20. -17.5 -15. -12.5 -10. -7.5 -5. -2.5 0. 2.5 5. 7.5 10. 12.5 15. 17.5 20. 22.5 25. 27.5 30. 32.5 35. 37.5 40. 42.5 45. 47.5 50. 52.5 55. 57.5 60. 62.5 65. 67.5 70. 72.5 75. 77.5 80. 82.5 85. 87.5 90. ] Unlike NumPy's array objects, netCDF `netCDF4.Variable` objects with unlimited dimensions will grow along those dimensions if you assign data outside the currently defined range of indices. :::python >>> # append along two unlimited dimensions by assigning to slice. >>> nlats = len(rootgrp.dimensions["lat"]) >>> nlons = len(rootgrp.dimensions["lon"]) >>> print "temp shape before adding data = ",temp.shape temp shape before adding data = (0, 0, 73, 144) >>> >>> from numpy.random import uniform >>> temp[0:5,0:10,:,:] = uniform(size=(5,10,nlats,nlons)) >>> print "temp shape after adding data = ",temp.shape temp shape after adding data = (6, 10, 73, 144) >>> >>> # levels have grown, but no values yet assigned. >>> print "levels shape after adding pressure data = ",levels.shape levels shape after adding pressure data = (10,) Note that the size of the levels variable grows when data is appended along the `level` dimension of the variable `temp`, even though no data has yet been assigned to levels. :::python >>> # now, assign data to levels dimension variable. >>> levels[:] = [1000.,850.,700.,500.,300.,250.,200.,150.,100.,50.] However, that there are some differences between NumPy and netCDF variable slicing rules. Slices behave as usual, being specified as a `start:stop:step` triplet. Using a scalar integer index `i` takes the ith element and reduces the rank of the output array by one. Boolean array and integer sequence indexing behaves differently for netCDF variables than for numpy arrays. Only 1-d boolean arrays and integer sequences are allowed, and these indices work independently along each dimension (similar to the way vector subscripts work in fortran). This means that :::python >>> temp[0, 0, [0,1,2,3], [0,1,2,3]] returns an array of shape (4,4) when slicing a netCDF variable, but for a numpy array it returns an array of shape (4,). Similarly, a netCDF variable of shape `(2,3,4,5)` indexed with `[0, array([True, False, True]), array([False, True, True, True]), :]` would return a `(2, 3, 5)` array. In NumPy, this would raise an error since it would be equivalent to `[0, [0,1], [1,2,3], :]`. When slicing with integer sequences, the indices ***need not be sorted*** and ***may contain duplicates*** (both of these are new features in version 1.2.1). While this behaviour may cause some confusion for those used to NumPy's 'fancy indexing' rules, it provides a very powerful way to extract data from multidimensional netCDF variables by using logical operations on the dimension arrays to create slices. 
For example, :::python >>> tempdat = temp[::2, [1,3,6], lats>0, lons>0] will extract time indices 0,2 and 4, pressure levels 850, 500 and 200 hPa, all Northern Hemisphere latitudes and Eastern Hemisphere longitudes, resulting in a numpy array of shape (3, 3, 36, 71). :::python >>> print "shape of fancy temp slice = ",tempdat.shape shape of fancy temp slice = (3, 3, 36, 71) ***Special note for scalar variables***: To extract data from a scalar variable `v` with no associated dimensions, use `np.asarray(v)` or `v[...]`. The result will be a numpy scalar array. ##
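Here is a minimal sketch of that scalar case (the variable name `scalar_var` is only for illustration); assigning to an Ellipsis slice writes the value, and `numpy.asarray` reads it back:

    :::python
    >>> scalar_var = rootgrp.createVariable("scalar_var", "f4") # no dimensions
    >>> scalar_var[...] = 3.14    # assign via an Ellipsis slice
    >>> print numpy.asarray(scalar_var)
    3.14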
7) Dealing with time coordinates.

Time coordinate values pose a special challenge to netCDF users. Most metadata standards (such as CF) specify that time should be measured relative to a fixed date using a certain calendar, with units specified like `hours since YY-MM-DD hh:mm:ss`. These units can be awkward to deal with, without a utility to convert the values to and from calendar dates. The functions `netCDF4.num2date` and `netCDF4.date2num` are provided with this package to do just that. Here's an example of how they can be used:

    :::python
    >>> # fill in times.
    >>> from datetime import datetime, timedelta
    >>> from netCDF4 import num2date, date2num
    >>> dates = [datetime(2001,3,1)+n*timedelta(hours=12) for n in range(temp.shape[0])]
    >>> times[:] = date2num(dates,units=times.units,calendar=times.calendar)
    >>> print "time values (in units %s): " % times.units+"\\n",times[:]
    time values (in units hours since January 1, 0001):
    [ 17533056.  17533068.  17533080.  17533092.  17533104.]
    >>> dates = num2date(times[:],units=times.units,calendar=times.calendar)
    >>> print "dates corresponding to time values:\\n",dates
    dates corresponding to time values:
    [2001-03-01 00:00:00 2001-03-01 12:00:00 2001-03-02 00:00:00
     2001-03-02 12:00:00 2001-03-03 00:00:00]

`netCDF4.num2date` converts numeric values of time in the specified `units` and `calendar` to datetime objects, and `netCDF4.date2num` does the reverse. All the calendars currently defined in the [CF metadata convention](http://cfconventions.org) are supported. A function called `netCDF4.date2index` is also provided which returns the indices of a netCDF time variable corresponding to a sequence of datetime instances. ##
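As a small sketch of `netCDF4.date2index` (continuing the session above), the index of the time value closest to a given date can be looked up with `select="nearest"`:

    :::python
    >>> from netCDF4 import date2index
    >>> print date2index(datetime(2001,3,1,12), times, select="nearest")
    1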
8) Reading data from a multi-file netCDF dataset.

If you want to read data from a variable that spans multiple netCDF files, you can use the `netCDF4.MFDataset` class to read the data as if it were contained in a single file. Instead of using a single filename to create a `netCDF4.Dataset` instance, create a `netCDF4.MFDataset` instance with either a list of filenames, or a string with a wildcard (which is then converted to a sorted list of files using the python glob module). Variables in the list of files that share the same unlimited dimension are aggregated together, and can be sliced across multiple files. To illustrate this, let's first create a bunch of netCDF files with the same variable (with the same unlimited dimension). The files must be in `NETCDF3_64BIT_OFFSET`, `NETCDF3_64BIT_DATA`, `NETCDF3_CLASSIC` or `NETCDF4_CLASSIC` format (`NETCDF4` formatted multi-file datasets are not supported).

    :::python
    >>> for nf in range(10):
    >>>     f = Dataset("mftest%s.nc" % nf,"w")
    >>>     f.createDimension("x",None)
    >>>     x = f.createVariable("x","i",("x",))
    >>>     x[0:10] = numpy.arange(nf*10,10*(nf+1))
    >>>     f.close()

Now read all the files back in at once with `netCDF4.MFDataset`

    :::python
    >>> from netCDF4 import MFDataset
    >>> f = MFDataset("mftest*nc")
    >>> print f.variables["x"][:]
    [ 0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23
     24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47
     48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71
     72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95
     96 97 98 99]

Note that `netCDF4.MFDataset` can only be used to read, not write, multi-file datasets. ##
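If the aggregated files store their time coordinates with different `units` attributes (each file counting hours since its own start date, say), the `netCDF4.MFTime` wrapper can present the aggregated time variable in one common set of units. A minimal sketch, assuming hypothetical files `mfexample*.nc` that contain a CF-style `time` variable (the `mftest` files above do not):

    :::python
    >>> from netCDF4 import MFDataset, MFTime
    >>> f = MFDataset("mfexample*.nc")  # hypothetical files with a "time" variable
    >>> t = MFTime(f.variables["time"], calendar="standard")
    >>> print t[:]  # values expressed in the units of the first file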
9) Efficient compression of netCDF variables.

Data stored in netCDF 4 `netCDF4.Variable` objects can be compressed and decompressed on the fly. The parameters for the compression are determined by the `zlib`, `complevel` and `shuffle` keyword arguments to the `netCDF4.Dataset.createVariable` method. To turn on compression, set `zlib=True`. The `complevel` keyword regulates the speed and efficiency of the compression (1 being fastest, but lowest compression ratio, 9 being slowest but best compression ratio). The default value of `complevel` is 4. Setting `shuffle=False` will turn off the HDF5 shuffle filter, which de-interlaces a block of data before compression by reordering the bytes. The shuffle filter can significantly improve compression ratios, and is on by default. Setting the `fletcher32` keyword argument to `netCDF4.Dataset.createVariable` to `True` (it's `False` by default) enables the Fletcher32 checksum algorithm for error detection. It's also possible to set the HDF5 chunking parameters and endian-ness of the binary data stored in the HDF5 file with the `chunksizes` and `endian` keyword arguments to `netCDF4.Dataset.createVariable`. These keyword arguments are only relevant for `NETCDF4` and `NETCDF4_CLASSIC` files (where the underlying file format is HDF5) and are silently ignored if the file format is `NETCDF3_CLASSIC`, `NETCDF3_64BIT_OFFSET` or `NETCDF3_64BIT_DATA`.

If your data only has a certain number of digits of precision (say for example, it is temperature data that was measured with a precision of 0.1 degrees), you can dramatically improve zlib compression by quantizing (or truncating) the data using the `least_significant_digit` keyword argument to `netCDF4.Dataset.createVariable`. The least significant digit is the power of ten of the smallest decimal place in the data that is a reliable value. For example if the data has a precision of 0.1, then setting `least_significant_digit=1` will cause the data to be quantized using `numpy.around(scale*data)/scale`, where scale = 2**bits, and bits is determined so that a precision of 0.1 is retained (in this case bits=4). Effectively, this makes the compression 'lossy' instead of 'lossless', that is some precision in the data is sacrificed for the sake of disk space. In our example, try replacing the line

    :::python
    >>> temp = rootgrp.createVariable("temp","f4",("time","level","lat","lon",))

with

    :::python
    >>> temp = rootgrp.createVariable("temp","f4",("time","level","lat","lon",),zlib=True)

and then

    :::python
    >>> temp = rootgrp.createVariable("temp","f4",("time","level","lat","lon",),zlib=True,least_significant_digit=3)

and see how much smaller the resulting files are. ##
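One way to check what ended up on a variable (a minimal sketch; the exact output shown is roughly what the settings above should produce) is the `netCDF4.Variable.filters` method, which reports the zlib/shuffle/complevel/fletcher32 settings; when `least_significant_digit` was used it is also stored as a variable attribute:

    :::python
    >>> print temp.filters()
    {'zlib': True, 'shuffle': True, 'complevel': 4, 'fletcher32': False}
    >>> print temp.least_significant_digit
    3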
10) Beyond homogeneous arrays of a fixed type - compound data types. Compound data types map directly to numpy structured (a.k.a 'record') arrays. Structured arrays are akin to C structs, or derived types in Fortran. They allow for the construction of table-like structures composed of combinations of other data types, including other compound types. Compound types might be useful for representing multiple parameter values at each point on a grid, or at each time and space location for scattered (point) data. You can then access all the information for a point by reading one variable, instead of reading different parameters from different variables. Compound data types are created from the corresponding numpy data type using the `netCDF4.Dataset.createCompoundType` method of a `netCDF4.Dataset` or `netCDF4.Group` instance. Since there is no native complex data type in netcdf, compound types are handy for storing numpy complex arrays. Here's an example: :::python >>> f = Dataset("complex.nc","w") >>> size = 3 # length of 1-d complex array >>> # create sample complex data. >>> datac = numpy.exp(1j*(1.+numpy.linspace(0, numpy.pi, size))) >>> # create complex128 compound data type. >>> complex128 = numpy.dtype([("real",numpy.float64),("imag",numpy.float64)]) >>> complex128_t = f.createCompoundType(complex128,"complex128") >>> # create a variable with this data type, write some data to it. >>> f.createDimension("x_dim",None) >>> v = f.createVariable("cmplx_var",complex128_t,"x_dim") >>> data = numpy.empty(size,complex128) # numpy structured array >>> data["real"] = datac.real; data["imag"] = datac.imag >>> v[:] = data # write numpy structured array to netcdf compound var >>> # close and reopen the file, check the contents. >>> f.close(); f = Dataset("complex.nc") >>> v = f.variables["cmplx_var"] >>> datain = v[:] # read in all the data into a numpy structured array >>> # create an empty numpy complex array >>> datac2 = numpy.empty(datain.shape,numpy.complex128) >>> # .. fill it with contents of structured array. >>> datac2.real = datain["real"]; datac2.imag = datain["imag"] >>> print datac.dtype,datac # original data complex128 [ 0.54030231+0.84147098j -0.84147098+0.54030231j -0.54030231-0.84147098j] >>> >>> print datac2.dtype,datac2 # data from file complex128 [ 0.54030231+0.84147098j -0.84147098+0.54030231j -0.54030231-0.84147098j] Compound types can be nested, but you must create the 'inner' ones first. All possible numpy structured arrays cannot be represented as Compound variables - an error message will be raise if you try to create one that is not supported. All of the compound types defined for a `netCDF4.Dataset` or `netCDF4.Group` are stored in a Python dictionary, just like variables and dimensions. As always, printing objects gives useful summary information in an interactive session: :::python >>> print f root group (NETCDF4 file format): dimensions: x_dim variables: cmplx_var groups: >>> print f.variables["cmplx_var"] compound cmplx_var(x_dim) compound data type: [("real", ">> print f.cmptypes OrderedDict([("complex128", )]) >>> print f.cmptypes["complex128"] : name = "complex128", numpy dtype = [(u"real","11) Variable-length (vlen) data types. NetCDF 4 has support for variable-length or "ragged" arrays. These are arrays of variable length sequences having the same type. To create a variable-length data type, use the `netCDF4.Dataset.createVLType` method method of a `netCDF4.Dataset` or `netCDF4.Group` instance. 
:::python >>> f = Dataset("tst_vlen.nc","w") >>> vlen_t = f.createVLType(numpy.int32, "phony_vlen") The numpy datatype of the variable-length sequences and the name of the new datatype must be specified. Any of the primitive datatypes can be used (signed and unsigned integers, 32 and 64 bit floats, and characters), but compound data types cannot. A new variable can then be created using this datatype. :::python >>> x = f.createDimension("x",3) >>> y = f.createDimension("y",4) >>> vlvar = f.createVariable("phony_vlen_var", vlen_t, ("y","x")) Since there is no native vlen datatype in numpy, vlen arrays are represented in python as object arrays (arrays of dtype `object`). These are arrays whose elements are Python object pointers, and can contain any type of python object. For this application, they must contain 1-D numpy arrays all of the same type but of varying length. In this case, they contain 1-D numpy `int32` arrays of random length between 1 and 10. :::python >>> import random >>> data = numpy.empty(len(y)*len(x),object) >>> for n in range(len(y)*len(x)): >>> data[n] = numpy.arange(random.randint(1,10),dtype="int32")+1 >>> data = numpy.reshape(data,(len(y),len(x))) >>> vlvar[:] = data >>> print "vlen variable =\\n",vlvar[:] vlen variable = [[[ 1 2 3 4 5 6 7 8 9 10] [1 2 3 4 5] [1 2 3 4 5 6 7 8]] [[1 2 3 4 5 6 7] [1 2 3 4 5 6] [1 2 3 4 5]] [[1 2 3 4 5] [1 2 3 4] [1]] [[ 1 2 3 4 5 6 7 8 9 10] [ 1 2 3 4 5 6 7 8 9 10] [1 2 3 4 5 6 7 8]]] >>> print f root group (NETCDF4 file format): dimensions: x, y variables: phony_vlen_var groups: >>> print f.variables["phony_vlen_var"] vlen phony_vlen_var(y, x) vlen data type: int32 unlimited dimensions: current shape = (4, 3) >>> print f.VLtypes["phony_vlen"] : name = "phony_vlen", numpy dtype = int32 Numpy object arrays containing python strings can also be written as vlen variables, For vlen strings, you don't need to create a vlen data type. Instead, simply use the python `str` builtin (or a numpy string datatype with fixed length greater than 1) when calling the `netCDF4.Dataset.createVariable` method. :::python >>> z = f.createDimension("z",10) >>> strvar = rootgrp.createVariable("strvar", str, "z") In this example, an object array is filled with random python strings with random lengths between 2 and 12 characters, and the data in the object array is assigned to the vlen string variable. :::python >>> chars = "1234567890aabcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ" >>> data = numpy.empty(10,"O") >>> for n in range(10): >>> stringlen = random.randint(2,12) >>> data[n] = "".join([random.choice(chars) for i in range(stringlen)]) >>> strvar[:] = data >>> print "variable-length string variable:\\n",strvar[:] variable-length string variable: [aDy29jPt 5DS9X8 jd7aplD b8t4RM jHh8hq KtaPWF9cQj Q1hHN5WoXSiT MMxsVeq tdLUzvVTzj] >>> print f root group (NETCDF4 file format): dimensions: x, y, z variables: phony_vlen_var, strvar groups: >>> print f.variables["strvar"] vlen strvar(z) vlen data type: unlimited dimensions: current size = (10,) It is also possible to set contents of vlen string variables with numpy arrays of any string or unicode data type. Note, however, that accessing the contents of such variables will always return numpy arrays with dtype `object`. ##
12) Enum data type. netCDF4 has an enumerated data type, which is an integer datatype that is restricted to certain named values. Since Enums don't map directly to a numpy data type, they are read and written as integer arrays. Here's an example of using an Enum type to hold cloud type data. The base integer data type and a python dictionary describing the allowed values and their names are used to define an Enum data type using `netCDF4.Dataset.createEnumType`. :::python >>> nc = Dataset('clouds.nc','w') >>> # python dict with allowed values and their names. >>> enum_dict = {u'Altocumulus': 7, u'Missing': 255, >>> u'Stratus': 2, u'Clear': 0, >>> u'Nimbostratus': 6, u'Cumulus': 4, u'Altostratus': 5, >>> u'Cumulonimbus': 1, u'Stratocumulus': 3} >>> # create the Enum type called 'cloud_t'. >>> cloud_type = nc.createEnumType(numpy.uint8,'cloud_t',enum_dict) >>> print cloud_type : name = 'cloud_t', numpy dtype = uint8, fields/values ={u'Cumulus': 4, u'Altocumulus': 7, u'Missing': 255, u'Stratus': 2, u'Clear': 0, u'Cumulonimbus': 1, u'Stratocumulus': 3, u'Nimbostratus': 6, u'Altostratus': 5} A new variable can be created in the usual way using this data type. Integer data is written to the variable that represents the named cloud types in enum_dict. A `ValueError` will be raised if an attempt is made to write an integer value not associated with one of the specified names. :::python >>> time = nc.createDimension('time',None) >>> # create a 1d variable of type 'cloud_type'. >>> # The fill_value is set to the 'Missing' named value. >>> cloud_var = >>> nc.createVariable('primary_cloud',cloud_type,'time', >>> fill_value=enum_dict['Missing']) >>> # write some data to the variable. >>> cloud_var[:] = [enum_dict['Clear'],enum_dict['Stratus'], >>> enum_dict['Cumulus'],enum_dict['Missing'], >>> enum_dict['Cumulonimbus']] >>> nc.close() >>> # reopen the file, read the data. >>> nc = Dataset('clouds.nc') >>> cloud_var = nc.variables['primary_cloud'] >>> print cloud_var enum primary_cloud(time) _FillValue: 255 enum data type: uint8 unlimited dimensions: time current shape = (5,) >>> print cloud_var.datatype.enum_dict {u'Altocumulus': 7, u'Missing': 255, u'Stratus': 2, u'Clear': 0, u'Nimbostratus': 6, u'Cumulus': 4, u'Altostratus': 5, u'Cumulonimbus': 1, u'Stratocumulus': 3} >>> print cloud_var[:] [0 2 4 -- 1] >>> nc.close() ##
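To map the stored integers back to their names (a minimal sketch; the dictionary inversion is ad hoc, not part of the API), the `enum_dict` of the variable's datatype can be inverted, skipping the masked 'Missing' value:

    :::python
    >>> nc = Dataset('clouds.nc')
    >>> cloud_var = nc.variables['primary_cloud']
    >>> names = dict((v, k) for k, v in cloud_var.datatype.enum_dict.items())
    >>> print [names[val] for val in cloud_var[:].compressed()]
    [u'Clear', u'Stratus', u'Cumulus', u'Cumulonimbus']
    >>> nc.close()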
13) Parallel IO.

If MPI parallel enabled versions of netcdf and hdf5 are detected, and [mpi4py](https://mpi4py.scipy.org) is installed, netcdf4-python will be built with parallel IO capabilities enabled. To use parallel IO, your program must be running in an MPI environment using [mpi4py](https://mpi4py.scipy.org).

    :::python
    >>> from mpi4py import MPI
    >>> import numpy as np
    >>> from netCDF4 import Dataset
    >>> rank = MPI.COMM_WORLD.rank  # The process ID (integer 0-3 for 4-process run)

To run an MPI-based parallel program like this, you must use `mpiexec` to launch several parallel instances of Python (for example, using `mpiexec -np 4 python mpi_example.py`). The parallel features of netcdf4-python are mostly transparent - when a new dataset is created or an existing dataset is opened, use the `parallel` keyword to enable parallel access.

    :::python
    >>> nc = Dataset('parallel_test.nc','w',parallel=True)

The optional `comm` keyword may be used to specify a particular MPI communicator (`MPI_COMM_WORLD` is used by default). Each process (or rank) can now write to the file independently. In this example the process rank is written to a different variable index on each task:

    :::python
    >>> d = nc.createDimension('dim',4)
    >>> v = nc.createVariable('var', np.int, 'dim')
    >>> v[rank] = rank
    >>> nc.close()

    % ncdump parallel_test.nc
    netcdf parallel_test {
    dimensions:
        dim = 4 ;
    variables:
        int64 var(dim) ;
    data:

        var = 0, 1, 2, 3 ;
    }

There are two types of parallel IO, independent (the default) and collective. Independent IO means that each process can do IO independently. It should not depend on or be affected by other processes. Collective IO is a way of doing IO defined in the MPI-IO standard; unlike independent IO, all processes must participate in doing IO. To toggle back and forth between the two types of IO, use the `netCDF4.Variable.set_collective` method of a `netCDF4.Variable` instance. All metadata operations (such as creation of groups, types, variables, dimensions, or attributes) are collective.

There are a couple of important limitations of parallel IO:

- If a variable has an unlimited dimension, appending data must be done in collective mode. If the write is done in independent mode, the operation will fail with a generic "HDF Error".
- You cannot write compressed data in parallel (although you can read it).
- You cannot use variable-length (VLEN) data types.

All of the code in this tutorial is available in `examples/tutorial.py`, except the parallel IO example, which is in `examples/mpi_example.py`. Unit tests are in the `test` directory.

**contact**: Jeffrey Whitaker

**copyright**: 2008 by Jeffrey Whitaker.

**license**: Permission to use, copy, modify, and distribute this software and its documentation for any purpose and without fee is hereby granted, provided that the above copyright notice appear in all copies and that both the copyright notice and this permission notice appear in supporting documentation. THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.

- - -
"""
# Make changes to this file, not the c-wrappers that Cython generates.
from cpython.mem cimport PyMem_Malloc, PyMem_Free from cpython.buffer cimport PyObject_GetBuffer, PyBuffer_Release, PyBUF_SIMPLE, PyBUF_ANY_CONTIGUOUS # pure python utilities from .utils import (_StartCountStride, _quantize, _find_dim, _walk_grps, _out_array_shape, _sortbylist, _tostr, _safecast) # try to use built-in ordered dict in python >= 2.7 try: from collections import OrderedDict except ImportError: # or else use drop-in substitute try: from ordereddict import OrderedDict except ImportError: raise ImportError('please install ordereddict (https://pypi.python.org/pypi/ordereddict)') try: from itertools import izip as zip except ImportError: # python3: zip is already python2's itertools.izip pass __version__ = "1.3.1" # Initialize numpy import posixpath import netcdftime import numpy import weakref import sys import warnings from glob import glob from numpy import ma from libc.string cimport memcpy, memset from libc.stdlib cimport malloc, free import_array() include "constants.pyx" include "netCDF4.pxi" IF HAS_NC_PAR: cimport mpi4py.MPI as MPI from mpi4py.libmpi cimport MPI_Comm, MPI_Info, MPI_Comm_dup, MPI_Info_dup, \ MPI_Comm_free, MPI_Info_free, MPI_INFO_NULL,\ MPI_COMM_WORLD ctypedef MPI.Comm Comm ctypedef MPI.Info Info ELSE: ctypedef object Comm ctypedef object Info # check for required version of netcdf-4 and hdf5. def _gethdf5libversion(): cdef unsigned int majorvers, minorvers, releasevers cdef herr_t ierr ierr = H5get_libversion( &majorvers, &minorvers, &releasevers) if ierr < 0: raise RuntimeError('error getting HDF5 library version info') return '%d.%d.%d' % (majorvers,minorvers,releasevers) def getlibversion(): """ **`getlibversion()`** returns a string describing the version of the netcdf library used to build the module, and when it was built. """ return (nc_inq_libvers()).decode('ascii') __netcdf4libversion__ = getlibversion().split()[0] __hdf5libversion__ = _gethdf5libversion() __has_rename_grp__ = HAS_RENAME_GRP __has_nc_inq_path__ = HAS_NC_INQ_PATH __has_nc_inq_format_extended__ = HAS_NC_INQ_FORMAT_EXTENDED __has_cdf5_format__ = HAS_CDF5_FORMAT __has_nc_open_mem__ = HAS_NC_OPEN_MEM __has_nc_par__ = HAS_NC_PAR _needsworkaround_issue485 = __netcdf4libversion__ < "4.4.0" or \ (__netcdf4libversion__.startswith("4.4.0") and \ "-development" in __netcdf4libversion__) # issue warning for hdf5 1.10 (issue #549) if __netcdf4libversion__[0:5] < "4.4.1" and\ __hdf5libversion__.startswith("1.10"): msg = """ WARNING: Backwards incompatible files will be created with HDF5 1.10.x and netCDF < 4.4.1. Upgrading to netCDF4 >= 4.4.1 or downgrading to to HDF5 version 1.8.x is highly recommended (see https://github.com/Unidata/netcdf-c/issues/250).""" warnings.warn(msg) # numpy data type <--> netCDF 4 data type mapping. _nptonctype = {'S1' : NC_CHAR, 'i1' : NC_BYTE, 'u1' : NC_UBYTE, 'i2' : NC_SHORT, 'u2' : NC_USHORT, 'i4' : NC_INT, 'u4' : NC_UINT, 'i8' : NC_INT64, 'u8' : NC_UINT64, 'f4' : NC_FLOAT, 'f8' : NC_DOUBLE} # just integer types. _intnptonctype = {'i1' : NC_BYTE, 'u1' : NC_UBYTE, 'i2' : NC_SHORT, 'u2' : NC_USHORT, 'i4' : NC_INT, 'u4' : NC_UINT, 'i8' : NC_INT64, 'u8' : NC_UINT64} # create dictionary mapping string identifiers to netcdf format codes _format_dict = {'NETCDF3_CLASSIC' : NC_FORMAT_CLASSIC, 'NETCDF4_CLASSIC' : NC_FORMAT_NETCDF4_CLASSIC, 'NETCDF4' : NC_FORMAT_NETCDF4} IF HAS_CDF5_FORMAT: # NETCDF3_64BIT deprecated, saved for compatibility. # use NETCDF3_64BIT_OFFSET instead. 
_format_dict['NETCDF3_64BIT_OFFSET'] = NC_FORMAT_64BIT_OFFSET _format_dict['NETCDF3_64BIT_DATA'] = NC_FORMAT_64BIT_DATA ELSE: _format_dict['NETCDF3_64BIT'] = NC_FORMAT_64BIT # invert dictionary mapping _reverse_format_dict = dict((v, k) for k, v in _format_dict.iteritems()) # add duplicate entry (NETCDF3_64BIT == NETCDF3_64BIT_OFFSET) IF HAS_CDF5_FORMAT: _format_dict['NETCDF3_64BIT'] = NC_FORMAT_64BIT_OFFSET ELSE: _format_dict['NETCDF3_64BIT_OFFSET'] = NC_FORMAT_64BIT # default fill_value to numpy datatype mapping. default_fillvals = {#'S1':NC_FILL_CHAR, 'S1':'\0', 'i1':NC_FILL_BYTE, 'u1':NC_FILL_UBYTE, 'i2':NC_FILL_SHORT, 'u2':NC_FILL_USHORT, 'i4':NC_FILL_INT, 'u4':NC_FILL_UINT, 'i8':NC_FILL_INT64, 'u8':NC_FILL_UINT64, 'f4':NC_FILL_FLOAT, 'f8':NC_FILL_DOUBLE} # logical for native endian type. is_native_little = numpy.dtype('f4').byteorder == '=' # hard code these here, instead of importing from netcdf.h # so it will compile with versions <= 4.2. NC_DISKLESS = 0x0008 # next two lines do nothing, preserved for backwards compatibility. default_encoding = 'utf-8' unicode_error = 'replace' python3 = sys.version_info[0] > 2 if python3: buffer = memoryview _nctonptype = {} for _key,_value in _nptonctype.items(): _nctonptype[_value] = _key _supportedtypes = _nptonctype.keys() # make sure NC_CHAR points to S1 _nctonptype[NC_CHAR]='S1' # internal C functions. cdef _get_att_names(int grpid, int varid): # Private function to get all the attribute names in a group cdef int ierr, numatts, n cdef char namstring[NC_MAX_NAME+1] if varid == NC_GLOBAL: with nogil: ierr = nc_inq_natts(grpid, &numatts) else: with nogil: ierr = nc_inq_varnatts(grpid, varid, &numatts) _ensure_nc_success(ierr, err_cls=AttributeError) attslist = [] for n from 0 <= n < numatts: with nogil: ierr = nc_inq_attname(grpid, varid, n, namstring) _ensure_nc_success(ierr, err_cls=AttributeError) # attribute names are assumed to be utf-8 attslist.append(namstring.decode('utf-8')) return attslist cdef _get_att(grp, int varid, name, encoding='utf-8'): # Private function to get an attribute value given its name cdef int ierr, n, _grpid cdef size_t att_len cdef char *attname cdef nc_type att_type cdef ndarray value_arr # attribute names are assumed to be utf-8 bytestr = _strencode(name,encoding='utf-8') attname = bytestr _grpid = grp._grpid with nogil: ierr = nc_inq_att(_grpid, varid, attname, &att_type, &att_len) _ensure_nc_success(ierr, err_cls=AttributeError) # attribute is a character or string ... if att_type == NC_CHAR: value_arr = numpy.empty(att_len,'S1') with nogil: ierr = nc_get_att_text(_grpid, varid, attname, value_arr.data) _ensure_nc_success(ierr, err_cls=AttributeError) if name == '_FillValue' and python3: # make sure _FillValue for character arrays is a byte on python 3 # (issue 271). pstring = value_arr.tostring() else: pstring =\ value_arr.tostring().decode(encoding,errors='replace').replace('\x00','') return pstring elif att_type == NC_STRING: values = PyMem_Malloc(sizeof(char*) * att_len) if not values: raise MemoryError() try: with nogil: ierr = nc_get_att_string(_grpid, varid, attname, values) _ensure_nc_success(ierr, err_cls=AttributeError) try: result = [values[j].decode(encoding,errors='replace').replace('\x00','') for j in range(att_len)] finally: ierr = nc_free_string(att_len, values) # free memory in netcdf C lib finally: PyMem_Free(values) if len(result) == 1: return result[0] else: return result else: # a regular numeric or compound type. 
if att_type == NC_LONG: att_type = NC_INT try: type_att = _nctonptype[att_type] # see if it is a primitive type value_arr = numpy.empty(att_len,type_att) except KeyError: # check if it's a compound try: type_att = _read_compound(grp, att_type) value_arr = numpy.empty(att_len,type_att) except: # check if it's an enum try: type_att = _read_enum(grp, att_type) value_arr = numpy.empty(att_len,type_att.dtype) except: raise KeyError('attribute %s has unsupported datatype' % attname) with nogil: ierr = nc_get_att(_grpid, varid, attname, value_arr.data) _ensure_nc_success(ierr, err_cls=AttributeError) if value_arr.shape == (): # return a scalar for a scalar array return value_arr.item() elif att_len == 1: # return a scalar for a single element array return value_arr[0] else: return value_arr def _set_default_format(object format='NETCDF4'): # Private function to set the netCDF file format if format not in _format_dict: raise ValueError("unrecognized format requested") nc_set_default_format(_format_dict[format], NULL) cdef _get_format(int grpid): # Private function to get the netCDF file format cdef int ierr, formatp with nogil: ierr = nc_inq_format(grpid, &formatp) _ensure_nc_success(ierr) if formatp not in _reverse_format_dict: raise ValueError('format not supported by python interface') return _reverse_format_dict[formatp] cdef _get_full_format(int grpid): # Private function to get the underlying disk format cdef int ierr, formatp, modep IF HAS_NC_INQ_FORMAT_EXTENDED: with nogil: ierr = nc_inq_format_extended(grpid, &formatp, &modep) _ensure_nc_success(ierr) if formatp == NC_FORMAT_NC3: return 'NETCDF3' elif formatp == NC_FORMAT_NC_HDF5: return 'HDF5' elif formatp == NC_FORMAT_NC_HDF4: return 'HDF4' elif formatp == NC_FORMAT_PNETCDF: return 'PNETCDF' elif formatp == NC_FORMAT_DAP2: return 'DAP2' elif formatp == NC_FORMAT_DAP4: return 'DAP4' elif formatp == NC_FORMAT_UNDEFINED: return 'UNDEFINED' ELSE: return 'UNDEFINED' cdef issue485_workaround(int grpid, int varid, char* attname): # check to see if attribute already exists # and is NC_CHAR, if so delete it and re-create it # (workaround for issue #485). Fixed in C library # with commit 473259b7728120bb281c52359b1af50cca2fcb72, # which was included in 4.4.0-RC5. cdef nc_type att_type cdef size_t att_len if not _needsworkaround_issue485: return ierr = nc_inq_att(grpid, varid, attname, &att_type, &att_len) if ierr == NC_NOERR and att_type == NC_CHAR: ierr = nc_del_att(grpid, varid, attname) _ensure_nc_success(ierr) cdef _set_att(grp, int varid, name, value,\ nc_type xtype=-99, force_ncstring=False): # Private function to set an attribute name/value pair cdef int ierr, lenarr cdef char *attname cdef char *datstring cdef char **string_ptrs cdef ndarray value_arr bytestr = _strencode(name) attname = bytestr # put attribute value into a numpy array. value_arr = numpy.array(value) # if array is 64 bit integers or # if 64-bit datatype not supported, cast to 32 bit integers. fmt = _get_format(grp._grpid) is_netcdf3 = fmt.startswith('NETCDF3') or fmt == 'NETCDF4_CLASSIC' if value_arr.dtype.str[1:] == 'i8' and ('i8' not in _supportedtypes or\ is_netcdf3): value_arr = value_arr.astype('i4') # if array contains ascii strings, write a text attribute (stored as bytes). # if array contains unicode strings, and data model is NETCDF4, # write as a string. 
if value_arr.dtype.char in ['S','U']: if not is_netcdf3 and force_ncstring and value_arr.size > 1: N = value_arr.size string_ptrs = PyMem_Malloc(N * sizeof(char*)) if not string_ptrs: raise MemoryError() try: strings = [_strencode(s) for s in value_arr.flat] for j in range(N): if len(strings[j]) == 0: strings[j] = _strencode('\x00') string_ptrs[j] = strings[j] issue485_workaround(grp._grpid, varid, attname) ierr = nc_put_att_string(grp._grpid, varid, attname, N, string_ptrs) finally: PyMem_Free(string_ptrs) else: if not value_arr.shape: dats = _strencode(value_arr.item()) else: value_arr1 = value_arr.ravel() dats = _strencode(''.join(value_arr1.tolist())) lenarr = len(dats) datstring = dats if lenarr == 0: # write null byte lenarr=1; datstring = '\x00' if (force_ncstring or value_arr.dtype.char == 'U') and not is_netcdf3: # try to convert to ascii string, write as NC_CHAR # else it's a unicode string, write as NC_STRING (if NETCDF4) try: if force_ncstring: raise UnicodeError dats_ascii = _to_ascii(dats) # try to encode bytes as ascii string ierr = nc_put_att_text(grp._grpid, varid, attname, lenarr, datstring) except UnicodeError: issue485_workaround(grp._grpid, varid, attname) ierr = nc_put_att_string(grp._grpid, varid, attname, 1, &datstring) else: ierr = nc_put_att_text(grp._grpid, varid, attname, lenarr, datstring) _ensure_nc_success(ierr, err_cls=AttributeError) # a 'regular' array type ('f4','i4','f8' etc) else: if value_arr.dtype.kind == 'V': # compound attribute. xtype = _find_cmptype(grp,value_arr.dtype) elif value_arr.dtype.str[1:] not in _supportedtypes: raise TypeError, 'illegal data type for attribute, must be one of %s, got %s' % (_supportedtypes, value_arr.dtype.str[1:]) elif xtype == -99: # if xtype is not passed in as kwarg. xtype = _nptonctype[value_arr.dtype.str[1:]] lenarr = PyArray_SIZE(value_arr) ierr = nc_put_att(grp._grpid, varid, attname, xtype, lenarr, value_arr.data) _ensure_nc_success(ierr, err_cls=AttributeError) cdef _get_types(group): # Private function to create `netCDF4.CompoundType`, # `netCDF4.VLType` or `netCDF4.EnumType` instances for all the # compound, VLEN or Enum types in a `netCDF4.Group` or `netCDF4.Dataset`. cdef int ierr, ntypes, classp, n, _grpid cdef nc_type xtype cdef nc_type *typeids cdef char namstring[NC_MAX_NAME+1] _grpid = group._grpid # get the number of user defined types in this group. with nogil: ierr = nc_inq_typeids(_grpid, &ntypes, NULL) _ensure_nc_success(ierr) if ntypes > 0: typeids = malloc(sizeof(nc_type) * ntypes) with nogil: ierr = nc_inq_typeids(_grpid, &ntypes, typeids) _ensure_nc_success(ierr) # create empty dictionary for CompoundType instances. cmptypes = OrderedDict() vltypes = OrderedDict() enumtypes = OrderedDict() if ntypes > 0: for n from 0 <= n < ntypes: xtype = typeids[n] with nogil: ierr = nc_inq_user_type(_grpid, xtype, namstring, NULL,NULL,NULL,&classp) _ensure_nc_success(ierr) if classp == NC_COMPOUND: # a compound name = namstring.decode('utf-8') # read the compound type info from the file, # create a CompoundType instance from it. try: cmptype = _read_compound(group, xtype) except KeyError: msg='WARNING: unsupported Compound type, skipping...' warnings.warn(msg) continue cmptypes[name] = cmptype elif classp == NC_VLEN: # a vlen name = namstring.decode('utf-8') # read the VLEN type info from the file, # create a VLType instance from it. try: vltype = _read_vlen(group, xtype) except KeyError: msg='WARNING: unsupported VLEN type, skipping...' 
warnings.warn(msg) continue vltypes[name] = vltype elif classp == NC_ENUM: # an enum type name = namstring.decode('utf-8') # read the Enum type info from the file, # create a EnumType instance from it. try: enumtype = _read_enum(group, xtype) except KeyError: msg='WARNING: unsupported Enum type, skipping...' warnings.warn(msg) continue enumtypes[name] = enumtype free(typeids) return cmptypes, vltypes, enumtypes cdef _get_dims(group): # Private function to create `netCDF4.Dimension` instances for all the # dimensions in a `netCDF4.Group` or Dataset cdef int ierr, numdims, n, _grpid cdef int *dimids cdef char namstring[NC_MAX_NAME+1] # get number of dimensions in this Group. _grpid = group._grpid with nogil: ierr = nc_inq_ndims(_grpid, &numdims) _ensure_nc_success(ierr) # create empty dictionary for dimensions. dimensions = OrderedDict() if numdims > 0: dimids = malloc(sizeof(int) * numdims) if group.data_model == 'NETCDF4': with nogil: ierr = nc_inq_dimids(_grpid, &numdims, dimids, 0) _ensure_nc_success(ierr) else: for n from 0 <= n < numdims: dimids[n] = n for n from 0 <= n < numdims: with nogil: ierr = nc_inq_dimname(_grpid, dimids[n], namstring) _ensure_nc_success(ierr) name = namstring.decode('utf-8') dimensions[name] = Dimension(group, name, id=dimids[n]) free(dimids) return dimensions cdef _get_grps(group): # Private function to create `netCDF4.Group` instances for all the # groups in a `netCDF4.Group` or Dataset cdef int ierr, numgrps, n, _grpid cdef int *grpids cdef char namstring[NC_MAX_NAME+1] # get number of groups in this Group. _grpid = group._grpid with nogil: ierr = nc_inq_grps(_grpid, &numgrps, NULL) _ensure_nc_success(ierr) # create dictionary containing `netCDF4.Group` instances for groups in this group groups = OrderedDict() if numgrps > 0: grpids = malloc(sizeof(int) * numgrps) with nogil: ierr = nc_inq_grps(_grpid, NULL, grpids) _ensure_nc_success(ierr) for n from 0 <= n < numgrps: with nogil: ierr = nc_inq_grpname(grpids[n], namstring) _ensure_nc_success(ierr) name = namstring.decode('utf-8') groups[name] = Group(group, name, id=grpids[n]) free(grpids) return groups cdef _get_vars(group): # Private function to create `netCDF4.Variable` instances for all the # variables in a `netCDF4.Group` or Dataset cdef int ierr, numvars, n, nn, numdims, varid, classp, iendian, _grpid cdef int *varids cdef int *dimids cdef nc_type xtype cdef char namstring[NC_MAX_NAME+1] cdef char namstring_cmp[NC_MAX_NAME+1] # get number of variables in this Group. _grpid = group._grpid with nogil: ierr = nc_inq_nvars(_grpid, &numvars) _ensure_nc_success(ierr, err_cls=AttributeError) # create empty dictionary for variables. variables = OrderedDict() if numvars > 0: # get variable ids. varids = malloc(sizeof(int) * numvars) if group.data_model == 'NETCDF4': with nogil: ierr = nc_inq_varids(_grpid, &numvars, varids) _ensure_nc_success(ierr) else: for n from 0 <= n < numvars: varids[n] = n # loop over variables. for n from 0 <= n < numvars: varid = varids[n] # get variable name. with nogil: ierr = nc_inq_varname(_grpid, varid, namstring) _ensure_nc_success(ierr) name = namstring.decode('utf-8') # get variable type. with nogil: ierr = nc_inq_vartype(_grpid, varid, &xtype) _ensure_nc_success(ierr) # get endian-ness of variable. endianness = None with nogil: ierr = nc_inq_var_endian(_grpid, varid, &iendian) if ierr == NC_NOERR and iendian == NC_ENDIAN_LITTLE: endianness = '<' elif iendian == NC_ENDIAN_BIG: endianness = '>' # check to see if it is a supported user-defined type. 
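# Summary of the type resolution that follows (descriptive only): the
# primitive netCDF -> numpy lookup table is tried first, then NC_STRING,
# and finally nc_inq_user_type is consulted to see whether the type id is
# a compound, VLEN or Enum type; variables whose datatype cannot be
# represented are skipped with a warning.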
try: datatype = _nctonptype[xtype] if endianness is not None: datatype = endianness + datatype except KeyError: if xtype == NC_STRING: datatype = str else: with nogil: ierr = nc_inq_user_type(_grpid, xtype, namstring_cmp, NULL, NULL, NULL, &classp) _ensure_nc_success(ierr) if classp == NC_COMPOUND: # a compound type # create CompoundType instance describing this compound type. try: datatype = _read_compound(group, xtype, endian=endianness) except KeyError: msg="WARNING: variable '%s' has unsupported compound datatype, skipping .." % name warnings.warn(msg) continue elif classp == NC_VLEN: # a compound type # create VLType instance describing this compound type. try: datatype = _read_vlen(group, xtype, endian=endianness) except KeyError: msg="WARNING: variable '%s' has unsupported VLEN datatype, skipping .." % name warnings.warn(msg) continue elif classp == NC_ENUM: # create EnumType instance describing this compound type. try: datatype = _read_enum(group, xtype, endian=endianness) except KeyError: msg="WARNING: variable '%s' has unsupported Enum datatype, skipping .." % name warnings.warn(msg) continue else: msg="WARNING: variable '%s' has unsupported datatype, skipping .." % name warnings.warn(msg) continue # get number of dimensions. with nogil: ierr = nc_inq_varndims(_grpid, varid, &numdims) _ensure_nc_success(ierr) dimids = malloc(sizeof(int) * numdims) # get dimension ids. with nogil: ierr = nc_inq_vardimid(_grpid, varid, dimids) _ensure_nc_success(ierr) # loop over dimensions, retrieve names. # if not found in current group, look in parents. # QUESTION: what if grp1 has a dimension named 'foo' # and so does it's parent - can a variable in grp1 # use the 'foo' dimension from the parent? dimensions = [] for nn from 0 <= nn < numdims: grp = group found = False while not found: for key, value in grp.dimensions.items(): if value._dimid == dimids[nn]: dimensions.append(key) found = True break grp = grp.parent free(dimids) # create new variable instance. if endianness == '>': variables[name] = Variable(group, name, datatype, dimensions, id=varid, endian='big') elif endianness == '<': variables[name] = Variable(group, name, datatype, dimensions, id=varid, endian='little') else: variables[name] = Variable(group, name, datatype, dimensions, id=varid) free(varids) # free pointer holding variable ids. return variables cdef _ensure_nc_success(ierr, err_cls=RuntimeError, filename=None): # print netcdf error message, raise error. if ierr != NC_NOERR: err_str = (nc_strerror(ierr)).decode('ascii') if issubclass(err_cls, EnvironmentError): raise err_cls(ierr, err_str, filename) else: raise err_cls(err_str) # these are class attributes that # only exist at the python level (not in the netCDF file). _private_atts = \ ['_grpid','_grp','_varid','groups','dimensions','variables','dtype','data_model','disk_format', '_nunlimdim','path','parent','ndim','mask','scale','cmptypes','vltypes','enumtypes','_isprimitive', 'file_format','_isvlen','_isenum','_iscompound','_cmptype','_vltype','_enumtype','name', '__orthogoral_indexing__','keepweakref','_has_lsd', '_buffer','chartostring','_no_get_vars'] __pdoc__ = {} cdef class Dataset: """ A netCDF `netCDF4.Dataset` is a collection of dimensions, groups, variables and attributes. Together they describe the meaning of data and relations among data fields stored in a netCDF file. See `netCDF4.Dataset.__init__` for more details. 
A list of attribute names corresponding to global netCDF attributes defined for the `netCDF4.Dataset` can be obtained with the `netCDF4.Dataset.ncattrs` method. These attributes can be created by assigning to an attribute of the `netCDF4.Dataset` instance. A dictionary containing all the netCDF attribute name/value pairs is provided by the `__dict__` attribute of a `netCDF4.Dataset` instance. The following class variables are read-only and should not be modified by the user. **`dimensions`**: The `dimensions` dictionary maps the names of dimensions defined for the `netCDF4.Group` or `netCDF4.Dataset` to instances of the `netCDF4.Dimension` class. **`variables`**: The `variables` dictionary maps the names of variables defined for this `netCDF4.Dataset` or `netCDF4.Group` to instances of the `netCDF4.Variable` class. **`groups`**: The groups dictionary maps the names of groups created for this `netCDF4.Dataset` or `netCDF4.Group` to instances of the `netCDF4.Group` class (the `netCDF4.Dataset` class is simply a special case of the `netCDF4.Group` class which describes the root group in the netCDF4 file). **`cmptypes`**: The `cmptypes` dictionary maps the names of compound types defined for the `netCDF4.Group` or `netCDF4.Dataset` to instances of the `netCDF4.CompoundType` class. **`vltypes`**: The `vltypes` dictionary maps the names of variable-length types defined for the `netCDF4.Group` or `netCDF4.Dataset` to instances of the `netCDF4.VLType` class. **`enumtypes`**: The `enumtypes` dictionary maps the names of Enum types defined for the `netCDF4.Group` or `netCDF4.Dataset` to instances of the `netCDF4.EnumType` class. **`data_model`**: `data_model` describes the netCDF data model version, one of `NETCDF3_CLASSIC`, `NETCDF4`, `NETCDF4_CLASSIC`, `NETCDF3_64BIT_OFFSET` or `NETCDF3_64BIT_DATA`. **`file_format`**: same as `data_model`, retained for backwards compatibility. **`disk_format`**: `disk_format` describes the underlying file format, one of `NETCDF3`, `HDF5`, `HDF4`, `PNETCDF`, `DAP2`, `DAP4` or `UNDEFINED`. Only available if using netcdf C library version >= 4.3.1, otherwise will always return `UNDEFINED`. **`parent`**: `parent` is a reference to the parent `netCDF4.Group` instance. `None` for the root group or `netCDF4.Dataset` instance. **`path`**: `path` shows the location of the `netCDF4.Group` in the `netCDF4.Dataset` in a unix directory format (the names of groups in the hierarchy separated by backslashes). A `netCDF4.Dataset` instance is the root group, so the path is simply `'/'`. **`keepweakref`**: If `True`, child Dimension and Variables objects only keep weak references to the parent Dataset or Group. """ cdef object __weakref__ cdef public int _grpid cdef public int _isopen cdef Py_buffer _buffer cdef public groups, dimensions, variables, disk_format, path, parent,\ file_format, data_model, cmptypes, vltypes, enumtypes, __orthogonal_indexing__, \ keepweakref # Docstrings for class variables (used by pdoc). 
__pdoc__['Dataset.dimensions']=\ """The `dimensions` dictionary maps the names of dimensions defined for the `netCDF4.Group` or `netCDF4.Dataset` to instances of the `netCDF4.Dimension` class.""" __pdoc__['Dataset.variables']=\ """The `variables` dictionary maps the names of variables defined for this `netCDF4.Dataset` or `netCDF4.Group` to instances of the `netCDF4.Variable` class.""" __pdoc__['Dataset.groups']=\ """The groups dictionary maps the names of groups created for this `netCDF4.Dataset` or `netCDF4.Group` to instances of the `netCDF4.Group` class (the `netCDF4.Dataset` class is simply a special case of the `netCDF4.Group` class which describes the root group in the netCDF4 file).""" __pdoc__['Dataset.cmptypes']=\ """The `cmptypes` dictionary maps the names of compound types defined for the `netCDF4.Group` or `netCDF4.Dataset` to instances of the `netCDF4.CompoundType` class.""" __pdoc__['Dataset.vltypes']=\ """The `vltypes` dictionary maps the names of variable-length types defined for the `netCDF4.Group` or `netCDF4.Dataset` to instances of the `netCDF4.VLType` class.""" __pdoc__['Dataset.enumtypes']=\ """The `enumtypes` dictionary maps the names of Enum types defined for the `netCDF4.Group` or `netCDF4.Dataset` to instances of the `netCDF4.EnumType` class.""" __pdoc__['Dataset.data_model']=\ """`data_model` describes the netCDF data model version, one of `NETCDF3_CLASSIC`, `NETCDF4`, `NETCDF4_CLASSIC`, `NETCDF3_64BIT_OFFSET` or `NETCDF3_64BIT_DATA`.""" __pdoc__['Dataset.file_format']=\ """same as `data_model`, retained for backwards compatibility.""" __pdoc__['Dataset.disk_format']=\ """`disk_format` describes the underlying file format, one of `NETCDF3`, `HDF5`, `HDF4`, `PNETCDF`, `DAP2`, `DAP4` or `UNDEFINED`. Only available if using netcdf C library version >= 4.3.1, otherwise will always return `UNDEFINED`.""" __pdoc__['Dataset.parent']=\ """`parent` is a reference to the parent `netCDF4.Group` instance. `None` for the root group or `netCDF4.Dataset` instance""" __pdoc__['Dataset.path']=\ """`path` shows the location of the `netCDF4.Group` in the `netCDF4.Dataset` in a unix directory format (the names of groups in the hierarchy separated by backslashes). A `netCDF4.Dataset` instance is the root group, so the path is simply `'/'`.""" __pdoc__['Dataset.keepweakref']=\ """If `True`, child Dimension and Variables objects only keep weak references to the parent Dataset or Group.""" def __init__(self, filename, mode='r', clobber=True, format='NETCDF4', diskless=False, persist=False, keepweakref=False, memory=None, encoding=None, parallel=False, Comm comm=None, Info info=None, **kwargs): """ **`__init__(self, filename, mode="r", clobber=True, diskless=False, persist=False, keepweakref=False, format='NETCDF4')`** `netCDF4.Dataset` constructor. **`filename`**: Name of netCDF file to hold dataset. Can also be a python 3 pathlib instance or the URL of an OpenDAP dataset. When memory is set this is just used to set the `filepath()`. **`mode`**: access mode. `r` means read-only; no data can be modified. `w` means write; a new file is created, an existing file with the same name is deleted. `a` and `r+` mean append (in analogy with serial files); an existing file is opened for reading and writing. Appending `s` to modes `w`, `r+` or `a` will enable unbuffered shared access to `NETCDF3_CLASSIC`, `NETCDF3_64BIT_OFFSET` or `NETCDF3_64BIT_DATA` formatted files. 
Unbuffered access may be useful even if you don't need shared access, since it may be faster for programs that don't access data sequentially. This option is ignored for `NETCDF4` and `NETCDF4_CLASSIC` formatted files.

**`clobber`**: if `True` (default), opening a file with `mode='w'` will clobber an existing file with the same name. If `False`, an exception will be raised if a file with the same name already exists.

**`format`**: underlying file format (one of `'NETCDF4', 'NETCDF4_CLASSIC', 'NETCDF3_CLASSIC'`, `'NETCDF3_64BIT_OFFSET'` or `'NETCDF3_64BIT_DATA'`). Only relevant if `mode = 'w'` (if `mode = 'r','a'` or `'r+'` the file format is automatically detected). Default `'NETCDF4'`, which means the data is stored in an HDF5 file, using netCDF 4 API features. Setting `format='NETCDF4_CLASSIC'` will create an HDF5 file, using only netCDF 3 compatible API features. netCDF 3 clients must be recompiled and linked against the netCDF 4 library to read files in `NETCDF4_CLASSIC` format. `'NETCDF3_CLASSIC'` is the classic netCDF 3 file format that does not handle 2+ GB files. `'NETCDF3_64BIT_OFFSET'` is the 64-bit offset version of the netCDF 3 file format, which fully supports 2+ GB files, but is only compatible with clients linked against netCDF version 3.6.0 or later. `'NETCDF3_64BIT_DATA'` is the 64-bit data version of the netCDF 3 file format, which supports 64-bit dimension sizes plus unsigned and 64-bit integer data types, but is only compatible with clients linked against netCDF version 4.4.0 or later.

**`diskless`**: If `True`, create diskless (in memory) file. This is an experimental feature added to the C library after the netcdf-4.2 release.

**`persist`**: if `diskless=True`, persist file to disk when closed (default `False`).

**`keepweakref`**: if `True`, child Dimension and Variable instances will keep weak references to the parent Dataset or Group object. Default is `False`, which means strong references will be kept. Having Dimension and Variable instances keep a strong reference to the parent Dataset instance, which in turn keeps a reference to child Dimension and Variable instances, creates circular references. Circular references complicate garbage collection, which may mean increased memory usage for programs that create many Dataset instances with lots of Variables. It also will result in the Dataset object never being deleted, which means it may keep open files alive as well. Setting `keepweakref=True` allows Dataset instances to be garbage collected as soon as they go out of scope, potentially reducing memory usage and open file handles. However, in many cases this is not desirable, since the associated Variable instances may still be needed, but are rendered unusable when the parent Dataset instance is garbage collected.

**`memory`**: if not `None`, open file with contents taken from this block of memory. Must be a sequence of bytes. Note this only works with "r" mode.

**`encoding`**: encoding used to encode filename string into bytes. Default is None (`sys.getfilesystemencoding()` is used).

**`parallel`**: open for parallel access using MPI (requires mpi4py and parallel-enabled netcdf-c and hdf5 libraries). Default is `False`. If `True`, `comm` and `info` kwargs may also be specified.

**`comm`**: MPI_Comm object for parallel access. Default `None`, which means MPI_COMM_WORLD will be used. Ignored if `parallel=False`.

**`info`**: MPI_Info object for parallel access. Default `None`, which means MPI_INFO_NULL will be used. Ignored if `parallel=False`.
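For illustration, a minimal create/read round trip might look like this (the
file name `'example.nc'` is hypothetical):

:::python
>>> from netCDF4 import Dataset
>>> nc = Dataset('example.nc', mode='w', format='NETCDF4')
>>> nc.close()
>>> # re-open read-only; the format is detected automatically
>>> with Dataset('example.nc', mode='r') as nc:
...     print(nc.data_model)
NETCDF4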
""" cdef int grpid, ierr, numgrps, numdims, numvars cdef char *path cdef char namstring[NC_MAX_NAME+1] IF HAS_NC_PAR: cdef MPI_Comm mpicomm cdef MPI_Info mpiinfo memset(&self._buffer, 0, sizeof(self._buffer)) # flag to indicate that Variables in this Dataset support orthogonal indexing. self.__orthogonal_indexing__ = True if diskless and __netcdf4libversion__ < '4.2.1': #diskless = False # don't raise error, instead silently ignore raise ValueError('diskless mode requires netcdf lib >= 4.2.1, you have %s' % __netcdf4libversion__) # convert filename into string (from os.path object for example), # encode into bytes. if encoding is None: encoding = sys.getfilesystemencoding() bytestr = _strencode(_tostr(filename), encoding=encoding) path = bytestr if memory is not None and (mode != 'r' or type(memory) != bytes): raise ValueError('memory mode only works with \'r\' modes and must be `bytes`') if parallel: IF HAS_NC_PAR != 1: msg='parallel mode requires MPI enabled netcdf-c' raise ValueError(msg) if format != 'NETCDF4': msg='parallel mode only works with format=NETCDF4' raise ValueError(msg) if comm is not None: mpicomm = comm.ob_mpi else: mpicomm = MPI_COMM_WORLD if info is not None: mpiinfo = info.ob_mpi else: mpiinfo = MPI_INFO_NULL if mode == 'w': _set_default_format(format=format) if clobber: if parallel: IF HAS_NC_PAR: ierr = nc_create_par(path, NC_CLOBBER | NC_MPIIO, \ mpicomm, mpiinfo, &grpid) ELSE: pass elif diskless: if persist: ierr = nc_create(path, NC_WRITE | NC_CLOBBER | NC_DISKLESS , &grpid) else: ierr = nc_create(path, NC_CLOBBER | NC_DISKLESS , &grpid) else: ierr = nc_create(path, NC_CLOBBER, &grpid) else: if parallel: IF HAS_NC_PAR: ierr = nc_create_par(path, NC_NOCLOBBER | NC_MPIIO, \ mpicomm, mpiinfo, &grpid) ELSE: pass elif diskless: if persist: ierr = nc_create(path, NC_WRITE | NC_NOCLOBBER | NC_DISKLESS , &grpid) else: ierr = nc_create(path, NC_NOCLOBBER | NC_DISKLESS , &grpid) else: ierr = nc_create(path, NC_NOCLOBBER, &grpid) # reset default format to netcdf3 - this is a workaround # for issue 170 (nc_open'ing a DAP dataset after switching # format to NETCDF4). This bug should be fixed in version # 4.3.0 of the netcdf library (add a version check here?). _set_default_format(format='NETCDF3_64BIT_OFFSET') elif mode == 'r': if memory is not None: IF HAS_NC_OPEN_MEM: # Store reference to memory result = PyObject_GetBuffer(memory, &self._buffer, PyBUF_SIMPLE | PyBUF_ANY_CONTIGUOUS) if result != 0: raise ValueError("Unable to retrieve Buffer from %s" % (memory,)) ierr = nc_open_mem(path, 0, self._buffer.len, self._buffer.buf, &grpid) ELSE: msg = """ nc_open_mem method not enabled. 
To enable, install Cython, make sure you have version 4.4.1 or higher of the netcdf C lib, and rebuild netcdf4-python.""" raise ValueError(msg) elif parallel: IF HAS_NC_PAR: ierr = nc_open_par(path, NC_NOWRITE | NC_MPIIO, \ mpicomm, mpiinfo, &grpid) ELSE: pass elif diskless: ierr = nc_open(path, NC_NOWRITE | NC_DISKLESS, &grpid) else: ierr = nc_open(path, NC_NOWRITE, &grpid) elif mode == 'r+' or mode == 'a': if parallel: IF HAS_NC_PAR: ierr = nc_open_par(path, NC_WRITE | NC_MPIIO, \ mpicomm, mpiinfo, &grpid) ELSE: pass elif diskless: ierr = nc_open(path, NC_WRITE | NC_DISKLESS, &grpid) else: ierr = nc_open(path, NC_WRITE, &grpid) elif mode == 'as' or mode == 'r+s': if parallel: # NC_SHARE ignored IF HAS_NC_PAR: ierr = nc_open_par(path, NC_WRITE | NC_MPIIO, \ mpicomm, mpiinfo, &grpid) ELSE: pass elif diskless: ierr = nc_open(path, NC_SHARE | NC_DISKLESS, &grpid) else: ierr = nc_open(path, NC_SHARE, &grpid) elif mode == 'ws': if clobber: if parallel: # NC_SHARE ignored IF HAS_NC_PAR: ierr = nc_create_par(path, NC_CLOBBER | NC_MPIIO, \ mpicomm, mpiinfo, &grpid) ELSE: pass elif diskless: if persist: ierr = nc_create(path, NC_WRITE | NC_SHARE | NC_CLOBBER | NC_DISKLESS , &grpid) else: ierr = nc_create(path, NC_SHARE | NC_CLOBBER | NC_DISKLESS , &grpid) else: ierr = nc_create(path, NC_SHARE | NC_CLOBBER, &grpid) else: if parallel: # NC_SHARE ignored IF HAS_NC_PAR: ierr = nc_create_par(path, NC_NOCLOBBER | NC_MPIIO, \ mpicomm, mpiinfo, &grpid) ELSE: pass elif diskless: if persist: ierr = nc_create(path, NC_WRITE | NC_SHARE | NC_NOCLOBBER | NC_DISKLESS , &grpid) else: ierr = nc_create(path, NC_SHARE | NC_NOCLOBBER | NC_DISKLESS , &grpid) else: ierr = nc_create(path, NC_SHARE | NC_NOCLOBBER, &grpid) else: raise ValueError("mode must be 'w', 'r', 'a' or 'r+', got '%s'" % mode) _ensure_nc_success(ierr, err_cls=IOError, filename=path) # data model and file format attributes self.data_model = _get_format(grpid) # data_model attribute used to be file_format (versions < 1.0.8), retain # file_format for backwards compatibility. self.file_format = self.data_model self.disk_format = _get_full_format(grpid) # diskless read access only works with NETCDF_CLASSIC (for now) #ncopen = mode.startswith('a') or mode.startswith('r') #if diskless and self.data_model != 'NETCDF3_CLASSIC' and ncopen: # raise ValueError("diskless access only supported for NETCDF3_CLASSIC format") self._grpid = grpid self._isopen = 1 self.path = '/' self.parent = None self.keepweakref = keepweakref # get compound, vlen and enum types in the root Group. self.cmptypes, self.vltypes, self.enumtypes = _get_types(self) # get dimensions in the root group. self.dimensions = _get_dims(self) # get variables in the root Group. self.variables = _get_vars(self) # get groups in the root Group. if self.data_model == 'NETCDF4': self.groups = _get_grps(self) else: self.groups = OrderedDict() # these allow Dataset objects to be used via a "with" statement. def __enter__(self): return self def __exit__(self,atype,value,traceback): self.close() def __getitem__(self, elem): # return variable or group defined in relative path. # split out group names in unix path. elem = posixpath.normpath(elem) # last name in path, could be a variable or group dirname, lastname = posixpath.split(elem) nestedgroups = dirname.split('/') group = self # iterate over groups in path. for g in nestedgroups: if g: group = group.groups[g] # return last one, either a group or a variable. 
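# For example (a sketch; the group and variable names are hypothetical):
#   nc['/forecasts/model1/temp'] returns the Variable 'temp' inside the
#   group 'forecasts/model1', while nc['forecasts'] returns the Group itself.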
if lastname in group.groups: return group.groups[lastname] elif lastname in group.variables: return group.variables[lastname] else: raise IndexError('%s not found in %s' % (lastname,group.path)) def filepath(self,encoding=None): """ **`filepath(self,encoding=None)`** Get the file system path (or the opendap URL) which was used to open/create the Dataset. Requires netcdf >= 4.1.2. The path is decoded into a string using `sys.getfilesystemencoding()` by default, this can be changed using the `encoding` kwarg.""" cdef int ierr cdef size_t pathlen cdef char *c_path if encoding is None: encoding = sys.getfilesystemencoding() IF HAS_NC_INQ_PATH: with nogil: ierr = nc_inq_path(self._grpid, &pathlen, NULL) _ensure_nc_success(ierr) c_path = malloc(sizeof(char) * (pathlen + 1)) if not c_path: raise MemoryError() try: with nogil: ierr = nc_inq_path(self._grpid, &pathlen, c_path) _ensure_nc_success(ierr) py_path = c_path[:pathlen] # makes a copy of pathlen bytes from c_string finally: free(c_path) return py_path.decode(encoding) ELSE: msg = """ filepath method not enabled. To enable, install Cython, make sure you have version 4.1.2 or higher of the netcdf C lib, and rebuild netcdf4-python.""" raise ValueError(msg) def __repr__(self): if python3: return self.__unicode__() else: return unicode(self).encode('utf-8') def __unicode__(self): ncdump = ['%r\n' % type(self)] dimnames = tuple([_tostr(dimname)+'(%s)'%len(self.dimensions[dimname])\ for dimname in self.dimensions.keys()]) varnames = tuple(\ [_tostr(self.variables[varname].dtype)+' \033[4m'+_tostr(varname)+'\033[0m'+ (((_tostr(self.variables[varname].dimensions) .replace("u'",""))\ .replace("'",""))\ .replace(", ",","))\ .replace(",)",")") for varname in self.variables.keys()]) grpnames = tuple([_tostr(grpname) for grpname in self.groups.keys()]) if self.path == '/': ncdump.append('root group (%s data model, file format %s):\n' % (self.data_model, self.disk_format)) else: ncdump.append('group %s:\n' % self.path) attrs = [' %s: %s\n' % (name,self.getncattr(name)) for name in\ self.ncattrs()] ncdump = ncdump + attrs ncdump.append(' dimensions(sizes): %s\n' % ', '.join(dimnames)) ncdump.append(' variables(dimensions): %s\n' % ', '.join(varnames)) ncdump.append(' groups: %s\n' % ', '.join(grpnames)) return ''.join(ncdump) def _close(self, check_err): cdef int ierr = nc_close(self._grpid) if check_err: _ensure_nc_success(ierr) self._isopen = 0 # indicates file already closed, checked by __dealloc__ # Only release buffer if close succeeded # per impl of PyBuffer_Release: https://github.com/python/cpython/blob/master/Objects/abstract.c#L667 # view.obj is checked, ref on obj is decremented and obj will be null'd out PyBuffer_Release(&self._buffer) def close(self): """ **`close(self)`** Close the Dataset. """ self._close(True) def isopen(self): """ **`close(self)`** is the Dataset open or closed? """ return bool(self._isopen) def __dealloc__(self): # close file when there are no references to object left if self._isopen: self._close(False) def __reduce__(self): # raise error is user tries to pickle a Dataset object. raise NotImplementedError('Dataset is not picklable') def sync(self): """ **`sync(self)`** Writes all buffered data in the `netCDF4.Dataset` to the disk file.""" _ensure_nc_success(nc_sync(self._grpid)) def _redef(self): cdef int ierr ierr = nc_redef(self._grpid) def _enddef(self): cdef int ierr ierr = nc_enddef(self._grpid) def set_fill_on(self): """ **`set_fill_on(self)`** Sets the fill mode for a `netCDF4.Dataset` open for writing to `on`. 
This causes data to be pre-filled with fill values. The fill values can be controlled by the variable's `_Fill_Value` attribute, but is usually sufficient to the use the netCDF default `_Fill_Value` (defined separately for each variable type). The default behavior of the netCDF library corresponds to `set_fill_on`. Data which are equal to the `_Fill_Value` indicate that the variable was created, but never written to.""" cdef int oldmode _ensure_nc_success(nc_set_fill(self._grpid, NC_FILL, &oldmode)) def set_fill_off(self): """ **`set_fill_off(self)`** Sets the fill mode for a `netCDF4.Dataset` open for writing to `off`. This will prevent the data from being pre-filled with fill values, which may result in some performance improvements. However, you must then make sure the data is actually written before being read.""" cdef int oldmode _ensure_nc_success(nc_set_fill(self._grpid, NC_NOFILL, &oldmode)) def createDimension(self, dimname, size=None): """ **`createDimension(self, dimname, size=None)`** Creates a new dimension with the given `dimname` and `size`. `size` must be a positive integer or `None`, which stands for "unlimited" (default is `None`). Specifying a size of 0 also results in an unlimited dimension. The return value is the `netCDF4.Dimension` class instance describing the new dimension. To determine the current maximum size of the dimension, use the `len` function on the `netCDF4.Dimension` instance. To determine if a dimension is 'unlimited', use the `netCDF4.Dimension.isunlimited` method of the `netCDF4.Dimension` instance.""" self.dimensions[dimname] = Dimension(self, dimname, size=size) return self.dimensions[dimname] def renameDimension(self, oldname, newname): """ **`renameDimension(self, oldname, newname)`** rename a `netCDF4.Dimension` named `oldname` to `newname`.""" cdef char *namstring bytestr = _strencode(newname) namstring = bytestr if self.data_model != 'NETCDF4': self._redef() try: dim = self.dimensions[oldname] except KeyError: raise KeyError('%s not a valid dimension name' % oldname) ierr = nc_rename_dim(self._grpid, dim._dimid, namstring) if self.data_model != 'NETCDF4': self._enddef() _ensure_nc_success(ierr) # remove old key from dimensions dict. self.dimensions.pop(oldname) # add new key. self.dimensions[newname] = dim # Variable.dimensions is determined by a method that # looks in the file, so no need to manually update. def createCompoundType(self, datatype, datatype_name): """ **`createCompoundType(self, datatype, datatype_name)`** Creates a new compound data type named `datatype_name` from the numpy dtype object `datatype`. ***Note***: If the new compound data type contains other compound data types (i.e. it is a 'nested' compound type, where not all of the elements are homogeneous numeric data types), then the 'inner' compound types **must** be created first. The return value is the `netCDF4.CompoundType` class instance describing the new datatype.""" self.cmptypes[datatype_name] = CompoundType(self, datatype,\ datatype_name) return self.cmptypes[datatype_name] def createVLType(self, datatype, datatype_name): """ **`createVLType(self, datatype, datatype_name)`** Creates a new VLEN data type named `datatype_name` from a numpy dtype object `datatype`. 
The return value is the `netCDF4.VLType` class instance describing the new datatype.""" self.vltypes[datatype_name] = VLType(self, datatype, datatype_name) return self.vltypes[datatype_name] def createEnumType(self, datatype, datatype_name, enum_dict): """ **`createEnumType(self, datatype, datatype_name, enum_dict)`** Creates a new Enum data type named `datatype_name` from a numpy integer dtype object `datatype`, and a python dictionary defining the enum fields and values. The return value is the `netCDF4.EnumType` class instance describing the new datatype.""" self.enumtypes[datatype_name] = EnumType(self, datatype, datatype_name, enum_dict) return self.enumtypes[datatype_name] def createVariable(self, varname, datatype, dimensions=(), zlib=False, complevel=4, shuffle=True, fletcher32=False, contiguous=False, chunksizes=None, endian='native', least_significant_digit=None, fill_value=None, chunk_cache=None): """ **`createVariable(self, varname, datatype, dimensions=(), zlib=False, complevel=4, shuffle=True, fletcher32=False, contiguous=False, chunksizes=None, endian='native', least_significant_digit=None, fill_value=None)`** Creates a new variable with the given `varname`, `datatype`, and `dimensions`. If dimensions are not given, the variable is assumed to be a scalar. If `varname` is specified as a path, using forward slashes as in unix to separate components, then intermediate groups will be created as necessary For example, `createVariable('/GroupA/GroupB/VarC', float, ('x','y'))` will create groups `GroupA` and `GroupA/GroupB`, plus the variable `GroupA/GroupB/VarC`, if the preceding groups don't already exist. The `datatype` can be a numpy datatype object, or a string that describes a numpy dtype object (like the `dtype.str` attribute of a numpy array). Supported specifiers include: `'S1' or 'c' (NC_CHAR), 'i1' or 'b' or 'B' (NC_BYTE), 'u1' (NC_UBYTE), 'i2' or 'h' or 's' (NC_SHORT), 'u2' (NC_USHORT), 'i4' or 'i' or 'l' (NC_INT), 'u4' (NC_UINT), 'i8' (NC_INT64), 'u8' (NC_UINT64), 'f4' or 'f' (NC_FLOAT), 'f8' or 'd' (NC_DOUBLE)`. `datatype` can also be a `netCDF4.CompoundType` instance (for a structured, or compound array), a `netCDF4.VLType` instance (for a variable-length array), or the python `str` builtin (for a variable-length string array). Numpy string and unicode datatypes with length greater than one are aliases for `str`. Data from netCDF variables is presented to python as numpy arrays with the corresponding data type. `dimensions` must be a tuple containing dimension names (strings) that have been defined previously using `netCDF4.Dataset.createDimension`. The default value is an empty tuple, which means the variable is a scalar. If the optional keyword `zlib` is `True`, the data will be compressed in the netCDF file using gzip compression (default `False`). The optional keyword `complevel` is an integer between 1 and 9 describing the level of compression desired (default 4). Ignored if `zlib=False`. If the optional keyword `shuffle` is `True`, the HDF5 shuffle filter will be applied before compressing the data (default `True`). This significantly improves compression. Default is `True`. Ignored if `zlib=False`. If the optional keyword `fletcher32` is `True`, the Fletcher32 HDF5 checksum algorithm is activated to detect errors. Default `False`. If the optional keyword `contiguous` is `True`, the variable data is stored contiguously on disk. Default `False`. Setting to `True` for a variable with an unlimited dimension will trigger an error. 
The optional keyword `chunksizes` can be used to manually specify the HDF5 chunksizes for each dimension of the variable. A detailed discussion of HDF chunking and I/O performance is available [here](http://www.hdfgroup.org/HDF5/doc/H5.user/Chunking.html). Basically, you want the chunk size for each dimension to match as closely as possible the size of the data block that users will read from the file. `chunksizes` cannot be set if `contiguous=True`. The optional keyword `endian` can be used to control whether the data is stored in little or big endian format on disk. Possible values are `little, big` or `native` (default). The library will automatically handle endian conversions when the data is read, but if the data is always going to be read on a computer with the opposite format as the one used to create the file, there may be some performance advantage to be gained by setting the endian-ness. The `zlib, complevel, shuffle, fletcher32, contiguous, chunksizes` and `endian` keywords are silently ignored for netCDF 3 files that do not use HDF5. The optional keyword `fill_value` can be used to override the default netCDF `_FillValue` (the value that the variable gets filled with before any data is written to it, defaults given in `netCDF4.default_fillvals`). If fill_value is set to `False`, then the variable is not pre-filled. If the optional keyword parameter `least_significant_digit` is specified, variable data will be truncated (quantized). In conjunction with `zlib=True` this produces 'lossy', but significantly more efficient compression. For example, if `least_significant_digit=1`, data will be quantized using `numpy.around(scale*data)/scale`, where scale = 2**bits, and bits is determined so that a precision of 0.1 is retained (in this case bits=4). From the [PSD metadata conventions](http://www.esrl.noaa.gov/psd/data/gridded/conventions/cdc_netcdf_standard.shtml): "least_significant_digit -- power of ten of the smallest decimal place in unpacked data that is a reliable value." Default is `None`, or no quantization, or 'lossless' compression. When creating variables in a `NETCDF4` or `NETCDF4_CLASSIC` formatted file, HDF5 creates something called a 'chunk cache' for each variable. The default size of the chunk cache may be large enough to completely fill available memory when creating thousands of variables. The optional keyword `chunk_cache` allows you to reduce (or increase) the size of the default chunk cache when creating a variable. The setting only persists as long as the Dataset is open - you can use the set_var_chunk_cache method to change it the next time the Dataset is opened. Warning - messing with this parameter can seriously degrade performance. The return value is the `netCDF4.Variable` class instance describing the new variable. A list of names corresponding to netCDF variable attributes can be obtained with the `netCDF4.Variable` method `netCDF4.Variable.ncattrs`. A dictionary containing all the netCDF attribute name/value pairs is provided by the `__dict__` attribute of a `netCDF4.Variable` instance. `netCDF4.Variable` instances behave much like array objects. Data can be assigned to or retrieved from a variable with indexing and slicing operations on the `netCDF4.Variable` instance. A `netCDF4.Variable` instance has six Dataset standard attributes: `dimensions, dtype, shape, ndim, name` and `least_significant_digit`. Application programs should never modify these attributes. 
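For example, a minimal sketch (the dimension and variable names are
hypothetical, and `nc` is assumed to be a `netCDF4.Dataset` open for writing):

:::python
>>> time = nc.createDimension('time', None)
>>> lat = nc.createDimension('lat', 73)
>>> temp = nc.createVariable('temp', 'f4', ('time', 'lat'), zlib=True, fill_value=-999.0)
>>> temp.units = 'K'
>>> temp[0, :] = 280.0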
The `dimensions` attribute is a tuple containing the names of the dimensions associated with this variable. The `dtype` attribute is a string describing the variable's data type (`i4, f8, S1,` etc). The `shape` attribute is a tuple describing the current sizes of all the variable's dimensions. The `name` attribute is a string containing the name of the Variable instance. The `least_significant_digit` attributes describes the power of ten of the smallest decimal place in the data the contains a reliable value. assigned to the `netCDF4.Variable` instance. If `None`, the data is not truncated. The `ndim` attribute is the number of variable dimensions.""" # if varname specified as a path, split out group names. varname = posixpath.normpath(varname) dirname, varname = posixpath.split(varname) # varname is last. # create parent groups (like mkdir -p). if not dirname: group = self else: group = self.createGroup(dirname) # create variable. group.variables[varname] = Variable(group, varname, datatype, dimensions=dimensions, zlib=zlib, complevel=complevel, shuffle=shuffle, fletcher32=fletcher32, contiguous=contiguous, chunksizes=chunksizes, endian=endian, least_significant_digit=least_significant_digit, fill_value=fill_value, chunk_cache=chunk_cache) return group.variables[varname] def renameVariable(self, oldname, newname): """ **`renameVariable(self, oldname, newname)`** rename a `netCDF4.Variable` named `oldname` to `newname`""" cdef char *namstring try: var = self.variables[oldname] except KeyError: raise KeyError('%s not a valid variable name' % oldname) bytestr = _strencode(newname) namstring = bytestr if self.data_model != 'NETCDF4': self._redef() ierr = nc_rename_var(self._grpid, var._varid, namstring) if self.data_model != 'NETCDF4': self._enddef() _ensure_nc_success(ierr) # remove old key from dimensions dict. self.variables.pop(oldname) # add new key. self.variables[newname] = var def createGroup(self, groupname): """ **`createGroup(self, groupname)`** Creates a new `netCDF4.Group` with the given `groupname`. If `groupname` is specified as a path, using forward slashes as in unix to separate components, then intermediate groups will be created as necessary (analogous to `mkdir -p` in unix). For example, `createGroup('/GroupA/GroupB/GroupC')` will create `GroupA`, `GroupA/GroupB`, and `GroupA/GroupB/GroupC`, if they don't already exist. If the specified path describes a group that already exists, no error is raised. The return value is a `netCDF4.Group` class instance.""" # if group specified as a path, split out group names groupname = posixpath.normpath(groupname) nestedgroups = groupname.split('/') group = self # loop over group names, create parent groups if they do not already # exist. for g in nestedgroups: if not g: continue if g not in group.groups: group.groups[g] = Group(group, g) group = group.groups[g] # if group already exists, just return the group # (prior to 1.1.8, this would have raised an error) return group def ncattrs(self): """ **`ncattrs(self)`** return netCDF global attribute names for this `netCDF4.Dataset` or `netCDF4.Group` in a list.""" return _get_att_names(self._grpid, NC_GLOBAL) def setncattr(self,name,value): """ **`setncattr(self,name,value)`** set a netCDF dataset or group attribute using name,value pair. 
Use if you need to set a netCDF attribute with the with the same name as one of the reserved python attributes.""" if self.data_model != 'NETCDF4': self._redef() _set_att(self, NC_GLOBAL, name, value) if self.data_model != 'NETCDF4': self._enddef() def setncattr_string(self,name,value): """ **`setncattr_string(self,name,value)`** set a netCDF dataset or group string attribute using name,value pair. Use if you need to ensure that a netCDF attribute is created with type `NC_STRING` if the file format is `NETCDF4`. Use if you need to set an attribute to an array of variable-length strings.""" cdef nc_type xtype xtype=-99 if self.data_model != 'NETCDF4': msg='file format does not support NC_STRING attributes' raise IOError(msg) _set_att(self, NC_GLOBAL, name, value, xtype=xtype, force_ncstring=True) def setncatts(self,attdict): """ **`setncatts(self,attdict)`** set a bunch of netCDF dataset or group attributes at once using a python dictionary. This may be faster when setting a lot of attributes for a `NETCDF3` formatted file, since nc_redef/nc_enddef is not called in between setting each attribute""" if self.data_model != 'NETCDF4': self._redef() for name, value in attdict.items(): _set_att(self, NC_GLOBAL, name, value) if self.data_model != 'NETCDF4': self._enddef() def getncattr(self,name,encoding='utf-8'): """ **`getncattr(self,name)`** retrieve a netCDF dataset or group attribute. Use if you need to get a netCDF attribute with the same name as one of the reserved python attributes. option kwarg `encoding` can be used to specify the character encoding of a string attribute (default is `utf-8`).""" return _get_att(self, NC_GLOBAL, name, encoding=encoding) def __delattr__(self,name): # if it's a netCDF attribute, remove it if name not in _private_atts: self.delncattr(name) else: raise AttributeError( "'%s' is one of the reserved attributes %s, cannot delete. Use delncattr instead." % (name, tuple(_private_atts))) def delncattr(self, name): """ **`delncattr(self,name,value)`** delete a netCDF dataset or group attribute. Use if you need to delete a netCDF attribute with the same name as one of the reserved python attributes.""" cdef char *attname cdef int ierr bytestr = _strencode(name) attname = bytestr if self.data_model != 'NETCDF4': self._redef() ierr = nc_del_att(self._grpid, NC_GLOBAL, attname) if self.data_model != 'NETCDF4': self._enddef() _ensure_nc_success(ierr) def __setattr__(self,name,value): # if name in _private_atts, it is stored at the python # level and not in the netCDF file. if name not in _private_atts: self.setncattr(name, value) elif not name.endswith('__'): if hasattr(self,name): raise AttributeError( "'%s' is one of the reserved attributes %s, cannot rebind. Use setncattr instead." % (name, tuple(_private_atts))) else: self.__dict__[name]=value def __getattr__(self,name): # if name in _private_atts, it is stored at the python # level and not in the netCDF file. if name.startswith('__') and name.endswith('__'): # if __dict__ requested, return a dict with netCDF attributes. 
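# Sketch of the resulting behaviour (the attribute name 'history' below is
# hypothetical):
#   ds.history   -> equivalent to ds.getncattr('history'), for any name that
#                   is not one of the reserved names in _private_atts
#   ds.__dict__  -> an OrderedDict of all netCDF global attribute
#                   name/value pairs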
if name == '__dict__': names = self.ncattrs() values = [] for name in names: values.append(_get_att(self, NC_GLOBAL, name)) return OrderedDict(zip(names,values)) else: raise AttributeError elif name in _private_atts: return self.__dict__[name] else: return self.getncattr(name) def renameAttribute(self, oldname, newname): """ **`renameAttribute(self, oldname, newname)`** rename a `netCDF4.Dataset` or `netCDF4.Group` attribute named `oldname` to `newname`.""" cdef char *oldnamec cdef char *newnamec bytestr = _strencode(oldname) oldnamec = bytestr bytestr = _strencode(newname) newnamec = bytestr _ensure_nc_success(nc_rename_att(self._grpid, NC_GLOBAL, oldnamec, newnamec)) def renameGroup(self, oldname, newname): """ **`renameGroup(self, oldname, newname)`** rename a `netCDF4.Group` named `oldname` to `newname` (requires netcdf >= 4.3.1).""" cdef char *newnamec IF HAS_RENAME_GRP: bytestr = _strencode(newname) newnamec = bytestr try: grp = self.groups[oldname] except KeyError: raise KeyError('%s not a valid group name' % oldname) _ensure_nc_success(nc_rename_grp(grp._grpid, newnamec)) # remove old key from groups dict. self.groups.pop(oldname) # add new key. self.groups[newname] = grp ELSE: msg = """ renameGroup method not enabled. To enable, install Cython, make sure you have version 4.3.1 or higher of the netcdf C lib, and rebuild netcdf4-python.""" raise ValueError(msg) def set_auto_chartostring(self, value): """ **`set_auto_chartostring(self, True_or_False)`** Call `netCDF4.Variable.set_auto_chartostring` for all variables contained in this `netCDF4.Dataset` or `netCDF4.Group`, as well as for all variables in all its subgroups. **`True_or_False`**: Boolean determining if automatic conversion of all character arrays <--> string arrays should be performed for character variables (variables of type `NC_CHAR` or `S1`) with the `_Encoding` attribute set. ***Note***: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour. """ # this is a hack to make inheritance work in MFDataset # (which stores variables in _vars) _vars = self.variables if _vars is None: _vars = self._vars for var in _vars.values(): var.set_auto_chartostring(value) for groups in _walk_grps(self): for group in groups: for var in group.variables.values(): var.set_auto_chartostring(value) def set_auto_maskandscale(self, value): """ **`set_auto_maskandscale(self, True_or_False)`** Call `netCDF4.Variable.set_auto_maskandscale` for all variables contained in this `netCDF4.Dataset` or `netCDF4.Group`, as well as for all variables in all its subgroups. **`True_or_False`**: Boolean determining if automatic conversion to masked arrays and variable scaling shall be applied for all variables. ***Note***: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour. """ # this is a hack to make inheritance work in MFDataset # (which stores variables in _vars) _vars = self.variables if _vars is None: _vars = self._vars for var in _vars.values(): var.set_auto_maskandscale(value) for groups in _walk_grps(self): for group in groups: for var in group.variables.values(): var.set_auto_maskandscale(value) def set_auto_mask(self, value): """ **`set_auto_mask(self, True_or_False)`** Call `netCDF4.Variable.set_auto_mask` for all variables contained in this `netCDF4.Dataset` or `netCDF4.Group`, as well as for all variables in all its subgroups. 
**`True_or_False`**: Boolean determining if automatic conversion to masked arrays shall be applied for all variables. ***Note***: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour. """ for var in self.variables.values(): var.set_auto_mask(value) for groups in _walk_grps(self): for group in groups: for var in group.variables.values(): var.set_auto_mask(value) def set_auto_scale(self, value): """ **`set_auto_scale(self, True_or_False)`** Call `netCDF4.Variable.set_auto_scale` for all variables contained in this `netCDF4.Dataset` or `netCDF4.Group`, as well as for all variables in all its subgroups. **`True_or_False`**: Boolean determining if automatic variable scaling shall be applied for all variables. ***Note***: Calling this function only affects existing variables. Variables created after calling this function will follow the default behaviour. """ # this is a hack to make inheritance work in MFDataset # (which stores variables in _vars) _vars = self.variables if _vars is None: _vars = self._vars for var in _vars.values(): var.set_auto_scale(value) for groups in _walk_grps(self): for group in groups: for var in group.variables.values(): var.set_auto_scale(value) def get_variables_by_attributes(self, **kwargs): """ **`get_variables_by_attribute(self, **kwargs)`** Returns a list of variables that match specific conditions. Can pass in key=value parameters and variables are returned that contain all of the matches. For example, :::python >>> # Get variables with x-axis attribute. >>> vs = nc.get_variables_by_attributes(axis='X') >>> # Get variables with matching "standard_name" attribute >>> vs = nc.get_variables_by_attributes(standard_name='northward_sea_water_velocity') Can pass in key=callable parameter and variables are returned if the callable returns True. The callable should accept a single parameter, the attribute value. None is given as the attribute value when the attribute does not exist on the variable. For example, :::python >>> # Get Axis variables >>> vs = nc.get_variables_by_attributes(axis=lambda v: v in ['X', 'Y', 'Z', 'T']) >>> # Get variables that don't have an "axis" attribute >>> vs = nc.get_variables_by_attributes(axis=lambda v: v is None) >>> # Get variables that have a "grid_mapping" attribute >>> vs = nc.get_variables_by_attributes(grid_mapping=lambda v: v is not None) """ vs = [] has_value_flag = False # this is a hack to make inheritance work in MFDataset # (which stores variables in _vars) _vars = self.variables if _vars is None: _vars = self._vars for vname in _vars: var = _vars[vname] for k, v in kwargs.items(): if callable(v): has_value_flag = v(getattr(var, k, None)) if has_value_flag is False: break elif hasattr(var, k) and getattr(var, k) == v: has_value_flag = True else: has_value_flag = False break if has_value_flag is True: vs.append(_vars[vname]) return vs cdef class Group(Dataset): """ Groups define a hierarchical namespace within a netCDF file. They are analogous to directories in a unix filesystem. Each `netCDF4.Group` behaves like a `netCDF4.Dataset` within a Dataset, and can contain it's own variables, dimensions and attributes (and other Groups). See `netCDF4.Group.__init__` for more details. `netCDF4.Group` inherits from `netCDF4.Dataset`, so all the `netCDF4.Dataset` class methods and variables are available to a `netCDF4.Group` instance (except the `close` method). Additional read-only class variables: **`name`**: String describing the group name. 
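For example, a short sketch (the group names are hypothetical, and `nc` is
assumed to be a writable `netCDF4.Dataset`):

:::python
>>> fcstgrp = nc.createGroup('forecasts')
>>> modelgrp = nc.createGroup('forecasts/model1')
>>> print(modelgrp.path)
/forecasts/model1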
""" # Docstrings for class variables (used by pdoc). __pdoc__['Group.name']=\ """A string describing the name of the `netCDF4.Group`.""" def __init__(self, parent, name, **kwargs): """ **`__init__(self, parent, name)`** `netCDF4.Group` constructor. **`parent`**: `netCDF4.Group` instance for the parent group. If being created in the root group, use a `netCDF4.Dataset` instance. **`name`**: - Name of the group. ***Note***: `netCDF4.Group` instances should be created using the `netCDF4.Dataset.createGroup` method of a `netCDF4.Dataset` instance, or another `netCDF4.Group` instance, not using this class directly. """ cdef char *groupname # flag to indicate that Variables in this Group support orthogonal indexing. self.__orthogonal_indexing__ = True # set data_model and file_format attributes. self.data_model = parent.data_model self.file_format = parent.file_format # full path to Group. self.path = posixpath.join(parent.path, name) # parent group. self.parent = parent # propagate weak reference setting from parent. self.keepweakref = parent.keepweakref if 'id' in kwargs: self._grpid = kwargs['id'] # get compound, vlen and enum types in this Group. self.cmptypes, self.vltypes, self.enumtypes = _get_types(self) # get dimensions in this Group. self.dimensions = _get_dims(self) # get variables in this Group. self.variables = _get_vars(self) # get groups in this Group. self.groups = _get_grps(self) else: bytestr = _strencode(name) groupname = bytestr _ensure_nc_success(nc_def_grp(parent._grpid, groupname, &self._grpid)) self.cmptypes = OrderedDict() self.vltypes = OrderedDict() self.enumtypes = OrderedDict() self.dimensions = OrderedDict() self.variables = OrderedDict() self.groups = OrderedDict() def close(self): """ **`close(self)`** overrides `netCDF4.Dataset` close method which does not apply to `netCDF4.Group` instances, raises IOError.""" raise IOError('cannot close a `netCDF4.Group` (only applies to Dataset)') def _getname(self): # private method to get name associated with instance. cdef int ierr cdef char namstring[NC_MAX_NAME+1] with nogil: ierr = nc_inq_grpname(self._grpid, namstring) _ensure_nc_success(ierr) return namstring.decode('utf-8') property name: """string name of Group instance""" def __get__(self): return self._getname() def __set__(self,value): raise AttributeError("name cannot be altered") cdef class Dimension: """ A netCDF `netCDF4.Dimension` is used to describe the coordinates of a `netCDF4.Variable`. See `netCDF4.Dimension.__init__` for more details. The current maximum size of a `netCDF4.Dimension` instance can be obtained by calling the python `len` function on the `netCDF4.Dimension` instance. The `netCDF4.Dimension.isunlimited` method of a `netCDF4.Dimension` instance can be used to determine if the dimension is unlimited. Read-only class variables: **`name`**: String name, used when creating a `netCDF4.Variable` with `netCDF4.Dataset.createVariable`. **`size`**: Current `netCDF4.Dimension` size (same as `len(d)`, where `d` is a `netCDF4.Dimension` instance). """ cdef public int _dimid, _grpid cdef public _data_model, _name, _grp # Docstrings for class variables (used by pdoc). __pdoc__['Dimension.name']=\ """A string describing the name of the `netCDF4.Dimension` - used when creating a `netCDF4.Variable` instance with `netCDF4.Dataset.createVariable`.""" def __init__(self, grp, name, size=None, **kwargs): """ **`__init__(self, group, name, size=None)`** `netCDF4.Dimension` constructor. **`group`**: `netCDF4.Group` instance to associate with dimension. 
**`name`**: Name of the dimension. **`size`**: Size of the dimension. `None` or 0 means unlimited. (Default `None`). ***Note***: `netCDF4.Dimension` instances should be created using the `netCDF4.Dataset.createDimension` method of a `netCDF4.Group` or `netCDF4.Dataset` instance, not using `netCDF4.Dimension.__init__` directly. """ cdef int ierr cdef char *dimname cdef size_t lendim self._grpid = grp._grpid # make a weakref to group to avoid circular ref (issue 218) # keep strong reference the default behaviour (issue 251) if grp.keepweakref: self._grp = weakref.proxy(grp) else: self._grp = grp self._data_model = grp.data_model self._name = name if 'id' in kwargs: self._dimid = kwargs['id'] else: bytestr = _strencode(name) dimname = bytestr if size is not None: lendim = size else: lendim = NC_UNLIMITED if grp.data_model != 'NETCDF4': grp._redef() ierr = nc_def_dim(self._grpid, dimname, lendim, &self._dimid) if grp.data_model != 'NETCDF4': grp._enddef() _ensure_nc_success(ierr) def _getname(self): # private method to get name associated with instance. cdef int err, _grpid cdef char namstring[NC_MAX_NAME+1] _grpid = self._grp._grpid with nogil: ierr = nc_inq_dimname(_grpid, self._dimid, namstring) _ensure_nc_success(ierr) return namstring.decode('utf-8') property name: """string name of Dimension instance""" def __get__(self): return self._getname() def __set__(self,value): raise AttributeError("name cannot be altered") property size: """current size of Dimension (calls `len` on Dimension instance)""" def __get__(self): return len(self) def __set__(self,value): raise AttributeError("size cannot be altered") def __repr__(self): if python3: return self.__unicode__() else: return unicode(self).encode('utf-8') def __unicode__(self): if not dir(self._grp): return 'Dimension object no longer valid' if self.isunlimited(): return repr(type(self))+" (unlimited): name = '%s', size = %s\n" % (self._name,len(self)) else: return repr(type(self))+": name = '%s', size = %s\n" % (self._name,len(self)) def __len__(self): # len(`netCDF4.Dimension` instance) returns current size of dimension cdef int ierr cdef size_t lengthp with nogil: ierr = nc_inq_dimlen(self._grpid, self._dimid, &lengthp) _ensure_nc_success(ierr) return lengthp def group(self): """ **`group(self)`** return the group that this `netCDF4.Dimension` is a member of.""" return self._grp def isunlimited(self): """ **`isunlimited(self)`** returns `True` if the `netCDF4.Dimension` instance is unlimited, `False` otherwise.""" cdef int ierr, n, numunlimdims, ndims, nvars, ngatts, xdimid cdef int *unlimdimids if self._data_model == 'NETCDF4': ierr = nc_inq_unlimdims(self._grpid, &numunlimdims, NULL) _ensure_nc_success(ierr) if numunlimdims == 0: return False else: unlimdimids = malloc(sizeof(int) * numunlimdims) dimid = self._dimid with nogil: ierr = nc_inq_unlimdims(self._grpid, &numunlimdims, unlimdimids) _ensure_nc_success(ierr) unlimdim_ids = [] for n from 0 <= n < numunlimdims: unlimdim_ids.append(unlimdimids[n]) free(unlimdimids) if dimid in unlimdim_ids: return True else: return False else: # if not NETCDF4, there is only one unlimited dimension. # nc_inq_unlimdims only works for NETCDF4. with nogil: ierr = nc_inq(self._grpid, &ndims, &nvars, &ngatts, &xdimid) if self._dimid == xdimid: return True else: return False cdef class Variable: """ A netCDF `netCDF4.Variable` is used to read and write netCDF data. They are analogous to numpy array objects. See `netCDF4.Variable.__init__` for more details. 
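As a short usage sketch (assuming `temp` is an existing two-dimensional
`netCDF4.Variable` of a float type; the name is hypothetical):

:::python
>>> temp[0:3, :] = 273.15    # assign data with numpy-style slicing
>>> data = temp[0, :]        # read it back as a numpy (possibly masked) array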
A list of attribute names corresponding to netCDF attributes defined for the variable can be obtained with the `netCDF4.Variable.ncattrs` method. These attributes can be created by assigning to an attribute of the `netCDF4.Variable` instance. A dictionary containing all the netCDF attribute name/value pairs is provided by the `__dict__` attribute of a `netCDF4.Variable` instance. The following class variables are read-only: **`dimensions`**: A tuple containing the names of the dimensions associated with this variable. **`dtype`**: A numpy dtype object describing the variable's data type. **`ndim`**: The number of variable dimensions. **`shape`**: A tuple with the current shape (length of all dimensions). **`scale`**: If True, `scale_factor` and `add_offset` are applied, and signed integer data is automatically converted to unsigned integer data if the `_Unsigned` attribute is set. Default is `True`, can be reset using `netCDF4.Variable.set_auto_scale` and `netCDF4.Variable.set_auto_maskandscale` methods. **`mask`**: If True, data is automatically converted to/from masked arrays when missing values or fill values are present. Default is `True`, can be reset using `netCDF4.Variable.set_auto_mask` and `netCDF4.Variable.set_auto_maskandscale` methods. **`chartostring`**: If True, data is automatically converted to/from character arrays to string arrays when the `_Encoding` variable attribute is set. Default is `True`, can be reset using `netCDF4.Variable.set_auto_chartostring` method. **`least_significant_digit`**: Describes the power of ten of the smallest decimal place in the data the contains a reliable value. Data is truncated to this decimal place when it is assigned to the `netCDF4.Variable` instance. If `None`, the data is not truncated. **`__orthogonal_indexing__`**: Always `True`. Indicates to client code that the object supports 'orthogonal indexing', which means that slices that are 1d arrays or lists slice along each dimension independently. This behavior is similar to Fortran or Matlab, but different than numpy. **`datatype`**: numpy data type (for primitive data types) or VLType/CompoundType instance (for compound or vlen data types). **`name`**: String name. **`size`**: The number of stored elements. """ cdef public int _varid, _grpid, _nunlimdim cdef public _name, ndim, dtype, mask, scale, chartostring, _isprimitive, _iscompound,\ _isvlen, _isenum, _grp, _cmptype, _vltype, _enumtype,\ __orthogonal_indexing__, _has_lsd, _no_get_vars # Docstrings for class variables (used by pdoc). __pdoc__['Variable.dimensions'] = \ """A tuple containing the names of the dimensions associated with this variable.""" __pdoc__['Variable.dtype'] = \ """A numpy dtype object describing the variable's data type.""" __pdoc__['Variable.ndim'] = \ """The number of variable dimensions.""" __pdoc__['Variable.scale'] = \ """if True, `scale_factor` and `add_offset` are applied, and signed integer data is converted to unsigned integer data if the `_Unsigned` attribute is set. Default is `True`, can be reset using `netCDF4.Variable.set_auto_scale` and `netCDF4.Variable.set_auto_maskandscale` methods.""" __pdoc__['Variable.mask'] = \ """If True, data is automatically converted to/from masked arrays when missing values or fill values are present. 
Default is `True`, can be reset using `netCDF4.Variable.set_auto_mask` and `netCDF4.Variable.set_auto_maskandscale` methods.""" __pdoc__['Variable.chartostring'] = \ """If True, data is automatically converted to/from character arrays to string arrays when `_Encoding` variable attribute is set. Default is `True`, can be reset using `netCDF4.Variable.set_auto_chartostring` method.""" __pdoc__['Variable._no_get_vars'] = \ """If True (default), netcdf routine `nc_get_vars` is not used for strided slicing slicing. Can be re-set using `netCDF4.Variable.use_nc_get_vars` method.""" __pdoc__['Variable.least_significant_digit'] = \ """Describes the power of ten of the smallest decimal place in the data the contains a reliable value. Data is truncated to this decimal place when it is assigned to the `netCDF4.Variable` instance. If `None`, the data is not truncated.""" __pdoc__['Variable.__orthogonal_indexing__'] = \ """Always `True`. Indicates to client code that the object supports 'orthogonal indexing', which means that slices that are 1d arrays or lists slice along each dimension independently. This behavior is similar to Fortran or Matlab, but different than numpy.""" __pdoc__['Variable.datatype'] = \ """numpy data type (for primitive data types) or VLType/CompoundType/EnumType instance (for compound, vlen or enum data types).""" __pdoc__['Variable.name'] = \ """String name.""" __pdoc__['Variable.shape'] = \ """A tuple with the current shape (length of all dimensions).""" __pdoc__['Variable.size'] = \ """The number of stored elements.""" def __init__(self, grp, name, datatype, dimensions=(), zlib=False, complevel=4, shuffle=True, fletcher32=False, contiguous=False, chunksizes=None, endian='native', least_significant_digit=None, fill_value=None, chunk_cache=None, **kwargs): """ **`__init__(self, group, name, datatype, dimensions=(), zlib=False, complevel=4, shuffle=True, fletcher32=False, contiguous=False, chunksizes=None, endian='native', least_significant_digit=None,fill_value=None)`** `netCDF4.Variable` constructor. **`group`**: `netCDF4.Group` or `netCDF4.Dataset` instance to associate with variable. **`name`**: Name of the variable. **`datatype`**: `netCDF4.Variable` data type. Can be specified by providing a numpy dtype object, or a string that describes a numpy dtype object. Supported values, corresponding to `str` attribute of numpy dtype objects, include `'f4'` (32-bit floating point), `'f8'` (64-bit floating point), `'i4'` (32-bit signed integer), `'i2'` (16-bit signed integer), `'i8'` (64-bit signed integer), `'i4'` (8-bit signed integer), `'i1'` (8-bit signed integer), `'u1'` (8-bit unsigned integer), `'u2'` (16-bit unsigned integer), `'u4'` (32-bit unsigned integer), `'u8'` (64-bit unsigned integer), or `'S1'` (single-character string). From compatibility with Scientific.IO.NetCDF, the old Numeric single character typecodes can also be used (`'f'` instead of `'f4'`, `'d'` instead of `'f8'`, `'h'` or `'s'` instead of `'i2'`, `'b'` or `'B'` instead of `'i1'`, `'c'` instead of `'S1'`, and `'i'` or `'l'` instead of `'i4'`). `datatype` can also be a `netCDF4.CompoundType` instance (for a structured, or compound array), a `netCDF4.VLType` instance (for a variable-length array), or the python `str` builtin (for a variable-length string array). Numpy string and unicode datatypes with length greater than one are aliases for `str`. **`dimensions`**: a tuple containing the variable's dimension names (defined previously with `createDimension`). 
Default is an empty tuple which means the variable is a scalar (and therefore has no dimensions). **`zlib`**: if `True`, data assigned to the `netCDF4.Variable` instance is compressed on disk. Default `False`. **`complevel`**: the level of zlib compression to use (1 is the fastest, but poorest compression, 9 is the slowest but best compression). Default 4. Ignored if `zlib=False`. **`shuffle`**: if `True`, the HDF5 shuffle filter is applied to improve compression. Default `True`. Ignored if `zlib=False`. **`fletcher32`**: if `True` (default `False`), the Fletcher32 checksum algorithm is used for error detection. **`contiguous`**: if `True` (default `False`), the variable data is stored contiguously on disk. Default `False`. Setting to `True` for a variable with an unlimited dimension will trigger an error. **`chunksizes`**: Can be used to specify the HDF5 chunksizes for each dimension of the variable. A detailed discussion of HDF chunking and I/O performance is available [here](http://www.hdfgroup.org/HDF5/doc/H5.user/Chunking.html). Basically, you want the chunk size for each dimension to match as closely as possible the size of the data block that users will read from the file. `chunksizes` cannot be set if `contiguous=True`. **`endian`**: Can be used to control whether the data is stored in little or big endian format on disk. Possible values are `little, big` or `native` (default). The library will automatically handle endian conversions when the data is read, but if the data is always going to be read on a computer with the opposite format as the one used to create the file, there may be some performance advantage to be gained by setting the endian-ness. For netCDF 3 files (that don't use HDF5), only `endian='native'` is allowed. The `zlib, complevel, shuffle, fletcher32, contiguous` and `chunksizes` keywords are silently ignored for netCDF 3 files that do not use HDF5. **`least_significant_digit`**: If specified, variable data will be truncated (quantized). In conjunction with `zlib=True` this produces 'lossy', but significantly more efficient compression. For example, if `least_significant_digit=1`, data will be quantized using around(scale*data)/scale, where scale = 2**bits, and bits is determined so that a precision of 0.1 is retained (in this case bits=4). Default is `None`, or no quantization. **`fill_value`**: If specified, the default netCDF `_FillValue` (the value that the variable gets filled with before any data is written to it) is replaced with this value. If fill_value is set to `False`, then the variable is not pre-filled. The default netCDF fill values can be found in `netCDF4.default_fillvals`. ***Note***: `netCDF4.Variable` instances should be created using the `netCDF4.Dataset.createVariable` method of a `netCDF4.Dataset` or `netCDF4.Group` instance, not using this class directly. """ cdef int ierr, ndims, icontiguous, ideflate_level, numdims, _grpid cdef char *varname cdef nc_type xtype cdef int *dimids cdef size_t sizep, nelemsp cdef size_t *chunksizesp cdef float preemptionp # flag to indicate that orthogonal indexing is supported self.__orthogonal_indexing__ = True # if complevel is set to zero, set zlib to False. 
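# (so, for example, createVariable(..., zlib=True, complevel=0) writes the data uncompressed)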
if not complevel: zlib = False # if dimensions is a string, convert to a tuple # this prevents a common error that occurs when # dimensions = 'lat' instead of ('lat',) if type(dimensions) == str or type(dimensions) == bytes or type(dimensions) == unicode: dimensions = dimensions, self._grpid = grp._grpid # make a weakref to group to avoid circular ref (issue 218) # keep strong reference the default behaviour (issue 251) if grp.keepweakref: self._grp = weakref.proxy(grp) else: self._grp = grp user_type = isinstance(datatype, CompoundType) or \ isinstance(datatype, VLType) or \ isinstance(datatype, EnumType) or \ datatype == str # convert to a real numpy datatype object if necessary. if not user_type and type(datatype) != numpy.dtype: datatype = numpy.dtype(datatype) # convert numpy string dtype with length > 1 # or any numpy unicode dtype into str if (isinstance(datatype, numpy.dtype) and ((datatype.kind == 'S' and datatype.itemsize > 1) or datatype.kind == 'U')): datatype = str user_type = True # check if endian keyword consistent with datatype specification. dtype_endian = getattr(datatype,'byteorder',None) if dtype_endian == '=': dtype_endian='native' if dtype_endian == '>': dtype_endian='big' if dtype_endian == '<': dtype_endian='little' if dtype_endian == '|': dtype_endian=None if dtype_endian is not None and dtype_endian != endian: if dtype_endian == 'native' and endian == sys.byteorder: pass else: # endian keyword prevails, issue warning msg = 'endian-ness of dtype and endian kwarg do not match, using endian kwarg' #msg = 'endian-ness of dtype and endian kwarg do not match, dtype over-riding endian kwarg' warnings.warn(msg) #endian = dtype_endian # dtype prevails # check validity of datatype. self._isprimitive = False self._iscompound = False self._isvlen = False self._isenum = False if user_type: if isinstance(datatype, CompoundType): self._iscompound = True self._cmptype = datatype if isinstance(datatype, VLType) or datatype==str: self._isvlen = True self._vltype = datatype if isinstance(datatype, EnumType): self._isenum = True self._enumtype = datatype if datatype==str: if grp.data_model != 'NETCDF4': raise ValueError( 'Variable length strings are only supported for the ' 'NETCDF4 format. For other formats, consider using ' 'netCDF4.stringtochar to convert string arrays into ' 'character arrays with an additional dimension.') datatype = VLType(self._grp, str, None) self._vltype = datatype xtype = datatype._nc_type # dtype variable attribute is a numpy datatype object. self.dtype = datatype.dtype elif datatype.str[1:] in _supportedtypes: self._isprimitive = True # find netCDF primitive data type corresponding to # specified numpy data type. xtype = _nptonctype[datatype.str[1:]] # dtype variable attribute is a numpy datatype object. self.dtype = datatype else: raise TypeError('illegal primitive data type, must be one of %s, got %s' % (_supportedtypes,datatype)) if 'id' in kwargs: self._varid = kwargs['id'] else: bytestr = _strencode(name) varname = bytestr ndims = len(dimensions) # find dimension ids. if ndims: dims = [] dimids = malloc(sizeof(int) * ndims) for n from 0 <= n < ndims: dimname = dimensions[n] # look for dimension in this group, and if not # found there, look in parent (and it's parent, etc, back to root). dim = _find_dim(grp, dimname) if dim is None: raise KeyError("dimension %s not defined in group %s or any group in it's family tree" % (dimname, grp.path)) dimids[n] = dim._dimid dims.append(dim) # go into define mode if it's a netCDF 3 compatible # file format. 
Be careful to exit define mode before # any exceptions are raised. if grp.data_model != 'NETCDF4': grp._redef() # define variable. if ndims: ierr = nc_def_var(self._grpid, varname, xtype, ndims, dimids, &self._varid) free(dimids) else: # a scalar variable. ierr = nc_def_var(self._grpid, varname, xtype, ndims, NULL, &self._varid) # set chunk cache size if desired # default is 1mb per var, can cause problems when many (1000's) # of vars are created. This change only lasts as long as file is # open. if grp.data_model.startswith('NETCDF4') and chunk_cache is not None: ierr = nc_get_var_chunk_cache(self._grpid, self._varid, &sizep, &nelemsp, &preemptionp) _ensure_nc_success(ierr) # reset chunk cache size, leave other parameters unchanged. sizep = chunk_cache ierr = nc_set_var_chunk_cache(self._grpid, self._varid, sizep, nelemsp, preemptionp) _ensure_nc_success(ierr) if ierr != NC_NOERR: if grp.data_model != 'NETCDF4': grp._enddef() _ensure_nc_success(ierr) # set zlib, shuffle, chunking, fletcher32 and endian # variable settings. # don't bother for NETCDF3* formats. # for NETCDF3* formats, the zlib,shuffle,chunking, # and fletcher32 are silently ignored. Only # endian='native' allowed for NETCDF3. if grp.data_model in ['NETCDF4','NETCDF4_CLASSIC']: # set zlib and shuffle parameters. if zlib and ndims: # don't bother for scalar variable ideflate_level = complevel if shuffle: ierr = nc_def_var_deflate(self._grpid, self._varid, 1, 1, ideflate_level) else: ierr = nc_def_var_deflate(self._grpid, self._varid, 0, 1, ideflate_level) if ierr != NC_NOERR: if grp.data_model != 'NETCDF4': grp._enddef() _ensure_nc_success(ierr) # set checksum. if fletcher32 and ndims: # don't bother for scalar variable ierr = nc_def_var_fletcher32(self._grpid, self._varid, 1) if ierr != NC_NOERR: if grp.data_model != 'NETCDF4': grp._enddef() _ensure_nc_success(ierr) # set chunking stuff. if ndims: # don't bother for scalar variable. if contiguous: icontiguous = NC_CONTIGUOUS if chunksizes is not None: raise ValueError('cannot specify chunksizes for a contiguous dataset') else: icontiguous = NC_CHUNKED if chunksizes is None: chunksizesp = NULL else: if len(chunksizes) != len(dimensions): if grp.data_model != 'NETCDF4': grp._enddef() raise ValueError('chunksizes must be a sequence with the same length as dimensions') chunksizesp = malloc(sizeof(size_t) * ndims) for n from 0 <= n < ndims: if not dims[n].isunlimited() and \ chunksizes[n] > dims[n].size: msg = 'chunksize cannot exceed dimension size' raise ValueError(msg) chunksizesp[n] = chunksizes[n] if chunksizes is not None or contiguous: ierr = nc_def_var_chunking(self._grpid, self._varid, icontiguous, chunksizesp) free(chunksizesp) if ierr != NC_NOERR: if grp.data_model != 'NETCDF4': grp._enddef() _ensure_nc_success(ierr) # set endian-ness of variable if endian == 'little': ierr = nc_def_var_endian(self._grpid, self._varid, NC_ENDIAN_LITTLE) elif endian == 'big': ierr = nc_def_var_endian(self._grpid, self._varid, NC_ENDIAN_BIG) elif endian == 'native': pass # this is the default format. else: raise ValueError("'endian' keyword argument must be 'little','big' or 'native', got '%s'" % endian) if ierr != NC_NOERR: if grp.data_model != 'NETCDF4': grp._enddef() _ensure_nc_success(ierr) else: if endian != 'native': msg="only endian='native' allowed for NETCDF3 files" raise RuntimeError(msg) # set a fill value for this variable if fill_value keyword # given. This avoids the HDF5 overhead of deleting and # recreating the dataset if it is set later (after the enddef). 
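# Three cases are handled below: fill_value=False disables pre-filling via nc_def_var_fill;
# for vlen string variables the fill is written as an NC_STRING _FillValue attribute;
# for primitive and enum types it is written as a _FillValue attribute cast to the
# variable dtype in native byte order. VLEN (non-string) and compound variables reject a fill value.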
if fill_value is not None: if not fill_value and isinstance(fill_value,bool): # no filling for this variable if fill_value==False. if not self._isprimitive: # no fill values for VLEN and compound variables # anyway. ierr = 0 else: ierr = nc_def_var_fill(self._grpid, self._varid, 1, NULL) if ierr != NC_NOERR: if grp.data_model != 'NETCDF4': grp._enddef() _ensure_nc_success(ierr) else: if self._isprimitive or self._isenum or \ (self._isvlen and self.dtype == str): if self._isvlen and self.dtype == str: _set_att(self._grp, self._varid, '_FillValue',\ _tostr(fill_value), xtype=xtype, force_ncstring=True) else: fillval = numpy.array(fill_value, self.dtype) if not fillval.dtype.isnative: fillval.byteswap(True) _set_att(self._grp, self._varid, '_FillValue',\ fillval, xtype=xtype) else: raise AttributeError("cannot set _FillValue attribute for VLEN or compound variable") if least_significant_digit is not None: self.least_significant_digit = least_significant_digit # leave define mode if not a NETCDF4 format file. if grp.data_model != 'NETCDF4': grp._enddef() # count how many unlimited dimensions there are. self._nunlimdim = 0 for dimname in dimensions: # look in current group, and parents for dim. dim = _find_dim(self._grp, dimname) if dim.isunlimited(): self._nunlimdim = self._nunlimdim + 1 # set ndim attribute (number of dimensions). with nogil: ierr = nc_inq_varndims(self._grpid, self._varid, &numdims) _ensure_nc_success(ierr) self.ndim = numdims self._name = name # default for automatically applying scale_factor and # add_offset, and converting to/from masked arrays is True. self.scale = True self.mask = True # default is to automatically convert to/from character # to string arrays when _Encoding variable attribute is set. self.chartostring = True if 'least_significant_digit' in self.ncattrs(): self._has_lsd = True # avoid calling nc_get_vars for strided slices by default. self._no_get_vars = True def __array__(self): # numpy special method that returns a numpy array. # allows numpy ufuncs to work faster on Variable objects # (issue 216). return self[...] 
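# Illustrative note (not part of the original code): because __array__ returns the
# fully sliced data, a Variable can be passed directly to numpy functions, e.g.
# (with `v` a hypothetical Variable instance)
#
#     y = numpy.cos(v)   # equivalent to numpy.cos(v[...])
#
# the data is read once via self[...] instead of element by element, which is
# why ufuncs on Variable objects are fast (issue 216).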
def __repr__(self): if python3: return self.__unicode__() else: return unicode(self).encode('utf-8') def __unicode__(self): cdef int ierr, no_fill if not dir(self._grp): return 'Variable object no longer valid' ncdump_var = ['%r\n' % type(self)] dimnames = tuple([_tostr(dimname) for dimname in self.dimensions]) attrs = [' %s: %s\n' % (name,self.getncattr(name)) for name in\ self.ncattrs()] if self._iscompound: ncdump_var.append('%s %s(%s)\n' %\ ('compound',self._name,', '.join(dimnames))) elif self._isvlen: ncdump_var.append('%s %s(%s)\n' %\ ('vlen',self._name,', '.join(dimnames))) elif self._isenum: ncdump_var.append('%s %s(%s)\n' %\ ('enum',self._name,', '.join(dimnames))) else: ncdump_var.append('%s %s(%s)\n' %\ (self.dtype,self._name,', '.join(dimnames))) ncdump_var = ncdump_var + attrs if self._iscompound: ncdump_var.append('compound data type: %s\n' % self.dtype) elif self._isvlen: ncdump_var.append('vlen data type: %s\n' % self.dtype) elif self._isenum: ncdump_var.append('enum data type: %s\n' % self.dtype) unlimdims = [] for dimname in self.dimensions: dim = _find_dim(self._grp, dimname) if dim.isunlimited(): unlimdims.append(dimname) if (self._grp.path != '/'): ncdump_var.append('path = %s\n' % self._grp.path) ncdump_var.append('unlimited dimensions: %s\n' % ', '.join(unlimdims)) ncdump_var.append('current shape = %s\n' % repr(self.shape)) with nogil: ierr = nc_inq_var_fill(self._grpid,self._varid,&no_fill,NULL) _ensure_nc_success(ierr) if self._isprimitive: if no_fill != 1: try: fillval = self._FillValue msg = 'filling on' except AttributeError: fillval = default_fillvals[self.dtype.str[1:]] if self.dtype.str[1:] in ['u1','i1']: msg = 'filling on, default _FillValue of %s ignored\n' % fillval else: msg = 'filling on, default _FillValue of %s used\n' % fillval ncdump_var.append(msg) else: ncdump_var.append('filling off\n') return ''.join(ncdump_var) def _getdims(self): # Private method to get variables's dimension names cdef int ierr, numdims, n, nn cdef char namstring[NC_MAX_NAME+1] cdef int *dimids # get number of dimensions for this variable. with nogil: ierr = nc_inq_varndims(self._grpid, self._varid, &numdims) _ensure_nc_success(ierr) dimids = malloc(sizeof(int) * numdims) # get dimension ids. with nogil: ierr = nc_inq_vardimid(self._grpid, self._varid, dimids) _ensure_nc_success(ierr) # loop over dimensions, retrieve names. 
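# each name is returned in an NC_MAX_NAME buffer, decoded as utf-8 and appended to the tuple in dimension order.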
dimensions = () for nn from 0 <= nn < numdims: with nogil: ierr = nc_inq_dimname(self._grpid, dimids[nn], namstring) _ensure_nc_success(ierr) name = namstring.decode('utf-8') dimensions = dimensions + (name,) free(dimids) return dimensions def _getname(self): # Private method to get name associated with instance cdef int err, _grpid cdef char namstring[NC_MAX_NAME+1] _grpid = self._grp._grpid with nogil: ierr = nc_inq_varname(_grpid, self._varid, namstring) _ensure_nc_success(ierr) return namstring.decode('utf-8') property name: """string name of Variable instance""" def __get__(self): return self._getname() def __set__(self,value): raise AttributeError("name cannot be altered") property datatype: """numpy data type (for primitive data types) or VLType/CompoundType/EnumType instance (for compound, vlen or enum data types)""" def __get__(self): if self._iscompound: return self._cmptype elif self._isvlen: return self._vltype elif self._isenum: return self._enumtype elif self._isprimitive: return self.dtype property shape: """find current sizes of all variable dimensions""" def __get__(self): shape = () for dimname in self._getdims(): # look in current group, and parents for dim. dim = _find_dim(self._grp,dimname) shape = shape + (len(dim),) return shape def __set__(self,value): raise AttributeError("shape cannot be altered") property size: """Return the number of stored elements.""" def __get__(self): return numpy.prod(self.shape) property dimensions: """get variables's dimension names""" def __get__(self): return self._getdims() def __set__(self,value): raise AttributeError("dimensions cannot be altered") def group(self): """ **`group(self)`** return the group that this `netCDF4.Variable` is a member of.""" return self._grp def ncattrs(self): """ **`ncattrs(self)`** return netCDF attribute names for this `netCDF4.Variable` in a list.""" return _get_att_names(self._grpid, self._varid) def setncattr(self,name,value): """ **`setncattr(self,name,value)`** set a netCDF variable attribute using name,value pair. Use if you need to set a netCDF attribute with the same name as one of the reserved python attributes.""" if self._grp.data_model != 'NETCDF4': self._grp._redef() _set_att(self._grp, self._varid, name, value) if self._grp.data_model != 'NETCDF4': self._grp._enddef() def setncattr_string(self,name,value): """ **`setncattr_string(self,name,value)`** set a netCDF variable string attribute using name,value pair. Use if you need to ensure that a netCDF attribute is created with type `NC_STRING` if the file format is `NETCDF4`. Use if you need to set an attribute to an array of variable-length strings.""" cdef nc_type xtype xtype=-99 if self._grp.data_model != 'NETCDF4': msg='file format does not support NC_STRING attributes' raise IOError(msg) _set_att(self._grp, self._varid, name, value, xtype=xtype, force_ncstring=True) def setncatts(self,attdict): """ **`setncatts(self,attdict)`** set a bunch of netCDF variable attributes at once using a python dictionary. This may be faster when setting a lot of attributes for a `NETCDF3` formatted file, since nc_redef/nc_enddef is not called in between setting each attribute""" if self._grp.data_model != 'NETCDF4': self._grp._redef() for name, value in attdict.items(): _set_att(self._grp, self._varid, name, value) if self._grp.data_model != 'NETCDF4': self._grp._enddef() def getncattr(self,name,encoding='utf-8'): """ **`getncattr(self,name)`** retrieve a netCDF variable attribute. 
Use if you need to set a netCDF attribute with the same name as one of the reserved python attributes. option kwarg `encoding` can be used to specify the character encoding of a string attribute (default is `utf-8`).""" return _get_att(self._grp, self._varid, name, encoding=encoding) def delncattr(self, name): """ **`delncattr(self,name,value)`** delete a netCDF variable attribute. Use if you need to delete a netCDF attribute with the same name as one of the reserved python attributes.""" cdef char *attname bytestr = _strencode(name) attname = bytestr if self._grp.data_model != 'NETCDF4': self._grp._redef() ierr = nc_del_att(self._grpid, self._varid, attname) if self._grp.data_model != 'NETCDF4': self._grp._enddef() _ensure_nc_success(ierr) def filters(self): """ **`filters(self)`** return dictionary containing HDF5 filter parameters.""" cdef int ierr,ideflate,ishuffle,ideflate_level,ifletcher32 filtdict = {'zlib':False,'shuffle':False,'complevel':0,'fletcher32':False} if self._grp.data_model not in ['NETCDF4_CLASSIC','NETCDF4']: return with nogil: ierr = nc_inq_var_deflate(self._grpid, self._varid, &ishuffle, &ideflate, &ideflate_level) _ensure_nc_success(ierr) with nogil: ierr = nc_inq_var_fletcher32(self._grpid, self._varid, &ifletcher32) _ensure_nc_success(ierr) if ideflate: filtdict['zlib']=True filtdict['complevel']=ideflate_level if ishuffle: filtdict['shuffle']=True if ifletcher32: filtdict['fletcher32']=True return filtdict def endian(self): """ **`endian(self)`** return endian-ness (`little,big,native`) of variable (as stored in HDF5 file).""" cdef int ierr, iendian if self._grp.data_model not in ['NETCDF4_CLASSIC','NETCDF4']: return 'native' with nogil: ierr = nc_inq_var_endian(self._grpid, self._varid, &iendian) _ensure_nc_success(ierr) if iendian == NC_ENDIAN_LITTLE: return 'little' elif iendian == NC_ENDIAN_BIG: return 'big' else: return 'native' def chunking(self): """ **`chunking(self)`** return variable chunking information. If the dataset is defined to be contiguous (and hence there is no chunking) the word 'contiguous' is returned. Otherwise, a sequence with the chunksize for each dimension is returned.""" cdef int ierr, icontiguous, ndims cdef size_t *chunksizesp if self._grp.data_model not in ['NETCDF4_CLASSIC','NETCDF4']: return None ndims = self.ndim chunksizesp = malloc(sizeof(size_t) * ndims) with nogil: ierr = nc_inq_var_chunking(self._grpid, self._varid, &icontiguous, chunksizesp) _ensure_nc_success(ierr) chunksizes=[] for n from 0 <= n < ndims: chunksizes.append(chunksizesp[n]) free(chunksizesp) if icontiguous: return 'contiguous' else: return chunksizes def get_var_chunk_cache(self): """ **`get_var_chunk_cache(self)`** return variable chunk cache information in a tuple (size,nelems,preemption). See netcdf C library documentation for `nc_get_var_chunk_cache` for details.""" cdef int ierr cdef size_t sizep, nelemsp cdef float preemptionp with nogil: ierr = nc_get_var_chunk_cache(self._grpid, self._varid, &sizep, &nelemsp, &preemptionp) _ensure_nc_success(ierr) size = sizep; nelems = nelemsp; preemption = preemptionp return (size,nelems,preemption) def set_var_chunk_cache(self,size=None,nelems=None,preemption=None): """ **`set_var_chunk_cache(self,size=None,nelems=None,preemption=None)`** change variable chunk cache settings. See netcdf C library documentation for `nc_set_var_chunk_cache` for details.""" cdef int ierr cdef size_t sizep, nelemsp cdef float preemptionp # reset chunk cache size, leave other parameters unchanged. 
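# any of size, nelems or preemption left as None keeps the value currently reported by get_var_chunk_cache.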
size_orig, nelems_orig, preemption_orig = self.get_var_chunk_cache() if size is not None: sizep = size else: sizep = size_orig if nelems is not None: nelemsp = nelems else: nelemsp = nelems_orig if preemption is not None: preemptionp = preemption else: preemptionp = preemption_orig ierr = nc_set_var_chunk_cache(self._grpid, self._varid, sizep, nelemsp, preemptionp) _ensure_nc_success(ierr) def __delattr__(self,name): # if it's a netCDF attribute, remove it if name not in _private_atts: self.delncattr(name) else: raise AttributeError( "'%s' is one of the reserved attributes %s, cannot delete. Use delncattr instead." % (name, tuple(_private_atts))) def __setattr__(self,name,value): # if name in _private_atts, it is stored at the python # level and not in the netCDF file. if name not in _private_atts: # if setting _FillValue or missing_value, make sure value # has same type and byte order as variable. if name == '_FillValue': msg='_FillValue attribute must be set when variable is '+\ 'created (using fill_value keyword to createVariable)' raise AttributeError(msg) #if self._isprimitive: # value = numpy.array(value, self.dtype) #else: # msg="cannot set _FillValue attribute for "+\ # "VLEN or compound variable" # raise AttributeError(msg) elif name in ['valid_min','valid_max','valid_range','missing_value'] and self._isprimitive: # make sure these attributes written in same data type as variable. # also make sure it is written in native byte order # (the same as the data) valuea = numpy.array(value, self.dtype) # check to see if array cast is safe if _safecast(numpy.array(value),valuea): value = valuea if not value.dtype.isnative: value.byteswap(True) else: # otherwise don't do it, but issue a warning msg="WARNING: %s cannot be safely cast to variable dtype" \ % name warnings.warn(msg) self.setncattr(name, value) elif not name.endswith('__'): if hasattr(self,name): raise AttributeError( "'%s' is one of the reserved attributes %s, cannot rebind. Use setncattr instead." % (name, tuple(_private_atts))) else: self.__dict__[name]=value def __getattr__(self,name): # if name in _private_atts, it is stored at the python # level and not in the netCDF file. if name.startswith('__') and name.endswith('__'): # if __dict__ requested, return a dict with netCDF attributes. if name == '__dict__': names = self.ncattrs() values = [] for name in names: values.append(_get_att(self._grp, self._varid, name)) return OrderedDict(zip(names,values)) else: raise AttributeError elif name in _private_atts: return self.__dict__[name] else: return self.getncattr(name) def renameAttribute(self, oldname, newname): """ **`renameAttribute(self, oldname, newname)`** rename a `netCDF4.Variable` attribute named `oldname` to `newname`.""" cdef int ierr cdef char *oldnamec cdef char *newnamec bytestr = _strencode(oldname) oldnamec = bytestr bytestr = _strencode(newname) newnamec = bytestr ierr = nc_rename_att(self._grpid, self._varid, oldnamec, newnamec) _ensure_nc_success(ierr) def __getitem__(self, elem): # This special method is used to index the netCDF variable # using the "extended slice syntax". The extended slice syntax # is a perfect match for the "start", "count" and "stride" # arguments to the nc_get_var() function, and is much more easy # to use. 
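# Illustrative example (shapes hypothetical): for a variable of shape (10, 20, 30),
# v[0, 2:10:2, :] selects start indices (0, 2, 0), 1, 4 and 30 elements along the
# three axes, and strides (1, 2, 1); put_ind records where each chunk of data lands
# in the output array and which axes are squeezed. When _no_get_vars is True the
# strided dimension may instead be expanded into several unstrided nc_get_vara reads.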
start, count, stride, put_ind =\ _StartCountStride(elem,self.shape,dimensions=self.dimensions,grp=self._grp,no_get_vars=self._no_get_vars) datashape = _out_array_shape(count) if self._isvlen: data = numpy.empty(datashape, dtype='O') else: data = numpy.empty(datashape, dtype=self.dtype) # Determine which dimensions need to be # squeezed (those for which elem is an integer scalar). # The convention used is that for those cases, # put_ind for this dimension is set to -1 by _StartCountStride. squeeze = data.ndim * [slice(None),] for i,n in enumerate(put_ind.shape[:-1]): if n == 1 and put_ind[...,i].ravel()[0] == -1: squeeze[i] = 0 # Reshape the arrays so we can iterate over them. start = start.reshape((-1, self.ndim or 1)) count = count.reshape((-1, self.ndim or 1)) stride = stride.reshape((-1, self.ndim or 1)) put_ind = put_ind.reshape((-1, self.ndim or 1)) # Fill output array with data chunks. for (a,b,c,i) in zip(start, count, stride, put_ind): datout = self._get(a,b,c) if not hasattr(datout,'shape') or data.shape == datout.shape: data = datout else: shape = getattr(data[tuple(i)], 'shape', ()) if self._isvlen and not len(self.dimensions): # special case of scalar VLEN data[0] = datout else: data[tuple(i)] = datout.reshape(shape) # Remove extra singleton dimensions. if hasattr(data,'shape'): data = data[tuple(squeeze)] if hasattr(data,'ndim') and self.ndim == 0: # Make sure a numpy scalar array is returned instead of a 1-d array of # length 1. if data.ndim != 0: data = numpy.asarray(data[0]) # if auto_scale mode set to True, (through # a call to set_auto_scale or set_auto_maskandscale), # perform automatic unpacking using scale_factor/add_offset. # if auto_mask mode is set to True (through a call to # set_auto_mask or set_auto_maskandscale), perform # automatic conversion to masked array using # missing_value/_Fill_Value. # ignore for compound, vlen or enum datatypes. try: # check to see if scale_factor and add_offset is valid (issue 176). if hasattr(self,'scale_factor'): float(self.scale_factor) if hasattr(self,'add_offset'): float(self.add_offset) valid_scaleoffset = True except: valid_scaleoffset = False if self.scale: msg = 'invalid scale_factor or add_offset attribute, no unpacking done...' warnings.warn(msg) if self.mask and (self._isprimitive or self._isenum): data = self._toma(data) # if attribute _Unsigned is True, and variable has signed integer # dtype, return view with corresponding unsigned dtype (issue #656) if self.scale: # only do this if autoscale option is on. is_unsigned = getattr(self, '_Unsigned', False) if is_unsigned and data.dtype.kind == 'i': data = data.view('u%s' % data.dtype.itemsize) if self.scale and self._isprimitive and valid_scaleoffset: # if variable has scale_factor and add_offset attributes, rescale. if hasattr(self, 'scale_factor') and hasattr(self, 'add_offset') and\ (self.add_offset != 0.0 or self.scale_factor != 1.0): data = data*self.scale_factor + self.add_offset # else if variable has only scale_factor attributes, rescale. elif hasattr(self, 'scale_factor') and self.scale_factor != 1.0: data = data*self.scale_factor # else if variable has only add_offset attributes, rescale. elif hasattr(self, 'add_offset') and self.add_offset != 0.0: data = data + self.add_offset # if _Encoding is specified for a character variable, return # a numpy array of strings with one less dimension. 
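# the conversion below is only attempted when the rightmost dimension of the sliced
# data equals the variable's rightmost dimension and the slice runs along that whole
# dimension; otherwise the raw character array is returned.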
if self.chartostring and getattr(self.dtype,'kind',None) == 'S' and\ getattr(self.dtype,'itemsize',None) == 1: encoding = getattr(self,'_Encoding',None) # should this only be done if self.scale = True? # should there be some other way to disable this? if encoding is not None: # only try to return a string array if rightmost dimension of # sliced data matches rightmost dimension of char variable if len(data.shape) > 0 and data.shape[-1] == self.shape[-1]: # also make sure slice is along last dimension matchdim = True for cnt in count: if cnt[-1] != self.shape[-1]: matchdim = False break if matchdim: data = chartostring(data, encoding=encoding) return data def _toma(self,data): cdef int ierr, no_fill # private function for creating a masked array, masking missing_values # and/or _FillValues. totalmask = numpy.zeros(data.shape, numpy.bool) fill_value = None safe_missval = self._check_safecast('missing_value') if safe_missval: mval = numpy.array(self.missing_value, self.dtype) # create mask from missing values. mvalmask = numpy.zeros(data.shape, numpy.bool) if mval.shape == (): # mval a scalar. mval = [mval] # make into iterable. for m in mval: # is scalar missing value a NaN? try: mvalisnan = numpy.isnan(m) except TypeError: # isnan fails on some dtypes (issue 206) mvalisnan = False if mvalisnan: mvalmask += numpy.isnan(data) else: mvalmask += data==m if mvalmask.any(): # set fill_value for masked array # to missing_value (or 1st element # if missing_value is a vector). fill_value = mval[0] totalmask += mvalmask # set mask=True for data == fill value safe_fillval = self._check_safecast('_FillValue') if safe_fillval: fval = numpy.array(self._FillValue, self.dtype) # is _FillValue a NaN? try: fvalisnan = numpy.isnan(fval) except: # isnan fails on some dtypes (issue 202) fvalisnan = False if fvalisnan: mask = numpy.isnan(data) elif (data == fval).any(): mask = data==fval else: mask = None if mask is not None: if fill_value is None: fill_value = fval totalmask += mask # issue 209: don't return masked array if variable filling # is disabled. else: with nogil: ierr = nc_inq_var_fill(self._grpid,self._varid,&no_fill,NULL) _ensure_nc_success(ierr) # if no_fill is not 1, and not a byte variable, then use default fill value. # from http://www.unidata.ucar.edu/software/netcdf/docs/netcdf-c/Fill-Values.html#Fill-Values # "If you need a fill value for a byte variable, it is recommended # that you explicitly define an appropriate _FillValue attribute, as # generic utilities such as ncdump will not assume a default fill # value for byte variables." # Explained here too: # http://www.unidata.ucar.edu/software/netcdf/docs/known_problems.html#ncdump_ubyte_fill # "There should be no default fill values when reading any byte # type, signed or unsigned, because the byte ranges are too # small to assume one of the values should appear as a missing # value unless a _FillValue attribute is set explicitly." if no_fill != 1 and self.dtype.str[1:] not in ['u1','i1']: fillval = numpy.array(default_fillvals[self.dtype.str[1:]],self.dtype) has_fillval = data == fillval # if data is an array scalar, has_fillval will be a boolean. # in that case convert to an array. if type(has_fillval) == bool: has_fillval=numpy.asarray(has_fillval) if has_fillval.any(): if fill_value is None: fill_value = fillval mask=data==fillval totalmask += mask # set mask=True for data outside valid_min,valid_max. # (issue #576) validmin = None; validmax = None # if valid_range exists use that, otherwise # look for valid_min, valid_max. 
No special # treatment of byte data as described at # http://www.unidata.ucar.edu/software/netcdf/docs/attribute_conventions.html). safe_validrange = self._check_safecast('valid_range') safe_validmin = self._check_safecast('valid_min') safe_validmax = self._check_safecast('valid_max') if safe_validrange and len(self.valid_range) == 2: validmin = numpy.array(self.valid_range[0], self.dtype) validmax = numpy.array(self.valid_range[1], self.dtype) else: if safe_validmin: validmin = numpy.array(self.valid_min, self.dtype) if safe_validmax: validmax = numpy.array(self.valid_max, self.dtype) # http://www.unidata.ucar.edu/software/netcdf/docs/attribute_conventions.html). # "If the data type is byte and _FillValue # is not explicitly defined, # then the valid range should include all possible values. # Otherwise, the valid range should exclude the _FillValue # (whether defined explicitly or by default) as follows. # If the _FillValue is positive then it defines a valid maximum, # otherwise it defines a valid minimum." byte_type = self.dtype.str[1:] in ['u1','i1'] if safe_fillval: fval = numpy.array(self._FillValue, self.dtype) else: fval = numpy.array(default_fillvals[self.dtype.str[1:]],self.dtype) if byte_type: fval = None if self.dtype.kind != 'S': # don't set mask for character data if validmin is None and (fval is not None and fval <= 0): validmin = fval if validmax is None and (fval is not None and fval > 0): validmax = fval if validmin is not None: totalmask += data < validmin if validmax is not None: totalmask += data > validmax if fill_value is None and fval is not None: fill_value = fval # create masked array with computed mask if totalmask.any() and fill_value is not None: data = ma.masked_array(data,mask=totalmask,fill_value=fill_value) # issue 515 scalar array with mask=True should be converted # to numpy.ma.MaskedConstant to be consistent with slicing # behavior of masked arrays. if data.shape == () and data.mask.all(): # return a scalar numpy masked constant not a 0-d masked array, # so that data == numpy.ma.masked. data = data[()] # changed from [...] (issue #662) return data def _assign_vlen(self, elem, data): """private method to assign data to a single item in a VLEN variable""" cdef size_t *startp cdef size_t *countp cdef int ndims, n cdef nc_vlen_t *vldata cdef char **strdata cdef ndarray data2 if not self._isvlen: raise TypeError('_assign_vlen method only for use with VLEN variables') ndims = self.ndim msg="data can only be assigned to VLEN variables using integer indices" # check to see that elem is a tuple of integers. # handle negative integers. 
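# e.g. for a 1-d VLEN variable of length 10, an index of -1 is mapped to 9 below;
# slices and other non-integer indices raise IndexError with the message above.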
if isinstance(elem, int): if ndims > 1: raise IndexError(msg) if elem < 0: if self.shape[0]+elem >= 0: elem = self.shape[0]+elem else: raise IndexError("Illegal index") elif isinstance(elem, tuple): if len(elem) != ndims: raise IndexError("Illegal index") elemnew = [] for n,e in enumerate(elem): if not isinstance(e, int): raise IndexError(msg) elif e < 0: enew = self.shape[n]+e if enew < 0: raise IndexError("Illegal index") else: elemnew.append(self.shape[n]+e) else: elemnew.append(e) elem = tuple(elemnew) else: raise IndexError(msg) # set start, count if isinstance(elem, tuple): start = list(elem) else: start = [elem] count = [1]*ndims startp = malloc(sizeof(size_t) * ndims) countp = malloc(sizeof(size_t) * ndims) for n from 0 <= n < ndims: startp[n] = start[n] countp[n] = count[n] if self.dtype == str: # VLEN string strdata = malloc(sizeof(char *)) # use _Encoding attribute to specify string encoding - if # not given, use 'utf-8'. encoding = getattr(self,'_Encoding','utf-8') bytestr = _strencode(data,encoding=encoding) strdata[0] = bytestr ierr = nc_put_vara(self._grpid, self._varid, startp, countp, strdata) _ensure_nc_success(ierr) free(strdata) else: # regular VLEN if data.dtype != self.dtype: raise TypeError("wrong data type: should be %s, got %s" % (self.dtype,data.dtype)) data2 = data vldata = malloc(sizeof(nc_vlen_t)) vldata[0].len = PyArray_SIZE(data2) vldata[0].p = data2.data ierr = nc_put_vara(self._grpid, self._varid, startp, countp, vldata) _ensure_nc_success(ierr) free(vldata) free(startp) free(countp) def _check_safecast(self, attname): # check to see that variable attribute exists # can can be safely cast to variable data type. if hasattr(self, attname): att = numpy.array(self.getncattr(attname)) else: return False atta = numpy.array(att, self.dtype) is_safe = _safecast(att,atta) if not is_safe: msg="""WARNING: %s not used since it cannot be safely cast to variable data type""" % attname warnings.warn(msg) return is_safe def __setitem__(self, elem, data): # This special method is used to assign to the netCDF variable # using "extended slice syntax". The extended slice syntax # is a perfect match for the "start", "count" and "stride" # arguments to the nc_put_var() function, and is much more easy # to use. # if _Encoding is specified for a character variable, convert # numpy array of strings to a numpy array of characters with one more # dimension. if self.chartostring and getattr(self.dtype,'kind',None) == 'S' and\ getattr(self.dtype,'itemsize',None) == 1: # NC_CHAR variable encoding = getattr(self,'_Encoding',None) if encoding is not None: # _Encoding attribute is set # if data is a string or a bytes object, convert to a numpy string array # whose length is equal to the rightmost dimension of the # variable. if type(data) in [str,bytes]: data = numpy.asarray(data,dtype='S'+repr(self.shape[-1])) if data.dtype.kind in ['S','U'] and data.dtype.itemsize > 1: # if data is a numpy string array, convert it to an array # of characters with one more dimension. data = stringtochar(data, encoding=encoding) if self._isvlen: # if vlen, should be object array (don't try casting) if self.dtype == str: # for string vars, if data is not an array # assume it is a python string and raise an error # if it is an array, but not an object array. 
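# numpy 'S'/'U' string arrays are converted to object arrays below (masked string
# arrays are rejected), and any other array that is not already an object array
# raises a TypeError.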
if not isinstance(data, numpy.ndarray): # issue 458, allow Ellipsis to be used for scalar var if type(elem) == type(Ellipsis) and not\ len(self.dimensions): elem = 0 self._assign_vlen(elem, data) return elif data.dtype.kind in ['S', 'U']: if ma.isMA(data): msg='masked arrays cannot be assigned by VLEN str slices' raise TypeError(msg) data = data.astype(object) elif data.dtype.kind != 'O': msg = ('only numpy string, unicode or object arrays can ' 'be assigned to VLEN str var slices') raise TypeError(msg) else: # for non-string vlen arrays, if data is not multi-dim, or # not an object array, assume it represents a single element # of the vlen var. if not isinstance(data, numpy.ndarray) or data.dtype.kind != 'O': # issue 458, allow Ellipsis to be used for scalar var if type(elem) == type(Ellipsis) and not\ len(self.dimensions): elem = 0 self._assign_vlen(elem, data) return # A numpy or masked array (or an object supporting the buffer interface) is needed. # Convert if necessary. if not ma.isMA(data) and not (hasattr(data,'data') and isinstance(data.data,buffer)): # if auto scaling is to be done, don't cast to an integer yet. if self.scale and self.dtype.kind in 'iu' and \ hasattr(self, 'scale_factor') or hasattr(self, 'add_offset'): data = numpy.array(data,numpy.float) else: data = numpy.array(data,self.dtype) # for Enum variable, make sure data is valid. if self._isenum: test = numpy.zeros(data.shape,numpy.bool) if ma.isMA(data): # fix for new behaviour in numpy.ma in 1.13 (issue #662) for val in self.datatype.enum_dict.values(): test += data.filled() == val else: for val in self.datatype.enum_dict.values(): test += data == val if not numpy.all(test): msg="trying to assign illegal value to Enum variable" raise ValueError(msg) start, count, stride, put_ind =\ _StartCountStride(elem,self.shape,self.dimensions,self._grp,datashape=data.shape,put=True) datashape = _out_array_shape(count) # if a numpy scalar, create an array of the right size # and fill with scalar values. if data.shape == (): data = numpy.tile(data,datashape) # reshape data array by adding extra singleton dimensions # if needed to conform with start,count,stride. if len(data.shape) != len(datashape): # create a view so shape in caller is not modified (issue 90) data = data.view() data.shape = tuple(datashape) # Reshape these arrays so we can iterate over them. start = start.reshape((-1, self.ndim or 1)) count = count.reshape((-1, self.ndim or 1)) stride = stride.reshape((-1, self.ndim or 1)) put_ind = put_ind.reshape((-1, self.ndim or 1)) # quantize data if least_significant_digit attribute # exists (improves compression). if self._has_lsd: data = _quantize(data,self.least_significant_digit) # if auto_scale mode set to True, (through # a call to set_auto_scale or set_auto_maskandscale), # perform automatic unpacking using scale_factor/add_offset. # if auto_mask mode is set to True (through a call to # set_auto_mask or set_auto_maskandscale), perform # automatic conversion to masked array using # valid_min,validmax,missing_value,_Fill_Value. # ignore if not a primitive or enum data type (not compound or vlen). if self.mask and (self._isprimitive or self._isenum): # use missing_value as fill value. # if no missing value set, use _FillValue. if hasattr(self, 'scale_factor') or hasattr(self, 'add_offset'): # if not masked, create a masked array. 
if not ma.isMA(data): data = self._toma(data) if self.scale and self._isprimitive: # pack non-masked values using scale_factor and add_offset if hasattr(self, 'scale_factor') and hasattr(self, 'add_offset'): data = (data - self.add_offset)/self.scale_factor if self.dtype.kind in 'iu': data = numpy.around(data) elif hasattr(self, 'scale_factor'): data = data/self.scale_factor if self.dtype.kind in 'iu': data = numpy.around(data) elif hasattr(self, 'add_offset'): data = data - self.add_offset if self.dtype.kind in 'iu': data = numpy.around(data) if ma.isMA(data): # if underlying data in masked regions of masked array # corresponds to missing values, don't fill masked array - # just use underlying data instead if hasattr(self, 'missing_value') and \ numpy.all(numpy.in1d(data.data[data.mask],self.missing_value)): data = data.data else: if hasattr(self, 'missing_value'): # if missing value is a scalar, use it as fill_value. # if missing value is a vector, raise an exception # since we then don't know how to fill in masked values. if numpy.array(self.missing_value).shape == (): fillval = self.missing_value else: msg="cannot assign fill_value for masked array when missing_value attribute is not a scalar" raise RuntimeError(msg) if numpy.array(fillval).shape != (): fillval = fillval[0] elif hasattr(self, '_FillValue'): fillval = self._FillValue else: fillval = default_fillvals[self.dtype.str[1:]] data = data.filled(fill_value=fillval) # Fill output array with data chunks. for (a,b,c,i) in zip(start, count, stride, put_ind): dataput = data[tuple(i)] if dataput.size == 0: continue # nothing to write # convert array scalar to regular array with one element. if dataput.shape == (): if self._isvlen: dataput=numpy.array(dataput,'O') else: dataput=numpy.array(dataput,dataput.dtype) self._put(dataput,a,b,c) def __len__(self): if not self.shape: raise TypeError('len() of unsized object') else: return self.shape[0] def assignValue(self,val): """ **`assignValue(self, val)`** assign a value to a scalar variable. Provided for compatibility with Scientific.IO.NetCDF, can also be done by assigning to an Ellipsis slice ([...]).""" if len(self.dimensions): raise IndexError('to assign values to a non-scalar variable, use a slice') self[:]=val def getValue(self): """ **`getValue(self)`** get the value of a scalar variable. Provided for compatibility with Scientific.IO.NetCDF, can also be done by slicing with an Ellipsis ([...]).""" if len(self.dimensions): raise IndexError('to retrieve values from a non-scalar variable, use slicing') return self[slice(None)] def set_auto_chartostring(self,chartostring): """ **`set_auto_chartostring(self,chartostring)`** turn on or off automatic conversion of character variable data to and from numpy fixed length string arrays when the `_Encoding` variable attribute is set. If `chartostring` is set to `True`, when data is read from a character variable (dtype = `S1`) that has an `_Encoding` attribute, it is converted to a numpy fixed length unicode string array (dtype = `UN`, where `N` is the length of the the rightmost dimension of the variable). The value of `_Encoding` is the unicode encoding that is used to decode the bytes into strings. When numpy string data is written to a variable it is converted back to indiviual bytes, with the number of bytes in each string equalling the rightmost dimension of the variable. The default value of `chartostring` is `True` (automatic conversions are performed). 
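A minimal sketch (dataset, dimension and variable names here are hypothetical;
assumes dimensions `n` (length 2) and `nchar` (length 3) already exist)::

    v = nc.createVariable('names', 'S1', ('n', 'nchar'))
    v._Encoding = 'ascii'
    v[:] = numpy.array(['foo', 'bar'], dtype='S3') # stored as characters
    strs = v[:] # read back as a fixed-length unicode string array of shape (2,)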
""" self.chartostring = bool(chartostring) def use_nc_get_vars(self,use_nc_get_vars): """ **`use_nc_get_vars(self,_no_get_vars)`** enable the use of netcdf library routine `nc_get_vars` to retrieve strided variable slices. By default, `nc_get_vars` not used since it slower than multiple calls to the unstrided read routine `nc_get_vara` in most cases. """ self._no_get_vars = not bool(use_nc_get_vars) def set_auto_maskandscale(self,maskandscale): """ **`set_auto_maskandscale(self,maskandscale)`** turn on or off automatic conversion of variable data to and from masked arrays, automatic packing/unpacking of variable data using `scale_factor` and `add_offset` attributes and automatic conversion of signed integer data to unsigned integer data if the `_Unsigned` attribute exists. If `maskandscale` is set to `True`, when data is read from a variable it is converted to a masked array if any of the values are exactly equal to the either the netCDF _FillValue or the value specified by the missing_value variable attribute. The fill_value of the masked array is set to the missing_value attribute (if it exists), otherwise the netCDF _FillValue attribute (which has a default value for each data type). When data is written to a variable, the masked array is converted back to a regular numpy array by replacing all the masked values by the missing_value attribute of the variable (if it exists). If the variable has no missing_value attribute, the _FillValue is used instead. If the variable has valid_min/valid_max and missing_value attributes, data outside the specified range will be set to missing_value. If `maskandscale` is set to `True`, and the variable has a `scale_factor` or an `add_offset` attribute, then data read from that variable is unpacked using:: data = self.scale_factor*data + self.add_offset When data is written to a variable it is packed using:: data = (data - self.add_offset)/self.scale_factor If either scale_factor is present, but add_offset is missing, add_offset is assumed zero. If add_offset is present, but scale_factor is missing, scale_factor is assumed to be one. For more information on how `scale_factor` and `add_offset` can be used to provide simple compression, see the [PSD metadata conventions](http://www.esrl.noaa.gov/psd/data/gridded/conventions/cdc_netcdf_standard.shtml). In addition, if `maskandscale` is set to `True`, and if the variable has an attribute `_Unsigned` set, and the variable has a signed integer data type, a view to the data is returned with the corresponding unsigned integer data type. This convention is used by the netcdf-java library to save unsigned integer data in `NETCDF3` or `NETCDF4_CLASSIC` files (since the `NETCDF3` data model does not have unsigned integer data types). The default value of `maskandscale` is `True` (automatic conversions are performed). """ self.scale = self.mask = bool(maskandscale) def set_auto_scale(self,scale): """ **`set_auto_scale(self,scale)`** turn on or off automatic packing/unpacking of variable data using `scale_factor` and `add_offset` attributes. Also turns on and off automatic conversion of signed integer data to unsigned integer data if the variable has an `_Unsigned` attribute. 
If `scale` is set to `True`, and the variable has a `scale_factor` or an `add_offset` attribute, then data read from that variable is unpacked using:: data = self.scale_factor*data + self.add_offset When data is written to a variable it is packed using:: data = (data - self.add_offset)/self.scale_factor If either scale_factor is present, but add_offset is missing, add_offset is assumed zero. If add_offset is present, but scale_factor is missing, scale_factor is assumed to be one. For more information on how `scale_factor` and `add_offset` can be used to provide simple compression, see the [PSD metadata conventions](http://www.esrl.noaa.gov/psd/data/gridded/conventions/cdc_netcdf_standard.shtml). In addition, if `scale` is set to `True`, and if the variable has an attribute `_Unsigned` set, and the variable has a signed integer data type, a view to the data is returned with the corresponding unsigned integer datatype. This convention is used by the netcdf-java library to save unsigned integer data in `NETCDF3` or `NETCDF4_CLASSIC` files (since the `NETCDF3` data model does not have unsigned integer data types). The default value of `scale` is `True` (automatic conversions are performed). """ self.scale = bool(scale) def set_auto_mask(self,mask): """ **`set_auto_mask(self,mask)`** turn on or off automatic conversion of variable data to and from masked arrays . If `mask` is set to `True`, when data is read from a variable it is converted to a masked array if any of the values are exactly equal to the either the netCDF _FillValue or the value specified by the missing_value variable attribute. The fill_value of the masked array is set to the missing_value attribute (if it exists), otherwise the netCDF _FillValue attribute (which has a default value for each data type). When data is written to a variable, the masked array is converted back to a regular numpy array by replacing all the masked values by the missing_value attribute of the variable (if it exists). If the variable has no missing_value attribute, the _FillValue is used instead. If the variable has valid_min/valid_max and missing_value attributes, data outside the specified range will be set to missing_value. The default value of `mask` is `True` (automatic conversions are performed). """ self.mask = bool(mask) def _put(self,ndarray data,start,count,stride): """Private method to put data into a netCDF variable""" cdef int ierr, ndims cdef npy_intp totelem cdef size_t *startp cdef size_t *countp cdef ptrdiff_t *stridep cdef char **strdata cdef void* elptr cdef char* databuff cdef ndarray dataarr cdef nc_vlen_t *vldata # rank of variable. ndims = len(self.dimensions) # make sure data is contiguous. # if not, make a local copy. if not PyArray_ISCONTIGUOUS(data): data = data.copy() # fill up startp,countp,stridep. totelem = 1 negstride = 0 sl = [] startp = malloc(sizeof(size_t) * ndims) countp = malloc(sizeof(size_t) * ndims) stridep = malloc(sizeof(ptrdiff_t) * ndims) for n from 0 <= n < ndims: count[n] = abs(count[n]) # make -1 into +1 countp[n] = count[n] # for neg strides, reverse order (then flip that axis after data read in) if stride[n] < 0: negstride = 1 stridep[n] = -stride[n] startp[n] = start[n]+stride[n]*(count[n]-1) stride[n] = -stride[n] sl.append(slice(None, None, -1)) # this slice will reverse the data else: startp[n] = start[n] stridep[n] = stride[n] sl.append(slice(None,None, 1)) totelem = totelem*countp[n] # check to see that size of data array is what is expected # for slice given. 
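# totelem (computed above) is the number of elements the start/count selection will
# write; the data array must contain exactly that many elements or the slice and the
# data are inconsistent.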
    def _put(self,ndarray data,start,count,stride):
        """Private method to put data into a netCDF variable"""
        cdef int ierr, ndims
        cdef npy_intp totelem
        cdef size_t *startp
        cdef size_t *countp
        cdef ptrdiff_t *stridep
        cdef char **strdata
        cdef void* elptr
        cdef char* databuff
        cdef ndarray dataarr
        cdef nc_vlen_t *vldata
        # rank of variable.
        ndims = len(self.dimensions)
        # make sure data is contiguous.
        # if not, make a local copy.
        if not PyArray_ISCONTIGUOUS(data):
            data = data.copy()
        # fill up startp,countp,stridep.
        totelem = 1
        negstride = 0
        sl = []
        startp = <size_t *>malloc(sizeof(size_t) * ndims)
        countp = <size_t *>malloc(sizeof(size_t) * ndims)
        stridep = <ptrdiff_t *>malloc(sizeof(ptrdiff_t) * ndims)
        for n from 0 <= n < ndims:
            count[n] = abs(count[n]) # make -1 into +1
            countp[n] = count[n]
            # for neg strides, reverse order (then flip that axis after data read in)
            if stride[n] < 0:
                negstride = 1
                stridep[n] = -stride[n]
                startp[n] = start[n]+stride[n]*(count[n]-1)
                stride[n] = -stride[n]
                sl.append(slice(None, None, -1)) # this slice will reverse the data
            else:
                startp[n] = start[n]
                stridep[n] = stride[n]
                sl.append(slice(None,None, 1))
            totelem = totelem*countp[n]
        # check to see that size of data array is what is expected
        # for slice given.
        dataelem = PyArray_SIZE(data)
        if totelem != dataelem:
            raise IndexError('size of data array does not conform to slice')
        if negstride:
            # reverse data along axes with negative strides.
            data = data[sl].copy() # make sure a copy is made.
        if self._isprimitive or self._iscompound or self._isenum:
            # primitive, enum or compound data type.
            # if data type of array doesn't match variable,
            # try to cast the data.
            if self.dtype != data.dtype:
                data = data.astype(self.dtype) # cast data, if necessary.
            # byte-swap data in numpy array so that it has native
            # endian byte order (this is what netcdf-c expects -
            # issue #554, pull request #555)
            if not data.dtype.isnative:
                data = data.byteswap()
            # strides all 1 or scalar variable, use put_vara (faster)
            if sum(stride) == ndims or ndims == 0:
                ierr = nc_put_vara(self._grpid, self._varid,
                                   startp, countp, data.data)
            else:
                ierr = nc_put_vars(self._grpid, self._varid,
                                   startp, countp, stridep, data.data)
            _ensure_nc_success(ierr)
        elif self._isvlen:
            if data.dtype.char !='O':
                raise TypeError('data to put in string variable must be an object array containing Python strings')
            # flatten data array.
            data = data.flatten()
            if self.dtype == str:
                # convert all elements from strings to bytes
                # use _Encoding attribute to specify string encoding - if
                # not given, use 'utf-8'.
                encoding = getattr(self,'_Encoding','utf-8')
                for n in range(data.shape[0]):
                    data[n] = _strencode(data[n],encoding=encoding)
                # vlen string (NC_STRING)
                # loop over elements of object array, put data buffer for
                # each element in struct.
                # allocate struct array to hold vlen data.
                strdata = <char **>malloc(sizeof(char *)*totelem)
                for i from 0 <= i < totelem:
                    strdata[i] = data[i]
                # strides all 1 or scalar variable, use put_vara (faster)
                if sum(stride) == ndims or ndims == 0:
                    ierr = nc_put_vara(self._grpid, self._varid,
                                       startp, countp, strdata)
                else:
                    raise IndexError('strides must all be 1 for string variables')
                    #ierr = nc_put_vars(self._grpid, self._varid,
                    #                   startp, countp, stridep, strdata)
                _ensure_nc_success(ierr)
                free(strdata)
            else:
                # regular vlen.
                # loop over elements of object array, put data buffer for
                # each element in struct.
                databuff = data.data
                # allocate struct array to hold vlen data.
                vldata = <nc_vlen_t *>malloc(totelem*sizeof(nc_vlen_t))
                for i from 0 <= i < totelem:
                    elptr = (<void **>databuff)[0]
                    dataarr = <ndarray>elptr
                    if self.dtype != dataarr.dtype.str[1:]:
                        #dataarr = dataarr.astype(self.dtype) # cast data, if necessary.
                        # casting doesn't work ?? just raise TypeError
                        raise TypeError("wrong data type in object array: should be %s, got %s" % (self.dtype,dataarr.dtype))
                    vldata[i].len = PyArray_SIZE(dataarr)
                    vldata[i].p = dataarr.data
                    databuff = databuff + data.strides[0]
                # strides all 1 or scalar variable, use put_vara (faster)
                if sum(stride) == ndims or ndims == 0:
                    ierr = nc_put_vara(self._grpid, self._varid,
                                       startp, countp, vldata)
                else:
                    raise IndexError('strides must all be 1 for vlen variables')
                    #ierr = nc_put_vars(self._grpid, self._varid,
                    #                   startp, countp, stridep, vldata)
                _ensure_nc_success(ierr)
                # free the pointer array.
                free(vldata)
        free(startp)
        free(countp)
        free(stridep)
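    # Illustrative sketch (not part of the library source; 'nc' is assumed to
    # be a writable Dataset and the names are hypothetical): writing a VLEN
    # string variable goes through the object-array code path in _put above,
    # and strided writes are rejected for VLEN data.
    #
    #     import numpy
    #     nc.createDimension('state', 4)
    #     names = nc.createVariable('names', str, ('state',))
    #     names[:] = numpy.array(['CO', 'NM', 'UT', 'AZ'], dtype=object)
    #     # a strided write such as names[::2] = ... raises IndexError
    #     # ('strides must all be 1 for vlen variables')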
    def _get(self,start,count,stride):
        """Private method to retrieve data from a netCDF variable"""
        cdef int ierr, ndims
        cdef size_t *startp
        cdef size_t *countp
        cdef ptrdiff_t *stridep
        cdef ndarray data, dataarr
        cdef void *elptr
        cdef char **strdata
        cdef nc_vlen_t *vldata
        # if one of the counts is negative, then it is an index
        # and not a slice so the resulting array
        # should be 'squeezed' to remove the singleton dimension.
        shapeout = ()
        squeeze_out = False
        for lendim in count:
            if lendim == -1:
                shapeout = shapeout + (1,)
                squeeze_out = True
            else:
                shapeout = shapeout + (lendim,)
        # rank of variable.
        ndims = len(self.dimensions)
        # fill up startp,countp,stridep.
        negstride = 0
        sl = []
        startp = <size_t *>malloc(sizeof(size_t) * ndims)
        countp = <size_t *>malloc(sizeof(size_t) * ndims)
        stridep = <ptrdiff_t *>malloc(sizeof(ptrdiff_t) * ndims)
        for n from 0 <= n < ndims:
            count[n] = abs(count[n]) # make -1 into +1
            countp[n] = count[n]
            # for neg strides, reverse order (then flip that axis after data read in)
            if stride[n] < 0:
                negstride = 1
                stridep[n] = -stride[n]
                startp[n] = start[n]+stride[n]*(count[n]-1)
                stride[n] = -stride[n]
                sl.append(slice(None, None, -1)) # this slice will reverse the data
            else:
                startp[n] = start[n]
                stridep[n] = stride[n]
                sl.append(slice(None,None, 1))
        if self._isprimitive or self._iscompound or self._isenum:
            data = numpy.empty(shapeout, self.dtype)
            # strides all 1 or scalar variable, use get_vara (faster)
            if sum(stride) == ndims or ndims == 0:
                with nogil:
                    ierr = nc_get_vara(self._grpid, self._varid,
                                       startp, countp, data.data)
            else:
                with nogil:
                    ierr = nc_get_vars(self._grpid, self._varid,
                                       startp, countp, stridep, data.data)
            if ierr == NC_EINVALCOORDS:
                raise IndexError
            elif ierr != NC_NOERR:
                _ensure_nc_success(ierr)
        elif self._isvlen:
            # allocate array of correct primitive type.
            data = numpy.empty(shapeout, 'O')
            # flatten data array.
            data = data.flatten()
            totelem = PyArray_SIZE(data)
            if self.dtype == str:
                # vlen string (NC_STRING)
                # allocate pointer array to hold string data.
                strdata = <char **>malloc(sizeof(char *) * totelem)
                # strides all 1 or scalar variable, use get_vara (faster)
                if sum(stride) == ndims or ndims == 0:
                    with nogil:
                        ierr = nc_get_vara(self._grpid, self._varid,
                                           startp, countp, strdata)
                else:
                    # FIXME: is this a bug in netCDF4?
                    raise IndexError('strides must all be 1 for string variables')
                    #ierr = nc_get_vars(self._grpid, self._varid,
                    #                   startp, countp, stridep, strdata)
                if ierr == NC_EINVALCOORDS:
                    raise IndexError
                elif ierr != NC_NOERR:
                    _ensure_nc_success(ierr)
                # loop over elements of object array, fill array with
                # contents of strdata.
                # use _Encoding attribute to decode string to bytes - if
                # not given, use 'utf-8'.
                encoding = getattr(self,'_Encoding','utf-8')
                for i from 0 <= i < totelem:
                    data[i] = strdata[i].decode(encoding)
                # reshape the output array
                data = numpy.reshape(data, shapeout)
                # free string data internally allocated in netcdf C lib
                ierr = nc_free_string(totelem, strdata)
                # free the pointer array
                free(strdata)
            else:
                # regular vlen
                # allocate struct array to hold vlen data.
                vldata = <nc_vlen_t *>malloc(totelem*sizeof(nc_vlen_t))
                for i in range(totelem):
                    vldata[i].len = 0
                    vldata[i].p = 0
                # strides all 1 or scalar variable, use get_vara (faster)
                if sum(stride) == ndims or ndims == 0:
                    with nogil:
                        ierr = nc_get_vara(self._grpid, self._varid,
                                           startp, countp, vldata)
                else:
                    raise IndexError('strides must all be 1 for vlen variables')
                    #ierr = nc_get_vars(self._grpid, self._varid,
                    #                   startp, countp, stridep, vldata)
                if ierr == NC_EINVALCOORDS:
                    raise IndexError
                elif ierr != NC_NOERR:
                    _ensure_nc_success(ierr)
                # loop over elements of object array, fill array with
                # contents of vlarray struct, put array in object array.
                for i from 0 <= i < totelem:
                    dataarr = numpy.empty(vldata[i].len, self.dtype)
                    #dataarr.data = <char *>vldata[i].p
                    memcpy(dataarr.data, vldata[i].p, dataarr.nbytes)
                    data[i] = dataarr
                # reshape the output array
                data = numpy.reshape(data, shapeout)
                # free vlen data internally allocated in netcdf C lib
                ierr = nc_free_vlens(totelem, vldata)
                # free the pointer array
                free(vldata)
        free(startp)
        free(countp)
        free(stridep)
        if negstride:
            # reverse data along axes with negative strides.
            data = data[sl].copy() # make a copy so data is contiguous.
        # netcdf-c always returns data in native byte order,
        # regardless of variable endian-ness. Here we swap the
        # bytes if the variable dtype is not native endian, so the
        # dtype of the returned numpy array matches the variable dtype.
        # (pull request #555, issue #554).
        if not data.dtype.isnative:
            data.byteswap(True) # in-place byteswap
        if not self.dimensions:
            return data[0] # a scalar
        elif squeeze_out:
            return numpy.squeeze(data)
        else:
            return data
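    # Illustrative sketch of the collective-IO toggle provided by
    # set_collective below (not part of the library source).  It assumes
    # netcdf-c and hdf5 were built with MPI support, mpi4py is installed,
    # the script is launched with mpirun, and the file name is hypothetical.
    #
    #     from mpi4py import MPI
    #     from netCDF4 import Dataset
    #     nc = Dataset('parallel_test.nc', 'w', parallel=True,
    #                  comm=MPI.COMM_WORLD)
    #     nc.createDimension('x', 16)
    #     v = nc.createVariable('v', 'f8', ('x',))
    #     v.set_collective(True)   # switch from independent to collective IO
    #     v[MPI.COMM_WORLD.rank] = MPI.COMM_WORLD.rank
    #     nc.close()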
    def set_collective(self, value):
        """
**`set_collective(self,True_or_False)`**

turn on or off collective parallel IO access. Ignored if file is not
open for parallel access.
        """
        IF HAS_NC_PAR:
            # set collective MPI IO mode on or off
            if value:
                ierr = nc_var_par_access(self._grpid, self._varid, NC_COLLECTIVE)
            else:
                ierr = nc_var_par_access(self._grpid, self._varid, NC_INDEPENDENT)
            _ensure_nc_success(ierr)
        ELSE:
            pass # does nothing

    def __reduce__(self):
        # raise error if user tries to pickle a Variable object.
        raise NotImplementedError('Variable is not picklable')

# Compound datatype support.

cdef class CompoundType:
    """
A `netCDF4.CompoundType` instance is used to describe a compound data
type, and can be passed to the `netCDF4.Dataset.createVariable` method of
a `netCDF4.Dataset` or `netCDF4.Group` instance.
Compound data types map to numpy structured arrays.
See `netCDF4.CompoundType.__init__` for more details.

The instance variables `dtype` and `name` should not be modified by
the user.
    """
    cdef public nc_type _nc_type
    cdef public dtype, name
    __pdoc__['CompoundType.name'] = \
    """String name."""
    __pdoc__['CompoundType.dtype'] = \
    """A numpy dtype object describing the compound data type."""
    def __init__(self, grp, object dt, object dtype_name, **kwargs):
        """
***`__init__(group, datatype, datatype_name)`***

CompoundType constructor.

**`group`**: `netCDF4.Group` instance to associate with the compound datatype.

**`datatype`**: A numpy dtype object describing a structured (a.k.a record)
array.  Can be composed of homogeneous numeric or character data types, or
other structured array data types.

**`datatype_name`**: a Python string containing a description of the
compound data type.

***Note 1***: When creating nested compound data types,
the inner compound data types must already be associated with CompoundType
instances (so create CompoundType instances for the innermost structures
first).

***Note 2***: `netCDF4.CompoundType` instances should be created using the
`netCDF4.Dataset.createCompoundType`
method of a `netCDF4.Dataset` or `netCDF4.Group` instance, not using this
class directly.
        """
        cdef nc_type xtype
        # convert dt to a numpy datatype object
        # and make sure the isalignedstruct flag is set to True
        # (so padding is added to the fields to match what a
        # C compiler would output for a similar C-struct).
        # This is needed because nc_get_vara is
        # apparently expecting the data buffer to include
        # padding to match what a C struct would have.
        # (this may or may not be still true, but empirical
        # evidence suggests that segfaults occur if this
        # alignment step is skipped - see issue #705).
        dt = _set_alignment(numpy.dtype(dt))
        if 'typeid' in kwargs:
            xtype = kwargs['typeid']
        else:
            xtype = _def_compound(grp, dt, dtype_name)
        self._nc_type = xtype
        self.dtype = dt
        self.name = dtype_name

    def __repr__(self):
        if python3:
            return self.__unicode__()
        else:
            return unicode(self).encode('utf-8')

    def __unicode__(self):
        return repr(type(self))+": name = '%s', numpy dtype = %s\n" %\
        (self.name,self.dtype)

    def __reduce__(self):
        # raise error if user tries to pickle a CompoundType object.
raise NotImplementedError('CompoundType is not picklable') def _set_alignment(dt): # recursively set alignment flag in nested structured data type names = dt.names; formats = [] for name in names: fmt = dt.fields[name][0] if fmt.kind == 'V': if fmt.shape == (): dtx = _set_alignment(dt.fields[name][0]) else: if fmt.subdtype[0].kind == 'V': # structured dtype raise TypeError('nested structured dtype arrays not supported') else: dtx = dt.fields[name][0] else: # primitive data type dtx = dt.fields[name][0] formats.append(dtx) # leave out offsets, they will be re-computed to preserve alignment. dtype_dict = {'names':names,'formats':formats} return numpy.dtype(dtype_dict, align=True) cdef _def_compound(grp, object dt, object dtype_name): # private function used to construct a netcdf compound data type # from a numpy dtype object by CompoundType.__init__. cdef nc_type xtype, xtype_tmp cdef int ierr, ndims cdef size_t offset, size cdef char *namstring cdef char *nested_namstring cdef int *dim_sizes bytestr = _strencode(dtype_name) namstring = bytestr size = dt.itemsize ierr = nc_def_compound(grp._grpid, size, namstring, &xtype) _ensure_nc_success(ierr) names = list(dt.fields.keys()) formats = [v[0] for v in dt.fields.values()] offsets = [v[1] for v in dt.fields.values()] # make sure entries in lists sorted by offset. # (don't know why this is necessary, but it is for version 4.0.1) names = _sortbylist(names, offsets) formats = _sortbylist(formats, offsets) offsets.sort() for name, format, offset in zip(names, formats, offsets): bytestr = _strencode(name) namstring = bytestr if format.kind != 'V': # scalar primitive type try: xtype_tmp = _nptonctype[format.str[1:]] except KeyError: raise ValueError('Unsupported compound type element') ierr = nc_insert_compound(grp._grpid, xtype, namstring, offset, xtype_tmp) _ensure_nc_success(ierr) else: if format.shape == (): # nested scalar compound type # find this compound type in this group or it's parents. xtype_tmp = _find_cmptype(grp, format) bytestr = _strencode(name) nested_namstring = bytestr ierr = nc_insert_compound(grp._grpid, xtype,\ nested_namstring,\ offset, xtype_tmp) _ensure_nc_success(ierr) else: # nested array compound element ndims = len(format.shape) dim_sizes = malloc(sizeof(int) * ndims) for n from 0 <= n < ndims: dim_sizes[n] = format.shape[n] if format.subdtype[0].kind != 'V': # primitive type. try: xtype_tmp = _nptonctype[format.subdtype[0].str[1:]] except KeyError: raise ValueError('Unsupported compound type element') ierr = nc_insert_array_compound(grp._grpid,xtype,namstring, offset,xtype_tmp,ndims,dim_sizes) _ensure_nc_success(ierr) else: # nested array compound type. raise TypeError('nested structured dtype arrays not supported') # this code is untested and probably does not work, disable # for now... # # find this compound type in this group or it's parents. # xtype_tmp = _find_cmptype(grp, format.subdtype[0]) # bytestr = _strencode(name) # nested_namstring = bytestr # ierr = nc_insert_array_compound(grp._grpid,xtype,\ # nested_namstring,\ # offset,xtype_tmp,\ # ndims,dim_sizes) # _ensure_nc_success(ierr) free(dim_sizes) return xtype cdef _find_cmptype(grp, dtype): # look for data type in this group and it's parents. # return datatype id when found, if not found, raise exception. 
cdef nc_type xtype match = False for cmpname, cmpdt in grp.cmptypes.items(): xtype = cmpdt._nc_type names1 = dtype.fields.keys() names2 = cmpdt.dtype.fields.keys() formats1 = [v[0] for v in dtype.fields.values()] formats2 = [v[0] for v in cmpdt.dtype.fields.values()] # match names, formats, but not offsets (they may be changed # by netcdf lib). if names1==names2 and formats1==formats2: match = True break if not match: try: parent_grp = grp.parent except AttributeError: raise ValueError("cannot find compound type in this group or parent groups") if parent_grp is None: raise ValueError("cannot find compound type in this group or parent groups") else: xtype = _find_cmptype(parent_grp,dtype) return xtype cdef _read_compound(group, nc_type xtype, endian=None): # read a compound data type id from an existing file, # construct a corresponding numpy dtype instance, # then use that to create a CompoundType instance. # called by _get_vars, _get_types and _get_att. # Calls itself recursively for nested compound types. cdef int ierr, nf, numdims, ndim, classp, _grpid cdef size_t nfields, offset cdef nc_type field_typeid cdef int *dim_sizes cdef char field_namstring[NC_MAX_NAME+1] cdef char cmp_namstring[NC_MAX_NAME+1] # get name and number of fields. _grpid = group._grpid with nogil: ierr = nc_inq_compound(_grpid, xtype, cmp_namstring, NULL, &nfields) _ensure_nc_success(ierr) name = cmp_namstring.decode('utf-8') # loop over fields. names = [] formats = [] offsets = [] for nf from 0 <= nf < nfields: with nogil: ierr = nc_inq_compound_field(_grpid, xtype, nf, field_namstring, &offset, &field_typeid, &numdims, NULL) _ensure_nc_success(ierr) dim_sizes = malloc(sizeof(int) * numdims) with nogil: ierr = nc_inq_compound_field(_grpid, xtype, nf, field_namstring, &offset, &field_typeid, &numdims, dim_sizes) _ensure_nc_success(ierr) field_name = field_namstring.decode('utf-8') names.append(field_name) offsets.append(offset) # if numdims=0, not an array. field_shape = () if numdims != 0: for ndim from 0 <= ndim < numdims: field_shape = field_shape + (dim_sizes[ndim],) free(dim_sizes) # check to see if this field is a nested compound type. try: field_type = _nctonptype[field_typeid] if endian is not None: format = endian + format except KeyError: with nogil: ierr = nc_inq_user_type(_grpid, field_typeid,NULL,NULL,NULL,NULL,&classp) if classp == NC_COMPOUND: # a compound type # recursively call this function? field_type = _read_compound(group, field_typeid, endian=endian) else: raise KeyError('compound field of an unsupported data type') if field_shape != (): formats.append((field_type,field_shape)) else: formats.append(field_type) # make sure entries in lists sorted by offset. names = _sortbylist(names, offsets) formats = _sortbylist(formats, offsets) offsets.sort() # create a dict that can be converted into a numpy dtype. dtype_dict = {'names':names,'formats':formats,'offsets':offsets} return CompoundType(group, dtype_dict, name, typeid=xtype) # VLEN datatype support. cdef class VLType: """ A `netCDF4.VLType` instance is used to describe a variable length (VLEN) data type, and can be passed to the the `netCDF4.Dataset.createVariable` method of a `netCDF4.Dataset` or `netCDF4.Group` instance. See `netCDF4.VLType.__init__` for more details. The instance variables `dtype` and `name` should not be modified by the user. 
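
Example (an illustrative sketch, not taken from the library itself; it
assumes `numpy` has been imported and `nc` is a writable `netCDF4.Dataset`,
and the dimension/variable names are hypothetical):

    :::python
    >>> vlen_t = nc.createVLType(numpy.int32, 'phony_vlen')
    >>> nc.createDimension('x', 3)
    >>> v = nc.createVariable('phony_vlen_var', vlen_t, ('x',))
    >>> v[0] = numpy.array([1, 2, 3], dtype=numpy.int32)  # ragged rows allowed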
""" cdef public nc_type _nc_type cdef public dtype, name __pdoc__['VLType.name'] = \ """String name.""" __pdoc__['VLType.dtype'] = \ """A numpy dtype object describing the component type for the VLEN.""" def __init__(self, grp, object dt, object dtype_name, **kwargs): """ **`__init__(group, datatype, datatype_name)`** VLType constructor. **`group`**: `netCDF4.Group` instance to associate with the VLEN datatype. **`datatype`**: An numpy dtype object describing the component type for the variable length array. **`datatype_name`**: a Python string containing a description of the VLEN data type. ***`Note`***: `netCDF4.VLType` instances should be created using the `netCDF4.Dataset.createVLType` method of a `netCDF4.Dataset` or `netCDF4.Group` instance, not using this class directly. """ cdef nc_type xtype if 'typeid' in kwargs: xtype = kwargs['typeid'] else: xtype, dt = _def_vlen(grp, dt, dtype_name) self._nc_type = xtype self.dtype = dt if dt == str: self.name = None else: self.name = dtype_name def __repr__(self): if python3: return self.__unicode__() else: return unicode(self).encode('utf-8') def __unicode__(self): if self.dtype == str: return repr(type(self))+': string type' else: return repr(type(self))+": name = '%s', numpy dtype = %s\n" %\ (self.name, self.dtype) def __reduce__(self): # raise error is user tries to pickle a VLType object. raise NotImplementedError('VLType is not picklable') cdef _def_vlen(grp, object dt, object dtype_name): # private function used to construct a netcdf VLEN data type # from a numpy dtype object or python str object by VLType.__init__. cdef nc_type xtype, xtype_tmp cdef int ierr, ndims cdef size_t offset, size cdef char *namstring cdef char *nested_namstring if dt == str: # python string, use NC_STRING xtype = NC_STRING # dtype_name ignored else: # numpy datatype bytestr = _strencode(dtype_name) namstring = bytestr dt = numpy.dtype(dt) # convert to numpy datatype. if dt.str[1:] in _supportedtypes: # find netCDF primitive data type corresponding to # specified numpy data type. xtype_tmp = _nptonctype[dt.str[1:]] ierr = nc_def_vlen(grp._grpid, namstring, xtype_tmp, &xtype); _ensure_nc_success(ierr) else: raise KeyError("unsupported datatype specified for VLEN") return xtype, dt cdef _read_vlen(group, nc_type xtype, endian=None): # read a VLEN data type id from an existing file, # construct a corresponding numpy dtype instance, # then use that to create a VLType instance. # called by _get_types, _get_vars. cdef int ierr, _grpid cdef size_t vlsize cdef nc_type base_xtype cdef char vl_namstring[NC_MAX_NAME+1] _grpid = group._grpid if xtype == NC_STRING: dt = str name = None else: with nogil: ierr = nc_inq_vlen(_grpid, xtype, vl_namstring, &vlsize, &base_xtype) _ensure_nc_success(ierr) name = vl_namstring.decode('utf-8') try: datatype = _nctonptype[base_xtype] if endian is not None: datatype = endian + datatype dt = numpy.dtype(datatype) # see if it is a primitive type except KeyError: raise KeyError("unsupported component type for VLEN") return VLType(group, dt, name, typeid=xtype) # Enum datatype support. cdef class EnumType: """ A `netCDF4.EnumType` instance is used to describe an Enum data type, and can be passed to the the `netCDF4.Dataset.createVariable` method of a `netCDF4.Dataset` or `netCDF4.Group` instance. See `netCDF4.EnumType.__init__` for more details. The instance variables `dtype`, `name` and `enum_dict` should not be modified by the user. 
""" cdef public nc_type _nc_type cdef public dtype, name, enum_dict __pdoc__['EnumType.name'] = \ """String name.""" __pdoc__['EnumType.dtype'] = \ """A numpy integer dtype object describing the base type for the Enum.""" __pdoc__['EnumType.enum_dict'] = \ """A python dictionary describing the enum fields and values.""" def __init__(self, grp, object dt, object dtype_name, object enum_dict, **kwargs): """ **`__init__(group, datatype, datatype_name, enum_dict)`** EnumType constructor. **`group`**: `netCDF4.Group` instance to associate with the VLEN datatype. **`datatype`**: An numpy integer dtype object describing the base type for the Enum. **`datatype_name`**: a Python string containing a description of the Enum data type. **`enum_dict`**: a Python dictionary containing the Enum field/value pairs. ***`Note`***: `netCDF4.EnumType` instances should be created using the `netCDF4.Dataset.createEnumType` method of a `netCDF4.Dataset` or `netCDF4.Group` instance, not using this class directly. """ cdef nc_type xtype if 'typeid' in kwargs: xtype = kwargs['typeid'] else: xtype, dt = _def_enum(grp, dt, dtype_name, enum_dict) self._nc_type = xtype self.dtype = dt self.name = dtype_name self.enum_dict = enum_dict def __repr__(self): if python3: return self.__unicode__() else: return unicode(self).encode('utf-8') def __unicode__(self): return repr(type(self))+\ ": name = '%s', numpy dtype = %s, fields/values =%s\n" %\ (self.name, self.dtype, self.enum_dict) def __reduce__(self): # raise error is user tries to pickle a EnumType object. raise NotImplementedError('EnumType is not picklable') cdef _def_enum(grp, object dt, object dtype_name, object enum_dict): # private function used to construct a netCDF Enum data type # from a numpy dtype object or python str object by EnumType.__init__. cdef nc_type xtype, xtype_tmp cdef int ierr cdef char *namstring cdef ndarray value_arr bytestr = _strencode(dtype_name) namstring = bytestr dt = numpy.dtype(dt) # convert to numpy datatype. if dt.str[1:] in _intnptonctype.keys(): # find netCDF primitive data type corresponding to # specified numpy data type. xtype_tmp = _intnptonctype[dt.str[1:]] ierr = nc_def_enum(grp._grpid, xtype_tmp, namstring, &xtype); _ensure_nc_success(ierr) else: msg="unsupported datatype specified for Enum (must be integer)" raise KeyError(msg) # insert named members into enum type. for field in enum_dict: value_arr = numpy.array(enum_dict[field],dt) bytestr = _strencode(field) namstring = bytestr ierr = nc_insert_enum(grp._grpid, xtype, namstring, value_arr.data) _ensure_nc_success(ierr) return xtype, dt cdef _read_enum(group, nc_type xtype, endian=None): # read a Enum data type id from an existing file, # construct a corresponding numpy dtype instance, # then use that to create a EnumType instance. # called by _get_types, _get_vars. cdef int ierr, _grpid, nmem cdef char enum_val cdef nc_type base_xtype cdef char enum_namstring[NC_MAX_NAME+1] cdef size_t nmembers _grpid = group._grpid # get name, datatype, and number of members. with nogil: ierr = nc_inq_enum(_grpid, xtype, enum_namstring, &base_xtype, NULL,\ &nmembers) _ensure_nc_success(ierr) name = enum_namstring.decode('utf-8') try: datatype = _nctonptype[base_xtype] if endian is not None: datatype = endian + datatype dt = numpy.dtype(datatype) # see if it is a primitive type except KeyError: raise KeyError("unsupported component type for VLEN") # loop over members, build dict. 
    enum_dict = {}
    for nmem from 0 <= nmem < nmembers:
        with nogil:
            ierr = nc_inq_enum_member(_grpid, xtype, nmem, \
                                      enum_namstring, &enum_val)
        _ensure_nc_success(ierr)
        name = enum_namstring.decode('utf-8')
        enum_dict[name] = int(enum_val)
    return EnumType(group, dt, name, enum_dict, typeid=xtype)

cdef _strencode(pystr,encoding=None):
    # encode a string into bytes.  If already bytes, do nothing.
    # uses 'utf-8' for default encoding.
    if encoding is None:
        encoding = 'utf-8'
    try:
        return pystr.encode(encoding)
    except (AttributeError, UnicodeDecodeError):
        return pystr # already bytes or unicode?

def _to_ascii(bytestr):
    # encode a byte string to an ascii encoded string.
    if python3:
        return str(bytestr,encoding='ascii')
    else:
        return bytestr.encode('ascii')

#----------------------------------------
# extra utilities (formerly in utils.pyx)
#----------------------------------------
from datetime import timedelta, datetime, MINYEAR
from netcdftime import _parse_date, microsec_units, millisec_units,\
                       sec_units, min_units, hr_units, day_units

# start of the gregorian calendar
gregorian = datetime(1582,10,15)

def _dateparse(timestr):
    """parse a string of the form time-units since yyyy-mm-dd hh:mm:ss,
    return a datetime instance"""
    # same as version in netcdftime, but returns a timezone naive
    # python datetime instance with the utc_offset included.
    timestr_split = timestr.split()
    units = timestr_split[0].lower()
    if timestr_split[1].lower() != 'since':
        raise ValueError("no 'since' in unit_string")
    # parse the date string.
    n = timestr.find('since')+6
    isostring = timestr[n:]
    year, month, day, hour, minute, second, utc_offset =\
        _parse_date( isostring.strip() )
    if year >= MINYEAR:
        basedate = datetime(year, month, day, hour, minute, second)
        # subtract utc_offset from basedate time instance (which is timezone naive)
        basedate -= timedelta(days=utc_offset/1440.)
    else:
        if not utc_offset:
            basedate = netcdftime.datetime(year, month, day, hour, minute, second)
        else:
            raise ValueError('cannot use utc_offset for reference years <= 0')
    return basedate

def stringtoarr(string,NUMCHARS,dtype='S'):
    """
**`stringtoarr(a, NUMCHARS,dtype='S')`**

convert a string to a character array of length `NUMCHARS`

**`a`**: Input python string.

**`NUMCHARS`**: number of characters used to represent string
(if len(a) < `NUMCHARS`, it will be padded on the right with blanks).

**`dtype`**: type of numpy array to return.  Default is `'S'`, which
means an array of dtype `'S1'` will be returned.  If dtype=`'U'`, a
unicode array (dtype = `'U1'`) will be returned.

returns a rank 1 numpy character array of length NUMCHARS with datatype `'S1'`
(default) or `'U1'` (if dtype=`'U'`)"""
    if dtype not in ["S","U"]:
        raise ValueError("dtype must be string or unicode ('S' or 'U')")
    arr = numpy.zeros(NUMCHARS,dtype+'1')
    arr[0:len(string)] = tuple(string)
    return arr
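# Illustrative sketch (not part of the module source): round-tripping
# fixed-length strings through character arrays with the helpers defined
# above and below.
#
#     >>> stringtoarr('foo', 5)      # rank-1 'S1' array, padded to length 5
#     >>> b = stringtochar(numpy.array(['ab', 'cd'], 'S2'))   # shape (2, 2), dtype 'S1'
#     >>> chartostring(b)            # back to an array of 2-character strings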
def stringtochar(a,encoding='utf-8'):
    """
**`stringtochar(a,encoding='utf-8')`**

convert a string array to a character array with one extra dimension

**`a`**: Input numpy string array with numpy datatype `'SN'` or `'UN'`, where N
is the number of characters in each string.  Will be converted to an array
of characters (datatype `'S1'` or `'U1'`) of shape `a.shape + (N,)`.

optional kwarg `encoding` can be used to specify character encoding (default
`utf-8`).

returns a numpy character array with datatype `'S1'` or `'U1'`
and shape `a.shape + (N,)`, where N is the length of each string in a."""
    dtype = a.dtype.kind
    if dtype not in ["S","U"]:
        raise ValueError("type must be string or unicode ('S' or 'U')")
    b = numpy.array(tuple(a.tostring().decode(encoding)),dtype+'1')
    b.shape = a.shape + (a.itemsize,)
    return b

def chartostring(b,encoding='utf-8'):
    """
**`chartostring(b,encoding='utf-8')`**

convert a character array to a string array with one less dimension.

**`b`**: Input character array (numpy datatype `'S1'` or `'U1'`).
Will be converted to an array of strings, where each string has a fixed
length of `b.shape[-1]` characters.

optional kwarg `encoding` can be used to specify character encoding (default
`utf-8`).

returns a numpy string array with datatype `'UN'` and shape `b.shape[:-1]`
where `N=b.shape[-1]`."""
    dtype = b.dtype.kind
    if dtype not in ["S","U"]:
        raise ValueError("type must be string or unicode ('S' or 'U')")
    bs = b.tostring().decode(encoding)
    slen = int(b.shape[-1])
    a = numpy.array([bs[n1:n1+slen] for n1 in range(0,len(bs),slen)],'U'+repr(slen))
    a.shape = b.shape[:-1]
    return a

def date2num(dates,units,calendar='standard'):
    """
**`date2num(dates,units,calendar='standard')`**

Return numeric time values given datetime objects. The units
of the numeric time values are described by the `netCDF4.units` argument
and the `netCDF4.calendar` keyword. The datetime objects must
be in UTC with no time-zone offset.  If there is a
time-zone offset in `units`, it will be applied to the
returned numeric values.

**`dates`**: A datetime object or a sequence of datetime objects.
The datetime objects should not include a time-zone offset.

**`units`**: a string of the form `