A module of the pyhdf package implementing the V (Vgroup) API of the NCSA HDF4 library. (see: hdf.ncsa.uiuc.edu)
Version: 0.7-3 Date: July 13, 2005
- Introduction
- Accessing the V module
- Package components
- Prerequisites
- Summary of differences between the pyhdf and C V API
- Error handling
- V needs support from the HDF module
- Classes summary
- Attribute access: low and high level
- Predefined attributes
- Programming models
- Module documentation
V is one of the modules composing pyhdf, a python package implementing the NCSA HDF library and letting one manage HDF files from within a python program. Two versions of the HDF library currently exist, version 4 and version 5. pyhdf only implements version 4 of the library. Many different APIs are to be found inside the HDF4 specification. Currently, pyhdf implements just a few of those: the SD, VS and V APIs. Other APIs should be added in the future (GR, AN, etc).
The V API supports the definition of vgroups inside an HDF file. A vgroup can be thought of as a collection of arbitrary “references” to other HDF objects defined in the same file. A vgroup may hold references to other vgroups. It is thus possible to organize HDF objects into some sort of a hierarchy, similar to files grouped into a directory tree under Unix. This hierarchical nature of vgroups partly explains the origin of the “HDF” name (Hierarchical Data Format). Vgroups can help logically organize the contents of an HDF file, for example by grouping together all the datasets belonging to a given experiment, and subdividing those datasets according to the day of the experiment.
The V API provides functions to find and access an existing vgroup, create a new one, delete a vgroup, identify the members of a vgroup, add and remove members to and from a vgroup, and set and query attributes on a vgroup. The members of a vgroup are identified through their tags and reference numbers. Tags are constants identifying each main object type (dataset, vdata, vgroup). Reference numbers serve to distinguish among objects of the same type. To add an object to a vgroup, one must first initialize that object using the API proper to that object type (e.g. SD for a dataset) so as to create a reference number for that object, and then pass this reference number and the type tag to the V API. When reading the contents of a vgroup, the V API returns the tags and reference numbers of the objects composing the vgroup. The user program must then call the proper API to process each object, based on the tag of this object (e.g. VS for a tag identifying a vdata object).
Some limitations of the V API must be stressed. First, HDF imposes no integrity constraint whatsoever on the contents of a vgroup, nor does it help maintain such integrity. For example, a vgroup is not strictly hierarchical, because an object can belong to more than one vgroup. It would be easy to create vgroups showing cycles among their members. Also, a vgroup member is simply a reference to an HDF object. If this object is afterwards deleted for any reason, the vgroup membership will not be automatically updated. The vgroup will refer to a non-existent object and thus be left in an inconsistent state. Nothing prevents adding the same member more than once to a vgroup, and giving the same name to more than one vgroup. Finally, the HDF library seems to make heavy use of vgroups for its own internal needs, and creates vgroups “behind the scenes”. This may make it difficult to pick up “user defined” vgroups when browsing an HDF file.
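The limitations just described can be illustrated with a toy model. The sketch below does not use pyhdf at all: vgroups are plain dictionaries mapping a reference number to a list of (tag, ref) members, the tags are symbolic stand-ins rather than the real HC constants, and the helper names dangling_members() and has_cycle() are hypothetical.

```python
# Toy model only: vgroups are plain dicts here, not pyhdf objects.
# Each vgroup reference number maps to a list of (tag, ref) members.
VG, VH, NDG = "VG", "VH", "NDG"   # symbolic stand-ins for the real tags

def dangling_members(vgroups, existing):
    """Return (vgroup_ref, member) pairs whose member is not in `existing`."""
    return [(vref, m) for vref, members in sorted(vgroups.items())
            for m in members if m not in existing]

def has_cycle(vgroups):
    """Return True if following vgroup members leads back to a vgroup
    already on the current path."""
    def visit(ref, path):
        if ref in path:
            return True
        children = [r for tag, r in vgroups.get(ref, []) if tag == VG]
        return any(visit(child, path | {ref}) for child in children)
    return any(visit(ref, frozenset()) for ref in vgroups)

vgroups = {
    2: [(VH, 10), (VG, 3)],   # vgroup 2 holds a vdata and vgroup 3
    3: [(NDG, 20), (VG, 2)],  # vgroup 3 points back at vgroup 2: a cycle
}
existing = {(VG, 2), (VG, 3), (VH, 10)}   # dataset (NDG, 20) was deleted

print(dangling_members(vgroups, existing))   # members pointing nowhere
print(has_cycle(vgroups))                    # cycle between vgroups 2 and 3
```

Since HDF performs no such checks itself, a program that cares about consistency has to run tests of this kind on the tags and reference numbers it reads back.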
To access the V module a python program can say one of:
>>> import pyhdf.V # must prefix names with "pyhdf.V."
>>> from pyhdf import V # must prefix names with "V."
>>> from pyhdf.V import * # names need no prefix
This document assumes the last import style is used.
V is not self-contained, and needs functionality provided by another pyhdf module, namely the HDF module. This module must thus be imported also:
>>> from pyhdf.HDF import *
pyhdf is a proper Python package, i.e. a collection of modules stored under a directory whose name is that of the package and which stores an __init__.py file. Following the normal installation procedure, this directory will be <python-lib>/site-packages/pyhdf, where <python-lib> stands for the python installation directory.
A corresponding set of modules exists for each HDF API.
The following modules are related to the V API.
- _hdfext: C extension module responsible for wrapping the HDF C library for all python modules
- hdfext: python module implementing some utility functions complementing the _hdfext extension module
- error: defines the HDF4Error exception
- HDF: python module providing support to the V module
- V: python module wrapping the V API routines inside an OOP framework
_hdfext and hdfext were generated using the SWIG preprocessor. SWIG is however not needed to run the package. Those two modules are meant to do their work in the background, and should never be called directly. Only HDF and V should be imported by the user program.
The following software must be installed in order for the V module to work.
- HDF (v4) library: pyhdf does not include the HDF4 library, which must be installed separately. HDF is available at: http://hdf.ncsa.uiuc.edu/obtain.html
Numeric is also needed by the SD module. See the SD module documentation.
Most of the differences between the pyhdf and C V API can be summarized as follows.
- In the C API, every function returns an integer status code, and values computed by the function are returned through one or more pointers passed as arguments.
- In pyhdf, error statuses are returned through the Python exception mechanism, and values are returned as the method result. When the C API specifies that multiple values are returned, pyhdf returns a sequence of values, which are ordered similarly to the pointers in the C function argument list.
All errors reported by the C V API with a SUCCESS/FAIL error code are reported by pyhdf using the Python exception mechanism. When the C library reports a FAIL status, pyhdf raises an HDF4Error exception (a subclass of Exception) with a descriptive message. Unfortunately, the C library is rarely informative about the cause of the error. pyhdf does its best to try to document the error, but most of the time cannot do more than say “execution error”.
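The two calling conventions can be contrasted with a small imitation. Nothing below belongs to the HDF library or to pyhdf: c_style_query() and py_style_query() are hypothetical functions, the HDF4Error class is a local stand-in for the one pyhdf raises, and the numeric values given to SUCCESS and FAIL are illustrative only.

```python
# Sketch only: imitates the conventions, does not call the HDF library.
SUCCESS, FAIL = 0, -1          # illustrative status values

class HDF4Error(Exception):
    """Local stand-in for the exception class pyhdf raises."""

def c_style_query(index, out):
    """C convention: return a status, deliver results through `out`."""
    table = {0: ("version", 3)}          # made-up data for the example
    if index not in table:
        return FAIL
    out[:] = table[index]                # "pointer" arguments get filled in
    return SUCCESS

def py_style_query(index):
    """pyhdf convention: raise on FAIL, return the values as a sequence."""
    out = []
    if c_style_query(index, out) == FAIL:
        raise HDF4Error("query : execution error")
    return tuple(out)                    # same order as the C out-arguments

name, size = py_style_query(0)           # values come back as the result

try:
    py_style_query(99)                   # failure surfaces as an exception
except HDF4Error as err:
    message = str(err)
```

A caller therefore wraps V calls in try/except rather than testing return values, and should expect the message to be terse.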
The V module is not self-contained (contrary to the SD module). It requires help from the HDF module, namely:
- the HDF.HDF class to open and close the HDF file, and initialize the V interface
- the HDF.HC class to provide different sorts of constants (opening modes, data types, etc).
A program wanting to access HDF vgroups will almost always need to execute the following minimal set of calls:
>>> from pyhdf.HDF import *
>>> from pyhdf.V import *
>>> hdfFile = HDF(name, HC.xxx) # open HDF file
>>> v = hdfFile.vgstart() # initialize V interface on HDF file
>>> ... # manipulate vgroups
>>> v.end() # terminate V interface
>>> hdfFile.close() # close HDF file
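Because end() and close() must be called even when the vgroup manipulation fails, it may be convenient to wrap the sequence above in a context manager. The helper below is a sketch, not part of pyhdf: the name vgroup_interface() is hypothetical, and it only relies on the vgstart(), end() and close() calls shown above.

```python
from contextlib import contextmanager

@contextmanager
def vgroup_interface(hdf_file):
    """Yield the V interface of an open HDF instance, then clean up.

    `hdf_file` is expected to behave like a pyhdf HDF instance, i.e. to
    offer vgstart() and close(). This helper is hypothetical.
    """
    try:
        v = hdf_file.vgstart()    # initialize V interface on HDF file
        try:
            yield v               # caller manipulates vgroups here
        finally:
            v.end()               # terminate V interface
    finally:
        hdf_file.close()          # close HDF file, whatever happened
```

It would be used as "with vgroup_interface(HDF(name, HC.xxx)) as v:", with the vgroup manipulation placed in the body of the with statement.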
pyhdf wraps the V API using the following python classes:
- V: HDF V interface
- VG: vgroup
- VGAttr: vgroup attribute
In more detail:
V
The V class implements the V (Vgroup) interface applied to an HDF file.
To instantiate a V class, call the vgstart() method of an HDF instance.
Methods:
- constructors
  - attach(): open an existing vgroup given its name or its reference number, or create a new vgroup, returning a VG instance for that vgroup
  - create(): create a new vgroup, returning a VG instance for that vgroup
- closing the interface
  - end(): close the V interface on the HDF file
- deleting a vgroup
  - delete(): delete the vgroup identified by its name or its reference number
- searching
  - find(): find a vgroup given its name, returning the vgroup reference number
  - findclass(): find a vgroup given its class name, returning the vgroup reference number
  - getid(): return the reference number of the vgroup following the one with the given reference number
VG
The VG class encapsulates the functionality of a vgroup.
To instantiate a VG class, call the attach() or create() methods of a V class instance.
Methods:
- constructors
  - attr(): return a VGAttr instance representing an attribute of the vgroup
  - findattr(): search the vgroup for a given attribute, returning a VGAttr instance for that attribute
- ending access to a vgroup
  - detach(): terminate access to the vgroup
- adding a member to a vgroup
  - add(): add to the vgroup the HDF object identified by its tag and reference number
  - insert(): insert a vdata or a vgroup in the vgroup, given the vdata or vgroup instance
- deleting a member from a vgroup
  - delete(): remove from the vgroup the HDF object identified by the given tag and reference number
- querying a vgroup
  - attrinfo(): return info about all the vgroup attributes
  - inqtagref(): determine if the HDF object with the given tag and reference number belongs to the vgroup
  - isvg(): determine if the member with the given reference number is a vgroup object
  - isvs(): determine if the member with the given reference number is a vdata object
  - nrefs(): return the number of vgroup members with the given tag
  - tagref(): get the tag and reference number of a vgroup member, given the index number of that member
  - tagrefs(): get the tags and reference numbers of all the vgroup members
VGAttr
The VGAttr class provides methods to set and query vgroup attributes.
To create an instance of this class, call the attr() method of a VG instance.
Remember that vgroup attributes can also be set and queried by applying the standard python “dot notation” on a VG instance.
Methods:
- get attribute value(s)
  - get(): obtain the attribute value(s)
- set attribute value(s)
  - set(): set the attribute to the given value(s) of the given type, first creating the attribute if necessary
- query attribute info
  - info(): retrieve attribute name, data type, order and size
The V API allows setting attributes on vgroups. Attributes can be of many types (int, float, char) of different bit lengths (8, 16, 32, 64 bits), and can be single or multi-valued. Values of a multi-valued attribute must all be of the same type.
Attributes can be set and queried in two different ways. First, given a VG instance (describing a vgroup object), the attr() method of that instance is called to create a VGAttr instance representing the wanted attribute (possibly non existent). The set() method of this VGAttr instance is then called to define the attribute value, creating it if it does not already exist. The get() method returns the current attribute value. Here is an example.
>>> from pyhdf.HDF import *
>>> from pyhdf.V import *
>>> f = HDF('test.hdf', HC.WRITE) # Open file 'test.hdf' in write mode
>>> v = f.vgstart() # init vgroup interface
>>> vg = v.attach('vtest', 1) # attach vgroup 'vtest' in write mode
>>> attr = vg.attr('version') # prepare to define the 'version' attribute on the vgroup
>>> attr.set(HC.CHAR8,'1.0') # set attribute 'version' to string '1.0'
>>> print attr.get() # get and print attribute value
>>> attr = vg.attr('range') # prepare to define attribute 'range'
>>> attr.set(HC.INT32,(-10, 15)) # set attribute 'range' to a pair of ints
>>> print attr.get() # get and print attribute value
>>> vg.detach() # "close" the vgroup
>>> v.end() # terminate the vgroup interface
>>> f.close() # close the HDF file
The second way consists of setting/querying an attribute as if it were a normal python class attribute, using the usual dot notation. The above example then becomes:
>>> from pyhdf.HDF import *
>>> from pyhdf.V import *
>>> f = HDF('test.hdf', HC.WRITE) # Open file 'test.hdf' in write mode
>>> v = f.vgstart() # init vgroup interface
>>> vg = v.attach('vtest', 1) # attach vgroup 'vtest' in write mode
>>> vg.version = '1.0' # create vgroup attribute 'version', setting it to string '1.0'
>>> print vg.version # print attribute value
>>> vg.range = (-10, 15) # create attribute 'range', setting it to the pair of ints (-10, 15)
>>> print vg.range # print attribute value
>>> vg.detach() # "close" the vgroup
>>> v.end() # terminate the vgroup interface
>>> f.close() # close the HDF file
Note how the dot notation greatly simplifies and clarifies the code. Some latitude is however lost by manipulating attributes in that way, because the pyhdf package, not the programmer, is then responsible for setting the attribute type. The attribute type is chosen to be one of:
- HC.CHAR8 if the attribute value is a string
- HC.INT32 if all attribute values are integers
- HC.FLOAT64 otherwise
The first way of handling attribute values must be used if one wants to define an attribute of any other type (for example 8 or 16 bit integers, signed or unsigned). Also, only a VGAttr instance gives access to attribute info, through its info() method.
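The type selection rule applied by the dot notation can be paraphrased in a few lines of python. This is a sketch of the rule as stated above, not the code pyhdf actually runs: default_attr_type() is a hypothetical helper, it returns the name of the HC constant as a string so that it can run without pyhdf, and its treatment of corner cases such as booleans or empty sequences is not something this document specifies.

```python
def default_attr_type(value):
    """Return the name of the HC constant the dot-notation rule would pick."""
    if isinstance(value, str):
        return "CHAR8"                       # string value
    # Treat a single number as a one-element sequence.
    values = value if isinstance(value, (list, tuple)) else [value]
    if all(isinstance(x, int) for x in values):
        return "INT32"                       # all values are integers
    return "FLOAT64"                         # anything else

print(default_attr_type("1.0"))
print(default_attr_type((-10, 15)))
print(default_attr_type((1, 2.5)))
```

When the type picked this way is not the wanted one, the attribute has to be created through attr() and set() with an explicit HC.xxx type.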
However, accessing HDF attributes as if they were python attributes raises an important issue. There must exist a way to assign generic attributes to the python objects without requiring those attributes to be converted to HDF attributes. pyhdf uses the following rule: an attribute whose name starts with an underscore (‘_’) is either a “predefined” HDF attribute (see below) or a standard python attribute. Otherwise, the attribute is handled as an HDF attribute. Also, HDF attributes are not stored inside the object dictionary: the python dir() function will not list them.
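The underscore rule can be imitated with python's __setattr__ and __getattr__ hooks. The class below is a hypothetical illustration, not the VG implementation: a plain dictionary stands in for the attributes stored in the HDF file, and predefined attributes are not handled.

```python
class AttrRouter(object):
    """Hypothetical class imitating the underscore rule."""

    def __init__(self):
        self._store = {}     # stands in for attributes kept in the HDF file

    def __setattr__(self, name, value):
        if name.startswith("_"):
            # Ordinary python attribute, kept in the object dictionary.
            object.__setattr__(self, name, value)
        else:
            # Treated as an "HDF" attribute, kept outside the dictionary.
            self._store[name] = value

    def __getattr__(self, name):
        # Called only when normal attribute lookup has failed.
        if name.startswith("_"):
            raise AttributeError(name)
        try:
            return self._store[name]
        except KeyError:
            raise AttributeError(name)

vg = AttrRouter()
vg.version = "1.0"           # routed to the external store
vg._cache = 42               # stays a normal python attribute
print(vg.version)
print("version" in dir(vg))  # dir() does not list the routed attribute
```

The sketch shows why dir() stays silent about HDF attributes: they never enter the instance dictionary.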
Attribute values can be updated, but it is illegal to try to change the value type, or the attribute order (number of values). This is important for attributes holding string values. An attribute initialized with an ‘n’ character string is simply a character attribute of order ‘n’ (i.e. a character array of length ‘n’). If ‘vg’ is a vgroup and we initialize its ‘a1’ attribute as ‘vg.a1 = "abcdef"’, then a subsequent update attempt like ‘vg.a1 = "12"’ will fail, because we then try to change the order of the attribute (from 6 to 2). It is mandatory to keep the length of string attributes constant.
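One way to live with the fixed length constraint is to bring every new string to the original order before assigning it. The helper below is only a suggestion: fit_to_order() is a hypothetical name, and padding with blanks is an assumption about what is acceptable for the data at hand.

```python
def fit_to_order(new_value, order, pad=" "):
    """Pad `new_value` to exactly `order` characters, or refuse it."""
    if len(new_value) > order:
        raise ValueError("value longer than attribute order %d" % order)
    return new_value.ljust(order, pad)

padded = fit_to_order("12", 6)   # the attribute was created with 6 characters
print(repr(padded))
# The padded string could then be assigned, e.g. vg.a1 = padded
```

Values longer than the original order are rejected rather than truncated, so that no data is lost silently.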
The VG class supports predefined attributes to get (and occasionally set) attribute values easily using the usual python “dot notation”, without having to call a class method. The names of predefined attributes all start with an underscore (‘_’).
In the following table, the RW column holds an X if the attribute is read/write.
VG predefined attributes
name        RW  description                   C library routine
_class      X   class name                    Vgetclass/Vsetclass
_name       X   vgroup name                   Vgetname/Vsetname
_nattrs         number of vgroup attributes   Vnattrs
_nmembers       number of vgroup members      Vntagrefs
_refnum         vgroup reference number       VQueryref
_tag            vgroup tag                    VQuerytag
_version        vgroup version number         Vgetversion
The following program shows how to create and initialize a vgroup inside an HDF file. It can serve as a model for any program wanting to create a vgroup.
    from pyhdf.HDF import *
    from pyhdf.V import *
    from pyhdf.VS import *
    from pyhdf.SD import *

    def vdatacreate(vs, name):
        # Create vdata and define its structure
        vd = vs.create(name,
                       (('partid', HC.CHAR8, 5),        # 5 char string
                        ('description', HC.CHAR8, 10),  # 10 char string field
                        ('qty', HC.INT16, 1),           # 1 16 bit int field
                        ('wght', HC.FLOAT32, 1),        # 1 32 bit float
                        ('price', HC.FLOAT32, 1)        # 1 32 bit float
                       ))
        # Store records
        vd.write((('Q1234', 'bolt', 12, 0.01, 0.05),    # record 1
                  ('B5432', 'brush', 10, 0.4, 4.25),    # record 2
                  ('S7613', 'scissor', 2, 0.2, 3.75)    # record 3
                 ))
        # "close" vdata
        vd.detach()

    def sdscreate(sd, name):
        # Create a simple 3x3 float array.
        sds = sd.create(name, SDC.FLOAT32, (3,3))
        # Initialize array
        sds[:] = ((0,1,2),(3,4,5),(6,7,8))
        # "close" dataset.
        sds.endaccess()

    # Create HDF file
    filename = 'inventory.hdf'
    hdf = HDF(filename, HC.WRITE|HC.CREATE)

    # Initialize the SD, V and VS interfaces on the file.
    sd = SD(filename, SDC.WRITE)  # SD interface
    vs = hdf.vstart()             # vdata interface
    v = hdf.vgstart()             # vgroup interface

    # Create vdata named 'INVENTORY'.
    vdatacreate(vs, 'INVENTORY')
    # Create dataset named 'ARR_3x3'
    sdscreate(sd, 'ARR_3x3')

    # Attach the vdata and the dataset.
    vd = vs.attach('INVENTORY')
    sds = sd.select('ARR_3x3')

    # Create vgroup named 'TOTAL'.
    vg = v.create('TOTAL')

    # Add vdata to the vgroup
    vg.insert(vd)
    # We could also have written this:
    # vg.add(vd._tag, vd._refnum)
    # or this:
    # vg.add(HC.DFTAG_VH, vd._refnum)

    # Add dataset to the vgroup
    vg.add(HC.DFTAG_NDG, sds.ref())

    # Close vgroup, vdata and dataset.
    vg.detach()       # vgroup
    vd.detach()       # vdata
    sds.endaccess()   # dataset

    # Terminate V, VS and SD interfaces.
    v.end()           # V interface
    vs.end()          # VS interface
    sd.end()          # SD interface

    # Close HDF file.
    hdf.close()
The program starts by defining two functions vdatacreate() and sdscreate(), which will serve to create the vdata and dataset objects we need. Those functions are not essential to the example. They simply help to make the example self-contained. Refer to the VS and SD module documentation for additional explanations about how these functions work.
After opening the HDF file in write mode, the SD, V and VS interfaces are initialized on the file. Next vdatacreate() is called to create a new vdata named ‘INVENTORY’ on the VS instance, and sdscreate() to create a new dataset named ‘ARR_3x3’ on the SD instance. This is done so that we have a vdata and a dataset to play with.
The vdata and the dataset are then attached (“opened”). The create() method of the V instance is then called to create a new vgroup named ‘TOTAL’. The vgroup is then populated by calling its insert() method to add the vdata ‘INVENTORY’, and its add() method to add the ‘ARR_3x3’ dataset. Note that insert() is just a convenience method that simplifies adding a vdata or a vgroup to a vgroup, avoiding the need to pass an object tag and reference number. There is no such convenience method for adding a dataset to a vgroup. The dataset must be added by specifying its tag and reference number. Note that the tags to be used are defined inside the HDF module as constants of the HC class: DFTAG_NDG for a dataset, DFTAG_VG for a vgroup, DFTAG_VH for a vdata.
The program ends by detaching (“closing”) the HDF objects created above, terminating the three interfaces initialized, and closing the HDF file.
The following program shows the contents of the vgroups contained inside any HDF file.
    from pyhdf.HDF import *
    from pyhdf.V import *
    from pyhdf.VS import *
    from pyhdf.SD import *

    import sys

    def describevg(refnum):
        # Describe the vgroup with the given refnum.

        # Open vgroup in read mode.
        vg = v.attach(refnum)
        print "----------------"
        print "name:", vg._name, "class:",vg._class, "tag,ref:",
        print vg._tag, vg._refnum

        # Show the number of members of each main object type.
        print "members: ", vg._nmembers,
        print "datasets:", vg.nrefs(HC.DFTAG_NDG),
        print "vdatas: ", vg.nrefs(HC.DFTAG_VH),
        print "vgroups: ", vg.nrefs(HC.DFTAG_VG)

        # Read the contents of the vgroup.
        members = vg.tagrefs()

        # Display info about each member.
        index = -1
        for tag, ref in members:
            index += 1
            print "member index", index
            # Vdata tag
            if tag == HC.DFTAG_VH:
                vd = vs.attach(ref)
                nrecs, intmode, fields, size, name = vd.inquire()
                print " vdata:",name, "tag,ref:",tag, ref
                print " fields:",fields
                print " nrecs:",nrecs
                vd.detach()
            # SDS tag
            elif tag == HC.DFTAG_NDG:
                sds = sd.select(sd.reftoindex(ref))
                name, rank, dims, type, nattrs = sds.info()
                print " dataset:",name, "tag,ref:", tag, ref
                print " dims:",dims
                print " type:",type
                sds.endaccess()
            # Vgroup tag
            elif tag == HC.DFTAG_VG:
                vg0 = v.attach(ref)
                print " vgroup:", vg0._name, "tag,ref:", tag, ref
                vg0.detach()
            # Unhandled tag
            else:
                print "unhandled tag,ref",tag,ref

        # Close vgroup
        vg.detach()

    # Open HDF file in readonly mode.
    filename = sys.argv[1]
    hdf = HDF(filename)

    # Initialize the SD, V and VS interfaces on the file.
    sd = SD(filename)
    vs = hdf.vstart()
    v = hdf.vgstart()

    # Scan all vgroups in the file.
    ref = -1
    while 1:
        try:
            ref = v.getid(ref)
        except HDF4Error,msg:    # no more vgroup
            break
        describevg(ref)

    # Terminate V, VS and SD interfaces.
    v.end()
    vs.end()
    sd.end()

    # Close HDF file.
    hdf.close()
The program starts by defining function describevg(), which is passed the reference number of the vgroup to display. The function assumes that the SD, VS and V interfaces have been previously initialized.
The function starts by attaching (“opening”) the vgroup, and displaying its name, class, tag and reference number. The number of members of the three most important object types is then displayed, by calling the nrefs() method with the predefined tags found inside the HDF.HC class.
The tagrefs() method is then called to get a list of all the vgroup members, each member being identified by its tag and reference number. A ‘for’ statement is entered to loop over each element of this list. The tag is tested against the known values defined in the HDF.HC class: the outcome of this test indicates how to process the member object.
A DFTAG_VH tag indicates we deal with a vdata. The vdata is attached, its inquire() method called to display info about it, and the vdata is detached. In the case of a DFTAG_NDG, we are facing a dataset. The dataset is selected, info is obtained by calling the dataset info() method, and the dataset is released. A DFTAG_VG indicates that the member is a vgroup. We attach it, print its name, tag and reference number, then detach the member vgroup. A warning is finally displayed if we hit upon a member of an unknown type.
The function releases the vgroup just displayed and returns.
The main program starts by opening in readonly mode the HDF file passed as argument on the command line. The SD, VS and V interfaces are initialized, and the corresponding class instances are stored inside ‘sd’, ‘vs’ and ‘v’ global variables, respectively, for the use of the describevg() function.
A while loop is then entered to access each vgroup in the file. A reference number of -1 is passed on the first call to getid() to obtain the reference number of the first vgroup. getid() returns a new reference number on each subsequent call, and raises an exception when the last vgroup has been retrieved. This exception is caught to break out of the loop, otherwise describevg() is called to display the vgroup we have on hand.
Once the loop is over, the interfaces initialized before are terminated, and the HDF file is closed.
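The scanning loop lends itself to a generator, which keeps the getid() protocol in one place. This is a sketch under assumptions: iter_vgroup_refs() is a hypothetical helper, and the import assumes that HDF4Error can be obtained from the pyhdf error module listed under the package components; a local stand-in is defined when pyhdf is not installed so that the sketch still runs.

```python
try:
    from pyhdf.error import HDF4Error    # assumed location of the exception
except ImportError:
    class HDF4Error(Exception):
        """Stand-in so the sketch can be run without pyhdf."""

def iter_vgroup_refs(v):
    """Yield the reference number of every vgroup known to `v`.

    `v` is expected to behave like a V instance, i.e. to offer getid().
    """
    ref = -1                       # -1 asks getid() for the first vgroup
    while True:
        try:
            ref = v.getid(ref)     # next vgroup after `ref`
        except HDF4Error:          # no more vgroups
            return
        yield ref
```

With such a helper the main loop of the program would reduce to "for ref in iter_vgroup_refs(v): describevg(ref)".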
You will notice that this program will display vgroups other than those you have explicitly created. Those supplementary vgroups are created by the HDF library for its own internal needs.
The V class implements the V (Vgroup) interface applied to an HDF file. To instantiate a V class, call the vgstart() method of an HDF instance.
attach()
Open an existing vgroup given its name or its reference number, or create a new vgroup, returning a VG instance for that vgroup.
An exception is raised if an attempt is made to open a non-existent vgroup.
C library equivalent : Vattach
create()
Create a new vgroup, and assign it a name.
A create(name) call is equivalent to an attach(-1, 1) call, followed by a call to the setname(name) method of the instance.
C library equivalent : no equivalent
delete()
Delete from the HDF file the vgroup identified by its reference number or its name.
C library equivalent : Vdelete
end()
Close the V interface.
C library equivalent : Vend
find()
Find a vgroup given its name, returning its reference number if found.
An exception is raised if the vgroup is not found.
C library equivalent : Vfind
findclass()
Find a vgroup given its class name, returning its reference number if found.
An exception is raised if the vgroup is not found.
C library equivalent : Vfindclass
getid()
Obtain the reference number of the vgroup following the vgroup with the given reference number.
An exception is raised if there is no vgroup after the given one, i.e. when the end of the list of vgroups is reached.
C library equivalent : Vgetid
The VG class encapsulates the functionality of a vgroup. To instantiate a VG class, call the attach() or create() methods of a V class instance.
add()
Add to the vgroup an object identified by its tag and reference number.
C library equivalent : Vaddtagref
attr()
Create a VGAttr instance representing a vgroup attribute.
C library equivalent : no equivalent
attrinfo()
Return info about all the vgroup attributes.
Returns a dictionary describing each vgroup attribute; for each attribute, a (name,data) pair is added to the dictionary, where ‘data’ is a tuple holding:
- attribute data type (one of HC.xxx constants)
- attribute order
- attribute value
- attribute size in bytes
C library equivalent : no equivalent
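The dictionary described above is easy to turn into a report. The function below is a sketch working on any dictionary of that shape: describe_attrs() is a hypothetical helper, and the sample data is made up, with placeholder strings where real code would find HC.xxx type constants.

```python
def describe_attrs(info):
    """Format a dict shaped like the attrinfo() result as text lines."""
    lines = []
    for name in sorted(info):
        # Tuple layout as documented: type, order, value, size in bytes.
        data_type, order, value, size = info[name]
        lines.append("%s: type=%s order=%d size=%d value=%r"
                     % (name, data_type, order, size, value))
    return lines

# Made-up sample in the documented shape; the type codes are placeholders.
sample = {
    "version": ("CHAR8", 3, "1.0", 3),
    "range": ("INT32", 2, [-10, 15], 8),
}

for line in describe_attrs(sample):
    print(line)
```

In a real program the argument would be the result of calling attrinfo() on a VG instance.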
delete()
Delete from the vgroup the member identified by its tag and reference number.
Only the link of the member with the vgroup is deleted. The member object is not deleted.
C library equivalent : Vdeletetagref
detach()
Terminate access to the vgroup.
C library equivalent : Vdetach
findattr()
Search the vgroup for a given attribute.
Returns a VGAttr instance describing the attribute if found, None otherwise.
C library equivalent : Vfindattr
inqtagref()
Determine if an object identified by its tag and reference number belongs to the vgroup.
C library equivalent : Vinqtagref
insert()
Insert a vdata or a vgroup in the vgroup.
C library equivalent : Vinsert
isvg()
Determine if the member of a vgroup is a vgroup.
C library equivalent : Visvg
isvs()
Determine if the member of a vgroup is a vdata.
C library equivalent : Visvs
nrefs()
Determine the number of tags of a given type in a vgroup.
C library equivalent : Vnrefs
tagref()
Get the tag and reference number of a vgroup member, given the index number of that member.
C library equivalent : Vgettagref
tagrefs()
Get the tags and reference numbers of all the vgroup members.
C library equivalent : Vgettagrefs
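The list of (tag, reference number) pairs obtained from tagrefs() can be summarized per tag with a few lines of python. The sketch below works on any such list: count_by_tag() is a hypothetical helper, and the symbolic tags of the sample stand in for the HC.DFTAG_xxx constants. For a given tag, the count obtained this way would be expected to agree with what nrefs() reports.

```python
from collections import Counter

def count_by_tag(members):
    """Count members per tag from a list of (tag, ref) pairs."""
    return Counter(tag for tag, ref in members)

# Made-up member list; real tags would be HC.DFTAG_xxx constants.
members = [("VH", 10), ("NDG", 20), ("VH", 11), ("VG", 3)]
counts = count_by_tag(members)
print(counts["VH"])      # number of vdata members in the sample
print(counts["GR"])      # a tag that is absent counts as zero
```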
The VGAttr class encapsulates methods used to set and query attributes defined on a vgroup. To create an instance of this class, call the attr() method of a VG instance.
get()
Retrieve the attribute value.
Note that a vgroup attribute can also be queried like a standard python class attribute by applying the usual “dot notation” to a VG instance.
C library equivalent : Vgetattr
info()
Retrieve info about the attribute.
C library equivalent : Vattrinfo
set()
Set the attribute value.
- data_type : attribute data type (see constants HC.xxx)
- values : attribute value(s); specify a list to create a multi-valued attribute; a string valued attribute can be created by setting ‘data_type’ to HC.CHAR8 and ‘values’ to the corresponding string
If the attribute already exists, it will be updated. However, it is illegal to try to change its data type or its order (number of values).
Note that a vgroup attribute can also be set like a standard python class attribute by applying the usual “dot notation” to a VG instance.
C library equivalent : Vsetattr