openpathsampling.netcdfplus.objects.ObjectStore¶

class openpathsampling.netcdfplus.objects.ObjectStore(content_class, json=True, nestable=False)[source]¶

Base class for storing complex objects in a netCDF4 file. It holds a reference to the store file.

content_class¶: openpathsampling.netcdfplus.base.StorableObject – a reference to the class type to be stored using this Storage. Must be subclassed from openpathsampling.netcdfplus.base.StorableObject()

json¶: string – a JSON-serialized string of the object, if it has already been computed

cache¶: openpathsampling.netcdfplus.cache.Cache – a dictionary that holds references to all stored elements by index or string for named objects. This is only used for cached access if caching is not False. Must be of type openpathsampling.netcdfplus.base.StorableObject() or subclassed.

__init__(content_class, json=True, nestable=False)[source]¶

Parameters:

content_class (openpathsampling.netcdfplus.base.StorableObject) – a reference to the class type to be stored using this store
json (bool or str json or jsonobj) – if False, the store will not create a json variable for serialization. If True, the store will use JSON pickling to store objects, and a single storable object will be serialized and not referenced. If a string is given, it is taken as the variable type of the json variable. Only two values are allowed: jsonobj (equivalent to True) or json, which will also store directly given storable objects as references.
nestable (bool) – if True, this marks the content_class to be saved as nested dict objects and not as pointers to saved objects. The saved complex object is then stored only once and not split into several objects that reference each other in a tree-like fashion.

Notes

Usually you want caching, but limited. The recommended approach is to use an LRUCache with a reasonable maximum number of objects, chosen according to the typical number of objects to cache and their size.
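To illustrate what "caching, but limited" means, here is a minimal sketch of a least-recently-used cache built on the standard library. `TinyLRUCache` is a hypothetical stand-in written for this note, not the cache class shipped with openpathsampling, and the snapshot strings are placeholder data.

```python
from collections import OrderedDict


class TinyLRUCache:
    """Hypothetical size-limited cache; NOT the openpathsampling class."""

    def __init__(self, max_size):
        self.max_size = max_size
        self._data = OrderedDict()

    def __getitem__(self, idx):
        value = self._data[idx]
        self._data.move_to_end(idx)  # mark as most recently used
        return value

    def __setitem__(self, idx, value):
        self._data[idx] = value
        self._data.move_to_end(idx)
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)  # evict the least recently used entry

    def __contains__(self, idx):
        return idx in self._data

    def __len__(self):
        return len(self._data)


cache = TinyLRUCache(max_size=2)
cache[0] = "object 0"
cache[1] = "object 1"
_ = cache[0]           # touching index 0 makes index 1 the oldest entry
cache[2] = "object 2"  # exceeds max_size, so index 1 is evicted
```

The maximum size is the knob the note above refers to: larger objects call for a smaller limit, while many small objects can tolerate a larger one.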

The class that takes care of storing data in a file is called a Storage, so the netCDF+ subclass of Storage is a storage. The classes that know how to load and save an object from the storage are called Store, like ObjectStore, SampleStore, etc.

The difference between json and jsonobj is subtle. Consider storing a complex object. There are two ways to do that:

1. json: store a reference to the object (provided it is stored).
2. jsonobj: serialize the object and only use references for contained objects.

All inner objects will always be stored using references. The only exception is using nestable. Consider objects that contain references to objects of the same type, e.g. operations in an equation (2*3 + 3). Each operation represents a value, but each operation needs values to operate on. To save such an object you again have two options:

1. nestable=False: store all single objects and always reference the contained objects. For an equation that would mean storing several objects: op1 = plus(op2, 3), op2 = times(2, 3). This is correct, though not intuitive.
2. nestable=True: store all the serialized objects nested into one object (string). For our example this corresponds to plus(times(2,3), 3).
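The two nestable layouts for the equation (2*3 + 3) can be sketched with plain dictionaries. Everything here (`Op`, `to_nested`, `to_referenced`, the `"ref"` key) is hypothetical and chosen for illustration only; the actual on-disk format used by the store is not shown in this page.

```python
import json


class Op:
    """Hypothetical toy operation node; not an openpathsampling class."""

    def __init__(self, name, left, right):
        self.name, self.left, self.right = name, left, right


def to_nested(obj):
    """nestable=True style: one self-contained nested structure."""
    if isinstance(obj, Op):
        return {"op": obj.name,
                "args": [to_nested(obj.left), to_nested(obj.right)]}
    return obj


def to_referenced(obj, table):
    """nestable=False style: store each Op once and refer to it by index."""
    if not isinstance(obj, Op):
        return obj
    args = [to_referenced(obj.left, table), to_referenced(obj.right, table)]
    table.append({"op": obj.name, "args": args})
    return {"ref": len(table) - 1}


equation = Op("plus", Op("times", 2, 3), 3)  # (2*3 + 3)

# Nested: the whole equation becomes a single string.
nested = json.dumps(to_nested(equation))

# Referenced: every operation is its own entry, linked by index.
table = []
root = to_referenced(equation, table)
```

In the referenced layout `table` holds two entries (the inner `times` first, then `plus` pointing at it), whereas the nested layout keeps the complete tree in one serialized string.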

Methods

__init__(content_class[, json, nestable])

add_single_to_cache(idx, json) Add a single object to cache by json

args() Return a list of args of the __init__ function of a class

base() Return the most parent class that is actually derived from Storable(Named)Object

cache_all() Load all samples as fast as possible into the cache

clear_cache() Clear the cache and force reloading

count_weaks() Return the counts of how many objects of storable type are still in memory

create_variable(name, var_type[, ...]) Create a new variable in the netCDF storage.

descendants() Return a list of all subclassed objects

fix_name() Set the object's name to be immutable.

free() Return the number of the next free index for this store

from_dict(dct) Reconstruct an object from a dictionary representation

idx(obj) Return the index in this store for a given object

iterator(this[, iter_range]) Return an iterator over all objects in the storage

load(idx) Returns an object from the storage.

load_range(start, end)

load_single(idx)

named(name) Name an unnamed object.

objects() Returns a dictionary of all storable objects

prefix_delegate(dct)

proxy(item) Return a proxy of an object for this store

register(storage, prefix) Associate the object store to a specific storage with a given prefix

release_idx(idx) Releases a lock on an idx

reserve_idx(idx) Locks an idx as used

save(obj[, idx]) Saves an object to the storage.

set_caching(caching) Set the caching mode for this store

set_observer(active) (De-)Activate observing creation of storable objects

to_dict()

write(variable, idx, obj[, attribute])

Attributes

`allowed_types`
`base_cls`	Return the base class
`base_cls_name`	Return the name of the base class
`cls`	Return the class name as a string
`default_cache`
`default_name`	Return the default name.
`first`	Returns the first stored object.
`is_named`	True if this object has a custom name.
`last`	Returns the last stored object.
`name`	Return the current name of the object.
`observe_objects`
`op_idx`	Returns a function that returns for an object of this storage the idx.
`simplifier`	Return the attached simplifier instance used to create JSON serialization
`storage`	Return the associated storage object

__delattr__¶: x.__delattr__('name') <==> del x.name

__format__()¶: default object formatter

__getattribute__¶: x.__getattribute__('name') <==> x.name

__getitem__(item)[source]¶: Enable numpy style selection of object in the store

__hash__¶

__iter__()[source]¶: Add iteration over all elements in the storage

__len__()[source]¶

Return the number of stored objects

Returns:	number of stored objects
Return type:	int

__reduce__()¶: helper for pickle

__reduce_ex__()¶: helper for pickle

__setattr__¶: x.__setattr__('name', value) <==> x.name = value

__setitem__(key, value)[source]¶

Enable saving using __setitem__

This only supports writing store[...] = value. Not sure if this is ever used.
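The container-style access described for `__getitem__`, `__setitem__`, `__iter__` and `__len__` can be pictured with a small in-memory stand-in. `ToyStore` is hypothetical and keeps objects in a plain list; treating `store[...] = value` as "save at the next free index" is an assumption made for this sketch, not a statement about the real implementation.

```python
class ToyStore:
    """Hypothetical in-memory stand-in for a store; NOT the real ObjectStore."""

    def __init__(self):
        self._objects = []

    def __len__(self):
        return len(self._objects)

    def __iter__(self):
        return iter(self._objects)

    def free(self):
        # next unused index; here simply the current length
        return len(self._objects)

    def save(self, obj):
        idx = self.free()
        self._objects.append(obj)
        return idx

    def load(self, idx):
        return self._objects[idx]

    def idx(self, obj):
        # index of an already stored object, or None if it is not stored
        for i, stored in enumerate(self._objects):
            if stored is obj:
                return i
        return None

    def __getitem__(self, item):
        # numpy-style selection: int, slice, or list of ints
        if isinstance(item, slice):
            return [self.load(i) for i in range(*item.indices(len(self)))]
        if isinstance(item, list):
            return [self.load(i) for i in item]
        return self.load(item)

    def __setitem__(self, key, value):
        # assumption: only the Ellipsis form is accepted
        if key is not Ellipsis:
            raise KeyError("only store[...] = value is supported in this sketch")
        self.save(value)


store = ToyStore()
a, b = object(), object()
store[...] = a
store[...] = b
```

With this stand-in, `store[0]`, `store[0:2]`, `store[[1, 0]]`, `len(store)` and `for obj in store` all behave like their list counterparts.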

__sizeof__() → int¶: size of object in memory, in bytes

add_single_to_cache(idx, json)[source]¶

Add a single object to cache by json

Parameters:

idx (int) – the index where the object was stored
json (str) – JSON string that represents a serialized version of the stored object

args()¶

Return a list of args of the __init__ function of a class

Returns:	the list of argument names. No information about defaults is included.
Return type:	list of str

base()¶

Return the most parent class that is actually derived from Storable(Named)Object

Important to determine which store should be used for storage

Returns:	the base class
Return type:	type

base_cls¶

Return the base class

Returns:	the base class
Return type:	type

See also

base()

base_cls_name¶

Return the name of the base class

Returns:	the string representation of the base class
Return type:	str

cache_all()[source]¶: Load all samples as fast as possible into the cache

clear_cache()[source]¶: Clear the cache and force reloading

cls¶

Return the class name as a string

Returns:	the class name
Return type:	str

count_weaks()¶

Return the counts of how many objects of storable type are still in memory

This includes objects not yet recycled by the garbage collector.

Returns:	a dictionary that maps the base class name of each referenced object to the integer number of such objects still present
Return type:	dict of str: int

create_variable(name, var_type, dimensions=None, chunksizes=None, **kwargs)[source]¶

Create a new variable in the netCDF storage. This is just a helper function to structure the code better.

Parameters:

name (str) – The name of the variable to be created
var_type (str) – The string representing the type of the data stored in the variable. Allowed are strings of native Python types, in which case the variables will be treated as Python types, or a string of the form 'numpy.type', which refers to the numpy data types. Numpy is preferred since the API to netCDF uses numpy and it is thus faster. Possible input strings are int, float, long, str, numpy.float32, numpy.float64, numpy.int8, numpy.int16, numpy.int32, numpy.int64
dimensions (str or tuple of str) – A tuple representing the dimensions used for the netcdf variable. If not specified then the default dimension of the storage is used.
simtk_units (str) – A string representing the units used if the var_type is float the units is set to none
description (str) – A string describing the variable in a readable form.
variable_length (bool) – If True, the variable is treated as a variable-length list of the given type. A built-in example of this type is a string, which is a variable-length list of char. This makes using all the mixed stuff superfluous
chunksizes (tuple of int or int) – A tuple of ints per number of dimensions. This specifies in what block sizes a variable is stored. Usually for object related stuff we want to store everything of one object at once so this is often (1, ..., ...). A single int is interpreted as a tuple with one entry.
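Two of the conventions above, the 'numpy.type' form of var_type and the single-int shorthand for chunksizes, can be sketched as small helpers. `split_var_type` and `normalize_chunksizes` are hypothetical names invented for this illustration; they are not part of the library and only mirror the wording of the parameter descriptions.

```python
def split_var_type(var_type):
    """Split a var_type string into (family, type name).

    'numpy.float32' -> ('numpy', 'float32'); 'int' -> ('python', 'int').
    Hypothetical helper, for illustration only.
    """
    prefix = "numpy."
    if var_type.startswith(prefix):
        return ("numpy", var_type[len(prefix):])
    return ("python", var_type)


def normalize_chunksizes(chunksizes):
    """Interpret a single int as a one-entry tuple, as the docs describe.

    Hypothetical helper, for illustration only.
    """
    if chunksizes is None:
        return None
    if isinstance(chunksizes, int):
        return (chunksizes,)
    return tuple(chunksizes)


numpy_kind = split_var_type("numpy.float32")
python_kind = split_var_type("int")
single = normalize_chunksizes(1)
per_object = normalize_chunksizes([1, 10, 3])
```

A chunk shape such as `(1, 10, 3)` matches the "everything of one object at once" pattern mentioned above: one entry along the object dimension and the full extent along the others.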

default_name¶

Return the default name.

Usually derived from the object's class

Returns:	the default name
Return type:	str

descendants()¶

Return a list of all subclassed objects

Returns:	list of subclasses of a storable object
Return type:	list of type

first¶

Returns the first stored object.

Returns:	the actual first stored object
Return type:	`openpathsampling.netcdfplus.base.StorableObject`

fix_name()¶

Set the object's name to be immutable.

Usually called after load and save to fix the stored state.

free()[source]¶

Return the number of the next free index for this store

Returns:	index – the number of the next free index in the storage. Used to store a new object.
Return type:	int

from_dict(dct)¶

Reconstruct an object from a dictionary representation

Parameters:	dct (dict) – the dictionary containing a state representation of the class.
Returns:	the reconstructed storable object
Return type:	`openpathsampling.netcdfplus.StorableObject`

idx(obj)[source]¶

Return the index in this store for a given object

Parameters:	obj (`openpathsampling.netcdfplus.base.StorableObject`) – the object that can be stored in this store for which its index is to be returned
Returns:	The integer index of the given object or None if it is not stored yet
Return type:	int or None

is_named¶

True if this object has a custom name.

This distinguishes default algorithmic names from assigned names.

iterator(this, iter_range=None)[source]¶

Return an iterator over all objects in the storage

Parameters:	iter_range (slice or None) – if this is not None it confines the iterator to objects specified in the slice
Returns:	The iterator that iterates the objects in the store
Return type:	`Iterator()`

last¶

Returns the last stored object. Useful to continue a run.

Returns:	the last stored object in this store
Return type:	`openpathsampling.netcdfplus.base.StorableObject`

load(idx)[source]¶

Returns an object from the storage.

Parameters:	idx (int) – the integer index of the object to be loaded
Returns:	the loaded object
Return type:	`openpathsampling.netcdfplus.base.StorableObject`

name¶

Return the current name of the object.

If no name has been set a default generated name is returned.

Returns:	the name of the object
Return type:	str

named(name)¶

Name an unnamed object.

This only renames the object if it does not yet have a name. It can be used to chain the naming onto the object creation. It should also be used when naming things algorithmically: directly setting the .name attribute could override a user-defined name.

Examples

>>> import openpathsampling as p
>>> full = p.FullVolume().named('myFullVolume')
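The interplay of `named()`, `name`, `is_named`, `default_name` and `fix_name()` can be sketched with a toy class. `ToyNamed` is hypothetical; using the class name as the default name and an empty string as the "unnamed" marker are assumptions for this sketch only.

```python
class ToyNamed:
    """Hypothetical stand-in for a nameable object; NOT the library class."""

    def __init__(self):
        self._name = ""           # assumption: empty string means "unnamed"
        self._name_fixed = False

    @property
    def default_name(self):
        # assumption: the default name is derived from the class name
        return type(self).__name__

    @property
    def is_named(self):
        return self._name != ""

    @property
    def name(self):
        return self._name if self.is_named else self.default_name

    def named(self, name):
        # only applies if the object has no custom name yet
        if not self.is_named and not self._name_fixed:
            self._name = name
        return self  # returning self allows chaining onto creation

    def fix_name(self):
        self._name_fixed = True


unnamed = ToyNamed()
obj = ToyNamed().named("user choice").named("algorithmic name")
```

Because the second `named()` call is a no-op, algorithmic naming done this way cannot overwrite a name the user already assigned.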

objects()¶

Returns a dictionary of all storable objects

Returns:	a dictionary of all classes subclassed from StorableObject. Each class name points to the class.
Return type:	dict of str: type

op_idx¶

Returns a function that returns, for an object of this storage, its idx. This can be used to construct order parameters that return the index in this storage. Useful for visualization.

Returns:	the function that reports the index (int) in this store or None if it is not stored
Return type:	function

proxy(item)[source]¶

Return a proxy of an object for this store

Parameters:	item (`openpathsampling.netcdfplus.base.StorableObject` or int) – The item or index that points to an object in this store and to which a proxy is requested.

register(storage, prefix)[source]¶

Associate the object store to a specific storage with a given prefix

Parameters:

storage (`openpathsampling.netcdfplus.NetCDFPlus`) – the storage to be associated with
prefix (str) – the name under which

release_idx(idx)[source]¶

Releases a lock on an idx

Parameters:	idx (int) – the integer index to be released

reserve_idx(idx)[source]¶

Locks an idx as used

Parameters:	idx (int) – the integer index to be reserved

save(obj, idx=None)[source]¶

Saves an object to the storage.

Parameters:

obj (`openpathsampling.netcdfplus.base.StorableObject`) – the object to be stored
idx (int or string or None) – the index to be used for storing. Specifying an index is highly discouraged since it changes an immutable object (at least in the storage). It is better to also store the new object and just ignore the previously stored one.

set_caching(caching)[source]¶

Set the caching mode for this store

Parameters:	caching (`openpathsampling.netcdf.Cache`) –

set_observer(active)¶

(De-)Activate observing creation of storable objects

This can be used to track which storable objects are still alive, and hence to look for memory leaks and inspect caching. Use openpathsampling.netcdfplus.base.StorableObject.count_weaks() to get the current summary of created objects.

Parameters:	active (bool) – if True then observing is enabled. False disables observing. Per default observing is disabled.
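How observing object creation can reveal objects that are still alive may be sketched with the standard `weakref` module. `ToyStorable` and `ToyVolume` are hypothetical classes, and counting by concrete class name (rather than by base class name) is a simplification made for this sketch.

```python
import weakref
from collections import Counter


class ToyStorable:
    """Hypothetical observable base class; NOT the library's StorableObject."""

    _observe = False
    _alive = weakref.WeakSet()  # holds no strong references to instances

    def __init__(self):
        if ToyStorable._observe:
            ToyStorable._alive.add(self)

    @staticmethod
    def set_observer(active):
        ToyStorable._observe = active

    @staticmethod
    def count_weaks():
        # simplification: count by concrete class name
        return dict(Counter(type(obj).__name__ for obj in ToyStorable._alive))


class ToyVolume(ToyStorable):
    pass


ignored = ToyVolume()            # created while observing is off: not tracked
ToyStorable.set_observer(True)
kept = [ToyVolume(), ToyVolume()]
counts = ToyStorable.count_weaks()
```

Since the registry holds only weak references, an object disappears from the counts once nothing else refers to it and it has been collected; a count that keeps growing hints at a leak or an overly generous cache.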