openpathsampling.netcdfplus.objects.ObjectStore

class openpathsampling.netcdfplus.objects.ObjectStore(content_class, json=True, nestable=False)[source]

Base class for storing complex objects in a netCDF4 file. It holds a reference to the store file.

content_class

openpathsampling.netcdfplus.base.StorableObject – a reference to the class type to be stored using this Storage. Must be subclassed from openpathsampling.netcdfplus.base.StorableObject()

json

string – a JSON-serialized string of the object, if it has already been computed

cache

openpathsampling.netcdfplus.cache.Cache – a dictionary that holds references to all stored elements by index or string for named objects. This is only used for cached access if caching is not False. Must be of type openpathsampling.netcdfplus.base.StorableObject() or subclassed.

__init__(content_class, json=True, nestable=False)[source]
Parameters:
  • content_class
  • json (bool or str json or jsonobj) – if False, the store will not create a json variable for serialization. If True, the store will use JSON pickling to store objects, and a single storable object will be serialized and not referenced. If a string is given, it is taken as the variable type of the json variable. Only two values are allowed: jsonobj (equivalent to True), or json, which will also reference directly given storable objects.
  • nestable (bool) – if True, this marks the content_class to be saved as nested dict objects rather than as pointers to saved objects. The saved complex object is then stored only once and is not split into several objects that reference each other in a tree-like fashion.

Notes

Usually you want caching, but limited. The recommended choice is an LRUCache with a reasonable maximum number of objects, chosen according to the typical number of objects to cache and their size.
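The idea of a bounded cache can be pictured with a minimal least-recently-used sketch. This is only an illustration built on collections.OrderedDict; the class name SimpleLRUCache and its max_size parameter are hypothetical and are not the openpathsampling.netcdfplus.cache API.

```python
from collections import OrderedDict


class SimpleLRUCache:
    """Hypothetical bounded cache; not the OpenPathSampling implementation."""

    def __init__(self, max_size=3):
        self.max_size = max_size
        self._data = OrderedDict()

    def __getitem__(self, key):
        # A hit moves the entry to the "most recently used" end.
        value = self._data[key]
        self._data.move_to_end(key)
        return value

    def __setitem__(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        # Evict the least recently used entries once the limit is exceeded.
        while len(self._data) > self.max_size:
            self._data.popitem(last=False)

    def __contains__(self, key):
        return key in self._data

    def __len__(self):
        return len(self._data)


cache = SimpleLRUCache(max_size=2)
cache[0] = "obj0"
cache[1] = "obj1"
_ = cache[0]        # touch index 0 so index 1 becomes the oldest entry
cache[2] = "obj2"   # exceeds max_size, so index 1 is evicted
```

A small max_size keeps memory use predictable, at the cost of reloading objects that were evicted.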

The class that takes care of storing data in a file is called a Storage, so the netCDF+ subclassed Storage is a storage. The classes that know how to load and save an object from the storage are called Store, like ObjectStore, SampleStore, etc...

The difference between json and jsonobj is subtle. Consider storing a complex object. There are two ways to do that:
  • json: store a reference to the object (provided it is stored).
  • jsonobj: serialize the object and only use references for contained objects.

All inner objects will always be stored using references. The only exception is using nestable. Consider objects that contain references to objects of the same type, e.g. operations in an equation (2*3 + 3). Each operation represents a value, but each operation needs values to operate on. To save such an object you again have two options:
  • nestable=False: store all single objects and always reference the contained objects. For an equation that would mean storing several objects: op1 = plus(op2, 3), op2 = times(2, 3). This is correct, though not intuitive.
  • nestable=True: store all the serialized objects nested into one object (string). For our example this corresponds to plus(times(2,3), 3).
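The two layouts for the equation 2*3 + 3 can be compared in a small self-contained sketch. The Op class and the helpers to_nested and to_referenced are hypothetical names used only for illustration; they mimic the idea of nested versus referenced storage and do not reproduce the actual OpenPathSampling serialization format.

```python
import json


class Op:
    """Hypothetical operation node: plus or times over ints or other Op objects."""

    def __init__(self, name, left, right):
        self.name, self.left, self.right = name, left, right


def to_nested(obj):
    # nestable=True analogue: inner operations are embedded in place.
    if isinstance(obj, Op):
        return {"op": obj.name, "args": [to_nested(obj.left), to_nested(obj.right)]}
    return obj


def to_referenced(obj, table):
    # nestable=False analogue: every operation gets its own entry in `table`
    # and is mentioned elsewhere only by its index.
    if not isinstance(obj, Op):
        return obj
    args = [to_referenced(obj.left, table), to_referenced(obj.right, table)]
    table.append({"op": obj.name, "args": args})
    return {"ref": len(table) - 1}


equation = Op("plus", Op("times", 2, 3), 3)   # 2*3 + 3

nested = json.dumps(to_nested(equation))      # one string holding everything
table = []
root = to_referenced(equation, table)         # two entries linked by index
```

In the referenced layout the times operation is stored as its own entry and the plus operation only points to it, whereas the nested layout keeps both in a single string.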

Methods

__init__(content_class[, json, nestable])
add_single_to_cache(idx, json) Add a single object to cache by json
args() Return a list of args of the __init__ function of a class
base() Return the topmost parent class that is actually derived from Storable(Named)Object
cache_all() Load all samples as fast as possible into the cache
clear_cache() Clear the cache and force reloading
count_weaks() Return the counts of how many objects of storable type are still in memory
create_variable(name, var_type[, ...]) Create a new variable in the netCDF storage.
descendants() Return a list of all subclassed objects
fix_name() Set the object's name to be immutable.
free() Return the number of the next free index for this store
from_dict(dct) Reconstruct an object from a dictionary representation
idx(obj) Return the index in this store for a given object
iterator(this[, iter_range]) Return an iterator over all objects in the storage
load(idx) Returns an object from the storage.
load_range(start, end)
load_single(idx)
named(name) Name an unnamed object.
objects() Returns a dictionary of all storable objects
prefix_delegate(dct)
proxy(item) Return a proxy of an object for this store
register(storage, prefix) Associate the object store to a specific storage with a given prefix
release_idx(idx) Releases a lock on an idx
reserve_idx(idx) Locks an idx as used
save(obj[, idx]) Saves an object to the storage.
set_caching(caching) Set the caching mode for this store
set_observer(active) (De-)Activate observing creation of storable objects
to_dict()
write(variable, idx, obj[, attribute])

Attributes

allowed_types
base_cls Return the base class
base_cls_name Return the name of the base class
cls Return the class name as a string
default_cache
default_name Return the default name.
first Returns the first stored object.
is_named True if this object has a custom name.
last Returns the last stored object.
name Return the current name of the object.
observe_objects
op_idx Returns a function that returns for an object of this storage the idx.
simplifier Return the attached simplifier instance used to create JSON serialization
storage Return the associated storage object
__delattr__

x.__delattr__('name') <==> del x.name

__format__()

default object formatter

__getattribute__

x.__getattribute__('name') <==> x.name

__getitem__(item)[source]

Enable numpy style selection of object in the store
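As a rough illustration of what numpy-style selection can mean, the sketch below accepts an integer, a slice, or a list of indices. ToyStore is a hypothetical in-memory stand-in, and the set of supported selection types is an assumption; the real store may accept more or fewer forms.

```python
class ToyStore:
    """Hypothetical in-memory stand-in for a store; not the OPS class."""

    def __init__(self, items):
        self._items = list(items)

    def __len__(self):
        return len(self._items)

    def load(self, idx):
        return self._items[idx]

    def __getitem__(self, item):
        if isinstance(item, int):
            return self.load(item)
        if isinstance(item, slice):
            # slice.indices resolves open ends against the current length.
            return [self.load(i) for i in range(*item.indices(len(self)))]
        if isinstance(item, (list, tuple)):
            return [self.load(i) for i in item]
        raise TypeError("unsupported selection: %r" % (item,))


store = ToyStore(["a", "b", "c", "d"])
single = store[1]        # one object
sliced = store[1:3]      # a contiguous range
picked = store[[0, 3]]   # an explicit list of indices
```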

__hash__
__iter__()[source]

Add iteration over all elements in the storage

__len__()[source]

Return the number of stored objects

Returns:number of stored objects
Return type:int
__reduce__()

helper for pickle

__reduce_ex__()

helper for pickle

__setattr__

x.__setattr__('name', value) <==> x.name = value

__setitem__(key, value)[source]

Enable saving using __setitem__

This only supports writing store[...] = value. Not sure if this is ever used.

__sizeof__() → int

size of object in memory, in bytes

add_single_to_cache(idx, json)[source]

Add a single object to cache by json

Parameters:
  • idx (int) – the index where the object was stored
  • json (str) – json string that represents a serialized version of the stored object
args()

Return a list of args of the __init__ function of a class

Returns:the list of argument names. No information about defaults is included.
Return type:list of str
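One way to obtain such a list of argument names is the standard inspect module. The helper init_arg_names below is hypothetical and only sketches the idea; how args() is actually implemented is not stated here.

```python
import inspect


def init_arg_names(cls):
    """Hypothetical helper: names of the __init__ parameters, without self."""
    params = inspect.signature(cls.__init__).parameters
    return [name for name in params if name != "self"]


class Example:
    def __init__(self, content_class, json=True, nestable=False):
        pass


names = init_arg_names(Example)
```

As in the description above, only the names are returned and the default values are dropped.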
base()

Return the topmost parent class that is actually derived from Storable(Named)Object

Important to determine which store should be used for storage

Returns:the base class
Return type:type
base_cls

Return the base class

Returns:the base class
Return type:type

See also

base()

base_cls_name

Return the name of the base class

Returns:the string representation of the base class
Return type:str
cache_all()[source]

Load all samples as fast as possible into the cache

clear_cache()[source]

Clear the cache and force reloading

cls

Return the class name as a string

Returns:the class name
Return type:str
count_weaks()

Return the counts of how many objects of storable type are still in memory

This includes objects not yet recycled by the garbage collector.

Returns:a dictionary that maps the base class name of each referenced object to the number of objects of that type still present
Return type:dict of str: int
create_variable(name, var_type, dimensions=None, chunksizes=None, **kwargs)[source]

Create a new variable in the netCDF storage. This is just a helper function to structure the code better.

Parameters:
  • name (str) – The name of the variable to be created
  • var_type (str) – The string representing the type of the data stored in the variable. Allowed are strings of native python types, in which case the variables will be treated as python, or a string of the form 'numpy.type', which will refer to the numpy data types. Numpy is preferred since the API to netCDF uses numpy and is thus faster. Possible input strings are int, float, long, str, numpy.float32, numpy.float64, numpy.int8, numpy.int16, numpy.int32, numpy.int64.
  • dimensions (str or tuple of str) – A tuple representing the dimensions used for the netcdf variable. If not specified then the default dimension of the storage is used.
  • simtk_units (str) – A string representing the units used if the var_type is float the units is set to none
  • description (str) – A string describing the variable in a readable form.
  • variable_length (bool) – If true the variable is treated as a variable length (list) of the given type. A built-in example for this type is a string which is a variable length of char. This makes using all the mixed stuff superfluous.
  • chunksizes (tuple of int or int) – A tuple of ints per number of dimensions. This specifies in what block sizes a variable is stored. Usually for object related stuff we want to store everything of one object at once so this is often (1, ..., ...). A single int is interpreted as a tuple with one entry.
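Two of the conventions in the parameter list, the 'numpy.type' form of var_type and a single int for chunksizes, can be sketched with small helpers. Both function names are hypothetical and exist only for this illustration; they do not call into netCDF or OpenPathSampling.

```python
def normalize_chunksizes(chunksizes):
    """Hypothetical helper: a single int becomes a one-entry tuple."""
    if chunksizes is None:
        return None
    if isinstance(chunksizes, int):
        return (chunksizes,)
    return tuple(chunksizes)


def split_var_type(var_type):
    """Hypothetical helper: separate 'numpy.float32' into ('numpy', 'float32')."""
    prefix = "numpy."
    if var_type.startswith(prefix):
        return ("numpy", var_type[len(prefix):])
    # Anything else is treated as the name of a native python type.
    return ("python", var_type)


one_object_per_chunk = normalize_chunksizes(1)
explicit_chunks = normalize_chunksizes([1, 10, 3])
numpy_kind = split_var_type("numpy.float32")
python_kind = split_var_type("int")
```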
default_name

Return the default name.

Usually derived from the object's class

Returns:the default name
Return type:str
descendants()

Return a list of all subclassed objects

Returns:list of subclasses of a storable object
Return type:list of type
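A list of all subclasses can be collected by walking __subclasses__() recursively. The helper all_descendants and the example classes are hypothetical; this is a sketch of the general technique, not the library's own code. The same list can be turned into a name-to-class dictionary, similar in spirit to what objects() is described to return.

```python
def all_descendants(cls):
    """Hypothetical helper: every direct and indirect subclass of cls."""
    found = []
    for sub in cls.__subclasses__():
        found.append(sub)
        found.extend(all_descendants(sub))
    return found


class Base: ...
class Child(Base): ...
class GrandChild(Child): ...
class Other(Base): ...


subclasses = all_descendants(Base)
by_name = {c.__name__: c for c in subclasses}   # name points to the class
```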
first

Returns the first stored object.

Returns:the actual first stored object
Return type:openpathsampling.netcdfplus.base.StorableObject
fix_name()

Set the object's name to be immutable.

Usually called after load and save to fix the stored state.

free()[source]

Return the number of the next free index for this store

Returns:index – the number of the next free index in the storage. Used to store a new object.
Return type:int
from_dict(dct)

Reconstruct an object from a dictionary representation

Parameters:dct (dict) – the dictionary containing a state representation of the class.
Returns:the reconstructed storable object
Return type:openpathsampling.netcdfplus.StorableObject
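The relationship between to_dict() and from_dict(dct) is a round trip: the dictionary must hold enough state to rebuild an equivalent object. The Point class below is hypothetical and assumes that the dictionary keys match the __init__ arguments, which is a simplification.

```python
class Point:
    """Hypothetical storable-like class; not part of OpenPathSampling."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def to_dict(self):
        # Plain dictionary holding everything needed to rebuild the object.
        return {"x": self.x, "y": self.y}

    @classmethod
    def from_dict(cls, dct):
        # Rebuild by passing the stored state back to __init__.
        return cls(**dct)


original = Point(1.0, 2.0)
restored = Point.from_dict(original.to_dict())
```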
idx(obj)[source]

Return the index in this store for a given object

Parameters:obj (openpathsampling.netcdfplus.base.StorableObject) – the object that can be stored in this store for which its index is to be returned
Returns:The integer index of the given object or None if it is not stored yet
Return type:int or None
is_named

True if this object has a custom name.

This distinguishes default algorithmic names from assigned names.

iterator(this, iter_range=None)[source]

Return an iterator over all objects in the storage

Parameters:iter_range (slice or None) – if this is not None it confines the iterator to objects specified in the slice
Returns:The iterator that iterates the objects in the store
Return type:Iterator()
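Confining an iterator with a slice can be sketched with slice.indices, which resolves open-ended slices against the number of available objects. The function iterate is hypothetical and works on a plain list rather than on a real store.

```python
def iterate(items, iter_range=None):
    """Hypothetical generator: yield items, optionally limited by a slice."""
    if iter_range is None:
        indices = range(len(items))
    else:
        # slice.indices clamps start/stop/step to the available length.
        indices = range(*iter_range.indices(len(items)))
    for i in indices:
        yield items[i]


data = ["s0", "s1", "s2", "s3", "s4"]
everything = list(iterate(data))
every_other = list(iterate(data, slice(0, None, 2)))
```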
last

Returns the last stored object. Useful to continue a run.

Returns:the last stored object in this store
Return type:openpathsampling.netcdfplus.base.StorableObject
load(idx)[source]

Returns an object from the storage.

Parameters:idx (int) – the integer index of the object to be loaded
Returns:the loaded object
Return type:openpathsampling.netcdfplus.base.StorableObject
name

Return the current name of the object.

If no name has been set a default generated name is returned.

Returns:the name of the object
Return type:str
named(name)

Name an unnamed object.

This only renames the object if it does not yet have a name. It can be used to chain the naming onto the object creation. It should also be used when naming things algorithmically: directly setting the .name attribute could override a user-defined name.
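The rename-only-if-unnamed rule, together with is_named and the default name fallback, can be mimicked in a few lines. Nameable is a hypothetical class, and the bracketed default name format is an assumption made for this sketch.

```python
class Nameable:
    """Hypothetical class mimicking the naming rules described here."""

    def __init__(self):
        self._name = ""

    @property
    def is_named(self):
        return self._name != ""

    @property
    def name(self):
        # Fall back to a default derived from the class when unnamed
        # (the "[ClassName]" format is an assumption of this sketch).
        return self._name if self.is_named else "[%s]" % type(self).__name__

    def named(self, name):
        # Only assign a name if none was set, then return self for chaining.
        if not self.is_named:
            self._name = name
        return self


obj = Nameable().named("first")
obj.named("second")   # ignored, because the object already has a name
```

Returning self from named() is what allows chaining the call onto object creation.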

Examples

>>> import openpathsampling as p
>>> full = p.FullVolume().named('myFullVolume')
objects()

Returns a dictionary of all storable objects

Returns:a dictionary of all classes subclassed from StorableObject. Each name points to its class.
Return type:dict of str: type
op_idx

Returns a function that returns, for an object of this storage, its idx. This can be used to construct order parameters that return the index in this storage. Useful for visualization.

Returns:the function that reports the index (int) in this store or None if it is not stored
Return type:function
proxy(item)[source]

Return a proxy of an object for this store

Parameters:item (openpathsampling.netcdfplus.base.StorableObject or int) – The item or index that points to an object in this store and to which a proxy is requested.
register(storage, prefix)[source]

Associate the object store to a specific storage with a given prefix

Parameters:
  • storage (openpathsampling.netcdfplus.NetCDFPlus) – the storage to be associated with
  • prefix (str) – the name under which
release_idx(idx)[source]

Releases a lock on an idx

Parameters:idx (int) – the integer index to be released
reserve_idx(idx)[source]

Locks an idx as used

Parameters:idx (int) – the integer index to be reserved
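How free(), reserve_idx(idx) and release_idx(idx) might interact can be shown with a toy allocator. IndexAllocator and its attributes are hypothetical; the actual locking behaviour of the store, including whether free() skips reserved indices, is an assumption here.

```python
class IndexAllocator:
    """Hypothetical index bookkeeping; not the OPS implementation."""

    def __init__(self):
        self.n_stored = 0        # number of objects already written
        self._reserved = set()   # indices promised to pending saves

    def free(self):
        # Next index that is neither written nor reserved.
        idx = self.n_stored
        while idx in self._reserved:
            idx += 1
        return idx

    def reserve_idx(self, idx):
        if idx in self._reserved:
            raise RuntimeError("index %d is already reserved" % idx)
        self._reserved.add(idx)

    def release_idx(self, idx):
        self._reserved.discard(idx)


alloc = IndexAllocator()
first = alloc.free()
alloc.reserve_idx(first)         # lock the index while an object is saved
second = alloc.free()            # a second save gets a different index
alloc.release_idx(first)
after_release = alloc.free()
```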
save(obj, idx=None)[source]

Saves an object to the storage.

Parameters:
  • obj (openpathsampling.netcdfplus.base.StorableObject) – the object to be stored
  • idx (int or string or None) – the index to be used for storing. This is highly discouraged since it changes an immutable object (at least in the storage). It is better to store also the new object and just ignore the previously stored one.
set_caching(caching)[source]

Set the caching mode for this store

Parameters:caching (openpathsampling.netcdf.Cache) –
set_observer(active)

(De-)Activate observing creation of storable objects

This can be used to track which storable objects are still alive, and hence to look for memory leaks and inspect caching. Use openpathsampling.netcdfplus.base.StorableObject.count_weaks() to get the current summary of created objects.

Parameters:active (bool) – if True then observing is enabled. False disables observing. Per default observing is disabled.
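Observing object creation without keeping objects alive is typically done with weak references. The sketch below uses weakref.WeakSet from the standard library; the Tracked and Snapshot classes are hypothetical, and the real bookkeeping behind set_observer and count_weaks may differ.

```python
import gc
import weakref
from collections import Counter


class Tracked:
    """Hypothetical base class that can observe instance creation."""

    _observe = False
    _alive = weakref.WeakSet()   # does not keep the objects alive

    def __init__(self):
        if Tracked._observe:
            Tracked._alive.add(self)

    @staticmethod
    def set_observer(active):
        Tracked._observe = active

    @staticmethod
    def count_weaks():
        # Count the objects that still exist, grouped by class name.
        return dict(Counter(type(obj).__name__ for obj in Tracked._alive))


class Snapshot(Tracked): ...


Tracked.set_observer(True)
kept = [Snapshot(), Snapshot()]
temporary = Snapshot()
del temporary
gc.collect()          # make sure the dropped object is really gone
counts = Tracked.count_weaks()
```

A count that keeps growing while objects should have been released is a hint that something, for example a cache, still holds references.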
simplifier

Return the attached simplifier instance used to create JSON serialization

Returns:the simplifier object used in the associated storage
Return type:openpathsampling.netcdfplus.base.dictify.StorableObjectJSON
storage

Return the associated storage object

Returns:the referenced storage object
Return type:openpathsampling.netcdfplus.NetCDFPlus