Code Structure
OpenPathSampling can be divided into several subsystems that interact with each other. Usually, new additions to the code will focus on only a small number of these subsystems – often just one. Knowing how they all fit together will help you identify what you should add.
Storing to file (Storage)
Underlying everything in OpenPathSampling is the storage subsystem. Nearly all OPS objects inherit from our storable object classes, which makes the objects themselves storable (enabling things like restart files). While you shouldn’t need to modify the implementation of storage, you will need a basic understand of what storage can and can’t automatically do, and you may occasionally need to override some methods. All of this is covered in the documentation on storage.
Representing simulation data
These are the core representations of simulation data. These are
Snapshot
, Trajectory
, Sample
,
MoveChange
, and SampleSet
.
You generally won’t need to modify or subclass these (except for
Snapshots
, which some DynamicsEngines
might need to subclass). If you’re writing new analysis
tools, you’ll need to be familiar with how these work. See the section on
data objects in OPS.
Running Dynamics (Engines)
The engines
subsystem runs the dynamics. If you want to add a new
molecular dynamics engine, this is the subsystem you’ll need to become
familiar with.
The main classes you may need to subclass are:
In addition, you may need to add custom snapshot features
. See the
documentation on engines in order to learn more about adding
support for new engines.
Interpreting Dynamics (CVs and Volumes)
For the dynamics to have any sense, we need to be able to analyze each
frame. The subsystem of the code is mainly focused on frame-by-frame
analysis of a trajectory (i.e., the input is a single Snapshot
).
The main classes here are:
CollectiveVariable
, which maps a snapshot to some quantity (usually a single float). This could be, e.g., a specific distance of interest.Volume
, which maps a snapshot to a boolean value, indicating whether the snapshot is within that phase space volume. This could represent a stable state, or the “inside” of an interface in TIS.
The main reason you might subclass these is if you want to create a custom
CollectiveVariable
wrapper for some analysis package. We already
have wrappers for MDTraj and PyEmma; more wrappers would be welcome! Details on what
to do are including in the section on collective variables.
Path Ensembles (Ensembles)
For efficient path sampling, we need to not only run dynamics, but also
to know when to stop the dynamics. In the earlier path sampling methods, all
dynamics were of a fixed length. But more recent approaches have gained
significant efficiency by defining stopping criteria that depend on the
trajectory that has been run so far. These stopping criteria can be derived
from the particle path ensemble being sampled, and for us, this is one of
the primary purposes of the Ensemble
object.
In general, we discourage you from trying to subclass Ensemble
. In
most cases, it is probably easier to use our existing tools and to create
the ensemble you want from them. ???TODO: add docs about writing custom
ensembles???
If you need to create custom ensembles, it is very likely that you will also
want to create a custom TransitionNetwork
. See the discussion
under “Higher-Level Tools.”
Monte Carlo Moves (PathMovers)
Path sampling is Monte Carlo in path space, so of course we need objects
that perform Monte Carlo moves. These are our PathMovers
.
Creating custom Monte Carlo moves is one of the common tasks for developers
of new techniques in path sampling, and so we have tried to make it easy.
However, in order to make it easy for other users to mix-and-match your
custom path movers with other path movers, you should also implement a
custom MoveStrategy
. Details on both of these are in the
documentation on subclassing PathMovers and MoveStrategies.
Higher-Level Tools (Networks and MoveStrategies)
In real path sampling simulations, there are often many ensembles and many path movers. Furthermore, frequently each ensemble needs to know its context – how it relates to other ensembles – in order for its analysis to have any meaning.
For this reason, we have created some higher-level tools which act as
factories for the ensembles and path movers. If you are interested in doing
path sampling of a custom ensemble, then you probably want to write a custom
TransitionNetwork
object. This isn’t necessary if your ensemble
is only used temporarily, but if you actually intend to sample and analyze
that ensemble, it should be part of the TransitionNetwork
. The
documentation on networks ???move from topics??? explains how to create
custom networks.
As mentioned above, if you’re writing a custom path mover, you’ll also want
to write a custom MoveStrategy
. The MoveStrategy
allows
a user to define a desired type of move. For example, using
MoveStrategy
, a user could, in one or two lines, choose to have a
different method for selecting the shooting point, or could even create two
sets of shooting moves for the different methods.
The MoveStrategy
is then appended to the MoveScheme
,
which sorts the strategies into a meaningful order, and then asks each
strategy to create its moves. Details on creating MoveStrategies
are with the documentation on custom path movers.
Simulations (PathSimulators)
Most of OPS is designed around path sampling: that is, the Monte Carlo
sampling of path ensembles. However, much of the machinery can be used for
other purposes. If you’re interested in using OPS for something other than
path sampling, you’ll need to create a new PathSimulator
subclass.
The PathSimulator
is essentially the “main” function of OPS, and
details on subclassing it can be found in the section on
PathSimulators.