Multiple state TIS on Alanine Dipeptide

This examples uses a 6-state model of alanine dipeptide and shows how to use multiple state TIS. This is based on previous work by Du and Bolhuis.

AD_mstis_1_setup

Alanine Multistate¶

Example Simulation¶

This uses the state definition from [1]. 6 states named A,B,C,D,E and F

[1] W.-N. Du, K. A. Marino, and P. G. Bolhuis, “Multiple state transition interface sampling of alanine dipeptide in explicit solvent,” J. Chem. Phys., vol. 135, no. 14, p. 145102, 2011.

Import and general setup¶

First we tell the ipython notebook to put figures inside the notebook

In [1]:

%matplotlib inline
%config InlineBackend.figure_format = 'svg'

And import lots of modules

In [2]:

from __future__ import print_function
# standard packages
import numpy as np
import mdtraj as md
import pandas as pd
import math
import random
import time

# helpers for phi-psi plotting
import alatools as ala
import matplotlib.pyplot as plt

# OpenMM
from simtk.openmm import app
import simtk.unit as unit

# OpenMMTools
import openmmtools as omt

# OpenPathSampling
import openpathsampling as paths
from openpathsampling.tools import refresh_output

# Visualization of PathTrees
import openpathsampling.visualize as ops_vis
#from openpathsampling.visualize import PathTree
from IPython.display import SVG

# the openpathsampling OpenMM engine
import openpathsampling.engines.openmm as eng

Set simulation options and create a simulator object¶

Create an AlanineOpenMMSimulator for demonstration purposes

We will need a openmm.System object and an openmm.Integrator object.

To learn more about OpenMM, read the OpenMM documentation. The code we use here is based on output from the convenient web-based OpenMM builder.

We first create a snapshot using an existing PDB file. The contained topology is necessary for the OpenMM system object to compute all forces.

In [3]:

pdb_file = "../resources/AD_initial_frame.pdb"

In [4]:

template = eng.snapshot_from_pdb(pdb_file)

1. the force field¶

using AMBER96 as in the original paper with Tip3P water.

In [5]:

forcefield = app.ForceField('amber96.xml', 'tip3p.xml')

2. the system object¶

and we use the template to create the necessary OpenMM Topology object. Note that an openmm topology is (a little bit) different from an mdtraj.Topology so we need to convert.

In [6]:

pdb = app.PDBFile(pdb_file)

system = forcefield.createSystem(
    pdb.topology, 
    nonbondedMethod=app.PME, 
    nonbondedCutoff=1.0*unit.nanometers,
    constraints=app.HBonds, 
    rigidWater=True,
    ewaldErrorTolerance=0.0005
)

3. the integrator¶

In [7]:

integrator_low = omt.integrators.VVVRIntegrator(
    temperature=300 * unit.kelvin,  # temperature
    timestep=2.0 * unit.femtoseconds  # integration step size
)
integrator_low.setConstraintTolerance(0.00001)

4. the platform¶

we let OpenMM decide to use the fastest platform available. You can ask an OpenMM engine what platform it is using with the .platform attribute. You can actually override this behaviour when you initialize the engine and pass either a string or a platform object.

In [8]:

print(eng.Engine.available_platforms())

['Reference', 'CPU', 'OpenCL']

5. OpenMM properties¶

There are lots of options. For speed we pick mixed precision on GPUs. Most GPU programming libraries have difficulty with double precision. For our purposes this should be fine and it is faster than double precision. These are the options passed

In [9]:

platform = 'CPU'

if platform == 'OpenCL':
    openmm_properties = {
        'OpenCLPrecision': 'mixed',
        'OpenCLDeviceIndex': "2"  # pick the correct index if you have several instances
    }
elif platform == 'CUDA':
    openmm_properties = {'CudaPrecision': 'mixed'}
elif platform == 'CPU':
    openmm_properties = {}
else:
    openmm_properties = {}

6. OPS options¶

An engine in OpenPathSampling also has general options that are the same for any engine, like the number of steps until a frame is stored or the maximal number of steps until an engine will automatically stop.

In [10]:

engine_low_options = {
    'n_frames_max': 5000,
    'n_steps_per_frame': 10
}

Finally build the main engine from the parts

In [11]:

engine_low = eng.Engine(
    template.topology,
    system,
    integrator_low,
    openmm_properties=openmm_properties,
    options=engine_low_options
)
engine_low.name = 'default'

For the exploration of state space and getting an initial trajectory we also want to simulate at a very high temperature. Here 750K. This is totally unphysical, but all we want is to generate conformations that are not completely wrong in the sense that they could exist at low temperatures.

Set a high temperature simulation for exploration.

For OpenMM all we need to change is the replace the integrator with a new temperature and half the stepsize. We will later pick only every second frame

In [12]:

integrator_high = omt.integrators.VVVRIntegrator(
    temperature=1000 * unit.kelvin,  # temperature
    timestep=2.0 * unit.femtoseconds  # integration step size
)
integrator_high.setConstraintTolerance(0.00001)

In [13]:

engine_high_options = {
    'n_frames_max': 5000,
    'n_steps_per_frame': 10  # twice as many steps with half stepsize
}

An clone the engine using a new integrator

In [14]:

engine_high = engine_low.from_new_options(
    integrator=integrator_high,
    options=engine_high_options)
engine_high.name = 'high'

For now we use the default engine at

In [15]:

engine_high.initialize(platform)

print('High-Engine uses')
print('platform `%s`' % engine_high.platform)
print('temperature `%6.2f K' % (
    float(engine_high.integrator.getGlobalVariableByName('kT')) / 0.0083144621))

High-Engine uses
platform `CPU`
temperature `1000.00 K

Create the storage¶

We open a new empty storage

In [16]:

storage_file = 'ala_mstis_bootstrap.nc'
storage = paths.Storage(storage_file, 'w', template=template)

And store both engines in it. Since these are named you can load these later using storage.engines[name].

In [17]:

storage.save(engine_low);
storage.save(engine_high);

And store a template for convenience.

In [18]:

storage.tag['template'] = template

State Definitions¶

We define states A-F. The specifications are taken from the paper [^1].

The list of state names

In [19]:

states = ['A', 'B', 'C', 'D', 'E', 'F']

If you just want to play around you might want to just simulate the first 4 states, which is much faster. You can do so by uncommenting the line below

In [20]:

states = ['A', 'B', 'C', 'D']

Define the centers.

In [21]:

state_centers = {
    'A' : [-150, 150], 
    'B' : [-70, 135], 
    'C' : [-150, -65], 
    'D' : [-70, -50], 
    'E' : [50, -100], 
    'F' : [40, 65]
}

And the radii of the interfaces

In [22]:

interface_levels = {
    'A' : [20, 45, 65, 80],
    'B' : [20, 45, 65, 75],
    'C' : [20, 45, 60],
    'D' : [20, 45, 60],
    'E' : [20, 45, 65, 80],
    'F' : [20, 45, 65, 80],
}

And also save these information for later convenience

In [23]:

storage.tag['states'] = states
storage.tag['state_centers'] = state_centers
storage.tag['interface_levels'] = interface_levels

Order Parameters¶

this generates an order parameter (callable) object named psi (so if we call psi(trajectory) we get a list of the values of psi for each frame in the trajectory). This particular order parameter uses mdtraj's compute_dihedrals function, with the atoms in psi_atoms.

The .with_diskcache will tell the CVs to also create some space to save the values of this CV in the same storage where the CV gets saved. Usually you want to save the direct CVs with diskcache while CVs that build upon these CVs should be fine. The reason is this:

Loading large snapshots from disk is expensive, while compute something from snapshots in memory is fast. So for analysis re-computing values is fine as long as we do not need to reload the snapshots into memory. In our case this means that the direct CVs (CVs that require accessing snapshots properties) are phi and psi since everything else is based upon this, while the distance to the state center opA to opF is an indirect CV and should not be stored.

In [24]:

psi_atoms = [6,8,14,16]
psi = paths.MDTrajFunctionCV(
    name="psi", 
    f=md.compute_dihedrals,
    topology=template.topology,
    indices=[psi_atoms]
).with_diskcache()

phi_atoms = [4,6,8,14]
phi = paths.MDTrajFunctionCV(
    name="phi", 
    f=md.compute_dihedrals,
    topology=template.topology,
    indices=[phi_atoms]
).with_diskcache()

storage.save([psi, phi]);

Define a function that defines a distance in periodic $\phi,\psi$-space.

In [25]:

def circle_degree(snapshot, center, phi, psi):
    import numpy
    p = numpy.array([phi(snapshot), psi(snapshot)]) / numpy.pi * 180.0
    delta = numpy.abs(center - p)
    delta = numpy.where(delta > 180.0, delta - 360.0, delta)
    return numpy.hypot(delta[0], delta[1])

Create CVs for all states by using the circle_degree function with different centers

In [26]:

cv_state = dict()
for state in state_centers:
    op = paths.FunctionCV(
        name = 'op' + state,
        f=circle_degree,
        center=np.array(state_centers[state]),
        phi=phi,
        psi=psi
    )
    cv_state[state] = op

Volumes¶

Volume define regions in state space using given CVs. We define here the regions around the state centers. Their boundaries correspond to the interfaces as used for TIS. Crossing an interface thus corresponds to leaving an interface volume into the next larger one.

In [27]:

interface_sets = {}
for state, levels in interface_levels.items():
    print(levels)
    interface_sets[state] = \
        paths.VolumeInterfaceSet(cv_state[state], 0.0, levels)

[20, 45, 65, 80]
[20, 45, 65, 75]
[20, 45, 60]
[20, 45, 60]
[20, 45, 65, 80]
[20, 45, 65, 80]

Create Volume objects for all states. In our setup we chose the lowest interface volume for the state definition .

In [28]:

vol_state = {}
for state, levels in interface_levels.items():
#    vol_state[state] = interface_sets[state][0]
    vol_state[state] = paths.VolumeInterfaceSet(cv_state[state], 0.0, 10)[0]
    vol_state[state].name = state

In [ ]:

Visualize in Phi/Psi space¶

We use a littl helper function specific for phi/psi plots to illustrate what is happening. This code will not affect the generation of data but will help in visualizing the system

In [29]:

# reload(ala)
plot = ala.TwoCVSpherePlot(
    cvs=(phi, psi),
    states=[vol_state[vol] for vol in states],
    state_centers=[state_centers[vol] for vol in states],
    interface_levels=[interface_levels[vol] for vol in states]
)

plot.new()
# plot the phi/psi plot and show all states and interfaces
plot.main()
# add the initial template
plot.add_snapshot(template, label='initial')

No description has been provided for this image

Set up the MSTIS network¶

We want to simulate using multistate transition interface sampling and to alleviate the effort of setting up all states, interfaces, appropriate pathensembles and a scheme on how to generate samples we use some helper construct like in the other examples. In our case the MSTISNetwork comes first and contains the definitions of all states, their interfaces as well as associated CVs.

To make use of the optional Multi-State Outer Interface MSOuterInterface we need to define it. It will allow reversing samples that connect two states and overcomes the issue that in all present moves the initial state needs to remain the same.

In [30]:

ms_outers = paths.MSOuterTISInterface.from_lambdas(
    {ifaces: max(ifaces.lambdas) + 4
     for state, ifaces in interface_sets.items()}
)

In [31]:

mstis = paths.MSTISNetwork([
    (vol_state[state], interface_sets[state])          # core, interface set
    for state in states],
    ms_outers=ms_outers
)

And save the network

In [32]:

storage.save(paths.Trajectory([template]))
storage.tag['network'] = mstis

Set up the `MoveScheme`¶

Next, we need to specify the actual way in which we want to simulate the Network. This could be done by constructing a suitable PathMover yourself or employing the power of a MoveScheme that will generate all nevessary movers for you. The MoveScheme also effectively knows which initial samples in which ensembles you need. Beside the ones indirectly defined by the network the movescheme might not need some of these because certain moves will not be used. It could also happen that a movescheme require an extra ensemble to work for special moves that are not directly known from the network definition.

In our case we will use the DefaultScheme that uses RepEx, Reversal, Shooting and MinusMoves.

In [33]:

scheme = paths.DefaultScheme(mstis)

Initial trajectories¶

Next, we want to generate initial pathways for all ensembles, e.g. The TISEnsembles and MinusInterfaceEnsembles. There are several ways to do so. A very easy approach is to explore the state space at a high temperature and wait until all states have been visited. Splitting the long trajectory appropriately would give us at least one valid trajectory for each TIS ensemble. In a second step we can use over sub parts of the long trajectory and extend these into minus ensembles.

So let's try this: We generate an ensemble that is True if a trajectory is at least completely outside one state A-F. It will report false once all states have been hit, which is what we want. We will ues this as a stopping condition to generate a first long trajectory at high temperature.

In [34]:

state_volumes = list(vol_state.values())

In [35]:

visit_all_states = paths.join_ensembles([paths.AllOutXEnsemble(state) for state in state_volumes])

In [36]:

# trajectory = engine_high.generate(template, [hit_all_states.can_append])

This will take some time, even when running at a high temperature. Since this type of ensemble is quite common, and since we'd also like get progress updates while the trajectory is running, we created a specialized ensemble that wraps around the idea sketched out above. This ensemble gives a trajectory that will have visited all states, and reports progress while it is running.

In [37]:

visit_all_states = paths.VisitAllStatesEnsemble(state_volumes)

In [38]:

trajectory = engine_high.generate(template, [visit_all_states.can_append])

Ran 3126 steps. Found states [F,A,D,B,E,C]. Looking for [].

Output the long trajectory and see if we hit all states

In [39]:

plot.new()
plot.main()
plot.add_trajectory(trajectory)

In [40]:

tps= paths.TPSNetwork.from_states_all_to_all(vol_state.values())
tps_ensemble = tps.sampling_ensembles[0]
trajectories = tps_ensemble.split(trajectory)

In [41]:

print(trajectories)

[Trajectory[70], Trajectory[138], Trajectory[107], Trajectory[42], Trajectory[21], Trajectory[61], Trajectory[132], Trajectory[60], Trajectory[122], Trajectory[34], Trajectory[16], Trajectory[47], Trajectory[209], Trajectory[13], Trajectory[12], Trajectory[74], Trajectory[49], Trajectory[51], Trajectory[30], Trajectory[12], Trajectory[25], Trajectory[28], Trajectory[9], Trajectory[93], Trajectory[64], Trajectory[37]]

Find initial samples¶

It can be difficult to get initial samples for an ensemble for various reasons. And in fact for this example it might be the most tricky part in setting up the path simulation.

In theory you could just pick a frame and extend forward and backward until an ensemble reports cannot append anymore and check if that sample is in the ensemble. While this works in many cases it can be rather inefficient. For us it is better to reuse the previously computed trajectories as a basis to generate samples.

Bottomline: The next command just does what you would try to do with the long trajectory for you: Use split and extend to generate the required set of samples to start.

If you are interested in details keep reading, otherwise execute the next cell.

To do so, we use a number of functions from Ensemble that will try to generate a sample from a list of initial trajectories in different ways:

ensemble.get_sample_from_trajectories : directly search for candidate sample among the given trajectories
ensemble.split_sample_from_trajectories : split the given trajectories in valid samples
ensemble.extend_sample_from_trajectories : try to extend suitable sub-trajectories into valid samples

Instead of calling these functions directly we make use of a function from MoveScheme that uses its knowledge about the necessary ensembles to run these function for all necessary ensembles to create a suitable set of initial samples.

When doing this there are several aspects that we might want to take into account

also use reversed copies and try these
avoid reusing samples for better decorrelation.
try to use smaller trajectories as these are usually better equilibrated

The function scheme.initial_conditions_from_trajectories can take care of all of these. We will run it with the default setting but also use the extend-complex and extend-minimum strategies besides split. That means we try different strategies in a specific order.

split: try to split the trajectory and look for valid samples
extend-complex: try to look for suitable sub-trajectories that can be extended into valid samples. Currently this works only for MinusInterfaceEnsembles and it will extend A-X-A trajectories into Minus Ensembles.
extend-minimum: try to look for trajectories that cross an interface A-X and try to extend these into a full ensemble. This one will work for TISEnsemble and MinusInterfaceEnsemble. We will use this last method to create minus ensembles for the last state found. Since we stop when we discover it, we will not have generated a sample of type A-X-A. This might also happen if we only visit a state once.

Note, that the creation of sample might fail: Extending might just not reach the desired target state. By default, it will make 3 attempts per extendable subtrajectory. The whole process may take some time. Basically because we need to split a very long trajectory and to use the shortest possible ones we need to really search for all possible sub-trajectories.

In [42]:

total_sample_set = scheme.initial_conditions_from_trajectories(
    trajectories)

Missing ensembles:
*  [Out A minus]
*  [Out B minus]
*  [Out C minus]
*  [Out D minus]
No extra ensembles.

Let's see, if we were successful. If so, the next function should not show any missing or extra ensembles.

In [43]:

print(scheme.initial_conditions_report(total_sample_set))

Missing ensembles:
*  [Out A minus]
*  [Out B minus]
*  [Out C minus]
*  [Out D minus]
No extra ensembles.

In [44]:

plt.figure(21, (12,5))
plt.subplot(121)
plt.title('High temperature')
plot.main()
for s in total_sample_set:
    plot.add_trajectory(s.trajectory, line=True)

If it does, this means that our attempt to extend was probably not successful. So we can just try again until it does. To extend an existing sample_set you need to pass it to the function.

In [45]:

# loop until we are done
while scheme.check_initial_conditions(total_sample_set)[0]:
    total_sample_set = scheme.initial_conditions_from_trajectories(
        trajectory,
        sample_set=total_sample_set,  # this is new
        strategies=[  # we omit the `split` strategy
            'extend-complex',
            'extend-minimal'],
        engine=engine_high)

No missing ensembles.
No extra ensembles.

In [46]:

plt.figure(21, (12,5))
plt.subplot(121)
plt.title('High temperature')
plot.main()
for s in total_sample_set:
    plot.add_trajectory(s.trajectory, line=True)

Done.

Equilibration¶

For the next steps we need the engine running at the desired (room) temperature. So we create a new engine. Since we might run on a local machine and do not have several instances of GPU available we will unload the existing engine and create a new one.

In [47]:

engine_high.unload_context()  # delete the current openmm.Context object
engine_low.initialize(platform)

refresh_output('Low-Engine uses', refresh=False)
print('platform `%s`' % engine_low.platform)
print('temperature `%6.2f K' % (
    float(engine_low.integrator.getGlobalVariableByName('kT')) / 0.0083144621))

Low-Engine usesplatform `CPU`
temperature `300.00 K

In molecular dynamics, you need to equilibrate if you don't start with an equilibrium frame (e.g., if you start with solvent molecules on a grid, your system should equilibrate before you start taking statistics). Similarly, if you start with a set of paths which are far from the path ensemble equilibrium, you need to equilibrate. This could either be because your trajectories are not from the real dynamics (generated with metadynamics, high temperature, etc.) or because your trajectories are not likely representatives of the path ensemble (e.g., if you put transition trajectories into all interfaces).

As with MD, running equilibration can be the same process as running the total simulation. However, in path sampling, it doesn't have to be: we can equilibrate without replica exchange moves or path reversal moves, for example. In the example below, we create a MoveScheme that only includes shooting movers.

In [48]:

equil_scheme = paths.OneWayShootingMoveScheme(mstis, engine=engine_low)

Note, that some Schemes may change the actual required set of initial samples. For the equilibration we apply shooting moves only to TIS ensembles and not to the MinusInterfaceEnsembles.

Create the PathSampler object that can run the equilibration.

In [49]:

equilibration = paths.PathSampling(
    storage=storage,
    sample_set=total_sample_set,
    move_scheme=equil_scheme,
)

And run lots of equilibration steps. This is not really necessary, but we generated from the wrong path distribution and this should give more likeli paths for the right temperature.

In [50]:

equilibration.run(500)
equilibrated_sset = equilibration.sample_set

Working on Monte Carlo cycle number 500
Running for 32 minutes 31 seconds -  3.91 seconds per step
Estimated time remaining: 3 seconds
DONE! Completed 500 Monte Carlo cycles.

Let's have a quick look how many frames in TIS ensembles are already generated by the default engine at 300K and how many were done using the high engine at 1000K.

In [51]:

engine_list = {}

unique_snapshots = set(sum(
    [
        samp.trajectory for samp in equilibrated_sset 
        if isinstance(samp.ensemble, paths.TISEnsemble)
    ], 
    []))

for samp in equilibrated_sset:
    if isinstance(samp.ensemble, paths.TISEnsemble):
        for snap in samp.trajectory:
            eng = snap.engine
            engine_list[eng] = engine_list.get(eng, 0) + 1
        
for eng, counts in engine_list.items():
    print('%20s : %6d' % (eng.name, counts))

             default :   1310

Finally, save the list of samples for later without any reference to previous samples. This is necessary since usually a sample knows its origin sample and saving a sample as is, would trigger saving the whole history of all intermediate samples used. Since we are not interested in the history of equilibration, this saves a lot of storage.

In [52]:

storage.tag['sampleset'] = equilibrated_sset.copy_without_parents()

Let's have a final look at what we cover with all our initial samples.

In [53]:

for s in total_sample_set:
    print(s)

Sample(RepID: 0, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11eb62160>, 70 steps)
Sample(RepID: 1, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ebec400>, 70 steps)
Sample(RepID: 2, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ebe8cc0>, 70 steps)
Sample(RepID: 3, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11eb6b390>, 70 steps)
Sample(RepID: 4, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec05320>, 138 steps)
Sample(RepID: 5, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec08438>, 138 steps)
Sample(RepID: 6, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec0d2e8>, 138 steps)
Sample(RepID: 7, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec11320>, 138 steps)
Sample(RepID: 8, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec1de48>, 61 steps)
Sample(RepID: 9, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec22e48>, 61 steps)
Sample(RepID: 10, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec25a90>, 61 steps)
Sample(RepID: 11, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec36630>, 21 steps)
Sample(RepID: 12, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec39438>, 21 steps)
Sample(RepID: 13, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec3e358>, 21 steps)
Sample(RepID: 14, Ens: <openpathsampling.ensemble.UnionEnsemble object at 0x11ec5aa58>, 70 steps)
Sample(RepID: 15, Ens: <openpathsampling.ensemble.MinusInterfaceEnsemble object at 0x11ebf8940>, 57 steps)
Sample(RepID: 16, Ens: <openpathsampling.ensemble.MinusInterfaceEnsemble object at 0x11ec147f0>, 66 steps)
Sample(RepID: 17, Ens: <openpathsampling.ensemble.MinusInterfaceEnsemble object at 0x11ec28b70>, 24 steps)
Sample(RepID: 18, Ens: <openpathsampling.ensemble.MinusInterfaceEnsemble object at 0x11ec424a8>, 83 steps)

In [54]:

for s in equilibrated_sset:
    print(s)

Sample(RepID: 14, Ens: <openpathsampling.ensemble.UnionEnsemble object at 0x11ec5aa58>, 70 steps)
Sample(RepID: 15, Ens: <openpathsampling.ensemble.MinusInterfaceEnsemble object at 0x11ebf8940>, 57 steps)
Sample(RepID: 16, Ens: <openpathsampling.ensemble.MinusInterfaceEnsemble object at 0x11ec147f0>, 66 steps)
Sample(RepID: 17, Ens: <openpathsampling.ensemble.MinusInterfaceEnsemble object at 0x11ec28b70>, 24 steps)
Sample(RepID: 18, Ens: <openpathsampling.ensemble.MinusInterfaceEnsemble object at 0x11ec424a8>, 83 steps)
Sample(RepID: 10, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec25a90>, 113 steps)
Sample(RepID: 5, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec08438>, 41 steps)
Sample(RepID: 4, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec05320>, 167 steps)
Sample(RepID: 7, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec11320>, 97 steps)
Sample(RepID: 13, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec3e358>, 47 steps)
Sample(RepID: 3, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11eb6b390>, 99 steps)
Sample(RepID: 1, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ebec400>, 29 steps)
Sample(RepID: 11, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec36630>, 24 steps)
Sample(RepID: 9, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec22e48>, 56 steps)
Sample(RepID: 2, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ebe8cc0>, 109 steps)
Sample(RepID: 8, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec1de48>, 12 steps)
Sample(RepID: 12, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec39438>, 109 steps)
Sample(RepID: 6, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11ec0d2e8>, 373 steps)
Sample(RepID: 0, Ens: <openpathsampling.ensemble.TISEnsemble object at 0x11eb62160>, 34 steps)

In [55]:

plt.figure(21, (12,5))
plt.subplot(121)
plt.title('High temperature')
plot.main()
for s in total_sample_set:
    plot.add_trajectory(s.trajectory, line=True)
    
plt.subplot(122)
plt.title('Equilibrated')
plot.main()
plot.zoom = 180/3.1415926
for s in equilibrated_sset:
    plot.add_trajectory(s.trajectory, line=True)

A final check that we still have a valid sampleset

In [56]:

equilibrated_sset.sanity_check()

In [57]:

print('snapshots:', len(storage.snapshots))
print('trajectories:', len(storage.trajectories))
print('samples:', len(storage.samples))
print('filesize:', storage.file_size_str)

snapshots: 43540
trajectories: 509
samples: 538
filesize: 965.66MB

In [58]:

storage.close()

In [59]:

storage =paths.AnalysisStorage("ala_mstis_bootstrap.nc")

In [60]:

print("PathMovers:", len(storage.pathmovers))
print("Engines:", len(storage.engines))
print("Samples:", len(storage.samples))
print("Ensembles:", len(storage.ensembles))
print("SampleSets:", len(storage.samplesets))
print("Snapshots:", len(storage.snapshots))
print("Trajectories:", len(storage.trajectories))
print("Networks:", len(storage.networks))

PathMovers: 45
Engines: 3
Samples: 538
Ensembles: 315
SampleSets: 502
Snapshots: 43540
Trajectories: 509
Networks: 1

In [61]:

scheme = storage.schemes[0]

In [62]:

scheme.move_summary(storage.steps)

shooting ran 100.000% (expected 100.00%) of the cycles with acceptance 335/500 (67.00%)

In [63]:

scheme.move_summary(storage.steps, 'shooting')

OneWayShootingMover Out A 0 ran 7.600% (expected 7.14%) of the cycles with acceptance 26/38 (68.42%)
OneWayShootingMover Out A 1 ran 7.600% (expected 7.14%) of the cycles with acceptance 27/38 (71.05%)
OneWayShootingMover Out A 2 ran 6.600% (expected 7.14%) of the cycles with acceptance 19/33 (57.58%)
OneWayShootingMover Out A 3 ran 5.400% (expected 7.14%) of the cycles with acceptance 19/27 (70.37%)
OneWayShootingMover Out B 0 ran 5.000% (expected 7.14%) of the cycles with acceptance 17/25 (68.00%)
OneWayShootingMover Out B 1 ran 7.000% (expected 7.14%) of the cycles with acceptance 14/35 (40.00%)
OneWayShootingMover Out B 2 ran 8.600% (expected 7.14%) of the cycles with acceptance 35/43 (81.40%)
OneWayShootingMover Out B 3 ran 7.200% (expected 7.14%) of the cycles with acceptance 18/36 (50.00%)
OneWayShootingMover Out C 0 ran 7.200% (expected 7.14%) of the cycles with acceptance 27/36 (75.00%)
OneWayShootingMover Out C 1 ran 8.000% (expected 7.14%) of the cycles with acceptance 30/40 (75.00%)
OneWayShootingMover Out C 2 ran 5.200% (expected 7.14%) of the cycles with acceptance 16/26 (61.54%)
OneWayShootingMover Out D 0 ran 8.400% (expected 7.14%) of the cycles with acceptance 34/42 (80.95%)
OneWayShootingMover Out D 1 ran 8.600% (expected 7.14%) of the cycles with acceptance 29/43 (67.44%)
OneWayShootingMover Out D 2 ran 7.600% (expected 7.14%) of the cycles with acceptance 24/38 (63.16%)

/Users/dwhs/Dropbox/pysrc/openpathsampling/openpathsampling/high_level/move_scheme.py:889: UserWarning: Move acceptance already calculated. The steps parameter will be ignored.
  warnings.warn("Move acceptance already calculated. "

In [ ]:

In [64]:

tree = ops_vis.PathTree(
        storage.steps[0:500],
        ops_vis.ReplicaEvolution(
            replica=0
        )
    )

f = open("myfile.svg","w") 
f.write(tree.svg())
f.close()
SVG(tree.svg())

Out[64]:

In [65]:

path_lengths = [len(step.active[7].trajectory) for step in storage.steps]
plt.hist(path_lengths, bins=40, alpha=0.5);
print("Maximum:", max(path_lengths), "("+str(max(path_lengths)*engine_low.snapshot_timestep)+")")
print("Average:", "{0:.2f}".format(np.mean(path_lengths)), "("+(np.mean(path_lengths)*engine_low.snapshot_timestep).format("%.3f")+")")

Maximum: 262 (5.24 ps)
Average: 118.10 (2.362 ps)

In [66]:

traj = storage.steps[-1].active[0].trajectory

In [67]:

len(traj)

Out[67]:

In [68]:

print((phi(traj))*(180/np.pi))
print((psi(traj))*(180/np.pi))

[-147.13214  -139.79378  -127.82453  -125.45244  -117.341965 -126.19804
 -123.69993  -116.412895 -119.310715 -113.97085  -119.34421  -122.82683
 -126.818436 -128.6968   -134.68309  -140.4169   -137.56593  -147.14778
 -134.03882  -149.24098  -137.05484  -136.05605  -147.38     -146.77097
 -156.80222  -148.07848  -153.852    -139.0577   -142.39156  -132.07857
 -125.6701   -134.29102  -139.17616  -152.39551 ]
[ 156.40907  157.90178  160.98724  172.46529  177.25502 -178.74725
 -174.92116 -166.73091 -166.93466 -175.89244 -172.20737 -173.93083
 -163.24199 -170.36339 -175.68338 -167.13893 -172.74031 -177.50563
 -172.97275 -176.91379 -165.01842 -169.9785  -177.47803 -170.0795
  178.7047   178.79144  174.10066  176.67365  179.78653  177.75027
  172.1333   170.91331  165.08852  154.3578 ]

In [ ]:

(AD_mstis_1_setup.ipynb; AD_mstis_1_setup.py)

AD_mstis_2_run

Run from bootstrap paths¶

Now we will use the initial trajectories we obtained from bootstrapping to run an MSTIS simulation. This will show both how objects can be regenerated from storage and how regenerated equivalent objects can be used in place of objects that weren't stored.

Tasks covered in this notebook:

Loading OPS objects from storage
Ways of assigning initial trajectories to initial samples
Setting up a path sampling simulation with various move schemes
Visualizing trajectories while the path sampling is running

In [1]:

%matplotlib inline

In [2]:

import openpathsampling as paths
import numpy as np
import math

# the openpathsampling OpenMM engine
import openpathsampling.engines.openmm as eng

Loading things from storage¶

First we'll reload some of the stuff we stored before. Of course, this starts with opening the file.

In [3]:

old_store = paths.AnalysisStorage("ala_mstis_bootstrap.nc")

A lot of information can be recovered from the old storage, and so we don't have the recreate it. However, we did not save our network, so we'll have to create a new one. Since the network creates the ensembles, that means we will have to translate the trajectories from the old ensembles to new ensembles.

In [4]:

print("PathMovers:", len(old_store.pathmovers))
print("Engines:", len(old_store.engines))
print("Samples:", len(old_store.samples))
print("Trajectories:", len(old_store.trajectories))
print("Ensembles:", len(old_store.ensembles))
print("SampleSets:", len(old_store.samplesets))
print("Snapshots:", len(old_store.snapshots))
print("Networks:", len(old_store.networks))

PathMovers: 45
Engines: 3
Samples: 538
Trajectories: 509
Ensembles: 315
SampleSets: 502
Snapshots: 61910
Networks: 1

Loading from storage is very easy. Each store is a list. We take the 0th snapshot as a template (it doesn't actually matter which one) for the next storage we'll create. There's only one engine stored, so we take the only one.

In [5]:

# template = old_store.snapshots[0]
engine = old_store.engines['default']
mstis = old_store.networks[0]
sset = old_store.tag['sampleset']

initialize engine¶

if we do not select a platform the fastest possible will be chosen but we explicitly request to use the one in the config file

In [6]:

#platform = 'CUDA'
#engine.initialize(platform)

print('Engine uses platform `%s`' % engine.platform)

Engine uses platform `None`

In [7]:

sset.sanity_check()

Running RETIS¶

Now we run the full calculation. Up to here, we haven't been storing any of our results. This time, we'll start a storage object, and we'll save the network we've created. Then we'll run a new PathSampling calculation object.

In [8]:

# logging creates ops_output.log file with details of what the calculation is doing
#import logging.config
#logging.config.fileConfig("logging.conf", disable_existing_loggers=False)

In [9]:

storage = paths.storage.Storage("ala_mstis_production.nc", "w")

In [10]:

storage.snapshots.save(old_store.snapshots[0]);

Before we can sample we still need to set the actual MoveScheme which determines the set of moves to apply to our set of samples and effectively doing the steps in replica (sampleset) space. We pick the default scheme for mstis and feed it with the engine to be used.

In [11]:

scheme = paths.DefaultScheme(mstis, engine)

and finally generate the PathSampler object to conduct the simulation.

In [12]:

mstis_calc = paths.PathSampling(
    storage=storage,
    sample_set=sset,
    move_scheme=scheme
)
mstis_calc.save_frequency = 10
#mstis_calc.save_frequency = 1

Now everything is ready: let's run the simulation! The first step takes a little since all necessary information, i.e. the engines, topologies, initial snapshots, ..., need to be stored. Then the monte carlo steps will be performed.

In [13]:

mstis_calc.run(10000)

Working on Monte Carlo cycle number 10000
Running for 5 hours 4 minutes 56 seconds -  1.83 seconds per step
Estimated time remaining: 1 second
DONE! Completed 10000 Monte Carlo cycles.

In [20]:

print(len(storage.steps))

In [21]:

# commented out during development, so we can "run all" and then do more
storage.close()

In [ ]:

(AD_mstis_2_run.ipynb; AD_mstis_2_run.py)

AD_mstis_4_analysis

Analyzing the MSTIS simulation¶

Included in this notebook:

Opening files for analysis
Rates, fluxes, total crossing probabilities, and condition transition probabilities
Per-ensemble properties such as path length distributions and interface crossing probabilities
Move scheme analysis
Replica exchange analysis
Replica move history tree visualization
Replaying the simulation
MORE TO COME! Like free energy projections, path density plots, and more

In [1]:

from __future__ import print_function
%matplotlib inline
import matplotlib.pyplot as plt
import openpathsampling as paths
import numpy as np

The optimum way to use storage depends on whether you're doing production or analysis. For analysis, you should open the file as an AnalysisStorage object. This makes the analysis much faster.

In [2]:

%%time
storage = paths.AnalysisStorage("ala_mstis_production.nc")

CPU times: user 48.1 s, sys: 2.36 s, total: 50.5 s
Wall time: 50.6 s

In [3]:

print("PathMovers:", len(storage.pathmovers))
print("Engines:", len(storage.engines))
print("Samples:", len(storage.samples))
print("Ensembles:", len(storage.ensembles))
print("SampleSets:", len(storage.samplesets))
print("Snapshots:", len(storage.snapshots))
print("Trajectories:", len(storage.trajectories))
print("Networks:", len(storage.networks))

PathMovers: 125
Engines: 3
Samples: 13122
Ensembles: 315
SampleSets: 10001
Snapshots: 436772
Trajectories: 8000
Networks: 1

In [4]:

%%time
mstis = storage.networks[0]

CPU times: user 16 µs, sys: 1 µs, total: 17 µs
Wall time: 19.3 µs

In [5]:

%%time
for cv in storage.cvs:
    print(cv.name, cv._store_dict)

opA None
phi <openpathsampling.netcdfplus.chaindict.StoredDict object at 0x7fa7a591fda0>
psi <openpathsampling.netcdfplus.chaindict.StoredDict object at 0x7fa7a5dc1f60>
opB None
opC None
opD None
max opD <openpathsampling.netcdfplus.chaindict.StoredDict object at 0x7fa7ade02828>
max opB <openpathsampling.netcdfplus.chaindict.StoredDict object at 0x7fa7a6b057b8>
max opC <openpathsampling.netcdfplus.chaindict.StoredDict object at 0x7fa7a56bbf60>
max opA <openpathsampling.netcdfplus.chaindict.StoredDict object at 0x7fa7a5ea0438>
opE None
opF None
max opE <openpathsampling.netcdfplus.chaindict.StoredDict object at 0x7fa7a5597048>
max opF <openpathsampling.netcdfplus.chaindict.StoredDict object at 0x7fa7a6b059b0>
CPU times: user 1.5 ms, sys: 26 µs, total: 1.53 ms
Wall time: 1.22 ms

Reaction rates¶

TIS methods are especially good at determining reaction rates, and OPS makes it extremely easy to obtain the rate from a TIS network.

Note that, although you can get the rate directly, it is very important to look at other results of the sampling (illustrated in this notebook and in notebooks referred to herein) in order to check the validity of the rates you obtain.

By default, the built-in analysis calculates histograms the maximum value of some order parameter and the pathlength of every sampled ensemble. You can add other things to this list as well, but you must always specify histogram parameters for these two. The pathlength is in units of frames.

In [6]:

mstis.hist_args['max_lambda'] = { 'bin_width' : 2, 'bin_range' : (0.0, 90) }
mstis.hist_args['pathlength'] = { 'bin_width' : 5, 'bin_range' : (0, 100) }

In [7]:

%%time
mstis.rate_matrix(storage.steps, force=True)

/faststore/homedirs/dswenso1/openpathsampling/openpathsampling/high_level/network.py:616: FutureWarning: set_value is deprecated and will be removed in a future release. Please use .at[] or .iat[] accessors instead
  rate)

CPU times: user 21.6 s, sys: 378 ms, total: 22 s
Wall time: 21.8 s

Out[7]:

	A	B	C	D
A	NaN	0.0628570742364 /ps	0.00136010659111 /ps	0.00485608427343 /ps
B	0.148157338922 /ps	NaN	0.0 /ps	0.000799122647909 /ps
C	0.0730391994665 /ps	0.0159285487056 /ps	NaN	0.197946235924 /ps
D	0.00648869347808 /ps	0.0110497834543 /ps	0.048081490166 /ps	NaN

The self-rates (the rate of returning the to initial state) are undefined, and return not-a-number.

The rate is calcuated according to the formula:

$$k_{AB} = \phi_{A,0} P(B|\lambda_m) \prod_{i=0}^{m-1} P(\lambda_{i+1} | \lambda_i)$$

where $\phi_{A,0}$ is the flux from state A through its innermost interface, $P(B|\lambda_m)$ is the conditional transition probability (the probability that a path which crosses the interface at $\lambda_m$ ends in state B), and $\prod_{i=0}^{m-1} P(\lambda_{i+1} | \lambda_i)$ is the total crossing probability. We can look at each of these terms individually.

Total crossing probability¶

In [8]:

stateA = storage.volumes["A"]
stateB = storage.volumes["B"]
stateC = storage.volumes["C"]

In [9]:

tcp_AB = mstis.transitions[(stateA, stateB)].tcp
tcp_AC = mstis.transitions[(stateA, stateC)].tcp
tcp_BC = mstis.transitions[(stateB, stateC)].tcp
tcp_BA = mstis.transitions[(stateB, stateA)].tcp
tcp_CA = mstis.transitions[(stateC, stateA)].tcp
tcp_CB = mstis.transitions[(stateC, stateB)].tcp

plt.plot(tcp_AB.x, tcp_AB)
plt.plot(tcp_CA.x, tcp_CA)
plt.plot(tcp_BC.x, tcp_BC)
plt.plot(tcp_AC.x, tcp_AC) # same as tcp_AB in MSTIS

Out[9]:

[<matplotlib.lines.Line2D at 0x7fa775ed7400>]

We normally look at these on a log scale:

In [10]:

plt.plot(tcp_AB.x, np.log(tcp_AB))
plt.plot(tcp_CA.x, np.log(tcp_CA))
plt.plot(tcp_BC.x, np.log(tcp_BC))

Out[10]:

[<matplotlib.lines.Line2D at 0x7fa775e71668>]

Flux¶

Here we also calculate the flux contribution to each transition. The flux is calculated based on

In [11]:

import pandas as pd
flux_matrix = pd.DataFrame(columns=mstis.states, index=mstis.states)
for state_pair in mstis.transitions:
    transition = mstis.transitions[state_pair]
    flux_matrix.set_value(state_pair[0], state_pair[1], transition._flux)

flux_matrix

/home/dswenso1/miniconda2/envs/ops_py36/lib/python3.6/site-packages/ipykernel_launcher.py:5: FutureWarning: set_value is deprecated and will be removed in a future release. Please use .at[] or .iat[] accessors instead
  """

Out[11]:

	{x\|opA(x) in [0, 10]}	{x\|opB(x) in [0, 10]}	{x\|opC(x) in [0, 10]}	{x\|opD(x) in [0, 10]}
{x\|opA(x) in [0, 10]}	NaN	1.07205623902 /ps	1.07205623902 /ps	1.07205623902 /ps
{x\|opB(x) in [0, 10]}	2.23880597015 /ps	NaN	2.23880597015 /ps	2.23880597015 /ps
{x\|opC(x) in [0, 10]}	1.54169586545 /ps	1.54169586545 /ps	NaN	1.54169586545 /ps
{x\|opD(x) in [0, 10]}	2.74068868587 /ps	2.74068868587 /ps	2.74068868587 /ps	NaN

Conditional transition probability¶

In [12]:

outer_ctp_matrix = pd.DataFrame(columns=mstis.states, index=mstis.states)
for state_pair in mstis.transitions:
    transition = mstis.transitions[state_pair]
    outer_ctp_matrix.set_value(state_pair[0], state_pair[1], transition.ctp[transition.ensembles[-1]])    

outer_ctp_matrix

/home/dswenso1/miniconda2/envs/ops_py36/lib/python3.6/site-packages/ipykernel_launcher.py:4: FutureWarning: set_value is deprecated and will be removed in a future release. Please use .at[] or .iat[] accessors instead
  after removing the cwd from sys.path.

Out[12]:

	{x\|opA(x) in [0, 10]}	{x\|opB(x) in [0, 10]}	{x\|opC(x) in [0, 10]}	{x\|opD(x) in [0, 10]}
{x\|opA(x) in [0, 10]}	NaN	0.623838	0.0134987	0.0481952
{x\|opB(x) in [0, 10]}	0.834217	NaN	0	0.00449955
{x\|opC(x) in [0, 10]}	0.182482	0.039796	NaN	0.494551
{x\|opD(x) in [0, 10]}	0.0477952	0.0813919	0.354165	NaN

Path ensemble properties¶

In [13]:

hists_A = mstis.transitions[(stateA, stateB)].histograms
hists_B = mstis.transitions[(stateB, stateC)].histograms
hists_C = mstis.transitions[(stateC, stateB)].histograms

Interface crossing probabilities¶

We obtain the total crossing probability, shown above, by combining the individual crossing probabilities of

In [14]:

for hist in [hists_A, hists_B, hists_C]:
    for ens in hist['max_lambda']:
        normalized = hist['max_lambda'][ens].normalized()
        plt.plot(normalized.x, normalized)

In [15]:

# add visualization of the sum

In [16]:

for hist in [hists_A, hists_B, hists_C]:
    for ens in hist['max_lambda']:
        reverse_cumulative = hist['max_lambda'][ens].reverse_cumulative()
        plt.plot(reverse_cumulative.x, reverse_cumulative)

In [17]:

for hist in [hists_A, hists_B, hists_C]:
    for ens in hist['max_lambda']:
        reverse_cumulative = hist['max_lambda'][ens].reverse_cumulative()
        plt.plot(reverse_cumulative.x, np.log(reverse_cumulative))

Path length histograms¶

In [18]:

for hist in [hists_A, hists_B, hists_C]:
    for ens in hist['pathlength']:
        normalized = hist['pathlength'][ens].normalized()
        plt.plot(normalized.x, normalized)

In [19]:

for ens in hists_A['pathlength']:
    normalized = hists_A['pathlength'][ens].normalized()
    plt.plot(normalized.x, normalized)

Sampling properties¶

The properties we illustrated above were properties of the path ensembles. If your path ensembles are sufficiently well-sampled, these will never depend on how you sample them.

But to figure out whether you've done a good job of sampling, you often want to look at properties related to the sampling process. OPS also makes these very easy.

In [ ]:

Move scheme analysis¶

In [20]:

scheme = storage.schemes[0]

In [21]:

scheme.move_summary(storage.steps)

repex ran 22.760% (expected 23.10%) of the cycles with acceptance 956/2276 (42.00%)
shooting ran 45.950% (expected 46.20%) of the cycles with acceptance 3222/4595 (70.12%)
pathreversal ran 25.080% (expected 24.75%) of the cycles with acceptance 1510/2508 (60.21%)
minus ran 2.850% (expected 2.64%) of the cycles with acceptance 243/285 (85.26%)
ms_outer_shooting ran 3.360% (expected 3.30%) of the cycles with acceptance 210/336 (62.50%)

In [22]:

scheme.move_summary(storage.steps, 'shooting')

OneWayShootingMover Out A 0 ran 3.500% (expected 3.30%) of the cycles with acceptance 278/350 (79.43%)
OneWayShootingMover Out A 1 ran 3.580% (expected 3.30%) of the cycles with acceptance 261/358 (72.91%)
OneWayShootingMover Out A 2 ran 3.150% (expected 3.30%) of the cycles with acceptance 196/315 (62.22%)
OneWayShootingMover Out A 3 ran 3.410% (expected 3.30%) of the cycles with acceptance 197/341 (57.77%)
OneWayShootingMover Out B 0 ran 3.030% (expected 3.30%) of the cycles with acceptance 261/303 (86.14%)
OneWayShootingMover Out B 1 ran 3.470% (expected 3.30%) of the cycles with acceptance 242/347 (69.74%)
OneWayShootingMover Out B 2 ran 3.100% (expected 3.30%) of the cycles with acceptance 210/310 (67.74%)
OneWayShootingMover Out B 3 ran 3.340% (expected 3.30%) of the cycles with acceptance 200/334 (59.88%)
OneWayShootingMover Out C 0 ran 3.130% (expected 3.30%) of the cycles with acceptance 240/313 (76.68%)
OneWayShootingMover Out C 1 ran 3.080% (expected 3.30%) of the cycles with acceptance 207/308 (67.21%)
OneWayShootingMover Out C 2 ran 3.410% (expected 3.30%) of the cycles with acceptance 230/341 (67.45%)
OneWayShootingMover Out D 0 ran 3.240% (expected 3.30%) of the cycles with acceptance 277/324 (85.49%)
OneWayShootingMover Out D 1 ran 3.280% (expected 3.30%) of the cycles with acceptance 222/328 (67.68%)
OneWayShootingMover Out D 2 ran 3.230% (expected 3.30%) of the cycles with acceptance 201/323 (62.23%)

In [23]:

scheme.move_summary(storage.steps, 'minus')

Minus ran 0.690% (expected 0.66%) of the cycles with acceptance 61/69 (88.41%)
Minus ran 0.630% (expected 0.66%) of the cycles with acceptance 60/63 (95.24%)
Minus ran 0.720% (expected 0.66%) of the cycles with acceptance 44/72 (61.11%)
Minus ran 0.810% (expected 0.66%) of the cycles with acceptance 78/81 (96.30%)

In [24]:

scheme.move_summary(storage.steps, 'repex')

ReplicaExchange ran 1.470% (expected 1.65%) of the cycles with acceptance 56/147 (38.10%)
ReplicaExchange ran 1.610% (expected 1.65%) of the cycles with acceptance 79/161 (49.07%)
ReplicaExchange ran 1.650% (expected 1.65%) of the cycles with acceptance 102/165 (61.82%)
ReplicaExchange ran 1.520% (expected 1.65%) of the cycles with acceptance 31/152 (20.39%)
ReplicaExchange ran 1.590% (expected 1.65%) of the cycles with acceptance 117/159 (73.58%)
ReplicaExchange ran 1.690% (expected 1.65%) of the cycles with acceptance 118/169 (69.82%)
ReplicaExchange ran 1.470% (expected 1.65%) of the cycles with acceptance 60/147 (40.82%)
ReplicaExchange ran 1.930% (expected 1.65%) of the cycles with acceptance 148/193 (76.68%)
ReplicaExchange ran 1.450% (expected 1.65%) of the cycles with acceptance 17/145 (11.72%)
ReplicaExchange ran 1.640% (expected 1.65%) of the cycles with acceptance 78/164 (47.56%)
ReplicaExchange ran 1.660% (expected 1.65%) of the cycles with acceptance 70/166 (42.17%)
ReplicaExchange ran 1.730% (expected 1.65%) of the cycles with acceptance 65/173 (37.57%)
ReplicaExchange ran 1.760% (expected 1.65%) of the cycles with acceptance 13/176 (7.39%)
ReplicaExchange ran 1.590% (expected 1.65%) of the cycles with acceptance 2/159 (1.26%)

In [25]:

scheme.move_summary(storage.steps, 'pathreversal')

PathReversal ran 1.820% (expected 1.65%) of the cycles with acceptance 173/182 (95.05%)
PathReversal ran 1.620% (expected 1.65%) of the cycles with acceptance 116/162 (71.60%)
PathReversal ran 1.820% (expected 1.65%) of the cycles with acceptance 92/182 (50.55%)
PathReversal ran 1.650% (expected 1.65%) of the cycles with acceptance 56/165 (33.94%)
PathReversal ran 1.540% (expected 1.65%) of the cycles with acceptance 141/154 (91.56%)
PathReversal ran 1.930% (expected 1.65%) of the cycles with acceptance 90/193 (46.63%)
PathReversal ran 1.640% (expected 1.65%) of the cycles with acceptance 49/164 (29.88%)
PathReversal ran 1.580% (expected 1.65%) of the cycles with acceptance 29/158 (18.35%)
PathReversal ran 1.700% (expected 1.65%) of the cycles with acceptance 138/170 (81.18%)
PathReversal ran 1.690% (expected 1.65%) of the cycles with acceptance 84/169 (49.70%)
PathReversal ran 1.510% (expected 1.65%) of the cycles with acceptance 50/151 (33.11%)
PathReversal ran 1.910% (expected 1.65%) of the cycles with acceptance 184/191 (96.34%)
PathReversal ran 1.410% (expected 1.65%) of the cycles with acceptance 101/141 (71.63%)
PathReversal ran 1.760% (expected 1.65%) of the cycles with acceptance 87/176 (49.43%)
PathReversal ran 1.500% (expected 1.65%) of the cycles with acceptance 120/150 (80.00%)

Replica exchange sampling¶

See the notebook repex_networks.ipynb for more details on tools to study the convergence of replica exchange. However, a few simple examples are shown here. All of these are analyzed with a separate object, ReplicaNetwork.

In [26]:

repx_net = paths.ReplicaNetwork(scheme, storage.steps)

Replica exchange mixing matrix¶

In [27]:

repx_net.mixing_matrix()

Out[27]:

	15	14	17	4	0	11	5	1	12	6	2	13	7	3	18	10	9	8	16
15	0.000000	0.000000	0.000000	0.023428	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000
14	0.000000	0.000000	0.000000	0.000000	0.023819	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000
17	0.000000	0.000000	0.000000	0.000000	0.000000	0.030457	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000
4	0.023428	0.000000	0.000000	0.000000	0.000000	0.000000	0.012105	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000
0	0.000000	0.023819	0.000000	0.000000	0.000000	0.000000	0.000000	0.021866	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000
11	0.000000	0.000000	0.030457	0.000000	0.000000	0.000000	0.000000	0.000000	0.006638	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000
5	0.000000	0.000000	0.000000	0.012105	0.000000	0.000000	0.000000	0.000000	0.000000	0.045685	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000
1	0.000000	0.000000	0.000000	0.000000	0.021866	0.000000	0.000000	0.000000	0.000000	0.000000	0.030847	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000
12	0.000000	0.000000	0.000000	0.000000	0.000000	0.006638	0.000000	0.000000	0.000000	0.000000	0.000000	0.030457	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000
6	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.045685	0.000000	0.000000	0.000000	0.000000	0.000000	0.046076	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000
2	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.030847	0.000000	0.000000	0.000000	0.000000	0.000000	0.039828	0.000000	0.000000	0.000000	0.000000	0.000000
13	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.030457	0.000000	0.000000	0.000000	0.000000	0.000000	0.000781	0.000000	0.000000	0.000000	0.000000
7	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.046076	0.000000	0.000000	0.000000	0.000000	0.025381	0.000000	0.000000	0.000000	0.000000
3	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.039828	0.000000	0.000000	0.000000	0.027333	0.000000	0.000000	0.000000	0.000000
18	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000781	0.025381	0.027333	0.000000	0.005076	0.000000	0.000000	0.000000
10	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.005076	0.000000	0.057790	0.000000	0.000000
9	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.057790	0.000000	0.023428	0.000000
8	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.023428	0.000000	0.017181
16	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.000000	0.017181	0.000000

Replica exchange graph¶

The mixing matrix tells a story of how well various interfaces are connected to other interfaces. The replica exchange graph is essentially a visualization of the mixing matrix (actually, of the transition matrix -- the mixing matrix is a symmetrized version of the transition matrix).

Note: We're still developing better layout tools to visualize these.

In [28]:

repxG = paths.ReplicaNetworkGraph(repx_net)
repxG.draw('spring')

/home/dswenso1/miniconda2/envs/ops_py36/lib/python3.6/site-packages/networkx/drawing/nx_pylab.py:611: MatplotlibDeprecationWarning: isinstance(..., numbers.Number)
  if cb.is_numlike(alpha):

Replica exchange flow¶

Replica flow is defined as TODO

Flow is designed for calculations where the replica exchange graph is linear, which ours clearly is not. However, we can define the flow over a subset of the interfaces.

Replica move history tree¶

In [29]:

import openpathsampling.visualize as vis
#reload(vis)
from IPython.display import SVG

In [30]:

tree = vis.PathTree(
    [step for step in storage.steps if not isinstance(step.change, paths.EmptyMoveChange)],
    vis.ReplicaEvolution(replica=3, accepted=False)
)
tree.options.css['width'] = 'inherit'

SVG(tree.svg())

Out[30]:

In [32]:

decorrelated = tree.generator.decorrelated
print ("We have " + str(len(decorrelated)) + " decorrelated trajectories.")

We have 40 decorrelated trajectories.

In [ ]:

Visualizing trajectories¶

Histogramming data (TODO)¶

In [ ]:

(AD_mstis_4_analysis.ipynb; AD_mstis_4_analysis.py)

Multiple state TIS on Alanine Dipeptide

Alanine Multistate¶

Example Simulation¶

Import and general setup¶

Set simulation options and create a simulator object¶

1. the force field¶

2. the system object¶

3. the integrator¶

4. the platform¶

5. OpenMM properties¶

6. OPS options¶

Create the storage¶

State Definitions¶

Order Parameters¶

Volumes¶

Visualize in Phi/Psi space¶

Set up the MSTIS network¶

Set up the MoveScheme¶

Initial trajectories¶

Find initial samples¶

Equilibration¶

Run from bootstrap paths¶

Loading things from storage¶

initialize engine¶

Running RETIS¶

Analyzing the MSTIS simulation¶

Reaction rates¶

Total crossing probability¶

Flux¶

Conditional transition probability¶

Path ensemble properties¶

Interface crossing probabilities¶

Path length histograms¶

Sampling properties¶

Move scheme analysis¶

Replica exchange sampling¶

Replica exchange mixing matrix¶

Replica exchange graph¶

Replica exchange flow¶

Replica move history tree¶

Visualizing trajectories¶

Histogramming data (TODO)¶

Set up the `MoveScheme`¶