Output during simulations
Or: How do I know that my simulation is running?
OpenPathSampling is designed to be a library, meaning that we intend many parts of the code to be reused within other packages that our users might develop. Indeed, every OPS simulation is technically an application that uses OPS as a library. This means that, for the most part, we try to keep default output minimal. The primary output from an OPS simulation is the netCDF file, where all data about the simulation is stored. Other output along the way is secondary.
While your simulation is running, there are four ways you can see evidence that it is running. This document will discuss each of them, including some caveats about how you should or should not use them and how to customize their behavior.
Progress output (default stdout)
The most obvious evidence that your simulation is running is our default output to stdout, which, in the case of a PathSampling simulation, includes the Monte Carlo step number, the elapsed time, and an estimate of the time remaining. An example output:
Working on Monte Carlo cycle number 26
Running for 4 minutes 15 seconds - 10.21 seconds per step
Estimated time remaining: 1 day 4.28 hours
If you don’t want to print this to stdout, you can set PathSampling.output_stream to any writable file object. For example, to silence it, you can use sampler.output_stream = open(os.devnull, 'w') (after creating sampler as an instance of PathSampling). The file must be opened for writing; the default read mode will fail as soon as the simulation tries to write to it. Note that this is also true for any subclass of PathSimulator, not just PathSampling.
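The redirection described above can be sketched as follows. To keep the example runnable without a full simulation setup, it uses StandInSimulator, a hypothetical stand-in class that is not part of OPS and only mimics the output_stream attribute; with a real simulation you would set the attribute on your own sampler object in the same way.

```python
import io
import os
import sys


class StandInSimulator(object):
    """Hypothetical stand-in for a PathSimulator subclass (NOT part of OPS).

    It only mimics the one feature discussed here: progress text is
    written to whatever file object is stored in ``output_stream``.
    """
    def __init__(self):
        self.output_stream = sys.stdout  # default: print progress to stdout

    def report(self, step):
        self.output_stream.write(
            "Working on Monte Carlo cycle number %d\n" % step
        )
        self.output_stream.flush()


sampler = StandInSimulator()

# Silence the progress output. The file must be opened for writing.
devnull = open(os.devnull, "w")
sampler.output_stream = devnull
sampler.report(1)  # nothing appears on the terminal
devnull.close()

# Any writable file-like object works, for example an in-memory buffer.
buffer = io.StringIO()
sampler.output_stream = buffer
sampler.report(2)
captured = buffer.getvalue()
```

The same pattern applies to a regular file opened with open("progress.txt", "w") if you would rather keep the progress output than discard it.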
Read the netCDF file
One of the reasons we use netCDF as the file format for OPS is that it can be read while it is open for writing. Since the analysis of OPS data uses this file, this is the recommended approach. You can begin to develop your specialized analysis tools while your simulation is still running! The netCDF file is the primary output of an OPS simulation.
Note that there are a few caveats here: while you can simultaneously have an open file handler for writing and one for reading, you cannot open a file for reading at the exact moment that it is being written to. This isn’t a problem for large systems (where you may do one quick block of writing per hour or longer), but can be a problem for small systems.
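One way to work around this caveat for small systems is to retry the open a few times if it fails. The following is a minimal sketch, not part of OPS: open_with_retry is a hypothetical helper, and the exception types it catches (OSError and RuntimeError) are an assumption about what a failed open raises, so adjust them to whatever you actually observe.

```python
import time


def open_with_retry(opener, attempts=5, delay=1.0,
                    retry_on=(OSError, RuntimeError)):
    """Call ``opener()`` until it succeeds or ``attempts`` is exhausted.

    ``opener`` is any zero-argument callable that opens the file and
    returns the opened object. ``retry_on`` lists the exception types
    treated as "file busy, try again" (an assumption; adjust as needed).
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return opener()
        except retry_on as err:
            last_error = err
            if attempt < attempts - 1:
                time.sleep(delay)  # wait for the writer to finish its block
    raise last_error
```

In practice, opener would be a small function or lambda that opens your simulation's netCDF file read-only using whatever call you normally use for analysis.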
See various documentation and examples of analysis to see what you can do with the resulting netCDF file.
Logging what the code is doing
Another way to see what is happening during the simulation is to enable logging. Logging will give details of what step we’re on, what path mover has been selected, how far the trajectory has progressed (if generating a trajectory), and whether the move is accepted.
Note
Do not use logging output for analysis. Logging output is not considered to be part of the API, and the format and details provided are subject to change at any time.
The logging facilities use the standard Python logging library. This can be configured either in code or using a configuration file. One example configuration file is our default info-level configuration. To use it, add the following lines in your code when you would like to enable logging (after putting the logging.conf file in your working directory):
import logging.config
logging.config.fileConfig("logging.conf", disable_existing_loggers=False)
By default, that logs info-level information to a file called ops_output.log. You can change the file (including outputting to stdout), or the format of the log entries, or the verbosity of the logs, all using standard Python tools.
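If you would rather configure logging in code than through a configuration file, a minimal sketch using only the standard library might look like the following. The helper configure_file_logging is hypothetical, the log format is an arbitrary choice, and the logger name "openpathsampling" is an assumption (it presumes the library's loggers are named after its modules); check the names used in logging.conf if messages do not appear.

```python
import logging


def configure_file_logging(filename="ops_output.log", level=logging.INFO,
                           logger_name="openpathsampling"):
    """Send log records at ``level`` and above to ``filename``.

    ``logger_name`` is an assumption: it should be the top-level name
    under which the library's loggers are registered.
    """
    handler = logging.FileHandler(filename)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(name)s %(levelname)s %(message)s"
    ))
    logger = logging.getLogger(logger_name)
    logger.setLevel(level)
    logger.addHandler(handler)
    return logger, handler
```

Passing level=logging.DEBUG increases the verbosity, and swapping logging.FileHandler for logging.StreamHandler sends the records to a stream instead of a file.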
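For more details on how to use Python’s logging facilities, see the Python standard library documentation for the logging module. In particular, for details on how to use a logging configuration file, see the documentation for logging.config.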
Note
You cannot combine sending progress output to stdout and sending logging information to stdout. When written to stdout, the progress information tells your terminal to delete and overwrite the lines from the preceding update; combined with logging, it will delete lines from the logging instead!
See a live visualization of the simulation
The last way to see what is happening during your simulation is perhaps the most fun, but also the least practical. You can visualize the last step of a path sampling simulation by creating a StepVisualizer2D object, which projects your paths into the plane of an arbitrary pair of collective variables. The direction of the path is indicated with a dot as the final frame (like an arrowhead). The color of the path indicates its ensemble. Heavy-width trajectories show the current state; light-width trajectories (with hollow final frames) indicate rejected trial moves.
To use a StepVisualizer2D during a path sampling simulation, assign it to the PathSampling.live_visualization attribute. It will be updated after every PathSampling.save_frequency MC steps. This is also how frequently the data is synced to disk, and how often we run sanity checks (ensuring that all paths are in the expected ensembles). By default, this is after every step, but for performance reasons it is typically set to be much less frequent for toy models and small systems.
Of course, this visualization is not practical for long-running simulations, since it requires an interactive environment. However, the same tool can be used to replay the simulation from a file. The analysis examples demonstrate how to do this.