/**
\page belfast-1 Belfast tutorial: Analyzing CVs

\section belfast-1-aims Aims

The aim of this tutorial is to introduce the users to the plumed syntax. We will go through the writing of
simple collective variable and we will use them to analyze a trajectory in terms of probability distributions
and free energy.

\section belfast-1-lo Learning Outcomes

Once this tutorial is completed students will:

- Know how to write a simple plumed input file 
- Know how to analyze a trajectory using plumed

\section belfast-1-resources Resources

The <a href="tutorial-resources/belfast-1.tar.gz" download="belfast-1.tar.gz"> tarball </a> for this project contains the following files:

- trajectory-short.xyz : a (short) trajectory for a 16 residue protein in xyz format. All calculations with plumed driver use this trajectory.
- template.pdb : a single frame from the trajectory that can be used in conjunction with the \ref MOLINFO command

\section belfast-1-instructions Instructions

PLUMED is a library that can be accessed by multiple codes adding a relatively simple and well documented interface.
Once PLUMED is installed you can run a plumed executable that can be used for multiple purposes:

\verbatim
plumed --help 
\endverbatim

some of the listed options report about features that are available in plumed, others can be used to tell plumed to do something:
from analyzing a trajectory to patch the source code of a MD code and so on. All the commands have further options that can
be seen using plumed command --help, i.e.:

\verbatim
plumed driver --help
\endverbatim

In the following we are going to see how to write an input file for plumed2 that can be used to analyze a trajectory.

\subsection belfast-1-units A note on units
By default the PLUMED inputs and outputs quantities in the following units:

- Energy - kJ/mol
- Length - nanometers
- Time - picoseconds

If you want to change these units you can do this using the \ref UNITS keyword. 

\subsection belfast-1-introinput Introduction to the PLUMED input file

A typical input file for PLUMED input is composed by specification of one or more CVs, the printout frequency and a termination line.
Comments are denoted with a # and the termination of the input for PLUMED is marked with the keyword ENDPLUMED. Whatever it follows is ignored by PLUMED. You can introduce blank lines. They are not interpreted by PLUMED.

In the following input we will analyze the \ref DISTANCE between the two terminal carbons of a 16 residues peptide, and we will \ref PRINT the results
in file named COLVAR.

\verbatim
#my first plumed input:
DISTANCE ATOMS=2,253 LABEL=e2edist

#printout frequency
PRINT ARG=e2edist STRIDE=1 FILE=COLVAR 

#endofinput 
ENDPLUMED
here I can write what I want it won't be read.
\endverbatim

Now we can use this simple input file to analyze the trajectory included in the RESOURCES:

\verbatim
plumed driver --plumed plumed.dat --ixyz trajectory-short.xyz --length-units 0.1
\endverbatim

NOTE: --length-units 0.1, xyz files, as well as pdb files, are in Angstrom.

You should have a file COLVAR, if you look at it (i.e. more COLVAR) the first two lines
should be:

\verbatim 
#! FIELDS time e2edist 
 0.000000 2.5613161
\endverbatim

NOTE: the first line of the file COLVAR tells you what is the content of each column.

In PLUMED the commands defined in the input files are executed in the same order in which they are written, this means that the
following input file is wrong:

\verbatim
#printout frequency
PRINT ARG=e2edist STRIDE=1 FILE=COLVAR 
#my first plumed input:
DISTANCE ATOMS=2,253 LABEL=e2edist
#endofinput 
ENDPLUMED
here I can write what I want it won't be read.
\endverbatim

Try to run it.

Sometimes, when calculating a collective variable, you may not want to use the positions of a number of atoms directly. Instead you may wish to use the position of a virtual atom whose position is generated based on the positions of a collection of other atoms. For example you might want to use the center of a group of atoms (\ref CENTER):

Since PLUMED executes the input in order you need to define the new Virtual Atom before using it:

\verbatim

first: CENTER ATOMS=1,2,3,4,5,6
last: CENTER ATOMS=251-256

e2edist: DISTANCE ATOMS=2,253
comdist: DISTANCE ATOMS=first,last

PRINT ARG=e2edist,comdist STRIDE=1 FILE=COLVAR 

ENDPLUMED
\endverbatim

NOTE: an action (i.e. CENTER or DISTANCE here) can be either labeled using LABEL as we did before or as label: ACTION as we have just done here.

With the above input this is what happen inside PLUMED with a STRIDE=1:

1. calculates the position of the Virtual Atom 'first' as the \ref CENTER of atoms from 1 to 6;
2. calculates the position of the Virtual Atom 'last' as the \ref CENTER of atoms from 251 to 256;
3. calculates the distance between atoms 2 and 253 and saves it in 'e2edist';
4. calculates the distance between the two atoms 'first' and 'last' and saves it in 'comdist';
5. print the content of 'e2edist' and 'comdist' in the file COLVAR

In the above input we have used two different ways of writing the atoms used in \ref CENTER calculation:

1. ATOMS=1,2,3,4,5,6 is the explicit list of the atoms we need
2. ATOMS=251-256 is the range of atoms needed

ranges of atoms can be defined with a stride which can also be negative:

3. ATOMS=from,to:by (i.e.: 251-256:2)
4. ATOMS=to,from:-by (i.e.: 256-251:-2)

Now by plotting the content of the COLVAR file we can compare the behavior in this trajectory of both the terminal carbons
as well as of the center of masses of the terminal residues.

\verbatim
gnuplot
\endverbatim

What do you expect to see now by looking at the trajectory? Let's have a look at it

\verbatim
vmd template.pdb trajectory-short.xyz 
\endverbatim
 
Virtual atoms can be used in place of standard atoms everywhere an atom can be given as input, they can also be used together
with standard atoms. So for example we can analyze the \ref TORSION angle for a set of Virtual and Standard atoms:

\verbatim
first: CENTER ATOMS=1-6
last: CENTER ATOMS=251-256
cvtor: TORSION ATOMS=first,102,138,last

PRINT ARG=cvtor STRIDE=1 FILE=COLVAR 

ENDPLUMED
\endverbatim

The above CV don't look smart to learn something about the system we are looking at. In principle CV are used to reduce the complexity
of a system by looking at a small number of properties that could be enough to rationalize its behavior. 

Now try to write a collective variable that measures the Radius of Gyration of the system: \ref GYRATION.

NOTE: if what you need for one or more variables is a long list of atoms and not a virtual atom one can use the keyword \ref GROUP.
A GROUP can be defined using ATOMS in the same way we saw before, in addition it is also possible to define a GROUP by reading a 
GROMACS index file.

\verbatim
ca: GROUP ATOMS=9,16,31,55,69,90,102,114,124,138,160,174,194,208,224,238
\endverbatim

Now 'ca' is not a virtual atom but a simple list of atoms.

\subsection belfast-1-multicol MULTICOLVAR

Sometimes it can be useful to calculate properties of many similar collective variables at the same time, for example one can be interested
in calculating the properties of the distances between a group of atoms, or properties linked to the distribution of the dihedral angles
of a chain and so on. In PLUMED this kind of collective variables fall under the name of MULTICOLVAR (cf. \ref mcolv.)
Here we are going to analyze the distances between CA carbons along the chain:

\verbatim
ca: GROUP ATOMS=9,16,31,55,69,90,102,114,124,138,160,174,194,208,224,238
dd: DISTANCES GROUP=ca MEAN MIN={BETA=50} MAX={BETA=0.02} MOMENTS=2

PRINT ARG=dd.mean,dd.min,dd.max,dd.moment-2 STRIDE=1 FILE=COLVAR 

ENDPLUMED
\endverbatim

The above input tells PLUMED to calculate all the distances between CA carbons and then look
for the mean distance, the minimum distance, the maximum distance and the variance. In this way
we have defined four collective variables that are calculated using the distances. These four
collective variables are stored as components of the defined action 'dd': dd.mean, dd.min, dd.max,
dd.moment-2.

The infrastructure of multicolvar has been used to develop many PLUMED collective variables as
for example the set of Secondary Structure CVs (\ref ANTIBETARMSD, \ref PARABETARMSD and \ref ALPHARMSD).

\verbatim
MOLINFO STRUCTURE=template.pdb
abeta: ANTIBETARMSD RESIDUES=all TYPE=DRMSD LESS_THAN={RATIONAL R_0=0.08 NN=8 MM=12} STRANDS_CUTOFF=1

PRINT ARG=abeta.lessthan STRIDE=1 FILE=COLVAR 

ENDPLUMED
\endverbatim

We have now seen how to write the input some of the many CVs available in PLUMED. More complex CVs
will be discussed in the next workshop, \ref belfast-2.
 
\subsection belfast-1-analysis Analysis of Collective Variables

Collective variables are usually used to visualize the Free Energy of a system. Given a system evolving
at fixed temperature, fixed number of particles and fixed volume, it will explore different conformations
with a probability 

\f[
  P(q)\propto e^{-\frac{U(q)}{kb_BT}}
\f]
where \f$ q \f$ are the microscopic coordinates and \f$ k_B \f$ is the Boltzmann constant.

It is possible to analyze the above probability as a function of one or more collective variable \f$ s(q)\f$:

\f[
  P(s)\propto \int dq  e^{-\frac{U(q)}{kb_BT}} \delta(s-s(q))
\f]

where the \f$ \delta \f$ function means that to for a given value \f$ s\f$ of the collective variable are counted
only those conformations for which the CV is \f$ s\f$. The probability can be recast to a free energy by
taking its logarithm: 
\f[
  F(s)=-k_B T \log P(s)
\f]

This means that by estimating the probability distribution of a CV it is possible to know the free energy of a
system along that CV. Estimating the probability distribution of the conformations of a system is what is called
'sampling'.

In order to estimate a probability distribution one needs to make \ref HISTOGRAM from the calculated CVs.
PLUMED includes the possibility of constructing a histogram from data both on the fly as well as a posteriori as we
are going to do now.

\verbatim
MOLINFO STRUCTURE=template.pdb
abeta: ANTIBETARMSD RESIDUES=all TYPE=DRMSD LESS_THAN={RATIONAL R_0=0.08 NN=8 MM=12} STRANDS_CUTOFF=1
ca: GROUP ATOMS=9,16,31,55,69,90,102,114,124,138,160,174,194,208,224,238
DISTANCES ...
GROUP=ca MEAN MIN={BETA=50} MAX={BETA=0.02} MOMENTS=2 LABEL=dd
... DISTANCES 

PRINT ARG=abeta.lessthan,dd.mean,dd.min,dd.max,dd.moment-2 STRIDE=1 FILE=COLVAR 

HISTOGRAM ...
ARG=abeta.lessthan,dd.mean
LABEL=hh
KERNEL=DISCRETE
GRID_MIN=0,0.8
GRID_MAX=4,1.2
GRID_BIN=40,40
... HISTOGRAM

DUMPGRID GRID=hh FILE=histo

ENDPLUMED
\endverbatim

NOTE: HISTOGRAM ... means that what follow is part of the \ref HISTOGRAM function, the same can be done for any action
in PLUMED.

The above input tells PLUMED to accumulate the two collective variables on a GRID. In addition the 
probability can be converted to a free-energy using the \ref CONVERT_TO_FES method and then use
\ref DUMPGRID to write it.
Histograms can be accumulated in a smoother way by using a KERNEL function, a kernel is a normalized function, 
for example a normalized gaussian is the default kernel in PLUMED, that is added to the histogram centered in the position of the data. 
Estimating a probability density using kernels can in principle give more accurate results, on the other hand in 
addition to the choice of the binning one has to choose a parameter that is the WIDTH of the kernel function. 
As a rule of thumb: the grid spacing should be smaller (i.e. one half or less) than the BANDWIDTH
and the BANDWIDTH should be smaller (i.e. one order of magnitude) than the variance observed/expected for the variable. 

\verbatim
HISTOGRAM ...
LABEL=hh
ARG=abeta.lessthan,dd.mean
GRID_MIN=0,0.8
GRID_MAX=4,1.2
GRID_SPACING=0.04,0.004
BANDWIDTH=0.08,0.008
... HISTOGRAM

DUMPGRID GRID=hh FILE=histo

ENDPLUMED
\endverbatim

If you have time less at the end of the session read the manual and look for alternative collective variables to analyze the
trajectory. Furthermore try to play with the \ref HISTOGRAM parameters to see the effect of using KERNEL in analyzing data.

*/

link: @subpage belfast-1

description: This tutorial explains how to use plumed to analyze CVs

additional-files: belfast-1
