Calculators section

Each module in ProtoSyn may contain a Calculators section which includes types and methods that allow ProtoSyn to measure the fitness of a given State, according to a specific EnergyFunction instance. The resulting energy value is stored in the State.e field, as a Dictionary of EnergyFunctionComponent instances. The :total energy of the system is the sum of each component contribution. Each module in ProtoSyn (such as the Peptides module) adds new components and methods specific to that module's purpose. As a starting point, the Core module sets up all necessary types and methods transversal to all modules, as well as a few basic energetic components. In the next section, a further exploration on how to develop and set-up custom EnergyFunction instances is provided, organized in the following sections:

Energy functions & energy function components

An EnergyFunction is the main high-level object of ProtoSyn for energy calculation of a Pose instance, whose building blocks are the EnergyFunctionComponent instances.

ProtoSyn.Calculators.EnergyFunctionComponentType
EnergyFunctionComponent(name::String, calc::Function, settings::Dict{Symbol, Any}, α::T, update_forces::Bool)

Return a new EnergyFunctionComponent instance with the given name. An EnergyFunctionComponent object is responsible to calculate an energetic contribution to the system, based on a given interaction, model or restraint (for example, a Bond Distance Restraint is responsible to calculate the energetic contribution by bonds with lengths superior to a given threshold). The :total energy of a Pose is, therefore, the sum of all EnergyFunctionComponent applied in a given calculation. Each EnergyFunctionComponent has a calc Function that receives a Pose and outputs an energy value. Optionally, this calc Function can also return a set of forces felt on all considered atoms, based on the potential or model used, which is calculated if the update_forces flag is set to true. For a more detailed explanation see the Creating a custom EnergyFunctionComponent section of the documentation). Additionally, certain EnergyFunctionComponent instances can have a settings dictionary, containing kwargs that are passed to the calc Function. These usually parametrize and fine tune the calculations perform (such as setting the flat bottom restraint distances, as an example). When calling an EnergyFunctionComponent calculation from an EnergyFunction, the relative weight of this EnergyFunctionComponent in relation to others is given by the α field.

Fields

See also

EnergyFunction

Examples

julia> ProtoSyn.Calculators.Restraints.get_default_bond_distance_restraint()
         Name : Bond_Distance_Restraint
   Weight (α) : 1.0
Update forces : true
      Setings :
          :x0 => 2.0
source
ProtoSyn.Calculators.EnergyFunctionType
EnergyFunction(components::Vector{EnergyFunctionComponent}, clean_cache_every::Int16, cache::Int16, components_by_name::Dict{String, Int}, selection::AbstractSelection, update_forces::Bool)

Construct and return a new EnergyFunction instance. An EnergyFunction is a collection of EnergyFunctionComponent instances, where each of these components calculates an energetic contribution to the :total energy and forces acting on a system. An EnergyFunctionComponent can be retrieved by its index or by its name (as saved in components_by_name). The AbstractSelection selection defines the Atom selection this EnergyFunction is applied to. If inner EnergyFunctionComponent instances also have AbstractSelection selections defined, the resulting selection will be the intersection between both. The update_forces sets whether to calculate and update the Pose State forces. The Julia cache is automatically cleaned by garbage collection. However, in certain cases (such as using the TorchANI EnergyFunctionComponent), a manual call to garbage collection is necessary (see Issue 55140). In such cases, the EnergyFunction object has an internal cache that is cleaned (by calling garbage collection) at intervals of clean_cache_every calls. This interval is automatically adjusted down by measuring the current load on the GPU, calling garbage collection once the memory allocation is greater than ProtoSyn.Units.max_gpu_allocation.

EnergyFunction([::Type{T}])

Construct and return an empty EnergyFunction instance. The :clean_cache_every field is set to ProtoSyn.Units.defaultCleanCacheEvery, the AbstractSelection selection field is defined as an atomic TrueSelection and update_forces is set to false.

EnergyFunction(components::Vector{EnergyFunctionComponent{T}}) where {T <: AbstractFloat}

Construct and return a new EnergyFunction instance filled with the given list of EnergyFunctionComponent instances components. The :clean_cache_every field is set to ProtoSyn.Units.defaultCleanCacheEvery, the AbstractSelection selection field is defined as an atomic TrueSelection and update_forces is set to false.

Fields

  • components::Vector{EnergyFunctionComponent} - The list of EnergyFunctionComponent instances in this EnergyFunction;
  • clean_cache_every::Int16 - Forcefully call garbage collection every N calls;
  • cache::Int16 - Current number of calls performed. Resets to zero every clean_cache_every;
  • components_by_name::Dict{String, Int} - The list of EnergyFunctionComponent instances in this EnergyFunction, indexed by :name;
  • selection::AbstractSelection - The AbstractSelection selecting Atom instances to apply this EnergyFunction to;
  • update_forces::Bool - Whether to calculate and update forces with this EnergyFunction.

See also

EnergyFunctionComponent

Examples

julia> energy_function = ProtoSyn.Calculators.EnergyFunction()
🗲  Energy Function (0 components):
+----------------------------------------------------------------------+
| Index | Component name                                | Weight (α)   |
+----------------------------------------------------------------------+
+----------------------------------------------------------------------+
 ● Update forces: false
 ● Selection: Set
 └── TrueSelection (Atom)

julia> push!(energy_function, Calculators.Restraints.get_default_bond_distance_restraint())
🗲  Energy Function (1 components):
+----------------------------------------------------------------------+
| Index | Component name                                | Weight (α)   |
+----------------------------------------------------------------------+
| 1     | Bond_Distance_Rest                            |      1.000   |
+----------------------------------------------------------------------+
 ● Update forces: false
 ● Selection: Set
 └── TrueSelection (Atom)

 julia> energy_function["Bond_Distance_Rest"].α = 0.5
 0.5

julia> energy_function
🗲  Energy Function (1 components):
+----------------------------------------------------------------------+
| Index | Component name                                | Weight (α)   |
+----------------------------------------------------------------------+
| 1     | Bond_Distance_Rest                            |      0.500   |
+----------------------------------------------------------------------+
 ● Update forces: false
 ● Selection: Set
 └── TrueSelection (Atom)
source

An EnergyFunction instance is actually a functor, meaning it is callable in order to apply any energy function type to any Pose instance, using the following signature:

(energy_function::EnergyFunction)(pose::Pose, update_forces::Bool = false)

An application example is provided bellow, using the TorchANI model as the sole EnergyFunctionComponent:

julia> torchani = Calculators.TorchANI.get_default_torchani_model()🞧  Energy Function Component:
+---------------------------------------------------+
| Name           | TorchANI_ML_Model                |
| Alpha (α)      | 1.0                              |
| Update forces  | true                             |
| Calculator     | calc_torchani_model              |
+---------------------------------------------------+
 |    +----------------------------------------------------------------------------------+
 ├──  ● Settings                      | Value                                            |
 |    +----------------------------------------------------------------------------------+
 |    | model                         | 3                                                |
 |    +----------------------------------------------------------------------------------+
 |
 └──  ○  Selection: nothing
julia> energy_function = Calculators.EnergyFunction([torchani])🗲 Energy Function (1 components): +----------------------------------------------------------------------+ | Index | Component name | Weight (α) | +----------------------------------------------------------------------+ | 1 | TorchANI_ML_Model | 1.00 | +----------------------------------------------------------------------+ ● Update forces: false ● Selection: Set └── TrueSelection (Atom)
julia> energy_function(pose)-0.1257355958223343
julia> energy_function(pose, true) # Calculating forces ...ERROR: MethodError: no method matching (::ProtoSyn.Calculators.EnergyFunction)(::Pose{Topology}, ::Bool) Closest candidates are: (::ProtoSyn.Calculators.EnergyFunction)(::Pose; update_forces_overwrite) at ~/ProtoSyn.jl/src/Core/Calculators/energy_function.jl:190
julia> pose.state.f3×39 Matrix{Float64}: 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 … 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

ProtoSyn Energy Function

Figure 1 | A diagram representation of the EnergyFunction organization as a set of EnergyFunctionComponent instances. Each component contribution is summed to evaluate the :total energy of the system, according to each component's scaling factor ɑ (in this example, 1.0 for the TorchANI Ensemble EnergyFunctionComponent and 0.8 for the Contact Map EnergyFunctionComponent). Some components may be able to calculate the set of forces acting on each atom of the system, in which case this contribution can be toggled on and off by the :update_forces flag. Furthermore, each component may be parametrized by a set of options. In this example, the Contact Map EnergyFunctionComponent can be further parametrized by setting the :d1, :d2, :d3 and :d4 values (controlling the flat bottom restraint potential), as well as the :selection on which to act and the :mask applied. Such settings are specific to each type of EnergyFunctionComponent and are explored in detail in each entry in this manual.

Creating a custom EnergyFunctionComponent

As stated before, an EnergyFunctionComponent is mostly a smart wrapper around a :calc Function responsible for calculating a given energetic contribution based on a specific method, model or restraint type (as an example). Therefore, when developing custom EnergyFunctionComponent instances, certain aspects need to be taken into consideration. The following section lists particular restraints when creating custom EnergyFunctionComponent instances.

  1. The :calc Function signature:

When being called from an EnergyFunction instance, ProtoSyn expects the following calc Function signature.

calc(::A, pose::Pose, sele::Opt{AbstractSelection}, update_forces::Bool = false; kwarg1::Type{kwarg1} = default_value1, kwarg2::Type{kwarg2} = default_value2, ...) where {A <: ProtoSyn.AbstractAccelerationType}

The ::A (see the ProtoSyn acceleration types section), pose::Pose, sele::AbstractSelection and update_forces::Bool = false arguments are necessary and expected. Optionally, these can be followed by a variable number of arguments (kwargs), parameterizing and fine-tuning the usage of the calc Function. When calling this EnergyFunctionComponent from an EnergyFunction, these kwargs are passed from the EnergyFunctionComponent.settings dictionary, and can therefore be automatically and continuously adjusted in a simulation. Since it may happen that one or more kwargs may not be defined in the EnergyFunctionComponent.settings, defining a default value for each of the kwargs is considered a good practice when creating a custom EnergyFunctionComponent.

Example of potential :calc functions:

# Without restricting the acceleration type
function my_calc(::A, pose, sele::Opt{AbstractSelection}, update_forces::Bool) where {A <: ProtoSyn.AbstractAccelerationType}
    (...)
end

# Restricting the acceleration type and defining additional settings (with default values)
function my_calc(::ProtoSyn.CUDA_2, pose::Pose, sele::Opt{AbstractSelection}, update_forces::Bool; d1::T = 0.0, d2::T = 0.0, d3::T = Inf, d4::T = Inf, mask::MaskMap = nothing) where {T <: AbstractFloat}
    (...)
end
  1. the :calc function returning values:

ProtoSyn expects the :calc function to return 2 objects: firstly, the energy value, as an AbstractFloat (Float64, for example), and secondly, a Matrix of forces acting on the system. If a particular EnergyFunctionComponent does not calculate forces, it can optionally return nothing. Any extra returning values besides these (as long as these are the first, in this order), can be returned but will be ignored in the context of an EnergyFunction.

function my_calc(::A, pose, sele::Opt{AbstractSelection}, update_forces::Bool) where {A <: ProtoSyn.AbstractAccelerationType}
    (...)
    return energy, forces
end
  1. The EnergyFunctionComponent.settings field

As previously stated, the EnergyFunctionComponent.settings are optional. If present, these are passed to the EnergyFunctionComponent.calc Function as kwargs, and therefore should match and have the same name (as a Symbol instead). This allows for easy access and modification of internal parameters of the calculation on the fly.

Examples:

Dict{Symbol, Any}(:d1 => 1.0, :d2 => 3.0, :d3 => Inf, :d4 => Inf, :mask => Calculators.get_diagonal_mask(pose, an"C"))

ProtoSyn acceleration types

Certain EnergyFunctionComponent instances in ProtoSyn require specific software and hardware on the machine performing the calculation, or at least have specific implementations of the code based on the availability or not of said features (for example, certain functions have specific implementations for CUDA and GPU acceleration). For this reason, ProtoSyn defines 3 levels of acceleration (3 different acceleration types, loosely based on Flynn's Taxonomy):

At start-up, ProtoSyn tries to identify the available software and hardware available and sets the fastest acceleration type possible as ProtoSyn.acceleration.active. This can, however, be overridden:

julia> ProtoSyn.acceleration.activeProtoSyn.CUDA_2
julia> ProtoSyn.acceleration.active = ProtoSyn.SIMD_1ProtoSyn.SIMD_1

Verlet lists

A Verlet list is a common data structure in molecular manipulation and simulation softwares that maintains a list of particles inside a given cut-off distance to eachother. The main objective of Verlet lists is to speed up calculations by ignoring most of the long-distance interactions between Atom instances, as these would have negligent effects in the simulation, while reducing the amount of memory allocated as a whole. Note that, given the movement and displacement of Atom instances in a simulation, this list rapidly becomes obsolete and needs to be re-updated.

ProtoSyn.Calculators.VerletListType
VerletList(size::Int, capacity::Int, cutoff::Float64, offset::Vector{Int}, list::Vector{Int})

Hold information regarding the neighbouring particles of each particle in the system (with number of particles size). A VerletList is comprised of two arrays: the list, containing the neighbouring atoms j of atom i, with sequential atoms divided by an invalid -1 entry; and the offset array, containing the positions in array list for each atom i. Neighbouring atoms are defined as having a distance bellow the defined cutoff. The main objective of a VerletList is to speed up calculations (by ignoring long-range interactions between Atom instances) and to lower the total amount of memory allocated (the number of allocated Atom entries is at most the capacity of the VerletList). Note that, given the motion of particles in a simulation, a VerletList can quickly become obsolete, and needs to be updated using update!.

VerletList(size::Int)

Creates a new VerletList with infinite cutoff (holds all atoms in the molecule).

VerletList(pose::Pose)

Creates a new VerletList with infinite cutoff (holds all atoms in the Pose).

Fields

  • size::Int - The number of Atom instances this VerletList makes reference to. Should be the size of :offset field;
  • capacity::Int - Maximum number of interaction pairs listed in this VerletList;
  • cutoff::Float64 - Interactions are considered when the distance between two Atom instances is less than this value;
  • offset::Vector{Int} - Vector with the starting index for the neighbouring Atom.id entries in the :list field;
  • list::Vector{Int} - Vector with the neighbouring Atom.id entries, in sectors separated by invalid entries (such as -1).

See also

update! distance

Examples

julia> ProtoSyn.Calculators.VerletList(4)
ProtoSyn.Calculators.VerletList(4, 4, Inf, [0, 0, 0, 0], [0, 0, 0, 0])
source
ProtoSyn.Calculators.update!Function
update!([::Type{ProtoSyn.SISD_0}], verlet_list::VerletList, pose::Pose, [selection::Opt{ProtoSyn.AbstractSelection} = nothing])
update!([::Type{ProtoSyn.SIMD_1}], verlet_list::VerletList, pose::Pose, [selection::Opt{ProtoSyn.AbstractSelection} = nothing])

Updates the given VerletList (using a SISD_0 or SIMD_1 acceleration approach) according to the defined 'verlet_list.cutoff' and the given coordinates in the Pose pose (in AoS format). If the acceletarion type is not given, the default ProtoSyn.acceleration.active is employed. If an AbstractSelection selection is provided, only include the pair of Atom instances as interacting, in the VerletList, if both instances are selected.

Examples

julia> ProtoSyn.Calculators.update!(verlet_list, pose)
    ...

julia> ProtoSyn.Calculators.update!(verlet_list, pose, an"CA")
    ...
source

ProtoSyn Verlet List

Figure 2 | A diagram representation of the VerletList organization. Usually, a VerletList is comprised of two arrays: the :list and the :offset. The :offset vector contains 1 entry for each Atom instance in a given Pose, whose Atom.id is the index of the vector. The actual content of the vector is an index on the :list array, defining the starting point from which to read this vector in order to find all the Atom instances intercating with the currently selected Atom. As such, the :list vector is a collection of Atom.id records, with contiguous sectors separated by invalid entries (such as -1). Each one of these contigous sectors is a collection of Atom.id values belonging to the Atom instances that interact with a common Atom. As an example, consider the represented schematic. The considered range of interactions is represnted in light grey, with radius a and centered on the selected Atom 2. As such, the Atom instances 5, 9 and 12 are inside the interaction radius, and should be considered for measurement. In the accompanying VerletList, index 2 in the :offset vector contains value 4. By reading the :list vector starting at index 4 (as indicated by the :offset vector) until an invalid position is found (such as -1), Atom.id 5, 9 and 12 are found. In the same schematic, the dashed grey line indicates a "moving cut-off". It's a common practice in the usage of VerletList instances to accomodate Atom movement in a simulation environment, in order to reduce the number of times the VerletList needs to be updated. In this case, radius b is added to the VerletList.cutoff, and should be equal to N x 2 x MD (where N is the number of steps before re-updating the VerletList and MD is the maximum displacement an Atom is allowed or expected to suffer each step of the simulation). Using this approach, Atom 1 (in the example) should be included in the :list vector, at the contiguous sector related to Atom 2.

Note:

As of ProtoSyn 1.1, VerletList instances are only employable in distance calculations using distance_matrix methods. However, the underlying mechanisms of VerletList data structures can be useful elsewhere and are planned to be incorporated in default EnergyFunctionComponent instances in the future.

Distance matrix calculation

As previously stated, the Calculators section in Core exists mainly to set-up types and methods transversal to all ProtoSyn's Calculators in different modules (such as Peptides). One such method commonly employed in energy calculations is the ability to measure distance matrices. A distance matrix is a 2D matrix whose entries are the distance between the X and Y Atom, usually considering only the top triangular matrix. A full distance matrix can also be useful, as is available by using the full_distance_matrix method.

ProtoSyn.Calculators.distance_matrixFunction
Calculators.distance_matrix([::A], coords::Matrix{T}) where {A, T <: AbstractFloat}
Calculators.distance_matrix([::A], state::State{T}) where {A, T <: AbstractFloat}
Calculators.distance_matrix([::A], pose::Pose) where {A}

Return a distance matrix with the distance of all pairs of coordinates in coords (this should be a Matrix{T} in AoS format) above the triangular matrix. Instead of a Matrix{T} coords, a State or Pose can be provided, in which case the coordinates considered are all the existent in the State or Pose.state, respectively. The optional A parameter defines the acceleration mode used (SISD_0, SIMD_1 or CUDA_2). If left undefined the default ProtoSyn.acceleration.active mode will be used. Note: Make sure the given Pose pose in synched (using the sync! method).

Calculators.distance_matrix([::A], pose::Pose, selection::ProtoSyn.AbstractSelection)

Return a distance matrix with the distance of all pairs of atoms in the Pose pose who are selected in the given AbstractSelection selection and above the triangular matrix. The optional A parameter defines the acceleration mode used (SISD_0, SIMD_1 or CUDA_2). If left undefined the default ProtoSyn.acceleration.active mode will be used.

Calculators.distance_matrix([::A], coords::Matrix{T}, verlet_list::VerletList) where {T <: AbstractFloat}
Calculators.distance_matrix([::A], state::State{T}, verlet_list::VerletList) where {T <: AbstractFloat}
Calculators.distance_matrix([::A], pose::Pose, verlet_list::VerletList)

Return a top triangular distance matrix with the distance of all pairs of coordinates in the VerletList verlet_list. Instead of a Matrix{T} coords, a State or Pose can be provided, in which case the coordinates considered are existent in the State or Pose.state, respectively. The optional A parameter defines the acceleration mode used (SISD_0, SIMD_1). If left undefined the default ProtoSyn.acceleration.active mode will be used. Note: Using VerletList, CUDA_2 acceleration mode is not available. If the default ProtoSyn.acceleration.active is set to CUDA_2, SIMD_1 will be used instead.

Note:

AbstractSelections can still be applied when using VerletList instances, but need to be applied when updating the lists themselves. Check VerletList and update! entries for a more in-depth look at how Verlet lists work.

See also

full_distance_matrix

Examples

julia> ProtoSyn.Calculators.distance_matrix(pose.state.x)
N×N CUDA.CuArray{Float64,2}:
    ...

julia> ProtoSyn.Calculators.distance_matrix(pose, an"CA")
N×N CUDA.CuArray{Float64,2}:
    ...

julia> ProtoSyn.Calculators.distance_matrix(pose, verlet_list)
N×N CUDA.CuArray{Float64,2}:
    ...
source
ProtoSyn.Calculators.full_distance_matrixFunction
Calculators.full_distance_matrix([::A], coords::Matrix{T}) where {A, T <: AbstractFloat}
Calculators.full_distance_matrix([::A], state::State{T}) where {A, T <: AbstractFloat}
Calculators.full_distance_matrix([::A], pose::Pose) where {A}

Return a distance matrix with the distance of all pairs of coordinates in coords (this should be a Matrix{T} in AoS format). Instead of a Matrix{T} coords, a State or Pose can be provided, in which case the coordinates considered are all the existent in the State or Pose.state, respectively. The optional A parameter defines the acceleration mode used (SISD_0, SIMD_1 or CUDA_2). If left undefined the default ProtoSyn.acceleration.active mode will be used.

Calculators.full_distance_matrix([::A], pose::Pose, selection::ProtoSyn.AbstractSelection)

Return a distance matrix with the distance of all pairs of atoms in the Pose pose who are selected in the given AbstractSelection selection. The optional A parameter defines the acceleration mode used (SISD_0, SIMD_1 or CUDA_2). If left undefined the default ProtoSyn.acceleration.active mode will be used.

See also

distance_matrix

Examples

julia> ProtoSyn.Calculators.full_distance_matrix(pose.state.x)
N×N CUDA.CuArray{Float64,2}:
 ...

julia> ProtoSyn.Calculators.distance_matrix(pose, an"CA")
N×N CUDA.CuArray{Float64,2}:
 ...
source

Available EnergyFunctionComponents

Besides all the machinery to create custom energy components, ProtoSyn's Core module makes available some simple EnergyFunctionComponent instances. Since this is not a specialized module, available components should be indiferent to the type of molecular structure being evaluated. For a compelte list of all available EnergyFunctionComponent instances, consider using the show_available_energy_function_components method.

ProtoSyn.Calculators.show_available_energy_function_componentsFunction
show_available_energy_function_components([io::IO = stdout], [m::Module = ProtoSyn.Calculators])

Prints all available EnergyFunctionComponent instances in the provided Module m (defaults to ProtoSyn.Calculators) to the given IO io (defaults to stdout). Recursivelly searches any inner Module.

See also

get_available_energy_function_components

Examples

julia> ProtoSyn.Calculators.show_available_energy_function_components(ProtoSyn.Calculators)
+------------------------------------------------------------------------------------------------+
| Index | Component name            | Function                                                   |
+------------------------------------------------------------------------------------------------+
| 1     | Custom_Ref_Energy         | ProtoSyn.Calculators.get_default_custom_ref_energy         |
| 2     | Coulomb                   | ProtoSyn.Calculators.Electrostatics.get_default_coulomb    |
| 3     | GB_Solvation              | ProtoSyn.Calculators.GB.get_default_gb                     |
+------------------------------------------------------------------------------------------------+
└── Consider using the `?` menu to learn more about each EnergyFunctionComponent.
source

As of ProtoSyn 1.1, the following EnergyFunctionComponent instances are made available from the Core module, and can be further explored in the respectivelly dedicated pages of this manual: