Caterpillar Solvation
The Caterpillar Solvation energy is a coarse-grained approximation to a solvation energy contribution, introducing a penalty for the exposure of hydrophobic Residue instances and a penalty for the burial of hydrophilic Residue instances (and, conversely, a reward for buried hydrophobic and exposed hydrophilic ones).
The Caterpillar solvation energy calculation is based on the work of Coluzza et al. (see this paper).
ProtoSyn's modifications introduce a significant degree of complexity, further explained below. In sum, the calculation takes 2 steps: (1) the burial degree calculation and (2) the hydrophobicity weight calculation.
1. Burial degree calculation
In this step, a given algorithm loops over all the residues (selecting a given atom for distance matrix calculation) and identifies the burial degree of each residue. This step can be parametrized by (1) the burial degree algorithm, (2) the identification curve, (3) the selection atom, (4) the rmax cut-off and (5) the slope control sc. These settings are further explained below.
1.1 Burial degree algorithm
ProtoSyn offers 2 different burial degree identification algorithms: Neighbour Count (NC) and Neighbour Vector (NV) (as explained further in this paper). NC algorithms only take into consideration the number of selected atoms within a defined rmax range, while NV algorithms also take into consideration the orientation of neighbouring residues to define the burial degree. As such, in general, NV algorithms are more precise, at an additional performance cost.
ProtoSyn.Peptides.Calculators.Caterpillar.neighbour_count — Function
ProtoSyn.Peptides.Calculators.Caterpillar.neighbour_count([::A], pose::Pose, selection::Opt{AbstractSelection}, [update_forces::Bool = false]; [identification_curve::Function = null_identification_curve], [hydrophobicity_weight::Function = null_hydrophobicity_weight], [rmax::T = 9.0], [sc::T = 1.0], [Ω::Union{Int, T} = 750.0], [hydrophobicity_map::Dict{String, T} = ProtoSyn.Peptides.doolitle_hydrophobicity]) where {A, T <: AbstractFloat}
Calculate the caterpillar solvation energy of the given Pose pose using the Neighbour Count (NC) algorithm (see this article). If an AbstractSelection selection is provided, consider only the selected Atom instances (any given selection will be promoted to be of Atom type, see ProtoSyn.promote). In this model, the burial degree Ωi of each atom is equal to the number (count) of neighbouring Atom instances (within the defined rmax cut-off, in Angstrom Å), each multiplied by a w1 weight provided by the identification_curve Function. This Function receives the distance between each pair of neighbouring atoms (as a float), the rmax value and, optionally, a slope control sc value, and returns a weight w1 (as a float). The identification_curve signature is as follows:
identification_curve(distance::T; rmax::T = 9.0, sc::T = 1.0) where {T <: AbstractFloat}
In order to use pre-defined identification_curve Function instances defined in ProtoSyn, check linear and sigmoid.
Note that:
- Shorter rmax values identify buried residues in the local environment (i.e.: in the scale of the secondary structure, recommended) while larger rmax values identify buried residues in the global scale (i.e.: in comparison with the whole structure).
- The slope control sc value only has an effect in sigmoid identification_curve Function instances. A smaller value increases the weight of distance information in the w1 weight calculation, while a larger value defines a stricter cut-off (recommended).
An Atom is, therefore, considered buried if the number of neighbours (multiplied by w1) is over a defined cut-off value Ω. Buried hydrophobic amino acids receive an energetic reward, while exposed hydrophobic Residue instances receive a penalty (and vice-versa for hydrophilic amino acids), defined in the provided hydrophobicity_map (hydrophobicity map examples can be found in Peptides.constants.jl) and multiplied by w2, calculated by the hydrophobicity_weight Function. This Function receives the neighbour count Ωi, the hydrophobicity_map_value and the cut-off value Ω, returning a w2 weight (as a float). The hydrophobicity_weight signature is as follows:
hydrophobicity_weight(Ωi::Union{Int, T}; hydrophobicity_map_value::T = 0.0, Ω::Union{Int, T} = 0.0) where {T <: AbstractFloat}
In order to use pre-defined hydrophobicity_weight Function instances defined in ProtoSyn, check nc_scalling_exposed_only, nc_non_scalling_exposed_only, nc_scalling_all_contributions (recommended) and nc_non_scalling_all_contributions.
The optional A parameter defines the acceleration mode used (SISD_0, SIMD_1 or CUDA_2). If left undefined, the default ProtoSyn.acceleration.active mode will be used. This function does not calculate forces (not applicable), and therefore the `update_forces` flag serves solely for consistency with other energy-calculating functions.
Examples
julia> ProtoSyn.Peptides.Calculators.Caterpillar.neighbour_count(pose, false)
(0.0, nothing)
ProtoSyn.Peptides.Calculators.Caterpillar.neighbour_vector — Function
ProtoSyn.Peptides.Calculators.Caterpillar.neighbour_vector([::A], pose::Pose, selection::Opt{AbstractSelection}, [update_forces::Bool = false]; [identification_curve::Function = null_identification_curve], [hydrophobicity_weight::Function = null_hydrophobicity_weight], [rmax::T = 9.0], [sc::T = 1.0], [Ω::Union{Int, T} = 750.0], [hydrophobicity_map::Dict{String, T} = ProtoSyn.Peptides.doolitle_hydrophobicity]) where {A, T <: AbstractFloat}
Calculate the caterpillar solvation energy of the given Pose pose using the Neighbour Vector (NV) algorithm (see this article). If an AbstractSelection selection is provided, consider only the selected Atom instances (any given selection will be promoted to be of Atom type, see ProtoSyn.promote). In this model, vectors ωi are defined between an Atom and all neighbouring Atom instances (within the defined rmax cut-off, in Angstrom Å). Note that only the atoms selected by the selection are considered. A resulting vector Ωi is calculated by summing all ωi vectors, each multiplied by a w1 weight provided by the identification_curve Function. This Function receives the distance between each pair of neighbouring atoms (as a float), the rmax value and, optionally, a slope control sc value, and returns a weight w1 (as a float). The identification_curve signature is as follows:
identification_curve(distance::T; rmax::T = 9.0, sc::T = 1.0) where {T <: AbstractFloat}
In order to use pre-defined identification_curve Function instances defined in ProtoSyn, check linear, sigmoid and sigmoid_normalized.
Note that:
- Shorter rmax values identify buried residues in the local environment (i.e.: in the scale of the secondary structure, recommended) while larger rmax values identify buried residues in the global scale (i.e.: in comparison with the whole structure).
- The slope control sc value only has an effect in sigmoid identification_curve Function instances. A smaller value increases the weight of distance information in the w1 weight calculation, while a larger value defines a stricter cut-off (recommended).
An Atom is, therefore, considered buried if the magnitude of the vector resulting from the sum of the ωi (multiplied by w1) is within a defined cut-off value Ω. Buried hydrophobic amino acids receive an energetic reward, while exposed hydrophobic Residue instances receive a penalty (and vice-versa for hydrophilic amino acids), defined in the provided hydrophobicity_map (hydrophobicity map examples can be found in Peptides.constants.jl) and multiplied by w2, calculated by the hydrophobicity_weight Function. This Function receives the vector magnitude Ωi, the hydrophobicity_map_value and the cut-off value Ω, returning a w2 weight (as a float). The hydrophobicity_weight signature is as follows:
hydrophobicity_weight(Ωi::Union{Int, T}; hydrophobicity_map_value::T = 0.0, Ω::Union{Int, T} = 0.0) where {T <: AbstractFloat}
In order to use pre-defined hydrophobicity_weight Function instances defined in ProtoSyn, check nv_scalling_exposed_only, nv_non_scalling_exposed_only, nv_scalling_all_contributions (recommended) and nv_non_scalling_all_contributions.
The optional A parameter defines the acceleration mode used (SISD_0, SIMD_1 or CUDA_2). If left undefined, the default ProtoSyn.acceleration.active mode will be used. This function does not calculate forces (not applicable), and therefore the `update_forces` flag serves solely for consistency with other energy-calculating functions.
Examples
julia> ProtoSyn.Peptides.Calculators.Caterpillar.neighbour_vector(pose, false)
(0.0, nothing)
Figure 1 | Visualization of neighbour_count and neighbour_vector algorithms for burial degree calculation. A - Neighbour count - Since only the number of selected atoms within the rmax cutoff is accounted for, both depicted conformations are measured as having the same burial degree, even though atom B is more exposed to the system's solvent. B - Neighbour vector - Using the neighbour vector algorithm, the vector resulting from summing all individual vectors from the selected atom towards all neighbouring selected atoms (within rmax) provides a clearer distinction between exposed and non-exposed Atom instances: resulting vectors with higher magnitude are linearly and positively correlated with atoms with more exposed surfaces. Note that, on average, the NV algorithm was measured to be 4-5x slower than NC.
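The difference between the two algorithms in Figure 1 can be sketched on toy coordinates. This is a minimal illustration, not ProtoSyn's implementation: the helper names `nc_burial` and `nv_burial` are hypothetical, every neighbour gets a w1 weight of 1, and the NV sketch assumes each neighbour contributes a unit vector.

```julia
using LinearAlgebra: norm

# Hypothetical helper: unweighted neighbour count (w1 = 1 for every neighbour).
function nc_burial(center::Vector{Float64}, others::Vector{Vector{Float64}}; rmax::Float64 = 9.0)
    return count(p -> norm(p .- center) <= rmax, others)
end

# Hypothetical helper: magnitude of the sum of unit vectors pointing from
# the center towards each neighbour within rmax (assumed weighting).
function nv_burial(center::Vector{Float64}, others::Vector{Vector{Float64}}; rmax::Float64 = 9.0)
    total = zeros(3)
    for p in others
        v = p .- center
        d = norm(v)
        if 0.0 < d <= rmax
            total .+= v ./ d
        end
    end
    return norm(total)
end

center = [0.0, 0.0, 0.0]
# Conformation A: four neighbours spread evenly around the atom.
surrounded = [[5.0, 0.0, 0.0], [-5.0, 0.0, 0.0], [0.0, 5.0, 0.0], [0.0, -5.0, 0.0]]
# Conformation B: four neighbours all on the same side of the atom.
one_sided = [[5.0, 0.0, 0.0], [5.0, 1.0, 0.0], [5.0, -1.0, 0.0], [5.0, 0.0, 1.0]]

nc_a = nc_burial(center, surrounded)
nc_b = nc_burial(center, one_sided)
nv_a = nv_burial(center, surrounded)
nv_b = nv_burial(center, one_sided)

println("NC: ", nc_a, " vs ", nc_b)   # identical counts
println("NV: ", nv_a, " vs ", nv_b)   # clearly different magnitudes
```

Both conformations have the same neighbour count, while the summed vector cancels out for the surrounded atom and grows for the one-sided (more exposed) atom.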
1.2 Identification curve
The available identification curves are linear, sigmoid and normalized sigmoid (in NV algorithms only). Note that the choice of identification curve controls the amount of distance information considered for the calculation of the burial degree: linear identification curves incorporate more distance information than sigmoid identification curves (the normalized sigmoid, in NV algorithms, uses the least distance information, similar to NC algorithms).
ProtoSyn.Peptides.Calculators.Caterpillar.linear — Function
linear(distance::T; rmax::T = 9.0, sc::T = 1.0) where {T <: AbstractFloat}
Returns the linear identification curve weight w1 (max at distance 0.0, linearly decreasing until rmax). Note that slope control sc has no effect.
Examples
julia> ProtoSyn.Peptides.Calculators.Caterpillar.linear(10.0, rmax = 20.0)
0.5
ProtoSyn.Peptides.Calculators.Caterpillar.sigmoid — Function
sigmoid(distance::T; rmax::T = 9.0, sc::T = 1.0) where {T <: AbstractFloat}
Returns the sigmoid identification curve weight w1 (max at distance 0.0, min at distance rmax). The slope control sc defines how sharp the sigmoid curve is.
Examples
julia> ProtoSyn.Peptides.Calculators.Caterpillar.sigmoid(10.0, rmax = 20.0)
0.9999546021312976
ProtoSyn.Peptides.Calculators.Caterpillar.sigmoid_normalized — Function
sigmoid_normalized(distance::T; rmax::T = 9.0, sc::T = 1.0) where {T <: AbstractFloat}
Returns the sigmoid identification curve weight w1 (max at distance 0.0, min at distance rmax), normalized by the given distance (should only be applied in Neighbour Vector (NV) algorithms). The slope control sc defines how sharp the sigmoid curve is.
Examples
julia> ProtoSyn.Peptides.Calculators.Caterpillar.sigmoid_normalized(10.0, rmax = 20.0)
0.09999546021312976
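To make the three curves concrete, the sketch below re-implements them outside ProtoSyn. The `_sketch` functions are hypothetical: their functional forms (including the clamping of the linear curve at zero) are assumptions chosen only because they reproduce the example values documented above, and they may differ from ProtoSyn's own definitions.

```julia
# Hypothetical linear curve: 1 at distance 0, decreasing to 0 at rmax.
# Clamping at 0 beyond rmax is an assumption.
linear_sketch(d::Float64; rmax::Float64 = 9.0) = max(0.0, 1.0 - d / rmax)

# Hypothetical logistic curve centred on rmax. Note that this particular
# form evaluates to 0.5 at d == rmax.
sigmoid_sketch(d::Float64; rmax::Float64 = 9.0, sc::Float64 = 1.0) =
    1.0 / (1.0 + exp(sc * (d - rmax)))

# Hypothetical normalized variant: the sigmoid weight divided by the distance.
sigmoid_normalized_sketch(d::Float64; rmax::Float64 = 9.0, sc::Float64 = 1.0) =
    sigmoid_sketch(d; rmax = rmax, sc = sc) / d

w_linear     = linear_sketch(10.0; rmax = 20.0)
w_sigmoid    = sigmoid_sketch(10.0; rmax = 20.0)
w_normalized = sigmoid_normalized_sketch(10.0; rmax = 20.0)

println(w_linear)       # compare with the documented linear example
println(w_sigmoid)      # compare with the documented sigmoid example
println(w_normalized)   # compare with the documented sigmoid_normalized example
```

At half of rmax the linear curve has already dropped to half its maximum, while the sigmoid sketch is still close to 1: this is the sense in which linear curves carry more distance information.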
1.3 Selection atom
This can be any selection that yields an atom. However, in most cases applied to proteins, either the Cα or Cβ atoms should be chosen.
1.4 rmax cutoff
This can be any float number; however, the range between 9.0Å and 50.0Å was identified as yielding the best results. Note that short rmax values cause a much more localized identification of burial degrees (i.e.: in comparison with the rest of the local secondary structure) while larger rmax values identify a more global level of burial (i.e.: in comparison with all amino acids in the structure).
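The local versus global effect of rmax can be illustrated with a toy chain. This sketch is not tied to ProtoSyn: it assumes 20 points on a straight line, spaced 3.8 Å apart (roughly the distance between consecutive Cα atoms), and the helper `count_within` is hypothetical.

```julia
# Toy 1D "chain": 20 positions spaced 3.8 Å apart (assumed geometry).
positions = [3.8 * i for i in 0:19]

# Hypothetical helper: number of other points within rmax of point `idx`.
function count_within(positions::Vector{Float64}, idx::Int, rmax::Float64)
    return count(j -> j != idx && abs(positions[j] - positions[idx]) <= rmax,
                 eachindex(positions))
end

middle = 11  # the point at 38.0 Å
local_count  = count_within(positions, middle, 9.0)
global_count = count_within(positions, middle, 50.0)

println("rmax =  9.0 Å: ", local_count, " neighbours")
println("rmax = 50.0 Å: ", global_count, " neighbours")
```

With the short cut-off only the nearest positions along the chain are counted, whereas the long cut-off includes every other point in this small system, so the burial degree reflects the whole structure.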
1.5 Slope control (sc)
This value controls the slope in sigmoid identification curves (only). Lower values yield less pronounced slopes (therefore taking more distance information into consideration) while higher sc values define stricter cut-offs.
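The effect of sc can be seen by evaluating a sigmoid just inside and just outside the cut-off. The logistic form below is an assumption (a hypothetical `sigmoid_sketch`, not ProtoSyn's sigmoid), used only to show the trend.

```julia
# Hypothetical logistic identification curve centred on rmax (assumed form).
sigmoid_sketch(d::Float64; rmax::Float64 = 9.0, sc::Float64 = 1.0) =
    1.0 / (1.0 + exp(sc * (d - rmax)))

rmax = 9.0
inside, outside = rmax - 2.0, rmax + 2.0

# Low sc: gentle slope, weights change gradually with distance.
soft_inside  = sigmoid_sketch(inside;  rmax = rmax, sc = 0.5)
soft_outside = sigmoid_sketch(outside; rmax = rmax, sc = 0.5)

# High sc: steep slope, behaving almost like a hard cut-off at rmax.
sharp_inside  = sigmoid_sketch(inside;  rmax = rmax, sc = 5.0)
sharp_outside = sigmoid_sketch(outside; rmax = rmax, sc = 5.0)

println("sc = 0.5: ", soft_inside, " / ", soft_outside)
println("sc = 5.0: ", sharp_inside, " / ", sharp_outside)
```

With the higher sc the two weights sit close to 1 and 0 respectively, so the exact distance matters little once an atom is on one side of rmax.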
2. Hydrophobicity weight calculation
The final Caterpillar solvation energy multiplies the calculated burial degree by a hydrophobicity weight: hydrophobic Residue instances receive a penalty when exposed and vice-versa. The buried/unburied distinction is defined by the user through the Ω field: burial degrees below Ω are considered buried. The difference between the burial degree and Ω is then multiplied by the hydrophobicity index. By default, ProtoSyn employs the Doolittle Hydrophobicity Index when working with proteins and peptides.
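As a worked illustration of this step, the sketch below multiplies the difference between a burial degree and Ω by a hydrophobicity index. It is a simplified reading of the paragraph above, not one of ProtoSyn's hydrophobicity_weight functions: `solvation_term` is a hypothetical name, the sign convention follows this section (burial degrees below Ω count as buried), and the two map entries are illustrative values from the Kyte-Doolittle scale, so ProtoSyn's own map may use different keys or scaling.

```julia
# Illustrative hydrophobicity indexes (Kyte-Doolittle scale values);
# the keys and scaling of ProtoSyn's own map are not assumed here.
hydrophobicity_sketch = Dict("ILE" => 4.5, "ARG" => -4.5)

# Hypothetical energy term: (burial degree - Ω) * hydrophobicity index.
solvation_term(Ωi::Float64, Ω::Float64, h::Float64) = (Ωi - Ω) * h

Ω = 24.0
buried, exposed = 20.0, 30.0   # below Ω counts as buried, above as exposed

e_ile_buried  = solvation_term(buried,  Ω, hydrophobicity_sketch["ILE"])
e_ile_exposed = solvation_term(exposed, Ω, hydrophobicity_sketch["ILE"])
e_arg_buried  = solvation_term(buried,  Ω, hydrophobicity_sketch["ARG"])
e_arg_exposed = solvation_term(exposed, Ω, hydrophobicity_sketch["ARG"])

println("ILE buried: ", e_ile_buried, ", exposed: ", e_ile_exposed)
println("ARG buried: ", e_arg_buried, ", exposed: ", e_arg_exposed)
```

Under these assumptions a buried hydrophobic residue lowers the energy and an exposed one raises it, with the opposite signs for a hydrophilic residue.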
ProtoSyn.Peptides.Calculators.Caterpillar.get_default_caterpillar_solvation_energy — Function
get_default_caterpillar_solvation_energy(;[α::T = 1.0]) where {T <: AbstractFloat}
Return the default Caterpillar solvation energy EnergyFunctionComponent. α sets the component weight (on an EnergyFunction instance, 1.0 by default). This function employs neighbour_vector as the :calc function.
Settings
- hydrophobicity_weight::Function - Returns the hydrophobicity weight of a given burial degree based on a hydrophobicity map value and default buried state;
- identification_curve::Function - Returns the burial degree weight given an inter-atomic distance, rmax and slope control value;
- Ω::Union{Int, T} - The default buried state (the average, burial degrees above this value are considered buried);
- sc::T = 1.0 - Slope control, used in sigmoid identification curves to control how sharp to consider an interaction as buried or not;
- hydrophobicity_map::Dict{String, T} - A 1-on-1 map between Residue types and hydrophobicity indexes;
- rmax::T - The cutoff value to consider some inter-atomic interaction as buriable.
Examples
julia> ProtoSyn.Peptides.Calculators.Caterpillar.get_default_caterpillar_solvation_energy()
🞧 Energy Function Component:
+---------------------------------------------------+
| Name | Caterpillar_Solv |
| Alpha (α) | 1.0 |
| Update forces | false |
| Calculator | neighbour_vector |
+---------------------------------------------------+
| +----------------------------------------------------------------------------------+
├── ● Settings | Value |
| +----------------------------------------------------------------------------------+
| | hydrophobicity_weight | nv_scalling_all_contributions |
| | identification_curve | sigmoid |
| | Ω | 24.0 |
| | sc | 1.0 |
| | hydrophobicity_map | Dict{String, Float64}(21 components) |
| | rmax | 9.0 |
| +----------------------------------------------------------------------------------+
|
└── ● Selection:
└── BinarySelection ❯ | "or" (Atom)
├── FieldSelection › Atom.name = CB
└── BinarySelection ❯ & "and" (Atom)
├── FieldSelection › Atom.name = HA2
└── FieldSelection › Residue.name = GLY
Figure 2 | A diagram representation of the Caterpillar Solvation EnergyFunctionComponent. The energy penalty is proportional to the hydrophobicity value (in the hydrophobicity map) multiplied by the excess number of Cα contacts (above Ω).