Caterpillar Solvation

The Caterpillar Solvation energy is a coarse-grained approximation to a solvation energy contribution. It penalizes the exposure of hydrophobic Residue instances and the burial of hydrophilic Residue instances, and rewards the opposite cases.

The Caterpillar solvation energy calculation is based on the work of Coluzza et al (see this paper).

ProtoSyn's modifications introduce a significant degree of complexity, further explained below. In summary, the calculation takes 2 steps: (1) the burial degree calculation and (2) the hydrophobicity weight calculation.

1. Burial degree calculation

In this step, a given algorithm loops over all residues (selecting a given atom of each for the distance matrix calculation) and identifies the burial degree of each residue. This step can be parametrized by (1) the burial degree algorithm, (2) the identification curve, (3) the selection atom, (4) the rmax cut-off and (5) the slope control sc. These settings are further explained below.

1.1 Burial degree algorithm

ProtoSyn offers 2 different burial degree identification algorithms: Neighbour Count (NC) and Neighbour Vector (NV) (as explained further in this paper). NC algorithms only take into consideration the number of selected atoms within a defined rmax range, while NV algorithms also take into consideration the orientation of neighbouring residues to define the burial degree. As such, NV algorithms are in general more precise, at an additional performance cost.

ProtoSyn.Peptides.Calculators.Caterpillar.neighbour_count - Function
ProtoSyn.Peptides.Calculators.Caterpillar.neighbour_count([::A], pose::Pose, selection::Opt{AbstractSelection}, [update_forces::Bool = false]; [identification_curve::Function = null_identification_curve], [hydrophobicity_weight::Function = null_hydrophobicity_weight], [rmax::T = 9.0], [sc::T = 1.0], [Ω::Union{Int, T} = 750.0], [hydrophobicity_map::Dict{String, T} = ProtoSyn.Peptides.doolitle_hydrophobicity]) where {A, T <: AbstractFloat}

Calculate the caterpillar solvation energy of the given Pose pose using the Neighbour Count (NC) algorithm (see this article). If an AbstractSelection selection is provided, consider only the selected Atom instances (any given selection will be promoted to be of Atom type, see ProtoSyn.promote). In this model, the burial degree Ωi of each atom is equal to the number (count) of neighbouring Atom instances (within the defined rmax cut-off, in Angstrom Å), each multiplied by a w1 weight provided by the identification_curve Function. This Function receives the distance between each pair of neighbouring atoms (as a float), the rmax value and, optionally, a slope control sc value, and returns a weight w1 (as a float). The identification_curve signature is as follows:

identification_curve(distance::T; rmax::T = 9.0, sc::T = 1.0) where {T <: AbstractFloat}

In order to use pre-defined identification_curve Function instances defined in ProtoSyn, check linear and sigmoid.
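Any function following this signature should, in principle, be usable as an identification_curve. As a minimal sketch, a hard cut-off curve that discards all distance information inside rmax could look like the block below. The name `step_curve` is hypothetical and not part of ProtoSyn, and the example assumes Float64 inputs so that the keyword defaults match the type parameter T.

```julia
# Hypothetical user-defined identification curve (not part of ProtoSyn).
# It follows the documented signature: a distance, plus `rmax` and `sc` keywords.
function step_curve(distance::T; rmax::T = 9.0, sc::T = 1.0) where {T <: AbstractFloat}
    # Full weight inside the cut-off, no weight outside.
    # `sc` is accepted for signature compatibility but is not used.
    return distance <= rmax ? one(T) : zero(T)
end

w_inside  = step_curve(5.0)                # within the default rmax of 9.0 Å
w_outside = step_curve(12.0)               # beyond the default rmax
w_custom  = step_curve(12.0, rmax = 20.0)  # within a larger, user-defined rmax

println((w_inside, w_outside, w_custom))
```

With such a curve every neighbour inside rmax contributes equally, so the burial degree reduces to a plain neighbour count.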

Note that:

  • A shorter rmax value identifies buried residues in the local environment (i.e.: in the scale of the secondary structure, recommended) while a larger rmax value identifies buried residues in the global scale (i.e.: in comparison with the whole structure).
  • The slope control sc value only has an effect in sigmoid identification_curve Function instances. A smaller value increases the prevalence of distance information in the w1 weight calculation, while a larger value defines a stricter cut-off (recommended).

An Atom is, therefore, considered buried if the number of neighbours (weighted by w1) is over a defined cut-off value Ω. Buried hydrophobic amino acids receive an energetic reward, while exposed hydrophobic Residue instances receive a penalty (and vice-versa for hydrophilic amino acids). The magnitude of this contribution is defined in the provided hydrophobicity_map (hydrophobicity map examples can be found in Peptides.constants.jl) and multiplied by w2, calculated by the hydrophobicity_weight Function. This Function receives the neighbour count Ωi, the hydrophobicity_map_value and the cut-off value Ω, returning a w2 weight (as a float). The hydrophobicity_weight signature is as follows:

hydrophobicity_weight(Ωi::Union{Int, T}; hydrophobicity_map_value::T = 0.0, Ω::Union{Int, T} = 0.0) where {T <: AbstractFloat}

In order to use pre-defined hydrophobicity_weight Function instances defined in ProtoSyn, check nc_scalling_exposed_only, nc_non_scalling_exposed_only, nc_scalling_all_contributions (recommended) and nc_non_scalling_all_contributions.

The optional A parameter defines the acceleration mode used (SISD_0, SIMD_1 or CUDA_2). If left undefined, the default ProtoSyn.acceleration.active mode will be used. This function does not calculate forces (not applicable), and therefore the update_forces flag serves solely for consistency with other energy-calculating functions.

See also

neighbour_vector

Examples

julia> ProtoSyn.Peptides.Calculators.Caterpillar.neighbour_count(pose, false)
(0.0, nothing)
ProtoSyn.Peptides.Calculators.Caterpillar.neighbour_vector - Function
ProtoSyn.Peptides.Calculators.Caterpillar.neighbour_vector([::A], pose::Pose, selection::Opt{AbstractSelection}, [update_forces::Bool = false]; [identification_curve::Function = null_identification_curve], [hydrophobicity_weight::Function = null_hydrophobicity_weight], [rmax::T = 9.0], [sc::T = 1.0], [Ω::Union{Int, T} = 750.0], [hydrophobicity_map::Dict{String, T} = ProtoSyn.Peptides.doolitle_hydrophobicity]) where {A, T <: AbstractFloat}

Calculate the caterpillar solvation energy of the given Pose pose using the Neighbour Vector (NV) algorithm (see this article). If an AbstractSelection selection is provided, consider only the selected Atom instances (any given selection will be promoted to be of Atom type, see ProtoSyn.promote). In this model, vectors ωi are defined between an Atom and all neighbouring Atom instances within the defined rmax cut-off (in Angstrom Å). Note that only the atoms selected by the selection are considered. A resulting vector Ωi is calculated by summing all ωi vectors, each multiplied by a w1 weight provided by the identification_curve Function. This Function receives the distance between each pair of neighbouring atoms (as a float), the rmax value and, optionally, a slope control sc value, and returns a weight w1 (as a float). The identification_curve signature is as follows:

identification_curve(distance::T; rmax::T = 9.0, sc::T = 1.0) where {T <: AbstractFloat}

In order to use pre-defined identification_curve Function instances defined in ProtoSyn, check linear, sigmoid and sigmoid_normalized.

Note that:

  • A shorter rmax value identifies buried residues in the local environment (i.e.: in the scale of the secondary structure, recommended) while a larger rmax value identifies buried residues in the global scale (i.e.: in comparison with the whole structure).
  • The slope control sc value only has an effect in sigmoid identification_curve Function instances. A smaller value increases the prevalence of distance information in the w1 weight calculation, while a larger value defines a stricter cut-off (recommended).

An Atom is, therefore, considered buried if the magnitude of the resulting vector from the sum of the ωi (multiplied by w1) is within a defined cut-off value Ω. Buried hydrophobic amino acids receive an energetic reward, while exposed hydrophobic Residue instances receive a penalty (and vice-versa for hydrophilic amino acids). The magnitude of this contribution is defined in the provided hydrophobicity_map (hydrophobicity map examples can be found in Peptides.constants.jl) and multiplied by w2, calculated by the hydrophobicity_weight Function. This Function receives the vector magnitude Ωi, the hydrophobicity_map_value and the cut-off value Ω, returning a w2 weight (as a float). The hydrophobicity_weight signature is as follows:

hydrophobicity_weight(Ωi::Union{Int, T}; hydrophobicity_map_value::T = 0.0, Ω::Union{Int, T} = 0.0) where {T <: AbstractFloat}

In order to use pre-defined hydrophobicity_weight Function instances defined in ProtoSyn, check nv_scalling_exposed_only, nv_non_scalling_exposed_only, nv_scalling_all_contributions (recommended) and nv_non_scalling_all_contributions.

The optional A parameter defines the acceleration mode used (SISD_0, SIMD_1 or CUDA_2). If left undefined, the default ProtoSyn.acceleration.active mode will be used. This function does not calculate forces (not applicable), and therefore the update_forces flag serves solely for consistency with other energy-calculating functions.

See also

neighbour_count

Examples

julia> ProtoSyn.Peptides.Calculators.Caterpillar.neighbour_vector(pose, false)
(0.0, nothing)


Figure 1 | Visualization of neighbour_count and neighbour_vector algorithms for burial degree calculation. A - Neighbour count - Since only the number of selected atoms within the rmax cutoff is accounted for, both depicted conformations are measured as having the same burial degree, even though atom B is more exposed to the system's solvent. B - Neighbour vector - Using the neighbour vector algorithm, the resulting vector from summing all individual vectors from the selected atom towards all neighbouring selected atoms (within rmax) provides a clearer distinction between exposed and non-exposed Atom instances: the magnitude of the resulting vector is linearly and positively correlated with how exposed the atom's surface is. Note that, on average, the NV algorithm was measured to be 4-5x slower than NC.
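The distinction drawn in Figure 1 can be reproduced on toy coordinates. The sketch below is not ProtoSyn code: the helper names `nc` and `nv` and the 2D coordinates are illustrative assumptions, and the w1 weight is fixed at 1 for simplicity. One atom is surrounded by four neighbours, the other has four neighbours on one side only.

```julia
using LinearAlgebra  # provides `norm`

# Toy 2D coordinates (in Å). All names here are illustrative, not ProtoSyn API.
atom        = [0.0, 0.0]
surrounding = [[4.0, 0.0], [-4.0, 0.0], [0.0, 4.0], [0.0, -4.0]]  # buried case
one_sided   = [[4.0, 0.0], [3.0, 3.0], [3.0, -3.0], [2.0, 0.0]]   # exposed case

# Neighbour count: number of neighbours within rmax (w1 = 1 for every neighbour).
nc(atom, others; rmax = 9.0) = count(o -> norm(o - atom) <= rmax, others)

# Neighbour vector: magnitude of the sum of the vectors pointing from the atom
# towards each neighbour within rmax (again with w1 = 1).
function nv(atom, others; rmax = 9.0)
    total = zeros(length(atom))
    for o in others
        v = o - atom
        if norm(v) <= rmax
            total += v
        end
    end
    return norm(total)
end

println("NC: buried = ", nc(atom, surrounding), ", exposed = ", nc(atom, one_sided))
println("NV: buried = ", nv(atom, surrounding), ", exposed = ", nv(atom, one_sided))
```

Both cases have a neighbour count of 4, so NC cannot tell them apart. The vectors cancel out in the surrounded case (magnitude 0.0) and add up in the one-sided case (magnitude 12.0), which is the extra orientation information that NV uses.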

1.2 Identification curve

The available identification curves are linear, sigmoid and normalized sigmoid (in NV algorithms only). Note that the definition of the identification curve controls the amount of distance information considered for the calculation of the burial degree: linear identification curves incorporate more distance information than sigmoid identification curves (the normalized sigmoid, in NV algorithms, uses the least distance information, similar to NC algorithms).
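To make the shapes of the three curves concrete, the sketch below uses closed forms that decrease with distance and reproduce the example outputs documented for linear, sigmoid and sigmoid_normalized. These forms are a reconstruction from those outputs, not ProtoSyn's source code, and other expressions may fit the same values; the `_sketch` names are hypothetical.

```julia
# Hypothetical re-implementations inferred from the documented example outputs.
# Intended for distances in the range 0 < d <= rmax.
linear_sketch(d; rmax = 9.0) = 1.0 - d / rmax

# Assumed sigmoid form, centred on rmax; `sc` controls the steepness.
sigmoid_sketch(d; rmax = 9.0, sc = 1.0) = 1.0 - 1.0 / (1.0 + exp(-sc * (d - rmax)))

# Assumed normalization: the sigmoid weight divided by the distance.
sigmoid_normalized_sketch(d; rmax = 9.0, sc = 1.0) =
    sigmoid_sketch(d; rmax = rmax, sc = sc) / d

for d in (2.0, 10.0, 18.0)
    println("d = ", d,
            "  linear = ", linear_sketch(d, rmax = 20.0),
            "  sigmoid = ", sigmoid_sketch(d, rmax = 20.0),
            "  normalized = ", sigmoid_normalized_sketch(d, rmax = 20.0))
end
```

Under these assumptions the linear weight changes steadily across the whole range, while the sigmoid weight stays close to 1 for most distances and only drops near rmax, which is why it carries less distance information.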

ProtoSyn.Peptides.Calculators.Caterpillar.linear - Function
linear(distance::T; rmax::T = 9.0, sc::T = 1.0) where {T <: AbstractFloat}

Returns the linear identification curve weight w1 (max at distance 0.0, linearly decreasing until rmax). Note that slope control sc has no effect.

See also

sigmoid sigmoid_normalized

Examples

julia> ProtoSyn.Peptides.Calculators.Caterpillar.linear(10.0, rmax = 20.0)
0.5
ProtoSyn.Peptides.Calculators.Caterpillar.sigmoid - Function
sigmoid(distance::T; rmax::T = 9.0, sc::T = 1.0) where {T <: AbstractFloat}

Returns the sigmoid identification curve weight w1 (max at distance 0.0, min at distance rmax). The slope control sc defines how sharp the sigmoid curve is.

See also

linear sigmoid_normalized

Examples

julia> ProtoSyn.Peptides.Calculators.Caterpillar.sigmoid(10.0, rmax = 20.0)
0.9999546021312976
ProtoSyn.Peptides.Calculators.Caterpillar.sigmoid_normalized - Function
sigmoid_normalized(distance::T; rmax::T = 9.0, sc::T = 1.0) where {T <: AbstractFloat}

Returns the sigmoid identification curve weight w1 (max at distance 0.0, min at distance rmax), normalized by (divided by) the given distance. It should only be applied in Neighbour Vector (NV) algorithms. The slope control sc defines how sharp the sigmoid curve is.

See also

linear sigmoid

Examples

julia> ProtoSyn.Peptides.Calculators.Caterpillar.sigmoid_normalized(10.0, rmax = 20.0)
0.09999546021312976

1.3 Selection atom

This can be any selection that yields an atom. However, in most cases applied to proteins, either the Cα or Cβ atoms should be chosen.

1.4 rmax cutoff

This can be any float number. However, the range between 9.0 Å and 50.0 Å was identified as yielding the best results. Note that short rmax values cause a much more localized identification of burial degrees (i.e.: in comparison with the rest of the local secondary structure) while larger rmax values identify a more global level of burial (i.e.: in comparison with all amino acids in the structure).

1.5 Slope control (sc)

This value controls the slope in sigmoid identification curves (only). Lower sc values yield less pronounced slopes (therefore taking more distance information into consideration) while higher sc values define stricter cut-offs.
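The effect of sc can be illustrated by evaluating a sigmoid just inside and just outside the cut-off. The sketch below assumes a sigmoid centred on rmax, a form that is consistent with the documented sigmoid example output but is not taken from ProtoSyn's source; `sigmoid_sketch` is a hypothetical name.

```julia
# Assumed sigmoid form (centred on rmax); not ProtoSyn's own function.
sigmoid_sketch(d; rmax = 9.0, sc = 1.0) = 1.0 - 1.0 / (1.0 + exp(-sc * (d - rmax)))

for sc in (0.5, 1.0, 5.0)
    inside  = sigmoid_sketch(8.0; sc = sc)   # 1 Å inside the default rmax of 9.0 Å
    outside = sigmoid_sketch(10.0; sc = sc)  # 1 Å outside the default rmax
    println("sc = ", sc,
            "  w1(8 Å) = ", round(inside, digits = 3),
            "  w1(10 Å) = ", round(outside, digits = 3))
end
```

With a low sc the two weights stay close to each other, so the exact distance matters over a wide range. With a high sc the weight is nearly 1 inside the cut-off and nearly 0 outside, approaching a hard threshold.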

2. Hydrophobicity weight calculation

The final Caterpillar solvation energy multiplies the calculated burial degree by a hydrophobicity weight: hydrophobic Residue instances receive a penalty when exposed and vice-versa. The buried/unburied distinction is defined by the user using the Ω field: burial degrees below Ω are considered buried. The difference between the burial degree and Ω is then multiplied by the hydrophobicity index. By default, ProtoSyn employs the Doolittle Hydrophobicity Index when working with proteins and peptides.
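As a rough numerical illustration of this step, the sketch below multiplies the difference between a burial degree and Ω by a hydrophobicity index. Several things here are assumptions rather than ProtoSyn behaviour: the helper `residue_energy` is hypothetical, the sign convention (positive energy is a penalty) is chosen to match the description above, and the hydropathy values are taken from the Kyte-Doolittle scale, which may be scaled or defined differently in ProtoSyn's own map.

```julia
# Illustrative Kyte-Doolittle hydropathy values (positive = hydrophobic).
# ProtoSyn's own hydrophobicity map may differ.
hydropathy = Dict("ILE" => 4.5, "LEU" => 3.8, "LYS" => -3.9, "ASP" => -3.5)

# Hypothetical helper. Assumed convention: burial degrees below Ω count as
# buried, and a positive energy is a penalty.
residue_energy(Ωi, Ω, h) = (Ωi - Ω) * h

Ω = 24.0  # cut-off between buried and exposed states
for (name, Ωi) in [("ILE", 30.0), ("ILE", 18.0), ("LYS", 30.0), ("LYS", 18.0)]
    state = Ωi < Ω ? "buried" : "exposed"
    println(name, " (", state, "): ", residue_energy(Ωi, Ω, hydropathy[name]))
end
```

Under these assumptions an exposed isoleucine is penalized and a buried one is rewarded, while the opposite holds for lysine. The contribution also grows with the distance of the burial degree from Ω.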

ProtoSyn.Peptides.Calculators.Caterpillar.get_default_caterpillar_solvation_energy - Function
get_default_caterpillar_solvation_energy(;[α::T = 1.0]) where {T <: AbstractFloat}

Return the default Caterpillar solvation energy EnergyFunctionComponent. α sets the component weight (on an EnergyFunction instance, 1.0 by default). This function employs neighbour_vector as the :calc function.

Settings

  • hydrophobicity_weight::Function - Returns the hydrophobicity weight of a given burial degree based on a hydrophobicity map value and default buried state;
  • identification_curve::Function - Returns the burial degree weight given an inter-atomic distance, rmax and slope control value;
  • Ω::Union{Int, T} - The default buried state (the average, buried degrees above this value are considered buried);
  • sc::T = 1.0 - Slope control, used in sigmoid identification curves to control how sharp to consider an interaction as buried or not;
  • hydrophobicity_map::Dict{String, T} - A one-to-one map between Residue types and hydrophobicity indexes;
  • rmax::T - The cutoff value to consider some inter-atomic interaction as buriable.

Examples

julia> ProtoSyn.Peptides.Calculators.Caterpillar.get_default_caterpillar_solvation_energy()
🞧  Energy Function Component:
+---------------------------------------------------+
| Name           | Caterpillar_Solv                 |
| Alpha (α)      | 1.0                              |
| Update forces  | false                            |
| Calculator     | neighbour_vector                 |
+---------------------------------------------------+
 |    +----------------------------------------------------------------------------------+
 ├──  ● Settings                      | Value                                            |
 |    +----------------------------------------------------------------------------------+
 |    | hydrophobicity_weight         | nv_scalling_all_contributions                    |
 |    | identification_curve          | sigmoid                                          |
 |    | Ω                             | 24.0                                             |
 |    | sc                            | 1.0                                              |
 |    | hydrophobicity_map            | Dict{String, Float64}(21 components)             |
 |    | rmax                          | 9.0                                              |
 |    +----------------------------------------------------------------------------------+
 |    
 └──  ●  Selection:
      └── BinarySelection ❯  | "or" (Atom)
           ├── FieldSelection › Atom.name = CB
           └── BinarySelection ❯  & "and" (Atom)
                ├── FieldSelection › Atom.name = HA2
                └── FieldSelection › Residue.name = GLY


Figure 2 | A diagram representation of the Caterpillar Solvation EnergyFunctionComponent. The energy penalty is proportional to the hydrophobicity value (in the hydrophobicity map) multiplied by the excess number of Cα contacts (above Ω).