Skip to content

Custom analysis function for water bridge analysis #2198

@xiki-tempula

Description

@xiki-tempula

In PR #2087, I have introduced custom analysis functions to customise the behaviour of count_by_time, count_by_type, which hasn't been merged yet. Following the discussion in #2177, it is suggested that it is better to embed the heavy atom information when detecting hydrogen bonds to avoid further complication #1687.

I have changed the internal representation of water bridge analysis to the format suggested in #2177, which now has the form (selection1_index, selection1_heavy_atom_index, selection2_index, selection2_heavy_atom_index, distance, angle). However, this representation creates a little problem to the implementation of Custom analysis function for (e.g.) count_by_type. I wonder if I can have some suggestions as to how to proceed.

Suppose we want to tabulate the water bridges between selection 1 and selection 2 at residue level, distinguishing different orders of water bridge and only selecting the water bridges originating from residue id 233. We will write the analysis function in this format:

def analysis(current, output):
    # decompose the first hydrogen bond.
    s1_index, to_index, (s1_resname, s1_resid, s1_name),
                                    (to_resname, to_resid, to_name), dist, angle = current[0]
    # decompose the last hydrogen bond.
    from_index, s2_index, (from_resname, from_resid, from_name),
                                         (s2_resname, s2_resid, s2_name), dist, angle = current[-1]
    # if the residue id is 233
    if s1_resid == 233:
        # compute order of water bridge
        order = len(current) - 1
        key = (s1_resname, s1_resid, s2_resname, s2_resid, order)
        # The number of this type of water bridge is 1.
        output[key] = 1
w.count_by_type(analysis_func=analysis)

The input format is the build in a way similar to timeseries and has the problem of being WET and doesn't include heavy atom information. I'm thinking of replacing the input with the new DRY format (selection1_index, selection1_heavy_atom_index, selection2_index, selection2_heavy_atom_index, distance, angle). However, to achieve the same goal as the previous analysis function, we will need to include universe in the input paramaters and the analysis function might become:

def analysis(current, output, u):
    # decompose the first hydrogen bond.
    (selection1_index, selection1_heavy_atom_index, 
     to_index, to_index, 
     distance, angle) = current[0]
    (from_index, from_heavy_atom_index, 
     selection2_index, selection2_heavy_atom_index, 
     distance, angle) = current[-1]
    # if the residue id is 233
    if u.atoms[selection1_index].resid == 233:
        # compute order of water bridge
        order = len(current) - 1
        key = (u.atoms[selection1_index].resname, u.atoms[selection1_index].resid, 
                   u.atoms[selection2_index].resname, u.atoms[selection2_index].resid, order)
        # The number of this type of water bridge is 1.
        output[key] = 1
w.count_by_type(analysis_func=analysis)

I'm not sure if this is the best way to proceed. Thank you in advance for the advice.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions