FATCAT Alignment Task

FATCAT (Flexible structure AlignmenT by Chaining Aligned fragment pairs allowing Twists)

Protein structures are flexible and undergo structural rearrangements as part of their function. FATCAT is an approach for flexible protein structure comparison. It simultaneously addresses the two major goals of flexible structure alignment: optimizing the alignment and minimizing the number of rigid-body movements (twists) around pivot points (hinges) introduced in the reference structure. In FATCAT, structure alignment is formulated as the chaining of aligned fragment pairs (AFPs), allowing at most t twists. When t is forced to be 0, the flexible structure alignment reduces to a rigid structure alignment. Dynamic programming is used to find the optimal chaining.
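The chaining idea can be illustrated with a small dynamic program. This is a simplified sketch and not FATCAT's actual scoring scheme: the `AFP` record, its `frame` label (a stand-in for "these fragments fit the same rigid-body superposition"), and the `twist_penalty` value are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class AFP(NamedTuple):
    """Hypothetical aligned fragment pair used only for this sketch."""
    start_a: int   # start residue index in structure A
    start_b: int   # start residue index in structure B
    length: int    # number of residues in the fragment
    score: float   # score contributed by this fragment
    frame: int     # assumed label of the rigid-body superposition it fits


def chain_afps(afps: List[AFP], max_twists: int, twist_penalty: float = 5.0) -> float:
    """Return the best chain score using at most `max_twists` twists."""
    order = sorted(afps, key=lambda f: (f.start_a, f.start_b))
    n = len(order)
    if n == 0:
        return 0.0
    # best[k][i]: best score of a chain ending at AFP i with at most k twists.
    best = [[float("-inf")] * n for _ in range(max_twists + 1)]
    for i, cur in enumerate(order):
        for k in range(max_twists + 1):
            best[k][i] = cur.score  # a chain may start at any AFP
            for j in range(i):
                prev = order[j]
                # The previous fragment must end before the current one
                # starts in both structures.
                if prev.start_a + prev.length > cur.start_a:
                    continue
                if prev.start_b + prev.length > cur.start_b:
                    continue
                if prev.frame == cur.frame:
                    cand = best[k][j] + cur.score  # rigid connection
                elif k > 0:
                    # Connecting different frames consumes one twist.
                    cand = best[k - 1][j] + cur.score - twist_penalty
                else:
                    continue
                best[k][i] = max(best[k][i], cand)
    return max(best[max_twists])


afps = [
    AFP(0, 0, 8, 10.0, frame=0),
    AFP(10, 10, 8, 10.0, frame=0),
    AFP(20, 22, 8, 10.0, frame=1),
]
rigid_score = chain_afps(afps, max_twists=0)
flexible_score = chain_afps(afps, max_twists=1)
```

With `max_twists=0` only fragments sharing a frame can be chained, which corresponds to the rigid case described above. Allowing one twist lets the third fragment join the chain at the cost of the penalty.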

More information about FATCAT can be found on the official FATCAT website.

protein_metamorphisms_is.operations.structural_alignment_tasks.fatcat.align_task(alignment_entry, conf)

Executes a protein structure alignment task using FATCAT, preceded by CIF to PDB conversion.

This function aligns a target protein structure against a representative structure using the FATCAT algorithm, which permits hinge-based (flexible) comparison of protein structures. Before alignment, the function converts the CIF files to PDB format with a helper function so that FATCAT can read them. It then extracts the following metrics from the FATCAT output: RMSD, sequence identity, similarity, score, and alignment length.

The function constructs paths for the representative and target structures, converts them to PDB format, and executes the FATCAT binary with these structures as input. It captures and parses the output to extract alignment metrics.
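The parsing step could look roughly like the sketch below. The layout of `SAMPLE_OUTPUT`, the regular expressions, the choice of `opt-rmsd` as the source of `fc_rms`, and the name `parse_fatcat_metrics` are all assumptions made for illustration. The real FATCAT report and the parser used by this package may differ.

```python
import re

# Assumed, illustrative report text; not guaranteed to match real FATCAT output.
SAMPLE_OUTPUT = """\
Twists 0 ini-len 136 ini-rmsd 1.50 opt-equ 139 opt-rmsd 1.45 chain-rmsd 1.50 Score 350.92 align-len 146 gaps 7 (4.79%)
P-value 0.00e+00 Afp-num 9640 Identity 41.78% Similarity 59.59%
"""

# Result key -> (regular expression with one capture group, type converter).
_PATTERNS = {
    "fc_rms": (r"opt-rmsd\s+([\d.]+)", float),
    "fc_identity": (r"Identity\s+([\d.]+)%", float),
    "fc_similarity": (r"Similarity\s+([\d.]+)%", float),
    "fc_score": (r"Score\s+([\d.]+)", float),
    "fc_align_len": (r"align-len\s+(\d+)", int),
}


def parse_fatcat_metrics(text: str) -> dict:
    """Extract the alignment metrics from captured FATCAT output (hypothetical helper)."""
    metrics = {}
    for key, (pattern, cast) in _PATTERNS.items():
        match = re.search(pattern, text)
        if match is None:
            # Fail loudly so the caller can report an error instead of
            # returning a partially filled result.
            raise ValueError(f"metric {key!r} not found in FATCAT output")
        metrics[key] = cast(match.group(1))
    return metrics


metrics = parse_fatcat_metrics(SAMPLE_OUTPUT)
```

Raising on a missing metric, rather than silently skipping it, keeps an incomplete report from being stored as a successful alignment.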

In case of successful execution, it returns these metrics encapsulated in a result dictionary. If FATCAT encounters an error or if an exception occurs during the process, the function logs detailed error information including a traceback and returns an error object containing the error message.
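The success and failure paths described above can be sketched as follows. The helper name `run_with_error_capture` and its `compute_metrics` callback are hypothetical. Only the shape of the returned tuple and the dictionary keys are taken from this documentation.

```python
import logging
import traceback

logger = logging.getLogger(__name__)


def run_with_error_capture(queue_entry_id, cluster_entry_id, compute_metrics):
    """Run `compute_metrics` and always return (queue_entry_id, dict).

    Hypothetical helper: `compute_metrics` stands in for the conversion,
    FATCAT execution, and parsing steps.
    """
    try:
        metrics = compute_metrics()
        return queue_entry_id, {"cluster_entry_id": cluster_entry_id, **metrics}
    except Exception as exc:
        # Log the message together with the full traceback, then report the
        # failure as data instead of propagating the exception.
        logger.error("FATCAT alignment failed: %s\n%s", exc, traceback.format_exc())
        return queue_entry_id, {
            "cluster_entry_id": cluster_entry_id,
            "error_message": str(exc),
        }


def failing_step():
    raise RuntimeError("FATCAT failed")


ok = run_with_error_capture(456, 123, lambda: {"fc_rms": 0.5})
err = run_with_error_capture(456, 123, failing_step)
```

Under this contract a caller can tell the two outcomes apart by checking whether the returned dictionary contains the key 'error_message'.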

Parameters:
  • alignment_entry (object) – An object containing data for the alignment task, including PDB IDs, chain identifiers, and model numbers for both the representative and target structures, as well as the cluster entry ID.

  • conf (dict) – A dictionary containing configuration settings, specifically the paths to the directories where PDB chain files are stored and where the FATCAT binary is located.

Returns:

A tuple containing the queue entry ID of the alignment task and either a result dictionary (with keys 'cluster_entry_id', 'fc_rms', 'fc_identity', 'fc_similarity', 'fc_score', 'fc_align_len') or an error object (with keys 'cluster_entry_id' and 'error_message').

Return type:

tuple

Raises:

Exception – Not propagated to the caller. Any exception that occurs during the process is logged with a traceback, and the function returns an error object containing the error message.

Example

>>> alignment_entry = {
...     'rep_pdb_id': '1A2B',
...     'rep_chains': 'A',
...     'rep_model': 0,
...     'pdb_id': '2B3C',
...     'chains': 'B',
...     'model': 0,
...     'cluster_id': 123,
...     'queue_entry_id': 456
... }
>>> conf = {
...     'pdb_chains_path': '/path/to/pdb_chains',
...     'binaries_path': '/path/to/binaries'
... }
>>> align_task(alignment_entry, conf)
(456, {'cluster_entry_id': 123, 'fc_rms': 0.5, 'fc_identity': 99.0, 'fc_similarity': 98.5,
       'fc_score': 150.0, 'fc_align_len': 250})