deepfold.data.search.templates.extract_template_features

deepfold.data.search.templates.extract_template_features(mmcif_dict: dict, index_mapping: Dict[int, int], query_sequence: str, template_sequence: str, template_pdb_id: str, template_chain_id: str, kalign_executable_path: str, verbose: bool) → Tuple[dict, str | None]

Extracts template features from a single HHSearch hit.

Parameters:
  • mmcif_dict – mmCIF dictionary representing the template (see load_mmcif_dict).

  • index_mapping – Dictionary mapping indices in the query sequence to indices in the template sequence.

  • query_sequence – String describing the amino acid sequence for the query protein.

  • template_sequence – String describing the amino acid sequence for the template protein.

  • template_pdb_id – PDB code for the template.

  • template_chain_id – String ID describing which chain of the structure should be used.

  • kalign_executable_path – The path to a kalign executable used for template realignment.

  • verbose – Whether to print relevant details.
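The shape of index_mapping can be illustrated with a small sketch. It assumes the mapping uses zero-based residue positions and contains only alignment columns where both sequences have a residue; neither detail is stated above. The helper build_index_mapping and the gapped input strings are hypothetical and are not part of deepfold.

```python
from typing import Dict


def build_index_mapping(aligned_query: str, aligned_template: str) -> Dict[int, int]:
    """Hypothetical helper: map query indices to template indices.

    Both inputs are rows of a pairwise alignment, with '-' marking a gap.
    Assumption: indices are zero-based and count residues only, not gaps.
    """
    if len(aligned_query) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")

    mapping: Dict[int, int] = {}
    query_index = 0
    template_index = 0
    for query_char, template_char in zip(aligned_query, aligned_template):
        # Record only columns where both sequences have a residue.
        if query_char != "-" and template_char != "-":
            mapping[query_index] = template_index
        # Advance each counter only when its sequence has a residue here.
        if query_char != "-":
            query_index += 1
        if template_char != "-":
            template_index += 1
    return mapping


# Query "ACDE" aligned against template "AGDE" with one gap in each row.
index_mapping = build_index_mapping("AC-DE", "A-GDE")
print(index_mapping)  # {0: 0, 2: 2, 3: 3}
```

Query residue 1 (C) has no entry because it is aligned to a gap in the template.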

Returns:
  A tuple with

  • A dictionary containing the features derived from the template protein structure.

  • A warning message if the hit was realigned to the actual mmCIF sequence, otherwise None.

Return type:
  Tuple[dict, str | None]

Raises:
  • NoChainsError – If the mmcif_dict doesn’t contain any chains.

  • SequenceNotInTemplateError – If the given chain id / sequence can’t be found in the mmcif_dict.

  • QueryToTemplateAlignError – If the actual template in the mmCIF file can’t be aligned to the query.

  • NoAtomDataInTemplateError – If the mmcif_dict doesn’t contain atom positions.

  • TemplateAtomMaskAllZerosError – If the mmcif_dict doesn’t have any unmasked residues.
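One way a caller might consume the (features, warning) tuple while tolerating the errors listed above is sketched below. To stay self-contained, the sketch does not import deepfold: try_extract, DemoTemplateError, and fake_extract are hypothetical stand-ins, and whether any of the listed exceptions should be skipped rather than propagated is an assumption left to the caller.

```python
from typing import Any, Callable, Optional, Tuple, Type


class DemoTemplateError(Exception):
    """Hypothetical stand-in for one of the exception types listed above."""


def try_extract(
    extract_fn: Callable[..., Tuple[dict, Optional[str]]],
    skippable: Tuple[Type[Exception], ...],
    **kwargs: Any,
) -> Tuple[Optional[dict], Optional[str], Optional[str]]:
    """Call extract_fn and return (features, warning, error).

    Exceptions listed in `skippable` become an error string so that one
    bad hit does not abort a loop over many hits; others propagate.
    """
    try:
        features, warning = extract_fn(**kwargs)
    except skippable as exc:
        return None, None, f"{type(exc).__name__}: {exc}"
    return features, warning, None


def fake_extract(**kwargs: Any) -> Tuple[dict, Optional[str]]:
    """Hypothetical stand-in that mimics only the return shape."""
    if not kwargs.get("template_sequence"):
        raise DemoTemplateError("empty template sequence")
    return {"demo_feature": len(kwargs["template_sequence"])}, None


ok = try_extract(fake_extract, (DemoTemplateError,), template_sequence="AGDE")
bad = try_extract(fake_extract, (DemoTemplateError,), template_sequence="")
print(ok)   # ({'demo_feature': 4}, None, None)
print(bad)  # (None, None, 'DemoTemplateError: empty template sequence')
```

In real use, extract_fn would be extract_template_features and skippable would hold whichever of the exception classes above the caller chooses to treat as recoverable.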