deepfold.data.tools.jackhmmer.Jackhmmer

class deepfold.data.tools.jackhmmer.Jackhmmer(*, binary_path: str, database_path: str, n_cpu: int = 8, n_iter: int = 1, e_value: float = 0.0001, z_value: int | None = None, get_tblout: bool = False, filter_f1: float = 0.0005, filter_f2: float = 5e-05, filter_f3: float = 5e-07, incdom_e: float | None = None, dom_e: float | None = None, num_streamed_chunks: int | None = None, streaming_callback: Callable[[int], None] | None = None)[source]

Python wrapper of the Jackhmmer binary.

__init__(*, binary_path: str, database_path: str, n_cpu: int = 8, n_iter: int = 1, e_value: float = 0.0001, z_value: int | None = None, get_tblout: bool = False, filter_f1: float = 0.0005, filter_f2: float = 5e-05, filter_f3: float = 5e-07, incdom_e: float | None = None, dom_e: float | None = None, num_streamed_chunks: int | None = None, streaming_callback: Callable[[int], None] | None = None)[source]

Initializes the Python Jackhmmer wrapper.

Parameters:
  • binary_path – The path to the jackhmmer executable.

  • database_path – The path to the jackhmmer database (FASTA format).

  • n_cpu – The number of CPUs to give Jackhmmer.

  • n_iter – The number of Jackhmmer iterations.

  • e_value – The E-value threshold; see the Jackhmmer documentation for details.

  • z_value – The Z-value (the number of comparisons assumed when computing E-values); see the Jackhmmer documentation for details.

  • get_tblout – Whether to save the tblout output as a string.

  • filter_f1 – MSV and biased-composition pre-filter threshold; set to a value >1.0 to turn the filter off.

  • filter_f2 – Viterbi pre-filter threshold; set to a value >1.0 to turn the filter off.

  • filter_f3 – Forward pre-filter threshold; set to a value >1.0 to turn the filter off.

  • incdom_e – Domain E-value threshold for including domains in the MSA and in the next round.

  • dom_e – Domain E-value threshold for including domains in the tblout output.

  • num_streamed_chunks – Number of database chunks to stream over.

  • streaming_callback – Callback function run after each chunk iteration with the iteration number as argument.
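
To make the parameters above more concrete, the following sketch shows one way such settings could be translated into jackhmmer command-line flags. The helper build_jackhmmer_flags and its output_sto_path and tblout_path arguments are hypothetical and are not part of deepfold. The flag names themselves (-A, --noali, --F1, --F2, --F3, --incE, -E, --cpu, -N, --tblout, -Z, --domE, --incdomE) are standard jackhmmer options, but the exact mapping used by this wrapper is an assumption.

```python
from typing import List, Optional


def build_jackhmmer_flags(
    *,
    output_sto_path: str,
    n_cpu: int = 8,
    n_iter: int = 1,
    e_value: float = 0.0001,
    z_value: Optional[int] = None,
    tblout_path: Optional[str] = None,
    filter_f1: float = 0.0005,
    filter_f2: float = 5e-05,
    filter_f3: float = 5e-07,
    incdom_e: Optional[float] = None,
    dom_e: Optional[float] = None,
) -> List[str]:
    """Hypothetical helper: map wrapper-style settings to jackhmmer flags.

    Assumption: e_value is used for both the reporting (-E) and the
    inclusion (--incE) threshold. The real wrapper may differ.
    """
    flags = [
        "-o", "/dev/null",        # discard the human-readable report
        "-A", output_sto_path,    # write the final alignment (Stockholm)
        "--noali",                # omit alignments from the main output
        "--F1", str(filter_f1),   # MSV pre-filter threshold
        "--F2", str(filter_f2),   # Viterbi pre-filter threshold
        "--F3", str(filter_f3),   # Forward pre-filter threshold
        "--incE", str(e_value),
        "-E", str(e_value),
        "--cpu", str(n_cpu),
        "-N", str(n_iter),
    ]
    # Optional settings are only passed when explicitly provided.
    if tblout_path is not None:
        flags += ["--tblout", tblout_path]
    if z_value is not None:
        flags += ["-Z", str(z_value)]
    if dom_e is not None:
        flags += ["--domE", str(dom_e)]
    if incdom_e is not None:
        flags += ["--incdomE", str(incdom_e)]
    return flags


flags = build_jackhmmer_flags(output_sto_path="output.sto", n_iter=3, z_value=1000)
```

Keeping the optional flags out of the list unless a value is given lets jackhmmer fall back to its own defaults, which matches the None defaults in the signature above.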

Methods

__init__(*, binary_path, database_path[, ...])

Initializes the Python Jackhmmer wrapper.

query(input_fasta_path[, max_sequences])

query_multiple(input_fasta_paths[, ...])

Queries the database using Jackhmmer.
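
The sketch below illustrates how a streaming_callback of type Callable[[int], None] might interact with query. StandInRunner is a stand-in written for this example only; it is not the deepfold class and it does not run jackhmmer. It also assumes that chunks are numbered from 1 and that the callback fires once per chunk, neither of which is confirmed by the documentation above.

```python
from typing import Callable, List, Optional


class StandInRunner:
    """Stand-in for illustration only; NOT the real Jackhmmer wrapper."""

    def __init__(
        self,
        *,
        num_streamed_chunks: Optional[int] = None,
        streaming_callback: Optional[Callable[[int], None]] = None,
    ) -> None:
        self.num_streamed_chunks = num_streamed_chunks
        self.streaming_callback = streaming_callback

    def query(self, input_fasta_path: str) -> List[str]:
        # Assumption: one pass per database chunk, numbered from 1.
        processed = []
        for chunk_number in range(1, (self.num_streamed_chunks or 1) + 1):
            processed.append(f"{input_fasta_path}:chunk{chunk_number}")
            if self.streaming_callback is not None:
                # Report progress after each chunk has been handled.
                self.streaming_callback(chunk_number)
        return processed


# Any callable that accepts one int works; list.append is the simplest.
progress: List[int] = []
runner = StandInRunner(num_streamed_chunks=3, streaming_callback=progress.append)
results = runner.query("query.fasta")
```

With the real class, the same callback would be passed alongside the required binary_path and database_path keyword arguments.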