miniworld.feature package
Submodules
miniworld.feature.MiniWorld_featuring_species module
- miniworld.feature.MiniWorld_featuring_species.MSA_block_deletion(msa, insertion, nb=5)
Down-sample the given MSA by randomly deleting blocks of sequences.
Input: MSA/insertion tensors of shape (N, L).
Output: new MSA/insertion tensors after block deletion, of shape (N', L).
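The block-deletion idea can be sketched without any tensor library by computing which rows to keep. This is a minimal sketch, not the package's implementation: the helper name `block_deletion_keep_indices`, the `block_frac` block size of 30% of the rows, and the rule that row 0 (the query sequence) is never deleted are all assumptions made for illustration. Only `nb`, the number of blocks, comes from the documented signature.

```python
import random


def block_deletion_keep_indices(n_seq, nb=5, block_frac=0.3, rng=None):
    """Return the sorted row indices that survive random block deletion.

    Hypothetical helper: block_frac and the protected first row are
    assumptions, not documented behaviour of MSA_block_deletion.
    """
    rng = rng or random.Random()
    if n_seq < 2:
        # Nothing to delete when only the query row (or nothing) is present.
        return list(range(n_seq))

    block_size = max(int(n_seq * block_frac), 1)
    deleted = set()
    for _ in range(nb):
        # Start positions are drawn from rows 1..n_seq-1, so row 0 is kept.
        start = rng.randrange(1, n_seq)
        for row in range(start, min(start + block_size, n_seq)):
            deleted.add(row)

    return [row for row in range(n_seq) if row not in deleted]


keep = block_deletion_keep_indices(100, nb=5, rng=random.Random(0))
```

Returning indices rather than a new array lets the same selection be applied to both inputs, for example `msa[keep]` and `insertion[keep]`, so the two stay aligned row by row.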
- miniworld.feature.MiniWorld_featuring_species.MSA_featurize_wo_statistics(msa, insertion, chain_to_idx_dict, params)
I modified the RF2 version (only the variable names were changed).
Input: full MSA information (after block deletion, if necessary) and full insertion information.
Output: seed MSA features and extra sequences.
- Parameters:
msa (torch.LongTensor) – Full MSA tensor.
insertion (torch.LongTensor) – Full insertion tensor.
chain_to_idx_dict (dict) – Dictionary mapping chain ID to residue indices.
params (dict) – Dictionary of parameters.
Seed MSA features:
aatype of seed sequence (20 regular aa + 1 gap + 1 unknown + 1 mask)
profile of clustered sequences (23) => removed
insertion statistics (2) => removed statistics, only use insertion_clust
N-term or C-term? (2)
extra sequence features:
aatype of extra sequence (23)
insertion info (1)
N-term or C-term? (2)
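The extra-sequence layout listed above (23 + 1 + 2 = 26 channels per residue) can be illustrated for a single sequence. This is a sketch under stated assumptions, not the package's code: the function name `extra_sequence_features` is hypothetical, the channel order follows the order of the list, the insertion count is passed through unscaled, and the terminus flags are assumed to mark the first and last residue of the sequence.

```python
NUM_TOKENS = 23  # 20 regular aa + 1 gap + 1 unknown + 1 mask


def extra_sequence_features(tokens, insertions):
    """Build one 26-channel feature row per residue (hypothetical helper).

    tokens:     list of int token ids in [0, NUM_TOKENS)
    insertions: list of insertion counts, same length as tokens
    """
    length = len(tokens)
    rows = []
    for i, (tok, ins) in enumerate(zip(tokens, insertions)):
        one_hot = [0.0] * NUM_TOKENS        # aatype of extra sequence (23)
        one_hot[tok] = 1.0
        insertion_info = [float(ins)]       # insertion info (1), unscaled here
        term_flags = [                      # N-term or C-term? (2)
            1.0 if i == 0 else 0.0,
            1.0 if i == length - 1 else 0.0,
        ]
        rows.append(one_hot + insertion_info + term_flags)
    return rows


rows = extra_sequence_features([0, 5, 22], [0, 2, 0])
```

Each row has 26 entries. The documented function operates on whole (N, L) tensors, so an actual implementation would do the same concatenation along the last dimension rather than looping over residues.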
- miniworld.feature.MiniWorld_featuring_species.MSA_featurize_wo_statistics_by_chain(msa, insertion, N_clust, params)
I modified the RF2 version (only the variable names were changed).
Input: full MSA information (after block deletion, if necessary) and full insertion information.
Output: seed MSA features and extra sequences.
msa : (N, L) torch.LongTensor
ins : (N, L) torch.LongTensor
params : list of parameters
p_mask : probability of masking
eps : small number to avoid division by zero
chain_break : dictionary of chain indices {chain_id: (start, end)}
- Seed MSA features:
aatype of seed sequence (20 regular aa + 1 gap + 1 unknown + 1 mask)
profile of clustered sequences (23) => removed
insertion statistics (2) => removed statistics, only use insertion_clust
N-term or C-term? (2)
- extra sequence features:
aatype of extra sequence (23)
insertion info (1)
N-term or C-term? (2)
- miniworld.feature.MiniWorld_featuring_species.cutoff_chain_num(sel, xyz, chain_break, params, query_chain_idx)
- miniworld.feature.MiniWorld_featuring_species.display_top(snapshot, key_type='lineno', limit=10)
- miniworld.feature.MiniWorld_featuring_species.get_complex_crop(len_s, mask, device, params)
- miniworld.feature.MiniWorld_featuring_species.get_crop(chain_start, chain_end, mask, device, params, unclamp=False, ID=None)
- miniworld.feature.MiniWorld_featuring_species.get_same_crop_idx(xyz_full, crop_idx, chain_break, same_chain_info, cutoff=10.0)
- miniworld.feature.MiniWorld_featuring_species.get_spatial_crop(xyz, mask, pivot_chain_idx, chain_break, len_s, params, protein_ID, cutoff=10.0, eps=1e-06)
- miniworld.feature.MiniWorld_featuring_species.getsize(obj_0)
Recursively iterate to sum size of object & members.
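A recursive size sum of this kind is commonly built on `sys.getsizeof`. The following is a minimal sketch of that general technique and is not the package's `getsize`: the name `getsize_sketch`, the set of container types it descends into, and the use of object ids to avoid double counting are assumptions.

```python
import sys


def getsize_sketch(obj, _seen=None):
    """Sum sys.getsizeof over obj and the objects it contains.

    Hypothetical helper; shared or cyclic references are counted once.
    """
    seen = set() if _seen is None else _seen
    if id(obj) in seen:
        return 0
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        # Count both keys and values of a mapping.
        for key, value in obj.items():
            size += getsize_sketch(key, seen) + getsize_sketch(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += getsize_sketch(item, seen)
    elif hasattr(obj, "__dict__"):
        # Plain instances keep their attributes in __dict__.
        size += getsize_sketch(vars(obj), seen)
    return size


shared = [0] * 10
outer = [shared, shared]
total = getsize_sketch(outer)
```

Because `shared` appears twice in `outer` but has a single id, it contributes to `total` only once. Strings and bytes are treated as leaves, since iterating over them would only yield more strings or ints.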
- miniworld.feature.MiniWorld_featuring_species.permute_label(protein_list, crop_idx, out_of_sequence_idxs, chain_break, same_chain_info)
- miniworld.feature.MiniWorld_featuring_species.template_featurize(input_template_dict, params)
I modified the RF2 version.
In MSA_featurize, I changed only the variable names and a small part of the code, because the shapes of the inputs (msa, insertion) are almost the same as in RF2. The template structure, on the other hand, I rebuilt completely, so this function differs substantially from the RF2 version.
Note
Processes template information for a single chain.
- Parameters:
input_template_dict (dict) –
A dictionary containing template information. It should have the following keys:
'xyz': torch.Tensor of shape (N_template, L_chain, 27, 3)
'mask': torch.Tensor of shape (N_template, L_chain, 27)
'sequence': torch.Tensor of shape (N_template, L_chain, NUM_CLASSES)
'f0d': torch.Tensor of shape (N_template,)
'f1d': torch.Tensor of shape (N_template, L_chain)
params (dict) – Dictionary of parameters.
- Returns:
A dictionary with processed template features:
'xyz': torch.Tensor of shape (npick_global, L_query, 27, 3)
'template_1D': torch.Tensor of shape (npick_global, L_query, 23 + 1)
'template_atom_mask': torch.Tensor of shape (npick_global, L_query, 27)
- Return type:
dict