mtopic.tl.feature_associations

mtopic.tl.feature_associations(A: Tensor, A_var: DataFrame, B: Tensor, B_var: DataFrame, mask: Tensor = None, n_epochs=10000, lambda_reg=0.0001, lambda_entropy=0.001, lambda_spread=0.05, temperature=0.2, normalize=True, seed=1898, n_threads=10)

Cross-modality feature associations.

Learns a feature-level probabilistic mapping from one modality to another using KL divergence minimization with regularization. The optimization is done independently for each feature in A.

The result is a sparse matrix showing how features in modality B (rows) associate with features in modality A (columns), matching the returned shape (n_features_B, n_features_A). The model applies softmax-based weighting with optional entropy and sparsity regularization.
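The two ingredients named above, temperature-scaled softmax weights and a KL divergence objective, can be illustrated with a small pure-Python sketch. This is not the mtopic implementation: the toy profiles, the logits, and the way the weights are combined into a mixture are assumptions made only to show how a lower temperature sharpens the weights and how a KL term would score the fit.

```python
import math

def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(
        pi * math.log((pi + eps) / (qi + eps))
        for pi, qi in zip(p, q)
        if pi > 0
    )

# Hypothetical toy data: topic profiles (over 3 topics) of one feature in A
# and of three candidate features in B. Each profile sums to 1.
target = [0.7, 0.2, 0.1]
candidates = [
    [0.8, 0.1, 0.1],
    [0.1, 0.8, 0.1],
    [0.1, 0.1, 0.8],
]
logits = [2.0, 0.5, 0.1]  # illustrative values, one per candidate B feature

def mixture(weights):
    """Weighted average of the candidate profiles (an assumed combination rule)."""
    n_topics = len(target)
    return [sum(w * c[k] for w, c in zip(weights, candidates)) for k in range(n_topics)]

# Lower temperature concentrates the weight on the largest logit.
sharp = softmax(logits, temperature=0.2)
smooth = softmax(logits, temperature=1.0)

# KL term between the target profile and each weighted mixture.
loss_sharp = kl_divergence(target, mixture(sharp))
loss_smooth = kl_divergence(target, mixture(smooth))
```

In an actual optimization the logits would be updated to reduce the loss; here they are fixed so that only the effect of the temperature is visible.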

Parameters:
  • A (torch.Tensor) – Topic-feature matrix of modality A (shape: n_topics x n_features_A).

  • A_var (pandas.DataFrame) – .var DataFrame from modality A, used for column names.

  • B (torch.Tensor) – Topic-feature matrix of modality B (shape: n_topics x n_features_B).

  • B_var (pandas.DataFrame) – .var DataFrame from modality B, used for row names.

  • mask (torch.Tensor or None) – Optional boolean mask (shape: n_features_B x n_features_A) specifying which feature pairs to consider. (default: None)

  • n_epochs (int) – Number of optimization steps per target feature. (default: 10000)

  • lambda_reg (float) – Regularization coefficient for sparsity. (default: 1e-4)

  • lambda_entropy (float) – Regularization coefficient for entropy. (default: 1e-3)

  • lambda_spread (float) – Regularization coefficient for weight spread. (default: 0.05)

  • temperature (float) – Softmax temperature for controlling assignment sharpness. (default: 0.2)

  • normalize (bool) – Whether to check and normalize A and B to ensure row sums equal 1. (default: True)

  • seed (int) – Random seed for reproducibility. (default: 1898)

  • n_threads (int) – Number of threads to use for Torch. (default: 10)
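As a rough sketch of inputs that match the shapes documented above, the following builds toy tensors with PyTorch. The sizes and values are hypothetical, and the convention that True in the mask marks a pair to consider is an assumption inferred from the parameter description, not confirmed behavior.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes, chosen only for illustration.
n_topics, n_features_A, n_features_B = 4, 6, 5

# Non-negative toy matrices standing in for topic-feature loadings.
A = torch.rand(n_topics, n_features_A)
B = torch.rand(n_topics, n_features_B)

# Row-normalize so each topic's loadings sum to 1, which is the condition
# that normalize=True is documented to check and enforce.
A = A / A.sum(dim=1, keepdim=True)
B = B / B.sum(dim=1, keepdim=True)

# Boolean mask of shape (n_features_B, n_features_A).
# Assumption: True marks a feature pair that should be considered.
mask = torch.zeros(n_features_B, n_features_A, dtype=torch.bool)
mask[0, 0] = True
mask[1, 0] = True
mask[2, 3] = True
```

Passing already-normalized tensors should make the normalize check a no-op, though leaving normalize=True is the safer default.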

Returns:

DataFrame of shape (n_features_B, n_features_A) with association weights.

Return type:

pandas.DataFrame
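Because the return value is an ordinary pandas.DataFrame, it can be queried with standard pandas calls. The sketch below uses a hand-made stand-in with made-up feature names and weights (rows as B features, columns as A features, following the documented shape); it does not show real output of the function.

```python
import pandas as pd

# Hypothetical stand-in for the returned DataFrame: rows are features of B,
# columns are features of A. Names and values are made up.
df = pd.DataFrame(
    [[0.70, 0.05], [0.25, 0.15], [0.05, 0.80]],
    index=["peak_1", "peak_2", "peak_3"],
    columns=["gene_X", "gene_Y"],
)

# The two B features with the largest weights for one A feature.
top_for_gene_X = df["gene_X"].nlargest(2)

# The single strongest B feature for every A feature.
best_per_A_feature = df.idxmax(axis=0)
```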

Example:
import mtopic

# A, B: torch.Tensor objects (n_topics x n_features) holding topic loadings
# A_var, B_var: the .var DataFrames of the two modalities
# mask: optional boolean tensor restricting which feature pairs are considered

df = mtopic.tl.feature_associations(A, A_var, B, B_var, mask=mask, normalize=True)
df.head()  # View the learned feature associations