mtopic.tl.zscores

mtopic.tl.zscores(mdata, raw_data_path, signatures='signatures', mod=None, n_top=10, thr=5, out_key='zscores')

Compute z-scores for feature signatures.

This function calculates z-scores for the top features associated with each topic in the specified modality or across all modalities of a MuData object. Z-scores are computed using normalized and log-transformed raw count data, allowing for a standardized comparison of feature expression levels relative to their mean and standard deviation across all cells. Computed z-scores are capped within a specified threshold range to limit extreme values.
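The pipeline described above (normalize, log-transform, standardize per feature, cap) can be sketched with NumPy alone. This is an illustrative sketch, not mtopic's actual implementation: the helper name `capped_zscores`, the per-cell library-size normalization, and the `target_sum` scaling factor are assumptions, since the page does not specify how the raw counts are normalized.

```python
import numpy as np

def capped_zscores(counts, thr=5.0, target_sum=1e4):
    """Hypothetical helper: z-scores of log-normalized counts, capped to [-thr, thr].

    counts: array of shape (n_cells, n_features) holding raw counts.
    """
    counts = np.asarray(counts, dtype=float)

    # Assumed normalization: scale each cell to the same total count.
    totals = counts.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for empty cells
    normalized = counts / totals * target_sum

    # Log-transform; log1p keeps zero counts at zero.
    logged = np.log1p(normalized)

    # Standardize each feature across all cells: (x - mean) / std.
    mean = logged.mean(axis=0)
    std = logged.std(axis=0)
    std[std == 0] = 1.0  # constant features get a z-score of 0
    z = (logged - mean) / std

    # Cap extreme values to the range [-thr, thr].
    return np.clip(z, -thr, thr)


counts = np.array([[1, 0, 3], [2, 2, 2], [0, 5, 1], [4, 1, 0]])
z = capped_zscores(counts, thr=1.0)
```

With `thr=1.0`, every entry of `z` lies in [-1, 1]; a larger `thr` leaves more of the original spread intact.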

Parameters:
  • mdata (muon.MuData) – A MuData object containing multimodal single-cell data.

  • raw_data_path (str) – Path to the .h5mu file containing the raw count data for normalization and z-score computation.

  • signatures (str, optional) – Key in the varm attribute of each modality representing the topic signatures to compute z-scores for. Default is 'signatures'.

  • mod (str, optional) – Specific modality to compute z-scores for. If None, z-scores are computed for all modalities. Default is None.

  • n_top (int, optional) – Number of top features to select for each topic based on their importance in the topic signature. Default is 10.

  • thr (float, optional) – Threshold to cap the computed z-scores. Z-scores will be limited to the range [-thr, thr]. Default is 5.

  • out_key (str, optional) – Key under which the computed z-scores will be stored in the obsm attribute of each modality. Default is 'zscores'.

Returns:

None

Updates:
  • mdata[mod].obsm[out_key]: A DataFrame containing the z-scores for the top features of each topic. It is written for the specified modality, or for every modality if mod is None.

Example:
import mtopic

# Load MuData object
mdata = mtopic.read.h5mu("path/to/file.h5mu")

# Compute z-scores for the top 10 features in each topic for all modalities
mtopic.tl.zscores(
    mdata,
    signatures='signatures',
    raw_data_path="path/to/raw/data.h5mu"
)

# Compute z-scores for a specific modality ('rna')
mtopic.tl.zscores(
    mdata,
    signatures='signatures',
    raw_data_path="path/to/raw/data.h5mu",
    mod='rna'
)

Notes:
  • Z-Score Calculation: Z-scores are computed as (x - mean) / std, where x is the log-transformed expression value of a feature, mean is the mean across all cells, and std is the standard deviation across all cells.

  • Feature Selection: The top n_top features for each topic are selected based on their importance in the topic signatures (highest weights).

  • Thresholding: Extreme z-scores are capped to the range [-thr, thr] to mitigate the impact of outliers.

  • Raw Data: The raw counts used for normalization are loaded from the .h5mu file at raw_data_path.
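The feature-selection step in the notes above can be illustrated as follows. This is a minimal sketch under stated assumptions, not the library's code: the helper name `top_features` is hypothetical, and the signature matrix is assumed to have shape (n_features, n_topics), which matches how varm entries are laid out.

```python
import numpy as np

def top_features(signatures, feature_names, n_top=10):
    """Hypothetical helper: names of the n_top highest-weight features per topic.

    signatures: array of shape (n_features, n_topics), assumed layout.
    """
    signatures = np.asarray(signatures, dtype=float)
    selected = {}
    for topic in range(signatures.shape[1]):
        # Sort feature indices by descending weight and keep the first n_top.
        order = np.argsort(signatures[:, topic])[::-1][:n_top]
        selected[topic] = [feature_names[i] for i in order]
    return selected


sig = np.array([[0.1, 0.7],
                [0.5, 0.2],
                [0.4, 0.1]])
names = ["g1", "g2", "g3"]
top = top_features(sig, names, n_top=2)
```

Here `top` maps each topic index to its two highest-weight feature names; z-scores would then be computed only for these selected features.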