mtopic.tl.umap

mtopic.tl.umap(mdata, x='topics', umap='umap', n_components=2, min_dist=0.1, n_neighbors=20, seed=2291, n_jobs=10)

Perform UMAP dimensionality reduction on topic distributions.

This function applies Uniform Manifold Approximation and Projection (UMAP) to reduce the dimensionality of topic distributions stored in the obsm attribute of a MuData object. The reduced dimensions are stored in the obsm attribute under a specified name.

Parameters:
  • mdata (muon.MuData) – A MuData object containing the topic distributions in the obsm attribute.

  • x (str, optional) – The key in the obsm attribute of mdata that holds the topic distributions to be used for UMAP. Default is ‘topics’.

  • umap (str, optional) – The key under which the UMAP results will be stored in the obsm attribute of mdata. Default is ‘umap’.

  • n_components (int, optional) – The number of dimensions for the UMAP embedding. Default is 2.

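  • min_dist (float, optional) – The minimum distance between points in the UMAP embedding. Smaller values pack similar points more tightly; larger values spread points more evenly. Default is 0.1.

  • n_neighbors (int, optional) – The number of nearest neighbors to consider when computing the UMAP embedding. Controls the balance between local and global structure: smaller values emphasize local detail, larger values emphasize the overall layout. Default is 20.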

  • seed (int, optional) – Random seed for reproducibility. Ensures consistent embeddings across runs. Default is 2291.

  • n_jobs (int, optional) – Number of CPU cores to use for parallel computation. If set to -1, all available cores are used. Default is 10.

Returns:

None

Updates:
  • mdata.obsm[umap]: A DataFrame containing the UMAP coordinates for each sample, with dimensions specified by n_components.
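
Because the function returns None, the embedding has to be read back from mdata.obsm afterwards. The sketch below only illustrates the layout described above (one row per sample, one column per component). The stand-in DataFrame, its column names (UMAP1, UMAP2), and the sample labels are hypothetical placeholders rather than actual mtopic output.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for mdata.obsm['umap'] after mtopic.tl.umap has run.
# Values, sample labels, and column names are made up for illustration.
rng = np.random.default_rng(2291)
n_samples, n_components = 5, 2
embedding = pd.DataFrame(
    rng.normal(size=(n_samples, n_components)),
    index=[f"sample_{i}" for i in range(n_samples)],
    columns=[f"UMAP{i + 1}" for i in range(n_components)],
)

# One row per sample, one column per embedding dimension.
assert embedding.shape == (n_samples, n_components)

# Split into per-axis coordinate arrays, e.g. for a scatter plot.
xs, ys = embedding.to_numpy().T
```

With a real MuData object, the same pattern would start from mdata.obsm['umap'] (or whichever key was passed as umap) instead of the synthetic frame.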

Example:
import mtopic

# Load MuData object
mdata = mtopic.read.h5mu("path/to/file.h5mu")

# Compute UMAP embedding for topic distributions
mtopic.tl.umap(mdata, x='topics', n_components=2)