mtopic.tl.umap
- mtopic.tl.umap(mdata, x='topics', umap='umap', n_components=2, min_dist=0.1, n_neighbors=20, seed=2291, n_jobs=10)
Perform UMAP dimensionality reduction on topic distributions.
This function applies Uniform Manifold Approximation and Projection (UMAP) to reduce the dimensionality of topic distributions stored in the obsm attribute of a MuData object. The reduced dimensions are stored in the obsm attribute under a specified name.
- Parameters:
mdata (muon.MuData) – A MuData object containing the topic distributions in the obsm attribute.
x (str, optional) – The key in the obsm attribute of mdata that holds the topic distributions to be used for UMAP. Default is ‘topics’.
umap (str, optional) – The key under which the UMAP results will be stored in the obsm attribute of mdata. Default is ‘umap’.
n_components (int, optional) – The number of dimensions for the UMAP embedding. Default is 2.
min_dist (float, optional) – The minimum distance between points in the UMAP embedding. Smaller values pack similar points more tightly; larger values spread points out more evenly. Default is 0.1.
n_neighbors (int, optional) – The number of nearest neighbors to consider when computing the UMAP embedding. Smaller values emphasize local structure; larger values emphasize global structure. Default is 20.
seed (int, optional) – Random seed for reproducibility. Ensures consistent embeddings across runs. Default is 2291.
n_jobs (int, optional) – Number of CPU cores to use for parallel computation. If set to -1, all available cores are used. Default is 10.
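Because the default of 10 may exceed the number of cores on a small machine, it can help to choose n_jobs explicitly. The helper below is a minimal sketch using only the standard library; pick_n_jobs is a hypothetical name and is not part of mtopic.

```python
import os


def pick_n_jobs(requested=10):
    """Return a worker count that does not exceed the cores on this machine.

    Hypothetical helper, not part of mtopic. Follows the documented
    convention that -1 means "use all available cores".
    """
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    if requested == -1:
        return available
    return max(1, min(requested, available))
```

The result can then be passed on, for example mtopic.tl.umap(mdata, n_jobs=pick_n_jobs()).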
- Returns:
None
- Updates:
mdata.obsm[umap]: A DataFrame containing the UMAP coordinates for each sample, with dimensions specified by n_components.
- Example:
import mtopic

# Load MuData object
mdata = mtopic.read.h5mu("path/to/file.h5mu")

# Compute UMAP embedding for topic distributions
mtopic.tl.umap(mdata, x='topics', n_components=2)
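For readers who want a rough picture of the update step, the sketch below shows how the documented arguments could map onto the umap-learn package. It is an illustration under stated assumptions, not mtopic's source: it assumes the topics are stored as a pandas DataFrame, that seed is forwarded as random_state, and the names umap_kwargs, embed_topics, and the umap_1, umap_2 column labels are all made up here. n_jobs is left out of the sketch.

```python
def umap_kwargs(n_components=2, min_dist=0.1, n_neighbors=20, seed=2291):
    """Translate the documented arguments into umap-learn constructor names."""
    return {
        "n_components": n_components,
        "min_dist": min_dist,
        "n_neighbors": n_neighbors,
        "random_state": seed,  # assumption: seed is passed as random_state
    }


def embed_topics(topics, **kwargs):
    """Hypothetical stand-in for the update step; not mtopic's implementation.

    `topics` is assumed to be a pandas DataFrame of shape (n_obs, n_topics).
    Returns a DataFrame of UMAP coordinates sharing the same index.
    """
    # Imported lazily so the argument mapping above works without these packages.
    import pandas as pd
    import umap  # provided by the umap-learn package

    reducer = umap.UMAP(**umap_kwargs(**kwargs))
    coords = reducer.fit_transform(topics.to_numpy())
    # Column labels are invented for this sketch.
    columns = [f"umap_{i + 1}" for i in range(coords.shape[1])]
    return pd.DataFrame(coords, index=topics.index, columns=columns)
```

Under those assumptions, mdata.obsm['umap'] = embed_topics(mdata.obsm['topics']) would produce a result of the shape described under Updates.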