The clustering stability (cs) accessor
The clustering stability accessor provides metrics for assessing the stability of clustering results across different resolutions and random subsets of cells.
- segtraq.cs.clustering_stability.adjusted_rand_index(sdata: SpatialData, resolution: float = 1.0, frac_cells_subset: float = 0.63, tables_key: str = 'table', key_prefix: str = 'leiden_subset', inplace: bool = True) → float
Compute the clustering stability using the pairwise adjusted Rand index (ARI) on random subsets of cells.
- Parameters:
sdata (sd.SpatialData) – The SpatialData object containing clustering information.
resolution (float, optional) – The resolution parameter for Leiden clustering, by default 1.0.
frac_cells_subset (float, optional) – The fraction of cells to subset for clustering, by default 0.63.
tables_key (str, optional) – The key in sdata.tables where the relevant AnnData is stored, by default “table”.
key_prefix (str, optional) – The prefix for the keys under which the clustering results are stored, by default “leiden_subset”.
inplace (bool, optional) – Whether to store the computed ARI in sdata.uns, by default True.
- Returns:
The average pairwise ARI across the specified cluster keys.
- Return type:
float
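To make "average pairwise ARI" concrete, here is a minimal pure-Python sketch. It is not segtraq's implementation: the helper names `ari` and `mean_pairwise_ari`, the toy `runs` data, and the choice to compare each pair of runs only on the cells they share are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import comb


def ari(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same cells")
    n = len(labels_a)
    if n < 2:
        return 1.0  # no pairs to disagree on
    # Count co-clustered pairs in the contingency table and in its margins.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / comb(n, 2)
    max_index = (sum_a + sum_b) / 2
    if max_index == expected:
        return 1.0  # degenerate case: one cluster, or all singletons, in both
    return (sum_cells - expected) / (max_index - expected)


def mean_pairwise_ari(clusterings):
    """Average ARI over all pairs of runs, restricted to their shared cells.

    `clusterings` maps a run name to a {cell_id: cluster_label} dict.
    """
    scores = []
    for run_a, run_b in combinations(clusterings.values(), 2):
        shared = sorted(run_a.keys() & run_b.keys())
        scores.append(ari([run_a[c] for c in shared], [run_b[c] for c in shared]))
    return sum(scores) / len(scores)


# Hypothetical clusterings obtained on three random subsets of cells.
runs = {
    "subset_0": {"c1": 0, "c2": 0, "c3": 1, "c4": 1},
    "subset_1": {"c1": "a", "c2": "a", "c3": "b", "c5": "b"},
    "subset_2": {"c2": 0, "c3": 0, "c4": 1, "c5": 1},
}
print(mean_pairwise_ari(runs))
```

Because ARI is invariant to relabeling, `subset_0` and `subset_1` agree perfectly on their shared cells even though they use different label names.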
- segtraq.cs.clustering_stability.cluster_connectedness(sdata: SpatialData, resolution: float | list[float] = (0.6, 0.8, 1.0), use_weights: bool = False, tables_key: str = 'table', key_prefix: str = 'leiden_subset', random_state: int = 42, cell_type_key: str | None = None, inplace: bool = True) → float
Compute cluster connectedness for different Leiden clustering resolutions and report the best (highest) one. If a cell_type_key is provided, compute the connectedness for that clustering only.
- Parameters:
sdata (sd.SpatialData) – The SpatialData object containing clustering information.
resolution (float or list of float, optional) – The resolution parameter(s) for Leiden clustering, by default (0.6, 0.8, 1.0).
use_weights (bool, optional) – Whether to use edge weights when evaluating connectedness. If False, the fraction of neighbors with the same cluster label is used instead, by default False.
tables_key (str, optional) – The key in sdata.tables where the relevant AnnData is stored, by default “table”.
key_prefix (str, optional) – Prefix for clustering keys in .obs, by default “leiden_subset”.
random_state (int, optional) – Seed for reproducibility, by default 42.
cell_type_key (str, optional) – If provided, compute the cluster connectedness for this clustering only.
inplace (bool, optional) – Whether to store the computed cluster connectedness in sdata.uns, by default True.
- Returns:
The best (highest) cluster connectedness across resolutions.
- Return type:
float
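The page does not spell out how connectedness is computed, so the following is only one plausible reading of "fraction of equal neighbors": for each cell, the share of its graph neighbors (optionally weighted by edge weight) that carry the same cluster label, averaged over cells, with the best value across resolutions reported. The function `neighbor_agreement`, the toy graph, and the labelings are hypothetical.

```python
def neighbor_agreement(neighbors, labels, weights=None):
    """Mean share of each cell's neighborhood that has the cell's own label.

    neighbors: {cell: [neighbor, ...]}
    labels:    {cell: cluster_label}
    weights:   optional {(cell, neighbor): weight}; if None, every edge counts 1.
    """
    per_cell = []
    for cell, nbrs in neighbors.items():
        edge_w = [1.0 if weights is None else weights[(cell, nb)] for nb in nbrs]
        total = sum(edge_w)
        if total == 0:
            continue  # isolated cells carry no neighborhood information
        same = sum(w for w, nb in zip(edge_w, nbrs) if labels[nb] == labels[cell])
        per_cell.append(same / total)
    return sum(per_cell) / len(per_cell)


# Toy neighbor graph over five cells (assumed, for illustration only).
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c", "e"],
    "e": ["d"],
}

# Hypothetical labelings at two clustering resolutions.
labels_by_resolution = {
    0.6: {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1},
    1.0: {"a": 0, "b": 1, "c": 0, "d": 2, "e": 2},
}

scores = {res: neighbor_agreement(graph, lab) for res, lab in labels_by_resolution.items()}
best_resolution = max(scores, key=scores.get)
best_score = scores[best_resolution]
print(best_resolution, best_score)
```

Note that a single all-encompassing cluster trivially scores 1.0 under this definition, so such a score is only informative when compared across non-trivial clusterings.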
- segtraq.cs.clustering_stability.purity(sdata: SpatialData, resolution: float = 1.0, frac_cells_subset: float = 0.63, tables_key: str = 'table', key_prefix: str = 'leiden_subset', inplace: bool = True) → float
Compute the clustering stability using pairwise purity on random subsets of cells.
- Parameters:
sdata (sd.SpatialData) – The SpatialData object containing clustering information.
resolution (float, optional) – The resolution parameter for Leiden clustering, by default 1.0.
frac_cells_subset (float, optional) – The fraction of cells to subset for clustering, by default 0.63.
tables_key (str, optional) – The key in sdata.tables where the relevant AnnData is stored, by default “table”.
key_prefix (str, optional) – The prefix for the keys under which the clustering results are stored, by default “leiden_subset”.
inplace (bool, optional) – Whether to store the computed purity in sdata.uns, by default True.
- Returns:
The average pairwise purity across the specified cluster keys.
- Return type:
float
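As a rough illustration of purity between two clusterings, the sketch below uses the common textbook definition: each cluster in the first labeling is matched to the label it overlaps most in the second, and the matched cells are counted as a fraction of all cells. Whether segtraq uses exactly this definition, and how it handles the asymmetry, is not stated here; `purity_against` and the toy labels are assumptions.

```python
from collections import Counter, defaultdict


def purity_against(labels_a, labels_b):
    """Purity of labeling A with respect to labeling B on the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both labelings must cover the same cells")
    # For every cluster in A, count how its cells are distributed over B.
    overlap = defaultdict(Counter)
    for a, b in zip(labels_a, labels_b):
        overlap[a][b] += 1
    # Each A-cluster contributes the size of its largest overlap with B.
    return sum(max(counts.values()) for counts in overlap.values()) / len(labels_a)


coarse = [0, 0, 0, 0, 1, 1]
fine = ["x", "x", "y", "y", "z", "z"]

print(purity_against(coarse, fine))  # coarse cluster 0 is split between x and y
print(purity_against(fine, coarse))  # every fine cluster sits inside one coarse cluster
```

Purity is asymmetric, as the two calls show, so a pairwise average has to fix a direction or combine both.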
- segtraq.cs.clustering_stability.silhouette_score(sdata: SpatialData, resolution: float | list[float] = (0.6, 0.8, 1.0), metric: str = 'euclidean', tables_key: str = 'table', key_prefix: str = 'leiden_subset', random_state: int = 42, cell_type_key: str | None = None, inplace: bool = True) → float
Compute the silhouette score for different resolutions and report the best one. If a cell_type_key is provided, compute the silhouette score for the provided labels only.
- Parameters:
sdata (sd.SpatialData) – The SpatialData object containing clustering information.
resolution (float or list of float, optional) – The resolution parameter(s) for Leiden clustering, by default (0.6, 0.8, 1.0).
metric (str, optional) – The metric to use for silhouette score calculation, by default “euclidean”.
tables_key (str, optional) – The key in sdata.tables where the relevant AnnData is stored, by default “table”.
key_prefix (str, optional) – The prefix for the keys under which the clustering results are stored, by default “leiden_subset”.
random_state (int, optional) – Seed for reproducibility, by default 42.
cell_type_key (str, optional) – If provided, compute the silhouette score for these labels only.
inplace (bool, optional) – Whether to store the computed silhouette score in sdata.uns, by default True.
- Returns:
The best (highest) silhouette score across resolutions, or the score for the provided labels if cell_type_key is given.
- Return type:
float
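The "compute per resolution, report the best" pattern can be sketched in pure Python with Euclidean distance. This is an illustrative stand-in rather than segtraq's code: the helper `silhouette`, the toy coordinates, and the two labelings are hypothetical, and which embedding the library scores is not specified on this page.

```python
from math import dist


def silhouette(points, labels):
    """Mean silhouette coefficient using Euclidean distance."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")

    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the own cluster
        # (the distance from p to itself is 0, so divide by len(own) - 1).
        a = sum(dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        denom = max(a, b)
        scores.append(0.0 if denom == 0 else (b - a) / denom)
    return sum(scores) / len(scores)


# Toy 2-D embedding with two well-separated groups (assumed data).
points = [(0, 0), (0, 1), (5, 5), (5, 6)]
labels_by_resolution = {
    0.6: [0, 0, 1, 1],  # matches the geometry
    1.0: [0, 1, 0, 1],  # cuts across both groups
}

sil = {res: silhouette(points, lab) for res, lab in labels_by_resolution.items()}
best_resolution = max(sil, key=sil.get)
print(best_resolution, sil[best_resolution])
```

The labeling that follows the geometry scores close to 1, while the one that cuts across both groups scores below 0, so taking the maximum selects the former.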