The clustering stability (cs) accessor

The clustering stability accessor provides metrics for assessing how stable clustering results are across different Leiden resolutions and random subsets of cells.

segtraq.cs.clustering_stability.adjusted_rand_index(sdata: SpatialData, resolution: float = 1.0, frac_cells_subset: float = 0.63, tables_key: str = 'table', key_prefix: str = 'leiden_subset', inplace: bool = True) → float

Compute the clustering stability using the pairwise adjusted Rand index (ARI) on random subsets of cells.

Parameters:
  • sdata (sd.SpatialData) – The SpatialData object containing clustering information.

  • resolution (float, optional) – The resolution parameter for Leiden clustering, by default 1.0.

  • frac_cells_subset (float, optional) – The fraction of cells to subset for clustering, by default 0.63.

  • tables_key (str, optional) – The key in sdata.tables where the relevant AnnData is stored, by default “table”.

  • key_prefix (str, optional) – The prefix for the keys under which the clustering results are stored, by default “leiden_subset”.

  • inplace (bool, optional) – Whether to store the computed ARI in sdata.uns, by default True.

Returns:

The average pairwise ARI across the specified cluster keys.

Return type:

float
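
The averaging described above can be illustrated with a small pure-Python sketch. It is not SegTraQ's implementation: the helper names, the dictionary layout (cell ID mapped to cluster label), and the choice to compare each pair of runs only on the cells they share are assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations
from math import comb


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("labelings must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: trivially identical
    # Pairs of cells that are co-clustered in both, in A only, in B only.
    sum_ab = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    sum_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    sum_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = sum_a * sum_b / total_pairs
    maximum = 0.5 * (sum_a + sum_b)
    if maximum == expected:
        return 1.0  # both labelings are trivial (e.g. one cluster each)
    return (sum_ab - expected) / (maximum - expected)


def mean_pairwise_ari(clusterings):
    """Average ARI over all pairs of runs (hypothetical helper).

    Each run is a dict mapping cell ID -> cluster label; a pair of runs is
    compared on the cells present in both (an assumption of this sketch).
    """
    if len(clusterings) < 2:
        raise ValueError("need at least two clusterings")
    scores = []
    for run_a, run_b in combinations(clusterings, 2):
        shared = sorted(set(run_a) & set(run_b))
        scores.append(
            adjusted_rand_index(
                [run_a[c] for c in shared], [run_b[c] for c in shared]
            )
        )
    return sum(scores) / len(scores)


# Two runs on overlapping cell subsets; cluster names differ but the
# partition of the shared cells is the same.
run_1 = {"c1": "0", "c2": "0", "c3": "1", "c4": "1"}
run_2 = {"c1": "1", "c2": "1", "c3": "0", "c4": "0", "c5": "0"}
print(mean_pairwise_ari([run_1, run_2]))
```

Because the ARI is invariant to how clusters are named, only the partition matters; values near 1 indicate that the subsampled runs agree, and values near 0 indicate agreement no better than chance.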

segtraq.cs.clustering_stability.cluster_connectedness(sdata: SpatialData, resolution: float | list[float] = (0.6, 0.8, 1.0), use_weights: bool = False, tables_key: str = 'table', key_prefix: str = 'leiden_subset', random_state: int = 42, cell_type_key: str | None = None, inplace: bool = True) → float

Compute cluster connectedness for different Leiden clustering resolutions and report the best (highest) one. If a cell_type_key is provided, compute the connectedness for that clustering only.

Parameters:
  • sdata (sd.SpatialData) – The SpatialData object containing clustering information.

  • resolution (float or list of float, optional) – The resolution parameter(s) for Leiden clustering, by default (0.6, 0.8, 1.0).

  • use_weights (bool, optional) – Whether to use edge weights to evaluate connectedness. If False, the fraction of equal neighbors is used instead, by default False.

  • tables_key (str, optional) – The key in sdata.tables where the relevant AnnData is stored, by default “table”.

  • key_prefix (str, optional) – Prefix for clustering keys in .obs, by default “leiden_subset”.

  • random_state (int, optional) – Seed for reproducibility, by default 42.

  • cell_type_key (str, optional) – If provided, compute the cluster connectedness for this clustering only.

  • inplace (bool, optional) – Whether to store the computed cluster connectedness in sdata.uns, by default True.

Returns:

The best (highest) cluster connectedness across resolutions.

Return type:

float
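
One plausible reading of the use_weights switch is sketched below in plain Python. This is an assumption, not SegTraQ's documented algorithm: the function name `neighbor_agreement`, the nested-dict graph layout, and the interpretation of "fraction of equal neighbors" as the share of neighbor edges joining cells with the same label are all hypothetical.

```python
def neighbor_agreement(neighbors, labels, use_weights=False):
    """Share of neighbor edges that join cells with the same label.

    neighbors: dict mapping cell -> {neighbor: edge weight}
    labels:    dict mapping cell -> cluster label
    With use_weights=False every edge counts as 1; with use_weights=True
    each edge contributes its weight (assumed reading of the parameter).
    """
    same = 0.0
    total = 0.0
    for cell, nbrs in neighbors.items():
        for nbr, weight in nbrs.items():
            w = weight if use_weights else 1.0
            total += w
            if labels[cell] == labels[nbr]:
                same += w
    return same / total if total else float("nan")


# Toy neighbor graph: a-b and c-d are strong within-cluster edges,
# a-c is a weak edge between clusters.
neighbors = {
    "a": {"b": 1.0, "c": 0.5},
    "b": {"a": 1.0},
    "c": {"a": 0.5, "d": 1.0},
    "d": {"c": 1.0},
}
labels = {"a": 0, "b": 0, "c": 1, "d": 1}

unweighted = neighbor_agreement(neighbors, labels)
weighted = neighbor_agreement(neighbors, labels, use_weights=True)
best = max(unweighted, weighted)
```

In this toy graph the weighted variant scores higher because the only cross-cluster edge carries a low weight. Reporting the best value over several resolutions, as the function above does, would amount to taking the maximum over one such score per resolution.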

segtraq.cs.clustering_stability.purity(sdata: SpatialData, resolution: float = 1.0, frac_cells_subset: float = 0.63, tables_key: str = 'table', key_prefix: str = 'leiden_subset', inplace: bool = True) → float

Compute the clustering stability using pairwise purity on random subsets of cells.

Parameters:
  • sdata (sd.SpatialData) – The SpatialData object containing clustering information.

  • resolution (float, optional) – The resolution parameter for Leiden clustering, by default 1.0.

  • frac_cells_subset (float, optional) – The fraction of cells to subset for clustering, by default 0.63.

  • tables_key (str, optional) – The key in sdata.tables where the relevant AnnData is stored, by default “table”.

  • key_prefix (str, optional) – The prefix for the keys under which the clustering results are stored, by default “leiden_subset”.

  • inplace (bool, optional) – Whether to store the computed purity in sdata.uns, by default True.

Returns:

The average pairwise purity across the specified cluster keys.

Return type:

float
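
For orientation, the textbook definition of purity between two labelings can be written in a few lines of Python. This is a sketch of the general metric only; whether SegTraQ symmetrizes the score or how it aligns cells across subsets is not stated here, and the function below is a hypothetical helper.

```python
from collections import Counter, defaultdict


def purity(labels_a, labels_b):
    """Purity of labeling A with respect to labeling B.

    For every cluster in A, count the cells falling into its best-matching
    cluster in B, then divide the total by the number of cells.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("labelings must be non-empty and of equal length")
    overlap = defaultdict(Counter)
    for a, b in zip(labels_a, labels_b):
        overlap[a][b] += 1
    matched = sum(max(counts.values()) for counts in overlap.values())
    return matched / len(labels_a)


coarse = [0, 0, 0, 0]  # one cluster
fine = [0, 0, 1, 1]    # the same cells split in two

coarse_vs_fine = purity(coarse, fine)  # the coarse cluster is split in half
fine_vs_coarse = purity(fine, coarse)  # each fine cluster sits in one coarse cluster
```

Purity is not symmetric: a fine clustering nested inside a coarse one is perfectly pure, while the reverse comparison is penalized. A pairwise average therefore depends on which direction (or both) is evaluated for each pair.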

segtraq.cs.clustering_stability.silhouette_score(sdata: SpatialData, resolution: float | list[float] = (0.6, 0.8, 1.0), metric: str = 'euclidean', tables_key: str = 'table', key_prefix: str = 'leiden_subset', random_state: int = 42, cell_type_key: str | None = None, inplace: bool = True) → float

Compute the silhouette score for different Leiden clustering resolutions and report the best (highest) one. If a cell_type_key is provided, compute the silhouette score for the provided labels only.

Parameters:
  • sdata (sd.SpatialData) – The SpatialData object containing clustering information.

  • resolution (float or list of float, optional) – The resolution parameter(s) for Leiden clustering, by default (0.6, 0.8, 1.0).

  • metric (str, optional) – The metric to use for silhouette score calculation, by default “euclidean”.

  • tables_key (str, optional) – The key in sdata.tables where the relevant AnnData is stored, by default “table”.

  • key_prefix (str, optional) – The prefix for the keys under which the clustering results are stored, by default “leiden_subset”.

  • random_state (int, optional) – Seed for reproducibility, by default 42.

  • cell_type_key (str, optional) – If provided, compute the silhouette score for provided labels.

  • inplace (bool, optional) – Whether to store the computed silhouette score in sdata.uns, by default True.

Returns:

The best (highest) silhouette score across resolutions, or the silhouette score of the provided labels if cell_type_key is given.

Return type:

float
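
The "best score across resolutions" logic can be illustrated with a self-contained Python sketch of the standard silhouette definition using Euclidean distance. The data, the `labelings` dictionary keyed by resolution, and the helper itself are illustrative assumptions; the embedding SegTraQ computes distances on is not specified here.

```python
from math import dist


def silhouette_score(points, labels):
    """Mean silhouette coefficient with Euclidean distance."""
    clusters = {}
    for idx, label in enumerate(labels):
        clusters.setdefault(label, []).append(idx)
    if len(clusters) < 2:
        raise ValueError("silhouette needs at least two clusters")
    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster
        a = sum(dist(points[i], points[j]) for j in own if j != i) / (len(own) - 1)
        # b: mean distance to the members of the nearest other cluster
        b = min(
            sum(dist(points[i], points[j]) for j in members) / len(members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Two well-separated groups of points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]

# Hypothetical labelings, one per clustering resolution.
labelings = {
    0.6: [0, 0, 1, 1],  # matches the two groups
    1.0: [0, 1, 0, 1],  # mixes the two groups
}
scores = {res: silhouette_score(points, lab) for res, lab in labelings.items()}
best = max(scores.values())
```

Scores range from -1 to 1. The labeling that follows the two groups scores close to 1, the mixed labeling scores below 0, and taking the maximum selects the former.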