The nuclear correlation (`nc`) accessor

The nuclear correlation accessor provides metrics for evaluating how well the cell segmentation masks agree with the nuclear segmentation masks.

segtraq.nc.nuclear_correlation.compute_cell_nuc_correlation(sdata: SpatialData, table_key: str = 'table', cell_id_key: str = 'cell_id', metric: str = 'pearson', transcripts_key: str = 'transcripts', nucleus_by: str = 'nucleus_boundaries', feature_column: str = 'feature_name', x_coordinate: str = 'x', y_coordinate: str = 'y') → DataFrame

For each cell in the SpatialData table, identifies the nucleus with the highest IoU (intersection over union) and computes a correlation (e.g. Pearson) between the gene expression profiles of the cell and that nucleus.

Parameters:

sdata (spatialdata.SpatialData) –
A SpatialData object containing:
- .shapes['cell_boundaries'] and .shapes['nucleus_boundaries'] for polygon geometries,
- .tables[table_key] as an AnnData table.
table_key (str) – Key in sdata.tables pointing to the expression matrix.
cell_id_key (str) – Column in sdata.tables[table_key].obs containing the cell IDs to match with shapes.
metric (str) – Correlation metric. Currently only 'pearson' is supported.
transcripts_key (str) – Name of the transcripts Points element.
nucleus_by (str) – Name of the nucleus shape layer to aggregate by.
feature_column (str) – Column in transcripts holding the feature (e.g. gene/protein).
x_coordinate (str) – Column in transcripts holding the x coordinate.
y_coordinate (str) – Column in transcripts holding the y coordinate.

Returns:

DataFrame with columns:

cell_id: identifier of each cell,
best_nuc_id: ID of the matching nucleus with the highest IoU (or None if no nucleus matches),
correlation: Pearson correlation between the gene counts of the cell and those of its matched nucleus (NaN if no match).

Return type:

pandas.DataFrame
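
To make the correlation column concrete, the following is a minimal sketch of a Pearson correlation between two gene count profiles, written in plain Python. It is not the library's implementation: the helper names (`pearson`, `profile_correlation`), the toy counts, and the choice to align profiles over the union of genes are assumptions made for illustration. The only behavior taken from the description above is that an unmatched cell yields NaN.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; NaN if undefined."""
    n = len(x)
    if n != len(y) or n < 2:
        return math.nan
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        # A constant profile has no defined correlation.
        return math.nan
    cov = sum(a * b for a, b in zip(dx, dy))
    return cov / math.sqrt(var_x * var_y)


def profile_correlation(cell_counts, nuc_counts):
    """Correlate two gene -> count dicts (hypothetical helper).

    Assumption: profiles are aligned over the union of their genes,
    with missing genes counted as zero.
    """
    if nuc_counts is None:  # no matched nucleus
        return math.nan
    genes = sorted(set(cell_counts) | set(nuc_counts))
    cell_vec = [cell_counts.get(g, 0) for g in genes]
    nuc_vec = [nuc_counts.get(g, 0) for g in genes]
    return pearson(cell_vec, nuc_vec)


# Hypothetical toy data: gene -> transcript count.
cell = {"GeneA": 10, "GeneB": 4, "GeneC": 0}
nucleus = {"GeneA": 5, "GeneB": 2, "GeneC": 0}

r = profile_correlation(cell, nucleus)        # proportional profiles
r_unmatched = profile_correlation(cell, None)  # cell without a nucleus
```

Because the toy nucleus profile is exactly half the cell profile, the two vectors are perfectly linearly related, which is the situation a high correlation is meant to flag.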

segtraq.nc.nuclear_correlation.compute_cell_nuc_ious(sdata: SpatialData, cell_shape_key: str = 'cell_boundaries', nuc_shape_key: str = 'nucleus_boundaries', n_jobs: int = -1, use_progress: bool = True) → DataFrame

Compute, for each cell, the IoU (intersection over union) between its boundary and the nucleus boundaries in a SpatialData object.

Parameters:

sdata (spatialdata.SpatialData) – Must contain cell and nuclear shapes.
cell_shape_key (str, optional) – The key in the shapes attribute of sdata that corresponds to cell boundaries.
nuc_shape_key (str, optional) – The key in the shapes attribute of sdata that corresponds to nucleus boundaries.
n_jobs (int, optional) – Number of parallel jobs. The default, -1, uses all CPUs.
use_progress (bool, optional) – Whether to display a tqdm progress bar.

Returns:

DataFrame with columns [cell_id, best_nuc_id, IoU]

Return type:

pandas.DataFrame
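
The idea of matching each cell to its best-overlapping nucleus can be sketched as follows. This is an illustration under simplifying assumptions, not the library's code: shapes are axis-aligned boxes `(xmin, ymin, xmax, ymax)` rather than arbitrary polygons, the function names are hypothetical, and reporting `None` with an IoU of 0.0 for a cell without any overlapping nucleus is an assumption.

```python
def box_area(box):
    """Area of an axis-aligned box (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    return max(0.0, xmax - xmin) * max(0.0, ymax - ymin)


def box_iou(a, b):
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = box_area(a) + box_area(b) - inter
    return inter / union if union > 0 else 0.0


def best_nucleus_per_cell(cells, nuclei):
    """Return one row per cell with the nucleus of highest IoU.

    Assumption: a cell that overlaps no nucleus gets best_nuc_id=None
    and IoU=0.0.
    """
    rows = []
    for cell_id, cell_box in cells.items():
        best_id, best_iou = None, 0.0
        for nuc_id, nuc_box in nuclei.items():
            iou = box_iou(cell_box, nuc_box)
            if iou > best_iou:
                best_id, best_iou = nuc_id, iou
        rows.append({"cell_id": cell_id, "best_nuc_id": best_id, "IoU": best_iou})
    return rows


# Hypothetical toy shapes.
cells = {"c1": (0, 0, 4, 4), "c2": (10, 10, 12, 12)}
nuclei = {"n1": (1, 1, 3, 3), "n2": (3, 3, 5, 5)}

rows = best_nucleus_per_cell(cells, nuclei)
```

Here `c1` fully contains `n1` and only clips `n2`, so `n1` wins the match, while `c2` overlaps neither nucleus. The brute-force double loop is kept for readability; a real implementation on large datasets would typically use a spatial index to limit candidate pairs.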

segtraq.nc.nuclear_correlation.compute_correlation_between_parts(sdata: SpatialData, table_key: str = 'table', cell_shape_key: str = 'cell_boundaries', nuc_shape_key: str = 'nucleus_boundaries', transcripts_key: str = 'transcripts', feature_column: str = 'feature_name', x_coordinate: str = 'x', y_coordinate: str = 'y') → DataFrame

Compute the Pearson correlation between the part of each cell that overlaps its nucleus and the rest of the cell.

Parameters:

sdata (spatialdata.SpatialData) – The SpatialData object containing cells, nuclei, and transcript points.
table_key (str) – Key in sdata.tables pointing to the expression matrix.
cell_shape_key (str) – Key for cell boundaries in sdata.shapes.
nuc_shape_key (str) – Key for nucleus boundaries in sdata.shapes.
transcripts_key (str) – Key for transcript points in sdata.points.
feature_column (str) – Feature column in transcript points (e.g. gene name).
x_coordinate (str) – Column name for x coordinate.
y_coordinate (str) – Column name for y coordinate.

Returns:

DataFrame with columns ['cell_id', 'best_nuc_id', 'correlation']

Return type:

pandas.DataFrame
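
The split into a nuclear part and the rest of the cell can be illustrated with a small, self-contained sketch. It is a simplified stand-in rather than the library's method: regions are axis-aligned boxes instead of polygons, transcripts are `(gene, x, y)` tuples, and all names (`in_box`, `split_counts`, the toy genes) are hypothetical.

```python
import math
from collections import Counter


def pearson(x, y):
    """Pearson correlation of two equal-length sequences; NaN if undefined."""
    n = len(x)
    if n != len(y) or n < 2:
        return math.nan
    mean_x, mean_y = sum(x) / n, sum(y) / n
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        return math.nan
    return sum(a * b for a, b in zip(dx, dy)) / math.sqrt(var_x * var_y)


def in_box(x, y, box):
    """True if point (x, y) lies inside box (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    return xmin <= x <= xmax and ymin <= y <= ymax


def split_counts(transcripts, cell_box, nuc_box):
    """Count transcripts per gene in the nuclear part and in the rest of the cell."""
    nuclear, rest = Counter(), Counter()
    for gene, x, y in transcripts:
        if not in_box(x, y, cell_box):
            continue  # transcript does not belong to this cell
        if in_box(x, y, nuc_box):
            nuclear[gene] += 1
        else:
            rest[gene] += 1
    return nuclear, rest


# Hypothetical toy data: one cell, its matched nucleus, a few transcripts.
cell_box = (0, 0, 4, 4)
nuc_box = (1, 1, 3, 3)
transcripts = [
    ("GeneA", 1.5, 1.5), ("GeneA", 2.0, 2.5), ("GeneB", 2.5, 1.2),  # nuclear part
    ("GeneA", 0.5, 0.5), ("GeneA", 3.5, 0.2), ("GeneB", 0.2, 3.8),  # rest of cell
    ("GeneC", 3.5, 3.5),                                            # rest of cell
    ("GeneD", 9.0, 9.0),                                            # outside the cell
]

nuclear, rest = split_counts(transcripts, cell_box, nuc_box)
genes = sorted(set(nuclear) | set(rest))
r = pearson([nuclear[g] for g in genes], [rest[g] for g in genes])
```

A correlation close to 1 suggests that the region outside the nucleus carries a similar expression profile to the nuclear region, whereas a low value may hint that the cell boundary includes transcripts from neighboring cells.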
