Skip to contents

Fits a polygon (or multi-polygon) spatial mask (an sf::sfc geometry object) to a set of XY point coordinates. Suitable for cell centroids, single-molecule transcript locations, or any spatial point process data. The mask faithfully captures tissue shape including internal voids (vessel lumens, necrotic cores, tissue tears) and disconnected fragments.

Four algorithms are available:

methodTopologySpeedBest for
"raster"Holes + islandsFastMost use cases (default)
"kde"Holes + islandsModerateDense, smooth datasets
"concave"No holesFastSimple non-convex shapes
"convex"No holesInstantQuick checks

Raster method overview (method = "raster", v3):

  1. Bin point coordinates onto a regular grid.

  2. Convolve the occupancy grid with a 2-D Gaussian of width raster_sigma (in coordinate units).

  3. Threshold at raster_threshold * max. Cells above threshold are "inside".

  4. Convert each "inside" cell to an axis-aligned rectangle and dissolve via GEOS union. Holes and islands emerge from the geometry naturally – no ring-winding logic required.

  5. Apply a morphological close (buffer out then in) to smooth staircase boundaries.

Usage

fit_spatial_mask(
  coords,
  method = "raster",
  concavity = 2,
  length_threshold = 0,
  kde_bandwidth = NULL,
  kde_threshold = 0.05,
  kde_resolution = 256L,
  raster_resolution = 256L,
  raster_sigma = NULL,
  raster_threshold = 0.15,
  raster_min_pts = 1L,
  buffer_dist = 0,
  smooth_mask = FALSE,
  smooth_tolerance = NULL,
  n_cores = 1L,
  crs = NA,
  plot = FALSE,
  verbose = TRUE
)

Arguments

coords

A data frame or matrix with columns x and y. If column names differ, the first two columns are used and renamed. Rows containing NA in x or y are silently dropped.

method

Character scalar. Masking algorithm. One of:

  • "raster" (default) – Gaussian smoothing on a rasterised grid, binary threshold, then GEOS union of "on" cells. Supports holes and islands. Recommended for most datasets.

  • "kde" – 2-D kernel density estimate via MASS::kde2d() contoured with isoband::isobands(). Supports holes and islands. Requires packages MASS and isoband.

  • "concave" – Concave hull via concaveman::concaveman(). No holes or islands. Requires package concaveman.

  • "convex" – Convex hull via sf::st_convex_hull(). No holes or islands. No additional packages required.

concavity

Numeric. Relative concavity for method = "concave". Lower values produce tighter, more concave boundaries; higher values approach the convex hull. Default 2.

length_threshold

Numeric. Minimum edge length for method = "concave". Edges shorter than this are not included. Default 0.

kde_bandwidth

Numeric scalar or length-2 vector giving the KDE bandwidth(s) in the x and y directions. NULL uses MASS::bandwidth.nrd() applied independently to each axis. Only used when method = "kde". Default NULL.

kde_threshold

Numeric in (0, 1). KDE probability quantile used as the lower contour level. Smaller values produce larger masks. Default 0.05.

kde_resolution

Integer. Grid resolution for the KDE density estimate. Larger values give finer boundaries at the cost of memory. Default 256L.

raster_resolution

Integer. Number of grid cells along each axis for method = "raster". Default 256L.

raster_sigma

Numeric. Standard deviation of the Gaussian kernel in coordinate units (the same units as x and y). This is the key tuning parameter:

  • Larger raster_sigma -> holes fill in, islands merge, boundary smooths.

  • Smaller raster_sigma -> holes and fine structure are preserved.

  • NULL (default) -> auto-set to 3% of the domain width.

raster_threshold

Numeric in (0, 1). A grid cell is "inside" when its smoothed value exceeds raster_threshold * max(smoothed_grid). Higher values shrink the mask; lower values grow it. Default 0.15.

raster_min_pts

Integer. Minimum number of points that must fall in a grid cell before it is eligible for the convolution. Cells with fewer points are treated as empty. Default 1L.

buffer_dist

Numeric. Distance to expand the mask after fitting via sf::st_buffer(). Zero means no expansion. Default 0.

smooth_mask

Logical. If TRUE, the mask boundary is simplified with sf::st_simplify() after fitting. Default FALSE.

smooth_tolerance

Numeric. Douglas-Peucker tolerance passed to sf::st_simplify(). NULL -> auto-set to 1% of the bounding-box diagonal. Default NULL.

n_cores

Integer. Number of cores for parallel row-smoothing in parallel::mclapply() (Unix/macOS only; ignored on Windows). Default 1L.

crs

CRS specification accepted by sf::st_crs(): an EPSG integer, a WKT string, a PROJ string, or NA for no CRS. Default NA.

plot

Logical. If TRUE, a quick ggplot2::ggplot2 scatter + polygon plot is printed. Requires ggplot2 to be installed. Default FALSE.

verbose

Logical. If TRUE, prints a summary table and intermediate messages (e.g., raster_sigma in grid-cell units). Default TRUE.

Value

An sf::sfc geometry collection of class sfc_POLYGON or sfc_MULTIPOLYGON. The CRS is set to crs. All input points (after NA removal) are guaranteed to lie inside or on the boundary of the returned mask. If any points fall outside the raw fitted mask a corrective buffer is applied automatically and a warning is issued.

Examples

set.seed(42)
# Two-cluster point cloud
coords <- data.frame(
  x = c(rnorm(200, 0, 3), rnorm(200, 10, 3)),
  y = c(rnorm(200, 0, 3), rnorm(200, 10, 3))
)

# Raster method (recommended default)
mask <- fit_spatial_mask(coords, verbose = FALSE)
#> Warning: 55 point(s) outside mask - applying corrective buffer.

# Convex hull -- instant, no additional packages
mask_cv <- fit_spatial_mask(coords, method = "convex", verbose = FALSE)

# Concave hull -- requires 'concaveman'
# \donttest{
mask_cc <- fit_spatial_mask(coords, method = "concave",
                            concavity = 1.5, verbose = FALSE)
#> Warning: 33 point(s) outside mask - applying corrective buffer.
# }

# Verify all points are covered by the mask (boundary counts as covered)
library(sf)
#> Linking to GEOS 3.13.0, GDAL 3.8.5, PROJ 9.5.1; sf_use_s2() is TRUE
pts <- sf::st_as_sf(coords, coords = c("x", "y"), crs = sf::NA_crs_)
all(sf::st_covered_by(pts, sf::st_union(mask), sparse = FALSE)[, 1])
#> [1] TRUE