Skip to content

feat(rust/sedona-raster-functions): add RS_Contains, RS_Intersects, RS_Within UDFs#615

Open
Kontinuation wants to merge 12 commits intoapache:mainfrom
Kontinuation:pr14-rs-spatial-predicates
Open

feat(rust/sedona-raster-functions): add RS_Contains, RS_Intersects, RS_Within UDFs#615
Kontinuation wants to merge 12 commits intoapache:mainfrom
Kontinuation:pr14-rs-spatial-predicates

Conversation

@Kontinuation
Copy link
Member

Summary

  • Add RS_Contains, RS_Intersects, and RS_Within spatial predicate UDFs to sedona-raster-functions
  • Supports raster-geometry and raster-raster spatial relationship tests with CRS-aware coordinate transformation
  • Adds crs_utils module for CRS resolution and coordinate transformation helpers
  • Extends executor.rs with execute_raster_wkb_crs_void (raster+geometry iteration) and execute_raster_raster_void (two-raster iteration) patterns
  • Makes sedona-proj::st_transform module and with_global_proj_engine function public for cross-crate CRS transformation
  • Includes unit tests and benchmark entries in native-raster-functions.rs

@Kontinuation Kontinuation force-pushed the pr14-rs-spatial-predicates branch 3 times, most recently from d54ef15 to 38aab51 Compare February 20, 2026 17:03
@Kontinuation Kontinuation force-pushed the pr14-rs-spatial-predicates branch from 38aab51 to cda9308 Compare March 4, 2026 14:02
@Kontinuation Kontinuation requested a review from Copilot March 4, 2026 14:53
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds raster spatial predicate UDFs to sedona-raster-functions, including CRS-aware handling and new executor iteration patterns to support raster↔geometry and raster↔raster comparisons.

Changes:

  • Introduces RS_Intersects, RS_Contains, and RS_Within UDF implementations (raster-geom, geom-raster, raster-raster) using raster convex hull WKB.
  • Adds crs_utils helpers for CRS resolution and WKB/coordinate transformation via the global PROJ engine.
  • Extends RasterExecutor with execute_raster_wkb_crs_void and execute_raster_raster_void, plus tests and benchmark entries.

Reviewed changes

Copilot reviewed 12 out of 13 changed files in this pull request and generated 7 comments.

Show a summary per file
File Description
rust/sedona-raster-functions/src/rs_spatial_predicates.rs New UDF kernels + CRS-aware predicate evaluation + unit tests.
rust/sedona-raster-functions/src/register.rs Registers the new spatial predicate UDFs in the default function set.
rust/sedona-raster-functions/src/lib.rs Exposes new crs_utils and rs_spatial_predicates modules.
rust/sedona-raster-functions/src/executor.rs Adds new raster↔raster and raster↔(WKB,CRS) iteration helpers + tests.
rust/sedona-raster-functions/src/crs_utils.rs New CRS resolution + WKB/coord reprojection utilities.
rust/sedona-raster-functions/benches/native-raster-functions.rs Adds benchmark cases for the new UDFs.
rust/sedona-raster-functions/Cargo.toml Adds new deps for tg predicates, CRS transform, and WKB parsing.
docs/reference/sql/rs_contains.qmd Adds SQL reference page for RS_Contains.
docs/reference/sql/rs_intersects.qmd Adds SQL reference page for RS_Intersects.
docs/reference/sql/rs_within.qmd Adds SQL reference page for RS_Within.
c/sedona-proj/src/transform.rs Makes with_global_proj_engine public for cross-crate reuse.
c/sedona-proj/src/lib.rs Makes st_transform module public.
Cargo.lock Lockfile updates for new dependencies.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@Kontinuation Kontinuation force-pushed the pr14-rs-spatial-predicates branch from cda9308 to 84b5edd Compare March 4, 2026 16:07
@Kontinuation Kontinuation marked this pull request as ready for review March 4, 2026 16:42
@Kontinuation Kontinuation changed the title feat(raster): add RS_Contains, RS_Intersects, RS_Within UDFs feat(rust/sedona-raster-functions): add RS_Contains, RS_Intersects, RS_Within UDFs Mar 4, 2026
Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

A few patterns I think we should update before merging this but in general this looks very cool!

//! and (raster, raster). Rasters are compared via their convex hulls.
//!
//! CRS transformation rules:
//! - If a raster or geometry does not have a defined SRID, it is assumed to be in WGS84
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In SedonaDB I've tried hard to avoid this assumption (I am guessing this is a hangover from Sedona not propagating CRSes from GeoParquet). We can make it easy to apply this assumption explicitly if it causes issues (e.g., ST_ApplyDefaultCRS() or a Python dataframe level function that does it for all spatial columns).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This strange semantics comes from Apache Sedona. Should we simply throw an error when one of the input parameter does not have a CRS?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If both of them don't have a CRS it would technically be OK (but you can throw an error for now).

Comment on lines +188 to +192
executor.execute_raster_wkb_crs_void(|raster_opt, maybe_wkb, maybe_geom_crs| {
match (raster_opt, maybe_wkb) {
(Some(raster), Some(geom_wkb)) => {
raster_wkb.clear();
write_convexhull_wkb(raster, &mut raster_wkb)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I imagine this is not a huge performance issue, but technically this is unnecessarily recomputing the raster convex hull for the raster-is-a-scalar/geometry-is-a-column case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The raster-is-a-scalar case should be rare in practice, leaving this case unoptimized to simplify the code should be a sane trade off.

Comment on lines +283 to +296
/// Evaluate a spatial predicate with CRS handling
///
/// Rules:
/// - If no CRS defined, assume WGS84
/// - If both same CRS, compare directly
/// - Otherwise, try transforming one side to the other's CRS for comparison.
/// If that fails, transform both to WGS84 and compare.
fn evaluate_predicate_with_crs<Op: tg::BinaryPredicate>(
wkb_a: &[u8],
crs_a: Option<&str>,
wkb_b: &[u8],
crs_b: Option<&str>,
from_a_to_b: bool,
) -> Result<bool> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I expected to see with_global_proj_engine() outside the loop with a reference to the engine passed in here. I think it would be better to keep that in the UDF execute because it will be easier to grab the session engine from the ConfigOptions when we get there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved with_global_proj_engine to call once per batch.

Comment on lines +26 to +28
/// Cached default CRS (WGS84 longitude/latitude). Initialized once on first access.
static DEFAULT_CRS: LazyLock<Arc<dyn CoordinateReferenceSystem + Send + Sync>> =
LazyLock::new(|| lnglat().expect("lnglat() should always succeed"));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I commented above too, but I don't think this is a good place to apply a default CRS. In SedonaDB it is practically quite difficult to get something without a CRS (when there's a raster reader, the reader would be a good place to apply that).

Comment on lines +30 to +36
/// Run a closure with the active CRS engine, abstracting away the engine choice.
///
/// We keep this API engine-agnostic to allow future engines beyond PROJ; the
/// current implementation uses PROJ via `with_global_proj_engine`.
pub fn with_crs_engine<R, F: FnMut(&dyn CrsEngine) -> Result<R>>(mut func: F) -> Result<R> {
with_global_proj_engine(|engine| func(engine))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can apply this at the UDF level where it's a bit more explicit (and amortized over a batch).

Comment on lines +73 to +77
pub fn crs_transform_wkb(
wkb: &[u8],
from_crs: &dyn CoordinateReferenceSystem,
to_crs: &dyn CoordinateReferenceSystem,
) -> Result<Vec<u8>> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The identical CRS check should probably happen before this to avoid the clone. For a bbox the clone is trivial but this could also be a complex geometry (e.g., division boundary) on the other side of the predicate.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the CRS equivalence check before calling this function. This will eliminate one allocation and WKB buffer copying when CRSes are equivalent.

Comment on lines +191 to +197
fn crs_string_from_sedona_type(sedona_type: &SedonaType) -> Result<Option<String>> {
match sedona_type {
SedonaType::Wkb(_, crs) | SedonaType::WkbView(_, crs) => {
if let Some(crs) = crs {
Ok(Some(
crs.to_authority_code()?.unwrap_or_else(|| crs.to_json()),
))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we have a CRS abbreviator already (and the executor should probably not automatically abbreviate so that functions can use the full definitions if they are provided).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decide to change execute_raster_wkb_crs_void to pass &dyn CoordinateReferenceSystem instead of a CRS string to the closure.


impl ItemWkbAccessor {
#[inline]
fn get(&self, i: usize) -> Option<&[u8]> {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The pattern of switching on type for every element is probably not a great one.

Possibly a Box<dyn Iterator<Item = ...>>? In the context of raster iteration where it is less likely there will be billions of them this is probably OK. You could also use that for raster/CRS iteration and use the existing executor to handle the arrays (it's almost always free to cast the CRS array to string view and there is less variation in the raster type than the geometry type).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have thought about eliminating this dynamic dispatching when implementing it, and I deliberately leave it that way. Here are the reasons:

  1. Box<dyn Iterator<Item = ...>> performs dynamic dispatching implicitly, I'm not sure if it is better than pattern-matching a enum.
  2. RS functions involving a raster and a geometry input are usually heavy weight, the cost of dispatching does not add up too much overhead.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add a comment here before all the switches explaining this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment added for both ItemWkbAccessor and GeomWkbCrsAccessor

@Kontinuation
Copy link
Member Author

Kontinuation commented Mar 6, 2026

The benchmark no longer works because of the semantics changes. I have to add a ST_ApplyDefaultCRS UDF to set CRS for benchmarking geometries. Currently it is mainly for unblocking the benchmark test and does not support item level CRS.

Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great...thank you!

I would prefer to avoid adding ST_ApplyDefaultCrs()...documented functions carry some compatibility/stability considerations and we just need this to exist for a benchmark. I left two suggestions about how I think we can avoid this!

# specific language governing permissions and limitations
# under the License.

title: ST_ApplyDefaultCRS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would give this an SD_* prefix (and then you don't have to document it). I am not sure we want documentation for this because it's an implementation detail (all actual users should just do ST_SetSRID() or ST_SetCrs()).

Alternatively, just put the definition in the raster benchmarks file since that's the only place we need it!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed ST_ApplyDefaultCRS to SD_ApplyDefaultCRS and moved it to the benchmark file.

Comment on lines +198 to +211
fn crs_from_sedona_type(sedona_type: &SedonaType) -> Crs {
match sedona_type {
SedonaType::Wkb(_, crs) | SedonaType::WkbView(_, crs) => crs.clone(),
_ => None,
}
}

fn is_item_crs_type(sedona_type: &SedonaType) -> bool {
matches!(
sedona_type,
SedonaType::Arrow(DataType::Struct(fields))
if fields.len() == 2 && fields[0].name() == "item" && fields[1].name() == "crs"
)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to do it here but these could both be methods on the SedonaType

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can do this in a separate PR and replace other ad-hoc item-crs schema checks with SedonaType::is_item_crs_type.

Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants