16 changes: 16 additions & 0 deletions HISTORY.rst
@@ -2,6 +2,22 @@
History
=======

1.2.0 (2026-06-26)
------------------

* Added ``spcs_pat`` to ``DataMasqueInstanceConfig`` for authenticating to
DataMasque instances hosted behind Snowflake SPCS (Snowpark Container Services)
app ingress. When set, the client sends the Programmatic Access Token on the
``X-SF-SPCS-Authorization`` header to clear the Snowflake gateway, independently
of the instance's own DataMasque auth.
* Added ``SpcsGatewayAuthError``, raised when the SPCS gateway rejects the PAT
before the request reaches DataMasque (with the Snowflake detail and a hint at
the likely cause).
* Added ``spcs`` to ``SnowflakeStageLocation`` so connections staged inside
Snowflake SPCS deserialise correctly. Previously, listing connections on an
instance that held an SPCS-staged Snowflake connection raised a
``ValidationError`` on the unknown stage value.
Comment on lines +8 to +19

Collaborator

Follow existing style - nowhere near as verbose. And, you guessed it, sembr.

Suggested change
* Added ``spcs_pat`` to ``DataMasqueInstanceConfig`` for authenticating to DataMasque instances hosted on Snowpark Container Services.
* Added ``SpcsGatewayAuthError``, raised when the SPCS gateway rejects the PAT.
* Added ``spcs`` option to ``SnowflakeStageLocation``.

Could also structure as a "Added support for DataMasque deployments on Snowpark Container Services (SPCS)" heading with the three bullets nested below it.


1.1.1 (2026-06-25)
------------------

10 changes: 10 additions & 0 deletions README.rst
@@ -42,6 +42,13 @@ Authentication is performed on the first request if ``authenticate()`` is not ca
and is automatically retried once on a 401 response.
``client.healthcheck()`` is available as a lightweight readiness probe that does not consume credentials.

For a DataMasque instance hosted behind Snowflake SPCS (Snowpark Container Services) app ingress
(a ``*.snowflakecomputing.app`` ``base_url``),
pass a Snowflake Programmatic Access Token as ``spcs_pat`` on ``DataMasqueInstanceConfig``;
the client sends it on the ``X-SF-SPCS-Authorization`` header to clear the Snowflake gateway,

Collaborator

more implementation detail that doesn't need to be in the README, if even present at all

which strips it before forwarding so your DataMasque auth is unaffected.
See the `usage docs <https://datamasque-python.readthedocs.io/en/latest/usage.html>`_ for details.

Error handling
==============

@@ -60,6 +67,9 @@ All methods raise subclasses of ``DataMasqueException`` on failure:
raised by ``start_masking_run`` when the server rejects the run.
- ``DataMasqueUserError`` —
raised by user-management methods when the input is invalid.
- ``SpcsGatewayAuthError`` —

Collaborator

Not to say the snowflake stuff isn't valuable, but is this "worthy" (mainly, frequently-used) enough of being included in top level README?

raised when a Snowflake SPCS app gateway rejects the configured ``spcs_pat``
before the request reaches DataMasque.

Documentation
=============
3 changes: 3 additions & 0 deletions datamasque/client/base.py
@@ -21,6 +21,7 @@
DataMasqueTransportError,
)
from datamasque.client.models.dm_instance import DataMasqueInstanceConfig
from datamasque.client.spcs import install_spcs_gateway_auth

logger = logging.getLogger(__name__)

@@ -137,6 +138,8 @@ def __init__(self, connection_config: DataMasqueInstanceConfig) -> None:
        self.verify_ssl = connection_config.verify_ssl
        self.token_source = connection_config.token_source
        self._session = _build_session(self.verify_ssl)
        if connection_config.spcs_pat:
            install_spcs_gateway_auth(self._session, connection_config.spcs_pat)

    @contextmanager
    def _maybe_suppress_insecure_warning(self) -> Iterator[None]:
16 changes: 16 additions & 0 deletions datamasque/client/exceptions.py
@@ -84,6 +84,22 @@ class IfmAuthError(DataMasqueIfmError):
    """Raised when the IFM client cannot obtain or refresh a JWT (e.g. invalid credentials, missing scope)."""


class SpcsGatewayAuthError(DataMasqueException):
    """
    Raised when a Snowflake SPCS app gateway rejects the configured ``spcs_pat``.

Collaborator

Sembr.

Good comment explaining why this doesn't inherit from DataMasqueApiError.


    Only relevant when the client is configured with ``spcs_pat`` for an
    instance behind Snowflake SPCS app ingress. The message includes the
    Snowflake-provided detail, request id, and a hint at the likely cause
    (e.g. an expired PAT or a network policy that excludes your IP).

    Deliberately a direct subclass of `DataMasqueException` rather than
    `DataMasqueApiError`: the client's 401 re-authenticate-and-retry path keys
    off `DataMasqueApiError`/HTTP status, so keeping this outside that subtree
    ensures a gateway rejection aborts immediately instead of looping.
    """

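The design choice described in the docstring above can be illustrated with a minimal, self-contained sketch. The three exception classes are stand-ins that reuse only the names from this diff, and `call_with_reauth` is a hypothetical helper written for this illustration; it is an assumption about the general shape of a re-authenticate-and-retry path, not the client's actual code.

```python
class DataMasqueException(Exception):
    """Stand-in for the library's base exception."""


class DataMasqueApiError(DataMasqueException):
    """Stand-in: carries the HTTP status that a retry path would inspect."""

    def __init__(self, status_code: int) -> None:
        super().__init__(f"HTTP {status_code}")
        self.status_code = status_code


class SpcsGatewayAuthError(DataMasqueException):
    """Stand-in: a sibling of DataMasqueApiError, not a subclass of it."""


def call_with_reauth(request, authenticate):
    """Hypothetical helper: re-authenticate once when the API answers 401."""
    try:
        return request()
    except DataMasqueApiError as exc:
        if exc.status_code != 401:
            raise
        authenticate()
        return request()


def demo() -> int:
    """Count re-authentication attempts triggered by a gateway rejection."""
    auth_calls = []

    def rejected_by_gateway():
        raise SpcsGatewayAuthError("gateway rejected the PAT")

    try:
        call_with_reauth(rejected_by_gateway, lambda: auth_calls.append(1))
    except SpcsGatewayAuthError:
        pass  # Propagates straight to the caller; the retry branch never runs.
    return len(auth_calls)
```

Because the `except` clause only matches `DataMasqueApiError`, the gateway error skips the retry branch entirely, which is the behaviour the docstring says the class hierarchy is meant to guarantee.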

class RunNotCancellableError(DataMasqueUserError):
    """
    Raised when `cancel_run` is called against a run that is no longer eligible for cancellation.
1 change: 1 addition & 0 deletions datamasque/client/models/connection.py
@@ -55,6 +55,7 @@ class SnowflakeStageLocation(str, Enum):
    local = "local"  # Not supported for production use
    aws_s3 = "aws_s3"
    azure_blob_storage = "azure_blob_storage"
    spcs = "spcs"  # DataMasque running inside Snowflake SPCS; staged on the container's own storage


class SseSelection(Enum):
18 changes: 18 additions & 0 deletions datamasque/client/models/dm_instance.py
@@ -20,6 +20,14 @@ class DataMasqueInstanceConfig(BaseModel):
    the client prepends it with `Token ` when sending the `Authorization` header.
    The client calls `token_source` on each authentication attempt,
    so the callable is free to fetch and refresh tokens out-of-band (e.g. from a secrets manager).

    `spcs_pat` is an optional Snowflake Programmatic Access Token for reaching a

Collaborator

tell your claude to look at CONTRIBUTING.rst, where you'll see guidelines for semantic breaking.

Do we need to talk about the implementation details?

    DataMasque instance hosted behind Snowflake SPCS (Snowpark Container Services)
    app ingress, where `base_url` ends in `.snowflakecomputing.app`. It is sent on
    every request via the `X-SF-SPCS-Authorization` header to clear the Snowflake
    gateway, which strips it before forwarding — so it is independent of, and
    layers underneath, whichever DataMasque auth method (`password` or
    `token_source`) you choose.
    """

    model_config = ConfigDict(arbitrary_types_allowed=True)
@@ -29,6 +37,16 @@ class DataMasqueInstanceConfig(BaseModel):
    password: Optional[str] = None
    verify_ssl: bool = True
    token_source: Optional[Callable[[], str]] = None
    spcs_pat: Optional[str] = None
    """Snowflake Programmatic Access Token for a DataMasque instance hosted behind

Collaborator

D213 and sembr. Ruff should have yelled at you.

Same thoughts re implementation details may be unnecessary.

Mint is casual language - perhaps Create

    Snowflake SPCS app ingress (a ``*.snowflakecomputing.app`` ``base_url``).

    Mint the PAT in Snowsight (User profile → Programmatic access tokens) for an
    account that can reach the SPCS app. The client sends it on the
    ``X-SF-SPCS-Authorization`` header so the Snowflake gateway lets the request
    through to DataMasque; the gateway strips the header before forwarding, leaving
    DataMasque's own ``Authorization`` flow untouched. Leave it unset for
    instances that are not behind an SPCS gateway."""

    @model_validator(mode="after")
    def _validate_auth_source(self) -> "DataMasqueInstanceConfig":
176 changes: 176 additions & 0 deletions datamasque/client/spcs.py
@@ -0,0 +1,176 @@
"""
Snowflake SPCS app gateway authentication for :class:`DataMasqueClient`.

Collaborator

Guess what two things I'm going to say again.


When a DataMasque instance is hosted behind Snowflake SPCS (Snowpark Container
Services) app ingress (``*.snowflakecomputing.app``), every request must first
clear the Snowflake gateway. We authenticate to the gateway with a Programmatic
Access Token (PAT) sent on ``X-SF-SPCS-Authorization: Snowflake Token="<PAT>"``.
The gateway accepts the PAT on this alternate header and strips it before
forwarding to the container, so DataMasque's own ``Authorization: Token <key>``
flow rides through untouched.

:func:`install_spcs_gateway_auth` attaches this behaviour to a client's
``requests.Session``: it sets the header on the session (so it is sent on every
request, including the unauthenticated login) and registers a response hook that
turns a gateway-originated rejection into a clear :class:`SpcsGatewayAuthError`.
"""

import re
from typing import Any, Optional

import requests

from datamasque.client.exceptions import SpcsGatewayAuthError

SPCS_GATEWAY_AUTH_HEADER = "X-SF-SPCS-Authorization"

# Body-shape discriminators for SPCS gateway error responses.
# The gateway emits JSON with `responseType` (ERROR_<UPPER_SNAKE>), `requestId`
# (canonical UUID), and `detail` (free text). All three must be present and
# match these patterns for the body to count as gateway-originated.
_GATEWAY_RESPONSE_TYPE_RE = re.compile(r"^ERROR_[A-Z][A-Z0-9_]+$")
_UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)

# Header-shape discriminators for "this response transited a Snowflake SPCS
# gateway". The Server header and the `sfc-ss-` cookie name prefix both appear
# on every gateway-handled response (success and error alike) and aren't
# plausible to spoof by accident.
_SPCS_GATEWAY_SERVER_VALUE = "_"
_SPCS_COOKIE_PREFIX = "sfc-ss-"


def _has_spcs_gateway_header_signature(response: requests.Response) -> bool:
    """
    True if at least one header-level Snowflake gateway marker is present.

    Looks for either ``Server: _`` (the gateway's literal Server header value)

Collaborator

sembr again... and the comment

I'm not going to comment any more individual instances, have your claude fix them all up please

    or any ``Set-Cookie`` carrying the ``sfc-ss-`` cookie name prefix. Either is
    sufficient — both indicate the response was emitted by, or transited,
    Snowflake's SPCS ingress.
    """
    if response.headers.get("server", "").strip() == _SPCS_GATEWAY_SERVER_VALUE:
        return True
    # `Set-Cookie` may appear multiple times; `requests` flattens duplicates
    # via a comma-separated value in `.headers`, but our prefix substring
    # check is order- and count-insensitive.
    if _SPCS_COOKIE_PREFIX in response.headers.get("set-cookie", ""):
        return True
    return False


def _is_spcs_gateway_error_body(response: requests.Response) -> Optional[dict]:
    """
    Return the parsed body iff it is a structurally-valid gateway error.

    All four conditions must hold:
      1. The body parses as JSON and is a dict.
      2. Keys ``responseType``, ``requestId``, ``detail`` are all present and string-typed.
      3. ``responseType`` matches ``^ERROR_<UPPER_SNAKE>$``.
      4. ``requestId`` is a canonical 8-4-4-4-12 UUID.

    Returns the parsed dict (truthy) on match, ``None`` on miss.
    """
    try:
        data = response.json()
    except ValueError:
        return None
    if not isinstance(data, dict):
        return None
    response_type = data.get("responseType")
    request_id = data.get("requestId")
    detail = data.get("detail")
    if not (isinstance(response_type, str) and isinstance(request_id, str) and isinstance(detail, str)):
        return None
    if not _GATEWAY_RESPONSE_TYPE_RE.match(response_type):
        return None
    if not _UUID_RE.match(request_id):
        return None
Comment on lines +85 to +90

Collaborator

Maybe combine into a single if without not (flip the order of the returns)?

    return data
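The reviewer's suggestion to fold the guards into a single positive condition could look roughly like the sketch below. It works on an already-parsed body rather than a `requests.Response` so that it runs standalone; the function name `gateway_error_body` is hypothetical, and the two regular expressions are copied from the diff.

```python
import re
from typing import Any, Optional

_GATEWAY_RESPONSE_TYPE_RE = re.compile(r"^ERROR_[A-Z][A-Z0-9_]+$")
_UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def gateway_error_body(data: Any) -> Optional[dict]:
    """Return ``data`` if it has the gateway error shape, else ``None``."""
    if not isinstance(data, dict):
        return None
    response_type = data.get("responseType")
    request_id = data.get("requestId")
    detail = data.get("detail")
    # One positive condition: `and` short-circuits, so the regexes are only
    # applied once the values are known to be strings.
    if (
        isinstance(response_type, str)
        and isinstance(request_id, str)
        and isinstance(detail, str)
        and _GATEWAY_RESPONSE_TYPE_RE.match(response_type)
        and _UUID_RE.match(request_id)
    ):
        return data
    return None
```

The behaviour is intended to match the three separate `if not ...: return None` guards; whether the combined form reads better is a style call for the author and reviewer.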


def _hint_for_gateway_detail(detail: str) -> str:
    """Map common Snowflake gateway ``detail`` strings to a one-line cause hint."""
    d = (detail or "").lower()
    if "network policy" in d:
        return (
            "PAT requires a network policy attached to the user (or account) "
            "that permits your current public IP. Run `CREATE NETWORK POLICY "
            "... ALLOWED_IP_LIST = ('<your.ip>/32')` and `ALTER USER <pat-user> "
            "SET NETWORK_POLICY = <policy>`."
        )
    if "invalid" in d and "token" in d:
        return (
            "PAT is malformed, expired, or revoked. Re-mint a PAT in Snowsight "
            "(User profile -> Programmatic access tokens) and update `spcs_pat`."
        )
    if "expired" in d:
        return "PAT has expired. Mint a fresh one in Snowsight and update `spcs_pat`."
    if "authentication" in d or "unauthorized" in d:
        return (
            "Generic auth rejection. Verify the PAT was minted by a user that "
            "has access to this SPCS app, and that any account-level network "
            "policy includes your current public IP."
        )
    return "Unknown gateway rejection — see the Snowflake `detail` string above and the Snowflake PAT docs."


def _check_spcs_gateway_response(response: requests.Response) -> None:
    """
    Raise :class:`SpcsGatewayAuthError` iff ``response`` is a gateway-originated rejection.

    Two-layer discriminator — both must hold:
    * **Body originated at the gateway**: strict shape match on the JSON body
      (multiple fields, typed, with format constraints) via
      :func:`_is_spcs_gateway_error_body`.
    * **Response transited an SPCS gateway**: header signature confirms via
      :func:`_has_spcs_gateway_header_signature`.

    Either layer alone could in principle false-positive on an unrelated
    upstream that happened to emit one of those signals; the conjunction is what
    makes the check robust. Legitimate DataMasque 401s (DRF ``{"detail": "..."}``)
    have the gateway header signature but fail the body shape — so they correctly
    flow through to the client's normal re-auth-and-retry path untouched.
    """
    if response.status_code not in (401, 403):
        return
    if not _has_spcs_gateway_header_signature(response):
        return
    data = _is_spcs_gateway_error_body(response)
    if data is None:
        return

    response_type = data["responseType"]
    request_id = data["requestId"]
    detail = data["detail"]
    hint = _hint_for_gateway_detail(detail)
    raise SpcsGatewayAuthError(
        f"SPCS gateway rejected the PAT (HTTP {response.status_code}, "
        f"{response_type}). The request never reached DataMasque.\n"
        f"  Snowflake said:  {detail!r}\n"
        f"  Snowflake reqId: {request_id}\n"
        f"  Likely cause:    {hint}\n"
        f"  Fix in Snowsight on the account hosting this SPCS app, then retry."
    )
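To make the two-layer check above concrete, here is a standalone sketch that applies the same conjunction to plain dictionaries instead of `requests.Response` objects. The helper names (`has_gateway_headers`, `has_gateway_body`, `is_gateway_rejection`) and the sample payloads are made up for illustration; only the header markers, field names, and patterns come from the diff.

```python
import re

_RESPONSE_TYPE_RE = re.compile(r"^ERROR_[A-Z][A-Z0-9_]+$")
_UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$",
    re.IGNORECASE,
)


def has_gateway_headers(headers: dict) -> bool:
    """Header layer: ``Server: _`` or a cookie with the ``sfc-ss-`` prefix."""
    lowered = {key.lower(): value for key, value in headers.items()}
    if lowered.get("server", "").strip() == "_":
        return True
    return "sfc-ss-" in lowered.get("set-cookie", "")


def has_gateway_body(body: object) -> bool:
    """Body layer: three string fields, two of them format-constrained."""
    if not isinstance(body, dict):
        return False
    fields = [body.get(key) for key in ("responseType", "requestId", "detail")]
    if not all(isinstance(field, str) for field in fields):
        return False
    return bool(_RESPONSE_TYPE_RE.match(fields[0]) and _UUID_RE.match(fields[1]))


def is_gateway_rejection(status: int, headers: dict, body: object) -> bool:
    """Both layers must agree, and only for 401/403 responses."""
    return status in (401, 403) and has_gateway_headers(headers) and has_gateway_body(body)


# Made-up sample data, for illustration only.
GATEWAY_HEADERS = {"Server": "_"}
DRF_401_BODY = {"detail": "Invalid token."}
GATEWAY_401_BODY = {
    "responseType": "ERROR_EXAMPLE",
    "requestId": "123e4567-e89b-12d3-a456-426614174000",
    "detail": "example rejection",
}
```

With these samples, a DataMasque-style 401 that passed through the gateway carries the header marker but not the body shape, so `is_gateway_rejection(401, GATEWAY_HEADERS, DRF_401_BODY)` is false and the normal re-auth path would still apply.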


def _spcs_gateway_response_hook(response: requests.Response, *args: Any, **kwargs: Any) -> None:
    """``requests`` response hook: raise on a gateway-originated auth rejection."""
    _check_spcs_gateway_response(response)


def install_spcs_gateway_auth(session: requests.Session, pat: str) -> None:
    """
    Configure ``session`` to authenticate to a Snowflake SPCS app gateway.

    Sets the ``X-SF-SPCS-Authorization`` header on the session (so it rides on
    every request, including the unauthenticated login) and registers a response
    hook that raises :class:`SpcsGatewayAuthError` on a gateway rejection.

    Scoping is automatic: the client's session only ever talks to its own
    ``base_url``, so there is no need to match per-request hosts.
    """
    session.headers[SPCS_GATEWAY_AUTH_HEADER] = f'Snowflake Token="{pat}"'
    session.hooks["response"].append(_spcs_gateway_response_hook)
8 changes: 8 additions & 0 deletions docs/client.rst
@@ -60,6 +60,14 @@ datamasque.client.files module
    :undoc-members:
    :show-inheritance:

datamasque.client.spcs module
-----------------------------

.. automodule:: datamasque.client.spcs
    :members:
    :undoc-members:
    :show-inheritance:

datamasque.client.ifm module
----------------------------

28 changes: 28 additions & 0 deletions docs/usage.rst
@@ -19,3 +19,31 @@ To use DataMasque Python in a project:

    for connection in client.list_connections():
        print(connection.name)

Connecting to an SPCS-hosted instance
=====================================

When DataMasque is hosted behind Snowflake SPCS (Snowpark Container Services)

Collaborator

yep, sembr applies to docs as well

app ingress, its ``base_url`` ends in ``.snowflakecomputing.app`` and every
request must first clear the Snowflake gateway. Pass a Snowflake Programmatic
Access Token (PAT) as ``spcs_pat`` and the client sends it on the
``X-SF-SPCS-Authorization`` header automatically; the gateway strips that header
before forwarding, so your DataMasque ``username``/``password`` (or
``token_source``) auth is unaffected.

.. code-block:: python

    config = DataMasqueInstanceConfig(
        base_url="https://my-app.snowflakecomputing.app",
        username="api_user",
        password="api_password",
        spcs_pat="<snowflake-programmatic-access-token>",
    )
    client = DataMasqueClient(config)
    client.authenticate()

Mint the PAT in Snowsight (User profile → Programmatic access tokens) for an

Collaborator

s/Mint/Create/

account that can reach the SPCS app. If the gateway rejects the PAT (for example
it has expired, or a network policy excludes your IP), the client raises
``SpcsGatewayAuthError`` with the Snowflake-provided detail and a hint at the
likely cause.
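
A caller might want to treat a gateway rejection differently from other library errors. The sketch below shows one way to do that, under the assumption stated in the README that all client errors derive from ``DataMasqueException``. The two classes are local stand-ins so the snippet runs without the library installed, and ``describe_failure`` is a hypothetical helper, not part of the client.

```python
class DataMasqueException(Exception):
    """Stand-in for datamasque.client.exceptions.DataMasqueException."""


class SpcsGatewayAuthError(DataMasqueException):
    """Stand-in for the exception added in this change."""


def describe_failure(action) -> str:
    """Run ``action`` and classify any failure for the caller."""
    try:
        action()
    except SpcsGatewayAuthError as exc:
        # Listed first because it is the more specific class: the request
        # was stopped at the Snowflake gateway, so the PAT needs attention.
        return f"fix the Snowflake PAT: {exc}"
    except DataMasqueException as exc:
        # Any other error raised by the library.
        return f"DataMasque error: {exc}"
    return "ok"
```

In real use, ``action`` would be something like ``client.authenticate`` from the example above, with the exception classes imported from ``datamasque.client.exceptions``.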
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[project]
name = "datamasque-python"
version = "1.1.1"
version = "1.2.0"

Collaborator

use 1.1.3

description = "Official Python client for the DataMasque data-masking API."
authors = [
{ name = "DataMasque Ltd" },
30 changes: 30 additions & 0 deletions tests/test_connections.py
@@ -951,6 +951,36 @@ def test_snowflake_connection_model_validate_with_stage_location():
    assert conn.password is None


def test_snowflake_connection_model_validate_with_spcs_stage_location():
    """
    A Snowflake connection staged in SPCS must deserialise (regression for ui-testing MR !185).

    When DataMasque runs inside Snowflake SPCS it saves connections with
    `snowflake_stage_location=spcs`. Listing connections deserialises every
    one, so an unknown stage value used to raise `ValidationError` and break
    `create_or_update_connection` for unrelated connections on a shared instance.
    """
    payload = {
        "id": "a1b2c3d4-0000-0000-0000-000000000000",
        "name": "snowflake_spcs",
        "mask_type": "database",
        "db_type": "snowflake",
        "user": "snowman",
        "database": "icicle",
        "snowflake_account_id": "ABCDEF-123456",
        "snowflake_warehouse": "warehouse1",
        "snowflake_storage_integration_name": "mysi",
        "snowflake_stage_location": "spcs",
    }

    conn = SnowflakeConnectionConfig.model_validate(payload)

    assert conn.snowflake_stage_location is SnowflakeStageLocation.spcs
    # SPCS staging carries no external-storage fields.
    assert conn.s3_bucket_name is None
    assert conn.snowflake_azure_container_name is None
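
The failure this regression test guards against can be reproduced in miniature with plain `enum` lookups. The sketch below uses two stand-in enums (hypothetical names `StageLocationBefore` and `StageLocationAfter`) rather than the library's model; a bare `Enum` lookup raises `ValueError`, whereas per the changelog the library surfaced the same problem as a `ValidationError`.

```python
from enum import Enum


class StageLocationBefore(str, Enum):
    """Stand-in for the enum before this change (no ``spcs`` member)."""

    local = "local"
    aws_s3 = "aws_s3"
    azure_blob_storage = "azure_blob_storage"


class StageLocationAfter(str, Enum):
    """Stand-in for the enum after this change."""

    local = "local"
    aws_s3 = "aws_s3"
    azure_blob_storage = "azure_blob_storage"
    spcs = "spcs"


def parse(enum_cls, value: str):
    """Look up a member by value; return ``None`` for unknown values."""
    try:
        return enum_cls(value)
    except ValueError:
        return None
```

Here `parse(StageLocationBefore, "spcs")` yields `None`, while `parse(StageLocationAfter, "spcs")` yields the new member, which mirrors why adding one enum value is enough to fix deserialisation.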


def test_snowflake_connection_model_validate_without_stage_location():
    payload = {
        "id": "id-3",