Wire tier-2a Langfuse trace-shape fixtures #185

Merged: chris-colinsky merged 2 commits into main from chore/fixture-harness-tier-2a-langfuse on Jun 24, 2026
@chris-colinsky

Summary

Second tier of the conformance-harness fixture catch-up: wire the trace-shape
Langfuse observability fixtures into the YAML harness, driven through a
LangfuseObserver backed by the in-memory recorder. Test-only, no library
change, no pin bump. (The two Langfuse Generation fixtures, 023/024, follow in
tier 2b.)

Wired (6)

Moved from _UNIT_TESTED_FIXTURES to _SUPPORTED_FIXTURES:

  • 022 / 031 / 032: the Langfuse Trace plus its observation tree (basic linear,
    subgraph hierarchy, fan-out per-instance)
  • 035 / 036: the caller-invocation-id to trace.id derivation (UUID hex
    dashes-stripped, and sha256 for a non-UUID)
  • 059: the implementation-attribution rows on the trace metadata
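
For readers unfamiliar with the 035/036 derivation, here is a minimal sketch of the rule as described above. It is not the library's code: the function name `derive_trace_id` is hypothetical, and truncating the sha256 digest to 32 hex characters (to fit an OTel-sized trace id) is an assumption this PR does not state.

```python
import hashlib
import uuid


def derive_trace_id(invocation_id: str) -> str:
    """Hypothetical sketch of the caller-invocation-id to trace.id rule."""
    try:
        # UUID input: the hex form is the UUID with its dashes stripped.
        return uuid.UUID(invocation_id).hex
    except ValueError:
        # Non-UUID input: hash with sha256.
        # ASSUMPTION: keep the first 32 hex chars to match a 128-bit trace id.
        digest = hashlib.sha256(invocation_id.encode("utf-8")).hexdigest()
        return digest[:32]
```

Both branches are deterministic, so the same caller-supplied id always maps to the same trace id.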

Harness machinery added

  • _run_langfuse_trace_fixture: build a graph via the adapter, record into an
    InMemoryLangfuseClient, and assert the Trace id / name / metadata plus the
    observation tree.
  • A value matcher covering the placeholder tokens (<uuid-hex>, <any-string>,
    and <corr_id_N> first-occurrence binding for the correlation-id-consistency
    check) and the assertion sub-key matchers (harness_parameterized,
    non_empty_string).
  • _assert_langfuse_observation_tree gains an opt-in matcher path; the existing
    tool-fixture caller keeps its exact-match behavior.
  • _run_invocation_id_fixture for 035/036.
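
The placeholder matching described above can be sketched roughly as follows. This is an illustration, not the harness code: `match_value` and the regexes are hypothetical names, and treating `<uuid-hex>` as exactly 32 lowercase hex characters is an assumption.

```python
import re

# ASSUMPTION: <uuid-hex> means 32 lowercase hex characters.
_UUID_HEX = re.compile(r"[0-9a-f]{32}")
_CORR_ID = re.compile(r"<corr_id_\d+>")


def match_value(expected, actual, bindings: dict[str, str]) -> bool:
    """Hypothetical matcher: placeholders first, exact equality otherwise."""
    if expected == "<any-string>":
        return isinstance(actual, str)
    if expected == "<uuid-hex>":
        return isinstance(actual, str) and _UUID_HEX.fullmatch(actual) is not None
    if isinstance(expected, str) and _CORR_ID.fullmatch(expected):
        if not isinstance(actual, str):
            return False
        # First occurrence binds the token; later occurrences must agree.
        return bindings.setdefault(expected, actual) == actual
    return expected == actual
```

Sharing one `bindings` dict across a whole observation tree is what turns `<corr_id_N>` into a consistency check: every node carrying the same token has to carry the same concrete value.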

A note on trace.id

The fixtures assert the derived Langfuse trace id, while the in-memory recorder
keys traces by the raw OpenArmature invocation_id (the real SDK adapter derives
the OTel id via _to_otel_trace_id). The harness bridges by running the raw id
through the implementation's own langfuse_trace_id, which is exactly that
helper's purpose. The in-memory double storing the raw id rather than the
derived one is a minor fidelity gap in that public testing utility, tracked as a
separate follow-up.

Testing

  • tests/conformance/test_observability.py: 70 passed, 42 skipped.
  • Full tests/: 1462 passed, 408 skipped.
  • ruff and pyright clean.

Move six fixtures (022/031/032 Langfuse trace + observation tree,
035/036 caller-invocation-id derivation, 059 implementation attribution)
from _UNIT_TESTED_FIXTURES into _SUPPORTED_FIXTURES, driven through a
LangfuseObserver + InMemoryLangfuseClient recorder. Second tier of the
fixture-harness catch-up; test-only, no library change, no pin bump.

Adds a Langfuse-trace runner plus a value-matcher for the placeholder
tokens (<uuid-hex>, <any-string>, <corr_id_N> first-occurrence binding)
and the assertion sub-key matchers (harness_parameterized,
non_empty_string), and an invocation-id runner. The fixture trace.id is
the derived Langfuse id, so the harness bridges the recorder's raw
invocation_id through the impl's own langfuse_trace_id. No deferrals;
023/024 (Langfuse Generation) are tier 2b.
Copilot AI review requested due to automatic review settings June 24, 2026 15:57

Copilot AI left a comment

Pull request overview

Wires the tier-2a Langfuse “trace-shape” conformance fixtures into the YAML observability harness by adding Langfuse-specific runners and matchers so these fixtures can be executed end-to-end via LangfuseObserver + InMemoryLangfuseClient.

Changes:

  • Adds YAML-harness drivers for Langfuse trace-shape fixtures (022/031/032/059) and invocation-id derivation fixtures (035/036).
  • Introduces a Langfuse value-matcher to support placeholder tokens and assertion sub-key matchers used by these fixtures.
  • Extends _assert_langfuse_observation_tree to optionally use the matcher path while preserving exact-match behavior for existing tool fixtures.

Comment thread tests/conformance/test_observability.py
Comment thread tests/conformance/test_observability.py
Comment thread tests/conformance/test_observability.py Outdated
Address review feedback on the tier-2a fixture wiring:

- _run_langfuse_trace_case wraps invoke/drain in try/finally so the
  observer is always shut down, even if the graph raises.
- _run_invocation_id_case now holds the LangfuseObserver reference and
  shuts it down in finally, matching the other Langfuse runners.
- _assert_langfuse_observation_tree enables the value-matcher only when
  both bindings and params are provided (and, not or), so a partial call
  degrades to exact match instead of half-enabling the matcher.
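
The two patterns in this follow-up commit can be sketched as below. The names (`FakeObserver`, `run_case`, `matcher_enabled`) are hypothetical stand-ins, and the `shutdown()` method on the observer is an assumption for illustration rather than a confirmed API.

```python
class FakeObserver:
    """Hypothetical stand-in for an observer with a shutdown hook."""

    def __init__(self) -> None:
        self.shut_down = False

    def shutdown(self) -> None:
        self.shut_down = True


def run_case(observer, invoke):
    """Run the graph; shut the observer down even if invoke() raises."""
    try:
        return invoke()
    finally:
        observer.shutdown()


def matcher_enabled(bindings, params) -> bool:
    """Enable the value-matcher only when both arguments are supplied."""
    # 'and', not 'or': a partial call falls back to exact match.
    return bindings is not None and params is not None
```

With `or`, a caller passing only `bindings` would take the matcher path without the parameters it needs, which is the half-enabled state the review flagged.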
@chris-colinsky chris-colinsky merged commit 9310fad into main Jun 24, 2026
5 checks passed
@chris-colinsky chris-colinsky deleted the chore/fixture-harness-tier-2a-langfuse branch June 24, 2026 19:08