Relocate Langfuse fixtures to their own runner by chris-colinsky · Pull Request #187 · LunarCommand/openarmature-python

chris-colinsky · 2026-06-24T21:50:33Z

What

The Langfuse-mapping observability conformance fixtures are fixture-tested by the dedicated sibling runner tests/conformance/test_observability_langfuse.py, not by unit tests. The _UNIT_TESTED_FIXTURES bucket in test_observability.py mislabeled them ("covered by test_observability_langfuse.py" referred to that conformance runner, not a unit test), so the in-progress fixture-harness catch-up wrongly wired 022/023/024/031/032 into test_observability.py, duplicating coverage that already existed in the sibling runner, and homed the genuinely-new 035/036/059 there as well.

This PR consolidates all Langfuse fixtures in their proper file.

Changes

test_observability.py: reverted all Langfuse wiring (back to its post-tier-1 state). Added a _LANGFUSE_HARNESS_FIXTURES bucket that the coverage guard counts and the per-fixture dispatcher skips, recording that the sibling runner owns those 12 fixtures. _UNIT_TESTED_FIXTURES now holds only the genuinely-unit-only fixtures.
test_observability_langfuse.py: added 035/036/059 to _LANGFUSE_FIXTURES, with four small runner extensions: caller_invocation_id at invoke (035/036), the langfuse_trace_id derived-id bridge for literal trace ids, 036's raw invocation_id recovered from trace.id, and the harness_parameterized / non_empty_string metadata matcher (059).

Net -322 lines as the duplication comes out.

Test plan

tests/conformance/test_observability.py + test_observability_langfuse.py: 76 passed, 48 skipped (Langfuse fixtures skip in the first, run in the second).
Full tests/: 1459 passed (5 fewer than before, exactly the de-duplicated double-runs).
ruff + pyright clean.

Test-only; no behavior or pin change.

The Langfuse-mapping observability fixtures are fixture-tested by the dedicated sibling runner tests/conformance/test_observability_langfuse.py, not by unit tests. The _UNIT_TESTED_FIXTURES bucket in test_observability.py mislabeled them, so the fixture-harness catch-up wrongly wired 022/023/024/031/032 into test_observability.py -- duplicating coverage that already existed in the sibling runner -- and homed the genuinely-new 035/036/059 there too. Revert all Langfuse wiring from test_observability.py (back to its post-tier-1 state) and add a _LANGFUSE_HARNESS_FIXTURES bucket that the coverage guard counts and the dispatcher skips, recording that the sibling runner owns them. Relocate 035/036/059 into test_observability_langfuse.py, extending its runner for caller_invocation_id, the derived-trace.id bridge, 036's raw invocation_id, and the harness_parameterized / non_empty_string metadata matcher. Test-only; no behavior or pin change.

Copilot

Pull request overview

This PR refactors the conformance test harness to consolidate all Langfuse-mapping observability fixtures under the dedicated Langfuse runner, avoiding duplicated coverage and clarifying fixture ownership between runners.

Changes:

Removed Langfuse fixture execution and assertion helpers from test_observability.py, replacing them with a _LANGFUSE_HARNESS_FIXTURES accounting bucket and an early pytest.skip for those fixture IDs.
Extended test_observability_langfuse.py to run fixtures 035/036/059, including support for caller-supplied invocation_id, derived trace.id bridging, and metadata matcher sub-keys for fixture 059.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
tests/conformance/test_observability.py	Stops running Langfuse fixtures in the OTel-focused harness; accounts for them via `_LANGFUSE_HARNESS_FIXTURES` and skips them during parametrized execution.
tests/conformance/test_observability_langfuse.py	Adds execution + assertions for Langfuse fixtures `035/036/059`, including invoke-time `invocation_id` wiring and new metadata matcher handling.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

CodeQL py/side-effect-in-assert flagged the expected_metadata.pop() inside the assert: under python -O the assertion (and its pop) is stripped, leaving the invocation_id key in expected_metadata for the later _assert_metadata_subset to wrongly check. Bind the pop to a local before the assert, and include the expected value in the failure message.

Copilot AI review requested due to automatic review settings June 24, 2026 21:50

Copilot started reviewing on behalf of chris-colinsky June 24, 2026 21:50 View session

github-code-quality Bot found potential problems Jun 24, 2026

View reviewed changes

Comment thread tests/conformance/test_observability_langfuse.py Fixed

github-advanced-security AI found potential problems Jun 24, 2026

View reviewed changes

Comment thread tests/conformance/test_observability_langfuse.py Fixed

Copilot AI reviewed Jun 24, 2026

View reviewed changes

Comment thread tests/conformance/test_observability_langfuse.py

chris-colinsky merged commit 9f166b9 into main Jun 24, 2026
5 checks passed

chris-colinsky deleted the chore/relocate-langfuse-fixtures branch June 24, 2026 22:07

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Relocate Langfuse fixtures to their own runner#187

Relocate Langfuse fixtures to their own runner#187
chris-colinsky merged 2 commits into
mainfrom
chore/relocate-langfuse-fixtures

chris-colinsky commented Jun 24, 2026

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

chris-colinsky commented Jun 24, 2026

What

Changes

Test plan

Uh oh!

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants