
feat: add AIConfigTracker, metrics & resumption tokens (AIC-2664) #174

Open
ctawiah wants to merge 1 commit into feat/AIC-2663/ai-sdk-client from feat/ai-sdk-tracker
@ctawiah ctawiah commented Jun 11, 2026

Requirements

  • I have added test coverage for new or changed functionality
  • I have followed the repository's pull request submission guidelines
  • I have validated my changes against all supported platform versions

Related issues

AIC-2664 — Step 4: AITRACK. Stacked on #173 (Step 3).

Describe the solution you've provided

Implements the AITRACK surface to spec, thread-safe by construction:

  • LDAIConfigTracker gains the full method set: trackDuration, trackTimeToFirstToken, trackSuccess/trackError, trackFeedback, trackTokens, trackToolCall(s), trackJudgeResult, plus the trackDurationOf and trackMetricsOf wrappers, getTrackData(), getResumptionToken(), and getSummary(). Event names match the spec and the JS/Python SDKs.
  • LDAIConfigTrackerImpl (internal): per-run UUID runId; record-once metrics use atomic claim-before-emit (AtomicBoolean) so exactly one event is emitted under concurrency (trackSuccess/trackError share one guard). Tool-call and judge-result events are not once-only; tool calls accumulate in a CopyOnWriteArrayList and getSummary().getToolCalls() returns an immutable snapshot.
  • Validation: negative durations and token counts are clamped to zero; individual token counts emit only when positive; null args are guarded; a null judge score is treated as "no score" and is distinct from a legitimate 0.0. trackMetricsOf records an error and rethrows on both operation and extractor failures.
  • Resumption tokens (ResumptionTokens): URL-safe Base64 (no padding) of canonical JSON in fixed key order runId, configKey, variationKey, version, graphKey. variationKey is always emitted for cross-SDK parity; modelName/providerName are not carried (restored trackers report ""). Decoding strictly type-validates each field and rejects malformed or oversized (>4 KB) tokens.
  • Wiring: LDAIClientImpl now produces real trackers from createTracker() (a fresh runId per call) and adds LDAIClient.createTracker(token, context) to reconstruct a run across process boundaries. The placeholder NoOpAIConfigTracker is removed.
  • The new tracking value types are consolidated in LDAITrackingTypes (mirroring the LDAIConfigTypes pattern).
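
The claim-before-emit guard and the tool-call snapshot described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the code in this PR: the class `OnceOnlyTrackerSketch`, its in-memory `emitted` list (a stand-in for the real event call), and the `runConcurrently` helper are all hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration only; this is not the SDK's LDAIConfigTrackerImpl.
final class OnceOnlyTrackerSketch {
    // One guard shared by success and error, so a run reports a single outcome.
    private final AtomicBoolean outcomeClaimed = new AtomicBoolean(false);
    private final List<String> emitted = new CopyOnWriteArrayList<>();
    private final List<String> toolCalls = new CopyOnWriteArrayList<>();

    void trackSuccess() { emitOutcomeOnce("success"); }

    void trackError() { emitOutcomeOnce("error"); }

    private void emitOutcomeOnce(String event) {
        // Claim before emit: only the thread that flips false -> true emits.
        if (outcomeClaimed.compareAndSet(false, true)) {
            emitted.add(event); // stand-in for sending the real event
        }
    }

    // Tool calls are not once-only; every non-null call is recorded.
    void trackToolCall(String toolName) {
        if (toolName != null) {
            toolCalls.add(toolName);
        }
    }

    List<String> emittedEvents() {
        return Collections.unmodifiableList(new ArrayList<>(emitted));
    }

    // Immutable copy; later tool calls do not change a snapshot already handed out.
    List<String> toolCallsSnapshot() {
        return Collections.unmodifiableList(new ArrayList<>(toolCalls));
    }

    // Races `threads` workers against one tracker and returns it for inspection.
    static OnceOnlyTrackerSketch runConcurrently(int threads) {
        OnceOnlyTrackerSketch tracker = new OnceOnlyTrackerSketch();
        CountDownLatch start = new CountDownLatch(1);
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            final int id = i;
            Thread t = new Thread(() -> {
                try {
                    start.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                if (id % 2 == 0) {
                    tracker.trackSuccess();
                } else {
                    tracker.trackError();
                }
                tracker.trackToolCall("tool-" + id);
            });
            workers.add(t);
            t.start();
        }
        start.countDown();
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return tracker;
    }

    public static void main(String[] args) {
        OnceOnlyTrackerSketch tracker = runConcurrently(8);
        System.out.println(tracker.emittedEvents().size() + " outcome event(s), "
            + tracker.toolCallsSnapshot().size() + " tool call(s)");
    }
}
```

Whichever thread wins the compare-and-set decides whether the run is recorded as a success or an error; every other caller is silently dropped, which is the at-most-once behavior the concurrency tests assert.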

Tests

  • Resumption token: byte-compatible with fixed fixtures, round-trips, strict decode (missing/mistyped/oversized rejected), escaping.
  • Tracker: per-event semantics, at-most-once, clamping, judge-result zero-vs-null, trackMetricsOf rethrow + error, summary, and concurrency (N threads → exactly one once-only event, intact tool-call list).
  • Client wiring: tracker carries variation/model metadata, token reconstruction shares runId, each createTracker() starts a new run.
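
For readers unfamiliar with the token format, the round-trip property exercised by the resumption token tests can be sketched like this. It is an illustration under stated assumptions rather than the PR's `ResumptionTokens` class: the class and method names are hypothetical, `version` is assumed to be an integer, `graphKey` is assumed to be omitted when null, a null `variationKey` is assumed to serialize as an empty string, and the 4 KB limit is assumed to apply to the encoded token length. The strict per-field type validation the PR describes is left out because it needs a JSON parser.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration only; this is not the SDK's ResumptionTokens class.
final class ResumptionTokenSketch {
    // Assumption: the limit is applied to the encoded token's character count.
    static final int MAX_TOKEN_CHARS = 4096;

    // Builds JSON by hand so the key order is fixed:
    // runId, configKey, variationKey, version, graphKey.
    static String canonicalJson(String runId, String configKey, String variationKey,
                                int version, String graphKey) {
        StringBuilder sb = new StringBuilder("{");
        sb.append("\"runId\":").append(quote(runId));
        sb.append(",\"configKey\":").append(quote(configKey));
        // variationKey is always emitted; empty string for null is an assumption.
        sb.append(",\"variationKey\":").append(quote(variationKey == null ? "" : variationKey));
        sb.append(",\"version\":").append(version);
        if (graphKey != null) { // assumption: graphKey is optional
            sb.append(",\"graphKey\":").append(quote(graphKey));
        }
        return sb.append('}').toString();
    }

    // Minimal JSON string escaping: quotes, backslashes, and control characters.
    private static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (c == '"' || c == '\\') {
                sb.append('\\').append(c);
            } else if (c < 0x20) {
                sb.append(String.format("\\u%04x", (int) c));
            } else {
                sb.append(c);
            }
        }
        return sb.append('"').toString();
    }

    // URL-safe Base64 without padding.
    static String encode(String json) {
        return Base64.getUrlEncoder().withoutPadding()
            .encodeToString(json.getBytes(StandardCharsets.UTF_8));
    }

    // Rejects null, oversized, or non-Base64 input; returns the JSON text.
    // A real decoder would go on to parse and type-check each field.
    static String decodeToJson(String token) {
        if (token == null || token.length() > MAX_TOKEN_CHARS) {
            throw new IllegalArgumentException("token missing or too large");
        }
        byte[] bytes = Base64.getUrlDecoder().decode(token); // throws on malformed input
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String json = canonicalJson("run-1", "my-config", "var-a", 3, null);
        String token = encode(json);
        System.out.println(token);
        System.out.println(decodeToJson(token).equals(json));
    }
}
```

Building the JSON by hand rather than through a map-based serializer is what keeps the byte output stable, which matters if tokens are compared against fixtures shared with other SDKs.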

Out of scope (per ticket)

No Judge/Evaluator (Step 5), no AIGRAPH createGraphTracker, no provider-specific trackOpenAIMetrics/trackBedrockMetrics (post-1.0).

Made with Cursor


Note

Medium Risk
New public API surface and telemetry behavior on every AI config evaluation; resumption token parsing handles untrusted input with size limits and strict validation.

Overview
Replaces the placeholder no-op LDAIConfigTracker with a full AITRACK implementation that emits AI run metrics through LDClient.trackMetric, keyed by a per-run runId.

LDAIConfigTracker is expanded from a stub to the full API: duration, time-to-first-token, success/error, feedback, tokens, tool calls, judge results, trackDurationOf / trackMetricsOf, getTrackData(), getResumptionToken(), and getSummary(). LDAITrackingTypes adds the public value types (TokenUsage, Metrics, JudgeResult, etc.).

LDAIConfigTrackerImpl sends spec-aligned $ld:ai:* events with shared correlation fields; record-once metrics use atomic guards for thread safety. ResumptionTokens encodes/decodes cross-process resumption (cross-SDK byte fixtures in tests). LDAIClientImpl wires real trackers (new UUID per createTracker() on configs) and adds LDAIClient.createTracker(token, context) for deferred events; NoOpAIConfigTracker is removed.

Reviewed by Cursor Bugbot for commit 19d0f4f. Bugbot is set up for automated code reviews on this repo. Configure here.

@ctawiah ctawiah marked this pull request as ready for review June 11, 2026 02:30
@ctawiah ctawiah requested a review from a team as a code owner June 11, 2026 02:30
* Metrics a caller extracts from an AI run, supplied to
* {@link com.launchdarkly.sdk.server.ai.LDAIConfigTracker#trackMetricsOf}.
*/
public static final class Metrics {

This should be AIMetrics, to be consistent with the other SDKs and the spec.

try {
  metrics = metricsExtractor.apply(result);
} catch (RuntimeException e) {
  trackError();

This one is tricky, and @mattrmc1 might have more to say here. In the other SDKs I did not throw the exception; I logged a warning that metrics could not be tracked and moved on. In general we try to avoid throwing exceptions whenever possible. Matt called out that since this is a user-provided function it makes sense to throw, which I can see being valid as well.

One thing, however: we should not call trackError here, as that indicates the AI call failed, not that extracting metrics from its result failed.
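
To make the distinction concrete, here is a rough sketch of how the two failure paths could be kept apart. It is not the PR's code: `MetricsOfSketch`, its `events` list, the event strings, and the integer token extractor are hypothetical stand-ins. It rethrows on extractor failure, which is only one of the two options discussed above; logging and returning the result would be the other.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;
import java.util.logging.Logger;

// Hypothetical illustration only; names and event strings are stand-ins.
final class MetricsOfSketch {
    private static final Logger LOG = Logger.getLogger(MetricsOfSketch.class.getName());

    // Stand-in for emitted events, kept in memory so the behavior is observable.
    final List<String> events = new ArrayList<>();

    <T> T trackMetricsOf(Supplier<T> operation, Function<T, Integer> tokenExtractor) {
        T result;
        try {
            result = operation.get();
        } catch (RuntimeException e) {
            // The AI call itself failed: this is the case an error event describes.
            events.add("generation:error");
            throw e;
        }
        int tokens;
        try {
            tokens = tokenExtractor.apply(result);
        } catch (RuntimeException e) {
            // The AI call succeeded; only extraction failed. No error event is
            // recorded, so the outcome guard stays free for a later trackSuccess().
            LOG.warning("Could not extract metrics: " + e);
            throw e;
        }
        events.add("tokens:" + Math.max(0, tokens)); // negative counts clamp to zero
        return result;
    }

    public static void main(String[] args) {
        MetricsOfSketch sketch = new MetricsOfSketch();
        String reply = sketch.trackMetricsOf(() -> "hello", r -> r.length());
        System.out.println(reply + " " + sketch.events);
    }
}
```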

agent.getJudgeConfiguration(),
agent.getTools(),
TRACKER_FACTORY);
trackerFactory(key, null, null, agent.getModel(), agent.getProvider(), context));

We still pass the key into trackers for defaults, since that is the key you requested. It just won't have a variation.

@ctawiah ctawiah force-pushed the feat/AIC-2663/ai-sdk-client branch from 3ace063 to dfc1386 Compare June 11, 2026 21:28
…IC-2664)

Implements the AITRACK surface on LDAIConfigTracker: per-run UUID runId and
track data, the full set of track methods (duration, time-to-first-token,
success/error, feedback, tokens, tool calls, judge result) plus trackDurationOf
and trackMetricsOf wrappers, and a metric summary.

Record-once metrics use atomic claim-before-emit guards so exactly one event is
produced under concurrency; tool-call and judge-result events are not once-only.
Negative durations and token counts are clamped, and a null judge score is
distinct from a legitimate 0.0.

Resumption tokens are URL-safe Base64 of canonical JSON in fixed key order
(runId, configKey, variationKey, version, graphKey); variationKey is always
emitted for cross-SDK parity and modelName/providerName are not carried. Decode
strictly type-validates each field and rejects malformed/oversized tokens.

LDAIClientImpl now wires createTracker() on the config types to the real tracker
and adds createTracker(token, context) to reconstruct a run across processes.

Co-authored-by: Cursor <cursoragent@cursor.com>
@ctawiah ctawiah force-pushed the feat/ai-sdk-tracker branch from 19d0f4f to 2ca9fc8 Compare June 11, 2026 21:29

@cursor cursor Bot left a comment

Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.

Reviewed by Cursor Bugbot for commit 2ca9fc8. Configure here.

  metrics = metricsExtractor.apply(result);
} catch (RuntimeException e) {
  trackError();
  throw e;

Extractor failure misreports generation error

Medium Severity

When trackMetricsOf’s user-supplied metrics extractor throws after the AI operation completes, the implementation calls trackError() and emits $ld:ai:generation:error. That event means the generation failed, but the model call already succeeded—only parsing or metric extraction failed—so dashboards can show false generation failures and the shared outcome guard blocks a later correct trackSuccess().
