feat: add AIConfigTracker, metrics & resumption tokens (AIC-2664) #174
Open
ctawiah wants to merge 1 commit into feat/AIC-2663/ai-sdk-client from feat/ai-sdk-tracker
167 changes: 159 additions & 8 deletions
lib/sdk/server-ai/src/main/java/com/launchdarkly/sdk/server/ai/LDAIConfigTracker.java
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,16 +1,167 @@ | ||
| package com.launchdarkly.sdk.server.ai; | ||
| | ||
| import com.launchdarkly.sdk.server.ai.datamodel.LDAITrackingTypes.FeedbackKind; | ||
| import com.launchdarkly.sdk.server.ai.datamodel.LDAITrackingTypes.JudgeResult; | ||
| import com.launchdarkly.sdk.server.ai.datamodel.LDAITrackingTypes.Metrics; | ||
| import com.launchdarkly.sdk.server.ai.datamodel.LDAITrackingTypes.MetricSummary; | ||
| import com.launchdarkly.sdk.server.ai.datamodel.LDAITrackingTypes.TokenUsage; | ||
| import com.launchdarkly.sdk.server.ai.datamodel.LDAITrackingTypes.TrackData; | ||
| | ||
| import java.time.Duration; | ||
| import java.util.List; | ||
| import java.util.concurrent.Callable; | ||
| import java.util.function.Function; | ||
| | ||
| /** | ||
| - * Reports events related to a single AI run of an {@link AIConfig}. | ||
| + * Reports metrics related to a single AI run of an {@link AIConfig}. | ||
| * <p> | ||
| - * A tracker is obtained from a retrieved config via {@link AIConfig#createTracker()}. Each tracker | ||
| - * corresponds to one AI run and is used to record metrics such as model usage, duration, and | ||
| - * feedback against the AI Config it was created from. | ||
| + * A tracker is obtained from a retrieved config via {@link AIConfig#createTracker()}, or | ||
| + * reconstructed across process boundaries via | ||
| + * {@link LDAIClient#createTracker(String, com.launchdarkly.sdk.LDContext)}. Each tracker corresponds | ||
| + * to one AI run; every event it emits shares a {@code runId} (a UUIDv4) so LaunchDarkly can | ||
| + * correlate them in metrics views. Start a new run by calling {@link AIConfig#createTracker()} again. | ||
| * <p> | ||
| - * <strong>This interface is an intentional placeholder.</strong> The metric- and feedback-reporting | ||
| - * methods (and resumption-token support) are introduced in a later step of the AI SDK build-out; it | ||
| - * is defined here so that the public config types expose a stable {@code createTracker()} surface. | ||
| - * The only implementation in this release is an internal no-op. | ||
| + * <strong>Thread-safety.</strong> Implementations are safe to share across threads. The | ||
| + * "record-once" metrics ({@link #trackDuration}, {@link #trackTimeToFirstToken}, | ||
| + * {@link #trackSuccess}/{@link #trackError}, {@link #trackFeedback}, {@link #trackTokens}) each emit | ||
| + * at most once per tracker even under concurrent calls; later calls are ignored and logged. | ||
| + * {@link #trackToolCall}/{@link #trackToolCalls} and {@link #trackJudgeResult} may be called any | ||
| + * number of times and emit on every call. | ||
| */ | ||
| public interface LDAIConfigTracker { | ||
| /** | ||
| * Returns the correlation data attached to every event this tracker emits. | ||
| * | ||
| * @return the track data, never {@code null} | ||
| */ | ||
| TrackData getTrackData(); | ||
| | ||
| /** | ||
| * Returns a URL-safe Base64 token that encodes this tracker's {@code runId}, {@code configKey}, | ||
| * {@code variationKey}, and {@code version}. | ||
| * <p> | ||
| * Pass it to {@link LDAIClient#createTracker(String, com.launchdarkly.sdk.LDContext)} to | ||
| * reconstruct a tracker in another process so deferred events (for example user feedback) still | ||
| * correlate with the original run. | ||
| * | ||
| * @return the resumption token, never {@code null} | ||
| */ | ||
| String getResumptionToken(); | ||
| | ||
| /** | ||
| * Records the duration of the generation. | ||
| * <p> | ||
| * Records at most once per tracker; later calls are ignored. Negative durations (for example from | ||
| * clock skew) are clamped to zero. | ||
| * | ||
| * @param duration the generation duration; must not be {@code null} | ||
| */ | ||
| void trackDuration(Duration duration); | ||
| | ||
| /** | ||
| * Runs the given operation, recording its duration even if it throws. | ||
| * <p> | ||
| * This does not record success or error; use {@link #trackMetricsOf} for that. Because | ||
| * {@link #trackDuration} records at most once, calling this twice on the same tracker re-runs the | ||
| * operation but emits no second duration event. | ||
| * | ||
| * @param operation the operation to time | ||
| * @param <T> the operation's result type | ||
| * @return the operation's result | ||
| * @throws Exception if the operation throws | ||
| */ | ||
| <T> T trackDurationOf(Callable<T> operation) throws Exception; | ||
| | ||
| /** | ||
| * Records the time to first token for a streaming generation. | ||
| * <p> | ||
| * Records at most once per tracker; later calls are ignored. Negative values are clamped to zero. | ||
| * | ||
| * @param duration the time to first token; must not be {@code null} | ||
| */ | ||
| void trackTimeToFirstToken(Duration duration); | ||
| | ||
| /** | ||
| * Records that the generation succeeded. | ||
| * <p> | ||
| * Success and error share state: only the first of {@link #trackSuccess}/{@link #trackError} | ||
| * recorded on a tracker takes effect; later calls are ignored. | ||
| */ | ||
| void trackSuccess(); | ||
| | ||
| /** | ||
| * Records that the generation failed. | ||
| * <p> | ||
| * Success and error share state: only the first of {@link #trackSuccess}/{@link #trackError} | ||
| * recorded on a tracker takes effect; later calls are ignored. | ||
| */ | ||
| void trackError(); | ||
| | ||
| /** | ||
| * Records end-user feedback about the generation. | ||
| * <p> | ||
| * Records at most once per tracker; later calls are ignored. | ||
| * | ||
| * @param kind the feedback sentiment; must not be {@code null} | ||
| */ | ||
| void trackFeedback(FeedbackKind kind); | ||
| | ||
| /** | ||
| * Records token usage for the generation. | ||
| * <p> | ||
| * Records at most once per tracker; later calls are ignored. Negative counts are clamped to zero, | ||
| * and an individual count is only emitted when it is greater than zero. | ||
| * | ||
| * @param tokens the token usage; must not be {@code null} | ||
| */ | ||
| void trackTokens(TokenUsage tokens); | ||
| | ||
| /** | ||
| * Records a single tool invocation. May be called any number of times. | ||
| * | ||
| * @param toolKey the identifier of the invoked tool; must not be {@code null} | ||
| */ | ||
| void trackToolCall(String toolKey); | ||
| | ||
| /** | ||
| * Records several tool invocations. May be called any number of times. | ||
| * | ||
| * @param toolKeys the identifiers of the invoked tools; must not be {@code null} | ||
| */ | ||
| void trackToolCalls(List<String> toolKeys); | ||
| | ||
| /** | ||
| * Records a judge evaluation result. May be called any number of times. | ||
| * <p> | ||
| * No event is emitted when the result was not sampled, did not succeed, or carries no metric key | ||
| * or score. A {@code null} score is treated as "no score" and is distinct from {@code 0.0}. | ||
| * | ||
| * @param result the judge result; must not be {@code null} | ||
| */ | ||
| void trackJudgeResult(JudgeResult result); | ||
| | ||
| /** | ||
| * Runs the given operation, recording its duration and then its outcome and metrics. | ||
| * <p> | ||
| * The operation is timed via {@link #trackDurationOf}. If it throws, an error is recorded and the | ||
| * exception is rethrown. Otherwise the extractor is applied to the result; if the extractor | ||
| * throws, an error is recorded and the exception is rethrown. On success the extracted metrics | ||
| * drive {@link #trackSuccess}/{@link #trackError}, {@link #trackTokens}, and | ||
| * {@link #trackToolCalls}. | ||
| * | ||
| * @param metricsExtractor extracts {@link Metrics} from the operation's result | ||
| * @param operation the AI operation to run | ||
| * @param <T> the operation's result type | ||
| * @return the operation's result | ||
| * @throws Exception if the operation or the extractor throws | ||
| */ | ||
| <T> T trackMetricsOf(Function<? super T, Metrics> metricsExtractor, Callable<T> operation) | ||
| throws Exception; | ||
| | ||
| /** | ||
| * Returns an immutable snapshot of the metrics recorded on this tracker so far. | ||
| * | ||
| * @return the metric summary, never {@code null} | ||
| */ | ||
| MetricSummary getSummary(); | ||
| } |
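
The Javadoc in this diff describes "record-once" metrics, clamping of negative durations to zero, and a `trackDurationOf` that records a duration even when the operation throws. The following is a minimal sketch of one way such behavior could be obtained, assuming one compare-and-set slot per metric. The class `RecordOnceTrackerSketch`, its in-memory `emitted` list, and the event strings are hypothetical illustrations and are not taken from this PR's implementation.

```java
import java.time.Duration;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch only; NOT the SDK's LDAIConfigTracker implementation.
final class RecordOnceTrackerSketch {
    // Each slot starts as null; compareAndSet(null, value) lets exactly one caller win,
    // even when several threads call the same method at once.
    private final AtomicReference<Duration> duration = new AtomicReference<>();
    private final AtomicReference<Boolean> outcome = new AtomicReference<>();

    // Stand-in for the event sink (assumption); a real tracker would send events elsewhere.
    private final List<String> emitted = Collections.synchronizedList(new ArrayList<>());

    void trackDuration(Duration d) {
        // Negative durations (for example from clock skew) are clamped to zero.
        Duration clamped = d.isNegative() ? Duration.ZERO : d;
        if (duration.compareAndSet(null, clamped)) {
            emitted.add("duration:" + clamped.toMillis());
        }
        // Later calls fall through and emit nothing.
    }

    void trackSuccess() {
        recordOutcome(true);
    }

    void trackError() {
        recordOutcome(false);
    }

    // Success and error share one slot, so only the first of the two takes effect.
    private void recordOutcome(boolean ok) {
        if (outcome.compareAndSet(null, ok)) {
            emitted.add(ok ? "success" : "error");
        }
    }

    <T> T trackDurationOf(Callable<T> operation) throws Exception {
        long start = System.nanoTime();
        try {
            return operation.call();
        } finally {
            // The finally block runs whether the operation returned or threw.
            trackDuration(Duration.ofNanos(System.nanoTime() - start));
        }
    }

    // Returns a copy of the events emitted so far.
    List<String> emitted() {
        return new ArrayList<>(emitted);
    }
}
```

In this sketch, calling `trackError()` and then `trackSuccess()` on the same instance emits only the error event, and a second `trackDurationOf` call still runs its operation but adds no second duration event.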
We still pass the key into trackers for defaults, since that is the key you requested. It just won't have a variation.
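
Read together with the `getResumptionToken` Javadoc, the comment above suggests that a tracker for a default config carries a config key but no variation key. The sketch below shows how a URL-safe Base64 token could round-trip those fields with the variation key absent. The payload layout (newline-separated fields), the class `ResumptionTokenSketch`, and the use of `Optional` are all assumptions made for illustration; the PR does not show the actual token format.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.Optional;

// Hypothetical sketch only; the real token layout is not shown in this PR.
final class ResumptionTokenSketch {
    final String runId;
    final String configKey;
    final Optional<String> variationKey; // empty for a default config (assumption)
    final int version;

    ResumptionTokenSketch(String runId, String configKey,
                          Optional<String> variationKey, int version) {
        this.runId = runId;
        this.configKey = configKey;
        this.variationKey = variationKey;
        this.version = version;
    }

    // Joins the four fields with newlines (assumed layout) and encodes them as
    // unpadded URL-safe Base64, so the token contains no '+', '/', or '='.
    String encode() {
        String payload = String.join("\n",
                runId, configKey, variationKey.orElse(""), Integer.toString(version));
        return Base64.getUrlEncoder().withoutPadding()
                .encodeToString(payload.getBytes(StandardCharsets.UTF_8));
    }

    // Reverses encode(); an empty third field is read back as "no variation".
    static ResumptionTokenSketch decode(String token) {
        String payload = new String(
                Base64.getUrlDecoder().decode(token), StandardCharsets.UTF_8);
        // A limit of -1 keeps empty fields, so an absent variation key still yields 4 parts.
        String[] parts = payload.split("\n", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("malformed resumption token");
        }
        Optional<String> variation =
                parts[2].isEmpty() ? Optional.empty() : Optional.of(parts[2]);
        return new ResumptionTokenSketch(
                parts[0], parts[1], variation, Integer.parseInt(parts[3]));
    }
}
```

This layout breaks if a key contains a newline; a structured payload would avoid that, and the sketch uses the simpler form only to keep the absent-variation case easy to follow.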