Propagate output_tokens_details onto the accumulated streaming Message #1703
Open
camposvinicius wants to merge 1 commit into
Problem
The streaming accumulator copies most usage fields from message_delta onto the accumulated Message, but it never copies output_tokens_details. That field carries the reasoning token breakdown (thinking_tokens) the API reports for extended-thinking responses.

As a result the final snapshot from a stream disagrees with the equivalent non-streamed Message. A caller reading stream.get_final_message().usage.output_tokens_details gets None even when the API reported the breakdown, so cost and reasoning-token observability is silently lost on the streaming path.

MessageDeltaUsage (and the beta variant) already declares output_tokens_details, and the API sends it on message_delta. The repo's own fable-fallback fixtures carry it, for example "output_tokens_details":{"thinking_tokens":67}.

This is the same class of gap that #1725 fixed for stop_details in the same message_delta branch.

Fix
Propagate output_tokens_details from message_delta onto the accumulated snapshot in both lib/streaming/_messages.py and lib/streaming/_beta_messages.py, mirroring the existing stop_details and server_tool_use propagation.

Tests
The refusal_response.txt streaming fixture now carries output_tokens_details on its message_delta, and assert_refusal_response in both test_messages.py and test_beta_messages.py asserts it reaches the final message. The four refusal tests (sync and async, beta and non-beta) fail without the fix and pass with it.

ruff and pyright are clean on the changed files.
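
For reviewers who want the shape of the change without opening the diff, the propagation described under Fix can be sketched roughly as below. This is a minimal, self-contained illustration and not the SDK's code: OutputTokensDetails, DeltaUsage, SnapshotUsage and apply_delta_usage are hypothetical stand-ins for the real models and for the message_delta branch of the accumulator, and the "only copy when present" guard is an assumption about how the existing server_tool_use propagation behaves.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OutputTokensDetails:
    """Hypothetical stand-in for the reasoning-token breakdown."""
    thinking_tokens: int


@dataclass
class DeltaUsage:
    """Hypothetical stand-in for the usage payload on a message_delta event."""
    output_tokens: int
    output_tokens_details: Optional[OutputTokensDetails] = None


@dataclass
class SnapshotUsage:
    """Hypothetical stand-in for usage on the accumulated message snapshot."""
    input_tokens: int
    output_tokens: int = 0
    output_tokens_details: Optional[OutputTokensDetails] = None


def apply_delta_usage(snapshot: SnapshotUsage, delta: DeltaUsage) -> SnapshotUsage:
    """Fold one message_delta usage payload into the accumulated snapshot."""
    # Fields like this one were already being copied before the change.
    snapshot.output_tokens = delta.output_tokens
    # The step this PR describes adding. Assumption: copy only when the event
    # carries the field, so an event without it does not erase an earlier value.
    if delta.output_tokens_details is not None:
        snapshot.output_tokens_details = delta.output_tokens_details
    return snapshot


if __name__ == "__main__":
    usage = SnapshotUsage(input_tokens=12)
    event_usage = DeltaUsage(
        output_tokens=80,
        output_tokens_details=OutputTokensDetails(thinking_tokens=67),
    )
    apply_delta_usage(usage, event_usage)
    print(usage.output_tokens_details)
```

Without the guarded copy, the snapshot's output_tokens_details would stay None after the delta is applied, which is the symptom described under Problem.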