fix(prediction): output_iterator silently yields individual characters for string output by devteamaegis · Pull Request #461 · replicate/replicate-python

devteamaegis · 2026-05-24T01:40:25Z

Summary

output_iterator() and async_output_iterator() use self.output or [] to coerce the model's output to a list. This is only safe when output is None (not started yet) or an actual list.

For non-array model outputs — plain strings, URLs, dicts — the falsy-or pattern silently returns the raw value unchanged:

# output_iterator with output = "hello world"
previous_output = self.output or []  # → "hello world"
# ...
output = self.output or []           # → "hello world"
new_output = output[len(previous_output):]  # string slice, not list slice
yield from new_output                # silently yields 'h', 'e', 'l', 'l', ...

Fix

Replace value or [] with an explicit isinstance check:

def _as_list(value):
    if value is None:
        return []
    if isinstance(value, list):
        return value
    raise ValueError(
        f"output_iterator requires an array output type, "
        f"but the model returned a {type(value).__name__!r}. "
        f"Use prediction.output directly for non-array outputs."
    )

This resolves the long-standing # TODO: check output is list comments in both methods.

Bonus fix

The async_output_iterator loop also had a variable-shadowing bug:

# Before — outer `output` list is clobbered by the loop variable
for output in new_output:
    yield output

# After
for item in new_output:
    yield item

Tests

Six new unit tests added to tests/test_prediction.py covering:

None output → empty iteration (no error)
String output → ValueError (sync and async)
Dict output → ValueError (sync)
Completed prediction with list output → empty iteration (all tokens were "seen" at start)

…acter-by-character Both output_iterator and async_output_iterator used ``self.output or []`` to coerce the model's output to a list. This idiom is only safe when output is None (not started yet) or an actual list. For non-array model outputs (plain string, URL, dict) the falsy-or pattern silently returns the raw value. Calling ``yield from "hello world"`` then iterates over 11 individual characters instead of the intended single token, and ``output[len(previous_output):]`` on a string produces a string slice rather than a list slice — causing subtly wrong behaviour with no error. Fix: replace ``value or []`` with an explicit isinstance check that accepts None (treated as an empty list, meaning "no output yet") and list, and raises a descriptive ValueError for anything else so callers know to use ``prediction.output`` directly for non-streaming models. Also fixes the variable shadowing bug in async_output_iterator where the loop variable was named ``output`` instead of ``item``, clobbering the outer ``output`` reference on each iteration. Resolves the long-standing ``# TODO: check output is list`` comments.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(prediction): output_iterator silently yields individual characters for string output#461

fix(prediction): output_iterator silently yields individual characters for string output#461
devteamaegis wants to merge 1 commit into
replicate:mainfrom
devteamaegis:fix/output-iterator-non-list-output

devteamaegis commented May 24, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

devteamaegis commented May 24, 2026

Summary

Fix

Bonus fix

Tests

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant