docs: add python / matplotlib requirement example by akihikokuroda · Pull Request #1293 · generative-computing/mellea

akihikokuroda · 2026-06-18T15:10:36Z

Pull Request

Issue

Fix: #1025

Description

Testing

Tests added to the respective file if code was changed
New code has 100% coverage if code was added
Ensure existing tests and github automation passes (a maintainer will kick off the github automation when the rest of the PR is populated)

Attribution

AI coding assistants used

Adding a new component, requirement, sampling strategy, or tool?

If your PR adds or modifies one of the types below, check the matching box. A checklist of type-specific review items will be posted as a comment.

Component
Requirement
Sampling Strategy
Tool

NOTE: Please ensure you have an issue that has been acknowledged by a core contributor and routed you to open a pull request against this repository. Otherwise, please open an issue before continuing with this pull request.

psschwei

a few nits but otherwise LGTM

planetf1 · 2026-06-19T14:15:52Z

+        print(f"\n  ✓ Graph saved to: {graph_path}")
+        print(f"    File size: {graph_path.stat().st_size} bytes")
+    else:
+        print(f"\n  ⚠ Graph file not found at {graph_path}")


The conftest passes the test when the script exits 0, and process_user_request always exits 0 — there's no sys.exit or raise when the graph isn't produced, just a print + return. So even with matplotlib installed, if the pipeline fails for any reason the test is recorded as passing whilst producing nothing. Worth raising here so failures are visible:

Suggested change

print(f"\n ⚠ Graph file not found at {graph_path}")

if not graph_path.exists():

raise RuntimeError(f"Graph was not created at {graph_path}")

print(f"\n ✓ Graph saved to: {graph_path}")

print(f" File size: {graph_path.stat().st_size} bytes")

Thanks for the earlier fix. The latest commit (2b9d4f89) reverted this back to print + return — the raise RuntimeError that was there previously is now gone. Since the conftest passes the test on exit code 0, a missing graph still goes undetected. Please restore the raise RuntimeError (or a sys.exit(1)) so that test failures surface properly.

planetf1 · 2026-06-19T14:25:01Z

Tiny one: the PR title has a typo — "requrement" should be "requirement". Worth fixing before merge as it ends up in the commit message.

planetf1

Thanks for the example — the pipeline approach is nicely structured and the README additions are clear.

Two things I'd want addressed before merge:

matplotlib guard — matplotlib isn't a project dependency so the example can't succeed on a clean checkout. The repair loop exhausts silently with no useful message to the user (see inline comment).
Silent test pass — the conftest passes the test on exit 0, and process_user_request always exits 0 even when no graph is produced. The e2e test gives a false green in that case (see inline comment).

The other inline comments (execution tier caveat, README requirements section) are informational — happy for those to be addressed or tracked as follow-ups.

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

psschwei

LGTM. @planetf1 has requested changes so will leave approval for him.

markstur

see inline

markstur · 2026-06-23T23:24:31Z

+        data = list(reader)
+
+    preview_lines = []
+    with open(csv_path) as f:


The CSV file is opened twice: once with csv.DictReader to read all rows (lines 59–61), and again to collect preview lines (lines 63–67). This doubles the I/O cost unnecessarily.

Since csv.DictReader already reads the raw lines, both purposes can be served in a single pass.

Thanks! fixed it.

This was consolidated but not fixed. The file is still read twice. I guess you want one pass for raw preview and the other pass for dict. So you chose not to try single pass and deal w/ unwanted format. Makes sense.

TO FIX: The preview loop only takes < 5, but then it continues to read the rest for no reason. Add a break if >= 5.

else break

added.

markstur · 2026-06-23T23:25:59Z

+    Returns:
+        Extracted Python code string, or None if extraction failed.
+    """
+    from mellea.stdlib.requirements.python_reqs import _has_python_code_listing


why not import at top?

I added some comments to explain.

markstur · 2026-06-23T23:29:31Z

+
+    m = mellea.start_session()
+
+    output_path = f"{output_dir}/graph_{request_number}.png"


Better practice: use Path to join instead of hard-coding the /

markstur · 2026-06-23T23:31:21Z

+
+    prompt = f"""
+    The user has this request:
+    "{user_request}"


Curious what we can do about injection risk here.

user_request (interactive user input) and csv_preview (CSV file content that may contain adversarial data) are both interpolated unsanitized into the prompt f-string sent to the LLM. In interactive mode, a user can inject instructions that override the intended prompt, causing the model to generate arbitrary code that is then executed in a local subprocess via PythonExecutionReq(execution_tier="local").

This is especially serious because the generated code is executed as a subprocess side effect of validation.

Fix it. Some comments are added.

markstur · 2026-06-23T23:52:45Z

+    strategy = ModelFriendlyRepairStrategy(loop_budget=5, requirements=all_reqs)
+
+    print("Generating code to extract data and create graph...")
+    generated = m.instruct(


should check for errors here.
the error that would show up later would be misleading.

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

planetf1 · 2026-06-24T11:41:20Z

+        return
+
+    generated_str = str(generated)
+    code = _extract_code_from_output(generated_str)


ModelOutputThunk has no __bool__, so not generated is always False — this guard never fires. The fix is to test the string value, which the code already produces one line later anyway:

Suggested change

code = _extract_code_from_output(generated_str)

generated_str = str(generated)

if not generated_str.strip():

print(" ✗ Model failed to generate output")

return

code = _extract_code_from_output(generated_str)

planetf1 · 2026-06-24T11:43:34Z

+The underlying mechanism (what `PythonExecutionReq` does under the hood):
+
+```python
+def execute_python_code(code: str, timeout: int = 10) -> dict:


This snippet does not reflect the actual implementation — PythonExecutionReq routes through environment.execute(code) (built from your execution_tier and CapabilityPolicy), not a bare subprocess.run. The two diverge silently as the code evolves.

A prose pointer to the source would be more durable:

PythonExecutionReq executes code via an ExecutionEnvironment built from the execution_tier and CapabilityPolicy you pass — that's what enforces the timeout and tier behaviour. See mellea/stdlib/requirements/python_reqs.py for the full implementation.

planetf1 · 2026-06-24T11:44:10Z

+    # Check if graph file was created
+    graph_path = Path(output_path)
+    if not graph_path.exists():
+        raise RuntimeError(f"Graph was not created at {graph_path}")


The two failure modes above (empty generation, failed code extraction) handle errors gracefully with print and return. This one raises, which aborts the whole batch run on the first failure.

Given the loop budget is 5 and LLM output is probabilistic, a single failed graph should not stop the remaining examples from running. Consider matching the pattern above:

Suggested change

raise RuntimeError(f"Graph was not created at {graph_path}")

if not graph_path.exists():

print(f" ✗ Graph was not created at {graph_path}")

return

planetf1 · 2026-06-24T11:45:17Z

+
+    # Sanitize user input and CSV preview to prevent prompt injection attacks.
+    # Use repr() to escape special characters and quote the values, preventing
+    # the model from interpreting user input as prompt instructions.


repr() wraps the string in quotes and escapes special characters, but it does not stop the model from reading and acting on the content inside them. "Prevent prompt injection" overstates what this does — it reduces accidental prompt formatting breakage, which is still useful, but worth calling it that rather than a security control.

Comments are updated.

planetf1 · 2026-06-24T11:46:15Z

+    print("Generating code to extract data and create graph...")
+    generated = m.instruct(
+        prompt,
+        requirements=all_reqs,  # type: ignore[arg-type]


all_reqs is already passed to ModelFriendlyRepairStrategy — when both are provided, the strategy takes precedence and the requirements= arg on instruct() is redundant. Since this is example code people will copy, it is worth being accurate: pass requirements in one place only (the strategy is the right home, since that is where repair feedback is built).

Took it out.

planetf1 · 2026-06-24T11:46:16Z

+        epilog="""
+Examples:
+  # Run with default sample data and predefined requests
+  python code_generation_and_execution.py


The project convention (AGENTS.md §1) is uv run python rather than plain python — and since this is example code people will run directly, it matters: plain python may silently pick up a different interpreter or environment. Applies to all four invocations in this epilog.

planetf1 · 2026-06-24T11:46:17Z

+    2. Extract the data as specified in the user request
+    3. Use matplotlib with headless backend (set matplotlib.use('Agg') at start)
+    4. Create the visualization (graph type) specified by the user
+    5. Save the graph to {output_path} using plt.savefig(\'{output_path}\')


nit: the backslash-escaped quotes around {output_path} are unnecessary inside a """ f-string — single quotes work fine without escaping here.

planetf1

Two things to fix before merge:

mellea[plotting] does not exist — the ImportError message points users at uv pip install mellea[plotting], which fails because there is no such extras group in pyproject.toml. The correct install is uv pip install matplotlib numpy (already in the README prerequisites).
README code snippet does not match the implementation — the execute_python_code block under "How Code Execution Works" uses subprocess.run directly and omits the ExecutionEnvironment abstraction that actually enforces execution_tier and CapabilityPolicy. See inline comment for a suggested replacement.

Remaining inline comments (dead guard, RuntimeError in batch mode, repr() comment, double-wired requirements, argparse epilog) are improvements worth addressing but not hard blockers.

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

akihikokuroda · 2026-06-24T15:12:06Z

@planetf1 I addressed all your comments. Please review. Thanks!

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

planetf1 · 2026-06-25T14:26:34Z

LGTM — all comments addressed from my side. Leaving formal approval to reviewers who still have open threads.

Dismissing changes needed review - other approvers can review and check open threads - specifically need to ensure extras are correct

fixed enough to unblock

markstur

See inline.

You should add a break instead of reading the whole file to build a preview. I guess that's not blocking but worth improving.

Otherwise I think my comments were addressed and I think Nigel's were(?)

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

akihikokuroda · 2026-06-25T23:22:41Z

@markstur Nigel said

LGTM — all comments addressed from my side. Leaving formal approval to reviewers who still have open threads.

@planetf1

@planetf1 already said changes were lgtm and dismissed

akihikokuroda requested a review from a team as a code owner June 18, 2026 15:10

akihikokuroda requested review from markstur, planetf1 and psschwei June 18, 2026 15:10

github-actions Bot added the documentation Improvements or additions to documentation label Jun 18, 2026

AngeloDanducci reviewed Jun 18, 2026

View reviewed changes

akihikokuroda requested a review from AngeloDanducci June 18, 2026 21:37

psschwei reviewed Jun 18, 2026

View reviewed changes

Comment thread docs/examples/requirements/code_generation_and_execution.py Outdated

Comment thread docs/examples/requirements/code_generation_and_execution.py Outdated

Comment thread docs/examples/requirements/code_generation_and_execution.py Outdated

akihikokuroda requested a review from psschwei June 18, 2026 22:23

planetf1 reviewed Jun 19, 2026

View reviewed changes

Comment thread docs/examples/requirements/code_generation_and_execution.py

planetf1 reviewed Jun 19, 2026

View reviewed changes

Comment thread docs/examples/requirements/README.md

planetf1 reviewed Jun 19, 2026

View reviewed changes

Comment thread docs/examples/requirements/code_generation_and_execution.py Outdated

planetf1 previously requested changes Jun 19, 2026

View reviewed changes

akihikokuroda changed the title ~~docs: add python / matplotlib requrement example~~ docs: add python / matplotlib requirement example Jun 19, 2026

akihikokuroda requested a review from planetf1 June 19, 2026 15:45

akihikokuroda added 8 commits June 19, 2026 14:08

example for python requirements and samplings

a4e8f60

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

add requirements

815ab51

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

change python execution check requrement usage

57f1b1d

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

review comments

9c0da95

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

review comment

e8503b3

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

review comment

1fb51fc

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

review comment

c54570b

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

improved prompt

11fbd13

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

akihikokuroda force-pushed the issue1025 branch from 0949db1 to 11fbd13 Compare June 19, 2026 23:08

akihikokuroda mentioned this pull request Jun 22, 2026

Document safer execution patterns with execution_tier="docker" for untrusted code #1310

Open

psschwei reviewed Jun 22, 2026

View reviewed changes

Comment thread docs/examples/requirements/code_generation_and_execution.py Outdated

psschwei reviewed Jun 22, 2026

View reviewed changes

Comment thread docs/examples/requirements/code_generation_and_execution.py Outdated

review comment

4fb9c8f

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

akihikokuroda requested a review from psschwei June 22, 2026 16:20

psschwei reviewed Jun 22, 2026

View reviewed changes

markstur previously requested changes Jun 24, 2026

View reviewed changes

review comments

d675bee

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

akihikokuroda requested a review from markstur June 24, 2026 01:23

planetf1 reviewed Jun 24, 2026

View reviewed changes

planetf1 previously requested changes Jun 24, 2026

View reviewed changes

review comment

2b9d4f8

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

akihikokuroda requested a review from planetf1 June 24, 2026 15:11

review comment

277e185

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

markstur reviewed Jun 25, 2026

View reviewed changes

review comment

25057da

Signed-off-by: Akihiko Kuroda <akihikokuroda2020@gmail.com>

markstur approved these changes Jun 26, 2026

View reviewed changes

markstur added this pull request to the merge queue Jun 26, 2026

Merged via the queue into generative-computing:main with commit b59753f Jun 26, 2026
9 checks passed


		m = mellea.start_session()

		output_path = f"{output_dir}/graph_{request_number}.png"

Uh oh!

Conversation

akihikokuroda commented Jun 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Pull Request

Issue

Description

Testing

Attribution

Adding a new component, requirement, sampling strategy, or tool?

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

psschwei left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

planetf1 commented Jun 19, 2026

Uh oh!

Uh oh!

planetf1 left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

psschwei left a comment

Choose a reason for hiding this comment

Uh oh!

markstur left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

akihikokuroda commented Jun 18, 2026 •

edited

Loading

planetf1 Jun 24, 2026 •

edited

Loading

planetf1 Jun 24, 2026 •

edited

Loading