feat: add text/image-to-3d-scene pipeline #340
Open
MuziWong wants to merge 1 commit into
Pull request overview
This PR introduces a new embodichain/gen_sim/prompt2scene module implementing a text/image-to-3D-scene pipeline, including LLM-backed workflows (LangGraph), prompt templating, service clients/managers for asset generation/segmentation, and a CLI entry point to run the end-to-end process.
Changes:
- Add LangGraph workflows for scene intake, relation extraction, unified-scene assembly, and downstream scene generation.
- Add agent-tool clients/managers (HTTP service wrappers + geometry/simulation utilities) used by the pipeline.
- Add prompt templating system + bundled YAML prompt templates, plus CLI/config scaffolding.
Reviewed changes
Copilot reviewed 130 out of 130 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| `embodichain/gen_sim/prompt2scene/__init__.py` | Package root (currently header-only) |
| embodichain/gen_sim/prompt2scene/.gitignore | Ignore local outputs/servers |
| `embodichain/gen_sim/prompt2scene/cli/__init__.py` | CLI package init |
| embodichain/gen_sim/prompt2scene/cli/start.py | CLI entry point |
| embodichain/gen_sim/prompt2scene/configs/client_config.json | Default agent-tool server endpoints |
| embodichain/gen_sim/prompt2scene/configs/llm_config.json | Default LLM config scaffold |
| `embodichain/gen_sim/prompt2scene/llms/__init__.py` | LLM helpers exports |
| embodichain/gen_sim/prompt2scene/llms/config.py | LLM config dataclass |
| embodichain/gen_sim/prompt2scene/llms/openai_compatible.py | OpenAI-compatible LangChain model builder + config loader |
| `embodichain/gen_sim/prompt2scene/pipeline/__init__.py` | Pipeline runner exports |
| embodichain/gen_sim/prompt2scene/pipeline/runner.py | End-to-end pipeline orchestration |
| `embodichain/gen_sim/prompt2scene/prompts/__init__.py` | Prompt rendering API |
| embodichain/gen_sim/prompt2scene/prompts/base.py | Prompt template loader/renderer |
| `embodichain/gen_sim/prompt2scene/prompts/data/__init__.py` | Bundled prompt data package init |
| embodichain/gen_sim/prompt2scene/prompts/data/image_relations.yaml | Image-relations prompt template |
| embodichain/gen_sim/prompt2scene/prompts/data/scene_intake.yaml | Scene-intake prompt template |
| embodichain/gen_sim/prompt2scene/prompts/data/text_relations.yaml | Text-relations prompt template |
| embodichain/gen_sim/prompt2scene/prompts/data/unified_scene_gen.yaml | Unified-scene-gen prompt template |
| `embodichain/gen_sim/prompt2scene/utils/__init__.py` | prompt2scene utility exports |
| embodichain/gen_sim/prompt2scene/utils/io.py | IO helpers (json/path/data-url) |
| embodichain/gen_sim/prompt2scene/utils/log.py | prompt2scene logging helpers |
| `embodichain/gen_sim/prompt2scene/workflows/__init__.py` | Workflow constants/exports |
| embodichain/gen_sim/prompt2scene/workflows/attempt_state.py | Retry/error TypedDict base |
| embodichain/gen_sim/prompt2scene/workflows/artifact_writer.py | Step artifact writing helpers |
| embodichain/gen_sim/prompt2scene/workflows/llm_output.py | Structured-model call wrappers |
| embodichain/gen_sim/prompt2scene/workflows/request.py | Input normalization + manifest |
| embodichain/gen_sim/prompt2scene/workflows/spatial.py | Spatial constant definitions |
| embodichain/gen_sim/prompt2scene/workflows/stage_errors.py | Error formatting helpers |
| `embodichain/gen_sim/prompt2scene/workflows/image_relations/__init__.py` | Image-relations workflow exports |
| embodichain/gen_sim/prompt2scene/workflows/image_relations/graph.py | Image-relations LangGraph |
| embodichain/gen_sim/prompt2scene/workflows/image_relations/nodes.py | Image-relations nodes |
| embodichain/gen_sim/prompt2scene/workflows/image_relations/prompts.py | Image-relations message builders |
| embodichain/gen_sim/prompt2scene/workflows/image_relations/schema.py | Image-relations output schema |
| embodichain/gen_sim/prompt2scene/workflows/image_relations/state.py | Image-relations state typing |
| embodichain/gen_sim/prompt2scene/workflows/image_relations/utils.py | Image-relations normalization utils |
| `embodichain/gen_sim/prompt2scene/workflows/scene_intake/__init__.py` | Scene-intake workflow exports |
| embodichain/gen_sim/prompt2scene/workflows/scene_intake/graph.py | Scene-intake LangGraph |
| embodichain/gen_sim/prompt2scene/workflows/scene_intake/nodes.py | Scene-intake nodes |
| embodichain/gen_sim/prompt2scene/workflows/scene_intake/prompts.py | Scene-intake message builders |
| embodichain/gen_sim/prompt2scene/workflows/scene_intake/schema.py | Scene-intake JSON schema |
| embodichain/gen_sim/prompt2scene/workflows/scene_intake/state.py | Scene-intake state typing |
| embodichain/gen_sim/prompt2scene/workflows/scene_intake/utils.py | Scene-intake normalization utils |
| `embodichain/gen_sim/prompt2scene/workflows/text_relations/__init__.py` | Text-relations workflow exports |
| embodichain/gen_sim/prompt2scene/workflows/text_relations/graph.py | Text-relations LangGraph |
| embodichain/gen_sim/prompt2scene/workflows/text_relations/nodes.py | Text-relations nodes |
| embodichain/gen_sim/prompt2scene/workflows/text_relations/prompts.py | Text-relations message builders |
| embodichain/gen_sim/prompt2scene/workflows/text_relations/schema.py | Text-relations JSON schema + dataclasses |
| embodichain/gen_sim/prompt2scene/workflows/text_relations/state.py | Text-relations state typing |
| embodichain/gen_sim/prompt2scene/workflows/text_relations/utils.py | Text-relations normalization utils |
| `embodichain/gen_sim/prompt2scene/workflows/unified_scene/__init__.py` | Unified-scene workflow package init (currently empty exports) |
| embodichain/gen_sim/prompt2scene/workflows/unified_scene/graph.py | Unified-scene LangGraph |
| embodichain/gen_sim/prompt2scene/workflows/unified_scene/nodes.py | Unified-scene assembly node |
| embodichain/gen_sim/prompt2scene/workflows/unified_scene/schema.py | Unified-scene dataclasses + manifest |
| embodichain/gen_sim/prompt2scene/workflows/unified_scene/state.py | Unified-scene state typing |
| embodichain/gen_sim/prompt2scene/workflows/unified_scene/utils.py | Unified-scene construction helpers |
| `embodichain/gen_sim/prompt2scene/workflows/unified_scene_gen/__init__.py` | Unified-scene-gen workflow exports |
| embodichain/gen_sim/prompt2scene/workflows/unified_scene_gen/graph.py | Unified-scene-gen LangGraph |
| embodichain/gen_sim/prompt2scene/workflows/unified_scene_gen/nodes.py | Unified-scene-gen nodes |
| embodichain/gen_sim/prompt2scene/workflows/unified_scene_gen/paths.py | Unified-scene-gen path helpers |
| embodichain/gen_sim/prompt2scene/workflows/unified_scene_gen/prompts.py | Unified-scene-gen message builders |
| embodichain/gen_sim/prompt2scene/workflows/unified_scene_gen/schema.py | Unified-scene-gen JSON schemas |
| embodichain/gen_sim/prompt2scene/workflows/unified_scene_gen/scene_update.py | Unified-scene manifest updates |
| embodichain/gen_sim/prompt2scene/workflows/unified_scene_gen/state.py | Unified-scene-gen state typing |
| `embodichain/gen_sim/prompt2scene/agent_tools/__init__.py` | Agent-tools package init (currently missing header/`__future__`/`__all__`) |
| `embodichain/gen_sim/prompt2scene/agent_tools/servers/__init__.py` | External-servers package init |
| `embodichain/gen_sim/prompt2scene/agent_tools/clients/__init__.py` | Client package exports |
| embodichain/gen_sim/prompt2scene/agent_tools/clients/base.py | Shared HTTP client retry logic |
| embodichain/gen_sim/prompt2scene/agent_tools/clients/common.py | HTTP parsing/validation helpers |
| embodichain/gen_sim/prompt2scene/agent_tools/clients/config.py | Client config loader |
| `embodichain/gen_sim/prompt2scene/agent_tools/clients/image_generation_client/__init__.py` | Image-generation client exports |
| embodichain/gen_sim/prompt2scene/agent_tools/clients/image_generation_client/client.py | Z-Image client implementation |
| embodichain/gen_sim/prompt2scene/agent_tools/clients/image_generation_client/parser.py | Z-Image response parser |
| embodichain/gen_sim/prompt2scene/agent_tools/clients/image_generation_client/schemas.py | Z-Image client schemas |
| `embodichain/gen_sim/prompt2scene/agent_tools/clients/image_segmentation_client/__init__.py` | Image-segmentation client exports |
| embodichain/gen_sim/prompt2scene/agent_tools/clients/image_segmentation_client/client.py | SAM3 client implementation |
| embodichain/gen_sim/prompt2scene/agent_tools/clients/image_segmentation_client/parser.py | SAM3 response parser |
| embodichain/gen_sim/prompt2scene/agent_tools/clients/image_segmentation_client/schemas.py | SAM3 client schemas |
| embodichain/gen_sim/prompt2scene/agent_tools/clients/image_segmentation_client/utils.py | SAM3 visualization/mask utils |
| `embodichain/gen_sim/prompt2scene/agent_tools/clients/geometry_generation_client/__init__.py` | Geometry-gen client exports |
| embodichain/gen_sim/prompt2scene/agent_tools/clients/geometry_generation_client/client.py | Geometry-gen client implementation |
| embodichain/gen_sim/prompt2scene/agent_tools/clients/geometry_generation_client/parser.py | Geometry-gen response parser |
| embodichain/gen_sim/prompt2scene/agent_tools/clients/geometry_generation_client/schemas.py | Geometry-gen client schemas |
| `embodichain/gen_sim/prompt2scene/agent_tools/tools/__init__.py` | Tools package init |
| embodichain/gen_sim/prompt2scene/agent_tools/tools/gym_export.py | Gym export tool |
| embodichain/gen_sim/prompt2scene/agent_tools/tools/image_scene_asset_generation.py | Image-scene asset generation tool |
| embodichain/gen_sim/prompt2scene/agent_tools/tools/table_fit_scene.py | Table fitting tools |
| embodichain/gen_sim/prompt2scene/agent_tools/tools/text_asset_generation.py | Text asset generation tool |
| embodichain/gen_sim/prompt2scene/agent_tools/tools/text_clutter_layout.py | Text clutter layout tool |
| embodichain/gen_sim/prompt2scene/agent_tools/tools/text_scene_metric_scale.py | Text-scene metric scaling tool |
| `embodichain/gen_sim/prompt2scene/agent_tools/managers/blender_rendering_manager/__init__.py` | Blender rendering manager exports |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/blender_rendering_manager/manager.py | Blender rendering implementation |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/blender_rendering_manager/schemas.py | Blender rendering schemas |
| `embodichain/gen_sim/prompt2scene/agent_tools/managers/geometry_generation_manager/__init__.py` | Geometry-generation manager exports |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/geometry_generation_manager/manager.py | Geometry-generation orchestration |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/geometry_generation_manager/schemas.py | Geometry-generation schemas |
| `embodichain/gen_sim/prompt2scene/agent_tools/managers/geometry_manager/__init__.py` | Geometry manager exports |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/geometry_manager/manager.py | Mesh processing utilities |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/geometry_manager/schemas.py | Geometry manager schemas |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/geometry_manager/scene_geometry.py | Scene geometry helpers |
| `embodichain/gen_sim/prompt2scene/agent_tools/managers/image_generation_manager/__init__.py` | Image generation manager exports |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/image_generation_manager/manager.py | Image generation manager implementation |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/image_generation_manager/schemas.py | Image generation schemas |
| `embodichain/gen_sim/prompt2scene/agent_tools/managers/image_scene_manager/__init__.py` | Image scene manager exports |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/image_scene_manager/alignment.py | Image-scene alignment utilities |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/image_scene_manager/manifests.py | Image-scene manifest writers |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/image_scene_manager/prompts.py | Image-scene prompt builders |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/image_scene_manager/schemas.py | Image-scene schemas/constants |
| `embodichain/gen_sim/prompt2scene/agent_tools/managers/image_segmentation_manager/__init__.py` | Image segmentation manager exports |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/image_segmentation_manager/manager.py | Segmentation domain operations |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/image_segmentation_manager/schemas.py | Segmentation manager schemas |
| `embodichain/gen_sim/prompt2scene/agent_tools/managers/matplotlib_manager/__init__.py` | Matplotlib manager exports |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/matplotlib_manager/manager.py | Visualization implementation |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/matplotlib_manager/schemas.py | Visualization schemas |
| `embodichain/gen_sim/prompt2scene/agent_tools/managers/metric_scale_manager/__init__.py` | Metric scale manager exports |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/metric_scale_manager/manager.py | Metric scale logic |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/metric_scale_manager/schemas.py | Metric scale schemas |
| `embodichain/gen_sim/prompt2scene/agent_tools/managers/optimization_manager/__init__.py` | Optimization helpers exports |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/optimization_manager/manager.py | Layout/packing utilities |
| `embodichain/gen_sim/prompt2scene/agent_tools/managers/simulation_manager/__init__.py` | Simulation manager exports |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/simulation_manager/manager.py | Gravity-drop simulation wrapper |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/simulation_manager/schemas.py | Simulation manager schemas |
| `embodichain/gen_sim/prompt2scene/agent_tools/managers/simready_manager/__init__.py` | Simready manager exports |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/simready_manager/manager.py | Simready conversion logic |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/simready_manager/schemas.py | Simready schemas |
| `embodichain/gen_sim/prompt2scene/agent_tools/managers/table_clutter_fit_manager/__init__.py` | Table/clutter fit manager exports |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/table_clutter_fit_manager/manager.py | Table-to-clutter fitting logic |
| `embodichain/gen_sim/prompt2scene/agent_tools/managers/text_layout_manager/__init__.py` | Text layout manager exports |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/text_layout_manager/layout.py | Text layout placement logic |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/text_layout_manager/optimization.py | Text layout optimization |
| embodichain/gen_sim/prompt2scene/agent_tools/managers/text_layout_manager/settle.py | Physics settling helpers |
| @@ -0,0 +1 @@ | |||
| """Internal client + External server for agent tool calling.""" | |||
Comment on lines +1 to +15
| # ---------------------------------------------------------------------------- | ||
| # Copyright (c) 2021-2026 DexForce Technology Co., Ltd. | ||
| # | ||
| # Licensed under the Apache License, Version 2.0 (the "License"); | ||
| # you may not use this file except in compliance with the License. | ||
| # You may obtain a copy of the License at | ||
| # | ||
| # http://www.apache.org/licenses/LICENSE-2.0 | ||
| # | ||
| # Unless required by applicable law or agreed to in writing, software | ||
| # distributed under the License is distributed on an "AS IS" BASIS, | ||
| # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. | ||
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| # ---------------------------------------------------------------------------- | ||
No newline at end of file
Comment on lines +17 to +19
| from __future__ import annotations | ||
|
|
||
| __all__: list[str] = [] |
Comment on lines +20 to +23
| from importlib import resources | ||
| from pathlib import Path | ||
| from string import Template | ||
| from typing import Any, Mapping |
Comment on lines +76 to +79
| def _get_prompt_path(self, prompt_name: str) -> Path: | ||
| if "/" in prompt_name or "\\" in prompt_name: | ||
| raise ValueError(f"Prompt name must be a file name: {prompt_name}") | ||
| return resources.files(self._package).joinpath(prompt_name) |
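A note on the `_get_prompt_path` snippet above: `importlib.resources.files()` returns a `Traversable`, which is not guaranteed to be a `pathlib.Path` (for example when the package is imported from a zip archive), so the `-> Path` annotation may not always hold. The following is a minimal sketch, not the PR's implementation, of reading and rendering a bundled template through the `Traversable` interface instead. The function names `read_prompt_text` and `render_prompt` are hypothetical and chosen only for illustration.

```python
from __future__ import annotations

from importlib import resources
from string import Template
from typing import Any, Mapping


def read_prompt_text(package: str, prompt_name: str) -> str:
    """Read a bundled prompt file by bare file name (hypothetical helper)."""
    # Same guard as in the snippet above: reject anything that looks like a path.
    if "/" in prompt_name or "\\" in prompt_name:
        raise ValueError(f"Prompt name must be a file name: {prompt_name}")
    # files() returns a Traversable; use its own API rather than assuming Path.
    resource = resources.files(package).joinpath(prompt_name)
    if not resource.is_file():
        raise FileNotFoundError(f"Prompt not found in {package}: {prompt_name}")
    return resource.read_text(encoding="utf-8")


def render_prompt(text: str, values: Mapping[str, Any]) -> str:
    """Fill $placeholders; substitute() raises KeyError if one is missing."""
    return Template(text).substitute(values)
```

If a real filesystem path is required downstream, `importlib.resources.as_file()` can wrap the `Traversable` in a context manager that yields one.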
| sim.update(step=300) | ||
|
|
||
| final_pose = obj.get_local_pose(to_matrix=True)[0].detach().cpu() | ||
| sim._deferred_destroy() |
Comment on lines +1 to +21
| { | ||
| "sam3_segmentation": { | ||
| "base_url": "http://192.168.3.23:5014", | ||
| "timeout_s": 1200, | ||
| "health_path": "/health", | ||
| "segment_single_object_path": "/predict" | ||
| }, | ||
| "sam3d_generation": { | ||
| "base_url": "http://10.7.7.32:5019", | ||
| "timeout_s": 1800, | ||
| "health_path": "/health", | ||
| "generate_multiple_objects_path": "/generate_multiple_objects", | ||
| "generate_single_object_path": "/generate_single_object" | ||
| }, | ||
| "zimage": { | ||
| "base_url": "http://192.168.3.23:5013", | ||
| "timeout_s": 120, | ||
| "health_path": "/health", | ||
| "generate_single_object_path": "/generate.png" | ||
| } | ||
| } |
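The default config above commits `base_url` values in private address ranges (`192.168.x.x`, `10.x.x.x`), which are only reachable from inside the author's network. One possible way to keep the file as a template while letting each deployment supply its own endpoints is an environment-variable override. The sketch below is an assumption, not something the PR contains: the helper name `apply_env_overrides` and the `PROMPT2SCENE_<SERVICE>_BASE_URL` variable naming are both hypothetical.

```python
from __future__ import annotations

import os
from typing import Any, Mapping, Optional


def apply_env_overrides(
    config: Mapping[str, Mapping[str, Any]],
    environ: Optional[Mapping[str, str]] = None,
    prefix: str = "PROMPT2SCENE_",  # hypothetical prefix, not defined by the PR
) -> dict[str, dict[str, Any]]:
    """Return a copy of config with base_url replaced where an env var is set."""
    env = os.environ if environ is None else environ
    resolved: dict[str, dict[str, Any]] = {}
    for service, settings in config.items():
        merged = dict(settings)  # shallow copy so the input is not mutated
        override = env.get(f"{prefix}{service.upper()}_BASE_URL")
        if override:
            merged["base_url"] = override
        resolved[service] = merged
    return resolved
```

With this shape, setting `PROMPT2SCENE_ZIMAGE_BASE_URL` would replace only the `zimage` endpoint and leave timeouts and paths from the JSON file untouched.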
| # See the License for the specific language governing permissions and | ||
| # limitations under the License. | ||
| # ---------------------------------------------------------------------------- | ||
| """External servers, ignored by git, for testing or demo purposes.""" | ||
No newline at end of file
Comment on lines +70 to +80
| missing = [ | ||
| name | ||
| for name, value in { | ||
| "api_key": api_key, | ||
| "model": model, | ||
| "base_url": base_url, | ||
| }.items() | ||
| if not value | ||
| ] | ||
| if missing: | ||
| raise ValueError(f"Missing required LLM config keys: {missing}") |
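The check above collects every missing key before raising, so the user sees all problems in one error. Note that `if not value` accepts a whitespace-only string as present. A self-contained sketch of the same pattern with that case handled is shown below. The `LLMSettings` dataclass and `parse_llm_settings` function are hypothetical stand-ins, not the PR's `llms/config.py` types.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any, Mapping

REQUIRED_KEYS = ("api_key", "model", "base_url")


@dataclass(frozen=True)
class LLMSettings:
    """Hypothetical stand-in for the PR's LLM config dataclass."""

    api_key: str
    model: str
    base_url: str


def parse_llm_settings(raw: Mapping[str, Any]) -> LLMSettings:
    """Validate required keys and report all missing ones at once."""
    cleaned = {name: str(raw.get(name) or "").strip() for name in REQUIRED_KEYS}
    # Empty, None, and whitespace-only values all count as missing.
    missing = [name for name, value in cleaned.items() if not value]
    if missing:
        raise ValueError(f"Missing required LLM config keys: {missing}")
    return LLMSettings(**cleaned)
```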
Comment on lines +106 to +111
| kwargs: dict[str, Any] = { | ||
| "api_key": cfg.api_key, | ||
| "base_url": cfg.base_url, | ||
| "model": cfg.model, | ||
| "temperature": 0, | ||
| } |
Description
This PR adds a new `prompt2scene` module under `embodichain/gen_sim/` that implements a text/image-to-3D-scene generation pipeline. The pipeline takes text descriptions or images as input and generates 3D simulation scenes, leveraging LLM-based workflows for scene understanding, geometry generation, and layout optimization.
Key components:
Type of change
Checklist
`black .` command to format the code base.
🤖 Generated with Claude Code