Skip to content

Add graph-storage crate#4198

Draft
TrueDoctor wants to merge 6 commits into
masterfrom
upstream-graph-storage
Draft

Add graph-storage crate#4198
TrueDoctor wants to merge 6 commits into
masterfrom
upstream-graph-storage

Conversation

@TrueDoctor
Copy link
Copy Markdown
Member

No description provided.

Copy link
Copy Markdown
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces the graph-storage crate, which provides a delta-based graph representation for the Graphite file format, including CRDT/delta computation, conversion utilities, and round-trip tests. It also cleans up legacy migrations in graph-craft and updates MemoHash to compare values instead of hashes. The review feedback highlights a potential DoS/OOM vulnerability when resizing the exports vector with unvalidated slot indices, suggests optimizing the O(N^2) pairwise comparison in order_consistent to O(N log N) by sorting, and recommends using into_iter().next() instead of drain(..).next() for idiomatic element extraction.

Important

The consumer version of Gemini Code Assist on GitHub is being sunset. Starting June 18, 2026, new organization installations will be blocked, and all code review activity will officially cease on July 17, 2026.
For more details on the timeline and next steps, please review the Help Documentation.

Comment thread document/graph-storage/src/lib.rs Outdated
Comment thread document/graph-storage/src/lib.rs Outdated
Comment thread node-graph/graph-craft/src/document/value.rs Outdated
@TrueDoctor
Copy link
Copy Markdown
Member Author

@cubic-ai-dev

@cubic-dev-ai
Copy link
Copy Markdown
Contributor

cubic-dev-ai Bot commented Jun 7, 2026

@cubic-ai-dev

@TrueDoctor I have started the AI code review. It will take a few minutes to complete.

Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

9 issues found across 26 files

Confidence score: 2/5

  • There are multiple high-confidence, user-impacting regressions (sev 7–8) across hashing, graph diffing, and serialization, so merge risk is currently high rather than routine.
  • Most severe: node-graph/libraries/core-types/src/memo.rs makes MemoHash equality ignore hash while Hash still includes it, which can violate hash collection invariants and cause incorrect map/set behavior.
  • Several storage/runtime paths risk dropped or corrupted data: missed network/export diffs in document/graph-storage/src/delta.rs, omitted export-only resources in document/graph-storage/src/from_runtime.rs, and silent truncation in document/graph-storage/src/to_runtime.rs when inputs lengths differ.
  • Pay close attention to node-graph/libraries/core-types/src/memo.rs, document/graph-storage/src/delta.rs, document/graph-storage/src/from_runtime.rs, document/graph-storage/src/to_runtime.rs, and node-graph/libraries/resources/src/lib.rs - they contain the highest-likelihood correctness and round-trip integrity risks.

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread node-graph/libraries/resources/src/lib.rs
Comment thread node-graph/libraries/vector-types/src/gradient.rs Outdated
Comment thread node-graph/libraries/core-types/src/memo.rs
Comment thread document/graph-storage/src/from_runtime.rs
Comment thread document/graph-storage/src/to_runtime.rs
Comment thread document/graph-storage/src/delta.rs
Comment thread document/graph-storage/src/delta.rs Outdated
Comment thread node-graph/graph-craft/src/document/value.rs Outdated
Comment thread document/graph-storage/src/from_runtime.rs Outdated
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

11 issues found across 19 files

Confidence score: 2/5

  • Multiple high-severity, high-confidence data-integrity risks make this PR risky to merge as-is, especially in core graph storage/CRDT paths (document/graph-storage/src/attributes.rs, document/graph-storage/src/to_runtime.rs, document/graph-storage/src/crdt.rs, document/graph-storage/src/ids.rs).
  • The most severe issue is inconsistent Priority equality/ordering/hashing in document/graph-storage/src/attributes.rs, which can break HashMap/BTree* behavior and lead to hard-to-debug corruption-like behavior.
  • There are additional concrete regression risks affecting user-visible state correctness: duplicate node IDs can silently collapse nodes during runtime reconstruction, deletes can be resurrected without tombstones, and non-canonical Rev hashing can make identical deltas produce different IDs.
  • Pay close attention to document/graph-storage/src/attributes.rs, document/graph-storage/src/to_runtime.rs, document/graph-storage/src/crdt.rs, document/graph-storage/src/ids.rs, document/graph-storage/src/session.rs, document/graph-storage/src/delta.rs, document/graph-storage/src/model.rs - core invariants and history/diff behavior can diverge from expected graph state.

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread document/graph-storage/src/attributes.rs Outdated
Comment thread document/graph-storage/src/to_runtime.rs Outdated
Comment thread document/graph-storage/src/crdt.rs
Comment thread document/graph-storage/src/ids.rs
Comment thread document/graph-storage/src/session.rs Outdated
Comment thread document/graph-storage/src/model.rs
Comment thread document/graph-storage/src/model.rs
Comment thread document/graph-storage/Cargo.toml Outdated
Comment thread document/graph-storage/Cargo.toml Outdated
Comment thread document/graph-storage/Cargo.toml Outdated
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3 issues found across 19 files

Confidence score: 3/5

  • There is meaningful regression risk in document/graph-storage/src/session.rs: the unconditional registry reset can drop visible hot-zone edits and desynchronize working state from hot_log when unretired ops are present, which is user-facing state integrity impact.
  • document/graph-storage/src/session.rs also has a fallible path in commit_ops that currently uses assert! for missing parents, so malformed or unexpected input could panic the app instead of returning an error.
  • document/graph-storage/src/ids.rs deriving Default for LamportClock introduces an ambiguous/invalid construction path without an explicit PeerId; this is moderate API correctness risk rather than an immediate blocker.
  • Pay close attention to document/graph-storage/src/session.rs, document/graph-storage/src/ids.rs - state rebuild correctness, panic-free error handling, and explicit clock initialization need validation before merge.

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

Comment thread document/graph-storage/src/session.rs
Comment thread document/graph-storage/src/ids.rs Outdated
Comment thread document/graph-storage/src/session.rs Outdated
Copy link
Copy Markdown
Contributor

@cubic-dev-ai cubic-dev-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

5 issues found across 19 files

Confidence score: 2/5

  • There are concrete regression risks in core behavior: document/graph-storage/src/document.rs can drop the root delta in history iteration, which can break history-based restoration when a removal happens at the root commit.
  • document/graph-storage/src/to_runtime.rs lacks same-network validation during node-reference remapping, so cross-network references may be converted into wrong or dangling runtime IDs instead of failing fast.
  • Additional medium-severity correctness issues in document/graph-storage/src/attributes.rs (finite-value invariant bypass for Priority) and document/graph-storage/src/delta.rs (nondeterministic diffs from HashSet iteration) increase the chance of inconsistent state and revision behavior.
  • Pay close attention to document/graph-storage/src/document.rs, document/graph-storage/src/to_runtime.rs, document/graph-storage/src/attributes.rs, and document/graph-storage/src/delta.rs - they contain the highest-impact correctness and determinism risks before merge.
Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="document/graph-storage/src/attributes.rs">

<violation number="1" location="document/graph-storage/src/attributes.rs:113">
P2: `Priority` exposes its inner `f64`, so callers/deserialization can bypass the finite-value invariant enforced by `Priority::new`. Make invalid states unrepresentable by preventing direct construction and validating deserialization.</violation>
</file>

<file name="document/graph-storage/src/document.rs">

<violation number="1" location="document/graph-storage/src/document.rs:457">
P1: History iterator drops the root delta because `?` short-circuits before `Some(delta)`. This breaks history-based restoration when the relevant removal is at the root commit.</violation>
</file>

<file name="document/graph-storage/src/delta.rs">

<violation number="1" location="document/graph-storage/src/delta.rs:12">
P2: Diff generation is nondeterministic because it iterates `HashSet` differences/intersections. This makes identical state transitions produce different commit sequences and revision IDs across runs.</violation>
</file>

<file name="document/graph-storage/src/to_runtime.rs">

<violation number="1" location="document/graph-storage/src/to_runtime.rs:99">
P2: Per-network conversion does repeated full-map scans of `node_instances`, causing avoidable quadratic work on nested/large graphs. This can noticeably slow full rebuilds.</violation>

<violation number="2" location="document/graph-storage/src/to_runtime.rs:228">
P1: Node-reference remapping lacks a same-network validation guard. Cross-network references can be converted into wrong/dangling local runtime IDs instead of returning an error.</violation>
</file>

Reply with feedback, questions, or to request a fix.

Re-trigger cubic

fn next(&mut self) -> Option<Self::Item> {
let delta = self.document.history.get(&self.parent_rev)?;
// First parent only for now. Local-chain walking (filter by author) is a follow-up.
self.parent_rev = *delta.parents.first()?;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: History iterator drops the root delta because ? short-circuits before Some(delta). This breaks history-based restoration when the relevant removal is at the root commit.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At document/graph-storage/src/document.rs, line 457:

<comment>History iterator drops the root delta because `?` short-circuits before `Some(delta)`. This breaks history-based restoration when the relevant removal is at the root commit.</comment>

<file context>
@@ -0,0 +1,480 @@
+	fn next(&mut self) -> Option<Self::Item> {
+		let delta = self.document.history.get(&self.parent_rev)?;
+		// First parent only for now. Local-chain walking (filter by author) is a follow-up.
+		self.parent_rev = *delta.parents.first()?;
+		Some(delta)
+	}
</file context>


fn convert_input(registry: &Registry, input: &NodeInput, input_attributes: &crate::Attributes) -> Result<GraphCraftNodeInput, ConversionError> {
Ok(match input {
NodeInput::Node { node_id, output_index } => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1: Node-reference remapping lacks a same-network validation guard. Cross-network references can be converted into wrong/dangling local runtime IDs instead of returning an error.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At document/graph-storage/src/to_runtime.rs, line 228:

<comment>Node-reference remapping lacks a same-network validation guard. Cross-network references can be converted into wrong/dangling local runtime IDs instead of returning an error.</comment>

<file context>
@@ -0,0 +1,277 @@
+
+fn convert_input(registry: &Registry, input: &NodeInput, input_attributes: &crate::Attributes) -> Result<GraphCraftNodeInput, ConversionError> {
+	Ok(match input {
+		NodeInput::Node { node_id, output_index } => {
+			let referenced = registry.node_instances.get(node_id).ok_or(ConversionError::NodeNotFound(*node_id))?;
+			let local_id = referenced.attributes.get(ORIGINAL_NODE_ID).and_then(|v| v.value.as_u64()).unwrap_or(*node_id);
</file context>

/// exact tie between two peers inserting at the same gap is broken by `PeerId` in [`SourceKey`].
/// `f64` precision is ample for the short fallback chains resources carry in practice.
#[derive(Copy, Clone, Debug, Serialize, Deserialize)]
pub struct Priority(pub f64);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Priority exposes its inner f64, so callers/deserialization can bypass the finite-value invariant enforced by Priority::new. Make invalid states unrepresentable by preventing direct construction and validating deserialization.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At document/graph-storage/src/attributes.rs, line 113:

<comment>`Priority` exposes its inner `f64`, so callers/deserialization can bypass the finite-value invariant enforced by `Priority::new`. Make invalid states unrepresentable by preventing direct construction and validating deserialization.</comment>

<file context>
@@ -0,0 +1,155 @@
+/// exact tie between two peers inserting at the same gap is broken by `PeerId` in [`SourceKey`].
+/// `f64` precision is ample for the short fallback chains resources carry in practice.
+#[derive(Copy, Clone, Debug, Serialize, Deserialize)]
+pub struct Priority(pub f64);
+
+impl Priority {
</file context>

pub fn compute_deltas(from: &Registry, to: &Registry) -> Vec<RegistryDelta> {
let mut deltas = Vec::new();

let from_network_ids: HashSet<NetworkId> = from.networks.keys().copied().collect();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Diff generation is nondeterministic because it iterates HashSet differences/intersections. This makes identical state transitions produce different commit sequences and revision IDs across runs.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At document/graph-storage/src/delta.rs, line 12:

<comment>Diff generation is nondeterministic because it iterates `HashSet` differences/intersections. This makes identical state transitions produce different commit sequences and revision IDs across runs.</comment>

<file context>
@@ -0,0 +1,370 @@
+pub fn compute_deltas(from: &Registry, to: &Registry) -> Vec<RegistryDelta> {
+	let mut deltas = Vec::new();
+
+	let from_network_ids: HashSet<NetworkId> = from.networks.keys().copied().collect();
+	let to_network_ids: HashSet<NetworkId> = to.networks.keys().copied().collect();
+
</file context>

}

let mut nodes: FxHashMap<RuntimeNodeId, DocumentNode> = FxHashMap::default();
for (&global_id, node) in registry.node_instances.iter().filter(|(_, node)| node.network == network_id) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2: Per-network conversion does repeated full-map scans of node_instances, causing avoidable quadratic work on nested/large graphs. This can noticeably slow full rebuilds.

Prompt for AI agents
Check if this issue is valid — if so, understand the root cause and fix it. At document/graph-storage/src/to_runtime.rs, line 99:

<comment>Per-network conversion does repeated full-map scans of `node_instances`, causing avoidable quadratic work on nested/large graphs. This can noticeably slow full rebuilds.</comment>

<file context>
@@ -0,0 +1,277 @@
+	}
+
+	let mut nodes: FxHashMap<RuntimeNodeId, DocumentNode> = FxHashMap::default();
+	for (&global_id, node) in registry.node_instances.iter().filter(|(_, node)| node.network == network_id) {
+		let local_id = node.attributes.get(ORIGINAL_NODE_ID).and_then(|v| v.value.as_u64()).unwrap_or(global_id);
+		let runtime_id = RuntimeNodeId(local_id);
</file context>

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant