PlaceholderTable/SparseTable: add transient SQLite error resilience via GVFSTable base class #2031
Open
tyrielv wants to merge 1 commit into
…ia GVFSTable base class

Extract shared retry and error-handling logic into GVFSTable base class, used by both PlaceholderTable and SparseTable. This provides:
- ExecuteWrite: serialized writes with retry on BUSY/LOCKED/IOERR
- ExecuteRead: reads with retry on transient errors
- ExecuteNonCriticalRead: returns fallback on transient error (heartbeat)
- ExecuteReadThenWrite: mixed operations with retry

Transient errors handled (up to 5 retries with linear backoff):
- SQLITE_BUSY (5): connection-level lock contention
- SQLITE_LOCKED (6): table-level lock contention (fixes #59353072)
- SQLITE_IOERR (10): disk I/O errors from AV/ReFS/disk busyness

Non-critical count methods (GetCount, GetFilePlaceholdersCount, GetFolderPlaceholdersCount) return -1 on transient failure rather than throwing, since they are only consumed by heartbeat telemetry.

Also fixes pre-existing copy-paste bug in exception messages where GetFilePlaceholdersCount/GetFolderPlaceholdersCount reported as GetCount.

Assisted-by: Claude Opus 4.6
Signed-off-by: Tyrie Vella <tyrielv@gmail.com>
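The retry behavior described in the commit message can be sketched roughly as follows. This is an illustration only, not the PR's GVFSTable code: the class, method, and exception names are hypothetical, the 50 ms step is an assumption chosen to match the 50ms ... 250ms delays listed in the PR description, and the PR does not say whether the non-critical path retries before falling back (this sketch does).

```csharp
using System;
using System.Threading;

// Hypothetical stand-in for "a SQLite error classified as transient".
public sealed class TransientDbException : Exception
{
    public TransientDbException(string message) : base(message) { }
}

// Hypothetical sketch; names and signatures are not the PR's GVFSTable API.
public static class RetrySketch
{
    public const int MaxRetries = 5;     // "up to 5 retries" per the commit message
    public const int BackoffStepMs = 50; // assumed step: 50, 100, 150, 200, 250 ms

    // Runs the operation, retrying transient failures with a linearly growing delay.
    // The sleep callback is injectable so the backoff can be observed without waiting.
    public static T ExecuteWithRetry<T>(Func<T> operation, Action<int> sleep = null)
    {
        Action<int> wait = sleep ?? new Action<int>(Thread.Sleep);
        for (int attempt = 0; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (TransientDbException) when (attempt < MaxRetries)
            {
                // Linear backoff: the delay grows by one step per failed attempt.
                wait(BackoffStepMs * (attempt + 1));
            }
        }
    }

    // Non-critical variant: once retries are exhausted, return a fallback
    // (for example -1 for a heartbeat count) instead of throwing.
    public static T ExecuteNonCritical<T>(Func<T> operation, T fallback, Action<int> sleep = null)
    {
        try
        {
            return ExecuteWithRetry(operation, sleep);
        }
        catch (TransientDbException)
        {
            return fallback;
        }
    }
}
```

With these assumptions an operation that never succeeds is attempted six times in total (one initial call plus five retries), and only the non-critical variant swallows the final failure. Non-transient exceptions are not caught in either path.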
Problem
Two classes of transient SQLite errors hit PlaceholderTable in production:
- SQLITE_IOERR (10) - telemetry shows repeated disk I/O errors in GetFilePlaceholdersCount() during heartbeat, caused by ReFS snapshots, antivirus, or momentary disk busyness.
- SQLITE_LOCKED (6) - Bug 59353072: table lock contention on the Placeholder table during concurrent operations.
Both PlaceholderTable and SparseTable shared an identical pattern (connection pool, writer lock, try/catch wrapping) with no retry or transient error handling.
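To make the error classes concrete, here is a minimal sketch of how a result code might be classified as transient. The numeric values are SQLite's primary result codes as quoted in this PR. The class and method names are hypothetical and are not the PR's SqliteErrorCodes implementation, and masking extended result codes down to their primary code is an assumption about how such a check could be made robust.

```csharp
using System;

// Hypothetical helper for illustration; not the PR's SqliteErrorCodes class.
public static class TransientSqliteErrors
{
    // Primary result codes as defined by SQLite.
    public const int Busy = 5;   // SQLITE_BUSY
    public const int Locked = 6; // SQLITE_LOCKED
    public const int IoErr = 10; // SQLITE_IOERR

    // Extended result codes carry the primary code in their low 8 bits,
    // so mask before comparing (assumption: callers may see extended codes).
    public static bool IsTransient(int resultCode)
    {
        int primary = resultCode & 0xFF;
        return primary == Busy || primary == Locked || primary == IoErr;
    }
}
```

For example, the extended code SQLITE_IOERR_READ (266) masks down to 10 and would be treated as transient, while SQLITE_CONSTRAINT (19) would not.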
Fix
Extract shared logic into GVFSTable base class with four execution primitives:
- ExecuteWrite
- ExecuteRead
- ExecuteNonCriticalRead
- ExecuteReadThenWrite

Transient errors retried (linear backoff: 50ms, 100ms, ... 250ms):
- SQLITE_BUSY (5) - connection-level lock contention
- SQLITE_LOCKED (6) - table-level lock contention
- SQLITE_IOERR (10) - disk I/O errors

Non-critical count methods (GetCount, GetFilePlaceholdersCount, GetFolderPlaceholdersCount) return -1 on transient failure - only consumed by heartbeat telemetry.

Also fixes pre-existing copy-paste bug where GetFilePlaceholdersCount/GetFolderPlaceholdersCount exception messages incorrectly said GetCount.

Files Changed
- GVFSTable.cs - base class with retry/lock/error infrastructure
- PlaceholderTable.cs - inherits GVFSTable, all operations use base methods
- SparseTable.cs - same treatment
- SqliteErrorCodes.cs - added BUSY, LOCKED, IOERR constants + IsTransientError()

Validation
GetAllEntries