Control sharding codec read coalescing with ArrayConfig and runtime config options #3987
aldenks wants to merge 10 commits into
Codecov Report
✅ All modified and coverable lines are covered by tests.
@@ Coverage Diff @@
## main #3987 +/- ##
=======================================
Coverage 93.49% 93.49%
=======================================
Files 88 88
Lines 11861 11873 +12
=======================================
+ Hits 11089 11101 +12
Misses 772 772
disclaimer: I'm not a big fan of our global config object, so I'd like to explore some alternative ways for the sharding reads to access this configuration. A few options:
@d-v-b I also like new fields on ArrayConfig. Thinking that through:
That sound alright?
yeah, that sounds right. the array config object is designed to make it easy to get a cheap copy of an array with a new config, using the
Follow up #3004 by adding ArrayConfig and runtime configuration options for the thresholds that control how requests are coalesced when reading in the sharding codec.

Two new fields on ArrayConfig control how the sharding codec coalesces partial-shard reads: sharding_coalesce_max_gap_bytes (default 1 MiB) and sharding_coalesce_max_bytes (default 16 MiB). When reading multiple chunks from the same shard, nearby byte ranges are merged into a single request to the store if separated by no more than sharding_coalesce_max_gap_bytes and the merged read stays within sharding_coalesce_max_bytes. Defaults are seeded from the matching array.sharding_coalesce_max_gap_bytes / array.sharding_coalesce_max_bytes keys in zarr.config at array-creation time, and can be overridden per array by passing config={...} to zarr.create_array.

TODO:
docs/user-guide/*.md
changes/
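
To illustrate the merge rule described above, here is a minimal standalone sketch in plain Python. It is not the code from this PR: `ByteRange` and `coalesce_ranges` are hypothetical names made up for the example, and the only things taken from the description are the two thresholds and their defaults (1 MiB gap, 16 MiB merged size).

```python
from typing import NamedTuple

MiB = 1024 * 1024


class ByteRange(NamedTuple):
    """Hypothetical helper: a half-open byte range [start, end) within one shard."""
    start: int
    end: int


def coalesce_ranges(ranges, max_gap_bytes=1 * MiB, max_bytes=16 * MiB):
    """Merge nearby byte ranges into fewer, larger reads.

    A range is folded into the previous merged read only if the gap
    between them is at most max_gap_bytes AND the resulting read would
    be at most max_bytes long. This is an illustrative assumption about
    the rule, not the sharding codec's actual implementation.
    """
    merged = []
    for r in sorted(ranges):  # sort by start offset, then end offset
        if merged:
            last = merged[-1]
            gap = r.start - last.end
            new_end = max(last.end, r.end)
            if gap <= max_gap_bytes and new_end - last.start <= max_bytes:
                merged[-1] = ByteRange(last.start, new_end)
                continue
        merged.append(r)
    return merged


# Three chunk reads from one shard, with small thresholds for readability.
chunks = [ByteRange(0, 100), ByteRange(150, 250), ByteRange(5000, 5100)]
reads = coalesce_ranges(chunks, max_gap_bytes=64, max_bytes=1000)
print(reads)
```

With these inputs the first two ranges are 50 bytes apart and merge into one 250-byte read, while the third is too far away and stays a separate request. A larger gap threshold trades extra bytes transferred for fewer store requests, and the size cap keeps any single merged read bounded.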