HDDS-15359. Read chunks into pooled direct buffers#10353
Draft
smengcl wants to merge 1 commit into
Draft
Conversation
Generated-by: Claude Code (Opus 4.7)
Contributor
|
@smengcl , thanks for working on this. The idea overall looks good to me. Stream read uses KeyValueHandler#readBlockImpl, it's not covered in the patch. Further more, we should consider add a limit to the direct buffer pool in case it causes OOM failure. There is an existing example, BoundedElasticByteBufferPool. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Generated-by: Claude Code (Opus 4.7)
What changes were proposed in this pull request?
BufferUtils.assignByteBufferspreviously allocated heapByteBuffers for chunk reads. Reading from aFileChannelinto a heap buffer forces the JDK to first read into a thread-localDirectByteBuffer(insun.nio.ch.IOUtil) and then memcpy the chunk-sized payload into the user's heap buffer. So every default-config chunk read paid one extra user-space chunk-sized memcpy plus one chunk-sized heap allocation.Two prior PRs added opt-in direct-buffer paths to skip exactly this copy:
ChunkUtils.readDataNettyChunkedNioFilevia NettyPooledByteBufAllocator.DEFAULTMappedBufferManagermmap reads with semaphore-tracked quotaBoth deliberately avoided raw
ByteBuffer.allocateDirectbecause it is a known leak pattern under sustained load. Tiny wrapper objects mean GC fires rarely, theCleanerthat frees direct memory never runs promptly, and the native budget exhausts withOutOfMemoryError: Direct buffer memory. Neither PR updated the defaultassignByteBufferspath, so the heap allocation stayed.This change uses Hadoop's
org.apache.hadoop.util.DirectBufferPool, the same pool already used byBlockOutputStreamandKeyValueContainerCheck. Release plumbing lives entirely inChunkUtils.readData: a release lambda is registered onDispatcherContext.setReleaseMethod, fires once inGrpcXceiverService'sfinallyblock afterresponseObserver.onNextreturns (the standard server marshaller is synchronous, so the response bytes are already serialized into the Netty outgoing buffer by then), and returns each buffer to the pool. Paths without a release-supporting context fall back to GC reclaim via the pool'sWeakReferencelayer, so the pool degrades to plainallocateDirectrather than leaking.ChunkBufferandChunkBufferImplWithByteBufferListare unchanged from master.A consumer audit confirmed no
.array()usage breaks with direct buffers.ChecksumByteBufferImplguards.array()withhasArray(), andUnsafeByteOperations.unsafeWrapof a directByteBufferproduces a zero-copyNioByteStringwithout calling.array(). One test (TestChunkUtils.concurrentReadWriteOfSameFile) called.array()directly and was updated to useget(byte[]).Expected impact
Per chunk read:
Wall-clock effect depends on storage:
Heap-churn reduction is consistent across storage tiers: at 100 MB/s read throughput a DN sheds roughly 100 MB/s of heap allocation.
Pool sizing and operational note
DirectBufferPoolis unbounded by design (perConcurrentMap<Integer, Queue<WeakReference<ByteBuffer>>>). Steady-state pool memory converges to peak concurrent in-flight buffers per capacity class. For default-config chunk reads (4 MB) on a 64-thread DN, that is roughly 256 MB of direct memory. Operators should ensure-XX:MaxDirectMemorySizeaccommodates this; the existingBlockOutputStreamandKeyValueContainerCheckconsumers ofDirectBufferPoolare already sized this way.What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-15359
How was this patch tested?