Skip to content

Better allocator support#2

Open
kstppd wants to merge 43 commits into
fmihpc:masterfrom
kstppd:better_allocator_support
Open

Better allocator support#2
kstppd wants to merge 43 commits into
fmihpc:masterfrom
kstppd:better_allocator_support

Conversation

@kstppd

@kstppd kstppd commented Sep 6, 2024

Copy link
Copy Markdown
Collaborator

So in this PR there is some pruning in the SplitVector code to enable the use of external allocators such as Umpire. This PR enables Hashinator as well to propagate the allocators to SplitVector. A citation methods has also been added and as well as some changes to the README.

  • The split::split_host_allocator is not needed anymore an replace with std::allocator
  • The metadata size and capacity are now allocated by the template allocator in splitvector. This enables us to use a standardized allocators but comes with the hacky part of some extra type casting that needs to take place. In cases where sizeof(T) is enormous this approach is not efficient, but since Hashinator and SplitVectors are meant to be used with trivial types this is rather unlikely.

@kstppd kstppd marked this pull request as ready for review September 27, 2024 07:47

@markusbattarbee markusbattarbee left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't look through tests yet, but some commentary and fix suggestions.

Comment thread include/hashinator/hashinator.h Outdated
Comment thread include/hashinator/hashinator.h Outdated
Comment thread include/hashinator/hashinator.h Outdated
Comment thread include/hashinator/hashinator.h Outdated
Comment thread include/hashinator/hashinator.h
Comment thread include/splitvector/splitvec.h Outdated
Comment thread include/splitvector/splitvec.h Outdated
Comment thread include/splitvector/splitvec.h Outdated
Comment thread include/splitvector/splitvec.h Outdated
Comment thread README.md

@markusbattarbee markusbattarbee left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bump

Comment thread include/hashinator/hashinator.h Outdated
Comment thread include/hashinator/hashinator.h
}
return elements.size();
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

up - what I mean is that with Allocator support, will the old versions (which do not template allocators) ever be called anymore? Or could they be safely deleted?

Comment thread include/hashinator/hashinator.h
Comment thread include/hashinator/hashinator.h Outdated
Comment thread include/splitvector/splitvec.h Outdated
Comment thread include/splitvector/splitvec.h
Comment thread include/splitvector/splitvec.h
Comment thread include/splitvector/splitvec.h Outdated
Comment thread include/hashinator/hashinator.h
@markusbattarbee

Copy link
Copy Markdown

What do you think, how much work would it be to test this also with MallocMC? That would be nice to report. :)

Also, in case you didn't notice, there's some merge conflicts that still need resolving.

@markusbattarbee markusbattarbee left a comment

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few major things still, please.

}
return elements.size();
}

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a lot of code duplication :( Couldn't you put the default allocator as the default argument to this templated function? (same thing as in split_tools.h line 104)

Comment thread include/hashinator/hashinator.h
Comment thread include/hashinator/hashinator.h Outdated
Comment thread include/splitvector/splitvec.h
Comment thread include/splitvector/splitvec.h Outdated
HOSTONLY void copyMetadata(SplitInfo* dst, split_gpuStream_t s = 0) {
SPLIT_CHECK_ERR(split_gpuMemcpyAsync(&dst->size, _size, sizeof(size_t), split_gpuMemcpyDeviceToHost, s));
SPLIT_CHECK_ERR(split_gpuMemcpyAsync(&dst->capacity, _capacity, sizeof(size_t), split_gpuMemcpyDeviceToHost, s));
SPLIT_CHECK_ERR(split_gpuMemcpyAsync(&dst->size, _info, sizeof(SplitInfo), split_gpuMemcpyDeviceToHost, s));

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still needs changing to point to just dst

assert(0 && "Splitvector has a catastrophic failure trying to resize on device.");
}
if (construct) {
if constexpr (!std::is_trivially_copy_constructible_v<T> ){

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in that case there needs to be some other workaround - by default, a vector of a given size has all elements set to zero. This must still hold true for SplitVector. If necessary, you can add:

if (_construct) {
   for (size_t i = size(); i < newSize; ++i) {
      data[i] = T();
   }
}

or something similar, I guess. Ugly, but potentially needs to be done.
Or can you check somehow if the allocator has a construct method? Make an issue about it on the Umpire repo?

Comment thread include/splitvector/splitvec.h
Comment thread include/splitvector/splitvec.h Outdated
Comment thread include/hashinator/hashinator.h Outdated

// Allocate with Mempool
const size_t memory_for_pool = 8 * nBlocks * sizeof(uint32_t);
splitStackArena mPool(memory_for_pool, s);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🤔

@kstppd

kstppd commented Apr 12, 2025

Copy link
Copy Markdown
Collaborator Author

What do you think, how much work would it be to test this also with MallocMC? That would be nice to report. :)
I am now having a look on how that can be interfaces in one of our unit tests

Comment thread include/hashinator/hashinator.h

// Allocate with Mempool
const size_t memory_for_pool = 8 * nBlocks * sizeof(uint32_t);
splitStackArena mPool(memory_for_pool, s);

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO the allocator should be the one handling the memory pools.

}
#if defined(__HIP__) && !defined(AMD_COHERENCE_FINE)
int device;
SPLIT_CHECK_ERR(split_gpuGetDevice(&device));

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

<3

@markusbattarbee

Copy link
Copy Markdown

In general I would like to see less code duplication as that leads to potential issues in future development, having to fix things in several places, but I guess it's out of my hands now :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants