Buffer::BlockCopy is an FCALL that delegates to the native memmove, and it may spend a long time there depending on how much is being copied.
If GC needs to sync with user threads at an inconvenient time, everything stops until the memmove is done: GC waits for the memmove, and everything else waits for GC.
There was an actual reported scenario where GC pauses could take up to a minute due to this
(dealing with very large streams, potentially swapped out, ...).
We should either "chunk" large copies into smaller pieces with GC polling between the pieces, or just move the whole thing to managed code.
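
For illustration, the "chunking" option could look roughly like the sketch below. This is not the runtime's code: `chunked_memmove`, `gc_poll_fn`, `count_polls` and the 64 KiB `COPY_CHUNK_BYTES` are hypothetical names and values chosen for the example, and the poll is just a callback standing in for whatever safe-point mechanism the runtime would use. It also ignores that a real GC poll may relocate the arrays, so actual runtime code would have to re-derive both pointers after every poll.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical chunk size; a real value would have to be measured. */
#define COPY_CHUNK_BYTES ((size_t)64 * 1024)

/* Hypothetical stand-in for the runtime's GC poll (safe point). */
typedef void (*gc_poll_fn)(void *ctx);

/* Example callback: counts how many times the copy offered to yield. */
static void count_polls(void *ctx)
{
    ++*(size_t *)ctx;
}

/* memmove semantics, done in bounded chunks with a poll between chunks. */
static void chunked_memmove(void *dst, const void *src, size_t len,
                            gc_poll_fn poll, void *ctx)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(COPY_CHUNK_BYTES > 0);
    if (d == s || len == 0)
        return;

    /* Forward order is safe unless dst starts inside [src, src + len). */
    if ((uintptr_t)d - (uintptr_t)s >= len) {
        while (len > 0) {
            size_t n = len < COPY_CHUNK_BYTES ? len : COPY_CHUNK_BYTES;
            memmove(d, s, n);
            d += n;
            s += n;
            len -= n;
            if (len > 0 && poll)
                poll(ctx);          /* let a pending GC proceed here */
        }
    } else {
        /* dst overlaps the tail of src: copy the last chunk first. */
        while (len > 0) {
            size_t n = len < COPY_CHUNK_BYTES ? len : COPY_CHUNK_BYTES;
            len -= n;
            memmove(d + len, s + len, n);
            if (len > 0 && poll)
                poll(ctx);
        }
    }
}
```

With this shape, the longest stretch during which the thread cannot yield to GC is bounded by one chunk rather than by the total copy size. The chunk order has to follow the overlap direction, otherwise a later chunk would read bytes that an earlier chunk already overwrote.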