I read through the algorithms' code, and it appears that you call a fully qualified std::swap everywhere. It would be worth making an unqualified call to swap instead (with a using std::swap; declaration in scope) so that ADL can find user-defined swap functions, whether for correctness or for performance.
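
To illustrate the difference, here is a minimal sketch. Everything in it is hypothetical and not taken from the library: the namespace lib, the type Buffer, its custom_swaps counter (there only to make the chosen overload observable), and the helpers swap_qualified and swap_adl.

```cpp
#include <cassert>
#include <utility>  // std::swap

namespace lib {  // hypothetical user namespace

struct Buffer {
    int id = 0;
    int custom_swaps = 0;  // illustration only: counts calls to lib::swap
};

// User-defined swap. Unqualified calls find it through ADL because
// Buffer is declared in namespace lib.
void swap(Buffer& a, Buffer& b) noexcept {
    std::swap(a.id, b.id);
    ++a.custom_swaps;
    ++b.custom_swaps;
}

}  // namespace lib

// Qualified call: always the generic std::swap, never lib::swap.
template <typename T>
void swap_qualified(T& a, T& b) {
    std::swap(a, b);
}

// ADL-friendly call: std::swap is only the fallback.
template <typename T>
void swap_adl(T& a, T& b) {
    using std::swap;  // makes std::swap a candidate for types without their own swap
    swap(a, b);       // unqualified, so ADL can pick a user-defined overload
}

// Small self-check showing which overload each helper ends up calling.
void demo() {
    lib::Buffer a, b;
    a.id = 1;
    b.id = 2;

    swap_qualified(a, b);  // generic std::swap: lib::swap is bypassed
    assert(a.id == 2 && b.id == 1);
    assert(a.custom_swaps == 0 && b.custom_swaps == 0);

    swap_adl(a, b);        // lib::swap is selected
    assert(a.id == 1 && b.id == 2);
    assert(a.custom_swaps == 1 && b.custom_swaps == 1);

    int x = 3, y = 4;
    swap_adl(x, y);        // no user-defined swap for int: falls back to std::swap
    assert(x == 4 && y == 3);
}
```

The using-declaration matters: without it, the unqualified call would fail to compile for built-in types such as int, which have no associated namespace for ADL to search. If the library can require C++20, std::ranges::swap performs this lookup for you and might be an alternative.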