Skip to content

[core] Fix the GCS crash when using LD_PRELOAD and make jemalloc work with worker as well#39192

Merged
fishbone merged 9 commits intoray-project:masterfrom
fishbone:tc-crash-fix
Sep 2, 2023
Merged

[core] Fix the GCS crash when using LD_PRELOAD and make jemalloc work with worker as well#39192
fishbone merged 9 commits intoray-project:masterfrom
fishbone:tc-crash-fix

Conversation

@fishbone
Copy link
Contributor

@fishbone fishbone commented Sep 1, 2023

Why are these changes needed?

libjemalloc was built into gcs and raylet in this PR. While it fixed the memory issues it brings some issues.

It'll break the workload if the user start ray with tcmalloc.

It doesn't work with core worker and for some workload, we need this.

This PR make the jemalloc a shared library and if there is LD_PRELOAD, it won't load jemalloc.so automatically.

Related issue number

Closes #39135

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

Signed-off-by: Yi Cheng <74173148+iycheng@users.noreply.github.com>
Signed-off-by: Yi Cheng <74173148+iycheng@users.noreply.github.com>
Signed-off-by: Yi Cheng <74173148+iycheng@users.noreply.github.com>
Signed-off-by: Yi Cheng <74173148+iycheng@users.noreply.github.com>
Signed-off-by: Yi Cheng <74173148+iycheng@users.noreply.github.com>
@fishbone
Copy link
Contributor Author

fishbone commented Sep 1, 2023

Still wip. I'll write test for this.

Signed-off-by: Yi Cheng <74173148+iycheng@users.noreply.github.com>
Copy link
Contributor

@jjyao jjyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the version of jemalloc we will be using?

Signed-off-by: Yi Cheng <74173148+iycheng@users.noreply.github.com>
Signed-off-by: Yi Cheng <74173148+iycheng@users.noreply.github.com>
@fishbone
Copy link
Contributor Author

fishbone commented Sep 2, 2023

@fishbone fishbone merged commit ef4d189 into ray-project:master Sep 2, 2023
fishbone added a commit to fishbone/ray that referenced this pull request Sep 2, 2023
… with worker as well (ray-project#39192)

## Why are these changes needed?
`libjemalloc` was built into gcs and raylet in [this](ray-project#38644) PR. While it fixed the memory issues it brings some issues.

It'll break the workload if the user start ray with tcmalloc.

It doesn't work with core worker and for some workload, we need this.

This PR make the jemalloc a shared library and if there is LD_PRELOAD, it won't load jemalloc.so automatically.

## Related issue number
Closes ray-project#39135
GeneDer pushed a commit that referenced this pull request Sep 3, 2023
… with worker as well (#39192) (#39235)

## Why are these changes needed?
`libjemalloc` was built into gcs and raylet in [this](#38644) PR. While it fixed the memory issues it brings some issues.

It'll break the workload if the user start ray with tcmalloc.

It doesn't work with core worker and for some workload, we need this.

This PR make the jemalloc a shared library and if there is LD_PRELOAD, it won't load jemalloc.so automatically.

## Related issue number
Closes #39135
LeonLuttenberger pushed a commit to jaidisido/ray that referenced this pull request Sep 5, 2023
… with worker as well (ray-project#39192)

## Why are these changes needed?
`libjemalloc` was built into gcs and raylet in [this](ray-project#38644) PR. While it fixed the memory issues it brings some issues.

It'll break the workload if the user start ray with tcmalloc.

It doesn't work with core worker and for some workload, we need this.

This PR make the jemalloc a shared library and if there is LD_PRELOAD, it won't load jemalloc.so automatically.

## Related issue number
Closes ray-project#39135
fishbone added a commit that referenced this pull request Sep 5, 2023
## Why are these changes needed?
In [PR](#39192), jemalloc is added, but it's not delivered with the wheel. This PR fixed it.
fishbone added a commit to fishbone/ray that referenced this pull request Sep 5, 2023
## Why are these changes needed?
In [PR](ray-project#39192), jemalloc is added, but it's not delivered with the wheel. This PR fixed it.
GeneDer pushed a commit that referenced this pull request Sep 5, 2023
## Why are these changes needed?
In [PR](#39192), jemalloc is added, but it's not delivered with the wheel. This PR fixed it.
jimthompson5802 pushed a commit to jimthompson5802/ray that referenced this pull request Sep 12, 2023
… with worker as well (ray-project#39192)

## Why are these changes needed?
`libjemalloc` was built into gcs and raylet in [this](ray-project#38644) PR. While it fixed the memory issues it brings some issues.

It'll break the workload if the user start ray with tcmalloc.

It doesn't work with core worker and for some workload, we need this.

This PR make the jemalloc a shared library and if there is LD_PRELOAD, it won't load jemalloc.so automatically.

## Related issue number
Closes ray-project#39135

Signed-off-by: Jim Thompson <jimthompson5802@gmail.com>
jimthompson5802 pushed a commit to jimthompson5802/ray that referenced this pull request Sep 12, 2023
## Why are these changes needed?
In [PR](ray-project#39192), jemalloc is added, but it's not delivered with the wheel. This PR fixed it.

Signed-off-by: Jim Thompson <jimthompson5802@gmail.com>
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023
… with worker as well (ray-project#39192)

## Why are these changes needed?
`libjemalloc` was built into gcs and raylet in [this](ray-project#38644) PR. While it fixed the memory issues it brings some issues.

It'll break the workload if the user start ray with tcmalloc.

It doesn't work with core worker and for some workload, we need this.

This PR make the jemalloc a shared library and if there is LD_PRELOAD, it won't load jemalloc.so automatically.

## Related issue number
Closes ray-project#39135

Signed-off-by: Victor <vctr.y.m@example.com>
vymao pushed a commit to vymao/ray that referenced this pull request Oct 11, 2023
## Why are these changes needed?
In [PR](ray-project#39192), jemalloc is added, but it's not delivered with the wheel. This PR fixed it.

Signed-off-by: Victor <vctr.y.m@example.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[core] Fix the LD_PRELOAD

3 participants