[Bug] TorchMemorySaver observes invalid LD_PRELOAD when adding --disable-weights-backuper

### Bug Description

Raw error:
```sh
timer.py:24 - Timer train_wait start
Traceback (most recent call last):
  File "/root/slime/train.py", line 110, in <module>
    train(args)
  File "/root/slime/train.py", line 24, in train
    actor_model, critic_model = create_training_models(args, pgs, rollout_manager)
  File "/root/slime/slime/ray/placement_group.py", line 152, in create_training_models
    start_rollout_ids = ray.get(
  File "/usr/local/lib/python3.12/dist-packages/ray/_private/auto_init_hook.py", line 21, in auto_init_wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.12/dist-packages/ray/_private/client_mode_hook.py", line 103, in wrapper
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.12/dist-packages/ray/_private/worker.py", line 2822, in get
    values, debugger_breakpoint = worker.get_objects(object_refs, timeout=timeout)
  File "/usr/local/lib/python3.12/dist-packages/ray/_private/worker.py", line 930, in get_objects
    raise value.as_instanceof_cause()
ray.exceptions.RayTaskError(AssertionError): ray::MegatronTrainRayActor.init() (pid=289796, ip=33.163.45.138, actor_id=f847046dd56551e53b8fdcb002000000, repr=<slime.backends.megatron_utils.actor.MegatronTrainRayActor object at 0x7f4a53c113d0>)
  File "/root/slime/slime/utils/timer.py", line 97, in wrapper
    return fn(*args, **kwargs)
  File "/root/slime/slime/backends/megatron_utils/actor.py", line 113, in init
    self.weights_backuper.backup("actor")
  File "/root/slime/slime/utils/tensor_backper.py", line 96, in backup
    self._backup_hash_dict = _compute_hash_dict(dict(self._source_getter()))
  File "/root/slime/slime/backends/megatron_utils/update_weight/common.py", line 128, in <genexpr>
    ans = ((name, _maybe_get_cpu_backup(tensor)) for name, tensor in ans)
  File "/root/slime/slime/backends/megatron_utils/update_weight/common.py", line 136, in _maybe_get_cpu_backup
    if (cpu_tensor := torch_memory_saver.get_cpu_backup(x)) is not None:
  File "/usr/local/lib/python3.12/dist-packages/torch_memory_saver/entrypoint.py", line 86, in get_cpu_backup
    self._ensure_initialized()
  File "/usr/local/lib/python3.12/dist-packages/torch_memory_saver/entrypoint.py", line 92, in _ensure_initialized
    self._impl = _TorchMemorySaverImpl(**self._impl_ctor_kwargs)
  File "/usr/local/lib/python3.12/dist-packages/torch_memory_saver/entrypoint.py", line 100, in __init__
    self._binary_wrapper = BinaryWrapper(path_binary=self._hook_util.get_path_binary())
  File "/usr/local/lib/python3.12/dist-packages/torch_memory_saver/hooks/mode_preload.py", line 15, in get_path_binary
    assert len(interest_paths) == 1, (
AssertionError: TorchMemorySaver observes invalid LD_PRELOAD. You can use configure_subprocess() utility, or directly specify LD_PRELOAD=/path/to/torch_memory_saver_cpp.some-postfix.so python your_script.py. (LD_PRELOAD="" process_id=289796)

---------------------------------------
Job 'raysubmit_tpVuTC2CdefHwbfb' failed
---------------------------------------

Status message: Job entrypoint command failed with exit code 1, last available logs (truncated to 20,000 chars):
  File "/usr/local/lib/python3.12/dist-packages/torch_memory_saver/entrypoint.py", line 92, in _ensure_initialized
    self._impl = _TorchMemorySaverImpl(**self._impl_ctor_kwargs)
  File "/usr/local/lib/python3.12/dist-packages/torch_memory_saver/entrypoint.py", line 100, in __init__
    self._binary_wrapper = BinaryWrapper(path_binary=self._hook_util.get_path_binary())
  File "/usr/local/lib/python3.12/dist-packages/torch_memory_saver/hooks/mode_preload.py", line 15, in get_path_binary
    assert len(interest_paths) == 1, (
AssertionError: TorchMemorySaver observes invalid LD_PRELOAD. You can use configure_subprocess() utility, or directly specify LD_PRELOAD=/path/to/torch_memory_saver_cpp.some-postfix.so python your_script.py. (LD_PRELOAD="" process_id=289796)
```
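The failing assertion is `len(interest_paths) == 1`, and the message reports `LD_PRELOAD=""`. To see what a process actually inherits, a small diagnostic along these lines could be run inside the actor. It is only a sketch: `find_preload_candidates` is a hypothetical helper, and the assumption that the library looks for entries containing the substring `torch_memory_saver` is a guess based on the error message, not on the library source.

```python
import os


def find_preload_candidates(ld_preload: str, marker: str = "torch_memory_saver") -> list[str]:
    """Return the LD_PRELOAD entries whose path contains `marker`.

    Assumption: matching on the substring `marker` approximates whatever
    check torch_memory_saver performs; the real logic may differ.
    """
    # The dynamic linker accepts both colons and spaces as separators.
    entries = ld_preload.replace(":", " ").split()
    return [entry for entry in entries if marker in entry]


if __name__ == "__main__":
    value = os.environ.get("LD_PRELOAD", "")
    candidates = find_preload_candidates(value)
    # An empty list here corresponds to the LD_PRELOAD="" case in the traceback.
    print(f"pid={os.getpid()} LD_PRELOAD={value!r} candidates={candidates}")
```

Under this assumption, the assertion would pass only when exactly one matching entry is present, so an empty or unset `LD_PRELOAD` in the actor process would fail the same way as in the traceback.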




### Steps to Reproduce

Add only `--disable-weights-backuper` to a launch script that otherwise runs fine:
```sh
ray job submit --address="http://127.0.0.1:8265" \
   --runtime-env-json="${RUNTIME_ENV_JSON}" \
   -- python3 /root/slime/train.py \
   --actor-num-nodes 1 \
   --actor-num-gpus-per-node 8 \
   --rollout-num-gpus 8 \
   --num-gpus-per-node 16 \
   --sglang-log-level error \
   --load-debug-rollout-data "$WORKDIR/ckpt/slime/debug/data_{rollout_id}.pt" \
   --disable-weights-backuper \
   ${MODEL_ARGS[@]} \
   ${CKPT_ARGS[@]} \
   ${ROLLOUT_ARGS[@]} \
   ${OPTIMIZER_ARGS[@]} \
   ${GRPO_ARGS[@]} \
   ${WANDB_ARGS[@]} \
   ${PERF_ARGS[@]} \
   ${EVAL_ARGS[@]} \
   ${SGLANG_ARGS[@]} \
   ${MISC_ARGS[@]}
```
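The error message suggests specifying `LD_PRELOAD` directly. One possible, untested way to try that with the command above is to add the variable to the `env_vars` of the runtime environment passed through `--runtime-env-json`. The sketch below only builds that JSON. `with_ld_preload` is a hypothetical helper, the `.so` path is the placeholder from the error message rather than a real file name, and `EXAMPLE_VAR` merely stands in for whatever variables the existing `RUNTIME_ENV_JSON` already defines.

```python
import json


def with_ld_preload(runtime_env: dict, preload_path: str) -> dict:
    """Return a copy of `runtime_env` whose env_vars also set LD_PRELOAD."""
    updated = dict(runtime_env)
    # Copy env_vars so the caller's dictionary is left untouched.
    env_vars = dict(updated.get("env_vars", {}))
    env_vars["LD_PRELOAD"] = preload_path
    updated["env_vars"] = env_vars
    return updated


if __name__ == "__main__":
    # Placeholder copied from the error message; replace it with the actual
    # shared object shipped by the installed torch_memory_saver package.
    placeholder = "/path/to/torch_memory_saver_cpp.some-postfix.so"
    base = {"env_vars": {"EXAMPLE_VAR": "1"}}  # hypothetical existing settings
    print(json.dumps(with_ld_preload(base, placeholder)))
```

The printed string could then be exported as `RUNTIME_ENV_JSON`. Whether this avoids the assertion is not verified here, and preloading the library into every process of the job may have side effects that `configure_subprocess()` is designed to avoid.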

### Expected Behavior

`LD_PRELOAD` should be found automatically, so training starts normally with `--disable-weights-backuper`.

### Actual Behavior

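Training fails during `MegatronTrainRayActor.init()` with `AssertionError: TorchMemorySaver observes invalid LD_PRELOAD` (the message reports `LD_PRELOAD=""`).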

### Environment

- slime version:
- Python version:
- PyTorch version:
- CUDA/ROCm version:
- GPU type and count:
- OS:
- SGLang version (if relevant):
- Megatron-LM version (if relevant):


### Logs

```shell

```

### Additional Context

_No response_

### Pre-submission Checklist

- [x] I have read the [CONTRIBUTING.md](https://github.com/THUDM/slime/blob/main/CONTRIBUTING.md) and understand the collaboration scope.
- [x] I have read the [documentation](https://thudm.github.io/slime/) and my issue is not addressed there.
- [x] I have searched for [existing issues](https://github.com/THUDM/slime/issues) and this is not a duplicate.
- [x] I have provided a minimal, reproducible example.

[Bug] TorchMemorySaver observes invalid LD_PRELOAD when adding --disable-weights-backuper #1936