[rllib] Best workflow to train, save, and test agent
What is your question?
This is a great framework, but after reading the documentation and playing around for weeks, I'm still struggling to get a simple workflow working: train a PPO agent, save a checkpoint and the training stats at the end, and then use the trained agent for evaluation or visualization.
It starts with my confusion about the two ways of training an RL agent. Either I use the trainer directly:
```python
from ray.rllib.agents.ppo import PPOTrainer

trainer = PPOTrainer(env="CartPole-v0", config={"train_batch_size": 4000})
while True:
    print(trainer.train())
```
This makes saving my agent simple with trainer.save(path), and I can use the trained agent afterwards for testing with trainer.compute_action(observation). But: as far as I know, I cannot change the log directory, which always defaults to ~/ray_results.
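Concretely, the first workflow looks roughly like this (a minimal sketch; the single train() call and the checkpoint directory are just placeholders):

```python
import gym
from ray.rllib.agents.ppo import PPOTrainer

trainer = PPOTrainer(env="CartPole-v0", config={"train_batch_size": 4000})
trainer.train()  # one training iteration; in practice, loop as above

# Saving the trained agent is a one-liner ...
checkpoint_path = trainer.save("./my_checkpoints")  # returns the checkpoint path

# ... and so is acting with the trained policy.
env = gym.make("CartPole-v0")
obs = env.reset()
action = trainer.compute_action(obs)
```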
Or I use ray.tune.run():
```python
from ray import tune
from ray.rllib.agents.ppo import PPOTrainer

my_path = "./results"  # custom log directory
tune.run(
    PPOTrainer,
    config={"env": "CartPole-v0", "train_batch_size": 4000},
    local_dir=my_path,
    checkpoint_at_end=True,
)
```
This allows me to configure a custom local_dir for my logs and to create a checkpoint at the end. But: as far as I know, I don't have access to my trained agent afterwards. ray.tune.run() just returns an ExperimentAnalysis object, not my trained agent, and not even the exact path of the checkpoints (which includes a random hash) that I could use to load the agent. The experiment_id in the results does not correspond to the hash used in the directory name, so I cannot reconstruct the directory name.
My only resort at the moment is to split the workflow into two separate steps: train with ray.tune.run(), then manually find and copy & paste the path of the last checkpoint in order to load and test the agent, roughly as sketched below. Very inconvenient.
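That second step then looks roughly like this (a sketch; the checkpoint path is whatever I find and copy from the trial directory):

```python
from ray.rllib.agents.ppo import PPOTrainer

# Manually copied from the trial directory created by tune.run (note the hash).
checkpoint_path = "./results/PPO/PPO_CartPole-v0_0_<some-hash>/checkpoint_10/checkpoint-10"

agent = PPOTrainer(env="CartPole-v0", config={"train_batch_size": 4000})
agent.restore(checkpoint_path)

# From here on, agent.compute_action(obs) works as in the first workflow.
```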
There must be a more convenient way to do what I want, right?
Ray version and other system information (Python version, TensorFlow version, OS):
- Ray 0.8.5
- TensorFlow 2.2.0
- Python 3.8.3
- OS: Ubuntu 20.04 on WSL (Win 10)
I finally got a workflow that does everything I want: train with a configurable log dir, return the path of the saved checkpoint, load the trained agent, and use it for testing.
Here’s the basic code (within a custom class):
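A minimal sketch of what such a class can look like (assuming the Ray 0.8.x APIs; the class name SimpleAgentWorkflow, the save_dir and stop_criteria parameters, and the default stop condition are illustrative choices, and the checkpoint path is recovered from the ExperimentAnalysis via get_best_trial / get_trial_checkpoints_paths):

```python
import gym
from ray import tune
from ray.rllib.agents import ppo


class SimpleAgentWorkflow:
    """Train with tune.run (configurable log dir), then load and test the agent."""

    def __init__(self, env_name="CartPole-v0", save_dir="./results"):
        self.env_name = env_name
        self.save_dir = save_dir
        self.config = {"env": env_name, "train_batch_size": 4000}
        self.agent = None

    def train(self, stop_criteria=None):
        """Run training via tune and return the path of the last checkpoint."""
        stop_criteria = stop_criteria or {"training_iteration": 10}
        analysis = tune.run(
            ppo.PPOTrainer,
            config=self.config,
            stop=stop_criteria,
            local_dir=self.save_dir,      # custom log dir instead of ~/ray_results
            checkpoint_at_end=True,
        )
        # Recover the checkpoint path from the analysis object instead of
        # copy & pasting it by hand.
        trial = analysis.get_best_trial(metric="episode_reward_mean")
        checkpoints = analysis.get_trial_checkpoints_paths(
            trial, metric="episode_reward_mean"
        )
        checkpoint_path = checkpoints[0][0]  # list of (path, metric) tuples
        return checkpoint_path, analysis

    def load(self, checkpoint_path):
        """Re-create the trainer with the same config and restore the checkpoint."""
        self.agent = ppo.PPOTrainer(config=self.config, env=self.env_name)
        self.agent.restore(checkpoint_path)

    def test(self):
        """Run one episode with the loaded agent and return the episode reward."""
        env = gym.make(self.env_name)
        obs = env.reset()
        done, episode_reward = False, 0.0
        while not done:
            action = self.agent.compute_action(obs)
            obs, reward, done, _ = env.step(action)
            episode_reward += reward
        return episode_reward
```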
With that you can just call train, load, and test, and it should work. I hope this helps. Not sure if there's any other/better way to do it, but it solves my issue.
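Usage, with the illustrative class sketched above:

```python
workflow = SimpleAgentWorkflow(env_name="CartPole-v0", save_dir="./results")
checkpoint_path, analysis = workflow.train(stop_criteria={"training_iteration": 10})
workflow.load(checkpoint_path)
print("episode reward:", workflow.test())
```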
I know you closed this issue, but having this simple workflow in the official documentation would be a huge boon.