
Issue in checkpointing #11504

@yes1234man

Description


Environment info

  • transformers version: 4.6.0
  • Platform: -
  • Python version: 3.8
  • PyTorch version (GPU?): 3.7
  • Tensorflow version (GPU?): -
  • Using GPU in script?: -
  • Using distributed or parallel set-up in script?: -

Who can help

@sgugger

Information

Hi,
I am observing that reloading from a checkpoint does not reproduce the same results. I searched, and as mentioned in #11323 (comment), the Trainer currently does not save the random states, so it cannot restore them on resume, which is important for reproducibility. Could you add this information to self.state and also set the random states in the Trainer when resuming? That would be great.

thanks

Expected behavior

After resuming from a checkpoint, one should get exactly the same results as training the model without interruption.
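
For reference, a minimal sketch of what capturing and restoring the RNG states alongside a checkpoint could look like. The helper names below are hypothetical and not part of the Trainer API; only the standard random, numpy, and torch calls are real:

```python
import random

import numpy as np
import torch


def save_rng_states(path):
    # Capture the state of every RNG source a training run typically touches.
    states = {
        "python": random.getstate(),
        "numpy": np.random.get_state(),
        "torch_cpu": torch.get_rng_state(),
    }
    if torch.cuda.is_available():
        # One state per visible GPU device.
        states["torch_cuda"] = torch.cuda.get_rng_state_all()
    torch.save(states, path)


def load_rng_states(path):
    # Restore the RNG states captured at checkpoint time, so dataloader
    # shuffling, dropout, etc. continue exactly where they left off.
    states = torch.load(path)
    random.setstate(states["python"])
    np.random.set_state(states["numpy"])
    torch.set_rng_state(states["torch_cpu"])
    if torch.cuda.is_available() and "torch_cuda" in states:
        torch.cuda.set_rng_state_all(states["torch_cuda"])
```

save_rng_states would be called whenever a checkpoint is written, and load_rng_states when resuming, before the first post-resume batch is drawn.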
