VRAM use grows when using SentenceTransformer.encode + potential fix.

See original GitHub issue

Hey,

I have been using this repository to obtain sentence embeddings for a data set I am currently working on. When using SentenceTransformer.encode, I noticed that my VRAM usage grows with time until a CUDA out of memory error is raised. Through my own experiments I have found the following:

  1. Detaching the embeddings and moving them to the CPU before they are extended to all_embeddings, using embeddings = embeddings.to("cpu"), greatly reduces this growth.
  2. Even with the above line added, VRAM use still grew, albeit slowly. Adding torch.cuda.empty_cache() after the line above appears to stop the growth over time. The first point makes sense as a fix, but I am unsure why this second line is necessary?
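To make the two points above concrete, here is a minimal sketch of an encode loop with both fixes applied. This is not the library's actual code: encode_in_batches and the callable model are hypothetical stand-ins for SentenceTransformer.encode and the encoder forward pass.

```python
import torch

def encode_in_batches(model, sentences, batch_size=32, device="cpu"):
    """Sketch: collect per-batch embeddings without accumulating GPU tensors.
    `model` is any callable mapping a list of sentences to a 2D tensor."""
    all_embeddings = []
    with torch.no_grad():  # avoids retaining the autograd graph across batches
        for start in range(0, len(sentences), batch_size):
            batch = sentences[start:start + batch_size]
            embeddings = model(batch)
            # Point 1: detach and copy to CPU before collecting, so the GPU
            # copy becomes unreferenced (and freeable) once the batch is done.
            embeddings = embeddings.detach().to("cpu")
            all_embeddings.extend(embeddings)
            # Point 2: hand cached, unoccupied blocks back to the driver.
            if device == "cuda":
                torch.cuda.empty_cache()
    return all_embeddings
```

Note that empty_cache() does not free tensors that are still referenced; it only releases PyTorch's cached-but-unused blocks, which is why point 1 is the essential fix and point 2 mainly changes what nvidia-smi reports.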

I am using: pytorch 1.6.0, transformers 3.3.1, sentence_transformers 0.3.7.

Have I missed something in the docs, or am I doing something daft? I am happy to submit a pull request if need be.

Thanks,

Martin

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 21 (9 by maintainers)

Top GitHub Comments

2 reactions
nreimers commented, Oct 9, 2020

Agree. When convert_to_numpy=True, I will change the code so that detach() and cpu() happen in the loop, not afterwards.
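The change described here could look roughly like the following sketch. It is an illustration, not the actual patch: encode_to_numpy and model are hypothetical names standing in for the real encode method and forward pass.

```python
import torch
import numpy as np

def encode_to_numpy(model, sentences, batch_size=32):
    """Sketch: when NumPy output is requested, detach() and cpu() per batch
    instead of once at the end, so no GPU tensor outlives its batch.
    `model` is any callable mapping a list of sentences to a 2D tensor."""
    all_embeddings = []
    with torch.no_grad():
        for start in range(0, len(sentences), batch_size):
            batch = sentences[start:start + batch_size]
            embeddings = model(batch)
            # Per-batch transfer: the GPU copy is released back to
            # PyTorch's cache as soon as this iteration ends.
            all_embeddings.append(embeddings.detach().cpu().numpy())
    return np.concatenate(all_embeddings, axis=0)
```

Doing the conversion once after the loop would instead keep every batch's GPU tensor alive in the list until the very end, which is the growth pattern reported in the issue.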

1 reaction
liaocs2008 commented, Jun 9, 2021

I made the same observation: the current code can still crash with an OOM error. I managed to fix it by adding the following lines to each iteration at https://github.com/UKPLab/sentence-transformers/blob/master/sentence_transformers/SentenceTransformer.py#L188:

```python
del embeddings
torch.cuda.empty_cache()
```
