VRAM use grows when using SentenceTransformer.encode + potential fix.

See original GitHub issue

Hey,

I have been using this repository to obtain sentence embeddings for a data set I am currently working on. When using SentenceTransformer.encode, I noticed that my VRAM usage grows with time until a CUDA out of memory error is raised. Through my own experiments I have found the following:

  1. Detaching the embeddings and moving them to the CPU before they are extended to all_embeddings, using embeddings = embeddings.to("cpu"), greatly reduces this growth.
  2. Even with the above line added, VRAM use still grew, albeit slowly. Adding torch.cuda.empty_cache() after the line above appears to stop the growth over time. The first point makes sense as a fix, but I am unsure why this second line is necessary?
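To make the two points above concrete, here is a minimal sketch of an encode loop with both fixes applied. This is not the library's actual code: encode_in_batches and the callable model are hypothetical stand-ins for SentenceTransformer.encode and the encoder forward pass.

```python
import torch

def encode_in_batches(model, sentences, batch_size=32, device="cpu"):
    """Sketch: collect per-batch embeddings without accumulating GPU tensors.
    `model` is any callable mapping a list of sentences to a 2D tensor."""
    all_embeddings = []
    with torch.no_grad():  # avoids retaining the autograd graph across batches
        for start in range(0, len(sentences), batch_size):
            batch = sentences[start:start + batch_size]
            embeddings = model(batch)
            # Point 1: detach and copy to CPU before collecting, so the GPU
            # copy becomes unreferenced (and freeable) once the batch is done.
            embeddings = embeddings.detach().to("cpu")
            all_embeddings.extend(embeddings)
            # Point 2: hand cached, unoccupied blocks back to the driver.
            if device == "cuda":
                torch.cuda.empty_cache()
    return all_embeddings
```

Note that empty_cache() does not free tensors that are still referenced; it only releases PyTorch's cached-but-unused blocks, which is why point 1 is the essential fix and point 2 mainly changes what nvidia-smi reports.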

I am using: pytorch 1.6.0, transformers 3.3.1, sentence_transformers 0.3.7.

Have I missed something in the docs, or am I doing something daft? I am happy to submit a pull request if need be.

Thanks,

Martin

Issue Analytics

  • State: open
  • Created: 3 years ago
  • Reactions: 2
  • Comments: 21 (9 by maintainers)

Top GitHub Comments

2 reactions
nreimers commented, Oct 9, 2020

Agree. When convert_to_numpy=True, I will change the code so that detach() and cpu() happen in the loop, not afterwards.
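The change described here could look roughly like the following sketch. It is an illustration, not the actual patch: encode_to_numpy and model are hypothetical names standing in for the real encode method and forward pass.

```python
import torch
import numpy as np

def encode_to_numpy(model, sentences, batch_size=32):
    """Sketch: when NumPy output is requested, detach() and cpu() per batch
    instead of once at the end, so no GPU tensor outlives its batch.
    `model` is any callable mapping a list of sentences to a 2D tensor."""
    all_embeddings = []
    with torch.no_grad():
        for start in range(0, len(sentences), batch_size):
            batch = sentences[start:start + batch_size]
            embeddings = model(batch)
            # Per-batch transfer: the GPU copy is released back to
            # PyTorch's cache as soon as this iteration ends.
            all_embeddings.append(embeddings.detach().cpu().numpy())
    return np.concatenate(all_embeddings, axis=0)
```

Doing the conversion once after the loop would instead keep every batch's GPU tensor alive in the list until the very end, which is the growth pattern reported in the issue.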

1 reaction
liaocs2008 commented, Jun 9, 2021

I made the same observation: the current code can still crash with an OOM error. I managed to fix it by adding the following lines to each iteration at https://github.com/UKPLab/sentence-transformers/blob/master/sentence_transformers/SentenceTransformer.py#L188:

```python
del embeddings
torch.cuda.empty_cache()
```
