torchaudio.load does not load WebM from BytesIO or _io.BufferedReader, but loads the same file from the hard drive

🐛 Describe the bug

I have a websocket that receives chunks of data as bytes. The browser encodes the data in audio/webm format. The code looks like this:

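# Assuming FastAPI, which the decorator and the WebSocket type suggest
from fastapi import FastAPI, WebSocket

app = FastAPI()


@app.websocket("/listen")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    try:
        while True:
            data = await websocket.receive_bytes()
            # mode="wb" overwrites audio.wav with each received chunk
            with open('audio.wav', mode="wb") as f:
                f.write(data)
    except Exception as e:
        raise Exception(f'Could not process audio: {e}')
    finally:
        await websocket.close()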

Manually writing the data to audio.wav and then reading the file using the following code works fine with no errors:

array, sr = torchaudio.load("audio.wav")

However, reading the file as a file object does not work:

with open("audio.wav", mode="rb") as f:
    torchaudio.load(f)

It raises the following error:

Exception: Could not process audio: Failed to open the input "<_io.BufferedReader name='audio.wav'>" (Invalid data found when processing input).
INFO:     connection closed

PS: Creating a BytesIO from the data and passing it to torchaudio.load raises the same error as above.

Versions

Versions of relevant libraries:
[pip3] numpy==1.23.4
[pip3] torch==1.12.1
[pip3] torchaudio==0.12.1
[conda] numpy 1.23.4 pypi_0 pypi

OS

Ubuntu: 22.04
torchaudio.backend: "sox_io"

PS

I tested the same process on a webm file which was converted from a wav file, and the result was the same:

  1. torchaudio.load can read the file from hard drive.
  2. torchaudio.load cannot read BytesIO or _io.BufferedReader.

Issue Analytics

  • State: closed
  • Created a year ago
  • Comments: 7 (5 by maintainers)

Top GitHub Comments

3 reactions
mthrok commented, Oct 25, 2022

I think you could do chunk-by-chunk decoding, which is more efficient, but I am not sure if this is what you want, as I do not know what application you are building.

To do chunk-by-chunk decoding, you can wrap the socket object into a synchronous file-like object.

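import asyncio


class Wrapper:
    # Blocking file-like view of an async websocket. Construct it inside the
    # endpoint coroutine; call read() only from a worker thread.
    def __init__(self, socket):
        self.socket = socket
        self.loop = asyncio.get_running_loop()  # loop that owns the socket
        self.buffer = b''

    def read(self, n):
        while len(self.buffer) < n:
            # `await` is not allowed in a plain method, so schedule the
            # coroutine on the loop and block this thread until it finishes.
            future = asyncio.run_coroutine_threadsafe(
                self.socket.receive_bytes(), self.loop)
            new_data = future.result()
            if not new_data:
                break
            self.buffer += new_data
        data, self.buffer = self.buffer[:n], self.buffer[n:]
        return data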

Then pass it to StreamReader and let StreamReader pull the data.

try:
    wrapper = Wrapper(websocket)
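    # StreamReader calls wrapper.read() synchronously, so run this part in a
    # worker thread rather than on the event loop thread.
    s = torchaudio.io.StreamReader(wrapper)
    s.add_basic_audio_stream(frames_per_chunk=4096)  # chunk size is illustrative
    for (chunk,) in s.stream():
        print(chunk.shape)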
except ...
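
The wrapper idea above depends on bridging an async socket to a blocking read(n) call. The following is a minimal, standard-library-only sketch of that bridge, not code from torchaudio or from the issue. FakeSocket, SyncReader and consume are hypothetical names: FakeSocket stands in for the websocket and consume stands in for a decoder that pulls bytes. It assumes the socket signals end of stream by returning empty bytes.

```python
import asyncio


class FakeSocket:
    """Hypothetical stand-in for a websocket: yields queued chunks, then b''."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    async def receive_bytes(self):
        await asyncio.sleep(0)  # hand control back to the loop, as real I/O would
        return self._chunks.pop(0) if self._chunks else b""


class SyncReader:
    """Blocking read(n) on top of an async socket owned by `loop`."""

    def __init__(self, socket, loop):
        self.socket = socket
        self.loop = loop
        self.buffer = b""

    def read(self, n):
        while len(self.buffer) < n:
            # Schedule the coroutine on the event loop from this worker thread
            future = asyncio.run_coroutine_threadsafe(
                self.socket.receive_bytes(), self.loop
            )
            new_data = future.result()  # blocks only the worker thread
            if not new_data:
                break
            self.buffer += new_data
        data, self.buffer = self.buffer[:n], self.buffer[n:]
        return data


def consume(reader, n=4):
    """Synchronous consumer, standing in for a decoder that pulls bytes."""
    pieces = []
    while True:
        piece = reader.read(n)
        if not piece:
            break
        pieces.append(piece)
    return pieces


async def main():
    loop = asyncio.get_running_loop()
    reader = SyncReader(FakeSocket([b"abc", b"defgh", b"i"]), loop)
    # Run the blocking consumer in a worker thread so the loop stays free
    return await loop.run_in_executor(None, consume, reader)


pieces = asyncio.run(main())
print(pieces)
```

The key design point is that read() must never run on the event loop thread: it would block the loop while waiting for a coroutine that only the loop can run.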
2 reactions
mthrok commented, Oct 25, 2022

Hi @pooya-mohammadi

The audio you shared has a .wav extension but is, in fact, in WebM format.

with open("audio.wav", "rb") as f:
    print(f.read(50)[30:])

prints the following

b'\x84webmB\x87\x81\x02B\x85\x81\x02\x18S\x80g\x01\xff\xff'

and ffprobe audio.wav reports:

Input #0, matroska,webm, from 'audio.wav':
  Metadata:
    encoder         : QTmuxingAppLibWebM-0.0.1
  Duration: N/A, start: -0.001000, bitrate: N/A
  Stream #0:0(eng): Audio: opus, 48000 Hz, mono, fltp (default)
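
The same check can be done in code before choosing a decoder. The sketch below is an illustration only: sniff_container is a hypothetical helper, not a torchaudio function. It relies on two well-known signatures: Matroska/WebM files begin with the EBML magic bytes 1A 45 DF A3, and WAV files begin with "RIFF" followed by "WAVE" at offset 8.

```python
def sniff_container(header: bytes) -> str:
    """Guess the container format from the first bytes of a file."""
    if header[:4] == b"\x1a\x45\xdf\xa3":  # EBML magic (Matroska / WebM)
        return "matroska/webm"
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        return "wav"
    return "unknown"


# Synthetic headers for illustration; real use would pass f.read(12)
print(sniff_container(b"\x1a\x45\xdf\xa3\x9fB\x86\x81\x01"))
print(sniff_container(b"RIFF\x24\x00\x00\x00WAVEfmt "))
```

A check like this only looks at the container signature, so it does not tell you which codec is inside or whether the rest of the file is valid.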

torchaudio.load first attempts to read the input with libsox, which fails because libsox does not support WebM. It retries with FFmpeg only when the source is a file path. It cannot retry when the input is a file-like object, because the first attempt has already read from the object and a seek method is not always available to rewind it.
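
Because loading from a path does work, one possible workaround for in-memory bytes is to spill them to a temporary file and load that path. The sketch below assumes this is acceptable for your application; load_via_tempfile and its loader parameter are hypothetical names, and the ".webm" suffix is an assumption, not something the issue requires. The loader is passed in so the helper does not depend on torchaudio itself.

```python
import os
import tempfile


def load_via_tempfile(data: bytes, loader, suffix: str = ".webm"):
    """Write `data` to a temporary file, call `loader(path)`, then clean up."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # The file is closed here, so the loader can reopen it by path
        return loader(path)
    finally:
        os.remove(path)


# Illustration with a trivial loader; with torchaudio installed one would
# pass torchaudio.load instead, e.g. load_via_tempfile(data, torchaudio.load)
size = load_via_tempfile(b"fake-bytes", os.path.getsize)
print(size)
```

This trades extra disk I/O for simplicity, so it suits whole-file loading rather than low-latency streaming.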

To handle WebM, you can use torchaudio.io.StreamReader. It works with both file paths and file-like objects, and it can also read iteratively.

import torchaudio

# path, f and chunk_size are placeholders supplied by the caller

# Load from a path and read the entire audio in one go
s = torchaudio.io.StreamReader(path)
s.add_basic_audio_stream(-1)
s.process_all_packets()
waveform, = s.pop_chunks()

# Load from a file-like object and read the audio chunk by chunk
s = torchaudio.io.StreamReader(f)
s.add_basic_audio_stream(chunk_size)
for chunk, in s.stream():
    ...  # process the chunk

For detailed usage, please check out tutorials like
