torchaudio.load does not load WebM from BytesIO or _io.BufferedReader, but loads the file from the hard drive
🐛 Describe the bug
I have a websocket that receives chunks of data in a byte format. The browser encodes the data in audio/webm format. The code is like the following:
@app.websocket("/listen")
async def websocket_endpoint(websocket: WebSocket):
    await websocket.accept()
    try:
        while True:
            data = await websocket.receive_bytes()
            with open('audio.wav', mode="wb") as f:
                f.write(data)
    except Exception as e:
        raise Exception(f'Could not process audio: {e}')
    finally:
        await websocket.close()
Manually writing the data to audio.wav and then reading the file using the following code works fine with no errors:
array, sr = torchaudio.load("audio.wav")
However, reading the file as a file object does not work:
with open("audio.wav", mode="rb") as f:
    torchaudio.load(f)
It raises the following error:
Exception: Could not process audio: Failed to open the input "<_io.BufferedReader name='audio.wav'>" (Invalid data found when processing input).
INFO: connection closed
PS: Creating a BytesIO from the data and passing it to torchaudio.load results in the same error as above.
Versions
Versions of relevant libraries:
[pip3] numpy==1.23.4
[pip3] torch==1.12.1
[pip3] torchaudio==0.12.1
[conda] numpy 1.23.4 pypi_0 pypi
OS
Ubuntu: 22.04
torchaudio.backend: "sox_io"
PS
I tested the same process on a webm file which was converted from a wav file, and the result was the same:
- torchaudio.load can read the file from the hard drive.
- torchaudio.load cannot read BytesIO or _io.BufferedReader.
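Because the browser encodes the data as audio/webm, the bytes written to audio.wav are not necessarily RIFF/WAVE data, whatever the file name says. A minimal sketch for checking the leading bytes of a file or buffer is shown here. The helper name sniff_container is hypothetical and not part of torchaudio, and it only recognises the EBML magic number (used by Matroska and WebM) and the RIFF/WAVE header.

```python
def sniff_container(header: bytes) -> str:
    """Guess the container format from the first 12 bytes of a file or buffer.

    Hypothetical helper for illustration; it knows only two signatures.
    """
    if header[:4] == b"\x1a\x45\xdf\xa3":
        # EBML magic number, shared by Matroska and WebM containers.
        return "webm/matroska"
    if header[:4] == b"RIFF" and header[8:12] == b"WAVE":
        # RIFF chunk whose form type is WAVE, i.e. a regular .wav file.
        return "wav"
    return "unknown"
```

For example, calling sniff_container(f.read(12)) on a file opened with mode="rb" shows whether the content matches the extension before it is handed to a decoder.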
Issue Analytics
- State:
- Created a year ago
- Comments:7 (5 by maintainers)
I think you could do chunk-by-chunk decoding, which is more efficient, but not sure if this is what you want, as I do not know what application you are building.
To do chunk-by-chunk decoding, you can wrap the socket object into a synchronous file-like object. Then pass it to StreamReader and let StreamReader pull the data.
Hi @pooya-mohammadi
The audio you shared has a wav extension but is, in fact, in WebM format.
prints the following
and
ffprobe audio.wav reports;
torchaudio.load first attempts to read it with libsox, but that fails because WebM is not supported, and it retries with FFmpeg only when the source is a file path. It cannot retry when the input is a file-like object, as a seek method is not always available.
To handle WebM, you can use torchaudio.io.StreamReader, which works with both file input and file-like object input, and it can do iterative reading as well.
For the detailed usage, please check out tutorials like
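The two suggestions above (wrapping the incoming data in a synchronous file-like object, then letting StreamReader pull from it) could be sketched roughly as follows. This is an illustration under assumptions, not the maintainers' code: ChunkReader and decode_chunks are hypothetical names, the chunks are assumed to arrive through a plain synchronous iterable, and the installed torchaudio is assumed to ship torchaudio.io.StreamReader with FFmpeg support.

```python
from typing import Iterable, Iterator


class ChunkReader:
    """Synchronous file-like view over an iterable of byte chunks.

    Hypothetical helper: it exposes only read(size), which is the method
    a file-like source is expected to provide.
    """

    def __init__(self, chunks: Iterable[bytes]) -> None:
        self._chunks: Iterator[bytes] = iter(chunks)
        self._buffer = bytearray()

    def read(self, size: int = -1) -> bytes:
        # Pull chunks until enough bytes are buffered or the source ends.
        while size < 0 or len(self._buffer) < size:
            try:
                self._buffer.extend(next(self._chunks))
            except StopIteration:
                break
        if size < 0:
            size = len(self._buffer)
        out = bytes(self._buffer[:size])
        del self._buffer[:size]
        return out


def decode_chunks(src, frames_per_chunk: int = 4096):
    """Yield decoded audio tensors from a file path or file-like object."""
    # Imported lazily so ChunkReader stays usable without torchaudio.
    from torchaudio.io import StreamReader

    reader = StreamReader(src)
    reader.add_basic_audio_stream(frames_per_chunk=frames_per_chunk)
    for (chunk,) in reader.stream():
        if chunk is not None:
            yield chunk
```

With the whole payload already in memory, decode_chunks(io.BytesIO(data)) would be the simpler call. ChunkReader is only useful when the chunks arrive incrementally, and in an async websocket handler the blocking read would need to run outside the event loop, for example in a worker thread.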