Note: This article is reproduced from serverfault.com.

How can I detect a corrupt/incomplete MP3 file from a node.js app?

Posted on 2020-12-01 17:52:48

An MP3 file most commonly ends up corrupt when it has only been partially uploaded to the server. In that case, the duration reported for the audio doesn't match what the file actually contains: the beginning plays, but at some point playback stops and the duration shown by the audio player is wrong.

I tried libraries like node-ffprobe, but they seem to read only the metadata, without comparing it against the actual audio data in the file. Is there an efficient way to detect a corrupted or incomplete MP3 file from node.js?

Note: the client uploading the MP3 files is a hardware device (an audio recorder) that uploads to an FTP server, not a browser, so I can't have the client send additional, potentially more useful data.

Asked by Etienne
Answered by Brad, 2020-12-02 02:25:30

MP3 files don't normally store a duration. They're just a series of MPEG frames. Sometimes there is an ID3 tag indicating the duration, but not always.

Players can determine duration by choosing one of a few methods:

  • Decode the entire audio file.
    This is the slowest method, but if you're going to decode the file anyway, you might as well go this route as it gives you an exact duration.
  • Read the whole file, skimming through frame headers.
    You'll have to read the whole file from disk, but you won't have to decode it. Can be slow if I/O is slow, but gives you an exact duration.
  • Read the first frame's bitrate and estimate duration by file size.
    Definitely the fastest method, and the one most commonly used by players. Duration is an estimate only, and is reasonably accurate for CBR, but can be wildly inaccurate for VBR.
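To make the second and third methods concrete, here is a minimal Node.js sketch that skims frame headers for an exact duration and, separately, estimates duration from the first frame's bitrate and the file size. It is an illustration under stated assumptions, not a full parser: it handles Layer III frames only, skips a leading ID3v2 tag, does not resynchronize after unrecognized data, and counts a Xing/Info header frame (if present) as ordinary audio. All function names are hypothetical.

```javascript
// Lookup tables for MPEG Layer III only (i.e. "MP3"); Layers I and II are not handled.
const BITRATES_V1 = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320];
const BITRATES_V2 = [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160];
// Keyed by the 2-bit version field: 3 = MPEG-1, 2 = MPEG-2, 0 = MPEG-2.5.
const SAMPLE_RATES = { 3: [44100, 48000, 32000], 2: [22050, 24000, 16000], 0: [11025, 12000, 8000] };

// Parse the 4-byte frame header at `pos`. Returns null if it is not a valid Layer III header.
function parseFrameHeader(buf, pos) {
  if (pos + 4 > buf.length) return null;
  if (buf[pos] !== 0xff || (buf[pos + 1] & 0xe0) !== 0xe0) return null; // 11 sync bits
  const version = (buf[pos + 1] >> 3) & 0x03;
  const layer = (buf[pos + 1] >> 1) & 0x03; // 1 means Layer III
  const bitrateIndex = buf[pos + 2] >> 4;
  const rateIndex = (buf[pos + 2] >> 2) & 0x03;
  const padding = (buf[pos + 2] >> 1) & 0x01;
  // Reject reserved values and "free format" (bitrate index 0), which this sketch does not support.
  if (version === 1 || layer !== 1 || bitrateIndex === 0 || bitrateIndex === 15 || rateIndex === 3) {
    return null;
  }
  const bitrate = (version === 3 ? BITRATES_V1 : BITRATES_V2)[bitrateIndex] * 1000;
  const sampleRate = SAMPLE_RATES[version][rateIndex];
  const samples = version === 3 ? 1152 : 576; // samples per Layer III frame
  const length = Math.floor(((samples / 8) * bitrate) / sampleRate) + padding; // bytes
  return { bitrate, sampleRate, samples, length };
}

// Return the byte offset just past a leading ID3v2 tag, or 0 if there is none.
function skipId3v2(buf) {
  if (buf.length < 10 || buf.toString('latin1', 0, 3) !== 'ID3') return 0;
  // The tag size is a 28-bit "synchsafe" integer: 7 useful bits per byte.
  const size = ((buf[6] & 0x7f) << 21) | ((buf[7] & 0x7f) << 14) | ((buf[8] & 0x7f) << 7) | (buf[9] & 0x7f);
  const footer = (buf[5] & 0x10) ? 10 : 0;
  return 10 + size + footer;
}

// Method 2: walk every frame header without decoding the audio.
function exactDurationSeconds(buf) {
  let pos = skipId3v2(buf);
  let seconds = 0;
  for (;;) {
    const header = parseFrameHeader(buf, pos);
    // Stop at the first thing that is not a complete frame (trailing tag, garbage, or cut-off frame).
    if (!header || pos + header.length > buf.length) break;
    seconds += header.samples / header.sampleRate;
    pos += header.length;
  }
  return seconds;
}

// Method 3: assume the first frame's bitrate holds for the whole file (only sound for CBR).
function estimatedDurationSeconds(buf) {
  const start = skipId3v2(buf);
  const header = parseFrameHeader(buf, start);
  return header ? ((buf.length - start) * 8) / header.bitrate : null;
}

// Usage (the file name is hypothetical):
// const buf = require('fs').readFileSync('recording.mp3');
// console.log(exactDurationSeconds(buf), estimatedDurationSeconds(buf));
```

For a CBR file the two numbers should be close; a large gap between them suggests a VBR file rather than a broken one.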

What I'm getting at is that these files might not actually be broken. They might just be VBR files that your player doesn't know the duration of.

If you're convinced they are broken (such as stopping in the middle of content), then you'll have to figure out how you want to handle it. There are probably only a couple of ways to determine this:

  • Ideally, there's an ID3 tag indicating duration, and you can decode the whole file and determine its real duration to compare.
  • Usually, that ID3 tag won't exist, so you'll have to check to see if the last frame is complete or not.
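The last-frame check can be sketched in Node.js roughly as follows. This is a hedged illustration, not a complete validator: it assumes Layer III frames only, skips a leading ID3v2 tag and a trailing 128-byte ID3v1 tag, and walks the file frame by frame until a frame's declared length runs past the end of the data. The function names and the returned status strings are made up for this example.

```javascript
// Layer III bitrate tables (kbps) and sample rates, indexed by header fields.
const V1_KBPS = [0, 32, 40, 48, 56, 64, 80, 96, 112, 128, 160, 192, 224, 256, 320];
const V2_KBPS = [0, 8, 16, 24, 32, 40, 48, 56, 64, 80, 96, 112, 128, 144, 160];
const RATES = { 3: [44100, 48000, 32000], 2: [22050, 24000, 16000], 0: [11025, 12000, 8000] };

// Declared length in bytes of the frame whose header starts at `pos`, or 0 if no valid header.
// The caller guarantees that at least 4 bytes are available at `pos`.
function frameLength(buf, pos) {
  const b1 = buf[pos + 1];
  const b2 = buf[pos + 2];
  if (buf[pos] !== 0xff || (b1 & 0xe0) !== 0xe0) return 0; // sync bits missing
  const version = (b1 >> 3) & 3; // 3 = MPEG-1, 2 = MPEG-2, 0 = MPEG-2.5, 1 = reserved
  const layer = (b1 >> 1) & 3;   // 1 = Layer III
  const bitrateIdx = b2 >> 4;
  const rateIdx = (b2 >> 2) & 3;
  const padding = (b2 >> 1) & 1;
  if (version === 1 || layer !== 1 || bitrateIdx === 0 || bitrateIdx === 15 || rateIdx === 3) return 0;
  const bitrate = (version === 3 ? V1_KBPS : V2_KBPS)[bitrateIdx] * 1000;
  const coefficient = version === 3 ? 144 : 72; // samples per frame divided by 8
  return Math.floor((coefficient * bitrate) / RATES[version][rateIdx]) + padding;
}

// Walk the frames and report whether the data ends exactly on a frame boundary.
function checkMp3Tail(buf) {
  let end = buf.length;
  // Ignore a trailing ID3v1 tag (128 bytes starting with "TAG"), if present.
  if (end >= 128 && buf.toString('latin1', end - 128, end - 125) === 'TAG') end -= 128;

  let pos = 0;
  // Skip a leading ID3v2 tag; its size field is a 28-bit synchsafe integer.
  if (end >= 10 && buf.toString('latin1', 0, 3) === 'ID3') {
    const size = ((buf[6] & 0x7f) << 21) | ((buf[7] & 0x7f) << 14) | ((buf[8] & 0x7f) << 7) | (buf[9] & 0x7f);
    pos = 10 + size + ((buf[5] & 0x10) ? 10 : 0);
  }

  let frames = 0;
  while (pos + 4 <= end) {
    const length = frameLength(buf, pos);
    if (length === 0) return { status: 'unrecognized-data', frames, offset: pos };
    if (pos + length > end) return { status: 'truncated', frames, missingBytes: pos + length - end };
    pos += length;
    frames += 1;
  }
  if (pos !== end) return { status: 'truncated', frames }; // fewer than 4 stray bytes remain
  return { status: frames > 0 ? 'complete' : 'no-frames', frames };
}

// Usage (the file name is hypothetical):
// const result = checkMp3Tail(require('fs').readFileSync('upload.mp3'));
// if (result.status !== 'complete') console.warn('suspicious MP3 file:', result);
```

An upload that happens to stop exactly on a frame boundary still reports `complete` here, so this catches most, but not all, partial uploads.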

Beyond that, you don't really have a good way of knowing if the stream is incomplete, since there is no outer container that actually specifies the number of frames to expect.