This article is reproduced from stackoverflow.com.
Tags: download, python, reddit, moviepy

Is there a fast way to combine an mp3 file with an mp4 file in python?

Posted on 2020-05-29 16:48:41

I am trying to write a program that can download videos from Reddit posts. I believe that Reddit stores the audio and video for each post separately, so I am currently downloading the mp3 and the mp4 and then combining them to make a final video file. I am not very familiar with audio or video files or how they are stored, but I thought that combining the two would be quick to compute.

However, the combining step is very slow. Is there a faster way to combine a soundless video clip with an audio file and write the result to my drive?

I am currently using the moviepy library for the combining.

import os
import urllib.request

import moviepy.editor as mpe

# get_valid_filename is not shown in the question; it is assumed to be defined elsewhere.

def download_video(data_url, current_post, subreddit):
    # Get the audio URL of the Reddit video
    audio_url = data_url + "/audio"
    # Get the soundless video URL of the Reddit video
    video_url = str(current_post).split("'fallback_url': '")[1].split("'")[0]

    # Download the two files as mp4 and mp3
    video_path = os.path.join(subreddit, "video_name.mp4")
    audio_path = os.path.join(subreddit, "audio.mp3")
    urllib.request.urlretrieve(video_url, video_path)
    urllib.request.urlretrieve(audio_url, audio_path)

    # Combine the mp3 and mp4; write_videofile re-encodes the video, which is the slow step
    output_path = os.path.join(subreddit, get_valid_filename(current_post["title"]) + ".mp4")
    video = mpe.VideoFileClip(video_path)
    video.write_videofile(output_path, audio=audio_path)
    # Close the clip to release the file handle, then remove the soundless video
    video.close()
    os.remove(video_path)

Questioner: Peter

Oliver.R 2020-03-16 20:36

You could try one of the existing open-source tools that already do this, such as youtube-dl (which downloads from far more sites than its name suggests). A previous SO thread has already covered how to call it from within Python, and I have just tested it on both thread links and v.redd.it links; it worked with no issues on either.

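import youtube_dl

# extract_info downloads the post's video by default.
# youtube-dl relies on ffmpeg (or avconv) to merge separately downloaded audio and video.
with youtube_dl.YoutubeDL() as ydl:
    ydl.extract_info("https://www.reddit.com/r/bouldering/comments/fjgmo7/one_of_my_favorite_boulders_from_my_gym_back_home/")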
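
If the file should land in the subreddit folder, as in the question, youtube-dl accepts an options dictionary. The following is a minimal sketch, assuming the `outtmpl`, `format`, and `merge_output_format` options behave as described in the youtube-dl documentation. The helper names `build_ydl_options` and `download_post` are hypothetical and not part of the original answer.

```python
import os


def build_ydl_options(subreddit):
    """Options for youtube_dl.YoutubeDL; `subreddit` is used as the output folder."""
    return {
        # Output file name template: <subreddit>/<post title>.<extension>
        "outtmpl": os.path.join(subreddit, "%(title)s.%(ext)s"),
        # Prefer separate best video and audio streams, fall back to a single file
        "format": "bestvideo+bestaudio/best",
        # Container to use when separate streams are merged (merging needs ffmpeg)
        "merge_output_format": "mp4",
    }


def download_post(post_url, subreddit):
    """Download one post with the options above (hypothetical helper)."""
    import youtube_dl  # imported here so the options helper works without the package

    with youtube_dl.YoutubeDL(build_ydl_options(subreddit)) as ydl:
        ydl.download([post_url])
```

Calling `download_post(url, "bouldering")` with the thread link used above would then write the merged file into the `bouldering` folder, provided youtube-dl and ffmpeg are installed.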

If this improves performance but you would prefer not to depend on the library, you could check its source to see how it combines the video and audio.
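
One way to avoid the slow step, sketched under assumptions: call ffmpeg directly and copy the existing streams into a new container instead of re-encoding them. As far as I know this is roughly what youtube-dl's merge step does, but check its source to confirm. The helper names `build_mux_command` and `mux_audio_video` are hypothetical, `ffmpeg` is assumed to be installed and on the PATH, and stream copy only works if the output container accepts the codec of the downloaded audio.

```python
import subprocess


def build_mux_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that copies both streams without re-encoding."""
    return [
        "ffmpeg",
        "-y",               # overwrite the output file if it already exists
        "-i", video_path,   # input 0: soundless video
        "-i", audio_path,   # input 1: audio track
        "-map", "0:v:0",    # take the first video stream from input 0
        "-map", "1:a:0",    # take the first audio stream from input 1
        "-c", "copy",       # copy streams as-is instead of re-encoding
        output_path,
    ]


def mux_audio_video(video_path, audio_path, output_path):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_mux_command(video_path, audio_path, output_path), check=True)
```

In the question's function, the moviepy calls could be swapped for a single `mux_audio_video(...)` call that takes the paths of the downloaded video, the downloaded audio, and the final file. Because no frames are decoded or encoded, the run time should be dominated by disk I/O rather than by video encoding.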