Quite often, you might find yourself wanting to upload a song to YouTube that doesn’t have an accompanying music video. YouTube, being a video sharing site, doesn’t accept audio-only files and will reject them on upload. I was wondering if there was a simple way to create a video that shows the album cover for the whole length of the audio track.
Of course, the most obvious solution would be to use a video editing tool like Windows Movie Maker, Sony Vegas, or Adobe Premiere, but that seemed like overkill for such a supposedly simple task. Fortunately, the ffmpeg project can do it all automagically in one command, which was just the solution I was looking for.
But let’s have a look at what YouTube wants from us first so that we can upload our content in the highest possible quality. The advanced encoding specifications show that the highest audio bitrate is allowed in 720p video files and equals 384 kbps. The recommended audio codec is AAC-LC, so you can use the Nero AAC Encoder or FAAC in order to encode the audio file into AAC (hopefully, from a lossless source!).
If you’ve got your audio files ready, it’s time to start making them into YouTube-compatible 720p files so that the audio bitrate is preserved and the resulting upload has the same high quality audio track. That’s where the aforementioned ffmpeg comes in. Hold on to your seats, now.
ffmpeg -loop 1 -r 5 -i "$IMAGE_FILE" -i "$AUDIO_FILE" -c:v libx264 -preset slow -crf 18 -tune stillimage -c:a copy -filter:v "[in] scale=-1:720, pad=1280:720:640-iw/2 [out]" -shortest output.mp4
This command takes your $AUDIO_FILE, combines it with $IMAGE_FILE, and produces output.mp4: a 1280×720 video encoded with x264, in which $IMAGE_FILE is centered in every frame and $AUDIO_FILE is the audio track. The length of output.mp4 is equal to the length of $AUDIO_FILE.
Let us have a look at the parameters themselves, though:
- -loop 1 states that the input image should be looped indefinitely (the option applies to the input that follows it, which here is the image). This is so that the video track consists of repeated frames and not just one single frame, which is what ffmpeg would do by default. YouTube doesn’t like that, since it requires video and audio tracks to be of equal lengths.
- -r 5 specifies the frame rate for the video track. Since our video track consists of just a still image, it doesn’t make sense to have it run at a full 25 or 23.976 fps, which would just unnecessarily increase the size of the resulting file. Having -r 1 makes the video track much longer than the audio track, for some bizarre reason.
- -i $IMAGE_FILE and -i $AUDIO_FILE are both input file declarations. The format of the files and whether they’re video or audio files is detected by ffmpeg at runtime.
- -c:v libx264 chooses libx264 as the codec used to encode the video track. Thus, the resulting data is encoded in YouTube-compatible H.264 format.
- -preset slow -crf 18 -tune stillimage are options passed to the x264 encoder. If you want to get to the bottom of them, I suggest reading this page.
- -c:a copy specifies that no processing whatsoever should be done to the audio track and that the track in the resulting file should be an exact copy of the input audio track.
- -filter:v "[in] scale=-1:720, pad=1280:720:640-iw/2 [out]" is the filter specification for the output video track. Here, we specify that the input image should first be scaled to a height of 720 pixels while preserving the aspect ratio. Then, we pad the scaled image (assuming "normalized" square album cover dimensions, we’re now dealing with a 720×720 image) with black borders to the size of 1280×720, and place the left edge of the image at x = 640-iw/2, which centers it horizontally. iw is the width of the input to the pad filter, i.e. what’s received from the scale filter.
- -shortest tells ffmpeg to stop encoding when the shortest encoded track ends. In our case this is the audio track, since the image is looped indefinitely.
- output.mp4 is just the output file name. ffmpeg automagically performs all the necessary muxing and handles output to the MP4 container.
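To make the scale and pad arithmetic from the filter description concrete, here is a small shell sketch that reproduces it with integer math. The helper names scaled_width and pad_x are made up for this illustration and are not part of ffmpeg, and the rounding ffmpeg itself applies may differ slightly for sizes that don’t divide evenly.

```shell
#!/bin/sh
# Hypothetical helpers (not part of ffmpeg) mirroring the filter arithmetic.

# Approximate width produced by scale=-1:<height>, keeping the aspect ratio.
# $1 = input width, $2 = input height, $3 = target height
scaled_width() {
    echo $(( ($1 * $3 + $2 / 2) / $2 ))
}

# Horizontal offset of the image inside the padded canvas.
# For a 1280-wide canvas this is the same value as 640-iw/2.
# $1 = width coming out of the scale filter (iw), $2 = canvas width
pad_x() {
    echo $(( ($2 - $1) / 2 ))
}

# A square 500x500 cover, scaled to 720 pixels high and padded to 1280x720.
w=$(scaled_width 500 500 720)
echo "scaled width: $w, x offset: $(pad_x "$w" 1280)"
# prints: scaled width: 720, x offset: 280
```

For a 1000×600 image the same helpers give a scaled width of 1200 and an offset of 40, so the picture still fits inside the 1280-pixel canvas.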
That’s pretty much all there is to it. The resulting videos can be uploaded to YouTube without any further hassle. If you’re curious how the effect looks, here’s one of the videos that I prepared that way: click.
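If several tracks need the same treatment, the command can be wrapped in a small shell function. This is only a sketch: the function name make_cover_video, the DRY_RUN switch and the file names in the example are made up for illustration, while the ffmpeg options are exactly the ones discussed in this post.

```shell
#!/bin/sh
# Hypothetical wrapper around the command from this post.
# Usage: make_cover_video IMAGE AUDIO OUTPUT
# With DRY_RUN=1 the command is printed instead of being executed.
make_cover_video() {
    image=$1
    audio=$2
    output=$3
    # Reuse the positional parameters as an argument list so that
    # file names containing spaces survive intact.
    set -- ffmpeg -loop 1 -r 5 -i "$image" -i "$audio" \
        -c:v libx264 -preset slow -crf 18 -tune stillimage \
        -c:a copy \
        -filter:v "[in] scale=-1:720, pad=1280:720:640-iw/2 [out]" \
        -shortest "$output"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        # Printed for inspection only; the shell quoting is not reproduced.
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Preview the command for one track without running ffmpeg.
DRY_RUN=1 make_cover_video cover.jpg song.m4a song.mp4
```

Dropping DRY_RUN=1 runs the actual encode. A loop such as for f in *.m4a; do make_cover_video cover.jpg "$f" "${f%.m4a}.mp4"; done would then process a whole album with the same cover.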
On a side note: it is possible to skip the first step, which consists of preparing the audio files manually with an external AAC encoder, and just have it all done by ffmpeg, thus simplifying the process even more. However, at the time of writing, support for encoding AAC in ffmpeg is experimental and the sound quality of the result usually leaves much to be desired. If you would like to try it out, though, then use the following command.
ffmpeg -loop 1 -r 5 -i "$IMAGE_FILE" -i "$AUDIO_FILE" -c:v libx264 -preset slow -crf 18 -tune stillimage -c:a aac -b:a 384k -filter:v "[in] scale=-1:720, pad=1280:720:640-iw/2 [out]" -shortest output.mp4
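The choice between copying the audio and re-encoding it can also be scripted. The sketch below guesses from the file extension, which is an assumption made for this example rather than anything ffmpeg does; the function name audio_args is hypothetical, and inspecting the file with ffprobe would be a more reliable check.

```shell
#!/bin/sh
# Hypothetical helper: print the audio options for a given input file.
# Files whose extension suggests AAC are copied as-is; everything else
# is re-encoded with ffmpeg's own AAC encoder at 384 kbps.
# The extension is only a guess at what the file really contains.
audio_args() {
    case "$1" in
        *.m4a|*.aac|*.mp4) echo "-c:a copy" ;;
        *)                 echo "-c:a aac -b:a 384k" ;;
    esac
}

audio_args song.m4a    # prints: -c:a copy
audio_args song.flac   # prints: -c:a aac -b:a 384k
```

The printed options are meant to take the place of the -c:a part of either command in this post.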
Hope you found this useful,