I recently helped out with my wife’s writing conference. During this conference, they recorded Vanessa Brantley Newton doing 1:1 critiques with authors and illustrators.
The conference organizers (my wife and her co-leads) offered the attendees individual recordings of their critiques. So, I had to take the overall Zoom download and split it up.
There were a number of files available from Zoom, but the main ones I was interested in seem to be GMT<datetime>_<username>-_1920x1040.mp4 and GMT<datetime>_<username>-_gallery_1920x1040.mp4. The first had a “speaker” view and the second had a “gallery” view. (In both views, screen sharing took the whole screen, and the speaker was reduced to a window.)
The hardest part was finding the time points to split the files up–a little bit because it was a manual process, but more importantly, because it was difficult for me not to get engrossed in the discussion. While illustration isn’t my cup of tea, it is intriguing watching someone deliver constructive feedback.
Anyway, once I had these time segments recorded, I had a few ways to split them up:
1. MKVToolNix: this worked well, but it only outputs Matroska (MKV) files. I wasn’t sure if these would be accessible to the people downloading the files.
2. FFmpeg: it took a while to figure out the syntax, but it can write mp4 files and can copy the streams directly from input to output without re-encoding, so it runs very fast.
For #1, I used the MKVToolNix GUI. It seemed to work OK, but I did not pursue it very far.
For #2, I used the following (Unix) command line:
ffmpeg -i ../GMT20201115-172325_poojan-_gallery_1920x1040.mp4 -c copy -map 0 -segment_times 4:48,21:40,37:23,37:53,52:10,52:33,1:07:45,1:09:20,1:22:40,1:23:13,1:36:27 -f segment -reset_timestamps 1 -segment_list_type csv -segment_list 'gallery.csv' 'gallery-%1d.mp4'
This worked OK, except I found that the timings were off. I said to cut the 1st segment at 4:48 (4 minutes 48 seconds), but the cut ended up being a bit later (at 4:52). This wasn’t a big deal, except the cut at 21:40 also ended up being later, and the 1st file included a tiny bit of discussion about the 2nd person.
The reason for this timing uncertainty is that ffmpeg can’t really break up a file at an arbitrary location when it is only copying the stream. Video codecs are incremental: most frames are stored as changes relative to other frames, so you can’t just break the stream arbitrarily at any point. You can only break it on what are called key frames, the points in the stream where the entire image is sent. Unfortunately, the placement of these key frames does not line up with where I want the cuts.
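To see where a stream copy is actually able to cut, it helps to list the key frames first. Below is a minimal Python sketch, assuming `ffprobe` (which ships with ffmpeg) is on the PATH. The helper names `to_seconds`, `keyframe_times`, and `snap_to_keyframes` are made up for illustration, and the snapping rule (move each cut to the next key frame at or after the requested time) is my assumption based on the late cuts I saw, not something I have confirmed in the ffmpeg source.

```python
import bisect
import subprocess


def to_seconds(stamp):
    """Convert 'SS', 'MM:SS' or 'HH:MM:SS' into seconds."""
    seconds = 0.0
    for part in stamp.split(":"):
        seconds = seconds * 60 + float(part)
    return seconds


def keyframe_times(path):
    """Ask ffprobe for the timestamp (in seconds) of every video key frame."""
    cmd = [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",           # first video stream only
        "-skip_frame", "nokey",             # decode key frames only
        "-show_entries", "frame=pts_time",  # print just the timestamp
        "-of", "csv=p=0",
        path,
    ]
    out = subprocess.run(cmd, capture_output=True, text=True, check=True).stdout
    times = []
    for line in out.splitlines():
        field = line.split(",")[0].strip()
        if field and field != "N/A":
            times.append(float(field))
    return sorted(times)


def snap_to_keyframes(requested, keyframes):
    """Map each requested cut to the first key frame at or after it.

    Assumed rule; returns None when no key frame follows the requested time.
    `keyframes` must be sorted in ascending order.
    """
    snapped = []
    for t in requested:
        i = bisect.bisect_left(keyframes, t)
        snapped.append(keyframes[i] if i < len(keyframes) else None)
    return snapped
```

For example, `snap_to_keyframes([to_seconds("4:48")], keyframe_times("input.mp4"))` (with `input.mp4` standing in for the real recording) would show where a cut requested at 4:48 can really land.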
I even tried trimming some of the post-critique banter by following the above command with:
ffmpeg -y -i gallery-1.mp4 -c copy -map 0 -t 16:32 "Person 1 - Gallery View.mp4"
This forces a truncation of the 1st segment after 16:32 (16 minutes 32 seconds). While I can’t pick where this segment starts, I can truncate it anywhere I want.
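Note that -t is a duration measured from the start of the input file, not a position in the original recording, so it has to be worked out from where the segment really begins. Here is a small Python sketch of that arithmetic. It assumes the segment starts at 4:52 (the late cut mentioned above), and the 21:24 end point is simply back-calculated from the 16:32 I used; the helper names are illustrative.

```python
def to_seconds(stamp):
    """Convert 'MM:SS' or 'HH:MM:SS' (whole seconds) into seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def to_stamp(seconds):
    """Format whole seconds as 'M:SS' or 'H:MM:SS'."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def trim_duration(segment_start, desired_end):
    """Value for -t: how long to keep, given positions in the original recording."""
    length = to_seconds(desired_end) - to_seconds(segment_start)
    if length <= 0:
        raise ValueError("desired end must come after the segment start")
    return to_stamp(length)


print(trim_duration("4:52", "21:24"))  # 16:32
```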
Unfortunately, the only way to split at arbitrary points is to decode and re-encode the whole video stream. (The audio can still be copied.) I used the following command to do so:
ffmpeg -i ../GMT20201115-172325_poojan-_gallery_1920x1040.mp4 -c:v libx264 -c:a copy -segment_times 4:48,21:40,37:23,37:53,52:10,52:33,1:07:45,1:09:20,1:22:40,1:23:13,1:36:27 -f segment -reset_timestamps 1 -segment_list_type csv -segment_list 'gallery.csv' 'gallery-%1d.mp4'
This works well, but is much slower. It runs at about 9x the frame rate on my Unix server, whereas the copy version finishes far faster because it never decodes or encodes any video.
I also tried to use the nVidia-accelerated ffmpeg on my Windows PC (with GTX 1050 Ti):
ffmpeg -y -vsync 0 -hwaccel cuda -hwaccel_output_format cuda -i ..\GMT20201115-172325_wisconsin-_gallery_1920x1040.mp4 -c:v h264_nvenc -c:a copy -segment_times 4:48,21:40,22:04,37:23,37:53,52:10,52:33,1:07:45,1:09:20,1:22:40,1:23:13,1:36:27 -f segment -reset_timestamps 1 -segment_list_type csv -segment_list "gallery.csv" "gallery-%%2d.mp4"
With GPU acceleration, this 2nd version runs at 20x the frame rate. So, for having to do a pointless re-encode, it’s not so bad.
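As a rough sanity check on those speeds, here is a back-of-the-envelope Python sketch. It assumes that "9x" and "20x" mean multiples of real-time playback speed, and it uses the last split point (1:36:27) as a lower bound on the length of the recording, so the real encodes would take at least this long. The function names are illustrative.

```python
def to_seconds(stamp):
    """Convert 'MM:SS' or 'HH:MM:SS' (whole seconds) into seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def encode_minutes(length_stamp, speed):
    """Minutes needed to encode `length_stamp` of video at `speed` x real time."""
    return to_seconds(length_stamp) / speed / 60


for label, speed in (("libx264 on the server", 9), ("h264_nvenc on the GPU", 20)):
    print(f"{label}: about {encode_minutes('1:36:27', speed):.1f} minutes")
```

Under those assumptions, the CPU encode comes out a little under 11 minutes and the GPU encode a little under 5.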
Edit: one more quirk. It seems like the GPU-accelerated (Windows version) of ffmpeg creates output that is always a multiple of 10 seconds in length. I’m not sure if this is a limitation of key frames in the output stream using the Nvidia CUDA-based encoder.
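One possible explanation, which I have not tested on these recordings: the ffmpeg documentation for the segment muxer notes that splitting may not be accurate unless key frames are forced at the split times. If so, the encoder's own key frame interval, rather than my timestamps, decides where the cuts can land, even when re-encoding. The sketch below builds a command that passes the same list to both `-force_key_frames` and `-segment_times`. The function name `build_split_command` and the file name `input.mp4` are placeholders, and the short list of times is just for illustration.

```python
import shlex


def build_split_command(source, split_times, pattern, video_codec="libx264"):
    """Build an ffmpeg re-encode command that forces a key frame at every cut."""
    times = ",".join(split_times)
    return [
        "ffmpeg", "-i", source,
        "-c:v", video_codec,         # re-encode the video
        "-c:a", "copy",              # audio can still be copied
        "-force_key_frames", times,  # put a key frame exactly at each cut
        "-f", "segment",
        "-segment_times", times,
        "-reset_timestamps", "1",
        "-segment_list_type", "csv",
        "-segment_list", "gallery.csv",
        pattern,
    ]


cmd = build_split_command("input.mp4", ["4:48", "21:40", "37:23"], "gallery-%1d.mp4")
print(shlex.join(cmd))  # shell-quoted command line, ready to paste
```

The list form can also be handed directly to `subprocess.run(cmd, check=True)`, which avoids shell quoting issues altogether.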