Thursday, November 14, 2013

Ripping a DVD and hard-wiring subtitles

One way to back up a DVD is just to copy its 'raw' contents, resulting in a file like movie.iso which contains all DVD data files organized as an iso-9660 file system. We use the dd command to copy from the raw DVD device.
DEV=/dev/sr0 # name of DVD reader device
ISO=movie.iso # name of target file
umount ${DEV} # dd works on unmounted devices
dd bs=32000000 if=${DEV} of=${ISO} ||
  { echo "dd from $DEV failed"; exit 1;} 
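dd stops when it hits a read error, but it can still be reassuring to check that the image matches the disc. A minimal sketch, assuming the DVD is still in the drive and unmounted; the function name verify_copy is made up for this example, and cmp simply reads both inputs and reports the first byte that differs.

```shell
# verify_copy SOURCE COPY
# Hypothetical helper: succeeds only if both inputs are byte-for-byte identical.
verify_copy() {
  if cmp -- "$1" "$2"; then
    echo "copy verified"
  else
    echo "copy differs from $1" >&2
    return 1
  fi
}
```

Called as verify_copy ${DEV} ${ISO}, it rereads the whole disc, so expect it to take about as long as the dd run itself.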
To get access to the files, we can mount movie.iso in read-only mode.
ME=$(id -u) # our user id to 'own' the mounted file system
MYGROUP=$(id -g) # our primary group id (not necessarily equal to the user id)
sudo mount -t iso9660 \
  -o ro,loop,nosuid,nodev,uid=${ME},gid=${MYGROUP},mode=0777 \
  ${ISO} /mnt || { echo "mount ${ISO} failed"; exit 1; }
Now we can explore /mnt and find it has a directory video_ts which contains files as in the example below.
$ ls /mnt/video_ts

video_ts.bup  vts_01_0.vob  vts_02_1.vob  vts_02_6.vob
video_ts.ifo  vts_01_1.vob  vts_02_2.vob  vts_02_7.vob
video_ts.vob  vts_02_0.bup  vts_02_3.vob  vts_02_8.vob
vts_01_0.bup  vts_02_0.ifo  vts_02_4.vob
vts_01_0.ifo  vts_02_0.vob  vts_02_5.vob

$ du -sm . # how many megabytes do these files take?
7590
There are two video sequences represented by the .vob files with names starting with vts_01 and vts_02 respectively. Looking at the total sizes of these files, it is clear that the main movie is stored in vts_02_0.vob...vts_02_8.vob.
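The size comparison can be scripted rather than read off a directory listing. This is a small sketch, assuming GNU du (whose -b option reports apparent sizes in bytes); the helper name vts_size_bytes is made up for this example.

```shell
# vts_size_bytes DIR PREFIX
# Hypothetical helper: print the total size in bytes of all .vob files
# in DIR whose names start with PREFIX (e.g. vts_02).
vts_size_bytes() {
  # -c adds a final "total" line, -b reports apparent sizes in bytes (GNU du)
  du -cb "$1"/"$2"_*.vob | tail -n 1 | cut -f 1
}
```

Running vts_size_bytes /mnt/video_ts vts_01 and vts_size_bytes /mnt/video_ts vts_02 then shows at a glance which title set holds the main movie.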
It is also clear that the video sequences take a lot of space, almost 8GB, partly because most DVDs use the older MPEG-2 codec rather than a more efficient one. Still, media players such as vlc can play the movie.iso file directly, including menus, subtitle selection etc., as on the original DVD.
If we want to save space, however, we can convert the vts_02_0.vob...vts_02_8.vob sequence to a single .mp4 file in our home directory. For this we can use the avconv command.
cat vts_02_*.vob | avconv -i pipe:0 ~/movie.mp4
Due to the better codec, the size of movie.mp4 is now about 1.5GB, a huge improvement while maintaining quality. Unfortunately, if the original DVD provides a choice of subtitles, they will not be included in the copy. A simple solution would be to make a copy that has one of the subtitle languages hardwired into it.
Apparently, DVDs store subtitle information in so-called SPUs (SubPicture Units). Essentially, these are images containing the subtitle text, which are presumably overlaid on the video stream at the appropriate moments. The SPUs come with an index that indicates when each picture of the selected subtitle language has to be shown.
So, one way to solve the problem would be to find a program that is able to extract the subtitle data (and index) and then define a post-processing filter that would add the subtitles in the chosen language.
Surprisingly, there seems to be only one program under Linux that supports the definition of a filter as described above: the avidemux video editor. Fortunately, avidemux also contains a tool to extract the SPU data and index from the video input. Unfortunately the command line interface for avidemux does not seem to support subtitle manipulation, so we are forced to use the GUI (Graphical User Interface).

Click 'open' button to load video file(s).
Clicking the indicated button lets us select a video file; we take the first one from the vts_02_*.vob sequence. Normally, we would expect to have to select all the following files in the sequence as well, but Avidemux apparently does that automatically.

It may be that Avidemux takes a while to load the file(s) because it indexes them. Once the movie has been loaded, not much has changed on the interface but see the arrows on the next image.
Clicking on the indicated button will show information on the format of the movie, as shown in the pop-up on the right.

The movie is loaded.
Properties of the loaded movie.
Now we are ready to extract the subtitle information. Click on Tools in the top bar (not shown) and select VOB -> VobSub. A window pops up, shown on the right below, where you select three things: the first .vob file (the same one that was used to load the movie into avidemux), the accompanying .ifo file (vts_02_0.ifo) and, finally, a file name prefix that will be used to store the subtitle information that is about to be extracted. Note that the prefix cannot point to a location in (or under) the /mnt directory, since /mnt contains a read-only file system.
Clicking OK starts the process and will produce two files: vts_02.sub and vts_02.idx containing, respectively, the subtitle images and the index describing when they have to be displayed.
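The .idx file is plain text, so it can be inspected from the shell before going back to the GUI. The sketch below assumes the usual VobSub layout, in which each subtitle stream starts with a line such as "id: en, index: 0" followed by one "timestamp: ..., filepos: ..." line per subtitle picture; the two function names are made up for this example.

```shell
# list_sub_languages IDX_FILE
# Hypothetical helper: print one "id: ..." line per subtitle stream.
list_sub_languages() {
  grep '^id:' "$1"
}

# count_sub_entries IDX_FILE
# Hypothetical helper: count the "timestamp: ..." lines,
# i.e. the subtitle pictures of all streams together.
count_sub_entries() {
  grep -c '^timestamp:' "$1"
}
```

For instance, list_sub_languages ~/tmp/vts_02.idx shows which language codes are available for hardwiring.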
Finally, we define a filter which will add subtitles in a selected language to the video output to be produced by avidemux. For that we first define the (container) format and the video codec, as shown on the left below.
Clicking on Filters under Video brings up a pop-up menu where we select the subtitle menu that inserts "vobsub" frames. Double-clicking on the "vobsub" selection brings up another menu where we select the file ~/tmp/vts_02.idx containing the subtitle information. Once that is done, we can also select the subtitle language that is to be hardwired into the output.

After all that, we are ready to produce the output. This is done by clicking the Save button (next to Open). "Save" may not be an intuitive name for what is essentially a video transformation process but, since avidemux is mainly a video editor, it actually makes sense.
