The Basics of DVD Creation

The DVD has become the de facto replacement for the VHS dub. DVD authoring has become an extension of most nonlinear editing software. This has made something that started out as a very arcane, expensive and specialized task into something that is easy enough for any video professional to master. When you produce a video DVD, you are actually building a creative product that must conform to the DVD-Video spec. This is one of the categories within the overall DVD format specification that includes other forms, such as DVD-ROM and DVD-Audio. Creating a disk with the full nuances of the DVD-Video spec has been addressed in such high-end applications as Sonic Solutions’ DVD Creator. Newer, more user-friendly programs like Adobe Encore DVD, Apple DVD Studio Pro 2, Sonic’s own Reel DVD and Sony’s DVD Architect now offer enough of these same DVD authoring features to satisfy the requirements of over 90% of all the types of DVDs typically produced.


The full DVD spec deals with complex elements like program managers, title sets and so on, but these newer authoring tools have adopted a more streamlined approach – dividing your assets into tracks (for video and audio elements) and menus. You are generally limited to 99 different assets – far less than the full DVD spec would actually allow, yet well within the requirements of most commercial and corporate DVDs. This is especially true considering the significant drop in price from the early days of DVD tools.


Most of the popular authoring applications offer the main features, like motion menus, subtitles and multiple languages, but not all permit some of the more advanced features, including 24fps encoding, multiple camera angles and AC-3 Dolby surround audio tracks. Obviously, there is still room for the advanced products, but any of the sub-$1,000 software tools will do the trick for most producers.




DVD program assets are made up of elementary stream files. Each video segment has a compressed MPEG2-format video file (.m2v) and, if there is audio with the program, at least one audio file (.wav). When multiple languages are used, there will be more than one audio file for each video element. For surround audio, there will be both a stereo track, as well as an AC-3 encoded audio file for the surround mix. In the past, these files had to be created prior to any authoring. Ground-breaking DVD tools, like Sonic’s DVD Creator, employed real-time encoding hardware (which is still used), but fast software encoding has now become an acceptable alternative. Many authoring tools let you import DVD-compliant files, which have been compressed using a separate application – or work with regular AVI or QuickTime files and apply built-in encoding at the last stage of the authoring process.


DVD videos can be either NTSC or PAL and the frame size is the same as DV: NTSC – 720 x 480 (non-square pixel aspect ratio). The compression of DVDs should typically fall into the range of 4.0 to 7.0 Mbps (megabits per second). Higher data rates are allowed, but you generally won’t see much improvement in the video and higher bit rates often cause problems in playback on some DVD players, especially those in older laptop computers. There are three encoding methods: constant bit rate, one-pass variable and two-pass variable.


Constant bit rate encoding is the fastest because the same amount of compression is applied to all of the video, regardless of complexity. Variable bit rate encoding applies less compression to more complex scenes (a fast camera pan) and more compression to less complex scenes (a static “talking head” shot). In two-pass encoding, the first pass is used to analyze the video and the second pass is the actual encoding pass. Therefore, two-pass variable bit rate encoding will take the longest amount of time. During the encoding set up, a single bit rate value is entered for constant bit rate encoding, but two values (average and maximum peak rates) are entered for variable.
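To make the two-pass idea concrete, here is a toy sketch in Python. It is not how any real encoder works internally. The complexity scores, the function name and the assumption that every scene runs the same length are all invented for illustration. The first pass is represented by a list of scores; the second pass shares out the bit rate in proportion to those scores, capped at the peak value.

```python
def allocate_bit_rates(complexities, average_mbps, max_mbps):
    """Second-pass sketch: share out bits in proportion to measured complexity.

    complexities: one relative score per scene, as a first analysis pass
    might report them. Scenes are assumed to be equally long (an assumption
    made only to keep the arithmetic simple).
    """
    mean_complexity = sum(complexities) / len(complexities)
    rates = []
    for score in complexities:
        wanted = average_mbps * score / mean_complexity
        # Never exceed the peak rate entered during encoding setup.
        rates.append(min(wanted, max_mbps))
    return rates


# Hypothetical first-pass scores: two talking-head shots and one fast pan.
rates = allocate_bit_rates([1.0, 1.0, 4.0], average_mbps=5.0, max_mbps=7.0)
print(rates)
```

The static shots end up with a low rate and the pan is held at the peak. Because of that cap, the overall average lands below the target in this toy version; a real encoder would hand the leftover bits back to other scenes.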


The quality of one type of encoding versus another depends on the quality of the encoding engine used by the application, as well as the compression efficiency of the video itself. The former is obvious, but the latter means that film, 24P and other progressive-based media will compress more cleanly than standard interlaced video. This is due to the fact that interlaced video changes temporal image information every 60th of a second, while film and 24P’s visual information updates only 24 times a second. As a result, compression at the exact same bit rate will appear to have fewer artifacts when it is applied to film and 24P media, than when applied to interlaced media. Add to this the fact that film material has grain, which further hides some of these compression artifacts. The bottom line is that a major movie title can often look great with a much lower bit rate (more compressed) than your video-originated corporate training DVD – even with a higher bit rate. Most standard DVDs will look pretty good at a bit rate of around 5.5 Mbps, which will permit you to get about 60 to 90 minutes of material on a 4.7 GB general-purpose DVD-R.
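The running-time figure is easy to sanity-check with some arithmetic. The Python sketch below is only an estimate: the function name and the 4 percent allowance for menus and multiplexing overhead are my own assumptions, and the audio rate shown is for uncompressed 48 kHz, 16-bit stereo PCM.

```python
DVD_R_BYTES = 4_700_000_000  # marketed capacity of a single-layer DVD-R


def minutes_on_disc(capacity_bytes, video_mbps, audio_mbps=0.0, overhead=0.04):
    """Estimate how many minutes of program fit at the given bit rates.

    overhead is a guessed fraction set aside for menus and muxing;
    it is an assumption, not a figure from any specification.
    """
    usable_bits = capacity_bytes * 8 * (1.0 - overhead)
    total_bits_per_second = (video_mbps + audio_mbps) * 1_000_000
    return usable_bits / total_bits_per_second / 60.0


# 5.5 Mbps video plus stereo PCM audio (48,000 x 16 x 2 = 1.536 Mbps).
estimate = minutes_on_disc(DVD_R_BYTES, video_mbps=5.5, audio_mbps=1.536)
print(f"{estimate:.0f} minutes")
```

With these inputs the estimate falls near the top of the 60 to 90 minute range. Raising the video rate or adding more audio streams pulls it down quickly.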




Creating the interactive design for a DVD is a lot like building a web site. Menus are created which are linked to video assets. Clicking a menu button causes the player to jump from one point on the DVD (menu) to another (video track). Menus can be still frames or moving video, but must conform to the same constraints as any other video. If they don’t start out as video, they are turned into video in the final DVD build. By comparison a web site, CD-ROM or even DVD-ROM might have HTML-based menus of one size and QuickTime or AVI video assets of a different size. This isn’t allowed in a DVD-Video, because the DVD must be playable on a set-top DVD player as video. Motion menus must be built with loop points, since the longer the menu runs, the more space it will consume on the disk. Thirty seconds is usually a standard duration for a loop. A slight pause or freeze occurs in the video when the disk is triggered to jump back to the start of the loop. This is a buffer as the DVD player’s head moves between two points on the disk’s surface.


Lay out your design on paper first. A flowchart is a great idea, because this will let you see how one or more menus connect to the videos in the most efficient manner. Although any type of flowcharting software program will work, it is often just as simple to do this on paper. On the other hand, a nicely printed visual flowchart goes a long way in explaining to a client how the viewer will navigate through a complex DVD. Links can be created as buttons or drop zones. A button is a graphic element added to a menu within the authoring tool, while a drop zone is a hyperlink area added to an imported video. You can create videos or animations to be used as menus and turn graphic or video portions into linkable “hot spots” by adding drop zones on top of the video.


Clicking on a button or drop zone activates a jump to another menu, a video track or a point within a video track (chapter point). Part of the authoring task is to define what actions occur when you click on the control keys of a standard DVD handheld remote. You must define what happens when the viewer clicks the Menu, Title or Return keys, as well as which direction the cursor travels when the arrow keys (up, down, left, right) are used. Although the degree of control over these parameters varies with different software applications, all the contenders must let you define the next options. You have to set the First Play – the file that plays when you first pop in the DVD. You also have to set the target destination for the end of each video file. This determines whether the video continues on to another video or jumps back to a menu – and if so, which one. Remember that if you have more than one menu, you will have to add buttons and navigation commands to go between the menus.
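The navigation rules above amount to a small graph of named items and jumps. As a rough illustration, and not the data model of any actual authoring tool, the Python sketch below describes a project with a First Play target, menus with buttons, and tracks with end actions, then checks that every jump lands on something that exists. All class, function and item names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Menu:
    name: str
    buttons: dict = field(default_factory=dict)  # button label -> target name


@dataclass
class Track:
    name: str
    end_action: str  # where the player jumps when this video finishes


def find_broken_links(first_play, menus, tracks):
    """Return a list of descriptions of jumps that point at nothing."""
    known = {m.name for m in menus} | {t.name for t in tracks}
    problems = []
    if first_play not in known:
        problems.append(f"First Play points at missing item: {first_play}")
    for menu in menus:
        for label, target in menu.buttons.items():
            if target not in known:
                problems.append(f"{menu.name} / {label} points at missing item: {target}")
    for track in tracks:
        if track.end_action not in known:
            problems.append(f"{track.name} end action points at missing item: {track.end_action}")
    return problems


menus = [
    Menu("Main Menu", {"Play Movie": "Feature", "Extras": "Extras Menu"}),
    Menu("Extras Menu", {"Trailer": "Trailer", "Back": "Main Menu"}),
]
tracks = [
    Track("Feature", end_action="Main Menu"),
    Track("Trailer", end_action="Extras Menu"),
]
print(find_broken_links("Main Menu", menus, tracks))
```

Note that the second menu carries its own "Back" button. That is the extra navigation you have to add by hand whenever a project has more than one menu.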




Most authoring tools let you run a simulation of the DVD as you configure it. This lets you proof it to see if all the links work as intended. When the authoring is complete, the next step is to “build” the disk. This is an automatic process in which the application checks to see if your authoring has any errors and then “muxes” (multiplexes) the audio, video and menu files. The muxing stage is where the VOB (video object) files are created. You can see these files if you explore the folders of a non-encrypted DVD. If your DVD tool features built-in encoding, that step generally occurs during the building process.


When a project is built, it can be saved to your hard drive as a disk image or burned to a recordable DVD. Although there are a handful of different recordable DVD formats, the DVD-R general-purpose disks seem to be the most universal. These are typically rated at 4.7 GB, but actually hold about 4.3 GB of data. If you require more capacity, you will have to advance to a dual-layer or dual-sided DVD (9 GB and higher). These cannot be burned on a DVD recorder and not all authoring tools handle these formats. If they do, you can prepare a disk image to be saved to a DLT tape, which would be sent to a commercial DVD replication facility. When you burn a DVD-R on a recording drive, be aware that the burning speed is based on the rating of the media. 1X blanks will only burn at the 1X speed, 2X disks at 2X speed and so on. If you saved the build as a disk image, you will be able to use a disk burning utility like Roxio and burn multiple DVD copies. Although encoding and recording times vary, it is not out-of-line for a one-hour DVD to require as much as four hours from the time you start the build until the DVD has finished the recording step.
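Two of the numbers above can be worked out directly. The gap between the rated and actual capacity comes from counting in decimal versus binary gigabytes, and the burn time follows from the 1X DVD transfer rate of roughly 1.385 MB per second. The Python below is a back-of-the-envelope sketch with function names of my own choosing; it ignores lead-in, verification and any encoding time.

```python
DVD_1X_BYTES_PER_SECOND = 1_385_000  # approximate 1X DVD transfer rate


def rated_gb_to_binary_gb(rated_gb):
    """Convert a marketed (decimal) gigabyte figure to binary gigabytes."""
    return rated_gb * 1_000_000_000 / 2**30


def burn_minutes(data_bytes, speed=1):
    """Rough writing time for data_bytes at the given speed rating."""
    return data_bytes / (DVD_1X_BYTES_PER_SECOND * speed) / 60.0


print(f"{rated_gb_to_binary_gb(4.7):.2f} GB as the computer counts it")
print(f"{burn_minutes(4_700_000_000, speed=1):.0f} minutes for a full disk at 1X")
print(f"{burn_minutes(4_700_000_000, speed=2):.0f} minutes for a full disk at 2X")
```

A full disk at 1X takes the better part of an hour to write, which goes some way toward explaining the long total turnaround once encoding and muxing are added in.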


We’ve only scratched the surface; so to learn more, check out the white papers and technical documents that are available on the Pioneer, Sonic Solutions and Mitsui web sites. A great reference is DVD Production, A Practical Resource for DVD Publishers by Philip De Lancie and Mark Ely, available from Sonic Solutions. User guides and tutorials – especially those included with Adobe Encore DVD and Apple DVD Studio Pro 2 – are also quite helpful. Start small and move up from there. Soon you, too, can be a master of the DVD world.


© 2004 Oliver Peters

Proper Monitoring

Just like good lighting and camerawork are some of the fundamentals of quality production, good monitoring provides some of the same important building blocks for post-production. Without high quality video and audio monitors, as well as waveform monitors and vectorscopes, it is impossible to correctly assess the quality of the video and audio signals with which you are working. There are few if any instruments that truly tell an editor or mixer the degradation of signals as they travel through the system any better than the human eyes, ears and brain. You cannot read out the amount of compression applied to a digital file from some fancy device, but the eye can quickly detect compression artifacts in the image.


Such subjective quality evaluations are only valid when you are using professional, calibrated monitoring that shows you the good with the bad. The point of broadcast grade video monitors and studio grade audio monitors is not to show you a pleasing picture or great sounding mix, but rather to show you what’s actually there, so that you can adjust it and make it better. You want the truth and you won’t get that from a consumer video monitor or TV or from a set of discount boombox speakers.


Video Monitors


Let’s start with the picture. A proper post-production suite should have a 19 or 20-inch broadcast grade monitor for video evaluation. Smaller monitors can be used if budgets are tight, but larger is better. Most people tend to use Sonys, but there are also good choices from Panasonic and Barco. In the Sony line, you can choose between the BVM (broadcast) and the PVM (professional, i.e. “prosumer”) series. The BVMs are expensive but offer truer colors because of the phosphors used in the picture tube, but most people who work with properly calibrated PVM monitors are quite happy with the results. In no case at this point in time would I recommend flat panel monitors as your definitive QC video monitor – especially if you do any color-correction with your editing.


The monitor you use should have both component analog (or SDI) and composite analog feeds from your edit system. Component gives you a better image, but most of your viewers are still looking at the end product (regardless of source) via a composite input to a TV or monitor of some type. Frequently things can look great in component and awful in composite, so you should be able to check each type of signal. If you are using a component video feed, make sure your connections are solid and the cable lengths are equal, because reduced signal strength or unequal timing on any of the three cables can result in incorrect colorimetry when the video is displayed. This may be subtle enough to go unnoticed until it is too late.


Properly calibrated monitors should show a true black-and-white image, meaning that any image that is totally B&W should not appear to be tinted with a cast of red, blue or green. Color bars should appear correct. I won’t go into it here, but there are plenty of resources that describe how to properly set up your monitor using reference color bars. Once the monitor is correctly calibrated, do not change it to make a bad picture look better! Fix the bad image!




Video monitors provide the visual feedback an editor needs, but waveform monitors and vectorscopes provide the technical feedback. These are the editor’s equivalent to the cinematographer’s light meter. The waveform monitor displays information about luminance (brightness, contrast and gamma) while the vectorscope displays information about color saturation and hue. The waveform can also tell you about saturation but not hue. Most nonlinear editing applications include software-based scopes, but these are pretty inadequate when compared to the genuine article. Look for products from Tektronix, Leader, Videotek or Magni. Their products include both traditional (CRT-based) self-contained units, as well as rack-mounted modules that send a display to a separate video monitor or computer screen. Both types are accurate. Like your video display, scopes can be purchased that take SDI, component analog or composite analog signals. SDI scopes are the most expensive and composite the least. Although I would recommend SDI scopes as the first choice, the truth of the matter is that monitoring your composite output using a composite waveform monitor or vectorscope is more than adequate to determine proper levels for luma and chroma.


Audio Monitors


Mixers, power amps and speakers make up this chain. It’s possible to set up a fine NLE suite with no mixer at all, but most people find that a small mixer provides a handy signal router for the various audio devices in a room. When I work in an AES/EBU-capable system (digital audio), I will use that path to go between the decks and the edit system. Then I only use the mixer for monitoring. On the other hand, in an analog environment, the mixer becomes part of the signal chain. There are a lot of good choices out there, but most frequently you’ll find Mackie, Behringer or Alesis mixers. Some all-digital rooms use the Yamaha digital mixers, but that generally seems to be overkill.


The choice of speakers has the most impact on your perception of the mix. You can get either powered speakers (no separate amp required) or purchase a separate power amp, depending on which speakers you purchase. Power amps seem to have less of an effect, but good buys are Yamaha, Crown, Alesis, Carvin and Hafler. The point is to match the amp with the speakers so that you provide plenty of power at low volumes in order to efficiently drive the speaker cones.


Picking the right speaker is a very subjective choice. Remember that you want something that tells you the ugly truth. Clarity and proper stereo imaging are important. Most edit suites are near-field monitoring environments, so huge speakers are pointless. You will generally want a set of two-way speakers, each with an eight-inch woofer and a tweeter. There are plenty of good choices from JBL, Alesis, Mackie, Behringer, Tannoy and Eastern Acoustic Works, but my current favorite is from SLS Loudspeakers (Superior Line Source). In the interest of full disclosure, I own stock in SLS and have them at home, but they are truly impressive speakers sporting innovative planar ribbon technology in their tweeter assembly.


VU Metering


VU, peak and PPM meters are the audio equivalent to waveform monitors. What these meters tell you is often hard to interpret because of the industry changes from analog to digital processing. An analog VU scale places desirable audio at around 0 VU with peaks hitting at no more than +3 dB. Digital scales have a different range. 0 dB is the absolute top of the range, and anything that tries to go higher results in harsh digital distortion. The equivalent spot on this range to analog’s 0 VU is minus 12, 14 or 20 dB. In effect, you can have up to 20 dB of headroom before distortion, as compared to analog’s 3 to 6 dB of headroom. The reason for the ambiguity in the nominal reference value is that digital systems calibrate their VU scales differently. Most current applications set the 0 VU reference at –20 dB digital.
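The relationship between the two scales is just an offset, which the short Python sketch below spells out. The function names are mine, and the default reference of minus 20 is only the most common calibration mentioned above; check how your own system is lined up.

```python
def digital_to_vu(level_db_digital, reference_db_digital=-20.0):
    """Express a digital meter reading relative to the 0 VU reference point."""
    return level_db_digital - reference_db_digital


def headroom_db(reference_db_digital=-20.0):
    """Distance from the 0 VU reference up to digital full scale (0 dB)."""
    return 0.0 - reference_db_digital


for reference in (-12.0, -14.0, -20.0):
    print(f"0 VU at {reference} dB digital leaves {headroom_db(reference)} dB of headroom")

# A peak reading of -17 on a system referenced at -20 sits at +3 on the VU scale.
print(digital_to_vu(-17.0))
```

The same reading can therefore mean quite different things on two systems, which is why the reference tone at the head of a tape matters so much.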


Mixing with software VUs can be quite frustrating because the meters are instantly responsive to peaks. You see more peaks than a mechanical, analog VU meter would ever show you. As a result, it is quite easy to end up with a mix that is really too low when you go to a tape machine. I generally fix this by setting the VTR inputs to a point where the level is in the right place for the VTR. Then I may change the level of the reference tone at the head of my sequence to match the proper level as read at the VTR. This may seem backwards, but it’s a real world workaround that works quite well.


© 2004 Oliver Peters

Adobe Photoshop Tips for Video

Adobe Photoshop has become the most valuable auxiliary software application used by video professionals and video editors of all types. Whether you use it to doctor client-supplied graphics and photos or simply as the ersatz type tool for your nonlinear edit system, Photoshop has been a veritable Swiss Army knife to solve design issues for video. In this installment I’ll pass along a few pointers that might make your use of Photoshop more productive. Generally these tips apply to versions 6, 7 or higher (Mac and PC), so some may not work the same way with earlier versions.




The NTSC video frame is sized at 720 x 486 (480 for DV) non-square, i.e. rectangular, pixels. Computers work with a square pixel aspect ratio, so new graphics created in Photoshop for video should start out at 720 x 540 pixels (72 dpi). Depending on which video editing application you use, you might have to resize the file as your last step. For example, Avid software can automatically resize the frame from the 540 height to 486 (or 480), whereas Final Cut Pro does not. In the case of FCP, you would have to resize the graphic in Photoshop, before importing it into Final Cut. The reason for these corrections is so that the aspect ratio of graphics, such as the roundness of a circular logo, will look correct once you get it into the video realm. In short, create in 720 x 540 and then resize to 720 x 486 for all NTSC video formats except DV/DVCPRO/DVCAM, which use a 720 x 480 frame size. Turn “constrain proportions” off when altering image sizes in Photoshop.
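The resize is a vertical squeeze only, and the amount is easy to compute. In the Python sketch below (function name and the 200-pixel logo are my own examples), a circle drawn in the 720 x 540 square-pixel file is stored shorter than it is wide in the video-sized file, and the non-square video pixels stretch it back to round on the monitor.

```python
SQUARE_PIXEL_HEIGHT = 540  # working height in Photoshop


def squeezed_height(height_in_square_pixels, dv=False):
    """Height of an element after resizing the frame to 486 (or 480 for DV) lines."""
    target_lines = 480 if dv else 486
    return height_in_square_pixels * target_lines / SQUARE_PIXEL_HEIGHT


logo_diameter = 200  # a round logo drawn 200 pixels across in the 720 x 540 file
print(f"D1 frame: 200 wide x {squeezed_height(logo_diameter):.0f} tall")
print(f"DV frame: 200 wide x {squeezed_height(logo_diameter, dv=True):.0f} tall")
```

The width never changes, which is why "constrain proportions" has to be switched off for the resize.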




Computers and digital video use an 8-bit color depth, meaning that colors, brightness, etc. are each divided into 256 increments. There is also 10-bit color for some video systems, but generally even on these systems, rendering is still based on 8-bit math. Computers work with RGB images. Each component color element – red, green and blue – has a value from 0 to 255. Black would be 0, 0, 0 and white would be 255, 255, 255. Digital video, known as YUV (conforming to the ITU-R 601 spec), splits video into a luminance and two color-difference signals. You will often see this expressed as a ratio, such as 4:2:2 or 4:1:1, representing the relative (not actual) values of these components. In most digital video systems, black is actually placed at the value of 16 and white at 235. This gives the system enough headroom at both ends to deal with things like dark video dipping below the NTSC analog “set-up” signal (analog black is at 7.5 IRE not 0) and peak white levels that exceed 100 IRE in the camera (up to 110 IRE).


Like the issue of sizing, various NLEs deal with video levels in different manners. For instance, Avid software allows you to import graphics with either RGB or 601 video levels. This choice will affect the brightness and contrast of your image. Final Cut Pro gives you no such option and assumes your graphic to be at RGB values. It is best to know where your graphic will be used and adjust values accordingly. This can be done using Photoshop’s “levels” adjustment. Here you can alter input and output values, as well as the midpoint or gamma of the image. Changing the gamma value allows you to adjust the midrange portion of an image between brighter and more visible to darker and less visible. Since computer screens have different gamma values than monitors, you will frequently get a totally different amount of contrast between the appearance of a photo or graphic in Photoshop (viewed on a computer screen) versus your NLE playing on a video monitor. If your NLE expects a graphic with an RGB range, output settings at 0 and 255 are fine, but if it expects a 601 range, then you need to adjust things to 16 and 235.
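Setting the output levels to 16 and 235 is a straight linear remap of each 8-bit value. The Python sketch below shows the arithmetic only; the function name is hypothetical, it ignores gamma, and Photoshop's own rounding may differ slightly.

```python
def remap_output_levels(value, out_black=16, out_white=235):
    """Scale a full-range 8-bit value (0-255) into the given output range."""
    return round(out_black + value * (out_white - out_black) / 255)


for rgb_value in (0, 128, 255):
    print(rgb_value, "->", remap_output_levels(rgb_value))
```

Full-range black and white land exactly on 16 and 235, and everything in between is compressed proportionally, which is why the adjusted graphic looks a little flat on the computer screen.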


It is ideal to be able to check the results of your adjustments by viewing the Photoshop output on an actual waveform monitor or video monitor. There are some applications and plug-ins that approximate this with software scopes, but the real thing is better. The Echo Fire plug-in works with the AJA Io to allow you to see Photoshop images on a video monitor. Some video graphics cards, like the Matrox Parhelia card, also give you this functionality.


Layers and Layer Effects


One of the biggest features that Photoshop offers is the ability to work in layers and many NLEs now allow you to import layered files. These usually come in as sequences with each Photoshop layer becoming a separate video layer. After importing, you can use your NLE’s DVE effects to move or animate any of these elements. Adobe’s blend modes and layer effects cause trouble for most NLEs. The blend modes determine how each layer interacts with the rest of the composite. These can create quite unique looks, but most NLEs really only understand the “normal” blend mode and use this transparency information to correctly key or superimpose one layer over another. The layer effects are used to add drop shadows, glows and embossing to an object. If these are left separate, most NLEs will not derive the correct transparency and/or edge information of a graphic. In order to prevent problems, layer effects must first be merged to their layers before importing into an NLE.


I usually follow this procedure for a layer with effects. Duplicate the layer and create a new blank layer above it. “Rasterize Layer” on the duplicate if this is a type layer. Link the duplicate and the new blank layer and then choose “merge linked”. You now have a single layer, complete with the composited layer effects, as well as the original layer with its individual elements. Keeping the originals intact permits later changes. Then I usually save a copy of the complete file and delete the original component layers in that copy, leaving only the various final layers. This is the file I will import into the NLE.


Normally you will import a flattened file (no layers), if you don’t intend to create any further animation with it. Most NLEs will deal with all the popular formats: PSD, PICT, TIFF, Targa, JPEG or BMP. Make sure the files are in the RGB and not CMYK mode. To flatten a file, select that option on the layers menu or simply save the file in one of these formats, but do not enable layers in the saving dialog. Make sure you do this with a copy so you can go back and make changes to the original.




When creating a graphic to be used as a key, such as a logo or super, you will need to have a companion alpha channel to “cut the hole” for this key. Most NLEs can only deal with a single alpha channel. You will need to merge all layers except the solid-color background layer into a single composite layer. “Load Selection” on the merged layer and you’ll see the crawling dotted outline of Photoshop’s selection tool. In the channels menu, add a new alpha channel and then fill the selection with white. You should now see a white-on-black version of the graphic, complete with soft shadows and gray values for semi-transparent objects or edges. If your NLE expects to see an alpha channel with the key-cutter object as black on a white background, then invert the video for the alpha channel. This is the case with Avids. Flatten the file and save as a 32-bit file (with alpha) in one of the above formats.




I use Photoshop a lot as the character generator for my NLE. I like the layer effects and it is easy to do a lot of similar supers simply by creating one layer with the right fonts, sizes and attributes, then duplicating it and entering the revised text on the new layer. Repeat the process for as many supers as you need. I did this recently with 200 direct response phone number supers. When I was done, I had a single Photoshop file with 200 layers, each with separate layer effects for drop shadows and a subtle glow effect. I still had to get these into individual graphic files for my NLE. Along the way, I discovered yet another way that was far easier than the process I just finished describing. PNG is another image file format that Photoshop can save, and it preserves transparency information by embedding it in the file. I took my 200-layer Photoshop file and turned off the layer visibility of all layers except the one I was exporting. With one layer active and visible, I saved a copy of the file in the PNG format. This new file had discarded all the invisible layers, rasterized the type, merged the layer effects and embedded alpha information. I was able to import this into Avid (reverse alpha selected) and had my real-time title super. Then I repeated this process for each of the different phone numbers.


Processing Moving Video Clips


Adobe After Effects is frequently used to add effects and filtering to video clips, but you can also do this with Photoshop. Most NLEs will permit you to export a clip or a sequence as a series of sequential image files. In other words, your :10 video clip can be exported as a series of still frame graphics in one of the standard files, numbered as such – NAME001.tga, NAME002.tga, NAME003.tga and so on. Any of the Photoshop level and sizing functions or filter effects can be applied to these files. Photoshop offers a way to automate such procedures through its Batch Automation and Actions menus. For instance, if you’d like a Watercolor Artistic effect filter applied to :10 of video, it’s as simple as creating the right actions and batch commands to do this. It’s a good idea to set up the batch to save the processed files as new images to a separate folder, in order to leave your original exported files unaffected.
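If you ever need to check or rebuild such a numbered sequence outside of your NLE, the naming pattern is simple to generate. This Python sketch uses only the standard library; the function names, the "processed" folder and the 30 frames per second used for the count are my own assumptions for illustration.

```python
from pathlib import Path


def frame_names(base, count, ext="tga", digits=3, start=1):
    """Build zero-padded sequential file names such as NAME001.tga."""
    return [f"{base}{n:0{digits}d}.{ext}" for n in range(start, start + count)]


def output_paths(names, out_dir):
    """Pair each frame name with a separate folder so the originals stay untouched."""
    return [Path(out_dir) / name for name in names]


# A :10 clip at roughly 30 frames per second is about 300 still frames.
names = frame_names("NAME", 10 * 30)
print(names[0], names[1], names[-1])
print(output_paths(names[:1], "processed")[0])
```

Three digits of padding cover clips up to 999 frames. Longer exports need more digits so the files still sort in playback order.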


On the opposite end, most NLEs will also import sequential image files and put them back together as a single video clip or animation. As long as no size values were changed, you shouldn’t need to worry about resizing the images or any video interlace issues during the exporting and importing of these files.


There’s a lot more to Photoshop than these few tips, but hopefully one of these ideas has added a couple of new tricks to your NLE expertise. If you want to do a bit more in-depth research, take a look at Photoshop for Nonlinear Editors (Richard Harrington, CMP Books).


© 2004 Oliver Peters