Remember film?

With all the buzz about various digital cameras like RED and the latest HDSLRs, it’s easy to forget that most national commercial campaigns, dramatic television shows, feature films and many local and regional spots are still shot on ACTUAL 16mm and 35mm motion picture film. As an editor, you need to have a good understanding of the film transfer workflow and what information needs to be communicated between an editor and the transfer facility or lab.

Film transfers and speed

Film is typically exposed in the camera at a true 24fps. This is transferred in real time to video using a scanner or telecine device like a Cintel Ursa or a DFT Spirit. During this process, the film’s running speed is slowed by 0.1% (a factor of 1000/1001) to 23.98fps (also expressed as 23.976) – a rate compatible with the 29.97fps video rate of the NTSC signal. In addition, film that is being transferred to NTSC (525i) or high definition video for television (1080i/29.97 or 720p/59.94) is played with a cadence of repeated film frames, known as 3-2 pulldown. Film frames are repeated in a 2-3-2-3 pattern of video fields, so that 24 film frames fill 30 interlaced video frames (or 60 whole frames in the case of 720p) within one second of time. (Note: This is specific to the US and other NTSC-based countries. Many PAL countries shoot and post film content targeted for TV at a true 25fps.)
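If you want to sanity-check those numbers, here’s a minimal Python sketch of the arithmetic (my own illustration of the rates described above, not anything pulled from telecine software):

```python
# Hedged sketch: the NTSC speed change and the 2:3 pulldown cadence.
FILM_FPS = 24.0
NTSC_FACTOR = 1000.0 / 1001.0          # the ~0.1% slowdown

video_film_rate = FILM_FPS * NTSC_FACTOR
print(round(video_film_rate, 3))       # 23.976

# 2:3 cadence: four film frames (A, B, C, D) become ten video fields,
# i.e. five interlaced frames -- so 24 film frames fill 30 video frames.
cadence = {"A": 2, "B": 3, "C": 2, "D": 3}
fields = sum(cadence.values())
print(fields / 2)                      # 5 video frames per 4 film frames
print(24 / 4 * fields / 2)             # 30 video frames per second of film
```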

Film production requires the use of an external sound recorder. This production method is known as double-system sound recording. Analog audio recorders for film, like a Nagra, record at a true sound speed referenced to 60Hz or, if timecode was used, at a true timecode rate of 30fps. When the audio tape is synced to the film during the film-to-tape transfer session, the audio goes through a similar 0.1% (.999) speed adjustment, resulting in the sound (and timecode) running at 29.97fps instead of 30fps as compared to a real-time clock.

The film sound industry has largely transitioned from analog recorders – through DATs – to current file-based location recorders, like the Aaton Cantar or the Zaxcom Deva, which record multichannel Broadcast WAVE files. Sound speed and the subsequent sync-to-picture is based on sample rates. One frequent approach is for the location sound mixer to record the files at 48.048kHz, which are then “slowed” when adjusted to 48kHz inside the NLE or during film-to-tape transfer.
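The sample-rate arithmetic works out to the same 0.1% pull-down as the picture. A quick sketch of the math (numbers chosen to illustrate the ratio, not any recorder’s menu settings):

```python
# Hedged sketch: the 0.1% audio pull-down expressed as sample rates.
recorded_rate = 48_048        # Hz, as delivered by the location recorder
played_rate   = 48_000        # Hz, the rate the NLE or telecine plays it at

print(played_rate / recorded_rate)   # 0.999000999... -- same ~0.1% slowdown as picture
print(recorded_rate / played_rate)   # 1.001 -- the familiar NTSC ratio
```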


Film transfer

The objective of a film-to-tape transfer session is to color-correct the image, sync the sound and provide a tape and metadata for the editor. Sessions are typically booked as “unsupervised” (no client or DP looking over the colorist’s shoulder) or “supervised” (you are there to call the shots). The latter costs more and always takes more time. Unsupervised sessions are generally considered to be “one-light” or “best-light” color correction sessions. In a true one-light session, the telecine is set up to a standard reference film loop and your footage is transferred without adjustment, based on that reference. During a best-light session, the colorist will do general, subjective color correction to each scene based on his eye and input from the DP.

Truthfully, most one-light sessions today are closer to a best-light session than a true one-light. Few colorists are going to let something that looks awful go through, even if it matches a reference set-up. The best procedure is for the DP to film a few seconds of a Macbeth and a Grayscale chart as part of each new lighting set-up, which can be used by the colorist as a color-correction starting point. This provides the colorist with an objective reference relative to the actual lighting and exposure of that scene as intended by the DP.

Most labs will prep film negative for transfer by adding a countdown leader to a camera roll or lab roll (several camera rolls spliced together). They may also punch a hole in the leader (usually on the “picture start” frame or in the first slate). During transfer, it is common for the colorist to start each camera roll with a new timecode hour. The :00 rollover of that hour typically coincides with this hole punch. The average 35mm camera roll constitutes about 10-11 minutes of footage, so an hour-long video tape film transfer master will contain about five full camera rolls. The timecode would ascend from 1:00:00:00 up through 5:00:00:00 – a new hour value starting each new camera roll. A sync reference, like a hole-punched frame, corresponds to each new hour value at the :00 rollover. The second videotape reel would start with 6:00:00:00 and so on.

Many transfer sessions will also include the simultaneous syncing of the double-system audio. This depends on how the sound was recorded (Nagra, DAT or digital file) and the gear available at the facility. Bear in mind that when sound has to be manually synced by the colorist for each take – especially if this is by manually matching a slate with an audible clap – then the film-to-tape transfer session is going to take longer. As a rule of thumb, MOS (picture-only), one-light transfer sessions take about 1.5 to 2 times the running length of the footage. That’s because the colorist can do a basic set-up and let a 10-minute camera roll transfer to tape without the need to stop and make adjustments or sync audio. Adding sound syncing and client supervision often means the length of the session will increase by a factor of 4x or 5x.

The procedure for transferring film-to-tape is a little different for features versus a television commercial or a show. When film is transferred for a feature film, it is critical that a lot of metadata be included to facilitate the needs of a DI or cutting negative at the end of the line. I won’t go into that here, because it tends to be very specialized, but the information tracked includes audio and picture roll numbers, timecode, film keycode and scene/take information. This data is stored in a telecine log known as a FLEX file. This is a tab-delimited text file, which is loaded by the editor into a database used by the NLE. It becomes the basis for ingesting footage and is used later as a cross-reference to create various film lists for negative cutting from the edited sequences.
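To give a feel for how an assistant might sanity-check such a log before loading it, here’s a minimal Python sketch. The file name and column names are hypothetical stand-ins – a real FLEX file has its own record layout – so treat this as a sketch of the idea of a tab-delimited telecine log, not a parser for the actual format.

```python
import csv

# Hedged sketch: read a simplified, tab-delimited telecine log and index it.
# The field names below are illustrative only -- not the real FLEX record layout.
def load_telecine_log(path):
    events = []
    with open(path, newline="") as f:
        reader = csv.DictReader(f, delimiter="\t")
        for row in reader:
            events.append({
                "cam_roll": row["cam_roll"],
                "keycode":  row["keycode"],
                "timecode": row["timecode_in"],
                "scene":    row["scene"],
                "take":     row["take"],
                "snd_roll": row["sound_roll"],
            })
    return events

if __name__ == "__main__":
    log = load_telecine_log("dailies_day01.txt")        # hypothetical file name
    print(sorted({e["cam_roll"] for e in log}))          # every camera roll referenced
```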

If your use of film is for a commercial or TV show, then it’s less critical to track as much metadata. TV shows generally rely on tape-to-tape (or inside the NLE) color correction and will almost never return to the film negative. You still want to “protect” for a negative cut, however, so you still need to track the film information. It’s nice to have the metadata as a way to go back to the film if you ever have to. Plus, some distributors still require cut negative or at least the film lists.

It’s more important that the film be transferred with a set-up that lends itself to proper color grading in post. This means that the initial transfer is going to look a bit flatter without any clipped highlights or crushed blacks. Since each show has its own unique workflow, it is important that the editors, post supervisor and dailies colorists are all on the same page. For instance, they might not want each camera roll to start with a new hour code. Instead, they might prefer to have each videotape reel stick with consistent ascending timecode. In other words, one hour TC value per videotape reel, so you know that 6:00:00:00 is going to be the start of videotape reel 6, and not film camera reel 6 / videotape reel 2, as in my earlier example.

Communication and guidelines are essential. It’s worth noting that the introduction of digital intermediate (DI) mastering for feature films has muddied the waters. Many DI workflows no longer rely on keycode as a negative cut would. Instead, they have adopted a workflow not unlike the spot world, which I describe in the next section. Be sure to nail down the requirements before you start. Cover all the bases, even if there are steps that everyone assumes won’t be used. In the end, that may become a real lifesaver!

The spot world

I’m going to concentrate on the commercial spot world, since many of the readers are more likely to work there than in the rarefied world of films and film-originated TV shows. Despite the advances of nonlinear color grading, most ad agencies still prefer to retransfer from the film negative when finishing the commercial.

This is the typical workflow:

–       Transfer a one-light to a video format for offline editing, like DVCAM

–       Offline edit with your NLE of choice

–       Generate transfer lists for the colorist based on the approved cut

–       Retransfer (supervised correction) selects to Digibeta or HD for finishing

–       Online editing/finishing plus effects

In this world, different labs, transfer facilities and editorial shops are often used for each of these steps. Communication is critical. In many cases the director and DP may not be involved in the transfer and editing stages of the project, so the offline editor frequently plays the role of a producer. This is how spot editors worked in the film days and how many of the top commercial cutters still work today in New York, LA, Chicago or London.

In the first two steps, the objective is to get all of the footage that was shot ready to edit in the least time-consuming and most inexpensive manner possible. No time wasted in color-correction or using more expensive tape formats just to make creative decisions. The downside to this approach is that the client sometimes sees an image that isn’t as good as it could be (and will be in the end). This means the editor might have to do some explaining or add some temporary color-correction filters, just so the client understands the potential.

When the offline editing is done, the editor must get the correct info to the colorist who will handle the retransfer of the negative. For example, if each camera roll used a different hour digit, it will be important for the editor to know – and to relay – the correct relationship between camera rolls and timecode starts. For instance, if a hole punch was not used, then does 1:00:00:00 match “picture start” on the camera one leader? Does it match the 2-pop on the countdown? Does it match the first frame of the slate?

When film negative is retransferred, the colorist will transfer only the shots used in the finished cut of the commercial. Standard procedure is to transfer the complete shot “flash-to-flash”. In other words, from the start to the end of exposure on that shot. If it’s too long – as in an extended recording with many takes – then the colorist will transfer the shot as cut into the spot, plus several seconds of “handles”. This is almost always a client-supervised session and it can easily take 6-8 hours to work through the 40-50 shots that make up a fast paced spot.

The reason it’s important to know how the timecode corresponds to the original transfer is because the colorist will use these same values in the retransfer. The colorist will line up camera roll one to a start frame that matches 1:00:00:00. If a shot starts at 1:05:10:00, then the colorist will roll down to that point, color-correct the shot and record it to tape with the extra handle length. Colorists will work in the ascending scene order of the source camera rolls – not in the order that these shots occur in the edited sequence. This is done so that film negative rolls are shuttled back and forth as little as possible.

As shots are recorded to videotape, matching source timecode will be recorded to the video master. As a result, the videotape transfer master will have ascending timecode values, but the timecode will not be contiguous. The numbers will jump between shots. During the online editing (finishing) session, the new footage will be batch-captured according to the shots in the edited sequence, so it’s critical that the retransferred shots match the original dailies as frame-accurately as possible. Otherwise the editor would be forced to match each shot visually! Therefore, it’s important to have a sufficient amount of footage before and after the selected portion of the shot, so that the VTR can successfully cue, preroll and be ingested. If all these steps are followed to the letter, then the online edit (or the “uprez” process) will be frame-accurate compared with the approved rough cut of the spot.

To make sure this happens smoothly, you need to give the colorist a “C-mode” list. This is an edit decision list that is sorted in the ascending timecode order of the source clips. This sort order should correspond to the same ascending order of shots as they occur on the camera rolls. Generating a proper C-mode EDL in some NLEs can be problematic, based on how they compute the information. Final Cut is especially poor at this. A better approach is to generate a log-style batch list. The colorist doesn’t use these files in an electronic fashion anyway, so it doesn’t matter if it’s an EDL, a spreadsheet, a hand-written log or a PDF. One tactic I take in FCP is to duplicate the sequence and strip out all effects, titles and audio from the dupe. Next, I copy & paste the duped sequence to a new, blank bin, which creates a set of corresponding subclips. These can be sorted and exported as a batch list. The batch list, in turn, can be further manipulated. You may add color correction instructions, reference thumbnail images and so on.
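Whatever form the list takes, the sort itself is simple. Here’s a minimal Python sketch of the C-mode idea – the clip data and field names are made up for illustration: group by camera roll (the TC hour, in my earlier example), then sort by ascending source timecode within each roll.

```python
# Hedged sketch: sort a cut list into C-mode (source reel, then ascending source TC).
def tc_to_frames(tc, fps=30):
    h, m, s, f = (int(x) for x in tc.split(":"))
    return ((h * 60 + m) * 60 + s) * fps + f

# Hypothetical shots pulled from an edited sequence.
shots = [
    {"reel": "CR02", "src_in": "02:04:12:10", "clip": "sc12_tk3"},
    {"reel": "CR01", "src_in": "01:05:10:00", "clip": "sc03_tk1"},
    {"reel": "CR01", "src_in": "01:02:44:15", "clip": "sc01_tk4"},
    {"reel": "CR02", "src_in": "02:01:03:05", "clip": "sc09_tk2"},
]

c_mode = sorted(shots, key=lambda s: (s["reel"], tc_to_frames(s["src_in"])))
for s in c_mode:
    print(s["reel"], s["src_in"], s["clip"])
```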

Once I get the tape back from the retransfer session, I will Media Manage (FCP) or Decompose (Avid) the sequence to create a new offline sequence. These clips can then be batch-captured for the final sequence with full-quality video (also called “uprezzing”). In some cases, FCP’s Media Manager has let me down and I’ve had to resort to exporting an EDL and using that as a basis for the batch capture. EDLs have proven to be pretty bullet-proof in the spot world.

Even though digital is where it’s at – or so I’ve heard – film will be here for years. So don’t forget how to work with it. If you’ve never had to work with it, there’s no time like the present to learn. Your day will come soon.

©2009 Oliver Peters


Blackmagic Design UltraScope


Blackmagic Design’s UltraScope gained a lot of buzz at NAB 2009. In a time when fewer facilities are spending precious budget dollars on high-end video and technical monitors, the UltraScope seems to fit the bill for a high-quality, but low-cost waveform monitor and vectorscope. It doesn’t answer all needs, but if you are interested in replacing that trusty NTSC Tektronix, Leader or Videotek scope with something that’s both cost-effective and designed for HD, then the UltraScope may be right for you.

The Blackmagic Design UltraScope is an outgrowth of the company’s development of the DeckLink cards. Purchasing UltraScope provides you with two components – a PCIe SDI/HD-SDI input card and the UltraScope software. These are installed into a qualified Windows PC with a high-resolution monitor and, in total, provide a multi-pattern monitoring system. The PC specs are pretty loose. Blackmagic Design has listed a number of qualified systems on their website, but like most companies, these represent products that have been tested and known to work – not all the possible options that, in fact, will work. Stick to the list and you are safe. Pick other options and your mileage may vary.

Configuring your system

The idea behind UltraScope is to end up with a product that gives you high-quality HD and SD monitoring, but without the cost of top-of-the-line dedicated hardware or rasterizing scopes. The key ingredients are a PC with a PCIe bus and the appropriate graphics display card. The PC should have an Intel Core 2 Duo 2.5GHz processor (or better) and run Windows XP or Vista. Windows 32-bit and 64-bit versions are supported, but check Blackmagic Design’s tech specs page for exact details. According to Blackmagic Design, the card has to incorporate the OpenGL 2.1 (or better) standard. A fellow editor configured his system with an off-the-shelf card from a computer retailer for about $100. In his case, a Diamond-branded card using the ATI 4650 chipset worked just fine.

You need the right monitor for the best experience. Initial marketing information specified 24” monitors. In fact, the requirement is to be able to support a 1920×1200 screen resolution. My friend is using an older 23” Apple Cinema Display. HP also makes some monitors with that resolution in the 22” range for under $300. If you are prepared to do a little “DIY” experimentation and don’t mind returning a product to the store if it doesn’t work, then you can certainly get UltraScope to work on a PC that isn’t on Blackmagic Design’s list. Putting together such a system should cost under $2,000, including the UltraScope and monitor, which is well under the price of the lowest-cost competitor.

Once you have a PC with UltraScope installed, the rest is pretty simple. The UltraScope software is simply another Windows application, so it can operate on a workstation that is shared for other tasks. UltraScope becomes the dominant application when you launch it. Its interface hides everything else and can’t be minimized, so you are either running UltraScope or not. As such, I’d recommend using a PC that isn’t intended for essential editing tasks, if you plan to use UltraScope fulltime.

Connect your input cable to the PCIe card and whatever is being sent will be displayed in the interface. The UltraScope input card can handle coax and fiber optic SDI at up to 3Gb/s and each connection offers a loop-through. Most, but not all, NTSC, PAL and HD formats and frame-rates are supported. For instance, 1080p/23.98 is supported but 720p/23.98 is not. The input is auto-sensing, so as you change project settings or output formats on your NLE, the UltraScope adjusts accordingly. No operator interaction is required.

The UltraScope display is divided into six panes that display parade, waveform, vectorscope, histogram, audio and picture. The audio pane supports up to 8 embedded SDI channels and shows both volume and phase. The picture pane displays a color image and VITC timecode. There’s very little to it beyond that. You can’t change the displays or rearrange them. You also cannot zoom, magnify or calibrate the scope readouts in any way. If you need to measure horizontal or vertical blanking or where captioning is located within the vertical interval, then this product isn’t for you. The main function of the UltraScope is to display levels for quality control monitoring and color correction and it does that quite well. Video levels that run out of bounds are indicated with a red color, so video peaks that exceed 100 change from white to red as they cross over.

Is it right for you?

The UltraScope is going to be more useful to some than others. For instance, if you run Apple Final Cut Studio, then the built-in software scopes in Final Cut Pro or Color will show you the same information and, in general use, seem about as accurate. The advantage of UltraScope for such users is the ability to check levels at the output of any hardware i/o card or VTR, not just within the editing software. If you are an Avid editor, then you only have access to built-in scopes when in the color correction mode, so UltraScope is of greater benefit.

My colleague’s system is an Avid Media Composer equipped with Mojo DX. By adding UltraScope he now has fulltime monitoring of video waveforms, which is something the Media Composer doesn’t provide. The real-time updating of the display seems very fast without lag. I did notice that the confidence video in the picture pane dropped a few frames at times, but the scopes appeared to keep up. I’m not sure, but it seems that Blackmagic Design has given preference in the software to the scopes over the image display, which is a good thing. The only problem we encountered was audio. When the Mojo DX was supposed to be outputting eight discrete audio channels, only four showed up on the UltraScope meters. As we didn’t have an 8-channel VTR to test this, I’m not sure if this was an Avid or Blackmagic Design issue.

Since the input card takes any SDI signal, it also makes perfect sense to use the Blackmagic Design UltraScope as a central monitor. You could assign the input to the card from a router or patch bay and use it in a central machine room. Another option is to locate the computer centrally, but use Cat5-DVI extenders to place a monitor in several different edit bays. This way, at any given time, one room could use the UltraScope, without necessarily installing a complete system into each room.

Future-proofed through software

It’s important to remember that this is a 1.0 product. Because UltraScope is software-based, features that aren’t available today can easily be added. Blackmagic Design has already been doing that over the years with its other products. For instance, scaling and calibration aren’t there today, but if enough customers request them, they might show up in the next version as a simple downloadable update.

Blackmagic Design UltraScope is a great product for the editor that misses having a dedicated set of scopes, but who doesn’t want to break the bank anymore. Unlike hardware units, a software product like UltraScope makes it easier than ever to update features and improve the existing product over time. Even if you have built-in scopes within your NLE, this is going to be the only way to make sure your i/o card is really outputting the right levels, plus it gives you an ideal way to check the signal on your VTR without tying up other systems. And besides… What’s cooler to impress a client than having another monitor whose display looks like you are landing 747s at LAX?

©2009 Oliver Peters

Written for NewBay Media LLC and DV magazine

What’s wrong with this picture?


“May you live in interesting times” is said to be an ancient Chinese curse. That certainly describes modern times, but no more so than in the video world. We are at the intersection of numerous transitions: analog to digital broadcast; SD to HD; CRTs to LCD and plasma displays; and tape-based to file-based acquisition and delivery. Where the industry had the chance to make a clear break with the past, it often chose to integrate solutions that protected legacy formats and infrastructure, leaving us with the bewildering options that we know today.


Broadcasters settled on two standards: 720p and 1080i. These are both full-raster, square pixel formats: 1280x720p/59.94 (60 progressive frames per second in NTSC countries) – commonly known as “60P” – and 1920x1080i/59.94 (60 interlaced fields per second in NTSC countries) – commonly known as “60i”. The industry has wrestled with interlacing since before the birth of NTSC.


Interlaced scan


Interlaced displays show a frame as two sequential sets of alternating odd and even-numbered scan lines. Each set is called a field and occurs at 1/60th of a second, so two fields make a single full-resolution frame. Since the fields are displaced in time, one frame with fast horizontal motion will appear like it has serrated edges or horizontal lines. That’s because odd-numbered scan lines show action that occurred 1/60th of a second apart from the even-numbered, adjacent scan lines. If you routinely move interlaced content between software apps, you have to be careful to maintain proper field dominance (whether edits start on field 1 or field 2 of a frame) and field order (whether a frame is displayed starting with odd or even-numbered scan lines).


Progressive scan


A progressive format, like 720p, displays a complete, full-resolution frame for each of 60 frames per second. All scan lines show action that was captured at the exact same instance in time. When you combine the spatial with the temporal resolution, the amount of data that passes in front of a viewer’s eyes in one second is essentially the same for 1080i (about 62 million pixels) as for 720p (about 55 million pixels).
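Those per-second figures fall straight out of the raster math. A quick Python sketch of the arithmetic, just to show where the numbers come from:

```python
# Hedged sketch: pixels delivered per second for the two broadcast HD formats.
px_1080i = 1920 * 1080 * (59.94 / 2)   # two fields make one full frame
px_720p  = 1280 * 720 * 59.94

print(round(px_1080i / 1e6, 1))        # ~62.1 million pixels per second
print(round(px_720p / 1e6, 1))         # ~55.2 million pixels per second
```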


Progressive is ultimately a better format solution from the point-of-view of conversions and graphics. Progressive media scales more easily from SD to HD without the risk of introducing interlace errors that can’t be corrected later. Graphic and VFX artists also have a better time with progressive media and won’t have issues with proper field order, as is so often the case when working with NTSC or even 1080i. The benefits of progressive media apply regardless of the format size or frame rate, so 1080p/23.98 offers the same advantages.


Outside of the boundary lines


Modern cameras, display systems and NLEs have allowed us to shed a number of boundaries from the past. Thanks to Sony and Laser Pacific, we’ve added 1920x1080psf/23.98. That’s a “progressive segmented frame” running at the video-friendly rate of 23.98 for 24fps media. PsF is really interlacing, except that at the camera end, both fields are captured at the same point in time. PsF allows the format to be “superimposed” onto an otherwise interlaced infrastructure with less impact on post and manufacturing costs.


Tapeless cameras have added more wrinkles. A Panasonic VariCam records to tape at 59.94fps (60P), even though you are shooting with the camera set to 23.98fps (24P). This is often called 24-over-60. New tapeless Panasonic P2 camcorders aren’t bound by VTR mechanisms and can record a file to the P2 recording media at any “native” frame rate. To conserve data space on the P2 card, simply record at the frame rate you need, like 23.98pn (progressive, native) or 29.97pn. No need for any redundant frames (added 3:2 pulldown) to round 24fps out to 60fps as with the VariCam.


I’d be remiss if I didn’t address raster size. At the top, I mentioned full-raster and square pixels, but the actual video content recorded in the file cheats this by changing the size and pixel aspect ratio as a way of reducing the data rate. This will vary with codec. For example, DVCPRO HD records at a true size of 960×720 pixels, but displays as 1280×720 pixels. Proper display sizes of such files (as compared with actual file sizes) are controlled by the NLE software or a media player application, like QuickTime.
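The display size falls out of the pixel aspect ratio. A small sketch of that math for the DVCPRO HD example above (my own illustration of the scaling, nothing more):

```python
# Hedged sketch: anamorphic storage size vs. square-pixel display size.
stored_w, stored_h = 960, 720          # what DVCPRO HD actually records (720p)
display_w          = 1280              # what it is meant to display as

pixel_aspect = display_w / stored_w
print(pixel_aspect)                              # 1.333... -- each stored pixel shown 1.33x wider
print(int(stored_w * pixel_aspect), stored_h)    # 1280 720
```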


Mixing it up


Editors routinely have to deal with a mix of frame rates, image sizes and aspect ratios, but ultimately this all has to go to tape or distribution through the funnel of the two accepted HD broadcast formats (720p/59.94 and 1080i/59.94). PLUS good old fashioned NTSC and/or PAL. For instance, if you work on a TV or film project being mastered at 1920x1080p/23.98, you need to realize several things: few displays support native 23.98 (24P) frame rates. You will ultimately have to generate not only a 23.98p master videotape or file, but also “broadcast” or “air” masters. Think of your 23.98p master as a “digital internegative”, which will be used to generate 1080i, 720p, NTSC, PAL, 16×9 squeezed, 4×3 center-cut and letterboxed variations.


Unfortunately your NLE won’t totally get you there. I recently finished some spots in 1080p/23.98 on an FCP system with a KONA2 card. If you think the hardware can convert to 1080i output, guess again! Changing FCP’s Video Playback setting to 1080i is really telling the FCP RT engine to do this in software, not in hardware. The ONLY conversions done by the KONA hardware are those available in the primary and secondary format options of the AJA Control Panel. In this case, only the NTSC downconversion gets the benefit of hardware-controlled pulldown insertion.


OK, so let FCP do it. The trouble with that idea is that yes, FCP can mix frame rates and convert them, but it does a poor job of it. Instead of the correct 2:3:2:3 cadence, FCP uses the faster-to-calculate 2:2:2:4. The result is an image that looks like frames are being dropped, because the fourth frame is always being displayed twice, resulting in a noticeable visual stutter. In my case, the solution was to use Apple Compressor to create the 1080i and 720p versions and to use the KONA2’s hardware downconversion for the NTSC Beta-SP dubs. Adobe After Effects also functions as a good, software conversion tool.


Another variation to this dilemma is the 720pn/29.97 (aka 30PN) of the P2 cameras. This is an easily edited format in FCP, but it deviates from the true 720p/59.94 standard. Edit in FCP with a 29.97p timeline, but when you change the Video Playback setting to 59.94, FCP converts the video on-the-fly to send a 60P video stream to the hardware. FCP is adding 2:2 pulldown (doubling each frame) to make the signal compliant. Depending on the horsepower of your workstation, you may, in fact, lower the image resolution by doing this. If you are doing this for HD output, it might actually be better to convert or render the 29.97p timeline to a new 59.94p sequence prior to output, in order to maintain proper resolution.


Converting to NTSC


But what about downconversion? Most of the HD decks and I/O cards you buy have built-in downconversion, right? You would think they do a good job, but when images are really critical, they don’t cut it. Dedicated conversion products, like the Teranex Mini, do a far better job in both directions. I delivered a documentary to HBO and one of the items flagged by their QC department was the quality of the credits in the downconverted (letterboxed) Digital Betacam back-up master. I had used rolling end credits on the HD master, so I figured that changing the credits to static cards and bumping up the font size a bit would make it a lot better. I compared the converted quality of these new static HD credits through FCP internally, through the KONA hardware and through the Sony HDW-500 deck. None of these looked as crisp and clean as simply creating new SD credits for the Digital Betacam master. Downconverted video and even lower-third graphics all looked fine on the SD master – just not the final credits.


The trouble with flat panels


This would be enough of a mess without display issues. Consumers are buying LCDs and plasmas. CRTs are effectively dead. Yet CRTs are the only devices that properly display interlacing – especially if you are troubleshooting errors. Flat panels all go through conversions and interpolation to display interlaced video in a progressive fashion. Going back to the original 720p versus 1080i options, I really have to wonder whether the rapid technology change in display devices was properly forecast. If you shoot 1080p/23.98, this often gets converted to a 1080i/59.94 broadcast master (with added 3:2 pulldown) and is transmitted to your set as a 1080i signal. The set converts the signal. That’s the best-case scenario.


Far more often, the production company, network and local affiliate haven’t adopted the same HD standard. As a result, there may be several 720p-to-1080i and/or 1080i-to-720p conversions that happen along the way. To further complicate things, many older consumer sets are native 720p panels and scale a 1080 image. Many include circuitry to remove 3:2 pulldown and convert 24fps programs back to progressive images. This is usually called the “film” mode setting. It generally doesn’t work well with mixed-cadence shows or rolling/crawling video titles over film content.


The newest sets are 1080p, which is a totally bogus marketing feature. These panels are designed for video game playback, not TV signals, which are simply frame-doubled. All of this mish-mash – plus the heavy digital compression used in transmission – makes me marvel at how bad a lot of HD signals look in retail stores. I recently saw a clip from NBC’s Heroes on a large 1080p set at a local Sam’s Club. It was far more pleasing to me on my 20” Samsung CRT at home, received over analog cable, than on the big 1080p digital panel.


Progress (?) marches on…


We can’t turn back time, of course, but my feeling about displays is that a 29.97p (30P) signal is the “sweet spot” for most LCD and plasma panels. In fact, 720p on most of today’s consumer panels looks about the same as 1080i or 1080p. When I look at 23.98 (24P) content as 29.97 (24p-over-60i), it looks proper to my eyes on a CRT, but a bit funky on an LCD display. On the other hand, 29.97 (30P) strobes a bit on a CRT, but appears very smooth on a flat panel. Panasonic’s 720p/59.94 looks like regular video on a CRT, but 720p recorded as 30p-over-60p looks more film-like. Yet both signals actually look very similar on a flat panel. This is likely due to the refresh rates and image latency of an LCD or plasma panel as compared to a CRT. True 24P is also fine if your target is the web. As a web file it can be displayed as true 24fps without pulldown. Remember that as video, though, many flat panels cannot display 23.98 or 24fps frame rates without pulldown being added.


Unfortunately there is no single, best solution. If your target distribution is for the web or primarily to be viewed on flat panel display devices (including projectors), I highly recommend working strictly in a progressive format and a progressive timeline setting. If interlacing is involved, then make sure to deinterlace these clips or even the entire timeline before your final delivery. Reserve interlaced media and timelines for productions that are intended predominantly for broadcast TV using a 480i (NTSC) or 1080i transmission.


By now you’re probably echoing the common question, “When are we going to get ONE standard?” My answer is that there ARE standards – MANY of them. This won’t get better, so you can only prepare yourself with more knowledge. Learn what works for your system and your customers and then focus on those solutions – and yes – the necessary workarounds, too!


Does your head hurt yet?


© 2009 Oliver Peters


Dealing with a post facility


The do-it-yourself filmmaker might view the traditional lab or post facility as a place of last resort. That belief stems from a fear that – like a visit to a doctor or lawyer – every minute is billable. Most finishing facilities are actually easy to deal with and have the producer’s best interests at heart. They have been forging new workflows with file-based formats and are often the best-equipped to give a producer the desired creative and technical result.


Reasons to rely on outside resources include various advanced post services, like conforming a project for higher-quality or higher-resolution deliverables, color grading and the production of digital intermediate masters. Sometimes, clients simply don’t know where to start, what to ask, or what’s expected of them. I posed some of these questions to a roundtable of post professionals, including Terence Curren, owner of Alphadogs (Burbank), Mike Most, chief technologist at Cineworks (Miami), Brian Hutchings, freelance colorist (Los Angeles) and Peter Postma, US product manager for Filmlight.


OP: Many clients don’t realize that post houses may offer up-front consultation as part of their sales effort. How do you approach that?


CURREN: We absolutely offer that service! Any post house that has the client’s welfare in mind does this. We find ourselves consulting on everything from cameras and recording formats, to file naming conventions. Every system has issues. We handle both FCP and Avid extensively and are familiar with a lot of potential pitfalls on either platform. When a client contacts us in advance, we can help to steer them clear of problems with their intended workflow. That can save them a lot of money in the long run.


HUTCHINGS: As a freelance colorist, I take on the roles of educator and salesman. Clients are increasingly making the transition from daVinci sessions to [Apple] Color. I do color timing on daVinci, Final Cut Studio and Avid systems and don’t have a financial interest in any particular piece of gear. Therefore, I can give a fairly unbiased opinion on the different paths available.


MOST: Clients these days often try to self educate. They read a lot on the Internet or elsewhere, or talk to others who have used the equipment they’re planning to use. Sometimes the knowledge they gain is accurate and useful, but often it’s either inaccurate or based on production or post conditions that differ from theirs. We try to steer them in a direction, so that what they do, how they do it, and the formats they use, flow easily into the finishing steps that we provide. Basically, we try to minimize surprises and make the process smoother, more efficient, and in many cases, more economical.


OP: What should the producer be prepared to supply for an online edit or a DI conform?


MOST: In general – if there is still such a thing – we need the original materials, an EDL, some visual copy of their offline cut as a reference, title copy (or graphics files, if they’ve created their own) and some idea as to how they’re finishing the sound. If the program is cut on an Avid, it’s useful to receive a bin with the final sequence in addition to a traditional EDL. Many less-experienced Final Cut editors use techniques, such as nesting, speed effects and other visual embellishments, which do not translate to an EDL in any kind of useful form. So with Final Cut, it helps to have a copy of the project file.


CURREN: Mike has covered the bases; however, with the new file-based formats that offer higher resolutions at smaller file sizes, we often receive the project with the media already attached. In this case our work starts by fine-tuning effects, adding any graphics, color correcting and the final mix of the audio. This saves money in the capture time and in double-checking against the offline for accuracy.


MOST: I do find that many users of newer formats, such as RED, are very confused about what they do and do not have to deliver to us to achieve the best quality with the least difficulty. They do a lot of work themselves to create elements that serve no purpose for us. This actually lengthens the amount of time it takes us to complete the conform. Hopefully in the future, there will be more communication prior to picture lock between clients and finishing facilities and much less bad guesswork.


OP: What are your guidelines for preparing the media and timelines before you start? How much time should be allowed for finishing and color grading?


CURREN: Our process is to re-import any files, then recapture any media from tape. With older analog formats like Betacam, we will actually ride levels on recapture to avoid any clipping of the video, which cannot be retrieved later in the process. Generally speaking, we figure about 100 clips captured per hour on Avid and about 90 on FCP. The more clips in a show, the longer this process takes. We will check the new timeline shot-by-shot against a copy of the offline output to verify that it is correct, in sync and that the effects translated properly. Next comes the color correction pass, titling and graphics. At this point we will watch the show with the client and then address any notes.


POSTMA: A commercial can be done in a day, though several days may be used for creative reasons. A feature film – including scanning, color correction and recording – can be done in three weeks. Again, it may be longer if you want to spend more time making creative color-correction decisions.


CURREN: An important side note about color correction needs to be made here. There are really three parts. Part one is to make it legal for whatever your distribution is going to be. Part two is to make it even, meaning all the shots in a given scene should look like they belong together. The first two steps are fairly objective. Part three is purely subjective. That’s where the magic can take place in color correction. Giving a slight green tint to a scary scene or a slight blue tint to two lovers silently arguing are examples of subjective choices. The creative part of the process can take a long time if allowed.


MOST: I can speak more to the feature film side of this question, because the time factors – once the conform is complete – are usually dictated by things like budgets. For a feature shot on film, we usually allocate 3-5 days to scanning (perhaps a day or two less for file restoration on a file-based electronic feature), 2-3 days to conform, 5-10 days for color correction, 1-2 days to do final renders and create the HD master, and about 5-7 days to do a film recording. All of those time factors can vary in either direction, depending on editorial complication, show length, creative choices, and, once again, budget.


OP: How do you handle grading of the same project for TV, digital cinema projection and film prints?


CURREN: Many post houses are claiming they are DI houses, but shouldn’t be. The trick with DI is to have tight control over the entire process, including the film lab. If you don’t, there are too many places where things can go wrong. Most of our work at Alphadogs is grading for television. We don’t claim to be a DI house. When we do feature work and the client plans to do a film-out, we will color correct the same way as for TV, but avoid clipping whites or crushing blacks. Afterwards, the client takes it to the lab they have chosen for a film-out, where a final scene-by-scene color pass is done. They save money by not having to color-grade every shot, since the scenes are already evened out.


MOST: Cineworks has a DI theater that’s specifically calibrated to Digital Cinema (P3) color space. We use a film print preview lookup table for projects going to film. During the session we’re basically looking at a preview of the eventual film print. The files are created in 10-bit log, film print density color space, and are used directly by a film recorder. We then use various custom lookup tables, along with minor color tweaks, to derive all other deliverables from those same files. The look remains consistent across all platforms. We usually generate an HD video version, which is then used for all other video deliverables – HD24, HD25, NTSC, PAL, full frame, letterbox, etc.


POSTMA: Filmlight’s Truelight color management system handles these conversions, so a DI facility that uses it should only need to color correct once and Truelight will handle the color conversion to the other spaces. It usually makes sense to color correct for the medium with the most color range (film or digital cinema) and then downconvert to video, web, etc. There may be some different creative decisions you’d like to make for the more limited mediums of video or the web. In that case, you can do a trim pass to tweak a few shots, but the Truelight color conversion should get you 90% of the way there.


OP: Should a producer worry about various camera color spaces, such as Panalog, REDlog or the cine-gamma settings in Sony or Panasonic cameras?


CURREN: This is a great reason to talk to post first. I’m a fan of leaving things in the native space through to your final finish; however, that can make for a very flat looking offline, which is disturbing to many folks. If so, you might need two versions of the files or tapes. One version – the master – should stay in the native space. The second – the offline editorial working files – should be changed over to video (601 or 709) space.


MOST: Color space issues should be for finishing facilities to deal with, but the use of custom gamma curves in electronic cameras presents some educational issues for shooters. We usually try to discuss these during a pre-production meeting, but they primarily affect dailies creation. For finishing, we can deal with all of these color spaces without much of a problem.


OP: If the intended distribution is 2K or 1920×1080 HD, should the producer be concerned about image sequence files (DPX, TIFF, etc.)?


MOST: No, not unless that’s the way the program is being recorded – as with an S.two or Codex recorder. It’s easier for editors to deal with wrapped movie files, QuickTime in the case of Final Cut or OMF and MXF in the case of Avid. We use the original material – in whatever form it was captured – for finishing. With film, of course, that’s obvious; but, for RED, we work directly from the .r3d files in our Assimilate SCRATCH system. That gives us access to all of the information the camera captured.


CURREN: DPX files hog a tremendous amount of storage space. If you capture digitally, with the RED camera, for instance, why not stay with the native RED codec? You won’t gain any quality by converting to DPX, but you will have to bake in a look, limiting your color correction range later in the process.


OP: Who should attend the sessions?


MOST: For conforming, nobody needs to physically be there, but the editor or assistant editor needs to be available for any questions that come up. For color correction, we really want the director of photography to be with us, as the one who is really responsible for the look of the picture.


POSTMA: Cinematographer and director. You definitely don’t want too many people in the room or you can burn up time in a very expensive suite making decisions by committee.


CURREN: Who is signing the check? I’m not trying to be cheeky; it’s just that someone makes the final decisions, or they have authorized someone to make the final decisions. That is who should be there at the end. For feature work, often the DP will get a pass at the color correction. In this case, it is wise for the producer to set some guidelines. The DP is going to try to make his stuff look the best he can, which is what he should want. The colorist also wants the best look they can achieve. There is an old saying that applies here: “No film is ever done, they just run out of time or money.” There has to be a clear understanding of where the cut-off point is. When is it good enough? Without that direction, the DP and the colorist can rack up quite a bill.


©2009 Oliver Peters

Originally written for DV Magazine (NewBay Media, LLC)


Compression Tips For The Web

One of the many new disciplines editors have to know is how to properly compress and encode video for presentations on the Internet or as part of CD-ROMs. Often this may be for demo reels or client approval copies, but it could also be for final presentations within PowerPoint, Director or another presentation application. The objective is to get the encoded program down to the smallest file size while maintaining as much of the original quality as possible.


Everyone has their own pet software or player format to recommend, but the truth of the matter is that it is unlikely that you will encode your video into a format that absolutely everyone can read without needing to download and install an additional player. The most common player formats include QuickTime, Windows Media, Real Player, Flash and the embedded media player that AOL bundles into their own software. Within each of these, there are also codec and size options that vary depending on how current a version you are targeting.


Modern formats, such as MPEG 4, Windows Media 9, QuickTime with Sorenson 3 and others may look great, but they frequently only run on the newest versions of these players. If your client has an older Windows 98 PC or an OS 9 Mac, it’s doubtful that they can play the latest and greatest software. You should also be aware that not all encoded results are equal. Some formats look awesome at standard medium-to-large video sizes, but don’t look good at all when you get down to a really small window size. The opposite is also true. Here are some guidelines that will let you target the largest possible audience.


Size and frame rate


The first thing to tackle when encoding for the web is the image size and frame rate. Standard definition video is 720 x 486 (480 for DV) pixels (rectangular aspect), which equates to a web size of 640 x 480 pixels (square aspect). This is considered a “large” window size for most web pages. Scaling the image down reduces the file size, so commonly used smaller sizes are 320 x 240 (“medium”), 192 x 144 and 160 x 120 (“small”). These sizes aren’t absolute. For instance, if your finished program is letterboxed, why waste file size on the black top and bottom bars? If your encoding software permits cropping, you could export these files in other sizes, such as 300 x 200 or 160 x 90 pixels. Another way to reduce the file size is to reduce the frame rate. Video runs at 29.97 fps but due to the progressive display and refresh rates of computer CRTs and flat panels, there is often little harm done in cutting this down to 15 fps or sometimes even 10 fps or lower.
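Here’s a rough sketch of why those two knobs matter so much. The numbers are uncompressed, purely to show the proportions before any codec does its work:

```python
# Hedged sketch: relative data reduction from scaling and frame-rate reduction,
# before any compression is applied.
def raw_rate(width, height, fps, bytes_per_pixel=3):
    return width * height * fps * bytes_per_pixel

full  = raw_rate(640, 480, 29.97)     # "large" window at full video frame rate
small = raw_rate(320, 240, 15)        # "medium" window at half the frame rate

print(round(full / small, 1))         # ~8x less raw data before the codec even starts
```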


Reducing the image size and frame rate is a matter of juggling the reduction of file size with playback that is still easily viewed and doesn’t lose the message you are trying to convey. If you are encoding for a CD-ROM instead of the web, then size is less of an issue. Here you may wish to maintain the full frame rate (29.97) so that your motion stays fluid, as long as most CPU speeds can support the size and rate you choose. For instance, a 320 x 240 file should play fine on most machines with a 200 MHz or faster CPU; however, if this same file is playing back from within another application, like an HTML page displayed in a web browser or PowerPoint, some CPU overhead will be lost to this host program. This means that the same file which plays fine outside of the host application, might tend to drop frames when playing back inside of another application.


Formats and players


There are a lot of conflicting opinions on this subject, but I tend to go for what is a common denominator and provides quality playback. For this reason, I tend to stick with formats like QuickTime (Photo-JPEG codec), Windows Media 7 and Real Player. MPEG 1 and 4 are supposed to be playable on nearly everything, but I haven’t found that to be true. I love the way Sorenson 3 (QuickTime) looks, but it requires QuickTime 5 or newer. If you encode in one of the previous three I mentioned, which are somewhat older, odds are that nearly any machine out there will be able to play these files or will be able to download a simple player in that format that works on a wide range of Mac and Windows PCs. Although Photo-JPEG is generally not considered a playback codec, the advance of CPU speeds lets these files play quite fluidly and the codec lends itself to controllable encoding – meaning, less voodoo to get a good image.


If you are putting a file up for anyone to see, like a demo reel, then you will probably have to create a version in each of these three player formats. If you are encoding for a single client and you know what they can play, then only one version is needed. As an example, a typical :30 commercial encoded with QuickTime (Photo-JPEG at about 50% quality) at a size of 320 x 240 (29.97 fps) will yield a file size of around 10 to 15MB. This is fine for approval quality, but a bit large when you multiply that for a longer demo reel on your website. Cutting down the image size and frame rate and using a lossier codec will let you squeeze a demo reel of several minutes into that same space.
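If you want to gauge what those file sizes mean in data-rate terms, the back-of-the-envelope math is simple. The numbers below mirror the example above:

```python
# Hedged sketch: implied average data rate of the :30 approval copy described above.
file_size_mb = 12.5        # midpoint of the 10-15MB range
duration_s   = 30

kilobytes_per_sec = file_size_mb * 1024 / duration_s
megabits_per_sec  = file_size_mb * 8 / duration_s

print(round(kilobytes_per_sec))    # ~427 KB/sec
print(round(megabits_per_sec, 1))  # ~3.3 Mbps -- fine for approvals, heavy for a long reel
```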


Interlacing and filtering


Interlaced video doesn’t look good on computer displays and doesn’t compress efficiently. Some programs let you export single fields only or let you apply de-interlacing filters. I recommend you use one of these options to get better results, especially when there is a lot of motion. The one caveat is text. De-interlacing often trashes graphics and text, since half the visual information is tossed out. Generally, you get a better web look if your footage is based on a single-field export. Additionally, some encoding applications include noise reduction and image correction filters. I tend to stay away from these, but a touch of noise reduction won’t hurt. This will prefilter the image prior to compressing, which often results in better-looking and more efficient compression. Adding filters lengthens the encode time, so if you need a fast turnaround, you will probably want to disable any filters.


Constant versus variable bit-rate encoding


Like encoding for DVDs, many compression applications permit you to choose and adjust settings for constant (one-pass) and variable (one or two-pass) bit-rate encoding. I prefer constant bit-rate encoding because variable bit-rate often makes fades and dissolves look quite “blocky”. Constant also gives you a better look when transitioning between static graphics or frames and motion. The downside is that you will have to use a lower average rate to get comparable results in file size. Not all codecs give you this option, but when they do, it will often take a bit of trial-and-error to determine which rates look best and to decide how often to place keyframes (usually a slider in the software or a number value).




Remember that audio is a major component of your program. You can cut down your video by quite a lot, but at some point the audio is taking up even more space than the video and needs to be compressed as well. Tackle this in several ways. First, change your stereo audio to a single track of mono audio. The difference is minor and often stereo channels don’t seem to encode well, introducing all sorts of phase errors. Next, drop your sampling rate. You probably edited the show using a rate of 44.1 or 48 kHz. On most programs, you can successfully drop this to 22 kHz without really affecting the sound quality heard on most computer speakers. Do not drop the bit-depth. Reducing the bit-depth from 16-bit (typical) to 8-bit will create some very undesirable audio. Finally, add compression. Most codecs include some typical audio compression schemes, which all players can decode. A compression ratio of 4:1 is common and hardly noticed.
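Here’s how those audio savings stack up, step by step. These are uncompressed PCM numbers with the 4:1 codec applied at the end, just to illustrate the proportions:

```python
# Hedged sketch: audio data rate at each reduction step described above.
def pcm_rate_kbytes(sample_rate, channels, bits=16):
    return sample_rate * channels * (bits // 8) / 1024.0

original = pcm_rate_kbytes(48_000, channels=2)    # stereo, 48kHz, 16-bit
mono     = pcm_rate_kbytes(48_000, channels=1)    # step 1: fold to mono
low_rate = pcm_rate_kbytes(22_050, channels=1)    # step 2: drop the sample rate
final    = low_rate / 4                           # step 3: ~4:1 codec compression

for label, rate in [("stereo 48k", original), ("mono 48k", mono),
                    ("mono 22k", low_rate), ("after 4:1", final)]:
    print(f"{label}: {rate:.0f} KB/sec")
```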




Choosing the best application to encode/compress your footage gets down to learning curve, comfort factor, speed, preference and whether you are on a Mac or PC. Not all applications give you equal quality results with the same codec, though. You can encode using the internal export functions of most NLEs or choose from a wide range of applications, including Apple QuickTime Player Pro, Apple Compressor, Discreet Cleaner, Canopus Procoder, Sorenson Squeeze, Ligos, Windows Media encoder and many others.


When you encode a file, you may also choose to make it streaming or downloadable. Selecting progressive encoding will make the file downloadable, which is generally what you want for a demo reel or a client approval copy. If you want to ensure that the person’s browser will permit a download, wrap the file in an archive (data compression) format like .sit or .zip using WinZip or Stuffit. This forces the viewer to either open the file or save it on their local hard drive.


As with most things, it helps to read the book and spend some time experimenting when you’re not under the gun. This will let you decide which codec and encoding application gives you the best results based on need and the target audience.


© 2004 Oliver Peters


Understanding Video Levels

The video signal is made up of color information (chrominance) superimposed on top of black-and-white information (luminance). Adjusting this balance gives you the values of brightness, contrast, color intensity (saturation) and hue (tint). When you look at a waveform monitor in the IRE mode, you can see the values of the grayscale that represent the black-and-white portion of the picture. A vectorscope displays the distribution of the color portion around a circular scale, representing saturation and the various component colors.


The Red, Green and Blue components of light form the basis for video. All images that end up as video originally started as some sort of RGB representation of the world – either a video camera or telecine using CCDs for red, green and blue – or a computer graphics file created on an RGB computer system. This RGB version of the world is converted into a luma+chroma format by the circuitry of the camera or your editing workstation. Originally an analog encoding process, this conversion is now nearly always digital, conforming to ITU-R 601 or DV specs. This is commonly referred to as YCrCb – where Y = luminance and CrCb = two color-difference signals used to generate color information. You may also see this written as YUV, Y/R-Y/B-Y or other forms.
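For reference, the luma/chroma split comes from a weighted sum of R, G and B. Here’s a minimal sketch of the Rec. 601 conversion in its analog-style form, with values normalized 0-1; the exact scaling and offsets vary with the implementation, so treat this as an illustration of the idea rather than any particular system’s math:

```python
# Hedged sketch: RGB -> Y plus two color-difference signals, per the Rec. 601 weightings.
# Inputs are normalized 0.0-1.0; real systems add their own scaling and offsets.
def rgb_to_ycrcb(r, g, b):
    y  = 0.299 * r + 0.587 * g + 0.114 * b   # luma: green weighted heaviest
    cr = (r - y) * 0.713                     # scaled R-Y difference
    cb = (b - y) * 0.564                     # scaled B-Y difference
    return y, cr, cb

print(rgb_to_ycrcb(1.0, 1.0, 1.0))   # pure white: full luma, zero color difference
print(rgb_to_ycrcb(1.0, 0.0, 0.0))   # pure red: most of the signal lands in Cr
```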


In the conversion from RGB to YCrCb, luminance detail is given more weight, because research has shown that the eye responds better to brightness and contrast than to pure color information. Therefore in 601, YCrCb is expressed as the ratio of 4:2:2, so by definition, chroma has half the horizontal resolution of the black-and-white values of the image. In DV, the ratio is 4:1:1 (NTSC) – even less color information.


Although most post-production systems keep YCrCb signal components separate, they are nevertheless encoded into a composite signal of some sort before the viewer sees the final product, even if only by the very last display technology in the chain. This means that your pristine image capture in RGB has undergone a truncation of information during the post and end use of the video. This truncation gives us the concept of “legal colors” and “broadcast safe” levels, because there are color values and brightness levels in RGB and even YCrCb that simply will not appear as intended by the time the viewer sees it broadcast or on a VHS dub or DVD. That’s why it’s important to monitor, adjust and restrict levels to get the best possible final results.


Interpreting the Waveform


The NTSC video signal spans one volt, measured from the bottom of sync to peak white. On a waveform monitor, it is divided into the sync portion (below zero) and the video portion (above zero). The video portion of the scale is divided into 100 IRE units, with black set at 7.5 IRE (in the US) and peak white at 100 IRE. The luminance information should not dip lower than 7.5 or rise higher than 100. The color information, which you see superimposed over the luminance when the waveform monitor is set to FLAT, can exceed these limits – chroma can legally dip as low as –20 and peak as high as 120 IRE.


On top of this, many cameras are actually adjusted to limit (clip) their peak whites at higher than 100 IRE – usually at 105 or even 110. This is done so that you have a bit of artistic margin between bright parts of the image – that you would like to keep very white – and specular highlights, like reflections on metal – which are supposed to be clipped. Therefore, if you set up a camera tape to the recorded bars, you will often find that the peak levels on the tape are actually higher than 100 IRE. On the other end of the scale, you may also find chroma levels, such as highly saturated blues and reds that dip below the –20 mark. In order to correct these issues, you must do one of the following: a) lower the levels during capture into the NLE, b) adjust the image with color-correction, c) add a software filter effect to limit the video within the range, or d) add a hardware proc amp to limit the outgoing video signal.
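Option (c), the software limiting filter, is conceptually simple. The sketch below is a blunt Python/NumPy clamp on an 8-bit luma plane – real broadcast-safe filters are gentler, rolling levels off rather than hard-clipping them, and they handle chroma as well.

```python
import numpy as np

def limit_luma(y_plane, black=16, white=235):
    """Clamp an 8-bit luma plane to the legal range (a hard clip,
    not the soft limiting a good broadcast-safe filter applies)."""
    return np.clip(y_plane, black, white).astype(np.uint8)

# Hypothetical scan line with super-black and super-white excursions.
line = np.array([0, 10, 16, 128, 235, 245, 255], dtype=np.uint8)
print(limit_luma(line))   # [ 16  16  16 128 235 235 235]
```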


These values are based on an analog signal displayed in the composite mode, but things get a bit confusing once digital monitoring is introduced. A digital waveform monitor displaying a 601 serial digital signal will frequently use a different scale. Instead of 100 IRE as the topmost limit, it is shown as 0.7 volts (actually 714 millivolts) – the electrical level of peak white. Digital video also has no “set-up,” the NTSC pedestal that places black at 7.5 IRE instead of 0. In the digital world, black is 0, not 7.5. Nothing is missing – it is simply a difference in scales, so the analog range of 7.5 to 100 IRE equals the digital range of 0 to 714 millivolts. Some digital scopes label their scale 0 to 100, with black at 0 and peak white at 100. This tends to confuse the issue, because it is still a digital and not an analog signal, so operators are often unsure what the proper value for black should be.
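Moving between the two scales is a straight linear mapping. The sketch below simply encodes the equivalence described above – 7.5 to 100 IRE on the analog side equals 0 to 714 millivolts on the digital display – and individual scopes may of course label their graticules differently.

```python
def ire_to_millivolts(ire, setup=7.5, peak_ire=100.0, peak_mv=714.0):
    """Map an analog IRE value (with 7.5 IRE set-up) onto the 0-714 mV scale."""
    return (ire - setup) / (peak_ire - setup) * peak_mv

print(round(ire_to_millivolts(7.5)))     # 0   - black
print(round(ire_to_millivolts(100.0)))   # 714 - peak white
print(round(ire_to_millivolts(53.75)))   # 357 - mid-scale gray
```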


256 Levels


Digital video is created by an 8-bit (or sometimes 10-bit) quantization of the incoming analog image. With 8-bit video, the signal is divided into 256 steps between total black (0) and total white (255). In RGB, 0 to 255 is the full range, but the 601 specification reduces the range for YCrCb digital video so that black = 16 and white = 235. This was done to accommodate “real world” video signals, which tend to overshoot both ends of the scale, and to accommodate the set-up level of NTSC.
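Converting between the two ranges is just a linear remap. The Python/NumPy sketch below shows both directions; note that going from full range to video range squeezes 256 values into 220 steps, while the reverse direction clips anything that strays outside 16–235.

```python
import numpy as np

def full_to_video_range(img):
    """Remap full-range 0-255 values into the 601 video range of 16-235."""
    return np.round(16 + (img.astype(np.float32) / 255.0) * 219).astype(np.uint8)

def video_to_full_range(img):
    """Expand 16-235 video-range values back to 0-255, clipping any excursions."""
    scaled = (img.astype(np.float32) - 16) / 219 * 255
    return np.clip(np.round(scaled), 0, 255).astype(np.uint8)

ramp = np.array([0, 128, 255], dtype=np.uint8)
print(full_to_video_range(ramp))   # [ 16 126 235]
```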


Unfortunately, not all NLEs work this way. Some digitize incoming video based on the 0-255 range, while others use the 16-235 range. Since no video can exist below digital zero, any analog video lower than 7.5 IRE or higher than 100 IRE will be clipped in a system that scales to the 0-255 range, because such an NLE has no headroom. Some NLEs that work in this RGB-style mode compensate for the headroom issue as part of the conversion applied to the incoming and outgoing video signals.


DV and Set-up


It was all pretty easy in the days of analog tape formats, or even professional digital formats like Digital Betacam, which correctly converted between the analog and digital scales. Then came DV. DV is a consumer format that has been aggressively adopted by the professional world. Like other digital recordings, DV does not use a set-up signal. If you capture video from a DV VTR into an NLE using the DV signal over FireWire (iLink, 1394), the lack of set-up isn’t an issue, because the signal path remains entirely digital.


Many DV decks also have analog outputs, and the analog signals coming from them frequently have not been corrected with an added set-up value. This results in DV video – with blacks at digital 0 – being sent out of the analog spigots with the black level at analog 0, not 7.5 IRE. The problem is compounded if you capture this analog signal into an NLE that expects an analog signal to place black at 7.5 IRE, because it will scale that 7.5 IRE point to equal digital black. If you use an NLE that scales digital video according to the 16-235 range, then you are safe, because the darker-than-expected portion of the signal still lands within a usable range. On the other hand, if your NLE scales according to the 0-255 range, your darkest video will be clipped and cannot be recovered, because nothing can be darker than digital 0. There are four solutions to this issue: a) use a DV deck with SDI I/O – no set-up required, b) use the 1394 I/O only – no set-up required, c) use a DV deck that adds set-up to its analog outputs, or d) place a converter or proc amp in line between the deck and the NLE to adjust the levels as needed.




I’ve spent a lot of time discussing the black-and-white portion of the image, but color saturation is also quite important. Two displays best reveal color-related problems: the vectorscope and the waveform monitor’s diamond display. A vectorscope shows chroma saturation (color intensity) on a circular scale. The farther a signal falls from the center, the more intense the color. Legal color levels can reach the outer ring of the scale, but cannot go past it.


The diamond display on a waveform monitor shows the combination of brightness with color intensity. This display shows two diamond patterns – one over the other. The video signal must fall inside the diamonds in order to be legal. Certain colors, like yellow, can be a problem because of their brightness. Yellow can exceed the upper limits of the legal range due to either chroma saturation or video level (brightness). An excessive yellow signal – a “hot” yellow – would easily fall outside the edges of these diamond patterns. Either reducing the brightness or the saturation can correct the level, because it is the combination of the two that pushes it into the illegal range.
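In effect, the diamond display is a graphical RGB gamut check: the signal is decoded back into R, G and B, and the trace only stays inside the diamonds when all three components remain within range. The sketch below performs roughly the same test in code, reusing the 601 coefficients from earlier; the sample pixel is a hypothetical “hot” yellow.

```python
def is_rgb_legal(y, cb, cr, tolerance=0.0):
    """Decode an 8-bit 601 YCrCb pixel to normalized RGB and report whether
    every component stays inside 0.0-1.0 -- roughly the test that a waveform
    monitor's diamond display performs graphically."""
    y_n = (y - 16) / 219.0
    cb_n = (cb - 128) / 224.0
    cr_n = (cr - 128) / 224.0
    r = y_n + 1.402 * cr_n
    b = y_n + 1.772 * cb_n
    g = (y_n - 0.299 * r - 0.114 * b) / 0.587
    return all(-tolerance <= c <= 1.0 + tolerance for c in (r, g, b))

print(is_rgb_legal(230, 20, 150))   # False - an over-bright, saturated yellow
```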


Proper levels are easy to achieve with a little care and attention to detail. To understand more about the video signal, I recommend a book formerly published and distributed by Snell & Wilcox: Video Standards – Signals, Formats and Interfaces by Victor Steinberg (ISBN 1900739 07 0).


© 2004 Oliver Peters


The Basics of DVD Creation

The DVD has become the de facto replacement for the VHS dub, and DVD authoring is now an extension of most nonlinear editing software. This has turned what started out as a very arcane, expensive and specialized task into something that any video professional can master. When you produce a video DVD, you are actually building a creative product that must conform to the DVD-Video spec – one of the categories within the overall DVD format specification, which also includes forms such as DVD-ROM and DVD-Audio. Authoring a disk that exploits the full nuances of the DVD-Video spec has traditionally been the province of high-end applications such as Sonic Solutions’ DVD Creator. Newer, more user-friendly programs like Adobe Encore DVD, Apple DVD Studio Pro 2, Sonic’s own ReelDVD and Sony’s DVD Architect now offer enough of the same authoring features to satisfy the requirements of more than 90% of the DVDs typically produced.


The full DVD spec deals with complex elements like program managers, title sets and so on, but these newer authoring tools take a more streamlined approach – dividing your assets into tracks (for video and audio elements) and menus. You are generally limited to 99 different assets – far fewer than the full DVD spec would actually allow, yet well within the requirements of most commercial and corporate DVDs, especially considering how far prices have dropped since the early days of DVD tools.


Most of the popular authoring applications offer the main features, like motion menus, subtitles and multiple languages, but not all support the more advanced features, including 24fps encoding, multiple camera angles and AC-3 Dolby surround audio tracks. Obviously, there is still room for the advanced products, but any of the sub-$1,000 software tools will do the trick for most producers.




DVD program assets are made up of elementary stream files. Each video segment has a compressed MPEG-2 video file (.m2v) and, if the program has audio, at least one audio file (.wav). When multiple languages are used, there will be more than one audio file for each video element. For surround audio, there will be both a stereo track and an AC-3-encoded file for the surround mix. In the past, these files had to be created prior to any authoring. Ground-breaking DVD tools like Sonic’s DVD Creator employed real-time encoding hardware (which is still used), but fast software encoding has become an acceptable alternative. Many authoring tools let you import DVD-compliant files that have been compressed in a separate application, or work with regular AVI or QuickTime files and apply built-in encoding at the last stage of the authoring process.
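As one example of the separate-application route, the open-source ffmpeg tool (not one of the products named in this article) can generate both elementary streams from a master file. The sketch below drives it from Python; the file names, frame size and bit rates are placeholders to adjust for your own project.

```python
import subprocess

src = "program.mov"   # hypothetical master file

# MPEG-2 video elementary stream (.m2v): NTSC frame size, ~5.5 Mbps average.
subprocess.run([
    "ffmpeg", "-i", src, "-an",
    "-c:v", "mpeg2video", "-s", "720x480",
    "-b:v", "5500k", "-maxrate", "7000k", "-bufsize", "1835k",
    "program.m2v",
], check=True)

# Audio elementary stream (.wav): 48 kHz 16-bit PCM, ready for AC-3 encoding.
subprocess.run([
    "ffmpeg", "-i", src, "-vn",
    "-c:a", "pcm_s16le", "-ar", "48000",
    "program.wav",
], check=True)
```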


DVD video can be either NTSC or PAL, and the frame size is the same as DV: 720 x 480 for NTSC and 720 x 576 for PAL (both use non-square pixels). The compression of DVDs should typically fall into the range of 4.0 to 7.0 Mbps (megabits per second). Higher data rates are allowed, but you generally won’t see much improvement in the video, and higher bit rates often cause playback problems on some DVD players, especially those in older laptop computers. There are three encoding methods: constant bit rate, one-pass variable and two-pass variable.


Constant bit rate encoding is the fastest, because the same amount of compression is applied to all of the video regardless of complexity. Variable bit rate encoding applies less compression to more complex scenes (a fast camera pan) and more compression to less complex scenes (a static “talking head” shot). In two-pass encoding, the first pass analyzes the video and the second pass does the actual encoding, so two-pass variable bit rate encoding takes the longest. During encoding set-up, you enter a single bit rate value for constant bit rate encoding, but two values (average and maximum peak rates) for variable.


The quality of one type of encoding versus another depends on the quality of the encoding engine used by the application, as well as on how efficiently the video itself compresses. The former is obvious, but the latter means that film, 24P and other progressive-based media will compress more cleanly than standard interlaced video. Interlaced video changes temporal image information every 60th of a second, while film and 24P update their visual information only 24 times a second. As a result, compression at the exact same bit rate will show fewer artifacts when applied to film and 24P media than when applied to interlaced media. Add to this the fact that film material has grain, which further hides some of these compression artifacts. The bottom line is that a major movie title can often look great at a much lower bit rate (more compression) than a video-originated corporate training DVD encoded at a higher one. Most standard DVDs will look pretty good at a bit rate of around 5.5 Mbps, which will let you fit about 60 to 90 minutes of material on a 4.7 GB general-purpose DVD-R.
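The arithmetic behind that rule of thumb is easy to check. The rough bit-budget calculator below assumes about 4.3 GB of usable space on a 4.7 GB disk and ignores the additional overhead of menus, subpictures and navigation data, which is why real projects come in closer to the 60-to-90-minute figure quoted above.

```python
def dvd_minutes(video_mbps, audio_mbps=0.192, usable_gb=4.3):
    """Estimate the playing time that fits on a single-layer DVD-R."""
    total_bits = usable_gb * 8e9              # usable capacity in bits
    program_mbps = video_mbps + audio_mbps    # combined video + audio rate
    return total_bits / (program_mbps * 1e6) / 60

print(round(dvd_minutes(5.5)))                      # ~101 min with a 192 kbps AC-3 mix
print(round(dvd_minutes(5.5, audio_mbps=1.536)))    # ~81 min with stereo PCM audio
```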




Creating the interactive design for a DVD is a lot like building a web site. Menus are created and linked to video assets. Clicking a menu button causes the player to jump from one point on the DVD (a menu) to another (a video track). Menus can be still frames or moving video, but they must conform to the same constraints as any other video; if they don’t start out as video, they are turned into video in the final DVD build. By comparison, a web site, CD-ROM or even DVD-ROM might have HTML-based menus of one size and QuickTime or AVI video assets of a different size. That isn’t allowed in DVD-Video, because the disk must be playable as video on a set-top DVD player. Motion menus must be built with loop points, since the longer the menu runs, the more space it consumes on the disk. Thirty seconds is a typical loop duration. A slight pause or freeze occurs when the disk is triggered to jump back to the start of the loop, while the player’s pickup head moves between two points on the disk’s surface and refills its buffer.


Lay out your design before you start authoring. A flowchart is a great idea, because it lets you see how one or more menus connect to the videos in the most efficient manner. Although any flowcharting program will work, it is often just as simple to sketch this on paper. On the other hand, a nicely printed flowchart goes a long way toward explaining to a client how the viewer will navigate a complex DVD. Links can be created as buttons or drop zones. A button is a graphic element added to a menu within the authoring tool, while a drop zone is a hyperlink area added on top of an imported video. You can create videos or animations to be used as menus and turn graphic or video portions into linkable “hot spots” by adding drop zones over the video.


Clicking a button or drop zone triggers a jump to another menu, a video track or a point within a video track (a chapter point). Part of the authoring task is to define what happens when the viewer presses the control keys of a standard DVD remote. You must define what the Menu, Title and Return keys do, as well as which direction the cursor travels when the arrow keys (up, down, left, right) are used. Although the degree of control over these parameters varies among applications, any serious contender lets you define the following. You must set the First Play – the file that plays when the DVD is first inserted. You must also set the target destination for the end of each video file, which determines whether playback continues on to another video or jumps back to a menu – and if so, which one. Remember that if you have more than one menu, you will have to add buttons and navigation commands to move between the menus.
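It can help to write these decisions down as data before you touch the authoring tool. The outline below is purely an illustrative model in Python – it is not any application’s actual project format, and every name in it is a placeholder.

```python
# A hypothetical navigation map: first play, menu buttons, remote-key
# behavior and end actions, mirroring the choices described above.
dvd_navigation = {
    "first_play": "main_menu",
    "menus": {
        "main_menu": {
            "buttons": {"Play All": "feature", "Scenes": "scene_menu"},
            "remote": {"title": "main_menu", "return": "main_menu"},
        },
        "scene_menu": {
            "buttons": {"Chapter 1": ("feature", "ch01"),
                        "Chapter 2": ("feature", "ch02"),
                        "Back": "main_menu"},
            "remote": {"menu": "main_menu"},
        },
    },
    "tracks": {
        "feature": {"chapters": ["ch01", "ch02"], "end_action": "main_menu"},
    },
}
```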




Most authoring tools let you run a simulation of the DVD as you configure it. This lets you proof it to see if all the links work as intended. When the authoring is complete, the next step is to “build” the disk. This is an automatic process in which the application checks to see if your authoring has any errors and then “muxes” (multiplexes) the audio, video and menu files. The muxing stage is where the VOB (video object) files are created. You can see these files if you explore the folders of a non-encrypted DVD. If your DVD tool features built-in encoding, that step generally occurs during the building process.


When a project is built, it can be saved to your hard drive as a disk image or burned to a recordable DVD. Although there are a handful of competing recordable DVD formats, general-purpose DVD-R disks seem to be the most universal. These are typically rated at 4.7 GB, but actually hold about 4.3 GB of data. If you require more capacity, you will have to step up to a dual-layer or dual-sided DVD (9 GB and higher). These cannot be burned on a DVD recorder, and not all authoring tools handle these formats; those that do can prepare a disk image to be saved to DLT tape and sent to a commercial DVD replication facility. When you burn a DVD-R, be aware that the burning speed is limited by the rating of the media: 1X blanks will only burn at 1X speed, 2X disks at 2X and so on. If you saved the build as a disk image, you can use a disk-burning utility such as one from Roxio to make multiple DVD copies. Although encoding and recording times vary, it is not out of line for a one-hour DVD to take as much as four hours from the time you start the build until the recording step is finished.


We’ve only scratched the surface, so to learn more, check out the white papers and technical documents available on the Pioneer, Sonic Solutions and Mitsui web sites. A great reference is DVD Production: A Practical Resource for DVD Publishers by Philip De Lancie and Mark Ely, available from Sonic Solutions. User guides and tutorials – especially those included with Adobe Encore DVD and Apple DVD Studio Pro 2 – are also quite helpful. Start small and move up from there. Soon you, too, can be a master of the DVD world.


© 2004 Oliver Peters