What’s wrong with this picture?


“May you live in interesting times” is said to be an ancient Chinese curse. That certainly describes modern times, and nowhere more so than in the video world. We are at the intersection of numerous transitions: analog to digital broadcast; SD to HD; CRTs to LCD and plasma displays; and tape-based to file-based acquisition and delivery. Where the industry had the chance to make a clear break with the past, it often chose to integrate solutions that protected legacy formats and infrastructure, leaving us with the bewildering options that we know today.

 

Broadcasters settled on two standards: 720p and 1080i. These are both full-raster, square pixel formats: 1280x720p/59.94 (60 progressive frames per second in NTSC countries) – commonly known as “60P” – and 1920x1080i/59.94 (60 interlaced fields per second in NTSC countries) – commonly known as “60i”. The industry has wrestled with interlacing since before the birth of NTSC.

 

Interlaced scan

 

Interlaced displays show a frame as two sequential sets of alternating odd and even-numbered scan lines. Each set is called a field and lasts 1/60th of a second, so two fields make a single full-resolution frame. Since the fields are displaced in time, one frame with fast horizontal motion will appear to have serrated edges or horizontal lines. That’s because odd-numbered scan lines show action that occurred 1/60th of a second apart from the even-numbered, adjacent scan lines. If you routinely move interlaced content between software apps, you have to be careful to maintain proper field dominance (whether edits start on field 1 or field 2 of a frame) and field order (whether a frame is displayed starting with odd or even-numbered scan lines).

 

Progressive scan

 

A progressive format, like 720p, displays a complete, full-resolution frame for each of 60 frames per second. All scan lines show action that was captured at the exact same instant in time. When you combine the spatial with the temporal resolution, the amount of data that passes in front of a viewer’s eyes in one second is essentially the same for 1080i (about 62 million pixels) as for 720p (about 55 million pixels).
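
To see where those figures come from, here’s a quick back-of-the-envelope sketch (using the nominal 30 and 60 rates rather than the exact 29.97/59.94 values):

```python
# Rough per-second pixel throughput for the two HD broadcast formats.
# 1080i delivers 60 fields/sec, i.e. the equivalent of 30 full frames/sec.
pixels_1080i = 1920 * 1080 * 30   # ~62.2 million pixels per second
pixels_720p  = 1280 * 720 * 60    # ~55.3 million pixels per second

print(f"1080i: {pixels_1080i / 1e6:.1f} Mpixels/sec")
print(f"720p:  {pixels_720p / 1e6:.1f} Mpixels/sec")
```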

 

Progressive is ultimately a better format solution from the point-of-view of conversions and graphics. Progressive media scales more easily from SD to HD without the risk of introducing interlace errors that can’t be corrected later. Graphic and VFX artists also have a better time with progressive media and won’t have issues with proper field order, as is so often the case when working with NTSC or even 1080i. The benefits of progressive media apply regardless of the format size or frame rate, so 1080p/23.98 offers the same advantages.

 

Outside of the boundary lines

 

Modern cameras, display systems and NLEs have allowed us to shed a number of boundaries from the past. Thanks to Sony and Laser Pacific, we’ve added 1920x1080psf/23.98. That’s a “progressive segmented frame” running at the video-friendly rate of 23.98 for 24fps media. PsF is really interlacing, except that at the camera end, both fields are captured at the same point in time. PsF allows the format to be “superimposed” onto an otherwise interlaced infrastructure with less impact on post and manufacturing costs.

 

Tapeless cameras have added more wrinkles. A Panasonic VariCam records to tape at 59.94fps (60P), even though you are shooting with the camera set to 23.98fps (24P). This is often called 24-over-60. New tapeless Panasonic P2 camcorders aren’t bound by VTR mechanisms and can record a file to the P2 recording media at any “native” frame rate. To conserve data space on the P2 card, simply record at the frame rate you need, like 23.98pn (progressive, native) or 29.97pn. No need for any redundant frames (added 3:2 pulldown) to round 24fps out to 60fps as with the VariCam.
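
The storage payoff of native recording is easy to estimate. A rough sketch, assuming each recorded frame costs about the same number of bytes (a reasonable assumption for an intraframe codec such as DVCPRO HD) and ignoring audio:

```python
# Card space used when recording 23.98p natively versus padding the same
# material out to 59.94 frames with pulldown (24-over-60).
native_frames = 24   # unique source frames per second (nominal)
padded_frames = 60   # frames actually written in a 24-over-60 recording

savings = 1 - native_frames / padded_frames
print(f"Native 24p recording uses about {100 * savings:.0f}% less card space")  # ~60%
```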

 

I’d be remiss if I didn’t address raster size. At the top, I mentioned full-raster and square pixels, but the actual video content recorded in the file cheats this by changing the size and pixel aspect ratio as a way of reducing the data rate. This varies with the codec. For example, DVCPRO HD records at a true size of 960×720 pixels, but displays as 1280×720 pixels. Proper display sizes of such files (as compared with the actual recorded raster size) are controlled by the NLE software or a media player application, like QuickTime.
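
The stretch from recorded raster to display raster is just a pixel aspect ratio. A minimal sketch of the arithmetic a player or NLE applies:

```python
# DVCPRO HD 720p stores a 960x720 raster but is meant to be shown at 1280x720.
stored_width  = 960
display_width = 1280
pixel_aspect  = display_width / stored_width   # 1.333... (4/3-wide pixels)

print(f"Pixel aspect ratio: {pixel_aspect:.3f}")
print(f"Displayed width: {round(stored_width * pixel_aspect)} pixels")
```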

 

Mixing it up

 

Editors routinely have to deal with a mix of frame rates, image sizes and aspect ratios, but ultimately this all has to go to tape or distribution through the funnel of the two accepted HD broadcast formats (720p/59.94 and 1080i/59.94), plus good old-fashioned NTSC and/or PAL. For instance, if you work on a TV or film project being mastered at 1920x1080p/23.98, you need to realize several things. Few displays support native 23.98 (24P) frame rates, and you will ultimately have to generate not only a 23.98p master videotape or file, but also “broadcast” or “air” masters. Think of your 23.98p master as a “digital internegative”, which will be used to generate 1080i, 720p, NTSC, PAL, 16×9 squeezed, 4×3 center-cut and letterboxed variations.

 

Unfortunately your NLE won’t totally get you there. I recently finished some spots in 1080p/23.98 on an FCP system with a KONA2 card. If you think the hardware can convert to 1080i output, guess again! Changing FCP’s Video Playback setting to 1080i is really telling the FCP RT engine to do this in software, not in hardware. The ONLY conversions done by the KONA hardware are those available in the primary and secondary format options of the AJA Control Panel. In this case, only the NTSC downconversion gets the benefit of hardware-controlled pulldown insertion.

 

OK, so let FCP do it. The trouble with that idea is that yes, FCP can mix frame rates and convert them, but it does a poor job of it. Instead of the correct 2:3:2:3 cadence, FCP uses the faster-to-calculate 2:2:2:4. The result is an image that looks like frames are being dropped, because every fourth frame is held on screen twice as long as the others, resulting in a noticeable visual stutter. In my case, the solution was to use Apple Compressor to create the 1080i and 720p versions and to use the KONA2’s hardware downconversion for the NTSC Beta-SP dubs. Adobe After Effects also functions as a good software conversion tool.
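
To see why the shortcut cadence stutters, here’s a small, simplified sketch that expands four 24p source frames into ten video frames using each pattern (frame-based for clarity, rather than a field-accurate pulldown):

```python
def expand(frames, cadence):
    """Repeat each source frame per the cadence pattern, cycling as needed."""
    out = []
    for i, frame in enumerate(frames):
        out.extend([frame] * cadence[i % len(cadence)])
    return out

film = ["A", "B", "C", "D"]                  # four 24p source frames
print(expand(film, [2, 3, 2, 3]))   # proper 3:2 -> A A B B B C C D D D
print(expand(film, [2, 2, 2, 4]))   # shortcut   -> A A B B C C D D D D (D hangs)
```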

 

Another variation on this dilemma is the 720pn/29.97 (aka 30PN) of the P2 cameras. This is an easily edited format in FCP, but it deviates from the true 720p/59.94 standard. Edit in FCP with a 29.97p timeline, but when you change the Video Playback setting to 59.94, FCP converts the video on-the-fly to send a 60P video stream to the hardware. FCP is adding 2:2 pulldown (doubling each frame) to make the signal compliant. Depending on the horsepower of your workstation, you may, in fact, lower the image resolution by doing this. If you are doing this for HD output, it might actually be better to convert or render the 29.97p timeline to a new 59.94p sequence prior to output, in order to maintain proper resolution.

 

Converting to NTSC

 

But what about downconversion? Most of the HD decks and I/O cards you buy have built-in downconversion, right? You would think they do a good job, but when images are really critical, they don’t cut it. Dedicated conversion products, like the Teranex Mini, do a far better job in both directions. I delivered a documentary to HBO and one of the items flagged by their QC department was the quality of the credits in the downconverted (letterboxed) Digital Betacam back-up master. I had used rolling end credits on the HD master, so I figured that changing the credits to static cards and bumping up the font size a bit would make it a lot better. I compared the converted quality of these new static HD credits through FCP internally, through the KONA hardware and through the Sony HDW-500 deck. None of these looked as crisp and clean as simply creating new SD credits for the Digital Betacam master. Downconverted video and even lower third graphics all looked fine on the SD master – just not the final credits.

 

The trouble with flat panels

 

This would be enough of a mess without display issues. Consumers are buying LCDs and plasmas. CRTs are effectively dead. Yet, CRTs are the only displays that properly show interlacing – especially if you are troubleshooting errors. Flat panels all go through conversions and interpolation to display interlaced video in a progressive fashion. Going back to the original 720p versus 1080i options, I really have to wonder whether the rapid technology change in display devices was properly forecast. If you shoot 1080p/23.98, this often gets converted to a 1080i/59.94 broadcast master (with added 3:2 pulldown) and is transmitted to your set as a 1080i signal. The set converts the signal. That’s the best case scenario.

 

Far more often, the production company, network and local affiliate haven’t adopted the same HD standard. As a result, there may be several 720p-to-1080i and/or 1080i-to-720p conversions that happen along the way. To further complicate things, many older consumer sets are native 720p panels and scale a 1080 image. Many include circuitry to remove 3:2 pulldown and convert 24fps programs back to progressive images. This is usually called the “film” mode setting. It generally doesn’t work well with mixed-cadence shows or rolling/crawling video titles over film content.

 

The newest sets are 1080p, which is a totally bogus marketing feature. These are designed for video game playback and not TV signals, which are simply frame-doubled. All of this mish-mash – plus the heavy digital compression used in transmission – makes me marvel at how bad a lot of HD signals look in retail stores. I recently saw a clip from NBC’s Heroes on a large 1080p set at a local Sam’s Club. It was far more pleasing to me on my 20” Samsung CRT at home, received over analog cable, than on the big 1080p digital panel.

 

Progress (?) marches on…

 

We can’t turn back time, of course, but my feeling about displays is that a 29.97p (30P) signal is the “sweet spot” for most LCD and plasma panels. In fact, 720p on most of today’s consumer panels looks about the same as 1080i or 1080p. When I look at 23.98 (24P) content as 29.97 (24p-over-60i), it looks proper to my eyes on a CRT, but a bit funky on an LCD display. On the other hand, 29.97 (30P) strobes a bit on a CRT, but appears very smooth on a flat panel. Panasonic’s 720p/59.94 looks like regular video on a CRT, but 720p recorded as 30p-over-60p looks more film-like. Yet both signals actually look very similar on a flat panel. This is likely due to the refresh rates and image latency in an LCD or plasma panel as compared to a CRT. True 24P is also fine if your target is the web. As a web file it can be displayed as true 24fps without pulldown. Remember that as video, though, many flat panels cannot display 23.98 or 24fps frame rates without pulldown being added.

 

Unfortunately there is no single, best solution. If your target distribution is for the web or primarily to be viewed on flat panel display devices (including projectors), I highly recommend working strictly in a progressive format and a progressive timeline setting. If interlacing is involved, then make sure to deinterlace these clips or even the entire timeline before your final delivery. Reserve interlaced media and timelines for productions that are intended predominantly for broadcast TV using a 480i (NTSC) or 1080i transmission.

 

By now you’re probably echoing the common question, “When are we going to get ONE standard?” My answer is that there ARE standards – MANY of them. This won’t get better, so you can only prepare yourself with more knowledge. Learn what works for your system and your customers and then focus on those solutions – and yes – the necessary workarounds, too!

 

Does your head hurt yet?

 

© 2009 Oliver Peters

Dealing with a post facility


The do-it-yourself filmmaker might view the traditional lab or post facility as a place of last resort. That belief stems from a fear that – like a visit to a doctor or lawyer – every minute is billable. Most finishing facilities are actually easy to deal with and have the producer’s best interests at heart. They have been forging new workflows with file-based formats and are often the best-equipped to give a producer the desired creative and technical result.

 

Reasons to rely on outside resources include various advanced post services, like conforming a project for higher-quality or higher-resolution deliverables, color-grading and the production of digital intermediate masters. Sometimes, clients simply don’t know where to start, what to ask, or what’s expected of them. I posed some of these questions to a roundtable of post professionals, including Terence Curren, owner of Alphadogs (Burbank), Mike Most, chief technologist at Cineworks (Miami), Brian Hutchings, freelance colorist (Los Angeles) and Peter Postma, US product manager for Filmlight.

 

OP: Many clients don’t realize that post houses may offer up-front consultation as part of their sales effort. How do you approach that?

 

CURREN: We absolutely offer that service! Any post house that has the client’s welfare in mind does this. We find ourselves consulting on everything from cameras and recording formats, to file naming conventions. Every system has issues. We handle both FCP and Avid extensively and are familiar with a lot of potential pitfalls on either platform. When a client contacts us in advance, we can help to steer them clear of problems with their intended workflow. That can save them a lot of money in the long run.

 

HUTCHINGS: As a freelance colorist, I take on the roles of educator and salesman. Clients are increasingly making the transition from daVinci sessions to [Apple] Color. I do color timing on daVinci, Final Cut Studio and Avid systems and don’t have a financial interest in any particular piece of gear. Therefore, I can give a fairly unbiased opinion on the different paths available.

 

MOST: Clients these days often try to self educate. They read a lot on the Internet or elsewhere, or talk to others who have used the equipment they’re planning to use. Sometimes the knowledge they gain is accurate and useful, but often it’s either inaccurate or based on production or post conditions that differ from theirs. We try to steer them in a direction, so that what they do, how they do it, and the formats they use, flow easily into the finishing steps that we provide. Basically, we try to minimize surprises and make the process smoother, more efficient, and in many cases, more economical.

 

OP: What should the producer be prepared to supply for an online edit or a DI conform?

 

MOST: In general – if there is still such a thing – we need the original materials, an EDL, some visual copy of their offline cut as a reference, title copy (or graphics files, if they’ve created their own) and some idea as to how they’re finishing the sound. If the program is cut on an Avid, it’s useful to receive a bin with the final sequence in addition to a traditional EDL. Many less-experienced Final Cut editors use techniques, such as nesting, speed effects and other visual embellishments, which do not translate to an EDL in any kind of useful form. So with Final Cut, it helps to have a copy of the project file.

 

CURREN: Mike has covered the bases; however, with the new file-based formats that offer higher resolutions at smaller file sizes, we often receive the project with the media already attached. In this case our work starts by fine-tuning effects, adding any graphics, color correcting and the final mix of the audio. This saves money in the capture time and in double-checking against the offline for accuracy.

 

MOST: I do find that many users of newer formats, such as RED, are very confused about what they do and do not have to deliver to us to achieve the best quality with the least difficulty. They do a lot of work themselves to create elements that serve no purpose for us. This actually lengthens the amount of time it takes us to complete the conform. Hopefully in the future, there will be more communication prior to picture lock between clients and finishing facilities and much less bad guesswork.

 

OP: What are your guidelines for preparing the media and timelines before you start? How much time should be allowed for finishing and color grading?

 

CURREN: Our process is to re-import any files, then recapture any media from tape. With older analog formats like Betacam, we will actually ride levels on recapture to avoid any clipping of the video, which cannot be retrieved later in the process. Generally speaking, we figure about 100 clips captured per hour on Avid and about 90 on FCP. The more clips in a show, the longer this process takes. We will check the new timeline shot-by-shot against a copy of the offline output to verify that it is correct, in sync and that effects translated properly. Next comes the color correction pass, titling and graphics. At this point we will watch the show with the client and then address any notes.

 

POSTMA: A commercial can be done in a day, though several days may be used for creative reasons. A feature film – including scanning, color correction and recording – can be done in three weeks. Again, it may be longer if you want to spend more time making creative color-correction decisions.

 

CURREN: An important side note about color correction needs to be made here. There are really three parts. Part one is to make it legal for whatever your distribution is going to be. Part two is to make it even, meaning all the shots in a given scene should look like they belong together. The first two steps are fairly objective. Part three is purely subjective. That’s where the magic can take place in color correction. Giving a slight green tint to a scary scene or a slight blue tint to two lovers silently arguing are examples of subjective choices. The creative part of the process can take a long time if allowed.

 

MOST: I can speak more to the feature film side of this question, because the time factors – once the conform is complete – are usually dictated by things like budgets. For a feature shot on film, we usually allocate 3-5 days to scanning (perhaps a day or two less for file restoration on a file-based electronic feature), 2-3 days to conform, 5-10 days for color correction, 1-2 days to do final renders and create the HD master, and about 5-7 days to do a film recording. All of those time factors can vary in either direction, depending on editorial complication, show length, creative choices, and, once again, budget.

 

OP: How do you handle grading of the same project for TV, digital cinema projection and film prints?

 

CURREN: Many post houses are claiming they are DI houses, but shouldn’t be. The trick with DI is to have tight control over the entire process, including the film lab. If you don’t, there are too many places where things can go wrong. Most of our work at Alphadogs is grading for television. We don’t claim to be a DI house. When we do feature work and the client plans to do a film-out, we will color correct the same way as for TV, but avoid clipping whites or crushing blacks. Afterwards, the client takes it to the lab they have chosen for a film-out, where a final scene-by-scene color pass is done. They save money by not having to color-grade every shot, since the scenes are already evened out.

 

MOST: Cineworks has a DI theater that’s specifically calibrated to Digital Cinema (P3) color space. We use a film print preview lookup table for projects going to film. During the session we’re basically looking at a preview of the eventual film print. The files are created in 10-bit log, film print density color space, and are used directly by a film recorder. We then use various custom lookup tables, along with minor color tweaks, to derive all other deliverables from those same files. The look remains consistent across all platforms. We usually generate an HD video version, which is then used for all other video deliverables – HD24, HD25, NTSC, PAL, full frame, letterbox, etc.

 

POSTMA: Filmlight’s Truelight color management system handles these conversions, so a DI facility that uses it should only need to color correct once and Truelight will handle the color conversion to the other spaces. It usually makes sense to color correct for the medium with the most color range (film or digital cinema) and then downconvert to video, web, etc. There may be some different creative decisions you’d like to make for the more limited mediums of video or the web. In that case, you can do a trim pass to tweak a few shots, but the Truelight color conversion should get you 90% of the way there.

 

OP: Should a producer worry about various camera color spaces, such as Panalog, REDlog or the cine-gamma settings in Sony or Panasonic cameras?

 

CURREN: This is a great reason to talk to post first. I’m a fan of leaving things in the native space through to your final finish; however, that can make for a very flat looking offline, which is disturbing to many folks. If so, you might need two versions of the files or tapes. One version – the master – should stay in the native space. The second – the offline editorial working files – should be changed over to video (601 or 709) space.

 

MOST: Color space issues should be for finishing facilities to deal with, but the use of custom gamma curves in electronic cameras presents some educational issues for shooters. We usually try to discuss these during a pre-production meeting, but they primarily affect dailies creation. For finishing, we can deal with all of these color spaces without much of a problem.

 

OP: If the intended distribution is 2K or 1920×1080 HD, should the producer be concerned about image sequence files (DPX, TIFF, etc.)?

 

MOST: No, not unless that’s the way the program is being recorded – as with an S.two or Codex recorder. It’s easier for editors to deal with wrapped movie files, QuickTime in the case of Final Cut or OMF and MXF in the case of Avid. We use the original material – in whatever form it was captured – for finishing. With film, of course, that’s obvious; but, for RED, we work directly from the .r3d files in our Assimilate SCRATCH system. That gives us access to all of the information the camera captured.

 

CURREN: DPX files hog a tremendous amount of storage space. If you capture digitally, with the RED camera, for instance, why not stay with the native RED codec? You won’t gain any quality by converting to DPX, but you will have to bake in a look, limiting your color correction range later in the process.

 

OP: Who should attend the sessions?

 

MOST: For conforming, nobody needs to physically be there, but the editor or assistant editor needs to be available for any questions that come up. For color correction, we really want the director of photography to be with us, as the one who is really responsible for the look of the picture.

 

POSTMA: Cinematographer and director. You definitely don’t want too many people in the room or you can burn up time in a very expensive suite making decisions by committee.

 

CURREN: Who is signing the check? I’m not trying to be cheeky; it’s just that someone makes the final decisions, or they have authorized someone to make the final decisions. That is who should be there at the end. For feature work, often the DP will get a pass at the color correction. In this case, it is wise for the producer to set some guidelines. The DP is going to try to make his stuff look the best he can, which is what he should want. The colorist also wants the best look they can achieve. There is an old saying that applies here, “No film is ever done, they just run out of time or money.” There has to be a clear understanding of where the cut-off point is. When is it good enough? Without that direction, the DP and the colorist can rack up quite a bill.

 

© 2009 Oliver Peters

Originally written for DV Magazine (NewBay Media, LLC)

Compression Tips For The Web

One of the many new disciplines editors have to know is how to properly compress and encode video for presentations on the Internet or as part of CD-ROMs. Often this may be for demo reels or client approval copies, but it could also be for final presentations within PowerPoint, Director or another presentation application. The objective is to get the encoded program down to the smallest file size yet maintain as much of the original quality as possible.

 

Everyone has their own pet software or player format to recommend, but the truth of the matter is that it is unlikely that you will encode your video into a format that absolutely everyone can read without the need to download an additional player that they might have to install. The most common player formats include QuickTime, Windows Media, Real Player, Flash and the embedded media player that AOL bundles into their own software. Within each of these, there are also codec and size options that vary depending on how current a version you are targeting.

 

Modern formats, such as MPEG 4, Windows Media 9, QuickTime with Sorenson 3 and others may look great, but they frequently only run on the newest versions of these players. If your client has an older Windows 98 PC or an OS 9 Mac, it’s doubtful that they can play the latest and greatest software. You should also be aware that not all encoded results are equal. Some formats look awesome at standard medium-to-large video sizes, but don’t look good at all when you get down to a really small window size. The opposite is also true. Here are some guidelines that will let you target the largest possible audience.

 

Size and frame rate

 

The first thing to tackle when encoding for the web is the image size and frame rate. Standard definition video is 720 x 486 (480 for DV) pixels (rectangular aspect), which equates to a web size of 640 x 480 pixels (square aspect). This is considered a “large” window size for most web pages. Scaling the image down reduces the file size, so commonly used smaller sizes are 320 x 240 (“medium”), 192 x 144 and 160 x 120 (“small”). These sizes aren’t absolute. For instance, if your finished program is letterboxed, why waste file size on the black top and bottom bars? If your encoding software permits cropping, you could export these files in other sizes, such as 300 x 200 or 160 x 90 pixels. Another way to reduce the file size is to reduce the frame rate. Video runs at 29.97 fps but due to the progressive display and refresh rates of computer CRTs and flat panels, there is often little harm done in cutting this down to 15 fps or sometimes even 10 fps or lower.
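
The savings from scaling down and dropping the frame rate compound quickly. A rough sketch of the raw pixel data handed to the encoder in each case:

```python
def pixel_rate(width, height, fps):
    """Raw pixels per second pushed into the encoder."""
    return width * height * fps

full  = pixel_rate(640, 480, 29.97)   # full-size, full-rate web video
small = pixel_rate(320, 240, 15)      # quarter-size at half the frame rate

print(f"Reduction factor: {full / small:.1f}x")   # roughly 8x less data to encode
```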

 

Reducing the image size and frame rate is a matter of juggling the reduction of file size with playback that is still easily viewed and doesn’t lose the message you are trying to convey. If you are encoding for a CD-ROM instead of the web, then size is less of an issue. Here you may wish to maintain the full frame rate (29.97) so that your motion stays fluid, as long as most CPU speeds can support the size and rate you choose. For instance, a 320 x 240 file should play fine on most machines with a 200 MHz or faster CPU; however, if this same file is playing back from within another application, like an HTML page displayed in a web browser or PowerPoint, some CPU overhead will be lost to this host program. This means that the same file, which plays fine on its own, might tend to drop frames when playing back inside of another application.

 

Formats and players

 

There are a lot of conflicting opinions on this subject, but I tend to go for what is a common denominator and provides quality playback. For this reason, I tend to stick with formats like QuickTime (Photo-JPEG codec), Windows Media 7 and Real Player. MPEG 1 and 4 are supposed to be playable on nearly everything, but I haven’t found that to be true. I love the way Sorenson 3 (QuickTime) looks, but it requires QuickTime 5 or newer. If you encode in one of the previous three I mentioned, which are somewhat older, odds are that nearly any machine out there will be able to play these files or will be able to download a simple player in that format that works on a wide range of Mac and Windows PCs. Although Photo-JPEG is generally not considered a playback codec, the advance of CPU speeds lets these files play quite fluidly and the codec lends itself to controllable encoding – meaning, less voodoo to get a good image.

 

If you are putting a file up for anyone to see, like a demo reel, then you will probably have to create a version in each of these three player formats. If you are encoding for a single client and you know what they can play, then only one version is needed. As an example, a typical :30 commercial encoded with QuickTime (Photo-JPEG at about 50% quality) at a size of 320 x 240 (29.97 fps) will yield a file size of around 10 to 15MB. This is fine for approval quality, but a bit large when you multiply that for a longer demo reel on your website. Cutting down the image size and frame rate and using a lossier codec will let you squeeze a demo reel of several minutes into that same space.
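
For a longer reel, it helps to work backwards from an acceptable file size to an average data rate. A quick sketch (the numbers are illustrative, not a recommendation):

```python
def avg_bitrate_kbps(file_size_mb, duration_sec):
    """Average total bit rate (video + audio) implied by a target file size."""
    return file_size_mb * 8 * 1024 / duration_sec

# The :30 approval spot described above, at roughly 12 MB:
print(f"{avg_bitrate_kbps(12, 30):.0f} kbps")    # ~3,300 kbps

# A 4-minute demo reel squeezed into the same ~12 MB budget:
print(f"{avg_bitrate_kbps(12, 240):.0f} kbps")   # ~410 kbps
```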

 

Interlacing and filtering

 

Interlaced video doesn’t look good on computer displays and doesn’t compress efficiently. Some programs let you export single fields only or let you apply de-interlacing filters. I recommend you use one of these options to get better results, especially when there is a lot of motion. The one caveat is text. De-interlacing often trashes graphics and text, since half the visual information is tossed out. Generally, you get a better web look if your footage is based on a single-field export. Additionally, some encoding applications include noise reduction and image correction filters. I tend to stay away from these, but a touch of noise reduction won’t hurt. This will prefilter the image prior to compressing, which often results in better-looking and more efficient compression. Adding filters lengthens the encode time, so if you need a fast turnaround, you will probably want to disable any filters.

 

Constant versus variable bit-rate encoding

 

Like encoding for DVDs, many compression applications permit you to choose and adjust settings for constant (one-pass) and variable (one or two-pass) bit-rate encoding. I prefer constant bit-rate encoding because variable bit-rate often makes fades and dissolves look quite “blocky”. Constant also gives you a better look when transitioning between static graphics or frames and motion. The downside is that you will have to use a lower average rate to get comparable results in file size. Not all codecs give you this option, but when they do, it will often take a bit of trial-and-error to determine which rates look best and to decide how often to place keyframes (usually a slider in the software or a number value).

 

Audio

 

Remember that audio is a major component of your program. You can cut down your video by quite a lot, but at some point audio is taking up even more space than the video and needs to be compressed as well. Tackle this in several ways. First, change your stereo audio to a single track of mono audio. The difference is minor and often stereo channels don’t seem to encode well, introducing all sorts of phase errors. Next, drop your sampling rate. You probably edited the show using a rate of 44.1 or 48 kHz. On most programs, you can successfully drop this to 22 kHz without really affecting the sound quality heard on most computer speakers. Do not drop the bit-depth. Reducing the bit-depth from 16-bit (typical) to 8-bit will create some very undesirable audio. Finally, add compression. Most codecs include some typical audio compression schemes, which all players can decode. A compression ratio of 4:1 is common and hardly noticed.
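
These reductions add up. A sketch of the uncompressed audio data rate before and after, plus the effect of a 4:1 codec:

```python
def audio_kbps(sample_rate, bit_depth, channels):
    """Uncompressed PCM audio data rate in kilobits per second."""
    return sample_rate * bit_depth * channels / 1000

original   = audio_kbps(48000, 16, 2)   # stereo 48 kHz, 16-bit: 1536 kbps
reduced    = audio_kbps(22050, 16, 1)   # mono 22 kHz, 16-bit:   ~353 kbps
with_codec = reduced / 4                # 4:1 compression:        ~88 kbps

print(original, reduced, with_codec)
```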

 

Software

 

Choosing the best application to encode/compress your footage gets down to learning curve, comfort factor, speed, preference and whether you are on a Mac or PC. Not all applications give you equal quality results with the same codec, though. You can encode using the internal export functions of most NLEs or choose from a wide range of applications, including Apple QuickTime Player Pro, Apple Compressor, Discreet Cleaner, Canopus Procoder, Sorenson Squeeze, Ligos, Windows Media encoder and many others.

 

When you encode a file, you may also choose to make it streaming or downloadable. Selecting progressive encoding will make the file downloadable, which is generally what you want for a demo reel or a client approval copy. If you want to ensure that the person’s browser will permit a download, wrap the file in an archive (data compression) format like .sit or .zip using WinZip or Stuffit. This forces the viewer to either open the file or save it on their local hard drive.

 

As with most things, it helps to read the book and spend some time experimenting when you’re not under the gun. This will let you decide which codec and encoding application gives you the best results based on need and the target audience.

 

© 2004 Oliver Peters

Understanding Video Levels

The video signal is made up of color information (chrominance) superimposed on top of black-and-white information (luminance). Adjusting this balance gives you the values of brightness, contrast, color intensity (saturation) and hue (tint). When you look at a waveform monitor in the IRE mode, you can see the values of the grayscale that represent the black-and-white portion of the picture. A vectorscope displays the distribution of the color portion around a circular scale, representing saturation and the various component colors.

 

The Red, Green and Blue components of light form the basis for video. All images that end up as video originally started as some sort of RGB representation of the world – either a video camera or telecine using CCDs for red, blue and green – or a computer graphics file created on an RGB computer system. This RGB version of the world is converted into a luma+chroma format by the circuitry of the camera or your editing workstation. Originally an analog encoding process, this conversion is now nearly always digital, conforming to ITU-R 601 or DV specs. This is commonly referred to as YCrCb – where Y = luminance and CrCb = two difference signals used to generate color information. You may also see this written as YUV, Y/R-Y/B-Y or other forms.

 

In the conversion from RGB to YCrCb, luminance detail is given more weight, because research has shown that the eye responds better to brightness and contrast than pure color information. Therefore in 601, YCrCb is expressed as the ratio of 4:2:2, so by definition, chroma has half the resolution of the black-and-white values of the image. In DV, the ratio is 4:1:1 (NTSC) – even less color information. 

 

Although most post-production systems keep YCrCb signal components separate, they are nevertheless encoded into a composite signal of some sort before the viewer sees the final product, even if only by the very last display technology in the chain. This means that your pristine image capture in RGB has undergone a truncation of information during the post and end use of the video. This truncation gives us the concept of “legal colors” and “broadcast safe” levels, because there are color values and brightness levels in RGB and even YCrCb that simply will not appear as intended by the time the viewer sees it broadcast or on a VHS dub or DVD. That’s why it’s important to monitor, adjust and restrict levels to get the best possible final results.

 

Interpreting the Waveform

 

The NTSC video signal spans one volt from the bottom of sync to the maximum video brightness level (white). This signal is divided into the sync portion (below zero on a waveform monitor) and the video portion (above zero). The video portion of the scale is divided into 100 IRE units, with black set at 7.5 IRE (in the US) and peak whites at 100 IRE. The luminance information should not dip lower than 7.5 or higher than 100. The color information, which you see superimposed over the luminance information when a waveform monitor is set to FLAT, can exceed these 7.5 and 100 IRE limits. In fact, color can legally dip as low as –20 and as high as 120.

 

On top of this, many cameras are actually adjusted to limit (clip) their peak whites at higher than 100 IRE – usually at 105 or even 110. This is done so that you have a bit of artistic margin between bright parts of the image – that you would like to keep very white – and specular highlights, like reflections on metal – which are supposed to be clipped. Therefore, if you set up a camera tape to the recorded bars, you will often find that the peak levels on the tape are actually higher than 100 IRE. On the other end of the scale, you may also find chroma levels, such as highly saturated blues and reds that dip below the –20 mark. In order to correct these issues, you must do one of the following: a) lower the levels during capture into the NLE, b) adjust the image with color-correction, c) add a software filter effect to limit the video within the range, or d) add a hardware proc amp to limit the outgoing video signal.
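
Option (c) boils down to clamping values back inside the legal window. A minimal sketch in IRE terms (real broadcast-safe filters are more sophisticated, soft-clipping rather than hard-limiting):

```python
def clamp(value, lo, hi):
    """Limit a value to the range [lo, hi]."""
    return max(lo, min(hi, value))

def make_legal(luma_ire, chroma_ire):
    """Hard-limit luma to 7.5-100 IRE and chroma excursions to -20 to +120 IRE."""
    return clamp(luma_ire, 7.5, 100), clamp(chroma_ire, -20, 120)

print(make_legal(108, 126))   # a too-hot highlight -> (100, 120)
```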

 

These values are based on an analog signal displayed in the composite mode, but things get a bit confusing once digital monitoring is introduced. A digital waveform displaying a 601 serial digital video signal will frequently use a different scale. Instead of 100 IRE as the topmost limit, it will be shown as .7 volts (actually 714 millivolts) – the electrical energy at this level. Digital video also has no “set up” to the signal. This is the NTSC component that sets black at 7.5 IRE instead of 0. In the digital world, black is 0, not 7.5. There is nothing missing, simply a difference in scales, so the analog range of 7.5 to 100 IRE equals the digital range of 0 to 714 millivolts. Sometimes digital scopes may label their scale as 0 to 100, with black at 0 and the peak white level at 100. This tends to confuse the issue, because it is still a digital and not an analog signal, so operators are often not sure what the proper value for black should be.

 

256 Levels

 

Digital video is created using an 8-bit (or sometimes 10-bit) quantization of the incoming analog image. With 8-bit digital video, the 7.5 to 100 IRE analog range is divided into 256 steps between total black (0) and total white (255). In RGB values, 0 to 255 is the full range, but according to the 601 specifications, the range for YCrCb digital video was reduced so that black = 16 and white = 235. This was done to accommodate “real world” video signals that tend to exceed both ends of the scale and to accommodate the set-up level of NTSC.
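
The two conventions are related by a simple linear remap. A sketch of the 601-style scaling for 8-bit luma:

```python
def full_to_video(y_full):
    """Map full-range luma (0-255) into the 601 video range (16-235)."""
    return round(16 + y_full * 219 / 255)

def video_to_full(y_video):
    """Map 601 video-range luma (16-235) back to full range (0-255)."""
    return round((y_video - 16) * 255 / 219)

print(full_to_video(0), full_to_video(255))    # 16 235
print(video_to_full(16), video_to_full(235))   # 0 255
```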

 

Unfortunately, not all NLEs work this way. Some take video in and digitize it based on the 0-255 range, while others use the 16-235 range. Since no video can exist below digital zero, any analog video that is lower than 7.5 IRE or higher than 100 IRE will be clipped in a system that scales according to the 0-255 range, since there is no headroom on such an NLE. Some NLEs that work in this RGB mode will correct for the headroom issue as part of the conversion done with the incoming and outgoing video signals.

 

DV and Set-up

 

It was all pretty easy in the days of analog tape formats or even professional digital formats, like Digital Betacam, which correctly converted between analog and digital scales. Then came DV. DV is a consumer format, which has been aggressively adopted by the professional world. Like other digital recordings, DV does not use a set-up signal. If you capture video from a DV VTR into an NLE, using the DV signal over FireWire (iLink, 1394), then the lack of set-up isn’t an issue because the signal path has always been digital.

 

Many DV decks also have analog outputs and the analog signals coming out of these frequently have not been corrected with an added set-up value. This results in DV video – with blacks at digital 0 – being sent out via the analog spigots with the black level at analog 0, not 7.5 IRE. The problem really becomes compounded if you capture this analog signal into an NLE, which is expecting an analog signal to have a 7.5 IRE limit for black. It will scale this 7.5-point to equal the digital black value. If you use an NLE that scales digital video according to the 16-235 range, then you are safe, because the darker-than-expected portion of the signal is still within a useable range. On the other hand, if your NLE scales according to the 0-255 range, then your darkest video will be clipped and cannot be recovered because nothing can be darker than digital 0. There are four solutions to this issue: a) use a DV deck with SDI I/O – no set-up required, b) use the 1394 I/O only – no set-up required, c) use a DV deck that adds set-up to the analog signals, or d) place a converter or proc amp in line between the deck and the NLE to adjust the levels as needed.

 

Color

 

I’ve spent a lot of time discussing the black-and-white portion of the image, but color saturation is also quite important. There are two devices that best show color-related problems: the vectorscope and the waveform monitor’s diamond display. Vectorscopes show chroma saturation (color intensity) in a circular display. The farther out from the center that you see a signal, the more intense the color. Legal color levels can go to the outer ring of the scale, but cannot go past it.

 

The diamond display on a waveform monitor shows the combination of brightness with color intensity. This display shows two diamond patterns – one over the other. The video signal must fall inside the diamonds in order to be legal. Certain colors, like yellow, can be a problem because of their brightness. Yellow can exceed the upper limits of the legal range due to either chroma saturation or video level (brightness). An excessive yellow signal – a “hot” yellow – would easily fall outside the edges of these diamond patterns. Either reducing the brightness or the saturation can correct the level, because it is the combination of the two that pushes it into the illegal range.

 

Proper levels are easy to achieve with a little care and attention to details. To understand more about the video signal, I recommend a book formerly published and distributed by Snell and Wilcox. Check out Video Standards – Signals, Formats and Interfaces by Victor Steinberg (ISBN 1900739 07 0).

 

© 2004 Oliver Peters

The Basics of DVD Creation

The DVD has become the de facto replacement for the VHS dub. DVD authoring has become an extension of most nonlinear editing software. This has made something that started out as a very arcane, expensive and specialized task into something that is easy enough for any video professional to master. When you produce a video DVD, you are actually building a creative product that must conform to the DVD-Video spec. This is one of the categories within the overall DVD format specification that includes other forms, such as DVD-ROM and DVD-Audio. Creating a disk with the full nuances of the DVD-Video spec has been addressed in such high-end applications as Sonic Solutions’ DVD Creator. Newer, more user-friendly programs like Adobe Encore DVD, Apple DVD Studio Pro 2, Sonic’s own Reel DVD and Sony’s DVD Architect now offer enough of these same DVD authoring features to satisfy the requirements of over 90% of all the types of DVDs typically produced.

 

The full DVD spec deals with complex elements like program managers, title sets and so on, but these newer authoring tools have adopted a more streamlined approach – dividing your assets into tracks (for video and audio elements) and menus. You are generally limited to 99 different assets – far less than the full DVD spec would actually allow, yet well within the requirements of most commercial and corporate DVDs. This is especially true considering the significant drop in price from the early days of DVD tools.

 

Most of the popular authoring applications offer the main features, like motion menus, subtitles and multiple languages; but, not all permit some of the more advanced features, including 24fps encoding, multiple camera angles and AC-3 Dolby surround audio tracks. Obviously, there is still room for the advanced products, but any of the sub-$1,000 software tools will do the trick for most producers.

 

Encoding

 

DVD program assets are made up of elementary stream files. Each video segment has a compressed MPEG2-format video file (.m2v) and, if there is audio with the program, at least one audio file (.wav). When multiple languages are used, there will be more than one audio file for each video element. For surround audio, there will be both a stereo track, as well as an AC-3 encoded audio file for the surround mix. In the past, these files had to be created prior to any authoring. Ground-breaking DVD tools, like Sonic’s DVD Creator, employed real-time encoding hardware (which is still used), but fast software encoding has now become an acceptable alternative. Many authoring tools let you import DVD-compliant files, which have been compressed using a separate application – or work with regular AVI or QuickTime files and apply built-in encoding at the last stage of the authoring process.

 

DVD videos can be either NTSC or PAL and the frame size is the same as DV: NTSC – 720 x 480 (non-square pixel aspect ratio). The compression of DVDs should typically fall into the range of 4.0 to 7.0 Mbps (megabits per second). Higher data rates are allowed, but you generally won’t see much improvement in the video and higher bit rates often cause problems in playback on some DVD players, especially those in older laptop computers. There are three encoding methods: constant bit rate, one-pass variable and two-pass variable.

 

Constant bit rate encoding is the fastest because the same amount of compression is applied to all of the video, regardless of complexity. Variable bit rate encoding applies less compression to more complex scenes (a fast camera pan) and more compression to less complex scenes (a static “talking head” shot). In two-pass encoding, the first pass is used to analyze the video and the second pass is the actual encoding pass. Therefore, two-pass variable bit rate encoding will take the longest amount of time. During the encoding set up, a single bit rate value is entered for constant bit rate encoding, but two values (average and maximum peak rates) are entered for variable.

 

The quality of one type of encoding versus another depends on the quality of the encoding engine used by the application, as well as the compression efficiency of the video itself. The former is obvious, but the latter means that film, 24P and other progressive-based media will compress more cleanly than standard interlaced video. This is due to the fact that interlaced video changes temporal image information every 60th of a second, while film and 24P’s visual information updates only 24 times a second. As a result, compression at the exact same bit rate will appear to have fewer artifacts when it is applied to film and 24P media, than when applied to interlaced media. Add to this the fact that film material has grain, which further hides some of these compression artifacts. The bottom line is that a major movie title can often look great with a much lower bit rate (more compressed) than your video-originated corporate training DVD – even with a higher bit rate. Most standard DVDs will look pretty good at a bit rate of around 5.5 Mbps, which will permit you to get about 60 to 90 minutes of material on a 4.7 GB general-purpose DVD-R.
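
You can sanity-check those runtime figures with simple arithmetic. A sketch, assuming roughly 4.3 GB of usable space; the audio format makes a real difference, so both a compressed audio track and an uncompressed PCM stereo track are shown:

```python
def dvd_minutes(video_mbps, audio_mbps, usable_gb=4.3):
    """Approximate runtime that fits on a single-layer DVD-R at a given bit rate."""
    total_bits = usable_gb * 1e9 * 8
    return total_bits / ((video_mbps + audio_mbps) * 1e6) / 60

# 5.5 Mbps video with compressed audio vs. uncompressed PCM stereo:
print(f"{dvd_minutes(5.5, 0.25):.0f} min")   # ~100 min
print(f"{dvd_minutes(5.5, 1.54):.0f} min")   # ~81 min
```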

 

Authoring

 

Creating the interactive design for a DVD is a lot like building a web site. Menus are created which are linked to video assets. Clicking a menu button causes the player to jump from one point on the DVD (menu) to another (video track). Menus can be still frames or moving video, but must conform to the same constraints as any other video. If they don’t start out as video, they are turned into video in the final DVD build. By comparison a web site, CD-ROM or even DVD-ROM might have HTML-based menus of one size and QuickTime or AVI video assets of a different size. This isn’t allowed in a DVD-Video, because the DVD must be playable on a set-top DVD player as video. Motion menus must be built with loop points, since the longer the menu runs, the more space it will consume on the disk. Thirty seconds is usually a standard duration for a loop. A slight pause or freeze occurs in the video when the disk is triggered to jump back to the start of the loop. This happens while the DVD player’s head moves between two points on the disk’s surface.

 

Lay out your design on paper first. A flowchart is a great idea, because this will let you see how one or more menus connect to the videos in the most efficient manner. Although any type of flowcharting software program will work, it is often just as simple to do this on paper. On the other hand, a nicely printed visual flowchart goes a long way in explaining to a client how the viewer will navigate through a complex DVD. Links can be created as buttons or drop zones. A button is a graphic element added to a menu within the authoring tool, while a drop zone is a hyperlink area added to an imported video. You can create videos or animations to be used as menus and turn graphic or video portions into linkable “hot spots” by adding drop zones on top of the video.

 

Clicking on a button or drop zone activates a jump to another menu, a video track or a point within a video track (chapter point). Part of the authoring task is to define what actions occur when you click on the control keys of a standard DVD handheld remote. You must define what happens when the viewer clicks the Menu, Title or Return keys, as well as which direction the cursor travels when the arrow keys (up, down, left, right) are used. Although the degree of control over these parameters varies with different software applications, all of the contenders let you define the following options. You have to set the First Play – the file that plays when you first pop in the DVD. You also have to set the target destination for the end of each video file. This determines whether the video continues on to another video or jumps back to a menu – and if so, which one. Remember that if you have more than one menu, you will have to add buttons and navigation commands to go between the menus.

 

Finishing

 

Most authoring tools let you run a simulation of the DVD as you configure it. This lets you proof it to see if all the links work as intended. When the authoring is complete, the next step is to “build” the disk. This is an automatic process in which the application checks to see if your authoring has any errors and then “muxes” (multiplexes) the audio, video and menu files. The muxing stage is where the VOB (video object) files are created. You can see these files if you explore the folders of a non-encrypted DVD. If your DVD tool features built-in encoding, that step generally occurs during the building process.

 

When a project is built, it can be saved to your hard drive as a disk image or burned to a recordable DVD. Although there are a handful of different recordable DVD formats, the DVD-R general-purpose disks seem to be the most universal. These are typically rated at 4.7 GB, but actually hold about 4.3 GB of data. If you require more capacity, you will have to advance to a dual-layer or dual-sided DVD (9 GB and higher). These cannot be burned on a DVD recorder and not all authoring tools handle these formats. If they do, you can prepare a disk image to be saved to a DLT tape, which would be sent to a commercial DVD replication facility. When you burn a DVD-R on a recording drive, be aware that the burning speed is based on the rating of the media. 1X blanks will only burn at the 1X speed, 2X disks at 2X speed and so on. If you saved the build as a disk image, you will be able to use a disk burning utility like Roxio and burn multiple DVD copies. Although encoding and recording times vary, it is not out-of-line for a one-hour DVD to require as much as four hours from the time you start the build until the DVD has finished the recording step.

 

We’ve only scratched the surface; so to learn more, check out the white papers and technical documents that are available on the Pioneer, Sonic Solutions and Mitsui web sites. A great reference is DVD Production, A Practical Resource for DVD Publishers by Philip De Lancie and Mark Ely, available from Sonic Solutions. User guides and tutorials – especially those included with Adobe Encore DVD and Apple DVD Studio Pro 2 – are also quite helpful. Start small and move up from there. Soon you, too, can be a master of the DVD world.

 

© 2004 Oliver Peters

Proper Monitoring

Just like good lighting and camerawork are some of the fundamentals of quality production, good monitoring provides some of the same important building blocks for post-production. Without high quality video and audio monitors, as well as waveform monitors and vectorscopes, it is impossible to correctly assess the quality of the video and audio signals with which you are working. Few if any instruments reveal the degradation of signals traveling through the system better than the human eyes, ears and brain. You cannot read out the amount of compression applied to a digital file from some fancy device, but the eye can quickly detect compression artifacts in the image.

 

Such subjective quality evaluations are only valid when you are using professional, calibrated monitoring that shows you the good with the bad. The point of broadcast grade video monitors and studio grade audio monitors is not to show you a pleasing picture or great sounding mix, but rather to show you what’s actually there, so that you can adjust it and make it better. You want the truth and you won’t get that from a consumer video monitor or TV or from a set of discount boombox speakers.

 

Video Monitors

 

Let’s start with the picture. A proper post-production suite should have a 19 or 20-inch broadcast grade monitor for video evaluation. Smaller monitors can be used if budgets are tight, but larger is better. Most people tend to use Sonys, but there are also good choices from Panasonic and Barco. In the Sony line, you can choose between the BVM (broadcast) and the PVM (professional, i.e. “prosumer”) series. The BVMs are expensive but offer truer colors because of the phosphors used in the picture tube, but most people who work with properly calibrated PVM monitors are quite happy with the results. In no case at this point in time would I recommend flat panel monitors as your definitive QC video monitor – especially if you do any color-correction with your editing.

 

The monitor you use should have both component analog (or SDI) and composite analog feeds from your edit system. Component gives you a better image, but most of your viewers are still looking at the end product (regardless of source) via a composite input to a TV or monitor of some type. Frequently things can look great in component and awful in composite, so you should be able to check each type of signal. If you are using a component video feed, make sure your connections are solid and the cable lengths are equal, because reduced signal strength or unequal timing on any of the three cables can result in incorrect colorimetry when the video is displayed. This may be subtle enough to go unnoticed until it is too late.

 

Properly calibrated monitors should show a true black-and-white image, meaning that any image which is totally B&W should not appear to be tinted with a cast of red, blue or green. Color bars should appear correct. I won’t go into it here, but there are plenty of resources which describe how to properly set up your monitor using reference color bars. Once the monitor is correctly calibrated, do not change it to make a bad picture look better! Fix the bad image!

 

Scopes

 

Video monitors provide the visual feedback an editor needs, but waveform monitors and vectorscopes provide the technical feedback. These are the editor’s equivalent to the cinematographer’s light meter. The waveform monitor displays information about luminance (brightness, contrast and gamma) while the vectorscope displays information about color saturation and hue. The waveform can also tell you about saturation but not hue. Most nonlinear editing applications include software-based scopes, but these are pretty inadequate when compared to the genuine article. Look for products from Tektronix, Leader, Videotek or Magni. Their products include both traditional (CRT-based) self-contained units, as well as rack-mounted modules that send a display to a separate video monitor or computer screen. Both types are accurate. Like your video display, scopes can be purchased that take SDI, component analog or composite analog signals. SDI scopes are the most expensive and composite the least. Although I would recommend SDI scopes as the first choice, the truth of the matter is that monitoring your composite output using a composite waveform monitor or vectorscope is more than adequate to determine proper levels for luma and chroma.

 

Audio Monitors

 

Mixers, power amps and speakers make up this chain. It’s possible to set up a fine NLE suite with no mixer at all, but most people find that a small mixer provides a handy signal router for the various audio devices in a room. When I work in an AES/EBU-capable system (digital audio), I will use that path to go between the decks and the edit system. Then I only use the mixer for monitoring. On the other hand, in an analog environment, the mixer becomes part of the signal chain. There are a lot of good choices out there, but most frequently you’ll find Mackie, Behringer or Alesis mixers. Some all-digital rooms use the Yamaha digital mixers, but that generally seems to be overkill.

 

The choice of speakers has the most impact on your perception of the mix. You can get powered speakers (no separate amp required) or pair passive speakers with a separate power amp. Power amps seem to have less of an effect, but good buys are Yamaha, Crown, Alesis, Carvin and Hafler. The point is to match the amp with the speakers, so that you provide plenty of power at low volumes and efficiently drive the speaker cones.

 

Picking the right speaker is a very subjective choice. Remember that you want something that tells you the ugly truth. Clarity and proper stereo imaging are important. Most edit suites are near-field monitoring environments, so huge speakers are pointless. You will generally want a set of two-way speakers, each with an eight-inch woofer and a tweeter. There are plenty of good choices from JBL, Alesis, Mackie, Behringer, Tannoy and Eastern Acoustic Works, but my current favorite is from SLS Loudspeakers (Superior Line Source). In the interest of full disclosure, I own stock in SLS and have them at home, but they are truly impressive speakers sporting innovative planar ribbon technology in their tweeter assembly.

 

VU Metering

 

VU, peak and PPM meters are the audio equivalent to waveform monitors. What these meters tell you is often hard to interpret because of the industry changes from analog to digital processing. An analog VU scale places desirable audio at around 0 VU, with peaks hitting no more than +3 dB. Digital scales have a different range: 0 dBFS is the absolute top of the range, and anything that hits 0 or higher results in harsh digital distortion. The equivalent spot on this range to analog’s 0 VU is –12, –14 or –20 dBFS. In effect, you can have up to 20 dB of headroom before distortion, as compared to analog’s 3 to 6 dB of headroom. The reason for the ambiguity in the nominal reference value is that different digital systems calibrate their VU scales differently. Most current applications set the 0 VU reference at –20 dBFS.
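
 

To make the scale relationship concrete, here is a minimal sketch of my own, assuming a 16-bit signal and a –20 dBFS calibration point, that shows how a digital meter arrives at a dBFS reading and how that maps back to the familiar 0 VU mark.

import math

FULL_SCALE = 32767            # peak value of a 16-bit digital signal
REFERENCE_DBFS = -20.0        # common digital calibration point for analog 0 VU

def peak_to_dbfs(sample_peak):
    # dBFS readings are negative, reaching zero only at absolute full scale
    return 20.0 * math.log10(abs(sample_peak) / FULL_SCALE)

def dbfs_to_vu(dbfs):
    # Shift the digital reading so that -20 dBFS reads as 0 VU
    return dbfs - REFERENCE_DBFS

level = peak_to_dbfs(3277)            # a peak at about 10% of full scale
print(round(level, 1))                # roughly -20.0 dBFS ...
print(round(dbfs_to_vu(level), 1))    # ... which corresponds to about 0 VU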

 

Mixing with software VUs can be quite frustrating because the meters are instantly responsive to peaks. You see more peaks than a mechanical, analog VU meter would ever show you. As a result, it is quite easy to end up with a mix that is really too low when you go to a tape machine. I generally fix this by adjusting the VTR’s input level until the level sits where it should for the VTR. Then I may change the level of the reference tone at the head of my sequence to match the proper level as read at the VTR. This may seem backwards, but it’s a real-world workaround that works quite well.

 

© 2004 Oliver Peters

Moving From Point A to Point B – Understanding File Exchange

If you are new to the world of video and largely work inside of a closed environment – such as starting and finishing your DV project all inside Final Cut Pro – then you have yet to wrestle with the details of exchanging your project information with another piece of software or computer editing system. Before nonlinear editing, video post-production was divided between offline and online editing. This paradigm was borrowed from the film world: editing workprint and lab finishing. The first part was the “messy” creative, decision-making side of editing; the latter was the “manufacturing” of a high-quality product. Meticulous handwritten record-keeping permitted decisions made in the first phase to be used as the basis for the second.

 

EDL

 

This film workflow was adopted by videotape editors at about the same time that SMPTE timecode was standardized. Timecode permitted a method of electronically tracking edits and moving offline editing data into the online (or finishing) stage of the process. Edit Decision Lists (EDLs) have been with us since the 1970s. They are largely unchanged, yet are still the most frequently used “lowest common denominator” for moving information between various edit systems and software applications. EDLs are simply text documents laid out in columns: edit event number, source reel number, type of edit (audio or video), transition type (cut, wipe, key or dissolve), transition duration, and the ingoing and outgoing edit points – in timecode – of the source and master tape(s). This type of information is commonly referred to as “metadata”.

 

Each edit event line is read horizontally and represents a change made to the master tape or sequence at that location on the master. Other information may also be displayed, like slow-motion speeds and editor’s comments. The source reel identification can be up to eight alphanumeric characters, depending on the EDL format (CMX, Sony, Grass Valley). Finally, drop-frame and non-drop-frame timecode can be mixed. Since EDLs can only accurately interpret one track of video and four tracks of audio, they are extremely limited in carrying the information generated by most modern nonlinear edit systems, like Avid or Final Cut Pro. In addition, almost none of the visual effects or audio mixing and panning data can be included within the EDL format.
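
 

To show just how simple the format is, here is a sketch of my own that pulls apart one hypothetical cut event written in the CMX 3600 style. The sample line and reel name are purely illustrative; real lists vary a bit in spacing and options between the CMX, Sony and Grass Valley flavors.

# One hypothetical CMX 3600-style event: event number, reel, track (V),
# transition (C = cut), then source in/out and record in/out timecodes.
event_line = "001  TAPE0001 V  C        01:02:10:15 01:02:14:20 01:00:00:00 01:00:04:05"

fields = event_line.split()
event = {
    "event": fields[0],
    "reel": fields[1],           # up to eight alphanumeric characters
    "track": fields[2],          # V, A, A2, B, etc.
    "transition": fields[3],     # C, D, W, K
    "source_in": fields[4],
    "source_out": fields[5],
    "record_in": fields[6],
    "record_out": fields[7],
}
print(event["reel"], event["record_in"], "->", event["record_out"])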

 

We have temporarily hit a point where the offline/online editing strategy isn’t always needed, so you might ask, “What’s the big deal?” This is only for the short term. For instance, as the industry moves towards greater use of HD, it is once again impractical to start and finish your editing in the same quality and resolution as the final master. A far more cost-effective solution might be to do your creative editing with DV copies of your HD tapes and then go to an HD facility to faithfully recreate that edit in HD. It would be nice to have as much as possible of your audio, effects and color-correction data translate from one system to the other. Since EDLs are very limiting, what are the other choices?

 

OMF

 

A few years into the development of its product line, Avid Technology tried to get other manufacturers to jump on the bandwagon and participate in the development of Open Media Framework Interchange (OMFI or OMF). OMF files were designed to carry forward more of the information from the edited sequence than EDLs had been able to do. This meant that the vertical track hierarchy of NLEs could be preserved, as well as more of the effects metadata. In addition to just information, an OMF file could also contain embedded audio and/or video media files, permitting an entire edited sequence to be moved in a single archive file.

 

As promising as this sounds, it proved less effective than hoped. First of all, the information about how an NLE program applies a specific effect, like a DVE move, tends to be proprietary metadata specific to that maker; therefore, effects information included in an OMF file is largely generic and still has to be recreated on the second system. For instance, the offline editor might have applied a certain DVE move to a clip on track two. The OMF file received by the online editor shows a DVE effect with a default value, but not the parameters originally created. Certain effects, like a “film-style” dissolve, might not come across at all. For all of their promise, OMF files have really turned into glorified “super-EDLs”.

 

There are a few areas where OMF files have proved a great workflow enhancement. Wes Plate – a veteran Avid and After Effects user – founded Automatic Duck with his programmer father. Together they have developed a number of interchange utilities to move files between various NLEs and After Effects. The most notable of these is Automatic Composition Import, which is designed to move layered sequences from Avid or Final Cut Pro into After Effects, while still maintaining the layer hierarchy. Effects have to be recreated in After Effects, but it’s a great starting point. Furthermore, if the NLE and After Effects reside on the same computer or even the same network, no media actually has to be moved. OMF and ACI permit an OMF file to be exported with media linked rather than embedded. This means that After Effects will simply find the location of the Avid or Final Cut Pro media files on the drives without any duplication of media. There is a lot of interest in moving sequence and layer information from an NLE into After Effects, so other manufacturers, like Sony, Media 100 and Pinnacle, each have their own routines for sending clips from their timelines straight into an After Effects project.

 

OMF has also become the de facto interchange format in the audio world. It is easier to send OMF audio data between competing systems, while retaining information about track patching, mix levels and panning, since audio mixing tends to be a bit less proprietary among manufacturers. The idea is to get the basics across, since the final mix inside the digital audio workstation will most likely use a number of customized, effect-filter plug-ins that weren’t available to the offline editor anyway.

 

A good offline editor can do a lot to build and clean up audio before sending it to the mixer, which ultimately saves everyone time. Since audio file sizes are fairly small, the embedded multi-track audio for the timeline of a one-hour TV show can fit on a single CD-ROM. It is important to understand that simply creating an OMF for audio isn’t the only step. DAWs like Pro Tools don’t inherently read OMF files, but rely instead on a utility called DigiTranslator (often an extra cost option) to turn OMF files into Pro Tools session files. You also have to be aware of the native audio file format used by your NLE. This varies with PCs (which normally use WAV audio files) and Macs (which tend to use AIFF). Older Macs also used SDII (Sound Designer II) as an audio file format. Most modern PC and Mac DAWs can work with either AIFF or WAV files, but generally PCs can’t deal with the older SDII files.
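
 

Since sample rates and bit depths also have to line up on the mixer’s end, it pays to check the media before the handoff. Here is a minimal sketch, using Python’s standard wave module and a hypothetical file name, that reports the basic properties of an exported WAV track:

import wave

def describe_wav(path):
    # Report channel count, bit depth and sample rate before the OMF handoff
    with wave.open(path, "rb") as wav:
        channels = wav.getnchannels()
        bits = wav.getsampwidth() * 8
        rate = wav.getframerate()
    print(f"{path}: {channels} ch, {bits}-bit, {rate} Hz")

describe_wav("show_audio_track1.wav")   # hypothetical exported audio track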

 

AAF

 

Since neither EDLs nor OMFs have provided the magic solution, the industry has moved on to the Advanced Authoring Format (AAF). AAF is intended to provide a far greater ability to share effects parameters and other metadata among different products. The organization supporting AAF has a larger number of companies backing the standard than OMF ever did. In order to get everyone to sign on, the specifications were designed to permit a portion of the code to stay proprietary for each company, if it so chooses, so AAF still won’t be totally universal. This means that Avid or Quantel or someone else might still have certain effects that can only be read perfectly by their own systems. For instance, one company’s AAF file might be 90% open and 10% proprietary – another’s might be totally open. It all depends on what they feel will further their position in the marketplace.

 

Like OMF, AAF has the ability to embed audio and video media, as well as metadata, into the file. The media subset of AAF is called the Material eXchange Format (MXF), which is essentially the file format used by Sony in its MPEG-IMX products. The promise is that in the near future you might be able to shoot with your Sony camcorder and upload that video as a data file into a Sony (or other) server. You then might edit the news story using an Avid NewsCutter that could read the MXF file straight from the server without any need to digitize or convert files. In the coming year, Avid products are supposed to be able to deal with both OMF and MXF media files, once the MXF specs are fully standardized. It is important to remember that neither MXF nor OMF is a compression scheme; rather, they are “wrapper” formats that tell an editing system, server, VTR, etc. what type of file it is, what properties it has and what to do with it.

 

QuickTime

 

It isn’t specifically the same type of interchange format as OMF or AAF, yet Apple’s QuickTime has proven to be one of the better methods for moving files between various applications and platforms. QuickTime can support various frame sizes, frame rates and compression schemes, up to and beyond uncompressed HD. Even if a computer can’t fluidly play a high data rate QuickTime movie, it can still pass it along. QuickTime, like the other media formats, is a media “wrapper” that encompasses various codecs (encoder/decoder-compressor/decompressor software), which define the size, rate and compression (if any) of that media file. As such, QuickTime can be used as an origination, destination or interim file format, depending on your needs. Any computer with the same installed codec can display, copy, convert and otherwise manipulate the file.
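
 

To see the “wrapper” idea in action, here is a sketch of my own that walks the top-level atoms of a QuickTime movie. Every atom starts with a 32-bit size and a four-character type; the picture and sound data sit inside the mdat atom, while moov holds the metadata that describes them. (The file name is hypothetical, and extended 64-bit atom sizes are ignored to keep the sketch short.)

import struct

def list_top_level_atoms(path):
    # Each QuickTime atom begins with a 4-byte big-endian size and a 4-byte type code
    with open(path, "rb") as movie:
        while True:
            header = movie.read(8)
            if len(header) < 8:
                break
            size, atom_type = struct.unpack(">I4s", header)
            print(atom_type.decode("ascii", "replace"), size)
            if size < 8:              # sizes 0 and 1 signal special cases not handled here
                break
            movie.seek(size - 8, 1)   # skip over this atom's payload to the next one

list_top_level_atoms("master_cut.mov")   # hypothetical QuickTime file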

 

The Present State

 

All of these methods offer some great workflow opportunities, but for the time being, it is still best to stay within the same family of products for the most universal file interchange. If you offline edit on Final Cut Pro, online on a Final Cut Pro system, too. If you offline on an Avid, stay with an Avid Media Composer, Symphony or Avid|DS for your online.

 

There are some changes to look for, such as from Apple, which is using XML to open “hooks” for others to use in exporting project information. The folks at Automatic Duck are busy using this to make a path from Final Cut Pro to Quantel’s editing products. In the future, this will permit you to offline your HDTV movie using Final Cut Pro with DV media, but then conform the HD master using Quantel’s advanced eQ or iQ editing platforms – hopefully with a direct transfer of all effects and color-correction metadata from your Apple to the Quantel system.
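
 

As a simplified illustration of what such an XML export might look like – the element names below follow my reading of Final Cut Pro’s xmeml interchange format, trimmed to the bare bones, so treat them as approximate – a few lines of Python are enough to pull the cut points back out of the file:

import xml.etree.ElementTree as ET

# A trimmed-down, illustrative example of an xmeml-style sequence export
SAMPLE = """
<xmeml version="1">
  <sequence>
    <name>Rough Cut 3</name>
    <media><video><track>
      <clipitem>
        <name>Scene12_Take4</name>
        <in>45</in><out>180</out>
        <start>0</start><end>135</end>
      </clipitem>
    </track></video></media>
  </sequence>
</xmeml>
"""

root = ET.fromstring(SAMPLE)
for clip in root.iter("clipitem"):
    name = clip.findtext("name")
    start = clip.findtext("start")
    end = clip.findtext("end")
    print(f"{name}: frames {start}-{end} on the timeline")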

 

Even with the advances of AAF, XML or something else, I’d be willing to bet that the thirty-year-old EDL standard still won’t go away anytime soon!

 

© 2003 Oliver Peters