Script-based video editing started with Ediflex, but it really came into its own when Avid created script integration as a way to cut dialogue-driven stories, like feature films, in Media Composer. The key ingredient is a written script or a transcription of the spoken audio. This is easy with a feature that’s been acted according to defined script lines, but much harder with something freeform, like a documentary or news interview. In those projects, you first need a person or service to transcribe the audio into a written document – or simply cut without it and hunt around when you look for that one specific sentence.
Modern technology has come to the rescue in the form of artificial intelligence, which has enabled a number of transcription services to offer very fast turnaround times from audio upload to a transcribed, speech-to-text document. Several video developers have tapped into these resources to create new transcription services/applications, which can be tied into several of the popular NLE applications.
Transcription for the three “A” companies
One of these new products is SpeedScriber, a transcription application for macOS and its companion service developed by Digital Heaven, which was founded by veteran UK editor and plug-in developer Martin Baker. To start using SpeedScriber, install the free SpeedScriber application, which is available from the Apple Mac App Store. The next steps depend on whether you just want to create transcribed documents, captioning files, or script integration for Avid Media Composer, Adobe Premiere Pro CC, or Apple Final Cut Pro X.
If you just want a document, or plan to use Media Composer or FCPX, then no other tools are required. For Premiere Pro CC workflows, you’ll want to download a panel installer for macOS or Windows from the SpeedScriber website. This integrates as a standard Premiere Pro panel and permits you to import transcription files directly into Premiere Pro. The SpeedScriber application enables roundtripping to/from Final Cut using FCPXML.
First, let’s talk about the transcription itself. It should generally be clip-based and not from edited timelines, unless you just want to document a completed project or for captioning. When you launch SpeedScriber for the first time, you’ll need to create an account. This will include 15 minutes of free transcription time. The file length determines the time used. Billing for the service is based on time and is tiered, ranging from $.50/minute (30/60/120 minutes) down to $.37/minute (6,000 minutes). Minutes are pre-purchased and don’t expire.
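To put those numbers in perspective, here is a minimal Python sketch of the arithmetic. It only uses the two per-minute rates and the 15 free minutes quoted above. The function names are my own, the intermediate pricing tiers aren't modeled because they aren't listed here, and I'm assuming that free minutes are used up before purchased ones and that clip lengths aren't rounded.

```python
# Rates quoted in this review (USD per minute). Intermediate tiers are not
# listed in the article, so only these two endpoints are modeled.
RATE_SMALL_BUNDLE = 0.50    # 30/60/120-minute bundles
RATE_LARGEST_BUNDLE = 0.37  # 6,000-minute bundle
FREE_MINUTES = 15           # included with a new account


def bundle_price(minutes, rate):
    """Price of a pre-purchased bundle at a flat per-minute rate."""
    return round(minutes * rate, 2)


def billable_minutes(clip_minutes, free_minutes=FREE_MINUTES):
    """Minutes that must be covered by purchased time.

    Assumption: free minutes are consumed first and clip lengths are
    charged exactly as given (no rounding rule is stated in the article).
    """
    return max(0, sum(clip_minutes) - free_minutes)


if __name__ == "__main__":
    print(bundle_price(120, RATE_SMALL_BUNDLE))     # 120-minute bundle
    print(bundle_price(6000, RATE_LARGEST_BUNDLE))  # 6,000-minute bundle
    print(billable_minutes([8, 12]))                # two clips on a new account
```

In other words, the largest bundle saves 13 cents per minute over the smallest ones, which only matters if you expect to transcribe a lot of footage.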
Once your account is ready, drag-and-drop or point the application to the file to import. Disable any unwanted audio channels, so that the transcription is based on the best audio channel within the file. Even if all channels are equal, disable all but one of them. Set up the number of speakers and language format, such as British, Australian, or American English. According to Baker, support for five European languages will be added in version 1.1. The service will automatically determine when speakers change, such as between an interviewer and the subject. It’s hard for the system to determine this with great accuracy, so don’t expect these speaker changes to be perfect.
The transcription experience
Accuracy of the transcription can be extremely good, but it depends on the audio quality that you’ve supplied. A clean interview track – well mic’ed and in a quiet room – can be dead-on with only a few corrections needed. Slower speakers who enunciate well result in greater accuracy. On the other hand, having several speakers in a noisy environment, or a very fast speaker with a heavy accent, will require a lot of correction – enough so that manual transcription might be better in those cases.
Once SpeedScriber has completed its automatic transcription, you can play the file to proof it and make any corrections to the text that are required. It’s easy to type corrections to the transcription within the SpeedScriber text editing window. When done, you can export the text in a number of different formats. I ran a test clip of a clear-spoken woman with well-recorded audio. She had a slight southern drawl, but the result from SpeedScriber was excellent. It also did a good job of ignoring speech idiosyncrasies, such as frequent “ums”. This eight-minute test clip only required about a dozen text corrections throughout.
If the objective is script integration into an NLE, then the process varies depending on brand. Typically such integration is clip-based, although multi-cam clips are supported. However, it’s tougher when you try to connect the transcription to a timeline. For example, I like to do cutdowns of interviews first, before transcribing, and that’s not really how SpeedScriber works best. In version 1.1, FCPX compound clips will be supported, so segments can be cut before transcription.
Media Composer is easy, because it already has a Script Integration feature. Import the text file that was exported from SpeedScriber as a new script into Media Composer and link the video clip to it. If you purchased Avid’s ScriptSync, then its speech analysis function will automatically line up the clip to sentences within the script. But if you didn’t purchase this add-on, simply add sync points manually.
With Premiere Pro, select the clip, open the SpeedScriber panel and from it, import the corresponding transcription. The text appears in the Speech Analysis section of that clip’s metadata display. It will actually be embedded into the media file so that the clip can be moved between projects complete with that clip’s transcription. You can view and use this text display to mark in/out by words for accurate script-based selections. When you import the script and link it to a multi-cam clip, synced clip, or sequence, text will show up as markers and can be viewed in the markers panel. Premiere Pro is the only integration that can easily update existing speech metadata or markers. So you can start editing with the raw transcript and then update it later when corrections have been made. However, when I tested transcriptions on an edited sequence instead of a clip, it locked up Premiere Pro, requiring a Force Quit. Fortunately, when I re-opened the recovered project, the markers were there as expected.
The most straightforward approach seems to be its use with Final Cut Pro X. According to Baker, “This is the first Digital Heaven product with broad appeal by supporting Avid and Premiere Pro. But FCPX has ended up having the deepest integration due to the ability to drag-and-drop the Library, which was introduced in 10.3. So with roundtripping, SpeedScriber rebuilds the clip’s timeline without any need to export. Another advantage of the roundtripping is that SpeedScriber can read the audio channel status from the dropped XML, which is important for getting the best accuracy.”
There’s a roundtrip procedure with FCPX, but even without it, simply export an FCPXML from SpeedScriber. Import that into your Final Cut Pro X Library. The clip will then show a number of keyword entries corresponding to line breaks. For each keyword entry, the browser notes field will display the associated text, making it easy to find any dialogue. Plus, these entries are already marked as selections. When clips are edited into the sequence (an FCPX Project), the timeline index enables these notes to be displayed under the Tags section.
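For the curious, the keyword-plus-notes idea can be sketched outside of FCPX as well. The Python example below searches keyword notes for a phrase in a small, made-up XML fragment. The fragment is a simplified illustration, not a complete FCPXML file and not actual SpeedScriber output. The clip name, keyword values, and note text are invented, and the `start`, `value`, and `note` attribute names reflect my understanding of how FCPXML describes keywords.

```python
import xml.etree.ElementTree as ET
from fractions import Fraction

# Hypothetical, heavily simplified fragment for illustration only.
SAMPLE = """<fcpxml version="1.6">
  <library>
    <event name="Interviews">
      <asset-clip name="Interview A" offset="0s" duration="480s">
        <keyword start="0s" duration="12s" value="Speaker 1"
                 note="Tell me how the project began."/>
        <keyword start="12s" duration="30s" value="Speaker 2"
                 note="It started with a phone call in 2015."/>
      </asset-clip>
    </event>
  </library>
</fcpxml>"""


def to_seconds(time_value):
    """Convert a rational time string such as '3600/2400s' or '12s' to seconds."""
    return float(Fraction(time_value.rstrip("s")))


def find_lines(xml_text, phrase):
    """Return (clip name, start in seconds, note) for notes containing phrase."""
    root = ET.fromstring(xml_text)
    hits = []
    for clip in root.iter("asset-clip"):
        for keyword in clip.findall("keyword"):
            note = keyword.get("note", "")
            if phrase.lower() in note.lower():
                start = to_seconds(keyword.get("start", "0s"))
                hits.append((clip.get("name"), start, note))
    return hits


if __name__ == "__main__":
    for name, start, note in find_lines(SAMPLE, "phone call"):
        print(name, start, note)
```

Inside FCPX you never need to do this by hand, since the browser search and the timeline index already cover it. The sketch simply shows why text stored per line break makes dialogue so easy to locate.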
SpeedScriber shows tremendous potential to improve the efficiency of many spoken-word projects, like documentaries. Half the battle is trying to figure out the story that you want to tell, so having the text right in front of you makes this job easier. Applying modern technology to this challenge is refreshing and the constantly improving accuracy of these systems makes it an easy consideration. SpeedScriber is one of those tools that not only gets you home earlier, but will give you the assurance that you can easily find that clip you are looking for in the proverbial haystack of clips.
©2017 Oliver Peters