Digital Anarchy Transcriptive automatic transcription plugin - NAB 2017

Digital Anarchy’s Transcriptive plugin automates transcription from within Premiere Pro, transcribing the audio from your sequence and providing you with searchable timecoded text with a claimed accuracy rate of 96%.

Better transcription through machine learning

Once you’ve decided what you want transcribed, the audio from the sequence is encoded and sent to one of two online speech engines: IBM’s Watson or Speechmatics, a service based in Cambridge in the UK. The remote system then does all the heavy lifting of transcription and sends a completed transcript back to the plugin.

Ongoing costs but no subscription

There is a cost to use the speech engines, but it’s considerably less than a human-based transcription service. Watson is free for the first 16 hours and then $0.02 / min, while Speechmatics charges a flat rate of $0.07 / min. Digital Anarchy say a transcription typically comes back ready to use in around 25% of real time, so an hour of audio would be transcribed in about 15 minutes.
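To put those rates in context, here is a rough Python sketch of the arithmetic. It only uses the figures quoted above. How Watson actually counts its free 16 hours (per account, per month or otherwise) isn't stated, so the `minutes_already_used` parameter and all function names are our own assumptions, not anything from Digital Anarchy.

```python
# Rough estimate only: rates are the ones quoted in this article.
# Function names and the way free minutes are tracked are assumptions.

WATSON_FREE_MINUTES = 16 * 60        # "free for the first 16 hours"
WATSON_CENTS_PER_MIN = 2             # then $0.02 / min
SPEECHMATICS_CENTS_PER_MIN = 7       # flat $0.07 / min
TURNAROUND_FRACTION = 0.25           # "around 25% of real time"


def watson_cost_cents(minutes, minutes_already_used=0):
    """Cost in cents, assuming the free allowance is a running total."""
    free_left = max(0, WATSON_FREE_MINUTES - minutes_already_used)
    billable = max(0, minutes - free_left)
    return billable * WATSON_CENTS_PER_MIN


def speechmatics_cost_cents(minutes):
    """Cost in cents at the flat per-minute rate."""
    return minutes * SPEECHMATICS_CENTS_PER_MIN


def estimated_turnaround_minutes(minutes):
    """Expected wait, using the claimed 25%-of-real-time figure."""
    return minutes * TURNAROUND_FRACTION


if __name__ == "__main__":
    interview = 60  # a one-hour interview
    print("Watson (cents):", watson_cost_cents(interview))
    print("Speechmatics (cents):", speechmatics_cost_cents(interview))
    print("Wait (minutes):", estimated_turnaround_minutes(interview))
```

Working in whole cents avoids floating-point rounding when you add up a lot of short clips.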

If you’ve ever had to sit and transcribe your own interviews you’ll appreciate what a timesaver that is, and if you’ve had to use an external transcription service, Transcriptive is much cheaper too. And because you’re only exchanging audio and text files, you need far less bandwidth than you would to send a video file with a timecode burn-in, as you probably would for regular transcription.

Rapid turnaround, flexible scripts

In the example in the video above, two minutes of transcription only took 20 seconds to come back to Premiere, logged with timecode down to individual words.

Once the script is back with you, it can be edited and searched to find specific instances of a particular word or phrase very quickly. You can export it as plain text, as clip or sequence markers or, eventually, as an SRT file for captioning. Again, if you’ve ever had to time subtitles manually on even a short piece, Transcriptive has the potential to save a lot of time.
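For the curious, the two ideas in that paragraph (word-level search and SRT export) can be sketched in a few lines of Python. This is purely illustrative: Transcriptive’s internal data format isn’t public, so the `(start, end, text)` word tuples and the function names here are hypothetical. Only the SRT timestamp layout (`HH:MM:SS,mmm`) follows the caption format itself.

```python
# Illustrative sketch only. The word tuples and function names are
# hypothetical; they are not Transcriptive's actual data model or API.
from typing import List, Tuple

Word = Tuple[float, float, str]  # (start_seconds, end_seconds, text)


def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, e.g. 00:00:23,329."""
    ms_total = round(seconds * 1000)
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def find_word(words: List[Word], query: str) -> List[str]:
    """Return the start timecode of every occurrence of a word."""
    wanted = query.lower()
    return [
        srt_timestamp(start)
        for start, _end, text in words
        if text.lower().strip(".,!?") == wanted
    ]


def to_srt_cue(index: int, words: List[Word]) -> str:
    """Build one SRT cue spanning the given run of words."""
    start, end = words[0][0], words[-1][1]
    text = " ".join(w[2] for w in words)
    return f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"


if __name__ == "__main__":
    sample = [(0.0, 0.4, "So"), (0.4, 0.9, "people"), (2.0, 2.5, "People.")]
    print(find_word(sample, "people"))
    print(to_srt_cue(1, sample[:2]))
```

Because every word carries its own start time, a search is just a filter over the list, which is presumably why the plugin can jump straight to a spoken word on the timeline.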

For their v1.0 release Digital Anarchy are supporting English, Spanish and Japanese. A wide range of other languages are supported by the Watson and Speechmatics engines, and the team plan to add French and Italian next.

Transcriptive Pricing and availability

Digital Anarchy intend to start a beta test of Transcriptive in the first week of May, and the beta will be fully functional for those taking part. Alternatively, they’re offering a pre-order price of $199 US until May 1, saving $50 on the regular price of $249 US. Initially it will only be available as a plugin for Premiere Pro, but by the time they get to v1.0 they hope to have a standalone version available as well, so that you can use the service in conjunction with FCP X and Avid.

We tested the process

To test Transcriptive’s abilities, we showed up unannounced at their booth the morning after filming the interview above and had them run it through their plugin. In about a minute we had a text transcription of the whole interview, and in another minute an SRT caption file with timecode. It was fairly accurate, even though we had filmed in a noisy convention hall.

Here’s part of the unedited SRT transcription of the interview:

6
00:00:23,329 --> 00:00:27,020
– My name is Tor Olson I work as a software Q8 here at
– digital anarchy.

7
00:00:27,649 --> 00:00:38,240
– So I break the stuff that we’re about the show and yet
– transfer in to show off a transcript of our transcription
– software plug in works to transcribe the audio of
– whatever’s in your sequence right here.

8
00:00:39,200 --> 00:00:50,509
– And so in this case we have solid sequent Erse all the clip
– right here just a single one but it would work just as well
– for a fully Cup final at it with all the butchered up at it
– down here.

9
00:00:51,500 --> 00:00:58,339
– And the way it works is that all the audio that’s in here
– is going to get encoded and shot off to one of two speech
– engines that we have.

10
00:00:58,369 --> 00:01:03,469
– So we’re utilizing both I-beams Watson or speechmatics.

11
00:01:03,770 --> 00:01:14,690
– Now each have their pros and cons speechmatics is seven
– cents for every minute that you want to transcribe and
– Watson the first 16 hours or so is free to transcribe.

12
00:01:15,019 --> 00:01:17,299
– And after that it’s going to be two cents per minute.

13
00:01:17,840 --> 00:01:22,280
– There’s no subscription fee that comes with it is just the
– minutes that you logged through those systems.

14
00:01:22,790 --> 00:01:36,319
– So once you choose your service you would go to choose and
– subscribe or transcribe your audio either using Wattson or
– speechmatics for this two minute sequence that we have here.

15
00:01:36,590 --> 00:01:43,310
– It took about 20 seconds to get kicked back to us and that
– includes the encode the transcription and it being sent
– back to us.

16
00:01:43,670 --> 00:01:49,210
– And once that’s done it’s logged with all the time code all
– the way down to the individual word.

17
00:01:49,319 --> 00:01:55,429
– So if I choose any word you’ll notice that it’s skipping
– through the timeline to where that script is spoken.

18
00:01:55,459 --> 00:01:56,600
– So if I choose.

19
00:01:56,689 --> 00:01:59,090
– So people in our script will play so people.

20
00:02:01,599 --> 00:02:03,260
– Oh that is incredibly awesome.

21
00:02:03,489 --> 00:02:08,450
– Yeah I’ll follow along the text of this full fully
– functional editor as well.

22
00:02:08,810 --> 00:02:14,299
– Keyboard shortcuts to be able to capitalize text add
– punctuation as need be.

23
00:02:16,039 --> 00:02:21,530
– I can also get rid of it if I wanted to edit the text
– itself.

24
00:02:21,590 --> 00:02:28,189
– I would hit enter and I can edit the scripts to whatever I
– want it to be.

25
00:02:32,939 --> 00:02:41,560
– This and this will handle multiple languages we’re going to
– be able to have it enabled for on the 1.0 release for sure.

26
00:02:41,879 --> 00:02:47,609
– English Spanish and Japanese boat I think along the line
– are probably after the 1.0 release.

27
00:02:47,879 --> 00:02:51,099
– We’re going to try to implement other languages I know.

28
00:02:51,169 --> 00:02:55,009
– We tried them lots and also support French Italian.

29
00:02:55,500 --> 00:03:01,560
– So we want to bring those down the line as well and pretty
– much any kind of thing you can do it on a timeline.

30
00:03:01,919 --> 00:03:03,470
– It’ll work with that audio.

31
00:03:03,569 --> 00:03:04,949
– Oh yeah yeah exactly.

32
00:03:06,000 --> 00:03:17,220
– And so once I’m done with all my edits I would just go down
– to my export options and I could choose the SRT file for
– your standard YouTube or Facebook uploads that kind of
– thing.

33
00:03:18,030 --> 00:03:19,139
– We support a few others.

34
00:03:20,340 --> 00:03:25,290
– You can also export that is a plain text if you’re kind of
– old school and you like to print out and do the annotation
– that way.

35
00:03:26,189 --> 00:03:31,349
– My favorite option though is the ability to export as clip
– or sequence markers.

36
00:03:31,740 --> 00:03:39,149
– And so the beauty behind that is you’ll see all the markers
– and now shown up with all of our transcript text.

37
00:03:39,659 --> 00:03:46,379
– And that means that if I go into our Marcuse panel I know
– he mentioned something about trees.

38
00:03:47,159 --> 00:03:50,639
– I can immediately look up all the time code for all those
– mentions of trees.

39
00:03:50,759 --> 00:03:51,240
– So it does.

40
00:03:51,270 --> 00:03:55,919
– I don’t have to look through a big long list of all the
– text I get to searching immediately.

41
00:03:57,389 --> 00:03:59,250
– This is unbelievable to me.

42
00:04:00,270 --> 00:04:06,150
– And this works over the Internet where you use send it off
– to a super computer and it comes back.

43
00:04:06,469 --> 00:04:07,650
– But it’s very quick to do it.

44
00:04:07,949 --> 00:04:08,339
– Yes.

45
00:04:08,400 --> 00:04:16,129
– Like I said before this two minute sequence the processing
– the flac file that gets sent out and returned all that took
– 20 seconds just for that two minute clip.

46
00:04:18,029 --> 00:04:22,349
– Your colleague told me earlier that it’s about 25 percent
– of real time.

47
00:04:22,709 --> 00:04:26,310
– So in one hour interview you could get back in 15 minutes.

48
00:04:26,399 --> 00:04:27,330
– Yeah exactly yeah.

49
00:04:28,350 --> 00:04:38,669
– And what kind of bandwidth do you need is this hungry thing
– to do this or is it pretty easy just to send the audio.

50
00:04:38,759 --> 00:04:39,949
– It’s pretty easy to send audio.

51
00:04:39,990 --> 00:04:50,189
– I mean we’re just doing this off of a simple jet pack a
– little hotspot on and it’s able to export out and get back
– to us in that amount of time with that.

52
00:04:51,089 --> 00:04:55,550
– So it’s it’s not too much of a heavy load and this is a
– plug in for Premiere.

53
00:04:55,620 --> 00:04:58,169
– And how much does it cost and when all that be available.

54
00:04:58,350 --> 00:05:01,110
– So we’re going to start to open beta it within the next two
– weeks.

55
00:05:01,470 --> 00:05:12,569
– It is currently only available for Premier Pro but probably
– by the 1.0 release we’re thinking to have a standalone plug
– in as well so that we’ll be able to do all of these SRT
– file exports all of that.

56
00:05:12,600 --> 00:05:21,180
– So you can bring into your final cut have it etc. so that
– will be enable the price point is to 59.

57
00:05:21,480 --> 00:05:23,459
– But on the end the beats show floor if you buy it.

58
00:05:24,000 --> 00:05:26,699
– I think until May 1st its when 99.

59
00:05:27,660 --> 00:05:28,139
– Okay great.

60
00:05:28,350 --> 00:05:30,870
– And then what versions of Premiere will it work.

61
00:05:32,040 --> 00:05:35,230
– It should be fully functional as far back as a 6.

62
00:05:36,329 --> 00:05:36,810
– Very cool.

63
00:05:36,839 --> 00:05:37,740
– Thanks so much talk.

64
00:05:38,089 --> 00:05:38,389
– Thank you.

By Elliot Smith

After working on the pictures and multimedia desks at The Guardian, Elliot now makes videos for production company Happen Digital and types words for Newsshooter.com.
