10 (+1) transcription and captioning tools compared

Transcription and subtitling have long been part of our company’s services, but until recently they were requested only occasionally. Things have changed, however: with the growing success of streaming platforms and the widespread availability of videos on the internet, particularly on social media, the demand for transcription and subtitling services has grown and now takes up an increasingly large share of our time.

We now deal with these tasks on a daily basis, and this has led us to survey the market in search of an easy-to-use and efficient solution that can meet our needs. We have used or tested several alternatives to find a web-based tool that offered the features we needed most, namely automatic transcription, subtitle formatting according to predetermined rules, and the ability to translate subtitles. In addition, the chosen tool had to provide streamlined folder management and task assignment.

In this article we will compare some of the major subtitling and transcription platforms that we considered for our daily work.

1. Amberscript

Let’s start our roundup with one of the industry’s best-known tools, Amberscript, which boasts Netflix, Disney+ and Warner Bros. among its partners and offers video transcription and subtitling, both manual and automatic.

Of all the video subtitling sites, this is probably the most streamlined, simple, and intuitive to navigate. By choosing the automatic transcription or subtitling feature, you get access to a web-based editor for editing and reviewing the transcript and subtitles. It includes the typical features, such as sentence highlighting and find-and-replace.

It also has a mobile app which, despite lacking a text editing function, gives Amberscript added value: it lets you record audio directly within the app and, thanks to synchronization with the desktop version, everything recorded or uploaded from the app is also available on your computer. This feature should not be underestimated if you want to save time moving files between platforms. Amberscript complies with the GDPR and with the ISO 27001 and ISO 9001 certification standards.

Features

Accuracy: 85%
Languages: 39+
File delivery time: less than 10 minutes for a transcription, about an hour for subtitles
Multispeaker: manual (several voices are recognized, but names are not automatically matched)
Native burn-in: no
Subtitle customization: no
Free trial: yes

Pros

  • Clear, very user friendly editor.
  • Transcription relies on artificial intelligence, so it “learns” the words used in the audio to reduce errors.

Cons

  • Not always accurate; it requires some refinement work.
  • You cannot upload videos from the cloud.
  • Several users have experienced difficulties transcribing audio with heavy accents or technical terms.

Costs

Pay as you go: 10€ per hour
Subscription: 40€ per month (with a limit of 5 hours of audio/video upload)

2. Happy Scribe

We subscribed to Happy Scribe through the AppSumo deal in 2020, and it is the tool we used the longest, a couple of times a month. Its automatic transcription was fine for our purposes, although in languages other than English the results leave room for improvement. The editor is quite simple and gets the job done.

However, we ran into problems when we needed more features, and on two different occasions customer service was slow or silent. The first time, we asked for an upgrade to the Business version, which had just been launched and had many interesting features for us. Their replies were always slow and vague: they never told us the price of the upgrade and, without explaining why, simply told us that the upgrade was impossible with our subscription. They didn’t offer any alternative either, except for activating a reduced version of Workspaces, which in any case was insufficient for our needs and caused more trouble than it was worth.

The second time was even more serious, because in the middle of a project we had a problem with a transcription that we had to deliver urgently and we were in danger of getting stuck because customer service responded two days later. In the meantime, I had already re-uploaded the video and redone the automatic transcription (paying the corresponding costs twice), and then I had to copy and paste the portion of transcription that was already edited. At this point we decided not to waste any more time with it and switched to another tool.

Features

Accuracy: up to 85%
Languages: 62
Turnaround: half the length of the audio
Multispeaker: manual
Native burn-in: no
Subtitle customization: yes
Free trial: yes

Pros

  • There are no limits to the number and size of the files you can upload.
  • It provides subtitle translation.
  • It offers free manual transcription and subtitling tools.

Cons

  • Users find that the interface could be more intuitive.
  • Customer service is slow and inconclusive.

Costs

0.20€ per minute for automatic transcription and subtitling

3. Sonix

Sonix is the tool we are currently using for most of our needs. It is a fairly solid and complete system, and the offer they made us for the business version was cost-effective. Automatic transcription is accurate enough and requires relatively little intervention. In addition to automatic transcription and subtitling, it will soon offer real-time transcription services.

Its editor supports typical operations such as highlighting and find-and-replace, and it also lets you cross out text to exclude it from the subtitles shown on screen without removing it from the editor altogether. A Sonix feature that has helped us on several occasions is the ability to create short, shareable video clips, which is useful when you want someone else to listen to a specific scene in a video and give you their take on it.

Although we mainly use it to create subtitles in the same language (closed captions), on a few occasions we used its machine translation feature and its quality is moderately good. In our experience, you get better results if you translate the video transcript (after reviewing and editing it, if necessary) instead of the subtitles, because that way the translation engine can better understand the general meaning of the text. Otherwise, it tends to regard each subtitle as a stand-alone sentence and does not interpret the context correctly.

Generally speaking, we chose it because it allows you to manage a team of collaborators and to assign the captioning of a given video to a given person, albeit in a somewhat basic way. Customer service responds promptly (bearing in mind they're based in California), and they even fixed a small bug in the app when we reported it to them.

Features

Accuracy: 95-97%
Languages: 35+
Turnaround: it depends on file quality and duration
Multispeaker: both manual and automatic
Native burn-in: premium feature
Subtitle customization: premium feature
Free trial: 30 minutes

Pros

  • It allows you to add notes and comments directly into the transcript.
  • Multiple custom dictionaries can be added, client- and content-based (premium feature).
  • It allows you to combine different audio tracks (premium feature).
  • It allows uploading files from the cloud.
  • It provides subtitle translation (premium feature).
  • An API is available.

Cons

  • It does not fare well with poor quality audio or when the audio contains more than one language.
  • The timeline scrolls jerkily during the playback of some videos.

Costs

Pay as you go: 10$ per hour.
Premium: 22$ per month. This subscription does not include any hours of transcription; it only lowers the cost per hour to 5$, so you still need to pay for the duration of every file you upload.
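
The two Sonix plans imply a break-even point. The following is a minimal sketch of that arithmetic, assuming only the prices quoted in this article (10$ per hour pay as you go; 22$ per month plus 5$ per hour on Premium) and ignoring taxes or any custom business offer. The function names are ours and purely illustrative.

```python
# Prices as quoted in this article; they may have changed since.
PAYG_PER_HOUR = 10.0        # pay as you go, $ per hour of audio
PREMIUM_MONTHLY_FEE = 22.0  # Premium subscription, $ per month
PREMIUM_PER_HOUR = 5.0      # discounted rate with Premium, $ per hour


def payg_cost(hours: float) -> float:
    """Monthly cost without a subscription."""
    return PAYG_PER_HOUR * hours


def premium_cost(hours: float) -> float:
    """Monthly cost with Premium: fixed fee plus the discounted hourly rate."""
    return PREMIUM_MONTHLY_FEE + PREMIUM_PER_HOUR * hours


def break_even_hours() -> float:
    """Hours per month at which both plans cost the same.

    Solves PAYG_PER_HOUR * h = PREMIUM_MONTHLY_FEE + PREMIUM_PER_HOUR * h.
    """
    return PREMIUM_MONTHLY_FEE / (PAYG_PER_HOUR - PREMIUM_PER_HOUR)


if __name__ == "__main__":
    print(f"Break-even: {break_even_hours():.1f} hours per month")
    for hours in (2, 10):
        print(f"{hours} h: pay as you go {payg_cost(hours):.2f}$, "
              f"Premium {premium_cost(hours):.2f}$")
```

Under these assumptions, Premium only pays off if you upload more than about four and a half hours of audio or video per month.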

4. Amara

We tried Amara when we were looking for an alternative to Happy Scribe. Unlike other tools, it does not provide an automatic transcription and subtitling service but only an editor for manual subtitling and subtitle translation.

It has limited features compared to other options on the market. However, it is free and an ideal solution for charitable projects. In its free version, all uploaded videos are publicly accessible, and anyone with access to a video can edit its subtitles. Paid subscriptions, on the other hand, let you keep the files you upload private and include audio captioning, although this is only available for English.

Pros

  • The editor is free.
  • There is no limit to the duration of uploaded files.

Cons

  • The editor is plain and could be developed more.
  • In the free version, you cannot keep videos private.

Costs

Amara public: free.
Amara plus: 24$ per month

5. Rev

Rev is a suitable solution for those looking for an audio transcription service that does not struggle with technical terms. In the editor, text can be highlighted, underlined, or crossed out, and you can add comments and notes. It also offers automatic captioning for Zoom, which lets you download the transcript of the meeting at the end of the call.

Of all the solutions in this article, it is probably the least known; in fact, many users point out the company’s low investment in marketing. Despite this, the service quality is excellent, and reviews are numerous and very positive.

Features

Accuracy: it depends on audio quality, but they claim 90%+
Languages: unspecified
Turnaround: 5 minutes
Multispeaker: yes
Native burn-in: yes
Subtitle customization: yes
Free trial: 45 minutes

Pros

  • It allows you to add custom dictionaries.
  • It is very fast.

Cons

  • It may struggle with different accents.

Costs

For automatic transcription: 0.25$ per minute.
For Zoom captioning: 20$ per host.

6. Ooona

If you prefer old school tools, then Ooona is the most comprehensive web-based tool on the market for manually creating subtitles and transcriptions. Generally speaking, it is an all-around solution for those working in the fields of localization and audiovisual translation.

Given that it works completely in manual mode, Ooona has very precise control features for timeline movement, on-screen placement of subtitles, and other advanced functions such as scene change detection, burn-in, and encoding. The interface also allows subtitle translation (always manual) starting from the original template.

In contrast to other similar tools, Ooona is not an all-in-one solution but is modular as it consists of several separate tools, so that it can be adapted to each company’s needs. In my opinion, however, it is a rather confusing choice, which is also reflected in the pricing structure.

In fact, seven different tools are available, each with its own price. For some of the tools there is a cheaper standard version, which includes the basic features, and a more expensive pro version, which instead gives access to more advanced features, such as scene change detection. The same goes for bundles, which group various tools together at a discounted price. As if the pricing structure wasn't complicated enough, some tools are also offered on a weekly subscription basis, while others have only monthly, six-month or yearly plans.

Pros

  • Powerful
  • Flexible
  • Customizable
  • Easy to use

Cons

  • The modular structure can be confusing, and each tool requires a separate subscription.
  • It is more expensive than its competitors.

Costs

Standard Subscription: 84$ per month. It includes: Create, Translate, Standard Subtitle Converter, Compare.
Pro Subscription: 248$ per month. It includes: Create Pro, Translate Pro, Standard Subtitle Converter, Compare, Transcribe, Burn & Encode.

7. Subtitle Next

Subtitle Next is a professional platform for creating subtitles, dubbing and captioning, both offline and online. It has very advanced features and is fully customizable.

It can create real-time subtitles, even for Zoom, and, starting from version 5.11, subtitles can be automatically generated and translated from the audio, both for live events (such as Twitch live streams) and for offline content. Since this is such a complex piece of software, it is recommended for advanced users. In terms of pricing, it has different types of monthly subscriptions and one-time bundles.

Pros

  • You can upload videos from any link, even directly from social media, without having to download them.
  • All languages are supported.
  • It is currently one of the few tools that generates translated subtitles directly from the audio, without any additional steps.

Cons

  • It is not suitable for beginners or students, as the user interface is too complex.
  • Very expensive.

Costs

Monthly subscriptions:
SubtitleNEXT Explorer: 70€ per month. This is the basic package for creating subtitles and translations.
Live Subtitles Bundle: 140€ per month. It includes: SubtitleNEXT Explorer, SubtitleNEXT Live Option (for real-time subtitling) and SubtitleNEXT Spark (for showing real-time subtitles on external devices).
SubtitleNEXT Vlogger: 180€ per month. It includes creation, editing and formatting of subtitles, display of subtitles on videos or live streams, real-time subtitles for live streams on YouTube or other platforms, and automatic transcription into one language.
One-time licenses:
Novice: 350€, for beginners and freelancers, it includes all essential features to complete a project.
Explorer: 900€, a complete package for working with texts that require synchronization.
Expert: 2950€, for professionals and organizations, it includes the entire suite of products.

8. Konch

Konch is a very simple automatic transcription tool, suitable for institutions and education.

It has a transcript editor where notes, comments and highlights can be added, and it is built to allow teams to collaborate smoothly on the same transcript. The resulting text can be automatically translated. It complies with GDPR regulations. It offers a 15-minute free trial.

Features

Accuracy: very high
Languages: 30+
Turnaround: a few minutes
Multispeaker: manual
Native burn-in: no
Subtitle customization: no
Free trial: 15 minutes

Pros

  • No-frills interface.
  • Automatic translation.
  • Inexpensive.
  • It complies with GDPR regulations.

Cons

  • No advanced functions.

Costs

Pay as you go: 8$ per hour of audio.
Subscription: 32$ per month (5 hours included, then 8$ per hour).

9. Media.io

Media.io is a very comprehensive tool for editing, converting and compressing audio and video, so it is particularly suitable for content creators of any kind. Its services also include automatically generated (and translated) subtitles and automatic transcription of audio or video. It is web-based, so it can be used on any platform (desktop and smartphone) and operating system.

Features

Accuracy: more than 90%
Languages: 11 for transcription, 90 for subtitles
Turnaround: a few minutes
Multispeaker: unspecified
Native burn-in: yes
Subtitle customization: yes
Free trial: 10 minutes

Pros

  • It does not apply watermarks.
  • It allows a high level of customization.
  • Extremely inexpensive.
  • Automatic translation of subtitles.

Cons

  • Not suitable for large amounts of content.
  • Generic video editing tool, not specific to subtitles.

Costs

3.95$ per month: 2 hours of subtitling and transcription, up to 100,000 characters of transcription.
6.66$ per month: 6 hours of subtitling and transcription, up to 200,000 characters of transcription.

10. Subly

Subly is another tool particularly recommended for creators. It offers automatic transcription and automatic subtitling (and translation), as well as the ability to resize videos according to the formats most used on social media and a tool for creating posts using text from the transcript.

Features

Accuracy: up to 98%
Languages: 67+
Turnaround: a few minutes
Multispeaker: premium feature
Native burn-in: yes
Subtitle customization: yes
Free trial: 7 days, 240 minutes.

Pros

  • It has additional features such as video resizing.
  • Creator friendly.
  • Very generous free trial.
  • Inexpensive.

Cons

  • Not always very accurate; users report errors such as missing punctuation and capitalization.
  • Not suitable for very long videos.

Costs

Pay as you go: 0.60$ per minute.
Pro Subscription: 17$ per month. It includes 100 minutes per month, 10 GB storage and 1 GB maximum upload size per file.
Premium Subscription: 39$ per month. It includes 240 minutes per month, 50 GB storage, 2 GB maximum upload size per file, captioning, and file uploads via URL or Google Drive.
Business Subscription: 218$ per month. It includes 1000+ minutes, 200 GB storage and 5 GB maximum upload size per file.
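
With four price points, it is not obvious which Subly option is cheapest for a given monthly volume. The sketch below compares them using only the figures listed above. It rests on two assumptions of ours: a subscription is considered only if its included minutes cover the whole volume (overage pricing is not stated here), and the Business plan's "1000+" is treated as exactly 1000 minutes. The `Plan` type and `cheapest_option` function are hypothetical names for illustration.

```python
from typing import List, NamedTuple, Tuple


class Plan(NamedTuple):
    name: str
    monthly_fee: float     # $ per month
    included_minutes: int  # minutes included per month


# Prices as quoted in this article; they may have changed since.
PAYG_PER_MINUTE = 0.60

PLANS: List[Plan] = [
    Plan("Pro", 17.0, 100),
    Plan("Premium", 39.0, 240),
    Plan("Business", 218.0, 1000),  # assumption: "1000+" treated as 1000
]


def cheapest_option(minutes: int) -> Tuple[str, float]:
    """Return (option name, monthly cost in $) for the cheapest choice.

    Assumption: a subscription qualifies only if it covers all the minutes,
    because overage pricing is not specified in the article.
    """
    options = [("Pay as you go", PAYG_PER_MINUTE * minutes)]
    options += [
        (plan.name, plan.monthly_fee)
        for plan in PLANS
        if plan.included_minutes >= minutes
    ]
    return min(options, key=lambda option: option[1])


if __name__ == "__main__":
    for minutes in (20, 100, 150, 300, 400):
        name, cost = cheapest_option(minutes)
        print(f"{minutes} min/month: {name} ({cost:.2f}$)")
```

Under these assumptions, pay as you go wins for very small volumes and again between the Premium and Business tiers, where the jump in the monthly fee is steep.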

A futuristic option

It is clear that, with so many options available on the market, speech-to-text technology is now well established. But as DALL-E, the artificial intelligence that can generate images from a text caption, has shown us, there is really no limit to what AIs can achieve. In fact, OpenAI itself, the creator of DALL-E, is also working on a system that promises to make all the software on this list obsolete: Whisper.

It is a neural network that promises speech recognition (in English) on a par with that of humans. But what’s really ground-breaking about this technology is that it promises to understand speech regardless of audio quality, which is currently the biggest obstacle for speech-to-text technology. This means that those who want to rely on automatic audio transcription will no longer have to worry about background noise, music and other interference, making speech-to-text technology not only more accurate but also more accessible.

Whisper is currently open-source and can therefore be downloaded by anyone, as long as they have a computer that can handle the performance the program requires. Whisper also promises to work smoothly with technical language and to perform multilingual transcriptions.
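
For readers who want to try it, here is a minimal sketch of how timed transcription segments could be turned into a standard SRT subtitle file. It assumes the segments are dictionaries with `start`, `end` (in seconds) and `text` keys, which is the shape returned by the open-source `whisper` Python package; the helper names `format_timestamp` and `segments_to_srt` are ours, and the Whisper call itself is left as a comment because it requires the package and a model download.

```python
from typing import Dict, Iterable


def format_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, remainder = divmod(total_ms, 3_600_000)
    minutes, remainder = divmod(remainder, 60_000)
    secs, millis = divmod(remainder, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def segments_to_srt(segments: Iterable[Dict]) -> str:
    """Build SRT text from segments with 'start', 'end' and 'text' keys."""
    blocks = []
    for index, segment in enumerate(segments, start=1):
        start = format_timestamp(segment["start"])
        end = format_timestamp(segment["end"])
        text = segment["text"].strip()
        blocks.append(f"{index}\n{start} --> {end}\n{text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    # Assumption: with the openai-whisper package installed, segments
    # would come from something like:
    #   import whisper
    #   segments = whisper.load_model("base").transcribe("audio.mp3")["segments"]
    # Here we use hand-written sample data instead.
    sample = [
        {"start": 0.0, "end": 2.5, "text": " Hello and welcome."},
        {"start": 2.5, "end": 4.0, "text": " Let's get started."},
    ]
    print(segments_to_srt(sample))
```

The output still needs the same review as any automatic transcript: line length, reading speed and segmentation rules are not handled by this sketch.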

But since Whisper is not yet a reality, or at least not for everyone, the only way to get the most out of automatic transcription is to take precautions when recording. If, on the other hand, you are looking for a 100% accurate transcription that can faithfully reflect every word regardless of overlaps, accents, and technical terms, no one can beat the expertise of experienced professionals for both transcriptions and subtitles.

Contact us without obligation if you need help captioning or transcribing your videos!

Copywriter, content writer and social media manager. Degree in Language Mediation from Roma TRE University, Master’s degree in Audiovisual Translation from ISTRAD and Digital Marketing and Communication Specialist from ITS Academy Machina Lonati.
Technical translator, project manager, mentor, and admirer of ingenuity. Founding member of Qabiria.

Chat to one of us

Let us know what you need by sending an email to hola@qabiria.com or by filling in the contact form. We guarantee a response within 24 hours, but usually we’re much faster.
