Hello Transcribe goes Multiplatform

31 Jul 2023

Introduction

Hello Transcribe 2.6 is now available on macOS, iPadOS and iOS 🙌. Get it on the App Store.

For 2.6 the goal was feature parity on all platforms, which is where we are now. Future releases will diverge to some extent (larger models and background tasks on Mac for example).

If you buy Pro on any platform you have Pro on all platforms, and saved results will sync across all devices via iCloud.

Implementation

Multiplatform

It’s not super easy to build a unified interface on all platforms, but I’ve managed it with a single SwiftUI codebase, single Xcode target, and not too many #if os(...) directives.
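For readers who haven't used them, here is a minimal sketch of what an #if os(...) directive looks like. The Platform enum and both functions are hypothetical names made up for illustration, not code from the app.

```swift
// Hypothetical enum naming the platforms this sketch distinguishes.
enum Platform: String, CaseIterable {
    case macOS, iOS, other
}

// Resolved at compile time: only the matching branch is built into the binary.
func currentPlatform() -> Platform {
    #if os(macOS)
    return .macOS
    #elseif os(iOS)
    return .iOS   // iPadOS builds also compile under os(iOS)
    #else
    return .other
    #endif
}

// Hypothetical example of a small platform-specific difference in UI text.
func importHint() -> String {
    switch currentPlatform() {
    case .macOS: return "Drag an audio file onto the window"
    case .iOS:   return "Tap to choose an audio file"
    case .other: return "Choose an audio file"
    }
}
```

Keeping the directive inside one small helper like this is one way to stop the #if blocks from spreading through the view code.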

I used NavigationSplitView as the top-level view, which works on iPhone, iPad and Mac, and standard toolbars on all platforms. It took some effort to understand the differences between macOS and iOS, especially with toolbars, sheets vs. windows, and settings. Often you get things for free on iOS that need extra work on macOS.
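As a rough sketch of that layout (not the app's actual code), a NavigationSplitView with a sidebar list, a detail pane and a standard toolbar item could look like this. Transcript, ContentView and SketchApp are hypothetical names, and the macOS-only Settings scene is an assumption about how the settings difference might be handled.

```swift
import SwiftUI

// Hypothetical model for a saved transcription result.
struct Transcript: Identifiable, Hashable {
    let id = UUID()
    var title: String
    var text: String
}

struct ContentView: View {
    @State private var transcripts = [
        Transcript(title: "Example", text: "Hello, world.")
    ]
    @State private var selection: Transcript.ID?

    var body: some View {
        // Collapses to a navigation stack on iPhone; shows columns on iPad and Mac.
        NavigationSplitView {
            List(transcripts, selection: $selection) { transcript in
                Text(transcript.title)
            }
            .navigationTitle("Transcripts")
            .toolbar {
                // .primaryAction lets each platform pick its usual toolbar position.
                ToolbarItem(placement: .primaryAction) {
                    Button {
                        transcripts.append(Transcript(title: "New", text: ""))
                    } label: {
                        Label("New", systemImage: "plus")
                    }
                }
            }
        } detail: {
            if let id = selection,
               let transcript = transcripts.first(where: { $0.id == id }) {
                ScrollView { Text(transcript.text).padding() }
            } else {
                Text("Select a transcript")
            }
        }
    }
}

// In a real project this struct would carry the @main attribute.
struct SketchApp: App {
    var body: some Scene {
        WindowGroup { ContentView() }
        #if os(macOS)
        // macOS gets a separate Settings window; iOS would use a sheet or in-app screen.
        Settings { Text("Settings").padding() }
        #endif
    }
}
```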

Using the microphone (and its permissions) works a little differently on macOS, and using frameworks is trickier, which led me to switch to a new audio framework for WhatsApp Ogg/Vorbis files. The upside is that more input audio file types are now supported.
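To give a flavour of the microphone difference, here is a hedged sketch under my own assumptions rather than the app's real code. AVCaptureDevice's authorization calls exist on both platforms, while AVAudioSession is not available on macOS, so that part sits behind a directive. Both function names are hypothetical. A sandboxed Mac app also needs the audio input entitlement, and both platforms need a microphone usage description in Info.plist.

```swift
import AVFoundation

// Hypothetical helper: asks for microphone access and reports the result on the main queue.
func requestMicrophoneAccess(_ completion: @escaping (Bool) -> Void) {
    switch AVCaptureDevice.authorizationStatus(for: .audio) {
    case .authorized:
        completion(true)
    case .notDetermined:
        // Shows the system prompt the first time it is called.
        AVCaptureDevice.requestAccess(for: .audio) { granted in
            DispatchQueue.main.async { completion(granted) }
        }
    default:
        // .denied or .restricted: the user has to change this in system settings.
        completion(false)
    }
}

// Hypothetical helper: iOS needs an active audio session before recording, macOS does not.
func prepareForRecording() throws {
    #if os(iOS)
    let session = AVAudioSession.sharedInstance()
    try session.setCategory(.record, mode: .default)
    try session.setActive(true)
    #endif
}
```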

CoreML

Caveat: The CoreML optimisation (+caching) step on macOS 13 takes a VERY long time. On my M1 Max it can take 20 minutes to optimise the “small” model, but on macOS 14 it takes 60 seconds.

I’m looking forward to the Sonoma release.

CoreML is now disabled by default on all platforms because even though it IS much faster (on iOS specifically), there are some surprising performance issues during once-off caching.
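One simple way to model an opt-in setting like this is a flag stored in UserDefaults, which reads as false until the user turns it on. This is only a sketch; the type name and the "useCoreML" key are hypothetical and not taken from the app.

```swift
import Foundation

// Hypothetical settings wrapper; the key name is an assumption for illustration.
struct TranscriptionSettings {
    static let useCoreMLKey = "useCoreML"
    let defaults: UserDefaults

    var useCoreML: Bool {
        // bool(forKey:) returns false when nothing is stored, so CoreML is off by default.
        get { defaults.bool(forKey: Self.useCoreMLKey) }
        nonmutating set { defaults.set(newValue, forKey: Self.useCoreMLKey) }
    }
}
```

With this shape, a user who accepts the once-off caching cost can switch the flag on, and everyone else stays on the non-CoreML path.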

Conclusion

Building a multiplatform app in SwiftUI wasn’t as easy as I expected, but it’s not a major pain either.

macOS is great for transcribing big files, especially leaving it running, so the workflow will be updated to support that.

The next step for all platforms is supporting the larger models (for macOS definitely, for iOS I’m not so sure) and trying to make CoreML usable on macOS. So basically a rework of the underlying Whisper model selection/configuration.

BONUS: Before I start with the models… video transcription is coming in the next week (in fact it is already available on macOS with drag & drop, but it’s untested).

Follow me on Twitter or X or whatever it’s called for updates.