Hello everyone. This is Yujong from the Hyprnote team (https://github.com/fastrepl/hyprnote).
We built OWhisper for 2 reasons:
(Also outlined in https://docs.hyprnote.com/owhisper/what-is-this)
(1) While working on on-device, realtime speech-to-text, we found there wasn't any practical tooling for downloading and running these models.
(2) We also got frequent requests for a way to plug custom STT endpoints into the Hyprnote desktop app, just like you can with OpenAI-compatible LLM endpoints.
The (2) part is still a work in progress, but we've spent some time writing docs, so skimming them should give you a good idea of what it will look like.
For (1) - You can try it now. (https://docs.hyprnote.com/owhisper/cli/get-started)
  brew tap fastrepl/hyprnote && brew install owhisper
  owhisper pull whisper-cpp-base-q8-en
  owhisper run whisper-cpp-base-q8-en
If you're tired of Whisper, we also support Moonshine :)
Give it a shot (owhisper pull moonshine-onnx-base-q8)
We're here and looking forward to your comments!
I was actually integrating some Whisper tools yesterday and was wondering if there was a way to get a streaming response; it would be nice if you could.
I'm on Linux, so I don't think I can test out owhisper right now, but is that possible?
Also, it looks like the `owhisper run` command gives its output as a TUI. Is there an option for plain-text output so that we can just pipe it to other programs? (Maybe just `kill`/`Ctrl+C` to stop the recording and finalize the words.)
Same question for streaming: is there a way to get streaming text output from owhisper? (It looks like you said you expose a Deepgram-compatible API. I had a quick look at the API docs, but I don't know how easy it is to hook into it and get nice streaming text while speaking.)
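For reference, consuming a Deepgram-style streaming socket usually looks roughly like the sketch below. The URL, port, and path are my guesses based on Deepgram's own API shape, not anything from owhisper's docs; the message parsing assumes Deepgram's `channel.alternatives[].transcript` result format.

```python
# Hypothetical sketch of a Deepgram-compatible streaming client.
# The endpoint URL below is an assumption, not owhisper's documented default.
import asyncio
import json

def extract_transcript(message: str) -> str:
    """Pull the transcript text out of a Deepgram-style result message."""
    data = json.loads(message)
    alts = data.get("channel", {}).get("alternatives", [])
    return alts[0].get("transcript", "") if alts else ""

async def stream_transcripts(url: str = "ws://localhost:8080/v1/listen"):
    import websockets  # third-party: pip install websockets
    async with websockets.connect(url) as ws:
        # In a real client, a second task would push raw PCM audio
        # chunks with ws.send(chunk) while this loop reads results.
        async for message in ws:
            text = extract_transcript(message)
            if text:
                print(text, flush=True)  # plain text, easy to pipe
```

If the server really does speak Deepgram's protocol, getting plain streaming text while speaking should be about this much code.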
Oh yeah, and diarisation (available with a flag?) would be awesome; it's one of the things missing from most of the easiest-to-run tools I can find.