Re: API continuous Speech-To-Text -UPDATED

Unfortunately there is no ready-to-use API in Perl

No lie there. There isn't even a Perl module for alsa-lib.

Just to brainstorm: on Linux, at least, you can easily access the microphone. Assuming your PulseAudio settings (pavucontrol) are correct, you can get the microphone's audio with

arecord - | aplay -

This pipes whatever is coming in on the microphone or line-in (which must be set properly in alsamixer and pavucontrol) to the default sound output. So you can probably capture the microphone and pipe it to a streaming application like GStreamer. You would then need to have GStreamer send it to the server, and somehow get the text back.
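As a rough sketch of the capture half of that idea, the snippet below (in Python, only because it is short) starts arecord writing raw PCM to its standard output and reads it in fixed-size blocks. The sample format (16-bit mono at 16 kHz), the block size, and the function names are my assumptions for illustration. What happens to each block afterwards depends on the service, so that part is left out.

```python
import subprocess

# Assumed capture settings: quiet mode, headerless PCM, 16-bit little-endian,
# mono, 16 kHz. With no file name, arecord writes to standard output.
ARECORD_CMD = ["arecord", "-q", "-t", "raw", "-f", "S16_LE", "-c", "1", "-r", "16000"]


def iter_blocks(stream, block_size=3200):
    """Yield successive blocks of at most block_size bytes until the stream ends."""
    while True:
        block = stream.read(block_size)
        if not block:
            break
        yield block


def capture_blocks(cmd=ARECORD_CMD, block_size=3200):
    """Run arecord and yield raw audio blocks read from its standard output."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        yield from iter_blocks(proc.stdout, block_size)
    finally:
        # Stop the recorder when the caller stops iterating.
        proc.terminate()
        proc.wait()
```

At these settings 3200 bytes is about a tenth of a second of audio. Looping over capture_blocks() would hand you one block at a time to pass on to whatever does the streaming.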

I noticed the services seem to offer a choice between streaming the audio and uploading a file. A file upload would be a lot easier.
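For the file route, something like the following might be enough to produce a short clip to upload. It is only a sketch: the function names, the 16 kHz mono format, and the file name are assumptions, and the upload itself is not shown because it depends entirely on the service you pick.

```python
import subprocess


def build_record_cmd(path, seconds, rate=16000):
    """Return an arecord command line that records `seconds` of mono audio to `path`."""
    return [
        "arecord", "-q",
        "-t", "wav",              # write a WAV header so the file is self-describing
        "-f", "S16_LE",           # 16-bit little-endian samples
        "-c", "1",                # mono
        "-r", str(rate),          # sample rate in Hz
        "-d", str(int(seconds)),  # stop after this many seconds
        path,
    ]


def record_clip(path, seconds, rate=16000):
    """Record one clip; raises CalledProcessError if arecord fails."""
    subprocess.run(build_record_cmd(path, seconds, rate), check=True)
    return path
```

Calling record_clip("clip.wav", 5) should leave a five-second WAV file in the current directory, assuming the microphone is the default capture device.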

Check out this old app I uploaded way back when: ztk-v4l-video-bloger/recorder. It shows the basics of how to access the ALSA settings, turn the microphone on and off, and record. It may not work with your current hardware, but it contains some clues that may point you in the right direction.

To be honest, you might be best served by an HTML5 app written in JavaScript. It would handle the microphone, the upload and the text display.

UPDATE:

Also, check out this: speech recognition for linux. There is an interesting link there on GStreamer and speech recognition; it may just give you the solution.

I'm not really a human, but I play one on earth. ..... an animated JAPH

Re^2: API continuous Speech-To-Text -UPDATED by Anonymous Monk on Sep 02, 2018 at 08:55 UTC
Thank you for your insights. As I thought, it may be something beyond my reach. Using HTML5 + JavaScript is of course possible, as it is well documented, but it would mean dropping Perl. My second goal was to apply my legacy Perl scripts "live" to the transcribed text (regex, data visualization, etc.) and do computations on the incoming text. That would mean rewriting everything from scratch in JavaScript (a language I know only vaguely), which is, of course, not a nice thought.
Re^3: API continuous Speech-To-Text -UPDATED by zentara (Archbishop) on Sep 02, 2018 at 14:58 UTC
It may be within your reach if you dig hard enough. :-) If you look around, Python has modules and scripts which will do all the hard work for you. You can easily run Python from Perl, then use Perl to do your filtering and display. It might be time to learn a bit of Python. I might be tempted to try it myself, but the TensorFlow libraries are huge and complex and I have other fish to fry.
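To make the "run Python from Perl" idea a little more concrete, here is a minimal sketch of the Python side only. It assumes some recognizer hands you text fragments (that part is a placeholder, not a real library call) and prints each fragment as one flushed line, so a Perl parent could read it from a piped open such as open(my $fh, '-|', 'python3', 'transcribe.py'). The script name and function names are hypothetical.

```python
import sys


def emit(text, out=None):
    """Write one transcript fragment as a single line and flush it immediately."""
    out = sys.stdout if out is None else out
    # Collapse newlines and repeated spaces so one fragment is exactly one line.
    line = " ".join(text.split())
    if line:
        out.write(line + "\n")
        out.flush()  # without this, the reader on the pipe may see text only in bursts


def emit_all(fragments, out=None):
    """Emit every fragment from any iterable, e.g. a recognizer's results."""
    for fragment in fragments:
        emit(fragment, out)
```

The flush matters because output to a pipe is block-buffered by default, and the Perl side would otherwise get nothing until the buffer fills.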
Re^3: API continuous Speech-To-Text -UPDATED by RonW (Parson) on Sep 07, 2018 at 19:34 UTC
No need to rewrite your Perl programs in JavaScript. One option: you could create a simple web frontend that feeds the text transcript to your Perl programs.
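The server side of such a frontend could be very small. As a hedged sketch (in Python to match the other examples in this thread; the same thing is easy in Perl itself), the helper below pipes a piece of transcript into an existing program and returns its output. The function name and the script name in the usage note are hypothetical.

```python
import subprocess


def run_filter(cmd, text):
    """Send text to cmd's standard input and return what it writes to standard output."""
    result = subprocess.run(
        cmd,
        input=text,           # becomes the child's standard input
        capture_output=True,  # collect stdout and stderr instead of inheriting them
        text=True,            # work with str rather than bytes
        check=True,           # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

A request handler would then call something like run_filter(["perl", "legacy_filter.pl"], transcript) for each chunk of text the browser posts, so the legacy scripts stay untouched.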


	PerlMonks