PerlMonks  

Re: API continuous Speech-To-Text -UPDATED

by zentara (Archbishop)
on Sep 01, 2018 at 18:27 UTC ( [id://1221548]=note )


in reply to API continuous Speech-To-Text

Unfortunately there is no ready-to-use API in Perl

No lie there. There isn't even a Perl module for the alsalib.

Just as some brainstorming: on Linux, anyway, you can easily access the microphone. Assuming your PulseAudio pavucontrol settings are correct, you can get the microphone's audio with

arecord - | aplay -
This pipes whatever is coming in on the microphone or line-in (which must be set properly in alsamixer and pavucontrol) to the default sound output. So you can probably capture the microphone and pipe it to a streaming application like GStreamer. You would then need to have GStreamer send it to the server, and somehow get the text back.
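As a rough sketch of the capture step, here is one way to drive arecord from Perl and write a fixed-length WAV file instead of piping to aplay. The sub name arecord_command, the output file name, and the 16 kHz mono format are assumptions for illustration; check what your speech service actually expects.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Build the arecord command line for a fixed-length recording.
# 16 kHz, 16-bit mono is an assumption, not a requirement of any service.
sub arecord_command {
    my ($file, $seconds) = @_;
    return (
        'arecord', '-q',
        '-d', $seconds,     # stop after this many seconds
        '-f', 'S16_LE',     # 16-bit signed little-endian samples
        '-r', 16000,        # sample rate in Hz
        '-c', 1,            # one channel (mono)
        '-t', 'wav',        # write a WAV header
        $file,
    );
}

# Usage: perl record.pl clip.wav 5
if (@ARGV) {
    my ($file, $seconds) = @ARGV;
    $seconds //= 5;
    my @cmd = arecord_command($file, $seconds);
    system(@cmd) == 0 or die "arecord failed: $?\n";
    print "Wrote $file\n";
}
```

Passing the command as a list to system avoids the shell, so a file name with spaces is not a problem. The resulting file is what you would hand to a service's file-upload interface.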

I noticed the services seem to offer a choice between streaming the audio and uploading a file. A file upload would be a lot easier.

Check out this old app I uploaded way back when: ztk-v4l-video-bloger/recorder. It shows the basics of accessing the ALSA settings, turning the microphone on and off, and recording. It may not work with your current hardware, but it contains some clues that may point you in the right direction.

To be honest, you might be best served by an HTML5 Canvas app written in JavaScript. It will handle the microphone, the upload, and the text display.

UPDATE:

Also, check out this: speech recognition for linux. There is an interesting link there about using GStreamer, "Gstreamer and speech recognition"; it may just give you the solution.


I'm not really a human, but I play one on earth. ..... an animated JAPH

Replies are listed 'Best First'.
Re^2: API continuous Speech-To-Text -UPDATED
by Anonymous Monk on Sep 02, 2018 at 08:55 UTC

    Thank you for your insights. As I thought, it may be something beyond my reach. Using HTML5 + JavaScript is of course possible, as it is well documented, but it would mean dropping Perl. My second goal was to apply my legacy Perl scripts "live" to the transcribed text (regex, data visualization, etc.) and do computations on the incoming text. That would mean rewriting everything from scratch in JavaScript (a language I know only vaguely), which is, of course, not a nice thought.

      It may be within your reach if you dig hard enough. :-) If you look around, Python has modules and scripts which will do all the hard work for you. You can easily run Python from Perl, then use Perl to do your filtering and display. It might be time to learn a bit of Python. I might be tempted to try it myself, but the TensorFlow libraries are huge and complex and I have other fish to fry.
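A minimal sketch of that split, assuming some external program (Python or otherwise) prints one line of transcribed text at a time on its standard output. The script name transcribe.py in the usage line is hypothetical, and count_words is only a stand-in for whatever regex work your legacy scripts do.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the legacy processing: count each word in one line of text.
sub count_words {
    my ($line, $counts) = @_;
    $counts->{ lc $1 }++ while $line =~ /(\w+)/g;
    return $counts;
}

# Run an external transcriber and pass each output line to a callback
# as soon as it arrives.
sub read_transcript {
    my ($cmd, $callback) = @_;
    open(my $fh, '-|', @$cmd) or die "Cannot run @$cmd: $!";
    while (my $line = <$fh>) {
        chomp $line;
        $callback->($line);
    }
    close($fh) or warn "Transcriber exited with status $?\n";
}

# Usage: perl filter.pl python3 transcribe.py
if (@ARGV) {
    my %counts;
    read_transcript(\@ARGV, sub { count_words($_[0], \%counts) });
    printf "%-15s %d\n", $_, $counts{$_} for sort keys %counts;
}
```

For this to feel "live", the transcriber has to flush its output after every line; otherwise Perl only sees the text in large buffered chunks.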

      I'm not really a human, but I play one on earth. ..... an animated JAPH

      No need to rewrite your Perl programs in JavaScript.

      One option: you could create a simple web frontend that feeds the text transcript to your Perl programs.
