Thread Contributor: CovertBotNews - Building a Voice-Controlled Front End to IoT Devices
Building a Voice-Controlled Front End to IoT Devices

<div data-history-node-id="1339907" class="layout layout--onecol">
<div class="layout__region layout__region--content">


<div class="field field--name-node-author field--type-ds field--label-hidden field--item">by <a title="View user profile." href="" lang="" about="" typeof="schema:Person" property="schema:name" datatype="" xml:lang="">Michael J. Hammel</a></div>

<div class="field field--name-body field--type-text-with-summary field--label-hidden field--item"><p><em>Apple, Google and Amazon are taking voice control to the next level.
But can voice control be a DIY project? Turns out, it can. And, it isn't
as hard as you might think.</em></p>

Siri, Alexa and Google Home can all translate voice commands into
basic activities, especially if those activities involve nothing more
than sharing digital files like music and movies. Integration with
home automation is also possible, though perhaps not as simply as users
might desire—at least, not yet.

Still, the idea of converting voice commands into actions is intriguing
to the maker world. The offerings from the big three seem like magic in
a box, but we all know it's just software and hardware. No magic here.
If that's the case, one might ask: how could anyone build such a magic box?

It turns out that, using only one online API and a number of freely
available libraries, the process is not as complex as it might seem.
This article covers the <a href="">Jarvis
project</a>, a Java application for capturing
audio, translating to text, extracting and executing commands and
vocally responding to the user. It also explores the programming
issues involved in integrating these components to produce programmed
results; that is, no machine learning or neural networks are involved.
The end goal is for a selection of key words to trigger a call to a
specific method that performs an action.
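
As a rough illustration of that end goal, the following is a minimal sketch of key-word dispatch, not the Jarvis implementation itself. The class name `KeywordDispatcher`, its methods and the sample key words are all hypothetical, and it assumes the speech has already been converted to a text transcript.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Optional;
import java.util.function.Supplier;

/** Hypothetical sketch: maps key words found in transcribed text to handler methods. */
final class KeywordDispatcher {
    // Insertion order is preserved, so the first registered key word found wins.
    private final Map<String, Supplier<String>> handlers = new LinkedHashMap<>();

    /** Registers a handler to call when the key word appears in a command. */
    public void register(String keyword, Supplier<String> handler) {
        handlers.put(keyword.toLowerCase(Locale.ROOT), handler);
    }

    /** Runs the first matching handler and returns its reply, or empty if nothing matched. */
    public Optional<String> dispatch(String transcript) {
        // Split on anything that is not a letter or digit so punctuation is ignored.
        String[] words = transcript.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{N}]+");
        for (Map.Entry<String, Supplier<String>> entry : handlers.entrySet()) {
            for (String word : words) {
                if (word.equals(entry.getKey())) {
                    return Optional.of(entry.getValue().get());
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        KeywordDispatcher dispatcher = new KeywordDispatcher();
        // Each handler stands in for a method that would act on an IoT device.
        dispatcher.register("lights", () -> "Turning on the lights");
        dispatcher.register("time", () -> "It is " + java.time.LocalTime.now());
        System.out.println(dispatcher.dispatch("Jarvis, turn on the lights!")
                .orElse("Sorry, I did not understand."));
    }
}
```

The returned string stands in for the text that would be handed to a text-to-speech step for the vocal response. Matching whole words rather than substrings keeps a key word like "time" from firing on "sometimes".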

<h3>APIs and Messaging</h3>

Jarvis started life several years ago as an experiment to see if
voice control was possible in a DIY project. The first step was to
determine what open-source support already existed. A couple of weeks
of research uncovered a number of candidate projects in a variety of
languages. This research is documented in the docs/notes.txt file in
the source repository. The final choice of a
programming language was based on the selection of both a speech-to-text
API and a natural language processor library.

Since Jarvis was experimental (it has since graduated to a tool in
the <a href="">IronMan
project</a>), it started with a requirement that it be as
easy as possible to get working. Audio acquisition in Java is
straightforward and a bit simpler than in C or other languages.
More importantly, once audio is collected, it needs an API to convert it
to text. The easiest API found for this was <a href="">Google's Cloud
Speech REST API</a>. Since both audio collection and REST interfaces are
fairly easy to handle in Java, Java seemed the likely choice
of programming language for the project.
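
To give a feel for why Java suits both halves of this task, here is a minimal sketch that records PCM audio with the standard `javax.sound.sampled` package and wraps it in a JSON body for a speech-to-text request. It is not taken from the Jarvis source. The class `SpeechCaptureSketch` and its methods are hypothetical. The JSON field names (`encoding`, `sampleRateHertz`, `languageCode`, `content`) reflect the v1 `speech:recognize` request as I understand it, and should be checked against Google's current documentation. Sending the body over HTTP and supplying an API key are left out.

```java
import java.util.Base64;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.TargetDataLine;

/** Hypothetical sketch: capture microphone audio and build a recognize request body. */
final class SpeechCaptureSketch {
    static final float SAMPLE_RATE = 16000f; // assumed rate; a common choice for speech

    /** 16-bit, mono, signed, little-endian linear PCM. */
    public static AudioFormat captureFormat() {
        return new AudioFormat(SAMPLE_RATE, 16, 1, true, false);
    }

    /** Records roughly the given number of seconds from the default capture device. */
    public static byte[] record(int seconds) throws LineUnavailableException {
        AudioFormat format = captureFormat();
        int bytesPerSecond = (int) format.getFrameRate() * format.getFrameSize();
        byte[] pcm = new byte[bytesPerSecond * seconds];
        try (TargetDataLine line = AudioSystem.getTargetDataLine(format)) {
            line.open(format);
            line.start();
            int filled = 0;
            while (filled < pcm.length) {
                int n = line.read(pcm, filled, pcm.length - filled);
                if (n <= 0) break; // line stopped early; the rest stays silent
                filled += n;
            }
            line.stop();
        }
        return pcm;
    }

    /** Builds the JSON body; field names are an assumption about the v1 REST request. */
    public static String recognizeRequestBody(byte[] pcm, String languageCode) {
        String audio = Base64.getEncoder().encodeToString(pcm);
        return "{\"config\":{\"encoding\":\"LINEAR16\",\"sampleRateHertz\":" + (int) SAMPLE_RATE
                + ",\"languageCode\":\"" + languageCode + "\"},"
                + "\"audio\":{\"content\":\"" + audio + "\"}}";
    }

    public static void main(String[] args) {
        try {
            byte[] pcm = record(3);
            String body = recognizeRequestBody(pcm, "en-US");
            System.out.println("Request body is " + body.length() + " characters long");
        } catch (LineUnavailableException | IllegalArgumentException e) {
            // Thrown when no microphone is available or none supports the format.
            System.err.println("No usable capture device: " + e.getMessage());
        }
    }
}
```

The whole capture path needs only the JDK, and the REST side reduces to a Base64 string inside a small JSON document, which is the kind of simplicity the paragraph above describes.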

<div class="field field--name-node-link field--type-ds field--label-hidden field--item"> <a href="" hreflang="en">Go to Full Article</a></div>

