20
Nov
utter

Way back in February, a man by the name of Ben Randall demoed an amazing voice control app called "utter!" that he had started developing. The initial video (a whopping 22 minutes long) demonstrated some amazing capabilities - take a look for yourself:

But that was over 9 months ago, and aside from the initial release of the (very limited) alpha, we haven't heard much about the app, though Mr. Randall has kept interested parties updated via his very active XDA thread. In those 9 months, he's made very steady progress, and today he has released the first beta build.


Keep in mind, this isn't like any other voice control app we've seen. Most try to imitate personal assistants - handling basic tasks like using the calendar, phone, email, or messaging. Even services like Google Now try to provide info like driving times and sports scores - not so much a tool as a cool parlor trick. Utter!, though, is like your own mechanic for your phone.

Check out the new video, this time of the beta build in action:

Upon install, Utter uses Google Voice Search for speech recognition, so it's compatible with a variety of languages. Plus, there's a selection of other recognition providers available. There are quite a few default commands, too:

  • Set your name
  • Cancel commands
  • Current time, date, and weather
  • Email
  • Post to Facebook/Twitter
  • Battery (level, temperature, health, status, voltage)
  • Display, text, and call contacts (includes support for Skype). Can also read the last text.
  • Play music (default music player, Google Music, Spotify)
  • Launch/kill apps
  • Toggle: WiFi, Mobile Data, Bluetooth
  • Navigation
  • Set/create calendar events
  • Set alarms (in minutes or at a specified time)
  • Translate from English to German, French, Romanian, Spanish, Italian, or Polish.
  • Web search (Google, Bing, Yahoo), video/movie search
  • Add note/voice note
  • Adjust volume
  • Spell/define words
  • Manage clipboard
  • File management
  • Take screenshot
  • Root functions: Reboot, hot reboot, reboot recovery, reboot bootloader, set CPU governor
  • Location functions (define locations, ask where you are, have it remember where you parked your car)
  • Set password
  • Fly to x in Google Earth
  • Display/run tasks in Tasker
  • Shake/wave to wake Utter!
  • Launch any settings menu

But things get really interesting when you crack open the customization menu. You can create your own phrases, commands, nicknames, and so on - and even better, you can get new commands from the community.

We've spent some time with Utter! and I have to say, it has gobs of potential even without considering what kind of goodies may come from the community. It offers a whole host of capabilities that aren't found in any other voice control app I've ever seen. That said, it's in beta for a reason: the voice recognition isn't perfect and when using it, you get the impression that you're dealing with a robot with pre-defined inputs and outputs rather than a genuine AI.

Ben also took the time to answer some questions about his competition, what makes Utter! unique, his plans for the future of the app, and to tell us a little bit about himself:

Q) What do I think about all of the emerging competition?

A) It made me focus on core functionality, rather than the ‘virtual intelligence’ of the app. Some people love the idea of talking to an inanimate object, some people think it’s stupid – Some people will think it’s great for a couple of days, and then think it’s stupid.

I’m building a ‘community phrase database’ which users will be able to ‘opt-in’ to if they’d like to get random answers to questions that other users have entered. You’ll be able to submit these phrases directly from the device, so I hope the database will grow very quickly and be pretty quirky and fun too. I’ve just got to decide how it’s possible to moderate the content – It might not be!

In addition, users will be able to bolt on a ‘bot’ to utter!, which they can give a personality to and teach from scratch over time or start out with 40,000+ phrases and conversations.

For the above I need server space and hosting on a large scale. The implementation is pretty basic. It will appeal to some, but not all.

Some of the competition have greatly improved the standard of natural voice recognition and interpretation. However, if you glance at the comments in the Play Store, you’ll still very often get ‘Didn’t recognise anything, crap, uninstalled’.

The only way to resolve this is to allow users to configure the commands using the words that are returned from the recognition provider, not what they actually said! This is the route I chose to take – letting the user improve the recognition for themselves, rather than trying the impossible (when working alone) of understanding everything.

There’s a ‘replace words’ feature in utter! so that if for some reason every time a user asks what the time is, the result comes back as ‘where’s the line’ you can enter ‘where’s the line’ and the replacement of ‘what’s the time’ so utter! converts this before analysing the content for commands.

This can be done for any commands – If you prefer to say ‘whack on bluetooth’ then you can add that phrase and link it to ‘turn on bluetooth’. However you naturally request something, you can link it to the base command.
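The two-step idea described above (correct known misrecognitions first, then map the user's own phrasing onto a base command) can be sketched roughly as follows. This is only an illustration in Python, not utter!'s actual code: the table names, the `normalise` function, and the lower-casing step are all assumptions, and the entries are simply the examples Ben gives.

```python
# Hypothetical user-configured tables (names and structure are assumptions);
# the entries are the examples mentioned in the interview.
REPLACEMENTS = {"where's the line": "what's the time"}
ALIASES = {"whack on bluetooth": "turn on bluetooth"}


def normalise(recognised: str) -> str:
    """Map the text returned by the recognition provider onto a base command."""
    text = recognised.strip().lower()
    # Step 1: swap known misrecognitions before any command analysis happens.
    for wrong, right in REPLACEMENTS.items():
        text = text.replace(wrong, right)
    # Step 2: if the whole phrase is a user alias, return the linked base command.
    return ALIASES.get(text, text)


if __name__ == "__main__":
    print(normalise("Where's the line"))
    print(normalise("whack on bluetooth"))
```

Running the replacements before the alias lookup mirrors the order Ben describes, where the conversion happens "before analysing the content for commands".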

The same applies for any language supported by Google Voice Search. There is a widget that allows the user to start the recognition in their native language and also the language of the voice engine too. They can customise the ‘How can I help you’ intro to their own language and then link any commands or phrases in their own language to the English command.

It will take some configuration from the user, but after gradually tweaking, there’s no reason why they shouldn’t end up with a flawless recognition experience, in any language. A couple of tweaks a day, in a month it will be a personalised beast!

Language plugins will become available, but English is hardcoded into the algorithms and will take some time and assistance to complete. The users translating the app themselves in the meantime isn’t ideal, but possible – and with a bit of work can be usable in all 42 languages! I think that’s how many are supported?

Moving on….

The newly emerged voice assistants seem to continue to try and emulate Siri – Quite why they do this instead of focussing on what the Android Platform can do, is a little beyond me!

Q) What makes utter! different?

A) The customisation is the biggest difference. You can create custom questions, answers and conversions. The content can contain any sound effect by typing se~burp (or whatever it is called) and utter! will check the external storage for that sound. It will also dynamically populate the current content of Tasker Variables when %VARNAME is used.
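As a rough illustration of how a custom response containing `se~` sound-effect markers and `%VARNAME` placeholders might be expanded, here is a small Python sketch. It is not taken from utter! itself: the `expand` function, the regular expressions, and the choice to leave unknown variables untouched are assumptions made for the example, and no real Tasker API is called.

```python
import re

# Assumed marker syntax, based only on the two examples in the interview.
EFFECT = re.compile(r"se~(\w+)")
VARIABLE = re.compile(r"%([A-Za-z_]\w*)")


def expand(content: str, variables: dict) -> tuple:
    """Return (text to speak, list of sound-effect names) for a custom response."""
    # Collect the sound effects, then remove their markers from the spoken text.
    effects = EFFECT.findall(content)
    spoken = EFFECT.sub("", content)
    # Fill in variable values; a name with no known value is left as written.
    spoken = VARIABLE.sub(
        lambda m: str(variables.get(m.group(1), m.group(0))), spoken
    )
    # Collapse any whitespace left behind by the removed markers.
    return " ".join(spoken.split()), effects


if __name__ == "__main__":
    print(expand("Battery is at %BATT percent se~burp", {"BATT": 80}))
```

In the real app, the effect names would then be looked up on external storage, as Ben describes; the sketch stops at extracting them.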

You can create commands using just a single word of your choice to launch any device activity or Tasker task for those who want a little more complexity.

There’s a Tasker Plugin (which I’m still working on), which passes any variable data to utter! to either store, notify or announce.

Commands are all editable and can be created in a text file and imported for mass production! Cloud storage too for transferring between devices.

Q) Anything else?!

A) utter! can be remotely controlled by text message from another device – It will speak or perform any actions requested in the message. If it’s lost or stolen it will send back its location by reply or to a separate email if requested. The control is restricted by a user configured password that must appear at the start of all commands.
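The password rule Ben mentions, where every remote command must begin with the user's password, could look something like the following Python sketch. The function name, the use of a space to separate password from command, and the constant-time comparison are assumptions for illustration; the interview does not say how utter! parses or checks the message.

```python
import hmac
from typing import Optional


def parse_remote_command(message: str, password: str) -> Optional[str]:
    """Return the command text if the message starts with the password, else None."""
    # Assumption: the password is the first space-separated token of the message.
    prefix, _, rest = message.strip().partition(" ")
    # Refuse everything when no password is configured, and compare in
    # constant time so the check does not leak how many characters matched.
    if not password or not hmac.compare_digest(prefix.encode(), password.encode()):
        return None
    # A message holding only the password carries no command.
    return rest.strip() or None


if __name__ == "__main__":
    print(parse_remote_command("s3cret where are you", "s3cret"))
    print(parse_remote_command("guess where are you", "s3cret"))
```

Rejecting messages when no password is set is a deliberate choice in this sketch, so that an unconfigured device cannot be controlled by anyone who knows its number.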

In addition, if the device is rooted, the user can execute any shell command remotely and receive the output in the response – Full SU shell control is probably what you want if your device has been stolen! I think remote shell control might be a first too? Not seen it anywhere else!?

Full/partial data wipe and complete ‘nuke’ is of course possible with SU permissions.

Voice recognition

Google Voice Search just doesn’t perform well for some users and sends back rubbish. Plugins will be available for iSpeech, Dragon Nuance and AT&T Watson, so the user can switch to these providers if they want, to give them the best chance of being understood. They are available to test in the recognition tab. AT&T performs very well.

Application Integration

There aren’t enough apps out there that share their core features… In the linked applications tab you’ll see the apps that are compatible or will be very shortly. I’m really hoping that the beta will get some attention and app devs will start coming forward and offering integration with their apps. It’s a core ‘philosophy’ of the way utter! will work – providing results in the user’s preferred applications.

I demonstrated AccuWeather and eBay in the video – Neither of these work without a ‘hack’ and the inclusion would be so simple for them. With a few more downloads, maybe they’ll sit up and take notice?

Pricing

Companies like Wolfram Alpha want a $ per install – Nuance, iSpeech and AT&T want per-recognition payment. Bing want a per character translation charge.

Not everyone will want all of the above, some might want none, so I’ll be making premium plugins available for each of them individually. That way, users can spend as much or as little as they like upgrading utter!

Once the application emerges from beta, the base application will be no more than $3 at the very most. I’ll wait until I get a feeling from the feedback before deciding on a price. Anything more than that is just classed as ‘expensive’ in the trend of the Play Store (unfortunately).

A free, almost fully functional version will be available, but with the usual restrictions of a limited number of custom commands and no auto sending of texts, emails and tweets etc.

Plans for the app

In the short term I’m hoping for plenty of bug reports from users that install the beta. I’ve spent such a long time on the framework that I’m able to bolt-on any additional functions without touching the core of the app – So, before I start going crazy bolting stuff on, I’d really love to catch every bug out there in the framework. It’s hard to get users to actually report it, instead of leaving a 1* ‘Doesn’t work for me’ comment… damn them!

By the end of the beta:

More app integration: weather providers, radio streams, file explorers, eBay, FourSquare, Catch Notes, Springpad, 2do, tasks…. Etc etc

Location aware profiles – Do this when I get home/work etc

Localised searches – where’s the nearest – find me a etc etc…

A new app icon! If you’re interested in running a competition of sorts and think it might be good for readers to ‘poll their favourite’, you are more than welcome to run with it! It desperately needs one.

In the long term, I have such a huge list of things to do. Some highly functional, some quirky, some very technical. It would be great if a funder emerged so that I could get some technical assistance to help with the complexities that I’ll struggle to implement if I continue to work alone.

Finally

utter! should work out of the box for everyone who follows the structured commands. It can be tweaked if not and tweaked some more regardless! If users spend some time on it, then I hope they will end up with their perfect ‘voice assistant’, if they can be bothered…

With the bot add on and community phrases, it can be personalised and the content will be fresh and ever increasing. With the Tasker integration and shell access, there’s something in there for the more technical users too.

I really hope with the algorithms I’ve constructed and the framework I’ve designed, users will actually find it quicker to do things by voice and will start reaching for the recognition button instead of navigating between menus and then into apps and settings etc… I know the speed of the app could be improved again with some more technical assistance – I need the Java regex master to step forward and lend a hand!

Looking forward to seeing what the masses make of it! I hope….!

About me

I’ve always been a developer and ‘hacker’ of sorts since I was a kid, but life took me in another direction for my career path. When the first video got plenty of attention, I figured it was time to do what I’ve always wanted to and I’ve spent every spare moment I’ve had on it since. I’ve had utter! in my mind for years, since I hacked apart the voice command app on an old Windows Mobile device and changed some registry settings and the actions it performed.

I really hope I get some commercial funding so I can continue to develop the application. There are some amazing start-up companies out there for voice recognition and artificial intelligence and it would be fantastic to have the resource and the time to approach them. It would also be great to be approached.

If utter! isn’t successful, then I hope I’ll be offered work in a similar field, as it’s what I enjoy and what I want to do.

It’s been great to develop the application entirely on what would be useful for the user, not what would bring in a return for an investor, but now I hope the application is at a stage where it could achieve both.

Aaron Gingrich
Aaron is a geek who has always had a passion for technology. When not working or writing, he can be found spending time with his family, playing a game, or watching a movie.

  • http://www.anivision.org/ Christopher Bailey (Xcom923)

    >_< this looks BADASS!!!

  • http://twitter.com/redbullcat Phil Oakley

    This is basically what Google Now/Majel needs to become.

    Great work! Gonna try this out, looks extremely good.

  • Kenny O

    I've been following the progress on his XDA thread since it first started, Brandall is doing a great job with this.

  • http://www.anivision.org/ Christopher Bailey (Xcom923)

    apparently Nova won't link the utter shortcut to anything. Probably because there are so many selections

  • http://twitter.com/jaimepar jp

    another excellent app from a xda member!

  • fixxmyhead

    name is REALLY REALLY TERRIBLE it makes me think of a cows utter. i wanna try it but the name just kills it for me (yea its that bad for me)

    • https://plus.google.com/116879163037230501137/posts Cullen Maglothin

      Cows have udders, not utter. You utter a word. It means to speak quietly.

      • fixxmyhead

        @CullenDM:disqus, @86d537b385dc89a98fd3a7689c6f53ea:disqus yea i forgot but still same pronunciation and same thought. just terrible name

        • Yassie

          You don't pronounce a cows UDDER as UTTER. And if you do, then you need to see a speech therapist!

  • heat361

    If Google could improve Google now/voice into this that would be awesome

  • traveller

    @fixxmyhead:disqus interesting you say this. Cows have "UDDERS". The word "utter" is synonymous with "speak", "talk" etc.

  • Adam

    I was in a meeting and this thing screamed at me: "You need to accept the license agreement."

    Nice. I'm uninstalling this garbage.

    • kamiller42

      You should request a refund.

    • http://twitter.com/Larsened David

      Orrrr don't play around on your smartphone during a meeting. Orrrr put it on silent.

    • qwerty

      "I'm in a meeting where a talking phone might be disruptive... HEY, let's try this voice command app!"

      Idiot.

  • Jelmer Borst

    Google Now integration with this would be awe-some!

  • spydie

    He wants startup capital? did he ever hear of Kickstarter? I've helped fund a few things on that.

    • http://codytoombs.wordpress.com/ Cody Toombs

      I think he wants to benefit from the connections and knowledge that venture capitalists can bring to the table. Kickstarter is good for just accumulating some money, but it leaves you in the cold to handle basically everything else.

      Besides, is Kickstarter offered outside of the US? I'm just making a blind assumption based on his accent, but I'd lean towards thinking he's located across the pond.

  • Al McDowall

    Excited and fascinated by this.

    So far I have not been able to get it to listen for a trigger word, or 'wave to wake' working. Anyone had any success here? Any tips?

    • Al McDowall

      Ok, I was thrown by the 'Start Listening' option in phrases. It looks like passive listening is yet to come.

      Haven't got wave to work yet though.

      Recognition is pretty good though and the option to create new commands BY VOICE is very promising. So far, it's looking pretty damn good.

  • Tucker Chastain

    Where do I send my paycheck?

  • Karthik Kumar

    This is Utter-ly amazing (sorry for the bad pun)

    Playing with it since last 20 minutes and so far it works really well on my aged Galaxy S. Wave to Wake doesn't seem to work though. Shake to Wake works flawlessly. Looking forward to the final version. Definitely will buy..

  • Elias

    Google, buy this guy already!

  • romel

    Hope to be updated to be compatible to galaxy ace GT-5830

    Leave me link if you got downloadable for galaxy ace GT-5830 ..utter voice command

  • Lionel Hamblyn

    Works off line on Jelly Bean devices now as well...Love it!

  • John O’Connor

    So I paid $3 to wolfram alpha for their app, would they still charge you(the developer) to have that plugin function? if so, that is BS

  • susancpw

    If it would work with my deja office suite that would be the ultimate for me
