Consumer-grade voice interfaces helping people with low vision participate in the workplace

  • In a graduate class at NYU Ability Lab and ITP, I researched the workplace participation of people with low vision and developed an e-mail client for the Amazon Echo as a flagship productivity app with universal accessibility

Voice interfaces are taking off

Thanks to machine learning and neural networks, voice recognition systems are going through a boom, as the Amazon Echo and Google Home demonstrate. These devices are a big deal because they commit to voice as the ONLY control mechanism, unlike Siri on the iPhone, which combines voice with a display. That commitment demands high-quality speech recognition. With easy-to-use APIs and low prices, these devices can be applied in a wide range of niches.

One problem that Baidu, China’s dominant search engine, has addressed is the difficulty of transcribing Chinese characters. As MIT Technology Review said, “China is an ideal place for voice interfaces to take off, because Chinese characters were hardly designed with tiny touch screens in mind. But people everywhere should benefit as Baidu advances speech technology and makes voice interfaces more practical and useful. That could make it easier for anyone to communicate with the machines around us.”

Voice-controlled productivity apps improving work participation

Affordable voice recognition has significant applications in accessibility technology, specifically in apps for people with vision impairments. One area where voice recognition combined with other forms of AI can have a huge impact is the participation of blind people in the workplace.

In 1989, Harvard Business Review published a notable article about how technology was helping blind people take on a wider range of jobs. It states that there “are the thousands of sightless or visually impaired people who possess desirable skills but who have difficulty finding work, or at least work commensurate with their skills.” The article enumerates professions in which blind people had historically excelled, such as “customer service and repair service representatives, staff writers, quality control inspectors, receptionists, and curriculum specialists.” While the author was optimistic that technology would increase the work participation of blind people, it remains relatively low: of working-age adults reporting significant vision loss, only 40.2% were employed in 2013.

Therefore, I think apps like an intelligent, voice-controlled slide creator, email client, or word processor can be of huge importance to the community of people with vision impairments. At the same time, they can be equally useful to sighted people. I want to highlight the words of Vanecek, who asks in a Forbes article, “What if technologists designed solutions for the disabled first, as their most challenging target market?” With that background and challenge under their belts, consider the innovative solutions they could bring to other needs.

Designing an Email Client as a flagship productivity app

The Amazon Echo is a voice-controlled, hands-free speaker released by Amazon in June 2015. As part of NYU Ability Lab, I worked with Jasmine Chabra, Tianyi Chen, Rewant Prakash, and Camille Weins to develop an e-mail client for the Amazon Echo, addressing the lack of productivity apps for people with low vision. I was really excited to work with the Echo because of the significance it can have for people with low vision. The Echo has the potential to be an accessible product, with a price point driven down by the general consumer market. However, it comes pre-packaged with only a limited number of “skills,” and an email client is still among the missing ones.

My role on the A-Mail team was to lead the tech development. Since I had some background with the Amazon Echo, I helped the team familiarize themselves with the technology. As for myself, I had to familiarize myself with Node.js. The learning was quite enjoyable: we worked with examples provided by Amazon and the Gmail API, so we managed to implement new iterations quickly. The Alexa Skills Kit and AWS Lambda are a real treat for development. They worked smoothly, the learning curve wasn’t too steep, and I would definitely suggest using them in your projects. A significant part of my role was the overall design of the development process; we worked in a very agile way, heavily driven by feedback from the community. Below is the documentation of our project, compiled by the wonderful Tianyi and Camille. I would be happy to answer any questions about developing for the Echo and potential future uses of the technology!
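For readers curious what Echo development looks like, here is a minimal sketch of an Alexa custom-skill handler running on AWS Lambda. The intent name and speech strings are hypothetical illustrations, not our actual A-Mail code; only the request/response envelope follows the Alexa Skills Kit format.

```javascript
// Minimal request handler for an Alexa custom skill on AWS Lambda.
// Alexa POSTs a JSON request; the skill replies with the documented
// response envelope containing PlainText output speech.
function handler(event, context) {
  var request = event.request || {};
  var intentName = request.intent ? request.intent.name : null;

  var speech;
  if (request.type === 'LaunchRequest') {
    speech = 'Welcome to A-Mail. Try asking how many emails you have.';
  } else if (intentName === 'GetEmailCountIntent') {
    // The real skill would query the Gmail API here; this is a stub.
    speech = 'You have three unread emails.';
  } else {
    speech = 'Sorry, I did not understand that.';
  }

  // context.succeed hands the response back to Alexa (classic Lambda style).
  context.succeed({
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: speech },
      shouldEndSession: true
    }
  });
}

// In a deployed Lambda this function is exported as the entry point:
// exports.handler = handler;
```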

[Diagram showing the user flow of A-Mail]

Accessibility of Echo

The Amazon Echo is highly praised by the low-vision community for its accessibility. The Perkins School for the Blind lists the Echo as one of the top three “Most Intriguing Innovations” of 2015. Tech blogger Travis Love writes that most hands-free tech products can set alarms for the user, but the Echo may be the only one that can pause an alarm by voice command alone (“Top 5 Accessibility Benefits of the Amazon Echo”). He believes this is a huge step toward accessible design, especially “for an individual who cannot see, has weak muscles, and no finger dexterity.” Additionally, several posts on the r/Blind subreddit, as well as reviews on Amazon, state that the Echo offers people with vision impairments a new way of living and being entertained, and that they are looking forward to further accessibility advancements to the Echo.

Interview with an Amazon Echo API Developer

According to an Amazon developer we interviewed, Amazon’s main goal is to make the Echo an additional member of the family: when interacting with the Echo, the user does not need to control anything manually. Amazon considers accessibility an important factor in developing the Echo, and encourages all developers to keep adding new skills using the Alexa Skills Kit.

Technical resources and challenges

Below are resources that I found extremely useful in developing for Echo:

We had issues with authentication through Gmail’s OAuth2 protocol (the mechanism by which the user grants access to their Gmail account), which we first tried to solve through cards. Cards are snippets of information that appear in the Alexa app (linked to the user’s Amazon Echo on iPhone and Android) every time the user invokes a skill through Alexa. A card can contain up to 8,000 characters, a URL, and one image. Amazon provides a tutorial, Linking an Alexa User with a User in Your System, which directs developers to use cards for authorization.
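As a sketch of how a card rides along with a spoken response: the field names below follow the Alexa Skills Kit response format (card types include Simple, Standard, and LinkAccount), while the titles and text are made-up examples.

```javascript
// Sketch: attaching a card to an Alexa response. Cards appear in the
// companion app; the 'Simple' type carries a title and plain-text content,
// while a 'LinkAccount' card prompts the user to sign in for account linking.
function buildResponseWithCard(speechText, cardTitle, cardContent) {
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: speechText },
      card: {
        type: 'Simple',       // 'LinkAccount' would start the sign-in flow
        title: cardTitle,
        content: cardContent  // plain text, subject to Alexa's size limits
      },
      shouldEndSession: true
    }
  };
}

var res = buildResponseWithCard(
  'Please check the Alexa app to finish setting up A-Mail.',
  'A-Mail Setup',
  'Open the link in this card to authorize A-Mail to read your Gmail.'
);
```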

However, working with cards would require storing the user’s sensitive data in a database such as MongoDB, and we did not feel comfortable handling such data. Down the road, we will need to bring in more experienced coders and test over a period of time to ensure safety. Given our time constraints, we decided to focus our efforts on delivering skills rather than security. Therefore, during alpha testing (among ourselves, Gus, and the users described below), we hardcoded the email credentials. The current implementation is safe enough, and the authentication is easy enough that a user can get their account set up in about 20 minutes. The authorization token renews automatically, so the application stays authorized to access the user’s data indefinitely. Future iterations, however, will clearly need to streamline signing up for the service.

User testing

We started off by asking friends whether they knew anyone who was visually impaired or blind, which turned up no one. During this process, we realized that most people know what “blindness” is but have little to no knowledge of “low vision.” Thus, finding users who fit this category was harder than we had anticipated. Through Camille’s grandmother, however, we were connected to Nancy, who volunteered to be interviewed. The other users, Leo and Regina, were connected through friends of friends. After that, we were able to reach other users by word of mouth.

Nancy
  • Impressions of Alexa: Likes that Alexa answers everything. Uses Alexa for really basic things such as news, timers, and Pandora. “Do not use as much as I could.”
  • Condition: Low vision (has a hole in her retina)
  • Activities with Gmail: Basic writing and reading of emails. Does not use filters or circles.
  • Tech skills: Does not like using her computer; will use her iPad. Will pitch her voice higher if a device is having trouble recognizing her speech; is accustomed to saying “Hey Siri.”

Leo
  • Impressions of Alexa: Likes the idea of it; did not know it existed.
  • Condition: Legally blind
  • Activities with Gmail: Has trouble with attachments in emails; anything outside the general text is stressful to deal with. Doesn’t want the device to describe an attachment, just to notify him that there is one.
  • Tech skills: Uses JAWS. Does not like using the iPhone (VoiceOver is too difficult). Gets tired of using Siri quickly; will have his wife activate “Okay Google” on her phone.

USER 1: Nancy During our first round of user testing, it was quite tiring and frustrating for Nancy to keep repeating the command “How many emails do I have?”; she eventually resorted to shorter, more direct phrases, such as “Alexa, Gmail app emails,” to get Alexa to answer. Nancy also had difficulty saying “Gmail app” every time and preferred saying “Email app” instead; this may, however, partly have been a phone issue, since this round of testing was done over a phone call. Nancy tried longer phrases such as “Alexa, Gmail app, do I have emails?” but Alexa could not recognize them. Nancy prefers shorter phrases, as they are more convenient. There were also issues with the Labels intent, as its response was too long and confusing for her.

USER 2: Leo With the first version of our prototype, the user had to start by saying, “Alexa, ask Gmail-App”, but because that wasn’t flowing properly, we changed the invocation name to “A-mail” and validated with Leo that “A-mail” is more comfortable to say than “Gmail-App.”

USER 3: Regina After testing Gmail-App, she was really impressed with Alexa and wants to look into purchasing one for her classroom.

Current status

The Amazon Echo app became part of NYU Ability Lab’s portfolio and can be developed further by future students. The next steps in development include:

  • Handling cards for the login
  • Writing messages
  • Detecting attachments
  • Push notifications for new emails (e.g., every 15 or 30 minutes; the interval could be customizable)
  • Supporting email providers beyond Gmail
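As a sketch of the push-notification idea: a skill that polls the inbox on a schedule would need to announce only messages it has not already seen. The helper below (hypothetical names, not part of the current A-Mail code) shows that deduplication step.

```javascript
// Sketch: given the message IDs seen on the previous poll and the IDs from
// the current poll, return only the genuinely new ones to announce.
function findNewMessageIds(previousIds, currentIds) {
  var seen = {};
  previousIds.forEach(function (id) { seen[id] = true; });
  return currentIds.filter(function (id) { return !seen[id]; });
}

// Example: two messages arrived since the last poll.
var fresh = findNewMessageIds(['a1', 'b2'], ['a1', 'b2', 'c3', 'd4']);
// fresh is ['c3', 'd4']
```

A scheduler (e.g., a timed Lambda trigger) would call this on each poll and have Alexa announce only the fresh messages.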