A few months ago, I wrote a review for RSBC on the Lens App, a free OCR app from Microsoft that allows you to take a picture of text and have it read back.
Despite the positive feedback, and although it offers a free alternative to paid OCR apps, it was still rather fiddly and cumbersome when you wanted to read a document quickly.
I’m happy to say Microsoft has stepped up their game and launched a brand new app called Seeing AI, an app that can extract text from images, recognise faces, describe pictures and more! Like the Microsoft Lens app, it is completely free.
You can download the app by searching for Seeing AI on the App Store.
When you open the app for the first time, it will ask for access to your device’s camera. Allow this, as the app needs the camera to work.
You will then be presented with a tutorial that you can read through. After flicking through the pages, tap on get started and agree to their terms.
You are now on the main screen of the app.
In the top left is a menu button where you can change settings and set up face recognition; more about that later.
A quick help button is on the top right that gives you more information on the current mode.
In the middle is a camera view from your back-facing camera, a nice large area for you to tap to take a picture if the app doesn’t automatically do it for you.
Finally, at the bottom of the screen you have your modes, called Channels. You can use VoiceOver gestures to flick through each channel.
When you enter a channel for the first time, the quick help page will pop up alongside a short video.
Perhaps the app’s most useful feature is the short text channel. It is great for reading a small amount of text quickly, such as names on envelopes or dialog boxes on computer screens. Enter this channel, point your phone at some text and it will read it instantly, without you having to take a picture and worry about whether you got it right.
During testing, I found this channel extremely useful when my computer stopped talking and I needed to see what was on the screen to figure out why.
The Document channel is for larger documents such as letters or printed worksheets. Just like the short text channel, you point your phone at a piece of paper, and it will give you audio guidance on how to position the paper under the phone. A handy tip is to place the phone in the middle of the paper and slowly raise it, keeping it parallel to the page. As it gets further away, more of your paper will come into the camera view, so don’t be too shy to go higher and stand up. When you get it right, it will tell you to hold your phone steady, automatically snap a picture of the text, and then read it with VoiceOver.
Unfortunately, I didn’t manage to get the Product channel working (probably because the service is new in the UK), so it didn’t recognise any of the barcodes I showed it. However, from watching the main demo, it uses audio sounds to help you find a barcode on a product and, once the barcode is in focus, scans it automatically. After processing, it will tell you the type of product, and a more information button will tell you more about the item.
The Person channel is great for taking pictures of people and their faces. The app will tell you where they are positioned so you can get them fully in the frame. It will even try to guess the person’s age and hair colour, and have a guess at their emotions.
Warning: do not get offended if it gets your age wrong!
Remember I mentioned the menu button on the top left of the main screen? Double tap on this and go to Face Recognition.
Here you can teach the app whose face is whose. It will always start off with the front-facing camera, so if you are taking a picture of someone else and not yourself, switch to the back-facing one.
You need to take three pictures of the person to help the app learn. After all three, it will ask you to name the person. Once you have entered a name and tapped done, go back to the main screen and select the Person channel. If you point your phone at a face that you have saved, it will tell you who’s in front of you!
Continue to add more people to your list so that when you are using your phone to see who’s around you, they will be announced. This could be handy when walking into a busy meeting, for checking who’s around the table.
One of the newer features, the Currency channel, allows you to identify money. This is very useful if your notes feel the same by touch (I’m looking at you, dollars), or if you wish to sort the money in your wallet quickly.
It recognised a new British £10 note instantly, but it doesn’t look like it can handle coins: I tried a pound coin and a 50p coin without any results. This is not a huge issue, as the coins are so different by touch, but coin support would be useful for anyone who is new to a currency and wants to identify and learn those coins.
It currently supports British Pounds, Euros, Canadian Dollars and US Dollars.
The Scene channel is a very basic image identification feature. Take a picture of what’s in front of you and it will describe your surroundings as best it can. For instance, it might say “An office with desks and computers”.
The Colour channel is a basic colour identification feature: point your camera at an object and it will read aloud the colour.
It can be hit and miss, as with any colour identification app, because there are different variables. The biggest one here is lighting: if you have a bright light, such as the torch on your phone, glaring on a black surface, the app could interpret it as grey.
Still, it is a great feature for matching the colours of clothing or for separating white and coloured clothing before the wash.
Another impressive feature is the Handwriting channel. Seeing AI will try to identify handwriting, a difficult feat considering handwriting differs from person to person, and I’m sure some people’s handwriting can be a bit atrocious!
Despite this, the app handles it pretty well; I would say it gets about 70% of the handwriting correct.
It read most of my Christmas cards fine.
Except for this one…
Obviously it won’t be as good as typed text, but still rather impressive.
This channel does not have any audio feedback for positioning, and it only works if the text is the right way up. That means you need to take the picture in portrait mode, so you won’t be able to fit in as much text as you would in landscape.
Light detection is great for people who have no light perception. It means that they can check whether the lights have been left on in a room and, with the audio feedback, work out which part of an area is lighter than others, such as finding the window in a room.
The audio cues are low and high pitch tones: the tones get lower as it gets darker and higher as it gets brighter.
Another handy feature is Seeing AI’s ability to describe pictures shared from other apps, like Twitter and WhatsApp.
On Twitter, find a tweet with an image. Double tap on the tweet, scroll down to where it says image, and then double tap and hold. The share sheet will come up. Select more, and then switch Seeing AI on.
Hit done and then double tap on ‘Recognise with Seeing AI’ which should be at the bottom of the sharing options next to the more button.
Then wait and Seeing AI will describe the image.
This does not work with Facebook, as Facebook handles saved pictures differently.
It does require an active internet connection to use, so if you don’t have WiFi or cellular data you won’t be able to use the features.
If you’re not a VoiceOver user, it will use the phone’s text to speech voice to read aloud, so you can listen to large amounts of text.
All in all, I am really impressed with the results of the app. Things like the real-time text to speech make it a must have, and even better, it is completely free!
The speed of results is outstanding when recognising both faces and text, which saves time.
The added features make it an assistive tech Swiss Army knife.
Try it out and let me know what you think!