How the new iPhone SE's portrait mode tries to out-Google the Pixel

Google Pixel 3a XL (Image credit: Alex Dobie / Android Central)

It's no secret that Google's Pixel phones have had some of the best cameras you'll find on any phone, even though most of them only have one camera lens. They rely on software magic to turn what the camera sees into a great photo, and effects like portrait mode or Night Sight take that software even further. It makes sense: Google is, after all, primarily a software company.

Apple makes hardware and sells it to people. Google mostly makes algorithms.

Apple is not primarily a software company, at least not the same way that Google is. Apple is more like Samsung and builds products that people want to buy without thinking of the software inside of them. But just like Samsung, Apple's camera game is strong. Very strong. A sampling of photos from the iPhone 11 shows how good Apple's cameras can be.

And then we have the new iPhone SE. Let's be frank — it's an iPhone 8 with a better chipset. That's not a bad thing, but it means that there's only one camera to shoot pictures. So why are they so good, especially portraits? And before you say it, they are good. At least portraits of people, which we'll explain in a bit. Machine Learning, that's why.

That new chip

Image credit: iMore

The A13 Bionic processor inside the iPhone SE is the same processor you'll find in the iPhone 11 series. It's Apple's latest, and it's really good at a lot of things — Machine Learning being one of those things thanks to a standalone neural engine complete with its own microprocessor.

Because of this hardware, Apple was able to include what it calls "Single Image Monocular Depth Estimation," which is a very complex way to say it can shoot portrait photos with just the image data the one camera can capture. Really. There is no time-of-flight sensor or LiDAR or any of the other tricks that would make capturing depth easier, just the one camera lens.

If this sounds a little familiar, that's because Google has been doing something very similar with "just" software for a few years now in its Pixel phones. How it works is really, really cool.

Machine Learning is a broad term that can cover a whole lot of things. In the case of getting depth data from a single lens, it means training a neural network on data so it knows what things (in the case of the iPhone SE, just people for now) should look like. You feed a big computer thousands and thousands of photos of people and it starts to find patterns, such as how eyes are generally shaped or how hair doesn't have a perfectly smooth outline.

That computer crunches all the data until it has "learned" what a person is and can separate a person from the background. The chunk of software that can do this needs some space to live, but more importantly, it needs processing power to run through its routine whenever you ask it to do so.
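
To make the "feed it examples and it finds patterns" idea a little more concrete, here is a toy sketch in Python. It is not how Apple or Google do it: real portrait modes rely on deep neural networks trained on huge photo sets, while this uses a much simpler nearest-centroid classifier as a stand-in. The feature names, the numbers, and the `train` and `classify` helpers are all made up for illustration.

```python
# Toy stand-in for "learning" from labelled examples: a nearest-centroid
# classifier. Real portrait modes use deep neural networks; every name and
# number below is hypothetical.

def train(examples):
    """examples: list of (feature_vector, label). Returns label -> mean vector."""
    sums, counts = {}, {}
    for features, label in examples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + f for s, f in zip(sums[label], features)]
        counts[label] += 1
    # The "pattern" learned for each label is just the average of its examples.
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def classify(model, features):
    """Return the label whose learned average is closest to the features."""
    def sq_dist(centroid):
        return sum((c - f) ** 2 for c, f in zip(centroid, features))
    return min(model, key=lambda label: sq_dist(model[label]))


# Made-up features per image patch: (edge roughness, skin-tone score).
training_data = [
    ((0.9, 0.8), "person"),
    ((0.8, 0.9), "person"),
    ((0.7, 0.7), "person"),
    ((0.1, 0.2), "background"),
    ((0.2, 0.1), "background"),
    ((0.3, 0.2), "background"),
]

model = train(training_data)
print(classify(model, (0.85, 0.75)))  # patch that resembles the person examples
print(classify(model, (0.15, 0.25)))  # patch that resembles the background examples
```

The training step is where the heavy lifting happens, and it only has to happen once. Classifying a new patch afterwards is cheap by comparison, which is roughly why the phone needs fast hardware for running the model rather than for building it.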

Machines can't learn, but they can recognize patterns if fed enough data.

That's where the dedicated machine learning processor in the A13 Bionic comes into play. Much like the Pixel's dedicated camera engines (the Pixel Visual Core and the newer Pixel Neural Core), a dedicated chip to process machine learning functions is there and ready. When you tell the iPhone SE's camera to take a portrait photo, the neural processor kicks in, finds the person in the frame, and feeds that data to the processor that turns what the camera sees into a photograph.

The rest is pretty simple. Parts of the photo that aren't a person are given a slight blur and parts that are a person are slightly sharpened. It's not perfect, and any phone with two lenses could feed more data to the neural engine and create a much better photo, but most of the time the portrait photo looks really good.
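
That last step can be sketched in a few lines of Python. This is a simplified illustration, not Apple's pipeline: it assumes a tiny grayscale image stored as a list of rows, a ready-made person mask (1 for person, 0 for background), and a plain box blur. It also skips the sharpening pass on the person. The `box_blur` and `portrait` helpers are hypothetical names.

```python
def box_blur(image, radius=1):
    """Average each pixel with its neighbours; windows shrink at the edges."""
    height, width = len(image), len(image[0])
    blurred = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total, count = 0.0, 0
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    total += image[ny][nx]
                    count += 1
            blurred[y][x] = total / count
    return blurred


def portrait(image, person_mask, radius=1):
    """Keep original pixels where the mask is 1; use blurred pixels elsewhere."""
    blurred = box_blur(image, radius)
    return [
        [image[y][x] if person_mask[y][x] else blurred[y][x]
         for x in range(len(image[0]))]
        for y in range(len(image))
    ]


# A 4x4 grayscale "photo" with a person occupying the middle four pixels.
image = [
    [10, 200, 10, 200],
    [200, 90, 90, 10],
    [10, 90, 90, 200],
    [200, 10, 200, 10],
]
person_mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

result = portrait(image, person_mask)
for row in result:
    print(row)
```

The hard edge between masked and unmasked pixels is why the quality of the outline matters so much. A sloppy mask gives you blurred hair or a sharp chunk of background.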

Just people for now

Image credit: iMore

Why just people? The short answer is that's all Apple has "taught" the chip to process right now. The outline of a person is created as a depth map that's highly detailed and has a very well-defined edge. Apple could have decided to teach the neural engine about cactuses or bottles or dogs, but people are what we take the most photos of so that's where things start.

"Teaching" a neural network what a person looks like is better than teaching it what a goldfish looks like if photography is the end goal.

But Apple also exposed the depth map (both background data and "people" data) as an API for developers. That means third-party camera apps can use portrait mode just like the Pixel camera can. Just pick a focal point and snap a photo, and the camera will try its best to get depth information based on the outline of what it can see.
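
As a rough idea of what an app might do with a depth map and a focal point, here is a small Python sketch. It does not use Apple's API. The depth map is a plain list of rows holding made-up distances, and `blur_radii` is a hypothetical helper that decides how much to blur each pixel based on how far its depth is from the depth at the focal point.

```python
def blur_radii(depth_map, focus_xy, max_radius=4):
    """Map each pixel's depth to a blur radius relative to the focal point.

    Pixels at the focal depth get radius 0; the pixel whose depth differs
    most from the focal depth gets max_radius.
    """
    fx, fy = focus_xy
    focus_depth = depth_map[fy][fx]
    largest_gap = max(abs(d - focus_depth) for row in depth_map for d in row)
    if largest_gap == 0:
        # Everything sits at the focal depth, so nothing is blurred.
        return [[0] * len(row) for row in depth_map]
    return [
        [round(max_radius * abs(d - focus_depth) / largest_gap) for d in row]
        for row in depth_map
    ]


# Hypothetical depth values: a subject at 1.0, a mid-ground object at 3.0,
# and a far wall at 5.0.
depth = [
    [5.0, 5.0, 5.0],
    [5.0, 1.0, 3.0],
    [1.0, 1.0, 3.0],
]

radii = blur_radii(depth, focus_xy=(1, 1))
for row in radii:
    print(row)
```

Scaling the blur with depth, instead of using a simple person-or-not mask, is what lets things in the mid-ground look only a little soft while the far background melts away.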

It's going to be good at some things and not so good at other things because there is no learned algorithm like the one used for people. But it should still be plenty capable of taking neat portrait effects of other things and have them turn out well. We've seen how the Pixel phones got better and better with every iteration so we know it's possible.

Apple probably isn't going to release another phone with a single camera lens and neither is Google. But give Apple's algorithms a few years and the iPhone SE could take portrait photos as good as or better than any single-camera Pixel phone ever did. And like the Pixel, it can use what it's learned combined with double the data from double the camera lenses and give that Pixel camera a serious run for its money.

What's next for Apple and Google?

Image credit: Rene Ritchie / iMore

More and more of the same software innovations, that's what.

The Pixel 4 and iPhone 11 Pro show us what each company can do when it uses these super-complicated algorithms in tandem with a multi-lens camera, and it's a huge improvement compared to what each company has offered in the past. When you have twice the data to work with and have perfected how to gather the right data, you can do some really great work when it comes to building a camera.

But that doesn't mean either company is going to stop caring about a single-lens camera. We know the Pixel 4a is coming and it will only have one camera around back, and if the rumored iPhone SE+ 2020 actually happens, expect it to be an iPhone 8+ rehash with just the one lens. Not to mention the millions and millions of selfies taken every day with a single-lens camera system.

Software engineers and machine vision specialists at both companies will try to learn all they can from the "double-data" two lenses can collect and see how it can be worked into the algorithms that make the iPhone SE and Pixel 4a cameras (presumably) so darn good. And they will find out new things about collecting light with all that data.

Each OS update will make the cameras better and bring improvements to the front-facing cameras, too. Don't expect actual miracles, but the results three years from now should be pretty great!

Jerry Hildenbrand
Senior Editor — Google Ecosystem