Affectiva

-
Interactive transcript
RANA EL KALIOUBY: Hi. I'm Rana el Kaliouby. I'm co-founder and CEO of Affectiva. My entire career has been about building artificial emotional intelligence. What drew me to this work initially: I grew up in the Middle East, and I have a background in computer science. Then I had the opportunity to move to Cambridge University to do my PhD. That was my first experience living abroad, and when I got to Cambridge, I realized that I was spending more hours on my laptop than with any other human being.
Yet this machine, despite our intimacy, had absolutely no idea how I was feeling. And that got me thinking, what if computers could understand our emotions? And I started thinking about the different applications of that kind of technology.
Then I had the opportunity to meet Professor Rosalind Picard, who had written the book Affective Computing and posited that sometime in the future, computers would need to respond and adapt to human emotion. I was just fascinated by that idea. We finally met in person in 2004.
She invited me to join her lab at MIT Media Lab, where we focused on the application of emotion recognition technology to autism spectrum disorders-- and generally speaking, mental health. And so at the time, the project that brought me over to the US was this idea of an emotional hearing aid. So people who have hearing problems wear hearing aids, but there's no equivalent if you have social and emotional problems like a person on the autism spectrum would.
So we built these glasses that had a little camera in it. The camera faced outward. It was connected to a device. And in real time, it processed the expressions of people you were interacting with and gave you real-time feedback.
So if I were a kid on the spectrum who really struggled with understanding how people around me felt, this device would analyze their facial expressions in real time and give me feedback. We piloted that technology at a school for autistic kids in Providence, Rhode Island, and it was pretty successful. We could see that the kids were making more eye contact, making more face contact, and were curious about these expressions of emotion.
But at the same time, being at MIT Media Lab, twice a year, we would invite all the member companies to see what we were up to. We called it Demo or Die. And for about three years in a row, all these member companies would say, OK, this autism research is cool, but have you thought about applying it to advertising testing or in automotive or in banking? And so I kept a log of all the different use cases. And when we got to 20, Ros and I thought, OK, there's something here.
And we thought the solution was to just hire more research students. So we went to the Media Lab director at the time, and we said, OK, we need 10 more research assistants, because we're ignoring our member companies here. And he said, no. It's time to spin out. This is a commercialization opportunity. So I was intrigued by this idea of taking emotion recognition technology and applying it in many different ways, in many different industries, and ultimately fulfilling this vision of an emotion-aware digital world.
[MUSIC PLAYING]
-
Interactive transcript
RANA EL KALIOUBY: So being at the MIT Media Lab twice a year, we would invite our member companies. And we used to call it Demo or Die. We had to show a demo of what we were building. And for about three years in a row, these member companies such as Procter & Gamble, Toyota, Samsung, Bank of America-- they would see what we've built for autism, and they would say, well, how about you apply it to advertising testing or product testing or automotive-- monitoring the state of drivers?
And so after we got to about 20 of these member companies expressing interest in our technology, we thought, there's something happening here. And we thought the solution was to bring in more research students in the lab. But then we spoke to the Media Lab director at the time and he said, this is not a research problem anymore. It's more of a commercialization opportunity.
If you look at human intelligence, people who have higher emotional intelligence tend to be more likable. They're more persuasive. They're more effective in their lives. And we think this is true in artificial intelligence as well.
So as more and more of our interactions with technology become conversational, perceptual, and relational, the social and emotional awareness of these interfaces is going to become really critical. And so we're all about injecting these interfaces with artificial emotional intelligence. What that really means is that we identify how humans express emotion-- and of course, humans express emotions in any number of ways, through your voice or your gestures, but also your facial expressions. That's where I've spent the majority of my career: looking into how humans express emotions through their face.
As it turns out, in the late '70s, Professor Paul Ekman and his team published the Facial Action Coding System, which maps every single facial muscle-- we have about 45 of those-- to a code. I'll give you an example. Action unit 12 is the zygomaticus muscle, and it's the lip corner pull. It's typically what you would activate if you were smiling.
Action unit 4 is the corrugator muscle. It's what you would do if you were furrowing your eyebrow, if you were confused or angry. And to become a certified face reader or FACS coder, you would have to go through 100 hours of training. It's pretty intense. And then it takes you about five minutes to code one minute of video, so it's very time-intensive. It's very laborious.
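The action-unit scheme described above can be pictured as a small lookup table. AU 12 and AU 4 are the two codes named in the transcript; the other entries are standard FACS codes added for illustration, not ones the speaker mentions.

```python
# Minimal sketch of a Facial Action Coding System (FACS) lookup table.
# AU 12 and AU 4 are described in the transcript; the remaining entries
# are standard FACS codes included purely for illustration.
FACS_ACTION_UNITS = {
    1: ("inner brow raiser", "frontalis, pars medialis"),
    2: ("outer brow raiser", "frontalis, pars lateralis"),
    4: ("brow lowerer / furrow", "corrugator supercilii"),
    12: ("lip corner puller (smile)", "zygomaticus major"),
    15: ("lip corner depressor", "depressor anguli oris"),
}

def describe(au: int) -> str:
    """Return a human-readable description of an action-unit code."""
    movement, muscle = FACS_ACTION_UNITS[au]
    return f"AU {au}: {movement} ({muscle})"
```

A certified FACS coder effectively applies this mapping by eye, frame by frame, which is what makes manual coding so slow.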
So what we've done as a team at Affectiva-- and before that, as part of my research-- is use computer vision and machine learning to automate that process. So now, using any device with any camera, our machine learning algorithms can track your face. The software identifies the presence of a face, and then it tracks all the different feature points on your face-- your eyes, your mouth, your eyebrows.
So the way we train these algorithms is we collect data from all around the world, and we feed the machine learning algorithm examples of people smiling or smirking or frowning. So to date, we've collected about 5.4 million face videos from 75 countries around the world, and that amounts to about 2 billion facial frames. It's a ton of data.
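The supervised-learning setup described here, labeled examples of expressions feeding a learning algorithm, can be illustrated with a toy classifier. Everything below (the single lip-corner-spread feature, the synthetic data, the simple logistic model) is a stand-in chosen for illustration, not Affectiva's actual pipeline.

```python
import math
import random

# Toy illustration of training an expression classifier from labeled
# examples. A real system would use image features; here each face is
# reduced to one made-up feature, "lip corner spread," with label 1 = smiling.
random.seed(0)
data = [(random.gauss(0.8, 0.1), 1) for _ in range(200)] + \
       [(random.gauss(0.4, 0.1), 0) for _ in range(200)]

# Plain stochastic gradient descent on a one-feature logistic model.
w, b, lr = 0.0, 0.0, 0.1
for _ in range(500):
    for x, y in data:
        p = 1 / (1 + math.exp(-(w * x + b)))  # sigmoid prediction
        w -= lr * (p - y) * x                 # gradient step on weight
        b -= lr * (p - y)                     # gradient step on bias

def predict_smile(lip_corner_spread: float) -> bool:
    """Classify a face as smiling from the single toy feature."""
    return 1 / (1 + math.exp(-(w * lip_corner_spread + b))) > 0.5
```

The point of the sketch is the shape of the process, not the model: collect labeled examples, fit parameters, then apply the fitted model to new faces.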
And we use that data to train the algorithms, but we also use it to mine the data to understand, how do humans express emotion across the world? And so what we found is that by and large, facial expressions are universal. A smile is a smile everywhere in the world.
However, we found that there are cultural differences in how people express emotion, and we call those display rules, or norms. So we found, for instance, in collectivist cultures like China and Japan, people are less likely to share negative emotions. And we don't see that effect in individualistic cultures like the US.
We've also found that there are gender differences in how people express emotions. So women tend to be more expressive-- which is not surprising, but our data confirms that. But that difference is culturally specific.
So in the US, women smile 40% more than men. In France, Germany, women smile 25% more than men. But in the UK, we found no significant difference between men and women, which was intriguing. We don't really have an explanation yet. But this kind of technology allows us to do scientific explorations into how humans express emotion at a scale that was never possible before.
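The kind of aggregate comparison behind these numbers can be sketched as follows. The records below are made-up illustrations chosen so that the computation reproduces the 40% US figure and the null UK result quoted above; they are not Affectiva's data.

```python
# Sketch of comparing smile rates by gender within each country.
# Each record is one viewer's session; the values are illustrative.
records = [
    {"country": "US", "gender": "F", "smile_rate": 0.28},
    {"country": "US", "gender": "F", "smile_rate": 0.28},
    {"country": "US", "gender": "M", "smile_rate": 0.20},
    {"country": "UK", "gender": "F", "smile_rate": 0.22},
    {"country": "UK", "gender": "M", "smile_rate": 0.22},
]

def gender_gap(country: str) -> float:
    """Percent difference in mean smile rate, women vs. men."""
    def mean(gender):
        vals = [r["smile_rate"] for r in records
                if r["country"] == country and r["gender"] == gender]
        return sum(vals) / len(vals)
    return (mean("F") - mean("M")) / mean("M") * 100
```

At scale, the same grouping runs over millions of sessions, which is what makes the cross-cultural comparisons statistically meaningful.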
[MUSIC PLAYING]
"
-
Interactive transcript
RANA EL KALIOUBY: So our core emotion engine essentially takes any video stream or any number of images and maps that to an emotional state. At the moment, it can read about 20 different facial expressions-- the obvious ones like a smile or a brow furrow, but also pretty nuanced ones like a lip suck, or the pucker, or an eye squint. It then maps those into eight different emotional states, as well as age, gender, and ethnicity. And just for fun, we also map your dominant expression into one of 14 emojis-- like you're sticking your tongue out, or you look shocked, or winking, for instance.
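A per-frame result from an engine like the one described might be modeled as a small record type. The field names and shape below are hypothetical illustrations, not Affectiva's actual API.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a per-frame engine result: expression scores,
# emotion scores, demographic estimates, and a dominant emoji.
# All field names and example values are illustrative assumptions.
@dataclass
class FrameResult:
    expressions: dict = field(default_factory=dict)  # e.g. {"smile": 0.9, "brow_furrow": 0.1}
    emotions: dict = field(default_factory=dict)     # e.g. {"joy": 0.8, "anger": 0.02}
    age_range: str = "unknown"
    gender: str = "unknown"
    emoji: str = ""

    def dominant_emotion(self) -> str:
        """Return the emotion with the highest score in this frame."""
        return max(self.emotions, key=self.emotions.get)
```

Structuring output this way keeps the raw expression layer (what the face is doing) separate from the inferred emotion layer (what that likely means), which mirrors the two-stage mapping the transcript describes.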
And so then what we did is we took this core emotion engine, and we packaged it up as cloud-based APIs, as well as on-device SDKs to allow any developer to very quickly emotion-enable their own digital experience. And so our on-device SDKs run in real time. They don't send any videos to the cloud, which is important for privacy reasons. And it allows you to get real-time emotional information.
We were able to shrink the machine learning models so that they run on any device. So that includes iOS, Android, Linux, Mac OS X, Windows, and even Raspberry Pi, which is powering a lot of the IoT and connected home devices. So that was important to us, and it took a substantial amount of effort to get to that point.
One of the areas where we've gotten the most traction to date is in media and advertising research. Advertisers and marketers want to build an emotional connection with their consumers, and they know that this emotional connection drives purchase decisions. It drives word-of-mouth recommendations. It drives loyalty. So it's really important that they are able to elicit and measure this type of emotional engagement. But they struggle to measure it in an objective way.
So before our technology existed, the way you would do that is you would show somebody a piece of content-- say, an online video ad-- and then you'd ask them if they liked it or not. And now, we know that self-report is quite biased-- for a whole variety of reasons-- and so it's not always accurate. It's not always a true representation of how you felt as you watched that online video ad.
With our technology, though, wherever you are in the world-- at home, on your own device-- you get a request to watch a video with the camera on. If you opt in, the camera turns on, you watch the video, and in the background, it analyzes how you're responding to that video ad on a moment-by-moment basis.
So now we have your emotional journey unfolding as you watched that video, and we have that for thousands of people like you who watched that same video. We aggregate that data, and we compile it and visualize it in a dashboard that brands like Coca-Cola or Kellogg's have access to. Through that product, we now work with 14 different market research partners around the world, including Millward Brown and Nielsen, for instance. And through them, we work with a third of the Fortune Global 100 companies.
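The aggregation step described above amounts to averaging many per-viewer traces into one curve. The traces below are illustrative values, not real panel data.

```python
# Sketch of aggregating moment-by-moment emotion traces: each viewer
# contributes one score per second for the same ad, and the traces are
# averaged into a single curve for the dashboard. Values are made up.
viewer_traces = [
    [0.1, 0.3, 0.7, 0.9],   # viewer 1: joy score per second
    [0.1, 0.1, 0.5, 0.9],   # viewer 2
    [0.1, 0.2, 0.6, 0.9],   # viewer 3
]

def aggregate(traces):
    """Mean emotion score at each second across all viewers."""
    return [sum(vals) / len(vals) for vals in zip(*traces)]
```

A rising aggregate curve like this one would suggest the ad builds engagement toward its ending; dips and plateaus are where an advertiser would look for cuts.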
Some companies, like Kellogg's, test all their ads worldwide using our technology before they go live. And they use the data to optimize the ads. An ad might be too long. It might be too boring. It might be offensive in some cases. And they're able to flag that early on, before they've spent millions of dollars distributing the ad.
So while our cloud-based APIs and our on-device SDKs work really well if you're an app developer who can take the core emotion engine and integrate it into your digital app or your device, we recognize that there are lots of people out there who may not have the programming background required to integrate our SDK. So we built a product that we call Emotion as a Service.
All you need to do is you upload your videos or your images to the cloud, and we send you back a readout of the emotions in that video. And you can think of that use case-- say you're a researcher at MIH and you have a ton of videos of depressed patients. You're not going to necessarily code with our SDK, but you really want to identify what expressions are happening on the face.
And so you send all your videos to our cloud via Emotion as a Service. We send you back a moment-by-moment readout of the emotions in that video. So ultimately what we want to do is remove the barrier to incorporating emotion awareness into your digital interaction-- whatever that is. Whether it's images or videos or a data platform or a digital experience or an interactive kiosk, we want it to be really easy to get to the emotion data.
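One way to picture the moment-by-moment readout described above is as timestamped emotion scores per frame. The JSON shape below is an assumption for illustration, not the service's documented format.

```python
import json

# Hypothetical example of a moment-by-moment emotion readout: a JSON
# payload of timestamped scores, summarized into the peak moment per
# emotion. The payload structure and values are illustrative assumptions.
payload = json.loads("""
{"frames": [
    {"t": 0.0, "emotions": {"joy": 0.05, "sadness": 0.40}},
    {"t": 1.0, "emotions": {"joy": 0.10, "sadness": 0.65}},
    {"t": 2.0, "emotions": {"joy": 0.80, "sadness": 0.10}}
]}
""")

def peak_moment(emotion: str) -> float:
    """Timestamp at which the given emotion scored highest."""
    return max(payload["frames"], key=lambda f: f["emotions"][emotion])["t"]
```

For the depression-research use case in the transcript, a readout like this is the artifact the researcher works with: no video leaves their hands for analysis beyond the upload, and what comes back is a time series they can correlate with clinical measures.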
[MUSIC PLAYING]
-
Interactive transcript
RANA EL KALIOUBY: So when Ros and I spun Affectiva out of MIT-- I remember that day very well-- we sat around a table, and we said, OK, what's non-negotiable? We understood that, like any technology, there is potential for good, but there's also potential for abuse.
So we discussed-- OK, where do we draw the line? And we agreed that we're only going to do use cases or applications of this technology where we can get people's consent and opt in. And so far, that's been the case. We always, always tell people, OK, there's going to be a camera here. It's going to be measuring and identifying your expressions, and we're using this for this purpose.
With our SDK, all the processing happens on the device, so we never store any video data, for instance. It's all just-in-time, on-the-fly analysis, which helps a lot with privacy. And generally speaking, we take that very seriously.
I mean, I'm a big believer that your emotions are very personal data, and so there needs to be a value in it for you. In the advertising world, the value is we onboard panelists, and they're paid panelists. In every other application, it makes the experience better-- like if you're interacting with a social robot. Well, you're not really going to enjoy that experience if the robot's not social and can't tell if it's annoying you or not. And so we're big on this idea that you're giving up some very personal information and there has to be some value in it in return.
I also think, as a company at the forefront of this technology, it is our responsibility, as a team, to educate the public. And so we are often engaging the public in discussions around, OK, how does this technology work? It's not a black box.
We talk a lot about the machine learning mechanics behind the technology. We talk about the drawbacks. We talk about the different privacy guidelines. We take that very seriously, and I think it is our responsibility, as a startup, to do that.
So emotion AI is a core AI capability that is growing into a multi-billion-dollar industry, one that is transformative to many different verticals. We got our start in media research and advertising, but we're now quickly expanding into other verticals such as automotive, for instance. In the automotive space, as we transition into semi-autonomous and then fully autonomous vehicles, it is going to be imperative that these cars understand the mental state of the drivers, especially if the car wants to relinquish control back to a driver.
Is the driver awake? Is the driver paying attention? Is he or she texting, distracted, watching a movie? That's going to become very critical. We're also finding that cars are redefining themselves as conversational infotainment interfaces. And there, they want to understand the emotional engagement of the user and personalize the experience-- the car could personalize the lighting, the music. So that's looking to be a very big market for us.
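The hand-off decision described above can be sketched as a simple rule over driver-state signals. The signal names and thresholds below are illustrative assumptions, not anything from a real driver-monitoring system.

```python
# Toy sketch of a control hand-off check in a semi-autonomous car:
# before returning control, the car verifies simple driver-state
# signals. Signal names and thresholds are illustrative assumptions.
def safe_to_hand_off(eyes_on_road_sec: float,
                     drowsiness: float,
                     phone_in_hand: bool) -> bool:
    """Allow hand-off only if the driver looks attentive and alert.

    eyes_on_road_sec: continuous seconds of gaze on the road
    drowsiness: estimated drowsiness score in [0, 1]
    phone_in_hand: whether a phone is detected in the driver's hand
    """
    return eyes_on_road_sec >= 2.0 and drowsiness < 0.5 and not phone_in_hand
```

A production system would of course fuse many more signals over time, but the structure, perception feeding a gating decision, is the point of the example.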
Beyond that, we're excited about opportunities in education, for instance. As more and more of our learning migrates online-- and I have two kids, so I'm seeing that-- the dropout rates for a lot of these online learning courses are pretty high, because these systems have no understanding of your emotional engagement. An online MOOC has no sense of whether you're confused or bored, the way an awesome teacher would. Our technology could bring that kind of adaptation and personalization to online education, so we're excited about that use case.
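The adaptation idea described here can be sketched as a small decision rule: the course player reacts to estimated confusion or boredom the way a teacher might. The emotion scores and thresholds are assumptions for illustration.

```python
# Illustrative sketch of an emotion-aware online course reacting to a
# learner's estimated state. Scores and thresholds are assumptions.
def next_action(confusion: float, boredom: float) -> str:
    """Pick a course-player action from emotion estimates in [0, 1]."""
    if confusion > 0.6:
        return "offer_hint"    # slow down, re-explain the concept
    if boredom > 0.6:
        return "skip_ahead"    # increase pace or difficulty
    return "continue"          # engagement looks fine, keep going
```

Even a rule this crude captures the feedback loop a live teacher provides and a current MOOC lacks: perceive the learner's state, then adjust the lesson.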
And then ultimately, one area that I'm very passionate about-- which brings me back to my roots here-- is mental health. I do think there is huge potential for this technology to let us objectively quantify conditions like depression or pain, or assess suicide risk in a way that was not possible before.
[MUSIC PLAYING]
-
Interactive transcript
RANA EL KALIOUBY: So in media research and advertising, we've now partnered for a couple of years with different research partners around the globe. Through them, we do work in 75 countries around the world. And we're continuing to expand that use case.
In the past year or so, we have started to diversify into new markets like automotive. We've just finished a proof of concept with a Japanese automotive OEM where we installed cameras in cars in Tokyo and cars in Boston-- so that was fun, seeing Boston drivers. Quite stressful at times, as I'm sure you can imagine. And we used all that data to fine-tune the algorithms for an automotive context.
And so we're getting a lot of traction there. We're also integrated into a number of social robots that are going to be on the market towards the end of the year. So at Affectiva, our mission is to humanize technology by bringing emotional intelligence into our digital interactions. And fundamentally, I think this is going to change the way we interact and interface with the technologies on our devices. But I also think it will fundamentally change how we, as humans, connect with one another.
A lot of our communication is mediated through technology, and even though we're connected to way more people today through our devices than we ever were, I feel like the quality of these connections is quite poor. Still, nothing beats a face-to-face conversation. So what we're trying to do is bring this emotional data-- an emotional element-- into our online conversations, thereby humanizing technology.
And so ultimately, we envision a future where all our devices have a small emotion chip, with cameras and microphones, that senses and adapts to your emotions in real time. That's going to manifest in your car and, obviously, your phone, but also in devices you might not expect-- a fridge could have an emotion chip and personalize its suggestions based on your mood.
So what sets us apart in this emerging space is our data. We've amassed 5.4 million face videos from around the world, which powers our machine learning-- because, as you know, in the deep learning world, it's not about the algorithm alone. It's also about the data that powers these networks. So that's definitely a competitive advantage.
I would also point to our general market traction-- the fact that we are doing business in 75 countries around the world, that we're embedded into platforms that use our technology day in, day out-- and then our SDKs that run on device. The notion of shrinking these machine learning models so that they run in real time on Raspberry Pis-- that's definitely something that's unique about our technology.
[MUSIC PLAYING]