Postdoctoral Researcher, Microsoft
Timnit Gebru works in the Fairness Accountability Transparency and Ethics (FATE) group at Microsoft’s New York Lab. We chat about opportunities for designers in AI, algorithmic bias, and some of Timnit’s recent research using computer vision and machine learning to make demographic estimates.
Machine learning & product design
Eli Woolery: Timnit, welcome to Conversations on DesignBetter.Co. I’m so excited to have you here. You’re a long time friend. We were co-founders in a startup many years ago. I’m really excited to talk with you about some of the things that you’re working on, and some of your perspectives on how artificial intelligence, machine learning, and computer vision are impacting product design.
Let’s just start there. You’ve been in academia at Stanford, finished up your Ph.D. in 2017. And now you’re a post-doc with Microsoft, so you’re more in industry, though in an academic portion of industry. What kind of impacts, from your viewpoint, are we currently seeing from AI machine learning on product design?
Timnit Gebru: I think that what’s happening is that AI and machine learning are sort of embedded in everyday products, to the point where you don’t know that what’s going on is AI or machine learning, right? Right now, I’m working in a group called FATE: Fairness, Accountability, Transparency and Ethics in AI. Basically, trying to understand the societal impacts of AI, like what are the ethical considerations, what are some standardizations we should have, things like this. And also working on machine learning algorithms that are hopefully fair.
But one thing I’m seeing, along the lines of my work, is that we need more people who think about design. I’m going to flip it: I’m going to say, we need more people who think about design working in AI, because oftentimes what’s happening comes down to little things. Think about every time you see a poster or some sort of advertisement for an AI-based talk or the future of AI. You have some bionic woman … you know what I mean? Just the portrayal of what that machine is. Then think about Siri or Alexa or these personal assistants who are women; what does that do to society, just portraying that stereotype?
A friend of mine, Meg Mitchell, was involved in this Microsoft product called Seeing AI before she left. There’s this thing called image captioning: it’s a computer vision and natural language processing problem where you look at an image and describe what’s in it. That’s the goal of the whole image captioning project. You can see that in terms of design for accessibility, this can have huge consequences. It can have a huge impact.
But oftentimes what happens is that you have people in research working on this, and they work on the image captioning problem not actually thinking about people who are visually impaired, or the product side, or how this might look if it was going to work in the real world.
Then you see a lot of papers about [image captioning], and in the introduction, the motivation is always “we want to help visually impaired people,” or something like this. But you realize they’ve never done the needfinding, the whole design thinking process. Needfinding to figure out, okay, are we answering the right research questions? What are the pitfalls? Can we work with people who are actually visually impaired when we start doing the research to start with? Because then it has hopes of being more impactful.
You can have a lot of challenges with image captioning, but it has the potential to impact a lot of people’s lives. If it’s not done in the right way, though, it won’t. I think Seeing AI is one example for me where the researchers actually took a lot of time and effort to work with people who are visually impaired and make sure it turned into something useful.
The lesson I draw from being [at Microsoft] as a researcher in AI is that … it’s like every time people talk about diversity it becomes a cliché, but for me, it’s really important to bring in people who think about design, who think about needfinding to work on just even on the research side of AI. Because then we can hopefully make something that’s impactful.
Empathy, bias & AI
Eli: That’s wonderful. You did a great job of segueing into my next question: as designers we pride ourselves on having empathy for users, using design thinking and needfinding. We recognize that a product’s design is strengthened by building it for a diverse set of users. Do you see any ways that AI or machine learning can impact how we can get those diverse sets of perspectives through data on the products that we are designing?
Timnit: That’s interesting. I guess you could do some data analytics to see how well a product is working for different groups of people, or maybe to try to find problems. I work a lot on bias in machine learning. I work on the opposite problem of how data that is not representative of people could be used to train a machine learning algorithm and arrive at biased conclusions.
An example could be that if you have software that’s trying to decide your car insurance rate … Cathy O’Neil, in her book Weapons of Math Destruction, showed that you can have a rich person with a drunk driving record who gets a much lower insurance rate than a less wealthy person who has no such record. You can arrive at biased conclusions.
I think that, again, we need a lot of designers to work on these things, because there are products that aren’t tested on a diverse group of people, or whose test set for the algorithm doesn’t consist of a diverse group of people, so you end up with products that don’t work well on some people.
For example, I learned that some soap dispensers, because of the sensors they use, won’t work for people with really dark skin. This is a design problem. You’re creating a product that’s not going to work for a certain group of people.
I recently co-authored a paper with Joy Buolamwini (it was mostly her work), where we looked at these simple gender classification APIs. We analyzed three different APIs. They look at a picture of someone, and they give you a binary male/female label. We found that the darker the skin, the worse they work. On darker-skinned women the error rate is 33%, while on lighter-skinned males it’s almost 0%.
Again, this is a design problem. I feel like if you have more designers in this field, they would start to bring up these potential issues: “Maybe we should test it like this, maybe we should do needfinding.” There are starting to be a lot of these cases that are being shown. Things not working on women versus men, or different age groups. Or you have things like speech recognition that doesn’t really work well on people who are younger, and we don’t have any standards to say, “Okay, if you’re going to use AI for this particular thing, then it needs to have these characteristics.” That’s stuff that I’m working on right now, too. I really feel like this field would benefit from more designers getting involved early on, even in the research phase.
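The disaggregated testing Timnit describes, measuring error rates per demographic subgroup instead of one aggregate number, can be sketched in a few lines. The groups, labels, and predictions below are invented illustration data, not results from the actual Gender Shades study:

```python
# Minimal sketch of per-subgroup error rates for a binary classifier.
# All data here is hypothetical illustration, not real benchmark results.
from collections import defaultdict

def error_rate_by_group(records):
    """records: iterable of (group, true_label, predicted_label) tuples."""
    errors = defaultdict(int)
    totals = defaultdict(int)
    for group, truth, pred in records:
        totals[group] += 1
        if truth != pred:
            errors[group] += 1
    # one error rate per group, rather than a single aggregate score
    return {g: errors[g] / totals[g] for g in totals}

records = [
    ("darker_female", "F", "M"), ("darker_female", "F", "F"),
    ("darker_female", "F", "M"),
    ("lighter_male", "M", "M"), ("lighter_male", "M", "M"),
]
rates = error_rate_by_group(records)
```

An aggregate accuracy over these five records would hide the fact that all the errors fall on one group; reporting `rates` per group is what surfaces the disparity.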
Eli: I’m sure a lot of our audience will be excited to hear that, and will start poking around for those types of roles. Let’s talk a little bit about your Ph.D. research, in particular, your paper Using deep learning and Google Street View to estimate the demographic makeup of neighborhoods across the United States. Like many papers, a very lengthy title, but really interesting work there. Maybe you can summarize it for the layperson like myself.
Timnit: Actually [the original title] was shorter, and I thought that even that title was really long; the reviewers told us to change it. The goal of the project was to show that you could do data mining with images. Basically, a lot of people look at social network data or text data or Twitter data, and they do data mining and try to extract useful information and insights from it.
What we wanted to show was that, given that most of the digital data in the world is in the form of images, we really haven’t started to process those millions and billions of images to gain some sort of useful insight. We wanted to show that this could be done.
We wanted to see if we could infer some of the data that is gathered by many different agencies, like the ACS (American Community Survey), to try to figure out demographic characteristics, and even a general idea of carbon footprint data for each state just to see which city is worst and which city is best. We looked at crime data, which is actually what led me to my next job, trying to understand the societal impacts of this kind of data mining.
We had 15 million Google Street View images from 200 of the most populated cities in the US. We detected and classified all of the cars in those Street View images to the best of our ability using computer vision, and then we associated the characteristics of the cars with the types of people who live there. We tried to predict things like income, education, racial makeup, voting patterns. Because cars can tell you a lot about people.
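At a high level, the last step of that pipeline, associating per-region car statistics with demographic variables, amounts to a regression. Here is a minimal sketch under made-up numbers; the car classes, regions, and incomes are all invented, and the real project used far richer features extracted from the 15 million images:

```python
# Hypothetical sketch: regress a demographic target on per-region car features.
import numpy as np

# rows = regions, columns = fraction of detected cars in each (made-up) class:
# [sedans, pickups, luxury]
car_features = np.array([
    [0.50, 0.30, 0.20],   # region A
    [0.20, 0.60, 0.20],   # region B
    [0.30, 0.20, 0.50],   # region C
])
median_income = np.array([55_000.0, 48_000.0, 90_000.0])  # per-region target

# ordinary least squares: median_income ≈ car_features @ w
w, *_ = np.linalg.lstsq(car_features, median_income, rcond=None)
predicted = car_features @ w
```

With real data there are many more regions than car classes, so the fit is approximate and should be validated on held-out regions (the paper validated against survey data such as the ACS).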
It was interesting because it did two things. One is it verified our intuition that cars tell you a lot about people. I had read a whole bunch of articles about this in market research: relationships between different kinds of cars and who votes for whom, or age, or [other] demographics. It was interesting to see that we could extract that information from Google Street View images. That was, in a nutshell, my Ph.D. That took four years.
Eli: That’s fascinating. I was listening to an interview with you on a podcast that was more heavily focused on the machine learning aspects of this. There are some real challenges there, because of the data that you’re getting. I think you used Edmunds as a source for some of the imagery, but then you’re looking at Google Street View and the perspective is very different, and sometimes cars are blocking the view. How did you guys work through some of those challenges?
Timnit: It’s a very different type of data when you get it from things like Craigslist or e-commerce sites. When someone’s trying to sell you their car, they’re going to give you a really nice image of their car. The one you’re trying to classify or detect in Street View looks very different. One thing we did is train an algorithm using images that are labeled with the types of cars contained in those images, and teach it how to recognize these cars.
When we tried to train it on the e-commerce site and test it on Google Street View, it just didn’t work. We then annotated a subset of Google Street View images. We hired experts, and we built a UI. Again, that’s where design could help.
Hiring experts to label images for you, designing the task, designing the user interface, that’s also an important aspect. There’s a lot of research at the intersection of computer vision and HCI (Human Computer Interaction). One of my papers during my Ph.D. was an HCI paper, because of data gathering and labeling: how do you do it effectively, how do you design the tasks. We designed a UI to help guide the experts through labeling the Google Street View cars, and it’s a very difficult task.
But there are people who just love doing this kind of stuff, and they know everything about cars. I don’t understand how it’s possible that they could do it, but they did it. We annotated a subset of our cars, and we used that in conjunction with the other data that we had to then train an algorithm that could recognize cars in Google Street View images. Labeling those cars in Street View is very, very expensive because you need to hire experts and pay them a certain amount per hour. That’s a pretty big problem in machine learning if you want to apply it in a different setting.
There is also a field called domain adaptation that pertains to this exact topic: your training set, the data you use to train your algorithm, looks different from your test set. In this case, like you said, in the images from Google Street View the cars are small, they’re blurry, they’re hiding behind a tree or something like that. So that looks very different from the training set. There’s a whole field that studies this and tries to make your predictions on the test set better regardless of the kind of data you have in your training set.
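One simple strategy consistent with what Timnit describes, pooling the big labeled source set (clean e-commerce photos) with a small annotated target set (Street View crops) and up-weighting the scarce target labels, can be sketched with scikit-learn. All features, sizes, and weights below are invented for illustration; this is just one basic approach from the domain adaptation literature, not the paper’s actual method:

```python
# Hypothetical sketch: mix source and target labeled data, up-weighting the
# small target set so the classifier adapts toward the target domain.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# source domain: crisp features, many labeled samples (label depends on x0 > 0)
X_src = rng.normal(loc=0.0, scale=1.0, size=(500, 4))
y_src = (X_src[:, 0] > 0).astype(int)

# target domain: shifted, noisier features, few labels (boundary at x0 > 0.5)
X_tgt = rng.normal(loc=0.5, scale=2.0, size=(40, 4))
y_tgt = (X_tgt[:, 0] > 0.5).astype(int)

X = np.vstack([X_src, X_tgt])
y = np.concatenate([y_src, y_tgt])
# give each scarce target sample the weight of ~10 source samples
weights = np.concatenate([np.ones(len(y_src)), 10.0 * np.ones(len(y_tgt))])

clf = LogisticRegression(max_iter=1000).fit(X, y, sample_weight=weights)
target_accuracy = clf.score(X_tgt, y_tgt)
```

Training on the source data alone would place the decision boundary in the wrong spot for the target domain; the sample weights pull it toward the target boundary despite having only 40 target labels.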
Eli: What are you most excited about in your current work, in your role at Microsoft?
Timnit: I’m very excited and I’m very scared. I’m most excited about the potential things that you can do with AI to reach people who don’t have resources. There are a lot of potential things you can do. There are things that excite me about drones being able to take supplies to remote areas, or research into a phone app that can diagnose diabetes from looking at your retina.
But then the same things also scare me. Drones with AI capability scare me, because we are seeing, right now, disproportionate use of AI to target people who are marginalized already. We also see that the software itself is biased against those people, because you have data that’s historical. That’s the only type of data that you have, and this historical data has issues.
I’m excited about the potential positive impacts, and I’m very worried about the potential negative impacts. I’m happy that I’ve been given a platform to be able to work on trying to mitigate the negative impacts, and also educate the public and talk to people about it so that they have more awareness.
Books and resources
Eli: One last question. Are there any books or blogs or podcasts that you’d recommend for a designer and machine learning novice like me?
Timnit: There is a podcast called This Week In Machine Learning, and Sam Charrington runs it. He has great people talking about their work, and it’s meant for novices.
If you’re interested in the ways in which algorithms are being used right now and understanding the potential unintended consequences, I always recommend Weapons of Math Destruction by Cathy O’Neil. That’s a very good book, just to get an understanding of what’s going on. It was very eye-opening for me.
Eli: Perfect. Well, Timnit, it was just fantastic having you on Conversations. Thanks so much.
Timnit: Thank you for having me.