
Poisoning AI and Putting Privacy First with Heather Zheng

In this episode, we talk to Heather Zheng, who makes technologies that stop everyday surveillance. These range from bracelets that stop devices from listening in on you to more secure biometric technologies that can protect us by identifying us by, for example, our dance moves. Most famously, Zheng is one of the computer scientists behind Nightshade, which helps artists protect their work by 'poisoning' AI training data sets.


Heather is the Neubauer Professor of Computer Science at the University of Chicago. She received her PhD in Electrical and Computer Engineering from the University of Maryland, College Park in 1999. Prior to joining the University of Chicago in 2017, she spent 6 years in industry labs (Bell Labs, NJ and Microsoft Research Asia) and 12 years at the University of California, Santa Barbara. At UChicago, she co-directs the SAND Lab (Security, Algorithms, Networking and Data) together with Prof. Ben Y. Zhao. She was selected as one of MIT Technology Review's TR35 (2005) for her work on cognitive radios; her work was featured by MIT Technology Review as one of the 10 Emerging Technologies (2006). She is a fellow of the World Technology Network, an IEEE Fellow (class of 2015) and an ACM Fellow (class of 2022).


Image from https://towardsdatascience.com/how-nightshade-works-b1ae14ae76c3, originally taken from the Nightshade paper


Resource List:


Access Glaze and Nightshade here: https://nightshade.cs.uchicago.edu/index.html



You can follow Glaze here: https://x.com/theglazeproject



Transcript:


Kerry: 

Hi, I'm Dr. Kerry McInerney. Dr. Eleanor Drage and I are the hosts of The Good Robot podcast. Join us as we ask the experts: What is good technology? Is it even possible? And how can feminism help us work towards it? If you want to learn more about today's topic, head over to our website www.thegoodrobot.co.uk where we've got a full transcript of the episode and a specially curated reading list by every guest. We love hearing from listeners, so feel free to tweet or email us. And we'd also so appreciate you leaving us a review on the podcast app. But until then, sit back, relax, and enjoy the episode.


Eleanor: In this episode, we talk to Heather Zheng, who makes really cool technologies that stop everyday surveillance. These range from bracelets that stop devices from listening in on you to more secure biometric technologies that can protect us by identifying us by, for example, our dance moves. We hope you enjoy the show.


Kerry: So thank you so much for joining us here today. It's really a pleasure to get to chat to you. So just to kick us off, could you tell us who you are, what you do and what brought you to thinking about ethics and technology?


Heather: Thank you for having me. My name is Heather Zheng. I'm a faculty member at UChicago.

And before that I was at industry research labs, Microsoft Research and Bell Labs. I'm a very private person, so privacy and security is always at the top of my list.


And my research is basically driven by practical problems that I face in these privacy and security domains.


Eleanor: Awesome. Thank you so much for coming. It's a pleasure to have you here. We are the good robot. So what is good technology? Is it even possible? And how can feminism help us get there?


Heather: I think technology, especially artificial intelligence, brings a lot of convenience, and it can solve a lot of very hard problems that usually take many years or a great amount of money to solve. Now it brings solutions to us in a few minutes, a few seconds, using a smartphone, using something that we can wear or carry every day. So this has largely changed our lives. So I think that's the good part of the good robot: it really brings a lot of enabling technologies to end users, beyond just the very fancy and very complex systems.

So I think that's really enjoyable for everybody as well.


Kerry: Yeah, I think this aspect, the extent to which these new technologies have profoundly changed everyday life, is something that I really failed to grasp as someone who's in their mid-to-late twenties now (definitely mid, really stretching), because I never grew up in that generation before, say, almost total internet connectivity. I ask my mom sometimes, what would you do if you were trying to meet someone and you couldn't text them? How did you find anything? When people drove, did they just have to know where to go, since you couldn't rely on Google Maps? And I think that's almost a kind of phase of life that feels profoundly different to me. But I actually wanted to come to this question of privacy that you've raised when you talked about your interest in ethics and feminism and tech, because you've worked on a huge number of really exciting projects which grapple with privacy and security.


I don't think Eleanor and I will even be able to touch on them all in this conversation. So I want to start with one that really intrigued me, which was the bracelet of silence. Could you tell us a little bit about what this is and what it does?


Heather: Yeah, I have coworkers who really want to bring smart devices into our offices and even our homes and things like that.


But at the same time, I'm a private person. It's very convenient: it tells you what the weather is, you can basically ask it what's going on. They're just very convenient devices. But at the same time, I know it's always listening, and I just don't want people to listen to my conversations or anything else.


So wherever I go, there will be millions of these surveillance devices near us, especially microphones. How do you deal with them?


So that's why we thought about, hey, what can we do if we build a wearable? And then I learned about these ultrasonic jammers. It's not a new technology; maybe 10, 20 years ago, musicians actually used it in their process. So we thought, why can't we turn that into a defense? Then we started making the device, and we went through many iterations of the design to make sure that the end user can use it, and also to make sure that you don't need to go look for the microphone, locate it, and then aim [00:05:00] towards it.

How does it work in all directions without disturbing others or disabling other devices that you might need? That eventually leads to the bracelet of silence. Essentially, you wear the jammer around your wrist, like a bracelet, and it emits ultrasonic sound to disable the microphones near you, but does not disable microphones far from you, which may have other uses and do not disturb your privacy.


And then we found out, hey, there's an artefact: there's a blind spot, some kind of coverage blind spot. So how do we solve that? Then we found out that natural user movement actually disrupts it. So we tested that and figured things out. We actually collaborated on this with another HCI faculty member here, Pedro Lopes.


He basically helped us put this into a truly wearable form and minimize the battery usage. So eventually that leads to a platform that is cheap, that everybody can wear, and that's effective, at least within a meter range. And you don't need to go locate the microphone near you; it will just do the work for you. Usability is important to us, and the cost as well.
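For readers curious about the mechanics: the bracelet works by flooding nearby microphones with ultrasound that humans cannot hear but that microphone hardware demodulates into audible-band noise. Below is a minimal, illustrative sketch of how such a jamming signal could be synthesized in software; the sample rate, frequency band, and function names are assumptions for illustration, not the bracelet's actual design.

```python
import numpy as np

SAMPLE_RATE = 192_000    # Hz; well above twice the jamming band (assumed)
BAND = (24_000, 26_000)  # assumed ultrasonic band in Hz, inaudible to humans

def ultrasonic_noise(duration=1.0, rate=SAMPLE_RATE, band=BAND):
    """Return white noise band-limited to the ultrasonic jamming band.

    Humans can't hear this band, but nonlinearities in MEMS microphones
    demodulate it into audible-band noise that masks nearby speech.
    """
    n = int(duration * rate)
    noise = np.random.randn(n)
    spectrum = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(n, d=1.0 / rate)
    keep = (freqs >= band[0]) & (freqs <= band[1])
    spectrum[~keep] = 0.0                 # zero everything outside the band
    sig = np.fft.irfft(spectrum, n)
    return sig / np.max(np.abs(sig))      # normalize to [-1, 1]

if __name__ == "__main__":
    sig = ultrasonic_noise()
    print(f"generated {len(sig)} samples in the {BAND[0]}-{BAND[1]} Hz band")
```

In the physical device, transducers emit this kind of signal in all directions, and, as Heather describes, natural wrist movement smooths over the coverage blind spots.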


Eleanor: So cool, and I can totally imagine people doing that. This is probably going to be our future, where we're wearing these wearables, bracelets that disable microphones near us. And I think maybe it seems shocking right now because we don't want to believe that there are things listening. But we know we live with Siri and Alexa; we know that these are things that we coexist with. So yeah, to me that seems pretty plausible right now. Are companies interested in building these things?


Heather: At that time, we had a lot of interest. We had venture capital, VCs, reach out to us to try to fund it. We didn't do that at the time because we're not really interested in commercializing. We actually open-sourced it so everybody can manufacture it on the cheap. And it's mostly for the end user.


And then we also had a designer who did this as a necklace, so you wear it around your neck. She built it, wore the same design, and tested it, and it actually worked. So it could be a nice piece of jewelry or any accessory you wear; as long as it moves with you, it can do the job. So we didn't really want to commercialize it ourselves; rather, we provide the design for whoever wants to build it and can leverage that. I think that's our approach. And then we moved on to other projects as well.


Eleanor: What would we do without these guardian angels? We should all be thankful we have good academics like Heather helping us all out.


Another really cool project that you've been working on is to do with building consent layers. And we know that there's a lot of outcry from artists and creatives against generative AI that takes their data or their work in order to produce new images. I went to a conference recently where there was a presentation by Spawning.ai, that's S-P-A-W-N-I-N-G, and they are building these consent layers as well: they produce data sets from which artists and other creatives can opt out, so that everyone included has consented to being in the data. And companies like Hugging Face are using these consent layers, which is really encouraging.


One of your projects, called Glaze, is a system that's designed to protect human artists by disrupting style mimicry, which is a really cool concept. Could you tell us a bit more about that, and also about Nightshade? Because I think that's the project I found you through: it's a technology that poisons artwork to confuse a training model that is trying to decide what an image is.


So yeah, super cool ideas. Tell us about them.


Heather: So this is all joint work with my co-lab director, Ben Zhao. We actually work on these projects together, and this is one of his master ideas. I think the key thing for Glaze, and even for the system we call Fawkes before it: Fawkes is dedicated to protecting against facial biometric systems, against surveillance cameras recognizing us,

and also against the use of our online images to train facial recognition against us without our consent. And the similar idea can be applied to art, in essence. Glaze is trying to help where companies incorporate opt-outs. However, a lot of the time you cannot force them, and you do not know whether they really exclude images; they can still learn from images you opted out from. So Glaze is more about what individual artists can do before they post their photos or their art online, what they can actually do to protect it in addition to the opt-out, because an opt-out cannot really force anything, and it's very hard to know whether companies are really honoring it or not. So Glaze adds some perturbations to the images, ideally imperceptible ones, especially in art: you can hide a lot of these within the texture and different areas of the art, so they become part of the art as well. What it does is actually change the feature representation that the art presents, which confuses the fine-tuning or mimicry models and keeps them from learning or reproducing the art in their outputs.


So I think that is a way for artists to protect the art they post online. The same thing happens with Fawkes: before we upload our facial photos online, like to social media, we can also add these perturbations, which prevent companies from using our online images to train facial recognition models against us.


So I think that's the general thing: we want end users, who usually feel powerless against these giant AI models or AI services that behave in these privacy-intrusive ways, to have some power to protect themselves, and to be able to do that immediately, without waiting for the other side to take action, or for policy to take action.


And we hope that these methods we produce can make companies aware, and eventually better ensure copyright protection against all the unethical use of these online images, or even videos as well. But Nightshade is slightly different: it's not against the mimicry model that somebody trains on a smaller set of a specific artist's images.

Nightshade is more about how to confuse the base model, the giant base model that the AI company trains without the consent of the people whose images are used for training. So it's a way to also protect the copyright of this data, not against individual mimicry, but against the holistic models built on it.


Yeah, so I think that's just eventually moving up the stairs in terms of the protection the artists can add to their images.
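For the technically curious: the cloaks behind Glaze and Fawkes rest on the core idea Heather describes, small perturbations that shift an image's feature representation. Below is a deliberately simplified sketch of that idea, using a generic pretrained feature extractor and a basic optimization loop. The actual Glaze, Fawkes, and Nightshade algorithms use different models, objectives, and perceptual constraints, so treat this purely as a conceptual illustration, not their method.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# A generic pretrained network stands in for the feature extractors that
# mimicry pipelines rely on; the real systems target different models.
extractor = models.resnet18(weights="IMAGENET1K_V1")
extractor.fc = torch.nn.Identity()       # keep penultimate features
extractor.eval()
for p in extractor.parameters():
    p.requires_grad_(False)

def cloak(image, decoy, eps=8 / 255, steps=50, lr=0.01):
    """Perturb `image` (1x3xHxW, values in [0,1]) so its features
    approach those of `decoy`, within an L-infinity budget `eps`."""
    with torch.no_grad():
        target = extractor(decoy)        # decoy's feature embedding
    delta = torch.zeros_like(image, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        feats = extractor(torch.clamp(image + delta, 0.0, 1.0))
        loss = F.mse_loss(feats, target) # pull features toward the decoy
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)      # keep the change subtle
    return torch.clamp(image + delta, 0.0, 1.0).detach()
```

The `eps` budget captures the property Heather mentions: the change stays hidden in the texture of the art while the model's view of the image shifts substantially.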


Kerry: Yeah, that's really fascinating. And, first, I just love the name Nightshade as well. It feels very witchy and delightful. And I also recognize I look very witchy and delightful to anyone who is watching our YouTube channel.


I'm not voluntarily sitting in the dark. I booked a room at the university where I do a visiting fellowship and the lights don't seem to really work, so I have my phone light blaring at me and I look very strange. But second, I think the work that you're doing with Nightshade and with Glaze is really countercultural to the moment that we're in, which seems to be all about algorithmic sorting and classification, all about trying to create clear-cut boundaries, all about scraping mass amounts of data for commercial purposes.


And these tools really fly in the face of that. So I guess I'd be interested to know: how has your work generally been received by other computer scientists and by people working in this field? Are people interested in the work you're doing? Do they see it as useful and provocative, or do they resent these kinds of projects, which are trying to, I think, bring a lot more ambiguity and mess into what can often be quite a streamlining field of work?


Heather: Yeah, thank you for that question. We're actually very happy that the work on Glaze and Nightshade, and even going back to Fawkes, was very well received and supported by our colleagues in the computer science community and even beyond. I think once we bring out these issues that artists face, people understand them very well and truly support us.


So we have a lot of very positive feedback, a lot of support from many other researchers in our areas. The papers, the Glaze paper and the Fawkes paper, all got accepted by top security conferences, and Glaze also got multiple awards. Everywhere we go, we get a lot of praise from all these researchers and companies and even communities, which has helped a lot, especially for the students, because the huge amount of effort they put in actually has a great impact now, in terms of helping individual artists but also educating everybody else and making them aware of the issue. I think that's as protective as a researcher can be in this case. So the students are thrilled and pumped to do more. It's really wonderful to see that.


Eleanor: Oh, you sound like such a lovely teacher. They're very lucky to have you. How do you get the word out to artists or to people who want to protect their data? Because we are asked all the time, I think, by people who are often quite despairing, and it's amazing for us to discover that these kinds of technologies are out there, that there are many different possibilities that can allow artists to protect their data.


Heather: I think the important thing is to educate them and make them aware that doing so is not going to take tons of money and tons of computing time. That's why we initially developed the app for macOS and Windows, so that they can run it on their individual laptops. We also recognized, by communicating with artists through this process, that some of them may not have the computing resources to deploy or run Glaze or Nightshade in a speedy manner.

And that's why we actually have a platform where they can upload their images and get them Glazed. And recently, we've also been approached by companies who want to [00:16:00] implement these in their platforms, so we're exploring collaborations and seeing how this goes.

We're examining all of these carefully, because we're also asking artists to submit their images online, so how do we protect that process? Through all of this we are learning a lot, but I think the most important thing is making it usable and cost-effective for the artist. We're still learning through that.

But that's always our methodology: once it's usable, then people will use it.

 

Kerry: I think that's a really wonderful approach. I think it's so important to take into account people's different infrastructural needs and also the gap in knowledge and practice, right?

So I can imagine a lot of people feeling really intimidated by the idea of, how do I grapple with one of these tools? And the fact that you've managed to build in so many different kinds of steps and considerations is really important for closing that knowledge gap. And for our listeners, we will also link to Web Glaze as well as these award-winning papers, which is very exciting.


I didn't realize it had won so many awards. That's wonderful. We'll include those links in the transcript of this episode, which is available on our website, www.thegoodrobot.co.uk. And every other week when we post, we'll always have a full transcript of each episode and a specially curated reading list available, so if you want to dive into these topics more or use any of the tools that we talk about, you're able to do so.


I wanted to ask you a little bit more about one of your other projects, because you're incredibly prolific. I don't know how you have time to make all these exciting tools, but I've also heard that you do some work on biometrics, specifically trying to rethink what we could potentially use instead of biometric data as a way of, for example, unlocking our phones or identifying ourselves.


And so I saw that you've been working on something called, I'm going to grab the name so I don't get it wrong, electrical muscle stimulation, as an alternative to passwords or fingerprints. So could you tell me what this is? Because I'm going to be honest, it gives me scary dystopian visions of a large electrified spider or something like that. And how does it pertain to this question of security and identification?


Heather: Yeah, I'd love to. So as we said, we work on faces and fingerprints, or even voices; all of these are currently our biometrics, but you only have a single copy of them. So once they're out, your biometrics are leaked. There are a lot of laws protecting our biometric data.


But what if they get lost and there's no way for you to recover them? They can be sold on the dark market, and you cannot change your face. We only have one face, or ten fingers. So that's why we thought: can we actually, you know, find alternatives that are easier to change?

Electrical muscle stimulation, EMS, let's go with that. It basically stimulates your nervous system, using a very minor electric current to move your fingers and body. It's normally used for health recovery, to help people regain their muscle strength or guide their movement, or even teach you to play piano.


But we thought, why can't we take that into a biometric system? Traditionally, the work on EMS tries to minimize the differences across people. But what if we leverage the uniqueness, the fact that everybody's muscle and fat and skin are very different, as a marker to identify ourselves?


But instead of making it static, like a face, we can generate an activation which can vary, like a challenge signal, and we can generate millions of passwords, one with each activation. As your muscles move, I detect the muscle movement, and you have a unique movement in response to the same challenge signal. So that can generate millions of passwords from a single second of stimulation.


It's not really that scary, because you have all these EMS units sold on Amazon and many other places; they help your body relax. You can use the same device to authenticate yourself. So essentially, let's say in a future VR system you have a headset. It has a camera that can recognize your body movement; it triggers you to move, captures that movement, and then the VR headset unlocks.


So that's where we're going, but you can go for all other kinds of scenarios. One of the important advantages of this is that if you stole one of my movements, if you recorded that I move this way and tried to repeat it or hack the system to repeat it, it's fine: that one leaked, so I erase it.


I only use it once, and the next one will be different. So you can always re-record by changing the signal. Then you have something we call an active biometric, with millions of these one-time passwords, generated just by moving your arm like this. So that's the idea: we leverage your body motion, which is much harder to fake, and which you can regenerate and modify, so it's leakage-proof in that sense; you can always recover. But we had to test it and build a system, and it's hard because we have a limited amount of data. So there's a machine learning challenge: if you have a small amount of data, how do you cultivate your model to leverage it to actually authenticate you? We had to invest some time in that, and we did, and we tested it in real-life scenarios, and it seems to work. So we hope that in the future, as we embed more of these wearable systems that stimulate our arms and body motion, especially in VR and many other systems, we can leverage that for authentication as well.


And the final thing is that we believe it's particularly valuable for people with disabilities, because it doesn't need your cognition. You don't need to remember a password; you just need the triggered arm movement. The EMS actually replaces your brain in driving that movement. We hope that can also find some application in the medical field, so that if people don't want to remember a password or show their face, they can use this to, say, authenticate their wheelchair or other medical devices on the fly.


So that's another area of potential applicability for this.
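To make the challenge-response idea concrete, here is a hypothetical sketch of the loop Heather describes: a fresh random stimulation pattern per attempt, a per-user model of the expected muscle response, and single-use challenges so a replayed recording is useless. Every class and function name here is invented for illustration; the real system's sensing and machine learning are much richer.

```python
import os
import numpy as np

def similarity(a, b):
    """Normalized correlation between two motion time series (numpy arrays)."""
    a = (a - a.mean()) / (a.std() + 1e-9)
    b = (b - b.mean()) / (b.std() + 1e-9)
    return float(np.mean(a * b))

class ActiveBiometricVerifier:
    """Hypothetical verifier for an EMS-based 'active biometric'."""

    def __init__(self, user_response_model):
        # user_response_model: callable mapping a challenge (bytes that
        # seed a stimulation pattern) to the user's expected motion
        # trace, learned during enrollment. Entirely hypothetical here.
        self.predict = user_response_model
        self.pending = set()

    def new_challenge(self):
        """Issue a fresh, never-reused stimulation pattern seed."""
        c = os.urandom(16)
        self.pending.add(c)
        return c

    def verify(self, challenge, measured_motion, threshold=0.9):
        """Accept only if the measured muscle response matches the
        user's model for this exact, single-use challenge."""
        if challenge not in self.pending:
            return False                  # unknown or replayed challenge
        self.pending.discard(challenge)   # burn it: one-time use
        expected = self.predict(challenge)
        return similarity(expected, measured_motion) >= threshold
```

The single-use bookkeeping is what makes this an "active" biometric: unlike a face, a leaked response can simply be discarded and the next challenge produces a different one.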

 

Kerry: That's fascinating. And as someone who recreationally comes from a dance background, I'm always really excited to see these kinds of applications, which draw on the power of movement, both communicative and emotional and expressive, and on how distinct people's bodies are. And so to see that as an alternative to the way we currently think about security, particularly things like biometrics (I personally am of the opinion that we collect far too much biometric data and that there should be significant restrictions on it), is really exciting.


I also, very selfishly, think this could be a great way to bring your different party tricks or particularly weird body moves into your daily life. And I think life should just have more dancing in general. So what would be your specific funky movement that you would want to use as your password, to let you into your devices?


And we do videos: if you're listening to this on audio, you can watch us on YouTube, but we will describe the actions as best we can for those of you who aren't watching. So Heather, what would be your password dance move?


Heather: I actually teach a mobile computing class.


I actually challenged my students in that class to produce some of these moves, and they came up with this kind of cowboy dance move that is very difficult to mimic. I think the challenge is how we get a camera to capture that in real life, and whether you'd want to do that in public. But there's so much creativity, like you said.


There's so much creativity in that. I think it'll be great if, in the near future, we're able to utilize more of our body motions to help us authenticate. Along the way, we also showed that if you reveal too much of that movement, you can actually hurt your privacy. The most recent work we did shows that if I'm able just to see your typing motion, I do not need to see your keyboard.


The keyboard can be completely hidden. And if I see that motion from, let's say, 10 or 20 meters away, I can fully recover, or 90 percent recover, your typed contents within five to ten minutes of your typing. How? It's a paper we just published this year. Basically, there's no training data.


We do not collect any of your training data. And our typing behavior is very unique; everybody types in their own very, very unique way (I learned that the hard way from this project). We were able to use a self-supervised learning method, trained on noisy data in a two-layer solution, to eventually figure it out.


We figure out your typing's mapping onto the keyboard, and even if you change your keyboard layout, no problem; we do not depend on the keyboard layout. After you type around 500 words, we can leverage a language model to fully recover your typing behavior and guess your typed contents with about 90 percent accuracy.


So I think that's just telling people that if someone can see your motion and record it using a smartphone, which is very easy (there are a lot of people recording at airports and many other places), then when you're typing in a lounge, in the yard, in the park, you think you're fine, but you're not.


And then we recently extended that to VR. In VR these days, you can also do some typing, even on a physical keyboard. Other people cannot see the real you, you're hiding in your home, but they can see your avatar's fingers typing. By recording the avatar's finger typing, just a screen recording, I can do the same attack against VR users.


So that brings out how motion also leaks a lot of private data, especially your keystroke data, which is really private; it covers a huge amount of sensitive information, sometimes even beyond our identity. So that's another area that we're tapping into, trying to understand how this motion data, especially active input motion data, can be protected, whether in public space or in VR space. We need to figure out a way. In public space, you could just put a shield in front of you, which is weird; but for protection in VR, we had to do some kind of system-wide obfuscation. So that's something we're working on right now.
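A rough way to picture the typing attack: once keystroke motions are clustered into anonymous symbols, the victim's text is effectively a substitution cipher, and language statistics alone can crack it, with no training data from the victim. The toy solver below hill-climbs over a handful of English bigram frequencies; the published attack is far more sophisticated (video pose tracking, self-supervised learning, a full language model), so this is only a conceptual sketch.

```python
import random
import string
from collections import Counter

# A few common English bigrams with rough relative weights; a real
# attack would use a full language model instead.
BIGRAMS = Counter({"th": 27, "he": 23, "in": 20, "er": 18, "an": 16,
                   "re": 14, "on": 13, "at": 12, "en": 11, "nd": 11})

def score(text):
    """How 'English-like' a candidate decoding looks."""
    return sum(BIGRAMS.get(text[i:i + 2], 0) for i in range(len(text) - 1))

def solve(symbols, alphabet=string.ascii_lowercase, iters=20_000):
    """Hill-climb a symbol->letter mapping that maximizes the score.

    `symbols` is the observed sequence of motion-cluster IDs, one per
    keystroke; nothing else from the victim is needed.
    """
    syms = sorted(set(symbols))
    mapping = dict(zip(syms, random.sample(alphabet, len(syms))))
    best = score("".join(mapping[s] for s in symbols))
    for _ in range(iters):
        a, b = random.sample(syms, 2)     # propose swapping two letters
        mapping[a], mapping[b] = mapping[b], mapping[a]
        candidate = score("".join(mapping[s] for s in symbols))
        if candidate >= best:
            best = candidate              # keep the improvement
        else:
            mapping[a], mapping[b] = mapping[b], mapping[a]  # revert
    return mapping
```

Running `solve` on a long enough symbol sequence, on the order of the 500 words Heather mentions, is what lets the language statistics pin down the mapping.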


Kerry: That's fascinating and frightening. I still think life should have more dancing, but I'm now rethinking whether we should all move in these kinds of ways.


Eleanor: As long as you don't use your dance moves to generate keyboard keystrokes, then I think for now it's fine. But work or secret messages, these kinds of secret inputs that we do a lot of, that's where the risk is, I think.


Kerry: Eleanor knows my very distinct typing patterns. Eleanor and my husband share a gripe: I'm the world's loudest typer, and he's always like, why do you type like that? So I feel like anyone reading my movements would be like, is she okay? But honestly, this has just been such an illuminating and fascinating conversation.


And like many of the conversations we have, I immediately want to go and now learn even more. And it's just opened up so many boxes onto the different kinds of projects you're working on. So thank you so much for joining us for this conversation. It's really been delightful. Thank you.


Heather: Thank you for having me.
