Misinformation and Synthetic Media

We are witnessing a “firehose of falsehood” on the internet and in the media: mis- and disinformation that attempts to overwhelm the truth. As deepfake and synthetic media technologies become more sophisticated and their use more prolific, it is becoming more and more vital to develop tools and systems to identify when media has been manipulated – often through fake or deceptive audio and video – to increase trust in our information ecosystem.

Transcript

Sam Gregory: When we did this threat surveying globally, what we heard pretty consistently across the world were these threats. First, the gender-based violence, right? Because people have experienced something similar. Second, that it would be used to attack legitimate claims of video and media coming from those without power. Also, with the way that power attacks people who challenge power. Don’t view this in abstraction; view it as the way in which authoritarian societies and democratic societies that are failing will use this to dismiss media and to use it as part of that structure of suppressing civil society.

Natalie Monsanto: Welcome to the fourth installment of our special series on global digital rights challenges. The series was produced in partnership with UCLA Law’s Institute for Technology, Law and Policy and features striking conversations on the way the relationship between technology and human rights is playing out around the world. As deep fakes and synthetic media technology have become more sophisticated and their use more prolific, it’s becoming more and more vital to develop tools and systems to identify when media has been manipulated. This episode, you’ll hear from Sam Gregory of WITNESS and three Promise Institute fellows as they discuss synthetic media and misinformation. Stay tuned after the episode to hear more about how to stay in the know. For now though, as we start off, you’ll hear ITLP Executive Director, Michael Karanicolas, introducing the speakers.

Michael Karanicolas: Today’s session is on misinformation and synthetic media. I am delighted to introduce our speaker today; we have an incredible guest to talk about this.

Sam Gregory is Program Director of WITNESS. Sam’s current work focuses on the threats and opportunities as emerging technologies such as AI intersect with disinformation, media manipulation and rising authoritarianism. An expert on new forms of misinformation and disinformation, as well as innovations in preserving trust and authenticity, he leads WITNESS’s Prepare Don’t Panic work on emerging threats such as deep fakes and on new opportunities such as live streams and co-present storytelling for action and curation of civilian witnessing. He’s on the Technology Advisory Board of the International Criminal Court and co-chairs the Partnership on AI’s Expert Group on AI and the Media. From 2010 to 2018, he taught the first graduate level course at Harvard on harnessing the power of new visual and participatory technologies for human rights change. He’s spoken at Davos, TEDx, WIRED, and the White House as well as many other festivals and universities. Sam has an MA in Public Policy from the Harvard Kennedy School, attending as a Kennedy Memorial Scholar, and a BA from the University of Oxford.

Today, Sam will be in conversation with three current Promise Institute fellows. Ani Setian and Nick Levsen are rising 3Ls at UCLA School of Law. Aya Dardari graduated with an LLM from UCLA Law in May 2021. All three are spending the summer researching the regulation of misinformation and disinformation on online platforms, and I’m very much looking forward to what I think will be a fascinating discussion. I’ll turn it over to Nick to kick things off.

Nick Levsen: Welcome everyone and thanks again for joining us. Sam, your organization, WITNESS, describes itself as a leader of a global movement that uses video to create human rights change. To start us off, would you like to give the audience a brief introduction to WITNESS and its activities?

Sam Gregory: Thanks, Nick, and thank you for the introduction and the chance to be here. So, WITNESS is a human rights network. We are a team distributed around the globe that works to help people use video and technology for human rights. We work very closely with communities who are trying to use that intersection of video, mobile phones, the mobile internet, and social media for human rights change, and that’s on a range of issues from war crimes to police and state violence to land rights. My colleagues in Brazil, Sub-Saharan Africa, Latin America, South and Southeast Asia, the Middle East and North Africa, and in the US work very closely with communities of lawyers, of eyewitnesses, of journalists, of human rights defenders who are trying to use this intersection of tools for everything from evidence of a war crime through to community organizing to advocacy for changes in policy.

That’s the root of our work, and then on top of that we think about how you share good practices from one community to another. If you’re working to create databases of police violence, what are the best practices for how to use them that can be shared across communities in the US, or between communities in the US and Brazil and Myanmar? How do you do that in ways that respect local knowledge but share between communities in solidarity?

The layer that I work on at WITNESS, we often describe as technology threats and opportunities. We’re very much focused on the scaled systemic challenges that impact every single one of the people in the communities we work with: the way Facebook is structured, the opportunities that deep fakes provide to attack activists or create misinformation, the problem that footage that is critical war crimes evidence gets taken down off YouTube. So, we work to try and translate the types of challenges that we’ve seen right now at a grassroots level, or that are emerging because of technology trends, into systemic early action or harm reduction in the present day around those technology problems. That’s exactly where our work on deep fakes and synthetic media fits, which is that for a decade, we’ve seen scaled visual misinformation around the globe. As we now look at, potentially, the next wave of that, which could include these so-called deep fakes, how do we make sure, from this Prepare Don’t Panic approach, that the human rights voice and the voices of communities who will be most impacted in that crisis are there from the beginning?

Nick Levsen: Great, thanks. For my next question, you’ve actually already mentioned deep fakes, misinformation, and synthetic media. Many academic and technical fields address issues with digital information and authenticity, which has resulted in the adoption of some of these common terms, but they have widely varying definitions. Would you mind giving us your definition of the following terms: synthetic media, misinformation, shallow fake, and deep fake, and maybe an idea of how we should understand the relationship among the concepts?

Sam Gregory: Sure, so let me start with misinformation and disinformation because this is the field that we’re stepping into often when we talk about the others. Misinformation is information that is shared that is deceptive but may not be shared with an intention to deceive you. So, when someone shares a piece of anti-vaccine information with you because they’re a relative and they’re trying to convince you that you might be harmed, they’re sharing misinformation but they may not be doing it with an intent to deceive you.

Disinformation is the idea that people deliberately share deceptive information in order to deceive you. People also talk about malinformation, which is when someone shares private information in order to harm you. This is the field we often place deep fakes, shallow fakes, and synthetic media in. I’m going to push back on that a little bit as we talk as well though, so let me come back to that.

Shallow fakes is the term we use at WITNESS to talk about really the vast majority of ways in which people create deceptive audio-visual misinformation and disinformation, and it’s something we’ve seen for a decade at scale. Colleagues working in Syria in the early days of the civil war saw this, we see this in Myanmar, we see this in the US around elections, we see this in every country. Shallow fakes are predominantly people miscontextualizing a piece of media, which could be a photo or a video, saying it came from one context when it actually came from another. You take a picture of a burning building from one place and say it was set alight by ethnic group A, when in fact it’s from place B and caught light in a fire, but you caption it one way. Shallow fakes also include the types of videos that we’re familiar with, and people in the US and Argentina will remember how this was done with female politicians, where they slowed down Nancy Pelosi and also a prominent Argentinian politician to make it look like they were physically incapacitated. Simple edits of speed or maybe just a cut, and we’ve called those shallow fakes at WITNESS partly because they’re easy to do. Almost any of us could do it: we find an image on Google Image search, change the caption and upload it.

Then, in contrast, there is what we might think of as deep fakes. Starting in about 2017 the term came out, and it was coined to describe fakes that were made with an artificial intelligence process called deep learning. Deep learning is a form of machine learning, a form of artificial intelligence, and has been enabling a better ability to create realistic representations of people saying or doing something they never did, or manipulations of scenes that look more realistic. Someone just created this portmanteau of deep learning and fakes: deep fakes. The problem with the term is that, for most people, deep fakes bring to mind first of all these face swaps that first came with Nicolas Cage, and then people realized they’re actually mainly being used to target women with non-consensual sexual images. Often, people think of deep fakes with negative connotations, as being inherently negative.

Synthetic media is a much broader term that is often used to describe the range of ways you can use artificial intelligence techniques to create more realistic representations of a reality that never happened. This includes a whole range of ways we can both create synthetic imagery and manipulate elements within a real image. When I’m trying to talk to people about synthetic media, I’ll talk about the range between two things they might recognize right now. One is the Deep Nostalgia app. It’s a pre-trained algorithm which is based on some research from a couple of years ago. You drop in a single photo and it can create this simulation of movement in the photo, so you could drop in the face of your great-grandmother and make her smile; probably most people tried that out. The opposite extreme is the Tom Cruise TikTok deep fake, an incredibly realistic deep fake of Tom Cruise appearing to play golf and do a magic trick, which was made using a face swap with a really talented impersonator and a lot of post-production.

Those are kind of the two extremes of deep fakes or synthetic media, and in between you have all these tools to do other types of manipulations, for example, to make someone’s lips match up to a soundtrack, what’s known as a lip-sync dub or sort of puppetry. Then, you also have ways in which you can more easily remove objects in a scene, for example, pull out a moving object in the backdrop of a video. Then, of course, the way that a lot of people have encountered this is also the ability to create realistic photos of people who never existed: there’s a website you can go to, This Person Does Not Exist, and it generates a realistic photo of a person who never existed. Synthetic media covers all of those things, not just the sort of face swap and not just the negative usage that we often associate with deep fakes.

Nick Levsen: Great, so you addressed a little bit about the tools that are used to create deep fakes and obviously differential resource investment at the high end and the low end. Can you give us an idea of the prevalence along that spectrum that you encounter in your work based on more sophisticated versus less sophisticated deep fakes?

Sam Gregory: Yeah, so the vast majority of deep fakes in the world are non-sophisticated deep fakes made with apps. So, if you’ve ever used ReFace or Impressions or any of those apps that allow you to swap your face with a celebrity’s, or do a little animated GIF where you sing along with something, we’ve all done those. The vast majority of deep fakes are created in those tools, and those are simple app-based tools that have been trained for particular scenarios, and then essentially you’re just adding one more input.

Now, that’s one of the big things as we look at the approach at WITNESS of Prepare Don’t Panic: creating deep fakes is getting easier, both at an app level and at a technical level. You need less training data, you need to train it less on specific targeted objects, for example, as you’re transferring faces, and some areas like audio are improving really rapidly. So, that sort of end, which at the moment is mainly jokey, mainly fun, requires less technical skill and less investment, and then of course as you move further up the scale you are looking at more technical skill. So, I gave that example of the Tom Cruise deep fake. You know, that took weeks to months of work, probably, to make that well. It was made by a top deep fake creator, Chris Umé, and it required post-production after it, so it’s still relatively complicated and expensive to make those kinds of very effective complete face swap deep fakes to fool someone.

Now, the middle ground, in between the app side and those incredibly well-done, often comedic or parodic or political campaign deep fakes, is where we have deeply problematic deep fake creation, which is the scaled creation of non-consensual sexual images of women: both women who have their face transferred into sexual images, and also actresses and performers in the sex industry who have their bodies appropriated for this and are finding this is happening to them. There are statistics from a few years ago that showed that 96% of publicly visible deep fakes were non-consensual sexual images. That figure probably has changed because there are many more app-based, you know, fun deep fakes out there, but that’s still the middle ground, and people are using open-source tools they can find, or they’re paying someone small amounts of money to do it, which you can do relatively easily if you look online. At WITNESS, we don’t see that many deep fakes, and I’m really glad about this. I’m one of these people who is like, “I don’t want us to be in a world where we’re surrounded by malicious deep fakes, I don’t want to wish that world on us prematurely”.

Instead, we need to start seeing what’s actually happening and understand where the trends are pointing. Right now, the current problems tend to be these non-consensual sexual images; many people have had a community member targeted by one, or had a similar image made with Photoshop or other forms of manipulation, so that we definitely see. There are not many sophisticated, complicated deep fakes used in politics, but we’re starting to see claims of something called the “liar’s dividend”, which is where people claim that a true image is fake and cast doubt on it. As we move forward, I’d love to talk more about what people name as the threats that they want solutions to as this gets more prevalent.

Nick Levsen: Great, so you’ve mentioned political messaging and the relatively low prevalence of deep fakes in political messaging, but could you maybe just give us a little bit of a discussion of what types of deep fakes you do see in political messaging, with some examples?

Sam Gregory: There are emerging deep fake approaches in political messaging. One is very strong, satirical deep fake messaging. I would say the most notable is Bruno Sartori, who is a Brazilian deep fake creator, who creates these incredibly well-done deep fakes of Bolsonaro, former president Lula, and other political figures in Brazil. He often places them in soap opera contexts to parody their response to COVID and the rivalries between politicians. They’re very popular and explicitly not trying to fool you; they’re parodic, but tapping into popular culture. In the US, we’ve seen examples of them being used in politics, not very successfully. Often, they’re much more about trying to create fear about deep fakes than any political point. There’s a whole sub-genre of people creating videos about deep fakes to get people scared of deep fakes, which is kind of ludicrous and rather meta.

An area we’ve been working on a lot is that satire is incredibly important as a deep fake format. We have a report coming out with MIT in a couple of months. But satire is also the format that lends itself best to the problems we have already with gaslighting, which is a problem in mis- and disinformation, and attacks on public figures where you put something out there and you say, “Haha, this is a joke”, but it’s not, it was always intended to be malicious, and you tell people they didn’t get the joke but of course, it was an attack or it was deceptive harmful information.

Unfortunately, deep fakes play into that, and I think the scenario we’re gonna have to really grapple with is the dividing lines between satire and malicious gaslighting that are enabled by this ability to create very realistic representations of reality. There is also a very interesting political format of deep fakes that is very new and emergent and very particular in form, these so-called resurrection deep fakes. That’s when someone, with the consent of the family, resurrects someone, often to make a political point around what happened to them. So, we saw that in Mexico, where a murdered journalist was resurrected by a civil society group with his family to point out that the government still hadn’t found his killer. In the US, we’ve had that with a young man who was killed at the Parkland shootings. In Australia, a policeman who took his own life came back to talk about the stresses that led him to take his own life.

Nick Levsen: Well, that’s it for me. I’m going to pass you over to Ani’s capable hands now.

Ani Setian: So, we’re going to move into the solutions a little bit but before we do that, I just wanted to ask which groups are most adversely affected by the expansion of synthetic media and, more generally, who are the key actors that should be involved in creating the solutions?

Sam Gregory: Thank you. So, there are two elements we need to look at when we’re looking at this. People want to solve the deep fakes problem but, as with many tech accountability problems, they don’t listen to the people who are most likely to be impacted by it and who have the least voice in it. That happens, obviously, both within communities, to people who are marginalized and excluded from power, and it also happens on a global scale. Even if you look at the rhetoric and the media around it, it’s always about someone’s going to deep fake Donald Trump, someone’s going to deep fake Biden, someone’s going to deep fake Boris Johnson.

It’s such a minimal slice of the world’s population and of the ways in which these manipulative tools could be used maliciously. So, our starting point has been Prepare Don’t Panic, but center a very human rights global perspective in doing that and our approach has been, in WITNESS’s work, to make sure we’re addressing current problems but also really talk to people about where they see deep fakes moving in terms of threats and solutions. When I talk about threats and solutions and the actors involved, I want to name where that comes from for us. So for us, it came from bringing together communities and activists and people who are recommended to us in each of the regions where we work, so in Brazil and in Sub-Saharan Africa, in the US, in Southeast Asia, we brought together groups of people with lived and expert experience of similar problems and who would likely be impacted by this.

We didn’t ask them to be deep fakes experts. The moment you go to people and say, “Do you know about deep fakes?”, there’s a fairly small slice of people who, frankly, are worrying seriously about this compared to every other problem we have in the world. What we instead said is, look, we’re going to demystify deep fakes, we’re going to demystify the hype, we’re going to really talk through what people are proposing as solutions, and we want you to help us understand what you want to see prioritized as threats and, more importantly, what you want to see prioritized as solutions.

That’s for us really important, that we constantly go back to that process of consultation to drive what we do. In terms of who is adversely impacted, right now it is women, by non-consensual sexual images. That is the scale problem we have now with deep fakes, and we need to make sure that whenever we talk about deep fakes, we don’t let our eyes go off the ball on misinformation completely. That’s why, Nick, with your question earlier, I pushed back on just defining those terms. I think then we put ourselves in a bubble that says, “this is a misinformation problem” versus “this is a gender-based violence problem”. This is a problem of driving women out of the public sphere, this is a problem around journalism and activism with very strong gendered dimensions.

When we did this threat surveying globally, what we heard pretty consistently across the world were these four or five threats. First, the gender-based violence, because people have experienced something similar. Second, that it would be used to attack legitimate claims of video and media coming from those without power, so, citizen journalism, people sharing evidence on social media. Third, that it would intersect also with the way that power attacks people who challenge power. I remember a very prominent participant in our South African meeting, who was from a very big social movement in South Africa, saying, “I want to sketch the linkages here between surveillance, criminalization of civil society groups, and deep fakes, because it fits in that paradigm of trying to suppress people in power, so don’t view this in abstraction, view it as the way in which authoritarian societies and democratic societies that are failing will use this to dismiss media and to use it as part of that structure of suppressing civil society”.

People then linked that also to the threat that this would overwhelm already stressed journalism, fact-checking, and human rights groups, because they would be pressured to prove what is true and to keep fighting back on this, and their resources are already stretched. Then, we also had people really sketching out the linkages to existing types of problems that happen already, which is that a lot of shallow fakes get shared in closed messaging groups, and they often come very context-free in those settings, which makes it a lot harder to work out what they are. It’s like an audio room on WhatsApp or a video or a photo in the US or in India, and it’s like, how do we deal with this existing dynamic of how this sort of digital wildfire of images and video can be shared?

When we came into the COVID pandemic, we also had people worrying about the space of Zoom or the online conference. We have this very two-dimensional sort of encounter with people: can this be compromised, how do we think about that? So, although people recognize some of these big fears, like the Russians-and-disinformation type of fears, that wasn’t generally where people landed when we asked them what they thought of as the threats.

So, actors in the solutions: who gets to prioritize the threats and the solutions, and who implements them? This is tough because the centrifugal force in this is to pull it towards Silicon Valley, Brussels, DC, and away from the people who will be most directly impacted within countries and globally. I think it means centering people who have the lived and expert experience from directly confronting similar problems, as opposed to viewing this as a completely distinct, abstract problem, ruptured away from what people have been experiencing for decades. And make sure it’s an ongoing process. We have to keep doing this, since with something like deep fakes the threat evolves, the solutions will come out of technology, and then the question becomes one of access and equity and all those.

Ani Setian: Thank you for that. Then, can you describe some of the solutions to the expansion of synthetic media and, in relation to that question, give us examples of how they’ll work in practice, and what burdens should be placed on social media companies in particular? Also, how can those solutions be made accessible to people in the global south?

Sam Gregory: So, there’s a range of solution areas to this, and we’ve mapped them out and you can get 15 different areas, but I’m going to focus maybe on three or four: detecting deep fakes; authenticating where media comes from, so you can show whether something has been faked or maliciously manipulated or not; practical, literacy-based approaches; and then the role of institutional actors, I would say actually policy, legal, and platforms, because they kind of set rules whether we like it or not.

Detection solutions essentially are ideas that will try and detect deep fakes. The way deep fakes are generated is often within something called a generative adversarial network, which is basically setting two AI systems against each other: one to generate fakes and the other to try and detect them. When the detector sees through a fake, the generator then tries to generate another fake, until it gets better and better. So, it’s this kind of adversarial process, which means that detection is often adversarial too: people making deep fakes will use the detection methods you build to improve their deep fakes. So, detection people are looking for ways to come up with mechanisms that will, from a technical perspective, work across different types of deep fakes, be appropriate to a range of settings, and avoid, for example, bias in detection between different skin tones or other characteristics that might carry inherent bias from their training dataset.

There are a couple of problems with detection. One is that the detection community is really still grappling with how to create tools that work with the type of reliability we need, and how to combine them so that you have a range of detectors giving you a set of signals. This is an interesting technical problem and there’s been lots of investment put into it. The place we’ve been coming from at WITNESS, and we’ve been leading a series of workshops this year, has actually been trying to understand the question of who gets access to these detection tools. How do we actually make sure that the people who need them have access to them with equity, primarily in the global south, but also vulnerable communities, community media, and human rights defenders in the US, Europe, and the global north, in Japan?

You have a really tough dynamic: from the technical side, it seems like don’t give people access; from the practical side, the people who are most vulnerable need access. There was a recent case in Myanmar that I covered in an op-ed. There was a video that came out in the early days of the coup that showed a prominent politician, a very senior member of the National League for Democracy, making a statement and implicating Aung San Suu Kyi, the former de facto prime minister of Myanmar, in corruption. He’s speaking to the camera, he’s clearly stressed, he’s clearly reading a statement, his voice sounds odd. The quality of the initial video released was very bad and immediately people in Myanmar, not unreasonably, said this is a deep fake, he’s been forced to say this, he’s been digitally manipulated.

This is part of the kind of general deep fake panic we have in our world, where everyone assumes there are more deep fakes than there are, and to corroborate that they dropped it into an online detector. So, if you look online you’ll find these detectors that say, “we’ll show you if it’s a deep fake or not”. That video came back as 95% likely to be a deep fake. I’ve worked with activists in Myanmar for 20 years, and I saw very early on people tweeting this image of this politician’s head surrounded by a red deep fake detection box. Looking at the video, it seemed unlikely it was a deep fake. It’s just a very poor quality video, he’s clearly stressed, and it’s much more likely it’s a false confession or statement. We took it to a number of deep fake experts; I worked with Henry Ajder, another deep fake researcher, and generally our conclusion was that it wasn’t a deep fake, but by the time we reached that conclusion the story had taken hold in Myanmar that this was a deep fake, and it’s still believed right now.

The reasons for this are all the reasons why it’s complicated to get access to detection. People didn’t have the skills in Myanmar to do this because we haven’t resourced people to have skills in media forensics. The tools were unreliable, and the easiest tools were the worst. There was no easy way for people to work out who they could send it to to get it checked, and it plays into this deep fake panic. We have a real problem here, and you see it sort of captured in that story: all the ways these elements combine to play into people’s fears around media manipulation, and of course the ability, then, of governments and people in power to say, “well, we can’t work out if anything’s true”. So there’s a real problem in detection, which requires us to really invest in the skills and invest in the access to make that work. Otherwise, we create tools that won’t work for the vast majority of people who need them.

This applies elsewhere, we could make that argument to the US as well, I’m not trying to make this into a global south, global north question. The second area that is really important to grapple with is what’s known as authenticity tools, or authenticity and provenance infrastructure. It’s a really interesting area because about 10 years ago groups in the human rights sector first started working on this idea, and it came often out of dealing with war crimes evidence. People who filmed, say, a video of a crime taking place often wanted to make sure that it was more robust as evidence: they wanted the metadata to be strong, showing the location and corroborating devices near them, and they wanted to put what’s known as a hash on it so you could check whether the media had been tampered with.
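To make the hashing idea concrete, here is a minimal Python sketch of how a hash can show whether a file has changed since it was recorded. It is an illustration only, not the method of any tool mentioned in this conversation: the function names and the sample bytes are hypothetical, and SHA-256 is simply one common choice of hash. Note that a hash on its own shows whether the bytes changed, not when or how.

```python
import hashlib


def fingerprint(media_bytes: bytes) -> str:
    """Return the SHA-256 hash of the media as a hex string."""
    return hashlib.sha256(media_bytes).hexdigest()


def is_unmodified(media_bytes: bytes, recorded_hash: str) -> bool:
    """True if the media still matches the hash recorded at capture time."""
    return fingerprint(media_bytes) == recorded_hash


# Hypothetical stand-in for the bytes of a video file.
original = b"example video bytes"
recorded = fingerprint(original)  # stored somewhere safe when the video is shot

# Changing even one byte produces a different hash.
tampered = original + b"!"

print(is_unmodified(original, recorded))  # the untouched file matches
print(is_unmodified(tampered, recorded))  # the altered file does not
```

In practice the recorded hash has to be kept somewhere the person altering the video cannot also change, otherwise the check proves nothing.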

What we saw was groups like WITNESS and the Guardian Project and the IBA building tools that basically allowed you to take photos and videos and show they haven’t been tampered with. About two or three years ago, in the kind of misinformation or fake news moment, we started to see much more interest from mainstream technologists in this area, and it was one of the areas where, fortunately, groups like WITNESS had invested early, both building technologies and also, working with Mozilla fellow Gabi Ivens, producing a report which is all about the problematics of trying to create tools that claim to show you where a piece of media comes from and if it’s been manipulated. So, it’s been exciting the last couple of years, there’s a lot of investment in this area. We’ve seen Adobe and Twitter start something called the Content Authenticity Initiative. We’ve seen Microsoft and Intel and Arm and Truepic and the BBC working on something called the Coalition for Content Provenance and Authenticity.

What they’re trying to do is come up with a set of, sort of, technology infrastructure to enable people to opt in to add data to their videos and photos that enables them to say, if they want to, that they shot it or, if they want to, the location, and then indicate over time how it got edited. Of course, editing is not malicious, we all edit our photos and videos. There’s nothing wrong with editing and there’s nothing wrong with doing a synthetic media thing. I want to make myself smile in a photo or video and it’s for fun, that’s what we do on Snapchat or with Photoshop. So this whole area has been emerging as another solution which is, instead of trying to detect the manipulation, help the good actors show their work.

I think that they’re complementary, like we’re going to need both, and the key thing, coming back to your question about the Global South and frankly the world, is how do we build an infrastructure that is a number of things from a human rights perspective, from the perspective of people living in a society without rule of law, with disproportionate surveillance, where they don’t trust the platforms? How do we make sure these systems are not built on assumptions that you need to share your identity, that you can’t redact information to protect yourself, and are not built on assumptions about technology that are not relevant to places that don’t have heavy levels of mobile broadband or that have older devices?

So there’s technology questions, there’s privacy questions, and then also how do we really be careful about not making this a tool that gives you a too-easy check mark to say you should trust this, versus saying here’s a set of signals and information about this media, this is what’s happened to it, you need to make a decision about how this helps you trust it. So, we’re not telling people this is a signal of truth. We’re giving people signals to help them understand media with this type of tool. Then of course we want to be really careful that these tools aren’t abusable by governments who are passing all these fake news laws globally that are doing things like saying you need journalistic identity, we need to be able to know who created media. Otherwise, we create a double-edged sword. We create the tools for people to be able to show their work, to indicate that they created something, but also create the risk that the infrastructure that enables those tools can be used to track civilians and to enforce rules on freedom of expression.

So, those are two big technical solution areas. I think they plug into media literacy, which is one of these buzzwords, they say we need more media literacy. There was a fabulous talk by Danah Boyd where she talks about how media literacy can mean everything from this instinct to burrow down and find conspiracy versus actually trying to rise above conspiracy and think practically about, say, whether a piece of media you encounter makes sense. When it comes to deep fakes, we don’t do a lot of training for people to spot deep fakes. We actually tend to focus on how tools like detection or authenticity can fit into much simpler frameworks that are relevant to deep fakes and shallow fakes. So, what we use at WITNESS is the SIFT framework that was developed by the media literacy practitioner and academic, Mike Caulfield, that applies to almost any type of media, which is stop, investigate the source, find alternative coverage, trace the origin.

This is a very simple way of basically sort of pausing, making sure that you’re looking to see if other people have covered the same piece of media, and then looking back to see if this is the original, which often applies to shallow fakes because these are typically just someone recycling a video or a photo. My colleagues in the Africa team at WITNESS are currently starting a pilot that’s really aimed at how to translate these types of skills to a community level. We describe them as misinfo and disinfo medics. Rather than focusing at kind of the professional fact-checker level or trying to go to every individual person, how do we really invest in human rights leaders and others to have the skillsets, to be the people who get access to deep fake detection tools or other ones if they become relevant down the line?

I think media literacy is important, but we need to get away from the kind of idea that you just need to look really closely at the pixels and you’re going to be able to spot it. As an individual, it’s just hopeless. We should never be telling people to do that, it just doesn’t work. Then platforms and policy: deep fakes are an interesting area because they’re not that prevalent on our social media platforms yet, but about 18 months ago, a lot of the platforms created policies. So, very actively, we’ve tried to influence them. I think a lot of deep fake policies at the moment are focused on its rarity. Facebook has a policy that says that if it’s a type of manipulation that would be indiscernible to a human, they take extra care on it.

That reflects something that we’ve emphasized which is actually that, until people really know how often they should expect deep fakes, until we have better detection tools, there’s a stronger bias to tell people when there’s no reason they should know that it could have been faked. Most platforms have aimed at, I think probably the right intersection, policy-wise, of the combination of deceptive, malicious, and harmful. If you look at how Twitter handles it, that’s when they take the most action. Now, the problem is, of course, the same problem as with any of these platforms and how they work globally.

So, there’s the deep fakes problem, which is that it’s hard to detect deep fakes. The second is the problem of how well they’re going to enforce content moderation rules globally. You can have a great platform policy, but if they don’t know how to implement it or resource it, it doesn’t work. It’s early days on the legal side. I think we’re starting to see, at a state level in the US, laws that tend to focus on three areas. Some are trying to bring these non-consensual synthesized images under existing laws against revenge porn and other things. There’s others that tend to focus on election periods, and then there’s a few that are starting to deal with some of the new issues. In a funny way, the way that most people thought about deep fakes in the last six months was in an Anthony Bourdain documentary, where the filmmaker made Anthony Bourdain say some lines in a synthetic audio voice. That’s an area that, in fact, legally, people have been grappling with. New York State has been looking at how you have control over your digital likeness after death. What are your rights?

So we’re starting to see these legal discussions around likeness rights. I think, actually, there’s a far broader discussion we need to have, such as working with others around codes of ethics and norms. I don’t say that because I want to have lightweight things, I say that because I think there’s a lot of people who probably are never going to be covered at this stage by laws. If we release synthetic media, how do we get consent from the people who are manipulated, and how do we disclose in an appropriate way to the audience that something has been manipulated? I actually don’t think we need a lot of laws right now. I think we’re very early and there’s always a danger that if you don’t know what the problem looks like, you legislate for the problem that you imagined. I think there’s all kinds of historical precedents for why we shouldn’t do that. So, the focus at the moment is right: what are the norms we expect of content producers, of people who build apps, and of the people who create media policies about what they’re sharing?

Ani Setian: Thank you for that, I’ll pass it over to Aya now.

Aya Dardari: So moving on, I’d now like to discuss with you the role that international human rights law might play in this conversation. Could you please start us off by explaining what the relevant international human rights law standards are that apply to misinformation and synthetic media?

Sam Gregory: Yeah, I’m going to draw on the work that the human rights and digital rights community has been doing over the past five or six years. WITNESS has been very actively involved in that, alongside many others, trying to think about human rights and how that applies particularly to the types of mediums and media that we work with, like video. It includes work like the Santa Clara Principles, talking about numbers, notice, and appeal around content moderation, and it includes the work of former Special Rapporteur David Kaye, really pushing this work at the international UN level.

This is an area where it’s been exciting to see this sort of push to have human rights. WITNESS more broadly really focuses on content moderation because, of course, if you imagine those communities we work with they are constantly facing these challenges of individual videos being taken down or videos kept up that create hate speech or harm or massive amounts of evidence being taken down, as we’ve seen from YouTube in contexts like Syria and others. When it comes to synthetic media, we start with the sort of freedom of expression principles and the way that human rights jurisprudence has developed around this.

If you create a policy around this, be it as a government or as a business that’s following the UN Guiding Principles and trying to respect human rights and human rights law that’s developed over time, we have this three-pronged principle about when it’s okay to restrict speech. You have a prong that’s legality: have you explained clearly what you’re doing? So if, for example, we’re going to decide we want policies, either in law and legislation or in platforms, we should demand a lot of clarity about what they’re actually looking for. Are they looking for deep fakes, or are they looking for malicious media that is manipulated by AI, or are they looking for malicious media that’s manipulated in any way we care about? We should be clear what we’re trying to define.

Are we doing it for a legitimate reason? That’s the legitimacy prong in freedom of expression jurisprudence, which is like, are you doing this to protect public health (the rationale for a lot of COVID work), for national security, or for a few other things that are stated there. This comes up actually around satire and parody. I remember we had a tremendous intervention in a web series that WITNESS ran last year on deep fakery and satire from Professor Evelyn Aswad at the University of Oklahoma, really talking about these parameters. One of the things she’s mentioned in conversation is that national security doesn’t cover parodying a leader, which is what people do. Of course, in many authoritarian contexts parodying your leader is considered a national security issue, but it is not one under international human rights law. So, that very prominent use of deep fakes is absolutely acceptable in a human rights context, of course.

The third is necessity and proportionality. What are the alternatives we can have here that minimize reducing speech? Do we have to censor it? Can we reduce its visibility? Can we provide more context on it? For example, when we reviewed the Twitter principles around manipulated media at one of the meetings we held in Sub-Saharan Africa, I remember one observation people had there was, “please don’t remove all this media. We want to be able to have it there so we can respond to it.” Now, people aren’t saying that about media that is directly inciting hate, and I think we’ve got an intersection here between freedom of expression and knowing the harms that can be created by synthetic media. So, that’s a navigation that it’s really important that platforms be listening to, but as a top line for policies around manipulated media, starting with those freedom of expression principles is a good place to start.

Aya Dardari: Great, thanks Sam. So, as you mentioned, the spread of misinformation and synthetic media can raise some pressing issues for content moderation which I want to dig into a bit further. So, over the past few years, social media companies have slowly begun to adopt content moderation policies that aim to regulate misinformation specifically. For instance, in its community guidelines, Facebook has a rule telling its users not to post misinformation and unverifiable rumors that contribute to the risk of imminent violence or physical harm, and Twitter has a policy prohibiting its users from spreading false or misleading information about COVID-19.

So, given these developments, I wanted to ask you a few questions. Firstly, how should the international human rights law standards that you just spoke about apply to content moderation policies and practice? Secondly, should the international human rights law framework inform the way social media companies design their AI detection tools, and if so, in what way? Finally, I was wondering if the international human rights law framework might help social media companies in demarcating a line between the non-nefarious uses of synthetic media, which we presumably want to protect fiercely under the right to freedom of expression, and the nefarious uses of synthetic media, which we probably would want social media companies to have a bit more discretion in restricting.

Sam Gregory: There’s a lot packed in there. That first part is a big part of why we want to think about this. Synthetic media is emerging into all kinds of environments. It’s in news production, to have your synthetic weather avatar, it’s in commercial training videos, it’s in the videos you create on TikTok and Snapchat and Instagram. Synthetic media is there and 99% of that is non-malicious and certainly shouldn’t be taken down unless it’s breaching some other thing: is it inciting hate, is it creating deceptive harm?

I think in general, those policies you’ve described are actually pretty good policies. They kind of start to tread the line we’re talking about around legality, legitimacy, proportionality, necessity. We’ve also seen some smart moves to think about how the power of the speaker also has influence in deciding about bans, temporary or longer-term bans on speakers. The power of the speaker is incredibly important in terms of the harm it can create.

I think the problem is probably in two areas that we see most concretely in the WITNESS context. One is we’re barely scratching the surface of the types of things in the Santa Clara Principles, which is basically a recognition that we still don’t really know what’s going on. We don’t understand the algorithms. We don’t know how they’re making their decisions, it’s gone opaque. When it goes wrong they often don’t tell people. We hear this all the time from people we work with, saying, “the video came down, I have no idea why or who to talk to, and there’s no appeal.” This is a very opaque system that makes mistakes all the time. So, that side of it is completely not there yet.

The second part is what many of the leading advocates who are working in national contexts around the world have spoken about, which is the question of resourcing, cultural competency, and investment to be able to make these contextual decisions like the one that you’re describing about Twitter. How do you make that decision if you don’t have the staff, you haven’t resourced it adequately, you’re not listening to local civil society? We saw that recently in Nigeria, around some videos that my colleague highlighted to a platform, and it’s clear that there was not enough capacity to know whether these videos were harmful, whether they were satirical, whether they should stay up or come down. That’s just frankly a money and resourcing question and a respect question from a platform. So, you can have good policies, but if you don’t have those you create real harms globally, and we can name those situations in pretty much every country. So, I think those two are actually where I would point to. It’s less about the policies, the Twitter one is pretty respecting of human rights principles. It’s about the backend of what they do with it, it’s about the resourcing globally, and then of course it’s the underlying labor question of forcing people to watch endless videos in the content moderation industry.

Where it applies to detection tools is a super interesting question. There’s a couple of places. One is probably around non-discrimination: you want to make sure you don’t build tools that reflect one of the pervasive problems in AI, which is bias in the training data. That could cut both ways. You could argue that, much as in the case of facial recognition, many Black activists who pointed to how facial recognition is failing might also be part of the movement that says we don’t want to improve facial recognition, in fact we want to ban it. So for synthesis, I can imagine there are perspectives that say let’s not improve the training data to create synthetic images, because this is going to be abused, disproportionately used to attack the lives and livelihoods of people who are already threatened or vulnerable. There is a complication in improving the training data. One thing that was noticed about a recent deep fake detection training program was that it didn’t function so well on a variety of skin tones, there was a bias towards detecting individuals with lighter skin tones. Then there are issues around who benefits from these tools if they’re created. So, where’s the training data coming from and who benefits from it? It goes back to that question of deep fake detection equity. How are we making sure, as we build these tools, if the training data is endless images that are sourced from people who gave no consent or had no real role in understanding that, that there’s diversified input into and access to those tools as they come out?

The final thing I’d say is that, as we look at authenticity infrastructure, we want to be really careful we don’t place that into automated systems without caution. These are imperfect signals that tell you about media, and we have a whole slew of problems that have emerged around automated content moderation, around hash databases, and things like that, so there are issues you have to be very cognizant of in that space as well.

Aya Dardari: Great, thanks for your take on those questions. Now, aside from the implications raised by the spread of misinformation and synthetic media for the right to freedom of expression, as you mentioned there are also implications for the right to non-discrimination because, as you were just talking about, AI can be biased. So, if we rely on AI detection systems to detect synthetic media and those decisions end up being biased, that in and of itself is an issue. Do you think there’s a workaround solution to bias in AI? If not, do you think all content moderation decisions should be made by a human? If you do believe that, I just wanted to flag that humans can be biased as well, so what would make a human-made decision better or more justifiable, if at all, than a decision generated by AI?

Sam Gregory: Yeah, in some ways I have an easier answer to this question around deep fakes and synthetic media than other areas, because the deep fake detection tools are not adequate to make a decision. They give signals at the moment that need to be read by a human to be reliable in any kind of important contextual decision. Similarly, the authenticity tools aren’t designed to give a yes-no, they’re designed to give a set of signals, and I think that does lend itself to thinking about the intersection. I think it’s reasonable to recognize that automation provides signals and that AI provides signals. It’s where that human layer comes in and how it’s accountable, how it’s resourced, what context it has, and then, once it makes a decision, having at the backend this transparency, this appeals process, this due process way to do this. Of course, the whole thing is structured by international human rights principles. But in a strange way deep fakes is not yet at the stage where it really feeds into “this could be completely automated”, we just don’t have that capacity yet.

Aya Dardari: Great, thank you so much. We have a question here from Ziad Borat, who says, “I’ve been quite interested in something Francis Fukuyama and Andrew Grotto have written recently, that there exists among the tech social media giants the paradoxical outcome wherein the platforms’ ability to restrict speech on behalf of a foreign authoritarian government is actually protected constitutionally from US government regulation. How do we resolve this effectively given these platforms’ global growth ambitions?”

Sam Gregory: Goodness. I haven’t read that article, it sounds interesting, so I’m going to give more of a response just generally. I think there is a real tension right now around control of platforms and when platforms have to enforce the say-so of governments. I think we’re seeing it play out in Nigeria right now around Twitter. There is a dynamic that we, from a global perspective, point to, and I would say disinformation scholars who are coming from outside the US, like Jonathan Corpus Ong, have pointed to this as well: in the US discussion there’s often an over-focus on the role of platforms as a bad actor in this space, when in fact if you look at disinformation, if you look at structural actions, we should be paying attention to the actions of governments, like a range of Southeast Asian governments. What are they doing in terms of imposing fake news laws that are in fact about authoritarian control, about banning speech? I’m going to read into the question, and make sure I go to look at the article afterwards, a little bit of how we pay attention to the power of government. I think people who work on this, who are often grounded in the US, tend to focus very much on tech accountability, and we shouldn’t take our eyes off what Facebook and Twitter and YouTube and Google and the others should be doing. But in fact, as a dynamic of real power, government legislation and fake news laws globally are a more dynamic element, and potentially governments as agents of disinformation within those systems are also a more dynamic agent than we had anticipated.

Aya Dardari: Right, so related to the issue of misinformation, disinformation, synthetic media, and so forth, is the issue of confirmation bias, which describes the tendency to prefer information that confirms one’s preconceptions or prior held beliefs and values. So, I was wondering if there are any solutions currently in existence to overcoming confirmation bias, and if not, what you would envision to try and overcome confirmation bias?

Sam Gregory: Goodness, that’s a question that scholars have been working on for a long time. I think on that one, I’m actually going to point to a demonstration of this in the deep fakes world that came from a piece of research last year that was really trying to help people spot deep fakes by showing them a series of videos with impersonators. I’m going to oversimplify it, but one of the main conclusions of the paper was that people’s ability to detect basically aligned with whether it confirmed their existing biases. So our ability to detect is compromised by our confirmation bias, or our bias to anticipate whether something is faked because of its similarity to or difference from our confirmed political position. I think deep fakes will be deployed in many similar ways to how people deploy shallow fakes. Mike Caulfield, the practitioner I was talking about earlier, talks about the trope field as a way of understanding shallow fakes, which is basically that there are all these existing frames that people have for what they expect to happen in a scenario, and really it’s super easy then to plug in any video. You just go and find the video or photo on any given day that fits that sort of trope field, as he describes it, and you saw that in the US elections around people finding a video or a photo that appeared to show ballot manipulation. Simply, they’ve got an established trope field, as he describes it, and they’re dropping in, they just look for the video or photo that fits it. You don’t need to tell the narrative, people instantly recognize it as the narrative they have in their head. I think that is particularly powerful visually. I think there’s a real problematic with visual misinformation because it just clicks for us. Unfortunately, that happens with shallow fakes as well as deep fakes.

Aya Dardari: Great, thank you so much. That brings us to the end of our time together. Thank you so much, Sam, for joining us in this conversation.

Sam Gregory: Thank you all.

Natalie Monsanto: Thanks to our co-sponsors in this series, UCLA’s Institute for Technology, Law and Policy. They’re @UCLAtech on Twitter. Find Sam Gregory on Twitter as well at @SamGregory. To follow us on social, look for @promiseinstucla and please, if this episode was valuable to you, support our work. Visit law.ucla.edu/supportpromise to make a donation at any level and help future conversations like these come to be. We have a small housekeeping note: the fifth conversation in this series is titled “Life Interrupted: The Impact of Internet Shutdowns”, and in a way that is both oddly fitting and also unfortunate, the audio for that talk was throttled by, you guessed it, poor internet connectivity and is best enjoyed while watching the speakers in real time. It’s a fascinating exploration of the powers of ISPs, government’s role in securing internet access, and the vulnerabilities when daily life relies on connectivity. Head over to our Promise Institute YouTube channel to watch Tomiwa Ilori speak with Mark Verstraete and round out our Global Digital Rights Challenges series. Please subscribe to the podcast and the YouTube channel, if you’re so inclined, so you can be sure to catch the rest of this series as it’s released. Until next time, take care.