During the 2016 United States presidential election, The Washington Post launched Heliograf[116], its artificial-intelligence system for writing news. The program was connected to the state vote-counting centers so it could read results as soon as they became available and, from that data, generate its own texts and stories, adding information about newly elected representatives and those who had lost their bids for re-election.
This mechanism, easily replicable in the world of sports news, set off alarms for more than one journalist. Heliograf can produce hundreds of articles in a matter of seconds, while a flesh-and-blood journalist would take several minutes to write just one. Moreover, in election coverage, where the data updates minute by minute, a human-written piece can become obsolete within minutes of publication simply because newer information is available.
While some people were concerned, many others saw Heliograf's achievements in a positive light, since it would allow professional journalists to focus on investigative work and develop deeper stories, freeing themselves from producing these minor pieces. Now I wonder: what else do we often see in traditional media besides news? Public figures! Can we replace them with AI?
In 2020, South Korea's MBN cable network became the first news outlet in the country to feature an AI-based anchor[117]. The technology, developed by MBN together with Money Brain, perfectly imitated the network's longtime news presenter: not only did it look identical, it also reproduced her voice and her characteristic movements.
This is not the first time we have encountered such technology. The internet is full of deepfakes: falsified videos of real people, produced using deep learning neural networks. Many of these videos show scenes from well-known movies in which the lead actor's face is replaced with someone else's, creating a natural, hard-to-detect effect. There are even deepfakes of Obama and Trump. Is it right to do this? Is it wrong? We do it because we can, and because we want to demonstrate technological advances.
The 2016 and 2020 United States presidential elections showcased the massive production and dissemination of fake news. Can you imagine when, in a few years, in some country, someone fabricates a video of a president or presidential candidate and makes them say outrageous or obscene things? How do we counteract the negative effect of messages that can be precisely targeted at specific groups through social networks? Deleting the programs or algorithms that make these simulations possible is not a realistic option. As the saying goes, everyone is entitled to their own opinion, but not necessarily to being right.
In May 2022, I had the opportunity to attend the annual World Economic Forum in Davos, where I asked Kai-Fu Lee what companies could do in the future when they fall victim to fake-news attacks: deepfake videos impersonating a company's president announcing "we are going bankrupt." It is easy to imagine the effect on investors who automatically believe the video is authentic instead of checking the company's public information. Below you can see my conversation with Mr. Lee.
Facundo Cajén and Kai-Fu Lee at the World Economic Forum 2022[118]
Closer to home, the CEO of NVIDIA deceived everyone during a 2021 presentation by showing a computer-generated version of himself commenting on the company's advancements to investors[119]. The purpose of this staging was to demonstrate how far the company's technology had come. Two years later, in May 2023, a fake image that went viral on Twitter showed a non-existent fire at the Pentagon. The image's virality was enough for the market to briefly wipe out $500 billion in one fell swoop[120]. Has Pandora's box been opened? We shall see!
Without a doubt, recent advances in AI voice imitation will make everything more difficult. Microsoft has already announced VALL-E[121], a system that can mimic a speaker's voice, tone, and ambient sound from just a three-second audio sample. The days of robotic voices reading text aloud in our applications are over. But beyond that, how can I be sure my family will not fall for a scam in which someone calls them using a clone of my voice[122]? This has already happened, and it is only a matter of time before it becomes the new modus operandi of criminal gangs. How do we stop someone from mounting a political operation against a rival? How can I be sure that the audio a relative shared with me on WhatsApp is really the voice of a politician asking for a bribe, or telling me to withdraw all my savings from the bank? We may even have to doubt archive footage, as old videos of public figures are called in the jargon, which is sometimes used to contrast someone's past statements with their current ones. It will be very difficult to discern what is real from what is fake.

Kai-Fu Lee suggests that in the future, alongside antivirus programs, we will get used to having anti-deepfake systems. But in a world ravaged by post-truth, understood as the subordination of facts to interpretations driven by political ideology, this can become very complex. After all, we already choose which bubble to enclose ourselves in, whether through the Likes we give on social networks or through the media we freely choose to consume, which, like the algorithms that compete for our attention, shape and reinforce our worldview.
In this sense, one of my favorite songs says in one passage: "I do not accept, however, if they try to indoctrinate me. I want to choose which poison to poison myself with"[123]. As much as I like this lyric, I believe that at this point we are far from that freedom. Have algorithms taken away our freedom to discover new things by suggesting only what they already know we will like?
Returning to falsifications, the answer may lie in living in absolute transparency, but logically no one would want to live in an exaggerated version of The Truman Show, taking surveillance capitalism to the unprecedented extreme portrayed by Tom Hanks and Emma Watson in the film The Circle, in which the characters themselves broadcast their lives online 24 hours a day on social networks. If we take paranoia to its limit, we would even have to publicly certify every device that captures and transmits images and sound, something like the IMEI code that serves as a unique identifier for our cell phones, a kind of national identity document for these devices. However, this would bring us back to the case of Strava and the US military bases exposed on a public map by that app. Living a completely transparent life has its challenges, and one of them is the safety of people who would be showing their constant location to the general public, not just to companies like Google or Meta. Can you imagine knowing the location of a public figure 24 hours a day? This is a problem not only for private individuals and judges, but also for people involved in politics.
Currently, the Coalition for Content Provenance and Authenticity (C2PA)[124], humorously described by some as the World Ministry of Truth, has presented a standard for verifying the provenance of information through a unique signature, which notifies us if someone edits the original version and thus lets us verify the authenticity of the content we see online. While we have not yet seen major advances in this area, Microsoft, Adobe, Intel, and the BBC have committed to adopting the standard[125].
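To illustrate the core idea behind such provenance standards, here is a minimal sketch in Python. It is not the actual C2PA implementation: the key, the manifest fields, and the publisher name are all assumptions for illustration, and real C2PA manifests use X.509 certificate chains rather than a shared secret. What it does show is the mechanism the text describes: the content is bound to a signed claim, so any edit to the content or the claim is detectable.

```python
import hashlib
import hmac
import json

# Hypothetical signing key for this sketch; real C2PA relies on
# certificate-based signatures, not a shared secret like this one.
PUBLISHER_KEY = b"example-newsroom-secret"

def sign_content(content: bytes, metadata: dict) -> dict:
    """Build a minimal provenance manifest: a content hash plus a signed claim."""
    content_hash = hashlib.sha256(content).hexdigest()
    claim = json.dumps({"hash": content_hash, **metadata}, sort_keys=True)
    signature = hmac.new(PUBLISHER_KEY, claim.encode(), hashlib.sha256).hexdigest()
    return {"claim": claim, "signature": signature}

def verify_content(content: bytes, manifest: dict) -> bool:
    """Recompute the signature and hash; any tampering makes this return False."""
    expected = hmac.new(
        PUBLISHER_KEY, manifest["claim"].encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, manifest["signature"]):
        return False  # the claim itself was altered
    claim = json.loads(manifest["claim"])
    return claim["hash"] == hashlib.sha256(content).hexdigest()

original = b"Anchor footage, 2020-11-03, MBN studio"
manifest = sign_content(original, {"publisher": "Example News"})
print(verify_content(original, manifest))           # True: untouched
print(verify_content(b"edited footage", manifest))  # False: edit detected
```

The design choice worth noting is that verification fails in two distinct ways: if the video bytes change, the stored hash no longer matches, and if someone rewrites the claim itself, the signature no longer verifies. That double binding is what lets a viewer distinguish an original from an edited copy.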
Delving further into this topic, the following QR code will take you to a video of actor Bill Hader in which a group of computer scientists, applying deep learning techniques, replaced Bill's face, with a very clean transition, with that of Tom Cruise and then Seth Rogen. On the one hand, this shows that we are talking about technology that is already within our reach. On the other, it raises questions: will we see new Leonardo DiCaprio movies even after his death, if he or his family sell the rights to his image? Will we have virtual actors, created and powered entirely by AI, who win awards like the Oscars? Will we have virtual actors and actresses in the porn industry? Once again, many questions and few answers, although in this case I venture to predict that the answer to all of these will be affirmative, and that our eyes will not be able to tell a real human from a digitally created one when we see them through a screen.

I even remember that as a child I thought Hollywood actors had to speak many languages, and that this was how I could watch the same actor speaking Spanish while others saw him dubbed in German or in his original language, whatever it was. Incidentally, I think it was at age 12, in 2004, watching I, Robot starring Will Smith, that I realized his lips did not match the words I heard him say in Spanish. Of course, it was dubbed; but in the future, major films may use AI to adjust actors' lips and expressions to match the dubbed dialogue, which, by the way, might not even require hiring a voice actor: an AI could replicate the original actor's voice in another language.
Deepfakes or the end of reality[126]
However, if we are talking about downsides, I have one more question: can you imagine the lawsuits that will arise over the use of actors' images? It is one thing to sign a contract with a living person, or with their family, granting a brand or company the rights to exploit their image. Now let's imagine something else. Suppose Robert De Niro dies, but thanks to advances in artificial intelligence and deep learning, Martin Scorsese decides to make a new movie starring him, using images of the young De Niro from Taxi Driver, a film Scorsese himself directed. Imagine Scorsese holds all the rights to that movie. Should he pay De Niro's heirs for the use of his image? After all, hypothetically speaking, Scorsese or the production company behind Taxi Driver already paid De Niro an agreed sum for the use of his image, young as it was then, so they could use those same images to train computational models, meaning they would be using digital property that already belongs to them.
While using someone's image without consent is a sensitive matter, by now various AI systems such as Midjourney, DALL-E, and Stable Diffusion, among others, let us create hyper-realistic images of whatever we want simply by asking for them. This has drawn criticism from artists who argue, rightly, that the works produced by these systems are based on previous works created by humans, and that those humans receive none of the recognition they deserve, whether a mention or a monetary sum. At this point, the intellectual property of whatever is publicly available on the internet is fading ever further into the distance.

In the end, no one pays royalties to Brunelleschi's descendants for formulating the laws of linear perspective, or to Johannes Widman for introducing the plus (+) and minus (-) symbols in a book he published in 1489, yet the knowledge these individuals expressed is in everyday use, and new things have been built upon it. We all learned from someone, and yet our tutors receive no constant reward for what they taught us. Just as the music industry was shaken at the beginning of the century by piracy, which facilitated access to copyrighted music files and forced the business to transform, making artists more dependent on income from tours than from album sales, today influencers offer free content on YouTube and other social networks to get noticed, and eventually make money selling theater tickets. Regardless of whether one considers this right or wrong, it is how things work today, and artists must adapt to stay in the spotlight. Intellectual property for digital goods now hangs by a very thin thread, and that is why this book is offered free to anyone who wants to read it.
I was trained on the internet, with knowledge that thousands of people who do not know me personally, many of whom I know only by their aliases, made available selflessly and freely. This is how knowledge is distributed today: among friends, family, and also strangers across the network, through the stories, texts, and evidence they constantly share with us. Moreover, if we are talking about copyright and the "end of reality," as this book's subtitle puts it, I must also mention how copyright is sometimes used to hide reality. What do I mean? In 2021, amid the Black Lives Matter protests in the streets of the United States, a police officer confronted by a group of protesters, noticing he was being recorded, took out his phone and played a Taylor Swift song at full volume[127]. We must admit the move was ingenious: if you upload a video containing copyrighted music to the main social networks, the platforms block the upload or cut the live broadcast.
[116] The Washington Post (2016). Rep. Darrell Issa elected to represent California 49th Congressional District [article generated by Heliograf]. Viewed October 1, 2021, at https://www.washingtonpost.com/news/politics/2016-race-results-california-house-49th.
[117] Video: la presentadora de un noticiero de TV que no es un ser humano y asombra al mundo [Video: the TV news anchor who is not a human being and astonishes the world]. Clarin.com (2020). Viewed October 5, 2021, at https://www.clarin.com/internacional/video-presentadora-noticiero-tv-humano-asombra-mundo_0_hQNlAYKFz.html.
[118] Facundo Cajén. (2022). Global Shapers and Kai-Fu Lee at Davos 2022 | World Economic Forum [Video]. YouTube. Viewed June 1, 2022, at https://www.youtube.com/watch?v=ghSg5jJyQyQ.
[119] Trenholm, R. (2021). Nvidia faked part of a press conference with a CGI CEO. CNET. Viewed August 14, 2022, at https://www.cnet.com/tech/gaming/nvidia-faked-part-of-a-press-conference-with-a-cgi-ceo.
[120] Barrabi, T. (2023). AI-generated photo of fake Pentagon explosion sparks brief stock selloff. New York Post. Viewed May 24, 2023, at https://nypost.com/2023/05/22/ai-generated-photo-of-fake-pentagon-explosion-sparks-brief-stock-selloff.
[121] VALL-E. (2023). Github.io. Viewed January 25, 2023, at https://valle-demo.github.io.
[122] Verma, P. (2023). Pensaron que sus seres queridos les pedían ayuda: era una estafa con inteligencia artificial [They thought their loved ones were asking for help: it was an artificial-intelligence scam]. Infobae. Viewed March 10, 2023, at https://www.infobae.com/wapo/2023/03/08/pensaron-que-sus-seres-queridos-les-pedian-ayuda-era-una-estafa-con-inteligencia-artificial.
[123] Cuarteto de Nos (2009). Breve descripción de mi persona [Brief description of my person]. Letras.com. Viewed January 29, 2023, at https://www.letras.com/cuarteto-de-nos/1512800.
[124] A promising step forward on disinformation. (2021). Microsoft. Viewed March 25, 2023, at https://blogs.microsoft.com/on-the-issues/2021/02/22/deepfakes-disinformation-c2pa-origin-cai.
[125] Technology and media entities join forces to create standards group aimed at building trust in online content. (2021). Microsoft. Viewed March 25, 2023, at https://news.microsoft.com/2021/02/22/technology-and-media-entities-join-forces-to-create-standards-group-aimed-at-building-trust-in-online-content.
[126] Ctrl Shift Face. (2019). Bill Hader channels Tom Cruise [DeepFake] [Video]. Viewed July 12, 2021, at https://www.youtube.com/watch?v=VWrhRBb-1Ig.
[127] Spangler, T. (2021). Cop Plays Taylor Swift Song to Block BLM Protest Video From YouTube. Variety. Viewed March 24, 2023, at https://variety.com/2021/digital/news/police-taylor-swift-copyright-youtube-blm-1235010756.