AI in Production
Microsoft's Azure, Google Cloud Machine Learning, Amazon Machine Learning, IBM Watson, and free platforms like scikit-learn.

Computer scientist Siwei Lyu had watched his team’s deepfake videos with a gnawing sense of unease. Created by a machine learning algorithm, these falsified films showed celebrities doing things they'd never done. They felt eerie to him, and not just because he knew they’d been ginned up. “They don’t look right,” he recalls thinking, “but it’s very hard to pinpoint where that feeling comes from.”

Finally, one day, a childhood memory bubbled up into his brain. He, like many kids, had held staring contests with his open-eyed peers. “I always lost those games,” he says, “because when I watch their faces and they don’t blink, it makes me very uncomfortable.”

These lab-spun deepfakes, he realized, were needling him with the same discomfort: He was losing the staring contest with these film stars, who didn't open and close their eyes at the rates typical of actual humans.

 

Deepfake programs pull in lots of images of a particular person (you, your ex-girlfriend, Kim Jong-un) to catch them at different angles, with different expressions, saying different words. The algorithms learn what this character looks like, and then synthesize that knowledge into a video showing that person doing something he or she never did, such as delivering a presidential meta-warning about fake videos.

These fakes, while convincing if you watch a few seconds on a phone screen, aren’t perfect (yet). They contain tells, like creepily ever-open eyes, from flaws in their creation process. In looking into DeepFake’s guts, Lyu realized that the images that the program learned from didn’t include many with closed eyes (after all, you wouldn’t keep a selfie where you were blinking, would you?). “This becomes a bias,” he says. The neural network doesn’t get blinking. Programs also might miss other “physiological signals intrinsic to human beings,” says Lyu’s paper on the phenomenon, such as breathing at a normal rate, or having a pulse. (Autonomic signs of constant existential distress are not listed.) While this research focused specifically on videos created with this particular software, it is a truth universally acknowledged that even a large set of snapshots might not adequately capture the physical human experience, and so any software trained on those images may be found lacking.
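The blinking tell can be illustrated with a minimal sketch. It assumes some upstream step (not shown, and hypothetical here) has already turned each video frame into an "eye openness" number between 0 and 1. The thresholds `closed_below`, `min_frames`, and `min_rate` are illustrative guesses rather than values from Lyu's paper, and this simple run-counting approach is a stand-in, not the detector his team built.

```python
from typing import List


def count_blinks(eye_openness: List[float],
                 closed_below: float = 0.2,
                 min_frames: int = 2) -> int:
    """Count blinks as runs of at least `min_frames` consecutive 'closed' frames."""
    blinks = 0
    run = 0  # length of the current run of closed-eye frames
    for value in eye_openness:
        if value < closed_below:
            run += 1
        else:
            if run >= min_frames:
                blinks += 1
            run = 0
    if run >= min_frames:  # the clip may end mid-blink
        blinks += 1
    return blinks


def blinks_per_minute(eye_openness: List[float], fps: float) -> float:
    """Convert a per-frame openness series into a blink rate."""
    if not eye_openness:
        return 0.0
    minutes = len(eye_openness) / fps / 60.0
    return count_blinks(eye_openness) / minutes


def looks_suspicious(eye_openness: List[float], fps: float,
                     min_rate: float = 5.0) -> bool:
    """Flag a clip whose subject blinks less often than `min_rate` (assumed cutoff)."""
    return blinks_per_minute(eye_openness, fps) < min_rate
```

A clip of a face that never closes its eyes would be flagged, while one with regular blinks would pass. As the article goes on to note, a fake trained on closed-eye images would pass too.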

Lyu's blinking revelation revealed a lot of fakes. But a few weeks after his team put a draft of their paper online, they got anonymous emails with links to deeply faked YouTube videos whose stars opened and closed their eyes more normally. The fake content creators had evolved.

“Blinking can be added to deepfake videos by including face images with closed eyes or using video sequences for training.” Once you know what your tell is, avoiding it is "just" a technological problem. Which means deepfakes will likely become (or stay) an arms race between the creators and the detectors. But research like Lyu’s can at least make life harder for the fake-makers. “We are trying to raise the bar,” he says. “We want to make the process more difficult, more time-consuming.”

Making a deepfake is pretty easy. You download the software. You Google “Hillary Clinton.” You get tens of thousands of images. You funnel them into the deepfake pipeline. It metabolizes them, learns from them. And while it's not totally self-sufficient, with a little help, it gestates and gives birth to something new, something sufficiently real.

 

One response is a Darpa program called MediFor, short for Media Forensics.

MediFor started in 2016 when the agency saw the fakery game leveling up. The project aims to create an automated system that looks at three levels of tells, fuses them, and comes up with an “integrity score” for an image or video. The first level involves searching for dirty digital fingerprints, like noise that's characteristic of a particular camera model, or compression artifacts. The second level is physical: Maybe the lighting on someone's face is wrong, or a reflection isn't the way it should be given where the lamp is. 

 

The third level is the “semantic level”: comparing the media to things known to be true. So if, say, a video of a soccer game claims to come from Central Park at 2 pm on Tuesday, October 9, 2018, does the state of the sky match the archival weather report? Stack all those levels, and voila: integrity score. By the end of MediFor, Darpa hopes to have prototype systems it can test at scale.
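How the three levels might be fused can be sketched in a few lines. The article does not say how MediFor combines them, so the weighted average below is purely an assumption, as are the names `LevelScores` and `integrity_score` and the convention that each level reports a number from 0 (clearly tampered) to 1 (nothing amiss).

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class LevelScores:
    """Hypothetical per-level scores, each in the range 0.0 to 1.0."""
    digital: float   # camera noise, compression artifacts
    physical: float  # lighting, reflections
    semantic: float  # agreement with known facts, such as archived weather


def integrity_score(scores: LevelScores,
                    weights: Tuple[float, float, float] = (1.0, 1.0, 1.0)) -> float:
    """Fuse the three levels with a weighted average (an assumed fusion rule)."""
    values = (scores.digital, scores.physical, scores.semantic)
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError("each level score must lie between 0 and 1")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(w * v for w, v in zip(weights, values)) / total_weight
```

Under this toy rule, a video with clean pixels and plausible lighting still loses a third of its score if the sky contradicts the weather report. A real system could just as plausibly let one failed level veto the rest.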

But the clock is ticking (or is that just a repetitive sound generated by an AI trained on timekeeping data?). “What you might see in a few years’ time is things like fabrication of events,” says Darpa program manager Matt Turek. “Not just a single image or video that’s manipulated but a set of images or videos that are trying to convey a consistent message.”

Over at Los Alamos National Lab, cyber scientist Juston Moore’s visions of potential futures are a little more vivid. Like this one: Tell an algorithm you want a picture of Moore robbing a drugstore; implant it in that establishment’s security footage; send him to jail. 

In other words, he's worried that if evidentiary standards don’t (or can’t) evolve with the fabricated times, people could easily be framed. And if courts don't think they can rely on visual data, they might also throw out legitimate evidence.

 

Taken to its logical conclusion, that could mean our pictures end up worth zero words. “It could be that you don’t trust any photographic evidence anymore,” he says, “which is not a world I want to live in.”

"The algorithms can create images of faces that don't belong to real people, and they can translate images in strange ways, such as turning a horse into a zebra," says Moore. They can "imagine away" parts of pictures, and delete foreground objects from videos.

Maybe we can’t combat fakes as fast as people can make better ones. But maybe we can, and that possibility motivates the digital forensics research of Moore’s team. Los Alamos’s program, which combines expertise from its cyber systems, information systems, and theoretical biology and biophysics departments, is younger than Darpa’s, at just about a year old. One approach focuses on “compressibility,” or times when there's not as much information in an image as there seems to be. “Basically we start with the idea that all of these AI generators of images have a limited set of things they can generate,” Moore says. “So even if an image looks really complex to you or me just looking at it, there’s some pretty repeatable structure.”
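One crude way to get a feel for compressibility is to run raw pixel bytes through a general-purpose compressor and see how much they shrink. The sketch below uses Python's standard `zlib` module as that compressor. This is a swapped-in proxy chosen for illustration, not the measure Moore's team uses, and the function name `compression_ratio` is made up for the example.

```python
import zlib


def compression_ratio(pixels: bytes) -> float:
    """Return compressed size divided by original size.

    Values near 0 mean highly repetitive data; values near 1 (or slightly
    above) mean the compressor found almost no repeatable structure.
    """
    if not pixels:
        raise ValueError("need at least one byte of pixel data")
    compressed = zlib.compress(pixels, 9)  # 9 = strongest compression level
    return len(compressed) / len(pixels)
```

A buffer that repeats the same few pixel values scores far lower than a buffer of noise of the same length, even if both look "busy" at a glance. A ratio on its own proves nothing about authenticity, since plenty of real photos compress well.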

The team feeds an algorithm a bunch of real pictures and a bunch of made-up representations from a particular AI. The algorithm pores over them, building up what Moore calls “a dictionary of visual elements,” namely what the fictional pics have in common with each other and what the nonfictional shots uniquely share. If Moore’s friend retweets a picture of Obama, and Moore thinks maybe it's from that AI, he can run it through the program to see which of the two dictionaries (the real or the fake) best defines it.
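The two-dictionary idea can be mimicked with a toy. In the sketch below, an "image" is a small grid of integers, a "visual element" is a 2x2 patch, and a "dictionary" is just a count of how often each patch appears in the training pictures. All of that is a deliberate simplification made up for illustration. The article does not describe the lab's actual representation, and the names `build_dictionary`, `fit_score`, and `classify` are hypothetical.

```python
from collections import Counter
from math import log
from typing import Iterable, Iterator, List, Tuple

Image = List[List[int]]            # rectangular grid of quantized pixel values
Patch = Tuple[int, int, int, int]  # one 2x2 "visual element"


def patches(image: Image) -> Iterator[Patch]:
    """Yield every overlapping 2x2 patch of a rectangular image."""
    for r in range(len(image) - 1):
        for c in range(len(image[r]) - 1):
            yield (image[r][c], image[r][c + 1],
                   image[r + 1][c], image[r + 1][c + 1])


def build_dictionary(images: Iterable[Image]) -> Counter:
    """Count how often each patch occurs across a set of training images."""
    counts: Counter = Counter()
    for image in images:
        counts.update(patches(image))
    return counts


def fit_score(image: Image, dictionary: Counter, smoothing: float = 1.0) -> float:
    """Average smoothed log-frequency of the image's patches in a dictionary."""
    found = list(patches(image))
    if not found:
        raise ValueError("image must be at least 2x2")
    total = sum(dictionary.values())
    vocab = len(dictionary) + 1  # one extra slot for never-seen patches
    denom = total + smoothing * vocab
    return sum(log((dictionary[p] + smoothing) / denom) for p in found) / len(found)


def classify(image: Image, real_dict: Counter, fake_dict: Counter) -> str:
    """Report which dictionary describes the image better."""
    if fit_score(image, real_dict) >= fit_score(image, fake_dict):
        return "real"
    return "fake"
```

The design choice worth noticing is that nothing here asks whether a picture looks right. It only asks which pile of known examples the picture's building blocks resemble more.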

Los Alamos, which has one of the world’s most powerful supercomputers, isn't pouring resources into this program just because someone might want to frame Moore for a robbery. The lab’s mission is “to solve national security challenges through scientific excellence.” And its core focus is nuclear security—making sure bombs don’t explode when they’re not supposed to, and do when they are (please no), and aiding in nonproliferation. That all requires general expertise in machine learning, because it helps with, as Moore says, “making powerful inferences from small datasets.”

 

But beyond that, places like Los Alamos need to be able to believe—or, to be more realistic, to know when not to believe—their eyes. Because what if you see satellite images of a country mobilizing or testing nuclear weapons? What if someone synthesized sensor measurements?

 

That's a scary future, one that work like Moore's and Lyu's will ideally circumvent. But in that lost-cause world, seeing is not believing, and seemingly concrete measurements are mere creations. Anything digital is in doubt.

But maybe “in doubt” is the wrong phrase. Many people will take fakes at face value (remember that picture of a shark in Houston?), especially if their content meshes with what they already think. “People will believe whatever they’re inclined to believe,” says Moore.

That’s likely more true in the casual news-consuming public than in the national security sphere. And to help halt the spread of misinformation among us dopes, Darpa is open to future partnerships with social media platforms, to help users determine that a video of Kim Jong-un doing the Macarena has low integrity. Social media can also, Turek points out, spread a story debunking a given video as quickly as it spreads the video itself.

Will it, though? Debunking is complicated (though not as ineffective as the lore suggests). And people have to actually engage with the facts before they can change their minds about the fictions.

But even if no one could change the masses' minds about a video's veracity, it's important that the people making political and legal decisions—about who's moving missiles or murdering someone—try to machine a way to tell the difference between waking reality and an AI dream.

 

Echobox is a software company that helps publishers increase traffic by 'intelligently' posting articles on social media platforms such as Facebook and Twitter.[42] By analyzing large amounts of data, it learns how specific audiences respond to different articles at different times of the day. It then chooses the best stories to post and the best times to post them. It uses both historical and real-time data to understand what has worked well in the past as well as what is currently trending on the web.[43]
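The "best times to post" part can be pictured with a very small sketch. It assumes a history of (hour of day, clicks) pairs for past posts and picks the hour with the highest average. This is an illustrative guess at the general idea, not Echobox's algorithm, and `best_posting_hour` is a made-up name.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def best_posting_hour(history: Iterable[Tuple[int, int]]) -> int:
    """Return the hour (0-23) whose past posts earned the most clicks on average.

    Ties go to the earlier hour.
    """
    totals: Dict[int, int] = defaultdict(int)
    counts: Dict[int, int] = defaultdict(int)
    for hour, clicks in history:
        if not 0 <= hour <= 23:
            raise ValueError("hour must be between 0 and 23")
        totals[hour] += clicks
        counts[hour] += 1
    if not counts:
        raise ValueError("history is empty")
    # Rank by mean clicks; -hour breaks ties in favor of the earlier hour.
    return max(counts, key=lambda h: (totals[h] / counts[h], -h))
```

A production system would presumably also weigh the real-time trends the paragraph mentions. This sketch looks only at the historical half.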

 

Another company, called Yseop, uses artificial intelligence to turn structured data into intelligent comments and recommendations in natural language. Yseop is able to write financial reports, executive summaries, personalized sales or marketing documents and more at a speed of thousands of pages per second and in multiple languages including English, Spanish, French, and German.[44]

 

Boomtrain is another example of AI that is designed to learn how to best engage each individual reader with the exact articles, sent through the right channel at the right time, that will be most relevant to the reader. It’s like hiring a personal editor for each individual reader to curate the perfect reading experience.

Cast and Crew Selection

Experimental Script Generator

Content Repurpose

Experimental Trailer Generator


© 2019 Singer Media, LLC