A team of researchers from the University at Albany has developed a method of combating Deepfake videos, using machine learning techniques to search videos for digital “fingerprints” left behind when a video has been altered.
One of the biggest concerns in the tech world over the past couple of years has been the rise of Deepfakes. Deepfakes are fake videos constructed by artificial intelligence algorithms running on deep neural networks, and the products of the technology are shockingly good – sometimes difficult to tell apart from genuine footage.
AI researchers, ethicists, and political scientists worry that Deepfake technology will eventually be used to influence political elections, disseminating misinformation in a form more convincing than a fake news story. To provide some defense against the manipulation and misinformation Deepfakes can cause, researchers from the University at Albany have created tools to assist in the detection of fake videos.
The Subtle Tells Of A Fake Video
Deepfake programs are capable of merging different images together into a video, compositing images of one person onto another person, for instance. These programs can be used to make powerful and influential people, like politicians, appear to say things they didn’t actually say. Deepfake programs operate by analyzing thousands of images of a person from different angles, saying different things, wearing different facial expressions, and learning the features that define the person.
However, Deepfakes aren’t exactly flawless yet. The videos have certain characteristics that can be analyzed by another algorithmic system to determine whether a video is fake. As noted in a piece for The Conversation, one of these characteristics is that people in Deepfake videos don’t blink as often as real people do. The deep neural networks that learn the features of a face don’t pick up on blinking the way humans do: they are limited by the data they are given, and photographs capture people with their eyes open far more often than mid-blink. Beyond this, photos in which people are blinking are usually discarded from training sets, which creates a further bias toward open eyes.
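The blink cue lends itself to a simple illustration. The sketch below is not the Albany team’s published method (which relied on deep networks); it assumes some upstream face-landmark detector has already produced six (x, y) points per eye per frame, and uses the standard eye-aspect-ratio (EAR) trick: the ratio collapses toward zero when the eye closes, so a long clip whose ratio never dips contains no blinks and may warrant a closer look. The landmark ordering and thresholds here are illustrative assumptions, not tuned values.

```python
import math

def eye_aspect_ratio(eye):
    """EAR from six (x, y) eye landmarks, ordered: left corner, top-left,
    top-right, right corner, bottom-right, bottom-left (as in the common
    68-point face model). Open eyes typically sit around 0.25-0.35; the
    value collapses toward zero during a blink."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])
    # Two vertical lid distances over twice the horizontal eye width.
    return (dist(eye[1], eye[5]) + dist(eye[2], eye[4])) / (2.0 * dist(eye[0], eye[3]))

def count_blinks(ear_series, threshold=0.2, min_frames=2):
    """Count blinks as runs of at least min_frames consecutive frames
    whose EAR falls below threshold."""
    blinks, run = 0, 0
    for ear in ear_series:
        if ear < threshold:
            run += 1
        else:
            if run >= min_frames:
                blinks += 1
            run = 0
    if run >= min_frames:  # a blink may end the clip
        blinks += 1
    return blinks
```

On a real clip, a detector such as dlib’s 68-point model would supply the landmarks; the threshold of 0.2 and the two-frame minimum are common starting points rather than calibrated values.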
Siwei Lyu, a professor at the University at Albany and leader of the team that created the fake-detection tools, says that a neural network might miss not only blinking but also other subtle yet important physiological signals, such as a normal breathing rate. Lyu told TechRepublic that while the team’s research specifically targets video created with Deepfake programs, any software trained on images of humans could be missing subtle human cues, because images don’t capture the entire physical human experience.
Lyu’s team successfully identified many fake videos using their machine learning system, and a draft of their paper includes many examples of fakes the system was able to identify. However, shortly after the draft was released, the team found that makers of video fakes had corrected for the error and produced videos in which people opened and closed their eyes more naturally.
A Constant Competition
Lyu and his team weren’t surprised by this development. Lyu explains that Deepfakes will constantly grow more sophisticated: once you determine which “tells” give a video away as a fake, getting rid of the tell is just a technological problem. Compensating for a lack of blinking can be solved simply by training the neural network on more images with eyes closed, or by using sections of video for training. In other words, there will likely be a constant battle between fake makers and fake detectors, with fake makers developing more sophisticated faking techniques and fake detectors working to develop more sophisticated detection methods. Given this reality, the goal of Lyu’s team is simply to make convincing fakes harder and more time-intensive to produce, in the hope that this will deter fake makers.
According to Wired, Lyu’s research is being funded by Media Forensics, a DARPA initiative. DARPA and the rest of the intelligence/military community are extremely concerned about the advancement of Deepfake technology. Media Forensics (MediFor) was created in 2016 in response to the quality of video fakes rapidly increasing. The project’s goal is to create an automated media analysis system that can examine a video or photo and give the piece of media an “integrity score”.
Wired also reports that MediFor’s analysis system arrives at an integrity score by examining three types of “tells” – traces of alteration in an image or video – each at its own level of detail. The first level is a search for “digital fingerprints”: digital artifacts that show evidence of manipulation, such as image noise characteristic of a particular camera, compression artifacts, or irregular pixel intensities. The second level is a physical analysis – for instance, reflections on a surface may not be where they should be given the light in the surrounding area. The final level is a “semantic” analysis, comparing the suspected fake to control images known to be real, or to other data points that could falsify it. For example, if a video is claimed to be from a particular time and place, does the weather in the video match the weather reported on that day?
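A toy example can make the first, “digital fingerprint” level concrete. The sketch below is a hypothetical illustration, not MediFor’s actual pipeline: intensity histograms of untouched photos tend to be smooth, while resaved or contrast-stretched images often show comb-like gaps and spikes – one of the irregular-pixel-intensity artifacts a fingerprint search might flag.

```python
def intensity_histogram_anomaly(pixels, bins=256):
    """Crude 'digital fingerprint' check on 8-bit pixel values.
    Returns the fraction of interior histogram bins that are empty
    while both neighbours are populated - a rough 'comb' score.
    Smooth, untouched histograms score near 0; contrast-stretched
    or requantized images score much higher."""
    hist = [0] * bins
    for p in pixels:
        hist[p] += 1
    comb = sum(
        1 for i in range(1, bins - 1)
        if hist[i] == 0 and hist[i - 1] > 0 and hist[i + 1] > 0
    )
    return comb / (bins - 2)
```

For instance, pixels drawn from every value 0–255 score 0.0, while an image whose levels were stretched so only even values occur scores 0.5. A deployed system would combine many such detectors, each contributing evidence toward the overall integrity score.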
Racing To Create Defenses Against Misinformation
Unfortunately, as videos and images become easier to fake, there could come a time when not just individual images and videos are faked but entire events. A whole set of images or videos could be created that fabricates an event from multiple angles, adding to its perceived authenticity. Researchers are concerned that there could be a future in which videos are either trusted too much or not trusted at all, and valuable, real evidence is thrown out. Motivated by these worries, other organizations are working on their own methods of detecting video fakes.
The Los Alamos National Lab has an entire division dedicated to digital forensics research, and the digital forensics team there combines expertise from many different disciplines to detect fake videos. Los Alamos’ stated goal is “to solve national security challenges through scientific excellence.” One method of fake detection the team is working on analyzes “compressibility”. An image’s compressibility refers to how much information it actually contains; an incongruity between that amount and how much information the image appears to contain can suggest a fake.
“Basically we start with the idea that all of these AI generators of images have a limited set of things they can generate. So even if an image looks really complex to you or me just looking at it, there’s some pretty repeatable structure,” cyber scientist Juston Moore told Wired.
Another method used by Los Alamos employs sparse coding algorithms. These algorithms examine many real and fake images and classify suspect images by determining which elements and features real images have in common and which are common to fakes. Essentially, the algorithms build a “dictionary of visual elements” and cross-reference it to determine whether an image is fake.
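The dictionary idea can be sketched in miniature – this is one iteration of matching pursuit (1-sparse coding) in plain Python, an illustrative assumption rather than the lab’s implementation. Collect unit-norm “atoms” from real and fake examples, then label a new sample by whichever dictionary reconstructs it with the smaller residual.

```python
import math

def normalize(v):
    """Scale a vector to unit length (dictionary atoms must be unit-norm)."""
    n = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / n for x in v]

def residual_error(v, atoms):
    """One matching-pursuit step (1-sparse coding): project v onto its
    best-matching unit-norm atom and return the residual's norm."""
    best = max(atoms, key=lambda a: abs(sum(x * y for x, y in zip(v, a))))
    c = sum(x * y for x, y in zip(v, best))
    return math.sqrt(sum((x - c * y) ** 2 for x, y in zip(v, best)))

def classify(v, real_atoms, fake_atoms):
    """Label v by whichever 'dictionary of visual elements'
    reconstructs it with the smaller residual."""
    if residual_error(v, real_atoms) <= residual_error(v, fake_atoms):
        return "real"
    return "fake"
```

In practice the atoms would be learned from image patches with a proper dictionary-learning algorithm and the codes would use more than one atom; the point of the sketch is only the decision rule – a sample is assigned to the dictionary that explains it best.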
Researchers at Los Alamos and elsewhere are working hard to create fake-detection methods precisely because humans are predisposed to believe their senses – to believe what is right in front of their eyes. (“People will believe whatever they’re inclined to believe,” Moore told Wired.) It will take quite a bit of evidence to dissuade someone from believing something they’ve seen with their own eyes, and even more if what they’ve seen reinforces an existing belief. Ultimately, the solution may have to come not only from tools built to detect fake images and videos but also from education about the manipulative power of Deepfakes and a cultural shift toward greater skepticism in general.