Voice Fakery

I recently posted a bit about voice editing and voice fakery – my immediate reaction was “uh oh” but I’ve got some personal perspective on the topic. I’ve been a photographer since about 1992 (when I started getting serious about it) and a photoshop user since version 1. Fakery? It’s all fake.

Nadar, self-portrait, ~1853

What’s interesting is social reactions to fakery. When photography was first introduced, its hallmark was realism – if by “realism” you mean people in awkward poses due to long exposures, flattened out into two dimensions and rendered in black and white. Some of the earliest innovations in photography were fakery: Nadar’s self-portraits in a balloon were not, in truth, taken in a balloon.

Viewers of photography went through a phase between 1900 and 1995 (let’s say) in which the photographic image was treated as though it recorded some underlying reality. I suppose that this came about because painting was obviously not real, but photography appeared to be real, and the early era of photography was all about representing events as they happened. War photography did a great deal to cement the representation of reality – it was the first time that the horror of war was captured in detail, and it looked as real as the audience expected it to. Therefore, it was capturing reality.

Then we had a period in which photographs were treated as records of truth, even though we knew, in some part of our minds, that photos are artifacts that are heavily manipulated. Landscape photographer Ansel Adams became famous for representing nature with a grandeur (and contrast range) that it never had; he was just as much of a propagandist as the Soviet-era photographers who created mythical Soviet reality – reality as it should have been.

When photomanipulations came along, photo fakery became a brief phenomenon among the consumer community who were primed to see photographs as real, but had not yet internalized that they weren’t. They were real – real photographs – but what they showed was a lie.

From the 1950s till the 1970s we had photos of Bigfoot, photos of the Loch Ness Monster, photos of UFOs.

Nowadays if you tried to pass off the picture of the flying hubcap as a UFO, people would mock your photoshop skills. With the advent of photorealistic computer renderings, the idea of photographic reliability and reality slowly went out the window.

The photo of a spaceship from Elite Dangerous, parked at an airport gate, turns the idea of “UFO photo” on its head: it’s where you’d expect it to be, it fits in. The old UFO pictures were particularly fake because the UFOs were out of context. Now, we are used to seeing UFOs all the time, in Star Wars, Babylon 5, Star Trek: I find it rather odd that people believed flying hubcaps were UFOs, or bad pictures of people in gorilla suits were Bigfoot, when 3D rendered King Kong and Gollum were obviously right around the corner. At the time when the hubcap photos were being treated as pictures of UFOs, there was a Buck Rogers movie serial that had much better special effects.

Photomanipulation by CMDR Father Cool

When I saw the demo of the voice-editing software, I immediately thought that it would overthrow people’s tendency to automatically believe voice recordings, as photoshop and manipulations have overthrown people’s tendency to believe pictures are reality.

We’ve developed out-of-band ways of identifying photo fakes – we check sites like Snopes, or we have learned on our own how to identify them (notice that the shadow depth of CMDR Cool’s composite is not quite the same as the aircraft next to it?). There are techniques a’plenty in photography, dating back to the 1940s – Dino Brugioni’s book on Photo Fakery was a favorite of mine. Film-era techniques like negative grain analysis – looking for subtle changes in the underlying grain structure of a print – no longer matter, but the same idea works on JPEG compression: a composite looks very different from a rendering.

Mostly, what will happen is that people will learn not to immediately trust voices. For certain transactions, I’ve already been using a few techniques for years. If I get an SMS text message from someone that claims to know me, I apologize and ask them, “Since you’re Gary, I suspect you’ll remember the last city we had sushi in together…. Which was?”

I think we’re past the point where society believes “photographs” and we’re well into the stage where people learn not to believe e-mails. The most recent US presidential campaign will go a long way toward teaching society not to believe politicians, or Facebook click-bait news. We should have never believed in them, anyway.

Sometimes, people lie so hard, it breathes new value into the truth.


While I was looking up the references on Dino Brugioni (I thought I’d include some pictures from his book) I stumbled across what appears to be a site of people who’ve looked at photo fakery and concluded that, since there were faked photos, the holocaust didn’t happen. It’s a sad example of motivated reasoning. I don’t think I’ll link to it. It’s one thing to deny the evidential value of a photo, but it’s something else entirely to deny the weight of a lot of history and corroborating stories from witnesses.

Jerry Uelsmann’s photography depends heavily on composites drawn painstakingly from manipulated negatives. Back when photoshop and digital photography began to eat into the scene, Uelsmann had a problem: it was now fairly easy to do what he had been doing. His creative “edge” was the difficulty of producing masked composite negatives and when that went away, so did his schtick. I still love his work but it’s sad when I see an Uelsmann and there’s a comment: “this was not done in photoshop.” Oh, so we should be impressed that you used the wrong tool? I could make the Eiffel Tower out of pasta but, so what?


Dino Brugioni – Photo Fakery – A History of Deception and Manipulation


  1. Mano Singham says

    I used to think that Ansel Adams waited for just the right moment to get the image he wanted and that the only extra factor he added was cropping. But then I visited an exhibition of his photographs. They had the famous photo of the moon over a mountain range in New Mexico. But they also had many other images of that same scene in which Adams had tried all manner of darkroom techniques with the negative, trying to get just the effect that he wanted, before settling for the iconic image.

    It was an eye-opener for me. This was, of course, long before Photoshop, and it made me wary of purported realism. I now look at photographs like I do a painting, representing the creator’s vision as opposed to some kind of objective reality.

  2. says

    I’ve always found it amusing that younger people thought there was no photo manipulation until photoshop. Chemicals, they work wonders!

    I’m one of those people with next to no photoshop skills, and really can’t be arsed to mess about with my photos. Yes, HDR is beautiful and eerie, and I don’t have a fuckin’ clue how to do any of it. If I can’t get what I want in camera, then I don’t get it. shrug.

    I used to be on photoSIG, and left with a roll of the eyes when I was critiqued multiple times for not using pshop. Fuck that noise.

  3. says

    Mano Singham@#1:
    There are so many techniques – Ansel Adams’ “Zone System” was a way of stretching the tonal range of an image for contrast control – which meant creating images with contrast ranges that were not actually in the original scene. When people talk about Adams’ work being “luminous” that’s because he relied heavily on filters and contrast manipulation. He was also a huge fan of darkroom manipulation. His book “The Print” is all about how to selectively dodge and burn a print to create the desired local contrast.

    The “moonrise over Hernandez, New Mexico” photo is (as you say) heavily manipulated. It’s also one that has been duplicated as “original edition” and “limited edition” endlessly – I know someone who once bought a copy of it thinking it was a valuable Adams print, but it was actually printed by The Ansel Adams Studio not Adams – and even Adams was notorious for producing “limited editions” where “limited” means “there is a limit to how many I am printing at this session.”

    Cartier-Bresson was notable for his “decisive moment” photography, which pretended to be journalistic – in fact he would set up and photograph 50 people interacting with a puddle and eventually snap the shutter on someone who jumped over it. Or the scene was staged outright: Doisneau’s famous “the kiss at the hotel de ville” photo turns out to have been staged (it would have to be, to get the motion blur in the background).
    It’s all fake. I constantly remind myself of this by muttering that “real” scenes don’t have wires and are perceived (by me) in 3D and photographs aren’t.

    All of these things are hacks whereby people are manipulated through their mistaken belief of some objective reality: the image is of the “real” scene, the image is rare, the image was not staged, etc.

  4. says

    I used to be on photoSIG, and left with a roll of the eyes when I was critiqued multiple times for not using pshop.

    Oh there are some memories… PhotoSIG was one of the most toxic social media sites I’ve ever encountered.

    I was critiqued multiple times for using photoshop. And not using photoshop. And shooting straight compositions. And not shooting straight compositions. And not doing enough critiques, etc. It was a really bizarre and nasty website, with a lot of self-appointed guardians of rectitude. My favorite was the time I posted a scan of an ambrotype that I’d made and someone said I had overused photoshop. “No, sorry, I coated a piece of glass with nitrocellulose dissolved in ether, dunked it in silver nitrate, exposed it, and developed it and cleared it with ferrous sulfate and cyanide, asshole.” I got banned for the last word, and good riddance.

  5. chigau (ever-elliptical) says

    The shadow depth of CMDR Cool’s composite is not quite the same as the aircraft next to it because the space ship is hovering on its anti-gravity.
    The other aircraft are on wheels.

  6. John Morales says

    Viewers of photography went through a phase between 1900 and 1995 (let’s say) in which the photographic image was treated as though it recorded some underlying reality.

    The Cottingley Fairies being an ur-example.

  7. consciousness razor says

    Audio synthesis/manipulation has of course been around for a long time too … basically as long as recorded audio itself, depending on what counts as “manipulation.” (I mean, is it like the frame of the picture you decided to take, and which type of camera/lens/film/whatever you decided to use, or is it more involved than that?)

    I toy around with these things quite a bit, so honestly VoCo doesn’t seem so impressive to me, at least not totally game-changing as some people might think. It looks like a simple, user-friendly sort of app that’s pretty good for people speaking in English (perhaps many other languages wouldn’t be hard to add), not a comprehensive program for creating/changing any arbitrary sound whatsoever. You could already do all of the things VoCo does (or I could at any rate), but this automates the process to get a “good enough” result, without requiring much skill/know-how on the part of the user. So I suppose this makes it faster, more profitable, more accessible … making “fairly good” results more efficient/easier is nothing to sneeze at of course, but that’s different from a new development that improves on the quality we were able to get.

    I wouldn’t compare it to Photoshop, which offers a very large set of tools to do just about anything to any image (I guess that’s a stretch, but not by too much). This is more like a plug-in designed for making human faces or images of text or more specific tasks like that, while not working well on other types of images. I mean, consider how it works: you get a lot of samples of a certain person, you type in a word, and it uses those to create a pattern (corresponding to that word) which fits in reasonably well with that voice. But most sounds (nearly all?) cannot be made this way.

    So, if, say, you wanted to render background noise (pesky stuff which is always present) or various other non-vocal sounds, this will not help. There is not an app for that. Maybe you’ll sample those sounds too or synthesize them some other way, but there’s no convenient button to push which does it all for you no matter what the situation is. So if you really want your tinkering to be practically undetectable to everybody, you’d have to seamlessly combine all of those different elements together, which is far from easy but is possible.

    VoCo (in its current stage) seems rather sloppy on that front. You could easily hear some clipping and other artifacts where changes were made. And even if it becomes good enough to convince an attentive human listener, some very fine-grained analysis may still be able to pinpoint that something is off. But someone who put enough work into it, and had a lot of money/time/etc., could already make some extremely impressive fake audio that would trick basically anyone or anything.

  8. consciousness razor says

    VoCo (in its current stage) seems rather sloppy on that front. You could easily hear some clipping and other artifacts where changes were made.

    Let me add that it doesn’t look like there are any features related to rhythm (or pitch for that matter). You get an ordered sequence of words/phonemes, based on what you type. But if you wanted to make the rhythm of their speech sound more natural or more characteristic of that person (think of Obama pausing often in mid-sentence, which isn’t how some typically speak), wanted to align it with a piece of video, etc., then it looks like you’re out of luck. Of course you could use many other tools to make adjustments like that, as well as the pitch and so forth, but the point is that you can start to see how limited this software really is, if you’re not too dazzled by what it was able to do in their demonstration.

  9. says

    As an amateur philosophy-of-photography wonk, this is a subject very dear to me!

    Pre-photoshop, for the most part, photographs were “indexical” which is a technical term that means, roughly, “real” in exactly the way photographs are. Yes, yes, the tones have been shoved around for dramatic effect, and it’s not in color, etc. But if it looks like the moon was in-frame, that’s because it *was*. And so on. Things which appear to be so, in a very literal and specific way, were so.

    And so it still is, really. The vast majority of pictures are just as “true”. Uploaded without friction from the phone, they are unedited. As a percentage of total photos, the heavily modified, “photoshopped” picture may actually be more rare now than in, say, 1920.

    What has changed is perception. We no longer trust photos the way we did. Hence the surge in anti-photoshop rules from the press agencies and so on. This is not aimed at more actual truth, these rules instead aim to restore faith. The media are not naive, they know well that one can lie just as easily with a crop (permitted) or by selecting this picture and not that one (permitted) as you can with an erasure or addition (WE DO NOT ALLOW SUCH TERRIBLE THINGS!!!1!!)

    W. Eugene Smith did things to photos which would be violently against the rules these days, and yet in some cases they told more profound truth than ever, after he had his way with them. What is necessary is honor and judgement, not strictures against this tool or that.

    Sadly, we seem to be lacking in the former two.

  10. says

    John Morales@#6:
    The Cottingley Fairies being an ur-example

    I wasn’t even thinking of those but you’re right. And they happened later than I thought: I’d have guessed 1890s but it was 1900s!

  11. sonofrojblake says

    @consciousness razor, 7&8: What bothers me about VoCo is not how good it is today. It’s how good it’s going to be real soon now.

    An analogy: look at this video, which at the time I found amazing: http://youtu.be/UcKqyn-gUbY?t=2m38s

    Jeff Han demonstrates a multi-touch pinch-to-zoom display; listen to the audience reaction. Minds blown. That was February 2006. Less than a decade later, basically every single person I knew had a display like that – better than that – in their pocket all the time.

    VoCo is now at the stage Han’s display was at in 2006 – want to bet it won’t be ubiquitous and much, much better in ten years or less?

    Finally: http://xkcd.com/1235/