Is it possible to identify the people in the photo above? Maybe. Digital images potentially contain a large amount of metadata that could reveal the photographer’s identity and location. There may also be a surprising number of clues in the photo itself.
The standard format for image metadata is EXIF, Exchangeable Image File Format. Some of this information is obviously identifiable, such as the contents of fields like ImageEditor. A camera may or may not include such information, and someone may remove this information from photos after they are taken, but it may well be present inside the photo.
Similarly, the photo may include information regarding where the photo was taken, such as in the GPSLatitude, GPSLongitude, and GPSAltitude fields. There are also fields for recording when the photo was taken or edited.
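To make the location fields concrete: EXIF stores a latitude or longitude as three rational numbers (degrees, minutes, seconds), with the hemisphere in a separate reference field. The sketch below converts such a triple into signed decimal degrees. The function name and the sample values are made up for illustration, and the input is assumed to be (numerator, denominator) pairs as an EXIF parser might return them.

```python
from fractions import Fraction

def dms_to_decimal(dms, ref):
    """Convert EXIF-style (degrees, minutes, seconds) rationals to decimal degrees.

    dms is a sequence of three (numerator, denominator) pairs;
    ref is the hemisphere letter: "N", "S", "E", or "W".
    """
    degrees, minutes, seconds = (Fraction(n, d) for n, d in dms)
    value = degrees + minutes / 60 + seconds / 3600
    # By convention, south latitudes and west longitudes are negative.
    if ref in ("S", "W"):
        value = -value
    return float(value)

# Hypothetical coordinates, not taken from any real photo.
lat = dms_to_decimal(((29, 1), (45, 1), (3600, 100)), "N")
lon = dms_to_decimal(((95, 1), (22, 1), (0, 1)), "W")
print(lat, lon)
```

Seconds are often stored with a denominator of 100 or more, so coordinates recovered this way can pin a location down far more precisely than a city or neighborhood.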
A recurring theme in data privacy is that information that is not obviously identifiable may still be used to identify someone. If this data doesn’t do the whole job, it narrows down possibilities to the point that other known information may complete the identification.
For example, the highly technical fields contained in an image could identify the camera equipment. The camera serial number directly identifies the camera, but other fields may indirectly identify the camera.
Similarly, an image without GPS data may still contain indirect location information. For example, there are fields for recording temperature, humidity, and atmospheric pressure. These fields used in combination with timestamps could identify a location, or at least narrow down the set of possible locations.
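Here is a rough sketch of how that narrowing might work. It assumes you already have weather observations for a set of candidate places at the photo's timestamp. The site names, readings, units, and tolerances are all invented for illustration; a real attempt would need actual weather records and tolerances suited to the sensor.

```python
from dataclasses import dataclass

@dataclass
class Weather:
    temperature_c: float
    humidity_pct: float
    pressure_hpa: float

# Hypothetical tolerances: how far a reading may differ and still count as a match.
TOLERANCE = Weather(temperature_c=2.0, humidity_pct=10.0, pressure_hpa=5.0)

def is_consistent(photo, observed, tol=TOLERANCE):
    """True if every reading in the photo is within tolerance of the observation."""
    return (
        abs(photo.temperature_c - observed.temperature_c) <= tol.temperature_c
        and abs(photo.humidity_pct - observed.humidity_pct) <= tol.humidity_pct
        and abs(photo.pressure_hpa - observed.pressure_hpa) <= tol.pressure_hpa
    )

# Readings taken from the photo's metadata (made-up values).
photo = Weather(temperature_c=12.0, humidity_pct=35.0, pressure_hpa=840.0)

# Observations at the photo's timestamp for candidate places (made-up values).
candidates = {
    "Site A": Weather(12.5, 38.0, 839.0),
    "Site B": Weather(12.0, 36.0, 1013.0),
    "Site C": Weather(25.0, 80.0, 1010.0),
}

remaining = [name for name, obs in candidates.items() if is_consistent(photo, obs)]
print(remaining)  # ['Site A']
```

No single field settles the question here. Site B matches on temperature and humidity, and only the pressure reading rules it out.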
There are many EXIF fields that are allowed to be arbitrarily long ASCII or Unicode (UTF-8) sequences. A program for editing EXIF data would allow someone to copy the contents of Moby Dick into one of these fields.
The next post describes a similar situation for medical images.
Clues in the photo itself
Stripping EXIF data from an image before making it public is a good idea both for privacy and for size. If a free text field does contain Moby Dick, you could make your image 1.2 MB smaller by removing it.
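As a minimal sketch of what stripping involves for a JPEG file: the EXIF block lives in an APP1 segment near the start of the file, identified by the bytes "Exif" followed by two null bytes. The function below, whose name is my own, walks the segments before the image data and drops any that carry that identifier. It assumes a well-formed file and leaves other metadata containers, such as XMP, untouched, so an established tool is a better choice in practice.

```python
import struct

def strip_exif(jpeg: bytes) -> bytes:
    """Return a copy of a JPEG byte string with its Exif APP1 segments removed."""
    if jpeg[:2] != b"\xff\xd8":
        raise ValueError("not a JPEG: missing start-of-image marker")
    out = bytearray(jpeg[:2])
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        marker = jpeg[pos + 1]
        if marker == 0xDA:
            # Start of scan: compressed image data follows, so copy the rest as is.
            out += jpeg[pos:]
            return bytes(out)
        # The two-byte big-endian length counts itself but not the marker.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        segment = jpeg[pos:pos + 2 + length]
        is_exif = marker == 0xE1 and segment[4:10] == b"Exif\x00\x00"
        if not is_exif:
            out += segment
        pos += 2 + length
    raise ValueError("no start-of-scan marker found")
```

To use it, read the file in binary mode, pass the bytes to strip_exif, and write the result to a new file. Comparing the two file sizes shows how much space the metadata was taking.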
However, it’s often possible to detect from the photo itself where the photo was taken. I stumbled on a YouTube channel of someone who identifies photos as a hobby. No doubt there are many such people. The host invites people to send in photos and he uses openly available information to track down where they were taken.
If you strip the precise time and location information from the metadata, someone may be able to infer approximate replacements from clues in the photo itself such as shadows or seasonal vegetation.
Ordinary people have no idea how much location information can be inferred from a photo. Neither do some people who ought to know better. There was a story a few months ago about a photo at a secret military location whose position was inferred from, among other clues, stars that faintly appeared in the sky near dusk.
Update: As noted in the comments, Facebook has a patent on a way to identify people from the pattern of dust on their camera lenses.