If you're following along and can't/don't want to remember the SQL syntax, use the examples from the post for LLM text-to-SQL context:
Q: Which photo has the highest number of faces?
A: SELECT SourceFile
FROM photos
WHERE RegionType IS NOT ''
ORDER BY length(RegionType) DESC
LIMIT 1;
Q: ...
You can also fetch and use the table schema with `sqlite3 exif.db .schema`
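For what it's worth, here is a minimal Python sketch of assembling that context: it reads the schema out of the database and puts the post's example Q/A pair in front of a new question. The `build_prompt` and `fetch_schema` helpers and the two-column demo table are assumptions for illustration, not something from the post, and actually sending the prompt to a model is left out.

```python
import sqlite3

# Example pair taken from the post; add more pairs as needed.
EXAMPLES = [
    (
        "Which photo has the highest number of faces?",
        "SELECT SourceFile\nFROM photos\nWHERE RegionType IS NOT ''\n"
        "ORDER BY length(RegionType) DESC\nLIMIT 1;",
    ),
]


def fetch_schema(conn: sqlite3.Connection) -> str:
    """Return the stored CREATE statements, roughly what `.schema` prints."""
    rows = conn.execute(
        "SELECT sql FROM sqlite_master WHERE sql IS NOT NULL ORDER BY name"
    ).fetchall()
    return "\n".join(row[0] + ";" for row in rows)


def build_prompt(conn: sqlite3.Connection, question: str) -> str:
    """Schema first, then the worked examples, then the open question."""
    parts = ["Schema:", fetch_schema(conn), ""]
    for q, a in EXAMPLES:
        parts += [f"Q: {q}", f"A: {a}", ""]
    parts += [f"Q: {question}", "A:"]
    return "\n".join(parts)


if __name__ == "__main__":
    # Hypothetical stand-in for exif.db with only two columns.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE photos (SourceFile TEXT, RegionType TEXT)")
    print(build_prompt(conn, "Which photos have no faces tagged?"))
```

With a real database you would replace the in-memory connection with `sqlite3.connect("exif.db")`.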
Although the article discusses EXIF, which is primarily hardware-created metadata, there is other metadata like Description and Keywords that I find very important (typically encoded as IPTC or XMP metadata, if I recall correctly).
This is especially true for older family photos that I scan: in the Description I can add what was written on the back (and/or front) of the photo. I use Keywords to identify who is in the photo as well as who I got the photo from (like a "Photo From Peggy" keyword).
In time, once you have quite a collection, it becomes handy to be able to search your Photos library using this metadata. Also, when exported, the metadata remains if you allow it to. So when I send off digital copies to relatives, they also have that additional information.
Exif data is really fascinating, and exiftool is really worth playing around with for an afternoon. The metadata on color space, focal distance, and aperture are all really useful tools for making photos look correct on an arbitrary display surface, and it is pretty fun to swap them out to see the effects on rendering.
As an aside, the location/time data is useful for some things but also kind of creepy, and I really wish there were more privacy considerations in how these pieces of metadata are handled. There was a period where I routinely set these fields to say the photo was taken in Pyongyang, North Korea, 100 m below sea level and one day in the future.
exiftool -GPSLatitude=39.0738 -GPSLatitudeRef=N -GPSLongitude=125.8198 -GPSLongitudeRef=E -GPSAltitude=-6 -GPSAltitudeRef="Below Sea Level" -AllDates="$(date -v +1d '+%Y-%m-%d %H:%M:%S')" FILENAME.jpg
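Note that the `date -v +1d` part of that command is BSD/macOS syntax (GNU date would want something like `-d '+1 day'`). If you want the same thing on any platform, a small Python sketch along these lines should do. `spoof_args` is a made-up helper name, the tag values are copied from the command, and it assumes `exiftool` is on your PATH if you uncomment the last line.

```python
import subprocess
from datetime import datetime, timedelta


def spoof_args(path, now=None):
    """Build the exiftool argument list from the command above,
    with the timestamp set to one day after `now`."""
    now = now or datetime.now()
    tomorrow = now + timedelta(days=1)
    return [
        "exiftool",
        "-GPSLatitude=39.0738",
        "-GPSLatitudeRef=N",
        "-GPSLongitude=125.8198",
        "-GPSLongitudeRef=E",
        "-GPSAltitude=-6",
        "-GPSAltitudeRef=Below Sea Level",
        "-AllDates=" + tomorrow.strftime("%Y-%m-%d %H:%M:%S"),
        path,
    ]


if __name__ == "__main__":
    args = spoof_args("FILENAME.jpg")
    print(args)
    # Uncomment to actually rewrite the file; by default exiftool
    # keeps the untouched original as FILENAME.jpg_original.
    # subprocess.run(args, check=True)
```

Passing the arguments as a list sidesteps shell quoting, which is why "Below Sea Level" needs no extra quotes here.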
> I really wish there were more privacy considerations in how these pieces of metadata are handled.
When you Export a photo from Apple's Photos, there is a checkbox you can toggle for Location Information. I generally have that unchecked so that it strips that EXIF data from the resulting exported image.
My favourite thing about EXIF is its role in Hollywood and the news, e.g. "The geolocation from the metadata of this image verifies that this happened at this place and at this time", and other such nonsense. Always makes me chuckle.
My least favourite thing is colour profiles and how those can confuse grown adults and bring them to tears.
Discussed at the time:
Exploring EXIF - https://news.ycombinator.com/item?id=37409524 - Sept 2023 (33 comments)
Fun fact: EXIF is simultaneously a JPEG format (the EXIF spec describes a compressed image file format based on JIF, the base JPEG file format described in the JPEG spec; also, EXIF stands for "Exchangeable Image File Format", suggesting that the authors saw it as a new file format) and a TIFF format (EXIF metadata is actually an embedded TIFF, which can be parsed with tiffdump; the EXIF spec also describes uncompressed image file formats, which are TIFF with embedded EXIF metadata, which is also TIFF...)
It's a less fun fact when you have to write a parser for it.
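To make the TIFF-inside-JPEG point concrete, here's a rough Python sketch that walks the JPEG segments, finds the Exif APP1 segment, and reads the embedded TIFF header (byte order, the magic number 42, and the offset of the first IFD). It's a toy under simplifying assumptions: the function name is made up, it stops at start-of-scan, and it doesn't decode any tags.

```python
import struct


def find_exif_tiff_header(jpeg: bytes):
    """Return (byte_order, first_ifd_offset) for the TIFF blob inside the
    Exif APP1 segment, or None if no such segment is found."""
    if jpeg[:2] != b"\xff\xd8":  # SOI marker
        raise ValueError("not a JPEG")
    pos = 2
    while pos + 4 <= len(jpeg):
        if jpeg[pos] != 0xFF:
            raise ValueError("lost sync with JPEG markers")
        marker = jpeg[pos + 1]
        if marker == 0xDA:  # start of scan: metadata segments are behind us
            return None
        # Segment length is big-endian and counts its own two bytes.
        (length,) = struct.unpack(">H", jpeg[pos + 2:pos + 4])
        payload = jpeg[pos + 4:pos + 2 + length]
        if marker == 0xE1 and payload[:6] == b"Exif\x00\x00":
            tiff = payload[6:]
            order = tiff[:2]
            if order == b"II":    # little-endian ("Intel")
                endian = "<"
            elif order == b"MM":  # big-endian ("Motorola")
                endian = ">"
            else:
                raise ValueError("bad TIFF byte-order mark")
            magic, ifd_offset = struct.unpack(endian + "HI", tiff[2:8])
            if magic != 42:
                raise ValueError("bad TIFF magic number")
            return order.decode("ascii"), ifd_offset
        pos += 2 + length
    return None


if __name__ == "__main__":
    # Synthetic file: SOI, one Exif APP1 segment holding a bare TIFF header, EOI.
    payload = b"Exif\x00\x00" + b"II" + struct.pack("<HI", 42, 8)
    app1 = b"\xff\xe1" + struct.pack(">H", len(payload) + 2) + payload
    print(find_exif_tiff_header(b"\xff\xd8" + app1 + b"\xff\xd9"))
```

From the returned offset onward it is ordinary TIFF IFD parsing, which is where the less fun part starts.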
All the various metadata formats are kind of weird. IIM (less popular now but still sometimes seen in JPEG files; it was originally for news organizations) is even weirder than Exif. My favourite part is how you specify it's UTF-8 by adding the ISO 2022 escape code in a field. Like wut.
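For the curious, the trick looks roughly like this in Python. As far as I know the field in question is record 1, dataset 90 (exiftool calls it CodedCharacterSet) and the UTF-8 value is the three bytes ESC % G. The helper names are invented, the latin-1 fallback is my own assumption, and the sketch ignores extended (very long) datasets.

```python
import struct

UTF8_ESCAPE = b"\x1b%G"  # ISO 2022 escape sequence announcing UTF-8


def dataset(record: int, number: int, data: bytes) -> bytes:
    """Encode one IIM dataset: 0x1C marker, record number, dataset number,
    big-endian 16-bit length, then the data itself."""
    return struct.pack(">BBBH", 0x1C, record, number, len(data)) + data


def parse(blob: bytes):
    """Yield (record, number, data) for each dataset in an IIM block."""
    pos = 0
    while pos + 5 <= len(blob):
        marker, record, number, length = struct.unpack(">BBBH", blob[pos:pos + 5])
        if marker != 0x1C:
            raise ValueError("bad IIM tag marker")
        yield record, number, blob[pos + 5:pos + 5 + length]
        pos += 5 + length


def decode_keywords(blob: bytes):
    """Decode 2:25 (Keywords), using UTF-8 only if 1:90 asks for it."""
    encoding = "latin-1"  # assumption: fallback when 1:90 is absent
    keywords = []
    for record, number, data in parse(blob):
        if (record, number) == (1, 90) and data == UTF8_ESCAPE:
            encoding = "utf-8"
        elif (record, number) == (2, 25):
            keywords.append(data.decode(encoding))
    return keywords


if __name__ == "__main__":
    blob = (
        dataset(1, 90, UTF8_ESCAPE)
        + dataset(2, 25, "Photo From Peggy".encode("utf-8"))
        + dataset(2, 25, "Großmutter".encode("utf-8"))
    )
    print(decode_keywords(blob))
```

Drop the 1:90 dataset from the demo and the second keyword comes out as mojibake, which is exactly the failure mode the escape code exists to prevent.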