Collecting my attempts to improve at tech, art, and life

Pretty File Summaries with Rich and ExifTool

Tags: files python perlish rich exiftool tools

attachments/img/2021/cover-2021-02-06.jpg
Hoku hopes for scraps

A while back I shared how I use ExifTool to get extensive metadata for any file. I want to make that info dump pretty with Rich, a text formatting library for Python.

“But Brian,” I hear you cry. “ExifTool is Perl. Why would I want to use both Perl and Python?”

Because it’s fun, obviously.

You want a “real” reason? Okay fine. I haven’t found anything that can get the depth of file information I get from ExifTool. I haven’t found a formatting library that’s as pleasant to use as Rich — maybe TTY Toolkit?

Besides — ExifTool is a standalone command line tool. We don’t need to write any Perl to use it. Heck, we don’t even need to figure out the system calls. Sven Marnach is way ahead of us with the extremely helpful pyexiftool.

Rich and pyexiftool make Python an easy choice for this task.
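For a sense of what pyexiftool saves us from, here's a rough sketch of calling the exiftool command directly and parsing its JSON output. The helper names (build_command, parse_output, read_metadata) are mine, not pyexiftool's, and read_metadata assumes exiftool is on your PATH. The -json, -G, and -n flags ask ExifTool for JSON output, group-prefixed tag names, and unformatted numeric values.

```python
import json
import subprocess


def build_command(filename):
    """Return the exiftool command line for a JSON metadata dump."""
    # -json: emit JSON, -G: prefix tags with their group, -n: raw numeric values
    return ["exiftool", "-json", "-G", "-n", filename]


def parse_output(text):
    """Return the metadata dictionary from exiftool's JSON output."""
    # exiftool prints a JSON array holding one object per file
    return json.loads(text)[0]


def read_metadata(filename):
    """Run exiftool on a file (assumes exiftool is installed and on PATH)."""
    result = subprocess.run(
        build_command(filename), capture_output=True, text=True, check=True
    )
    return parse_output(result.stdout)


# Hand-written sample shaped like exiftool's output, not real output
sample = '[{"SourceFile": "photo.jpg", "File:FileType": "JPEG"}]'
print(parse_output(sample))
```

That's not much code, but pyexiftool also handles the details I'd rather not think about, so I'm sticking with it.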

Setting up

If you want to play along at home, make sure you have the dependencies.

$ brew install exiftool
$ pip install pyexiftool rich typer

Typer simplifies turning this random idea into a useful command line tool.

NOTE

If you’re already a fan of Perl, consider cpanm instead of Homebrew.

$ cpanm Image::ExifTool

Now you can use Image::ExifTool in your own Perl projects.

Some scaffolding

Even though I’m the only user, I still need to figure out how I plan to use it. At minimum? I hand my script a filename. It hands me metadata.

richexif FILENAME [OPTIONS]

I can hook some minimal Typer argument handling around that flow.

richexif.py
#!/usr/bin/env python

import logging

from rich.logging import RichHandler
import typer

logging.basicConfig(
    level=logging.DEBUG,
    format="%(message)s",
    datefmt="[%X]",
    handlers=[RichHandler()]
)


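def main(filename: str):
    """Display nicely-formatted file metadata."""
    logging.debug("filename: %s", filename)


if __name__ == "__main__":
    # Let Typer build the command line interface from main's signature
    typer.run(main)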

Can I run it?

$ chmod 755 richexif.py
$ ./richexif.py hoku-hopes-for-snacksjpg.jpg

I can! What happens if I use it wrong?

$ ./richexif.py
Usage: richexif.py [OPTIONS] FILENAME
Try 'richexif.py --help' for help.

Error: Missing argument 'FILENAME'.

I get an error message telling me what richexif.py needs to do its thing. Nice.

I confirmed that Typer handles the CLI bits, and Rich handles the formatting. Now for pyexiftool.

Oh and I’ll skip logging output from here on. Rich’s logging handler output is a joy to look at, but really that stuff is for me. For you it’ll just add noise.

Some metadata

I need exiftool, of course. Plus a Rich Console object, masterminding the display details for my terminal.

import exiftool
from rich.console import Console

console = Console()

exiftool’s get_metadata grabs everything ExifTool sees about a file. It also provides methods for ExifTool tags, but I won’t mess with them today. Tags — the official name for our metadata keys — are most useful when you already know what you’re looking for. We’re just checking stuff out.

For now, a little abstraction layer over pyexiftool’s ExifTool.

def get_metadata(filename):
    """Return a dictionary of file metadata."""
    with exiftool.ExifTool() as et:
        return et.get_metadata(filename)

main gets the metadata and asks console to print it.

def main(filename: str):
    """Display nicely-formatted file metadata."""
    metadata = get_metadata(filename)
    console.print(metadata)

And here’s what that looks like.

{
    'SourceFile': 'hoku-hopes-for-snacks.jpg',
    'ExifTool:ExifToolVersion': 12.15,
    'File:FileName': 'hoku-hopes-for-snacks.jpg',
    'File:Directory': '.',
    'File:FileSize': 918330,
    'File:FileModifyDate': '2021:02:06 00:54:29-08:00',
    'File:FileAccessDate': '2021:02:06 11:30:33-08:00',
    'File:FileInodeChangeDate': '2021:02:06 11:30:33-08:00',
    'File:FilePermissions': 775,
    'File:FileType': 'JPEG',
    …skipping 62 lines…
    'Composite:ScaleFactor35efl': 6.04651162790698,
    'Composite:ShutterSpeed': 0.05,
    'Composite:GPSLatitude': 47.5750857997222,
    'Composite:GPSLongitude': -122.386441,
    'Composite:CircleOfConfusion': '0.00496918925785101',
    'Composite:FOV': 69.3903656740024,
    'Composite:FocalLength35efl': 26,
    'Composite:GPSPosition': '47.5750857997222 -122.386441',
    'Composite:HyperfocalDistance': 2.48061927751922,
    'Composite:LightValue': 3.81378119121704
}

Holy crap that’s a lot. Some of it could be considered sensitive information — unless you read my now page. But it’s all there! Even in the snipped version you can learn a lot. Hello from my Windows partition in West Seattle during February of 2021!

TIP

Uncomfortable sharing that much with every photo you upload? You can scrub those tags right out. With ExifTool, of course.
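A rough sketch of that scrubbing, shelling out to exiftool from Python. scrub_command and scrub are hypothetical helper names; the -all= and -gps:all= arguments are ExifTool's own syntax for deleting every tag it can, or only the GPS group. By default ExifTool keeps the untouched file alongside the new one, with _original appended to its name.

```python
import subprocess


def scrub_command(filename, gps_only=False):
    """Return an exiftool command that deletes metadata from a file."""
    # "-gps:all=" clears only the GPS group; "-all=" clears everything it can
    target = "-gps:all=" if gps_only else "-all="
    return ["exiftool", target, filename]


def scrub(filename, gps_only=False):
    """Run the scrub (assumes exiftool is installed and on PATH)."""
    subprocess.run(scrub_command(filename, gps_only), check=True)


# Only build the command here; call scrub() when you mean it
print(scrub_command("hoku-hopes-for-snacks.jpg", gps_only=True))
```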

But back to the other gripe about all this metadata. It’s way too much for me to take in all at once. I need some kind of filter!

Filtering the firehose

def filter_metadata(metadata, filter):
    """Return a copy of the metadata where fields contain the substring `filter`."""
    return {k: v for k, v in metadata.items() if filter in k}

There’s no kind of transformation here. If a field contains the exact substring described in filter, use it.
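To make the matching rule concrete, here's a small sketch run against a handful of fields from this post's sample photo. The function is repeated so the snippet runs on its own. The match is case-sensitive and checks the whole key, group prefix included.

```python
def filter_metadata(metadata, filter):
    """Return a copy of the metadata where fields contain the substring `filter`."""
    return {k: v for k, v in metadata.items() if filter in k}


# A few fields from the sample photo's metadata
sample = {
    "File:FileType": "JPEG",
    "File:ImageWidth": 3672,
    "EXIF:ImageWidth": 4032,
    "Composite:ImageSize": "3672 2066",
}

print(filter_metadata(sample, "Image"))
# {'File:ImageWidth': 3672, 'EXIF:ImageWidth': 4032, 'Composite:ImageSize': '3672 2066'}

print(filter_metadata(sample, "image"))
# {}

print(filter_metadata(sample, "File:"))
# {'File:FileType': 'JPEG', 'File:ImageWidth': 3672}
```

That last call shows a handy side effect: filtering on a group prefix like File: gets you everything in that group.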

Adding a Typer Option lets us ask for a filter from the command line.

from typing import Optional


def main(
    filename: str,
    filter: Optional[str] = typer.Option(
        None, help="Substring to restrict displayed fields"
    ),
):
    """Display nicely-formatted file metadata."""
    metadata = get_metadata(filename)

    if filter:
        metadata = filter_metadata(metadata, filter)

    console.print(metadata)

If we use --filter, we should only get matching tags. Leaving out the filter gets us everything.

Try it out!

$ ./richexif.py hoku-hopes-for-snacks.jpg --filter=Image

Now that I’m not overwhelmed by the quantity of output, I’m a little underwhelmed by the quality.

{
    'File:ImageWidth': 3672,
    'File:ImageHeight': 2066,
    'EXIF:ImageWidth': 4032,
    'EXIF:ImageHeight': 2268,
    'EXIF:ExifImageWidth': 4032,
    'EXIF:ExifImageHeight': 2268,
    'EXIF:ImageUniqueID': 'J12LLKL00SM',
    'EXIF:ThumbnailImage': '(Binary data 6788 bytes, use -b option to extract)',
    'Composite:ImageSize': '3672 2066'
}

It’s nice. Don’t get me wrong. But all we’ve added to default exiftool behavior is some color.

I’ve played with Rich a bit. I know we can do better.

A metadata table!

Rich lets us create and display tables in the terminal.

from rich.table import Table

We need to build the table, defining columns and adding values row by row.

def file_table(filename, metadata):
    """Return a Rich Table showing the metadata for a file."""
    table = Table("Field", "Value", title=filename)

    for key, value in metadata.items():
        table.add_row(key, str(value))

    return table

WARNING

Hey, don’t miss that str(value)! Rich tables need strings or other Rich renderables, and take nothing for granted with the values you give them. A bare number won’t be converted for you, so add_row complains until you do it yourself.
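Here's a small sketch of what that warning guards against. As far as I can tell, Table.add_row accepts strings and other Rich renderables, and raises rich.errors.NotRenderableError for a bare number. The failing call goes to a throwaway table so it can't leave the real one half-filled.

```python
from rich.console import Console
from rich.errors import NotRenderableError
from rich.table import Table

# Throwaway table, used only to trigger the error
scratch = Table("Field", "Value")

try:
    # An int is neither a string nor a Rich renderable
    scratch.add_row("File:FileSize", 918330)
    rejected = False
except NotRenderableError:
    rejected = True

print(f"bare int rejected: {rejected}")

# Converting each value first keeps Rich happy
table = Table("Field", "Value", title="demo")
for key, value in {"File:FileSize": 918330, "Composite:ShutterSpeed": 0.05}.items():
    table.add_row(key, str(value))

Console().print(table)
```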

def main(...):
    """Display nicely-formatted file metadata."""
    ...

    if filter:
        metadata = filter_metadata(metadata, filter)

    table = file_table(filename, metadata)
    console.print(table)

What does our filtered view look like as a table?

$ ./richexif.py hoku-hopes-for-snacksjpg.jpg --filter=Image
                        hoku-hopes-for-snacksjpg.jpg                         
┏━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Field                ┃ Value                                              ┃
┡━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ File:ImageWidth      │ 3672                                               │
│ File:ImageHeight     │ 2066                                               │
│ EXIF:ImageWidth      │ 4032                                               │
│ EXIF:ImageHeight     │ 2268                                               │
│ EXIF:ExifImageWidth  │ 4032                                               │
│ EXIF:ExifImageHeight │ 2268                                               │
│ EXIF:ImageUniqueID   │ J12LLKL00SM                                        │
│ EXIF:ThumbnailImage  │ (Binary data 6788 bytes, use -b option to extract) │
│ Composite:ImageSize  │ 3672 2066                                          │
└──────────────────────┴────────────────────────────────────────────────────┘

Pretty nifty.

A metadata tree!

We can do more than tables though. With that type:tag split, there’s kind of a hierarchy. We could add a column for the tag type, but why not use a Tree?

from rich.tree import Tree

Hang on a second while we build our little tree with its branches.

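def file_tree(filename, metadata):
    """Return a Rich Tree showing the metadata for a file, grouped by tag type."""
    tree = Tree(f"[bold]{filename}")
    branches = {}
    # "EXIF:ImageWidth" becomes (["EXIF", "ImageWidth"], value)
    tagged_values = [(k.split(":"), v) for k, v in metadata.items()]

    for tags, value in tagged_values:
        root_tag = tags[0]

        # Create each type's branch the first time we see it
        if root_tag not in branches:
            branches[root_tag] = tree.add(f"[bold]{root_tag}")

        if len(tags) == 2:
            branches[root_tag].add(f"[italic]{tags[1]}:[/italic] {value}")
        else:
            # No type prefix (like SourceFile): hang the value off its own branch
            branches[root_tag].add(str(value))

    return tree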
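The heart of file_tree is that type:tag split. Stripped of Rich, the grouping can be sketched with plain dictionaries. group_by_type is a hypothetical helper for illustration, not part of the script. It uses str.partition, so a key with no colon, like SourceFile, becomes its own group with an empty tag name.

```python
def group_by_type(metadata):
    """Return metadata regrouped as {type: {tag: value}}."""
    groups = {}
    for key, value in metadata.items():
        # "EXIF:ImageWidth" -> ("EXIF", ":", "ImageWidth")
        # "SourceFile"      -> ("SourceFile", "", "")
        tag_type, _, tag = key.partition(":")
        groups.setdefault(tag_type, {})[tag] = value
    return groups


sample = {
    "SourceFile": "hoku-hopes-for-snacks.jpg",
    "File:ImageWidth": 3672,
    "EXIF:ImageWidth": 4032,
    "EXIF:ImageHeight": 2268,
}

print(group_by_type(sample))
# {'SourceFile': {'': 'hoku-hopes-for-snacks.jpg'}, 'File': {'ImageWidth': 3672}, 'EXIF': {'ImageWidth': 4032, 'ImageHeight': 2268}}
```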

Except now we have two ways to display metadata. Three, if you count the dictionary we started with. How are we going to show this tree without discarding our table code?

For now, a callback table that says what to call for each of the options.

DISPLAYS = {
    "table": lambda f, m: file_table(f, m),
    "tree": lambda f, m: file_tree(f, m),
}

We don’t need to use lambdas here. Functions can be passed around same as any other value. But if I wrap them in a lambda I can build my constant table before Python knows the functions exist.
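A tiny sketch of that difference, using stand-in names (render_table, EAGER, LAZY) rather than the script's own. Referencing a function directly looks the name up right away, while a lambda's body waits until the lambda is called.

```python
eager_error = None

try:
    # Direct reference: Python looks up render_table right now, and fails
    EAGER = {"table": render_table}
except NameError as error:
    eager_error = str(error)

# Wrapped in a lambda: render_table isn't looked up until the lambda runs
LAZY = {"table": lambda name: render_table(name)}


def render_table(name):
    """Stand-in for a real display function."""
    return f"table for {name}"


print(eager_error)
print(LAZY["table"]("photo.jpg"))
```

The other way around this is to define DISPLAYS after the functions it points to. I like my constants up top, so lambdas it is.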

Typer uses callback functions to validate options. They do any processing or checks they need to, then return the supplied value if everything goes well.

def validate_display(value):
    """Return value if valid, or panic if it isn't."""
    if value not in DISPLAYS:
        raise typer.BadParameter(f"Format must be one of: {', '.join(DISPLAYS)}")
    return value

Add the --display Option, making sure to point Typer at the callback. main itself knows the value is safe, or the script never would have reached it. So I can grab the displayer and call it without fear of consequence.

def main(
    ...
    display: str = typer.Option(
        "table",
        help="How to display the metadata",
        callback=validate_display,
    ),
):
    """Display nicely-formatted file metadata."""
    ...

    displayer = DISPLAYS[display]
    output = displayer(filename, metadata)
    console.print(output)

Okay! What do we have now?

$ ./richexif.py hoku-hopes-for-snacks.jpg --filter=Image --display=tree
hoku-hopes-for-snacks.jpg
├── File
│   ├── ImageWidth: 3672
│   └── ImageHeight: 2066
├── EXIF
│   ├── ImageWidth: 4032
│   ├── ImageHeight: 2268
│   ├── ExifImageWidth: 4032
│   ├── ExifImageHeight: 2268
│   ├── ImageUniqueID: J12LLKL00SM
│   └── ThumbnailImage: (Binary data 6788 bytes, use -b option to extract)
└── Composite
    └── ImageSize: 3672 2066

Oooooh.

Anyways, that’s what I wanted to show you. Got plenty more ideas for mashing ExifTool and Rich together, as I’m sure you can imagine.


Added to vault 2024-01-15. Updated on 2024-02-02