Get exif data from a photo in PDF??

geordiemick00

Posts: 0

Free Member

Topic starter

I've been sent a PDF and would love to get the photo out of it and see where it was taken etc. For legal reasons I can't say too much, but the info it contains could put a stop to quite a nasty employment tribunal.

Posted : 09/09/2023 1:42 pm

bearnecessities

Posts: 12329

Full Member

Oh go on. Pretty please.

Posted : 09/09/2023 1:46 pm

scuttler

Posts: 6874

Full Member

Not a forensics expert but I expect the EXIF data is long gone as it’s been converted into something very different.

Posted : 09/09/2023 1:46 pm

oldtennisshoes

Posts: 4985

Full Member

Reverse image lookup?

Posted : 09/09/2023 1:51 pm

tomhoward

Posts: 20675

Ask the person who sent you the pdf?

Posted : 09/09/2023 1:54 pm

StirlingCrispin

Posts: 3131

Free Member

What is the source for the PDF?

Is it a Word file converted to PDF, etc.?

First go: if you have MS Word, open the PDF file in it and see what happens. Then go into the docx to see what data sits in the jpg file.

If you get no better offers message me and I'll fire up my work laptop for the PDF software and have a play.

Posted : 09/09/2023 2:03 pm

sirromj

Posts: 8771

Full Member

There are online tools to extract images, could try that?

Not sure if there are any apps to do it.

I've used the Coherent PDF command line tools to extract images. Sometimes they come out as hundreds of PBM or PPM files. If you're lucky it spits out a jpeg, depending on, I guess, what was used to create the PDF and how the images were placed within it.

Posted : 09/09/2023 2:05 pm

joeyr

Posts: 99

Free Member

ChatGPT says:

Extracting EXIF data from an image embedded within a PDF file can be done using various methods and tools. Here's a step-by-step guide using Python and some popular libraries:

Install Required Libraries:
You will need to install the following Python libraries if you haven't already:
PyMuPDF (for PDF extraction)
Pillow (for image processing)
ExifRead (for reading EXIF data)

You can install them using pip:

bash

pip install PyMuPDF Pillow ExifRead

Extract Image from PDF:
Use PyMuPDF to extract the image from the PDF. Here's a Python script to do this:

python

import fitz # PyMuPDF
import io
from PIL import Image

pdf_path = "your_pdf_file.pdf"
pdf_document = fitz.open(pdf_path)

# Specify the page number (e.g., 0 for the first page) containing the image
page_number = 0
page = pdf_document.load_page(page_number)

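# get_images() lists the images on the page; the first field of each entry is the xref
xref = page.get_images()[0][0]
base_image = pdf_document.extract_image(xref)
image_data = base_image["image"]  # raw bytes of the embedded image

# Load the image data into Pillow
image = Image.open(io.BytesIO(image_data))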
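
Whether any EXIF survives depends on how the photo was embedded in the PDF. As a rough check before going further: a JPEG keeps its EXIF in an APP1 segment whose payload starts with the bytes "Exif" followed by two zero bytes, so the extracted image_data can be scanned for that. The sketch below is a simplified parser that assumes a well-formed JPEG header, and the function name has_exif_segment is made up for this example.

```python
import struct

def has_exif_segment(data: bytes) -> bool:
    """Return True if `data` looks like a JPEG with an APP1 'Exif' segment."""
    if not data.startswith(b"\xff\xd8"):
        return False  # no JPEG start-of-image marker (e.g. PNG or PPM data)
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return False  # not at a marker: header is malformed, give up
        marker = data[pos + 1]
        if marker in (0xDA, 0xD9):
            return False  # start of scan / end of image: no more metadata
        # Segment length is big-endian and includes its own two bytes
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xE1 and data[pos + 4:pos + 10] == b"Exif\x00\x00":
            return True
        pos += 2 + length
    return False
```

If has_exif_segment(image_data) comes back False, the image was most likely re-encoded or stripped when the PDF was made, and the later steps will print nothing.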

Extract EXIF Data from Image:
Once you have extracted the image and saved it to disk, you can read its EXIF data with the ExifRead library. Here's how:

python

import exifread

# Open the image
with open("output_image.jpg", "rb") as image_file:
    # Read EXIF data
    tags = exifread.process_file(image_file)

# Print EXIF data
for tag, value in tags.items():
    print(f"{tag}: {value}")

Replace "output_image.jpg" with the path to the image you extracted from the PDF.
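
Since the goal is to see where the photo was taken: EXIF stores GPS coordinates as degrees, minutes and seconds plus a N/S/E/W reference, so they need converting before they can be pasted into a map. A minimal sketch of that conversion follows. The function name dms_to_decimal and the sample values are hypothetical, and the note about ExifRead's tag names and Ratio objects is from memory, so treat it as an assumption to check against your installed version.

```python
from fractions import Fraction

def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) and a N/S/E/W reference to signed decimal degrees."""
    degrees, minutes, seconds = (Fraction(x) for x in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by convention
    if ref.strip().upper() in ("S", "W"):
        decimal = -decimal
    return float(decimal)

# Hypothetical values for illustration only.
# With ExifRead the inputs would (assumption) come from something like:
#   [Fraction(r.num, r.den) for r in tags["GPS GPSLatitude"].values]
#   str(tags["GPS GPSLatitudeRef"])
print(dms_to_decimal((54, 58, Fraction(1234, 100)), "N"))
print(dms_to_decimal((1, 36, 30), "W"))
```

If the tags dictionary has no GPS entries at all, the camera or phone had location turned off or the data was stripped along the way.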

Complete Code Example:
Here's a complete code example that combines the steps above:

python

import fitz # PyMuPDF
import io
from PIL import Image
import exifread

# Open the PDF document
pdf_path = "your_pdf_file.pdf"
pdf_document = fitz.open(pdf_path)

# Specify the page number (e.g., 0 for the first page) containing the image
page_number = 0
page = pdf_document.load_page(page_number)

# Extract the image
xref = page.get_images()[0][0]
base_image = pdf_document.extract_image(xref)
image_data = base_image["image"]

# Load the image data into Pillow (optional, e.g. to check size and format)
image = Image.open(io.BytesIO(image_data))

# Save the raw bytes; re-saving through Pillow would drop the EXIF block
with open("output_image.jpg", "wb") as out_file:
    out_file.write(image_data)

# Open the image file and read EXIF data
with open("output_image.jpg", "rb") as image_file:
    tags = exifread.process_file(image_file)

# Print EXIF data
for tag, value in tags.items():
    print(f"{tag}: {value}")

This code extracts the first image from the specified page of the PDF file, then reads and prints whatever EXIF data that image still carries.
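
If you don't know which page or which image holds the photo, the same idea can be looped over the whole document. This is only a sketch: iter_pdf_images and dump_exif are names invented here, it assumes PyMuPDF's get_page_images and extract_image behave as in the example above, and it reads the EXIF straight from the bytes in memory rather than saving files.

```python
import io

def iter_pdf_images(doc):
    """Yield (page_index, xref, ext, raw_bytes) for each distinct image in an open PyMuPDF document."""
    seen = set()
    for page_index in range(len(doc)):
        for entry in doc.get_page_images(page_index):
            xref = entry[0]  # first field of each entry is the image's xref
            if xref in seen:
                continue  # the same image can be referenced from several pages
            seen.add(xref)
            info = doc.extract_image(xref)
            yield page_index, xref, info["ext"], info["image"]

def dump_exif(pdf_path):
    """Print the EXIF tags of every image in the PDF (needs PyMuPDF and ExifRead installed)."""
    import fitz  # PyMuPDF
    import exifread
    doc = fitz.open(pdf_path)
    for page_index, xref, ext, data in iter_pdf_images(doc):
        tags = exifread.process_file(io.BytesIO(data))
        print(f"page {page_index}, xref {xref}, format {ext}: {len(tags)} tags")
        for tag, value in tags.items():
            print(f"  {tag}: {value}")

# Usage (path is a placeholder):
# dump_exif("your_pdf_file.pdf")
```

Images reported with a format other than jpeg have usually been converted on the way into the PDF, so an empty tag list for those is expected.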

Posted : 09/09/2023 2:22 pm
