
Get exif data from a photo in PDF??

Posts: 0
Free Member
Topic starter
 

I've been sent a PDF and would love to get the photo out of it and see where it was taken etc. Due to legal reasons I can't say too much, but the info contained could put a stop to quite a nasty employment tribunal.


 
Posted : 09/09/2023 1:42 pm
Posts: 12329
Full Member
 

Oh go on. Pretty please.


 
Posted : 09/09/2023 1:46 pm
Posts: 6874
Full Member
 

Not a forensics expert but I expect the EXIF data is long gone as it’s been converted into something very different.


 
Posted : 09/09/2023 1:46 pm
Posts: 4985
Full Member
 

Reverse image lookup?


 
Posted : 09/09/2023 1:51 pm
Posts: 20675
 

Ask the person who sent you the pdf?


 
Posted : 09/09/2023 1:54 pm
Posts: 3131
Free Member
 

What is the source for the PDF?

Is it a Word file converted to PDF, etc.?

First go: If you have MS Word - click to open the PDF file and see what happens. Then go into the docx to see what data sits in the jpg file.

If you get no better offers message me and I'll fire up my work laptop for the PDF software and have a play.
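
For anyone trying the Word route above: a .docx file is a zip archive, and embedded pictures normally sit under word/media/ inside it. A rough sketch of pulling them out with Python's standard library follows. The function name read_docx_media is made up for illustration, and there is no guarantee that Word keeps the original JPEG bytes (or their EXIF) when it converts a PDF.

```python
import zipfile


def read_docx_media(docx_path):
    """Return {archive name: raw bytes} for every file under word/media/.

    docx_path may be a filename or a binary file-like object.
    """
    with zipfile.ZipFile(docx_path) as archive:
        return {
            name: archive.read(name)
            for name in archive.namelist()
            if name.startswith("word/media/")
        }
```

Writing each value out to disk unchanged gives you files you can then point an EXIF reader at.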


 
Posted : 09/09/2023 2:03 pm
Posts: 8771
Full Member
 

There are online tools to extract images, could try that?

Not sure if there are any apps to do it.

I've used Coherent PDF command line tools to extract images. Sometimes they come out as hundreds of PBM or PPM files. If you're lucky it spits out a JPEG, depending on, I guess, what was used to create the PDF and how the images were placed within it.
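
Whether you got lucky is easy to check once you have the extracted bytes. EXIF lives in an APP1 segment near the start of a JPEG, so a PBM/PPM dump (or a JPEG that was re-encoded without metadata) won't have one. Here is a minimal sketch using only the standard library. The function name has_exif_segment is an illustration, and it only checks that the segment exists, not that it holds anything useful.

```python
import struct


def has_exif_segment(data: bytes) -> bool:
    """Return True if data is a JPEG whose headers include an Exif APP1 segment."""
    if not data.startswith(b"\xff\xd8"):
        return False  # no JPEG start-of-image marker (e.g. PNG, PPM, raw pixels)
    pos = 2
    while pos + 4 <= len(data):
        if data[pos] != 0xFF:
            return False  # lost track of the marker structure
        marker = data[pos + 1]
        if marker == 0xFF:
            pos += 1  # fill byte before a marker, skip it
            continue
        if marker in (0xDA, 0xD9):
            return False  # start of scan or end of image: header segments are over
        (length,) = struct.unpack(">H", data[pos + 2:pos + 4])
        if marker == 0xE1 and data[pos + 4:pos + 10] == b"Exif\x00\x00":
            return True
        pos += 2 + length  # the length field counts itself but not the marker
    return False
```

Usage would be something like has_exif_segment(open("extracted.jpg", "rb").read()), where extracted.jpg is whatever file your extraction tool produced.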


 
Posted : 09/09/2023 2:05 pm
Posts: 99
Free Member
 

ChatGPT says:

Extracting EXIF data from an image embedded within a PDF file can be done using various methods and tools. Here's a step-by-step guide using Python and some popular libraries:

Install Required Libraries:
You will need to install the following Python libraries if you haven't already:
PyMuPDF (PyMuPDF is used for PDF extraction)
Pillow (Pillow is used for image processing)
ExifRead (ExifRead is used for reading EXIF data)

You can install them using pip:

bash

pip install PyMuPDF Pillow ExifRead

Extract Image from PDF:
Use PyMuPDF to extract the image from the PDF. Here's a Python script to do this:

python

import fitz  # PyMuPDF
import io
from PIL import Image

pdf_path = "your_pdf_file.pdf"
pdf_document = fitz.open(pdf_path)

# Specify the page number (e.g., 0 for the first page) containing the image
page_number = 0
page = pdf_document.load_page(page_number)

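# page.get_images() lists the images on the page; item 0 of each entry is the xref
xref = page.get_images()[0][0]
base_image = pdf_document.extract_image(xref)
image_data = base_image["image"]

# Save the raw bytes untouched so any embedded EXIF survives
# (this assumes the embedded image is a JPEG; check base_image["ext"])
with open("output_image.jpg", "wb") as output_file:
    output_file.write(image_data)

# Load the image data into Pillow
image = Image.open(io.BytesIO(image_data))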

Extract EXIF Data from Image:
Once you have the extracted image saved as a file, you can read its EXIF data using the ExifRead library. Here's how:

python

import exifread

# Open the image
with open("output_image.jpg", "rb") as image_file:
    # Read EXIF data
    tags = exifread.process_file(image_file)

# Print EXIF data
for tag, value in tags.items():
    print(f"{tag}: {value}")

Replace "output_image.jpg" with the path to the image you extracted from the PDF.

Complete Code Example:
Here's a complete code example that combines the steps above:

python

import fitz  # PyMuPDF
import io
from PIL import Image
import exifread

# Open the PDF document
pdf_path = "your_pdf_file.pdf"
pdf_document = fitz.open(pdf_path)

# Specify the page number (e.g., 0 for the first page) containing the image
page_number = 0
page = pdf_document.load_page(page_number)

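# Extract the image
# page.get_images() lists the images on the page; item 0 of each entry is the xref
xref = page.get_images()[0][0]
base_image = pdf_document.extract_image(xref)
image_data = base_image["image"]

# Load the image data into Pillow (optional, handy for checking size and format)
image = Image.open(io.BytesIO(image_data))

# Save the raw bytes rather than calling image.save(), which re-encodes the
# image and drops EXIF unless it is passed in explicitly
# (this assumes the embedded image is a JPEG; check base_image["ext"])
with open("output_image.jpg", "wb") as output_file:
    output_file.write(image_data)

# Open the image file and read EXIF data
with open("output_image.jpg", "rb") as image_file:
    tags = exifread.process_file(image_file)

# Print EXIF data
for tag, value in tags.items():
    print(f"{tag}: {value}")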

This code extracts the first image on the specified page of the PDF and prints its EXIF data, provided the PDF still holds the original JPEG with its metadata intact.
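
Since the original question was where the photo was taken: if the EXIF does survive, the location is stored as degrees, minutes and seconds plus an N/S/E/W reference. Below is a hedged sketch of turning that into decimal degrees. The helper names are invented for illustration. The "GPS GPSLatitude" style keys are the ones ExifRead uses, but the handling of the tag objects (.values for the coordinates, str() for the reference letter) is an assumption worth checking against your installed version.

```python
from fractions import Fraction


def _to_float(value):
    # ExifRead ratios expose num/den; Fractions and plain numbers go through float()
    if hasattr(value, "num") and hasattr(value, "den"):
        return value.num / value.den
    return float(value)


def dms_to_decimal(dms, ref):
    """Convert (degrees, minutes, seconds) plus an N/S/E/W letter to decimal degrees."""
    degrees, minutes, seconds = (_to_float(v) for v in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # South and west are negative in the usual decimal convention
    return -decimal if ref in ("S", "W") else decimal


def gps_from_tags(tags):
    """Return (latitude, longitude) from an ExifRead-style tag dict, or None if absent."""
    try:
        lat = dms_to_decimal(tags["GPS GPSLatitude"].values,
                             str(tags["GPS GPSLatitudeRef"]).strip())
        lon = dms_to_decimal(tags["GPS GPSLongitude"].values,
                             str(tags["GPS GPSLongitudeRef"]).strip())
    except KeyError:
        return None  # the photo carries no GPS tags
    return lat, lon


print(dms_to_decimal([Fraction(51), Fraction(30), Fraction(0)], "N"))  # 51.5
```

With the tags dictionary from the script above, gps_from_tags(tags) would give a pair you can paste into a map, or None if the camera never recorded a position.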


 
Posted : 09/09/2023 2:22 pm
