I've been sent a PDF and would love to get the photo out of it and see where it was taken etc. Due to legal reasons I can't say too much, but the info contained could put a stop to quite a nasty employment tribunal.
Oh go on. Pretty please.
Not a forensics expert but I expect the EXIF data is long gone as it’s been converted into something very different.
Reverse image lookup?
Ask the person who sent you the pdf?
What is the source for the PDF?
Is it a Word file converted to PDF etc.
First go: If you have MS Word - click to open the PDF file and see what happens. Then go into the docx to see what data sits in the jpg file.
If you get no better offers message me and I'll fire up my work laptop for the PDF software and have a play.
There are online tools to extract images, could try that?
Not sure if there's any apps to do it.
I've used Coherent PDF command line tools to extract images. Sometimes they come out as hundreds of PBM or PPM files. If you're lucky it spits out a jpeg, depending on, I guess, what was used to create the PDF and how the images were placed within it.
ChatGPT says:
Extracting EXIF data from an image embedded within a PDF file can be done using various methods and tools. Here's a step-by-step guide using Python and some popular libraries:
Install Required Libraries:
You will need to install the following Python libraries if you haven't already:
PyMuPDF (PyMuPDF is used for PDF extraction)
Pillow (Pillow is used for image processing)
ExifRead (ExifRead is used for reading EXIF data)
You can install them using pip:
bash
pip install PyMuPDF Pillow ExifRead
Extract Image from PDF:
Use PyMuPDF to extract the image from the PDF. Here's a Python script to do this:
python
import fitz # PyMuPDF
import io
from PIL import Image
pdf_path = "your_pdf_file.pdf"
pdf_document = fitz.open(pdf_path)
# Specify the page number (e.g., 0 for the first page) containing the image
page_number = 0
page = pdf_document.load_page(page_number)
# get_images() lists the images on this page; the first field of each entry is the xref
xref = page.get_images(full=True)[0][0]
base_image = pdf_document.extract_image(xref)
image_data = base_image["image"]
# Load the image data into Pillow
image = Image.open(io.BytesIO(image_data))
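The snippet above only looks at the first image on the first page. As a rough sketch, the loop below walks every page and reports the format each image is stored in, which hints at whether any EXIF could have survived. The function names `might_carry_exif` and `list_pdf_images` are made up for illustration, and treating only JPEG/TIFF as EXIF candidates is an assumption, not a rule from PyMuPDF.

```python
def might_carry_exif(image_info: dict) -> bool:
    """Heuristic (assumption): only images stored as JPEG or TIFF are
    likely to still contain the camera's EXIF block.

    `image_info` is the dict returned by PyMuPDF's Document.extract_image().
    """
    return image_info.get("ext", "").lower() in {"jpeg", "jpg", "tiff", "tif"}


def list_pdf_images(pdf_path: str) -> list:
    """Return (page_index, xref, ext, exif_candidate) for every image in the PDF."""
    import fitz  # PyMuPDF; imported here so the helper above works without it

    results = []
    with fitz.open(pdf_path) as doc:
        for page_index in range(doc.page_count):
            page = doc.load_page(page_index)
            for img in page.get_images(full=True):
                xref = img[0]  # first field of each entry is the image's xref
                info = doc.extract_image(xref)
                results.append((page_index, xref, info["ext"], might_carry_exif(info)))
    return results
```

Calling `list_pdf_images("your_pdf_file.pdf")` and picking the entries flagged as candidates should narrow down which xref is worth extracting.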
Extract EXIF Data from Image:
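Once the extracted image is on disk, you can read its EXIF data with the ExifRead library. Write the raw image_data bytes to the file rather than re-saving through Pillow, because Pillow's save() drops EXIF unless it is passed explicitly. Here's how: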
python
import exifread
# Open the image
with open("output_image.jpg", "rb") as image_file:
    # Read EXIF data
    tags = exifread.process_file(image_file)
# Print EXIF data
for tag, value in tags.items():
    print(f"{tag}: {value}")
Replace "output_image.jpg" with the path to the image you extracted from the PDF.
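Since the goal is to see where the photo was taken, the GPS tags are the interesting part. The sketch below turns ExifRead's degrees/minutes/seconds values into decimal coordinates you can paste into a map. It assumes the tags are named "GPS GPSLatitude", "GPS GPSLatitudeRef", "GPS GPSLongitude" and "GPS GPSLongitudeRef" and that each value exposes .num and .den, as ExifRead's ratio objects do; `dms_to_decimal` and `gps_from_tags` are illustrative names, not part of the library.

```python
from fractions import Fraction


def dms_to_decimal(dms, ref: str) -> float:
    """Convert [degrees, minutes, seconds] plus a hemisphere letter
    (N/S/E/W) into signed decimal degrees."""
    degrees, minutes, seconds = (Fraction(part.num, part.den) for part in dms)
    decimal = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by convention
    if ref.strip().upper() in ("S", "W"):
        decimal = -decimal
    return float(decimal)


def gps_from_tags(tags):
    """Return (latitude, longitude) from an ExifRead tag dict,
    or None if the GPS tags are missing or malformed."""
    try:
        lat = dms_to_decimal(tags["GPS GPSLatitude"].values,
                             str(tags["GPS GPSLatitudeRef"]))
        lon = dms_to_decimal(tags["GPS GPSLongitude"].values,
                             str(tags["GPS GPSLongitudeRef"]))
    except (KeyError, ValueError, ZeroDivisionError):
        return None
    return lat, lon
```

Pass in the `tags` dict from `exifread.process_file(...)`; a result of `None` means the photo carries no usable location, which is common if it went through a messaging app or editor before landing in the PDF.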
Complete Code Example:
Here's a complete code example that combines the steps above:
python
import fitz  # PyMuPDF
import exifread
# Open the PDF document
pdf_path = "your_pdf_file.pdf"
pdf_document = fitz.open(pdf_path)
# Specify the page number (e.g., 0 for the first page) containing the image
page_number = 0
page = pdf_document.load_page(page_number)
# Extract the first image on the page (raises IndexError if the page has none)
xref = page.get_images(full=True)[0][0]
base_image = pdf_document.extract_image(xref)
image_data = base_image["image"]
# Save the embedded bytes unchanged; re-saving through Pillow would drop the EXIF block.
# The name assumes a JPEG; base_image["ext"] gives the actual format.
with open("output_image.jpg", "wb") as out_file:
    out_file.write(image_data)
# Open the image file and read EXIF data
with open("output_image.jpg", "rb") as image_file:
    tags = exifread.process_file(image_file)
# Print EXIF data
for tag, value in tags.items():
    print(f"{tag}: {value}")
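This code extracts the first image from the specified page of the PDF file, then reads and prints whatever EXIF data it still contains. If the software that created the PDF re-encoded the photo, there may be none.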