How to Use GetTextFromPDF() in FileMaker 2025

By Kate Waldhauser, Oct 31, 2025 (8 min read)
Tags: FileMaker 2025, GetTextFromPDF, PDF extraction, AI in FileMaker
AI in FileMaker 2025 — Part 2 of 4
TL;DR: FileMaker 2025's GetTextFromPDF() function lets you extract text from PDF container fields natively -- no plugins, no external services. Once extracted, that text becomes searchable and can feed into AI workflows like summarization, classification, and RAG.

How does FileMaker 2025 help you extract text from PDFs? Enter GetTextFromPDF() — a new function that makes PDF text extraction a native capability. No plugins, no external services. Just a function call.

This is one of those features that sounds simple but unlocks a lot. Once you can pull text out of PDFs programmatically, you open the door to search, classification, summarization, and AI-powered analysis — all within your FileMaker solution.

What Is GetTextFromPDF()?

GetTextFromPDF() is a new calculation function in FileMaker 2025 that extracts readable text from PDF files stored in container fields. It returns the text content as a string, which you can then store, search, analyze, or feed into AI workflows.

Syntax

GetTextFromPDF( containerField )
  • containerField — A container field that holds a PDF file
  • Returns — Text content extracted from the PDF

Key Details

  • Works with text-based PDFs (not scanned images — those require OCR)
  • Runs locally — no data leaves your server or client
  • Processes the entire document by default
  • Returns plain text — formatting, images, and layout are not preserved

When to Use It

This function shines in scenarios where you need to make PDF content searchable and actionable:

  • Document indexing — Extract and store text so you can run finds against PDF content
  • Contract management — Pull key terms from contracts for review and classification
  • Invoice processing — Extract line items, totals, and vendor information
  • Research databases — Make academic papers or reports searchable within FileMaker
  • Compliance documentation — Index policy documents for governance and audit workflows

Setting It Up: Step by Step

1. Prepare Your Container Field

Make sure you have a container field that stores the PDF files themselves, not references to files on disk. A reference only works if the machine running the script (client or server) can resolve the referenced path.

2. Create a Text Field for Extracted Content

Add a text field to store the extracted text. This is important — you don’t want to re-extract every time you search.

Field Name: pdf_extracted_text
Type: Text

3. Write the Extraction Script

Create a script that extracts text and stores it:

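# Extract once into a variable so the result can be checked before storing it.
Set Variable [ $text ; Value: GetTextFromPDF( Documents::pdf_container ) ]
If [ not IsEmpty( $text ) ]
    # Text-based PDF: store the text so later finds do not re-extract.
    Set Field [ Documents::pdf_extracted_text ; $text ]
Else
    # Nothing came back: most likely a scanned (image-only) PDF.
    Set Field [ Documents::pdf_extracted_text ; "[No text extracted; may be a scanned image]" ]
End If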
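
With the text stored in pdf_extracted_text, a find no longer needs to touch the PDF itself. The following is a minimal sketch, assuming the current layout is based on the Documents table; the search term and dialog wording are only examples.

```filemaker
# Find documents whose stored text mentions a term (example term).
Set Error Capture [ On ]
Enter Find Mode [ Pause: Off ]
Set Field [ Documents::pdf_extracted_text ; "indemnification" ]
Perform Find []
If [ Get ( LastError ) = 401 ]
    # Error 401 means no records match the find request.
    Show Custom Dialog [ "Search" ; "No documents mention that term." ]
End If
```

Because the find runs against an ordinary text field, it behaves like any other FileMaker find and benefits from the field's index.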

4. Trigger on Import

Run the extraction script automatically when new PDFs are added. Use a script trigger on the container field or batch-process existing records.
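
One way to wire this up is an OnObjectModify script trigger on the container field object. The sketch below assumes a hypothetical script named "Extract PDF Text (Trigger)" attached to that trigger on the layout where users insert PDFs; the field names follow the earlier steps.

```filemaker
# Hypothetical script "Extract PDF Text (Trigger)", assumed to be attached
# to the pdf_container field object as an OnObjectModify script trigger.
If [ IsEmpty( Documents::pdf_container ) ]
    # The PDF was removed, so clear any stale extracted text.
    Set Field [ Documents::pdf_extracted_text ; "" ]
Else
    Set Field [ Documents::pdf_extracted_text ; GetTextFromPDF( Documents::pdf_container ) ]
End If
Commit Records/Requests [ With dialog: Off ]
```

Layout script triggers fire only when the field is changed through that layout, so PDFs that arrive by import or through an API would still need the batch approach.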

Practical Tips

Handle Scanned PDFs Gracefully

GetTextFromPDF() works on text-based PDFs. If a PDF is a scanned image, the function returns empty. Plan for this:

  • Check if the result is empty
  • Flag records that need OCR processing
  • Consider a secondary workflow using an external OCR service for scanned documents
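
As an alternative to writing a placeholder string into the text field, the checks above could be sketched with a flag field. Here needs_ocr is a hypothetical number field added for illustration; it is not part of the setup described earlier.

```filemaker
# needs_ocr is a hypothetical number field: 1 = route to OCR, 0 = text found.
Set Variable [ $text ; Value: GetTextFromPDF( Documents::pdf_container ) ]
Set Field [ Documents::pdf_extracted_text ; $text ]
Set Field [ Documents::needs_ocr ; If ( IsEmpty( $text ) ; 1 ; 0 ) ]
Commit Records/Requests [ With dialog: Off ]
```

A find for needs_ocr = 1 then produces the list of records to send through the OCR workflow, and the extracted text field stays free of placeholder text that would otherwise show up in searches.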

Batch Processing Existing Records

If you have hundreds or thousands of existing PDF records, create a looping script:

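# Assumes the found set contains the Documents records to process.
Go to Record/Request/Page [ First ]
Loop
    # Skip records that have no PDF or that already have extracted text.
    If [ not IsEmpty( Documents::pdf_container ) and IsEmpty( Documents::pdf_extracted_text ) ]
        Set Field [ Documents::pdf_extracted_text ; GetTextFromPDF( Documents::pdf_container ) ]
        Commit Records/Requests [ With dialog: Off ]
    End If
    Go to Record/Request/Page [ Next ; Exit after last ]
End Loop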

Combine with AI Features

Once text is extracted, you can feed it into FileMaker 2025’s AI capabilities:

  • Semantic search — Generate embeddings from extracted text for meaning-based search
  • Summarization — Use a text generation model to create document summaries
  • Classification — Automatically categorize documents based on content
  • RAG (Retrieval Augmented Generation) — Use extracted text as context for AI-generated responses
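
As a rough illustration of the semantic search item, the extracted text could be passed to the Configure AI Account and Insert Embedding script steps. Everything specific here is an assumption: the account name, the $apiKey variable, the model name, and the pdf_embedding target field are hypothetical, and the option labels should be checked against the script workspace in your version.

```filemaker
# Hypothetical account name, API key variable, model name, and target field.
Configure AI Account [ Account Name: "docs_ai" ; Model Provider: OpenAI ; API key: $apiKey ]
# Generate an embedding from the stored text and save it on the same record.
Insert Embedding [ Account Name: "docs_ai" ; Embedding Model: "text-embedding-3-small" ; Input: Documents::pdf_extracted_text ; Target: Documents::pdf_embedding ]
Commit Records/Requests [ With dialog: Off ]
```

Unlike the extraction itself, this step sends the text to whichever model provider the account points at, and long documents may exceed the model's input limit, so the text may need to be shortened or split first.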

Performance Considerations

  • Large PDFs take longer to process. Consider running extraction as a server-side script for documents over 50 pages
  • Storage — Extracted text can be substantial. A 100-page document might produce 50,000+ characters. Plan your storage accordingly
  • Indexing — Make sure your extracted text field is indexed if you plan to search against it
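
The server-side suggestion above can be sketched with Perform Script on Server. The script name "Extract PDF Text (Server)", the layout name "Documents", and a text primary key field Documents::id are assumptions made for this example.

```filemaker
# Client side: pass the record's primary key to a server-side script.
# "Extract PDF Text (Server)" and Documents::id are hypothetical names.
Commit Records/Requests [ With dialog: Off ]
Perform Script on Server [ Specified: From list ; "Extract PDF Text (Server)" ; Parameter: Documents::id ; Wait for completion: Off ]

# Server side: contents of the script "Extract PDF Text (Server)".
Set Error Capture [ On ]
Go to Layout [ "Documents" (Documents) ]
Enter Find Mode [ Pause: Off ]
Set Field [ Documents::id ; "==" & Get ( ScriptParameter ) ]
Perform Find []
If [ Get ( FoundCount ) = 1 ]
    Set Field [ Documents::pdf_extracted_text ; GetTextFromPDF( Documents::pdf_container ) ]
    Commit Records/Requests [ With dialog: Off ]
End If
```

The record is committed before the call because the server session only sees committed data. With "Wait for completion" set to Off, the user can keep working while the extraction runs.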

Responsible Use Considerations

Even with a seemingly straightforward feature like PDF text extraction, responsible AI practices apply:

  • Data privacy — PDFs may contain sensitive information. Extracted text is now searchable and potentially more accessible. Review your access controls
  • Accuracy — Text extraction is not perfect. Complex layouts, multi-column formats, and special characters may not extract cleanly. Always verify critical data
  • Transparency — If extracted text feeds into AI workflows, document the pipeline. Know what data is being processed and where it goes

What’s Next

GetTextFromPDF() is one piece of a larger AI toolkit in FileMaker 2025. Combined with embedding models, semantic search, and text generation, it enables genuinely intelligent document workflows.

If you’re building AI features into your FileMaker solution and want guidance on doing it responsibly, let’s talk.

How AI Was Used in This Post

AI assisted with research, drafting, and code example formatting. All technical content was verified against FileMaker 2025 documentation. The header image was generated using ChatGPT.

Frequently Asked Questions

Does GetTextFromPDF() work with scanned PDFs?

No. GetTextFromPDF() extracts text from text-based PDFs only. Scanned documents (image-based PDFs) return empty because they require OCR processing. You can detect this by checking if the result is empty and routing those records to an external OCR service.

Does the PDF data leave my server when using GetTextFromPDF()?

No. GetTextFromPDF() runs entirely locally on your FileMaker client or server. No data is sent to external services, which makes it a good fit for sensitive or regulated data environments.

Can I use GetTextFromPDF() on FileMaker Server?

Yes. GetTextFromPDF() works on both FileMaker Pro (client-side) and FileMaker Server (server-side scripts). For large batch processing, running extraction as a server-side script is recommended for better performance.

How do I combine GetTextFromPDF() with AI features?

Once text is extracted and stored in a text field, you can generate embeddings for semantic search, feed the text into a text generation model for summarization or classification, or use it as context in RAG (Retrieval Augmented Generation) workflows.

How much text can GetTextFromPDF() extract from a single PDF?

GetTextFromPDF() processes the entire document by default. A 100-page PDF might produce 50,000+ characters of text. Plan your storage accordingly and consider indexing the extracted text field for search performance.

Kate Waldhauser
Founder of Violet Beacon. Responsible AI consultant, ISO 42001 Lead Implementer, and Certified Claris Partner with 20+ years of FileMaker expertise.
