How to Use GetTextFromPDF() in FileMaker 2025
How does FileMaker 2025 help you extract text from PDFs? Enter GetTextFromPDF() — a new function that makes PDF text extraction a native capability. No plugins, no external services. Just a function call.
This is one of those features that sounds simple but unlocks a lot. Once you can pull text out of PDFs programmatically, you open the door to search, classification, summarization, and AI-powered analysis — all within your FileMaker solution.
What Is GetTextFromPDF()?
GetTextFromPDF() is a new calculation function in FileMaker 2025 that extracts readable text from PDF files stored in container fields. It returns the text content as a string, which you can then store, search, analyze, or feed into AI workflows.
Syntax
GetTextFromPDF( containerField )
- containerField — A container field that holds a PDF file
- Returns — Text content extracted from the PDF
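Because it is described as a calculation function, it should be usable wherever a calculation is accepted, not only in scripts. As a minimal sketch, assuming a Documents table with a pdf_container field (the names used in the setup steps later in this post), the following calculation returns the number of characters extracted:

```
Let (
    extracted = GetTextFromPDF( Documents::pdf_container ) ;
    // Length() counts the characters in the extracted text
    Length( extracted )
)
```

A result of 0 would suggest that the PDF has no text layer.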
Key Details
- Works with text-based PDFs (not scanned images — those require OCR)
- Runs locally — no data leaves your server or client
- Processes the entire document by default
- Returns plain text — formatting, images, and layout are not preserved
When to Use It
This function shines in scenarios where you need to make PDF content searchable and actionable:
- Document indexing — Extract and store text so you can run finds against PDF content
- Contract management — Pull key terms from contracts for review and classification
- Invoice processing — Extract line items, totals, and vendor information
- Research databases — Make academic papers or reports searchable within FileMaker
- Compliance documentation — Index policy documents for governance and audit workflows
Setting It Up: Step by Step
1. Prepare Your Container Field
Make sure the container field stores the PDF itself rather than a reference to a file on disk. A PDF stored by reference works only if the machine running the extraction can resolve the path.
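One rough way to confirm that a container holds a PDF before extracting is to check the stored file name. The sketch below assumes the Documents::pdf_container field used in the following steps. GetContainerAttribute() is a standard FileMaker function, but a file extension is only a hint, not proof of the file type.

```
// Returns 1 (true) when the container is not empty and its file name ends in ".pdf"
not IsEmpty( Documents::pdf_container ) and
Right( GetContainerAttribute( Documents::pdf_container ; "filename" ) ; 4 ) = ".pdf"
```

Text comparisons with = normally ignore case in FileMaker calculations, so a name ending in ".PDF" should match as well.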
2. Create a Text Field for Extracted Content
Add a text field to store the extracted text. This is important — you don’t want to re-extract every time you search.
Field Name: pdf_extracted_text
Type: Text
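With the text stored in its own field, searching PDF content becomes an ordinary find. A minimal sketch, assuming the script runs on a layout based on the Documents table, with "indemnification" as a placeholder search term:

```
Set Error Capture [ On ]
Enter Find Mode [ Pause: Off ]
Set Field [ Documents::pdf_extracted_text ; "indemnification" ]
Perform Find [ ]
If [ Get( LastError ) = 401 ]
    # 401 means no records matched the find request
    Show Custom Dialog [ "Search" ; "No documents contain that term." ]
End If
```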
3. Write the Extraction Script
Create a script that extracts text and stores it:
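# Extract once and keep the result so that finds do not re-run the extraction
Set Variable [ $text ; Value: GetTextFromPDF( Documents::pdf_container ) ]
If [ not IsEmpty( $text ) ]
    Set Field [ Documents::pdf_extracted_text ; $text ]
Else
    # Nothing came back: the PDF may be a scanned image that needs OCR
    Set Field [ Documents::pdf_extracted_text ; "[No text extracted: may be a scanned image]" ]
End If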
4. Trigger on Import
Run the extraction script automatically when new PDFs are added, for example from a script trigger attached to the container field on the layout. For PDFs that are already in the database, batch-process the existing records instead.
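A trigger-driven version could be sketched as follows. It assumes a hypothetical script named "On PDF Inserted" that is set as the OnObjectModify trigger for the pdf_container field object on a Documents layout. Whether the trigger fires for every way of inserting a file (Insert File, drag and drop, paste) is worth testing in your own solution.

```
# Script: On PDF Inserted (hypothetical name)
If [ not IsEmpty( Documents::pdf_container ) ]
    Set Field [ Documents::pdf_extracted_text ; GetTextFromPDF( Documents::pdf_container ) ]
Else
    # The container was cleared, so clear the stale text as well
    Set Field [ Documents::pdf_extracted_text ; "" ]
End If
```

Clearing the text when the container is emptied keeps the searchable field from describing a PDF that is no longer there.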
Practical Tips
Handle Scanned PDFs Gracefully
GetTextFromPDF() only reads PDFs that contain a text layer. For a scanned image it returns an empty result, so plan for that case:
- Check if the result is empty
- Flag records that need OCR processing
- Consider a secondary workflow using an external OCR service for scanned documents
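The checks above could be sketched like this. Documents::needs_ocr is a hypothetical number field added for illustration; it is not part of the setup described earlier.

```
Set Variable [ $text ; Value: GetTextFromPDF( Documents::pdf_container ) ]
Set Field [ Documents::pdf_extracted_text ; $text ]
# needs_ocr (hypothetical field): 1 when a PDF is present but no text came back
Set Field [ Documents::needs_ocr ; If ( IsEmpty( $text ) and not IsEmpty( Documents::pdf_container ) ; 1 ; 0 ) ]
```

A find for 1 in needs_ocr then produces the list of records to send through a separate OCR workflow.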
Batch Processing Existing Records
If you have hundreds or thousands of existing PDF records, create a looping script:
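# Assumes the current found set holds the Documents records to process
Go to Record/Request/Page [ First ]
Loop
    # Skip records that already have text or have no PDF in the container
    If [ IsEmpty( Documents::pdf_extracted_text ) and not IsEmpty( Documents::pdf_container ) ]
        Set Field [ Documents::pdf_extracted_text ; GetTextFromPDF( Documents::pdf_container ) ]
        Commit Records/Requests [ With dialog: Off ]
    End If
    Go to Record/Request/Page [ Next ; Exit after last: On ]
End Loop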
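To avoid walking every record, the loop above could be preceded by a find that narrows the found set to records with no extracted text. This is a sketch that assumes the script runs on a layout based on the Documents table:

```
Set Error Capture [ On ]
Enter Find Mode [ Pause: Off ]
# "=" on its own matches records where the field is empty
Set Field [ Documents::pdf_extracted_text ; "=" ]
Perform Find [ ]
If [ Get( LastError ) = 401 ]
    # 401 means no records matched, so there is nothing left to extract
    Exit Script [ Text Result: 0 ]
End If
```

Scanned PDFs that return no text remain empty and will be found again on the next run unless they are marked in some way.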
Combine with AI Features
Once text is extracted, you can feed it into FileMaker 2025’s AI capabilities:
- Semantic search — Generate embeddings from extracted text for meaning-based search
- Summarization — Use a text generation model to create document summaries
- Classification — Automatically categorize documents based on content
- RAG (Retrieval Augmented Generation) — Use extracted text as context for AI-generated responses
Performance Considerations
- Large PDFs take longer to process. Consider running extraction as a server-side script for documents over 50 pages
- Storage — Extracted text can be substantial. A 100-page document might produce 50,000+ characters. Plan your storage accordingly
- Indexing — Make sure your extracted text field is indexed if you plan to search against it
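Moving the work to the server might look like the following. "Batch Extract PDF Text" is a hypothetical name for a script that contains the batch loop; Perform Script on Server is a standard script step.

```
# Hand the batch job to the server and return control to the user right away
Perform Script on Server [ "Batch Extract PDF Text" ; Wait for completion: Off ]
```

A server-side script starts in its own session, without the user's current layout or found set, so the batch script would need to go to the right layout and build its own found set before looping.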
Responsible Use Considerations
Even with a seemingly straightforward feature like PDF text extraction, responsible AI practices apply:
- Data privacy — PDFs may contain sensitive information. Extracted text is now searchable and potentially more accessible. Review your access controls
- Accuracy — Text extraction is not perfect. Complex layouts, multi-column formats, and special characters may not extract cleanly. Always verify critical data
- Transparency — If extracted text feeds into AI workflows, document the pipeline. Know what data is being processed and where it goes
What’s Next
GetTextFromPDF() is one piece of a larger AI toolkit in FileMaker 2025. Combined with embedding models, semantic search, and text generation, it enables genuinely intelligent document workflows.
If you’re building AI features into your FileMaker solution and want guidance on doing it responsibly, let’s talk.
How AI Was Used in This Post
AI assisted with research, drafting, and code example formatting. All technical content was verified against FileMaker 2025 documentation. The header image was generated using ChatGPT.
Frequently Asked Questions
Does GetTextFromPDF() work with scanned PDFs?
No. GetTextFromPDF() extracts text from text-based PDFs only. Scanned documents (image-based PDFs) return an empty result because they require OCR processing. You can detect this by checking whether the result is empty and routing those records to an external OCR service.
Does GetTextFromPDF() send data to an external service?
No. GetTextFromPDF() runs entirely locally on your FileMaker client or server. No data is sent to external services, which makes it a good fit for sensitive or regulated data environments.
Does GetTextFromPDF() work on both FileMaker Pro and FileMaker Server?
Yes. GetTextFromPDF() works on both FileMaker Pro (client-side) and FileMaker Server (server-side scripts). For large batch processing, running extraction as a server-side script is recommended for better performance.
How can extracted text be used with FileMaker's AI features?
Once text is extracted and stored in a text field, you can generate embeddings for semantic search, feed the text into a text generation model for summarization or classification, or use it as context in RAG (Retrieval Augmented Generation) workflows.
How much of a document does GetTextFromPDF() process?
GetTextFromPDF() processes the entire document by default. A 100-page PDF might produce 50,000+ characters of text. Plan your storage accordingly and consider indexing the extracted text field for search performance.