If you’ve ever tried to extract data from dozens of PDFs, you know how quickly the task turns into chaos.
Manual copy-paste work, inconsistent layouts, and OCR errors make it a nightmare, especially when accuracy matters.
That’s exactly why I built the Document AI Starter: a simple, ready-to-run workflow powered by Google Cloud’s Form Parser and Python.
It’s designed to help developers, analysts, and automation builders go from unstructured PDFs to clean, structured outputs in just a few steps.
In this post, I’ll walk through how it was built, how to set it up, and how you can use it as the foundation for your own document automation projects.

Phase 1: Setting Up Google Cloud
Everything begins in the Google Cloud Console.
To make Document AI work, you first create a dedicated project, enable the API, and generate a secure service account key.
This service account is what lets your local tool securely talk to Google’s servers without using your personal credentials.
Setup Highlights
- Create a new project (document-ai-starter).
- Go to IAM & Admin → Service Accounts and make one named docai-form-parser.
- Assign the role Document AI API User.
- Generate a JSON key and save it locally as gcloud_key.json.
- Enable the Document AI API and create a Form Parser processor.
💡 Tip: Each processor comes with a unique ID that looks like projects/your-project-id/locations/us/processors/1234567890abcdef.
You’ll paste this into your configuration file later.
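That full resource path is just the project ID, location, and processor ID joined in a fixed pattern, so it can also be assembled in code. A minimal sketch (the helper name is mine, not part of the Starter):

```python
# Hypothetical helper: assemble the full Document AI processor resource
# name from the project ID, location, and processor ID shown in the console.
def processor_name(project_id: str, location: str, processor_id: str) -> str:
    return f"projects/{project_id}/locations/{location}/processors/{processor_id}"

print(processor_name("your-project-id", "us", "1234567890abcdef"))
# projects/your-project-id/locations/us/processors/1234567890abcdef
```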

Phase 2: Configuring the Local Environment
Once the Google Cloud side is ready, the local environment connects it all.
You only need two files:
- .env: holds your Google credentials path.
- config.yaml: defines project details like processor ID, output folder, and cost per page.
Example:
project:
  id: your-project-id
  location: us
  processor_id: your-processor-id
paths:
  output_dir: outputs
  log_file: logs/usage_log.csv
processing:
  cost_per_page_usd: 0.03
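For illustration, here is how that config parses with PyYAML (I’m assuming the Starter reads it via yaml.safe_load; the string below simply inlines the example file so the snippet is self-contained):

```python
import yaml  # PyYAML; presumably already listed in requirements.txt

# The example config.yaml, inlined so the snippet runs on its own.
CONFIG_YAML = """
project:
  id: your-project-id
  location: us
  processor_id: your-processor-id
paths:
  output_dir: outputs
  log_file: logs/usage_log.csv
processing:
  cost_per_page_usd: 0.03
"""

cfg = yaml.safe_load(CONFIG_YAML)

# Reassemble the processor resource name the Document AI client expects.
processor = (
    f"projects/{cfg['project']['id']}"
    f"/locations/{cfg['project']['location']}"
    f"/processors/{cfg['project']['processor_id']}"
)
print(processor)  # projects/your-project-id/locations/us/processors/your-processor-id
```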
Once saved, create and activate a virtual environment, then install the dependencies:
python -m venv .venv
.venv\Scripts\activate   (on macOS/Linux: source .venv/bin/activate)
pip install -r requirements.txt
Now you’re ready to process files using either the CLI or Streamlit UI.
Phase 3: Running the Extraction — CLI or Streamlit
The CLI mode is perfect for batch jobs and automation testing:
python -m core.cli --file sample_data/2.pdf
Or process all PDFs in a folder:
python -m core.cli --batch
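Under the hood, batch mode presumably just collects every PDF in the input folder and feeds each one through the same single-file pipeline. A sketch of that collection step (the folder name and function are illustrative, not the Starter’s actual internals):

```python
from pathlib import Path

# Gather every PDF in a folder, sorted for a deterministic processing order.
# The default mirrors the sample_data/ directory used in the CLI example.
def find_pdfs(folder: str = "sample_data") -> list[Path]:
    return sorted(Path(folder).glob("*.pdf"))
```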
Prefer a graphical interface?
Launch the Streamlit app and upload PDFs directly from your browser:
streamlit run ui/streamlit_app.py
You’ll see the parsed data appear live — including field names, confidence scores, and cost estimates per page.
Result Example:
📄 File: 2.pdf
🧾 Pages: 2
💰 Estimated Cost: $0.06
✅ Extraction Completed
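The cost figure above is simple arithmetic: page count times the cost_per_page_usd value from config.yaml. Sketched out (the function name is mine):

```python
# Estimate per-file spend: pages processed times the configured page rate.
def estimated_cost(pages: int, cost_per_page_usd: float = 0.03) -> float:
    return round(pages * cost_per_page_usd, 2)

print(estimated_cost(2))  # 0.06
```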
Phase 4: Output and Reporting
The Document AI Starter automatically creates:
File                     Purpose
.xlsx                    Clean structured data
.json                    Raw AI output
usage_log.csv            Tracks cost, confidence, and file metadata
summary_dashboard.xlsx   Aggregated performance and spend summary
Every extraction is logged for full transparency — allowing you to monitor cost, review confidence levels, and build your own reporting dashboards.
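As a rough sketch of what appending to usage_log.csv could look like (the column names here are my guess; the real log tracks cost, confidence, and file metadata as described above):

```python
import csv
from pathlib import Path

# Illustrative column set for the usage log.
FIELDS = ["file", "pages", "cost_usd", "avg_confidence"]

def log_usage(log_path: Path, row: dict) -> None:
    # Write the header only when the file is first created, then append
    # one record per extraction so the log accumulates over time.
    write_header = not log_path.exists()
    with log_path.open("a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        if write_header:
            writer.writeheader()
        writer.writerow(row)
```

Because every run appends rather than overwrites, a log like this doubles as an audit trail you can pivot in Excel or roll up into a summary dashboard.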
Key Takeaways — Build Once, Scale Anywhere
Building the Document AI Starter taught me an important lesson: automation should be simple, secure, and repeatable.
By combining Google Cloud’s pre-trained models with a clean local interface, we can empower non-developers and small teams to perform tasks that used to require full data-engineering pipelines.
If you’re looking to integrate AI-based document extraction into your workflow, the Starter edition is a perfect foundation.
You can expand it into an internal automation tool, add Power Automate triggers, or connect it to a database for continuous ingestion.
Wrap-Up — Download the Starter and Try It Yourself
The Document AI Starter is available for free download.
You’ll get:
- The pre-configured Python project
- The Google Cloud setup PDF
- Step-by-step environment instructions
Ready to build your own document AI workflow?
Download the Document AI Starter and follow along with this post to bring your first extraction pipeline to life.


