Your Invoices Have Bank Account Numbers in Them. Let's Talk About That.
Founder
Domain Architect, Finance B2B Operations · 2026-03-28
A few months ago, a friend who runs a 50-person logistics company asked if he could try ScribeArc. Before I could even open the demo, he asked: "Where does my data go?"
Not "how accurate is it?" Not "how fast is it?" Where does my data go.
I liked that question. It's the right question, and it's the one most vendors try to rush past with a compliance logo wall and a vague mention of "enterprise-grade security." So let me actually answer it.
What's in a financial document, really
When I worked in finance ops, I handled thousands of invoices, bank statements, and vendor contracts. After a while, you stop seeing them as documents and start seeing them as bundles of sensitive data:
Bank account numbers and routing information - enough to initiate fraudulent transfers
Tax IDs (GST numbers in India, VAT IDs in the EU, EINs in the US) - useful for identity fraud
Revenue figures, pricing, and contract terms - competitive intelligence gold
Vendor and customer relationships - who you buy from, who buys from you, at what prices
A breach of this data isn't just embarrassing. It's an operational and legal catastrophe. And the attack surface increases the moment you add AI, because now there's a model processing all of this data, generating logs, and potentially retaining patterns.
Our hard lines
Some of our security decisions are philosophical commitments, not just technical choices. Here are the ones we've drawn hard lines on:
We will never train models on customer data. Full stop. Our AI models are trained on anonymized, synthetic, and licensed datasets. Your invoices are used for inference - meaning the model reads them to extract data - and nothing else. This is in our Privacy Policy today, not a future plan.
Customer data is for your benefit, not ours. We don't aggregate customer data for analytics, benchmarking, or product improvement without explicit, per-customer consent. The fact that your AP department processes a lot of invoices from a particular vendor is your business information, not ours to monetize.
We default to less data, not more. Extracted data points get stored. Original documents get retained based on your retention policy, not ours. Logs are scrubbed of PII before they hit storage.
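To make "scrubbed before they hit storage" concrete, here is a minimal sketch of the idea using Python's standard logging module. It is an illustration, not our production code: the patterns, the placeholder labels, and the names scrub, PiiScrubFilter, and build_logger are all assumptions made up for this example, and real coverage would need far more than three regexes.

```python
import logging
import re

# Illustrative patterns only (assumptions for this sketch, not a complete PII catalog).
PII_PATTERNS = [
    # US EIN, e.g. 12-3456789
    (re.compile(r"\b\d{2}-\d{7}\b"), "[EIN]"),
    # Indian GSTIN, e.g. 22AAAAA0000A1Z5
    (re.compile(r"\b\d{2}[A-Z]{5}\d{4}[A-Z][A-Z\d]Z[A-Z\d]\b"), "[GSTIN]"),
    # Long digit runs that look like bank account or routing numbers
    (re.compile(r"\b\d{9,18}\b"), "[ACCOUNT]"),
]


def scrub(text: str) -> str:
    """Replace anything matching a PII pattern with a fixed placeholder."""
    for pattern, placeholder in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


class PiiScrubFilter(logging.Filter):
    """Rewrites each record's message before the handler formats and writes it."""

    def filter(self, record: logging.LogRecord) -> bool:
        # Render the message with its arguments first, so PII passed as
        # arguments is scrubbed too, then drop the raw arguments.
        record.msg = scrub(record.getMessage())
        record.args = None
        return True


def build_logger(stream) -> logging.Logger:
    """Hypothetical helper: a logger whose only handler scrubs before writing."""
    logger = logging.getLogger("scrub_demo")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.addFilter(PiiScrubFilter())
    logger.addHandler(handler)
    return logger
```

The point of attaching the filter to the handler is that redaction happens on the write path, so the raw value never reaches the log sink. Regex matching like this will miss things and occasionally over-redact, which is why it is a default rather than a guarantee.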
What we've built and what we're building
I want to be clear about what's in production versus what's on our roadmap, because I think the fintech industry has a bad habit of blurring that line.
In production now:
Encryption at rest (AES-256) and in transit (TLS 1.3) for all data
Model training exclusively on non-customer datasets
PII-scrubbed logging
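For the in-transit half of that first item, this is roughly what "TLS 1.3 only" looks like from a client's side in Python's standard ssl module. Treat it as a sketch under assumptions: the function name is made up for this post, and it says nothing about how our servers are actually configured. Encryption at rest is not shown because it needs a third-party crypto library.

```python
import ssl


def make_tls13_client_context() -> ssl.SSLContext:
    """Hypothetical helper: a client context that refuses anything below TLS 1.3."""
    # create_default_context() turns on certificate and hostname verification.
    ctx = ssl.create_default_context()
    # Refuse to negotiate TLS 1.2 or older.
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx
```

A connection made with this context fails the handshake if the server cannot speak TLS 1.3, instead of silently falling back to an older protocol.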
On our roadmap (designed, not yet deployed):
Full tenant isolation - each customer's data in logically separate infrastructure, so cross-tenant access is prevented by the architecture itself rather than by application checks alone
Customer-managed encryption keys (CMEK) for enterprise accounts
Data residency controls - letting customers specify where their data is stored, which matters a lot for EU companies under GDPR and increasingly for APAC businesses too
Automatic PII detection and redaction in analytics and monitoring
Real-time anomaly detection on access patterns
Prompt injection protections as we integrate more LLM capabilities
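Since tenant isolation is the item people ask about most, here is a toy sketch of the principle behind it: every read and write carries a tenant context, and there is simply no call that reaches across tenants. Everything in it (TenantContext, TenantScopedStore, DocumentNotVisibleError) is hypothetical and exists only for this post. It is also an application-level illustration, which is weaker than the infrastructure-level separation described on the roadmap.

```python
from dataclasses import dataclass, field


class DocumentNotVisibleError(LookupError):
    """Raised for missing documents and other tenants' documents alike."""


@dataclass(frozen=True)
class TenantContext:
    """Identifies the tenant on whose behalf a request runs."""
    tenant_id: str


@dataclass
class TenantScopedStore:
    # One private partition per tenant; no method iterates across partitions.
    _partitions: dict[str, dict[str, bytes]] = field(default_factory=dict)

    def put(self, ctx: TenantContext, doc_id: str, blob: bytes) -> None:
        self._partitions.setdefault(ctx.tenant_id, {})[doc_id] = blob

    def get(self, ctx: TenantContext, doc_id: str) -> bytes:
        # Only the caller's own partition is ever consulted.
        partition = self._partitions.get(ctx.tenant_id, {})
        if doc_id not in partition:
            # Same error whether the document does not exist or belongs to
            # someone else, so callers cannot probe for other tenants' IDs.
            raise DocumentNotVisibleError(doc_id)
        return partition[doc_id]
```

The design choice worth noticing is that the tenant is a required argument rather than an optional filter: forgetting it is a type error, not a data leak.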
On compliance
We're actively working toward a SOC 2 Type II report and GDPR alignment. We don't have the SOC 2 attestation yet - we're still in private beta, and a Type II audit covers an observation period, so it requires an operational track record. I'd rather be honest about that than slap a "SOC 2 Compliant" badge on our site prematurely.
What I will say: we're building the controls, audit trails, and documentation from Day 1 so that when we pursue certification, it's a validation of what we already do, not a retrofit.
The opinion nobody asked for
Here's my slightly tired take after fifteen years in this space: most fintech security pages are theater. Long lists of compliance certifications, stock photos of padlocks, and paragraphs about "military-grade encryption." Meanwhile, the actual question - "what happens to my data, who can see it, and how do I get it back if I leave?" - gets buried in a legal document nobody reads.
We're trying to do this differently. Not because we're more virtuous, but because the founders of this company have personally handled the kind of sensitive financial data that would be catastrophic if leaked. We know what's at stake because we've been on the other side of the table.
ScribeArc is in private beta. Security features noted as "on our roadmap" are designed but not yet deployed.