Bedrock RAG Data Sovereignty: How Infrastructure Teams Keep Retrieval Inside the VPC

Written by: iSimplifyMe · Created on: May 21, 2026 · 10 min read

You probably think of Bedrock RAG sovereignty as a checkbox somewhere between the model card and the IAM policy. However, real retrieval sovereignty is closer to wiring a private banking network than to flipping an AWS console toggle — and it is the question your security review will ask before any agent reaches a production workload.

Walk into any infrastructure planning meeting where a team is scoping a Bedrock-backed agent, and the conversation collapses to the same anxiety within twenty minutes. The model is fine, the prompts are fine, the orchestration is fine — but where does the retrieved chunk actually travel between the vector store and the foundation model, and who can read it in flight?

What is Bedrock RAG data sovereignty? Bedrock RAG data sovereignty is the architectural guarantee that every retrieved chunk, embedding, and model invocation stays inside your AWS account boundary — traversing PrivateLink endpoints, encrypted with customer-managed KMS keys, and logged through CloudTrail so the path from query to answer can be reconstructed in an audit.

Why The Sovereignty Question Is Harder Than It Looks

Bedrock itself runs in AWS-managed accounts, not yours. That is a feature for managed inference, but it is the exact reason your compliance team flags retrieval as a separate concern from model hosting.

The naive picture is a straight line: your agent queries a knowledge base, Bedrock returns an answer, done. The real picture has at least four trust boundaries — your application VPC, the vector index, the embeddings endpoint, and the foundation-model endpoint — and each one needs an explicit policy decision.

Skip any of these and you end up with a retrieval pipeline that technically works but cannot be defended in a SOC 2 review. We have seen this pattern recur across every Bedrock rollout we have shipped, which is why we treat it as a load-bearing piece of the broader RAG governance story rather than a network detail.

The Four Trust Boundaries Of A Bedrock RAG Pipeline

Mapping these explicitly is the first thing we do in any architecture review. Each boundary has a default that leaks data outside the VPC unless you flip it.

| Boundary | Default Behavior | In-VPC Configuration |
| --- | --- | --- |
| Application → Bedrock Runtime | Public endpoint via NAT | VPC Interface Endpoint (PrivateLink) for bedrock-runtime |
| Embeddings (Titan, Cohere) | Same public path as inference | Same PrivateLink endpoint scoped via IAM |
| Vector index (OpenSearch Serverless, Aurora pgvector, Pinecone) | Varies by store; Pinecone is off-account by default | OpenSearch Serverless with VPC access policy, or Aurora in private subnet |
| Knowledge Base data source (S3) | Public S3 endpoint via IGW | Gateway VPC endpoint for S3, bucket policy denying non-VPCE access |

The trap is that PrivateLink for Bedrock Runtime alone is not enough. If your vector store sits in Pinecone or a public OpenSearch cluster, the retrieved chunks still leave the account on their way back to the model — which means your sovereignty claim breaks at the worst possible boundary.

PrivateLink Is Necessary But Not Sufficient

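The first thing every infrastructure team gets right is creating an Interface VPC Endpoint for com.amazonaws.region.bedrock-runtime. That moves InvokeModel calls off the public internet. Retrieve and RetrieveAndGenerate are served by bedrock-agent-runtime, so Knowledge Base traffic needs a second interface endpoint for com.amazonaws.region.bedrock-agent-runtime.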
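
As a rough sketch of what that endpoint request looks like, the snippet below builds the keyword arguments that boto3's EC2 `create_vpc_endpoint` call accepts. The VPC, subnet, and security group IDs are placeholders, and the helper name `bedrock_endpoint_params` is our own. bedrock-agent-runtime is included because Knowledge Base retrieval calls are served there.

```python
def bedrock_endpoint_params(region, vpc_id, subnet_ids, security_group_ids):
    """Build one create_vpc_endpoint argument set per Bedrock data-plane service."""
    services = ["bedrock-runtime", "bedrock-agent-runtime"]
    return [
        {
            "VpcEndpointType": "Interface",
            "VpcId": vpc_id,
            "ServiceName": f"com.amazonaws.{region}.{service}",
            "SubnetIds": list(subnet_ids),
            "SecurityGroupIds": list(security_group_ids),
            # Private DNS lets the default SDK hostname resolve to the endpoint.
            "PrivateDnsEnabled": True,
        }
        for service in services
    ]


# Placeholder identifiers, for illustration only.
params = bedrock_endpoint_params(
    "us-east-1", "vpc-0example", ["subnet-0a", "subnet-0b"], ["sg-0example"]
)
for p in params:
    print(p["ServiceName"])
```

Each dictionary could then be passed as `ec2.create_vpc_endpoint(**p)`. Building the arguments separately from the call keeps them easy to review in a pull request.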

What teams miss is the IAM condition key. Without aws:SourceVpce in the role policy, an exfiltrated credential can still call Bedrock from anywhere — the PrivateLink endpoint is just a faster, cheaper path, not a perimeter.

How do you enforce VPC-only Bedrock access? Attach an IAM policy condition that requires aws:SourceVpce to match your interface endpoint ID on every bedrock:InvokeModel and bedrock:Retrieve action. Pair it with an SCP at the organization level so no role in any account can call Bedrock from outside your designated endpoint, even with a leaked key.
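
A minimal sketch of that condition, assuming a single placeholder endpoint ID: the function below produces a Deny statement that fires whenever a Bedrock call does not arrive through one of the listed endpoints. The same statement shape can serve as an identity policy or as the body of an SCP. The function name and the endpoint ID are illustrative.

```python
import json


def deny_bedrock_outside_vpce(vpce_ids):
    """Return a policy that denies Bedrock data-plane calls not made via the given endpoints."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyBedrockOutsideVpce",
                "Effect": "Deny",
                "Action": [
                    "bedrock:InvokeModel",
                    "bedrock:InvokeModelWithResponseStream",
                    "bedrock:Retrieve",
                    "bedrock:RetrieveAndGenerate",
                ],
                "Resource": "*",
                # Also matches when the key is absent, i.e. the call did not use any endpoint.
                "Condition": {"StringNotEquals": {"aws:SourceVpce": list(vpce_ids)}},
            }
        ],
    }


policy = deny_bedrock_outside_vpce(["vpce-0example"])  # placeholder endpoint ID
print(json.dumps(policy, indent=2))
```

An explicit Deny is used rather than a conditional Allow because a Deny still holds if someone later attaches a broader Allow to the same role.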

The Vector Store Choice Is The Sovereignty Decision

This is where most production decisions get made by accident. Teams pick Pinecone or a managed vector SaaS because the developer experience is excellent — and then discover at audit time that they have built a data-residency dependency on a third party.

For Bedrock Knowledge Bases, AWS supports OpenSearch Serverless, Aurora PostgreSQL with pgvector, Pinecone, Redis Enterprise Cloud, and MongoDB Atlas. Only the first two keep retrieval traffic inside your AWS account boundary without an additional BAA or data-processing agreement.

OpenSearch Serverless with a VPC access policy is the default we recommend when sovereignty is non-negotiable. Aurora pgvector wins when the team already operates Postgres and wants a single backup, encryption, and IAM story across operational and vector data.
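
For the OpenSearch Serverless option, the VPC restriction is expressed as a network policy document. The sketch below builds one for a hypothetical collection named `rag-index` and a placeholder endpoint ID. Note that the IDs here are assumed to be OpenSearch Serverless VPC endpoints, and that a Bedrock Knowledge Base may need its own allowance to reach a private collection, which this sketch leaves out.

```python
import json


def aoss_network_policy(collection_name, vpce_ids):
    """Build a network policy that blocks public access and admits only the listed VPC endpoints."""
    return [
        {
            "Description": f"VPC-only access to {collection_name}",
            "Rules": [
                {
                    "ResourceType": "collection",
                    "Resource": [f"collection/{collection_name}"],
                }
            ],
            "AllowFromPublic": False,
            "SourceVPCEs": list(vpce_ids),
        }
    ]


# Hypothetical collection name and endpoint ID.
network_policy = aoss_network_policy("rag-index", ["vpce-0example"])
policy_document = json.dumps(network_policy)
print(policy_document)
```

The serialized string is what boto3's `opensearchserverless` client expects in the `policy` argument of `create_security_policy` with `type="network"`.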

KMS, Customer-Managed Keys, And The Two-Key Pattern

Encryption at rest is a settled question — every Bedrock-supported vector store offers it. The interesting decision is whether you use AWS-managed keys or customer-managed keys, and whether you separate the key that protects the index from the key that protects the source documents in S3.

For regulated workloads, the two-key pattern is the right default. One CMK protects the S3 bucket holding the source corpus, a second CMK protects the OpenSearch Serverless collection or the Aurora cluster, and the IAM roles for ingestion and retrieval are scoped to only the keys they need.

Why use separate KMS keys for source documents and the vector index? Separating the keys means a compromised retrieval role cannot decrypt the original PDF or transcript — only the embedded chunks it is authorized to read. The blast radius of a leaked retrieval credential drops from your entire knowledge corpus to whatever subset has already been embedded and indexed.
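
The role-side half of the two-key pattern can be sketched as two small policies. The key ARNs below are placeholders, and how each store actually consumes its key (grants, service principals) varies, so treat this as an illustration of the scoping idea rather than a complete key policy.

```python
import json


def kms_statement(sid, key_arn, actions):
    """One Allow statement scoped to a single KMS key."""
    return {"Sid": sid, "Effect": "Allow", "Action": list(actions), "Resource": [key_arn]}


def two_key_policies(source_key_arn, index_key_arn):
    """Ingestion may use both keys; retrieval may only decrypt with the index key."""
    ingestion = {
        "Version": "2012-10-17",
        "Statement": [
            kms_statement("ReadSourceCorpus", source_key_arn, ["kms:Decrypt"]),
            kms_statement("WriteVectorIndex", index_key_arn,
                          ["kms:Decrypt", "kms:GenerateDataKey"]),
        ],
    }
    retrieval = {
        "Version": "2012-10-17",
        "Statement": [kms_statement("ReadVectorIndex", index_key_arn, ["kms:Decrypt"])],
    }
    return {"ingestion": ingestion, "retrieval": retrieval}


# Placeholder key ARNs.
SOURCE_KEY = "arn:aws:kms:us-east-1:111122223333:key/source-example"
INDEX_KEY = "arn:aws:kms:us-east-1:111122223333:key/index-example"
policies = two_key_policies(SOURCE_KEY, INDEX_KEY)
print(json.dumps(policies["retrieval"], indent=2))
```

The property worth checking in review is simple: the source key ARN should not appear anywhere in the retrieval role's policy.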

The CloudTrail Story Your Auditor Actually Wants

Sovereignty is not just where the bytes travel; it is whether you can prove, after the fact, what was retrieved and by whom. Bedrock records Retrieve and RetrieveAndGenerate calls as CloudTrail data events, which are off by default, while InvokeModel is recorded as a management event.

Turn them on. Route them to a dedicated CloudTrail trail with its own KMS key and S3 bucket, and keep the retention window aligned with whatever data class your corpus falls under — typically six years for healthcare-adjacent content, seven for financial.

This is the same observability discipline we describe in agent observability, but applied one layer down at the model invocation surface. The audit trail your security team will ask for is reconstructed from these events plus the application-level trace IDs your agent emits at every tool call.
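
A sketch of the trail configuration, assuming Knowledge Base retrieval is logged under the CloudTrail resource type `AWS::Bedrock::KnowledgeBase` (worth confirming against the current CloudTrail documentation for Bedrock). The bucket ARN is a placeholder and the function name is our own.

```python
def retrieval_data_event_selectors(source_bucket_arn):
    """Advanced event selectors for Knowledge Base retrieval and source-bucket object access."""
    return [
        {
            "Name": "BedrockKnowledgeBaseDataEvents",
            "FieldSelectors": [
                {"Field": "eventCategory", "Equals": ["Data"]},
                {"Field": "resources.type", "Equals": ["AWS::Bedrock::KnowledgeBase"]},
            ],
        },
        {
            "Name": "SourceBucketObjectEvents",
            "FieldSelectors": [
                {"Field": "eventCategory", "Equals": ["Data"]},
                {"Field": "resources.type", "Equals": ["AWS::S3::Object"]},
                # Limit S3 data events to the corpus bucket to keep log volume down.
                {"Field": "resources.ARN", "StartsWith": [f"{source_bucket_arn}/"]},
            ],
        },
    ]


selectors = retrieval_data_event_selectors("arn:aws:s3:::example-rag-corpus")
for selector in selectors:
    print(selector["Name"])
```

The list is shaped for the `AdvancedEventSelectors` argument of boto3's CloudTrail `put_event_selectors` call on the dedicated trail.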

Where Sovereignty Breaks In Practice

The failure modes are boringly consistent across the deployments we audit. None of them are model problems.

Failure mode one: The vector store is in a managed SaaS, retrieval traffic exits the account, and nobody noticed because the answer quality is fine.

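Failure mode two: PrivateLink is configured for bedrock-runtime in the application VPC, but the ingestion pipeline calls the embeddings model from a network that does not route through that endpoint, so document ingestion still uses the public path.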

Failure mode three: CloudTrail data events are off, so the audit log has model invocations but no record of the knowledge base retrieval calls behind them.

Failure mode four: The S3 source bucket policy allows access from any role in the account, not just the Bedrock service role, so the audit boundary is wider than the architecture diagram suggests.

Each one is a five-line fix once you know to look. None of them are caught by the default AWS Config rules.

Multi-Region And Residency: The Question Behind The Question

For US-only workloads, Bedrock in us-east-1 or us-west-2 with the architecture above is the end of the conversation. For EU residency or cross-border restrictions, the question gets harder because not every Bedrock model is available in every region.

Claude 3.5 Sonnet, for example, is available in eu-central-1 as of 2026, but model-version pinning matters — when AWS deprecates a snapshot, your retrieval architecture survives but your model identity does not. Plan the pinning at the orchestration layer, not at the application layer, which is the same argument we make in Bedrock agent patterns.

Can Bedrock RAG meet EU data residency requirements? Yes, when the foundation model, the Knowledge Base, the vector store (OpenSearch Serverless or Aurora pgvector), the source S3 bucket, and the CloudTrail trail are all provisioned in an EU region, and the IAM policies forbid cross-region invocation. The constraint is which models are available in your target region — confirm before architecting around a specific provider.
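
One way to express the "forbid cross-region invocation" part is a Deny on the `aws:RequestedRegion` condition key. This is a minimal sketch; the region list is an example, not a recommendation, and the function name is illustrative.

```python
import json


def deny_bedrock_outside_regions(allowed_regions):
    """Return a policy denying every Bedrock action requested in a region outside the allowed list."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyBedrockOutsideAllowedRegions",
                "Effect": "Deny",
                "Action": "bedrock:*",
                "Resource": "*",
                "Condition": {
                    "StringNotEquals": {"aws:RequestedRegion": list(allowed_regions)}
                },
            }
        ],
    }


region_policy = deny_bedrock_outside_regions(["eu-central-1", "eu-west-1"])
print(json.dumps(region_policy, indent=2))
```

Attached as an SCP, the same document applies to every role in the affected accounts, which is usually what a residency review wants to see.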

The Validation Layer That Closes The Loop

Sovereignty controls tell you where the data lives. They do not tell you whether the retrieval is returning the right chunks, or whether a prompt-injection attempt smuggled an instruction past your filters.

That is a separate problem — covered in AI retrieval blind spots — but it sits adjacent to sovereignty in every architecture review. The reader who asks one question is about to ask the other.

Does keeping Bedrock RAG inside the VPC eliminate prompt-injection risk? No. VPC isolation protects the network path and the audit boundary, but it does nothing about adversarial content inside the corpus itself. A poisoned chunk inside your own private OpenSearch index is still a poisoned chunk — sovereignty and content validation are independent controls and both are required.

A Reference Architecture That Survives Review

For teams scoping their first deployment, the configuration below is the one we keep returning to. It is opinionated on purpose.

  • Application VPC. Private subnets only for any service that touches retrieval. NAT gateway exists for OS updates, not for AWS API calls.
  • VPC Interface Endpoints. bedrock-runtime, bedrock-agent-runtime, and aoss (OpenSearch Serverless). Gateway endpoint for S3.
  • Vector store. OpenSearch Serverless collection with VPC access policy restricting ingress to the application's security group. KMS CMK distinct from the S3 source key.
  • Source corpus. S3 bucket with a bucket policy denying any request where aws:SourceVpce does not match the S3 gateway endpoint ID. Block Public Access enabled at the account level.
  • IAM. Separate roles for ingestion (write to index, read from S3) and retrieval (read from index, no S3 access). Both scoped to the VPC endpoint via aws:SourceVpce.
  • CloudTrail. Data events enabled for Bedrock Retrieve and RetrieveAndGenerate and for the S3 source bucket, alongside the management events that record InvokeModel. Trail encrypted with its own CMK.
  • SCP. Organization-level service control policy denying any Bedrock action outside the designated VPC endpoints.

This is the shape that passes review the first time. Variations exist (Aurora pgvector instead of OpenSearch, KMS XKS for HSM-backed keys, AWS PrivateLink for cross-account models), but the seven primitives above are the ones that determine whether the architecture is defensible.
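
The source-corpus item in the list above can be sketched as a bucket policy. The bucket name and endpoint ID are placeholders. Be aware that a blanket Deny like this also blocks console access and any AWS service that reads the bucket from outside your endpoint, so exceptions for those paths are something to design deliberately rather than discover in production.

```python
import json


def source_bucket_policy(bucket_name, vpce_id):
    """Deny all S3 access to the corpus bucket unless it arrives through the given VPC endpoint."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyAccessOutsideVpce",
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both bucket-level and object-level operations.
                "Resource": [bucket_arn, f"{bucket_arn}/*"],
                "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
            }
        ],
    }


bucket_policy = source_bucket_policy("example-rag-corpus", "vpce-0example")
print(json.dumps(bucket_policy, indent=2))
```

The JSON string form is what boto3's S3 `put_bucket_policy` call takes in its `Policy` argument.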

What This Costs You, Honestly

VPC Interface Endpoints run roughly $7 to $10 per endpoint per Availability Zone per month plus per-GB processing charges. For a three-endpoint setup in a single AZ, expect $21 to $30 monthly in endpoint costs before traffic, and proportionally more for each additional AZ.

OpenSearch Serverless has a two-OCU minimum for indexing and another two for search, which sets a floor around $700 to $900 monthly per collection at the time of writing. Aurora pgvector on a db.r6g.large with Multi-AZ runs closer to $400 monthly — but only if you already operate Postgres at that scale.

The honest answer is that sovereignty costs between $500 and $1,500 monthly above a non-sovereign baseline, depending on store choice. Most regulated workloads consider that line item rounding error against the cost of a single failed audit.
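
To make the arithmetic explicit, the sketch below multiplies the per-endpoint figure by three endpoints and adds the store floor, using only the estimates quoted in this section. It leaves out per-GB processing, KMS keys, and CloudTrail storage, so real bills will land above these floors.

```python
def monthly_floor(endpoint_count, per_endpoint=(7, 10), store=(700, 900)):
    """Return (low, high) monthly USD for endpoints plus vector store, before traffic."""
    low = endpoint_count * per_endpoint[0] + store[0]
    high = endpoint_count * per_endpoint[1] + store[1]
    return low, high


# OpenSearch Serverless floor with three interface endpoints.
opensearch_range = monthly_floor(3)
# Aurora pgvector at the single quoted figure of $400.
aurora_range = monthly_floor(3, store=(400, 400))

print(opensearch_range)  # (721, 930)
print(aurora_range)      # (421, 430)
```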

Frequently Asked Questions

Does Amazon Bedrock train on the data I send through RetrieveAndGenerate?

No. Per the AWS Bedrock service terms, prompts and completions are not used to train the underlying foundation models or shared with model providers. This is the contractual basis for the sovereignty story but it is independent of network architecture.

Can I use Pinecone or another third-party vector store and still call the deployment sovereign?

Only if you have a signed data processing agreement that satisfies your specific compliance regime, and you have explicitly classified the embedded chunks as out-of-scope from your residency requirements. For most regulated workloads, the cleaner answer is to keep the vector store in-account.

How does this change for Bedrock Agents versus raw Knowledge Bases?

Bedrock Agents add bedrock-agent-runtime as a separate interface endpoint and introduce action-group Lambdas that need their own VPC configuration. The sovereignty story is the same shape with one additional boundary — the Lambda execution environment — that needs to be in a private subnet with no public egress.

Is OpenSearch Serverless the right default if my team has no OpenSearch experience?

Probably not. The operational story is simpler than self-managed OpenSearch, but it still has its own scaling, sharding, and cost-tuning surface. Teams already running Aurora typically find pgvector a faster path to a defensible architecture, even if the per-query performance is slightly behind.

What is the smallest thing I can do today to improve my Bedrock RAG sovereignty posture?

Turn on CloudTrail data events for Bedrock Retrieve and RetrieveAndGenerate, confirm that management events are capturing InvokeModel, and route them to a dedicated trail. It costs almost nothing, takes ten minutes, and is the single artifact your auditor will ask for first.

Where iSimplifyMe Comes In

If you are scoping a Bedrock RAG deployment and want a second set of eyes on the sovereignty architecture before you commit, the team at iSimplifyMe builds and operates production agent systems on AWS every week. We map the four trust boundaries against your specific corpus, name the failure modes you are about to hit, and leave you with a deployable reference architecture and a CloudTrail query pack your security team can run on day one.

The sovereignty conversation does not get easier the longer you postpone it. Get the boundaries right at scoping time and the rest of the agent rollout — orchestration, observability, governance — slots into a foundation that survives review.
