Deploying a RAG Chatbot on AWS with Terraform

Deploying a RAG Chatbot on AWS with Terraform

March 27, 2026

Deploying a RAG Chatbot on AWS with Terraform

Most people talk about RAG like the model is the system.

It isn’t.

The real system is everything around it: networking, private connectivity, containers, IAM, databases, deployment flows, and the boring infrastructure details that decide whether your app actually works in production.

This guide takes the original AWS + Terraform walkthrough and refines it into a clearer, sharper, SpaceSlam-style engineering breakdown.

Instead of just repeating Terraform blocks, we focus on why each piece exists, how the architecture fits together, and what matters if you want to build something real.


#What We Are Actually Building

aws_architecture
aws_architecture
We want to reproduce an end-to-end AWS setup for a RAG chatbot using Terraform.

That includes:

  • a custom VPC
  • public and private subnets
  • an Internet Gateway and route tables
  • Security Groups
  • an EC2 Bastion Host
  • an Amazon DocumentDB cluster
  • AWS App Runner for backend and frontend services
  • a VPC Connector so App Runner can reach private resources
  • a Bedrock interface VPC endpoint for private model access

The point is not just “infrastructure as code.”

The point is this: you want the whole system to be reproducible, reviewable, secure, and easy to evolve.

That is what Terraform gives you when used properly.


#High-Level Architecture

Here is the mental model:

  • The frontend is public.
  • The backend is deployed with App Runner.
  • The database lives in private subnets.
  • A bastion host gives controlled admin access.
  • Bedrock is reached privately through a VPC endpoint.
  • Terraform wires all of it together.

So the real story is not “deploying a chatbot.”

It is: building a production-shaped AI system where the app, the data plane, and the model access path are all explicitly designed.


#Repository Structure

A clean Terraform repo usually separates concerns by resource type.

.
├── provider.tf
├── variables.tf
├── terraform.tfvars
├── vpc.tf
├── security_groups.tf
├── documentdb.tf
├── data.tf
├── bastion_host.tf
├── app_runner_back.tf
└── outputs.tf

This matters because Terraform files are not just “files.” They are your infrastructure boundaries.

A structure like this makes it easier to reason about:

  • networking
  • security
  • compute
  • database
  • deployment
  • outputs

That becomes much more valuable as the project grows.


#Terraform Basics You Actually Need

Before touching the AWS resources, know these commands:

#terraform init

Initializes the working directory. It downloads providers and prepares Terraform state handling.

#terraform plan

Shows what Terraform is going to create, update, or destroy. This is your chance to catch mistakes before AWS catches your wallet.

#terraform apply

Executes the plan and provisions the resources.

#Terraform State

Terraform keeps track of what it created through a state file, usually terraform.tfstate.

That state is critical because Terraform does not “guess” infrastructure. It compares:

  • what your code says should exist
  • what the state says already exists

That comparison is what drives the plan.

#Resource Referencing

Terraform resources are referenced like this:

aws_vpc.chatbot_vpc.id

That means:

  • resource type: aws_vpc
  • local resource name: chatbot_vpc
  • attribute: id

#Variables

Variables are typically:

  • declared in variables.tf
  • assigned in terraform.tfvars
  • referenced as var.some_name

That lets you separate logic from environment-specific values.


1. VPC Network

This is the foundation.

If your network design is wrong, the rest of the architecture becomes a pile of workarounds.

#Create a Custom VPC

resource "aws_vpc" "chatbot_vpc" {
  cidr_block           = "172.16.0.0/16"
  instance_tenancy     = "default"
  enable_dns_support   = true
  enable_dns_hostnames = true

  tags = {
    Name = "terraform_chatbot_vpc"
  }
}

#Why it exists

A VPC is your private network boundary in AWS.

The CIDR block 172.16.0.0/16 defines the IP address space for everything inside that network.

#What matters

  • enable_dns_support = true lets resources resolve DNS names.
  • enable_dns_hostnames = true helps AWS-managed services and instances work more smoothly.
  • tagging is not optional if you want sane operations later.

#Public Subnet

resource "aws_subnet" "public_subnet_1" {
  vpc_id                  = aws_vpc.chatbot_vpc.id
  cidr_block              = "172.16.1.0/24"
  map_public_ip_on_launch = true
  availability_zone       = "eu-central-1a"

  tags = {
    Name = "terraform_chatbot_subnet_public1"
  }
}

#Why it exists

This is where resources that need direct internet reachability can live.

In this architecture, that mainly means the bastion host.

#Key detail

map_public_ip_on_launch = true means EC2 instances here automatically get public IPs.


#Private Subnets

resource "aws_subnet" "private_subnet_1" {
  vpc_id                  = aws_vpc.chatbot_vpc.id
  cidr_block              = "172.16.2.0/24"
  map_public_ip_on_launch = false
  availability_zone       = "eu-central-1a"

  tags = {
    Name = "terraform_chatbot_subnet_private1"
  }
}

resource "aws_subnet" "private_subnet_2" {
  vpc_id                  = aws_vpc.chatbot_vpc.id
  cidr_block              = "172.16.3.0/24"
  map_public_ip_on_launch = false
  availability_zone       = "eu-central-1b"

  tags = {
    Name = "terraform_chatbot_subnet_private2"
  }
}

#Why they exist

Private subnets are where you place resources that should not be directly exposed to the internet.

Here, that includes:

  • DocumentDB
  • App Runner private networking path
  • Bedrock interface endpoint

#Why two subnets matter

Using subnets across different availability zones improves availability and is often required by managed services.


#Internet Gateway

resource "aws_internet_gateway" "internet_gw" {
  vpc_id = aws_vpc.chatbot_vpc.id

  tags = {
    Name = "terraform_internet_gw"
  }
}

#Why it exists

Without an Internet Gateway, your public subnet is not really public.

It gives resources in public subnets a path to the internet.


#Route Tables

#Public Route Table

resource "aws_route_table" "public_route_table" {
  vpc_id = aws_vpc.chatbot_vpc.id

  route {
    cidr_block = "0.0.0.0/0"
    gateway_id = aws_internet_gateway.internet_gw.id
  }

  tags = {
    Name = "terraform_public_rtb"
  }
}

This sends all outbound traffic to the Internet Gateway.

#Private Route Table

resource "aws_route_table" "private_route_table" {
  vpc_id = aws_vpc.chatbot_vpc.id

  tags = {
    Name = "terraform_private_rtb"
  }
}

No default internet route means the private subnet stays private.

That is exactly what you want for database infrastructure.


#Route Table Associations

resource "aws_route_table_association" "public_assoc" {
  subnet_id      = aws_subnet.public_subnet_1.id
  route_table_id = aws_route_table.public_route_table.id
}

resource "aws_route_table_association" "private_assoc_1" {
  subnet_id      = aws_subnet.private_subnet_1.id
  route_table_id = aws_route_table.private_route_table.id
}

resource "aws_route_table_association" "private_assoc_2" {
  subnet_id      = aws_subnet.private_subnet_2.id
  route_table_id = aws_route_table.private_route_table.id
}

This is the final wiring step.

Subnets do not magically inherit the right routes. You explicitly attach them.


2. Security Groups

Now we define who is allowed to talk to what.

Security groups are not just firewall rules. They are part of the system design.

#DocumentDB Security Group

resource "aws_security_group" "documentdb_sg" {
  vpc_id      = aws_vpc.chatbot_vpc.id
  name        = "documentdb_security_group"
  description = "Security group for DocumentDB"

  ingress {
    from_port       = 27017
    to_port         = 27017
    protocol        = "tcp"
    security_groups = [aws_security_group.bastion_sg.id]
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "terraform_documentdb_security_group"
  }
}

#Why this is good

Instead of allowing access from a broad IP range, it allows access only from a trusted security group.

That is a much better pattern than throwing CIDRs everywhere.

#Port 27017

That is the MongoDB-compatible port used by DocumentDB.


#Bastion Host Security Group

resource "aws_security_group" "bastion_sg" {
  vpc_id      = aws_vpc.chatbot_vpc.id
  name        = "bastion_security_group"
  description = "Security group for Bastion host"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = var.bastion_host_cidr
  }

  egress {
    from_port   = 0
    to_port     = 0
    protocol    = "-1"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "terraform_bastion_security_group"
  }
}

#Why this matters

A bastion host is already a controlled entry point. If you open SSH to the whole world, you just created a public attack surface with a fancy name.

Restrict it to a known CIDR.


3. Provisioning Amazon DocumentDB

Now we add the database layer.

For a RAG system, this is where your application might store:

  • embeddings
  • session data
  • chat history
  • metadata

#DocumentDB Subnet Group

resource "aws_docdb_subnet_group" "terraform_docdb_subnet_group" {
  name_prefix = "terraform_document_db_subnet_group"
  subnet_ids  = [
    aws_subnet.private_subnet_1.id,
    aws_subnet.private_subnet_2.id
  ]
}

#Why it exists

Managed databases need to know where they are allowed to launch. This subnet group tells DocumentDB to stay inside the private subnets.


#DocumentDB Cluster

resource "aws_docdb_cluster" "terraform_docdb" {
  cluster_identifier_prefix       = "terraform-docdb-cluster"
  db_subnet_group_name            = aws_docdb_subnet_group.terraform_docdb_subnet_group.name
  vpc_security_group_ids          = [aws_security_group.documentdb_sg.id]
  engine                          = "docdb"
  engine_version                  = "5.0.0"
  master_username                 = var.master_username
  master_password                 = var.master_password
  storage_encrypted               = true
  skip_final_snapshot             = true
}

#What matters here

  • it lives in the private subnet group
  • it uses the locked-down DocumentDB security group
  • storage is encrypted
  • credentials come from variables

#Important production note

Using raw Terraform variables for secrets is acceptable for demos. For production, use a more secure pattern such as:

  • AWS Secrets Manager
  • SSM Parameter Store
  • external secret injection via CI/CD

Also, skip_final_snapshot = true is fine for experimentation, not for serious environments.


#DocumentDB Cluster Instance

resource "aws_docdb_cluster_instance" "terraform_cluster_instances" {
  identifier         = "terraform-docdb-cluster-instance"
  cluster_identifier = aws_docdb_cluster.terraform_docdb.id
  instance_class     = "db.t3.medium"
}

The cluster defines the database control plane. The cluster instance is the actual compute capacity running it.


4. Bastion Host (EC2)

The bastion host is your controlled admin bridge into the private side of the system.

resource "aws_instance" "terraform_bastion_host" {
  ami                    = data.aws_ami.ubuntu_ami.id
  instance_type          = "t2.small"
  key_name               = aws_key_pair.ssh_bastion.key_name
  vpc_security_group_ids = [aws_security_group.bastion_sg.id]
  subnet_id              = aws_subnet.public_subnet_1.id

  user_data = <<-EOF
    #!/bin/bash
    apt-get update
    apt-get install -y gnupg curl wget
  EOF

  tags = {
    Name = "Terraform Bastion Host"
  }
}

#Why it exists

Because your DocumentDB cluster is private. You do not SSH into the database. You SSH into the bastion, then access private resources from there.

That is the correct pattern.


#SSH Key Pair

resource "aws_key_pair" "ssh_bastion" {
  key_name   = "ssh-key"
  public_key = file("id_bastion_bayard.pub")
}

#Important rule

Never commit private keys. Ever.


#Dynamic AMI Selection

Hardcoding an AMI ID is annoying and brittle. Use a data source instead.

data "aws_ami" "ubuntu_ami" {
  most_recent = true
  owners      = ["099720109477"]

  filter {
    name   = "architecture"
    values = ["x86_64"]
  }

  filter {
    name   = "root-device-type"
    values = ["ebs"]
  }

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  filter {
    name   = "name"
    values = ["ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-*"]
  }

  filter {
    name   = "state"
    values = ["available"]
  }
}

#Why this is better

This keeps the bastion image current without editing Terraform every time AMIs change.

That is exactly the kind of small decision that separates “toy infra” from something maintainable.


5. App Runner, ECR, and IAM

Now we move from infrastructure foundation to application deployment.

The application has two parts:

  • backend
  • frontend

The images live in ECR. The services run in App Runner.

#ECR Image URIs

You will reference image URIs like:

<account-id>.dkr.ecr.<region>.amazonaws.com/chatbot-backend:latest
<account-id>.dkr.ecr.<region>.amazonaws.com/chatbot-frontend:latest

Those are what App Runner pulls when it deploys.


#IAM Role for App Runner

App Runner needs an IAM role so it can:

  • pull images from ECR
  • access AWS services at runtime
  • possibly call Bedrock

#Trust Policy

resource "aws_iam_role" "app_runner_build_role" {
  name = "app-runner-build-role"

  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [
      {
        Effect = "Allow"
        Principal = {
          Service = [
            "build.apprunner.amazonaws.com",
            "tasks.apprunner.amazonaws.com"
          ]
        }
        Action = "sts:AssumeRole"
      }
    ]
  })
}

#Why this matters

This does not grant permissions yet. It only says which AWS services are allowed to assume the role.

That is what a trust policy does.


#Attach Policies

#ECR Read Access

resource "aws_iam_role_policy_attachment" "ecr_readonly_attachment" {
  role       = aws_iam_role.app_runner_build_role.name
  policy_arn = "arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
}

#Bedrock / DocumentDB Notes

The original article attaches broad managed policies like AmazonDocDBFullAccess and AmazonBedrockFullAccess. That works for demos, but it is too broad for good production security.

A better practice is:

  • minimal ECR pull permissions
  • narrow Bedrock invocation permissions
  • avoid unnecessary full-access managed policies
  • prefer application-level auth and network control for database access

This is one of the biggest differences between “it deploys” and “it is production-grade.”


6. App Runner Networking: Security Groups and VPC Connector

This is where the architecture becomes interesting.

By default, App Runner is not magically inside your private VPC world. If your database is private, you need a VPC connector.

#App Runner Security Group

resource "aws_security_group" "apprunner_sg" {
  name_prefix = "terraform-apprunner-sg-"
  vpc_id      = aws_vpc.chatbot_vpc.id

  egress {
    from_port   = 443
    to_port     = 443
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags = {
    Name = "terraform_apprunner_security_group"
  }
}

This allows HTTPS egress, which is needed for service communication such as Bedrock access.


#DocumentDB ↔ App Runner Rules

resource "aws_security_group_rule" "documentdb_to_apprunner" {
  type                     = "ingress"
  from_port                = 27017
  to_port                  = 27017
  protocol                 = "tcp"
  security_group_id        = aws_security_group.documentdb_sg.id
  source_security_group_id = aws_security_group.apprunner_sg.id
}

resource "aws_security_group_rule" "apprunner_to_documentdb" {
  type                     = "egress"
  from_port                = 27017
  to_port                  = 27017
  protocol                 = "tcp"
  security_group_id        = aws_security_group.apprunner_sg.id
  source_security_group_id = aws_security_group.documentdb_sg.id
}

#Why this is important

This is explicit private communication.

The backend can reach the database. The database only trusts the backend security group.

That is exactly the pattern you want.


#VPC Connector

resource "aws_apprunner_vpc_connector" "connector" {
  vpc_connector_name = "terraform-app-runner-connector"

  subnets = [
    aws_subnet.private_subnet_1.id,
    aws_subnet.private_subnet_2.id
  ]

  security_groups = [
    aws_security_group.apprunner_sg.id
  ]
}

#Why it exists

Without the VPC connector, App Runner cannot privately reach resources in your VPC.

This is the bridge between your managed container service and your internal data plane.


7. Deploying the Backend with App Runner

Now we define the backend service itself.

resource "aws_apprunner_service" "terraform_app_runner_backend" {
  service_name = "terraform-app-runner-backend"

  source_configuration {
    image_repository {
      image_configuration {
        port = "8000"
        runtime_environment_variables = {
          MONGODB_HOST     = aws_docdb_cluster.terraform_docdb.endpoint
          MONGODB_PORT     = var.docdb_port
          MONGODB_PASSWORD = var.docdb_password
          MONGODB_USERNAME = var.docdb_username
        }
      }
      image_identifier      = var.image_url
      image_repository_type = "ECR"
    }

    authentication_configuration {
      access_role_arn = aws_iam_role.app_runner_build_role.arn
    }

    auto_deployments_enabled = true
  }

  network_configuration {
    ingress_configuration {
      is_publicly_accessible = true
    }

    egress_configuration {
      egress_type       = "VPC"
      vpc_connector_arn = aws_apprunner_vpc_connector.connector.arn
    }
  }

  instance_configuration {
    cpu               = "1 vCPU"
    memory            = "2 GB"
    instance_role_arn = aws_iam_role.app_runner_build_role.arn
  }

  tags = {
    Name = "terraform-apprunner-service-backend"
  }
}

#What this really does

  • pulls your backend image from ECR
  • exposes port 8000
  • injects runtime environment variables
  • enables automatic redeploy on image updates
  • sends outbound traffic through the VPC connector
  • gives the service an IAM role at runtime

#Important idea

This is not “just deployment.” This is where your application runtime becomes part of your infrastructure model.


8. Private Bedrock Access with a VPC Endpoint

If your backend needs Bedrock, you do not want model calls bouncing through the public internet if you can avoid it.

Use an interface VPC endpoint.

#Bedrock Endpoint Security Group

resource "aws_security_group" "bedrock_endpoint_sg" {
  name_prefix = "bedrock-endpoint-sg-"
  vpc_id      = aws_vpc.chatbot_vpc.id

  ingress {
    from_port       = 443
    to_port         = 443
    protocol        = "tcp"
    security_groups = [aws_security_group.apprunner_sg.id]
  }

  tags = {
    Name = "Bedrock Endpoint Security Group"
  }
}

Only the App Runner service can talk to the endpoint over HTTPS.


#Bedrock Interface Endpoint

resource "aws_vpc_endpoint" "terraform_bedrock_endpoint" {
  vpc_id            = aws_vpc.chatbot_vpc.id
  service_name      = "com.amazonaws.eu-central-1.bedrock-runtime"
  vpc_endpoint_type = "Interface"

  security_group_ids = [
    aws_security_group.bedrock_endpoint_sg.id
  ]

  private_dns_enabled = true

  subnet_ids = [
    aws_subnet.private_subnet_1.id,
    aws_subnet.private_subnet_2.id
  ]
}

#Why this is powerful

With private DNS enabled, your backend can resolve the normal Bedrock endpoint name and still stay inside private AWS networking.

That is cleaner and more secure than exposing unnecessary public paths.


#Endpoint Policy

resource "aws_vpc_endpoint_policy" "terraform_bedrock_endpoint_policy" {
  vpc_endpoint_id = aws_vpc_endpoint.terraform_bedrock_endpoint.id
  policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Effect = "Allow",
        Principal = "*",
        Action = [
          "bedrock:InvokeModel",
          "bedrock:InvokeModelWithResponseStream"
        ],
        Resource = "*"
      }
    ]
  })
}

#Production note

This is okay for a learning setup. In a real environment, you would tighten both the principals and the allowed resources.


9. What This Architecture Gets Right

This setup is strong because it separates concerns well.

#1. The database is private

That is the correct default.

#2. Admin access is controlled

The bastion host is a deliberate entry point, not accidental exposure.

#3. The app runtime is managed

App Runner gives you deployment convenience without manually managing EC2 for the app layer.

#4. Model access is treated like infrastructure

Bedrock is not just “an API call.” It is part of the network and security design.

#5. Terraform makes the whole thing reproducible

That means:

  • easier collaboration
  • easier review
  • easier environment replication
  • less console drift

10. What I Would Improve Further

If this were evolving from a demo into a stronger production system, I would improve these areas first.

#Secrets Management

Do not keep long-lived database credentials in plain Terraform variables if you can avoid it. Use Secrets Manager or Parameter Store.

#IAM Tightening

Avoid broad managed policies like full access unless absolutely necessary. Use least privilege.

#Remote State

Do not keep Terraform state only locally for collaborative environments. Use remote state with locking, such as:

  • S3 for state storage
  • DynamoDB for state locking

#Observability

Add logs, metrics, and tracing around:

  • backend latency
  • Bedrock calls
  • database performance
  • deployment events

#Environment Separation

Create separate environments for:

  • dev
  • staging
  • production

And keep variables, state, and naming consistent across them.

#CI/CD

Auto-deploy from ECR is useful, but you usually want a clearer pipeline that controls:

  • build
  • test
  • scan
  • release
  • promotion across environments

11. The Real Lesson

The biggest lesson from this architecture is simple:

a RAG app is not a prompt plus a model.

It is a system.

And once you treat it like a system, you start caring about the things that actually make AI applications real:

  • secure data access
  • network boundaries
  • deployment flows
  • secrets
  • observability
  • reproducibility
  • controlled model connectivity

That is the difference between AI demos and AI engineering.


Final Wrap-Up

This Terraform setup is a solid example of how to move from manual AWS clicks to a repeatable, infrastructure-as-code workflow for a RAG chatbot.

At a high level, it gives you:

  • a private database layer
  • a controlled operational access path
  • managed backend deployment
  • private connectivity to Bedrock
  • reproducible AWS infrastructure in code

And that is exactly the kind of architecture worth studying.

Because the model is only one part.

The system is the product.


#Takeaway

If you want to build real AI products, study this stack less like “Terraform syntax” and more like “system boundaries.”

Ask yourself:

  • What is public?
  • What must stay private?
  • Who is allowed to talk to what?
  • Where do secrets live?
  • How does the app reach the model?
  • How do I reproduce all of this safely?

That is how you stop building AI demos.

That is how you start building AI systems.