
AWS SNS + SQS Fan-Out: Building Scalable Event-Driven Architecture in Production


Building distributed systems that scale under pressure is one of the hardest engineering challenges. When a single user action — a payment, a sign-up, an order — needs to trigger ten downstream services simultaneously, coupling them directly turns your architecture into a house of cards. AWS SNS (Simple Notification Service) combined with SQS (Simple Queue Service) is the industry-standard solution for this exact problem.

This guide goes deep: architecture, code, gotchas, and production-ready patterns.

What is AWS SNS?

AWS Simple Notification Service (SNS) is a fully managed pub/sub messaging service. Publishers send messages to a topic, and SNS instantly delivers that message to all subscribers — whether those are SQS queues, Lambda functions, HTTP endpoints, email addresses, or mobile push notifications.

Think of it as a radio broadcast: one publisher, many listeners, zero coupling.

Publisher → SNS Topic → SQS Queue A  → Consumer A (Orders)
                     → SQS Queue B  → Consumer B (Notifications)
                     → SQS Queue C  → Consumer C (Analytics)
                     → Lambda       → Real-time processing

SNS Key Concepts

Concept        Description
Topic          A named channel that receives and routes messages
Publisher      Any service that sends messages to a topic
Subscriber     An endpoint that receives messages from a topic
Message        The payload (up to 256 KB) sent through the topic
Subscription   The binding between a topic and a subscriber

What is AWS SQS?

AWS Simple Queue Service (SQS) is a fully managed message queue service. It decouples the sender from the receiver by storing messages until the consumer is ready to process them. No message is lost if your consumer crashes, scales down, or is temporarily unavailable.

Two queue types matter here:

  - Standard: at-least-once delivery, best-effort ordering, and nearly unlimited throughput — the default for most workloads.
  - FIFO: exactly-once processing and strict ordering within a message group, at lower throughput (covered in detail below).

SNS vs SQS: When to Use Which

This is one of the most Googled AWS questions — and the answer is almost always both.

Feature      SNS                           SQS
Pattern      Pub/Sub (push)                Point-to-point (pull)
Consumers    Many simultaneous             One consumer group
Durability   No persistence                Messages stored up to 14 days
Delivery     Immediate (fire and forget)   Polled by consumer
Retry        Limited                       Configurable with DLQ
Best for     Broadcasting events           Reliable async processing

The real answer: Use SNS to fan out one event to multiple SQS queues. Each queue represents an isolated consumer with its own retry logic, scaling policy, and failure boundary.

The Fan-Out Pattern Explained

The fan-out pattern is the backbone of event-driven microservices. One event in SNS triggers parallel processing across multiple SQS queues — completely decoupled from each other.

E-commerce Order Placed:

            [SNS Topic: order.created]
                        │
      ┌────────────┬────┴───────┬────────────┐
      ▼            ▼            ▼            ▼
    [SQS]        [SQS]        [SQS]      [Lambda]
  Inventory      Email      Analytics      Fraud
   Service      Service      Service     Detection

Each downstream service:

  - consumes from its own queue at its own pace
  - retries and fails independently behind its own DLQ
  - scales on its own queue depth
  - can be added or removed without touching the publisher

Setting Up SNS + SQS with Node.js (AWS SDK v3)

Prerequisites

npm install @aws-sdk/client-sns @aws-sdk/client-sqs

Creating the SNS Topic

import { SNSClient, CreateTopicCommand } from "@aws-sdk/client-sns";
 
const snsClient = new SNSClient({ region: "us-east-1" });
 
async function createOrderTopic() {
  const command = new CreateTopicCommand({
    Name: "order-created",
    Attributes: {
      // Human-readable name shown in the console and in email subjects
      DisplayName: "Order Created Events",
    },
    Tags: [
      { Key: "Environment", Value: "production" },
      { Key: "Service", Value: "orders" },
    ],
  });
 
  const response = await snsClient.send(command);
  console.log("Topic ARN:", response.TopicArn);
  return response.TopicArn;
}

Creating SQS Queues with Dead Letter Queues

Dead letter queues (DLQs) are non-negotiable in production. They catch messages that fail processing repeatedly, so you can inspect, replay, or alert on them.

import {
  SQSClient,
  CreateQueueCommand,
  GetQueueAttributesCommand,
} from "@aws-sdk/client-sqs";
 
const sqsClient = new SQSClient({ region: "us-east-1" });
 
async function createQueueWithDLQ(serviceName: string) {
  // 1. Create the Dead Letter Queue first
  const dlqCommand = new CreateQueueCommand({
    QueueName: `${serviceName}-dlq`,
    Attributes: {
      MessageRetentionPeriod: "1209600", // 14 days in seconds
    },
  });
 
  const dlqResponse = await sqsClient.send(dlqCommand);
  const dlqUrl = dlqResponse.QueueUrl!;
 
  // Get the DLQ ARN (needed for RedrivePolicy) — ask SQS for it
  // rather than string-mangling the URL
  const dlqAttrs = await sqsClient.send(
    new GetQueueAttributesCommand({
      QueueUrl: dlqUrl,
      AttributeNames: ["QueueArn"],
    })
  );
  const dlqArn = dlqAttrs.Attributes!.QueueArn;
 
  // 2. Create the main queue with a redrive policy pointing to DLQ
  const queueCommand = new CreateQueueCommand({
    QueueName: `${serviceName}-queue`,
    Attributes: {
      VisibilityTimeout: "30",         // Seconds consumer has to process
      MessageRetentionPeriod: "86400", // 1 day
      ReceiveMessageWaitTimeSeconds: "20", // Long polling (reduces costs)
      RedrivePolicy: JSON.stringify({
        deadLetterTargetArn: dlqArn,
        maxReceiveCount: 3,            // Move to DLQ after 3 failures
      }),
    },
  });
 
  const queueResponse = await sqsClient.send(queueCommand);
  return {
    queueUrl: queueResponse.QueueUrl,
    dlqUrl,
  };
}

Subscribing SQS Queues to the SNS Topic

For SNS to deliver messages to SQS, the queue needs an access policy allowing SNS to send messages to it.

import {
  SQSClient,
  SetQueueAttributesCommand,
  GetQueueAttributesCommand,
} from "@aws-sdk/client-sqs";
import { SNSClient, SubscribeCommand } from "@aws-sdk/client-sns";
 
async function subscribeQueueToTopic(
  topicArn: string,
  queueUrl: string,
  filterPolicy?: Record<string, string[]>
) {
  // 1. Get the SQS Queue ARN
  const getAttrsCommand = new GetQueueAttributesCommand({
    QueueUrl: queueUrl,
    AttributeNames: ["QueueArn"],
  });
  const attrs = await sqsClient.send(getAttrsCommand);
  const queueArn = attrs.Attributes!.QueueArn;
 
  // 2. Set the queue policy to allow SNS to send messages
  const policy = {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: { Service: "sns.amazonaws.com" },
        Action: "sqs:SendMessage",
        Resource: queueArn,
        Condition: {
          ArnEquals: { "aws:SourceArn": topicArn },
        },
      },
    ],
  };
 
  await sqsClient.send(
    new SetQueueAttributesCommand({
      QueueUrl: queueUrl,
      Attributes: { Policy: JSON.stringify(policy) },
    })
  );
 
  // 3. Create the SNS Subscription
  const subscribeCommand = new SubscribeCommand({
    TopicArn: topicArn,
    Protocol: "sqs",
    Endpoint: queueArn,
    Attributes: filterPolicy
      ? { FilterPolicy: JSON.stringify(filterPolicy) }
      : {},
  });
 
  const subscription = await snsClient.send(subscribeCommand);
  console.log("Subscription ARN:", subscription.SubscriptionArn);
  return subscription.SubscriptionArn;
}

Publishing Events to SNS

import { SNSClient, PublishCommand } from "@aws-sdk/client-sns";
 
interface OrderCreatedEvent {
  orderId: string;
  userId: string;
  totalAmount: number;
  region: string;
  items: Array<{ productId: string; quantity: number }>;
}
 
async function publishOrderCreated(
  topicArn: string,
  event: OrderCreatedEvent
) {
  const command = new PublishCommand({
    TopicArn: topicArn,
    Message: JSON.stringify(event),
    Subject: "OrderCreated",
    MessageAttributes: {
      // Used for filter policies on subscriptions
      eventType: {
        DataType: "String",
        StringValue: "order.created",
      },
      region: {
        DataType: "String",
        StringValue: event.region,
      },
      amountTier: {
        DataType: "String",
        StringValue: event.totalAmount > 1000 ? "high-value" : "standard",
      },
    },
  });
 
  const response = await snsClient.send(command);
  console.log("Message ID:", response.MessageId);
  return response.MessageId;
}

Consuming Messages from SQS

import {
  SQSClient,
  ReceiveMessageCommand,
  DeleteMessageCommand,
} from "@aws-sdk/client-sqs";
 
interface SNSEnvelope {
  Type: string;
  MessageId: string;
  TopicArn: string;
  Subject: string;
  Message: string;
  Timestamp: string;
  MessageAttributes: Record<string, { Type: string; Value: string }>;
}
 
async function processOrderQueue(queueUrl: string) {
  console.log("Starting consumer for:", queueUrl);
 
  while (true) {
    const receiveCommand = new ReceiveMessageCommand({
      QueueUrl: queueUrl,
      MaxNumberOfMessages: 10,        // Batch up to 10 messages
      WaitTimeSeconds: 20,            // Long polling — saves money
      MessageAttributeNames: ["All"],
    });
 
    const response = await sqsClient.send(receiveCommand);
 
    if (!response.Messages || response.Messages.length === 0) {
      continue; // No messages, poll again
    }
 
    await Promise.allSettled(
      response.Messages.map(async (message) => {
        try {
          // SNS wraps your message in an envelope
          const envelope: SNSEnvelope = JSON.parse(message.Body!);
          const event = JSON.parse(envelope.Message);
 
          await handleOrderCreated(event);
 
          // Delete on success — message is gone permanently
          await sqsClient.send(
            new DeleteMessageCommand({
              QueueUrl: queueUrl,
              ReceiptHandle: message.ReceiptHandle!,
            })
          );
        } catch (err) {
          // Don't delete — let visibility timeout expire
          // SQS will redeliver up to maxReceiveCount times, then move to DLQ
          console.error("Failed to process message:", err);
        }
      })
    );
  }
}
 
async function handleOrderCreated(event: OrderCreatedEvent) {
  // Your business logic here
  console.log(`Processing order ${event.orderId} for user ${event.userId}`);
}

Message Filtering: Route Events Without Extra Code

SNS message filtering is one of its most powerful and underused features. Instead of every queue receiving every message and filtering in application code, you define a filter policy per subscription. SNS drops non-matching messages before they reach the queue.

// Only the fraud detection queue receives high-value orders from specific regions
await subscribeQueueToTopic(
  topicArn,
  fraudDetectionQueueUrl,
  {
    amountTier: ["high-value"],
    region: ["us-east-1", "eu-west-1"],
  }
);
 
// The analytics queue receives everything
await subscribeQueueToTopic(
  topicArn,
  analyticsQueueUrl,
  // No filter policy = receives all messages
);
 
// Email notifications only for completed orders
await subscribeQueueToTopic(
  topicArn,
  emailQueueUrl,
  {
    eventType: ["order.created", "order.shipped", "order.delivered"],
  }
);

This alone can cut your SQS costs significantly in high-volume systems — only relevant messages land in each queue.
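Filter policies also go beyond exact string matches: SNS supports numeric comparisons and prefix matching, among other operators. A sketch (attribute names are illustrative; numeric matching requires the attribute to be published with DataType "Number"):

```typescript
// Keys in a filter policy are ANDed together; the values inside
// each key's array are ORed. This policy matches messages whose
// "totalAmount" Number attribute is >= 1000 AND whose "eventType"
// String attribute starts with "order.".
const advancedFilterPolicy = {
  totalAmount: [{ numeric: [">=", 1000] }], // numeric comparison
  eventType: [{ prefix: "order." }],        // prefix match
};

// Passed to the subscription exactly like the simple policies above:
// Attributes: { FilterPolicy: JSON.stringify(advancedFilterPolicy) }
console.log(JSON.stringify(advancedFilterPolicy));
```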

Dead Letter Queue Monitoring and Replay

DLQs are where debugging happens. Set up CloudWatch alarms on DLQ depth so you catch failures immediately.

import {
  CloudWatchClient,
  PutMetricAlarmCommand,
} from "@aws-sdk/client-cloudwatch";
 
async function createDLQAlarm(dlqName: string, topicArn: string) {
  const cwClient = new CloudWatchClient({ region: "us-east-1" });
 
  await cwClient.send(
    new PutMetricAlarmCommand({
      AlarmName: `${dlqName}-depth-alarm`,
      AlarmDescription: `Messages are failing and landing in ${dlqName}`,
      MetricName: "ApproximateNumberOfMessagesVisible",
      Namespace: "AWS/SQS",
      Statistic: "Sum",
      Period: 60,
      EvaluationPeriods: 1,
      Threshold: 1,
      ComparisonOperator: "GreaterThanOrEqualToThreshold",
      Dimensions: [{ Name: "QueueName", Value: dlqName }],
      AlarmActions: [topicArn], // Notify via SNS
      TreatMissingData: "notBreaching",
    })
  );
}

Replaying DLQ Messages After a Fix

import {
  SQSClient,
  StartMessageMoveTaskCommand,
} from "@aws-sdk/client-sqs";
 
async function replayDLQ(dlqArn: string, targetQueueArn: string) {
  const command = new StartMessageMoveTaskCommand({
    SourceArn: dlqArn,
    DestinationArn: targetQueueArn,
    MaxNumberOfMessagesPerSecond: 100,
  });
 
  const response = await sqsClient.send(command);
  console.log("Replay task ID:", response.TaskHandle);
  return response.TaskHandle;
}

FIFO Queues: When Order and Deduplication Matter

For financial transactions, inventory updates, and state machines — where order is non-negotiable — use SNS FIFO + SQS FIFO.

// Create a FIFO SNS Topic
const fifoTopicCommand = new CreateTopicCommand({
  Name: "order-state-machine.fifo", // Must end with .fifo
  Attributes: {
    FifoTopic: "true",
    ContentBasedDeduplication: "true", // SHA-256 hash of message body
  },
});
 
// Publish to FIFO topic
const fifoPublishCommand = new PublishCommand({
  TopicArn: fifoTopicArn,
  Message: JSON.stringify({ orderId: "123", state: "CONFIRMED" }),
  MessageGroupId: "order-123",           // All messages for this order are ordered
  MessageDeduplicationId: "confirm-123", // Prevents duplicate delivery
});

Real-World Architecture: E-Commerce Order Pipeline

Here is how a production order pipeline looks using SNS + SQS:

                         [API Gateway]
                               │
                        [Order Service]
                               │
                   [SNS: order-created.fifo]
                               │
          ┌────────────────────┼────────────────────┐
          │                    │                    │
   [SQS: inventory]   [SQS: notifications]   [SQS: analytics]
          │                    │                    │
    [Lambda/ECS]         [Lambda/ECS]          [Lambda/ECS]
   Reserve stock       Send email/SMS        Log to Redshift
          │                    │
 [SQS: inventory-dlq]  [SQS: notifications-dlq]
  Alarm → PagerDuty       Alarm → Slack

Each queue has:

  - its own DLQ and CloudWatch depth alarm
  - a visibility timeout tuned to its consumer's processing time
  - an independent scaling policy and failure boundary

Performance and Cost Optimization

Long Polling — Always Enable It

Short polling burns API calls and money. Long polling (up to 20 seconds) waits for messages to arrive before returning, drastically reducing empty responses.

// Set on the queue itself
Attributes: {
  ReceiveMessageWaitTimeSeconds: "20",
}
 
// Or per receive call
const command = new ReceiveMessageCommand({
  QueueUrl: queueUrl,
  WaitTimeSeconds: 20,
});

Batching — Process 10 at a Time

// SQS: receive up to 10 messages per API call
MaxNumberOfMessages: 10
 
// SNS: send up to 10 messages in one API call
import { PublishBatchCommand } from "@aws-sdk/client-sns";
 
const batchCommand = new PublishBatchCommand({
  TopicArn: topicArn,
  PublishBatchRequestEntries: orders.map((order, i) => ({
    Id: `msg-${i}`,
    Message: JSON.stringify(order),
    MessageAttributes: {
      eventType: { DataType: "String", StringValue: "order.created" },
    },
  })),
});
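PublishBatch accepts at most 10 entries per call, so larger arrays need to be chunked first. A small helper (sketch; the `orders` array and batch wiring follow the snippet above):

```typescript
// Split an array into batches of at most `size` items —
// PublishBatch (and SQS SendMessageBatch) cap out at 10 entries.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Usage sketch:
// for (const batch of chunk(orders, 10)) {
//   await snsClient.send(new PublishBatchCommand({ ...entries from batch... }));
// }
```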

Cost Comparison

Operation                        Cost (us-east-1)
SNS requests (first 1M/month)    Free
SNS requests (after 1M)          $0.50 per million
SQS Standard (first 1M/month)    Free
SQS Standard (after 1M)          $0.40 per million
SQS FIFO                         $0.50 per million
Long polling                     Counts as one request regardless of wait

At 100M messages/month: roughly $45 for SNS + $39 for SQS — cheaper than running a self-managed Kafka cluster.

Best Practices for Production

DO:

  1. Always use DLQs — never let failed messages silently disappear

  2. Enable long polling on every SQS queue — reduces costs by up to 90%

  3. Use message filtering to route events at the SNS layer, not in app code

  4. Set appropriate visibility timeouts — slightly longer than your max processing time

    // If processing takes up to 25 seconds, set 30+ seconds
    VisibilityTimeout: "30"
  5. Parse the SNS envelope — SNS wraps your message; always unwrap it

    const envelope = JSON.parse(sqsMessage.Body);
    const yourData = JSON.parse(envelope.Message);
  6. Use Promise.allSettled for batch processing — one failure shouldn't block other messages

  7. Tag everything for cost allocation and debugging

    Tags: [
      { Key: "Service", Value: "orders" },
      { Key: "Environment", Value: process.env.NODE_ENV },
    ]

DON'T:

  1. Don't delete messages before processing completes — if your handler crashes, the message is gone

    // Wrong
    await deleteMessage(receiptHandle);
    await processMessage(data); // Crashes — message lost forever
     
    // Right
    await processMessage(data);
    await deleteMessage(receiptHandle); // Only delete on success
  2. Don't share one SQS queue across multiple unrelated consumers — use separate queues with SNS fan-out

  3. Don't use standard queues for financial or inventory operations — use FIFO to prevent duplicate processing

  4. Don't ignore the SNS message size limit — 256 KB max; store large payloads in S3 and send only a reference

    // Large payload pattern
    const s3Key = await uploadToS3(largePayload);
    await publishToSNS({ s3Key, bucket: "my-events-bucket" });
  5. Don't skip IAM least privilege — each consumer should only have access to its own queue
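A least-privilege consumer policy can be this small — it allows receiving, deleting, and extending visibility on one queue and nothing else (the ARN is a placeholder):

```typescript
// Least-privilege IAM policy for a single SQS consumer.
// Resource ARN is a placeholder — scope it to the consumer's own queue.
const consumerPolicy = {
  Version: "2012-10-17",
  Statement: [
    {
      Effect: "Allow",
      Action: [
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage",
        "sqs:ChangeMessageVisibility",
        "sqs:GetQueueAttributes",
      ],
      Resource: "arn:aws:sqs:us-east-1:123456789012:inventory-queue",
    },
  ],
};

console.log(JSON.stringify(consumerPolicy, null, 2));
```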

Common Pitfalls and How to Fix Them

Pitfall 1: Visibility Timeout Too Short

Symptom: Messages processed multiple times; duplicate side effects.

Fix: Set visibility timeout to at least 1.5× your maximum processing time. Extend it programmatically for long-running jobs.

import { ChangeMessageVisibilityCommand } from "@aws-sdk/client-sqs";
 
async function extendVisibility(queueUrl: string, receiptHandle: string) {
  await sqsClient.send(
    new ChangeMessageVisibilityCommand({
      QueueUrl: queueUrl,
      ReceiptHandle: receiptHandle,
      VisibilityTimeout: 60, // Resets the clock: invisible for another 60 seconds from now
    })
  );
}

Pitfall 2: Missing SQS Queue Policy

Symptom: SNS subscription succeeds but no messages arrive.

Fix: The SQS queue policy must explicitly allow sqs:SendMessage from the SNS topic ARN (shown in the subscription setup above).
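A quick diagnostic sketch: fetch the queue's Policy attribute (via GetQueueAttributesCommand with AttributeNames: ["Policy"]) and check that it actually grants the topic send access. The helper itself is pure and works on any policy JSON:

```typescript
// Returns true if the queue policy contains an Allow statement
// granting sqs:SendMessage conditioned on the given topic ARN.
function policyAllowsTopic(policyJson: string, topicArn: string): boolean {
  const policy = JSON.parse(policyJson);
  return (policy.Statement ?? []).some(
    (s: any) =>
      s.Effect === "Allow" &&
      [].concat(s.Action).includes("sqs:SendMessage" as never) &&
      s.Condition?.ArnEquals?.["aws:SourceArn"] === topicArn
  );
}
```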

Pitfall 3: Not Handling SNS Envelope

Symptom: your event fields come back undefined — message.Body parses fine, but it is the SNS envelope, not your payload.

Fix: SNS wraps your message. Always parse twice:

const snsEnvelope = JSON.parse(sqsMessage.Body!);
const actualPayload = JSON.parse(snsEnvelope.Message);
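Alternatively, enable RawMessageDelivery on the subscription and SNS delivers the payload without the envelope. A defensive parser (sketch) handles both delivery modes:

```typescript
// Unwrap an SQS message body that may or may not be wrapped in the
// SNS envelope (RawMessageDelivery subscriptions skip the envelope).
function unwrapSnsMessage<T>(sqsBody: string): T {
  const parsed = JSON.parse(sqsBody);
  // Enveloped messages carry Type: "Notification" and a Message string
  if (parsed && parsed.Type === "Notification" && typeof parsed.Message === "string") {
    return JSON.parse(parsed.Message) as T;
  }
  return parsed as T; // Raw delivery: the body is already the payload
}
```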

Pitfall 4: Consumer Not Scaling with Queue Depth

Fix: Use Application Auto Scaling or Lambda event source mappings with concurrency set to match your throughput requirements. For ECS, scale on ApproximateNumberOfMessagesVisible.
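The scaling math itself is simple backlog division. A back-of-the-envelope sketch for an ECS target (throughput numbers are illustrative assumptions):

```typescript
// Desired ECS task count from queue backlog: divide visible messages
// by per-task throughput, clamp to a maximum fleet size.
function desiredTaskCount(
  visibleMessages: number,     // ApproximateNumberOfMessagesVisible
  msgsPerTaskPerMinute: number,
  maxTasks: number
): number {
  if (visibleMessages === 0) return 1; // keep one warm consumer polling
  return Math.min(maxTasks, Math.ceil(visibleMessages / msgsPerTaskPerMinute));
}
```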

Comparison: SNS+SQS vs Other Messaging Solutions

Feature            SNS + SQS       Apache Kafka       RabbitMQ          EventBridge
Managed            Fully           No (MSK: yes)      No                Fully
Setup complexity   Low             High               Medium            Low
Message replay     DLQ only        Up to retention    Limited           Up to 90 days
Throughput         Very high       Extremely high     High              High
Fan-out            Native          Consumer groups    Exchanges         Native
Ordering           FIFO option     Partition-level    Queue-level       Best-effort
Cost at 100M msg   ~$84            ~$200+ (MSK)       Self-hosted       ~$100
Best for           Microservices   Log streaming      Complex routing   AWS event bus

SNS + SQS wins on simplicity, cost, and zero operational overhead for most microservice event-driven patterns.

Conclusion

AWS SNS and SQS together form the most practical, cost-effective foundation for event-driven architectures in the AWS ecosystem. The fan-out pattern decouples your services, the DLQ catches failures without data loss, and message filtering routes events intelligently without wasted compute.

Key Takeaways:

  - Fan out one SNS event to many SQS queues; never couple services directly.
  - Give every production queue a DLQ, a depth alarm, and long polling.
  - Filter at the SNS layer so each queue only receives the events it needs.
  - Use FIFO topics and queues when ordering or deduplication is non-negotiable.
  - Delete messages only after successful processing, and make handlers idempotent.

Master this pattern and you have the building block for payment pipelines, notification systems, analytics ingestion, inventory management, and virtually any high-throughput event-driven system.


Questions or spotted something? Reach out on Twitter/X!