Building distributed systems that scale under pressure is one of the hardest engineering challenges. When a single user action — a payment, a sign-up, an order — needs to trigger ten downstream services simultaneously, coupling them directly turns your architecture into a house of cards. AWS SNS (Simple Notification Service) combined with SQS (Simple Queue Service) is the industry-standard solution for this exact problem.
This guide goes deep: architecture, code, gotchas, and production-ready patterns.
What is AWS SNS?
AWS Simple Notification Service (SNS) is a fully managed pub/sub messaging service. Publishers send messages to a topic, and SNS instantly delivers that message to all subscribers — whether those are SQS queues, Lambda functions, HTTP endpoints, email addresses, or mobile push notifications.
Think of it as a radio broadcast: one publisher, many listeners, zero coupling.
Publisher → SNS Topic → SQS Queue A → Consumer A (Orders)
                      → SQS Queue B → Consumer B (Notifications)
                      → SQS Queue C → Consumer C (Analytics)
                      → Lambda → Real-time processing
SNS Key Concepts
| Concept | Description |
|---|---|
| Topic | A named channel that receives and routes messages |
| Publisher | Any service that sends messages to a topic |
| Subscriber | An endpoint that receives messages from a topic |
| Message | The payload (up to 256 KB) sent through the topic |
| Subscription | The binding between a topic and a subscriber |
What is AWS SQS?
AWS Simple Queue Service (SQS) is a fully managed message queue service. It decouples the sender from the receiver by storing messages until the consumer is ready to process them. If your consumer crashes, scales down, or is temporarily unavailable, messages wait in the queue until the retention period expires (up to 14 days).
Two queue types matter here:
- Standard Queue: At-least-once delivery, best-effort ordering, virtually unlimited throughput
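- FIFO Queue: Exactly-once processing, strict ordering within a message group, 300 messages/second per API action by default (3,000 with batching), with higher quotas in high-throughput mode that vary by Region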
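Because standard queues deliver at least once, a consumer can see the same message twice. The sketch below shows one way to make a handler idempotent by remembering message IDs. The `makeIdempotent` helper and the in-memory `Set` are assumptions for illustration only; a real system would need a durable store shared across consumer instances.

```typescript
// Hypothetical sketch: skip duplicate deliveries by remembering message IDs.
// The in-memory Set stands in for a durable store (an assumption, not an AWS API).
type Handler<T> = (payload: T) => void;

function makeIdempotent<T>(
  handler: Handler<T>
): (messageId: string, payload: T) => boolean {
  const seen = new Set<string>();
  return (messageId, payload) => {
    if (seen.has(messageId)) {
      return false; // duplicate delivery: skip side effects
    }
    handler(payload);
    seen.add(messageId); // record only after the handler succeeded
    return true;
  };
}

// Usage: the second delivery of "m-1" is ignored.
const processed: string[] = [];
const handle = makeIdempotent<string>((orderId) => {
  processed.push(orderId);
});
handle("m-1", "order-123");
handle("m-1", "order-123");
handle("m-2", "order-456");
console.log(processed); // [ 'order-123', 'order-456' ]
```

Recording the ID only after the handler returns means a crash mid-processing leads to a retry rather than a silently skipped message.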
SNS vs SQS: When to Use Which
This is one of the most Googled AWS questions — and the answer is almost always both.
| Feature | SNS | SQS |
|---|---|---|
| Pattern | Pub/Sub (push) | Point-to-point (pull) |
| Consumers | Many simultaneous | One consumer group |
| Durability | No persistence | Messages stored up to 14 days |
| Delivery | Immediate (fire and forget) | Polled by consumer |
| Retry | Limited | Configurable with DLQ |
| Best for | Broadcasting events | Reliable async processing |
The real answer: Use SNS to fan out one event to multiple SQS queues. Each queue represents an isolated consumer with its own retry logic, scaling policy, and failure boundary.
The Fan-Out Pattern Explained
The fan-out pattern is the backbone of event-driven microservices. One event in SNS triggers parallel processing across multiple SQS queues — completely decoupled from each other.
           E-commerce Order Placed
                      │
         [SNS Topic: order.created]
                      │
    ┌───────────┬─────┴─────┬───────────┐
    ▼           ▼           ▼           ▼
  [SQS]       [SQS]       [SQS]     [Lambda]
Inventory     Email     Analytics     Fraud
 Service     Service     Service    Detection
Each downstream service:
- Scales independently
- Fails independently
- Has its own dead letter queue
- Can be paused or redeployed without affecting others
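The isolation described above can be illustrated with a small in-memory model. This is a sketch only: `InMemoryTopic` and `InMemoryQueue` are hypothetical classes that mimic the shape of the pattern and make no AWS calls.

```typescript
// Hypothetical in-memory model of SNS fan-out; no AWS calls are made.
class InMemoryQueue<T> {
  readonly name: string;
  readonly messages: T[] = [];
  constructor(name: string) {
    this.name = name;
  }
  send(message: T): void {
    this.messages.push(message);
  }
}

class InMemoryTopic<T> {
  private readonly subscribers: InMemoryQueue<T>[] = [];
  subscribe(queue: InMemoryQueue<T>): void {
    this.subscribers.push(queue);
  }
  // Each subscriber gets its own copy, so one consumer cannot affect another.
  publish(message: T): number {
    for (const queue of this.subscribers) {
      queue.send(JSON.parse(JSON.stringify(message)) as T);
    }
    return this.subscribers.length;
  }
}

type OrderEvent = { orderId: string };
const inventoryQueue = new InMemoryQueue<OrderEvent>("inventory");
const emailQueue = new InMemoryQueue<OrderEvent>("email");
const orderTopic = new InMemoryTopic<OrderEvent>();
orderTopic.subscribe(inventoryQueue);
orderTopic.subscribe(emailQueue);

const delivered = orderTopic.publish({ orderId: "123" });
// Draining one queue leaves the other untouched.
inventoryQueue.messages.shift();
console.log(delivered, inventoryQueue.messages.length, emailQueue.messages.length); // 2 0 1
```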
Setting Up SNS + SQS with Node.js (AWS SDK v3)
Prerequisites
npm install @aws-sdk/client-sns @aws-sdk/client-sqs @aws-sdk/client-cloudwatch
Creating the SNS Topic
import { SNSClient, CreateTopicCommand } from "@aws-sdk/client-sns";
const snsClient = new SNSClient({ region: "us-east-1" });
async function createOrderTopic() {
const command = new CreateTopicCommand({
Name: "order-created",
Attributes: {
// Human-readable label for the topic (filter policies are set per subscription, not here)
DisplayName: "Order Created Events",
},
Tags: [
{ Key: "Environment", Value: "production" },
{ Key: "Service", Value: "orders" },
],
});
const response = await snsClient.send(command);
console.log("Topic ARN:", response.TopicArn);
return response.TopicArn;
}
Creating SQS Queues with Dead Letter Queues
Dead letter queues (DLQs) are non-negotiable in production. They catch messages that fail processing repeatedly, so you can inspect, replay, or alert on them.
import {
SQSClient,
CreateQueueCommand,
GetQueueAttributesCommand,
} from "@aws-sdk/client-sqs";
const sqsClient = new SQSClient({ region: "us-east-1" });
async function createQueueWithDLQ(serviceName: string) {
// 1. Create the Dead Letter Queue first
const dlqCommand = new CreateQueueCommand({
QueueName: `${serviceName}-dlq`,
Attributes: {
MessageRetentionPeriod: "1209600", // 14 days in seconds
},
});
const dlqResponse = await sqsClient.send(dlqCommand);
const dlqUrl = dlqResponse.QueueUrl!;
// Look up the DLQ ARN (needed for RedrivePolicy) instead of deriving it
// from the URL, which breaks for non-standard endpoints and partitions
const dlqAttrs = await sqsClient.send(
new GetQueueAttributesCommand({
QueueUrl: dlqUrl,
AttributeNames: ["QueueArn"],
})
);
const dlqArn = dlqAttrs.Attributes!.QueueArn;
// 2. Create the main queue with a redrive policy pointing to DLQ
const queueCommand = new CreateQueueCommand({
QueueName: `${serviceName}-queue`,
Attributes: {
VisibilityTimeout: "30", // Seconds consumer has to process
MessageRetentionPeriod: "86400", // 1 day
ReceiveMessageWaitTimeSeconds: "20", // Long polling (reduces costs)
RedrivePolicy: JSON.stringify({
deadLetterTargetArn: dlqArn,
maxReceiveCount: 3, // Move to DLQ after 3 failures
}),
},
});
const queueResponse = await sqsClient.send(queueCommand);
return {
queueUrl: queueResponse.QueueUrl,
dlqUrl,
};
}
Subscribing SQS Queues to the SNS Topic
For SNS to deliver messages to SQS, the queue needs an access policy allowing SNS to send messages to it.
import {
SQSClient,
SetQueueAttributesCommand,
GetQueueAttributesCommand,
} from "@aws-sdk/client-sqs";
import { SNSClient, SubscribeCommand } from "@aws-sdk/client-sns";
async function subscribeQueueToTopic(
topicArn: string,
queueUrl: string,
filterPolicy?: Record<string, string[]>
) {
// 1. Get the SQS Queue ARN
const getAttrsCommand = new GetQueueAttributesCommand({
QueueUrl: queueUrl,
AttributeNames: ["QueueArn"],
});
const attrs = await sqsClient.send(getAttrsCommand);
const queueArn = attrs.Attributes!.QueueArn;
// 2. Set the queue policy to allow SNS to send messages
const policy = {
Version: "2012-10-17",
Statement: [
{
Effect: "Allow",
Principal: { Service: "sns.amazonaws.com" },
Action: "sqs:SendMessage",
Resource: queueArn,
Condition: {
ArnEquals: { "aws:SourceArn": topicArn },
},
},
],
};
await sqsClient.send(
new SetQueueAttributesCommand({
QueueUrl: queueUrl,
Attributes: { Policy: JSON.stringify(policy) },
})
);
// 3. Create the SNS Subscription
const subscribeCommand = new SubscribeCommand({
TopicArn: topicArn,
Protocol: "sqs",
Endpoint: queueArn,
Attributes: filterPolicy
? { FilterPolicy: JSON.stringify(filterPolicy) }
: {},
});
const subscription = await snsClient.send(subscribeCommand);
console.log("Subscription ARN:", subscription.SubscriptionArn);
return subscription.SubscriptionArn;
}
Publishing Events to SNS
import { SNSClient, PublishCommand } from "@aws-sdk/client-sns";
interface OrderCreatedEvent {
orderId: string;
userId: string;
totalAmount: number;
region: string;
items: Array<{ productId: string; quantity: number }>;
}
async function publishOrderCreated(
topicArn: string,
event: OrderCreatedEvent
) {
const command = new PublishCommand({
TopicArn: topicArn,
Message: JSON.stringify(event),
Subject: "OrderCreated",
MessageAttributes: {
// Used for filter policies on subscriptions
eventType: {
DataType: "String",
StringValue: "order.created",
},
region: {
DataType: "String",
StringValue: event.region,
},
amountTier: {
DataType: "String",
StringValue: event.totalAmount > 1000 ? "high-value" : "standard",
},
},
});
const response = await snsClient.send(command);
console.log("Message ID:", response.MessageId);
return response.MessageId;
}
Consuming Messages from SQS
import {
SQSClient,
ReceiveMessageCommand,
DeleteMessageCommand,
} from "@aws-sdk/client-sqs";
interface SNSEnvelope {
Type: string;
MessageId: string;
TopicArn: string;
Subject: string;
Message: string;
Timestamp: string;
MessageAttributes: Record<string, { Type: string; Value: string }>;
}
async function processOrderQueue(queueUrl: string) {
console.log("Starting consumer for:", queueUrl);
while (true) {
const receiveCommand = new ReceiveMessageCommand({
QueueUrl: queueUrl,
MaxNumberOfMessages: 10, // Batch up to 10 messages
WaitTimeSeconds: 20, // Long polling — saves money
MessageAttributeNames: ["All"],
});
const response = await sqsClient.send(receiveCommand);
if (!response.Messages || response.Messages.length === 0) {
continue; // No messages, poll again
}
await Promise.allSettled(
response.Messages.map(async (message) => {
try {
// SNS wraps your message in an envelope
const envelope: SNSEnvelope = JSON.parse(message.Body!);
const event = JSON.parse(envelope.Message);
await handleOrderCreated(event);
// Delete on success — message is gone permanently
await sqsClient.send(
new DeleteMessageCommand({
QueueUrl: queueUrl,
ReceiptHandle: message.ReceiptHandle!,
})
);
} catch (err) {
// Don't delete — let visibility timeout expire
// SQS will redeliver up to maxReceiveCount times, then move to DLQ
console.error("Failed to process message:", err);
}
})
);
}
}
async function handleOrderCreated(event: OrderCreatedEvent) {
// Your business logic here
console.log(`Processing order ${event.orderId} for user ${event.userId}`);
}
Message Filtering: Route Events Without Extra Code
SNS message filtering is one of its most powerful and underused features. Instead of every queue receiving every message and filtering in application code, you define a filter policy per subscription. SNS drops non-matching messages before they reach the queue.
// The fraud detection queue receives only high-value orders from specific regions
await subscribeQueueToTopic(
topicArn,
fraudDetectionQueueUrl,
{
amountTier: ["high-value"],
region: ["us-east-1", "eu-west-1"],
}
);
// The analytics queue receives everything
await subscribeQueueToTopic(
topicArn,
analyticsQueueUrl,
// No filter policy = receives all messages
);
// The email queue receives only order lifecycle events (created, shipped, delivered)
await subscribeQueueToTopic(
topicArn,
emailQueueUrl,
{
eventType: ["order.created", "order.shipped", "order.delivered"],
}
);
This alone can cut your SQS costs significantly in high-volume systems, because only relevant messages land in each queue.
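The matching rule behind these filter policies can be sketched as a small pure function. This is a simplified model, not SNS's implementation: it covers only exact string matches on message attributes, and `matchesFilterPolicy` is a hypothetical name. SNS also supports richer operators that are not modelled here.

```typescript
// Simplified model of filter-policy evaluation: exact string matching only.
type FilterPolicy = Record<string, string[]>;
type Attributes = Record<string, string>;

function matchesFilterPolicy(
  policy: FilterPolicy | undefined,
  attributes: Attributes
): boolean {
  if (!policy) return true; // no policy: the subscription receives everything
  // Every key in the policy must be present (AND) with one of the allowed values (OR).
  return Object.entries(policy).every(([key, allowed]) => {
    const value = attributes[key];
    return value !== undefined && allowed.includes(value);
  });
}

const fraudPolicy: FilterPolicy = {
  amountTier: ["high-value"],
  region: ["us-east-1", "eu-west-1"],
};

console.log(matchesFilterPolicy(fraudPolicy, { amountTier: "high-value", region: "eu-west-1" })); // true
console.log(matchesFilterPolicy(fraudPolicy, { amountTier: "standard", region: "eu-west-1" })); // false
console.log(matchesFilterPolicy(undefined, { amountTier: "standard" })); // true
```

Note that a message missing one of the policy's attributes does not match, which is why publishers should set every attribute that subscriptions filter on.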
Dead Letter Queue Monitoring and Replay
DLQs are where debugging happens. Set up CloudWatch alarms on DLQ depth so you catch failures immediately.
import {
CloudWatchClient,
PutMetricAlarmCommand,
} from "@aws-sdk/client-cloudwatch";
async function createDLQAlarm(dlqName: string, topicArn: string) {
const cwClient = new CloudWatchClient({ region: "us-east-1" });
await cwClient.send(
new PutMetricAlarmCommand({
AlarmName: `${dlqName}-depth-alarm`,
AlarmDescription: `Messages are failing and landing in ${dlqName}`,
MetricName: "ApproximateNumberOfMessagesVisible",
Namespace: "AWS/SQS",
Statistic: "Sum",
Period: 60,
EvaluationPeriods: 1,
Threshold: 1,
ComparisonOperator: "GreaterThanOrEqualToThreshold",
Dimensions: [{ Name: "QueueName", Value: dlqName }],
AlarmActions: [topicArn], // Notify via SNS
TreatMissingData: "notBreaching",
})
);
}
Replaying DLQ Messages After a Fix
import {
SQSClient,
StartMessageMoveTaskCommand,
} from "@aws-sdk/client-sqs";
async function replayDLQ(dlqArn: string, targetQueueArn: string) {
const command = new StartMessageMoveTaskCommand({
SourceArn: dlqArn,
DestinationArn: targetQueueArn,
MaxNumberOfMessagesPerSecond: 100,
});
const response = await sqsClient.send(command);
console.log("Replay task ID:", response.TaskHandle);
return response.TaskHandle;
}
FIFO Queues: When Order and Deduplication Matter
For financial transactions, inventory updates, and state machines — where order is non-negotiable — use SNS FIFO + SQS FIFO.
// Create a FIFO SNS Topic
const fifoTopicCommand = new CreateTopicCommand({
Name: "order-state-machine.fifo", // Must end with .fifo
Attributes: {
FifoTopic: "true",
ContentBasedDeduplication: "true", // SHA-256 hash of message body
},
});
// Publish to FIFO topic
const fifoPublishCommand = new PublishCommand({
TopicArn: fifoTopicArn,
Message: JSON.stringify({ orderId: "123", state: "CONFIRMED" }),
MessageGroupId: "order-123", // All messages for this order are ordered
MessageDeduplicationId: "confirm-123", // Prevents duplicate delivery
});
Real-World Architecture: E-Commerce Order Pipeline
Here is how a production order pipeline looks using SNS + SQS:
                            [API Gateway]
                                  │
                           [Order Service]
                                  │
                      [SNS: order.created.fifo]
                                  │
          ┌───────────────────────┼───────────────────────┐
          │                       │                       │
  [SQS: inventory]      [SQS: notifications]      [SQS: analytics]
          │                       │                       │
    [Lambda/ECS]            [Lambda/ECS]            [Lambda/ECS]
    Reserve stock          Send email/SMS          Log to Redshift
          │                       │
[SQS: inventory-dlq]  [SQS: notifications-dlq]
  Alarm → PagerDuty         Alarm → Slack
Each queue has:
- Its own visibility timeout tuned to that service's processing time
- Its own scaling policy (concurrency, ECS task count)
- Its own DLQ with CloudWatch alarm
- Its own IAM role with least-privilege access
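As an illustration of the last point, a least-privilege identity policy for one consumer might be built as follows. This is a sketch under assumptions: `buildConsumerPolicy` is a hypothetical helper, the account ID is a placeholder, and the exact action list depends on what your consumer actually calls.

```typescript
// Hypothetical helper: an identity policy scoped to a single queue.
// It grants only what a polling consumer typically needs, and no send or admin rights.
function buildConsumerPolicy(queueArn: string) {
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Action: [
          "sqs:ReceiveMessage",
          "sqs:DeleteMessage",
          "sqs:ChangeMessageVisibility",
          "sqs:GetQueueAttributes",
        ],
        Resource: queueArn,
      },
    ],
  };
}

// Placeholder ARN for illustration only.
const inventoryPolicy = buildConsumerPolicy(
  "arn:aws:sqs:us-east-1:123456789012:inventory-queue"
);
console.log(JSON.stringify(inventoryPolicy, null, 2));
```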
Performance and Cost Optimization
Long Polling — Always Enable It
Short polling burns API calls and money. Long polling (up to 20 seconds) waits for messages to arrive before returning, drastically reducing empty responses.
// Set on the queue itself
Attributes: {
ReceiveMessageWaitTimeSeconds: "20",
}
// Or per receive call
const command = new ReceiveMessageCommand({
QueueUrl: queueUrl,
WaitTimeSeconds: 20,
});
Batching - Process 10 at a Time
// SQS: receive up to 10 messages per API call
MaxNumberOfMessages: 10
// SNS: send up to 10 messages in one API call
import { PublishBatchCommand } from "@aws-sdk/client-sns";
const batchCommand = new PublishBatchCommand({
TopicArn: topicArn,
PublishBatchRequestEntries: orders.map((order, i) => ({
Id: `msg-${i}`,
Message: JSON.stringify(order),
MessageAttributes: {
eventType: { DataType: "String", StringValue: "order.created" },
},
})),
});
Cost Comparison
| Operation | Cost (us-east-1) |
|---|---|
| SNS requests (first 1M/month) | Free |
| SNS requests (after 1M) | $0.50 per million |
| SQS Standard (first 1M/month) | Free |
| SQS Standard (after 1M) | $0.40 per million |
| SQS FIFO | $0.50 per million |
| Long polling | Counts as one request regardless of wait |
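At 100M messages/month, counting one request per message for each service: roughly $49.50 for SNS + $39.60 for SQS. Receives, deletes, and fan-out to several queues add SQS requests, while batching reduces them. Either way, this is typically cheaper than running a self-managed Kafka cluster.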
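The arithmetic can be checked with a short calculation using the per-million rates from the table above. This is a rough sketch: `monthlyRequestCost` is a hypothetical helper, and it assumes one billable request per message per service, ignoring fan-out, receive and delete calls, batching, and payload size.

```typescript
// Rates come from the table above (USD per million requests, us-east-1).
// Assumption: the first 1M requests per service each month are free.
const FREE_REQUESTS = 1_000_000;

function monthlyRequestCost(requests: number, usdPerMillion: number): number {
  const billable = Math.max(0, requests - FREE_REQUESTS);
  return (billable / 1_000_000) * usdPerMillion;
}

const monthlyMessages = 100_000_000;
const snsCost = monthlyRequestCost(monthlyMessages, 0.5); // 99M billable at $0.50
const sqsCost = monthlyRequestCost(monthlyMessages, 0.4); // 99M billable at $0.40

console.log(snsCost.toFixed(2), sqsCost.toFixed(2), (snsCost + sqsCost).toFixed(2));
// prints "49.50 39.60 89.10"
```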
Best Practices for Production
DO:
- Always use DLQs: never let failed messages silently disappear
- Enable long polling on every SQS queue: it cuts empty receives and the request costs that come with them
- Use message filtering to route events at the SNS layer, not in app code
- Set appropriate visibility timeouts: longer than your max processing time
  // If processing takes up to 25 seconds, set 30+ seconds
  VisibilityTimeout: "30"
- Parse the SNS envelope: SNS wraps your message; always unwrap it
  const envelope = JSON.parse(sqsMessage.Body);
  const yourData = JSON.parse(envelope.Message);
- Use Promise.allSettled for batch processing: one failure shouldn't block other messages
- Tag everything for cost allocation and debugging
  Tags: [
    { Key: "Service", Value: "orders" },
    { Key: "Environment", Value: process.env.NODE_ENV },
  ]
DON'T:
- Don't delete messages before processing completes: if your handler crashes, the message is gone
  // Wrong
  await deleteMessage(receiptHandle);
  await processMessage(data); // Crashes: message lost forever
  // Right
  await processMessage(data);
  await deleteMessage(receiptHandle); // Only delete on success
- Don't share one SQS queue across multiple unrelated consumers: use separate queues with SNS fan-out
- Don't use standard queues for financial or inventory operations: use FIFO to prevent duplicate processing
- Don't ignore the SNS message size limit: 256 KB max; store large payloads in S3 and send only a reference
  // Large payload pattern
  const s3Key = await uploadToS3(largePayload);
  await publishToSNS({ s3Key, bucket: "my-events-bucket" });
- Don't skip IAM least privilege: each consumer should only have access to its own queue
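The large-payload rule above can be turned into a size check that decides between sending inline and sending a reference. This is a sketch under assumptions: `prepareMessage` and the `storeLargePayload` callback are hypothetical, and the callback is kept synchronous for brevity where a real S3 upload would be asynchronous.

```typescript
// 256 KB limit as stated in the text; message attributes also count toward it,
// so a real implementation may want a lower threshold.
const SNS_MAX_BYTES = 256 * 1024;

type Outgoing =
  | { kind: "inline"; body: string }
  | { kind: "reference"; body: string };

// `storeLargePayload` is a hypothetical callback (for example, an S3 upload)
// that stores the JSON and returns the object key.
function prepareMessage(
  payload: unknown,
  storeLargePayload: (json: string) => string,
  maxBytes: number = SNS_MAX_BYTES
): Outgoing {
  const json = JSON.stringify(payload);
  // Measure UTF-8 bytes, not characters: multi-byte characters count more than once.
  const sizeInBytes = new TextEncoder().encode(json).length;
  if (sizeInBytes <= maxBytes) {
    return { kind: "inline", body: json };
  }
  const s3Key = storeLargePayload(json);
  return { kind: "reference", body: JSON.stringify({ s3Key }) };
}

const smallMessage = prepareMessage({ orderId: "123" }, () => "unused");
const largeMessage = prepareMessage(
  { blob: "x".repeat(300 * 1024) },
  () => "events/large-1.json"
);
console.log(smallMessage.kind, largeMessage.kind); // inline reference
```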
Common Pitfalls and How to Fix Them
Pitfall 1: Visibility Timeout Too Short
Symptom: Messages processed multiple times; duplicate side effects.
Fix: Set visibility timeout to at least 1.5× your maximum processing time. Extend it programmatically for long-running jobs.
import { ChangeMessageVisibilityCommand } from "@aws-sdk/client-sqs";
async function extendVisibility(queueUrl: string, receiptHandle: string) {
await sqsClient.send(
new ChangeMessageVisibilityCommand({
QueueUrl: queueUrl,
ReceiptHandle: receiptHandle,
VisibilityTimeout: 60, // New timeout: 60 seconds counted from this call
})
);
}
Pitfall 2: Missing SQS Queue Policy
Symptom: SNS subscription succeeds but no messages arrive.
Fix: The SQS queue policy must explicitly allow sqs:SendMessage from the SNS topic ARN (shown in the subscription setup above).
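A related trap: setting the Policy attribute replaces the queue's previous policy, so granting access to a second topic can silently drop the first grant. One way around this is to build a single policy that lists every topic. The sketch below assumes a hypothetical `buildSnsToSqsPolicy` helper and placeholder ARNs.

```typescript
// Hypothetical helper: one SQS resource policy that admits several SNS topics.
function buildSnsToSqsPolicy(queueArn: string, topicArns: string[]) {
  if (topicArns.length === 0) {
    throw new Error("At least one topic ARN is required");
  }
  return {
    Version: "2012-10-17",
    Statement: [
      {
        Effect: "Allow",
        Principal: { Service: "sns.amazonaws.com" },
        Action: "sqs:SendMessage",
        Resource: queueArn,
        // A list of condition values is evaluated with OR semantics.
        Condition: { ArnEquals: { "aws:SourceArn": topicArns } },
      },
    ],
  };
}

// Placeholder ARNs for illustration only.
const queuePolicy = buildSnsToSqsPolicy(
  "arn:aws:sqs:us-east-1:123456789012:inventory-queue",
  [
    "arn:aws:sns:us-east-1:123456789012:order-created",
    "arn:aws:sns:us-east-1:123456789012:order-shipped",
  ]
);
console.log(JSON.stringify(queuePolicy));
```

The resulting object can be passed through `JSON.stringify` as the `Policy` attribute, in the same way as the single-topic policy in the subscription setup.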
Pitfall 3: Not Handling SNS Envelope
Symptom: Payload fields are undefined, because message.Body contains the SNS envelope rather than your event.
Fix: SNS wraps your message. Always parse twice:
const snsEnvelope = JSON.parse(sqsMessage.Body!);
const actualPayload = JSON.parse(snsEnvelope.Message);
Pitfall 4: Consumer Not Scaling with Queue Depth
Fix: Use Application Auto Scaling or Lambda event source mappings with concurrency set to match your throughput requirements. For ECS, scale on ApproximateNumberOfMessagesVisible.
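One common way to turn queue depth into a scaling target is a "backlog per task" calculation. The sketch below is an assumption-laden illustration, not an AWS API: `desiredTaskCount` and its parameters are hypothetical, and the inputs would come from your own metrics.

```typescript
// Hypothetical sizing helper based on a "backlog per task" target.
function desiredTaskCount(
  visibleMessages: number,
  avgProcessingSeconds: number,
  acceptableLatencySeconds: number,
  minTasks: number,
  maxTasks: number
): number {
  // How many messages one task can work through within the latency budget.
  const backlogPerTask = Math.max(
    1,
    Math.floor(acceptableLatencySeconds / avgProcessingSeconds)
  );
  const needed = Math.ceil(visibleMessages / backlogPerTask);
  // Clamp to the configured bounds so the service never scales to zero or runs away.
  return Math.min(maxTasks, Math.max(minTasks, needed));
}

// 900 visible messages, 2 s per message, 60 s latency budget: 30 messages per task.
console.log(desiredTaskCount(900, 2, 60, 1, 50)); // 30
console.log(desiredTaskCount(0, 2, 60, 1, 50)); // 1
console.log(desiredTaskCount(100000, 2, 60, 1, 50)); // 50
```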
Comparison: SNS+SQS vs Other Messaging Solutions
| Feature | SNS + SQS | Apache Kafka | RabbitMQ | EventBridge |
|---|---|---|---|---|
| Managed | Fully | No (MSK: yes) | No | Fully |
| Setup complexity | Low | High | Medium | Low |
| Message replay | DLQ only | Up to retention | Limited | Via event archives |
| Throughput | Very high | Extremely high | High | High |
| Fan-out | Native | Consumer groups | Exchanges | Native |
| Ordering | FIFO option | Partition-level | Queue-level | Best-effort |
| Cost at 100M msg | ~$89 | ~$200+ (MSK) | Self-hosted | ~$100 |
| Best for | Microservices | Log streaming | Complex routing | AWS event bus |
SNS + SQS wins on simplicity, cost, and zero operational overhead for most microservice event-driven patterns.
Conclusion
AWS SNS and SQS together form the most practical, cost-effective foundation for event-driven architectures in the AWS ecosystem. The fan-out pattern decouples your services, the DLQ catches failures without data loss, and message filtering routes events intelligently without wasted compute.
Key Takeaways:
- Use SNS for broadcasting; use SQS for reliable async processing
- Always pair every SQS queue with a dead letter queue and CloudWatch alarm
- Enable long polling on every queue — it cuts costs dramatically
- Use FIFO queues and topics when order or deduplication are critical
- Filter at the SNS level, not in application code
- Parse the SNS envelope before reading your payload
- Delete SQS messages only after successful processing
Master this pattern and you have the building block for payment pipelines, notification systems, analytics ingestion, inventory management, and virtually any high-throughput event-driven system.