Amazon Elastic Compute Cloud (EC2) is a core AWS service providing scalable virtual computing resources in the cloud.
EC2 provides virtual machines (servers) in the cloud, allowing users to rent computing resources such as CPU, memory, storage, and networking. It is fundamental to cloud computing, enabling users to run applications without owning physical hardware. AWS acts as a cloud utility provider, offering services like Amazon EC2 in place of on-premises data centers and physical servers.
EC2 enables renting virtual machines, storing data on virtual drives (EBS, EFS), distributing load across machines (Elastic Load Balancing - ELB), and scaling services automatically (Auto Scaling Groups - ASG).
Users pay only for what they use: compute charges stop when instances are not running (attached storage is still billed). This follows a pay-as-you-go model, unlike the fixed expenses of traditional infrastructure.
When creating an EC2 instance, users can configure various aspects.
Users can configure the Operating System (Linux, Windows, macOS), Compute Power (number of vCPUs), Memory (RAM allocation), Storage (EBS, EFS, EC2 Instance Store), Networking (Elastic Network Interfaces (ENIs), public IP addresses), Firewall Rules (configurable through Security Groups), and Bootstrap Scripts (EC2 User Data for initial instance setup).
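As a rough sketch, these configuration choices map onto a single AWS CLI call; every ID and file name below is a placeholder, not a value taken from this material.

    # Launch one instance; each flag corresponds to a configuration item above (placeholder IDs)
    aws ec2 run-instances \
      --image-id ami-0abcdef1234567890 \
      --instance-type t2.micro \
      --key-name my-key-pair \
      --security-group-ids sg-0123456789abcdef0 \
      --subnet-id subnet-0123456789abcdef0 \
      --user-data file://bootstrap.sh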
Launching an EC2 instance involves configuring several key aspects, and instances transition through various states with associated actions.
Launching an EC2 instance involves configuring an Amazon Machine Image (AMI), Instance Type, Key Pair, Network Settings (Security Group), Storage Options (including 'delete on termination' for the root volume), and User Data scripts.
EC2 instances transition through several states: Running (operational and active), Stopped (shut down, compute billing paused, storage costs continue), and Terminated (permanently deleted, along with its associated root EBS volume if 'delete on termination' is enabled).
Technical Specs: Instance lifecycle states: pending, running, stopping, stopped, shutting-down, terminated
Users can perform various actions on instances: Stop (gracefully shuts down), Start (restarts a stopped instance), Reboot (restarts the OS), Terminate (permanently deletes).
Termination is irreversible; there is no way to retrieve the instance or its data. The stop state preserves the instance and its data (EBS volumes, instance metadata, Elastic IPs) while compute is not active. Data lost on stop includes the public IPv4 address, RAM contents, instance store volumes, running processes, and session state. While stopped, you are not billed for compute but are still billed for EBS volumes. This is useful for development/test environments, temporary shutdowns, and cost savings during off-peak hours.
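For reference, the lifecycle actions map to AWS CLI commands such as the following (the instance ID is a placeholder):

    aws ec2 stop-instances      --instance-ids i-0123456789abcdef0
    aws ec2 start-instances     --instance-ids i-0123456789abcdef0
    aws ec2 reboot-instances    --instance-ids i-0123456789abcdef0
    aws ec2 terminate-instances --instance-ids i-0123456789abcdef0   # irreversible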
Securely connecting to an EC2 instance is crucial for management and application deployment.
SSH (Secure Shell): Natively available on macOS and Linux, and via PowerShell/OpenSSH on Windows 10/11 and later. Requires a key pair (.pem file for Linux/macOS).
Technical Specs: Port 22
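A typical connection from macOS/Linux looks like this (the key file name and address are placeholders):

    chmod 400 my-key.pem                       # private key must not be world-readable
    ssh -i my-key.pem ec2-user@<public-ipv4>   # ec2-user is the default user on Amazon Linux AMIs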
PuTTY: A popular SSH client for Windows. Requires a .ppk key file (converted from .pem using PuTTYgen). Users enter the username (e.g., ec2-user) and public IPv4 address of the instance, then select the private key file for authentication.
Technical Specs: Port 22
RDP (Remote Desktop Protocol): For Windows instances. Users connect using the Public DNS, username (Administrator), and a password decrypted with the .pem private key file in the AWS console.
Technical Specs: Port 3389
EC2 Instance Connect: A browser-based SSH client that simplifies connection without requiring PuTTY or local key pair management for supported AMIs (e.g., Amazon Linux 2). It leverages IAM for access control and provides temporary credentials. For browser-based SSH, the instance needs a public IP address and its security group must allow SSH (port 22) traffic.
Technical Specs: Port 22
Session Manager (AWS Systems Manager): Agent-based, secure, browser-based shell/RDP access without opening inbound ports. Often the recommended choice for secure access. Requires IAM permissions, network connectivity to Systems Manager (via Internet Gateway, NAT Gateway, or VPC interface endpoints), and relies on the SSM Agent.
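A minimal Session Manager connection from the CLI might look like this (assumes the Session Manager plugin for the AWS CLI is installed; the instance ID is a placeholder):

    aws ssm start-session --target i-0123456789abcdef0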
Understanding IP addressing and its types is foundational for EC2 networking.
An IP address is a unique identifier for each device on a network. IPv4 (32-bit, e.g., 10.0.0.5) is the most widely used, facing address exhaustion. IPv6 (128-bit, hexadecimal) offers a virtually unlimited supply of addresses.
Technical Specs: IPv4: 32-bit; IPv6: 128-bit
Private IP address: Identifies an instance within a Virtual Private Cloud (VPC) and is not directly accessible from the internet. Multiple instances can have the same private IP address across different VPCs. Every EC2 instance receives at least one private IP address upon launch, which is static and persists even after restarts, facilitating internal communication within the VPC. Common private IP ranges include Class A: 10.0.0.0 to 10.255.255.255, Class B: 172.16.0.0 to 172.31.255.255, Class C: 192.168.0.0 to 192.168.255.255.
Public IP address: A unique address assigned to an instance for internet accessibility. By default, these are ephemeral and can change when an instance is stopped and restarted. For static IPs, Elastic IPs are required. Public IPs are used for resources that need to be accessible from the internet and are routable globally.
An Elastic IP address is a static, public IPv4 address that you can allocate to your AWS account and associate with an EC2 instance. This provides a static IP that does not change even if the instance is stopped and restarted, making it useful for masking instance failures by reassociating the Elastic IP with a healthy instance. Elastic IPs incur charges if they are not associated with a running instance. They are static and bound to a specific AWS region. IPv6 is not supported for EIPs. AWS accounts typically have a soft limit of 5 EIPs per region.
Technical Specs: IPv4 only; soft limit of 5 per region
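The Elastic IP lifecycle, sketched with placeholder IDs:

    aws ec2 allocate-address --domain vpc                  # returns an AllocationId
    aws ec2 associate-address --instance-id i-0123456789abcdef0 --allocation-id eipalloc-0123456789abcdef0
    aws ec2 disassociate-address --association-id eipassoc-0123456789abcdef0
    aws ec2 release-address --allocation-id eipalloc-0123456789abcdef0   # stops charges for the unused EIP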
An ENI acts as a virtual network card for your EC2 instances.
An ENI is a virtual network interface that can be attached to an EC2 instance within a VPC. It provides IP addresses, a MAC address, and security group associations. Each ENI possesses a private IP address, an optional public IP address, associated security groups, and a MAC address.
ENIs can be created independently and attached/detached from instances on the fly, facilitating high availability and disaster recovery scenarios by allowing an ENI to be moved between instances. An ENI is bound to a specific Availability Zone (AZ); an ENI created in one AZ can only be attached to an instance launched in the same AZ.
Technical Specs: Bound to a specific Availability Zone
ENIs are used for multihoming (assigning multiple IP addresses or network interfaces to an instance) and failover architecture (detaching an ENI from a failed instance and reattaching it to a healthy instance).
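A failover move might be sketched as follows (all IDs are placeholders); note the ENI is created in a subnet, which ties it to that subnet's AZ:

    aws ec2 create-network-interface --subnet-id subnet-0123456789abcdef0 --groups sg-0123456789abcdef0 --description "failover ENI"
    aws ec2 attach-network-interface --network-interface-id eni-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device-index 1
    # later, detach and reattach the same ENI to a healthy standby instance
    aws ec2 detach-network-interface --attachment-id eni-attach-0123456789abcdef0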
EC2 instances are secured through virtual firewalls and IAM roles.
Security Groups act as virtual firewalls for EC2 instances, controlling inbound and outbound traffic. They are stateful, meaning if an inbound request is allowed, the corresponding outbound response is automatically permitted. They regulate access to ports and authorize IP ranges.
All inbound traffic is blocked by default, while all outbound traffic is allowed by default. Rules allow traffic based on protocol, port range, and source/destination (IP addresses, CIDR blocks, or other security groups); security groups contain only allow rules, so anything not explicitly allowed is implicitly denied.
Technical Specs: Default Inbound: Block all; Default Outbound: Allow all
A single Security Group can be associated with multiple EC2 instances. Security Groups are tied to a specific AWS Region and VPC. Traffic is filtered before reaching the EC2 instance. They can reference other Security Group IDs as sources instead of IP ranges.
Commonly configured ports include: 22 (SSH), 21 (FTP), 22 (SFTP), 80 (HTTP), 443 (HTTPS), 3389 (RDP).
Technical Specs: 22: SSH, 21: FTP, 22: SFTP, 80: HTTP, 443: HTTPS, 3389: RDP
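For example, inbound rules could be added with the AWS CLI as follows (the group ID and CIDR are placeholders):

    # Allow SSH only from a trusted network and HTTP from anywhere
    aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 22 --cidr 203.0.113.0/24
    aws ec2 authorize-security-group-ingress --group-id sg-0123456789abcdef0 --protocol tcp --port 80 --cidr 0.0.0.0/0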
IAM Roles provide a secure way for EC2 instances to access other AWS services without embedding access keys directly into applications or instance configurations. An EC2 instance can assume an IAM role, which grants it temporary credentials to perform actions specified by the role's attached policies (e.g., granting an instance read-only access to list IAM users). This adheres to the principle of least privilege. An EC2 instance must use an instance profile to leverage an IAM role.
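A sketch of wiring a role to an instance, assuming a role named EC2-S3-ReadOnly already exists with the desired policies attached (all names and IDs are illustrative):

    aws iam create-instance-profile --instance-profile-name EC2-S3-ReadOnly-Profile
    aws iam add-role-to-instance-profile --instance-profile-name EC2-S3-ReadOnly-Profile --role-name EC2-S3-ReadOnly
    aws ec2 associate-iam-instance-profile --instance-id i-0123456789abcdef0 --iam-instance-profile Name=EC2-S3-ReadOnly-Profile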
AWS offers various purchasing options to optimize costs.
On-Demand Instances
Pay-as-you-go model, billed per second (Linux/Windows) or per hour (macOS). Offers no upfront cost or long-term commitment.
billing: Per second (Linux/Windows), per hour (macOS)
upfront_cost: None
commitment: None
Use Cases:
- Unpredictable workloads
- Short-term/uninterrupted workloads
- Testing environments
- One-time batch jobs
- Applications with fluctuating traffic patterns
- Startups
Reserved Instances (RIs)
Offer significant discounts (up to 72%) compared to On-Demand in exchange for a 1 or 3-year commitment to specific instance attributes. Standard RIs offer deeper discounts but are less flexible; Convertible RIs offer slightly lower savings (up to 66%) but allow changing instance type, family, OS, and tenancy during the commitment period. RIs can be Zonal (guarantees capacity) or Regional (more flexible, no capacity guarantee). Payment options include No Upfront, Partial Upfront, All Upfront (higher discount for more upfront).
discount: Up to 72%
commitment: 1 or 3 years
payment_options: No Upfront, Partial Upfront, All Upfront
scope: Regional or Zonal
convertible_discount: Up to 66%
Use Cases:
- Steady-state usage
- Predictable workloads (e.g., databases)
- Long-term commitment with potential for future workload changes (Convertible RIs)
EC2 Savings Plans
A flexible pricing model offering up to 72% discount based on a commitment to a consistent amount of usage (e.g., dollar amount per hour) over a 1 or 3-year term, not tied to specific instance attributes. Compute Savings Plans (up to 66% discount) apply to EC2, Fargate, and Lambda usage, offering the most flexibility. EC2 Savings Plans (up to 72% discount) require commitment to an instance family within a region, but allow changes in size, OS, and tenancy. SageMaker Savings Plans offer savings for eligible SageMaker ML instance usage.
discount: Up to 72%
commitment: 1 or 3 years
commitment_type: Consistent usage (dollar amount per hour)
flexibility: Across instance size, OS, and tenancy (Compute SP); within instance family (EC2 SP)
compute_sp_discount: Up to 66%
ec2_sp_discount: Up to 72%
sagemaker_sp_discount: Up to 64%
Use Cases:
- Long-term usage commitment with flexibility
- Dynamic workloads where instance type needs may evolve
EC2 Spot Instances
Provide discounts of up to 90% by using spare EC2 capacity. These instances can be interrupted by AWS with a 2-minute warning if the capacity is needed back or the Spot price rises above your maximum price. Spot Instances are not eligible for the AWS Free Tier. Request types include One-Time (requested once; not replaced if interrupted) and Persistent (automatically re-requested if terminated or reclaimed). Interruption behavior can be configured to hibernate, stop, or terminate.
discount: Up to 90%
interruption_warning: 2-minute
free_tier_eligibility: None
spot_blocks: Fixed duration (1-6 hours) without interruption, lower savings
Use Cases:
- Fault-tolerant, flexible workloads
- Batch jobs
- Data analysis
- Image processing
- Distributed workloads
- ML training
- CI/CD pipelines
- Test/deployment workloads
- Handling seasonal spikes
EC2 Spot Fleets
A collection of Spot Instances and optionally On-Demand Instances that meets a target capacity within price and allocation-strategy constraints, optimizing cost and availability. It manages a group of Spot Instances as a unit, simplifying individual management. Can use launch templates to provision a mix of On-Demand and Spot instances across different instance types and Availability Zones.
allocation_strategies: Lowest Price, Capacity Optimized, Price Capacity Optimized (recommended for most spot workloads), Diversified
Use Cases:
- Reduces costs by mixing cheaper spot instances with on-demand instances for mixed workloads (batch and critical jobs)
EC2 Dedicated Hosts
A physical server fully dedicated to a single customer. Provides visibility into sockets, cores, and host affinity, offering the most control. Billing is per host. It is the most expensive option.
billing: Per host
cost: Most expensive
control_level: Highest (physical server)
Use Cases:
- Compliance requirements
- Existing server-bound software licenses (BYOL)
- When shared hardware is not permitted
EC2 Dedicated Instances
Instances run on hardware dedicated to the customer, but the hardware may be shared with other instances within the same account. You do not control the physical server. Billing is per instance. Supports BYOL with license mobility agreements.
billing: Per instance
shared_hardware: May be shared with other instances within the same account
Use Cases:
- Compliance or regulatory requirements
- Specific per-core software licensing needs
EC2 Capacity Reservations
Reserve On-Demand instance capacity in a specific Availability Zone for any duration. You pay for the reservation at On-Demand rates, regardless of whether an instance is running. Can be created or cancelled anytime, with no time commitment.
billing: On-Demand rate (for reserved capacity)
commitment: None (can be cancelled anytime)
capacity_guarantee: Yes, in a specific AZ
Use Cases:
- Mission-critical workloads requiring guaranteed capacity
- Short-term but uninterrupted bulk instance needs in a specific AZ
AWS offers advanced features for EC2 instance placement and state management.
Amazon Elastic Block Store (EBS) provides persistent, block-level storage volumes for EC2 instances, functioning like virtual hard drives connected over the network.
EBS provides persistent, block-level storage volumes that attach to EC2 instances. Data stored on EBS volumes survives instance stops, and survives termination unless 'delete on termination' is enabled for the volume (the default for root volumes). EBS volumes are provisioned within a specific Availability Zone and can only be attached to EC2 instances residing in that same AZ. By default, a single EBS volume attaches to one EC2 instance, with multi-attach available for specific high-performance volume types. Users provision volumes based on desired size and performance, and billing is based on provisioned capacity.
Technical Specs: Capacity: 1GB to 64TB
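Creating and attaching a volume follows the same AZ constraint (the IDs and AZ below are placeholders):

    # The volume must be created in the same AZ as the instance it will attach to
    aws ec2 create-volume --availability-zone us-east-1a --size 20 --volume-type gp3
    aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0123456789abcdef0 --device /dev/sdf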
AWS Systems Manager (SSM) is a secure, end-to-end management solution for AWS Cloud and hybrid cloud environments. It offers a centralized and consistent way to gather operational insights and perform routine management tasks across multiple AWS services.
SSM allows establishing SSH or RDP sessions without key pairs or security group modifications, setting up automatic operating system or software patching, and managing EC2 instances for personal learning or organizational use cases.
The Systems Manager Agent (SSM agent) must be installed on the EC2 instance. Many recent AWS AMIs come with the SSM agent pre-installed. For older AMIs or specific OS versions, manual installation is required.
An IAM role or EC2 instance profile with required privileges is necessary for the SSM agent to communicate with the Systems Manager service, adhering to AWS’s zero-trust policy. The AmazonSSMManagedInstanceCore AWS managed policy provides the necessary permissions.
The EC2 Security Group must allow outbound connections to the SSM service. This acts as an instance-level firewall. Neglecting this step is a common reason for SSM-related issues.
Automation: Perform automated steps across AWS Cloud and hybrid environments. Can integrate with on-premises and other cloud VMs.
Maintenance Windows: Create custom maintenance windows to automatically apply software or OS patches.
Patch Manager: Automatically apply OS-level or security patches to SSM-managed EC2 instances, and apply software updates/patches. Ideal for patching a large number of EC2 instances (10 or more).
Parameter Store: Store credentials, passwords, or parameters in Parameter Store to avoid hardcoding. This service is free to use.
Inventory: Create an inventory of software packages, versions, etc., for EC2 instances or hybrid environments.
Documents: Runbooks or sets of steps for common maintenance and deployment tasks. Can be AWS-managed or custom. Documents are commonly used with AWS Config for remediation of non-compliant resources.
Run Command: Remotely execute commands to securely manage the configuration of managed nodes. Usable via the AWS Management Console, CLI, AWS Tools for PowerShell, or SDKs. A free service, capable of executing commands on multiple instances simultaneously.
Session Manager: Establish RDP or SSH sessions without opening TCP ports or using key pairs. A powerful tool for secure access.
Compliance: Track infrastructure compliance and view a dashboard of compliant vs. non-compliant instances/configurations.
The demonstration shows the step-by-step process to integrate an EC2 instance with AWS Systems Manager.
Integrate a Linux EC2 instance with AWS Systems Manager to enable its management capabilities.
1. Access the EC2 console and navigate to running instances.
💡 To view and select the target instance for SSM integration.
2. Verify SSM Agent installation on the Linux EC2 instance.
💡 The SSM Agent must be present and running for SSM to manage the instance.
3. Create an IAM Role for EC2 instances.
💡 The SSM agent requires an IAM role with specific permissions to communicate with the Systems Manager service.
4. Attach the newly created IAM Role to the EC2 Instance.
💡 To grant the EC2 instance the necessary permissions to interact with SSM.
5. Verify Security Group Outbound Rules.
💡 The EC2 Security Group must allow outbound connections to the SSM service endpoints.
6. Validate SSM Integration.
💡 To confirm that the EC2 instance is successfully registered as a managed node with Systems Manager.
7. Install SSM Agent (if not pre-installed).
💡 For older AMIs or specific OS versions that do not include the SSM agent by default.
sudo yum install -y https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/linux_amd64/amazon-ssm-agent.rpm
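After installation, the agent can typically be enabled and verified with systemd (assuming a systemd-based distribution such as Amazon Linux 2):

    sudo systemctl enable amazon-ssm-agent   # start the agent at boot
    sudo systemctl start amazon-ssm-agent
    sudo systemctl status amazon-ssm-agent   # confirm the agent is running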
An Auto Scaling group is a collection of EC2 instances treated as a logical grouping for the purpose of auto-scaling and management.
Auto Scaling Groups (ASGs) manage EC2 instances automatically, adjusting the number of servers based on application traffic or metrics like CPU utilization or network traffic. This functionality is crucial for high availability and cost optimization. ASGs can spin instances across multiple AZs and automatically replace failed instances using launch templates, providing self-healing capabilities.
Launch Templates are the newer, AWS-recommended method, offering more flexibility, supporting multiple versions, and advanced options (e.g., Spot instances, T2/T3 unlimited, multiple instance types in one ASG). Launch Configurations are older and deprecated, cannot be edited after creation, and support only basic parameters.
Technical Specs: Launch Configurations deprecated since Dec 31, 2023
ASG attributes define the desired scaling behavior and capacity limits.
Minimum Capacity: The lowest number of EC2 instances your Auto Scaling group will ever run. Ensures baseline capacity and prevents application outages during off-peak hours when demand drops.
Desired Capacity: The target number of EC2 instances the Auto Scaling group aims to maintain at a given point in time for steady-state workload. Can be set manually or adjusted automatically through scaling policies. Cannot be lower than the minimum capacity.
Maximum Capacity: The highest number of EC2 instances your Auto Scaling group will allow. Controls costs and prevents overprovisioning by limiting instance launches even during extreme traffic spikes.
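These three attributes appear directly in the create call; a sketch with placeholder names and subnet IDs:

    aws autoscaling create-auto-scaling-group \
      --auto-scaling-group-name web-asg \
      --launch-template LaunchTemplateName=web-template,Version='$Latest' \
      --min-size 1 --desired-capacity 2 --max-size 4 \
      --vpc-zone-identifier "subnet-0123456789abcdef0,subnet-0fedcba9876543210"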
Scaling policies are rulebooks that dictate when and how an ASG scales.
Target Tracking Scaling: The most common and easiest to set up. You set a target metric (e.g., average CPU utilization at 50%), and Auto Scaling adjusts the instance count to stay near the target, like a thermostat maintaining a constant temperature. It handles scaling automatically.
Step Scaling: Scaling happens in steps based on how far a metric is from a threshold, allowing for more granular control. For example, if CPU > 60%, add 1 instance; if CPU > 80%, add 2 instances.
Simple Scaling: An older method with a single threshold and action. Less flexible, as it can only perform one action per alarm.
Scheduled Scaling: Allows you to plan scaling actions in advance based on predictable traffic patterns, such as increasing capacity on weekdays from 9 AM to 5 PM.
Predictive Scaling: Uses machine learning to forecast future traffic based on historical data and schedules scaling actions proactively, for example, adding instances before Black Friday.
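A target tracking policy for the hypothetical ASG above could be sketched like this, keeping average CPU near 50%:

    aws autoscaling put-scaling-policy \
      --auto-scaling-group-name web-asg \
      --policy-name cpu-target-50 \
      --policy-type TargetTrackingScaling \
      --target-tracking-configuration '{"PredefinedMetricSpecification":{"PredefinedMetricType":"ASGAverageCPUUtilization"},"TargetValue":50.0}'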
Auto Scaling Groups work in conjunction with CloudWatch to monitor metrics and trigger scaling actions.
CloudWatch monitors EC2 instance metrics (CPU, memory, network, etc.). If a metric crosses a predefined threshold (e.g., CPU > 20%), CloudWatch triggers an alarm. The CloudWatch alarm notifies the relevant Auto Scaling policy, which then instructs the Auto Scaling group to perform a scaling action (e.g., add an instance). The Auto Scaling group launches a new EC2 instance. This process works in reverse for scaling in (removing instances).
The cooldown period prevents rapid or repetitive scaling.
A period of time after an auto-scaling action (adding or removing an instance) during which no further scaling actions are triggered. This prevents rapid or repetitive scaling due to quick metric fluctuations and gives newly launched instances time to start up and stabilize, ensuring metrics reflect true system load. If a scale-out event occurs and the cooldown period is 300 seconds, no further scale-out will be considered for the next 5 minutes.
Technical Specs: Default Cooldown: 300 seconds
This demonstration shows how EC2 instances automatically scale based on demand and health checks, involving the creation and configuration of various AWS resources.
Configure EC2 instances as simple web applications, then set up and integrate an Auto Scaling Group (ASG), Application Load Balancer (ALB), and Target Group to observe automatic scaling and self-healing capabilities.
Prerequisites
- An AWS account with appropriate IAM permissions.
- Existing Security Groups for EC2 instances and the ALB.
- A User Data script for initial instance setup (a minimal example follows this list).
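A minimal User Data script of the kind referenced above might look like the following (assumes an Amazon Linux 2 AMI; the page content is purely illustrative):

    #!/bin/bash
    # Install and start a simple web server, then publish a page identifying the instance
    yum update -y
    yum install -y httpd
    systemctl enable httpd
    systemctl start httpd
    echo "<h1>Hello from $(hostname -f)</h1>" > /var/www/html/index.html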
1. Create a Launch Template.
💡 Defines the configuration for new EC2 instances launched by the autoscaling group.
2. Create a Target Group.
💡 Used by the Load Balancer to route traffic to registered instances and perform health checks.
3. Create an Application Load Balancer (ALB).
💡 To distribute incoming application traffic across multiple instances for high availability and fault tolerance.
4. Create an Auto Scaling Group (ASG).
💡 To manage the collection of EC2 instances, ensuring a desired number are running and scaling as needed.
5. Simulate an Outage Scenario.
💡 To test the self-healing capability of the ASG.
6. Simulate Instance Recovery.
💡 To observe the ASG scaling down to maintain desired capacity.
Elastic Load Balancing (ELB) is a service that automatically distributes incoming application traffic across multiple targets, such as EC2 instances.
ELB distributes incoming application traffic across multiple targets (EC2 instances, containers) for high availability and fault tolerance. It acts as a 'traffic cop' to prevent single instances from becoming overloaded or creating bottlenecks, improving both performance and availability. ELB performs health checks and routes traffic only to healthy instances.
AWS offers several types of load balancers: Application Load Balancer (ALB) for HTTP/S traffic, Network Load Balancer (NLB) for ultra-low latency TCP traffic, Gateway Load Balancer (GWLB) for network appliances, and Classic Load Balancer (CLB) which is a legacy load balancer not recommended for new applications.
The Application Load Balancer (ALB) operates at the OSI Layer 7 (Application Layer) and is designed for HTTP and HTTPS traffic.
ALB distributes incoming HTTP/HTTPS traffic across multiple backend targets. Its intelligence allows it to inspect request content and make routing decisions. It integrates seamlessly with services like AWS Lambda and is ideal for modern application architectures.
Target groups are logical groupings of backend targets (EC2 instances, containers, Lambda functions, private IP addresses) to which ALB routes traffic. Each target group must be associated with at least one listener rule, and health checks can be configured at this level.
ALBs continuously monitor the health of targets by periodically sending requests to a configured path, protocol, and port. Unhealthy targets are automatically removed from active traffic rotation and reinstated once healthy. If using HTTPS for health checks, the associated certificate must be valid.
Content-based routing is a hallmark feature of ALBs, allowing sophisticated routing decisions beyond simple IP and port forwarding. This feature is exclusive to ALBs.
Reduces the number of load balancers needed for multiple applications, supports routing to different microservices, and facilitates personalization (e.g., mobile-optimized pages).
Host-based routing: Routes traffic based on the domain name specified in the Host header of the HTTP request. Useful for hosting multiple subdomains or distinct application modules on a single ALB (e.g., user.cloudexpertsolution.com to a user management service).
Path-based routing: Routes traffic based on the URL path in the request. Commonly used in microservices to direct requests to the appropriate backend services (e.g., /images to an image service, /api to an API gateway service).
Query string routing: Utilizes parameters found in the query string (the part of the URL following a ?). For example, a URL with ?service=search could be routed to a product search target group.
HTTP header routing: ALBs can route traffic based on specific HTTP headers, including custom headers. Use cases include A/B testing, device-specific content, and environment routing (e.g., an X-Environment: beta header to a beta testing target group).
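For illustration, a path-based rule can be added to an existing listener like this (LISTENER_ARN and API_TG_ARN stand in for your own ARNs):

    # Send /api/* to a dedicated target group; lower priority numbers are evaluated first
    aws elbv2 create-rule \
      --listener-arn "$LISTENER_ARN" \
      --priority 10 \
      --conditions Field=path-pattern,Values='/api/*' \
      --actions Type=forward,TargetGroupArn="$API_TG_ARN"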
ALBs offer a suite of advanced features to enhance application availability, performance, and manageability.
Slow Start Mode: Prevents newly registered or re-registered targets from being overwhelmed by an immediate surge of traffic. The ALB gradually increases the load sent to these targets, allowing them time to initialize. Configurable between 30 and 900 seconds.
Technical Specs: Duration: 30-900 seconds
Sticky Sessions (Session Affinity): Ensures that all requests from a particular client are consistently routed to the same backend target for a defined duration. The ALB inserts a cookie into the initial response. Beneficial for stateful applications where session data is maintained on individual backend instances. Duration is configurable in seconds.
Technical Specs: Duration: Configurable in seconds (e.g., 30 seconds to 7 days)
SSL/TLS Termination (Offloading): ALBs can handle the computationally intensive task of TLS encryption and decryption. Clients connect via HTTPS; the ALB decrypts, forwards the request as unencrypted HTTP to the backend, and re-encrypts the response. Benefits include reduced backend load and simplified certificate management via AWS Certificate Manager (ACM).
Mutual TLS (mTLS): An extension of TLS requiring both client and server to authenticate each other using digital certificates. Provides higher security, often used in Zero Trust architectures, for securing B2B APIs, and for ensuring only trusted clients can access an application.
Server Name Indication (SNI): Allows hosting multiple HTTPS websites on a single ALB. Clients specify the hostname during the TLS handshake, enabling the ALB to present the correct SSL/TLS certificate for that specific hostname on a single listener port (typically 443). Simplifies certificate management and reduces costs.
Technical Specs: Listener port: Typically 443
Fixed Responses: ALBs can respond directly to HTTP requests with a predefined HTTP status code, message, and content type, without forwarding to any backend target. Used for custom maintenance messages or blocking access.
Redirects: ALBs can issue client-side redirects, instructing the client's browser to make a new request to a different URL. Used for HTTP-to-HTTPS redirection, domain changes, or URL normalization.
Weighted Target Groups: Allows precise control over traffic distribution between multiple target groups by assigning weights. Used for A/B testing (e.g., 10% to a new version, 90% to the stable version) or Blue/Green deployments.
X-Forwarded Headers: The ALB adds X-Forwarded-For (original client IP), X-Forwarded-Proto (client protocol), and X-Forwarded-Port (client port) headers to requests sent to backend servers, because backend servers otherwise only see the ALB's internal IP.
A crucial security best practice involves using the ALB as a protective layer for backend EC2 instances.
Placing an ALB in front of EC2 instances prevents their direct internet exposure. The ALB's security group allows inbound traffic on ports 80/443 from the internet. The EC2 instances' security groups should then only allow inbound traffic from the ALB's security group, enabling EC2 instances to run with private IP addresses only, with all inbound internet traffic flowing through the ALB.
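The chaining can be expressed directly in security group rules; sg-ALB and sg-EC2 below are placeholders for the two groups' IDs:

    # ALB security group: open to the internet on 443
    aws ec2 authorize-security-group-ingress --group-id sg-ALB --protocol tcp --port 443 --cidr 0.0.0.0/0
    # EC2 security group: accept traffic on port 80 only from the ALB's security group
    aws ec2 authorize-security-group-ingress --group-id sg-EC2 --protocol tcp --port 80 --source-group sg-ALB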
The Network Load Balancer (NLB) is a Layer 4 load balancer built for extreme high performance and low latency.
NLB operates at Layer 4 (TCP and UDP levels) and is designed for applications handling millions of requests per second or high-throughput TCP traffic. It offers ultra-low latency.
NLB can assign Elastic IP addresses and supports AWS-provided static IP per Availability Zone, useful for IP whitelisting. Supported target types include EC2 instances, IP addresses (inside or outside VPC), and Application Load Balancers. Listeners support TCP, UDP, and TLS.
NLB supports TLS offloading (termination at Layer 4) using TLS listeners, where NLB decrypts, then re-encrypts to backend targets. It also supports TLS pass-through, where NLB forwards encrypted traffic without decryption to backend targets. Configuration requires a TLS target group on port 443 for offloading, or a TCP target group on port 443 for pass-through.
Configurable across multiple Availability Zones with automatic failover to healthy AZs. By default, NLB distributes traffic only to targets in its own AZ, but enabling cross-zone load balancing distributes traffic across all registered targets in all enabled AZs (incurs data transfer fees).
Technical Specs: Cross-zone load balancing incurs data transfer fees
Client IP Preservation: Enabled by default, allowing backend targets to see the client's original IP address rather than the NLB's private IP. This can be toggled on/off in the target group attributes.
The Gateway Load Balancer (GWLB) is designed for developing, scanning, and managing third-party virtual appliances.
GWLB operates at Layer 3 (OSI Network Layer) and uses the GENEVE protocol on port 6081. It sits between VPC and the internet, or between VPCs, to inspect all traffic, routing it to virtual appliances without altering packet headers. It performs autoscaling for appliances and provides GENEVE encapsulation, allowing metadata to pass for deep inspection.
Technical Specs: OSI Layer: 3; Protocol: GENEVE on port 6081
GWLB is not for HTTP/HTTPS traffic. Its sole purpose is security inspection or network analysis, not content routing.
Connection Draining is an ELB feature ensuring in-flight requests complete before an instance is deregistered or marked unhealthy.
This feature prevents data loss or errors for ongoing user requests when instances are terminated, fail health checks, or are updated. The ELB stops sending new requests to the instance and allows existing connections to finish within a configurable timeout duration (from 0 to 3,600 seconds, default 300 seconds). If a request doesn't complete within this time, the connection will be closed. It is referred to as 'Connection Draining' for CLB and 'Deregistration Delay' for ALB and NLB.
Technical Specs: Timeout Duration: 0-3,600 seconds (default 300 seconds)
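The delay is a target group attribute; a sketch that shortens it from the 300-second default to 120 seconds (TARGET_GROUP_ARN is a placeholder):

    aws elbv2 modify-target-group-attributes \
      --target-group-arn "$TARGET_GROUP_ARN" \
      --attributes Key=deregistration_delay.timeout_seconds,Value=120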
Different ELB types are optimized for various workloads and traffic patterns.
| Option | OSI Layer | Traffic Type | Performance | Key Features / Use Cases |
|---|---|---|---|---|
| Application Load Balancer (ALB) | Layer 7 (Application) | HTTP, HTTPS | Scalable, supports intelligent routing | Content-based routing (host, path, query string, header), microservices, containerized apps, serverless (Lambda), A/B testing, SNI, TLS offloading. |
| Network Load Balancer (NLB) | Layer 4 (Transport) | TCP, UDP, TLS, TCP_UDP | Highest performing, ultra-low latency, millions of requests/sec | Static Elastic IPs, preserves source IP, high-throughput TCP traffic, gaming, IoT, TLS pass-through, cross-zone load balancing (with cost). |
| Gateway Load Balancer (GWLB) | Layer 3 (Network) | All IP traffic (via GENEVE protocol on port 6081) | Designed for virtual appliance deployment | Deploying third-party virtual appliances (firewalls, IDS, DPI), traffic inspection, packet alteration prevention, GENEVE encapsulation. |
| Classic Load Balancer (CLB) | Layer 4 and Layer 7 | HTTP, HTTPS, TCP, SSL/TLS | Legacy, less scalable | Legacy applications, basic health checks, SSL termination, sticky sessions. Not recommended for new applications. |
Effective cleanup and adherence to best practices are crucial for cost management, security, and architectural soundness.
Always terminate EC2 instances and release associated Elastic IPs when they are no longer needed to avoid incurring unnecessary charges. Delete unused IAM roles and ENIs. For Elastic IPs, first disassociate the address from the instance and then release it; an Elastic IP that remains allocated but unassociated continues to incur charges.
Apply security groups with the principle of least privilege, use IAM roles for service access (instead of access keys), and manage SSH key pairs securely. Regularly rotate access keys and passwords according to organizational best practices.
Understand and utilize the various purchasing options (On-Demand, Reserved Instances, Savings Plans, Spot Instances) to match your workload's predictability and fault tolerance.
For dynamic IP requirements or high availability, favor DNS registration or Elastic Load Balancers over Elastic IPs; relying on an Elastic IP attached to a single instance is often a sign of a weaker architecture.
Glossary
Amazon Machine Image (AMI)
A template containing the operating system and pre-installed software required to launch an EC2 instance.
Elastic IP address (EIP)
A static, public IPv4 address that you can allocate to your AWS account and associate with an EC2 instance, providing a consistent public IP.
Elastic Network Interface (ENI)
A virtual network interface that can be attached to an EC2 instance within a VPC, providing IP addresses, a MAC address, and security group associations.
EC2 Instance Store
Physically attached storage for temporary data that is lost when the instance is stopped or terminated.
Security Group
Acts as a virtual firewall for EC2 instances, controlling inbound and outbound traffic.
User Data
A script that executes when an EC2 instance boots up for the first time, automating initial configuration tasks.
Instance Profile
A container for IAM roles that enables EC2 instances to obtain temporary credentials.
Root Volume
The primary boot volume for an EC2 instance, typically named /dev/sda1.
Spanned Volume
A volume that combines space from multiple physical disks into a single logical volume.
Simple Volume
A volume created on a single physical disk.
Deregistration Delay
A feature in Elastic Load Balancers ensuring in-flight requests complete before an instance is deregistered or marked unhealthy.
Connection Draining
The legacy term for Deregistration Delay, particularly for Classic Load Balancers.
Target Tracking Scaling
An Auto Scaling policy where you set a target metric (e.g., average CPU utilization) and Auto Scaling adjusts instance count to stay near the target.
Cooldown Period
A period of time after an auto-scaling action during which no further scaling actions are triggered.
User Data Scripting
A script that executes when an EC2 instance boots up for the first time to automate initial configuration tasks.