7 AWS Networking Best Practices for Scalable Operations



Much like colocation or captive data centers, public cloud environments consist of physical networks, which are broken down into virtual networks and provided to customers. In this blog post we will go over what a typical cloud deployment looks like and some AWS networking best practices for configuring and managing it.

AWS Networking Basics

  1. VPC or Virtual Private Cloud – is a virtual network created within a customer’s AWS account where its compute and other resources are hosted. It is by design isolated from any other network but can be configured to permit connectivity to other VPCs via “VPC Peering” or to captive data centers via VPN or “AWS Direct Connect”.
  2. Subnets – are ranges of IP addresses carved out of a VPC’s address space
  3. Route Tables – are created within a VPC, associated with one or more subnets, and define intra-VPC or outbound routing
  4. Internet Gateway or IGW – is a managed VPC component that enables communication between a resource within a VPC and the internet
  5. NAT Gateway – is a managed VPC component that enables outbound internet communication from resources within a VPC, but NOT the other way around
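
To make the relationship between these components concrete, here is a minimal Python sketch using only the standard library. The CIDR ranges and the target names (`igw-example`, `nat-example`) are hypothetical, and the code only models how a route table picks the most specific matching route; it does not call AWS.

```python
import ipaddress

# Hypothetical VPC range, split into four /18 subnets.
vpc = ipaddress.ip_network("10.0.0.0/16")
subnets = list(vpc.subnets(new_prefix=18))
public_subnet, private_subnet = subnets[0], subnets[1]

# Route tables modelled as (destination CIDR, target) pairs.
# "local" covers intra-VPC traffic; the default route decides the internet path.
default_route = ipaddress.ip_network("0.0.0.0/0")
public_routes = [(vpc, "local"), (default_route, "igw-example")]
private_routes = [(vpc, "local"), (default_route, "nat-example")]


def next_hop(routes, destination):
    """Return the target of the most specific route containing destination."""
    address = ipaddress.ip_address(destination)
    matches = [(net, target) for net, target in routes if address in net]
    if not matches:
        return None
    return max(matches, key=lambda pair: pair[0].prefixlen)[1]


print(next_hop(public_routes, "10.0.70.5"))       # intra-VPC traffic stays local
print(next_hop(public_routes, "93.184.216.34"))   # leaves via the internet gateway
print(next_hop(private_routes, "93.184.216.34"))  # leaves via the NAT gateway
```

A subnet is "public" or "private" only by virtue of where its route table sends the default route, which is why the two tables above differ in a single entry.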

Let’s now look at some best practices from an AWS networking perspective.

Prod and non-prod VPCs

Create separate VPCs for your production and non-production workloads, which ensures a hard logical separation between these environments unless connectivity is explicitly enabled via VPC peering or other means. It also lets you enforce different levels of network security controls across prod and non-prod environments. Organisations at times use separate AWS accounts to keep these environments segregated from an IAM perspective as well, an approach worth adopting when you rely heavily on native AWS services.
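
If you expect to peer these VPCs at some point, their address ranges must not overlap, so it helps to validate the CIDR plan up front. The following is a small sketch using Python's standard library; the environment names and CIDR ranges are hypothetical.

```python
import ipaddress
from itertools import combinations

# Hypothetical CIDR plan for three environments.
vpc_cidrs = {
    "prod": "10.10.0.0/16",
    "non-prod": "10.20.0.0/16",
    "sandbox": "10.20.128.0/17",
}


def overlapping_pairs(cidrs):
    """Return pairs of environment names whose CIDR ranges overlap."""
    networks = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if networks[a].overlaps(networks[b])
    ]


print(overlapping_pairs(vpc_cidrs))
```

Here the sandbox range sits inside the non-prod range, so that pair is reported and would need re-planning before the two could be peered.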

Perimeter Controls

Public facing properties are the first in line of attack and hence should be protected with adequate network security controls. Let’s go over some of these:

  1. Anti-DDoS – since most public facing components (such as ALB, NAT Gateway etc.) are AWS managed services, AWS also handles common infrastructure-level threats such as DDoS. AWS Shield Standard is applied automatically at no extra cost, whereas an organisation would need to subscribe to AWS Shield Advanced in order to benefit from additional protection
  2. Layer 7 security – use a Web Application Firewall (popularly termed WAF) wherever there is an HTTP(S) endpoint open publicly, to ensure basic web application attacks are mitigated before they reach your web layers
  3. Load balancer security – use HTTPS listeners, with the HTTP ones enforcing a permanent redirection to HTTPS, and use the latest / most secure TLS policies and ciphers where applicable
  4. Elastic IPs – these should be avoided unless absolutely required for services that cannot work with a load balancer and are purely standalone in nature (e.g. self-hosted mail, DNS etc.). While using elastic IPs, carefully review the associated security groups to ensure only the required ports / protocols are open
  5. Non HTTP services – in case one is using a self-hosted service such as DNS or SMTP, these won’t be protected by typical L7 WAF implementations and would need an IPS such as Snort or other commercial virtual devices to protect them
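
The load balancer check above lends itself to automation. Below is a minimal sketch that flags HTTP listeners which do not permanently redirect to HTTPS. It assumes listener dictionaries shaped like the ELBv2 `DescribeListeners` response (`Protocol`, `DefaultActions`, `RedirectConfig`); the ARNs are hypothetical placeholders and no AWS call is made.

```python
def insecure_listeners(listeners):
    """Return ARNs of HTTP listeners lacking a permanent redirect to HTTPS."""
    findings = []
    for listener in listeners:
        if listener.get("Protocol") != "HTTP":
            continue  # HTTPS and other listener types are out of scope here
        redirects_to_https = any(
            action.get("Type") == "redirect"
            and action.get("RedirectConfig", {}).get("Protocol") == "HTTPS"
            and action.get("RedirectConfig", {}).get("StatusCode") == "HTTP_301"
            for action in listener.get("DefaultActions", [])
        )
        if not redirects_to_https:
            findings.append(listener.get("ListenerArn", "<unknown>"))
    return findings


# Hypothetical sample data standing in for a real API response.
sample_listeners = [
    {
        "ListenerArn": "arn:example:listener/http-redirect",
        "Protocol": "HTTP",
        "Port": 80,
        "DefaultActions": [
            {
                "Type": "redirect",
                "RedirectConfig": {
                    "Protocol": "HTTPS",
                    "Port": "443",
                    "StatusCode": "HTTP_301",
                },
            }
        ],
    },
    {
        "ListenerArn": "arn:example:listener/http-forward",
        "Protocol": "HTTP",
        "Port": 8080,
        "DefaultActions": [{"Type": "forward"}],
    },
    {
        "ListenerArn": "arn:example:listener/https",
        "Protocol": "HTTPS",
        "Port": 443,
        "DefaultActions": [{"Type": "forward"}],
    },
]

print(insecure_listeners(sample_listeners))
```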

Network Lists

AWS provides two types of network lists namely Security Groups and NACLs with two different purposes. 

Security Groups vs NACL:

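In the Linux world, you could think of a security group as a virtual firewall such as ufw applied on an AWS resource such as an EC2 instance, ALB etc., whereas a NACL is a set of rules that has the same effect on an entire subnet. Both can filter inbound and outbound traffic. The key differences are that security groups are stateful and support allow rules only, whereas NACLs are stateless and support both allow and deny rules. Hence, essentially, security groups are the primary control for access to individual resources, whereas NACLs act as a coarse subnet-level guardrail, for instance to explicitly block a known bad IP range.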
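
One difference worth seeing concretely is how the two are evaluated: NACL rules are processed in ascending rule-number order and the first match decides, whereas a security group only holds allow rules and any match permits the traffic. The sketch below is a deliberately simplified model (it ignores protocols and return traffic), and the rule dictionaries use a hypothetical shape rather than an AWS API format.

```python
import ipaddress


def _matches(rule, source_ip, port):
    """True when the source address and port fall within the rule."""
    source = ipaddress.ip_address(source_ip)
    in_cidr = source in ipaddress.ip_network(rule["cidr"])
    return in_cidr and rule["from_port"] <= port <= rule["to_port"]


def nacl_allows(rules, source_ip, port):
    """NACL model: lowest rule number first, first match decides."""
    for rule in sorted(rules, key=lambda r: r["number"]):
        if _matches(rule, source_ip, port):
            return rule["action"] == "allow"
    return False  # nothing matched: implicit deny


def security_group_allows(rules, source_ip, port):
    """Security group model: allow rules only, any match permits."""
    return any(_matches(rule, source_ip, port) for rule in rules)


# Hypothetical rules: block one range at the subnet, allow HTTPS otherwise.
nacl_rules = [
    {"number": 100, "action": "allow", "cidr": "0.0.0.0/0", "from_port": 443, "to_port": 443},
    {"number": 90, "action": "deny", "cidr": "203.0.113.0/24", "from_port": 0, "to_port": 65535},
]
sg_rules = [{"cidr": "0.0.0.0/0", "from_port": 443, "to_port": 443}]

print(nacl_allows(nacl_rules, "203.0.113.7", 443))          # denied by rule 90
print(nacl_allows(nacl_rules, "198.51.100.9", 443))         # allowed by rule 100
print(security_group_allows(sg_rules, "203.0.113.7", 443))  # no deny rules exist
```

Because a security group cannot express a deny, blocking a specific source range is a job for the NACL in this model.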

Here are some best practices while using network lists:
  1. Disable or avoid using default security groups – each VPC is provisioned with a “default” security group which allows all inbound traffic from other resources attached to the same group, and all outbound traffic. This group cannot be deleted, hence it is critical to remove its rules and avoid attaching it to any AWS resource
  2. Review “launch-wizard-*” security groups – you would notice that launching instances through the EC2 console wizard automatically creates security groups with a naming convention “launch-wizard-*” which in some cases (a lot at times) allow permissive inbound network access. Hence either review these once created or ensure your launches and CloudFormation templates use existing security groups
  3. Limit the number of security groups that are created and reference them from one another – a security group rule can name another security group as its source or destination instead of a CIDR range, hence use this where possible, which reduces security group duplication and eases manageability as well

Tip: in case your security groups are high in number, you could consider using AWS Firewall Manager to simplify their management.
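
Reviews like these are easier to sustain when scripted. Here is a minimal sketch of such an audit, assuming security group dictionaries shaped like the EC2 `DescribeSecurityGroups` response (`GroupName`, `IpPermissions`, `IpRanges`, `Ipv6Ranges`). The list of sensitive ports and the sample groups are our own assumptions for illustration.

```python
SENSITIVE_PORTS = {22, 3389, 3306}  # assumption: SSH, RDP and MySQL
OPEN_CIDRS = {"0.0.0.0/0", "::/0"}


def audit_security_group(group):
    """Return human-readable findings for one security group."""
    findings = []
    name = group.get("GroupName", "")
    if name == "default" or name.startswith("launch-wizard-"):
        findings.append(f"{name}: review auto-created group")
    for permission in group.get("IpPermissions", []):
        cidrs = [r.get("CidrIp") for r in permission.get("IpRanges", [])]
        cidrs += [r.get("CidrIpv6") for r in permission.get("Ipv6Ranges", [])]
        if not OPEN_CIDRS.intersection(cidrs):
            continue  # rule is not open to the whole internet
        if permission.get("IpProtocol") == "-1":
            findings.append(f"{name}: all traffic open to the internet")
            continue
        low, high = permission.get("FromPort"), permission.get("ToPort")
        if low is None or high is None:
            continue
        exposed = sorted(p for p in SENSITIVE_PORTS if low <= p <= high)
        if exposed:
            findings.append(f"{name}: ports {exposed} open to the internet")
    return findings


# Hypothetical sample data standing in for a real API response.
sample_groups = [
    {
        "GroupName": "launch-wizard-3",
        "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 22, "ToPort": 22,
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        ],
    },
    {
        "GroupName": "web-alb",
        "IpPermissions": [
            {"IpProtocol": "tcp", "FromPort": 443, "ToPort": 443,
             "IpRanges": [{"CidrIp": "0.0.0.0/0"}]},
        ],
    },
]

for group in sample_groups:
    for finding in audit_security_group(group):
        print(finding)
```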

Network Segmentation

For large organisations, assuming you’ve got perimeter controls in order, you should probably spend some time going through your network segmentation strategy which would include:

  1. Creating high and low security zones – a classic LAN zoning exercise is equally important while designing a network in the public cloud. Ensure you create zones such as DMZ, production, data, support services, monitoring etc. to separate the sensitive services from generic ones. In case you’re dealing with super sensitive data such as credit card information, create a separate zone for that too
  2. Using small subnets for horizontal network segregation – deploy your production applications in separate subnets based on business unit, usage or any other pre-defined logic
  3. Classification of ports and services – most important! Classify ports into the same two categories of low and high security, for instance 80 and 443 (public web traffic) vs 3389, 22, 3306 (RDP, SSH, MySQL) etc.

Once you’ve got this right, carefully create security groups and NACLs where applicable to help you control access. Emerging technologies such as micro-segmentation also offer solutions to manage segmentation at a level described above. For further reading on the subject, do check out this blog on network segmentation.
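
A zoning plan can be captured as data before it is translated into security groups and NACLs. The sketch below uses the port examples from the list above; the zone names and the permitted flows are hypothetical and only illustrate the idea of checking a policy for restricted ports exposed to the internet.

```python
PUBLIC_PORTS = {80, 443}
RESTRICTED_PORTS = {22, 3389, 3306}

# Hypothetical policy: (source zone, destination zone) -> permitted ports.
ALLOWED_FLOWS = {
    ("internet", "dmz"): {80, 443},
    ("dmz", "production"): {443},
    ("production", "data"): {3306},
    ("support", "production"): {22},
}


def flow_allowed(source_zone, destination_zone, port):
    """True when the policy explicitly permits this zone-to-zone flow."""
    return port in ALLOWED_FLOWS.get((source_zone, destination_zone), set())


def policy_violations(flows):
    """Return flows that expose restricted ports to the internet zone."""
    return [
        (pair, sorted(ports & RESTRICTED_PORTS))
        for pair, ports in flows.items()
        if pair[0] == "internet" and ports & RESTRICTED_PORTS
    ]


print(flow_allowed("internet", "dmz", 443))         # permitted by the policy
print(flow_allowed("internet", "production", 443))  # no such flow is defined
print(policy_violations(ALLOWED_FLOWS))             # empty for this policy
```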

Outbound Controls

This is an often ignored area but an important one for large scale organisations. Once an adversary has found its way through the perimeter or via lateral movement, outbound security controls go a long way to ensure post compromise activity such as command and control (C2) communication or exfiltration is kept in check (or even detected).

  1. IPS and Layer 7 inspection – IPS or layer 7 inspection devices can be used to monitor outbound communication and, in case implemented in deny mode, block malicious connections as well. In addition to preventive security, these devices provide meaningful metadata about outbound traffic, such as source / destination hostnames and requests, which can provide insights into abnormal or suspicious activity emerging from within a VPC. A classic example is crypto mining detection! Such devices act as the next hop for a subnet’s traffic before it reaches the internet.
  2. Centralised vs de-centralised inspection – though outbound traffic inspection is important, one must exercise caution while designing a fault tolerant and resilient implementation. While using a centralised device such as the one described above in #1, ensure a horizontally scalable implementation leveraging a combination of an NLB and an EC2 based IPS or NGFW. An emerging alternative to this approach is container or host based traffic inspection. Solutions available today can deploy light-weight agents on containers or EC2 instances directly, eliminating the need for a central device. While using this approach, be sure your CI/CD pipeline enforces the agent deployment or else you’ll eventually end up with patchy coverage of outbound inspection

Caveat – given the nature of public cloud, not all AWS resources run inside a VPC by default. For example, by default, an API Gateway endpoint or a Lambda function is created as a native AWS service and not launched within a VPC, which means that any outbound network calls generated from these services will not be visible to your outbound security controls. Consider the “Launch in VPC” settings for such services depending on the criticality of applications.
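
To spot such blind spots, one could list functions that have no VPC attachment. The following is a minimal sketch assuming function dictionaries shaped like the Lambda `ListFunctions` response, where an attached function carries a `VpcConfig` with a non-empty `VpcId`. The function names and identifiers are hypothetical.

```python
def functions_outside_vpc(functions):
    """Return names of functions with no VPC attachment."""
    return [
        fn["FunctionName"]
        for fn in functions
        # VpcConfig may be missing, None, or present with an empty VpcId.
        if not (fn.get("VpcConfig") or {}).get("VpcId")
    ]


# Hypothetical sample data standing in for a real API response.
sample_functions = [
    {"FunctionName": "billing-export"},
    {
        "FunctionName": "orders-api",
        "VpcConfig": {
            "VpcId": "vpc-example",
            "SubnetIds": ["subnet-example"],
            "SecurityGroupIds": ["sg-example"],
        },
    },
    {
        "FunctionName": "thumbnailer",
        "VpcConfig": {"VpcId": "", "SubnetIds": [], "SecurityGroupIds": []},
    },
]

print(functions_outside_vpc(sample_functions))
```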

Monitoring & Logging

When you’re hosted in public cloud infrastructure, everything comes at a cost, so you’ll often be faced with discussions of what to log and what not to. While this subject will be discussed at length in another section dedicated to logging best practices, here are some key networking components which should be considered while firming up your monitoring and logging strategy:

  1. Public S3 buckets – access logs help monitor unusual activity on your public S3 buckets, and are also a compliance requirement in some standards
  2. Public ALBs – monitor public HTTP traffic for attacks or unusual activity
  3. VPC Flow Logs – monitor north-south and east-west communication
  4. DNS Logs – monitor DNS queries made from your VPC resources
  5. Deep packet inspection – use tools such as Zeek, Suricata or proprietary IPS / IDS / DPI solutions to inspect and log communication events and metadata
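
As an illustration of what VPC Flow Logs offer, here is a small sketch that parses a record in the default (version 2) space-separated format and labels it as north-south or east-west. The sample record, account ID, interface ID and VPC CIDR are hypothetical, and the classification rule (both endpoints inside the VPC range means east-west) is a simplifying assumption.

```python
import ipaddress

# Field order of the default (version 2) flow log format.
FIELDS = [
    "version", "account_id", "interface_id", "srcaddr", "dstaddr",
    "srcport", "dstport", "protocol", "packets", "bytes",
    "start", "end", "action", "log_status",
]


def parse_flow_record(line):
    """Split one default-format flow log line into a dict of strings."""
    values = line.split()
    if len(values) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(values)}")
    return dict(zip(FIELDS, values))


def direction(record, vpc_cidr):
    """Label a record east-west, north-south, or unknown (no address data)."""
    if "-" in (record["srcaddr"], record["dstaddr"]):
        return "unknown"  # e.g. NODATA or SKIPDATA records
    vpc = ipaddress.ip_network(vpc_cidr)
    inside = [
        ipaddress.ip_address(record[key]) in vpc
        for key in ("srcaddr", "dstaddr")
    ]
    return "east-west" if all(inside) else "north-south"


# Hypothetical record: an instance talking to a database in the same VPC.
sample = (
    "2 123456789012 eni-0abc1234 10.0.1.5 10.0.2.9 "
    "49152 3306 6 10 840 1700000000 1700000060 ACCEPT OK"
)
record = parse_flow_record(sample)
print(record["dstport"], record["action"], direction(record, "10.0.0.0/16"))
```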

Once you’ve enabled logging for the above components, it’s crucial to enable SOC teams or security analysts to have ready access to them. In order to enable the same at scale, one could:

  1. Forward events to a SIEM solution
  2. Create Athena views with S3 as a data lake

Also, Cy5’s flagship product ion helps organisations gain visibility into their public cloud network controls and logging infrastructure via automated checks carried out by its Cloud Security Posture Management module. We hope this post on AWS networking best practices was useful.