Clone the Remote Repository: Clone the remote GitHub repository.
git clone <repository_URL>
Replace <repository_URL> with the URL of the GitHub repository. (Note that the -b flag of git clone only checks out a branch that already exists on the remote; it does not create a new one.)
Navigate to the Cloned Repository and Create a New Branch: Change your current directory into the cloned repository, then create and switch to a new branch.
cd <repository_name>
git checkout -b <branch_name>
Replace <repository_name> with the name of the repository you cloned and <branch_name> with the name you want for your new branch.
Define GitHub Credentials: Set up your Git identity for the repository:
git config user.email "your_email@example.com"
git config user.name "Your Name"
Replace "your_email@example.com" with the email address associated with your GitHub account and "Your Name" with the name you want to appear on your commits.
Make Changes, Add, and Commit: Make changes to the files in the repository, then add and commit those changes.
git add .
git commit -m "Your commit message here"
Replace "Your commit message here" with a brief description of the changes you made.
Push Changes to GitHub: Push your changes to GitHub, specifying the new branch name.
git push origin <new_branch_name>
Replace <new_branch_name> with the name of the new branch you created.
Enter GitHub Credentials (if prompted): If this is your first time pushing to the repository, or if you’re pushing to a private repository, GitHub may prompt you for your username and a personal access token (GitHub no longer accepts account passwords for Git operations over HTTPS).
After completing these steps, your changes should be pushed to the new branch on the GitHub repository successfully. You can verify this by visiting the GitHub repository in your web browser and checking if the changes are reflected there.
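Putting the steps together, here is a minimal end-to-end sketch; the repository URL, branch name, identity values, and commit message are placeholders:

# Clone the repository and move into it
git clone https://github.com/example-org/example-repo.git
cd example-repo

# Create and switch to a new branch
git checkout -b my-new-branch

# Set your identity for this repository
git config user.email "your_email@example.com"
git config user.name "Your Name"

# Stage, commit, and push the new branch
git add .
git commit -m "Describe your change here"
git push origin my-new-branch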
Availability and reachability are improved by adding one more server. However, the entire system can again become unavailable if there is a capacity issue. Let’s look at that load issue with both types of systems we discussed, active-passive and active-active.
Vertical Scaling
If too many requests are sent to a single active-passive system, the active server becomes unavailable and, hopefully, fails over to the passive server. But this doesn’t solve anything. With active-passive, you need vertical scaling. This means increasing the size of the server. With EC2 instances, you select either a larger size within the same family or a different instance type. This can only be done while the instance is in a stopped state. In this scenario, the following steps occur (a rough AWS CLI sketch follows the list):
Stop the passive instance. This doesn’t impact the application since it’s not taking any traffic.
Change the instance size or type, then start the instance again.
Shift the traffic to the passive instance, turning it active.
The last step is to stop, resize, and start the previously active instance, since both instances should match.
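As a rough sketch, the resize steps for one instance could look like the following with the AWS CLI; the instance ID and target instance type are placeholders, and shifting traffic between instances is handled by whatever failover mechanism your active-passive setup uses:

# Stop the passive instance (it is not serving traffic)
aws ec2 stop-instances --instance-ids i-0123456789abcdef0
aws ec2 wait instance-stopped --instance-ids i-0123456789abcdef0

# Change the instance type while the instance is stopped
aws ec2 modify-instance-attribute --instance-id i-0123456789abcdef0 --instance-type "{\"Value\": \"m5.xlarge\"}"

# Start the instance again
aws ec2 start-instances --instance-ids i-0123456789abcdef0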
When the number of requests drops, the same operation has to be performed in reverse. Even though there aren’t that many steps involved, it’s still a lot of manual work. Another disadvantage is that a server can only scale vertically up to a certain limit.
Once that limit is reached, the only option is to create another active-passive system and split the requests and functionalities across them. This could require massive application rewriting. This is where the active-active system can help. When there are too many requests, this system can be scaled horizontally by adding more servers.
Horizontal Scaling
As mentioned above, for the application to work in an active-active system, it’s already created as stateless, not storing any client session on the server. This means that having two servers or having four wouldn’t require any application changes. It would only be a matter of creating more instances when required and shutting them down when the traffic decreases.
The Amazon EC2 Auto Scaling service can take care of that task by automatically creating and removing EC2 instances based on metrics from Amazon CloudWatch.
You can see that there are many more advantages to using an active-active system in comparison with an active-passive one. Modifying your application to become stateless enables scalability.
Integrate ELB with EC2 Auto Scaling
The ELB service integrates seamlessly with EC2 Auto Scaling. As soon as a new EC2 instance is added to or removed from the EC2 Auto Scaling group, ELB is notified. However, before it can send traffic to a new EC2 instance, it needs to validate that the application running on that EC2 instance is available.
This validation is done via the health checks feature of ELB. Monitoring is an important part of load balancing, as the load balancer should route traffic only to healthy EC2 instances. That’s why ELB supports two types of health checks:
Establishing a connection to a backend EC2 instance using TCP, and marking the instance as available if that connection is successful.
Making an HTTP or HTTPS request to a webpage that you specify, and validating that an HTTP response code is returned.
Differentiate Between Traditional Scaling and Auto Scaling
With a traditional approach to scaling, you buy and provision enough servers to handle traffic at its peak. However, this means that at night there is more capacity than traffic, which also means you’re wasting money. Turning off those servers at night, or at times when the traffic is lower, only saves on electricity.
The cloud works differently, with a pay-as-you-go model. It’s important to turn off the unused services, especially EC2 instances that you pay for On-Demand. One could manually add and remove servers at a predicted time. But with unusual spikes in traffic, this solution leads to a waste of resources with over-provisioning or with a loss of customers due to under-provisioning.
The need here is for a tool that automatically adds and removes EC2 instances according to conditions you define—that’s exactly what the EC2 Auto Scaling service does.
Use Amazon EC2 Auto Scaling
The EC2 Auto Scaling service works to add or remove capacity to keep a steady and predictable performance at the lowest possible cost. By adjusting the capacity to exactly what your application uses, you only pay for what your application needs. And even with applications that have steady usage, EC2 Auto Scaling can help with fleet management. If there is an issue with an EC2 instance, EC2 Auto Scaling can automatically replace that instance. This means that EC2 Auto Scaling helps both to scale your infrastructure and ensure high availability.
Configure EC2 Auto Scaling Components
There are three main components to EC2 Auto Scaling.
Launch template or configuration: What resource should be automatically scaled?
EC2 Auto Scaling Group: Where should the resources be deployed?
Scaling policies: When should the resources be added or removed?
Learn About Launch Templates
There are multiple parameters required to create EC2 instances: Amazon Machine Image (AMI) ID, instance type, security group, additional Amazon Elastic Block Store (EBS) volumes, and more. All this information is also required by EC2 Auto Scaling to create the EC2 instance on your behalf when there is a need to scale. This information is stored in a launch template.
You can use a launch template to manually launch an EC2 instance. You can also use it with EC2 Auto Scaling. It also supports versioning, which lets you quickly roll back if there’s an issue or specify a default version of your launch template. This way, while you iterate on a new version, other users can continue launching EC2 instances using the default version until you make the necessary changes.
You can create a launch template in one of three ways.
The fastest way to create a template is to use an existing EC2 instance. All the settings are already defined.
Another option is to create one from an already existing template or a previous version of a launch template.
The last option is to create a template from scratch. The following options will need to be defined: AMI ID, instance type, key pair, security group, storage, and resource tags (a minimal CLI sketch follows below).
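For example, a bare-bones launch template created from scratch with the AWS CLI could look like this; the template name, AMI ID, key pair name, and security group ID are placeholders:

aws ec2 create-launch-template \
    --launch-template-name employee-directory-template \
    --version-description "initial version" \
    --launch-template-data '{
        "ImageId": "ami-0123456789abcdef0",
        "InstanceType": "t3.micro",
        "KeyName": "my-key-pair",
        "SecurityGroupIds": ["sg-0123456789abcdef0"],
        "TagSpecifications": [{"ResourceType": "instance", "Tags": [{"Key": "project", "Value": "employee-directory"}]}]
    }'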
Note: Another way to define what Amazon EC2 Auto Scaling needs to scale is by using a launch configuration. It’s similar to a launch template, but it doesn’t allow for versioning, for using a previously created launch configuration as a template, or for creating one from an already existing EC2 instance. For these reasons, and to ensure that you’re getting the latest features from Amazon EC2, use a launch template instead of a launch configuration.
Get to Know EC2 Auto Scaling Groups
The next component that EC2 Auto Scaling needs is an EC2 Auto Scaling Group (ASG). An ASG enables you to define where EC2 Auto Scaling deploys your resources. This is where you specify the Amazon Virtual Private Cloud (VPC) and subnets the EC2 instance should be launched in.
EC2 Auto Scaling takes care of creating the EC2 instances across the subnets, so it’s important to select at least two subnets that are across different Availability Zones.
ASGs also allow you to specify the type of purchase for the EC2 instances. You can use On-Demand only, Spot only, or a combination of the two, which allows you to take advantage of Spot instances with minimal administrative overhead. To specify how many instances EC2 Auto Scaling should launch, there are three capacity settings to configure for the group size.
Minimum: The minimum number of instances running in your ASG, even if the threshold for removing instances is reached.
Maximum: The maximum number of instances running in your ASG, even if the threshold for adding new instances is reached.
Desired capacity: The number of instances that should be in your ASG. This number must be between the minimum and maximum, inclusive. EC2 Auto Scaling automatically adds or removes instances to match the desired capacity (see the sketch after this list).
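A minimal sketch of creating such a group with the AWS CLI, assuming the launch template from the previous section and two subnets in different Availability Zones; the group name, sizes, and subnet IDs are placeholders:

aws autoscaling create-auto-scaling-group \
    --auto-scaling-group-name employee-directory-asg \
    --launch-template LaunchTemplateName=employee-directory-template,Version='$Latest' \
    --min-size 2 \
    --max-size 6 \
    --desired-capacity 2 \
    --vpc-zone-identifier "subnet-0123456789abcdef0,subnet-0fedcba9876543210"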
When EC2 Auto Scaling removes EC2 instances because traffic is minimal, it keeps removing them until it reaches the minimum capacity. Depending on your application, a minimum of two is a good idea to ensure high availability, but only you know how many EC2 instances your application requires at a bare minimum at all times. Once that limit is reached, EC2 Auto Scaling does not remove an instance, even when instructed to, so the minimum is always kept.
On the other hand, when the traffic keeps growing, EC2 Auto Scaling keeps adding EC2 instances. This means the cost for your application will also keep growing. That’s why it’s important to set a maximum amount to make sure it doesn’t go above your budget.
The desired capacity is the amount of EC2 instances that EC2 Auto Scaling creates at the time the group is created. If that number decreases, then EC2 Auto Scaling removes the oldest instance by default. If that number increases, then EC2 Auto Scaling creates new instances using the launch template.
Ensure Availability with EC2 Auto Scaling
Setting different values for minimum, maximum, and desired capacity is what lets EC2 Auto Scaling adjust the capacity dynamically. However, if you prefer to use EC2 Auto Scaling for fleet management, you can configure all three settings to the same number, for example four. EC2 Auto Scaling will ensure that if an EC2 instance becomes unhealthy, it replaces it so that four EC2 instances are always available. This ensures high availability for your applications.
Enable Automation with Scaling Policies
By default, an ASG will be kept to its initial desired capacity. Although it’s possible to manually change the desired capacity, you can also use scaling policies.
In the AWS Monitoring module, you learned about Amazon CloudWatch metrics and alarms. You use metrics to track information about different attributes of your EC2 instance, like CPU utilization. You use alarms to specify an action when a threshold is reached. Metrics and alarms are what scaling policies use to know when to act. For example, you can set up an alarm that says: when CPU utilization is above 70% across the entire fleet of EC2 instances, trigger a scaling policy to add an EC2 instance.
There are three types of scaling policies: simple, step, and target tracking scaling.
Simple Scaling Policy
A simple scaling policy allows you to do exactly what’s described above. You use a CloudWatch alarm and specify what to do when it is triggered. This can be a number of EC2 instances to add or remove, or a specific number to set the desired capacity to. You can also specify a percentage of the group instead of a fixed number of EC2 instances, which makes the group grow or shrink more quickly.
Once this scaling policy is triggered, it waits for a cooldown period before taking any further action. This is important because it takes time for an EC2 instance to start, and the CloudWatch alarm may still be in the alarm state while the new instance is booting. For example, you could decide to add an EC2 instance if the CPU utilization across all instances is above 65%. You don’t want to add more instances until that new EC2 instance is accepting traffic.
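A rough sketch of such a simple scaling policy with the AWS CLI, assuming the group name used earlier; the thresholds, cooldown, and policy ARN are illustrative placeholders:

# Add one instance when the alarm fires, then wait 300 seconds before acting again
aws autoscaling put-scaling-policy \
    --auto-scaling-group-name employee-directory-asg \
    --policy-name add-one-instance \
    --policy-type SimpleScaling \
    --adjustment-type ChangeInCapacity \
    --scaling-adjustment 1 \
    --cooldown 300

# Alarm on average CPU above 65% across the group; attach the policy ARN returned above
aws cloudwatch put-metric-alarm \
    --alarm-name asg-cpu-above-65 \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions Name=AutoScalingGroupName,Value=employee-directory-asg \
    --statistic Average \
    --period 300 \
    --evaluation-periods 2 \
    --threshold 65 \
    --comparison-operator GreaterThanThreshold \
    --alarm-actions <scaling-policy-arn>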
However, what if the CPU utilization was now above 85% across the ASG? Only adding one instance may not be the right move here. Instead, you may want to add another step in your scaling policy. Unfortunately, a simple scaling policy can’t help with that.
Step Scaling Policy
This is where a step scaling policy helps. Step scaling policies respond to additional alarms even while a scaling activity or health check replacement is in progress. Similar to the example above, you decide to add two more instances in case the CPU utilization is at 85%, and four more instances when it’s at 95%.
Deciding when to add and remove instances based on CloudWatch alarms may seem like a difficult task. This is why the third type of scaling policy exists: target tracking.
Target Tracking Scaling Policy
If your application scales based on average CPU utilization, average network utilization (in or out), or based on request count, then this scaling policy type is the one to use. All you need to provide is the target value to track and it automatically creates the required CloudWatch alarms.
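For example, tracking an average CPU utilization of 50% across the group could be configured with a single call; the group name and target value are illustrative:

aws autoscaling put-scaling-policy \
    --auto-scaling-group-name employee-directory-asg \
    --policy-name cpu-target-tracking \
    --policy-type TargetTrackingScaling \
    --target-tracking-configuration '{"PredefinedMetricSpecification": {"PredefinedMetricType": "ASGAverageCPUUtilization"}, "TargetValue": 50.0}'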
Load balancing refers to the process of distributing tasks across a set of resources. In the case of the corporate directory application, the resources are EC2 instances that host the application, and the tasks are the different requests being sent. It’s time to distribute the requests across all the servers hosting the application using a load balancer.
To do this, you first need to enable the load balancer to take all of the traffic and redirect it to the backend servers based on an algorithm. The most popular algorithm is round-robin, which sends the traffic to each server one after the other.
A typical request for the application would start from the browser of the client. It’s sent to a load balancer. Then, it’s sent to one of the EC2 instances that hosts the application. The return traffic would go back through the load balancer and back to the client browser. Thus, the load balancer is directly in the path of the traffic.
Although it is possible to install your own software load balancing solution on EC2 instances, AWS provides a service for that called Elastic Load Balancing (ELB).
FEATURES OF ELB
The ELB service provides a major advantage over using your own solution to do load balancing, in that you don’t need to manage or operate it. It can distribute incoming application traffic across EC2 instances as well as containers, IP addresses, and AWS Lambda functions.
The fact that ELB can load balance to IP addresses means that it can work in a hybrid mode as well, where it also load balances to on-premises servers.
ELB is highly available. The only thing you need to ensure is that the load balancer is deployed across multiple Availability Zones.
In terms of scalability, ELB automatically scales to meet the demand of the incoming traffic. It handles the incoming traffic and sends it to your backend application.
HEALTH CHECKS
Taking the time to define an appropriate health check is critical. Verifying that an application’s port is open doesn’t mean the application is working, and simply making a call to the home page of the application isn’t necessarily the right approach either.
For example, the employee directory application depends on a database and on S3. The health check should validate all of those elements. One way to do that is to create a monitoring webpage, such as “/monitor”, that makes a call to the database to ensure it can connect and get data, and makes a call to S3. Then, you point the health check on the load balancer to the “/monitor” page.
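Assuming an existing target group, pointing its health check at that page could look like this with the AWS CLI; the target group ARN, interval, and thresholds are placeholders:

aws elbv2 modify-target-group \
    --target-group-arn <target-group-arn> \
    --health-check-protocol HTTP \
    --health-check-path /monitor \
    --health-check-interval-seconds 30 \
    --healthy-threshold-count 2 \
    --unhealthy-threshold-count 2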
After determining the availability of a new EC2 instance, the load balancer starts sending traffic to it. If ELB determines that an EC2 instance is no longer working, it stops sending traffic to it and lets EC2 Auto Scaling know. EC2 Auto Scaling’s responsibility is to remove it from the group and replace it with a new EC2 instance. Traffic is only sent to the new instance once it passes the health check.
In the case of a scale down action that EC2 Auto Scaling needs to take due to a scaling policy, it lets ELB know that EC2 instances will be terminated. ELB can prevent EC2 Auto Scaling from terminating the EC2 instance until all connections to that instance end, while preventing any new connections. That feature is called connection draining.
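Connection draining is configured as a target group attribute (the deregistration delay); a minimal sketch with a 120-second delay, where the ARN and value are placeholders:

aws elbv2 modify-target-group-attributes \
    --target-group-arn <target-group-arn> \
    --attributes Key=deregistration_delay.timeout_seconds,Value=120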
ELB COMPONENTS
The ELB service is made up of three main components.
Listeners: The client connects to the listener. This is often referred to as client-side. To define a listener, a port must be provided as well as the protocol, depending on the load balancer type. There can be many listeners for a single load balancer.
Target groups: The backend servers, or server side, are defined in one or more target groups. This is where you define the type of backend you want to direct traffic to, such as EC2 instances, AWS Lambda functions, or IP addresses. A health check also needs to be defined for each target group.
Rules: To associate a target group with a listener, a rule must be used. Rules are made up of conditions, such as the source IP address of the client, and an action that decides which target group to send the traffic to (a sketch of a listener and a rule follows this list).
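As a rough sketch with the AWS CLI, a listener and a path-based rule for an Application Load Balancer could be wired up like this; the load balancer, listener, and target group ARNs, as well as the priority, are placeholders:

# Listener: accept HTTP on port 80 and forward to a default target group
aws elbv2 create-listener \
    --load-balancer-arn <load-balancer-arn> \
    --protocol HTTP \
    --port 80 \
    --default-actions Type=forward,TargetGroupArn=<default-target-group-arn>

# Rule: send requests whose path starts with /upload to a different target group
aws elbv2 create-rule \
    --listener-arn <listener-arn> \
    --priority 10 \
    --conditions Field=path-pattern,Values='/upload/*' \
    --actions Type=forward,TargetGroupArn=<upload-target-group-arn>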
APPLICATION LOAD BALANCER
Here are some primary features of Application Load Balancer (ALB).
ALB routes traffic based on request data. It makes routing decisions based on HTTP request data such as the URL path (/upload), the host, HTTP headers and method, as well as the source IP address of the client. This enables granular routing to the target groups.
Send responses directly to the client. ALB has the ability to reply directly to the client with a fixed response like a custom HTML page. It also has the ability to send a redirect to the client which is useful when you need to redirect to a specific website or to redirect the request from HTTP to HTTPS, removing that work from your backend servers.
ALB supports TLS offloading. Speaking of HTTPS and saving work from backend servers, ALB understands HTTPS traffic. To be able to pass HTTPS traffic through ALB, an SSL certificate is provided by either importing a certificate via Identity and Access Management (IAM) or AWS Certificate Manager (ACM) services, or by creating one for free using ACM. This ensures the traffic between the client and ALB is encrypted.
Authenticate users. On the topic of security, ALB has the ability to authenticate the users before they are allowed to pass through the load balancer. ALB uses the OpenID Connect protocol and integrates with other AWS services to support more protocols like SAML, LDAP, Microsoft AD, and more.
Secure traffic. To restrict which traffic can reach the load balancer, you configure a security group that specifies the supported IP address ranges.
ALB supports the round-robin routing algorithm. With round robin, each server receives roughly the same number of requests overall. This type of routing works for most applications.
ALB also supports the least outstanding request routing algorithm. If requests to the backend vary in complexity, where one request may need a lot more CPU time than another, then the least outstanding request algorithm is more appropriate. It’s also the right routing algorithm to use if the targets vary in processing capability. An outstanding request is a request that has been sent to the backend server but for which a response hasn’t been received yet.
For example, if the EC2 instances in a target group aren’t the same size, one server’s CPU utilization will be higher than the other if the same number of requests are sent to each server using the round-robin routing algorithm. That same server will have more outstanding requests as well. Using the least outstanding request routing algorithm would ensure an equal usage across targets.
ALB has sticky sessions. If requests need to be sent to the same backend server because the application is stateful, use the sticky sessions feature. This feature uses an HTTP cookie to remember, across connections, which server to send the traffic to.
Finally, ALB is specifically for HTTP and HTTPS traffic. If your application uses a different protocol, consider the Network Load Balancer (NLB).
NETWORK LOAD BALANCER
Here are some primary features of Network Load Balancer (NLB).
NLB supports the TCP, UDP, and TLS protocols. HTTPS uses TCP and TLS as its underlying protocols. However, NLB operates at the connection layer, so it doesn’t understand what an HTTPS request is. That means all the features discussed above that require understanding the HTTP and HTTPS protocols, like routing rules based on those protocols, authentication, and the least outstanding request routing algorithm, are not available with NLB.
NLB uses a flow hash routing algorithm. The algorithm is based on:
The protocol
The source IP address and source port
The destination IP address and destination port
The TCP sequence number
If all of these parameters are the same, then the packets are sent to the exact same target. If any of them are different in the next packets, then the request may be sent to a different target.
NLB has sticky sessions. Different from ALB, these sessions are based on the source IP address of the client instead of a cookie.
NLB supports TLS offloading. NLB understands the TLS protocol. It can also offload TLS from the backend servers similar to how ALB works.
NLB handles millions of requests per second. While ALB can also support this number of requests, it needs to scale to reach that number. This takes time. NLB can instantly handle this amount of requests.
NLB supports static and Elastic IP addresses. There are some situations where the application client needs to send requests directly to the load balancer IP address instead of using DNS. For example, this is useful if your application can’t use DNS or if the connecting clients require firewall rules based on IP addresses. In this case, NLB is the right type of load balancer to use (a short CLI sketch follows below).
NLB preserves the source IP address. NLB preserves the source IP address of the client when sending the traffic to the backend. With ALB, if you look at the source IP address of the requests, you will find the IP address of the load balancer. With NLB, you would see the real IP address of the client, which the backend application requires in some cases.
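Returning to the static IP point above, here is a sketch of creating an NLB with an Elastic IP attached in each subnet; the load balancer name, subnet IDs, and allocation IDs are placeholders:

aws elbv2 create-load-balancer \
    --name employee-directory-nlb \
    --type network \
    --scheme internet-facing \
    --subnet-mappings SubnetId=subnet-0123456789abcdef0,AllocationId=eipalloc-0123456789abcdef0 SubnetId=subnet-0fedcba9876543210,AllocationId=eipalloc-0fedcba9876543210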
SELECT BETWEEN ELB TYPES
Selecting between the ELB service types is done by determining which features your application requires. Below is a summary of the major features covered in this unit and the previous one, and which load balancer supports each.
Protocols: Application Load Balancer supports HTTP and HTTPS; Network Load Balancer supports TCP, UDP, and TLS.
Connection draining (deregistration delay): Application Load Balancer.
IP addresses as targets: both Application Load Balancer and Network Load Balancer.
Static IP and Elastic IP addresses: Network Load Balancer.
Preserve source IP address: Network Load Balancer.
Routing based on source IP address, path, host, HTTP headers, HTTP method, and query string: Application Load Balancer.
The availability of a system is typically expressed as a percentage of uptime in a given year or as a number of nines. Below, you can see a list of the percentages of availability based on the downtime per year, as well as its notation in nines.
Availability (%) and downtime (per year):
90% (“one nine”): 36.53 days
99% (“two nines”): 3.65 days
99.9% (“three nines”): 8.77 hours
99.95% (“three and a half nines”): 4.38 hours
99.99% (“four nines”): 52.60 minutes
99.995% (“four and a half nines”): 26.30 minutes
99.999% (“five nines”): 5.26 minutes
To increase availability, you need redundancy. This typically means more infrastructure: more data centers, more servers, more databases, and more replication of data. You can imagine that adding more of this infrastructure means a higher cost. Customers want the application to always be available, but you need to draw a line where adding redundancy is no longer viable in terms of revenue.
Improve Application Availability
In the current application, there is only one EC2 instance used to host the application, the photos are served from Amazon Simple Storage Service (S3) and the structured data is stored in Amazon DynamoDB. That single EC2 instance is a single point of failure for the application. Even if the database and S3 are highly available, customers have no way to connect if the single instance becomes unavailable. One way to solve this single point of failure issue is by adding one more server.
Use a Second Availability Zone
The physical location of that server is important. On top of having software issues at the operating system or application level, there can be a hardware issue. It could be in the physical server, the rack, the data center or even the Availability Zone hosting the virtual machine. An easy way to fix the physical location issue is by deploying a second EC2 instance in a different Availability Zone. That would also solve issues with the operating system and the application. However, having more than one instance brings new challenges.
Manage Replication, Redirection, and High Availability
Create a Process for Replication
The first challenge is that you need to create a process to replicate the configuration files, software patches, and the application itself across instances. The best method is to automate where you can.
Address Customer Redirection
The second challenge is how to let the clients, the computers sending requests to your server, know about the different servers. There are different tools that can be used here. The most common is using a Domain Name System (DNS) where the client uses one record that points to the IP addresses of all available servers. However, the time it takes to update that list of IP addresses and for the clients to become aware of such a change, sometimes called propagation, is typically the reason why this method isn’t always used.
Another option is to use a load balancer which takes care of health checks and distributing the load across each server. Being between the client and the server, the load balancer avoids propagation time issues. We discuss load balancers later.
Understand the Types of High Availability
The last challenge to address when having more than one server is the type of availability you need: either an active-passive or an active-active system.
Active-Passive: With an active-passive system, only one of the two instances is available at a time. One advantage of this method is that for stateful applications where data about the client’s session is stored on the server, there won’t be any issues as the customers are always sent to the same server where their session is stored.
Active-Active: A disadvantage of active-passive and where an active-active system shines is scalability. By having both servers available, the second server can take some load for the application, thus allowing the entire system to take more load. However, if the application is stateful, there would be an issue if the customer’s session isn’t available on both servers. Stateless applications work better for active-active systems.
When operating a website like the Employee Directory Application on AWS you may have questions like:
How many people are visiting my site day to day?
How can I track the number of visitors over time?
How will I know if the website is having performance or availability issues?
What happens if my Amazon Elastic Compute Cloud (EC2) instance runs out of capacity?
Will I be alerted if my website goes down?
You need a way to collect and analyze data about the operational health and usage of your resources. The act of collecting, analyzing, and using data to make decisions or answer questions about your IT resources and systems is called monitoring. Monitoring enables you to have a near real-time pulse on your system and answer the questions listed above. You can use the data you collect to watch for operational issues caused by events like over-utilization of resources, application flaws, resource misconfiguration, or security-related events. Think of the data collected through monitoring as outputs of the system, or metrics.
Use Metrics to Solve Problems
The resources that host your solutions on AWS all create various forms of data that you might be interested in collecting. You can think of each individual data point that is created by a resource as a metric. Metrics that are collected and analyzed over time become statistics, such as average CPU utilization over time.
Consider this: one way to evaluate the health of an Amazon EC2 instance is through CPU utilization. Generally speaking, if an EC2 instance has high CPU utilization, it can mean a flood of requests, or it can reflect a process that has encountered an error and is consuming too much of the CPU. When analyzing CPU utilization, look for a process that exceeds a specific threshold for an unusual length of time. Use that abnormal event as a cue to either manually or automatically resolve the issue through actions like scaling the instance. This is one example of a metric. Other examples of metrics for EC2 instances are network utilization, disk performance, memory utilization, and the logs created by the applications running on top of EC2.
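For instance, pulling the average CPU utilization of a single instance over a few hours could look like this with the AWS CLI; the instance ID and time window are placeholders:

aws cloudwatch get-metric-statistics \
    --namespace AWS/EC2 \
    --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
    --statistics Average \
    --period 300 \
    --start-time 2024-05-01T00:00:00Z \
    --end-time 2024-05-01T06:00:00Z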
Know the Different Types of Metrics
Different resources in AWS create different types of metrics. An Amazon Simple Storage Service (S3) bucket would not have CPU utilization like an EC2 instance does. Instead, S3 creates metrics related to the objects stored in a bucket like the overall size, or the number of objects in a bucket. S3 also has metrics related to the requests made to the bucket such as reading or writing objects. Amazon Relational Database Service (RDS) creates metrics such as database connections, CPU utilization of an instance, or disk space consumption. This is not a complete list for any of the services mentioned, but you can see how different resources create different metrics. You could be interested in a wide variety of metrics depending on the types of resources you are using, the goals you have, or the types of questions you want answered.
Understand the Benefits of Monitoring
Monitoring gives you visibility into your resources, but the question now is, “Why is that important?” The following are some of the benefits of monitoring.
Respond to operational issues proactively before your end users are aware of them. It’s a bad practice to wait for end users to let you know your application is experiencing an outage. Through monitoring, you can keep tabs on metrics like error response rate or request latency, over time, that help signal that an outage is going to occur. This enables you to automatically or manually perform actions to prevent the outage from happening—fixing the problem before your end users are aware of it.
Improve the performance and reliability of your resources. Monitoring the different resources that comprise your application provides you with a full picture of how your solution behaves as a system. Monitoring, if done well, can illuminate bottlenecks and inefficient architectures. This enables you to drive performance and reliability improvement processes.
Recognize security threats and events. When you monitor resources, events, and systems over time, you create what is called a baseline. A baseline defines what activity is normal. Using a baseline, you can spot anomalies like unusual traffic spikes or unusual IP addresses accessing your resources. When an anomaly occurs, an alert can be sent out or an action can be taken to investigate the event.
Make data-driven decisions for your business. Monitoring is not only to keep an eye on IT operational health. It also helps drive business decisions. For example, let’s say you launched a new feature for your cat photo app, and want to know whether it’s being used. You can collect application-level metrics and view the number of users who use the new feature. With your findings, you decide whether to invest more time into improving the new feature.
Create more cost-effective solutions. Through monitoring, you can view resources that are being underutilized and rightsize your resources to your usage. This helps you optimize cost and make sure you aren’t spending more money than necessary.
Enable Visibility
AWS resources create data you can monitor through metrics, logs, network traffic, events, and more. This data is coming from components that are distributed in nature, which can lead to difficulty in collecting the data you need if you don’t have a centralized place to review it all. AWS has already done that for you with a service called Amazon CloudWatch.
Amazon CloudWatch is a monitoring and observability service that collects data like those mentioned in this module. CloudWatch provides actionable insights into your applications, and enables you to respond to system-wide performance changes, optimize resource utilization, and get a unified view of operational health. This unified view is important. You can use CloudWatch to:
Detect anomalous behavior in your environments.
Set alarms to alert you when something’s not right (see the example after this list).
Visualize logs and metrics with the AWS Management Console.
Take automated actions like scaling.
Troubleshoot issues.
Discover insights to keep your applications healthy.
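As a minimal example of the alarm capability, the following sketch creates an alarm on the EC2 status check metric and notifies an SNS topic; the instance ID and topic ARN are placeholders:

aws cloudwatch put-metric-alarm \
    --alarm-name instance-status-check-failed \
    --namespace AWS/EC2 \
    --metric-name StatusCheckFailed \
    --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
    --statistic Maximum \
    --period 60 \
    --evaluation-periods 2 \
    --threshold 1 \
    --comparison-operator GreaterThanOrEqualToThreshold \
    --alarm-actions <sns-topic-arn>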
Site Reliability Engineering (SRE) principles, as defined by Google, focus on creating scalable and reliable software systems through a combination of engineering and operations practices. SRE aims to balance the need for rapid innovation with the requirement for reliability, availability, and scalability. Here are some key principles of SRE:
Service Level Objectives (SLOs):
SLOs define the level of reliability or performance that a service should achieve, typically expressed as a percentage of uptime or response time.
SLOs provide a clear target for reliability and help align engineering efforts with business goals.
SRE teams monitor and measure SLOs, using them to make informed decisions about service improvements and investments.
Error Budgets:
Error budgets are a concept closely related to SLOs. They represent the permissible amount of downtime or errors that a service can experience within a given time period.
SRE teams manage error budgets to strike a balance between reliability and innovation. They allow for a certain level of risk-taking and experimentation, as long as it doesn’t exceed the error budget.
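As a quick illustration of the arithmetic, a 99.9% SLO over a 30-day window leaves an error budget of (1 - 0.999) × 30 × 24 × 60 = 43.2 minutes of downtime; once that budget is spent, teams typically slow feature releases in favor of reliability work.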
Automation:
SRE emphasizes automation to reduce manual toil and improve efficiency. Automation helps standardize processes, eliminate human error, and scale operations.
Automation is applied to various areas, including deployment, monitoring, incident response, and capacity management.
Monitoring and Alerting:
Effective monitoring and alerting are crucial for detecting and responding to issues proactively.
SRE teams use monitoring tools to collect and analyze metrics, track the health and performance of systems, and identify potential problems.
Alerting systems notify teams about incidents or deviations from expected behavior, allowing for timely responses.
Incident Management:
SRE follows a structured approach to incident management, aiming to minimize the impact of incidents on service reliability and user experience.
Incident response processes include escalation paths, on-call rotations, incident retrospectives, and postmortems to learn from failures and prevent recurrence.
Capacity Planning:
SRE teams perform capacity planning to ensure that systems have sufficient resources to handle current and future workloads.
Capacity planning involves forecasting demand, monitoring resource utilization, and scaling infrastructure as needed to maintain performance and reliability.
Blameless Culture:
SRE promotes a blameless culture where individuals are encouraged to take risks, learn from failures, and collaborate to improve systems.
Postmortems focus on identifying root causes and systemic issues rather than assigning blame to individuals.
Continuous Improvement:
SRE emphasizes continuous improvement through iterative processes, experimentation, and feedback loops.
Teams regularly review performance, reliability, and user feedback to identify opportunities for optimization and enhancement.
By embracing these principles, SRE teams strive to build and operate resilient and scalable systems that meet user expectations for reliability and performance.
The concept of “DevOps Three Dimensions” refers to three fundamental aspects of DevOps implementation: Culture, Methods, and Tools. These dimensions are often depicted metaphorically as branches of a tree, with each dimension representing a critical component of DevOps adoption and success. Here’s an explanation of each dimension:
Culture:
Collaboration and Communication: DevOps culture emphasizes collaboration and communication among development, operations, and other stakeholders involved in the software delivery process. It promotes breaking down silos, fostering cross-functional teams, and encouraging transparency and trust.
Shared Responsibility: DevOps culture encourages a shift from individual responsibility to shared responsibility across teams. It promotes a culture where everyone takes ownership of the entire software delivery lifecycle, from planning and development to deployment and operations.
Continuous Learning and Improvement: DevOps culture values continuous learning and improvement, encouraging teams to experiment, innovate, and learn from failures. It promotes a growth mindset, where failure is seen as an opportunity for learning and feedback is used to drive continuous improvement.
Methods:
Agile Practices: DevOps often builds upon agile principles and practices, such as iterative development, cross-functional teams, and frequent feedback loops. Agile methodologies, such as Scrum or Kanban, help teams deliver value to customers quickly and adapt to changing requirements.
Continuous Integration and Delivery (CI/CD): CI/CD practices automate the process of integrating code changes, running tests, and deploying applications to production environments. CI/CD enables teams to deliver software updates rapidly, reliably, and with minimal manual intervention.
Lean Principles: DevOps incorporates lean principles, such as reducing waste, optimizing workflows, and maximizing efficiency. Lean methodologies help teams streamline processes, eliminate bottlenecks, and deliver value to customers more efficiently.
Tools:
Automation Tools: DevOps relies on a wide range of automation tools to streamline development, testing, deployment, and operations processes. These tools automate repetitive tasks, improve efficiency, and reduce the risk of human error. Examples include Jenkins for CI/CD, Terraform for infrastructure as code, and Ansible for configuration management.
Monitoring and Logging Tools: DevOps teams use monitoring and logging tools to gain visibility into system performance, detect issues in real-time, and troubleshoot problems quickly. These tools provide insights into application and infrastructure health, enabling teams to ensure reliability and availability.
Collaboration Tools: DevOps emphasizes collaboration and communication, so teams use collaboration tools to facilitate communication, document processes, and share knowledge. These tools include chat platforms like Slack, issue trackers like Jira, and version control systems like Git.
By focusing on these three dimensions—Culture, Methods, and Tools—organizations can effectively implement DevOps practices and principles, improve collaboration and efficiency, and deliver value to customers more rapidly and reliably. Each dimension plays a critical role in shaping the culture, practices, and tools used in DevOps adoption, ultimately driving better business outcomes and competitive advantage.
The concept of agility within the context of DevOps, Microservices, and Containers can be represented through a set of pillars or principles that guide the implementation of agile practices. Here’s an explanation of three agility pillars within each of these domains:
DevOps:
Automation: Automation is a fundamental pillar of DevOps agility, emphasizing the use of automation tools and practices to streamline processes, eliminate manual tasks, and accelerate delivery. Automation enables teams to achieve faster deployment cycles, improve consistency, and reduce errors, leading to increased efficiency and productivity.
Collaboration: Collaboration is another essential pillar of DevOps agility, focusing on breaking down silos between development, operations, and other relevant teams to foster teamwork, communication, and shared ownership. Collaboration enables cross-functional teams to work together seamlessly, share knowledge and expertise, and collaborate on delivering value to customers more effectively.
Continuous Improvement: Continuous Improvement is a core pillar of DevOps agility, emphasizing the importance of establishing feedback loops, measuring performance, identifying areas for improvement, and implementing changes incrementally over time. Continuous improvement enables teams to adapt to changing requirements, address issues proactively, and drive innovation to continuously enhance their capabilities and outcomes.
Microservices:
Modularity: Modularity is a foundational pillar of Microservices agility, focusing on breaking down monolithic applications into smaller, independent services that are loosely coupled and independently deployable. Modularity enables teams to develop, deploy, and scale services more rapidly and efficiently, reduce dependencies, and enhance flexibility and agility in responding to changing business needs.
Autonomy: Autonomy is another key pillar of Microservices agility, emphasizing the empowerment of teams to make decisions and take ownership of their services. Autonomy enables teams to innovate, iterate, and evolve services independently, without being constrained by centralized control, leading to faster delivery cycles, improved responsiveness, and greater adaptability to change.
Resilience: Resilience is an essential pillar of Microservices agility, focusing on designing services to be resilient to failures, with redundancy, fault tolerance, and automated recovery mechanisms in place. Resilience enables services to withstand disruptions, recover quickly from failures, and maintain high availability and reliability, ensuring uninterrupted service delivery and a positive user experience.
Containers:
Portability: Portability is a core pillar of Containers agility, emphasizing the ability to package applications and their dependencies into lightweight, portable containers that can run consistently across different environments. Portability enables teams to deploy applications seamlessly across development, testing, and production environments, reduce vendor lock-in, and improve agility in deploying and scaling applications.
Scalability: Scalability is another key pillar of Containers agility, focusing on the ability to scale applications horizontally and vertically to meet changing demands. Containers enable teams to scale applications more efficiently, dynamically allocate resources, and respond quickly to fluctuations in workload, ensuring optimal performance and resource utilization without overprovisioning or underutilization.
Isolation: Isolation is an essential pillar of Containers agility, focusing on providing secure, isolated environments for running applications without interference from other processes or dependencies. Isolation enables teams to ensure that applications remain stable and secure, minimize the impact of failures, and protect sensitive data, ensuring a high level of reliability and security in containerized environments.
These agility pillars within DevOps, Microservices, and Containers provide a framework for fostering agility and innovation, enabling teams to deliver value to customers more quickly, reliably, and efficiently. By focusing on these pillars, organizations can enhance their capabilities, improve their competitiveness, and drive business success in today’s fast-paced and dynamic digital landscape.
MTTR stands for Mean Time To Recovery. It is a key performance indicator (KPI) used to measure the average time it takes to restore a service or system to normal operation after a failure or incident occurs. MTTR is an important metric in incident management and is used to assess the efficiency of an organization’s response and resolution processes.
The formula to calculate MTTR is:
MTTR = Total Downtime / Number of Incidents
Where:
Total Downtime: The cumulative duration of time during which a service or system was unavailable or degraded due to incidents.
Number of Incidents: The total number of incidents that occurred during a specific period.
For example, if a service experiences three incidents in a month, with respective downtime durations of 2 hours, 3 hours, and 4 hours, the total downtime would be 2 + 3 + 4 = 9 hours. If we divide this total downtime by the number of incidents (3), we would get an MTTR of 3 hours.
A lower MTTR indicates that incidents are being resolved quickly, minimizing the impact on users and the business. Organizations strive to continuously reduce their MTTR by improving incident detection, response, and resolution processes, implementing automation, and investing in proactive monitoring and preventive measures. By reducing MTTR, organizations can improve service reliability, minimize downtime, and enhance overall customer satisfaction.
Understanding what DevOps is not can be as crucial as understanding what it is. Here are some misconceptions or things that DevOps is often mistakenly perceived as, but isn’t:
Not Just Automation: While automation is a significant aspect of DevOps, it’s not the sole focus. DevOps is not just about automating manual tasks; it’s about cultural transformation, collaboration, and breaking down silos between development and operations teams.
Not Just Tools: DevOps is often associated with a plethora of tools and technologies, but it’s not about the tools themselves. Simply adopting tools like Docker, Kubernetes, or Jenkins doesn’t automatically mean an organization has implemented DevOps. DevOps is about people, processes, and culture, with tools being enablers of those aspects.
Not a Team or Role: DevOps is not a specific team or role within an organization. It’s a cultural mindset and set of practices that promote collaboration, communication, and shared responsibility across development, operations, and other relevant teams. While some organizations may have DevOps teams or roles, the true essence of DevOps is about breaking down barriers between teams, not creating new ones.
Not Just Continuous Deployment: While continuous deployment (CD) is a common DevOps practice, DevOps is not solely about continuously deploying code into production. It’s about delivering value to customers quickly and efficiently through the adoption of agile principles, automation, and a culture of continuous improvement.
Not a Silver Bullet: DevOps is not a one-size-fits-all solution or a silver bullet that can magically solve all of an organization’s problems. Implementing DevOps requires careful planning, cultural change, and ongoing commitment from leadership and teams. It’s a journey rather than a destination, and success depends on various factors, including organizational culture, maturity, and context.
Not Just for Technology Companies: While DevOps originated in the technology sector, it’s not exclusive to technology companies. Organizations across various industries, including finance, healthcare, retail, and manufacturing, have successfully adopted DevOps principles and practices to improve their software delivery processes, enhance customer experiences, and drive business outcomes.
Not Just about Speed: While DevOps emphasizes rapid and frequent delivery of software, it’s not solely about speed at the expense of quality or stability. DevOps aims to strike a balance between speed, quality, and reliability, enabling organizations to deliver high-quality software quickly and sustainably through automation, collaboration, and continuous feedback loops.
Understanding these misconceptions can help organizations approach DevOps with a clearer picture of what it actually involves: a cultural and organizational shift supported by methods and tools, rather than any single team, tool, or practice.