Mastering Kubernetes Deployment Strategies: Best Practices for Successful Application Deployments

Contents:

Introduction

  • Recreate Deployment
  • Rolling Update Deployment
  • Blue-Green Deployment
  • Dark deployments or A/B Deployments
  • Canary Deployment

Introduction

Deploying applications on Kubernetes can be daunting without the right deployment strategy in place. With one, the process becomes far more manageable and efficient. In this blog, we will discuss the best practices and deployment strategies for Kubernetes to help you deploy your applications successfully.

Recreate

The Recreate deployment strategy in Kubernetes is a simple strategy that involves replacing the existing instances of an application with new instances of the updated version all at once. This means that the entire application will be offline during the deployment process, which can result in downtime for users.

To use the Recreate deployment strategy, the deployment manifest needs to be updated to specify the new version of the application. Then, the updated manifest is applied to Kubernetes, which will terminate the existing instances of the application and create new instances of the updated version.

spec:
  replicas: 3
  strategy:
    type: Recreate
  template:
  ...

In contrast to other deployment strategies such as Rolling Updates and Canary Deployments, the Recreate strategy has no in-built capabilities for ensuring that the updated application is functioning correctly before taking it live. This means that testing and quality assurance must be done beforehand to ensure the updated version is free of issues that may cause downtime.

Rolling Updates

Rolling updates are a popular deployment strategy used in Kubernetes. This strategy allows for the application to be updated without any downtime. During a rolling update, Kubernetes updates the pods in the deployment one at a time, ensuring that the application is running throughout the deployment process, and there is no interruption in the user experience. Best practices for implementing rolling updates in Kubernetes include:

  1. Use Readiness Probes – Readiness probes allow Kubernetes to check the state of the pod before sending traffic to it, ensuring that only the healthy pods receive traffic.
  2. Gradual Rollout – Gradually rolling out updates allows you to monitor the update process and detect any issues that may arise, allowing you to take action before the update is fully rolled out.
...
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxUnavailable: 1
      maxSurge: 1
  template:
    spec:
      containers:
      - name: my-app
        readinessProbe:
          httpGet:
            path: /healthz
            port: 80
...
---
#OR
---
...
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
  template:
  ...

The strategy field defines the deployment strategy; here its type is RollingUpdate. The maxUnavailable field specifies the maximum number of Pods that can be unavailable during the update process, and maxSurge specifies the maximum number of Pods that can be created above the desired replica count. In the first example both are set to 1; in the second they are expressed as percentages (25%) of the desired replica count.

With these settings, Kubernetes will replace the old Pods with new ones one at a time. This means that at any point during the deployment process, there will always be at least two instances of the application running. Once all the old instances have been replaced with the new ones, the Rolling Update is complete, and the application is fully updated with zero downtime.

Blue-Green Deployment

Blue-Green deployment is another popular deployment strategy that is used to ensure that there is zero downtime during the deployment process. This strategy involves deploying two identical environments, one blue and the other green. The blue environment represents the current stable version, while the green environment represents the new version. Best practices for implementing blue-green deployments in Kubernetes include:

  1. Automate Environment Switching – Automating the switching of environments ensures that the deployment process is fast and efficient, reducing the risk of errors during the process.
  2. Continuous Integration and Deployment (CI/CD) Pipeline – Implementing a CI/CD pipeline streamlines the deployment process, ensuring that updates are thoroughly tested and deployed seamlessly.

Kubernetes natively supports the RollingUpdate and Recreate deployment strategies but does not have a native blue-green deployment strategy. However, blue-green deployment can be implemented in Kubernetes using a combination of Kubernetes Services and Deployments.

The most common approach to implementing blue-green deployment in Kubernetes is to create two separate environments (blue and green) with identical application deployments behind different services. Traffic is directed to one of the environments through the corresponding service. When a new version of the application is ready to deploy, a new deployment is created in the inactive environment, and the corresponding service is updated to direct traffic to the new version. Once the new version has been fully tested and verified, traffic is shifted entirely to the new version by updating the service selector to point to the new environment.
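
As a minimal sketch of that idea (the Service name and the version label used here are illustrative, not taken from any particular setup), the Service below selects the Pods of the blue Deployment; cutting over to green is a one-line change to the selector:

apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
    version: blue   # change to "green" to shift all traffic to the new Deployment
  ports:
  - port: 80
    targetPort: 8080

Rolling back is the same change in the opposite direction, which is what makes blue-green attractive despite the cost of running two environments.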

There are also tools like Istio, Linkerd, and Flagger, which can automate the process of blue-green deployment and make it easier to implement. These tools use Kubernetes-native service meshes and allow for the automatic shifting of traffic between the two environments based on metrics and conditions set by the user.

Note: Here is an example YAML for a Blue-Green deployment strategy. It is meant only to illustrate the concept; this sample requires the Argo Rollouts CRDs and the rollout controller to be set up in the cluster.

apiVersion: argoproj.io/v1alpha1
kind: Rollout
...
spec:
  ...
  strategy:
    type: BlueGreen
    blueGreenStrategy:
      activeService: my-app
      previewService: my-app-preview
      prePromotionAnalysis:
        enabled: true
      postPromotionAnalysis:
        enabled: true

The strategy field defines the deployment strategy. In this case, the type of the strategy is Blue-Green. The activeService field specifies the name of the service that will be serving traffic to the active environment, which is set to my-app. The previewService field specifies the name of the service that will be serving traffic to the preview environment, which is set to my-app-preview.

The prePromotionAnalysis field specifies whether to run analysis before the new environment is promoted to the active environment. In this example, it is enabled. The postPromotionAnalysis field specifies whether to run analysis after the new environment has been promoted to the active environment. In this example, it is also enabled.

With these settings, the rollout controller deploys the new version behind the preview service while the active service continues to serve the current version. Once the new environment has been fully tested and the analysis passes, the controller promotes the preview environment to the active environment by updating the service selector, so all traffic cuts over to the new version with zero downtime.

Dark deployments or A/B Deployments

A dark deployment is another variation on the canary (that incidentally can also be handled by Flagger). The difference between a dark deployment and a canary is that dark deployments deal with features in the front-end rather than the backend as is the case with canaries.

Another name for dark deployment is A/B testing. Rather than launch a new feature for all users, you can release it to a small set of users. The users are typically unaware they are being used as testers for the new feature, hence the term “dark” deployment.

With the use of feature toggles and other tools, you can monitor how users interact with the new feature: whether it converts them, whether they find the new UI confusing, and other types of metrics.

Besides weighted routing, Flagger can also route traffic to the canary based on HTTP match conditions. In an A/B testing scenario, you’ll be using HTTP headers or cookies to target a certain segment of your users. This is particularly useful for front-end applications that require session affinity.
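
As a rough sketch of such a configuration (the header name and cookie value are made up for illustration, and the rest of the Canary spec is elided), Flagger's analysis section accepts HTTP match conditions like these:

apiVersion: flagger.app/v1beta1
kind: Canary
metadata:
  name: my-frontend
spec:
  ...
  analysis:
    interval: 1m
    iterations: 10
    # only requests carrying this header or cookie are routed to the new version
    match:
    - headers:
        x-canary:
          exact: "insider"
    - headers:
        cookie:
          regex: "^(.*?;)?(canary=always)(;.*)?$"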

Canary Deployment

Canary deployment is a deployment strategy that involves rolling out the new version of an application to a small subset of users or traffic before rolling it out to the entire user base. Canary deployment allows you to test the new version in a live environment, ensuring that it works as expected before fully rolling it out. Best practices for implementing canary deployments in Kubernetes include:

  1. Monitor and Analyze Metrics – Monitoring and analyzing metrics during the canary deployment process helps you detect any issues and make informed decisions about the deployment process.
  2. Validation Checks – Once the canary deployment passes its validation checks, it can be promoted to the production environment, and all users or traffic can be directed to the new version of the software.

Kubernetes does not have a native canary deployment strategy. However, it is commonly implemented in Kubernetes using tools like Istio, Linkerd, and Flagger, which use service meshes and automation to simplify the process. These tools allow for automatic shifting of traffic based on metrics and conditions set by the user, and they provide monitoring and alerting capabilities to quickly identify any issues with the new version.

Note: Here is an example YAML for a Canary deployment strategy. It is meant only to illustrate the concept; this sample requires the Istio CRDs and istiod to be set up in the cluster.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
spec:
  replicas: 2
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
        version: v1
    spec:
      containers:
      - name: myapp
        image: myapp:v1
        ports:
        - containerPort: 80
      imagePullSecrets:
      - name: regcred

---

apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
  ports:
  - name: http
    port: 80
    targetPort: 80
  type: ClusterIP

---

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: myapp
spec:
  hosts:
  - myapp.example.com
  http:
  - route:
    - destination:
        host: myapp
        subset: v1
      weight: 90
    - destination:
        host: myapp
        subset: v2
      weight: 10

---

apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: myapp
spec:
  host: myapp
  subsets:
  - name: v1
    labels:
      app: myapp
      version: v1
  - name: v2
    labels:
      app: myapp
      version: v2

This YAML file creates a Deployment with two replicas of the myapp:v1 container image, a Service that points to the Deployment, and a VirtualService and DestinationRule that implement the canary deployment strategy using Istio. The VirtualService directs 90% of traffic to the stable subset (v1) and 10% of traffic to the canary subset (v2), while the DestinationRule defines the subsets by their version labels. In practice, a second Deployment whose Pods are labeled version: v2 would back the canary subset.

Conclusion

Deploying applications on Kubernetes can be challenging, but with the right deployment strategy and best practices in place, it can be made more manageable and efficient. By implementing rolling updates, blue-green deployments, and canary deployments, you can ensure that your applications are deployed seamlessly and with minimal downtime.

Deploy Static Sites from Azure DevOps to AWS S3 in 3 steps

While working on my previous note on how to deploy from Azure DevOps to an AWS EC2 instance, I came across another use case: how to deploy from Azure DevOps to an AWS S3 bucket. This note is about the steps I followed to do so.
Here is a recap so that we're all on the same page.
Azure DevOps is Microsoft's solution for the software development process that aids collaboration, traceability, and visibility using components like Azure Boards (work items), Azure Repos (code repository), Azure Pipelines (build and deploy), Azure Artifacts (package management), and Azure Test Plans, along with a plug-and-play module to integrate a large number of third-party tools like Docker, Terraform, etc.
AWS S3 stands for Simple Storage Service and, as the name suggests, it is used to store and retrieve artifacts (files and folders). Here is a link to the official documentation.

Coming back to the use case at hand, I wanted to create a release definition (pipeline) to store artifacts from Azure DevOps into an AWS S3 bucket. This can be broken down into three steps:
Step 1: Create an AWS IAM user with appropriate permissions
Step 2: Create a service connection in Azure DevOps
Step 3: Create a release definition

Please note that there are a few prerequisites for this use case:
  • knowledge of AWS IAM (how to create/manage a user)
  • knowledge of AWS S3 (how to create/manage a bucket)
  • knowledge of Azure DevOps pipelines (how to create a build and release definition)

Step 1: Create an AWS IAM user with appropriate permissions
As we know, to work with resources in AWS, we need appropriate access (read/modify). In this case, we need an IAM user with programmatic access and full access to S3. Attach the appropriate policy (AmazonS3FullAccess) and store the Access Key ID and Secret Access Key securely; we need those in the next step.
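
If you prefer to script this step instead of using the console, a minimal CloudFormation sketch could look like the following (the user name is arbitrary; in a real setup you would scope the policy to the target bucket rather than grant AmazonS3FullAccess, and avoid exposing the secret in stack outputs):

Resources:
  AzdoDeployUser:
    Type: AWS::IAM::User
    Properties:
      UserName: azdo-s3-deploy            # arbitrary example name
      ManagedPolicyArns:
        - arn:aws:iam::aws:policy/AmazonS3FullAccess
  AzdoDeployAccessKey:
    Type: AWS::IAM::AccessKey
    Properties:
      UserName: !Ref AzdoDeployUser
Outputs:
  AccessKeyId:
    Value: !Ref AzdoDeployAccessKey
  SecretAccessKey:
    Value: !GetAtt AzdoDeployAccessKey.SecretAccessKey   # for illustration only; store it securely instead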

Note: I am purposely not going into the details of IAM (users, groups, policies, and roles) in this post.

Step 2: Create a service connection in Azure DevOps
Now let's go back to our Azure DevOps portal and launch the project where we want to set up the connection. Click on the gear icon at the bottom left corner. This will bring up the project settings, and under Pipelines, we find the Service connections option. If you do not have any existing connections, you'll get a welcome message.

Click on Create service connection.
This will launch a new side panel with all the available connection types. Chances are that the AWS connection type is not listed, as was the case for me.

This means that we’ll have to install the AWS Connection from the marketplace.
Let us close the new service connection panel and return to our project portal. In the top right corner you'll see a shopping bag icon: Marketplace -> Browse marketplace.
This launches a new page where we may add extensions for our Azure DevOps project. Search for AWS here.

Click on the icon to install the extension; it is free. Make sure you are logged into the right organization when you install it (ensure the email ID displayed at the top is the same email ID tied to your Azure DevOps project).
As instructed on the next page, select the Azure DevOps organization for which you want to install the extension, and click on Install. After installation, you'll see a message to "Proceed to organization". Let's click on that.
Now let's navigate to the project again and click on the gear icon at the bottom left, followed by Service connections and then "Create service connection". This time we see AWS as the first option to connect to. Yay!
Select AWS and click on Next.
On this page, we are asked to provide mandatory authentication details along with a few optional ones. For now, we'll proceed with the Access Key ID and Secret Access Key that we saved when we created our IAM user in the previous step, along with a service connection name and description.

Once you have the connection created, you'll see it listed under Service connections.
That brings us to the end of Step 2, and now we proceed with creating a release definition to copy artifacts (files and folders) from Azure DevOps to an S3 bucket in AWS. How exciting!

Step 3: Create a release definition
Let's navigate back to our project portal by clicking on Overview (in the vertical panel) and then click on Pipelines -> Releases.
It'd be good if we already have a pipeline/build definition created, or some code in the repository that we can upload to S3.
Assuming you do, we proceed to add a new stage to the release definition.
Click on the + to add a task to the agent job and search for S3.

This will list two tasks. We select "Amazon S3 Upload" and click on Add.
Under this task, we will be required to fill in details like the AWS credentials (service connection), region, bucket name, and source folder. Once that is done, save the release definition and trigger a run.
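
If you prefer a YAML pipeline over the classic release editor, the same task can be expressed roughly as below (the task comes from the AWS Toolkit extension we installed; the connection name, region, bucket, and folder are placeholders):

steps:
- task: S3Upload@1
  displayName: Upload artifacts to S3
  inputs:
    awsCredentials: 'my-aws-connection'            # the service connection from Step 2
    regionName: 'us-east-1'
    bucketName: 'my-static-site-bucket'
    sourceFolder: '$(System.DefaultWorkingDirectory)/drop'
    globExpressions: '**'
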
Note: Here is an image of the task group from a separate project with all values.

After a successful run of the release definition, I was able to view the artifacts in the S3 bucket.

Conclusion: There are multiple options available when it comes to populating an AWS S3 bucket from Azure DevOps, depending on the specific use case. Similar to uploading to S3, we can also download contents from an S3 bucket by using the "Amazon S3 Download" task.

As I like to state at the end of my notes, I hope you enjoyed reading this article as much as I enjoyed writing it. And if you have any questions or clarifications to seek, please do not hesitate; I will be glad to explore those with you.

Taking a look at EC2 storage options! When to use Instance Store/EBS/EFS?

Pricing

For information about storage pricing, open AWS Pricing

Storage Options

Let's take a look at what AWS has to offer.

Amazon EC2 provides you with flexible, cost effective, and easy-to-use data storage options for your instances. Each option has a unique combination of performance and durability. These storage options can be used independently or in combination to suit your requirements.

After reading this section, you should have a good understanding about how you can use the data storage options supported by Amazon EC2 to meet your specific requirements. These storage options include the following:

Amazon EBS

Amazon EBS provides durable, block-level storage volumes that you can attach to a running instance. You can use Amazon EBS as a primary storage device for data that requires frequent and granular updates. For example, Amazon EBS is the recommended storage option when you run a database on an instance.

Points to remember !!

  • An EBS volume lives in a single Availability Zone.
  • It can be attached to only one EC2 instance at a time.
  • It can be used as a root/boot volume.
  • It supports very high IOPS (up to 256,000 with io2 Block Express).
  • Data persists after instance restart or termination.
  • Use it when sub-millisecond latency is not a requirement.
  • It incurs additional cost on top of the EC2 instance itself.
  • It can be configured to be either deleted or retained when the EC2 instance terminates.

Amazon EBS provides the following volume types

  • Solid state drives (SSD) — Optimized for transactional workloads involving frequent read/write operations with small I/O size, where the dominant performance attribute is IOPS. Examples: gp2, gp3, io1, io2.
  • Hard disk drives (HDD) — Optimized for large streaming workloads where the dominant performance attribute is throughput.
  • Previous generation — Hard disk drives that can be used for workloads with small datasets where data is accessed infrequently and performance is not of primary importance. (not talked about here)

More at: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html

Amazon EC2 instance store

Many instances can access storage from disks that are physically attached to the host computer. This disk storage is referred to as instance store. Instance store provides temporary block-level storage for instances. The data on an instance store volume persists only during the life of the associated instance; if you stop, hibernate, or terminate an instance, any data on instance store volumes is lost.

Points to remember !!

  • It is temporary storage and is volatile.
  • All data is lost when the instance is stopped, hibernated, or terminated.
  • It is specific to an Availability Zone.
  • It can be attached to only one EC2 instance at a time.
  • It can be a root/boot volume.
  • It can deliver very high IOPS (over a million on some NVMe-based instance types).
  • Use it when very low (sub-millisecond) latency matters.
  • The instance type determines the size of the instance store available and the type of hardware used for the instance store volumes.
  • Instance store volumes are included in the usage cost of the instance.

More at: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/InstanceStorage.html

Amazon EFS file system

Amazon EFS provides scalable file storage for use with Amazon EC2. You can create an EFS file system and configure your instances to mount the file system. You can use an EFS file system as a common data source for workloads and applications running on multiple instances.

Points to remember !!

  • Data persists and can be shared among EC2 instances.
  • It is scoped to a Region.
  • It can be attached to multiple EC2 instances at the same time.
  • It cannot be a root/boot volume.
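
As a minimal CloudFormation sketch of setting this up (the subnet and security group IDs are placeholders, and you would normally create one mount target per Availability Zone):

Resources:
  SharedFileSystem:
    Type: AWS::EFS::FileSystem
    Properties:
      Encrypted: true
  SharedFsMountTarget:
    Type: AWS::EFS::MountTarget
    Properties:
      FileSystemId: !Ref SharedFileSystem
      SubnetId: subnet-0123456789abcdef0        # placeholder subnet
      SecurityGroups:
        - sg-0123456789abcdef0                  # must allow NFS (TCP 2049) from the instances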

Adding storage

Every time you launch an instance from an AMI, a root storage device is created for that instance. The root storage device contains all the information necessary to boot the instance. You can specify storage volumes in addition to the root device volume when you create an AMI or launch an instance using block device mapping. For more information, see Block device mappings.
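
For example, here is a minimal CloudFormation sketch of an instance launched with an extra gp3 data volume through a block device mapping (the AMI ID and device name are placeholders):

Resources:
  WebServer:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t3.micro
      ImageId: ami-0123456789abcdef0            # placeholder AMI
      BlockDeviceMappings:
        - DeviceName: /dev/xvdb                 # additional data volume
          Ebs:
            VolumeType: gp3
            VolumeSize: 20                      # GiB
            DeleteOnTermination: false          # keep the volume when the instance terminates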

You can also attach EBS volumes to a running instance. 

Manage Imported Certificate Expiry in ACM (Using CloudWatch Events + Lambda), Implemented with CloudFormation

First of all, In addition to requesting SSL/TLS certificates provided by AWS Certificate Manager (ACM), you can import certificates that you obtained outside of AWS. You might do this because you already have a certificate from a third-party certificate authority (CA), or because you have application-specific requirements that are not met by ACM issued certificates.

One thing to note:

ACM does not provide managed renewal for imported certificates.

You are responsible for monitoring the expiration date of your imported certificates and for renewing them before they expire. You can simplify this task by using Amazon CloudWatch Events to send notices when your imported certificates approach expiration. For more information, see Using CloudWatch Events.

– docs.aws.amazon.com

In this post we will cover the part of the flow that invokes the Lambda via an event rule, creating both the rule and the Lambda via CloudFormation.

Understanding the Event.

The event that the Lambda function receives looks like the following:

{
  "version": "0",
  "id": "9c95e8e4-96a4-ef3f-b739-b6aa5b193afb",
  "detail-type": "ACM Certificate Approaching Expiration",
  "source": "aws.acm",
  "account": "123456789012",
  "time": "2020-09-30T06:51:08Z",
  "region": "us-east-1",
  "resources": [
    "arn:aws:acm:us-east-1:123456789012:certificate/61f50cd4-45b9-4259-b049-d0a53682fa4b"
  ],
  "detail": {
    "DaysToExpiry": 31,
    "CommonName": "My Awesome Service"
  }
}

Creating the Lambda Event Handler

For now we do not perform any action in the Lambda, but you will need an IAM role attached to it so that the function can perform the desired action.

We will create a simple Lambda function that just logs the event for now; the code can be modified later to do whatever we want.

AWSTemplateFormatVersion: '2010-09-09'
Description: Lambda function that logs ACM certificate expiry events.
Resources:
  primer:
    Type: AWS::Lambda::Function
    Properties:
      Runtime: nodejs12.x
      Role: arn:aws:iam::123456789012:role/lambda-role
      Handler: index.handler
      Code:
        ZipFile: |
          var aws = require('aws-sdk')
          exports.handler = function(event, context) {
              console.log("REQUEST RECEIVED:\n" + JSON.stringify(event))
              // check the certificate using its ARN: event.resources[0]
          }
      Description: Invoke a function when a certificate approaches expiry.
      TracingConfig:
        Mode: Active
      VpcConfig:
        SecurityGroupIds:
          - <SecurityGroupID>
        SubnetIds:
          - <subnetid>
          - <subnetid>

<SecurityGroupID> – replace with your security group ID

<subnetid> – replace with your subnet IDs

event.resources[0] – this is where the certificate ARN is in the event; we can use it to call ACM DescribeCertificate.

event.detail.DaysToExpiry – the number of days remaining before the certificate expires.

We can now get the certificate reissued by our certificate authority and then reimport it using the same ARN.

Using CloudWatch Events AND Permissions

You can create CloudWatch rules based on these events and use the CloudWatch console to configure actions that take place when the events are detected. 

Creating the event rule and attaching the permission for the rule to invoke the Lambda, using CloudFormation

AWSTemplateFormatVersion: "2010-09-09"
Description: "A CloudWatch Event Rule and permission"
Resources:
  EventRule:
    Type: "AWS::Events::Rule"
    Properties:
      Name: "detect-acm-certificate-expiry-events"
      Description: "A CloudWatch Event Rule that sends a notification to provide notice of approaching expiration of an ACM certificate."
      State: "ENABLED"
      Targets:
        - Arn: ""
          Id: <LambdaARN>
      EventPattern:
        detail-type:
          - "ACM Certificate Approaching Expiration"
        source:
          - "aws.acm"
  LambdaInvokePermissionsAcmExpiryRule:
    Type: "AWS::Lambda::Permission"
    Properties:
      FunctionName:
        Fn::GetAtt:
          - <LambdaARN>
          - "Arn"
      Action: "lambda:InvokeFunction"
      Principal: "events.amazonaws.com"

<LambdaARN> – replace this with the ARN of the Lambda function created above.

An Intelligent Overview of Amazon S3.

We have all heard a lot about S3; it has been used by many enterprise organizations and has proven to be an excellent solution. So, let's take a deeper look and understand what S3 is and how we can harness the power it provides.

Amazon Simple Storage Service (S3) is an object storage service

To start with, one of the most interesting features for any developer is that it is free to use up to a point. Let's understand what that means.

12-Months Free: These free tier offers are only available to new AWS customers, and are available for 12 months following your AWS sign-up date. When your 12 month free usage term expires or if your application use exceeds the tiers, you simply pay standard, pay-as-you-go service rates

Always Free: These free tier offers do not automatically expire at the end of your 12 month AWS Free Tier term, but are available to both existing and new AWS customers indefinitely.

Trials: These free tier offers are short term trial offers that start from the time of first usage begins. Once the trial period expires you simply pay standard, pay-as-you-go service rates (see each service page for full pricing details).

Features

  • Storage classes – Amazon S3 offers a range of storage classes designed for different use cases.
  • Storage management – Amazon S3 has storage management features that you can use to manage costs, meet regulatory requirements, reduce latency, and save multiple distinct copies of your data for compliance requirements.
  • Access management – Amazon S3 provides features for auditing and managing access to your buckets and objects. By default, S3 buckets and the objects in them are private. 
  • Data processing –  To transform data and trigger workflows to automate a variety of other processing activities at scale
  • Storage logging and monitoring – Amazon S3 provides logging and monitoring tools that you can use to monitor and control how your Amazon S3 resources are being used. 
  • Analytics and insights – Amazon S3 offers features to help you gain visibility into your storage usage, which empowers you to better understand, analyze, and optimize your storage at scale.
  • Strong consistency – Amazon S3 provides strong read-after-write consistency for PUT and DELETE requests of objects in your Amazon S3 bucket in all AWS Regions.

How Amazon S3 works

Amazon S3 is an object storage service that stores data as objects within buckets. An object is a file and any metadata that describes the file. A bucket is a container for objects.

To store your data in Amazon S3, you first create a bucket and specify a bucket name and AWS Region. Then, you upload your data to that bucket as objects in Amazon S3. Each object has a key (or key name), which is the unique identifier for the object within the bucket.

S3 provides features that you can configure to support your specific use case. For example, you can use S3 Versioning to keep multiple versions of an object in the same bucket, which allows you to restore objects that are accidentally deleted or overwritten.

Buckets and the objects in them are private and can be accessed only if you explicitly grant access permissions. You can use bucket policies, AWS Identity and Access Management (IAM) policies, access control lists (ACLs), and S3 Access Points to manage access.

Buckets

A bucket is a container for objects stored in Amazon S3. You can store any number of objects in a bucket and can have up to 100 buckets in your account. To request an increase, visit the Service Quotas Console.

Every object is contained in a bucket. For example, if the object named photos/puppy.jpg is stored in the DOC-EXAMPLE-BUCKET bucket in the US West (Oregon) Region, then it is addressable using the URL https://DOC-EXAMPLE-BUCKET.s3.us-west-2.amazonaws.com/photos/puppy.jpg. For more information, see Accessing a Bucket.

When you create a bucket, you enter a bucket name and choose the AWS Region where the bucket will reside. After you create a bucket, you cannot change the name of the bucket or its Region. Bucket names must follow the bucket naming rules. You can also configure a bucket to use S3 Versioning or other storage management features.

Buckets also:

  • Organize the Amazon S3 namespace at the highest level.
  • Identify the account responsible for storage and data transfer charges.
  • Provide access control options, such as bucket policies, access control lists (ACLs), and S3 Access Points, that you can use to manage access to your Amazon S3 resources.
  • Serve as the unit of aggregation for usage reporting.

Objects

Objects are the fundamental entities stored in Amazon S3. Objects consist of object data and metadata. The metadata is a set of name-value pairs that describe the object. These pairs include some default metadata, such as the date last modified, and standard HTTP metadata, such as Content-Type. You can also specify custom metadata at the time that the object is stored.

Keys

An object key (or key name) is the unique identifier for an object within a bucket. Every object in a bucket has exactly one key. The combination of a bucket, object key, and optionally, version ID (if S3 Versioning is enabled for the bucket) uniquely identify each object. So you can think of Amazon S3 as a basic data map between “bucket + key + version” and the object itself.

S3 Versioning

You can use S3 Versioning to keep multiple variants of an object in the same bucket. With S3 Versioning, you can preserve, retrieve, and restore every version of every object stored in your buckets. You can easily recover from both unintended user actions and application failures.
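
For instance, here is a minimal CloudFormation sketch that enables versioning on a bucket (the bucket name is a placeholder):

Resources:
  ExampleBucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: doc-example-bucket            # placeholder; bucket names must be globally unique
      VersioningConfiguration:
        Status: Enabled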

Bucket policy

A bucket policy is a resource-based AWS Identity and Access Management (IAM) policy that you can use to grant access permissions to your bucket and the objects in it. Only the bucket owner can associate a policy with a bucket. The permissions attached to the bucket apply to all of the objects in the bucket that are owned by the bucket owner. Bucket policies are limited to 20 KB in size.

Bucket policies use JSON-based access policy language that is standard across AWS. You can use bucket policies to add or deny permissions for the objects in a bucket. Bucket policies allow or deny requests based on the elements in the policy, including the requester, S3 actions, resources, and aspects or conditions of the request (for example, the IP address used to make the request). For example, you can create a bucket policy that grants cross-account permissions to upload objects to an S3 bucket while ensuring that the bucket owner has full control of the uploaded objects. For more information, see Bucket policy examples.
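
As a rough sketch of the cross-account upload scenario above, expressed as a CloudFormation bucket policy (the account ID and bucket name are placeholders):

Resources:
  ExampleBucketPolicy:
    Type: AWS::S3::BucketPolicy
    Properties:
      Bucket: doc-example-bucket
      PolicyDocument:
        Version: "2012-10-17"
        Statement:
          - Sid: AllowCrossAccountUpload
            Effect: Allow
            Principal:
              AWS: arn:aws:iam::111122223333:root        # placeholder account
            Action: s3:PutObject
            Resource: arn:aws:s3:::doc-example-bucket/*
            Condition:
              StringEquals:
                s3:x-amz-acl: bucket-owner-full-control  # uploader must grant the bucket owner full control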

Access control lists (ACLs)

As a general rule, we recommend using S3 resource-based policies (bucket policies and access point policies) or IAM policies for access control instead of ACLs. ACLs are an access control mechanism that predates resource-based policies and IAM. For more information about when you’d use ACLs instead of resource-based policies or IAM policies, see Access policy guidelines.

You can use ACLs to grant read and write permissions for individual buckets and objects to authorized users. Each bucket and object has an ACL attached to it as a subresource. The ACL defines which AWS accounts or groups are granted access and the type of access. For more information, see Access control list (ACL) overview.

S3 Access Points

Amazon S3 Access Points are named network endpoints with dedicated access policies that describe how data can be accessed using that endpoint. Access Points simplify managing data access at scale for shared datasets in Amazon S3. Access Points are named network endpoints attached to buckets that you can use to perform S3 object operations, such as GetObject and PutObject.

Each access point has its own IAM policy. You can configure Block Public Access settings for each access point. To restrict Amazon S3 data access to a private network, you can also configure any access point to accept requests only from a virtual private cloud (VPC).

For more information, see Managing data access with Amazon S3 access points.

Regions

You can choose the geographical AWS Region where Amazon S3 stores the buckets that you create. You might choose a Region to optimize latency, minimize costs, or address regulatory requirements. Objects stored in an AWS Region never leave the Region unless you explicitly transfer or replicate them to another Region. For example, objects stored in the Europe (Ireland) Region never leave it.

Amazon S3 data consistency model

Amazon S3 provides strong read-after-write consistency for PUT and DELETE requests of objects in your Amazon S3 bucket in all AWS Regions. This behavior applies to both writes to new objects as well as PUT requests that overwrite existing objects and DELETE requests. In addition, read operations on Amazon S3 Select, Amazon S3 access controls lists (ACLs), Amazon S3 Object Tags, and object metadata (for example, the HEAD object) are strongly consistent.

Updates to a single key are atomic. For example, if you make a PUT request to an existing key from one thread and perform a GET request on the same key from a second thread concurrently, you will get either the old data or the new data, but never partial or corrupt data.

Amazon S3 achieves high availability by replicating data across multiple servers within AWS data centers. If a PUT request is successful, your data is safely stored. Any read (GET or LIST request) that is initiated following the receipt of a successful PUT response will return the data written by the PUT request. Here are examples of this behavior:

  • A process writes a new object to Amazon S3 and immediately lists keys within its bucket. The new object appears in the list.
  • A process replaces an existing object and immediately tries to read it. Amazon S3 returns the new data.
  • A process deletes an existing object and immediately tries to read it. Amazon S3 does not return any data because the object has been deleted.
  • A process deletes an existing object and immediately lists keys within its bucket. The object does not appear in the listing.

Note

  • Amazon S3 does not support object locking for concurrent writers. If two PUT requests are simultaneously made to the same key, the request with the latest timestamp wins. If this is an issue, you must build an object-locking mechanism into your application.
  • Updates are key-based. There is no way to make atomic updates across keys. For example, you cannot make the update of one key dependent on the update of another key unless you design this functionality into your application.

Bucket configurations have an eventual consistency model. Specifically, this means that:

  • If you delete a bucket and immediately list all buckets, the deleted bucket might still appear in the list.
  • If you enable versioning on a bucket for the first time, it might take a short amount of time for the change to be fully propagated. We recommend that you wait for 15 minutes after enabling versioning before issuing write operations (PUT or DELETE requests) on objects in the bucket.

Concurrent applications

This section provides examples of behavior to be expected from Amazon S3 when multiple clients are writing to the same items.

In this example, both W1 (write 1) and W2 (write 2) finish before the start of R1 (read 1) and R2 (read 2). Because S3 is strongly consistent, R1 and R2 both return color = ruby.

In the next example, W2 does not finish before the start of R1. Therefore, R1 might return color = ruby or color = garnet. However, because W1 and W2 finish before the start of R2, R2 returns color = garnet.

In the last example, W2 begins before W1 has received an acknowledgement. Therefore, these writes are considered concurrent. Amazon S3 internally uses last-writer-wins semantics to determine which write takes precedence. However, the order in which Amazon S3 receives the requests and the order in which applications receive acknowledgements cannot be predicted because of various factors, such as network latency. For example, W2 might be initiated by an Amazon EC2 instance in the same Region, while W1 might be initiated by a host that is farther away. The best way to determine the final value is to perform a read after both writes have been acknowledged.

Accessing Amazon S3

You can work with Amazon S3 in any of the following ways:

AWS Management Console

The console is a web-based user interface for managing Amazon S3 and AWS resources. If you’ve signed up for an AWS account, you can access the Amazon S3 console by signing into the AWS Management Console and choosing S3 from the AWS Management Console home page.

AWS Command Line Interface

You can use the AWS command line tools to issue commands or build scripts at your system’s command line to perform AWS (including S3) tasks.

The AWS Command Line Interface (AWS CLI) provides commands for a broad set of AWS services. The AWS CLI is supported on Windows, macOS, and Linux. To get started, see the AWS Command Line Interface User Guide. For more information about the commands for Amazon S3, see s3api and s3control in the AWS CLI Command Reference.

AWS SDKs

AWS provides SDKs (software development kits) that consist of libraries and sample code for various programming languages and platforms (Java, Python, Ruby, .NET, iOS, Android, and so on). The AWS SDKs provide a convenient way to create programmatic access to S3 and AWS. Amazon S3 is a REST service. You can send requests to Amazon S3 using the AWS SDK libraries, which wrap the underlying Amazon S3 REST API and simplify your programming tasks. For example, the SDKs take care of tasks such as calculating signatures, cryptographically signing requests, managing errors, and retrying requests automatically. For information about the AWS SDKs, including how to download and install them, see Tools for AWS.

Every interaction with Amazon S3 is either authenticated or anonymous. If you are using the AWS SDKs, the libraries compute the signature for authentication from the keys that you provide. For more information about how to make requests to Amazon S3, see Making requests.

Amazon S3 REST API

The architecture of Amazon S3 is designed to be programming language-neutral, using AWS-supported interfaces to store and retrieve objects. You can access S3 and AWS programmatically by using the Amazon S3 REST API. The REST API is an HTTP interface to Amazon S3. With the REST API, you use standard HTTP requests to create, fetch, and delete buckets and objects.

To use the REST API, you can use any toolkit that supports HTTP. You can even use a browser to fetch objects, as long as they are anonymously readable.

The REST API uses standard HTTP headers and status codes, so that standard browsers and toolkits work as expected. In some areas, we have added functionality to HTTP (for example, we added headers to support access control). In these cases, we have done our best to add the new functionality in a way that matches the style of standard HTTP usage.

If you make direct REST API calls in your application, you must write the code to compute the signature and add it to the request. For more information about how to make requests to Amazon S3, see Making requests.

Note

SOAP API support over HTTP is deprecated, but it is still available over HTTPS. Newer Amazon S3 features are not supported for SOAP. We recommend that you use either the REST API or the AWS SDKs.

PCI DSS compliance

Amazon S3 supports the processing, storage, and transmission of credit card data by a merchant or service provider, and has been validated as being compliant with Payment Card Industry (PCI) Data Security Standard (DSS). For more information about PCI DSS, including how to request a copy of the AWS PCI Compliance Package, see PCI DSS Level 1.

AWS RDS 101 – Amazon Relational Database Service

What is a Relational Database (RDBMS)?

A relational database is a type of database that stores and provides access to data points that are related to one another. Relational databases are based on the relational model, an intuitive, straightforward way of representing data in tables. In a relational database, each row in the table is a record with a unique ID called the key. The columns of the table hold attributes of the data, and each record usually has a value for each attribute, making it easy to establish the relationships among data points.

ACID properties and RDBMS

Four crucial properties define relational database transactions: atomicity, consistency, isolation, and durability—typically referred to as ACID.

  • Atomicity defines all the elements that make up a complete database transaction.
  • Consistency defines the rules for maintaining data points in a correct state after a transaction.
  • Isolation keeps the effect of a transaction invisible to others until it is committed, to avoid confusion.
  • Durability ensures that data changes become permanent once the transaction is committed.

Relational Databases Provided by AWS

Features of RDS

RDS has a few key features (the CloudFormation sketch after this list shows how some of them map to DBInstance properties):

  • Multi-AZ – for disaster recovery – automatic failover (AWS automatically points the DNS name to the standby's IP).
  • Read Replicas – for performance – offload heavy read traffic; read replicas can themselves have read replicas.
  • Encryption at Rest – for security – if enabled, snapshots, backups, and read replicas are also encrypted.
  • Backups – automated backups and manual snapshots.
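
A minimal CloudFormation sketch, assuming a Secrets Manager secret named app/db exists for the password; the engine, instance class, and sizes are illustrative:

Resources:
  AppDatabase:
    Type: AWS::RDS::DBInstance
    Properties:
      Engine: mysql
      DBInstanceClass: db.t3.micro
      AllocatedStorage: 20
      MasterUsername: admin
      MasterUserPassword: '{{resolve:secretsmanager:app/db:SecretString:password}}'   # assumes this secret exists
      MultiAZ: true                  # standby in another AZ with automatic failover
      StorageEncrypted: true         # encryption at rest (applies to snapshots and replicas too)
      BackupRetentionPeriod: 7       # days of automated backups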

Non Relational Databases

Non-relational databases (often called NoSQL databases) are different from traditional relational databases in that they store their data in a non-tabular form. Instead, non-relational databases might be based on data structures like documents. A document can be highly detailed while containing a range of different types of information in different formats. This ability to digest and organize various types of information side-by-side makes non-relational databases much more flexible than relational databases.

Example MongoDB Document for a Patient in Healthcare

What is Data warehousing ?

A data warehouse is a type of data management system that is designed to enable and support business intelligence (BI) activities, especially analytics. Data warehouses are solely intended to perform queries and analysis and often contain large amounts of historical data.

Difference OLTP Vs OLAP

Online Transaction Processing(OLTP) differs from Online Analytics Processing(OLAP) in terms of the type of query you run. Online transaction processing (OLTP) captures, stores, and processes data from transactions in real time. Online analytical processing (OLAP) uses complex queries to analyze aggregated historical data from OLTP systems.

AWS's data warehousing solution is Amazon Redshift.

What is ElastiCache ?

Amazon ElastiCache is a fully managed, in-memory caching service supporting flexible, real-time use cases. You can use ElastiCache for caching, which accelerates application and database performance, or as a primary data store for use cases that don’t require durability like session stores, gaming leaderboards, streaming, and analytics. ElastiCache is compatible with

  • Redis
  • Memcached. 

Few Take Away

“Unable to verify secret hash for client ” error from my Amazon Cognito user pools API when called from AWS Lambda ?

When I try to invoke my Amazon Cognito user pools API, I get an “Unable to verify secret hash for client <client-id>” error. How do I resolve the error?

As per documentation

When a user pool app client is configured with a client secret in the user pool, a SecretHash value is required in the API’s query argument. If a secret hash isn’t provided in the APIs query argument, then Amazon Cognito returns an Unable to verify secret hash for client <client-id> error.

– AWS Docs

However, even after passing the SECRET_HASH, the call still does not behave correctly when using ADMIN_NO_SRP_AUTH. A client secret should not be generated for an app client that is used with the ADMIN_NO_SRP_AUTH flow.

I am trying to do this via AWS Lambda here.

So, There is a simple solution to this.

Go to App clients in your Cognito user pool.

You should see a list of clients, as there can be more than one app client in a user pool.

Now click on Show details, and you should see a field showing the client secret.

If a client secret shows up, the workaround is to delete that app client from the user pool and create a new one.

When creating the new app client, uncheck the Generate client secret checkbox.

To Remember
Please remember to uncheck the above checkbox.

We can now remove the client secret (SECRET_HASH) from the auth parameters.

The request would now look like shown below.

    public AdminInitiateAuthRequest CreateAdminInitiateAuthRequest(AdminInitiateAuthRequestModel apiRequest)
    {
        if (String.IsNullOrEmpty(apiRequest.Username) || String.IsNullOrEmpty(apiRequest.Password))
            throw new Exception("Bad Request");

        AdminInitiateAuthRequest authRequest = new AdminInitiateAuthRequest();
        authRequest.UserPoolId = _poolId;
        authRequest.ClientId = _appClientId;
        authRequest.AuthFlow = AuthFlowType.ADMIN_NO_SRP_AUTH;
        authRequest.AuthParameters = new Dictionary<string, string>();
        authRequest.AuthParameters.Add("USERNAME", apiRequest.Username);
        authRequest.AuthParameters.Add("PASSWORD", apiRequest.Password);
        return authRequest;
    }

Building Your First Serverless Service With AWS Lambda Functions

Many developers are at least marginally familiar with AWS Lambda functions. They’re reasonably straightforward to set up, but the vast AWS landscape can make it hard to see the big picture. With so many different pieces it can be daunting, and frustratingly hard to see how they fit seamlessly into a normal web application.

The Serverless framework is a huge help here. It streamlines the creation, deployment, and most significantly, the integration of Lambda functions into a web app. To be clear, it does much, much more than that, but these are the pieces I’ll be focusing on. Hopefully, this post strikes your interest and encourages you to check out the many other things Serverless supports. If you’re completely new to Lambda you might first want to check out this AWS intro.


Your first Serverless service

Before we get to cool things like file uploads and S3 buckets, let's create a basic Lambda function, connect it to an HTTP endpoint, and call it from an existing web app. The Lambda won't do anything useful or interesting, but this will give us a nice opportunity to see how pleasant it is to work with Serverless. First, let's create our service in a directory of your choice (for me, it's my lambda folder). Whatever directory you choose, cd into it from the terminal and run the following command:

sls create -t aws-nodejs --path hello-world

That creates a new directory called hello-world. Let’s crack it open and see what’s in there.

If you look in handler.js, you should see an async function that returns a message. We could run sls deploy in our terminal right now and deploy that Lambda function, which could then be invoked. But before we do that, let's make it callable over the web.

Working with AWS manually, we’d normally need to go into the AWS API Gateway, create an endpoint, then create a stage, and tell it to proxy to our Lambda. With serverless, all we need is a little bit of config.

Still in the hello-world directory? Open the serverless.yaml file that was created in there.

The config file actually comes with boilerplate for the most common setups. Let’s uncomment the http entries, and add a more sensible path.


CORS configuration 

Ideally, we want to call this from front-end JavaScript code with the Fetch API, but that unfortunately means we need CORS to be configured. This section will walk you through that.

In the http event configuration from above, add cors: true, like this:

functions:
  hello:
    handler: handler.hello
    events:
      - http:
          path: msg
          method: get
          cors: true

That’s the section! CORS is now configured on our API endpoint, allowing cross-origin communication.