Amazon Web Services (AWS) delivers on-demand, pay-as-you-go IT services over the internet. These cloud computing web services provide a set of primitive, abstract technical infrastructure and distributed computing building blocks and tools. Some of the services offered by AWS are listed below:

EC2

  • EC2 stands for Elastic Compute Cloud.

  • It is a web service that provides secure, resizable compute capacity in the cloud.

Lambda

  • Compute service that lets you run your code without provisioning or managing servers.

  • You pay only for the compute time you consume; there is no charge when your code isn’t running.
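
As a minimal sketch of what "your code" looks like on Lambda: in Python, a function is a handler that receives an event and a context object and returns a result. The `name` field in the event is a hypothetical input chosen for illustration.

```python
import json


def lambda_handler(event, context):
    """Entry point that Lambda calls once per invocation."""
    # "name" is a hypothetical event field, used only for illustration.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # Local smoke test; on AWS, the service supplies event and context.
    print(lambda_handler({"name": "AWS"}, None))
```

The handler holds no server setup of its own, which is the point of the service: Lambda provisions the runtime and bills for the time the handler runs.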

ECS

  • Stands for Elastic Container Service.

  • It is a highly scalable, high-performance container orchestration service that supports Docker containers.

  • It allows you to run and scale containerized applications on AWS.

S3

  • Stands for Simple Storage Service.

  • Provides object storage through a web service interface.

  • S3 can be used alone or together with other AWS services such as Elastic Compute Cloud (EC2), Elastic Block Store (EBS), and Glacier, as well as third-party storage repositories and gateways.

  • Stores data as objects within resources that are called buckets. You can store as many objects as you want within a bucket, and you can write, read, and delete objects in your bucket.

  • You must first create a bucket; you can then store any type of data in it.
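
The bucket-then-object workflow above can be sketched as follows. This assumes a boto3 S3 client is passed in; the bucket and key names in the usage comment are hypothetical.

```python
def store_and_fetch(s3_client, bucket, key, data):
    """Create a bucket, write one object, read it back, then delete it."""
    # Outside us-east-1, create_bucket also needs a
    # CreateBucketConfiguration={"LocationConstraint": region} argument.
    s3_client.create_bucket(Bucket=bucket)
    s3_client.put_object(Bucket=bucket, Key=key, Body=data)
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
    s3_client.delete_object(Bucket=bucket, Key=key)
    return body


# Usage with real credentials (bucket name is hypothetical and must be globally unique):
#   import boto3
#   store_and_fetch(boto3.client("s3"), "my-example-bucket", "notes/hello.txt", b"hello")
```

Passing the client in as a parameter keeps the function easy to exercise with a stand-in object before pointing it at a real account.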

DynamoDB

  • Managed NoSQL database.

  • It’s a fully managed cloud database that supports both document and key-value store models.
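
To make the two models concrete, here is a simplified sketch that converts a Python dict into DynamoDB's typed attribute-value format (`S` for strings, `N` for numbers, `M` for maps, `L` for lists). A top-level key plus scalar values is the key-value model; nested maps and lists are the document model. boto3 ships its own serializer for this, so the helper below is only an illustration, and the item fields are hypothetical.

```python
def to_attribute_value(value):
    """Convert a Python value to DynamoDB's typed attribute-value format."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if value is None:
        return {"NULL": True}
    if isinstance(value, str):
        return {"S": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, list):
        return {"L": [to_attribute_value(v) for v in value]}
    if isinstance(value, dict):
        return {"M": {k: to_attribute_value(v) for k, v in value.items()}}
    raise TypeError(f"unsupported type: {type(value).__name__}")


def to_item(record):
    """Convert a whole record; the top-level keys become item attributes."""
    return {name: to_attribute_value(value) for name, value in record.items()}


# Hypothetical item: "user_id" plays the role of the primary key,
# "profile" is a nested document.
item = to_item({"user_id": "u-1", "visits": 3, "profile": {"name": "Ada", "tags": ["admin"]}})
```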

RDS

  • Amazon Relational Database Service (Amazon RDS) makes it straightforward to set up, operate, and scale a relational database in the cloud.

  • It provides cost-efficient and resizable capacity while automating time-consuming administration tasks such as provisioning hardware, setting up the database, patching, and making backups.

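  • DB instance storage is backed by block storage (Amazon EBS volumes) for most engines; RDS itself is a database service, not a block storage service.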

Glue

  • Fully managed serverless ETL (extract, transform, and load) service.

  • Lets you build a data warehouse to organize, cleanse, validate, and format data.

  • Consists of AWS Glue Data Catalog (central metadata repository), ETL engine (automatically generates Python or Scala code) and flexible scheduler (handles dependency resolution, job monitoring, and retries).

  • Simple, cost-effective way to categorize, clean, enrich, and reliably move data between various data stores and data streams.

  • Serverless, so there is no infrastructure to set up or manage.

Athena

  • Serverless, interactive query service for analyzing data in Amazon S3 with standard SQL.

  • Not an ETL tool.

  • Works in the following manner:

    • Load data in S3 -> Define the schema -> Query
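
The three steps above can be sketched as follows, assuming the data is already in S3 as comma-separated files. The table name, columns, database, and S3 paths are hypothetical; the client is expected to be a boto3 Athena client.

```python
def build_statements(data_location):
    """Return the schema definition and a query for data already loaded in S3."""
    # Step 2: define the schema over the files at data_location.
    ddl = (
        "CREATE EXTERNAL TABLE IF NOT EXISTS logs (\n"
        "  request_time string,\n"
        "  status int\n"
        ")\n"
        "ROW FORMAT DELIMITED FIELDS TERMINATED BY ','\n"
        f"LOCATION '{data_location}'"
    )
    # Step 3: query it with standard SQL.
    query = "SELECT status, COUNT(*) AS hits FROM logs GROUP BY status"
    return [ddl, query]


def submit(athena_client, statements, database, output_location):
    """Submit each statement and return the query execution IDs."""
    ids = []
    for sql in statements:
        # Queries run asynchronously; real code would poll
        # get_query_execution until each one finishes before sending the next.
        response = athena_client.start_query_execution(
            QueryString=sql,
            QueryExecutionContext={"Database": database},
            ResultConfiguration={"OutputLocation": output_location},
        )
        ids.append(response["QueryExecutionId"])
    return ids


# Usage (all names are hypothetical):
#   import boto3
#   statements = build_statements("s3://my-example-bucket/logs/")
#   submit(boto3.client("athena"), statements, "default", "s3://my-example-bucket/results/")
```

No data is moved or transformed in these steps, which is why Athena is a query tool rather than an ETL tool.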

Kinesis

  • Makes it easy to collect, process, and analyze real-time, streaming data so you can get timely insights and react quickly to new information.

  • Offers key capabilities to cost-effectively process streaming data at any scale, along with the flexibility to choose the tools that best suit the requirements of your application.

  • KDS

    • Stands for Kinesis Data Streams.

    • Massively scalable and durable real-time data streaming service.

    • Can continuously capture gigabytes of data per second from hundreds of thousands of sources such as website clickstreams, database event streams, financial transactions, social media feeds, IT logs, and location-tracking events.

    • Data collected is available in milliseconds to enable real-time analytics use cases such as real-time dashboards, real-time anomaly detection, etc.

  • KDF

    • Stands for Kinesis Data Firehose (since renamed Amazon Data Firehose).

    • Easiest way to reliably load streaming data into data lakes, data stores, and analytics services.

    • Can capture, transform, and deliver streaming data to Amazon S3, Amazon Redshift, Amazon OpenSearch Service (formerly Amazon Elasticsearch Service), generic HTTP endpoints, and service providers like Datadog, New Relic, MongoDB, and Splunk.

  • KDA

    • Stands for Kinesis Data Analytics (since renamed Amazon Managed Service for Apache Flink).

    • Easiest way to transform and analyze streaming data in real time with Apache Flink.
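
As a rough sketch of the producer side of Kinesis Data Streams, the function below sends one event to a stream. It assumes a boto3 Kinesis client is passed in; the stream name and the `user_id` field are hypothetical.

```python
import json


def send_event(kinesis_client, stream_name, event):
    """Put a single JSON-encoded event on a Kinesis data stream."""
    response = kinesis_client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        # Records with the same partition key are routed to the same shard.
        PartitionKey=str(event["user_id"]),
    )
    return response["SequenceNumber"]


# Usage (stream name is hypothetical):
#   import boto3
#   send_event(boto3.client("kinesis"), "clickstream", {"user_id": 42, "page": "/home"})
```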

EMR

  • Stands for Elastic MapReduce.

  • EMR is a managed service that makes it fast, easy, and cost-effective to run Apache Hadoop and Spark to process vast amounts of data.

CodeStar

  • Cloud-based service for creating, managing, and working with software development projects on AWS.

  • Quickly develop, build and deploy applications on AWS.

Redshift

  • Redshift is an enterprise-level, petabyte-scale, fully managed data warehousing service.

  • Simple and cost-effective way to run high-performance queries on petabytes of structured data, so that you can build powerful reports and dashboards using your existing business intelligence tools.

CloudWatch

  • Amazon CloudWatch is a monitoring service for AWS Cloud resources and the applications that you run on AWS.

  • You can use Amazon CloudWatch to collect and track metrics, collect and monitor log files, set alarms, and automatically react to changes in your AWS resources.
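
A minimal sketch of the "set alarms" part: the helper below builds the arguments for an alarm on an EC2 instance's average CPU utilization. The alarm naming scheme, threshold, and instance ID are assumptions chosen for illustration.

```python
def cpu_alarm_params(instance_id, threshold_percent=80.0):
    """Build keyword arguments for CloudWatch's PutMetricAlarm API."""
    return {
        "AlarmName": f"high-cpu-{instance_id}",  # hypothetical naming scheme
        "Namespace": "AWS/EC2",
        "MetricName": "CPUUtilization",
        "Dimensions": [{"Name": "InstanceId", "Value": instance_id}],
        "Statistic": "Average",
        "Period": 300,  # length of each evaluated period, in seconds
        "EvaluationPeriods": 2,  # number of periods that must breach
        "Threshold": threshold_percent,
        "ComparisonOperator": "GreaterThanThreshold",
    }


# Usage (instance ID is hypothetical):
#   import boto3
#   boto3.client("cloudwatch").put_metric_alarm(**cpu_alarm_params("i-0123456789abcdef0"))
```

To react automatically, an alarm like this would also list actions (for example, a scaling policy or a notification topic) in its `AlarmActions` argument.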

ELB

  • Stands for Elastic Load Balancing.

  • Automatically distributes incoming application traffic across multiple targets, such as Amazon EC2 instances, containers, and IP addresses.

  • It can handle the varying load of your application traffic in a single Availability Zone or across multiple Availability Zones.

  • ELB offers several types of load balancers, including Application Load Balancer, Network Load Balancer, and Classic Load Balancer. All of them feature the high availability, automatic scaling, and robust security that are necessary to make your applications fault-tolerant.

Auto Scaling

  • Helps you maintain application availability by automatically scaling your Amazon EC2 capacity up or down according to conditions that you define.
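
To illustrate what "conditions that you define" can mean, here is a toy rule that adds an instance when average CPU is high, removes one when it is low, and always stays within a minimum and maximum group size. It is only an illustration of the idea, not the service's actual algorithm, and all thresholds are hypothetical.

```python
def desired_capacity(current, cpu_percent, *, min_size=1, max_size=4,
                     scale_out_above=70.0, scale_in_below=30.0):
    """Return the instance count a simple threshold rule would ask for."""
    if cpu_percent > scale_out_above:
        target = current + 1  # scale out under load
    elif cpu_percent < scale_in_below:
        target = current - 1  # scale in when idle
    else:
        target = current  # within the comfortable band
    # Never leave the configured group size limits.
    return max(min_size, min(max_size, target))
```

With these defaults, a group of 2 instances at 85% CPU would grow to 3, while a group already at the maximum of 4 would stay at 4.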

VPC

  • Stands for Virtual Private Cloud.

  • Lets you provision a logically isolated section of the AWS Cloud where you can launch AWS resources in a virtual network that you define.

  • You have complete control over your virtual networking environment, including the selection of your own IP address range, the creation of subnets, and the configuration of route tables and network gateways.

  • The idea of a VPC is to provide a box in which all of your applications live. Nothing comes into the box and nothing gets out of it unless you give specific permission. Whether you filter by network protocol, port, IP address, user, or other information, you maintain complete control of all the assets inside your VPC.

  • When you create a VPC, you then divide the space inside it into subnets. Instances running in a public subnet can communicate with the outside world once an internet gateway (IGW) is attached to the VPC and the public subnet’s route table has a route to that gateway.

  • A schematic diagram that explains the working of a VPC is shown below:

[Figure: VPC schematic diagram]
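
The subnet and routing ideas can also be modeled locally with Python's standard `ipaddress` module. This is a sketch of the concepts only: it makes no AWS calls, and the CIDR ranges and the `igw-example` gateway name are hypothetical.

```python
import ipaddress

# The VPC's own address range, divided into /24 subnets.
vpc = ipaddress.ip_network("10.0.0.0/16")
public_subnet, private_subnet = list(vpc.subnets(new_prefix=24))[:2]

# Each subnet uses a route table. Only the public one has a route
# to the internet gateway; both can reach addresses inside the VPC.
route_tables = {
    "public": [("10.0.0.0/16", "local"), ("0.0.0.0/0", "igw-example")],
    "private": [("10.0.0.0/16", "local")],
}


def next_hop(routes, destination):
    """Pick the most specific route that covers the destination, if any."""
    address = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if address in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None  # no route: the traffic cannot leave
    return max(matches, key=lambda match: match[0].prefixlen)[1]
```

Here `next_hop(route_tables["public"], "8.8.8.8")` resolves to the gateway, while the same lookup against the private table finds no route, which is the "nothing gets out unless you allow it" behavior described above.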

Useful Resources

What is Cloud Computing?

Amazon Web Services

AWS EC2

AWS EMR

AWS Lambda

Check which services are eligible for AWS Free Tier