
AWS Athena

Pre-requisites: AWS

Amazon Web Services (AWS) provides its account holders with on-demand IT resources on a pay-as-you-go basis, with no upfront expenses. You pay only for the services you actually use.



What is AWS Athena

AWS Athena is a serverless, interactive query service that lets you analyze data in Amazon S3 using standard SQL. Athena is based on Presto, a distributed SQL query engine, so it can query data in Amazon S3 quickly using conventional SQL syntax. There is no infrastructure to manage with Athena, so you can focus on analyzing data at scale. To get a better idea of AWS Athena, let us first look at its architecture.

AWS Athena Architecture

Apache Presto, an open-source distributed SQL query engine, serves as the foundation for Athena. When a user submits a query, Athena generates a query plan and sends it to Presto for execution. Presto then distributes the query across many cluster nodes for parallel processing. The results are then collected and returned to the user. Athena stores table and partition metadata in a managed Hive metastore. When a query runs, Athena reads this metadata from the metastore to determine the data's location and format. Athena also integrates with AWS Glue, a fully managed extract, transform, and load (ETL) service, which lets customers create and manage data catalogs and ETL processes. Next, we will go through the various aspects of AWS Athena.
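The submit, execute, and return-results flow described above can be sketched from the client side in Python. This is a minimal sketch, not a definitive implementation: it assumes a boto3-style Athena client is passed in (for example, one created with `boto3.client("athena")`), and the function name `run_athena_query`, the database name, and the S3 output location are hypothetical placeholders chosen for illustration.

```python
import time


def run_athena_query(client, sql, database, output_location, poll_seconds=1.0):
    """Submit a query, wait for it to finish, and return the first page of rows.

    `client` is assumed to behave like a boto3 Athena client.
    """
    # Submit the query; an execution ID is returned immediately.
    started = client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    query_id = started["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        execution = client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"Query {query_id} ended in state {state}")

    # Each row is a list of cells; empty cells have no "VarCharValue" key.
    response = client.get_query_results(QueryExecutionId=query_id)
    rows = response["ResultSet"]["Rows"]
    return [[cell.get("VarCharValue") for cell in row["Data"]] for row in rows]


# Hypothetical usage (requires boto3, AWS credentials, and an existing table):
#   import boto3
#   rows = run_athena_query(
#       boto3.client("athena"),
#       "SELECT status, COUNT(*) FROM access_logs GROUP BY status",
#       database="example_db",
#       output_location="s3://example-bucket/athena-results/",
#   )
```

Passing the client in as a parameter keeps the sketch independent of any particular credentials setup. For a `SELECT` query, the first returned row normally holds the column names, and larger result sets would need pagination, which is left out here.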



 

Features of AWS Athena

Advantages of AWS Athena

  1. No infrastructure setup – Athena is a serverless service that eliminates the need for users to set up and manage infrastructure, making data querying easier and faster.
  2. Cost-effective – Athena charges customers solely for the quantity of data scanned by their queries, making it an affordable solution for ad hoc and exploratory queries.
  3. Scalability – Athena is a fully-managed service that can automatically scale to accommodate massive amounts of data and queries.
  4. SQL support – Since Athena supports ANSI SQL, users can query data in S3 using their existing SQL knowledge and tools.
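To make the pay-per-data-scanned pricing concrete, here is a rough cost estimate in Python. The figures are assumptions for illustration, not taken from this article: a price of 5.00 USD per terabyte scanned and a 10 MB minimum charge per query. Actual prices vary by region and over time, so check the current Athena pricing page before relying on these numbers. The function name `estimate_query_cost` is hypothetical.

```python
TB = 1024 ** 4                        # bytes per terabyte (assumed billing unit)
MIN_BYTES_PER_QUERY = 10 * 1024 ** 2  # assumed 10 MB minimum charge per query


def estimate_query_cost(bytes_scanned, price_per_tb=5.0):
    """Return the estimated cost in USD of a query that scans `bytes_scanned` bytes."""
    # Queries that scan less than the minimum are billed at the minimum.
    billable = max(bytes_scanned, MIN_BYTES_PER_QUERY)
    return billable / TB * price_per_tb


# Example: a query that scans 200 GB of data.
cost = estimate_query_cost(200 * 1024 ** 3)
print(f"Estimated cost: ${cost:.4f}")
```

Because the cost depends only on bytes scanned, anything that reduces the amount of data a query reads also reduces its cost.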

Disadvantages of AWS Athena

  1. Restricted query performance – Athena's speed depends on the volume of data scanned and the complexity of the query, so large or intricate queries can take longer to run.
  2. No real-time querying – Because Athena is intended for ad hoc analysis of data already stored in S3, it may not be ideal for low-latency, real-time querying.
  3. Limited data types – In comparison to other database systems, Athena only supports a restricted selection of data types.

Use Cases of AWS Athena

In conclusion, Amazon Athena is a serverless query service that allows customers to run standard SQL queries to analyze data in S3. Serverless design, standard SQL support, integration with the AWS ecosystem, cost-effective pricing, and integration with BI tools are among its primary characteristics. Its architecture is built on top of Apache Presto and integrates with AWS Glue.
