However, what caught our attention were some really interesting announcements related to two of our favorite outputs: Amazon Athena and Amazon Redshift.

1. Redshift Data Lake Export to Apache Parquet

You can now store the result of a Redshift query as an Apache Parquet file on Amazon S3. This is really great news for companies that use Redshift as part of their cloud data lake: columnar Parquet storage is highly efficient for analytical querying, and it enables access from a wide variety of additional services such as Athena and Redshift Spectrum. Upsolver also leverages Parquet when ingesting data to S3, and we think this new feature will make life easier for companies that rely on Redshift as a core component of their data lake architecture. Read the full announcement on the AWS website.

2. Federated Query for Redshift

Currently still in preview, this is another new feature that adds flexibility and versatility to Redshift by allowing you to execute Redshift queries on live data in Amazon RDS for PostgreSQL, as well as Amazon Aurora. According to the announcement, “Federated Query also makes it easy to ingest data into Redshift by letting you query operational databases directly, applying transformations on the fly, and loading data into the target tables without requiring complex ETL pipelines.”

3. Next-generation Compute Instances for Redshift

Hailed as the “next generation of Nitro-powered compute instances”, Amazon’s new RA3 instances for Redshift promise to deliver 3x the performance of other databases. This new cluster type brings decoupled storage and compute to Redshift, with separate optimizations for each, and offers 48 vCPUs, 384 gigabytes of memory and up to 64 terabytes of storage per instance. These enhancements are meant to provide up to 2x better performance and 2x more storage to existing Redshift customers using DS2, without increasing their AWS bill.

4. Advanced Query Accelerator (AQUA) for Redshift

AQUA was described in Andy Jassy’s keynote as an “innovative new hardware-accelerated cache” that delivers up to 10x the query performance of other cloud data warehouses. Amazon explains that AQUA will make Redshift faster by running data-intensive tasks closer to the storage layer, which reduces data movement between storage and compute clusters, as well as by leveraging a scale-out architecture and purpose-designed processors developed by AWS. This sounds very promising and we’ll be keeping a close eye on Redshift query performance in the coming period!

5. Federated Query for Athena

We’re used to seeing Amazon Athena used to query data stored on S3, often as part of an AWS data lake. This new feature, currently still in preview, offers a way to significantly extend Athena functionality by allowing you to query several additional data sources directly in Athena, using its familiar SQL syntax. Queries run on AWS Lambda, with connectors currently available for DynamoDB, HBase, DocumentDB, Redshift, CloudWatch, and JDBC-compliant relational databases such as MySQL and PostgreSQL.

Using Athena or Redshift? Upsolver can help you get the most out of your AWS data lake.
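To make the Redshift Parquet export mentioned in this post a little more concrete, here is a minimal Python sketch that only builds the SQL text of an `UNLOAD ... FORMAT AS PARQUET` statement. The table, bucket, and IAM role names are hypothetical placeholders, and the helper `build_unload_to_parquet` is our own illustration rather than part of any AWS SDK. You would still need to run the resulting statement through whatever Redshift client you already use.

```python
def build_unload_to_parquet(query: str, s3_prefix: str, iam_role_arn: str) -> str:
    """Return a Redshift UNLOAD statement that writes query results as Parquet.

    Hypothetical helper for illustration only; it does not connect to Redshift.
    """
    # UNLOAD takes the query as a single-quoted string literal, so any
    # single quote inside the query has to be doubled.
    escaped_query = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped_query}')\n"
        f"TO '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS PARQUET;"
    )


# All names below are placeholders (assumptions), not real resources.
statement = build_unload_to_parquet(
    query="SELECT event_id, event_type FROM events WHERE event_type = 'click'",
    s3_prefix="s3://example-bucket/exports/events_",
    iam_role_arn="arn:aws:iam::123456789012:role/ExampleRedshiftUnloadRole",
)
print(statement)
```

Once the files are on S3, they can be registered as an external table and queried from Athena or Redshift Spectrum, which is what makes this export useful in a data lake setup.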