AWS Glue

From Wikipedia, the free encyclopedia
AWS Glue
Developer(s): Amazon.com
Initial release: August 2017[1]
Operating system: Cross-platform
Available in: English
Website: aws.amazon.com/glue/

AWS Glue is an event-driven, serverless computing platform for extract, transform, and load (ETL) workloads, provided by Amazon as part of Amazon Web Services. It was introduced in August 2017.[2]

The primary purpose of Glue is to scan other services[3] in the same Virtual Private Cloud (or an equivalent accessible network element, even if not provided by AWS), particularly S3. Glue discovers the source data and stores the associated metadata (e.g. a table's schema: field names, types, and lengths) in the AWS Glue Data Catalog, which is then accessible via the AWS console or APIs.[5] Jobs are billed according to compute time, with a minimum billing duration of 1 minute.[4]
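The scan described above is carried out by a crawler. The following is a minimal sketch, in Python, of how a crawler over an S3 path might be created and started through the `create_crawler` and `start_crawler` operations of the boto3 Glue client. The helper names, the crawler name, the role ARN, the database name, and the bucket path are all hypothetical placeholders, not values from any AWS documentation.

```python
from typing import Any, Dict


def build_crawler_request(
    name: str, role_arn: str, database: str, s3_path: str
) -> Dict[str, Any]:
    """Build keyword arguments for the Glue `create_crawler` operation.

    All four values are supplied by the caller; none is a default
    chosen by AWS.
    """
    return {
        "Name": name,
        "Role": role_arn,            # IAM role the crawler assumes
        "DatabaseName": database,    # Data Catalog database to write to
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


def create_and_start_crawler(glue_client: Any, request: Dict[str, Any]) -> str:
    """Create the crawler, start one run, and return the crawler name.

    `glue_client` is expected to behave like `boto3.client("glue")`.
    """
    glue_client.create_crawler(**request)
    glue_client.start_crawler(Name=request["Name"])
    return request["Name"]


# Example usage (requires boto3 and AWS credentials; values are hypothetical):
#
#   import boto3
#   glue = boto3.client("glue")
#   request = build_crawler_request(
#       name="example-crawler",
#       role_arn="arn:aws:iam::123456789012:role/ExampleGlueRole",
#       database="example_db",
#       s3_path="s3://example-bucket/raw/",
#   )
#   create_and_start_crawler(glue, request)
```

Keeping the request construction separate from the API calls makes the request easy to inspect or test without contacting AWS. Once a crawler run finishes, the discovered tables appear in the Data Catalog database named in the request.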

Languages supported

Scala and Python are officially supported as of 2020.[6]

Catalog interrogation via API

The catalog can be read in the AWS console (via a browser) and via an API, which is divided into topics including:[7]

  • Database API
  • Table API
  • Partition API
  • Connection API
  • User-Defined Function API
  • Importing an Athena Catalog to AWS Glue
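As an illustration of the Table API, the sketch below reads table schemas from one catalog database using the `get_tables` paginator of the boto3 Glue client. It assumes the response layout in which each table carries its columns under `StorageDescriptor` and its partition columns under `PartitionKeys`. The function names and the database name are hypothetical.

```python
from typing import Any, Dict, Iterator, List, Tuple


def iter_tables(glue_client: Any, database_name: str) -> Iterator[Dict[str, Any]]:
    """Yield every table definition in the given catalog database.

    `glue_client` is expected to behave like `boto3.client("glue")`.
    The paginator follows continuation tokens across result pages.
    """
    paginator = glue_client.get_paginator("get_tables")
    for page in paginator.paginate(DatabaseName=database_name):
        yield from page.get("TableList", [])


def table_columns(table: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (name, type) pairs for data columns, then partition keys."""
    descriptor = table.get("StorageDescriptor", {})
    columns = descriptor.get("Columns", []) + table.get("PartitionKeys", [])
    return [(col["Name"], col.get("Type", "")) for col in columns]


def describe_database(
    glue_client: Any, database_name: str
) -> Dict[str, List[Tuple[str, str]]]:
    """Map each table name in the database to its list of columns."""
    return {
        table["Name"]: table_columns(table)
        for table in iter_tables(glue_client, database_name)
    }


# Example usage (requires boto3 and AWS credentials; "example_db" is hypothetical):
#
#   import boto3
#   glue = boto3.client("glue")
#   for name, cols in describe_database(glue, "example_db").items():
#       print(name, cols)
```

Using the paginator rather than a single `get_tables` call avoids silently truncating the result when a database holds more tables than fit in one response page.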

References

  1. "Introducing AWS Glue: A Simple, Flexible, and Cost-Effective Extract, Transfer, and Load (ETL) Service".
  2. "AWS Services List". ParkMyCloud. Retrieved October 6, 2020.
  3. "AWS Glue: crawlers and use cases". January 5, 2022. Retrieved July 13, 2022.
  4. "AWS Glue version 2.0 featuring 10x faster job start times and 1-minute minimum billing duration". AWS. August 10, 2020. Retrieved October 6, 2020.
  5. "AWS Glue API Documentation". AWS. Retrieved October 6, 2020.
  6. "AWS Glue Now Supports Scala in Addition to Python". AWS. January 12, 2018. Retrieved October 6, 2020.
  7. "Catalog API". AWS. Retrieved October 8, 2020.