S3 Data Connector

S3 Data Connector Documentation

S3 as a connector for federated SQL query across Parquet files stored in S3, or S3-compatible storage solutions (e.g. Minio, Cloudflare R2).

params

  • endpoint: The S3 endpoint, or equivalent (e.g. Minio endpoint), for the S3-compatible storage.
  • region: Region of the S3 bucket, if region specific.

auth

Check Secrets Stores.

Required attributes:

  • key: The access key authorised to access the S3 data (e.g. AWS_ACCESS_KEY_ID for AWS)
  • secretThe secret key authorised to access the S3 data (e.g. AWS_SECRET_ACCESS_KEY for AWS)

Example

Minio

- from: s3://s3-bucket-name/path/to/parquet/cool_dataset.parquet
  name: cool_dataset
  params:
    endpoint: https://my.minio.server
    region: 'us-east-1' # Best practice for Minio

S3

- from: s3://my-startups-data/path/to/parquet/cool_dataset.parquet
  name: cool_dataset
  params:
    endpoint: http://my-startups-data.s3.amazonaws.com
    region: 'ap-southeast-2'