
🚀 Self-Host ParquetReader

Run ParquetReader on-premises or in your own cloud and query Parquet, Feather, Avro, CSV, NetCDF, GeoJSON and ORC files with fast SQL; every row stays inside your environment.

  • Keep sensitive data and logs on your own systems. No uploads and no hidden tracking.
  • Query files directly from S3-compatible storage instead of copying them to a tool.
  • Deploy as a Docker container or on Kubernetes.

Short demo

Request Self-Host Access

We reply within 1 business day. You’ll receive Docker/Compose instructions and licensing details.

Why teams self-host

Built for environments where data cannot leave your infrastructure, while still enabling fast exploration and prototyping.

Complete Data Privacy

Operate entirely within your environment for compliance, audits and internal policies.

Flexible Deployment

Docker, Compose or Kubernetes. Single node or scaled out.

Advanced SQL Querying

Fast filtering, joins and aggregations on file-based data without ETL.

ParquetReader API

Connect internal systems, dashboards and analytics tools through the API.
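The API surface of a self-hosted instance isn't described on this page, so everything concrete in the sketch below is a placeholder: the /api/query endpoint path, the JSON payload and response shape, and the bearer-token auth are assumptions for illustration only, not the documented ParquetReader API. It simply shows the shape of how an internal dashboard or script might submit SQL to your own deployment:

```python
# Hypothetical sketch: endpoint path, payload shape and auth header are
# assumptions for illustration, not the documented ParquetReader API.
import requests

BASE_URL = "https://parquetreader.internal.example.com"  # your self-hosted instance (placeholder)
API_TOKEN = "replace-with-your-token"                    # however your deployment handles auth

def run_query(sql: str) -> list[dict]:
    """Submit a SQL query to a self-hosted instance and return result rows."""
    resp = requests.post(
        f"{BASE_URL}/api/query",                          # placeholder endpoint
        json={"sql": sql},                                # placeholder payload shape
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        timeout=60,
    )
    resp.raise_for_status()
    return resp.json()["rows"]                            # placeholder response shape

if __name__ == "__main__":
    for row in run_query("SELECT status, COUNT(*) AS n FROM events GROUP BY status"):
        print(row)
```

Swap the placeholder endpoint, payload and auth for whatever your deployment actually exposes; the licensing reply includes the real instructions.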

SQL directly on S3-compatible storage

Point ParquetReader at a bucket or prefix in S3-compatible storage and write SQL. It scans the relevant files and streams back results without moving data out of your account.

  • Query Parquet, Feather, Avro, CSV, GeoJSON and ORC from S3-compatible storage.
  • Column projection and pagination for efficient work with large files (see the sketch after this list).
  • Works with AWS S3, MinIO and custom S3-compatible endpoints.
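
As a minimal sketch of the column projection and pagination mentioned above: the s3:// path-style FROM clause, the hypothetical bucket and column names, and the LIMIT/OFFSET pagination are illustrative assumptions, not a documented ParquetReader SQL dialect. The point is only that you select just the columns you need and fetch results a page at a time:

```python
# Illustrative only: the s3:// path syntax, bucket/prefix, column names and
# LIMIT/OFFSET pagination are assumptions about how such a query could be
# phrased, not ParquetReader's documented SQL dialect.

PAGE_SIZE = 10_000

def page_query(page: int) -> str:
    """Build a projected, paginated query over a Parquet prefix in S3-compatible storage."""
    return f"""
        SELECT order_id, customer_id, amount                    -- project only the columns you need
        FROM 's3://analytics-bucket/orders/2024/*.parquet'      -- hypothetical bucket/prefix
        WHERE amount > 100
        ORDER BY order_id
        LIMIT {PAGE_SIZE} OFFSET {page * PAGE_SIZE}             -- keep each result page small
    """

# First page of results, e.g. submitted via the run_query() helper sketched above.
print(page_query(0))
```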

Screenshot: SQL over multiple Parquet files in S3-compatible storage with instant preview and download.

Self-hosted ParquetReader is for teams that cannot send data to hosted tools and need to explore, validate and prototype on real files.

Share your deployment setup (VM/Kubernetes, S3/MinIO, auth requirements) and we will respond with a clear proposal.