🚀 Self-Host ParquetReader
Run ParquetReader on-premises or in your own cloud and execute fast SQL on Parquet, Feather, Avro, CSV, GeoJSON and ORC files. Every row stays inside your environment.
- Keep sensitive data and logs on your own systems. No uploads and no hidden tracking.
- Query files directly from S3-compatible storage instead of copying them to a tool.
- Deploy as a Docker container or on Kubernetes.
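For the Docker route, a deployment might look like the Compose sketch below. This is only an illustration: the image name, port, and environment variable names are placeholders, not the product's actual configuration, which you receive during onboarding.

```yaml
# Hypothetical docker-compose.yml sketch. Image name, port and
# variable names are assumptions; use the values from your onboarding.
services:
  parquetreader:
    image: your-registry.example.com/parquetreader:latest  # placeholder image
    ports:
      - "8080:8080"                      # assumed web UI / API port
    environment:
      S3_ENDPOINT: "https://minio.internal:9000"   # any S3-compatible endpoint
      S3_ACCESS_KEY_ID: "<access-key>"
      S3_SECRET_ACCESS_KEY: "<secret-key>"
    volumes:
      - ./data:/data                     # optional: local files to query
```

The same container image would typically be wrapped in a Deployment and Service manifest for Kubernetes, scaled out as needed.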
Request Access
Complete Data Privacy
Operate ParquetReader entirely within your environment. Full control for compliance and audits.
Flexible Deployment
Deploy with Docker, Compose or Kubernetes, on a single node or scaled out.
Advanced SQL Querying
Run high-performance SQL on file-based data with filtering, joins and aggregations, without a separate ETL process.
ParquetReader API
Connect internal systems, dashboards and analytics tools directly through the ParquetReader API.
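As a sketch of what an integration could look like, the snippet below builds a JSON query payload in Python. The endpoint path, payload shape, and pagination fields are assumptions for illustration, not the documented ParquetReader API; consult the API reference shipped with your deployment.

```python
# Hypothetical client-side payload for a self-hosted ParquetReader API.
# Field names ("sql", "page", "page_size") and the /api/query endpoint
# are assumptions, not the product's documented contract.
import json

def build_query_request(sql: str, page: int = 1, page_size: int = 1000) -> dict:
    """Build a JSON payload for a hypothetical /api/query endpoint.

    The pagination fields mirror the pagination the product advertises
    for working with large files.
    """
    return {"sql": sql, "page": page, "page_size": page_size}

payload = build_query_request(
    "SELECT region, SUM(amount) AS total FROM events GROUP BY region",
    page=1,
)
# The serialized body would then be POSTed to the deployment, e.g. with
# requests.post(f"{base_url}/api/query", data=body, headers=auth_headers)
body = json.dumps(payload)
```

A dashboard or internal tool would send such a request on each page load and render the returned rows directly, with no intermediate export step.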
SQL directly on S3-compatible storage
Point ParquetReader at a bucket or prefix in S3-compatible storage and write SQL. It scans the relevant files and streams back results without moving data out of your account.
- Query Parquet, Feather, Avro, CSV, GeoJSON and ORC from S3-compatible storage.
- Column projection and pagination for efficient work with large files.
- Works with AWS S3, MinIO and custom S3-compatible endpoints.
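To make the workflow above concrete, here are two illustrative queries as Python strings: one using column projection over a bucket prefix, one joining two files. The path-as-table syntax and bucket names are assumptions for illustration; check your deployment's documentation for the exact dialect.

```python
# Illustrative SQL against S3-compatible storage. The quoting of
# s3:// paths as table names is an assumed syntax, and the bucket
# and column names are made up for the example.

# Column projection: only order_id and amount are read from each file,
# which keeps scans of wide Parquet files cheap.
projected = """
SELECT order_id, amount
FROM 's3://analytics/orders/*.parquet'
WHERE amount > 100
"""

# A join across two objects in the same bucket, without any ETL step.
joined = """
SELECT o.order_id, c.name
FROM 's3://analytics/orders/*.parquet' AS o
JOIN 's3://analytics/customers.parquet' AS c
  ON o.customer_id = c.id
"""
```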
SQL over multiple Parquet files in S3-compatible storage with instant preview and download.
Self-hosted ParquetReader is for teams that cannot send data to hosted tools and need to explore, validate and prototype on real files. It works well for research, analytics and integration use cases.
Pricing is a monthly or annual subscription per deployment, for example per cluster or environment. Share your setup and we will send a clear proposal.
