
API for paying users

Build data workflows on top of real files.

Use your ParquetReader API key to run search and SQL, export CSV/JSON/Parquet, and reuse saved files for n8n agents, Power BI, dashboards, and backend jobs.

Base URL

https://api.parquetreader.com

Send your API key in the X-API-Key header.

Use dataset as the SQL table name.

Click Save this file, then copy the file ID from My Files.

Quick start

Export CSV from a saved file

First save your file in ParquetReader with Save this file. Then use the persistent file_id with your API key. Exports are asynchronous: start a job, poll status, then download when ready.

  • Step 1: Save file
  • Step 2: Start export
  • Step 3: Poll status
  • Step 4: Download

Python

import time
import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.parquetreader.com/parquet"
FILE_ID = "YOUR_PERSISTENT_FILE_ID"
HEADERS = {"X-API-Key": API_KEY}

# 1) Start export: csv, json, or parquet
export = requests.post(
    f"{BASE_URL}/file/{FILE_ID}/process",
    headers={**HEADERS, "Content-Type": "application/json"},
    params={"format": "csv"},
    json={},
    timeout=120,
)
export.raise_for_status()
job_id = export.json()["job_id"]

# 2) Poll status
while True:
    status = requests.get(
        f"{BASE_URL}/file/{FILE_ID}/status",
        headers=HEADERS,
        params={"job_id": job_id},
        timeout=30,
    )
    status.raise_for_status()
    body = status.json()
    if body["status"] == "ready":
        break
    time.sleep(1)

# 3) Download
download = requests.get(
    f"{BASE_URL}/file/{FILE_ID}/download",
    headers=HEADERS,
    params={"job_id": job_id},
    stream=True,
    timeout=120,
)
download.raise_for_status()

with open("output.csv", "wb") as out:
    for chunk in download.iter_content(chunk_size=1024 * 1024):
        out.write(chunk)

print(f"Downloaded output.csv from file {FILE_ID}")
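The polling loop above checks once per second and never gives up. As a hedged sketch, a reusable helper could add an overall timeout and a growing delay between checks. The name wait_until_ready, its parameters, and the delay values are illustrative assumptions, not part of the ParquetReader API; only the "ready" status value comes from the example above.

```python
import time
from typing import Callable


def wait_until_ready(
    fetch_status: Callable[[], dict],
    timeout_s: float = 300.0,        # assumed overall limit, tune to your exports
    initial_delay_s: float = 1.0,
    max_delay_s: float = 10.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Call fetch_status() until it reports "ready" or timeout_s elapses.

    fetch_status is any zero-argument callable returning the parsed status
    JSON, so the HTTP call stays outside this helper.
    """
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        body = fetch_status()
        if body.get("status") == "ready":
            return body
        if clock() >= deadline:
            raise TimeoutError(
                f"export not ready after {timeout_s} s; "
                f"last status: {body.get('status')!r}"
            )
        sleep(delay)
        # Double the wait after each check, capped at max_delay_s.
        delay = min(delay * 2, max_delay_s)


# Usage with the quick-start variables (BASE_URL, FILE_ID, HEADERS, job_id):
# body = wait_until_ready(
#     lambda: requests.get(
#         f"{BASE_URL}/file/{FILE_ID}/status",
#         headers=HEADERS, params={"job_id": job_id}, timeout=30,
#     ).json()
# )
```

Passing the status call in as a function keeps the helper independent of the HTTP library and makes it easy to exercise with a fake in tests.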

curl

API_KEY="YOUR_API_KEY"
FILE_ID="YOUR_PERSISTENT_FILE_ID"

# 1) Start an export job for a saved file
JOB_RESPONSE=$(curl -sS -X POST "https://api.parquetreader.com/parquet/file/$FILE_ID/process?format=csv" \
  -H "X-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{}')

JOB_ID=$(echo "$JOB_RESPONSE" | python3 -c 'import json,sys; print(json.load(sys.stdin)["job_id"])')

# 2) Poll until the export is ready
while true; do
  STATUS_RESPONSE=$(curl -sS "https://api.parquetreader.com/parquet/file/$FILE_ID/status?job_id=$JOB_ID" \
    -H "X-API-Key: $API_KEY")
  STATUS=$(echo "$STATUS_RESPONSE" | python3 -c 'import json,sys; print(json.load(sys.stdin)["status"])')
  [ "$STATUS" = "ready" ] && break
  sleep 1
done

# 3) Download the generated CSV
curl -L "https://api.parquetreader.com/parquet/file/$FILE_ID/download?job_id=$JOB_ID" \
  -H "X-API-Key: $API_KEY" \
  -o output.csv

Search and SQL

Inspect data without exporting first

Use search for broad text matching. Use SQL for counts, filters, grouping, and repeatable analysis. Both run against the same /parquet/search/{file_id} endpoint, which is also the endpoint the n8n AI Agent workflow uses.

Search rows

curl -G "https://api.parquetreader.com/parquet/search/YOUR_FILE_ID" \
  -H "X-API-Key: YOUR_API_KEY" \
  --data-urlencode "q=customer@example.com" \
  --data-urlencode "limit=25" \
  --data-urlencode "meta=true"

Run SQL

curl -G "https://api.parquetreader.com/parquet/search/YOUR_FILE_ID" \
  -H "X-API-Key: YOUR_API_KEY" \
  --data-urlencode "sql=SELECT country, COUNT(*) AS total FROM dataset GROUP BY country ORDER BY total DESC" \
  --data-urlencode "dialect=duckdb" \
  --data-urlencode "limit=100"

SQL notes

  • Use dataset as the table name.
  • Use safe SELECT queries only.
  • Set dialect=duckdb when you write SQL directly for ParquetReader.
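The two curl calls above can be mirrored in Python. This is a minimal sketch using only the standard library: the helper names (search_params, sql_params, search_url, run_query) are hypothetical, and treating the response as JSON is an assumption, since the response format of the search endpoint is not documented here.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import Request, urlopen

BASE_URL = "https://api.parquetreader.com/parquet"


def search_params(q: str, limit: int = 25, meta: bool = True) -> dict:
    """Query parameters for a text search, as in the curl example."""
    params = {"q": q, "limit": str(limit)}
    if meta:
        # Only meta=true appears in the documentation, so omit it otherwise.
        params["meta"] = "true"
    return params


def sql_params(sql: str, limit: int = 100, dialect: str = "duckdb") -> dict:
    """Query parameters for a SQL query against the table named dataset."""
    return {"sql": sql, "dialect": dialect, "limit": str(limit)}


def search_url(file_id: str, params: dict) -> str:
    """Build the full search URL with URL-encoded query parameters."""
    return f"{BASE_URL}/search/{quote(file_id, safe='')}?{urlencode(params)}"


def run_query(file_id: str, params: dict, api_key: str, timeout: float = 30.0):
    """Send the GET request; assumes the endpoint answers with JSON."""
    req = Request(search_url(file_id, params), headers={"X-API-Key": api_key})
    with urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


# Usage (performs a real request, so it is left commented out):
# rows = run_query(
#     "YOUR_FILE_ID",
#     sql_params("SELECT country, COUNT(*) AS total FROM dataset GROUP BY country"),
#     "YOUR_API_KEY",
# )
```

Keeping parameter construction separate from the network call makes it simple to log or inspect the exact URL before sending it.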

Persistent files

Make files reusable for automation

For n8n, Power BI, scheduled jobs, or AI agents, save the file so the same file_id remains available. In the interface, use Save this file. To find saved files later, call:

curl "https://api.parquetreader.com/parquet/files" \
  -H "X-API-Key: YOUR_API_KEY"

Once saved, the owner's API key can list, search, query, export, rename, and manage the file later.

Save this file turns an uploaded dataset into a persistent API data source.
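For scripts, the same listing call can be made from Python. The sketch below is an illustration only: list_files and http_get_json are hypothetical names, and because the shape of the listing response is not documented here, the parsed JSON is returned unchanged rather than interpreted.

```python
import json
from typing import Any, Callable
from urllib.request import Request, urlopen

FILES_URL = "https://api.parquetreader.com/parquet/files"


def http_get_json(url: str, headers: dict, timeout: float = 30.0) -> Any:
    """GET a URL and parse the body as JSON (assumed response format)."""
    req = Request(url, headers=headers)
    with urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def list_files(
    api_key: str,
    get_json: Callable[[str, dict], Any] = http_get_json,
) -> Any:
    """Return the parsed listing of persistent files for this API key.

    get_json can be swapped for a fake in tests or for another HTTP client.
    """
    return get_json(FILES_URL, {"X-API-Key": api_key})


# Usage (performs a real request, so it is left commented out):
# files = list_files("YOUR_API_KEY")
# print(files)  # inspect the structure to locate each file_id
```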

Export a SQL result

Add a JSON body to the process request when you want the export to contain a filtered or transformed result.

{
  "sql_query": "SELECT customer_id, email, created_at FROM dataset WHERE email IS NOT NULL",
  "dialect": "duckdb"
}
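As an illustration, the process request with a SQL body could be assembled like this. The helper name build_sql_export is hypothetical; the endpoint path, the format query parameter, and the sql_query and dialect fields are taken from the examples in this document.

```python
BASE_URL = "https://api.parquetreader.com/parquet"
EXPORT_FORMATS = ("csv", "json", "parquet")


def build_sql_export(
    file_id: str,
    sql_query: str,
    fmt: str = "csv",
    dialect: str = "duckdb",
) -> dict:
    """Return the url, query params, and JSON body for a SQL export job."""
    if fmt not in EXPORT_FORMATS:
        raise ValueError(f"unsupported export format: {fmt!r}")
    return {
        "url": f"{BASE_URL}/file/{file_id}/process",
        "params": {"format": fmt},
        "json": {"sql_query": sql_query, "dialect": dialect},
    }


request_kwargs = build_sql_export(
    "YOUR_PERSISTENT_FILE_ID",
    "SELECT customer_id, email, created_at FROM dataset WHERE email IS NOT NULL",
)

# With the requests library, the keys map directly onto keyword arguments:
# export = requests.post(
#     headers={"X-API-Key": "YOUR_API_KEY"}, timeout=120, **request_kwargs
# )
# job_id = export.json()["job_id"]
```

The job is then polled and downloaded exactly as in the quick start, since only the request body differs.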

Reference

Core endpoints

Method  Path                              Use
GET     /parquet/files                    List persistent files owned by the API key.
POST    /parquet/file/{file_id}/process   Start a CSV, JSON, or Parquet export job.
GET     /parquet/file/{file_id}/status    Poll export status by job_id.
GET     /parquet/file/{file_id}/download  Download the generated export when the job is ready.
GET     /parquet/search/{file_id}         Search, page, inspect schema, or run safe SQL against a saved file.

Supported input formats

ParquetReader accepts .parquet, .csv, .tsv, .json, .jsonl, .ndjson, .xlsx, .xls, .avro, .orc, .feather, .geojson, .nc, .nc4, .netcdf, .h5, .arrow, and .ipc.

Export formats: CSV, JSON, and Parquet.
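A pipeline may want to check a file name before uploading it or starting an export. The following is a small sketch built from the lists above: the helper names are hypothetical, and matching extensions case-insensitively is an assumption about how ParquetReader treats file names.

```python
from pathlib import Path

# Extensions copied from the supported input formats listed above.
SUPPORTED_INPUT_EXTENSIONS = frozenset({
    ".parquet", ".csv", ".tsv", ".json", ".jsonl", ".ndjson",
    ".xlsx", ".xls", ".avro", ".orc", ".feather", ".geojson",
    ".nc", ".nc4", ".netcdf", ".h5", ".arrow", ".ipc",
})

# Values accepted by the format parameter in the export examples.
EXPORT_FORMATS = frozenset({"csv", "json", "parquet"})


def is_supported_input(filename: str) -> bool:
    """Check the final extension against the documented input formats."""
    # Assumption: extension matching is case-insensitive.
    return Path(filename).suffix.lower() in SUPPORTED_INPUT_EXTENSIONS


def is_export_format(fmt: str) -> bool:
    """Check a requested export format against csv, json, and parquet."""
    return fmt.lower() in EXPORT_FORMATS


print(is_supported_input("data/events.parquet"))
print(is_supported_input("notes.txt"))
print(is_export_format("csv"))
```

Only the last extension is inspected, so a name such as archive.csv.gz is reported as unsupported by this check.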