API for paying users
Build data workflows on top of real files.
Use your ParquetReader API key to run search and SQL, export CSV/JSON/Parquet, and reuse saved files for n8n agents, Power BI, dashboards, and backend jobs.
Base URL
https://api.parquetreader.com
Send your API key in the X-API-Key header.
Use dataset as the SQL table name.
Use Save this file in the interface, then copy the file ID from My Files.
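As a rough illustration of how the base URL and the X-API-Key header fit together, here is a minimal sketch that uses only the Python standard library. The helper names `api_headers` and `api_url` are hypothetical and are not part of any ParquetReader client. The resulting values can be handed to whatever HTTP library you use.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.parquetreader.com"


def api_headers(api_key: str) -> dict:
    """Return the header that every request carries (hypothetical helper)."""
    if not api_key:
        raise ValueError("an API key is required")
    return {"X-API-Key": api_key}


def api_url(path: str, **params: str) -> str:
    """Join a path such as /parquet/files onto the base URL.

    Keyword arguments become a URL-encoded query string.
    """
    url = BASE_URL + "/" + path.lstrip("/")
    return f"{url}?{urlencode(params)}" if params else url


if __name__ == "__main__":
    print(api_headers("YOUR_API_KEY"))
    print(api_url("/parquet/files"))
    print(api_url("/parquet/search/YOUR_FILE_ID", q="customer@example.com", limit="25"))
```

With `requests`, for example, these would be passed as `requests.get(api_url("/parquet/files"), headers=api_headers(API_KEY))`.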
Quick start
Export CSV from a saved file
First save your file in ParquetReader with Save this file. Then use the persistent file_id with your API key. Exports are asynchronous: start a job, poll status, then download when ready.
Step 1: Save file
Step 2: Start export
Step 3: Poll status
Step 4: Download
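The Python and curl examples in this section poll until the status reads ready, so a job that never reaches that state would keep the loop running indefinitely. A bounded variant might look like the sketch below. The helper name `wait_until_ready`, its parameters, and the timeout behaviour are illustrative assumptions and not part of the ParquetReader API. The only status value taken from the documentation is `ready`; `fetch_status` stands for whatever function you write to call the status endpoint and return its JSON body.

```python
import time
from typing import Callable


def wait_until_ready(
    fetch_status: Callable[[], dict],
    timeout_s: float = 300.0,
    interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll fetch_status() until its body reports status == "ready".

    Any other status value is treated as "not ready yet" (an assumption,
    since other values are not documented). Raises TimeoutError once
    timeout_s seconds have passed without a ready status.
    """
    deadline = clock() + timeout_s
    while True:
        body = fetch_status()
        if body.get("status") == "ready":
            return body
        if clock() >= deadline:
            raise TimeoutError(f"export not ready after {timeout_s} seconds")
        sleep(interval_s)


if __name__ == "__main__":
    # Simulated responses; "processing" is a made-up placeholder value.
    responses = iter([{"status": "processing"}, {"status": "ready"}])
    print(wait_until_ready(lambda: next(responses), interval_s=0.0))
```

Passing `sleep` and `clock` as parameters keeps the helper easy to test without real waiting.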
Python
import time
import requests

API_KEY = "YOUR_API_KEY"
BASE_URL = "https://api.parquetreader.com/parquet"
FILE_ID = "YOUR_PERSISTENT_FILE_ID"
HEADERS = {"X-API-Key": API_KEY}

# 1) Start export: csv, json, or parquet
export = requests.post(
    f"{BASE_URL}/file/{FILE_ID}/process",
    headers={**HEADERS, "Content-Type": "application/json"},
    params={"format": "csv"},
    json={},
    timeout=120,
)
export.raise_for_status()
job_id = export.json()["job_id"]

# 2) Poll status
while True:
    status = requests.get(
        f"{BASE_URL}/file/{FILE_ID}/status",
        headers=HEADERS,
        params={"job_id": job_id},
        timeout=30,
    )
    status.raise_for_status()
    body = status.json()
    if body["status"] == "ready":
        break
    time.sleep(1)

# 3) Download
download = requests.get(
    f"{BASE_URL}/file/{FILE_ID}/download",
    headers=HEADERS,
    params={"job_id": job_id},
    stream=True,
    timeout=120,
)
download.raise_for_status()
with open("output.csv", "wb") as out:
    for chunk in download.iter_content(chunk_size=1024 * 1024):
        out.write(chunk)

print(f"Downloaded output.csv from file {FILE_ID}")

curl
API_KEY="YOUR_API_KEY"
FILE_ID="YOUR_PERSISTENT_FILE_ID"

# 1) Start an export job for a saved file
JOB_RESPONSE=$(curl -sS -X POST "https://api.parquetreader.com/parquet/file/$FILE_ID/process?format=csv" \
  -H "X-API-Key: $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{}')
JOB_ID=$(echo "$JOB_RESPONSE" | python3 -c 'import json,sys; print(json.load(sys.stdin)["job_id"])')

# 2) Poll until the export is ready
while true; do
  STATUS_RESPONSE=$(curl -sS "https://api.parquetreader.com/parquet/file/$FILE_ID/status?job_id=$JOB_ID" \
    -H "X-API-Key: $API_KEY")
  STATUS=$(echo "$STATUS_RESPONSE" | python3 -c 'import json,sys; print(json.load(sys.stdin)["status"])')
  [ "$STATUS" = "ready" ] && break
  sleep 1
done

# 3) Download the generated CSV
curl -L "https://api.parquetreader.com/parquet/file/$FILE_ID/download?job_id=$JOB_ID" \
  -H "X-API-Key: $API_KEY" \
  -o output.csv

Search and SQL
Inspect data without exporting first
Use search for broad text matching. Use SQL for counts, filters, grouping, and repeatable analysis. Both go through the same /parquet/search/{file_id} endpoint, which is also the endpoint used by the n8n AI Agent workflow.
Search rows
curl -G "https://api.parquetreader.com/parquet/search/YOUR_FILE_ID" \
  -H "X-API-Key: YOUR_API_KEY" \
  --data-urlencode "q=customer@example.com" \
  --data-urlencode "limit=25" \
  --data-urlencode "meta=true"

Run SQL
curl -G "https://api.parquetreader.com/parquet/search/YOUR_FILE_ID" \
  -H "X-API-Key: YOUR_API_KEY" \
  --data-urlencode "sql=SELECT country, COUNT(*) AS total FROM dataset GROUP BY country ORDER BY total DESC" \
  --data-urlencode "dialect=duckdb" \
  --data-urlencode "limit=100"

SQL notes
- Use dataset as the table name.
- Use safe SELECT queries only.
- Set dialect=duckdb when you write SQL directly for ParquetReader.
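The notes above can be turned into a small client-side check before a query is sent. The sketch below is only a guess at what such a check could look like: the function name `build_sql_params` is hypothetical, and the service's own definition of a safe query is not documented here, so the server may accept or reject queries differently. The parameter names `sql`, `dialect`, and `limit` are the ones used in the curl example.

```python
import re


def build_sql_params(sql: str, limit: int = 100) -> dict:
    """Build query parameters for the search endpoint from a SQL string.

    Client-side checks only (assumptions, not the server's rules):
    - the statement must start with SELECT
    - only one statement per request
    - the query should mention the dataset table
    """
    statement = sql.strip().rstrip(";").strip()
    if not re.match(r"select\b", statement, flags=re.IGNORECASE):
        raise ValueError("only SELECT statements are expected")
    # Naive check: a semicolon inside a string literal is also rejected.
    if ";" in statement:
        raise ValueError("send one statement per request")
    if not re.search(r"\bdataset\b", statement, flags=re.IGNORECASE):
        raise ValueError("the query should read from the table named dataset")
    return {"sql": statement, "dialect": "duckdb", "limit": str(limit)}


if __name__ == "__main__":
    print(build_sql_params("SELECT country, COUNT(*) AS total FROM dataset GROUP BY country;"))
```

The returned dictionary can be passed as the query parameters of a GET request to /parquet/search/{file_id}.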
Persistent files
Make files reusable for automation
For n8n, Power BI, scheduled jobs, or AI agents, save the file so the same file_id remains available. In the interface, use Save this file. To find saved files later, call:
curl "https://api.parquetreader.com/parquet/files" \
  -H "X-API-Key: YOUR_API_KEY"
Once a file is saved, the owner's API key can list, search, query, export, rename, and manage it later.
Export a SQL result
Add a JSON body to the process request when you want the export to contain a filtered or transformed result.
{
  "sql_query": "SELECT customer_id, email, created_at FROM dataset WHERE email IS NOT NULL",
  "dialect": "duckdb"
}

Reference
Core endpoints
| Method | Path | Use |
|---|---|---|
| GET | /parquet/files | List persistent files owned by the API key. |
| POST | /parquet/file/{file_id}/process | Start a CSV, JSON, or Parquet export job. |
| GET | /parquet/file/{file_id}/status | Poll export status by job_id. |
| GET | /parquet/file/{file_id}/download | Download the generated export when the job is ready. |
| GET | /parquet/search/{file_id} | Search, page, inspect schema, or run safe SQL against a saved file. |
Supported input formats
ParquetReader accepts .parquet, .csv, .tsv, .json, .jsonl, .ndjson, .xlsx, .xls, .avro, .orc, .feather, .geojson, .nc, .nc4, .netcdf, .h5, .arrow, and .ipc.
Export formats: CSV, JSON, and Parquet.
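If an automation picks up files from a folder before uploading them, it may help to filter by extension first. The sketch below checks a filename against the input formats listed above. The function name `is_supported_input` is hypothetical, and the check looks only at the final extension, so it says nothing about how compressed or multi-extension names are treated by the service.

```python
from pathlib import PurePath

# Extensions copied from the supported input formats listed above.
SUPPORTED_INPUT_EXTENSIONS = frozenset({
    ".parquet", ".csv", ".tsv", ".json", ".jsonl", ".ndjson",
    ".xlsx", ".xls", ".avro", ".orc", ".feather", ".geojson",
    ".nc", ".nc4", ".netcdf", ".h5", ".arrow", ".ipc",
})


def is_supported_input(filename: str) -> bool:
    """Return True if the file's final extension is in the supported list."""
    return PurePath(filename).suffix.lower() in SUPPORTED_INPUT_EXTENSIONS


if __name__ == "__main__":
    for name in ("events.ndjson", "Sales.PARQUET", "notes.txt"):
        print(name, is_supported_input(name))
```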
