Supported Queries
pg_mooncake translates PostgreSQL queries into DuckDB queries behind the scenes. This allows you to write standard PostgreSQL syntax while benefiting from DuckDB's vectorized execution engine for analytics.
Analytical Queries​
Columnstore tables excel at analytical workloads that process large volumes of data. We typically see a 1000x speed up for these queries on pg_mooncake columnstore over regular postgres tables.
Aggregations​
-- Sum of sales by region and quarter
SELECT
region,
date_trunc('quarter', sale_date) AS quarter,
SUM(amount) AS total_sales,
COUNT(*) AS num_transactions,
AVG(amount) AS avg_sale
FROM sales_columnstore
WHERE sale_date BETWEEN '2023-01-01' AND '2023-12-31'
GROUP BY region, quarter
ORDER BY region, quarter;
Filtering Large Datasets​
-- Find high-value customers
SELECT
customer_id,
SUM(amount) AS total_spent
FROM sales_columnstore
WHERE sale_date >= CURRENT_DATE - INTERVAL '1 year'
GROUP BY customer_id
HAVING SUM(amount) > 10000
ORDER BY total_spent DESC;
Time-Series Analysis​
SELECT
date_trunc('day', event_time) AS day,
COUNT(*) AS num_events,
SUM(CASE WHEN event_type = 'purchase' THEN 1 ELSE 0 END) AS purchases
FROM events_columnstore
WHERE event_time >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY day
ORDER BY day;
Operational Queries (Not Recommended)​
While pg_mooncake supports point lookups and operational queries, these are not its strength. For these types of queries, regular PostgreSQL tables (rowstore) will typically perform better.
-- Point lookup (better on rowstore tables)
SELECT *
FROM users_columnstore
WHERE user_id = 12345;
-- Small range scan (better on rowstore tables)
SELECT *
FROM orders_columnstore
WHERE order_id BETWEEN 1000 AND 1010;
Query Translation​
pg_mooncake automatically translates PostgreSQL queries to DuckDB queries. While most standard SQL is supported, there are some PostgreSQL-specific features that may not translate directly.
Unsupported Postgres Functions​
Some PostgreSQL-specific functions are not supported when querying columnstore tables. This is because the underlying query execution is handled by DuckDB, which has a different set of built-in functions.
For example, the following PostgreSQL functions are not supported on columnstore tables:
to_char()
- DuckDB uses different formatting functionsregexp_replace()
- DuckDB has its own regex implementationarray_agg()
- Array handling differs between PostgreSQL and DuckDBjsonb_*
functions - JSON handling is implemented differently
When using these functions in queries involving columnstore tables, you may encounter errors. For more details on compatibility, refer to the pg_duckdb issue #615.
DuckDB Functions in Postgres​
This can be a good workaround for some of the unsupported Postgres functions. You can expose DuckDB functions within Postgres and use them to query columnstore tables. For example, to create a time bucketing function:
CREATE OR REPLACE FUNCTION time_bucket(interval, timestamp)
RETURNS timestamp AS $$
BEGIN
RAISE EXCEPTION 'this is a placeholder for columnstore';
END;
$$ LANGUAGE plpgsql;
You can call this time_bucket function with:
```sql
SELECT DISTINCT ON (user_id, five_min_bucket)
user_properties ->> 'id' AS user_id,
time_bucket(INTERVAL '5 minutes', created_at) AS five_min_bucket,
properties ->> 'search_query' AS search_query,
created_at
FROM events_view
WHERE
organization_id = 'org_JfjtEvzbwOikUEUn'
AND event_type = 'search'
AND type IN ('search_query_response_received', 'search_query_response')
ORDER BY user_id, five_min_bucket, created_at DESC;
Performance Considerations​
- Columnstore tables perform best when scanning many rows but few columns
- For mixed workloads, consider using both table types: columnstore for analytics and rowstore for operational queries