Skip to main content
Version: 0.1

Columnstore DDL

Overview​

Columnstore tables in pg_mooncake work just like regular PostgreSQL tables, but with the performance benefits of columnar storage, and fast analytics. The key difference is that you need to add the USING columnstore clause when creating the table.

Create Columnstore Table​

To create a columnstore table, use the standard PostgreSQL CREATE TABLE syntax with the USING columnstore clause:

CREATE TABLE users_columnstore (
id INT,
name TEXT,
created_at TIMESTAMP
) USING columnstore;

You can also create a columnstore table from an existing rowstore table:

CREATE TABLE users_columnstore USING columnstore AS 
SELECT * FROM users_rowstore;

Drop Columnstore Table​

DROP TABLE users_columnstore;

Supported Features​

Columnstore tables support most PostgreSQL features, including:

  • All standard DML operations (INSERT, UPDATE, DELETE)
  • Transactions and ACID compliance
  • NOT NULL constraints and default values
  • Joins between columnstore tables, and between columnstore and rowstore tables.

Unsupported Data Types​

While most PostgreSQL data types are supported, there are a few limitations:

  • Unsupported Data Types:

    • Enums
    • uhugeint
    • JSONB (support is in progress)
    • Arrays (limited support for some types)
  • Unsupported Constraints:

    • Primary keys
    • Foreign keys
    • Unique constraints
    • Indexes

There may also be some precision differences when dealing with Postgres numeric types, and not all data types support comparisons to literals in queries yet.

Partitioned Tables​

pg_mooncake does not currently support partitioned tables. However, partitioning is usually not required to get good performance with columnstore tables.

This is because data is naturally 'segmented' into data files, and pg_mooncake automatically performs file skipping (segment elimination) at query time. This provides many of the same benefits as partitioning without the added complexity.

Schema Modifications​

Currently, you cannot alter a columnstore table schema after creation. If you need to change the schema, you'll need to:

  1. Create a new table with the desired schema
  2. Copy the data from the old table to the new one
  3. Drop the old table
-- Example of recreating a table with a modified schema
CREATE TABLE users_columnstore_new (
id INT,
name TEXT,
email TEXT, -- New column
created_at TIMESTAMP
) USING columnstore;

INSERT INTO users_columnstore_new (id, name, created_at)
SELECT id, name, created_at FROM users_columnstore;

DROP TABLE users_columnstore;

Setting Columnstore as Default Table Access Method​

PostgreSQL allows you to set a default table access method using the default_table_access_method configuration parameter. This means you can make all new tables use columnstore storage by default, without having to specify USING columnstore in each CREATE TABLE statement.

To set columnstore as the default table access method for your current session:

SET default_table_access_method = 'columnstore';
-- This will create a columnstore table
CREATE TABLE users (
id INT,
name TEXT,
created_at TIMESTAMP
);

You can still create regular heap tables when needed by explicitly specifying the access method:

CREATE TABLE users_heap (
id INT,
name TEXT,
created_at TIMESTAMP
) USING heap;

Performance Considerations​

  • Columnstore tables excel at analytical queries that scan large amounts of data
  • They're ideal for queries that select a subset of columns from wide tables
  • For best performance, batch your writes into larger transactions
  • Each write operation creates a new Parquet file, with ideal batch sizes between 100,000 and 1,000,000 rows