Useful Postgres features you might not know about
Postgres features that may fly under the radar.
Add descriptions to tables and columns
COMMENT ON
can be used to add comments to tables, columns, indexes, and other database entities.
COMMENT ON TABLE users IS
'This table is deprecated. Please use the users_v2 table instead.';
COMMENT ON COLUMN users.name IS 'The full name of the user';
SQL clients can pick up on these comments, and display them whenever you're browsing the database schema. In psql
, you can view tables with comments using \d+
Comments are stored in the pg_catalog.pg_description
table.
COMMENT ON
is supported in all versions of Postgres (docs).
Count non-null and null arguments
num_nonnulls(...)
counts the number of provided arguments that are not null.
SELECT num_nonnulls('a', 'b', null);
-- Returns 2
num_nulls(...)
counts the number of provided arguments that are null.
SELECT num_nulls('a', 'b', null);
-- Returns 1
This is useful in a check constraint, if you need to ensure that exactly some number of columns are provided
CREATE TABLE purchase_order (
id UUID PRIMARY KEY,
contact_email TEXT NULL,
contact_phone TEXT NULL
);
-- If we want to ensure that at least one of email or phone is provided
-- when making a purchase order, we can add a check constraint using
-- `num_nonnulls`
ALTER TABLE purchase_order ADD CONSTRAINT contact_info_check CHECK (
num_nonnulls(contact_email, contact_phone) >= 1
);
num_nonnulls
and num_nulls
are available in Postgres 9.6+ (docs)
Add different filters to different aggregations
In Postgres, if you wanted to perform two or more aggregations with different filters, you might be able to use the FILTER
clause.
SELECT
count(*) as user_count,
count(*) filter (where verified = true) as verified_user_count,
count(*) filter (where verified = false) as unverified_user_count
FROM users;
is the equivalent of:
SELECT
count(*) as user_count,
sum(case when verified = true then 1 else 0 end) as verified_user_count,
sum(case when verified = false then 1 else 0 end) as unverified_user_count
FROM users;
except that a) it reads more naturally, and b) you don't need to write a complicated case
expression.
FILTER
is supported in Postgres 9.4 and above (docs).
Duplicate a table's schema
If you want to create a table with the same columns as an existing table, there is a shorthand for this: CREATE TABLE ... LIKE
.
CREATE TABLE users (
id uuid PRIMARY KEY,
name text NOT NULL,
email text NOT NULL,
created_at timestamptz NOT NULL DEFAULT now()
);
-- This creates a users_history table with the same columns as the users table,
-- but with additional columns
CREATE TABLE users_history (
LIKE users INCLUDING ALL,
modified_at timestamptz NOT NULL,
modification_type text NOT NULL
);
You can optionally add INCLUDING
and EXCLUDING
clauses to control what else gets copied over.
I've written about this in more detail in a separate article
CREATE TABLE ... LIKE
is supported in Postgres 7.4 and newer (docs)
Bucket timestamps into intervals
The date_bin
function can be used to bin a timestamp to some given interval, aligned to the specified origin.
date_bin(stride, source, origin)
takes three arguments:
stride
- The length of the interval to align to. This is of typeinterval
.source
- The timestamp to align. This is of typetimestamp
ortimestamptz
.origin
- The start of the first interval. This is of typetimestamp
ortimestamptz
.
-- This returns the start of the nearest 10-minute interval that each event_time
-- belongs to, starting from 13:02:00.
SELECT
event_time,
date_bin(
'10 minutes', -- Interval length
event_time, -- Timestamp to align
TIMESTAMP '2024-12-02 13:02:00' -- Start of the first interval
)
FROM
(VALUES
(TIMESTAMP '2024-12-02 14:24:00'),
(TIMESTAMP '2024-12-02 14:26:00'),
(TIMESTAMP '2024-12-02 14:35:12'),
(TIMESTAMP '2024-12-02 14:37:12'),
(TIMESTAMP '2024-12-02 14:45:45')
) AS t(event_time);
/*
event_time | date_bin
---------------------+---------------------
2024-12-02 14:24:00 | 2024-12-02 14:22:00
2024-12-02 14:26:00 | 2024-12-02 14:22:00
2024-12-02 14:35:12 | 2024-12-02 14:32:00
2024-12-02 14:37:12 | 2024-12-02 14:32:00
2024-12-02 14:45:45 | 2024-12-02 14:42:00
*/
This is very useful when building histograms from time-series data.
The difference between date_bin
and the more traditionally-used date_trunc
is that date_bin
lets you align timestamps to any arbitrary interval.
date_bin
is available in Postgres 14+ (docs)
Update [5 Dec 2024]
The title and subtitles of the article were tweaked for clarity.
Like this article? Follow me on Bluesky or subscribe to the RSS feed to get notified about new articles.