The Error
ERROR: column "orders.customer_name" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: SELECT customer_name, status, COUNT(*) FROM orders GROUP BY status
^
Spot the problem: SELECT asks for customer_name, but GROUP BY only lists status. One status group might contain 50 rows with 50 different customer names. PostgreSQL has no idea which one to return โ so it refuses.
Coming from MySQL? This might catch you off guard. MySQL silently picks an arbitrary row's value, which is usually wrong. PostgreSQL rejects the query outright. That's not a bug โ it's correct SQL standard behavior.
Typical Query That Causes This
-- This fails:
SELECT customer_name, status, COUNT(*)
FROM orders
GROUP BY status;
-- ERROR: column "orders.customer_name" must appear in the GROUP BY clause...
You're grouping by status but selecting customer_name. Each status group holds many different customer names โ PostgreSQL won't pick one for you.
Step-by-Step Fixes
Fix 1: Add the Missing Column to GROUP BY
The simplest fix. Every non-aggregated column in your SELECT must also appear in GROUP BY.
-- Broken:
SELECT customer_name, status, COUNT(*)
FROM orders
GROUP BY status;
-- Fixed:
SELECT customer_name, status, COUNT(*)
FROM orders
GROUP BY customer_name, status;
Use this when grouping by both columns together is exactly what your report needs.
Fix 2: Wrap the Column in an Aggregate Function
Need just one value from the group โ not all of them? Use MIN(), MAX(), or string_agg().
-- Pick any single customer name per status group:
SELECT MIN(customer_name) AS customer_name, status, COUNT(*)
FROM orders
GROUP BY status;
-- Collect all customer names as a comma-separated list:
SELECT status,
COUNT(*) AS order_count,
string_agg(DISTINCT customer_name, ', ') AS customers
FROM orders
GROUP BY status;
-- Or as an array:
SELECT status, COUNT(*), array_agg(DISTINCT customer_name) AS customers
FROM orders
GROUP BY status;
Fix 3: Use DISTINCT ON (PostgreSQL-specific)
DISTINCT ON keeps one row per partition after sorting. No subquery, no CTE.
-- Get the most recent order per status:
SELECT DISTINCT ON (status)
customer_name, status, order_amount, created_at
FROM orders
ORDER BY status, created_at DESC;
The ORDER BY decides which row wins for each group. Here: the latest created_at comes out on top.
Fix 4: Use a CTE with ROW_NUMBER()
Want the full row matching an aggregate condition โ say, the biggest order per status? Window functions handle this cleanly.
-- Get the highest-value order per status:
WITH ranked AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY status ORDER BY order_amount DESC) AS rn
FROM orders
)
SELECT customer_name, status, order_amount
FROM ranked
WHERE rn = 1;
Full Working Example
-- Setup:
CREATE TABLE orders (
id SERIAL PRIMARY KEY,
customer_name TEXT,
status TEXT,
order_amount NUMERIC,
created_at TIMESTAMP DEFAULT NOW()
);
INSERT INTO orders (customer_name, status, order_amount) VALUES
('Alice', 'pending', 150.00),
('Bob', 'pending', 200.00),
('Carol', 'shipped', 350.00),
('Dave', 'shipped', 100.00);
-- BROKEN โ customer_name not in GROUP BY:
SELECT customer_name, status, COUNT(*), SUM(order_amount)
FROM orders
GROUP BY status;
-- ERROR: column "orders.customer_name" must appear in the GROUP BY clause...
-- OPTION A: Group by both columns:
SELECT customer_name, status, SUM(order_amount)
FROM orders
GROUP BY customer_name, status;
-- customer_name | status | sum
-- ---------------+---------+--------
-- Alice | pending | 150.00
-- Bob | pending | 200.00
-- Carol | shipped | 350.00
-- Dave | shipped | 100.00
-- OPTION B: Summary by status only โ drop customer_name:
SELECT status, COUNT(*) AS orders, SUM(order_amount) AS total
FROM orders
GROUP BY status;
-- status | orders | total
-- ---------+--------+--------
-- pending | 2 | 350.00
-- shipped | 2 | 450.00
-- OPTION C: Stats + all customers per status:
SELECT status,
COUNT(*),
string_agg(customer_name, ', ') AS customers
FROM orders
GROUP BY status;
-- status | count | customers
-- ---------+-------+---------------
-- pending | 2 | Alice, Bob
-- shipped | 2 | Carol, Dave
Verify the Fix
Paste your corrected query into psql and make sure it returns rows, not an error:
-- Should return rows, not an error:
SELECT status, COUNT(*), SUM(order_amount)
FROM orders
GROUP BY status;
-- status | count | sum
-- ---------+-------+--------
-- pending | 2 | 350.00
-- shipped | 2 | 450.00
-- (2 rows)
Testing DISTINCT ON? Check that you get exactly one row per unique status value:
SELECT DISTINCT ON (status) status, customer_name, created_at
FROM orders
ORDER BY status, created_at DESC;
-- You should see one row per unique status value.
Which Fix Should You Use?
- Add to GROUP BY โ when grouping by that column too is exactly what you want (most common fix)
- MIN / MAX / string_agg / array_agg โ when you need a summary or combined value from non-grouped columns
- DISTINCT ON โ when you need one row per group and can define an ORDER BY to pick the winner
- ROW_NUMBER() in a CTE โ when you need the full row matching an aggregate condition, like the highest sale per region
One Extra Gotcha: Aliases in GROUP BY
PostgreSQL rejects column aliases in GROUP BY. Many developers hit this on their first day with date extraction:
-- This fails โ alias in GROUP BY:
SELECT EXTRACT(YEAR FROM created_at) AS year, COUNT(*)
FROM orders
GROUP BY year;
-- ERROR: column "year" does not exist
-- Fix: repeat the full expression:
SELECT EXTRACT(YEAR FROM created_at) AS year, COUNT(*)
FROM orders
GROUP BY EXTRACT(YEAR FROM created_at);
-- Or use the column position (1-indexed):
SELECT EXTRACT(YEAR FROM created_at) AS year, COUNT(*)
FROM orders
GROUP BY 1;

