Database Design for Non-DBAs: A Practical Technical Guide

The database is the part of an application you can least afford to get wrong, because it is the hardest thing to change later. You can redesign a screen in an afternoon, but reshaping a schema that already holds a year of live data is slow, risky work. The good news is that you do not need to be a database administrator to get it roughly right. A handful of solid principles will keep your data clean, your queries fast, and your future self grateful, and avoiding the common mistakes is most of the battle.

This guide covers those principles in plain terms, aimed at developers and technical founders who are building on a database rather than specializing in one. The examples use Postgres, which is the sensible default for most applications.

Your schema is the foundation

Everything your app does sits on top of the data model, so a little care here pays off for the life of the project. The aim is not a perfect, theoretical design, it is a clear one that represents the real things in your business and the relationships between them, and that the database itself can keep honest. Spend a bit of thought up front, because this is the layer where mistakes compound and corrections hurt.

Model the real things and their relationships

Start by naming the real entities, a customer, an order, a product, and making each its own table where a row represents exactly one of that thing. Then capture how they relate. An order belongs to one customer but a customer has many orders, that is a one-to-many relationship, expressed with a foreign key on the order. When two things relate many-to-many, like orders and products, you introduce a join table that links them. Getting these relationships right is most of good design, because they mirror how the business actually works.

Choose sensible keys and types

Give every table a primary key, a stable unique identifier, with UUIDs being a good default when ids should not be guessable or when data merges across systems. Connect tables with foreign keys so relationships are explicit. And use the right column type for each value, which prevents a whole class of bugs: store money as integer cents rather than floating point to avoid rounding errors, use timestamp-with-timezone for times, and mark columns not-null when a value is genuinely required.

create table orders (
  id           uuid primary key default gen_random_uuid(),
  user_id      uuid not null references users(id),          -- the relationship, enforced
  amount_cents integer not null check (amount_cents >= 0),  -- money as integer, never float
  status       text not null default 'pending',
  created_at   timestamptz not null default now()
);
create index on orders (user_id);   -- you look up orders by user, so index it

Let the database enforce integrity

Your application code is not the only thing writing to the database over its life, scripts, migrations, and future features all do, so do not rely on app code alone to keep data valid. Push the rules into the database with constraints: foreign keys so an order cannot reference a customer who does not exist, unique constraints so you cannot register the same email twice, not-null and check constraints so impossible values are rejected outright. The database is your last line of defense, and a constraint catches the bad write that a forgotten code path would have let through.

Normalize first, denormalize only when measured

The default discipline is normalization: each fact lives in exactly one place, so a customer's address is stored once and referenced, not copied into every order. This keeps data consistent, because there is only one copy to update. Duplicating data for speed (denormalization) is sometimes worth it, but only when you have measured a real performance problem, never as a starting assumption. Premature duplication is how databases drift into contradicting themselves.

Index for how you actually query

An index makes lookups fast, and the rule is to index the columns you filter, join, and sort by. If you constantly fetch orders by customer, index that column. But indexes are not free, they cost space and slow writes, so add them deliberately rather than on everything. Also watch for the N+1 pattern, where code fetches a list and then queries the database once per item in a loop, which quietly kills performance, and fetch related data in a single query instead.

Change the schema safely with migrations

Your schema will evolve, and how you change it matters as much as the design. Never edit a production database by hand. Keep every change as a versioned migration committed to git, so the schema's history is recorded and reproducible across environments. For changes on live data, work in safe steps, the expand-then-contract pattern: add the new shape without breaking the old, migrate the data, and only then remove the old shape.

-- safe change in steps, so nothing breaks mid-deploy
alter table orders add column currency text;            -- 1. expand: add nullable
-- ... backfill currency for existing rows ...
alter table orders alter column currency set not null;  -- 2. contract: enforce, once safe

Plan for growth, audit, and backups

Build in a little foresight. Put created_at and updated_at on your tables so you can reason about history, consider soft deletes or an audit trail where you need to know what changed, and make sure the database is backed up with restores you have actually tested. A clean schema is worth little if you cannot recover it, so treat backups as part of the design, not an afterthought.

A schema design checklist

Model real entities as tables, one row per thing, with relationships via foreign keys.
Use stable primary keys, and correct types: integer cents for money, timestamptz for time.
Enforce integrity in the database with foreign-key, unique, not-null, and check constraints.
Normalize so each fact lives in one place, and denormalize only when you have measured a need.
Index the columns you filter, join, and sort by, and avoid N+1 query loops.
Keep schema changes as versioned migrations in git, never edit production by hand.
Use expand-then-contract for changes on live data.
Add timestamps, consider an audit trail, and keep tested backups.

FAQ

Do I need to be a database expert to design a good schema?

No. You need a handful of solid principles, not deep specialization. Model the real entities and their relationships, choose sensible keys and types, let the database enforce integrity with constraints, normalize so each fact lives once, and index for how you query. Avoiding the common mistakes gets you most of the way, and because the database is the hardest layer to change later, that modest upfront care pays off for the whole life of the project.

Why should I store money as integers instead of decimals?

Because floating-point types cannot represent many decimal values exactly, which leads to rounding errors that are unacceptable for money. The standard fix is to store amounts as integers in the smallest unit, cents for dollars, and format them for display. This keeps arithmetic exact and avoids the classic bug where totals are off by a cent. Pair it with explicit currency handling, and your financial data stays trustworthy.

Should I enforce rules in my code or in the database?

Both, but never rely on code alone. Application code is one of several things that write to the database over its life, alongside scripts, migrations, and future features, so rules enforced only in code can be bypassed. Put the hard guarantees in the database with foreign-key, unique, not-null, and check constraints, so invalid data is rejected no matter what wrote it. The database is your last line of defense, and constraints catch what a forgotten code path would miss.

What is normalization and do I always need it?

Normalization means each fact is stored in exactly one place and referenced elsewhere, rather than copied, so data stays consistent because there is only one copy to update. It should be your default. Denormalization, deliberately duplicating data for speed, is sometimes worth it, but only after you have measured a real performance problem, never as a starting assumption. Duplicating data prematurely is a common way databases end up contradicting themselves.

How do I change a database schema without breaking things?

Use versioned migrations kept in git, never hand-edit production, and for live data change the schema in safe steps. The expand-then-contract pattern is the key: add the new column or table without removing the old, backfill and migrate the data, deploy code that uses the new shape, and only then drop the old one. Doing it in stages means the application keeps working at every point of the change instead of breaking mid-deploy.

If you are building something and want a data model that will not fight you in a year, tell me what your app needs to track and I will help you get the schema right from the start.

Database Design for Non-DBAs: A Practical Technical Guide

Your schema is the foundation

Model the real things and their relationships

Choose sensible keys and types

Let the database enforce integrity

Normalize first, denormalize only when measured

Index for how you actually query

Change the schema safely with migrations

Plan for growth, audit, and backups

A schema design checklist

FAQ

Do I need to be a database expert to design a good schema?

Why should I store money as integers instead of decimals?

Should I enforce rules in my code or in the database?

What is normalization and do I always need it?

How do I change a database schema without breaking things?

Want a hand applying this?

Go deeper

You Don't Have to Build It All at Once: Custom Software, Step by Step

7 Signs You've Outgrown Spreadsheets (and Need Custom Software)

What Custom Software Actually Costs (and What Drives the Price)

API Integrations: Why Connecting Your Stack Beats Copy-Paste

From Idea to MVP: A Founder's Guide to Shipping Fast

Build vs Buy: A Practical Decision Framework for SMBs