Database Design Anti-Patterns: Common Mistakes and How to Avoid Them
Structural flaws introduced during database design propagate through every layer of an application — degrading query performance, undermining data integrity, and compounding the cost of future schema changes. This page maps the recognized anti-patterns in relational and non-relational database design, their mechanical causes, the operational contexts in which they surface, and the structural criteria used to distinguish acceptable tradeoffs from genuine design failures. The scope covers professional database design practice within enterprise and public-sector technology environments across the United States.
Definition and scope
A database design anti-pattern is a recurring structural or modeling decision that appears locally reasonable but produces systemic negative consequences across the database lifecycle: reduced maintainability, query inefficiency, integrity violations, or scaling failures. The term draws from the broader software engineering catalog of anti-patterns, including those documented in Ward Cunningham's Portland Pattern Repository and subsequently codified in applied database engineering texts such as Bill Karwin's SQL Antipatterns (Pragmatic Bookshelf, 2010), which identifies 24 relational anti-patterns grouped into logical database design, physical database design, query, and application development categories.
Anti-patterns are distinct from deliberate tradeoffs. Normalization and denormalization decisions, for example, involve accepted performance-versus-integrity tradeoffs backed by documented reasoning. An anti-pattern, by contrast, typically lacks a corresponding benefit that justifies its cost — it is a mistake with disguised short-term appeal rather than an engineered compromise.
The scope of database design anti-patterns spans three primary domains:
- Schema structure — how tables, columns, relationships, and constraints are modeled
- Indexing strategy — how indexes are created, maintained, and queried
- Query design — how SQL or query language constructs interact with the underlying storage engine
The database schema design domain carries the highest long-term risk, because structural flaws baked into a schema require costly database migration operations to correct after production data has accumulated.
How it works
Database anti-patterns typically emerge from one of three root causes: inadequate domain modeling before schema construction, pressure to deliver working prototypes without normalization, or misapplication of patterns from one database category to another (for instance, applying relational design assumptions to NoSQL database systems).
The mechanical failure chain follows a consistent structure:
- Incorrect abstraction — a real-world entity or relationship is modeled with a shortcut structure (e.g., encoding multiple values in a single column as a comma-separated string rather than a child table)
- Short-term functionality — the shortcut satisfies immediate application requirements and passes initial testing
- Query complexity accumulation — as data volume grows, queries must parse or reconstruct what the schema should have encoded structurally
- Integrity degradation — without relational constraints enforcing the implicit structure, data quality erodes
- Refactoring cost escalation — the schema becomes load-bearing for application logic, making correction prohibitively expensive
The Entity-Attribute-Value (EAV) pattern illustrates this chain. EAV stores attribute names and values as rows rather than columns, achieving schema flexibility at the cost of type enforcement, query readability, and join performance. The entity-relationship modeling discipline, introduced by Peter Chen in 1976 and implemented through the constraint features of the ANSI/ISO SQL standard family, provides structured alternatives. EAV persists anyway because it appears to solve extensibility problems at design time while creating query and integrity problems at operation time.
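The EAV costs described above can be sketched with Python's built-in sqlite3 module. The table and column names (product_attr, product, weight_kg) are hypothetical, chosen only for illustration. SQLite is loosely typed by default, so a CHECK constraint stands in here for the stricter type enforcement other engines provide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical EAV table: every attribute is a row, every value is text.
    CREATE TABLE product_attr (
        product_id INTEGER NOT NULL,
        attr_name  TEXT    NOT NULL,
        attr_value TEXT
    );
    -- Conventional alternative: one constrained column per attribute.
    CREATE TABLE product (
        product_id INTEGER PRIMARY KEY,
        name       TEXT    NOT NULL,
        weight_kg  REAL    NOT NULL CHECK (weight_kg > 0)
    );
""")

# EAV accepts a misspelled attribute name and a non-numeric weight without complaint.
conn.executemany(
    "INSERT INTO product_attr VALUES (?, ?, ?)",
    [(1, "name", "Widget"), (1, "wieght_kg", "heavy")],
)

# Rebuilding one logical row needs one conditional aggregate per attribute.
eav_row = conn.execute("""
    SELECT product_id,
           MAX(CASE WHEN attr_name = 'name'      THEN attr_value END) AS name,
           MAX(CASE WHEN attr_name = 'weight_kg' THEN attr_value END) AS weight_kg
    FROM product_attr
    GROUP BY product_id
""").fetchone()

# The constrained table rejects an out-of-range value at write time.
try:
    conn.execute("INSERT INTO product VALUES (1, 'Widget', -5)")
    typed_insert_rejected = False
except sqlite3.IntegrityError:
    typed_insert_rejected = True

print(eav_row, typed_insert_rejected)
```

In this sketch the misspelled attribute is stored silently, so the reconstructed row comes back with no weight at all, while the conventional table refuses the bad write outright.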
Database indexing anti-patterns follow a parallel failure chain: over-indexing increases write latency and storage overhead; under-indexing forces full table scans on high-frequency query paths. Schema inspection shows which indexes exist but not whether they match the workload, so confirming either failure requires execution plan analysis, covered under database query optimization and database performance tuning.
Common scenarios
The following anti-patterns appear with high frequency in production database systems audited by database administrators and reviewed in professional literature:
Jaywalking (comma-separated lists in columns)
Storing foreign-key references or tag values as delimited strings in a single column violates First Normal Form. It prevents the use of referential integrity constraints, requires string parsing in every consuming query, and breaks aggregate operations. The correct structure is an intersection table.
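A minimal sketch of the intersection-table structure, using Python's built-in sqlite3 module. The article, tag, and article_tag names are hypothetical. Note that SQLite enforces foreign keys only after PRAGMA foreign_keys is switched on; most other engines enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE tag     (tag_id INTEGER PRIMARY KEY, label TEXT NOT NULL UNIQUE);
    CREATE TABLE article (article_id INTEGER PRIMARY KEY, title TEXT NOT NULL);
    -- Intersection table: one row per (article, tag) pair, both sides constrained.
    CREATE TABLE article_tag (
        article_id INTEGER NOT NULL REFERENCES article(article_id),
        tag_id     INTEGER NOT NULL REFERENCES tag(tag_id),
        PRIMARY KEY (article_id, tag_id)
    );
    INSERT INTO tag VALUES (1, 'sql'), (2, 'design');
    INSERT INTO article VALUES (10, 'Indexes'), (11, 'Schemas');
    INSERT INTO article_tag VALUES (10, 1), (11, 1), (11, 2);
""")

# Aggregates work directly, with no string parsing.
tag_counts = conn.execute("""
    SELECT t.label, COUNT(*)
    FROM article_tag link
    JOIN tag t ON t.tag_id = link.tag_id
    GROUP BY t.label
    ORDER BY t.label
""").fetchall()

# A reference to a nonexistent tag is rejected by the engine itself.
try:
    conn.execute("INSERT INTO article_tag VALUES (10, 99)")
    dangling_rejected = False
except sqlite3.IntegrityError:
    dangling_rejected = True

print(tag_counts, dangling_rejected)
```

With a delimited string column, neither the per-tag count nor the rejected dangling reference would be available without custom parsing in every consumer.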
Polymorphic associations
A single foreign-key column that references rows in multiple unrelated tables — disambiguated by a type column — cannot carry a referential integrity constraint in standard SQL. This design is documented in Karwin's SQL Antipatterns as producing unmaintainable join logic and silent integrity failures.
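The sketch below contrasts the polymorphic form with one commonly cited remedy: a separate nullable foreign key per parent table plus a CHECK that exactly one is set. It uses Python's built-in sqlite3 module, and all table names (bug, feature, comment_poly, comment_arc) are hypothetical. Other remedies exist, such as a shared supertable, and may fit better when there are many parent types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE bug     (bug_id     INTEGER PRIMARY KEY);
    CREATE TABLE feature (feature_id INTEGER PRIMARY KEY);

    -- Polymorphic form: target_id may point at either table, so no FK can be declared.
    CREATE TABLE comment_poly (
        comment_id  INTEGER PRIMARY KEY,
        target_type TEXT    NOT NULL,
        target_id   INTEGER NOT NULL
    );

    -- Alternative: one real FK per parent, plus a CHECK that exactly one is set.
    CREATE TABLE comment_arc (
        comment_id INTEGER PRIMARY KEY,
        bug_id     INTEGER REFERENCES bug(bug_id),
        feature_id INTEGER REFERENCES feature(feature_id),
        CHECK ((bug_id IS NULL) <> (feature_id IS NULL))
    );
    INSERT INTO bug VALUES (1);
""")

# The polymorphic table silently accepts a reference to a bug that does not exist.
conn.execute("INSERT INTO comment_poly VALUES (1, 'bug', 999)")
orphan_count = conn.execute("""
    SELECT COUNT(*)
    FROM comment_poly c
    LEFT JOIN bug b ON b.bug_id = c.target_id
    WHERE c.target_type = 'bug' AND b.bug_id IS NULL
""").fetchone()[0]

def try_insert(sql):
    """Return True if the engine accepts the statement, False if a constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

valid    = try_insert("INSERT INTO comment_arc (bug_id) VALUES (1)")
dangling = try_insert("INSERT INTO comment_arc (bug_id) VALUES (999)")
neither  = try_insert("INSERT INTO comment_arc (comment_id) VALUES (50)")

print(orphan_count, valid, dangling, neither)
```

The trade-off is one extra column per parent type, which is the price of letting the engine, rather than application code, reject orphaned rows.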
God table / wide table
A single table accumulating 80 or more columns that represent multiple conceptually distinct entities trades sound modeling for perceived query simplicity. Database administrators frequently inherit god tables from prototype systems promoted to production without formal schema review.
Missing or misaligned primary keys
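Tables without declared primary keys, or with surrogate keys that permit duplicate logical entities, undermine data integrity and constraints. In the ISO/IEC 9075 SQL standard, a PRIMARY KEY constraint implies both UNIQUE and NOT NULL, so a declared key guarantees that key values are unique. The standard does not require every table to declare one, however, and a surrogate key alone does nothing to stop the same real-world entity from being stored twice.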
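A short sketch of the surrogate-key problem, using Python's built-in sqlite3 module. The customer_loose and customer_strict tables are hypothetical, and email is assumed to be the natural identifier purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Surrogate key only: nothing stops the same customer being stored twice.
    CREATE TABLE customer_loose (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL
    );
    -- Surrogate key plus a UNIQUE constraint on the natural identifier.
    CREATE TABLE customer_strict (
        id    INTEGER PRIMARY KEY,
        email TEXT NOT NULL UNIQUE
    );
""")

# The loose table happily stores the same logical customer under two ids.
for _ in range(2):
    conn.execute("INSERT INTO customer_loose (email) VALUES ('a@example.com')")
loose_rows = conn.execute("SELECT COUNT(*) FROM customer_loose").fetchone()[0]

# The strict table accepts the first row and rejects the duplicate.
conn.execute("INSERT INTO customer_strict (email) VALUES ('a@example.com')")
try:
    conn.execute("INSERT INTO customer_strict (email) VALUES ('a@example.com')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

print(loose_rows, duplicate_rejected)
```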
Index absence on foreign keys
Foreign key columns used in join conditions without supporting indexes generate full table scans proportional to the child table's row count. A child table with 10 million rows joined to a parent on an unindexed foreign key can increase query latency by 3 to 4 orders of magnitude compared to the indexed equivalent, depending on selectivity and storage engine behavior.
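The effect on the access path can be observed with SQLite's EXPLAIN QUERY PLAN, shown here through Python's built-in sqlite3 module. The parent and child tables and the idx_child_parent index name are hypothetical. The sketch shows only the change in plan shape, not the latency figures above, and the exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE parent (parent_id INTEGER PRIMARY KEY);
    -- Declaring the foreign key does not create an index on child.parent_id in SQLite.
    CREATE TABLE child (
        child_id  INTEGER PRIMARY KEY,
        parent_id INTEGER NOT NULL REFERENCES parent(parent_id),
        payload   TEXT
    );
""")

def plan(sql):
    """Return the planner's description of how SQLite would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)  # column 3 holds the detail text

lookup = "SELECT payload FROM child WHERE parent_id = 42"

plan_before = plan(lookup)  # full scan of child
conn.execute("CREATE INDEX idx_child_parent ON child(parent_id)")
plan_after = plan(lookup)   # index search on parent_id

print(plan_before)
print(plan_after)
```

Before the index exists the plan reports a SCAN of child; afterwards it reports a SEARCH using idx_child_parent.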
Implicit column selection (SELECT *)
Queries using SELECT * couple application logic to schema structure, increase network payload, and prevent index-only scans. This is classified as a query-layer anti-pattern rather than a schema anti-pattern but produces schema change fragility equivalent to structural design flaws.
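The index-only scan point can be illustrated with SQLite, which labels such plans as using a covering index. This is a sketch using Python's built-in sqlite3 module; the account table and idx_account_email index are hypothetical, and other engines report the same distinction in their own plan vocabulary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (
        account_id INTEGER PRIMARY KEY,
        email      TEXT NOT NULL,
        bio        TEXT
    );
    CREATE INDEX idx_account_email ON account(email);
""")

def plan(sql):
    """Return the planner's description of how SQLite would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# SELECT * needs the bio column, so each index hit also visits the table row.
wide_plan = plan("SELECT * FROM account WHERE email = 'a@example.com'")

# Naming only columns held in the index lets the engine skip the table entirely.
narrow_plan = plan(
    "SELECT account_id, email FROM account WHERE email = 'a@example.com'"
)

print(wide_plan)
print(narrow_plan)
```

Only the narrow query is answered from the index alone, because the index on email already carries the integer primary key alongside the indexed value.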
Premature sharding
Applying database sharding before a single-node deployment has reached genuine capacity limits introduces distributed systems complexity — cross-shard queries, distributed transactions, and rebalancing overhead — without the scale justification. This anti-pattern is particularly common in teams applying distributed database systems patterns to workloads that fit comfortably within a single optimized relational instance.
Decision boundaries
Distinguishing a genuine anti-pattern from an acceptable tradeoff requires evaluation against four structural criteria:
1. Reversibility cost
If correcting the design requires a zero-downtime schema migration across a table with more than 100 million rows, the anti-pattern carries a materially different risk profile than one correctable in a development environment. Database version control practices directly affect how reversible early design decisions remain.
2. Integrity enforcement capability
Designs that prevent the database engine from enforcing constraints declaratively — through referential integrity, check constraints, or domain types — shift correctness enforcement to the application layer. Application-layer enforcement is fragile across multiple codebases, ORMs, and direct database access paths. The object-relational mapping layer, in particular, cannot substitute for database-native constraint enforcement.
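As a rough illustration of declarative enforcement, the sketch below uses Python's built-in sqlite3 module with a hypothetical invoice table. The raw_insert helper stands in for any access path that bypasses application validation, such as a maintenance script or a second codebase.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice (
        invoice_id INTEGER PRIMARY KEY,
        status     TEXT    NOT NULL CHECK (status IN ('draft', 'sent', 'paid')),
        amount     INTEGER NOT NULL CHECK (amount >= 0)
    );
""")

def raw_insert(status, amount):
    """Write directly to the table, returning True if the engine accepts the row."""
    try:
        conn.execute(
            "INSERT INTO invoice (status, amount) VALUES (?, ?)", (status, amount)
        )
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    raw_insert("paid", 1200),    # valid row
    raw_insert("PAID!!", 1200),  # unknown status
    raw_insert("draft", -5),     # negative amount
    raw_insert(None, 10),        # missing status
]

print(results)
```

Because the rules live in the schema, every writer is held to them, regardless of which codebase or ORM issued the statement.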
3. Query plan determinism
Anti-patterns that produce execution plans whose cost grows linearly or worse with data volume (full scans, implicit type casts on indexed columns, correlated subqueries in SELECT lists) represent latent performance failures regardless of current data size. Database monitoring and observability tooling can surface these before they reach critical thresholds.
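The cast case can be approximated in SQLite through Python's built-in sqlite3 module. The shipment table and idx_shipment_tracking index are hypothetical, and an explicit CAST stands in for the implicit conversions that other engines apply when a column and a comparison value differ in type. The exact plan wording varies between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE shipment (
        shipment_id INTEGER PRIMARY KEY,
        tracking_no INTEGER NOT NULL,
        notes       TEXT
    );
    CREATE INDEX idx_shipment_tracking ON shipment(tracking_no);
""")

def plan(sql):
    """Return the planner's description of how SQLite would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)

# Comparing the bare column lets the planner seek into the index.
direct_plan = plan("SELECT notes FROM shipment WHERE tracking_no = 1001")

# Wrapping the column in a cast hides it from the index, so every row is visited.
cast_plan = plan(
    "SELECT notes FROM shipment WHERE CAST(tracking_no AS TEXT) = '1001'"
)

print(direct_plan)
print(cast_plan)
```

Both queries return the same rows on a small table, which is why this class of problem tends to surface only once data volume grows.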
4. Regulatory and compliance impact
Certain anti-patterns carry compliance consequences beyond operational degradation. A schema without audit trail support, for instance, may fail requirements under frameworks such as HIPAA (45 CFR §164.312) or PCI DSS v4.0 Requirement 10, which mandates logging of all access to cardholder data. Database auditing and compliance requirements can make certain design shortcuts non-negotiable failures rather than technical debt.
The contrast between EAV and a properly normalized schema with optional columns or a JSONB column (in PostgreSQL) illustrates the decision boundary most clearly: JSONB in a documented, constrained context is an engineered tradeoff with query and indexing support; EAV is a structural anti-pattern that disables the relational engine's core enforcement capabilities without equivalent compensating controls.
References
- ISO/IEC 9075 SQL Standard — ANSI
- NIST SP 800-53 Rev 5 — Security and Privacy Controls for Information Systems (AU-2, AU-12 for audit log requirements)
- 45 CFR §164.312 — HIPAA Security Rule Technical Safeguards (HHS)
- PCI DSS v4.0 — PCI Security Standards Council
- Portland Pattern Repository — Ward Cunningham (WikiWikiWeb)
- NIST National Vulnerability Database — CWE-89 (SQL Injection, schema-level exposure)