Key Dimensions and Scopes of Database Systems

The scope of a database system is not a fixed attribute — it is a product of deployment model, data volume, regulatory environment, access patterns, and organizational context. Defining these dimensions precisely is essential for procurement decisions, architectural planning, compliance audits, and professional role classification. This page maps the structural boundaries, regulatory constraints, scale gradients, and contextual variables that collectively determine what any given database system covers, and what it does not.


What falls outside the scope

Database systems are frequently conflated with adjacent infrastructure components that operate alongside them but fall under distinct architectural and professional boundaries.

File systems and object storage — Amazon S3, Azure Blob Storage, and POSIX file systems store unstructured binary objects. They do not enforce relational constraints, execute SQL queries, or maintain ACID transactional guarantees as defined under database transactions and ACID properties. Retrieval is key-based or path-based, not query-based.

Message queues and event streaming platforms — Apache Kafka, RabbitMQ, and AWS SQS are durable message brokers designed for ordered event delivery. They are not databases in the structural sense: they do not support ad hoc queries, schema-enforced row storage, or multi-table joins.

Data pipelines and ETL tooling — Tools such as Apache Spark, AWS Glue, and dbt transform and move data between systems. The pipeline layer is distinct from the storage and query execution layer that constitutes a database system.

Application caches — Caching layers such as Memcached operate as transient volatile stores. Although Redis occupies a hybrid position — functioning as both cache and persistent store — pure caching infrastructure is not classified as a database system under most enterprise architecture frameworks.

Analytics and BI frontends — Tableau, Power BI, and Looker are presentation and query-generation tools. They consume database systems but are not themselves database systems, even when they embed query engines.

A persistent misconception is that any system containing data qualifies as a database system. NIST SP 800-53 (Rev 5), available at csrc.nist.gov, distinguishes data management systems from broader information systems precisely because the security, access control, and audit requirements differ by system type.


Geographic and jurisdictional dimensions

Database systems serving US-based organizations operate under a patchwork of federal and state-level regulatory regimes that directly affect permissible data residency, cross-border transfer, and storage architecture.

Federal jurisdictional layers apply when databases handle regulated data categories. The Health Insurance Portability and Accountability Act (HIPAA), administered by the HHS Office for Civil Rights at hhs.gov, imposes geographic and technical controls on protected health information (PHI) stored in any database — including cloud-hosted instances. The Federal Risk and Authorization Management Program (FedRAMP), managed by GSA, establishes mandatory controls for cloud database services used by federal agencies, requiring that data processing occur within the continental United States in most authorization tiers.

State-level residency and privacy laws create additional jurisdictional layers. The California Consumer Privacy Act (CCPA), codified at Cal. Civ. Code § 1798.100, applies to any database system holding personal data of California residents regardless of where the database is physically hosted. Virginia (VCDPA), Colorado (CPA), and Connecticut (CTDPA) have enacted comparable frameworks with overlapping but non-identical scope definitions.

Cross-border data flows are governed at the international level by frameworks such as the EU-US Data Privacy Framework (successor to Privacy Shield), which affects US organizations operating distributed database systems with nodes in European jurisdictions. The European Data Protection Board's guidance on data transfer mechanisms — available at edpb.europa.eu — constitutes binding interpretive authority for EU-side nodes of multinational database deployments.

Geographic scope is not limited to physical server location. Cloud database services replicate across availability zones that may span state and national boundaries, making geographic scope a configuration decision requiring explicit architectural governance rather than an automatic property of the hosting contract.


Scale and operational range

Database system scope scales along three orthogonal dimensions: data volume, transaction throughput, and concurrency.

Data volume range spans from sub-gigabyte single-instance databases serving small internal applications to multi-petabyte distributed systems underpinning financial exchanges or national health registries. The data warehousing sector routinely operates at the 100-terabyte to multi-petabyte range, where columnar storage architectures (columnar databases) and database partitioning strategies become architectural requirements rather than optimizations.

Transaction throughput is measured in transactions per second (TPS). OLTP systems — described in the OLTP vs. OLAP classification framework — are designed for high-concurrency short-duration transactions, with enterprise-grade RDBMS platforms such as Oracle Database and PostgreSQL sustaining tens of thousands of TPS under standard hardware configurations. OLAP systems prioritize scan throughput over per-transaction latency and are tuned for query workloads that may process billions of rows per query.

Concurrency range — the number of simultaneous active connections — is governed by database connection pooling architectures. Standard PostgreSQL installations support up to a few hundred direct connections before connection overhead degrades performance; connection poolers such as PgBouncer extend effective concurrency to thousands of application threads against the same instance.
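
Connection pooling of this kind is typically configured in the pooler itself. The fragment below is a minimal sketch of a PgBouncer transaction-pooling setup; the database name, paths, and numeric values are illustrative, not tuning recommendations.

```ini
; pgbouncer.ini — illustrative values only
[databases]
; route client connections for "appdb" to the local PostgreSQL instance
appdb = host=127.0.0.1 port=5432 dbname=appdb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
; transaction pooling: a server connection is held only for the duration
; of a transaction, so thousands of clients can share a small server pool
pool_mode = transaction
max_client_conn = 5000
default_pool_size = 50
```

With settings like these, up to 5,000 application connections multiplex over at most 50 actual PostgreSQL connections, which is how poolers extend effective concurrency well past the direct-connection limit.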

Scale Tier | Typical Volume  | Primary Architecture             | Representative Platforms
-----------|-----------------|----------------------------------|----------------------------
Small      | < 100 GB        | Single-node RDBMS                | SQLite, MySQL
Mid-range  | 100 GB – 10 TB  | Multi-node RDBMS / managed cloud | PostgreSQL, AWS RDS
Enterprise | 10 TB – 1 PB    | Distributed RDBMS / NewSQL       | Google Spanner, CockroachDB
Hyperscale | > 1 PB          | Distributed columnar / sharded   | Snowflake, Apache Cassandra

Regulatory dimensions

Database systems sit at the intersection of technical architecture and regulatory compliance. Database security and access control and database auditing and compliance are not optional features — for regulated industries, they are legally mandated functional requirements.

PCI DSS (Payment Card Industry Data Security Standard), published by the PCI Security Standards Council at pcisecuritystandards.org, requires that cardholder data environments maintain access controls, database encryption, and audit logging to specific technical specifications. Requirement 10 mandates logging of all individual user access to cardholder data stored in database systems.

SOX Section 404 — the Sarbanes-Oxley Act's internal controls mandate — requires that financial reporting databases maintain data integrity controls, change management logs, and access restriction evidence sufficient for external auditor review. The Public Company Accounting Oversight Board (PCAOB) auditing standards at pcaobus.org define the evidentiary requirements that shape database audit trail specifications for publicly traded companies.

NIST SP 800-53 Rev 5 establishes the AU (Audit and Accountability) and AC (Access Control) control families that define baseline database security requirements for federal systems. These controls are referenced in FedRAMP authorizations and are increasingly adopted as baseline standards in state government database procurement requirements.

FISMA (Federal Information Security Modernization Act of 2014, 44 U.S.C. § 3551) requires federal agencies to categorize information systems — including database systems — under FIPS 199 impact levels (Low, Moderate, High), with the impact level directly determining the security control baseline applied to the database.
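
The categorization step can be made concrete. FIPS 199 assigns an impact level per security objective (confidentiality, integrity, availability), and the overall system level used for baseline selection is the highest of the three — the "high-water mark". The sketch below illustrates that rule; the function and variable names are our own, not terminology from the standard.

```python
# Illustrative sketch of FIPS 199 impact categorization and the
# high-water-mark rule used to select a security control baseline.
# Ordering Low < Moderate < High follows the standard; names are ours.

LEVELS = {"Low": 0, "Moderate": 1, "High": 2}

def overall_impact(confidentiality: str, integrity: str, availability: str) -> str:
    """Return the high-water mark across the three security objectives."""
    return max((confidentiality, integrity, availability), key=LEVELS.__getitem__)

# A database holding PHI might, for example, be categorized as:
print(overall_impact("High", "Moderate", "Moderate"))  # High — the High baseline applies
```

Because a single High rating pulls the whole system to the High baseline, the data classification step described above directly determines the cost of the control set.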


Dimensions that vary by context

Scope attributes that appear fixed in one deployment context are configurable or contested in another.

Consistency model — Relational database systems default to strong consistency under ACID guarantees. NoSQL database systems frequently offer configurable consistency levels (eventual, bounded staleness, session, strong) that shift the system's behavior based on application requirements. The CAP theorem provides the theoretical framework governing this tradeoff.

Schema rigidity — Relational systems enforce schema-on-write; document databases and other NoSQL variants typically support schema-on-read. This distinction has direct implications for database schema design governance and migration complexity.

Query surface — SQL is a defined standard (ISO/IEC 9075), but implementations vary. SQL fundamentals apply across ANSI-compliant engines, but platform-specific extensions in Oracle PL/SQL, Microsoft T-SQL, and PostgreSQL PL/pgSQL create functional scope differences that affect portability.

Durability guarantees — In-memory databases trade full persistence for speed. Redis, for example, offers configurable persistence modes (RDB snapshots, AOF logging, or none) that place it at different positions on the durability spectrum depending on configuration.
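
Redis's position on that spectrum is set by a handful of configuration directives. The fragment below sketches one common combination; the thresholds are illustrative examples, not recommendations.

```conf
# redis.conf — illustrative persistence settings

# RDB snapshotting: write a point-in-time snapshot if at least 1 key
# changed in 900 s, 10 keys in 300 s, or 10000 keys in 60 s
save 900 1
save 300 10
save 60 10000

# AOF logging: append every write to a log and fsync it once per
# second — bounded data loss, stronger durability than RDB alone
appendonly yes
appendfsync everysec
```

Disabling both mechanisms turns the same engine into a pure cache, which is exactly why Redis straddles the cache/database boundary noted earlier.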


Service delivery boundaries

Database system delivery occurs across five operational models, each with distinct scope boundaries:

  1. Self-managed on-premises — Full infrastructure and software stack under organizational control. Scope includes hardware provisioning, OS patching, DBMS installation, and all database backup and recovery operations.

  2. Infrastructure-as-a-Service (IaaS) hosted — Database software runs on cloud VMs (AWS EC2, Azure VMs). Hardware abstracted; OS and DBMS management remain organizational responsibilities.

  3. Platform-as-a-Service / Managed cloud — Cloud database services such as AWS RDS and Azure SQL Database manage the OS and engine layer. Organizations retain schema, query, and access control responsibility.

  4. Database-as-a-Service (DBaaS) — Fully managed offerings such as MongoDB Atlas abstract nearly all operational functions. Scope for the consuming organization is limited to data model, query design, and access policy.

  5. Serverless database — Aurora Serverless and Firestore abstract capacity management entirely. Scope boundaries shift billing and scaling responsibility to the provider; the organization defines only the data model and access patterns.


How scope is determined

Scope determination for a database system follows a structured evaluation sequence:

  1. Data classification — Identify what categories of data the system will hold (PHI, PII, financial, public). This drives regulatory scope before any architecture decision is made.

  2. Workload profile analysis — Characterize the access pattern: transaction-heavy OLTP, read-heavy OLAP, mixed, or event-driven. This determines whether database indexing, database query optimization, or columnar storage dominates the architecture.

  3. Consistency and availability requirements — Define SLA targets for uptime, database high availability tiers, and recovery point objectives (RPO / RTO) that constrain the database disaster recovery architecture.

  4. Scale projection — Estimate data growth rate, peak concurrency, and geographic distribution to determine whether database sharding or database replication topologies are required.

  5. Regulatory mapping — Cross-reference the data classification output against applicable federal and state frameworks to establish mandatory controls for encryption, audit, and residency.

  6. Delivery model selection — Match organizational operational capacity against the five delivery models listed above to determine which management functions remain in scope for internal teams.
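
Step 5 in particular lends itself to a mechanical first pass. The sketch below cross-references data classifications against the frameworks discussed on this page; the mapping is deliberately simplified and the category labels are our own — real scoping requires counsel review.

```python
# Illustrative first-pass regulatory mapping (step 5 above).
# Classification labels and the framework mapping are simplified
# examples, not an exhaustive or authoritative compliance matrix.

FRAMEWORKS_BY_CLASSIFICATION = {
    "PHI": {"HIPAA"},
    "cardholder": {"PCI DSS"},
    "financial_reporting": {"SOX 404"},
    "federal": {"FISMA", "FedRAMP", "NIST SP 800-53"},
    "PII": {"CCPA", "VCDPA", "CPA", "CTDPA"},  # state privacy laws
    "public": set(),
}

def regulatory_scope(classifications: list[str]) -> set[str]:
    """Union of frameworks triggered by the data categories the system will hold."""
    scope: set[str] = set()
    for category in classifications:
        scope |= FRAMEWORKS_BY_CLASSIFICATION.get(category, set())
    return scope

print(sorted(regulatory_scope(["PHI", "PII"])))
```

The union semantics matter: a database holding both PHI and cardholder data inherits the controls of every triggered framework, not the strictest single one.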



Common scope disputes

Dispute 1: Is the application layer's ORM part of the database system's scope?
Object-relational mapping tools such as Hibernate or SQLAlchemy generate SQL and interact with the database engine, but they are application-layer software. Database administrators typically treat the ORM as out of scope for performance troubleshooting unless ORM-generated queries are the source of execution plan problems — at which point the boundary becomes contested between application developers and DBAs.

Dispute 2: Does caching infrastructure belong to the database scope?
Database caching strategies that use Redis or Memcached as read-through caches are architecturally adjacent to the primary database but fall under separate operational ownership in most enterprise structures. The dispute intensifies when cache invalidation failures cause data consistency incidents — at that point, accountability crosses the boundary.

Dispute 3: Who owns schema migration in a DevOps pipeline?
Database migration and database version control practices place schema change management at the intersection of developer and DBA responsibilities. The database administrator role has historically owned production schema changes; the database developer role owns application-side data models. CI/CD pipeline automation collapses this boundary in many organizations, creating scope ambiguity that must be resolved in operational agreements before incidents occur.

Dispute 4: Is full-text search a database function or a search platform function?
Full-text search in databases is natively supported in PostgreSQL (via tsvector/tsquery) and MySQL (via FULLTEXT indexes). Organizations that also deploy Elasticsearch or OpenSearch face a boundary dispute: queries that could be served by either system become architectural choices that carry cost, latency, and consistency tradeoffs. This dispute has no universal resolution — it is determined by the specific query patterns, index freshness requirements, and operational capacity of the organization.
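
The in-database side of this dispute can be sketched with SQLite's FTS5 module standing in for PostgreSQL's tsvector/tsquery — the same pattern of an inverted index maintained inside the database. This assumes the local SQLite build was compiled with FTS5, which is true of most modern Python distributions.

```python
import sqlite3

# In-database full-text search via SQLite's FTS5 virtual-table module,
# used here as a stand-in for PostgreSQL tsvector/tsquery.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts5(title, body)")
conn.executemany(
    "INSERT INTO docs (title, body) VALUES (?, ?)",
    [
        ("replication", "streaming replication copies WAL to standby nodes"),
        ("sharding", "hash sharding distributes rows across shards"),
    ],
)
# MATCH runs a full-text query against the inverted index; ORDER BY rank
# sorts best matches first
rows = conn.execute(
    "SELECT title FROM docs WHERE docs MATCH ? ORDER BY rank", ("replication",)
).fetchall()
print(rows)  # [('replication',)]
```

Queries served this way stay transactionally consistent with the source rows — the freshness guarantee a separate Elasticsearch cluster can only approximate, which is the crux of the boundary dispute.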

Dispute 5: Are time-series databases and graph databases in scope for general DBA functions?
Specialized database types are frequently excluded from general DBA service contracts, which are scoped to RDBMS platforms. InfluxDB, TimescaleDB, Neo4j, and Amazon Neptune require distinct operational expertise. Organizations often discover this gap when a general DBA is asked to troubleshoot a graph database or time-series system for which the standard RDBMS toolset does not apply.
