Databases
Where Meaning Survives
A database is not just where data lives—it's where meaning survives. Beneath every social feed, every financial report, every airport gate and cardiac monitor, there's a structure that knows how to not forget. Databases are the silent systems of memory.
🏗️ Relational Design Concepts
Designing a database isn't just a technical act—it's epistemology. It's deciding what matters, what relates, and what needs protection from inconsistency. Whether it's a registry of births or a schema for intergalactic trade, the structure must persist, adapt, and query fast.
One-to-One Relationship
Each row in Table A relates to one and only one row in Table B. Used when data is best split for security or logical separation.
One-to-Many Relationship
The cornerstone of relational modeling. One record (e.g. a user) can relate to many (e.g. posts). Enforced via foreign keys.
Many-to-Many Relationship
Modeled using a join table. One student can enroll in many courses; each course has many students. The matrix of human networks.
Foreign Keys
The DNA of relational integrity. Foreign keys ensure that linked data references something real. Delete a parent, risk orphaning the child.
Normalization
The process of structuring data to reduce redundancy and improve integrity. There are forms—like laws—you obey for clarity.
CTEs (Common Table Expressions)
Temporary result sets you can reference within a larger query. Think of them as sub-thoughts in the middle of a complex sentence.
📊 Normalization Forms
Normalization isn't religion, but it's close. It's the discipline of keeping your tables from becoming tangled forests of duplication, update anomalies, and regret. These forms are progressive rules—each building on the last—to bring peace to your schema.
1NF - First Normal Form
Ensure atomic columns—no repeating groups or arrays. Every cell contains a single value.
2NF - Second Normal Form
Remove partial dependencies. Every non-key attribute must depend on the entire primary key.
3NF - Third Normal Form
Remove transitive dependencies. No non-key attribute should depend on another non-key attribute.
BCNF - Boyce-Codd Normal Form
A stricter 3NF where every determinant is a candidate key.
4NF - Fourth Normal Form
Handle multi-valued dependencies. Say goodbye to double-dipping data.
5NF - Fifth Normal Form
Resolve join dependencies. Only use when you're designing systems for gods or legacy Oracle instances.
📚 Key Concepts Dictionary
🔑 Keys & Identifiers
- Primary Key: A column or set of columns that uniquely identifies each row in a table.
- Foreign Key: A reference from one table to a primary key in another. Integrity enforcer.
- Composite Key: A primary key made from two or more columns.
- Surrogate Key: An artificial ID, often an auto-increment integer or UUID.
- Candidate Key: A column or set of columns that could serve as a primary key.
🔗 Joins & Relationships
- Join: Operation that combines rows from two or more tables based on a related column.
- Inner Join: Returns only matching rows. The intersection of data.
- Left Join: Returns all rows from the left table and matching ones from the right—if they exist.
- Right Join: Opposite of Left Join. Rarely used, but conceptually symmetrical.
- Full Outer Join: All rows from both tables. Gaps filled with NULLs.
⚡ Performance & Optimization
- Index: A structure that speeds up reads but can slow down writes. Tradeoffs, always.
- Query Optimization: The process of choosing the most efficient execution plan for a query.
- Denormalization: Sometimes you break the rules for performance. It's dangerous, but sometimes necessary.
- Partitioning: Splitting large tables into smaller, more manageable pieces.
- Caching: Storing frequently accessed data in memory for faster retrieval.
🛡️ ACID & Transactions
- Transaction: A bundle of queries that succeed or fail together. Atomicity is safety.
- ACID: Atomicity, Consistency, Isolation, Durability. The commandments of relational safety.
- Atomicity: All operations in a transaction succeed or all fail.
- Consistency: Database remains in a valid state before and after transactions.
- Isolation: Concurrent transactions don't interfere with each other.
📋 Data Types & Constraints
- NULL: The ghost in your data. A lack of value, not zero, not empty. Unknowable.
- Schema: The blueprint of your database: tables, columns, types, relationships, constraints.
- View: A saved query—like a virtual table. Keeps queries DRY and logic centralized.
- Stored Procedure: Precompiled SQL code stored in the database for reuse.
- Trigger: Automatic actions that fire when data is inserted, updated, or deleted.
🌐 Modern Databases
- NoSQL: Non-relational databases designed for specific use cases and scalability.
- Document Stores: Store data as documents (JSON, XML) rather than tables.
- Graph Databases: Optimized for storing and querying relationships between entities.
- Time-Series Databases: Specialized for data that changes over time.
- Distributed Databases: Data spread across multiple nodes for scalability and fault tolerance.
📄 Influential Papers and Historical Moments
🏗️ Foundational Papers
- A Relational Model of Data for Large Shared Data Banks (E. F. Codd, 1970) — Introduced the relational model.
- Spanner: Google's Globally-Distributed Database (2012) — Planet-scale, consistent distributed SQL.
- The End of an Architectural Era (Stonebraker, 2007) — Traditional RDBMSs are too rigid.
🚀 Scaling & Performance
- Scaling Memcache at Facebook — How Facebook scales ephemeral caching.
- The NoSQL Movement — Why scale killed schemas and how the pendulum is swinging back.
- Dynamo: Amazon's Highly Available Key-value Store (2007) — Influenced many NoSQL systems.
🕸️ Graph & Modern Systems
- Graph Thinking — The shift from tables to graphs.
- Bigtable: A Distributed Storage System (2006) — Google's distributed storage system.
- MapReduce: Simplified Data Processing (2004) — Programming model for large-scale data processing.