Random Thoughts | Designing a DB Schema
March 2nd, 2024
Random Thoughts - Designing a Schema
Recently, I came across a sample DB design problem that Iād like to document here for posterity. The summary of the problem goes as follows:
You'd like to store data related to a Person. A person has a name, email address(es), phone number(s) and a (physical) address.
Additionally, there are tags associated with them that are represented as key/value pairs.
Tags can be user-submitted.
Expect that ~200-300 million people are being stored in this solution.
The query patterns that this schema must support are as follows:
1. Lookup people by identifying information (name/email/phone).
2. Partial name match.
3. Building an audience of people with a specific tag value.
A Relational Solution - SQL Table Definition
CREATE TABLE Persons (
PersonID VARCHAR(255) PRIMARY KEY,
Name VARCHAR(255) NOT NULL
);
-- Index on Persons Table
CREATE INDEX idx_Name ON Persons (Name);
CREATE TABLE EmailAddresses (
EmailID VARCHAR(255) PRIMARY KEY,
PersonID VARCHAR(255) REFERENCES Persons(PersonID),
Email VARCHAR(255) NOT NULL
);
-- Index on EmailAddresses Table
CREATE INDEX idx_Email ON EmailAddresses (Email);
-- Index on EmailAddresses Table for PersonID
CREATE INDEX idx_EmailAddressesPersonID ON EmailAddresses (PersonID);
CREATE TABLE PhoneNumbers (
PhoneID VARCHAR(255) PRIMARY KEY,
PersonID VARCHAR(255) REFERENCES Persons(PersonID),
PhoneNumber INT NOT NULL
);
-- Index on PhoneNumbers Table for Phone Number
CREATE INDEX idx_PhoneNumber ON PhoneNumbers (PhoneNumber);
-- Index on PhoneNumbers Table for PersonID
CREATE INDEX idx_PhoneNumberPersonID ON PhoneNumbers (PersonID);
CREATE TABLE Addresses (
AddressID VARCHAR(255) PRIMARY KEY,
PersonID VARCHAR(255) REFERENCES Persons(PersonID),
Street VARCHAR(255) NOT NULL,
City VARCHAR(255) NOT NULL,
State VARCHAR(255) NOT NULL,
ZipCode VARCHAR(10) NOT NULL
);
CREATE TABLE Tags (
TagID VARCHAR(255) PRIMARY KEY,
TagKey VARCHAR(255) NOT NULL,
TagValue VARCHAR(255) NOT NULL
);
-- Index on Tags Table for TagKey
CREATE INDEX idx_TagKey ON Tags (TagKey);
CREATE TABLE PersonTags (
PersonID VARCHAR(255) REFERENCES Persons(PersonID),
TagID VARCHAR(255) REFERENCES Tags(TagID),
PRIMARY KEY (PersonID, TagID)
);
-- Index on PersonTags Table for TagID
CREATE INDEX idx_PersonTagsTagID ON PersonTags (TagID);
Query Solutions
1. Lookup people by identifying information (name/email/phone).
-- Query to Look Up a Person by Name
SELECT *
FROM Persons
WHERE Name = 'desired_name';
-- Query to Look Up a Person by Email
SELECT Persons.*
FROM Persons
INNER JOIN EmailAddresses ON Persons.PersonID = EmailAddresses.PersonID
WHERE EmailAddresses.Email = 'desired_email';
-- Query to Look Up a Person by Phone Number
SELECT Persons.*
FROM Persons
INNER JOIN PhoneNumbers ON Persons.PersonID = PhoneNumbers.PersonID
WHERE PhoneNumbers.PhoneNumber = 'desired_phone_number';
2. Look up a person by a partial name match.
-- Query to Find a Person by Partial Name Match
SELECT *
FROM Persons
WHERE Name LIKE '%partial_name%';
3. Build an audience of people by a specific tag value.
-- Query to Build an Audience by Tag Name and Value
SELECT p.*
FROM Persons p
INNER JOIN PersonTags pt on p.PersonID = pt.PersonID
INNER JOIN Tags t on t.TagID = pt.TagID
where t.TagKey='designed_tag_key' and t.TagValue = 'desired_tag_value'
Database Schema Summary
-
Persons Table:
- Stores unique individuals with a primary key (PersonID) and their names.
-
EmailAddresses Table:
- Manages email addresses associated with persons. It includes a foreign key (PersonID) linking to the Persons table.
-
PhoneNumbers Table:
- Stores phone numbers related to persons. Similar to EmailAddresses, it has a foreign key (PersonID) connecting to the Persons table.
-
Addresses Table:
- Manages address information such as street, city, state, and zip code, linked to the Persons table through a foreign key (PersonID).
-
Tags Table:
- Holds tag information with a primary key (TagID), tag key, and tag value.
-
PersonTags Table:
- Represents the many-to-many relationship between persons and tags. It includes foreign keys (PersonID and TagID) referencing Persons and Tags tables, respectively.