Random Thoughts | Designing a DB Schema

March 2nd, 2024

Random Thoughts - Designing a Schema

Recently, I came across a sample DB design problem that Iā€™d like to document here for posterity. The summary of the problem goes as follows:

    You'd like to store data related to a Person. A person has a name, email address(es), phone number(s) and a (physical) address.
    Additionally, there are tags associated with them that are represented as key/value pairs.
    Tags can be user-submitted.
    Expect that ~200-300 million people are being stored in this solution.

    The query patterns that this schema must support are as follows:

    1. Lookup people by identifying information (name/email/phone).
    2. Partial name match.
    3. Building an audience of people with a specific tag value.

A Relational Solution - SQL Table Definition

CREATE TABLE Persons (
    PersonID VARCHAR(255) PRIMARY KEY,
    Name VARCHAR(255) NOT NULL
);

-- Index on Persons Table
CREATE INDEX idx_Name ON Persons (Name);

CREATE TABLE EmailAddresses (
    EmailID VARCHAR(255) PRIMARY KEY,
    PersonID VARCHAR(255) REFERENCES Persons(PersonID),
    Email VARCHAR(255) NOT NULL
);

-- Index on EmailAddresses Table
CREATE INDEX idx_Email ON EmailAddresses (Email);
-- Index on EmailAddresses Table for PersonID
CREATE INDEX idx_EmailAddressesPersonID ON EmailAddresses (PersonID);

CREATE TABLE PhoneNumbers (
    PhoneID VARCHAR(255) PRIMARY KEY,
    PersonID VARCHAR(255) REFERENCES Persons(PersonID),
    PhoneNumber INT NOT NULL
);

-- Index on PhoneNumbers Table for Phone Number
CREATE INDEX idx_PhoneNumber ON PhoneNumbers (PhoneNumber);
-- Index on PhoneNumbers Table for PersonID
CREATE INDEX idx_PhoneNumberPersonID ON PhoneNumbers (PersonID);

CREATE TABLE Addresses (
    AddressID VARCHAR(255) PRIMARY KEY,
    PersonID VARCHAR(255) REFERENCES Persons(PersonID),
    Street VARCHAR(255) NOT NULL,
    City VARCHAR(255) NOT NULL,
    State VARCHAR(255) NOT NULL,
    ZipCode VARCHAR(10) NOT NULL
);

CREATE TABLE Tags (
    TagID VARCHAR(255) PRIMARY KEY,
    TagKey VARCHAR(255) NOT NULL,
    TagValue VARCHAR(255) NOT NULL
);

-- Index on Tags Table for TagKey
CREATE INDEX idx_TagKey ON Tags (TagKey);

CREATE TABLE PersonTags (
    PersonID VARCHAR(255) REFERENCES Persons(PersonID),
    TagID VARCHAR(255) REFERENCES Tags(TagID),
    PRIMARY KEY (PersonID, TagID)
);

-- Index on PersonTags Table for TagID
CREATE INDEX idx_PersonTagsTagID ON PersonTags (TagID);

Query Solutions

1. Lookup people by identifying information (name/email/phone).

-- Query to Look Up a Person by Name
SELECT *
FROM Persons
WHERE Name = 'desired_name';

-- Query to Look Up a Person by Email
SELECT Persons.*
FROM Persons
INNER JOIN EmailAddresses ON Persons.PersonID = EmailAddresses.PersonID
WHERE EmailAddresses.Email = 'desired_email';

-- Query to Look Up a Person by Phone Number
SELECT Persons.*
FROM Persons
INNER JOIN PhoneNumbers ON Persons.PersonID = PhoneNumbers.PersonID
WHERE PhoneNumbers.PhoneNumber = 'desired_phone_number';

2. Look up a person by a partial name match.

-- Query to Find a Person by Partial Name Match
SELECT *
FROM Persons
WHERE Name LIKE '%partial_name%';

3. Build an audience of people by a specific tag value.

-- Query to Build an Audience by Tag Name and Value
SELECT p.*
FROM Persons p
INNER JOIN PersonTags pt on p.PersonID = pt.PersonID
INNER JOIN Tags t on t.TagID = pt.TagID
where t.TagKey='designed_tag_key' and t.TagValue = 'desired_tag_value'

Database Schema Summary

  1. Persons Table:

    • Stores unique individuals with a primary key (PersonID) and their names.
  2. EmailAddresses Table:

    • Manages email addresses associated with persons. It includes a foreign key (PersonID) linking to the Persons table.
  3. PhoneNumbers Table:

    • Stores phone numbers related to persons. Similar to EmailAddresses, it has a foreign key (PersonID) connecting to the Persons table.
  4. Addresses Table:

    • Manages address information such as street, city, state, and zip code, linked to the Persons table through a foreign key (PersonID).
  5. Tags Table:

    • Holds tag information with a primary key (TagID), tag key, and tag value.
  6. PersonTags Table:

    • Represents the many-to-many relationship between persons and tags. It includes foreign keys (PersonID and TagID) referencing Persons and Tags tables, respectively.