Email Organizer Data Model
Overview
This document describes the data model for the Email Organizer application, including entities, attributes, relationships, and constraints. The system uses PostgreSQL with SQLAlchemy ORM for data persistence.
Entity Relationship Diagram
erDiagram
USER {
int id PK "Primary Key"
string first_name "Not Null"
string last_name "Not Null"
string email "Unique, Not Null"
string password_hash "Not Null"
json imap_config "JSON Configuration"
datetime created_at "Default: UTC Now"
datetime updated_at "Default: UTC Now, On Update"
}
FOLDER {
int id PK "Primary Key"
int user_id FK "Foreign Key to User"
string name "Not Null"
text rule_text "Natural Language Rule"
int priority "Processing Order"
boolean organize_enabled "Default: True"
string folder_type "Default: 'destination'"
int total_count "Default: 0"
int pending_count "Default: 0"
int emails_count "Default: 0"
json recent_emails "JSON Array"
datetime created_at "Default: UTC Now"
datetime updated_at "Default: UTC Now, On Update"
}
USER ||--o{ FOLDER : "has"
note "User-Folder Relationship"
note "One-to-Many: Each user can have multiple folders"
Entities
User Entity
The User entity stores account information and authentication data for each user.
Attributes
| Column Name | Data Type | Constraints | Description |
|---|---|---|---|
| id | Integer | Primary Key, Autoincrement | Unique identifier for each user |
| first_name | String(255) | Not Null | User's first name |
| last_name | String(255) | Not Null | User's last name |
| email | String(255) | Unique, Not Null | User's email address (login identifier) |
| password_hash | String(2048) | Not Null | Hashed password for authentication |
| imap_config | JSON | Nullable | IMAP server configuration settings |
| created_at | DateTime | Default: datetime.utcnow | Timestamp of account creation |
| updated_at | DateTime | Default: datetime.utcnow, On Update | Timestamp of last update |
Relationships
- One-to-Many: Each User can have multiple Folder instances
- Self-referencing: No direct relationships to other User instances
Business Rules
- Email must be unique across all users
- Password is stored as a hash, never in plain text
- IMAP configuration is stored as JSON for flexibility
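The document does not say which hashing scheme produces password_hash, so the following is only a minimal sketch of the "hash, never plain text" rule using PBKDF2 from the Python standard library. The function names, the `pbkdf2_sha256$...` string layout, and the iteration count are assumptions for illustration, not the application's actual scheme.

```python
import hashlib
import hmac
import os


def hash_password(password: str, *, iterations: int = 600_000) -> str:
    """Return a salted PBKDF2 hash as one string (layout is hypothetical)."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    # Store algorithm, cost, salt, and digest together so verification
    # needs nothing but the stored value.
    return f"pbkdf2_sha256${iterations}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash in application code and compare in constant time."""
    algorithm, iterations, salt_hex, digest_hex = stored.split("$")
    if algorithm != "pbkdf2_sha256":
        return False
    candidate = hashlib.pbkdf2_hmac(
        "sha256",
        password.encode("utf-8"),
        bytes.fromhex(salt_hex),
        int(iterations),
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

A value in this layout is well under the String(2048) limit of the password_hash column.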
Folder Entity
The Folder entity stores email organization rules and metadata for each user's email folders.
Attributes
| Column Name | Data Type | Constraints | Description |
|---|---|---|---|
| id | Integer | Primary Key, Autoincrement | Unique identifier for each folder |
| user_id | Integer | Foreign Key to User, Not Null | Reference to the owning user |
| name | String(255) | Not Null | Display name of the folder |
| rule_text | Text | Nullable | Natural language description of the folder rule |
| priority | Integer | Nullable | Processing order (0=normal, 1=high) |
| organize_enabled | Boolean | Default: True | Whether the organization rule is active |
| folder_type | String(20) | Default: 'destination' | Folder type: 'tidy', 'destination', or 'ignore' |
| total_count | Integer | Default: 0 | Total number of emails in the folder |
| pending_count | Integer | Default: 0 | Number of emails waiting to be processed |
| emails_count | Integer | Default: 0 | Number of emails moved to this destination folder |
| recent_emails | JSON | Default: [] | Array of recent email metadata |
| created_at | DateTime | Default: datetime.utcnow | Timestamp of folder creation |
| updated_at | DateTime | Default: datetime.utcnow, On Update | Timestamp of last update |
Relationships
- Many-to-One: Each Folder belongs to one User
- Self-referencing: No direct relationships to other Folder instances
Business Rules
- Each folder must belong to a user
- Folder name must be unique per user
- Rule text can be null (for manually created folders)
- Priority values: 0 (normal), 1 (high priority)
- Folder types:
  - 'tidy': Folders containing emails to be processed (e.g., Inbox)
  - 'destination': Folders that are targets for email organization (default)
  - 'ignore': Folders that are stored but neither scanned nor used as destinations
- Recent emails array stores JSON objects with subject and date information
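The defaults and business rules above can be sketched as a plain dataclass with validation. This is an illustration only: `FolderRecord` is a hypothetical stand-in for the SQLAlchemy model, and the set of allowed folder types includes 'ignore' as described in the Folder Types section.

```python
from dataclasses import dataclass, field
from typing import Optional

FOLDER_TYPES = {"tidy", "destination", "ignore"}
PRIORITIES = {0, 1}  # 0 = normal, 1 = high


@dataclass
class FolderRecord:
    """Hypothetical in-memory mirror of the Folder columns."""

    user_id: int
    name: str
    rule_text: Optional[str] = None   # null for manually created folders
    priority: Optional[int] = None
    organize_enabled: bool = True
    folder_type: str = "destination"
    total_count: int = 0
    pending_count: int = 0
    emails_count: int = 0
    recent_emails: list = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.name:
            raise ValueError("name must not be empty")
        if self.folder_type not in FOLDER_TYPES:
            raise ValueError(f"unknown folder_type: {self.folder_type!r}")
        if self.priority is not None and self.priority not in PRIORITIES:
            raise ValueError(f"priority must be 0 or 1, got {self.priority!r}")


inbox = FolderRecord(user_id=1, name="Inbox", folder_type="tidy", priority=1)
finance = FolderRecord(user_id=1, name="Finance")
```

Per-user name uniqueness is not checked here because it needs the database; it is a constraint across rows, not a property of one record.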
Data Constraints
Primary Keys
- User.id: Integer, auto-incrementing
- Folder.id: Integer, auto-incrementing
Foreign Keys
- Folder.user_id: References User.id with ON DELETE CASCADE
Unique Constraints
- User.email: Ensures no duplicate email addresses
- Composite unique constraint on (Folder.user_id, Folder.name) to prevent duplicate folder names per user
Not Null Constraints
- User.first_name, User.last_name, User.email, User.password_hash
- Folder.user_id, Folder.name
Default Values
- User.created_at, User.updated_at: Current UTC timestamp
- Folder.created_at, Folder.updated_at: Current UTC timestamp
- Folder.organize_enabled: True
- Folder.folder_type: 'destination'
- Folder.total_count, Folder.pending_count, Folder.emails_count: 0
- Folder.recent_emails: Empty array
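To make the constraints in this section concrete, here is a small self-contained sketch. It uses the standard-library sqlite3 module with an in-memory database purely so it runs anywhere; the real system uses PostgreSQL with SQLAlchemy, and the table names `users` and `folders` and the reduced column set are assumptions for the demo.

```python
import sqlite3

SCHEMA = """
CREATE TABLE users (
    id            INTEGER PRIMARY KEY AUTOINCREMENT,
    first_name    TEXT NOT NULL,
    last_name     TEXT NOT NULL,
    email         TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL
);
CREATE TABLE folders (
    id               INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id          INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    name             TEXT NOT NULL,
    folder_type      TEXT NOT NULL DEFAULT 'destination',
    organize_enabled INTEGER NOT NULL DEFAULT 1,
    emails_count     INTEGER NOT NULL DEFAULT 0,
    UNIQUE (user_id, name)
);
"""

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript(SCHEMA)

conn.execute(
    "INSERT INTO users (first_name, last_name, email, password_hash) "
    "VALUES (?, ?, ?, ?)",
    ("Ada", "Lovelace", "ada@example.com", "<hash>"),
)
conn.execute("INSERT INTO folders (user_id, name) VALUES (1, 'Finance')")


def try_insert(sql: str) -> bool:
    """Return True if the insert succeeds, False on a constraint violation."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# Same folder name for the same user violates UNIQUE (user_id, name).
duplicate_ok = try_insert("INSERT INTO folders (user_id, name) VALUES (1, 'Finance')")
default_type = conn.execute("SELECT folder_type FROM folders").fetchone()[0]

# Deleting the user removes their folders through ON DELETE CASCADE.
conn.execute("DELETE FROM users WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM folders").fetchone()[0]
```

In this run the duplicate insert is rejected, the folder picks up the 'destination' default, and no folders remain after the user is deleted.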
JSON Data Structures
IMAP Configuration
The imap_config field stores JSON with the following structure:
{
  "server": "imap.gmail.com",
  "port": 993,
  "username": "user@example.com",
  "password": "app-specific-password",
  "use_ssl": true,
  "use_tls": false,
  "connection_timeout": 30
}
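Because imap_config is free-form JSON, application code has to validate it on read. The sketch below parses the structure shown above into a typed object. `ImapConfig` and `parse_imap_config` are hypothetical names, and which keys are required versus optional (and their default values) is an assumption; the document only shows one example payload.

```python
import json
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ImapConfig:
    """Typed view of the imap_config JSON (defaults are assumptions)."""

    server: str
    port: int
    username: str
    password: str
    use_ssl: bool = True
    use_tls: bool = False
    connection_timeout: int = 30


REQUIRED_KEYS = ("server", "port", "username", "password")


def parse_imap_config(raw: str) -> ImapConfig:
    """Parse and validate a JSON string taken from User.imap_config."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("imap_config must be a JSON object")

    missing = [key for key in REQUIRED_KEYS if key not in data]
    if missing:
        raise ValueError(f"missing keys: {', '.join(missing)}")

    known = {f.name for f in fields(ImapConfig)}
    unknown = sorted(set(data) - known)
    if unknown:
        raise ValueError(f"unknown keys: {', '.join(unknown)}")

    config = ImapConfig(**data)
    if not 1 <= config.port <= 65535:
        raise ValueError(f"port out of range: {config.port}")
    return config


example = parse_imap_config(
    '{"server": "imap.gmail.com", "port": 993, '
    '"username": "user@example.com", "password": "app-specific-password", '
    '"use_ssl": true, "use_tls": false, "connection_timeout": 30}'
)
```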
Recent Emails
The recent_emails field stores an array of JSON objects:
[
  {
    "subject": "Order Confirmation",
    "date": "2023-11-15T10:30:00Z"
  },
  {
    "subject": "Meeting Reminder",
    "date": "2023-11-14T14:45:00Z"
  }
]
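One way to maintain this array during sync is to prepend new entries and keep only the newest few. The following is a sketch under assumptions: the helper name `add_recent_email` and the cap of five entries are hypothetical, since the document does not state how many recent emails are retained.

```python
from datetime import datetime, timezone
from typing import Dict, List

MAX_RECENT = 5  # hypothetical cap; the document does not specify one


def add_recent_email(
    recent: List[Dict[str, str]],
    subject: str,
    received_at: datetime,
    limit: int = MAX_RECENT,
) -> List[Dict[str, str]]:
    """Return a new newest-first list with the entry added.

    received_at must be timezone-aware; it is stored as UTC in the same
    "YYYY-MM-DDTHH:MM:SSZ" form used in the example above.
    """
    entry = {
        "subject": subject,
        "date": received_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    merged = [entry] + list(recent)
    # Fixed-width UTC timestamps sort correctly as plain strings.
    merged.sort(key=lambda item: item["date"], reverse=True)
    return merged[:limit]


recent = [{"subject": "Meeting Reminder", "date": "2023-11-14T14:45:00Z"}]
updated = add_recent_email(
    recent,
    "Order Confirmation",
    datetime(2023, 11, 15, 10, 30, tzinfo=timezone.utc),
)
```

Returning a new list rather than mutating in place matters with ORM-managed JSON columns, where in-place changes to a list are often not detected as modifications.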
Database Indexes
Current Indexes
- Primary key indexes on User.id and Folder.id
- Foreign key index on Folder.user_id
- Index on User.email for faster login lookups
Recommended Indexes
- Composite index on (user_id, name) for folder uniqueness checks
- Index on Folder.priority for filtering by priority
- Index on Folder.organize_enabled for active/inactive filtering
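The recommended indexes could be created with DDL along these lines. This is a sketch only: it runs against an in-memory SQLite table so it is self-contained, the index names are hypothetical, and the composite index is declared UNIQUE here on the assumption that it should double as the per-user folder-name constraint. The same CREATE INDEX statements are also valid PostgreSQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE folders (
        id               INTEGER PRIMARY KEY,
        user_id          INTEGER NOT NULL,
        name             TEXT NOT NULL,
        priority         INTEGER,
        organize_enabled INTEGER NOT NULL DEFAULT 1
    )
    """
)

# Index names are illustrative, not taken from the migrations.
INDEX_DDL = [
    "CREATE UNIQUE INDEX uq_folders_user_id_name ON folders (user_id, name)",
    "CREATE INDEX ix_folders_priority ON folders (priority)",
    "CREATE INDEX ix_folders_organize_enabled ON folders (organize_enabled)",
]
for ddl in INDEX_DDL:
    conn.execute(ddl)

# PRAGMA index_list returns one row per index; column 1 is the index name.
index_names = sorted(row[1] for row in conn.execute("PRAGMA index_list('folders')"))
```

An index on a boolean column such as organize_enabled has very low selectivity, so it is worth measuring before adding it in production.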
Data Migration History
Migration Files
- Initial Migration (migrations/versions/02a7c13515a4_initial.py)
  - Created basic User and Folder tables
  - Established primary keys and foreign keys
- Add Name Fields (migrations/versions/28e8e0be0355_add_first_name_last_name_and_timestamp_.py)
  - Added first_name and last_name columns to User table
  - Added created_at and updated_at timestamps
- Add Email Count Fields (migrations/versions/a3ad1b9a0e5f_add_email_count_fields_to_folders.py)
  - Added total_count and pending_count columns to Folder table
  - Added organize_enabled boolean flag
- Add Recent Emails Field (migrations/versions/9a88c7e94083_add_recent_emails_field_to_folders_table.py)
  - Added recent_emails JSON column to Folder table
  - Default value set to empty array
- Add Toggle Feature (migrations/versions/f8ba65458ba2_adding_toggle.py)
  - Added organize_enabled toggle functionality
  - Enhanced folder management features
Performance Considerations
- User Authentication
  - Index on email column for fast login lookups
  - Password hash comparison is done in application code
- Folder Operations
  - Foreign key index on user_id for efficient filtering
  - Consider pagination for users with many folders
- IMAP Sync Operations
  - Batch updates for email counts
  - JSON operations for recent emails metadata
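As an illustration of batching count updates during IMAP sync, the sketch below writes all folder counts in one transaction with a single parameterized statement. It uses sqlite3 so that it runs standalone; `apply_count_updates` and the reduced `folders` table are assumptions, not the application's code.

```python
import sqlite3
from typing import Iterable, Tuple


def apply_count_updates(
    conn: sqlite3.Connection,
    updates: Iterable[Tuple[int, int, int]],
) -> int:
    """Apply (folder_id, total_count, pending_count) tuples in one transaction.

    Returns the number of rows changed.
    """
    params = [(total, pending, folder_id) for folder_id, total, pending in updates]
    with conn:  # commits on success, rolls back if any update fails
        cursor = conn.executemany(
            "UPDATE folders SET total_count = ?, pending_count = ? WHERE id = ?",
            params,
        )
    return cursor.rowcount


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE folders ("
    "id INTEGER PRIMARY KEY, "
    "total_count INTEGER NOT NULL DEFAULT 0, "
    "pending_count INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO folders (id) VALUES (?)", [(1,), (2,)])

changed = apply_count_updates(conn, [(1, 120, 7), (2, 40, 0)])
rows = conn.execute(
    "SELECT id, total_count, pending_count FROM folders ORDER BY id"
).fetchall()
```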
Folder Types
The system supports three distinct types of folders, each with different purposes and behaviors:
Tidy Folders
Folders with folder_type = 'tidy' are source folders that contain emails waiting to be processed and organized.
Characteristics:
- Display pending and processed email counts
- Can have organization rules enabled/disabled
- Support viewing pending emails
- Example: Inbox folder
UI Representation:
- Shows "total", "pending", and "processed" count badges
- Includes "View Pending" button if there are pending emails
- May include priority indicators
- Located in "Emails to organize" section
Destination Folders
Folders with folder_type = 'destination' are target folders where emails are moved from other folders during organization.
Characteristics:
- Display count of emails moved to this folder
- Typically don't have organization rules (or they're ignored)
- Focus on showing how many emails have been organized into them
- Example: "Projects", "Finance", "Personal" folders
UI Representation:
- Shows "emails count" badge
- Simpler interface without pending/processed indicators
- Focus on folder management and viewing contents
- Located in "Destination Folders" section
Ignore Folders
Folders with folder_type = 'ignore' are folders that are stored in the database but are neither scanned to be tidied nor used as destination folders.
Characteristics:
- Hidden by default in the user interface
- Not processed by AI for organization
- No organization rules specified
- Known emails count is reset to 0 when changed to this type
- Example: Archive, Spam, Drafts folders
UI Representation:
- Hidden by default unless "Show Hidden" checkbox is checked
- When visible, shows minimal information
- No action buttons for organization or processing
- Located in "Hidden Folders" section
Folder Type Determination
Folder types are determined as follows:
- During IMAP synchronization:
  - First step: Connection testing
  - Second step: Folder type selection modal with table
  - Default folder types:
    - Inbox: Tidy
    - Archive/Spam/Drafts: Ignore
    - All others: Destination
- Manually created folders default to 'destination' (except 'inbox' which defaults to 'tidy')
- Folder type can be changed through the user interface via dropdown
- When changing to 'ignore', emails_count is reset to 0
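The rules above can be sketched as two small functions. The names `default_folder_type` and `change_folder_type` are hypothetical, and matching folder names case-insensitively is an assumption; the document does not say how names are compared.

```python
FOLDER_TYPES = {"tidy", "destination", "ignore"}
IGNORED_BY_DEFAULT = {"archive", "spam", "drafts"}


def default_folder_type(name: str, manual: bool = False) -> str:
    """Pick the initial folder type for a folder name.

    manual=False models the IMAP sync defaults; manual=True models folders
    created by hand, which only special-case the inbox.
    """
    normalized = name.strip().lower()
    if normalized == "inbox":
        return "tidy"
    if not manual and normalized in IGNORED_BY_DEFAULT:
        return "ignore"
    return "destination"


def change_folder_type(folder: dict, new_type: str) -> dict:
    """Return a copy of the folder with its type changed.

    Switching to 'ignore' resets emails_count to 0.
    """
    if new_type not in FOLDER_TYPES:
        raise ValueError(f"unknown folder_type: {new_type!r}")
    updated = dict(folder, folder_type=new_type)
    if new_type == "ignore":
        updated["emails_count"] = 0
    return updated


projects = {"name": "Projects", "folder_type": "destination", "emails_count": 42}
hidden = change_folder_type(projects, "ignore")
```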
ProcessedEmail Model
The ProcessedEmail entity tracks email processing status to avoid reprocessing the same emails during synchronization and provide accurate pending email counts.
Attributes
| Column Name | Data Type | Constraints | Description |
|---|---|---|---|
| id | Integer | Primary Key, Autoincrement | Unique identifier for each processed email record |
| user_id | Integer | Foreign Key to User, Not Null | Reference to the user who owns this email |
| folder_id | Integer | Foreign Key to Folder, Not Null | Reference to the folder this email belongs to |
| email_uid | String(255) | Not Null | Unique ID of the email from IMAP server |
| folder_name | String(255) | Not Null | Name of the IMAP folder (for redundancy) |
| is_processed | Boolean | Default: False | Processing status (false=pending, true=processed) |
| first_seen_at | DateTime | Default: datetime.utcnow | First time this email was detected during sync |
| processed_at | DateTime | Nullable | When the email was marked as processed |
| created_at | DateTime | Default: datetime.utcnow | Record creation timestamp |
| updated_at | DateTime | Default: datetime.utcnow, On Update | Record update timestamp |
Relationships
- Many-to-One: Each ProcessedEmail belongs to one User
- Many-to-One: Each ProcessedEmail belongs to one Folder
- Composite Key: The combination of (user_id, folder_name, email_uid) should be unique to prevent duplicate records
Business Rules
- Each processed email record must belong to a user and folder
- Email UID must be unique per user and folder to prevent duplicates
- Processing status tracks whether an email has been processed
- First seen timestamp tracks when the email was first discovered
- Processed timestamp is set when email is marked as processed
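The deduplication and pending-count behavior described above can be sketched with a unique constraint and an insert that ignores conflicts. This example uses an in-memory SQLite database for self-containment and omits folder_id and the audit timestamps for brevity; the table name and the helper functions are assumptions, not the application's implementation.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Iterable

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE processed_emails (
        id            INTEGER PRIMARY KEY AUTOINCREMENT,
        user_id       INTEGER NOT NULL,
        folder_name   TEXT NOT NULL,
        email_uid     TEXT NOT NULL,
        is_processed  INTEGER NOT NULL DEFAULT 0,
        first_seen_at TEXT NOT NULL,
        processed_at  TEXT,
        UNIQUE (user_id, folder_name, email_uid)
    )
    """
)


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


def record_seen(user_id: int, folder_name: str, uids: Iterable[str]) -> None:
    """Register UIDs found during sync; already-known UIDs are left untouched."""
    now = _utc_now()
    with conn:
        conn.executemany(
            "INSERT OR IGNORE INTO processed_emails "
            "(user_id, folder_name, email_uid, first_seen_at) VALUES (?, ?, ?, ?)",
            [(user_id, folder_name, uid, now) for uid in uids],
        )


def mark_processed(user_id: int, folder_name: str, uid: str) -> None:
    """Flip a pending record to processed and stamp processed_at."""
    with conn:
        conn.execute(
            "UPDATE processed_emails SET is_processed = 1, processed_at = ? "
            "WHERE user_id = ? AND folder_name = ? AND email_uid = ? "
            "AND is_processed = 0",
            (_utc_now(), user_id, folder_name, uid),
        )


def pending_count(user_id: int, folder_name: str) -> int:
    """Count records for the folder that are still pending."""
    row = conn.execute(
        "SELECT COUNT(*) FROM processed_emails "
        "WHERE user_id = ? AND folder_name = ? AND is_processed = 0",
        (user_id, folder_name),
    ).fetchone()
    return row[0]


record_seen(1, "INBOX", ["101", "102", "103"])
record_seen(1, "INBOX", ["102", "103", "104"])  # a second sync overlaps the first
mark_processed(1, "INBOX", "101")
```

After these calls the table holds four distinct records for the folder, three of them still pending. In PostgreSQL the equivalent of `INSERT OR IGNORE` is `INSERT ... ON CONFLICT DO NOTHING`.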
Future Data Model Considerations
Potential Enhancements
- Email Movement Tracking
  - Store email movement history between folders
  - Track source and destination folder for each moved email
- Rule Engine
  - Store parsed rule structures for better processing
  - Version control for rule changes
- User Preferences
  - Additional customization options
  - UI preference storage
- Audit Log
  - Track changes to user data
  - Monitor folder operations
  - Log email processing actions