email-organizer/docs/design/data-model.md
2025-08-06 15:38:49 -07:00

# Email Organizer Data Model
## Overview
This document describes the data model for the Email Organizer application, including entities, attributes, relationships, and constraints. The system uses PostgreSQL with SQLAlchemy ORM for data persistence.
## Entity Relationship Diagram
```mermaid
erDiagram
    USER {
        int id PK "Primary Key"
        string first_name "Not Null"
        string last_name "Not Null"
        string email "Unique, Not Null"
        string password_hash "Not Null"
        json imap_config "JSON Configuration"
        datetime created_at "Default: UTC Now"
        datetime updated_at "Default: UTC Now, On Update"
    }
    FOLDER {
        int id PK "Primary Key"
        int user_id FK "Foreign Key to User"
        string name "Not Null"
        text rule_text "Natural Language Rule"
        int priority "Processing Order"
        boolean organize_enabled "Default: True"
        string folder_type "Default: 'destination'"
        int total_count "Default: 0"
        int pending_count "Default: 0"
        int emails_count "Default: 0"
        json recent_emails "JSON Array"
        datetime created_at "Default: UTC Now"
        datetime updated_at "Default: UTC Now, On Update"
    }
    USER ||--o{ FOLDER : "has"
    %% User-Folder relationship
    %% One-to-many: each user can have multiple folders
```
## Entities
### User Entity
The `User` entity stores account information and authentication data for each user.
#### Attributes
| Column Name | Data Type | Constraints | Description |
|-------------|------------|--------------|-------------|
| id | Integer | Primary Key, Autoincrement | Unique identifier for each user |
| first_name | String(255) | Not Null | User's first name |
| last_name | String(255) | Not Null | User's last name |
| email | String(255) | Unique, Not Null | User's email address (login identifier) |
| password_hash | String(2048) | Not Null | Hashed password for authentication |
| imap_config | JSON | Nullable | IMAP server configuration settings |
| created_at | DateTime | Default: datetime.utcnow | Timestamp of account creation |
| updated_at | DateTime | Default: datetime.utcnow, On Update | Timestamp of last update |
#### Relationships
- **One-to-Many**: Each `User` can have multiple `Folder` instances
- **Self-referencing**: None. A `User` has no direct relationship to other `User` instances
#### Business Rules
- Email must be unique across all users
- Password is stored as a hash, never in plain text
- IMAP configuration is stored as JSON for flexibility
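
The rule that passwords are stored only as hashes can be illustrated with a minimal sketch. The document does not say which hashing scheme the application uses, so the PBKDF2 parameters, the `$`-separated storage format, and the function names `hash_password` and `verify_password` below are assumptions made for illustration only.

```python
import hashlib
import hmac
import os

# Assumed work factor; the application's real scheme is not specified in this document.
ITERATIONS = 200_000


def hash_password(password: str) -> str:
    """Return a salted PBKDF2 hash encoded as one string for the password_hash column."""
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"pbkdf2_sha256${ITERATIONS}${salt.hex()}${digest.hex()}"


def verify_password(password: str, stored: str) -> bool:
    """Recompute the hash from the stored salt and compare in constant time."""
    algorithm, iterations, salt_hex, digest_hex = stored.split("$")
    if algorithm != "pbkdf2_sha256":
        return False
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

The encoded string is far shorter than the `String(2048)` limit on `password_hash`, and the plain-text password never needs to be persisted.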
### Folder Entity
The `Folder` entity stores email organization rules and metadata for each user's email folders.
#### Attributes
| Column Name | Data Type | Constraints | Description |
|-------------|------------|--------------|-------------|
| id | Integer | Primary Key, Autoincrement | Unique identifier for each folder |
| user_id | Integer | Foreign Key to User, Not Null | Reference to the owning user |
| name | String(255) | Not Null | Display name of the folder |
| rule_text | Text | Nullable | Natural language description of the folder rule |
| priority | Integer | Nullable | Processing order (0=normal, 1=high) |
| organize_enabled | Boolean | Default: True | Whether the organization rule is active |
| folder_type | String(20) | Default: 'destination' | Folder type: 'tidy' or 'destination' |
| total_count | Integer | Default: 0 | Total number of emails in the folder |
| pending_count | Integer | Default: 0 | Number of emails waiting to be processed |
| emails_count | Integer | Default: 0 | Number of emails moved to this destination folder |
| recent_emails | JSON | Default: [] | Array of recent email metadata |
| created_at | DateTime | Default: datetime.utcnow | Timestamp of folder creation |
| updated_at | DateTime | Default: datetime.utcnow, On Update | Timestamp of last update |
#### Relationships
- **Many-to-One**: Each `Folder` belongs to one `User`
- **Self-referencing**: None. A `Folder` has no direct relationship to other `Folder` instances
#### Business Rules
- Each folder must belong to a user
- Folder name must be unique per user
- Rule text can be null (for manually created folders)
- Priority values: 0 (normal), 1 (high priority)
- Folder types:
  - 'tidy': Folders containing emails to be processed (e.g., Inbox)
  - 'destination': Folders that are targets for email organization (default)
- Recent emails array stores JSON objects with subject and date information
## Data Constraints
### Primary Keys
- `User.id`: Integer, auto-incrementing
- `Folder.id`: Integer, auto-incrementing
### Foreign Keys
- `Folder.user_id`: References `User.id` with ON DELETE CASCADE
### Unique Constraints
- `User.email`: Ensures no duplicate email addresses
- Composite unique constraint on `(Folder.user_id, Folder.name)` to prevent duplicate folder names per user
### Not Null Constraints
- `User.first_name`, `User.last_name`, `User.email`, `User.password_hash`
- `Folder.user_id`, `Folder.name`
### Default Values
- `User.created_at`, `User.updated_at`: Current UTC timestamp
- `Folder.created_at`, `Folder.updated_at`: Current UTC timestamp
- `Folder.organize_enabled`: True
- `Folder.folder_type`: 'destination'
- `Folder.total_count`, `Folder.pending_count`, `Folder.emails_count`: 0
- `Folder.recent_emails`: Empty array
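
The constraints above can be tried out with a small, self-contained sketch. It uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL and SQLAlchemy so that it runs without extra dependencies. The table name `users`, the reduced column set, and the helper `open_demo_db` are assumptions for illustration, not the application's actual schema definition.

```python
import sqlite3

# Reduced schema: only the columns needed to show the constraints and defaults.
SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    first_name TEXT NOT NULL,
    last_name TEXT NOT NULL,
    email TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL
);
CREATE TABLE folders (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    name TEXT NOT NULL,
    organize_enabled INTEGER DEFAULT 1,
    folder_type TEXT DEFAULT 'destination',
    total_count INTEGER DEFAULT 0,
    pending_count INTEGER DEFAULT 0,
    emails_count INTEGER DEFAULT 0,
    recent_emails TEXT DEFAULT '[]',
    UNIQUE (user_id, name)
);
"""


def open_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn
```

With this schema, inserting a second folder with the same `(user_id, name)` pair raises `sqlite3.IntegrityError`, and deleting a user removes that user's folders through the cascade.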
## JSON Data Structures
### IMAP Configuration
The `imap_config` field stores JSON with the following structure:
```json
{
  "server": "imap.gmail.com",
  "port": 993,
  "username": "user@example.com",
  "password": "app-specific-password",
  "use_ssl": true,
  "use_tls": false,
  "connection_timeout": 30
}
```
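
Because `imap_config` is free-form JSON, it can help to validate it when it is read. The following is a minimal sketch of one way to do that. The `ImapConfig` dataclass, the `parse_imap_config` helper, the choice of required keys, and the fallback defaults are all assumptions; the document only specifies the key names shown above.

```python
from dataclasses import dataclass
from typing import Any, Mapping

# Assumed: these four keys are treated as mandatory, the rest as optional.
REQUIRED_KEYS = ("server", "port", "username", "password")


@dataclass(frozen=True)
class ImapConfig:
    server: str
    port: int
    username: str
    password: str
    use_ssl: bool = True          # assumed default
    use_tls: bool = False         # assumed default
    connection_timeout: int = 30  # seconds; assumed default


def parse_imap_config(raw: Mapping[str, Any]) -> ImapConfig:
    """Convert a decoded imap_config JSON object into a typed, validated value."""
    missing = [key for key in REQUIRED_KEYS if key not in raw]
    if missing:
        raise ValueError("imap_config is missing keys: " + ", ".join(missing))
    config = ImapConfig(
        server=str(raw["server"]),
        port=int(raw["port"]),
        username=str(raw["username"]),
        password=str(raw["password"]),
        use_ssl=bool(raw.get("use_ssl", True)),
        use_tls=bool(raw.get("use_tls", False)),
        connection_timeout=int(raw.get("connection_timeout", 30)),
    )
    if not 1 <= config.port <= 65535:
        raise ValueError(f"port out of range: {config.port}")
    return config
```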
### Recent Emails
The `recent_emails` field stores an array of JSON objects:
```json
[
  {
    "subject": "Order Confirmation",
    "date": "2023-11-15T10:30:00Z"
  },
  {
    "subject": "Meeting Reminder",
    "date": "2023-11-14T14:45:00Z"
  }
]
```
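
One way to maintain this array is to prepend each new entry, keep the list ordered newest first, and cap its length. The sketch below assumes a cap of 10 entries and a helper named `add_recent_email`; neither is specified by this document. It also assumes the caller passes a timezone-aware `datetime`.

```python
from datetime import datetime, timezone
from typing import Dict, List

MAX_RECENT = 10  # assumed cap; the document does not state a limit


def add_recent_email(
    recent: List[Dict[str, str]],
    subject: str,
    received_at: datetime,
    limit: int = MAX_RECENT,
) -> List[Dict[str, str]]:
    """Return a new list with the email added, newest first, trimmed to `limit`."""
    entry = {
        "subject": subject,
        # Normalize to UTC and use the same format as the stored examples.
        "date": received_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
    }
    merged = [entry, *recent]
    # Timestamps share one fixed-width UTC format, so string order is time order.
    merged.sort(key=lambda item: item["date"], reverse=True)
    return merged[:limit]
```

The function returns a new list instead of mutating its input, so the result can be assigned back to `recent_emails` as a whole value.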
## Database Indexes
### Current Indexes
- Primary key indexes on `User.id` and `Folder.id`
- Foreign key index on `Folder.user_id`
- Index on `User.email` for faster login lookups
### Recommended Indexes
- Composite index on `Folder(user_id, name)` for folder uniqueness checks; a PostgreSQL unique constraint on these columns creates this index automatically
- Index on `Folder.priority` for filtering by priority
- Index on `Folder.organize_enabled` for active/inactive filtering
## Data Migration History
### Migration Files
1. **Initial Migration** ([`migrations/versions/02a7c13515a4_initial.py`](migrations/versions/02a7c13515a4_initial.py:1))
   - Created basic User and Folder tables
   - Established primary keys and foreign keys
2. **Add Name Fields** ([`migrations/versions/28e8e0be0355_add_first_name_last_name_and_timestamp_.py`](migrations/versions/28e8e0be0355_add_first_name_last_name_and_timestamp_.py:1))
   - Added first_name and last_name columns to User table
   - Added created_at and updated_at timestamps
3. **Add Email Count Fields** ([`migrations/versions/a3ad1b9a0e5f_add_email_count_fields_to_folders.py`](migrations/versions/a3ad1b9a0e5f_add_email_count_fields_to_folders.py:1))
   - Added total_count and pending_count columns to Folder table
   - Added organize_enabled boolean flag
4. **Add Recent Emails Field** ([`migrations/versions/9a88c7e94083_add_recent_emails_field_to_folders_table.py`](migrations/versions/9a88c7e94083_add_recent_emails_field_to_folders_table.py:1))
   - Added recent_emails JSON column to Folder table
   - Default value set to empty array
5. **Add Toggle Feature** ([`migrations/versions/f8ba65458ba2_adding_toggle.py`](migrations/versions/f8ba65458ba2_adding_toggle.py:1))
   - Added organize_enabled toggle functionality
   - Enhanced folder management features
## Performance Considerations
1. **User Authentication**
   - Index on email column for fast login lookups
   - Password hash comparison is done in application code
2. **Folder Operations**
   - Foreign key index on user_id for efficient filtering
   - Consider pagination for users with many folders
3. **IMAP Sync Operations**
   - Batch updates for email counts
   - JSON operations for recent emails metadata
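
The batch update of email counts mentioned above might look like the following sketch, which writes all counts collected during one sync pass in a single transaction. It uses the built-in `sqlite3` module as a stand-in for the PostgreSQL connection, and the function name `apply_count_updates` and the tuple layout are assumptions for illustration.

```python
import sqlite3
from typing import Iterable, Tuple

# (folder_id, total_count, pending_count) gathered during one sync pass.
CountUpdate = Tuple[int, int, int]


def apply_count_updates(conn: sqlite3.Connection, updates: Iterable[CountUpdate]) -> None:
    """Write all folder counts in one transaction instead of one commit per folder."""
    # Reorder each tuple to match the placeholder order in the statement.
    params = [(total, pending, folder_id) for folder_id, total, pending in updates]
    with conn:  # commits on success, rolls back if any statement fails
        conn.executemany(
            "UPDATE folders SET total_count = ?, pending_count = ? WHERE id = ?",
            params,
        )
```

Grouping the writes this way keeps a sync pass to one commit, and a failure part-way through leaves the previous counts intact.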
## Folder Types
The system supports two distinct types of folders, each with different purposes and behaviors:
### Tidy Folders
Folders with `folder_type = 'tidy'` are source folders that contain emails waiting to be processed and organized.
**Characteristics:**
- Display pending and processed email counts
- Can have organization rules enabled/disabled
- Support viewing pending emails
- Example: Inbox folder
**UI Representation:**
- Shows "pending count" and "processed count" badges
- Includes "View Pending" button if there are pending emails
- May include priority indicators
### Destination Folders
Folders with `folder_type = 'destination'` are target folders that receive emails moved from other folders during organization.
**Characteristics:**
- Display count of emails moved to this folder
- Typically don't have organization rules (or they're ignored)
- Focus on showing how many emails have been organized into them
- Example: "Projects", "Finance", "Personal" folders
**UI Representation:**
- Shows "emails count" badge
- Simpler interface without pending/processed indicators
- Focus on folder management and viewing contents
### Folder Type Determination
Folder types are determined as follows:
- During IMAP synchronization:
  - Folders named "inbox" (case-insensitive) are automatically set as 'tidy'
  - All other folders are set as 'destination'
- Manually created folders default to 'destination'
- Folder type can be changed through administrative functions
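
The determination rules above can be sketched as a small function. The name `determine_folder_type` and the `from_imap_sync` flag are hypothetical; only the rules themselves come from this document.

```python
def determine_folder_type(name: str, from_imap_sync: bool = True) -> str:
    """Return 'tidy' or 'destination' following the rules listed above."""
    # Only folders discovered during IMAP synchronization can become 'tidy',
    # and only when they are named "inbox" in any letter case.
    if from_imap_sync and name.lower() == "inbox":
        return "tidy"
    # Every other synced folder, and every manually created folder,
    # gets the default type.
    return "destination"
```

For example, `determine_folder_type("INBOX")` yields `'tidy'`, while a synced "Finance" folder or a manually created folder yields `'destination'`.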
## Future Data Model Considerations
### Potential Enhancements
1. **Email Entity**
   - Store email metadata for better analytics
   - Track email movement between folders
2. **Rule Engine**
   - Store parsed rule structures for better processing
   - Version control for rule changes
3. **User Preferences**
   - Additional customization options
   - UI preference storage
4. **Audit Log**
   - Track changes to user data
   - Monitor folder operations