# Email Organizer Data Model

## Overview

This document describes the data model for the Email Organizer application, including entities, attributes, relationships, and constraints. The system uses PostgreSQL with SQLAlchemy ORM for data persistence.

## Entity Relationship Diagram

```mermaid
erDiagram
    USER {
        int id PK "Primary Key"
        string first_name "Not Null"
        string last_name "Not Null"
        string email "Unique, Not Null"
        string password_hash "Not Null"
        json imap_config "JSON Configuration"
        datetime created_at "Default: UTC Now"
        datetime updated_at "Default: UTC Now, On Update"
    }
    FOLDER {
        int id PK "Primary Key"
        int user_id FK "Foreign Key to User"
        string name "Not Null"
        text rule_text "Natural Language Rule"
        int priority "Processing Order"
        boolean organize_enabled "Default: True"
        string folder_type "Default: 'destination'"
        int total_count "Default: 0"
        int pending_count "Default: 0"
        int emails_count "Default: 0"
        json recent_emails "JSON Array"
        datetime created_at "Default: UTC Now"
        datetime updated_at "Default: UTC Now, On Update"
    }
    USER ||--o{ FOLDER : "has"
    %% User-Folder relationship
    %% One-to-Many: each user can have multiple folders
```

## Entities

### User Entity

The `User` entity stores account information and authentication data for each user.

#### Attributes

| Column Name | Data Type | Constraints | Description |
|-------------|-----------|-------------|-------------|
| id | Integer | Primary Key, Autoincrement | Unique identifier for each user |
| first_name | String(255) | Not Null | User's first name |
| last_name | String(255) | Not Null | User's last name |
| email | String(255) | Unique, Not Null | User's email address (login identifier) |
| password_hash | String(2048) | Not Null | Hashed password for authentication |
| imap_config | JSON | Nullable | IMAP server configuration settings |
| created_at | DateTime | Default: datetime.utcnow | Timestamp of account creation |
| updated_at | DateTime | Default: datetime.utcnow, On Update | Timestamp of last update |

#### Relationships

- **One-to-Many**: Each `User` can have multiple `Folder` instances
- **Self-referencing**: No direct relationships to other User instances

#### Business Rules

- Email must be unique across all users
- Password is stored as a hash, never in plain text
- IMAP configuration is stored as JSON for flexibility

### Folder Entity

The `Folder` entity stores email organization rules and metadata for each user's email folders.

#### Attributes

| Column Name | Data Type | Constraints | Description |
|-------------|-----------|-------------|-------------|
| id | Integer | Primary Key, Autoincrement | Unique identifier for each folder |
| user_id | Integer | Foreign Key to User, Not Null | Reference to the owning user |
| name | String(255) | Not Null | Display name of the folder |
| rule_text | Text | Nullable | Natural language description of the folder rule |
| priority | Integer | Nullable | Processing order (0=normal, 1=high) |
| organize_enabled | Boolean | Default: True | Whether the organization rule is active |
| folder_type | String(20) | Default: 'destination' | Folder type: 'tidy', 'destination', or 'ignore' |
| total_count | Integer | Default: 0 | Total number of emails in the folder |
| pending_count | Integer | Default: 0 | Number of emails waiting to be processed |
| emails_count | Integer | Default: 0 | Number of emails moved to this destination folder |
| recent_emails | JSON | Default: [] | Array of recent email metadata |
| created_at | DateTime | Default: datetime.utcnow | Timestamp of folder creation |
| updated_at | DateTime | Default: datetime.utcnow, On Update | Timestamp of last update |

#### Relationships

- **Many-to-One**: Each `Folder` belongs to one `User`
- **Self-referencing**: No direct relationships to other Folder instances

#### Business Rules

- Each folder must belong to a user
- Folder name must be unique per user
- Rule text can be null (for manually created folders)
- Priority values: 0 (normal), 1 (high priority)
- Folder types:
  - 'tidy': Folders containing emails to be processed (e.g., Inbox)
  - 'destination': Folders that are targets for email organization (default)
  - 'ignore': Folders that are neither scanned for tidying nor used as destinations (e.g., Archive, Spam, Drafts)
- Recent emails array stores JSON objects with subject and date information

## Data Constraints

### Primary Keys

- `User.id`: Integer, auto-incrementing
- `Folder.id`: Integer, auto-incrementing

### Foreign Keys

- `Folder.user_id`: References `User.id` with ON DELETE CASCADE

### Unique Constraints

- `User.email`: Ensures no duplicate email addresses
- Composite unique constraint on `(Folder.user_id, Folder.name)` to prevent duplicate folder names per user

### Not Null Constraints

- `User.first_name`, `User.last_name`, `User.email`, `User.password_hash`
- `Folder.user_id`, `Folder.name`

### Default Values

- `User.created_at`, `User.updated_at`: Current UTC timestamp
- `Folder.created_at`, `Folder.updated_at`: Current UTC timestamp
- `Folder.organize_enabled`: True
- `Folder.folder_type`: 'destination'
- `Folder.total_count`, `Folder.pending_count`, `Folder.emails_count`: 0
- `Folder.recent_emails`: Empty array

## JSON Data Structures

### IMAP Configuration

The `imap_config` field stores JSON with the following structure:

```json
{
  "server": "imap.gmail.com",
  "port": 993,
  "username": "user@example.com",
  "password": "app-specific-password",
  "use_ssl": true,
  "use_tls": false,
  "connection_timeout": 30
}
```

### Recent Emails

The `recent_emails` field stores an array of JSON objects:

```json
[
  {
    "subject": "Order Confirmation",
    "date": "2023-11-15T10:30:00Z"
  },
  {
    "subject": "Meeting Reminder",
    "date": "2023-11-14T14:45:00Z"
  }
]
```

## Database Indexes

### Current Indexes

- Primary key indexes on `User.id` and `Folder.id`
- Foreign key index on `Folder.user_id`
- Index on `User.email` for faster login lookups

### Recommended Indexes

- Composite index on `(user_id, name)` for folder uniqueness checks
- Index on `Folder.priority` for filtering by priority
- Index on `Folder.organize_enabled` for active/inactive filtering

## Data Migration History

### Migration Files

1. **Initial Migration** ([`migrations/versions/02a7c13515a4_initial.py`](migrations/versions/02a7c13515a4_initial.py:1))
   - Created basic User and Folder tables
   - Established primary keys and foreign keys

2. **Add Name Fields** ([`migrations/versions/28e8e0be0355_add_first_name_last_name_and_timestamp_.py`](migrations/versions/28e8e0be0355_add_first_name_last_name_and_timestamp_.py:1))
   - Added first_name and last_name columns to User table
   - Added created_at and updated_at timestamps

3. **Add Email Count Fields** ([`migrations/versions/a3ad1b9a0e5f_add_email_count_fields_to_folders.py`](migrations/versions/a3ad1b9a0e5f_add_email_count_fields_to_folders.py:1))
   - Added total_count and pending_count columns to Folder table
   - Added organize_enabled boolean flag

4. **Add Recent Emails Field** ([`migrations/versions/9a88c7e94083_add_recent_emails_field_to_folders_table.py`](migrations/versions/9a88c7e94083_add_recent_emails_field_to_folders_table.py:1))
   - Added recent_emails JSON column to Folder table
   - Default value set to empty array

5. **Add Toggle Feature** ([`migrations/versions/f8ba65458ba2_adding_toggle.py`](migrations/versions/f8ba65458ba2_adding_toggle.py:1))
   - Added organize_enabled toggle functionality
   - Enhanced folder management features

### Performance Considerations

1. **User Authentication**
   - Index on email column for fast login lookups
   - Password hash comparison is done in application code

2. **Folder Operations**
   - Foreign key index on user_id for efficient filtering
   - Consider pagination for users with many folders

3. **IMAP Sync Operations**
   - Batch updates for email counts
   - JSON operations for recent emails metadata

## Folder Types

The system supports three distinct types of folders, each with different purposes and behaviors:

### Tidy Folders

Folders with `folder_type = 'tidy'` are source folders that contain emails waiting to be processed and organized.

**Characteristics:**

- Display pending and processed email counts
- Can have organization rules enabled/disabled
- Support viewing pending emails
- Example: Inbox folder

**UI Representation:**

- Shows "pending count" and "processed count" badges
- Includes "View Pending" button if there are pending emails
- May include priority indicators

### Destination Folders

Folders with `folder_type = 'destination'` are target folders into which emails are moved from other folders during organization.

**Characteristics:**

- Display count of emails moved to this folder
- Typically don't have organization rules (any rules present are ignored)
- Focus on showing how many emails have been organized into them
- Example: "Projects", "Finance", "Personal" folders

**UI Representation:**

- Shows "emails count" badge
- Simpler interface without pending/processed indicators
- Focus on folder management and viewing contents

### Ignore Folders

Folders with `folder_type = 'ignore'` are stored in the database but are neither scanned for tidying nor used as destination folders.

**Characteristics:**

- Hidden by default in the user interface
- Not processed by AI for organization
- No organization rules specified
- `emails_count` is reset to 0 when a folder is changed to this type
- Example: Archive, Spam, Drafts folders

**UI Representation:**

- Hidden by default unless "Show Hidden" checkbox is checked
- When visible, shows minimal information
- No action buttons for organization or processing

### Folder Type Determination

Folder types are determined as follows:

- During IMAP synchronization:
  - First step: Connection testing
  - Second step: Folder type selection modal with table
  - Default folder types:
    - Inbox: Tidy
    - Archive/Spam/Drafts: Ignore
    - All others: Destination
- Manually created folders default to 'destination'
- Folder type can be changed through the user interface
- When changing to 'ignore', emails_count is reset to 0

## Future Data Model Considerations

### Potential Enhancements

1. **Email Entity**
   - Store email metadata for better analytics
   - Track email movement between folders

2. **Rule Engine**
   - Store parsed rule structures for better processing
   - Version control for rule changes

3. **User Preferences**
   - Additional customization options
   - UI preference storage

4. **Audit Log**
   - Track changes to user data
   - Monitor folder operations
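
## Appendix: Folder Type Defaults Sketch

The default folder types and the `emails_count` reset described under Folder Type Determination can be sketched in plain Python as follows. This is a minimal illustration, not the application's actual code: `default_folder_type`, `change_folder_type`, and `FolderRecord` are hypothetical names, and case-insensitive name matching is an assumption of the sketch.

```python
from dataclasses import dataclass

# Folder names that default to 'ignore' during IMAP synchronization.
# Matching names case-insensitively is an assumption of this sketch.
IGNORED_NAMES = {"archive", "spam", "drafts"}
VALID_TYPES = {"tidy", "destination", "ignore"}


def default_folder_type(name: str) -> str:
    """Return the default folder_type for a folder found during IMAP sync."""
    normalized = name.strip().lower()
    if normalized == "inbox":
        return "tidy"
    if normalized in IGNORED_NAMES:
        return "ignore"
    return "destination"


@dataclass
class FolderRecord:
    """Hypothetical in-memory stand-in for a row of the Folder table."""
    name: str
    folder_type: str = "destination"  # manually created folders default here
    emails_count: int = 0


def change_folder_type(folder: FolderRecord, new_type: str) -> None:
    """Change a folder's type; reset emails_count when it becomes 'ignore'."""
    if new_type not in VALID_TYPES:
        raise ValueError(f"unknown folder type: {new_type!r}")
    folder.folder_type = new_type
    if new_type == "ignore":
        folder.emails_count = 0
```

In a real deployment the reset would be applied to the SQLAlchemy `Folder` row inside the same transaction as the type change, so the two fields cannot drift apart.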