diff --git a/docs/design/authentication-system.md b/docs/design/authentication-system.md new file mode 100644 index 0000000..04ff394 --- /dev/null +++ b/docs/design/authentication-system.md @@ -0,0 +1,353 @@ +# Authentication System Design + +## Overview + +The Email Organizer implements a complete user authentication system using Flask-Login for session management and Werkzeug for password security. This document provides a detailed overview of the authentication architecture, components, and implementation details. + +## System Architecture + +```mermaid +graph TB + subgraph "Authentication Flow" + A[User Browser] --> B[Login Page] + B --> C[Submit Credentials] + C --> D[Validate Input] + D --> E[Check Password Hash] + E --> F[Create Session] + F --> G[Redirect to Dashboard] + end + + subgraph "Backend Components" + H[Auth Blueprint] --> I[Validation Logic] + H --> J[Password Hashing] + H --> K[Session Management] + I --> L[User Model] + J --> L + K --> M[Flask-Login] + end + + subgraph "Security Measures" + N[Password Validation] --> O[Complexity Requirements] + N --> P[Hashing Algorithm] + Q[Session Security] --> R[Cookie Protection] + Q --> S[Timeout Handling] + end + + C --> D + D --> I + I --> J + J --> K + K --> F +``` + +## Components + +### 1. Auth Blueprint ([`app/auth.py`](app/auth.py:1)) + +The authentication blueprint handles all authentication-related routes and logic: + +#### Routes +- `/login`: GET/POST - User login page and authentication +- `/signup`: GET/POST - User registration page and account creation +- `/logout`: GET - User logout and session termination + +#### Key Functions +- `login()`: Handles user authentication +- `signup()`: Manages user registration +- `logout()`: Terminates user sessions +- `validate_password()`: Enforces password complexity requirements + +### 2. 
User Model ([`app/models.py`](app/models.py:13)) + +The User model extends Flask-Login's UserMixin to provide authentication functionality: + +#### Authentication Methods +- `set_password()`: Hashes and stores passwords +- `check_password()`: Verifies password against hash + +#### Security Attributes +- Password hashing using Werkzeug's secure hashing algorithm +- Email uniqueness constraint +- Timestamp tracking for account management + +### 3. Flask-Login Integration + +The application uses Flask-Login for session management: + +#### Configuration +```python +from flask_login import LoginManager + +login_manager = LoginManager() +login_manager.login_view = 'auth.login' +login_manager.login_message = 'Please log in to access this page.' +login_manager.login_message_category = 'warning' +``` + +#### User Loader +```python +@login_manager.user_loader +def load_user(user_id): + return User.query.get(int(user_id)) +``` + +## Authentication Flow + +### User Registration + +```mermaid +sequenceDiagram + participant U as User + participant B as Browser + participant A as Auth Blueprint + participant DB as Database + + U->>B: Navigate to /auth/signup + B->>A: GET /auth/signup + A->>B: Return signup form + U->>B: Fill and submit form + B->>A: POST /auth/signup + + loop Validation + A->>A: Validate input fields + A->>A: Check email uniqueness + A->>A: Validate password complexity + end + + A->>A: Hash password + A->>DB: Create new user record with password hash + DB-->>A: User created + A->>B: Log in user and set session cookie + B->>U: Redirect to dashboard +``` + +### User Login + +```mermaid +sequenceDiagram + participant U as User + participant B as Browser + participant A as Auth Blueprint + participant DB as Database + + U->>B: Navigate to /auth/login + B->>A: GET /auth/login + A->>B: Return login form + U->>B: Submit credentials + B->>A: POST /auth/login + + A->>DB: Query user by email + DB-->>A: User data + + alt User exists + A->>A: Verify password hash + alt Password correct + A->>A: Create session + A->>B: Redirect to dashboard + else 
Password incorrect + A->>B: Show error message + end + else User not found + A->>B: Show error message + end +``` + +### Password Validation + +The system enforces strong password requirements: + +```python +import re + +def validate_password(password): + """Validate password strength.""" + if len(password) < 8: + return False, "Password must be at least 8 characters long" + if not re.search(r'[A-Z]', password): + return False, "Password must contain at least one uppercase letter" + if not re.search(r'[a-z]', password): + return False, "Password must contain at least one lowercase letter" + if not re.search(r'\d', password): + return False, "Password must contain at least one digit" + return True, "Password is valid" +``` + +### Session Management + +Session security combines Flask-Login with Flask's own session cookies: + +- **Session Cookies**: Secure, HttpOnly cookies for session storage +- **CSRF Protection**: CSRF protection for form submissions (Flask-Login does not provide this itself; it is typically added with an extension such as Flask-WTF) +- **Session Timeout**: Automatic session expiration +- **Remember Me**: Optional persistent login functionality + +## Security Measures + +### 1. Password Security + +#### Hashing Algorithm +- Uses Werkzeug's `generate_password_hash()` and `check_password_hash()` +- Stores a salted, key-stretched hash: PBKDF2 with SHA256 (the default method depends on the Werkzeug version; Werkzeug 3.x defaults to scrypt unless a method is passed explicitly) +- Random salt generation for each password + +#### Password Complexity +- Minimum 8 characters +- At least one uppercase letter +- At least one lowercase letter +- At least one digit +- No maximum length limit + +### 2. Input Validation + +#### Client-Side Validation +- HTML5 form validation +- JavaScript feedback for user experience + +#### Server-Side Validation +- Comprehensive input sanitization +- Email format validation +- Length restrictions for all fields +- SQL injection prevention through SQLAlchemy ORM + +### 3. 
Session Security + +#### Cookie Protection +- Secure flag for HTTPS environments +- HttpOnly flag to prevent JavaScript access +- SameSite policy for cross-site request protection + +#### Session Management +- Automatic session regeneration on login +- Session timeout handling +- Logout cleanup + +### 4. Error Handling + +#### User-Friendly Messages +- Clear validation error messages +- Generic error messages for security-sensitive operations +- No exposure of internal system details + +#### Logging +- Authentication attempt logging +- Security event tracking +- Error debugging information + +## API Endpoints + +### Authentication Endpoints + +| Endpoint | Method | Description | Authentication Required | +|----------|--------|-------------|---------------------------| +| `/auth/login` | GET/POST | User login | No | +| `/auth/signup` | GET/POST | User registration | No | +| `/auth/logout` | GET | User logout | Yes | + +### Response Formats + +#### Login Success +```http +HTTP/1.1 302 Found +Location: / +Set-Cookie: session=; HttpOnly; Secure; Path=/ +``` + +#### Login Failure +```http +HTTP/1.1 200 OK +Content-Type: text/html + + +``` + +#### Registration Success +```http +HTTP/1.1 302 Found +Location: / +Set-Cookie: session=; HttpOnly; Secure; Path=/ +``` + +#### Registration Failure +```http +HTTP/1.1 200 OK +Content-Type: text/html + + +``` + +## Configuration + +### Environment Variables + +| Variable | Description | Default Value | +|----------|-------------|--------------| +| `SECRET_KEY` | Flask secret key used to sign session cookies (set a strong random value in production) | `dev-secret-key` | + +### Flask-Login Configuration + +```python +from flask_login import LoginManager + +login_manager = LoginManager() +login_manager.login_view = 'auth.login' +login_manager.login_message = 'Please log in to access this page.' 
+login_manager.login_message_category = 'warning' +``` + +## Testing Strategy + +### Unit Tests + +#### Authentication Tests +- Test password validation logic +- Test password hashing and verification +- Test user creation and validation +- Test session creation and management + +#### Integration Tests +- Test login flow with valid credentials +- Test login flow with invalid credentials +- Test registration flow with valid data +- Test registration flow with invalid data +- Test session persistence and timeout + +### Security Tests + +- Test SQL injection attempts +- Test XSS vulnerabilities +- Test session hijacking prevention +- Test CSRF protection + +## Performance Considerations + +### Database Optimization +- Index on email column for fast login lookups +- Efficient password hashing with proper salting +- Session data stored in server-side session store + +### Security vs. Usability Balance +- Reasonable password complexity requirements +- Clear error messages for failed login attempts +- Session timeout balanced with user convenience + +## Future Enhancements + +### 1. Multi-Factor Authentication +- SMS-based 2FA +- TOTP (Time-based One-Time Password) support +- Hardware token integration + +### 2. OAuth Integration +- Google OAuth +- Facebook OAuth +- GitHub OAuth + +### 3. Password Reset +- Email-based password reset +- Secure token generation +- Expiration handling + +### 4. Account Management +- User profile management +- Email address changes +- Password change functionality + +### 5. 
Security Enhancements +- Rate limiting for login attempts +- Account lockout after failed attempts +- Suspicious activity monitoring +- IP-based security checks \ No newline at end of file diff --git a/docs/design/data-model.md b/docs/design/data-model.md new file mode 100644 index 0000000..7a2c319 --- /dev/null +++ b/docs/design/data-model.md @@ -0,0 +1,240 @@ +# Email Organizer Data Model + +## Overview + +This document describes the data model for the Email Organizer application, including entities, attributes, relationships, and constraints. The system uses PostgreSQL with SQLAlchemy ORM for data persistence. + +## Entity Relationship Diagram + +```mermaid +erDiagram + USER { + int id PK "Primary Key" + string first_name "Not Null" + string last_name "Not Null" + string email "Unique, Not Null" + string password_hash "Not Null" + json imap_config "JSON Configuration" + datetime created_at "Default: UTC Now" + datetime updated_at "Default: UTC Now, On Update" + } + + FOLDER { + int id PK "Primary Key" + int user_id FK "Foreign Key to User" + string name "Not Null" + text rule_text "Natural Language Rule" + int priority "Processing Order" + boolean organize_enabled "Default: True" + int total_count "Default: 0" + int pending_count "Default: 0" + json recent_emails "JSON Array" + datetime created_at "Default: UTC Now" + datetime updated_at "Default: UTC Now, On Update" + } + + USER ||--o{ FOLDER : "has" + + note "User-Folder Relationship" + note "One-to-Many: Each user can have multiple folders" +``` + +## Entities + +### User Entity + +The `User` entity stores account information and authentication data for each user. 
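As a rough illustration of the authentication data this entity holds, the sketch below models a user record that keeps only a salted password hash. It is an assumption-laden stand-in: `UserRecord` and `PBKDF2_ITERATIONS` are hypothetical names, and the hashing is written against the standard library's `hashlib.pbkdf2_hmac` rather than the application's SQLAlchemy model or Werkzeug's helpers.

```python
import hashlib
import hmac
import os
from dataclasses import dataclass

# Illustrative iteration count; an assumption, not the application's setting.
PBKDF2_ITERATIONS = 100_000


@dataclass
class UserRecord:
    """Hypothetical stand-in for the User entity (not the real model)."""
    first_name: str
    last_name: str
    email: str
    password_hash: str = ""

    def set_password(self, password: str) -> None:
        # Generate a fresh random salt for every password.
        salt = os.urandom(16)
        digest = hashlib.pbkdf2_hmac(
            "sha256", password.encode("utf-8"), salt, PBKDF2_ITERATIONS
        )
        # Store salt and digest together; the plain password is never kept.
        self.password_hash = f"{salt.hex()}${digest.hex()}"

    def check_password(self, password: str) -> bool:
        salt_hex, digest_hex = self.password_hash.split("$")
        candidate = hashlib.pbkdf2_hmac(
            "sha256",
            password.encode("utf-8"),
            bytes.fromhex(salt_hex),
            PBKDF2_ITERATIONS,
        )
        # Constant-time comparison avoids leaking how many characters matched.
        return hmac.compare_digest(candidate.hex(), digest_hex)
```

Because the salt is random, hashing the same password twice yields different stored values, which is why verification must re-derive the digest from the stored salt instead of comparing hashes of two fresh calls.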
+ +#### Attributes + +| Column Name | Data Type | Constraints | Description | +|-------------|------------|--------------|-------------| +| id | Integer | Primary Key, Autoincrement | Unique identifier for each user | +| first_name | String(255) | Not Null | User's first name | +| last_name | String(255) | Not Null | User's last name | +| email | String(255) | Unique, Not Null | User's email address (login identifier) | +| password_hash | String(2048) | Not Null | Hashed password for authentication | +| imap_config | JSON | Nullable | IMAP server configuration settings | +| created_at | DateTime | Default: datetime.utcnow | Timestamp of account creation | +| updated_at | DateTime | Default: datetime.utcnow, On Update | Timestamp of last update | + +#### Relationships + +- **One-to-Many**: Each `User` can have multiple `Folder` instances +- **Self-referencing**: No direct relationships to other User instances + +#### Business Rules + +- Email must be unique across all users +- Password is stored as a hash, never in plain text +- IMAP configuration is stored as JSON for flexibility + +### Folder Entity + +The `Folder` entity stores email organization rules and metadata for each user's email folders. 
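The ownership relationship between the two entities can be sketched with the standard library's `sqlite3` module. This is only an approximation under stated assumptions: the application itself uses PostgreSQL through SQLAlchemy, the table names `users` and `folders` and the helpers `make_demo_db` and `add_folder` are hypothetical, and only a few columns are shown.

```python
import sqlite3

# Simplified schema: one user owns many folders, names are unique per user.
SCHEMA = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE folders (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    name TEXT NOT NULL,
    organize_enabled INTEGER DEFAULT 1,
    total_count INTEGER DEFAULT 0,
    pending_count INTEGER DEFAULT 0,
    UNIQUE (user_id, name)
);
"""


def make_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with the simplified schema."""
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def add_folder(conn: sqlite3.Connection, user_id: int, name: str) -> bool:
    """Insert a folder; return False if a constraint is violated."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO folders (user_id, name) VALUES (?, ?)",
                (user_id, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Adding the same folder name twice for one user is rejected by the composite unique constraint, and deleting a user removes that user's folders through the cascading foreign key.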
+ +#### Attributes + +| Column Name | Data Type | Constraints | Description | +|-------------|------------|--------------|-------------| +| id | Integer | Primary Key, Autoincrement | Unique identifier for each folder | +| user_id | Integer | Foreign Key to User, Not Null | Reference to the owning user | +| name | String(255) | Not Null | Display name of the folder | +| rule_text | Text | Nullable | Natural language description of the folder rule | +| priority | Integer | Nullable | Processing order (0=normal, 1=high) | +| organize_enabled | Boolean | Default: True | Whether the organization rule is active | +| total_count | Integer | Default: 0 | Total number of emails in the folder | +| pending_count | Integer | Default: 0 | Number of emails waiting to be processed | +| recent_emails | JSON | Default: [] | Array of recent email metadata | +| created_at | DateTime | Default: datetime.utcnow | Timestamp of folder creation | +| updated_at | DateTime | Default: datetime.utcnow, On Update | Timestamp of last update | + +#### Relationships + +- **Many-to-One**: Each `Folder` belongs to one `User` +- **Self-referencing**: No direct relationships to other Folder instances + +#### Business Rules + +- Each folder must belong to a user +- Folder name must be unique per user +- Rule text can be null (for manually created folders) +- Priority values: 0 (normal), 1 (high priority) +- Recent emails array stores JSON objects with subject and date information + +## Data Constraints + +### Primary Keys + +- `User.id`: Integer, auto-incrementing +- `Folder.id`: Integer, auto-incrementing + +### Foreign Keys + +- `Folder.user_id`: References `User.id` with ON DELETE CASCADE + +### Unique Constraints + +- `User.email`: Ensures no duplicate email addresses +- Composite unique constraint on `(Folder.user_id, Folder.name)` to prevent duplicate folder names per user + +### Not Null Constraints + +- `User.first_name`, `User.last_name`, `User.email`, `User.password_hash` +- `Folder.user_id`, 
`Folder.name` + +### Default Values + +- `User.created_at`, `User.updated_at`: Current UTC timestamp +- `Folder.created_at`, `Folder.updated_at`: Current UTC timestamp +- `Folder.organize_enabled`: True +- `Folder.total_count`, `Folder.pending_count`: 0 +- `Folder.recent_emails`: Empty array + +## JSON Data Structures + +### IMAP Configuration + +The `imap_config` field stores JSON with the following structure: + +```json +{ + "server": "imap.gmail.com", + "port": 993, + "username": "user@example.com", + "password": "app-specific-password", + "use_ssl": true, + "use_tls": false, + "connection_timeout": 30 +} +``` + +### Recent Emails + +The `recent_emails` field stores an array of JSON objects: + +```json +[ + { + "subject": "Order Confirmation", + "date": "2023-11-15T10:30:00Z" + }, + { + "subject": "Meeting Reminder", + "date": "2023-11-14T14:45:00Z" + } +] +``` + +## Database Indexes + +### Current Indexes + +- Primary key indexes on `User.id` and `Folder.id` +- Foreign key index on `Folder.user_id` +- Index on `User.email` for faster login lookups + +### Recommended Indexes + +- Composite index on `(user_id, name)` for folder uniqueness checks +- Index on `Folder.priority` for filtering by priority +- Index on `Folder.organize_enabled` for active/inactive filtering + +## Data Migration History + +### Migration Files + +1. **Initial Migration** ([`migrations/versions/02a7c13515a4_initial.py`](migrations/versions/02a7c13515a4_initial.py:1)) + - Created basic User and Folder tables + - Established primary keys and foreign keys + +2. **Add Name Fields** ([`migrations/versions/28e8e0be0355_add_first_name_last_name_and_timestamp_.py`](migrations/versions/28e8e0be0355_add_first_name_last_name_and_timestamp_.py:1)) + - Added first_name and last_name columns to User table + - Added created_at and updated_at timestamps + +3. 
**Add Email Count Fields** ([`migrations/versions/a3ad1b9a0e5f_add_email_count_fields_to_folders.py`](migrations/versions/a3ad1b9a0e5f_add_email_count_fields_to_folders.py:1)) + - Added total_count and pending_count columns to Folder table + - Added organize_enabled boolean flag + +4. **Add Recent Emails Field** ([`migrations/versions/9a88c7e94083_add_recent_emails_field_to_folders_table.py`](migrations/versions/9a88c7e94083_add_recent_emails_field_to_folders_table.py:1)) + - Added recent_emails JSON column to Folder table + - Default value set to empty array + +5. **Add Toggle Feature** ([`migrations/versions/f8ba65458ba2_adding_toggle.py`](migrations/versions/f8ba65458ba2_adding_toggle.py:1)) + - Added organize_enabled toggle functionality + - Enhanced folder management features + + +### Performance Considerations + +1. **User Authentication** + - Index on email column for fast login lookups + - Password hash comparison is done in application code + +2. **Folder Operations** + - Foreign key index on user_id for efficient filtering + - Consider pagination for users with many folders + +3. **IMAP Sync Operations** + - Batch updates for email counts + - JSON operations for recent emails metadata + +## Future Data Model Considerations + +### Potential Enhancements + +1. **Email Entity** + - Store email metadata for better analytics + - Track email movement between folders + +2. **Rule Engine** + - Store parsed rule structures for better processing + - Version control for rule changes + +3. **User Preferences** + - Additional customization options + - UI preference storage + +4. 
**Audit Log** + - Track changes to user data + - Monitor folder operations \ No newline at end of file diff --git a/docs/design/system-architecture.md b/docs/design/system-architecture.md new file mode 100644 index 0000000..8455628 --- /dev/null +++ b/docs/design/system-architecture.md @@ -0,0 +1,256 @@ +# Email Organizer System Architecture + +## Overview + +The Email Organizer is a self-hosted AI-powered email organization system that automates folder sorting, prioritization, and rule recommendations through natural language configuration. This document provides a comprehensive overview of the system architecture, components, and data flow. + +## High-Level Architecture + +```mermaid +graph TB + subgraph "Frontend Layer" + A[Browser] --> B[Base Template] + B --> C[HTMX/AlpineJS/DaisyUI] + C --> D[Dynamic UI Components] + C --> E[Modal System] + end + + subgraph "Application Layer" + F[Flask App] --> G[Main Blueprint] + F --> H[Auth Blueprint] + G --> I[IMAP Service] + G --> J[Folder Management] + H --> K[User Authentication] + end + + subgraph "Data Layer" + L[PostgreSQL] --> M[User Model] + L --> N[Folder Model] + M --> O[IMAP Configuration] + N --> P[Folder Rules] + N --> Q[Email Metadata] + end + + subgraph "External Services" + R[IMAP Server] --> I + S[Future AI Service] --> I + end + + D --> F + F --> L + I --> R +``` + +## System Components + +### 1. Frontend Layer + +The frontend is built using a modern, lightweight stack that provides a responsive and interactive user experience: + +- **Base Template**: Foundation with DaisyUI theme, HTMX, and AlpineJS +- **Dynamic UI Components**: + - HTMX for server-side rendered content updates + - AlpineJS for client-side interactivity + - DaisyUI for consistent styling and components +- **Modal System**: Custom modal handling for forms and configuration + +### 2. 
Application Layer + +The Flask application follows a modular blueprint architecture: + +#### Flask App Factory +Implements the factory pattern for creating Flask application instances with configuration support. + +#### Main Blueprint +Handles core application functionality: +- Folder CRUD operations +- IMAP configuration and testing +- Folder synchronization +- User interface endpoints + +#### Auth Blueprint +Manages user authentication: +- User registration and login +- Password validation and hashing +- Session management + +### 3. Data Layer + +The system uses PostgreSQL with SQLAlchemy ORM for data persistence: + +#### User Model +Stores user account information and authentication data: +- Primary key: Integer auto-increment ID +- Personal information: First name, last name, email +- Authentication: Password hash +- Configuration: IMAP server settings in JSON format +- Timestamps: Creation and update times + +#### Folder Model +Stores email organization rules and metadata: +- Primary key: Integer auto-increment ID +- Relationship: Foreign key to user +- Rule definition: Natural language rule text +- Organization settings: Priority, enable/disable flag +- Email metrics: Total count, pending count +- Email metadata: Recent emails information in JSON format + +### 4. 
External Services + +#### IMAP Service +Handles communication with IMAP servers: +- Connection management and authentication +- Folder listing and synchronization +- Email retrieval and metadata extraction +- Connection testing and validation + +## Data Flow + +### User Authentication Flow + +```mermaid +sequenceDiagram + participant U as User + participant B as Browser + participant A as Auth Blueprint + participant DB as Database + + U->>B: Navigate to login page + B->>A: GET /auth/login + A->>B: Return login form + U->>B: Submit credentials + B->>A: POST /auth/login + A->>DB: Verify credentials + DB-->>A: User data + A->>B: Set session cookie + B->>U: Redirect to main page +``` + +### Folder Management Flow + +```mermaid +sequenceDiagram + participant U as User + participant B as Browser + participant M as Main Blueprint + participant DB as Database + participant I as IMAP Service + + U->>B: Click "Add Folder" + B->>M: GET /api/folders/new + M->>B: Return folder modal + U->>B: Fill and submit form + B->>M: POST /api/folders + M->>DB: Create new folder record + DB-->>M: Success confirmation + M->>B: Return updated folders list +``` + +### IMAP Synchronization Flow + +```mermaid +sequenceDiagram + participant U as User + participant B as Browser + participant M as Main Blueprint + participant I as IMAP Service + participant IMAP as IMAP Server + participant DB as Database + + U->>B: Click "Sync Folders" + B->>M: POST /api/imap/sync + M->>I: Initialize with user config + I->>IMAP: Establish connection + IMAP-->>I: Connection success + I->>IMAP: List all folders + IMAP-->>I: Folder list + I->>DB: Create/update local folders + DB-->>I: Commit changes + I->>M: Return sync results + M->>B: Return updated UI +``` + +## Design Patterns + +### 1. 
Factory Pattern +- **Implementation**: Flask app factory in [`app/__init__.py`](app/__init__.py:14) +- **Purpose**: Create application instances with different configurations +- **Benefits**: Supports multiple environments (development, testing, production) + +### 2. Blueprint Pattern +- **Implementation**: Separated functionality in [`app/routes.py`](app/routes.py:9) and [`app/auth.py`](app/auth.py:9) +- **Purpose**: Modularize application features +- **Benefits**: Code organization, easier testing, scalability + +### 3. Service Layer Pattern +- **Implementation**: IMAP service in [`app/imap_service.py`](app/imap_service.py:10) +- **Purpose**: Encapsulate business logic and external communication +- **Benefits**: Separation of concerns, reusability, testability + +### 4. Repository Pattern +- **Implementation**: SQLAlchemy models in [`app/models.py`](app/models.py:13) +- **Purpose**: Abstract data access layer +- **Benefits**: Database independence, centralized query logic + +## Security Considerations + +### 1. Authentication +- Password hashing using Werkzeug's security functions +- Session management with Flask-Login +- Input validation on all user submissions + +### 2. IMAP Security +- SSL/TLS encryption for IMAP connections +- Secure credential storage (JSON format) +- Connection timeout handling + +### 3. Data Protection +- Server-side validation for all form inputs +- Protection against SQL injection through SQLAlchemy ORM +- Error handling that doesn't expose sensitive information + +## Performance Considerations + +### 1. Database +- PostgreSQL for reliable data storage +- SQLAlchemy ORM for efficient querying +- Alembic for database migrations + +### 2. UI Performance +- HTMX for partial page updates +- Lazy loading of folder content +- Efficient rendering with DaisyUI components + +### 3. IMAP Operations +- Connection pooling for efficiency +- Timeout handling for reliability +- Batch processing for folder synchronization + +## Scalability + +### 1. 
Architecture +- Modular design supports feature expansion +- Service layer allows for additional email providers +- Database schema designed for growth + +### 2. Future Enhancements +- AI-powered rule recommendations +- Additional email provider support +- Multi-tenant architecture support + +## Error Handling + +### 1. User-Friendly Errors +- Clear validation messages for form inputs +- Graceful degradation for IMAP connection issues +- Modal-based error display + +### 2. Logging +- Comprehensive logging for debugging +- Error tracking for IMAP operations +- Database operation logging + +### 3. Recovery +- Transaction rollbacks for database operations +- Connection cleanup for IMAP service +- Session management for user errors \ No newline at end of file
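The recovery points above can be sketched as follows. This is a minimal illustration under stated assumptions, not the application's code: it uses the standard library's `sqlite3` instead of PostgreSQL and SQLAlchemy, and `FakeImapConnection` and `sync_folder_counts` are hypothetical names standing in for the real IMAP service.

```python
import sqlite3


class FakeImapConnection:
    """Hypothetical stand-in for an IMAP connection; not a real client."""

    def __init__(self, folders: dict[str, int], fail: bool = False):
        self.folders = folders
        self.fail = fail
        self.closed = False

    def list_folder_counts(self) -> dict[str, int]:
        if self.fail:
            raise ConnectionError("simulated IMAP failure")
        return self.folders

    def close(self) -> None:
        self.closed = True


def sync_folder_counts(db: sqlite3.Connection, imap: FakeImapConnection) -> bool:
    """Apply all count updates in one transaction; undo them on any error."""
    try:
        with db:  # commits on success, rolls back if an exception escapes
            db.execute("UPDATE folders SET pending_count = 0")
            for name, total in imap.list_folder_counts().items():
                db.execute(
                    "UPDATE folders SET total_count = ? WHERE name = ?",
                    (total, name),
                )
        return True
    except ConnectionError:
        return False
    finally:
        # Connection cleanup runs whether the sync succeeded or failed.
        imap.close()
```

Wrapping the updates in a single transaction means a failure midway leaves the stored counts as they were before the sync started, while the `finally` clause ensures the connection is released on both paths.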