Update documentation

2025-08-06 10:30:27 -07:00
parent 5336c04444
commit 05770c53e0
3 changed files with 849 additions and 0 deletions


@@ -0,0 +1,353 @@
# Authentication System Design
## Overview
The Email Organizer implements a complete user authentication system using Flask-Login for session management and Werkzeug for password security. This document provides a detailed overview of the authentication architecture, components, and implementation details.
## System Architecture
```mermaid
graph TB
subgraph "Authentication Flow"
A[User Browser] --> B[Login Page]
B --> C[Submit Credentials]
C --> D[Validate Input]
D --> E[Check Password Hash]
E --> F[Create Session]
F --> G[Redirect to Dashboard]
end
subgraph "Backend Components"
H[Auth Blueprint] --> I[Validation Logic]
H --> J[Password Hashing]
H --> K[Session Management]
I --> L[User Model]
J --> L
K --> M[Flask-Login]
end
subgraph "Security Measures"
N[Password Validation] --> O[Complexity Requirements]
N --> P[Hashing Algorithm]
Q[Session Security] --> R[Cookie Protection]
Q --> S[Timeout Handling]
end
C --> D
D --> I
I --> J
J --> K
K --> F
```
## Components
### 1. Auth Blueprint ([`app/auth.py`](app/auth.py:1))
The authentication blueprint handles all authentication-related routes and logic:
#### Routes
- `/login`: GET/POST - User login page and authentication (served at `/auth/login`)
- `/signup`: GET/POST - User registration page and account creation (served at `/auth/signup`)
- `/logout`: GET - User logout and session termination (served at `/auth/logout`)
#### Key Functions
- `login()`: Handles user authentication
- `signup()`: Manages user registration
- `logout()`: Terminates user sessions
- `validate_password()`: Enforces password complexity requirements
### 2. User Model ([`app/models.py`](app/models.py:13))
The User model extends Flask-Login's UserMixin to provide authentication functionality:
#### Authentication Methods
- `set_password()`: Hashes and stores passwords
- `check_password()`: Verifies password against hash
#### Security Attributes
- Password hashing using Werkzeug's secure hashing algorithm
- Email uniqueness constraint
- Timestamp tracking for account management
### 3. Flask-Login Integration
The application uses Flask-Login for session management:
#### Configuration
```python
from flask_login import LoginManager

login_manager = LoginManager()
login_manager.login_view = 'auth.login'
login_manager.login_message = 'Please log in to access this page.'
login_manager.login_message_category = 'warning'
```
#### User Loader
```python
@login_manager.user_loader
def load_user(user_id):
    return User.query.get(int(user_id))
```
## Authentication Flow
### User Registration
```mermaid
sequenceDiagram
participant U as User
participant B as Browser
participant A as Auth Blueprint
participant DB as Database
U->>B: Navigate to /auth/signup
B->>A: GET /auth/signup
A->>B: Return signup form
U->>B: Fill and submit form
B->>A: POST /auth/signup
loop Validation
A->>A: Validate input fields
A->>A: Check email uniqueness
A->>A: Validate password complexity
end
A->>A: Hash password
A->>DB: Create new user record with password hash
DB-->>A: User created
A->>B: Login user
B->>U: Redirect to dashboard
```
### User Login
```mermaid
sequenceDiagram
participant U as User
participant B as Browser
participant A as Auth Blueprint
participant DB as Database
U->>B: Navigate to /auth/login
B->>A: GET /auth/login
A->>B: Return login form
U->>B: Submit credentials
B->>A: POST /auth/login
A->>DB: Query user by email
DB-->>A: User data
alt User exists
A->>A: Verify password hash
alt Password correct
A->>A: Create session
A->>B: Redirect to dashboard
else Password incorrect
A->>B: Show error message
end
else User not found
A->>B: Show error message
end
```
### Password Validation
The system enforces strong password requirements:
```python
import re


def validate_password(password):
    """Validate password strength and return (is_valid, message)."""
    if len(password) < 8:
        return False, "Password must be at least 8 characters long"
    if not re.search(r'[A-Z]', password):
        return False, "Password must contain at least one uppercase letter"
    if not re.search(r'[a-z]', password):
        return False, "Password must contain at least one lowercase letter"
    if not re.search(r'\d', password):
        return False, "Password must contain at least one digit"
    return True, "Password is valid"
```
### Session Management
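Session security combines Flask's signed session cookie with Flask-Login:
- **Session Cookies**: Secure, HttpOnly cookies for session storage
- **CSRF Protection**: CSRF protection for form submissions (not provided by Flask-Login itself; typically supplied by an extension such as Flask-WTF)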
- **Session Timeout**: Automatic session expiration
- **Remember Me**: Optional persistent login functionality
## Security Measures
### 1. Password Security
#### Hashing Algorithm
- Uses Werkzeug's `generate_password_hash()` and `check_password_hash()`
- Stores a salted hash from a key-derivation function: PBKDF2 with SHA-256 or scrypt, depending on the Werkzeug version and the `method` argument
- Random salt generation for each password
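
The salted PBKDF2-SHA256 idea described above can be sketched with the standard library alone. This is not Werkzeug's implementation or storage format; `hash_password`, `verify_password`, the `$`-separated layout, and the iteration count are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 200_000  # assumed work factor for this sketch, not taken from the application


def hash_password(password: str) -> str:
    """Return 'iterations$salt_hex$digest_hex' using a fresh random salt."""
    salt = secrets.token_bytes(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return f"{ITERATIONS}${salt.hex()}${digest.hex()}"


def verify_password(stored: str, candidate: str) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    iterations, salt_hex, digest_hex = stored.split("$")
    digest = hashlib.pbkdf2_hmac(
        "sha256", candidate.encode("utf-8"), bytes.fromhex(salt_hex), int(iterations)
    )
    return hmac.compare_digest(digest.hex(), digest_hex)
```

Because the salt is random, hashing the same password twice yields two different stored strings, yet both verify against the original password.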
#### Password Complexity
- Minimum 8 characters
- At least one uppercase letter
- At least one lowercase letter
- At least one digit
- No maximum length limit
### 2. Input Validation
#### Client-Side Validation
- HTML5 form validation
- JavaScript feedback for user experience
#### Server-Side Validation
- Comprehensive input sanitization
- Email format validation
- Length restrictions for all fields
- SQL injection prevention through SQLAlchemy ORM
### 3. Session Security
#### Cookie Protection
- Secure flag for HTTPS environments
- HttpOnly flag to prevent JavaScript access
- SameSite policy for cross-site request protection
#### Session Management
- Automatic session regeneration on login
- Session timeout handling
- Logout cleanup
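
As a rough illustration, the cookie flags and timeouts listed above map onto standard Flask and Flask-Login configuration keys. The keys are real; the values, and the name `SESSION_SETTINGS`, are assumptions rather than the application's actual settings.

```python
from datetime import timedelta

# Flask and Flask-Login configuration keys; every value here is an illustrative assumption.
SESSION_SETTINGS = {
    "SESSION_COOKIE_SECURE": True,      # send the session cookie over HTTPS only
    "SESSION_COOKIE_HTTPONLY": True,    # hide the session cookie from JavaScript
    "SESSION_COOKIE_SAMESITE": "Lax",   # restrict cross-site sending of the cookie
    "PERMANENT_SESSION_LIFETIME": timedelta(hours=12),  # assumed session timeout
    "REMEMBER_COOKIE_DURATION": timedelta(days=14),     # assumed "remember me" window
    "REMEMBER_COOKIE_HTTPONLY": True,
}

# Inside the app factory this would be applied with: app.config.update(SESSION_SETTINGS)
```

Note that `PERMANENT_SESSION_LIFETIME` only takes effect for sessions marked permanent, so a timeout policy also needs `session.permanent = True` somewhere in the login path.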
### 4. Error Handling
#### User-Friendly Messages
- Clear validation error messages
- General error messages for security-sensitive operations
- No exposure of internal system details
#### Logging
- Authentication attempt logging
- Security event tracking
- Error debugging information
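
Authentication attempt logging could look roughly like the sketch below. The logger name `auth` and the helper `log_login_attempt` are hypothetical; the point is that the outcome, the identifier, and the client address are recorded while the password never is.

```python
import logging

auth_log = logging.getLogger("auth")  # hypothetical logger name


def log_login_attempt(email: str, success: bool, remote_addr: str) -> None:
    """Record the outcome of a login attempt without logging the password."""
    if success:
        auth_log.info("login succeeded email=%s ip=%s", email, remote_addr)
    else:
        # Warning level makes repeated failures easy to filter and alert on.
        auth_log.warning("login failed email=%s ip=%s", email, remote_addr)
```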
## API Endpoints
### Authentication Endpoints
| Endpoint | Method | Description | Authentication Required |
|----------|--------|-------------|---------------------------|
| `/auth/login` | GET/POST | User login | No |
| `/auth/signup` | GET/POST | User registration | No |
| `/auth/logout` | GET | User logout | Yes |
### Response Formats
#### Login Success
```http
HTTP/1.1 302 Found
Location: /
Set-Cookie: session=<session_id>; HttpOnly; Secure; Path=/
```
#### Login Failure
```http
HTTP/1.1 200 OK
Content-Type: text/html
```
#### Registration Success
```http
HTTP/1.1 302 Found
Location: /
Set-Cookie: session=<session_id>; HttpOnly; Secure; Path=/
```
#### Registration Failure
```http
HTTP/1.1 200 OK
Content-Type: text/html
```
## Configuration
### Environment Variables
| Variable | Description | Default Value |
|----------|-------------|--------------|
| `SECRET_KEY` | Flask secret key used to sign session cookies | `dev-secret-key` |
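
A minimal sketch of how the variable might be read, assuming the documented default is meant for local development only. `load_secret_key` and its `production` flag are hypothetical and not part of the application.

```python
import os

DEV_SECRET_KEY = "dev-secret-key"  # documented default; assumed to be for development only


def load_secret_key(environ=os.environ, production=False) -> str:
    """Read SECRET_KEY, refusing the development default in production."""
    key = environ.get("SECRET_KEY", DEV_SECRET_KEY)
    if production and key == DEV_SECRET_KEY:
        raise RuntimeError("SECRET_KEY must be set to a private value in production")
    return key
```

Failing fast here keeps a deployment from silently signing session cookies with a publicly known key.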
### Flask-Login Configuration
```python
from flask_login import LoginManager

login_manager = LoginManager()
login_manager.login_view = 'auth.login'
login_manager.login_message = 'Please log in to access this page.'
login_manager.login_message_category = 'warning'
```
## Testing Strategy
### Unit Tests
#### Authentication Tests
- Test password validation logic
- Test password hashing and verification
- Test user creation and validation
- Test session creation and management
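
The password validation tests could be sketched with `unittest` as follows. The validator is copied inline so the example is self-contained; in the project the tests would presumably import it from `app/auth.py` instead. The test class and method names are assumptions.

```python
import re
import unittest


def validate_password(password):
    """Copy of the documented validator, inlined for a self-contained example."""
    if len(password) < 8:
        return False, "Password must be at least 8 characters long"
    if not re.search(r'[A-Z]', password):
        return False, "Password must contain at least one uppercase letter"
    if not re.search(r'[a-z]', password):
        return False, "Password must contain at least one lowercase letter"
    if not re.search(r'\d', password):
        return False, "Password must contain at least one digit"
    return True, "Password is valid"


class ValidatePasswordTests(unittest.TestCase):
    def test_accepts_compliant_password(self):
        self.assertEqual(validate_password("Abcdefg1"), (True, "Password is valid"))

    def test_rejects_short_password(self):
        ok, message = validate_password("Ab1")
        self.assertFalse(ok)
        self.assertIn("8 characters", message)

    def test_rejects_missing_digit(self):
        ok, _ = validate_password("Abcdefgh")
        self.assertFalse(ok)
```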
#### Integration Tests
- Test login flow with valid credentials
- Test login flow with invalid credentials
- Test registration flow with valid data
- Test registration flow with invalid data
- Test session persistence and timeout
### Security Tests
- Test SQL injection attempts
- Test XSS vulnerabilities
- Test session hijacking prevention
- Test CSRF protection
## Performance Considerations
### Database Optimization
- Index on email column for fast login lookups
- Efficient password hashing with proper salting
- Session data stored in server-side session store
### Security vs. Usability Balance
- Reasonable password complexity requirements
- Clear error messages for failed login attempts
- Session timeout balanced with user convenience
## Future Enhancements
### 1. Multi-Factor Authentication
- SMS-based 2FA
- TOTP (Time-based One-Time Password) support
- Hardware token integration
### 2. OAuth Integration
- Google OAuth
- Facebook OAuth
- GitHub OAuth
### 3. Password Reset
- Email-based password reset
- Secure token generation
- Expiration handling
### 4. Account Management
- User profile management
- Email address changes
- Password change functionality
### 5. Security Enhancements
- Rate limiting for login attempts
- Account lockout after failed attempts
- Suspicious activity monitoring
- IP-based security checks

240
docs/design/data-model.md Normal file

@@ -0,0 +1,240 @@
# Email Organizer Data Model
## Overview
This document describes the data model for the Email Organizer application, including entities, attributes, relationships, and constraints. The system uses PostgreSQL with SQLAlchemy ORM for data persistence.
## Entity Relationship Diagram
```mermaid
erDiagram
USER {
int id PK "Primary Key"
string first_name "Not Null"
string last_name "Not Null"
string email "Unique, Not Null"
string password_hash "Not Null"
json imap_config "JSON Configuration"
datetime created_at "Default: UTC Now"
datetime updated_at "Default: UTC Now, On Update"
}
FOLDER {
int id PK "Primary Key"
int user_id FK "Foreign Key to User"
string name "Not Null"
text rule_text "Natural Language Rule"
int priority "Processing Order"
boolean organize_enabled "Default: True"
int total_count "Default: 0"
int pending_count "Default: 0"
json recent_emails "JSON Array"
datetime created_at "Default: UTC Now"
datetime updated_at "Default: UTC Now, On Update"
}
USER ||--o{ FOLDER : "has"
%% User-Folder relationship: one-to-many, each user can have multiple folders
```
## Entities
### User Entity
The `User` entity stores account information and authentication data for each user.
#### Attributes
| Column Name | Data Type | Constraints | Description |
|-------------|------------|--------------|-------------|
| id | Integer | Primary Key, Autoincrement | Unique identifier for each user |
| first_name | String(255) | Not Null | User's first name |
| last_name | String(255) | Not Null | User's last name |
| email | String(255) | Unique, Not Null | User's email address (login identifier) |
| password_hash | String(2048) | Not Null | Hashed password for authentication |
| imap_config | JSON | Nullable | IMAP server configuration settings |
| created_at | DateTime | Default: datetime.utcnow | Timestamp of account creation |
| updated_at | DateTime | Default: datetime.utcnow, On Update | Timestamp of last update |
#### Relationships
- **One-to-Many**: Each `User` can have multiple `Folder` instances
- **Self-referencing**: None; a `User` has no relationship to other `User` instances
#### Business Rules
- Email must be unique across all users
- Password is stored as a hash, never in plain text
- IMAP configuration is stored as JSON for flexibility
### Folder Entity
The `Folder` entity stores email organization rules and metadata for each user's email folders.
#### Attributes
| Column Name | Data Type | Constraints | Description |
|-------------|------------|--------------|-------------|
| id | Integer | Primary Key, Autoincrement | Unique identifier for each folder |
| user_id | Integer | Foreign Key to User, Not Null | Reference to the owning user |
| name | String(255) | Not Null | Display name of the folder |
| rule_text | Text | Nullable | Natural language description of the folder rule |
| priority | Integer | Nullable | Processing order (0=normal, 1=high) |
| organize_enabled | Boolean | Default: True | Whether the organization rule is active |
| total_count | Integer | Default: 0 | Total number of emails in the folder |
| pending_count | Integer | Default: 0 | Number of emails waiting to be processed |
| recent_emails | JSON | Default: [] | Array of recent email metadata |
| created_at | DateTime | Default: datetime.utcnow | Timestamp of folder creation |
| updated_at | DateTime | Default: datetime.utcnow, On Update | Timestamp of last update |
#### Relationships
- **Many-to-One**: Each `Folder` belongs to one `User`
- **Self-referencing**: None; a `Folder` has no relationship to other `Folder` instances
#### Business Rules
- Each folder must belong to a user
- Folder name must be unique per user
- Rule text can be null (for manually created folders)
- Priority values: 0 (normal), 1 (high priority)
- Recent emails array stores JSON objects with subject and date information
## Data Constraints
### Primary Keys
- `User.id`: Integer, auto-incrementing
- `Folder.id`: Integer, auto-incrementing
### Foreign Keys
- `Folder.user_id`: References `User.id` with ON DELETE CASCADE
### Unique Constraints
- `User.email`: Ensures no duplicate email addresses
- Composite unique constraint on `(Folder.user_id, Folder.name)` to prevent duplicate folder names per user
### Not Null Constraints
- `User.first_name`, `User.last_name`, `User.email`, `User.password_hash`
- `Folder.user_id`, `Folder.name`
### Default Values
- `User.created_at`, `User.updated_at`: Current UTC timestamp
- `Folder.created_at`, `Folder.updated_at`: Current UTC timestamp
- `Folder.organize_enabled`: True
- `Folder.total_count`, `Folder.pending_count`: 0
- `Folder.recent_emails`: Empty array
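
The uniqueness and cascade constraints above can be exercised in a small, self-contained sketch. It uses an in-memory SQLite database as a stand-in for PostgreSQL, and the table names `users` and `folders` and the reduced column set are assumptions made for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces ON DELETE CASCADE when enabled
conn.executescript("""
CREATE TABLE users (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    email TEXT NOT NULL UNIQUE
);
CREATE TABLE folders (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    name TEXT NOT NULL,
    organize_enabled INTEGER DEFAULT 1,
    total_count INTEGER DEFAULT 0,
    UNIQUE (user_id, name)
);
""")
conn.execute("INSERT INTO users (email) VALUES ('a@example.com')")
conn.execute("INSERT INTO folders (user_id, name) VALUES (1, 'Receipts')")

# A second folder with the same name for the same user violates the composite constraint.
try:
    conn.execute("INSERT INTO folders (user_id, name) VALUES (1, 'Receipts')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Deleting the user removes the user's folders through the cascade.
conn.execute("DELETE FROM users WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM folders").fetchone()[0]
```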
## JSON Data Structures
### IMAP Configuration
The `imap_config` field stores JSON with the following structure:
```json
{
"server": "imap.gmail.com",
"port": 993,
"username": "user@example.com",
"password": "app-specific-password",
"use_ssl": true,
"use_tls": false,
"connection_timeout": 30
}
```
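
One way the stored configuration might be consumed is sketched below with the standard library's `imaplib`. `ImapConfig`, `parse_imap_config`, and `open_connection` are hypothetical names. Treating `use_tls` as STARTTLS, rejecting both flags at once, and the dataclass defaults (copied from the example values) are assumptions, not documented behavior.

```python
import imaplib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class ImapConfig:
    server: str
    port: int
    username: str
    password: str
    use_ssl: bool = True
    use_tls: bool = False
    connection_timeout: int = 30


def parse_imap_config(raw: str) -> ImapConfig:
    """Build a typed config from the stored JSON, rejecting contradictory flags."""
    config = ImapConfig(**json.loads(raw))
    if config.use_ssl and config.use_tls:
        raise ValueError("use_ssl and use_tls are mutually exclusive in this sketch")
    return config


def open_connection(config: ImapConfig) -> imaplib.IMAP4:
    """Connect and log in using implicit SSL, STARTTLS, or a plain connection."""
    if config.use_ssl:
        client = imaplib.IMAP4_SSL(config.server, config.port, timeout=config.connection_timeout)
    else:
        client = imaplib.IMAP4(config.server, config.port, timeout=config.connection_timeout)
        if config.use_tls:
            client.starttls()  # upgrade the plain connection before sending credentials
    client.login(config.username, config.password)
    return client
```

Parsing into a frozen dataclass surfaces unknown or missing keys as a `TypeError` at load time instead of as a failure halfway through a sync.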
### Recent Emails
The `recent_emails` field stores an array of JSON objects:
```json
[
{
"subject": "Order Confirmation",
"date": "2023-11-15T10:30:00Z"
},
{
"subject": "Meeting Reminder",
"date": "2023-11-14T14:45:00Z"
}
]
```
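
Maintaining this array during a sync might look like the following sketch. `add_recent_email` and the cap `MAX_RECENT` are hypothetical; the document does not state how many entries are kept, and the helper assumes timestamps are already in UTC.

```python
from datetime import datetime

MAX_RECENT = 3  # assumed cap; the document does not specify a limit


def add_recent_email(recent, subject, received_at, limit=MAX_RECENT):
    """Return a new list with the email added, newest first, trimmed to `limit`."""
    entry = {"subject": subject, "date": received_at.strftime("%Y-%m-%dT%H:%M:%SZ")}
    merged = [entry] + list(recent)
    # UTC timestamps in this fixed ISO 8601 format sort chronologically as plain text.
    merged.sort(key=lambda item: item["date"], reverse=True)
    return merged[:limit]
```

Returning a new list instead of mutating the input matters with JSON columns, since assigning a fresh value is the straightforward way to make the ORM notice the change.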
## Database Indexes
### Current Indexes
- Primary key indexes on `User.id` and `Folder.id`
- Foreign key index on `Folder.user_id`
- Index on `User.email` for faster login lookups
### Recommended Indexes
- Composite index on `(user_id, name)` for folder uniqueness checks
- Index on `Folder.priority` for filtering by priority
- Index on `Folder.organize_enabled` for active/inactive filtering
## Data Migration History
### Migration Files
1. **Initial Migration** ([`migrations/versions/02a7c13515a4_initial.py`](migrations/versions/02a7c13515a4_initial.py:1))
- Created basic User and Folder tables
- Established primary keys and foreign keys
2. **Add Name Fields** ([`migrations/versions/28e8e0be0355_add_first_name_last_name_and_timestamp_.py`](migrations/versions/28e8e0be0355_add_first_name_last_name_and_timestamp_.py:1))
- Added first_name and last_name columns to User table
- Added created_at and updated_at timestamps
3. **Add Email Count Fields** ([`migrations/versions/a3ad1b9a0e5f_add_email_count_fields_to_folders.py`](migrations/versions/a3ad1b9a0e5f_add_email_count_fields_to_folders.py:1))
- Added total_count and pending_count columns to Folder table
- Added organize_enabled boolean flag
4. **Add Recent Emails Field** ([`migrations/versions/9a88c7e94083_add_recent_emails_field_to_folders_table.py`](migrations/versions/9a88c7e94083_add_recent_emails_field_to_folders_table.py:1))
- Added recent_emails JSON column to Folder table
- Default value set to empty array
5. **Add Toggle Feature** ([`migrations/versions/f8ba65458ba2_adding_toggle.py`](migrations/versions/f8ba65458ba2_adding_toggle.py:1))
- Added organize_enabled toggle functionality
- Enhanced folder management features
## Performance Considerations
1. **User Authentication**
- Index on email column for fast login lookups
- Password hash comparison is done in application code
2. **Folder Operations**
- Foreign key index on user_id for efficient filtering
- Consider pagination for users with many folders
3. **IMAP Sync Operations**
- Batch updates for email counts
- JSON operations for recent emails metadata
## Future Data Model Considerations
### Potential Enhancements
1. **Email Entity**
- Store email metadata for better analytics
- Track email movement between folders
2. **Rule Engine**
- Store parsed rule structures for better processing
- Version control for rule changes
3. **User Preferences**
- Additional customization options
- UI preference storage
4. **Audit Log**
- Track changes to user data
- Monitor folder operations


@@ -0,0 +1,256 @@
# Email Organizer System Architecture
## Overview
The Email Organizer is a self-hosted AI-powered email organization system that automates folder sorting, prioritization, and rule recommendations through natural language configuration. This document provides a comprehensive overview of the system architecture, components, and data flow.
## High-Level Architecture
```mermaid
graph TB
subgraph "Frontend Layer"
A[Browser] --> B[Base Template]
B --> C[HTMX/AlpineJS/DaisyUI]
C --> D[Dynamic UI Components]
C --> E[Modal System]
end
subgraph "Application Layer"
F[Flask App] --> G[Main Blueprint]
F --> H[Auth Blueprint]
G --> I[IMAP Service]
G --> J[Folder Management]
H --> K[User Authentication]
end
subgraph "Data Layer"
L[PostgreSQL] --> M[User Model]
L --> N[Folder Model]
M --> O[IMAP Configuration]
N --> P[Folder Rules]
N --> Q[Email Metadata]
end
subgraph "External Services"
R[IMAP Server] --> I
S[Future AI Service] --> I
end
D --> F
F --> L
I --> R
```
## System Components
### 1. Frontend Layer
The frontend is built using a modern, lightweight stack that provides a responsive and interactive user experience:
- **Base Template**: Foundation with DaisyUI theme, HTMX, and AlpineJS
- **Dynamic UI Components**:
- HTMX for server-side rendered content updates
- AlpineJS for client-side interactivity
- DaisyUI for consistent styling and components
- **Modal System**: Custom modal handling for forms and configuration
### 2. Application Layer
The Flask application follows a modular blueprint architecture:
#### Flask App Factory
Implements the factory pattern for creating Flask application instances with configuration support.
#### Main Blueprint
Handles core application functionality:
- Folder CRUD operations
- IMAP configuration and testing
- Folder synchronization
- User interface endpoints
#### Auth Blueprint
Manages user authentication:
- User registration and login
- Password validation and hashing
- Session management
### 3. Data Layer
The system uses PostgreSQL with SQLAlchemy ORM for data persistence:
#### User Model
Stores user account information and authentication data:
- Primary key: Integer auto-increment ID
- Personal information: First name, last name, email
- Authentication: Password hash
- Configuration: IMAP server settings in JSON format
- Timestamps: Creation and update times
#### Folder Model
Stores email organization rules and metadata:
- Primary key: Integer auto-increment ID
- Relationship: Foreign key to user
- Rule definition: Natural language rule text
- Organization settings: Priority, enable/disable flag
- Email metrics: Total count, pending count
- Email metadata: Recent emails information in JSON format
### 4. External Services
#### IMAP Service
Handles communication with IMAP servers:
- Connection management and authentication
- Folder listing and synchronization
- Email retrieval and metadata extraction
- Connection testing and validation
## Data Flow
### User Authentication Flow
```mermaid
sequenceDiagram
participant U as User
participant B as Browser
participant A as Auth Blueprint
participant DB as Database
U->>B: Navigate to login page
B->>A: GET /auth/login
A->>B: Return login form
U->>B: Submit credentials
B->>A: POST /auth/login
A->>DB: Query user by email
DB-->>A: User data
A->>A: Verify password hash
A->>B: Set session cookie
B->>U: Redirect to main page
```
### Folder Management Flow
```mermaid
sequenceDiagram
participant U as User
participant B as Browser
participant M as Main Blueprint
participant DB as Database
participant I as IMAP Service
U->>B: Click "Add Folder"
B->>M: GET /api/folders/new
M->>B: Return folder modal
U->>B: Fill and submit form
B->>M: POST /api/folders
M->>DB: Create new folder record
DB-->>M: Success confirmation
M->>B: Return updated folders list
```
### IMAP Synchronization Flow
```mermaid
sequenceDiagram
participant U as User
participant B as Browser
participant M as Main Blueprint
participant I as IMAP Service
participant IMAP as IMAP Server
participant DB as Database
U->>B: Click "Sync Folders"
B->>M: POST /api/imap/sync
M->>I: Initialize with user config
I->>IMAP: Establish connection
IMAP-->>I: Connection success
I->>IMAP: List all folders
IMAP-->>I: Folder list
I->>DB: Create/update local folders
DB-->>I: Commit changes
I->>M: Return sync results
M->>B: Return updated UI
```
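
The "create/update local folders" step in the flow above can be sketched as a pure function that compares the server's folder list with what is already stored. `plan_folder_sync` is a hypothetical helper; the real service may reconcile folders differently, and this sketch leaves local folders that are missing on the server untouched.

```python
def plan_folder_sync(server_folders, local_folders):
    """Split IMAP folder names into those to create locally and those already known.

    server_folders: iterable of folder names reported by the IMAP server.
    local_folders: iterable of folder names already stored for the user.
    """
    local = set(local_folders)
    to_create, existing = [], []
    for name in dict.fromkeys(server_folders):  # drop duplicates, keep server order
        (existing if name in local else to_create).append(name)
    return {"create": to_create, "update": existing}
```

Keeping the comparison free of database and network calls makes it easy to unit test, with the service applying the returned plan in a single transaction.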
## Design Patterns
### 1. Factory Pattern
- **Implementation**: Flask app factory in [`app/__init__.py`](app/__init__.py:14)
- **Purpose**: Create application instances with different configurations
- **Benefits**: Supports multiple environments (development, testing, production)
### 2. Blueprint Pattern
- **Implementation**: Separated functionality in [`app/routes.py`](app/routes.py:9) and [`app/auth.py`](app/auth.py:9)
- **Purpose**: Modularize application features
- **Benefits**: Code organization, easier testing, scalability
### 3. Service Layer Pattern
- **Implementation**: IMAP service in [`app/imap_service.py`](app/imap_service.py:10)
- **Purpose**: Encapsulate business logic and external communication
- **Benefits**: Separation of concerns, reusability, testability
### 4. Repository Pattern
- **Implementation**: SQLAlchemy models in [`app/models.py`](app/models.py:13)
- **Purpose**: Abstract data access layer
- **Benefits**: Database independence, centralized query logic
## Security Considerations
### 1. Authentication
- Password hashing using Werkzeug's security functions
- Session management with Flask-Login
- Input validation on all user submissions
### 2. IMAP Security
- SSL/TLS encryption for IMAP connections
- IMAP credentials stored per user in the `imap_config` JSON field
- Connection timeout handling
### 3. Data Protection
- Server-side validation for all form inputs
- Protection against SQL injection through SQLAlchemy ORM
- Error handling that doesn't expose sensitive information
## Performance Considerations
### 1. Database
- PostgreSQL for reliable data storage
- SQLAlchemy ORM for efficient querying
- Alembic for database migrations
### 2. UI Performance
- HTMX for partial page updates
- Lazy loading of folder content
- Efficient rendering with DaisyUI components
### 3. IMAP Operations
- Connection pooling for efficiency
- Timeout handling for reliability
- Batch processing for folder synchronization
## Scalability
### 1. Architecture
- Modular design supports feature expansion
- Service layer allows for additional email providers
- Database schema designed for growth
### 2. Future Enhancements
- AI-powered rule recommendations
- Additional email provider support
- Multi-tenant architecture support
## Error Handling
### 1. User-Friendly Errors
- Clear validation messages for form inputs
- Graceful degradation for IMAP connection issues
- Modal-based error display
### 2. Logging
- Comprehensive logging for debugging
- Error tracking for IMAP operations
- Database operation logging
### 3. Recovery
- Transaction rollbacks for database operations
- Connection cleanup for IMAP service
- Session management for user errors