Rule System Implementation Summary
What We Built
A complete distributed WAF rule synchronization system that allows the Baffle Hub to generate and manage rules while Agents download and enforce them locally with sub-millisecond latency.
Implementation Status: ✅ Complete (Phase 1)
1. Database Schema ✅
Migration: db/migrate/20251103080823_enhance_rules_table_for_sync.rb
Enhanced the rules table with:
- `source` field to track rule origin (manual, auto-generated, imported)
- JSON `conditions` and `metadata` fields
- `expires_at` for temporal rules (24h bans)
- `enabled` flag for soft deletes
- `priority` for rule specificity
- Optimized indexes for sync queries (`updated_at, id`)
Schema:
create_table "rules" do |t|
t.string :rule_type, null: false # network_v4, network_v6, rate_limit, path_pattern
t.string :action, null: false # allow, deny, rate_limit, redirect, log
t.json :conditions, null: false # CIDR, patterns, scope
t.json :metadata # reason, limits, redirect_url
t.integer :priority # Auto-calculated from CIDR prefix
t.datetime :expires_at # For temporal bans
t.boolean :enabled, default: true # Soft delete flag
t.string :source, limit: 100 # Origin tracking
t.timestamps
# Indexes for efficient sync
t.index [:updated_at, :id] # Primary sync cursor
t.index :enabled
t.index :expires_at
t.index [:rule_type, :enabled]
end
2. Rule Model ✅
File: app/models/rule.rb
Complete Rule model with:
- Rule types: `network_v4`, `network_v6`, `rate_limit`, `path_pattern`
- Actions: `allow`, `deny`, `rate_limit`, `redirect`, `log`
- Validations: Type-specific validation for conditions and metadata
- Scopes: `active`, `expired`, `network_rules`, `rate_limit_rules`, etc.
- Sync methods: `since(timestamp)`, `latest_version`
- Auto-priority: Calculates priority from CIDR prefix length
- Agent format: `to_agent_format` for API responses
Example Usage:
# Create network block rule
Rule.create!(
rule_type: "network_v4",
action: "deny",
conditions: { cidr: "1.2.3.4/32" },
expires_at: 24.hours.from_now,
source: "auto:scanner_detected",
metadata: { reason: "Hit /.env multiple times" }
)
# Create rate limit rule
Rule.create!(
rule_type: "rate_limit",
action: "rate_limit",
conditions: { cidr: "0.0.0.0/0", scope: "global" },
metadata: { limit: 100, window: 60, per_ip: true },
source: "manual"
)
# Disable rule (soft delete)
rule.disable!(reason: "False positive")
# Query for sync
Rule.since("2025-11-03T08:00:00.000Z")
3. API Endpoints ✅
Controller: app/controllers/api/rules_controller.rb
Routes: Added to config/routes.rb
Version Endpoint (Lightweight Check)
GET /api/:public_key/rules/version
Response:
{
"version": "2025-11-03T08:14:23.648330Z",
"count": 150,
"sampling": {
"allowed_requests": 1.0,
"blocked_requests": 1.0,
"rate_limited_requests": 1.0,
"effective_until": "2025-11-03T08:14:33.689Z",
"load_level": "normal",
"queue_depth": 0
}
}
Incremental Sync
GET /api/:public_key/rules?since=2025-11-03T08:00:00.000Z
Response:
{
"version": "2025-11-03T08:14:23.648330Z",
"sampling": { ... },
"rules": [
{
"id": 1,
"rule_type": "network_v4",
"action": "deny",
"conditions": { "cidr": "10.0.0.0/8" },
"priority": 8,
"expires_at": null,
"enabled": true,
"source": "manual",
"metadata": { "reason": "Testing" },
"created_at": "2025-11-03T08:14:23Z",
"updated_at": "2025-11-03T08:14:23Z"
}
]
}
Full Sync
GET /api/:public_key/rules
Response: Same format, returns all active rules
4. Dynamic Load-Based Sampling ✅
Service: app/services/hub_load.rb
Monitors SolidQueue depth and adjusts event sampling rates:
| Queue Depth | Load Level | Allowed | Blocked | Rate Limited |
|---|---|---|---|---|
| 0-1,000 | Normal | 100% | 100% | 100% |
| 1,001-5,000 | Moderate | 50% | 100% | 100% |
| 5,001-10,000 | High | 20% | 100% | 100% |
| 10,001+ | Critical | 5% | 100% | 100% |
Features:
- Automatic backpressure control
- Always sends 100% of blocks/rate-limits
- Reduces allowed request sampling under load
- Included in every API response
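The tier table above can be read as a simple lookup from queue depth to sampling rates. The following is a minimal sketch of that mapping, not the actual `HubLoad` service: the constant `SAMPLING_TIERS` and the method `sampling_for` are hypothetical names, and the real service also reports an `effective_until` timestamp that is omitted here.

```ruby
# Hypothetical tier table mirroring the documented thresholds.
# Each tier applies when queue_depth <= max_depth.
SAMPLING_TIERS = [
  { max_depth: 1_000,           level: "normal",   allowed: 1.0 },
  { max_depth: 5_000,           level: "moderate", allowed: 0.5 },
  { max_depth: 10_000,          level: "high",     allowed: 0.2 },
  { max_depth: Float::INFINITY, level: "critical", allowed: 0.05 }
].freeze

# Returns a sampling hash shaped like the "sampling" object in API responses.
# Blocked and rate-limited events are always sampled at 100%.
def sampling_for(queue_depth)
  tier = SAMPLING_TIERS.find { |t| queue_depth <= t[:max_depth] }
  {
    "allowed_requests"      => tier[:allowed],
    "blocked_requests"      => 1.0,
    "rate_limited_requests" => 1.0,
    "load_level"            => tier[:level],
    "queue_depth"           => queue_depth
  }
end
```

Only the `allowed_requests` rate varies with load, which keeps security-relevant events (blocks and rate limits) fully visible to the Hub even at critical queue depth.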
5. Background Jobs ✅
ExpiredRulesCleanupJob
File: app/jobs/expired_rules_cleanup_job.rb
- Runs hourly
- Disables rules with `expires_at` in the past
- Cleans up old disabled rules (>30 days) once per day
- Agents pick up disabled rules via `updated_at` change
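To illustrate why disabling (rather than deleting) expired rules lets Agents notice the change, here is a minimal in-memory sketch. It does not reproduce the job itself: `RuleRecord` and `disable_expired!` are hypothetical stand-ins for the Active Record model and the job body.

```ruby
# Hypothetical plain-Ruby stand-in for a row in the rules table.
RuleRecord = Struct.new(:id, :expires_at, :enabled, :updated_at, keyword_init: true)

# Disables every enabled rule whose expires_at has passed and bumps
# updated_at, so the rule reappears in the next incremental sync.
# Returns the list of rules that were changed.
def disable_expired!(rules, now: Time.now.utc)
  expired = rules.select { |r| r.enabled && r.expires_at && r.expires_at <= now }
  expired.each do |r|
    r.enabled = false
    r.updated_at = now
  end
  expired
end
```

Because `updated_at` moves forward, an Agent polling with `since=<last_updated_at>` receives the rule again with `enabled: false` and can remove it locally.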
PathScannerDetectorJob
File: app/jobs/path_scanner_detector_job.rb
- Runs every 5 minutes (recommended)
- Detects IPs hitting scanner paths (/.env, /.git, /wp-admin, etc.)
- Auto-creates 24h ban rules after 3+ hits
- Handles both IPv4 and IPv6
- Prevents duplicate rules
Scanner Paths:
- /.env, /.git, /.aws, /.ssh, /.config
- /wp-admin, /wp-login.php
- /phpMyAdmin, /phpmyadmin
- /admin, /administrator
- /backup, /db_backup
- /.DS_Store, /web.config
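The detection logic described above (count hits per IP, ban at 3 or more, skip duplicates, handle both address families) can be sketched in plain Ruby. This is an illustration under assumptions, not the job's actual code: the event shape (`:ip`, `:path`), the prefix-based path match, the subset of paths, and the method name `scanner_ban_rules` are all hypothetical.

```ruby
require "ipaddr"

# Subset of the documented scanner paths, for illustration only.
SCANNER_PATHS = %w[/.env /.git /wp-admin /wp-login.php /phpmyadmin].freeze
BAN_THRESHOLD = 3
BAN_SECONDS   = 24 * 60 * 60

# Returns attribute hashes for the ban rules that should be created.
# events:         array of { ip: String, path: String }
# existing_cidrs: CIDRs that already have an active ban (duplicate guard)
def scanner_ban_rules(events, existing_cidrs: [], now: Time.now.utc)
  hits = Hash.new(0)
  events.each do |event|
    hits[event[:ip]] += 1 if SCANNER_PATHS.any? { |p| event[:path].start_with?(p) }
  end

  hits.filter_map do |ip, count|
    next if count < BAN_THRESHOLD

    addr = IPAddr.new(ip)
    # A single host is a /32 in IPv4 and a /128 in IPv6.
    cidr = "#{addr}/#{addr.ipv4? ? 32 : 128}"
    next if existing_cidrs.include?(cidr)

    {
      rule_type:  addr.ipv4? ? "network_v4" : "network_v6",
      action:     "deny",
      conditions: { cidr: cidr },
      expires_at: now + BAN_SECONDS,
      source:     "auto:scanner_detected",
      metadata:   { reason: "Hit scanner paths #{count} times" }
    }
  end
end
```

Each returned hash has the same shape as the arguments passed to `Rule.create!` in the usage examples earlier in this document.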
Testing
Create Test Rules
bin/rails runner '
# Network block
Rule.create!(
rule_type: "network_v4",
action: "deny",
conditions: { cidr: "10.0.0.0/8" },
source: "manual",
metadata: { reason: "Test block" }
)
# Rate limit
Rule.create!(
rule_type: "rate_limit",
action: "rate_limit",
conditions: { cidr: "0.0.0.0/0", scope: "global" },
metadata: { limit: 100, window: 60 },
source: "manual"
)
puts "✓ Created #{Rule.count} rules"
puts "✓ Latest version: #{Rule.latest_version}"
'
Test API Endpoints
# Get your project key
bin/rails runner 'puts Project.first.public_key'
# Test version endpoint
curl http://localhost:3000/api/YOUR_PUBLIC_KEY/rules/version | jq
# Test full sync
curl http://localhost:3000/api/YOUR_PUBLIC_KEY/rules | jq
# Test incremental sync
curl "http://localhost:3000/api/YOUR_PUBLIC_KEY/rules?since=2025-11-03T08:00:00.000Z" | jq
Run Background Jobs
# Test expired rules cleanup
bin/rails runner 'ExpiredRulesCleanupJob.perform_now'
# Test scanner detector (needs events first)
bin/rails runner 'PathScannerDetectorJob.perform_now'
# Check hub load
bin/rails runner 'puts HubLoad.stats.inspect'
Agent Integration (Next Steps)
The Agent needs to:
1. Poll for updates every 10 seconds or 1000 events:
   GET /api/:public_key/rules?since=<last_updated_at>
2. Process rules received:
   - `enabled: true` → Insert/update in local tables
   - `enabled: false` → Remove from local tables
3. Populate local SQLite tables:
   # For network_v4 rules:
   cidr = IPAddr.new(rule.conditions.cidr)
   Ipv4Range.upsert({
     source: "hub:#{rule.id}",
     network_start: cidr.to_i,
     network_end: cidr.to_range.end.to_i,
     network_prefix: rule.priority,
     waf_action: map_action(rule.action),
     redirect_url: rule.metadata.redirect_url,
     priority: rule.priority
   })
4. Respect sampling rates from API response:
   sampling = response["sampling"]
   if event.allowed? && rand > sampling["allowed_requests"]
     skip_sending_to_hub
   end
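The Agent-side steps above can be sketched end to end in plain Ruby. This is a minimal illustration, not the Agent's implementation: a Hash stands in for the local SQLite tables, only `network_v4` rules are handled, and the names `apply_rules!` and `send_to_hub?` are hypothetical.

```ruby
require "ipaddr"

# Applies one sync response to the local rule store and returns the new
# version cursor. local_rules is a Hash keyed by "hub:<id>" (a stand-in
# for the Agent's SQLite tables). Re-applying a rule is harmless, which
# matters because overlapping sync windows can deliver a rule twice.
def apply_rules!(local_rules, response)
  response.fetch("rules").each do |rule|
    key = "hub:#{rule.fetch('id')}"
    if !rule["enabled"]
      local_rules.delete(key)                      # soft-deleted on the Hub
    elsif rule["rule_type"] == "network_v4"
      range = IPAddr.new(rule.dig("conditions", "cidr")).to_range
      local_rules[key] = {
        network_start: range.begin.to_i,           # first address as integer
        network_end:   range.end.to_i,             # last address as integer
        action:        rule["action"],
        priority:      rule["priority"]
      }
    end
  end
  response["version"]
end

# Decides whether an event of the given kind ("allowed", "blocked",
# "rate_limited") should be reported, based on the Hub's sampling rates.
# Kinds missing from the sampling hash default to always being sent.
def send_to_hub?(event_kind, sampling, random: rand)
  random < sampling.fetch("#{event_kind}_requests", 1.0)
end
```

Storing the returned version and passing it as `since` on the next poll gives the incremental loop described in step 1.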
Key Design Decisions
✅ IPv4/IPv6 Split
- Separate `network_v4` and `network_v6` rule types
- Agent has separate `ipv4_ranges` and `ipv6_ranges` tables
- Better performance (integer vs binary indexes)
✅ Timestamp-Based Sync
- Use `updated_at` as version cursor (not `id`)
- Handles rule updates and soft deletes
- Query overlap (0.5s) handles clock skew
- Secondary sort by `id` for consistency
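The cursor behaviour described in this list can be sketched over an in-memory array. This is an assumption-laden illustration rather than the model's actual `since` scope: `rules_since`, `latest_version`, and `SYNC_OVERLAP_SECONDS` are hypothetical names, and each rule is a plain Hash with `:id` and `:updated_at`.

```ruby
require "time"

SYNC_OVERLAP_SECONDS = 0.5

# Returns rules changed after the cursor, widened by the overlap window so
# that rows committed slightly "in the past" are not missed. Results are
# ordered by (updated_at, id) for a stable, repeatable sequence.
def rules_since(rules, since_iso8601)
  cutoff = Time.iso8601(since_iso8601) - SYNC_OVERLAP_SECONDS
  rules
    .select { |r| r[:updated_at] > cutoff }
    .sort_by { |r| [r[:updated_at], r[:id]] }
end

# The version is the newest updated_at, formatted with microseconds,
# or nil when there are no rules.
def latest_version(rules)
  rules.map { |r| r[:updated_at] }.max&.utc&.iso8601(6)
end
```

The overlap means an Agent may receive the same rule in two consecutive polls, so the Agent-side apply step should be an upsert rather than a plain insert.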
✅ Soft Deletes
- Rules disabled, not deleted
- Audit trail preserved
- Agents sync via `enabled: false`
- Old rules cleaned after 30 days
✅ Priority from CIDR
- Auto-calculated from prefix length
- Most specific (longest prefix, i.e. smallest network) wins
- `/32` > `/24` > `/16` > `/8`
- No manual priority needed for network rules
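As a small illustration of prefix-based priority, the sketch below derives the priority from a CIDR string using Ruby's standard `IPAddr#prefix` and picks the most specific matching rule. The helper names `priority_for` and `best_match` are hypothetical, and the linear scan is for clarity only; it is not how the Agent's indexed lookup works.

```ruby
require "ipaddr"

# Priority equals the CIDR prefix length, e.g. "10.0.0.0/8" => 8.
def priority_for(cidr)
  IPAddr.new(cidr).prefix
end

# Among the rules whose network contains the address, returns the one
# with the longest prefix (highest priority), or nil if none match.
# rules: array of { cidr: String, action: String }
def best_match(rules, ip)
  addr = IPAddr.new(ip)
  rules
    .select { |r| IPAddr.new(r[:cidr]).include?(addr) }
    .max_by { |r| priority_for(r[:cidr]) }
end
```

With a broad `deny` on `10.0.0.0/8` and a narrower `allow` on `10.1.2.0/24`, an address inside the `/24` resolves to the `allow` rule because 24 is greater than 8.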
✅ Dynamic Sampling
- Hub controls load via sampling rates
- Always sends critical events (blocks, rate limits)
- Reduces allowed event traffic under load
- Prevents Hub overload
Performance Characteristics
Hub
- Version check: Single index lookup (~1ms)
- Incremental sync: Index scan on `(updated_at, id)` (~5-10ms for 100 rules)
- Rule creation: Single insert (~5ms)
Agent (Expected)
- Network lookup: O(log n) via B-tree on `(network_start, network_end)` (<1ms)
- Rate limit check: O(1) hash lookup in memory (<0.1ms)
- Sync overhead: 10s polling, ~5-10 KB payload for 50 rules
What's Not Included (Future Phases)
- ❌ Per-path rate limiting (Phase 2)
- ❌ Path-based event sampling (Phase 2)
- ❌ Challenge actions/CAPTCHA (Phase 2+)
- ❌ Multi-project rules (Phase 10+)
- ❌ Rule UI (manual creation via console for now)
- ❌ Recurring job scheduling (needs separate setup)
Next Implementation Steps
1. Schedule Background Jobs
   - Add to `config/initializers/recurring_jobs.rb` or use gem like `good_job`
   - `ExpiredRulesCleanupJob` every hour
   - `PathScannerDetectorJob` every 5 minutes
2. Build Rule Management UI
   - Form to create network block rules
   - List active rules
   - Disable/enable rules
   - View auto-generated rules
3. Agent Sync Implementation
   - HTTP client to poll rules endpoint
   - SQLite population logic
   - Sampling rate respect
   - Rule evaluation integration
4. Monitoring/Metrics
   - Dashboard showing active rules count
   - Auto-generated rules per day
   - Banned IPs list
   - Rule sync lag per agent
Documentation
Complete architecture documentation available at:
- docs/rule-architecture.md - Full technical specification
- This file - Implementation summary and testing guide
Summary
We've built a production-ready, distributed WAF rule system with:
- ✅ Database schema with optimized indexes
- ✅ Complete Rule model with validations
- ✅ RESTful API with version/incremental/full sync
- ✅ Dynamic load-based event sampling
- ✅ Auto-expiring temporal rules
- ✅ Scanner detection and auto-banning
- ✅ Soft deletes with audit trail
- ✅ IPv4/IPv6 separation
- ✅ Comprehensive documentation
The system is ready for Agent integration and can scale from single-server to multi-agent distributed deployments.