EventFlow Analytics Proposal

Observable State Machines with Automatic Funnel Detection

Version: Draft 0.1
Date: December 2024

1. Executive Summary

EventFlow machines process events and transition between states. Understanding how these flows behave in production - which paths users take, where they drop off, which guards are effective - requires analytics. This proposal introduces EventFlow Analytics - a natural language approach to observing, measuring, and understanding state machine behavior.

Core Philosophy

Numbers tell the story. Funnels reveal the journey.
Metrics are events. Events flow through queues.
Analytics never blocks business logic.

Key Features

Zero overhead collection - Metrics are queued events, never blocking business logic
In-flow analytics declaration - Metrics defined alongside workflows, readable by non-developers
Automatic funnel detection - System discovers conversion paths from state graph topology
Multi-level metrics - Event, state, guard, transition, and context measurement
Near real-time alerts - Queued metrics evaluated within seconds
CLI analytics - eventflow analytics and eventflow funnel commands
Auto-generated insights - Dead code, bottlenecks, drop-off points

2. Motivation & Problem Statement

2.1 Current Situation

The Event Queue Proposal introduces queue-level metrics:

queue.pending, queue.processing, queue.completed
queue.processing_time, queue.wait_time

But there's no model for:

Machine behavior analytics - How do instances flow through states?
Conversion tracking - What percentage reach the success state?
Guard effectiveness - Which conditions actually branch the flow?
Dead code detection - Which paths are defined but never taken?

2.2 Real-World Challenges

Challenge 1: Invisible Funnels
──────────────────────────────
E-commerce orders go through: cart → checkout → payment → fulfillment
How many drop off at each step? We don't know without manual instrumentation.

Challenge 2: Dead Guards
────────────────────────
Guard "customer is VIP" exists in code but always returns false.
Is it dead code or just missing test data?

Challenge 3: Hidden Bottlenecks
───────────────────────────────
Orders are slow. Is it payment processing? Stock reservation?
Which state has the longest duration?

Challenge 4: Disconnected Metrics
─────────────────────────────────
Business analysts define funnels in external tools.
When developers change the workflow, funnel definitions become stale.

2.3 Goals

Automatic funnel discovery - Infer conversion paths from state transitions
In-flow metric declaration - Keep metrics with workflows they measure
Multi-stakeholder readability - Developers, PMs, analysts can all contribute
Near real-time + batch support - Alerts within seconds, reports periodically
Dead code detection - Identify unused paths and always-true/false guards

3. Core Metric Types

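EventFlow Analytics tracks metrics at five levels (event, state, guard, transition, and context), each revealing different insights about machine behavior. Section 3.6 adds timing metrics that cut across these levels for performance profiling.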

3.1 Event Metrics

Track event occurrences, rates, and processing latency.

flow

analytics:
  track :checkout
    count
    rate over 1 minute
    latency: histogram

  track :payment_failed as "Payment Failures"
    count
    rate over 5 minutes

Metric	Type	Description
`count`	Counter	Total event occurrences
`rate over <duration>`	Gauge	Events per time window
`latency: histogram`	Histogram	Time from event arrival to handler completion

Use cases:

Monitor traffic patterns: "How many checkouts per minute?"
Detect anomalies: "Payment failures spiked 3x in the last hour"
Performance monitoring: "p95 checkout latency is 2.3s"
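
A minimal sketch of how `count` and `rate over <duration>` might be computed for a single event, assuming a sliding window over arrival timestamps. The class name `EventRate` and its methods are hypothetical and not part of EventFlow; the proposal does not say whether the window is sliding or fixed.

```python
from collections import deque


class EventRate:
    """Tracks a total count and the number of events inside a sliding window."""

    def __init__(self, window_seconds: float):
        self.window_seconds = window_seconds
        self.count = 0              # `count`: total occurrences (Counter)
        self._timestamps = deque()  # arrival times that may still be in the window

    def record(self, now: float) -> None:
        """Register one occurrence of the event at time `now` (seconds)."""
        self.count += 1
        self._timestamps.append(now)

    def rate(self, now: float) -> int:
        """Return how many events arrived in the window ending at `now` (Gauge)."""
        cutoff = now - self.window_seconds
        # Evict timestamps that have aged out of the window.
        while self._timestamps and self._timestamps[0] <= cutoff:
            self._timestamps.popleft()
        return len(self._timestamps)
```

With a 60-second window, events recorded at t=0, 10, and 50 give a rate of 2 when read at t=65, because the first event has aged out, while `count` stays at 3.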

3.2 State Metrics

Measure time spent in states and state entry/exit patterns.

flow

analytics:
  measure #awaiting_payment
    duration: histogram
    entry_count as "Payments Started"
    exit_count as "Payments Resolved"
    active_count

  measure #fulfilled
    entry_count as "Orders Completed"

Metric	Type	Description
`duration: histogram`	Histogram	Time instances spend in state
`entry_count`	Counter	How many times state was entered
`exit_count`	Counter	How many times state was exited
`active_count`	Gauge	Current instances in state

Use cases:

Identify bottlenecks: "Orders spend 45s average in #awaiting_payment"
Monitor capacity: "234 orders currently in #processing"
Track completions: "8,725 orders fulfilled this week"

3.3 Guard Metrics

Track guard evaluation patterns to detect dead code and understand branching.

flow

analytics:
  measure guard "cart is valid"
    true_rate
    false_rate
    evaluation_count

  measure guard "fraud detected"
    true_rate
    alert when true_rate > 5%

Metric	Type	Description
`true_rate`	Gauge	Percentage of evaluations returning true
`false_rate`	Gauge	Percentage of evaluations returning false
`evaluation_count`	Counter	Total guard evaluations

Auto-detected insights:

Dead guard (always true): Guard "payment gateway available" is 100% true - consider removing
Dead guard (always false): Guard "customer is VIP" is 0% true - dead code or missing data?
Effective guard: Guard "cart is valid" is 92% true, 8% false - working as intended
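
The three insights above can be derived from the guard counters alone. The following is a sketch under stated assumptions: the function name `classify_guard` and the `min_evaluations` cutoff (used to avoid flagging guards that have barely run) are hypothetical and not defined by this proposal.

```python
def classify_guard(true_count: int, evaluation_count: int,
                   min_evaluations: int = 1000) -> str:
    """Label a guard from its counters, mirroring the insight categories above."""
    if evaluation_count < min_evaluations:
        # Too few samples to call a guard dead (assumed threshold).
        return "insufficient data"
    if true_count == evaluation_count:
        return "dead guard (always true)"
    if true_count == 0:
        return "dead guard (always false)"
    return "effective guard"
```

Comparing integer counts rather than a rounded `true_rate` avoids reporting a guard as dead when it is merely 99.99% one-sided.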

3.4 Transition Metrics

Track state-to-state movement patterns and conversion rates.

flow

analytics:
  measure transition #cart -> #checkout
    count as "Checkout Started"
    conversion_rate from #cart

  measure transition #awaiting_payment -> #paid
    count as "Successful Payments"
    conversion_rate from #awaiting_payment

Metric	Type	Description
`count`	Counter	Transition occurrences
`conversion_rate from <state>`	Gauge	Percentage of source state entries that take this transition

Use cases:

Conversion tracking: "78% of #awaiting_payment reach #paid"
Path analysis: "Most common path is cart → checkout → paid → fulfilled"
Drop-off detection: "22% exit to #payment_failed at payment stage"
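
As an illustration of the `conversion_rate from <state>` definition, the sketch below divides each transition count by the entry count of its source state. The function name and the dictionary shapes are assumptions made for this example.

```python
def conversion_rates(entry_counts, transition_counts):
    """Percentage of entries into each source state that took each transition.

    entry_counts:      {state: times the state was entered}
    transition_counts: {(source_state, target_state): times the transition fired}
    """
    rates = {}
    for (source, target), taken in transition_counts.items():
        entries = entry_counts.get(source, 0)
        # A state that was never entered has no meaningful rate; report 0.0.
        rates[(source, target)] = round(100.0 * taken / entries, 1) if entries else 0.0
    return rates
```

For example, 8,725 transitions to #paid out of 11,186 entries into #awaiting_payment gives 78.0, which matches the "78% of #awaiting_payment reach #paid" use case.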

3.5 Context Metrics

Track context variable distributions and cardinality.

flow

analytics:
  measure $total
    distribution: histogram
    buckets: [0, 100, 500, 1000, 5000]

  measure $payment_method
    distribution: labels
    cardinality

  measure $items
    cardinality as "Unique Products Ordered"

Metric	Type	Description
`distribution: histogram`	Histogram	Numeric value distribution with buckets
`distribution: labels`	Labels	Categorical value distribution
`buckets: [...]`	Config	Custom histogram bucket boundaries
`cardinality`	Gauge	Count of unique values

Use cases:

Order value analysis: "65% of orders are between $100-500"
Payment method breakdown: "Credit card 45%, PayPal 30%, Apple Pay 25%"
Product diversity: "Average order contains 2.3 unique products"
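
A sketch of how `buckets: [...]` could drive a histogram. It assumes each boundary opens a half-open interval (so `[0, 100, 500]` means 0 to under 100, 100 to under 500, and 500 and above); the proposal does not define the bucket semantics, and `bucket_counts` is a hypothetical helper.

```python
from bisect import bisect_right


def bucket_counts(values, boundaries):
    """Count values per bucket; bucket i covers [boundaries[i], boundaries[i+1]).

    The last bucket is open-ended. Values below the first boundary are ignored.
    `boundaries` must be sorted in ascending order.
    """
    counts = [0] * len(boundaries)
    for value in values:
        index = bisect_right(boundaries, value) - 1
        if index >= 0:
            counts[index] += 1
    return counts
```

Under this assumption, an order total of exactly 100 lands in the 100-500 bucket rather than the 0-100 bucket.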

3.6 Timing Metrics (Performance Profiling)

Track execution time at the most granular level: individual guards, individual actions, state transitions, and API response times.

flow

analytics:
  // Individual guard timing
  measure guard "cart is valid"
    evaluation_time: histogram
    alert when evaluation_time p95 > 10ms

  measure guard "payment gateway available"
    evaluation_time: histogram

  measure guard "fraud check passed"
    evaluation_time: histogram
    alert when evaluation_time p95 > 500ms

  // Individual action timing
  measure action "validate cart"
    execution_time: histogram
    alert when execution_time p95 > 100ms

  measure action "reserve stock"
    execution_time: histogram

  measure action "charge payment"
    execution_time: histogram
    alert when execution_time p99 > 5 seconds

  measure action "send confirmation email"
    execution_time: histogram

  // State transition overhead
  measure transition #cart -> #checkout
    transition_time: histogram

  measure transition #awaiting_payment -> #paid
    transition_time: histogram

  // API event end-to-end timing
  track :checkout (api)
    response_time: histogram     // HTTP request → response (total)
    processing_time: histogram   // Business logic only (excludes HTTP overhead)

  track :process_payment (api)
    response_time: histogram
    processing_time: histogram

Metric	Type	Applies To	Description
`evaluation_time: histogram`	Histogram	Guard	Time to evaluate guard condition
`execution_time: histogram`	Histogram	Action	Time to execute a single action
`transition_time: histogram`	Histogram	Transition	Overhead of state transition itself
`response_time: histogram`	Histogram	API Event	End-to-end HTTP response time
`processing_time: histogram`	Histogram	API Event	Business logic execution time

Use cases:

Identify slow guards: "Fraud check guard takes 450ms p95 - needs optimization"
Find slow actions: "Payment charging takes 1.8s p95 - consider async processing"
Measure transition overhead: "State transitions are 0.1ms - negligible"
API performance: "Checkout API responds in 120ms p95 - within SLA"

Metric Event Types:

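Timing metrics emit the following metric events. Where an event also appears in Section 6.3 (`:metric.guard_evaluated`), the payload shown here additionally carries `duration_ms` and `machine`: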

Metric Event	Payload	When Emitted
`:metric.guard_evaluated`	`{ guard, result, duration_ms, machine, instance_id }`	Guard evaluation completes
`:metric.action_executed`	`{ action, duration_ms, machine, instance_id }`	Action execution completes
`:metric.transition_completed`	`{ from_state, to_state, duration_ms, machine, instance_id }`	State transition completes
`:metric.api_event_handled`	`{ event, response_time_ms, processing_time_ms, machine, instance_id }`	API event handler completes

Performance Budget:

Define acceptable timing thresholds for automatic enforcement:

flow

analytics:
  performance_budget:
    api_response_time p95: < 500ms
    guard_evaluation p95: < 50ms
    action_execution p95: < 200ms
    transition_time p95: < 1ms

    on budget_exceeded
      notify @ops via slack
      message "Performance budget exceeded: {metric} is {value}"

Auto-detected insights:

Slow guard: Guard "fraud check" p95 > 100ms - consider caching or async pre-check
Slow action: Action "charge payment" p95 > 1s - consider async processing
Normal overhead: Transition times are sub-millisecond - healthy
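
To make the performance budget concrete, here is a sketch of a p95 check over collected durations. It assumes the nearest-rank percentile definition, which the proposal does not specify; `percentile` and `budget_violations` are hypothetical names.

```python
def percentile(samples, p):
    """Nearest-rank percentile for an integer p in 1..100."""
    if not samples:
        raise ValueError("percentile of an empty sample set is undefined")
    ordered = sorted(samples)
    rank = max(1, (p * len(ordered) + 99) // 100)  # ceil(p/100 * n) without floats
    return ordered[rank - 1]


def budget_violations(durations_ms, budget_ms, p=95):
    """Return {metric: observed percentile} for every metric at or over its budget."""
    violations = {}
    for metric, limit in budget_ms.items():
        samples = durations_ms.get(metric)
        if not samples:
            continue  # nothing observed yet, so nothing to enforce
        observed = percentile(samples, p)
        if observed >= limit:  # the budget is written as "p95: < limit"
            violations[metric] = observed
    return violations
```

A non-empty result would correspond to the `on budget_exceeded` handler firing once per listed metric.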

4. Automatic Funnel Detection

The core innovation of EventFlow Analytics is automatic funnel discovery. The system analyzes the state machine topology to identify conversion paths without manual configuration.

4.1 Algorithm

Automatic Funnel Detection Algorithm
─────────────────────────────────────────────────────────────────

Step 1: Build State Graph
  - Parse all state transitions from machine definition
  - Create directed graph: states as nodes, transitions as edges
  - Identify entry states (reachable via API events)

Step 2: Identify Terminal States
  - Terminal state = node with out-degree 0 (no outgoing transitions)
  - Or explicitly marked terminal states

Step 3: Classify Terminals
  - SUCCESS patterns: #fulfilled, #completed, #paid, #shipped, #hired, #approved
  - FAILURE patterns: #cancelled, #failed, #rejected, #expired, #declined
  - NEUTRAL: Other terminals (#archived, #closed)

Step 4: Discover Paths
  - For each terminal state T:
    - Reverse BFS/DFS from T to entry states
    - Record all paths leading to T
    - Track the events that trigger each transition

Step 5: Compute Metrics
  - For each path step:
    - Calculate conversion rate (proceeding to next step)
    - Calculate drop-off rate (exiting to other paths)
    - Identify the drop-off destinations
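
Steps 1, 2, and 4 can be sketched as follows. This is an illustrative reading of the algorithm, not the reference implementation: the function name, the `(from_state, event, to_state)` tuple shape, and the choice to skip cycles are assumptions, and the triggering events are read but not included in the returned paths.

```python
def discover_funnels(transitions, entry_states):
    """Map each terminal state to every acyclic path reaching it from an entry state.

    transitions:  iterable of (from_state, event, to_state)
    entry_states: set of states where instances start
    """
    states, has_outgoing, incoming = set(), set(), {}
    for source, event, target in transitions:
        states.update((source, target))
        has_outgoing.add(source)
        incoming.setdefault(target, []).append((source, event))

    # Step 2: terminal states are nodes with out-degree 0.
    terminals = sorted(states - has_outgoing)

    def paths_to(state, visited):
        # Step 4: walk edges backwards (reverse DFS) until an entry state is hit.
        if state in entry_states:
            return [[state]]
        found = []
        for source, _event in incoming.get(state, []):
            if source in visited:
                continue  # ignore cycles so the search terminates
            for path in paths_to(source, visited | {source}):
                found.append(path + [state])
        return found

    return {terminal: paths_to(terminal, {terminal}) for terminal in terminals}
```

Applied to the order machine used throughout this proposal, with #cart as the only entry state, this yields one path ending in #fulfilled and one ending in #payment_failed.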

4.2 Terminal State Classification

Terminal states can be classified in two ways:

Explicit Marking (Recommended)

Developers mark terminal states during Session 4 (Implementation):

flow

// In the scenario, mark terminal states explicitly
#fulfilled (success)          // terminal - success outcome
#cancelled (failure)          // terminal - failure outcome
#archived (neutral)           // terminal - neutral (neither success nor failure)

This is the recommended approach because:

Clear intent - no ambiguity about state purpose
Part of implementation workflow - developer decides during state derivation
Self-documenting - terminal classification visible in flow file

Pattern-Based Inference (Fallback)

If not explicitly marked, the system infers from naming patterns:

Classification	Patterns	Examples
SUCCESS	fulfilled, completed, paid, shipped, hired, approved, active, done	#fulfilled, #order_completed, #hired
FAILURE	cancelled, failed, rejected, expired, declined, denied, abandoned	#payment_failed, #application_rejected
NEUTRAL	archived, closed, suspended, inactive	#archived, #account_closed

When to use pattern inference: Legacy flows or quick prototyping where explicit marking hasn't been added yet.
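
A sketch of the fallback inference, using the pattern lists from the table above. Several details are assumptions rather than specified behavior: matching on whole underscore-separated words (so `#inactive` does not match `active`), checking FAILURE before SUCCESS when a name matches both, and defaulting unmatched terminals to NEUTRAL.

```python
PATTERNS = {
    # Checked in insertion order: FAILURE first (assumed precedence).
    "FAILURE": {"cancelled", "failed", "rejected", "expired", "declined", "denied", "abandoned"},
    "SUCCESS": {"fulfilled", "completed", "paid", "shipped", "hired", "approved", "active", "done"},
    "NEUTRAL": {"archived", "closed", "suspended", "inactive"},
}


def classify_terminal(state, explicit=None):
    """Classify a terminal state; explicit markings always win over inference."""
    if explicit and state in explicit:
        return explicit[state]
    words = state.lstrip("#").lower().split("_")
    for label, patterns in PATTERNS.items():
        if any(word in patterns for word in words):
            return label
    return "NEUTRAL"  # other terminals are treated as neutral
```

Passing the explicit markings as a dictionary, for example `{"#fulfilled": "SUCCESS"}`, models the recommended approach taking priority over naming patterns.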

4.3 Auto-Generated Funnel Output

Given an e-commerce order machine, the system automatically produces:

Auto-Detected Funnel: @order → #fulfilled
═══════════════════════════════════════════════════════════════

#cart (12,450 entered)
  │
  │ :checkout (91.2% proceed)
  │ [8.8% never checkout - cart abandonment]
  ▼
#checkout (11,356 reached)
  │
  │ :validate (98.5% proceed)
  │ [1.5% validation failed → exit]
  ▼
#awaiting_payment (11,186 reached)
  │
  │ :payment_success (78.0% proceed) ───────────► #paid
  │ :payment_failed (22.0% exit) ───────────────► #payment_failed
  ▼
#paid (8,725 reached)
  │
  │ :ship (100% proceed)
  │
  ▼
#fulfilled (8,725 reached) ✓ SUCCESS

═══════════════════════════════════════════════════════════════
Overall Conversion: 70.1% (8,725 / 12,450)
Primary Drop-off: Payment stage (22% → #payment_failed)
Secondary Drop-off: Cart abandonment (8.8%)
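
The percentages in this output follow directly from the per-state counts. A small sketch of that arithmetic, with `funnel_report` as a hypothetical helper name:

```python
def funnel_report(steps):
    """Compute step-by-step and overall conversion from ordered (state, reached) pairs.

    Returns (rows, overall), where rows holds (state, percent proceeding to next step).
    """
    rows = []
    for (state, reached), (_next_state, next_reached) in zip(steps, steps[1:]):
        rows.append((state, round(100 * next_reached / reached, 1)))
    overall = round(100 * steps[-1][1] / steps[0][1], 1)
    return rows, overall


steps = [
    ("#cart", 12450),
    ("#checkout", 11356),
    ("#awaiting_payment", 11186),
    ("#paid", 8725),
    ("#fulfilled", 8725),
]
rows, overall = funnel_report(steps)
```

For the counts shown above, the step rates come out as 91.2, 98.5, 78.0, and 100.0, and the overall conversion as 70.1.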

4.4 Multiple Funnels Per Machine

A machine may have multiple terminal states, generating multiple funnels:

@order Funnels (Auto-Detected)
─────────────────────────────────────────

Funnel 1: → #fulfilled (SUCCESS)
  Conversion: 70.1%
  Path: #cart → #checkout → #awaiting_payment → #paid → #fulfilled

Funnel 2: → #payment_failed (FAILURE)
  Conversion: 15.4%
  Path: #cart → #checkout → #awaiting_payment → #payment_failed

Funnel 3: → #cancelled (FAILURE)
  Conversion: 5.6%
  Path: #cart → #checkout → #awaiting_payment → #paid → #cancelled

Unclassified exits: 8.9% (cart abandonment - never reached #checkout)

4.5 Optional Funnel Hints

While detection is automatic, users can provide hints to label or customize funnels:

flow

analytics:
  funnel "Purchase Flow"
    success: #fulfilled, #shipped
    failure: #cancelled, #payment_failed
    label: "E-Commerce Checkout"

  funnel "Refund Process"
    entry: #fulfilled
    success: #refunded
    failure: #refund_denied

Option	Description
`success:`	States to classify as successful completion
`failure:`	States to classify as failure/drop-off
`entry:`	Override the funnel entry point (default: initial states)
`label:`	Human-readable funnel name

5. DSL Syntax

5.1 Analytics Block

Analytics are declared in an analytics: block within a machine:

flow

machine: @order

analytics:
  // Event tracking
  track :checkout as "Checkout Started"
    count
    rate over 1 minute
    latency: histogram

  // State measurement
  measure #awaiting_payment
    duration: histogram
    alert when duration p95 > 30 seconds

  // Guard tracking
  measure guard "cart is valid"
    true_rate
    false_rate

  // Context distribution
  measure $total
    distribution: histogram
    buckets: [0, 100, 500, 1000, 5000]

  // Alerts
  alert "High Failure Rate"
    when :payment_failed rate > 5% of :checkout over 1 hour
    severity: warning
    notify: payments-team

  // Funnel hints (optional)
  funnel "Purchase Flow"
    success: #fulfilled
    failure: #cancelled, #payment_failed

scenario: order processing
  // ... event handlers ...

5.2 Inline Tracking

For simpler cases, add tracking directly to event handlers:

flow

on :checkout from @customer (api) track
  // 'track' enables default event metrics (count, rate, latency)
  ? cart is valid
    order moves to #awaiting_payment

on :payment_success from @payment track as "Payment Completed"
  // Custom label for this event
  order moves to #paid measure
  // 'measure' enables default state metrics (duration, entry_count)

5.3 Alert Syntax

Alerts are defined within the analytics: block, alongside tracking and measurement declarations:

flow

machine: @order

analytics:
  // Tracking declarations
  track :checkout
    count
    rate over 1 minute

  track :payment_failed
    count

  // Measurement declarations
  measure #awaiting_payment
    duration: histogram

  // Alert declarations (same block, typically at the end)
  alert "High Failure Rate"
    when :payment_failed rate > 5% of :checkout over 1 hour
    severity: warning
    notify: payments-team

  alert "Stuck Orders"
    when #awaiting_payment duration > 5 minutes for any instance
    severity: critical
    notify: on-call

Alert syntax structure:

flow

alert "<name>"
  when <condition>
  severity: <level>
  notify: <channel>

Condition patterns:

flow

// Event rate conditions
when :payment_failed rate > 10 per minute
when :payment_failed rate > 5% of :checkout over 1 hour

// State duration conditions
when #awaiting_payment duration p95 > 30 seconds
when #awaiting_payment duration > 5 minutes for any instance

// Guard conditions
when guard "fraud detected" true_rate > 5%

// Count conditions
when #payment_failed active_count > 100

Severity levels:

Level	Description
`info`	Informational, no action required
`warning`	Attention needed, not urgent
`critical`	Immediate action required
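
To illustrate how a ratio condition such as `when :payment_failed rate > 5% of :checkout over 1 hour` might be evaluated, here is a sketch that counts both events inside the window and compares the percentage. The `RatioAlert` class and its field names are hypothetical; the proposal defines only the DSL syntax, not the evaluator.

```python
from dataclasses import dataclass


@dataclass
class RatioAlert:
    name: str
    numerator: str         # event being watched, e.g. ":payment_failed"
    denominator: str       # reference event, e.g. ":checkout"
    threshold_pct: float
    window_seconds: float
    severity: str
    notify: str

    def should_fire(self, events, now):
        """events: list of (timestamp_seconds, event_name) pairs."""
        start = now - self.window_seconds
        in_window = [name for ts, name in events if start < ts <= now]
        watched = in_window.count(self.numerator)
        reference = in_window.count(self.denominator)
        if reference == 0:
            return False  # no baseline traffic, so the ratio is undefined
        return 100.0 * watched / reference > self.threshold_pct
```

Treating an empty denominator as "do not fire" is a design choice made for this sketch; a real evaluator might instead raise a separate "no traffic" alert.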

6. Zero Overhead Architecture

The most critical architectural principle of EventFlow Analytics:

Analytics collection must have near-zero overhead on production workloads.

Traditional synchronous analytics add latency to every instrumented operation. EventFlow Analytics takes a fundamentally different approach: metrics are events that flow through queues.

6.1 The Problem with Synchronous Analytics

Traditional analytics block business logic:

Traditional Approach (BAD):
─────────────────────────────────────────────────────────────────
:checkout event arrives
    │
    ├──► Track event count (5ms)
    ├──► Write to metrics DB (10ms)
    ├──► Update histogram (2ms)
    │
    └──► Continue to business logic...

Total added latency: ~17ms per event ❌

This approach:

Adds latency to every event handler
Creates coupling between business logic and analytics storage
Can cause cascading failures if analytics storage is slow/down

6.2 Metric Events Architecture

EventFlow Analytics treats every metric observation as a lightweight event:

EventFlow Approach (GOOD):
─────────────────────────────────────────────────────────────────
:checkout event arrives
    │
    ├──► Emit :metric.event_received (fire-and-forget, ~0.01ms)
    │
    └──► Continue to business logic immediately

Total added latency: ~0ms ✓

Meanwhile (async):
─────────────────────────────────────────────────────────────────
:metric.event_received ──┐
:metric.state_entered  ──┼──► Analytics Queue ──► Analytics Worker ──► Storage
:metric.guard_evaluated──┘         │
                              (low priority)
                              (batched writes)
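
The fire-and-forget emit can be pictured as an append to a bounded in-memory buffer that a separate worker drains later. The sketch below is one possible shape, assuming a `collections.deque` with `maxlen` as the ring buffer; `MetricBuffer`, `emit`, and `drain` are hypothetical names, not EventFlow APIs.

```python
import threading
from collections import deque


class MetricBuffer:
    """Bounded buffer: emit() never waits on storage and never raises on overflow."""

    def __init__(self, capacity=10_000):
        self._events = deque(maxlen=capacity)  # a full deque discards its oldest item
        self._lock = threading.Lock()          # guards the drop counter and drains
        self.dropped = 0

    def emit(self, name, payload):
        """Called from the business path; only an in-memory append happens here."""
        with self._lock:
            if len(self._events) == self._events.maxlen:
                self.dropped += 1
            self._events.append((name, payload))

    def drain(self, max_items):
        """Called from the analytics worker; removes up to max_items oldest events."""
        with self._lock:
            count = min(max_items, len(self._events))
            return [self._events.popleft() for _ in range(count)]
```

The lock is held only for an append or a short pop loop, so the business path does no I/O; all storage writes happen on the worker side after `drain`.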

6.3 Metric Event Types

Every analytics observation emits an internal metric event:

Metric Event	Payload	When Emitted
`:metric.event_received`	`{ event, machine, instance_id, timestamp }`	Event handler starts
`:metric.event_completed`	`{ event, duration_ms, success, error? }`	Event handler completes
`:metric.event_emitted`	`{ event, to_machine, to_instance, timestamp }`	Event emitted to another machine
`:metric.state_entered`	`{ state, instance_id, timestamp }`	State transition (enter)
`:metric.state_exited`	`{ state, instance_id, duration_ms }`	State transition (exit)
`:metric.guard_evaluated`	`{ guard, result, instance_id }`	Guard condition checked
`:metric.transition`	`{ from, to, event, instance_id }`	State transition recorded
`:metric.context_changed`	`{ variable, old_value, new_value }`	Context variable modified

:metric.event_emitted Use Cases:

Cross-machine communication tracking - Monitor event flow between machines
Event chain visualization - Trace causal chains across system
Cascade failure detection - Identify failure propagation patterns

6.4 Analytics Queue

Metric events flow through a dedicated analytics queue:

flow

system: e-commerce

analytics:
  queue:
    priority: bulk              // lowest priority, never starves business events
    concurrency: 10             // metrics are independent, high parallelism OK
    batch_size: 100             // write 100 metrics per DB operation
    flush_interval: 1 second    // or flush every second, whichever comes first
    buffer_size: 10000          // in-memory ring buffer
    overflow: drop_oldest       // if buffer full, drop oldest (never block)

Option	Default	Description
`priority`	`bulk`	Queue priority (bulk = lowest)
`concurrency`	`10`	Parallel metric processors
`batch_size`	`100`	Metrics per storage write
`flush_interval`	`1 second`	Maximum time before flush
`buffer_size`	`10000`	In-memory buffer capacity
`overflow`	`drop_oldest`	Behavior when buffer full
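
A sketch of how `batch_size` and `flush_interval` could interact: a batch is written when it is full or when the interval has elapsed since the last flush, whichever comes first. `BatchWriter` and the `write_batch` callback are hypothetical, and time is passed in explicitly to keep the example deterministic.

```python
class BatchWriter:
    """Groups metrics so that storage sees one write per batch."""

    def __init__(self, write_batch, batch_size=100, flush_interval=1.0):
        self.write_batch = write_batch        # callable that receives a list of metrics
        self.batch_size = batch_size
        self.flush_interval = flush_interval  # seconds
        self._pending = []
        self._last_flush = 0.0

    def add(self, metric, now):
        self._pending.append(metric)
        self.maybe_flush(now)

    def maybe_flush(self, now):
        """Flush when the batch is full or the interval has passed with data pending."""
        full = len(self._pending) >= self.batch_size
        stale = bool(self._pending) and now - self._last_flush >= self.flush_interval
        if full or stale:
            self.write_batch(self._pending)
            self._pending = []
            self._last_flush = now
```

In a running worker, `maybe_flush` would also be called on a timer so that a trickle of metrics is still written within the interval.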

6.5 Collection Modes

Configure how metrics are collected:

flow

analytics:
  collection: queued          // default - async via queue (production)

flow

analytics:
  collection: sampled 10%     // collect only 10% of metrics (high-traffic)

flow

analytics:
  collection: disabled        // no collection (emergency/testing)

flow

analytics:
  collection: sync            // synchronous (development only!)

Mode	Overhead	Use Case
`queued`	~0ms	Production (default)
`sampled N%`	~0ms	Very high traffic, approximate metrics OK
`disabled`	0ms	Emergency, load testing without metrics
`sync`	+15-50ms	Local development, debugging
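
For the `sampled N%` mode, counts are approximate and need to be scaled back up by the sampling rate. A minimal sketch, assuming independent random sampling per observation; `SampledCounter` is a hypothetical name and the proposal does not specify the sampling strategy.

```python
import random


class SampledCounter:
    """Keeps only a fraction of observations and estimates the true total."""

    def __init__(self, sample_rate, rng=None):
        if not 0 < sample_rate <= 1:
            raise ValueError("sample_rate must be in (0, 1]")
        self.sample_rate = sample_rate
        self._rng = rng or random.Random()
        self.sampled = 0

    def observe(self):
        """Record this observation with probability sample_rate."""
        if self._rng.random() < self.sample_rate:
            self.sampled += 1

    def estimate(self):
        """Scale the sampled count back up to an estimate of the real count."""
        return self.sampled / self.sample_rate
```

The estimate is unbiased but noisy, which is why this mode suits high-traffic machines where approximate metrics are acceptable.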

6.6 Performance Guarantees

Metric	Guarantee
Business event latency impact	< 0.1ms (fire-and-forget emit)
Memory overhead per metric	~100-200 bytes (roughly 1-2 MB for the default 10,000-event buffer)
Metric delivery	Best-effort (may drop under extreme load)
Metric latency	1-5 seconds from observation to storage
Batch efficiency	100+ metrics per DB write

6.7 Graceful Degradation

Analytics never impacts business operations:

Scenario: Analytics storage is down
───────────────────────────────────
1. Metric events accumulate in buffer
2. Buffer reaches capacity (10,000 events)
3. Oldest metrics dropped (ring buffer)
4. Business events continue unaffected ✓
5. When storage recovers, remaining metrics flush

Scenario: Extreme traffic spike
───────────────────────────────
1. Metrics are generated faster than they can be processed
2. Buffer fills up
3. System switches to sampling mode automatically
4. Business events continue unaffected ✓
5. Approximate metrics still collected

6.8 Architecture Diagram

┌─────────────────────────────────────────────────────────────────────────────┐
│                           BUSINESS EVENT FLOW                               │
│                                                                             │
│   API ──► Validation ──► Business Queue ──► Worker ──► State Change ──► Response
│                                               │                             │
│                                        (fire-and-forget)                    │
│                                               │                             │
│                                               ▼                             │
│   ┌───────────────────────────────────────────────────────────────────────┐ │
│   │                      ANALYTICS PIPELINE                               │ │
│   │                                                                       │ │
│   │   :metric.* ──► Ring Buffer ──► Analytics Queue ──► Analytics Worker  │ │
│   │    events       (in-memory)      (low priority)      (batch writes)   │ │
│   │                                                            │          │ │
│   │                                        ┌───────────────────┼────────┐ │ │
│   │                                        │                   │        │ │ │
│   │                                        ▼                   ▼        │ │ │
│   │                                   ┌─────────┐      ┌─────────────┐  │ │ │
│   │                                   │ Storage │      │ Alert Check │  │ │ │
│   │                                   └─────────┘      └─────────────┘  │ │ │
│   │                                                           │         │ │ │
│   │                                                           ▼         │ │ │
│   │                                                    ┌────────────┐   │ │ │
│   │                                                    │ Notify     │   │ │ │
│   │                                                    └────────────┘   │ │ │
│   └───────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────┘

7. Timing Model

Because metrics flow through queues, "real-time" actually means "near real-time" with a small delay (1-5 seconds).

7.1 Near Real-Time Analytics

Metrics are available within seconds of observation:

Capability	Description	Latency
Live counters	Event counts, state entry/exit	1-5 seconds
Rate calculation	Events per minute/hour	Rolling window, ~5s delay
Alert evaluation	Threshold breach detection	1-5 seconds
Active gauges	Current instances per state	~1 second

Why not truly real-time?

Metrics are queued (fire-and-forget)
Buffer flushes every 1 second
Analytics worker processes batch
Total pipeline latency: 1-5 seconds

This is acceptable because:

Business logic is not blocked
Alerts within seconds are fast enough for most use cases
True sub-second alerting requires external streaming infrastructure

7.2 Batch Analytics

Processed periodically for comprehensive analysis:

Frequency	Analysis	Output
Hourly	Funnel conversion rates	Trend updates
Daily	Full funnel analysis, path frequencies	Daily report
Weekly	Dead code detection, guard effectiveness	Coverage report

Use cases:

Funnel reports with accurate conversion rates
Historical trend comparison
Dead code and test coverage analysis

7.3 Alert Evaluation

Alerts are evaluated by the Analytics Worker, not inline:

Metric Events ──► Analytics Queue ──► Analytics Worker
                                             │
                                 ┌───────────┴───────────┐
                                 │                       │
                                 ▼                       ▼
                          Write to Storage        Evaluate Alert Rules
                                                         │
                                                         ▼
                                                  (if threshold breached)
                                                         │
                                                         ▼
                                                  Send Notification

This means alerts fire 1-5 seconds after the triggering event, not immediately. For most monitoring scenarios, this is acceptable.

7.4 Configuration

flow

analytics:
  timing:
    flush_interval: 1 second    // how often to flush buffer to queue
    alerts: near-realtime       // evaluated on each batch (1-5s)
    dashboards: near-realtime   // updated on each batch (1-5s)
    funnels: hourly             // aggregated hourly
    coverage: weekly            // full analysis weekly

8. Auto-Generated Insights

Beyond raw metrics, EventFlow Analytics automatically generates actionable insights.

8.1 Path Analysis

Path Analysis: @order (Last 7 Days)
═══════════════════════════════════════════════════════════════

Most Common Paths:
──────────────────
1. #cart → #checkout → #paid → #fulfilled         (68.2% of orders)
   Average duration: 4.2 hours

2. #cart → #checkout → #payment_failed             (15.4% of orders)
   Average duration: 12 minutes

3. #cart → #checkout → #paid → #cancelled          (5.6% of orders)
   Average duration: 18 hours

4. #cart → (abandoned)                             (8.9% of sessions)
   Never reached #checkout

Rare Paths (< 1%):
──────────────────
- #cart → #checkout → #paid → #fraud_review → #cancelled (0.3%)
- #cart → #checkout → #paid → #shipped → #returned (0.8%)

8.2 Bottleneck Detection

Bottleneck Analysis: @order
═══════════════════════════════════════════════════════════════

State Duration Ranking:
───────────────────────
1. #paid                  p50: 2.1h    p95: 18h     ← Longest
2. #awaiting_payment      p50: 12s     p95: 45s     ⚠ Warning threshold
3. #checkout              p50: 2s      p95: 8s      ✓ Healthy
4. #cart                  p50: 5m      p95: 30m     ✓ Healthy

Recommendations:
────────────────
- #paid has high p95 (18h) - investigate shipping delays
- #awaiting_payment p95 (45s) exceeds 30s warning threshold

8.3 Drop-off Analysis

Drop-off Analysis: @order
═══════════════════════════════════════════════════════════════

Significant Drop-offs (> 5%):
─────────────────────────────
1. #awaiting_payment → #payment_failed (22.0%)
   Cause: Payment processing failures
   Events: :payment_failed, :card_declined, :insufficient_funds

2. #cart → (abandoned) (8.9%)
   Cause: Cart abandonment
   Events: (no checkout event received)

3. #paid → #cancelled (5.6%)
   Cause: Post-payment cancellations
   Events: :cancel_order, :out_of_stock

Trend Comparison (vs Last Week):
────────────────────────────────
- Payment failures: 22.0% (+2.3%) ⚠ Increasing
- Cart abandonment: 8.9% (-0.5%) ✓ Improving
- Cancellations: 5.6% (+0.1%) ─ Stable

8.4 Dead Code Detection

Dead Code Analysis: @order
═══════════════════════════════════════════════════════════════

Dead Guards (Always Same Result):
─────────────────────────────────
⚠ "payment gateway available"    100% true   (10,234 evaluations)
  → Consider removing - gateway always available

⚠ "customer is VIP"              0% true     (5,432 evaluations)
  → Either dead code or missing VIP customers in production

Unused Transitions (0 Occurrences in 30 Days):
──────────────────────────────────────────────
✗ #paid → #fraud_review
  → Defined but never taken - verify fraud detection is working

✗ #fulfilled → #disputed
  → Defined but never taken - may be dead code

Recommendations:
────────────────
1. Review "payment gateway available" guard - likely removable
2. Verify VIP detection logic or add test data
3. Test fraud detection path manually
4. Consider removing #disputed state if unused
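
The "Unused Transitions" part of this report amounts to comparing the transitions defined in the machine with those observed in the analysis window. A sketch, with `unused_transitions` as a hypothetical helper and transitions represented as `(from_state, to_state)` tuples:

```python
def unused_transitions(defined, observed_counts):
    """Return defined transitions that were never taken in the analysis window.

    defined:         iterable of (from_state, to_state) from the machine definition
    observed_counts: {(from_state, to_state): occurrences in the window}
    """
    return sorted(t for t in defined if observed_counts.get(t, 0) == 0)
```

A transition appearing here is only a candidate for removal: as the recommendations above note, it may also indicate an upstream path that is not working or not exercised.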

8.5 Test Coverage Integration

Test Coverage: @order
═══════════════════════════════════════════════════════════════

States:
───────
  #cart                  [✓ tested: entry, exit]
  #checkout              [✓ tested: entry, exit]
  #awaiting_payment      [✓ tested: entry, exit]
  #paid                  [✓ tested: entry, exit]
  #fulfilled             [⚠ tested: entry only]
  #payment_failed        [✓ tested: entry, exit]
  #fraud_review          [✗ NOT TESTED]

Guards:
───────
  "cart is valid"        [✓ tested: both branches]
  "fraud detected"       [⚠ tested: false branch only]
  "gateway available"    [✗ NOT TESTED]

Transitions:
────────────
  #cart → #checkout                    [✓ tested]
  #awaiting_payment → #paid            [✓ tested]
  #awaiting_payment → #payment_failed  [✓ tested]
  #paid → #fraud_review                [✗ NOT TESTED]

Coverage Summary:
─────────────────
  States:      85% (6/7)
  Guards:      67% (4/6 branches)
  Transitions: 88% (7/8)
  Overall:     80%

Recommendations:
────────────────
  - Add test for #fraud_review entry
  - Add test for "fraud detected" = true branch
  - Add test for #paid → #fraud_review transition

9. CLI Commands

9.1 Analytics Dashboard

bash

# Live dashboard
eventflow analytics @order --live

# Period-based analytics
eventflow analytics @order --period 7d
eventflow analytics @order --from 2024-01-01 --to 2024-01-31

# Filter by state or event
eventflow analytics @order --state=#awaiting_payment
eventflow analytics @order --event=:payment_failed

# Cohort filtering ($ is escaped so the shell does not expand context variables)
eventflow analytics @order --where "\$customer_type = 'premium'"
eventflow analytics @order --where "\$total > 1000"

Output:

@order Analytics (Last 7 Days)
═══════════════════════════════════════════════════════════════

Events:
┌──────────────────────┬─────────┬──────────┬───────────┬─────────┐
│ Event                │ Count   │ Rate/min │ p50 lat   │ p95 lat │
├──────────────────────┼─────────┼──────────┼───────────┼─────────┤
│ :checkout            │ 12,450  │ 1.24     │ 45ms      │ 120ms   │
│ :payment_success     │ 9,711   │ 0.96     │ 890ms     │ 2.1s    │
│ :payment_failed      │ 2,739   │ 0.27     │ 650ms     │ 1.8s    │
│ :ship                │ 9,234   │ 0.92     │ 12ms      │ 45ms    │
└──────────────────────┴─────────┴──────────┴───────────┴─────────┘

States:
┌─────────────────────┬──────────┬──────────┬───────────┬─────────┐
│ State               │ Entries  │ Active   │ p50 dur   │ p95 dur │
├─────────────────────┼──────────┼──────────┼───────────┼─────────┤
│ #awaiting_payment   │ 12,450   │ 234      │ 12s       │ 45s     │
│ #paid               │ 9,711    │ 456      │ 2.1h      │ 18h     │
│ #fulfilled          │ 9,234    │ -        │ terminal  │ -       │
│ #payment_failed     │ 2,739    │ 89       │ terminal  │ -       │
└─────────────────────┴──────────┴──────────┴───────────┴─────────┘

Guards:
┌────────────────────────────┬─────────┬──────────┬─────────────┐
│ Guard                      │ True %  │ False %  │ Evaluations │
├────────────────────────────┼─────────┼──────────┼─────────────┤
│ "cart is valid"            │ 92%     │ 8%       │ 12,450      │
│ "fraud detected"           │ 2%      │ 98%      │ 9,711       │
│ "payment gateway available"│ 100%    │ 0%       │ 12,450      │ ⚠
└────────────────────────────┴─────────┴──────────┴─────────────┘
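
The Rate/min column and the ⚠ marker on one-sided guards can both be derived from raw counts. A small Python sketch, assuming the rate is a plain average over the whole reporting period and that a guard is flagged when every evaluation returned the same result (`rate_per_minute` and `is_one_sided` are hypothetical helper names):

```python
def rate_per_minute(count: int, period_days: int) -> float:
    """Average events per minute over the reporting period."""
    minutes = period_days * 24 * 60
    return round(count / minutes, 2)


def is_one_sided(true_count: int, evaluations: int) -> bool:
    """True when a guard never branches: always true or always false."""
    return evaluations > 0 and true_count in (0, evaluations)


checkout_rate = rate_per_minute(12_450, period_days=7)
gateway_flagged = is_one_sided(true_count=12_450, evaluations=12_450)
```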

9.2 Funnel Analysis

bash

# Auto-detected funnels
eventflow funnel @order

# Specific terminal state
eventflow funnel @order --to=#fulfilled

# Compare periods
eventflow funnel @order --compare --before 2024-01-01 --after 2024-01-01

# Cohort comparison
eventflow funnel @order --where "\$customer_type = 'premium'" --compare-to "\$customer_type = 'standard'"

# Export funnel data
eventflow funnel @order --format=csv > funnel.csv
eventflow funnel @order --format=json > funnel.json

Output:

Funnel: @order → #fulfilled
Period: Last 7 Days
═══════════════════════════════════════════════════════════════

#cart (12,450 entered)
  │
  │ :checkout (91.2% proceed)
  │ [8.8% abandon - never checkout]
  ▼
#checkout (11,356 reached)
  │
  │ :validate (98.5% proceed)
  │ [1.5% validation failed]
  ▼
#awaiting_payment (11,186 reached)
  │
  │ :payment_success (78.0%) ──────► #paid
  │ :payment_failed (22.0%) ───────► #payment_failed
  ▼
#paid (8,725 reached)
  │
  │ :ship (100% proceed)
  │
  ▼
#fulfilled (8,725 reached) ✓

═══════════════════════════════════════════════════════════════
Overall Conversion: 70.1%
Primary Drop-off: Payment (22% → #payment_failed)

Comparison (vs Previous 7 Days):
  Overall: 70.1% (+2.3%)  ▲
  Payment success: 78.0% (+1.8%)  ▲
  Cart abandonment: 8.8% (-0.5%)  ▼
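
The step and overall percentages in the funnel output follow directly from the per-stage counts. A minimal Python sketch, assuming a linear list of stages and one decimal of rounding (`funnel_report` is a hypothetical name, not part of the proposed CLI):

```python
def funnel_report(stages):
    """Step-by-step and overall conversion for an ordered list of (state, count)."""
    if not stages:
        raise ValueError("at least one stage is required")
    steps = []
    for (src, entered), (dst, reached) in zip(stages, stages[1:]):
        proceed = 100 * reached / entered if entered else 0.0
        steps.append({
            "from": src,
            "to": dst,
            "proceed_pct": round(proceed, 1),
            "drop_pct": round(100 - proceed, 1),
        })
    first, last = stages[0][1], stages[-1][1]
    overall = round(100 * last / first, 1) if first else 0.0
    # The primary drop-off is the step that loses the largest share.
    worst = max(steps, key=lambda s: s["drop_pct"]) if steps else None
    return {"steps": steps, "overall_pct": overall, "primary_drop_off": worst}


report = funnel_report([
    ("#cart", 12_450),
    ("#checkout", 11_356),
    ("#awaiting_payment", 11_186),
    ("#paid", 8_725),
    ("#fulfilled", 8_725),
])
```

Note that the overall conversion is the last count divided by the first, not the product of the rounded step percentages, which avoids compounding rounding error.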

9.3 Coverage Analysis

bash

# Full coverage report
eventflow coverage @order

# Guards only
eventflow coverage @order --guards

# Dead code detection
eventflow coverage @order --dead-code

# Verbose output
eventflow coverage @order --verbose

# Compare with production data
eventflow coverage @order --production-data

Output:

Coverage Report: @order
═══════════════════════════════════════════════════════════════

Overall Coverage: 80% (16/20 branches)

Untested:
─────────
  ✗ State #fraud_review entry
  ✗ Guard "fraud detected" = true
  ✗ Guard "gateway available" (both branches)
  ✗ Transition #paid → #fraud_review

Dead Code (Production Data):
────────────────────────────
  ⚠ Guard "gateway available" always true (100%)
  ⚠ Guard "customer is VIP" always false (0%)
  ⚠ Transition #paid → #fraud_review (0 occurrences)

Recommendations:
────────────────
  1. Add test: fraud detection true path
  2. Review: "gateway available" guard (always true)
  3. Review: "customer is VIP" guard (always false)
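
The dead-code section can be produced by scanning production counts for guards that never branch and transitions that never fire. A Python sketch under stated assumptions: `GuardStats`, `dead_code_candidates`, and the `min_evaluations` cut-off are illustrative and not defined by the proposal.

```python
from dataclasses import dataclass


@dataclass
class GuardStats:
    name: str
    true_count: int
    evaluations: int


def dead_code_candidates(guards, transitions, min_evaluations=100):
    """Findings for guards that never branch and transitions never taken."""
    findings = []
    for g in guards:
        if g.evaluations < min_evaluations:
            continue  # too little data to draw a conclusion
        if g.true_count == g.evaluations:
            findings.append(f'Guard "{g.name}" always true (100%)')
        elif g.true_count == 0:
            findings.append(f'Guard "{g.name}" always false (0%)')
    for (src, dst), occurrences in transitions.items():
        if occurrences == 0:
            findings.append(f"Transition {src} → {dst} (0 occurrences)")
    return findings


findings = dead_code_candidates(
    guards=[
        GuardStats("gateway available", 12_450, 12_450),
        GuardStats("customer is VIP", 0, 9_711),
        GuardStats("cart is valid", 11_454, 12_450),
    ],
    transitions={("#paid", "#fraud_review"): 0, ("#paid", "#fulfilled"): 9_234},
)
```

These are candidates rather than verdicts: a guard that is always false may simply reflect a cohort that has not appeared in production yet.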

9.4 Alerts

bash

# Active alerts
eventflow alerts

# Alert history
eventflow alerts --history --period=7d

# Filter by severity
eventflow alerts --severity=critical

# Acknowledge alert
eventflow alerts ack <alert-id>

# Silence alert temporarily
eventflow alerts silence <alert-id> --duration=1h

Output:

Active Alerts
═══════════════════════════════════════════════════════════════

⚠ [warning] High Payment Failure Rate
  Machine: @order
  Condition: :payment_failed rate > 5% over 1 hour
  Current: 6.2%
  Triggered: 14 minutes ago
  ID: alert-abc123

✓ No critical alerts

History (Last 24h):
───────────────────
  [resolved] High Payment Failure Rate (2h ago, duration: 45m)
  [resolved] Stuck in #awaiting_payment (6h ago, duration: 12m)
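
A condition such as ":payment_failed rate > 5% over 1 hour" can be evaluated with a sliding window over recent events. A minimal Python sketch, assuming the rate is measured relative to :checkout events and that events are recorded in timestamp order (`RatioAlert` is a hypothetical class, not the proposal's evaluator):

```python
from collections import deque


class RatioAlert:
    """Fires when numerator events exceed a share of denominator events
    inside a sliding time window. Events must be recorded in time order."""

    def __init__(self, numerator, denominator, threshold, window_seconds):
        self.numerator = numerator
        self.denominator = denominator
        self.threshold = threshold
        self.window = window_seconds
        self._events = deque()  # (timestamp, event_name)

    def record(self, event, timestamp):
        if event in (self.numerator, self.denominator):
            self._events.append((timestamp, event))

    def ratio(self, now):
        # Evict events that have fallen out of the window.
        while self._events and self._events[0][0] <= now - self.window:
            self._events.popleft()
        hits = sum(1 for _, name in self._events if name == self.numerator)
        base = sum(1 for _, name in self._events if name == self.denominator)
        return hits / base if base else 0.0

    def is_firing(self, now):
        return self.ratio(now) > self.threshold


alert = RatioAlert(":payment_failed", ":checkout", threshold=0.05, window_seconds=3600)
for i in range(1000):
    alert.record(":checkout", timestamp=float(i))
for i in range(62):
    alert.record(":payment_failed", timestamp=1000.0 + i)
```

A production evaluator would more likely keep running counters per time bucket than scan individual events, but the windowing logic is the same.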

9.5 Insights

bash

# All auto-generated insights
eventflow insights @order

# Specific insight types
eventflow insights @order --bottlenecks
eventflow insights @order --drop-offs
eventflow insights @order --dead-code
eventflow insights @order --paths

Output:

$ eventflow insights @order

@order Insights (Last 7 Days)
═══════════════════════════════════════════════════════════════

BOTTLENECKS
───────────
  #paid has high p95 duration (18h)
    → Consider: Add shipping automation or parallel processing

  #awaiting_payment p95 (45s) exceeds warning threshold
    → Consider: Optimize payment gateway integration

DROP-OFFS
─────────
  22% drop at payment stage (#awaiting_payment → #payment_failed)
    → Consider: Add payment retry, alternative payment methods

  8.8% cart abandonment (never reach #checkout)
    → Consider: Cart recovery emails, simplify checkout

DEAD CODE
─────────
  Guard "customer is VIP" always false (0% true rate)
    → Either dead code or missing VIP customers in production

  Guard "payment gateway available" always true (100%)
    → Consider removing - provides no branching value

  Transition #paid → #fraud_review never taken (0 occurrences)
    → Verify fraud detection is working correctly

PATH INSIGHTS
─────────────
  68% take happy path: #cart → #checkout → #paid → #fulfilled
  15% fail at payment: #cart → #checkout → #payment_failed
  8.8% abandon cart: #cart → (no further events)
  5.6% cancel after payment: #cart → #checkout → #paid → #cancelled

RECOMMENDATIONS
───────────────
  1. [High Priority] Investigate payment gateway failures (22% drop-off)
  2. [Medium Priority] Implement cart abandonment recovery
  3. [Low Priority] Remove or fix "customer is VIP" guard
  4. [Low Priority] Add test coverage for fraud_review path

───────────────────────────────────────────────────────────────
Run 'eventflow insights @order --verbose' for detailed analysis
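
The PATH INSIGHTS figures amount to counting how many instances followed each distinct sequence of states. A short Python sketch, assuming each instance's visited states are available as an ordered list (`path_shares` is a hypothetical helper):

```python
from collections import Counter


def path_shares(instance_paths):
    """Share of instances per distinct state path, most common first."""
    counts = Counter(tuple(path) for path in instance_paths)
    total = sum(counts.values())
    return [
        (" → ".join(path), round(100 * n / total, 1))
        for path, n in counts.most_common()
    ]


shares = path_shares(
    [["#cart", "#checkout", "#paid", "#fulfilled"]] * 3 + [["#cart"]]
)
```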

9.6 Performance Profiling

bash

# Full performance profile
eventflow perf @order

# Filter by component
eventflow perf @order --guards           # Guard evaluation times only
eventflow perf @order --actions          # Action execution times only
eventflow perf @order --api              # API event response times only
eventflow perf @order --transitions      # State transition overhead only

# Time range
eventflow perf @order --range 24h        # Last 24 hours (default)
eventflow perf @order --range 7d         # Last 7 days
eventflow perf @order --from 2024-12-01 --to 2024-12-08

# Filter slow components
eventflow perf @order --slow             # Only show items exceeding thresholds
eventflow perf @order --slow --threshold 100ms

# Export
eventflow perf @order --export json > perf.json
eventflow perf @order --export csv > perf.csv

Output:

$ eventflow perf @order

╔═══════════════════════════════════════════════════════════════════════════════╗
║                         Performance Profile: @order                           ║
╠═══════════════════════════════════════════════════════════════════════════════╣
║  Time Range: Last 24 hours │ Samples: 12,450                                  ║
╚═══════════════════════════════════════════════════════════════════════════════╝

┌─ API Event Response Times ────────────────────────────────────────────────────┐
│ Event              │ p50      │ p95      │ p99      │ Max      │ Calls      │
├────────────────────┼──────────┼──────────┼──────────┼──────────┼────────────┤
│ :checkout          │ 45ms     │ 120ms    │ 350ms    │ 1.2s     │ 8,234      │
│ :add_item          │ 12ms     │ 35ms     │ 80ms     │ 450ms    │ 24,567     │
│ :process_payment   │ 890ms    │ 2.1s     │ 4.5s     │ 12s      │ 6,890      │
└────────────────────┴──────────┴──────────┴──────────┴──────────┴────────────┘

┌─ Guard Evaluation Times ──────────────────────────────────────────────────────┐
│ Guard                          │ p50      │ p95      │ Evals     │ Status    │
├────────────────────────────────┼──────────┼──────────┼───────────┼───────────┤
│ "cart is valid"                │ 0.2ms    │ 0.8ms    │ 12,450    │ ✓ healthy │
│ "payment gateway available"    │ 45ms     │ 180ms    │ 6,890     │ ⚠ SLOW    │
│ "stock is available"           │ 2ms      │ 8ms      │ 8,234     │ ✓ healthy │
│ "fraud check passed"           │ 120ms    │ 450ms    │ 6,890     │ ⚠ SLOW    │
└────────────────────────────────┴──────────┴──────────┴───────────┴───────────┘

┌─ Action Execution Times ──────────────────────────────────────────────────────┐
│ Action                         │ p50      │ p95      │ Calls     │ Status    │
├────────────────────────────────┼──────────┼──────────┼───────────┼───────────┤
│ "validate cart"                │ 1ms      │ 3ms      │ 12,450    │ ✓ healthy │
│ "calculate totals"             │ 0.5ms    │ 1.2ms    │ 12,450    │ ✓ healthy │
│ "reserve stock"                │ 15ms     │ 45ms     │ 8,234     │ ✓ healthy │
│ "charge payment"               │ 780ms    │ 1.8s     │ 6,890     │ ⚠ SLOW    │
│ "send confirmation email"      │ 120ms    │ 350ms    │ 5,234     │ ✓ healthy │
└────────────────────────────────┴──────────┴──────────┴───────────┴───────────┘

┌─ State Transition Overhead ───────────────────────────────────────────────────┐
│ Transition                     │ p50      │ p95      │ Count     │
├────────────────────────────────┼──────────┼──────────┼───────────┤
│ #cart → #checkout              │ 0.1ms    │ 0.3ms    │ 8,234     │
│ #checkout → #awaiting_payment  │ 0.1ms    │ 0.2ms    │ 8,100     │
│ #awaiting_payment → #paid      │ 0.1ms    │ 0.3ms    │ 6,320     │
│ #paid → #fulfilled             │ 0.1ms    │ 0.2ms    │ 6,320     │
└────────────────────────────────┴──────────┴──────────┴───────────┘

┌─ Bottleneck Analysis ─────────────────────────────────────────────────────────┐
│ ⚠  Top 3 Performance Bottlenecks (by p95 impact):                             │
│                                                                                │
│ 1. Action "charge payment" - 1.8s p95                                         │
│    └─ Recommendation: Consider async processing or timeout optimization       │
│                                                                                │
│ 2. Guard "fraud check passed" - 450ms p95                                     │
│    └─ Recommendation: Cache results or use async pre-check                    │
│                                                                                │
│ 3. Guard "payment gateway available" - 180ms p95                              │
│    └─ Recommendation: Use circuit breaker pattern                             │
└───────────────────────────────────────────────────────────────────────────────┘

───────────────────────────────────────────────────────────────────────────────
Run 'eventflow perf @order --slow' to see only slow components
Run 'eventflow perf @order --guards' for detailed guard analysis

Slow Components View:

$ eventflow perf @order --slow

@order Slow Components (p95 > threshold)
═══════════════════════════════════════════════════════════════

⚠ GUARDS (threshold: 50ms)
  "fraud check passed"           p95: 450ms   (+800% over threshold)
  "payment gateway available"    p95: 180ms   (+260% over threshold)

⚠ ACTIONS (threshold: 200ms)
  "charge payment"               p95: 1.8s    (+800% over threshold)

⚠ API EVENTS (threshold: 500ms)
  :process_payment               p95: 2.1s    (+320% over threshold)

✓ TRANSITIONS
  All transitions within threshold (< 1ms)

═══════════════════════════════════════════════════════════════
Total slow components: 4
Recommendation: Focus on "charge payment" action for biggest impact
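
The "over threshold" percentages in the slow components view are the relative excess of p95 over the threshold. A minimal Python sketch of that formula (`over_threshold_pct` is a hypothetical name; values are in milliseconds):

```python
from typing import Optional


def over_threshold_pct(p95_ms: float, threshold_ms: float) -> Optional[int]:
    """Percent by which p95 exceeds the threshold, or None if within it."""
    if p95_ms <= threshold_ms:
        return None
    return round(100 * (p95_ms - threshold_ms) / threshold_ms)


slow = {
    "fraud check passed": over_threshold_pct(450, 50),
    "payment gateway available": over_threshold_pct(180, 50),
    "charge payment": over_threshold_pct(1800, 200),
    ":process_payment": over_threshold_pct(2100, 500),
}
```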

10. Visualization Integration

Analytics integrate with EventFlow's diagram generation to produce annotated visualizations.

10.1 Annotated State Diagrams

bash

eventflow diagram @order --type=state --analytics

Produces a state diagram with:

State nodes annotated with duration (p50/p95)
Transition edges annotated with conversion rates
Color coding: green (healthy), yellow (warning), red (critical)
Dead paths shown as dashed/gray lines

10.2 Funnel Diagrams

bash

eventflow diagram @order --type=funnel

Produces a funnel visualization:

Horizontal bars for each state
Bar width proportional to volume
Drop-off percentages between stages
Color intensity by conversion rate
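
As an illustration of "bar width proportional to volume", the following Python sketch renders text bars scaled to the largest stage. The 40-character maximum width and the `funnel_bars` name are assumptions; the actual renderer is not specified in this proposal.

```python
def funnel_bars(stages, max_width=40):
    """One text line per (state, count); expects a non-empty list of stages."""
    top = max(count for _, count in stages)
    entry = stages[0][1]
    label_width = max(len(name) for name, _ in stages)
    lines = []
    for name, count in stages:
        width = round(max_width * count / top) if top else 0
        share = round(100 * count / entry) if entry else 0
        lines.append(f"{name:<{label_width}}  {'█' * width} {count:,} ({share}%)")
    return lines


bars = funnel_bars([
    ("#cart", 12_450),
    ("#checkout", 11_356),
    ("#awaiting_payment", 11_186),
    ("#paid", 8_725),
    ("#fulfilled", 8_725),
])
```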

10.3 Heat Maps

bash

eventflow diagram @order --type=heatmap --metric=duration
eventflow diagram @order --type=heatmap --metric=volume
eventflow diagram @order --type=heatmap --metric=errors

Produces a state diagram colored by metric intensity.

11. A/B Testing & Experimentation

Note: A/B testing is covered in a separate proposal. See A/B Testing Proposal.

For basic cohort comparison, use segment by with any context variable:

flow

analytics:
  track :checkout
    segment by $ab_variant

  funnel "Purchase Flow"
    segment by $ab_variant

bash

$ eventflow funnel @order --segment-by='$ab_variant'

This provides basic variant comparison via CLI. For advanced experimentation features (statistical significance, experiment lifecycle), see the dedicated proposal.
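
Conceptually, segmenting a funnel means grouping instances by a context variable and computing conversion per group. A Python sketch of that idea, assuming each instance exposes its context and final state as plain dictionary fields (the record shape and `conversion_by_segment` are illustrative):

```python
from collections import defaultdict


def conversion_by_segment(instances, segment_key, success_states):
    """Conversion rate (percent) per segment value, e.g. per $ab_variant."""
    entered = defaultdict(int)
    converted = defaultdict(int)
    for inst in instances:
        segment = inst["context"].get(segment_key, "(unset)")
        entered[segment] += 1
        if inst["final_state"] in success_states:
            converted[segment] += 1
    return {seg: round(100 * converted[seg] / entered[seg], 1) for seg in entered}


instances = [
    {"context": {"$ab_variant": "A"}, "final_state": "#fulfilled"},
    {"context": {"$ab_variant": "A"}, "final_state": "#fulfilled"},
    {"context": {"$ab_variant": "A"}, "final_state": "#fulfilled"},
    {"context": {"$ab_variant": "A"}, "final_state": "#payment_failed"},
    {"context": {"$ab_variant": "B"}, "final_state": "#fulfilled"},
    {"context": {"$ab_variant": "B"}, "final_state": "#cancelled"},
]
rates = conversion_by_segment(instances, "$ab_variant", {"#fulfilled"})
```

This yields raw per-variant rates only; it says nothing about whether a difference is statistically significant.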

12. Configuration Location

Analytics are declared inline within the machine file, in an analytics: block at the top level:

flow

machine: @order

analytics:
  collection: queued

  track :checkout as "Checkout Started"
    count
    rate over 1 minute

  measure #awaiting_payment
    duration: histogram

  funnel "Purchase Flow"
    success: #fulfilled
    failure: #cancelled, #payment_failed

scenario: order lifecycle
  on :checkout from @customer (api)
    ? cart is valid
      order moves to #awaiting_payment

Why Inline?

Single source of truth - Analytics and behavior in one place
Easy to maintain - Changes are localized
Best tooling support - IDE navigation, syntax highlighting
Natural for EventFlow - "Documentation is code" philosophy

Future: Web-Based Analytics Builder

A future web interface could enable non-developers to:

Visually define funnels by selecting states
Configure alerts with form-based UI
Preview metrics before committing to flow files

This would generate valid EventFlow syntax that developers can review and merge.

This is a future consideration, not part of the initial implementation.

13. Complete Example

13.1 E-Commerce Order with Analytics

flow

machine: @order

analytics:
  // Zero-overhead collection configuration
  collection: queued                    // async via queue (default)
  queue:
    priority: bulk                      // lowest priority
    batch_size: 100                     // metrics per DB write
    flush_interval: 1 second            // flush buffer every second
    buffer_size: 10000                  // in-memory ring buffer

  // Event tracking
  track :checkout as "Checkout Started"
    count
    rate over 1 minute
    latency: histogram

  track :add_item as "Items Added"
    count

  track :payment_success as "Payments Succeeded"
    count

  track :payment_failed as "Payments Failed"
    count
    alert when rate > 10% of :checkout over 1 hour

  // State measurement
  measure #awaiting_payment
    duration: histogram
    active_count
    alert when duration p95 > 30 seconds

  measure #paid
    duration: histogram
    entry_count as "Paid Orders"

  measure #fulfilled
    entry_count as "Fulfilled Orders"

  // Guard tracking
  measure guard "cart is valid"
    true_rate
    false_rate

  measure guard "fraud detected"
    true_rate
    alert when true_rate > 5%

  // Context distribution
  measure $total
    distribution: histogram
    buckets: [0, 100, 500, 1000, 5000]

  measure $payment_method
    distribution: labels

  // Alerts
  alert "High Payment Failure Rate"
    when :payment_failed rate > 5% of :checkout over 1 hour
    severity: warning
    notify: payments-team

  alert "Stuck in Payment"
    when #awaiting_payment duration > 5 minutes for any instance
    severity: critical
    notify: on-call

  alert "High Fraud Rate"
    when guard "fraud detected" true_rate > 5%
    severity: critical
    notify: security-team

  // Funnel hints
  funnel "Purchase Flow"
    success: #fulfilled, #shipped
    failure: #cancelled, #payment_failed
    label: "E-Commerce Checkout"

scenario: order lifecycle

  given:
    @customer is logged in
    cart has items

  on :add_item from @customer (api) track
    $items adds $product
    $total increases by $product.price

  on :checkout from @customer (api) track
    ? cart is valid
      emit :payment_request to @payment
      order moves to #awaiting_payment measure

    ? cart is empty
      emit :error to @customer

  on :payment_success from @payment track
    ? fraud detected
      order moves to #fraud_review
      emit :fraud_alert to @security

    ?
      order moves to #paid measure

  on :payment_failed from @payment track
    order moves to #payment_failed
    emit :payment_error to @customer

  on :stock_reserved from @inventory
    order moves to #fulfilled measure as "Order Complete"
    emit :order_confirmed to @customer
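
To make the `buckets: [0, 100, 500, 1000, 5000]` setting concrete, here is a Python sketch of how values could be assigned to buckets. It assumes each boundary is an inclusive lower edge and that the last bucket is open-ended; the proposal does not define these edge semantics, so treat them as one possible reading.

```python
from bisect import bisect_right
from collections import Counter

BOUNDARIES = [0, 100, 500, 1000, 5000]


def bucket_label(value, boundaries=BOUNDARIES):
    """Label of the bucket containing value (lower edge inclusive)."""
    i = bisect_right(boundaries, value) - 1
    if i < 0:
        return f"< {boundaries[0]}"
    if i == len(boundaries) - 1:
        return f">= {boundaries[-1]}"
    return f"[{boundaries[i]}, {boundaries[i + 1]})"


def histogram(values, boundaries=BOUNDARIES):
    """Count of values per bucket label."""
    return Counter(bucket_label(v, boundaries) for v in values)


totals = histogram([25, 99.99, 100, 750, 5000, 12000])
```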

13.2 Job Application with Analytics

flow

machine: @application

analytics:
  // Stage tracking
  track :submit as "Applications Submitted"
    count
    rate over 1 day

  track :screen as "Screenings Performed"
    count

  track :schedule_interview as "Interviews Scheduled"
    count

  // Stage duration
  measure #pending
    duration: histogram
    alert when duration p95 > 7 days

  measure #interview_scheduled
    duration: histogram

  measure #offer_extended
    duration: histogram
    alert when duration > 7 days for any instance

  // Outcome tracking
  measure #hired
    entry_count as "Candidates Hired"

  measure #rejected
    entry_count as "Candidates Rejected"

  // Guard effectiveness
  measure guard "qualifications match"
    true_rate as "Screening Pass Rate"
    false_rate

  // Funnel
  funnel "Hiring Pipeline"
    success: #hired
    failure: #rejected, #offer_declined, #expired
    label: "Recruitment Funnel"

scenario: hiring process
  // ... event handlers ...

14. Sample Dashboard

┌─────────────────────────────────────────────────────────────────────────────┐
│                        @order Analytics Dashboard                            │
│                         Period: 2024-12-01 to 2024-12-08                    │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  OPERATIONAL                                                                │
│  ───────────                                                                │
│                                                                             │
│  Throughput:     1,234 orders/hour     ▲ 12% vs last week                  │
│  Latency p50:    890ms                 ▼ 5% improvement                    │
│  Latency p95:    2.1s                  ─ stable                            │
│  Error Rate:     2.3%                  ▼ 0.8% improvement                  │
│                                                                             │
│  Active by State:                                                           │
│    #awaiting_payment ████████████████ 234                                  │
│    #paid             ██████████████████████████ 456                        │
│    #shipping         ████████ 123                                          │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  FUNNEL: Purchase Flow                                                      │
│  ─────────────────────                                                      │
│                                                                             │
│  #cart              ██████████████████████████████████████ 12,450 (100%)   │
│       │ 91.2%                                                               │
│  #checkout          █████████████████████████████████████  11,356 (91%)    │
│       │ 98.5%                                                               │
│  #awaiting_payment  ████████████████████████████████████   11,186 (90%)    │
│       │ 78.0%                                                               │
│  #paid              ████████████████████████████           8,725 (70%)     │
│       │ 100%                                                                │
│  #fulfilled         ████████████████████████████           8,725 (70%)     │
│                                                                             │
│  Conversion: 70.1%    Drop-off: Payment 22%                                │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  TESTING & QUALITY                                                          │
│  ─────────────────                                                          │
│                                                                             │
│  Coverage: 80%                                                              │
│                                                                             │
│  Guard Effectiveness:                                                       │
│    "cart is valid"        true 92% │ false 8%   ✓                          │
│    "fraud detected"       true 2%  │ false 98%  ✓                          │
│    "gateway available"    true 100%│ false 0%   ⚠ always true              │
│                                                                             │
│  Dead Code: 2 candidates                                                    │
│    - Guard "customer is VIP" (always false)                                │
│    - Transition #paid → #fraud_review (0 occurrences)                      │
│                                                                             │
├─────────────────────────────────────────────────────────────────────────────┤
│                                                                             │
│  ALERTS                                                                     │
│  ──────                                                                     │
│                                                                             │
│  ⚠ [warning] High Payment Failure Rate (6.2% > 5%)                         │
│              14 minutes ago                                                 │
│                                                                             │
└─────────────────────────────────────────────────────────────────────────────┘

15. Free vs Pro Features

EventFlow Analytics follows a freemium model: CLI tools are free, while advanced web features require Pro.

15.1 Feature Matrix

Feature	Free (CLI)	Pro (Web Dashboard)
Analytics Dashboard	Text-based output	Rich interactive charts
Funnel Analysis	ASCII funnel diagram	Visual funnel with drag-drop
Coverage Analysis	Text coverage report	Interactive treemap
Alerts	CLI notifications	Web UI + Slack/Teams/PagerDuty
Insights	Text recommendations	AI-powered suggestions
Historical Data	7 days	Unlimited retention
Export	JSON, CSV	PDF reports, API access
Real-time Updates	Manual refresh	Live streaming
Team Sharing	File-based sharing	Collaborative dashboards

15.2 Visualization Tiers

Diagram Type	Free	Pro
`--type=state`	Basic state diagram	Basic state diagram
`--type=state --analytics`	-	Annotated with metrics
`--type=funnel`	-	Visual funnel diagram
`--type=heatmap`	-	State heatmap by metric
`--type=flow`	-	Animated event flow

15.3 CLI Examples (Free)

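All core analytics functionality is available via the free CLI as text output; only the annotated, funnel, and heatmap diagrams (15.2) and the web dashboard (15.4) require Pro: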

bash

# Free: Text dashboard
$ eventflow analytics @order --period=7d

# Free: ASCII funnel
$ eventflow funnel @order

# Free: Text coverage report
$ eventflow coverage @order

# Free: Text insights
$ eventflow insights @order

# Free: JSON export
$ eventflow analytics @order --format=json > analytics.json

15.4 Pro Web Dashboard

The Pro tier provides a web-based dashboard with:

Real-time streaming - Metrics update live without refresh
Interactive charts - Zoom, pan, drill-down into data
Collaborative features - Share dashboards, annotations, comments
Advanced integrations - Slack alerts, PagerDuty, custom webhooks
AI-powered insights - Pattern detection, anomaly prediction
Unlimited history - No 7-day retention limit
PDF reports - Scheduled and on-demand reporting
API access - Programmatic access to all analytics data

15.5 Recommendation

Start with Free CLI - It provides full functionality for most teams:

All analytics data within the 7-day retention window is accessible
Automation-friendly (JSON/CSV export)
No vendor lock-in

Upgrade to Pro when you need:

Non-technical stakeholder access (visual dashboards)
Real-time monitoring screens
Third-party integrations (Slack, PagerDuty)
Compliance requirements (unlimited history, audit logs)

16. Keywords Reference

Keyword	Context	Description
`analytics:`	machine	Analytics configuration block
`track`	event handler	Enable default event metrics inline
`track :event`	analytics	Track specific event
`as "label"`	track/measure	Human-readable label
`count`	track	Counter metric
`rate over <duration>`	track	Events per time window
`latency: histogram`	track	Processing latency histogram
`measure`	state/transition	Enable metrics inline
`measure #state`	analytics	Measure specific state
`measure guard "..."`	analytics	Measure guard effectiveness
`measure transition`	analytics	Measure state transition
`measure $var`	analytics	Measure context variable
`duration: histogram`	state	Time in state histogram
`entry_count`	state	State entry counter
`exit_count`	state	State exit counter
`active_count`	state	Current instances gauge
`true_rate`	guard	Percentage returning true
`false_rate`	guard	Percentage returning false
`evaluation_count`	guard	Total evaluations
`evaluation_time: histogram`	guard	Guard evaluation duration
`execution_time: histogram`	action	Action execution duration
`transition_time: histogram`	transition	State transition overhead
`response_time: histogram`	API event	End-to-end HTTP response time
`processing_time: histogram`	API event	Business logic execution time
`performance_budget:`	analytics	Performance threshold definitions
`measure action "..."`	analytics	Measure action execution time
`conversion_rate from`	transition	Conversion percentage
`distribution: histogram`	context	Numeric distribution
`distribution: labels`	context	Categorical distribution
`buckets: [...]`	histogram	Custom bucket boundaries
`cardinality`	context	Unique value count
`alert`	analytics	Alert definition
`when`	alert	Alert condition
`severity:`	alert	info/warning/critical
`notify:`	alert	Notification channel
`funnel`	analytics	Funnel configuration
`success:`	funnel	Success terminal states
`failure:`	funnel	Failure terminal states
`entry:`	funnel	Override entry point
`label:`	funnel	Human-readable name
`timing:`	analytics	Timing configuration
`collection:`	analytics	Collection mode (queued/sampled/disabled/sync)
`queue:`	analytics	Analytics queue configuration
`priority:`	queue	Queue priority (bulk = lowest)
`batch_size:`	queue	Metrics per storage write
`flush_interval:`	queue	Time between buffer flushes
`buffer_size:`	queue	In-memory buffer capacity
`overflow:`	queue	Behavior when buffer full (drop_oldest)

17. Implementation Notes

17.1 Metric Event Emission (Zero Overhead)

The core implementation principle: metric observation emits a fire-and-forget event.

php

class MetricEmitter
{
    private RingBuffer $buffer;

    public function emit(MetricEvent $event): void
    {
        // Fire-and-forget: ~0.01ms, in-memory only, never blocks.
        // push() returns false when the buffer is full; the metric is then
        // silently dropped so business logic is never delayed.
        $this->buffer->push($event);
    }
}

Key guarantees:

emit() completes in < 0.1ms
Never waits for I/O
Never throws exceptions
Drops the metric on buffer overflow instead of blocking

17.2 Ring Buffer

In-memory circular buffer for metric events:

php

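class RingBuffer
{
    private array $buffer;
    private int $head = 0;   // next slot to read
    private int $tail = 0;   // next slot to write
    private int $count = 0;  // events currently stored
    private int $size;

    public function __construct(int $size = 10000)
    {
        $this->size = $size;
        $this->buffer = array_fill(0, $size, null);
    }

    public function isFull(): bool
    {
        // head === tail is ambiguous (empty or full), so track the count.
        return $this->count === $this->size;
    }

    public function push(MetricEvent $event): bool
    {
        if ($this->isFull()) {
            // Reject the incoming event, never block. A drop_oldest policy
            // would instead advance $head and overwrite the oldest slot.
            return false;
        }
        $this->buffer[$this->tail] = $event;
        $this->tail = ($this->tail + 1) % $this->size;
        $this->count++;
        return true;
    }

    public function flush(): array
    {
        $events = [];
        while ($this->count > 0) {
            $events[] = $this->buffer[$this->head];
            $this->buffer[$this->head] = null; // release the reference
            $this->head = ($this->head + 1) % $this->size;
            $this->count--;
        }
        return $events;
    }
}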
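
The keywords reference lists `drop_oldest` as the `overflow:` behavior when the buffer is full. For comparison, a minimal Python sketch of that policy using a bounded deque, which evicts the oldest entry on append once it reaches `maxlen`. `DropOldestBuffer` and its `dropped` counter are illustrative names, and thread safety is deliberately out of scope here.

```python
from collections import deque


class DropOldestBuffer:
    """Bounded buffer that never blocks: when full, the oldest event is evicted."""

    def __init__(self, size: int = 10_000):
        self._events = deque(maxlen=size)
        self.dropped = 0  # number of evicted events, useful as a loss metric

    def push(self, event) -> None:
        if len(self._events) == self._events.maxlen:
            self.dropped += 1  # the append below evicts the oldest entry
        self._events.append(event)

    def flush(self) -> list:
        """Return all buffered events in arrival order and empty the buffer."""
        events = list(self._events)
        self._events.clear()
        return events


buf = DropOldestBuffer(size=3)
for n in range(1, 6):
    buf.push(n)
```

Dropping the oldest events favors fresh data for near real-time alerts, while rejecting new events preserves the earliest part of a burst; which trade-off is preferable depends on how the metrics are consumed.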

17.3 Analytics Queue Integration

Metric events flow through a dedicated analytics queue:

php

class AnalyticsQueueFlusher
{
    private RingBuffer $buffer;
    private Queue $analyticsQueue;
    private int $flushInterval = 1000; // milliseconds (1 second)
    private int $batchSize = 100;

    public function flush(): void
    {
        $events = $this->buffer->flush();

        // Batch events for efficient queue operations
        foreach (array_chunk($events, $this->batchSize) as $batch) {
            $this->analyticsQueue->push(new MetricBatch($batch), priority: 'bulk');
        }
    }
}

The analytics queue uses:

Priority: bulk (lowest, never starves business events)
High concurrency (metrics are independent)
Batch processing (up to 100 metrics per job, matching batch_size)
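
The batching step splits whatever was flushed into fixed-size chunks, which is what PHP's `array_chunk` does in the flusher above. The same idea as a small Python sketch (`chunk` is a hypothetical helper):

```python
def chunk(events: list, batch_size: int = 100) -> list:
    """Split events into consecutive batches of at most batch_size items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    return [events[i:i + batch_size] for i in range(0, len(events), batch_size)]


batches = chunk(list(range(250)), batch_size=100)
sizes = [len(b) for b in batches]
```

The final batch may be smaller than `batch_size`, so workers should not assume a fixed batch length.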

17.4 Analytics Worker

Processes metric batches and writes to storage:

php

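class AnalyticsWorker
{
    // Collaborator type names are illustrative; the proposal does not fix them.
    public function __construct(
        private MetricStorage $storage,
        private MetricAggregator $aggregator,
        private AlertEvaluator $alertEvaluator,
    ) {}

    public function process(MetricBatch $batch): void
    {
        // 1. Bulk insert to storage (efficient)
        $this->storage->bulkInsert($batch->events);

        // 2. Update in-memory aggregates for alerts
        foreach ($batch->events as $event) {
            $this->aggregator->update($event);
        }

        // 3. Evaluate alert rules
        $this->alertEvaluator->checkThresholds($this->aggregator);
    }
}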

17.5 Metric Event Schemas

Each metric event type has a defined schema:

php

// Event tracking
class EventReceivedMetric {
    public string $event;           // :checkout
    public string $machine;         // @order
    public string $instance_id;     // order-abc123
    public float $timestamp;        // 1702000000.123
    public ?string $correlation_id; // for tracing
}

class EventCompletedMetric {
    public string $event;
    public float $duration_ms;      // 45.2
    public bool $success;
    public ?string $error;
}

// State tracking
class StateEnteredMetric {
    public string $state;           // #awaiting_payment
    public string $instance_id;
    public float $timestamp;
    public ?string $previous_state;
}

class StateExitedMetric {
    public string $state;
    public string $instance_id;
    public float $duration_ms;
    public string $next_state;
}

// Guard tracking (with timing)
class GuardEvaluatedMetric {
    public string $guard;           // "cart is valid"
    public bool $result;            // true
    public float $duration_ms;      // 0.8 (evaluation time)
    public string $instance_id;
    public float $timestamp;
}

// Action tracking (timing)
class ActionExecutedMetric {
    public string $action;          // "validate cart"
    public float $duration_ms;      // 3.2 (execution time)
    public string $machine;         // @order
    public string $instance_id;
    public float $timestamp;
}

// Transition tracking (with timing)
class TransitionMetric {
    public string $from_state;      // #cart
    public string $to_state;        // #checkout
    public float $duration_ms;      // 0.1 (transition overhead)
    public string $event;           // :checkout
    public string $instance_id;
}

// API Event tracking (end-to-end timing)
class ApiEventHandledMetric {
    public string $event;           // :checkout
    public float $response_time_ms; // 120.5 (total HTTP response time)
    public float $processing_time_ms; // 45.2 (business logic only)
    public string $machine;         // @order
    public string $instance_id;
    public float $timestamp;
}

17.6 Storage Backend

Environment	Backend	Notes
Development	SQLite / In-memory	Simple, no setup
Production	TimescaleDB	Time-series optimized PostgreSQL
Production	InfluxDB	Purpose-built time-series DB
Production	Prometheus	Pull-based, great for dashboards

Bulk writes are critical for performance:

sql

-- Instead of 100 individual INSERTs:
INSERT INTO metrics (event, machine, instance_id, timestamp, value)
VALUES
  (':checkout', '@order', 'abc', 1702000000.1, 1),
  (':checkout', '@order', 'def', 1702000000.2, 1),
  -- ... 98 more rows
;
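
As a rough sketch of how a worker might build that multi-row INSERT from a batch, the example below uses PDO with positional placeholders against an in-memory SQLite database (the development backend listed above). The helper name bulkInsertMetrics and the row layout are assumptions for illustration, not part of the proposal.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: writes all rows in a single multi-row INSERT.
 * Each row is [event, machine, instance_id, timestamp, value].
 */
function bulkInsertMetrics(PDO $pdo, array $rows): int
{
    if ($rows === []) {
        return 0; // nothing to write
    }

    // One "(?, ?, ?, ?, ?)" group per row, joined into one statement.
    $placeholders = implode(', ', array_fill(0, count($rows), '(?, ?, ?, ?, ?)'));

    $stmt = $pdo->prepare(
        'INSERT INTO metrics (event, machine, instance_id, timestamp, value) VALUES ' . $placeholders
    );

    // Flatten the rows into one positional parameter list.
    $stmt->execute(array_merge(...array_map('array_values', $rows)));

    return $stmt->rowCount();
}

$pdo = new PDO('sqlite::memory:');
$pdo->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$pdo->exec('CREATE TABLE metrics (event TEXT, machine TEXT, instance_id TEXT, timestamp REAL, value INTEGER)');

$inserted = bulkInsertMetrics($pdo, [
    [':checkout', '@order', 'abc', 1702000000.1, 1],
    [':checkout', '@order', 'def', 1702000000.2, 1],
]);
```

Databases cap the number of bound parameters per statement, so very large batches would likely need to be split into chunks before calling a helper like this.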

17.7 Funnel Computation

Auto-funnel detection runs at startup and on definition changes:

php

class FunnelDetector
{
    public function detect(Machine $machine): array
    {
        $graph = $this->buildStateGraph($machine);
        $terminals = $this->findTerminals($graph);
        $classified = $this->classifyTerminals($terminals);

        $funnels = [];
        foreach ($classified as $terminal => $type) {
            $paths = $this->discoverPaths($graph, $terminal);
            $funnels[] = new Funnel($terminal, $type, $paths);
        }

        return $funnels;
    }
}
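
The helper methods above are not spelled out, so here is one possible reading, sketched on a plain adjacency list: terminals are states with no outgoing transitions, and paths are enumerated depth-first without revisiting a state. The extra $start parameter and the states #abandoned and #payment_failed are assumptions added for the example.

```php
<?php
declare(strict_types=1);

/**
 * Terminal states are states with no outgoing transitions.
 *
 * @param array<string, list<string>> $graph state => next states
 * @return list<string>
 */
function findTerminals(array $graph): array
{
    $terminals = [];
    foreach ($graph as $state => $next) {
        if ($next === []) {
            $terminals[] = $state;
        }
    }
    return $terminals;
}

/**
 * Enumerates every cycle-free path from $start to $terminal (depth-first).
 *
 * @return list<list<string>>
 */
function discoverPaths(array $graph, string $start, string $terminal, array $path = []): array
{
    $path[] = $start;
    if ($start === $terminal) {
        return [$path];
    }

    $paths = [];
    foreach ($graph[$start] ?? [] as $next) {
        if (in_array($next, $path, true)) {
            continue; // skip states already on this path to avoid looping
        }
        foreach (discoverPaths($graph, $next, $terminal, $path) as $found) {
            $paths[] = $found;
        }
    }
    return $paths;
}

$graph = [
    '#cart'             => ['#checkout', '#abandoned'],
    '#checkout'         => ['#awaiting_payment', '#cart'],
    '#awaiting_payment' => ['#paid', '#payment_failed'],
    '#paid'             => ['#fulfilled'],
    '#fulfilled'        => [],
    '#abandoned'        => [],
    '#payment_failed'   => [],
];

$terminals = findTerminals($graph);
$paths = discoverPaths($graph, '#cart', '#fulfilled');
```

Path enumeration can grow quickly on densely connected machines, so a real detector would probably cap path length or count.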

17.8 PHP Binding

php

#[Machine('@order')]
#[Analytics(collection: 'queued', bufferSize: 10000)]
class OrderMachine
{
    #[Track(':checkout', label: 'Checkout Started')]
    public function onCheckout(Event $event): void
    {
        // MetricEmitter::emit() called automatically (fire-and-forget)
        // Business logic continues immediately
    }

    #[Measure('#awaiting_payment', duration: true)]
    public function enterAwaitingPayment(State $state): void
    {
        // StateEnteredMetric emitted automatically
    }
}
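
How the runtime discovers these attributes is not specified here. One plausible approach uses PHP's reflection API, sketched below with a simplified, hypothetical Track attribute and a demo class; the real attribute classes may carry more options.

```php
<?php
declare(strict_types=1);

// Hypothetical, simplified attribute definition for illustration only.
#[Attribute(Attribute::TARGET_METHOD)]
final class Track
{
    public function __construct(
        public readonly string $event,
        public readonly ?string $label = null,
    ) {}
}

final class DemoOrderMachine
{
    #[Track(':checkout', label: 'Checkout Started')]
    public function onCheckout(): void {}

    public function untracked(): void {}
}

/**
 * Maps each tracked event name to the handler method that declares it.
 *
 * @return array<string, string> event => method name
 */
function discoverTrackedEvents(string $class): array
{
    $tracked = [];
    foreach ((new ReflectionClass($class))->getMethods() as $method) {
        foreach ($method->getAttributes(Track::class) as $attribute) {
            $track = $attribute->newInstance(); // builds Track from the declared arguments
            $tracked[$track->event] = $method->getName();
        }
    }
    return $tracked;
}

$tracked = discoverTrackedEvents(DemoOrderMachine::class);
```

Scanning once at startup and caching the resulting map would keep reflection off the event-handling hot path.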

17.9 Graceful Degradation Strategies

php

class AdaptiveCollector
{
    private float $lastFlushTime;
    private int $droppedCount = 0;

    public function emit(MetricEvent $event): void
    {
        // Strategy 1: Drop if buffer full
        if ($this->buffer->isFull()) {
            $this->droppedCount++;
            return; // Silently drop, never block
        }

        // Strategy 2: Sample under pressure
        if ($this->isUnderPressure() && !$this->shouldSample($event)) {
            return; // Skip this metric (sampled)
        }

        $this->buffer->push($event);
    }

    private function isUnderPressure(): bool
    {
        return $this->buffer->fillRatio() > 0.8; // 80% full
    }

    private function shouldSample(MetricEvent $event): bool
    {
        // Keep metrics for ~10% of instances under pressure; hashing instance_id
        // means a sampled instance keeps all of its metrics
        return crc32($event->instance_id) % 10 === 0;
    }
}
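
The sampling rule hashes the instance ID rather than drawing a random number, so the decision is stable per instance. A small standalone sketch of that property follows; the free function and the order-N IDs are illustrative assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Deterministic per-instance sampling: an instance is either always kept or
 * always skipped, so sampled instances retain a complete metric trail.
 */
function shouldSample(string $instanceId, int $keepOneIn = 10): bool
{
    return crc32($instanceId) % $keepOneIn === 0;
}

// Count how many of 1000 hypothetical instance IDs would be kept.
$kept = 0;
for ($i = 0; $i < 1000; $i++) {
    if (shouldSample("order-{$i}")) {
        $kept++;
    }
}
```

For well-distributed IDs the kept share should land near one in ten, but funnel counts derived while sampling is active would need to be scaled or flagged accordingly.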

17.10 Timing Metrics Implementation

Timing metrics wrap each measurable component with a stopwatch:

php

class TimingMetricCollector
{
    private MetricEmitter $emitter;
    private Context $context; // supplies instanceId() and machineName() for the current instance

    /**
     * Measure guard evaluation time
     */
    public function measureGuard(string $guard, callable $evaluator): bool
    {
        $stopwatch = Stopwatch::start();
        $result = $evaluator();
        $elapsed = $stopwatch->elapsedMs();

        $this->emitter->emit(new GuardEvaluatedMetric(
            guard: $guard,
            result: $result,
            duration_ms: $elapsed,
            instance_id: $this->context->instanceId(),
            timestamp: microtime(true),
        ));

        return $result;
    }

    /**
     * Measure action execution time
     */
    public function measureAction(string $action, callable $executor): void
    {
        $stopwatch = Stopwatch::start();
        $executor();
        $elapsed = $stopwatch->elapsedMs();

        $this->emitter->emit(new ActionExecutedMetric(
            action: $action,
            duration_ms: $elapsed,
            machine: $this->context->machineName(),
            instance_id: $this->context->instanceId(),
            timestamp: microtime(true),
        ));
    }

    /**
     * Measure state transition overhead
     */
    public function measureTransition(
        string $fromState,
        string $toState,
        string $event,
        callable $transitioner
    ): void {
        $stopwatch = Stopwatch::start();
        $transitioner();
        $elapsed = $stopwatch->elapsedMs();

        $this->emitter->emit(new TransitionMetric(
            from_state: $fromState,
            to_state: $toState,
            duration_ms: $elapsed,
            event: $event,
            instance_id: $this->context->instanceId(),
        ));
    }

    /**
     * Measure API event end-to-end timing
     */
    public function measureApiEvent(
        string $event,
        float $httpStartTime,
        callable $handler
    ): mixed {
        $processingStart = microtime(true);
        $result = $handler();
        $processingEnd = microtime(true);

        $this->emitter->emit(new ApiEventHandledMetric(
            event: $event,
            response_time_ms: ($processingEnd - $httpStartTime) * 1000,
            processing_time_ms: ($processingEnd - $processingStart) * 1000,
            machine: $this->context->machineName(),
            instance_id: $this->context->instanceId(),
            timestamp: $processingEnd,
        ));

        return $result;
    }
}
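
The Stopwatch helper used above is not defined in this section. A minimal sketch, assuming it only needs start() and elapsedMs(), could be built on PHP's monotonic hrtime() clock so that wall-clock adjustments cannot skew a measurement:

```php
<?php
declare(strict_types=1);

/**
 * Assumed shape of the Stopwatch helper: captures a monotonic start time
 * and reports elapsed milliseconds on demand.
 */
final class Stopwatch
{
    private function __construct(private readonly int $startedAtNs) {}

    public static function start(): self
    {
        return new self(hrtime(true)); // nanoseconds as an integer
    }

    public function elapsedMs(): float
    {
        return (hrtime(true) - $this->startedAtNs) / 1_000_000;
    }
}

$stopwatch = Stopwatch::start();
usleep(2_000); // simulate roughly 2 ms of work
$elapsedMs = $stopwatch->elapsedMs();
```

Note that measureApiEvent uses microtime(true) instead, presumably because it has to compare against an HTTP start time captured as a wall-clock timestamp.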

Integration with Event Handler:

php

class EventHandler
{
    public function handle(Event $event): void
    {
        // Measure each guard
        foreach ($this->guards as $guard) {
            $passed = $this->timing->measureGuard(
                $guard->name(),
                fn() => $guard->evaluate($this->context)
            );
            if (!$passed) return;
        }

        // Measure each action
        foreach ($this->actions as $action) {
            $this->timing->measureAction(
                $action->name(),
                fn() => $action->execute($this->context)
            );
        }

        // Measure state transition
        if ($this->transition) {
            $this->timing->measureTransition(
                $this->currentState,
                $this->transition->targetState(),
                $event->name(),
                fn() => $this->state->transitionTo($this->transition->targetState())
            );
        }
    }
}

PHP Binding with Timing Attributes:

php

#[Machine('@order')]
class OrderMachine
{
    #[MeasureGuard(name: 'cart is valid')]
    public function guardCartIsValid(Context $ctx): bool
    {
        // Guard logic - timing measured automatically
        return $ctx->cart->isValid();
    }

    #[MeasureAction(name: 'charge payment')]
    public function actionChargePayment(Context $ctx): void
    {
        // Action logic - timing measured automatically
        $this->paymentGateway->charge($ctx->amount);
    }

    #[Track(':checkout', timing: true)]
    public function onCheckout(Event $event): void
    {
        // API event timing measured automatically
        // response_time includes HTTP overhead
        // processing_time is business logic only
    }
}

Performance Budget Enforcement:

php

class PerformanceBudgetChecker
{
    private array $budgets = [
        'api_response_time' => ['p95' => 500],    // 500ms
        'guard_evaluation' => ['p95' => 50],       // 50ms
        'action_execution' => ['p95' => 200],      // 200ms
        'transition_time' => ['p95' => 1],         // 1ms
    ];

    public function check(PercentileAggregator $aggregator): array
    {
        $violations = [];

        foreach ($this->budgets as $metric => $thresholds) {
            foreach ($thresholds as $percentile => $limit) {
                $value = $aggregator->getPercentile($metric, $percentile);
                if ($value > $limit) {
                    $violations[] = new BudgetViolation(
                        metric: $metric,
                        percentile: $percentile,
                        limit: $limit,
                        actual: $value,
                    );
                }
            }
        }

        return $violations;
    }
}
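
The checker depends on a PercentileAggregator whose internals are not shown. As an illustration only, the sketch below keeps raw samples in memory and uses the nearest-rank method; the class name and record() method are assumptions, and a production aggregator would more likely use a bounded histogram or sketch structure.

```php
<?php
declare(strict_types=1);

final class InMemoryPercentileAggregator
{
    /** @var array<string, list<float>> samples per metric name */
    private array $samples = [];

    public function record(string $metric, float $valueMs): void
    {
        $this->samples[$metric][] = $valueMs;
    }

    /** @param string $percentile label such as "p95" */
    public function getPercentile(string $metric, string $percentile): float
    {
        $values = $this->samples[$metric] ?? [];
        if ($values === []) {
            return 0.0; // no data yet, so no budget can be exceeded
        }
        sort($values);

        $p = (float) ltrim($percentile, 'p');                 // "p95" -> 95.0
        $rank = (int) ceil($p * count($values) / 100);        // nearest-rank, 1-based
        $rank = min(count($values), max(1, $rank));           // clamp to a valid index

        return $values[$rank - 1];
    }
}

$aggregator = new InMemoryPercentileAggregator();
foreach (range(1, 100) as $ms) {
    $aggregator->record('guard_evaluation', (float) $ms);
}
$p95 = $aggregator->getPercentile('guard_evaluation', 'p95');
```

With samples 1 through 100, the p95 value is 95.0, which would exceed the 50 ms guard budget above and produce a violation.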

18. Schema Evolution & Migration

When flow definitions change, analytics data needs to remain consistent and queryable.

18.1 The Challenge

When you change a machine definition:

Problem: Event renamed
────────────────────────
Before: :checkout
After:  :start_checkout

Historical data has :checkout
New data has :start_checkout
Funnel reports break!

Problem: State removed
──────────────────────
Before: #cart → #checkout → #awaiting_payment → #paid → #fulfilled
After:  #cart → #payment → #paid → #fulfilled

Funnel path changed; historical data still uses the old state names

18.2 Migration Syntax

Declare migrations in the analytics block:

flow

machine: @order

analytics:
  migrations:
    // Event renames
    :checkout -> :start_checkout           // alias old → new
    :payment_received -> :payment_success

    // State renames
    #awaiting_payment -> #payment_pending
    #processing -> #in_progress

    // Removed (mark as deprecated, preserve history)
    :legacy_checkout deprecated           // historical only
    #old_state deprecated

18.3 Migration Behavior

Change Type	Behavior
Event rename	Historical data aliased, queries work with either name
State rename	Historical state durations remain, funnel paths updated
Event removed	Marked deprecated, historical data preserved, new metrics stop
State removed	Marked deprecated, not included in new funnels
Guard renamed	Guard effectiveness combined under new name
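
One way the aliasing behavior could work at query time is a lookup that maps any historical name to its current name, following rename chains. The sketch below assumes migrations are loaded into a plain old-name-to-new-name array; the function name is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Resolves a historical event, state, or guard name to its current name by
 * following rename aliases; chains (a -> b -> c) are followed to the end.
 *
 * @param array<string, string> $aliases old name => new name
 */
function resolveAlias(array $aliases, string $name): string
{
    $seen = [];
    while (isset($aliases[$name])) {
        if (isset($seen[$name])) {
            throw new LogicException("Circular migration involving {$name}");
        }
        $seen[$name] = true;
        $name = $aliases[$name];
    }
    return $name;
}

$aliases = [
    ':checkout'         => ':start_checkout',
    '#awaiting_payment' => '#payment_pending',
];

// Rewrite a historical funnel path into current state names.
$oldPath = ['#cart', '#checkout', '#awaiting_payment', '#paid', '#fulfilled'];
$newPath = array_map(fn (string $s): string => resolveAlias($aliases, $s), $oldPath);
```

Resolving at read time leaves stored rows untouched, which fits the goal of preserving historical data while letting queries use either name.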

18.4 CLI Commands

bash

# Validate analytics schema against current flow
$ eventflow analytics:validate @order

@order Analytics Validation
═══════════════════════════════════════════════════════════════

✓ Schema consistent with flow definition

Migrations Applied:
  :checkout → :start_checkout (1,234 historical events aliased)
  #awaiting_payment → #payment_pending (456 state entries aliased)

Deprecated (Historical Only):
  :legacy_checkout (890 events, last seen: 2024-01-15)
  #old_state (12 entries, last seen: 2024-01-10)

Warnings:
  ⚠ Funnel "Purchase Flow" references #awaiting_payment (migrated)
    → Auto-updated to #payment_pending

bash

# Preview migration impact
$ eventflow analytics:migrate @order --dry-run

Migration Preview: @order
═══════════════════════════════════════════════════════════════

Changes Detected:
  1. State #processing not in flow definition
     → Mark as deprecated? [y/n]

  2. Event :old_event tracked but not in flow
     → Mark as deprecated? [y/n]

  3. Guard "legacy check" no longer exists
     → Historical data will be preserved

Run 'eventflow analytics:migrate @order' to apply changes.

18.5 Funnel Invalidation

When schema changes affect funnels:

Funnel Invalidation Warning
═══════════════════════════════════════════════════════════════

Funnel "Purchase Flow" affected by schema change:

Before: #cart → #checkout → #awaiting_payment → #paid → #fulfilled
After:  #cart → #checkout → #payment_pending  → #paid → #fulfilled

Historical data:
  - 12,450 instances through old path
  - Path step #awaiting_payment aliased to #payment_pending

Action Required: None (auto-migrated)

18.6 Best Practices

1. Always add migrations before deploying flow changes
   - Historical data remains queryable
   - Funnel reports don't break
2. Use deprecated for removed elements
   - Preserves audit trail
   - Allows historical analysis
3. Run analytics:validate in CI/CD
   - Catch schema drift early
   - Ensure migrations are complete
4. Document migration reasons

flow

migrations:
  :checkout -> :start_checkout  // renamed for clarity in v2.0

19. Event Sourcing vs Metric Tables

This section clarifies the relationship between event sourcing (the foundation of EventFlow) and metric tables (used for analytics).

19.1 The Question

"EventFlow is already event-sourced. Every event is stored. Why do we need separate metric tables?"

19.2 Event Sourcing Data

Event sourcing stores every event that occurs:

Event Store (Raw Events)
─────────────────────────────────────────────────────────────
| id     | machine | instance_id | event      | timestamp   | payload    |
|--------|---------|-------------|------------|-------------|------------|
| evt-1  | @order  | order-abc   | :checkout  | 1702000000  | {cart: ..} |
| evt-2  | @order  | order-abc   | :pay       | 1702000045  | {amount:99}|
| evt-3  | @order  | order-def   | :checkout  | 1702000050  | {cart: ..} |
| ...    | ...     | ...         | ...        | ...         | ...        |

Pros:

Complete audit trail
Can replay to any point in time
No data loss

Cons for Analytics:

To answer "How many checkouts today?" → scan ALL events
To calculate "Average time in #awaiting_payment" → replay state transitions
Query performance degrades with data size
Real-time dashboards impractical
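
To make the replay cost concrete, here is a sketch that derives an average duration directly from raw events like those in the table above. It assumes, purely for illustration, that :checkout enters #awaiting_payment and :pay leaves it; the function name is hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * Average seconds between an entering event and the next leaving event,
 * per instance, computed by scanning raw events in timestamp order.
 *
 * @param list<array{instance_id: string, event: string, timestamp: int}> $events
 */
function averageSecondsBetween(array $events, string $enterEvent, string $leaveEvent): ?float
{
    usort($events, fn (array $a, array $b): int => $a['timestamp'] <=> $b['timestamp']);

    $enteredAt = []; // instance_id => timestamp of the still-open entry
    $durations = [];
    foreach ($events as $e) {
        $id = $e['instance_id'];
        if ($e['event'] === $enterEvent) {
            $enteredAt[$id] = $e['timestamp'];
        } elseif ($e['event'] === $leaveEvent && isset($enteredAt[$id])) {
            $durations[] = $e['timestamp'] - $enteredAt[$id];
            unset($enteredAt[$id]);
        }
    }

    // Instances that never left the state are excluded from the average.
    return $durations === [] ? null : array_sum($durations) / count($durations);
}

$events = [
    ['instance_id' => 'order-abc', 'event' => ':checkout', 'timestamp' => 1702000000],
    ['instance_id' => 'order-abc', 'event' => ':pay',      'timestamp' => 1702000045],
    ['instance_id' => 'order-def', 'event' => ':checkout', 'timestamp' => 1702000050],
];

$average = averageSecondsBetween($events, ':checkout', ':pay');
```

Every call sorts and walks the full event list, which is the per-query cost that pre-aggregated tables are meant to avoid.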

19.3 Metric Tables

Metric tables are pre-aggregated derived views:

Metric Tables (Pre-Aggregated)
─────────────────────────────────────────────────────────────

event_counts:
| machine | event      | period_start | count |
|---------|------------|--------------|-------|
| @order  | :checkout  | 2024-12-01   | 1,234 |
| @order  | :checkout  | 2024-12-02   | 1,456 |

state_durations:
| machine | state              | p50_ms | p95_ms | p99_ms |
|---------|--------------------|--------|--------|--------|
| @order  | #awaiting_payment  | 12000  | 45000  | 120000 |

guard_effectiveness:
| machine | guard             | true_count | false_count |
|---------|-------------------|------------|-------------|
| @order  | "cart is valid"   | 11,234     | 1,016       |

Pros:

O(1) query time for "How many checkouts today?"
Dashboard-ready aggregates
Supports real-time alerting

Cons:

Cannot replay (aggregates only)
Aggregates can undercount if the real-time path misses events (for example, metrics dropped or sampled under load)

19.4 Hybrid Architecture

EventFlow Analytics uses both:

┌─────────────────────────────────────────────────────────────────────┐
│                        EVENT STORE                                   │
│                   (Source of Truth)                                  │
│                                                                      │
│   Every event stored permanently                                     │
│   Full audit trail                                                   │
│   Can rebuild everything from here                                   │
└───────────────────────────┬─────────────────────────────────────────┘
                            │
            ┌───────────────┴───────────────┐
            │                               │
            ▼                               ▼
┌───────────────────────┐       ┌───────────────────────┐
│   REAL-TIME PATH      │       │    BATCH PATH         │
│                       │       │                       │
│   Ring Buffer         │       │   Nightly Job         │
│        ↓              │       │        ↓              │
│   Analytics Queue     │       │   Rebuild Aggregates  │
│        ↓              │       │        ↓              │
│   Metric Tables       │       │   Metric Tables       │
│   (incremental)       │       │   (full refresh)      │
└───────────────────────┘       └───────────────────────┘
         │                               │
         └───────────────┬───────────────┘
                         │
                         ▼
              ┌───────────────────────┐
              │   ANALYTICS QUERIES   │
              │                       │
              │   - Dashboard         │
              │   - Funnels           │
              │   - Alerts            │
              └───────────────────────┘

19.5 Why Both?

Use Case	Event Store	Metric Tables
Audit trail	✓ Primary	-
Replay/debug	✓ Primary	-
Real-time dashboard	-	✓ Primary
Funnel reports	-	✓ Primary
Alert evaluation	-	✓ Primary
Historical analysis	✓ Fallback	✓ Primary
Data recovery	✓ Source	Rebuild from events

19.6 Implementation

Metric tables are materialized views of the event store:

Real-time updates: Metric events → Queue → Increment counters
Batch rebuild: Scheduled job replays event store → Rebuild aggregates
Consistency: Batch job fixes any real-time drift

php

// Conceptual: Metric table is derivable from event store
class MetricTableRebuilder
{
    public function rebuild(DateRange $range): void
    {
        $events = $this->eventStore->query($range);

        foreach ($events as $event) {
            // Same logic as real-time, but from historical data
            $this->aggregator->process($event);
        }

        $this->metricTable->replaceAggregates($this->aggregator->results());
    }
}
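
The aggregator itself is left abstract above. As a hedged illustration, a count aggregator that could serve both the real-time and rebuild paths might look like the following; the class name, the array-shaped events, and the machine|event|day key format are assumptions.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical aggregator producing rows for an event_counts table,
 * bucketed by UTC day.
 */
final class EventCountAggregator
{
    /** @var array<string, int> "machine|event|day" => count */
    private array $counts = [];

    /** @param array{machine: string, event: string, timestamp: int|float} $event */
    public function process(array $event): void
    {
        $day = gmdate('Y-m-d', (int) $event['timestamp']);
        $key = "{$event['machine']}|{$event['event']}|{$day}";
        $this->counts[$key] = ($this->counts[$key] ?? 0) + 1;
    }

    /** @return array<string, int> */
    public function results(): array
    {
        return $this->counts;
    }
}

$aggregator = new EventCountAggregator();
foreach ([
    ['machine' => '@order', 'event' => ':checkout', 'timestamp' => 1702000000],
    ['machine' => '@order', 'event' => ':pay',      'timestamp' => 1702000045],
    ['machine' => '@order', 'event' => ':checkout', 'timestamp' => 1702000050],
] as $event) {
    $aggregator->process($event);
}
$results = $aggregator->results();
```

Because process() is the same for live and replayed events, a rebuild over a date range should converge on the counts the real-time path would have produced had no metrics been dropped.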

19.7 Rebuild Triggers

Metric table rebuilds can be triggered in several ways:

Trigger	When	Use Case
Scheduled	Nightly (configurable)	Regular consistency check
On Demand	CLI command	After data recovery, migration
Automatic	Drift detected	Real-time vs batch mismatch
Schema Change	Migration applied	Event/state renames

CLI Commands:

bash

# Full rebuild (all time)
$ eventflow analytics:rebuild @order

Rebuilding metric tables for @order...
  Processing: 2024-01-01 to 2024-12-08
  Events processed: 1,234,567
  ✓ Rebuild complete (took 3m 42s)

# Partial rebuild (specific range)
$ eventflow analytics:rebuild @order --from=2024-12-01

# Check for drift without rebuilding
$ eventflow analytics:verify @order

Verifying metric consistency for @order...
  Checking event counts... ✓ match
  Checking state durations... ⚠ drift detected
    - #awaiting_payment: real-time 45.2s vs batch 44.8s (0.9% diff)
  Checking guard rates... ✓ match

Recommendation: Run 'eventflow analytics:rebuild @order' to fix drift

Configuration:

flow

analytics:
  rebuild:
    schedule: "0 2 * * *"        // 2 AM daily (cron syntax)
    drift_threshold: 1%          // auto-rebuild if drift > 1%
    retention: 90 days           // rebuild window

Automatic Drift Detection:

The analytics worker compares real-time aggregates with batch results. If drift exceeds the threshold:

Log a warning with drift details
If drift_threshold is configured, trigger an automatic rebuild
Alert the operations team if drift persists
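
The drift comparison can be sketched as a relative difference against the batch value, which is treated here as the reference since it is rebuilt from the event store. Function names and the choice of denominator are assumptions.

```php
<?php
declare(strict_types=1);

/** Relative drift of the real-time value against the batch value, in percent. */
function driftPercent(float $realTime, float $batch): float
{
    if ($batch == 0.0) {
        return $realTime == 0.0 ? 0.0 : INF; // any deviation from zero counts as unbounded drift
    }
    return abs($realTime - $batch) / abs($batch) * 100;
}

/** True when drift exceeds the configured threshold (for example 1.0 for 1%). */
function needsRebuild(float $realTime, float $batch, float $thresholdPercent): bool
{
    return driftPercent($realTime, $batch) > $thresholdPercent;
}

$drift = driftPercent(45.2, 44.8);        // about 0.89
$rebuild = needsRebuild(45.2, 44.8, 1.0); // below the 1% threshold
```

With the verify output above (45.2s real-time vs 44.8s batch), drift is just under 1%, so a 1% threshold would log the mismatch without triggering an automatic rebuild.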

19.8 Summary

Aspect	Event Store	Metric Tables
Role	Source of truth	Derived views
Persistence	Permanent	Rebuildable
Query speed	O(n)	O(1)
Use case	Audit, replay	Dashboard, alerts
Relationship	Parent	Child (derived)

Key insight: Metric tables don't replace event sourcing—they're an optimization layer. If metric tables are lost, they can be rebuilt from the event store.

20. Related Proposals

Event Queue Proposal - Queue-level metrics (pending, processing, failed); Analytics extends this with machine behavior metrics
Test Scenarios Proposal - Test coverage tracking integrates with analytics coverage reports
Data Validation Proposal - Validation failures can be tracked as analytics events
Machine Response Proposal - Response metrics (status codes, latency) complement event metrics

21. Summary

┌─────────────────────────────────────────────────────────────────┐
│                                                                 │
│  Zero Overhead Architecture                                     │
│  ──────────────────────────                                     │
│  - Metrics are events (fire-and-forget)                         │
│  - Ring buffer + async queue = ~0ms latency impact              │
│  - Batch writes for storage efficiency                          │
│  - Graceful degradation under load                              │
│                                                                 │
│  Metric Types                                                   │
│  ────────────                                                   │
│  - Events: count, rate, latency                                 │
│  - States: duration, entry/exit, active                         │
│  - Guards: true/false rate, dead detection, evaluation time     │
│  - Actions: execution time per action                           │
│  - Transitions: count, conversion rate, transition time         │
│  - Context: distribution, cardinality                           │
│  - Timing: API response time, processing time                   │
│                                                                 │
│  Automatic Funnel Detection                                     │
│  ──────────────────────────                                     │
│  - Terminal states discovered from graph topology               │
│  - SUCCESS/FAILURE classification by naming patterns            │
│  - All paths traced, conversion rates computed                  │
│  - No manual configuration required                             │
│                                                                 │
│  Timing Model                                                   │
│  ────────────                                                   │
│  - Near real-time: alerts, dashboards (1-5s delay)              │
│  - Batch: funnel reports, coverage, dead code                   │
│                                                                 │
│  CLI Tools                                                      │
│  ─────────                                                      │
│  - eventflow analytics: metrics dashboard                       │
│  - eventflow funnel: conversion analysis                        │
│  - eventflow coverage: test coverage + dead code                │
│  - eventflow alerts: alert management                           │
│  - eventflow insights: auto-generated recommendations           │
│  - eventflow perf: performance profiling                        │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘

Philosophy

Numbers tell the story. Funnels reveal the journey.
Metrics are events. Events flow through queues.
Analytics never blocks business logic.

EventFlow Analytics follows the natural language philosophy: analytics declarations are as readable as the workflows they measure, enabling collaboration between developers, product managers, and business analysts. The zero-overhead architecture ensures production workloads are never impacted by metric collection.
