Signal Generator v1.0.0
Release Date: 2024-02-06
Summary
Initial release of the Signal Generator, a system for extracting design signals from Helio UX research data. This release introduces a map-reduce architecture featuring parallel processing, real-time SSE streaming, and a customizable skills system.
Highlights
- Map-Reduce Pattern: Parallel section processing with gpt-4o-mini followed by synthesis with gpt-4o
- SSE Streaming: Real-time progress updates as each section completes
- Skills Injection: Database-driven prompt customization without code changes
- Shareable Reports: Persistent SignalRun records with unique URLs
- UX Metric Grouping: Cross-section synthesis when multiple questions contribute to the same metric
Architecture
System Design
The Signal Generator uses a two-phase map-reduce pattern:
- Map Phase: Each question/section is processed in parallel using gpt-4o-mini to extract structured insights (summary, quant metrics, qual quotes)
- Reduce Phase: All section insights are synthesized into 2-5 coherent design signals using gpt-4o
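The two phases can be pictured as a single orchestration function. The sketch below is illustrative, not the actual code in generate-signals.ts: `processSection` and `synthesize` stand in for the real gpt-4o-mini and gpt-4o calls, and the types are simplified.

```typescript
// Sketch of the two-phase pipeline; the model calls are injected so the
// map (gpt-4o-mini) and reduce (gpt-4o) steps can be swapped or mocked.
type Section = { id: number; type: string };
type Insight = { sectionId: number; summary: string };
type Signal = { header: string; body: string };

async function generateSignals(
  sections: Section[],
  processSection: (s: Section) => Promise<Insight>,       // map phase
  synthesize: (insights: Insight[]) => Promise<Signal[]>, // reduce phase
): Promise<Signal[]> {
  // Map phase: all sections in parallel; failed sections are skipped,
  // matching the "no retry logic" limitation noted later.
  const settled = await Promise.allSettled(sections.map(processSection));
  const insights = settled
    .filter((r): r is PromiseFulfilledResult<Insight> => r.status === "fulfilled")
    .map((r) => r.value);

  // Reduce phase: one synthesis call over all surviving insights.
  return synthesize(insights);
}
```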
Data Flow
GET /api/signals/generate?testId=xxx
         │
         ▼
┌───────────────────┐
│ Parallel Fetch    │
│ - Helio Report    │
│ - SignalConfig    │
│ - Enabled Skills  │
└────────┬──────────┘
         │
         ▼
┌───────────────────────────────────────┐
│          MAP PHASE (parallel)         │
│   Section 1 ──► Insight 1             │
│   Section 2 ──► Insight 2             │
│   Section N ──► Insight N             │
│        (gpt-4o-mini each)             │
└────────────────┬──────────────────────┘
                 │
                 ▼
┌───────────────────────────────────────┐
│             REDUCE PHASE              │
│  All Insights ──► 2-5 Design Signals  │
│  (gpt-4o with system prompt + skills) │
└────────────────┬──────────────────────┘
                 │
                 ▼
┌───────────────────────────────────────┐
│             PERSISTENCE               │
│  SignalRun.create() → shareable URL   │
└───────────────────────────────────────┘
Key Files
| File | Purpose |
|------|---------|
| app/api/signals/generate/route.ts | SSE endpoint, orchestrates full pipeline |
| app/apps/signal-generator/_actions/generate-signals.ts | Core processing functions |
| app/apps/signal-generator/_actions/signal-configs.ts | SignalConfig CRUD and default prompt |
| app/apps/signal-generator/DATA-GUIDE.md | Reference for 17 UX metrics, 12 question types |
| app/apps/signal-generator/reports/[runId]/page.tsx | Shareable report viewer |
Model Configuration
| Phase | Model | Purpose | Rationale |
|-------|-------|---------|-----------|
| Map | gpt-4o-mini | Section extraction | Fast, cost-effective for structured extraction |
| Reduce | gpt-4o | Signal synthesis | Superior reasoning for cross-section synthesis |
Prompt Details
Section Processing Prompt (Map Phase)
Purpose: Extract structured insight from a single question/section
Model: gpt-4o-mini
Extract data from this UX research question.
## Study: ${studyName}
## Question Data
${questionJson}
## Instructions
1. **summary**: 1-2 sentence summary of findings
2. **quant[]**: Extract quantitative data based on question type.
   **IMPORTANT**: Every quant entry MUST include these context fields:
   - questionType: "${question.type}" (the question type)
   - questionText: "${question.question}" (the question asked)
   - imageUrl: Use option's image_url if available, otherwise null
   If `ux_metric` exists, create ONE entry:
   { uxMetricId: ux_metric.type, score: ux_metric.score, ... }
   Then extract supporting data based on type:
   - **preference/multiple_choice**: One entry per option
   - **likert/numerical_scale**: ONE entry with average_score only
   - **rank**: One entry per item with average position
   - **free_response**: Extract from `sentiment_breakdown` if available
3. **qual[]**: Verbatim quotes ONLY (max 5)
   - ONLY extract from `text` string fields
   - NEVER from `selected` arrays
   - If no text fields exist, return empty array
Key Rules:
- Quant entries always include questionType, questionText, imageUrl for context
- UX Metrics identified by non-null uxMetricId
- Qualitative quotes from text fields only, never selected arrays
Synthesis Prompt (Reduce Phase)
Purpose: Combine all section insights into coherent design signals
Model: gpt-4o
Synthesize section insights into design signals.
## Study Context
${studyContext}
## UX Metrics (Overall)
${uxMetricsJson}
## UX Metric Groupings
${uxMetricGroupingsJson}
## Demographics & Audiences
${demographicsJson}
## Section Insights (with UX metric annotations)
${insightsJson}
## Instructions
Create 2-5 design signals. Each signal should have:
1. **header**: Clear finding statement
2. **body**: Narrative blending quant + qual evidence
3. **quant[]**: UX Metrics + supporting data, preserve sectionId
4. **qual[]**: 2-5 compelling verbatim quotes
5. **reportIds**: ["${report.study.id}"]
## Synthesis Guidance
When creating a signal that discusses a UX metric, consider insights from ALL sections that contribute to that metric.
Key Rules:
- Always produce 2-5 signals per test
- Distinguish UX Metrics (uxMetricId set) from supporting data
- Cross-section synthesis for shared UX metrics
- Preserve sectionId/sectionResponseId for deep linking
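The cross-section grouping that feeds `${uxMetricGroupingsJson}` can be pictured as a helper that collects every section contributing quant entries to the same UX metric. This is a hypothetical illustration of the idea, not the actual implementation:

```typescript
// Group contributing section IDs by UX metric; entries with a null
// uxMetricId are supporting data, not UX metrics, and are skipped.
type QuantEntry = { uxMetricId: string | null; sectionId: number };
type Insight = { sectionId: number; quant: QuantEntry[] };

function groupSectionsByUxMetric(insights: Insight[]): Record<string, number[]> {
  const groups: Record<string, number[]> = {};
  for (const insight of insights) {
    for (const q of insight.quant) {
      if (q.uxMetricId === null) continue;
      const sections = (groups[q.uxMetricId] ??= []);
      if (!sections.includes(insight.sectionId)) sections.push(insight.sectionId);
    }
  }
  return groups;
}
```

A metric appearing in more than one group entry is exactly the case where the synthesis prompt asks for cross-section reasoning.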
Default System Prompt
Purpose: Establish analyst persona and signal structure
You are a senior UX research analyst. Your job is to analyze test report data and design assets from Helio (a UX testing platform) and produce design signals.
A design signal surfaces a key finding from the test, explains why it matters, and provides actionable suggestions backed by data.
For each signal you generate:
**header** — State the finding as problem space + context.
**why** — Explain the reasoning that connects the data points.
**suggestions** — List specific, actionable next steps.
**data** — Cite supporting evidence (qual + quant).
**imageUrls** — Include relevant design asset URLs.
Generate multiple signals per test. Focus on the strongest, most well-supported findings.
Schema Definitions
SectionInsight (Map Output)
const sectionInsightSchema = z.object({
  sectionId: z.number(),
  sectionType: z.string(),
  summary: z.string(),
  quant: z.array(z.object({
    conceptName: z.string(),
    metricLabel: z.string(),
    score: z.number(),
    sectionId: z.number(),
    uxMetricId: z.string().nullable(),
    questionType: z.string(),
    questionText: z.string(),
    imageUrl: z.string().nullable(),
  })),
  qual: z.array(z.object({
    participantId: z.string(),
    quote: z.string(),
    sentiment: z.enum(["positive", "neutral", "negative"]).nullable(),
    sectionId: z.number(),
    sectionResponseId: z.number(),
  })),
});
DesignSignal (Reduce Output)
const designSignalSchema = z.object({
  signals: z.array(z.object({
    header: z.string(),
    body: z.string(),
    reportIds: z.array(z.string()),
    quant: z.array(z.object({
      conceptName: z.string(),
      metricLabel: z.string(),
      score: z.number(),
      sectionId: z.number().nullable(),
      uxMetricId: z.string().nullable(),
      questionType: z.string(),
      questionText: z.string(),
      imageUrl: z.string().nullable(),
    })),
    qual: z.array(z.object({
      participantId: z.string(),
      avatarUrl: z.string().nullable(),
      quote: z.string(),
      sentiment: z.enum(["positive", "neutral", "negative"]).nullable(),
      sectionId: z.number().nullable(),
      sectionResponseId: z.number().nullable(),
    })),
  })),
});
SSE Event Types
type SSEEvent =
  | { type: "init"; testName: string; sections: [...] }
  | { type: "section-start"; sectionId: number }
  | { type: "section-complete"; sectionId: number; insight: SectionInsight }
  | { type: "section-error"; sectionId: number; error: string }
  | { type: "synthesis-start" }
  | { type: "synthesis-complete"; signals: DesignSignal[] }
  | { type: "done"; signalRunId: string; reportUuid: string; signals: DesignSignal[] }
  | { type: "error"; message: string };
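On the client, these events can drive a simple progress state. Below is a minimal sketch of an event reducer; the EventSource wiring and UI are omitted, and the `Progress` state shape is an assumption, not the actual client code:

```typescript
// Fold one parsed SSE payload into a progress state. Unhandled event
// types (section-start, synthesis-start, etc.) leave the state unchanged.
type Progress = {
  testName?: string;
  completed: number[];
  failed: number[];
  signals?: unknown[];
  runId?: string;
};

function applyEvent(state: Progress, raw: string): Progress {
  const event = JSON.parse(raw);
  switch (event.type) {
    case "init":
      return { ...state, testName: event.testName };
    case "section-complete":
      return { ...state, completed: [...state.completed, event.sectionId] };
    case "section-error":
      return { ...state, failed: [...state.failed, event.sectionId] };
    case "done":
      return { ...state, signals: event.signals, runId: event.signalRunId };
    default:
      return state;
  }
}
```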
Skills System
Skills are injectable prompt segments stored in the database:
if (enabledSkills.length > 0) {
  systemPrompt += "\n\n## Skills\n\n";
  for (const skill of enabledSkills) {
    systemPrompt += `### ${skill.name}\n\n${skill.content}\n\n`;
  }
}
This allows customizing signal generation behavior without code changes.
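The same injection logic can be written as a pure function, shown here for illustration; the `Skill` shape (a name plus content) is assumed from the snippet above:

```typescript
// Append each enabled skill to the system prompt under a "## Skills"
// section; with no skills enabled, the prompt is returned unchanged.
type Skill = { name: string; content: string };

function injectSkills(systemPrompt: string, enabledSkills: Skill[]): string {
  if (enabledSkills.length === 0) return systemPrompt;
  let out = systemPrompt + "\n\n## Skills\n\n";
  for (const skill of enabledSkills) {
    out += `### ${skill.name}\n\n${skill.content}\n\n`;
  }
  return out;
}
```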
Known Limitations
- No streaming for individual signal generation - Synthesis happens in one call
- Constrained signal count - Always produces 2-5 signals regardless of data complexity
- No retry logic - Section failures are reported but not retried
- No image analysis - Design assets are referenced by URL but not visually analyzed by the model
Related ADRs
- ADR-001: Map-Reduce Pattern - Why parallel map + synthesis reduce
- ADR-002: Model Selection - gpt-4o-mini vs gpt-4o selection
Performance Characteristics
| Metric | Typical Value | Notes |
|--------|---------------|-------|
| Section processing | 1-3s each | gpt-4o-mini, runs in parallel |
| Synthesis | 5-15s | gpt-4o, varies by insight count |
| Total (10 sections) | 10-20s | Parallelization limits map phase |
| Token usage (map) | ~500-1000/section | Depends on question complexity |
| Token usage (reduce) | ~2000-5000 | Depends on total insights |
Database Schema
SignalRun
model SignalRun {
  id               String   @id @default(cuid())
  testId           String
  testName         String?
  signalConfigId   String?
  signals          String   // JSON stringified
  deepLinkTemplate String
  createdAt        DateTime @default(now())
}
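Because the `signals` column is a stringified JSON array, writes and reads round-trip through JSON.stringify / JSON.parse. A minimal sketch, with a simplified stand-in for the DesignSignal shape defined earlier:

```typescript
// The write path (SignalRun.create) stringifies the signals array;
// the report page parses it back before rendering.
type StoredSignal = { header: string; body: string };

function serializeSignals(signals: StoredSignal[]): string {
  return JSON.stringify(signals);
}

function deserializeSignals(column: string): StoredSignal[] {
  return JSON.parse(column) as StoredSignal[];
}
```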
API Reference
Generate Signals
GET /api/signals/generate?testId={testId}
Response: Server-Sent Events stream
Content-Type: text/event-stream
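Each event in the stream is a standard SSE `data:` frame: a `data:` line carrying the JSON payload, terminated by a blank line. A minimal encoder sketch (not the actual route code):

```typescript
// Encode one event object as a text/event-stream frame.
function encodeSSEFrame(event: object): string {
  return `data: ${JSON.stringify(event)}\n\n`;
}
```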
View Report
GET /apps/signal-generator/reports/{runId}
Response: HTML page with signal visualization