Report Agent v1.0.0

Release Date: 2024-02-06

Summary

Initial release of the Report Agent interactive chat system. This release provides natural language exploration of Helio test report data through 14 specialized tools, O(1) context indexing, and an agentic loop that supports multi-turn reasoning.

Highlights

  • 14 Specialized Tools: Comprehensive coverage for data access, filtering, and analysis
  • O(1) Context Indexing: Pre-computed Map indexes for constant-time lookups
  • Agentic Loop: Up to 10 tool call rounds for complex queries
  • Dynamic System Prompt: Report-specific context injection
  • Tool Call Debugging: Every tool call is returned with its name, arguments, and result

Architecture

System Design

The Report Agent uses a closure-based tool pattern where:

  1. Report context is loaded once per request
  2. Tools are created with context access via closure
  3. The AI SDK manages multi-turn tool calling
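The three steps above can be sketched as follows. This is a minimal illustration rather than the contents of tools.ts: the `MiniContext` and `MiniQuestion` shapes, the `createTools` name, and the plain-function tools are simplified assumptions, and the real tools are registered through the AI SDK.

```typescript
// Hypothetical, simplified shapes; the real types live in _lib/types.ts.
interface MiniQuestion {
  id: number;
  text: string;
}

interface MiniContext {
  studyName: string;
  questionsById: Map<number, MiniQuestion>;
}

// Each tool is a plain function here. Schema and description metadata,
// which the AI SDK would need, are omitted from this sketch.
interface MiniTools {
  getReportOverview: () => { name: string; totalQuestions: number };
  getQuestion: (input: { questionId: number }) => MiniQuestion | { error: string };
}

// The context is captured by closure, so no tool takes it as an argument
// and the model never has to pass report data back in.
function createTools(context: MiniContext): MiniTools {
  return {
    getReportOverview: () => ({
      name: context.studyName,
      totalQuestions: context.questionsById.size,
    }),
    getQuestion: ({ questionId }) =>
      context.questionsById.get(questionId) ?? {
        error: `Question ${questionId} not found`,
      },
  };
}

// Usage: load the context once, then hand out tools bound to it.
const demoContext: MiniContext = {
  studyName: "Checkout study",
  questionsById: new Map([[5, { id: 5, text: "How easy was checkout?" }]]),
};
const demoTools = createTools(demoContext);
```

Returning an error object for an unknown ID, instead of throwing, is one way to let the model see the failure and recover within the same loop.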

Data Flow

POST /api/report-agent/chat
        │
        ▼
┌───────────────────┐
│ Load Context      │ ← Helio API (once)
│ Build Map Indexes │
└────────┬──────────┘
         │
         ▼
┌───────────────────┐
│ Create Tools      │ ← 14 tools with context closure
└────────┬──────────┘
         │
         ▼
┌───────────────────────────────────────┐
│ Agentic Loop (gpt-4o)                 │
│                                       │
│ Round 1: Model selects tools          │
│ Round 2: Execute, receive results     │
│ Round 3: Continue or respond          │
│ ...                                   │
│ Round N: Final text response          │
│                                       │
│ stopWhen: stepCountIs(10)             │
└───────────────────────────────────────┘
         │
         ▼
{ text, toolCalls[], debug }
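
In the real endpoint the loop is driven by the AI SDK (`generateText` with `stopWhen: stepCountIs(10)`). To make the control flow in the diagram concrete, here is a hand-rolled, synchronous stand-in that does not use the SDK. `ModelStep`, `runLoop`, and the scripted stub model are all hypothetical names introduced for this sketch.

```typescript
// One model turn: either request a tool call or produce the final text.
type ModelStep =
  | { kind: "tool"; name: string; args: unknown }
  | { kind: "text"; text: string };

interface ToolCallRecord {
  name: string;
  args: unknown;
  result: unknown;
}

type ToolFn = (args: unknown) => unknown;
// The stub model sees the tool calls made so far and decides the next step.
type Model = (history: ToolCallRecord[]) => ModelStep;

function runLoop(
  model: Model,
  tools: Record<string, ToolFn>,
  maxSteps = 10,
): { text: string; toolCalls: ToolCallRecord[]; stepCount: number } {
  const toolCalls: ToolCallRecord[] = [];
  for (let step = 1; step <= maxSteps; step++) {
    const next = model(toolCalls);
    if (next.kind === "text") {
      return { text: next.text, toolCalls, stepCount: step };
    }
    const tool = tools[next.name];
    // Unknown tools are reported back to the model instead of throwing.
    const result = tool ? tool(next.args) : { error: `Unknown tool: ${next.name}` };
    toolCalls.push({ name: next.name, args: next.args, result });
  }
  // Step budget exhausted before the model produced a final answer.
  return { text: "", toolCalls, stepCount: maxSteps };
}

// Scripted stub: call getUxMetrics once, then answer.
const scriptedModel: Model = (history) =>
  history.length === 0
    ? { kind: "tool", name: "getUxMetrics", args: {} }
    : { kind: "text", text: `Answered after ${history.length} tool call(s)` };

const loopResult = runLoop(scriptedModel, {
  getUxMetrics: () => ({ usability: 78 }),
});
```

The empty `text` on budget exhaustion is a choice made for this sketch; how the real endpoint reports a query that hits the 10-step cap is not specified here.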

Key Files

| File | Purpose |
|------|---------|
| app/api/report-agent/chat/route.ts | Chat endpoint, system prompt, generateText |
| app/apps/report-agent/_lib/tools.ts | 14 tool definitions |
| app/apps/report-agent/_lib/report-context.ts | Context loading and indexing |
| app/apps/report-agent/_lib/types.ts | TypeScript types |

Model Configuration

| Component | Model | Rationale |
|-----------|-------|-----------|
| Chat | gpt-4o | Best multi-turn reasoning, reliable tool selection |


Tools Reference

Data Access (7 tools)

| Tool | Purpose | Inputs |
|------|---------|--------|
| getReportOverview | Full report summary | None |
| getQuestion | Single question details | questionId: number |
| getUxMetrics | All UX metrics | None |
| getUxMetric | Single UX metric | metricType: string |
| getDemographics | Participant demographics | None |
| getResponses | Paginated responses | limit?, offset? |
| getResponse | Single participant | participantId: number |
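
The optional `limit` and `offset` inputs of getResponses suggest conventional offset pagination. A generic sketch of that pattern follows; the `paginate` name, the default limit of 20, and the `hasMore` field are assumptions, not documented behavior of the tool.

```typescript
// Generic limit/offset pagination with defensive clamping.
function paginate<T>(
  items: T[],
  limit = 20, // assumed default; the real default is not documented here
  offset = 0,
): { data: T[]; total: number; hasMore: boolean } {
  // Clamp inputs so negative or fractional values from the model are harmless.
  const safeOffset = Math.max(0, Math.floor(offset));
  const safeLimit = Math.max(1, Math.floor(limit));
  const data = items.slice(safeOffset, safeOffset + safeLimit);
  return {
    data,
    total: items.length,
    hasMore: safeOffset + data.length < items.length,
  };
}

const firstPage = paginate([1, 2, 3, 4, 5], 2, 0);
const lastPage = paginate([1, 2, 3, 4, 5], 2, 4);
```

Returning `total` and `hasMore` alongside the page lets the model decide whether another round is worth spending from its step budget.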

Filtering (4 tools)

| Tool | Purpose | Inputs |
|------|---------|--------|
| filterQuestionsByType | By question type | questionType: string |
| filterQuestionsByMetric | By UX metric | metricType: string |
| searchResponses | Keyword search | keyword, limit? |
| filterByDemographic | By demographic | demographicKey, demographicValue |

Analysis (3 tools)

| Tool | Purpose | Inputs |
|------|---------|--------|
| compareOptions | Option comparison | questionId: number |
| compareDemographics | Segment comparison | questionId, demographicKey |
| getTopQuotes | Best quotes | sentiment?, keyword?, limit? |

See TOOLS.md for complete tool documentation.


Prompt Details

System Prompt

Model: gpt-4o

You are a UX research analyst assistant helping users explore test report data from Helio. You have access to tools that let you query the report data.

## Current Report
- **Name**: ${report.study.name}
- **Status**: ${report.study.status}
- **Total Responses**: ${report.study.total_responses}
- **Total Questions**: ${report.study.total_questions}

## Available Tools
You have tools to:
1. **Get Overview**: Get a summary of the report (getReportOverview)
2. **Get Questions**: Get details about specific questions
3. **Get UX Metrics**: Get overall and specific UX scores
4. **Get Demographics**: Get participant demographic breakdowns
5. **Get Responses**: Get individual participant responses
6. **Search**: Search for keywords in text responses
7. **Filter**: Filter responses by demographic criteria
8. **Compare**: Compare options or demographic segments
9. **Get Quotes**: Get insightful verbatim quotes

## Guidelines
- Always start by using getReportOverview if the user asks a broad question
- When discussing scores or metrics, always call the relevant tool - don't make up numbers
- When quoting participants, use the exact text from the data
- Provide actionable insights when appropriate
- Reference specific question IDs and participant IDs when relevant

## Response Style
- Use natural, conversational language
- Format responses with markdown for readability
- Include context for quantitative data
- Highlight surprising or important findings

## Important Rules
- ALWAYS complete your lists and sentences
- When listing metrics, include ALL of them with their values
- If data is missing, explicitly say "data not available"
- Double-check that your response is complete before finishing
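
The `${report.study...}` placeholders in the prompt above imply that the text is assembled per report with template literals. A minimal sketch of that injection for the "Current Report" section follows; `StudySummary` and `buildCurrentReportSection` are hypothetical names, and the field list is taken only from the placeholders shown.

```typescript
// Only the fields referenced by the prompt placeholders.
interface StudySummary {
  name: string;
  status: string;
  total_responses: number;
  total_questions: number;
}

// Render the report-specific block that is spliced into the system prompt.
function buildCurrentReportSection(study: StudySummary): string {
  return [
    "## Current Report",
    `- **Name**: ${study.name}`,
    `- **Status**: ${study.status}`,
    `- **Total Responses**: ${study.total_responses}`,
    `- **Total Questions**: ${study.total_questions}`,
  ].join("\n");
}

const reportSection = buildCurrentReportSection({
  name: "Checkout study",
  status: "complete",
  total_responses: 120,
  total_questions: 8,
});
```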

Context Indexing

Pre-computed indexes enable O(1) lookups:

interface ReportContext {
  report: TestReportData;
  testId: string;
  questionsById: Map<number, TestReportQuestion>;       // O(1) by ID
  questionsByPosition: Map<number, TestReportQuestion>; // O(1) by position
  responsesByParticipant: Map<number, TestReportResponseRecord>; // O(1) by participant
}

Index Building

export function buildReportContext(testId: string, report: TestReportData): ReportContext {
  const questionsById = new Map<number, TestReportQuestion>();
  const questionsByPosition = new Map<number, TestReportQuestion>();
  const responsesByParticipant = new Map<number, TestReportResponseRecord>();

  for (const question of report.questions ?? []) {
    questionsById.set(question.id, question);
    questionsByPosition.set(question.position, question);
  }

  for (const response of report.responses?.data ?? []) {
    responsesByParticipant.set(response.id, response);
  }

  return { report, testId, questionsById, questionsByPosition, responsesByParticipant };
}
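
To show what the indexes buy, the sketch below contrasts an array scan with a Map lookup. The trimmed-down `Q` type and the sample questions are illustrative assumptions, and the index is rebuilt inline so the snippet runs without the rest of the module.

```typescript
// Trimmed-down stand-in for TestReportQuestion (only the fields used here).
interface Q {
  id: number;
  position: number;
  text: string;
}

const questions: Q[] = [
  { id: 101, position: 1, text: "First impression?" },
  { id: 102, position: 2, text: "How easy was checkout?" },
];

// Same idea as buildReportContext: one pass up front, O(1) reads afterwards.
const byId = new Map<number, Q>();
const byPosition = new Map<number, Q>();
for (const q of questions) {
  byId.set(q.id, q);
  byPosition.set(q.position, q);
}

// Without the index, every call scans the array: O(n) per lookup.
const viaScan = questions.find((q) => q.id === 102);

// With the index, the same lookup is a single Map.get.
const viaIndex = byId.get(102);

// IDs and positions are separate key spaces, so "question 2" from a user
// may mean position 2 rather than ID 2.
const secondQuestion = byPosition.get(2);
```

Keeping both maps means a tool can resolve either interpretation of a question number without scanning.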

API Reference

Chat Endpoint

POST /api/report-agent/chat

Request Body:
{
  testId: string;
  messages: Array<{ role: "user" | "assistant"; content: string }>;
}

Response:
{
  text: string;
  toolCalls: Array<{
    name: string;
    args: unknown;
    result: unknown;
  }>;
  debug: {
    finishReason: string;
    stepCount: number;
    usage: { promptTokens: number; completionTokens: number };
  };
}
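
A client call against this endpoint could look like the sketch below. `buildChatRequest` and `askReportAgent` are hypothetical helper names, and error handling is reduced to a status check.

```typescript
type ChatMessage = { role: "user" | "assistant"; content: string };

interface ChatResponse {
  text: string;
  toolCalls: Array<{ name: string; args: unknown; result: unknown }>;
  debug: {
    finishReason: string;
    stepCount: number;
    usage: { promptTokens: number; completionTokens: number };
  };
}

// Build the fetch options separately so they can be inspected or tested
// without touching the network.
function buildChatRequest(testId: string, messages: ChatMessage[]) {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ testId, messages }),
  };
}

async function askReportAgent(
  testId: string,
  messages: ChatMessage[],
): Promise<ChatResponse> {
  const res = await fetch(
    "/api/report-agent/chat",
    buildChatRequest(testId, messages),
  );
  if (!res.ok) {
    throw new Error(`Chat request failed with status ${res.status}`);
  }
  // The cast trusts the documented response shape; no runtime validation.
  return (await res.json()) as ChatResponse;
}
```

Because the request carries the full `messages` array, a client would presumably append each `text` reply to its local history as an assistant message before sending the next turn.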

Known Limitations

  1. No streaming - The full response is returned at once
  2. Max 10 tool rounds - Complex queries may hit the limit
  3. Context not persisted - Reloaded for each request
  4. No cross-report queries - Single report per conversation
  5. Tool results truncated - Large results are cut to 2000 characters in the debug output
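
The truncation in limitation 5 could work along these lines. This is an assumption about the approach, not the actual route code: `truncateForDebug` is a hypothetical name, and the suffix marker is invented for illustration.

```typescript
const DEBUG_RESULT_LIMIT = 2000; // character cap stated in the limitations

// Serialize a tool result and cap its length for the debug payload.
function truncateForDebug(result: unknown, limit = DEBUG_RESULT_LIMIT): string {
  // JSON.stringify yields undefined at runtime for inputs such as undefined.
  const json: string | undefined = JSON.stringify(result);
  const serialized = json ?? "undefined";
  if (serialized.length <= limit) {
    return serialized;
  }
  return serialized.slice(0, limit) + "... [truncated]";
}
```

Note that a truncated string is usually no longer valid JSON, so a debug consumer should treat it as display text only.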


Performance Characteristics

| Metric | Typical Value | Notes |
|--------|---------------|-------|
| Context load | 500ms-2s | Helio API call |
| Tool execution | 1-10ms | O(1) Map lookups |
| Full response | 2-15s | Depends on tool chain |
| Max tool rounds | 10 | Configurable |


Example Interactions

Basic Query

User: "What's the usability score?"

Tool Calls:

  1. getUxMetrics({})

Response:

The usability score is 78/100 (rated "Good"). This score is derived from questions that measure how easy participants found the product to use.

Multi-Tool Query

User: "Compare how different age groups responded to question 5"

Tool Calls:

  1. getQuestion({ questionId: 5 })
  2. compareDemographics({ questionId: 5, demographicKey: "age" })

Response:

Question 5 asked "How easy was it to complete the checkout?"

Age breakdown:

  • 18-24: 42% found it "Very Easy"
  • 25-34: 55% found it "Very Easy"
  • 35-44: 38% found it "Very Easy"

Participants aged 25-34 gave the highest ease ratings, while the 35-44 group gave the lowest.
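
The per-segment percentages in a response like this one could come from a grouping step such as the sketch below. The record shape, the `shareByGroup` name, and rounding to whole percentages are assumptions for illustration, not the documented behavior of compareDemographics.

```typescript
// Hypothetical per-participant record for one question.
interface SegmentAnswer {
  group: string; // e.g. an age bracket
  answer: string; // the option the participant chose
}

// Percentage of each group that chose `target`, rounded to whole numbers.
function shareByGroup(rows: SegmentAnswer[], target: string): Map<string, number> {
  const totals = new Map<string, { hits: number; n: number }>();
  for (const row of rows) {
    const entry = totals.get(row.group) ?? { hits: 0, n: 0 };
    entry.n += 1;
    if (row.answer === target) entry.hits += 1;
    totals.set(row.group, entry);
  }
  const shares = new Map<string, number>();
  totals.forEach(({ hits, n }, group) => {
    shares.set(group, Math.round((hits / n) * 100));
  });
  return shares;
}

// Made-up sample rows, not data from any report.
const sampleShares = shareByGroup(
  [
    { group: "18-24", answer: "Very Easy" },
    { group: "18-24", answer: "Easy" },
    { group: "25-34", answer: "Very Easy" },
    { group: "25-34", answer: "Very Easy" },
    { group: "25-34", answer: "Very Easy" },
    { group: "25-34", answer: "Hard" },
  ],
  "Very Easy",
);
```

With small segments, percentages like these can swing on one or two answers, so reporting the group size alongside each share would give the "context for quantitative data" that the system prompt asks for.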

Search Query

User: "Find feedback mentioning 'slow'"

Tool Calls:

  1. searchResponses({ keyword: "slow", limit: 10 })

Response:

Found 7 responses mentioning "slow":

  • "The page was too slow to load" (Participant 9045, Q3)
  • "Checkout felt slow and clunky" (Participant 9012, Q7)
  • ...
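
A keyword search like the one in this example could be as simple as a case-insensitive substring match. The sketch below assumes a flattened `AnswerRecord` shape and uses the hypothetical name `searchResponsesSketch`; the real searchResponses tool and the Helio response structure may differ.

```typescript
// Hypothetical flattened answer record; the real response shape may differ.
interface AnswerRecord {
  participantId: number;
  questionId: number;
  text: string;
}

function searchResponsesSketch(
  answers: AnswerRecord[],
  keyword: string,
  limit = 10,
): { total: number; matches: AnswerRecord[] } {
  const needle = keyword.trim().toLowerCase();
  // An empty keyword would match everything, so treat it as no results.
  if (needle === "") {
    return { total: 0, matches: [] };
  }
  const all = answers.filter((a) => a.text.toLowerCase().includes(needle));
  // Report the full count but return at most `limit` records.
  return { total: all.length, matches: all.slice(0, limit) };
}

const sampleAnswers: AnswerRecord[] = [
  { participantId: 9045, questionId: 3, text: "The page was too slow to load" },
  { participantId: 9012, questionId: 7, text: "Checkout felt slow and clunky" },
  { participantId: 9030, questionId: 3, text: "Looked clean and modern" },
];
const slowHits = searchResponsesSketch(sampleAnswers, "Slow", 10);
```

Returning `total` separately from the limited `matches` is what lets a response state the overall count while quoting only a few participants.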