Tool Validation & Retry Hints

Learn how Goa-AI turns validation failures into structured ToolError and RetryHint values that LLMs can use to repair tool calls.

Why This Matters

Goa-AI combines Goa’s design-time validations with a structured tool error model so that LLM planners can repair invalid tool calls automatically.

Instead of:

sprinkling ad-hoc validation in executors,
guessing which fields are missing from opaque error strings, or
forcing users to fix every bad call manually,

Goa-AI lets you:

describe precise payload schemas and validations in the Goa design,
surface failures as structured ToolError + RetryHint values, and
teach your planner to retry with better inputs based on those hints.

This pattern is one of the best reasons to use Goa-AI for tool-using agents.

Core Types: ToolError and RetryHint

At planner level (runtime/agent/planner):

ToolError
Alias for runtime/agent/toolerrors.ToolError:
- Message string – human-readable summary.
- Cause *ToolError – optional nested cause (preserves chains across retries and agent-as-tool hops).
- Constructors and helpers:
  - planner.NewToolError(msg string)
  - planner.NewToolErrorWithCause(msg string, cause error)
  - planner.ToolErrorFromError(err error)
  - planner.ToolErrorf(format, args...)
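
To illustrate why the nested Cause is useful, here is a minimal, self-contained sketch. The toolError type and the newToolErrorWithCause and fromError helpers are local stand-ins written for this example; they mirror the fields and constructors listed above but are assumptions, not the goa-ai implementation.

```go
package main

import (
	"errors"
	"fmt"
)

// toolError is a local stand-in for the ToolError shape described above
// (a Message plus an optional nested Cause). It is NOT the goa-ai type.
type toolError struct {
	Message string
	Cause   *toolError
}

// Error renders the whole chain, outermost message first.
func (e *toolError) Error() string {
	if e.Cause == nil {
		return e.Message
	}
	return e.Message + ": " + e.Cause.Error()
}

// Unwrap exposes the cause so errors.Is and errors.As can walk the chain.
// It returns an untyped nil when there is no cause.
func (e *toolError) Unwrap() error {
	if e.Cause == nil {
		return nil
	}
	return e.Cause
}

// fromError reuses the first toolError found in err's chain; any other
// error is flattened to a single toolError holding its message.
func fromError(err error) *toolError {
	if err == nil {
		return nil
	}
	var te *toolError
	if errors.As(err, &te) {
		return te
	}
	return &toolError{Message: err.Error()}
}

// newToolErrorWithCause wraps cause under a new summary message.
func newToolErrorWithCause(msg string, cause error) *toolError {
	return &toolError{Message: msg, Cause: fromError(cause)}
}

func main() {
	root := errors.New("connection refused")
	first := newToolErrorWithCause("upsert failed", root)
	second := newToolErrorWithCause("retry failed", first)

	fmt.Println(second)                   // full chain, outermost first
	fmt.Println(errors.Is(second, first)) // the earlier failure is still reachable
}
```

Because each retry wraps the previous failure instead of replacing it, a planner or UI can show the latest summary while still reaching the original cause.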

RetryHint
Planner-side hint used by the runtime and policy engine:

type RetryHint struct {
    Reason             RetryReason
    Tool               tools.Ident
    RestrictToTool     bool
    MissingFields      []string
    ExampleInput       map[string]any
    PriorInput         map[string]any
    ClarifyingQuestion string
    Message            string
}

Common RetryReason values:

invalid_arguments – payload failed validation (schema/type).
missing_fields – required fields are missing.
malformed_response – tool returned data that could not be decoded.
timeout, rate_limited, tool_unavailable – execution/infra issues.

The runtime converts planner hints into a runtime/policy form (policy.RetryHint) with the same fields; policy engines and UIs can then react (adjust caps, disable tools, display repair suggestions).
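
As a rough sketch of how a policy layer might react to these reasons, the example below maps each reason string to a coarse action. The retryReason and action types, the constant names, and the mapping itself are all assumptions made for illustration; goa-ai does not prescribe this mapping.

```go
package main

import "fmt"

// retryReason mirrors the reason strings listed above. Local stand-in only.
type retryReason string

const (
	reasonInvalidArguments  retryReason = "invalid_arguments"
	reasonMissingFields     retryReason = "missing_fields"
	reasonMalformedResponse retryReason = "malformed_response"
	reasonTimeout           retryReason = "timeout"
	reasonRateLimited       retryReason = "rate_limited"
	reasonToolUnavailable   retryReason = "tool_unavailable"
)

// action is a hypothetical policy decision taken after seeing a hint.
type action string

const (
	actionSuggestRepair action = "suggest_repair"
	actionLowerCap      action = "lower_retry_cap"
	actionDisableTool   action = "disable_tool"
	actionNone          action = "none"
)

// react picks an action for a reason. The mapping is an illustrative
// assumption: payload problems are repairable, infrastructure problems
// reduce how often the tool is retried, and an unavailable tool is disabled.
func react(r retryReason) action {
	switch r {
	case reasonInvalidArguments, reasonMissingFields:
		return actionSuggestRepair
	case reasonTimeout, reasonRateLimited:
		return actionLowerCap
	case reasonToolUnavailable:
		return actionDisableTool
	default:
		return actionNone
	}
}

func main() {
	for _, r := range []retryReason{
		reasonMissingFields, reasonRateLimited, reasonToolUnavailable, reasonMalformedResponse,
	} {
		fmt.Printf("%s -> %s\n", r, react(r))
	}
}
```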

Where Validation Comes From

Validation is design-driven:

In your Goa design, you describe tool payloads (Args) with:
- Attribute types,
- Required(...),
- constraints (MinLength, Maximum, Enum, etc.),
- descriptions and examples.
Goa-AI codegen emits:
- typed payload/result structs, and
- validators/codecs that enforce those rules at the tool boundary.

When a tool payload fails validation (e.g., a missing required field), the generated code and runtime can:

produce a ToolError with a concise message, and
attach a RetryHint that tells the planner exactly how to repair the call.

You do not need to write custom parsers for error strings; the pattern is standardized.
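
The following sketch shows the idea in miniature: a hand-written check for required fields that produces a hint instead of an opaque error string. In a real project the generated validators do this work; the retryHint struct, the checkRequired helper, and the "records.upsert" tool name are hypothetical stand-ins for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// retryHint is a local stand-in holding a subset of the RetryHint fields
// shown earlier. It is NOT the goa-ai type.
type retryHint struct {
	Reason        string
	Tool          string
	MissingFields []string
	PriorInput    map[string]any
	Message       string
}

// checkRequired reports which required fields are absent from payload.
// It returns nil when the payload is complete. Missing fields are listed
// in the order they appear in required.
func checkRequired(tool string, required []string, payload map[string]any) *retryHint {
	var missing []string
	for _, field := range required {
		if _, ok := payload[field]; !ok {
			missing = append(missing, field)
		}
	}
	if len(missing) == 0 {
		return nil
	}
	return &retryHint{
		Reason:        "missing_fields",
		Tool:          tool,
		MissingFields: missing,
		PriorInput:    payload,
		Message:       "missing required fields: " + strings.Join(missing, ", "),
	}
}

func main() {
	payload := map[string]any{"id": "42"}
	hint := checkRequired("records.upsert", []string{"id", "name", "email"}, payload)
	if hint != nil {
		fmt.Println(hint.Reason, hint.MissingFields)
		fmt.Println(hint.Message)
	}
}
```

The planner receives field names it can act on directly, so no string parsing is needed to decide what to ask for or fill in.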

ToolResult: Carrying Errors and Hints

Every tool invocation surfaces as a planner.ToolResult:

type ToolResult struct {
    Name          tools.Ident
    Result        any
    Error         *ToolError
    RetryHint     *RetryHint
    Telemetry     *telemetry.ToolTelemetry
    ToolCallID    string
    ChildrenCount int
    RunLink       *run.Handle
}

Key points:

On success:
- Result contains the decoded result (or raw JSON as a fallback).
- Error and RetryHint are nil.
On validation or execution failure:
- Error describes what went wrong in a planner/UI-friendly way.
- RetryHint (when present) explains how to try again.

The runtime:

receives a logical tool output from your executor or activity,
wraps any failure message into a ToolError,
attaches any RetryHint produced at the boundary, and
publishes a ToolResultReceivedEvent (and the corresponding stream.ToolEnd event) with a structured toolerrors.ToolError payload for UIs.

Pattern: Auto-Repairing Invalid Tool Calls

The recommended pattern is:

Design tools with strong payload schemas (Goa design).
Let executors/tools surface validation failures as ToolError + RetryHint instead of panicking or hiding errors.
Teach your planner to:
- inspect ToolResult.Error and ToolResult.RetryHint,
- repair the payload when possible (or ask the user), and
- retry the tool call if appropriate.

Example Executor (Pseudo-Code)

Conceptual executor for a tool that upserts a record:

func Execute(ctx context.Context, meta runtime.ToolCallMeta, call planner.ToolRequest) (*planner.ToolResult, error) {
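    // Decode and validate using the generated codec.
    // Note: `spec` (generated tool specs) and `client` (the backing service)
    // are placeholders in this pseudo-code.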
    args, err := spec.UnmarshalUpsertPayload(call.Payload)
    if err != nil {
        // Validation / decode error → structured ToolError + RetryHint
        return &planner.ToolResult{
            Name:  call.Name,
            Error: planner.NewToolError("invalid payload"),
            RetryHint: &planner.RetryHint{
                Reason:         planner.RetryReasonInvalidArguments,
                Tool:           call.Name,
                RestrictToTool: true,
                Message:        "Payload did not match the expected schema.",
                // Optionally: MissingFields, ExampleInput, ClarifyingQuestion...
            },
        }, nil
    }

    // Call the underlying service
    res, err := client.Upsert(ctx, args)
    if err != nil {
        return &planner.ToolResult{
            Name:  call.Name,
            Error: planner.ToolErrorFromError(err),
        }, nil
    }

    // Success: return result as usual
    return &planner.ToolResult{
        Name:   call.Name,
        Result: res,
    }, nil
}

Example Planner Logic

On PlanResume, your planner can inspect the last tool result:

func (p *MyPlanner) PlanResume(ctx context.Context, in *planner.PlanResumeInput) (*planner.PlanResult, error) {
    if len(in.ToolResults) == 0 {
        // Nothing to do
        return &planner.PlanResult{}, nil
    }

    last := in.ToolResults[len(in.ToolResults)-1]
    if last.Error != nil && last.RetryHint != nil {
        hint := last.RetryHint

        switch hint.Reason {
        case planner.RetryReasonMissingFields, planner.RetryReasonInvalidArguments:
            // Strategy 1: Ask the user for clarification (AwaitClarification)
            return &planner.PlanResult{
                Await: &planner.Await{
                    Clarification: &planner.AwaitClarification{
                        ID:               "fix-" + string(hint.Tool),
                        Question:         hint.ClarifyingQuestion,
                        MissingFields:    hint.MissingFields,
                        RestrictToTool:   hint.Tool,
                        ExampleInput:     hint.ExampleInput,
                        ClarifyingPrompt: hint.Message,
                    },
                },
            }, nil

        case planner.RetryReasonTimeout, planner.RetryReasonRateLimited:
            // Strategy 2: Back off or switch tools (implementation-specific)
        }
    }

    // Default: synthesize a final answer from tool results
    // ...
    return &planner.PlanResult{/* FinalResponse, next ToolCalls, ... */}, nil
}

This pattern lets the LLM (via your planner code) repair tool calls rather than failing the run outright.
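
The planner example above asks the user for clarification. When a hint carries both PriorInput and ExampleInput, another option is to attempt a repair first. The sketch below shows one possible deterministic repair step; the retryHint struct is a local stand-in with only the fields it needs, and the repair helper is hypothetical rather than part of goa-ai.

```go
package main

import "fmt"

// retryHint is a local stand-in holding only the fields this sketch uses.
type retryHint struct {
	MissingFields []string
	ExampleInput  map[string]any
	PriorInput    map[string]any
}

// repair returns a copy of PriorInput with every missing field filled
// from ExampleInput. The second result is false when at least one missing
// field has no example value, in which case the caller should fall back
// to asking the user.
func repair(h retryHint) (map[string]any, bool) {
	out := make(map[string]any, len(h.PriorInput)+len(h.MissingFields))
	for k, v := range h.PriorInput {
		out[k] = v
	}
	for _, field := range h.MissingFields {
		v, found := h.ExampleInput[field]
		if !found {
			return nil, false
		}
		out[field] = v
	}
	return out, true
}

func main() {
	h := retryHint{
		MissingFields: []string{"name"},
		ExampleInput:  map[string]any{"name": "Ada"},
		PriorInput:    map[string]any{"id": "42"},
	}
	if payload, ok := repair(h); ok {
		fmt.Println("retry with:", payload)
	} else {
		fmt.Println("ask the user for:", h.MissingFields)
	}
}
```

Filling fields from ExampleInput is only reasonable when the example values are safe defaults. In most agents the hint would instead be fed back to the model so that it produces the corrected payload, with a helper like this serving as a guard that decides between retrying and asking the user.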

How Goa-AI Uses Goa to Make This Work

Goa-AI leverages Goa’s design and codegen in several ways:

Design-time validation
The Goa design for a tool’s payload defines:
- required vs optional fields,
- allowed values, formats, and ranges,
- descriptions and examples.
Generated validators and codecs
For each tool payload:
- Goa-AI emits typed structs and validation logic.
- Validation errors at the boundary can be mapped into concise messages and field-level hints.
Runtime hint building
The runtime and activity layer:
- capture validation failures at the tool boundary (before executing the underlying service),
- preserve error chains as ToolError,
- attach any available RetryHint to the ToolResult, and
- avoid aborting the entire workflow when a single tool call is invalid.

Because the design and runtime agree on schemas, you can rely on a uniform error/hint contract for all tools, whether they are:

service-backed,
agent-backed (agent-as-tool),
or MCP-backed.
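
To make the "do not abort the workflow" point concrete, here is a simplified sketch of a loop that records each failure as a result and keeps going. The call, toolResult, and toolFunc types and the runAll helper are hypothetical stand-ins; the real runtime and activity layer are considerably richer.

```go
package main

import (
	"errors"
	"fmt"
)

// call and toolResult are simplified, hypothetical stand-ins for a tool
// request and its result. They are NOT the goa-ai types.
type call struct {
	Name    string
	Payload map[string]any
}

type toolResult struct {
	Name   string
	Result any
	Err    string // empty on success
}

type toolFunc func(payload map[string]any) (any, error)

var errInvalid = errors.New(`invalid payload: missing field "id"`)

// runAll executes every call in order. A failing or unknown tool yields a
// result carrying the error text, and the loop moves on to the next call
// instead of returning early.
func runAll(tools map[string]toolFunc, calls []call) []toolResult {
	results := make([]toolResult, 0, len(calls))
	for _, c := range calls {
		fn, ok := tools[c.Name]
		if !ok {
			results = append(results, toolResult{Name: c.Name, Err: "unknown tool"})
			continue
		}
		out, err := fn(c.Payload)
		if err != nil {
			results = append(results, toolResult{Name: c.Name, Err: err.Error()})
			continue
		}
		results = append(results, toolResult{Name: c.Name, Result: out})
	}
	return results
}

func main() {
	tools := map[string]toolFunc{
		"upsert": func(p map[string]any) (any, error) {
			if _, ok := p["id"]; !ok {
				return nil, errInvalid
			}
			return "stored", nil
		},
	}
	calls := []call{
		{Name: "upsert", Payload: map[string]any{}},
		{Name: "upsert", Payload: map[string]any{"id": "42"}},
	}
	for _, r := range runAll(tools, calls) {
		fmt.Printf("%s result=%v err=%q\n", r.Name, r.Result, r.Err)
	}
}
```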

Best Practices

Put validations in the design, not in planners
Use Goa’s attribute DSL (Required, MinLength, Enum, etc.). Let generated validators and codecs enforce them; surface structured errors and hints at the tool boundary.

Return ToolError + RetryHint from executors
Prefer:

return &planner.ToolResult{
    Name:  call.Name,
    Error: planner.NewToolError("..."),
    RetryHint: &planner.RetryHint{ /* structured guidance */ },
}, nil

over panics or plain error returns.

Keep hints concise but actionable
Focus on:
- which fields are missing/invalid,
- a short clarifying question,
- a small ExampleInput map with corrected values.

Teach planners to read hints
Make RetryHint handling a first-class part of your planner:
- repair and retry when safe,
- ask the user via AwaitClarification when needed,
- fall back to final answers otherwise.

Avoid re-validating inside services
Goa-AI assumes validation happens at the tool boundary; internal service logic should trust validated inputs and focus on domain behavior.

For a deeper dive into validation-driven retry patterns and future enhancements (schema-aware hint builders, richer field issues), consult the Goa-AI runtime and planner packages in the goa-ai repo (for example, runtime/agent/toolerrors and runtime/agent/planner).