AI Integration Docs

Sprint 3 — end-to-end account of how AI was integrated into Wally as a paywall generation engine: what we discovered, what was broken, what we built to fix it, and the framework that now governs every paywall the system produces.

Capability Discovery — What AI can actually produce

The question we needed to answer

Before writing any rules or standards, we had to find out whether AI could produce store-compliant paywalls at all. Not theoretically — hands-on, with the real schema and the real renderer. The risk of skipping this step was writing a framework around capabilities the model didn't have.

How the generation pipeline works

A user prompt flows into app/api/generate/route.ts, which calls a language model (Groq / Llama-3.3-70B) with a structured system prompt. The model must return a single JSON object matching the PaywallPayload type defined in lib/types.ts. That output passes through two validation layers before it can be used:

Layer 1 (AJV schema): Checks types, required fields, format:uri, enum values

Layer 2 (policyChecks.ts): Checks compliance rules, structure, contrast, claims

Failure handling: Surface the specific violation; never silently discard output
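The flow above can be sketched as a small orchestration function. This is a minimal illustration, not the code in route.ts: the `Violation`, `ValidationResult`, and `Check` types and the `validatePaywall` name are hypothetical, the two checks are passed in as plain functions standing in for the compiled AJV validator and policyChecks.ts, and running Layer 2 only on schema-valid payloads is an assumption of this sketch.

```typescript
// Hypothetical types for this sketch; the real payload type is PaywallPayload in lib/types.ts.
type Layer = "schema" | "policy";
interface Violation { layer: Layer; message: string }
interface ValidationResult { ok: boolean; violations: Violation[] }

// A check takes a payload and returns human-readable violation messages.
type Check = (payload: unknown) => string[];

// Attach the originating layer to each message so failures are never anonymous.
function tag(layer: Layer, messages: string[]): Violation[] {
  return messages.map((message) => ({ layer, message }));
}

function validatePaywall(
  payload: unknown,
  schemaCheck: Check, // stand-in for the compiled AJV validator (Layer 1)
  policyCheck: Check, // stand-in for policyChecks.ts (Layer 2)
): ValidationResult {
  // Layer 1: structural validation. Assumption: stop here on failure,
  // because policy checks rely on the payload having the right shape.
  const schemaViolations = tag("schema", schemaCheck(payload));
  if (schemaViolations.length > 0) {
    return { ok: false, violations: schemaViolations };
  }
  // Layer 2: semantic validation. Every violation is returned to the caller.
  const policyViolations = tag("policy", policyCheck(payload));
  return { ok: policyViolations.length === 0, violations: policyViolations };
}
```

Returning the violations instead of a bare boolean is what lets the caller surface the specific failure rather than discarding the output.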

What AI gets right without help

JSON structure: Consistently follows the PaywallPayload shape when shown the schema

Copy quality: Produces benefit-focused, on-brand text across prompt types

Token usage: Uses {{price}}, {{trial}}, {{period}} tokens when explicitly instructed

Tracking events: Generates all 5 required events with pw_ prefix when told to

What AI gets wrong without constraints

The critical finding: the model treats the legal block as optional. In unconstrained runs, it was omitted in roughly 60% of outputs. When it did appear, it lacked privacy_url, terms_url, and the show_restore field — the three elements Apple and Google review teams check first. Block ordering was also unreliable: legal appeared mid-screen, CTA appeared after legal, benefits appeared after pricing. Schema validation passed all of these because AJV only checks types and required fields — it has no concept of block order or compliance semantics.
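A check for the three legal fields named above could look like the following. It is a sketch under assumptions: the `LegalBlock` interface and the `checkLegalBlock` and `isAbsoluteUrl` names are hypothetical, and only the field names privacy_url, terms_url, and show_restore come from this document.

```typescript
// Hypothetical shape; only the three compliance field names are taken from the text.
interface LegalBlock {
  type: "legal";
  body: string;
  privacy_url?: string;
  terms_url?: string;
  show_restore?: boolean;
}

// True when the value parses as an absolute URL.
function isAbsoluteUrl(value: string | undefined): boolean {
  if (!value) return false;
  try {
    new URL(value);
    return true;
  } catch {
    return false;
  }
}

// Returns one message per violation so the author can see exactly what is missing.
function checkLegalBlock(block: LegalBlock): string[] {
  const violations: string[] = [];
  if (!isAbsoluteUrl(block.privacy_url)) {
    violations.push("legal.privacy_url is missing or not a valid URL");
  }
  if (!isAbsoluteUrl(block.terms_url)) {
    violations.push("legal.terms_url is missing or not a valid URL");
  }
  // An absent show_restore is tolerated here; only an explicit false is rejected.
  if (block.show_restore === false) {
    violations.push("legal.show_restore must not be false");
  }
  return violations;
}
```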

Conclusion

AI generation is viable, but only with a structural contract enforced at two levels: a system prompt that prescribes exact block order and compliance requirements, and a policy check layer that catches what schema validation cannot. The capability exists; the constraints are what make it safe to deploy.

Gap Discovery — Where the system failed before we fixed it

What a gap audit means

A gap is a mismatch between what the framework requires and what the codebase enforces. Some gaps are silent: the schema accepted a timer block, but no component existed to render it, so the block simply vanished at runtime. Others are compliance gaps: the legal block had no structured fields for Privacy Policy or Terms of Use URLs, so those links existed only in free-text copy that AI could omit without triggering any check.

All 10 gaps — what they were and how they were resolved

G1 (High): TimerBlock had no renderer

✓ TimerBlock.tsx implemented in components/blocks/. Prefers deadline_utc over duration_seconds, stops at zero without looping, uses tabular-nums font for stable layout.

G2 (High): Legal links not schema-enforced

✓ privacy_url and terms_url added as format:uri fields on the Block interface and schema. policyChecks.ts validates their presence on every legal block.

G3 (High): Restore Purchases not renderer-enforced

✓ LegalBlock.tsx reads show_restore from the block. PaywallRenderer.tsx defaults to true if the field is absent. policyChecks.ts rejects show_restore: false.

G4 (Medium): No CTA contrast validation (WCAG AA)

✓ contrastRatio() added to policyChecks.ts using the WCAG relative luminance formula. Rejects any payload where cta_background / cta_text ratio is below 4.5:1.

G5 (Medium): body_size minimum too low (was 8px)

✓ Schema minimum raised to 10px. Apple review flags legal text below 10pt as intentionally unreadable.

G6 (Medium): Plan badge not implemented

✓ PricingBlock.tsx renders the badge field as a floating pill on bordered cards (top-edge, accent background) or an inline tag on standard cards. badge_color is optional and defaults to accent.

G7 (Medium): Sticky CTA not implemented

✓ PaywallRenderer.tsx separates blocks where position === "sticky" from inline blocks. Sticky CTAs render in a fixed footer panel with safe-area-inset-bottom padding. Inline content gets paddingBottom: 88 to avoid overlap.

G8 (Medium): Dismiss button unverified

✓ DismissButton component implemented in PaywallRenderer.tsx. Renders absolutely positioned top-right with a semi-transparent surface background. Requires onDismiss prop, which is absent in preview mode and present in live mode.

G9 (High): System prompt had no block ordering rules

✓ policyChecks.ts now validates: hero at index ≤ 1, timer only at index 0, benefits before pricing, CTA immediately before legal, legal always last. These five structural checks run on every generated payload.

G10 (Medium): English forbidden claims not checked

✓ BLOCKED_CLAIMS in policyChecks.ts extended with English equivalents: "guaranteed", "risk-free", "100% safe", "free forever", "last chance". Both Spanish and English checked on every payload.
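As an illustration of the G10 fix, a case-insensitive blocklist scan might look like this. The sketch uses only the five English phrases quoted above; the Spanish entries are not listed in this document, so they are left out. The `BLOCKED_CLAIMS_EN` and `findBlockedClaims` names are hypothetical, and the substring matching is an assumption about how BLOCKED_CLAIMS is applied.

```typescript
// The five English phrases quoted in G10. The Spanish entries are not
// reproduced in this document and are therefore omitted from the sketch.
const BLOCKED_CLAIMS_EN: string[] = [
  "guaranteed",
  "risk-free",
  "100% safe",
  "free forever",
  "last chance",
];

// Returns every blocked phrase found in the text, ignoring case.
// Assumption: plain substring matching, which is simple but also flags
// words that merely contain a blocked phrase.
function findBlockedClaims(text: string, blocklist: string[] = BLOCKED_CLAIMS_EN): string[] {
  const haystack = text.toLowerCase();
  return blocklist.filter((claim) => haystack.includes(claim.toLowerCase()));
}
```

Returning the matched phrases, rather than a boolean, lets the violation message name the exact claim that has to be removed from the copy.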

The silent failure problem — why it matters

G1 is the most instructive gap. Timer blocks were fully defined in the schema, generatable by AI, and would pass AJV validation — but at render time, PaywallRenderer.tsx hit the default: return null branch in its switch statement and silently dropped the block. No error, no warning. The paywall would look complete in preview and broken in production. The fix was both the component and a discipline: every block type in the schema must have a renderer, and new block types must be added to both simultaneously.
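One way to back that discipline with the compiler is an exhaustive switch, sketched below. The `BlockType` union and the `rendererFor` function are hypothetical (the real union lives in lib/types.ts), and HeroBlock, BenefitsBlock, and CtaBlock are assumed component names; only TimerBlock, PricingBlock, and LegalBlock appear in this document.

```typescript
// Hypothetical union of block types for this sketch.
type BlockType = "hero" | "benefits" | "pricing" | "cta" | "legal" | "timer";

// Maps a block type to the name of the component that renders it.
function rendererFor(type: BlockType): string {
  switch (type) {
    case "hero":
      return "HeroBlock"; // assumed name
    case "benefits":
      return "BenefitsBlock"; // assumed name
    case "pricing":
      return "PricingBlock";
    case "cta":
      return "CtaBlock"; // assumed name
    case "legal":
      return "LegalBlock";
    case "timer":
      return "TimerBlock";
    default: {
      // If a member is added to BlockType without a case above, this
      // assignment no longer type-checks, so the gap is caught at build time.
      const unhandled: never = type;
      // At runtime, fail loudly instead of returning null and dropping the block.
      throw new Error(`No renderer for block type: ${String(unhandled)}`);
    }
  }
}
```

The `never` assignment turns a missing renderer into a compile error, and the thrown error replaces the silent `return null` for any unexpected value that reaches the switch at runtime.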

Implementation — The validation pipeline that enforces everything

How the two-layer validation works

Every paywall, whether AI-generated or hand-authored, must pass both layers before it can be saved or deployed. Layer 1 is structural: AJV compiles schemas/paywall.schema.json at startup and checks types, required fields, enum values, URI formats, and numeric minimums. Layer 2 is semantic: lib/policyChecks.ts runs the 13 check categories listed below, several of which contain more than one individual check. These checks require understanding the meaning of blocks, not just their shape.

What policyChecks.ts enforces today

No hardcoded prices: PRICE_REGEX matches currency symbols combined with digit patterns

No forbidden claims: 10-item blocklist, ES + EN, case-insensitive

No forbidden CTA text: continue / next / ok / accept rejected

vendor_product_id length: ≥ 3 characters on all product_refs

Rollout safety: rollout_percent ≤ 25%

Experiment completeness: test_id, axis, ttl_utc, rollback_variant_id all required

Tracking events: 5 required events present, pw_ prefix on all values

CTA contrast: WCAG AA 4.5:1, relative luminance formula

Block presence: hero, benefits, cta, legal all required

Block ordering: hero ≤ idx 1, timer at idx 0, benefits before pricing, CTA before legal, legal last

Legal fields: privacy_url and terms_url required, show_restore cannot be false

Legal copy: Body must mention renewal (renov/renewal) and cancellation (cancel)

Product–pricing alignment: Every product_ref must have a matching pricing block
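To make the first check in the list concrete, here is a sketch of hardcoded-price detection. The actual PRICE_REGEX is not reproduced in this document, so the `PRICE_PATTERN` below is an assumed stand-in covering a handful of currency symbols, and `hasHardcodedPrice` is a hypothetical helper name.

```typescript
// Assumed pattern, not the project's PRICE_REGEX: a currency symbol followed
// by digits ("$4.99"), or digits followed by a currency symbol ("9,99 €").
const PRICE_PATTERN = /[$€£¥]\s?\d+(?:[.,]\d{1,2})?|\d+(?:[.,]\d{1,2})?\s?[$€£¥]/;

// True when copy contains a literal price instead of a {{price}} token.
function hasHardcodedPrice(text: string): boolean {
  return PRICE_PATTERN.test(text);
}
```

Copy such as "{{price}} per {{period}}" passes because the tokens contain no currency symbol or digits, while a literal amount is flagged.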

The WCAG contrast implementation

The contrast check uses the WCAG 2.1 relative luminance formula rather than an approximation. Each 8-bit hex colour channel is scaled to the 0–1 range and converted to linear light via the sRGB transfer function: values ≤ 0.03928 are divided by 12.92, and values above it go through the power curve ((c + 0.055) / 1.055)^2.4. Luminance is then calculated as 0.2126R + 0.7152G + 0.0722B. The contrast ratio is (L_lighter + 0.05) / (L_darker + 0.05), and it is compared against 4.5:1. Any payload whose CTA button fails this check is rejected with the exact ratio in the error message so the author knows how far off it is.
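The calculation can be sketched as follows. The source names contrastRatio() but does not show its signature, so the signature here and the `channelToLinear`, `relativeLuminance`, and `meetsWcagAA` helpers are assumptions. Following the WCAG definition, 0.05 is added to both luminances before dividing.

```typescript
// Converts an 8-bit sRGB channel (0-255) to linear light, per WCAG 2.1.
function channelToLinear(channel8bit: number): number {
  const c = channel8bit / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Relative luminance of a 6-digit hex colour such as "#1a73e8".
function relativeLuminance(hex: string): number {
  const match = /^#?([0-9a-f]{6})$/i.exec(hex.trim());
  if (!match) {
    throw new Error(`Expected a 6-digit hex colour, got "${hex}"`);
  }
  const value = parseInt(match[1], 16);
  const r = channelToLinear((value >> 16) & 0xff);
  const g = channelToLinear((value >> 8) & 0xff);
  const b = channelToLinear(value & 0xff);
  return 0.2126 * r + 0.7152 * g + 0.0722 * b;
}

// Contrast ratio between two colours; the argument order does not matter.
function contrastRatio(hexA: string, hexB: string): number {
  const a = relativeLuminance(hexA);
  const b = relativeLuminance(hexB);
  const lighter = Math.max(a, b);
  const darker = Math.min(a, b);
  return (lighter + 0.05) / (darker + 0.05);
}

// WCAG AA threshold for normal-size text.
function meetsWcagAA(background: string, text: string): boolean {
  return contrastRatio(background, text) >= 4.5;
}
```

Black on white gives the maximum ratio of 21:1, and two identical colours give 1:1, which makes both convenient sanity checks for any implementation.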

Acceptance criteria — all met

AC-1: All high-priority gaps resolved ✓ Done

G1 (TimerBlock), G2 (legal schema fields), G3 (restore enforcement), and G9 (block ordering) are all resolved and tested. The renderer no longer silently drops any defined block type.

AC-2: Automated checks extended ✓ Done

policyChecks.ts now covers structural ordering, legal field presence, WCAG contrast, experiment completeness, and bilingual forbidden claims — none of which existed before this sprint.

AC-3: Alignment report ✓ Done

G9 and G10 were previously misclassified in the gap registry. G9 was labelled "plan toggle", but the real gap was block ordering in the system prompt. G10 was labelled as icon style, a low-impact item. Both entries are now either resolved or explicitly deferred.

Standardization — The framework that governs every paywall

Why a formal framework was necessary

Without a single source of truth, Product, Design, and Engineering each had different mental models of what a valid paywall looked like. Designers would add elements that had no schema field. Engineers would merge paywalls that passed AJV but would fail App Store review. AI would generate structurally plausible but non-compliant output. The framework resolves all three problems with one shared contract.

The four framework documents

T1: Platform Policy Research & Compliance Checklist

Maps Apple App Store (14 rules) and Google Play (10 rules) subscription requirements directly to schema fields and policyChecks.ts checks. Every compliance rule has a code (A1–A14, G1–G10) that is referenced in policyChecks.ts violation messages so engineers know which policy a failure violates.

T2: Design Framework: Atomic Structure & Experimentation Dimensions

Defines the 5 mandatory zones (hero → benefits → pricing → cta → legal) and the 2 optional zones (social_proof, timer). Documents 14 experimentation dimensions — which ones can vary between variants (copy, badge, urgency, price framing, social proof) and which ones cannot (zone structure, legal fields, product_refs, tracking events, TTL).

T3: Technical Validation & Adapty Mapping

Traces every block type to its renderer component. Documents the gap registry (G1–G10) and the resolution status of each. Defines the implementation patterns for sticky CTA (position: sticky), floating badge (absolute top:-14px), and timer (deadline_utc preferred, stops at zero).

T4: Paywall Design Framework & AI System Specification

The single source of truth. Combines T1–T3 into a unified reference and adds the production-ready 26-rule AI system prompt. Rules cover: block structure, compliance, copy constraints, experiment limits, tracking requirements, and output format. This is the prompt that should live in app/api/generate/route.ts.

The 5 mandatory zones — what the renderer enforces

Every list or cards layout paywall must contain these five zones in this order. policyChecks.ts verifies that all five are present and correctly sequenced on every save.

Zone 1 (hero): Index 0, or index 1 if timer precedes it. One hero maximum.

Zone 2 (benefits): Always before the first pricing block. benefitsIdx < pricingIdx.

Zone 3 (pricing): One block per product_ref. product_ref_id must resolve to a real ref.

Zone 4 (cta): Immediately before legal. ctaIdx + 1 === legalIdx.

Zone 5 (legal): Always last. privacy_url, terms_url, show_restore required.
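The sequencing rules for the zones above can be sketched as one function over the block list. This is an illustration rather than the code in policyChecks.ts: the `Block` shape and `checkBlockOrder` name are hypothetical, and the sketch assumes block presence is verified by a separate check, so each ordering rule is skipped when a block it refers to is absent.

```typescript
// Minimal hypothetical block shape; only the type matters for ordering.
interface Block { type: string }

// Returns one message per ordering violation. Assumption: presence is checked
// elsewhere, so a rule is skipped when a block it refers to is missing.
function checkBlockOrder(blocks: Block[]): string[] {
  const types = blocks.map((block) => block.type);
  const violations: string[] = [];

  const timerIdx = types.indexOf("timer");
  const heroIdx = types.indexOf("hero");
  const benefitsIdx = types.indexOf("benefits");
  const pricingIdx = types.indexOf("pricing"); // first pricing block
  const ctaIdx = types.lastIndexOf("cta");
  const legalIdx = types.lastIndexOf("legal");

  if (timerIdx > 0) {
    violations.push("timer is only allowed at index 0");
  }
  if (types.filter((t) => t === "hero").length > 1) {
    violations.push("at most one hero block is allowed");
  }
  // Hero sits at index 0, or at index 1 when a timer occupies index 0.
  const expectedHeroIdx = timerIdx === 0 ? 1 : 0;
  if (heroIdx !== -1 && heroIdx !== expectedHeroIdx) {
    violations.push(`hero must be at index ${expectedHeroIdx}, found at ${heroIdx}`);
  }
  if (benefitsIdx !== -1 && pricingIdx !== -1 && benefitsIdx >= pricingIdx) {
    violations.push("benefits must come before the first pricing block");
  }
  if (ctaIdx !== -1 && legalIdx !== -1 && ctaIdx + 1 !== legalIdx) {
    violations.push("cta must be immediately before legal");
  }
  if (legalIdx !== -1 && legalIdx !== types.length - 1) {
    violations.push("legal must be the last block");
  }
  return violations;
}
```

A sequence such as timer, hero, benefits, pricing, pricing, cta, legal passes with no violations, while a payload that places legal mid-screen or the CTA after legal is reported with one message per broken rule.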

Generation status

The AI generation endpoint (app/api/generate/route.ts) is currently disabled — it returns HTTP 503 with a clear message. This is a deliberate decision: the 26-rule system prompt from T4 has not yet been deployed to replace the original prompt. Enabling generation before updating the prompt would produce paywalls that bypass the structural constraints the framework defines. The endpoint stays off until the T4 prompt is live.
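A gated handler of this kind might look like the sketch below. It is written under assumptions: the `GENERATION_ENABLED` flag, the `handleGenerate` name, and the message text are hypothetical, and the source only states that the endpoint returns HTTP 503 with a clear message. In a framework that uses file-based route handlers, a function like this would typically be exported under the HTTP method name.

```typescript
// Hypothetical flag for this sketch; flip it only once the T4 prompt is deployed.
const GENERATION_ENABLED: boolean = false;

// Returns 503 Service Unavailable with an explanatory JSON body while gated.
function handleGenerate(): Response {
  if (!GENERATION_ENABLED) {
    return new Response(
      JSON.stringify({
        error: "AI generation is disabled until the T4 system prompt is deployed.",
      }),
      { status: 503, headers: { "Content-Type": "application/json" } },
    );
  }
  // The generation path is intentionally left out of this sketch.
  return new Response(null, { status: 501 });
}
```

Failing with an explicit status and message keeps the gate visible to callers, in line with the rule that failures are surfaced rather than silently swallowed.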

Acceptance criteria — all met

AC-1: Four framework documents produced ✓ Done

T1–T4 exist and are live in the Standardization Docs section of this app. They cover Apple/Google policy rules, structural zones, block specs, experimentation dimensions, and the AI system prompt.

AC-2: System prompt ready ✓ Done

T4 Part 5.1 contains the production-ready 26-rule prompt. It is not yet deployed to route.ts — that is the next action. Generation is intentionally gated until the prompt is in place.

AC-3: Framework validated against codebase ✓ Done

Every structural rule in T2 and every compliance rule in T1 has a corresponding check in policyChecks.ts. The gap between framework and implementation is closed.

Where we are now

Schema: AJV-validated, additionalProperties: false, URI format checks, minimum: 10 on body_size

Policy checks: 13 automated checks covering compliance, structure, contrast, experiment, tracking

Renderer: All 6 block types implemented. Sticky CTA, floating badge, dismiss button, timer all live

Legal block: privacy_url and terms_url schema-enforced. Restore Purchases renderer-enforced

AI generation: Pipeline exists and works. Gated until T4 system prompt is deployed to route.ts

Framework docs: T1–T4 live in Standardization Docs. This page is the AI integration audit trail