Customer Discovery — 60 min

5 blocks, 60 minutes, structured.

The complete script. Five blocks, 60 minutes, all probes intact, both the skill-distribution wedge probe and the LLM-vs-deterministic-line question retained for higher-signal conversations. Use when a warm contact has given you the time, or for a second conversation after a 30-min first pass.

Present mode →

All scripts Jump to blocks Falsifiable claims (C1–C7)

Present mode opens a one-question-at-a-time slide deck for screenshare — questions only, none of the internal probes / claim anchors / don'ts.

Timing

Total 60 minutes: 5 for unstructured intros and 60 for the structured blocks. The intro absorbs greetings, the ask restatement, and role/team color — the structured blocks start from calibration on. Spend the time where it's allocated; don't trade out of Block 2.

Pre-block

Intros — 5 min

Greeting, thanks, restate the ask, confirm note-taking / recording. Let role and team color come out naturally; do not re-ask them in Block 1.

Block 1

Context — 10 min

Get a clean read on their role, team, and AI footprint before any framing of yours can color it.

Block 2 · load-bearing

The trigger event — 20 min

Surface where the org actually is on agents-acting-on-consequential-systems: shipped, tried-but-stalled, proposed-and-rejected, wanted-but-never-proposed, or not on the table. The branch matters as much as the answer. A specific recallable incident is the strongest signal, but a blocked proposal is also data; the only branch that kills C1 is 'no one's thought about it and there's no demand.'

Block 3

What they tried — 15 min

Surface the competing solutions they evaluated and why each got rejected or retained, and how their evaluation map handles capability sharing across teams — not just per-team tooling. The shape of their rejection list is the shape of the competitive landscape.

Block 4 · end only

The pitch test — 10 min

One shot at the pitch. Read the line, watch the reaction shape, capture which framing lands. This is the only block where you describe Aileron.

Block 5

Budget & authority — 5 min

Turn vague interest into a named buyer, a category, and a procurement shape. This is what makes the next conversation possible.

Block 1

Context

10 min

Goal

Get a clean read on their role, team, and AI footprint before any framing of yours can color it.

Questions to read

1.
Walk me through your role and what your team ships.
- Probe: How big is the eng org, roughly? How does platform or infra split out?
- Probe: Where does AI or agents fit in what you do today?
- Probe: What's your team's model strategy — frontier-only, mixed, anything local?
- Probe: How does your team share AI tooling or prompts with adjacent teams today? Anything formal, anything ad hoc?
2.
When you say 'agent' — what's the canonical example for you?

Why: Use their definition, not yours. Anchor every later question on whatever they say here. If two interviewees mean two different things, downstream coding is meaningless.
3.
Which of the systems you rely on do you think of as consequential — where taking an action is either irreversible or could have real business impact?
- Probe: Customer-facing systems? Billing, payments, financial systems? Production infra? Internal data stores?
- Probe: Where would a wrong action mean an incident, a refund, an outage — the kind of Slack thread you don't want to be on?
Why: Two reads. (a) Calibrates stakes for Block 2 — what counts as 'consequential' varies wildly. Some teams call a Slackbot consequential; others mean rolling deploys. (b) Surfaces the named consequential systems Aileron needs to connect to safely — roadmap intelligence that compounds with the integration list in Block 3.

Don't

× Don't describe Aileron.
× Don't define 'agent' for them.
× Don't ask 'have you thought about credentials' — that's leading into Block 2.
× Don't introduce 'Skills', 'Actions', or 'Connectors' as Aileron terms. If they use those words, follow.

Block 2

The trigger event

20 min Load-bearing Claims: C1C2C3

Goal

Questions to read

1.
Has anyone on your team actually put an agent — or a script that acts like one — in front of one of those consequential systems? If yes, walk me through the most recent time. If no, has anyone tried, wanted to, or proposed it?
- Probe: Shipped: when was this, roughly? What went sideways, or what almost did? Who pushed for it, who pushed back?
- Probe: Tried and didn't ship: where did it stall, who pulled the plug, what was the reason given?
- Probe: Proposed and rejected: who proposed it, who said no, what was the reason cited?
- Probe: Wanted to but never proposed: what were they trying to achieve, and why hasn't a proposal landed — would it die on arrival, or is the need not concrete yet?
- Probe: Was it (or would it be) one person's machine, a team substrate, a shared repo of prompts — what's the surface?
Why: C1 lives here, and the branch matters as much as the answer. SHIPPED: the incident details = direct C1 evidence. STALLED IN BUILD: what stopped it = the gap Aileron fills. REJECTED IN PROPOSAL: the reason cited = the governance / attribution / determinism gap. WANTED BUT NEVER PROPOSED: the desired outcome = the job-to-be-done; the reason no proposal landed = the organizational barrier. NEVER WANTED / NEVER THOUGHT ABOUT IT: C1 is dead for this org.
2.
Has security, compliance, or risk had a conversation about agents-in-the-loop in your org? What surfaced it, and what was the outcome?
- Probe: A near-miss? An incident elsewhere? A vendor pitch? A policy update from above?
- Probe: What was the decision — greenlight with constraints, red light, defer-to-later?
- Probe: What changed in your stack, your policy, or your roadmap as a result?
Why: C3 lives here. The decision-forcing event is what separates 'we should think about this' from 'we did something about this'. Reframed so it works whether they've shipped or not — a policy decision made absent a shipped agent still counts as C3 evidence.
3.
If you're running any agents or scripts at all on your team — including non-consequential ones, dev environments, personal automations — have you seen the same one behave differently on a teammate's machine than on yours? When was the last time that bit someone?
- Probe: What was different — model, prompt, env vars, the integrations themselves?
- Probe: Did anyone try to standardize it? What happened?
Why: Free-text capture. Probes the skills-sprawl pattern (Andrew Gordon's four-org observation, 2026-05-28). Broadened beyond consequential-system exposure so it works for orgs that haven't shipped there yet — skill sprawl is upstream of consequential-system exposure. Candidate for a formalized claim under issue #6 if three or more interviewees describe the pattern unprompted.
4.
If I asked you who owns the question of what an agent is allowed to do in your org — who would that be? Does that role exist by name today?

Why: C2 lives here. A named owner with budget is the difference between pain that's real and pain that gets bought against. Watch for two owners: one for 'agent acting on consequential systems' governance (sec/compliance lens) and one for 'AI rollout across the org' (platform/enablement lens). Capture both if present — two owners means two procurement shapes.

Don't

× Don't accept abstract worries. If they answer with 'we'd worry about...', steer back: 'has that worry actually shown up in a proposal, a near-miss, or a conversation that mattered?' Speculation isn't data. The wanted-to-but-didn't branch IS fine — that's a real organizational moment with a real blocker. 'We'd worry someday' is not.
× Don't describe Aileron's approach. The pitch is Block 4.
× Don't lead toward security framing. If they raise it, follow. If they don't, don't.
× Don't ask 'do you have skill sprawl' — name the symptom (different behavior on a teammate's machine), let them name the pattern.

Block 3

What they tried

15 min Claims: C4C5C6

Goal

Questions to read

1.
When this came up, what did you reach for first?
- Probe: Internal scripts? Vendor tools? Hosted offerings?
- Probe: What did you actually end up running?
2.
Walk me through what you evaluated and what got rejected. What was wrong with each option?

Why: C4 lives here. If their rejected list includes Anthropic Managed Agents or OpenAI Agent Builder for specific reasons, the runtime-layer thesis sharpens. If they cite those as sufficient, C4 weakens.
3.
How do agent credentials flow today? Where do approvals live, if anywhere?

Why: Concrete probe behind C1. If the answer is "they have the same env vars as the human", that is the operational gap.
4.
Are you running anything customer-operated — self-hosted in your own VPC, on-prem — for this category? Or is it all SaaS?

Why: C5 lives here. If buyers prefer SaaS only for this category, the v4 customer-operated ordering needs revisiting.
5.
What systems does the agent — or the script — actually touch today? Walk me through the integration list.
- Probe: Which of those were already there as off-the-shelf integrations, and which did you have to wire up yourselves?
- Probe: Anything you needed that didn't have a clean integration story — something proprietary, internal-only, or that nobody's built a public connector for?
- Probe: What was the longest detour you took to talk to one of those systems?
Why: Two reads. (a) The concrete integration surface — the named services Aileron must support out of the box to land in this org; this is roadmap intelligence. (b) The rate at which they had to wire up unknown-up-front services. High rate means the connector-hub thesis needs a bring-your-own-connector affordance; low rate sharpens C6.
6.
Do you have a connector or integrations layer for agent tools that you'd describe as yours? Something you built and would not give up?

Why: C6 lives here. If they have one they like, the curated-connectors thesis softens. If they hate maintaining it, it sharpens.
7.
If a teammate wanted to use the same agent or script you've been running, what's the path? Copy a folder? An install script? A shared repo? Something built on top?
- Probe: What breaks when they try it?
- Probe: Who's maintaining whatever that path is — by name, not by team?
Why: Free-text capture. Cleanest non-leading probe for the Homebrew-for-skills wedge (Andrew Gordon, 2026-05-28). The answer shape tells you whether the org has an internal 'skills distribution' pain point or not.
8.
For the workflows you've actually shipped — which parts are the LLM doing the work, and which parts are deterministic code the LLM is just calling?
- Probe: Where did you draw that line, and why there?
- Probe: Anything you tried to make the LLM do that you eventually moved into deterministic code?
Why: Tests the 'agents express intent; execution happens at another layer' framing without naming it. If they've already drawn the line themselves, the abstract-away-agents pitch lands. If everything is LLM-end-to-end, the cost/reliability thesis is less mature here.

Don't

× Don't mention Clawvisor, Infisical, Anthropic Managed Agents, Devin, Composio, or LangSmith unprompted. Note carefully when they raise these names — that list is your competitive map.
× Don't say 'connector hub' or 'runtime' — borrow their language back to them.
× Don't use 'Homebrew', 'substrate', or 'Skills' as Aileron terms — those are wedge hypotheses, not pitch fodder. Borrow only their language.
× Don't pitch deterministic-execution-as-architecture in this block. Capture only what they've already built.

Block 4

The pitch test

10 min End only

Goal

One shot at the pitch. Read the line, watch the reaction shape, capture which framing lands. This is the only block where you describe Aileron.

Questions to read

1.
If there were a substrate that let your teams share the same vetted skills, gated approvals where they mattered, kept the irreversible actions deterministic instead of LLM-decided, and gave you an audit trail per action — would that have helped here?
- Probe: Capture the reaction shape: enthusiasm, skepticism, clarification questions, silence.
- Probe: If they ask a clarifying question: answer it briefly, then return to silence. Do not elaborate unsolicited.
Why: The line is fixed. Variability in your phrasing kills the signal. Read it once, slowly.
2.
Of those four — shared skills across teams, gated approvals, deterministic execution, action-level audit — which one would have moved the needle most for you?

Why: Positioning signal. Shared skills → D (Homebrew-for-skills / org-rollout wedge). Gated approvals → B (compliance). Deterministic execution → A (runtime / fleet) restated in the buyer's verb. Audit → B (compliance). All four equally → C (vendor-neutral hub). This is the cleanest read on which candidate to lock at the 20-conversation gate.
3.
If you imagine telling your VP about this in one sentence, what's the sentence?

Why: Their words for the value, not yours. If they can't compose one, the framing isn't sticky yet — useful feedback. If they can, the sentence is gold for the outreach next-iteration.

Don't

× Don't demo. Don't open a tab.
× Don't elaborate the architecture. Shell mediation, PTY interception, TEE attestation — none of this belongs here.
× Don't promise features. Every 'we could build that' is debt. The right response is 'tell me more about that need' and then write it down.
× Don't sell. Read the line, listen.
× Don't add a fifth dimension on the fly. The four are fixed.
× Don't explain what 'deterministic' means architecturally. If they ask, the one-sentence answer is: 'the LLM decides what to do, regular code does the doing.' Then return to silence.

Block 5

Budget & authority

5 min Claims: C2C7

Goal

Turn vague interest into a named buyer, a category, and a procurement shape. This is what makes the next conversation possible.

Questions to read

1.
Who would sign off on something like this in your org?
- Probe: Have they signed off on anything similar recently? What was it?
- Probe: Is that person someone I could talk to directly, or does this need to come through you?
- Probe: Is there a separate person who owns 'rolling out AI capabilities across the org' — distinct from whoever owns the security signoff?
Why: C2 sharpens. A named buyer who recently signed something adjacent is much warmer than a title. Two buyers (sec/compliance vs. platform/enablement) means two procurement shapes and two pitch routes.
2.
What's the size of category budget that exists for this kind of tooling, roughly? $5K, $50K, $500K, ranges?

Why: C7 lives here. $25K to $100K allocatable without full procurement is the v4 thesis. If everything drags into a 6-month enterprise cycle, C7 is dead.
3.
How does that kind of purchase happen at your org? Is it a line item, a security-tools envelope, a renewal, an experiment budget?

Why: Reveals the procurement shape. 'Experiment budget under the VPE' is fast. 'Annual security renewal' is slow. Both are signal.
4.
If we kept talking, what would the next step look like for you? A sandbox trial? An LOI? A pilot?

Why: Design-partner ladder. Their answer tells you if this is real or polite. Anyone who proposes a concrete next step is a candidate for #8 (LOI).

Don't

× Don't quote a price.
× Don't promise a free tier or a discount.
× Don't pitch the next step. Let them propose it. What they propose is the signal.