How to Use AI for Requirements Engineering: A 5-Step ASPICE Workflow with Prompts
It’s a random Thursday at my day job, and we’re in the big conference room, and the whiteboard is a hot mess. Two systems engineers are arguing on a screen share, plowing through forty pages of field-trial interview notes. The spreadsheet open in the shared tab is the free requirements tracking template from this site. The other window open on one of the laptops is an AI assistant, chewing through the transcripts and surfacing candidate user needs grouped by theme. (Whatever model they grabbed off the shelf that morning. I don’t ask.) They’re arguing about whether item seven is a distinct need or a restatement of item three. They pick a side. Edit the cell. Move on.
This is what requirements engineering looks like in 2026 on a team that doesn’t have a $50,000 ALM tool and doesn’t want one. And it mostly works.
Mostly.
Requirements work is reading, translating, checking, and tracing. The language model is astonishingly good at some of those. It is career-endingly bad at others. The short version:
Good at:
- Extracting user needs from unstructured input
- Checking existing requirements for quality
- Finding gaps and duplicates in a large set
- Drafting test cases from well-formed requirements
- Answering natural-language questions about a traceability graph
Bad at:
- Inventing user needs from a one-paragraph brief
- Generating requirements that need domain-specific numeric specificity
- Making safety-critical calls
- Producing anything a regulator will read with your name on it
Below is a five-step workflow for an ASPICE-regulated automotive program. The running example is an adaptive cruise control (ACC) module in a 2027-model-year midsize sedan, ASIL B. The principles transfer to AS9100, ISO 13485, IEC 62304, DO-178C, and any other domain with a traceability chain to defend.
Step 1: Extract user needs from real input
You need a corpus of material generated by contact with the physical world and the humans in it. For the ACC module, that’s eighteen hours of pilot-fleet ride-along transcripts, forty beta-driver survey responses, six NTSB incident summaries involving prior-generation ACC systems, and eighteen months of field service complaints from the current-model-year platform.
Feed it all to the model. Here’s the prompt:
You are reviewing transcripts from pilot-fleet driver interviews about
the adaptive cruise control (ACC) system in a 2027 midsize sedan
prototype.
Extract distinct user needs from the corpus. Rules:
- State each need in the form "the driver needs to [verb phrase]"
- Capture what the driver wants to experience, know, or be able to do
- Do not state how the system should behave
- Group by theme: comfort, trust, control, awareness, recovery
- Cite the interview timestamp or reference for each item
- Do not merge distinct needs into summary bullets
Output a table: Theme | User Need | Source Reference
The model returns sixty-odd candidate needs across the five themes. Examples:
- Trust: the driver needs to be able to predict when the car will brake
- Comfort: the driver needs to experience braking that feels controlled rather than abrupt
- Control: the driver needs to override the system without consciously thinking about how
- Awareness: the driver needs to know whether the system is currently active and in what mode
- Recovery: the driver needs to regain full steering and braking authority immediately on system fault
You will not ship this list. You will argue about it. Some items will get merged. Some will get split. Some will get thrown out because they are actually requirements in disguise (“the system shall brake within 1.5 seconds” is not a user need). Some will get added because you read a transcript the model clustered under a different theme and realized it was saying something else.
This argument is the work. The model just saved you the part where you read eighteen hours of transcripts to find it.
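The only code this step really needs is the unglamorous part: splitting eighteen hours of transcripts into chunks that fit a model's context window, with a little overlap so a need that spans a chunk boundary isn't lost. A minimal sketch; the chunk size and overlap are assumptions to tune for whatever model your team has open in that other tab:

```python
from typing import Iterator

def chunk_transcript(text: str, max_chars: int = 12_000, overlap: int = 500) -> Iterator[str]:
    """Yield overlapping chunks of a transcript, preferring to break on
    blank lines so no interview answer is split mid-thought."""
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        if end < len(text):
            # Back up to the last paragraph break inside the window, if any.
            cut = text.rfind("\n\n", start, end)
            if cut > start:
                end = cut
        yield text[start:end]
        if end == len(text):
            break
        # Overlap the next chunk with the tail of this one.
        start = max(end - overlap, start + 1)
```

Send each chunk with the extraction prompt above, then concatenate the tables; the deduplication in Step 2 cleans up needs that appear in more than one chunk.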
Step 2: Cluster and deduplicate
Your working list now has seventy-something items from a dozen source batches. Many overlap. Some are restatements. A few are misclassified. Run this prompt:
Here is a list of 74 candidate user needs for the ACC system. Cluster
duplicates and near-duplicates. For each cluster, propose a canonical
phrasing that does not lose information. Flag any entries that are
actually system requirements or implementation details rather than
user needs.
Output:
- Canonical User Need
- Source IDs merged into this canonical
- Flags (REQ-in-disguise, impl-detail, ambiguous)
You review the clusters, accept most of the merges, reject a few where the model flattened a real distinction, add back one or two the model missed entirely, and end up with something like thirty-two distinct canonical user needs grouped by theme and ready to go into the spreadsheet. These go in as first-class rows. They are yours now. Your name is on them.
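The model's clustering is worth spot-checking mechanically. A crude pre-pass you can run yourself, token-overlap (Jaccard) similarity with greedy single-link clustering, catches the blatant restatements before you ever argue about the subtle ones. The 0.6 threshold is an assumption; tune it on your own data:

```python
def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two need statements."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def cluster_near_duplicates(needs: list[str], threshold: float = 0.6) -> list[list[str]]:
    """Greedy single-link clustering: each need joins the first cluster
    whose representative it resembles; otherwise it starts a new one."""
    clusters: list[list[str]] = []
    for need in needs:
        for cluster in clusters:
            if jaccard(need, cluster[0]) >= threshold:
                cluster.append(need)
                break
        else:
            clusters.append([need])
    return clusters
```

This only catches surface-level restatements; the semantic near-duplicates ("predict when the car will brake" vs. "anticipate braking events") are exactly what the prompt above is for.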
Step 3: Translate user needs to requirements
This is the step the model does not drive. Ever. The model helps with the mechanical parts of the writing. The numbers come from you.
Take this user need: the driver needs to experience braking that feels controlled rather than abrupt.
Ask a chatbot to “write a system requirement for this” and you get something like:
The ACC system shall decelerate at a comfortable rate under normal operating conditions.
That requirement is useless. “Comfortable” is not a measurable criterion. A reviewer will reject it. A developer will implement whatever they feel like and call it done.
Here is the real requirement, as written by your team:
REQ-ACC-114: Under normal operating conditions (driver in active ACC mode, no lead-vehicle emergency deceleration event detected), the ACC system shall decelerate at a longitudinal rate not exceeding 3.5 m/s² with jerk not exceeding 2.5 m/s³. Under detected emergency lead-vehicle deceleration, the system may apply up to 6.0 m/s² for a maximum duration of 1.2 s before initiating a handoff alert to the driver.
Where did those numbers come from?
- ISO 26262 ASIL B bounds inherited from your hazard analysis
- SAE J2399 guidance on ACC performance envelopes
- Ride-comfort studies from your chassis dynamics group
- Real-world logs from the prior-model-year platform instrumented with accelerometers
- An argument between a systems engineer and a safety engineer in a meeting three weeks ago
None of that exists in a chatbot’s training distribution. The model can help you format the requirement in EARS syntax, check that it’s unambiguous once written, and suggest alternative phrasings for readability. It cannot honestly generate the numeric bounds. It will pick numbers that sound right. Those numbers will go into your SWE.2 artifact and eventually into a homologation package and a type-approval submission.
Do this step by hand. Use AI as a typing aid.
Here’s the kind of prompt that’s actually useful at this stage, once you have a draft in front of you:
Here is a draft system requirement I've written for the ACC module.
Do not change the numeric bounds, operational conditions, or scope.
Check only:
- Is it stated in valid EARS syntax?
- Are there any ambiguous terms remaining?
- Does it have exactly one verifiable criterion, or more than one
(which would indicate it should be split)?
- Does the scoping clause fully specify when the requirement applies?
Do not rewrite. Only identify issues.
Short, bounded, no room for the model to invent anything.
Step 4: Quality-check the full requirement set
You now have a complete set of system requirements for the ACC module. Call it seventy-three rows. Run this:
Review the following 73 system requirements for the ACC module
(ASPICE SYS.2, ASIL B). For each, flag:
1. Testability: does it include a measurable, verifiable criterion?
2. Ambiguity: does it use terms like "fast," "reliable,"
"appropriate," "as needed," "when necessary," "user-friendly"?
3. EARS compliance: does it follow one of: Ubiquitous, Event-Driven,
State-Driven, Optional Feature, Unwanted Behavior?
4. Missing conditions: does it imply an operational mode, speed
range, or environmental condition that isn't stated?
5. Safety implications: does it touch behavior that should link to
a hazard analysis entry?
Do not rewrite. Do not suggest fixes. Only flag issues with severity
(High / Medium / Low) and a one-sentence explanation.
Output: REQ-ID | Issue Type | Severity | Explanation
The model returns a table. Some calls will be wrong. Many will be useful. A representative hit:
REQ-ACC-047 | Ambiguity | High | Uses “appropriate lane-change detection threshold” without numeric bound. Not testable as written.
You fix REQ-ACC-047 to: The ACC system shall detect lane-change behavior by an adjacent vehicle when that vehicle’s lateral velocity exceeds 0.8 m/s and its lateral position crosses within 0.5 m of the ego-lane boundary.
False positives are cheap to dismiss. True positives save you from the SYS.2 review meeting where the quality lead points at fifteen of your requirements and asks how you plan to verify any of them.
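The ambiguity check (item 2 in the prompt) is also worth backstopping with a deterministic pre-filter, so the model spends its review on the judgment calls. A sketch using the prompt's own term list plus a couple of assumed additions ("comfortable", "easily"); it emits rows in the prompt's output shape:

```python
import re

# Terms from the Step 4 prompt's ambiguity check, plus assumed additions.
AMBIGUOUS_TERMS = [
    "fast", "reliable", "appropriate", "as needed",
    "when necessary", "user-friendly", "comfortable", "easily",
]

def flag_ambiguity(req_id: str, text: str) -> list[tuple[str, str, str, str]]:
    """Rows shaped as REQ-ID | Issue Type | Severity | Explanation.
    Only the ambiguity check is implemented; the rest needs the model."""
    rows = []
    for term in AMBIGUOUS_TERMS:
        if re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE):
            rows.append((req_id, "Ambiguity", "High",
                         f'Uses "{term}" without a measurable bound.'))
    return rows
```

Run it over all seventy-three rows before the model pass and the REQ-ACC-047s of the world never reach the review table at all.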
Step 5: Query the connected graph
Steps 1 through 4 work with whatever AI assistant your engineers happen to have open in another tab. Step 5 is different. It requires a connected graph of everything your team produces: not just the requirements spreadsheet, but source code annotations, test execution records from CI and ATE and ATP, your Design BOM, your SOUP register, your SBOM, and the serial-number trail coming out of manufacturing. The graph is what makes the queries below answerable.
I built RTMify Live to be that graph.
Live is a single binary that runs on your laptop. It polls your requirements spreadsheet every 30 seconds (Google Sheets or local .xlsx), scans your working tree for inline requirement tags like // REQ-ACC-114, ingests test evidence from CI, ATE, ATP, and design verification through a drop-folder inbox or HTTP API, reads CSV and CycloneDX and SPDX from your PLM and build system, and assembles all of it into one queryable graph. It exposes that graph through a native MCP server. Point Claude, Cursor, or any MCP-compatible client at it and your entire product realization history becomes conversational.
Here is what that looks like, for the ACC module:
Show me every user need in the Trust theme that has no linked
system requirement.
REQ-ACC-114 changed last Thursday. Which test executions used the
prior version, and which ECU calibration sets are affected?
What's the blast radius if the 3.5 m/s² normal-operation
deceleration bound in SYS-ACC-14 changes by 10%? List impacted
requirements, tests, source modules, and Design BOM items.
Which SWE.4 unit design elements derived from SWE.2-ACC-033 have
no SWE.5 unit verification linkage?
Trace the full chain from the user need "driver needs to override
the system easily" down to source code annotations and unit tests.
Flag broken or stale links.
These are the questions you should be asking before every SYS.2 gate review, every hazard analysis update, and every ASPICE assessment. The chain from “why we built this” to “did this specific unit pass” should be one query in English, not a three-week consulting engagement against a proprietary database.
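Whatever tool holds the graph, the gap queries themselves are plain traversal. A sketch of the first query above, Trust-theme user needs with no linked system requirement, over a toy dict-of-edges graph; the node schema here is invented for illustration, not Live's:

```python
# Toy graph: nodes carry a type and theme, edges are (parent, child) links.
# Schema invented for illustration only.
nodes = {
    "UN-007": {"type": "user_need", "theme": "Trust"},
    "UN-012": {"type": "user_need", "theme": "Trust"},
    "UN-019": {"type": "user_need", "theme": "Comfort"},
    "REQ-ACC-114": {"type": "requirement"},
}
edges = [("UN-007", "REQ-ACC-114")]  # UN-012 is the gap

def unlinked_needs(nodes: dict, edges: list, theme: str) -> list[str]:
    """User needs in a theme with no outgoing requirement link --
    the first Step 5 query, as a traversal."""
    linked = {parent for parent, _ in edges}
    return sorted(
        nid for nid, n in nodes.items()
        if n["type"] == "user_need" and n.get("theme") == theme
        and nid not in linked
    )
```

The point of an MCP-fronted graph is that nobody on the team writes this function; they ask the question in English and the client does the traversal.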
Live runs entirely on your machine. No cloud. No account. No internet connection required. For engineers at automotive tier-1 suppliers, defense contractors, and medical device companies, this is the difference between “let me check with IT” and “it’s already running.” Perpetual license starts at $599, with a 30-day money-back guarantee.
A note on the template
The free RTM template on this site has User Needs and Requirements as distinct row types with a parent-child relationship. That structure is deliberate. A user need is the thing you can defend in front of a driver. A requirement is the thing you can defend in front of an ASPICE assessor. They are not the same artifact, and an AI workflow that collapses them produces output that looks complete and isn’t.
Keep them separate. Extract user needs from real input. Translate to requirements with care. Let the model help with the mechanical work and the quality checks. Let it answer questions about the graph. Do not let it generate either user needs or requirements from a vibe.
So
You, the human, do the hard work: the real thinking. Let the model do the typing.