Quick Answers

Frequently Asked Questions

Decision-making help for every selector and workflow. Click any question to expand.

Using the Composer
What does this tool actually do? +

QA Prompt Composer is a training and reference tool for understanding AI prompt structure. It shows you which options (purpose, role, coverage, guards) go into a quality QA prompt and how they combine — so you can build that awareness and apply it in your own AI agent workflow.

It also assembles a structured prompt you can copy and paste into an AI assistant (ChatGPT, Claude, Gemini, or any other). It does not call an AI itself.

The loop is:

  1. Pick a Purpose (e.g. “Test case writing” or “Bug reporting”) to see the relevant selectors.
  2. Optionally paste your raw material — a Jira ticket, steps to reproduce, an API spec. Leave it blank to explore the prompt structure without submitting content.
  3. Adjust selectors if you have a specific reason to (most defaults are calibrated for each purpose).
  4. Click Generate to see the assembled prompt structure.
  5. Copy the prompt, update your content as needed, and paste it into your AI agent.

Content sent to the server is only used to assemble the prompt. It is not stored, logged, or used for training. No account, no API key, no cost.

What does a structured prompt add compared to a freeform one? +

A freeform prompt like “write Gherkin test cases for this feature” tends to produce generic output because the AI has no persona, no explicit format constraint, no coverage instruction, and no grounding guard. It fills in the gaps with guesses.

The Composer shows you what those components look like and assembles them systematically:

Without Composer
“Write Gherkin test cases for TEST-4821. It’s about adding a trim column to the inventory CSV export.”

No role · No format spec · No grounding · No output constraint → generic, inconsistent output
With Composer
Persona: senior Test Case Author / Manual QA
Task: Gherkin BDD, happy path + negative + boundary
Source: your ticket, isolated from the instructions
Guard: no preamble, grounded in stated AC only

→ Consistent, format-correct output grounded in your actual ticket

The difference is most visible for complex or format-sensitive outputs — Gherkin scenarios, column-based test suites, Playwright specs, and bug reports with root-cause hypotheses. Understanding these components is also what you take with you when prompting any AI agent directly.

What’s the correct order to build a prompt? +

Work top to bottom through the three steps:

  1. Step 1 – Pick a Purpose. This sets sensible Role, Output, Reasoning, and Coverage defaults automatically. Most of the time you change nothing else.
  2. Step 2 – Add your content. Paste the ticket, requirement, logs, or scenario into the main textarea. Fill in optional sources (logs, schema, house-style example) only when they’re needed.
  3. Step 3 – Tune the selectors. Only change a selector if you can name a specific reason — wrong output format, need a different role, want to lock down tone. The defaults are calibrated; unnecessary changes add noise.

The prompt assembles live on the right panel. When it looks right, hit Copy prompt and paste it into your AI assistant.

When should I trust the defaults vs change the selectors? +

Trust the defaults as the starting point. Each Purpose was calibrated with the most useful combination. Change a selector only when you have a concrete reason:

  • Output format — your team uses column-based suites for audit evidence, not Gherkin
  • Role — the default role is close but missing a second failure dimension (e.g. data integrity on top of a flow test)
  • Coverage — the feature has complex role-based permissions that the default doesn’t cover
  • Guards — you’re pasting real customer data and need the Redact guard on

If you’re changing three or more selectors routinely for the same task type, save it as a preset so you don’t repeat that work.

What are built-in presets and when should I load one? +

Built-in presets are one-click configurations for the five most common daily QA workflows. Load them from the Presets ▾ menu in the left panel header.

Preset Best for
Gherkin from Jira TicketA new ticket that needs BDD scenarios immediately
Bug Report + Root CauseSomething broke and you need a dev-ready report with a root-cause hypothesis
Regression Smoke ChecklistBefore a release — confirming nothing broke across existing features
API Test Cases (Column Suite)Testing a REST endpoint for status codes, payload shape, auth, errors
Playwright E2E SpecConverting a manual scenario to a runnable TypeScript spec

All fields remain editable after loading. You can also save your own presets, export them as JSON, and share across browsers or team members.

How do I save and reuse my own configurations? +

Open the Presets ▾ menu, type a name in the “Save current configuration” field, and click Save. The entire state — purpose, all selectors, and context fields — is saved to your browser’s local storage.

  • Load: select from the dropdown and click Load
  • Delete: select and click Delete
  • Share across browsers / team: Export as a JSON file, then Import on another machine

Presets survive browser restarts but are browser-specific unless exported. Export regularly if the configuration matters.

Choosing a Purpose
I have a Jira ticket. “Requirement & Jira” or “Test case writing”? +

It depends on how well-specified the ticket is.

Use “Requirement & Jira” if:

  • The acceptance criteria are vague, missing, or written in business language with no measurable conditions.
  • You see words like “should work correctly”, “as expected”, or “user-friendly” with no concrete pass/fail condition.
  • There are open questions about edge cases or role-based behaviour.

Use “Test case writing” directly if:

  • The ticket has explicit, numbered acceptance criteria with observable outcomes.
  • Each AC item states a condition, an action, and an expected result.
The safest workflow

Always run Requirement & Jira first. It produces an ambiguity list — if the list is empty, the ticket is ready; if it has items, resolve them with the product owner before writing cases. Writing test cases against an ambiguous ticket wastes time.

What’s the difference between “Data generator” and “Data validation”? +

These are opposite directions:

Data Generator
Produces synthetic test data that does not yet exist.

You define the schema (field names, types, constraints) and the tool produces records you can use as test input. Use it when you need realistic, PII-safe, constraint-honouring test data for seeding databases, API fixtures, or import tests.
Data Validation
Checks data that already exists.

You paste the actual data (an API response, a migrated table export, an ETL result) and the tool checks it against type, range, format, nullability, uniqueness, referential integrity, and business invariants. Use it after a migration, after an ETL run, or when an API response shape looks wrong.
Practical sequence

Use Data Generator to create the input data → run your process → use Data Validation to verify the output.

When do I use “Automation strategy” instead of going straight to “Automation (Playwright)”? +

Use Automation Strategy first whenever you have more than one test case to automate.

Automation Strategy answers three questions the Playwright purpose cannot:

  1. Should this case be automated at all? Some cases are cheaper to keep manual (one-off, visual, exploratory). Automating them creates maintenance debt with no ROI.
  2. Which pyramid layer? A test that checks business logic belongs at the unit or API layer — not E2E. Putting it in Playwright makes it slower, flakier, and harder to debug.
  3. In what order? High-frequency, high-stability cases first. New features under active development last.
Common mistake

Skipping this step is the most common cause of brittle, high-maintenance test suites. Run Automation Strategy on your batch, get the “automate now / automate later / keep manual” decision for each case, then run Automation (Playwright) one case at a time on the approved candidates.

My CI is failing intermittently. Which purpose handles that? +

Use the Flake Triage purpose (Tier 2). Paste the failing test code and CI failure logs into the content field. The prompt will:

  • Apply a hypothesis-evidence ladder over the logs
  • Root-cause non-determinism: race conditions, animation/network timing, shared or leaked state, test-ordering dependencies, time/locale sensitivity
  • Propose concrete fixes (e.g. replace hard waitForTimeout with web-first assertions, isolate setup/teardown)
  • Recommend a quarantine policy and flake budget

The more log context you paste — stack trace, timing, CI environment, frequency of failure — the better the root-cause output.

Roles & Output Formats
“The numbers are wrong” vs “the button doesn’t work” — which role? +

These are two different failure modes at two different layers, and the right role depends on which layer the defect lives at.

“The button doesn’t work”
UI/interaction failure → use Bug Reporter / Defect Analyst.

The defect is in behaviour: something is unresponsive, mis-labelled, or throws a UI error. Steps to reproduce focus on user actions and visual state.
“The numbers are wrong”
Data/calculation failure → use Bug Reporter / Defect Analyst as primary, but add Data Validation Specialist as a second role.

The defect is in a value: a calculation is incorrect, a field is truncated, a decimal is rounded wrong.
Practical test

If you can reproduce the defect without opening a browser (e.g. by querying the database or calling the API directly), it’s a data-layer bug — use the second-role pairing. If you need the UI to repro it, the bug is in the UI layer — use Bug Reporter alone.

When do I add a second role and what does it actually change? +

Add a second role only when the task genuinely spans two failure dimensions — not as a default.

What it changes: the first selected role sets the primary lens (vocabulary, coverage instincts, output structure). The second role adds an additional set of instincts to the same output — it does not produce two separate outputs.

Example: Test Case Author + Accessibility Test Engineer → produces test cases that include functional coverage AND WCAG 2.2 AA checks (keyboard, focus, ARIA, contrast) in the same output.

Useful pairings:

  • Test Case Author + Data Validation Specialist — feature involves data correctness (types, ranges, nulls)
  • Test Case Author + Security Test Engineer — feature handles auth, tokens, or PII
  • Bug Reporter + Data Validation Specialist — defect involves wrong values, not wrong behaviour
  • Playwright Engineer + Flake Triage — writing a new spec while diagnosing why an existing one is flaky
Checklist vs Gherkin vs Column-based suite — when does each make sense? +
Format Effort Traceability Use when
Lightweight checklistLowLowSmoke sweeps, sanity checks, tight deadlines, MVPs.
BDD / GherkinMediumMedium-HighAgile BDD/ATDD teams. Readable by non-technical stakeholders, can feed Cucumber/SpecFlow.
Column-based suiteHighestHighestEnterprise, regulated, compliance/audit. Imports to Xray/Zephyr/TestRail.

The “Regression smoke checklist” preset locks Checklist + concise length. The “Gherkin from Jira Ticket” preset locks Gherkin. The “API Test Cases” preset locks Column-based suite.

What does “Runnable spec file” actually output, and is it safe to run as-is? +

The Runnable spec file output instructs the model to emit a single runnable file — imports, describe/test blocks, setup/teardown, and assertions — in the idioms of the framework set in the Framework context field.

How safe is it? Treat it as a strong first draft:

  • The locator names (e.g. getByRole('button', { name: 'Save' })) will match your input — verify they reflect the actual DOM
  • The API seeding calls will use placeholder endpoints — replace with real paths
  • Check the framework version — @playwright/test API changes between major versions
  • Run it in headed mode first (--headed) to watch what it actually does before adding it to CI
Guards & Coverage
Which guards should I always have on? +

Two guards are safe defaults for almost every prompt:

  • Grounding — prevents the model from inventing limits, causes, or selectors not present in your input. Turn it on whenever facts, logs, or numbers are involved. It’s on by default for most Tier 1 purposes.
  • No-preamble — makes the response start directly at the deliverable with no “Here is your test suite...” wrapper. Useful for output you’ll copy straight into a tool.

Add these situationally:

  • Redact / privacy — when pasting real logs or tickets that contain tokens, customer names, or PII.
  • PII / synthetic — auto-enabled for the Data Generator purpose. Keeps data fully synthetic.
  • Self-check — for structured output with fixed columns or ID sequences.
What does “Pairwise” coverage actually do, and how is it different from “Boundary”? +

Boundary targets individual field edges — min, min−1, max, max+1 values for a single field. Use it whenever constrained numeric or string fields exist.

Pairwise targets combinations of fields. When you have many inputs, an exhaustive cross-product of all combinations is too large to test. Pairwise (all-pairs) ensures every pair of field values appears in at least one test case. This gives ~90% defect detection at a fraction of the test count.

Example: a search form with 4 filters, each with 3 values = 81 combinations exhaustively. Pairwise reduces this to ~15 cases while still covering every value paired with every other value at least once.

Use both when a form has both range-constrained fields (needs Boundary) and multiple interacting inputs (needs Pairwise). They solve different coverage problems.

When should I fill in the “Roles under test” context field? +

Fill it in whenever you select Permission coverage. Without it, the model has no role names to vary across and produces generic “authorized/unauthorized” language instead of concrete cases.

Also fill it in for:

  • Features with visibility differences per role (e.g. pricing tiers, dashboard widgets)
  • Security testing where IDOR or broken access control is a concern
  • Multi-tenant systems where one tenant must not see another’s data

Format: comma-separated role names matching what your system actually uses — e.g. admin, dealer, customer, guest. The model will vary test data and assertions across those exact names.

Data Generator
What’s the difference between “Constant fields” and “Varying fields”? +

Constant fields are identical across every generated record — tenant ID, region code, account code. They give the dataset a shared context without wasting uniqueness budget on fields that don’t need to vary.

Varying fields must be unique per record — names, IDs, emails, amounts, dates. The type dropdown (Email, UUID, Decimal, Date, etc.) injects a format constraint into the prompt, telling the model exactly what shape each unique value should take.

Required

At least one Varying field is required. A dataset where every field is constant has no per-record uniqueness and is useless for most test scenarios.

How do I pick the right data format: CSV, JSON, SQL, XML, or Faker.js? +
Format Use when
CSV / TSVImporting into a tool, spreadsheet, or any system with a flat import feature.
JSONAPI fixtures, frontend mock data, or any consumer that reads JSON.
SQL INSERTSeeding a relational database. Set the dialect so escaping and date/boolean literals are correct.
XMLStructured feeds, EDI documents, or any legacy system that consumes XML.
Faker.js factory (TS)When you need repeatable, version-controlled data generation in a TypeScript project.
Volume note

The model returns records inline — keep Quantity at or below 50 for reliable output. For larger volumes, generate a Faker.js factory or SQL script and run it yourself.

What does “Edge-case records” actually add beyond normal records? +

Toggling Edge-case records instructs the model to include deliberately invalid and boundary records, clearly flagged as such, alongside the normal records. Specifically it adds:

  • Boundary values per constrained field: MIN, MIN+1, MAX−1, MAX, and a mid-range value
  • Empty string and null variants for nullable fields
  • Max-length + 1 overflow strings (expect rejection)
  • Unicode / emoji / right-to-left (RTL) characters in string fields

These records are useful for negative testing — loading the dataset and verifying the system rejects or handles them gracefully. Combine with Scenario mapping if you want each edge-case record linked to a specific test case ID.