KLD Institute

Week 4 tutorial notes

Critique 101: Standards, Evidence, and Acceptance Criteria

Critique a product/design option or AI-generated screen using user goals, evidence, heuristics, severity, accessibility, and AI-output risk, then translate the highest-priority improvements into observable acceptance criteria and readiness standards.

Lesson thesis

Critique is a structured review of product/design work against user goals, evidence, usability heuristics, accessibility, risk, and business fit. Acceptance criteria translate critique into observable readiness standards so a team can decide what is working, what must change, and what good enough means.

Preparation status

Prepared with study notes, references, and web examples.

Study notes

1. Lecture spine

Session 10 teaches the learner to turn a design option into a reviewable product standard. The output is not a list of likes and dislikes. It is a critique and acceptance sheet: strengths, issues, severity, evidence, assumptions, top fixes, acceptance criteria, Given/When/Then scenarios, accessibility criteria, and Definition of Done checks.

This session matters because product teams need more than ideas. They need shared language for what is working, what is risky, and what must be true before work is ready. Critique creates that language.

The tone should be rigorous but not harsh. A strong critique protects the user, the team, and the work. It is specific, evidence-aware, and directed at the artifact, not the person who made it.

  1. Part 1: Why critique matters. Connect from Session 9. The learner has chosen an option. Now she needs to decide whether it is clear, usable, accessible, evidence-grounded, and ready for the next step.
  2. Part 2: Critique standards. Teach critique as a structured review against user goal, task, heuristics, accessibility, evidence, severity, and business fit.
  3. Part 3: Readiness standards. Translate critique findings into acceptance criteria, Given/When/Then scenarios, accessibility criteria, and a small Definition of Done checklist.
  4. Part 4: Studio artifact. Learners critique one chosen option or AI-generated screen, prioritize issues, and write criteria that make the work testable.

2. Core vocabulary

The learner should leave this session with critique vocabulary: scope, standard, observation, inference, evidence, assumption, issue, severity, recommendation, acceptance criterion, scenario, accessibility criterion, and Definition of Done.

The core distinction is between feedback and readiness. Feedback says what to improve. Readiness standards say how the team will know the work is good enough for the current stage.

The learner should practice speaking in product language: this issue is major because it blocks the user from understanding why the recommendation fits the task; the acceptance criterion is that the reason must be visible before the save action.

Critique

A structured review of work against goals, evidence, user needs, constraints, and standards.

Example: The recommendation reason is hard to notice, which may reduce trust before the student saves a laptop.

Acceptance criterion

A checkable outcome that tells the team when a user story has done its job.

Example: It is done when each recommendation shows a plain-language reason before the save action.

Given/When/Then

A plain-language scenario format that describes starting context, action, and observable outcome.

Example: Given three saved laptops, when the student opens comparison, then each laptop shows price, battery, weight, and recommendation reason.

Definition of Done

A shared quality standard that defines when work is complete enough to count as done.

Example: Reviewed, tested, accessible, responsive, and documented.

3. Critique versus opinion

Critique is not about personality. It is not a chance to sound clever. It is a method for protecting the user goal and improving the work.

The tutor should repeatedly ask: what goal or standard does that feedback refer to? If the learner cannot answer, the feedback may be a preference rather than a critique.

A helpful critique sentence usually has four parts: observation, user impact, severity or priority, and recommended next step.

From opinion to readiness
Opinion
Weak: I like it.
Strong: I prefer this only after explaining the user goal it supports.
Critique
Weak: The button is weird.
Strong: The primary action is unclear because the label does not match the task.
Readiness
Weak: It looks finished.
Strong: The story is ready when the user can complete the comparison and recover from errors.

4. Set the critique boundary

Critique without scope gets messy quickly. A learner reviewing an entire app will either stay vague or become overwhelmed. A learner reviewing one screen for one task can be specific.

Scope protects the conversation. It tells reviewers what kind of feedback is useful. It also protects the designer from receiving feedback about everything at once.

In this course, the default scope is one chosen option or AI-generated screen from the product recommendation project.

Critique scope checklist
  • What artifact is being reviewed?
  • Which user and task are in scope?
  • Which device, context, or flow step matters?
  • What evidence is available?
  • What standard will be used for review?
  • What is deliberately out of scope for now?

5. Anchor critique in the user story

The user story gives critique a target. GOV.UK guidance stresses actor, need, and goal. The goal is especially important because it helps the team decide when the story is done.

If the learner cannot write the user story, she should not begin detailed critique yet. She may be reviewing against taste rather than user need.

A useful story for this session might be: As a first-time student, I want to compare recommended laptops using criteria I understand, so that I can choose one that fits my study needs and budget.

User story quality
Actor
Weak: User.
Strong: First-time student choosing a laptop for coursework.
Need
Weak: See options.
Strong: Compare recommended laptops using criteria they understand.
Goal
Weak: Use the feature.
Strong: Choose or shortlist one with confidence.

6. Heuristic review

Heuristics give the learner a vocabulary for likely usability issues. They are broad rules of thumb. They are useful for a first pass, especially when formal user research is not yet available.

The lecturer should not ask beginners to memorize all ten at once. Instead, cluster them into practical questions: does the user know what is happening, understand the language, control the process, avoid mistakes, recognize what to do, and recover when something fails?

NN/g is clear that heuristic review complements user research. A heuristic issue is a hypothesis about likely user friction until tested or supported by other evidence.

Heuristic vocabulary for beginners
  • Visibility of system status
  • Match between system and real world
  • User control and freedom
  • Consistency and standards
  • Error prevention
  • Recognition rather than recall
  • Flexibility and efficiency of use
  • Aesthetic and minimalist design
  • Help users recognize, diagnose, and recover from errors
  • Help and documentation

7. Severity and prioritisation

Severity helps the learner move from feedback to priority. NN/g frames severity around frequency, impact, persistence, and broader market or product effect.

The learner should not use severe language casually. Calling everything major or blocker destroys trust in the critique. The goal is to match priority to user harm and product risk.

Severity is not the same as effort. A blocker may be hard to fix or easy to fix. The team still needs to make a product decision about what to do now.

Severity language
Cosmetic
Weak: Anything visually imperfect.
Strong: Low task impact; fix when there is time.
Minor
Weak: A personal preference.
Strong: Some friction, usually recoverable.
Major
Weak: A style disagreement.
Strong: Likely to slow or confuse many users.
Blocker
Weak: Something the reviewer strongly dislikes.
Strong: Prevents task completion or creates serious risk.
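
To make the severity-versus-effort separation concrete, here is a minimal Python sketch. The Issue fields and the example data are illustrative, not an NN/g method: the point is only that priority sorts on severity, while effort is recorded separately for the planning conversation.

from dataclasses import dataclass
from enum import IntEnum

class Severity(IntEnum):
    # The four levels from the table above, ordered by user harm.
    COSMETIC = 1
    MINOR = 2
    MAJOR = 3
    BLOCKER = 4

@dataclass
class Issue:
    observation: str    # what is visible in the artifact
    impact: str         # why it matters for the user
    severity: Severity  # judged against user harm and product risk
    effort_days: float  # rough fix effort, tracked separately from severity

def triage(issues):
    # Severity drives the order; effort only breaks ties.
    return sorted(issues, key=lambda i: (-i.severity, i.effort_days))

issues = [
    Issue("Recommendation reason sits below the badges",
          "Students may not see why a laptop fits", Severity.MAJOR, 0.5),
    Issue("Save confirmation is a colour change only",
          "The saved state can be missed entirely", Severity.BLOCKER, 2.0),
    Issue("Badge icons are slightly misaligned",
          "Low task impact", Severity.COSMETIC, 0.1),
]
for issue in triage(issues):
    print(f"{issue.severity.name:<8} {issue.observation}")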

8. Evidence, assumptions, and confidence

A critique should separate what is visible from what is inferred. This protects the learner from overstating certainty.

The learner can say: I observe that the confirmation is only a color change. I infer that some users may miss it. We need evidence from a quick prototype test or accessibility review.

This discipline is especially important with AI output. AI may write confident critique that invents user behavior. The learner must lower confidence when evidence is missing.

Evidence discipline
Fact
Weak: The user will not notice it.
Strong: The save confirmation appears only as a color change.
Assumption
Weak: Nobody will understand this.
Strong: Students may miss the confirmation if they rely on text feedback.
Evidence gap
Weak: This is definitely a problem.
Strong: We need to test whether students notice that the item was saved.

9. AI output critique

AI is useful in critique because it can apply a checklist, spot missing states, propose acceptance criteria, and rewrite vague feedback into clearer language. But AI is also prone to generic critique.

The prompt therefore gives scope, user, task, evidence, constraints, and an explicit output structure. It asks AI to separate facts, assumptions, and open questions so the learner can check the output.

The tutor should ask the learner to reject any AI critique that sounds polished but cannot point to a visible issue, a user goal, or an evidence gap.

Session 10 prompt lab
I am critiquing a product/design option or AI-generated screen and turning the critique into acceptance criteria.

Context:
- Product or feature:
- User:
- User task:
- Desired outcome:
- Chosen option or screen:
- What the design/output currently includes:
- Evidence available:
- Evidence missing:
- Constraints:
- Risk level:
- What "ready" needs to mean for this work:

Act as a Product Owner with design-review judgment.

1. Restate the user goal, task, and critique scope in one paragraph.
2. Name the standards you will use for critique: task success, clarity, hierarchy, action/state behavior, copy, error prevention, recovery, accessibility, trust, feasibility, evidence, and business fit.
3. Identify what is working. For each strength, explain which user goal or standard it supports.
4. Identify issues. For each issue, include:
   - observation
   - why it matters
   - affected user or situation
   - likely severity: cosmetic, minor, major, or blocker
   - evidence or assumption behind the issue
   - recommended improvement
5. Separate facts, assumptions, and open questions.
6. Check for AI-output risks: invented evidence, generic advice, hidden assumptions, inaccessible interaction, missing states, and overconfident claims.
7. Prioritize the top 3 fixes and explain why each comes before the others.
8. Write one user story for the selected improvement.
9. Write 6 to 8 acceptance criteria as observable outcomes. Use "It is done when..." statements.
10. Convert 2 or 3 criteria into Given/When/Then scenarios.
11. Add accessibility acceptance criteria that are specific to this screen or feature.
12. Add a small Definition of Done checklist for the work.
13. End with a readiness decision: ready to prototype, ready to build, ready to test, or not ready yet.

Rules:
- Do not invent user research, analytics, or technical certainty.
- Do not make critique personal. Critique the work against goals and standards.
- Do not treat the highest-severity issue as automatically easiest to fix; separate priority from effort.
- Do not write acceptance criteria that prescribe a single visual solution unless the implementation detail is required.
- Keep every criterion observable enough that a teammate could check it.
Check before accepting
  • Does the AI output include the states needed for the task?
  • Does it invent research, analytics, or business certainty?
  • Does it use generic advice instead of the given user and task?
  • Does it hide accessibility problems behind visual polish?
  • Does it include errors, empty states, loading, recovery, and confirmation?
  • Does it make assumptions about technical feasibility?
  • Does it produce acceptance criteria that are observable?
Reject or revise when
  • The critique is mostly taste or tone.
  • The critique is personal rather than directed at the work.
  • The critique invents evidence or treats AI confidence as proof.
  • Issues do not include severity or user impact.
  • Acceptance criteria describe implementation details rather than outcomes.
  • The final readiness decision is vague.

10. Accessibility acceptance criteria

Accessibility criteria should be specific to the thing being built. Generic statements such as "make it accessible" are too weak.

Use WCAG principles at a beginner level: perceivable, operable, understandable, and robust. Then translate them into the current feature. For a shortlist action, this means the saved state should not rely on color alone, should be keyboard reachable, should be announced clearly, and should be reversible.

The GOV.UK accessibility acceptance criteria guidance is useful because it treats criteria as records of accessibility decisions, not just compliance notes at the end.

Accessibility review criteria
Perceivable
Weak: Looks clear to me.
Strong: Text, status, and recommendation reason are available without relying on color alone.
Operable
Weak: Works with a mouse.
Strong: Save, compare, remove, and retry can be reached by keyboard or touch.
Understandable
Weak: Short copy only.
Strong: Labels and error messages explain what happened and what to do next.
Robust
Weak: Probably fine.
Strong: Assistive technologies can identify state, control, and selected item.
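
For the shortlist action described above, these requirements can be written as checks a teammate could verify. A minimal Python sketch, where the saved_state record and its keys are hypothetical stand-ins for whatever the prototype or audit actually captures:

saved_state = {
    "indicators": ["color", "icon", "text label"],  # how the saved state is shown
    "keyboard_reachable": True,                     # save/remove work without a mouse
    "announcement": "Laptop saved to shortlist",    # what assistive tech is told
    "reversible": True,                             # the student can undo the save
}

criteria = {
    "Saved state does not rely on color alone":
        lambda s: any(i != "color" for i in s["indicators"]),
    "Save action is keyboard reachable":
        lambda s: s["keyboard_reachable"],
    "State change is announced clearly":
        lambda s: bool(s["announcement"]),
    "Save action is reversible":
        lambda s: s["reversible"],
}

for statement, check in criteria.items():
    print(f"[{'pass' if check(saved_state) else 'fail'}] {statement}")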

11. Acceptance criteria

Acceptance criteria are the bridge from critique to action. GOV.UK describes them as outcomes that confirm the service has done its job and meets the user need.

A weak criterion says: make the recommendation clearer. A stronger criterion says: it is done when each recommendation shows a plain-language reason before the save action.

Criteria should make the work inspectable by a teammate. If no one can check whether the criterion is true or false, it needs to be rewritten.

Acceptance criteria checklist
  • Start with the user story goal.
  • Write outcomes, not vague preferences.
  • Keep criteria observable.
  • Include happy path and key edge cases.
  • Include accessibility and recovery where relevant.
  • Link criteria to evidence or assumptions.
  • Avoid prescribing implementation unless required.
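
One way to keep criteria inspectable is to record each one with the story it serves and the evidence behind it. A minimal Python sketch; the fields and example text are illustrative, not a standard format:

from dataclasses import dataclass

@dataclass
class AcceptanceCriterion:
    outcome: str         # observable "It is done when..." statement
    story: str           # the user story this criterion serves
    evidence_basis: str  # evidence, or a clearly labeled assumption

criteria = [
    AcceptanceCriterion(
        outcome="each recommendation shows a plain-language reason before the save action",
        story="First-time student compares laptops using criteria she understands",
        evidence_basis="Assumption: visible reasons build trust; test in prototype",
    ),
    AcceptanceCriterion(
        outcome="the user can recover if comparison data fails to load",
        story="First-time student compares laptops using criteria she understands",
        evidence_basis="Heuristic: help users recognize, diagnose, and recover from errors",
    ),
]

for c in criteria:
    print(f"It is done when {c.outcome}. [basis: {c.evidence_basis}]")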

12. Given/When/Then scenarios

Given/When/Then is useful because it keeps acceptance criteria grounded in examples. It is not just a testing syntax; it is a way to make expected behavior concrete.

A beginner-friendly example: Given a student has three recommended laptops, when she opens comparison, then each laptop shows price, battery, weight, required software fit, and a recommendation reason.

The lecturer should warn against scenarios that describe implementation details. The scenario should describe observable behavior and user outcome.

  1. Given (context): state the starting condition, including the user state or data state.
  2. When (action): state the user action or event that happens.
  3. Then (outcome): state the observable result the user or system should see.
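
The beginner example above can also be written as an executable scenario. Below is a sketch using the behave library for Python, which matches Gherkin steps to step functions; the laptop data and file layout are hypothetical:

# steps/comparison_steps.py
# The matching feature file would contain the Gherkin text:
#
#   Scenario: Compare recommended laptops
#     Given a student has three recommended laptops
#     When she opens comparison
#     Then each laptop shows price, battery, weight, and a recommendation reason

from behave import given, when, then

@given("a student has three recommended laptops")
def three_recommendations(context):
    context.laptops = [
        {"price": 899, "battery": "12 h", "weight": "1.3 kg", "reason": "Light, long battery"},
        {"price": 1099, "battery": "9 h", "weight": "1.6 kg", "reason": "Runs required software"},
        {"price": 749, "battery": "10 h", "weight": "1.5 kg", "reason": "Fits the stated budget"},
    ]

@when("she opens comparison")
def open_comparison(context):
    # A real test would drive the prototype; here the data is passed through.
    context.comparison = context.laptops

@then("each laptop shows price, battery, weight, and a recommendation reason")
def fields_are_visible(context):
    for laptop in context.comparison:
        for field in ("price", "battery", "weight", "reason"):
            assert laptop.get(field), f"missing {field}"

Note that the Then step asserts an observable outcome; it says nothing about how the screen renders it.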

13. Acceptance criteria and Definition of Done

Acceptance criteria and Definition of Done are related but not identical. Acceptance criteria describe what this story or improvement must achieve. Definition of Done describes the broader quality measures work must meet.

The Scrum Guide treats Definition of Done as a shared quality description for a completed increment. In this course, the learner can use a small version of it: reviewed, tested, accessible, responsive, documented, and ready for the next handoff.

This distinction helps the learner avoid writing acceptance criteria that are too broad or a Definition of Done that is too vague.

Criteria versus Definition of Done
Acceptance criteria
Weak: Everything is done.
Strong: Specific to this story: recommendation reasons appear before save action.
Definition of Done
Weak: The designer likes it.
Strong: Shared quality bar: reviewed, tested, accessible, responsive, documented.
Relationship
Weak: Use one term for everything.
Strong: A story needs both story-specific outcomes and team-wide quality checks.
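
The two layers can be sketched side by side. A minimal illustration, where the check results are hypothetical inputs a team would fill in during review:

DEFINITION_OF_DONE = [
    "reviewed against criteria",
    "tested in prototype",
    "accessibility checked",
    "responsive behavior considered",
    "open questions logged",
]

story_criteria = [
    "each recommendation shows a plain-language reason before the save action",
    "save and remove states are keyboard-accessible and announced",
]

def is_done(criteria_met: dict, dod_met: dict) -> bool:
    # Done means both the story-specific outcomes and the shared bar hold.
    return (all(criteria_met.get(c, False) for c in story_criteria)
            and all(dod_met.get(d, False) for d in DEFINITION_OF_DONE))

# Example: the story criteria pass, but accessibility is not yet checked.
criteria_met = {c: True for c in story_criteria}
dod_met = {d: d != "accessibility checked" for d in DEFINITION_OF_DONE}
print(is_done(criteria_met, dod_met))  # False: the shared quality bar is not met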

14. Worked example: critique

Use the course project for a worked example. The learner reviews a recommendation screen created in Session 9. The screen helps students compare laptops, but the reason for each recommendation is visually weaker than the price and badges.

The critique should say: observation, why it matters, likely severity, evidence basis, and improvement. This keeps the feedback grounded.

The recommended improvement is not simply make it bigger. It is: make the recommendation reason visible before the save action so students can understand why the option fits their study needs.

Recommendation screen critique
Strength
Weak: Looks nice.
Strong: The screen gives three recommended laptops and shows price immediately.
Issue
Weak: Hierarchy is weird.
Strong: The recommendation reason sits below the visual badges, so trust may be weaker.
Severity
Weak: High because I dislike it.
Strong: Major if users cannot explain why a laptop fits their needs.

15. Worked example: acceptance criteria

After critique, convert the improvement into criteria. The criteria should be observable and tied to the user story.

Then convert the most important criteria into scenarios. Example: Given a student is viewing recommended laptops, when she saves one to the shortlist, then the card confirms the saved state without relying only on color.

Finally, add a small Definition of Done checklist: reviewed against criteria, tested in prototype, accessibility checked, responsive behavior considered, open questions logged.

Example acceptance criteria
  • It is done when each recommendation shows a plain-language reason before the save action.
  • It is done when the student can compare price, battery, weight, and required software fit for each recommendation.
  • It is done when save and remove actions have visible, keyboard-accessible, and screen-reader-identifiable states.
  • It is done when the user can recover if comparison data fails to load.
  • It is done when accessibility review finds no blocker for the main comparison task.

16. Studio exercise, rubric, and home study

The studio output is one critique and acceptance sheet. It should be specific enough that a teammate could understand the issue, act on the recommendation, and check the criteria.

For home study, the learner should critique one familiar screen and write three acceptance criteria for the main task. This repetition is important because critique improves only when learners practice moving from observation to standard to action.

Close by previewing Session 11. With standards in place, the next lesson turns a chosen, critiqued direction into a 3-screen flow and feature brief.

  1. Part A: Frame. Choose one selected option or AI-generated screen from Session 9 and write the user story, scope, and evidence available.
  2. Part B: Critique. Run the Session 10 prompt and review the output for fake certainty, vague feedback, missing states, and accessibility gaps.
  3. Part C: Prioritize. Prioritize the top three issues using severity, evidence, user impact, and effort awareness.
  4. Part D: Define readiness. Write acceptance criteria, Given/When/Then scenarios, accessibility criteria, Definition of Done checks, and a readiness decision.
Rubric for a strong critique and acceptance sheet
  • Critique scope is narrow and clear.
  • User story includes actor, need, and goal.
  • Strengths and issues are linked to user goals or standards.
  • Issues include observation, impact, severity, evidence basis, and recommendation.
  • AI-generated critique is checked rather than accepted wholesale.
  • Acceptance criteria are outcome-focused and observable.
  • At least two Given/When/Then scenarios are included.
  • Accessibility criteria are specific to the feature.
  • Definition of Done checklist is small but meaningful.
  • Readiness decision is explicit: prototype, build, test, or revise.

References

Nielsen Norman Group: UX Design Critiques Cheat Sheet

NN/g critique guidance on defining scope, tying feedback to goals, keeping critique about the work, documenting decisions, and following up.

Nielsen Norman Group: How to Conduct a Heuristic Evaluation

NN/g step-by-step heuristic evaluation guidance, including scope, independent evaluation, consolidation, and the warning that heuristics complement but do not replace user research.

Nielsen Norman Group: 10 Usability Heuristics

Jakob Nielsen's ten broad usability heuristics for interaction design, used here as beginner-friendly critique vocabulary.

Nielsen Norman Group: Severity Ratings for Usability Problems

NN/g severity-rating guidance for prioritising usability problems by frequency, impact, persistence, and market effect.

GOV.UK Service Manual: Writing user stories

GOV.UK guidance on user stories and acceptance criteria as outcome checklists that confirm whether a service has done its job and met a user need.

Cucumber: Gherkin reference

Cucumber Gherkin reference for writing concrete examples with Given, When, and Then so expected outcomes can be checked.

Scrum Guides: Scrum Guide 2020

Official Scrum Guide description of the Definition of Done as a shared quality standard for completed product increments.

GOV.UK Developer Documentation: Accessibility acceptance criteria

GOV.UK developer guidance on accessibility acceptance criteria: specific, testable, outcome-focused criteria that record accessibility decisions.

W3C WAI: Understanding WCAG 2.2

W3C WAI explanation of WCAG principles, success criteria, and the role of automated and human testing in accessibility review.

Microsoft HAX Toolkit: Guidelines for Human-AI Interaction

Microsoft HAX guidelines for human-AI interaction, useful for critiquing AI-generated experiences and AI-assisted design decisions.

OpenAI Academy: Prompting fundamentals

OpenAI prompting guidance on clear instructions, useful context, desired output, and iterative refinement.

Web examples

Guided practice

Frame one critique, run the Session 10 prompt, check the AI critique for fake certainty and missing evidence, prioritize issues by severity and user impact, then write acceptance criteria, scenarios, accessibility criteria, and a small Definition of Done checklist.

Artifact: Product critique and acceptance sheet with severity, evidence, acceptance criteria, scenarios, accessibility criteria, and Definition of Done
Tutor review questions
  • Can the learner define critique scope before reviewing?
  • Can the learner separate observation, inference, evidence, assumption, and open question?
  • Can the learner critique against user goal, task success, heuristics, accessibility, evidence, and business fit rather than taste?
  • Can the learner assign severity without exaggerating or ignoring user impact?
  • Can the learner identify AI-output risks such as missing states, invented evidence, and fake confidence?
  • Can the learner write observable acceptance criteria and Given/When/Then scenarios?
  • Can the learner distinguish story-specific acceptance criteria from Definition of Done?
  • Can the learner make a clear readiness decision: prototype, build, test, or revise?
AI prompt
Use the Session 10 prompt from the prompt lab in the study notes above.

Home study

Choose one selected option or AI-generated screen from the course project. Write the user story and critique scope, identify strengths and issues with severity and evidence notes, prioritize the top three fixes, then write acceptance criteria, Given/When/Then scenarios, accessibility criteria, Definition of Done checks, and a readiness decision.