We have now arrived at a key concern for the mixed reality user experiences that we have designed: How do we study and evaluate them? How do we define success criteria? Which research methods make sense?
There are many standard methods for evaluating user experiences, and some of them are used particularly often in the Mixed Reality research community.
BUT rather than mindlessly applying the standard methods, we should first think things through for ourselves, and consider how we might appropriate the standard methods for our particular needs.
The first question we should ask ourselves when evaluating a new human-computer interaction experience is: what are the measures of success in our case?
Of course, we need to be familiar with the toolkit before we can pick the right tools to answer this question. So here are some key concepts and case studies to introduce you to the research methods used to evaluate user experiences.
The measure of success
First, we must identify what we want to measure and how we can measure it.
What to measure?
When defining measures of success, it is important to identify which aspects of the user experience or task performance matter most. Depending on the project, this might include: presence and social presence, collaboration performance, creativity, social experience in play, enjoyment, stress, focus or distraction, memory recall, and learning.
Each of these constructs captures a different dimension of user behavior or experience, and prior work often provides validated ways of measuring them. It is therefore useful to search the literature for established methods and metrics, and bring your ideas to supervision sessions where additional guidance and resources can help refine what you choose to measure.
How to measure it?
To measure these constructs effectively, you typically need to design an experiment in which users complete a specific task while you collect relevant data. This can be done through multiple channels: interaction logging to capture behavior, observations to understand context and patterns, questionnaires to gather structured feedback, and interviews to obtain deeper insights. These methods vary in nature (some provide objective indicators, while others rely on subjective reports), so combining them often gives the most robust understanding of a user experience.
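Interaction logging, the first of these channels, can be sketched in a few lines of code. The following is a minimal illustrative sketch, not a specific toolkit's API; the class name, event names, and fields are all hypothetical.

```python
import json
import time

class InteractionLogger:
    """Minimal per-session event logger for a user study.
    (Illustrative sketch; event names and fields are made up.)"""

    def __init__(self, participant_id, condition):
        self.participant_id = participant_id
        self.condition = condition
        self.events = []

    def log(self, event_type, **details):
        # Record a timestamped event, e.g. a menu selection or an error.
        self.events.append({
            "t": time.time(),
            "participant": self.participant_id,
            "condition": self.condition,
            "event": event_type,
            **details,
        })

    def save(self, path):
        # One JSON file per session keeps later analysis simple.
        with open(path, "w") as f:
            json.dump(self.events, f, indent=2)

# Usage: objective measures such as task completion time and error
# rate can later be derived from the logged event stream.
logger = InteractionLogger(participant_id="P01", condition="gaze-pinch")
logger.log("task_start", task=1)
logger.log("selection", target="menu_item_3", correct=True)
logger.log("task_end", task=1)
```

The point of logging everything with timestamps is that you commit to the raw data first and decide on the derived measures (times, error rates) afterwards, which keeps the analysis flexible.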
Key concepts and distinctions
Now, there are some key concepts and distinctions that you should know when designing your user evaluations.
- Objective & subjective measures
  - Objective measures: Observations or data that do not depend on personal feelings or interpretations (e.g., reaction time, error rates).
  - Subjective measures: Data based on personal opinions, perceptions, or self-reports (e.g., satisfaction ratings, perceived workload).
- Quantitative & qualitative methods
  - Quantitative methods: Research approaches that collect numerical data and use statistical analysis (e.g., surveys with scales, experiments).
  - Qualitative methods: Approaches that gather non-numerical, descriptive data to understand experiences or meanings (e.g., interviews, observations).
- Dependent & independent variables
  - Independent variable: The factor the researcher manipulates or categorizes to examine its effect.
  - Dependent variable: The outcome that is measured and expected to change as a result of the independent variable.
- Within-subjects & between-subjects designs
  - Within-subjects design: Each participant experiences all experimental conditions.
  - Between-subjects design: Different participants experience different conditions, with each person in only one condition.
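The within/between distinction also shapes how the data is analyzed. The following sketch uses made-up completion times and hypothetical condition names ("gaze", "hand") to show the difference: within-subjects data supports paired comparisons, between-subjects data only group comparisons.

```python
from statistics import mean

# Hypothetical completion times in seconds (illustrative data only).
# Within-subjects: every participant appears under BOTH conditions,
# so we can compare each person to themselves (paired differences).
within = {
    "P01": {"gaze": 4.1, "hand": 5.0},
    "P02": {"gaze": 3.8, "hand": 4.6},
    "P03": {"gaze": 4.5, "hand": 4.9},
}
paired_diffs = [p["hand"] - p["gaze"] for p in within.values()]
print("mean within-subject difference:", round(mean(paired_diffs), 2))

# Between-subjects: each participant appears in exactly ONE condition,
# so we can only compare group means, not paired differences.
between = {
    "gaze": [4.1, 3.8, 4.5],  # participants P01-P03
    "hand": [5.0, 4.6, 4.9],  # different participants, P04-P06
}
print("group means:", {c: round(mean(v), 2) for c, v in between.items()})
```

In a real analysis, the paired differences would feed into a paired test (e.g., a paired t-test) and the group data into an unpaired test; the within-subjects design typically needs fewer participants because each person serves as their own baseline.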
Case studies
To ground the above concepts in some specific examples, we will cover two case studies:
- 1) Lystbæk, M.N., Rosenberg, P., Pfeuffer, K., Grønbæk, J.E., Gellersen, H.G. (2022) Gaze-Hand Alignment: Combining Eye Gaze and Mid-Air Pointing for Interacting with Menus in Augmented Reality. ACM Symposium on Eye Tracking Research and Applications (ETRA '22).
- 2) Grønbæk, J.E.S., Esquivel, J.S., Leiva, G., Velloso, E., Gellersen, H., Pfeuffer, K. Blended Whiteboard: Physicality and Reconfigurability in Remote Mixed Reality Collaboration. In Proceedings of CHI '24.
You can read the details about the research methods in the study/method sections of each paper. But here is a tabular overview of their main differences:
| Case study | 1) Gaze-Hand Alignment | 2) Blended Whiteboard |
| --- | --- | --- |
| Study type | Controlled single-user study | Exploratory multi-user study |
| Purpose | Measuring objective performance and subjective experience | Discovering new collaborative behaviours and subjective experiences |
| Experimental design | Within-subjects design (with controlled repetition of tasks) | Within-subjects design (with open-ended tasks) |
| Data collection | Questionnaire data across multiple conditions in a controlled task | Qualitative subjective feedback across multiple conditions |
| Analysis | Statistical analysis | Thematic analysis |
Example of a between-subjects experimental design
If you're curious, here's an example of a between-subjects experimental design, which wasn't covered in the two main case studies above.
Huang, K. T., Ball, C., Francis, J., Ratan, R., Boumis, J., & Fordham, J. (2019). Augmented versus virtual reality in education: an exploratory study examining science knowledge retention when using augmented reality/virtual reality mobile applications. Cyberpsychology, Behavior, and Social Networking, 22(2), 105-110. https://pubmed.ncbi.nlm.nih.gov/30657334/
In this study, researchers used a between-subjects experimental design to compare learning outcomes in AR and VR. Each participant experienced only one medium (either AR or VR), which helped prevent carryover effects, such as participants learning the material once and performing better in the second condition regardless of the medium. This design made it easier to attribute differences in attention, spatial presence, enjoyment, and knowledge retention directly to the technology used rather than to practice or fatigue.
The results showed that VR increased spatial presence, attention, enjoyment, and visual information retention, whereas AR supported better auditory information retention. By separating participants into distinct groups, the study could clearly isolate how each medium uniquely influenced cognitive and psychological processes.
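A practical detail of such between-subjects designs is how participants are assigned to groups. A common approach, sketched below with made-up participant IDs, is balanced random assignment: shuffle the participant list with a fixed seed (for reproducibility) and split it evenly across conditions.

```python
import random

# Balanced random assignment for a between-subjects design:
# each participant gets exactly one condition.
# Participant IDs and group sizes are illustrative.
participants = [f"P{i:02d}" for i in range(1, 21)]  # 20 participants

rng = random.Random(42)  # fixed seed so the assignment is reproducible
shuffled = participants[:]
rng.shuffle(shuffled)

# Split the shuffled list into two equal-sized groups, one per condition.
half = len(shuffled) // 2
assignment = {p: "AR" for p in shuffled[:half]}
assignment.update({p: "VR" for p in shuffled[half:]})

counts = {c: sum(1 for v in assignment.values() if v == c)
          for c in ("AR", "VR")}
print(counts)  # equal group sizes, e.g. {'AR': 10, 'VR': 10}
```

Randomizing (rather than letting participants choose) guards against systematic differences between groups, which is exactly what lets the design attribute outcome differences to the medium.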
Next up for your group projects
Now, you need to take the above concepts and case studies into account when designing your own user evaluations. Onwards!














