👀 Eye Tracking Revisit

I gave a short talk about this at the Stockholm Xperience Conference 2021.

🤔 Intro

I often say, or what could be interpreted as urge, to the teams I have coached and to the students I’ve had through the years that “it is not a failure to miss the bull’s eye in your research every time, but it is a failure not to reflect on and learn from the method and the way you executed the study”. And revisiting old studies is crucial to actually learning from old “mistakes”. But apparently I’m a hypocrite and I do not practice what I preach.

I very rarely revisit old studies to reflect on or learn from them; I just keep on moving like nothing ever happened. But new year, new me, new habits and so on. So today I’m revisiting, reflecting and maybe I’ll learn something.

I’m revisiting an old Eye Tracking study I did back in 2021. I chose that study as the first one out because I want to start testing with Eye Tracking again, which means I need to understand how to make the most of it.

👓 Eye Tracking

Back in 2021 I worked as User Research lead at SBAB, a government-owned bank that needed to step up its game when it came to designing for inclusivity and accessibility. I’m not that interested in technology per se, but I do not shy away from experimenting with technology for the sake of understanding certain phenomena, behaviours and contexts. I’m interested in Research through design rather than Design through research. What that means is that I care more about using design artefacts and technology to understand certain behaviours than about creating products through research. One of them is much more financially lucrative than the other, guess which one is better for revenue (I do both though).

So with that said, back then I wanted to investigate whether we could detect different gaze patterns (a gaze pattern is the pattern created when we move our eyes over the screen) between users who have autism and neurotypical users. If we could detect any difference between the two target groups, we could change how we designed things and create updated frameworks for how to work with layouts and design elements, which would result in design better suited for people with cognitive disabilities. And wouldn’t that be something…

The old Tobii Glasses

Technicalities you might need to know to understand the context of this article

  • We used the Tobii Eye tracking glasses in a “lab” environment.

  • We had two target groups, one with autism and one without. We were aware that autism is a spectrum and should not be generalised like it might seem we did in this study. But recruiting this target group was not easy, so we made an active choice to keep on going and not make recruitment harder than it already was, but made sure we considered this shortcoming during analysis.

  • We recorded both screen and gaze: the screen through the built-in screen recording on the device, and the gaze through the Tobii glasses.

  • We used the method called Retrospective think-aloud protocol (RTA), which means we did not instruct the user to think out loud while doing the test. Instead we asked them to recall afterwards, while we looked at the recording together with the gaze layer on top.

  • The respondents solved a very goal-oriented task on the SBAB.se web page. The scenario was “you’re buying a home, and you want to know which interest rate you can get from SBAB”.

📈 Result

To summarise this a bit (link to full study) and cut to the chase: we found no differences in gaze patterns between the two target groups, and the theme we found was that good old UX guidelines were enough to make the layout and design more suitable for everyone. Guidelines such as:

  1. Patterns: We saw clear evidence that the F pattern and the Layer Cake pattern exist and have to be considered when designing content- and text-heavy pages.

  2. That layout really matters: In which order you put information plays a huge role in how the user interprets it. You tell a story through the order you put the content. The things that are presented first are the things they think they have to do first. Take advantage of this.

  3. Headlines matter: Like you didn’t know this. But we saw very clear patterns that most respondents relied upon the headlines themselves to guide them through the content and make sense of it, but also to quickly determine whether the page was relevant for them or not.

Layer Cake Pattern reading - link to a video

F pattern slow reading - link to a video

So this could have been one of those studies that we could have called a failure, because we could not confirm our hypothesis that there might be a difference in gaze patterns, nor could we present any new findings on designing better for people with cognitive disabilities. But we did learn some important things about the method itself. So what did we learn about the method?

  1. It includes certain target groups: target groups who otherwise might not be comfortable expressing themselves verbally, which is something we researchers rely upon all the time in “think aloud” methods. When we compared the screen recording with the eye tracking recording, the eye tracking recording was self-explanatory, if you believe in the eye-mind hypothesis, which states that whatever you’re looking at is what is on your mind. The screen recording without the gaze overlay (where you could see where the respondent gazed), on the other hand, required the respondents to think out loud to be interpreted in a valuable way.

  2. The method itself is not an inclusive method, but together with other methods it creates a more inclusive toolbox of methods we can use and share. I believe that no method is fully inclusive, there is no such thing as Universal Design or Universal Methods. But there is an Inclusive toolbox which should be extended and forever explored and added to.

🤓 What would I do differently?

Ok so what would I do differently if I did this again? Obviously a lot. I’m not bummed about the fact that we couldn’t find any differences in gaze patterns and that we didn’t change design forever (eye roll 🙄). I still think it was important for me to emphasize how valuable early work has been to the design and tech community, and that our obsession with reinventing the wheel has made us forget how well existing guidelines and frameworks still work, in this case very basic Usability Guidelines. It was also important to highlight how much inclusivity and accessibility matter when choosing a method and planning a study, not only in the design we do, but also in the research we execute. And that no method will work for everyone, but together we can create an inclusive toolbox with methods suitable for different target groups.

So these are a few of the learnings I’ve been reflecting on for this piece, but didn’t reflect upon enough right after the study or in the paper.

Predetermined metrics

Below you can see a table of some of the metrics we collected. On top of the ones in the table we had a few related more to self-assessment from the respondents when it came to their prior knowledge about finance and mortgages, and traditional SUS ratings on how it felt to interact with the interface itself. We also screened the respondents by asking questions about their digital habits and how comfortable they were with executing different bank-related tasks online.

In hindsight there were too few quantitative metrics. I quantified a lot of the qualitative data, but I should have taken more advantage of the more advanced quantitative analysis program (Tobii Pro).

So the learning here was that I should have been more thorough in determining which metrics to collect before I started the data gathering. Then I could have optimised the way I collected the data to be better suited for the software that was made for this kind of data, instead of trying to make it fit into the way I otherwise would have analysed it.
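To make "determine the metrics first" a bit more concrete, here is a minimal sketch of how three common gaze metrics (fixation count, dwell time and time to first fixation) could be computed per respondent and area of interest. This is not how we analysed the study and it is not Tobii Pro's export format: the record layout, the field names and the AOI labels are all made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Fixation:
    """One fixation in a hypothetical export (not a real Tobii format)."""
    participant: str
    aoi: str          # area of interest; labels like "headline" are made up
    start_ms: int     # time since the task started
    duration_ms: int  # how long the eye rested there


def aoi_metrics(fixations):
    """Return fixation count, dwell time and time to first fixation
    for every (participant, AOI) pair found in the input."""
    metrics = defaultdict(
        lambda: {"fixation_count": 0, "dwell_ms": 0, "first_fixation_ms": None}
    )
    for f in fixations:
        m = metrics[(f.participant, f.aoi)]
        m["fixation_count"] += 1
        m["dwell_ms"] += f.duration_ms
        # Keep the earliest start time seen for this participant and AOI.
        if m["first_fixation_ms"] is None or f.start_ms < m["first_fixation_ms"]:
            m["first_fixation_ms"] = f.start_ms
    return dict(metrics)


# Invented sample data, only to show the shape of the input.
sample = [
    Fixation("P1", "headline", 120, 310),
    Fixation("P1", "rate_table", 900, 450),
    Fixation("P1", "headline", 1500, 200),
]

for key, values in aoi_metrics(sample).items():
    print(key, values)
```

Writing down even a small list like this before data collection tells you which areas of interest you need to define and which timestamps you need to capture, instead of discovering it during analysis.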

Adapt the way you analyse to optimise the outcome. Don’t be lazy and comfortable; a study like this is too expensive not to give yourself the chance to make the most of it.

Table of the metrics we used and the sources we collected them from

Everything counts

While talking about the method itself, we thought we had everything covered. We had made sure that the respondents could take Covid tests, and we had been thorough in explaining how to get to the “lab” and what would happen. We were a bit worried about how the respondents would feel about wearing the glasses, which turned out to be no problem at all.

But the learning here was that we still could have been clearer about certain aspects of the test: not what would happen when they met us, but what the environment would be like. We had one respondent who left the building because the music was too loud when they arrived and there were too many people at the site. And talk about failures, this was a failure - to make a respondent so insecure and stressed about the test environment that they left is a failure. It is your responsibility to make sure the respondent feels completely safe and comfortable, and that starts from the first time you make contact with them. You can never explain too much.

The “lab” - meaning a room

 Learn the tool

I would say we learned the tools fairly well: we understood how to calibrate the glasses, how to make the respondents feel comfortable with the equipment and how to collect the data. But I should have put more time and effort into learning the analysis tool. That relates back to the fact that I don’t think I got the most out of this study, not just because I didn’t predetermine the metrics but also because I didn’t learn the analysis tool (Tobii Pro) well enough. So if you’re doing a study which requires you to learn a lot of new software and methods, make sure you have the time to learn it properly, otherwise wait until you have the time. It will be worth it.

🌟 Conclusion

These last things were things I did not reflect upon directly after the study, but they are things I would do differently if I did it over again. And the most important aspect of this is that accessible design is not the only thing that matters; the processes we use to produce these products matter equally. Don’t forget that, researchers out there. I do at times, so let’s remind each other of that.

So would I do it again? Of course, but better.

šŸ”— šŸ¬Links and goodies

Link to Video about the study from Stockholm Xperience Conference

Link to full Study
