Designing Alexa to Support Board Game Rule Learning for Users with Vision Impairment
Team
- Research associate (me)
- Graduate researcher* (computer science)
- Research assistant (UX design and industrial design)
- Faculty supervisor (information technology)
Skills
- Literature review
- Co-design workshop
- Thematic analysis
- A/B testing
Timeframe
- June 2022–Nov 2022
*This project was their master's thesis; I was their mentor, oversaw the entire research process, and contributed equally to all major phases.
PROBLEM
Users with vision impairment feel helpless when learning board game rules.
Users who are blind or have low vision (hereafter, BLV) rely on sighted people to learn board game rules. Most digital board game rulebooks are incompatible with screen readers because they are poorly formatted (e.g., images without alternative text).
"My text-to-speech program does not work well with PDF rule books."
- Y, from a board game community
"I cannot read the physical rule books and the digital one doesn't work with my screen reader."
- H, from a board game community
As a result of having to depend on sighted people, BLV users can feel helpless when learning the rules.
SOLUTION
Could Alexa communicate board game rules to BLV users?
We saw the potential of the voice-based assistant Alexa to address BLV users' pain point of inaccessible digital rulebooks. Alexa has "skills", which are like mobile applications that support a user's goal. With a tailored skill, Alexa can remove the need for BLV users to rely on digital rulebooks by communicating board game rules verbally.
User: "Alexa, rules."
Alexa: "OK. Wildcards come in rainbow color."
This possible usage scenario depicts an Alexa skill telling a BLV user about board game rules.
If an Alexa skill can be an alternative technology in place of a screen reader, what features should it have to communicate board game rules to BLV users effectively? Are there other BLV user pain points that we should consider? Prior research has identified inaccessible digital rulebooks as one important pain point for BLV users, and we set out to identify other pain points that could inform new features in an Alexa skill.
- Research Question 1: What are BLV people's pain points when learning board game rules?
- Research Question 2: What new features in an Alexa skill can address BLV people's pain points?
The board games market is bigger than one might think, but only a few Alexa skills for board games cater to the needs of BLV users.
- 80 Alexa skills for board games
- 2 Alexa skills for board games designed for BLV users
- 250+ million BLV users who play board games
​The predicted business outcomes for the board game publishers who offer an Alexa skill with our recommendations are:
- Increased customer satisfaction among BLV users, which can convert into increased customer retention.
- New customers of all abilities attracted by showing accessibility as a core value.
- Stronger branding by driving innovation with an inclusive Alexa skill.
- A contribution towards making Canadian society more inclusive.
OVERALL PROCESS
We led the project from research conception to publication, with the goal of developing and validating an Alexa skill prototype. The overview below summarizes our research process.
Foundational research (Study 1): research planning → background research → co-design workshops (N = 14) → thematic analysis.

Research planning
- Created a roadmap of tasks and internal deadlines for Studies 1 and 2.
- Prepared standardized testing scripts and recruitment posters.
- Prepared a demonstration video describing Alexa skills for board games.
- Designed an accessible survey to collect participant demographics.

Background research
- Researched the academic literature on the accessibility of board games to gain topic expertise and develop research questions.
- Reviewed other researchers' co-design procedures for reference.

Co-design workshops (N = 14)
- Discussed BLV participants' pain points in learning board game rules and ideated Alexa features that could support rule learning.

Thematic analysis
- Identified pain points related to rule learning and gameplay.
- Prioritized the pain point related to rule learning: learning rules causes high cognitive load.
- Identified 5 features in an Alexa skill that can reduce high cognitive load in BLV users when learning rules.
Evaluative research (Study 2): prototype development → pilot testing (N = 2) → A/B testing (N = 9) → thematic analysis.

Prototype development and pilot testing (N = 2)
- Created a high-fidelity Alexa skill with 5 features for the board game Ticket to Ride.
- Conducted pilot usability testing with 2 sighted participants and iterated the prototype based on feedback.

A/B testing (N = 9) and thematic analysis
- Conducted usability testing with BLV participants, validating the prototype.
- Identified which of the 5 Alexa features would reduce cognitive load in BLV users.
- Identified other design opportunities.
STUDY 1: RESEARCH SETUP
The graduate researcher and I contributed equally to this phase.
1. Participant Recruitment
We recruited 14 BLV participants using convenience and snowball sampling (mean age = 42.2 years). We created an accessible survey on Qualtrics.com to collect participants' personal information.
2. Demonstration Video
We created a demonstration video that described features in an Alexa skill for the board games Ticket to Ride and No Thanks. The video was a probe to elicit active brainstorming with participants.
The graduate researcher (right) and I (left) in the demonstration video.
3. Co-design Script
We created a co-design script to standardize data collection across 9 co-design sessions. The script specified:
​
1. Greeting statements to participants.
2. Instructions for 2 brainstorming activities.
3. Guiding questions to prompt participants’ ideation.
FOOD FOR THOUGHT
I've been trained to draft a detailed experimental script since I was an undergraduate psychology student. In my typical script, I write word-for-word statements that I will say to participants and actions I will take with participants. Of course, I can deviate a little from the script!
Having such a script ensures you are not introducing confounds across experimental sessions. Moreover, while I write a script, I can identify problems that can arise during data collection and revise the study procedure ahead of time to prevent the problems.
In Zoom Chat:
[Researcher]: Hello, everyone. We will be starting the session as soon as all the participants have joined.
Invite participants who will use a pseudonym in Zoom:
[Researcher]: Since you chose to use a pseudonym, you can take this time to rename your Zoom name to your chosen pseudonym.
Introduction (10 mins)
[Researcher]: Hello, everyone. Thank you for joining us for the co-design workshop today. I hope everyone is doing well today. We have a total of (4 or 5) people in this meeting today.
...
Break (10 mins)
Activity 2 (Characteristics) (20 mins)
[Researcher]: Imagine there is a board game with many rules. It has typical board game components like some cards, a physical board, and some other small components, and there is no hidden information in the game, and it has an Alexa skill.
A snippet of the study’s co-design script.
STUDY 1: CO-DESIGN WORKSHOPS
The graduate researcher ran all workshops.
Each workshop lasted 1 to 2 hours. We purposefully adopted discussion-based co-design workshops because studies have shown that BLV participants prefer discussion-based ideation techniques over tactile prototyping.
Activity 1.
Participants shared their challenges associated with rule learning of board games.
How was your experience using physical and digital rulebooks?
Tell me about how you go about learning complex rules.
What makes a poorly designed rulebook?
What is your process like when you pick up a new board game?
Activity 2.
Participants watched the demonstration video, imagined a board game with many rules, and collectively brainstormed features in an Alexa skill that could support their rule learning.
"Imagine a board game with many rules. It has typical board game components, and it has an Alexa skill.
As a BLV user, what features in the skill would help you learn the rules?"
CONSTRAINTS
We had 2 constraints while devising our data collection plan. Constraint #1 was that we wanted to recruit BLV participants with board game experience, which would slow down recruitment. Finding BLV participants in general is already hard as it is! To combat this constraint, we conducted remote co-design workshops and recruited participants globally to reach as many qualified participants as possible, as quickly as possible.
One downside of remote co-design workshops is that keeping participants engaged can be difficult, which introduced constraint #2. To combat this constraint, we created a demonstration video that richly described Alexa's current features with two popular board games. The video could increase participants' interest in ideation because it describes Alexa's interaction with board games most of them already know. We asked participants to complete a demographic survey two weeks before joining a co-design workshop and purposefully made the video with the two most frequently mentioned board games.
STUDY 1: THEMATIC ANALYSIS
The graduate researcher and I contributed equally to this phase.
This is the thematic analysis process I generally follow:
1. Review all the qualitative data.
2. Create semantic and latent codes.
3. Group codes into themes.
4. Identify insights.
We assigned latent codes (i.e., implicit meanings not stated directly by the participant) and semantic codes (i.e., direct statements made by the participant). We created a codebook for consistent coding between the graduate researcher and me.
Using the codebook, each of us coded transcripts independently, compared our (dis)agreement, and resolved discrepancies.
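To make the comparison step concrete, here is a minimal sketch of how two coders' labels could be lined up before a consensus discussion. The excerpt IDs and code labels are hypothetical, and in the study we resolved discrepancies through discussion rather than by reporting a metric.

# Hypothetical sketch: comparing two coders' codes before resolving discrepancies.
# Excerpt IDs and code labels are illustrative, not the study's actual codebook.
coder_a = {"P3-12": "reliance on sighted help", "P3-18": "memorizing rules",
           "P5-02": "screen reader failure",    "P7-09": "memorizing rules"}
coder_b = {"P3-12": "reliance on sighted help", "P3-18": "cognitive load",
           "P5-02": "screen reader failure",    "P7-09": "memorizing rules"}

shared = coder_a.keys() & coder_b.keys()
agreed = [eid for eid in shared if coder_a[eid] == coder_b[eid]]
to_discuss = sorted(shared - set(agreed))

print(f"Percent agreement: {len(agreed) / len(shared):.0%}")  # e.g., 75%
print("Excerpts to discuss and resolve:", to_discuss)         # e.g., ['P3-18']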
FOOD FOR THOUGHT
Don’t be a lone wolf in data analysis! Thematic analysis is a collaborative process. It consists of 4 steps: 1) familiarization with transcripts, 2) systematic coding, 3) coming up with potential themes, and 4) review and refine themes. In each step, you actively discuss with your collaborators, including “what did participants mean when they said this word?” “am I reading too much into what the participant said by assigning this latent code?” “which codes should we group for a potential theme?” and “these 3 themes seem to be related in this way. What do you think?” Involve your collaborators in your thematic analysis and see how their unique backgrounds can be a resource in interpreting data!
​
To create a codebook or not? Some researchers dislike creating a codebook because the practice goes against a core philosophy of thematic analysis: embracing others' (subjective) backgrounds as a resource. Other researchers also celebrate their own and others' backgrounds, but they try to minimize subjectivity by introducing consistency into the analysis.
So, the answer depends on your view of what constitutes trustworthy thematic analysis.
STUDY 1: KEY INSIGHTS
Theme 1. BLV users suffer from high cognitive load
Most participants memorized rules because digital rulebooks are not accessible with screen readers. They had to hold relevant and irrelevant game information in mind at once but could not process and remember all of it. As a result, they experienced high cognitive load (i.e., too much information being processed in working memory at once).
"There are many rules that never come into play unless specific situations happen. It will be hard to memorize stuff that I don't consistently use. I feel helpless when I am given too much information to memorize."
- E
“I can sometimes struggle. My friends can quickly read a card and play that. There are lots of different cards, and I as a blind person can't possibly memorize all the rules and react quickly."
- J
Theme 2. Alexa's features should reduce cognitive load
All participants agreed that Alexa could be a good communicator of rules. They emphasized that Alexa's speech should not cause high cognitive load in BLV users and identified 5 features in the existing Alexa skill for Ticket to Ride that can cause high cognitive load. Consequently, their ideation of desired features in a new Alexa skill centred on improving upon them.
Feature in the existing Alexa skill → Improved version in the new Alexa skill
1. Alexa tells a rule even if it does not apply at the time. → Alexa only tells a rule when needed.
2. Alexa offers a long explanation of a rule. → Alexa explains a rule concisely.
3. Alexa's music, encouragement, and game rule reminders cannot be customized. → Alexa's game music, encouragement, and rule reminders can be customized.
4. Alexa cannot pause while explaining a rule. → Alexa can pause while explaining a rule.
5. Alexa sends a reference card of the commands it understands to a BLV user's mobile phone. → Alexa sends a reference card with a rule summary to a BLV user's phone.
PROTOTYPE DEVELOPMENT
The graduate researcher and I ideated how Study 1 findings could be contextualized for Ticket to Ride; the graduate researcher then prototyped the Alexa skill.
We prototyped a new Alexa skill consisting of the five features identified in Study 1. We built the skill for Ticket to Ride because our features were direct improvements on the features in the existing Alexa skill for Ticket to Ride, so we could compare our new skill against the existing one, making the design of the subsequent A/B testing rigorous.
The graduate researcher developed a new Alexa skill, downloadable on a mobile phone or a physical Alexa device.
See this document for how each feature in our Alexa skill was implemented for Ticket to Ride.
The landing page of our Alexa skill for Ticket To Ride.
A reference card in our Alexa skill offers a rule summary.
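To give a sense of how such a skill can be wired up, here is a minimal, hypothetical sketch of a rule-lookup handler using the Alexa Skills Kit SDK for Python. The intent name, slot name, and rule text are illustrative assumptions, not the actual prototype's code.

# Hypothetical sketch of a "tell a rule only when needed" handler (Features 1 and 2),
# written with the Alexa Skills Kit SDK for Python. Names and rule text are illustrative.
from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_intent_name

# Concise, situation-specific rule snippets instead of one long rulebook read-out.
RULES = {
    "claim route": "Play cards matching the route's colour and length, then place your trains.",
    "wildcard": "A locomotive card is wild and can stand in for any colour.",
}

class RuleIntentHandler(AbstractRequestHandler):
    def can_handle(self, handler_input):
        return is_intent_name("RuleIntent")(handler_input)

    def handle(self, handler_input):
        slots = handler_input.request_envelope.request.intent.slots or {}
        topic = (slots["topic"].value or "").lower() if "topic" in slots else ""
        speech = RULES.get(topic, "Which rule would you like to hear about?")
        # Keep the session open so the player can ask follow-up questions at their own pace.
        return handler_input.response_builder.speak(speech).ask(
            "Anything else about the rules?").response

sb = SkillBuilder()
sb.add_request_handler(RuleIntentHandler())
handler = sb.lambda_handler()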
STUDY 2: RESEARCH SETUP
The graduate researcher and I contributed equally to steps 1 and 2. The graduate researcher ran all the testing sessions, and I oversaw her progress.
We conducted 9 moderated A/B testing sessions to see whether our Alexa skill prototype would be less likely to cause high cognitive load in BLV users than the existing Alexa skill.
1. Participant Recruitment
We recruited 9 BLV English-speaking adults who had some board gaming experience and access to Amazon Alexa (M = 41 years, SD = 13.3 years).
2. A/B Testing Script
We created an A/B testing script to standardize the data collection procedure across 9 sessions. The script specified:
1. Greeting statements to participants.
2. Study instructions and evaluation questions in order.
3. Study Procedure
We used audio recordings to facilitate faster data collection while retaining valid results. For each of features #1 to #4, we recorded the lead author's verbal interaction with our prototype Alexa skill (Recording 1) and with the feature's counterpart (or lack thereof) in the existing Alexa skill (Recording 2).
Once participants joined their session via Zoom, the graduate researcher played the recordings and asked participants which version of each feature (new vs. existing) would reduce their cognitive load while listening to rules from Alexa.
Example pair of recordings played to a participant (turning the background music off):

Recording 1 (our prototype skill):
Alexa plays background music.
The graduate researcher: "Alexa, music off."
Alexa: "OK, music off."
Alexa stops the background music.

Recording 2 (the existing skill):
Alexa plays background music.
The graduate researcher: "Alexa, music off."
Alexa: "Sorry, I didn't catch that."
Alexa continues playing the background music.

The order of the two recordings was counterbalanced across participants.
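As an illustration of the counterbalancing, here is a minimal sketch of one way the playback order could be assigned across the 9 participants; the participant IDs are hypothetical, and this is not the actual assignment we used.

# Hypothetical sketch of counterbalancing the playback order of the two recordings.
from itertools import cycle

participants = [f"P{i}" for i in range(1, 10)]        # 9 BLV participants
orders = cycle([("Recording 1", "Recording 2"),        # prototype skill first
                ("Recording 2", "Recording 1")])       # existing skill first

assignment = dict(zip(participants, orders))
for pid, order in assignment.items():
    print(pid, "hears", " then ".join(order))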
Participants also downloaded the Alexa app on their mobile phones and interacted with Alexa's existing and new reference cards (feature #5). We provided them with a PDF guide that outlined how to download the Alexa mobile app and how to download our skill and the existing skill for Ticket to Ride. The graduate researcher asked questions as participants physically interacted with the skill.
CONSTRAINTS
The graduate researcher and I had a constructive disagreement about which research method to use in Study 2. We had a time constraint: we had not finished prototype development, and we had to complete the study within 3 weeks. She proposed a remote field study: BLV participants would invite their friends, buy a board game, play the game with our Alexa skill prototype, and complete a survey, while one of us joined their session remotely and recorded it.
I proposed a one-hour remote A/B test because 1) we would not give participants too much "homework", decreasing the chance of drop-out; 2) we would still finish data collection on time if prototype development was delayed and we had to push data collection back by one week; 3) we would spend less of the budget; and 4) we would still get valid results that answer the research question. Each of us outlined the pros and cons of our proposed method and discussed our views, and the graduate researcher agreed that my proposed method was the better fit.
Remote field study
Pros:
- Participants interact with the prototype, so we can gather valuable behavioural insights in addition to participants' self-reported experience in a survey.
- Participants provide feedback based on their experience of the prototype as a whole.
Cons:
- We are asking too much from participants, and they might drop out; we are offloading the researchers' tasks onto participants.
- There are many confounds to control for between participants, e.g., the number of players and the level of knowledge of the board game.

Remote A/B testing
Pros:
- There is more wiggle room: we are uncertain when the prototype will be ready, and we could push data collection back by one week and still finish on time.
- We can spend less budget (we do not need to compensate participants for purchasing a board game).
Cons:
- There is no actual user interaction with the prototype, so the results would be hypothetical (i.e., the feature would reduce cognitive load vs. the feature reduced cognitive load).
- Participants test out each feature individually and do not experience the prototype as a whole.
STUDY 2: KEY INSIGHTS
The graduate researcher led the analysis.
I oversaw her progress.
Participants said all features, except the customization of Alexa's encouragement, would reduce their cognitive load when learning rules through Alexa. They also shared how each feature could be further improved.
Alexa's New Features
1. Alexa only tells a rule when needed.
2. Alexa explains a rule concisely.
3. Alexa's music, encouragement, and game rule reminders can be customized.
4. Alexa can pause while explaining a rule.
5. Alexa offers a reference card with a rule summary.

Would It Reduce Cognitive Load?
Yes for every feature except the customization of Alexa's encouragement.

What Users Think
On feature 1: "I like this feature. The more rules Alexa piles, it would just end up being more confusing to a BLV user by throwing too much at them. I'll have a brain freeze."
On feature 2: "I immediately understand the rule and I don't have to try too hard to remember long rules, which I probably will fail."
On feature 3: "If you've got the music, it's always there in the back of a BLV user's head, and it makes it harder for them to pay attention to more things."
On feature 4: "Because you can pause Alexa, and you don't have to rush in trying to remember a rule and you can pause it, relax, and find a card you're supposed to find."
On feature 5: "This card will significantly lower my cognitive load because I can learn about the rules at my own pace and whenever I want if I missed Alexa's speech."

Design Opportunities
"In addition to rules, if the board is adapted with braille, Alexa can mention labels or give tactile descriptions, so that I can find them as easily as sighted players."
"Alexa should ask if a BLV player is a novice or an experienced player in the first place and adjust its rule explanation length accordingly."
"Ideally, Alexa should save the state of the player's selected customization to save them the hassle of rework every time."
"The text on the reference card reads as one large block by the voice-over, so the card should be read as separate items on the list."
"Alexa can emphasize certain words, which can help me to remember the key words better."
FOOD FOR THOUGHT
Alexa's encouragement can be customized: why did Study 2 participants say this feature would not reduce their cognitive load? My reflection is that Alexa speech with a short duration (e.g., its encouragement) does not interfere with a BLV player's processing of rules, whereas Alexa output with a long duration (e.g., background music) interferes with their effort to process and understand rules.
OUTCOMES & LESSONS
The project was so exciting. It exercised my creativity and research skills. I was constantly asking myself, "What new features in Alexa are most important?" "How would they work with the current capabilities of Alexa?", "What are fun and accessible co-design activities for BLV participants?" I found myself truly enjoying the process of designing and strategizing user interactions with Alexa.
100% of BLV participants said that, compared to the original Alexa skill, 4 of the 5 features in our Alexa skill would reduce their cognitive load.
IMPACT
- First to propose Alexa features targeted at reducing cognitive load in BLV players.
- Proposed remote co-design activities with BLV people.
- Launched the prototype on Amazon Games & Skills.
- Made a GitHub repository that shares the prototype code with other developers.
- Published two papers at top-tier Human-Computer Interaction and Design venues (acceptance rate: 24%).
Saman Karim, Jin Kang, & Audrey Girouard. 2022. Exploring Accessibility and Empathy via Conversational Agent in Board Game Players who are Blind, or Low Vision and Sighted. Presented at the EmpathiCH'22 Workshop at the CHI Conference on Human Factors in Computing Systems (CHI 2022).
Saman Karim, Jin Kang, & Audrey Girouard. 2023. Exploring Rulebook Accessibility and Companionship in Board Games via Voiced-based Conversational Agent Alexa. In Designing Interactive Systems Conference (DIS '23). ACM.
TAKEAWAY
- Create a well-organized project plan. A good plan prioritizes what is needed at each research stage, prepares you to address constraints ahead of time, and ensures the project is completed on time.
- Understand trade-offs in decision-making. Knowing when and why to use a particular tool or research method is part of the critical thinking needed to deliver successfully.
- Pilot test a prototype with your actual user population. Have your prototype tested at an early stage by your user group to avoid accessibility issues surfacing in actual testing sessions.
- Create a list of pros and cons to communicate why one research method is the better option. Share realistic time estimates of the tasks involved in a given method.