Developing and Deploying a Real-World Solution for Accessible
Slide Reading and Authoring for Blind Users
Zhuohao (Jerry) Zhang
The Information School | DUB Group, University of Washington
Seattle, USA
zhuohao@uw.edu

Gene S-H Kim
Stanford University
Stanford, USA
[email protected]

Jacob O. Wobbrock
The Information School | DUB Group, University of Washington
Seattle, USA
wobbrock@uw.edu
ABSTRACT
Presentation software like Microsoft PowerPoint and Google Slides
remains largely inaccessible for blind users because screen readers
are not well suited to 2-D “artboards” that contain different objects
in arbitrary arrangements lacking any inherent reading order. To
investigate this problem, prior work by Zhang & Wobbrock (2023)
developed multimodal interaction techniques in a prototype system
called A11yBoard, but their system was limited to a single artboard
in a self-contained prototype and was unable to support real-world
use. In this work, we present a major extension of A11yBoard that
expands upon its initial interaction techniques, addresses numerous
real-world issues, and makes it deployable with Google Slides. We
describe the new features developed for A11yBoard for Google Slides
along with our participatory design process with a blind co-author.
We also present two case studies based on real-world deployments
showing that participants were able to independently complete
slide reading and authoring tasks that previously were not possible
without sighted assistance. We conclude with several design
guidelines for making accessible digital content creation tools.
CCS CONCEPTS
• Human-centered computing → Human computer interaction (HCI); Accessibility technologies.
ACM Reference Format:
Zhuohao (Jerry) Zhang, Gene S-H Kim, and Jacob O. Wobbrock. 2023. De-
veloping and Deploying a Real-World Solution for Accessible Slide Read-
ing and Authoring for Blind Users. In The 25th International ACM SIGAC-
CESS Conference on Computers and Accessibility (ASSETS ’23), October 22–
25, 2023, New York, NY, USA. ACM, New York, NY, USA, 15 pages. https://doi.org/10.1145/3597638.3608418
1 INTRODUCTION
People today regularly use presentation software like Microsoft
PowerPoint, Google Slides, and Apple Keynote for business, edu-
cation, and creative purposes. These software tools employ slides
based on a digital “artboard” canvas, as described by Schaadhardt
et al. [43], which can contain various objects such as text boxes,
shapes, images, videos, charts, and diagrams. For blind users, in-
terpreting existing slides and generating new ones both remain
largely inaccessible, which contributes to significant educational
and professional barriers [43]. To address these challenges, prior
work by Zhang & Wobbrock [59] developed a multi-device multi-
modal system called A11yBoard to make digital artboards accessi-
ble. Although A11yBoard shed light on interaction techniques that
make rich information in 2-D canvases accessible to read and edit
using touch, gesture, audio, speech, keyboard input, and search,
A11yBoard was limited to a proof-of-concept prototype that worked
on an open-source drawing canvas—only a single self-contained
artboard. Furthermore, A11yBoard’s evaluation was based only on
curated usability tasks in a laboratory setting. Although A11yBoard
enabled an important initial exploration of accessible artboards, it
could not support real-world use. Moreover, the literature is clear
that moving from self-contained research prototypes to real-world
field deployments not only raises practical design and engineering
issues, but also uncovers new knowledge about the problem
domain [47]. Therefore, to further our knowledge of how to design,
develop, and deploy accessible artboard creation tools, we created
A11yBoard for Google Slides, a major extension of the original self-
contained A11yBoard prototype.
A11yBoard for Google Slides is a deployable multi-device mul-
timodal system that consists of a mobile touch screen application
and a Chrome browser extension (see Figure 1). Created out of a
participatory design process with a blind co-author, A11yBoard for
Google Slides mirrors desktop slides onto a touch screen device,
and enables multimodal interactions to read and edit slide con-
tents. For example, users can employ finger-driven screen reading
[21] on the touch screen to explore slide content without fear of
altering it accidentally [43]. Audio tones and customized screen
reader outputs are displayed in response to a user’s (1) touches and
gestures, (2) speech commands through the touch screen device,
and (3) keyboard commands through an accompanying Chrome
browser extension that works exclusively on Google Slides pages.
A11yBoard for Google Slides was created through a participatory
design process with a blind co-author over multiple sessions. In
this design process, we first identified issues with the prototype
version of A11yBoard [59] and explored how presentation software
currently works with commercial screen readers.¹ We then repeat-
edly tested and improved the design through these participatory
design sessions. As a result, compared to the original A11yBoard
[59], A11yBoard for Google Slides offers more flexibility in slide
exploration, is more adaptable to blind users’ workflow and devices,
and is integrated into Google Slides’ existing features.

¹ A11yBoard for Google Slides employs its own custom speech output because existing screen readers do not handle slide contents in an accessible manner. For example, on a Microsoft PowerPoint slide, NVDA would read out objects in their Z-order, regardless of their placement on the canvas.

Figure 1: A blind user “finger reading” [21] a slide using A11yBoard for Google Slides, which consists of a browser extension and an Apple iOS app. The app shows the slide and enables touch, gesture, and speech interactions with it.
We conducted two case studies as field deployments [47] to eval-
uate A11yBoard for Google Slides in real-world applications. Two
blind participants were recruited and used the tool independently
for ve and seven days, respectively. They utilized A11yBoard to
read and recreate various slide decks, totaling 4.5 hours each, includ-
ing tutorial usage. Feedback was obtained through interviews, and
back-end log data was analyzed. Our results show that participants
were able to use A11yBoard for Google Slides to read and create
slides independently without sighted assistance, which was a first
for both of them. Our results also show that although blind users
still feel the need to seek sighted confirmation before they actually
use slides in a presentation, A11yBoard for Google Slides greatly
reduced the amount of back-and-forth when checking with sighted
collaborators. We discuss lessons learned from the design process
and evaluation that could inform the future design of assistive tech-
nologies for digital content creation. Specifically, we offer design
recommendations for making content creation on 2-D canvases
more accessible for blind users.
2 RELATED WORK
Prior work related to A11yBoard for Google Slides can be classified
into (1) exploration of blind users’ experiences with 2-D digital
content, including presentation software and beyond, and (2) non-
visual interaction techniques for blind users.
2.1 Blind Users’ Experiences with 2-D Digital
Content
Various approaches have been proposed to facilitate access for
blind and low-vision users to 2-D digital content, including digi-
tal artboards, formatted documents, visualization charts, images,
animations, and videos. Prior work on A11yBoard by Zhang &
Wobbrock [58, 59] explored multi-device multimodal interaction
techniques to make digital artboards accessible. However, their
system had limitations as it was confined to a single artboard in
a self-contained prototype, limiting real-world usefulness. Other
works also demonstrated similar efforts. AVScript [17] enabled blind
users to edit videos using text-based interactions through narration
and transcripts. VoxLens [45] provided an inclusive solution for
blind or low-vision users to interact with online data visualizations
through data sonification and speech recognition. Chart Reader [50]
used a navigation flow for screen reader users to explore and read
visualization charts through their data insights, axes, data points,
filters, etc. Machine learning models were utilized in SciA11y [53],
which extracted the scientific content from Adobe PDF files and
converted it into an accessible HTML format with additional
navigational tools to aid screen readers. Lee et al. [28] demonstrated
a multi-layered touch method for exploring digital images with
AI-generated captions. Relatedly, Zhang et al. [57] offered Ga11y
as a combined machine learning and crowdsourcing solution for
annotating animated GIF images with alt-text descriptions. Li et
al. [29] explored how blind people adopted non-visual interactions
to interact with visual artworks. Peng et al. [39–41] proposed a
series of methods to non-visually explore presentation videos, vi-
sual design changes in presentation slides, and slide content in an
automatic way. However, most of these prior approaches focused
on providing a non-visually accessible end-result for blind people
to consume, rather than giving access to blind people throughout
the authoring process (i.e., agency to dynamically author the con-
tent independently). In contrast, A11yBoard for Google Slides is
an integrated solution that focuses on both content interpretation
and content creation. This is in keeping with Ladner’s [24] call to
develop tools for people with disabilities to participate in all phases
of the design process, including in prototyping and development,
not just in user research, ideation, and evaluation [5, 15, 34].
Current presentation software tools like Microsoft PowerPoint
or Google Slides provide some built-in accessibility features for
blind people. For instance, Microsoft PowerPoint provides screen
reader support [30] for blind users to navigate through its user
interface elements, including views and ribbon tabs. It also offers
a full set of keyboard shortcuts for creating, deleting, rearranging,
and organizing slides. But these aspects of PowerPoint exist out-
side the artboard itself, which is a largely unstructured space in
which arbitrary objects can exist in any arrangement. As a result,
Microsoft PowerPoint is very difficult to use with a screen reader.
Google Slides oers more, but is still left wanting. It provides a
verbalization of a selected object’s content and formatting styles
[13]. But it still remains difficult to know which objects are present
and to select them for verbalizing. As with PowerPoint, it remains
difficult if not impossible for users to “read the artboard” to know
what content is present on it. In a related study exploring the acces-
sibility challenges of digital whiteboard tools, Fan et al. [10] found
that even when blind users were able to access individual pieces
of information on a linked-node diagram, it was cognitively de-
manding to understand the spatial relationships between individual
items, especially with a high degree of confidence.
So, although most presentation software tools provide some
accessibility features, they mainly focus on making the software
interface accessible, rather than making the 2-D artboard accessible.
In contrast, A11yBoard for Google Slides makes the 2-D canvas
accessible through non-visual touch-, gesture-, and speech-based
interactions.
2.2 Non-Visual Interactions for Blind Users
We review dierent input and output modalities that enable non-
visual interactions for blind users, including audio, tactile, haptic,
and multimodal interactions.
Assistive technologies for blind individuals often use audio in-
teractions, which include speech recognition, text-to-speech, and
non-speech audio. Voice assistants and screen readers, such as
VoiceOver [1], NVDA [35], JAWS [19], and Windows Narrator [54],
enable blind users to access visual elements through speech output.
Previous research has also explored various auditory techniques
to enhance the accessibility of virtual 2-D spaces, including user
interface design [18], graphs [4, 6, 7, 44, 45], maps [8, 9, 46], and
documents [27]. Tactile and haptic interfaces have also been shown
to support non-visual interactions for blind people. These inter-
actions provide more intuitive representations of graphical and
operational information [2, 3, 16, 22, 25, 32, 33, 38, 56]. Previous
research has also investigated various forms of tactile and haptic
feedback for blind people to interact with maps [48, 49] and graphs
[23, 52]. Multimodal designs, which combine audio and tactile in-
teractions, can create more accessible experiences for blind people
[11, 20, 42, 51, 52]. In our work here, we employed another set of
multi-device multimodal interactions that include touch, gesture,
audio, speech, keyboard, and search to create an accessible 2-D slide
reading and editing experience. For different scenarios, A11yBoard
for Google Slides may provide different interaction modalities. For
instance, a blind user can create an object in multiple ways—by
drawing it with a finger, using speech commands, or using keyboard
commands accessed via search.
3 A11YBOARD FOR GOOGLE SLIDES: A
REAL-WORLD DEPLOYMENT
We present a detailed description of the design and implementation
of A11yBoard for Google Slides. To appreciate the significant
improvements made during our iterative participatory design process,
it is necessary to describe the features of A11yBoard [59], which
provided a starting point for our current investigations. Subsequently,
we provide a summary of the design challenges and considerations
that emerged from our participatory design process. This backdrop
will then allow us to reflect on the improvements and new features
introduced in A11yBoard for Google Slides.
3.1 A11yBoard in Review
A11yBoard employed a variety of multimodal inputs and outputs.
It supported touch and gesture to enable a user to interpret an
artboard. Blind people could use one finger, the “reading finger”
[21], to explore the artboard by touching its mirrored image on the
touch screen device, receiving different audio tones as feedback
indicating whether they had entered or left an object’s borders.
Furthermore, speech output revealed objects’ shapes. While a user
explored an artboard with their “reading finger,” they could also
split-tap (i.e., a “second-finger tap” issued anywhere on the screen
while the first “reading finger” remained on the intended target [21])
to receive detailed information about objects (e.g., their positions,
to receive detailed information about objects (e.g., their positions,
sizes, and colors) as well as to select objects for further action.
When a split-tap was performed on empty space, a “dull” audio
tone was played and the empty location was selected for further
action. Other supported gestures included a two-finger directional
flick to discover nearby objects in the flick direction, and a double-
tap to traverse objects’ Z-order under the current “reading finger.”
Finally, a single-finger dwell initiated speech input, like holding
down a walkie-talkie button before speaking.
Regarding speech input, A11yBoard allows users to issue speech
commands and receive spoken feedback while their finger remains
on the screen. The feedback can be either brief or detailed, provid-
ing information about object properties such as position, size, color,
text, and the closest or farthest objects. Additionally, A11yBoard
supports editing operations through speech commands, enabling
users to create, move, and resize objects with ease. Unlike typical
drag-and-drop methods found in most artboard tools, A11yBoard
separates the moving and resizing process into two phases: First,
users indicate the object they want to move or resize, and second,
they can explore the canvas to find the desired destination, thereby
deferring the placement decision and reducing cognitive load. Fur-
thermore, A11yBoard facilitates aligning two objects when moving
or resizing one towards another.
To ensure blind users could also execute additional commands
and edit object properties, A11yBoard also offered a search-driven
keyboard interface. This interface allowed users to browse com-
mand keywords in an accessible input box and select them with a
few keystrokes. Examples of these supported commands included
“copy,” “delete,” “bring to front,” “send to back,” and many more.
3.2 Design Challenges for A11yBoard for
Google Slides
Although the original A11yBoard [59] pioneered a number of useful
interaction techniques, it was severely limited as a real-world tool,
having only one artboard in a self-contained prototype. It therefore
oered no opportunity for real-world use, let alone the ability to
support making slide decks in a commercial tool like Google Slides.
We therefore set out to create A11yBoard for Google Slides, discov-
ering in the process what was necessary for supporting real-world
use of accessible artboards. Before we present our A11yBoard for
Google Slides system, we summarize six key challenges for design-
ing accessible slide reading and authoring.
3.2.1 Limited Control over Slide Content. When working with our
blind co-author, one of the primary challenges we encountered
was how existing commercial software like Google Slides gives
us very limited control over slide content. Because of this, proto-
typing a system to control Google Slides is quite challenging. We
needed to work around the existing interface and APIs, optimizing
A11yBoard’s design to fit within the constraints of this commercial
software tool.
3.2.2 Moving from a Single Artboard to a Full Slide Deck. Another
challenge was transitioning from A11yBoard’s single artboard to a
multi-slide deck, which is the norm for commercial presentation
software. This transition required a redesign of A11yBoard’s naviga-
tion and editing features to suit this expanded format. Furthermore,
we had to develop interactions to support cross-slide operations
(e.g., copy an object from one slide for pasting onto another).
3.2.3 More Complex Slide Reading and Authoring Needs. The de-
sign of a new A11yBoard experience for blind users needs to con-
sider the more complex reading and authoring needs of slide decks.
These needs may include more complex shapes and diagrams, such
as arrows and lines, and a large number of slides. Moreover, the sys-
tem needs to support a variety of operations required for creating,
modifying, and presenting slides.
3.2.4 Multi-Device Interference. A11yBoard is a multi-device multi-
modal system, and as such, there is potential interference of screen
readers across dierent devices. This can cause confusion for the
user and interference with the audio output as two screen readers
may speak simultaneously. Therefore, in designing A11yBoard, we
must consider ways to mitigate the potential for screen reader inter-
ference across multiple devices. We must ensure that A11yBoard’s
audio output is clear and concise, regardless of whether the user is
accessing it through their desktop or mobile device, and that it does
not interfere with other screen readers the user might be using.
3.2.5 Balancing Efficiency and Expressiveness. Efficiency and ex-
pressiveness are two competing priorities that need to be balanced
when designing A11yBoard. For example, creating a connector be-
tween two objects can have multiple options including whether a
line is straight or curved, which line ends have arrows, what are
the arrow styles, and what are the line widths and line dash styles.
It takes a great amount of unnecessary effort for blind users to
indicate these visual properties before creating a single connector.
The system must be efficient enough to allow users to complete
tasks quickly, while also ensuring that the user’s preferences can
be expressed fully.
3.2.6 Supporting Individual Differences in Perceptions. Finally,
individual differences in perceptions of 2-D artboard information
must be considered. Users might have different preferences and
requirements for how A11yBoard should work, like reporting values
in different metrics (e.g., inches, centimeters, or pixels), or creating
and placing objects using different methods or sequences. These
individual differences must be accommodated insofar as possible.
Therefore, our new A11yBoard system must be customizable and
adaptable to suit the diverse needs of blind users.
3.3 Overview: A11yBoard for Google Slides
A11yBoard for Google Slides enhances accessibility with extra inter-
actions for exploring and editing slides. It consists of a web browser
extension for Chrome and Firefox, and a mobile app for iOS devices.
User authentication involves a four-digit code displayed on the
extension, which is entered into the iOS app to connect. The server
retrieves slide content using the Google Slides API [14] and sends
it to the app for rendering basic shapes. Non-visual authoring oper-
ations are validated and applied via HTTP requests to the Google
Slides API.
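To make this flow concrete, the sketch below shows how a relay of this kind might fetch a slide's objects and apply one authoring edit through the Google Slides REST API. The endpoint paths follow the public API, but the helper names, the simplified field handling, and the omission of error handling are our own illustrative assumptions rather than the deployed server code.

```typescript
// Minimal sketch (TypeScript) of a relay between the extension and the
// Google Slides REST API. Helper names and error handling are illustrative
// assumptions, not the deployed A11yBoard server code.

const API = "https://slides.googleapis.com/v1/presentations";

// Fetch every object on a given slide so the mobile app can render it.
async function fetchSlideObjects(presentationId: string, slideIndex: number, token: string) {
  const res = await fetch(`${API}/${presentationId}`, {
    headers: { Authorization: `Bearer ${token}` },
  });
  const presentation = await res.json();
  // Each page element carries an objectId, a size, a transform, and a
  // type-specific payload (shape, line, image, ...) used to render it on the app.
  return presentation.slides[slideIndex].pageElements ?? [];
}

// Apply one authoring operation, here creating a text box at a touch position.
async function createTextBox(presentationId: string, pageObjectId: string,
                             xPt: number, yPt: number, token: string): Promise<void> {
  const body = {
    requests: [{
      createShape: {
        shapeType: "TEXT_BOX",
        elementProperties: {
          pageObjectId,
          size: { width: { magnitude: 200, unit: "PT" }, height: { magnitude: 50, unit: "PT" } },
          transform: { scaleX: 1, scaleY: 1, translateX: xPt, translateY: yPt, unit: "PT" },
        },
      },
    }],
  };
  await fetch(`${API}/${presentationId}:batchUpdate`, {
    method: "POST",
    headers: { Authorization: `Bearer ${token}`, "Content-Type": "application/json" },
    body: JSON.stringify(body),
  });
}
```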
The touch screen device supports touch, gesture, speech input,
and speech and audio output. Reading operations like selecting
an object or switching slides automatically place the focus on the
Google Slides’ artboard for further editing.
The system supports speech interactions for accessing detailed
object properties and relationships. An intelligent keyboard search
interface handles complex operations not easily done via touch,
gesture, or speech. Customizable speech outputs are generated in
response to user actions.
To avoid conicts with multi-device screen readers, our system
uses a custom text-to-speech technique on the touch screen, allow-
ing desktop screen readers like NVDA and JAWS to work alongside
it. This approach ensures all visual elements like text input in the
browser extension are accessible to screen readers without inter-
fering with touch and gesture inputs.
3.4 Supported Interactions
We now present A11yBoard’s supported interactions in detail, orga-
nized by input modality: (1) touch and gesture, (2) speech commands
and corresponding feedback, and (3) intelligent keyboard search.
3.4.1 Touch and Gesture. Similar to the original A11yBoard [59],
A11yBoard for Google Slides also comprises a mobile application
that runs on a touch screen device, providing a safe way for blind
users to spatially read slides without fear of accidentally altering
them [43]. The objects on the current slide will be shown on the
touch screen, enabling exploration via touch and gesture (see Fig-
ure 2).
Interpretive Touch and Gestures. A11yBoard for Google Slides
supports single-nger reading to explore the slide and a second-
nger split-tap to select an object and access more detail. However,
audio tone and speech feedback in A11yBoard for Google Slides
have been signicantly improved over A11yBoard [
59
] to address
the design challenges in Section 3.2.
When exploring a slide using a “reading finger,” A11yBoard for
Google Slides employs a layered method to notify users about
objects’ Z-order using different audio tones. For example, users hear
a “step-up” sound (notes F-B) when entering an object from the
empty canvas. When the user enters an object that overlaps that
object, users hear a higher “step-up” sound (notes G-C), indicating
that they have entered another object in a “higher” place in the
Z-order. The audio tones get progressively higher as users “step-up”
into more and more overlapping objects. The same scheme works
in the opposite direction when users “step-out” of an object into
another overlapped object. Compared to the original A11yBoard [59],
which only had a single step-up and step-down sound, A11yBoard
for Google Slides provides much more spatial information by adding
richer Z-order feedback.
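As an illustration of this layered feedback, the sketch below maps the overlap depth under the reading finger to a two-note cue. The Web Audio API usage and any note pairs beyond the F-B and G-C examples named above are assumptions for demonstration, not the app's actual audio code.

```typescript
// Sketch of the layered Z-order audio cues: each additional overlap level
// plays a two-note "step" cue at a higher pitch. The pitch table beyond the
// paper's F-B / G-C examples is an assumed continuation of the pattern.

const audioCtx = new AudioContext();

// Note pairs for successive overlap depths; depth 0 -> F-B, depth 1 -> G-C, ...
const STEP_NOTES_HZ: [number, number][] = [
  [349.23, 493.88], // F4 -> B4
  [392.0, 523.25],  // G4 -> C5
  [440.0, 587.33],  // A4 -> D5 (assumed continuation)
];

function playTone(freq: number, startAt: number, duration = 0.12): void {
  const osc = audioCtx.createOscillator();
  osc.frequency.value = freq;
  osc.connect(audioCtx.destination);
  osc.start(startAt);
  osc.stop(startAt + duration);
}

// Called whenever the finger crosses an object border. `depth` is how many
// overlapping objects now lie under the finger; `entering` distinguishes the
// step-up cue from the reversed step-down cue.
function playZOrderCue(depth: number, entering: boolean): void {
  const pair = STEP_NOTES_HZ[Math.min(depth, STEP_NOTES_HZ.length - 1)];
  const [first, second] = entering ? pair : [pair[1], pair[0]];
  const now = audioCtx.currentTime;
  playTone(first, now);
  playTone(second, now + 0.13);
}
```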
A11yBoard for Google Slides provides much more detailed
reporting for different types of objects. In the original version of
A11yBoard [59], when a split-tap happens, all objects are reported
with their color, location, and size. However, different object types
serve different purposes and should be reported in different ways.
For example, in addition to color, location, and size, A11yBoard
for Google Slides reports any text inside a shape or a text box. If
there are long paragraphs inside an object, A11yBoard for Google
Slides will intelligently report a title, a first sentence, or a first bullet
point to represent that content. For other objects like a connector,
A11yBoard for Google Slides reports what objects are connected
by it, and where the starting and ending points are, which matter
more than the connector’s location and size. An example of speech
output for a connector is: “A curved line connecting a text box at
top-left corner to a round rectangle at bottom-right corner.”

Figure 2: Eight touch- and gesture-based interactions, including (a) single-finger exploration to spatially “read” artboard objects, (b) split-tap to select an object and access more detail, (c) two-finger dwell to initiate speech recognition, (d) two-finger flick to reveal nearby objects in a given direction, (e) double-tap to step through the Z-order of overlapping objects, (f) three-finger swipe right/down to switch to the previous slide, and swipe left/up to switch to the next slide, (g) single-finger tap four times to update the slide for any other changes, and (h) single-finger triple-tap to start creating an object, followed by a unistroke object drawing.
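The type-specific split-tap reporting described above amounts to a dispatch over object kinds, roughly as in the sketch below; the object model, helper names, and exact phrasing are illustrative assumptions, not the deployed implementation.

```typescript
// Sketch of type-specific split-tap reporting: shapes and text boxes surface
// their (abbreviated) text, while connectors report what they join. The object
// model and phrasing are illustrative assumptions.

interface ArtboardObject {
  kind: "shape" | "text box" | "connector";
  color?: string;
  position?: string;          // already verbalized elsewhere, e.g. "top-left corner"
  size?: string;
  text?: string;
  curved?: boolean;           // connectors only
  fromDesc?: string;          // e.g., "a text box at top-left corner"
  toDesc?: string;            // e.g., "a round rectangle at bottom-right corner"
}

// Long passages are abbreviated to a representative first sentence or bullet.
function summarizeText(text: string): string {
  const firstLine = text.split("\n")[0];
  const firstSentence = firstLine.split(". ")[0];
  return firstSentence.length > 120 ? firstSentence.slice(0, 120) + "…" : firstSentence;
}

function describeObject(obj: ArtboardObject): string {
  if (obj.kind === "connector") {
    const style = obj.curved ? "curved" : "straight";
    return `A ${style} line connecting ${obj.fromDesc} to ${obj.toDesc}.`;
  }
  const name = [obj.color, obj.kind].filter(Boolean).join(" ");
  const base = `A ${name} at ${obj.position} with size ${obj.size}.`;
  return obj.text ? `${base} It says: ${summarizeText(obj.text)}` : base;
}
```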
Similar to A11yBoard [59], when an object is selected via split-
tap on the touch screen app, that object will become selected on
the Google Slide in the desktop web browser. Unlike in A11yBoard
[59], where users needed to open the keyboard search interface
], where users needed to open the keyboard search interface
to perform actions like typing text, they can now perform direct
operations on objects, like pressing the Enter key to start typing
text into an object.
To support navigating through a slide deck, A11yBoard for
Google Slides added a new gesture, a three-finger swipe that switches
to the previous slide (by swiping right) or next slide (by swiping
left), which is consistent with gestures to navigate pages on the
iOS home screen, apps in the app switcher, or images in the Photos
app. When users arrive at a new slide, A11yBoard for Google Slides
will report the current slide number and an overview of the slide,
which includes the number of different objects on the slide.
Another new gesture added to A11yBoard for Google Slides is a
single-finger quadruple-tap to actively refresh the touch screen
device’s view of the current slide. This gesture is for situations when
a user makes a change to the current slide using the desktop web
browser outside of what A11yBoard provides. After a quadruple-
tap, A11yBoard retrieves the current slide’s contents and refreshes,
providing feedback with a spoken “slide updated” response. Al-
though this situation arises rarely, this gesture provides a way of
forcing the iOS screen to refresh.
We decided to employ the original A11yBoard’s other gestures,
like a two-nger directional swipe to discover the closest object in
a given direction, and a double-tap to traverse the Z-order under
the current nger location. These gestures were all reported to be
useful and straightforward [59].
Generative Gestures. In addition to touch and gestures that serve
to interpret slides, A11yBoard for Google Slides also supports a
single-finger triple-tap to start creating objects by drawing. After
the triple-tap, A11yBoard gives a speech-based notification, “start
drawing,” to inform users that they should start drawing an object
on the canvas. A11yBoard for Google Slides will then recognize the
drawn shape by using the $1 unistroke recognizer [55] and then
fitting a beautified shape to the drawn trace. The supported shapes
are shown in Figure 3. Note that to distinguish between a text box
and a rectangle, blind users can draw a big “T” unistroke to represent
a text box. The horizontal line represents the top side of the text
box, with the width as drawn. The vertical line represents the
height of the text box. A “triple-tap” is required before drawing any
object to ensure that blind users can still explore the slide in a risk-
free way without worrying about accidentally drawing an object
on the canvas. Even if users accidentally trigger object-drawing,
A11yBoard for Google Slides also supports a big “X” drawing to
cancel the current operation.
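A sketch of what happens after recognition is shown below: the recognized unistroke label is dispatched to either a cancellation or an object-creation path, with the drawn trace's bounding box used to size the new object. We assume the $1 recognizer returns a template label; the dispatch and the helper functions are illustrative, not the app's actual code.

```typescript
// Sketch of turning a recognized unistroke into a slide object. We assume the
// $1 recognizer has already produced a template label for the drawn trace.

interface Point { x: number; y: number; }
interface Box { x: number; y: number; width: number; height: number; }

declare function speak(text: string): void;                    // app's speech output (assumed)
declare function createObject(type: string, box: Box): void;   // relays the edit to Google Slides (assumed)

// A beautified shape is fitted to the drawn trace's bounding box; for the "T"
// unistroke this yields the text box's width (horizontal stroke) and height
// (vertical stroke) as drawn.
function boundingBox(trace: Point[]): Box {
  const xs = trace.map(p => p.x), ys = trace.map(p => p.y);
  const x = Math.min(...xs), y = Math.min(...ys);
  return { x, y, width: Math.max(...xs) - x, height: Math.max(...ys) - y };
}

function handleUnistroke(label: string, trace: Point[]): void {
  if (label === "x") {                       // big "X" cancels the pending drawing
    speak("Drawing canceled.");
    return;
  }
  const known = ["rectangle", "triangle", "ellipse", "text box", "line", "arrow"];
  if (!known.includes(label)) {
    speak("Shape not recognized. Please try again.");
    return;
  }
  createObject(label, boundingBox(trace));
  speak(`${label} created.`);                // e.g., "text box created."
}
```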
3.4.2 Speech-Based Interactions. A11yBoard for Google Slides
supports similar speech-based interactions as the original A11yBoard
[59], which are initiated by dwelling two fingers on the screen.
To improve the user experience, we added more audio and speech
feedback when users are talking to the system. For example, when
the system stops talking and starts recording again, or when the
system takes some time to process HTTP requests with the Google
Slides API, users hear audio feedback like a clock ticking sound.
A11yBoard for Google Slides also further enhances the range of
Figure 3: Seven supported shapes that can be drawn as unistrokes [12], including a rectangle, triangle, ellipse, text box, line, arrow, and cancel the current operation. These unistrokes are recognized with the $1 gesture recognizer [55].
speech outputs by allowing customization. Users can set a speech
output mode via a keyboard command. Available modes and their
corresponding examples are listed in Table 1.
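To illustrate how such a setting might change what users hear, the sketch below formats an object's horizontal position under several of the modes in Table 1. The 96 pixels-per-inch conversion and the small fraction vocabulary are assumptions for demonstration; the deployed system's exact wording follows Table 1.

```typescript
// Sketch of the configurable reporting modes in Table 1, shown for an object's
// horizontal position. Conversion factors and fraction vocabulary are assumed.

type ReportMode = "percentage" | "fraction" | "pixels" | "inches" | "centimeters";

const PX_PER_INCH = 96; // assumed CSS pixel density

function describeX(xPx: number, slideWidthPx: number, mode: ReportMode): string {
  switch (mode) {
    case "percentage":
      return `${Math.round((xPx / slideWidthPx) * 100)}% of canvas width`;
    case "fraction": {
      // Snap to the nearest of a small set of spoken fractions (quarters, thirds, eighths).
      const fractions: [number, string][] = [
        [1 / 8, "one eighth"], [1 / 4, "one quarter"], [1 / 3, "one third"],
        [1 / 2, "one half"], [2 / 3, "two thirds"], [3 / 4, "three quarters"],
      ];
      const ratio = xPx / slideWidthPx;
      const [, label] = fractions.reduce((best, cur) =>
        Math.abs(cur[0] - ratio) < Math.abs(best[0] - ratio) ? cur : best);
      return `from left, about ${label} of canvas width`;
    }
    case "pixels":
      return `${Math.round(xPx)} pixels from the left`;
    case "inches":
      return `${(xPx / PX_PER_INCH).toFixed(1)} inches from the left`;
    case "centimeters":
      return `${((xPx / PX_PER_INCH) * 2.54).toFixed(1)} centimeters from the left`;
  }
}

// Example: describeX(240, 960, "fraction") -> "from left, about one quarter of canvas width"
```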
We divide all speech commands into two categories (see Table 2).
First, based on their purpose, the commands can be categorized into
Interpretive or Generative commands, meaning those that help
users interpret existing artboard content or generate new artboard
content, respectively. Second, depending on whether a command
would access or operate on a single object or two objects, the
commands can be categorized as Unary or Binary, respectively.
Interpretive unary speech commands include commands like
“position,” “size,” “left,” “right,” “top,” “bottom,” “width,” “height,” and
“color,” which give the requested properties according to the current
reporting mode (see Table 1). An additional keyword, “exact,” can be
appended to retrieve more precise information. For the “position”
and “size” commands, appending “exact” causes the speech output
to give exact pixel values. (An exception is when users set the
speech output mode to an absolute value in inches or centimeters;
appending “exact” gives the output in exact metric values accord-
ingly.) If “color exact” is issued, then RGB values will be reported
instead of color names.
Interpretive binary commands include “closest” and “farthest,”
which report the closest or farthest object, and its direction, from
the selected object or current finger position. An example output is,
“The closest object is a text box to the south-southwest.” A11yBoard
for Google Slides uses the closest named directions to report an
object’s approximate direction. Similar to interpretive unary com-
mands, “exact” can be added after the commands to learn about
an object’s position and size in pixels, and its direction in degrees.
Furthermore, a number can be added after the commands to learn
about a number of objects instead of only one. For example, “closest
two” requests information about the two closest objects to the
finger’s position, reported in increasing distance.
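A possible implementation of this query is sketched below: rank objects by distance from the finger (or the selected object) and name the nearest 16-point compass direction. The object model and the sector boundaries are illustrative assumptions.

```typescript
// Sketch of the "closest" speech command: rank objects by distance and report
// the nearest named compass direction (or degrees when "exact" is appended).

interface SlideObject { id: string; kind: string; cx: number; cy: number; }

const COMPASS = [
  "east", "east-northeast", "northeast", "north-northeast",
  "north", "north-northwest", "northwest", "west-northwest",
  "west", "west-southwest", "southwest", "south-southwest",
  "south", "south-southeast", "southeast", "east-southeast",
];

// Screen y grows downward, so negate dy to get conventional compass angles.
function bearingDegrees(dx: number, dy: number): number {
  return (Math.atan2(-dy, dx) * 180 / Math.PI + 360) % 360; // 0 = east, 90 = north
}

function compassName(dx: number, dy: number): string {
  return COMPASS[Math.round(bearingDegrees(dx, dy) / 22.5) % 16];
}

// "closest two" -> count = 2; appending "exact" reports degrees instead of names.
function closest(from: { x: number; y: number }, objects: SlideObject[],
                 count = 1, exact = false): string[] {
  return objects
    .map(o => ({ o, dx: o.cx - from.x, dy: o.cy - from.y }))
    .sort((a, b) => Math.hypot(a.dx, a.dy) - Math.hypot(b.dx, b.dy))
    .slice(0, count)
    .map(({ o, dx, dy }) =>
      exact
        ? `a ${o.kind} at ${Math.round(bearingDegrees(dx, dy))} degrees`
        : `The closest object is a ${o.kind} to the ${compassName(dx, dy)}.`);
}
```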
A11yBoard for Google Slides also supports a variety of generative
commands to create and edit objects. A11yBoard for Google
Slides supports generative unary commands like “create,” which
enables the creation of different types of default objects under the
dwelling finger (e.g., “create text box”). Besides creating an object,
“move here” and “resize here” are also supported to move or resize
an object to a specific position. Particularly, as was described in
Section 3.1, “move here” and “resize here” trigger a two-phase
process. First, users say “move” or “resize” to initialize the
moving or resizing process for a selected object or on an empty
position. For “resize” specifically, users need to indicate a resizing
handle, either a corner (e.g., “top-left”) or an edge (e.g., “bottom”),
by explicitly speaking this handle name after the “resize” command.
Second, users can continue exploring the slide using the full set of
touch, gesture, and speech interactions until they find a suitable
destination. Users can say “here” to complete the moving or resizing
operation.
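The deferred two-phase flow described above can be modeled as a small pending-operation state, as in the sketch below; the names and state shape are illustrative assumptions rather than the app's actual code.

```typescript
// Sketch of the deferred two-phase move/resize: the spoken command records a
// pending operation, and "here" completes it at the current finger position.

type Handle = "top-left" | "top-right" | "bottom-left" | "bottom-right"
            | "top" | "bottom" | "left" | "right";

interface PendingOp {
  kind: "move" | "resize";
  objectId: string;
  handle?: Handle; // only for resize
}

let pending: PendingOp | null = null;

// Phase 1: "move" or "resize <handle>" on the currently selected object. The
// user is then free to keep exploring the slide before committing.
function beginOperation(kind: "move" | "resize", objectId: string, handle?: Handle): void {
  pending = { kind, objectId, handle };
}

// Phase 2: "here" at the finger's current position commits the operation.
function completeHere(fingerX: number, fingerY: number): void {
  if (!pending) { speak("No pending operation."); return; }
  if (pending.kind === "move") {
    moveObject(pending.objectId, fingerX, fingerY);
    speak("Object moved.");
  } else {
    resizeObject(pending.objectId, pending.handle!, fingerX, fingerY);
    speak("Object resized.");
  }
  pending = null;
}

declare function speak(text: string): void;                                        // assumed TTS helper
declare function moveObject(id: string, x: number, y: number): void;               // relays to the Slides API
declare function resizeObject(id: string, handle: Handle, x: number, y: number): void;
```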
A11yBoard’s generative binary commands, i.e., those that work
on two objects while authoring slide content, include “move to
align,” “resize to align,” and “connect with this.” Similar to “move
here” and “resize here,” after an object has been created or selected,
the “move,” “resize,” and “connect” commands enable the object
to be moved, resized, or connected to another object on the slide.
Generative binary speech commands trigger a two-phase process.
After users initiate the moving, resizing, or connecting process by
saying the relevant speech command, they can continue exploring
the slide until they find another object to align with or connect to.
Note that the “here” command is used when users try to move or
resize an object to a specific location, whereas the “align” command
is used when users try to move or resize an object to align with
another object’s edge. For example, by saying “align left to left,”
A11yBoard for Google Slides will resize or move the first selected
object’s left side to be aligned with the currently selected object’s
left side. For “connect,” users can say “with this” to indicate the
object to which they want to connect a first object.
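The geometry behind "align [edge] to [edge]" is simple to state; the sketch below shows the move variant, shifting the pending object so its named edge lines up with the named edge of the target. The Box type and helper are illustrative assumptions.

```typescript
// Sketch of "align [edge] to [edge]" (move variant): shift the first object so
// its named edge coincides with the named edge of the currently selected target.

type Edge = "left" | "right" | "top" | "bottom";

interface Box { x: number; y: number; width: number; height: number; } // top-left origin

function edgeCoordinate(box: Box, edge: Edge): number {
  switch (edge) {
    case "left":   return box.x;
    case "right":  return box.x + box.width;
    case "top":    return box.y;
    case "bottom": return box.y + box.height;
  }
}

// "align left to left": sourceEdge = "left", targetEdge = "left".
function moveToAlign(source: Box, sourceEdge: Edge, target: Box, targetEdge: Edge): Box {
  const delta = edgeCoordinate(target, targetEdge) - edgeCoordinate(source, sourceEdge);
  const horizontal = sourceEdge === "left" || sourceEdge === "right";
  return horizontal
    ? { ...source, x: source.x + delta }   // shift horizontally, size unchanged
    : { ...source, y: source.y + delta };  // shift vertically, size unchanged
}

// Example: moveToAlign({x: 300, y: 80, width: 120, height: 60}, "left",
//                      {x: 100, y: 200, width: 200, height: 90}, "left")
//          -> { x: 100, y: 80, width: 120, height: 60 }
```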
A separate command is “help,” which can be used independently
or in tandem with any other command. When used independently,
A11yBoard for Google Slides will give a quick introduction of avail-
able speech commands, which can be stopped at any time by lifting
the fingers to exit the speech interaction mode. When used in tan-
dem with other commands, A11yBoard for Google Slides will give a
tutorial on how to use the given command, followed by an example.
3.4.3 Intelligent Keyboard Search. To support additional commands
that are not easily completed through touch, gesture, and speech,
A11yBoard for Google Slides also supports an intelligent keyboard
search interface via a browser extension pop-up window, which can
be initiated with a preset keyboard shortcut. The interface consists
only of a search text box, which embeds a list of supported com-
mands to be selected and executed. Users do not need to remember
keywords for this search interface; they only need to type a few
characters related to their command. The keyboard commands can
also be divided into two categories: Direct editing commands and
Navigation commands (see Table 3).
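The sketch below illustrates one way such a search box might rank commands from a few typed characters; the command list mirrors Table 3, but the scoring scheme is an illustrative assumption.

```typescript
// Sketch of the search-driven command box: the user types a few characters
// and the extension ranks matching commands for selection.

const COMMANDS = [
  "report mode", "create", "fill color", "border color", "border width",
  "font family", "font size", "bring to front", "bring forward",
  "send to back", "send backward", "insert image",
];

// Rank commands: whole-string prefix matches first, then word-prefix matches,
// then substring matches; everything else is filtered out.
function searchCommands(query: string): string[] {
  const q = query.trim().toLowerCase();
  if (!q) return COMMANDS;
  const score = (cmd: string): number => {
    if (cmd.startsWith(q)) return 0;
    if (cmd.split(" ").some(w => w.startsWith(q))) return 1;
    if (cmd.includes(q)) return 2;
    return Infinity;
  };
  return COMMANDS
    .filter(c => score(c) !== Infinity)
    .sort((a, b) => score(a) - score(b));
}

// Example: searchCommands("fro") -> ["bring to front"];
//          searchCommands("bo")  -> ["border color", "border width"]
```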
The direct editing commands are implemented as a simplified
way to edit objects, like “create,” “speech output mode,” “bring to
front,” “bring forward,” “send to back,” and “send backward.” By
selecting “create,” users are prompted to type in the object type,
which creates a default object at the center of the slide. This ap-
proach serves as an alternative way to create objects, along with
Table 1: Seven supported speech output modes that can be set to adjust the speech output style and detail, including a brief reporting mode, a detailed reporting mode, a mode that reports properties using relative percentages, one that reports properties using relative fractions, and three others that report properties in absolute values using pixels, inches, or centimeters.

Speech Output Mode | Example Output
Report briefly | “Text box created.”
Report in detail | “A new slide created at page 9 with two text boxes inside.”
Relative in percentage | “Ellipse moved to 20% of canvas width, 30% of canvas height, with size of 15% by 30%.”
Relative in fraction | “From left, about one quarter of canvas width; from top, about two thirds of canvas height; from right, one quarter; from bottom, one eighth.”
Absolute in pixels | “Nearest object at 45 degrees is a text box at (528, 491) with size of 200 by 100.”
Absolute in inches | “Triangle created at 2.7, 3.9 inches with bounding box of 1.5 by 2.0 inches.”
Absolute in centimeters | “Rectangle resized to 3.5, 23.6 centimeters with size of 15.5 by 21.3 centimeters.”
Table 2: Speech commands for interpreting and generating objects, including their types, functions, and usage.

Speech Command | Type | Function | Usage
Position (or Left, Right, Top, Bottom) | Interpretive, Unary | Report position of an object | Use directly or append “exact”
Size (or Width/Height) | Interpretive, Unary | Report size of an object | Use directly or append “exact”
Color | Interpretive, Unary | Report color of an object | Use directly or append “exact”
Closest | Interpretive, Binary | Report closest object(s) of an object or a position | Use directly, append “exact,” and/or append a number
Farthest | Interpretive, Binary | Report farthest object(s) of an object or a position | Use directly, append “exact,” and/or append a number
Create | Generative, Unary | Create an object by type | Append a supported type
Move (A to) here | Generative, Unary | Move an object to a position | Use “here” at final destination to trigger
Resize (A to) here | Generative, Unary | Resize an object to a position | Append a handle after “resize,” use “here” at destination
Move (A to) align (with B) | Generative, Binary | Move to align with another object | Use “align [edge] to [edge]” at target object
Resize (A to) align (with B) | Generative, Binary | Resize to align with another object | Use “align [edge] to [edge]” at target object
Connect (A) with this (B) | Generative, Binary | Connect with another object | Use “with this” at target object
Overview | Interpretive | Report overview of the current slide | Use directly
Help | N/A | Report tutorial of any given command or in general | Use directly or with any other command
speech commands or finger-drawing an object’s shape. Users can
also change the speech output style to one of the modes in Table 1.
Other direct editing commands are used to change objects’ Z-order.
They can move an object forward or backward, or send an object
to the top or bottom layer. We did not include commands to copy,
paste, or delete objects because Google Slides already supports
copy, paste, and delete with typical keyboard shortcuts like Ctrl+C,
Ctrl+V, and backspace. These shortcuts are already made accessible
for blind users.
Another type of keyboard command is used for navigation. In-
stead of developing a complex editing interface on our own, we
utilize the existing interfaces in Google Slides and enable users to
navigate to their desired panel by simulating mouse clicks. These
panels already contain accessible elements that are labeled for
screen readers, but are usually hard to navigate inside the com-
plex visual interface. For example, when an object is selected, users
can type in “ll color” to navigate to the ll color panel of this
selected object. Users can then select from the pre-dened colors
with color name labels or type in exact RGB values. The list of
supported commands include “ll color, “border color, “border
width, “font family, “font size, etc.
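One plausible way to implement such a navigation command in a browser extension is to locate the relevant control by its accessible name and activate it, as sketched below. The aria-label matching and single-click activation are assumptions about Google Slides' markup (which is undocumented and can change at any time), not a documented interface.

```typescript
// Sketch of a "navigation" command: instead of rebuilding an editing UI, move
// the user into Google Slides' own screen-reader-labeled panel by activating
// the control whose accessible name matches the typed command.

function openPanel(labelFragment: string): boolean {
  // Look for a control whose accessible name contains the command text,
  // e.g. "fill color" or "border width". The exact labels are assumptions.
  const candidates = Array.from(document.querySelectorAll<HTMLElement>("[aria-label]"));
  const target = candidates.find(el =>
    (el.getAttribute("aria-label") ?? "").toLowerCase().includes(labelFragment.toLowerCase()));
  if (!target) return false;

  // Some toolbar widgets respond to mousedown/mouseup rather than click, so a
  // robust version would dispatch the full sequence; a plain click is shown here.
  target.dispatchEvent(new MouseEvent("click", { bubbles: true, cancelable: true }));
  target.focus();
  return true;
}

// Example: typing "fill color" into the search box might call openPanel("fill color"),
// after which the user's desktop screen reader takes over inside the opened panel.
```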
Table 3: Twelve supported keyboard commands, including their types and functions, that help blind users edit a slide.

Keyboard Command | Type | Function
Report mode | Direct editing | Change the reporting mode as described in Table 1
Create | Direct editing | Create an object with a type selected in a second input box
Fill color, Border color, Border width, Font family, Font size | Navigation | Navigate to the corresponding panel
Bring to front or forward | Direct editing | Bring the current object to front or forward stepwise
Send to back or backward | Direct editing | Send the current object to back or backward stepwise
Insert image | Navigation | Navigate to the insert image window
4 PARTICIPATORY DESIGN OF A11YBOARD
FOR GOOGLE SLIDES
In this section, we take a step back to present the participatory
design process of A11yBoard for Google Slides. To reconsider and
improve upon the original design of A11yBoard [59] in making it
suitable for real-world use, we carried out a series of participatory
design sessions with the involvement of a blind co-author, GK. The
objective of these sessions was twofold: first, to foreground the
challenges that must be addressed to enable A11yBoard to work
as a real-world tool within Google Slides (see Section 3.2); and
second, to develop and refine a fully functional system that would
be well-suited to the needs of blind users.
4.1 Method
GK, a co-author on this paper, collaborated with the other authors
over the entire design process. GK was born legally blind and has
been completely blind for seven years. He is an undergraduate
majoring in Symbolic Systems² with years of experience using
presentation software like Microsoft PowerPoint and Google Slides
presentation software like Microsoft PowerPoint and Google Slides
with a conventional screen reader. GK’s expertise in using these
tools was gained through his extensive use of presentation software
in college courses, where he has often collaborated with classmates
to deliver live presentations in class.
Our participatory design process contained two main stages.
First, we conducted an interview with GK to create a shared un-
derstanding of how we should design A11yBoard [59] for real-
world presentation needs rather than just as a self-contained pro-
totype with a single artboard. The interview began by reviewing
A11yBoard’s existing features. GK offered reactions and suggestions
while exploring each feature.
Second, we designed and implemented an initial prototype of
A11yBoard for Google Slides over eight weeks, followed by three
iterative design sessions with GK over four weeks to improve the us-
ability and functionality of A11yBoard for Google Slides. A descrip-
tion of the system features can be found in Section 3. An overview
of the insights we gained from our design process appears in the
section below.
4.2 Insights
In the interview, we discussed limitations of the prior version
of A11yBoard [59], which included that it was limited to a self-
contained single artboard that could not be saved or shared.

² https://symsys.stanford.edu/

Also,
it was limited to basic shapes like text boxes, rectangles, triangles,
and ellipses. In Microsoft PowerPoint and Google Slides, the set
of objects is more extensive and objects can have more complex
properties, such as rotation, borders, and rich formatting. Another
limitation concerned speech interactions: although GK was com-
fortable with how A11yBoard reported object properties, different
individuals might have different preferences for how to perceive nu-
meric and descriptive information. Furthermore, GK mentioned that
some operations via speech were inefficient when done frequently,
like creating objects. He suggested implementing gesture-based
object creation as an alternative, which we did.
After we developed the initial prototype, GK guided the other
authors in iterative system evaluation and design. We present the
insights about how we improved A11yBoard for Google Slides
below.
Design more tailored slide exploration. We discovered the need
to design more tailored interactions for a better slide-exploration
experience. Our prototype provided detailed announcements when
accessing objects, but testing with GK showed that more customiza-
tion options were necessary. For instance, a “brevity mode” could
minimize speech announcements and reduce cognitive load. We
also enabled other metrics, such as absolute values in centimeters
and relative values in fractions of slide width and height. GK also
suggested adding a “help” command to provide guidance, assigning
different pitches for objects that overlap, and reading out different
objects differently based on their typical usage.
Adapt better to users’ workflow and devices. We also gathered
insights on how to better accommodate the needs and workflow
of blind users. One insight was to provide more audio or speech
feedback to aid users in operating the app on their devices. GK
suggested adding speech reports and audio feedback for system
notications, like entering or leaving the app, and indicating any
unrecognized operation. Another insight was to improve the sys-
tem’s recognition of gesture and speech inputs. GK found that the
old interaction of using one nger to dwell on the screen was easily
misinterpreted into a “nger reading” action. We enhanced the
system’s tolerance of these inputs to better match the exploration
habits of blind users. Lastly, using A11yBoard for Google Slides
together with other software can be challenging for blind users,
leading to unintended operations and extra cognitive load. GK sug-
gested adding a keystroke to reset and refocus the system, allowing
users to recover from accidental movements and continue using the
app condently. This recommendation aligns with previous stud-
ies that have shown the need to address user concerns regarding
system fragility [43].
Consider trade-os between accessibility and usability. Incorpo-
rating all features of the original A11yBoard system [
59
] could
enhance the accessibility of Google Slides, but its full usability in
this new context was initially uncertain. For example, while the
original intelligent keyboard search allowed color editing in its
interface, Google Slides already oers accessible elements for the
“ll color” panel, which can be read by current screen readers. GK
proposed a more effective approach of navigating to existing acces-
sible panels instead of introducing additional interfaces for property
editing. This strategy improved usability, avoided confusion from
extra interfaces, and underscored the importance of assessing an ap-
plication’s existing accessibility features and integrating them with
new tools to strike a balance between accessibility and usability.
5 FIELD DEPLOYMENT
For our eld deployment [
47
], we used a case study methodology
with two blind users [
26
]. The goal of our case study was to as-
sess whether blind users can integrate A11yBoard for Google Slides
into their own workow and use it to read and author slide decks
independently. To achieve this goal, we conducted two eld deploy-
ments in which blind participants used A11yBoard for Google Slides
freely, without any supervision or interference from researchers,
over several days. In each case study, we provided a tutorial ses-
sion to introduce the tasks and the system, and then allowed the
participants a few days to complete a task on their own. Once the
participants indicated that they were finished, we conducted an in-
terview to collect their final artifacts and feedback about A11yBoard
for Google Slides. We also conducted an empirical evaluation of
their activities by analyzing the back-end log data and their verbal
responses to understand how they used the system.
5.1 Participants
Each case study involved one blind participant (P1 and P2, respec-
tively), recruited through personal communications from our blind
co-author, GK. P1 and P2 both reported being blind since birth, with
P1 having some light perception. P1 was a 24-year-old female gov-
ernment relations analyst at a non-profit accessibility organization,
while P2 was a 30-year-old male graduate student studying design.
Both participants had prior experience using presentation software
like Microsoft PowerPoint and Google Slides and used Apple iOS
devices and Windows desktop computers or laptops with JAWS
screen readers. P1 used presentation software on a weekly basis as
part of her job, while P2 used it for university courses on a monthly
basis. Participants were compensated $30 for each hour they spent
in the study, including the tutorial session, usage of A11yBoard for
Google Slides, and the follow-up interview.
5.2 Apparatus
The apparatus deployed and tested in this study was the A11yBoard
for Google Slides system, as described in Section 3. Both participants
used their personal Apple iPhone and Windows laptop devices
when working with A11yBoard for Google Slides. We instructed
both participants to turn off their iPhone’s VoiceOver software
after opening the A11yBoard mobile app. Additionally, both partici-
pants used JAWS to interact with Google Slides and our A11yBoard
browser extension.
The study task was to understand a slide deck and create a new
slide with the same or similar content inside the same slide deck. The
first study’s slide deck (depicted in Figure 4a) was about a fundamen-
tal concept in computer science, conditional or if-then statements,
which was presented in a flow chart that had five shapes, five con-
nectors, and two text boxes. The second study’s slide deck (depicted
in Figure 4b) was about guidelines for making slides accessible, in-
cluding five guidelines presented as shapes with a capital letter
inside, and 10 text boxes positioned beneath them.
5.3 Procedure
The study involved three phases. In the initial phase, each partici-
pant had a 90-minute tutorial session individually with the authors.
P1’s session took place on Zoom, while P2’s was in person. Be-
fore each session, participants answered demographic questions
and gave verbal consent. They were instructed to install two re-
quired software components on their devices: an iOS app and a
Chrome browser extension. During the tutorial, participants re-
ceived a Google Slides document and a Google Docs tutorial with a
“cheat sheet” containing simplified information about A11yBoard’s
commands. Researchers demonstrated the system’s features and
had participants try them out. Usability issues were noted and tips
were shared to address them. An exit interview took place after
seven days (for P1) and five days (for P2). During the interview, par-
ticipants discussed their experiences, the system’s usefulness, task
completion, and comparisons with other tools, along with potential
improvements.
5.4 Analysis
The data analyzed in this study consisted of four parts: (1) obser-
vational results from the tutorial session, (2) the final slides made
by participants, (3) back-end time-stamped log data indicating how
P1 and P2 used the system, and (4) the exit interviews and feed-
back about participants’ experiences. We report each of these parts
separately in Section 6.
6 RESULTS
We present the results from each case study in turn, focusing on
the slides participants made, their use of the A11yBoard for Google
Slides system, and their subjective impressions and feedback char-
acterizing their experiences.
6.1 Case Study 1
In this study, after the tutorial session, P1 was instructed to first
interpret the slide content and comprehend its structure, and then
replicate a similar slide deck. P1 was permitted to use any existing
features of Google Slides with the assistance of A11yBoard for
Google Slides.
6.1.1 Final Artifact. Overall, the slide created by P1 (Figure 5) con-
tained all the information in target slide 2, with correct content and
objects in the correct order and flow. This result indicated that P1
had a complete understanding of the spatial layout of the original
slides and could create objects accordingly. However, the created
slide also revealed some issues. For instance, the flow chart on the
right side displayed the correct workflow, but the two arrows at
the bottom pointed in the wrong direction. Additionally, the title
text box was not formatted with larger fonts as it should have been,
and the title and paragraph text boxes were not quite aligned.

Figure 4: Two slide decks used in two field studies. Each deck contained two slides at the beginning, and participants were instructed to read, understand, and create new slides in the same deck.

Figure 5: The slide made by P1 in an effort to recreate the target slide shown in Figure 4a, above. The slide is largely an accurate reproduction except that the two connectors at the bottom have arrowheads on the wrong ends.
6.1.2 Overall Impressions and Feedback. P1 reported that after the
90-minute tutorial session, it took her an additional 3 hours and
15 minutes to complete the task using our system. Specifically,
she spent around 60 minutes familiarizing herself with the touch,
gestures, and speech commands, followed by around 45 minutes
reading and exploring the slides. Finally, she spent approximately
90 minutes creating the slide and its contents.
As a regular Microsoft PowerPoint and Google Slides user, P1
mentioned that A11yBoard for Google Slides helped her to read
the slides more effectively and provided her with the ability to
create objects freely. Compared to her previous experiences, P1
noted that without A11yBoard for Google Slides, she was limited
to working with text and had no ability to edit the visual layout of
a slide with objects:
“There are so many visualizations that screen readers
cannot deliver to you” (P1).
Without A11yBoard for Google Slides, P1 would have needed
the help of a sighted co-worker or assistive services like Aira to deal
with the visual layouts, images, and charts. P1 reported that produc-
ing a slide deck like this all by herself would have been impossible.
In particular, she appreciated how she could use a “reading finger”
to explore the slides and was “very confident” in understanding
the visual content. Another aspect of A11yBoard that P1 enjoyed
was the ability to explore the slides in a tactile way and to edit the
slides more accurately by using the physical keyboard.
In the opening tutorial, P1 pointed out several usability issues
that hindered her from using the system fluently, including gesture
and voice misrecognition. Apart from these usability issues, P1 also
expressed her feelings about performing editing operations. She
thought that there was still too high a cognitive load involved in
editing operations. Specifically, she pointed out that she would
need to pay attention to the operation itself and the spatial posi-
tion where the operation occurs, which can be challenging. For
example, when drawing to create an object, there was no interme-
diate feedback until she finished drawing in one stroke and heard
confirmation of the shape she created.
We discussed with P1 whether A11yBoard for Google Slides
could fit into her workflow and how it might be improved in the
future. P1 expressed a need for more instant and real-time tactile
feedback or confirmation when performing an editing operation,
so that she could feel more confident when using the system. P1
also expressed that she would love to use the system in her daily
workflow if the interfering gestures from iOS could be mitigated.
P1 further pointed out that having a more physical layout beyond
the current touch screen and supplementing it with a braille display
would be helpful, which is an interesting avenue for future work.
6.1.3 Performance and Activities. In view of the final artifact along-
side P1’s verbal statements and the back-end log data, we could
reconstruct the process of how P1 used A11yBoard for Google Slides
to complete the assigned task. To begin, P1 explored the canvases
on slides 1 and 2 to understand their contents. This exploration was
not strictly separated from the editing process, which was happen-
ing throughout her usage of the system. Specifically, P1 frequently
used split-taps to examine objects’ contents, positions, and sizes
in detail. P1 then created several slides and experimented with
the system’s functionality. During this process, P1 confirmed how
A11yBoard for Google Slides works by realizing that the objects
would be selected automatically in her laptop browser, and she just
needed to press the Enter key to start typing text inside them.
Next, P1 created two text boxes, a title and an introduction text
box, on the left side of the slide by triple-tapping on the screen and
then drawing uppercase unistroke “T” letters. P1 also used speech
commands to create two shapes, the rectangles containing “State-
ment 1” and “Statement 2,” on the right side of the screen. Because
the default shape size was 100 by 100 pixels, the two rectangles
were in their default size and were not resized by P1. For the other
shapes, P1 chose to copy and paste the “Start,” “Expression,” “Stop,”
“True,” and “False” objects directly from the original slide. We ac-
knowledge that copying and pasting objects is within the purview
of free and unfettered usage of A11yBoard for Google Slides, and
this activity showed that A11yBoard for Google Slides served as a
complement to the existing Google Slides system, not a replacement,
which ts our expectation. After copying and pasting objects, P1
then created ve connectors in sequence to build the ow chart. P1
rst connected “Start” to “Expression, then connected “Expression”
to “Statement 1” and “Statement 2. Finally, P1 created the last two
connectors from “Stop” to “Statement 1” and “Statement 2, which
is opposite the intended direction.
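
To make the default-size behavior above concrete, the following minimal sketch shows how a shape-creation command like P1's could be issued through the Google Slides API, which A11yBoard for Google Slides builds upon. It is an illustration only, not A11yBoard's actual code: the google-api-python-client library, the authenticated credentials object, the object and slide IDs, and the use of points for the 100-by-100 size are all assumptions made for this example.

# Minimal sketch (not A11yBoard's actual code): issuing a default-size
# rectangle plus its label text through the Google Slides API.
from googleapiclient.discovery import build

def create_default_rectangle(credentials, presentation_id, slide_id,
                             object_id, label, x=350, y=100):
    """Create a 100x100 rectangle and insert label text into it.

    The credentials object, IDs, and the PT unit are assumptions for
    illustration purposes.
    """
    service = build("slides", "v1", credentials=credentials)
    requests = [
        {
            "createShape": {
                "objectId": object_id,
                "shapeType": "RECTANGLE",
                "elementProperties": {
                    "pageObjectId": slide_id,
                    # Mirrors the 100-by-100 default size noted above.
                    "size": {
                        "width": {"magnitude": 100, "unit": "PT"},
                        "height": {"magnitude": 100, "unit": "PT"},
                    },
                    "transform": {
                        "scaleX": 1, "scaleY": 1,
                        "translateX": x, "translateY": y,
                        "unit": "PT",
                    },
                },
            }
        },
        # In the case study, text was typed on the laptop after the new
        # object was auto-selected; here it is sent as a follow-up request.
        {"insertText": {"objectId": object_id, "text": label}},
    ]
    return service.presentations().batchUpdate(
        presentationId=presentation_id,
        body={"requests": requests},
    ).execute()
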
As for the keyboard interface, P1 did not utilize the keyboard
search interface much for navigating among interface panels or per-
forming editing operations. When asked about this, P1 responded
that there was not much formatting needed, as the main focus was
on ensuring the content was correct. However, when prompted to
recall the tutorial session, P1 acknowledged that using the keyboard
search feature would have been helpful for editing object properties
and navigating through panels.
In conclusion, while there were some usability issues and in-
herent limitations in A11yBoard for Google Slides as a real-world
solution for making 2-D content accessible, P1 was still able to
successfully read and edit her slides independently. Vitally, P1’s ac-
complishment transformed a formerly “impossible task” (her words)
into a possible one.
6.2 Case Study 2
Similar to study 1, the second case study also involved understand-
ing and recreating a slide deck, this one about design guidelines for
making accessible presentations. Unlike the first case study, which
involved understanding the flow of a connected-shapes diagram,
the main challenge in this case study was to understand the layout
of the five object groups, their corresponding positions, and how
to create, edit, and align them correctly.
6.2.1 Final Artifact. Figure 6 shows the final slide created by P2.
The slide deck created by P2 captured most of the content in the
original slide, including the five shapes that contained the five de-
sign guidelines, with five text boxes positioned under each elliptical
shape. This indicates that P2 was able to understand the layout of
the slide and recreate the objects in the correct order and position.
However, there were a few imperfect details in the recreated slide.
Firstly, the five text boxes at the bottom that explained each design
guideline were missing. P2 explained that he skipped creating them
due to personal time constraints, as creating five more text boxes
would have been redundant with what he had already done. Secondly,
the five elliptical shapes were not exactly the same size, and
Figure 6: The slide made by P2 in an effort to recreate the target slide shown in Figure 4b, above. Note that five text boxes of
explanation were missing because P2 reported that creating a similar set of text boxes was trivial and he did not want to
repeat the process.
they were not perfectly aligned. Finally, the letters inside the shapes
were not formatted in the same way as the target slide. But overall,
the recreated slide by P2 demonstrated a good understanding of
the original slide’s layout and content.
6.2.2 Overall Impressions and Feedback. P2 reported that after the
90-minute tutorial session, he spent 3 hours completing the task.
task. He took full advantage of the tutorial document and explored
all possible touch, gesture, and speech commands for around 1.5
hours to understand the system and the target slide’s layout. P2
then created a new slide using speech commands and completed
the task after another 1.5 hours.
P2 compared his prior experience of using presentation software
with this experience of using A11yBoard for Google Slides. He said
that previously, he could only make slides out of existing templates
and copy and paste text from written documents into slides with
a title and a paragraph text box. Any other slide layouts, or using
objects other than text boxes, were simply impossible for him to
attempt. With A11yBoard, he was able to perform editing operations
on objects, which was groundbreaking for him.
“I won’t be able to create this kind of slide before [using
A11yBoard for Google Slides]. I would have to take
visual assistance [without the system]” (P2).
P2 also raised some usability issues that happened during the
field deployment, including the same issue P1 had of accidentally
leaving the A11yBoard iOS app because of iOS’s swipe-up app-
switching gesture. P2 also faced the challenge of drawing smaller
objects, like a small text box, on a relatively small Apple iPhone
screen, which is the universal “fat finger” problem that can be miti-
gated on a bigger touch screen like an Apple iPad.
P2 said that A11yBoard definitely would be beneficial when
working with Google Slides, given that the task was not accessible
at all without A11yBoard. He said that he would not be able to
create a slide deck with so many objects without sighted assistance.
He did say that he would still want sighted assistance for final
confirmation of his slides after using A11yBoard if he were to deliver
a presentation, but that the use of A11yBoard would significantly
reduce the time needed to interact with a sighted assistant from
Aira or his co-workers.
“I would like to spend as much time as possible work-
ing on slides independently without sighted assistance.
Right now I think I have at least 90% to 95% indepen-
dence [with A11yBoard for Google Slides]” (P2).
P2 also believed that A11yBoard could not only be used in busi-
ness and educational settings but also by middle- and high-school
students as the first tool for them to interact with 2-D canvases.
As for improvements, P2 pointed out the same issue as P1, which
is the challenge of executing an operation while also maintaining
spatial awareness of nearby objects. P2 would like a solution in-
volving a physical layout, such as using Wikki Stix
(https://www.wikkistix.com/) to represent
objects. In this case, he would be more confident in editing one
object while being able to touch and sense other objects nearby in
a physical form.
6.2.3 Performance and Activities. We studied how P2 created his
slides using A11yBoard for Google Slides. Initially, P2 created a
new slide and entered a title in the default title text box. He then
focused on the top-half of the slide and created five elliptical shapes
by drawing them left to right. P2 was mindful of aligning the ob-
jects and tried to keep the size consistent. He later realized that he
could copy-and-paste the shapes for consistency. P2 then turned
to the laptop and typed letters in each shape. After completing the
top-half, P2 repeated the same process for the bottom-half of the
slide. However, he stopped before creating the remaining five text
boxes as he considered it a trivial but repetitive task, given he had
already shown he could create text boxes and enter text into them.
During the exit interview, P2 conrmed that he was able to read
and edit the slides independently, demonstrating that A11yBoard
for Google Slides has successfully made previously inaccessible
canvases accessible.
In hindsight, P2 expressed that he could have created the slide
more eciently if he had considered using copy-and-paste sooner.
Despite a few usability issues raised during his field deployment, P2
believed that A11yBoard could be beneficial in his daily work by
significantly reducing the time needed for sighted assistance.
7 DISCUSSION
In this section, we reflect on what the A11yBoard deployments
have taught us and discuss the broader implications of making
2-D presentation tools accessible to blind users. We present design
recommendations drawn from our own participatory design and
deployment process for future use by designers and researchers.
We also discuss A11yBoard’s limitations and avenues for future
research.
7.1 Design Recommendations
We present five design recommendations for making 2-D canvases
accessible to blind users. They arise from our participatory design
and deployment efforts. We offer them in hopes that they might
serve as a resource for future designers and researchers.
7.1.1 Provide Spatial, Intuitive, and Immediate Feedback. We gained
a significant insight into the importance of providing spatial and
intuitive feedback to blind users. This is due to the complexity of
2-D content, which is inherently challenging for screen readers.
With the advancement of touch screen devices and their haptic fea-
tures, it is increasingly feasible to provide intuitive and immediate
spatial feedback for 2-D objects on the touch screen. However, a
challenge remains in making sense of 2-D content semantically. For
instance, we found that blind participants could easily comprehend
a ow chart when they were provided with a brief introduction.
Nevertheless, when presented with a new 2-D space, blind users
must take considerable time to explore that space and understand
its objects, properties, and relationships.
7.1.2 Tailor Feedback Based on Context and Individual Differences.
Different objects serve different purposes. When exploring a 2-D
artboard through touch and gesture, it is crucial to emphasize dif-
ferent object attributes when delivering audio or speech feedback
to efficiently convey the most significant information to blind users.
For example, the positions and sizes of rectangles are important,
but the positions and sizes of connectors are less important than the
rectangles they might connect. Additionally, individual differences
in human perception should be considered, and customization op-
tions should be provided accordingly. This priority aligns with how
all screen readers offer the ability to adjust speech verbosity. Apart
from verbosity, metrics and other attributes should also be included
in the customization options.
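
To illustrate this recommendation (a hypothetical sketch, not part of A11yBoard), the snippet below expresses a per-object-type feedback profile with invented attribute names and a simple describe() helper, so that a connector is announced by its endpoints while a rectangle leads with its text, position, and size.

# Hypothetical sketch of context- and user-tailored feedback, not
# A11yBoard's implementation: each object type announces only the
# attributes that matter most for it, at a user-chosen verbosity.
from dataclasses import dataclass, field

@dataclass
class FeedbackProfile:
    verbosity: str = "medium"                 # "low" | "medium" | "high"
    # Attributes to announce, per object type; order encodes priority.
    attributes: dict = field(default_factory=lambda: {
        "rectangle": ["text", "position", "size"],
        "connector": ["endpoints"],           # position/size rarely matter
        "image":     ["alt_text", "position"],
    })

def describe(obj: dict, profile: FeedbackProfile) -> str:
    """Compose a spoken description from the prioritized attributes."""
    wanted = profile.attributes.get(obj["type"], ["position"])
    if profile.verbosity == "low":
        wanted = wanted[:1]                   # only the top-priority attribute
    parts = [f'{name}: {obj.get(name, "unknown")}' for name in wanted]
    return f'{obj["type"]}, ' + ", ".join(parts)

# Example: a connector is announced by its endpoints, not its bounding box.
print(describe({"type": "connector", "endpoints": "Start to Expression"},
               FeedbackProfile(verbosity="high")))
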
7.1.3 Provide Multimodal Ways to Create and Edit. During our
participatory design process, we discovered the significance of
providing multimodal options for creating and editing objects, as
these “authoring” operations can occur at any point while the user
works. For instance, creating an object can happen when a user has
just nished typing on the keyboard or while exploring the 2-D
canvas. In either case, the user might prefer to continue with their
current workow and avoid switching devices. Thus, it is crucial to
oer an accessible means to create and edit objects through each
modality, ensuring that the designed system is exible in this way.
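
A hypothetical way to structure such flexibility, sketched below with invented handler names, is to route every modality into the same editing operation so that object creation remains available from touch, speech, or keyboard without forcing a device switch.

# Hypothetical sketch (not A11yBoard's architecture): all input modalities
# funnel into one shared operation handler, so creating a text box is
# available from gesture, speech, or keyboard alike.
from typing import Callable

def create_text_box(x: float, y: float) -> None:
    print(f"create text box at ({x}, {y})")   # stand-in for the real edit

HANDLERS: dict[str, Callable[[dict], None]] = {
    # Unistroke "T" drawn at a touch location.
    "gesture:T":          lambda e: create_text_box(e["x"], e["y"]),
    # A spoken request while the reading finger rests on a point.
    "speech:create_text": lambda e: create_text_box(e["x"], e["y"]),
    # A keyboard command issued from the laptop interface.
    "keyboard:new_text":  lambda e: create_text_box(e["x"], e["y"]),
}

def dispatch(event_type: str, event: dict) -> None:
    HANDLERS[event_type](event)

dispatch("speech:create_text", {"x": 120.0, "y": 80.0})
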
7.1.4 Consider the Role of AI-Generated Content (AIGC). The re-
cent advancements in AIGC technology, such as Microsoft Office
Copilot [31] and GPT-4-based models [36, 37], have made creating
2-D visual content, such as slides, effortless. These tools can gener-
ate high-quality visual content from inputs, including prompts and
data, which can significantly benefit blind users in creating visual
content. However, with such tools, it becomes even more crucial
to provide blind users with access to the auto-generated content,
as A11yBoard for Google Slides does, to ensure they have agency
and control over the visuals. Further research is required on how
to tailor these AIGC tools to suit accessibility needs. Additionally,
AIGC can be utilized to comprehend and standardize users’ natural
language input as operations, creating a true “virtual assistant” to
assist blind users in reading and creating visual content.
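
As one hedged illustration of the "natural language as operations" idea, the sketch below asks an LLM to normalize an utterance into a constrained JSON operation that the editor can verify before executing and echo back in speech; the prompt, schema, and call_llm() stub are assumptions for this example rather than a description of any existing system.

# Hypothetical sketch: an LLM turns a spoken request into a constrained
# JSON operation that the editor can validate and execute. The schema
# and call_llm() placeholder are illustrative assumptions only.
import json

OPERATION_SCHEMA = (
    'Respond with JSON only: {"op": "create_shape" | "move" | "resize", '
    '"shape": str, "x": int, "y": int, "width": int, "height": int}'
)

def call_llm(prompt: str) -> str:
    """Placeholder for whatever LLM backend is used."""
    raise NotImplementedError

def speech_to_operation(utterance: str) -> dict:
    prompt = f"{OPERATION_SCHEMA}\nUser request: {utterance}\n"
    raw = call_llm(prompt)
    op = json.loads(raw)                 # fail loudly on malformed output
    assert op["op"] in {"create_shape", "move", "resize"}
    return op                            # caller echoes the operation in speech
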
7.1.5 Balance Cognitive Load and Functionality. Another crucial
lesson we learned from our design process is the importance of bal-
ancing cognitive load and functionality. Manipulating 2-D content
can be challenging, as software such as Adobe Photoshop or Illus-
trator requires users to possess professional skills that are acquired
over years. The full range of functionality can create a significant
cognitive load for users, even for software that is relatively feature-
light, such as Microsoft PowerPoint and Google Slides. While the
initial effort of learning a complex but useful system should not be
perceived as a barrier, we still need to meticulously design our sys-
tem’s functionality to avoid overwhelming users with all possible
features, rendering it unusable.
7.2 Limitations
One limitation of our study is that it only involved the task of repro-
ducing a few existing slides. We acknowledge that the real-world
scenarios people encounter could be more complex and varied.
Therefore, while our study provides valuable insights into the de-
sign of an accessible 2-D slide creation tool for blind users, further
research is needed to fully evaluate the usability and effectiveness
of A11yBoard for Google Slides in a greater range of scenarios. Also,
the valuable contributions of our blind co-author, GK, primarily
pertain to a specific subgroup within the blind community, charac-
terized by individuals with similar needs. While their expertise has
been instrumental in addressing the requirements of this particular
user group, it is important to acknowledge that their insights may
not be universally applicable to the entire blind community.
In addition to the study limitations, there are also some technical
limitations to our system. One limitation is the lack of cross-device
undo and redo functionality. This is because the Google Slides API
does not provide these features, and it is not possible to undo an
operation performed on a mobile device from a desktop computer.
This limitation could be a source of frustration and confusion for
blind users who switch between devices frequently or who acciden-
tally make a mistake and need to backtrack. We recognize that this
is a significant usability limitation and suggest that future iterations
of the A11yBoard system need to add cross-device undo and redo.
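
One hypothetical way to approximate cross-device undo, given that every A11yBoard edit ultimately becomes a Google Slides batchUpdate request, is to keep a shared log of hand-built inverse requests; the sketch below illustrates that idea under those assumptions and is not a feature of the deployed system.

# Hypothetical cross-device undo sketch: every applied edit is stored with
# a hand-built inverse request, so any device sharing the log can roll it
# back. Request shapes follow the Google Slides batchUpdate API; the log
# store and wiring are assumptions for illustration.
undo_log = []  # in practice this would live in shared storage, not memory

def apply_with_inverse(service, presentation_id, request, inverse_request):
    service.presentations().batchUpdate(
        presentationId=presentation_id,
        body={"requests": [request]},
    ).execute()
    undo_log.append(inverse_request)

def undo_last(service, presentation_id):
    if not undo_log:
        return
    service.presentations().batchUpdate(
        presentationId=presentation_id,
        body={"requests": [undo_log.pop()]},
    ).execute()

# Example: creating a shape logs a deleteObject request as its inverse.
# apply_with_inverse(service, presentation_id,
#     {"createShape": {"objectId": "tmpShape1", "shapeType": "RECTANGLE",
#                      "elementProperties": {"pageObjectId": slide_id}}},
#     {"deleteObject": {"objectId": "tmpShape1"}})
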
Another limitation of A11yBoard for Google Slides is the lack
of integration with Apple’s VoiceOver mobile screen reader. This
means that blind users who rely on VoiceOver will need to turn it
o before using A11yBoard, which can add an additional layer of
complexity to their workow. For example, when they accidentally
exit the app, it would be hard for blind users to realize that they
have done so without screen reader announcements.
8 FUTURE WORK
This work opens up several areas of future research. One potential
direction is exploring the use of advanced AI-generated content
(AIGC) techniques, like large language models (LLMs), to auto-
matically generate slides. This could reduce the cognitive load in
creating accessible visual content, but careful curation is necessary
to meet the needs of blind users. Balancing automation with user
control over visual content is another aspect to investigate, along
with enabling accessible fine-tuning processes to personalize the
generated content.
Another area to explore is enhancing collaborative slide edit-
ing accessibility. With A11yBoard’s improvements in slide reading
and authoring, extending these benets to collaborative editing
becomes essential. Possible approaches include using AIGC tech-
niques to generate alternative descriptions for visual content, aiding
team members using different assistive technologies, or develop-
ing new collaboration features that support real-time collaboration
with various assistive technologies. Whichever approach is cho-
sen, it should prioritize accessibility and intuitiveness for all team
members, regardless of their abilities.
9 CONCLUSION
In this work, we have presented A11yBoard for Google Slides, a
multi-device multimodal system deployable in real-world scenarios
to make Google Slides, a commercial presentation tool, accessible
to blind users. We described the A11yBoard system consisting of a
Chrome browser extension and an Apple iOS mobile app. We also
described the participatory design process we followed with a blind
co-author that led to A11yBoard’s interaction design. A11yBoard
for Google Slides addresses the key challenges of designing an
accessible experience of reading and editing slide decks for blind
users. To put A11yBoard for Google Slides through its paces, we
deployed it in two case studies that showed our blind participants
were capable of reading and creating slides independently and
eciently, without the need for sighted assistance, something that
had been previously impossible for them. We also oered design
recommendations for the further development of accessible content
creation tools. In the end, it is our hope that A11yBoard for Google
Slides provides a significant step towards making 2-D presentation
tools more inclusive and accessible for all users.
ACKNOWLEDGMENTS
This work was supported in part by the University of Washington
Center for Research and Education on Accessible Technology and
Experiences (CREATE). Any opinions, findings, conclusions or rec-
ommendations expressed in our work are those of the authors and
do not necessarily reect those of any supporter.
REFERENCES
[1]
Apple. 2022-09-09. Accessibility - Vision. https://www.apple.com/accessibility/
vision/
[2]
Amine Awada, Youssef Bou Issa, Clara Ghannam, Joe Tekli, and Richard Chbeir.
2012. Towards digital image accessibility for blind users via vibrating touch
screen: A feasibility test protocol. In 2012 Eighth International Conference on
Signal Image Technology and Internet Based Systems. IEEE, 547–554.
[3]
Amine Awada, Youssef Bou Issa, Joe Tekli, and Richard Chbeir. 2013. Evaluation
of touch screen vibration accessibility for blind users. In Proceedings of the 15th
International ACM SIGACCESS Conference on Computers and Accessibility. Article
48, 2 pages.
[4]
L. M. Brown, S. A. Brewster, S. A. Ramloll, R. Burton, and B. Riedel. 2003. Design
guidelines for audio presentation of graphs and tables. https://eprints.gla.ac.uk/
3196/
[5]
Tim Brown et al. 2008. Design thinking. Harvard Business Review 86, 6 (2008), 84.
[6]
Matt Calder, Robert F Cohen, Jessica Lanzoni, and Yun Xu. 2006. PLUMB: an
interface for users who are blind to display, create, and modify graphs. In Pro-
ceedings of the 8th international ACM SIGACCESS conference on Computers and
accessibility. 263–264.
[7]
Robert F Cohen, Arthur Meacham, and Joelle Skaff. 2006. Teaching graphs to
visually impaired students using an active auditory interface. ACM SIGCSE
Bulletin 38, 1 (2006), 279–282.
[8]
Franco Delogu, Massimiliano Palmiero, Stefano Federici, Catherine Plaisant,
Haixia Zhao, and Olivetti Belardinelli. 2010. Non-visual exploration of geographic
maps: does sonification help? Disability and Rehabilitation: Assistive Technology
5, 3 (2010), 164–174.
[9]
Julie Ducasse, Anke M Brock, and Christophe Jouffrais. 2018. Accessible inter-
active maps for visually impaired users. In Mobility of visually impaired people.
Springer, 537–584.
[10]
Danyang Fan, Kate Glazko, and Sean Follmer. 2022. Accessibility of Linked-
Node Diagrams on Collaborative Whiteboards for Screen Reader Users: Challenges
and Opportunities. Springer International Publishing, Cham, 97–108. https:
//doi.org/10.1007/978-3-031-09297-8_6
[11]
Nicholas A Giudice, Hari Prasath Palani, Eric Brenner, and Kevin M Kramer. 2012.
Learning non-visual graphical information using a touch-based vibro-audio
interface. In Proceedings of the 14th international ACM SIGACCESS conference on
Computers and accessibility. 103–110.
[12]
David Goldberg and Cate Richardson. 1993. Touch-Typing with a Stylus. In
Proceedings of the INTERACT ’93 and CHI ’93 Conference on Human Factors in
Computing Systems (Amsterdam, The Netherlands) (CHI ’93). Association for
Computing Machinery, New York, NY, USA, 80–87. https://doi.org/10.1145/
169059.169093
[13]
Google. 2023-04-15. Edit presentations with a screen reader - Google
Docs Editors Help. https://support.google.com/docs/answer/1634140?sjid=
12624978275069237619-NA
[14]
Google. 2023-04-15. Google Slides API. https://developers.google.com/slides/
api/reference/rest
[15]
Joan Greenbaum and Morten Kyng. 2020. Design at work: Cooperative design of
computer systems. CRC Press.
[16]
Tiago Guerreiro, Paulo Lagoa, Hugo Nicolau, Daniel Goncalves, and Joaquim A
Jorge. 2009. From Tapping to Touching: Making Touch Screens Accessible to
Blind Users (vol 15, pg 48, 2008). IEEE MULTIMEDIA 16, 1 (2009), 13–13.
[17]
Mina Huh, Saelyne Yang, Yi-Hao Peng, Xiang Anthony’ Chen, Young-Ho Kim,
and Amy Pavel. 2023. AVscript: Accessible Video Editing with Audio-Visual
Scripts. In Proceedings of the 2023 CHI Conference on Human Factors in Computing
Systems (Hamburg, Germany) (CHI ’23). Association for Computing Machinery,
New York, NY, USA, Article 796, 17 pages. https://doi.org/10.1145/3544548.
3581494
[18]
Deepak Jagdish, Rahul Sawhney, Mohit Gupta, and Shreyas Nangia. 2008. Sonic
Grid: an auditory interface for the visually impaired to navigate GUI-based
environments. In Proceedings of the 13th international conference on Intelligent
user interfaces. 337–340.
[19]
JAWS. 2020-10-15. Jaws® freedom scientific. https://www.freedomscientific.
com/products/software/jaws/
[20]
Nikolaos Kaklanis, Konstantinos Votis, and Dimitrios Tzovaras. 2013. A mobile
interactive maps application for a visually impaired audience. In Proceedings of
the 10th International Cross-Disciplinary Conference on Web Accessibility. Article
23, 2 pages.
[21]
Shaun K. Kane, Jeffrey P. Bigham, and Jacob O. Wobbrock. 2008. Slide rule: making
mobile touch screens accessible to blind people using multi-touch interaction
techniques. In Proceedings of the 10th international ACM SIGACCESS conference
on Computers and accessibility (Assets ’08). Association for Computing Machinery,
New York, NY, USA, 73–80. https://doi.org/10.1145/1414471.1414487
[22]
Shaun K. Kane, Meredith Ringel Morris, and Jacob O. Wobbrock. 2013. Touch-
plates: low-cost tactile overlays for visually impaired touch screen users. In
Proceedings of the 15th International ACM SIGACCESS Conference on Computers
and Accessibility (ASSETS ’13). Association for Computing Machinery, New York,
NY, USA, Article 22, 8 pages. https://doi.org/10.1145/2513383.2513442
[23]
Martin Kurze. 1996. TDraw: a computer-based tactile drawing tool for blind peo-
ple. In Proceedings of the second annual ACM conference on Assistive technologies
(Assets ’96). Association for Computing Machinery, New York, NY, USA, 131–138.
https://doi.org/10.1145/228347.228368
[24]
Richard E Ladner. 2015. Design for user empowerment. interactions 22, 2 (2015),
24–29.
[25]
Steven Landau and Lesley Wells. 2003. Merging tactile sensory input and audio
data by means of the Talking Tactile Tablet. In Proceedings of EuroHaptics, Vol. 3.
414–418.
[26]
Jonathan Lazar, Jinjuan Heidi Feng, and Harry Hochheiser. 2017. Case Studies.
(2017).
[27]
Cheuk Yin Phipson Lee, Zhuohao Zhang, Jaylin Herskovitz, JooYoung Seo, and
Anhong Guo. 2022. CollabAlly: Accessible Collaboration Awareness in Document
Editing. In CHI Conference on Human Factors in Computing Systems. Article 596,
17 pages.
[28]
Jaewook Lee, Jaylin Herskovitz, Yi-Hao Peng, and Anhong Guo. 2022. Image-
Explorer: Multi-Layered Touch Exploration to Encourage Skepticism Towards
Imperfect AI-Generated Image Captions. In CHI Conference on Human Factors in
Computing Systems. Article 462, 15 pages.
[29]
Franklin Mingzhe Li, Lotus Zhang, Maryam Bandukda, Abigale Stangl, Kristen
Shinohara, Leah Findlater, and Patrick Carrington. 2023. Understanding Visual
Arts Experiences of Blind People. In Proceedings of the 2023 CHI Conference on
Human Factors in Computing Systems (Hamburg, Germany) (CHI ’23). Association
for Computing Machinery, New York, NY, USA, Article 60, 21 pages. https:
//doi.org/10.1145/3544548.3580941
[30]
Microsoft. 2023-04-15. Accessibility tools for PowerPoint - Microsoft Sup-
port. https://support.microsoft.com/en-us/office/accessibility-tools-for-
powerpoint-2b7a387c-bc02-408f-8c49-59534665850f
[31]
Microsoft. 2023-04-15. Introducing Microsoft 365 Copilot your copilot for
work. https://blogs.microsoft.com/blog/2023/03/16/introducing-microsoft-365-
copilot-your-copilot-for-work/
[32]
Joe Mullenbach, Craig Shultz, J Edward Colgate, and Anne Marie Piper. 2014.
Exploring aective communication through variable-friction surface haptics. In
Proceedings of the SIGCHI Conference on Human Factors in Computing Systems.
3963–3972.
[33]
Joe Mullenbach, Craig Shultz, Anne Marie Piper, Michael Peshkin, and J Edward
Colgate. 2013. Surface haptic interactions with a TPad tablet. In Proceedings of the
adjunct publication of the 26th annual ACM symposium on User interface software
and technology. 7–8.
[34] Jakob Nielsen. 1994. Usability engineering. Morgan Kaufmann.
[35] NVDA. 2020-10-15. Nv access. https://www.nvaccess.org/
[36] OpenAI. 2023-04-15. GPT-4. https://openai.com/product/gpt-4
[37] OpenAI. 2023-04-15. Introducing ChatGPT. https://openai.com/blog/chatgpt
[38]
Hari Prasath Palani, G Bernard Giudice, and Nicholas A Giudice. 2018. Haptic
information access using touchscreen devices: design guidelines for accurate
perception of angular magnitude and line orientation. In International Conference
on Universal Access in Human-Computer Interaction. Springer, 243–255.
[39]
Yi-Hao Peng, Jerey P Bigham, and Amy Pavel. 2021. Slidecho: Flexible Non-
Visual Exploration of Presentation Videos. In The 23rd International ACM SIGAC-
CESS Conference on Computers and Accessibility. Article 24, 12 pages.
[40]
Yi-Hao Peng, Peggy Chi, Anjuli Kannan, Meredith Ringel Morris, and Irfan Essa.
2023. Slide Gestalt: Automatic Structure Extraction in Slide Decks for Non-Visual
Access. In Proceedings of the 2023 CHI Conference on Human Factors in Computing
Systems (Hamburg, Germany) (CHI ’23). Association for Computing Machinery,
New York, NY, USA, Article 829, 14 pages. https://doi.org/10.1145/3544548.
3580921
[41]
Yi-Hao Peng, Jason Wu, Jeffrey Bigham, and Amy Pavel. 2022. Diffscriber:
Describing Visual Design Changes to Support Mixed-Ability Collaborative Pre-
sentation Authoring. In Proceedings of the 35th Annual ACM Symposium on
User Interface Software and Technology (Bend, OR, USA) (UIST ’22). Associ-
ation for Computing Machinery, New York, NY, USA, Article 35, 13 pages.
https://doi.org/10.1145/3526113.3545637
[42]
Beryl Plimmer, Andrew Crossan, Stephen A Brewster, and Rachel Blagojevic.
2008. Multimodal collaborative handwriting training for visually-impaired people.
In Proceedings of the SIGCHI Conference on Human Factors in Computing Systems.
393–402.
[43]
Anastasia Schaadhardt, Alexis Hiniker, and Jacob O. Wobbrock. 2021. Understand-
ing Blind Screen-Reader Users’ Experiences of Digital Artboards. In Proceedings
of the 2021 CHI Conference on Human Factors in Computing Systems (Yokohama,
Japan) (CHI ’21). Association for Computing Machinery, New York, NY, USA,
Article 270, 19 pages. https://doi.org/10.1145/3411764.3445242
[44]
Ather Sharif and Babak Forouraghi. 2018. evoGraphs A jQuery plugin to create
web accessible graphs. In 2018 15th IEEE Annual Consumer Communications &
Networking Conference (CCNC). 1–4. https://doi.org/10.1109/CCNC.2018.8319239
ISSN: 2331-9860.
[45]
Ather Sharif, Olivia H Wang, Alida T Muongchan, Katharina Reinecke, and
Jacob O Wobbrock. 2022. VoxLens: Making Online Data Visualizations Accessible
with an Interactive JavaScript Plug-In. In CHI Conference on Human Factors in
Computing Systems. Article 478, 19 pages.
[46]
Ather Sharif, Andrew M Zhang, Anna Shih, Jacob O Wobbrock, and Katharina
Reinecke. 2022. Understanding and improving information extraction from online
geospatial data visualizations for screen-reader users (Assets ’22). Athens, Greece,
Article 61, 5 pages.
[47]
Katie A Siek, Gillian R Hayes, Mark W Newman, and John C Tang. 2014. Field
deployments: Knowing from using in context. Ways of Knowing in HCI (2014),
119–142.
[48]
Mathieu Simonnet, Anke M Brock, Antonio Serpa, Bernard Oriola, and Christophe
Jourais. 2019. Comparing interaction techniques to help blind people explore
maps on small tactile devices. Multimodal Technologies and Interaction 3, 2 (2019),
27.
[49]
Jing Su, Alyssa Rosenzweig, Ashvin Goel, Eyal de Lara, and Khai N Truong.
2010. Timbremap: enabling the visually-impaired to use maps on touch-enabled
devices. In Proceedings of the 12th international conference on Human computer
interaction with mobile devices and services. 17–26.
[50]
John Thompson, Jesse Martinez, Alper Sarikaya, Edward Cutrell, and Bongshin
Lee. 2023. Chart Reader: Accessible Visualization Experiences Designed with
Screen Reader Users. In Proceedings of the 2023 CHI Conference on Human Factors
in Computing Systems (Hamburg, Germany) (CHI ’23). Association for Computing
Machinery, New York, NY, USA, Article 802, 18 pages. https://doi.org/10.1145/
3544548.3581186
[51]
Gregg C Vanderheiden. 1996. Use of audio-haptic interface techniques to allow
nonvisual access to touchscreen appliances. In Proceedings of the human factors
and ergonomics society annual meeting, Vol. 40. SAGE Publications Sage CA: Los
Angeles, CA, 1266–1266.
[52]
Steven Wall and Stephen Brewster. 2006. Feeling what you hear: tactile feedback
for navigation of audio graphs. In Proceedings of the SIGCHI Conference on Human
Factors in Computing Systems (CHI ’06). Association for Computing Machinery,
New York, NY, USA, 1123–1132. https://doi.org/10.1145/1124772.1124941
[53]
Lucy Lu Wang, Isabel Cachola, Jonathan Bragg, Evie Yu-Yen Cheng, Chelsea
Haupt, Matt Latzke, Bailey Kuehl, Madeleine van Zuylen, Linda Wagner, and
Daniel S Weld. 2021. Improving the accessibility of scientific documents: Current
state, user needs, and a system solution to enhance scientific PDF accessibility
for blind and low vision users. arXiv preprint arXiv:2105.00076 (2021).
[54]
Microsoft Windows. 2023-04-15. Complete guide to Narrator - Microsoft
Support. https://support.microsoft.com/en-us/windows/complete-guide-to-
narrator-e4397a0d-ef4f-b386-d8ae-c172f109bdb1
[55]
Jacob O Wobbrock, Andrew D Wilson, and Yang Li. 2007. Gestures without
libraries, toolkits or training: a $1 recognizer for user interface prototypes. In
Proceedings of the 20th annual ACM symposium on User interface software and
technology. 159–168.
[56]
Cheng Xu, Ali Israr, Ivan Poupyrev, Olivier Bau, and Chris Harrison. 2011. Tactile
display for the visually impaired using TeslaTouch. In CHI’11 Extended Abstracts
on Human Factors in Computing Systems. 317–322.
[57]
Mingrui Ray Zhang, Mingyuan Zhong, and Jacob O. Wobbrock. 2022. Ga11y: An
Automated GIF Annotation System for Visually Impaired Users. In Proceedings of
the 2022 CHI Conference on Human Factors in Computing Systems (New Orleans,
LA, USA) (CHI ’22). Association for Computing Machinery, New York, NY, USA,
Article 197, 16 pages. https://doi.org/10.1145/3491102.3502092
[58]
Zhuohao (Jerry) Zhang and Jacob O. Wobbrock. 2022. A11yBoard: Using
Multimodal Input and Output to Make Digital Artboards Accessible to Blind
Users. In Adjunct Proceedings of the 35th Annual ACM Symposium on User
Interface Software and Technology (Bend, OR, USA) (UIST ’22 Adjunct). As-
sociation for Computing Machinery, New York, NY, USA, Article 9, 4 pages.
https://doi.org/10.1145/3526114.3558695
[59]
Zhuohao (Jerry) Zhang and Jacob O. Wobbrock. 2023. A11yBoard: Making Digital
Artboards Accessible to Blind and Low-Vision Users. In Proceedings of the 2023
CHI Conference on Human Factors in Computing Systems (Hamburg, Germany)
(CHI ’23). Association for Computing Machinery, New York, NY, USA, Article 55,
17 pages. https://doi.org/10.1145/3544548.3580655