Sound in the Woods – Audio UI/UX-based Game
Tyler Hasty and Timothy Iseri
What is Sound in the Woods?
Sound in the Woods is an interactive non-visual digital experience, also known as an “audio game”. Whereas early digital games were generally accessible to users with a wide range of visual disabilities and impairments, modern games, which are overwhelmingly graphics-based (and thus widely referred to as “video games”), present great difficulties for these users. The nature of graphics-oriented user interfaces, and the radically different ways in which different classes of users interact with software systems, make it challenging to achieve high standards of accessibility for the widest possible user base.
While solutions to these problems are increasingly being pursued in various spaces, substantial work remains in enabling non-visual users to enjoy software features and experiences currently restricted to those with sufficient visual ability. Thus, the goal of this project was to explore UI/UX design in a highly under-explored context: non-visual digital games in the 3D first-person survival-adventure genre.
Sound in the Woods started as a simple audio maze game developed for phones using Android Studio. Each maze was built on a tile-based map, in which the map is made up of cells, or tiles, all of the same size. Each tile was an open space, a wall, a door, or the goal space. When the player reached the goal space, the game moved on to the next level. The game was played in turns, and on each turn the player could move forward one tile, turn left or right while staying on the same tile, or echolocate. Since the game had no visuals, the world was communicated through echolocation. Echolocating played a series of pings, one for each empty tile directly in front of the player up to the next wall or door. For example, if the player was at one end of a 5-space corridor and the other end was a door, echolocating would play 4 pings and then a door-knock sound. Using these movement commands and echolocation, the player navigated through several mazes, each progressively larger and more difficult.
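The echolocate rule described above can be sketched in a few lines. This is only an illustration and not the original Android code: the tile symbols, the `echolocate` function name, and the list-of-strings map are all assumptions. It also assumes that the corridor in the example consists of 5 open tiles with the door just beyond them, and that a wall, the goal, or the map edge simply ends the pings without an extra sound.

```python
# Hypothetical tile symbols; the original game's data format is not documented.
OPEN, WALL, DOOR, GOAL = ".", "#", "D", "G"

# Facing direction -> (row step, column step).
STEPS = {"N": (-1, 0), "E": (0, 1), "S": (1, 0), "W": (0, -1)}


def echolocate(grid, row, col, facing):
    """Return the sounds heard when echolocating from (row, col).

    One "ping" is produced per open tile directly ahead. A door adds a
    "door_knock" and ends the scan. Any other tile, or the map edge,
    ends the scan silently (an assumption made for this sketch).
    """
    d_row, d_col = STEPS[facing]
    sounds = []
    r, c = row + d_row, col + d_col
    while 0 <= r < len(grid) and 0 <= c < len(grid[r]):
        tile = grid[r][c]
        if tile == OPEN:
            sounds.append("ping")
        elif tile == DOOR:
            sounds.append("door_knock")
            break
        else:
            break  # wall or goal: stop pinging
        r, c = r + d_row, c + d_col
    return sounds


# A corridor of 5 open tiles (columns 1 to 5) with a door at column 6.
corridor = ["#.....D#"]
heard = echolocate(corridor, 0, 1, "E")
print(heard)
```

With the player standing on the first open tile and facing east, the sketch yields four pings followed by a door knock, matching the example in the text under the stated assumptions.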
The second iteration turned the game into a survival game. This was also a 2D game, using the same movement concepts and grid-based map from the maze game. However, the player now explored a forest area instead of narrow corridors, and a survival element was added. The player had a limited number of moves in which to find the goal, and there were more types of tiles. Instead of just walls, there were trees that could not be moved through and stones that took 2 moves to cross instead of 1. Food placed around the map increased the number of moves the player had left. This iteration was built as a physical prototype, a proof of concept for the survival game, with cards representing tiles. The cards were placed face down to simulate the lack of visuals. When the player moved or echolocated, the relevant card was flipped up to reveal the nature of the tile and then flipped back down. This stood in for the unique sound that would be played to identify each tile type.
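The move budget of this prototype can be expressed as a small rule table. The sketch below is illustrative only, since the prototype was played with cards rather than code. The tile names, the `step_onto` function, and the `FOOD_BONUS` value are assumptions; the text does not say how many moves food restores, or whether bumping into a tree costs a move (here it does not).

```python
# Hypothetical value: the text does not say how many moves food restores.
FOOD_BONUS = 5

# Moves needed to enter each passable tile type (stones take 2, per the text).
MOVE_COST = {"grass": 1, "stone": 2, "food": 1, "goal": 1}


def step_onto(tile, moves_left):
    """Try to step onto a tile; return (moved, moves_left)."""
    if tile == "tree":
        # Trees block movement. Assumed here to cost nothing.
        return False, moves_left
    cost = MOVE_COST[tile]
    if cost > moves_left:
        # Not enough moves remaining to enter this tile.
        return False, moves_left
    moves_left -= cost
    if tile == "food":
        moves_left += FOOD_BONUS
    return True, moves_left


print(step_onto("stone", 10))
print(step_onto("tree", 10))
print(step_onto("food", 3))
```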
The third iteration brings us to the current project. Using feedback from both 2D games, we evolved the audio and survival concepts into a 3D survival game. The game is now played from a first-person perspective, which in a 3D space allows freedom of movement much closer to that of the real world. Instead of turning in place or moving from one tile to the next, the player can now move and turn freely at the same time. 3D also allows sounds to be played using a binaural audio setup. This means that each sound is balanced between the left and right channels according to where its source is. So if an object is to the player’s left, then when the player echolocates, the object sounds like it is to the left. This allows for a much deeper understanding of the game world and allows much more to be communicated to the player through audio, without the use of visuals.
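To illustrate how an object's position can translate into left and right channel levels, here is a minimal sketch of constant-power stereo panning. It is a simplification and not the game's audio pipeline: true binaural rendering also involves timing and filtering differences between the ears, and in the actual project this work is presumably left to the engine's positional audio. The function name `stereo_gains` and the 2D coordinate convention are assumptions.

```python
import math


def stereo_gains(player_xy, heading_rad, object_xy):
    """Return (left_gain, right_gain) for a sound source on a 2D plane.

    heading_rad is the direction the player faces, measured
    counterclockwise from the +x axis.
    """
    dx = object_xy[0] - player_xy[0]
    dy = object_xy[1] - player_xy[1]
    # Angle of the object relative to the player's heading;
    # positive values are to the player's left.
    relative = math.atan2(dy, dx) - heading_rad
    # Pan runs from -1.0 (fully left) to +1.0 (fully right).
    pan = -math.sin(relative)
    # Constant-power pan law keeps perceived loudness steady.
    theta = (pan + 1.0) * math.pi / 4.0
    return math.cos(theta), math.sin(theta)


# Player at the origin facing "up" (+y); an object directly to the left.
left, right = stereo_gains((0.0, 0.0), math.pi / 2, (-1.0, 0.0))
print(left, right)
```

Because only the sine of the relative angle is used, this sketch cannot distinguish an object in front of the player from one behind; a fuller model would need additional cues for that.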
With the loss of visual feedback, all of the player’s knowledge about the game world has to come through audio. The same amount of information that would normally be transmitted through both visual and audio channels now has to be transmitted through sound alone. This brings a few challenges: sounds that are too similar are difficult to tell apart, sounds in a dynamic world can play at the same time, and each sound has to accurately convey what it represents. Normally, video games address these problems by coupling sounds with visual stimuli to convey to the player what is happening. Since we only have sounds, we had to make sure that every sound was short and distinct enough to be easily told apart from the others, and that if two or more sounds played together, each was still recognizable. For example, trees were originally indicated by leaves rustling in the wind, but this sound was too quiet to hear over everything else. The next idea for a tree sound was a knock on wood, but this was too similar to the “thunk” sound used for footsteps. So the sound for a tree was eventually changed to a woodpecker pecking a tree repeatedly. This final sound met the criteria: it is distinctly different from other sounds, loud enough not to be lost but not so loud that it drowns out other things, and recognizable enough to convey the idea of a tree.
User Interfaces and Experience
The player’s character must harvest resources to sustain its energy levels while searching for a set of artifacts, the discovery of which completes the game.
There was some difficulty in determining the best way to design options menus based entirely on auditory and haptic feedback. It was determined that a linear approach would work best in order to reduce UI complexity. Two significant navigation problems remained. First, should the linear menus be cyclical or bounded at the extremes? Second, unlike visual menus, where users can take in a breadth of items and quickly select from them, traditional linear audio menus present one item at a time, so there was a very real need to address the tedium of a low feedback rate.
The first problem was solved on a case by case basis. Some information is distinct enough to allow for non-ambiguous cyclical navigation. Other information introduces ambiguity when certain distinct menu items are similar or even identical to one another. This ambiguity is reduced when a menu is non-cyclical, as the distinctness of such items is communicated by relative position in the list.
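The difference between the two navigation styles comes down to how an out-of-range index is handled. The sketch below is a hypothetical helper, not code from the project; the function name `move_selection` and its parameters are assumptions.

```python
def move_selection(index, delta, count, cyclic):
    """Return the new selected index after moving by delta.

    A cyclic menu wraps around at the ends; a bounded menu stops at
    the first and last items, so position in the list stays meaningful.
    """
    target = index + delta
    if cyclic:
        return target % count
    return max(0, min(count - 1, target))


# Moving "next" from the last of three items:
print(move_selection(2, 1, 3, cyclic=True))   # wraps to the first item
print(move_selection(2, 1, 3, cyclic=False))  # stays on the last item
```

In the bounded case, reaching an end of the list is itself a piece of feedback, which is why it helps when items sound similar or identical.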
The second problem, selection speed, was solved in two ways. The first is the ‘whisper mechanic’: when a menu item is selected, the name of the previous item is whispered in the left audio channel, and the name of the next item is whispered in the right. This adds a very useful layer of depth to general navigation. The second is a non-linear, ‘affordance-based’ help system. Rather than listening to a list of the inputs available, the user can hold down the help button and then engage any other input to be told what that input does in the current context.
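The whisper mechanic can be sketched as a function that decides which item name goes to which channel. This is an assumption-laden illustration rather than the project's implementation: the `whisper_cues` name, the returned dictionary, and the idea that the selected item itself is spoken in the center are all hypothetical.

```python
def whisper_cues(items, index, cyclic=False):
    """Return what to play when items[index] becomes selected.

    "left" is the previous item's name and "right" is the next item's
    name. In a bounded menu, a missing neighbor is None, meaning that
    nothing is whispered on that side.
    """
    count = len(items)
    prev_i, next_i = index - 1, index + 1
    if cyclic:
        prev_i %= count
        next_i %= count
    left = items[prev_i] if 0 <= prev_i < count else None
    right = items[next_i] if 0 <= next_i < count else None
    return {"left": left, "center": items[index], "right": right}


menu = ["Map", "Inventory", "Help"]  # hypothetical menu items
print(whisper_cues(menu, 1))
print(whisper_cues(menu, 0))
```

In a bounded menu, silence on one side tells the user that they have reached an end of the list.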
For 3D navigation of the game world, the primary UI/UX challenge was replacing visual input with comparably informative contextual auditory cues. Once again, we employed two solutions. First, a three-tiered echolocation mechanic was crafted. The echolocation tiers provide short-range and long-range information about game world objects relative to the player’s position by making use of 3D positional audio streams. Additionally, a 2D map menu interface allows the player to place markers in the game world, which they can then hear via positional audio feedback.
This project was developed primarily with the Godot Engine, using its Python-like scripting language, GDScript. Audacity was used for editing audio asset files. Git, GitLab and Git LFS were used for version control and to coordinate collaborative development.
There is much work to be done in improving existing interface features, as well as in developing and testing new and innovative methods and mechanisms for enabling access to software systems through non-visual interfaces. Extensive and iterative user testing would permit prototyping of such features. For example, a “long-range” “radar echo” is planned, which would enable the player to detect game world objects at a much greater distance.
Various features of the game are pending as well, such as hostile entities which the player must use non-visual senses to evade. A day-night system is planned in which different stages of the day affect the immediate goal, motivating players to make use of all the various sub-systems and interactions at different times throughout the experience.
Finally, the full depth of game world navigation via current mechanics and interfaces would be best explored by adding much more variety to the game world. Additional regions would feature unique obstacles, resources, soundscapes and so forth.
Thanks to Eric Kaltman for facilitating the design and development process, and to all those who gave feedback.
Impact of COVID19
Fortunately, thanks to the availability of a wide variety of communication and collaboration tools that support the productivity of remote teams, development and internal communication were largely uninhibited by demands for isolation. Regular communication was maintained through multiple media, both between team members and with the advising professor. Video conferencing enabled more direct communication where necessary to resolve pressing design problems and to engage in such practices as pair programming. We are grateful for all of the technology which allowed the general development experience of this project to go largely unhindered.
That said, certain challenges were encountered in the course of this project as a result of the disruptions caused by the pandemic response. The inability of the development team to be physically present with potential test users impacted testing and quality efforts in no small way. Hardware intended to enhance the experience of these test users, and thus their ability to give useful feedback, had been purchased in advance; however, its use became essentially impossible under conditions of separation. Moreover, given the highly unusual, non-traditional nature of the user-facing systems presented in this project, it was difficult to explain such abstract features remotely to users wholly unacquainted with what was taking place in the game. On the other hand, anonymous survey responses indicated that a portion of the interface implementations were found intuitive and useful despite these obstructions to communication, which was a satisfying testimony to the success of this project’s core efforts.
Through the chaos and stress of the disruptions which have impacted many, this project has reached a state of completion reasonably close to the expectations set at the beginning of the semester. We find this to be an encouraging bright spot in what has been a difficult and trying time for a great number of people around the world. It shows that technologies engineered and supported by individuals who were once in a position similar to ours, as graduating students, have a very real and measurable impact on quality of life, and represent profound opportunities for solving serious problems in ways never before considered. Going forward, we hope such technologies will be there to help not only in times of crisis, but also in everyday circumstances, for individuals of all varieties with unique requirements, preferences and goals.