Monday, October 23, 2023

Thinking beyond audio: Augmenting headphones for everyday digital interactions


This research was accepted by and received a Best Paper Award at ACM Designing Interactive Systems (DIS) 2023, which is dedicated to advancing the field of user-centered system design.

Headphones are traditionally used to provide and manage audio experiences through physical controls and a range of sensors. However, these controls and sensors have remained confined to audio input and output functionality, such as adjusting the volume or muting the microphone. Imagine if headphones could transcend their role as mere audio devices.

Because headphones rank among the most popular wearables on the market, we have an exciting opportunity to expand their capabilities by integrating existing sensors with supplementary ones to enable a wide variety of experiences that go beyond traditional audio control. In our paper, “Beyond Audio: Towards a Design Space of Headphones as a Site for Interaction and Sensing,” we share a vision that explores this potential.

By using sensors such as microphones, proximity sensors, motion sensors, inertial measurement units (IMUs), and LiDARs, headphone designers can explore new avenues of input and interaction. Because headphones are worn on a person’s head, they are well placed for a wide range of applications, such as tracking head movements, body postures, and hand gestures. Furthermore, as wearable devices, headphones have the potential to provide wearers with context-rich information and enable more intuitive and immersive interactions with their devices and environment beyond traditional button-based controls.


Potential scenarios for sensor-enhanced headphones

To explore this concept further, we propose augmenting headphones with additional sensors and input widgets. These include:

  • IMUs to sense head orientation
  • Swappable sets of input controls
  • A range-sensing LiDAR that enables the sensing of hand gestures

By incorporating these capabilities, we envision a wide range of applications where headphone input acts as a bridge between the person wearing it and their environment and enables more efficient and context-aware interactions among multiple devices and tasks. For example, a headphone could assist people with applications like video games or help manage interruptions during a video call.

Let’s explore some scenarios to illustrate the potential of our headphone design concept. Consider a person engaged in a video call with teammates when they’re suddenly interrupted by a colleague who approaches in person. In this situation, our headphones would be equipped to detect contextual cues, such as when the wearer rotates their head away from a video call, signaling a shift in attention. In response, the headphones could automatically blur the video feed and mute the microphone to protect the wearer’s privacy, as shown in Figure 1. This feature could also communicate to other participants that the wearer is temporarily engaged in another conversation or activity. When the wearer returns their attention to the call, the system removes the blur and reactivates the microphone.

Figure 1: Two videos side-by-side showing the headphones in a context-aware privacy-control scenario. On the left, there is an over-the-shoulder view of a wearer participating in a video call on a laptop. As he looks away from the call, the laptop screen changes color, and the application is muted, depicted by a mute icon overlaid on the video. As the wearer looks back at the screen, it becomes unblurred and an unmute icon is overlaid on the image, indicating the mute has been turned off. On the right, we see the laptop screen previously described.
Figure 1. These videos illustrate a context-aware privacy control system implemented during a video conference. In this scenario, the wearer temporarily disengages from the video conference to engage in an in-person conversation. After a predefined period, the system detects that the wearer’s attention remains directed away from any known device, taking into account the environmental context. As a result, privacy measures are triggered, including video blurring, microphone muting, and notifying other participants on the call. Once the wearer re-engages with the screen, their video and microphone settings return to normal, ensuring a seamless experience.
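
The look-away behavior described above can be approximated with a threshold on head yaw plus a dwell timer. The following is a minimal sketch, not the implementation from the paper: the class name, the 35-degree threshold, and the 2-second dwell period are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PrivacyGuard:
    """Enters privacy mode when the head stays turned away long enough."""
    away_angle_deg: float = 35.0   # assumed threshold, not from the paper
    dwell_s: float = 2.0           # assumed value for the "predefined period"
    private: bool = False
    _away_since: Optional[float] = None

    def update(self, yaw_deg: float, t: float) -> bool:
        """yaw_deg: head yaw relative to the screen (0 = facing it).
        t: timestamp in seconds. Returns True while privacy mode is on."""
        if abs(yaw_deg) <= self.away_angle_deg:
            # Wearer faces the screen again: restore video and microphone.
            self._away_since = None
            self.private = False
        else:
            if self._away_since is None:
                self._away_since = t  # start timing the look-away
            if t - self._away_since >= self.dwell_s:
                self.private = True   # blur video, mute mic, notify others
        return self.private
```

The dwell timer keeps a quick glance away from the screen from muting the call; only a sustained shift in attention triggers the privacy measures.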

In another privacy-focused scenario, imagine a person simultaneously conversing with multiple teammates in separate video call channels. Our headphone design allows the wearer to control to whom their speech is directed by simply looking at their intended audience, as shown in Figure 2. This directed speech interaction can extend beyond video calls and be applied to other contexts, such as sending targeted voice commands to teammates in a multiplayer video game.

Figure 2: Two videos side-by-side showing the wearer controlling where his input is being sent among a multitude of devices. On the left, a video shows an over-the-shoulder view of a wearer interacting with a monitor and laptop while wearing headphones. There is a separate video call on each screen. As the wearer turns from one screen to another, a large microphone icon appears on the screen at which the wearer is looking, and a muted microphone icon is shown on the other screen.

The video on the right shows an over-the-shoulder view of a wearer interacting with a laptop while wearing headphones. The laptop screen shows a video game and four circular icons, one in each corner, depicting the other players. The user looks at the bottom left of the screen, which enlarges the icon of the teammate in that corner, and the wearer starts to speak. The wearer then looks at the top-right of the screen, and the teammate in that corner is highlighted while the wearer speaks.
Figure 2. Headphones track the wearer’s head pose, seamlessly facilitating the distribution of video and/or audio across multiple private chats. They effectively communicate the wearer’s availability to other participants, whether in a video conferencing scenario (left) or a gaming scenario (right).
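
One way to picture directed speech is to compare the head yaw against the known bearing of each call window or teammate icon and unmute only the closest one. The sketch below assumes that bearings are given in degrees relative to the wearer; the function names and the 20-degree tolerance are hypothetical and not taken from the paper.

```python
def angular_distance(a_deg: float, b_deg: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a_deg - b_deg + 180.0) % 360.0 - 180.0)


def pick_audience(head_yaw_deg, bearings, max_offset_deg=20.0):
    """Return the target whose bearing is closest to the head yaw,
    or None if no target lies within max_offset_deg (an assumed tolerance)."""
    if not bearings:
        return None
    name = min(bearings, key=lambda n: angular_distance(head_yaw_deg, bearings[n]))
    if angular_distance(head_yaw_deg, bearings[name]) > max_offset_deg:
        return None
    return name


def mic_routing(head_yaw_deg, bearings):
    """Map each target to True (microphone live) or False (muted)."""
    target = pick_audience(head_yaw_deg, bearings)
    return {name: name == target for name in bearings}
```

For example, with `bearings = {"monitor_call": -30.0, "laptop_call": 25.0}`, a head yaw of 20 degrees routes the microphone to `laptop_call` only, while a yaw of 0 degrees leaves both calls muted because neither is within the tolerance.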

In our paper, we also demonstrate how socially recognizable gestures can introduce new forms of audio-visual control instead of relying solely on on-screen controls. For example, wearers could interact with media through gestural actions, such as cupping their ear towards the audio source to increase the volume while simultaneously reducing ambient noise, as shown in Figure 3. These gestures, ingrained in social and cultural contexts, can serve as both control mechanisms and nonverbal communication signals.

Figure 3: Image showing gestural controls for volume.
Figure 3. Top: Raising the earcup, a commonly used gesture to address in-person interruptions, mutes both the sound and the microphone to ensure privacy. Bottom: Cupping the earcup, a gesture indicating difficulty hearing, increases the system volume.

Additionally, we can estimate the wearer’s head gaze through the use of an IMU. When combined with the physical location of computing devices in the wearer’s vicinity, this opens up possibilities for seamless interactions across multiple devices. For instance, during a video call, the wearer can share the screen of the device they’re actively focusing on. In this scenario, the wearer shifts their attention from an external monitor to a tablet device. Although this tablet isn’t directly connected to the main laptop, our system smoothly transitions the screen sharing for the wearer’s audience in the video call, as shown in Figure 4.

Figure 4: Two videos side-by-side showing a headphone wearer among a multitude of devices controlling which screen is shared in a video call. The video on the left shows an over-the-shoulder view of a person interacting with three screens (a monitor, a laptop, and a tablet) while wearing headphones. A video call is in progress on the laptop, and the wearer is giving a presentation, which appears as a slide on the attached monitor. As the wearer turns from the laptop screen to the monitor, the presentation slide appears on the shared laptop screen. The video on the right shows an over-the-shoulder view of the person interacting with the same three screens while wearing headphones. We see the wearer looking at the monitor with a presentation slide, which is mirrored on the laptop screen. He then turns from the monitor to the tablet, which has a drawing app open. As he does this, the drawing app appears on the shared laptop screen. The wearer uses a pen to draw on the tablet, and this is mirrored on the laptop. Finally, the wearer looks up from the tablet to the laptop, and the laptop screen switches to the video call view with the participants’ videos.
Figure 4. A wearer delivers a presentation using a video conferencing tool. As the wearer looks at different devices, the streamed video dynamically updates to display the relevant source to participants.
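
Combining head gaze with device locations can be sketched as a direction-matching problem: turn the yaw and pitch into a forward vector and pick the device whose position aligns best with it. The example below is an illustration under stated assumptions, not the paper's method. The coordinate convention (x right, y up, z forward), the device positions, and the hysteresis margin are all hypothetical.

```python
import math


def forward_vector(yaw_deg, pitch_deg):
    """Unit vector of the head direction; x is right, y is up, z is forward."""
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (math.sin(yaw) * math.cos(pitch),
            math.sin(pitch),
            math.cos(yaw) * math.cos(pitch))


def focused_device(yaw_deg, pitch_deg, devices, current=None, margin=0.05):
    """devices maps a name to a nonzero (x, y, z) position relative to the head.
    Returns the device best aligned with the gaze. The currently shared device
    is kept unless another beats it by `margin` (assumed value) to avoid flicker."""
    gaze = forward_vector(yaw_deg, pitch_deg)

    def alignment(pos):
        # Cosine of the angle between the gaze and the direction to the device.
        norm = math.sqrt(sum(c * c for c in pos))
        return sum(g * p for g, p in zip(gaze, pos)) / norm

    scores = {name: alignment(pos) for name, pos in devices.items()}
    best = max(scores, key=scores.get)
    if current in scores and scores[best] - scores[current] < margin:
        return current
    return best
```

A caller would feed the result into whatever screen-sharing mechanism is in use; the small margin keeps the shared source from flipping back and forth when the wearer looks at a point between two devices.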

Finally, in our paper we also show the use of embodied interactions, where the wearer’s body movements serve to animate a digital representation of themselves, such as an avatar in a video call, as shown in Figure 5. This feature can also be implemented as a gameplay mechanism. Take a racing game, for instance, where the wearer’s body movements could control the vehicle’s steering, shown on the left in Figure 6. To extend this capability, these movements could enable a wearer to peek around obstacles in any first-person game, enhancing the immersion and gameplay experience, shown on the right in Figure 6.

Figure 5: Two videos showing a headphone wearer controlling an avatar in a video call through head movements. The video on the left shows an over-the-shoulder view of a headphones wearer interacting with another participant on the call. The video on the right shows a wearer using a touch control to depict an emotion in his avatar.
Figure 5. Left: Headphones use an IMU to monitor and capture natural body movements, which are then translated into corresponding avatar movements. Right: Touch controls integrated into headphones enable wearers to evoke a range of emotions on the avatar, enhancing the user experience.
Figure 6: Two videos showing a wearer playing a video game while leaning left and right. These movements control his character’s movements, enabling him to duck and peek around walls.
Figure 6. Leaning while wearing the headphones (with an integrated IMU) has a direct impact on gameplay action. On the left, it results in swerving the car to the side, while on the right, it allows the player to duck behind a wall.
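
The lean-to-steer mapping can be illustrated as a simple transfer function from an IMU roll angle to a normalized steering value. This is a sketch under assumptions: the dead zone, the maximum lean angle, and the function name are hypothetical, and a real game would likely add smoothing.

```python
import math


def lean_to_steering(roll_deg, dead_zone_deg=3.0, max_lean_deg=20.0):
    """Map a lean (roll) angle to a steering value in [-1.0, 1.0].

    Leans inside the dead zone are ignored so that small posture shifts
    do not steer the car; leans beyond max_lean_deg saturate at full lock.
    Both limits are assumed values for illustration."""
    magnitude = abs(roll_deg)
    if magnitude <= dead_zone_deg:
        return 0.0
    clamped = min(magnitude, max_lean_deg)
    scaled = (clamped - dead_zone_deg) / (max_lean_deg - dead_zone_deg)
    return math.copysign(scaled, roll_deg)
```

The same function could drive the peek-around-a-wall interaction by treating the output as a lateral camera offset instead of a steering input.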

Design space for headphone interactions

We define a design space for interactive headphones through an exploration of two distinct concepts, which we discuss in depth in our paper.

First, we look at the type of input gesture for the interaction, which we further classify into three categories. The gestural input from the wearer might fall under one or more of these categories, which we outline in more detail below and illustrate in Figure 7.

  • Touch-based gestures that involve tangible inputs on the headphones, such as buttons or knobs, requiring physical contact by the wearer
  • Mid-air gestures, which the wearer makes with their hands in close proximity to the headphones, detected by LiDAR technology
  • Head orientation, indicating the direction of the wearer’s attention
Figure 7: List of three stylized images showing the three main kinds of gestures we look at: touch, head orientation, and mid-air gestures.
Figure 7. Sensor-enhanced headphones can use touch-based gestures (left), head orientation (middle), or mid-air gestures (right) as types of input.

The second way that we define the design space is through the context within which the wearer executes the action. Here, design considerations for sensor-enhanced headphones go beyond user intentionality and observed motion. Context-awareness enables these headphones to understand the wearer’s activities, the applications they’re engaged with, and the devices in their vicinity, as illustrated in Figure 8. This understanding enables the headphones to provide personalized experiences and seamlessly integrate with the wearer’s environment. The four categories that define this context-awareness are as follows:

  • Context-free actions, which produce similar results regardless of the active application, the wearer’s activity, or the social or physical environment.
  • Context that’s defined by the application with which the wearer is interacting. For example, are they listening to music, on a video call, or watching a movie?
  • Context that’s defined by the wearer’s body. For example, is the wearer’s gesture close to a body part that has an associated meaning? Eyes might relate to visual functions, ears to audio input, and the mouth to audio output.
  • Context that’s defined by the wearer’s environment. For example, are there other devices or people around the wearer with whom they might want to interact?
Figure 8: Diagram showing the different levels of context we look at: context free, application, user's body, and the environment.
Figure 8. The system uses diverse contextual information to enable personalized responses to user input.
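
The two axes of the design space, input type and context, can be pictured as a small data model with a lookup that resolves a gesture to an action. The sketch below is illustrative only: the gesture names, action names, binding table, and the order in which contexts are checked are assumptions, not details from the paper.

```python
from enum import Enum, Flag, auto


class InputType(Flag):
    """A gesture may fall under one or more input categories."""
    TOUCH = auto()
    MID_AIR = auto()
    HEAD_ORIENTATION = auto()


class Context(Enum):
    CONTEXT_FREE = "context-free"
    APPLICATION = "application"
    BODY = "body"
    ENVIRONMENT = "environment"


# Hypothetical classification of gestures by input type.
GESTURE_TYPES = {
    "raise_earcup": InputType.TOUCH,
    "cup_ear": InputType.TOUCH | InputType.MID_AIR,
    "look_away": InputType.HEAD_ORIENTATION,
}

# Hypothetical bindings: (gesture, context level, context value) -> action.
BINDINGS = {
    ("raise_earcup", Context.CONTEXT_FREE, None): "mute_audio_and_mic",
    ("cup_ear", Context.BODY, "ear"): "volume_up",
    ("look_away", Context.APPLICATION, "video_call"): "blur_video_and_mute",
    ("look_at", Context.ENVIRONMENT, "tablet"): "share_tablet_screen",
}


def resolve(gesture, state):
    """state maps a Context level to its current value, for example
    {Context.APPLICATION: "video_call"}. Context-dependent bindings are
    tried first (an assumed ordering), then the context-free fallback."""
    for level in (Context.ENVIRONMENT, Context.BODY, Context.APPLICATION):
        action = BINDINGS.get((gesture, level, state.get(level)))
        if action is not None:
            return action
    return BINDINGS.get((gesture, Context.CONTEXT_FREE, None))
```

With this table, `resolve("look_away", {Context.APPLICATION: "video_call"})` yields the privacy action, while the same gesture during music playback resolves to nothing, which mirrors the idea that one input can mean different things in different contexts.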

Looking ahead: Expanding the possibilities of HCI with everyday wearables

Sensor-enhanced headphones offer a promising avenue for designers to create immersive and context-aware user experiences. By incorporating sensors, these headphones can capture subtle user behaviors, facilitating seamless interactions and enhancing the wearer’s overall experience.

From safeguarding privacy to providing intuitive control mechanisms, the potential applications for sensor-enhanced headphones are vast and exciting. This exploration with headphones only scratches the surface of what context-aware wearable technology can empower its wearers to achieve. Consider the multitude of wearables we use every day that could benefit from integrating similar sensing and interaction capabilities. For example, imagine a watch that can track your hand movements and detect gestures. By enabling communication between sensor-enhanced wearables, we can establish a cohesive ecosystem for human-computer interaction that spans applications, devices, and social contexts.




