BERYL – new breakthrough Acoustic Echo Cancellation by Meta

I attended Meta’s RTC@Scale 2024 conference, where Meta talked about two major changes it made while revamping its core audio processing stack: BERYL, a new Acoustic Echo Cancellation (AEC) implementation, and MLOW, a new low-bitrate audio codec written fully in software. This blog contains my notes on Beryl. PDF of handwritten notes can be found here.

BERYL – full software AEC (by Sriram Srinivasan & Hoang Do)

  • Meta achieved a 20% reduction in “No Audio” or “Audio device reliability” issues on iOS & Android
  • 15% reduction in P50 mouth-to-ear latency on Android
  • Revamp of the core audio processing stack for WhatsApp, Instagram, Messenger
    • Very diverse user base
    • Different kinds of handsets
    • Different Geography
    • Noisy conditions
    • Both high end & low end phones (more than 20% are low end ARMv7)
  • Based on telemetry and user feedback, Meta decided to tackle two problems: (1) echo and (2) audio quality on low-bitrate networks
  • High end devices use ML to suppress echo
  • To accommodate low end devices which cannot run ML, a baseline solution for echo cancellation is needed
  • Welcome BERYL
  • Beryl replaces WebRTC’s AEC3 and AECM on all devices
  • Interestingly, users experiencing echo issues are also on low end devices which cannot run ML
  • Meta’s scale is too large
    • High end phones have hardware AEC
    • Low end phones do not
    • Stereo / spatial audio only possible in s/w
    • H/w only does mono AEC
  • Beryl was needed because AECM either leaves a lot of residual echo or degrades the quality of double-talk
  • AECM – Not scalable for millions of users & Quality not best
  • Beryl AEC = Low compute – DSP based s/w AEC
    • Lite mode for low end devices
    • Full mode for high end devices
    • Both modes are adaptive, vs. AECM being a simple echo suppressor
    • Near instant adaptation to changes
    • Better double talk performance
    • Multi-channel capture & render at 16 kHz & 48 kHz
    • Tuned using 3000 music + speech (mono + stereo) on 20+ devices
    • CPU usage increase of less than 7% compared to WebRTC AEC

Beryl Components

1. Delay Estimator

  • Clock drift occurs when using an external mic & speaker, as they do not share a common clock
  • The delay estimator estimates the delay between the far-end reference signal (speaker) & the near-end capture signal (mic)
  • Beryl full mode can handle non-causal delays (negative delay)
  • Can handle delay up to 1 sec
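
To make the delay estimator idea concrete, here is a minimal sketch that finds the lag maximizing the normalized cross-correlation between reference and capture. This is an assumption for illustration, not Beryl's actual estimator (the talk did not describe its internals). The names `estimate_delay` and `delayed` are hypothetical, and the echo path is just a scaled, shifted copy of the far-end signal.

```python
import random

def estimate_delay(reference, capture, max_lag):
    """Return the lag (in samples) that maximizes normalized cross-correlation.

    Positive lag: capture lags the reference (the usual causal case).
    Negative lag: capture leads the reference (a non-causal delay).
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        acc = ref_energy = cap_energy = 0.0
        for i in range(len(reference)):
            j = i + lag
            if 0 <= j < len(capture):
                acc += reference[i] * capture[j]
                ref_energy += reference[i] ** 2
                cap_energy += capture[j] ** 2
        denom = (ref_energy * cap_energy) ** 0.5
        score = acc / denom if denom > 0 else 0.0
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def delayed(signal, delay, gain=0.5):
    """Hypothetical echo path: a scaled copy shifted by `delay` samples."""
    n = len(signal)
    return [gain * signal[i - delay] if 0 <= i - delay < n else 0.0
            for i in range(n)]

rng = random.Random(0)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(400)]  # stand-in for speech
causal_lag = estimate_delay(far_end, delayed(far_end, 12), max_lag=32)
non_causal_lag = estimate_delay(far_end, delayed(far_end, -5), max_lag=32)
```

Searching negative lags as well as positive ones is what lets the sketch report a non-causal delay. A brute-force search like this is far too slow for a 1 sec range at 48 kHz; a real estimator would work on block or frequency-domain features.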

2. Linear AEC

  • Estimate echo & subtract from capture signal
  • Beryl AEC is a normalized least mean squares (NLMS) frequency-domain dual-filter algorithm
  • One fixed & one adaptive filter
  • Coefficients can be copied between filters, based on:
    • Relative difference in the powers of the error signals of the two filters and the input mic signal
    • Coupling factor between echo estimate & error signal
  • Adaptation step size is configurable & depends on coherence between mic & reference signals, power and SIR
  • Great double talk performance compared to WebRTC AEC
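
The two-filter idea can be sketched as follows. Beryl works in the frequency domain; this sketch is a time-domain simplification so it fits in a few lines. The copy rule (adopt the adaptive filter when its error power over a block is at least 3 dB lower), the fixed step size `mu`, and all names are assumptions for illustration, not Meta's actual logic. Near-end speech (double talk) is not modeled.

```python
import random

def dual_filter_nlms(reference, mic, taps=8, mu=0.5, block=64, eps=1e-8):
    """Time-domain two-filter echo canceller (illustrative only).

    The adaptive filter is updated by NLMS on every sample. The fixed filter
    produces the output and only receives the adaptive coefficients when the
    adaptive filter had clearly lower error power over the last block.
    """
    adaptive = [0.0] * taps
    fixed = [0.0] * taps
    history = [0.0] * taps          # recent reference samples, newest first
    out = []
    err_adaptive = err_fixed = 0.0  # per-block error power accumulators

    for n, (x, d) in enumerate(zip(reference, mic), start=1):
        history = [x] + history[:-1]
        # Error = mic signal minus each filter's echo estimate.
        e_fixed = d - sum(w * h for w, h in zip(fixed, history))
        e_adapt = d - sum(w * h for w, h in zip(adaptive, history))
        out.append(e_fixed)

        # NLMS update: step normalized by the reference power in the window.
        norm = sum(h * h for h in history) + eps
        adaptive = [w + mu * e_adapt * h / norm
                    for w, h in zip(adaptive, history)]

        err_fixed += e_fixed ** 2
        err_adaptive += e_adapt ** 2
        if n % block == 0:
            # Hypothetical copy rule: adopt the adaptive coefficients if they
            # were at least 3 dB better than the fixed ones over this block.
            if err_adaptive < 0.5 * err_fixed:
                fixed = list(adaptive)
            err_adaptive = err_fixed = 0.0
    return out, fixed

rng = random.Random(1)
far_end = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
echo_path = [0.6, 0.3, -0.2]        # hypothetical room response
mic = [sum(h * far_end[n - k] for k, h in enumerate(echo_path) if n - k >= 0)
       for n in range(len(far_end))]
residual, learned = dual_filter_nlms(far_end, mic)
```

The point of the second filter is that the output never comes directly from a filter that is mid-adaptation: if the adaptive filter diverges (for example during double talk), the fixed filter keeps the last good coefficients.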

3. Acoustic Echo Suppressor (AES)

  • Non-linear distortions are introduced by the amplifiers before the speaker and after the microphone
  • AES removes this non-linear echo (residual echo)
  • AES removes stationary echo noise, distortion, applies perceptual filtering & ambient noise matching

Implementation

  • Reduce memory, CPU & latency
  • Synchronization is needed because audio from the input & output devices is processed on different threads. Options:
    • mutex in functions (Good safety but worse real time performance)
    • Low level locks on shared data structures
    • Thread-safe low level data structures (ok safety, great real time performance)
  • NEON on ARMv7 & ARM64
  • AVX on Intel
  • CPU <110% of WebRTC AEC
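
As an example of the "thread-safe low level data structure" option, here is a minimal sketch of a single-producer/single-consumer ring buffer, the kind of structure commonly used to pass reference audio from a render thread to a capture thread without a mutex. The talk did not say which structures Beryl uses, so this is an assumption, and `SpscRingBuffer` and `run_demo` are hypothetical names. In a real audio stack this would be C or C++ with atomic indices; here CPython's global interpreter lock stands in for the atomics.

```python
import threading
import time

class SpscRingBuffer:
    """Single-producer/single-consumer ring buffer (illustrative).

    Only the producer writes `_head`; only the consumer writes `_tail`, so
    neither side takes a lock. One slot stays empty to tell full from empty.
    """
    def __init__(self, capacity):
        self._buf = [0.0] * (capacity + 1)
        self._head = 0   # next write position, owned by the producer
        self._tail = 0   # next read position, owned by the consumer

    def try_push(self, sample):
        nxt = (self._head + 1) % len(self._buf)
        if nxt == self._tail:
            return False          # full: caller decides to drop or retry
        self._buf[self._head] = sample
        self._head = nxt          # publish only after the slot is written
        return True

    def try_pop(self):
        if self._tail == self._head:
            return None           # empty
        sample = self._buf[self._tail]
        self._tail = (self._tail + 1) % len(self._buf)
        return sample

def run_demo(count=200, capacity=16):
    """Push `count` samples from one thread and pop them on another."""
    ring = SpscRingBuffer(capacity)
    received = []

    def render_thread():          # producer: far-end reference samples
        i = 0
        while i < count:
            if ring.try_push(float(i)):
                i += 1
            else:
                time.sleep(0)     # buffer full, yield to the consumer

    def capture_thread():         # consumer: reads the reference for the AEC
        while len(received) < count:
            sample = ring.try_pop()
            if sample is None:
                time.sleep(0)     # buffer empty, yield to the producer
            else:
                received.append(sample)

    threads = [threading.Thread(target=render_thread),
               threading.Thread(target=capture_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received
```

Neither `try_push` nor `try_pop` ever blocks, which is why this style gives better real time behaviour than a mutex around whole functions: an audio callback can never be stalled by the other thread holding a lock.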
