How It Works

A technical deep-dive into the audio capture, frequency analysis, and GPU rendering pipeline that powers System Audio Visualizer.

Step 1: Audio Capture

System Audio Visualizer uses the Screen Capture API (navigator.mediaDevices.getDisplayMedia) to capture audio from your screen or browser tab. When you click “Start Visualization,” the browser presents a dialog where you can choose which screen, window, or tab to share.

The critical step is enabling the “Share audio” checkbox in the browser dialog. Without this, no audio data reaches the visualizer. Once sharing begins, we immediately stop the video track since we only need the audio stream, reducing CPU and bandwidth usage.
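The capture flow can be sketched as follows. `startCapture` is a hypothetical helper name, not the app's actual code; it shows the documented behavior of requesting audio and dropping the video track right away.

```javascript
// Hypothetical helper sketching the capture flow described above.
async function startCapture() {
  // A video track must be requested, but we only keep the audio.
  const stream = await navigator.mediaDevices.getDisplayMedia({
    video: true,
    audio: true, // only delivered if "Share audio" was ticked
  });
  // Stop the video track immediately to reduce CPU and bandwidth usage.
  for (const track of stream.getVideoTracks()) track.stop();
  if (stream.getAudioTracks().length === 0) {
    throw new Error('No audio track: enable "Share audio" in the dialog');
  }
  return stream;
}
```

Stopping the video track rather than omitting it from the request matters because some browsers reject audio-only getDisplayMedia calls.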

Step 2: Audio Context and Analysis

The captured audio stream is connected to the Web Audio API processing pipeline:

  1. An AudioContext is created to manage the audio processing graph.
  2. A MediaStreamSource node connects the captured stream to the audio graph.
  3. An AnalyserNode is attached to perform real-time Fast Fourier Transform (FFT) analysis.

The AnalyserNode is configured with an FFT size of 1024, providing 512 frequency bins. A smoothing time constant of 0.8 prevents jittery visual updates while maintaining responsiveness.
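The wiring above can be sketched as a small helper (`connectAnalyser` is a hypothetical name; it takes an existing AudioContext and the captured MediaStream):

```javascript
// Hypothetical helper wiring the captured stream into an AnalyserNode.
function connectAnalyser(audioContext, stream) {
  // 2. Connect the captured stream to the audio graph.
  const source = audioContext.createMediaStreamSource(stream);
  // 3. Attach an AnalyserNode for real-time FFT analysis.
  const analyser = audioContext.createAnalyser();
  analyser.fftSize = 1024;              // yields 512 frequency bins
  analyser.smoothingTimeConstant = 0.8; // smooths frame-to-frame jitter
  source.connect(analyser);
  return analyser;
}
```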

Step 3: Frequency Extraction

On every animation frame, the audio engine extracts two key data arrays from the AnalyserNode:

  1. Frequency-domain data (getByteFrequencyData): the magnitude of each of the 512 frequency bins.
  2. Time-domain data (getByteTimeDomainData): the raw waveform, used to compute overall amplitude.

From the frequency data, the engine computes three energy bands: bass, mid, and treble, each averaged over its slice of the spectrum.
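A sketch of the band computation is below. The cutoff frequencies (20–250 Hz bass, 250 Hz–4 kHz mid, 4–16 kHz treble) and the `energyBands` helper name are illustrative assumptions, not the app's actual values.

```javascript
// Illustrative band split; the cutoff frequencies are assumptions.
function energyBands(freqData, sampleRate = 48000) {
  // Each bin covers (sampleRate / 2) / binCount Hz of spectrum.
  const binHz = sampleRate / 2 / freqData.length;
  const avg = (loHz, hiHz) => {
    const a = Math.max(0, Math.floor(loHz / binHz));
    const b = Math.min(freqData.length, Math.ceil(hiHz / binHz));
    let sum = 0;
    for (let i = a; i < b; i++) sum += freqData[i];
    return b > a ? sum / (b - a) / 255 : 0; // normalize to 0..1
  };
  return {
    bass: avg(20, 250),
    mid: avg(250, 4000),
    treble: avg(4000, 16000),
  };
}
```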

Step 4: Beat Detection

The beat detection algorithm monitors the bass energy band for sudden peaks. When the current bass energy exceeds the previous frame's energy by a threshold factor (1.4x) and the absolute bass level is above 0.3, a beat is flagged. A cooldown of 150ms prevents rapid false triggers from sustained bass.
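The detector described above can be sketched as a small stateful function; the constants mirror the stated values (1.4x ratio, 0.3 floor, 150ms cooldown), while the `makeBeatDetector` name is illustrative:

```javascript
// Minimal sketch of the thresholded peak detector described above.
const THRESHOLD_RATIO = 1.4; // current bass must exceed 1.4x previous
const MIN_BASS = 0.3;        // absolute bass floor
const COOLDOWN_MS = 150;     // suppress rapid false triggers

function makeBeatDetector() {
  let prevBass = 0;
  let lastBeatAt = -Infinity;
  return function isBeat(bass, nowMs) {
    const beat =
      bass > prevBass * THRESHOLD_RATIO &&
      bass > MIN_BASS &&
      nowMs - lastBeatAt >= COOLDOWN_MS;
    if (beat) lastBeatAt = nowMs;
    prevBass = bass;
    return beat;
  };
}
```

The comparison against the previous frame detects sudden onsets, while the cooldown keeps sustained bass from re-triggering every frame.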

Beat events drive dramatic visual responses like shockwaves in the Bass Shockwave preset and expansion bursts in Circular Pulse Ring.

Step 5: GPU Rendering

Audio metrics are passed to the WebGL rendering pipeline powered by Three.js. Each visualization preset defines custom GLSL vertex and fragment shaders that run on the GPU.

The rendering loop is driven by requestAnimationFrame, typically at 60fps (matching the display's refresh rate):

  1. Read audio frequency data from the AnalyserNode.
  2. Compute bass, mid, treble, amplitude, and beat metrics.
  3. Update shader uniform values with the new audio data.
  4. Update any particle systems or geometry transformations.
  5. Render the frame to the fullscreen canvas.
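One iteration of that loop might look like the sketch below. The uniform names (uAmplitude, uBeat) and the `renderFrame` signature are assumptions; the full band-splitting and geometry updates are elided.

```javascript
// One frame of the render loop. The Three.js objects (renderer, scene,
// camera), the AnalyserNode, and the beat detector from Step 4 are
// passed in rather than created here.
function renderFrame(analyser, uniforms, renderer, scene, camera, detectBeat, nowMs) {
  // 1. Read frequency data from the AnalyserNode.
  const freq = new Uint8Array(analyser.frequencyBinCount);
  analyser.getByteFrequencyData(freq);
  // 2. Compute metrics (only overall amplitude shown here).
  let sum = 0;
  for (const v of freq) sum += v;
  const amplitude = sum / freq.length / 255;
  // 3. Push the new audio data into the shader uniforms.
  uniforms.uAmplitude.value = amplitude;
  uniforms.uBeat.value = detectBeat(amplitude, nowMs) ? 1.0 : 0.0;
  // 4. (particle system / geometry updates would go here)
  // 5. Render the frame to the fullscreen canvas.
  renderer.render(scene, camera);
}
```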

Step 6: Shader Techniques

Different presets use different GPU rendering techniques.

Performance Optimization

Several strategies ensure smooth performance.

Full Feature List

See every feature in detail.

FAQ

Common questions and troubleshooting.

Last updated: March 2026