Building a Recording Studio App with Backing Tracks

Musicians practicing or recording want to:

  1. Play along with a backing track (drums, bass, full mix)
  2. Record their performance with studio-quality audio
  3. Control the blend between backing track and wet effects signal
  4. Zero latency - everything happens on-device
flowchart LR
    subgraph Device["Hoopi Pedal"]
        subgraph ESP["ESP32"]
            SD_READ[SD Card<br/>Read WAV]
            SD_WRITE[SD Card<br/>Write WAV]
            I2S_TX[I2S TX]
            I2S_RX[I2S RX]
        end
        subgraph DSP["Daisy Seed"]
            BT_IN[Backing Track In]
            FX[Effects Engine]
            MIX_OUT[Output Mixer]
            MIX_REC[Recording Mixer]
            WET_OUT[Wet Audio Out]
        end
        HP[Headphones]
    end
    SD_READ -->|Read WAV| I2S_TX
    I2S_TX -->|Backing Track| BT_IN
    BT_IN --> MIX_OUT
    FX -->|Wet Signal| MIX_OUT
    FX -->|Wet Signal| MIX_REC
    BT_IN -.->|If flag set| MIX_REC
    MIX_OUT --> HP
    MIX_REC --> WET_OUT
    WET_OUT -->|Recording Audio| I2S_RX
    I2S_RX --> SD_WRITE

Key insight: The backing track never leaves the device. Recording happens on the ESP32; the DSP handles mixing and effects.


Architecture Overview

flowchart TB
    subgraph App["Flutter App"]
        LIB[Track Library]
        UPLOAD[Upload Service]
        API[API Service]
        CTRL[Blend Controls]
    end
    subgraph ESP["ESP32"]
        REST[REST API]
        FS[SD Card FS]
        STREAM[WAV Streamer]
        RECORDER[WAV Recorder]
        I2S[I2S Driver<br/>Bidirectional]
    end
    subgraph Daisy["Daisy Seed DSP"]
        RX[Backing Track RX]
        TX[Recording TX]
        EFFECTS[Effects Chain]
        BLEND_OUT[Output Blend]
        BLEND_REC[Recording Blend]
    end
    LIB --> UPLOAD
    UPLOAD -->|HTTP POST| REST
    REST --> FS
    CTRL --> API
    API -->|HTTP| REST
    REST -->|UART| Daisy
    FS --> STREAM
    STREAM -->|I2S TX| RX
    RX --> BLEND_OUT
    EFFECTS --> BLEND_OUT
    EFFECTS --> BLEND_REC
    RX -.->|Optional| BLEND_REC
    BLEND_REC -->|I2S RX| RECORDER
    RECORDER --> FS

Bidirectional I2S Flow

The ESP32 and Daisy communicate over a bidirectional I2S bus:

sequenceDiagram
    participant SD as SD Card
    participant ESP as ESP32
    participant DSP as Daisy Seed
    participant HP as Headphones
    Note over ESP,DSP: I2S Bus (Full Duplex)
    loop Every audio frame (48kHz)
        SD->>ESP: Read backing track samples
        ESP->>DSP: I2S TX: Backing track L/R
        DSP->>DSP: Blend backing + wet for output
        DSP->>HP: Output blended audio
        DSP->>DSP: Mix wet (+ backing if flag)
        DSP->>ESP: I2S RX: Recording audio L/R
        ESP->>SD: Write recording samples
    end

Audio Signal Routing

flowchart TB
    subgraph Inputs["Inputs"]
        GUITAR[Guitar/Instrument]
        BT[Backing Track<br/>from ESP32 I2S]
    end
    subgraph DSP["Daisy Seed Processing"]
        FX[Effects Chain<br/>Reverb, Delay, etc.]
        subgraph OutputPath["Output Path (Headphones)"]
            MIX_HP[Blend Mixer]
        end
        subgraph RecordPath["Recording Path (to ESP32)"]
            MIX_REC[Recording Mixer]
        end
    end
    subgraph Outputs["Outputs"]
        HP[Headphones<br/>Backing + Wet]
        I2S_OUT[I2S to ESP32<br/>Wet + optional Backing]
    end
    GUITAR --> FX
    FX -->|Wet signal| MIX_HP
    BT -->|Backing audio| MIX_HP
    MIX_HP --> HP
    FX -->|Wet signal<br/>Always| MIX_REC
    BT -.->|Backing audio<br/>If blend_to_recording flag| MIX_REC
    MIX_REC --> I2S_OUT

Key points:

  • Output (headphones): Always backing track + wet effects blended
  • Recording (to ESP32): Always wet effects, optionally includes backing track if flag is set
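These routing rules reduce to a few lines of per-sample mixer code. A minimal sketch of how the Daisy side might implement the two mixers (the names, 0-255 levels, and >>8 normalization are assumptions for illustration, not the actual firmware; pan is omitted for brevity):

```c
#include <stdint.h>

// Hypothetical per-sample blend, mirroring the two mixers in the diagram.
// Levels are 0-255 (as in the blend command); >>8 renormalizes the product.
static int16_t clamp16(int32_t v) {
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

typedef struct {
    uint8_t backing_level;  // backing track level in the headphone mix
    uint8_t wet_level;      // effects level in the headphone mix
    uint8_t blend_to_rec;   // 0 = record wet only, 1 = wet + backing
} blend_t;

void process_sample(int16_t backing, int16_t wet, const blend_t *b,
                    int16_t *out_hp, int16_t *out_rec) {
    // Headphone path: always backing + wet, each scaled by its level
    *out_hp = clamp16(((int32_t)backing * b->backing_level +
                       (int32_t)wet * b->wet_level) >> 8);

    // Recording path: wet is always present; backing only if the flag is set
    int32_t rec = wet;
    if (b->blend_to_rec) {
        rec += ((int32_t)backing * b->backing_level) >> 8;
    }
    *out_rec = clamp16(rec);
}
```

Note the asymmetry: the headphone mix scales the wet signal too, while the recording path passes wet through at unity so the captured take stays clean.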

Backing Track Sources

Tracks on the device SD card can come from:

  1. Previous recordings - Practice along with your own takes
  2. Uploaded tracks - Push WAV files from the app
  3. Downloaded content - Backing tracks from the cloud
class BackingTrack {
  final String filename;       // Filename on device SD card
  final String displayName;    // User-friendly name
  final Duration duration;
  final int sampleRate;        // Always 48000
  final int bitDepth;          // Always 16
  final BackingTrackSource source;

  const BackingTrack({
    required this.filename,
    required this.displayName,
    required this.duration,
    this.sampleRate = 48000,
    this.bitDepth = 16,
    required this.source,
  });
}

enum BackingTrackSource {
  recording,    // Previous session recording
  uploaded,     // Pushed from app
  downloaded,   // From cloud library
}

Uploading Tracks to Device

Tracks must be 16-bit/48kHz WAV format for I2S compatibility:

class BackingTrackUploadService {
  final ApiService _apiService;

  Future<bool> uploadTrack(String localPath, String deviceFilename) async {
    final file = File(localPath);

    // Verify format
    final header = await _readWavHeader(file);
    if (header.sampleRate != 48000 || header.bitsPerSample != 16) {
      throw FormatException(
        'Backing tracks must be 16-bit/48kHz WAV. '
        'Got ${header.bitsPerSample}-bit/${header.sampleRate}Hz'
      );
    }

    // Upload to device
    final bytes = await file.readAsBytes();

    final response = await http.post(
      Uri.parse('${_apiService.baseUrl}/upload/backingtrack'),
      headers: {
        'Content-Type': 'application/octet-stream',
        'X-Filename': deviceFilename,
        'X-Duration': header.duration.inSeconds.toString(),
      },
      body: bytes,
    );

    return response.statusCode == 200;
  }

  Future<WavHeader> _readWavHeader(File file) async {
    // Accumulate the full 44-byte header; openRead may emit multiple chunks
    final bytes = <int>[];
    await for (final chunk in file.openRead(0, 44)) {
      bytes.addAll(chunk);
    }
    return WavHeader.parse(Uint8List.fromList(bytes));
  }
}
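WavHeader.parse is assumed above rather than shown. For reference, a minimal parser for the canonical 44-byte PCM layout might look like the sketch below; it assumes "fmt " directly follows "WAVE" and "data" follows "fmt ", which holds for most encoders but not for files carrying extra chunks such as LIST:

```c
#include <stdint.h>
#include <string.h>
#include <stdbool.h>

// Fields we care about from a canonical 44-byte PCM WAV header.
typedef struct {
    uint32_t sample_rate;
    uint16_t bits_per_sample;
    uint16_t num_channels;
    uint32_t data_bytes;
} wav_header_t;

// WAV fields are little-endian regardless of host byte order.
static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p) {
    return p[0] | (p[1] << 8) | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

bool wav_parse_header(const uint8_t *buf, wav_header_t *h) {
    if (memcmp(buf, "RIFF", 4) != 0 || memcmp(buf + 8, "WAVE", 4) != 0)
        return false;
    if (memcmp(buf + 12, "fmt ", 4) != 0)  // canonical layout only
        return false;
    h->num_channels    = rd16(buf + 22);
    h->sample_rate     = rd32(buf + 24);
    h->bits_per_sample = rd16(buf + 34);
    h->data_bytes      = rd32(buf + 40);
    return true;
}
```

A stricter validator would walk the chunk list instead of assuming fixed offsets, but for files the device itself wrote, the canonical layout is a safe bet.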

Progress Tracking

Future<void> uploadWithProgress(
  String localPath,
  String deviceFilename,
  void Function(double progress) onProgress,
) async {
  final file = File(localPath);
  final fileSize = await file.length();

  final request = http.StreamedRequest(
    'POST',
    Uri.parse('${_apiService.baseUrl}/upload/backingtrack'),
  );

  request.headers['Content-Type'] = 'application/octet-stream';
  request.headers['X-Filename'] = deviceFilename;
  // Use the contentLength property; a manually set Content-Length header
  // may be ignored for streamed requests
  request.contentLength = fileSize;

  int uploaded = 0;
  final stream = file.openRead().transform(
    StreamTransformer.fromHandlers(
      handleData: (data, sink) {
        uploaded += data.length;
        onProgress(uploaded / fileSize);
        sink.add(data);
      },
    ),
  );

  // Deliberately not awaited: send() consumes the sink as it's fed
  request.sink.addStream(stream).then((_) => request.sink.close());

  final response = await request.send();
  if (response.statusCode != 200) {
    throw Exception('Upload failed: ${response.statusCode}');
  }
}

Listing Available Tracks

The device API returns all WAV files that can be used as backing tracks:

Future<List<BackingTrack>> getAvailableTracks() async {
  final response = await http.get(
    Uri.parse('${_apiService.baseUrl}/api/backingtracks'),
  );

  if (response.statusCode != 200) {
    throw Exception('Failed to list backing tracks');
  }

  final List<dynamic> json = jsonDecode(response.body);

  return json.map((item) => BackingTrack(
    filename: item['filename'],
    displayName: item['name'] ?? _extractName(item['filename']),
    duration: Duration(seconds: item['duration_sec'] ?? 0),
    source: BackingTrackSource.values.byName(item['source'] ?? 'uploaded'),
  )).toList();
}

String _extractName(String filename) {
  // "my_song.wav" -> "my song"
  return filename
      .replaceAll('.wav', '')
      .replaceAll('_', ' ');
}

Enabling a Backing Track

sequenceDiagram
    participant App as Flutter App
    participant ESP as ESP32
    participant DSP as Daisy Seed
    App->>ESP: POST /api/backingtrack/enable<br/>{filename: "drums.wav", blendToRecording: false}
    ESP->>ESP: Open WAV file
    ESP->>ESP: Parse header, seek to data
    ESP->>DSP: UART: CMD_BACKING_TRACK_ENABLE
    DSP->>DSP: Configure blend flags
    DSP-->>ESP: ACK
    ESP-->>App: {status: "enabled", filename: "drums.wav"}
    Note over ESP,DSP: Ready to stream on recording start

API Call

Future<bool> enableBackingTrack(String filename, {bool blendToRecording = false}) async {
  try {
    final response = await http.post(
      Uri.parse('$_baseUrl/api/backingtrack/enable'),
      headers: {'Content-Type': 'application/json'},
      body: jsonEncode({
        'filename': filename,
        'blend_to_recording': blendToRecording,
      }),
    );

    if (response.statusCode == 200) {
      final json = jsonDecode(response.body);
      return json['status'] == 'enabled';
    }
    return false;
  } catch (e) {
    log('Failed to enable backing track: $e');
    return false;
  }
}

Future<bool> disableBackingTrack() async {
  try {
    final response = await http.post(
      Uri.parse('$_baseUrl/api/backingtrack/disable'),
    );
    return response.statusCode == 200;
  } catch (e) {
    log('Failed to disable backing track: $e');
    return false;
  }
}

I2S Streaming Architecture

flowchart TB
    subgraph ESP32["ESP32"]
        FILE_R[WAV File Read]
        FILE_W[WAV File Write]
        DMA_TX[DMA TX Buffer]
        DMA_RX[DMA RX Buffer]
        I2S_PERIPH[I2S Peripheral<br/>Full Duplex]
    end
    subgraph I2S_BUS["I2S Bus"]
        BCLK[BCLK: 48kHz × 32]
        LRCK[LRCK: 48kHz]
        DATA_OUT[DOUT: Backing Track]
        DATA_IN[DIN: Recording Audio]
    end
    subgraph Daisy["Daisy Seed"]
        I2S_RX[I2S RX<br/>Backing Track]
        I2S_TX[I2S TX<br/>Recording Audio]
        AUDIO[Audio Processing]
    end
    FILE_R -->|Read| DMA_TX
    DMA_TX --> I2S_PERIPH
    I2S_PERIPH --> DATA_OUT
    DATA_OUT --> I2S_RX
    I2S_RX --> AUDIO
    AUDIO --> I2S_TX
    I2S_TX --> DATA_IN
    DATA_IN --> I2S_PERIPH
    I2S_PERIPH --> DMA_RX
    DMA_RX -->|Write| FILE_W

ESP32 I2S Configuration (Full Duplex)

// ESP32 side - I2S full duplex for backing track + recording
i2s_config_t i2s_config = {
    .mode = I2S_MODE_MASTER | I2S_MODE_TX | I2S_MODE_RX,
    .sample_rate = 48000,
    .bits_per_sample = I2S_BITS_PER_SAMPLE_16BIT,
    .channel_format = I2S_CHANNEL_FMT_RIGHT_LEFT,
    .communication_format = I2S_COMM_FORMAT_STAND_I2S,
    .intr_alloc_flags = ESP_INTR_FLAG_LEVEL1,
    .dma_buf_count = 8,
    .dma_buf_len = 256,
    .use_apll = true,  // Better clock accuracy
};

i2s_pin_config_t pin_config = {
    .bck_io_num = GPIO_NUM_26,
    .ws_io_num = GPIO_NUM_25,
    .data_out_num = GPIO_NUM_22,  // TX: Backing track to Daisy
    .data_in_num = GPIO_NUM_23,   // RX: Recording audio from Daisy
};
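These settings pin down the bus rates shown in the diagram, and the arithmetic is worth spelling out. The helpers below are plain math with no ESP-IDF dependency; dma_buf_len is taken to be in frames, as in the legacy driver:

```c
#include <stdint.h>

// Derived figures for 16-bit stereo at 48 kHz with 8 x 256-frame DMA buffers.

uint32_t bclk_hz(uint32_t fs, uint32_t bits, uint32_t channels) {
    // Bit clock: one frame per sample period, bits * channels bits per frame
    return fs * bits * channels;
}

double dma_depth_ms(uint32_t fs, uint32_t buf_len, uint32_t buf_count) {
    // Total DMA ring depth in milliseconds (upper bound on buffering delay)
    return 1000.0 * buf_len * buf_count / fs;
}
```

With the values above this gives a 48000 × 16 × 2 = 1.536 MHz BCLK (the "48kHz × 32" in the diagram) and roughly 42.7 ms of total ring depth; actual delay depends on how full the ring is kept during streaming.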

Bidirectional Streaming Task

void audio_streaming_task(void* param) {
    int16_t tx_buffer[BUFFER_SIZE];  // Backing track to Daisy
    int16_t rx_buffer[BUFFER_SIZE];  // Recording from Daisy
    size_t bytes_written, bytes_read;

    while (streaming_enabled) {
        // Read backing track from SD card
        if (backing_track_enabled && wav_file != NULL) {
            size_t samples = fread(tx_buffer, sizeof(int16_t), BUFFER_SIZE, wav_file);
            if (samples < BUFFER_SIZE) {
                // Zero-fill the tail so stale samples aren't streamed
                memset(&tx_buffer[samples], 0,
                       (BUFFER_SIZE - samples) * sizeof(int16_t));
                if (loop_enabled) {
                    // Rewind to the audio data; next frame resumes from the top
                    fseek(wav_file, wav_data_offset, SEEK_SET);
                }
            }
        } else {
            // Send silence if no backing track
            memset(tx_buffer, 0, sizeof(tx_buffer));
        }

        // Full duplex I2S transfer
        i2s_write(I2S_NUM_0, tx_buffer, sizeof(tx_buffer), &bytes_written, portMAX_DELAY);
        i2s_read(I2S_NUM_0, rx_buffer, sizeof(rx_buffer), &bytes_read, portMAX_DELAY);

        // Write recording audio to SD card
        if (recording_enabled && rec_file != NULL) {
            fwrite(rx_buffer, 1, bytes_read, rec_file);
        }
    }
}
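One detail the loop leaves implicit: the recording's RIFF and data chunk sizes can't be known while streaming, so the header written at session start has to be patched when recording stops. A sketch of that finalize step (wav_finalize is a hypothetical helper, assuming the canonical 44-byte header):

```c
#include <stdio.h>
#include <stdint.h>

// Write a 32-bit little-endian value at a given offset in the file.
static void wr32le(FILE *f, long off, uint32_t v) {
    uint8_t b[4] = {
        (uint8_t)(v & 0xFF), (uint8_t)((v >> 8) & 0xFF),
        (uint8_t)((v >> 16) & 0xFF), (uint8_t)((v >> 24) & 0xFF)
    };
    fseek(f, off, SEEK_SET);
    fwrite(b, 1, 4, f);
}

// Patch RIFF and data chunk sizes once the total file length is known.
void wav_finalize(FILE *f) {
    fseek(f, 0, SEEK_END);
    long total = ftell(f);
    wr32le(f, 4,  (uint32_t)(total - 8));   // RIFF chunk size
    wr32le(f, 40, (uint32_t)(total - 44));  // data chunk size
}
```

Skipping this step produces files many players still open (they fall back to the physical file size), but the durations reported by the track-listing API would be wrong.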

Blend Controls

The Daisy receives blend parameters via UART and mixes in real-time:

class BlendSettings {
  final int backingLevel;       // 0-255: backing track volume in output
  final int wetLevel;           // 0-255: effects signal volume in output
  final int pan;                // 0-255: L/R balance (128 = center)
  final bool blendToRecording;  // Include backing track in recording

  const BlendSettings({
    this.backingLevel = 180,       // ~70%
    this.wetLevel = 255,           // 100%
    this.pan = 128,                // Center
    this.blendToRecording = false, // Record wet only by default
  });
}

Future<bool> setBlend(BlendSettings settings) async {
  try {
    final response = await http.post(
      Uri.parse('$_baseUrl/api/backingtrack/blend'),
      headers: {'Content-Type': 'application/json'},
      body: jsonEncode({
        'backing': settings.backingLevel,
        'wet': settings.wetLevel,
        'pan': settings.pan,
        'blend_to_recording': settings.blendToRecording,
      }),
    );
    return response.statusCode == 200;
  } catch (e) {
    log('Failed to set blend: $e');
    return false;
  }
}

UART Command

// CMD_BACKING_BLEND (0x0C) - 4 data bytes
struct BackingBlendFrame {
    uint8_t start;            // 0xAA
    uint8_t len;              // 0x05
    uint8_t cmd;              // 0x0C
    uint8_t backing;          // Backing track level (output)
    uint8_t wet;              // Wet effects level (output)
    uint8_t pan;              // L/R pan
    uint8_t blend_to_rec;     // 0 = wet only, 1 = wet + backing
    uint8_t checksum;
};
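Packing this frame on the ESP32 side is straightforward. One sketch follows; the checksum rule isn't defined by the struct alone, so an 8-bit sum over len through the last data byte is assumed here purely for illustration:

```c
#include <stdint.h>
#include <stddef.h>

// Serialize a CMD_BACKING_BLEND frame into out[8].
// Checksum scheme is an assumption: 8-bit sum of bytes 1..6.
size_t pack_backing_blend(uint8_t out[8], uint8_t backing, uint8_t wet,
                          uint8_t pan, uint8_t blend_to_rec) {
    out[0] = 0xAA;           // start byte
    out[1] = 0x05;           // len: cmd + 4 data bytes
    out[2] = 0x0C;           // CMD_BACKING_BLEND
    out[3] = backing;
    out[4] = wet;
    out[5] = pan;
    out[6] = blend_to_rec;
    uint8_t sum = 0;
    for (int i = 1; i < 7; i++) sum = (uint8_t)(sum + out[i]);
    out[7] = sum;
    return 8;                // total frame length
}
```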

Recording Session Flow

stateDiagram-v2
    [*] --> Idle
    Idle --> TrackSelected: Select backing track
    TrackSelected --> Enabled: Enable track
    Enabled --> TrackSelected: Disable track
    Enabled --> Recording: Start recording
    Recording --> Enabled: Stop recording
    Recording --> Recording: Adjust blend
    note right of Recording: ESP32 streams backing via I2S TX
    note right of Recording: Daisy returns wet audio via I2S RX
    note right of Recording: ESP32 writes recording to SD

Coordinated Recording

Future<void> startRecordingWithBackingTrack({
  required String backingTrackFilename,
  required String recordingFilename,
  BlendSettings blend = const BlendSettings(),
}) async {
  // 1. Enable backing track (prepares I2S streaming)
  final enabled = await _apiService.enableBackingTrack(
    backingTrackFilename,
    blendToRecording: blend.blendToRecording,
  );
  if (!enabled) {
    throw Exception('Failed to enable backing track');
  }

  // 2. Set initial blend
  await _apiService.setBlend(blend);

  // 3. Start recording (triggers bidirectional I2S)
  final started = await _apiService.startRecording(recordingFilename);
  if (!started) {
    await _apiService.disableBackingTrack();
    throw Exception('Failed to start recording');
  }

  // Recording is now active:
  // - ESP32 sends backing track to Daisy via I2S TX
  // - Daisy sends wet effects (+ optional backing) to ESP32 via I2S RX
  // - ESP32 writes received audio to SD card
}

Future<void> stopRecordingWithBackingTrack() async {
  // 1. Stop recording (stops I2S and SD write)
  await _apiService.stopRecording();

  // 2. Disable backing track
  await _apiService.disableBackingTrack();
}

The UI

class BackingTrackPanel extends StatefulWidget {
  final ApiService apiService;

  const BackingTrackPanel({super.key, required this.apiService});

  @override
  State<BackingTrackPanel> createState() => _BackingTrackPanelState();
}

class _BackingTrackPanelState extends State<BackingTrackPanel> {
  List<BackingTrack> _tracks = [];
  BackingTrack? _selectedTrack;
  bool _isEnabled = false;
  BlendSettings _blend = const BlendSettings();

  @override
  void initState() {
    super.initState();
    _loadTracks();
  }

  Future<void> _loadTracks() async {
    final tracks = await widget.apiService.getAvailableTracks();
    setState(() => _tracks = tracks);
  }

  @override
  Widget build(BuildContext context) {
    return Card(
      child: Padding(
        padding: EdgeInsets.all(16),
        child: Column(
          crossAxisAlignment: CrossAxisAlignment.start,
          children: [
            // Header with enable toggle
            Row(
              children: [
                Icon(Icons.music_note),
                SizedBox(width: 8),
                Text('Backing Track', style: Theme.of(context).textTheme.titleMedium),
                Spacer(),
                Switch(
                  value: _isEnabled,
                  onChanged: _selectedTrack != null ? _toggleEnabled : null,
                ),
              ],
            ),

            SizedBox(height: 16),

            // Track selector
            _buildTrackDropdown(),

            if (_isEnabled) ...[
              SizedBox(height: 16),

              // Blend controls
              _buildSlider(
                label: 'Backing Level',
                value: _blend.backingLevel,
                onChanged: (v) => _updateBlend(backingLevel: v),
              ),

              _buildSlider(
                label: 'Effects Level',
                value: _blend.wetLevel,
                onChanged: (v) => _updateBlend(wetLevel: v),
              ),

              _buildSlider(
                label: 'Pan',
                value: _blend.pan,
                onChanged: (v) => _updateBlend(pan: v),
                showCenter: true,
              ),

              SizedBox(height: 8),

              // Blend to recording toggle
              SwitchListTile(
                title: Text('Include backing in recording'),
                subtitle: Text(
                  _blend.blendToRecording
                      ? 'Recording will include backing track'
                      : 'Recording wet effects only',
                ),
                value: _blend.blendToRecording,
                onChanged: (v) => _updateBlend(blendToRecording: v),
              ),
            ],
          ],
        ),
      ),
    );
  }

  Widget _buildTrackDropdown() {
    return DropdownButtonFormField<BackingTrack>(
      value: _selectedTrack,
      hint: Text('Select a track'),
      isExpanded: true,
      items: _tracks.map((track) {
        return DropdownMenuItem(
          value: track,
          child: Row(
            children: [
              Icon(_sourceIcon(track.source), size: 16),
              SizedBox(width: 8),
              Expanded(child: Text(track.displayName)),
              Text(
                _formatDuration(track.duration),
                style: TextStyle(color: Colors.grey),
              ),
            ],
          ),
        );
      }).toList(),
      onChanged: (track) {
        setState(() => _selectedTrack = track);
        if (_isEnabled && track != null) {
          widget.apiService.enableBackingTrack(track.filename);
        }
      },
    );
  }

  Future<void> _toggleEnabled(bool enabled) async {
    if (enabled && _selectedTrack != null) {
      final success = await widget.apiService.enableBackingTrack(
        _selectedTrack!.filename,
        blendToRecording: _blend.blendToRecording,
      );
      if (success) {
        setState(() => _isEnabled = true);
      }
    } else {
      await widget.apiService.disableBackingTrack();
      setState(() => _isEnabled = false);
    }
  }

  Future<void> _updateBlend({
    int? backingLevel,
    int? wetLevel,
    int? pan,
    bool? blendToRecording,
  }) async {
    final newBlend = BlendSettings(
      backingLevel: backingLevel ?? _blend.backingLevel,
      wetLevel: wetLevel ?? _blend.wetLevel,
      pan: pan ?? _blend.pan,
      blendToRecording: blendToRecording ?? _blend.blendToRecording,
    );

    setState(() => _blend = newBlend);
    await widget.apiService.setBlend(newBlend);
  }

  IconData _sourceIcon(BackingTrackSource source) {
    return switch (source) {
      BackingTrackSource.recording => Icons.mic,
      BackingTrackSource.uploaded => Icons.upload,
      BackingTrackSource.downloaded => Icons.cloud_download,
    };
  }
}

Recording Modes

Mode                 Output (Headphones)   Recording (SD Card)   Use Case
Wet Only (default)   Backing + Wet         Wet effects only      Post-production flexibility
Blend to Recording   Backing + Wet         Backing + Wet mixed   Quick demos, practice review
flowchart LR
    subgraph WetOnly["Wet Only Mode"]
        direction TB
        HP1[🎧 Backing + Wet]
        REC1[💾 Wet Only]
    end
    subgraph BlendMode["Blend to Recording Mode"]
        direction TB
        HP2[🎧 Backing + Wet]
        REC2[💾 Backing + Wet]
    end

Advantages of On-Device Architecture

Aspect        Phone Streaming         On-Device (Our Approach)
Latency       40-200ms (Bluetooth)    <1ms (I2S)
Sync          Requires compensation   Perfect sync
Quality       Compressed (BT codec)   Lossless 16-bit
Recording     Complex routing         Simple - ESP32 handles it
Battery       Drains phone            Minimal phone usage
Reliability   BT can drop             Rock solid

Key Takeaways

  1. Bidirectional I2S - ESP32 sends backing track, receives recording audio
  2. Recording on ESP32 - Daisy focuses on DSP, ESP32 handles file I/O
  3. Wet effects always recorded - Clean signal path for recording
  4. Optional backing blend - Flag to include backing track in recording
  5. Zero latency monitoring - Backing + wet mixed in real-time on Daisy
  6. 16-bit/48kHz standard - Consistent format throughout

This architecture delivers a professional practice experience - zero-latency backing tracks with flexible recording options, all handled seamlessly between ESP32 and Daisy Seed.