# Velour Phase 5: Audio Support (Music & Audiobooks)

Phase 5 extends Velour from a video library to a comprehensive media library by adding support for music and audiobooks. It builds on the extensible MediaFile architecture established in Phase 1.

## Technology Stack

### Audio Processing Components
- **FFmpeg** - Audio transcoding and metadata extraction (extends existing video processing)
- **Ruby Audio Gems** - ID3 tag parsing, waveform generation
- **Active Storage** - Album art and waveform visualization storage
- **MediaInfo** - Comprehensive audio metadata extraction

## Database Schema Extensions

### Audio Model (inherits from MediaFile)
```ruby
class Audio < MediaFile
  # Audio-specific associations
  has_many :audio_assets, dependent: :destroy # album art, waveforms

  # Audio-specific metadata store
  store :audio_metadata, accessors: [:sample_rate, :channels, :artist, :album, :track_number, :genre, :year]

  # Audio-specific methods

  # Buckets the bit rate into a display label (assumes bit_rate is in kbps).
  # Beginless ranges avoid gaps for non-integer values such as 128.5.
  def quality_label
    return "Unknown" unless bit_rate

    case bit_rate
    when ..128 then "128kbps"
    when ..192 then "192kbps"
    when ..256 then "256kbps"
    when ..320 then "320kbps"
    else "Lossless" # anything above 320 kbps is treated as lossless
    end
  end

  # Human-readable name for the container/codec stored in `format`.
  def format_type
    return "Unknown" unless format

    case format.downcase
    when "mp3" then "MP3"
    when "flac" then "FLAC"
    when "wav" then "WAV"
    when "aac", "m4a" then "AAC"
    when "ogg" then "OGG Vorbis"
    else format.upcase
    end
  end
end

class AudioAsset < ApplicationRecord
  belongs_to :audio

  # Positional enum syntax (the keyword-argument form was removed in Rails 8.0)
  enum :asset_type, { album_art: 0, waveform: 1, lyrics: 2 }

  # Uses Active Storage for file storage
  has_one_attached :file
end
```

### Extended Work Model
```ruby
class Work < ApplicationRecord
  # Existing video associations
  has_many :videos, dependent: :destroy
  has_many :external_ids, dependent: :destroy

  # New audio associations
  has_many :audios, dependent: :destroy

  # Enhanced primary media selection: the most recently created media file.
  # Compares only the newest video and newest audio instead of loading every record.
  def primary_media
    [primary_video, primary_audio].compact.max_by(&:created_at)
  end

  def primary_video
    videos.order(created_at: :desc).first
  end

  def primary_audio
    audios.order(created_at: :desc).first
  end

  # Content type detection
  def video_content?
    videos.exists?
  end

  def audio_content?
    audios.exists?
  end

  def mixed_content?
    video_content? && audio_content?
  end
end
```

## Audio Processing Pipeline

### AudioProcessorJob
```ruby
class AudioProcessorJob < ApplicationJob
  queue_as :processing

  def perform(audio_id)
    audio = Audio.find(audio_id)

    # Extract audio metadata
    AudioMetadataExtractor.new(audio).extract!

    # Extract album art if embedded
    AlbumArtExtractor.new(audio).extract!

    # Generate waveform visualization
    WaveformGenerator.new(audio).generate!

    # Check web compatibility and transcode if needed
    unless AudioTranscoder.new(audio).web_compatible?
      AudioTranscoderJob.perform_later(audio_id)
    end

    audio.update!(processed: true)
  rescue => e
    # `audio` is nil if Audio.find itself raised, so guard before recording the error.
    # Non-bang update so a validation failure here cannot mask the original exception.
    audio&.update(processing_error: e.message)
    raise
  end
end
```

### AudioTranscoderJob
```ruby
class AudioTranscoderJob < ApplicationJob
  queue_as :transcoding

  def perform(audio_id)
    audio = Audio.find(audio_id)
    AudioTranscoder.new(audio).transcode_for_web!
  end
end
```

## File Discovery Extensions

### Enhanced FileScannerService
```ruby
class FileScannerService
  AUDIO_EXTENSIONS = %w[mp3 flac wav aac m4a ogg wma].freeze

  def scan_directory(storage_location)
    # Existing video scanning logic
    scan_videos(storage_location)

    # New audio scanning logic
    scan_audio(storage_location)
  end

  private

  # NOTE: these patterns are lowercase, so on case-sensitive filesystems
  # files such as `TRACK.MP3` are not matched.
  def scan_audio(storage_location)
    AUDIO_EXTENSIONS.each do |ext|
      Dir.glob(File.join(storage_location.path, "**", "*.#{ext}")).each do |file_path|
        process_audio_file(file_path, storage_location)
      end
    end
  end

  def process_audio_file(file_path, storage_location)
    filename = File.basename(file_path)
    return if Audio.joins(:storage_location).exists?(filename: filename, storage_locations: { id: storage_location.id })

    # Create Work based on filename parsing (album/track structure)
    work = find_or_create_audio_work(filename, file_path)

    # Create Audio record and keep a reference so the job can be enqueued with its id
    audio = Audio.create!(
      work: work,
      storage_location: storage_location,
      filename: filename,
      xxhash64: calculate_xxhash64(file_path)
    )

    AudioProcessorJob.perform_later(audio.id)
  end
end
```

## User Interface Extensions

### Audio Player Integration
- **Video.js Audio Plugin** - Extend existing video player for audio
- **Waveform Visualization** - Interactive seeking with waveform display
- **Chapter Support** - Essential for audiobooks
- **Speed Control** - Variable playback speed for audiobooks

### Library Organization
- **Album View** - Grid layout with album art
- **Artist Pages** - Discography and album organization
- **Audiobook Progress** - Chapter tracking and resume functionality
- **Mixed Media Collections** - Works containing both video and audio content

### Audio-Specific Features
- **Playlist Creation** - Custom playlists for music
- **Shuffle Play** - Random playback for albums/artists
- **Gapless Playback** - Seamless track transitions
- **Lyrics Display** - Embedded or external lyrics support

## Implementation Timeline

### Phase 5A: Audio Foundation (Weeks 1-2)
- Create Audio model inheriting from MediaFile
- Implement AudioProcessorJob and audio metadata extraction
- Extend FileScannerService for audio formats
- Basic audio streaming endpoint

### Phase 5B: Audio Processing (Week 3)
- Album art extraction and storage
- Waveform generation
- Audio transcoding for web compatibility
- Quality optimization and format conversion

### Phase 5C: User Interface (Week 4)
- Audio player component (extends Video.js)
- Album and artist browsing interfaces
- Audio library management views
- Search and filtering for audio content

### Phase 5D: Advanced Features (Week 5)
- Chapter support for audiobooks
- Playlist creation and management
- Mixed media Works (video + audio)
- Audio-specific user preferences

## Migration Strategy

### Database Migrations
```ruby
# Extend videos table for STI (already done in Phase 1)
# Add audio-specific columns if needed
class AddAudioFeatures < ActiveRecord::Migration[8.1]
  def change
    create_table :audio_assets do |t|
      # NOTE: foreign_key: true targets an `audios` table. With STI there is no
      # such table, so point it at the shared MediaFile table via
      # foreign_key: { to_table: ... } instead.
      t.references :audio, null: false, foreign_key: true
      t.integer :asset_type, null: false # integer column backing the AudioAsset enum
      t.timestamps
    end

    # Audio-specific indexes (only if a dedicated audios table with these columns exists;
    # artist and album otherwise live in the audio_metadata store)
    if table_exists?(:audios)
      add_index :audios, :artist if column_exists?(:audios, :artist)
      add_index :audios, :album if column_exists?(:audios, :album)
    end
  end
end
```

### Backward Compatibility
- All existing video functionality remains unchanged
- Video URLs and routes continue to work identically
- Database migrations are additive (the STI `type` column from Phase 1 and the new `audio_assets` table)
- No breaking changes to existing API

## Configuration

### Environment Variables
```bash
# Audio Processing (extends existing video processing)
FFMPEG_PATH=/usr/bin/ffmpeg
AUDIO_TRANSCODE_QUALITY=high
MAX_AUDIO_TRANSCODE_SIZE_GB=10

# Audio Features
ENABLE_AUDIO_SCANNING=true
ENABLE_WAVEFORM_GENERATION=true
AUDIO_THUMBNAIL_SIZE=300x300
```

### Storage Considerations
- Album art storage in Active Storage
- Waveform images (generated per track)
- Potential audio transcoding cache
- Audio-specific metadata storage

## Testing Strategy

### Model Tests
- Audio model validation and inheritance
- Work model mixed content handling
- Audio metadata extraction accuracy

### Integration Tests
- Audio processing pipeline end-to-end
- Audio streaming with seeking support
- File scanner audio discovery

### System Tests
- Audio player functionality
- Album/artist interface navigation
- Mixed media library browsing

## Performance Considerations

### Audio Processing
- Parallel audio metadata extraction
- Efficient album art extraction
- Optimized waveform generation
- Background transcoding queue management

### Storage Optimization
- Compressed waveform storage
- Album art caching and optimization
- Efficient audio streaming with range requests

### User Experience
- Fast audio library browsing
- Quick album art loading
- Responsive audio player controls

## Future Extensions

### Phase 5+ Possibilities
- **Podcast Support** - RSS feed integration and episode management
- **Radio Streaming** - Internet radio station integration
- **Music Discovery** - Similar artist recommendations
- **Audio Bookmarks** - Detailed note-taking for audiobooks
- **Social Features** - Sharing playlists and recommendations

This phase transforms Velour from a video library into a comprehensive personal media platform while maintaining the simplicity and robustness of the existing architecture.
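
### Appendix: Audio Discovery Sketch

The audio discovery step in FileScannerService walks the storage location once per extension and matches lowercase extensions only. The following is a minimal sketch of a single-pass, case-insensitive alternative that runs without Rails. `AudioFileFinder` and its `find` method are hypothetical names introduced for illustration and are not part of Velour; the extension list is copied from `AUDIO_EXTENSIONS` above.

```ruby
# Hypothetical helper (not part of Velour): lists audio files under a root
# directory in one pass, comparing extensions case-insensitively.
module AudioFileFinder
  AUDIO_EXTENSIONS = %w[mp3 flac wav aac m4a ogg wma].freeze

  # Returns the sorted absolute-or-relative paths (joined onto `root`) of all
  # regular files whose extension is in AUDIO_EXTENSIONS.
  def self.find(root)
    # `base:` keeps glob metacharacters in `root` from being interpreted as a pattern.
    Dir.glob("**/*", base: root)
       .map { |relative| File.join(root, relative) }
       .select { |path| File.file?(path) && audio_extension?(path) }
       .sort
  end

  # True when the file extension, lowercased and without the dot, is a known audio type.
  def self.audio_extension?(path)
    AUDIO_EXTENSIONS.include?(File.extname(path).downcase.delete_prefix("."))
  end
end
```

Under these assumptions, `scan_audio` could iterate over `AudioFileFinder.find(storage_location.path)` and pass each path to `process_audio_file`. Filtering in Ruby rather than in the glob pattern trades a slightly larger directory listing for predictable behavior across filesystems.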