Heihachi is a high-performance audio analysis framework designed for electronic music, with a particular focus on neurofunk and drum & bass genres. The system implements novel approaches to audio analysis by combining neurological models of rhythm processing with advanced signal processing techniques.
The framework is built upon established neuroscientific research demonstrating that humans possess an inherent ability to synchronize motor responses with external rhythmic stimuli. This phenomenon, known as beat-based timing, involves complex interactions between auditory and motor systems in the brain.
Key neural mechanisms include:
- Beat-based Timing Networks
  - Basal ganglia-thalamocortical circuits
  - Supplementary motor area (SMA)
  - Premotor cortex (PMC)
- Temporal Processing Systems
  - Duration-based timing mechanisms
  - Beat-based timing mechanisms
  - Motor-auditory feedback loops
Research has shown that low-frequency neural oscillations originating in motor planning areas guide auditory sampling, a coupling quantified here through spectral coherence:

$$C_{xy}(f) = \frac{|S_{xy}(f)|^2}{S_{xx}(f)\,S_{yy}(f)}$$

Where:

- $C_{xy}(f)$ represents coherence at frequency $f$
- $S_{xy}(f)$ is the cross-spectral density
- $S_{xx}(f)$ and $S_{yy}(f)$ are auto-spectral densities
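As a minimal illustration (not the framework's internal implementation), magnitude-squared coherence between a motor-band envelope and an auditory envelope can be estimated with SciPy; the signal names and frequency band below are hypothetical placeholders:

```python
import numpy as np
from scipy.signal import coherence

# Hypothetical example: two envelope signals sampled at 100 Hz
fs = 100.0
t = np.arange(0, 60, 1 / fs)
motor_envelope = np.sin(2 * np.pi * 2.0 * t) + 0.5 * np.random.randn(t.size)
auditory_envelope = np.sin(2 * np.pi * 2.0 * t + 0.3) + 0.5 * np.random.randn(t.size)

# C_xy(f) = |S_xy(f)|^2 / (S_xx(f) * S_yy(f)), estimated via Welch's method
f, C_xy = coherence(motor_envelope, auditory_envelope, fs=fs, nperseg=512)

# Report mean coherence in the 1-8 Hz range often discussed for beat-based timing
band = (f >= 1.0) & (f <= 8.0)
print(f"Mean coherence in 1-8 Hz band: {C_xy[band].mean():.3f}")
```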
In addition to the coherence measures, we utilize several key mathematical formulas:
- Spectral Decomposition: For analyzing sub-bass and Reese bass components via the short-time Fourier transform:

  $$X(t,f) = \int_{-\infty}^{\infty} x(\tau)\,w(\tau - t)\,e^{-j 2\pi f \tau}\,d\tau$$
- Groove Pattern Analysis: For microtiming deviations:

  $$D_n = t_n - q_n$$

  where $t_n$ is the detected onset time and $q_n$ is the nearest quantized grid position.
- Amen Break Detection: Pattern matching score as a normalized correlation between a candidate pattern $p$ and the reference break $r$:

  $$\text{score} = \frac{\sum_{n} p(n)\,r(n)}{\sqrt{\sum_{n} p(n)^2}\,\sqrt{\sum_{n} r(n)^2}}$$
- Reese Bass Analysis: For analyzing modulation and phase relationships, the bass is modeled as a sum of components:

  $$x(t) = \sum_{k=1}^{K} A_k(t)\,\cos\big(\phi_k(t)\big)$$

  Where:

  - $A_k(t)$ is the amplitude of the k-th component
  - $\phi_k(t)$ is the phase of the k-th component
- Transition Detection: For identifying mix points and transitions:

  $$T(t) = \alpha E(t) + \beta S(t) + \gamma H(t)$$

  Where:

  - $E(t)$ is energy change
  - $S(t)$ is spectral change
  - $H(t)$ is harmonic change
  - $\alpha, \beta, \gamma$ are weighting factors
- Similarity Computation: For comparing audio segments:

  $$S(x,y) = \sum_{i} w_i \, sim_i(x,y)$$

  Where:

  - $sim_i(x,y)$ is the similarity for feature i
  - $w_i$ is the weight for feature i
- Segment Clustering: Using DBSCAN with an adaptive distance (see the clustering sketch after this list):

  $$d(p,q) = \sqrt{\sum_{i} \lambda_i \big(f_i(p) - f_i(q)\big)^2}$$

  Where:

  - $f_i(p)$ is feature i of point p
  - $\lambda_i$ is the importance weight for feature i
- Bass Design Analysis: For analyzing Reese bass modulation depth:

  $$M = \frac{A_{\max} - A_{\min}}{A_{\max} + A_{\min}}$$

  where $A_{\max}$ and $A_{\min}$ are the envelope extrema over the analysis window.
- Effect Chain Detection: For compression ratio estimation, the ratio of input level change to output level change (in dB):

  $$R = \frac{\Delta L_{\text{in}}}{\Delta L_{\text{out}}}$$
- Pattern Recognition: For rhythmic similarity using dynamic time warping:

  $$DTW(i,j) = d(x_i, y_j) + \min\big(DTW(i-1,j),\; DTW(i,j-1),\; DTW(i-1,j-1)\big)$$
- Transition Analysis: For blend detection using cross-correlation:

  $$R_{xy}(\tau) = \sum_{t} x(t)\,y(t+\tau)$$
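The similarity and clustering formulas above can be sketched in a few lines. This is an illustrative example only, not the framework's actual API; the feature matrix, weights, and `eps` value are hypothetical:

```python
import numpy as np
from scipy.spatial.distance import cdist
from sklearn.cluster import DBSCAN

# Hypothetical per-segment feature matrix (rows = segments, columns = features)
features = np.random.rand(200, 4)
weights = np.array([0.4, 0.3, 0.2, 0.1])    # lambda_i importance weights

# Adaptive distance d(p, q) = sqrt(sum_i lambda_i * (f_i(p) - f_i(q))^2),
# implemented as Euclidean distance on weight-scaled features.
scaled = features * np.sqrt(weights)
distance_matrix = cdist(scaled, scaled, metric="euclidean")

# DBSCAN over the precomputed adaptive distances
labels = DBSCAN(eps=0.15, min_samples=5, metric="precomputed").fit_predict(distance_matrix)
print("Clusters found:", len(set(labels)) - (1 if -1 in labels else 0))

# Weighted similarity S(x, y) = sum_i w_i * sim_i(x, y) for two segments
def weighted_similarity(x, y, w):
    sim_i = 1.0 - np.abs(x - y)             # per-feature similarity in [0, 1]
    return float(np.dot(w, sim_i))

print("Similarity:", weighted_similarity(features[0], features[1], weights))
```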
- Automated drum pattern recognition
- Groove quantification
- Microtiming analysis
- Syncopation detection
- Multi-band decomposition
- Harmonic tracking
- Timbral feature extraction
- Sub-bass characterization
- Sound source separation
- Transformation detection
- Energy distribution analysis
- Component relationship mapping
- Pattern matching and variation detection
- Transformation identification
- Groove characteristic extraction
- VIP/Dubplate classification
- Neurofunk-specific component separation
- Bass sound design analysis
- Effect chain detection
- Temporal structure analysis
- Multi-band similarity computation
- Transformation-aware comparison
- Groove-based alignment
- Confidence scoring
- Multi-band onset detection (see the detection sketch after these lists)
- Adaptive thresholding
- Feature-based peak classification
- Confidence scoring
- Pattern-based segmentation
- Hierarchical clustering
- Relationship analysis
- Transition detection
- Mix point identification
- Blend type classification
- Energy flow analysis
- Structure boundary detection
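The onset-detection stage listed above (onset strength with adaptive thresholding) can be approximated with librosa. This is a single-band sketch under assumed parameter values, not the framework's own multi-band implementation:

```python
import numpy as np
import librosa
from scipy.ndimage import median_filter

# Synthetic click track standing in for real audio (44.1 kHz mono, as in preprocessing)
sr = 44100
y = librosa.clicks(times=np.arange(0, 5, 0.5), sr=sr, length=5 * sr)

# Onset strength envelope
onset_env = librosa.onset.onset_strength(y=y, sr=sr)

# Adaptive threshold: local median plus a margin (delta is an assumed tuning value)
delta = 0.2
threshold = median_filter(onset_env, size=31) + delta

# Local maxima above the adaptive threshold, converted to times
frames = np.flatnonzero((onset_env > threshold) &
                        (onset_env >= np.roll(onset_env, 1)) &
                        (onset_env >= np.roll(onset_env, -1)))
onset_times = librosa.frames_to_time(frames, sr=sr)
print(f"Detected {len(onset_times)} onsets")
```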
```mermaid
graph LR
    A[Audio Stream] --> B[Preprocessing]
    B --> C[Feature Extraction]
    C --> D[Component Analysis]
    D --> E[Pattern Recognition]
    E --> F[Result Generation]

    subgraph "Feature Extraction"
        C1[Spectral] --> C2[Temporal]
        C2 --> C3[Rhythmic]
    end
```
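For orientation only, the stage ordering in the diagram above could be wired up as a simple chained pipeline; the function names below are illustrative placeholders and do not correspond to Heihachi's actual modules:

```python
from typing import Callable, Dict, List
import numpy as np

Stage = Callable[[Dict], Dict]   # each stage updates a shared analysis context

def preprocess(ctx: Dict) -> Dict:
    ctx["audio"] = ctx["audio"] / (np.max(np.abs(ctx["audio"])) + 1e-9)   # peak-normalize
    return ctx

def extract_features(ctx: Dict) -> Dict:
    ctx["rms"] = float(np.sqrt(np.mean(ctx["audio"] ** 2)))               # stand-in feature
    return ctx

def analyse_components(ctx: Dict) -> Dict:
    ctx["components"] = {"sub_bass": None, "drums": None}                 # filled by real analysis
    return ctx

def run_pipeline(audio: np.ndarray, stages: List[Stage]) -> Dict:
    ctx: Dict = {"audio": audio}
    for stage in stages:   # Preprocessing -> Feature Extraction -> Component Analysis
        ctx = stage(ctx)
    return ctx

result = run_pipeline(np.random.randn(44100), [preprocess, extract_features, analyse_components])
print(sorted(result.keys()))
```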
```mermaid
graph TD
    A[Input Signal] --> B[Sub-bass Extraction]
    A --> C[Reese Detection]
    A --> D[Drum Pattern Analysis]
    B --> E[Bass Pattern]
    C --> E
    D --> F[Rhythm Grid]
    E --> G[Component Fusion]
    F --> G
```
```mermaid
graph TD
    A[Audio Input] --> B[Preprocessing]
    B --> C[Feature Extraction]

    subgraph "Feature Extraction"
        C1[Spectral Analysis] --> D1[Sub-bass]
        C1 --> D2[Mid-range]
        C1 --> D3[High-freq]
        C2[Temporal Analysis] --> E1[Envelope]
        C2 --> E2[Transients]
        C3[Rhythmic Analysis] --> F1[Beats]
        C3 --> F2[Patterns]
    end

    D1 --> G[Feature Fusion]
    D2 --> G
    D3 --> G
    E1 --> G
    E2 --> G
    F1 --> G
    F2 --> G
```
```mermaid
graph LR
    A[Audio Stream] --> B[Peak Detection]
    B --> C[Segment Creation]
    C --> D[Pattern Analysis]
    D --> E[Clustering]

    subgraph "Pattern Analysis"
        D1[Drum Patterns]
        D2[Bass Patterns]
        D3[Effect Patterns]
    end
```
```mermaid
graph TD
    A[Input Signal] --> B[Background Separation]
    A --> C[Foreground Analysis]

    subgraph "Background"
        B1[Ambient Detection]
        B2[Noise Floor]
        B3[Reverb Tail]
    end

    subgraph "Foreground"
        C1[Transient Detection]
        C2[Note Events]
        C3[Effect Events]
    end
```
```mermaid
graph LR
    A[Component Results] --> B[Confidence Scoring]
    B --> C[Weight Assignment]
    C --> D[Fusion]
    D --> E[Final Results]

    subgraph "Confidence Scoring"
        B1[Pattern Confidence]
        B2[Feature Confidence]
        B3[Temporal Confidence]
    end
```
- Reese Bass Components:
  - Fundamental frequency tracking
  - Phase relationship analysis
  - Modulation pattern detection
  - Harmonic content analysis
- Sub Bass Characteristics (see the extraction sketch after this list):
  - Frequency range: 20-60 Hz
  - Envelope characteristics
  - Distortion analysis
  - Phase alignment
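To make the 20-60 Hz sub-bass band concrete, here is a minimal band-pass extraction and envelope sketch using SciPy; the filter order and Hilbert-based envelope are assumptions, not the framework's exact processing chain:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def extract_sub_bass(y: np.ndarray, sr: int, lo: float = 20.0, hi: float = 60.0):
    """Band-pass the 20-60 Hz region and return the band signal plus its amplitude envelope."""
    sos = butter(4, [lo, hi], btype="bandpass", fs=sr, output="sos")
    band = sosfiltfilt(sos, y)                 # zero-phase filtering preserves phase alignment
    envelope = np.abs(hilbert(band))           # analytic-signal amplitude envelope
    return band, envelope

# Hypothetical usage with a synthetic 40 Hz sub-bass tone
sr = 44100
t = np.arange(0, 2.0, 1 / sr)
y = 0.8 * np.sin(2 * np.pi * 40 * t) + 0.1 * np.random.randn(t.size)
band, env = extract_sub_bass(y, sr)
print(f"Mean sub-bass envelope level: {env.mean():.3f}")
```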
- Signal Chain Analysis:
  - Compression detection (a rough heuristic sketch follows this list)
  - Distortion identification
  - Filter resonance analysis
  - Modulation effects
- Processing Order:
  - Pre/post processing detection
  - Parallel processing identification
  - Send/return effect analysis
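One rough way to approach compression detection is to track the crest factor (peak-to-RMS ratio) over short windows: heavily compressed material shows a small, stable crest factor. This is an illustrative heuristic only, not Heihachi's actual effect-chain detector, and the frame sizes are assumed values:

```python
import numpy as np

def crest_factor_db(y: np.ndarray, frame: int = 4096, hop: int = 2048) -> np.ndarray:
    """Per-frame crest factor (peak / RMS) in dB; low, stable values suggest heavy compression."""
    values = []
    for start in range(0, len(y) - frame, hop):
        x = y[start:start + frame]
        rms = np.sqrt(np.mean(x ** 2)) + 1e-12
        peak = np.max(np.abs(x)) + 1e-12
        values.append(20.0 * np.log10(peak / rms))
    return np.array(values)

# Hypothetical usage: y stands in for a mono signal already loaded at 44.1 kHz
y = np.random.randn(44100 * 5) * np.linspace(0.2, 1.0, 44100 * 5)
cf = crest_factor_db(y)
print(f"Median crest factor: {np.median(cf):.1f} dB, spread: {cf.std():.1f} dB")
```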
- Rhythmic Transformations:
  - Time-stretching detection
  - Beat shuffling analysis
  - Groove template matching (see the microtiming sketch after this list)
  - Syncopation patterns
- Spectral Transformations:
  - Frequency shifting
  - Harmonic manipulation
  - Formant preservation
  - Resynthesis detection
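A minimal microtiming measurement, in the spirit of the groove analysis above, compares detected onsets with a beat grid from a beat tracker. This uses librosa directly and is a sketch, not the framework's groove module; the jittered click track is a synthetic stand-in for real audio:

```python
import numpy as np
import librosa

# Synthetic 120 BPM click track with slight timing jitter (placeholder for a real loop)
sr = 44100
click_times = np.arange(0, 8, 0.5) + np.random.uniform(-0.01, 0.01, 16)
y = librosa.clicks(times=click_times, sr=sr, length=8 * sr)

# Beat grid and onset times
tempo, beats = librosa.beat.beat_track(y=y, sr=sr, units="time")
onsets = librosa.onset.onset_detect(y=y, sr=sr, units="time")

# Microtiming deviation: each onset's signed offset from the nearest beat, in milliseconds
deviations = []
if len(beats):
    for onset in onsets:
        nearest = beats[np.argmin(np.abs(beats - onset))]
        deviations.append((onset - nearest) * 1000.0)

deviations = np.array(deviations)
tempo_val = float(np.atleast_1d(tempo)[0])
print(f"Tempo: {tempo_val:.1f} BPM, mean |deviation|: {np.abs(deviations).mean():.1f} ms")
```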
- Preprocessing
  - Sample rate normalization (44.1 kHz)
  - Stereo to mono conversion when needed
  - Segment-wise processing for large files
- Feature Extraction
  - Multi-threaded processing
  - GPU acceleration where available
  - Efficient memory management
  - Caching system for intermediate results
- Analysis Flow
  - Cascading analysis system
  - Component-wise processing
  - Result fusion and validation
  - Confidence scoring
- Memory Management
  - Streaming processing for large files (see the block-wise sketch after these lists)
  - Efficient cache utilization
  - GPU memory optimization
- Parallel Processing
  - Multi-threaded feature extraction
  - Batch processing capabilities
  - Distributed analysis support
- Storage Efficiency
  - Compressed result storage
  - Metadata indexing
  - Version control for analysis results
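To illustrate the streaming, segment-wise processing mentioned above, soundfile can read large files in fixed-size blocks so that only one chunk is in memory at a time. The block size and the per-block feature are assumptions for the example, and the file path is a placeholder:

```python
import numpy as np
import soundfile as sf

def stream_rms(path: str, block_seconds: float = 10.0):
    """Yield (start_time, RMS) per block without loading the whole file into memory."""
    with sf.SoundFile(path) as f:
        block_frames = int(block_seconds * f.samplerate)
        for i, block in enumerate(f.blocks(blocksize=block_frames, dtype="float32")):
            mono = block.mean(axis=1) if block.ndim > 1 else block   # stereo -> mono
            yield i * block_seconds, float(np.sqrt(np.mean(mono ** 2)))

# Hypothetical usage on a long DJ mix:
# for t, rms in stream_rms("mix.flac"):
#     print(f"{t:7.1f} s  RMS={rms:.4f}")
```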
For evaluating analysis accuracy, standard detection metrics are used:

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{Precision} = \frac{TP}{TP + FP}, \qquad \text{Recall} = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$$

Where:

- TP: True Positives (correctly identified patterns)
- TN: True Negatives (correctly rejected non-patterns)
- FP: False Positives (incorrectly identified patterns)
- FN: False Negatives (missed patterns)
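As a small worked example of these metrics for event detection, detected times can be matched to annotated times within a tolerance window (the 50 ms tolerance and the example times are hypothetical):

```python
import numpy as np

def detection_scores(reference: np.ndarray, estimated: np.ndarray, tol: float = 0.05):
    """Precision, recall and F1 for detected event times against a reference, within +/- tol seconds."""
    matched = 0
    used = np.zeros(len(estimated), dtype=bool)
    for r in reference:
        candidates = np.flatnonzero(~used & (np.abs(estimated - r) <= tol))
        if candidates.size:                  # greedy one-to-one matching
            used[candidates[0]] = True
            matched += 1
    tp, fp, fn = matched, int((~used).sum()), len(reference) - matched
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

ref = np.array([0.50, 1.00, 1.50, 2.00])     # annotated onsets (s)
est = np.array([0.51, 1.04, 1.49, 2.40])     # detected onsets (s)
print(detection_scores(ref, est))            # -> (0.75, 0.75, 0.75)
```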
- Track boundary detection
- Transition type classification
- Mix structure analysis
- Energy flow visualization
- Sound design deconstruction
- Arrangement analysis
- Effect chain detection
- Reference track comparison
- Similar track identification
- Style classification
- Groove pattern matching
- VIP/Dubplate detection
The framework includes comprehensive visualization tools for:
- Spectral analysis results
- Component relationships
- Groove patterns
- Transition points
- Similarity matrices
- Analysis confidence scores
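For example, a log-frequency spectrogram and a chroma self-similarity matrix can be rendered with librosa and matplotlib. This is a usage sketch, not the framework's built-in plotting API; the synthetic chirp stands in for a real track:

```python
import numpy as np
import librosa
import librosa.display
import matplotlib.pyplot as plt

sr = 44100
y = librosa.chirp(fmin=40, fmax=8000, sr=sr, duration=5.0)   # placeholder signal

fig, axes = plt.subplots(1, 2, figsize=(12, 4))

# Log-frequency spectrogram in dB
S_db = librosa.amplitude_to_db(np.abs(librosa.stft(y)), ref=np.max)
librosa.display.specshow(S_db, sr=sr, x_axis="time", y_axis="log", ax=axes[0])
axes[0].set_title("Spectrogram (dB)")

# Self-similarity matrix over chroma features (cosine similarity between frames)
chroma = librosa.feature.chroma_cqt(y=y, sr=sr)
norms = np.linalg.norm(chroma, axis=0)
ssm = np.dot(chroma.T, chroma) / (norms[:, None] * norms[None, :] + 1e-9)
axes[1].imshow(ssm, origin="lower", cmap="magma", aspect="auto")
axes[1].set_title("Chroma self-similarity")

plt.tight_layout()
plt.savefig("analysis_overview.png")
```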
```mermaid
graph TD
    A[Audio Input] --> B[Feature Extraction]
    B --> C[Analysis Pipeline]
    C --> D[Results Generation]

    subgraph "Feature Extraction"
        B1[Spectral] --> B2[Temporal]
        B2 --> B3[Rhythmic]
        B3 --> B4[Component]
    end

    subgraph "Analysis Pipeline"
        C1[Pattern Recognition]
        C2[Similarity Analysis]
        C3[Structure Analysis]
        C4[Effect Analysis]
    end

    subgraph "Results Generation"
        D1[Visualization]
        D2[Storage]
        D3[Export]
    end
```
```mermaid
graph LR
    A[Audio Stream] --> B[Component Separation]
    B --> C[Feature Analysis]
    C --> D[Pattern Recognition]

    subgraph "Component Separation"
        B1[Sub Bass]
        B2[Reese Bass]
        B3[Drums]
        B4[Effects]
    end

    subgraph "Feature Analysis"
        C1[Spectral Features]
        C2[Temporal Features]
        C3[Modulation Features]
    end

    subgraph "Pattern Recognition"
        D1[Rhythmic Patterns]
        D2[Effect Patterns]
        D3[Bass Patterns]
    end
```
```mermaid
graph TD
    A[Input] --> B[Preprocessing]
    B --> C[Analysis]
    C --> D[Results]

    subgraph "Preprocessing"
        B1[Normalization]
        B2[Segmentation]
        B3[Enhancement]
    end

    subgraph "Analysis"
        C1[Feature Extraction]
        C2[Pattern Analysis]
        C3[Component Analysis]
    end

    subgraph "Results"
        D1[Metrics]
        D2[Visualizations]
        D3[Reports]
    end
```
- Enhanced Neural Processing
  - Integration of deep learning models
  - Real-time processing capabilities
  - Adaptive threshold optimization
- Extended Analysis Capabilities
  - Additional genre support
  - Extended effect detection
  - Advanced pattern recognition
- Improved Visualization
  - Interactive dashboards
  - 3D visualization options
  - Real-time visualization
- Chen, J. L., Penhune, V. B., & Zatorre, R. J. (2008). Listening to musical rhythms recruits motor regions of the brain. Cerebral Cortex, 18(12), 2844-2854.
- Cannon, J. J., & Patel, A. D. (2020). How beat perception co-opts motor neurophysiology. Trends in Cognitive Sciences, 24(1), 51-64.
- Fukuie, T., et al. (2022). Neural entrainment reflects temporal predictions guiding speech comprehension. Current Biology, 32(5), 1051-1067.
- Smith, J. O. (2011). Spectral Audio Signal Processing. W3K Publishing.
- Bello, J. P., et al. (2005). A Tutorial on Onset Detection in Music Signals. IEEE Transactions on Speech and Audio Processing.
- Gouyon, F., & Dixon, S. (2005). A Review of Automatic Rhythm Description Systems. Computer Music Journal.
This project is licensed under the MIT License - see the LICENSE file for details.
If you use this framework in your research, please cite:
```bibtex
@software{heihachi2024,
  title  = {Heihachi: Neural Processing of Electronic Music},
  author = {Kundai Sachikonye},
  year   = {2024},
  url    = {https://github.com/fullscreen-triangle/heihachi}
}
```