If we can determine the tempo of the input sample in beats per minute, we can then narrow our set of possible matches to only songs with a similar tempo. Our method for determining the beats per minute of an input sample follows.

3D spectrogram of the first five seconds of Britney Spears - Womanizer

Fig. 1. Averaging the power spectral density (essentially volume) over a range of frequencies reduces the spectrogram to a single curve over time that highlights a song’s potential beats.

Finding peaks of selected frequencies to determine beats per minute

Fig. 2. In this plot, the blue line shows the average over the range of frequencies from 0.35 radians to 0.40 radians. This range was chosen because it is outside the range of most human voices but still within the range of many percussive instruments. A suitable range of frequencies should exclude voices because most vocal spikes do not occur on the beat, whereas the frequency range of a bass drum or a snare drum yields peaks at either the beats per minute (BPM) or a multiple of the BPM.
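The band averaging described above can be sketched as follows. This is a minimal illustration, not the original implementation: it assumes the band limits are normalized angular frequencies in radians per sample, and the function name, frame length, hop size, and Hann window are all choices made for this example.

```python
import numpy as np

def band_power_envelope(samples, frame_len=1024, hop=512, band=(0.35, 0.40)):
    """Average power per frame inside a frequency band.

    Assumption: `band` is given in radians per sample (0..pi).
    """
    samples = np.asarray(samples, dtype=float)
    window = np.hanning(frame_len)
    # Angular frequency (radians per sample) of each real-FFT bin.
    omega = 2 * np.pi * np.fft.rfftfreq(frame_len)
    in_band = (omega >= band[0]) & (omega <= band[1])
    if not in_band.any():
        raise ValueError("band contains no FFT bins; increase frame_len")
    # Number of full frames that fit in the signal (zero if too short).
    n_frames = max(1 + (len(samples) - frame_len) // hop, 0)
    envelope = np.empty(n_frames)
    for i in range(n_frames):
        frame = samples[i * hop : i * hop + frame_len] * window
        power = np.abs(np.fft.rfft(frame)) ** 2
        # One value per frame: mean power over the selected bins.
        envelope[i] = power[in_band].mean()
    return envelope
```

Each output value corresponds to one frame, so frame index `i` maps to time `i * hop / sample_rate` seconds, where `sample_rate` is the sampling rate of the recording.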

Using a custom-written peak detection algorithm, the peaks of the sample are found. Only the repetitive peaks are desired, so a cutoff method of data selection is used here. The red line, a moving average of the three peaks before and after the current position, serves as the cutoff. A moving average is used because different parts of the song (introduction, chorus, verse, etc.) produce different repetitive beats. In the future, instead of a cutoff, a selection range should be used so that peaks that are too low or too high are both eliminated.
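One way to read the cutoff step is sketched below. The original peak detector is not shown, so this is a guess at its general shape: local maxima are found first, and each peak is kept only if it reaches the mean height of the peaks in a window of three on either side. Including the peak itself in that mean, and the names `local_peaks`, `select_repetitive_peaks`, `half_width`, and `ratio`, are assumptions of this example.

```python
def local_peaks(envelope):
    """Indices that rise above the left neighbour and do not fall below the right."""
    return [i for i in range(1, len(envelope) - 1)
            if envelope[i - 1] < envelope[i] >= envelope[i + 1]]

def select_repetitive_peaks(envelope, half_width=3, ratio=1.0):
    """Keep peaks at or above a moving average of neighbouring peak heights."""
    peaks = local_peaks(envelope)
    kept = []
    for k, idx in enumerate(peaks):
        # Window of up to `half_width` peaks on each side (clipped at the ends).
        lo = max(0, k - half_width)
        hi = min(len(peaks), k + half_width + 1)
        heights = [envelope[p] for p in peaks[lo:hi]]
        cutoff = ratio * sum(heights) / len(heights)
        if envelope[idx] >= cutoff:
            kept.append(idx)
    return kept
```

Because the cutoff follows the local peak heights, a quiet verse and a loud chorus are each judged against their own neighbourhood rather than one global threshold.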

Using the accepted data points, the distance between each pair of consecutive points is found, and once again the outliers are discarded.
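The interval step can be sketched as below. The text does not say how outliers are identified, so the rule here (drop intervals more than 20% away from the median interval) is an assumption, as are the function name `bpm_from_peaks` and the `tolerance` parameter. Peak times are taken to be in seconds.

```python
from statistics import mean, median

def bpm_from_peaks(peak_times, tolerance=0.2):
    """Estimate BPM from peak times (seconds) after discarding outlier gaps."""
    if len(peak_times) < 2:
        raise ValueError("need at least two peaks")
    # Distance between each pair of consecutive accepted peaks.
    intervals = [b - a for a, b in zip(peak_times, peak_times[1:])]
    typical = median(intervals)
    # Assumed outlier rule: keep gaps within `tolerance` of the median gap.
    inliers = [d for d in intervals if abs(d - typical) <= tolerance * typical]
    # Fall back to the median if every gap was rejected.
    beat_period = mean(inliers) if inliers else typical
    return 60.0 / beat_period
```

A missed beat shows up as one interval roughly twice as long as the rest, which this rule drops before averaging.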

Determination of beats per minute using identified peaks

Fig. 3. Beats per minute is a very useful tool for identifying a song, as it is a simple way to categorize groups of songs. Once the BPM of a song is calculated, all songs whose BPM is more than approximately 20% above or below it can be eliminated from comparison.
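The elimination step amounts to a simple range filter. The sketch below assumes a hypothetical mapping from song title to stored BPM; the function name `candidates_by_bpm` and the placeholder titles in the usage example are illustrative only.

```python
def candidates_by_bpm(sample_bpm, song_bpms, tolerance=0.20):
    """Return titles whose stored BPM lies within +/- tolerance of sample_bpm."""
    lo = sample_bpm * (1 - tolerance)
    hi = sample_bpm * (1 + tolerance)
    return [title for title, bpm in song_bpms.items() if lo <= bpm <= hi]

# Usage with made-up entries: a 120 BPM sample keeps songs from 96 to 144 BPM.
library = {"song_a": 100, "song_b": 139, "song_c": 170, "song_d": 118}
remaining = candidates_by_bpm(120, library)
```

Since the peak analysis may return a multiple of the true BPM, a fuller version might also test the sample BPM halved or doubled against the same range.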

To match a sample with a song efficiently, a database of song data is required. Starting with the highest-level classification method, such as BPM, progressively lower-level comparisons may be performed, reducing the processing required for song matching.