Merged
2 changes: 0 additions & 2 deletions doc/BufMFCC.rst
Original file line number Diff line number Diff line change
@@ -12,8 +12,6 @@

Other than the 0th coefficient, MFCCs are unchanged by differences in the overall energy of the spectrum (which relates to how we perceive loudness). This means that timbres with similar spectral contours, but different volumes, will still have similar MFCC values, other than MFCC 0. To remove any indication of loudness but keep the information about timbre, we can ignore MFCC 0 by setting the parameter ``startCoeff`` to 1.

For more information visit https://learn.flucoma.org/reference/mfcc/.

For an interactive explanation of this relationship, visit https://learn.flucoma.org/reference/mfcc/explain.

:control source:
96 changes: 96 additions & 0 deletions doc/BufSineFeature.rst
@@ -0,0 +1,96 @@
:digest: Buffer-Based Sinusoidal Peak Tracking
:species: buffer-proc
:sc-categories: Libraries>FluidDecomposition, UGens>Buffer
:sc-related: Guides/FluidCorpusManipulation, Classes/SinOsc
:see-also: SineFeature, BufSines
:description: Interpolated Sinusoidal Peak Tracking on the Spectrum of Audio Stored in a Buffer.
:discussion:
This process tracks peaks in the spectrum of audio stored in a buffer, then estimates an interpolated frequency and amplitude for each peak in relation to its spectral context. It is the first part of the process used by :fluid-obj:`BufSines`.

The process will return two buffers containing time series that describe how the interpolated frequencies and magnitudes change over time in the source buffer.

:process: This is the method that calls for the analysis to be calculated on a given source buffer.
:output: Nothing, as the various destination buffers are declared in the function call.

:control source:

The |buffer| to use as the source material. The channels of multichannel buffers will be processed sequentially.

:control startFrame:

The starting point for analysis in the source (in samples).

:control numFrames:

The duration (in samples) to analyse.

:control startChan:

For multichannel sources, the starting channel to analyse.

:control numChans:

For multichannel sources, the number of channels to analyse.

:control frequency:

The buffer where the interpolated frequency of the peaks will be written.

:control magnitude:

The buffer where the interpolated magnitude of the peaks will be written.

:control numPeaks:

The number of peaks to search for and report back. It is capped at (fftSize / 2) + 1.

:control detectionThreshold:

The threshold in dB above which a magnitude peak is considered to be a sinusoidal component.

:control order:

How the reported peaks are to be ordered. By default (0), it is by frequencies (lowest first), and the alternative (1) is by magnitude (loudest first).

:control freqUnit:

The units and scale used to report the frequency of the peaks. By default (0), it is in Hz (linear), and the alternative (1) is in MIDI (logarithmic).

:control magUnit:

The units and scale used to report the magnitude of the peaks. By default (0), it is in amp (linear), and the alternative (1) is in dB (logarithmic).

:control windowSize:

The window size. As sinusoidal estimation relies on spectral frames, we need to decide what precision we give it spectrally and temporally. For more information visit https://learn.flucoma.org/learn/fourier-transform/

:control hopSize:

The window hop size. As sinusoidal estimation relies on spectral frames, we need to move the window forward. It can be any size, but low overlap will create audible artefacts. The -1 default value will default to half of windowSize (overlap of 2).

:control fftSize:

The inner FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision. The -1 default value will default to windowSize.

:control padding:

Controls the zero-padding added to either end of the source buffer or segment. Padding ensures all values are analysed. Possible values are:

:enum:

:0:
No padding - The first analysis window starts at time 0, and the samples at either end will be tapered by the STFT windowing function.

:1:
Half the window size - The first sample is centred in the analysis window ensuring that the start and end of the segment are accounted for in the analysis.

:2:
Window size minus the hop size - Mode 2 can be useful when the overlap factor (window size / hop size) is greater than 2, to ensure that the input samples at either end of the segment are covered by the same number of analysis frames as the rest of the analysed material.

:control maxFFTSize:

The maximum FFT size that can be used. Memory for it is allocated at instantiation time. This cannot be modulated.

:control maxNumPeaks:

The maximum number of peaks that can be reported. Memory for them is allocated at instantiation time. This cannot be modulated.
4 changes: 2 additions & 2 deletions doc/BufStats.rst
@@ -66,7 +66,7 @@

:control outliersCutoff:

A ratio of the inter quantile range (IQR) that defines a range from the median, outside of which data will be considered an outlier and not used to compute the statistical summary. For each frame, if a single value in any channel of that frame is considered an outlier (when compared to the rest of the values in it's channel), the whole frame (on all channels) will not be used for statistical calculations. The default of -1 bypasses this function, keeping all frames in the statistical measurements.
A ratio of the inter quantile range (IQR) that defines a range from the median, outside of which data will be considered an outlier and not used to compute the statistical summary. For each frame, if a single value in any channel of that frame is considered an outlier (when compared to the rest of the values in its channel), the whole frame (on all channels) will not be used for statistical calculations. The default of -1 bypasses this function, keeping all frames in the statistical measurements.

:control weights:

@@ -78,4 +78,4 @@

:control select:

An array of ``symbols`` indicating which statistics to return. The options are ``mean``, ``std``, ``skewness``, ``kurtosis``, ``low``, ``mid``, and ``high``. If nothing is specified, the object will return all the statistics. The statistics will always appear in their normal order, this argument just allows for a selection of them to be returned. Reordering the options in this argument will not reorder how the statistics are returned.
An array of ``symbols`` indicating which statistics to return. The options are ``mean``, ``std``, ``skewness``, ``kurtosis``, ``low``, ``mid``, and ``high``. If nothing is specified, the object will return all the statistics. The statistics will always appear in their normal order, this argument just allows for a selection of them to be returned. Reordering the options in this argument will not reorder how the statistics are returned.
2 changes: 0 additions & 2 deletions doc/MFCC.rst
@@ -15,8 +15,6 @@
.. only_in:: sc

When ``numCoeffs`` is less than ``maxNumCoeffs`` the result will be zero-padded on the right so the control stream returned by this object is always ``maxNumCoeffs`` channels.

For more information visit https://learn.flucoma.org/reference/mfcc/.

For an interactive explanation of this relationship, visit https://learn.flucoma.org/reference/mfcc/explain.

56 changes: 56 additions & 0 deletions doc/SineFeature.rst
@@ -0,0 +1,56 @@
:digest: Sinusoidal Peak Tracking
:species: descriptor
:sc-categories: Libraries>FluidDecomposition
:sc-related: Guides/FluidCorpusManipulation, Classes/SinOsc
:see-also: BufSineFeature, Sines
:description: Interpolated Sinusoidal Peak Tracking on the Spectrum.
:discussion:
This process tracks peaks in the spectrum, then estimates an interpolated frequency and amplitude for each peak in relation to its spectral context. It is the first part of the process used by :fluid-obj:`Sines`.

:process: The audio rate version of the object.
:output: An array of two control streams: [0] is the interpolated frequency of the peaks extracted in Hz or MIDI, [1] is their respective magnitudes in amp or dB. The latency between the input and the output is windowSize samples.


:control in:

The input to be processed.

:control numPeaks:

The number of peaks to search for and report back. It is capped at (fftSize / 2) + 1.

:control detectionThreshold:

The threshold in dB above which a magnitude peak is considered to be a sinusoidal component.

:control order:

How the reported peaks are to be ordered. By default (0), it is by frequencies (lowest first), and the alternative (1) is by magnitude (loudest first).

:control freqUnit:

The units and scale used to report the frequency of the peaks. By default (0), it is in Hz (linear), and the alternative (1) is in MIDI (logarithmic).

:control magUnit:

The units and scale used to report the magnitude of the peaks. By default (0), it is in amp (linear), and the alternative (1) is in dB (logarithmic).

:control windowSize:

The window size. As sinusoidal estimation relies on spectral frames, we need to decide what precision we give it spectrally and temporally. For more information visit https://learn.flucoma.org/learn/fourier-transform/

:control hopSize:

The window hop size. As sinusoidal estimation relies on spectral frames, we need to move the window forward. It can be any size, but low overlap will create audible artefacts. The -1 default value will default to half of windowSize (overlap of 2).

:control fftSize:

The inner FFT/IFFT size. It should be at least 4 samples long, at least the size of the window, and a power of 2. Making it larger allows an oversampling of the spectral precision. The -1 default value will default to windowSize.

:control maxFFTSize:

The maximum FFT size that can be used. Memory for it is allocated at instantiation time. This cannot be modulated.

:control maxNumPeaks:

The maximum number of peaks that can be reported. Memory for them is allocated at instantiation time. This cannot be modulated.
8 changes: 5 additions & 3 deletions example-code/sc/BufMFCC.scd
@@ -29,6 +29,8 @@ FluidBufMFCC.processBlocking(s,~buf,features:~mfccs,action:{"done".postln;})
(
~ds = FluidDataSet(s).fromBuffer(~mfccs);
~ds.print;
~ds2d = FluidDataSet(s);
~ds2dN = FluidDataSet(s);
)

// dimensionally reduce the 13 MFCCs into 2D space
@@ -37,9 +39,9 @@ fork{
~umap = FluidUMAP(s);
~norm = FluidNormalize(s);
s.sync;
~umap.fitTransform(~ds,~ds);
~norm.fitTransform(~ds,~ds);
~dict = ~ds.dump({
~umap.fitTransform(~ds,~ds2d);
~norm.fitTransform(~ds2d,~ds2dN);
~dict = ~ds2dN.dump({
arg dict;
defer{
FluidPlotter(bounds:Rect(0,0,800,800),dict:dict);
59 changes: 59 additions & 0 deletions example-code/sc/BufSineFeature.scd
@@ -0,0 +1,59 @@
strong::Plot the BufSineFeature curves one over the other::
code::
~oboe = Buffer.read(s,FluidFilesPath("Harker-DS-TenOboeMultiphonics-M.wav"),numFrames: 311000);

(
~freq = Buffer(s);
~mags = Buffer(s);
FluidBufSineFeature.processBlocking(s,~oboe, frequency: ~freq, magnitude: ~mags, numPeaks: 5);
w = Window("FluidWaveform Test",Rect(0,0,1000,500));
w.view.layout = VLayout(
FluidWaveform(~oboe,featuresBuffer: ~freq,standalone: false),
FluidWaveform(~oboe,featuresBuffer: ~mags,standalone: false));
w.front;
)
::

strong::A few didactic examples::
code::

//mono source of 3 sines
b = {SinOsc.ar([440,789,535],mul: [0.01,0.03,0.02]).sum}.asBuffer(1)

//listen
b.play

//make destination buffers
~freq = Buffer(s); ~mags = Buffer(s);

//process
FluidBufSineFeature.process(s,b,frequency: ~freq, magnitude: ~mags, numPeaks: 4, action: {\done.postln})

// retrieve the first 2 frames of 4 peaks
~freq.getn(0, 8, {|x|x.postln})
~mags.getn(0, 8, {|x|x.postln})

// there are only 2 peaks: two sinusoids have to be clearly separated in the FFT to be reported as distinct peaks. At the default FFT size of 1024 and the usual SC sample rate of 44100, each bin is about 43 Hz wide, so 440 and 535 are too close to each other. If we reprocess with a larger window size, we get the right values.
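
// A quick language-side check of that arithmetic (a sketch only, no server needed).
// ~binWidth and ~spacingInBins are illustrative helper names, not part of FluCoMa,
// and the 44100 Hz sample rate is the same assumption as in the comment above.
~binWidth = {|sr = 44100, fftSize = 1024| sr / fftSize};
~spacingInBins = {|f1, f2, fftSize = 1024| (f2 - f1).abs / ~binWidth.(44100, fftSize)};

~binWidth.().postln;                    // about 43.07 Hz per bin at 1024
~spacingInBins.(440, 535).postln;       // about 2.2 bins between 440 and 535
~spacingInBins.(440, 535, 2048).postln; // about 4.4 bins once the FFT size is 2048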

FluidBufSineFeature.process(s,b,frequency: ~freq, magnitude: ~mags, numPeaks: 4, windowSize: 2048, action: {\done.postln})

// first 2 frames of 4 peaks
~freq.getn(0, 8, {|x|x.postln})
~mags.getn(0, 8, {|x|x.postln})

//here is another source for fun, stereo this time, and triangle waves
b = {LFTri.ar([300, 500],mul: [-40, -45].dbamp)}.asBuffer(1)

b.play
b.query

// asking for 2 peaks - first and third harmonic of each should pop out
FluidBufSineFeature.process(s, b, frequency: ~freq, magnitude: ~mags, numPeaks: 2, magUnit: 1, action: {\done.postln})

// retrieving: the stereo values are interleaved in each frame, 2 for left then 2 for right.
~freq.getn(0, 8, {|x|x.postln})
~mags.getn(0, 8, {|x|x.postln})
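
// A sketch of how such a flat list can be regrouped language-side (no server needed).
// ~regroup is an illustrative helper, not part of FluCoMa. It assumes the layout described
// above: for each frame, numPeaks values for the left channel, then numPeaks for the right.
~regroup = {|flat, numPeaks = 2, numChans = 2| flat.clump(numPeaks * numChans).collect(_.clump(numPeaks))};

// placeholder values 0..7 stand in for the 8 retrieved numbers;
// the result is nested as frame -> channel -> peaks
~regroup.((0..7)).postln; // frame 0: [[0, 1], [2, 3]], frame 1: [[4, 5], [6, 7]]

// the same function could be used in the getn callbacks above in place of {|x|x.postln}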

::


13 changes: 8 additions & 5 deletions example-code/sc/MDS.scd
@@ -6,9 +6,11 @@ code::
~features = Buffer(s);
FluidBufMFCC.processBlocking(s,~src,features:~features,startCoeff:1);
~ds = FluidDataSet(s).fromBuffer(~features);
FluidMDS(s).fitTransform(~ds,~ds);
FluidNormalize(s).fitTransform(~ds,~ds);
~ds.dump({
~ds2d = FluidDataSet(s);
~ds2dN = FluidDataSet(s);
FluidMDS(s).fitTransform(~ds,~ds2d);
FluidNormalize(s).fitTransform(~ds2d,~ds2dN);
~ds2dN.dump({
arg dict;
{FluidPlotter(bounds:Rect(0,0,800,800),dict:dict)}.defer;
});
@@ -32,10 +34,11 @@ fork({
["Manhattan","Euclidean","Squared Euclidean","Minkowski Max","Minkowski Min","Symmetric Kullback Leibler"].do{
arg name, dist_measure;
var ds_transformed = FluidDataSet(s);
var ds_trans_normed = FluidDataSet(s);
"computing distance measure: % %".format(dist_measure, name).postln;
FluidMDS(s,2,dist_measure).fitTransform(~ds,ds_transformed);
FluidNormalize(s).fitTransform(ds_transformed,ds_transformed);
ds_transformed.dump({
FluidNormalize(s).fitTransform(ds_transformed,ds_trans_normed);
ds_trans_normed.dump({
arg dict;
defer{
var x = (dist_measure * 400) % win.bounds.width;
9 changes: 6 additions & 3 deletions example-code/sc/MFCC.scd
@@ -108,6 +108,9 @@ CODE::

(
~ds = FluidDataSet(s);
~ds2d = FluidDataSet(s);
~ds2dN = FluidDataSet(s);

~bass = Buffer.read(s,FluidFilesPath("Tremblay-AaS-AcBassGuit-Melo-M.wav"),numFrames:44100 * 9); // only load the first 9 seconds so they're about the same length
~box = Buffer.read(s,FluidFilesPath("Green-Box639.wav"));
~umap = FluidUMAP(s); // algorithm for dimensionality reduction
@@ -138,9 +141,9 @@ CODE::
~ds.print;

(
~umap.fitTransform(~ds,~ds); // dimensionally reduce the 13 MFCCs into 2D space
~norm.fitTransform(~ds,~ds);
~ds.dump({
~umap.fitTransform(~ds,~ds2d); // dimensionally reduce the 13 MFCCs into 2D space
~norm.fitTransform(~ds2d,~ds2dN);
~ds2dN.dump({
arg dict;
var labels = Dictionary.newFrom(["data",Dictionary.new]);
dict["data"].keys.do{
2 changes: 1 addition & 1 deletion example-code/sc/Normalize.scd
@@ -20,7 +20,7 @@ fork({
~ds.print;
)

// normalize it
// normalize it in place, overwriting the dataset
(
~normalizer = FluidNormalize(s).fitTransform(~ds,~ds);
~ds.print;
13 changes: 8 additions & 5 deletions example-code/sc/PCA.scd
@@ -11,17 +11,20 @@ s.boot;
FluidBufMFCC.processBlocking(s,~src,startCoeff:1,features:~mfcc_feature);
~ds = FluidDataSet(s).fromBuffer(~mfcc_feature);
~ds.print;
~ds2stan = FluidDataSet(s);
~ds2st2d = FluidDataSet(s);
~ds2st_2d_n = FluidDataSet(s);
)

// first standardize our DataSet, so that the MFCC dimensions are in similar ranges
// then apply the PCA on the standardized data, writing the result to a new DataSet,
// reducing the number of dimensions to the default of 2
// lastly normalize it so it can be plotted in a normalized space
(
~stand = FluidStandardize(s).fitTransform(~ds,~ds);
~pca = FluidPCA(s).fitTransform(~ds,~ds);
~norm = FluidNormalize(s).fitTransform(~ds,~ds);
~ds.dump({
~stand = FluidStandardize(s).fitTransform(~ds,~ds2stan);
~pca = FluidPCA(s).fitTransform(~ds2stan,~ds2st2d);
~norm = FluidNormalize(s).fitTransform(~ds2st2d,~ds2st_2d_n);
~ds2st_2d_n.dump({
arg dict;
defer{FluidPlotter(dict:dict)};
});
@@ -72,7 +75,7 @@ FluidBufSpectralShape.processBlocking(s,~src,features:~analysis);
~ds = FluidDataSet(s).fromBuffer(~analysis);
~ds.print;

~stand = FluidStandardize(s).fitTransform(~ds,~ds);
~stand = FluidStandardize(s).fitTransform(~ds,~ds); // note: standardize in place

~ds_pca = FluidDataSet(s);
~pca = FluidPCA(s).fitTransform(~ds,~ds_pca);
34 changes: 34 additions & 0 deletions example-code/sc/SineFeature.scd
@@ -0,0 +1,34 @@

CODE::

// a didactic example: a cluster of sinusoids, sorted by magnitudes
{var source = SinOsc.ar([440,789],mul: [0.05,0.1]).sum; FluidSineFeature.kr(source,numPeaks: 3, order: 1).poll; source.dup}.play

// or in MIDI and dB
{var source = SinOsc.ar([69,79].midicps,mul: [-40,-35].dbamp).sum; FluidSineFeature.kr(source,numPeaks: 3, order: 1, freqUnit: 1, magUnit: 1).poll; source.dup}.play
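
// The two scales offered by freqUnit and magUnit can be compared language-side with
// sclang's built-in converters (a sketch only, no server needed). The values are approximate.
69.midicps.postln;   // about 440 (MIDI note 69 in Hz)
440.cpsmidi.postln;  // about 69 (440 Hz as a MIDI note)
(-40).dbamp.postln;  // about 0.01 (-40 dB as a linear amplitude)
0.1.ampdb.postln;    // about -20 (an amplitude of 0.1 in dB)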

// a more exciting example: resynthesizing audio input
(
var buf = Buffer.read(s,FluidFilesPath("Harker-DS-TenOboeMultiphonics-M.wav"));

x = {
arg nbPeaks = 10, t_hold = 1;
var source = PlayBuf.ar(1, buf, loop: 1);
var analysis = Latch.kr(FluidSineFeature.kr(source,numPeaks: nbPeaks, maxNumPeaks: 50),t_hold);
var resynth = SinOsc.ar(analysis[Array.iota(50)], mul: analysis[Array.iota(50) + 50]).sum;
[source, resynth];
}.play
)
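
// A note on the indexing used above (a sketch, language-side only): the example reads the
// output of FluidSineFeature.kr as maxNumPeaks frequencies followed by maxNumPeaks magnitudes.
// ~freqIdx and ~magIdx are illustrative names for those two index ranges.
~freqIdx = Array.iota(50);     // indices 0..49, used for the frequencies
~magIdx = Array.iota(50) + 50; // indices 50..99, used for the magnitudes
[~freqIdx.first, ~freqIdx.last, ~magIdx.first, ~magIdx.last].postln; // [ 0, 49, 50, 99 ]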

// play with the number of peaks to track
x.set(\nbPeaks, 5)
x.set(\nbPeaks, 1)
x.set(\nbPeaks, 50)

// trigger the holder to hear the reconstruction on the right hand side
x.set(\t_hold, 1)

// or make it automatic
r = Routine{x.set(\t_hold, 1);0.01.wait;}.loop.play
r.stop
::