Generate synthetic mixtures from pure spectra

In a previous post on Multivariate Curve Resolution (MCR), we looked at techniques to extract the contribution of individual components from the spectra of mixtures. This is a companion post, showing how to generate a synthetic mixture of Raman spectra of minerals.

The pure spectra are taken from the RRUFF Project database, which contains Raman spectra of a large number of minerals.

The script presented here takes three compounds (directly from the files saved from the database) and mixes their Raman spectra in random proportions, with the only constraint being that the individual concentrations add up to 1.

Mixtures from pure spectra

The script assumes you have the spectra from three compounds, saved as txt files, but it can be generalised to an arbitrary number of compounds.

For this example, we chose three minerals:

The script generates 300 random mixtures and adds the pure spectra of the three compounds at the end (i.e. 303 spectra in total). The data are then saved as a CSV file.

We start with the imports, and then load the files.
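As a minimal sketch of the loading step, the snippet below reads a two-column text file of wavenumber and intensity. It assumes the file is comma-separated with metadata lines starting with `##`; check this against your own downloads. The helper name `load_spectrum` and the in-memory example file are hypothetical, used here only so the snippet runs without any download.

```python
import io

import numpy as np


def load_spectrum(source):
    """Read a two-column spectrum (wavenumber, intensity).

    Assumption: comma-separated values, with metadata lines starting with '##'.
    `source` can be a file path or an open file-like object.
    """
    data = np.loadtxt(source, delimiter=",", comments="##")
    return data[:, 0], data[:, 1]


# Hypothetical in-memory file standing in for a saved txt spectrum
example = io.StringIO(
    "##NAMES=Example mineral\n"
    "100.0, 5.0\n"
    "101.0, 7.5\n"
    "102.0, 6.0\n"
    "##END=\n"
)

wavenumber, intensity = load_spectrum(example)
print(wavenumber, intensity)
```

With real data you would pass the path of each saved txt file to the same function.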

Next, we create lists of three elements, containing spectra and wavenumber arrays for the three minerals.
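One way to organise the data is sketched here with two parallel lists, one for the wavenumber axes and one for the intensities. The three Gaussian peaks are hypothetical stand-ins for the mineral spectra (not real data), deliberately placed on grids with different ranges and step sizes.

```python
import numpy as np

# Hypothetical wavenumber axes with different ranges and resolutions
wavenumbers = [
    np.arange(100.0, 1200.0, 1.0),
    np.arange(150.0, 1400.0, 0.5),
    np.arange(80.0, 1000.0, 2.0),
]

# Stand-in spectra: one Gaussian peak per "mineral" (illustration only)
peak_centres = [300.0, 500.0, 700.0]
spectra = [
    np.exp(-0.5 * ((w - centre) / 10.0) ** 2)
    for w, centre in zip(wavenumbers, peak_centres)
]

# Inspect range and number of points of each spectrum
for w, s in zip(wavenumbers, spectra):
    print(f"{w.min():.1f} to {w.max():.1f} cm-1, {s.size} points")
```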

You will notice that, in general, the spectra don’t span the same wavenumber range, and don’t have the same resolution. To generate synthetic spectra from mixtures of these pure spectra, we need to make sure that the spectra are interpolated to the same wavenumber range.
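A possible way to do this is to restrict all spectra to the range they have in common and resample them with linear interpolation via `np.interp`. The function name `to_common_grid`, the 1 cm-1 step and the stand-in Gaussian spectra are assumptions made for this sketch; the wavenumber arrays are assumed to be sorted in ascending order.

```python
import numpy as np


def to_common_grid(wavenumbers, spectra, step=1.0):
    """Linearly interpolate all spectra onto the wavenumber range they share.

    Assumption: each wavenumber array is sorted in ascending order.
    """
    lo = max(w.min() for w in wavenumbers)  # start of the overlap
    hi = min(w.max() for w in wavenumbers)  # end of the overlap
    n_points = int(np.floor((hi - lo) / step)) + 1
    common = lo + step * np.arange(n_points)
    resampled = np.vstack(
        [np.interp(common, w, s) for w, s in zip(wavenumbers, spectra)]
    )
    return common, resampled


# Hypothetical stand-in spectra on three different grids
wavenumbers = [
    np.arange(100.0, 1200.0, 1.0),
    np.arange(150.0, 1400.0, 0.5),
    np.arange(80.0, 1000.0, 2.0),
]
spectra = [
    np.exp(-0.5 * ((w - c) / 10.0) ** 2)
    for w, c in zip(wavenumbers, [300.0, 500.0, 700.0])
]

common, pure = to_common_grid(wavenumbers, spectra, step=1.0)
print(common[0], common[-1], pure.shape)
```

Restricting to the overlap avoids extrapolating any spectrum beyond the range where it was measured.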

Now we are finally ready to calculate n = 300 random synthetic mixtures. For each sample, we set the first concentration via a random number in the range from 0 to 1. Then we set the second concentration by extracting another random number in the range from 0 up to the previous value. The concentration of the third component will then be fixed by imposing that the sum of the three concentrations is unity.
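The mixing step can be sketched as follows. Note one deliberate deviation, which is an assumption of this sketch and not necessarily what the original script does: the second concentration is drawn between 0 and the remainder 1 - c1, so that the third concentration can never become negative. The stand-in pure spectra, the variable names and the fixed random seed are also hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, only for reproducibility
n = 300

# Hypothetical pure spectra already on a common wavenumber grid
common = np.arange(150.0, 1000.0, 1.0)
pure = np.vstack(
    [np.exp(-0.5 * ((common - c) / 10.0) ** 2) for c in (300.0, 500.0, 700.0)]
)

# Random concentrations constrained to sum to 1
c1 = rng.uniform(0.0, 1.0, size=n)
c2 = rng.uniform(0.0, 1.0 - c1)  # assumption: upper bound is what c1 leaves over
c3 = 1.0 - c1 - c2
concentrations = np.column_stack([c1, c2, c3])

# Each mixture is a concentration-weighted sum of the pure spectra
mixtures = concentrations @ pure

# Append the three pure spectra at the end: 303 spectra in total
data = np.vstack([mixtures, pure])
print(data.shape)
```

Writing the mixing as a matrix product keeps the link to MCR explicit: the data matrix is the product of a concentration matrix and a matrix of pure spectra.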

Finally, we convert the array into a data frame and export a CSV file.
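A minimal sketch of the export with pandas is shown here. The random array stands in for the 303 spectra, and the output filename `synthetic_mixtures.csv` and the choice of wavenumbers as column labels are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the 303 spectra on a common wavenumber grid
common = np.arange(150.0, 1000.0, 1.0)
rng = np.random.default_rng(0)
data = rng.random((303, common.size))

# One row per spectrum, one column per wavenumber
df = pd.DataFrame(data, columns=np.round(common, 1))

# Hypothetical output filename; the row index is not written
df.to_csv("synthetic_mixtures.csv", index=False)
```

The file can be read back with `pd.read_csv("synthetic_mixtures.csv")`, where the column headers give the wavenumber axis.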

An example of data generated with this script is available at our GitHub repository. And, as mentioned, head to our previous post on Multivariate Curve Resolution (MCR) to learn how to invert this process, i.e. how to extract the spectra of the pure components from the spectra of the mixtures.

As always, thanks for reading and until next time.

Daniel

 

Feature image from msvr on Pexels.