NIRPY Research

Include MSC in a custom pipeline using scikit-learn

Regression, Partial Least Squares Regression, Regression pipelines 12/07/2024 Daniel Pelliccia
Learn how to include Multiplicative Scatter Correction (MSC) in a custom pipeline using scikit-learn.

Generate synthetic mixtures from pure spectra

Regression, Data Operations and Plotting, Multivariate Curve Resolution 11/15/2024 Daniel Pelliccia
A Python script to generate random synthetic mixtures from pure spectra stored in a database. This post can be used as a tutorial to generate datasets for Multivariate Curve Resolution.

Diagnostic plots for linear regression

Data Operations and Plotting, Partial Least Squares Regression, Regression 10/05/2024 Daniel Pelliccia
A short survey of diagnostic plots, as tools to dig deeper into the assumptions behind a regression model.

Genetic algorithm for wavelength selection using Numpy

Regression, Partial Least Squares Regression, Variable Selection 06/22/2024 Daniel Pelliccia
An implementation of a genetic algorithm for wavelength selection using basic Numpy functions.

Robust PCA

Data Operations and Plotting, Data Correction and Normalisation, Outliers Detection, Principal Components Analysis 03/17/2024 Daniel Pelliccia
An implementation of robust PCA, useful when the data contains outliers.

Two methods for baseline correction of spectral data

Data Operations and Plotting, Data Correction and Normalisation 02/10/2024 Daniel Pelliccia
Worked examples of two methods for baseline correction of spectra applied to Raman and XRF data.

Optimal spectra smoothing with Fourier ring correlation

Data Operations and Plotting, Data Correction and Normalisation 11/19/2023 Daniel Pelliccia
We bring in the concept of Fourier ring correlation for another riff on the optimal settings of smoothing filters.

Spectra smoothing with locally-weighted regression

Data Correction and Normalisation, Data Operations and Plotting 10/07/2023 Daniel Pelliccia
Locally weighted regression is an approach for smoothing a set of data points by performing a polynomial regression in the vicinity of each point. It does sound like a …

Wavelength selection with a genetic algorithm

Data Operations and Plotting, Partial Least Squares Regression, Regression, Variable Selection 07/28/2023 Daniel Pelliccia
Wavelength selection methods aim at choosing the spectral bands that produce the best regression or classification model. Here we introduce a genetic algorithm for wavelength selection.

Multi-class classification for NIR spectroscopy: the very basics

Classification, Classification metrics, Classification pipelines, Logistic Regression, Principal Components Analysis 07/08/2023 Daniel Pelliccia
Multi-class classification aims at subdividing samples into one of multiple predefined categories. In this post we explore a basic classifier and discuss important metrics such as accuracy and AUC …

Updates and additions to the PLS Regression code

Regression, Partial Least Squares Regression, Plots and Charts 04/29/2023 Daniel Pelliccia
Updated code and additional utility scripts for PLS regression. We will keep it updated as we go.

Multivariate curve resolution: an introduction

Regression, Data Operations and Plotting, Multivariate Curve Resolution 03/11/2023 Daniel Pelliccia
Multivariate Curve Resolution deals with spectra, or other signals, from samples containing multiple components, and aims at recovering the pure components.

Wavelet denoising of spectra

Data Operations and Plotting, Data Correction and Normalisation 01/12/2023 Daniel Pelliccia
On the concept of wavelet transform and decomposition, and its application for denoising of spectra.

Understanding neural network parameters with TensorFlow in Python: the optimiser

Neural Networks, Regression 11/20/2022 Daniel Pelliccia
An introductory tutorial on optimisers for deep learning, including Python code for training a regression model for NIR spectroscopy.

The Kennard-Stone algorithm

Data Operations and Plotting 10/15/2022 Daniel Pelliccia
The Kennard-Stone algorithm is a criterion to divide a dataset into a calibration and a validation subset. Here's an introduction to the method and a basic Python implementation.
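As a rough illustration of the idea, here is a minimal sketch of a Kennard-Stone style selection. It is not the code from the post: the function name `kennard_stone`, the use of Euclidean distance, and the requirement that `n_select` is at least 2 are all assumptions made for this example. It starts from the two most distant samples, then keeps adding the sample that lies farthest from everything already selected.

```python
import numpy as np

def kennard_stone(X, n_select):
    """Return indices of n_select samples chosen Kennard-Stone style.

    Hypothetical helper for illustration; assumes n_select >= 2.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances between all samples.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Seed the selection with the two most distant samples.
    i, j = np.unravel_index(np.argmax(dist), dist.shape)
    selected = [int(i), int(j)]
    remaining = [k for k in range(len(X)) if k not in selected]
    while len(selected) < n_select and remaining:
        # Distance from each remaining sample to its nearest selected one.
        min_dist = dist[np.ix_(remaining, selected)].min(axis=1)
        # Pick the sample that is farthest from the current selection.
        pick = remaining[int(np.argmax(min_dist))]
        selected.append(pick)
        remaining.remove(pick)
    return selected
```

The returned indices would form the calibration subset, with the samples left over going to validation.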

Detecting lactose in lactose-free milk with NIR spectroscopy

Use Cases 09/24/2022 Daniel Pelliccia
NIR spectroscopy can detect sources of allergens or intolerance in food. Here's an example of detecting lactose in lactose-free milk.

The PCA correlation circle

Plots and Charts, Data Operations and Plotting, Principal Components Analysis 09/03/2022 Daniel Pelliccia
The PCA correlation circle is a useful tool to visually display the correlation between spectral bands and principal components. The correlation can be quantified through the Euclidean distance and …

Regression optimisation with a Pipeline

Regression, Partial Least Squares Regression, Principal Components Regression, Regression pipelines 08/13/2022 Daniel Pelliccia
The process of developing and optimising a regression model almost invariably requires a sequence of steps. These steps can be combined in a single predictor using the Pipeline function …

Qualitative analysis of ground coffee with NIR spectroscopy

Use Cases 07/30/2022 Daniel Pelliccia
Can we use NIR spectroscopy to distinguish between coffee types? Let's take a look at how basic PCA can enable a qualitative estimation of coffee strength.

Aquagrams with Python and Matplotlib

Data Operations and Plotting, Plots and Charts 07/09/2022 Daniel Pelliccia
Aquagrams are diagrams charting the relative strength of the NIR reflectance or absorbance of specific bands in water-based systems. Here's some example code to plot an aquagram using Matplotlib.

Parallel computation of loops for cross-validation analysis

Regression, Data Operations and Plotting, Partial Least Squares Regression 05/21/2022 Daniel Pelliccia
Using parallel computation to speed up cross-validation analysis for large data sets.

Understanding neural network parameters with TensorFlow in Python: the activation function

Neural Networks, Regression 04/09/2022 Daniel Pelliccia
Where we discuss the meaning of an activation function in neural networks, look at a few examples, and show a comparison of neural network training with different activation functions.

Deep neural networks for spectral data regression with TensorFlow

Neural Networks, Regression 03/12/2022 Daniel Pelliccia
This post introduces basic Python code to build fully-connected deep neural networks with TensorFlow for regression analysis of spectral data.

The Akaike Information Criterion for model selection

Regression, Principal Components Regression, Regression metrics, Regression Model Validation 09/18/2021 Daniel Pelliccia
The Akaike Information Criterion (AIC) is another tool to compare prediction models. AIC combines model accuracy and parsimony in a single metric and can be used to evaluate data …
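To make the "accuracy plus parsimony" idea concrete, here is a minimal sketch of the least-squares form of AIC. It assumes Gaussian residuals and drops additive constants, so values are only meaningful when comparing models fitted to the same data. The function name `aic_least_squares` and its arguments are hypothetical, not taken from the post.

```python
import math

def aic_least_squares(y_true, y_pred, n_params):
    """AIC for a least-squares fit, up to an additive constant.

    Assumes Gaussian errors and a non-zero residual sum of squares.
    """
    n = len(y_true)
    # Accuracy term: residual sum of squares of the fitted model.
    rss = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Parsimony term: each extra parameter adds a penalty of 2.
    return n * math.log(rss / n) + 2 * n_params
```

Under these assumptions, a lower value indicates a better trade-off: two models with the same residuals are ranked by their number of parameters.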

Minimal prediction models for linear regression

Regression 04/10/2021 Daniel Pelliccia
What is the minimum amount of information required to export and re-use a linear regression model? The answer is surprisingly simple. Here's a step-by-step example using PLS …

Backward Variable Selection for PLS regression

Regression, Partial Least Squares Regression 03/13/2021 Daniel Pelliccia
Backward Variable Selection for PLS regression is a method to discard variables that contribute poorly to the regression model. Here's a Python implementation of the method.

The Concordance Correlation Coefficient

Regression, Regression metrics, Regression Model Validation 01/09/2021 Daniel Pelliccia
The Concordance Correlation Coefficient (CCC) can be useful to quantify the quality of a linear regression model. In this tutorial we explain the CCC and describe its relation with …
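For reference, a minimal sketch of Lin's concordance correlation coefficient in its usual population form (variances and covariance normalised by the number of samples). The function name `concordance_correlation` is an assumption for this example and is not taken from the post.

```python
import numpy as np

def concordance_correlation(y_true, y_pred):
    """Lin's concordance correlation coefficient between two 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    # Population covariance between reference and predicted values.
    covariance = np.mean((y_true - mean_t) * (y_pred - mean_p))
    # The squared difference of the means penalises a systematic offset.
    return 2 * covariance / (y_true.var() + y_pred.var() + (mean_t - mean_p) ** 2)
```

Predictions that are perfectly correlated with the reference but shifted by a constant score below 1, because the denominator includes the squared difference of the means.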

Bias-Variance trade-off in PLS regression

Regression, Partial Least Squares Regression, Regression Model Validation 09/20/2020 Daniel Pelliccia
Bias-Variance trade-off refers to the optimal choice of parameters in a model in order to avoid both overfitting and underfitting. Let's look at a worked example using PLS regression.

Wavelength band selection with simulated annealing

Regression, Partial Least Squares Regression 08/15/2020 Daniel Pelliccia
Improve the performance of a PLS method by wavelength band selection using Simulated Annealing optimisation.

PCA and kernel PCA explained

Principal Components Analysis 06/10/2020 Daniel Pelliccia
Gain a practical understanding of PCA and kernel PCA by learning to code the algorithms and test them on real spectroscopic data.

Binary classification of spectra with a single perceptron

Classification, Perceptron, PLS Discriminant Analysis 04/17/2020 Daniel Pelliccia
The perceptron is a basic block of feed-forward neural networks. Learn how to use a single perceptron for binary classification of NIR spectra using gradient descent.

PLS Discriminant Analysis for binary classification in Python

Classification, PLS Discriminant Analysis 03/29/2020 Daniel Pelliccia
PLS Discriminant Analysis is a variation of PLS able to deal with classification problems. Here's a tutorial on binary classification with PLS-DA in Python.

Principal component selection with simulated annealing

Principal Components Regression, Regression 02/09/2020 Daniel Pelliccia
Simulated annealing helps overcome some of the shortcomings of greedy algorithms. Here's a tutorial on simulated annealing for principal components selection in regression.

Principal component selection with a greedy algorithm

Principal Components Regression, Regression 01/28/2020 Daniel Pelliccia
Greedy algorithms are commonly used to optimise a function over a parameter space. Here's an implementation of a greedy algorithm for principal components selection in regression.

Choosing the optimal parameters for a Savitzky–Golay smoothing filter

Data Operations and Plotting, Data Correction and Normalisation 01/04/2020 Daniel Pelliccia
Setting the parameters of a Savitzky-Golay filter seems more a craft than a science. Here's my method to find an optimal filter, complete with code.

Moving window PLS regression

Regression, Partial Least Squares Regression 12/07/2019 Daniel Pelliccia
Not all wavelengths are created equal. A moving window PLS algorithm optimises the regression by discarding bands that are not useful for prediction.

K-fold and Montecarlo cross-validation vs Bootstrap: a primer

Regression, Partial Least Squares Regression, Regression Model Validation 11/15/2019 Daniel Pelliccia
Cross-validation is a standard procedure to quantify the robustness of a regression model. Compare K-Fold, Montecarlo and Bootstrap methods and learn some neat tricks in the process.

Fourier spectral smoothing method

Data Operations and Plotting, Data Correction and Normalisation 10/25/2019 Daniel Pelliccia
Meet a little-known member of the spectral smoothing family: the Fourier spectral smoothing method. Learn some theory and a Python implementation.

Savitzky–Golay smoothing method

Data Operations and Plotting, Data Correction and Normalisation 10/05/2019 Daniel Pelliccia
The secret behind perfect smoothing is a wise choice of parameters. In this tutorial you will learn about the Savitzky–Golay method and the way to optimise its performance.

Principal Component Regression in Python revisited

Principal Components Regression, Regression 09/10/2019 Daniel Pelliccia
Want to get more out of your principal components regression? Here's a simple hack that will give you a stunning improvement on the performance of PCR.

Detecting outliers using the Mahalanobis distance with PCA in Python

Classification, Data Operations and Plotting, Outliers Detection, Principal Components Analysis 01/20/2019 Daniel Pelliccia
How do we make sure we are detecting only true outliers and not cherry-picking from the data? Here's a method based on the Mahalanobis distance with PCA.

NIR data correlograms with Seaborn in Python

Data Operations and Plotting, Plots and Charts 01/03/2019 Daniel Pelliccia
Exploratory analysis is an essential part of data analysis. Learn a handy way to explore your dataset with NIR data correlograms with Seaborn in Python.

Classification of NIR spectra by Linear Discriminant Analysis in Python

Classification, Linear Discriminant Analysis 12/03/2018 Daniel Pelliccia
What is Linear Discriminant Analysis and how does it differ from PCA? Let's talk through LDA and build a NIR spectra classifier using LDA in Python.

Principal Components Regression vs Ridge Regression on NIR data in Python

Principal Components Regression, Regression, Ridge Regression 10/19/2018 Daniel Pelliccia
Principal components regression is a staple of NIR analysis. Ridge regression is widely used in machine learning. How do they relate? Find out in this post.

Outliers detection with PLS regression for NIR spectroscopy in Python

Data Operations and Plotting, Outliers Detection, Partial Least Squares Regression, Regression 09/22/2018 Daniel Pelliccia
Not every data point is created equal. In this post we'll show how to perform outliers detection with PLS regression for NIR spectroscopy in Python.

Exporting NIR regression models built in Python

Data Operations and Plotting 08/19/2018 Daniel Pelliccia
Three methods to export a Python NIR regression model and how to load it back for future use. Worked Python code to discuss the pros and cons of these methods.

Two scatter correction techniques for NIR spectroscopy in Python

Data Correction and Normalisation, Data Operations and Plotting 07/21/2018 Daniel Pelliccia
Worked example of two scatter correction techniques for NIR spectroscopy in Python: Multiplicative Scatter Correction and Standard Normal Variate.
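As a quick orientation, the two corrections can be sketched in a few lines of NumPy. This is a minimal sketch, not the code from the post: the function names `snv` and `msc`, the row-per-spectrum layout, and the choice of the mean spectrum as the default MSC reference are assumptions made for this example.

```python
import numpy as np

def snv(spectra):
    """Standard Normal Variate: centre and scale each spectrum (row) on its own."""
    spectra = np.asarray(spectra, dtype=float)
    mean = spectra.mean(axis=1, keepdims=True)
    std = spectra.std(axis=1, keepdims=True)
    return (spectra - mean) / std

def msc(spectra, reference=None):
    """Multiplicative Scatter Correction against a reference spectrum."""
    spectra = np.asarray(spectra, dtype=float)
    # Assumption: with no reference given, use the mean spectrum of the set.
    if reference is None:
        reference = spectra.mean(axis=0)
    corrected = np.empty_like(spectra)
    for i, spectrum in enumerate(spectra):
        # Fit spectrum ~ slope * reference + intercept, then undo both terms.
        slope, intercept = np.polyfit(reference, spectrum, 1)
        corrected[i] = (spectrum - intercept) / slope
    return corrected
```

SNV needs no reference and works one spectrum at a time, while MSC depends on the reference chosen, so the same reference would have to be reused when correcting new data.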

A variable selection method for PLS in Python

Data Operations and Plotting, Partial Least Squares Regression, Regression, Variable Selection 07/04/2018 Daniel Pelliccia
Improve the quality of your PLS regression using variable selection. This tutorial will work through a variable selection method for PLS in Python.

Partial Least Squares Regression in Python

Partial Least Squares Regression, Regression 06/14/2018 Daniel Pelliccia
Step-by-step tutorial on how to build a NIR calibration model using Partial Least Squares Regression in Python.

Principal Component Regression in Python

Principal Components Regression, Regression 05/12/2018 Daniel Pelliccia
An in-depth introduction to Principal Component Regression in Python using NIR data. PCR is the combination of PCA with linear regression. Check it out.
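The "PCA plus linear regression" combination can be sketched with NumPy alone. This is a minimal sketch under stated assumptions, not the post's implementation: the names `pcr_fit` and `pcr_predict` are hypothetical, the principal components come from an SVD of the mean-centred data, and no scaling or preprocessing of the spectra is applied.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Fit a principal component regression; return (coef, intercept)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    # PCA step: rows of Vt are the principal directions of the centred data.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:n_components].T
    scores = Xc @ V
    # Regression step: ordinary least squares on the retained scores.
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)
    # Map the score coefficients back to the original variables.
    coef = V @ gamma
    return coef, y_mean - x_mean @ coef

def pcr_predict(X, coef, intercept):
    """Predict with coefficients expressed in the original variables."""
    return np.asarray(X, dtype=float) @ coef + intercept
```

The number of retained components is the main tuning parameter. Keeping all of them reduces the model to ordinary least squares, while keeping fewer discards the low-variance directions.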

Classification of NIR spectra using Principal Component Analysis in Python

Classification, Principal Components Analysis 03/23/2018 Daniel Pelliccia
An in-depth tutorial on how to run a classification of NIR spectra using Principal Component Analysis in Python. Step-by-step example with code.

NIR classification of macadamia kernels

Classification, Principal Components Analysis, Use Cases 07/06/2017 Daniel Pelliccia
Can we use NIR analysis to grade macadamias? Check out our preliminary results of NIR classification of macadamia kernels using Principal Component Analysis.

The cricket thermometer: an introduction to Principal Component Analysis in Python

Classification, Principal Components Analysis 03/21/2017 Daniel Pelliccia
A worked example for an introduction to Principal Component Analysis in Python.

ACKNOWLEDGEMENT

I acknowledge the Traditional Owners of the land on which I live and work. I pay my respects to their Elders, past and present, and the Aboriginal Elders of other communities who may be here today.

Support Nirpy Research

If you love this blog, please consider supporting it with a small donation on Buy Me a Coffee. Your donation will help keep this blog and our GitHub content free for everyone.

About The Author

Daniel Pelliccia

Physicist and entrepreneur. Founder of Instruments & Data Tools, specialising in custom sensors and analytics. Founder of Rubens Technologies, the crop intelligence system.

The NIRPY Research Blog by Daniel Pelliccia is licensed under a Creative Commons Attribution 4.0 International License.

Instruments & Data Tools Pty Ltd
Website created by Francesco Pelliccia

NIRPY Research © 2025
 