In statistics, kernel density estimation (KDE) is a non-parametric way to estimate the probability density function (PDF) of a random variable. The implementation used in this post (the pandas plot.kde method, which relies on SciPy's gaussian_kde) uses Gaussian kernels and includes automatic bandwidth determination.
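To make the definition concrete, here is a minimal sketch of a Gaussian KDE written by hand with NumPy. The function name gaussian_kde_manual, the ages, and the 30 Ma bandwidth are my own assumptions for illustration; they are not part of the Kalyus Member workflow, which relies on the library implementation instead.

```python
import numpy as np

def gaussian_kde_manual(samples, grid, bandwidth):
    """Evaluate a Gaussian KDE on `grid` using a fixed bandwidth (same units as the samples)."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardised distance between every grid point and every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    # One standard normal kernel per sample, evaluated at every grid point
    kernels = np.exp(-0.5 * z**2) / np.sqrt(2.0 * np.pi)
    # Average the kernels and rescale so the curve integrates to 1
    return kernels.sum(axis=1) / (samples.size * bandwidth)

# Hypothetical ages in Ma, made up for illustration only
ages = np.array([550.0, 560.0, 575.0, 1020.0, 1050.0, 1800.0])
grid = np.linspace(0.0, 3500.0, 3501)
density = gaussian_kde_manual(ages, grid, bandwidth=30.0)

# Simple Riemann sum as a sanity check that the estimate is a valid density
area = float(np.sum(density) * (grid[1] - grid[0]))
```

Each age contributes one small bell curve, and the KDE is simply their average, so closely spaced ages reinforce each other and show up as a peak.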
Geoscientists involved in geochronological analysis often need to create plots that highlight the distribution of different zircon populations. The importance of identifying these populations is not discussed here; it is thoroughly explained by other researchers (e.g. Cawood et al., 2012; Barham et al., 2022). KDE plots can be created easily with the Python programming language.
I am using the Jupyter notebook environment. Before proceeding with the code, prepare the Excel (.xlsx) or CSV file with the zircon ages. For this example, I will be using the Kalyus Member data.
After making sure the data is clean and filtered, we can proceed with the Python code. First, we import all the packages necessary for plotting KDE diagrams.
import pandas as pd
import scipy.stats as stats
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()
import numpy as np
We load the Excel file with the zircon age data and create a data frame. Check the name of the column where your ages are stored. In this case, the zircon ages are stored in the age column.
# Load the zircons dataset
data = pd.read_excel('kalyus.xlsx')
# Select the age column (this is a pandas Series, not a dictionary)
ages = data["age"]
# Create a DataFrame from the age column
dataFrame = pd.DataFrame(data=ages)
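If your ages are stored in a CSV file instead, the loading step could look like the sketch below. To keep it runnable without a file on disk, I read from an in-memory string; the sample names, the ages, and the "n.d." placeholder are all made up for illustration, and the column name age is assumed to match the Excel example.

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk; these values are hypothetical
csv_text = "sample,age\nz1,552\nz2,\nz3,1034\nz4,n.d.\nz5,1810\n"

# With a real file you would pass its path to pd.read_csv instead
data = pd.read_csv(io.StringIO(csv_text))

# Turn anything that is not a number into NaN, then drop those rows
ages = pd.to_numeric(data["age"], errors="coerce").dropna()

# One-column DataFrame, ready for dataFrame.plot.kde(...)
dataFrame = ages.to_frame()
```

Dropping empty or non-numeric cells at this stage avoids feeding missing values into the density estimate later on.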
Finally, we generate the kernel density estimate plot using Gaussian kernels. In my experience, setting bw_method (the method used to calculate the estimator bandwidth) to 0.05 gives the best results; however, 'scott', 'silverman', or any other scalar constant or callable can be used as well. Check the pandas documentation page for DataFrame.plot.kde for more information.
You can set the x and y axis limits with the plt.xlim and plt.ylim commands. After the image is generated, you can save it in different file formats, such as .jpg, .pdf, or .png, with the plt.savefig command.
# Plot the PDF using KDE with a fixed bandwidth value
plt.style.use('default')
dataFrame.plot.kde(bw_method=0.05, title="Kalyus Member");
plt.xlim(0, 3500)
plt.ylim(0, 0.009)
plt.savefig('Kalyus.pdf', dpi = 400)
plt.show(block=True);
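One detail worth knowing about bw_method: in SciPy's gaussian_kde, which pandas uses for KDE plots, a scalar value is a factor that multiplies the standard deviation of the data, not a bandwidth in Ma. The sketch below compares the effective bandwidth for 0.05, 'scott' and 'silverman'. The ages are synthetic (two made-up populations) and only stand in for a real dataset.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic ages in Ma: two hypothetical populations, for illustration only
rng = np.random.default_rng(0)
ages = np.concatenate([rng.normal(600, 30, 80), rng.normal(1800, 60, 40)])

bandwidths_ma = {}
for bw in (0.05, "scott", "silverman"):
    kde = gaussian_kde(ages, bw_method=bw)
    # Kernel standard deviation actually used by the estimator, in Ma
    bandwidths_ma[bw] = float(np.sqrt(kde.covariance[0, 0]))

for bw, width in bandwidths_ma.items():
    print(f"bw_method={bw!r}: kernel standard deviation of about {width:.1f} Ma")
```

A small factor such as 0.05 gives narrow kernels that keep closely spaced age peaks separate, while the automatic rules give wider kernels and a smoother curve.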
The final result looks like this, and in my case it is stored as a .pdf file. Afterwards, you can use programs such as Adobe Illustrator to process the image, change the colors, height, width etc. in order to have a good-looking image for your research article.