Pre-treatment of soil X-ray powder diffraction data for cluster analysis

X-ray powder diffraction (XRPD) is widely applied for the qualitative and quantitative analysis of soil mineralogy. In recent years, high-throughput XRPD has resulted in soil XRPD datasets containing thousands of samples. The efforts required for conventional approaches of soil XRPD data analysis ar...

Bibliographic Details
Main Authors: Butler, B. M., Sila, Andrew M., Shepherd, Keith D., Nyambura, M., Gilmore, C. J., Kourkoumelis, N., Hillier, S.
Format: Journal Article
Language: English
Published: Elsevier 2019
Subjects: cluster sampling, data analysis
Online Access: https://hdl.handle.net/10568/108337
author Butler, B. M.
Sila, Andrew M.
Shepherd, Keith D.
Nyambura, M.
Gilmore, C. J.
Kourkoumelis, N.
Hillier, S.
collection Repository of Agricultural Research Outputs (CGSpace)
description X-ray powder diffraction (XRPD) is widely applied for the qualitative and quantitative analysis of soil mineralogy. In recent years, high-throughput XRPD has resulted in soil XRPD datasets containing thousands of samples. The effort required by conventional approaches to soil XRPD data analysis is currently restrictive for such large datasets, resulting in a need for computational methods that can aid in defining soil property – soil mineralogy relationships. Cluster analysis of soil XRPD data represents a rapid method for grouping data into discrete classes based on mineralogical similarities, and thus allows sets of mineralogically distinct soils to be defined and investigated in greater detail. Effective cluster analysis requires minimisation of sample-independent variation and maximisation of sample-dependent variation, which entails pre-treatment of XRPD data in order to correct for common aberrations associated with data collection. A 2^4 factorial design (four pre-treatments, each either applied or omitted) was used to investigate the most effective data pre-treatment protocol for the cluster analysis of XRPD data from 12 African soils, each analysed once by five different personnel. Sample-independent effects of displacement error, noise and signal intensity variation were pre-treated using peak alignment, binning and scaling, respectively. The sample-dependent effect of strongly diffracting minerals overwhelming the signal of weakly diffracting minerals was pre-treated using a square-root transformation. Without pre-treatment, the 60 XRPD measurements failed to provide informative clusters. Pre-treatment via peak alignment, square-root transformation, and scaling each resulted in significantly improved partitioning of the groups (p < 0.05). Data pre-treatment via binning reduced the computational demands of cluster analysis, but did not significantly affect the partitioning (p > 0.1). Applying all four pre-treatments proved to be the most suitable protocol for both non-hierarchical and hierarchical cluster analysis. Deducing such a protocol is considered a prerequisite to the wider application of cluster analysis in exploring soil property – soil mineralogy relationships in larger datasets.
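The four pre-treatments and the factorial design named in the description can be illustrated with a short Python sketch. This is not the authors' implementation. The function names, the integer-shift alignment against a reference pattern, the min-max scaling, the bin width and the order of the steps are all assumptions made for illustration, since the record does not specify them.

```python
from itertools import product

import numpy as np

# Hypothetical factor names for the four pre-treatments in the description.
FACTORS = ("align", "sqrt", "binning", "scaling")


def align_pattern(counts, reference, max_shift=5):
    """Shift a pattern by whole steps to best match a reference pattern.

    Assumption: displacement is approximated by an integer shift chosen to
    maximise the dot product with the reference. np.roll wraps around, so
    this only suits patterns with little signal at the edges.
    """
    shifts = range(-max_shift, max_shift + 1)
    best = max(shifts, key=lambda s: float(np.dot(np.roll(counts, s), reference)))
    return np.roll(counts, best)


def sqrt_transform(counts):
    """Square-root transform, damping strong peaks relative to weak ones."""
    return np.sqrt(np.clip(counts, 0, None))


def bin_pattern(counts, width):
    """Average consecutive groups of `width` points, dropping any remainder."""
    n = (counts.shape[-1] // width) * width
    grouped = counts[..., :n].reshape(*counts.shape[:-1], n // width, width)
    return grouped.mean(axis=-1)


def scale_pattern(counts):
    """Min-max scale one pattern to [0, 1] (assumed scaling method)."""
    lo, hi = counts.min(), counts.max()
    if hi == lo:
        return np.zeros_like(counts, dtype=float)
    return (counts - lo) / (hi - lo)


def pretreat(counts, reference, align=True, sqrt=True,
             binning=True, scaling=True, width=4):
    """Apply the selected pre-treatments in one assumed order."""
    out = np.asarray(counts, dtype=float)
    if align:
        out = align_pattern(out, reference)
    if sqrt:
        out = sqrt_transform(out)
    if binning:
        out = bin_pattern(out, width)
    if scaling:
        out = scale_pattern(out)
    return out


def factorial_design():
    """Enumerate all on/off combinations of the four factors (2^4 = 16)."""
    return [dict(zip(FACTORS, levels))
            for levels in product((False, True), repeat=len(FACTORS))]
```

Each dictionary returned by `factorial_design()` can be passed to `pretreat(counts, reference, **combo)` to produce one of the 16 pre-treated versions of a pattern, which could then be handed to a clustering routine of choice.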
format Journal Article
id CGSpace108337
institution CGIAR Consortium
language English
publishDate 2019
publisher Elsevier
record_format dspace
spelling CGSpace108337 2023-12-08T19:36:04Z 2019-03 2020-05-27T15:19:43Z 2020-05-27T15:19:43Z Journal Article https://hdl.handle.net/10568/108337 en Open Access Elsevier Butler, B. M., Sila, A. M., Shepherd, K. D., Nyambura, M., Gilmore, C. J., Kourkoumelis, N., & Hillier, S. (2019). Pre-treatment of soil X-ray powder diffraction data for cluster analysis. Geoderma, 337:413–424. https://doi.org/10.1016/j.geoderma.2018.09.044
title Pre-treatment of soil X-ray powder diffraction data for cluster analysis
topic cluster sampling
data analysis
url https://hdl.handle.net/10568/108337