Hierarchical multiple-factor analysis for classifying genotypes based on phenotypic and genetic data

A numerical classification problem encountered by breeders and gene-bank curators is how to partition the original heterogeneous population of genotypes into non-overlapping homogeneous subpopulations. The measure of distance that may be defined depends on the type of variables measured (i.e., conti...

Bibliographic Details
Main Authors: Franco, J., Crossa, J., Desphande, S.
Format: Journal Article
Language: English
Published: Wiley 2010
Subjects: breeding, gene banks, genotypes, molecular markers, genetics, phenotypic, genetic data
Online Access: https://hdl.handle.net/10568/92353
author Franco, J.
Crossa, J.
Desphande, S.
collection Repository of Agricultural Research Outputs (CGSpace)
description A numerical classification problem encountered by breeders and gene-bank curators is how to partition the original heterogeneous population of genotypes into non-overlapping homogeneous subpopulations. The measure of distance that may be defined depends on the type of variables measured (i.e., continuous and/or discrete). The key points are whether and how a distance may be defined using all types of variables to achieve effective classification. The objective of this research was to propose an approach that combines the use of hierarchical multiple-factor analysis (HMFA) and the two-stage Ward Modified Location Model (Ward-MLM) classification strategy that allows (i) combining different types of phenotypic and genetic data simultaneously; (ii) balancing out the effects of the different phenotypic, genetic, continuous, and discrete variables; and (iii) measuring the contribution of each original variable to the new principal axes (PAs). Of the two strategies applied for developing PA scores to be used for clustering genotypes, the strategy that used the first few PA scores to which phenotypic and genetic variables each contributed 50% (i.e., a balanced contribution) formed better groups than those formed by the strategy that used a large number of PA scores explaining 95% of total variability. Phenotypic variables account for much variability in the initial PA; then their contributions decrease. The importance of genetic variables increases in later PAs. Results showed that various phenotypic and genetic variables made important contributions to the new PA. The HMFA uses all phenotypic and genetic variables simultaneously and, in conjunction with the Ward-MLM method, it offers an effective unifying approach for the classification of breeding genotypes into homogeneous groups and for the formation of core subsets for genetic resource conservation.
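The balanced-contribution idea in the description can be illustrated with a small sketch. It assumes the eigenvalues and unit-norm axis loadings of a factor analysis are already available, measures a variable group's contribution as the eigenvalue-weighted sum of its squared loadings, and picks the number of leading PAs at which the phenotypic and genetic shares are closest to 50% each. The function names, the toy numbers, and this exact selection rule are illustrative assumptions, not the authors' published procedure.

```python
from typing import Dict, List, Sequence


def cumulative_group_shares(
    eigenvalues: Sequence[float],
    loadings: Sequence[Sequence[float]],
    groups: Dict[str, List[int]],
) -> List[Dict[str, float]]:
    """Share of each variable group in the first k axes, for k = 1..K.

    loadings[k][j] is the coordinate of variable j on unit-norm axis k.
    A group's contribution to an axis is the axis eigenvalue times the sum
    of the squared loadings of the variables in that group.
    """
    running = {name: 0.0 for name in groups}
    shares = []
    for lam, axis in zip(eigenvalues, loadings):
        for name, idx in groups.items():
            running[name] += lam * sum(axis[j] ** 2 for j in idx)
        total = sum(running.values())
        shares.append({name: running[name] / total for name in groups})
    return shares


def most_balanced_k(shares: List[Dict[str, float]], a: str, b: str) -> int:
    """Number of leading axes at which groups a and b have the closest shares."""
    gaps = [abs(s[a] - s[b]) for s in shares]
    return gaps.index(min(gaps)) + 1


# Hypothetical toy input: variables 0-1 are phenotypic, 2-3 are genetic.
groups = {"phenotypic": [0, 1], "genetic": [2, 3]}
eigenvalues = [3.0, 2.0, 1.0]
loadings = [
    [0.8, 0.6, 0.0, 0.0],   # axis 1: driven by phenotypic variables
    [0.0, 0.0, 0.6, 0.8],   # axis 2: driven by genetic variables
    [0.0, 0.0, 0.8, -0.6],  # axis 3: driven by genetic variables
]

shares = cumulative_group_shares(eigenvalues, loadings, groups)
k = most_balanced_k(shares, "phenotypic", "genetic")
print(k, shares[k - 1])
```

In this toy input the phenotypic group dominates the first axis and the genetic group the later ones, mirroring the pattern the description reports. The scores on the selected axes would then be passed to a clustering step such as Ward-MLM, which is not sketched here.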
format Journal Article
id CGSpace92353
institution CGIAR Consortium
language English
publishDate 2010
publisher Wiley
record_format dspace
spelling CGSpace92353 2023-12-08T19:25:22Z 2010-01 2018-04-24T15:21:04Z 2018-04-24T15:21:04Z Journal Article https://hdl.handle.net/10568/92353 en Limited Access Wiley Franco, J., Crossa, J. & Desphande, S. (2010). Hierarchical multiple-factor analysis for classifying genotypes based on phenotypic and genetic data. Crop Science, 50(1), 105-117.
title Hierarchical multiple-factor analysis for classifying genotypes based on phenotypic and genetic data
topic breeding
gene banks
genotypes
molecular markers
genetics
phenotypic
genetic data
url https://hdl.handle.net/10568/92353