Machine learning algorithms identified relevant SNPs for milk fat content in cattle

In recent years, machine learning methods have been shown to be efficient in identifying a subset of single nucleotide polymorphisms (SNPs) underlying a trait of interest. The aim of this study was the construction of predictive models using machine learning algorithms, for the identification of lo...


Bibliographic Details
Main Authors: Ríos, Pablo J., Raschia, Maria Agustina, Maizon, Daniel Omar, Demitrio, Daniel Arturo, Poli, Mario Andres
Format: info:ar-repo/semantics/documento de conferencia
Language: English
Published: Sociedad Argentina de Informática 2022
Subjects: Single Nucleotide Polymorphism; Dairy Cattle; Algorithms; Milk Fat; Machine Learning Methods
Online Access: http://hdl.handle.net/20.500.12123/11706
author Ríos, Pablo J.
Raschia, Maria Agustina
Maizon, Daniel Omar
Demitrio, Daniel Arturo
Poli, Mario Andres
collection INTA Digital
description In recent years, machine learning methods have been shown to be efficient in identifying a subset of single nucleotide polymorphisms (SNPs) underlying a trait of interest. The aim of this study was to construct predictive models, using machine learning algorithms, to identify the loci that best explain the variance in milk fat production of dairy cattle. Further objectives were to determine the genes flanking relevant SNPs and to retrieve the pathways, biological processes, or molecular functions overrepresented among them. Fat production values adjusted for fixed effects (FPadj) and estimated breeding values for milk fat production (EBVFP) were used as phenotypes, and SNPs as predictor variables. The models constructed for EBVFP performed better and yielded considerably fewer relevant SNPs than the models for FPadj. Among the genes flanking relevant SNPs, signal transduction pathways and gated channel activities were detected as overrepresented. The loci obtained for EBVFP matched previously reported relevant loci for milk fat content better than those obtained for FPadj. Based on the better performance shown by the models trained for EBVFP and their agreement with previously reported results for the trait studied, we conclude that the relationship among individuals should be accounted for in the phenotype used.
format info:ar-repo/semantics/documento de conferencia
id INTA11706
institution Instituto Nacional de Tecnología Agropecuaria (INTA - Argentina)
language English
publishDate 2022
publisher Sociedad Argentina de Informática
record_format dspace
spelling INTA11706
2022-04-22T11:08:42Z
Instituto de Genética
Fil: Ríos, Pablo J. Universidad de Buenos Aires; Argentina
Fil: Ríos, Pablo J. Universidad Nacional de La Plata. Facultad de Ciencias Exactas; Argentina
Fil: Raschia, Maria Agustina. Instituto Nacional de Tecnología Agropecuaria (INTA). Instituto de Genética; Argentina
Fil: Raschia, Maria Agustina. Universidad Nacional de La Plata. Facultad de Ciencias Médicas; Argentina
Fil: Maizon, Daniel Omar. Instituto Nacional de Tecnología Agropecuaria (INTA). Estación Experimental Agropecuaria Anguil; Argentina
Fil: Maizon, Daniel Omar. Universidad Nacional de La Pampa. Facultad de Agronomía; Argentina
Fil: Demitrio, Daniel Arturo. Instituto Nacional de Tecnología Agropecuaria (INTA). Dirección General de Sistemas de Información, Comunicación y Procesos. Gerencia de Informática y Gestión de la Información; Argentina
Fil: Demitrio, Daniel Arturo. Universidad Nacional de La Plata. Facultad de Ciencias Exactas; Argentina
Fil: Poli, Mario Andres. Instituto Nacional de Tecnología Agropecuaria (INTA). Instituto de Genética; Argentina
Fil: Poli, Mario Andres. Universidad del Salvador. Facultad de Ciencias Agrarias y Veterinaria; Argentina
2022-04-22T11:01:37Z
2022-04-22T11:01:37Z
2021-10
info:ar-repo/semantics/documento de conferencia
info:eu-repo/semantics/conferenceObject
info:eu-repo/semantics/publishedVersion
http://hdl.handle.net/20.500.12123/11706
eng
info:eu-repograntAgreement/INTA/2019-PE-E6-I145-001/2019-PE-E6-I145-001/AR./Mejora genética objetiva para aumentar la eficiencia de los sistemas de producción animal.
info:eu-repograntAgreement/INTA/2019-PT-E6-I513-001/2019-PT-E6-I513-001/AR./Plataforma de mejoramiento animal
info:eu-repograntAgreement/INTA/2019-PT-E9-I180-001/2019-PT-E9-I180-001/AR./TICs y gestión de Big Data
info:eu-repo/semantics/openAccess
http://creativecommons.org/licenses/by-nc-sa/4.0/
Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)
application/pdf
Sociedad Argentina de Informática
50 Jornadas Argentinas de Informática (50 JAIIO), 13 Congreso Argentino de AgroInformática (CAI 2021), October 18 to 29, 2021 (virtual)
title Machine learning algorithms identified relevant SNPs for milk fat content in cattle
topic Single Nucleotide Polymorphism
Dairy Cattle
Algorithms
Milk Fat
Polimorfismo de un Solo Nucleótido
Ganado de Leche
Algoritmos
Grasa de la Leche
Machine Learning Methods
Métodos de Aprendizaje Automático
url http://hdl.handle.net/20.500.12123/11706