©1996-2018 All Rights Reserved.
Online Journal of Veterinary Research. You may not store these pages in any
form except for your own personal use. All other usage or distribution is
illegal under international copyright treaties. Permission to use any of these
pages in any way other than the aforementioned must be obtained in writing
from the publisher. This article is exclusively copyrighted in its entirety to
OJVR. This article may be copied once but may not be reproduced or
re-transmitted without the express permission of the editors. This journal
satisfies the refereeing requirements (DEST) for the Higher Education Research
Data Collection (Australia).
OJVR™ Online Journal of Veterinary Research©
Volume 21(9):600-615, 2017
Clustering dairy cattle genes by Kullback-Leibler divergence
Houshang Dehghanzadeh1, Seyed Ziaeddin Mirhoseini*2, Mostafa Ghaderi-Zefrehei3, Hassan Tavakoli4, Saeid Esmaeilkhaniyan5
1,2 Department of Animal Science, Faculty of Agricultural Sciences, University of Guilan, Rasht, Iran
3 Department of Animal Science, Faculty of Agricultural Sciences, University of Yasouj, Yasouj, Iran
4 Department of Electrical Engineering, Faculty of Electrical Engineering, University of Guilan, Rasht, Iran
5 Department of Biotechnology, Animal Science Research Institute of Iran, Agricultural Research, Education and Extension Organization (AREEO), Karaj, Iran
*Corresponding Author: Seyed Ziaeddin Mirhoseini, Email: mirhosin@guilan.ac.ir
ABSTRACT
Dehghanzadeh H, Mirhoseini SZ, Ghaderi-Zefrehei M, Tavakoli H, Esmaeilkhaniyan S. Clustering dairy cattle genes by Kullback-Leibler divergence. Onl J Vet Res., 21(9):600-615, 2017.
Bio-computational grouping of genes facilitates genetic analysis,
sequencing and structure-based analyses. DNA sequences of 30 genes involved
in milk protein production were extracted ad hoc from the NCBI genome database
and stored in FASTA format. A C algorithm calculating the base-2 Shannon
entropy of the gene DNA sequences was used to cluster genes governing milk
production in dairy cows by Kullback-Leibler (KL) divergence. KL divergence was
based on nucleotide similarity (KLA), nucleotide difference (KLB) and different
orders of relative entropy (KLH). The AdaBoost algorithm was used to interpret
the clustering results. Examples of results: STX3 (n_nucleotide = 79347) and
CD14 (n_nucleotide = 1417) were the longest and
shortest genes, respectively. A total of 258 exons were identified, among which
exon 1 of HSPA1A (n_nucleotide = 2101) and HSPA5 (n_nucleotide = 20) were the
longest and shortest, respectively. The LCP1 and ABCG2 genes had the highest
number of exons (n_exon = 16), and HSPA1A and YWHAG (n_exon = 1) had the lowest
number of exons in this set of genes. Findings suggested that exons with maximum entropy value are likely to be
suitable for genotype analysis using molecular markers and that both coding and non-coding
sequences can have either low or high complexity. KL divergence can be used,
together with other methods, to cluster large sets of dairy cattle genes into
biologically relevant groups.
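As an illustration of the two quantities named in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes single-nucleotide frequencies over the alphabet A, C, G, T, which the abstract does not specify, and does not attempt to reproduce the KLA, KLB or KLH variants. The function names, the pseudocount and the toy sequences are hypothetical and do not come from the study data.

```python
import math
from collections import Counter

ALPHABET = "ACGT"


def nucleotide_distribution(seq, pseudocount=1.0):
    """Return nucleotide frequencies of a DNA sequence.

    A pseudocount (an assumption of this sketch) keeps every frequency
    non-zero so that the KL divergence below is always finite.
    Symbols outside A, C, G, T (for example N) are ignored.
    """
    counts = Counter(base for base in seq.upper() if base in ALPHABET)
    total = sum(counts.values()) + pseudocount * len(ALPHABET)
    return {b: (counts[b] + pseudocount) / total for b in ALPHABET}


def shannon_entropy(p):
    """Shannon entropy in bits (logarithm base 2)."""
    return -sum(v * math.log2(v) for v in p.values() if v > 0)


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(P || Q) in bits."""
    return sum(p[b] * math.log2(p[b] / q[b]) for b in ALPHABET if p[b] > 0)


if __name__ == "__main__":
    # Toy sequences for illustration only, not real gene data.
    seq_a = "AAAACCGT"
    seq_b = "ACGTACGT"
    p = nucleotide_distribution(seq_a)
    q = nucleotide_distribution(seq_b)
    print("entropy of seq_a (bits):", shannon_entropy(p))
    print("D(seq_a || seq_b) (bits):", kl_divergence(p, q))
```

KL divergence is not symmetric, so a clustering procedure built on it would typically need a symmetrised distance, for example the mean of D(P || Q) and D(Q || P). Whether the study did this is not stated in the abstract.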
Key words: Information theory, Dairy cattle, Kullback-Leibler divergence, Gene clustering.