Sample Discussion
Subjects:
- Inbreeding coefficients
From erikscraggsgmail.com Thu May 30 18:46:07 2013
From: Erik Scraggs <erikscraggsgmail.com>
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Subject: Inbreeding coefficents
Date: Thu, 30 May 2013 18:46:07 -0500
Dear all,
I would appreciate it if somebody in the community could provide me with
some assistance. I am currently estimating inbreeding coefficients within a
population of cattle. To do this I have been using the inbreeding
coefficients option in PLINK (--het), which, given a large number of SNPs in
a homogeneous sample, calculates inbreeding coefficients based on the
observed versus expected number of homozygous genotypes.
I have run the program and posted a snapshot of my results below; this is
where I would appreciate some clarity. Is it correct to assume that a
negative value in the F column indicates that there is no inbreeding, and
that it can therefore be set to 0?
FID  IID     O(HOM)  E(HOM)    N(NM)   F
0    FB686   28313   2.58E+04  37177    0.2243
0    FB1615  25773   2.60E+04  37510   -0.01666
0    FB2101  27566   2.58E+04  37287    0.1517
0    FB2126  23992   2.60E+04  37494   -0.1699
0    FB2127  24348   2.57E+04  37125   -0.1179
0    FB2422  24469   2.60E+04  37497   -0.1287
0    FB2501  27209   2.58E+04  37317    0.1194
0    FB2892  24327   2.60E+04  37504   -0.1414
--
Erik Scraggs, PhD
Department of Animal Sciences
Washington State University
Pullman, WA, 99164-4236, USA
Tel: 509-288-2291
|
From gianolaansci.wisc.edu Thu May 30 20:06:15 2013
From: Daniel Gianola <gianolaansci.wisc.edu>
Subject: Re: Inbreeding coefficents
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Date: Thu, 30 May 2013 20:06:15 -0500
Since inbreeding coefficients cannot be negative, being probabilities, this
indicates that PLINK (I have no idea what it does) does not use a good
estimation procedure. With a good procedure, estimates must fall inside the
permissible parameter space.
Regards,
Daniel
|
From steibeljmsu.edu Thu May 30 21:39:42 2013
From: MSU_JPS <steibeljmsu.edu>
Subject: Re: Inbreeding coefficents
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Date: Thu, 30 May 2013 21:39:42 -0500
Hi Erik
In the plink documentation there are two notes that may be worth considering.
I copy them below. But what Daniel said is more important: understand
estimates and their properties before using them and be wary of results that
are plain wrong.
Note: With whole-genome data, it is probably best to apply this analysis to
a subset of SNPs pruned to be in approximate linkage equilibrium, say on
the order of 50,000 autosomal SNPs. Use the --indep-pairwise and --indep
commands to achieve this.
Note: The estimate of F can sometimes be negative. Often this will just
reflect random sampling error, but a result that is strongly negative
(i.e. an individual has fewer homozygotes than one would expect by chance
at the genome-wide level) can reflect other factors, e.g. sample
contamination events perhaps.
Sincerely,
Juan P. Steibel
|
From ytutsunomiyagmail.com Thu May 30 21:40:42 2013
From: Yuri Tani Utsunomiya <ytutsunomiyagmail.com>
Subject: Re: Inbreeding coefficents
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Date: Thu, 30 May 2013 21:40:42 -0500
Dear Erik,
The inbreeding coefficient calculated by PLINK is equivalent to FIS in
Wright's F-statistics [1].
In a structured population, the fixation index (represented by F = the
degree of reduction in heterozygosity relative to Hardy-Weinberg
expectation) can be partitioned into three levels: FIT - individual (I)
relative to the total population (T); FIS - individual (I) relative to the
subpopulation (S); and FST - subpopulation (S) relative to the total (T).
Thus, Wright's FIS is often referred to as the inbreeding coefficient, and a
simplified definition is FIS = 1 - (HI/HS), where HI represents the
individual's heterozygosity, and HS the subpopulation's (or breed's)
heterozygosity.
Looking at the definition proposed by Wright (1950) [1], F is better
interpreted as a correlation measure between alleles in different
'partitions' of a structured population, rather than a probability. This
means that it does assume negative values. If HI = HS, then FIS = 0, and
the individual has the exactly expected heterozygosity level for the
subpopulation. If HI < HS, then FIS > 0, and the individual is less
heterozygous than expected given the subpopulation's heterozygosity. The
closer to 1 FIS gets, the more inbred the individual is assumed to be. On
the other hand, if HI > HS, then FIS < 0, so the individual is more
heterozygous than expected given the subpopulation. Hence, negative values
denote outbred individuals.
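The link between the --het output and FIS = 1 - (HI/HS) can be checked with
a few lines of Python. This is a minimal sketch, assuming PLINK's F is
(O(HOM) - E(HOM)) / (N(NM) - E(HOM)), as its documentation describes. The
function names are illustrative only, and the E(HOM) values are the rounded
ones from Erik's snapshot, so the recomputed F only approximates the printed
column.

```python
# Rows copied from the --het snapshot posted at the start of this thread:
# (IID, O(HOM), E(HOM), N(NM), F as printed by PLINK).
ROWS = [
    ("FB686", 28313, 2.58e4, 37177, 0.2243),
    ("FB1615", 25773, 2.60e4, 37510, -0.01666),
    ("FB2101", 27566, 2.58e4, 37287, 0.1517),
    ("FB2126", 23992, 2.60e4, 37494, -0.1699),
    ("FB2127", 24348, 2.57e4, 37125, -0.1179),
    ("FB2422", 24469, 2.60e4, 37497, -0.1287),
    ("FB2501", 27209, 2.58e4, 37317, 0.1194),
    ("FB2892", 24327, 2.60e4, 37504, -0.1414),
]


def f_from_counts(o_hom, e_hom, n_nm):
    """Excess homozygosity: (observed - expected) / (total - expected)."""
    return (o_hom - e_hom) / (n_nm - e_hom)


def f_from_heterozygosity(o_hom, e_hom, n_nm):
    """FIS = 1 - HI/HS, with HI and HS expressed as proportions of loci."""
    h_i = (n_nm - o_hom) / n_nm  # observed heterozygosity of the individual
    h_s = (n_nm - e_hom) / n_nm  # heterozygosity expected in the subpopulation
    return 1.0 - h_i / h_s


RESULTS = [
    (iid, f_from_counts(o, e, n), f_from_heterozygosity(o, e, n), f_plink)
    for iid, o, e, n, f_plink in ROWS
]

for iid, f_counts, f_het, f_plink in RESULTS:
    print(f"{iid:8s} counts={f_counts:+.4f} 1-HI/HS={f_het:+.4f} plink={f_plink:+.4f}")
```

The two functions are algebraically identical, so a negative F simply means
O(HOM) < E(HOM), i.e. HI > HS for that animal.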
I did not understand why you want to set the negative values to zero, as
FIS is not a probability. In fact, you can test departure of individual
heterozygosity from the expectation for the subpopulation by performing
tests for goodness of fit if you want p-values...
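One possible form of such a test is a one-degree-of-freedom chi-square on the
homozygous and heterozygous counts. The sketch below is an assumption about
what that test could look like, not something PLINK reports. It treats loci
as independent, which linkage disequilibrium violates, so the p-values will
tend to be too small. The function name is hypothetical.

```python
import math


def het_goodness_of_fit(o_hom, e_hom, n_nm):
    """Chi-square (1 df) comparing observed and expected hom/het counts."""
    observed = (o_hom, n_nm - o_hom)
    expected = (e_hom, n_nm - e_hom)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper tail of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Animal FB1615 from the snapshot (E(HOM) is the rounded value 2.60E+04).
CHI2, P_VALUE = het_goodness_of_fit(25773, 2.60e4, 37510)
print(f"chi2 = {CHI2:.2f}, p = {P_VALUE:.4f}")
```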
One observation worth noting: if the value is strongly negative, then you
may want to double check the outbred sample for the possibility of
contamination (incidental mixing of two DNA sources, causing high sample
heterozygosity). Although FIS has been widely used in genetic diversity
studies using microsatellites to quantify inbreeding/diversity loss, you
may want to have a look at estimating inbreeding levels by means of runs of
homozygosity (ROH). While FIS largely relies on identity by state,
empirical data suggest that ROH better capture information on identity by
descent, and ROH have been proposed as a suitable method to estimate
autozygosity - some people would say that they should replace the pedigree
estimates. PLINK also has an implementation of the algorithm [2].
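To make the ROH idea concrete, here is a deliberately simplified Python
sketch. It is not PLINK's algorithm: PLINK scans with a sliding window and
tolerates some heterozygous or missing calls, whereas this version ends a run
at the first heterozygous or missing genotype. The function names,
thresholds, and toy data are all assumptions made for illustration.

```python
def runs_of_homozygosity(genotypes, positions, min_snps=3, min_length=200):
    """Return (start, end, n_snps) for each run of homozygous calls.

    genotypes: allele counts per SNP (0 or 2 = homozygous, 1 = heterozygous,
    None = missing), ordered along one chromosome.
    positions: base-pair positions matching genotypes.
    """
    runs = []
    run_start = run_end = None
    run_count = 0
    # The trailing (None, None) sentinel closes a run that reaches the end.
    for g, pos in list(zip(genotypes, positions)) + [(None, None)]:
        if g in (0, 2):
            if run_count == 0:
                run_start = pos
            run_end = pos
            run_count += 1
        else:
            if run_count >= min_snps and run_end - run_start >= min_length:
                runs.append((run_start, run_end, run_count))
            run_count = 0
    return runs


def f_roh(runs, genome_length):
    """Fraction of the covered genome that lies inside ROH."""
    return sum(end - start for start, end, _ in runs) / genome_length


# Toy chromosome: 12 SNPs spaced 100 bp apart.
GENOS = [0, 2, 2, 0, 1, 2, 2, 1, 0, 0, 0, 0]
POSITIONS = list(range(100, 1300, 100))
RUNS = runs_of_homozygosity(GENOS, POSITIONS)
print(RUNS, f_roh(RUNS, genome_length=1200))
```

Real analyses use far larger minimum lengths and SNP counts; the small
values here only keep the toy example readable.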
For those who are not familiar with PLINK [3], I suggest checking it out. It
is elegantly written in C/C++, and is a pioneering software package for the
analysis of SNP data. It remains one of the most complete toolsets
available.
Yours sincerely,
Yuri
[1] http://onlinelibrary.wiley.com/...j.1469-1809.1949.tb02451.x/pdf
[2] http://pngu.mgh.harvard.edu/...urcell/plink/ibdibs.shtml#homo
[3] http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1950838/
--
Yuri T. Utsunomiya
MSc student at São Paulo State University (UNESP - Brazil)
Laboratory of Animal Biochemistry and Molecular Biology - Araçatuba/SP
Mobile: +551881170036
Skype me: yuri.tani
"So I do dearly hope that the Genome Project does not give rise to some
naive biological determinism that says we are nothing more than the sum of
our genes. Geneticists don't believe that. Geneticists believe genes are an
important part of the story. By understanding that part of the story, we're
in a so-much better position to try to understand the rest of the story."
- Prof. Eric Lander
|
From bmuirpurdue.edu Thu May 30 21:44:57 2013
From: "Muir, William M." <bmuirpurdue.edu>
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Subject: FW: Inbreeding coefficents
Date: Thu, 30 May 2013 21:44:57 -0500
Hi Erik,
A negative inbreeding coefficient means that there is an excess of
heterozygosity compared to expected. I am not familiar with the output of
PLINK, but if the excess occurs at most loci, it is a signature of
demography, indicating that you are most likely looking at a recent
outcrossing event.
However, in order to correctly determine inbreeding from genotyping data, the
proportion of loci NOT segregating is also important. Inbreeding drives loci
to homozygosity; thus, if you only consider those loci still segregating, you
will observe an excess of heterozygosity compared to expected, as you did.
The question then becomes how to determine which non-informative loci to
include in the calculation. In order to do this, several breeds would have to be
genotyped to determine the hypothetical ancestral population (HAP) allele
frequencies. From the theory of drift, inbreeding does not change allele
frequency across populations, but does within sub-populations, i.e. the rate of
fixation and loss is directly proportional to the initial allele frequency in
the HAP. Thus if a random set of subpopulations were sampled, the average
allele frequency at that locus across those subpopulations, including those
fixed and lost within some subpopulations, will estimate the HAP allele
frequency (p) at that locus. From this allele frequency, the total expected
heterozygosity (Ht) is determined as 2pq, and summed across loci.
Next, to determine individual inbreeding, using those same loci in your
population, including the ones fixed, determine Hi, heterozygosity of the
individual, as the sum of the segregating loci over the total number of loci
originally segregating in the HAP.
The ratio of Hi/Ht=Hit is the amount of heterozygosity in the individual
relative to the HAP. The total inbreeding coefficient of this individual is
then Fx=1-Hit. Wright called this Fit, i.e. inbreeding of the individual
relative to total. If the expected inbreeding were also be computed for each
subpopulation based on the subpopulation allele frequency, this heterozygosity
is Hs and Fst=1-Hs/Ht, which is the amount of drift that occurred between
subpopulations.
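The calculation described above can be sketched in Python as follows. All
frequencies and genotypes are made-up toy data, the function names are
hypothetical, and Hi is taken here as the count of heterozygous loci so that
it is on the same scale as Ht summed across loci (using averages for both
gives the same ratio).

```python
def expected_heterozygosity(freqs):
    """Sum of 2pq over loci for a list of allele frequencies p."""
    return sum(2.0 * p * (1.0 - p) for p in freqs)


# Hypothetical allele frequencies at 4 loci in 3 sampled breeds.
# Loci that are fixed (1.0) or lost (0.0) within a breed are kept on purpose.
BREED_FREQS = {
    "breed_A": [0.5, 1.0, 0.2, 0.0],
    "breed_B": [0.9, 0.6, 0.0, 0.4],
    "breed_C": [0.1, 0.8, 0.4, 0.2],
}

# HAP allele frequency at each locus: the average across sampled breeds.
HAP_FREQS = [sum(col) / len(col) for col in zip(*BREED_FREQS.values())]
H_T = expected_heterozygosity(HAP_FREQS)

# Fst per breed: 1 - Hs/Ht, with Hs from that breed's own frequencies.
FST = {
    breed: 1.0 - expected_heterozygosity(freqs) / H_T
    for breed, freqs in BREED_FREQS.items()
}


def f_it(genotype, h_t):
    """Fit = 1 - Hi/Ht; genotype holds allele counts 0/1/2 (1 = heterozygous)."""
    h_i = sum(1 for g in genotype if g == 1)
    return 1.0 - h_i / h_t


ANIMAL_1 = [1, 2, 0, 0]  # heterozygous at one locus
ANIMAL_2 = [1, 2, 1, 0]  # heterozygous at two loci

print(f"Ht = {H_T:.2f}", {b: round(v, 3) for b, v in FST.items()})
print(f"Fit animal 1 = {f_it(ANIMAL_1, H_T):+.3f}")
print(f"Fit animal 2 = {f_it(ANIMAL_2, H_T):+.3f}")
```

Note that the fixed and lost loci still count toward Ht, which is the point
made above about including loci that no longer segregate.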
However, to complicate things even further, a SNP chip has ascertainment bias,
meaning that only SNPs that were informative in certain breeds were put on the
chip. This results in the same problem and has to be corrected for; Andy Clark
has a paper on how to do this. Sequencing data are much better for determining
inbreeding coefficients, as they do not have ascertainment bias.
I am sure this is more than you wanted, but the bottom line is that it is
difficult to get a true estimate of inbreeding without knowledge of the entire
drift process and correct sampling of the genetic material.
A quick reference would be Hartl and Clark's book on population genetics;
they also reference the original works of Wright, Hill, and Weir (who also
has a program to do this from genomic data; the book's title is Genetic Data
Analysis, and the program is at
http://www.eeb.uconn.edu/people/plewis/software.php). I also have a
publication on the topic in chickens using a SNP chip, which I can share with
you if interested.
Best Regards, Bill
-----------------------------
William Muir, Ph.D.
Professor Genetics
Department of Animal Sciences
Purdue University and
Department of Medicine
Indiana University
Room G406 Lilly Hall
915 West State Street
W. Lafayette, IN 47907
765-494-8032
https://ag.purdue.edu/...?strAlias=bmuir&intDirDeptID=8
http://medicine.iupui.edu/iarc/
|
From Andres.Legarratoulouse.inra.fr Fri May 31 07:49:42 2013
From: Andres Legarra <Andres.Legarratoulouse.inra.fr>
Subject: Re: Inbreeding coefficents
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Date: Fri, 31 May 2013 07:49:42 -0500
Hi,
Inbreeding is the excess of homozygotes relative to Hardy-Weinberg
equilibrium (Falconer). Or, it is the correlation between uniting
gametes (Wright). According to these definitions, it is NOT a
probability and it can therefore be negative.
However, if you use pedigree to estimate inbreeding, you are forced to
assume that all founder alleles are different, and as a byproduct of
this assumption, inbreeding is positive and is also a probability of
identity by descent.
When constructing genomic relationship matrices (VanRaden, 2008; Yang et
al., 2010; etc.) it is common to find negative values of inbreeding and
also of relationships. These have to be interpreted as covariances and
not as probabilities. Setting them to 0 creates havoc: you mess up the
linear model and bias your results.
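A small Python sketch shows how negative values arise on the diagonal of a
genomic relationship matrix. It assumes VanRaden's (2008) first method,
G = ZZ' / (2 * sum p(1-p)) with Z the genotypes centred by 2p, and takes
genomic inbreeding as G_ii - 1. The genotypes are toy data and the function
names are illustrative.

```python
def allele_freqs(genos):
    """Allele frequency per locus from 0/1/2 genotypes (rows = animals)."""
    n_animals = len(genos)
    return [sum(col) / (2.0 * n_animals) for col in zip(*genos)]


def genomic_inbreeding(genos, freqs):
    """Diagonal of the VanRaden-type GRM minus 1, one value per animal."""
    scale = 2.0 * sum(p * (1.0 - p) for p in freqs)
    return [
        sum((x - 2.0 * p) ** 2 for x, p in zip(row, freqs)) / scale - 1.0
        for row in genos
    ]


# Toy data: 4 animals by 4 SNPs, coded as copies of the reference allele.
GENOS = [
    [0, 0, 2, 2],  # homozygous everywhere
    [1, 1, 1, 1],  # heterozygous everywhere
    [2, 2, 0, 0],
    [1, 1, 1, 1],
]
FREQS = allele_freqs(GENOS)
F_GENOMIC = genomic_inbreeding(GENOS, FREQS)
print(FREQS, F_GENOMIC)
```

With allele frequencies estimated from the same animals, the values average
to about zero, so some of them must be negative. Truncating those at 0 shifts
the mean and breaks that covariance interpretation.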
Andres
--
Andres Legarra
+33 561285182
INRA, UR 631 SAGA, 24 Chemin de Borde Rouge - Auzeville
CS 52627
31326 Castanet Tolosan, France
http://genoweb.toulouse.inra.fr/~alegarra
|
From hsimiangwdg.de Fri May 31 08:31:11 2013
From: "Simianer, Henner" <hsimiangwdg.de>
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Subject: AW: Inbreeding coefficents
Date: Fri, 31 May 2013 08:31:11 -0500
Hi Andres,
This is not the full story.
The inbreeding coefficient F is also defined as the probability that the two
homologous alleles at a random locus in one individual are identical by
descent (Malecot) and, being a probability, is bounded between 0 and 1,
regardless of your assumptions about the base population. Thus, as often in
quantitative genetics, the same thing has different definitions with
different implications. Obviously, estimates of F can fall outside the
interval (0,1), depending on the method you use.
Best wishes
Henner
_____________________________________
Dr. Henner Simianer
Professor of Animal Breeding and Genetics
Department of Animal Sciences
Georg-August-University Goettingen
Albrecht-Thaer-Weg 3, 37075 Goettingen
Tel.: +49-551-395604, Fax: +49-551-395587
Email: hsimiangwdg.de
http://www.uni-goettingen.de/tierzucht
|
From taylorjerrmissouri.edu Fri May 31 08:34:10 2013
From: "Taylor, Jerry F. (Animal Science)" <taylorjerrmissouri.edu>
Subject: RE: Inbreeding coefficents
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Date: Fri, 31 May 2013 08:34:10 -0500
Just a couple of other comments to add to the mix:
1. No matter how it is estimated/calculated/interpreted there is generally
an assumption of random mating and selective neutrality (an absence of direct
selection on genotype) associated with the locus/loci and these assumptions
are generally violated in most populations due to drift and artificial
selection. Thus, it is quite possible that you will observe a lower level of
homozygosity within individuals than would be expected under these
assumptions.
2. I am not sure how PLINK calculates the genomic relationship matrix, but
if you have a read through PVR's great paper "Efficient methods to compute
genomic predictions." J Dairy Sci. 2008 91(11):4414-23 you will see that the
allele frequencies (AF) that are used to compute the GRM are for the base
generation. Most programs that compute GRMs from genotype data simply compute
AF at the locus using all animals and use this to construct the GRM and this
is fine if the population is not subject to admixture, selection or drift.
So:
a) If your animals are crossbreds - you have a problem
b) If your animals are stratified in time - you may have a problem
I have found that the F coefficients for a population of 3570 registered
Angus animals are quite sensitive to the AF estimates. If you estimate AF
using all animals, these differ from AF estimated using the oldest 10% of
animals, and the effect on estimates of F is quite considerable.
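The sensitivity to the choice of allele frequencies can be illustrated with
toy data. The Python sketch below is not the Angus analysis: the genotypes
are invented, the "oldest" group is just two animals, and F is the simple
excess-homozygosity estimator with E(HOM) taken as the sum of 1 - 2pq,
ignoring any small-sample correction. Function names are hypothetical.

```python
def allele_freqs(genos):
    """Allele frequency per locus from 0/1/2 genotypes (rows = animals)."""
    n_animals = len(genos)
    return [sum(col) / (2.0 * n_animals) for col in zip(*genos)]


def f_excess_hom(genotype, freqs):
    """F = (O(HOM) - E(HOM)) / (N - E(HOM)) for one animal."""
    n_loci = len(genotype)
    o_hom = sum(1 for g in genotype if g != 1)
    e_hom = sum(1.0 - 2.0 * p * (1.0 - p) for p in freqs)
    return (o_hom - e_hom) / (n_loci - e_hom)


# Toy population stratified in time: frequencies drift upward after the
# oldest animals, as they might under selection.
OLDEST = [[1, 1, 1, 1], [1, 1, 1, 1]]
YOUNGER = [[2, 2, 2, 1], [2, 2, 1, 2], [2, 1, 2, 2], [2, 2, 2, 2]]
ALL_ANIMALS = OLDEST + YOUNGER

ANIMAL = YOUNGER[0]
F_ALL = f_excess_hom(ANIMAL, allele_freqs(ALL_ANIMALS))
F_BASE = f_excess_hom(ANIMAL, allele_freqs(OLDEST))
print(f"F with AF from all animals:    {F_ALL:.3f}")
print(f"F with AF from oldest animals: {F_BASE:.3f}")
```

The same genotype gets a noticeably lower F when the frequencies come from
all animals, because those frequencies already reflect the change the animal
is being measured against.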
Jared Decker describes the very strong selection occurring genome-wide in
these animals in his paper "A novel analytical method, Birth Date Selection
Mapping, detects response of the Angus (Bos taurus) genome to selection on
complex traits." BMC Genomics. 2012 13:606. He also examines the relationship
between genomic F and pedigree F in this paper.
Jerry
|
From bmuirpurdue.edu Fri May 31 08:34:17 2013
From: "Muir, William M." <bmuirpurdue.edu>
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Subject: FW: Inbreeding coefficents
Date: Fri, 31 May 2013 08:34:17 -0500
Hi,
Andres is of course correct, and if inbreeding coefficients are calculated
using the genomic relationship matrix (GRM) approach, and scaled correctly,
the inbreeding detected is that which has occurred within that breed or
sub-population.
Because inbreeding is cumulative, it can be broken down into that which
occurred prior to breed formation and that which has occurred after. If one
uses a single breed starting at some time after breed formation, the
inbreeding detected is current inbreeding, not total inbreeding. So the more
important question is: what is the inbreeding coefficient being used for? If
it is for within-breed comparisons, i.e. genomic selection, then current
inbreeding is appropriate. If one wants to know how much diversity has been
lost as a result of population subdivision (breed formation) as well as
current inbreeding, then one essentially has to do an across-breed GRM.
In the first definition given by Andres as deviation from HWE, the issue is
what allele frequency to use as 'p' to calculate expected heterozygosity. If
one uses p estimated within a breed, the inbreeding detected is local or
current inbreeding. If p is estimated from the HAP, then the expected
heterozygosity is that in the HAP and inbreeding detected is total.
Bill
|
From gianolaansci.wisc.edu Fri May 31 08:56:13 2013
From: Daniel Gianola <gianolaansci.wisc.edu>
Subject: Re: Inbreeding coefficents
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Date: Fri, 31 May 2013 08:56:13 -0500
Points well taken, as I was assuming that this was based on pedigrees.
We could certainly use negatively inbred individuals in some animal
populations, e.g., dogs.
It would be useful to revisit Cockerham (1969, 1973), where he revisits
Wright's indexes in terms of variance components, and these cannot be
negative except when silly unbiased estimators are used.
Please take my comments in the light of my ignorance about PLINK.
Bill Muir's remarks are very useful as well.
Regards,
Dan
-----Original message-----
.From: Yuri Tani Utsunomiya <ytutsunomiyagmail.com>
.To: Multiple Recipients of <AnGenMapanimalgenome.org>
.Sent: Fri, May 31, 2013 02:42:10 GMT+00:00
.Subject: Re: Inbreeding coefficents
Dear Erik,
The inbreeding coefficient calculated by PLINK is equivalent to FIS in
Wright's F-statistics [1].
In a structured population, the fixation index (represented by F = the
degree of reduction in heterozygosity relative to Hardy-Weinberg
expectation) can be partitioned into three levels: FIT - individual (I)
relative to the total population (T); FIS - individual (I) relative to the
subpopulation (S); and FST - subpopulation (S) relative to the total (T).
Thus, Wright's FIS is often referred as the inbreeding coefficient, and a
simplistic definition is FIS = 1 - (HI/HS), where HI represents the
individual's heterozygosity, and HS the subpopulation's (or breed)
heterozygosity.
Looking at the definition proposed by Wright (1950) [1], F is better
interpreted as a correlation measure between alleles in different
'partitions' of a structured population, rather than a probability. This
means that it does assume negative values. If HI = HS, then FIS = 0, and
the individual has the exactly expected heterozygosity level for the
subpopulation. If HI < HS, then FIS > 0, and the individual is less
heterozygous than expected given the subpopulation's heterozygosity. The
closer to 1 FIS gets, the more inbred the individual is assumed to be. On
the other hand, if HI > HS, then F < 0, so the individual is more
heterozygous then expected given the subpopulation. Hence, negative values
denote outbred individuals.
I did not understand why you want to set the negative values to zero, as
FIS is not a probability. In fact, you can test departure of individual
heterozygosity from the expectation for the subpopulation by performing
tests for goodness of fit if you want p-values...
One observation worth noting: if the value is strongly negative, then you
may want to double check the outbred sample for the possibility of
contamination (accidental mixing of two DNA sources, causing high sample
heterozygosity). Although FIS has been widely used in genetic diversity
studies using microsatellites to quantify inbreeding/diversity loss, you
may want to have a look at estimating inbreeding levels by means of runs of
homozygosity (ROH). While FIS largely relies on identity by state,
empirical data suggest that ROH better capture information on identity by
descent, and ROH have been proposed as a suitable method to estimate
autozygosity - some people would say that they should replace the pedigree
estimates. PLINK also has an implementation of the algorithm [2].
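As a rough sketch of how ROH output is usually turned into an inbreeding
estimate: a common definition is F_ROH = (total length of the genome in ROH)
/ (length of the autosomal genome covered by SNPs). The snippet below assumes
the ROH segments have already been detected by some tool; the segment lengths
and the genome length are hypothetical placeholders, and the function is not
part of PLINK.

```python
def f_roh(roh_lengths_kb, autosome_length_kb):
    """F_ROH: fraction of the covered autosomal genome lying in ROH segments."""
    if autosome_length_kb <= 0:
        raise ValueError("autosome length must be positive")
    return sum(roh_lengths_kb) / autosome_length_kb


# Hypothetical ROH segments (kb) for one animal and a hypothetical
# SNP-covered autosome length (kb); replace both with real values.
segments_kb = [5000, 12000, 8000]
covered_kb = 2_500_000
print(f"F_ROH = {f_roh(segments_kb, covered_kb):.4f}")
```

Unlike FIS, this ratio cannot be negative, since it is a proportion of the
genome.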
For those who are not familiar with PLINK [3], I suggest checking it out. It
is elegantly written in C/C++, and is a pioneering piece of software for the
analysis of SNP data. It remains one of the most complete toolsets
available out there.
Yours sincerely,
Yuri
[1] http://onlinelibrary.wiley.com/...j.1469-1809.1949.tb02451.x/pdf
[2] http://pngu.mgh.harvard.edu/...urcell/plink/ibdibs.shtml#homo
[3] http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1950838/
--
*Yuri T. Utsunomiya*
MSc student at São Paulo State University (UNESP - Brazil)
Laboratory of Animal Biochemistry and Molecular Biology - Araçatuba/SP
Mobile: * +551881170036 * Skype me: yuri.tani *
"So I do dearly hope that the Genome Project does not give rise to some
naive biological determinism that says we are nothing more than the sum of
our genes. Geneticists don't believe that. Geneticists believe genes are an
important part of the story. By understanding that part of the story, we're
in a so-much better position to try to understand the rest of the story."
- Prof. Eric Lander
|
From ytutsunomiyagmail.com Fri May 31 08:59:13 2013
From: Yuri Tani Utsunomiya <ytutsunomiyagmail.com>
Subject: Re: Inbreeding coefficents
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Date: Fri, 31 May 2013 08:59:13 -0500
Thanks Bill and Andres for putting it down in clearer words.
I concur with Daniel: Weir and Cockerham's [1] contribution in redefining
F-statistics in a variance-components framework opened a window to a new
series of estimators for genetic diversity and differentiation in
geographically structured populations. I recommend reading [2] for a nice
review of the subject. Although [3] is a review focusing on FST, I find it
very pleasant reading for anybody who wants to study F-statistics.
Now, if your focus, Erik, is on individual autozygosity and potentially
inbreeding depression, rather than population diversity, you may want to go
beyond F-statistics and do ROH analysis.
Best,
Yuri
[1] http://www.jstor.org/stable/2408641
[2] http://www.annualreviews.org/...annurev.genet.36.050802.093940
[3] http://www.nature.com/...journal/v10/n9/pdf/nrg2611.pdf
On Fri, May 31, 2013 at 9:39 AM, Daniel Gianola <gianolaansci.wisc.edu> wrote:
> Points well taken, as I was assuming that this was based on pedigrees. We
> could certainly use negatively inbred individuals in some animal
> populations, e.g., dogs.
>
> It would be useful to revisit Cockerham (1969, 1973), where he recasts
> Wright's indexes in terms of variance components, and these cannot be
> negative except when silly unbiased estimators are used.
>
> Please take my comments in the light of my ignorance about PLINK.
>
> Bill Muir's remarks are very useful as well.
>
> Regards,
>
> Dan
>
>
|
From ydaumn.edu Fri May 31 09:03:38 2013
From: Yang Da <ydaumn.edu>
Subject: Re: Inbreeding coefficents
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Date: Fri, 31 May 2013 09:03:38 -0500
I think the main point is the distinction between the pedigree inbreeding
coefficient, which is a function of IBD probability and non-negative, and
the genomic inbreeding coefficient, which could be negative for the main
reasons given in other posts.
Yang Da, Ph. D.
Department of Animal Science
University of Minnesota
On Fri, May 31, 2013 at 8:31 AM, Simianer, Henner <hsimiangwdg.de> wrote:
> Hi Andres,
>
> This is not the full story.
>
> The inbreeding coefficient F is also defined as the probability that the two
> homologous alleles at a random locus in one individual are identical by
> descent (Malecot) and, being a probability, is bounded between 0 and 1,
> regardless of your assumptions on the base population. Thus, as often in
> quantitative genetics, the same thing has different definitions with
> different implications. Obviously estimates of F can be outside the interval
> (0,1) depending on the method you use.
>
> Best wishes
>
> Henner
>
>
> _____________________________________
> Dr. Henner Simianer
> Professor of Animal Breeding and Genetics
> Department of Animal Sciences
> Georg-August-University Goettingen
> Albrecht-Thaer-Weg 3, 37075 Goettingen
> Tel.: +49-551-395604, Fax: +49-551-395587
> Email: hsimiangwdg.de
> http://www.uni-goettingen.de/tierzucht
>
>
> -----Original message-----
> From: Andres Legarra [mailto:Andres.Legarratoulouse.inra.fr]
> Sent: Friday, 31 May 2013 14:50
> To: Multiple Recipients of
> Subject: Re: Inbreeding coefficents
>
> Hi,
>
> inbreeding is an excess of homozygotes relative to Hardy-Weinberg
> equilibrium (Falconer). Or, it is the correlation between uniting gametes
> (Wright). According to these definitions, it is NOT a probability and it can
> therefore be negative.
>
> However, if you use pedigree to estimate inbreeding, you are forced to
> assume that all founder alleles are different, and as a byproduct of this
> assumption, inbreeding is positive and is also a probability of identity by
> descent.
>
> When constructing genomic relationship matrices (VanRaden, 2008; Yang et
> al., 2010; etc.) it is frequent to find negative values of inbreeding and
> also of relationships. These have to be interpreted as covariances and not
> as probabilities. Setting them to 0 creates havoc: you mess up the linear
> model and bias your results.
>
> Andres
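A small numerical illustration of Andres's warning about setting negative
values to 0, using the eight F values from the PLINK --het table in Erik's
original post. It only shows how the sample mean shifts when negatives are
truncated; it does not reproduce the effect on a full linear model, and the
helper name is hypothetical.

```python
from statistics import mean

# F column copied from the PLINK --het table in Erik's original post.
f_values = [0.2243, -0.01666, 0.1517, -0.1699, -0.1179, -0.1287, 0.1194, -0.1414]


def clip_at_zero(values):
    """Replace every negative estimate by 0 (the practice being warned against)."""
    return [max(0.0, v) for v in values]


raw_mean = mean(f_values)
clipped_mean = mean(clip_at_zero(f_values))
print(f"mean F as estimated:        {raw_mean:+.4f}")
print(f"mean F, negatives set to 0: {clipped_mean:+.4f}")
```

For these eight animals the mean moves from roughly -0.01 to roughly +0.06,
so truncation alone makes the group look inbred on average.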
|
|
From ytutsunomiyagmail.com Fri May 31 09:50:29 2013
From: Yuri Tani Utsunomiya <ytutsunomiyagmail.com>
Subject: Re: Inbreeding coefficents
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Date: Fri, 31 May 2013 09:50:29 -0500
Just to foster the discussion with some extra useful info, I believe PLINK
calculates F as:
[note that this is a LaTeX equation - to see how it renders you can copy
and paste it here: http://www.codecogs.com/latex/eqneditor.php]
F_{i} = \frac{O_{i} - E_{i}}{L_{i} - E_{i}}
where O_{i} is the observed number of homozygous genotypes, L_{i} is the
number of SNPs measured in individual i, and
E_{i} = \sum\limits_{j=1}^{L_{i}}
\left(1-2p_{j}(1-p_{j})\frac{n_{j}}{n_{j}-1}\right)
where n_{j} and p_{j} are the number of measured genotypes and the reference
allele frequency at locus j, respectively. The factor n_{j}/(n_{j}-1) is a
small-sample correction, so it is slightly larger than 1.
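The formula can be checked against the first row of Erik's table with a few
lines of Python. This is a sketch of the formula as written in this thread,
not PLINK's own code; the function names are hypothetical, and whether n_j
counts genotypes or alleles is left open here. Because E(HOM) is printed with
only three significant digits (2.58E+04), the recomputed F agrees with the
reported 0.2243 only approximately.

```python
def expected_hom(freqs, n_obs):
    """E_i = sum over loci of 1 - 2 p_j (1 - p_j) n_j / (n_j - 1)."""
    return sum(
        1.0 - 2.0 * p * (1.0 - p) * n / (n - 1.0)
        for p, n in zip(freqs, n_obs)
    )


def het_f(observed_hom, expected_hom_count, n_loci):
    """F_i = (O_i - E_i) / (L_i - E_i)."""
    return (observed_hom - expected_hom_count) / (n_loci - expected_hom_count)


# Rows FB686 and FB2126 from Erik's table: O(HOM), E(HOM), N(NM).
f_fb686 = het_f(28313, 2.58e4, 37177)
f_fb2126 = het_f(23992, 2.60e4, 37494)
print(f"FB686:  F = {f_fb686:+.4f}")
print(f"FB2126: F = {f_fb2126:+.4f}")
```

Whenever an animal has fewer homozygous genotypes than expected (O < E), the
numerator is negative and so is F, as for FB2126.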
I may be wrong, but PLINK implements F as a descriptive statistic that can
be used for three main purposes: 1) to identify contaminated
samples; 2) to identify excess X chromosome heterozygosity in samples
declared to be male; 3) as the Wright inbreeding coefficient. Other uses
must be carefully assessed and interpreted. As said before in the
discussion, 'inbreeding coefficients' have a handful of different
definitions and contexts, but this one in particular is a variant of the FIS
(IBS-based) measure defined by Wright in the analysis of structured
populations.
Yuri
On Fri, May 31, 2013 at 11:07 AM, Baumung, Roswitha (AGAG)
<Roswitha.Baumungfao.org> wrote:
> Dear colleagues,
>
> You might find the following publication interesting: Inbreeding: one word,
> several meanings, much confusion. Templeton AR, Read B. Department of
> Biology, Washington University, St. Louis, MO 63130.
>
> Abstract
>
> Because conservation biologists must frequently deal with small populations,
> inbreeding (a frequent consequence of small population size) has played a
> central role in many genetic management programs. However, the word
> "inbreeding" has several, often contradictory meanings, and a failure to
> distinguish among these meanings has caused much misunderstanding on the role
> of inbreeding in genetic management. Three different biological meanings of
> inbreeding are discussed in this paper: (1) inbreeding as a measure of shared
> ancestry in the paternal and maternal lineages of an individual; (2)
> inbreeding as a measure of genetic drift in a finite population, and (3)
> inbreeding as a measure of system of mating in a reproducing population. The
> distinction and use of these different measures of inbreeding are discussed
> and illustrated with a worked example, the North American captive population
> of Speke's gazelle (Gazella spekei). It is shown that these different meanings
> of the word inbreeding must be kept separated, otherwise erroneous management
> recommendations and evaluations can occur. On the positive side, the different
> measures of inbreeding when used jointly can be a powerful management tool
> precisely because they measure different biological phenomena.
>
> Kind regards, Roswitha
>
>
> Roswitha Baumung
> Animal Production Officer
> Animal Genetic Resources Branch
> Animal Production and Health Division
> FAO - Food and Agriculture Organization of the United Nations
> Viale delle Terme di Caracalla
> 00153 Rome Italy
> Tel. +39 06 57052158
>
>
|
From erikscraggsgmail.com Fri May 31 10:04:48 2013
From: Erik Scraggs <erikscraggsgmail.com>
Subject: Re: Inbreeding coefficents
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Date: Fri, 31 May 2013 10:04:48 -0500
Dear all,
I greatly appreciate your help, thank you for taking the time to provide me
such detailed explanations. It is great to have the help of the community.
Many thanks
Erik
On Fri, May 31, 2013 at 6:34 AM, Taylor, Jerry F. (Animal Science) <
taylorjerrmissouri.edu> wrote:
> Just a couple of other comments to add to the mix:
>
> 1. No matter how it is estimated/calculated/interpreted there is generally
> an assumption of random mating and selective neutrality (an absence of direct
> selection on genotype) associated with the locus/loci and these assumptions
> are generally violated in most populations due to drift and artificial
> selection. Thus, it is quite possible that you will observe a lower level of
> homozygosity within individuals than would be expected under these
> assumptions.
>
> 2. I am not sure how PLINK calculates the genomic relationship matrix, but
> if you have a read through PVR's great paper "Efficient methods to compute
> genomic predictions." J Dairy Sci. 2008 91(11):4414-23 you will see that the
> allele frequencies (AF) that are used to compute the GRM are for the base
> generation. Most programs that compute GRMs from genotype data simply compute
> AF at the locus using all animals and use this to construct the GRM and this
> is fine if the population is not subject to admixture, selection or drift.
> So:
> a) If your animals are crossbreds - you have a problem
> b) If your animals are stratified in time - you may have a problem
>
> I have found that the F coefficients for a population of 3570 registered
> Angus animals are quite sensitive to the AF estimates. If you estimate AF
> using all animals these differ from AF estimates estimated using the oldest
> 10% of animals and the effect on estimates of F is quite considerable.
>
> Jared Decker describes the very strong selection occurring genome-wide in
> these animals in his paper "A novel analytical method, Birth Date Selection
> Mapping, detects response of the Angus (Bos taurus) genome to selection on
> complex traits" BMC Genomics. 2012 13:606. He also examines the relationship
> between genomic F and pedigree F in this paper.
>
> Jerry
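Jerry's point about sensitivity to the allele frequency (AF) estimates can be
illustrated with a toy calculation. The sketch below uses the same
observed-versus-expected homozygosity ratio discussed in this thread, without
the small-sample correction, and evaluates one animal under two AF sets. All
numbers are hypothetical; they are not the Angus data.

```python
def expected_hom(freqs):
    """Expected homozygous count under Hardy-Weinberg: sum of 1 - 2p(1-p)."""
    return sum(1.0 - 2.0 * p * (1.0 - p) for p in freqs)


def f_from_counts(observed_hom, freqs):
    """F = (O - E) / (L - E), with L the number of loci."""
    e = expected_hom(freqs)
    return (observed_hom - e) / (len(freqs) - e)


# One hypothetical animal: homozygous at 6 of 10 loci.
# AF set 1: estimated from all animals; AF set 2: from the oldest animals.
f_all = f_from_counts(6, [0.5] * 10)
f_old = f_from_counts(6, [0.3] * 10)
print(f"F with AF from all animals:    {f_all:.3f}")
print(f"F with AF from oldest animals: {f_old:.3f}")
```

The genotypes are identical in both calls; only the reference frequencies
change, yet F drops from 0.2 to below 0.05.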
>
>
--
Erik Scraggs, PhD
Department of Animal Sciences
Washington State University
Pullman, WA, 99164-4236, USA
Tel: 509-288-2291
|
From Roger.VallejoARS.USDA.GOV Fri May 31 10:22:49 2013
From: "Vallejo, Roger" <Roger.VallejoARS.USDA.GOV>
Subject: RE: Inbreeding coefficents
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Date: Fri, 31 May 2013 10:22:49 -0500
I think the basic question is not being answered: are Wright's
F-statistics correlations or probabilities? Once that is settled, you can
decide how to treat these F-statistics. Let me add this.
The F-statistic model is a hierarchical model with genes stratified at three
levels: Individuals (I), within subdivisions (S) and within the total population
(T). It has three main parameters: FIT is the correlation of uniting gametes
relative to those of the total population; FIS is the average over all
subdivisions of the correlation of uniting gametes relative to the gametes of
the subdivision; and FST is the correlation of random gametes within
subdivisions relative to the total population. The three F-statistics are
interrelated as (1 - FIT) = (1 - FST) (1 - FIS). A variety of derivations of
this basic relationship are available (Wright 1951, 1965; Cockerham 1969). It is
clear from WRIGHT's formulation of the F-statistic model that the parameters FIS
and FIT are free to take either positive or negative values depending on whether
there is a deficit or excess of heterozygotes; it is also clear from WRIGHT's
work that the parameter FST is necessarily positive (JC Long, Genetics 1986).
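The relationship (1 - FIT) = (1 - FST) (1 - FIS) can be verified numerically.
The sketch below assumes the heterozygosity-based forms FIS = 1 - HI/HS,
FST = 1 - HS/HT and FIT = 1 - HI/HT, which extend the simplistic FIS
definition given earlier in the thread; the heterozygosity values are
hypothetical.

```python
def f_statistics(h_i, h_s, h_t):
    """Return (FIS, FST, FIT) from heterozygosities at the I, S and T levels."""
    f_is = 1.0 - h_i / h_s
    f_st = 1.0 - h_s / h_t
    f_it = 1.0 - h_i / h_t
    return f_is, f_st, f_it


# Hypothetical values with HI > HS (excess of heterozygotes within the
# subdivision) and HS < HT (differentiation among subdivisions).
f_is, f_st, f_it = f_statistics(h_i=0.33, h_s=0.30, h_t=0.40)
print(f"FIS = {f_is:+.3f}, FST = {f_st:+.3f}, FIT = {f_it:+.3f}")
print("identity holds:", abs((1 - f_it) - (1 - f_st) * (1 - f_is)) < 1e-12)
```

Here FIS is negative while FST is positive, matching the statement that FIS
and FIT may take either sign.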
I hope this helps some on this very interesting issue.
Roger
Roger L. Vallejo, Ph.D.
U.S. Department of Agriculture, ARS, NCCCWA
Voice: (304) 724-8340 Ext. 2141
Email: roger.vallejoars.usda.gov
http://www.ars.usda.gov/...ople/people.htm?personid=37662
|
From gianolaansci.wisc.edu Fri May 31 10:50:17 2013
From: Daniel Gianola <gianolaansci.wisc.edu>
Subject: RE: Inbreeding coefficents
Postmaster: submission approved
To: Multiple Recipients of <angenmapanimalgenome.org>
Date: Fri, 31 May 2013 10:50:17 -0500
Roger:
Well put.
Daniel
-----Original message-----
.From: "Vallejo, Roger" <Roger.VallejoARS.USDA.GOV>
.To: Multiple Recipients of <AnGenMapanimalgenome.org>
.Sent: Fri, May 31, 2013 15:24:19 GMT+00:00
.Subject: RE: Inbreeding coefficents
I think the basic questions are not being answered. Are the Wright's
F-statistics correlations or probabilities? Then, you can decide on how to
treat these F-statistics. Let me add this.
The F-statistic model is a hierarchical model with genes stratified at three
levels: Individuals (I), within subdivisions (S) and within the total population
(T). It has three main parameters: FIT is the correlation of uniting gametes
relative to those of the total population; FIS is the average over all
subdivisions of the correlation of uniting gametes relative to the gametes of
the subdivision; and FST is the correlation of random gametes within
subdivisions relative to the total population. The three F-statistics are
interrelated as (1 - FIT) = (1 - FST) (1 - FIS). A variety of derivations of
this basic relationship are available (Wright 1951, 1965; Cockerham 1969). It is
clear from WRIGHT's formulation of the F-statistic model that the parameters FIS
and FIT are free to take either positive or negative values depending on whether
there is a deficit or excess of heterozygotes; it is also clear from WRIGHT's
work that the parameter FST is necessarily positive (JC Long, Genetics 1986).
I hope this helps some on this very interesting issue.
Roger
Roger L. Vallejo, Ph.D.
U.S. Department of Agriculture, ARS, NCCCWA
Voice: (304) 724-8340 Ext. 2141
Email: roger.vallejoars.usda.gov
http://www.ars.usda.gov/...ople/people.htm?personid=37662
-----Original Message-----
.From: Daniel Gianola [mailto:gianolaansci.wisc.edu]
.Sent: Friday, May 31, 2013 9:56 AM
.To: Multiple Recipients of <AnGenMapanimalgenome.org>
.Subject: Re: Inbreeding coefficents
Points well taken, as I was assuming that this was based on pedigrees.
We could certainly used negatively inbred individuals in some animal
populations,eg, dogs.
It would be useful to revisit Cockerham (1969, 1973) where he revisits Wright's
indexes in terms of variance components, and these cannot be negative except
when silly unbiased estimators are used.
Please take my comments in the light of my ignorance about PLINK.
Bill Muir's remarks are very useful as well.
Regards,
Dan
-----Original message-----
.From: Yuri Tani Utsunomiya <ytutsunomiyagmail.com>
.To: Multiple Recipients of <AnGenMapanimalgenome.org>
.Sent: Fri, May 31, 2013 02:42:10 GMT+00:00
.Subject: Re: Inbreeding coefficents
Dear Erik,
The inbreeding coefficient calculated by PLINK is equivalent to FIS in Wright's
F-statistics [1].
In a structured population, the fixation index (represented by F = the degree of
reduction in heterozygosity relative to Hardy-Weinberg expectation) can be
partitioned into three levels: FIT - individual (I) relative to the total
population (T); FIS - individual (I) relative to the subpopulation (S); and FST
- subpopulation (S) relative to the total (T). Thus, Wright's FIS is often
referred to as the inbreeding coefficient, and a simplistic definition is FIS = 1 -
(HI/HS), where HI represents the individual's heterozygosity, and HS the
subpopulation's (or breed) heterozygosity.
Looking at the definition proposed by Wright (1950) [1], F is better interpreted
as a correlation measure between alleles in different 'partitions' of a
structured population, rather than a probability. This means that it can assume
negative values. If HI = HS, then FIS = 0, and the individual has exactly the
expected heterozygosity level for the subpopulation. If HI < HS, then FIS > 0,
and the individual is less heterozygous than expected given the subpopulation's
heterozygosity. The closer to 1 FIS gets, the more inbred the individual is
assumed to be. On the other hand, if HI > HS, then FIS < 0, so the individual is
more heterozygous than expected given the subpopulation. Hence, negative values
denote outbred individuals.
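To make the sign convention concrete, here is a minimal Python sketch. It assumes the method-of-moments form F = (O - E) / (N - E) built from the O(HOM), E(HOM) and N(NM) columns of the --het output, and the function names are illustrative only. Taking HI = (N - O) / N and HS = (N - E) / N, the definition FIS = 1 - (HI/HS) reduces to the same expression. The inputs are two rows from the output posted at the start of this thread; because E(HOM) is printed there in rounded form (2.58E+04, 2.60E+04), the results only approximate the reported F values.

```python
def het_f(obs_hom, exp_hom, n_nonmissing):
    """Method-of-moments F from observed and expected homozygote counts."""
    return (obs_hom - exp_hom) / (n_nonmissing - exp_hom)


def fis_from_heterozygosity(obs_hom, exp_hom, n_nonmissing):
    """Same quantity written as FIS = 1 - HI/HS."""
    h_i = (n_nonmissing - obs_hom) / n_nonmissing  # individual heterozygosity
    h_s = (n_nonmissing - exp_hom) / n_nonmissing  # expected heterozygosity
    return 1.0 - h_i / h_s


# FB686: more homozygous than expected, so F is positive.
print(het_f(28313, 25800, 37177))

# FB2126: more heterozygous than expected, so F is negative.
print(het_f(23992, 26000, 37494))
```

Both functions return the same value for the same inputs, which is the algebraic point: a negative F simply means O(HOM) is below E(HOM).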
I did not understand why you want to set the negative values to zero, as FIS is
not a probability. In fact, you can test departure of individual heterozygosity
from the expectation for the subpopulation by performing tests for goodness of
fit if you want p-values...
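One way such a goodness-of-fit test could look is sketched below in Python. This is an assumption about what the test might be, not a procedure given in this thread or implemented in PLINK: a one-degree-of-freedom chi-square comparing an individual's observed homozygous and heterozygous genotype counts with the expected counts. The function name is hypothetical. The test treats SNPs as independent, which linkage disequilibrium violates, so the p-values are likely too small and should be read only as a rough guide.

```python
import math


def homozygosity_gof(obs_hom, exp_hom, n_nonmissing):
    """Chi-square (1 df) goodness of fit on homozygous/heterozygous counts.

    Returns (chi2, p_value). Assumes independent SNPs, which is optimistic.
    """
    obs_het = n_nonmissing - obs_hom
    exp_het = n_nonmissing - exp_hom
    chi2 = ((obs_hom - exp_hom) ** 2 / exp_hom
            + (obs_het - exp_het) ** 2 / exp_het)
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Row FB686 from the posted output, with E(HOM) as printed (rounded).
chi2, p = homozygosity_gof(28313, 25800, 37177)
print(chi2, p)
```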
One observation worth noting: if the value is strongly negative, then you may
want to double-check the outbred sample for the possibility of contamination
(accidental mixing of two DNA sources, causing high sample heterozygosity).
Although FIS has been widely used in genetic diversity studies using
microsatellites to quantify inbreeding/diversity loss, you may want to have a
look at estimating inbreeding levels from runs of homozygosity (ROH).
While FIS largely relies on identity by state, empirical data suggest that ROH
better capture identity-by-descent information, and ROH have been proposed as a
suitable method to estimate autozygosity - some people would say that it should
replace the pedigree estimates. PLINK also has an implementation of the
algorithm [2].
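For readers who have not seen ROH-based inbreeding before, a commonly used summary is the fraction of the genome covered by ROH. The Python sketch below assumes that definition (total ROH length divided by the length of the genome covered by markers); it is not given in this thread. The function name, the segment coordinates and the genome length are all hypothetical, and the segments would in practice come from an ROH detection step such as the one in [2].

```python
def f_roh(segments_bp, genome_length_bp):
    """Fraction of the genome in runs of homozygosity.

    segments_bp: list of (start, end) positions in base pairs,
                 assumed to be non-overlapping.
    genome_length_bp: length of the genome covered by markers (assumed).
    """
    total_roh = sum(end - start for start, end in segments_bp)
    return total_roh / genome_length_bp


# Hypothetical example: two ROH segments on a 30 Mb stretch of genome.
segments = [(0, 1_000_000), (5_000_000, 7_000_000)]
print(f_roh(segments, 30_000_000))
```

Unlike the heterozygosity-based F, this quantity is a proportion and so cannot be negative.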
For those who are not familiar with PLINK [3], I suggest checking it out. It is
elegantly written in C/C++, and is a pioneering software package for the analysis
of SNP data. It remains one of the most complete toolsets available.
Yours sincerely,
Yuri
[1] http://onlinelibrary.wiley.com/...j.1469-1809.1949.tb02451.x/pdf
[2] http://pngu.mgh.harvard.edu/...urcell/plink/ibdibs.shtml#homo
[3] http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1950838/
|