Optimum Sample Size for Estimating Gene Diversity in Wild Wheat using AFLP Markers
Abstract
Genetic diversity in five wild types of wheat was estimated using Simpson's index (based on heterozygosity) applied to data from AFLP markers. In such studies, the cost of obtaining the required information increases with both the number of samples needed to estimate diversity and the number of markers used. When the population studied is in Hardy-Weinberg equilibrium (HWE), allelic frequencies follow the binomial expansion and parametric methods can be used to calculate the variance of the diversity index in terms of the number of individuals sampled. Inbred species, however, are never in HWE. For such populations, this study addresses the question of the sample size required to estimate gene diversity, using a distribution-free re-sampling method. We studied populations of five wild species (Aegilops speltoides, Triticum urartu, Triticum boeoticum, Triticum dicoccoides, and Triticum araraticum) as sources of diversity. We used bootstrap re-sampling with varying sample sizes to develop a relationship between the precision of the diversity estimate and the sample size. This relationship was then used to determine the number of samples required to capture a given amount of diversity with a given precision. We found that 5-6 samples are sufficient to obtain a standard error equal to 10% of the diversity in populations of Ae. speltoides, T. dicoccoides and T. araraticum, whereas more than 12 samples would be needed for populations of T. urartu and T. boeoticum. The procedure presented here can be used to obtain the optimum sample size for other crop species as well.
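The bootstrap procedure summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that are not stated in the abstract: AFLP data are coded as a 0/1 band matrix (individuals by loci), the per-locus Simpson index is computed from band frequencies as 1 - p^2 - (1 - p)^2, diversity is the mean over loci, and the standard error is taken as the standard deviation of the bootstrap estimates. The function names, the synthetic data, and the default of 1000 replicates are hypothetical choices made for illustration.

```python
import numpy as np


def gene_diversity(bands):
    """Mean per-locus Simpson index for a 0/1 band matrix (assumed coding).

    Rows are individuals, columns are AFLP loci; p is the band frequency.
    """
    p = bands.mean(axis=0)
    return float(np.mean(1.0 - p**2 - (1.0 - p) ** 2))


def bootstrap_se(bands, n, n_boot=1000, rng=None):
    """Standard error of the diversity estimate for samples of n individuals.

    Individuals are re-sampled with replacement n_boot times.
    """
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.integers(0, bands.shape[0], size=(n_boot, n))
    estimates = np.array([gene_diversity(bands[rows]) for rows in idx])
    return float(estimates.std(ddof=1))


def smallest_sample_size(bands, rel_se=0.10, n_boot=1000, seed=0):
    """Smallest n whose bootstrap SE is at most rel_se times the diversity.

    Returns None if no n up to the number of individuals meets the target.
    """
    rng = np.random.default_rng(seed)
    target = rel_se * gene_diversity(bands)
    for n in range(2, bands.shape[0] + 1):
        if bootstrap_se(bands, n, n_boot, rng) <= target:
            return n
    return None


# Synthetic band matrix for illustration only (30 individuals, 50 loci);
# these are not data from the study.
demo_rng = np.random.default_rng(42)
demo = (demo_rng.random((30, 50)) < 0.3).astype(int)

print("diversity:", gene_diversity(demo))
print("sample size for SE <= 10% of diversity:", smallest_sample_size(demo))
```

Plotting `bootstrap_se` against `n` for each population would give the kind of precision versus sample size relationship described above; the 10% threshold corresponds to the `rel_se` argument.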