BMC Bioinformatics. 2019; 20: 25.

Optimum contribution selection for animal breeding and conservation: the R package optiSel

Robin Wellmann

Institute of Animal Science, University of Hohenheim, Garbenstraße, Stuttgart, Germany

Received 2018 Aug 6; Accepted 2018 Oct 29.

Supplementary Materials: Additional file 1: Example data set and replication script. (ZIP 14700 kb)

Data Availability Statement: The example data set and a replication R script are included as an electronic appendix. It is a subset of the data generated by [8].

Abstract

Background

Selecting animals for breeding in the optimum way plays an essential role in the management of genetic resources and in selective breeding of livestock species. It requires computing the optimum genetic contribution of each selection candidate to the next generation. Current software packages for optimum contribution selection (OCS) are not able to handle the main conflicting objectives of animal breeding programs simultaneously. These objectives are to increase genetic gain, to increase or maintain genetic diversity, to recover the original genetic background of endangered breeds with historic introgression, and to maintain or increase genetic diversity at native alleles.

Results

The free R package optiSel offers functions for estimating the above mentioned parameters from pedigree and marker data, and for solving OCS problems. One parameter can be optimized, whereas the remaining ones can be constrained. The results reveal the optimum numbers of offspring of all selection candidates, and can subsequently be used for mate allocation. Different solvers can be used. Solver slsqp was superior when the genetic diversity at native alleles was to be maximized, whereas solvers cccp and cccp2 were superior for all other OCS problems.

Conclusion

Optimum contribution selection applied to local breeds requires special attention due to the conflicting objectives of their breeding programs. The free R package optiSel is an easy-to-use software taking these conflicting objectives into account.

Electronic supplementary material

The online version of this article (10.1186/s12859-018-2450-5) contains supplementary material, which is available to authorized users.

Keywords: Optimum contribution selection, Animal breeding, Conservation, Segment-based kinship, Native kinship, Native contribution, Runs of homozygosity, optiSel

Background

The objectives of breeding programs for livestock breeds, companion animals, and zoo populations of endangered species may be quite different. In any case, however, selecting animals for breeding in the optimum way requires computing the genetic contribution each selection candidate should have to the next generation.

For high-performing livestock breeds, the objective of a breeding program is to maximize genetic gain while at the same time maintaining a sufficient effective size of the breed to avoid inbreeding depression or a depletion of the additive genetic variance. Maintenance of a sufficient effective size is achieved by restricting the rate of increase in mean kinship. Thus, the optimum contributions of the selection candidates are the solution of an optimization problem where the objective is to maximize the mean breeding value in the offspring while the increase in mean kinship in the population is constrained. This approach is the classical optimum contribution selection (OCS) proposed by [1].

High performance livestock breeds, however, have frequently been used for upgrading local breeds [2, 3]. This displacement crossing has often progressed to the point where the original genetic background of the local breed must be considered endangered. Hence, breeding programs for local breeds with historic introgression have the additional objective to recover the original genetic background of the breed. This means to reduce their genetic contribution from non-endangered breeds [4], to conserve the genetic diversity at native haplotype segments [5], and to maintain a sufficient genetic distance to non-endangered breeds [6].

In contrast, for many companion breeds (e.g. dog breeds), accurate breeding values for total merit are not available and historical genetic bottlenecks have depleted their gene pool. For these breeds, the main objective of the breeding program is to maintain or to increase genetic diversity by minimizing the mean kinship in the population. In this case, genetic introgression from other breeds may be unavoidable but should be restricted.

In summary, animal breeding programs can have different objectives simultaneously, which are to increase genetic gain, to increase or to maintain genetic diversity, to recover the original genetic background of breeds with historic introgression, and to maintain or increase genetic diversity at native haplotype segments. Optimizing one of these criteria and restricting the others is called advanced OCS [7, 8].

Current software packages for OCS are not able to handle all conflicting objectives of animal breeding programs simultaneously and many of them may not find the global optimum. The implementation of classical OCS in the program GenCont uses Lagrangian multipliers [9], but is not guaranteed to find the optimal solution [10]. An alternative is the free software EVA [11] that uses an evolutionary algorithm for optimization. Methods using evolutionary algorithms are also described e.g. by [12] and are implemented in the commercial software TGRM. Some of these software packages provide flexible opportunities for mate allocation, but breeding programs that aim at recovering the native genetic background of a breed cannot be optimized with them. An alternative is the use of general purpose software for optimization. Pong-Wong and Woolliams [10] demonstrated how OCS problems can be reformulated as semidefinite programming problems and used software SDPA [13] for optimization. Since the free software R is widely used by statisticians, general purpose software for optimization available as an R package is of particular interest. A variety of suitable packages exist. However, preparing animal data for use with general purpose software is a quite complex task, so it is rarely used by animal breeders or breeding organizations.

This paper introduces the free R package optiSel which provides a framework for solving advanced OCS problems with little R code. It also offers functions for estimating various parameters from pedigree and marker data. These are the kinships, kinships at native haplotype segments, and genetic contributions from native ancestors. The advanced OCS methods currently implemented include maximizing genetic gain, minimizing the average kinship, maximizing contributions from native ancestors, and minimizing the mean kinship at native haplotype segments, while criteria not included in the objective function can be used as constraints. This results in a table from which the optimum numbers of offspring of all selection candidates can be obtained, and which can subsequently be used for mate allocation to minimize the average inbreeding in the offspring.

The package makes it possible to use a variety of free solvers for optimization and allows for easy switching between solvers by setting the parameter solver of function opticont() appropriately. Optimization problems can currently be solved by augmented Lagrangian minimization as implemented in the R package alabama [14] (solver="alabama"), by semidefinite programming using the CSDP library introduced by [15] (solver="csdp"), by gradient-based optimization with sequential least-squares quadratic programming as implemented in function slsqp() [16] from package nloptr (solver="slsqp"), and by function cccp() from package cccp [17] for solving cone constrained convex programs (solver="cccp" or solver="cccp2").

The aims of this paper are to demonstrate how the free package optiSel can be used for the estimation of genetic parameters and for OCS. In addition, the suitability of the different solvers for solving a variety of OCS problems is compared.

Implementation

The software package optiSel is implemented in R and C++. This section demonstrates the functionality of the package. This includes the estimation of genetic parameters and their use in OCS. Exact mathematical formulas for objective functions and constraints in OCS and their derivations can be found in (Wellmann R, Bennewitz J: Key genetic parameters for optimal population management, submitted).

The required packages optiSel and data.table can be downloaded from CRAN and then loaded as follows:

[Code listing: 12859_2018_2450_Figa_HTML.jpg]

Package data.table is used because it provides a fast file reader. A simulated data set consisting of phenotypes, genotypes and pedigrees of simulated Angler cattle and a replication script can be found in the electronic appendix (Additional file 1). Estimation of genetic parameters and OCS are described below using the example of 1132 simulated genotyped individuals. Vector animals contains the IDs of these individuals. All estimated genetic parameters will be displayed for three related animals, which are an individual and its parents. These are the individuals included in vector I.

[Code listing: 12859_2018_2450_Figb_HTML.jpg]

Kinships

The kinship f _IBD(i,j) of two individuals i,j is the probability that two alleles X _i and Y _j, randomly chosen from both individuals from a single locus, are identical by descent (IBD). This means that they descend from a common ancestor. That is,

$f_{IBD} (i, j) = P (X_{i} \overset{IBD}{=} Y_{j}) .$

Kinships can be estimated either from the pedigree or from marker data. In order to distinguish between segment-based estimates and pedigree-based estimates, we use the prefix or suffix PED for pedigree-based estimates and SEG for segment-based estimates in this paper.

The pedigree-based kinship or genealogical coancestry ${\hat{f}}_{PED} (i, j)$ between each pair of individuals i,j can be computed with function pedIBD(). The function allows a relationship matrix to be defined for the founders. By default, the founders are unrelated and not inbred. However, before a pedigree can be used, it needs to be prepared with function prePed(). This function sorts the pedigree, adds new lines for founders, and corrects some pedigree errors.

[Code listing: 12859_2018_2450_Figc_HTML.jpg]

The additive relationship matrix A=2*fPED can also be computed with function makeA().
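To make the definition of the pedigree-based kinship concrete, the following is a minimal sketch of the classical tabular method in plain Python. It is not the code used by pedIBD() or makeA(). The function name `pedigree_kinship`, the tuple layout of the pedigree and the toy pedigree are assumptions made for illustration. As in the default described above, founders are taken to be unrelated and not inbred.

```python
def pedigree_kinship(pedigree):
    """Kinship matrix by the tabular method (illustrative sketch).

    pedigree: list of (id, sire, dam) tuples, sorted so that parents come
    before their offspring; unknown parents are None. Founders are treated
    as unrelated and not inbred.
    """
    index = {}  # id -> row/column position
    f = []      # symmetric kinship matrix as a list of rows
    for ind, sire, dam in pedigree:
        s, d = index.get(sire), index.get(dam)
        k = len(f)
        # Kinship with every earlier individual j: mean of the parents'
        # kinships with j (0 for an unknown parent).
        row = []
        for j in range(k):
            f_sj = f[s][j] if s is not None else 0.0
            f_dj = f[d][j] if d is not None else 0.0
            row.append(0.5 * (f_sj + f_dj))
        # Self-kinship is 0.5 * (1 + F), where the inbreeding coefficient F
        # equals the kinship of the parents.
        f_sd = f[s][d] if s is not None and d is not None else 0.0
        for j in range(k):
            f[j].append(row[j])
        f.append(row + [0.5 * (1.0 + f_sd)])
        index[ind] = k
    return index, f


# Hypothetical pedigree: C and D are full sibs, E is their offspring.
ped = [("A", None, None), ("B", None, None),
       ("C", "A", "B"), ("D", "A", "B"), ("E", "C", "D")]
idx, f_ped = pedigree_kinship(ped)

# Additive relationship matrix A = 2 * fPED.
A = [[2.0 * x for x in row] for row in f_ped]
```

In this toy pedigree the full sibs C and D have kinship 0.25, so their offspring E has an inbreeding coefficient of 0.25 and a self-kinship of 0.625.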

Pedigree-based evaluations require sufficiently complete pedigrees. Parameters quantifying the completeness of the pedigrees of all individuals can be obtained with function summary(). Of particular interest is the number of equivalent complete generations, which can be found in column equiGen. It is the sum of the proportions of known ancestors of an individual over all generations traced [18]. Below, data table phen, which contains the simulated breeding values in column EBV, is loaded, and column equiGen is appended.

[Code listing: 12859_2018_2450_Figd_HTML.jpg]
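The definition of the number of equivalent complete generations can be written as a short recursion. The sketch below is plain Python, not the summary() method of the package; the function name and the toy pedigree are assumptions. Each known parent fills half of the first ancestral generation, and its own ancestors fill half as much of every deeper generation.

```python
def equivalent_complete_generations(pedigree):
    """Sum over generations of the proportion of known ancestors.

    pedigree: list of (id, sire, dam) tuples with parents listed before
    their offspring; unknown parents are None.
    """
    equi_gen = {}
    for ind, sire, dam in pedigree:
        total = 0.0
        for parent in (sire, dam):
            if parent is not None:
                # The parent itself contributes 1/2 to generation 1, and its
                # own completeness is passed on with weight 1/2.
                total += 0.5 * (1.0 + equi_gen[parent])
        equi_gen[ind] = total
    return equi_gen


# Hypothetical pedigree; individual F has an unknown dam.
ped_eq = [("A", None, None), ("B", None, None),
          ("C", "A", "B"), ("D", "A", "B"),
          ("E", "C", "D"), ("F", "E", None)]
equi = equivalent_complete_generations(ped_eq)
```

For F, half of the parents, half of the grandparents and half of the great-grandparents are known, which gives 1.5 equivalent complete generations.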

Pedigree-based estimates have the disadvantage that Mendelian sampling in all ancestors is considered to be random, so they cannot account for the alleles the ancestors actually inherited from their parents. In general, the use of segment-based estimates is recommended in order to account for Mendelian sampling. The most useful marker-based kinship estimates are based on runs of homozygosity (ROH). A ROH with respect to two haplotypes is a segment consisting of consecutive base pairs which are identical in both haplotypes [19].

The segment-based kinship ${\hat{f}}_{SEG} (i, j)$ between individuals i and j is the probability that two alleles, taken at random from both individuals from a single locus, belong to identical segments. The matrix containing the segment-based kinships of all individuals can be computed with function segIBD(). The number of cores to be used can be specified by argument cores, so different chromosomes can be processed in parallel.

[Code listing: 12859_2018_2450_Fige_HTML.jpg]

Important arguments of function segIBD() are minSNP and minL. A segment needs to have a minimum length to be taken into account. By default, the minimum number of markers to be included in a segment is minSNP=20 because considerably smaller sections of a haplotype may be identical by chance. The minimum length of a segment is by default minL=1.0 Mb. For the example data set we used minL=2.5 in accordance with [8]. Since short shared segments predominantly originate from early common ancestors, this value should be chosen depending on the age of the inbreeding that should be taken into account, but also depending on the size of the marker panel [20].
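The role of the two thresholds can be illustrated with a simplified sketch in plain Python. It compares a single pair of haplotypes on one chromosome and is not the algorithm implemented in segIBD(). The function names, the coding of haplotypes as strings, and the choice to measure segment length between the first and last shared marker are all assumptions made for illustration.

```python
def shared_segments(hap1, hap2, pos_mb, min_snp=20, min_l=1.0):
    """Return (first, last) marker indices of shared runs that qualify.

    hap1, hap2: equally long sequences of alleles for one chromosome.
    pos_mb: marker positions in Mb, in increasing order.
    A run qualifies if it has at least min_snp markers and spans at
    least min_l Mb.
    """
    segments = []
    start = None
    n = len(hap1)
    for m in range(n + 1):
        same = m < n and hap1[m] == hap2[m]
        if same and start is None:
            start = m                      # a new run begins
        elif not same and start is not None:
            end = m - 1                    # the run ended at the previous marker
            n_markers = end - start + 1
            length = pos_mb[end] - pos_mb[start]
            if n_markers >= min_snp and length >= min_l:
                segments.append((start, end))
            start = None
    return segments


def shared_fraction(hap1, hap2, pos_mb, min_snp=20, min_l=1.0):
    """Proportion of the covered map length lying in qualifying segments."""
    segs = shared_segments(hap1, hap2, pos_mb, min_snp, min_l)
    covered = sum(pos_mb[e] - pos_mb[s] for s, e in segs)
    return covered / (pos_mb[-1] - pos_mb[0])


# Toy example: 10 markers spaced 0.5 Mb apart, one mismatch at marker 5.
pos = [0.5 * m for m in range(10)]
h1 = "0000000000"
h2 = "0000010000"
loose = shared_segments(h1, h2, pos, min_snp=4, min_l=1.0)
strict = shared_segments(h1, h2, pos, min_snp=5, min_l=1.0)
```

With min_snp=4 both runs around the mismatch qualify, whereas with min_snp=5 the shorter run of four markers is discarded. One plausible reading of the definition above is that a kinship estimate then averages such shared fractions over the haplotype pairs of two individuals, but that step is not shown here.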

Native contribution

The native contribution N(i) of an individual i is the proportion of its genome which is native [8]. In other words, it is the genetic contribution it has from native ancestors, or the probability that an allele X _i, randomly chosen from the individual, is native. That is,

$N (i) = P (X_{i} \in A_{N}),$

where $A_{N}$ is the set of alleles originating from native ancestors. It is usually defined with respect to a base population, i.e. a time t ₀ before which all registered individuals were considered native. Native contributions can be estimated either from pedigree or from marker data.

The pedigree-based native contribution ${\hat{N}}_{PED} (i)$ of individual i is the sum of the genetic contributions individual i has from native founders, whereby a founder is an individual with unknown parents. For estimating native contributions, the pedigree needs to be prepared differently than for estimating kinships. Below, arguments lastNative=1970 and thisBreed="Angler" ensure that the breed name of founders born after t ₀=1970 is changed from "Angler" to "unknown". The native contributions and the contributions of other breeds to the genome of each individual are estimated with function pedBreedComp(). Thereafter, the column with native contributions is appended to data table phen and renamed as pedNC.

[Code listing: 12859_2018_2450_Figf_HTML.jpg]

It can be seen that the selected individuals have a low native contribution, a high contribution from Holstein, and also a substantial contribution from individuals of unknown origin.
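The pedigree-based breed composition can be sketched as a simple recursion over a sorted pedigree. The following plain Python code is an illustration under stated assumptions, not the implementation of pedBreedComp() or prePed(): the function name, the tuple layout, the label "native" and the toy pedigree are hypothetical. Founders of the breed of interest born after the cut-off year are relabelled as "unknown", as described above.

```python
def breed_composition(pedigree, this_breed, last_native):
    """Genetic contributions of founder groups to each individual.

    pedigree: list of (id, sire, dam, breed, year_of_birth) tuples with
    parents listed before offspring; unknown parents are None.
    Founders of this_breed born up to last_native count as "native",
    later founders of this_breed as "unknown", all others keep their breed.
    """
    comp = {}
    for ind, sire, dam, breed, born in pedigree:
        if sire is None and dam is None:
            if breed != this_breed:
                label = breed
            elif born > last_native:
                label = "unknown"
            else:
                label = "native"
            comp[ind] = {label: 1.0}
            continue
        mix = {}
        for parent in (sire, dam):
            # A single missing parent is treated as being of unknown origin.
            source = comp[parent] if parent is not None else {"unknown": 1.0}
            for label, share in source.items():
                mix[label] = mix.get(label, 0.0) + 0.5 * share
        comp[ind] = mix
    return comp


# Hypothetical pedigree with one native, one Holstein and one late founder.
ped_breeds = [
    ("A", None, None, "Angler", 1960),
    ("B", None, None, "Holstein", 1965),
    ("C", None, None, "Angler", 1980),
    ("D", "A", "B", "Angler", 1975),
    ("E", "D", "C", "Angler", 1990),
]
comp = breed_composition(ped_breeds, this_breed="Angler", last_native=1970)
native_contribution = {ind: mix.get("native", 0.0) for ind, mix in comp.items()}
```

Individual E receives one quarter of its genome from the native founder A, one quarter from the Holstein founder B and one half from founder C, whose origin is treated as unknown because it was born after 1970.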

The segment-based native contribution ${\hat{N}}_{SEG} (i)$ of individual i is the proportion of its genome included in native haplotype segments. Thereby, an allele is considered native if the segment containing the allele has low frequency in all breeds that might have been used for upgrading. That is, a marker m is native in a haplotype if the frequency of the segment containing the marker is smaller than some threshold value ubFreq in all breeds that might have been used for upgrading the breed of interest. If a segment is substantially more frequent than (say) 0.01 in another non-endangered breed that was used for upgrading, then it does not need to be conserved and has likely been introgressed. Short segments predominantly arose from early introgression events, so segments are required to have a minimum length minL, which makes it possible to neglect very old introgression.

Below, function haplofreq() is used to determine the most likely origin of each allele from each haplotype. The results are written to files in directory w.dir="Population", and a list with file names is returned. The first letters of the breed names are used in the files for labeling the origins of the markers, so care should be taken that these letters are different for the different breeds. Function segBreedComp() is used to compute the native contribution of each individual. Thereafter, the column with native contributions is appended to data table phen and renamed as segNC.

[Code listing: 12859_2018_2450_Figg_HTML.jpg]

The scatter plot in Fig. 1 shows the pedigree-based estimate of the genetic contribution from Holstein cattle vs. the segment-based estimate. Thereby, contributions from Holstein and Red Holstein are added, and only individuals with real parents that have at least 6 equivalent complete generations in the pedigree are included. It can be seen that the segment-based contribution from Holstein is highly correlated with the pedigree-based estimate. Probably, both estimates are slightly biased downward. The pedigree-based estimate could be too low because of wrong and missing ancestors in the pedigree, whereas the marker-based estimate could be too low because some Holstein cattle with rare haplotypes are missing in the reference set.

[Figure: 12859_2018_2450_Fig1_HTML.jpg]

Fig. 1 Joint distribution. Pedigree-based estimates of the genetic contribution from Holstein cattle vs. segment-based estimates for simulated Angler cattle

Native kinship

The native kinship f _{IBD|N}(i,j) of two individuals i,j is the conditional probability that two alleles X _i and Y _j, taken at random from both individuals from a single locus, are identical by descent (IBD), given that they are native. That is,

$f_{IBD | N} (i, j) = P (X_{i} \overset{IBD}{=} Y_{j} |X_{i}, Y_{j} \in A_{N}) .$

In other words, it is the kinship computed only from the alleles that are native in both individuals. Note that the native kinship depends neither on the way the migrant ancestors were related with each other, nor on their genetic contribution to the population. Since the kinship is defined as a conditional probability, it can be computed by the ratio

$\begin{array}{lcr} f_{IBD | N} (i, j) & = & \frac{f_{IBD \& N} (i, j)}{f_{N} (i, j)}, \end{array}$

where f _{IBD&N}(i,j) is the probability that two alleles taken at random from both individuals are IBD and native, whereas f _N(i,j) is the probability that both alleles are native. The numerator and the denominator, and thus the native kinships, can be estimated either from pedigree or from marker data.

The pedigree-based native kinship ${\hat{f}}_{PED | N} (i, j)$ between individuals i,j can be computed with function pedIBDatN(), whereby the native founders are assumed to be unrelated and non-inbred.

[Code listing: 12859_2018_2450_Figh_HTML.jpg]

The native kinships of these individuals are rather high, which means that the sets of native ancestors in their pedigrees are considerably overlapping.

The segment-based native kinship ${\hat{f}}_{SEG | N} (i, j)$ between individuals i,j is the conditional probability that two alleles from the same locus taken at random from these individuals belong to identical segments, given that the alleles are native. It can be computed with function segIBDatN().

[Code listing: 12859_2018_2450_Figi_HTML.jpg]

Population means

The mean values of the genetic parameters in the population depend on the contributions the different age × sex classes have to the population. The time interval covered by an age class needs to ensure that no individual can have offspring in the same age class. Typically, each age class spans one year.

Function agecont() estimates the contributions of the classes to the population. It assumes that the fraction of the population that is attributed to a particular class is proportional to the expected proportion of its offspring that is not yet born. Since these values are estimated from the past, this requires some continuity in the breeding program when this function is used for estimation. The total contributions of non-juvenile males and females to the population are assumed to be equal, whereby non-juvenile animals are all individuals that are not born in the current year. Note that the contributions are idealized and may not coincide with the proportions of living animals included in the classes. The contributions of the age classes are estimated from the ages of the parents at the time when their offspring was born. The offspring consists of the individuals indicated by argument use.

[Code listing: 12859_2018_2450_Figj_HTML.jpg]
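The proportionality stated above can be illustrated with a deliberately simplified sketch in plain Python. It treats a single sex, ignores the equal split between non-juvenile males and females, and is not the implementation of agecont(). The function name, the convention that an animal in age class a still has all offspring ahead of it that are born to parents in classes a and above, and the toy data are assumptions.

```python
from collections import Counter


def age_class_contributions(parent_ages):
    """Contributions of age classes 1, 2, ... for one sex (sketch).

    parent_ages: age class of the parent at the birth of each offspring.
    The contribution of class a is proportional to the proportion of
    offspring born to parents in age class a or older.
    """
    counts = Counter(parent_ages)
    n = len(parent_ages)
    max_age = max(counts)
    remaining = [sum(c for b, c in counts.items() if b >= a) / n
                 for a in range(1, max_age + 1)]
    total = sum(remaining)
    return [r / total for r in remaining]


# Hypothetical ages of sires at the birth of six offspring.
sire_ages = [2, 2, 3, 3, 3, 4]
cont = age_class_contributions(sire_ages)
mean_parent_age = sum(sire_ages) / len(sire_ages)
```

Under these assumptions the reciprocal of the contribution of the youngest class equals the mean age of the parents at the birth of their offspring, which is one common definition of the generation interval.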

In this example, males have lower contributions to young age classes than females. This is because the males were predominantly progeny tested, so they were used for breeding at an older age. Hence, their contributions spread over a longer period of time.

Before we compute the population means, data frame phen should be completed by appending column isCandidate, which indicates the selection candidates for OCS. In this example, the selection candidates are the individuals that are at least one year old.

[Code listing: 12859_2018_2450_Figk_HTML.jpg]

Function candes() computes the population means for all numeric columns in data table phen and for all kinships and native kinships that are supplied as additional arguments. Note that these additional arguments can have arbitrary names and they can be omitted if the respective kinship is not of interest. The population means depend on the contributions the different age × sex classes have to the population as defined by argument cont. If argument cont is omitted, then discrete generations are assumed and the total contributions of males and females to the population are equal.

[Code listing: 12859_2018_2450_Figl_HTML.jpg]

It can be seen that the average number of equivalent complete generations in the pedigree is rather high, even though the proportion of the genome with unknown origin is also moderately high. The results deviate from results of other studies for this breed [7, 8] because the data set used in this paper for demonstration purposes was not obtained from a random sample of the population. Nevertheless, they demonstrate several interesting relationships between the parameters.

The pedigree-based native contribution of 0.1671 is probably underestimated because some of the founders with unknown origin may be native. The pedigree-based kinship is smaller than the segment-based kinship because pedigrees are incomplete. Native kinships are higher than the kinships because the diversity of native alleles is usually smaller than the total diversity of all alleles. The segment-based estimates of the native kinships are lower than the pedigree-based estimates. This has two reasons. First, the individuals have a substantial genetic contribution from founders with unknown origin. Alleles from these individuals do not contribute to the pedigree-based diversity of native alleles, even though some of them could have been native. This results in overestimating the pedigree-based native kinships. Second, crossing overs have shortened some haplotype segments, so that some segments can no longer be considered identical. This results in a slight underestimation of segment-based estimates.

Constraint settings for kinships

Since the inbreeding coefficient of an individual is equal to the kinship of its parents, constraining the increase in mean kinship in the population enables breeders to avoid inbreeding. The rate of increase in mean kinship is measured by the variance effective size N _e of the population. The critical effective size, i.e. the size below which the fitness of the population steadily decreases, depends on the population and is usually assumed to be between 50 and 100 [21]. For most populations, maintenance of an effective size of N _e≥100 should be envisaged. Hence, we define

[Code listing: 12859_2018_2450_Figm_HTML.jpg]

The effective size of the population is at least N _e, if the rate of increase in mean kinship per generation is $Δ f_{g} \leq \frac{1}{2 {N}_{e}}$ [22]. In a population with overlapping generations and generation interval L, the rate of increase in mean kinship per year Δ f _y is of interest for OCS, which should satisfy

$Δ f_{y} \leq \frac{1}{2 {N}_{e} L} .$

The generation interval can be approximated from the results of function agecont() as

[Code listing: 12859_2018_2450_Fign_HTML.jpg]

This makes it possible to define upper bounds for the mean kinships in the population at the next evaluation time t+1 as

[Code listing: 12859_2018_2450_Figo_HTML.jpg]
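As a worked illustration of how an upper bound can follow from a desired effective size, the sketch below uses the standard recursion f(t+1) = f(t) + (1 - f(t)) Δf together with the annual rate 1/(2 N_e L). This is plain Python and an assumption about how such a bound may be set, not a transcription of the listing above; the function name and the numeric inputs are hypothetical.

```python
def kinship_upper_bound(mean_kinship, n_e, gen_interval):
    """Upper bound for the mean kinship at time t+1 (sketch).

    mean_kinship: current mean kinship in the population.
    n_e: desired effective size.
    gen_interval: generation interval L in years.
    """
    delta_f_year = 1.0 / (2.0 * n_e * gen_interval)  # maximum annual rate
    return mean_kinship + (1.0 - mean_kinship) * delta_f_year


# Hypothetical values: current mean kinship 0.02, N_e = 100, L = 5 years.
ub_example = kinship_upper_bound(0.02, n_e=100, gen_interval=5.0)
```

With these numbers the annual rate is 0.001, so the mean kinship may rise from 0.02 to at most 0.02098 by the next evaluation time.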

Of course, upper bounds need to be defined only for the parameters that should be constrained in OCS. The expected mean kinship in the population at time t+1 depends on the vector c containing the genetic contribution of each individual to the offspring, which is the parameter that will be optimized. The expected mean kinship can be computed by the quadratic function

$f_{IBD} (c) = {(r_{0} c + v)}^{⊤} f_{IBD} (r_{0} c + v) + l_{IBD} (c),$

where r ₀ is the fraction of the population represented by the offspring, and component v _i of v is the fraction of the population represented by individual i itself. The small linear correction term l _IBD(c) accounts, for example, for genetic drift (Wellmann R, Bennewitz J: Key genetic parameters for optimal population management, submitted). Estimates ${\hat{f}}_{PED} (c)$ and ${\hat{f}}_{SEG} (c)$ can be obtained by replacing f _IBD and l _IBD(c) with their estimates obtained from pedigrees or marker data, respectively. Hence, constraining a kinship means to add a quadratic constraint of the form

$\begin{array}{lcr} {\hat{f}}_{PED} (c) & \leq & ub.fPED, or \\ {\hat{f}}_{SEG} (c) & \leq & ub.fSEG \end{array}$

to the programming problem. Native kinships are of particular interest for populations with historic introgression if removal of the introgressed genetic material is envisaged in the future. Defining the upper bound for the mean kinship in accordance with the desired effective size ensures that enough genetic diversity will be maintained in the population after the introgressed genetic material has been removed. Hence, upper bounds are defined as

[Code listing: 12859_2018_2450_Figp_HTML.jpg]

The expected mean native kinship in the population at time t+1 can be computed by the rational function

$\begin{array}{lcr} f_{IBD | N} (c) & = & \frac{{(r_{0} c + v)}^{⊤} f_{IBD \& N} (r_{0} c + v) + l_{IBD \& N} (c)}{{(r_{0} c + v)}^{⊤} f_{N} (r_{0} c + v) + l_{N} (c)}, \end{array}$

where l _{IBD&N}(c) and l _N(c) are the small linear correction terms defined in (Wellmann R, Bennewitz J: Key genetic parameters for optimal population management, submitted). Estimates ${\hat{f}}_{PED | N} (c)$ and ${\hat{f}}_{SEG | N} (c)$ are obtained by replacing the terms by their estimates obtained from pedigrees or marker data, respectively. Hence, constraining a native kinship means to add a rational constraint of the form

$\begin{array}{lcr} {\hat{f}}_{PED | N} (c) & \leq & ub.fPEDN, or \\ {\hat{f}}_{SEG | N} (c) & \leq & ub.fSEGN \end{array}$

to the programming problem.
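The quadratic and rational functions of this section can be evaluated with a few lines of plain Python. The sketch below omits the small linear correction terms, so it only approximates the expressions given above, and it is not code from the package. All function names and the toy matrices are assumptions.

```python
def quad_form(x, m):
    """Return x' m x for a vector x and a square matrix m (lists)."""
    n = len(x)
    return sum(x[i] * m[i][j] * x[j] for i in range(n) for j in range(n))


def population_weights(c, v, r0):
    """Weights r0 * c + v of the individuals in the population at t+1."""
    return [r0 * ci + vi for ci, vi in zip(c, v)]


def mean_kinship(c, v, r0, f):
    """Expected mean kinship without the linear correction term."""
    return quad_form(population_weights(c, v, r0), f)


def mean_native_kinship(c, v, r0, f_ibd_and_n, f_n):
    """Expected mean native kinship without the correction terms."""
    w = population_weights(c, v, r0)
    return quad_form(w, f_ibd_and_n) / quad_form(w, f_n)


# Toy example with two individuals and discrete generations (r0 = 1, v = 0).
c = [0.5, 0.5]
v = [0.0, 0.0]
f = [[0.5, 0.1], [0.1, 0.5]]
f_ibd_and_n = [[0.2, 0.05], [0.05, 0.2]]
f_n = [[0.5, 0.4], [0.4, 0.5]]
fbar = mean_kinship(c, v, 1.0, f)
fbar_native = mean_native_kinship(c, v, 1.0, f_ibd_and_n, f_n)
```

A kinship constraint then amounts to requiring that the first value does not exceed its upper bound, and a native kinship constraint to requiring the same of the ratio.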

Traditional OCS

The goal of OCS is finding the optimum contribution c _i each selection candidate i should have to the next birth cohort. It is the fraction of genes in the birth cohort that should originate from individual i. Since 50% of the genes originate from males and 50% originate from females, the proportion of individuals in the birth cohort having individual i as a parent should be 2c _i.

Traditionally, OCS maximizes the mean breeding value in the population in the next year or generation, while the average kinship is required not to exceed a predefined threshold value. The usage of package optiSel is demonstrated below using this optimization problem as an example.

Since pedigree data is used, care must be taken that the completeness of the pedigrees is taken into account. Individuals with a low number of equivalent complete generations in their pedigree would otherwise be favored for breeding because they appear to be less related with the population. The constraints of the optimization problem are defined in a list:

[Code listing: 12859_2018_2450_Figq_HTML.jpg]

In this case, only the contributions of males are to be optimized, which is achieved by adding component uniform="female". That is, all females within a particular age cohort are assumed to have equal contributions to the offspring. This optimization problem is of interest if the contributions of the females cannot be centrally controlled.

Component ub.fPED=ub.fPED defines the upper bound for the mean kinship in the population to be equal to the value ub.fPED. This component has name ub.fPED because the kinship was named fPED in the call of function candes().

Component lb.equiGen=cand$mean$equiGen defines a lower bound for the average number of equivalent complete generations in the population. This constraint is only needed if incomplete pedigree data is used. The threshold value should be chosen such that individuals with incomplete pedigrees are not disproportionately favored for breeding. This component has name lb.equiGen because the column in data table cand$phen that contains the numbers of equivalent complete generations was named equiGen.

Optimization is carried out below with function opticont(). The first argument defines the objective of the optimization problem, which is to maximize the average breeding value in the population at time t+1. This is accomplished with character string "max.EBV" because the column of data table cand$phen that contains the breeding values is named EBV.

[Code listing: 12859_2018_2450_Figr_HTML.jpg]
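The structure of this optimization problem can be made tangible with a toy example in plain Python. The sketch below uses a brute-force grid search over the contribution of one male, which is not how any of the solvers offered by opticont() work. It assumes discrete generations, two males and two females, and uniform female contributions. The function names, the kinship matrix, the breeding values and the upper bound are all hypothetical.

```python
def quad_form(x, m):
    """Return x' m x for a vector x and a square matrix m (lists)."""
    n = len(x)
    return sum(x[i] * m[i][j] * x[j] for i in range(n) for j in range(n))


def grid_search_ocs(ebv, kinship, ub_kinship, steps=50):
    """Maximize the mean breeding value subject to a kinship constraint.

    Candidates 0 and 1 are males, candidates 2 and 3 are females.
    Females contribute uniformly (0.25 each); the male contributions
    c1 and 0.5 - c1 are searched on a grid with steps + 1 points.
    Returns (gain, contributions) of the best feasible grid point,
    or None if no grid point satisfies the constraint.
    """
    best = None
    for step in range(steps + 1):
        c1 = 0.5 * step / steps
        c = [c1, 0.5 - c1, 0.25, 0.25]
        if quad_form(c, kinship) > ub_kinship:
            continue  # kinship constraint violated
        gain = sum(ci * bi for ci, bi in zip(c, ebv))
        if best is None or gain > best[0]:
            best = (gain, c)
    return best


# Male 0 has the highest breeding value but is related to both females.
kin = [[0.50, 0.05, 0.20, 0.20],
       [0.05, 0.50, 0.00, 0.00],
       [0.20, 0.00, 0.50, 0.10],
       [0.20, 0.00, 0.10, 0.50]]
ebv = [2.0, 1.0, 0.5, 0.5]
gain, oc = grid_search_ocs(ebv, kin, ub_kinship=0.22)
```

Without the constraint, all male contributions would go to male 0. The bound of 0.22 on the mean kinship limits his contribution to 0.34 on this grid, and the remaining 0.16 is assigned to the less related male 1.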

Argument solver defines the algorithm to be used for optimization. If numerical problems are encountered then it is advisable to use another solver or to adjust the tuning parameters of the solver, which can be supplied as additional arguments to function opticont(). Available solvers are

cccp: Function cccp() from R package cccp for solving cone constrained convex problems is called. Quadratic constraints are defined as second order cone constraints.

cccp2: Function cccp() from R package cccp is called, but quadratic constraints are defined by functions.

alabama: This solver calls function auglag() from R package alabama for optimizing smooth nonlinear objective functions with constraints.

csdp: This solver calls function csdp() from R package Rcsdp for solving semidefinite programming problems.

slsqp: Function slsqp() from package nloptr is called, which optimizes successive second-order approximations of the objective function with first-order approximations of the constraints.

The result of function opticont() is a list with several components. Data frame fit$info contains information on the success of the optimization. That is, component valid is TRUE if all constraints are fulfilled by the optimized contributions, whereas component status describes the solution as reported by the solver.

[Code listing: 12859_2018_2450_Figs_HTML.jpg]

Data frame fit$mean contains the predicted mean values of heritable traits, kinships, and native kinships in the population at the next evaluation time t+1. For other variables, such as component equiGen, the weighted mean $(r_{0}c+v)^{\top}X$ is shown, where X is the corresponding column vector from data frame cand$phen.

[Listing: 12859_2018_2450_Figt_HTML.jpg]
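To make the idea of predicted population means concrete, the following minimal Python sketch computes the mean breeding value and the mean kinship implied by a vector of contributions. It assumes the simple case of non-overlapping generations, where the mean of a trait is $c^{\top}X$ and the mean kinship is $c^{\top}Kc$; it does not reproduce the overlapping-generations weighting $(r_{0}c+v)^{\top}X$ used by the package. All data and names (predicted_means, the toy kinship matrix) are hypothetical.

```python
def predicted_means(c, ebv, kin):
    """Return (mean breeding value, mean kinship) expected in the offspring."""
    n = len(c)
    mean_ebv = sum(c[i] * ebv[i] for i in range(n))
    mean_kin = sum(c[i] * kin[i][j] * c[j] for i in range(n) for j in range(n))
    return mean_ebv, mean_kin


# Hypothetical toy data: individuals 0 and 1 are males, 2 and 3 are females.
sex = ["male", "male", "female", "female"]
c = [0.3, 0.2, 0.25, 0.25]   # contributions; each sex sums to 0.5
ebv = [1.2, 0.8, 1.0, 0.6]   # breeding values
kin = [                      # kinship matrix with self-kinships on the diagonal
    [0.5, 0.1, 0.0, 0.0],
    [0.1, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.2],
    [0.0, 0.0, 0.2, 0.5],
]

# Males and females contribute equally to the offspring.
male_share = sum(ci for ci, s in zip(c, sex) if s == "male")
female_share = sum(ci for ci, s in zip(c, sex) if s == "female")

mean_ebv, mean_kin = predicted_means(c, ebv, kin)
print(male_share, female_share, round(mean_ebv, 4), round(mean_kin, 4))
```

Shifting contribution towards the male with the higher breeding value raises the mean breeding value but also raises the mean kinship, which is the trade-off that the constraints control.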

The optimized contributions of the breeding individuals can be found in column oc of data frame fit$parent:

[Listing: 12859_2018_2450_Figu_HTML.jpg]

The example above optimizes only the contributions of males. For optimizing the contributions of both sexes, component uniform="female" needs to be removed from the list of constraints. Moreover, since the number of offspring a female can have is usually limited, upper limits need to be defined for the female contributions. More generally, upper and lower limits for the contributions of arbitrary individuals can be specified. If each birth cohort consists of $N_{0}=200$ individuals and if a female can have at most 5 offspring per year, then the upper limit for the contributions of females needs to be $\frac{5}{2 N_{0}} = 0.0125$. The respective list of constraints can be created as follows:

[Listing: 12859_2018_2450_Figv_HTML.jpg]

Computation of the optimum contributions with this list of constraints takes about 3 min. Their computation with constraint uniform="female" is in general much faster.
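The arithmetic behind the upper limit for female contributions can be checked with a few lines of Python. This is only a worked example of the formula $\frac{5}{2 N_{0}}$ from the text; the function name and the female identifiers are hypothetical and not part of optiSel.

```python
def contribution_upper_bound(max_offspring, cohort_size):
    """Largest contribution an individual can have if it produces at most
    max_offspring of the cohort_size offspring (each has two parents)."""
    return max_offspring / (2 * cohort_size)


ub_female = contribution_upper_bound(max_offspring=5, cohort_size=200)

# Hypothetical female identifiers, each receiving the same upper limit.
females = ["cow_1", "cow_2", "cow_3"]
upper_limits = {animal: ub_female for animal in females}
print(upper_limits)
```

The factor 2 appears because every offspring has two parents, so one offspring corresponds to a contribution of $\frac{1}{2 N_{0}}$ for each of them.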

Advanced OCS

This section provides an overview of the constraints and objective functions that can be handled by function opticont() and are of interest in many breeding programs. In general, all kinships and native kinships, and all numeric traits in data frame phen can be constrained. These parameters can also be optimized, but only one at a time. In animal breeding, the groups of males and females contribute equally to the offspring. This may, however, not be relevant in plant breeding. The constraint that males and females have equal contributions to the offspring is omitted if column Sex in data frame phen contains only NA.

For most breeding programs, traditional OCS turned out to be insufficient. This has several reasons. First, marker information makes it possible to obtain more accurate estimates of kinships, native kinships and breeding values than pedigree data. In the examples below, we assume that marker information is available. However, if only pedigree data is available, then the examples can easily be adjusted by replacing terms SEG and seg with PED and ped. In particular, for maximizing breeding values while restricting the segment-based kinship, constraint ub.fPED=ub.fPED needs to be replaced with ub.fSEG=ub.fSEG:

[Listing: 12859_2018_2450_Figw_HTML.jpg]

While the above setting is appropriate for most livestock breeds, many companion breeds and endangered breeds have different breeding objectives. Several companion breeds suffer from historic bottlenecks, which resulted in high inbreeding coefficients and inbreeding depression. For these breeds, the main breeding goal is minimizing the average kinship in order to reduce inbreeding depression and the loss of genetic variation. This is accomplished with the following call to function opticont():

[Listing: 12859_2018_2450_Figx_HTML.jpg]

If breeding values are available, then they can be constrained in order to achieve genetic gain:

[Listing: 12859_2018_2450_Figy_HTML.jpg]

For some companion breeds, the erosion of genetic diversity has proceeded to a point that crossings with other breeds are not avoidable. However, the genetic contribution from other breeds should be restricted to the necessary minimum. Hence, a lower bound for the native contribution should be defined:

[Listing: 12859_2018_2450_Figz_HTML.jpg]

Note that the example above cannot be executed for the example data set because the optimization problem has no solution.

Some endangered livestock breeds, such as the breed used in the examples, have been continuously upgraded with high performance breeds in order to maintain economic competitiveness. Replacement of the original genetic background can have proceeded to the point that the original breed can be considered genetically extinct. For some of these breeds, de-extinction efforts are made with the aim to recover the original genetic background. Such breeding programs need to restrict the increase in native kinship in accordance with the desired effective size in order to ensure that enough genetic diversity persists in the breed after the foreign genetic material has been removed. Hence, the call to function opticont() would be

[Listing: 12859_2018_2450_Figaa_HTML.jpg]

In general, recovering the native genetic background is not the only objective of the breeding program, but genetic gain should be achieved as well. In this case, breeding values for the native contribution should be estimated from marker data and included in the total merit index. Then, the total merit index would be maximized instead of the native contribution. It may be desirable to maintain specific introgressed QTL in the population, which could be accomplished by giving them an appropriate weight in the total merit index.
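As an illustration of how a total merit index combines breeding values, the sketch below forms a weighted sum of a conventional breeding value and a breeding value for the native contribution. The weights, the trait names EBV and BV_NC, and the candidate values are hypothetical and chosen only for illustration; in practice the weights would follow from the breeding objective.

```python
def total_merit_index(traits, weights):
    """Weighted sum of the breeding values of one candidate."""
    return sum(w * traits[name] for name, w in weights.items())


# Hypothetical weights: BV_NC is the breeding value for the native contribution.
weights = {"EBV": 1.0, "BV_NC": 0.5}

# Hypothetical selection candidates.
candidates = {
    "A": {"EBV": 1.2, "BV_NC": 0.10},
    "B": {"EBV": 1.0, "BV_NC": 0.60},
    "C": {"EBV": 0.7, "BV_NC": 0.90},
}

index = {name: total_merit_index(t, weights) for name, t in candidates.items()}
ranking = sorted(index, key=index.get, reverse=True)
print(ranking)
```

With these weights, candidate B ranks above candidate A although A has the higher conventional breeding value, because B carries more of the native genetic background.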

Mate allocation

After the optimum contributions of the selection candidates have been computed, males and females can be allocated for mating such that the mean inbreeding coefficient in the offspring is minimized. This can be done with function matings(). Since the kinship of the parents is equal to the inbreeding coefficient of the offspring, the objective is to minimize

$$\frac{1}{N_{0}} \sum_{i \in \mathcal{M}} \sum_{j \in F} n_{ij} f_{ij},$$

where $n_{ij}$ is the number of offspring from the mating of individual i with individual j, $\mathcal{M}$ contains all male selection candidates, $F$ contains all female selection candidates, $N_{0}$ is the total number of offspring, and $f_{ij}$ is either the segment-based kinship, or the pedigree-based kinship, or some other user-supplied similarity measure for individuals i and j.

In any case, the genetic contribution of each parent must be equal to its optimum contribution. That is, for all males i, the following equation holds

$$\sum_{j \in F} n_{ij} = n_{i},$$

and for all females j, we have

$$\sum_{i \in \mathcal{M}} n_{ij} = n_{j},$$

where $n_{i} \approx 2 c_{i} N_{0}$ is the number of offspring of individual i.

The maximum number of offspring per mating can be constrained to be at most ub.nOff. In this case, for all males i and all females j the following inequality holds

$$n_{ij} \leq \text{ub.nOff}.$$

Without this constraint, some superior animals may always be mated to the same inferior individual, so all their offspring may not be good enough for breeding.

Moreover, for each herd, the proportion of offspring sired by the same male can be constrained to be at most α. This increases genetic connectedness between herds, so it allows more accurate breeding values to be estimated. Let $F_{h}$ be the set of females from herd h. For all herds h and all males i we have

$$\sum_{j \in F_{h}} n_{ij} \leq \alpha N_{h},$$

where $N_{h}$ is the number of individuals in the birth cohort that will be born in herd h. Mate allocation is demonstrated using the example of OCS with a segment-based kinship matrix. Recall that the optimization problem can be solved with

[Listing: 12859_2018_2450_Figab_HTML.jpg]

Function noffspring() is used below to compute the desired number of offspring per selection candidate by assuming that the birth cohort covers 200 individuals. The result of function matings(), which is used for mate allocation, is a data frame with columns Sire, Dam, and n. Column n contains the desired number of offspring from matings between the respective sire and dam. Note that this is the number of offspring that should be used as selection candidates in the next generation. The total number of offspring from the matings may be larger.

[Listing: 12859_2018_2450_Figac_HTML.jpg]

The average inbreeding coefficient of the offspring is

[Listing: 12859_2018_2450_Figad_HTML.jpg]
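The mate allocation problem described in this section can be illustrated on a toy example. The Python sketch below converts contributions into offspring numbers with simple rounding of $2 c_{i} N_{0}$ and then searches all integer mating plans exhaustively for the one with the lowest mean kinship between mates. This brute-force search is a stand-in chosen for clarity and is not the solver used by function matings(); the rounding is also simpler than what noffspring() may do. All names and data are hypothetical.

```python
from itertools import product


def offspring_numbers(contributions, cohort_size):
    """Approximate n_i = 2 * c_i * N0 by simple rounding."""
    return [round(2 * c * cohort_size) for c in contributions]


def allocate_matings(n_sire, n_dam, kin, ub_n=None):
    """Exhaustive search for the mating plan with minimum mean kinship.

    kin[i][j] is the kinship between sire i and dam j. Row sums of the plan
    equal n_sire, column sums equal n_dam, and each cell is at most ub_n.
    """
    if sum(n_sire) != sum(n_dam):
        raise ValueError("sires and dams need the same total offspring number")
    total = sum(n_sire)
    cap = total if ub_n is None else ub_n
    best = {"plan": None, "value": float("inf")}

    def search(i, remaining, plan):
        if i == len(n_sire):
            if all(r == 0 for r in remaining):
                value = sum(
                    plan[a][b] * kin[a][b]
                    for a in range(len(n_sire))
                    for b in range(len(n_dam))
                ) / total
                if value < best["value"]:
                    best["plan"], best["value"] = [list(r) for r in plan], value
            return
        # Enumerate every way to split the offspring of sire i across dams.
        ranges = [range(min(cap, r, n_sire[i]) + 1) for r in remaining]
        for row in product(*ranges):
            if sum(row) == n_sire[i]:
                left = [r - x for r, x in zip(remaining, row)]
                search(i + 1, left, plan + [row])

    search(0, list(n_dam), [])
    return best["plan"], best["value"]


# Hypothetical toy data: 2 sires, 3 dams, birth cohort of N0 = 4 offspring.
n_sire = offspring_numbers([0.375, 0.125], cohort_size=4)
n_dam = offspring_numbers([0.25, 0.125, 0.125], cohort_size=4)
kin = [[0.10, 0.02, 0.05],
       [0.01, 0.20, 0.03]]

plan, mean_inbreeding = allocate_matings(n_sire, n_dam, kin, ub_n=2)
print(plan, round(mean_inbreeding, 4))
```

Exhaustive search is only feasible for a handful of animals. It is used here because it makes the objective and the marginal constraints explicit; realistic problems require a proper integer or linear programming solver.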

Results

Comparison of solvers

The ability of different solvers to find optimum solutions for different OCS problems was compared using the example of a data set containing genotypes, breeding values, and migrant contributions of 11000 simulated Angler cattle. These simulated individuals were generated from genotypes of 131 Angler bulls and 137 Angler cows during 2 generations of selection. Male selection candidates were sampled at random from the population that consisted of all 11000 individuals. Females were assumed to have equal contributions within each age class. Breeding values were simulated as described in [8]. Segment-based kinships, native kinships, and native contributions were estimated from haplotypes consisting of 23448 SNPs.

The following OCS-scenarios for populations with overlapping generations were considered:

max.EBV: This is traditional OCS with segment-based kinship matrix. The mean breeding value in the population was maximized, while the mean kinship was constrained such that $N_{e} \geq 100$.

max.segNC: This OCS approach is suitable for breeding programs whose main objective is to recover the native genetic background. The mean native contribution in the population was maximized, while the mean native kinship was constrained such that $N_{e} \geq 100$.

min.fSEG: This objective function is suitable for breeds suffering from inbreeding depression. The mean kinship was minimized, while the mean native contribution was constrained, and the mean breeding value was constrained not to decrease.

min.fSEGN: This OCS approach may be suitable for breeding programs that aim at maximizing the genetic diversity at native alleles and at recovering the native genetic background. The mean kinship at native alleles was minimized, while the mean native contribution was constrained to increase by at least 2.5% per year.
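A constraint of the form $N_{e} \geq 100$ has to be translated into an upper bound for the mean kinship at the next evaluation time. The sketch below uses the standard population-genetic relation that the kinship increases by a rate of $\frac{1}{2 N_{e}}$ per generation, spread over a generation interval of L years. This relation is assumed here for illustration, since the section on constraint settings lies outside this excerpt; the function name and the numerical inputs are hypothetical.

```python
def kinship_upper_bound(current_kinship, ne, generation_interval=1.0):
    """Upper bound for the mean kinship at the next evaluation time.

    Assumes the kinship increases by 1 / (2 * ne) per generation and that one
    generation spans generation_interval evaluation periods (for example years).
    """
    rate_per_generation = 1.0 / (2.0 * ne)
    retained = (1.0 - rate_per_generation) ** (1.0 / generation_interval)
    return 1.0 - (1.0 - current_kinship) * retained


# Hypothetical current mean kinship of 0.03 and a desired effective size of 100.
ub_per_generation = kinship_upper_bound(0.03, ne=100)
ub_per_year = kinship_upper_bound(0.03, ne=100, generation_interval=5.0)
print(round(ub_per_generation, 5), round(ub_per_year, 5))
```

A longer generation interval leaves less room for the kinship to increase within a single year, so the yearly bound is tighter than the per-generation bound.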

The results shown in Figs. 2 and 3 were obtained from 50 replicates for scenarios with fewer than 300 selection candidates, and from 10 replicates for scenarios with more than 300 selection candidates. Figure 2 shows the proportions of correct results (green), the proportions of suboptimal results (blue), and the proportions of cases in which no feasible solution was found (red). These proportions are shown for the different solvers, OCS-methods, and numbers of selection candidates. A result was classified as correct if the ratio between the value found by the solver and the best solution deviates from 1 by less than 1%. Figure 3 shows the relative computation times needed by the different solvers. Computation times are standardized and can be compared directly only for a given number of selection candidates. Bars representing computation times of solvers that did not produce correct results in at least 80% of the cases are red.

[Figure 2: 12859_2018_2450_Fig2_HTML.jpg]

Classification of results. Proportions of correct results (green), proportions of suboptimal results (blue), and proportions of cases in which no feasible solution was found (red) for different solvers, OCS-methods and numbers of selection candidates. A solution was classified as not correct if the value of the objective function at the solution deviates from the best estimate by more than 1%

[Figure 3: 12859_2018_2450_Fig3_HTML.jpg]

Relative computation time. Relative computation time needed by different solvers to find optimum solutions for different OCS-methods. Computation times are standardized and can be compared directly only for a given number of selection candidates, which are displayed at the right-hand side. Bars representing computation times of solvers that did not produce correct results in at least 80% of the cases are red
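The classification rule used for Fig. 2 and the 80% threshold used for Fig. 3 can be written down in a few lines. The sketch below is an assumed reading of that rule: the labels, the function name, and the replicate values are hypothetical, and the best solution is assumed to be nonzero.

```python
from collections import Counter


def classify_result(value, best, feasible, tol=0.01):
    """Classify one solver run relative to the best known objective value."""
    if not feasible:
        return "infeasible"
    if abs(value / best - 1.0) < tol:
        return "correct"
    return "suboptimal"


# Hypothetical replicates: (objective value, best known value, constraints met).
replicates = [
    (0.998, 1.0, True),
    (0.950, 1.0, True),
    (1.000, 1.0, False),
    (1.000, 1.0, True),
]

labels = [classify_result(*r) for r in replicates]
counts = Counter(labels)
proportions = {k: counts[k] / len(labels)
               for k in ("correct", "suboptimal", "infeasible")}

# A solver is considered reliable here if at least 80% of its runs are correct.
reliable = proportions["correct"] >= 0.8
print(proportions, reliable)
```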

All solvers were able to find correct solutions when the number of selection candidates was small. Solver alabama provided suboptimal results for larger optimization problems and had the longest runtime, so its use cannot be recommended. Solvers cccp and cccp2 had the shortest runtime for problems with linear or quadratic objective function and provided correct results, so their use can be recommended for breeding programs that aim at maximizing genetic gain, at recovering the native genetic background, or at minimizing kinships.

Minimization of the native kinship is in general not a convex problem, so solver csdp could not be used for this. Solvers cccp and cccp2 are also not designed to solve non-convex problems, but were able to find the solution when the number of selection candidates was small. When the number of candidates was large, their solutions did not satisfy the constraints. Hence, only solver slsqp can be recommended for breeding programs that aim at maximizing the genetic diversity at native alleles.

Computation of pedigree-based kinships

Different R packages exist to compute pedigree-based kinships, or, equivalently, the additive relationship matrix A. Table 1 shows the computation time needed to compute the kinship matrix for different numbers of individuals. The pedigree size was the number of individuals included in the pedigree, which are the individuals for which the kinships were to be computed and their ancestors. R package optiSel was more than 10 times faster than all other packages. Moreover, all other packages failed to compute the kinship matrix for the example data set with 32698 individuals because the memory that would have been needed by those packages was larger than 32 GB RAM.

Table 1

Time needed for computing kinship matrices on a 3.40 GHz PC with 32GB RAM

Pedigree size	Individuals	nadiv	optiSel	Pedigree	pedigreeR	pedigreemm
47064	4705	189	13	648	184	184
70075	12269	1082	64	-	930	1080
96411	32698	-	153	-	-	-
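The timings in Table 1 can be turned into speedup factors relative to optiSel. The short Python sketch below copies the table values as printed, treats the entries marked "-" as missing, and computes the ratio for every package and data set size. The variable and function names are hypothetical.

```python
# Timings from Table 1, one value per row of the table; None marks "-".
times = {
    "nadiv": [189, 1082, None],
    "optiSel": [13, 64, 153],
    "Pedigree": [648, None, None],
    "pedigreeR": [184, 930, None],
    "pedigreemm": [184, 1080, None],
}


def speedups(times, reference="optiSel"):
    """Ratio of each package's time to the reference package's time."""
    ref = times[reference]
    return {
        pkg: [None if v is None else v / r for v, r in zip(vals, ref)]
        for pkg, vals in times.items()
        if pkg != reference
    }


ratios = speedups(times)
smallest = min(x for vals in ratios.values() for x in vals if x is not None)
print(round(smallest, 2))
```

The smallest ratio over all available entries is the most conservative statement of the speedup that the table supports.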

Conclusion

Optimum contribution selection applied to local breeds requires special attention due to the conflicting objectives of their breeding programs. The free R package optiSel is an easy-to-use software taking these conflicting objectives into account. It makes it possible to estimate the genetic parameters that need to be controlled, which can afterwards be used to define the objective and constraints of a breeding program. The optimization problem can be solved with a variety of solvers, which provide a list with the optimum numbers of offspring for all selection candidates, and which can subsequently be used for mate allocation.

Availability and requirements

Project name: optiSel 2.0.1

Project home page: https://CRAN.R-project.org/package=optiSel

Operating system(s): Platform independent

Programming language: R and C++

Other requirements: None

License: The software is free

Additional file

Acknowledgements

The author thanks Yu Wang for providing the data set used in this study.

Funding

The written report was supported by the German Research Foundation (Deutsche Forschungsgemeinschaft, DFG).

Availability of data and materials

The example data set and a replication R-script are included as an electronic appendix. It is a subset of the data generated by [8].

Authors' contributions

RW wrote the manuscript and the R package optiSel. The author read and approved the final manuscript.

Notes

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The author declares that he has no competing interests.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

References

1. Meuwissen THE. Maximising the response of selection with a predefined rate of inbreeding. J Anim Sci. 1997;75:934–40. doi: 10.2527/1997.754934x.

2. Hartwig S, Wellmann R, Hamann H, Bennewitz J. The contribution of migrant breeds to the genetic gain of beef traits of german vorderwald and hinterwald cattle. J Anim Breed Genet. 2014;131:496–503. doi: 10.1111/jbg.12099.

3. Hartwig S, Wellmann R, Emmerling R, Hamann H, Bennewitz J. Short communication: Importance of introgression for milk traits in the german vorderwald and hinterwald cattle. J Dairy Sci. 2015;98:2033–8. doi: 10.3168/jds.2014-8571.

4. Amador C, Toro MA, Fernandez J. Removing exogenous information using pedigree data. Conserv Genet. 2011;12:1565–73. doi: 10.1007/s10592-011-0255-4.

5. Wellmann R, et al. Optimum contribution selection for conserved populations with historic migration. Genet Sel Evol. 2012;44:34. doi: 10.1186/1297-9686-44-34.

6. Bennewitz J, Simianer H, Meuwissen THE. Investigations on merging breeds in genetic conservation schemes. J Dairy Sci. 2008;91:2512–9. doi: 10.3168/jds.2007-0924.

7. Wang Y, Wellmann R, Bennewitz J. Novel optimum contribution selection methods accounting for conflicting objectives in breeding programs for livestock breeds with historical migration. GSE. 2017;49:45.

8. Wang Y, Segelke D, Emmerling R, Bennewitz J, Wellmann R. Long-term impact of optimum contribution selection strategies on local livestock breeds with historical introgression. G3. 2017;7:4009–18. doi: 10.1534/g3.117.300272.

9. Meuwissen THE. GENCONT: An operational tool for controlling inbreeding in selection and conservation schemes. Proc. 7th World Congr. Genet. Applied to Livest. Prod., Montpellier, France. 2002;33:769–70.

10. Pong-Wong R, Woolliams JA. Optimisation of contribution of candidate parents to maximise genetic gain and restricting inbreeding using semidefinite programming. Genet Sel Evol. 2007;39:3–25. doi: 10.1186/1297-9686-39-1-3.

11. Berg P, Nielsen J, Sørensen MK. EVA: Realized and predicted optimal genetic contributions. Proc 8th World Cong Genet Appl Livest Prod Belo Horizonte, Brazil. 2006;:246.

12. Kinghorn BP. An algorithm for efficient constrained mate selection. Genet Sel Evol. 2011;43:4. doi: 10.1186/1297-9686-43-4.

13. Fujisawa K, Kojima M, Nakata K, Yamashita K. SDPA (SemiDefinite Programming Algorithm) user's manual, Version 6.0; 2002. Research Report B-308, Dept. of Mathematical and Computing Sciences, Tokyo Institute of Technology, Oh-Okayama, Meguro, Tokyo 152-8552, Japan, 1995. Revised July 2002.

15. Borchers B. Csdp, a c library for semidefinite programming. Optim Methods Softw. 1999;11(1):613–23. doi: 10.1080/10556789908805765.

16. Kraft D. A software package for sequential quadratic programming. 1988. Tech Rep DFVLR-FB 88-28, DLR German Aerospace Center, Institute for Flight Mechanics, Köln, Germany.

17. Pfaff B. The R package cccp: Design for solving cone constrained convex programs. R Financ 16-17 May 2014 Chic. 2014.

18. Maignel L, Boichard D, Verrier E. Genetic variability of french dairy breeds estimated from pedigree information. Interbull Bull. 1996;14:49–54.

19. Peripolli E, Munari DP, Silva MVGB, Lima ALF, Irgang R, Baldi F. Runs of homozygosity: current knowledge and applications in livestock. Anim Genet. 2016;48:255–71. doi: 10.1111/age.12526.

20. Ferenčaković M, Sölkner J, Curik I. Estimating autozygosity from high-throughput information: effects of snp density and genotyping errors. Genet Sel Evol. 2013;45(42).

21. Meuwissen THE. Genetic management of small populations: A review. Acta Agric Scand Sect A. 2009;59:71–9.

22. Falconer DS. Introduction to Quantitative Genetics. Essex: Longman Group UK Limited; 1989.

Articles from BMC Bioinformatics are provided here courtesy of BioMed Central

Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6332575/
