Selecting Breeding Stock on their Breeding Values for achieving higher Genetic progress in Cattle and Buffaloes


Breeders select parents to increase the average productivity of future generations. The economically important traits on which parents are selected are usually quantitative in nature, and their phenotypic records are influenced by both genetic and environmental factors. However, parents pass on only their genes to their progeny, not the environmental influences. For selecting parents, therefore, the breeder’s main task is to separate environmental influences from phenotypic records and estimate what animals can transmit to their progeny. The genetic worth of animals in terms of what they can transmit to their progeny is expressed as breeding value (BV) or probable transmitting ability (PTA). BV is simply twice PTA.

The question then is how to estimate the BV or PTA of animals. As all economically important traits are influenced by both environment and genetics, the phenotypic records of the animals available for selection are first corrected for environmental factors, and the resulting values are then used for estimating breeding values, making use of the relationships among individuals. Though a breeding value is reported for an individual animal, it can only be estimated from the records of a group of animals. The first part of this paper describes, in simple terms, the statistical procedure used for estimating breeding values of bulls put under test and of all females recorded in progeny testing programs implemented under the National Dairy Plan. The second part answers, in simple language, the common questions raised by persons involved in selecting bulls for semen production about the procedure used for estimating breeding values and the advantages of using BVs over phenotypic values for selecting parents.


An animal’s milk record is influenced by environmental factors such as herd, village, year of calving, season of calving, age of animal, etc. (collectively referred to as environment) and by the animal’s genetics. This can be expressed mathematically as:

Milk = Herd + Year of calving + Season of Calving + Age of animal + Animal’s Genetics + Error

For estimating the genetic worth of an animal, milk production records are first corrected for the various environmental factors. After this correction, the relationships among all animals are used for calculating their breeding values. In practice, the correction of records for environmental factors and the estimation of genetic worth from differences among the records of related and unrelated animals are done simultaneously using the following modified formula:

Milk = Herd + Year of calving + Season of Calving + Age of animal + Animal’s ID*(matrix of relationship of animal with all other animals) + Error

A solution of this equation is obtained by a statistical procedure. One of the best and most popular methods is BLUP (Best Linear Unbiased Prediction), which corrects for environmental factors and estimates breeding values simultaneously. For estimating breeding values of bulls and recorded females under progeny testing programs, we use the Random Regression Test Day BLUP procedure, in which test-day records are used instead of lactation yields.
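The idea of correcting for environment and estimating genetic merit in one step can be illustrated with a much simpler model than the one described above. The following is only a minimal sketch, assuming a sire model with herd as the single environmental factor, unrelated sires, and a heritability of 0.25; it is not the Random Regression Test Day procedure. The function names, the records, and the heritability value are all hypothetical.

```python
# Minimal sire-model BLUP sketch: herd is a fixed (environmental) effect and
# sire is a random (genetic) effect. Sires are assumed unrelated, and all
# records and the heritability value are hypothetical.

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                for c in range(col, n + 1):
                    m[r][c] -= factor * m[col][c]
    return [m[i][n] / m[i][i] for i in range(n)]


def sire_blup(records, h2):
    """Build and solve the mixed model equations for yield = herd + sire + error."""
    herds = sorted({h for h, _, _ in records})
    sires = sorted({s for _, s, _ in records})
    pos = {("herd", h): i for i, h in enumerate(herds)}
    pos.update({("sire", s): len(herds) + i for i, s in enumerate(sires)})
    n = len(pos)
    lhs = [[0.0] * n for _ in range(n)]
    rhs = [0.0] * n
    for herd, sire, y in records:
        i, j = pos[("herd", herd)], pos[("sire", sire)]
        for r in (i, j):
            rhs[r] += y
            for c in (i, j):
                lhs[r][c] += 1.0
    # Ratio of residual to sire variance for a sire model: (4 - h2) / h2.
    lam = (4.0 - h2) / h2
    for s in sires:
        lhs[pos[("sire", s)]][pos[("sire", s)]] += lam
    sol = solve(lhs, rhs)
    herd_effects = {h: sol[pos[("herd", h)]] for h in herds}
    sire_ta = {s: sol[pos[("sire", s)]] for s in sires}
    return herd_effects, sire_ta


# Hypothetical daughter records: (herd, sire, milk yield in kg).
records = [
    ("H1", "S1", 3200.0), ("H1", "S1", 3400.0),
    ("H1", "S2", 3000.0), ("H1", "S2", 2900.0),
    ("H2", "S1", 4300.0), ("H2", "S2", 4000.0),
    ("H2", "S3", 4100.0), ("H2", "S3", 4200.0),
]
herd_effects, sire_ta = sire_blup(records, h2=0.25)
# A sire solution is a transmitting ability; the breeding value is twice that.
sire_bv = {s: 2.0 * ta for s, ta in sire_ta.items()}
print(herd_effects)
print(sire_bv)
```

The herd solutions absorb the difference in production level between the two herds, so the sires are compared on daughters' performance within herd rather than on raw daughter averages.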

Every estimated breeding value is provided with its reliability. Reliability indicates the confidence that one can place in the estimate of breeding value. The larger the number of pedigree and progeny records of an individual used for estimating its breeding value, the higher the reliability of the estimate. One can go to NDDB’s web site and see the latest estimates of breeding values, along with their reliability, of bulls having more than 30 daughter records under progeny testing programs.

If an animal’s relationships with other animals are not known, we can still correct its record for all the environmental effects using the following formula:

Milk = Herd + Year of calving + Season of Calving + Age of animal + Error

The difference between the animal’s corrected yield and the average of all animals is then multiplied by heritability (a factor that determines what proportion of the difference is controlled by genetics) to arrive at the breeding value of the animal.

BV = (Corrected yield of animal - Average of all animals) * heritability
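The formula above can be written out as a short calculation. This is a minimal sketch; the function name, the corrected yields, and the heritability of 0.25 are hypothetical values chosen only for illustration.

```python
def own_record_bv(corrected_yields, h2):
    """BV = (corrected yield of animal - average of all animals) * heritability."""
    mean = sum(corrected_yields.values()) / len(corrected_yields)
    return {animal: h2 * (y - mean) for animal, y in corrected_yields.items()}


# Hypothetical environment-corrected yields in kg.
corrected = {"cow_1": 4900.0, "cow_2": 4400.0, "cow_3": 4200.0}
bvs = own_record_bv(corrected, h2=0.25)
for animal, bv in bvs.items():
    print(animal, round(bv, 1))
```

Because each value is a deviation from the average, the breeding values of the group sum to zero: animals above the average get a positive BV and animals below it a negative BV.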

We can correct for as many factors affecting milk production as possible provided we have records for all those factors along with milk production records. 

The process of estimating breeding values is depicted in Figure 1.

Figure 1.    Process of estimating breeding values


Answering frequently asked questions

Q. What factors are considered for calculating breeding values? 

In most projects, we consider owner, village, year of calving, season of calving, and lactation number as factors influencing an animal’s record. If there are not sufficient observations in any one sub-class, we combine sub-classes to create a sub-class with a larger number of observations. For example, if many villages have only 2 or 3 records each, we combine nearby villages into a Tehsil factor. If most of the owners have only one animal recorded, we consider the village as the herd instead of the owner. Pedigree information, i.e., a matrix defining the relationships among animals, is added in all breeding value estimation runs. Besides, we now also add genotype information wherever feasible.

Theoretically, as many factors as we can record accurately can be used for breeding value estimation provided the factor has a biological justification of its influence on the trait under consideration and there are at least 5-10 observations in each sub-class of that factor.
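The combining of sparse sub-classes described in this answer can be sketched as follows. This is only an illustration, assuming a village-to-Tehsil lookup is available and taking five records as the minimum; the function name, the village and Tehsil labels, and the records are hypothetical.

```python
from collections import Counter


def assign_area_level(records, village_to_tehsil, min_obs=5):
    """Use the village as the area level when it has at least min_obs records,
    otherwise fall back to the Tehsil that the village belongs to."""
    counts = Counter(village for village, _ in records)
    levels = []
    for village, _ in records:
        if counts[village] >= min_obs:
            levels.append(("village", village))
        else:
            levels.append(("tehsil", village_to_tehsil[village]))
    return levels


# Hypothetical records: (village, milk yield in kg).
records = (
    [("V1", 3000.0)] * 6    # enough records to stand as its own level
    + [("V2", 2800.0)] * 2  # too few records
    + [("V3", 2900.0)] * 3  # too few records
)
village_to_tehsil = {"V1": "T1", "V2": "T1", "V3": "T1"}
levels = assign_area_level(records, village_to_tehsil, min_obs=5)
level_counts = Counter(levels)
print(level_counts)
```

A merged level should itself be checked against the minimum, since a Tehsil made up of a few small villages may still have too few records.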

Apart from the factors that we currently use for estimating breeding values, many other factors such as owner of animal, farm condition, animal house type, feeding practices (low protein, high protein, medium protein diet, etc.), temperature-humidity on the farm on the day of record, body size of animal, breed percentage of animal, etc. can be considered, provided we can measure these factors accurately and there are sufficient observations in each sub-class of the factor.

Q. What are the consequences of not correcting for environmental factors?

Let me explain this with an example. Take two cows. The first calved during summer, in the month of May, when the farm had a water crisis, and produced 4000 kg of milk in the lactation. The second calved during winter, in the month of December, when the farm had plenty of food and water, and produced 4400 kg of milk. If we do not correct for the season of calving, we would consider the second cow superior to the first. However, if we have records of many cows on the farm that calved in summer as well as in winter, it is possible to estimate the effect of season on milk production. Let us assume the effect of season calculated for this farm is 900 kg, i.e., cows that calved in winter produce 900 kg more milk than those that calved in summer. Expressed on a winter-calving basis, the season-corrected milk production of the first cow is 4900 kg (4000 + 900), while that of the second cow remains 4400 kg. This means the first cow has better genetic worth than the second cow. We would have selected the wrong cow if we had not corrected for season effects.
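The season correction in this example can be checked with a few lines of code. This is a minimal sketch that puts both records on a winter-calving basis by adding the 900 kg season effect to the summer record only; the function and variable names are hypothetical.

```python
SEASON_EFFECT_KG = 900.0  # winter minus summer, as assumed in the example


def to_winter_basis(yield_kg, season):
    """Express a lactation yield as if the cow had calved in winter."""
    if season == "summer":
        return yield_kg + SEASON_EFFECT_KG
    if season == "winter":
        return yield_kg
    raise ValueError("season must be 'summer' or 'winter'")


cows = {"first cow": (4000.0, "summer"), "second cow": (4400.0, "winter")}
adjusted = {name: to_winter_basis(y, s) for name, (y, s) in cows.items()}
best = max(adjusted, key=adjusted.get)
print(adjusted, best)
```

Subtracting 900 kg from the winter record instead would put both cows on a summer basis and give the same ranking; what matters is that the correction is applied so that both records refer to the same season.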

Q. Is estimation of BV of calves possible only with the records of their dams and no pedigree records?

Yes, it is possible. For this, we would require a sizable number of dams recorded in the same area. We can calculate the effects of factors such as tehsil, season, year of calving, and so on from a substantial number of records of cows/buffaloes and correct the phenotypic records for these environmental effects. The corrected record is then multiplied by heritability to estimate a breeding value, as shown earlier. In a situation where we do not have pedigree records but have a reasonably large number of animals recorded in an area (like in the PS projects), we can calculate the BV of animals based on environmental corrections. The reliability of such a breeding value will, of course, be very low, as very little information on the animal is available for breeding value estimation.

Q. Can we use both dam yield and BV for selecting a bull calf for semen production? 

The formula for calculating breeding value considers dam’s yield. Once we have calculated BV, the dam’s yield should not be looked into. BV would be sufficient and can be used as a sole criterion for selection of bull calves.

Q. Why is the estimation of BVs from the PT projects more accurate than that from the PS projects? Are BVs from the PT and PS projects the same?

In the PS projects, we do not record pedigree. So we can only correct dams’ records for environmental factors and not for the different bulls used. In the PT projects, by contrast, each sire has a large number of daughter records measured in varied environments, which makes the correction for environmental factors and the use of pedigree much better. All this makes the reliability of breeding values estimated under the PT projects much higher than what is possible under the PS projects. However, breeding values estimated under the PS projects will still be much more accurate than values based only on the dam’s raw yield.

Q. Can we select bulls for the entire country based on breeding values calculated in one area?

If we have breeding values estimated based on daughters produced in different environments with reasonable reliability like ones estimated in the PT projects, the breeding values estimated in one project area could be used for the entire country. However, this is applicable for comparing bull/bull calves produced within a system. An animal (bull calf) that is not at all related to these animals is not comparable. Nevertheless, if we want to make a decision on selecting a calf between the one with only dam’s yield and the other with BV, selecting the bull calf with BV poses less risk.

Q. Does the genetic base of a breed influence the estimation of breeding values and ranking of bulls?

The genetic base of a particular breed will definitely influence the estimation of breeding values and the ranking of bulls. However, the genetic base of a breed is not very different across states in India. The main differences in productivity are due to differences in feeding, management, and environment rather than in the genetic base. For example, we cannot say that Murrah buffaloes in Haryana, Western UP, and Punjab are very different genetically, though one finds the average productivity of Murrah buffaloes in Haryana higher than that in Punjab. The differences in productivity are due more to differences in feeding and management practices between Haryana and Punjab than to differences in the genetic base. A lot of animal movement happens between states in India, making the genetic base of a particular breed much the same in two states. For example, Gujarat farmers have been procuring a very large number of CB cows from Punjab, making the genetic base of crossbreds in Gujarat and Punjab more or less the same. Still, we know the productivity of Gujarat CBs is lower than that of CBs in Punjab, perhaps due to differences in environmental factors between the two states. In practical terms, it means that if a bull having a high breeding value in Gujarat is used in Punjab, its daughters will definitely perform better than the daughters of an average bull in Punjab.

Q. Can we predict the performance of a daughter produced from a particular cow and particular bull?

If one wants to predict the performance of a daughter of a particular bull mated to a particular cow, it will be guesswork only. Breeding values compare individuals on the basis of averages. That means that, on average, daughters of a high-BV bull are expected to perform better than daughters of other bulls. But there will still be variability due both to gene segregation and to environmental differences. That is why we get the same EBV for full-sibs but different genomic BVs.

Q. How many observations are required for estimating breeding values? 

The basic idea behind estimating breeding value is to remove the influence of non-genetic factors and then estimate breeding values based on the relationship among animals. For correction of non-genetic factors, we need repeated records in that environment. Statistically, we should have a minimum of five observations per level of factor. If we consider a season as one factor, then for each season we should have a minimum of five observations. If we can satisfy this condition, we can correct records for environmental influence and estimate breeding values.

If we have more observations, our corrections will be more accurate and, consequently, the estimated breeding values will be more reliable. As many factors influence milk production, we need a large data set to estimate the effects of all factors and to estimate breeding values with a reasonable level of reliability. Thus, the larger the data set, the higher the reliability of the BV estimates. One can, of course, estimate BVs with a smaller data set, but then the reliability of the breeding values will be lower. Methods exist for estimating breeding values from small farm data sets, from data with no pedigree information, or from data sets with only a few environmental factors.

Q. Is the sample size different for different projects?

Yes. Different projects have different follow-up levels, different rearing practices, different geography, etc.  Hence, different projects have different daughter numbers or animal recorded numbers.

Q. Is the accuracy of factors and breeding values different for different projects?

Two factors affect the reliability of a breeding value. The first is the accuracy of data recording (both production records and pedigree records, i.e., parentage), and the second is the volume of data. It is true that as a project grows older, parentage errors decrease and the volume of data increases, and we get better reliabilities.

Q. Is it true that BVs for some projects are more reliable than others?

As discussed above, projects differ in their reliability. Along with BV, we should always consider its reliability. If I want to choose a bull with 250 BV and 90% reliability versus a bull with 300 BV and 30% reliability, I would prefer the bull with 250 BV.

However, if I have to choose between a bull with 1000 BV and 30% reliability and a bull with 100 BV and 90% reliability, I may take the risk of selecting the 1000 BV bull. Here the word “risk” is significant. With low reliability, one is not confident whether this 1000 is really 1000, or whether it will come down or go up with more information.
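The risk attached to a low-reliability estimate can be put into numbers through the standard deviation of the prediction error, sqrt(1 - reliability) multiplied by the genetic standard deviation of the trait. The sketch below applies this to the bull with 250 BV and 90% reliability and the bull with 300 BV and 30% reliability discussed above. The genetic standard deviation of 300 kg is purely an assumed value for illustration, as are the function names, and the ranges assume normally distributed prediction errors.

```python
import math


def prediction_error_sd(reliability, genetic_sd):
    """Standard deviation of (true BV - estimated BV) for a given reliability."""
    return math.sqrt(1.0 - reliability) * genetic_sd


def approx_range(bv, reliability, genetic_sd, z=1.96):
    """Approximate range expected to contain the true BV about 95% of the time."""
    half_width = z * prediction_error_sd(reliability, genetic_sd)
    return bv - half_width, bv + half_width


GENETIC_SD_KG = 300.0  # assumed value, not taken from any project
reliable_bull = approx_range(250.0, 0.90, GENETIC_SD_KG)
risky_bull = approx_range(300.0, 0.30, GENETIC_SD_KG)
print([round(x) for x in reliable_bull])
print([round(x) for x in risky_bull])
```

Under these assumptions the whole range of the 90% reliability bull stays above zero, while the range of the 30% reliability bull extends well below zero, which is the risk referred to in the text.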

With higher reliability, we have more confidence in the estimates of breeding values and less risk. For AI bulls, the risk should be as low as possible. With increasing information, reliabilities increase as shown in Figure 2.

Figure 2.    Reliability and load of information used in breeding value estimation

Q. If reliability is low, would it be better to simply consider the dam’s lactation yield?

As explained earlier, the raw lactation yield of the dam has lower reliability than a breeding value calculated from accurate records and environmental corrections. Using breeding values is even more desirable if we wish to compare bulls from different sources, for example from a few organized farms, or from farm and field together.

Q. How are breeding values expressed?

Breeding values are expressed in terms of superiority over the population mean. If an animal has a BV of 100 for a trait, it means that this animal has the genetic potential to produce 100 kg above the current population mean.

Q. What is PTA and how is it expressed?

PTA is half of BV. It estimates what a sire or dam passes on to its progeny, which is half of its own breeding value. The offspring gets 50% of the genetic superiority of its sire and 50% of that of its dam.
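The relation between BV and PTA can be written as a short calculation. This is a minimal sketch; the function names and the example breeding values are hypothetical.

```python
def pta_from_bv(bv):
    """PTA is half of the breeding value."""
    return bv / 2.0


def expected_progeny_bv(sire_bv, dam_bv):
    """Expected BV of an offspring: the sum of the PTAs of its sire and dam."""
    return pta_from_bv(sire_bv) + pta_from_bv(dam_bv)


sire_bv, dam_bv = 300.0, 100.0  # hypothetical values in kg
calf_bv = expected_progeny_bv(sire_bv, dam_bv)
print(pta_from_bv(sire_bv), pta_from_bv(dam_bv), calf_bv)
```

This is an expectation only: individual offspring will deviate from it because each receives a different sample of its parents' genes.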

Q. If breeding values are expressed as deviations from a population mean (in which the testing has happened) and the bulls are to be used in a different population with a different population mean, will we get  the same level of genetic improvement in the other population? For example, say a BV for milk yield of a bull in the population where it is tested is +300 and the mean is 5000 Kg. If we have a population with an average of 6000 kg., is it advisable to use the above bull in the population?

If the two populations represent the same or closely related breeds with little genetic distance between them, the BV estimates obtained in one population are valid irrespective of the current productivity levels of the two populations. When there are no genetic differences between the two populations, the phenotypic differences in production levels are attributed to environmental differences. This is similar to examining the performance of Murrah in Haryana, Western UP, and Punjab. Another example is Jersey crossbred performance in Tamil Nadu and Andhra Pradesh. In such conditions, the differences in BV may be amplified: if the difference between the breeding values of two bulls is 100 in the low-producing population, the difference will be around 125 in the high-producing population.

In our country, we can use BVs that have been estimated in one area to another area with reasonable confidence. To bring in more confidence/accuracy, under the PT projects, we have started projects for the same breed in multiple locations and have been sharing bulls across projects so that we can have joint breeding value estimation and the estimated breeding values could be used across the entire country.

However, if the genetic base of the two populations is different, the ranking will change considerably. The ranking will also change considerably if the environment is drastically different. This is the case with HF in the US and its performance in India. Even here, bulls with a very high BV in one population will not have a negative or very low BV in the other, but the ranking among bulls will change considerably because different genes may be expressed differently in different environments.

Q. Should we distribute bulls ranked 1 to 20 or say 1 to 50 based on breeding values irrespective of their breeding values? Can we also define MSP for breeding value?

Ideally, we should produce two or three times the number of bulls required through nominated mating and then select the best one-third or one-half of the bulls for semen production. The nominated mating should also be based on BV and not on raw production records. If we mate the best dams (by BV) with the best bulls (by BV), the progeny are bound to have better BVs. Bull calves with a very small but positive BV will be average bulls and will give a small genetic gain. But consider bulls selected on the raw yields of their dams, which may have negative BVs. Such bulls may lead to negative genetic progress. Initially, we may say that only bulls with positive breeding values should be used. Later, as we progress, we may set a certain percentage of top bulls that may be used for semen production, but it should be dynamic. It may not be advisable to fix a percentage limit, as what we select depends on the actual demand for bulls for semen production. In fact, we must think about a differential price system based on breeding values and demand as our systems of bull evaluation and production mature.

Q. Do we have any data or have we done any experiment wherein bulls with high breeding value are actually performing better under field conditions?

Table 1 shows that there is not much difference between the daughter yields of bulls with high dam yield and those of bulls with low dam yield. Table 2 shows that there is a big difference between bulls with high BV and bulls with low or negative BV: the daughters of bulls with higher BVs perform better. This clearly indicates that under field conditions, it is bulls with high BV, not bulls with high raw dam yields, that are expected to produce better progeny. These are actual figures taken from one of our PT projects.

Table 1.    Bulls ranked based on dam’s yield and their daughter performance

Bull No.    Dam yield (kg)    Avg. daughter yield (kg)    Bull No.    Dam yield (kg)    Avg. daughter yield (kg)
Bull A                                                    Bull 1
Bull B                                                    Bull 2
Bull C                                                    Bull 3
Bull D                                                    Bull 4
Bull E                                                    Bull 5
Bull F                                                    Bull 6
Bull G                                                    Bull 7
Bull H                                                    Bull 8
Bull I                                                    Bull 9
Bull J                                                    Bull 10

Table 2.    Bulls ranked based on BV and their daughter performance

Bull No.    Dam yield (kg)    Avg. daughters’ yield (kg)    BV milk    Bull No.    Dam yield (kg)    Avg. daughters’ yield (kg)    BV milk
Bull A                                                                 Bull 1
Bull B                                                                 Bull 2
Bull C                                                                 Bull 3
Bull D                                                                 Bull 4
Bull E                                                                 Bull 5
Bull F                                                                 Bull 6
Bull G                                                                 Bull 7
Bull H                                                                 Bull 8
Bull I                                                                 Bull 9
Bull J                                                                 Bull 10

Q. What is the relation between traditional breeding value and genomic breeding value?

Genomic breeding values build on traditional breeding values. For estimating genomic breeding values in any population, we need performance records of a large number of animals along with recording of various factors. If pedigree information is available, genomic breeding values will be more accurate. Genomic breeding values will have better reliability than traditional breeding values.

For example, in traditional analysis, two full-sib bulls with no daughter information will have the same breeding value, as their sire and dam are the same. However, the genetic information of these full-sib bulls will differ, and hence their genomic breeding values will be different. Thus we can further differentiate which full-sib is better than the other.

Q. If we have 3 bulls produced through OPU-IVF and ET, what would be their BVs? 

If three bulls are produced through OPU-IVF/ET with the same donor cow and sire, they will have the same breeding values in the traditional way of estimating breeding values. However, their genomic breeding values will differ as they will have different genes. Only monozygotic twins will have the same Genomic Breeding Values.

Q. Once a breeding value is established for the bull, do we need to continue the breeding value estimation for the same bull?

With an increasing amount of information, a bull’s BV will change and its reliability will increase. However, once the reliability crosses 90%, there will be no considerable change in the BV of the bull. This means we need to keep looking at the BV of a bull until it crosses 90% reliability. Afterward, the bull will remain in the system, as we have to use all the relationships and information available, but its BV will not change much.

Q. Which scientific method will provide the best bulls for breeding - Genomic breeding values or traditional Breeding Values or any other?

If we can use for semen production only bulls with a BV calculated from 100 daughter records (proven bulls), this is the best scientific method. This is what the classical PT programs in developed countries did before genomic selection was introduced.

However, in our conditions, by the time we get 100 daughter records, the bull will no longer be available for breeding/semen production. This means the above-mentioned best method is not practicable in our situation.

The compromise we made was to store 3000 doses and use these doses for nominated mating once we have BVs based on a large number of daughter records. Here, we use for semen production a male calf born from nominated mating and sired by a top proven bull. The young bull will have a BV based on the BVs of its sire and dam, which gives relatively low reliability. An improvement over selecting a bull calf produced using semen of a proven bull is to select it on its genomic breeding value (this is genomic selection). This is superior to selecting a bull calf from nominated mating alone, but not as good as using proven bulls for semen production.
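Why a young bull evaluated only through its parents has relatively low reliability can be seen from the usual formula for the reliability of a parent average, which is one quarter of the sum of the reliabilities of the sire and dam BVs. The sketch below is an illustration only; the function name and the reliabilities used are hypothetical.

```python
def parent_average_reliability(sire_reliability, dam_reliability):
    """Reliability of a BV based only on the BVs of the sire and the dam."""
    return (sire_reliability + dam_reliability) / 4.0


# Hypothetical case: a well-proven sire and a dam with her own records only.
young_bull_rel = parent_average_reliability(0.95, 0.30)
# Upper limit: both parents known with complete certainty.
upper_limit = parent_average_reliability(1.0, 1.0)
print(round(young_bull_rel, 4), upper_limit)
```

Even with perfectly evaluated parents the reliability cannot exceed 50%, because the particular sample of genes the calf received from each parent remains unknown until genotypes or daughter records become available.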

Q. Is there any standard operating protocol for breeding values results which can be considered by the semen stations for selection/ rejection of particular bulls?

We should look at the differences in breeding values among bulls. If the differences are not too high (say 50-100 for 305-day milk yield), we should choose the bull with the higher reliability. If the differences are very high (say more than 250-300 for 305-day milk yield), one may think about taking a bit of risk and selecting the bull with the higher BV even though its reliability is low.

Q. Is it possible to establish breeding values based on sisters’ milk yield?

Yes, BV can be estimated from sisters’ milk yield. However, its reliability will be lower than that of a BV estimated from the performance of several daughters.

Q. Why are daughters of the bulls given more weightage for estimating breeding values?

Any estimate that is based on repeated samples will have higher reliability. Each daughter receives half of the bull’s genes. This means that in the daughters we have multiple samples of the bull’s gene pool. A bull’s BV estimated from its daughter records therefore always has higher reliability than one estimated from any other relationship.

Q. What is the possibility of errors in estimating breeding values?

Reliability gives an estimate of the possible error in the estimation of breeding values. The lower the reliability, the higher the possible error.

Q. How do we know whether data provided from the field have errors and cannot be considered for breeding value estimation?

There are a few statistical ways to find out extremely wrong data. However, there is no way to know wrong data that fall within the biological range and follow established pattern. After completion of BV estimation analysis, when we look at results, we can sense the accuracy of records but we cannot pin-point which record is wrong. If parameters like heritability are very different from established norms, we may suspect high errors in recording.

Q. Can we estimate the breeding value of bulls with say 10 daughter records per bull? 

Yes, we can estimate a breeding value from a small number of daughter records, but its reliability will be lower than that of one estimated from a very large number of daughter records.
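How reliability grows with the number of daughters can be approximated with the textbook formula for a simple progeny test, reliability = n / (n + k), where k = (4 - heritability) / heritability. The sketch below assumes one record per daughter, daughters related only through the bull, no other information on the bull, and a heritability of 0.25; the function name and the heritability are assumptions for illustration.

```python
def progeny_test_reliability(n_daughters, h2):
    """Approximate reliability of a bull's BV from n daughters with one record each."""
    k = (4.0 - h2) / h2
    return n_daughters / (n_daughters + k)


H2 = 0.25  # assumed heritability of milk yield
reliabilities = {n: progeny_test_reliability(n, H2) for n in (10, 30, 100)}
for n, rel in reliabilities.items():
    print(n, round(rel, 2))
```

Under these assumptions, 10 daughters give a reliability of about 40%, 30 daughters about 67%, and 100 daughters about 87%, so each additional daughter adds less than the one before.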