Bayesian methods for linkage analysis are useful because they allow for incorporation of prior information about allele frequencies, meiotic drive, and other factors important to linkage calculations. This, along with LOCate's versatility for ordinal and nominal traits, makes our method a valuable complementary tool to existing frequentist methods.
Even in a Bayesian framework, it is desirable to have a means of computing LOD scores, as they are commonly used to assess linkage. We developed a new, linear-regression based estimator, LinReg, which has similar mean squared error to the RLR estimator and is faster to compute. Our LinReg estimator will be useful for parameter inference in any situation in which MCMC is used and it is possible to calculate the joint probability of the observed and unobserved data, conditional on the parameter. For example, it could be used in the problem of population structure to infer K, the number of populations represented by an observed sample of genotypes.
The choice of a penetrance model is an important question in any parametric linkage analysis, and this choice becomes even more challenging when analyzing categorical traits, as the number of possible penetrance matrices increases with the number of levels of the trait. An important distinction in the choice of penetrance matrices for categorical traits is whether the model should be ordinal or nominal. LOT estimates penetrances according to an ordinal model; this gives it an advantage for researchers who are confident their trait follows an ordinal model, but who do not wish to estimate the penetrances in advance. In contrast, LOCate accommodates both ordinal and nominal penetrance models, but requires the penetrances to be specified in advance. As we have done in this paper, these can be derived from previous estimates of the phenocopy rate and overall penetrance of the trait. As our simulations demonstrate, LOCate exhibits better power than LOT when used to analyze a nominal trait, even when the input penetrance matrix is only a rough estimate. This robustness reduces the need to estimate the penetrance matrix exactly, and makes LOCate a valuable alternative method for researchers who wish to test penetrance models that do not have the ordinal proportional-odds property.
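To make the ordinal versus nominal distinction concrete, the following is a minimal sketch of how one might check whether a given penetrance matrix has the proportional-odds property, that is, whether the cumulative logits of any two genotypes differ by the same shift at every trait threshold. The function names, the thresholds, the genotype effects, and the example matrices are all hypothetical and chosen only for illustration; they are not taken from LOCate, LOT, or our simulations.

```python
import math

def cumulative_logits(row):
    """Return logit P(Y <= k) for k = 1..K-1 from one genotype's penetrance row.

    Assumes the row sums to 1 and every cumulative probability is strictly
    between 0 and 1 (otherwise the logit is undefined).
    """
    logits, cum = [], 0.0
    for p in row[:-1]:
        cum += p
        logits.append(math.log(cum / (1.0 - cum)))
    return logits

def has_proportional_odds(penetrance, tol=1e-8):
    """True if each genotype's cumulative logits differ from those of the
    first genotype by a shift that is constant across thresholds."""
    base = cumulative_logits(penetrance[0])
    for row in penetrance[1:]:
        shifts = [l - b for l, b in zip(cumulative_logits(row), base)]
        if max(shifts) - min(shifts) > tol:
            return False
    return True

def proportional_odds_matrix(thresholds, genotype_effects):
    """Build a penetrance matrix from logit P(Y <= k | g) = thresholds[k] - effect[g].

    thresholds must be strictly increasing; rows are genotypes, columns are
    trait levels.
    """
    matrix = []
    for effect in genotype_effects:
        cum = [0.0]
        cum += [1.0 / (1.0 + math.exp(-(t - effect))) for t in thresholds]
        cum.append(1.0)
        matrix.append([hi - lo for lo, hi in zip(cum, cum[1:])])
    return matrix

# Illustrative values only (hypothetical, not estimates from any dataset).
ordinal = proportional_odds_matrix([-1.0, 1.0], [0.0, 0.5, 2.0])
nominal = [
    [0.90, 0.05, 0.05],  # genotype with no risk alleles
    [0.20, 0.70, 0.10],  # one risk allele: mostly the middle category
    [0.10, 0.20, 0.70],  # two risk alleles: mostly the last category
]
```

Under these assumptions, `has_proportional_odds(ordinal)` holds by construction, while the `nominal` matrix fails the check because the shift between genotypes changes from one threshold to the next. A matrix of the second kind is the sort of model that an ordinal method cannot represent but that can be supplied directly as a fixed penetrance matrix.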
Because LOCate is computationally intensive, our simulation study was limited in scope. Nevertheless, we believe our simulations establish LOCate as a valuable complementary approach for linkage analysis of categorical traits, particularly nominal traits. We are currently developing extensions to increase the computational speed of LOCate, which will enable a more extensive range of simulations to compare LOCate's performance to LOT on a variety of ordinal and nominal traits with varying amounts of missing data and inbreeding.
We further demonstrated the versatility of our method through a trichotomous linkage analysis of a human panic disorder dataset with a large proportion of missing data. By splitting the most memory-intensive pedigrees into nuclear families, we were able to analyze the dataset using LOCate, while LOT was unable to process the large proportion of individuals with missing phenotypes. In this particular application, it was interesting to note the strongly negative LOD scores produced in the trichotomous analysis, while the binary analysis on the same set of subpedigrees had positive LODs. This indicates that the trichotomous model is a poor fit to the data. The exclusion of this penetrance matrix as a model for the contribution of D2S1788 (or a locus linked to it) to panic disorder was not possible using LOT. The exclusion of this model, a categorical “translation” of the binary penetrance model used by Fyer et al., demonstrates that modeling genetic contributions to categorical traits is not a simple matter of applying a few modifications to existing binary models. Further investigation of panic disorder as an ordinal trait is needed to establish more complete bounds on the range of possible penetrance models. In addition, further methods development, such as a Bayesian treatment of the penetrance matrix, would enable us to analyze categorical traits without specifying the penetrance matrix in advance.
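As a reminder of why a strongly negative LOD score counts as evidence against a model, the sketch below computes a LOD score as the base-10 logarithm of a likelihood ratio and sums it over independent subpedigrees, as is standard when families are analyzed separately. The function names are hypothetical, the log-likelihood values are invented for illustration, and the thresholds of -2 (exclusion) and 3 (support) are the conventional ones in linkage analysis rather than values taken from our analysis.

```python
import math

def lod_score(loglik_linked, loglik_unlinked):
    """LOD = log10 of the likelihood ratio, given natural-log likelihoods
    under the linkage model and under the no-linkage null."""
    return (loglik_linked - loglik_unlinked) / math.log(10)

def total_lod(family_logliks):
    """Sum LOD scores over independent families (or subpedigrees).

    family_logliks is a list of (loglik_linked, loglik_unlinked) pairs.
    """
    return sum(lod_score(linked, unlinked) for linked, unlinked in family_logliks)

def interpret(lod, exclude_below=-2.0, support_above=3.0):
    """Classify a LOD score using conventional thresholds (assumed here)."""
    if lod <= exclude_below:
        return "model excluded"
    if lod >= support_above:
        return "evidence for linkage"
    return "inconclusive"

# Invented log-likelihoods for three subpedigrees under one penetrance model.
families = [(-42.0, -39.5), (-17.3, -15.1), (-28.9, -27.0)]
combined = total_lod(families)
verdict = interpret(combined)
```

In this invented example every family is less likely under the linkage model than under the null, so the per-family LOD scores are all negative and their sum falls below the exclusion threshold. The same data analyzed under a different penetrance matrix could yield a positive total, which is the pattern described above for the binary and trichotomous analyses.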
We have implemented our method in the software LOCate, available at https://sourceforge.net/projects/categorical. LOCate is an effective and versatile approach for single marker analysis of nominal, ordinal, and binary traits on arbitrary family-sized pedigrees, including those with inbreeding loops and missing phenotypes and/or genotypes. While our method currently has scaling limitations for larger pedigrees, we are developing extensions for LOCate that make use of variable elimination to make the method available for multimarker analysis as well as the analysis of arbitrarily sized linkage studies.