Most traits of interest in biology and biomedicine are multifactorial, controlled by multiple genes that interact in complex ways with each other and with environmental factors [1]. For this reason, genetic analysis of these traits has been difficult, despite tremendous efforts to develop genetic theory and methods over the past century. Recent advances in molecular technologies have revolutionized the analysis of multifactorial traits by making it possible to dissect them into their underlying quantitative trait loci (QTLs) using DNA-based markers. Lander and Botstein [2] proposed a statistical model for interval mapping of individual QTLs that contribute to a complex trait. A considerable body of literature on the methodological development of QTL mapping has since aimed at improving the precision of QTL detection [3, 4] and broadening the scope of utility of this approach [5–7]. Specific statistical issues for QTL mapping have also been considered in several areas, including the determination of critical thresholds [8, 9], model selection [10], nonparametric mapping of QTLs [11], and asymptotic properties of QTL parameter estimates [12]. Because of its many favorable properties, the Bayesian approach was introduced to QTL mapping first by Satagopan et al. [13] and subsequently by a number of other researchers [14].

There is increasing recognition of the limitations of traditional QTL mapping approaches, which capitalize only on a single measurement of a complex trait at one time point, given that all biological traits or diseases undergo a dynamic process across time and spatial scales. To better describe the dynamic pattern of trait progression, it is crucial to measure phenotypic values of a trait longitudinally at multiple time points, which allows a comprehensive analysis of how genes govern time-specific changes of the trait. These so-called dynamic traits can now be mapped with a dynamic model that is capable of studying the temporal pattern of QTL effects on trait progression [15]. This model approximates the time-dependent mean vectors for individual QTL genotypes, and tests their differences, either parametrically with biologically relevant curves, such as logistic equations for growth data [15], or nonparametrically when no explicit parametric equation exists for the functional data [16–18]. It provides a general, quantitative, and testable platform for assessing the interplay between genetic actions and developmental patterns [19] and has been used as a mapping tool in a number of areas, such as allometric scaling, bird flight, thermal reaction norms, HIV-1 dynamics, tumor progression, biological clocks, and drug response [20–24].
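As a rough illustration of the parametric case described above, the following Python sketch evaluates a logistic mean curve for each QTL genotype at one subject's own measurement times and then computes the posterior probability of each genotype under a two-component mixture. It is a minimal sketch under stated assumptions, not the model of [15]: the curve parameters, the genotype labels, the equal prior weights, and the independent normal residuals are all hypothetical choices made only to keep the example short.

```python
import math


def logistic_mean(t, a, b, r):
    """Logistic growth curve g(t) = a / (1 + b * exp(-r * t))."""
    return a / (1.0 + b * math.exp(-r * t))


def mean_vector(times, params):
    """Genotype-specific mean vector evaluated at a subject's own time points."""
    a, b, r = params
    return [logistic_mean(t, a, b, r) for t in times]


def log_normal_density(y, mu, sigma):
    """Log density under independent normal residuals (a simplifying assumption)."""
    return sum(
        -0.5 * math.log(2.0 * math.pi * sigma ** 2)
        - (yi - mi) ** 2 / (2.0 * sigma ** 2)
        for yi, mi in zip(y, mu)
    )


def genotype_posteriors(times, y, params_by_genotype, prior, sigma):
    """Posterior probability of each QTL genotype given one subject's trajectory."""
    logs = {
        g: math.log(prior[g]) + log_normal_density(y, mean_vector(times, p), sigma)
        for g, p in params_by_genotype.items()
    }
    top = max(logs.values())  # subtract the maximum for numerical stability
    weights = {g: math.exp(v - top) for g, v in logs.items()}
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items()}


# Hypothetical (a, b, r) values for two QTL genotypes; not estimates from any data.
params_by_genotype = {"QQ": (30.0, 9.0, 0.8), "qq": (25.0, 9.0, 0.8)}
prior = {"QQ": 0.5, "qq": 0.5}

times = [0.5, 1.7, 4.0, 9.0]  # irregularly spaced measurement times
y = mean_vector(times, params_by_genotype["QQ"])  # noise-free trajectory on the QQ curve
post = genotype_posteriors(times, y, params_by_genotype, prior, sigma=1.0)
```

Because the mean is a function of time rather than a fixed-length vector, the same curve parameters can be evaluated at any subject-specific schedule, which is what makes this formulation convenient for irregular data.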

In many longitudinal trials, data are collected at irregularly spaced time points and with measurement schedules specific to different subjects. Efficient estimation of the covariance structure in this situation presents a major challenge for genetic mapping of dynamic traits. The motivation of this study is to develop a robust approach for joint modeling of the mean-covariance structures of subject-specific irregular longitudinal data. Within a generalized linear model framework, covariates were used to model the mean function by McCullagh and Nelder [25] and the covariance matrix by Pourahmadi [26]. Pan and Mackenzie [27] generalized Pourahmadi’s setting to sparse irregular longitudinal data and implemented iteratively re-weighted least squares (IRLS) algorithms to estimate the model parameters by maximizing the likelihood. An alternative way to model the covariance structure is to shrink the covariance matrix toward a specified structure using Bayesian hierarchical models. A popular and standard prior for the covariance matrix is the inverse Wishart distribution, which is conjugate for the covariance matrix of a normal distribution. Daniels and Kass [28] showed that such a prior could enhance computing efficiency. Better priors for shrinking the covariance toward a known structure were proposed by Daniels and Pourahmadi [29]. All of these methods guarantee that the estimated covariance matrix is positive definite.
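To make the positive-definiteness guarantee concrete, the sketch below assumes that the covariance model is parameterized through a modified Cholesky decomposition, in which each measurement is regressed on its predecessors and the log innovation variances are modeled linearly. The particular covariates used here (a linear function of the time lag for the autoregressive coefficients, a linear function of time for the log innovation variances) and the parameter names `gamma` and `lam` are illustrative assumptions, not the specification used in this article. Under that assumption, any real-valued parameters yield a valid covariance matrix, and the matrix can be built for any subject-specific set of time points.

```python
import math


def covariance_from_times(times, gamma, lam):
    """Build a covariance matrix for one subject's measurement times.

    Assumed (illustrative) regression models:
      phi_jk        = gamma[0] + gamma[1] * (t_j - t_k)   autoregressive coefficients
      log(sigma2_j) = lam[0] + lam[1] * t_j               log innovation variances
    """
    m = len(times)
    innov_var = [math.exp(lam[0] + lam[1] * t) for t in times]

    # L is the inverse of the unit lower-triangular matrix T, built row by row:
    # row j of L equals e_j + sum over k < j of phi_jk * (row k of L).
    L = [[0.0] * m for _ in range(m)]
    for j in range(m):
        L[j][j] = 1.0
        for k in range(j):
            phi = gamma[0] + gamma[1] * (times[j] - times[k])
            for c in range(k + 1):
                L[j][c] += phi * L[k][c]

    # Sigma = L D L', with D the diagonal matrix of innovation variances.
    return [
        [sum(L[i][c] * innov_var[c] * L[j][c] for c in range(m)) for j in range(m)]
        for i in range(m)
    ]


def is_positive_definite(a):
    """Attempt a plain Cholesky factorization; fail on a non-positive pivot."""
    n = len(a)
    chol = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = a[i][j] - sum(chol[i][c] * chol[j][c] for c in range(j))
            if i == j:
                if s <= 0.0:
                    return False
                chol[i][i] = math.sqrt(s)
            else:
                chol[i][j] = s / chol[j][j]
    return True


# Two hypothetical subjects with different, irregular measurement schedules.
gamma, lam = (0.6, -0.05), (0.0, 0.1)
sigma_a = covariance_from_times([0.3, 1.1, 1.9, 4.2, 7.5], gamma, lam)
sigma_b = covariance_from_times([0.8, 2.5, 6.0], gamma, lam)
```

The same pair of parameter vectors produces a 5 x 5 matrix for the first subject and a 3 x 3 matrix for the second, so subjects with different schedules share one covariance model without any matrix having to be constrained directly.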

In this article, we present a Bayesian approach for semiparametric modeling of the mean and covariance structures of irregular longitudinal data within a genetic mapping framework built on a finite mixture model. Bayesian algorithms for genetic mapping of dynamic traits have already been proposed by Liu and Wu [30] and Heuven and Janss [31]. Unlike their parametric modeling, however, we model the mean structure by a penalized spline [32] and the covariance structure by a generalized linear model [26]. This semiparametric modeling is particularly robust for dynamic trait mapping with irregular longitudinal data. Proper priors were tested and chosen to obtain a smooth mean curve and meaningful covariance parameters. We derive the full conditional posterior distributions of the model parameters and then use the Gibbs sampler and the Metropolis-Hastings algorithm to estimate them. The new model was used to map QTLs responsible for age-specific changes in body mass index (BMI) in a random sample of 977 subjects from the Framingham Heart Study [33]. As a heuristic measure of body fatness based on a person’s weight and height, BMI is the most widely used diagnostic tool for identifying whether individuals are underweight, overweight, or obese and for assessing their risk of developing obesity-related diseases, such as hypertension, type 2 diabetes, and cardiovascular diseases [34]. By mapping genes associated with BMI trajectories, we hope to diagnose and predict the timing of pathogenesis for these diseases based on individual patients’ genetic makeup. To validate the usefulness of the new model for QTL mapping, we performed simulation studies to investigate its statistical properties.
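The general sampling strategy mentioned above, drawing some parameters directly from their full conditional distributions (Gibbs steps) and updating the others with Metropolis-Hastings steps, can be sketched on a toy problem. The example below is not the sampler derived in this article: it assumes a simple normal model with a flat prior on the mean and a normal prior on the log variance, and the data values, step size, and prior scale are made-up numbers chosen only for illustration.

```python
import math
import random


def log_cond_eta(eta, y, mu, prior_sd=10.0):
    """Log full conditional (up to a constant) of eta = log variance."""
    n = len(y)
    ss = sum((yi - mu) ** 2 for yi in y)
    return -0.5 * n * eta - 0.5 * ss / math.exp(eta) - 0.5 * (eta / prior_sd) ** 2


def sample(y, n_iter=3000, step=0.5, seed=1):
    """Metropolis-within-Gibbs sampler for the toy model y_i ~ N(mu, exp(eta))."""
    rng = random.Random(seed)
    n = len(y)
    ybar = sum(y) / n
    mu, eta = ybar, 0.0
    draws, accepted = [], 0
    for _ in range(n_iter):
        # Gibbs step: with a flat prior, mu | eta, y is N(ybar, exp(eta) / n).
        mu = rng.gauss(ybar, math.sqrt(math.exp(eta) / n))

        # Metropolis-Hastings step: symmetric random-walk proposal for eta.
        proposal = eta + rng.gauss(0.0, step)
        log_ratio = log_cond_eta(proposal, y, mu) - log_cond_eta(eta, y, mu)
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            eta = proposal
            accepted += 1

        draws.append((mu, eta))
    return draws, accepted / n_iter


# Made-up observations used only to exercise the sampler.
y = [24.1, 26.3, 22.8, 27.5, 25.0, 23.9, 26.8, 25.4, 24.6, 25.9]
draws, acceptance_rate = sample(y)

burn_in = 500
kept = draws[burn_in:]
posterior_mean_mu = sum(m for m, _ in kept) / len(kept)
```

The mean has a closed-form full conditional and is drawn exactly, whereas the log variance does not under this prior and is updated by an accept-reject step. This split mirrors the usual reason for combining the two algorithms in one sampler.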