Sunday, December 22, 2024

How do lifestyle factors modify the association between genetic predisposition and obesity-related phenotypes? A 4-way decomposition analysis using UK Biobank – BMC Medicine


Study design

This cross-sectional study used baseline data from the UK Biobank, a population cohort study. From April 2006 to December 2010, 502,536 participants, most of whom were between 40 and 70 years old, were recruited [17]. Participants attended one of the 22 assessment centres across England, Wales, and Scotland, at which they completed a touch-screen questionnaire (including self-reported physical activity level, total energy intake, dietary intake, sleep duration, smoking frequency, and alcohol consumption), underwent physical measurements, and provided biological samples, as described elsewhere [18].

The main outcome measures investigated in this study were obesity and central obesity. BMI-PRS and WHR-PRS were the exposures of interest [19]. Lifestyle factors—physical activity level, total energy intake, diet quality, frequency of alcohol consumption, and smoking status—were investigated as potential mediators and effect modifiers. Sex, age, and sociodemographic deprivation were considered potential confounders and included as covariates in the statistical models, and the WHR-PRS models were stratified by sex because of a significant statistical interaction between WHR-PRS and sex.

Inclusion in the study was restricted to participants who self-reported white British ethnicity (> 90% of the sample), to avoid heterogeneity, and to those whose BMI was ≥ 18.5 kg/m² (i.e. non-underweight), to avoid non-linear associations of BMI. We excluded participants who reported never drinking alcohol because of potential confounding (e.g. having stopped due to poor health), and those who had missing data on BMI-PRS and WHR-PRS and/or failed genetic quality control. Overall, 312,748 eligible participants had genetic data available for use in this study. After excluding people who had missing data on physical activity level, diet risk score, alcohol consumption, sleep duration, and smoking status, the final study sample was n = 201,446. UK Biobank received ethical approval from the North-West Multi-centre Research Ethics Committee (reference: 11/NW/03820). All participants provided written, informed consent based on the principles of the Declaration of Helsinki before enrolment in the study. This project was completed using UK Biobank data application 71392 (Fig. 1).

Fig. 1

Outcomes and covariates

During the baseline assessment, participants’ height (Seca 202 stadiometer; Seca) and weight (BC-418 MA body composition analyzer; Tanita Corp) were measured by trained nurses [20]. Waist and hip circumference were measured using a Wessex non-stretchable sprung tape measure [21]. BMI was calculated as weight in kilogrammes divided by height in metres squared, and WHR was calculated as waist measurement divided by hip measurement [22]. Obesity was defined as BMI ≥ 30 kg/m². Central obesity was defined as WHR ≥ 0.85 in women and WHR ≥ 0.90 in men.
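The outcome definitions above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not the authors' code; the function names and the "female"/"male" sex coding are assumptions made for the example.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def whr(waist: float, hip: float) -> float:
    """Waist-to-hip ratio; both measurements must use the same unit."""
    return waist / hip


def is_obese(weight_kg: float, height_m: float) -> bool:
    """Obesity as defined in the text: BMI >= 30 kg/m^2."""
    return bmi(weight_kg, height_m) >= 30.0


def is_centrally_obese(waist: float, hip: float, sex: str) -> bool:
    """Central obesity: WHR >= 0.85 in women, WHR >= 0.90 in men.

    The "female"/"male" coding is an assumption of this sketch.
    """
    thresholds = {"female": 0.85, "male": 0.90}
    if sex not in thresholds:
        raise ValueError(f"unknown sex code: {sex!r}")
    return whr(waist, hip) >= thresholds[sex]
```

For instance, a waist of 88 cm and a hip of 100 cm give a WHR of 0.88, which meets the threshold for women but not for men.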

Area-based socioeconomic status was measured by the Townsend score, which was derived from census data on housing, employment, social class, and car availability by postcode of residence [23]. A higher Townsend score represents a higher level of deprivation. More detailed information can be found in the UK Biobank online protocol (http://www.ukbiobank.ac.uk).

In the baseline assessment, participants self-reported their physical activity level using the International Physical Activity Questionnaire (IPAQ) [24]. Low physical activity level was defined as 24].

Dietary information was collected via the Oxford WebQ questionnaire, which is based on self-reported 24-h recall. It records the usual consumption of a range of foods and was designed for use in large population studies [25]. Participants were invited on five occasions to complete an online questionnaire between April 2009 and June 2012. For participants who completed more than one questionnaire, we derived the average intake from the questionnaires completed. Total energy intake and energy derived from each macronutrient were calculated, in kilocalories per day, using the information recorded in the 7th edition of McCance and Widdowson’s The Composition of Foods [26]. High total energy intake was defined as > 2,000 kcal/day for women and > 2,500 kcal/day for men, in accordance with the NHS guideline (https://www.nhs.uk/live-well/healthy-weight/managing-your-weight/understanding-calories/). The sample size for analysis using these energy intake variables was 95,437.
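As a rough illustration of the averaging and thresholding described above, the sketch below takes up to five recall values per participant, with None standing for a questionnaire that was not completed. The function names, the None convention, and the "female"/"male" coding are assumptions of the example rather than details from the study.

```python
from statistics import mean
from typing import Optional, Sequence

# Thresholds stated in the text (kcal/day), keyed by an assumed sex coding.
HIGH_ENERGY_KCAL = {"female": 2000.0, "male": 2500.0}


def mean_energy_intake(recalls_kcal: Sequence[Optional[float]]) -> Optional[float]:
    """Average energy intake over the questionnaires actually completed."""
    completed = [v for v in recalls_kcal if v is not None]
    if not completed:
        return None  # no dietary recall available for this participant
    return mean(completed)


def has_high_energy_intake(
    recalls_kcal: Sequence[Optional[float]], sex: str
) -> Optional[bool]:
    """True if the average intake exceeds the sex-specific threshold."""
    average = mean_energy_intake(recalls_kcal)
    if average is None:
        return None
    return average > HIGH_ENERGY_KCAL[sex]
```

A participant with two completed recalls of 1,900 and 2,300 kcal/day averages 2,100 kcal/day, which would count as high intake for a woman but not for a man.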

Since the dietary recall was available in less than half of the UK Biobank participants, this study used a cumulative dietary quality score [27] from the food frequency questionnaire, which was completed by most participants. Twenty-one of the 27 items in the score were deemed to be relevant to the study and therefore included: cooked vegetables, salad/raw vegetables, fresh fruit, dried fruit, oily fish, non-oily fish, processed meat, poultry, beef, lamb, pork, cheese, milk type used, spread type, bread type, cereal intake, cereal type, salt added to food, tea, coffee, and water. We then collapsed beef, pork, and lamb into red meat; oily fish and non-oily fish into total fish; and fresh fruit, dried fruit, salad vegetables, and cooked vegetables into fruit and vegetables, resulting in 15 items. Six of these had unknown or uncertain associations with health outcomes (such as poultry) or were not available for the full cohort (such as cereal type) and, therefore, were not included in the cumulative dietary quality score. Finally, we included the remaining nine of the 15 food items in the score (processed meat, red meat, total fish, milk, spread type, cereal intake, salt added to food, water, and fruits and vegetables). As food item data were collected using various frequencies of consumption (such as never, less than once a week, or once a week), all food items were dichotomised into meeting or not meeting recommendations using cut-offs derived from the UK and European food-based dietary guidelines (the Eatwell Guide [28] and the Food-Based Dietary Guidelines from the European Food Safety Authority [29]) or median food intake where specific recommendations did not exist [30].
We assigned one point to participants for each healthy category met, defined as processed meat less than once per week [28, 31]; red meat less than once per week [28, 31]; total fish more than twice per week [28]; no consumption of full-cream milk or non-dairy milk [28, 29]; no intake of spread [30]; more than five bowls per week of cereal [30]; no salt added to food [28, 29]; more than six glasses per day of water; and more than five servings per day of fruit and vegetables [28, 29]. Participants’ points were summated to create an unweighted score, with a minimum score of 0 representing the least healthy diet, and a maximum score of 9 representing the healthiest diet. Low diet quality was defined as a diet quality score 
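The nine-item unweighted score can be sketched as follows. The record layout and field names are hypothetical, and the raw UK Biobank frequency categories would first need to be mapped onto these numeric or yes/no fields; the cut-offs are the ones listed in the text.

```python
from dataclasses import dataclass


@dataclass
class DietRecord:
    # All field names are hypothetical, chosen for this illustration only.
    processed_meat_per_week: float
    red_meat_per_week: float
    fish_per_week: float
    full_cream_or_non_dairy_milk: bool  # True if the participant consumes either
    uses_spread: bool
    cereal_bowls_per_week: float
    adds_salt: bool
    water_glasses_per_day: float
    fruit_veg_servings_per_day: float


def diet_quality_score(d: DietRecord) -> int:
    """One point per healthy category met; 0 = least healthy, 9 = healthiest."""
    criteria = [
        d.processed_meat_per_week < 1,
        d.red_meat_per_week < 1,
        d.fish_per_week > 2,
        not d.full_cream_or_non_dairy_milk,
        not d.uses_spread,
        d.cereal_bowls_per_week > 5,
        not d.adds_salt,
        d.water_glasses_per_day > 6,
        d.fruit_veg_servings_per_day > 5,
    ]
    return sum(criteria)  # each True counts as one point
```

Because every item carries equal weight, the score only records how many recommendations were met, not which ones.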

Smoking status was self-reported at baseline and classified as either ever smoker (current or former smoker) or never smoker. Alcohol intake was self-reported as the number of units consumed per week, and > 14 units/week was defined as high alcohol intake. Self-reported sleep duration was categorised into abnormal sleep duration (< 7 or > 9 h/day) and normal sleep duration (7–9 h/day) [16].
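A minimal sketch of these three binary lifestyle flags is given below. It treats any sleep duration outside the stated normal range of 7–9 h/day as abnormal; the function names and the "current"/"former"/"never" labels are assumptions made for the example.

```python
def is_ever_smoker(smoking_status: str) -> bool:
    """Ever smoker = current or former smoker (labels are assumed)."""
    if smoking_status not in {"current", "former", "never"}:
        raise ValueError(f"unknown smoking status: {smoking_status!r}")
    return smoking_status != "never"


def has_high_alcohol_intake(units_per_week: float) -> bool:
    """High alcohol intake: more than 14 units per week."""
    return units_per_week > 14


def has_abnormal_sleep(hours_per_day: float) -> bool:
    """Abnormal sleep: outside the normal range of 7 to 9 hours per day."""
    return hours_per_day < 7 or hours_per_day > 9
```

Each flag points in the deleterious direction, so True always marks the less healthy category.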

Exposures

For this study, we used the updated genetic data (October 2018), which are available for 488,377 participants [32]. Of these, 438,427 samples were genotyped using the Affymetrix UK Biobank Axiom array with 825,927 markers (Santa Clara, CA, USA), and the remaining 49,950 were genotyped using the Affymetrix UK BiLEVE Axiom array with 807,411 markers. These two arrays are extremely similar (sharing more than 95% of their content). To maximise homogeneity and BMI-PRS applicability, we excluded participants who did not self-report their ethnicity as white British, and those with missing data on BMI-PRS and WHR-PRS. Further information on the genotyping process is available on the UK Biobank website (http://www.ukbiobank.ac.uk/scientists-3/genetic-data), which includes detailed technical documentation (https://www.nature.com/articles/s41586-018-0579-z).

We used a standard set of sample quality-control procedures: statistical tests designed mainly to check for consistency of genotype calling across experimental factors, the indicators of missing rate and heterozygosity to identify poor-quality samples, and quality control specific to the sex chromosomes using a set of high-quality markers on the X and Y chromosomes [32]. We only used markers present on both the UK BiLEVE and UK Biobank Axiom arrays and excluded markers that failed quality control in more than one batch, had a greater than 5% overall missing rate, and had 32].

LDpred [33] was used to generate the BMI-PRS [34] and WHR-PRS [19]. LDpred adjusts GWAS summary statistics to account for linkage disequilibrium (LD) between SNPs, creating a single genome-wide score using an infinitesimal model. The raw summary statistics are adjusted using 1000 unrelated UK Biobank participants as the LD reference panel, who were not used in the main analyses. These participants are white British, have a self-reported sex that matches their genetically determined sex, do not have purported sex chromosome aneuploidy, and are not determined by UK Biobank to be outliers for heterozygosity. Scores are then generated using these LD-adjusted summary statistics in those who pass the same genetic quality control as above and were not used in the LD reference panel.

Statistical analyses

Participant characteristics were firstly compared by BMI-PRS and WHR-PRS categories. The weighted PRS scores were transformed into z scores and categorised as PRS  1. All the lifestyle factors were classified as binary variables in the pre-specified deleterious direction. The sociodemographic and lifestyle characteristics of the PRS categories were summarised using frequencies and percentages and compared using chi-square tests.

The first set of analyses focused on mediation. Logistic regression was used to investigate whether lifestyle factors mediated the associations between PRS and obesity. In this analysis, BMI-PRS and WHR-PRS were the exposures, and the outcomes were obesity (BMI ≥ 30 kg/m²) and central obesity (WHR ≥ 0.90 for men and WHR ≥ 0.85 for women), respectively. We tested for statistical interactions between sex and BMI-PRS and WHR-PRS. Where the interactions were significant, the models were run stratified by sex. The models were adjusted for potential confounders (age, sex, and deprivation), the 10 principal genetic components (PGC) (to correct for population stratification), and the genotyping chip used. Each of the lifestyle factors was then added sequentially.

The second analysis focuses on interaction. Stratified logistic regression models were then used to test for interactions between PRS and lifestyle factors. Separate models were run for each lifestyle factor. Interaction terms were included to investigate whether individual lifestyle factors interacted with obesity-related PRS (low PRS defined as PRS 35]. The dichotomisation of PRS is required for calculating RERI and is only used for this analysis.
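For readers unfamiliar with RERI (relative excess risk due to interaction), the standard formula for two binary exposures can be sketched as below, with odds ratios standing in for relative risks. This illustrates the general measure only; the argument names and the example values are made up and are not results from this study.

```python
def reri(or_both: float, or_prs_only: float, or_lifestyle_only: float) -> float:
    """Relative excess risk due to interaction on the additive scale.

    All odds ratios are relative to the doubly unexposed group
    (low PRS and healthy lifestyle), whose odds ratio is 1 by definition.
    RERI > 0 suggests a more-than-additive joint effect; RERI = 0 suggests
    no additive interaction; RERI < 0 suggests a less-than-additive effect.
    """
    return or_both - or_prs_only - or_lifestyle_only + 1.0
```

With illustrative odds ratios of 4.0 (both exposures), 2.0 (high PRS only), and 1.5 (unhealthy lifestyle only), the RERI is 4.0 − 2.0 − 1.5 + 1 = 1.5, which would indicate a positive additive interaction. A binary exposure is needed to define the four exposure groups, which is why the PRS is dichotomised for this analysis.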

Finally, a 4-way decomposition was used to quantify how much of the total association between BMI-PRS/WHR-PRS and obesity and central obesity could be attributed to mediation, additive interaction, or neither [36]. The association between PRS and outcome was estimated as an odds ratio (OR) in the logistic regression model. This method further decomposes the OR (‘total effect’) into: OR via interaction with lifestyle (‘effect due to interaction’), OR via mediation through lifestyle (‘mediated effect’), and OR not via lifestyle (‘direct effect’). Because the logistic regression model operates on the logistic scale, the decomposition could be interpreted as total effect = effect due to interaction × effect due to mediation × effect not due to lifestyle. The results were presented as the overall proportion of excess prevalence attributable to additive interaction ([effect due to interaction − 1]/[total effect − 1]) and mediation ([mediated effect − 1]/[total effect − 1]). In the 4-way decomposition analyses, ‘total lifestyle’ represents the mediation/modification role of all lifestyle factors combined (PAL, diet quality, etc.), adjusted for age, sex, deprivation, genetic principal components, and genotyping chip. In a sensitivity analysis, all lifestyle factors were also adjusted mutually for a conservative estimate. Because there is a slight difference in BMI-PRS by sex, we conducted a sensitivity analysis that adjusted for the BMI-PRS*sex interaction as a covariate in the final 4-way decomposition analysis. All statistical analyses were conducted using R, with the cmest function from the CMAverse package and two-sided P 
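The proportion formula described above can be illustrated with a short Python sketch. It is not a substitute for the CMAverse estimation, which produces the component odds ratios in the first place; the function name and the example values are hypothetical.

```python
def proportion_attributable(component_or: float, total_or: float) -> float:
    """Share of the excess association attributable to one component.

    Implements (component OR - 1) / (total OR - 1), as described in the text.
    """
    if total_or == 1.0:
        raise ValueError("total effect of 1 means there is no excess to apportion")
    return (component_or - 1.0) / (total_or - 1.0)


# Illustrative odds ratios only; these are not results from the study.
total_effect = 2.0
effect_due_to_interaction = 1.2
mediated_effect = 1.1

share_interaction = proportion_attributable(effect_due_to_interaction, total_effect)
share_mediation = proportion_attributable(mediated_effect, total_effect)
```

With these made-up values, about 20% of the excess would be attributed to additive interaction and about 10% to mediation, leaving the remainder to pathways not involving lifestyle.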
