A Brief Overview of Professor Pak Chung Sham's Academic Contributions

#Professor Pak Chung Sham's Lifetime Academic Contributions

Research achievements and contributions: Prof. Pak Chung Sham

1. Methods and programs for genetic linkage analysis

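Before genome-wide association analysis emerged in 2005, linkage analysis was an important method in human genetics for identifying genetic variants underlying Mendelian and complex diseases. It is particularly useful for discovering rare, high-penetrance pathogenic variants. With the development of whole genome sequencing, linkage analysis may still play a role in revealing high-penetrance variants in families segregating rare diseases. I have contributed to the development of key statistical methods for linkage analysis. Early contributions include a novel likelihood-based method for linkage mapping (Curtis and Sham 1995) that does not require the parameters of a transmission model to be specified. We then studied and proposed methods for evaluating the power of linkage and association analysis of quantitative trait loci (QTL) using sibship data (Sham et al. 2000), and demonstrated the equivalence of the Haseman-Elston and variance-components methods in sib-pair linkage analysis (Sham and Purcell 2001). Finally, we developed a powerful statistical method and computer program for QTL linkage analysis that combines the advantages of regression and variance-components models (Sham et al. 2002).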

Arranz M, Collier D, Sodhi M, Ball D, Roberts G, Price J, Sham PC, & Kerwin R. (1995) Association between clozapine response and allelic variation in 5-HT2A receptor gene. Lancet, 346, 281-282.
Curtis D, Sham PC (1995) Model-Free Linkage Analysis Using Likelihoods. American Journal of Human Genetics 57: 703-716.
Sham PC, Cherny SS, Purcell S, Hewitt JK (2000) Power of linkage versus association analysis of quantitative traits, by use of variance-components models, for sibship data. American Journal of Human Genetics 66: 1616-1630. doi: 10.1086/302891
Sham PC, Purcell S (2001) Equivalence between Haseman-Elston and variance-components linkage analyses for sib pairs. American Journal of Human Genetics 68: 1527-1532. doi: 10.1086/320593
Sham PC, Purcell S, Cherny SS, Abecasis GR (2002) Powerful regression-based quantitative-trait linkage analysis of general pedigrees. American Journal of Human Genetics 71: 238-253. doi: 10.1086/341560

2. Methods and programs for genome-wide association analysis

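Genome-wide association studies (GWAS) are essential for identifying the genetic variants that confer susceptibility to common diseases. Identifying these variants not only enables prediction of an individual's disease risk but also supports deeper investigation of disease mechanisms. Since 2005, GWAS have become an upstream gold standard and an important starting point for research on the biological mechanisms of disease and the mapping of causal genes. I led my PhD student Shaun Purcell and other collaborators in developing PLINK, a program that has become a standard tool for GWAS analysis and has been cited more than 30,000 times according to Google Scholar (Purcell et al. 2007). We also developed a widely used program for calculating the statistical power of association analyses (Purcell et al. 2003). In earlier work, we proposed a method for association analysis of family-based data on highly polymorphic multi-allelic markers (Sham and Curtis 1995), and we were among the first teams to propose gene-based association analysis (Neale and Sham 2004). Subsequently, we developed two programs, GATES and HYST, for gene-based and pathway-based analysis in genome-wide association studies (Li et al, 2011; Li et al, 2012). Together with Shaun Purcell, I reviewed the general principles of analysis and power calculation in large-scale genetic association studies in Nature Reviews Genetics (Sham & Purcell, 2014).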

Sham PC, & Curtis D. (1995). An extended transmission/disequilibrium test (TDT) for multi-allele marker loci. Annals of Human Genetics, 59, 323-336.
Purcell S, Cherny SS, Sham PC (2003) Genetic Power Calculator: design of linkage and association genetic mapping studies of complex traits. Bioinformatics 19: 149-150. doi: 10.1093/bioinformatics/19.1.149
Neale BM, Sham PC (2004) The future of association studies: Gene-based analysis and replication. American Journal of Human Genetics 75: 353-362. doi: 10.1086/423901
Purcell S, Neale B, Todd-Brown K, Thomas L, Ferreira MAR, Bender D, Maller J, Sklar P, de Bakker PIW, Daly MJ, Sham PC (2007) PLINK: A tool set for whole-genome association and population-based linkage analyses. American Journal of Human Genetics 81: 559-575. doi: 10.1086/519795
Li MX, Gui HS, Kwan JSH, & Sham PC. (2011). GATES: A Rapid and Powerful Gene-Based Association Test Using Extended Simes Procedure. American Journal of Human Genetics, 88(3), 283-293.
Li M-X, Kwan JSH, & Sham PC. (2012). HYST: A Hybrid Set-Based Test for Genome-wide Association Studies, with Application to Protein-Protein Interaction-Based Association Analysis. American Journal of Human Genetics, 91(3), 478-488.
Sham PC, & Purcell SM (2014) Statistical power and significance testing in large-scale genetic studies. Nature Reviews Genetics, 15, 335-346.

3. Heritability analysis and risk prediction based on GWAS data

GWAS data provide valuable information for understanding the genetic architecture of common diseases and for building models that predict individual disease risk. I have also contributed to methodological development in this area. We were among the first teams to point out that "heritability explained" determines the accuracy of genetic risk prediction, and we proposed a unifying framework for calculating all commonly used indices of predictive accuracy from the heritability explained (So and Sham 2010). We then developed methods for estimating the heritability explained by individual genetic variants (So et al. 2011a) and by all variants on a GWAS panel (So et al. 2011b). These methods require only GWAS summary statistics, not raw genotype and phenotype data. We also proposed a new method that combines family history with genetic information for risk prediction, and showed how it can be applied to personalized screening for breast and prostate cancer (So et al. 2011c). In addition, we introduced methodological innovations to improve the accuracy of polygenic scores (Mak et al, 2016; Mak et al, 2017); the lassosum software package that we published was among the earliest published methods for constructing polygenic risk scores. These methods have the potential to be applied in personalized preventive medicine.

So HC, Sham PC (2010) A unifying framework for evaluating the predictive power of genetic variants based on the level of heritability explained. PLoS Genet 6: e1001230. doi: 10.1371/journal.pgen.1001230
So HC, Gui AH, Cherny SS, Sham PC (2011a) Evaluating the heritability explained by known susceptibility variants: a survey of ten complex diseases. Genet Epidemiol 35: 310-7. doi: 10.1002/gepi.20579
So HC, Li M, Sham PC (2011b) Uncovering the total heritability explained by all true susceptibility variants in a genome-wide association study. Genet Epidemiol 35: 447-56. doi: 10.1002/gepi.20593
So HC, Kwan JS, Cherny SS, Sham PC (2011c) Risk prediction of complex diseases from family history and known susceptibility loci, with applications for cancer screening. Am J Hum Genet 88: 548-65. doi: 10.1016/j.ajhg.2011.04.001
Mak, TSH, Porsch, RM, Choi, SW, Zhou, XY, Sham PC (2017) Polygenic scores via penalized regression on summary statistics. Genet Epidemiol 41: 469-80. doi: 10.1002/gepi.22050
Mak, TSH, Kwan, JS, Campbell, DD, Sham PC (2016) Local True Discovery Rate Weighted Polygenic Scores Using GWAS Summary Data. Behav Genet. 46(4): 573-82. doi: 10.1007/s10519-015-9770-2

4. Genetic epidemiology of psychiatric disorders

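Psychiatric disorders, particularly schizophrenia and bipolar disorder, are among the most common and disabling conditions in modern society. I have made a series of fundamental findings in the genetics and epidemiology of these disorders. In epidemiology, we found associations between schizophrenia and prenatal exposure to influenza epidemics (Sham et al. 1992) and to famine (St Clair et al, 2005), which suggest that abnormal brain development, together with immunological and nutritional mechanisms, may play a role in schizophrenia. We developed structural equation modeling (SEM) techniques for twin studies (Rijsdijk & Sham, 2002), which helped to demonstrate a shared genetic basis between different psychiatric phenotypes (McGuffin et al, 2003). Our team detected a signal associated with schizophrenia in the region of the MECP2 gene (Wong et al. 2014). We are also members of the Schizophrenia Working Group of the Psychiatric Genomics Consortium (PGC), which published the landmark paper reporting 108 independent schizophrenia-associated loci (Schizophrenia Working Group of the Psychiatric Genomics Consortium, 2014). Using polygenic risk scores trained on PGC GWAS data, we found that genetic effects on schizophrenia risk are linked to cognition-relevant pathways (Toulopoulou et al, 2019).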

O’Callaghan E, Sham PC, Takei N, Glover G, & Murray RM. (1991) Schizophrenia after the exposure to 1957 A2 influenza epidemic. Lancet, 337, 1248-1250.
Sham PC, O’Callaghan E, Takei N, Murray GK, Hare EH, Murray RM (1992) Schizophrenia Following Prenatal Exposure to Influenza Epidemics between 1939 and 1960. British Journal of Psychiatry 160: 461-466. doi: 10.1192/bjp.160.4.461
St Clair D, Xu MQ, Wang P, Yu YQ, Fang YR, Zhang F, Zheng XY, Gu NF, Feng GY, Sham PC, & He L. (2005). Rates of adult schizophrenia following prenatal exposure to the Chinese famine of 1959-1961. Journal of the American Medical Association (JAMA), 294(5), 557-562
Rijsdijk FV, & Sham PC. (2002). Analytic approaches to twin data using structural equation models. Briefings in Bioinformatics, 3(2), 119-133
McGuffin P, Rijsdijk F, Martin A, Sham PC, Katz R, & Cardno A. (2003). The heritability of bipolar affective disorder and genetic relationship to unipolar depression. Arch Gen Psychiatry, 60, 497-502.
Wong EH, So HC, Li M, Wang Q, Butler AW, Paul B, Wu HM, Hui TC, Choi SC, So MT, Garcia-Barcelo MM, McAlonan GM, Chen EY, Cheung EF, Chan RC, Purcell SM, Cherny SS, Chen RR, Li T, Sham PC (2014) Common variants on Xq28 conferring risk of schizophrenia in Han Chinese. Schizophr Bull 40: 777-86. doi: 10.1093/schbul/sbt104
Schizophrenia Working Group of the Psychiatric Genomics Consortium (2014) Biological insights from 108 schizophrenia-associated genetic loci. Nature 511: 421-7. doi: 10.1038/nature13595
So HC, Chau CKL, Chiu WT, Ho KS, Lo CP, Yim SHY, & Sham PC. (2017). Analysis of genome-wide association data highlights candidates for drug repositioning in psychiatry. Nature Neuroscience, 20, 1342-1349.

Hui CL, Honer WG, Lee EH, Chang WC, Chan SK, Chen ES, Pang EPF, Lui SSY, Chung DWS, Yeung WS, Ng RMK, Lo WTL, Jones PB, Sham PC, & Chen EYH. (2018). Long-term effects of discontinuation from antipsychotic maintenance following first-episode schizophrenia and related disorders: a 10 year follow-up of a randomised, double-blind trial. The Lancet Psychiatry, 5(5), 432-442.
Toulopoulou T, Zhang X, Cherny SS, Dickinson D, Berman KF, Straub RE, Sham PC, & Weinberger, DR. (2019). Polygenic risk score increases schizophrenia liability through cognition-relevant pathways. Brain, 142(2), 471-485
Wong SMY, Chen EYH, Suen YN, Wong CSM, Chang WC, Chan SKW, McGorry PD, Morgan C, Van Os J, McDaid D, Jones PB, Lam TH, Lam LCW, Lee EHM, Tang EYH, Ip CH, Ho WWK, McGhee SM, Sham PC, Hui CLM (2023) Prevalence, time trends and correlates of major depressive episode and other psychiatric conditions among young people amid major social unrest and COVID-19 in Hong Kong: a representative epidemiological study from 2019 to 2022. Lancet Regional Health Western Pacific 40 100881, 1-13

5. Methods and applications of genome sequencing

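Genome sequencing can capture rare variants that affect gene function and that contribute to both rare and common diseases in the population. We developed methods and programs for prioritizing the genes identified in exome sequencing and for predicting their pathogenicity (Li et al. 2012; Li et al, 2013a). Using exome sequencing, we obtained results in studies of causal mutations for spinocerebellar ataxia (Li et al, 2013b) and dilated cardiomyopathy (Tse et al, 2014), and we demonstrated the putative pathogenic mutation using cardiomyocytes derived from patient induced pluripotent stem cells (iPSCs). We also used exome sequencing to identify variants that influence blood lipid levels and coronary heart disease risk in Asian populations (Tang et al, 2014; Lu et al, 2017), paving the way for individualized risk prediction and prevention in East Asian populations. In addition, through exome sequencing we found that patients with schizophrenia who respond poorly to antipsychotic medication carry more rare damaging variants in genes related to the synaptic currents of AMPA/NMDA glutamatergic receptors (Wang et al, 2018), a finding with important implications for future drug development and precision treatment. We were also among the first teams to apply third-generation long-read sequencing to the genetics of schizophrenia, with results indicating that medium-sized structural variants also increase the risk of schizophrenia (Lee et al, 2023). This may help explain why common single nucleotide variants (SNPs) do not fully account for the heritability of schizophrenia estimated from twin studies.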

Li MX, Gui HS, Kwan JSH, Bao SY, Sham PC (2012) A comprehensive framework for prioritizing variants in exome sequencing studies of Mendelian diseases. Nucleic Acids Research 40: ARTN e5. doi: 10.1093/nar/gkr1257
Xu F, Wang W, Wang P, Li MJ, Sham PC, & Wang J. (2012). A fast and accurate SNP detection algorithm for next-generation sequencing data. Nature Communications, 3, 1258.

Li MX, Kwan JSH, Bao SY, Yang WL, Ho SL, Song YQ, Sham PC (2013a) Predicting Mendelian Disease-Causing Non-Synonymous Single Nucleotide Variants in Exome Sequencing Studies. PLoS Genetics 9: e1003143. doi: 10.1371/journal.pgen.1003143
Li M, Pang SYY, Song Y, Kung MHW, Ho SL, Sham PC (2013b) Whole exome sequencing identifies a novel mutation in the transglutaminase 6 gene for spinocerebellar ataxia in a Chinese family. Clinical Genetics 83: 269-273. doi: 10.1111/j.1399-0004.2012.01895.x
Tse HF, Ho JCY, Choi SW, Butler AW, Ng KM, Siu CW, Simpson MA, Lai WH, Chan YC, Au KW, Zhang JQ, Lay KWJ, Esteban MA, Nicholls JM, Alan C, & Sham PC. (2014). Patient-specific induced-pluripotent stem cells derived cardiomyocytes recapitulate the pathogenic phenotypes of dilated cardiomyopathy due to a novel DES mutation identified by whole exome sequencing. Human Molecular Genetics, 23(8), 2232-2233.
Tang CS, Zhang H, Cheung CYY, Xu M, Ho JCY, Zhou W, Cherny SS, Zhang Y, Holmen O, Au KW, Yu H, Xu L, Jia J, Porsch RM, Sun LJ, Xu WX, Zheng HP, Wong LY, Mu YM, Dou JT, Fong CHY, Wang SY, Hong XY, Dong LG, Liao YH, Wang JS, Lam LSM, Su X, Yan H, Yang ML, Chen J, Siu CW, Xie GQ, Woo YC, Wu YF, Tan KCB, Hveem K, Cheung BMY, Zollner S, Xu A, Chen E, Jiang CQ, Zhang YY, Lam TH, Ganesh SK, Huo Y, Sham PC, Lam KSL, Willer CJ, Tse HF, & Gao W. (2014), Exome-wide association analysis reveals novel coding sequence variants associated with lipid traits in Chinese. Nature Communications, 6, 10206
Lu X, Pelosi GM, Liu DJ, Wu Y, Zhang H, Zhou W, Li J, Tang CS, Dorajoo R, Li H, Guo X, Xu M, Spracklen CN, Chen Y, Liu X, Zhang Y, Khor CC, Liu J, Sun L, Wang L, Gao YT, Hu Y, Yu K, Wang Y, Cheung CYY, Wang F, Huang J, Fan Q, Cai Q, Chen S, Shi J, Yang X, Zhao W, Sheu WH, Cherny SS, He M, Feranil AB, Adair LS, Gordon-Larsen P, Du S, Varma R, Chen Yi, Chu XO, Lam KSL, Wong TY, Ganesh SK, Mo Z, Kveem K, Fristche LG, Nielsen JB, Tse HF, Huo Y, Cheng CY, Chen YE, Zheng W, Tai ES, Gao W, Lin X, Huang W, Abecasis G, CLGC Consortium, Kathiresan S, Mohlke KL, Wu T, Sham PC*, Gu D*, & Willer C*. (2017). Exome chip meta-analysis identifies novel loci and East Asian-specific coding variants that contribute to lipid levels and coronary heart disease. Nature Genetics, 49, 1722-1730
Wang Q, Wu HM, Yue W, Yan H, Zhang Y, Tan L, Deng W, Chen Q, Yang G, Lu T, Wang L, Zhang Z, Yang J, Li K, Lu L, Tan Q, Zhang H, Ma X, Yang F, Li L, Wang C, Ma X, Zhao L, Ren H, Yu H, Wang Y, Hu X, Zhang D, Sham PC, & Li T. (2018). Effect of damaging rare mutations in synapse-related gene sets on response to short-term antipsychotic medication in Chinese patients with schizophrenia: a randomized clinical trial. JAMA Psychiatry, 75(12), 1261-1269
Lee CC, Ye R, Tubbs JD, Baun L, Zhong Y, Leung SYJ, Chan SC, Wu KYK, Cheng PKJ, Chow LP, Leung PWL, Sham PC (2023) Third-generation genome sequencing implicates medium-sized structural variants in chronic schizophrenia. Frontiers in Neuroscience 16, 1058359

6. Methodology and applications of causal modeling in complex diseases

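Causal modeling is essential for selecting effective intervention targets, so that disease can be prevented or its symptoms and outcomes improved. I developed structural equation models (SEM) for modeling reciprocal causation, and used them to show that cognitive function has a causal effect on schizophrenia risk (Toulopoulou et al, 2015). We subsequently used Mendelian randomization (MR) analyses to show causal associations between depression and blood lipids (So et al, 2021) and between depression and cardiovascular disease (Li et al, 2022). However, because MR studies are prone to false positives when their assumptions are violated, we went on to develop a reciprocal causal inference method based on a Gaussian mixture model, Mixture Reciprocal Causal Inference (MRCI), to enable more robust MR analysis (Liu et al, 2023). Using this method (together with other MR methods), we found that lymphocyte counts may be causally associated with schizophrenia (Leung et al 2024). We also developed methods for inferring the effect of parental genotypes on offspring phenotypes through the family environment (known as genetic nurture), and found evidence of this effect in depression (Tubbs and Sham 2023). In addition, we recently published a review of causal inference using MR methods (Chen et al, 2024).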

Toulopoulou T, Van Heren N, Zhang X, Sham PC, Cherny SS, Campbell DD, Pichioni M, Murray RM, Boomsma DI, Hulshoff Pol HE, Brouwer R, Schnack H, Fananas L, Suaer H, Nenadic I, Weibrod M, Cannon TD, Kahn RS (2015) Reciprocal causation models of cognitive vs volumetric cerebral intermediate phenotypes for schizophrenia in a pan-European twin cohort. Molecular Psychiatry, 20, 1386-1396
So HC, Chau CKL, Cheng YY, Sham PC (2021) Causal relationships between blood lipids and depression phenotypes: a Mendelian randomisation analysis. Psychological Medicine 51, 2357-2369
Li GHY, Cheung CL, Chung AKK, Cheung BMY, Wong ICK, Fok MLY, Au PCM, Sham PC (2022) Evaluation of bidirectional causal association between depression and cardiovascular disease: a Mendelian randomization study. Psychological Medicine 52, 1765-1776
Liu Z, Qin Y, Wu T, Tubbs JD, B L, Mak TSH, Li MX, Zhang YD, Sham PC (2023). Reciprocal causation mixture model for robust mendelian randomization analysis using genome-scale summary data. Nature Communications 14, 1131, 1-12
Leung PBM, Liu Z, Zhong Y, Tubbs JD, Di Forti M, Murray RM, So HC, Sham PC, Lui SSY (2024) Bidirectional two-sample Mendelian randomization study of differential white blood cell counts and schizophrenia. Brain Behavior and Immunity 118, 22-30
Tubbs JD, Sham PC. (2023) Preliminary evidence for genetic nurture on depression and neuroticism through polygenic scores. JAMA Psychiatry, 80, 832-841
Chen LG, Tubbs JD, Lui Z, Thach TQ, Sham PC (2024) Mendelian randomization: causal inference leveraging genetic data. Psychological Medicine 54, 1461-1474.

