Department of Biostatistics

Steve Horvath

Professor of Human Genetics & Biostatistics

Tel: (310) 825-9299

Office: 4357A Gonda & 21-254A CHS

Department of Biostatistics

UCLA School of Public Health

Los Angeles, CA 90095-1772



  • B.S. Mathematics and Physics (1989) Technical University of Berlin

  • Ph.D. in Mathematics (1995) University of North Carolina, Chapel Hill

  • Doctor of Science in Biostatistics (2000) Harvard School of Public Health

Research & Interests

I am a Professor of Human Genetics and Biostatistics at UCLA. My methodological research lies at the intersection of biostatistics, bioinformatics, computational biology, cancer research, genetics, epidemiology, machine learning, and systems biology. My group applies these methods to study a broad spectrum of conditions, including aging, cancer, cardiovascular disease, HIV, Huntington's disease, and other neurodegenerative diseases.

Systems biology and systems genetics: My group develops and applies methods for analyzing and integrating gene expression, DNA methylation, microRNA, genetic marker, and complex phenotype data. In particular, we developed weighted correlation network analysis (also known as weighted gene co-expression network analysis, WGCNA), a systems biology data analysis method for high-dimensional "-omics" data. These methods also lend themselves to comparing different species at the genomic level. Extensive material, including articles, R software tutorials, and YouTube lectures, is available online.

Biomarker development: My group works on all aspects of biomarker development: data collection, novel data analysis methods, and biomarker validation studies. For example, we have worked on genomic biomarkers of aging and age-related diseases, including cancer. We have also compared standard meta-analysis methods with network-based meta-analysis methods.

Machine learning methods: We work on both supervised and unsupervised machine learning methods. For example, we developed the random generalized linear model (randomGLM) predictor, random forest clustering, and the cluster and propensity based approximation of a network (CPBA).

Genome-wide association studies: I have a long-standing interest in developing and applying allelic association tests; for example, I have worked on the family-based association test (FBAT). More recently, my group has been interested in enhancing GWAS and exome sequencing methods.

Epigenomics: My group works on methods and applications for epigenetic data (in particular DNA methylation data) to study human diseases (e.g. age-related diseases). Epigenetics is the study of changes in gene expression or cellular phenotype caused by mechanisms other than changes in the underlying DNA sequence.


  • Chemistry 260 Winter 2002-3 Lecture: Statistical Methods for Microarray Data Analysis

  • Human Genetics 236 Winter 2001 Advanced Human Genetics: Statistical Genetics and Human Disease Genes

  • Biostat 250B Winter 2001-05 Linear Statistical Models

  • Biostat M278 Winter 2002, 04 Statistical Analysis of DNA Microarray Data

  • Biostat 402B Spring 2002 Biostatistical Consulting


Department of Biostatistics
UCLA Fielding School of Public Health
Los Angeles, CA 90095-1772
310-825-5250 FAX: 310-267-2113