Svetlana Vinogradova

Lead Data Scientist



Currently, I am a Lead Data Scientist at Zwift, working on multiple data science initiatives, most notably a collaborative filtering recommendation model designed to offer personalized training and route suggestions for users, drawing from their historical time-series data.

Prior to Zwift, I was at InsideTracker, working to integrate blood biomarkers and DNA data with physiological data from activity trackers to improve lifestyle recommendations and discover new patterns and optimal zones in sleep, heart rate, and blood biomarkers.

I have always been passionate about teaching and explaining complex concepts in simple terms. During my PhD, I developed and taught a course, “R and its application in bioinformatics”. In 2013, this course was the first R course taught in Russia and was very popular among students. Since then, I have taught R to students and professionals, in groups and individually. I am an RStudio Certified Instructor in tidyverse and teach at Posit Academy.


  • Machine Learning
  • Teaching
  • Storytelling with Data


  • PhD in Mathematical Biology and Bioinformatics, 2016

    Lomonosov Moscow State University

  • BSc and MSc in Bioengineering and Bioinformatics, 2010

    Lomonosov Moscow State University



Lead Data Scientist

Zwift

Sep 2022 – Present Remote
Spearheading the development of multiple data science initiatives, most notably a collaborative filtering recommendation model designed to offer personalized training and route suggestions for users, drawing from their historical time-series data. Collaborating with product teams and key stakeholders to ensure alignment with business objectives and user needs. Initiated and established a systematic data quality review process with data engineers, emphasizing data validation, anomaly detection, and consistency checks to maintain data integrity across sources.

Lead Data Scientist

InsideTracker

Aug 2019 – Sep 2022 Cambridge, MA
Led the Data Science team to integrate a unique dataset of blood biomarkers and genomic data with physiological data from wearable sensors (including sleep, RHR, HRV), performed end-to-end analyses, and built new product features. Built core Data Engineering infrastructure supporting the data science function, and set up an internal consulting service supporting the Marketing, Product, and Executive teams.

Research Fellow

Harvard Medical School / Dana-Farber Cancer Institute

Aug 2016 – Jul 2019 Boston, MA
Performed bioinformatics analysis of NGS data produced in the lab (RNA-seq, ChIP-seq), and developed algorithms and pipelines to address various biological questions: analysis of allelic imbalances in gene expression, analysis of chromatin mark signatures using machine learning approaches, and NGS experimental design.

PhD candidate and senior lecturer

Lomonosov Moscow State University

Aug 2010 – Jul 2016 Moscow, Russia
As part of the Mironov lab, I developed a computational approach to incorporate RNA probing data into RNA secondary structure prediction and genome-wide search for structured RNAs. Published 3 papers in peer-reviewed journals and released a web server. Additionally, I developed and/or taught the following courses: algorithms; algorithms in computational biology; bioinformatics; R language.


Institute for Systems Biology

Jul 2010 – Apr 2013 Moscow, Russia
During my time at the company, I gained industry experience in PK/PD modeling and clinical pharmacology. I was part of several projects that involved close interaction with colleagues from pharmaceutical companies, and I published a paper in a peer-reviewed journal.


Recent & Upcoming Talks

Building Personalized Scores for Customers: How to Combine Different Data Types and Learn in the Process

We live in an era when everything is personalized. Fitness and running plans, nutrition and supplement recommendations, restaurant …

Biomarker, Sleep, and Activity Patterns Data from a Web-Based Nutrition Platform for Healthy Individuals: Insights for Personalized Recommendations

Personalized nutrition and lifestyle recommendations are a fast-growing field and include assessing an individual’s health and fitness …

Recent Publications

MaGIC: a machine learning tool set and web application for monoallelic gene inference from chromatin

A large fraction of human and mouse autosomal genes are subject to random monoallelic expression (MAE), an epigenetic mechanism …

High prevalence of clonal monoallelic expression

Comment on Analysis of allelic expression patterns in clonal somatic cells by single-cell RNA-seq

Risk alleles of genes with monoallelic expression are enriched in gain-of-function variants and depleted in loss-of-function variants for neurodevelopmental disorders

Over 3000 human genes can be expressed from a single allele in one cell, and from the other allele—or both—in neighboring cells. Little …


  • +1(857)264-8057