Svetlana Vinogradova

Data Scientist

Meta (Instagram)

Biography

Data Scientist with 10+ years of experience building production-grade machine learning systems in consumer health and large-scale platforms. Currently at Meta (Instagram), where I develop ML and statistical models to detect under-delivered content within large-scale ranking systems and translate diagnostic insights into user-facing creator analytics features.

My expertise spans physiological signal modeling, causal inference, and experimentation. Previous roles include Senior Data Scientist at Nuna Health (longitudinal blood pressure modeling, LLM safety evaluation), Lead Data Scientist at Zwift (recommendation systems, customer segmentation), and Lead Data Scientist / Data Science Manager at InsideTracker (founding data scientist, integrating biomarkers, genomics, and wearable data into personalized health recommendations).

I hold a PhD in Mathematical Biology and Bioinformatics from Lomonosov Moscow State University and previously conducted research at Harvard Medical School / Dana-Farber Cancer Institute.

Interests

  • Machine Learning
  • Health & Longevity
  • Physiological Data

Education

  • PhD in Mathematical Biology and Bioinformatics, 2016

    Lomonosov Moscow State University

  • BSc and MSc in Bioengineering and Bioinformatics, 2010

    Lomonosov Moscow State University

Experience

Data Scientist

Meta (Instagram)

Jul 2025 – Present New York City, NY
• Leading development of machine learning and statistical models to detect under-delivered content within large-scale ranking systems, identifying distribution anomalies and modeling root causes
• Translating diagnostic insights into both internal debugging tools and user-facing creator analytics features
• Designed validation and performance monitoring frameworks to ensure model robustness and interpretability across diverse user cohorts

Senior Data Scientist, ML Algorithms

Nuna Health

Jul 2024 – Jul 2025 San Francisco, CA
• Led end-to-end development and deployment of a longitudinal blood pressure trend modeling algorithm, leveraging time-series analysis to detect instability patterns and forecast near-term risk trajectories
• Built a safety evaluation framework for LLM-powered patient conversations, identifying violations related to harm, bias, hallucinations, and privacy

Lead Data Scientist

Zwift

Sep 2022 – Jun 2024 Long Beach, CA / remote
• Led development and deployment of customer-facing product features, including introducing a racing score as part of a company-wide personalized recommendation system
• Developed a machine learning model for customer segmentation leveraging behavior data from 1 million users
• Enhanced data integrity across multiple data streams by establishing a comprehensive data quality and governance review process

Lead Data Scientist / Data Science Manager

InsideTracker

Aug 2019 – Sep 2022 Cambridge, MA
• As a founding data scientist, led and mentored a team of 3, setting technical direction for ML initiatives and partnering with product, engineering, and leadership
• Developed and deployed production ML models integrating blood biomarkers, genomic data, wearable-derived physiological signals, and behavioral metrics for personalized health recommendations
• Owned the full model lifecycle from exploratory research through validation, deployment, and ongoing performance monitoring

Research Scientist

Harvard Medical School / Dana-Farber Cancer Institute

Aug 2016 – Aug 2019 Boston, MA
Developed novel machine learning methods to study gene expression and epigenetic signatures. Introduced Qllelic, a computational approach establishing a new gold standard for analyzing allelic imbalance. Co-authored an R package and 5 peer-reviewed articles.

Research Scientist

Institute for Information Transmission Problems

Aug 2010 – Jul 2016 Moscow, Russia
Developed machine learning approaches to incorporate RNA probing data into RNA secondary structure prediction and genome-wide search for structured RNAs. Co-authored 3 peer-reviewed articles.

Certifications

Certified Trainer — R/tidyverse

Recent Posts

What connects it

On the surface, my career looks like a series of stops. It can look scattered if you don’t know what connects it.

Running Coach

I am also a running coach and am currently taking on clients training for a 5K, 10K, half marathon, or marathon. Please reach out!

You can view my coaching certification here.

Recent & Upcoming Talks

Building Personalized Scores for Customers: How to Combine Different Data Types and Learn in the Process

We live in an era when everything is personalized. Fitness and running plans, nutrition and supplement recommendations, restaurant …

Biomarker, Sleep, and Activity Patterns Data from a Web-Based Nutrition Platform for Healthy Individuals: Insights for Personalized Recommendations

Personalized nutrition and lifestyle recommendations is a fast growing field and includes assessing an individual’s health and fitness …

Recent Publications

MaGIC: a machine learning tool set and web application for monoallelic gene inference from chromatin

A large fraction of human and mouse autosomal genes are subject to random monoallelic expression (MAE), an epigenetic mechanism …

High prevalence of clonal monoallelic expression

Comment on Analysis of allelic expression patterns in clonal somatic cells by single-cell RNA-seq

Risk alleles of genes with monoallelic expression are enriched in gain-of-function variants and depleted in loss-of-function variants for neurodevelopmental disorders

Over 3000 human genes can be expressed from a single allele in one cell, and from the other allele—or both—in neighboring cells. Little …

Contact

Have a question, want to collaborate, or just want to talk health data? Feel free to reach out at kintany@gmail.com.

If you’d like to book a career consultation, you can grab a time directly:

Book a Career Consultation