Hi, I'm David!


I'm an undergraduate student in Engineering Science (Machine Intelligence) at the University of Toronto.
My research experience spans machine learning and data science, with a focus on multimodal dialogue systems, medical image analysis, and generative models.
I also have practical experience with large-scale datasets, experimental design, and visualization pipelines, and occasionally work with frontend frameworks such as React, Vue, and Next.js to support research tools and interfaces.

David Guo portrait

Publications

Selected papers and preprints.

VOGUE: A Multimodal Conversational Fashion Recommendation Dataset

Guo, D.*, Sun, M.*, Jiang, Y.*, Liang, J., & Sanner, S.

Accepted

Accepted to UMAP 2026

2026

Evaluating Scene-based In-Situ Item Labeling for Immersive Conversational Recommendation

Liang, J.*, Liu, Y. S.*, Guo, D.*, Sun, M., Jiang, Y., & Sanner, S.

Accepted

Findings of ACL 2026

2026

Projects

A selection of projects I've worked on that have public repositories.

Spinal Geometry Modeling for Metastasis Assessment
Python
XGBoost
Quantile Regression
Statistical Shape Modeling

Modeled healthy spinal geometry with a bi-directional quantile regression framework (XGBoost) that predicts normative vertebral volume and inter-centroid distances from demographic and biomarker features. Built a population Statistical Shape Model via Generalized Procrustes Analysis to capture vertebral shape variance, then integrated both modules into a Python library and dashboard for clinician-facing assessment of metastasis-related deformities.
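
As a rough illustration of the quantile regression idea mentioned above: quantile models are typically trained with the pinball loss, which penalizes under- and over-prediction asymmetrically. The sketch below uses plain Python and is not the project's implementation; the function names `pinball_loss` and `outside_normative_range` and the example numbers are hypothetical.

```python
def pinball_loss(y_true, y_pred, q):
    """Mean pinball (quantile) loss for a quantile level q in (0, 1)."""
    if not 0.0 < q < 1.0:
        raise ValueError("q must lie strictly between 0 and 1")
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction (diff >= 0) is weighted by q,
        # over-prediction (diff < 0) by (1 - q).
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(y_true)


def outside_normative_range(measured, lower_pred, upper_pred):
    """Hypothetical helper: flag a measurement that falls outside the
    interval spanned by a lower- and an upper-quantile prediction."""
    return measured < lower_pred or measured > upper_pred
```

With one model fitted at a low quantile and another at a high quantile, a measurement can then be compared against the predicted normative interval, for example `outside_normative_range(7.0, 8.5, 12.0)`.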

Multi-task Pancreas Cancer Segmentation and Classification with nnUNetV2
PyTorch
nnUNetv2
Multi-task Learning
Medical Imaging

This project uses multi-task learning to add a classification head to nnUNetv2's Residual Encoder preset. The classifier complements nnUNetv2's existing segmentation output by classifying lesion subtypes in pancreatic cancer.

Augmentation of WM-811K Silicon Wafer Map Dataset for Error Pattern Detection Training
PyTorch
Wasserstein GAN
Gradient Penalty
Data Augmentation

In this ongoing project, I am investigating whether Wasserstein GANs with Gradient Penalty (WGAN-GP) can generate synthetic images to balance the WM-811K silicon wafer map dataset. Synthetic data of this kind will help train image detection models later on, with the aim of improving fabrication efficacy. Preliminary results show improvement for the problem classes.
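
For reference, the standard WGAN-GP critic objective is shown below. This is the general textbook form, not necessarily the exact configuration used in this project; the symbols ($D$ for the critic, $P_r$ and $P_g$ for the real and generated distributions, $\lambda$ for the penalty weight) are introduced here for illustration.

```latex
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\big[ D(\tilde{x}) \big]
    - \mathbb{E}_{x \sim P_r}\big[ D(x) \big]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \Big[ \big( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \big)^2 \Big],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

The last term penalizes the critic whenever its gradient norm at points interpolated between real and generated samples deviates from 1, which is how the method encourages the 1-Lipschitz constraint without weight clipping.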

Music Generation Using Autoencoders and Transformer Mixture Distribution Models
TensorFlow
Autoencoders
Transformers
VAEs
Music Modeling

In this project, we developed a model for music generation using autoencoders and transformer mixture distribution models, implemented in TensorFlow. Building on techniques like variational autoencoders (VAEs) and transformers, our approach processes high-dimensional music data to create coherent compositions. By using a sliding window method and training the model on both diverse and classical music datasets, we aimed to capture melodic patterns effectively. While the model performs well at learning tonality, there are still challenges with rhythm and long-term structure. We're continuing to explore ways to enhance the model’s rhythmic coherence and overall musicality.
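
The sliding window method mentioned above can be sketched roughly as follows. This is a generic illustration in plain Python rather than the project's actual preprocessing; the function name `sliding_windows`, the `stride` parameter, and the use of MIDI-style pitch numbers in the example are assumptions.

```python
def sliding_windows(sequence, window_size, stride=1):
    """Split a sequence into overlapping fixed-length windows.

    Returns a list of windows, each of length `window_size`, taken every
    `stride` steps. Sequences shorter than `window_size` yield no windows.
    """
    if window_size <= 0 or stride <= 0:
        raise ValueError("window_size and stride must be positive")
    last_start = len(sequence) - window_size
    return [
        sequence[start:start + window_size]
        for start in range(0, last_start + 1, stride)
    ]


# Example: a short melody as pitch numbers (assumed representation).
melody = [60, 62, 64, 65, 67]
windows = sliding_windows(melody, window_size=3, stride=1)
```

A smaller stride gives more, heavily overlapping training examples, while a larger stride reduces redundancy at the cost of fewer samples.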

Contact Me

University Email

davidmy.guo@mail.utoronto.ca

Personal Email

davidguo123456@gmail.com

Phone

(604) 825 9637

Location

Toronto, ON