About the role
We are looking for an expert in generative AI and computer vision to build a high-performance, local virtual try-on (VTO) pipeline. Our goal is to achieve visual quality comparable to Gemini 3 Pro virtual try on 001, with a local inference latency of under 5 seconds on an NVIDIA A100 or H100 GPU Droplet.
The Challenge
Proprietary models like Gemini Pro are too slow for our production needs. You will be responsible for teaching a faster, open-weight student model such as IDM-VTON, CatVTON, or OOTDiffusion to mimic Gemini’s high-fidelity fabric draping, shadow alignment, and texture preservation.
Key Responsibilities
Synthetic Data Pipeline
Build a script to generate a training dataset using Google Gemini or Imagen VTO as the teacher model.
Model Fine-Tuning
Fine-tune an open-source diffusion model such as IDM-VTON or CatVTON on the Gemini-generated dataset to capture pro-tier garment realism.
Inference Optimization
Apply TensorRT with FP8 or INT8 quantization so that the model runs in under 5 seconds on our GPU instance.
Deployment
Containerize the solution with Docker, using a vLLM- or SGLang-style backend for low-latency API access.
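The synthetic data step listed above could be organized roughly as follows. This is only a minimal sketch: `call_teacher` is a hypothetical placeholder for whatever Gemini or Imagen VTO client is used (no real API call is shown), and the file naming and manifest fields are assumptions rather than requirements of the role.

```python
import itertools
import json
from pathlib import Path
from typing import Callable, Iterable, Iterator, Tuple


def build_pairs(person_paths: Iterable[str],
                garment_paths: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield every (person, garment) combination in a stable order."""
    return itertools.product(sorted(person_paths), sorted(garment_paths))


def generate_dataset(person_paths: Iterable[str],
                     garment_paths: Iterable[str],
                     out_dir: str,
                     call_teacher: Callable[[str, str], bytes]) -> list:
    """Ask the teacher for one try-on image per pair and record a manifest.

    call_teacher is a hypothetical callable: it takes a person image path and
    a garment image path and returns the encoded try-on image as bytes.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    manifest = []
    for i, (person, garment) in enumerate(build_pairs(person_paths, garment_paths)):
        target = out / f"tryon_{i:06d}.png"
        # Skip pairs that already have an output so an interrupted run can resume.
        if not target.exists():
            target.write_bytes(call_teacher(person, garment))
        manifest.append({"person": person, "garment": garment, "target": str(target)})
    # One JSON object per line, a common input format for fine-tuning scripts.
    lines = "\n".join(json.dumps(row) for row in manifest)
    (out / "manifest.jsonl").write_text(lines + "\n")
    return manifest
```

Keeping the teacher behind a plain callable means the same script can be pointed at a different teacher model, or at a stub during testing, without changing the dataset layout.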
Required Skills
Deep Learning
Expert-level PyTorch and experience with diffusion models such as Stable Diffusion and ControlNet.
Computer Vision
Proven experience with virtual try-on pipelines, image segmentation (for example, SAM 2), and human pose estimation.
Optimization
Deep knowledge of NVIDIA TensorRT, FlashAttention-3, and CUDA optimization.
API Integration
Experience with Google Vertex AI or Gemini API for synthetic data generation.
Deliverables
A fine-tuned local VTO model, including weights and configuration.
A benchmark report showing under 5 seconds of end-to-end latency.
Dockerized API for easy deployment on our GPU Droplet.
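The latency figure for the benchmark report could be gathered with a small harness like the one below. It is a sketch under stated assumptions: `run_once` is a hypothetical callable that performs one full try-on request, the warmup and run counts are illustrative defaults, and the pass criterion (worst observed run under the budget) is one possible reading of "under 5 seconds".

```python
import statistics
import time
from typing import Callable


def benchmark(run_once: Callable[[], object],
              warmup: int = 2,
              runs: int = 10,
              budget_s: float = 5.0) -> dict:
    """Time repeated end-to-end calls and compare them with a latency budget.

    run_once is a hypothetical callable covering preprocessing, inference, and
    postprocessing. For GPU work it should block until the result is ready
    (for example by calling torch.cuda.synchronize() before returning),
    otherwise the timings only measure kernel launch overhead.
    """
    for _ in range(warmup):
        run_once()  # warm caches and lazy initialization; not timed
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        run_once()
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the last item.
    p95 = latencies[min(len(latencies) - 1, int(0.95 * len(latencies)))]
    return {
        "mean_s": statistics.mean(latencies),
        "p95_s": p95,
        "max_s": latencies[-1],
        "within_budget": latencies[-1] < budget_s,
    }
```

Reporting the 95th percentile and the maximum alongside the mean makes it clear whether the budget holds for every request or only on average.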
Interested in this role?
Apply on Upwork website →
Listed on UpgradedJobs · Source: scraped