Local LLM Expert: Fine-Tuning & Training with Gemini Distillation

Upwork · Remote

engineering · today

AI Stack

Tools this role uses

Gemini: generate synthetic VTO training data (occasional)

Stable Diffusion: fine-tune open-source VTO model (occasional)

SAM 2: image segmentation in VTO pipeline (occasional)

Workflow Map

AI-assisted vs. human-led

AI-assisted

Generate synthetic training dataset (Gemini)

Fine-tune diffusion model on VTO task (Stable Diffusion)

Image segmentation for garment masking (SAM 2)

Human-led

TensorRT and quantization optimization

Dockerized API deployment setup

Benchmark latency and model evaluation

CUDA and Flash Attention 3 tuning

Model architecture design and selection

About the role

We are looking for an expert in generative AI and computer vision to build a high-performance, local virtual try-on (VTO) pipeline. Our goal is visual quality comparable to Gemini 3 Pro virtual try on 001, with local inference in under 5 seconds on an NVIDIA GPU Droplet (A100 or H100).

The Challenge

Proprietary models like Gemini Pro are too slow for our production needs. You will be responsible for teaching a faster, open-weight student model such as IDM-VTON, CatVTON, or OOTDiffusion to mimic Gemini’s high-fidelity fabric draping, shadow alignment, and texture preservation.

Key Responsibilities

Synthetic Data Pipeline

Build a script to generate a training dataset using Google Gemini or Imagen VTO as the teacher model.
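
As an illustration of what such a script might look like, the sketch below loops over person and garment image pairs, asks a teacher for a try-on result, and records each triplet in a JSONL manifest. The `teacher` callable, the `TryOnSample` fields, and the `manifest.jsonl` file name are all hypothetical placeholders; the actual Gemini or Imagen call is deliberately left behind that callable and is not shown here.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path
from typing import Callable, Iterable, List, Tuple

# Hypothetical teacher interface: (person_path, garment_path) -> image bytes.
# A real implementation would wrap the teacher model's API behind this callable.
Teacher = Callable[[str, str], bytes]


@dataclass
class TryOnSample:
    person_image: str   # path to the person photo (student input)
    garment_image: str  # path to the garment photo (student input)
    target_image: str   # path to the teacher output (student target)


def build_dataset(
    pairs: Iterable[Tuple[str, str]],
    teacher: Teacher,
    out_dir: str,
) -> List[TryOnSample]:
    """Run the teacher on every pair and write a JSONL manifest of the results."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    samples: List[TryOnSample] = []
    with (out / "manifest.jsonl").open("w", encoding="utf-8") as manifest:
        for index, (person, garment) in enumerate(pairs):
            target = out / f"{index:06d}.png"
            try:
                target.write_bytes(teacher(person, garment))
            except Exception as exc:
                # Skip failed generations so one bad pair does not stop the run.
                print(f"skipping pair {index}: {exc}")
                continue
            sample = TryOnSample(person, garment, str(target))
            manifest.write(json.dumps(asdict(sample)) + "\n")
            samples.append(sample)
    return samples
```

Keeping the teacher behind a plain callable means the same loop can be exercised with a stub during development and pointed at the real teacher only for the full generation run.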

Model Fine-Tuning

Fine-tune an open-source diffusion model such as IDM-VTON or CatVTON on the Gemini-generated dataset to capture pro-tier garment realism.

Inference Optimization

Apply TensorRT with FP8 or INT8 quantization so that the model runs in under 5 seconds on our GPU instance.
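
For readers less familiar with INT8 quantization, the arithmetic behind it can be sketched in a few lines of plain Python. This is only an illustration of symmetric per-tensor quantization with hypothetical helper names; it does not use TensorRT, and production toolchains typically choose scales through calibration and often per channel rather than from a single maximum.

```python
from typing import List, Sequence, Tuple


def quantize_int8(values: Sequence[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor INT8 quantization: returns (integer codes, scale)."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Map the largest magnitude to 127; fall back to 1.0 for an all-zero tensor.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes: Sequence[int], scale: float) -> List[float]:
    """Recover approximate float values from INT8 codes."""
    return [c * scale for c in codes]


weights = [0.5, -1.27, 0.0, 1.27]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)
# Each restored value differs from the original by at most half a quantization step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
```

The trade-off is visible in `max_error`: a coarser scale gives a smaller, faster representation at the cost of a bounded rounding error per value, which is why output quality needs to be re-evaluated after quantizing.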

Deployment

Containerize the solution using Docker with a vLLM- or SGLang-style backend for low-latency API access.

Required Skills

Deep Learning

Expert-level PyTorch skills and experience with diffusion models such as Stable Diffusion and ControlNet.

Computer Vision

Proven experience with virtual try-on pipelines, image segmentation (for example, SAM 2), and human pose estimation.

Optimization

Deep knowledge of NVIDIA TensorRT, Flash Attention 3, and CUDA optimization.

API Integration

Experience with Google Vertex AI or the Gemini API for synthetic data generation.

Deliverables

A fine-tuned local VTO model, including weights and configuration.

A benchmark report showing end-to-end latency under 5 seconds.

A Dockerized API for easy deployment on our GPU Droplet.
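
One possible shape for the latency measurement behind the benchmark report is sketched below. The `benchmark` and `meets_budget` helpers and the report keys are hypothetical; the 5-second budget comes from the requirement above, while the choice of the 95th percentile as the pass criterion is an assumption, not something this listing specifies.

```python
import math
import statistics
import time
from typing import Callable, Dict


def benchmark(fn: Callable[[], object], warmup: int = 2, runs: int = 10) -> Dict[str, float]:
    """Time fn end to end and return summary latencies in seconds."""
    # Warm-up calls absorb one-time costs such as model loading and kernel compilation.
    for _ in range(warmup):
        fn()
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        # If fn launches GPU work asynchronously, it must block until the result
        # is ready (for example via torch.cuda.synchronize() in PyTorch).
        fn()
        latencies.append(time.perf_counter() - start)
    latencies.sort()
    # Nearest-rank 95th percentile.
    p95_index = max(0, math.ceil(0.95 * len(latencies)) - 1)
    return {
        "mean_s": statistics.mean(latencies),
        "p50_s": statistics.median(latencies),
        "p95_s": latencies[p95_index],
        "max_s": latencies[-1],
    }


def meets_budget(report: Dict[str, float], budget_s: float = 5.0) -> bool:
    """Assumed pass criterion: 95th-percentile latency strictly under the budget."""
    return report["p95_s"] < budget_s
```

In practice `fn` would wrap a full request (preprocessing, segmentation, diffusion, and postprocessing) so that the reported number reflects end-to-end latency rather than the model forward pass alone.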


Listed on UpgradedJobs · Source: scraped
