Position Title: Lead/Architect – GPU Programming & Infrastructure
Location: Bronx, NY
Contract Duration: 6 Months
Employment Type: Contract
The Lead/Architect will play a critical role in deploying and optimizing large language models (LLMs) such as LLaMA, managing GPU-based inference infrastructure, and streamlining production-grade machine learning pipelines. This is a highly collaborative role requiring strong engineering skills, cloud expertise, and a deep understanding of model performance tuning.
Key Responsibilities:
Deploy and optimize LLMs using Hugging Face Transformers (a brief illustrative sketch follows this list)
Configure and manage GPU-based inference workflows with NVIDIA CUDA
Automate infrastructure and development tasks using Python, PyTorch, and shell scripting
Administer and maintain Linux (Ubuntu) environments for development and deployment
Provision and manage GPU resources in cloud environments (AWS EC2, GCP, Hugging Face Spaces)
Implement robust environment isolation and package management using Conda and pip
Lead the end-to-end design and deployment of scalable ML model pipelines
Partner with cross-functional engineering teams to ensure optimal model performance and reliability
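For illustration only, the snippet below is a minimal sketch of the kind of GPU-backed LLM inference described above, using Hugging Face Transformers and PyTorch. The model checkpoint, prompt, and generation settings are placeholders chosen for the example, not project specifics.

```python
# Hypothetical example: load an open LLM and run inference on a CUDA GPU.
# The checkpoint name, prompt, and generation settings are placeholders.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # placeholder checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision to reduce GPU memory use
    device_map="auto",          # place layers on available CUDA devices (requires accelerate)
)

inputs = tokenizer("Summarize the benefits of GPU inference:", return_tensors="pt").to(model.device)
with torch.inference_mode():
    output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```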
Required Skills & Experience:
Strong experience deploying and fine-tuning LLMs in production environments
Proficiency with Hugging Face Transformers and PyTorch
Deep knowledge of GPU performance tuning and parallel processing (see the benchmarking sketch after this list)
Fluency in Python and experience with Linux systems administration
Hands-on expertise with cloud GPU orchestration and containerized environments
Proven leadership in designing scalable AI/ML infrastructure
Ability to work in a fast-paced, performance-driven environment
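As a rough, hypothetical illustration of the GPU performance tuning expected in this role, the sketch below times a batched half-precision forward pass on a CUDA device using PyTorch events. The layer size, batch size, and iteration counts are arbitrary stand-ins, and an NVIDIA GPU is assumed to be available.

```python
# Hypothetical benchmark: measure average latency of a batched forward pass on GPU.
import torch

device = torch.device("cuda")  # assumes an NVIDIA GPU with CUDA is available
model = torch.nn.Linear(4096, 4096).to(device).half()  # stand-in for a real model layer
batch = torch.randn(32, 4096, device=device, dtype=torch.float16)

start = torch.cuda.Event(enable_timing=True)
end = torch.cuda.Event(enable_timing=True)
with torch.inference_mode():
    for _ in range(10):   # warm-up so timings exclude one-time CUDA setup costs
        model(batch)
    start.record()
    for _ in range(100):
        model(batch)
    end.record()
torch.cuda.synchronize()  # ensure all queued kernels finish before reading the timer
print(f"average forward-pass latency: {start.elapsed_time(end) / 100:.3f} ms")
```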