Lead/Architect – GPU Programming & Infrastructure - LAGPU


IT - NY - Bronx, NY
Bronx County, New York
Locations: Bronx County, Manhattan Beach, Poughkeepsie, White Plains
Posted On: May 23, 2025
Last Day to Apply: June 06, 2025
Pay: $70.00 to $80.00 per hour

Position Title: Lead/Architect – GPU Programming & Infrastructure
Location: Bronx, NY
Contract Duration: 6 Months
Employment Type: Contract

Position Overview:

The Lead/Architect will deploy and optimize large language models (LLMs) such as LLaMA, manage GPU-based inference infrastructure, and streamline production-grade machine learning pipelines. The role is highly collaborative and requires strong engineering skills, cloud expertise, and a deep understanding of model performance tuning.

Key Responsibilities:

  • Deploy and optimize LLMs using Hugging Face Transformers

  • Configure and manage GPU-based inference workflows with NVIDIA CUDA

  • Automate infrastructure and development tasks using Python, PyTorch, and shell scripting

  • Administer and maintain Linux (Ubuntu) environments for development and deployment

  • Provision and manage GPU resources in cloud environments (AWS EC2, GCP, Hugging Face Spaces)

  • Implement robust environment isolation and package management using Conda and pip

  • Lead the end-to-end design and deployment of scalable ML model pipelines

  • Partner with cross-functional engineering teams to ensure optimal model performance and reliability

Qualifications:

  • Strong experience in deploying and fine-tuning LLMs in production environments

  • Proficiency with Hugging Face Transformers and PyTorch

  • Deep knowledge of GPU performance tuning and parallel processing

  • Fluency in Python and experience with Linux systems administration

  • Hands-on expertise with cloud GPU orchestration and containerized environments

  • Proven leadership in designing scalable AI/ML infrastructure

  • Ability to work in a fast-paced, performance-driven environment
