Staff Machine Learning Engineer - MLAV#01 (local CA candidates only)


Locations: San Jose, Milpitas, Palo Alto, Santa Clara, Sunnyvale
Posted On: December 16, 2025
Last Day to Apply: December 26, 2025
Pay: $100.00 to $115.00 per hour

Staff Machine Learning Engineer

Level: Staff
Location: San Jose, CA (Onsite)
Cloud Platform: AWS (Bedrock & SageMaker)

Role Overview

We are building privacy-preserving large language model (LLM) capabilities to support hardware design workflows involving Verilog/SystemVerilog and RTL artifacts. These models enable advanced use cases such as code generation and refactoring, lint explanation, constraint translation, and spec-to-RTL assistance, all while operating within strict enterprise security and data-privacy boundaries.

In this Staff-level role, you will provide technical leadership for fine-tuning, evaluating, and deploying LLMs in production environments. While experience with Verilog/RTL is a strong plus, success in this role is driven primarily by deep LLM expertise, strong engineering fundamentals, and the ability to lead high-impact initiatives.

Responsibilities

  • Own the technical roadmap for RTL-focused LLM capabilities, from model selection and fine-tuning through deployment and continuous improvement.

  • Lead and mentor a small team of applied ML engineers and scientists; review designs and code, remove technical blockers, and drive execution.

  • Fine-tune and customize transformer models using modern techniques such as LoRA/QLoRA, PEFT, instruction tuning, and preference optimization (RLAIF).

  • Design and operate HDL-aware evaluation frameworks, including:

    • Compile, lint, and simulation pass rates

    • Pass@k metrics for code generation (an estimator sketch follows this list)

    • Constrained/grammar-guided decoding

    • Synthesis-readiness checks

  • Build and maintain secure, privacy-first ML pipelines on AWS, including:

    • Amazon Bedrock for managed foundation models

    • SageMaker and/or EKS for bespoke training and inference

    • Encrypted storage (S3 + KMS), private VPCs, IAM least privilege, CloudTrail auditing

  • Deploy and operate low-latency production inference using Bedrock and/or self-hosted stacks (vLLM, TensorRT-LLM), with autoscaling and safe rollout strategies.

  • Establish a strong evaluation and MLOps culture with automated regression testing, experiment tracking, and model documentation.

  • Partner with hardware engineering, EDA, security, and legal stakeholders to ensure compliant data sourcing, anonymization, and governance.

  • Drive product integration with internal developer tools, CI workflows, IDE plug-ins, retrieval-augmented generation (RAG), and safe tool-use.

  • Mentor engineers on LLM best practices, reproducible experimentation, and secure system design.
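
For context on the evaluation work above: pass@k is commonly computed with the unbiased estimator from Chen et al. (2021), which generates n candidates per problem, counts the c candidates that pass every check (compile, lint, simulation), and estimates the probability that at least one of k drawn samples passes. A minimal Python sketch, with illustrative sample counts:

    def pass_at_k(n: int, c: int, k: int) -> float:
        """Unbiased pass@k estimator (Chen et al., 2021).

        n: candidates generated per problem
        c: candidates that passed all checks (compile, lint, simulation)
        k: sampling budget being scored
        """
        if n - c < k:
            return 1.0  # every size-k draw contains a passing candidate
        # 1 - C(n-c, k) / C(n, k), computed as a stable running product
        prod = 1.0
        for i in range(n - c + 1, n + 1):
            prod *= 1.0 - k / i
        return 1.0 - prod

    # Example: 200 samples per problem, 12 passed lint + simulation
    print(pass_at_k(n=200, c=12, k=10))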

Minimum Qualifications

  • 10+ years of overall engineering experience, including:

    • 5+ years in ML/AI or large-scale distributed systems

    • 3+ years working hands-on with transformers or LLMs

  • Proven experience shipping LLM-powered features to production and leading cross-functional technical initiatives.

  • Deep expertise with PyTorch, Hugging Face (Transformers, PEFT, TRL), and distributed training frameworks (DeepSpeed, FSDP); a minimal LoRA setup is sketched after this list.

  • Experience with quantization-aware fine-tuning, constrained decoding, and evaluation of code-generation models.

  • Strong AWS background, including:

    • Amazon Bedrock (model usage, customization, Guardrails, runtime APIs, VPC endpoints)

    • SageMaker (Training, Inference, Pipelines)

    • Core services: S3, EC2/EKS, IAM, KMS, VPC, CloudWatch, CloudTrail, Secrets Manager

  • Solid software engineering fundamentals: testing, CI/CD, observability, and performance optimization.

  • Excellent communication skills and the ability to influence both technical and executive stakeholders.
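
As a point of reference for the PEFT stack named above, a representative LoRA setup with Hugging Face Transformers and PEFT is sketched below. The base model, target modules, and hyperparameters are placeholders, not this team's actual configuration:

    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    BASE_MODEL = "codellama/CodeLlama-7b-hf"  # placeholder base model

    tokenizer = AutoTokenizer.from_pretrained(BASE_MODEL)
    model = AutoModelForCausalLM.from_pretrained(
        BASE_MODEL,
        torch_dtype=torch.bfloat16,
        device_map="auto",
    )

    # Attach low-rank adapters to the attention projections; the frozen
    # base weights never change, so checkpoints stay small and the
    # trainable surface is easy to audit in a privacy review.
    lora_config = LoraConfig(
        r=16,                 # adapter rank
        lora_alpha=32,        # adapter scaling factor
        target_modules=["q_proj", "v_proj"],
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # typically well under 1% of weights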

Preferred Qualifications

  • Familiarity with Verilog/SystemVerilog or RTL workflows, including linting, synthesis, simulation, and EDA tools.

  • Experience with AST-aware tokenization or grammar-constrained decoding for code models.

  • Retrieval-augmented generation (RAG) over code and technical specifications (a minimal retrieval loop is sketched after this list).

  • Inference optimization techniques (TensorRT-LLM, KV-cache optimization, speculative decoding).

  • Enterprise model governance, security reviews, and compliance frameworks (e.g., SOC 2, ISO 27001).

  • Experience with data anonymization, DLP scanning, and IP-safe ML pipelines.
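
To make the RAG item above concrete, a minimal retrieval loop over specification text is sketched below; the embedding model and spec chunks are illustrative stand-ins rather than a description of any production pipeline:

    import numpy as np
    from sentence_transformers import SentenceTransformer

    # Illustrative spec chunks (placeholders).
    chunks = [
        "The FIFO asserts full when wr_ptr + 1 == rd_ptr (mod depth).",
        "All outputs must be registered; no combinational paths to pads.",
        "Reset is asynchronous assert, synchronous de-assert.",
    ]

    embedder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model
    chunk_vecs = embedder.encode(chunks, normalize_embeddings=True)

    def retrieve(query: str, k: int = 2) -> list[str]:
        """Return the k spec chunks most similar to the query (cosine)."""
        q = embedder.encode([query], normalize_embeddings=True)[0]
        scores = chunk_vecs @ q
        return [chunks[i] for i in np.argsort(scores)[::-1][:k]]

    # Retrieved chunks are prepended to the LLM prompt as grounding context.
    context = "\n".join(retrieve("When does the FIFO report full?"))
    print(f"Spec excerpts:\n{context}\n\nQuestion: explain the full condition.")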

For more details, contact resumes@navitassols.com

About Navitas Partners, LLC: Navitas Partners is a WBENC-certified firm and one of the fastest-growing technical/IT staffing firms in the US, providing services to numerous clients. We offer competitive pay for every position, and we understand this is a partnership: you will not be blindsided, and your salary will be discussed upfront.
