Staff Machine Learning Engineer, LLM Fine-Tuning (Verilog/RTL Applications) - NJ#1


San Jose, California
Locations: San Jose, Milpitas, Santa Clara, Sunnyvale
Posted On: December 03, 2025
Last Day to Apply: December 17, 2025
Pay: $210,000 per year

Staff Machine Learning Engineer, LLM Fine-Tuning (Verilog/RTL Applications)

Level: Staff
Location: San Jose, CA 
Cloud: AWS (primary: Bedrock + SageMaker)

Why this role exists

You will architect and lead the development of privacy-preserving LLM capabilities that support hardware design teams working with Verilog/SystemVerilog and RTL artifacts. This includes code generation, refactoring, lint explanation, constraint translation, and spec-to-RTL assistance. You’ll lead a small, high-leverage team focused on fine-tuning and productizing LLMs in a strict enterprise data-privacy environment.

You do not need deep RTL expertise to start—curiosity, LLM craftsmanship, and strong engineering rigor matter most. Exposure to HDL/EDA tooling is a plus.

Responsibilities

Technical Leadership & Roadmap

  • Own the end-to-end roadmap for Verilog/RTL-focused LLM capabilities, covering model selection, fine-tuning, evals, deployment, and continuous improvement.

  • Lead a hands-on team of applied ML engineers/scientists: unblock them technically, review designs and code, and drive experimentation velocity and reliability.

Model Training & Customization

  • Fine-tune and customize models using modern techniques (LoRA/QLoRA, PEFT, instruction tuning, RLAIF/preference optimization).

  • Build HDL-aware evaluation workflows:

    • Compile/lint/simulate-based pass rates

    • Pass@k for code generation

    • Constrained decoding enforcing HDL syntax

    • “Does-it-synthesize?” checks
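
The pass@k metric above is typically computed with the standard unbiased estimator (draw n samples per problem, count the c that pass the compile/simulate check). A minimal sketch, not part of this posting:

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples, drawn without replacement from n generations of which c
    pass the check, is correct. Equals 1 - C(n-c, k) / C(n, k)."""
    if n - c < k:
        # Fewer than k failing samples exist, so any k-subset
        # must contain a passing one.
        return 1.0
    return 1.0 - math.prod((n - c - i) / (n - i) for i in range(k))
```

For HDL work, "pass" would mean the generated module survives the compile/lint/simulate gates listed above rather than a unit-test oracle.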

Privacy-First AWS ML Pipelines

  • Design secure training & inference environments using AWS services such as:

    • Amazon Bedrock (incl. Anthropic models)

    • SageMaker or EKS + KServe/Triton/DJL for bespoke training

  • Implement strict privacy controls:

    • Artifacts in S3 with KMS CMKs

    • VPC-only infrastructure with PrivateLink (incl. Bedrock endpoints)

    • IAM least-privilege, CloudTrail auditing

    • Secrets Manager for credential handling

    • Full encryption in transit/at rest

    • No public egress for customer/RTL corpora
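
As one illustration of the "VPC-only, no public egress" posture, an S3 bucket policy can deny all access except through a specific VPC endpoint and reject unencrypted uploads. A hedged sketch only; the bucket name and `vpce-` ID below are placeholders, not real resources:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyAccessOutsideVpcEndpoint",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::example-rtl-corpus",
        "arn:aws:s3:::example-rtl-corpus/*"
      ],
      "Condition": {
        "StringNotEquals": { "aws:SourceVpce": "vpce-EXAMPLE" }
      }
    },
    {
      "Sid": "DenyUploadsWithoutKms",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:PutObject",
      "Resource": "arn:aws:s3:::example-rtl-corpus/*",
      "Condition": {
        "StringNotEquals": { "s3:x-amz-server-side-encryption": "aws:kms" }
      }
    }
  ]
}
```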

Inference & Deployment

  • Stand up scalable, reliable LLM serving:

    • Bedrock model invocation where applicable

    • Low-latency self-hosted inference (vLLM/TensorRT-LLM)

    • Autoscaling and canary/blue-green rollouts

Evaluation Culture & Tooling

  • Build automated regression suites running HDL compilers/simulators to measure correctness and detect hallucinations.

  • Track experiments and produce model cards using MLflow/W&B.

Cross-Functional Collaboration

  • Work with hardware design teams, CAD/EDA, Security, and Legal to:

    • Prepare/anonymize datasets

    • Define acceptance gates

    • Meet licensing, compliance, and security requirements

Productization

  • Integrate models into engineering workflows: IDE plugins, CI bots, code review assistants, retrieval over internal HDL repos/specs, and safe function-calling.

Mentorship

  • Develop team capabilities in LLM training, reproducibility, secure pipelines, and research literacy.

Minimum Qualifications

  • 10+ years total engineering experience; 5+ years in ML/AI or large-scale distributed systems; 3+ years with transformers/LLMs.

  • Proven record shipping LLM-powered features and leading cross-functional technical initiatives at Staff level.

  • Deep, hands-on experience with:

    • PyTorch, Hugging Face Transformers/PEFT/TRL

    • Distributed training (DeepSpeed/FSDP)

    • LoRA/QLoRA, grammar-guided decoding

  • Strong AWS expertise:

    • Bedrock (model customization, Guardrails, Knowledge Bases, VPC endpoints)

    • SageMaker (Training/Inference/Pipelines)

    • S3, EC2/EKS/ECR, IAM, VPC, KMS, CloudWatch/CloudTrail, Step Functions, Secrets Manager

  • Strong Python engineering fundamentals (testing, CI/CD, observability, performance tuning).

  • Excellent technical communication and ability to set vision across teams.

Preferred Qualifications

  • Familiarity with Verilog/SystemVerilog/RTL workflows (lint, simulation, synthesis, timing closure, test benches).

  • Experience with static-analysis/AST-aware tokenization and grammar-constrained decoding.

  • RAG over code/spec repos; tool-use/function-calling for code transformation.

  • Inference optimization (TensorRT-LLM, KV-cache tuning, speculative decoding).

  • Experience with enterprise model governance and security frameworks (SOC2/ISO 27001/NIST).

  • Background in data anonymization, DLP scanning, and code de-identification.

What success looks like

90 Days

  • Stand up HDL-aware eval harness with compile/simulate checks.

  • Establish secure AWS training & inference environments (VPC-only, KMS encryption, no public egress).

  • Deliver initial fine-tuned model with measurable performance gains.

180 Days

  • Expand training coverage using Bedrock + SageMaker/EKS.

  • Add constrained decoding and retrieval over design specs.

  • Productionize inference with SLOs and rollout to pilot teams.

12 Months

  • Reduce RTL review/iteration cycles, measured by lint-clean time, defect reductions, and suggestion acceptance rates.

  • Establish a stable MLOps pathway for continuous improvements.

Security & Privacy by Design

  • All sensitive data remains within private AWS VPCs with IAM-controlled access and CloudTrail auditing.

  • Bedrock access via VPC PrivateLink endpoints only.

  • Strict data minimization, tagging, retention, reproducibility, and DLP scanning.

  • Model cards, lineage, and evaluation artifacts for each release.

Tech Stack

Modeling: PyTorch, HF Transformers/PEFT/TRL, DeepSpeed/FSDP, vLLM, TensorRT-LLM
AWS/MLOps: Bedrock, SageMaker, ECR, EKS/KServe/Triton, MLflow/W&B, Step Functions
Platform/Security: S3 + KMS, IAM, VPC/PrivateLink, CloudWatch/CloudTrail, Secrets Manager
Bonus: HDL toolchains, vector stores (pgvector/OpenSearch), GitHub/GitLab CI
