HPC Engineer
Company: Anduril Industries
Location: Costa Mesa
Posted on: April 4, 2026
Job Description:
ABOUT THE ROLE
Anduril is seeking a High Performance Computing (HPC) System Engineer
to directly support our most sensitive programs. You will be part of
the team building and maintaining large-scale HPC infrastructure. You
will have the opportunity to work with and learn from some of the
world’s best engineers and cybersecurity professionals as you help
implement cutting-edge systems. You will directly support systems
deployed across the globe for national security missions.

WHAT YOU'LL DO
- Work in a fast-paced, customer-focused environment supporting
  high-profile operational and research requirements.
- Architect and deploy advanced GPU infrastructure, leading the design,
  deployment, and lifecycle management of cutting-edge NVIDIA hardware,
  including H100, H200, and B200/B300 systems.
- Rack, stack, cable, and configure physical servers and multi-node GPU
  systems from end to end.
- Configure HPC and AI environments, including job schedulers (e.g.,
  Slurm), multi-user login environments, and cluster management
  software (e.g., Warewulf, NVIDIA Base Command, Run:ai).
- Implement and fine-tune high-speed interconnects (e.g., NVLink,
  NVSwitch, NDR InfiniBand) crucial for large-scale distributed
  training.
- Configure and manage large-scale, high-performance storage platforms
  in the multi-petabyte range, optimized for AI/ML data access
  patterns.
- Install, configure, and maintain the application stack on HPC
  clusters, including traditional simulation software (STAR-CCM+,
  Ansys, MATLAB) and the core AI/ML software stack (NVIDIA drivers,
  CUDA, PyTorch, TensorFlow).
- Implement and manage GPU virtualization and sharing technologies,
  such as Multi-Instance GPU (MIG), to maximize resource utilization
  across diverse workloads.
- Troubleshoot complex, system-wide issues related to application
  performance, user access, compute nodes, storage, and job queueing
  services.
- Use NVIDIA Data Center GPU Manager (DCGM) and additional tools to
  proactively monitor GPU health and performance, diagnosing and
  resolving training bottlenecks in collaboration with ML engineers.
- Ensure the security and integrity of the server and cluster
  infrastructure through regular audits, patching, and proactive
  security measures.
- Collaborate closely with engineering and AI/ML research stakeholders
  to gather requirements and architect robust, scalable solutions.
- Manage the hardware lifecycle, from quoting and procuring hardware
  from vendors to creating and executing deployment schedules.
- Provide technical guidance, mentoring, and architectural leadership
  to other team
  members.

REQUIRED QUALIFICATIONS
- 7 years of experience designing, developing, and implementing
  large-scale enterprise compute systems and solutions
- Strong knowledge of and experience with high-performance computing
  concepts, including cluster architecture, file systems, and
  high-speed InfiniBand/Ethernet interconnects
- Proven expertise in one or more of the following: Red Hat Enterprise
  Linux, Ubuntu, HPC, GPU, Azure, or AWS cloud services
- Strong understanding of and experience with systems automation tools
  (Ansible, Salt, Puppet)
- Experience with HPC technologies such as parallel/distributed file
  systems (e.g., Lustre, GPFS, Pure, VAST)
- Working knowledge of HPC batch scheduling software (e.g., PBS Pro,
  Slurm)
- Experience building HPC clusters on AWS or Azure
- Ability to lift 50 lbs
- Eligible to obtain and maintain a US Top Secret clearance

US Salary Range: $146,000 - $194,000 USD

The
salary range for this role is an estimate based on a wide range of
compensation factors, inclusive of base salary only. Actual salary
offer may vary based on (but not limited to) work experience,
education and/or training, critical skills, and/or business
considerations. Highly competitive equity grants are included in
the majority of full-time offers and are considered part of
Anduril's total compensation package. Additionally, Anduril offers
top-tier benefits for full-time employees, including:

Healthcare Benefits
- US Roles: Comprehensive medical, dental, and vision plans at little
  to no cost to you.
- UK & AUS Roles: We cover the full cost of medical insurance premiums
  for you and your dependents.
- IE Roles: We offer an annual contribution toward your private health
  insurance for you and your dependents.

Additional Benefits
- Income Protection: Anduril covers life and disability insurance for
  all employees.
- Generous Time Off: Highly competitive PTO plans with a holiday hiatus
  in December. Caregiver & Wellness Leave is available to care for
  family members, bond with a new baby, or address your own medical
  needs.
- Family Planning & Parenting Support: Coverage for fertility
  treatments (e.g., IVF, preservation), adoption, and gestational
  carriers, along with resources to support you and your partner from
  planning to parenting.
- Mental Health Resources: Access free mental health resources 24/7,
  including therapy and life coaching. Additional work-life services,
  such as legal and financial support, are also available.
- Professional Development: Annual reimbursement for professional
  development.
- Commuter Benefits: Company-funded commuter benefits based on your
  region.
- Relocation Assistance: Available depending on role eligibility.

Retirement Savings Plan
- US Roles: Traditional 401(k), Roth, and after-tax (mega backdoor
  Roth) options.
- UK & IE Roles: Pension plan with employer match.
- AUS Roles: Superannuation plan.

The recruiter assigned to this role can share more information about
the specific compensation and benefit details associated with this
role during the hiring process.

Protecting Yourself from Recruitment Scams
Anduril is committed to maintaining the integrity of our talent
acquisition process and the security of our candidates. We've
observed a rise in sophisticated phishing and fraudulent schemes
where individuals impersonate Anduril representatives, luring job
seekers with false interviews or job offers. These scammers often
attempt to extract payment or sensitive personal information. To
ensure your safety and help you navigate your job search with
confidence, please keep the following critical points in mind:

- No Financial Requests: Anduril will never solicit payment or demand
  personal financial details (such as banking information, credit card
  numbers, or Social Security numbers) at any stage of our hiring
  process. Our legitimate recruitment is entirely free for candidates.
- Verify Communications:
  - Direct from Anduril: If you receive an email from one of our
    recruiters, it will only come from an @anduril.com address.
  - Via Agency Partner: If contacted by a recruiting agency for an
    Anduril role, their email will clearly identify their agency. If
    you notice any suspicious activity, please verify the agency's
    authenticity by reaching out to contact@anduril.com.
- Exercise Caution with Unsolicited Outreach: If you receive any
  communication that appears suspicious, contains grammatical errors,
  or makes unusual requests, do not engage. Always confirm the sender's
  email domain is @anduril.com before providing any personal
  information or clicking on links.
- What to Do If You Suspect Fraud: Should you encounter any
  questionable or fraudulent outreach claiming to be from Anduril,
  please report it immediately to contact@anduril.com.

Your proactive caution is invaluable in protecting your personal
information and upholding the security and trustworthiness of our
recruitment efforts.

Data Privacy
To view Anduril's candidate data privacy policy, please visit
https://anduril.com/applicant-privacy-notice/
Keywords: Anduril Industries, Cathedral City, HPC Engineer, IT / Software / Systems, Costa Mesa, California