We build advanced computational solutions that demand massive compute power and sophisticated optimization. Our HPC team designs and implements systems that push the boundaries of computational performance.
About The Role
We're seeking a High Performance Computing Engineer with deep expertise in parallel computing and heterogeneous architecture optimization. The ideal candidate will have extensive experience with CUDA/ROCm, MPI, and large-scale system optimization.
AI Tool Proficiency Requirements
Advanced expertise in using AI coding assistants for HPC code generation and optimization
Demonstrated ability to leverage AI tools for performance debugging and analysis
Experience using AI systems for parallel algorithm implementation and optimization
Strong proficiency in using AI to accelerate development workflows
Ability to effectively combine AI-generated solutions with manual optimization
Experience using AI tools for code review and quality assurance
What You Will Do
Design and optimize high-performance computing applications for large-scale clusters
Implement and tune parallel algorithms across distributed systems
Develop efficient CUDA/ROCm implementations for computational kernels
Create and maintain MPI-based distributed computing solutions
Optimize memory access patterns and communication protocols
Profile and analyze performance bottlenecks in distributed systems
Implement efficient linear algebra and scientific computing routines
Manage and optimize job scheduling and resource allocation
Design and implement fault tolerance and recovery systems
Collaborate with research teams to optimize computational workloads
Required Qualifications
M.S. or Ph.D. in Computer Science, Engineering, or related field from a top-tier university
Expert knowledge of parallel programming models (MPI, OpenMP)
Advanced expertise in CUDA or ROCm programming
Strong proficiency in C++ and parallel algorithm implementation
Deep understanding of computer architecture and memory hierarchies
Expert-level knowledge of Linux/Unix environments
Experience with large-scale cluster management
Preferred Qualifications
Experience with InfiniBand and high-speed interconnects
Knowledge of scientific computing libraries (BLAS, LAPACK)
Expertise in vectorization and SIMD optimization
Background in distributed algorithms
Experience with performance modeling and prediction
Familiarity with job scheduling systems (Slurm, PBS)
Experience with container technologies for HPC
ACM-ICPC Regional or World Finals medalist
USACO (USA Computing Olympiad) Gold or Platinum division standing
Other top-tier algorithmic competition achievements
Technical Skills
Advanced parallel programming techniques
Proficiency in performance optimization tools
Strong debugging and profiling skills
Expertise in distributed computing concepts
Knowledge of network topology and optimization
Experience with scientific computing applications
Required Tools & Technologies
CUDA/ROCm
MPI
OpenMP
Linux/Unix
Performance profiling tools
Job scheduling systems
Version control systems
C/C++
What We Offer
Access to cutting-edge HPC infrastructure
Competitive compensation package
Professional development opportunities
Collaboration with leading researchers
Health and retirement benefits