Faculty Research Software Engineer Manager (Fixed Term)
University of Cambridge
The Faculty of Mathematics seeks to recruit a Research Software Engineering (RSE) Project Manager to start as soon as possible.
The role holder is expected to provide oversight of the Faculty of Mathematics Computing Development Platform, an HPC and data analytics supercomputer facility, together with Research Software Engineers supporting related programming efforts within Faculty research groups.
The role holder will be responsible for the strategic design and enhancement of the Computing Development Platform, will provide second-tier support to research staff using the system, and will be expected to take technical responsibility for the Faculty of Mathematics’ HPC training resource. This encompasses design, tier-two support, and service improvement through identifying and addressing the fundamental, systemic issues behind service failures. It also includes the management and resolution of any service disruption. The role holder will take responsibility for the development and ongoing operation of services for HPC application development, delivery and training, and for high-performance computing, including design, specification, procurement, commissioning and support. Note that there is an HPC System Administrator in post, so the latter is a shared and strategic responsibility.
The role will provide senior-level expertise to oversee the employment and coordination of Research Software Engineers based within individual research groups. The role facilitates RSE teamwork in the areas of testing, profiling and improving the performance of parallel code in preparation for production runs on external HPC facilities, using in-depth optimisation and tuning of code submitted by users and improvement of parallel scaling characteristics. The role advises Faculty members on grant applications for RSE support and hardware procurement and supports them by offering RSE-based assistance.
The role holder is expected to have experience of programming in C, C++, Fortran 90 and Python, of using scripting languages such as Bash and Perl, and of parallel programming using OpenMP and MPI as well as programming for GPU systems. They will have a track record of administering and integrating Linux operating systems in a research environment, with experience of sustainable configuration management and automation, and of configuring and managing Linux HPC clusters and massive SMP systems, including the management of queuing systems such as Moab/Torque and Slurm. Their experience of software development will include standard software engineering practices such as source control systems, and more advanced techniques of compilation, optimisation and installation for a variety of scientific HPC applications. Knowledge of storage sub-systems, co-processors (such as the Xeon Phi) and accelerators (such as GPUs) would be very desirable.
The role holder will need to become proficient in a variety of techniques related to developing code for HPC systems: benchmarking, debugging, profiling and optimisation, including vectorisation of parallel applications. They will be expected to have experience of visualisation techniques for massively parallel computations and of programming on HPC architectures with co-processors and accelerators. The role holder would be expected to be actively involved in training PhD students. Courses for this purpose are available through the new Master’s degree in Data Intensive Science, so the role holder would be expected to advise and assist MPhil faculty members with course design and teaching.
Fixed-term: The funds for this post are available for 2 years in the first instance.
To apply for this job please visit www.jobs.cam.ac.uk.