
SIG/HPC meeting 2023-10-19


* Sherif
* Stack
* Alan Marshall
* Jeremy Siadal


Stack asked about automating the process for building slurm packages. Sherif explained how the packaging process works and how we can improve it by using upstream monitoring tools.

Jeremy suggested starting work on a Rocky HPC kernel. It will be mostly based on the standard Rocky kernel with a different configuration file.

Stack found a problem with slurmrestd and will look into it for next week.

Action items:

* Sherif to create a kernel repo for the HPC kernel, kernel-hpc-node
* Jeremy to get the ball rolling with the Intel GPU driver
* Stack to fix the slurm REST daemon and integrate it with openQA
* Sherif: staging repo for HPC

Old business:


* None for this meeting; however, we should be working on the old business action items


* Sherif: Get the SIG for drivers
* Sherif: Check the names of nvidia drivers "open, dkms and closed source"
* Chris: Benchmark nvidia open vs closed source


* Sherif: Reach out to AI SIG to check on hosting the nvidia drivers that CIQ would like to contribute - Done and waiting to hear from them -


* Sherif: Push the testing repo file to the release package
* Sherif: Test / merge the_real_swa scripts


* Sherif: Looking into the openQA testing - Pending


* Sherif: Reach out to jose-d about pmix - Done, no feedback yet -
* Greg: to reach out to openPBS and cloud charly
* Sherif: To update slurm23 to latest - Done -


* Sherif needs to update the wiki - Done
* Sherif to look into MPI stack
* Chris will send Sherif a link with intro


* Sherif release slurm23 sources - Done
* Stack and Sherif working on the HPC list
* Sherif to email Jeremy the slurm23 source URL - Done


* Sherif to look into the openHPC slurm spec file - Pending on Sherif
* We need to get a list of centres and HPC sites that are moving to Rocky to make a blog post and PR


* Get a list of packages from Jeremy to pick up from openHPC - Done
* Greg / Sherif to talk in Rocky / RESF about a generic SIG for common packages such as chaintools
* Plan the openHPC demo Chris / Sherif - Done
* Finalise the slurm package with naming / configuration - Done


* Get a demo / technical talk after 4 weeks "Sherif can arrange that with Chris" - Done
* Getting a list of packages that openHPC would like to move to distros "Jeremy will be point of contact if we need those in a couple of weeks" - Done


* Start building slurm - Ongoing, slowed down a bit by the R9.2 and R8.8 releases; however, packages are built and some minor configurations need to be fixed -
* Start building apptainer - on hold -
* Start building singularity - on hold -
* Start building warewulf - on hold -
* Sherif: check about forums - done, we can have our own section if we want, can be discussed over the chat -


* Reach out to other communities “Greg” - ongoing -
* Reaching out to different sites that use Rocky for HPC “Stack will ping a few of them and others as well -Group effort-”
* Reaching out to hardware vendors - nothing done yet -
* Statistics / public registry for sites / HPC to add themselves if they want - nothing done yet -