FGCI Tech: Software installations with a CI build system
Contents
- I’ll try to explain our build system from the perspective of the problems it tries to solve
- Goal: automate boring work as much as possible without compromising quality
- The current system is the ~5th iteration and has been in use for about two years.
Step 1: Building software
- We use spack for compiled software (https://spack.io)
- We use Miniconda + Mamba when we create Anaconda environments for Python use (see the example after this list)
- We use Singularity to build containers for specialized software
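- For example, building an environment with Mamba from an environment file boils down to one command (the file name here is illustrative):
mamba env create -f environment.yml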
Task: Build software with Spack
- Install OpenMPI:
spack install openmpi@4.0.5
Problem:
- Spack resolves dependencies dynamically, so the same install command can end up with different dependency resolutions
- Packages have multiple variants, so different routes lead to lots of different installations
Solution:
- Use Spack’s site configs to specify default versions / variants
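- For example, a site-wide packages.yaml can pin preferred versions, variants, compilers and targets (the variants shown here are illustrative, not our full configuration):
packages:
  openmpi:
    version: [4.0.5]
    variants: +pmi fabrics=ucx
  all:
    compiler: [gcc@8.4.0]
    target: [haswell]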
Task: Build for specific architecture with specific default compilers
spack install openmpi@4.0.5 %gcc@8.4.0 arch=linux-centos7-haswell
Problem:
- Sometimes Spack does not propagate architecture optimization flags to builds
Solution:
- Write default compiler options to ~/.spack/linux/compilers.yaml, as in the example below
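- An entry in compilers.yaml can carry the optimization flags explicitly (the paths and flags below are examples, not our actual settings):
compilers:
- compiler:
    spec: gcc@8.4.0
    operating_system: centos7
    target: x86_64
    modules: []
    paths:
      cc: /appl/gcc/8.4.0/bin/gcc
      cxx: /appl/gcc/8.4.0/bin/g++
      f77: /appl/gcc/8.4.0/bin/gfortran
      fc: /appl/gcc/8.4.0/bin/gfortran
    flags:
      cflags: -march=haswell
      cxxflags: -march=haswell
      fflags: -march=haswell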
Task: Repeat 100x
Problem:
- Lots of commands and options to write and remember
- Configuration is not versioned
Solution
- Python code that creates the commands from a minimal yaml file, runs them, and does deployment with rsync (https://github.com/AaltoSciComp/science-build-rules); see the sketch after this list
- Put the Spack site configs and build configs in a git repository (https://github.com/AaltoSciComp/science-build-configs)
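- Such a build config might look roughly like this (a hypothetical sketch only; the real schema is defined by science-build-rules and the examples in science-build-configs):
spack:
  compilers:
    - gcc@8.4.0
  packages:
    - openmpi@4.0.5 %gcc@8.4.0 arch=linux-centos7-haswell
    - fftw@3.3.8 %gcc@8.4.0 arch=linux-centos7-haswell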
Step 2: Do it automatically
- Consistent builds are a step in the right direction, but not enough
- Move towards CI
Task: Ownership troubles
- If individual admins run the build commands by hand, there will be problems
Problem:
- Files owned by different users often make builds fail
- Settings in individual admins’ environments might affect the build
Solution:
- Create a dedicated user that runs the buildrules (for us, triton-ci); see the command below
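- Creating such a user is a one-liner (the username is from our setup; the flags are standard useradd options):
sudo useradd --create-home --comment "CI build user" triton-ci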
Task: Builds can be heavy
Problem:
- Builds can create huge amounts of temporary files and take a long time
Solution:
- Run builds on a separate machine (for us, a Dell workstation with a few SSDs + an HDD for the end products)
Task: Build with desired OS
Problem:
- The build environment should be similar to the system where the software will be run
- System libraries + mountpoints should be similar
Solution:
- Run builds in Docker containers; we build for our workstations (Ubuntu 20.04) and our cluster (CentOS 7.9), as sketched below
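- A single build can be launched in a matching container like this (the image tag and mount paths are examples, not our actual setup):
docker run --rm \
  --user "$(id -u triton-ci):$(id -g triton-ci)" \
  --volume /build/tmp:/tmp \
  --volume /build/spack:/opt/spack \
  centos:7.9.2009 \
  /opt/spack/bin/spack install openmpi@4.0.5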
Task: Do builds automatically
Problem:
- Builds should happen automatically after configurations have been updated in git
Solution:
- Use buildbot (https://buildbot.net) to run the builds in containers
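- A minimal master.cfg sketch of the idea (not our actual configuration; the worker name, password, poll interval, and build command are placeholders):
from buildbot.plugins import changes, schedulers, steps, util, worker

c = BuildmasterConfig = {}
c['workers'] = [worker.Worker('centos7-builder', 'password')]
c['protocols'] = {'pb': {'port': 9989}}
# Poll the configuration repository and trigger a build on every change
c['change_source'] = [changes.GitPoller(
    'https://github.com/AaltoSciComp/science-build-configs.git',
    branches=['master'], pollInterval=300)]
c['schedulers'] = [schedulers.SingleBranchScheduler(
    name='configs-changed',
    change_filter=util.ChangeFilter(branch='master'),
    builderNames=['spack-build'])]
factory = util.BuildFactory([
    steps.Git(repourl='https://github.com/AaltoSciComp/science-build-configs.git'),
    # Placeholder command; the real step invokes science-build-rules
    steps.ShellCommand(command=['python', '-m', 'buildrules', 'spack', 'build']),
])
c['builders'] = [util.BuilderConfig(
    name='spack-build', workernames=['centos7-builder'], factory=factory)]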
Task: Set up everything together
Problem:
- The CI system + builders should be set up from configurations
Solution:
- science-build-rules has a ci-builder that sets up the builder itself
- The end product is a folder with a docker-compose setup that one can use to bring the whole system up
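- Bringing the whole stack up is then one command in that folder (the folder name is an example):
cd science-build-environment
docker-compose up -d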
Demo
- Quick demo on how we install packages
Current problems
Problem:
- Deployment for CI is not good.
- Updating spack or building from upstream can result in “build avalanches” when some package gets a new version / variant.
Solution:
- We’re switching towards Ansible so that we can set up all of the system requirements as well.
- We’re moving towards a two-step strategy:
- one slower-moving build that would build base compilers etc.
- one faster build that would build end products using the other software
Current problems
Problem:
- science-build-rules is pretty scripty at times, and configs that look simple can actually make adopting it more complicated.
- The build system sometimes moves so fast that the documentation is not up to date.
Solution:
- The code should be streamlined so that it is easier to read and adopt.
- We’re increasing the number of people involved in the project, which forces us to document it better.