Guide
This section presents a guide for operators to deploy the CEEMS stack. There are two principal components in the CEEMS stack:
- CEEMS Exporter that must be installed on all the compute nodes.
- CEEMS API Server that must be installed on a service node.
Optionally, a third component, CEEMS LB, can be installed on the same service node as the CEEMS API server to enforce access control on metrics.
Prerequisites
Before starting the installation, ensure that the resource managers (SLURM/OpenStack) have the necessary configuration to work with the CEEMS exporter and API server.
Compute nodes
There are no special requirements for the CEEMS exporter to work on compute nodes. Although the exporter is not extensively tested on different OS distros/architectures, it should work on all the major distros supported by SLURM/OpenStack. The exporter is very light: with all available collectors enabled, it consumes at most around 150 MB of memory and around 0.05 seconds of CPU time per scrape request.
If the compute nodes have NVIDIA GPUs, NVIDIA DCGM and NVIDIA DCGM Exporter must be installed on them. Installation instructions for those packages can be found in their corresponding docs.
Similarly, if the compute nodes have AMD GPUs, the AMD SMI Exporter must be installed on them to get power consumption and performance metrics of the GPUs.
Finally, for SLURM or k8s clusters, if continuous profiling of jobs/pods is required, Grafana Alloy must be installed on the compute nodes.
Service node
Different services must be deployed for CEEMS. They can all be deployed on the same service node or on different nodes. Installing them on the same machine makes the services easier to manage and reduces the attack surface, as all services can be bound to localhost. The list of required services is:
- Prometheus (compulsory): To scrape metrics from exporters running on compute nodes
- CEEMS API server (compulsory): To store the jobs/VMs data in a standardized DB
- Grafana (compulsory): To construct dashboards that expose metrics to operators and end users
- CEEMS LB (optional): To enforce access control to the Prometheus metrics
- Pyroscope (optional): When continuous profiling of SLURM jobs/k8s pods is needed
The present guide assumes that Prometheus, Pyroscope (if needed) and Grafana are already installed and configured on the service node; installation instructions for each of these components can be found in their respective documentation and are therefore omitted here. The CEEMS API server and CEEMS LB require very modest system resources, so they can run alongside Prometheus and Pyroscope on the same service node. The sizing of this service node must take into account the size of the cluster, the number of Prometheus targets, the Prometheus data retention period, etc. A good recommendation is at least 32 GiB of memory and 8 CPUs, which should be enough to host all the necessary services.
When it comes to storage, Prometheus works best on local disks. Thus, depending on the required retention period, local SSD/NVMe disks with RAID for fault tolerance are a good starting point. There are also options like Thanos and Cortex to achieve long-term storage and fault tolerance for Prometheus data.
Installation Steps
The installation steps in this section make the following assumptions:
- There are two sets of compute nodes: one compute node without GPUs (`compute-0`) and one compute node with NVIDIA GPUs (`compute-gpu-0`).
- A single service node `service-0` is used to install all CEEMS-related services.
- For containerized deployments, `podman` will be used along with Quadlet to manage container services.
Installing Exporter(s)
Firstly, all the necessary repositories must be added to the local YUM or DEB repositories. If local repositories are not maintained, the package files can be downloaded and installed directly. The following packages and/or repositories must be added:
- CEEMS Exporter, API Server and Load Balancer RPM and DEB files can be downloaded from GH Releases.
- When NVIDIA GPUs are present on the cluster, the CUDA repos must be added.
Once all the necessary packages are downloaded and/or added to the repositories, they can be installed on the compute nodes.
On the compute nodes, the following packages must be installed:
RHEL/CentOS/Rockylinux/Alma
```bash
whoami
# root
hostname
# compute-0 or compute-gpu-0 or service-0
dnf install ceems_exporter -y
```
When nodes have NVIDIA GPUs, we need to install NVIDIA DCGM and the NVIDIA DCGM exporter. The current guide assumes that NVIDIA driver `>=550` and CUDA `>=12` are available on the compute nodes.
```bash
whoami
# root
hostname
# compute-gpu-0
dnf install datacenter-gpu-manager-4-core datacenter-gpu-manager-4-cuda12 datacenter-gpu-manager-4-devel datacenter-gpu-manager-4-proprietary datacenter-gpu-manager-4-proprietary-cuda12 datacenter-gpu-manager-exporter -y
```
Debian/Ubuntu
```bash
whoami
# root
hostname
# compute-0
apt-get install ceems_exporter -y
```
When nodes have NVIDIA GPUs, we need to install NVIDIA DCGM and the NVIDIA DCGM exporter. The current guide assumes that NVIDIA driver `>=550` and CUDA `>=12` are available on the compute nodes.
```bash
whoami
# root
hostname
# compute-gpu-0
apt-get install datacenter-gpu-manager-4-core datacenter-gpu-manager-4-cuda12 datacenter-gpu-manager-4-devel datacenter-gpu-manager-4-proprietary datacenter-gpu-manager-4-proprietary-cuda12 datacenter-gpu-manager-exporter -y
```
We also install `ceems_exporter` on the service node `service-0` to export real-time and static emission factor data.
Configuring Exporter(s)
CEEMS Exporter
At a minimum, the CEEMS exporter must be configured with the CLI arguments that enable the relevant collectors. This can be done using environment variables provided to the systemd service file installed by the package. For instance, to enable the SLURM collector and disable collector metrics, the following must be added to the systemd service file:
```bash
whoami
# root
hostname
# compute-0 or compute-gpu-0
cat > /etc/systemd/system/ceems_exporter.service.d/override.conf << EOF
[Service]
Environment=CEEMS_EXPORTER_OPTIONS="--collector.slurm --web.disable-exporter-metrics"
EOF
```
Similarly, for OpenStack compute nodes, a basic runtime configuration would be as follows:
```bash
whoami
# root
hostname
# compute-0 or compute-gpu-0
cat > /etc/systemd/system/ceems_exporter.service.d/override.conf << EOF
[Service]
Environment=CEEMS_EXPORTER_OPTIONS="--collector.libvirt --web.disable-exporter-metrics"
EOF
```
Optionally, if emissions must be estimated using real-time emission factors, we need to deploy another instance of the CEEMS exporter on the service node, `service-0`, to pull the emission factors and export them to Prometheus. To enable real-time emission factors from Electricity Maps and RTE eCO2 Mix, the CLI options for this exporter must be:
```bash
whoami
# root
hostname
# service-0
cat > /etc/systemd/system/ceems_exporter.service.d/override.conf << EOF
[Service]
Environment=CEEMS_EXPORTER_OPTIONS="--collector.emissions --collector.emissions.provider=rte --collector.emissions.provider=emaps --collector.disable-defaults --web.disable-exporter-metrics"
EOF
```
Operators must verify the usage policy of the Electricity Maps API before using it in production.
The CEEMS package supports static emission factors based on historical data provided by OWID. To estimate emissions using this static factor, there is no need to deploy the above instance of the CEEMS exporter; emissions will be estimated directly using the static factor value for the given country.
More details on the runtime configuration of the CEEMS exporter can be found in the docs.
By default, no authentication is enabled on the CEEMS exporter, and it is strongly recommended to add at least basic authentication. This is done using a web configuration file, which is installed by the packages. More details on all the available options for web configuration can be found in its dedicated section.
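For reference, the web configuration file follows the Prometheus exporter-toolkit format; a minimal basic-auth sketch (the username and hash below are placeholders) looks like this:

```yaml
# web-config.yml -- minimal sketch with a single basic auth user
basic_auth_users:
  # bcrypt hash of the plain text password, never the password itself
  ceems: <BCRYPT_HASHED_PASSWORD>
```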
There is a utility tool `ceems_tool`, distributed with the CEEMS API server package, that can be used to generate the web config file. Assuming `ceems_tool` is available on the current host, the web config file can be generated as follows:
```bash
ceems_tool config create-web-config
```
This command will generate a web config file named `web-config.yml`, with a basic auth configuration, in a `config` folder in the current directory. The config file only contains the hashed password; the output of the command shows the password in plain text. For example, the output of the above command would be:
```
web config file created at config/web-config.yml
plain text password for basic auth is <PASSWORD_WILL_BE_DISPLAYED_HERE>
store the plain text password securely as you will need it to configure Prometheus
```
This password must be stored securely, as it will be needed when configuring Prometheus. The generated web configuration file must be placed at `/etc/ceems_exporter/web-config.yml` on the compute nodes.
Finally, the CEEMS exporter must be enabled to start at boot and restarted for the changes to take effect.
```bash
whoami
# root
hostname
# compute-0 or compute-gpu-0
systemctl enable ceems_exporter.service
systemctl start ceems_exporter.service
```
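Once the service is up, the exporter can be checked locally. Port `9010` matches the Prometheus targets used later in this guide; replace `<PASSWORD>` with the generated plain text password:

```bash
# Scrape the exporter once over basic auth and show the first few metrics
curl -s -u ceems:<PASSWORD> http://localhost:9010/metrics | head
```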
For containerized deployments using Podman Quadlets, sample systemd Quadlet files are provided in the repository. Steps to deploy the Quadlets:
- Copy the `ceems_exporter.network` and `ceems_exporter.container` files to the `/etc/containers/systemd` folder.
- Create the `/etc/ceems_exporter` folder on the host and copy the generated web configuration file to `/etc/ceems_exporter/web-config.yml`.
- Modify the `Exec` directive in the `ceems_exporter.container` file to add the relevant CLI options (see the sketch after this list).
- Execute `systemctl daemon-reload`, which should generate the necessary service files.
- Finally, launch the service using `systemctl start ceems_exporter.service`.
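For illustration, after the third step the relevant part of `ceems_exporter.container` might look like the following; the options mirror the SLURM example above, and the rest of the unit is left as shipped in the repository:

```ini
[Container]
# CLI options passed to the containerized exporter (illustrative)
Exec=--collector.slurm --web.disable-exporter-metrics
```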
DCGM Exporter
The DCGM exporter needs a CSV file that lists all the metrics to be monitored. The `datacenter-gpu-manager-exporter` package installs a default file at `/etc/dcgm-exporter/default-counters.csv` which enables the important metrics. Replace the contents of the `default-counters.csv` file with the one provided in the CEEMS repo, which enables more profiling metrics than the default one.
By default, the DCGM exporter runs without any authentication, and it is desirable to run it behind basic auth. The DCGM exporter supports the same web configuration file as the CEEMS exporter, so the same web configuration can be used for both exporters. Assuming the web configuration file is installed as `/etc/dcgm-exporter/web-config.yml`, it can be passed to the DCGM exporter using the environment variable `DCGM_EXPORTER_WEB_CONFIG_FILE`.
```bash
whoami
# root
hostname
# compute-0 or compute-gpu-0
cat > /etc/systemd/system/nvidia-dcgm-exporter.service.d/override.conf << EOF
[Service]
Environment=DCGM_EXPORTER_WEB_CONFIG_FILE=/etc/dcgm-exporter/web-config.yml
EOF
```
The final step is to enable and start the DCGM exporter service.
```bash
whoami
# root
hostname
# compute-0 or compute-gpu-0
systemctl enable nvidia-dcgm-exporter.service
systemctl start nvidia-dcgm-exporter.service
```
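As with the CEEMS exporter, the DCGM exporter can be checked locally on port `9400` (assuming power usage is among the enabled counters in the CSV file):

```bash
# Scrape the DCGM exporter over basic auth and look for a GPU power metric
curl -s -u ceems:<PASSWORD> http://localhost:9400/metrics | grep DCGM_FI_DEV_POWER_USAGE
```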
To deploy the DCGM exporter as a Podman container, ensure the Podman version is `> 4.3`. The NVIDIA Container Toolkit must be installed before deploying the DCGM exporter container. For Podman, the Container Device Interface (CDI) must be configured; more details can be found in the NVIDIA CDI docs.
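In short, CDI specifications can be generated with the `nvidia-ctk` CLI shipped with the NVIDIA Container Toolkit:

```bash
# Generate the CDI specification for the available GPUs
nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
# Verify the generated device names
nvidia-ctk cdi list
```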
For containerized deployments using Podman Quadlets, sample systemd Quadlet files are provided in the repository. Steps to deploy the Quadlets:
- Copy the `nvidia-dcgm-exporter.container` file to the `/etc/containers/systemd` folder.
- Create the `/etc/dcgm-exporter` folder on the host, then copy the generated web configuration file to `/etc/dcgm-exporter/web-config.yml` and the counters CSV to `/etc/dcgm-exporter/default-counters.csv`.
- Execute `systemctl daemon-reload`, which should generate the necessary service files.
- Finally, launch the service using `systemctl start nvidia-dcgm-exporter.service`.
Configuring Prometheus
Assuming Prometheus has already been installed on `service-0`, the following scrape configuration must be added to Prometheus. Remember that in the current deployment scenario, we have:
- 1 compute node without GPUs: `compute-0`
- 1 compute node with NVIDIA GPUs: `compute-gpu-0`
- 1 service node where emission factors are fetched and exported: `service-0`

We define three different scrape jobs, `cpu-nodes`, `gpu-nodes` and `service-nodes`, to set up the CEEMS exporter targets. We can either add the DCGM exporter targets to the `gpu-nodes` job or define a separate scrape job for the DCGM exporter. In the current scenario, we set up the DCGM exporters in the same job.
We will need the plain text basic auth passwords generated for the CEEMS and DCGM exporters in the previous step to configure the Prometheus scrape jobs.
The scrape jobs configuration would be as follows:
```yaml
# A list of scrape configurations.
scrape_configs:
  - job_name: cpu-nodes
    scheme: http
    metrics_path: /metrics
    basic_auth:
      username: ceems
      password: <BASIC_AUTH_PLAIN_TEXT_PASSWORD>
    static_configs:
      - targets:
          - compute-0:9010

  - job_name: gpu-nodes
    scheme: http
    metrics_path: /metrics
    basic_auth:
      username: ceems
      password: <BASIC_AUTH_PLAIN_TEXT_PASSWORD>
    # This relabel_config must be added to all
    # scrape jobs that have DCGM targets
    metric_relabel_configs:
      - source_labels:
          - modelName
          - UUID
        target_label: gpuuuid
        regex: NVIDIA(.*);(.*)
        replacement: $2
        action: replace
      - source_labels:
          - modelName
          - GPU_I_ID
        target_label: gpuiid
        regex: NVIDIA(.*);(.*)
        replacement: $2
        action: replace
      - regex: UUID
        action: labeldrop
      - regex: GPU_I_ID
        action: labeldrop
    static_configs:
      - targets:
          - compute-gpu-0:9010
          - compute-gpu-0:9400

  # This job is needed only when the exporter is deployed
  # on the service node to pull real time emission factors
  # from RTE eCO2 Mix and/or Electricity Maps
  - job_name: service-nodes
    scheme: http
    metrics_path: /metrics
    basic_auth:
      username: ceems
      password: <BASIC_AUTH_PLAIN_TEXT_PASSWORD>
    static_configs:
      - targets:
          - service-0:9010
```
All the Prometheus scrape jobs that have DCGM exporter targets must include a `metric_relabel_configs` section as follows:
```yaml
metric_relabel_configs:
  - source_labels:
      - modelName
      - UUID
    target_label: gpuuuid
    regex: NVIDIA(.*);(.*)
    replacement: $2
    action: replace
  - source_labels:
      - modelName
      - GPU_I_ID
    target_label: gpuiid
    regex: NVIDIA(.*);(.*)
    replacement: $2
    action: replace
  - regex: UUID
    action: labeldrop
  - regex: GPU_I_ID
    action: labeldrop
```
This is only a basic configuration; more options can be found in the Prometheus configuration docs. Once this configuration has been added, reload Prometheus and check that it is able to scrape the targets. This can be verified using the Prometheus web UI.
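Before reloading, the configuration can be validated with `promtool`, which ships with Prometheus (assuming the config lives at the default path):

```bash
promtool check config /etc/prometheus/prometheus.yml
```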
Once Prometheus is able to scrape targets and ingest metrics, we need to add recording rules to create new derived metrics from the raw metrics exported by the CEEMS and DCGM exporters. The advantage of using recording rules is that Prometheus calculates these metrics once at ingest time, so there is no need to recompute them every time we make queries.
Recording rules can be created with `ceems_tool` using the following command:
```bash
ceems_tool tsdb create-recording-rules --url=http://<PROMETHEUS_BASIC_AUTH_USERNAME>:<PROMETHEUS_BASIC_AUTH_PASSWORD>@service-0:9090 --country-code=FR
```
When the Redfish collector is enabled on the CEEMS exporters and the Redfish server has multiple chassis defined, the above command will ask for user input on which chassis must be used to estimate power consumption. As different chassis can report the power consumption of different components, operators must choose a chassis that reports the power consumption of the host.
The `--url` must be the URL at which the Prometheus server is running, and `--country-code` must be the ISO2 country code used to get the emission factor. This command will generate recording rules files in a folder named `rules` inside the current directory. Copy these rules files to the `/etc/prometheus/rules` directory and set the following configuration for Prometheus:
```yaml
rule_files:
  - /etc/prometheus/rules/*.rules
```
Reload Prometheus and verify that the rules are being evaluated and recorded correctly.
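The generated rule files can also be validated with `promtool` before reloading:

```bash
promtool check rules /etc/prometheus/rules/*.rules
```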
Installing and Configuring CEEMS API Server
Before going to this step, ensure that Prometheus is able to scrape the targets and that all the metrics are being monitored.
The CEEMS API server can be installed on the same host as Prometheus or on a different one. In the current example, we use the same host as Prometheus for simplicity. Assuming Prometheus has been installed on the service node `service-0`, the CEEMS API server can be installed as follows:
RHEL/CentOS/Rockylinux/Alma
```bash
whoami
# root
hostname
# service-0
dnf install ceems_api_server -y
```
Debian/Ubuntu
```bash
whoami
# root
hostname
# service-0
apt-get install ceems_api_server -y
```
The CEEMS API server stores all the data related to compute units, so it is strongly recommended to protect the server using authentication. It supports the same authentication mechanism as the CEEMS and DCGM exporters, as explained in the previous section. `ceems_tool` can be leveraged to generate a web configuration file. Copy the generated configuration file to `/etc/ceems_api_server/web-config.yml`.
Now the CEEMS API server config must be updated. The `ceems_api_server` package installs a default configuration file at `/etc/ceems_api_server/config.yml` with sane defaults. More details about the configuration parameters can be found in the Configuration Reference. Here we need to add configuration for the `clusters` and `updaters` sections in that file.
First, we start with the `updaters` section. `updaters` is a list of servers that will be used to estimate the aggregate metrics of each compute unit and store them in a SQL DB. In the current scenario, the Prometheus server is an updater from which we can estimate the aggregate metrics of compute units. The advantage of using updaters is that we do not need to make expensive repeated queries to Prometheus to get the aggregate values of the metrics.
In order to estimate aggregated metrics, we need to configure the updater with the TSDB queries that estimate them. Assuming the recording rules for Prometheus have been added as explained in the Configuring Prometheus section, we can generate the queries needed for the updater using `ceems_tool` as follows:
```bash
ceems_tool tsdb create-ceems-tsdb-updater-queries --url=http://<PROMETHEUS_BASIC_AUTH_USERNAME>:<PROMETHEUS_BASIC_AUTH_PASSWORD>@service-0:9090
```
The `--url` must point to the Prometheus URL. The above command will output the `queries` configuration section to the terminal; copy this output. Every updater must have a unique identifier. Assuming `prom-tsdb` as the identifier and `QUERIES_OUTPUT` as the queries returned by the above command, the following configuration must be added to the `updaters` section in `/etc/ceems_api_server/config.yml`:
```yaml
updaters:
  - id: prom-tsdb
    updater: tsdb
    web:
      url: http://service-0:9090
      basic_auth:
        username: <PROMETHEUS_BASIC_AUTH_USERNAME>
        password: <PROMETHEUS_BASIC_AUTH_PASSWORD>
    extra_config:
      queries: <QUERIES_OUTPUT>
```
Finally, we need to configure the `clusters` section in the configuration file. The `clusters` section defines the list of clusters from which we fetch compute units data. There can be multiple clusters of the same kind or multiple clusters of different kinds. Each cluster must be identified by a unique identifier, just like each `updater`.
We assume that the resource manager in the current scenario is SLURM. In this case, the host where the CEEMS API server will be deployed must be configured as a SLURM client so that it can execute the `sacct` command to get the list of jobs. Assuming that has been done, the `clusters` section in `/etc/ceems_api_server/config.yml` must have the following configuration:
```yaml
clusters:
  - id: slurm-cluster
    manager: slurm
    # Updater id that we defined in the `updaters` section.
    # Aggregate metrics of each job will be estimated by querying
    # against this Prometheus server.
    updaters:
      - prom-tsdb
    # If the `sacct` command is installed in a non-standard location,
    # set the path here
    cli:
      path: /usr/bin
```
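Before starting the server, the SLURM client setup can be sanity-checked with a quick `sacct` query; this is a sketch, and the time window and output fields are illustrative:

```bash
# List today's jobs to confirm the host can talk to the SLURM controller
sacct --starttime=today --format=JobID,User,State | head
```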
With the above `clusters` and `updaters` configurations in place in `/etc/ceems_api_server/config.yml`, we can enable and start the CEEMS API server:
```bash
systemctl enable ceems_api_server.service
systemctl start ceems_api_server.service
```
Once the API server has started, we can check its health by hitting the endpoint `http://localhost:9020/api/v1/health`, assuming we are on the host where the API server has been deployed.
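For example:

```bash
# Health check of the CEEMS API server;
# add -u ceems:<PASSWORD> if basic auth is enabled on this endpoint
curl http://localhost:9020/api/v1/health
```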
Once Prometheus and the CEEMS API server are up and running, we can configure Grafana to use these servers as data sources for building dashboards.
Configuring Grafana
The final step of the deployment guide is to configure Grafana to use Prometheus and the CEEMS API server as datasources to build dashboards. Assuming the Grafana server is also installed on the same service node `service-0`, we first need to ensure that Grafana is configured to send the user header to datasources. This can be done with the following configuration in the `grafana.ini` file:
```ini
[dataproxy]
send_user_header = true
```
or by setting the `GF_DATAPROXY_SEND_USER_HEADER=true` environment variable on the Grafana server.
Next, we need to install the Grafana Infinity Datasource plugin using the following command:
```bash
grafana-cli plugins install yesoreyeram-infinity-datasource
```
Once the plugin has been installed, restart the Grafana server.
Plugin versions `yesoreyeram-infinity-datasource < 3.x` do not support the `X-Grafana-User` header, which the CEEMS API server needs to identify the current user. So it is recommended to use version `>= 3.x`.
We use Grafana provisioning to define the datasources. A sample provisioned datasources file is provided in the repository. For the current scenario, the provisioning file would be as follows:
```yaml
# Configuration file version
apiVersion: 1

# List of datasources that CEEMS uses
datasources:
  # Vanilla Prometheus datasource that DOES NOT IMPOSE ANY ACCESS CONTROL
  - name: prom
    type: prometheus
    access: proxy
    # Replace it with Prometheus URL
    url: <PROMETHEUS_URL>
    basicAuth: true
    # Replace it with Prometheus basic auth username
    basicAuthUser: <PROMETHEUS_BASIC_AUTH_USERNAME>
    secureJsonData:
      # Replace it with Prometheus basic auth password
      basicAuthPassword: <PROMETHEUS_BASIC_AUTH_PASSWORD>

  # CEEMS API server JSON datasource
  - name: ceems-api
    type: yesoreyeram-infinity-datasource
    url: <CEEMS_API_SERVER_URL>
    basicAuth: true
    # Replace it with CEEMS API server basic auth username
    basicAuthUser: <CEEMS_API_SERVER_BASIC_AUTH_USERNAME>
    jsonData:
      auth_method: basicAuth
      timeout: 120
      # Replace it with CEEMS API server URL
      allowedHosts:
        - <CEEMS_API_SERVER_URL>
      httpHeaderName1: X-Grafana-User
    secureJsonData:
      # Replace it with CEEMS API server basic auth password
      basicAuthPassword: <CEEMS_API_SERVER_BASIC_AUTH_PASSWORD>
      # This will be replaced by the username before passing to the API server
      # This feature is available only for yesoreyeram-infinity-datasource >= 3.x
      # IMPORTANT: Need $$ to escape $
      httpHeaderValue1: $${__user.login}
```
Replace the placeholders with actual values and install the file in the `/etc/grafana/provisioning/datasources` folder. After restarting Grafana, all the newly provisioned datasources should be available.
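Provisioning can be verified through the Grafana HTTP API (the admin credentials here are illustrative):

```bash
# List configured datasources; expects `prom` and `ceems-api` in the output
curl -s -u admin:<GRAFANA_ADMIN_PASSWORD> http://localhost:3000/api/datasources
```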
The next step is to set up dashboards to visualize the metrics of compute units. This can be done using Grafana provisioning as well. A reference set of dashboards is provided in the repository. More details on the dashboards are provided in the README.
Optional Steps
Note that with the above installation steps, a functional CEEMS deployment is assured. However, if access control to Prometheus data must be enforced, an additional component, CEEMS LB, must also be deployed. In a nutshell, this component sits between Grafana and Prometheus and introspects the queries coming from Grafana to verify that the user making a query has view access to the metrics of the compute unit they are querying.
As discussed in the Prerequisites, in order to enable continuous profiling of SLURM jobs or k8s pods, Grafana Alloy must be installed on the compute nodes and Pyroscope must be installed on the service node.
Deploying Grafana Alloy and Pyroscope
First, ensure that the Grafana Alloy and Pyroscope packages and/or repositories have been added and enabled. We install the Pyroscope server first so that Grafana Alloy running on the compute nodes can send profile data to it. We deploy Pyroscope on the service node `service-0`:
RHEL/CentOS/Rockylinux/Alma
```bash
whoami
# root
hostname
# service-0
dnf install pyroscope -y
```
Debian/Ubuntu
```bash
whoami
# root
hostname
# service-0
apt-get install pyroscope -y
```
A basic configuration file is provided in the repository and can be used as a good starting point. It must be installed at `/etc/pyroscope/config.yml`. More details on Pyroscope configuration can be found in the documentation.
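For orientation, a minimal monolithic Pyroscope configuration might look like the following sketch; the keys shown are assumptions based on upstream defaults, and the repository-provided file should take precedence:

```yaml
# /etc/pyroscope/config.yml -- minimal sketch, not the repository-provided file
target: all
server:
  # Bind to localhost only; front it with a reverse proxy for external access
  http_listen_address: 127.0.0.1
  http_listen_port: 4040
```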
It is highly recommended to configure TLS auth for Pyroscope to enforce authentication. If managing TLS certificates is not desired, we recommend using basic auth by exposing Pyroscope behind a reverse proxy like nginx and configuring the nginx server block with basic auth credentials. In the absence of any form of authentication, end users in a typical HPC environment would be able to query the Pyroscope server directly, which is not desired.
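As an illustration of the reverse proxy approach, a minimal nginx server block with basic auth in front of Pyroscope could look like this (the ports and the htpasswd path are assumptions):

```nginx
server {
    # Public endpoint that clients will use
    listen 4041;
    location / {
        auth_basic "Pyroscope";
        auth_basic_user_file /etc/nginx/.htpasswd;
        # Pyroscope assumed to listen on localhost:4040
        proxy_pass http://127.0.0.1:4040;
        proxy_set_header Host $host;
    }
}
```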
On the compute nodes, the following packages must be installed:
RHEL/CentOS/Rockylinux/Alma
```bash
whoami
# root
hostname
# compute-0 or compute-gpu-0
dnf install alloy -y
```
Debian/Ubuntu
```bash
whoami
# root
hostname
# compute-0 or compute-gpu-0
apt-get install alloy -y
```
A sample configuration file is provided in the repository. The necessary placeholders in the sample config file must be replaced, and the file must be installed at `/etc/alloy/config.alloy`.
We need to enable the Grafana Alloy targets discoverer component on the CEEMS exporter so that it provides Grafana Alloy with a list of targets to profile. This can be done by configuring the `CEEMS_EXPORTER_OPTIONS` environment variable for the CEEMS exporter service:
```bash
whoami
# root
hostname
# compute-0 or compute-gpu-0
cat > /etc/systemd/system/ceems_exporter.service.d/override.conf << EOF
[Service]
Environment=CEEMS_EXPORTER_OPTIONS="--collector.slurm --collector.alloy-targets --web.disable-exporter-metrics"
EOF
```
Finally, enable and restart both CEEMS Exporter and Grafana Alloy services:
```bash
whoami
# root
hostname
# compute-0 or compute-gpu-0
systemctl enable ceems_exporter.service
systemctl restart ceems_exporter.service
systemctl enable alloy.service
systemctl restart alloy.service
```
If Grafana Alloy throws any errors, ensure that `alloy.service` is running as the root user in the systemd service file. Grafana Alloy needs to access many files in the `/proc` and `/sys` file systems to continuously profile processes, and this access is not permitted for non-privileged users.
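A drop-in override to run Alloy as root follows the same pattern used for the other services above; this is a sketch, and the packaged unit may use a different user directive:

```bash
cat > /etc/systemd/system/alloy.service.d/override.conf << EOF
[Service]
User=root
EOF
systemctl daemon-reload
systemctl restart alloy.service
```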
After this step, Grafana Alloy should be sending profile data to Pyroscope for every SLURM job on the compute node.
Installing and Configuring CEEMS LB
Before going to this step, ensure that CEEMS API server, Prometheus and Grafana are installed, configured and working as expected.
CEEMS LB is a simple proxy/load balancer that does not need many resources, and it is more practical and secure to deploy it on the same node where Prometheus is running.
In the current example, we use the same host as Prometheus for simplicity. Assuming Prometheus has been installed on the service node `service-0`, CEEMS LB can be installed as follows:
RHEL/CentOS/Rockylinux/Alma
```bash
whoami
# root
hostname
# service-0
dnf install ceems_lb -y
```
Debian/Ubuntu
```bash
whoami
# root
hostname
# service-0
apt-get install ceems_lb -y
```
Just like the CEEMS exporter and API server, it is strongly recommended to protect the load balancer using authentication. It supports the same authentication mechanism as the other components, as explained in the previous section. `ceems_tool` can be leveraged to generate a web configuration file. Copy the generated configuration file to `/etc/ceems_lb/web-config.yml`.
Now the CEEMS LB config must be updated. The `ceems_lb` package installs a default configuration file at `/etc/ceems_lb/config.yml` with sane defaults. More details about the configuration parameters can be found in the Configuration Reference. The core configuration for CEEMS LB is simple: it takes two keys, `strategy` and `backends`. `strategy` is the load balancing strategy, whereas `backends` is the list of TSDB (and/or Pyroscope) backends.
The true value that CEEMS LB offers is the ability to provide access control to Prometheus query data; deploying CEEMS LB without access control enabled is not very useful and is not recommended. In order to enable access control, an additional section `ceems_api_server` must be provided in the CEEMS LB config. This section must provide either the `ceems_api_server.data` section or the `ceems_api_server.web` section. If CEEMS LB is able to access the DB files of the CEEMS API server, it is recommended to set `ceems_api_server.data.path` so that CEEMS LB queries the DB directly. If the CEEMS API server's DB files are not available to CEEMS LB, it will make HTTP requests to the CEEMS API server to verify the ownership of compute units. Giving CEEMS LB direct access to the DB files is preferred to maximize performance and minimize latency.
In the current scenario, as both the CEEMS API server and CEEMS LB are deployed on the same physical host, we use the `ceems_api_server.data.path` method for DB access. The configuration file would be as follows:
```yaml
ceems_lb:
  # Load balancing strategy
  strategy: round-robin
  # List of Prometheus and/or Pyroscope backends
  backends:
    # `id` should be the same as configured in the `clusters` config.
    - id: slurm-cluster
      tsdb:
        - web:
            url: <PROMETHEUS_URL>
            basic_auth:
              username: <PROMETHEUS_BASIC_AUTH_USERNAME>
              password: <PROMETHEUS_BASIC_AUTH_PASSWORD>
      # When Pyroscope is also deployed
      pyroscope:
        - web:
            url: <PYROSCOPE_URL>

# Must be the same config as configured for `ceems_api_server` at `/etc/ceems_api_server/config.yml`
ceems_api_server:
  data:
    path: /var/lib/ceems
```
Replace the contents of the configuration file at `/etc/ceems_lb/config.yml` with the above (after replacing the placeholders), then enable and start the CEEMS LB service:
```bash
whoami
# root
hostname
# service-0
systemctl enable ceems_lb.service
systemctl start ceems_lb.service
```
This should ensure that CEEMS LB is running at `localhost:9030`. When a Pyroscope server has also been deployed and configured in `ceems_lb.backends`, we will notice another HTTP server running at `localhost:9040`. The server running at `localhost:9030` is the load balancer for the Prometheus backends, while the server running at `localhost:9040` is the load balancer for the Pyroscope backends. This can be confirmed by looking at the logs of `ceems_lb`:
```
time=2025-02-13T16:43:51.775Z level=INFO source=frontend.go:220 msg="Starting ceems_lb" backend_type=pyroscope listening=127.0.0.1:9040
time=2025-02-13T16:43:51.775Z level=INFO source=tls_config.go:347 msg="Listening on" backend_type=pyroscope address=127.0.0.1:9040
time=2025-02-13T16:43:51.775Z level=INFO source=tls_config.go:350 msg="TLS is disabled." backend_type=pyroscope http2=false address=127.0.0.1:9040
time=2025-02-13T16:43:51.775Z level=INFO source=helpers.go:55 msg="Starting health checker" backend_type=tsdb
time=2025-02-13T16:43:51.775Z level=INFO source=frontend.go:220 msg="Starting ceems_lb" backend_type=tsdb listening=127.0.0.1:9030
time=2025-02-13T16:43:51.776Z level=INFO source=tls_config.go:347 msg="Listening on" backend_type=tsdb address=127.0.0.1:9030
time=2025-02-13T16:43:51.776Z level=INFO source=tls_config.go:350 msg="TLS is disabled." backend_type=tsdb http2=false address=127.0.0.1:9030
time=2025-02-13T16:43:51.776Z level=INFO source=helpers.go:55 msg="Starting health checker" backend_type=pyroscope
```
Adding CEEMS LB and Pyroscope Datasources on Grafana
When CEEMS LB and Pyroscope have been deployed, in addition to the datasources configured for Grafana in the above section, we need to add three new datasources: two for CEEMS LB (the Prometheus and Pyroscope backends) and one for vanilla Pyroscope (without any access control). A sample provisioning config file for these datasources is shown below:
```yaml
# Configuration file version
apiVersion: 1

# List of additional datasources that CEEMS uses
datasources:
  # Vanilla Pyroscope datasource that DOES NOT IMPOSE ANY ACCESS CONTROL
  - name: pyro
    type: pyroscope
    access: proxy
    # Replace it with Pyroscope URL
    url: <PYROSCOPE_URL>
    # If the Pyroscope server has basic authentication
    # configured, ensure that it has been added here as well

  - name: ceems-lb-tsdb
    # It should be of type Prometheus
    type: prometheus
    access: proxy
    url: http://localhost:9030
    basicAuth: true
    basicAuthUser: <CEEMS_LB_BASIC_AUTH_USERNAME>
    jsonData:
      prometheusVersion: 2.51
      prometheusType: Prometheus
      timeInterval: 30s
      incrementalQuerying: true
      cacheLevel: Medium
      # This is CRUCIAL. We need to send this header for CEEMS LB
      # to proxy the request to the correct backend
      httpHeaderName1: X-Ceems-Cluster-Id
    secureJsonData:
      basicAuthPassword: <CEEMS_LB_BASIC_AUTH_PASSWORD>
      # It must be the same `id` configured across CEEMS components
      httpHeaderValue1: slurm-cluster

  - name: ceems-lb-pyro
    # It should be of Pyroscope type
    type: pyroscope
    access: proxy
    url: http://localhost:9040
    basicAuth: true
    basicAuthUser: <CEEMS_LB_BASIC_AUTH_USERNAME>
    jsonData:
      # This is CRUCIAL. We need to send this header for CEEMS LB
      # to proxy the request to the correct backend
      httpHeaderName1: X-Ceems-Cluster-Id
    secureJsonData:
      basicAuthPassword: <CEEMS_LB_BASIC_AUTH_PASSWORD>
      # It must be the same `id` configured across CEEMS components
      httpHeaderValue1: slurm-cluster
```
After replacing the placeholders, this file must be installed in the `/etc/grafana/provisioning/datasources` folder, and the Grafana server must be restarted.
Finally, while importing dashboards, the datasources for SLURM Single Job Metrics and for OpenStack Single VM Metrics must be configured as `ceems-lb-tsdb` and `ceems-lb-pyro` (the latter only for SLURM). This ensures that the queries made by Grafana are intercepted by CEEMS LB, which enforces access control and then decides whether or not to proxy the request to the backend.
Conclusion
This guide provides an overall view of all the steps needed to configure CEEMS, Prometheus and Grafana. It should be used only as a guide and must be adapted to the needs and constraints of each individual data center. Any suggestions to improve this guide are always welcome; please do not hesitate to open a bug report if any errors are found here.