Monitoring Extension for Eco2clouds

In the context of the Eco2clouds project, the core monitoring functionalities of BonFIRE were extended to provide experimenters with monitoring data about a VM as measured by the machine hosting that VM.

New host metrics

The implementation works as follows. For each host, a new metric is defined in the /etc/zabbix/zabbix_agentd.conf file.

UserParameter=custom.metrics.sent,[ -f /usr/local/share/metrics/xentop_parser.rb ] && /usr/local/share/metrics/xentop_parser.rb

The real purpose of this metric is to call the xentop_parser.rb script, which does more than just send back a value. The /etc/zabbix/zabbix_agentd.conf file must therefore also be changed to give that script enough time to run, using the following directive:

Timeout=10
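The two agent settings above can be sanity-checked with a short script. The following is a minimal sketch, not part of e2c-host-monit: the function name check_agent_conf and the min_timeout parameter are illustrative assumptions, and the parser only handles simple key=value lines.

```python
def check_agent_conf(text, min_timeout=10):
    """Return a list of problems found in the content of zabbix_agentd.conf.

    Hypothetical helper: only checks the two settings discussed above.
    """
    settings = {}
    user_params = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key=value line
        if key.strip() == "UserParameter":
            # UserParameter may appear several times, so keep them all
            user_params.append(value.strip())
        else:
            settings[key.strip()] = value.strip()

    problems = []
    if not any(p.startswith("custom.metrics.sent,") for p in user_params):
        problems.append("UserParameter custom.metrics.sent is not defined")
    # The Zabbix agent falls back to 3 seconds when Timeout is not set
    timeout = int(settings.get("Timeout", "3"))
    if timeout < min_timeout:
        problems.append(f"Timeout={timeout} is below {min_timeout}")
    return problems
```

Calling check_agent_conf(open("/etc/zabbix/zabbix_agentd.conf").read()) on a host returns an empty list when both settings are present.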

xentop_parser.rb is described in more detail in its own section below.

The infrastructure aggregator configuration must also be changed:

  • By adding an IDLE_POWER macro for each host, with an estimate of the power consumption of the machine, in watts, when no activity is running
  • By uploading a new template (e2c-infra-monitoring) to the infrastructure aggregator. This template can be found in the e2c-host-monit repository on Inria’s gforge installation.
  • By linking this new template to the BonFIRE Template. This exposes the idle power consumption and creates the metrics to be filled by xentop_parser.rb.

xentop_parser.rb

This script, running on infrastructure hosts, is configured with the /etc/e2c-host-monit/default.yml file and available in the e2c-host-monit repository on Inria’s gforge installation.

The script calls xentop to get the list of running VMs and their usage of different resources. With that information, it uses zabbix_sender to update the metrics for total resource usage by VMs, which are defined in the e2c-infra-monitoring template on the infrastructure aggregator. It also attempts to send each VM’s resource usage to that VM’s experiment aggregator. This can be done:

  • using zabbix_sender as well, which requires network connectivity between the hosts and the BonFIRE WAN.
  • using the Management Message Queue, with a message format recognised by the message spreader, so that the message is resent to the experiment message queue.
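The first part of that flow (parse xentop, aggregate, feed zabbix_sender) can be sketched as follows. This is an illustration in Python, not the actual Ruby implementation. The SAMPLE text is made-up data in the shape of xentop batch output, the metric keys (vms.count and so on) are hypothetical and not necessarily those of the e2c-infra-monitoring template, and excluding Domain-0 from the totals is an assumption. The output lines use the "host key timestamp value" layout that zabbix_sender reads when given -T.

```python
# Illustrative data only, shaped like the output of "xentop -b -f -i 1"
SAMPLE = """\
NAME STATE CPU(sec) CPU(%) MEM(k) MEM(%) MAXMEM(k) MAXMEM(%) VCPUS NETS NETTX(k) NETRX(k) VBDS VBD_OO VBD_RD VBD_WR VBD_RSECT VBD_WSECT SSID
Domain-0 -----r 12345 2.1 1048576 6.3 no limit n/a 4 0 0 0 0 0 0 0 0 0 0
one-101 --b--- 678 5.0 524288 3.1 525312 3.1 1 1 1200 3400 1 0 100 200 1000 2000 0
one-102 --b--- 90 1.5 262144 1.6 263168 1.6 1 1 10 20 1 0 5 6 50 60 0
"""


def parse_xentop(output):
    """Turn xentop batch output into a list of dicts keyed by column name."""
    header = None
    rows = []
    for line in output.splitlines():
        # "no limit" contains a space and would shift every later column
        fields = line.replace("no limit", "no-limit").split()
        if not fields:
            continue
        if fields[0] == "NAME":
            header = fields
            rows = []  # a new header starts a new iteration: keep the last one
            continue
        if header is None or len(fields) != len(header):
            continue  # ignore lines that do not match the header
        rows.append(dict(zip(header, fields)))
    return rows


def vm_totals(rows):
    """Aggregate usage over guest VMs (Domain-0 excluded by assumption)."""
    vms = [r for r in rows if r["NAME"] != "Domain-0"]
    return {
        "vms.count": len(vms),
        "vms.cpu.percent": round(sum(float(r["CPU(%)"]) for r in vms), 1),
        "vms.mem.kb": sum(int(r["MEM(k)"]) for r in vms),
    }


def sender_lines(host, values, timestamp):
    """Format values as '<host> <key> <timestamp> <value>' input lines."""
    return [f"{host} {key} {timestamp} {value}" for key, value in values.items()]
```

The resulting lines would be written to the standard input of the command configured as zbx_cmd, which reads its input from "-".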

xentop_parser.rb is not trivial to install: it runs as the zabbix user, but expects to be able to run xentop and to read each VM’s context.sh file (via grep_one_log, also part of the e2c-host-monit git repository). The default configuration file assumes this is possible using sudo, so you’ll need a sudoers configuration such as:

Defaults:zabbix !requiretty
zabbix ALL=(ALL) NOPASSWD:/usr/sbin/xentop,/usr/local/bin/grep_one_log

The configuration file itself lists the available options:

# This is the configuration file for e2c-host-monit
# all known options are referenced here
---
:debug: false
:data_filename: /var/lib/e2c-host-monit/latestdata
:lock_filename: /var/run/lock/e2c-host-monit.lock
:one_home: /var/lib/one/
:testbed: <testbed_id>
:send-to-amqp: true
:send-to-zabbix: false
:mq_host: <mq_server>
:mq_user: <testbed_mq_user>
:mq_pass: <testbed_mq_pass>
:mq_vhost: bonfire
:mq_channel: resourceUsage
:zbx_cmd: zabbix_sender -c /etc/zabbix/zabbix_agentd.conf -i - -T
:xentop_cmd: sudo /usr/sbin/xentop -bfi 1
:grep_cmd: sudo /usr/local/bin/grep_one_log
:delay-before-failure: 600
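Several values in this file are placeholders that must be replaced per testbed. A minimal sketch for spotting those left unfilled is shown below. It is not a YAML parser: it only reads flat ":key: value" lines like the ones above, and the function names are illustrative assumptions.

```python
import re

# Matches flat lines such as ":mq_host: <mq_server>"
LINE = re.compile(r"^:(?P<key>[\w-]+):\s*(?P<value>.*)$")


def read_flat_config(text):
    """Read ':key: value' lines; comments and '---' are ignored."""
    config = {}
    for raw in text.splitlines():
        match = LINE.match(raw.strip())
        if match:
            config[match.group("key")] = match.group("value").strip()
    return config


def unfilled_placeholders(config):
    """Return the sorted keys whose value is still a '<...>' placeholder."""
    return sorted(k for k, v in config.items() if re.fullmatch(r"<[^>]*>", v))
```

Run against the file as shown above, unfilled_placeholders would report the testbed and message queue settings that still need real values.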

The script also expects to be able to save data between calls to xentop in the file data_filename (by default /var/lib/e2c-host-monit/latestdata), and to avoid overlapping runs by locking the file lock_filename (by default /var/run/lock/e2c-host-monit.lock).
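The combination of a lock file and a data file saved between runs can be sketched as follows. This is a Python illustration of the general technique, not the script's actual code: the JSON format of the data file, the use of flock, and the function names are all assumptions.

```python
import fcntl
import json
import os


def run_exclusively(lock_filename, data_filename, current):
    """Swap the saved data for `current` while holding an exclusive lock.

    Returns (previous_data, True), or (None, False) if another run
    already holds the lock.
    """
    with open(lock_filename, "w") as lock:
        try:
            # Non-blocking: give up immediately if a run is in progress
            fcntl.flock(lock, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            return None, False
        previous = {}
        if os.path.exists(data_filename):
            with open(data_filename) as f:
                previous = json.load(f)
        # Write to a temporary file first so readers never see partial data
        tmp = data_filename + ".tmp"
        with open(tmp, "w") as f:
            json.dump(current, f)
        os.replace(tmp, data_filename)
        return previous, True  # the lock is released when the file closes


def cpu_deltas(previous, current):
    """CPU seconds consumed since the previous run, per VM seen in both."""
    return {vm: secs - previous[vm] for vm, secs in current.items() if vm in previous}
```

Keeping the previous sample is what allows cumulative counters, such as the CPU seconds reported by xentop, to be turned into per-interval values.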

Finally, you’ll need some Ruby libraries, in particular amqp, available in Debian in the ruby-amqp package. If you are installing using gem, amqp gem version 0.6.7 is known not to work, but amqp 1.3.0 is known to work with eventmachine version 0.12.10. eventmachine is a gem with native extensions, so it is easier to install from your distribution’s packages. Install eventmachine first, then amqp.

To debug the installation, you can run:
su -c "/usr/local/share/metrics/xentop_parser.rb --status" -s "/bin/bash" zabbix

If you run this as root instead, remember that any files or locks created while debugging may cause failures when the program is later started by the active zabbix agent as the zabbix user.

The previous command looks at the age of the data file and warns of issues. If you can run this command and zabbix is configured, you should see traces of the script’s runs in the general log files, as xentop_parser.rb relies on the syslog facility to log messages. Setting :debug: true increases the verbosity of the messages sent to syslog.
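A freshness check of this kind can be sketched in a few lines. This is a hypothetical Python equivalent, not the logic of --status itself: the returned strings are illustrative, and using 600 seconds as the threshold (the value of delay-before-failure in the configuration above) is an assumption about how the two relate.

```python
import os
import time


def data_file_status(data_filename, max_age=600, now=None):
    """Classify the data file as 'ok', 'stale' or 'missing' by its age."""
    now = time.time() if now is None else now
    try:
        age = now - os.path.getmtime(data_filename)
    except FileNotFoundError:
        return "missing"  # the script has never run, or cannot write here
    return "stale" if age > max_age else "ok"
```

A "missing" or "stale" result when run as the zabbix user typically points at permissions on the data or lock file, for example after an earlier debugging run as root.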

Monitoring Contextualisation updates

Finally, the VM contextualisation scripts (located at /srv/cloud/context in Inria’s ONE frontend installation) are updated so that the experiment aggregator:

  • dynamically adds metrics to the VMs it knows about. This happens in get-infra-monitoring-data.py
  • listens to the experiment message queue for monitoring messages, and updates the dynamically added metrics with the values received. This happens in MQtoZabbix.py, which is started by /etc/init.d/mq_monitoring

These are updated versions of the classical scripts and can be found in bonfire-dev/vm-metrics/trunk/context.

Their behaviour can be checked by looking at /var/log/infra_monitoring.log.
