How often have you endured complaints from users about server or service performance because a small subset of users is taxing the system and consuming the majority of the available resources? How do you manage resource sharing on a multi-purpose server that may run a database server and a web server together? We have experienced these situations many times, but rather than write exotic process management scripts, maintain heavily modified init scripts, or build complicated user environments, we have switched to using cgroups.
What are Cgroups?
Cgroups, or control groups, started in September 2006 as a patch for the Linux kernel initially developed by a few Google engineers. Process containers, as they were initially called, allowed administrators to restrict or prioritize processes and their access to server subsystems. In the 2.6.24 kernel release, process containers were merged into the mainline kernel and renamed cgroups.
Cgroups provide hooks into various kernel subsystems that allow an administrator to limit the resource utilization of a single process or a group of processes on one or more subsystems on the server. Cgroups are currently able to limit CPU, memory, network, and disk I/O. They can also perform resource accounting and have the ability to pause and resume processes. As cgroups have grown and developed, they have become part of LXC, an OS-level virtual server implementation similar to Linux-VServer or OpenVZ.
Consider the following example of how cgroups can be used to balance resource utilization on a server. Company XYZ, Inc. has a multi-purpose application server used by three departments: Accounting, HR, and Sales. This server also hosts XYZ, Inc.’s intranet site. Administrators frequently receive complaints from internal users about the performance of the corporate intranet site when HR is running their payroll report. Additionally, Accounting and HR are unable to run their reports at the same time due to the resulting degradation in performance. By using cgroups on this application server it is possible to improve the experience for end users in all departments.
Since cgroups are hierarchical and multiple hierarchies can co-exist, we can set up multiple, nested cgroups to keep individual users or processes from consuming all of the system resources, ensuring that other processes always have resources available to them.

For XYZ, Inc.’s application server we can create two “top level” cgroups, one called ‘departments’ and the other ‘daemons’. At the root level of each hierarchy we can set the maximum amount of resources the whole group will have available to it: we can give our ‘departments’ group 80% of the CPU and memory and the ‘daemons’ group 20%. The application Accounting uses is CPU and memory intensive, and we want to prevent it from negatively impacting the intranet site and the other departments’ applications. Within our ‘departments’ hierarchy we can create three additional cgroups, one for each department, giving each a percentage of the 80% of CPU and memory allocated to the ‘departments’ cgroup. This limits the memory and CPU that each department can consume and allows a more equitable sharing of the resources. We could continue adding sub-cgroups under ‘departments’ to limit or guarantee the amount of block I/O or network throughput for certain departments, applications, or users.
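As a rough sketch, this hierarchy could be expressed in the cgconfig.conf configuration syntax like so. The group names and mount points here are illustrative only; note that cpu.shares values are relative weights rather than percentages, so 800 versus 200 approximates the 80/20 CPU split, while memory would be capped separately with absolute limits such as memory.limit_in_bytes:

mount {
    cpu = /sys/fs/cgroup/cpu;
    memory = /sys/fs/cgroup/memory;
}

group departments {
    cpu {
        cpu.shares="800";
    }
}

group departments/hr {
    cpu {
        cpu.shares="250";
    }
}

group daemons {
    cpu {
        cpu.shares="200";
    }
}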
Utilizing Cgroups
While the explanation above is fairly straightforward, configuring cgroups can be rather daunting, as the process isn’t well documented. Cgroups use a pseudo file system for configuration and reporting, very similar to the proc or sys file systems. There are two ways to set up cgroups. The first is to write scripts that mount the pseudo file system with the subsystems that will be used, create the hierarchy, and finally set the values for the cgroups. The second uses the scripts and daemons that come in Debian’s cgroup-bin package or Red Hat’s libcgroup package. With the second method, only two configuration files need to be modified.
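As an illustration of the first, manual method, the steps look roughly like the following. These commands must be run as root, and the mount point and group name are examples only:

# mount a tmpfs to hold the per-subsystem hierarchies
mount -t tmpfs cgroup_root /sys/fs/cgroup
# mount a cgroup hierarchy with only the cpu subsystem attached
mkdir /sys/fs/cgroup/cpu
mount -t cgroup -o cpu cpu /sys/fs/cgroup/cpu
# create a cgroup and set a value on it
mkdir /sys/fs/cgroup/cpu/departments
echo 800 > /sys/fs/cgroup/cpu/departments/cpu.shares
# move the current shell into the new cgroup
echo $$ > /sys/fs/cgroup/cpu/departments/tasks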
In Debian-based distributions those files are /etc/cgconfig.conf and /etc/cgrules.conf. The basic tools to configure and manage cgroups come in the cgroup-bin package on Debian and the libcgroup package on Red Hat, which provide the cgconfig init script for setting up cgroups and the cgred daemon, which places processes in cgroups based on user-defined rules.
cgconfig
The first init script is cgconfig. This script sets up cgroups based on the /etc/cgconfig.conf configuration file. In this file you specify the subsystems you wish to control and then set values for them. Currently cgroups supports the following subsystems: blkio (block device I/O), cpu, cpuacct (CPU accounting), cpuset (CPU core affinity and NUMA access), devices (allows or denies access to block and character devices), memory, freezer (suspend and resume of processes), net_cls (tagging and classification of packets), net_prio (setting the priority of outbound traffic on a specific interface), ns (limits access to a specific namespace), and perf_event (monitoring of processes via the perf tool).
An example of a cgconfig configuration file, from the Red Hat documentation:
mount {
    cpu = /cgroup/cpu_and_mem;
    cpuacct = /cgroup/cpu_and_mem;
    memory = /cgroup/cpu_and_mem;
}

group finance {
    cpu {
        cpu.shares="250";
    }
    cpuacct {
        cpuacct.usage="0";
    }
    memory {
        memory.limit_in_bytes="2G";
        memory.memsw.limit_in_bytes="3G";
    }
}

group sales {
    cpu {
        cpu.shares="250";
    }
    cpuacct {
        cpuacct.usage="0";
    }
    memory {
        memory.limit_in_bytes="4G";
        memory.memsw.limit_in_bytes="6G";
    }
}

group engineering {
    cpu {
        cpu.shares="500";
    }
    cpuacct {
        cpuacct.usage="0";
    }
    memory {
        memory.limit_in_bytes="8G";
        memory.memsw.limit_in_bytes="16G";
    }
}
After editing the configuration file you will need to restart cgconfig via its init script, /etc/init.d/cgconfig. Restarting only unmounts and remounts the cgroup file system, but this clears all current cgroups and their configuration directives and frees all processes from cgroup restrictions. I have found it best to restart persistent daemons such as Apache or MySQL after cgconfig is restarted to ensure all of their processes are placed back under cgroup control.
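After restarting, one way to confirm that a daemon’s processes are back under cgroup control is to list the tasks file of the relevant cgroup, which contains the PIDs currently assigned to it. The path below assumes the Red Hat-style mount point used in the example configuration above:

cat /cgroup/cpu_and_mem/finance/tasks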
cgrules
The second init script starts cgred, the cgroup rules engine daemon. This daemon moves processes into the correct cgroup as they are created, taking its configuration from /etc/cgrules.conf. Processes can be matched to specific cgroups based on the user and/or group the process runs under, and/or the process name.
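For example, a cgrules.conf for the XYZ, Inc. scenario might look like the following. The user, group, and cgroup names are hypothetical; each line gives a user (or @group), optionally qualified with a process name, followed by the subsystems to apply and the destination cgroup:

# user[:process]   subsystems    destination cgroup
@hr                cpu,memory    departments/hr
@accounting        cpu,memory    departments/accounting
*:apache2          cpu,memory    daemons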
Most of the documentation I found while implementing this in some of our production environments was basic, either referencing or coming straight from the Linux kernel’s cgroups documentation; one of the most helpful documents was from Red Hat. We are almost entirely a Debian-based shop, so there are a few differences in how Debian configures cgroups. One of the main differences is the location of the root cgroup. Red Hat-based distributions mount this file system at /cgroup/, while Debian-based distributions mount it at /sys/fs/cgroup/. While you can change where this mounts, I would recommend keeping the default location, as it makes it much easier for the default init scripts to work with the cgroups.
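To see where the cgroup hierarchies are actually mounted on a given system, you can check the kernel’s mount table; the output will vary by distribution:

grep cgroup /proc/mounts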
While deploying cgroups I encountered a known issue in Debian 7 “Wheezy” where the init scripts are broken. This was with libcgroup 0.38-1, which shipped with Wheezy. There was an attempt in Wheezy to fix a bug in the init scripts that could cause processes to crash when loading a new configuration; however, the bug was not correctly fixed when Wheezy shipped. I had to use the cgconfig init script that shipped with Debian 6 “Squeeze” to get it to behave as expected.
We have cgroups implemented on our shared Linux Fusion Web Hosting platform. Being able to restrict the amount of memory and CPU that processes like Apache or PHP can consume has helped limit the impact of DoS attacks against our customers’ sites. Beyond this, we are implementing cgroups in other areas of our Linux infrastructure to provide better access to servers during times of high utilization. We have found cgroups to be an effective way to manage and enforce resource sharing in our multi-user Linux environments.