Mark Burgess
Oslo University College
Many schemes for installing, monitoring, correcting, and upgrading clusters of hosts are in use. Apart from minor differences, they all follow the same basic themes: They feature a centralized user interface, report errors to a human, and provide some tools for that human to fix errors. The weakness in that design is that it channels all problems and errors through a single, central control point. That presents not only a bottleneck, but also multiple possible points of failure. Most designs require a human to make decisions about even the simplest issue. That process is inefficient, unreliable, and does not scale.
Even in the smallest local area network, one needs to build a scheme for automating host configuration and maintenance, because networks have a way of quickly growing from one host to many. To avoid that growth becoming a problem, cfengine follows a strategy of centralized policy, but distributed, autonomous execution. Cfengine is instructed from a central location, but its operation is completely and evenly spread across the network. Each host must obtain its own copy of a network policy file from a trusted source, and is then responsible for configuring itself without further outside intervention. Cfengine constantly compares the state of every host with the network policy model, and attempts to correct any discrepancy.
Unlike some models, cfengine does not rely on the permanent availability of network communication between nodes, or on remote object communication models. If some of the hosts are unavailable at the time of a policy decision, or when an error occurs, cfengine attempts to rectify the problem when those hosts are again up.
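The pull model described above can be sketched in a few lines of Python. This is only an illustration of the idea, not cfengine's implementation: the function names are hypothetical, and falling back to a previously cached copy of the policy is an assumption made for the sketch.

```python
from typing import Callable, Optional


def refresh_policy(fetch: Callable[[], str], cached: Optional[str]) -> Optional[str]:
    """Try to pull a fresh policy; keep the cached copy if the source is down.

    `fetch` stands in for whatever transport a host uses to obtain the
    policy file (hypothetical). Network failures surface as OSError.
    """
    try:
        return fetch()
    except OSError:
        # Source unreachable: carry on with the policy we already hold
        # and try again on the next run.
        return cached


def unreachable_source() -> str:
    """Simulate a policy source that is currently offline."""
    raise ConnectionError("policy source is down")


print(refresh_policy(lambda: "new policy", "old policy"))
print(refresh_policy(unreachable_source, "old policy"))
```

The point of the design is that a host never blocks on the network: it acts on the best policy it has, and picks up changes whenever the source becomes reachable again.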
Autonomous host management is a good strategy for scalability: It presents no bottlenecks. However, we also need integration: the ability to manage the interrelationships between hosts. One cannot have complete control of one important host and think that all is done and secure. Cfengine promotes the practice of formulating a configuration/security policy and then sticking to it. It instills a discipline of preparation, focuses the problems at the right level of detail, and provides 'secure' scalable automation and a common interface to all hosts. Because every host configures itself, adding hosts places no additional burden on a central point.
Convergence to an ideal system state
Cfengine is a rule-based system that uses a language to describe how hosts on a network must behave. That includes file contents and permissions, running processes, and the resources available to the hosts. Rather than focusing on how to get from one state to another, cfengine focuses on the system state we wish to end up in. The problem of consistently getting the system into its global ideal state is addressed by an agent. This is an unusual approach: System administrators are used to coding at a low level of abstraction, and in terms of procedural steps. Instead, cfengine uses a declarative language in which imperative details are hidden.
As with any system of rules, it is possible for contradictions to occur. This is where cfengine's long history of research provides many benefits. Cfengine does its utmost to ensure the property of 'convergence': An agent makes only those changes that bring the local system closer to its intended configuration. Once a host reaches its desired state, the agent that runs on it becomes dormant. Convergence also means avoiding contradictory rules. That is a more elusive problem, because cfengine allows administrators to develop time-dependent policies, that is, system configurations that change at different times of the day or on different days of the year.
To make sense of the idea of convergence towards an ideal state, cfengine follows two strategies. The first is to supply high-level primitives, ensuring that nothing happens unless there is a difference between the actual and desired system states. The second is to attempt to prevent the coding of inconsistent policies. The latter is a difficult issue, and is a topic of active research. However, cfengine's current version provides some help in that regard as well.
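The first strategy, acting only when the actual and desired states differ, can be illustrated with a small Python sketch. It is not how cfengine is written; the function name is hypothetical and the example assumes a Unix-like system with ordinary permission bits.

```python
import os
import stat
import tempfile


def converge_mode(path: str, desired_mode: int) -> bool:
    """Bring the permission bits of `path` to `desired_mode`.

    Returns True if a change was made, and False if the file was
    already in the desired state (in which case nothing is touched).
    """
    actual_mode = stat.S_IMODE(os.stat(path).st_mode)
    if actual_mode == desired_mode:
        return False  # already converged: do nothing
    os.chmod(path, desired_mode)
    return True


# Demonstration on a throwaway file.
descriptor, demo_path = tempfile.mkstemp()
os.close(descriptor)
try:
    os.chmod(demo_path, 0o600)
    print(converge_mode(demo_path, 0o644))  # first run repairs the file
    print(converge_mode(demo_path, 0o644))  # second run finds nothing to do
finally:
    os.remove(demo_path)
```

Running the operation any number of times leaves the file in the same state, which is what allows an agent to be scheduled repeatedly without side effects.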
System components
Figure 1 shows cfengine's components. Not all components are mandatory, but they are designed to work together, and present only minimal system overhead:

Figure 1. Overview of cfengine
- cfagent: The agent that interprets policy and implements the convergence process.
- cfservd: An optional file server and remote executor. The server can be asked to start its agent immediately, for important updates, or it can be asked to serve files to a remote system. Authentication is based on RSA public-private key techniques, and communication can be encrypted if desired.
- cfenvd: The environment daemon. It is a monitoring process that tracks system resource usage in order to detect anomalies in behavior. Current development in this area is moving towards incorporating intrusion detection and automatic recovery from resource exhaustion. cfenvd is plug-and-play, and requires no special setup. In operation, it uses about 2 megabytes of disk space for its database.
- cfexecd: A scheduling service that allows different scheduling methods and strategies for starting the agent. It also forms a part of continuing research, examining game-theoretic methods in support of optimal execution and protection.
Security
Cfengine is designed so that an external agent cannot explicitly command it. It must choose to update its instructions itself, according to its current policy. That design prevents attackers from being able to exploit the system.
The first step in security management is to decide on a security policy. In many cases one can formulate a large part of that security policy as cfengine code, making that definition formal and accurate. That way the agent is able to implement the configuration aspects of security policy without any more work. A trivial but popular commercial program for integrity checking is Tripwire. Cfengine incorporates the same cryptographic checksum possibilities as Tripwire, only in a more flexible fashion.
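The idea behind a checksum-based integrity check can be sketched in Python as follows. This shows the general technique only, not cfengine's or Tripwire's code; the function names are hypothetical, and SHA-256 is chosen here purely for the illustration.

```python
import hashlib
import os
import tempfile
from typing import Dict, List


def file_digest(path: str, algorithm: str = "sha256") -> str:
    """Return the hex digest of a file, read in chunks to bound memory use."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_files(baseline: Dict[str, str], algorithm: str = "sha256") -> List[str]:
    """List the paths whose current digest differs from the recorded one."""
    return sorted(
        path for path, recorded in baseline.items()
        if file_digest(path, algorithm) != recorded
    )


# Demonstration on a throwaway file rather than a real system binary.
with tempfile.TemporaryDirectory() as workdir:
    target = os.path.join(workdir, "binary")
    with open(target, "wb") as handle:
        handle.write(b"original contents")
    baseline = {target: file_digest(target)}
    print(changed_files(baseline))               # nothing has changed yet
    with open(target, "wb") as handle:
        handle.write(b"tampered contents")
    print(changed_files(baseline) == [target])   # the change is detected
```

The baseline must itself be stored somewhere an intruder cannot rewrite it, otherwise a modified file and a modified record would simply agree with each other.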
Cfengine offers a very simple trust model: It trusts the integrity of its input file, and any data which it explicitly chooses to download, but nothing else. Cfengine assumes that its policy file is secure. Apart from that input file, no part of cfengine accepts or uses any configuration information from outside sources. The most one could do from an authenticated network connection is to ask cfengine to check and correct the system configuration. Thus, in the worst-case scenario, an outside attacker could spoof cfengine into configuring the host correctly. In short, only root@localhost can force cfengine to do anything.
Automating configuration
Correct security starts with correct host configuration. Even with a firewall shielding a system from outside intrusion, an incorrectly configured host remains a security risk. Host configuration is what cfengine is about. Rather than reiterating cfengine's extensive documentation, I will discuss here a few illustrative examples.
A cfengine configuration file is composed of objects with the following syntax (see the cfengine documentation):
rule-type:
    classes-of-host-this-applies-to::
        Actual rule 1
        Actual rule 2 ...
The rule-types include checking file permissions, editing text files, disabling (renaming and removing permissions to) files, controlled execution of scripts, and a variety of other things related to host configuration. Some of the 'control' rules are simply flags that switch on complex (read 'smart') behavior. Every cfengine program needs an action sequence, telling it the order in which bulk configuration operations should be evaluated.
Cfengine addresses the needs of a complete distributed cluster of computers in a single rule-set with the notion of a classification. Classification in cfengine means naming various sets of hosts, and using those sets - called classes - to constrain rules. A host can belong to one or more classes. Thus, every rule in a cfengine program belongs to a certain class of hosts, or a logical combination of several classes. When the agent on a host reads the common policy file, it determines which of the classes the host currently belongs to, and executes the rules belonging to those classes.
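Class-based rule selection can be pictured with the following Python sketch. It is a simplification for illustration, not cfengine's evaluator: the rule descriptions and class names are hypothetical, and each rule here simply requires that all of its classes be defined on the host, whereas cfengine supports richer logical combinations.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Rule(NamedTuple):
    required_classes: FrozenSet[str]  # every one must be defined on the host
    description: str


def applicable_rules(host_classes: Set[str], policy: List[Rule]) -> List[str]:
    """Return the rules whose class constraint the host satisfies."""
    return [
        rule.description
        for rule in policy
        if rule.required_classes <= host_classes
    ]


# One common policy, read by every host (contents are illustrative).
POLICY = [
    Rule(frozenset({"any"}), "check /etc/passwd permissions"),
    Rule(frozenset({"linux"}), "check /etc/shadow permissions"),
    Rule(frozenset({"webservers", "Monday"}), "rotate web logs"),
]

# Each host works out its own classes, then picks out its own rules.
print(applicable_rules({"any", "linux", "Monday"}, POLICY))
print(applicable_rules({"any", "webservers", "Monday"}, POLICY))
```

Every host evaluates the same policy, but each one arrives at its own subset of rules. Nothing in the selection depends on how many other hosts exist, and a time-of-day class can be treated like any other class.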

Figure 2. Entities in cfengine configuration
Class-based decision making is very efficient, because it can be as specific or as general as one sees fit. It does not depend on the number of hosts in a cluster or at an organization. It relies only on common information available to everyone, or on private information known only to the host concerned. Class-based decision making implies a different way of thinking from that encouraged by imperative programming languages, such as Perl. However, it is easy to start thinking in a 'cfengine' way about system configuration, rather than in a Perl or Makefile way.
Who is using it and why
Although it is difficult to obtain exact usage information when users have no obligation to register, survey results and download information reveal that cfengine is used by thousands of organizations, running on hundreds of thousands of hosts. Most of those hosts are Unix, some are MacOS X, and a few are Windows NT or derivatives. Some of the organizations using cfengine include NASA, ESA, Alcatel, IBM, Hewlett-Packard, Silicon Graphics, Cray Research, Inc., Sun Microsystems, Inc., Motorola, Netcom, AOL, and NEC. In addition, many universities and government institutions are among its users.
Cfengine users report that they use it for the simplicity of its concept, because it represents cutting-edge research rather than marketing hype, and because it saves them a good deal of work. Cfengine's philosophy is 'let the machine do the work.' It goes out of its way to lighten the load on humans.
Some system administrators like to write their own configuration tools. That has the advantage of not having to learn someone else's work, but it makes a system dependent on its system administrator. Home-grown tools are also usually based on one person's own vision, rather than on actual research. Cfengine has survived, and become popular, over a period of ten years because it addresses real system configuration issues in a straightforward and robust fashion. It has resisted the temptation to succumb to fashion or whim, and has maintained a steady focus on core problems. Users can extend cfengine's basic functionality via a user-module interface.
Summary
Cfengine is not merely a tool: It is an environment for managing host configuration and integrity. This article only hints at cfengine's full capabilities. Cfengine comes with extensive documentation, and to fully understand the syntax of the example below, you should consult that documentation.
The advantage of cfengine over many other configuration schemes is that one can store everything in a single file, or set of files, and that every host is responsible for managing its own state. A global policy is common to every host; it can be as general or as specific as best suits a particular system.
Cfengine is available for most kinds of Unix and for Windows NT. It is easily portable to other platforms.
APPENDIX: A simple cfengine configuration
#
# Simple cfengine configuration file
#

control:

   actionsequence = ( checktimezone files )
   domain         = ( example.com )
   timezone       = ( MET )
   smtpserver     = ( smtphost.example.org )  # used by cfexecd
   sysadm         = ( me@example.com )        # where to mail output

######################################################################

files:

   # Check some important files

   /etc/passwd mode=644 owner=root action=fixall
   /etc/shadow mode=600 owner=root action=fixall

   # Do a "tripwire" style check on binaries!

   /usr                     # Scan /usr dir
      owner=root,daemon     # all files must be owned by root or daemon
      checksum=md5          # use md5 or sha
      recurse=inf           # all subdirs
      ignore=tmp            # skip /usr/tmp
      action=fixall
About the author
Mark Burgess is an associate professor of theoretical physics and computer science at Oslo University College. He is the author of cfengine and of a number of books and articles on system administration and other topics. When he is not thinking about computers, he enjoys music and painting and various outdoor activities, including sitting around at cafes and restaurants with friends.