slm

Syntax:

slm [-avV] [-D debug_mode] [-n subsystem_path] [-p priority]
    [-P search_path] [-r recovery_mode] [-R frequency/sec|min|hour]
    [-s comp_name] [-t polling_interval] [-T total_wait]
    [-x comp_name] config_file

Runs on:

QNX Neutrino

Options:

-a: Adopt running daemon processes. Use this option to integrate SLM with an existing system where some server processes may already be running. If you place component entries for all relevant system processes in the configuration file, SLM will adopt these processes at startup as if it had launched them itself (and can thus control the processes via the command interface or restart them automatically if they terminate abnormally; see “Normal vs. abnormal termination”).
-D debug_mode: Specify when to use the <SLM:debug> argument list (instead of the normal <SLM:args> list). One of: cmd (default), startup, or always. With cmd, the debug list is used only when the module is started using slmctl with -d. With startup, all components launched at startup (see the -s option) initially use the debug list, but then honor the -d option of subsequent restarts. With always, the debug list is always used.
-n subsystem_path: Set the access point (default is /dev/slm) for client applications to write control and query commands.
-p priority: Set the priority of the SLM server thread (default is 30).
-P search_path: Set the search path for executables (default is $PATH). When launching a process, SLM looks in the search path to find the executable if the corresponding command element doesn't contain a full path.
-r recovery_mode: Set the recovery mode for components monitored by SLM. One of: none, stop, or replace (the default). The action specified with the -r option is performed when a component terminates abnormally if that component doesn't override this setting in its repair element.
-R frequency: Set how frequently SLM attempts to recover a component that has terminated abnormally. The frequency argument specifies the maximum number of recovery attempts as an integer and one of the following suffixes, separated by a forward slash: sec (seconds), min (minutes), or hour. For example, 1/min. Default is 2/min (2 times per minute).
-s comp_name: Name a component or module to launch when SLM starts. For convenience, you can use the built-in pseudo-modules all and none (default is all).
-t polling_interval: Set the polling interval in milliseconds for the wait property. Default is 100.
-T total_wait: Set the total wait time in milliseconds. Default is 50000.
-v: Specifies output verbosity (messages are written to slog2info). The -v option is cumulative; each additional v adds a level of verbosity, up to 7 levels. Default level is warning messages.
-V: Log output messages to the console. The -V option is cumulative; each additional V adds a level of verbosity. Default level is error messages.
-x comp_name: Name a component or module to terminate when SLM terminates. For convenience, you can use the built-in pseudo-modules all and none (default is all).

Description:

The System Launch and Monitor (SLM) service automates the management of complex, multi-process applications that must be started in a specific order.

One or more configuration files control SLM's behavior. Configuration files specify the processes to run, their properties, and any interprocess dependencies. SLM uses the information in the configuration file to internally construct a directed acyclic graph (DAG). SLM uses the DAG to determine the order in which it starts the processes.

Similarly, when a process fails, SLM uses the DAG to determine which dependent processes it must terminate and then restart when it starts the failed process again.
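
The ordering idea can be illustrated with a small depth-first topological sort. This is only a sketch of the general technique, not SLM's actual implementation; the `struct component` layout, the size limits, and the function names are hypothetical and chosen for illustration.

```c
#include <assert.h>

#define MAX_COMPONENTS 16
#define MAX_DEPS        4

/* Hypothetical in-memory form of a component entry: a name plus the
 * indexes of the components it depends on. */
struct component {
    const char *name;
    int         deps[MAX_DEPS];
    int         ndeps;
};

enum { UNVISITED, IN_PROGRESS, DONE };

/* Depth-first visit: emit every prerequisite before the component itself.
 * Returns -1 if a dependency cycle is found, 0 otherwise. */
static int visit(const struct component *c, int i, int *state,
                 int *order, int *count)
{
    if (state[i] == DONE)
        return 0;
    if (state[i] == IN_PROGRESS)
        return -1;              /* back edge: circular dependency */
    state[i] = IN_PROGRESS;
    for (int d = 0; d < c[i].ndeps; d++)
        if (visit(c, c[i].deps[d], state, order, count) != 0)
            return -1;
    state[i] = DONE;
    order[(*count)++] = i;
    return 0;
}

/* Fill order[] with component indexes in a valid start order.
 * Returns the number of entries written, or -1 on a cycle. */
int start_order(const struct component *c, int n, int *order)
{
    int state[MAX_COMPONENTS] = { UNVISITED };
    int count = 0;

    assert(n <= MAX_COMPONENTS);
    for (int i = 0; i < n; i++)
        if (visit(c, i, state, order, &count) != 0)
            return -1;
    return count;
}
```

For two components where ifconfig depends on io-pkt, `start_order()` places io-pkt first. Walking the same order in reverse gives a plausible stop order, since dependents are stopped before the components they rely on.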

When you start SLM, you must specify a configuration file, but all the other parameters are optional.

Client applications can control SLM using the slmctl utility or by directly writing commands to the /dev/slm interface.

Control and query commands

Client applications can control SLM by writing commands to the /dev/slm interface.

Control commands can start, stop, restart, or replace a specified module or component. When you start a component, SLM will start any dependencies (that aren't already running) and wait for them as required. When you stop a component, SLM first stops any dependents on the component. Restarting is a sequential composition of stop and start operations and is typically applied to set a specific high-level module state. Replacing will stop and relaunch a component and then restart any currently active components that had a dependency on that component. This is typically applied to update a low-level component process.

Query commands can list the dependencies (depend), running components (active), or components that terminated abnormally (dead). Command lines consist of the command, any options, and a module or component name, if appropriate.

Note:

Only the system superuser (UID 0) can execute the control and query commands (except active and depend).

The following table summarizes the control and query commands:

Command	Options	Description
`active`	-p -v	List the active (running) components.
`dead`	-v -w	List the dead (faulted) components.
`depend`	-s -u -v	List dependencies or dependents for the specified component.
`start`	-d -v -x	Start the specified component.
`stop`	-s -v -x	Stop the specified component.
`restart`	-d -s -v -x	Stop, then start the specified component.
`replace`	-d -s -v -x	Update the specified component.

The following table describes all the options:

Option	Description
-d	Debug mode: start components with their debug argument list.
-p `pid`	Only display information for the process with the specified ID. Used only with `active`.
-s	Stateless: ignore any stateless dependencies when stopping components.
-u	Used by: list components that depend on the specified component.
-v	Verbose: give details of each action performed when responding to a command.
-w	Wait: block until a process terminates abnormally.
-x	Explain: list the required actions but don't perform them.

Command example

Following execution of a command written to /dev/slm, the results are available to be read from the same file descriptor. Here's a simple example (with no error handling):

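#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    char    text[128];
    ssize_t n;
    int     slm = open("/dev/slm", O_RDWR);
    write(slm, "start -v all", 12);
    /* read() doesn't NUL-terminate the buffer, so print only the n bytes read */
    while ((n = read(slm, text, sizeof(text))) > 0)
        printf("%.*s\n", (int)n, text);
    close(slm);
    return 0;
}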
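
A client that needs error handling could wrap the same open, write, and read sequence in a helper. The following is a minimal sketch: the function name `slm_command()` and its return convention are hypothetical, and it assumes that the whole reply can be collected by reading until read() returns 0.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Write one command to the SLM access point and collect the reply text.
 * Returns the number of reply bytes stored in out (always NUL-terminated),
 * or -1 if the open, write, or read fails. */
ssize_t slm_command(const char *path, const char *cmd,
                    char *out, size_t outsz)
{
    size_t  len = strlen(cmd);
    size_t  used = 0;
    ssize_t n;
    int     fd;

    assert(out != NULL && outsz > 0);
    out[0] = '\0';

    fd = open(path, O_RDWR);
    if (fd == -1)
        return -1;

    /* Send the whole command in a single write(). */
    if (write(fd, cmd, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }

    for (;;) {
        if (used == outsz - 1)
            break;              /* buffer full: the reply is truncated */
        n = read(fd, out + used, outsz - 1 - used);
        if (n < 0) {
            close(fd);
            return -1;
        }
        if (n == 0)
            break;              /* no more reply data */
        used += (size_t)n;
    }
    out[used] = '\0';           /* read() doesn't NUL-terminate */
    close(fd);
    return (ssize_t)used;
}
```

A caller might use `slm_command("/dev/slm", "active -v", buf, sizeof(buf))` and print `buf` when the return value is not -1. If slm was started with the -n option, pass that path instead of /dev/slm.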

Issuing commands via the slmctl utility

Besides writing control/query commands programmatically, you can use the slmctl utility to send SLM commands (via the command line or typed interactively). It uses the following syntax:

slmctl [-n subsystem_path] "command [component]"...

where -n subsystem_path sets the access point that client applications write control and query commands to. It should match the path specified by the slm option -n. Default is /dev/slm.

The utility displays the results of each action in a line describing the operation on the specified component or module as follows:

Utility output	Meaning
`START component pid\|error`	Component was started.
`start component`	Component already active (no errors).
`WAIT component error`	Waiting for component.
`wait component`	Component already active (no errors).
`STOP component error`	Component stopped.
`stop component`	Component already inactive.
`BEGIN module`	Encapsulation of multiple components.
`END module error`	Reported only via slog2info, not slmctl.

SLM configuration file

SLM uses an XML configuration file to determine the appropriate order for starting processes. The configuration file lists all the programs for SLM to manage, any dependencies between the programs, the commands for launching the processes, and other properties.

Configuration file structure

The root XML element of the configuration file is system. All element names start with SLM:, so the root element (and the outline of the file) looks like this:

<SLM:system>
    -- component and module descriptions --
</SLM:system>

Components

A process managed by SLM is represented by a component. You must provide a component name (usually based on the process name) to use within the configuration file when specifying interprocess dependencies or membership in a module.

All component elements are children of the root element and contain other elements that describe the properties of individual components. The component element uses the following syntax:

<SLM:component name="qconn">
    -- component properties --
</SLM:component>

The following table describes the component elements:

Tag	Attribute	Value(s)	Description
`ability`		`ability1[,ability2, ... abilityn]`	List of the process's procnto abilities. This tag is equivalent to the -A option of the on command, and the syntax of the ability specification is the same. Using many ability specifications to launch processes is generally a bad idea; using types to configure abilities is simpler and safer.
`args`		`command_args`	The list of command-line arguments to provide the binary executable.
`cd`		`dir_name`	The directory to switch into when launching the process; this directory becomes the process's working directory (`$CWD`).
`command`	`launch`	`bg`	Controls process creation.
		`nohup`	Controls how signals are handled (no hangup).
		`pathname`	The element's value is the full path of the binary executable (e.g., /usr/bin/qconn). With this setting, SLM passes that full pathname in `argv[0]` when calling posix_spawn(), instead of truncating the value to a filename. Some utilities, such as `sshd`, require the full pathname.
		`builtin`	The name of a built-in SLM command. Options are: `mkdir`—creates one or more directories. List the directories to create in the `args` element. `no_op`—does nothing, but allows waiting for a filepath. This mechanism can be used to detect whether a process started outside of SLM is ready. `pathmgr_symlink`—creates one or more fast kernel symlinks. List the symlinks to create in the `args` element.
		`session`	In order to start a process as a session leader, the launch attribute of the <SLM:command> element must include the value `session`. The <SLM:command> element must also have a <SLM:tty> child element. Its value specifies where to redirect the stdin, stdout, and stderr of the process. See the examples for more information on how to use SLM to start a shell.
`debug`		`command_args`	An alternative list of command-line arguments to provide the binary executable when SLM is run in debug mode. This list might contain options (such as -v to increase verbosity).
`depend`	`state`	`[ session \| stateless ]`	A component may need other services to be active before the component can run. Any prerequisites must be expressed as dependencies. There are two forms of dependency: session (stateful) and stateless. With session dependency (the default), a client/server relationship is assumed; the server stores state information on all its clients. In this model, if the server must be stopped or restarted, then all its clients must be stopped. With stateless dependency, the server doesn't maintain any client information, so it's not necessary to restart clients if the server is restarted.
`depend`		`component_name`	Name of the prerequisite component. A component can have zero, one, or many dependencies. Note: You must define a separate tag for each dependency. SLM won't start a component until all the prerequisites are running.
`envvar`	`clear`	`[ none \| login \| all ]`	Specifies changes to environment variables. By default, the variables are inherited from the SLM server. The `clear` attribute specifies which current environment variables to clear or preserve. `none` preserves all current environment variables; `login` preserves only the initial login environment variables; `all` clears all current environment variables.
`envvar`		`environment_variables`	A list of environment variables to either merge with or override the current environment variables. Use the format `VAR=value` to specify each variable.
`partition`	content	`partition_name`	Specifies the adaptive scheduler partition to put the process in. For detailed information, see the Adaptive Partitioning User's Guide.
`priority`		`priority_algorithm`	An alphanumeric value that indicates the priority level and scheduling policy to assign the process (e.g., `10r`). The policy suffix is one of: `f` for SCHED_FIFO (FIFO scheduling), `r` for SCHED_RR (round-robin scheduling), or `o` for SCHED_OTHER (other scheduling). For descriptions of the scheduling policies, see “Scheduling policies” in the Programmer's Guide.
`repair`		`[ default \| none \| stop \| replace ]`	Specifies the action to take if the component terminates abnormally. `default` tells SLM to perform the action specified by the -r command-line option; with `none`, SLM takes no recovery action; with `stop`, SLM stops any other components that depend on the component that failed; with `replace`, SLM restarts the failed component.
`runmask`	content	`component_runmask`	A value that is interpreted as a bitmask, which specifies on which processors a process can run. It is a 32-bit integer and can be specified using any format that strtol() recognizes. For example, the decimal value `5` corresponds to the bitmask 00000101, which allows the thread to run on CPUs 0 and 2. Only specify the runmask once. A valid runmask is always inherited by children. For more information about runmasks, see the Multicore Processing chapter of the System Architecture guide, and the Multicore Processing chapter of the QNX Neutrino Programmer's Guide.
`stderr`	`iomode`	`[ w[+] \| a[+] ]`	The access mode: overwrite (`w`), read and overwrite (`w+`), append (`a`), or read and append (`a+`).
`stderr`		`filename`	Name of the file for redirecting standard error (`stderr`).
`stdin`	`iomode`	`[ r[+] ]`	The access mode: read only (`r`) or read and write (`r+`).
`stdin`		`filename`	Name of the file for redirecting standard input (`stdin`).
`stdout`	`iomode`	`[ w \| a ]`	The access mode: overwrite (`w`) or append (`a`).
`stdout`		`filename`	Name of the file for redirecting standard output (`stdout`).
`stop`	`stop`	`[ none \| signal ]`	The `signal` setting (the default) causes SLM to send a signal to the underlying process. The `none` setting disables the signaling; in this case, SLM takes no action to stop a process.
	`child`	`[ self \| before \| after ]`	For any process launched by SLM, its child processes are out of SLM's direct control. You can specify the shutdown of these child processes as relative to when the SLM-controlled parent process is terminated. The settings are: `self` (the default), `before`, and `after`.
	`timeout`	`timeout_time`	The maximum amount of time to try to stop a process nicely, in milliseconds. If the process can't be stopped nicely, SIGKILL is sent to it. For no timeout, specify 0 (the default).
		`data`	Contains the signal number to send the process to stop it. By default, `SIGTERM` is sent, but you can change this to any signal. If repeated attempts to stop the process fail, `SIGKILL` is sent. Note: This tag value isn't needed when the `stop` attribute is set to `none`.
`tty`		`filename`	Name of the file to which `stderr`, `stdin`, and `stdout` are redirected when a process is opened as the session leader.
`type`		`typename`	Name of the security type to launch the component as. The name is a label that reflects the security policy being enforced. Generally, you should pick a name based on what you're trying to launch. For information about security policies, see the Security Policy and Mandatory Access Control chapter in the Security Developer's Guide.
`user`		`uid:gid`	The user ID and group ID to assign to the underlying process. The two strings are separated by a colon (e.g., `jgarvey:techies`).
`waitfor`	`wait`	`[ none \| delay \| pathname \| exits \| blocks ]`	Once a component has been launched, SLM can wait for that component to set itself up before starting any dependent components. Values: `none` (the default)—Causes SLM to start other components immediately. `delay`—SLM pauses for the specified number of milliseconds. `pathname`—SLM probes for the appearance of the specified pathname. `exits`—SLM waits for the process to exit with the specified exit code. If the exit code is different from the expected one, SLM restarts the process. `blocks`—SLM waits for a specified thread in the process to reach the `RECV-blocked` state.
	`polltime`	`poll_time`:`timeout_time`	Use with `wait="pathname"` or `wait="exits"` to specify a polling interval and total wait time (both in milliseconds) that override the global values. For example, `polltime="100:20000"` results in polling every 100 milliseconds and timing out after 20 seconds.
		`data`	Contains data for the specified `wait` condition: `none` — No data required. `delay` — A time in milliseconds (e.g., 5000 for a 5-second delay). `pathname` — A path. `exits` — The expected exit code (default is 0). `blocks` — A thread ID.
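
To make the runmask arithmetic from the table concrete, the sketch below parses a runmask string with strtol() and tests individual CPU bits. It is an illustration only: the function names are hypothetical, and the use of base 0 (so that decimal, hexadecimal, and octal notations are all accepted) and the rejection of negative values are assumptions, not documented SLM behavior.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Parse a runmask string with strtol() in base 0 (decimal, 0x-prefixed
 * hex, or 0-prefixed octal) and store it in *mask.
 * Returns 0 on success, -1 if the text isn't a valid 32-bit value. */
int parse_runmask(const char *text, uint32_t *mask)
{
    char *end;
    long  value;

    assert(text != NULL && mask != NULL);
    errno = 0;
    value = strtol(text, &end, 0);
    if (errno != 0 || end == text || *end != '\0')
        return -1;
    if (value < 0 || (unsigned long)value > UINT32_MAX)
        return -1;
    *mask = (uint32_t)value;
    return 0;
}

/* Return nonzero if the mask allows the process to run on the given CPU. */
int runmask_allows(uint32_t mask, unsigned cpu)
{
    return cpu < 32 && ((mask >> cpu) & 1u);
}
```

With the value `5` from the table, bits 0 and 2 are set, so `runmask_allows()` reports CPUs 0 and 2 as permitted and CPU 1 as excluded. Writing the same mask as `0x5` gives the same result under the base-0 assumption.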

Note:

Only the command element is mandatory—all components must have a path to the binary. The remaining elements are optional.
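
The polling behavior described for `wait="pathname"` can be pictured as a loop that checks for the path, sleeps for the polling interval, and gives up once the total wait time is used up. The sketch below shows that pattern; it is not SLM's code, and the function name and the use of stat() and nanosleep() are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <sys/stat.h>
#include <time.h>

/* Poll for a pathname every poll_ms milliseconds, giving up after
 * timeout_ms milliseconds in total.
 * Returns 0 once the path exists, -1 on timeout. */
int wait_for_path(const char *path, long poll_ms, long timeout_ms)
{
    struct stat     sb;
    struct timespec pause;
    long            waited = 0;

    assert(path != NULL && poll_ms > 0);
    pause.tv_sec  = poll_ms / 1000;
    pause.tv_nsec = (poll_ms % 1000) * 1000000L;

    for (;;) {
        if (stat(path, &sb) == 0)
            return 0;           /* the path has appeared */
        if (waited >= timeout_ms)
            return -1;          /* total wait time used up */
        nanosleep(&pause, NULL);
        waited += poll_ms;
    }
}
```

Under these assumptions, `polltime="100:20000"` corresponds to `wait_for_path(path, 100, 20000)`, and the global defaults set by the -t and -T options correspond to `wait_for_path(path, 100, 50000)`.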

Modules

You can group components into modules. The processes within a module could make up a subsystem or could be used to establish a set of system states, such as a base level of operation and various higher levels. Modules must be named so they can be internally referenced. Each module must be described in an element, as follows:

<SLM:module name="device_monitors">
    -- module description --
</SLM:module>

To list the components within a module, use member. There are no attributes for member elements; the element values refer to member components by the internal names defined in their respective component elements. Modules cannot contain depend elements.

Note:

You can include multiple components in a module by using one member element with wildcards in the component names. For example, you can write:

<SLM:member>devb-*</SLM:member>

Components and modules may be specified in any order in the XML configuration file, but SLM raises an error if any circular dependencies are found.

Reusing SLM modules and components

You can define modules and components in separate files and reuse them in one or more SLM configuration files.

In each SLM configuration file that reuses modules and components from other files, you must declare the files that contain these reusable sections. The syntax is as follows:

<!DOCTYPE SLM_system [
    <!ENTITY inclusion_name SYSTEM 'filename'>
]>

where inclusion_name is a name that you use in your SLM configuration file to identify the reusable entities and filename is a separate file on your system where your reusable SLM modules and components are defined.

At the point in your SLM configuration file where you want to include the reusable entities, include them by specifying the following:

&inclusion_name;

For example, suppose your system has a file called my_reusable_modules.xml that defines SLM modules and components for use in different SLM configuration files. In one of your SLM configuration files, you can declare an entity named reuseModules and later include it:

<!DOCTYPE SLM_system [
    <!ENTITY reuseModules SYSTEM 'my_reusable_modules.xml'>
]>
...
<SLM:system>
    ...
    <!-- Include the contents of what's specified in 'my_reusable_modules.xml'
            by specifying the entity 'reuseModules' -->
    &reuseModules;
    ...
</SLM:system>

Sample configuration files

Suppose you want to automate the setup of your system's IP connectivity. This would require running io-pkt, which creates an IP socket for network traffic, and then running ifconfig to bind an IP address to the socket. You can create a module to include two components that correspond to the two services, and then describe the dependency of ifconfig on io-pkt in the component entries. The XML file would then look like this:

<SLM:system>
    <SLM:component name="io-pkt">
        <SLM:command>/sbin/io-pkt-v6-hc</SLM:command>
        <SLM:args>-ptcpip stacksize=8192</SLM:args>
        <SLM:waitfor wait="pathname">/dev/socket</SLM:waitfor>
    </SLM:component>
    <SLM:component name="ifconfig">
        <SLM:depend>io-pkt</SLM:depend>
        <SLM:command>/sbin/ifconfig</SLM:command>
        <SLM:args>en0 192.168.1.5 up</SLM:args>
        <SLM:waitfor wait="exits"></SLM:waitfor>
    </SLM:component>
    <SLM:module name="net-setup">
        <SLM:member>io-pkt</SLM:member>
        <SLM:member>ifconfig</SLM:member>
    </SLM:module>
</SLM:system>

The following example shows how to use SLM to start a shell:

<SLM:component name="console"> 
    <SLM:command launch="session">/bin/ksh</SLM:command> 
    <SLM:args>-l</SLM:args> 
    <SLM:tty>/dev/ser1</SLM:tty> 
    ... 
</SLM:component>

The following example shows how sshd could be started by SLM (so that sshd could be monitored):

<SLM:component name="sshd">
    <SLM:command launch="pathname">/system/xbin/sshd</SLM:command> 
    <SLM:args>-D</SLM:args>
    ... 
</SLM:component>

Normal vs. abnormal termination

SLM considers a process to have terminated normally in the following situations only:

- SLM terminates a component's process because:
    - a stop action was created by executing slmctl stop component.
    - a dependency required SLM to stop the component's process.
- The component is configured with wait="exits" in its waitfor element, and the component's process exits with the expected exit code.

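All other process terminations are considered abnormal and, with the default recovery mode (replace), cause SLM to restart the component process; the -r option or a component's repair element can select a different action. If a process dies more often than the limit set by the -R option allows (by default, 2 times per minute), SLM stops trying to restart it, even though the termination is abnormal.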
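
One way to picture the recovery limit is a sliding window over recent recovery attempts. The sketch below parses a frequency in the -R format (for example, 2/min) and then decides whether another attempt fits in the window. The sliding-window rule, the struct, and the function names are assumptions made for illustration; the source doesn't describe how SLM actually counts attempts.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Recovery limit in the style of the -R option: at most `count`
 * attempts per `period` seconds. */
struct recovery_limit {
    int  count;
    long period;
};

/* Parse text such as "2/min" into a recovery_limit.
 * Returns 0 on success, -1 if the text is malformed. */
int parse_frequency(const char *text, struct recovery_limit *lim)
{
    char unit[8];
    char extra;
    int  count;

    assert(text != NULL && lim != NULL);
    /* Exactly two fields must match; a third means trailing characters. */
    if (sscanf(text, "%d/%7[a-z]%c", &count, unit, &extra) != 2 || count < 0)
        return -1;

    if (strcmp(unit, "sec") == 0)
        lim->period = 1;
    else if (strcmp(unit, "min") == 0)
        lim->period = 60;
    else if (strcmp(unit, "hour") == 0)
        lim->period = 3600;
    else
        return -1;
    lim->count = count;
    return 0;
}

/* Decide whether another recovery attempt is allowed at time `now`, given
 * the times of earlier attempts. An attempt is allowed while fewer than
 * lim->count earlier attempts fall inside the last lim->period seconds. */
int may_recover(const struct recovery_limit *lim,
                const time_t *attempts, int nattempts, time_t now)
{
    int recent = 0;

    for (int i = 0; i < nattempts; i++)
        if (now - attempts[i] < lim->period)
            recent++;
    return recent < lim->count;
}
```

With the default of 2/min, a component that was already recovered twice in the last 60 seconds would not be recovered a third time under this model, but would become eligible again once the oldest attempt falls out of the window.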