Control the adaptive partitioning scheduler
#include <sys/sched_aps.h>
#include <sys/neutrino.h>

int SchedCtl( int cmd,
              void *data,
              int length );

int SchedCtl_r( int cmd,
                void *data,
                int length );
For details about each command and its data, see the sections below.
libc
Use the -l c option to qcc to link against this library. This library is usually included automatically.
The SchedCtl() and SchedCtl_r() kernel calls control the adaptive partitioning scheduler.
This scheduler is optional and is present only if you add [module=aps] to your OS image's buildfile. For more information, see the Adaptive Partitioning User's Guide. These functions were added in the QNX Neutrino Core OS 6.3.2.
These functions are identical except in the way they indicate errors. See the Returns section for details.
Note: You must initialize all of the fields, including reserved ones, in the structures you pass as the data argument, by calling (for example) memset(). You can also use the APS_INIT_DATA() macro:
APS_INIT_DATA( &data );
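As a minimal sketch of that initialization, the helper below zeroes an entire parameter block, reserved fields included, before setting the members the caller cares about. The name init_join_parms() is hypothetical, and the #else branch only mirrors the sched_aps_join_parms layout documented later on this page so that the sketch also compiles on a non-QNX host.

```c
#include <stdint.h>
#include <string.h>

#ifdef __QNXNTO__
#include <sys/neutrino.h>
#include <sys/sched_aps.h>
#else
/* Off-target stand-in that mirrors the sched_aps_join_parms layout
   documented on this page; on QNX the real header is used instead. */
typedef struct {
    int16_t id;
    int16_t reserved1;
    int32_t pid;
    int32_t tid;
    int32_t reserved2;
} sched_aps_join_parms;
#endif

/* Hypothetical helper: zero every field (reserved ones too), then
   fill in the members that the caller wants to pass. */
void init_join_parms( sched_aps_join_parms *p,
                      int16_t id, int32_t pid, int32_t tid )
{
    memset( p, 0, sizeof(*p) );   /* clears the reserved fields as well */
    p->id  = id;
    p->pid = pid;
    p->tid = tid;
}
```

Zeroing the whole block first means that the reserved members never carry stack garbage into the kernel call, whichever members are set afterward.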
This command fills in a sched_aps_info structure that describes the overall parameters of the adaptive partitioning scheduler:
typedef struct {
uint64_t cycles_per_ms;
uint64_t windowsize_cycles;
uint64_t windowsize2_cycles;
uint64_t windowsize3_cycles;
uint32_t scheduling_policy_flags;
uint32_t sec_flags;
uint32_t bankruptcy_policy;
uint16_t num_partitions;
uint16_t max_partitions;
uint64_t reserved1;
uint64_t reserved2;
} sched_aps_info;
The members include:
The value of cycles_per_ms:
Note: If you change the tick size of the system at runtime, do so before defining the adaptive partitioning scheduler's window size. That's because Neutrino converts the window size from milliseconds to clock ticks for internal use.
These flags set options for the adaptive partitioning scheduling algorithm. To set, pass a pointer to an ORed set of these flags with the SCHED_APS_SET_PARMS call to SchedCtl():
By default, the scheduler hands out free time to the partition with the highest-priority running thread. That guarantees realtime scheduling behavior (i.e., scheduling strictly by priority) to partitions any time they aren't being limited by some other partition's right to its guaranteed minimum budget. But it also means that one partition is allowed to grab all the free time.
If you set SCHED_APS_SCHEDPOL_FREETIME_BY_RATIO, the running partitions share the free time in proportion to the ratios of their budgets. So, one partition can no longer grab all the free time. However, when this flag is set, partitions will see strict priority-scheduling between partitions only when they're consuming less than their CPU budgets.
If this flag is set, SCHED_APS_SCHEDPOL_FREETIME_BY_RATIO is also automatically set.
Allowing the highest-priority partition to run to completion means lower-priority partitions won't run in the meantime. That means that when the system is loaded, the default algorithm can cause small-budget low-priority partitions to see long delays between intervals when they run. (For example, on a loaded system with a 100 ms averaging window, a 10% partition may run only every 90 milliseconds.) One way to reduce the latency is to reduce the averaging window size.
The SCHED_APS_SCHEDPOL_PARTITION_LOCAL_PRIORITIES policy schedules purely by budget ratio. When enabled, the scheduler tries to balance budgets on as short a timescale as possible, regardless of the window size.
That means high-priority partitions with large budgets no longer run to completion (while they have budget). They're timesliced with other partitions, even low-priority ones. The result is that SCHED_APS_SCHEDPOL_PARTITION_LOCAL_PRIORITIES can reduce the latencies seen by small budget partitions in a loaded system. This comes at the cost of a departure from strict priority preemptive behavior when all partitions have budget (i.e., the default policy).
The SCHED_APS_SCHEDPOL_PARTITION_LOCAL_PRIORITIES policy provides these reduced latencies while retaining the full accuracy of a 100 ms averaging window. It continues to schedule the highest-priority thread within each partition.
This option includes the behavior of SCHED_APS_SCHEDPOL_FREETIME_BY_RATIO.
When a critical thread consumes critical time, it temporarily forces a return to the default scheduling policy. Critical threads are allowed to run to completion and aren't timesliced by SCHED_APS_SCHEDPOL_PARTITION_LOCAL_PRIORITIES's attempts to balance budgets. In other words, critical threads aren't affected by this policy.
Note: Don't use this scheduling policy if you have running threads in a zero-budget partition. Since this policy divides time by the ratio of budgets, a zero-budget partition may never be scheduled.
Scheduling within a partition is always strictly by priority, no matter which of these flags are set.
For more information about adaptive partitioning and BMP, see the Adaptive Partitioning Scheduling Details chapter of the Adaptive Partitioning User's Guide.
Bankruptcy is when critical CPU time billed to a partition exceeds its critical budget. Bankruptcy is always considered to be a design error on the part of the application, but you can configure how the system responds to it.
If the system isn't declaring bankruptcy when you expect it, note that bankruptcy can be declared only if critical time is billed to your partition. Critical time is billed on those timeslices when the following conditions are all met:
Only then, if the critical time billed over the current averaging window exceeds the partition's critical budget, does the system declare the partition bankrupt.
When the system detects that a partition has gone bankrupt, it always:
In addition, you can configure the following responses:
To set a choice of bankruptcy-handling options, OR the SCHED_APS_BNKR_* flags together, store the result in a uint32_t variable, and pass a pointer to that variable as the bankruptcy_policyp field of the sched_aps_parms structure when you call SchedCtl() with the SCHED_APS_SET_PARMS command.
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_QUERY_PARMS command (see the Returns section for details):
This command sets the parameters for the overall behavior of the adaptive partitioning scheduler. The data argument must be a pointer to a sched_aps_parms structure:
typedef struct {
int16_t windowsize_ms;
int16_t reserved1;
uint32_t *scheduling_policy_flagsp;
uint32_t *bankruptcy_policyp;
int32_t reserved2;
int64_t reserved3;
} sched_aps_parms;
The members include:
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_SET_PARMS command (see the Returns section for details):
For more information, see “Security,” below.
This command creates a new partition which is considered to be a child of the partition that's calling SchedCtl(). The system automatically creates a partition called System (the value of APS_SYSTEM_PARTITION_NAME) with an ID of 0.
The data argument for this command must be a pointer to a sched_aps_create_parms structure:
typedef struct {
/* input parms */
char *name;
uint16_t budget_percent;
int16_t critical_budget_ms;
uint8_t aps_create_flags;
int8_t parent_id;
int16_t reserved1;
uint64_t reserved2;
/* output parms */
int16_t id;
int16_t reserved3;
} sched_aps_create_parms;
The input members include:
Note: Before creating zero-budget partitions, read the cautions in “Setting budgets for resource managers” in the System Considerations chapter of the Adaptive Partitioning User's Guide.
The output members include:
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_CREATE_PARTITION command (see the Returns section for details):
For more information, see “Security,” below.
This command gets information about a given partition. The data argument for this command must be a pointer to a sched_aps_partition_info structure:
typedef struct {
/* out parms */
uint64_t budget_cycles;
uint64_t critical_budget_cycles;
char name[APS_PARTITION_NAME_LENGTH+1];
int16_t parent_id;
uint16_t budget_percent;
int32_t notify_pid;
int32_t notify_tid;
uint32_t pinfo_flags;
int32_t pid_at_last_bankruptcy;
int32_t tid_at_last_bankruptcy;
int64_t reserved1;
int64_t reserved2;
/* input parm */
int16_t id;
} sched_aps_partition_info;
The input members include:
The output members include:
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_QUERY_PARTITION command (see the Returns section for details):
This command finds the partition ID for a given partition name.
The data argument for this command must be a pointer to a sched_aps_lookup_parms structure:
typedef struct {
/* input parms */
char *name;
int16_t reserved1;
/* output parms */
int16_t id;
} sched_aps_lookup_parms;
The input members include:
The output members include:
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_LOOKUP command (see the Returns section for details):
This command makes the thread specified by the given process and thread IDs become a member of the specified partition. This partition also becomes the thread's new home partition, i.e., where it returns after partition inheritance.
The data argument for this command must be a pointer to a sched_aps_join_parms structure:
typedef struct {
int16_t id;
int16_t reserved1;
int32_t pid;
int32_t tid;
int32_t reserved2;
} sched_aps_join_parms;
The members include:
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_JOIN_PARTITION command (see the Returns section for details):
For more information, see “Security,” below.
This command changes the parameters of an existing partition. If the new budget percentage differs from the current one, the difference is either taken from, or returned to, the parent partition's budget. The critical time parameter affects only the chosen partition, not its parent. To change only one of the budget or the critical time, set the other to -1.
The data argument for this command must be a pointer to a sched_aps_modify_parms structure:
typedef struct {
int16_t id;
int16_t new_budget_percent;
int16_t new_critical_budget_ms;
int16_t reserved1;
int64_t reserved2;
int64_t reserved3;
} sched_aps_modify_parms;
The members include:
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_MODIFY_PARTITION command (see the Returns section for details):
For more information, see “Security,” below.
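The "set the other to -1" rule for SCHED_APS_MODIFY_PARTITION can be sketched as follows. The helper name prepare_budget_change() is hypothetical, and the #else branch only mirrors the sched_aps_modify_parms layout shown above so that the sketch compiles on a non-QNX host.

```c
#include <stdint.h>
#include <string.h>

#ifdef __QNXNTO__
#include <sys/neutrino.h>
#include <sys/sched_aps.h>
#else
/* Off-target stand-in mirroring the sched_aps_modify_parms layout
   documented on this page. */
typedef struct {
    int16_t id;
    int16_t new_budget_percent;
    int16_t new_critical_budget_ms;
    int16_t reserved1;
    int64_t reserved2;
    int64_t reserved3;
} sched_aps_modify_parms;
#endif

/* Hypothetical helper: prepare a request that changes only the CPU
   budget of partition `id` and leaves its critical budget alone. */
void prepare_budget_change( sched_aps_modify_parms *p,
                            int16_t id, int16_t new_budget_percent )
{
    memset( p, 0, sizeof(*p) );          /* reserved fields must be zero */
    p->id = id;
    p->new_budget_percent = new_budget_percent;
    p->new_critical_budget_ms = -1;      /* -1: keep the current value */
}
```

On a QNX target, the prepared block would then be passed as SchedCtl( SCHED_APS_MODIFY_PARTITION, &p, sizeof(p) ).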
This command returns instantaneous values of the CPU time-accounting variables for a set of partitions. It can fill in data for more than one partition. If the length argument to SchedCtl() indicates that you've passed the function an array of sched_aps_partition_stats structures, SchedCtl() fills each element with statistics for a different partition, starting with the partition specified by the id field.
Note: To get an accurate picture of the whole machine, read the data for all partitions in one call, because sequential calls to SCHED_APS_PARTITION_STATS may come from separate averaging windows. To determine the number of partitions, use the SCHED_APS_QUERY_PARMS command, which reports it in the num_partitions member of the sched_aps_info structure.
The command overwrites the id field with the partition number for which data is being returned. It stores -1 into the id field of unused elements.
To convert times in cycles into milliseconds, divide them by the cycles_per_ms obtained with an SCHED_APS_QUERY_PARMS command.
The data argument for this command must be a pointer to a sched_aps_partition_stats structure, or an array of these structures:
typedef struct {
/* out parms */
uint64_t run_time_cycles;
uint64_t critical_time_cycles;
uint64_t run_time_cycles_w2;
uint64_t critical_time_cycles_w2;
uint64_t run_time_cycles_w3;
uint64_t critical_time_cycles_w3;
uint32_t stats_flags;
uint32_t reserved1;
uint64_t reserved2;
uint64_t reserved3;
/* in parm */
int16_t id;
} sched_aps_partition_stats;
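As a rough sketch of how the returned array might be consumed, the helper below adds up run_time_cycles over the elements that were actually filled in (skipping those whose id is -1) and converts the total to milliseconds by dividing by cycles_per_ms. The name total_run_time_ms() is hypothetical, and the #else branch only mirrors the structure above so that the sketch compiles on a non-QNX host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef __QNXNTO__
#include <sys/neutrino.h>
#include <sys/sched_aps.h>
#else
/* Off-target stand-in mirroring the sched_aps_partition_stats layout
   documented on this page. */
typedef struct {
    uint64_t run_time_cycles;
    uint64_t critical_time_cycles;
    uint64_t run_time_cycles_w2;
    uint64_t critical_time_cycles_w2;
    uint64_t run_time_cycles_w3;
    uint64_t critical_time_cycles_w3;
    uint32_t stats_flags;
    uint32_t reserved1;
    uint64_t reserved2;
    uint64_t reserved3;
    int16_t  id;
} sched_aps_partition_stats;
#endif

/* Hypothetical helper: total run time, in whole milliseconds, over the
   array elements that the command filled in. */
uint64_t total_run_time_ms( const sched_aps_partition_stats *stats,
                            size_t count, uint64_t cycles_per_ms )
{
    uint64_t total_cycles = 0;
    size_t i;

    assert( cycles_per_ms != 0 );
    for ( i = 0; i < count; i++ ) {
        if ( stats[i].id == -1 ) {
            continue;                     /* unused element */
        }
        total_cycles += stats[i].run_time_cycles;
    }
    return total_cycles / cycles_per_ms;  /* integer division truncates */
}
```

Summing in cycles and dividing once at the end avoids accumulating a truncation error per partition.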
The members include:
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_PARTITION_STATS command (see the Returns section for details):
This command returns instantaneous values of overall CPU-usage variables and other dynamic scheduler states. The data argument for this command must be a pointer to a sched_aps_overall_stats structure:
typedef struct {
uint64_t idle_cycles;
uint64_t idle_cycles_w2;
uint64_t idle_cycles_w3;
int16_t id_at_last_bankruptcy;
uint16_t reserved1;
int32_t pid_at_last_bankruptcy;
int32_t tid_at_last_bankruptcy;
uint32_t reserved2;
uint32_t reserved3;
uint64_t reserved4;
} sched_aps_overall_stats;
The members include:
(100 × idle_cycles) / windowsize_cycles
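Reading the expression above as the percentage of the averaging window that was idle, a minimal sketch of the calculation looks like this. The function name idle_percent() is hypothetical; idle_cycles is assumed to come from SCHED_APS_OVERALL_STATS and windowsize_cycles from the sched_aps_info structure filled in by SCHED_APS_QUERY_PARMS.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: (100 * idle_cycles) / windowsize_cycles,
   using integer arithmetic, so the result is truncated. */
uint64_t idle_percent( uint64_t idle_cycles, uint64_t windowsize_cycles )
{
    assert( windowsize_cycles != 0 );
    return ( 100u * idle_cycles ) / windowsize_cycles;
}
```

Multiplying before dividing keeps the precision that a plain idle_cycles / windowsize_cycles integer division would lose.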
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_OVERALL_STATS command (see the Returns section for details):
This command sets one thread in your process to run as a critical thread whenever it runs. Use a thread ID of zero to set the calling thread to be critical.
Note: In general, it's more useful to send a critical sigevent structure to a thread to make it run as a critical thread.
The data argument for this command must be a pointer to a sched_aps_mark_crit_parms structure:
typedef struct {
int32_t pid;
int32_t tid;
int32_t reserved;
} sched_aps_mark_crit_parms;
The members include:
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_MARK_CRITICAL command (see the Returns section for details):
This command clears the “always run as critical” state set by the SCHED_APS_MARK_CRITICAL command. The thread then runs as critical only when it inherits that state from another thread (on receipt of a message).
The data argument for this command must be a pointer to a sched_aps_clear_crit_parms structure:
typedef struct {
int32_t pid;
int32_t tid;
int32_t reserved;
} sched_aps_clear_crit_parms;
The members include:
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_CLEAR_CRITICAL command (see the Returns section for details):
This command determines the partition for the given thread and indicates whether or not the thread in your process is marked to run as critical. Use a thread ID of zero to indicate the calling thread.
The data argument for this command must be a pointer to a sched_aps_query_thread_parms structure:
typedef struct {
int32_t pid;
int32_t tid;
/* out parms: */
int16_t id;
int16_t inherited_id;
uint32_t crit_state_flags;
int32_t reserved1;
int32_t reserved2;
} sched_aps_query_thread_parms;
The input members include:
The output members include:
If APS_QCRIT_PERM_CRITICAL isn't set, and APS_QCRIT_RUNNING_CRITICAL is set, it means the thread has temporarily inherited the critical state. If APS_QCRIT_RUNNING_CRITICAL is set, and APS_QCRIT_BILL_AS_CRITICAL isn't set, it means that the thread is running as critical, but isn't depleting its partition's critical-time budget (i.e., it's running for free).
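The two flag combinations described above can be tested with a pair of small predicates. This is only a sketch: the function names are hypothetical, and the bit values in the #else branch are placeholders chosen so that the code compiles on a non-QNX host. On a QNX target the real definitions come from <sys/sched_aps.h>.

```c
#include <stdint.h>

#ifdef __QNXNTO__
#include <sys/neutrino.h>
#include <sys/sched_aps.h>
#else
/* Placeholder bit values for off-target builds only; the real values
   are defined in <sys/sched_aps.h> and may differ. */
#define APS_QCRIT_PERM_CRITICAL     0x1u
#define APS_QCRIT_RUNNING_CRITICAL  0x2u
#define APS_QCRIT_BILL_AS_CRITICAL  0x4u
#endif

/* Nonzero if the thread is running as critical only because it has
   temporarily inherited that state. */
int crit_is_inherited( uint32_t flags )
{
    return !( flags & APS_QCRIT_PERM_CRITICAL )
        && ( flags & APS_QCRIT_RUNNING_CRITICAL ) != 0;
}

/* Nonzero if the thread is running as critical without depleting its
   partition's critical-time budget. */
int crit_is_free( uint32_t flags )
{
    return ( flags & APS_QCRIT_RUNNING_CRITICAL ) != 0
        && !( flags & APS_QCRIT_BILL_AS_CRITICAL );
}
```

Both predicates take the crit_state_flags value returned by SCHED_APS_QUERY_THREAD.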
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_QUERY_THREAD command (see the Returns section for details):
This command defines sigevent structures that the scheduler will return to the calling thread when the scheduler detects that a given partition has become bankrupt, or the whole system has become overloaded.
Note: Overload notification isn't implemented in this release.
Calling SCHED_APS_ATTACH_EVENTS arms the notification once. After you receive the notification, you must call SCHED_APS_ATTACH_EVENTS again to receive a subsequent notification. This is to ensure that the system doesn't send you notifications faster than you can handle them. The pinfo_flags field of the sched_aps_partition_info structure (see the SCHED_APS_QUERY_PARTITION command) indicates whether these events are armed.
Note: You can register only one pair of sigevent structures (bankruptcy and overload) per partition, and the notifications must go to the same thread. The thread notified is the calling thread. Attaching events a second time overwrites the first. Passing NULL pointers means “no changes in notification.” To turn off notification, use SIGEV_NONE_INIT() to set the appropriate sigevent to SIGEV_NONE.
If you want to configure additional actions for the system to perform on bankruptcy, see “Handling bankruptcy,” above.
The data argument for this command must be a pointer to a sched_aps_events_parm structure:
typedef struct {
const struct sigevent *bankruptcy_notification;
const struct sigevent *overload_notification;
/* each partition gets a different set of sigevents */
int16_t id;
int16_t reserved1;
int32_t reserved2;
int64_t reserved3;
} sched_aps_events_parm;
The members include:
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_ATTACH_EVENTS command (see the Returns section for details):
For more information, see “Security,” below.
This command sets security options. A bit that's set turns the corresponding security option on. Successive calls add to the existing set of security options. Security options can only be cleared by a restart.
Note: You must be root running in the System partition to use this command, even if all security options are off.
The data argument for this command must be a pointer to a sched_aps_security_parms structure:
typedef struct {
uint32_t sec_flags;
uint32_t reserved1;
uint32_t reserved2;
} sched_aps_security_parms;
The members include:
The adaptive partitioning scheduler lets you dynamically create and modify the partitions in your system.
Note: We recommend that you set up your partition environment at boot time, and then lock all parameters:
However, you might need to modify a partition at runtime. In this case, you can use the security options described below.
When Neutrino starts, it sets the security option to SCHED_APS_SEC_OFF. We recommend that you immediately set it to SCHED_APS_SEC_RECOMMENDED. In code, do this:
sched_aps_security_parms p;

APS_INIT_DATA( &p );
p.sec_flags = SCHED_APS_SEC_RECOMMENDED;
SchedCtl( SCHED_APS_ADD_SECURITY, &p, sizeof(p) );
These are the security options:
Unless you're testing the partitioning and want to change all parameters without needing to restart, you should set at least SCHED_APS_SEC_BASIC.
In general, SCHED_APS_SEC_RECOMMENDED is more secure than SCHED_APS_SEC_FLEXIBLE, which is more secure than SCHED_APS_SEC_BASIC. All three allow partitions to be created and modified. After setting up partitions, use SCHED_APS_SEC_LOCK_PARTITIONS to prevent further unauthorized changes. For example:
sched_aps_security_parms p;

APS_INIT_DATA( &p );
p.sec_flags = SCHED_APS_SEC_LOCK_PARTITIONS;
SchedCtl( SCHED_APS_ADD_SECURITY, &p, sizeof(p) );
SCHED_APS_SEC_RECOMMENDED, SCHED_APS_SEC_FLEXIBLE, and SCHED_APS_SEC_BASIC are composed of the flags defined below (but it's usually more convenient for you to use the compound options):
#define SCHED_APS_SEC_BASIC (SCHED_APS_SEC_ROOT0_OVERALL | SCHED_APS_SEC_ROOT_MAKES_CRITICAL)
#define SCHED_APS_SEC_FLEXIBLE (SCHED_APS_SEC_BASIC | SCHED_APS_SEC_NONZERO_BUDGETS |\
SCHED_APS_SEC_ROOT_MAKES_PARTITIONS |\
SCHED_APS_SEC_PARENT_JOINS | SCHED_APS_SEC_PARENT_MODIFIES )
#define SCHED_APS_SEC_RECOMMENDED (SCHED_APS_SEC_FLEXIBLE | SCHED_APS_SEC_SYS_MAKES_PARTITIONS |\
SCHED_APS_SEC_SYS_JOINS | SCHED_APS_SEC_JOIN_SELF_ONLY)
#define SCHED_APS_SEC_OFF 0x00000000
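Because the compound options are plain ORs of individual bits, a program can check whether a set of security flags (for example, the sec_flags member reported by SCHED_APS_QUERY_PARMS) already covers a given option. The sketch below is illustrative only: security_includes() is a hypothetical helper, and the individual bit values in the #else branch are placeholders so that the code compiles on a non-QNX host. Only the composition of the compound options is taken from the definitions above.

```c
#include <stdint.h>

#ifdef __QNXNTO__
#include <sys/neutrino.h>
#include <sys/sched_aps.h>
#else
/* Placeholder bits for off-target builds only; the real values are
   defined in <sys/sched_aps.h> and may differ. */
#define SCHED_APS_SEC_ROOT0_OVERALL          0x001u
#define SCHED_APS_SEC_ROOT_MAKES_CRITICAL    0x002u
#define SCHED_APS_SEC_NONZERO_BUDGETS        0x004u
#define SCHED_APS_SEC_ROOT_MAKES_PARTITIONS  0x008u
#define SCHED_APS_SEC_PARENT_JOINS           0x010u
#define SCHED_APS_SEC_PARENT_MODIFIES        0x020u
#define SCHED_APS_SEC_SYS_MAKES_PARTITIONS   0x040u
#define SCHED_APS_SEC_SYS_JOINS              0x080u
#define SCHED_APS_SEC_JOIN_SELF_ONLY         0x100u
/* Compound options, composed as in the definitions on this page. */
#define SCHED_APS_SEC_BASIC \
    (SCHED_APS_SEC_ROOT0_OVERALL | SCHED_APS_SEC_ROOT_MAKES_CRITICAL)
#define SCHED_APS_SEC_FLEXIBLE \
    (SCHED_APS_SEC_BASIC | SCHED_APS_SEC_NONZERO_BUDGETS | \
     SCHED_APS_SEC_ROOT_MAKES_PARTITIONS | \
     SCHED_APS_SEC_PARENT_JOINS | SCHED_APS_SEC_PARENT_MODIFIES)
#define SCHED_APS_SEC_RECOMMENDED \
    (SCHED_APS_SEC_FLEXIBLE | SCHED_APS_SEC_SYS_MAKES_PARTITIONS | \
     SCHED_APS_SEC_SYS_JOINS | SCHED_APS_SEC_JOIN_SELF_ONLY)
#endif

/* Hypothetical helper: nonzero if every bit of `option` is already
   set in `sec_flags`. */
int security_includes( uint32_t sec_flags, uint32_t option )
{
    return ( sec_flags & option ) == option;
}
```

With these definitions, SCHED_APS_SEC_RECOMMENDED includes SCHED_APS_SEC_FLEXIBLE, which in turn includes SCHED_APS_SEC_BASIC, matching the ordering described above.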
The individual options are as follows:
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_ADD_SECURITY command (see the Returns section for details):
This command returns the partition of the given process. The partition of a process is billed while one of the process's threads handles a pulse. The individual threads in a process may all be in different partitions from the process.
The data argument for this command must be a pointer to a sched_aps_query_process_parms structure:
typedef struct {
int32_t pid;
/* out parms: */
int16_t id; /* partition of process */
int16_t reserved1;
int32_t reserved2;
int32_t reserved3;
int32_t reserved4;
} sched_aps_query_process_parms;
The members include:
SchedCtl() and SchedCtl_r() indicate the following errors for the SCHED_APS_QUERY_PROCESS command (see the Returns section for details):
These calls don't block.
The only difference between these functions is the way they indicate errors:
For a list of error codes, see the description of each command.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/neutrino.h>
#include <sys/sched_aps.h>

int main( void )
{
    sched_aps_partition_info part_info;

    // You need to initialize the parameter block.
    APS_INIT_DATA( &part_info );
    // Set the input members of the parameter block.
    part_info.id = 2;
    // Invoke SchedCtl() to query the partition; it returns -1 on failure.
    if ( SchedCtl( SCHED_APS_QUERY_PARTITION, &part_info,
                   sizeof(part_info) ) == -1 ) {
        perror( "SchedCtl" );
        return EXIT_FAILURE;
    }
    // Use the output field.
    printf( "The budget is %d percent.\n", part_info.budget_percent );
    return EXIT_SUCCESS;
}
| Safety: | |
|---|---|
| Cancellation point | No |
| Interrupt handler | No |
| Signal handler | Yes |
| Thread | Yes |
SchedGet(), SchedInfo(), SchedSet(), SchedYield(), sigevent
aps in the Utilities Reference
Adaptive Partitioning User's Guide