Description:
The SchedCtl() and SchedCtl_r() kernel calls control the scheduler.
These functions are identical except in the way they indicate errors;
see the Returns section for details.
Note:
The adaptive partitioning scheduler is optional and is present only if you add
[module=aps] to
your OS image's buildfile.
For more information, see the Adaptive Partitioning
User's Guide.
For the control commands that are related to adaptive partitioning,
you must initialize all of the fields—including reserved ones—in the structures you pass as the
data argument, by calling (for example)
memset().
You can also use the APS_INIT_DATA() macro:
APS_INIT_DATA( &data );
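As a minimal sketch of the initialization requirement above, the following portable C fragment zeroes a whole structure with memset() before setting the one input field. The type demo_lookup_parms is a local stand-in that copies the layout of sched_aps_lookup_parms shown later on this page, and demo_lookup_init() is a hypothetical helper; nothing here calls the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for illustration only; real code would use the
 * structure declared in the QNX headers. */
typedef struct {
    char    *name;       /* input */
    int16_t  reserved1;  /* must be zero when passed to the kernel */
    int16_t  id;         /* output */
} demo_lookup_parms;

/* Zero every byte (reserved fields included), then set the input
 * field. Hypothetical helper, not part of any QNX API. */
static void demo_lookup_init(demo_lookup_parms *p, char *name)
{
    assert(p != NULL);
    memset(p, 0, sizeof(*p));
    p->name = name;
}
```

Zeroing first and assigning afterwards means a reserved field can't be left holding stack garbage, which is the situation the EDOM error described below reports.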
SCHED_APS_QUERY_PARMS
This command fills in a sched_aps_info structure that describes the overall parameters of the
adaptive partitioning scheduler:
typedef struct {
uint64_t cycles_per_ms;
uint64_t windowsize_cycles; /* Deprecated */
uint64_t windowsize2_cycles; /* Deprecated */
uint64_t windowsize3_cycles; /* Deprecated */
uint32_t scheduling_policy_flags;
uint32_t sec_flags;
uint32_t bankruptcy_policy;
uint16_t num_partitions;
uint16_t max_partitions;
uint16_t windowsize_ms;
uint16_t reserved1;
uint32_t reserved2;
uint64_t reserved3;
} sched_aps_info;
The members include:
- cycles_per_ms
- The number of machine cycles in a millisecond. Use this value to convert the output of the
SCHED_APS_QUERY_PARTITION
command to the time units of your choice.
Note:
The value of
cycles_per_ms:
- might not equal the value of the cycles_per_sec member of the
system page divided by 1000
- isn't necessarily in the same units as values returned by
ClockCycles() on all platforms
- scheduling_policy_flags
- The set of SCHED_APS_SCHEDPOL_* flags that describe the scheduling policy.
For more information, see
Scheduling policies, below.
- sec_flags
- The set of SCHED_APS_SEC_* flags that describe the security options.
For more information, see Security, below.
- bankruptcy_policy
- What to do if a partition exhausts its critical budget; a combination of
SCHED_APS_BNKR_* flags (see
Handling bankruptcy, below).
- num_partitions
- The number of partitions defined.
- max_partitions
- The largest number of partitions that may be created at any time.
- windowsize_ms
- The length of the averaging window used for scheduling, in milliseconds.
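The cycles_per_ms conversion described above can be sketched as follows. Both helpers are hypothetical and portable; they assume that the cycle count and the cycles_per_ms value both come from the adaptive partitioning scheduler, since the note above warns that cycles_per_ms isn't necessarily in the units of ClockCycles().

```c
#include <assert.h>
#include <stdint.h>

/* Convert a scheduler cycle count into whole milliseconds (rounds
 * down). Hypothetical helper, not part of any QNX API. */
static uint64_t demo_cycles_to_ms(uint64_t cycles, uint64_t cycles_per_ms)
{
    assert(cycles_per_ms != 0);   /* guards the division */
    return cycles / cycles_per_ms;
}

/* Same conversion in microseconds, for intervals shorter than 1 ms.
 * The multiplication can overflow for extremely large cycle counts. */
static uint64_t demo_cycles_to_us(uint64_t cycles, uint64_t cycles_per_ms)
{
    assert(cycles_per_ms != 0);
    return (cycles * 1000u) / cycles_per_ms;
}
```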
Scheduling policies
These flags set options for the adaptive partitioning scheduling algorithm.
To set them, pass a pointer to an ORed set of these flags with the
SCHED_APS_SET_PARMS
call to SchedCtl():
- SCHED_APS_SCHEDPOL_FREETIME_BY_RATIO
- Free time is when at least one partition isn't running.
Its time becomes free to other partitions that may then run over their budgets.
By default, the scheduler hands out free time to the partition with the
highest-priority running thread. That guarantees realtime scheduling behavior (i.e.,
scheduling strictly by priority) to partitions any time they aren't being limited by
some other partition's right to its guaranteed minimum budget. But it also means that
one partition is allowed to grab all the free time.
If you set SCHED_APS_SCHEDPOL_FREETIME_BY_RATIO, the running
partitions share the free time in proportion to the ratios of their budgets. So, one
partition can no longer grab all the free time. However, when this flag is set,
partitions will see strict priority-scheduling between partitions only when they're
consuming less than their CPU budgets.
- SCHED_APS_SCHEDPOL_DEFAULT
- The default policy, which currently means that
SCHED_APS_SCHEDPOL_FREETIME_BY_RATIO isn't set,
long-window reporting is enabled, and all maximum budgets are 100%.
QNX Neutrino sets this at startup.
- SCHED_APS_SCHEDPOL_PARTITION_LOCAL_PRIORITIES
- Normally, the APS scheduler always selects the highest-priority thread from
partitions with available budget.
That algorithm produces behavior closest to realtime scheduling.
In particular, it allows the highest-priority partition to run to completion
(as long as it keeps under its budget).
Allowing the highest-priority partition to run to completion means
lower-priority partitions won't run in the meantime.
That means that when the system is loaded, the default algorithm can
cause small-budget low-priority partitions to see long delays between
intervals when they run.
(For example, on a loaded system with a 100 ms averaging window,
a 10% partition may run only every 90 milliseconds.)
One way to reduce the latency is to reduce the averaging window size.
The SCHED_APS_SCHEDPOL_PARTITION_LOCAL_PRIORITIES policy
schedules purely by budget ratio.
When enabled, the scheduler tries to balance budgets on as short a timescale
as possible, regardless of the window size.
That means high-priority partitions with large budgets no longer run
to completion (while they have budget).
They're timesliced with other partitions, even low-priority ones.
The result is that SCHED_APS_SCHEDPOL_PARTITION_LOCAL_PRIORITIES can
reduce the latencies seen by small budget partitions in a loaded system.
This comes at the cost of a departure from strict priority preemptive behavior
when all partitions have budget (i.e., the default policy).
The SCHED_APS_SCHEDPOL_PARTITION_LOCAL_PRIORITIES policy
provides these reduced latencies while retaining the full accuracy of a 100 ms
averaging window.
It continues to schedule the highest priority thread within partitions.
This option includes the behavior of SCHED_APS_SCHEDPOL_FREETIME_BY_RATIO.
When a critical thread consumes critical time, it temporarily forces a
return to the default scheduling policy.
Critical threads are allowed to run to completion and aren't timesliced by
SCHED_APS_SCHEDPOL_PARTITION_LOCAL_PRIORITIES's
attempts to balance budgets.
In other words, critical threads aren't affected by this policy.
Note:
Don't use this scheduling policy if you have running threads in a
zero-budget partition.
Since this policy divides time by the ratio of budgets, a zero-budget partition
may never be scheduled.
- SCHED_APS_SCHEDPOL_LIMIT_CPU_USAGE
- Enable enforcement of the max_budget_percent parameters, which limit the amount
a partition can overrun its normal budget when the system is underloaded.
If this option isn't set, max_budget_percent is ignored when you're setting parameters,
and is reported as 100% (meaning no limit on freetime usage).
Note:
Threads in a partition with a normal budget of 0 and a max_budget_percent of 0 will never run.
Scheduling within a partition is always strictly by priority, no matter which of these flags are set.
For more information about adaptive partitioning and BMP, see the
Adaptive Partitioning Scheduling Details
chapter of the Adaptive Partitioning User's Guide.
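To make the effect of SCHED_APS_SCHEDPOL_FREETIME_BY_RATIO concrete, here is a small arithmetic model of free time being shared in proportion to budgets. It's a simplified sketch, not the scheduler's actual algorithm: it ignores priorities, critical time, and the averaging window, and the function name and array layout are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified model: share[i] receives the CPU percentage that
 * partition i could use when the budgets of idle partitions are
 * divided among running partitions by budget ratio. */
static void demo_share_free_time(const double *budget, const int *running,
                                 size_t n, double *share)
{
    double free_time = 0.0;      /* budget left unused by idle partitions */
    double running_total = 0.0;  /* sum of the running partitions' budgets */

    assert(budget != NULL && running != NULL && share != NULL);

    for (size_t i = 0; i < n; i++) {
        if (running[i])
            running_total += budget[i];
        else
            free_time += budget[i];
    }
    for (size_t i = 0; i < n; i++) {
        if (running[i] && running_total > 0.0)
            share[i] = budget[i] + free_time * budget[i] / running_total;
        else
            share[i] = 0.0;
    }
}
```

With budgets of 50%, 30%, and 20%, and the 20% partition idle, this model gives the two running partitions 62.5% and 37.5%. Under the default policy, the partition with the highest-priority running thread could instead take the whole 20% of free time.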
Handling bankruptcy
Bankruptcy is when critical CPU time billed to a partition exceeds its critical budget.
Bankruptcy is always considered to be a design error on the part of the application, but you can configure how
the system responds to it.
If the system isn't declaring bankruptcy when you expect it, note that bankruptcy can be
declared only if critical time is billed to your partition.
Critical time is billed on those timeslices when the following conditions are all met:
- The partition of the running thread has a critical budget greater than zero.
- The running thread has a critical priority.
- The partition is out of percentage-CPU budget.
- There is at least one other partition that is competing for CPU time.
The system declares the partition bankrupt only if the critical time billed over the
current averaging window then exceeds the partition's critical budget.
When the system detects that a partition has gone bankrupt:
- It causes that partition to be out-of-budget for the remainder of the current scheduling window.
- If you've set a sigevent via
procmgr_event_notify()
or
procmgr_event_notify_add()
with a flag of PROCMGR_EVENT_APS_BANKRUPTCY, the system delivers the event.
In addition, you can configure the following responses:
- SCHED_APS_BNKR_BASIC
- Deliver bankruptcy-notification events and make the partition out-of-budget for the
rest of the scheduling window (nominally 100 ms). This is the default.
- SCHED_APS_BNKR_CANCEL_BUDGET
- Set the offending partition's critical budget to zero, which forces the partition to
be scheduled by its percentage CPU budget only.
This also means that a second bankruptcy can't occur.
This persists until a restart occurs, or you call
SCHED_APS_MODIFY_PARTITION
to set a new critical budget.
- SCHED_APS_BNKR_REBOOT
- Cause the system to crash with a brief message identifying the offending partition.
This is the most severe response, suggested for use while testing a product, to make
sure bankruptcies are never ignored.
You probably shouldn't use this option in your finished product.
To set a choice of bankruptcy-handling options, OR the above SCHED_APS_BNKR_* flags together
and pass a pointer to the result as the bankruptcy_policyp member of the sched_aps_parms
structure when you use the
SCHED_APS_SET_PARMS command.
Errors:
SchedCtl() and SchedCtl_r() indicate the following
errors for the SCHED_APS_QUERY_PARMS command (see the
Returns section for details):
- EOK
- Success.
- EACCES
- The calling thread doesn't meet the security options set (see
SCHED_APS_ADD_SECURITY).
Your process must have the PROCMGR_AID_APS_ROOT ability (see
procmgr_ability()).
- EDOM
- A reserved field isn't zero.
You might not have used APS_INIT_DATA() to initialize the data parameter.
- EINVAL
- The size of the parameter block doesn't match the size of the expected structure.
- ENOSYS
- The adaptive partitioning scheduler isn't installed.
SCHED_APS_SET_PARMS
This command sets the parameters for the overall behavior of the adaptive partitioning scheduler.
The data argument must be a pointer to a sched_aps_parms structure:
typedef struct {
int16_t windowsize_ms;
int16_t reserved1;
uint32_t *scheduling_policy_flagsp;
uint32_t *bankruptcy_policyp;
int32_t reserved2;
int64_t reserved3;
} sched_aps_parms;
The members include:
- windowsize_ms
- The time over which the scheduler is to average CPU cycles and balance the partitions
to their budgets as specified by
SCHED_APS_CREATE_PARTITION.
The default is 100 ms.
If you don't want to set the window size, set this member to -1.
- scheduling_policy_flagsp
- A pointer to an ORed set of SCHED_APS_SCHEDPOL_* flags that specify the scheduling policy.
For more information, see
Scheduling policies, above.
If you don't want to change the scheduling policy, set this member to NULL.
- bankruptcy_policyp
- A pointer to an ORing of SCHED_APS_BNKR_* flags, as described under
Handling bankruptcy, above.
If you don't want to change these flags, set this member to NULL.
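The "leave unchanged" values described above (-1 for the window size, NULL for either pointer) can be sketched like this. The type demo_sched_aps_parms is a local copy of the layout shown above so that the fragment compiles anywhere, and both helper names are hypothetical; real code would pass the QNX structure to SchedCtl().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local copy of the layout shown above, for illustration only. */
typedef struct {
    int16_t   windowsize_ms;
    int16_t   reserved1;
    uint32_t *scheduling_policy_flagsp;
    uint32_t *bankruptcy_policyp;
    int32_t   reserved2;
    int64_t   reserved3;
} demo_sched_aps_parms;

/* Prepare a request that changes only the averaging window. */
static void demo_parms_window_only(demo_sched_aps_parms *p, int16_t window_ms)
{
    assert(p != NULL);
    memset(p, 0, sizeof(*p));            /* zero the reserved fields */
    p->windowsize_ms = window_ms;
    p->scheduling_policy_flagsp = NULL;  /* keep the scheduling policy */
    p->bankruptcy_policyp = NULL;        /* keep the bankruptcy policy */
}

/* Prepare a request that changes only the bankruptcy policy. */
static void demo_parms_bankruptcy_only(demo_sched_aps_parms *p,
                                       uint32_t *policy)
{
    assert(p != NULL);
    memset(p, 0, sizeof(*p));
    p->windowsize_ms = -1;               /* keep the window size */
    p->scheduling_policy_flagsp = NULL;
    p->bankruptcy_policyp = policy;
}
```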
Errors:
SchedCtl() and SchedCtl_r() indicate the following
errors for the SCHED_APS_SET_PARMS command (see the
Returns section for details):
- EOK
- Success.
- EACCES
- One of the following:
- SCHED_APS_SEC_PARTITIONS_LOCKED is set.
- SCHED_APS_SEC_ROOT0_OVERALL is set, and you aren't running in the
System partition with the PROCMGR_AID_APS_ROOT ability (see
procmgr_ability()).
For more information, see Security, below.
- EDOM
- A reserved field isn't zero.
You might not have used APS_INIT_DATA() to initialize the data parameter.
- EINVAL
- The size of the parameter block doesn't match the size of the expected structure.
- ENOSYS
- The adaptive partitioning scheduler isn't installed.
SCHED_APS_CREATE_PARTITION
This command creates a new partition that's considered to be a child of the partition that's calling
SchedCtl().
The system automatically creates a partition called System (the value of
APS_SYSTEM_PARTITION_NAME) with an ID of 0.
The data argument for this command must be a pointer to a sched_aps_create_parms
structure:
typedef struct {
/* input parms */
char *name;
uint16_t budget_percent;
int16_t critical_budget_ms;
uint8_t aps_create_flags;
int8_t parent_id;
uint16_t max_budget_percent;
uint16_t critical_priority;
uint16_t budget_percent_scale;
uint64_t reserved1;
uint32_t reserved2;
/* output parms */
int16_t id;
int16_t reserved3;
} sched_aps_create_parms;
The input members include:
- name
- The name of the new partition.
If name is NULL or points to an empty string, the partition's name is the
same as its ID.
The name must be no longer than APS_PARTITION_NAME_LENGTH, not including the
trailing null character, can't start with a digit, and can't include any slashes (/).
- budget_percent
- The percentage CPU budget for the new partition.
Budgets given to the new partition are subtracted from the parent partition.
Note:
Before creating zero-budget partitions, read the cautions in
Setting budgets for resource managers
in the System Considerations chapter of the Adaptive Partitioning
User's Guide.
- critical_budget_ms
- The critical budget, in milliseconds, for the partition, or -1 or 0 if you don't want
the partition to have a critical budget.
Critical budgets don't affect the parent, but
are automatically limited to be no bigger than the window size.
- aps_create_flags
- Flags that control the creation of the partition.
The only flag currently defined is:
- APS_CREATE_FLAGS_USE_PARENT_ID — if set, the parent_id field is used;
otherwise it's ignored.
- parent_id
- Which partition the budget should come from.
If -1, then the budget comes from the calling thread's budget.
- max_budget_percent
- The maximum CPU time, in percent, that the partition may consume if it has no competition (i.e., free time).
This limit has an effect only if SCHED_APS_SCHEDPOL_LIMIT_CPU_USAGE is set.
- critical_priority
- Threads at this priority or higher are critical.
This value is optional; set it to -1 or 0 to disable.
- budget_percent_scale
- The number of digits to the right of the decimal point in budget_percent and
max_budget_percent. The budget is calculated as follows:
Actual Budget (%) = budget_percent / (10 ^ budget_percent_scale)
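As a worked example of the formula above, the following hypothetical helper computes the effective percentage. Note that "10 ^ budget_percent_scale" in the formula means ten raised to a power; in C the ^ operator is bitwise XOR, so the sketch builds the divisor with a loop.

```c
#include <stdint.h>

/* Effective budget in percent: budget_percent / 10^scale.
 * Hypothetical helper, not part of any QNX API. */
static double demo_actual_budget_percent(uint16_t budget_percent,
                                         uint16_t scale)
{
    double divisor = 1.0;
    for (uint16_t i = 0; i < scale; i++)
        divisor *= 10.0;
    return (double)budget_percent / divisor;
}
```

For instance, budget_percent = 125 with budget_percent_scale = 1 describes a 12.5% budget, and so does 1250 with a scale of 2.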
The output members include:
- id
- The created partition's ID number, in the range 0 to the maximum number of partitions
− 1 (see the max_partitions member of the data from a
SCHED_APS_QUERY_PARMS command).
The System partition's ID number is APS_SYSTEM_PARTITION_ID.
Errors:
SchedCtl() and SchedCtl_r() indicate the following
errors for the SCHED_APS_CREATE_PARTITION command (see the
Returns section for details):
- EOK
- Success.
- EACCES
- SCHED_APS_SEC_PARTITIONS_LOCKED is set, or any of these security
conditions are set and not satisfied:
- SCHED_APS_SEC_ROOT_MAKES_PARTITIONS
- SCHED_APS_SEC_SYS_MAKES_PARTITIONS
- SCHED_APS_SEC_NONZERO_BUDGETS
- SCHED_APS_SEC_ROOT_MAKES_CRITICAL
- SCHED_APS_SEC_SYS_MAKES_CRITICAL
For more information, see Security, below.
- EDOM
- A reserved field isn't zero.
You might not have used APS_INIT_DATA() to initialize the data parameter.
- EDQUOT
- The parent partition doesn't have enough budget.
- EEXIST
- Another partition is already using the given name.
- EINVAL
- The size of the parameter block doesn't match the size of the expected structure, the
partition name is badly formed, or the budget is out of range.
- ENAMETOOLONG
- The partition name is longer than APS_PARTITION_NAME_LENGTH characters.
- ENOSPC
- The maximum number of partitions already exist.
- ENOSYS
- The adaptive partitioning scheduler isn't installed.
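The naming rules for a new partition can be checked before you issue the command. The sketch below is a client-side pre-check based only on the rules stated on this page; the function name is hypothetical, max_len stands in for APS_PARTITION_NAME_LENGTH (its value isn't given here), and mapping each failure to EINVAL or ENAMETOOLONG follows the error list above rather than any knowledge of the kernel's own checking order.

```c
#include <ctype.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Returns 0 if the name looks acceptable, or the errno value that the
 * error list above associates with the problem. Hypothetical helper. */
static int demo_check_partition_name(const char *name, size_t max_len)
{
    if (name == NULL || name[0] == '\0')
        return 0;                /* the partition is named after its ID */
    if (strlen(name) > max_len)
        return ENAMETOOLONG;     /* length excludes the trailing null */
    if (isdigit((unsigned char)name[0]))
        return EINVAL;           /* can't start with a digit */
    if (strchr(name, '/') != NULL)
        return EINVAL;           /* can't include slashes */
    return 0;
}
```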
SCHED_APS_QUERY_PARTITION
This command gets information about a given partition.
The data argument for this command must be a
pointer to a sched_aps_partition_info structure:
typedef struct {
/* out parms */
uint64_t budget_cycles; /* Deprecated */
uint64_t critical_budget_cycles;
char name[APS_PARTITION_NAME_LENGTH+1];
int16_t parent_id;
uint16_t budget_percent;
int32_t notify_pid; /* Deprecated */
int32_t notify_tid; /* Deprecated */
uint32_t pinfo_flags; /* Deprecated */
int32_t pid_at_last_bankruptcy;
int32_t tid_at_last_bankruptcy;
uint16_t max_budget_percent;
uint16_t critical_priority;
uint16_t budget_percent_scale;
int16_t reserved1;
int64_t reserved2;
/* input parm */
int16_t id;
} sched_aps_partition_info;
The input members include:
- id
- The number of the partition you want to query.
The output members include:
- critical_budget_cycles
- The critical budget, in cycles.
To convert this value to milliseconds, divide it by the cycles_per_ms value from a
SCHED_APS_QUERY_PARMS
command.
- name
- The name of the partition.
- parent_id
- The number of the partition that's the parent of the partition being queried.
The System partition's ID number is APS_SYSTEM_PARTITION_ID.
- budget_percent
- The partition's budget, expressed as a percentage.
- pid_at_last_bankruptcy, tid_at_last_bankruptcy
- The process and thread IDs at the time of the last bankruptcy, or -1 if there wasn't a
previous bankruptcy.
- max_budget_percent
- The maximum CPU time, in percent, that the partition may consume if it has no competition (i.e., free time).
This is always reported as 100% if SCHED_APS_SCHEDPOL_LIMIT_CPU_USAGE isn't set.
- critical_priority
- Threads at this priority or higher are critical.
- budget_percent_scale
- The number of digits to the right of the decimal point in budget_percent and
max_budget_percent.
Errors:
SchedCtl() and SchedCtl_r() indicate the following
errors for the SCHED_APS_QUERY_PARTITION command (see the
Returns section for details):
- EOK
- Success.
- EDOM
- A reserved field isn't zero.
You might not have used APS_INIT_DATA() to initialize the data parameter.
- EINVAL
- The size of the parameter block doesn't match the size of the expected structure.
- ENOSYS
- The adaptive partitioning scheduler isn't installed.
SCHED_APS_LOOKUP
This command finds the partition ID for a given partition name.
The data argument for this command must be a pointer to a sched_aps_lookup_parms structure:
typedef struct {
/* input parms */
char *name;
int16_t reserved1;
/* output parms */
int16_t id;
} sched_aps_lookup_parms;
The input members include:
- name
- The name of the partition.
The output members include:
- id
- The ID number of the partition, if found.
Errors:
SchedCtl() and SchedCtl_r() indicate the following
errors for the SCHED_APS_LOOKUP command (see the
Returns
section for details):
- EOK
- Success.
- EDOM
- A reserved field isn't zero.
You might not have used APS_INIT_DATA() to initialize the data parameter.
- EINVAL
- The name wasn't found.
SCHED_APS_JOIN_PARTITION
This command makes the thread specified by the given process and thread IDs become a member of
the specified partition.
This partition also becomes the thread's new home partition, i.e., where it returns after partition inheritance.
The data argument for this command must be a pointer to a sched_aps_join_parms
structure:
typedef struct {
int16_t id;
int16_t reserved1;
int32_t pid;
int32_t tid;
int32_t aid;
} sched_aps_join_parms;
The members include:
- id
- The ID number of the partition that the thread is to join.
- pid, tid
- The process and thread IDs of the thread that you want to join the specified partition.
- aid
- If non-zero, join all processes for this application ID.
Errors:
SchedCtl() and SchedCtl_r() indicate the following
errors for the SCHED_APS_JOIN_PARTITION command (see the
Returns section for details):
- EOK
- Success.
- EACCES
- The following security options are set but not satisfied:
- SCHED_APS_SEC_ROOT_JOINS
- SCHED_APS_SEC_SYS_JOINS
- SCHED_APS_SEC_PARENT_JOINS
- SCHED_APS_SEC_JOIN_SELF_ONLY
For more information, see Security, below.
- EDOM
- A reserved field isn't zero.
You might not have used APS_INIT_DATA() to initialize the data parameter.
- EINVAL
- The size of the parameter block doesn't match the size of the expected structure, or
the partition with the given ID doesn't exist.
- ENOSYS
- The adaptive partitioning scheduler isn't installed.
- ESRCH
- The pid and tid are invalid.
SCHED_APS_MODIFY_PARTITION
This command changes the parameters of an existing partition.
If the new budget's value is different from the current, the difference is either taken from,
or returned to, the parent partition's budget.
The critical time parameter affects only the chosen partition, not its parent.
To change just one of new budget or new critical time, set the other to -1.
Note:
- You can't use this command to modify the System partition's budget. To increase the
size of the System partition, reduce the budget of one of its child partitions.
- Reducing the size of a partition may cause it not to run for the time of an averaging
window, as you may have caused it to become temporarily over-budget. However, reducing
the critical time doesn't trigger the declaration of bankruptcy.
The data argument for this command must be a pointer to a sched_aps_modify_parms
structure:
typedef struct {
int16_t id;
int16_t new_budget_percent;
int16_t new_critical_budget_ms;
int16_t new_max_budget_percent;
int16_t new_critical_priority;
uint16_t budget_percent_scale;
int32_t reserved1;
int64_t reserved2;
} sched_aps_modify_parms;
The members include:
- id
- The ID number of the partition.
- new_budget_percent
- The new budget for the partition, expressed as a percentage, or -1 if you don't want to change it.
- new_critical_budget_ms
- The new critical budget, in milliseconds, for the partition, or -1 if you don't want to change it.
If the critical budget is greater than the window size, it's considered to be infinite.
- new_max_budget_percent
- The maximum CPU time, in percent, that the partition may consume if it has no competition (i.e., free time).
This limit has an effect only if SCHED_APS_SCHEDPOL_LIMIT_CPU_USAGE is set.
- new_critical_priority
- Threads at this priority or higher are critical.
This value is optional; set it to -1 or 0 to skip.
- budget_percent_scale
- The number of digits to the right of the decimal point in new_budget_percent and
new_max_budget_percent.
Errors:
SchedCtl() and SchedCtl_r() indicate the following
errors for the SCHED_APS_MODIFY_PARTITION command (see the
Returns section for details):
- EOK
- Success.
- EACCES
- SCHED_APS_SEC_PARTITIONS_LOCKED is set, or the following
security options are set and not satisfied:
- SCHED_APS_SEC_PARENT_MODIFIES
- SCHED_APS_SEC_ROOT_MAKES_PARTITIONS
- SCHED_APS_SEC_SYS_MAKES_PARTITIONS
- SCHED_APS_SEC_NONZERO_BUDGETS
- SCHED_APS_SEC_ROOT_MAKES_CRITICAL
- SCHED_APS_SEC_SYS_MAKES_CRITICAL
For more information, see Security, below.
- EDOM
- A reserved field isn't zero.
You might not have used APS_INIT_DATA() to initialize the data parameter.
- EINVAL
- The size of the parameter block doesn't match the size of the expected structure, or
the partition with the given ID doesn't exist.
- ENOSYS
- The adaptive partitioning scheduler isn't installed.
SCHED_APS_PARTITION_STATS
This command returns instantaneous values of the CPU time-accounting variables for a set of partitions.
It can fill in data for more than one partition.
If the length argument to SchedCtl() indicates that
you've passed the function an array of
sched_aps_partition_stats structures, SchedCtl() fills
each element with statistics for a different partition, starting with the partition
specified by the id field.
Note:
To get an accurate picture for the whole machine it's important to read data for
all partitions in one call, since sequential calls to
SCHED_APS_PARTITION_STATS may come from separate averaging windows.
To determine the number of partitions, use the num_partitions member returned by the
SCHED_APS_QUERY_PARMS command.
The command overwrites the id field with the partition number for which
data is being returned. It stores -1 in the id field of unused elements.
To convert times in cycles into milliseconds, divide them by the cycles_per_ms obtained with an
SCHED_APS_QUERY_PARMS
command.
The data argument for this command must be a pointer to a
sched_aps_partition_stats structure, or an array of these structures:
typedef struct {
/* out parms */
uint64_t run_time_cycles;
uint64_t critical_time_cycles;
uint64_t run_time_cycles_w2; /* Deprecated */
uint64_t critical_time_cycles_w2; /* Deprecated */
uint64_t run_time_cycles_w3; /* Deprecated */
uint64_t critical_time_cycles_w3; /* Deprecated */
uint32_t stats_flags;
uint32_t reserved1;
uint64_t dynamic_windowsize_cycles; /* length of last averaging window used for scheduling */
uint64_t reserved2;
/* in parm */
int16_t id;
} sched_aps_partition_stats;
The members include:
- run_time_cycles
- The CPU execution time during the last scheduling window.
- critical_time_cycles
- The time billed as critical during the last scheduling window.
- stats_flags
- A set of the following flags:
- SCHED_APS_PSTATS_IS_BANKRUPT_NOW — the critical time used is
greater than the critical budget at the time you used the
SCHED_APS_PARTITION_STATS command.
- SCHED_APS_PSTATS_WAS_BANKRUPT — the partition was declared
bankrupt sometime since the last restart.
- dynamic_windowsize_cycles
- The length of the last averaging window used for scheduling.
Note that dynamic_windowsize_cycles may differ from the nominal window size.
Use dynamic_windowsize_cycles to convert run_time_cycles to a percentage.
- id
- This is both an input and output field.
As input, it's the ID number of the first partition you want data for.
If you've passed an array of sched_aps_partition_stats structures, the command fills in the ID
number for each partition that it fills in statistics for.
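Once the array has been filled in, two small calculations are usually needed: how many elements hold data, and what share of the window each partition used. The sketch below shows both on a reduced stand-in structure that carries only the three fields involved; the type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Subset of the fields shown above, for illustration only. */
typedef struct {
    uint64_t run_time_cycles;
    uint64_t dynamic_windowsize_cycles;
    int16_t  id;                 /* -1 marks an unused element */
} demo_partition_stats;

/* CPU usage over the last window, as a percentage. */
static double demo_cpu_percent(const demo_partition_stats *s)
{
    assert(s != NULL);
    if (s->dynamic_windowsize_cycles == 0)
        return 0.0;              /* avoid dividing by zero */
    return 100.0 * (double)s->run_time_cycles
                 / (double)s->dynamic_windowsize_cycles;
}

/* Number of elements that the command filled in. */
static size_t demo_count_filled(const demo_partition_stats *a, size_t n)
{
    size_t used = 0;
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (a[i].id != -1)
            used++;
    }
    return used;
}
```

The percentage uses dynamic_windowsize_cycles rather than the nominal window size, as recommended in the member description above.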
Errors:
SchedCtl() and SchedCtl_r() indicate the following
errors for the SCHED_APS_PARTITION_STATS command (see the
Returns section for details):
- EOK
- Success.
- EDOM
- A reserved field isn't zero.
You might not have used APS_INIT_DATA() to initialize the data parameter.
- EINVAL
- The size of the parameter block isn't a multiple of sizeof(sched_aps_partition_stats).
- ENOSYS
- The adaptive partitioning scheduler isn't installed.
SCHED_APS_OVERALL_STATS
This command returns instantaneous information about scheduler states.
The data argument for this command must be a pointer to a sched_aps_overall_stats
structure:
typedef struct {
uint64_t idle_cycles; /* Deprecated */
uint64_t idle_cycles_w2; /* Deprecated */
uint64_t idle_cycles_w3; /* Deprecated */
int16_t id_at_last_bankruptcy;
uint16_t reserved1;
int32_t pid_at_last_bankruptcy;
int32_t tid_at_last_bankruptcy;
uint32_t reserved2;
uint32_t reserved3;
uint64_t reserved4;
} sched_aps_overall_stats;
The members include:
- id_at_last_bankruptcy
- The ID of the last bankrupt partition, or -1 if no bankruptcy has occurred.
- pid_at_last_bankruptcy, tid_at_last_bankruptcy
- The process and thread IDs at the last bankruptcy, or -1 if no bankruptcy has occurred.
Errors:
SchedCtl() and SchedCtl_r() indicate the following
errors for the SCHED_APS_OVERALL_STATS command (see the
Returns section for details):
- EOK
- Success.
- EDOM
- A reserved field isn't zero.
You might not have used APS_INIT_DATA() to initialize the data parameter.
- EINVAL
- The size of the parameter block doesn't match the size of the expected structure.
- ENOSYS
- The adaptive partitioning scheduler isn't installed.
SCHED_APS_QUERY_THREAD
This command determines the partition for the given thread and indicates whether or not the thread in
your process is marked to run as critical.
Use a thread ID of zero to indicate the calling thread.
The data argument for this command must be a pointer to a
sched_aps_query_thread_parms structure:
typedef struct {
int32_t pid;
int32_t tid;
/* out parms: */
int16_t id;
int16_t inherited_id;
uint32_t crit_state_flags;
int32_t reserved1;
int32_t reserved2;
} sched_aps_query_thread_parms;
The input members include:
- pid
- The ID of the process that the thread belongs to, or 0 to indicate the calling process.
- tid
- The thread ID, or 0 for the calling thread.
The output members include:
- id
- The ID number of the partition that the thread originally joined.
- inherited_id
- The ID number of the partition that the thread currently belongs to.
This might not be the same as the id member, because the thread might have inherited
the partition from a calling process.
- crit_state_flags
- A combination of the following flags:
- APS_QCRIT_RUNNING_CRITICAL — the thread is currently running as critical.
- APS_QCRIT_BILL_AS_CRITICAL — the thread's execution time is
being billed to the partition's critical budget.
Errors:
SchedCtl() and SchedCtl_r() indicate the following
errors for the SCHED_APS_QUERY_THREAD command (see the
Returns section for details):
- EOK
- Success.
- EDOM
- A reserved field isn't zero.
You might not have used APS_INIT_DATA() to initialize the data parameter.
- EINVAL
- The size of the parameter block doesn't match the size of the expected structure.
- ENOSYS
- The adaptive partitioning scheduler isn't installed.
- ESRCH
- The specified thread wasn't found.
SCHED_APS_ADD_SECURITY
This command sets security options.
A bit that's set turns the corresponding security option on.
Successive calls add to the existing set of security options.
Security options can be cleared only by a restart.
Note:
You must be running in the System partition with the
PROCMGR_AID_APS_ROOT ability enabled (see
procmgr_ability())
in order to use this command, even if all security options are off.
The data argument for this command must be a pointer to a sched_aps_security_parms
structure:
typedef struct {
uint32_t sec_flags;
uint32_t reserved1;
uint32_t reserved2;
} sched_aps_security_parms;
The members include:
- sec_flags
- A set of SCHED_APS_SEC_* flags (see below), as both input and output parameters.
Set this member to 0 if you want to get the current security flags.
Security
The adaptive partitioning scheduler lets you dynamically create and modify the partitions in your system.
Note:
We recommend that you set up your partition environment at boot time, and then lock all parameters:
- in a program, by using the SCHED_APS_SEC_PARTITIONS_LOCKED flag
- from the command line, by using the
aps modify
command
However, you might need to modify a partition at runtime.
In this case, you can use the security options described below.
When QNX Neutrino starts, it sets the security option to SCHED_APS_SEC_OFF.
We recommend that you immediately set it to SCHED_APS_SEC_RECOMMENDED.
In code, do this:
sched_aps_security_parms p;
APS_INIT_DATA( &p );
p.sec_flags = SCHED_APS_SEC_RECOMMENDED;
SchedCtl( SCHED_APS_ADD_SECURITY, &p, sizeof(p) );
Note:
Some of the security options restrict certain operations to processes that have the
PROCMGR_AID_APS_ROOT ability enabled (see
procmgr_ability()).
The security options include the following:
- SCHED_APS_SEC_RECOMMENDED
- Only a process that's running in the System partition with the
PROCMGR_AID_APS_ROOT ability enabled may create partitions or change parameters.
This arranges a 2-level hierarchy of partitions: the System partition and its children.
Only a process that's running in the System partition with the
PROCMGR_AID_APS_ROOT ability enabled may join its own thread to partitions.
The percentage budgets must not be zero.
- SCHED_APS_SEC_FLEXIBLE
- Only a process that's running in the System partition with the
PROCMGR_AID_APS_ROOT ability enabled can change scheduling parameters.
But a process that's running in any partition with
the PROCMGR_AID_APS_ROOT ability enabled can create subpartitions, join
threads into its own subpartitions, modify subpartitions, and change critical budgets.
This lets applications
create their own local subpartitions out of their own budgets. The percentage budgets
must not be zero.
- SCHED_APS_SEC_BASIC
- Only a process that's running in the System partition with the
PROCMGR_AID_APS_ROOT ability enabled may change overall scheduling
parameters.
Only a process with the PROCMGR_AID_APS_ROOT ability enabled may set critical budgets.
Unless you're testing the partitioning and want to change all parameters without needing
to restart, you should set at least SCHED_APS_SEC_BASIC.
In general, SCHED_APS_SEC_RECOMMENDED is more secure than
SCHED_APS_SEC_FLEXIBLE, which is more secure than SCHED_APS_SEC_BASIC.
All three allow partitions to be created and modified.
After setting up partitions, use SCHED_APS_SEC_PARTITIONS_LOCKED to prevent
further unauthorized changes. For example:
sched_aps_security_parms p;
APS_INIT_DATA( &p );
p.sec_flags = SCHED_APS_SEC_PARTITIONS_LOCKED;
SchedCtl( SCHED_APS_ADD_SECURITY, &p, sizeof(p));
SCHED_APS_SEC_RECOMMENDED, SCHED_APS_SEC_FLEXIBLE, and
SCHED_APS_SEC_BASIC are composed of the flags defined below (but it's
usually more convenient for you to use the compound options):
#define SCHED_APS_SEC_BASIC (SCHED_APS_SEC_ROOT0_OVERALL | SCHED_APS_SEC_ROOT_MAKES_CRITICAL)
#define SCHED_APS_SEC_FLEXIBLE (SCHED_APS_SEC_BASIC | SCHED_APS_SEC_NONZERO_BUDGETS |\
SCHED_APS_SEC_ROOT_MAKES_PARTITIONS |\
SCHED_APS_SEC_PARENT_JOINS | SCHED_APS_SEC_PARENT_MODIFIES )
#define SCHED_APS_SEC_RECOMMENDED (SCHED_APS_SEC_FLEXIBLE | SCHED_APS_SEC_SYS_MAKES_PARTITIONS |\
SCHED_APS_SEC_SYS_JOINS | SCHED_APS_SEC_JOIN_SELF_ONLY)
#define SCHED_APS_SEC_OFF 0x00000000
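The nesting of the compound options can be checked with a simple bit test. The following is a minimal sketch, not the contents of the system header: the DEMO_* names and their bit values are placeholders invented for illustration, and sec_includes() is a hypothetical helper. Only the way the compound options are composed is taken from the definitions above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values for illustration only. The real values come from
 * the system headers; don't copy these numbers into production code. */
#define DEMO_SEC_ROOT0_OVERALL         0x0001u
#define DEMO_SEC_ROOT_MAKES_CRITICAL   0x0002u
#define DEMO_SEC_NONZERO_BUDGETS       0x0004u
#define DEMO_SEC_ROOT_MAKES_PARTITIONS 0x0008u
#define DEMO_SEC_PARENT_JOINS          0x0010u
#define DEMO_SEC_PARENT_MODIFIES       0x0020u
#define DEMO_SEC_SYS_MAKES_PARTITIONS  0x0040u
#define DEMO_SEC_SYS_JOINS             0x0080u
#define DEMO_SEC_JOIN_SELF_ONLY        0x0100u

/* Compound options, composed in the same way as the definitions above. */
#define DEMO_SEC_BASIC \
    (DEMO_SEC_ROOT0_OVERALL | DEMO_SEC_ROOT_MAKES_CRITICAL)
#define DEMO_SEC_FLEXIBLE \
    (DEMO_SEC_BASIC | DEMO_SEC_NONZERO_BUDGETS | \
     DEMO_SEC_ROOT_MAKES_PARTITIONS | \
     DEMO_SEC_PARENT_JOINS | DEMO_SEC_PARENT_MODIFIES)
#define DEMO_SEC_RECOMMENDED \
    (DEMO_SEC_FLEXIBLE | DEMO_SEC_SYS_MAKES_PARTITIONS | \
     DEMO_SEC_SYS_JOINS | DEMO_SEC_JOIN_SELF_ONLY)

/* Returns true if every bit of `required` is set in `sec_flags`. */
bool sec_includes(uint32_t sec_flags, uint32_t required)
{
    return (sec_flags & required) == required;
}
```

On a target system, the same kind of test could presumably be applied to the sec_flags member that SCHED_APS_QUERY_PARMS fills in, using the real SCHED_APS_SEC_* constants instead of the placeholders.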
The individual flags are as follows:
- SCHED_APS_SEC_ROOT0_OVERALL
- Your process must have the PROCMGR_AID_APS_ROOT ability enabled and be
running in the System partition in order to change the overall scheduling parameters,
such as the averaging window size.
- SCHED_APS_SEC_ROOT_MAKES_PARTITIONS
- Your process must have the PROCMGR_AID_APS_ROOT ability enabled in
order to create or modify partitions.
Applies to the
SCHED_APS_CREATE_PARTITION
and
SCHED_APS_MODIFY_PARTITION
commands.
- SCHED_APS_SEC_SYS_MAKES_PARTITIONS
- You must be running in the System partition in order to create or modify partitions.
This applies to the same commands as SCHED_APS_SEC_ROOT_MAKES_PARTITIONS.
- SCHED_APS_SEC_PARENT_MODIFIES
- Allows partitions to be modified (SCHED_APS_MODIFY_PARTITION), but you
must be running in the parent partition of the partition being modified.
Modify means to change a partition's percentage or critical budget.
- SCHED_APS_SEC_NONZERO_BUDGETS
- A partition may not be created with, or modified to have, a zero budget.
Unless you know that all your partitions need to run only in response to client requests, i.e.,
receipt of messages, you should set this option.
- SCHED_APS_SEC_ROOT_MAKES_CRITICAL
- Your process must have the PROCMGR_AID_APS_ROOT ability enabled in
order to create a nonzero critical budget or change an existing critical budget.
- SCHED_APS_SEC_SYS_MAKES_CRITICAL
- You must be running in the System partition to create a nonzero critical budget or
change an existing critical budget.
- SCHED_APS_SEC_ROOT_JOINS
- Your process must have the PROCMGR_AID_APS_ROOT ability enabled in
order to join a thread to a partition.
- SCHED_APS_SEC_SYS_JOINS
- You must be running in the System partition in order to join a thread to a partition.
- SCHED_APS_SEC_PARENT_JOINS
- You must be running in the parent partition of the partition that you want to join a thread to.
- SCHED_APS_SEC_JOIN_SELF_ONLY
- The caller of the
SCHED_APS_JOIN_PARTITION
command must specify 0 for the pid and tid.
In other words, a process may join only itself to a partition.
- SCHED_APS_SEC_PARTITIONS_LOCKED
- Prevents further changes to any partition's budget or to the overall scheduling parameters, such as the window size.
Set this after you've set up your partitions.
Once you've locked the partitions, you can still use the
SCHED_APS_JOIN_PARTITION
command.
Errors:
SchedCtl() and SchedCtl_r() indicate the following
errors for the SCHED_APS_ADD_SECURITY command (see the
Returns section for details):
- EOK
- Success.
- EACCES
- The calling thread doesn't meet the security options set (see
SCHED_APS_ADD_SECURITY).
Your process must have the PROCMGR_AID_APS_ROOT ability enabled.
- EDOM
- A reserved field isn't zero.
You might not have used APS_INIT_DATA() to initialize the data parameter.
- EINVAL
- The size of the parameter block doesn't match the size of the expected structure.
- ENOSYS
- The adaptive partitioning scheduler isn't installed.
SCHED_APS_QUERY_PROCESS
This command returns the partition of the given process.
The partition of a process is billed while one of the process's threads handles a pulse.
The individual threads in a process may each be in a partition other than the process's own.
The data argument for this command must be a pointer to a
sched_aps_query_process_parms structure:
typedef struct {
int32_t pid;
/* out parms: */
int16_t id; /* partition of process */
int16_t reserved1;
int64_t reserved2;
int64_t reserved3;
int32_t reserved4;
} sched_aps_query_process_parms;
The members include:
- pid
- The process ID, or 0 for the calling process.
- id
- The ID of the process's partition.
Errors:
SchedCtl() and SchedCtl_r() indicate the following
errors for the SCHED_APS_QUERY_PROCESS command (see the
Returns section for details):
- EOK
- Success.
- EDOM
- A reserved field isn't zero.
You might not have used APS_INIT_DATA() to initialize the data parameter.
- EINVAL
- The size of the parameter block doesn't match the size of the expected structure.
- ENOSYS
- The adaptive partitioning scheduler isn't installed.
- ESRCH
- The process wasn't found.
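A call to this command, with handling for the errors listed above, might look like the following sketch. The helper names partition_of_process() and query_process_error() are hypothetical, and the header names <sys/neutrino.h> and <sys/sched_aps.h> are assumptions, since this section doesn't show them. The #else branch is only a stand-in so that the sketch compiles on a non-QNX host.

```c
#include <errno.h>

#ifdef __QNXNTO__
#include <sys/neutrino.h>   /* assumed to declare SchedCtl() */
#include <sys/sched_aps.h>  /* assumed to declare the APS structures */
#endif

/*
 * Hypothetical helper: returns the partition ID of process `pid`
 * (0 means the calling process), or -1 with errno set.
 */
int partition_of_process(int pid)
{
#ifdef __QNXNTO__
    sched_aps_query_process_parms parms;

    APS_INIT_DATA(&parms);   /* zero every field, including reserved ones */
    parms.pid = pid;

    if (SchedCtl(SCHED_APS_QUERY_PROCESS, &parms, sizeof(parms)) == -1) {
        return -1;           /* errno holds one of the values listed above */
    }
    return parms.id;
#else
    /* Stand-in for non-QNX hosts: behave as if the adaptive
     * partitioning scheduler isn't installed. */
    (void)pid;
    errno = ENOSYS;
    return -1;
#endif
}

/* Maps the documented error values to short explanations. */
const char *query_process_error(int err)
{
    switch (err) {
    case 0:      return "success";  /* EOK */
    case EDOM:   return "a reserved field isn't zero";
    case EINVAL: return "parameter block has the wrong size";
    case ENOSYS: return "adaptive partitioning scheduler isn't installed";
    case ESRCH:  return "process not found";
    default:     return "unexpected error";
    }
}
```

A caller could test the result for -1 and pass errno to query_process_error() to report the reason.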
SCHED_CONFIGURE (QNX Neutrino 7.0.1 or later)
As a trade-off between latency and throughput, the scheduling algorithm makes decisions that people
who expect strict priority scheduling occasionally find surprising:
- A high-priority thread T that becomes ready on CPU A
doesn't always displace a lower-priority thread running on that CPU.
Under some circumstances, an interprocessor interrupt (IPI) is sent to CPU B
to force it to perform a new scheduling decision, which is expected (but not guaranteed)
to cause thread T to run on CPU B.
Immediately scheduling the higher-priority thread to run on CPU A results in lower thread latency
for the high-priority thread, but at the expense of decreased total throughput,
as it incurs the cost of interrupting the lower-priority thread and possibly moving it to another CPU.
- If a thread T that's running on CPU A is displaced by a
higher-priority thread, and a thread with a priority lower than T's is running on
CPU B, CPU B might not be signalled to reevaluate its scheduling decision.
High-priority threads normally perform small amounts of work and then voluntarily suspend themselves.
In that case, migrating thread T is likely unnecessary and would reduce the total system throughput.
The end result is that only the highest-priority thread in the system is guaranteed to run on some CPU,
while the other CPUs may be running threads with lower priorities than those ready to run at any given point in time.
The SCHED_CONFIGURE command provides some parameters that you can use to tune the scheduler.
Note:
In order to use this command,
your process must have the
PROCMGR_AID_SCHEDULE ability enabled; see
procmgr_ability().
The data argument for this command must be a pointer to a
struct sched_config.
This structure includes the following members:
- int32_t low_latency_priority
- The priority threshold above which a thread always displaces a lower-priority thread running on the CPU
on which it becomes ready.
Effectively, this parameter lets you define a priority class as low latency,
avoiding the overhead of an IPI and a new scheduling decision by another CPU.
- int32_t migrate_priority
- The priority threshold above which a preempted thread will be rescheduled on another CPU (the one
running the lowest-priority thread).
Keeping both parameters at their default values (INT_MAX) maintains the scheduling behavior
described above.
For example, suppose there are three threads, T20, T10, and T5, where the priority of each thread matches
the thread's label, and that
T10 is running on CPU A, and T5 is running on CPU B.
If thread T20 becomes ready on CPU A, then the results are as follows:
- If low_latency_priority > 20, T20 migrates to CPU B and displaces T5,
leaving T10 and T20 running, but the thread latency of T20 is increased by the need to send an
IPI to CPU B to perform the rescheduling there.
- If low_latency_priority <= 20, T20 preempts T10 immediately and thus minimizes
its thread latency, but at the expense of migrating T10 to another CPU.
- If T10 is preempted, then:
- If migrate_priority > 10, T10 doesn't migrate and doesn't run,
leaving T5 running while T10 is ready.
- If migrate_priority <= 10, T10 migrates to CPU B and displaces T5.
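The outcomes in this example can be written down as a small decision function. This is a simplified model of the three-thread scenario only, not the kernel's scheduling algorithm. The names cpu_assignment, model_wakeup(), and model_wakeup_defaults() are hypothetical, and the model assumes that the thread becoming ready has the highest priority and that the thread on CPU A outranks the one on CPU B.

```c
#include <limits.h>

/* Priority of the thread running on each CPU after the decision. */
struct cpu_assignment {
    int cpu_a;
    int cpu_b;
};

/*
 * A thread of priority `ready` becomes ready on CPU A, where a thread of
 * priority `on_a` is running; CPU B runs a thread of priority `on_b`.
 * Assumes ready > on_a > on_b, as in the T20/T10/T5 example.
 */
struct cpu_assignment model_wakeup(int ready, int on_a, int on_b,
                                   int low_latency_priority,
                                   int migrate_priority)
{
    struct cpu_assignment result;

    if (low_latency_priority > ready) {
        /* An IPI goes to CPU B; the ready thread runs there
         * and displaces the thread of priority on_b. */
        result.cpu_a = on_a;
        result.cpu_b = ready;
    } else {
        /* The ready thread preempts the thread on CPU A immediately. */
        result.cpu_a = ready;
        /* The preempted thread moves to CPU B only if its priority
         * reaches the migration threshold; otherwise it stays ready
         * without running. */
        result.cpu_b = (migrate_priority > on_a) ? on_b : on_a;
    }
    return result;
}

/* Both thresholds at their default value, INT_MAX. */
struct cpu_assignment model_wakeup_defaults(int ready, int on_a, int on_b)
{
    return model_wakeup(ready, on_a, on_b, INT_MAX, INT_MAX);
}
```

For instance, model_wakeup(20, 10, 5, 20, 10) corresponds to the case in which T20 preempts T10 on CPU A and T10 then displaces T5 on CPU B.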
Errors:
SchedCtl() and SchedCtl_r() indicate the following
errors for the SCHED_CONFIGURE command (see the
Returns section for details):
- EOK
- Success.
- EINVAL
- The size of the parameter block doesn't match the size of the expected structure.
- EPERM
- You don't have the PROCMGR_AID_SCHEDULE ability enabled.
SCHED_CONT_APP and SCHED_STOP_APP (QNX Neutrino 7.0 or later)
The SCHED_STOP_APP and SCHED_CONT_APP commands make all processes with the given
application ID stop or continue.
Stop means that no thread in any of these processes is scheduled,
and continue means that these threads can be scheduled again, subject to their current state.
You use these commands like this:
if ( SchedCtl( SCHED_STOP_APP, (void *)appid, 0) == -1 ) {
/* An error occurred */
}
...
if ( SchedCtl( SCHED_CONT_APP, (void *)appid, 0) == -1 ) {
/* An error occurred */
}
This mechanism is independent of the SIGSTOP and SIGCONT signals;
for example, a SIGCONT doesn't resume a thread that belongs to a process that was stopped by a
SchedCtl(SCHED_STOP_APP, ...) call.
Nevertheless, in order to successfully issue these commands, your process must have the
PROCMGR_AID_SIGNAL ability enabled such that you could send the appropriate signal:
- SIGCONT for SCHED_CONT_APP
- SIGSTOP for SCHED_STOP_APP
For more information about PROCMGR_AID_SIGNAL, see
procmgr_ability().
This mechanism interacts with the _NTO_TCTL_ONE_THREAD_CONT and
_NTO_TCTL_ONE_THREAD_HOLD commands for
ThreadCtl()
as follows:
- SCHED_STOP_APP stops threads in the specified application
that are running.
- SCHED_CONT_APP starts threads in the specified application
that were stopped by SCHED_STOP_APP, but not those that were stopped by
_NTO_TCTL_ONE_THREAD_HOLD.
- _NTO_TCTL_ONE_THREAD_CONT starts threads that were stopped by
_NTO_TCTL_ONE_THREAD_HOLD or SCHED_STOP_APP.
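These interactions can be summarized as a per-thread state machine. The sketch below is a model for reasoning about the rules above, not kernel code: the enum and function names are hypothetical, and the behavior of a hold request on a thread that is already stopped isn't described here, so the model leaves such a thread unchanged as an explicit assumption.

```c
/* Why a thread is, or isn't, prevented from being scheduled. */
enum hold_state {
    THREAD_SCHEDULABLE,
    THREAD_STOPPED_BY_APP,  /* stopped by SCHED_STOP_APP */
    THREAD_HELD             /* stopped by _NTO_TCTL_ONE_THREAD_HOLD */
};

/* The four commands that take part in the interaction. */
enum hold_event {
    EV_STOP_APP,            /* SCHED_STOP_APP */
    EV_CONT_APP,            /* SCHED_CONT_APP */
    EV_ONE_THREAD_HOLD,     /* _NTO_TCTL_ONE_THREAD_HOLD */
    EV_ONE_THREAD_CONT      /* _NTO_TCTL_ONE_THREAD_CONT */
};

/* Returns the state of one thread after the given command is applied. */
enum hold_state apply_event(enum hold_state s, enum hold_event ev)
{
    switch (ev) {
    case EV_STOP_APP:
        /* Stops only threads that are running. */
        return (s == THREAD_SCHEDULABLE) ? THREAD_STOPPED_BY_APP : s;
    case EV_CONT_APP:
        /* Restarts only threads that SCHED_STOP_APP stopped. */
        return (s == THREAD_STOPPED_BY_APP) ? THREAD_SCHEDULABLE : s;
    case EV_ONE_THREAD_HOLD:
        /* Assumption: a hold on an already stopped thread is left
         * unmodeled, so only a schedulable thread changes state. */
        return (s == THREAD_SCHEDULABLE) ? THREAD_HELD : s;
    case EV_ONE_THREAD_CONT:
        /* Restarts a thread stopped by either mechanism. */
        return THREAD_SCHEDULABLE;
    }
    return s;
}
```

The asymmetry is the point of the model: EV_CONT_APP leaves a THREAD_HELD thread held, whereas EV_ONE_THREAD_CONT releases a thread from either stopped state.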
Errors:
SchedCtl() and SchedCtl_r() indicate the following
errors for the SCHED_CONT_APP and SCHED_STOP_APP commands (see the
Returns section for details):
- EOK
- Success; at least one app was made to continue or stop, depending on the command.
- EPERM
- You don't have the PROCMGR_AID_SIGNAL ability enabled for the appropriate signal.
- ESRCH
- No process with the given application ID was found, or the application ID is the one for the process manager.
Blocking states
These calls don't block.