Introduction


As hardware has become ever more capable, there has been a trend toward implementing real-time applications on Linux. However, Linux was designed from the beginning for server and desktop applications, not for real-time applications, so achieving real-time properties on Linux is not trivial. This document is a guide for anyone attempting to implement a real-time application using Linux.

How To Read This Document

This document is divided into the following sections:

Chapter 2, Basic Linux from a Real-Time Perspective: This section is intended as a textbook-style introduction to the areas of Linux that are of interest to designers of real-time systems.

Chapter 3, Improving the Real-Time Properties: This section lists a number of things that can be done to improve the real-time behavior of Enea Linux. Some are general in nature and easy to apply, while others are suitable in fewer situations and/or require greater effort.

Chapter 4, Designing Real-Time Applications: This section gives tips and hints about how to design your application for real-time requirements.

Chapter 5, Hardware Aspects: This section gives hints on how to handle hardware aspects that affect the real-time properties.

Readers who already have a good understanding of how Linux works may want to go directly to Chapter 3, Improving the Real-Time Properties.

Terminology

List of terms and definitions
blocked When a task is waiting for an event, the task is said to be blocked.
boot parameters Kernel boot parameters, i.e. parameters given to the kernel when booting. See kernel source file https://www.kernel.org/doc/Documentation/kernel-parameters.txt.
bottom half Work that is initiated by an interrupt but can be done without interrupts being disabled. This is also referred to as deferred interrupt work, and it is typically scheduled using a soft IRQ, a tasklet, or a work queue. See also top half.
core A hardware resource capable of executing a thread of code in parallel with other cores.
CPU Either a core or a hyper-thread. In Linux this represents a hardware resource capable of executing a thread of code in parallel with other CPUs.
CPU affinity Tasks, interrupts, etc. can have a CPU affinity, which means that they will only run on the CPUs given in an affinity mask or affinity CPU list.
CPU hotplug Kernel feature for adding and removing CPUs at runtime.
CPU isolation Make a CPU as independent as possible of work done on other CPUs.
CPU partitioning CPU partitioning is about grouping CPUs together. In the scope of this document, the intention is to group a set of CPUs in order to perform CPU isolation on each CPU within this group.
cpuset Kernel feature used to perform CPU partitioning.
critical section A code path which needs to be protected from concurrent execution. An example is a global state that takes several operations to update, where all those operations must appear as one atomic update. Such sections are usually protected using a lock, e.g. a mutex.
dynamic ticks Used to denote idle dynamic ticks as well as full dynamic ticks.
full dynamic ticks Kernel feature to inhibit tick interrupts when running a single task on a CPU.
GP partition A CPU partition intended for general purpose applications, i.e. applications that do not have real-time requirements.
hard real-time When a response to an event must be done within a well defined time limit, or else the application will fail miserably, this is referred to as hard real-time.
hyper-threading A technique to allow two executions to occur simultaneously on the same core as long as they don't need the same core resources. This means that these executions heavily affect each other, even though Linux presents them as being two different CPUs.
idle dynamic ticks Kernel feature to inhibit tick interrupts when a CPU is idle.
interrupt Hardware functionality for indicating asynchronous events. Interrupts will preempt the current execution and start executing the top half of an interrupt handler.
IRQ Interrupt request - Each type of interrupt is assigned its own vector, which is called IRQ vector or simply IRQ.
I/O isolation Technique for making code execution occur in parallel with I/O latencies to make the execution less dependent on hardware latencies.
jitter Since this document is about real-time, jitter refers to variance in latency. This includes jitter caused by work done by other threads of execution, jitter induced by the application itself, and jitter caused by work done by the kernel.
kernel configuration This refers to the kernel configuration being done before compiling the kernel, e.g. using make menuconfig from the root directory of the kernel source.
kernel space Code executed in the kernel, either compiled into the kernel or installed as a kernel module. Kernel space manages all hardware resources and delegates them to processes. Also see user space.
kthread Task executing in kernel space. All code executing in the kernel shares the same memory.
latency Time from when an event occurs until a response has been produced. For the scope of this document, the most important part of latency is the part caused by anything other than the actual work needed to respond to the event.
LWP Light-weight process, kernel side of a pthread. LWP is used to make it possible to put pthreads on different CPUs while still sharing memory with a single parent process.
normal tasks Tasks with SCHED_OTHER, SCHED_IDLE or SCHED_BATCH scheduling policy. SCHED_OTHER is by far the most common one.
NUMA Non-uniform memory access is a design of a multi-core system where the access time for a specific memory range can depend on which CPU is accessing it, excluding any effects that caches might have.
partrt A tool for performing CPU partitioning. Available from https://github.com/OpenEneaLinux/rt-tools.
preemption When an execution is involuntarily interrupted to begin another thread of execution, it is said that the execution is preempted.
PREEMPT_RT Set of patches to achieve a fully preemptible kernel. See https://rt.wiki.kernel.org/index.php/Main_Page for more information.
priority inheritance When a task is waiting for a lock owned by a less prioritised task and the lock has priority inheritance, then the task owning the lock will be raised to the same priority as the waiting task for as long as the lock is being held. This technique avoids priority inversion problems, where a more prioritised task is forced to wait for a less prioritised task.
process Task that owns resources, such as memory areas, file descriptors, and locks. These resources can only be shared with other processes by using specific system calls, such as shmget() or mmap().
pthreads POSIX implementation of threads. In Linux, every pthread is associated with a kernel side task called an LWP. Therefore each pthread has its own thread ID, which can be retrieved using the gettid() system call.
RCU Read-copy-update. A synchronization mechanism that makes reads very cheap at the expense of more work when writing. See http://lwn.net/Articles/262464 for an article on how it works.
real-time application An application with real-time requirements. It is enough that one task within the application has those requirements for the application to be called a real-time application.
real-time properties Properties of a system with predictable latency.
real-time tasks A task with scheduling policy SCHED_FIFO or SCHED_RR.
scheduling How to distribute resources as a function of time, e.g. I/O scheduling is how to distribute I/O accesses as a function of time and task scheduling is how to distribute CPU execution as a function of time.
soft IRQ A technique for implementing the bottom half handling of an interrupt. Also see tasklet.
RT partition Real-time partition, a CPU partition intended for real-time applications.
SMI System management interrupt. This is an x86-specific feature used for CPU overheating protection and fixing microcode bugs. The interrupt is handled by the BIOS, which is outside the control of Linux.
SMM System management mode. When a CPU receives an SMI, it enters system management mode. Code running in this mode is found in the BIOS.
SMP Symmetric multiprocessing. This describes a multi-core system where all cores are handled by one operating system and treated as one resource.
soft real-time When there are requirements on maximum latency, but where failing those requirements causes a graceful degradation, this is referred to as soft real-time.
system call Function calls that are implemented in the kernel are called system calls. When a system call is issued, the currently running task switches from user space to kernel space, runs the implementation in the kernel, and then returns to user space.
task The scheduling entity for a task scheduler. Can be a process or a thread.
task switch When a task stops running and another task gets to run on the same CPU. This can happen when a task is preempted or when it yields.
tasklet A bottom half implementation that is more generic than soft IRQ. It is implemented as a soft IRQ vector.
threaded interrupts When registering an interrupt handler it is possible to register an associated interrupt thread. In this case the interrupt handler is expected to be very short, and the main part of the interrupt handling should be done in this interrupt thread.
throughput Throughput is measured by how much useful work can be achieved for a longer period of time, and is a way of measuring efficiency. It is often in conflict with latency.
ticks A per-CPU periodic interrupt used to ensure time slices are being distributed fairly among the currently running tasks on the CPU.
top half Interrupt handling code that needs to be run with interrupts disabled. Also see threaded interrupts and bottom half.
user space Code executed outside of kernel space. User space is used to protect resources owned by a process from being accessed by other processes. This is the execution context for a process and its threads when not executing a system call.
work queue Kernel feature for handling small work packages in a kthread scope. Can be used for bottom half work, or for work induced by a system call that can be deferred to a later time.
yielding When a task voluntarily gives up execution this is referred to as yielding. This could either be because the task has requested to sleep for an amount of time, or because it is blocking on a lock owned by another task.
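
To make the CPU affinity entry above more concrete, the following is a minimal sketch of how a thread might restrict itself to a single CPU using the Linux sched_setaffinity() and sched_getaffinity() calls. The helper names pin_self_to_cpu() and first_allowed_cpu() are hypothetical and chosen only for this illustration; the sketch assumes a glibc-based system.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for cpu_set_t and the CPU_* macros */
#endif
#include <sched.h>
#include <errno.h>

/* Hypothetical helper: pin the calling thread to a single CPU.
 * Returns 0 on success, or a negative errno value on failure. */
int pin_self_to_cpu(int cpu)
{
    cpu_set_t mask;

    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return -EINVAL;

    CPU_ZERO(&mask);
    CPU_SET(cpu, &mask);

    /* A pid of 0 means "the calling thread". */
    if (sched_setaffinity(0, sizeof(mask), &mask) != 0)
        return -errno;
    return 0;
}

/* Hypothetical helper: return the lowest-numbered CPU in the calling
 * thread's current affinity mask, or -1 if the mask cannot be read. */
int first_allowed_cpu(void)
{
    cpu_set_t mask;

    if (sched_getaffinity(0, sizeof(mask), &mask) != 0)
        return -1;

    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &mask))
            return cpu;
    return -1;
}
```

A caller could, for example, use pin_self_to_cpu(first_allowed_cpu()) to narrow its affinity to one CPU that it is already allowed to run on. Affinity alone does not isolate a CPU; other tasks may still be scheduled there.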
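
The real-time tasks entry above defines a real-time task by its scheduling policy. As a rough sketch, a thread could request SCHED_FIFO with sched_setscheduler() as shown below. The helper names make_self_fifo() and is_realtime_task() are hypothetical. The request normally fails with EPERM unless the caller has the CAP_SYS_NICE capability or a suitable RLIMIT_RTPRIO limit.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for SCHED_RESET_ON_FORK */
#endif
#include <sched.h>
#include <errno.h>

/* Hypothetical helper: switch the calling thread to SCHED_FIFO at the
 * given priority. Returns 0 on success or a negative errno value. */
int make_self_fifo(int priority)
{
    struct sched_param param = { .sched_priority = priority };

    /* Reject priorities outside the range the kernel reports. */
    if (priority < sched_get_priority_min(SCHED_FIFO) ||
        priority > sched_get_priority_max(SCHED_FIFO))
        return -EINVAL;

    /* A pid of 0 means "the calling thread". */
    if (sched_setscheduler(0, SCHED_FIFO, &param) != 0)
        return -errno;
    return 0;
}

/* Hypothetical helper: return 1 if the calling thread has policy
 * SCHED_FIFO or SCHED_RR, otherwise 0. */
int is_realtime_task(void)
{
    int policy = sched_getscheduler(0);

    if (policy < 0)
        return 0;

    /* The kernel may OR this flag into the returned policy value. */
    policy &= ~SCHED_RESET_ON_FORK;
    return policy == SCHED_FIFO || policy == SCHED_RR;
}
```

Checking the return value matters here: a task that silently stays in SCHED_OTHER is still a normal task and gets none of the intended real-time behavior.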
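
The priority inheritance entry above applies to locks that have priority inheritance enabled. With POSIX threads this is requested per mutex through a mutex attribute. The sketch below assumes a platform that supports PTHREAD_PRIO_INHERIT; the helper name pi_mutex_init() is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <pthread.h>

/* Hypothetical helper: initialise *mutex with the priority inheritance
 * protocol. Returns 0 on success or a positive error number, following
 * the convention of the pthread_* functions. */
int pi_mutex_init(pthread_mutex_t *mutex)
{
    pthread_mutexattr_t attr;
    int rc;

    rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;

    /* Ask for priority inheritance instead of the default protocol. */
    rc = pthread_mutexattr_setprotocol(&attr, PTHREAD_PRIO_INHERIT);
    if (rc == 0)
        rc = pthread_mutex_init(mutex, &attr);

    /* The attribute object is no longer needed once the mutex exists. */
    pthread_mutexattr_destroy(&attr);
    return rc;
}
```

A mutex created this way is locked and unlocked with the usual pthread_mutex_lock() and pthread_mutex_unlock() calls. The priority boost described above happens only while a higher-priority task is actually waiting for the lock.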
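
The latency and jitter entries above can be illustrated with a small measurement loop. The sketch below sleeps until a series of absolute deadlines using clock_nanosleep() and records how late each wake-up is. The struct wakeup_stats type and the function measure_wakeup_latency() are hypothetical names for this illustration, and the loop is a simplification, not a replacement for a dedicated measurement tool.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <time.h>
#include <errno.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000L

/* Hypothetical result type: smallest and largest observed wake-up latency. */
struct wakeup_stats {
    int64_t min_ns;
    int64_t max_ns;
};

/* Difference a - b in nanoseconds. */
static int64_t timespec_diff_ns(const struct timespec *a,
                                const struct timespec *b)
{
    return (int64_t)(a->tv_sec - b->tv_sec) * NSEC_PER_SEC
           + (a->tv_nsec - b->tv_nsec);
}

/* Hypothetical helper: sleep until `iterations` periodic absolute deadlines
 * and record how late each wake-up was. period_ns must be between 1 and
 * one second. Returns 0 on success or a negative errno value. */
int measure_wakeup_latency(int iterations, long period_ns,
                           struct wakeup_stats *out)
{
    struct timespec next, now;

    if (iterations <= 0 || period_ns <= 0 || period_ns > NSEC_PER_SEC ||
        out == NULL)
        return -EINVAL;
    if (clock_gettime(CLOCK_MONOTONIC, &next) != 0)
        return -errno;

    out->min_ns = INT64_MAX;
    out->max_ns = INT64_MIN;

    for (int i = 0; i < iterations; i++) {
        int rc;
        int64_t latency;

        /* Advance the absolute deadline by one period. */
        next.tv_nsec += period_ns;
        if (next.tv_nsec >= NSEC_PER_SEC) {
            next.tv_nsec -= NSEC_PER_SEC;
            next.tv_sec++;
        }

        /* Sleep until the deadline, retrying if a signal interrupts us. */
        do {
            rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        } while (rc == EINTR);
        if (rc != 0)
            return -rc;

        if (clock_gettime(CLOCK_MONOTONIC, &now) != 0)
            return -errno;

        /* Latency: time from the deadline (the event) until we ran. */
        latency = timespec_diff_ns(&now, &next);
        if (latency < out->min_ns)
            out->min_ns = latency;
        if (latency > out->max_ns)
            out->max_ns = latency;
    }
    return 0;
}
```

After a successful call, max_ns - min_ns gives a simple estimate of the jitter seen during the run. Absolute deadlines are used so that a late wake-up in one iteration does not shift the deadlines that follow.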