Documentation / arm / kernel_user_helpers.rst


Based on kernel version 6.4.12. Page generated on 2023-08-29 08:47 EST.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268
============================
Kernel-provided User Helpers
============================

These are segment of kernel provided user code reachable from user space
at a fixed address in kernel memory.  This is used to provide user space
with some operations which require kernel help because of unimplemented
native feature and/or instructions in many ARM CPUs. The idea is for this
code to be executed directly in user mode for best efficiency but which is
too intimate with the kernel counter part to be left to user libraries.
In fact this code might even differ from one CPU to another depending on
the available instruction set, or whether it is a SMP systems. In other
words, the kernel reserves the right to change this code as needed without
warning. Only the entry points and their results as documented here are
guaranteed to be stable.

This is different from (but doesn't preclude) a full blown VDSO
implementation, however a VDSO would prevent some assembly tricks with
constants that allows for efficient branching to those code segments. And
since those code segments only use a few cycles before returning to user
code, the overhead of a VDSO indirect far call would add a measurable
overhead to such minimalistic operations.

User space is expected to bypass those helpers and implement those things
inline (either in the code emitted directly by the compiler, or part of
the implementation of a library call) when optimizing for a recent enough
processor that has the necessary native support, but only if resulting
binaries are already to be incompatible with earlier ARM processors due to
usage of similar native instructions for other things.  In other words
don't make binaries unable to run on earlier processors just for the sake
of not using these kernel helpers if your compiled code is not going to
use new instructions for other purpose.

New helpers may be added over time, so an older kernel may be missing some
helpers present in a newer kernel.  For this reason, programs must check
the value of __kuser_helper_version (see below) before assuming that it is
safe to call any particular helper.  This check should ideally be
performed only once at process startup time, and execution aborted early
if the required helpers are not provided by the kernel version that
process is running on.

kuser_helper_version
--------------------

Location:	0xffff0ffc

Reference declaration::

  extern int32_t __kuser_helper_version;

Definition:

  This field contains the number of helpers being implemented by the
  running kernel.  User space may read this to determine the availability
  of a particular helper.

Usage example::

  #define __kuser_helper_version (*(int32_t *)0xffff0ffc)

  void check_kuser_version(void)
  {
	if (__kuser_helper_version < 2) {
		fprintf(stderr, "can't do atomic operations, kernel too old\n");
		abort();
	}
  }

Notes:

  User space may assume that the value of this field never changes
  during the lifetime of any single process.  This means that this
  field can be read once during the initialisation of a library or
  startup phase of a program.

kuser_get_tls
-------------

Location:	0xffff0fe0

Reference prototype::

  void * __kuser_get_tls(void);

Input:

  lr = return address

Output:

  r0 = TLS value

Clobbered registers:

  none

Definition:

  Get the TLS value as previously set via the __ARM_NR_set_tls syscall.

Usage example::

  typedef void * (__kuser_get_tls_t)(void);
  #define __kuser_get_tls (*(__kuser_get_tls_t *)0xffff0fe0)

  void foo()
  {
	void *tls = __kuser_get_tls();
	printf("TLS = %p\n", tls);
  }

Notes:

  - Valid only if __kuser_helper_version >= 1 (from kernel version 2.6.12).

kuser_cmpxchg
-------------

Location:	0xffff0fc0

Reference prototype::

  int __kuser_cmpxchg(int32_t oldval, int32_t newval, volatile int32_t *ptr);

Input:

  r0 = oldval
  r1 = newval
  r2 = ptr
  lr = return address

Output:

  r0 = success code (zero or non-zero)
  C flag = set if r0 == 0, clear if r0 != 0

Clobbered registers:

  r3, ip, flags

Definition:

  Atomically store newval in `*ptr` only if `*ptr` is equal to oldval.
  Return zero if `*ptr` was changed or non-zero if no exchange happened.
  The C flag is also set if `*ptr` was changed to allow for assembly
  optimization in the calling code.

Usage example::

  typedef int (__kuser_cmpxchg_t)(int oldval, int newval, volatile int *ptr);
  #define __kuser_cmpxchg (*(__kuser_cmpxchg_t *)0xffff0fc0)

  int atomic_add(volatile int *ptr, int val)
  {
	int old, new;

	do {
		old = *ptr;
		new = old + val;
	} while(__kuser_cmpxchg(old, new, ptr));

	return new;
  }

Notes:

  - This routine already includes memory barriers as needed.

  - Valid only if __kuser_helper_version >= 2 (from kernel version 2.6.12).

kuser_memory_barrier
--------------------

Location:	0xffff0fa0

Reference prototype::

  void __kuser_memory_barrier(void);

Input:

  lr = return address

Output:

  none

Clobbered registers:

  none

Definition:

  Apply any needed memory barrier to preserve consistency with data modified
  manually and __kuser_cmpxchg usage.

Usage example::

  typedef void (__kuser_dmb_t)(void);
  #define __kuser_dmb (*(__kuser_dmb_t *)0xffff0fa0)

Notes:

  - Valid only if __kuser_helper_version >= 3 (from kernel version 2.6.15).

kuser_cmpxchg64
---------------

Location:	0xffff0f60

Reference prototype::

  int __kuser_cmpxchg64(const int64_t *oldval,
                        const int64_t *newval,
                        volatile int64_t *ptr);

Input:

  r0 = pointer to oldval
  r1 = pointer to newval
  r2 = pointer to target value
  lr = return address

Output:

  r0 = success code (zero or non-zero)
  C flag = set if r0 == 0, clear if r0 != 0

Clobbered registers:

  r3, lr, flags

Definition:

  Atomically store the 64-bit value pointed by `*newval` in `*ptr` only if `*ptr`
  is equal to the 64-bit value pointed by `*oldval`.  Return zero if `*ptr` was
  changed or non-zero if no exchange happened.

  The C flag is also set if `*ptr` was changed to allow for assembly
  optimization in the calling code.

Usage example::

  typedef int (__kuser_cmpxchg64_t)(const int64_t *oldval,
                                    const int64_t *newval,
                                    volatile int64_t *ptr);
  #define __kuser_cmpxchg64 (*(__kuser_cmpxchg64_t *)0xffff0f60)

  int64_t atomic_add64(volatile int64_t *ptr, int64_t val)
  {
	int64_t old, new;

	do {
		old = *ptr;
		new = old + val;
	} while(__kuser_cmpxchg64(&old, &new, ptr));

	return new;
  }

Notes:

  - This routine already includes memory barriers as needed.

  - Due to the length of this sequence, this spans 2 conventional kuser
    "slots", therefore 0xffff0f80 is not used as a valid entry point.

  - Valid only if __kuser_helper_version >= 5 (from kernel version 3.1).