Based on kernel version 4.16.1. Page generated on 2018-04-09 11:53 EST.
Generic Thermal Sysfs driver How To
===================================

Written by Sujith Thomas <sujith.thomas@intel.com>, Zhang Rui <rui.zhang@intel.com>

Updated: 2 January 2008

Copyright (c) 2008 Intel Corporation


0. Introduction

The generic thermal sysfs provides a set of interfaces for thermal zone
devices (sensors) and thermal cooling devices (fan, processor...) to register
with the thermal management solution and to be a part of it.

This how-to focuses on enabling new thermal zone and cooling devices to
participate in thermal management.
This solution is platform independent and any type of thermal zone devices
and cooling devices should be able to make use of the infrastructure.

The main task of the thermal sysfs driver is to expose thermal zone attributes
as well as cooling device attributes to the user space.
An intelligent thermal management application can make decisions based on
inputs from thermal zone attributes (the current temperature and trip point
temperature) and throttle appropriate devices.

[0-*]   denotes any non-negative integer, starting from 0
[1-*]   denotes any positive integer, starting from 1

1. thermal sysfs driver interface functions

1.1 thermal zone device interface
1.1.1 struct thermal_zone_device *thermal_zone_device_register(char *type,
                int trips, int mask, void *devdata,
                struct thermal_zone_device_ops *ops,
                const struct thermal_zone_params *tzp,
                int passive_delay, int polling_delay)

    This interface function adds a new thermal zone device (sensor) to
    /sys/class/thermal folder as thermal_zone[0-*]. It tries to bind all the
    thermal cooling devices registered at the same time.

    type: the thermal zone type.
    trips: the total number of trip points this thermal zone supports.
    mask: Bit string: If 'n'th bit is set, then trip point 'n' is writeable.
    devdata: device private data
    ops: thermal zone device call-backs.
        .bind: bind the thermal zone device with a thermal cooling device.
        .unbind: unbind the thermal zone device from a thermal cooling device.
        .get_temp: get the current temperature of the thermal zone.
        .set_trips: set the trip points window. Whenever the current temperature
                    is updated, the trip points immediately below and above the
                    current temperature are found.
        .get_mode: get the current mode (enabled/disabled) of the thermal zone.
            - "enabled" means the kernel thermal management is enabled.
            - "disabled" will prevent kernel thermal driver action upon trip points
              so that user applications can take charge of thermal management.
        .set_mode: set the mode (enabled/disabled) of the thermal zone.
        .get_trip_type: get the type of a given trip point.
        .get_trip_temp: get the temperature above which a given trip point
                        will be fired.
        .set_emul_temp: set the emulation temperature which helps in debugging
                        different threshold temperature points.
    tzp: thermal zone platform parameters.
    passive_delay: number of milliseconds to wait between polls when
        performing passive cooling.
    polling_delay: number of milliseconds to wait between polls when checking
        whether trip points have been crossed (0 for interrupt driven systems).


1.1.2 void thermal_zone_device_unregister(struct thermal_zone_device *tz)

    This interface function removes the thermal zone device.
    It deletes the corresponding entry from /sys/class/thermal folder and
    unbinds all the thermal cooling devices it uses.

1.1.3 struct thermal_zone_device *thermal_zone_of_sensor_register(
                struct device *dev, int sensor_id, void *data,
                const struct thermal_zone_of_device_ops *ops)

    This interface adds a new sensor to a DT thermal zone.
    This function will search the list of thermal zones described in
    device tree and look for the zone that refers to the sensor device
    pointed to by dev->of_node as its temperature provider.
    For the zone pointing to the sensor node, the sensor will be added to
    the DT thermal zone device.

    The parameters for this interface are:
    dev:        Device node of sensor containing valid node pointer in
                dev->of_node.
    sensor_id:  a sensor identifier, in case the sensor IP has more
                than one sensor
    data:       a private pointer (owned by the caller) that will be
                passed back, when a temperature reading is needed.
    ops:        struct thermal_zone_of_device_ops *.

                get_temp:       a pointer to a function that reads the
                                sensor temperature. This is a mandatory
                                callback provided by the sensor driver.
                set_trips:      a pointer to a function that sets a
                                temperature window. When this window is
                                left the driver must inform the thermal
                                core via thermal_zone_device_update.
                get_trend:      a pointer to a function that reads the
                                sensor temperature trend.
                set_emul_temp:  a pointer to a function that sets
                                sensor emulated temperature.
    The thermal zone temperature is provided by the get_temp() function
    pointer of thermal_zone_of_device_ops. When called, it will
    have the private pointer @data back.

    It returns an error pointer on failure, otherwise a valid thermal zone
    device handle. The caller should check the returned handle with IS_ERR()
    to determine whether registration succeeded.

1.1.4 void thermal_zone_of_sensor_unregister(struct device *dev,
                struct thermal_zone_device *tzd)

    This interface unregisters a sensor from a DT thermal zone which was
    successfully added by interface thermal_zone_of_sensor_register().
    This function removes the sensor callbacks and private data from the
    thermal zone device registered with thermal_zone_of_sensor_register()
    interface. It will also silence the zone by removing the .get_temp() and
    .get_trend() thermal zone device callbacks.
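
    The way the private pointer passed at registration comes back to the
    get_temp() callback (see 1.1.3) can be illustrated outside the kernel.
    The following is only a userspace sketch, not kernel code: the names
    mock_sensor_ops, mock_sensor, mock_get_temp() and mock_core_read() are
    hypothetical stand-ins invented for illustration, and only the callback
    shape (a private pointer in, a temperature out, a negative errno on
    failure) is meant to mirror the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for struct thermal_zone_of_device_ops;
 * only the mandatory get_temp callback is modelled. */
struct mock_sensor_ops {
	int (*get_temp)(void *data, int *temp);
};

/* Hypothetical driver-private state, owned by the caller. */
struct mock_sensor {
	int millicelsius;	/* last reading, in millidegree Celsius */
	int powered;		/* non-zero when the sensor can be read */
};

/* Driver callback: receives back the private pointer it registered. */
static int mock_get_temp(void *data, int *temp)
{
	struct mock_sensor *s = data;

	assert(temp != NULL);
	if (!s->powered)
		return -EIO;	/* negative errno on failure */
	*temp = s->millicelsius;
	return 0;
}

/* Stand-in for the core: invokes the callback with the stored pointer. */
static int mock_core_read(const struct mock_sensor_ops *ops, void *data,
			  int *temp)
{
	if (!ops || !ops->get_temp)
		return -EINVAL;	/* get_temp is mandatory */
	return ops->get_temp(data, temp);
}
```

    In a real driver the ops structure and the private pointer would be
    handed to thermal_zone_of_sensor_register(), and the returned handle
    checked with IS_ERR() as described in 1.1.3.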

1.1.5 struct thermal_zone_device *devm_thermal_zone_of_sensor_register(
                struct device *dev, int sensor_id,
                void *data, const struct thermal_zone_of_device_ops *ops)

    This interface is the resource managed version of
    thermal_zone_of_sensor_register().
    All details of thermal_zone_of_sensor_register() described in
    section 1.1.3 are applicable here.
    The benefit of using this interface to register a sensor is that it
    is not required to explicitly call thermal_zone_of_sensor_unregister()
    in the error path or during driver unbinding, as this is done by the
    driver resource manager.

1.1.6 void devm_thermal_zone_of_sensor_unregister(struct device *dev,
                struct thermal_zone_device *tzd)

    This interface is the resource managed version of
    thermal_zone_of_sensor_unregister().
    All details of thermal_zone_of_sensor_unregister() described in
    section 1.1.4 are applicable here.
    Normally this function will not need to be called and the resource
    management code will ensure that the resource is freed.

1.1.7 int thermal_zone_get_slope(struct thermal_zone_device *tz)

    This interface is used to read the slope attribute value
    for the thermal zone device, which might be useful for platform
    drivers for temperature calculations.

1.1.8 int thermal_zone_get_offset(struct thermal_zone_device *tz)

    This interface is used to read the offset attribute value
    for the thermal zone device, which might be useful for platform
    drivers for temperature calculations.

1.2 thermal cooling device interface
1.2.1 struct thermal_cooling_device *thermal_cooling_device_register(char *name,
                void *devdata, struct thermal_cooling_device_ops *)

    This interface function adds a new thermal cooling device (fan/processor/...)
    to /sys/class/thermal/ folder as cooling_device[0-*]. It tries to bind itself
    to all the thermal zone devices registered at the same time.
    name: the cooling device name.
    devdata: device private data.
    ops: thermal cooling devices call-backs.
        .get_max_state: get the Maximum throttle state of the cooling device.
        .get_cur_state: get the Currently requested throttle state of the cooling device.
        .set_cur_state: set the Current throttle state of the cooling device.

1.2.2 void thermal_cooling_device_unregister(struct thermal_cooling_device *cdev)

    This interface function removes the thermal cooling device.
    It deletes the corresponding entry from /sys/class/thermal folder and
    unbinds itself from all the thermal zone devices using it.

1.3 interface for binding a thermal zone device with a thermal cooling device
1.3.1 int thermal_zone_bind_cooling_device(struct thermal_zone_device *tz,
        int trip, struct thermal_cooling_device *cdev,
        unsigned long upper, unsigned long lower, unsigned int weight);

    This interface function binds a thermal cooling device to a particular trip
    point of a thermal zone device.
    This function is usually called in the thermal zone device .bind callback.
    tz: the thermal zone device
    cdev: thermal cooling device
    trip: indicates which trip point in this thermal zone the cooling device
          is associated with.
    upper: the Maximum cooling state for this trip point.
           THERMAL_NO_LIMIT means no upper limit,
           and the cooling device can be in max_state.
    lower: the Minimum cooling state that can be used for this trip point.
           THERMAL_NO_LIMIT means no lower limit,
           and the cooling device can be in cooling state 0.
    weight: the influence of this cooling device in this thermal
            zone. See 1.4.1 below for more information.

1.3.2 int thermal_zone_unbind_cooling_device(struct thermal_zone_device *tz,
                int trip, struct thermal_cooling_device *cdev);

    This interface function unbinds a thermal cooling device from a particular
    trip point of a thermal zone device.
    This function is usually called in the thermal zone device .unbind
    callback.
    tz: the thermal zone device
    cdev: thermal cooling device
    trip: indicates which trip point in this thermal zone the cooling device
          is associated with.

1.4 Thermal Zone Parameters
1.4.1 struct thermal_bind_params
    This structure defines the following parameters that are used to bind
    a zone with a cooling device for a particular trip point.
    .cdev: The cooling device pointer
    .weight: The 'influence' of a particular cooling device on this
             zone. This is relative to the rest of the cooling
             devices. For example, if all cooling devices have a
             weight of 1, then they all contribute the same. You can
             use percentages if you want, but it's not mandatory. A
             weight of 0 means that this cooling device doesn't
             contribute to the cooling of this zone unless all cooling
             devices have a weight of 0. If all weights are 0, then
             they all contribute the same.
    .trip_mask: This is a bit mask that gives the binding relation between
                this thermal zone and cdev, for a particular trip point.
                If nth bit is set, then the cdev and thermal zone are bound
                for trip point n.
    .binding_limits: This is an array of cooling state limits. Must have
                     exactly 2 * thermal_zone.number_of_trip_points entries.
                     It is an array consisting of tuples
                     <lower-state upper-state> of state limits. Each trip
                     will be associated with one state limit tuple when
                     binding. A NULL pointer means
                     <THERMAL_NO_LIMIT THERMAL_NO_LIMIT> on all trips.
                     These limits are used when binding a cdev to a trip point.
    .match: This call back returns success(0) if the 'tz and cdev' need to
            be bound, as per platform data.
1.4.2 struct thermal_zone_params
    This structure defines the platform level parameters for a thermal zone.
    This data, for each thermal zone should come from the platform layer.
    This is an optional feature where some platforms can choose not to
    provide this data.
    .governor_name: Name of the thermal governor used for this zone
    .no_hwmon: a boolean to indicate if the thermal to hwmon sysfs interface
               is required. when no_hwmon == false, a hwmon sysfs interface
               will be created. when no_hwmon == true, nothing will be done.
               In case the thermal_zone_params is NULL, the hwmon interface
               will be created (for backward compatibility).
    .num_tbps: Number of thermal_bind_params entries for this zone
    .tbp: thermal_bind_params entries

2. sysfs attributes structure

RO      read only value
RW      read/write value

Thermal sysfs attributes will be represented under /sys/class/thermal.
Hwmon sysfs I/F extension is also available under /sys/class/hwmon
if hwmon is compiled in or built as a module.

Thermal zone device sys I/F, created once it's registered:
/sys/class/thermal/thermal_zone[0-*]:
    |---type:                   Type of the thermal zone
    |---temp:                   Current temperature
    |---mode:                   Working mode of the thermal zone
    |---policy:                 Thermal governor used for this zone
    |---available_policies:     Available thermal governors for this zone
    |---trip_point_[0-*]_temp:  Trip point temperature
    |---trip_point_[0-*]_type:  Trip point type
    |---trip_point_[0-*]_hyst:  Hysteresis value for this trip point
    |---emul_temp:              Emulated temperature set node
    |---sustainable_power:      Sustainable dissipatable power
    |---k_po:                   Proportional term during temperature overshoot
    |---k_pu:                   Proportional term during temperature undershoot
    |---k_i:                    PID's integral term in the power allocator gov
    |---k_d:                    PID's derivative term in the power allocator
    |---integral_cutoff:        Offset above which errors are accumulated
    |---slope:                  Slope constant applied as linear extrapolation
    |---offset:                 Offset constant applied as linear extrapolation

Thermal cooling device sys I/F, created once it's registered:
/sys/class/thermal/cooling_device[0-*]:
    |---type:                   Type of the cooling device(processor/fan/...)
    |---max_state:              Maximum cooling state of the cooling device
    |---cur_state:              Current cooling state of the cooling device


The next two dynamic attributes are created/removed in pairs. They represent
the relationship between a thermal zone and its associated cooling device.
They are created/removed for each successful execution of
thermal_zone_bind_cooling_device/thermal_zone_unbind_cooling_device.

/sys/class/thermal/thermal_zone[0-*]:
    |---cdev[0-*]:              [0-*]th cooling device in current thermal zone
    |---cdev[0-*]_trip_point:   Trip point that cdev[0-*] is associated with
    |---cdev[0-*]_weight:       Influence of the cooling device in
                                this thermal zone

Besides the thermal zone device sysfs I/F and cooling device sysfs I/F,
the generic thermal driver also creates a hwmon sysfs I/F for each _type_
of thermal zone device. E.g. the generic thermal driver registers one hwmon
class device and builds the associated hwmon sysfs I/F for all the registered
ACPI thermal zones.

/sys/class/hwmon/hwmon[0-*]:
    |---name:                   The type of the thermal zone devices
    |---temp[1-*]_input:        The current temperature of thermal zone [1-*]
    |---temp[1-*]_crit:         The critical trip point of thermal zone [1-*]

Please read Documentation/hwmon/sysfs-interface for additional information.

***************************
* Thermal zone attributes *
***************************

type
        Strings which represent the thermal zone type.
        This is given by thermal zone driver as part of registration.
        E.g: "acpitz" indicates it's an ACPI thermal device.
        In order to keep it consistent with hwmon sys attribute, this should
        be a short, lowercase string, not containing spaces nor dashes.
        RO, Required

temp
        Current temperature as reported by thermal zone (sensor).
        Unit: millidegree Celsius
        RO, Required

mode
        One of the predefined values in [enabled, disabled].
        This file gives information about the algorithm that is currently
        managing the thermal zone. It can be either the default kernel based
        algorithm or a user space application.
        enabled         = enable Kernel Thermal management.
        disabled        = Preventing kernel thermal zone driver actions upon
                          trip points so that user application can take full
                          charge of the thermal management.
        RW, Optional

policy
        One of the various thermal governors used for a particular zone.
        RW, Required

available_policies
        Available thermal governors which can be used for a particular zone.
        RO, Required

trip_point_[0-*]_temp
        The temperature above which trip point will be fired.
        Unit: millidegree Celsius
        RO, Optional

trip_point_[0-*]_type
        Strings which indicate the type of the trip point.
        E.g. it can be one of critical, hot, passive, active[0-*] for ACPI
        thermal zone.
        RO, Optional

trip_point_[0-*]_hyst
        The hysteresis value for a trip point, represented as an integer
        Unit: Celsius
        RW, Optional

cdev[0-*]
        Sysfs link to the thermal cooling device node that provides the
        sys I/F for cooling device throttling control.
        RO, Optional

cdev[0-*]_trip_point
        The trip point in this thermal zone which cdev[0-*] is associated
        with; -1 means the cooling device is not associated with any trip
        point.
        RO, Optional

cdev[0-*]_weight
        The influence of cdev[0-*] in this thermal zone. This value
        is relative to the rest of cooling devices in the thermal
        zone. For example, if a cooling device has a weight double
        that of another, it's twice as effective in cooling the
        thermal zone.
        RW, Optional

passive
        Attribute is only present for zones in which the passive cooling
        policy is not supported by native thermal driver. Default is zero
        and can be set to a temperature (in millidegrees) to enable a
        passive trip point for the zone. Activation is done by polling with
        an interval of 1 second.
        Unit: millidegrees Celsius
        Valid values: 0 (disabled) or greater than 1000
        RW, Optional

emul_temp
        Interface to set the emulated temperature method in thermal zone
        (sensor). After setting this temperature, the thermal zone may pass
        this temperature to platform emulation function if registered or
        cache it locally. This is useful in debugging different temperature
        thresholds and their associated cooling actions. This is a write-only
        node and writing 0 on this node should disable emulation.
        Unit: millidegree Celsius
        WO, Optional

          WARNING: Be careful while enabling this option on production systems,
          because userland can easily disable the thermal policy by simply
          flooding this sysfs node with low temperature values.

sustainable_power
        An estimate of the sustained power that can be dissipated by
        the thermal zone. Used by the power allocator governor. For
        more information see Documentation/thermal/power_allocator.txt
        Unit: milliwatts
        RW, Optional

k_po
        The proportional term of the power allocator governor's PID
        controller during temperature overshoot. Temperature overshoot
        is when the current temperature is above the "desired
        temperature" trip point. For more information see
        Documentation/thermal/power_allocator.txt
        RW, Optional

k_pu
        The proportional term of the power allocator governor's PID
        controller during temperature undershoot. Temperature undershoot
        is when the current temperature is below the "desired
        temperature" trip point.
        For more information see
        Documentation/thermal/power_allocator.txt
        RW, Optional

k_i
        The integral term of the power allocator governor's PID
        controller. This term allows the PID controller to compensate
        for long term drift. For more information see
        Documentation/thermal/power_allocator.txt
        RW, Optional

k_d
        The derivative term of the power allocator governor's PID
        controller. For more information see
        Documentation/thermal/power_allocator.txt
        RW, Optional

integral_cutoff
        Temperature offset from the desired temperature trip point
        above which the integral term of the power allocator
        governor's PID controller starts accumulating errors. For
        example, if integral_cutoff is 0, then the integral term only
        accumulates error when temperature is above the desired
        temperature trip point. For more information see
        Documentation/thermal/power_allocator.txt
        Unit: millidegree Celsius
        RW, Optional

slope
        The slope constant used in a linear extrapolation model
        to determine a hotspot temperature based off the sensor's
        raw readings. It is up to the device driver to determine
        the usage of these values.
        RW, Optional

offset
        The offset constant used in a linear extrapolation model
        to determine a hotspot temperature based off the sensor's
        raw readings. It is up to the device driver to determine
        the usage of these values.
        RW, Optional

*****************************
* Cooling device attributes *
*****************************

type
        String which represents the type of device, e.g:
        - for generic ACPI: should be "Fan", "Processor" or "LCD"
        - for memory controller device on intel_menlow platform:
          should be "Memory controller".
        RO, Required

max_state
        The maximum permissible cooling state of this cooling device.
        RO, Required

cur_state
        The current cooling state of this cooling device.
        The value can be any integer between 0 and max_state:
        - cur_state == 0 means no cooling
        - cur_state == max_state means the maximum cooling.
        RW, Required

3. A simple implementation

ACPI thermal zone may support multiple trip points like critical, hot,
passive, active. If an ACPI thermal zone supports critical, passive,
active[0] and active[1] at the same time, it may register itself as a
thermal_zone_device (thermal_zone1) with 4 trip points in all.
It has one processor and one fan, which are both registered as
thermal_cooling_device. Both are considered to have the same
effectiveness in cooling the thermal zone.

If the processor is listed in _PSL method, and the fan is listed in _AL0
method, the sys I/F structure will be built like this:

/sys/class/thermal:

|thermal_zone1:
    |---type:                   acpitz
    |---temp:                   37000
    |---mode:                   enabled
    |---policy:                 step_wise
    |---available_policies:     step_wise fair_share
    |---trip_point_0_temp:      100000
    |---trip_point_0_type:      critical
    |---trip_point_1_temp:      80000
    |---trip_point_1_type:      passive
    |---trip_point_2_temp:      70000
    |---trip_point_2_type:      active0
    |---trip_point_3_temp:      60000
    |---trip_point_3_type:      active1
    |---cdev0:                  --->/sys/class/thermal/cooling_device0
    |---cdev0_trip_point:       1       /* cdev0 can be used for passive */
    |---cdev0_weight:           1024
    |---cdev1:                  --->/sys/class/thermal/cooling_device3
    |---cdev1_trip_point:       2       /* cdev1 can be used for active[0]*/
    |---cdev1_weight:           1024

|cooling_device0:
    |---type:                   Processor
    |---max_state:              8
    |---cur_state:              0

|cooling_device3:
    |---type:                   Fan
    |---max_state:              2
    |---cur_state:              0

/sys/class/hwmon:

|hwmon0:
    |---name:                   acpitz
    |---temp1_input:            37000
    |---temp1_crit:             100000

4. Event Notification

The framework includes a simple notification mechanism, in the form of a
netlink event. Netlink socket initialization is done during the _init_
of the framework. Drivers which intend to use the notification mechanism
just need to call thermal_generate_netlink_event() with two arguments viz
(originator, event). The originator is a pointer to the struct
thermal_zone_device from which the event originated. An integer which
represents the thermal zone device will be used in the message to identify
the zone. The event will be one of: {THERMAL_AUX0, THERMAL_AUX1,
THERMAL_CRITICAL, THERMAL_DEV_FAULT}. Notification can be sent when the
current temperature crosses any of the configured thresholds.

5. Export Symbol APIs:

5.1: get_tz_trend:
This function returns the trend of a thermal zone, i.e. the rate of change
of temperature of the thermal zone. Ideally, the thermal sensor drivers
are supposed to implement the callback. If they don't, the thermal
framework calculates the trend by comparing the previous and the current
temperature values.

5.2: get_thermal_instance:
This function returns the thermal_instance corresponding to a given
{thermal_zone, cooling_device, trip_point} combination. Returns NULL
if such an instance does not exist.

5.3: thermal_notify_framework:
This function handles the trip events from sensor drivers. It starts
throttling the cooling devices according to the policy configured.
For CRITICAL and HOT trip points, this notifies the respective drivers,
and does actual throttling for other trip points, i.e. ACTIVE and PASSIVE.
The throttling policy is based on the configured platform data; if no
platform data is provided, this uses the step_wise throttling policy.

5.4: thermal_cdev_update:
This function serves as an arbitrator to set the state of a cooling
device.
It sets the cooling device to the deepest cooling state if possible.

6. thermal_emergency_poweroff:

On an event of critical trip temperature crossing, the thermal framework
allows the system to shut down gracefully by calling orderly_poweroff().
In the event of a failure of orderly_poweroff() to shut down the system
we are in danger of keeping the system alive at undesirably high
temperatures. To mitigate this high risk scenario we program a work
queue to fire after a pre-determined number of seconds to start
an emergency shutdown of the device using the kernel_power_off()
function. In case kernel_power_off() fails then finally
emergency_restart() is called in the worst case.

The delay should be carefully profiled so as to give adequate time for
orderly_poweroff(). In case of failure of an orderly_poweroff() the
emergency poweroff kicks in after the delay has elapsed and shuts down
the system.

If the delay is set to 0, emergency poweroff will not be supported. So a
carefully profiled non-zero positive value is a must for emergency poweroff
to be triggered.
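
The escalation order described above can be summarised as a small decision
function. This is a userspace model for illustration only, not the kernel
implementation: the enum shutdown_path labels and the functions
critical_trip_outcome() and emergency_poweroff_examples() are hypothetical
names, and the two boolean inputs simply stand for "this step managed to
shut the system down".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the shutdown paths named in the text. */
enum shutdown_path {
	PATH_ORDERLY_POWEROFF,	/* orderly_poweroff() shut the system down */
	PATH_KERNEL_POWER_OFF,	/* delayed work ran kernel_power_off() */
	PATH_EMERGENCY_RESTART,	/* last resort: emergency_restart() */
	PATH_NO_FALLBACK,	/* delay == 0: emergency poweroff not armed */
};

/*
 * Model of what happens after a critical trip crossing.
 * delay:        configured emergency poweroff delay; 0 disables the fallback
 * orderly_ok:   whether orderly_poweroff() manages to shut down
 * power_off_ok: whether kernel_power_off() manages to shut down
 */
static enum shutdown_path critical_trip_outcome(unsigned int delay,
						bool orderly_ok,
						bool power_off_ok)
{
	if (orderly_ok)
		return PATH_ORDERLY_POWEROFF;
	if (delay == 0)
		return PATH_NO_FALLBACK;
	if (power_off_ok)
		return PATH_KERNEL_POWER_OFF;
	return PATH_EMERGENCY_RESTART;
}

/* Usage: the fallback is reached only with a non-zero delay. */
void emergency_poweroff_examples(void)
{
	assert(critical_trip_outcome(0, false, true) == PATH_NO_FALLBACK);
	assert(critical_trip_outcome(5, false, true) == PATH_KERNEL_POWER_OFF);
}
```

The model makes the point of the last paragraph explicit: with a delay of 0
a failed orderly_poweroff() has no fallback at all.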