Fix throttling of T490 in Linux

In this post, we talk about how to fix the throttling problem of the T490 in Linux.

The Lenovo ThinkPad T490 (especially with the NVIDIA MX250) is severely throttled in normal usage under the stock configuration. Basically, the cooling capacity of this laptop is poor and the thermal settings are too conservative.

With the tweaks below, my laptop can sustain a full all-core load at ~3.3 GHz and 85 °C for long periods. Here are some tips for solving this problem under Linux:

  1. BIOS and UEFI version:
    As of 2020/12/23, do NOT update beyond version 1.53.

  2. Configure CPU settings via throttled:
    throttled has separate AC and Battery settings. There are two benefits to using this tool:

    1. Set the power limit settings to more reasonable values.
    2. Undervolt the CPU, which means we can achieve the same performance with less power consumption.

    I’m a little conservative and my settings are:

    • AC -> PL1 -> 30W(default 44W)

    • AC -> Trip_Temp_C -> 85(default 95)

    • AC -> HWP_Mode -> True(default False)

    • AC Undervolt:

      • CORE: -80 mV
      • GPU: -70 mV
      • CACHE: -80 mV
      • UNCORE: -70 mV
      • ANALOGIO: -40 mV

      Based on what I observe, CORE and CACHE should have the same value, and GPU and UNCORE should have the same value too.
      NOTE: undervolt settings should be fine-tuned for each specific machine. I found some posts reporting that -60 mV makes the system unstable, while others can set this value as low as -110 mV.
      NOTE: one can use stress for CPU testing and glmark2 for GPU testing. A useful system monitor is s-tui. One can install them via

      sudo apt install s-tui stress glmark2
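While a stress run is going, it can help to log the hottest sensor next to s-tui. The following is only a sketch: `max_temp_c` is a hypothetical helper name, and it assumes the kernel exposes readings in millidegrees under /sys/class/thermal/thermal_zone*/temp, which may not cover every sensor on a given machine.

```shell
# max_temp_c [DIR]: print the hottest thermal zone reading in whole degrees C.
# DIR defaults to /sys/class/thermal (assumed to hold millidegree readings).
max_temp_c() {
    mt_dir=${1:-/sys/class/thermal}
    mt_max=0
    for mt_file in "$mt_dir"/thermal_zone*/temp; do
        [ -r "$mt_file" ] || continue
        mt_val=$(cat "$mt_file" 2>/dev/null) || continue
        case "$mt_val" in
            ''|*[!0-9]*) continue ;;   # skip empty or non-numeric readings
        esac
        if [ "$mt_val" -gt "$mt_max" ]; then
            mt_max=$mt_val
        fi
    done
    echo $(( mt_max / 1000 ))
}

# Example run (commented out so that sourcing this file has no side effects):
# stress --cpu "$(nproc)" --timeout 600 &
# while kill -0 "$!" 2>/dev/null; do
#     echo "hottest zone: $(max_temp_c) C"
#     sleep 5
# done
```

If the reported maximum keeps climbing past the configured Trip_Temp_C during a long run, the undervolt or power limit probably needs another look.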
  3. Configure CPU settings via tlp: The PPA is not available for Ubuntu 20.04, so I install the one in the official repo

    sudo apt install tlp

    Then run

    sudo tlp-stat -b

    to check the basic information of the system. It suggested that I install acpi-call, NOT tp-smapi, hence

    sudo apt install acpi-call-dkms

    One can use

    tlp-stat -s

    to check the status of TLP. Look for

    --- TLP 1.3.1 ---    # version information for TLP
    
    +++ TLP Status       # TLP status
    State          = enabled
    RDW state      = enabled
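The same check can be scripted. This is a small sketch that assumes the status output has the "State = enabled" layout shown above; `tlp_enabled` is a hypothetical helper name, not part of TLP.

```shell
# tlp_enabled: read `tlp-stat -s` output on stdin and succeed only if the
# "State" line reports "enabled". The "RDW state" line is deliberately ignored.
tlp_enabled() {
    grep -Eq '^[[:space:]]*State[[:space:]]*=[[:space:]]*enabled[[:space:]]*$'
}

# Example (commented out so that sourcing this file has no side effects):
# tlp-stat -s | tlp_enabled && echo "TLP is enabled"
```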

    NOTE: TLP does not include a daemon, so no tlp process shows up in the output of ps. User configuration should be put in /etc/tlp.d/*.conf or directly in /etc/tlp.conf.

    I put a 01-cpu.conf in /etc/tlp.d/ to constrain the max turbo frequency of the CPU to 3.0 - 3.5 GHz for sustainable usage.

    # Define the min/max P-state for Intel CPUs. 
    # Values are stated as a percentage (0..100%) of the total 
    # available processor performance.
    #
    # Do NOT use CPU_SCALING_MIN/MAX_FREQ_ON_AC/BAT settings if
    # intel_pstate scaling driver is in use.
    CPU_MIN_PERF_ON_AC=8
    CPU_MAX_PERF_ON_AC=75
    CPU_MIN_PERF_ON_BAT=8
    CPU_MAX_PERF_ON_BAT=40   

    The maximum frequency of my CPU is 4.6 GHz, therefore 4.6 * 0.75 = 3.45 GHz, i.e. about 3.5 GHz.
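The same arithmetic can be done in both directions with shell integer math. This is only a sketch: `freq_at_pct` and `pct_for_freq` are hypothetical helper names, and it assumes the percentage scales linearly with the maximum turbo frequency, which is the same approximation used in the text.

```shell
# freq_at_pct MAX_MHZ PCT: frequency in MHz reached at PCT percent of MAX_MHZ.
freq_at_pct() {
    echo $(( $1 * $2 / 100 ))
}

# pct_for_freq MAX_MHZ TARGET_MHZ: percentage (rounded to the nearest integer)
# that corresponds to TARGET_MHZ on a CPU whose maximum is MAX_MHZ.
pct_for_freq() {
    echo $(( ($2 * 100 + $1 / 2) / $1 ))
}

freq_at_pct 4600 75      # prints 3450
pct_for_freq 4600 3000   # prints 65
```

For a CPU with a different maximum frequency, `pct_for_freq` gives a starting value for CPU_MAX_PERF_ON_AC that can then be adjusted while watching the real clocks.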

  4. Configure GPU via nvidia-smi:
    The MX250 that comes with the T490 is the 25 W version (Device ID 1D13), but the cooling of the laptop cannot handle it properly, let alone when it is stressed together with a CPU load.

    1. Set Coolbits for the MX250: In old Linux distributions, the file might be /etc/X11/xorg.conf, but in recent ones the configuration files are located at /usr/share/X11/xorg.conf.d/. One should look for 10-nvidia.conf or nvidia-drm-outputclass.conf. Add the line with the Coolbits option:

      Section "OutputClass"
          Identifier     "nvidia"
          MatchDriver    "nvidia-drm"
          Driver         "nvidia"
          Option "Coolbits" "28"
      EndSection

      See Nvidia’s manual for more details about coolbits. Reboot the system and check

      grep -i -C 9 coolbit /var/log/Xorg.0.log

      to verify it’s working.
      NOTE: If you’re using prime-select intel or prime-select on-demand, this setting will not take effect. You have to use prime-select nvidia.

    2. Set the power limit and clocks for the MX250: UNFORTUNATELY, the MX250 in the T490 is so constrained under Linux that setting the power limit and the clock speed fails. Normally, one would use nvidia-smi for this:

      sudo nvidia-smi -pl 10
      sudo nvidia-smi -lgc 500,1500

      These two commands set the power limit to 10 W and the clock range to 500-1500 MHz. But both fail on my laptop even though I’ve set sufficient Coolbits.

  5. Configure fan settings via thinkfan:

    1. In order to be able to change the fan level, one has to create the file /usr/lib/modprobe.d/thinkpad_acpi.conf with the content

      options thinkpad_acpi fan_control=1

      Then unload and reload the module with:

      sudo modprobe -rv thinkpad_acpi
      sudo modprobe -v thinkpad_acpi
    2. One can check the fan information via

      cat /proc/acpi/ibm/fan
      ## status:        enabled
      ## speed:     0
      ## level:     auto
      ## commands:  level <level> (<level> is 0-7, auto, disengaged, full-speed)
      ## commands:  enable, disable
      ## commands:  watchdog <timeout> (<timeout> is 0 (off), 1-120 (seconds))

      Effectively, to change the fan speed level, one just needs to write a new level (as root), such as

      echo level 1 | sudo tee /proc/acpi/ibm/fan

      There are 7 regulated speeds, 1 to 7, with a higher level meaning a higher speed, plus an auto level and a full-speed level, which is even faster than level 7.
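To avoid writing a mistyped level, the write can be wrapped in a small function that first checks the value against the levels listed in the commands line above. This is a sketch; `set_fan_level` is a hypothetical helper name, and writing to the real /proc/acpi/ibm/fan requires root and fan_control=1.

```shell
# set_fan_level LEVEL [FILE]: validate LEVEL, then write "level LEVEL" to FILE.
# FILE defaults to /proc/acpi/ibm/fan.
set_fan_level() {
    fan_level=$1
    fan_file=${2:-/proc/acpi/ibm/fan}
    case "$fan_level" in
        [0-7]|auto|disengaged|full-speed) ;;
        *) echo "invalid fan level: $fan_level" >&2; return 1 ;;
    esac
    printf 'level %s\n' "$fan_level" > "$fan_file"
}

# Example (run as root; commented out so that sourcing has no side effects):
# set_fan_level auto
```

The optional second argument exists only so the function can be tried against an ordinary file before pointing it at the real fan interface.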

    3. There is a GUI tool for manually setting the fan speed: Thinkpad Fan Control GUI.

    4. There is also a useful app, thinkfan. The version in the GitHub repo is the most up-to-date one, much newer than the one in the official repo. The configuration has changed from a .conf file to a .yaml one. Here is my configuration:

      sensors:
        - tpacpi: /proc/acpi/ibm/thermal
          # indices: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
          correction: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, -20, 0, 0, 0, 0, 0]
          # optional: true

      fans:
        - tpacpi: /proc/acpi/ibm/fan

      levels:
        - [0, 0, 55]
        - [1, 48, 60]
        - [2, 50, 61]
        - [3, 52, 63]
        - [4, 56, 65]
        - [5, 59, 66]
        - [6, 63, 70]
        - [7, 68, 32767]

      Please refer to this example configuration, man thinkfan and man thinkfan.conf for more details.
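To get a feel for how the overlapping limits in the levels table behave, here is a rough simulation of one update step. It is a simplified model, not thinkfan's actual code: it assumes the fan steps up one level once the temperature reaches the current level's upper limit and steps down one level once it falls below the lower limit. `next_level` is a hypothetical helper name.

```shell
# Levels copied from the configuration above: "level lower upper" per line.
LEVELS="0 0 55
1 48 60
2 50 61
3 52 63
4 56 65
5 59 66
6 63 70
7 68 32767"

# next_level CURRENT TEMP: print the level after one update step
# (simplified model, see the assumptions in the text above).
next_level() {
    nl_cur=$1
    nl_temp=$2
    nl_bounds=$(printf '%s\n' "$LEVELS" | awk -v l="$nl_cur" '$1 == l { print $2, $3 }')
    set -- $nl_bounds   # now $1 = lower limit, $2 = upper limit
    if [ "$nl_temp" -ge "$2" ] && [ "$nl_cur" -lt 7 ]; then
        echo $(( nl_cur + 1 ))
    elif [ "$nl_temp" -lt "$1" ] && [ "$nl_cur" -gt 0 ]; then
        echo $(( nl_cur - 1 ))
    else
        echo "$nl_cur"
    fi
}

next_level 0 55   # prints 1: the upper limit of level 0 was reached
next_level 1 50   # prints 1: 50 lies between the limits 48 and 60
```

Because each level's lower limit sits below the previous level's upper limit, a temperature hovering around one threshold does not make the fan flip back and forth on every update.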

  6. Additional external help: a powerful cooling pad.

Chao Cheng
Statistician

My research interests include applied statistics and machine learning.
