Recently my laptop, which has an Intel i7-4700MQ CPU and runs Debian Linux, began running very slowly after I changed from the current stable Debian kernel, 3.16, to the 4.4 kernel available in Debian backports. ("Very" is relative: I didn’t notice for months.) I investigated and noticed that the CPU was often running at only 600-700 MHz! A simple command to show the speed of all of your cores (reported in kHz) is:
watch -n 1 'for i in /sys/devices/system/cpu/cpu?/cpufreq/cpuinfo_cur_freq ; do cat $i ; done'
Initially I assumed there was some misconfiguration of cpufreq, the Linux subsystem which manages CPU speed. However, changing cpufreq settings, such as switching the governor to performance or changing the minimum speed, had no effect at all. There are several other mechanisms which may slow down the CPU:
- Thermal overload - in my system there are several different ways in which the machine may detect that it is over temperature and reduce the clock speed of the CPU to compensate. These include:
- Thermal Monitor 1 - Introduced in the Pentium 4; it can only be enabled or disabled. Do not disable this. It seems to be completely automatic within the CPU itself.
- Thermal Monitor 2 - Introduced in later Pentium 4 and Pentium M CPUs; it also should not be disabled. Besides being enabled or disabled, it supports setting a target maximum temperature for the CPU. Increasing this target is probably a very bad idea.
- Adaptive Thermal Monitor - Introduced in Core 2 CPUs, it will slow the CPU down more and more over time if the CPU fails to cool down. One can check whether the CPU is running slowly because of Adaptive TM, TM1, or TM2 activation by running the following command:
rdmsr 0x19c -f 1:0
In the output, bit 0 indicates whether these systems are currently throttling the CPU and bit 1 indicates whether they have done so since the CPU was reset. Bit 1 can also be cleared by writing a 0 to it.
- Voltage regulator thermal overload - the CPU may be throttled if the voltage regulators on the mainboard become too hot.
- The OS can slow down the CPU via the clock modulation interface, the same mechanism used by the thermal monitors.
- Autonomous Utilization-based Frequency Control - Used when the CPU has turbo enabled to determine which core(s) should be turbo boosted.
- Hardware-Controlled Performance States (HWP), though this is not available on my CPU. I haven’t looked into it further.
- C1E - Reduces frequency, but is only supposed to activate when the core is idle.
- Power throttling - The system can specify either a power or a current limit (or both) which the CPU will try not to exceed by limiting its speed. On my system there are limits configured at both 47 W and 58.75 W; it is unclear which limit is active when. This is configured by the BIOS and may not be changed.
- Graphics driver - The graphics driver can request that the CPU be throttled. It is unclear why this happens or by how much.
- External PROCHOT or FORCEPR lines - external chips can be wired to the CPU pins to force the CPU to throttle, such as power or temperature sensing chips on the mainboard.
- External STPCLK lines - external chips can stop the CPU clock by connecting to a pin on the CPU itself.
- ACPI - there are probably more methods for the BIOS/ACPI firmware system to throttle the CPU than I am aware of.
- At the very least, the thermal_cooling sysfs interface can throttle the CPU in a way that seemingly isn’t detectable by probing the CPU’s model-specific registers. For example, a thermal_cooling sysfs interface is created for each ACPI CPU, which can be activated by writing to
/sys/devices/virtual/thermal/cooling_device0/cur_state. Since this is capable of slowing my CPU down below its P-state minimum of 800 MHz, I assume it’s using clock modulation or STPCLK.
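To check whether a cooling device is active, its state files can simply be read back. The following is a minimal sketch, assuming each cooling_deviceN directory exposes the usual type, cur_state and max_state files; the function name list_cooling_devices is made up for illustration.

```shell
#!/bin/sh
# Print type, cur_state and max_state for each cooling device under $1.
# If no directory is given, fall back to the path used in this post.
# list_cooling_devices is a hypothetical helper name, not an existing tool.
list_cooling_devices() {
    base=${1:-/sys/devices/virtual/thermal}
    for dev in "$base"/cooling_device*; do
        # Skip the unexpanded glob and any device without a readable state.
        [ -r "$dev/cur_state" ] || continue
        printf '%s type=%s cur_state=%s max_state=%s\n' \
            "${dev##*/}" "$(cat "$dev/type")" \
            "$(cat "$dev/cur_state")" "$(cat "$dev/max_state")"
    done
}

# Example:
#   list_cooling_devices
```

A cur_state of 0 means no cooling is requested; anything above that means the device is being asked to throttle, up to max_state.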
In the end, I discovered that on Linux 4.4 the
/sys/devices/virtual/thermal/cooling_device0/cur_state would end up being set to 4 by something unknown whenever I plugged in or unplugged my laptop’s power cable, resulting in 600-700 MHz operation. At the same time, the CPU reported seeing the PROCHOT or FORCEPR lines asserted as the power cable was plugged in or unplugged. It’s my guess that this is the result of some interaction between the kernel, ACPI, and an external chip on the system. This hasn’t occurred since upgrading to Linux 4.5, but the behavior can be half-reproduced by writing 4 to
I have developed a script to detect some possible reasons why my CPU might be running slowly; it’s available at: https://github.com/orezpraw/scripts/blob/master/whyslow.pl.
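As a minimal sketch of one such check (written separately here, not taken from whyslow.pl), the thermal status register read with rdmsr 0x19c can be decoded in a few lines of shell. The function name and output labels are made up for illustration. Bits 0 and 1 are the ones described earlier; treating bits 2 and 3 as the current and logged PROCHOT/FORCEPR flags is an assumption based on Intel's documented layout of this register.

```shell
#!/bin/sh
# Decode the four lowest bits of IA32_THERM_STATUS (MSR 0x19c).
# $1 is the raw register value in hex without a 0x prefix,
# which is how `rdmsr 0x19c` prints it by default.
# decode_therm_status and the labels below are hypothetical names.
decode_therm_status() {
    val=$((0x$1))
    bit=0
    for name in thermal_throttle_now thermal_throttle_logged \
                prochot_now prochot_logged; do
        # Shift the bit of interest down to position 0 and mask it.
        echo "$name=$(( (val >> bit) & 1 ))"
        bit=$((bit + 1))
    done
}

# Example (needs root and the msr kernel module loaded):
#   decode_therm_status "$(rdmsr 0x19c)"
```

A 1 in a "now" line means the condition is present at the moment of reading, while a 1 in a "logged" line means it has occurred since the log bit was last cleared.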