This applies only to Epyc CPUs of the Rome generation.
AMD Epyc processors of the Rome generation (the second generation of Epyc) have turned out to have a curious quirk: such a CPU can freeze after 1044 days of continuous operation.
As a result, owners of servers based on these CPUs may need to reboot before that point to avoid a system freeze.
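As a rough illustration of planning a reboot ahead of that limit, the sketch below reads the Linux uptime counter and reports how many days remain. It assumes that OS uptime from `/proc/uptime` is a good enough proxy for time since the last system reset, and it treats 1044 days as a fixed limit even though the actual time may vary. The 30-day safety margin and the function names are illustrative choices, not anything AMD specifies.

```python
from pathlib import Path

LIMIT_DAYS = 1044          # figure quoted in the article; the real time may vary
SECONDS_PER_DAY = 86_400


def parse_uptime_seconds(text: str) -> float:
    """Return the first field of /proc/uptime, i.e. seconds since boot."""
    return float(text.split()[0])


def days_remaining(uptime_seconds: float, limit_days: float = LIMIT_DAYS) -> float:
    """Days left before uptime reaches limit_days (negative if already past it)."""
    return limit_days - uptime_seconds / SECONDS_PER_DAY


def needs_reboot(uptime_seconds: float,
                 margin_days: float = 30.0,
                 limit_days: float = LIMIT_DAYS) -> bool:
    """True once the remaining time drops to the (assumed) safety margin."""
    return days_remaining(uptime_seconds, limit_days) <= margin_days


if __name__ == "__main__":
    uptime_file = Path("/proc/uptime")  # Linux only
    if uptime_file.exists():
        uptime = parse_uptime_seconds(uptime_file.read_text())
        print(f"Days until the {LIMIT_DAYS}-day mark: {days_remaining(uptime):.1f}")
        if needs_reboot(uptime):
            print("Schedule a reboot soon.")
```

A check like this could be run from a monitoring job; the margin should be chosen generously, since the article notes the time to failure is not exact.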
The problem is that a CPU core fails to wake up from the CC6 sleep state. AMD notes that the exact time to failure may vary depending on spread spectrum and REFCLK frequency. The problem lies in the hardware, so unfortunately it cannot be fixed.
Besides rebooting, there is a second way to avoid the freeze: disable the CC6 sleep state.
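On Linux, one way to see which idle states the kernel may use is the cpuidle sysfs interface. The sketch below lists the idle states of one CPU and can toggle a state's `disable` flag. It is only an illustration: which cpuidle state corresponds to CC6 depends on the platform and firmware and is not stated in the article, so the state index must be checked on the actual machine. The helper names are hypothetical, a change made this way requires root and does not persist across reboots, and a firmware setting or AMD's own guidance remains the authoritative route.

```python
from pathlib import Path

CPU_ROOT = Path("/sys/devices/system/cpu")  # standard Linux sysfs location


def _state_index(state_dir: Path) -> int:
    """Extract N from a directory named 'stateN'."""
    return int(state_dir.name[len("state"):])


def list_idle_states(cpu: int = 0, root: Path = CPU_ROOT) -> list:
    """Return name, description and disable flag for each cpuidle state of one CPU."""
    cpuidle = root / f"cpu{cpu}" / "cpuidle"
    if not cpuidle.is_dir():
        return []  # no cpuidle driver, or not a Linux system
    states = []
    for state_dir in sorted(cpuidle.glob("state[0-9]*"), key=_state_index):
        states.append({
            "index": _state_index(state_dir),
            "name": (state_dir / "name").read_text().strip(),
            "desc": (state_dir / "desc").read_text().strip(),
            "disabled": (state_dir / "disable").read_text().strip() != "0",
        })
    return states


def set_state_disabled(cpu: int, index: int, disabled: bool,
                       root: Path = CPU_ROOT) -> None:
    """Write the 'disable' flag of one idle state (needs root on a real system)."""
    flag = root / f"cpu{cpu}" / "cpuidle" / f"state{index}" / "disable"
    flag.write_text("1" if disabled else "0")


if __name__ == "__main__":
    # Read-only by default: just show what cpu0 offers.
    for state in list_idle_states(0):
        print(state)
```

To apply a change system-wide, `set_state_disabled` would have to be called for every CPU directory, after confirming from the `name` and `desc` fields which state is the deep sleep state on that system.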
In most cases this problem will not affect server owners, because maintenance and security updates that require a reboot usually happen far more often. However, anyone who uses Linux live patching to apply updates without rebooting may run into it.