
SPEC CPU 2017 Rate

SPEC INT 2017 Rate-1

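Below are my own measurements (SPECint2017, estimated, rate, base, 1 copy). They are not guaranteed to meet SPEC's requirements and are for reference only. Total run time (in seconds) is roughly inversely proportional to the score; their product is estimated at about 5e4.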
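The rule of thumb above can be turned into a quick run-time estimate. This is a minimal sketch that assumes score × total seconds ≈ 5e4 holds for the machine in question; the function name `estimate_runtime_seconds` is made up for illustration and is not part of any SPEC tooling.

```shell
# Estimate the total rate-1 run time (seconds) from an estimated score.
# Assumption taken from the text: score * seconds is roughly 5e4 for SPECint.
estimate_runtime_seconds() {
    # $1: estimated score
    # $2: optional product constant (defaults to 50000)
    awk -v score="$1" -v product="${2:-50000}" \
        'BEGIN { printf "%.0f\n", product / score }'
}

estimate_runtime_seconds 12.6   # i9-14900K P-Core, LTO + Jemalloc: prints 3968
estimate_runtime_seconds 3.58   # Apple M1 E-Core, LTO + Jemalloc: prints 13966
```

Passing a different constant as the second argument (for example `100000`) adapts the estimate to a suite with a different score-time product.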

Data overview

[Charts: for Debian Trixie and Debian Bookworm, score/GHz, per-benchmark score, IPC, branch prediction MPKI, branch misprediction rate, frequency and instruction count; for HarmonyOS, per-benchmark score.]

Raw data

Debian Trixie

Desktop platforms (LTO + Jemalloc):

  • AMD Ryzen 7 5700X @ 4.65 GHz Zen 3(-O3 -flto -ljemalloc): 9.31
  • Apple M1 E-Core @ 2.1 GHz Icestorm(-O3 -flto -ljemalloc): 3.58
  • Apple M1 P-Core @ 3.2 GHz Firestorm(-O3 -flto -ljemalloc): 9.14
  • Intel Core i5-1135G7 @ 4.2 GHz Willow Cove(-O3 -flto -ljemalloc): 7.28
  • Intel Core i7-13700K E-Core @ 4.2 GHz Gracemont(-O3 -flto -ljemalloc): 7.43
  • Intel Core i7-13700K P-Core @ 5.2 GHz Raptor Cove(-O3 -flto -ljemalloc): 10.9
  • Intel Core i9-10980XE @ 4.7 GHz Cascade Lake(-O3 -flto -ljemalloc): 6.96
  • Intel Core i9-12900KS E-Core @ 4.1 GHz Gracemont(-O3 -flto -ljemalloc): 6.63
  • Intel Core i9-12900KS P-Core @ 5.5 GHz Golden Cove(-O3 -flto -ljemalloc): 10.6
  • Intel Core i9-14900K E-Core @ 4.4 GHz Gracemont(-O3 -flto -ljemalloc): 7.90
  • Intel Core i9-14900K P-Core @ 6.0 GHz Raptor Cove(-O3 -flto -ljemalloc): 12.6
  • Intel Xeon w9-3595X @ 4.5 GHz Golden Cove(-O3 -flto -ljemalloc): 8.96
  • Loongson 3A6000 @ 2.5 GHz LA664(-O3 -flto -ljemalloc): 4.86

Desktop platforms (LTO):

  • AMD Ryzen 7 5700X @ 4.65 GHz Zen 3(-O3 -flto): 8.57
  • Apple M1 E-Core @ 2.1 GHz Icestorm(-O3 -flto): 3.34
  • Apple M1 P-Core @ 3.2 GHz Firestorm(-O3 -flto): 8.40
  • Intel Core i5-1135G7 @ 4.2 GHz Willow Cove(-O3 -flto): 6.80
  • Intel Core i7-13700K E-Core @ 4.2 GHz Gracemont(-O3 -flto): 6.97
  • Intel Core i7-13700K P-Core @ 5.2 GHz Raptor Cove(-O3 -flto): 10.3
  • Intel Core i9-10980XE @ 4.7 GHz Cascade Lake(-O3 -flto): 6.57
  • Intel Core i9-12900KS E-Core @ 4.1 GHz Gracemont(-O3 -flto): 6.31
  • Intel Core i9-12900KS P-Core @ 5.5 GHz Golden Cove(-O3 -flto): 10.0
  • Intel Core i9-14900K E-Core @ 4.4 GHz Gracemont(-O3 -flto): 7.43
  • Intel Core i9-14900K P-Core @ 6.0 GHz Raptor Cove(-O3 -flto): 12.1
  • Intel Xeon w9-3595X @ 4.5 GHz Golden Cove(-O3 -flto): 8.41
  • Loongson 3A6000 @ 2.5 GHz LA664(-O3 -flto): 4.56

Desktop platforms:

  • AMD Ryzen 7 5700X @ 4.65 GHz Zen 3(-O3): 8.19
  • Apple M1 E-Core @ 2.1 GHz Icestorm(-O3): 3.20
  • Apple M1 P-Core @ 3.2 GHz Firestorm(-O3): 7.97
  • Intel Core i5-1135G7 @ 4.2 GHz Willow Cove(-O3): 6.58
  • Intel Core i7-13700K E-Core @ 4.2 GHz Gracemont(-O3): 6.72
  • Intel Core i7-13700K P-Core @ 5.2 GHz Raptor Cove(-O3): 9.85
  • Intel Core i9-10980XE @ 4.7 GHz Cascade Lake(-O3): 6.31
  • Intel Core i9-12900KS E-Core @ 4.1 GHz Gracemont(-O3): 6.10
  • Intel Core i9-12900KS P-Core @ 5.5 GHz Golden Cove(-O3): 9.74
  • Intel Core i9-14900K E-Core @ 4.4 GHz Gracemont(-O3): 7.18
  • Intel Core i9-14900K P-Core @ 6.0 GHz Raptor Cove(-O3): 11.6
  • Intel Xeon w9-3595X @ 4.5 GHz Golden Cove(-O3): 8.23
  • Loongson 3A6000 @ 2.5 GHz LA664(-O3): 4.35 4.39

Server platforms (LTO + Jemalloc):

  • AMD EPYC 7551 @ 2.5 GHz Zen 1(-O3 -flto -ljemalloc): 3.49
  • AMD EPYC 7742 @ 3.4 GHz Zen 2(-O3 -flto -ljemalloc): 5.48
  • AMD EPYC 9R45 @ 4.5 GHz Zen 5(-O3 -flto -ljemalloc): 10.3
  • AMD EPYC 9T95 @ 3.7 GHz Zen 5c(-O3 -flto -ljemalloc): 8.80
  • Google Axion C4A @ Neoverse V2(-O3 -flto -ljemalloc): 8.23
  • Google Axion N4A @ Neoverse N3(-O3 -flto -ljemalloc): 7.97
  • IBM POWER8 @ 3.2 GHz POWER8(-O3 -flto -ljemalloc): 3.63
  • IBM POWER9 3.2 GHz @ 3.2 GHz POWER9(-O3 -flto -ljemalloc): 3.53
  • IBM POWER9 3.8 GHz @ 3.2 GHz POWER9(-O3 -flto -ljemalloc): 4.81
  • Intel Xeon 6975P-C @ 3.9 GHz Redwood Cove(-O3 -flto -ljemalloc): 8.03
  • Intel Xeon E5-2680 v4 @ 3.3 GHz Broadwell(-O3 -flto -ljemalloc): 4.95
  • Intel Xeon Gold 6430 @ 2.6 GHz Golden Cove(-O3 -flto -ljemalloc): 5.39
  • Intel Xeon Platinum 8358P @ 3.4 GHz Sunny Cove(-O3 -flto -ljemalloc): 6.17
  • Kunpeng 920 @ 2.6 GHz TaiShan V110(-O3 -flto -ljemalloc): 3.65
  • Loongson 3C6000 @ 2.2 GHz LA664(-O3 -flto -ljemalloc): 4.54

Server platforms (LTO):

  • AMD EPYC 7551 @ 2.5 GHz Zen 1(-O3 -flto): 3.28
  • AMD EPYC 7742 @ 3.4 GHz Zen 2(-O3 -flto): 5.05
  • AMD EPYC 9R45 @ 4.5 GHz Zen 5(-O3 -flto): 9.49
  • AMD EPYC 9T95 @ 3.7 GHz Zen 5c(-O3 -flto): 8.18
  • Google Axion C4A @ Neoverse V2(-O3 -flto): 7.68
  • Google Axion N4A @ Neoverse N3(-O3 -flto): 7.44
  • IBM POWER8 @ 3.2 GHz POWER8(-O3 -flto): 3.45
  • IBM POWER9 3.2 GHz @ 3.2 GHz POWER9(-O3 -flto): 3.30
  • IBM POWER9 3.8 GHz @ 3.2 GHz POWER9(-O3 -flto): 4.41
  • Intel Xeon 6975P-C @ 3.9 GHz Redwood Cove(-O3 -flto): 7.65
  • Intel Xeon E5-2680 v4 @ 3.3 GHz Broadwell(-O3 -flto): 4.59
  • Intel Xeon Gold 6430 @ 2.6 GHz Golden Cove(-O3 -flto): 5.16
  • Intel Xeon Platinum 8358P @ 3.4 GHz Sunny Cove(-O3 -flto): 5.91
  • Kunpeng 920 @ 2.6 GHz TaiShan V110(-O3 -flto): 3.32
  • Loongson 3C6000 @ 2.2 GHz LA664(-O3 -flto): 4.39 4.37

Server platforms:

  • AMD EPYC 7551 @ 2.5 GHz Zen 1(-O3): 3.12
  • AMD EPYC 7742 @ 3.4 GHz Zen 2(-O3): 4.78
  • AMD EPYC 9R45 @ 4.5 GHz Zen 5(-O3): 9.07
  • AMD EPYC 9T95 @ 3.7 GHz Zen 5c(-O3): 7.83
  • Google Axion C4A @ Neoverse V2(-O3): 7.25
  • Google Axion N4A @ Neoverse N3(-O3): 7.16
  • IBM POWER8 @ 3.2 GHz POWER8(-O3): 3.24
  • IBM POWER9 3.2 GHz @ 3.2 GHz POWER9(-O3): 3.01
  • IBM POWER9 3.8 GHz @ 3.2 GHz POWER9(-O3): 4.14
  • Intel Xeon 6975P-C @ 3.9 GHz Redwood Cove(-O3): 7.38
  • Intel Xeon E5-2680 v4 @ 3.3 GHz Broadwell(-O3): 4.39
  • Intel Xeon Gold 6430 @ 2.6 GHz Golden Cove(-O3): 4.97
  • Intel Xeon Platinum 8358P @ 3.4 GHz Sunny Cove(-O3): 5.66
  • Kunpeng 920 @ 2.6 GHz TaiShan V110(-O3): 3.17
  • Loongson 3C5000 @ 2.2 GHz LA464(-O3): 2.63
  • Loongson 3C6000 @ 2.2 GHz LA664(-O3): 4.19 4.14

Debian Bookworm

Desktop platforms (LTO + Jemalloc):

  • AMD Ryzen 7 5700X @ 4.65 GHz Zen 3(-O3 -flto -ljemalloc): 9.13
  • AMD Ryzen 9 9950X @ 5.7 GHz Zen 5(-O3 -flto -ljemalloc): 12.9
  • Apple M1 E-Core @ 2.1 GHz Icestorm(-O3 -flto -ljemalloc): 3.52
  • Apple M1 P-Core @ 3.2 GHz Firestorm(-O3 -flto -ljemalloc): 8.93
  • Intel Core i9-10980XE @ 4.7 GHz Cascade Lake(-O3 -flto -ljemalloc): 6.70
  • Intel Core i9-12900KS P-Core @ 5.5 GHz Golden Cove(-O3 -flto -ljemalloc): 10.7
  • Intel Core i9-14900K P-Core @ 6.0 GHz Raptor Cove(-O3 -flto -ljemalloc): 12.1
  • Intel Xeon w9-3595X @ 4.5 GHz Golden Cove(-O3 -flto -ljemalloc): 8.71
  • Qualcomm X1E80100 @ 4.0 GHz X Elite(-O3 -flto -ljemalloc): 9.25

Desktop platforms (LTO):

  • AMD Ryzen 7 5700X @ 4.65 GHz Zen 3(-O3 -flto): 8.44
  • AMD Ryzen 9 9950X @ 5.7 GHz Zen 5(-O3 -flto): 11.7
  • Apple M1 E-Core @ 2.1 GHz Icestorm(-O3 -flto): 3.29
  • Apple M1 P-Core @ 3.2 GHz Firestorm(-O3 -flto): 8.24
  • Intel Core i9-10980XE @ 4.7 GHz Cascade Lake(-O3 -flto): 6.37
  • Intel Core i9-12900KS P-Core @ 5.5 GHz Golden Cove(-O3 -flto): 9.97
  • Intel Core i9-14900K P-Core @ 6.0 GHz Raptor Cove(-O3 -flto): 11.7 11.7
  • Intel Xeon w9-3595X @ 4.5 GHz Golden Cove(-O3 -flto): 8.30
  • Qualcomm X1E80100 @ 4.0 GHz X Elite(-O3 -flto): 8.62

Desktop platforms:

  • AMD Ryzen 5 7500F @ 5.0 GHz Zen 4(-O3): 9.51
  • AMD Ryzen 7 5700X @ 4.65 GHz Zen 3(-O3): 7.87
  • AMD Ryzen 9 9950X @ 5.7 GHz Zen 5(-O3): 11.2 11.3
  • Apple M1 E-Core @ 2.1 GHz Icestorm(-O3): 3.15
  • Apple M1 P-Core @ 3.2 GHz Firestorm(-O3): 7.85
  • Huawei Kirin X90 VM P-Core @ 2.3 GHz(-O3): 4.07
  • Intel Core i9-10980XE @ 4.7 GHz Cascade Lake(-O3): 6.24
  • Intel Core i9-12900KS E-Core @ 4.1 GHz Gracemont(-O3): 6.08
  • Intel Core i9-12900KS P-Core @ 5.5 GHz Golden Cove(-O3): 9.62
  • Intel Core i9-14900K E-Core @ 4.4 GHz Gracemont(-O3): 7.03
  • Intel Core i9-14900K P-Core @ 6.0 GHz Raptor Cove(-O3): 11.3
  • Intel Xeon w9-3595X @ 4.5 GHz Golden Cove(-O3): 8.05
  • Qualcomm 8cx Gen3 E-Core @ 2.4 GHz Cortex-A78C(-O3): 4.11
  • Qualcomm 8cx Gen3 P-Core @ 3.0 GHz Cortex-X1C(-O3): 5.73
  • Qualcomm X1E80100 @ 4.0 GHz X Elite(-O3): 8.31

Server platforms (LTO + Jemalloc):

  • AMD EPYC 7742 @ 3.4 GHz Zen 2(-O3 -flto -ljemalloc): 5.33
  • AMD EPYC 9754 @ 3.1 GHz Zen 4c(-O3 -flto -ljemalloc): 5.79
  • AMD EPYC 9755 @ 4.1 GHz Zen 5(-O3 -flto -ljemalloc): 9.66
  • AMD EPYC 9K65 @ 3.7 GHz Zen 5c(-O3 -flto -ljemalloc): 8.19
  • AMD EPYC 9K85 @ 4.1 GHz Zen 5(-O3 -flto -ljemalloc): 9.48
  • AMD EPYC 9R14 @ 3.7 GHz Zen 4(-O3 -flto -ljemalloc): 7.21
  • AMD EPYC 9T24 @ 3.7 GHz Zen 4(-O3 -flto -ljemalloc): 7.62
  • AWS Graviton 3 @ 2.6 GHz Neoverse V1(-O3 -flto -ljemalloc): 5.24
  • AWS Graviton 3E @ 2.6 GHz Neoverse V1(-O3 -flto -ljemalloc): 6.17
  • AWS Graviton 4 @ 2.8 GHz Neoverse V2(-O3 -flto -ljemalloc): 7.64 7.41
  • Hygon C86 7390(-O3 -flto -ljemalloc): 3.29
  • IBM POWER8NVL @ 4.0 GHz POWER8(-O3 -flto -ljemalloc): 4.02
  • Intel Xeon 6981E Crestmont(-O3 -flto -ljemalloc): 4.79
  • Intel Xeon 6982P-C @ 3.6 GHz Redwood Cove(-O3 -flto -ljemalloc): 7.22
  • Intel Xeon Platinum 8581C @ 3.4 GHz Raptor Cove(-O3 -flto -ljemalloc): 6.87
  • Kunpeng 920 @ 2.6 GHz TaiShan V110(-O3 -flto -ljemalloc): 3.57
  • Kunpeng 920 HuaweiCloud kc2 @ 2.9 GHz(-O3 -flto -ljemalloc): 6.03

Server platforms (LTO):

  • AMD EPYC 7551 @ 2.5 GHz Zen 1(-O3 -flto): 3.19
  • AMD EPYC 7742 @ 3.4 GHz Zen 2(-O3 -flto): 5.02
  • AMD EPYC 9754 @ 3.1 GHz Zen 4c(-O3 -flto): 5.48
  • AMD EPYC 9755 @ 4.1 GHz Zen 5(-O3 -flto): 8.97
  • AMD EPYC 9K65 @ 3.7 GHz Zen 5c(-O3 -flto): 7.78
  • AMD EPYC 9K85 @ 4.1 GHz Zen 5(-O3 -flto): 8.83
  • AMD EPYC 9R14 @ 3.7 GHz Zen 4(-O3 -flto): 6.62
  • AMD EPYC 9T24 @ 3.7 GHz Zen 4(-O3 -flto): 7.14
  • AWS Graviton 3 @ 2.6 GHz Neoverse V1(-O3 -flto): 5.68
  • AWS Graviton 4 @ 2.8 GHz Neoverse V2(-O3 -flto): 7.14 6.53 6.51
  • Hygon C86 7390(-O3 -flto): 3.09
  • Intel Xeon 6981E Crestmont(-O3 -flto): 4.62
  • Intel Xeon 6982P-C @ 3.6 GHz Redwood Cove(-O3 -flto): 6.90
  • Intel Xeon Platinum 8581C @ 3.4 GHz Raptor Cove(-O3 -flto): 6.67
  • Kunpeng 920 @ 2.6 GHz TaiShan V110(-O3 -flto): 3.26
  • Kunpeng 920 HuaweiCloud kc2 @ 2.9 GHz(-O3 -flto): 5.71

Server platforms:

  • AMD EPYC 7551 @ 2.5 GHz Zen 1(-O3): 3.07
  • AMD EPYC 7742 @ 3.4 GHz Zen 2(-O3): 4.73
  • AMD EPYC 7H12 @ 3.3 GHz Zen 2(-O3): 4.23
  • AMD EPYC 7K83 Zen 3(-O3): 5.18
  • AMD EPYC 9754 @ 3.1 GHz Zen 4c(-O3): 5.33
  • AMD EPYC 9755 @ 4.1 GHz Zen 5(-O3): 8.57
  • AMD EPYC 9K65 @ 3.7 GHz Zen 5c(-O3): 7.47
  • AMD EPYC 9K85 @ 4.1 GHz Zen 5(-O3): 8.44
  • AMD EPYC 9R14 @ 3.7 GHz Zen 4(-O3): 6.57 6.41
  • AMD EPYC 9T24 @ 3.7 GHz Zen 4(-O3): 6.94
  • AWS Graviton 3 @ 2.6 GHz Neoverse V1(-O3): 5.43
  • AWS Graviton 3E @ 2.6 GHz Neoverse V1(-O3): 5.53
  • AWS Graviton 4 @ 2.8 GHz Neoverse V2(-O3): 7.00 6.85
  • Ampere Altra @ 3.0 GHz Neoverse N1(-O3): 4.41
  • Hygon C86 7390(-O3): 2.97
  • IBM POWER8NVL @ 4.0 GHz POWER8(-O3): 3.54
  • Intel Xeon 6981E Crestmont(-O3): 4.48
  • Intel Xeon 6982P-C @ 3.6 GHz Redwood Cove(-O3): 6.68
  • Intel Xeon D-2146NT @ 2.9 GHz Skylake(-O3): 3.96
  • Intel Xeon E5-2603 v4 @ 1.7 GHz Broadwell(-O3): 2.48
  • Intel Xeon E5-2680 v3 @ 3.3 GHz Haswell(-O3): 4.01
  • Intel Xeon E5-2680 v4 @ 3.3 GHz Broadwell(-O3): 4.35
  • Intel Xeon E5-4610 v2 @ 2.7 GHz Ivy Bridge EP(-O3): 3.06
  • Intel Xeon Platinum 8358P @ 3.4 GHz Sunny Cove(-O3): 5.66
  • Intel Xeon Platinum 8576C Raptor Cove(-O3): 5.72
  • Intel Xeon Platinum 8581C @ 3.4 GHz Raptor Cove(-O3): 6.52
  • Kunpeng 920 @ 2.6 GHz TaiShan V110(-O3): 3.10
  • Kunpeng 920 HuaweiCloud kc2 @ 2.9 GHz(-O3): 5.53
  • T-Head Yitian 710 @ 3.0 GHz Neoverse N2(-O3): 5.79

HarmonyOS

Desktop platforms (LTO):

  • Huawei Kirin X90 E-Core @ 2.0 GHz(-O3 -flto): 4.28
  • Huawei Kirin X90 P-Core @ 2.3 GHz(-O3 -flto): 4.87

Mobile platforms (LTO):

  • Huawei Kirin 9010 E-Core Full @ 2.2 GHz(-O3 -flto): 3.21
  • Huawei Kirin 9010 P-Core Best @ 2.3 GHz(-O3 -flto): 4.18
  • Huawei Kirin 9010 P-Core Full @ 2.3 GHz(-O3 -flto): 3.96

Notes

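  1. SPEC INT 2017 Rate-1 results are noticeably affected by -flto (about +4% score, mainly from mcf and deepsjeng) and by -ljemalloc (about +4-10% score, mainly from omnetpp and xalancbmk). The difference between -Ofast and -O3 is small, and -march=native has little effect.
  2. On some processors, Linux does not guarantee that a program is scheduled onto the fastest core. For example:
    1. On Qualcomm X1E80100, the load is not necessarily scheduled onto a core that can boost, so the process has to be pinned manually. Cores without boost run at 3.4 GHz, while a boosting core can reach up to 4.0 GHz, which corresponds to a 14% performance gain. Specifically, there are three clusters: cores 0-3 form a cluster without boost, and in each of the clusters 4-7 and 8-11 one core can boost to 4.0 GHz. At most two cores can therefore reach 4.0 GHz, one in cluster 4-7 and one in cluster 8-11. If two or more cores in the same cluster are loaded, all of them run at only 3.4 GHz.
    2. On AMD Ryzen 9 9950X, different cores reach different maximum frequencies. The current Linux (6.11) scheduler does not always keep the task on a core that reaches the 5.75 GHz maximum; the task may drift to a lower-frequency core (around 5.45 GHz) and lose about 4% performance, so pinning is needed. See "Linux 大小核的调度算法探究" and "谈谈 Linux 与 ITMT 调度器与多簇处理器". A patch that fixes this issue already exists.
  3. On server CPUs, the C6 state may be disabled by default, in which case a single core may not boost to the advertised maximum frequency. Enable the C6 state in the BIOS so that idle cores enter a low-power state; only then can the maximum boost frequency be reached.
  4. On ARM64 cores other than Apple's, the kernel's branch-misses counter includes speculative mispredictions rather than only retired ones, so the numbers come out too high. Use the raw event r22 instead.
  5. On Google Cloud, only some machine types (such as C4 and C4A) support the PMU, and it has to be enabled manually (see "Enable the PMU in VMs"):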

    $ gcloud compute instances export VM_NAME \
        --destination=YAML_FILE \
        --zone=ZONE
    $ vim YAML_FILE
    # append the following lines
    advancedMachineFeatures:
      performanceMonitoringUnit: STANDARD
    $ gcloud compute instances update-from-file VM_NAME \
        --most-disruptive-allowed-action=RESTART \
        --source=YAML_FILE \
        --zone=ZONE
    
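  6. On Huawei Cloud kc2 instances the PMU is disabled by default and has to be requested through a support ticket:
    1. Create a private image (for example, start a VM from a public image, then create a private image from that VM).
    2. Open a ticket asking to enable the PMU for the private image.
    3. Create a new VM from the private image; the PMU can then be used inside that VM: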
    root@kc2 ~# dmesg | grep PMU
    [    1.196145] hw perfevents: enabled with armv8_pmuv3_0 PMU driver, 9 counters available
    
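  7. Because of thermal limits, Kirin 9010 results differ considerably between running each benchmark on its own and running the whole suite in sequence. Two sets of data are therefore provided: Best (each benchmark run separately, taking the shortest time; less thermal impact) and Full (one sequential pass; more thermal impact).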
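Notes 2 and 4 combine naturally: pin the benchmark to a chosen core, and on non-Apple ARM64 count retired mispredictions with the raw event r22. The sketch below only prints the command line rather than running it. It assumes `taskset` and `perf` are installed; `pinned_perf_cmd` and `./benchmark` are hypothetical names used for illustration.

```shell
# Print a command line that pins a program to one core and counts
# instructions plus branch mispredictions (dry run, nothing is executed).
pinned_perf_cmd() {
    # $1: CPU index to pin to (for example 4 on X1E80100, per note 2)
    # $2: "arm64" selects the raw event r22 (per note 4); anything else
    #     falls back to the generic branch-misses event
    # remaining arguments: the program and its arguments
    cpu="$1"
    arch="$2"
    shift 2
    if [ "$arch" = "arm64" ]; then
        miss_event="r22"
    else
        miss_event="branch-misses"
    fi
    echo "taskset -c $cpu perf stat -e instructions,$miss_event -- $*"
}

pinned_perf_cmd 4 arm64 ./benchmark
```

Printing the command first makes it easy to check the core index and event names before starting a long run.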

Branch predictor comparison

Branch prediction accuracy (average) on x86 platforms, from highest to lowest (-O3, Debian Bookworm):

  1. Zen 5(AMD 9950X/AMD 9755/AMD 9K85): MPKI=4.48 Mispred=2.52%
  2. Zen 5c(AMD 9K65): MPKI=4.51 Mispred=2.54%
  3. Zen 4(AMD 9T24/9R14): MPKI=4.57 Mispred=2.57%
  4. Zen 4c(AMD 9754): MPKI=4.66 Mispred=2.63%
  5. Zen 4(AMD 7500F): MPKI=4.68 Mispred=2.64%
  6. Zen 3(AMD 5700X): MPKI=4.68 Mispred=2.64%
  7. Zen 2(AMD 7742): MPKI=4.77 Mispred=2.69%
  8. Redwood Cove(Intel 6982P-C): MPKI=4.77 Mispred=2.71%
  9. Sunny Cove(Intel 8358P)/Golden Cove(Intel 12900KS P-Core)/Raptor Cove(Intel 14900K P-Core/Intel 8581C): MPKI=4.86 Mispred=2.75%
  10. Gracemont(Intel 12900KS E-Core/Intel 14900K E-Core): MPKI=5.15 Mispred=2.92%
  11. Skylake(Intel D-2146NT)/Cascade Lake(Intel 10980XE): MPKI=5.50 Mispred=3.13%
  12. Zen 1(AMD 7551): MPKI=5.82 Mispred=3.31%
  13. Haswell(Intel E5-2680 v3)/Broadwell(Intel E5-2680 v4): MPKI=5.98 Mispred=3.34%

Branch prediction accuracy (average) on x86 platforms, from highest to lowest (-O3 -flto, Debian Bookworm):

  1. Zen 5(AMD 9950X/AMD 9755): MPKI=5.35 Mispred=3.07%
  2. Zen 5c(AMD 9K65)/Zen 5(AMD 9K85): MPKI=5.42 Mispred=3.10%
  3. Zen 2(AMD 7742): MPKI=5.52 Mispred=3.17%
  4. Zen 3(AMD 5700X): MPKI=5.55 Mispred=3.19%
  5. Zen 4(AMD 9T24/AMD 9R14): MPKI=5.57 Mispred=3.19%
  6. Redwood Cove(Intel 6982P-C): MPKI=5.70 Mispred=3.29%
  7. Golden Cove(Intel 12900KS P-Core)/Raptor Cove(Intel 14900K P-Core/Intel 8581C): MPKI=5.81 Mispred=3.37%
  8. Cascade Lake(Intel 10980XE): MPKI=6.55 Mispred=3.83%
  9. Zen 1(AMD 7551): MPKI=6.86 Mispred=4.02%

Branch prediction accuracy (average) on ARM64 platforms, from highest to lowest (-O3, Debian Bookworm):

  1. Neoverse V2(AWS Graviton 4): MPKI=4.50 Mispred=2.47%
  2. Oryon(Qualcomm X1E80100): MPKI=4.71 Mispred=2.58%
  3. Neoverse N2(Aliyun Yitian 710): MPKI=4.80 Mispred=2.64%
  4. Firestorm(Apple M1 P-Core): MPKI=4.81 Mispred=2.62%
  5. Neoverse V1(AWS Graviton 3/AWS Graviton 3E)/Cortex X1C(Qualcomm 8cx Gen3 P-Core): MPKI=4.91 Mispred=2.69%
  6. HuaweiCloud kc2: MPKI=5.17 Mispred=2.85%
  7. Neoverse N1(Ampere Altra)/Cortex A78C(Qualcomm 8cx Gen3 E-Core): MPKI=5.21 Mispred=2.87%
  8. Icestorm(Apple M1 E-Core): MPKI=5.41 Mispred=2.99%
  9. TSV110(Hisilicon Kunpeng 920): MPKI=6.54 Mispred=3.58%

Branch prediction accuracy (average) on ARM64 platforms, from highest to lowest (-O3 -flto, Debian Bookworm):

  1. Neoverse V2(AWS Graviton 4): MPKI=5.19 Mispred=3.03%
  2. Oryon(Qualcomm X1E80100): MPKI=5.41 Mispred=3.13%
  3. Firestorm(Apple M1 P-Core): MPKI=5.45 Mispred=3.14%
  4. Neoverse V1(AWS Graviton 3): MPKI=5.64 Mispred=3.27%
  5. HuaweiCloud kc2: MPKI=6.00 Mispred=3.50%
  6. Icestorm(Apple M1 E-Core): MPKI=6.10 Mispred=3.56%
  7. TSV110(Hisilicon Kunpeng 920): MPKI=6.74 Mispred=3.98%

Branch prediction accuracy (average) on LoongArch64 platforms, from highest to lowest (-O3, Debian Trixie):

  1. LA664(3A6000/3C6000): MPKI=5.01 Mispred=2.79%
  2. LA464(3C5000): MPKI=8.39 Mispred=4.21%
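The two metrics in the lists above follow directly from three counter values: MPKI is mispredictions per thousand instructions, and the misprediction rate is mispredictions divided by branches. A minimal sketch follows; the function name `branch_stats` and the counter values are invented for illustration and chosen to land near the Zen 5 row.

```shell
# Derive MPKI and misprediction rate from raw counter values.
branch_stats() {
    # $1: retired instructions
    # $2: retired branches
    # $3: branch mispredictions
    awk -v insn="$1" -v br="$2" -v miss="$3" 'BEGIN {
        printf "MPKI=%.2f Mispred=%.2f%%\n", miss / insn * 1000, miss / br * 100
    }'
}

# Illustrative numbers only: 1e9 instructions, 177.8e6 branches, 4.48e6 misses.
branch_stats 1000000000 177800000 4480000   # prints MPKI=4.48 Mispred=2.52%
```

Because both metrics share the same numerator, MPKI divided by the misprediction rate gives the branch density, roughly 178 branches per thousand instructions in this example.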

Data from other sources

SPEC CPU 2017 by David Huang:

  • Apple M4 Pro: 13.7
  • AMD Ryzen 9950X Zen 5: 12.6
  • Apple M3 Pro: 11.8
  • Intel Core 13900K Raptor Cove: 11.5
  • Intel Core Ultra 7 265K Arrow Lake Lion Cove+Skymont: 11.1
  • AMD AI Max+ 395 Zen 5: 10.6
  • Apple M2 Pro: 10.3
  • Apple M2: 9.95
  • AMD HX 370 Strix Point Zen 5: 9.64
  • Intel Core Ultra 258V Lunar Lake Lion Cove+Skymont: 9.46
  • Apple M1 Max Firestorm: 9.2
  • AMD Ryzen 5950X Zen 3: 9.15
  • Kunpeng 920 TSV120: 6.00
  • Loongson 3A6000 LA664: 4.29
  • Phytium D3000 FTC862: 4.24
  • Loongson 3A5000 LA464: 3.04

高通 X Elite Oryon 微架构评测:走走停停 by JamesAslan:

  • AMD Ryzen 7700X Zen 4: 10.35
  • Intel Core 13700K Raptor Cove: 9.81
  • Intel Core 12700K Golden Cove: 9.13
  • AMD Ryzen 5950X Zen 3: 8.45
  • Apple M2 Avalanche+Blizzard: 8.40
  • Qualcomm X1E80100 Oryon: 8.19
  • Apple M1 Firestorm+Icestorm: 7.40
  • Qualcomm 8 Gen 2 Cortex-X3: 6.58

Running SPEC CPU2017 on Chinese CPUs, and More

  • AMD Ryzen 9 7950X3D Non-VCache: 10.5
  • Intel Core Ultra 7 258V Lion Cove: 9.37
  • Intel Core Ultra 7 155H Redwood Cove: 7.6
  • Intel Core Ultra 7 258V Skymont: 5.92
  • Intel Core Ultra 7 155H Crestmont: 5.88
  • Intel Core i5-6600K Skylake: 5.65
  • Loongson 3A6000 LA664: 4.27
  • Mediatek Genio 1200 Cortex A78: 3.8
  • AMD FX-8150: 3.5
  • Intel Core Ultra 7 155H Low Power Crestmont: 3.32
  • Loongson 3A5000: 2.93
  • Intel Celeron J4125 Goldmont Plus: 2.43
  • Zhaoxin KaiXian KX-6640MA: 2.07
  • Amlogic S922X Cortex A73: 1.77
  • Mediatek Genio 1200 Cortex A55: 1.19

Running SPEC CPU2017 at Chips and Cheese?

  • AMD Ryzen 9 9950X: 11.9
  • AMD Ryzen 9 7950X3D Non-VCache: 10.5
  • AMD Ryzen 9 7950X3D VCache: 10.5
  • Intel Core Ultra 7 155H Redwood Cove: 7.58
  • Intel Core Ultra 7 155H Crestmont: 5.34
  • AMD Ryzen 9 3950X 3.5GHz: 5.28
  • Ampere Altra: 3.98
  • AmpereOne: 3.94
  • AMD FX-8150: 3.46

The AMD Ryzen 9 9950X and Ryzen 9 9900X Review: Flagship Zen 5 Soars - and Stalls

  • AMD Ryzen 9 9950X Zen 5: 10.95
  • Intel Core i9-14900K Raptor Cove: 10.94
  • AMD Ryzen 9 7950X Zen 4: 9.88

Snapdragon X Elite Qualcomm Oryon CPU Design and Architecture Hot Chips 2024

  • Qualcomm X Elite Oryon on Linux: 10.64
  • Qualcomm X Elite Oryon on Windows: 9.70

ARM Cortex X1 微架构评测(上):向山进发

  • Zen 3 @ 4.95 GHz: 8.4
  • Firestorm @ 3.0 GHz: 7.4
  • Cortex X1 @ 3.0 GHz: 5.7
  • Cortex A78 @ 2.4 GHz: 3.9

极客湾•麒麟 9010,测评汇总:IPC 性能,巨幅提升!CPU 能效全频段领先,麒麟 9000S!

  • Huawei Kirin 9010: 4.54
  • Huawei Kirin 9000S: 4.06

Performance comparison across GCC and LLVM versions

Performance of several compiler combinations at -O3 on Intel i9-14900K @ 5.7 GHz:

Benchmark        GCC 15  GCC 14  GCC 13  GCC 12  GCC 11  LLVM 19  LLVM 18  LLVM 17  LLVM 20  LLVM 20 w/ -fwrapv
500.perlbench_r  12.0    11.8    12.0    12.3    12.3    10.9     10.9     10.8     10.8     10.9
502.gcc_r        14.0    13.7    13.7    13.6    13.5    13.5     13.5     13.5     13.5     13.5
505.mcf_r        9.34    9.48    9.19    9.32    9.38    8.32     8.40     8.76     8.27     8.26
520.omnetpp_r    9.39    9.07    8.87    9.17    9.17    8.78     8.80     8.77     8.74     8.73
523.xalancbmk_r  8.91    8.91    8.95    8.85    9.11    8.88     8.86     8.85     8.86     8.83
525.x264_r       23.7    19.7    18.6    18.5    19.4    19.5     18.8     18.5     19.9     20.0
531.deepsjeng_r  7.43    7.36    7.36    7.24    6.95    7.18     7.27     7.22     7.17     7.29
541.leela_r      7.20    7.06    7.13    7.16    7.00    7.45     7.39     7.36     7.41     7.19
548.exchange2_r  32.5    29.9    28.8    28.2    16.2    14.4     14.4     12.9     10.9     14.4
557.xz_r         5.69    5.62    5.59    5.62    5.55    5.71     5.69     5.70     5.69     5.66
geomean          11.2    10.8    10.7    10.7    10.1    9.80     9.78     9.68     9.52     9.78
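The geomean row can be recomputed from the per-benchmark scores. This is a minimal sketch using awk; the function name `geomean` is chosen here for illustration, and the result is printed with three significant digits to match how the table reports scores.

```shell
# Geometric mean of all arguments, printed with 3 significant digits.
geomean() {
    printf '%s\n' "$@" | awk '
        { sum += log($1); n++ }          # accumulate natural logs
        END { printf "%.3g\n", exp(sum / n) }'
}

# GCC 15 column of the table above: prints 11.2
geomean 12.0 14.0 9.34 9.39 8.91 23.7 7.43 7.20 32.5 5.69
```

Averaging in log space is what makes a single outlier such as 548.exchange2_r move the overall score less than an arithmetic mean would.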

Full data:

The 548.exchange2_r regression on LLVM 20 can be resolved by adding the -fwrapv option; see "548.exchange2_r of SPEC CPU 2017 has 30% performance regression between LLVM 18/19 and LLVM 20 on amd64".

Note: GCC means GCC + GFortran; LLVM means Clang + Flang-new.

LA664 results with different compiler versions and compiler options

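There is some disagreement online about the SPEC INT 2017 Rate-1 performance of LA664:

To summarize the results from the articles above:

The main disagreement is evidently about the GCC version and the compiler options.

My own measurements are listed below:

Note: 3A6000 runs at 2.5 GHz and 3C6000 at 2.2 GHz.

Conclusion: performance depends heavily on the compiler version and compiler options. If these do not match, the resulting gap can change the conclusion of a comparison with other processors. In the example above, the gains from compiler version and options are:

  1. -flto: about 5% gain, 4.39/4.19=1.05, 4.56/4.35=1.05
  2. -march=native (GCC 14 only) or -msimd=lasx: about 8% gain, 4.50/4.17=1.08
  3. GCC 15.1.0 vs GCC 14.2.0: about 7% gain, 4.49/4.19=1.07
  4. -ljemalloc: about 3-7% gain, 4.90/4.63=1.06, 4.86/4.56=1.07, 4.54/4.39=1.03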
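The ratios in the list above are plain quotients of two scores. A small helper makes it easy to redo them for other score pairs; the name `speedup` is made up for illustration.

```shell
# Ratio of a score after a change to the score before it, 2 decimals.
speedup() {
    # $1: score with the option or newer compiler
    # $2: baseline score
    awk -v after="$1" -v before="$2" 'BEGIN { printf "%.2f\n", after / before }'
}

speedup 4.39 4.19   # -flto: prints 1.05
speedup 4.50 4.17   # -march=native or -msimd=lasx: prints 1.08
speedup 4.54 4.39   # -ljemalloc: prints 1.03
```

Each gain taken alone is modest, yet a comparison that gets all of them on one side and none on the other compares quite different configurations.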

SPEC FP 2017 Rate-1

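Below are my own measurements (SPECfp2017, estimated, rate, base, 1 copy). They are not guaranteed to meet SPEC's requirements and are for reference only. Total run time (in seconds) is roughly inversely proportional to the score; their product is estimated at about 1e5.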

Data overview

[Charts: for Debian Bookworm and Debian Trixie, score/GHz, per-benchmark score, IPC, branch prediction MPKI, branch misprediction rate, frequency and instruction count; for HarmonyOS, per-benchmark score.]

Raw data

Debian Trixie

Desktop platforms (-march=native):

  • AMD Ryzen 7 5700X @ 4.65 GHz Zen 3(-O3 -march=native): 11.7
  • Apple M1 E-Core @ 2.1 GHz Icestorm(-O3 -march=native): 3.93
  • Apple M1 P-Core @ 3.2 GHz Firestorm(-O3 -march=native): 12.2
  • Intel Core i5-1135G7 @ 4.2 GHz Willow Cove(-O3 -march=native): 9.93
  • Intel Core i7-13700K E-Core @ 4.2 GHz Gracemont(-O3 -march=native): 7.22
  • Intel Core i7-13700K P-Core @ 5.2 GHz Raptor Cove(-O3 -march=native): 15.0
  • Intel Core i9-10980XE @ 4.7 GHz (AVX-512 @ 4.0 GHz) Cascade Lake(-O3 -march=native): 7.85
  • Intel Core i9-12900KS E-Core @ 4.1 GHz Gracemont(-O3 -march=native): 7.23
  • Intel Core i9-12900KS P-Core @ 5.5 GHz Golden Cove(-O3 -march=native): 15.4
  • Intel Core i9-14900K E-Core @ 4.4 GHz Gracemont(-O3 -march=native): 7.70
  • Intel Core i9-14900K P-Core @ 6.0 GHz Raptor Cove(-O3 -march=native): 18.0
  • Intel Xeon w9-3595X @ 4.5 GHz Golden Cove(-O3 -march=native): 11.9
  • Loongson 3A6000 @ 2.5 GHz LA664(-O3 -march=native): 5.73

Desktop platforms:

  • AMD Ryzen 7 5700X @ 4.65 GHz Zen 3(-O3): 10.9
  • Apple M1 E-Core @ 2.1 GHz Icestorm(-O3): 3.93
  • Apple M1 P-Core @ 3.2 GHz Firestorm(-O3): 12.2
  • Intel Core i5-1135G7 @ 4.2 GHz Willow Cove(-O3): 9.04
  • Intel Core i7-13700K E-Core @ 4.2 GHz Gracemont(-O3): 6.93
  • Intel Core i7-13700K P-Core @ 5.2 GHz Raptor Cove(-O3): 14.0
  • Intel Core i9-10980XE @ 4.7 GHz Cascade Lake(-O3): 7.24
  • Intel Core i9-12900KS E-Core @ 4.1 GHz Gracemont(-O3): 6.97
  • Intel Core i9-12900KS P-Core @ 5.5 GHz Golden Cove(-O3): 14.4
  • Intel Core i9-14900K E-Core @ 4.4 GHz Gracemont(-O3): 7.42
  • Intel Core i9-14900K P-Core @ 6.0 GHz Raptor Cove(-O3): 16.8
  • Intel Xeon w9-3595X @ 4.5 GHz Golden Cove(-O3): 11.0
  • Loongson 3A6000 @ 2.5 GHz LA664(-O3): 5.56

Server platforms (-march=native):

  • AMD EPYC 7551 @ 2.5 GHz Zen 1(-O3 -march=native): 4.42
  • AMD EPYC 7742 @ 3.4 GHz Zen 2(-O3 -march=native): 7.96
  • AMD EPYC 9R45 @ 4.5 GHz Zen 5(-O3 -march=native): 16.2
  • AMD EPYC 9T95 @ 3.7 GHz Zen 5c(-O3 -march=native): 13.9
  • Google Axion C4A @ Neoverse V2(-O3 -march=native): 10.8
  • Google Axion N4A @ Neoverse N3(-O3 -march=native): 8.94
  • Intel Xeon 6975P-C @ 3.9 GHz Redwood Cove(-O3 -march=native): 11.0
  • Intel Xeon E5-2680 v4 @ 3.3 GHz Broadwell(-O3 -march=native): 5.58
  • Intel Xeon Gold 6430 @ 2.6 GHz Golden Cove(-O3 -march=native): 7.64
  • Intel Xeon Platinum 8358P @ 3.4 GHz Sunny Cove(-O3 -march=native): 8.03
  • Kunpeng 920 @ 2.6 GHz TaiShan V110(-O3 -march=native): 3.19
  • Loongson 3C5000 @ 2.2 GHz LA464(-O3 -march=native): 3.09
  • Loongson 3C6000 @ 2.2 GHz LA664(-O3 -march=native): 4.93

Server platforms:

  • AMD EPYC 7551 @ 2.5 GHz Zen 1(-O3): 4.15
  • AMD EPYC 7742 @ 3.4 GHz Zen 2(-O3): 7.37
  • AMD EPYC 9R45 @ 4.5 GHz Zen 5(-O3): 14.5
  • AMD EPYC 9T95 @ 3.7 GHz Zen 5c(-O3): 12.5
  • Google Axion C4A @ Neoverse V2(-O3): 10.8
  • Google Axion N4A @ Neoverse N3(-O3): 9.18
  • IBM POWER8 @ 3.2 GHz POWER8(-O3): 3.47
  • IBM POWER9 3.2 GHz @ 3.2 GHz POWER9(-O3): 3.84
  • IBM POWER9 3.8 GHz @ 3.2 GHz POWER9(-O3): 4.75
  • Intel Xeon 6975P-C @ 3.9 GHz Redwood Cove(-O3): 10.3
  • Intel Xeon E5-2680 v4 @ 3.3 GHz Broadwell(-O3): 5.49
  • Intel Xeon Gold 6430 @ 2.6 GHz Golden Cove(-O3): 7.01
  • Intel Xeon Platinum 8358P @ 3.4 GHz Sunny Cove(-O3): 7.24
  • Kunpeng 920 @ 2.6 GHz TaiShan V110(-O3): 3.17
  • Loongson 3C5000 @ 2.2 GHz LA464(-O3): 3.00
  • Loongson 3C6000 @ 2.2 GHz LA664(-O3): 4.75 4.77 4.75

Debian Bookworm

Desktop platforms (-march=native):

  • AMD Ryzen 7 5700X @ 4.65 GHz Zen 3(-O3 -march=native): 11.4
  • AMD Ryzen 9 9950X @ 5.7 GHz Zen 5(-O3 -march=native): 17.6
  • Apple M1 E-Core @ 2.1 GHz Icestorm(-O3 -march=native): 3.89
  • Apple M1 P-Core @ 3.2 GHz Firestorm(-O3 -march=native): 11.6
  • Intel Core i9-10980XE @ 4.7 GHz (AVX-512 @ 4.0 GHz) Cascade Lake(-O3 -march=native): 7.24
  • Intel Core i9-14900K P-Core @ 6.0 GHz Raptor Cove(-O3 -march=native): 16.6
  • Intel Xeon w9-3595X @ 4.5 GHz Golden Cove(-O3 -march=native): 11.0
  • Qualcomm X1E80100 @ 4.0 GHz X Elite(-O3 -march=native): 14.4

Desktop platforms:

  • AMD Ryzen 5 7500F @ 5.0 GHz Zen 4(-O3): 11.6
  • AMD Ryzen 7 5700X @ 4.65 GHz Zen 3(-O3): 9.91
  • AMD Ryzen 9 9950X @ 5.7 GHz Zen 5(-O3): 16.3 16.6
  • Apple M1 E-Core @ 2.1 GHz Icestorm(-O3): 3.89
  • Apple M1 P-Core @ 3.2 GHz Firestorm(-O3): 11.6
  • Intel Core i9-10980XE @ 4.7 GHz Cascade Lake(-O3): 6.91
  • Intel Core i9-12900KS E-Core @ 4.1 GHz Gracemont(-O3): 6.90
  • Intel Core i9-12900KS P-Core @ 5.5 GHz Golden Cove(-O3): 14.3
  • Intel Core i9-14900K E-Core @ 4.4 GHz Gracemont(-O3): 7.31
  • Intel Core i9-14900K P-Core @ 6.0 GHz Raptor Cove(-O3): 16.1
  • Intel Xeon w9-3595X @ 4.5 GHz Golden Cove(-O3): 10.6
  • Qualcomm 8cx Gen3 E-Core @ 2.4 GHz Cortex-A78C(-O3): 6.08
  • Qualcomm 8cx Gen3 P-Core @ 3.0 GHz Cortex-X1C(-O3): 8.07
  • Qualcomm X1E80100 @ 4.0 GHz X Elite(-O3): 14.4

Server platforms (-march=native):

  • AMD EPYC 9754 @ 3.1 GHz Zen 4c(-O3 -march=native): 8.42
  • AMD EPYC 9755 @ 4.1 GHz Zen 5(-O3 -march=native): 14.4
  • AMD EPYC 9K65 @ 3.7 GHz Zen 5c(-O3 -march=native): 12.7
  • AMD EPYC 9K85 @ 4.1 GHz Zen 5(-O3 -march=native): 14.2
  • AMD EPYC 9R14 @ 3.7 GHz Zen 4(-O3 -march=native): 10.1
  • AMD EPYC 9T24 @ 3.7 GHz Zen 4(-O3 -march=native): 10.1
  • AWS Graviton 3 @ 2.6 GHz Neoverse V1(-O3 -march=native): 7.73
  • AWS Graviton 4 @ 2.8 GHz Neoverse V2(-O3 -march=native): 9.29 9.35
  • Intel Xeon 6981E Crestmont(-O3 -march=native): 4.77
  • Intel Xeon 6982P-C @ 3.6 GHz Redwood Cove(-O3 -march=native): 9.50
  • Intel Xeon D-2146NT @ 2.9 GHz Skylake(-O3 -march=native): 5.48
  • Intel Xeon Platinum 8358P @ 3.4 GHz Sunny Cove(-O3 -march=native): 7.60
  • Intel Xeon Platinum 8581C @ 3.4 GHz Raptor Cove(-O3 -march=native): 8.60
  • Kunpeng 920 @ 2.6 GHz TaiShan V110(-O3 -march=native): 3.17
  • Kunpeng 920 HuaweiCloud kc2 @ 2.9 GHz(-O3 -march=native): 8.01

Server platforms:

  • AMD EPYC 7551 @ 2.5 GHz Zen 1(-O3): 4.05
  • AMD EPYC 7742 @ 3.4 GHz Zen 2(-O3): 7.12
  • AMD EPYC 7H12 @ 3.3 GHz Zen 2(-O3): 6.61
  • AMD EPYC 7K83 Zen 3(-O3): 7.63
  • AMD EPYC 9754 @ 3.1 GHz Zen 4c(-O3): 7.64
  • AMD EPYC 9755 @ 4.1 GHz Zen 5(-O3): 13.2
  • AMD EPYC 9K65 @ 3.7 GHz Zen 5c(-O3): 11.7
  • AMD EPYC 9K85 @ 4.1 GHz Zen 5(-O3): 13.0
  • AMD EPYC 9R14 @ 3.7 GHz Zen 4(-O3): 9.26
  • AMD EPYC 9T24 @ 3.7 GHz Zen 4(-O3): 9.23
  • AWS Graviton 3 @ 2.6 GHz Neoverse V1(-O3): 7.80
  • AWS Graviton 3E @ 2.6 GHz Neoverse V1(-O3): 8.10
  • AWS Graviton 4 @ 2.8 GHz Neoverse V2(-O3): 9.36 9.39
  • Ampere Altra @ 3.0 GHz Neoverse N1(-O3): 5.26
  • Hygon C86 7390(-O3): 3.95
  • IBM POWER8NVL @ 4.0 GHz POWER8(-O3): 4.10
  • Intel Xeon 6981E Crestmont(-O3): 4.80
  • Intel Xeon 6982P-C @ 3.6 GHz Redwood Cove(-O3): 9.50
  • Intel Xeon D-2146NT @ 2.9 GHz Skylake(-O3): 5.00
  • Intel Xeon E5-2603 v4 @ 1.7 GHz Broadwell(-O3): 3.14
  • Intel Xeon E5-2680 v3 @ 3.3 GHz Haswell(-O3): 5.15
  • Intel Xeon E5-2680 v4 @ 3.3 GHz Broadwell(-O3): 5.44
  • Intel Xeon E5-4610 v2 @ 2.7 GHz Ivy Bridge EP(-O3): 3.74
  • Intel Xeon Platinum 8358P @ 3.4 GHz Sunny Cove(-O3): 7.12
  • Intel Xeon Platinum 8576C Raptor Cove(-O3): 8.14
  • Intel Xeon Platinum 8581C @ 3.4 GHz Raptor Cove(-O3): 8.42
  • Kunpeng 920 @ 2.6 GHz TaiShan V110(-O3): 3.13
  • Kunpeng 920 HuaweiCloud kc2 @ 2.9 GHz(-O3): 8.17
  • T-Head Yitian 710 @ 3.0 GHz Neoverse N2(-O3): 7.63

HarmonyOS

Desktop platforms (LTO):

  • Huawei Kirin X90 E-Core @ 2.0 GHz(-O3 -flto): 6.52
  • Huawei Kirin X90 P-Core @ 2.3 GHz(-O3 -flto): 7.42

Mobile platforms (LTO):

  • Huawei Kirin 9010 E-Core Full @ 2.2 GHz(-O3 -flto): 4.72
  • Huawei Kirin 9010 P-Core Best @ 2.3 GHz(-O3 -flto): 6.22
  • Huawei Kirin 9010 P-Core Full @ 2.3 GHz(-O3 -flto): 5.86

Notes

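  1. On AMD64, SPEC FP 2017 Rate-1 results are noticeably affected by -march=native, especially on platforms with AVX-512, because without -march=native the compiler uses at most SSE for SIMD by default. On ARM64, -march=native makes little difference and can even cause a slight regression.
  2. Some kernel versions (roughly 6.7-6.11, fixed in 6.12/6.11.7) significantly hurt the performance of 503.bwaves_r and 507.cactuBSSN_r. See "Intel Spots A 3888.9% Performance Improvement In The Linux Kernel From One Line Of Code", "mm, mmap: limit THP alignment of anonymous mappings to PMD-aligned sizes" and "kernel 6.10 THP causes abysmal performance drop".
  3. Qualcomm 8cx Gen3 throttles due to overheating while running the tests and cannot reach its best performance; each of the three runs is slower than the previous one.
  4. On Huawei Cloud kc2 instances, compiling with -march=native on Debian Bookworm fails because of a problem in binutils 2.40. The workaround is to install binutils 2.42 manually: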

    # Fix error building 511.povray_r:
    # /usr/bin/gcc -std=c99 -c -o image_validator/ImageValidator.o -DSPEC -DNDEBUG -Ifrontend -Ibase -I. -Ispec_qsort -DSPEC_AUTO_SUPPRESS_OPENMP  -O3 -march=native            -Wno-error=implicit-int   -DSPEC_LP64  image_validator/ImageValidator.c
    # /tmp/cc0E80QY.s: Assembler messages:
    # /tmp/cc0E80QY.s:2340: Error: selected processor does not support `bcax v22.16b,v22.16b,v11.16b,v22.16b'
    # /tmp/cc0E80QY.s:2425: Error: selected processor does not support `bcax v8.16b,v8.16b,v16.16b,v8.16b'
    # /tmp/cc0E80QY.s:2502: Error: selected processor does not support `bcax v3.16b,v3.16b,v5.16b,v3.16b'
    apt update
    apt install -y texinfo
    wget https://mirrors.tuna.tsinghua.edu.cn/gnu/binutils/binutils-2.42.tar.xz
    tar xvf binutils-2.42.tar.xz
    cd binutils-2.42
    mkdir build
    cd build/
    ../configure
    make all -j4
    make install -j4
    
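  5. Because of thermal limits, Kirin 9010 results differ considerably between running each benchmark on its own and running the whole suite in sequence. Two sets of data are therefore provided: Best (each benchmark run separately, taking the shortest time; less thermal impact) and Full (one sequential pass; more thermal impact).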
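For note 2, a quick check can flag kernels in the affected range before the FP suite is run. The sketch below treats the stated range as exact (6.7 up to but not including 6.11.7), although the note itself only says "roughly". It relies on GNU `sort -V`; `version_lt` and `thp_regression_suspect` are hypothetical helper names.

```shell
# True when version $1 sorts strictly before version $2 (GNU sort -V).
version_lt() {
    [ "$1" != "$2" ] &&
        [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$1" ]
}

# Classify a kernel version against the range given in note 2.
thp_regression_suspect() {
    # $1: kernel version such as 6.10.3
    if version_lt "$1" 6.7; then
        echo "unaffected"
    elif version_lt "$1" 6.11.7; then
        echo "suspect"
    else
        echo "fixed"
    fi
}

thp_regression_suspect 6.10.3   # prints suspect
thp_regression_suspect 6.12     # prints fixed
```

On a live system the argument could come from `uname -r | cut -d- -f1`, keeping in mind that distribution kernels may carry backported fixes that a version check alone cannot see.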

Data from other sources

高通 X Elite Oryon 微架构评测:走走停停 by JamesAslan:

  • Intel Core 13700K Raptor Cove: 14.56
  • Qualcomm X1E80100 Oryon: 14.20
  • AMD Ryzen 7700X Zen 4: 13.97
  • Intel Core 12700K Golden Cove: 13.70
  • Apple M2 Avalanche+Blizzard: 12.64
  • AMD Ryzen 5950X Zen 3: 11.86
  • Apple M1 Firestorm+Icestorm: 11.20
  • Qualcomm 8 Gen 2 Cortex-X3: 9.91

Running SPEC CPU2017 on Chinese CPUs, and More

  • AMD Ryzen 9 7950X3D Non-VCache: 15.4
  • Intel Core Ultra 7 258V Lion Cove: 13.9
  • Intel Core Ultra 7 155H Redwood Cove: 12
  • Intel Core Ultra 7 258V Skymont: 7.94
  • Intel Core i5-6600K Skylake: 7.92
  • Intel Core Ultra 7 155H Crestmont: 6.86
  • Loongson 3A6000 LA664: 5.49
  • Mediatek Genio 1200 Cortex A78: 5.09
  • Intel Core Ultra 7 155H Low Power Crestmont: 4.32
  • AMD FX-8150: 3.63
  • Loongson 3A5000 LA464: 3.38
  • Intel Celeron J4125 Goldmont Plus: 2.45
  • Amlogic S922X Cortex A73: 2.01
  • Zhaoxin KaiXian KX-6640MA: 1.97
  • Mediatek Genio 1200 Cortex A55: 1.01

Running SPEC CPU2017 at Chips and Cheese?

  • AMD Ryzen 9 9950X: 19.8
  • AMD Ryzen 9 7950X3D Non-VCache: 15.3
  • AMD Ryzen 9 7950X3D VCache: 12.7
  • Intel Core Ultra 7 155H Redwood Cove: 11.2
  • AMD Ryzen 9 3950X 3.5GHz: 7.26
  • Intel Core Ultra 7 155H Crestmont: 5.83
  • Ampere Altra: 5.62
  • AmpereOne: 4.29
  • AMD FX-8150: 3.47

The AMD Ryzen 9 9950X and Ryzen 9 9900X Review: Flagship Zen 5 Soars - and Stalls

  • AMD Ryzen 9 9950X Zen 5: 17.72
  • Intel Core i9-14900K Raptor Cove: 16.90
  • AMD Ryzen 9 7950X Zen 4: 14.26

Snapdragon X Elite Qualcomm Oryon CPU Design and Architecture Hot Chips 2024

  • Qualcomm X Elite Oryon on Linux: 17.77
  • Qualcomm X Elite Oryon on Windows: 16.66

ARM Cortex X1 微架构评测(上):向山进发

  • Zen 3 @ 4.95 GHz: 11.9
  • Firestorm @ 3.0 GHz: 11.2
  • Cortex X1 @ 3.0 GHz: 8.9
  • Cortex A78 @ 2.4 GHz: 5.9

极客湾•麒麟 9010,测评汇总:IPC 性能,巨幅提升!CPU 能效全频段领先,麒麟 9000S!

  • Huawei Kirin 9010: 7.77
  • Huawei Kirin 9000S: 7.12

Performance comparison across GCC and LLVM versions

Performance of several compiler combinations at -O3 on Intel i9-14900K @ 5.7 GHz:

Benchmark        GCC 12  LLVM 20  LLVM 19  LLVM 18
503.bwaves_r     75.1    70.9     73.1     73.2
507.cactuBSSN_r  14.4    13.3     13.6     13.6
508.namd_r       9.24    10.5     10.6     10.5
510.parest_r     14.6    14.6     14.6     14.4
511.povray_r     14.6    13.7     13.7     13.7
519.lbm_r        12.1    11.2     11.2     11.2
521.wrf_r        13.5    14.0     13.3     13.3
526.blender_r    12.0    11.8     11.9     11.9
527.cam4_r       15.4    13.2     12.9     12.9
538.imagick_r    10.2    11.8     11.9     12.3
544.nab_r        12.6    8.47     8.22     8.22
549.fotonik3d_r  24.5    24.0     21.0     21.2
554.roms_r       14.1    14.3     13.7     13.7
geomean          15.4    14.8     14.6     14.6

Full data:

Note: GCC means GCC + GFortran; LLVM means Clang + Flang-new.
