To get the current time in nanoseconds, the runtime implements runtime.nanotime as a call to the OS (the vdso on GOOS=linux, libc on GOOS=darwin).

Profiling and tracing benefit from efficient access to a clock, but don't require a particular scale or offset. (We're in the habit of rescaling it after the fact, by comparing against another clock.) On GOARCH=amd64, the runtime implements runtime.cputicks as RDTSCP (or RDTSC plus memory fences). Reading the timer alone is a bit faster than reading the timer and then doing the scaling math.

But on GOARCH=arm64, for GOOS=linux, darwin, and several others, we implement runtime.cputicks as a call to runtime.nanotime. It looks like AArch64's CNTVCT_EL0 register [1] might have what we need for cputicks (monotonic, also monotonic across cores, static frequency). The Linux vdso [2] appears to use it (plus ISB), or the self-synchronizing version CNTVCTSS_EL0.

There's also CNTVCT for GOARCH=arm [3].

Initial benchmarking on an Apple M1 (darwin) and on Raspberry Pi models 5 and 3B (linux) shows that reading CNTVCT_EL0 is a bit faster than calling nanotime (14 vs 24ns, 28 vs 43ns, and 45 vs 126ns on those three platforms, respectively). I don't know how much of a difference this will make in complete applications, but cheaper clocks mean less worry when adding profiling/tracing points.

CC @golang/runtime @golang/arm

[1] https://developer.arm.com/documentation/ddi0595/2021-03/AArch64-Registers/CNTVCT-EL0--Counter-timer-Virtual-Count-register

[2] https://elixir.bootlin.com/linux/v6.9/source/arch/arm64/include/asm/vdso/gettimeofday.h#L69

[3] https://developer.arm.com/documentation/ddi0601/2024-03/AArch32-Registers/CNTVCT--Counter-timer-Virtual-Count-register

Comment From: mauri870

I wonder if https://www.felixcloutier.com/x86/rdtsc can be used on x86 as well.

Comment From: rhysh

Yes, GOARCH=amd64 and 386 use it (or RDTSCP) already! I don't know about other architectures, but if we use a more direct/efficient cputicks implementation on arm/arm64 then we'll have covered all of the first-class ports.

https://github.com/golang/go/blob/go1.22.4/src/runtime/asm_amd64.s#L1174

https://github.com/golang/go/blob/go1.22.4/src/runtime/asm_386.s#L870

Comment From: mauri870

Thanks, I was unaware we were already using it! I think covering all the first class ports would be great.

Comment From: rsc

If you use CNTVCT_EL0, how do you know what speed the timer runs at? Empirically it seems to be 24MHz on my M3 Mac, which is not really what we want for nanosecond level timing.

% go run rsc.io/tmp/hwtimer@latest
go: downloading rsc.io/tmp v0.0.0-20250914141124-9178c1a78e6c
go: downloading rsc.io/tmp/hwtimer v0.0.0-20250916155736-44772ddfbd0a
2425062 in 101.0ms (24.0MHz)
2430390 in 101.3ms (24.0MHz)
2428210 in 101.2ms (24.0MHz)
2429548 in 101.2ms (24.0MHz)
2426937 in 101.1ms (24.0MHz)
2428663 in 101.2ms (24.0MHz)
2427182 in 101.1ms (24.0MHz)
2430347 in 101.3ms (24.0MHz)
2429233 in 101.2ms (24.0MHz)
2427459 in 101.1ms (24.0MHz)
%