This is an excellent idea as a blog. Kudos!
Isn't a dedicated worker pool with priority queues enough to get predictable P99 without leaving Go?
If you fix N workers and control dispatch order yourself, the scheduler barely gets involved — no stealing, no surprises.
The inter-goroutine handoff is ~50-100ns anyway.
Isn't the real issue using `go f()` per request rather than something in the language itself?
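For reference, a minimal sketch of the setup I mean: a fixed number of workers pulling from one priority heap, so dispatch order is decided by the pool rather than by a per-request `go f()`. The names (`Pool`, `Submit`, `jobHeap`) are made up for illustration, and the queue is deliberately unbounded, which sidesteps rather than answers what happens when producers outrun the workers. The workers are still ordinary goroutines, so the runtime scheduler is not out of the picture entirely.

```go
package main

import (
	"container/heap"
	"fmt"
	"sync"
	"sync/atomic"
)

// job is one unit of work; seq keeps FIFO order within a priority level.
type job struct {
	prio int
	seq  uint64
	run  func()
}

// jobHeap implements heap.Interface: higher prio first, then older first.
type jobHeap []job

func (h jobHeap) Len() int      { return len(h) }
func (h jobHeap) Swap(i, j int) { h[i], h[j] = h[j], h[i] }
func (h jobHeap) Less(i, j int) bool {
	if h[i].prio != h[j].prio {
		return h[i].prio > h[j].prio
	}
	return h[i].seq < h[j].seq
}
func (h *jobHeap) Push(x any) { *h = append(*h, x.(job)) }
func (h *jobHeap) Pop() any {
	old := *h
	last := old[len(old)-1]
	*h = old[:len(old)-1]
	return last
}

// Pool (hypothetical name) runs jobs on a fixed set of worker goroutines.
type Pool struct {
	mu     sync.Mutex
	cond   *sync.Cond
	queue  jobHeap
	seq    uint64
	closed bool
	wg     sync.WaitGroup
}

func NewPool(workers int) *Pool {
	p := &Pool{}
	p.cond = sync.NewCond(&p.mu)
	p.wg.Add(workers)
	for i := 0; i < workers; i++ {
		go p.worker()
	}
	return p
}

// Submit never blocks: the queue is unbounded (an assumption of this sketch).
func (p *Pool) Submit(prio int, run func()) {
	p.mu.Lock()
	p.seq++
	heap.Push(&p.queue, job{prio: prio, seq: p.seq, run: run})
	p.mu.Unlock()
	p.cond.Signal()
}

// Close lets the workers drain the queue and then waits for them to exit.
// Submit must not be called after Close.
func (p *Pool) Close() {
	p.mu.Lock()
	p.closed = true
	p.mu.Unlock()
	p.cond.Broadcast()
	p.wg.Wait()
}

func (p *Pool) worker() {
	defer p.wg.Done()
	for {
		p.mu.Lock()
		for len(p.queue) == 0 && !p.closed {
			p.cond.Wait()
		}
		if len(p.queue) == 0 { // closed and fully drained
			p.mu.Unlock()
			return
		}
		next := heap.Pop(&p.queue).(job)
		p.mu.Unlock()
		next.run()
	}
}

func main() {
	p := NewPool(4)
	var done atomic.Int64
	for i := 0; i < 100; i++ {
		p.Submit(i%3, func() { done.Add(1) })
	}
	p.Close()
	fmt.Println("jobs completed:", done.Load())
}
```

With more than one worker, only the order in which jobs are dequeued is guaranteed, not the order in which they finish.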
No. Eventually the queues fill up and goroutines block waiting to enqueue an element, landing you right back at unfair scheduling.
https://github.com/php/frankenphp/pull/2016 if you want to see a “correctly behaving” implementation that hits 100% CPU usage under contention.
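To make the failure mode concrete: with a bounded channel as the queue, a plain `q <- job` parks the sending goroutine once the buffer is full. Below is a rough sketch, assuming a hypothetical `TrySubmit` helper and `ErrQueueFull` error, of detecting that condition with `select`/`default` instead of blocking. This is a generic Go idiom, not what the linked PR does, and it trades blocking for rejected work rather than fixing fairness.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrQueueFull is a hypothetical sentinel error for this sketch.
var ErrQueueFull = errors.New("queue full")

// TrySubmit enqueues job without blocking. A plain `q <- job` would
// instead park the calling goroutine until a worker frees a slot.
func TrySubmit(q chan<- func(), job func()) error {
	select {
	case q <- job:
		return nil
	default:
		return ErrQueueFull
	}
}

func main() {
	// Capacity 2 and no workers draining, so the third submit finds it full.
	q := make(chan func(), 2)
	for i := 0; i < 3; i++ {
		if err := TrySubmit(q, func() {}); err != nil {
			fmt.Printf("job %d rejected: %v\n", i, err)
			continue
		}
		fmt.Printf("job %d queued\n", i)
	}
}
```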
I enjoyed both these GopherCon talks:
GopherCon 2018: The Scheduler Saga - Kavya Joshi https://www.youtube.com/watch?v=YHRO5WQGh0k
GopherCon 2017: Understanding Channels - Kavya Joshi https://www.youtube.com/watch?v=KBZlN0izeiY
https://m.youtube.com/watch?v=-K11rY57K7k - Dmitry Vyukov — Go scheduler: Implementing language with lightweight concurrency
This one notably also explains the design considerations behind Go's M:N:P scheduling in comparison to other schemes, and which specific challenges it tries to address.
Good videos, thanks for sharing!
My biggest issue with Go is its incredibly unfair scheduler. No matter what load you have, P99 and especially P99.9 latency will be higher than in any other language. The way it steals work guarantees that requests “in the middle” will be served last.
It’s a problem that only Go can solve, but solving it means giving up some speed on requests that are currently handled immediately and shouldn’t be. So overall latency would go up while P99 would drop precipitously. Thus, they’ll probably never fix it.
If you have a system that requires predictable latency, Go is not the right language for it.
> If you have a system, Go is not the right language for it.
FTFY
It lacks a custom scheduler option like the ones the Java and .NET runtimes offer; unfortunately, that is too many knobs for the usual Go approach to language design.
It could have an interface for how the scheduler is supposed to behave, a `runtime.SetScheduler()` or something, but it won't happen.
> If you have a system that requires predictable latency, Go is not the right language for it.
Having a garbage collector already makes this the case; it is a known trade-off.
This may have been practically true for a long time, but as Java's ZGC garbage collector proves, this is not a hard truth.
You can have stop-the-world pauses that are independent of heap size, and thus predictable latency (trading off some throughput, of course, but that is almost fundamental).
> If you have a system that requires predictable latency, Go is not the right language for it.
I presume that's by design, to trade off against other things Google designed it for?
No clue. All I know is that people complain about it every time they benchmark.