Hacker News

9 hours ago by marcodiego

Artificial benchmarks show a 4% improvement in some cases, which is huge. There aren't many tests to demonstrate the improvements on common use cases but there are already reports of subjective improvements with games having less stutter: https://www.phoronix.com/forums/forum/phoronix/latest-phoron...

8 hours ago by phkahler

- For requeue, I measured an average of 21% decrease in performance compared to the original futex implementation. This is expected given the new design with individual flags. The performance trade-offs are explained at patch 4 ("futex2: Implement requeue operation").

It's not clear that this is a win overall. But hey, if Windows games run better under wine...

9 minutes ago by Denvercoder9

The old interface will remain available, so if requeue performance is important, you can keep using that. Also at least one kernel maintainer considers requeue to be historical ballast [1], so it's probably not used much anymore.

[1] https://lwn.net/ml/linux-kernel/87o8thg031.fsf@nanos.tec.lin...

8 hours ago by derefr

Better yet, since Wine is there to serve as a “smart” shim, it can offer a launch config parameter for the app, that will get Wine to point its futex symbols at its previous userland impl. (Sort of like the various Windows compatibility-mode flags.) So even Windows apps that do tons of requeues (and so presumably would be slowed down by unilaterally being made to do kernel futexes) would be able to be tuned back up to speed. And distros of Wine that provide pre-tuned installer scripts for various apps, like CrossOver, can add that flag to their scripts, so users don’t even have to know.

4 hours ago by gpderetta

IIRC glibc stopped using requeue a while ago for condition variable wait morphing, so maybe requeue is not a hugely important use case anymore.

6 hours ago by xxs

> 4% performance improvement for up to 800 threads.

800 threads on something like WaitForMultipleObjects is already a horrific design if you even remotely care about performance. The only viable case for 800 threads (and wine) would be blocking IO, which is an extremely boring one.

5 hours ago by usefulcat

Sure, but that's irrelevant for wine, which doesn't get to choose how the things that it's running (e.g. games) are implemented.

3 hours ago by xxs

Of course. Yet I would be massively surprised to see such heavily multithreaded games with massive contention on WaitForMultipleObjects. No one programs that way (again, save for blocking IO... and then it totally doesn't matter).

4 hours ago by asveikau

IIRC, WaitForMultipleObjects can only block on 63 handles. 800 threads blocked on 63 handles seems pretty weird.

3 hours ago by xxs

>IIRC, WaitForMultipleObjects can only block on 63 handles

Yup (64 though), it's an extra annoying limitation. It has been there since the 90s, WinNT (can't recall the win95 version as I never used it with win95). It's limited by MAXIMUM_WAIT_OBJECTS, which is 64.

For instance Java implements non-blocking IO (java.nio.Selector) by using multiple threads and sync between them to effectively achieve linux' epoll.
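A rough sketch of just the partitioning step behind that kind of workaround, assuming each group would then be handed to its own helper thread. The function name and the 200 stand-in handles are made up for illustration; this is not how Java or Wine actually implement it.

```python
MAXIMUM_WAIT_OBJECTS = 64  # the Windows limit quoted above


def split_into_wait_groups(handles, limit=MAXIMUM_WAIT_OBJECTS):
    """Partition handles into groups small enough for one wait call each."""
    return [handles[i:i + limit] for i in range(0, len(handles), limit)]


handles = list(range(200))  # hypothetical stand-ins for real handles
groups = split_into_wait_groups(handles)
print(len(groups))  # 4 groups: 64 + 64 + 64 + 8
```

Every extra group costs a thread plus some cross-thread signalling, which is why the limit is annoying in practice.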

----

My point was mostly that the 'huge' 4% happens with massive amounts of threads and WaitForMultipleObjects won't see even that... so kind of click-baity. Side note: sync between threads can be done better than by relying on OS primitives, aside from real wait/notify stuff (which is indeed implemented via futex).

8 hours ago by savant_penguin

Really curious, why is 4% improvement huge?

7 hours ago by chalst

"Huge" means different things in different contexts. A compiler optimisation that delivered a 4% increase in speed for a whole class of well-tuned C programs would be regarded as huge by people who follow compiler technology, but would not strike most users of computers as a big deal. I take it "huge" is meant in this sense of extraordinary within a class of potential improvements.

If you can accumulate a series of 4% improvements, then even the computer consumer will start to see the significance.
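To put rough numbers on that, here is a small sketch that assumes every improvement is an independent 4% speedup and that the gains multiply. That is an idealisation, not something measured in this thread.

```python
def compounded_speedup(step: float, count: int) -> float:
    """Overall speedup after `count` independent improvements of `step` each."""
    return (1.0 + step) ** count


for n in (1, 5, 10, 18):
    print(f"{n:2d} improvements of 4% -> {compounded_speedup(0.04, n):.2f}x")
```

Under that assumption ten such steps give roughly 1.48x, and it takes about eighteen to double performance.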

6 hours ago by jhgb

A 4% improvement of the 100m dash world record would have been considered phenomenal, I imagine.

6 hours ago by Someone

I can’t find where the 4% number comes from, but I expect that it comes from a micro benchmark that does little more than futex calls. Reason: if it is from a benchmark that represents real-world use, real-world use must spend at least 4% of its time in futex calls. If that were the case, somebody would have seen that, and work would have been spent on this.

So no, this won’t give you 4% in general.
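The argument above is basically Amdahl's law. A minimal sketch, assuming "4% improvement" means the whole program runs 1.04x faster (the thread does not say how the number was measured):

```python
def overall_speedup(fraction: float, local_speedup: float) -> float:
    """Amdahl's law: whole-program speedup when `fraction` of the runtime
    gets faster by a factor of `local_speedup`."""
    return 1.0 / ((1.0 - fraction) + fraction / local_speedup)


def min_fraction_needed(target_speedup: float) -> float:
    """Smallest share of runtime the optimised part must account for.

    Even if the optimised part became free, the remaining time is 1 - f,
    so f >= 1 - 1 / target_speedup.
    """
    return 1.0 - 1.0 / target_speedup


print(f"{min_fraction_needed(1.04):.4f}")    # share of runtime needed for a 1.04x gain
print(f"{overall_speedup(0.01, 2.0):.4f}")   # 1% of runtime made twice as fast
```

So a program would need to spend at least about 3.8% of its time in futex calls to see 4% overall, and a program that spends 1% there gains about 0.5% even if those calls become twice as fast.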

8 hours ago by marcodiego

Because you don't have to change one line of your previously running software or replace your hardware to get this benefit. Of course, you will need a new version of the kernel and wine, but it is only a matter of time before those hit distro repositories. Also, consider that this will accumulate when combined with new user space libs, servers, compilers, drivers... Software will improve without any change just by keeping your distro updated.

Yes, a 4% improvement is huge.

8 hours ago by sp332

Well, it's a relative measure. There's been a ton of optimization already. All the low-hanging fruit is gone. It's probably been a long time since a single optimization gained 4%.

7 hours ago by tombert

I don't know much about Wine, so I'm speaking in generalities, but I've worked jobs where we spent weeks to get <10% performance increases. Very often that last bit of optimization is the hardest part.

9 hours ago by c7DJTLrn

This might get me some angry replies, but does the feature set of Linux ever end? Will there ever be a day where someone says "you know what, Linux isn't built for this and your code adds too much complexity, let's not". It really just seems like Linux has become a dumping ground for anyone to introduce whatever they feel like as long as it passes code review, with nobody looking at the big picture.

6 hours ago by DSMan195276

Features get rejected all the time, even some pretty substantial ones. That said I think there's two things to keep in mind:

1. If someone is willing to take the time to develop and submit something to the kernel, it probably has some redeeming factors to it. The original idea or code might not get merged, but something similar to meet those needs likely can be if it's really something the kernel can't do.

2. The modularity of the Linux Kernel build system means that adding new features doesn't have to be that big of a deal, since anybody who doesn't want them can just choose not to compile them in. It's not always that simple, but for a lot of the added features it is.

9 hours ago by thedracle

I mean, the futex2 proposal has been around for over a year.

A lot of the history and discussion can be found here: https://lkml.org/lkml/2021/4/27/1208

I personally feel like the kernel maintainers have been very conservative over the years regarding the introduction of new system calls, but then again I've maybe gotten used to the trauma of being familiar with windows internals.

an hour ago by stjohnswarts

That's a fair criticism but the kernel is very modular and the legacy stuff that no one will maintain gets cut all the time, so I think it's all a fair tradeoff. I'm not angry, just pragmatic because if you don't evolve you die, look at how much market share BSD has lost over the past 15 years

7 hours ago by Ericson2314

This shouldn't be getting down-votes.

Consider the slogan "composition not configuration", Linux is squarely the latter. All the hardware, all the different use-cases, that are just added to what's mostly a monolith. You can disable them, but not really pull things apart.

Now, Linux also has some of the most stringent code review in the industry, and they are well aware of issues of maintainability, but fundamentally C is not a sufficiently expressive language for them to architect the code base in a compositional manner.

What really worries me is that as the totality of the userland uses an increasingly large interface to the kernel, it will be very hard to ever replace Linux with something else when we do have something more compositional. I am therefore interested in, and somewhat working on, stuff that hopefully would end up with a multi-kernel NixOS (think Debian GNU/kFreeBSD) to try to get some kernel competition going again, and incentivize some userland portability.

Note that I don't think portability <===> just use old system calls. I am no Unix purist. I would really like a kernel that does add new interfaces but also dispenses with the old cruft; think Rust vs C++.

an hour ago by oalae5niMiel7qu

> but fundamentally C is not a sufficiently expressive language for them to architect the code base in a compositional manner.

Plan 9 would like to have a word with you.

5 hours ago by fabianhjr

> Consider the slogan "composition not configuration", Linux is squarely the latter. All the hardware, all the different use-cases, that are just added to what's mostly a monolith. You can disable them, but not really pull things apart.

Linux is a Monolithic Kernel, sounds like you are more interested in Microkernels. https://www.wikiwand.com/en/Microkernel

One microkernel alternative could be Redox-OS: https://www.wikiwand.com/en/Redox_(operating_system)

13 minutes ago by bruckie

You could write a monolithic kernel using composition for your code structure. It would just be composition at build time rather than at runtime.

It seems pretty hard to write a microkernel without using composition, though, since the runtime requirements kind of dictate your code structure.

9 hours ago by alexhutcheson

Bummer that this doesn’t seem to support or explicitly plan for the proposed FUTEX_SWAP operation[1] (unless I’m missing it).

[1] https://lwn.net/Articles/824409/

8 hours ago by dundarious

I haven't read anything about FUTEX_SWAP since Aug 2020. Do you know if there is ongoing progress?

7 hours ago by aasasd

I lowkey have to wonder why Linux seemingly goes all-in on monster functions that implement a variety of operations depending on the arguments (as far as I can see from casual encounters). In my experience, such code often both reads poorly on the calling side and looks very messy inside. I came to prefer making distinct functions that spell out their intended action in the names. Windows apparently went the same path.

E.g.: syscall doesn't accept multiple handles and other bells and whistles? Well duh, it's not supposed to—just make a new one that does.

Of course, it's probably too late to change this design choice now...

7 hours ago by whatshisface

I can only offer speculation, but it could be that the alternative, a complicated API that requires several subsequent calls to use, is seen as more of a risk for bad programming and state mis-tracking. We definitely don't want programs leaving the kernel in a bad state by calling the first half of a sequence of syscalls then later on getting distracted or crashing and never finishing it. A single call gives atomicity guarantees that are otherwise hard to come by in languages without state tracking primitives like futures or borrow checked references.

Another thing to remember is that in C, it's easier to make a call more specific, by hardcoding values than it is to make it more general, by picking between variants in a branch. I can see programmers wishing the API was more dynamic while writing a giant if-else tree to select the appropriate version of the syscall, but the opposite problem, when the same invocation is always needed but the API takes a lot of dynamic options, is not as bad because it is easy to write a bunch of literals in the line that calls the function. (If C had named arguments that approach would be almost flawless.)

7 hours ago by aasasd

I don't see how `an_op_syscall(param)` is ‘more complicated’ or requires more calls or state than `a_syscall(OP_NAME, param)`.

7 hours ago by whatshisface

Since C doesn't have first-class functions, if you needed to dynamically pick between values of OP_NAME you'd be glad it was a parameter. Imagine ten option parameters, which your program needs to choose between depending on the circumstances. That's a huge branch tree. Imagine choosing between two options: you'd need to keep track of function pointers, or at least write your own enum. C makes it far nicer to handle enum values than to handle functions, because ints are first-class. I agree that there's no difference between a symbol in the name and a value in the args when your program always needs the same value, but sometimes it needs dynamism.

5 hours ago by elynl

It's still a thing! kernel: https://repo.or.cz/linux/zf.git/shortlog/refs/heads/winesync wine: https://repo.or.cz/wine/zf.git/shortlog/refs/heads/fastsync2

They decided to rename it since the kernel maintainers don't want NT interfaces and having NT in the name doesn't help

5 hours ago by Cloudef

That's good to hear. Darling devs also went with the kernel module approach. It's weird that wine devs did not do the same until now. Wineserver is nice but, as we see, it's sometimes impossible to do accurate emulation of some APIs there.

https://docs.darlinghq.org/internals/basics/system-call-emul...

9 hours ago by maxfurman

Can someone please ELI5 what is a futex? Is it like a mutex?

9 hours ago by hyperman1

Futex stands for fast userspace mutex. It is a basic building block for implementing a mutex (amongst others).

Afaik, if locking a mutex succeeds, futexes avoid a context switch to and from the kernel. The fast path happens completely in userspace. Only when a mutex lock causes a thread to sleep, the kernel is needed. All of this improves mutex performance.

There are other locking primitives than mutexes, and the current futex interface is not optimal for implementing common windows ones. Hence this proposal.
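The fast path / slow path split can be modelled in a few lines. This is a teaching toy, not the real futex API: a Python lock stands in for atomic CPU instructions, a condition variable stands in for the kernel wait queue, and `kernel_calls` (a made-up counter) tracks how often a real implementation would have had to make the futex() system call.

```python
import threading


class ToyFutexMutex:
    """Toy model of a futex-style mutex (0 = free, 1 = locked)."""

    def __init__(self):
        self._state = 0                       # the "userspace integer"
        self._guard = threading.Lock()        # stands in for atomic instructions
        self._queue = threading.Condition(self._guard)  # the "kernel wait queue"
        self._waiters = 0
        self.kernel_calls = 0                 # modelled futex wait/wake calls

    def lock(self):
        with self._guard:
            if self._state == 0:              # fast path: uncontended, no kernel
                self._state = 1
                return
            self.kernel_calls += 1            # slow path: would call futex wait
            self._waiters += 1
            while self._state != 0:
                self._queue.wait()
            self._waiters -= 1
            self._state = 1

    def unlock(self):
        with self._guard:
            self._state = 0
            if self._waiters:                 # wake only if someone is sleeping
                self.kernel_calls += 1        # would call futex wake
                self._queue.notify()


m = ToyFutexMutex()
for _ in range(1000):
    m.lock()
    m.unlock()
print(m.kernel_calls)  # 0: the uncontended case never touches the "kernel"
```

A real implementation encodes "locked with waiters" in the integer itself so that unlock can decide without extra bookkeeping, but the idea is the same: the kernel is only involved when a thread actually has to sleep or be woken.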

5 hours ago by xxs

>"context switch to and from the kernel"

This is a common misconception, a "context" switch is a lot more expensive than a "mode" switch that goes from user mode to the kernel mode (and back).

Of course, if the futex has to wait, it'd be a context switch for real.

5 minutes ago by ece

ESYNC is the way this is currently done, it's hit and miss in that it doesn't work unless the number of open files is set high enough for some games. It's also slower than this proposal.

9 hours ago by whoisburbansky

It's a primitive that lets you build mutexes; you've got an atomic integer in userspace, attached to a wait queue in kernel space. Eli Bendersky has a pretty good rundown: https://eli.thegreenplace.net/2018/basics-of-futexes/

6 hours ago by bogomipz

This is a good post, thanks. One thing that confuses me is that these two sentences seem to contradict each other:

>"Before the introduction of futexes, system calls were required for locking and unlocking shared resources (for example semop)."

>"A user-space program employs the futex() system call only when it is likely that the program has to block for a longer time until the condition becomes true"

The first indicates futexes are an improvement because they avoid the overhead of a system call, but the second states that futex() is a system call.

Is this similar to how a vDSO or vsyscall works, where a page from kernel address space is mapped into all user processes? In other words, is it similar to how gettimeofday works?[1]

[1] https://0xax.gitbooks.io/linux-insides/content/SysCall/linux...

5 hours ago by kccqzy

It's not a contradiction. The two sentences are perfectly clear to me. Of course futex() is itself a system call, but in case of no contention you don't need to invoke the system call. That's why futex() is an improvement. The key phrase is "only when" in the second sentence.

9 hours ago by corbet

See https://lwn.net/Articles/823513/ for a look at both the old and proposed new futex interfaces.

8 hours ago by brandmeyer

The best introduction is probably the original presentation paper from 2002.

Fuss, Futexes and Furwocks: Fast Userlevel Locking in Linux

https://www.kernel.org/doc/ols/2002/ols2002-pages-479-495.pd...

Another great paper that came after some additional experience by glibc developers:

Futexes are Tricky, by Ulrich Drepper

https://www.akkadia.org/drepper/futex.pdf

7 hours ago by MaxBarraclough

I see Drepper resisted the urge to pluralise futex to futices.

8 hours ago by zamalek

> The use case lies in the Wine implementation of the Windows NT interface WaitMultipleObjects.

This can't only be a win for emulating Windows, surely (future) Linux native code could benefit from this too?

8 hours ago by CoolGuySteve

Linux's answer is io_uring with eventfd. I've noticed it has higher latency than WaitForMultipleObjects though. I've had a hard time getting the 99th percentile latency below 5usec.

It would be nice if io_uring polling had a futex-like feature that spun in userspace rather than having to syscall io_uring_wait every time. I feel like that feature would be more universal than futex2.

Adding an atomic int somewhere in the userspace would add ~100ns in the kernel path but save many microseconds in the userspace-spin path during high contention so it seems like a win to me.

7 hours ago by the8472

My understanding of io_uring is that you can already poll the CQ with atomics without any syscall. io_uring_enter() is only needed to inform the kernel that new SQ entries exist and that too can be replaced by polling via IORING_SETUP_SQPOLL in privileged processes.

7 hours ago by vardump

Spinning consumes a lot of power, so I'd guess one would only want to do that when very low latency is truly required. I certainly wouldn't want it to be the default for the generic case.

Of course a low latency option should also exist for those cases that require it and currently that pretty much means spinning.

7 hours ago by vardump

> This can't only be a win for emulating Windows, surely (future) Linux native code could benefit from this too?

That would certainly help porting Windows software to Linux.

3 hours ago by twoodfin

Unclear to me from the writeup: Would it be possible to implement the original futex() API on top of futex2() or is Linux doomed to preserve most of two complete futex implementations?

3 hours ago by guipsp

Would it be possible? Probably yes, but not performant.
