21 comments

  • dietr1ch 11 hours ago

    > - Writing to a fresh file is slower than writing to an existing file

    > mold can link a 4 GiB LLVM/clang executable in ~1.8 seconds on my machine if the linker reuses an existing file and overwrites it. However, the speed decreases to ~2.8 seconds if the output file does not exist and mold needs to create a fresh file. I tried using fallocate(2) to preallocate disk blocks, but it didn't help. While 4 GiB is not small, should creating a file really take almost a second?

    1s difference here is insane these days. There must be something weird going on, even if the physical disk is one of those ancient spinning things.

    • Sesse__ 10 hours ago

      He doesn't specify what file system he's using, but offhand, you would assume that what actually takes time isn't creating the file, but rather allocating all the blocks. A good first step would be to reproduce the issue and take a profile of both cases?
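
A rough way to reproduce the two cases might look like the sketch below. It is not mold's code path: it uses plain `write` calls, and the helper names `time_write` and `compare` and the file name `output.bin` are made up for illustration. Timings will depend heavily on the file system, the page cache state, and whether the temporary directory is on tmpfs.

```python
import os
import tempfile
import time


def time_write(path: str, size: int, chunk: int = 1 << 20) -> float:
    """Write `size` zero bytes to `path` and return the elapsed seconds.

    The file is opened without O_TRUNC, so an existing file is
    overwritten in place instead of being recreated.
    """
    view = memoryview(b"\0" * chunk)
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    try:
        remaining = size
        while remaining > 0:
            # os.write may write fewer bytes than requested, so loop.
            written = os.write(fd, view[: min(chunk, remaining)])
            remaining -= written
    finally:
        os.close(fd)
    return time.perf_counter() - start


def compare(directory: str, size: int) -> dict:
    """Time a write to a fresh file, then a rewrite of that same file."""
    path = os.path.join(directory, "output.bin")  # hypothetical name
    if os.path.exists(path):
        os.unlink(path)
    fresh = time_write(path, size)   # the file does not exist yet
    reused = time_write(path, size)  # the file exists and is overwritten
    return {"fresh": fresh, "reused": reused}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        print(compare(d, 32 << 20))
```

Running each case under a profiler such as `perf` would then show where the extra time goes, if the difference reproduces at all.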

      • the8472 10 hours ago

        xfs, ext4 and btrfs all have delayed allocation. so they should only synchronously allocate the blocks if there is memory pressure or if they're triggering those well-meant (but in this case counterproductive) auto_da_alloc heuristics.

  • seba_dos1 9 hours ago

    It must be stressful to write to LKML knowing that Phoronix is hiding behind a corner, waiting to sensationalize every little thing you write that sounds like it could bring in free clicks, even before any sensible discussion of what you wrote has a chance of happening.

  • vlovich123 10 hours ago
  • rbanffy 11 hours ago
  • stefanos82 11 hours ago

    Our boy @rui314 developed such a fast linker that the kernel can't catch up! LOL ^_^ such an amazing job, well done!

  • devmor 11 hours ago

    The issue with slower writing to newly created files is extremely interesting. I hope I get to see a follow-up on that once it's figured out.

    • the8472 10 hours ago

      presumably the existing file is already backed by pages in the page cache while the new one still has to be allocated (+ whatever the io subsystem is doing).

  • magicalhippo 11 hours ago

    I'm interested in knowing what kind of workloads this is targeting, with multi-GB executables being built at such a pace that a 0.2 second wait between them is unacceptable.

    • NobodyNada 10 hours ago

      Performance optimization is not always just low-hanging fruit. When you start trying to optimize something, there are often large bottlenecks to clean up -- "we can add a hashmap here to turn this O(n^2) function into O(n) and improve performance by 10x" kind of thing. But once you've got all the easy stuff, what's left is the little things -- a 1% improvement here, a 0.1% improvement there. If you can find 10 of those 1% improvements, now you've got a 10% performance improvement. 0.2 seconds on its own isn't that much, but the reason mold is so fast is that the author has found a lot of 0.2 second improvements.
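
As a generic illustration of the "add a hashmap" kind of fix (not taken from mold), here is a sketch that finds the first repeated element of a list in two ways. The function names are made up for this example, and the O(n) figure for the dict version is the expected cost, assuming hashable items.

```python
def first_duplicate_quadratic(items):
    """Return (i, j) for the first element j that repeats an earlier element i.

    Compares every element against all earlier ones: O(n^2) comparisons.
    """
    for j in range(len(items)):
        for i in range(j):
            if items[i] == items[j]:
                return (i, j)
    return None


def first_duplicate_linear(items):
    """Same result, but remembers first occurrences in a dict (a hashmap).

    Each lookup and insert is expected O(1), so the whole scan is O(n).
    """
    first_seen = {}
    for j, item in enumerate(items):
        if item in first_seen:
            return (first_seen[item], j)
        first_seen[item] = j
    return None


if __name__ == "__main__":
    data = [3, 1, 4, 1, 5]
    print(first_duplicate_quadratic(data), first_duplicate_linear(data))
```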

      And even disregarding that, the linked LKML post mentions LLVM/clang as a case of building a 4GB executable. If you've ever built the LLVM project from source, there are about 50ish (?) binaries that need to be linked at the end of the build process -- it's not just clang, but all sorts of other tools, intermediate libraries and debugging utilities. So that is an example of a workload with "multi-GB executables being built at such a pace" -- saving 0.2 seconds per executable saves something like 10 seconds on the build.
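
The arithmetic in this comment can be checked with a small sketch. The function names are made up, and it assumes each 1% improvement applies to whatever runtime is left after the previous ones, which is why ten of them come out slightly under 10%.

```python
def compounded_saving(fraction_saved: float, count: int) -> float:
    """Fraction of the original runtime removed by `count` successive
    improvements, each removing `fraction_saved` of the remaining runtime."""
    return 1.0 - (1.0 - fraction_saved) ** count


def total_link_saving(seconds_per_link: float, num_binaries: int) -> float:
    """Seconds saved over a build that links `num_binaries` executables."""
    return seconds_per_link * num_binaries


if __name__ == "__main__":
    # Ten 1% improvements remove roughly 9.6% of the runtime in total.
    print(f"{compounded_saving(0.01, 10):.4f}")
    # 0.2 s saved on each of about 50 binaries is about 10 s per build.
    print(f"{total_link_saving(0.2, 50):.1f}")
```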

      • magicalhippo 10 hours ago

        I'm well aware of the joys of optimization, I just haven't come across someone building multi-GB executables at a pace where milliseconds spent linking mattered.

        To me that's an exotic workload which sounds interesting, hence why I'm curious.

        • NobodyNada 9 hours ago

          Well, keep in mind that the full linking step has to be done at the end of an incremental build. So if you're a developer actively working on a project with a 4GB executable, that linking time is part of your edit-compile-test cycle, and you have to wait for it every time you change a line of code.

          The benchmarks on mold's README show that GNU gold takes 33 seconds to link clang, whereas mold takes 1.3 seconds. If you're a developer working on Clang, that's a pretty serious productivity improvement.

    • not_your_vase 11 hours ago

      > 0.2 second wait between them is unacceptable.

      It's more about sending a message, and I support the idea very much.

      It always starts with "it's only a fraction of a second" or "just 100kb more Javascript to load", and suddenly every website pulls in 25MB JS at least, and starting Windows calculator shows a splash screen (on a modern machine), because it takes that long to start up.

      Down with shitty, bloated crapware. Mold FTW.

      • magicalhippo 10 hours ago

        Sure, as I mentioned I'm just genuinely curious what the use-case is.

        For example, running a compiler test suite I could understand, that would be quite impacted. But those tests wouldn't be multi-GB builds for the most part.

        And I could understand being annoyed by it, but the author took steps to work around it, hence finding it unacceptable.

        Far from my day job, so curious to know.

        • not_your_vase 7 hours ago

          I see, at first it sounded more like questioning the project's reason to exist...

          E.g. Chromium, Firefox, and clang with debug symbols included will be quite sizable. Chromium sometimes trips me up when I somehow (usually accidentally) enable the debug config, and the only thing I see is the linker complaining that my 32-bit Arm target will be unhappy with the 6GB+ executable... I imagine some people do this intentionally, hacking on these projects directly.

          • magicalhippo 6 hours ago

            English is not my native language, and us Norwegians are known to be direct.

            Anyway, now that you mention it I do recall the C++ project I worked on way back started to crash the linker due to running out of address space. Had to switch to the 64bit version.

            Though I didn't recompile often; it was slow enough that I reread my code several times before compiling, even in debug mode.

    • Sesse__ 10 hours ago

      Incremental builds of projects with several large binaries (the examples the README gives are MySQL, Clang and Chromium).

    • wmf 11 hours ago

      I guess obidos no longer exists so mold came too late for that.

  • teeray 11 hours ago

    Having no familiarity with Mold, I assumed this was somehow improving linux using observations from slime molds.