I am pretty familiar with object file formats, and I don't get how this "atom" stuff works.
The site says that "atoms" are an improvement over simple section interlacing. But I don't get how you are going to make this leap without changing the object file format. Linkers work on the section level because that is how object files work. Object files have sections, not atoms. Compilers emit sections as their basic, atomic unit of output. Within a section, the code will assume that all offsets referring to other parts of the section will be stable, so you can't chop a section apart without breaking the code.
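To make the concern about stable offsets concrete, here is a toy sketch in Python. It assumes a made-up model in which a section is a list of instruction slots and an intra-section call stores a displacement fixed by the assembler, with no relocation record for the linker to patch. The names `section` and `call_target` are hypothetical and do not correspond to any real object file format or linker.

```python
# Toy model (hypothetical): each entry is (owning function, opcode, displacement).
# The displacement of an intra-section call is fixed at assembly time and has
# no relocation record, so the linker cannot adjust it later.
section = [
    ("f", "call", 2),    # slot 0: f calls the code 2 slots ahead, i.e. h
    ("g", "ret", None),  # slot 1: body of g, which nothing references
    ("h", "ret", None),  # slot 2: body of h
]


def call_target(sec, index):
    """Return the function the call at `index` lands in, or None if it falls outside."""
    _, op, disp = sec[index]
    assert op == "call"
    target = index + disp
    return sec[target][0] if 0 <= target < len(sec) else None


before = call_target(section, 0)

# Naively dropping the unused g from the middle of the section shifts h,
# but the displacement stored in f is now stale.
chopped = [ins for ins in section if ins[0] != "g"]
after = call_target(chopped, 0)

print(before, after)
```

In this model the call resolves to `h` before the section is chopped and to nothing valid afterwards, which is the breakage described above.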
How does the new linker work in terms of this new "atom" abstraction without changing the underlying object file format?
"The atom model is not the best model for some architectures. The atom model makes sense only for Mach-O, but it’s used everywhere. I guess that we originally expected that we would be able to model the linker’s behavior beautifully using the atom model because the atom model seemed like a superset of the section model. Although it can, it turned out that it’s not necessarily natural and efficient model for ELF or PE/COFF on which section-based linking is expected."
But maybe, they require you to create special versions of object files where even references internal to each library are referenced there as if they live in a different object file? Is that even possible?
> But maybe, they require you to create special versions of object files where even references internal to each library are referenced there as if they live in a different object file? Is that even possible?
The extra information that is needed for an ELF linker (any ELF linker; nothing LLD specific) to operate on functions and global data objects in a fine-grained manner is enabled by -ffunction-sections/-fdata-sections.
If you are familiar with object file formats in general, you may know that this is exactly how MachO works: it is based on atoms.
If you want to map ELF to the atom model, you need to build with -ffunction-sections so that the compiler emits a single function per section (and similarly with -fdata-sections for data), or model it by mapping each section of the object to an atom.
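As a rough illustration of why one function per section matters, here is a toy Python model of reachability-based stripping in which each section is treated as one indivisible unit (one "atom"). The function names `live_sections` and `kept_symbols` and the symbol names are hypothetical, and this is only a sketch of the idea, not how LLD or ld64 is implemented. The `.text.<name>` section names follow the naming that GCC and Clang use with -ffunction-sections.

```python
def live_sections(sections, roots):
    """Return the names of sections reachable from the root symbols.

    `sections` maps a section name to (symbols defined, symbols referenced).
    A section is kept or dropped as a whole.
    """
    defined_in = {sym: name for name, (defs, _) in sections.items() for sym in defs}
    live = set()
    work = [defined_in[r] for r in roots]
    while work:
        name = work.pop()
        if name in live:
            continue
        live.add(name)
        # Follow every reference out of this section.
        for sym in sections[name][1]:
            work.append(defined_in[sym])
    return live


def kept_symbols(sections, roots):
    """Return the sorted list of symbols that survive stripping."""
    live = live_sections(sections, roots)
    return sorted(sym for name in live for sym in sections[name][0])


# Default build: a single .text section holds every function.
coarse = {".text": ({"main", "used", "unused"}, {"used"})}

# With per-function sections, each section holds exactly one function.
fine = {
    ".text.main": ({"main"}, {"used"}),
    ".text.used": ({"used"}, set()),
    ".text.unused": ({"unused"}, set()),
}

print(kept_symbols(coarse, ["main"]))
print(kept_symbols(fine, ["main"]))
```

With the single `.text` section, `unused` survives because it cannot be separated from `main`; with per-function sections it can be dropped on its own.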
Hmm, in my experience with Mach-O, I have never come across atoms. For example, in this file format reference, atoms are not mentioned -- only the more traditional segment/section hierarchy: https://github.com/aidansteele/osx-abi-macho-file-format-ref...
What am I missing?
I do note that on OS X, -ffunction-sections appears to do nothing.
Sorry, I was wrong to characterize the atom as being a core part of the MachO object format when it is only a core part of how ld64 works. The compiler does follow some conventions that ld64 takes advantage of, though. Other than the ld64 source code (available on opensource.apple.com), I can only point to a design document in the source repo: https://opensource.apple.com/source/ld64/ld64-253.6/doc/desi...
Actually no: `lld` is not a single linker; there has been a split between the ELF/COFF folks and the MachO ones. The page you're linking is about the MachO project, which is not very actively developed.
No, it is just that nobody is currently working on it. Last I talked with the Apple folks they are just busy with other stuff.
Patches are definitely welcome for MachO improvements in LLD (as always in LLVM!). You should be aware though that the Apple folks feel strongly that the original "atom" linker design is the one that they want to use. If you want to start a MachO linker based on the COFF/ELF design (which has been very successful) you will want to ping the llvm-dev mailing list first to talk with the Apple folks (CC Lang Hames and Jim Grosbach).
--
Actually, the page you linked seems to be about an older ELF and COFF specific implementation, despite its URL.
Here's an overview of what makes the new one special: https://lld.llvm.org/design.html