_start_c does more work than is necessary

Submitted by Jon Chesterfield on Aug. 26, 2018, 12:56 a.m.

Details

Message ID CAOUYtQAeJvx2a6Gb8NW-64UGC7E=8N7C+LC9nvv_oBnQ4bAEOg@mail.gmail.com
State New
Series "_start_c does more work than is necessary"

Commit Message

Jon Chesterfield Aug. 26, 2018, 12:56 a.m.
The init sequence in musl is _start calls _start_c which calls
__libc_start_main.

_start_c passes pointers to the _init and _fini functions, and also a
trailing zero, to __libc_start_main.

__libc_start_main currently takes exactly three arguments. I'd like to
simplify crt1.c by only passing main, argc, argv.

This saves a few lines of C and three instructions in the startup
sequence. E.g. on x86-64 this removes mov, mov, xor, for fourteen bytes.

It also removes uses of _init() and _fini(), which I'm considering deleting
from a musl/llvm toolchain that makes no use of either, slightly
decreasing the size of my out-of-tree patch.

Thanks,

Jon

---


diff --git a/crt/crt1.c b/crt/crt1.c
index af02af9..f27c949 100644
--- a/crt/crt1.c
+++ b/crt/crt1.c
@@ -5,14 +5,11 @@ 
 #include "crt_arch.h"

 int main();
-void _init() __attribute__((weak));
-void _fini() __attribute__((weak));
-_Noreturn int __libc_start_main(int (*)(), int, char **,
-       void (*)(), void(*)(), void(*)());
+_Noreturn int __libc_start_main(int (*)(), int, char **);

 void _start_c(long *p)
 {
        int argc = p[0];
        char **argv = (void *)(p+1);
-       __libc_start_main(main, argc, argv, _init, _fini, 0);
+       __libc_start_main(main, argc, argv);
 }
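
The call chain the patch touches can be modelled in ordinary C. This is
only a sketch: demo_start_main, demo_main, demo_start_c and demo_run are
hypothetical names, the fake "stack" is built by hand, and it assumes a
target where long is pointer-sized (as _start_c itself does). Unlike the
real __libc_start_main, the stand-in returns instead of exiting.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical program entry point; encodes what it received in the
 * return value so a caller can check that argc and argv arrived intact. */
int demo_main(int argc, char **argv)
{
    return argc * 100 + (int)strlen(argv[0]);
}

/* Hypothetical stand-in for __libc_start_main with the three-argument
 * shape from the patch. The real function is _Noreturn and calls exit(). */
int demo_start_main(int (*entry)(int, char **), int argc, char **argv)
{
    return entry(argc, argv);
}

/* Same decoding as _start_c: p[0] is argc, argv starts at p + 1. */
int demo_start_c(long *p)
{
    int argc = (int)p[0];
    char **argv = (void *)(p + 1);
    return demo_start_main(demo_main, argc, argv);
}

/* Build a fake initial stack (argc, argv[0], argv[1], NULL) and run the
 * chain on it. Assumes sizeof(long) == sizeof(char *). */
int demo_run(void)
{
    long *stack = malloc(sizeof(long) + 3 * sizeof(char *));
    if (!stack)
        return -1;
    char **slots = (void *)(stack + 1);
    stack[0] = 2;
    slots[0] = "prog";
    slots[1] = "arg";
    slots[2] = NULL;
    int rc = demo_start_c(stack);
    free(stack);
    return rc;
}
```

Calling demo_run() passes argc = 2 and argv[0] = "prog" through the chain
using only the three arguments that the patched prototype declares.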

Comments

Rich Felker Aug. 26, 2018, 1:23 a.m.
On Sun, Aug 26, 2018 at 01:56:30AM +0100, Jon Chesterfield wrote:
> The init sequence in musl is _start calls _start_c which calls
> __libc_start_main.
> 
> _start_c passes pointers to the _init and _fini functions, and also a
> trailing zero, to __libc_start_main.
> 
> __libc_start_main currently takes exactly three arguments. I'd like to
> simplify crt1.c by only passing main, argc, argv.
> 
> This is worth a few lines of C and three instructions in the startup
> sequence. E.g. x86-64 this removes mov, mov, xor for fourteen bytes.
> 
> It also removes uses of _init() and _fini() which I'm considering deleting
> from a musl/llvm toolchain which makes no use of either, slightly
> decreasing the size of my out of tree patch.
> 
> Thanks,
> 
> Jon
> 
> ---
> diff --git a/crt/crt1.c b/crt/crt1.c
> index af02af9..f27c949 100644
> --- a/crt/crt1.c
> +++ b/crt/crt1.c
> @@ -5,14 +5,11 @@
>  #include "crt_arch.h"
> 
>  int main();
> -void _init() __attribute__((weak));
> -void _fini() __attribute__((weak));
> -_Noreturn int __libc_start_main(int (*)(), int, char **,
> -       void (*)(), void(*)(), void(*)());
> +_Noreturn int __libc_start_main(int (*)(), int, char **);
> 
>  void _start_c(long *p)
>  {
>         int argc = p[0];
>         char **argv = (void *)(p+1);
> -       __libc_start_main(main, argc, argv, _init, _fini, 0);
> +       __libc_start_main(main, argc, argv);
>  }

It may be reasonable to make this change now, but it should be
reviewed. Read the git log for __libc_start_main.c, and in particular
commit 7586360badcae6e73f04eb1b8189ce630281c4b2, which explains the
history and the non-obvious reason musl has no use for these
arguments.

Ultimately all crt files except crt1 should be removed at some point.
The _init/_fini machinery (crti/crtn) is purely legacy mess from
before init/fini arrays were used, and the gcc crtbegin/crtend mess is
just for wacky language runtimes like Java (gcj, obsolete and removed
from gcc). I'm not sure if there are other compilers out there that
still don't know how to do the arrays, though (pcc? firm/cparser?).
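
As a rough illustration of the array mechanism mentioned above: GCC and
Clang accept __attribute__((constructor)), and on current ELF toolchains
such a function is typically registered through .init_array rather than
through _init (the exact section depends on how the toolchain was
configured). The names demo_count_ctor and demo_ctor_runs are made up for
this sketch.

```c
/* Counts how many times the constructor below has been invoked. */
static int demo_ctor_count;

/* GCC/Clang extension: run this function during startup, before main().
 * No crti/crtn-provided _init is needed for it to be called. */
__attribute__((constructor))
static void demo_count_ctor(void)
{
    demo_ctor_count++;
}

/* Accessor so code running from main() can inspect the counter. */
int demo_ctor_runs(void)
{
    return demo_ctor_count;
}
```

By the time main() starts, demo_ctor_runs() should report that the
constructor has run exactly once.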

Rich

Jon Chesterfield Aug. 26, 2018, 1:47 a.m.
That's interesting context, thanks. rcrt1.c is of course similar.

I'd be happy to see the crt files go. The patch was motivated by debugging
the consequences of importing crtbegin from bsd (as broadly recommended
online) while trying to placate lld. Said crtbegin installed a function
that walks init_array into .init, so all the constructors fired twice.
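
The double firing can be reproduced with a toy model. This is not the BSD
crtbegin or musl code; every name here (toy_init_array, toy_legacy_init,
toy_startup and so on) is invented for the sketch, and the array is a
plain C array rather than a linker-assembled section.

```c
#include <stddef.h>

typedef void (*toy_init_fn)(void);

/* Number of times the toy constructor has been called. */
int toy_ctor_calls;

static void toy_ctor(void)
{
    toy_ctor_calls++;
}

/* Toy stand-in for the .init_array the linker would assemble. */
static toy_init_fn toy_init_array[] = { toy_ctor };

/* Call every entry of the array once, in order. */
static void toy_walk_init_array(void)
{
    size_t n = sizeof toy_init_array / sizeof toy_init_array[0];
    for (size_t i = 0; i < n; i++)
        toy_init_array[i]();
}

/* Toy stand-in for a legacy _init into which a crtbegin has installed
 * its own walk over init_array. */
static void toy_legacy_init(void)
{
    toy_walk_init_array();
}

/* Toy startup: the legacy hook runs first, then the libc-style walk
 * covers the same array again. Returns the resulting call count. */
int toy_startup(void)
{
    toy_ctor_calls = 0;
    toy_legacy_init();
    toy_walk_init_array();
    return toy_ctor_calls;
}
```

toy_startup() returns 2 for a single registered constructor, which is the
symptom described: two independent walkers over one array.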

crt1.o appears to be the only necessary crt file for llvm/x86-64/c++.

.ctors/.dtors is still implemented in lld along with a comment about legacy :)

Cheers

Jon


A. Wilcox Aug. 30, 2018, 6:19 p.m.
On 08/25/18 20:23, Rich Felker wrote:
> the gcc crtbegin/crtend mess is
> just for wacky language runtimes like java (gcj, obsolete and removed
> from gcc). I'm not sure if there are other compilers out there that
> still don't know how to do the arrays, though (pcc? firm/cparser?).


That wacky language runtime is still needed to bootstrap Java compilers
without relying on binaries.  This is especially important for musl
(because except for Oracle's semi-proprietary Portola, there is no
binary for musl at all) and for other architectures (OpenJDK is, as far
as I'm aware, not a cross-compiler, so arches like mips or ppc64 would
be screwed).

--arw