Invalid pointer subtractions in __shlim and __shgetc

Submitted by Rich Felker on April 17, 2020, 4:48 p.m.

Details

Message ID 20200417164807.GI11469@brightrain.aerifal.cx
State New
Series "Invalid pointer subtractions in __shlim and __shgetc"
Headers show

Commit Message

Rich Felker April 17, 2020, 4:48 p.m.
On Fri, Apr 17, 2020 at 12:13:51PM -0400, Rich Felker wrote:
> On Fri, Apr 17, 2020 at 03:56:06PM +0000, Pascal Cuoq wrote:
> > ?Hello,
> > 
> > both functions `__shlim` and `__shgetc` subtract the members
> > named `buf` and `rpos` of the struct they manipulate.
> > 
> > In `__shlim`, this happens in the statement `f->shcnt = f->buf - f->rpos;`.
> > And in `__shgetc`, in happens inside the `shcnt` macro:
> > 
> > #define shcnt(f) ((f)->shcnt + ((f)->rpos - (f)->buf))
> > 
> > In our tests, while running `testsuite` in `libc-testsuite`,
> > both the `__shlim` and `__shgetc` functions are reached
> > with `f->buf` non-null and `f->rpos` a null pointer.
> > 
> > This can be made visible on execution platforms other than ours
> > by adding a statement at the beginning of the functions:
> > 
> > +      if (f->buf && !f->rpos) dprintf (2, "XXX Problem in __shlim\n");
> > +      if (f->buf && !f->rpos) dprintf (2, "XXX Problem in __shgetc\n");
> > 
> > Then if, running `libc-testsuite`, you see the following, it means that
> > `f->buf` was non-null and `f->rpos` was null when these points were
> > reached:
> > 
> > $ ./testsuite
> > fdopen test passed
> > fcntl test passed
> > fnmatch test passed
> > XXX Problem in __shlim
> > XXX Problem in __shgetc
> > XXX Problem in __shlim
> > XXX Problem in __shgetc
> > XXX Problem in __shlim
> > XXX Problem in __shgetc
> > XXX Problem in __shlim
> > XXX Problem in __shgetc
> > XXX Problem in __shlim
> > XXX Problem in __shgetc
> > XXX Problem in __shlim
> > XXX Problem in __shgetc
> > fscanf test passed
> > (...)
> > 
> > This has been tested on the (tag: v1.2.0) branch of git://git.musl-libc.org/musl
> > 
> > These pointer subtractions are undefined behavior. This is slightly worse
> > than computing `(char*)0-(char*)0`, which is undefined in C and defined in C++,
> > because compilers for both C and C++ are unlikely to exploit this one
> > for optimization. Subtracting between a non-null pointer and a null pointer
> > on the other hand is undefined behavior in both languages, and it is
> > plausible that doing it may someday have unexpected consequences.
> > 
> > I mention this because similar undefined behaviors that were extremely
> > unlikely to cause harm have been fixed in musl in recent months,
> > so that this looks like something you may want to fix too.
> 
> Absolutely. Do you have an analysis of how this is reached? Neither of
> these should be called when the FILE is not in suitable state for
> reading. It might just be that vfscanf needs to call __toread on the
> FILE before starting and error out if it fails.

Indeed I think the attached fixes it.

Rich

Patch hide | download patch | download mbox

diff --git a/src/stdio/vfscanf.c b/src/stdio/vfscanf.c
index 9e030fc4..d990db9f 100644
--- a/src/stdio/vfscanf.c
+++ b/src/stdio/vfscanf.c
@@ -76,6 +76,8 @@  int vfscanf(FILE *restrict f, const char *restrict fmt, va_list ap)
 
 	FLOCK(f);
 
+	if (!f->rpos && __toread(f)) goto input_fail;
+
 	for (p=(const unsigned char *)fmt; *p; p++) {
 
 		alloc = 0;

Comments

Rich Felker April 17, 2020, 5:56 p.m.
On Fri, Apr 17, 2020 at 12:48:07PM -0400, Rich Felker wrote:
> On Fri, Apr 17, 2020 at 12:13:51PM -0400, Rich Felker wrote:
> > On Fri, Apr 17, 2020 at 03:56:06PM +0000, Pascal Cuoq wrote:
> > > ?Hello,
> > > 
> > > both functions `__shlim` and `__shgetc` subtract the members
> > > named `buf` and `rpos` of the struct they manipulate.
> > > 
> > > In `__shlim`, this happens in the statement `f->shcnt = f->buf - f->rpos;`.
> > > And in `__shgetc`, in happens inside the `shcnt` macro:
> > > 
> > > #define shcnt(f) ((f)->shcnt + ((f)->rpos - (f)->buf))
> > > 
> > > In our tests, while running `testsuite` in `libc-testsuite`,
> > > both the `__shlim` and `__shgetc` functions are reached
> > > with `f->buf` non-null and `f->rpos` a null pointer.
> > > 
> > > This can be made visible on execution platforms other than ours
> > > by adding a statement at the beginning of the functions:
> > > 
> > > +      if (f->buf && !f->rpos) dprintf (2, "XXX Problem in __shlim\n");
> > > +      if (f->buf && !f->rpos) dprintf (2, "XXX Problem in __shgetc\n");
> > > 
> > > Then if, running `libc-testsuite`, you see the following, it means that
> > > `f->buf` was non-null and `f->rpos` was null when these points were
> > > reached:
> > > 
> > > $ ./testsuite
> > > fdopen test passed
> > > fcntl test passed
> > > fnmatch test passed
> > > XXX Problem in __shlim
> > > XXX Problem in __shgetc
> > > XXX Problem in __shlim
> > > XXX Problem in __shgetc
> > > XXX Problem in __shlim
> > > XXX Problem in __shgetc
> > > XXX Problem in __shlim
> > > XXX Problem in __shgetc
> > > XXX Problem in __shlim
> > > XXX Problem in __shgetc
> > > XXX Problem in __shlim
> > > XXX Problem in __shgetc
> > > fscanf test passed
> > > (...)
> > > 
> > > This has been tested on the (tag: v1.2.0) branch of git://git.musl-libc.org/musl
> > > 
> > > These pointer subtractions are undefined behavior. This is slightly worse
> > > than computing `(char*)0-(char*)0`, which is undefined in C and defined in C++,
> > > because compilers for both C and C++ are unlikely to exploit this one
> > > for optimization. Subtracting between a non-null pointer and a null pointer
> > > on the other hand is undefined behavior in both languages, and it is
> > > plausible that doing it may someday have unexpected consequences.
> > > 
> > > I mention this because similar undefined behaviors that were extremely
> > > unlikely to cause harm have been fixed in musl in recent months,
> > > so that this looks like something you may want to fix too.
> > 
> > Absolutely. Do you have an analysis of how this is reached? Neither of
> > these should be called when the FILE is not in suitable state for
> > reading. It might just be that vfscanf needs to call __toread on the
> > FILE before starting and error out if it fails.
> 
> Indeed I think the attached fixes it.
> 
> Rich

> diff --git a/src/stdio/vfscanf.c b/src/stdio/vfscanf.c
> index 9e030fc4..d990db9f 100644
> --- a/src/stdio/vfscanf.c
> +++ b/src/stdio/vfscanf.c
> @@ -76,6 +76,8 @@ int vfscanf(FILE *restrict f, const char *restrict fmt, va_list ap)
>  
>  	FLOCK(f);
>  
> +	if (!f->rpos && __toread(f)) goto input_fail;
> +
>  	for (p=(const unsigned char *)fmt; *p; p++) {
>  
>  		alloc = 0;

I think this patch may result in wrong error behavior on a trivial
scanf that doesn't try to read anything. Instead it should be:

	if (!f->rpos) __toread(f);
	if (!f->rpos) goto input_fail;

so that the error path is taken only on failure to enter read mode,
not on EOF.

If this works on in my tests I'll commit it.

Rich
Pascal Cuoq April 23, 2020, 11:34 a.m.
Hello again,

Rich Felker <dalias@libc.org> wrote:
> I think this patch may result in wrong error behavior on a trivial
> scanf that doesn't try to read anything. Instead it should be:
>
>        if (!f->rpos) __toread(f);
>        if (!f->rpos) goto input_fail;
>
> so that the error path is taken only on failure to enter read mode,
> not on EOF.

This has indeed fixed the invalid comparisons that were observed
from the tests I mentioned earlier, but a different test still has
the same problem.

As of commit 33338eb, the function wcstox does:
        f.rpos = f.rend = 0;
        f.buf = buf + 4;

(https://git.musl-libc.org/cgit/musl/tree/src/stdlib/wcstol.c?id=33338ebc853d37c80f0f236cc7a92cb0acc6aace#n38 )

It then passes the address of this f to shlim (line 45), causing the same invalid pointer subtraction f->buf - f->rpos that has already been discussed in this thread.

Best regards,

Pascal