[v4] implement recallocarray(3)

Submitted by Ariadne Conill on Aug. 2, 2020, 10:54 p.m.

Details

Message ID 20200802225426.32002-1-ariadne@dereferenced.org
State New
Series "implement recallocarray(3)"
Headers show

Commit Message

Ariadne Conill Aug. 2, 2020, 10:54 p.m.
This OpenBSD extension is similar to reallocarray(3), but
zero-initializes the new memory area.

This extension is placed in _BSD_SOURCE, like
reallocarray(3).

Changes from v3:
- use calloc() instead of realloc() and always copy
- explicitly zero old memory block
- restore overflow checking for old size

Changes from v2:
- drop overflow checking for old size

Changes from v1:
- use realloc() instead of reallocarray()
---
 include/stdlib.h           |  1 +
 src/malloc/recallocarray.c | 39 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+)
 create mode 100644 src/malloc/recallocarray.c

Patch hide | download patch | download mbox

diff --git a/include/stdlib.h b/include/stdlib.h
index b54a051f..a0412ad4 100644
--- a/include/stdlib.h
+++ b/include/stdlib.h
@@ -146,6 +146,7 @@  int clearenv(void);
 #define WCOREDUMP(s) ((s) & 0x80)
 #define WIFCONTINUED(s) ((s) == 0xffff)
 void *reallocarray (void *, size_t, size_t);
+void *recallocarray (void *, size_t, size_t, size_t);
 #endif
 
 #ifdef _GNU_SOURCE
diff --git a/src/malloc/recallocarray.c b/src/malloc/recallocarray.c
new file mode 100644
index 00000000..ce3adde4
--- /dev/null
+++ b/src/malloc/recallocarray.c
@@ -0,0 +1,39 @@ 
+#define _BSD_SOURCE
+#include <errno.h>
+#include <stdlib.h>
+#include <string.h>
+
+void *recallocarray(void *ptr, size_t om, size_t m, size_t n)
+{
+	void *newptr;
+	size_t old_size, new_size;
+
+	if (n && m > -1 / n) {
+		errno = ENOMEM;
+		return 0;
+	}
+	new_size = m * n;
+
+	if (n && om > -1 / n) {
+		errno = EINVAL;
+		return 0;
+	}
+	old_size = om * n;
+
+	newptr = calloc(m, n);
+	if (!newptr)
+		return ptr;
+
+	if (new_size <= old_size) {
+		memcpy((char *) newptr, ptr, new_size);
+	}
+	else {
+		memcpy((char *) newptr, ptr, old_size);
+		memset((char *) newptr + old_size, 0, new_size - old_size);
+	}
+
+	memset(ptr, 0, old_size);
+	free(ptr);
+
+	return newptr;
+}

Comments

Rich Felker Aug. 3, 2020, 12:45 a.m.
On Sun, Aug 02, 2020 at 04:54:26PM -0600, Ariadne Conill wrote:
> This OpenBSD extension is similar to reallocarray(3), but
> zero-initializes the new memory area.
> 
> This extension is placed in _BSD_SOURCE, like
> reallocarray(3).
> 
> Changes from v3:
> - use calloc() instead of realloc() and always copy
> - explicitly zero old memory block
> - restore overflow checking for old size
> 
> Changes from v2:
> - drop overflow checking for old size
> 
> Changes from v1:
> - use realloc() instead of reallocarray()

Can we finish discussion of questions raised before making new
versions with changes that haven't even been agreed upon? I asked
about EINVAL so we could determine if it's expected and if it's
reasonable to copy; that's not a demand to remove it. And the new
memset before free is a complete no-op. If there's a contract to
behave like explicit_bzero on the old memory, we may want to honor
that if we implement recallocarray, but that likely makes it entirely
unsuitable to the purpose you want it for and will bring back the
atrociously bad apk performance you previously saw with mallocng if
you use it there, since each increment of array size will incur a
whole memcpy of the existing array (quadratic time) rather than copies
only occuring a logarithmic number of times (yielding O(n log n)
overall) as would happen with realloc.

Overall I'm getting kinda skeptical of this function. Implementing it
without the hardness guarantees of the OpenBSD version seems like a
bad idea, but with those guarantees it just seems to be a bad
interface.

Rich
Ariadne Conill Aug. 3, 2020, 1:35 a.m.
Hello,

On 2020-08-02 18:45, Rich Felker wrote:
> On Sun, Aug 02, 2020 at 04:54:26PM -0600, Ariadne Conill wrote:
>> This OpenBSD extension is similar to reallocarray(3), but
>> zero-initializes the new memory area.
>>
>> This extension is placed in _BSD_SOURCE, like
>> reallocarray(3).
>>
>> Changes from v3:
>> - use calloc() instead of realloc() and always copy
>> - explicitly zero old memory block
>> - restore overflow checking for old size
>>
>> Changes from v2:
>> - drop overflow checking for old size
>>
>> Changes from v1:
>> - use realloc() instead of reallocarray()
> 
> Can we finish discussion of questions raised before making new
> versions with changes that haven't even been agreed upon? I asked
> about EINVAL so we could determine if it's expected and if it's
> reasonable to copy; that's not a demand to remove it. And the new
> memset before free is a complete no-op. If there's a contract to
> behave like explicit_bzero on the old memory, we may want to honor
> that if we implement recallocarray, but that likely makes it entirely
> unsuitable to the purpose you want it for and will bring back the
> atrociously bad apk performance you previously saw with mallocng if
> you use it there, since each increment of array size will incur a
> whole memcpy of the existing array (quadratic time) rather than copies
> only occuring a logarithmic number of times (yielding O(n log n)
> overall) as would happen with realloc.

At this time, there are no plans to use recallocarray(3) in apk, but 
that would only be a problem if used in a hot path.  I don't envision 
any hot path in apk where we would ever use recallocarray(3).  For 
managing buffers, we would continue using realloc() as we do now, since 
it's all bytes anyway.

*My* use case for recallocarray(3) is simply that I want a 
reallocarray() that also zeros the new members, *and* I do not want to 
reimplement that function every time I write a program, because this is 
the sort of thing a libc should ideally provide in 2020.  It provides 
calloc() after all, why not a realloc version of it?

In order to do that, you have to have a similar interface to what 
recallocarray(3) uses, because you need to know where to start zeroing. 
There's not any way around it that is also interposable.

I do agree that the hardness guarantees of the OpenBSD version implement 
significant overhead, and if we don't provide them, there will be some 
heartbleed type bug down the road.

So, I stick with v4 patch behavior, because I do not wish for anyone to 
shoot themselves in the foot.

Ariadne
Rich Felker Aug. 3, 2020, 1:58 a.m.
On Sun, Aug 02, 2020 at 07:35:23PM -0600, Ariadne Conill wrote:
> Hello,
> 
> On 2020-08-02 18:45, Rich Felker wrote:
> >On Sun, Aug 02, 2020 at 04:54:26PM -0600, Ariadne Conill wrote:
> >>This OpenBSD extension is similar to reallocarray(3), but
> >>zero-initializes the new memory area.
> >>
> >>This extension is placed in _BSD_SOURCE, like
> >>reallocarray(3).
> >>
> >>Changes from v3:
> >>- use calloc() instead of realloc() and always copy
> >>- explicitly zero old memory block
> >>- restore overflow checking for old size
> >>
> >>Changes from v2:
> >>- drop overflow checking for old size
> >>
> >>Changes from v1:
> >>- use realloc() instead of reallocarray()
> >
> >Can we finish discussion of questions raised before making new
> >versions with changes that haven't even been agreed upon? I asked
> >about EINVAL so we could determine if it's expected and if it's
> >reasonable to copy; that's not a demand to remove it. And the new
> >memset before free is a complete no-op. If there's a contract to
> >behave like explicit_bzero on the old memory, we may want to honor
> >that if we implement recallocarray, but that likely makes it entirely
> >unsuitable to the purpose you want it for and will bring back the
> >atrociously bad apk performance you previously saw with mallocng if
> >you use it there, since each increment of array size will incur a
> >whole memcpy of the existing array (quadratic time) rather than copies
> >only occuring a logarithmic number of times (yielding O(n log n)
> >overall) as would happen with realloc.
> 
> At this time, there are no plans to use recallocarray(3) in apk, but
> that would only be a problem if used in a hot path.  I don't
> envision any hot path in apk where we would ever use
> recallocarray(3).  For managing buffers, we would continue using
> realloc() as we do now, since it's all bytes anyway.
> 
> *My* use case for recallocarray(3) is simply that I want a
> reallocarray() that also zeros the new members, *and* I do not want
> to reimplement that function every time I write a program, because
> this is the sort of thing a libc should ideally provide in 2020.  It
> provides calloc() after all, why not a realloc version of it?
> 
> In order to do that, you have to have a similar interface to what
> recallocarray(3) uses, because you need to know where to start
> zeroing. There's not any way around it that is also interposable.
> 
> I do agree that the hardness guarantees of the OpenBSD version
> implement significant overhead, and if we don't provide them, there
> will be some heartbleed type bug down the road.
> 
> So, I stick with v4 patch behavior, because I do not wish for anyone
> to shoot themselves in the foot.

If you want a function that differs from the OpenBSD function, I don't
think this is something that belongs in libc but rather something you
can just drop into projects you want to use it in. Dropping whatever
random utility function seemed useful to have shared between system
programs into libc is a legacy BSD and GNU practice that musl
intentionally does not immitate.

A function that has broad usefulness and consensus across implementors
about what the signature and semantics should be is reasonable for
consideration, but it's looking like that's not what we have here, at
least not yet. I'm open to discussing this on libc-coord if it's
something you want to try to make into a cross-implementation
consensus.

Note that reallocarray still looks fine and is already something glibc
has adopted too. Unlike recallocarray it doesn't have subtle behaviors
that are a tradeoff between usability and hardening. If you have
reallocarray, I think the version of recallocarray you want is a very
thin wrapper around it (on success, if m>om, memset tail).

Rich
Wolf Aug. 3, 2020, 3:11 p.m.
On 2020-08-02 16:54:26 -0600, Ariadne Conill wrote:
> +void *recallocarray(void *ptr, size_t om, size_t m, size_t n)
> +{
> +	void *newptr;
> +	size_t old_size, new_size;
> +
> +	if (n && m > -1 / n) {
> +		errno = ENOMEM;
> +		return 0;
> +	}
> +	new_size = m * n;
> +
> +	if (n && om > -1 / n) {
> +		errno = EINVAL;
> +		return 0;
> +	}
> +	old_size = om * n;
> +
> +	newptr = calloc(m, n);
> +	if (!newptr)
> +		return ptr;

From reading openbsd's source I would say this should return 0 (or
newptr), not ptr. Otherwise I'm not sure how the caller could tell if
the resize failed or not.

> +
> +	if (new_size <= old_size) {
> +		memcpy((char *) newptr, ptr, new_size);
> +	}
> +	else {
> +		memcpy((char *) newptr, ptr, old_size);
> +		memset((char *) newptr + old_size, 0, new_size - old_size);
> +	}
> +
> +	memset(ptr, 0, old_size);
> +	free(ptr);
> +
> +	return newptr;
> +}

W.
Wolf Aug. 3, 2020, 3:15 p.m.
On 2020-08-03 17:11:50 +0200, Wolf wrote:
> From reading openbsd's source I would say this should return 0 (or
> newptr), not ptr. Otherwise I'm not sure how the caller could tell if
> the resize failed or not.

Or not, please ignore. I've got confused.

W.
Rich Felker Aug. 3, 2020, 3:25 p.m.
On Mon, Aug 03, 2020 at 05:15:11PM +0200, Wolf wrote:
> 
> On 2020-08-03 17:11:50 +0200, Wolf wrote:
> > From reading openbsd's source I would say this should return 0 (or
> > newptr), not ptr. Otherwise I'm not sure how the caller could tell if
> > the resize failed or not.
> 
> Or not, please ignore. I've got confused.

Your original comment seems correct. Returning the old ptr on failure
cannot be correct.

Rich