[libc-test] add strptime basic test

Submitted by Rafał Miłecki on Nov. 15, 2018, 7:34 a.m.

Details

Message ID 20181115073456.11105-1-zajec5@gmail.com
State New
Series "add strptime basic test"
Headers show

Commit Message

Rafał Miłecki Nov. 15, 2018, 7:34 a.m.
From: Rafał Miłecki <rafal@milecki.pl>

Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
---
 src/functional/strptime.c | 119 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 119 insertions(+)
 create mode 100644 src/functional/strptime.c

Patch hide | download patch | download mbox

diff --git a/src/functional/strptime.c b/src/functional/strptime.c
new file mode 100644
index 0000000..5f15058
--- /dev/null
+++ b/src/functional/strptime.c
@@ -0,0 +1,119 @@ 
+// SPDX-License-Identifier: MIT
+
+#define _GNU_SOURCE	/* For tm_gmtoff */
+#include <stddef.h>
+#include <stdlib.h>
+#include <string.h>
+#include <time.h>
+#include "test.h"
+
+static void checkStrptime(const char *s, const char *format, const struct tm *expected) {
+	size_t n = offsetof(struct tm, tm_isdst);
+	struct tm tm = { };
+	const char *ret;
+
+	ret = strptime(s, format, &tm);
+	if (!ret || *ret != '\0') {
+		t_error("\"%s\": failed to parse \"%s\"\n", format, s);
+	} else if (memcmp(&tm, expected, n)) {
+		char buf1[64];
+		char buf2[64];
+
+		strftime(buf1, sizeof(buf1), "%FT%H:%M:%S%Z (day %j: %a)", expected);
+		strftime(buf2, sizeof(buf2), "%FT%H:%M:%S%Z (day %j: %a)", &tm);
+
+		t_error("\"%s\": for \"%s\" expected %s but got %s\n", format, s, buf1, buf2);
+	}
+}
+
+static void checkStrptimeTz(const char *s, int h, int m) {
+	long int expected = h * 3600 + m * 60;
+	struct tm tm = { };
+	const char *ret;
+
+	ret = strptime(s, "%z", &tm);
+	if (!ret || *ret != '\0') {
+		t_error("\"%%z\": failed to parse \"%s\"\n", s);
+	} else if (tm.tm_gmtoff != expected) {
+		t_error("\"%%z\": for \"%s\" expected tm_gmtoff %ld but got %ld\n", s, tm.tm_gmtoff, expected);
+	}
+}
+
+static struct tm tm1 = {
+	.tm_sec = 8,
+	.tm_min = 57,
+	.tm_hour = 20,
+	.tm_mday = 0,
+	.tm_mon = 0,
+	.tm_year = 0,
+	.tm_wday = 0,
+	.tm_yday = 0,
+	.tm_isdst = 0,
+};
+
+static struct tm tm2 = {
+	.tm_sec = 0,
+	.tm_min = 0,
+	.tm_hour = 0,
+	.tm_mday = 25,
+	.tm_mon = 8 - 1,
+	.tm_year = 1991 - 1900,
+	.tm_wday = 0,
+	.tm_yday = 237 - 1,
+	.tm_isdst = 0,
+};
+
+static struct tm tm3 = {
+	.tm_sec = 0,
+	.tm_min = 0,
+	.tm_hour = 0,
+	.tm_mday = 21,
+	.tm_mon = 10 - 1,
+	.tm_year = 2015 - 1900,
+	.tm_wday = 3,
+	.tm_yday = 294 - 1,
+	.tm_isdst = 0,
+};
+
+static struct tm tm4 = {
+	.tm_sec = 0,
+	.tm_min = 0,
+	.tm_hour = 0,
+	.tm_mday = 10,
+	.tm_mon = 7 - 1,
+	.tm_year = 1856 - 1900,
+	.tm_wday = 4,
+	.tm_yday = 192 - 1,
+	.tm_isdst = 0,
+};
+
+int main() {
+	setenv("TZ", "UTC0", 1);
+
+	/* Time */
+	checkStrptime("20:57:08", "%H:%M:%S", &tm1);
+	checkStrptime("20:57:8", "%R:%S", &tm1);
+	checkStrptime("20:57:08", "%T", &tm1);
+
+	/* Format */
+	checkStrptime("20:57:08", "%H : %M  :  %S", &tm1);
+	checkStrptime("20 57  08", "%H %M %S", &tm1);
+	checkStrptime("20%57%08", "%H %% %M%%%S", &tm1);
+	checkStrptime("foo20bar57qux08      ", "foo %Hbar %M qux%S ", &tm1);
+
+	/* Date */
+	checkStrptime("1991-08-25", "%Y-%m-%d", &tm2);
+	checkStrptime("25.08.91", "%d.%m.%y", &tm2);
+	checkStrptime("08/25/91", "%D", &tm2);
+	checkStrptime("21.10.15", "%d.%m.%y", &tm3);
+	checkStrptime("10.7.56 in 18th", "%d.%m.%y in %C th", &tm4);
+
+	/* Glibc */
+	checkStrptime("1856-07-10", "%F", &tm4);
+	checkStrptime("683078400", "%s", &tm2);
+	checkStrptimeTz("+0200", 2, 0);
+	checkStrptimeTz("-0530", -5, -30);
+	checkStrptimeTz("-06", -6, 0);
+
+	return t_status;
+}

Comments

Rafał Miłecki Nov. 15, 2018, 10:12 a.m.
On 15.11.2018 08:34, Rafał Miłecki wrote:
> From: Rafał Miłecki <rafal@milecki.pl>
> 
> Signed-off-by: Rafał Miłecki <rafal@milecki.pl>

I've just tried it with musl (I should have done that before sending a
patch) and noticed is fails with:

"%Y-%m-%d": for "1991-08-25" expected 1991-08-25T00:00:00 (day 237: Sun) but got 1991-08-25T00:00:00 (day 001: Sun)
"%d.%m.%y": for "25.08.91" expected 1991-08-25T00:00:00 (day 237: Sun) but got 1991-08-25T00:00:00 (day 001: Sun)
"%D": for "08/25/91" expected 1991-08-25T00:00:00 (day 237: Sun) but got 1991-08-25T00:00:00 (day 001: Sun)
"%d.%m.%y": for "21.10.15" expected 2015-10-21T00:00:00 (day 294: Wed) but got 2015-10-21T00:00:00 (day 001: Sun)
"%d.%m.%y in %C th": for "10.7.56 in 18th" expected 1856-07-10T00:00:00 (day 192: Thu) but got 1856-07-10T00:00:00 (day 001: Sun)

which I didn't expect.

It's because I assumed glibc behavior which sets tm_wday and tm_yday.

The man says:
"In principle, this function does not initialize tm but stores only the
values specified."

There is a glibc behavior however:
"Details differ a bit between different UNIX sys-tems.  The glibc
implementation does not touch those fields which are not explicitly
specified, except that it recomputes the tm_wday and tm_yday field if
any of the year, month,  or  day  elements changed."

I guess a correct test should allow any behavior and don't test tm_wday
and tm_yday fields.


It also fails with:

"%F": failed to parse "1856-07-10"
"%s": failed to parse "683078400"
"%z": failed to parse "+0200"
"%z": failed to parse "-0530"
"%z": failed to parse "-06"

but that's expected due to unimplemented %F %s and %z.
Szabolcs Nagy Nov. 15, 2018, 9:26 p.m.
* Rafał Miłecki <zajec5@gmail.com> [2018-11-15 11:12:34 +0100]:
> It also fails with:
> 
> "%F": failed to parse "1856-07-10"
> "%s": failed to parse "683078400"
> "%z": failed to parse "+0200"
> "%z": failed to parse "-0530"
> "%z": failed to parse "-06"
> 
> but that's expected due to unimplemented %F %s and %z.

v2 looks good, but these are testing extensions.

do you intend to add patch for %F and %s too?
if not i'll disable those tests
Rafał Miłecki Nov. 15, 2018, 9:31 p.m.
On 2018-11-15 22:26, Szabolcs Nagy wrote:
> * Rafał Miłecki <zajec5@gmail.com> [2018-11-15 11:12:34 +0100]:
>> It also fails with:
>> 
>> "%F": failed to parse "1856-07-10"
>> "%s": failed to parse "683078400"
>> "%z": failed to parse "+0200"
>> "%z": failed to parse "-0530"
>> "%z": failed to parse "-06"
>> 
>> but that's expected due to unimplemented %F %s and %z.
> 
> v2 looks good, but these are testing extensions.
> 
> do you intend to add patch for %F and %s too?
> if not i'll disable those tests

I'm planning to work on those two, but I can't guarantee any deadline.
It may even take me a month+. If you are not comfortable with that, we
can disable/drop them and enable/add later. I'm leaving it up to you.
Bartosz Brachaczek Nov. 16, 2018, 9:21 p.m.
On Thu, Nov 15, 2018 at 11:12 AM Rafał Miłecki <zajec5@gmail.com> wrote:

> On 15.11.2018 08:34, Rafał Miłecki wrote:
> > From: Rafał Miłecki <rafal@milecki.pl>
> >
> > Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
>
> I've just tried it with musl (I should have done that before sending a
> patch) and noticed is fails with:
>
> "%Y-%m-%d": for "1991-08-25" expected 1991-08-25T00:00:00 (day 237: Sun)
> but got 1991-08-25T00:00:00 (day 001: Sun)
> "%d.%m.%y": for "25.08.91" expected 1991-08-25T00:00:00 (day 237: Sun) but
> got 1991-08-25T00:00:00 (day 001: Sun)
> "%D": for "08/25/91" expected 1991-08-25T00:00:00 (day 237: Sun) but got
> 1991-08-25T00:00:00 (day 001: Sun)
> "%d.%m.%y": for "21.10.15" expected 2015-10-21T00:00:00 (day 294: Wed) but
> got 2015-10-21T00:00:00 (day 001: Sun)
> "%d.%m.%y in %C th": for "10.7.56 in 18th" expected 1856-07-10T00:00:00
> (day 192: Thu) but got 1856-07-10T00:00:00 (day 001: Sun)
>
> which I didn't expect.
>
> It's because I assumed glibc behavior which sets tm_wday and tm_yday.
>
> The man says:
> "In principle, this function does not initialize tm but stores only the
> values specified."
>
> There is a glibc behavior however:
> "Details differ a bit between different UNIX sys-tems.  The glibc
> implementation does not touch those fields which are not explicitly
> specified, except that it recomputes the tm_wday and tm_yday field if
> any of the year, month,  or  day  elements changed."
>
> I guess a correct test should allow any behavior and don't test tm_wday
> and tm_yday fields.
>
>
> It also fails with:
>
> "%F": failed to parse "1856-07-10"
> "%s": failed to parse "683078400"
> "%z": failed to parse "+0200"
> "%z": failed to parse "-0530"
> "%z": failed to parse "-06"
>
> but that's expected due to unimplemented %F %s and %z.
>

I cannot find anything in the normative text that would suggest that
recomputing tm_wday and/or tm_yday is required, but interestingly enough,
the strptime example that is used in POSIX seems to rely on that. See:
https://pubs.opengroup.org/onlinepubs/9699919799/functions/strptime.html.
That example does not produce expected output using musl.

Possibly something that should be clarified in POSIX?
Rich Felker Nov. 16, 2018, 9:34 p.m.
On Fri, Nov 16, 2018 at 10:21:01PM +0100, Bartosz Brachaczek wrote:
> On Thu, Nov 15, 2018 at 11:12 AM Rafał Miłecki <zajec5@gmail.com> wrote:
> 
> > On 15.11.2018 08:34, Rafał Miłecki wrote:
> > > From: Rafał Miłecki <rafal@milecki.pl>
> > >
> > > Signed-off-by: Rafał Miłecki <rafal@milecki.pl>
> >
> > I've just tried it with musl (I should have done that before sending a
> > patch) and noticed is fails with:
> >
> > "%Y-%m-%d": for "1991-08-25" expected 1991-08-25T00:00:00 (day 237: Sun)
> > but got 1991-08-25T00:00:00 (day 001: Sun)
> > "%d.%m.%y": for "25.08.91" expected 1991-08-25T00:00:00 (day 237: Sun) but
> > got 1991-08-25T00:00:00 (day 001: Sun)
> > "%D": for "08/25/91" expected 1991-08-25T00:00:00 (day 237: Sun) but got
> > 1991-08-25T00:00:00 (day 001: Sun)
> > "%d.%m.%y": for "21.10.15" expected 2015-10-21T00:00:00 (day 294: Wed) but
> > got 2015-10-21T00:00:00 (day 001: Sun)
> > "%d.%m.%y in %C th": for "10.7.56 in 18th" expected 1856-07-10T00:00:00
> > (day 192: Thu) but got 1856-07-10T00:00:00 (day 001: Sun)
> >
> > which I didn't expect.
> >
> > It's because I assumed glibc behavior which sets tm_wday and tm_yday.
> >
> > The man says:
> > "In principle, this function does not initialize tm but stores only the
> > values specified."
> >
> > There is a glibc behavior however:
> > "Details differ a bit between different UNIX sys-tems.  The glibc
> > implementation does not touch those fields which are not explicitly
> > specified, except that it recomputes the tm_wday and tm_yday field if
> > any of the year, month,  or  day  elements changed."
> >
> > I guess a correct test should allow any behavior and don't test tm_wday
> > and tm_yday fields.
> >
> >
> > It also fails with:
> >
> > "%F": failed to parse "1856-07-10"
> > "%s": failed to parse "683078400"
> > "%z": failed to parse "+0200"
> > "%z": failed to parse "-0530"
> > "%z": failed to parse "-06"
> >
> > but that's expected due to unimplemented %F %s and %z.
> >
> 
> I cannot find anything in the normative text that would suggest that
> recomputing tm_wday and/or tm_yday is required, but interestingly enough,
> the strptime example that is used in POSIX seems to rely on that. See:
> https://pubs.opengroup.org/onlinepubs/9699919799/functions/strptime.html.
> That example does not produce expected output using musl.
> 
> Possibly something that should be clarified in POSIX?

Yes, I think this calls for a defect report/request for
interpretation. Even if one intends that tm_[wy]day be updated, there
is no canonical correct way to do it. Certainly it can't happen
without sufficiently many constraints to determine the value, and it's
not clear what would happen when %a, %w, etc. are also present.
Overall, strptime is severely under-specified, and musl tends to err
on the side of not doing anything it's not specified to do, since it
could turn out that such action is contrary to intended interpretation
or future changes to the standard, in which case applications
depending on the behavior could be broken by fixing musl to conform.

Rich
Rich Felker Nov. 16, 2018, 9:36 p.m.
On Thu, Nov 15, 2018 at 10:26:58PM +0100, Szabolcs Nagy wrote:
> * Rafał Miłecki <zajec5@gmail.com> [2018-11-15 11:12:34 +0100]:
> > It also fails with:
> > 
> > "%F": failed to parse "1856-07-10"
> > "%s": failed to parse "683078400"
> > "%z": failed to parse "+0200"
> > "%z": failed to parse "-0530"
> > "%z": failed to parse "-06"
> > 
> > but that's expected due to unimplemented %F %s and %z.
> 
> v2 looks good, but these are testing extensions.
> 
> do you intend to add patch for %F and %s too?
> if not i'll disable those tests

I think it's a good idea to keep tests for standards-mandated behavior
separate from tests for extensions, even if we do want to support the
latter in musl, and name them appropriately. It matters if/when people
use libc-test to assess the conformance of another implementation that
may not have made the same choices about extensions.

Rich