Skip to content

Start addressing unchecked snprintf() calls #3898

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 49 commits into from
Jul 7, 2025

Conversation

nrwahl2
Copy link
Contributor

@nrwahl2 nrwahl2 commented Jun 20, 2025

Ref T974

@nrwahl2 nrwahl2 requested a review from clumens June 20, 2025 21:13
@nrwahl2 nrwahl2 force-pushed the nrwahl2-snprintf branch from 7c1730a to 9223b92 Compare June 20, 2025 21:29
@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Jun 20, 2025

Update one copyright and add another commit

@nrwahl2 nrwahl2 force-pushed the nrwahl2-snprintf branch from 9223b92 to bc89939 Compare June 23, 2025 08:48
@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Jun 23, 2025

Two more callers to go, but going to bed

@nrwahl2 nrwahl2 force-pushed the nrwahl2-snprintf branch from bc89939 to 9f150ba Compare June 23, 2025 21:03
@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Jun 23, 2025

This should be all of it. We can split it into two PRs if you prefer that.

@nrwahl2 nrwahl2 force-pushed the nrwahl2-snprintf branch 2 times, most recently from b9f1bdc to f20ff07 Compare June 23, 2025 23:16
| | | any specifier expands to an empty string, then the |
| | | timestamp is discarded. (Expanding to an empty |
| | | string is not an error, but there is no way to |
| | | distinguish this from a too-small buffer.) |
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

......I looked hard for anything like this, in GLib or elsewhere, and never found this. Sigh. What I did find was https://docs.gtk.org/glib/type_func.Date.strftime.html, which doesn't help at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The '%f' could be interesting. Unfortunately it seems not to support limiting to less than 6 digits. And I guess we should stay compatible here with the timestamp-format introduced initially.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This wouldn't mean we have to use g_date_time_format() to replace our parsing of %N. We could perhaps use it to replace the strftime() call, which could be nice.

Although even there, "The format strings understood by this function are a subset of the strftime() format language as specified by C99."

"Subset."

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far, two unit tests fail because g_date_time_format() doesn't support a bare '%' or a width field (e.g., "%3s"). It returns NULL when it finds either of these.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was just thinking of a way to get rid of the %N implementation stating that "s/%.+N/%f/" is unfortunately not exactly the same :-(

Copy link
Contributor

@wenningerk wenningerk Jun 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A width field should be fine with the %N implementation but it would as well stumble over a bare '%' at least if there is another format to follow as it would eat the next '%'.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updating to use g_date_time_format() with a fallback to strftime() for now

@@ -257,44 +257,43 @@ remoted_spawn_pidone(int argc, char **argv, char **envp)
# ifdef HAVE_PROGNAME
/* Differentiate ourselves in the 'ps' output */
{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any answers to your questions, but there's a lot going on in this function and it's all stuff I've never looked into or tested before, so I am very hesitant to make changes.

I don't think HAVE_PROGNAME is relevant here, unless it's some indirect implied weirdness (if __progname exists, then we know we can do all this other stuff even though we don't actually use __progname). It looks like __progname is a variable that glibc sets up for us (https://sourceware.org/git/?p=glibc.git;a=blob;f=misc/init-misc.c;h=2a1b82710ec8b42b4dac6edb359d8920f902cd21;hb=HEAD). Anyway, this is the only place it's ever used so either it's useless and the whole check in configure.ac can go, or it's not relevant and we're doing everything here all wrong.

I'm surprised that the argv and envp overwriting works in the parent, and I don't understand what any of the envp stuff is supposed to be accomplishing. Comments and history are not helpful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with all this.

I just spent some more time looking at this code. I understand your hesitation (and the PR in its current form doesn't change anything except that it adds an error-check).

However, it does appear that this complicated code is PURELY to change the ps output. The parent is either the true "PID 1" in a container, or a faux PID 1 (by setting PCMK_remote_pid1="full") for testing. I'm not gonna pretend to fully understand the latter use case.

The parent sticks around, renamed to "pcmk-init," and does nothing but sleep until it gets a SIGCHLD (at which point it waits for a dead child) or SIGINT (at which point it exits). The child becomes the actual pacemaker-remoted daemon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So it seems like the envp stuff could go away, and that all of this complicated overwriting stuff is only there in case argv[0] isn't big enough to hold the string "pcmk-init".

Nice find on __progname in the glibc source code. Yeah... it seems irrelevant. Like maybe we "need" this if __progname exists, but it seems like we could overwrite argv even if it doesn't.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR is already too big, so I'll leave this alone for this PR... but would be nice to clean up in the near future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be worth filing a task in the upstream tracker so we perhaps remember to do this eventually.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternately, do we actually care if the program name is changed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alternately, do we actually care if the program name is changed?

It doesn't seem important but could be visually helpful. Otherwise we may have two /usr/bin/pacemaker-remoted processes, one of which is the parent (just waiting to reap dead children) and one of which is the child that's actually doing work.

@clumens clumens added the review: in progress PRs that are currently being reviewed label Jun 26, 2025
@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Jul 1, 2025

Updated to partially address review

@@ -92,8 +95,9 @@ no_specifiers(void **state)
static void
without_frac(void **state)
{
assert_hr_format("%Y-%m-%d %H:%M:%S", DATE_S " " TIME_S, NULL, 0);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should have done two separate commits. There are a few new tests of this form, and then the macro additions.

@clumens
Copy link
Contributor

clumens commented Jul 1, 2025

Retest this please

@nrwahl2 nrwahl2 force-pushed the nrwahl2-snprintf branch from 96f6fc9 to 683519c Compare July 1, 2025 20:29
@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Jul 1, 2025

@clumens Latest push just updates a commit to move the free in daemons/fenced/cts-fence-helper.c. Ready for review again.

The previous push only added new commits in iso8601.c

@clumens
Copy link
Contributor

clumens commented Jul 1, 2025

Regarding the various time changes and problems... it looks like some of this was already known: https://projects.clusterlabs.org/T432

Also, #3846 (comment) is in there too.

@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Jul 1, 2025

Regarding the various time changes and problems... it looks like some of this was already known: https://projects.clusterlabs.org/T432

I'm seeing the overlap only in the general idea that we want to get rid of our custom time handling as much as possible. This PR doesn't deal with parsing, which is where most of the explicit issues in T432 seem to arise. I put "Ref T432" in my commit message.

Our iso8601 stuff is a big mess and I look forward to shrinking it drastically.

nrwahl2 added 12 commits July 2, 2025 22:57
We don't error-check the return code, and the offset update idiom is
painful to look at. It won't hurt us to do a tiny bit more string
allocation in the heap for readability. We already do a ton of
allocation for XML and Pacemaker data structures.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Nothing uses this yet

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
The %N formatting specifier behavior is not obvious, but we're stuck
with it now. Future commits will at least clarify it.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
We weren't error-checking it, and this way we don't have to. GString
functions assert on memory error.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
The use of "%N" and "nano_*" were misleading. We only support storing
and displaying times at microsecond resolution or less.
* Storing: Note that we fetch hr_dt->useconds and append three zeros
  unconditionally in the string representation (as nanoseconds).
* Displaying: Note the `nano_digits = QB_MIN(nano_digits, 6)`. This caps
  the displayed field width so that the fractional (decimal) component
  of a timestamp is at microsecond resolution or coarser. So we can
  never display any of the three zeros that were appended above.

We have to keep "%N" and the existing behavior for backward
compatibility, of course.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
We weren't error-checking sprintf() before. Just use
g_string_append_printf() so that the error checking is done for us.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
This asserts on error.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
A leading zero is ignored for "%xN", so remove it from the example. Add
detail to the description.

Also, we strftime() into a fixed-size buffer of 128 characters. For some
reason, there is no general way to strftime into a dynamically sized
buffer or to determine how large the buffer needs to be.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To avoid the need to error-check snprintf() explicitly, which we have
not been doing.

Right now we can be sure port is at most 65536, but it's simpler not to
use that assumption in get_address_info() anyway. This extra allocation
shouldn't impact performance in any significant way.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To avoid the need to error-check snprintf() explicitly, which we have
not been doing.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
nrwahl2 added 25 commits July 2, 2025 22:57
Previously, we required an input buffer and its size as arguments. Now,
we take an output argument, which can be NULL. If not NULL, we
dynamically allocate a buffer and return it via the output argument.

We can pass the full buffer size (PATH_MAX) to readlink(), rather than
passing the buffer size minus one. The readlink(2) man page says to test
for truncation by checking whether the return value is equal to the
buffer size. readlink() doesn't append a terminating null character, but
that's okay, because the buffer is initialized with zeros. If fewer than
PATH_MAX bytes were written, then there's at least one null byte at the
end.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To avoid the need to error-check snprintf() explicitly, which we have
not been doing.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To avoid the need to error-check snprintf() explicitly, which we have
not been doing.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To avoid the need to error-check snprintf() explicitly, which we have
not been doing.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To avoid the need to error-check snprintf() explicitly, which we have
not been doing.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
NAME_MAX isn't uniformly enforced (see URL below), and this avoids the
need to explicitly error-check snprintf().

www.gnu.org/software/libc/manual/html_node/Limits-for-Files.html

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To avoid the need to error-check snprintf() explicitly, which we have
not been doing. This also avoids imposing an arbitrary (albeit
reasonable) limit of 500 characters on tag length.

It should be simple to replace some of this logic with a call to
g_strsplit(). However, that would require testing that I don't feel like
doing right now. The current change seems straightforward enough.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To avoid the need to error-check snprintf() explicitly, which we have
not been doing. This also avoids imposing an arbitrary (albeit
reasonable) limit of 512 characters on action name length. Pacemaker
technically supports arbitrary action names.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To avoid the need to error-check snprintf() explicitly, which we have
not been doing. This also avoids imposing an arbitrary (albeit
reasonable) limit of 512 characters on action name length. Pacemaker
technically supports arbitrary action names.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To avoid the need to error-check snprintf() explicitly, which we have
not been doing.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To avoid the need to error-check snprintf() explicitly, which we have
not been doing.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To avoid the need to error-check snprintf() explicitly, which we have
not been doing.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To avoid the need to error-check snprintf() explicitly, which we have
not been doing.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To avoid the need to error-check snprintf() explicitly, which we have
not been doing.

Ref T974

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Also break up the test date into separate macros for convenience, and
use them in the "expected" strings.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Currently ha_get_tm_time() assigns the source crm_time_t object's years
(day of year) member to the mday (day of month) member of the target
struct tm.

One would expect that we would instead assign it to the target's yday
(day of year) member.

However, the call to mktime() fixes out-of-range values to make all the
target's fields consistent if possible. For example, October 40 becomes
November 9. The original yday value is ignored. This is why we assign
the day of the year to the mday member.

If there were any problem with this logic (for example, if mktime() were
omitted), then a test date in January may still give correct results.

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
in pcmk__time_format_hr() unit tests

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
in pcmk__time_format_hr() unit tests. The parsing logic in
pcmk__time_format_hr() is hard to follow. For a little while, I was
confused and thought it would give incorrect output if there were any
format specifiers after a "%N".

There is probably no real use case for this support, but it should work
(and currently it does work).

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
strftime() requires a fixed-size buffer. It's not feasible to determine
in advance how large the buffer will need to be, since a user configures
the format string for alert timestamps.

Using g_date_time_format() fixes this. However, it does not support
certain syntax, such as a width field or a literal '%' character (unless
escaped as "%%"). So we need to maintain backward compatibility for now,
lest alert timestamps break for some users.

This also moves us one step closer toward drastically shrinking
Pacemaker's time libraries, perhaps in favor of GDateTime.

Ref T432

Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
@nrwahl2 nrwahl2 force-pushed the nrwahl2-snprintf branch from 683519c to 5a783ff Compare July 3, 2025 05:57
@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Jul 3, 2025

Rebased on main, no other changes

@clumens clumens merged commit cd6b3a8 into ClusterLabs:main Jul 7, 2025
1 check passed
@nrwahl2
Copy link
Contributor Author

nrwahl2 commented Jul 9, 2025

Rebased on main, no other changes

@clumens just FYI, this referred only to the second push. Prior to that, I pushed 12 new commits on top. Just in case you didn't see those. Of course I hope the new ones are fine, and the unit tests passed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
review: in progress PRs that are currently being reviewed
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants