runtests: display the test status if tests appear hung

It sometimes happens that a test hangs during a test run and never
returns. The test harness will wait indefinitely for the results and on
CI servers the CI job will eventually be killed after an hour or two.
At the end of a test run, if results haven't come in within a couple of
minutes, display the status of all test runners and what tests they're
running to help in debugging the problem.

This feature really only kicks in with parallel testing enabled, which
is fine because without parallel testing it's usually easy to tell what
test has hung.
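
The end-of-run wait check described above can be sketched in isolation.
This is a minimal simulation, not the harness itself: the variable names
mirror the ones in the change, but the value of $jobs, the 300-poll loop
and the report_status() helper are illustrative assumptions. The real
delay depends on how long each poll waits ($runnerwait), which this diff
does not show.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical stand-ins for the harness state. The names follow the
# commit; the values and the helper below are assumptions.
my $jobs = 4;            # number of parallel runners (assumed value)
my @runtests = ();       # empty: every test has already been scheduled
my $endwaitcnt = 0;      # polls spent waiting at the end of the run
my @reports;             # poll numbers at which a status report fired

# Stand-in for the status dump the harness prints on SIGUSR1.
sub report_status {
    my ($poll) = @_;
    push @reports, $poll;
    print "poll $poll: still waiting, dumping runner status\n";
}

# Simulate 300 iterations of the main loop with no results arriving.
for my $poll (1 .. 300) {
    # The counter only advances once the queue is empty, because &&
    # short-circuits. Testing with == rather than >= means the report
    # fires on exactly one iteration instead of on every later poll.
    if(!scalar(@runtests) && ++$endwaitcnt == (240 + $jobs)) {
        report_status($poll);
    }
}
```

With these assumed values the report fires once, on the 244th poll, and
the counter keeps advancing afterwards without triggering it again.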

Closes #11980
Dan Fandrich 2023-09-28 10:41:50 -07:00
parent 5c006df36c
commit 65729f65c7

@@ -2762,6 +2762,7 @@ my $total=0;
my $lasttest=0;
my @at = split(" ", $TESTCASES);
my $count=0;
my $endwaitcnt=0;
$start = time();
@@ -2922,6 +2923,16 @@ while () {
delete $runnersrunning{$riderror} if(defined $runnersrunning{$riderror});
$globalabort = 1;
}
if(!scalar(@runtests) && ++$endwaitcnt == (240 + $jobs)) {
# Once all tests have been scheduled on a runner at the end of a test
# run, we just wait for their results to come in. If we're still
# waiting after a couple of minutes ($endwaitcnt multiplied by
# $runnerwait, plus $jobs because that number won't time out), display
# the same test runner status as we give with a SIGUSR1. This will
# likely point to a single test that has hung.
logmsg "Hmmm, the tests are taking a while to finish. Here is the status:\n";
catch_usr1();
}
}
my $sofar = time() - $start;