autoconf/build-aux/help-extract.pl
Zack Weinberg 9b5c0f1774
Generate manpages directly from source code.
We generate manpages for autoconf’s installed programs (autoconf,
autoheader, etc.) using help2man, which runs each program in order to
learn its --help output.  Each manpage therefore has a dependency on
the existence of the corresponding program, but this dependency is
intentionally left out of the Makefile so that one can build from a
tarball release (which will include prebuilt manpages) without having
help2man installed.

But when building from a git checkout with high levels of
parallelism (-j20 or so), the missing dependency can lead to build
failures, because help2man will try to run the program before it
exists.  In an earlier patch I tried to work around this with a
recursive make invocation in the ‘.x.1’ rule, to ensure the existence
of the program.  That only traded one concurrency bug for another:
now two jobs could try to build the same program simultaneously,
clobber each other’s work, and the build would still fail.

Instead, this patch introduces a utility script ‘help-extract.pl’ that
reads --help and --version information directly from the source code
for each program.  This utility, wrapped appropriately for each
program, is what help2man now runs.  Usage is a little weird because
help2man doesn’t let you specify any arguments to the “executable”
that it runs, but it works, and lets us write all of the true
dependencies of each manpage into the Makefile without naming any file
that would be created during a build from a tarball.  help-extract.pl
is a Perl script, so it introduces no new build-time requirements.
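
As a rough illustration of reading the help text from source rather than
running the program, here is a minimal sketch.  The sample source text
and the names extract_raw and $source_text are hypothetical, and the
loop is a simplification of what help-extract.pl actually does (it
handles neither escapes nor substitutions).

```perl
use strict;
use warnings;

# Hypothetical stand-in for a script source such as bin/autoheader.in;
# the real files are longer but define $help in this general shape.
my $source_text = <<'EOF';
#! @PERL@
my $verbose = 0;
$help = "Usage: $0 [OPTION]...
Print a greeting.
";
print $help;
EOF

# Return the raw text between `$WHAT = "` and the closing `";` line.
# Escapes and substitutions are deliberately left untouched here.
sub extract_raw
{
  my ($text, $what) = @_;
  my $value = "";
  my $inside = 0;

  # In-memory filehandle, so the sketch needs no file on disk.
  open (my $fh, "<", \$text) or die "cannot read source text: $!\n";
  while (my $line = <$fh>)
    {
      if (!$inside)
        {
          # The opening line must look exactly like: $help = "...
          if ($line =~ s/^\$\Q$what\E = "//)
            {
              $value .= $line;
              $inside = 1;
            }
        }
      elsif ($line =~ /^";$/)
        {
          # The closing delimiter must sit alone on its line.
          return $value;
        }
      else
        {
          $value .= $line;
        }
    }
  die "unexpected end of input while looking for \$$what\n";
}

my $raw = extract_raw ($source_text, "help");
print $raw;
```

The sketch never executes the sample program, which is why the manpage
rules no longer depend on the built scripts.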

A downside is that every script source in bin/, and also part of
lib/Autom4te/ChannelDefs.pm, must remain parseable by help-extract.
The most important constraints are that the text output
by --help must be defined in a global variable named ‘help’, and its
definition has to be formatted just the way these definitions are
currently formatted.  Similarly for --version.  Furthermore, only some
non-literal substitutions are possible in these texts; each has to be
explicitly supported in help-extract.pl.  The current list of supported
substitutions is $0, @PACKAGE_NAME@, @VERSION@, @RELEASE_YEAR@, and
Autom4te::ChannelDefs::usage.
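
The following sketch shows, under stated assumptions, how a help string
can have its backslash escapes processed while $0 and the @...@ tokens
stay literal until they are replaced explicitly.  The helper name
unescape_only, the %subst table, and the values in it (including the
version number) are hypothetical; only the escaping regex follows the
approach used in help-extract.pl.

```perl
use strict;
use warnings;

# Evaluate a "double-quoted" string literal (delimiters included),
# honoring backslash escapes but leaving $ and @ as literal text.
sub unescape_only
{
  my ($quoted) = @_;
  # Put a backslash before every $ or @ that is not already escaped,
  # i.e. that is preceded by an even number of backslashes.
  $quoted =~ s/ (?<!\\) ((?>\\\\)*) ([\$\@]) /$1\\$2/xg;
  my $text = eval $quoted;
  die "cannot evaluate string: $@" unless defined $text;
  return $text;
}

# Hypothetical inputs; in the real build these come from the Makefile.
my %subst = (PACKAGE_NAME => "GNU Autoconf", VERSION => "9.99");
my $cmd_name = "autoconf";

# Single quotes keep the backslash-n, $0 and @...@ tokens literal,
# exactly as they would appear in a script source file.
my $quoted = '"Usage: $0 [OPTION]...\nPart of @PACKAGE_NAME@ @VERSION@.\n"';

my $text = unescape_only ($quoted);

# Only explicitly supported substitutions are applied.
$text =~ s/\$0\b/$cmd_name/g;
$text =~ s/\@(PACKAGE_NAME|VERSION)\@/$subst{$1}/g;
print $text;
```

Anything in the help text that is not on the supported list would be
printed verbatim, which is why each new substitution needs explicit
support in the script.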

The generated manpages themselves are character-for-character
identical before and after this patch.

 * build-aux/help-extract.pl: New build script that extracts --help
   and --version output from the source code of each program.

 * man/autoconf.w, man/autoheader.w, man/autom4te.w, man/autoreconf.w
 * man/autoscan.w, man/autoupdate.w, man/ifnames.w: New shell scripts
   which wrap build-aux/help-extract.pl.

 * man/local.mk: Generate each manpage by running help2man on the
   corresponding .w script, not on the built utility itself.
   Revise all dependencies to match.

 * bin/autoconf.as: Rename ‘usage’ variable to ‘help’ and
   ‘help’ variable to ‘usage_err’.
 * bin/autoheader.in: Call Autom4te::ChannelDefs::usage with no
   function-call parentheses, matching all the other scripts.
 * bin/autom4te.in: Initialize $version with a regular double-quoted
   string, not a heredoc, matching all the other scripts.
 * bin/autoscan.in: Remove global variable $configure_scan.
2020-08-21 16:23:32 -04:00

# help-extract -- extract --help and --version output from a script.
# Copyright (C) 2020 Free Software Foundation, Inc.

# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.

# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.

# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <https://www.gnu.org/licenses/>.

# Written by Zack Weinberg.

use strict;
use warnings;

# File::Spec itself was added in 5.005.
# File::Spec::Functions was added in 5.6.1 which is just barely too new.
use File::Spec;

# This script is not intended to be used directly.  It's run by
# help2man via wrappers in man/, e.g. man/autoconf.w, as if it were
# one of autoconf's executable scripts.  It extracts the --help and
# --version output of that script from its source form, without
# actually running it.  The script to work from is set by the wrapper,
# and several other parameters are passed down from the Makefile as
# environment variables; see main below.

# The point of this script is that the preprocessed forms of the
# executable scripts, and their wrappers for uninstalled use
# (e.g. <build-dir>/{bin,tests}/autoconf), do not need to exist to
# generate the corresponding manpages.  This is desirable because we
# can't put those dependencies in the makefiles without breaking
# people's ability to build autoconf from a release tarball without
# help2man installed.  It also ensures that we will generate manpages
# from the current source code and not from an older version of the
# script that has already been installed.

## ----------------------------- ##
## Extraction from Perl scripts. ##
## ----------------------------- ##

sub eval_qq_no_interpolation ($)
{
  # The argument is expected to be a "double quoted string" including the
  # leading and trailing delimiters.  Returns the text of this string after
  # processing backslash escapes but NOT interpolation.

  # / (?<!\\) (?>\\\\)* blah /x means match blah preceded by an
  # *even* number of backslashes.  It would be nice if we could use \K
  # to exclude the backslashes from the matched text, but that was only
  # added in Perl 5.10 and we still support back to 5.006.  For the same
  # reason, substitute on a copy rather than use s///r (Perl 5.14).
  (my $quoted = $_[0]) =~ s/ (?<!\\) (?>\\\\)* [\$\@] /\\$&/xg;
  return eval $quoted;
}

sub extract_channeldefs_usage ($)
{
  my ($channeldefs_pm) = @_;
  my $usage = "";
  my $parse_state = 0;
  local $_;

  open (my $fh, "<", $channeldefs_pm) or die "$channeldefs_pm: $!\n";
  while (<$fh>)
    {
      if ($parse_state == 0)
        {
          # Wait for the start of 'sub usage'.
          $parse_state = 1 if /^sub usage\b/;
        }
      elsif ($parse_state == 1)
        {
          # Wait for the return statement that opens the string;
          # any amount of leading indentation is accepted.
          if (s/^\s*return "//)
            {
              $parse_state = 2;
              $usage .= $_;
            }
        }
      elsif ($parse_state == 2)
        {
          # Collect lines until the unescaped closing quote.
          if (s/(?<!\\) ((?>\\\\)*) "; $/$1/x)
            {
              $usage .= $_;
              return $usage;
            }
          else
            {
              $usage .= $_;
            }
        }
    }

  die "$channeldefs_pm: unexpected EOF in state $parse_state\n";
}

sub extract_perl_assignment (*$$$)
{
  my ($fh, $source, $channeldefs_pm, $what) = @_;
  my $value = "";
  my $parse_state = 0;
  local $_;

  while (<$fh>)
    {
      if ($parse_state == 0)
        {
          if (s/^\$\Q${what}\E = (?=")//o)
            {
              $value .= $_;
              $parse_state = 1;
            }
        }
      elsif ($parse_state == 1)
        {
          if (/^"\s*\.\s*Autom4te::ChannelDefs::usage\s*(?:\(\))?\s*\.\s*"$/)
            {
              $value .= extract_channeldefs_usage ($channeldefs_pm);
            }
          elsif (/^";$/)
            {
              $value .= '"';
              return eval_qq_no_interpolation ($value);
            }
          else
            {
              $value .= $_;
            }
        }
    }

  die "$source: unexpected EOF in state $parse_state\n";
}

## ------------------------------ ##
## Extraction from shell scripts. ##
## ------------------------------ ##

sub extract_shell_assignment (*$$)
{
  my ($fh, $source, $what) = @_;
  my $value = "";
  my $parse_state = 0;
  local $_;

  while (<$fh>)
    {
      if ($parse_state == 0)
        {
          if (/^\Q${what}\E=\[\"\\$/)
            {
              $parse_state = 1;
            }
        }
      elsif ($parse_state == 1)
        {
          my $done = s/"\]$//;
          $value .= $_;
          if ($done)
            {
              # This is not strictly correct but it works acceptably
              # for the escapes that actually occur in the strings
              # we're extracting.
              return eval_qq_no_interpolation ('"'.$value.'"');
            }
        }
    }

  die "$source: unexpected EOF in state $parse_state\n";
}

## -------------- ##
## Main program.  ##
## -------------- ##

sub extract_assignment ($$$)
{
  my ($source, $channeldefs_pm, $what) = @_;

  open (my $fh, "<", $source) or die "$source: $!\n";

  my $firstline = <$fh>;
  if ($firstline =~ /\@PERL\@/ || $firstline =~ /-\*-\s*perl\s*-\*-/i)
    {
      return extract_perl_assignment ($fh, $source, $channeldefs_pm, $what);
    }
  elsif ($firstline =~ /\bAS_INIT\b/
         || $firstline =~ /bin\/[a-z0-9]*sh\b/
         || $firstline =~ /-\*-\s*shell-script\s*-\*-/i)
    {
      return extract_shell_assignment ($fh, $source, $what);
    }
  else
    {
      die "$source: language not recognized\n";
    }
}

sub main ()
{
  # Most of our arguments come from environment variables, because
  # help2man doesn't allow for passing additional command line
  # arguments to the wrappers, and it's easier to write the wrappers
  # to not mess with the command line.
  my $usage = "Usage: $0 script-source (--help | --version)

Extract help and version information from a perl or shell script.
Required environment variables:

  top_srcdir       relative path from cwd to the top of the source tree
  channeldefs_pm   relative path from top_srcdir to ChannelDefs.pm
  PACKAGE_NAME     the autoconf PACKAGE_NAME substitution variable
  VERSION          the autoconf VERSION substitution variable
  RELEASE_YEAR     the autoconf RELEASE_YEAR substitution variable

The script-source argument should also be relative to top_srcdir.
";

  my $source = shift(@ARGV) || die $usage;
  my $what = shift(@ARGV) || die $usage;
  my $top_srcdir = $ENV{top_srcdir} || die $usage;
  my $channeldefs_pm = $ENV{channeldefs_pm} || die $usage;
  my $package_name = $ENV{PACKAGE_NAME} || die $usage;
  my $version = $ENV{VERSION} || die $usage;
  my $release_year = $ENV{RELEASE_YEAR} || die $usage;

  if ($what eq "-h" || $what eq "--help")
    {
      $what = "help";
    }
  elsif ($what eq "-V" || $what eq "--version")
    {
      $what = "version";
    }
  else
    {
      die $usage;
    }

  # Derive the command name from the source path, e.g. bin/autoconf.as
  # becomes autoconf.  s///r would need Perl 5.14, so modify a copy.
  (my $cmd_name = $source) =~ s{^.*/([^./]+)\.(?:as|in)$}{$1};
  $source = File::Spec->catfile($top_srcdir, $source);
  $channeldefs_pm = File::Spec->catfile($top_srcdir, $channeldefs_pm);

  my $text = extract_assignment ($source, $channeldefs_pm, $what);
  $text =~ s/\$0\b/$cmd_name/g;
  $text =~ s/[@]PACKAGE_NAME@/$package_name/g;
  $text =~ s/[@]VERSION@/$version/g;
  $text =~ s/[@]RELEASE_YEAR@/$release_year/g;
  print $text;
}

main;

### Setup "GNU" style for perl-mode and cperl-mode.
## Local Variables:
## perl-indent-level: 2
## perl-continued-statement-offset: 2
## perl-continued-brace-offset: 0
## perl-brace-offset: 0
## perl-brace-imaginary-offset: 0
## perl-label-offset: -2
## cperl-indent-level: 2
## cperl-brace-offset: 0
## cperl-continued-brace-offset: 0
## cperl-label-offset: -2
## cperl-extra-newline-before-brace: t
## cperl-merge-trailing-else: nil
## cperl-continued-statement-offset: 2
## End: