Browse Source

scripts: Replace changelog program parsers with perl modules

Using programs to implement the custom changelog parsers was very
inefficient as it required to parse the custom changelog, output deb822
formatted entries to then parse that and output again with the desired
format.

These were implemented as programs because at the time the perl code
in dpkg was not using perl modules, so it was not easy to extend. Using
perl modules now is cleaner and allows for a faster implementation.

In addition there's no known users in Debian, so it was deemed safe to
remove the support without a transition.
Guillem Jover 7 years ago
parent
commit
d1629d0ec1

+ 1 - 0
debian/changelog

@@ -7,6 +7,7 @@ dpkg (1.18.8) UNRELEASED; urgency=medium
     - Use warnings::warnif() instead of carp() for deprecated warnings.
     - Add new format_range() method and deprecate dpkg() and rfc822() methods
       in Dpkg::Changelog.
+    - Replace changelog program parsers with perl modules.
   * Test suite:
     - Bump perlcritic ValuesAndExpressions::RequireNumberSeparators minimum
       to 99999.

+ 0 - 1
debian/libdpkg-perl.install

@@ -1,4 +1,3 @@
-usr/lib/dpkg/parsechangelog usr/lib/dpkg
 usr/share/locale/*/LC_MESSAGES/dpkg-dev.mo
 usr/share/man/man3/Dpkg*.3
 usr/share/perl5/Dpkg*

+ 25 - 0
doc/README.feature-removal-schedule

@@ -75,6 +75,16 @@ Why:
  guarantee of what exact format version will be used to produce the
  output file. They have been replaced with a new --deb-format option.
 
+What: -L (dpkg-parsechangelog option)
+Status: obsolete
+Since: 1.18.8
+When: 1.19.x
+Warning: program
+Why:
+ The custom parsers have been switched from programs to perl modules,
+ and this option has no use any longer. The caller can set PERL5LIB or
+ PERLLIB to specify the perl module search path now.
+
 History of feature removals
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 
@@ -145,3 +155,18 @@ When: 1.16.4
 Warning: program
 Why:
  This option has been superseded by -tudeb.
+
+What: support for custom changelog parsers as programs
+Status: removed
+Since: 1.18.8
+Why:
+ Using programs to implement the custom changelog parsers was very inefficient
+ as it required to parse the custom changelog, output deb822 formatted entries
+ to then parse that and output again with the desired format.
+ .
+ These were implemented as programs because at the time the perl code in dpkg
+ was not using perl modules, so it was not easy to extend. Using perl modules
+ now is cleaner and allows for a faster implementation.
+ .
+ In addition there's no known users in Debian, so it was deemed safe to remove
+ the support without a transition.

+ 13 - 21
man/dpkg-parsechangelog.1

@@ -48,11 +48,9 @@ defaults to the \fBdebian\fP standard format. See also
 \fBCHANGELOG FORMATS\fP.
 .TP
 .BR \-L " \fIlibdir\fP"
-Specify an additional directory to search for parser scripts.
-This directory is searched before the default directories
-which are currently
-.BR /usr/local/lib/dpkg/parsechangelog " and "
-.BR /usr/lib/dpkg/parsechangelog .
+Obsolete option without effect (since dpkg 1.18.8).
+Setting the perl environment variables \fBPERL5LIB\fP or \fBPERLLIB\fP
+has a similar effect when looking for the parser perl modules.
 .TP
 .BR \-S ", " \-\-show\-field " \fIfield\fP"
 Specifies the name of the field to show (since dpkg 1.17.0).
@@ -66,8 +64,7 @@ Show the version and exit.
 .SS Parser Options
 The following options can be used to influence the output of
 the changelog parser, e.g. the range of entries or the format
-of the output. They need to be supported by the parser script
-in question. See also \fBNOTES\fP.
+of the output.
 .TP
 .BI \-\-file " file"
 Set the changelog filename to parse.
@@ -162,17 +159,14 @@ The part in parentheses should be the name of the format. For example:
 Changelog format names are non-empty strings of alphanumerics.
 
 If such a line exists then \fBdpkg\-parsechangelog\fP will look for
-the parser as \fB/usr/lib/dpkg/parsechangelog/\fP\fIotherformat\fP
-or \fB/usr/local/lib/dpkg/parsechangelog/\fP\fIotherformat\fP; it is
-an error for it not being present or not being an executable program.
+the parser as a \fBDpkg::Changelog::\fP\fIOtherformat\fP perl module;
+it is an error for it not being present.
+The parser name in the perl module will be automatically capitalized.
 The default changelog format is \fBdebian\fP, and a parser for it is
 provided by default.
 
-The parser will be invoked with the changelog open on standard input at
-the start of the file. It should read the file (it may seek if it wishes)
-to determine the information required and return the parsed information
-to standard output in the format specified by the \fB\-\-format\fP option.
-It should accept all \fBParser Options\fP.
+The parser should be derived from the Dpkg::Changelog class and implement
+the required documented interface.
 
 If the changelog format which is being parsed always or almost always
 leaves a blank line between individual change notes, these blank lines
@@ -182,17 +176,15 @@ If the changelog format does not contain date or package name information
 this information should be omitted from the output. The parser should not
 attempt to synthesize it or find it from other sources.
 
-If the changelog does not have the expected format the parser should exit
-with a nonzero exit status, rather than trying to muddle through and
-possibly generating incorrect output.
+If the changelog does not have the expected format the parser should
+error out, rather than trying to muddle through and possibly generating
+incorrect output.
 
 A changelog parser may not interact with the user at all.
 .
 .SH NOTES
 All \fBParser Options\fP except for \fB\-v\fP are only supported
-since dpkg 1.14.16. Third party parsers for
-changelog formats other than \fBdebian\fP might not support
-all options.
+since dpkg 1.14.16.
 .PP
 Short option parsing with non-bundled values available only since dpkg 1.18.0.
 .

+ 94 - 157
scripts/Dpkg/Changelog/Parse.pm

@@ -33,7 +33,7 @@ package Dpkg::Changelog::Parse;
 use strict;
 use warnings;
 
-our $VERSION = '1.01';
+our $VERSION = '1.02';
 our @EXPORT = qw(
     changelog_parse_debian
     changelog_parse_plugin
@@ -46,15 +46,67 @@ use Dpkg ();
 use Dpkg::Util qw(none);
 use Dpkg::Gettext;
 use Dpkg::ErrorHandling;
-use Dpkg::Changelog::Debian;
 use Dpkg::Control::Changelog;
 
+sub _changelog_detect_format {
+    my $file = shift;
+    my $format = 'debian';
+
+    # Extract the format from the changelog file if possible
+    if ($file ne '-') {
+        local $_;
+
+        open my $format_fh, '-|', 'tail', '-n', '40', $file
+            or syserr(g_('cannot create pipe for %s'), 'tail');
+        while (<$format_fh>) {
+            $format = $1 if m/\schangelog-format:\s+([0-9a-z]+)\W/;
+        }
+        close $format_fh or subprocerr(g_('tail of %s'), $file);
+    }
+
+    return $format;
+}
+
 =head1 FUNCTIONS
 
 =over 4
 
 =item $fields = changelog_parse_debian(%opt)
 
+This function is deprecated, use changelog_parse() instead, with the changelog
+format set to "debian".
+
+=cut
+
+sub changelog_parse_debian {
+    my (%options) = @_;
+
+    warnings::warnif('deprecated',
+                     'deprecated function changelog_parse_debian, use changelog_parse instead');
+
+    # Force the plugin to be debian.
+    $options{changelogformat} = 'debian';
+
+    return _changelog_parse(%options);
+}
+
+=item $fields = changelog_parse_plugin(%opt)
+
+This function is deprecated, use changelog_parse() instead.
+
+=cut
+
+sub changelog_parse_plugin {
+    my (%options) = @_;
+
+    warnings::warnif('deprecated',
+                     'deprecated function changelog_parse_plugin, use changelog_parse instead');
+
+    return _changelog_parse(%options);
+}
+
+=item $fields = changelog_parse(%opt)
+
 This function will parse a changelog. In list context, it returns as many
 Dpkg::Control objects as the parser did create. In scalar context, it will
 return only the first one. If the parser did not return any data, it will
@@ -65,30 +117,53 @@ The changelog file that is parsed is F<debian/changelog> by default but it
 can be overridden with $opt{file}. The default output format is "dpkg" but
 it can be overridden with $opt{format}.
 
-The parsing itself is done by Dpkg::Changelog::Debian.
+The parsing itself is done by a parser module (searched in the standard
+perl library directories. That module is named according to the format that
+it is able to parse, with the name capitalized. By default it is either
+Dpkg::Changelog::Debian (from the "debian" format) or the format name looked
+up in the 40 last lines of the changelog itself (extracted with this perl
+regular expression "\schangelog-format:\s+([0-9a-z]+)\W"). But it can be
+overridden with $opt{changelogformat}.
+
+All the other keys in %opt are forwarded to the parser module constructor.
 
 =cut
 
-sub changelog_parse_debian {
+sub _changelog_parse {
     my (%options) = @_;
 
     # Setup and sanity checks.
+    if (exists $options{libdir}) {
+        warnings::warnif('deprecated',
+                         'obsolete libdir option, changelog parsers are now perl modules');
+    }
+
     $options{file} //= 'debian/changelog';
     $options{label} //= $options{file};
+    $options{changelogformat} //= _changelog_detect_format($options{file});
     $options{format} //= 'dpkg';
-    $options{all} = 1 if exists $options{all};
 
-    if (none { defined $options{$_} } qw(since until from to offset count all)) {
+    my @range_opts = qw(since until from to offset count all);
+    $options{all} = 1 if exists $options{all};
+    if (none { defined $options{$_} } @range_opts) {
         $options{count} = 1;
     }
-
     my $range;
-    foreach my $opt (qw(since until from to offset count all)) {
+    foreach my $opt (@range_opts) {
         $range->{$opt} = $options{$opt} if exists $options{$opt};
     }
 
-    my $changes = Dpkg::Changelog::Debian->new(reportfile => $options{label},
-                                               range => $range);
+    # Find the right changelog parser.
+    my $format = ucfirst lc $options{changelogformat};
+    my $changes;
+    eval qq{
+        require Dpkg::Changelog::$format;
+        \$changes = Dpkg::Changelog::$format->new();
+    };
+    error(g_('changelog format %s is unknown: %s'), $format, $@) if $@;
+    $changes->set_options(reportfile => $options{label}, range => $range);
+
+    # Load and parse the changelog.
     $changes->load($options{file})
         or error(g_('fatal error occurred while parsing %s'), $options{file});
 
@@ -110,163 +185,25 @@ sub changelog_parse_debian {
     }
 }
 
-sub _changelog_detect_format {
-    my $file = shift;
-    my $format = 'debian';
-
-    # Extract the format from the changelog file if possible
-    if ($file ne '-') {
-        local $_;
-
-        open my $format_fh, '-|', 'tail', '-n', '40', $file
-            or syserr(g_('cannot create pipe for %s'), 'tail');
-        while (<$format_fh>) {
-            $format = $1 if m/\schangelog-format:\s+([0-9a-z]+)\W/;
-        }
-        close $format_fh or subprocerr(g_('tail of %s'), $file);
-    }
-
-    return $format;
-}
-
-=item $fields = changelog_parse_plugin(%opt)
-
-This function will parse a changelog. In list context, it returns as many
-Dpkg::Control objects as the parser did output. In scalar context, it will
-return only the first one. If the parser did not return any data, it will
-return an empty list in list context or undef on scalar context. If the
-parser failed, it will die.
-
-The parsing itself is done by an external program (searched in the
-following list of directories: $opt{libdir},
-F</usr/local/lib/dpkg/parsechangelog>, F</usr/lib/dpkg/parsechangelog>).
-That program is named according to the format that it is able to parse. By
-default it is either "debian" or the format name looked up in the 40 last
-lines of the changelog itself (extracted with this perl regular expression
-"\schangelog-format:\s+([0-9a-z]+)\W"). But it can be overridden
-with $opt{changelogformat}. The program expects the content of the
-changelog file on its standard input.
-
-The changelog file that is parsed is F<debian/changelog> by default but it
-can be overridden with $opt{file}.
-
-All the other keys in %opt are forwarded as parameter to the external
-parser. If the key starts with "-", it is passed as is. If not, it is passed
-as "--<key>". If the value of the corresponding hash entry is defined, then
-it is passed as the parameter that follows.
-
-=cut
-
-sub changelog_parse_plugin {
+sub changelog_parse {
     my (%options) = @_;
 
-    # Setup and sanity checks.
-    $options{file} //= 'debian/changelog';
-
-    my @parserpath = ('/usr/local/lib/dpkg/parsechangelog',
-                      "$Dpkg::LIBDIR/parsechangelog",
-                      '/usr/lib/dpkg/parsechangelog');
-    my $format;
-
-    # Extract and remove options that do not concern the changelog parser
-    # itself (and that we shouldn't forward)
-    delete $options{forceplugin};
-    if (exists $options{libdir}) {
-	unshift @parserpath, $options{libdir};
-	delete $options{libdir};
-    }
-    if (exists $options{changelogformat}) {
-	$format = $options{changelogformat};
-	delete $options{changelogformat};
-    } else {
-	$format = _changelog_detect_format($options{file});
-    }
-
-    # Find the right changelog parser
-    my $parser;
-    foreach my $dir (@parserpath) {
-        my $candidate = "$dir/$format";
-	next if not -e $candidate;
-	if (-x _) {
-	    $parser = $candidate;
-	    last;
-	} else {
-	    warning(g_('format parser %s not executable'), $candidate);
-	}
-    }
-    error(g_('changelog format %s is unknown'), $format) if not defined $parser;
-
-    # Create the arguments for the changelog parser
-    my @exec = ($parser, "-l$options{file}");
-    foreach my $option (keys %options) {
-	if ($option =~ m/^-/) {
-	    # Options passed untouched
-	    push @exec, $option;
-	} else {
-	    # Non-options are mapped to long options
-	    push @exec, "--$option";
-	}
-	push @exec, $options{$option} if defined $options{$option};
-    }
-
-    # Fork and call the parser
-    my $pid = open(my $parser_fh, '-|');
-    syserr(g_('cannot fork for %s'), $parser) unless defined $pid;
-    if (not $pid) {
-        exec @exec or syserr(g_('cannot execute format parser: %s'), $parser);
+    if (exists $options{forceplugin}) {
+        warnings::warnif('deprecated', 'obsolete forceplugin option');
     }
 
-    # Get the output into several Dpkg::Control objects
-    my (@res, $fields);
-    while (1) {
-        $fields = Dpkg::Control::Changelog->new();
-        last unless $fields->parse($parser_fh, g_('output of changelog parser'));
-	push @res, $fields;
-    }
-    close($parser_fh) or subprocerr(g_('changelog parser %s'), $parser);
-    if (wantarray) {
-	return @res;
-    } else {
-	return $res[0] if (@res);
-	return;
-    }
+    return _changelog_parse(%options);
 }
 
-=item $fields = changelog_parse(%opt)
-
-This function will parse a changelog. In list context, it returns as many
-Dpkg::Control objects as the parser did create. In scalar context, it will
-return only the first one. If the parser did not return any data, it will
-return an empty list in list context or undef on scalar context. If the
-parser failed, it will die.
-
-If $opt{forceplugin} is false and $opt{changelogformat} is "debian", then
-changelog_parse_debian() is called to perform the parsing. Otherwise
-changelog_parse_plugin() is used.
-
-The changelog file that is parsed is F<debian/changelog> by default but it
-can be overridden with $opt{file}.
-
-=cut
-
-sub changelog_parse {
-    my (%options) = @_;
+=back
 
-    $options{forceplugin} //= 0;
-    $options{file} //= 'debian/changelog';
-    $options{changelogformat} //= _changelog_detect_format($options{file});
+=head1 CHANGES
 
-    if (not $options{forceplugin} and
-        $options{changelogformat} eq 'debian') {
-        return changelog_parse_debian(%options);
-    } else {
-        return changelog_parse_plugin(%options);
-    }
-}
+=head2 Version 1.02 (dpkg 1.18.8)
 
-=back
+Deprecated functions: changelog_parse_debian(), changelog_parse_plugin().
 
-=head1 CHANGES
+Obsolete options: $forceplugin, $libdir.
 
 =head2 Version 1.01 (dpkg 1.18.2)
 

+ 1 - 6
scripts/Makefile.am

@@ -21,10 +21,6 @@ bin_SCRIPTS = \
 	dpkg-source \
 	dpkg-vendor
 
-changelogdir = $(pkglibdir)/parsechangelog
-changelog_SCRIPTS = \
-	changelog/debian
-
 EXTRA_DIST = \
 	dpkg-architecture.pl \
 	dpkg-buildflags.pl \
@@ -43,14 +39,13 @@ EXTRA_DIST = \
 	dpkg-shlibdeps.pl \
 	dpkg-source.pl \
 	dpkg-vendor.pl \
-	changelog/debian.pl \
 	$(test_scripts) \
 	$(test_data)
 
 
 CLEANFILES = \
 	$(test_data_objects) \
-	$(bin_SCRIPTS) $(changelog_SCRIPTS)
+	$(bin_SCRIPTS)
 
 perllibdir = $(PERL_LIBDIR)
 nobase_dist_perllib_DATA = \

+ 0 - 1
scripts/changelog/.gitignore

@@ -1 +0,0 @@
-debian

+ 0 - 138
scripts/changelog/debian.pl

@@ -1,138 +0,0 @@
-#!/usr/bin/perl
-#
-# parsechangelog/debian
-#
-# Copyright © 1996 Ian Jackson
-# Copyright © 2005,2007 Frank Lichtenheld
-# Copyright © 2006-2014 Guillem Jover <guillem@debian.org>
-#
-# This program is free software; you can redistribute it and/or modify
-# it under the terms of the GNU General Public License as published by
-# the Free Software Foundation; either version 2 of the License, or
-# (at your option) any later version.
-#
-# This program is distributed in the hope that it will be useful,
-# but WITHOUT ANY WARRANTY; without even the implied warranty of
-# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
-# GNU General Public License for more details.
-#
-# You should have received a copy of the GNU General Public License
-# along with this program.  If not, see <https://www.gnu.org/licenses/>.
-
-use strict;
-use warnings;
-
-use Getopt::Long qw(:config posix_default bundling no_ignorecase);
-
-use Dpkg ();
-use Dpkg::Util qw(none);
-use Dpkg::Gettext;
-use Dpkg::ErrorHandling;
-use Dpkg::Changelog::Debian;
-
-textdomain('dpkg-dev');
-
-$Dpkg::PROGNAME = "parsechangelog/$Dpkg::PROGNAME";
-
-sub version {
-    printf g_("Debian %s version %s.\n"), $Dpkg::PROGNAME, $Dpkg::PROGVERSION;
-
-    printf g_('
-This is free software; see the GNU General Public License version 2 or
-later for copying conditions. There is NO warranty.
-');
-}
-
-sub usage {
-    printf g_(
-'Usage: %s [<option>...] [<changelog-file>]')
-    . "\n\n" . g_(
-"Options:
-      --file <file>       changelog <file> to parse (defaults to '-').
-  -l, --label <file>      changelog <file> name to use in error messages.
-      --format <output-format>
-                          set the output format (defaults to 'dpkg').
-      --all               include all changes.
-  -s, --since <version>   include all changes later than <version>.
-  -v <version>            ditto.
-  -u, --until <version>   include all changes earlier than <version>.
-  -f, --from <version>    include all changes equal or later than <version>.
-  -t, --to <version>      include all changes up to or equal than <version>.
-  -c, --count <number>    include <number> entries from the top (or tail if
-                            <number> is lower than 0).
-  -n <number>             ditto.
-  -o, --offset <number>   change starting point for --count, counted from
-                            the top (or tail if <number> is lower than 0).
-  -?, --help              print usage information.
-  -V, --version           print version information.
-"), $Dpkg::PROGNAME;
-}
-
-my ( $since, $until, $from, $to, $all, $count, $offset, $file, $label );
-my $default_file = '-';
-my $format = 'dpkg';
-my %allowed_formats = (
-    dpkg => 1,
-    rfc822 => 1,
-    );
-
-sub set_format {
-    my ($opt, $val) = @_;
-
-    unless ($allowed_formats{$val}) {
-	usageerr(g_('output format %s not supported'), $val );
-    }
-
-    $format = $val;
-}
-
-my @options_spec = (
-    'file=s' => \$file,
-    'label|l=s' => \$label,
-    'since|v=s' => \$since,
-    'until|u=s' => \$until,
-    'from|f=s' => \$from,
-    'to|t=s' => \$to,
-    'count|c|n=i' => \$count,
-    'offset|o=i' => \$offset,
-    'help|?' => sub{ usage(); exit(0) },
-    'version|V' => sub{version();exit(0)},
-    'format=s' => \&set_format,
-    'all|a' => \$all,
-);
-
-{
-    local $SIG{__WARN__} = sub { usageerr($_[0]) };
-    GetOptions(@options_spec);
-}
-
-usageerr('too many arguments') if @ARGV > 1;
-
-if (@ARGV) {
-    if ($file && ($file ne $ARGV[0])) {
-	usageerr(g_('more than one file specified (%s and %s)'),
-		 $file, $ARGV[0] );
-    }
-    $file = $ARGV[0];
-}
-
-$file //= $default_file;
-$label //= $file;
-
-my %all = $all ? ( all => $all ) : ();
-my $range = {
-    since => $since, until => $until, from => $from, to => $to,
-    count => $count, offset => $offset,
-    %all
-};
-if (none { defined $range->{$_} } qw(since until from to offset count all)) {
-    $range->{count} = 1;
-}
-
-my $changes = Dpkg::Changelog::Debian->new(reportfile => $label, range => $range);
-
-$changes->load($file)
-    or error(g_('fatal error occurred while parsing %s'), $file);
-
-my $entries = $changes->format_range($format, $range);
-print $entries if defined $entries;

+ 1 - 4
scripts/dpkg-parsechangelog.pl

@@ -49,7 +49,6 @@ sub usage {
 'Options:
   -l <changelog-file>      get per-version info from this file.
   -F <changelog-format>    force changelog format.
-  -L <libdir>              look for changelog parsers in <libdir>.
   -S, --show-field <field> show the values for <field>.
   -?, --help               show this help message.
       --version            show the version.')
@@ -81,9 +80,7 @@ while (@ARGV) {
     if ($arg eq '--') {
         last;
     } elsif ($arg eq '-L') {
-        $options{libdir} = shift;
-        usageerr(g_('missing library directory'))
-            unless length $options{libdir};
+        warning(g_('-L is obsolete; it is without effect'));
     } elsif ($arg eq '-F') {
         $options{changelogformat} = shift;
         usageerr(g_('bad changelog format name'))

+ 0 - 1
scripts/po/POTFILES.in

@@ -16,7 +16,6 @@ scripts/dpkg-scansources.pl
 scripts/dpkg-shlibdeps.pl
 scripts/dpkg-source.pl
 scripts/dpkg-vendor.pl
-scripts/changelog/debian.pl
 scripts/Dpkg/Arch.pm
 scripts/Dpkg/BuildFlags.pm
 scripts/Dpkg/BuildOptions.pm