Source: reproduced from serverfault.com.

Why does PodSpelling complain about an ignored word if it's part of a file name?

Posted on 2020-11-27 21:38:15

PodSpelling (run via perlcritic) complains about "html" even though I've added it to the stopwords:

=for stopwords html

=head2 some function

Generates an index.html page.

=cut

This gets:

robert@saaz:~$ perlcritic --brutal  test.pl | grep html
Check the spelling in your POD: html at line 1, column 1.  See page 148 of PBP.  (Severity: 1)

(I pipe the output through grep to ignore the other complaints in this minimal example that are unrelated to spelling.)

If I change the POD to use either C<index.html> or F<index.html>, reword it as "a .html file", or add "index.html" as a stopword, the spell checker is happy. The first makes sense to me (don't check code snippets or file names), but I don't see what the big difference between "index.html" and ".html" is.

Does it work with =for stopwords index.html because it ignores the dot when it's next to white space? I'm just guessing here, though.

Why does my stopword html not work for "index.html" and how can I fix this if I don't want to use C<...> or F<...>?

Asked by Robert
Answer by clamp, 2020-11-28 16:12:44

The authors of Pod::Wordlist implemented it that way on purpose. They wanted abbreviations such as "e.g." to reach the spellchecker as a whole, so a word that contains a period is not split into parts when stopwords are stripped. Because the splitting and the stripping of stopwords happen before the word list is passed to the spellchecker, you can't have both behaviours at once. On the other hand, the default spellchecker is aspell, and as far as I can tell abbreviations pass its check whether or not they are split on periods.
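To make the difference concrete, here is a toy model of the two strategies. It is only a sketch: %stop, strip_whole and strip_split are made-up names for illustration, not part of Pod::Wordlist, and the real module does more than this. It assumes, as described above, that a token with an internal period is only looked up as a whole. Presumably the spellchecker then tokenises "index.html" on its own, which would explain why the complaint names "html".

```perl
use strict;
use warnings;

# Toy stopword list (hypothetical); the real list lives inside Pod::Wordlist.
my %stop = map { $_ => 1 } qw(html);

# Simplified model of the behaviour described above: a token that
# contains a period is looked up only as a whole, never piece by piece.
sub strip_whole {
    my ($word) = @_;
    return $stop{$word} ? '' : $word;
}

# Simplified model of the alternative: split on periods first, then
# drop every piece that is a stopword.
sub strip_split {
    my ($word) = @_;
    return join ' ', grep { !$stop{$_} } split /\./, $word;
}

print strip_whole('index.html'), "\n";   # prints "index.html": still sent to the spellchecker
print strip_split('index.html'), "\n";   # prints "index": "html" was dropped
print strip_whole('html'), "\n";         # prints an empty line: the bare stopword is dropped
```

In this model the stopword html only matches when the token is exactly "html", which mirrors what the question observes.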

So you can change Pod/Wordlist.pm like this:

# Pod/Wordlist.pm, line 129
sub _strip_a_word {
    my ($self, $word) = @_;
    my $remainder;

    # try word as-is, including possible hyphenation vs stoplist
    if ( $self->is_stopword($word) ) {
        $remainder = '';
    }
    # check individual parts of a hyphenated or period-separated word and
    # keep whatever isn't a stopword as individual words;
    # aspell accepts split abbreviations as well ('e g' passes like 'e.g')
    elsif ( $word =~ /-|\./ ) {
        my @keep;
        for my $part ( split /-|\./, $word ) {
            push @keep, $part if !$self->is_stopword($part);
        }
        $remainder = join( " ", @keep ) if @keep;
    }
    # otherwise, we just keep it
    else {
        $remainder = $word;
    }
    return $remainder;
}
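If editing an installed module is not an option, the question itself notes a workaround that needs no patch: declare the whole token as a stopword. A minimal sketch, reusing the POD from the question with only the stopword line changed; every other file name mentioned in the POD would presumably need its own entry:

```perl
=for stopwords html index.html

=head2 some function

Generates an index.html page.

=cut
```

This keeps the fix local to the file being checked, at the cost of listing each dotted name explicitly.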