Preposition with next word

2023-01-03 perl Text::Autoformat Text::Trim

My other web (in czech) bobroviny.cz is written in php/html. Before publishing an article I try to make sure the prepositions are properly aligned on the same line as the next word as czech typography rules suggest. To automate this, I created simple script below.

For each specified file just load it into memory, perform few regex substitutions, format it with Text::Autoformat and Text::Trim and write it back. The substitution just replace whitespace between preposition and the adjacent word with non-breakable space  . Same substitution is done for number preceding a word.

use 5.16.3;
use utf8;
use Text::Autoformat;
use Text::Trim;
use Path::Class qw(file);

my @preposition = qw{
    a na v do po s z k i o ve za pro od ze ke u
};
my $preposition_re = join "|", @preposition;

for my $filename (@ARGV) {
    my $file = file($filename);
    my $data = $file->slurp(iomode => '<:encoding(utf-8)');

    $data =~ s/\b($preposition_re)\s+(\w)/$1&nbsp;$2/gsi;   # preposition space(s) word
    $data =~ s/(\d)\s+(\w)/$1&nbsp;$2/gs;                   # number space(s) word
    $data =~ s{(<p\b.*?>)(.*?)(<\/p>)}{
        $1 . "\n" .
        trim(autoformat($2,{
            left   => 0,
            right  => 78,
            ignore => qr/<.*>/,
        }))
        . "\n" . $3
    }gise;

    $file->spew(iomode => '>:utf8', $data);
}