Copy a folder overwriting ONLY smaller files in destination

28/11/2023 Categories: System

I have tons of PDFs in multiple sub-folders in /home/user/original that I have compressed with Ghostscript's pdfwrite device into /home/user/compressed.

Ghostscript did a great job on about 90% of the files, but the rest ended up bigger than the originals.

I would like to cp /home/user/compressed to /home/user/original, overwriting only those files that are smaller than the ones in the destination and skipping the bigger ones.

Any ideas?

  1. Mandrake
    07/12/2021 at 16:11 | #1

    Perl’s -s operator to the rescue!

    Create an executable Perl script overwrite-smaller:

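    #!/usr/bin/perl
    use warnings;
    use strict;
    use File::Copy;

    # Path of one original file, passed in by find.
    my $file = shift;
    # Derive the compressed counterpart by swapping the directory name.
    (my $compressed = $file) =~ s/original/compressed/;
    # Overwrite only when the compressed copy is strictly smaller.
    copy($compressed, $file) if -s $compressed < -s $file;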

    And run it for each file in the original directory (the script must be on your PATH; otherwise give its full path):

    find /home/user/original -type f -exec overwrite-smaller {} \;

    Or, since you are already in Perl, do the subtree walk there as well:

    #!/usr/bin/perl
    use warnings;
    use strict;

    use File::Copy;
    use File::Find;

    find({no_chdir => 1,
          wanted   => sub {
              my $file = $File::Find::name;
              -f $file or return;    # skip directories
              (my $compressed = $file) =~ s/original/compressed/;
              copy($compressed, $file) if -s $compressed < -s $file;
          }},
         'original');    # relative path: run this from /home/user
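
The size check used by the Perl scripts above can also be sketched in plain shell. This is only a minimal sketch and not part of the original answer: the function names overwrite_if_smaller and overwrite_tree are hypothetical, sizes are read with wc -c, directory arguments are assumed to have no trailing slash, and file names are assumed to contain no newlines.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC ($1) over DST ($2) only when SRC is strictly smaller.
overwrite_if_smaller() {
    # Skip silently when either side is missing (e.g. no compressed counterpart).
    [ -f "$1" ] && [ -f "$2" ] || return 0
    # $(( ... )) normalises any padding that wc may print around the number.
    src_size=$(( $(wc -c <"$1") ))
    dst_size=$(( $(wc -c <"$2") ))
    if [ "$src_size" -lt "$dst_size" ]; then
        cp -- "$1" "$2"
    fi
}

# Hypothetical walker: for every file under ORIG_DIR ($2), try the file with the
# same relative path under COMP_DIR ($1). Assumes no newlines in file names.
overwrite_tree() {
    comp_dir=$1
    orig_dir=$2
    find "$orig_dir" -type f | while IFS= read -r dst; do
        rel=${dst#"$orig_dir"/}
        overwrite_if_smaller "$comp_dir/$rel" "$dst"
    done
}

# Example call (paths taken from the question):
#   overwrite_tree /home/user/compressed /home/user/original
```

Files of equal size are left alone, and originals without a compressed counterpart are skipped rather than reported.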

  2. Mandrake
    07/12/2021 at 16:09 | #2

    The following find command should work for this:

    cd /home/user/original
    find . -type f -exec bash -c 'file="$1"; rsync --max-size=$(stat -c %s "$file") "/home/user/compressed/$file" "/home/user/original/$file"' _ {} \;

    The key part of this solution is the --max-size option provided by rsync. From the rsync manual:

    --max-size=SIZE

    This tells rsync to avoid transferring any file that is larger than the specified SIZE.

    So the find command operates on the destination directory (/home/user/original) and returns a list of files. For each file, it spawns a bash shell that runs the rsync command. The SIZE parameter for the --max-size option is set by running a stat command against the destination file.

    In effect, the rsync processing logic becomes this:

    If the source file is larger than the destination file, the --max-size parameter will prevent the source file from being transferred.
    If the source file is smaller than the destination file, the transfer will proceed as expected.
    As a result, only the smaller files are transferred from the source directory to the destination directory.

    I have tested this in a few different ways, and it works for me as expected. However, you may want to create a backup of the destination directory before you try it out on your system.
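
Before overwriting anything, it may help to see which files would actually be replaced. The following is a hedged sketch of such a dry run in plain shell, not something from the answer above: the function name list_smaller is hypothetical, sizes are read with wc -c rather than stat, directory arguments are assumed to have no trailing slash, and file names are assumed to contain no newlines.

```shell
#!/bin/sh
# Dry run: print the relative path of every file under COMP_DIR ($1) that is
# strictly smaller than its counterpart under ORIG_DIR ($2). Nothing is copied.
list_smaller() {
    comp_dir=$1
    orig_dir=$2
    find "$orig_dir" -type f | while IFS= read -r dst; do
        rel=${dst#"$orig_dir"/}
        src=$comp_dir/$rel
        # Originals without a compressed counterpart are ignored.
        [ -f "$src" ] || continue
        # $(( ... )) normalises any padding that wc may print around the number.
        if [ "$(( $(wc -c <"$src") ))" -lt "$(( $(wc -c <"$dst") ))" ]; then
            printf '%s\n' "$rel"
        fi
    done
}

# Example call (paths taken from the question):
#   list_smaller /home/user/compressed /home/user/original
```

The output is one relative path per line, so it can be reviewed or counted before running the real copy.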
