Shell script to re-format poor date format in filenames

New, more-good version here: New script to re-write dates in file names

After testing and some use, this is going to change: it will just write the mv statements to a file and accept an argument to run them instead. No logging, since that file will serve the same purpose. This post does not contain the final version. It’s here for the sake of history.

And boom. Or so I hope. This *should* handle dates written as “01.02.03” and the equivalent “01.02.2003”.
To-do: Check for administrator privileges or just write the log to ~/. macOS (née OS X (née Mac OS)) has a default user log directory, but GNU/Linux does not. I suppose we could check for “~/.log/” and mkdir if false…

Title explains it pretty well. Uses egrep to find the date, sed to edit it, mv to rename, and read for the user confirmation prompt. Which, btw, can’t be done inside a loop that is already reading from stdin unless you feed the new read from somewhere else. Hence the < /dev/tty at that point in the script. So many strangers to thank for sharing their knowledge online. Thanks, strangers! That should do it.
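Here is a minimal sketch of that stdin problem and one way around it. The function name confirm_each and the ANSWER_SRC variable are made up for this example and are not part of the script. The loop eats the list on stdin, so the answers are read from a second file descriptor, which defaults to /dev/tty.

```shell
# confirm_each: hypothetical helper, not part of the original script.
# Reads one item per line on stdin and asks about each one.
# Answers come from file descriptor 3, opened on $ANSWER_SRC
# (an assumption for this sketch; it defaults to the terminal).
confirm_each() {
    exec 3< "${ANSWER_SRC:-/dev/tty}"
    while IFS= read -r item; do
        printf "Rename '%s'? [y,n] " "$item" >&2
        # A plain 'read' here would swallow the next item from stdin.
        IFS= read -r reply <&3 || reply=n
        case "$reply" in
            [yY]) printf 'yes: %s\n' "$item" ;;
            *)    printf 'no: %s\n' "$item" ;;
        esac
    done
    exec 3<&-
}
```

Something like `find . -name '*.txt' | confirm_each` would then prompt at the terminal once per file while the file list keeps flowing through stdin.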

This exists because I had a co-worker who added dates to tons of files in the format “01.02.03”, which resulted in files named “Example filename 01.02.03.xmpl”. Same for phone numbers. Clearly this is wrong and something must be done about it (now that new ones won’t be cropping up since he.is.gone). You may be asking yourself how he got away with never having an e-mailed file bounced back for multiple extensions. My educated guess is that he didn’t, and kept doing it the dumb way despite that.
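For reference, a name like that can be picked out with an extended regex along these lines. This is only a sketch: has_trailing_date is a made-up helper, and grep -E is used as the equivalent of egrep. One thing worth knowing when writing the separator class: inside brackets, a hyphen between two characters makes a range, so it goes first to be taken literally.

```shell
# has_trailing_date: hypothetical helper, not part of the original script.
# Succeeds if the name ends in NN.NN.NN or NN.NN.NNNN (separator may be
# a dot, dash, slash, or comma), optionally followed by an extension.
has_trailing_date() {
    printf '%s\n' "$1" |
        grep -Eq '([0-9]{2}[-.,/]){2}[0-9]{2,4}(\.[[:alnum:]]+)?$'
}
```

Usage would be something like `has_trailing_date "Example filename 01.02.03.xmpl" && echo "needs fixing"`. A date in the middle of a name does not count, which matches the end-of-path approach the script takes.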

Corrects the above format to YYYY-MM-DD (“+%Y-%m-%d” in date-speak).
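The script leans on the BSD/macOS 'date -j -f' for this step. As a rough sketch of the same conversion without 'date' at all, something like the following might do. reformat_date is a made-up name, and the two-digit-year pivot is an assumption borrowed from the POSIX %y rule (69-99 become 19xx, 00-68 become 20xx).

```shell
# reformat_date: hypothetical helper, not part of the original script.
# Takes MM.DD.YY or MM.DD.YYYY (dots, dashes, slashes, or commas)
# and prints YYYY-MM-DD. It does not check that the date is real.
reformat_date() {
    printf '%s\n' "$1" | awk -F'[-.,/]' '
        NF == 3 {
            m = $1; d = $2; y = $3
            # Assumed pivot for two-digit years: 69-99 -> 19xx, 00-68 -> 20xx.
            if (length(y) == 2) y = (y + 0 >= 69 ? "19" : "20") y
            printf "%04d-%02d-%02d\n", y, m, d
        }'
}
```

Unlike 'date', this will happily turn nonsense such as month 13 into a "date", so it trades validation for portability.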

This could probably be done with fewer lines or overall be more betterer, but this is like my first whole shell script. Maybe my years of AppleScript are showing. *meh*

#!/bin/bash
# NOTICE: This is recursive.
# Needs bash ('[[ ... ]]', 'read -p') and the BSD/macOS versions of 'date' and 'tail'.

#### SETTINGS ####

# Behavior:
#    prompt = Prompt to confirm each file change, run if yes and log the new name (if enabled).
#    build  = Build a file in 'pwd' that contains all 'mv' statements to peruse and/or run later.
#    now    = Run each mv statement after it is generated, log new file name (if enabled).
behavior=build

# Name of log file. Leave blank to disable logging.
loglog=mvdate.log

# Directory (must exist) to store the log. Leave blank to use the defaults below.
logdir=

#### END OF SETTINGS ####

# We want to log all files altered in case something shouldn't have been.
# Apparently macOS (nee OS X) is the only Unix with a default user log directory.
# GNU doesn't have one either. Lovely. :/
if [ ! "$logdir" ]; then
    if [ -d ~/Library/Logs ]
    then
        logdir=~/Library/Logs
    else
        logdir=~/.log
        if [ ! -d "$logdir" ]; then mkdir "$logdir"; fi
    fi
fi
if [ "$loglog" ]; then
    mlog=$logdir/$loglog
    echo "NOTICE: This script writes the time it ran and the filenames altered to $mlog"
    date >> "$mlog"
fi

# Here, 'tail -r' (BSD/macOS only) reverses the order of 'find' after 'egrep' filters the result.
# That way files in a directory are renamed before the directory is.
# 'find' might be able to use the regex, but I couldn't work it out. YMMV.
# This only looks for the pattern at the end of the line or just before an extension.
# The hyphen goes first in the bracket expression so it is literal and not a range.
find "$PWD" | egrep '([0-9]{2}[-.,/]){2}[0-9]{2,4}(\.[[:alnum:]]+)?$' | tail -r | while IFS= read -r f
do

# Escape certain characters so they don't wreck the 'mv' statement later.
# double quote, single quote, parens, ampersand, dollar sign, [space]
# The single quote is slipped in by breaking out of the 'sed' statement ('\'').
# Other shell metacharacters (backticks, semicolons, wildcards) are not handled.
    f=$(printf '%s\n' "$f" | sed -E 's/(["'\''()&$[:space:]])/\\\1/g')

# Suck out the last instance of our hated date pattern (00.00.00 or 00.00.0000).
# Also replace dashes, slashes, and errant commas with dots because we've come too far not to.
    d=$(printf '%s\n' "$f" | egrep -o '([0-9]{1,2}[-.,/]){2}[0-9]{2,4}' | tail -1 | sed -E 's|[-/,]|.|g')
    if [ -n "$d" ]; then

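# 'date' has to be told which it is: one format can't take a 2 OR 4 digit year.
        if printf '%s\n' "$d" | egrep -q '[0-9]{4}$'; then
            dform="%m.%d.%Y"
        else
            dform="%m.%d.%y"
        fi
# Skip anything 'date' can't parse (say, a version number that only looks like a date).
        d2="$(date -j -f "$dform" "$d" "+%Y-%m-%d")" || continue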

# This is the 'sed' that finds the date and replaces it with what we made just above.
# It looks for the pattern at the end of the line (path) but includes the extension if there.
# It is possible that a version number of some sort will also match. This is why it prompts.
# Also possible to do away with the EOL in the regex and just go for the pattern.
        f2="$(printf '%s\n' "$f" | sed -E 's/([0-9]{1,2}[.,-]){2}[0-9]{2,4}(\.[[:alnum:]]+)?$/'"$d2"'\2/')"

# And double spaces....fml.
# Which won't work unless we start over and look for them at the tail of the path.
#        f2="$(echo "$f2" | sed 's/  / /g')"
        m="mv -- $f $f2"

        case "$behavior" in
            "prompt")
# We echo the 'mv' statement for approval to be sure it's doing what we want.
# This 'read' here has to come from somewhere other than stdin.
                printf '%s\n' "$m"
                read -r -p "Run 'mv' (as shown)? [y,n] " response < /dev/tty
                if [[ $response =~ ^([yY])$ ]]; then
                    eval "$m"
                    if [ "$loglog" ]; then echo "$f2" >> "$mlog"; fi
                fi
                ;;
            "build")
# Add every 'mv' statement to a file for later evaluation. 
                echo "$m" >> "!items to rename.txt"
                ;;
            "now")
# Run the 'mv' statements as they are created.
                eval "$m"
                if [ "$loglog" ]; then echo "$f2" >> "$mlog"; fi
                ;;
        esac
    fi
done

# Add a line between runs.
if [ "$loglog" ]; then
    printf '\n' >> "$mlog"
fi
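
A different take on the escaping and the 'build' behavior, offered as a sketch and not as what the script does: wrap each path in single quotes when writing the mv line, then run the file with sh once it has been looked over. The names shell_quote, queue_move, and run_moves are made up for this example.

```shell
# Hypothetical helpers, not part of the original script.
# Paths containing newlines are not handled.

# Wrap a string in single quotes, turning each embedded ' into '\''.
shell_quote() {
    printf "'%s'\n" "$(printf '%s\n' "$1" | sed "s/'/'\\\\''/g")"
}

# Append one 'mv' line for old path $1 and new path $2 to the file $3.
queue_move() {
    printf 'mv -- %s %s\n' "$(shell_quote "$1")" "$(shell_quote "$2")" >> "$3"
}

# Run every queued 'mv' line. Read the file first!
run_moves() {
    sh "$1"
}
```

Inside single quotes the shell treats everything literally, so spaces, parens, dollar signs, and the rest need no individual escaping. Only the single quote itself has to be dealt with.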
