Murphy Mac - Screencasts and Tutorials » Page 'Find Duplicate Files with Tidy Up'

Find Duplicate Files with Tidy Up


Murphy has built up a ton of duplicate files over the years.  All kinds of stuff, some files with the same file name, some with different file names.  Many of the files are half-baked backups thrown onto external drives in haste.  Some are just the result of poor housekeeping.

I’ve used various tools to get the mess under control.  Later we’ll look at a tool called CD Finder that, despite its name, can be very helpful in cataloging an unruly collection of disks and drives.  We’ll look at the diff command too, which is already on your Mac.  But first let’s take a look at Tidy Up, an extremely helpful tool for finding duplicate files and deleting them.

Tidy Up can look beyond the filename to determine whether files are duplicates.  In the screencast Murphy uses Tidy Up to compare files by content and size.  The application can evaluate files against many other sets of criteria.
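
To make "content and size" concrete, here is a minimal Python sketch of that kind of check. It is not how Tidy Up works internally (the application's method isn't documented here); the function name `same_content` is made up for illustration, and it relies only on Python's standard `os` and `filecmp` modules.

```python
import filecmp
import os


def same_content(path_a, path_b):
    """Return True if two files have the same size and the same bytes."""
    # Cheap check first: files of different sizes cannot be identical.
    if os.path.getsize(path_a) != os.path.getsize(path_b):
        return False
    # shallow=False forces a real byte-by-byte comparison instead of
    # trusting the files' stat information (size, modification time).
    return filecmp.cmp(path_a, path_b, shallow=False)
```

Note that the filenames never enter into it: two files with completely different names still count as duplicates if their bytes match.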

Tidy Up can also dig into iPhoto and iTunes databases in search of duplicates.  Mail mailboxes too.  Information about deleted files is then synced back to the applications.  We’ll look at these features in another screencast.

One feature Murphy really likes:  The ability to keep a single copy from a duplicate grouping.  Tidy Up groups identical files together in its search results.  The application will display all but a single file from each group, allowing you to delete all the extras at once.
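
The "keep one, delete the rest" idea is easy to sketch in Python. This is only an illustration, not Tidy Up's actual logic: the function `extras_to_delete` and its `keep` rule are hypothetical, and which copy a real tool keeps from each group is an assumption here (by default the sketch keeps the path that sorts first).

```python
def extras_to_delete(groups, keep=min):
    """For each group of identical files, return every path except one keeper.

    groups: a list of lists of file paths, each inner list holding files
            already known to be identical.
    keep:   a function that picks the one path to keep from a group
            (hypothetical rule; min keeps the path that sorts first).
    """
    extras = []
    for group in groups:
        if len(group) < 2:
            continue  # only one copy, so nothing to delete
        keeper = keep(group)
        extras.extend(path for path in group if path != keeper)
    return extras


groups = [["/a/x.jpg", "/b/x copy.jpg"], ["/a/y.txt"]]
print(extras_to_delete(groups))  # ['/b/x copy.jpg']
```

The sketch only builds a list; nothing is deleted, which fits the advice to check your results before removing anything.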

Tidy Up can also restore content you’ve deleted to its original location, as long as you haven’t emptied the Trash.

You can use Tidy Up to scan multiple drives at once or just a folder that you suspect has duplicates.  It’s probably best to experiment a little before deleting anything – to ensure you’re getting the results you expect.


3 comments to “Find Duplicate Files with Tidy Up”

  1. It took me less than 30 minutes to write a Perl script to find all of the duplicates in a directory, with the option of searching all subdirectories. My script gives you the option of listing the duplicates or deleting all but the one with the shortest name. It runs lightning fast.

    No, it doesn’t have a GUI. That would have been a big waste of time.

    I can upload it if you want. Or if you want to write it yourself, the algorithm is simple:
    1. Fetch each filename from the command line (expanding directories if requested).
    2. Loop through the filenames and drop anything that is not a regular file.
    While you’re at it, fetch the size of each file in bytes and put it next to the filename in the array, so it’s a two-dimensional array. This saves disk hits, because testing the file type also brings in the file size.
    3. Sort the array by file size.
    4. Compare each file (i) with each file after it (j) in the array.
    If two files have different sizes, they cannot be duplicates,
    and because the array is sorted, neither can any of the files after j.
    If two files match, remember their names, and clear out the jth element, so you don’t compare it again. Remember to skip the cleared (empty) entries on later passes.

    Very little comparing, very fast. All for free.
    There are a few more tricks that will make it even faster, but they’re not that big a gain.
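
For anyone who wants to try the steps above, here is a rough sketch in Python. It is not the commenter's Perl script, which wasn't posted; the function names `collect_files` and `find_duplicates` are invented for illustration, and a `used` flag list stands in for "clearing out the jth element". It only reports groups of duplicates and leaves the deleting out.

```python
import filecmp
import os
import stat


def collect_files(paths, recurse=False):
    """Steps 1-2: return (size, path) pairs for regular files only."""
    candidates = []
    for path in paths:
        if recurse and os.path.isdir(path):
            # Expand directories into the files they contain.
            for root, _dirs, names in os.walk(path):
                candidates.extend(os.path.join(root, n) for n in names)
        else:
            candidates.append(path)

    entries = []
    for path in candidates:
        try:
            info = os.lstat(path)  # one call gives both file type and size
        except OSError:
            continue  # unreadable or vanished file: skip it
        if stat.S_ISREG(info.st_mode):
            entries.append((info.st_size, path))
    return entries


def find_duplicates(entries):
    """Steps 3-4: sort by size, then compare only files of equal size."""
    entries = sorted(entries)  # sorts by size first, then by path
    used = [False] * len(entries)  # marks entries already matched
    groups = []
    for i, (size_i, path_i) in enumerate(entries):
        if used[i]:
            continue
        group = [path_i]
        for j in range(i + 1, len(entries)):
            size_j, path_j = entries[j]
            if size_j != size_i:
                break  # sorted, so no later file can match either
            if used[j]:
                continue
            if filecmp.cmp(path_i, path_j, shallow=False):
                group.append(path_j)
                used[j] = True  # don't compare this one again
        if len(group) > 1:
            groups.append(group)
    return groups
```

Calling `find_duplicates(collect_files(["Pictures"], recurse=True))` (with a folder name of your own) returns a list of groups, each one a list of paths whose contents are identical.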

  2. Fantastic points altogether, you just won a brand new reader.
    What would you recommend regarding the post you made a few days ago?
    Any sure?

  3. WOW just what I was looking for. Came here by searching for tractari
