This essentially means you create a new filename pointing to the same data on disk, without taking up extra space.
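The hard-link idea can be demonstrated in a couple of lines of shell (file names here are made up for illustration):

```shell
# Create a file, then give it a second name via a hard link.
echo "photo bytes" > original.jpg
ln original.jpg duplicate.jpg   # second directory entry, same data on disk

# Both names point at the same inode, so no extra space is used:
ls -li original.jpg duplicate.jpg
```

Deleting one of the names leaves the data intact as long as the other name still refers to it.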
I also made the program create a log file containing the location of each file and its MD5 hash so I could find out how many files were processed and how many duplicates were found.
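The log lines would look something like the following. This is only a sketch of the idea, not the script's actual output format; it uses `md5sum` (on macOS the equivalent command is `md5 -q`):

```shell
# Record each processed file's MD5 hash and path in a log file.
echo "some image data" > photo1.jpg
md5sum photo1.jpg >> dedupe.log   # appends "<hash>  photo1.jpg"

# Count how many files have been processed so far:
wc -l < dedupe.log
```

Keeping the hash next to the path makes it easy to count duplicates later, e.g. by sorting the log on the hash column.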
Although I have made my best attempt at making the script work (and it does for me!), I take no responsibility for any data loss you may experience.
A problem with the MD5 hash approach would be if duplicate copies of images had different image metadata. This could be the case if some copies had had their dates adjusted, or if iPhoto changed the EXIF information when you star photos or tag faces in them.
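The effect is easy to simulate: two files whose image data is identical but whose metadata differs in even one byte will produce different MD5 hashes, so a byte-level hash comparison will not treat them as duplicates. (The "EXIF edit" below is faked by appending a tag to a copy of the file.)

```shell
# Two copies of the "same" photo, one with altered metadata.
printf 'JPEGDATA' > a.jpg
cp a.jpg b.jpg
printf 'Date:2014' >> b.jpg   # stand-in for an edited EXIF date field

# The hashes no longer match, so the files won't be seen as duplicates:
md5sum a.jpg b.jpg
```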
We had two further libraries containing about 13,000 and 10,000 images each, and a fourth iPhoto library on a separate machine, plus the aforementioned random grab-bag of directories full of images. (I didn't count how many these were, but they numbered in the thousands.) Consolidating and de-duping the libraries would be worth it. One worry I had was how to make sure I got hold of all the originals from iPhoto. After consolidation we ended up with 33,000 images and movies in one iPhoto library. We are really happy with the result :-)

Update 2014/05/08: A few days ago I received a question about this post from Jeff Ruth. Because it, and its answer, may be useful to more people, I asked if I could reproduce it here:

"I am not a programmer, but have used Terminal now and then and am not afraid to experiment if necessary. If you have a minute, could you please tell me how I can use your code, or do this a different way, without the program?"

The script was intended for developers, but if you want to try it you have to copy the file (dedupe-media.sh) to your local disk, then make it executable. In Terminal, you do that with 'chmod +x dedupe-media.sh'. Please note this script comes with NO WARRANTY OF ANY KIND. You must take adequate backup of your libraries before running the above.
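The "make it executable" step looks like this in Terminal (a placeholder file stands in for the downloaded script here; how you invoke the script afterwards depends on its own documentation):

```shell
# Stand-in for the downloaded dedupe-media.sh script:
touch dedupe-media.sh

# Grant execute permission so the shell will run it:
chmod +x dedupe-media.sh

# The x bit now shows up in the permissions column:
ls -l dedupe-media.sh
```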