fdupes-1.51/

fdupes-1.51/CHANGES

The following list, organized by fdupes version, documents changes
to fdupes. Every item on the list includes, inside square brackets,
a list of identifiers referring to the people who contributed
that particular item. When more than one person is listed, the person
who contributed the patch or idea appears first, followed by
those who've otherwise worked on that item. For a list of
contributors' names and identifiers please see the CONTRIBUTORS file.

Changes from 1.50 to 1.51

- Added support for 64-bit file offsets on 32-bit systems.
- Using tty for interactive input instead of regular stdin. This is to
  allow feeding filenames via stdin in future versions of fdupes without
  breaking the interactive deletion feature.
- Fixed some typos in --help.
- Turned C++ style comments into C style comments.

Changes from 1.40 to 1.50-PR2

- Fixed memory leak. [JB]
- Added "--summarize" option. [AL]
- Added "--recurse:" selective recursion option. [AL]
- Added "--noprompt" option for totally automated deletion of
  duplicate files.
- Now sorts duplicates (old to new) for consistent order when
  listing or deleting duplicate files.
- Now tests for early matching of files, which should help speed up
  the matching process when large files are involved.
- Added warning whenever a file cannot be deleted. [CHL, AL]
- Fixed bug where some files would not be closed after failure. [AL]
- Fixed bug where confirmmatch() function wouldn't always deal
  properly with zero-length files. [AL]
- Fixed bug where progress indicator would not be cleared
  when no files were found. [AL]
- Removed experimental red-black tree code (it was slower on
  my system than the default code). [AL]
- Modified md5/md5.c to avoid compiler warning. [CHL]
- Changes to fdupes.c for compilation under platforms where
  getopt_long is unavailable. [LR, AL]
- Changes to help text for clarity. [AL]
- Various changes and improvements to Makefile. [PB, AL]

Changes from 1.31 to 1.40

- Added option to omit the first file in each group
  of matches. [LM, AL]
- Added escaping of filenames containing spaces when
  sameline option is specified. [AL]
- Changed version indicator format from "fdupes version X.Y"
  to the simpler "fdupes X.Y". [AL]
- Changed ordering of options appearing in the help
  text (--help), manpage, and README file. [AL]

Changes from 1.30 to 1.31

- Added interactive option to preserve all files during
  delete procedure (something similar was already in place,
  but now it's official). [AL]
- Updated delete procedure prompt format. [AL]
- Cosmetic code changes. [AL]

Changes from 1.20 to 1.30

- Added size option to display size of duplicates. [LB, AL]
- Added missing typecast for proper compilation under g++. [LB]
- Better handling of errors occurring during retrieval
  of a file's signature. [KK, AL]
- No longer displays an error message when specified
  directories contain no files. [AL]
- Added red-black tree structure (experimental compile-time
  option, disabled by default). [AL]

Changes from 1.12 to 1.20

- Fixed bug where program would crash when files being
  scanned were named pipes or sockets. [FD]
- Fix against security risk resulting from the use of a
  temporary file to store md5sum output. [FD, AL]
- Using an external md5sum program is now optional. Started
  using L. Peter Deutsch's MD5 library instead. [FD, AL]
- Added hardlinks option to distinguish between hard links
  and actual duplicate files. [FD, AL]
- Added noempty option to exclude zero-length files
  from consideration. [AL]

Changes from 1.11 to 1.12

- Improved handling of extremely long input on preserve
  prompt (delete option). [SSD, AL]

Changes from 1.1 to 1.11

- Started checking file sizes before signatures
  for better performance. [AB, AL]
- Added fdupes manpage. [AB, AL]

Changes from 1.0 to 1.1

- Added delete option for semi-automatic deletion
  of duplicate files. [AL]

fdupes-1.51/CONTRIBUTORS

The following people have contributed in some way to the development
of fdupes. Please see the CHANGES file for detailed information
on their contributions. Names are listed in alphabetical order.

[AB]  Adrian Bridgett (adrian.bridgett@iname.com)
[AL]  Adrian Lopez (adrian2@caribe.net)
[CHL] Charles Longeau (chl@tuxfamily.org)
[FD]  Frank DENIS, a.k.a. Jedi/Sector One, a.k.a. DJ Chrysalis (j@4u.net)
[JB]  Jean-Baptiste ()
[KK]  Kresimir Kukulj (madmax@pc-hrvoje.srce.hr)
[LB]  Laurent Bonnaud (Laurent.Bonnaud@iut2.upmf-grenoble.fr)
[LM]  Luca Montecchiani (m.luca@iname.com)
[LR]  Lukas Ruf (lukas@lpr.ch)
[PB]  Peter Bray (Sydney, Australia)
[SSD] Steven S. Dick (ssd@nevets.oau.org)

fdupes-1.51/Makefile.inc/

fdupes-1.51/Makefile.inc/VERSION

#
# VERSION determines the program's version number.
#

VERSION = 1.51

fdupes-1.51/testdir/

fdupes-1.51/testdir/recursed_a/

fdupes-1.51/testdir/recursed_a/two
two

fdupes-1.51/testdir/recursed_a/five
five

fdupes-1.51/testdir/recursed_a/one
one

fdupes-1.51/testdir/recursed_b/

fdupes-1.51/testdir/recursed_b/four
four

fdupes-1.51/testdir/recursed_b/three
three

fdupes-1.51/testdir/recursed_b/two_plus_one
three

fdupes-1.51/testdir/recursed_b/one
one

fdupes-1.51/testdir/symlink_dir -> recursed_a

fdupes-1.51/testdir/with spaces a
with spaces

fdupes-1.51/testdir/two
two

fdupes-1.51/testdir/symlink_two -> two

fdupes-1.51/testdir/with spaces b
with spaces

fdupes-1.51/testdir/twice_one
two

fdupes-1.51/testdir/zero_b

fdupes-1.51/testdir/zero_a

fdupes-1.51/testdir/nine_upsidedown
six

fdupes-1.51/fdupes.c

/* FDUPES Copyright (c) 1999-2002 Adrian Lopez

   Permission is hereby granted, free of charge, to any person
   obtaining a copy of this software and associated documentation files
   (the "Software"), to deal in the Software without restriction,
   including without limitation the rights to use, copy, modify, merge,
   publish, distribute, sublicense, and/or sell copies of the Software,
   and to permit persons to whom the Software is furnished to do so,
   subject to the following conditions:

   The above copyright notice and this permission notice shall be
   included in all copies or substantial portions of the Software.

   THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
   OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
   MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
   IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
   CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
   TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
   SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */

#include <stdio.h>
#include <stdarg.h>
#include <string.h>
#include <sys/stat.h>
#include <dirent.h>
#include <unistd.h>
#include <stdlib.h>
#ifndef OMIT_GETOPT_LONG
#include <getopt.h>
#endif
#include <string.h>
#include <errno.h>

#ifndef EXTERNAL_MD5
#include "md5/md5.h"
#endif

#define ISFLAG(a,b) ((a & b) == b)
#define SETFLAG(a,b) (a |= b)

#define F_RECURSE           0x0001
#define F_HIDEPROGRESS      0x0002
#define F_DSAMELINE         0x0004
#define F_FOLLOWLINKS       0x0008
#define F_DELETEFILES       0x0010
#define F_EXCLUDEEMPTY      0x0020
#define F_CONSIDERHARDLINKS 0x0040
#define F_SHOWSIZE          0x0080
#define F_OMITFIRST         0x0100
#define F_RECURSEAFTER      0x0200
#define F_NOPROMPT          0x0400
#define F_SUMMARIZEMATCHES  0x0800

char *program_name;

unsigned long flags = 0;

#define CHUNK_SIZE 8192

#define INPUT_SIZE 256

#define PARTIAL_MD5_SIZE 4096

/*

TODO: Partial sums (for working with very large files).

typedef struct _signature
{
  md5_state_t state;
  md5_byte_t digest[16];
} signature_t;

typedef struct _signatures
{
  int num_signatures;
  signature_t *signatures;
} signatures_t;

*/

typedef struct _file {
  char *d_name;
  off_t size;
  char *crcpartial;
  char *crcsignature;
  dev_t device;
  ino_t inode;
  time_t mtime;
  int hasdupes; /* true only if file is first on duplicate chain */
  struct _file *duplicates;
  struct _file *next;
} file_t;

typedef struct _filetree {
  file_t *file;
  struct _filetree *left;
  struct _filetree *right;
} filetree_t;

void errormsg(char *message, ...)
{
  va_list ap;

  va_start(ap, message);

  fprintf(stderr, "\r%40s\r%s: ", "", program_name);
  vfprintf(stderr, message, ap);

  va_end(ap);
}

void escapefilename(char *escape_list, char **filename_ptr)
{
  int x;
  int tx;
  char *tmp;
  char *filename;

  filename = *filename_ptr;

  /* worst case: every character needs a preceding backslash */
  tmp = (char*) malloc(strlen(filename) * 2 + 1);
  if (tmp == NULL) {
    errormsg("out of memory!\n");
    exit(1);
  }

  for (x = 0, tx = 0; x < strlen(filename); x++) {
    if (strchr(escape_list, filename[x]) != NULL) tmp[tx++] = '\\';
    tmp[tx++] = filename[x];
  }

  tmp[tx] = '\0';

  if (x != tx) {
    *filename_ptr = realloc(*filename_ptr, strlen(tmp) + 1);
    if (*filename_ptr == NULL) {
      errormsg("out of memory!\n");
      exit(1);
    }
    strcpy(*filename_ptr, tmp);
  }

  free(tmp);
}

off_t filesize(char *filename) {
  struct stat s;

  if (stat(filename, &s) != 0) return -1;

  return s.st_size;
}

dev_t getdevice(char *filename) {
  struct stat s;

  if (stat(filename, &s) != 0) return 0;

  return s.st_dev;
}

ino_t getinode(char *filename) {
  struct stat s;

  if (stat(filename, &s) != 0) return 0;

  return s.st_ino;
}

time_t getmtime(char *filename) {
  struct stat s;

  if (stat(filename, &s) != 0) return 0;

  return s.st_mtime;
}

char **cloneargs(int argc, char **argv)
{
  int x;
  char **args;

  args = (char **) malloc(sizeof(char*) * argc);
  if (args == NULL) {
    errormsg("out of memory!\n");
    exit(1);
  }

  for (x = 0; x < argc; x++) {
    args[x] = (char*) malloc(strlen(argv[x]) + 1);
    if (args[x] == NULL) {
      free(args);
      errormsg("out of memory!\n");
      exit(1);
    }

    strcpy(args[x], argv[x]);
  }

  return args;
}

int findarg(char *arg, int start, int argc, char **argv)
{
  int x;

  for (x = start; x < argc; x++)
    if (strcmp(argv[x], arg) == 0)
      return x;

  return x;
}

/* Find the first non-option argument after specified option. */
int nonoptafter(char *option, int argc, char **oldargv,
                char **newargv, int optind)
{
  int x;
  int targetind;
  int testind;
  int startat = 1;

  targetind = findarg(option, 1, argc, oldargv);

  for (x = optind; x < argc; x++) {
    testind = findarg(newargv[x], startat, argc, oldargv);
    if (testind > targetind) return x;
    else startat = testind;
  }

  return x;
}

int grokdir(char *dir, file_t **filelistp)
{
  DIR *cd;
  file_t *newfile;
  struct dirent *dirinfo;
  int lastchar;
  int filecount = 0;
  struct stat info;
  struct stat linfo;
  static int progress = 0;
  static char indicator[] = "-\\|/";

  cd = opendir(dir);

  if (!cd) {
    errormsg("could not chdir to %s\n", dir);
    return 0;
  }

  while ((dirinfo = readdir(cd)) != NULL) {
    if (strcmp(dirinfo->d_name, ".") && strcmp(dirinfo->d_name, "..")) {
      if (!ISFLAG(flags, F_HIDEPROGRESS)) {
        fprintf(stderr, "\rBuilding file list %c ", indicator[progress]);
        progress = (progress + 1) % 4;
      }

      newfile = (file_t*) malloc(sizeof(file_t));

      if (!newfile) {
        errormsg("out of memory!\n");
        closedir(cd);
        exit(1);
      } else newfile->next = *filelistp;

      newfile->device = 0;
      newfile->inode = 0;
      newfile->crcsignature = NULL;
      newfile->crcpartial = NULL;
      newfile->duplicates = NULL;
      newfile->hasdupes = 0;

      newfile->d_name = (char*)malloc(strlen(dir)+strlen(dirinfo->d_name)+2);

      if (!newfile->d_name) {
        errormsg("out of memory!\n");
        free(newfile);
        closedir(cd);
        exit(1);
      }

      strcpy(newfile->d_name, dir);
      lastchar = strlen(dir) - 1;
      if (lastchar >= 0 && dir[lastchar] != '/')
        strcat(newfile->d_name, "/");
      strcat(newfile->d_name, dirinfo->d_name);

      if (filesize(newfile->d_name) == 0 && ISFLAG(flags, F_EXCLUDEEMPTY)) {
        free(newfile->d_name);
        free(newfile);
        continue;
      }

      if (stat(newfile->d_name, &info) == -1) {
        free(newfile->d_name);
        free(newfile);
        continue;
      }

      if (lstat(newfile->d_name, &linfo) == -1) {
        free(newfile->d_name);
        free(newfile);
        continue;
      }

      if (S_ISDIR(info.st_mode)) {
        if (ISFLAG(flags, F_RECURSE) && (ISFLAG(flags, F_FOLLOWLINKS) || !S_ISLNK(linfo.st_mode)))
          filecount += grokdir(newfile->d_name, filelistp);
        free(newfile->d_name);
        free(newfile);
      } else {
        if (S_ISREG(linfo.st_mode) || (S_ISLNK(linfo.st_mode) && ISFLAG(flags, F_FOLLOWLINKS))) {
          *filelistp = newfile;
          filecount++;
        } else {
          free(newfile->d_name);
          free(newfile);
        }
      }
    }
  }

  closedir(cd);

  return filecount;
}

#ifndef EXTERNAL_MD5

/* If EXTERNAL_MD5 is not defined, use L. Peter Deutsch's MD5 library.
 */
char *getcrcsignatureuntil(char *filename, off_t max_read)
{
  int x;
  off_t fsize;
  off_t toread;
  md5_state_t state;
  md5_byte_t digest[16];
  static md5_byte_t chunk[CHUNK_SIZE];
  static char signature[16*2 + 1];
  char *sigp;
  FILE *file;

  md5_init(&state);

  fsize = filesize(filename);

  if (max_read != 0 && fsize > max_read)
    fsize = max_read;

  file = fopen(filename, "rb");
  if (file == NULL) {
    errormsg("error opening file %s\n", filename);
    return NULL;
  }

  while (fsize > 0) {
    toread = (fsize % CHUNK_SIZE) ? (fsize % CHUNK_SIZE) : CHUNK_SIZE;
    if (fread(chunk, toread, 1, file) != 1) {
      errormsg("error reading from file %s\n", filename);
      fclose(file);
      return NULL;
    }
    md5_append(&state, chunk, toread);
    fsize -= toread;
  }

  md5_finish(&state, digest);

  sigp = signature;

  for (x = 0; x < 16; x++) {
    sprintf(sigp, "%02x", digest[x]);
    sigp = strchr(sigp, '\0');
  }

  fclose(file);

  return signature;
}

char *getcrcsignature(char *filename)
{
  return getcrcsignatureuntil(filename, 0);
}

char *getcrcpartialsignature(char *filename)
{
  return getcrcsignatureuntil(filename, PARTIAL_MD5_SIZE);
}

#endif /* [#ifndef EXTERNAL_MD5] */

#ifdef EXTERNAL_MD5

/* If EXTERNAL_MD5 is defined, use md5sum program to calculate signatures.
 */
char *getcrcsignature(char *filename)
{
  static char signature[256];
  char *command;
  char *separator;
  FILE *result;

  command = (char*) malloc(strlen(filename)+strlen(EXTERNAL_MD5)+2);
  if (command == NULL) {
    errormsg("out of memory\n");
    exit(1);
  }

  sprintf(command, "%s %s", EXTERNAL_MD5, filename);

  result = popen(command, "r");
  if (result == NULL) {
    errormsg("error invoking %s\n", EXTERNAL_MD5);
    exit(1);
  }

  free(command);

  if (fgets(signature, 256, result) == NULL) {
    errormsg("error generating signature for %s\n", filename);
    pclose(result);
    return NULL;
  }
  separator = strchr(signature, ' ');
  if (separator) *separator = '\0';

  pclose(result);

  return signature;
}

#endif /* [#ifdef EXTERNAL_MD5] */

void purgetree(filetree_t *checktree)
{
  if (checktree->left != NULL) purgetree(checktree->left);

  if (checktree->right != NULL) purgetree(checktree->right);

  free(checktree);
}

void getfilestats(file_t *file)
{
  file->size = filesize(file->d_name);
  file->inode = getinode(file->d_name);
  file->device = getdevice(file->d_name);
  file->mtime = getmtime(file->d_name);
}

int registerfile(filetree_t **branch, file_t *file)
{
  getfilestats(file);

  *branch = (filetree_t*) malloc(sizeof(filetree_t));
  if (*branch == NULL) {
    errormsg("out of memory!\n");
    exit(1);
  }

  (*branch)->file = file;
  (*branch)->left = NULL;
  (*branch)->right = NULL;

  return 1;
}

file_t **checkmatch(filetree_t **root, filetree_t *checktree, file_t *file)
{
  int cmpresult;
  char *crcsignature;
  off_t fsize;

  /* If device and inode fields are equal one of the files is a
     hard link to the other or the files have been listed twice
     unintentionally. We don't want to flag these files as
     duplicates unless the user specifies otherwise.
  */

  if (!ISFLAG(flags, F_CONSIDERHARDLINKS) && (getinode(file->d_name) ==
      checktree->file->inode) && (getdevice(file->d_name) ==
      checktree->file->device)) return NULL;

  fsize = filesize(file->d_name);

  if (fsize < checktree->file->size)
    cmpresult = -1;
  else
    if (fsize > checktree->file->size) cmpresult = 1;
  else {
    if (checktree->file->crcpartial == NULL) {
      crcsignature = getcrcpartialsignature(checktree->file->d_name);
      if (crcsignature == NULL) return NULL;

      checktree->file->crcpartial = (char*) malloc(strlen(crcsignature)+1);
      if (checktree->file->crcpartial == NULL) {
        errormsg("out of memory\n");
        exit(1);
      }
      strcpy(checktree->file->crcpartial, crcsignature);
    }

    if (file->crcpartial == NULL) {
      crcsignature = getcrcpartialsignature(file->d_name);
      if (crcsignature == NULL) return NULL;

      file->crcpartial = (char*) malloc(strlen(crcsignature)+1);
      if (file->crcpartial == NULL) {
        errormsg("out of memory\n");
        exit(1);
      }
      strcpy(file->crcpartial, crcsignature);
    }

    cmpresult = strcmp(file->crcpartial, checktree->file->crcpartial);
    /*if (cmpresult != 0) errormsg(" on %s vs %s\n", file->d_name, checktree->file->d_name);*/

    if (cmpresult == 0) {
      if (checktree->file->crcsignature == NULL) {
        crcsignature = getcrcsignature(checktree->file->d_name);
        if (crcsignature == NULL) return NULL;

        checktree->file->crcsignature = (char*) malloc(strlen(crcsignature)+1);
        if (checktree->file->crcsignature == NULL) {
          errormsg("out of memory\n");
          exit(1);
        }
        strcpy(checktree->file->crcsignature, crcsignature);
      }

      if (file->crcsignature == NULL) {
        crcsignature = getcrcsignature(file->d_name);
        if (crcsignature == NULL) return NULL;

        file->crcsignature = (char*) malloc(strlen(crcsignature)+1);
        if (file->crcsignature == NULL) {
          errormsg("out of memory\n");
          exit(1);
        }
        strcpy(file->crcsignature, crcsignature);
      }

      cmpresult = strcmp(file->crcsignature, checktree->file->crcsignature);
      /*if (cmpresult != 0) errormsg("P on %s vs %s\n", file->d_name, checktree->file->d_name);
      else errormsg("P F on %s vs %s\n",
          file->d_name, checktree->file->d_name);
      printf("%s matches %s\n", file->d_name, checktree->file->d_name);*/
    }
  }

  if (cmpresult < 0) {
    if (checktree->left != NULL) {
      return checkmatch(root, checktree->left, file);
    } else {
      registerfile(&(checktree->left), file);
      return NULL;
    }
  } else if (cmpresult > 0) {
    if (checktree->right != NULL) {
      return checkmatch(root, checktree->right, file);
    } else {
      registerfile(&(checktree->right), file);
      return NULL;
    }
  } else {
    getfilestats(file);
    return &checktree->file;
  }
}

/* Do a bit-for-bit comparison in case two different files produce the
   same signature. Unlikely, but better safe than sorry. */

int confirmmatch(FILE *file1, FILE *file2)
{
  unsigned char c1 = 0;
  unsigned char c2 = 0;
  size_t r1;
  size_t r2;

  fseek(file1, 0, SEEK_SET);
  fseek(file2, 0, SEEK_SET);

  do {
    r1 = fread(&c1, sizeof(c1), 1, file1);
    r2 = fread(&c2, sizeof(c2), 1, file2);

    if (c1 != c2) return 0; /* file contents are different */
  } while (r1 && r2);

  if (r1 != r2) return 0; /* file lengths are different */

  return 1;
}

void summarizematches(file_t *files)
{
  int numsets = 0;
  double numbytes = 0.0;
  int numfiles = 0;
  file_t *tmpfile;

  while (files != NULL)
  {
    if (files->hasdupes)
    {
      numsets++;

      tmpfile = files->duplicates;
      while (tmpfile != NULL)
      {
        numfiles++;
        numbytes += files->size;
        tmpfile = tmpfile->duplicates;
      }
    }

    files = files->next;
  }

  if (numsets == 0)
    printf("No duplicates found.\n\n");
  else
  {
    if (numbytes < 1024.0)
      printf("%d duplicate files (in %d sets), occupying %.0f bytes.\n\n", numfiles, numsets, numbytes);
    else if (numbytes <= (1000.0 * 1000.0))
      printf("%d duplicate files (in %d sets), occupying %.1f kilobytes\n\n", numfiles, numsets, numbytes / 1000.0);
    else
      printf("%d duplicate files (in %d sets), occupying %.1f megabytes\n\n", numfiles, numsets, numbytes / (1000.0 * 1000.0));
  }
}

void printmatches(file_t *files)
{
  file_t *tmpfile;

  while (files != NULL) {
    if (files->hasdupes) {
      if (!ISFLAG(flags, F_OMITFIRST)) {
        if (ISFLAG(flags, F_SHOWSIZE)) printf("%lld byte%seach:\n", files->size,
          (files->size != 1) ? "s " : " ");
        if (ISFLAG(flags, F_DSAMELINE)) escapefilename("\\ ", &files->d_name);
        printf("%s%c", files->d_name, ISFLAG(flags, F_DSAMELINE)?' ':'\n');
      }
      tmpfile = files->duplicates;
      while (tmpfile != NULL) {
        if (ISFLAG(flags, F_DSAMELINE)) escapefilename("\\ ", &tmpfile->d_name);
        printf("%s%c", tmpfile->d_name, ISFLAG(flags, F_DSAMELINE)?' ':'\n');
        tmpfile = tmpfile->duplicates;
      }
      printf("\n");
    }

    files = files->next;
  }
}

/*
#define REVISE_APPEND "_tmp"

char *revisefilename(char *path, int seq)
{
  int digits;
  char *newpath;
  char *scratch;
  char *dot;

  digits = numdigits(seq);
  newpath = malloc(strlen(path) + strlen(REVISE_APPEND) + digits + 1);
  if (!newpath) return newpath;

  scratch = malloc(strlen(path) + 1);
  if (!scratch) return newpath;

  strcpy(scratch, path);
  dot = strrchr(scratch, '.');
  if (dot)
  {
    *dot = 0;
    sprintf(newpath, "%s%s%d.%s", scratch, REVISE_APPEND, seq, dot + 1);
  }

  else
  {
    sprintf(newpath, "%s%s%d", path, REVISE_APPEND, seq);
  }

  free(scratch);

  return newpath;
}
*/

int relink(char *oldfile, char *newfile)
{
  dev_t od;
  dev_t nd;
  ino_t oi;
  ino_t ni;

  od = getdevice(oldfile);
  oi = getinode(oldfile);

  if (link(oldfile, newfile) != 0)
    return 0;

  /* make sure we're working with the right file (the one we created) */
  nd = getdevice(newfile);
  ni = getinode(newfile);

  if (nd != od || oi != ni)
    return 0; /* file is not what we expected */

  return 1;
}

void deletefiles(file_t *files, int prompt, FILE *tty)
{
  int counter;
  int groups = 0;
  int curgroup = 0;
  file_t *tmpfile;
  file_t *curfile;
  file_t **dupelist;
  int *preserve;
  char *preservestr;
  char *token;
  char *tstr;
  int number;
  int sum;
  int max = 0;
  int x;
  int i;

  curfile = files;

  while (curfile) {
    if (curfile->hasdupes) {
      counter = 1;
      groups++;

      tmpfile = curfile->duplicates;
      while (tmpfile) {
        counter++;
        tmpfile = tmpfile->duplicates;
      }

      if (counter > max) max = counter;
    }

    curfile = curfile->next;
  }

  max++;

  dupelist = (file_t**) malloc(sizeof(file_t*) * max);
  preserve = (int*) malloc(sizeof(int) * max);
  preservestr = (char*) malloc(INPUT_SIZE);

  if (!dupelist || !preserve || !preservestr) {
    errormsg("out of memory\n");
    exit(1);
  }

  while (files) {
    if (files->hasdupes) {
      curgroup++;
      counter = 1;
      dupelist[counter] = files;

      if (prompt) printf("[%d] %s\n", counter, files->d_name);

      tmpfile = files->duplicates;

      while (tmpfile) {
        dupelist[++counter] = tmpfile;
        if (prompt) printf("[%d] %s\n", counter, tmpfile->d_name);
        tmpfile = tmpfile->duplicates;
      }

      if (prompt) printf("\n");

      if (!prompt) /* preserve only the first file */
      {
        preserve[1] = 1;
        for (x = 2; x <= counter; x++) preserve[x] = 0;
      }

      else /* prompt for files to preserve */

      do {
        printf("Set %d of %d, preserve files [1 - %d, all]",
          curgroup, groups, counter);
        if (ISFLAG(flags, F_SHOWSIZE)) printf(" (%lld byte%seach)", files->size,
          (files->size != 1) ? "s " : " ");
        printf(": ");
        fflush(stdout);

        if (!fgets(preservestr, INPUT_SIZE, tty))
          preservestr[0] = '\n'; /* treat fgets() failure as if nothing was entered */

        i = strlen(preservestr) - 1;

        while (preservestr[i]!='\n'){ /* tail of buffer must be a newline */
          tstr = (char*)
            realloc(preservestr, strlen(preservestr) + 1 + INPUT_SIZE);
          if (!tstr) { /* couldn't allocate memory, treat as fatal */
            errormsg("out of memory!\n");
            exit(1);
          }

          preservestr = tstr;
          if (!fgets(preservestr + i + 1, INPUT_SIZE, tty))
          {
            preservestr[0] = '\n'; /* treat fgets() failure as if nothing was entered */
            break;
          }
          i = strlen(preservestr)-1;
        }

        for (x = 1; x <= counter; x++) preserve[x] = 0;

        token = strtok(preservestr, " ,\n");

        while (token != NULL) {
          if (strcasecmp(token, "all") == 0)
            for (x = 0; x <= counter; x++) preserve[x] = 1;

          number = 0;
          sscanf(token, "%d", &number);
          if (number > 0 && number <= counter) preserve[number] = 1;

          token = strtok(NULL, " ,\n");
        }

        for (sum = 0, x = 1; x <= counter; x++) sum += preserve[x];
      } while (sum < 1); /* make sure we've preserved at least one file */

      printf("\n");

      for (x = 1; x <= counter; x++) {
        if (preserve[x])
          printf(" [+] %s\n", dupelist[x]->d_name);
        else {
          if (remove(dupelist[x]->d_name) == 0) {
            printf(" [-] %s\n", dupelist[x]->d_name);
          } else {
            printf(" [!] %s ", dupelist[x]->d_name);
            printf("-- unable to delete file!\n");
          }
        }
      }
      printf("\n");
    }

    files = files->next;
  }

  free(dupelist);
  free(preserve);
  free(preservestr);
}

int sort_pairs_by_arrival(file_t *f1, file_t *f2)
{
  if (f2->duplicates != 0)
    return 1;

  return -1;
}

int sort_pairs_by_mtime(file_t *f1, file_t *f2)
{
  if (f1->mtime < f2->mtime)
    return -1;
  else if (f1->mtime > f2->mtime)
    return 1;

  return 0;
}

void registerpair(file_t **matchlist, file_t *newmatch,
                  int (*comparef)(file_t *f1, file_t *f2))
{
  file_t *traverse;
  file_t *back;

  (*matchlist)->hasdupes = 1;

  back = 0;
  traverse = *matchlist;
  while (traverse)
  {
    if (comparef(newmatch, traverse) <= 0)
    {
      newmatch->duplicates = traverse;

      if (back == 0)
      {
        *matchlist = newmatch; /* update pointer to head of list */

        newmatch->hasdupes = 1;
        traverse->hasdupes = 0; /* flag is only for first file in dupe chain */
      }
      else
        back->duplicates = newmatch;

      break;
    }
    else
    {
      if (traverse->duplicates == 0)
      {
        traverse->duplicates = newmatch;

        if (back == 0)
          traverse->hasdupes = 1;

        break;
      }
    }

    back = traverse;
    traverse = traverse->duplicates;
  }
}

void help_text()
{
  printf("Usage: fdupes [options] DIRECTORY...\n\n");

  printf(" -r --recurse \tfor every directory given follow subdirectories\n");
  printf(" \tencountered within\n");
  printf(" -R --recurse: \tfor each directory given after this option follow\n");
  printf(" \tsubdirectories encountered within\n");
  printf(" -s --symlinks \tfollow symlinks\n");
  printf(" -H --hardlinks \tnormally, when two or more files point to the same\n");
  printf(" \tdisk area they are treated as non-duplicates; this\n");
  printf(" \toption will change this behavior\n");
  printf(" -n --noempty \texclude zero-length files from consideration\n");
  printf(" -f --omitfirst \tomit the first file in each set of matches\n");
  printf(" -1 --sameline \tlist each set of matches on a single line\n");
  printf(" -S --size \tshow size of duplicate files\n");
  printf(" -m --summarize \tsummarize dupe information\n");
  printf(" -q --quiet \thide progress indicator\n");
  printf(" -d --delete \tprompt user for files to preserve and delete all\n");
  printf(" \tothers; important: under particular circumstances,\n");
  printf(" \tdata may be lost when using this option together\n");
  printf(" \twith -s or --symlinks, or when specifying a\n");
  printf(" \tparticular directory more than once; refer to the\n");
  printf(" \tfdupes documentation for additional information\n");
  /*printf(" -l --relink \t(description)\n");*/
  printf(" -N --noprompt \ttogether with --delete, preserve the first file in\n");
  printf(" \teach set of duplicates and delete the rest without\n");
  printf(" \tprompting the user\n");
  printf(" -v --version \tdisplay fdupes version\n");
  printf(" -h --help \tdisplay this help message\n\n");
#ifdef OMIT_GETOPT_LONG
  printf("Note: Long options are not supported in this fdupes build.\n\n");
#endif
}

int main(int argc, char **argv) {
  int x;
  int opt;
  FILE *file1;
  FILE *file2;
  file_t *files = NULL;
  file_t *curfile;
  file_t **match = NULL;
  filetree_t *checktree = NULL;
  int filecount = 0;
  int progress = 0;
  char **oldargv;
  int firstrecurse;

#ifndef OMIT_GETOPT_LONG
  static struct option long_options[] =
  {
    { "omitfirst", 0, 0, 'f' },
    { "recurse", 0, 0, 'r' },
    { "recursive", 0, 0, 'r' },
    { "recurse:", 0, 0, 'R' },
    { "recursive:", 0, 0, 'R' },
    { "quiet", 0, 0, 'q' },
    { "sameline", 0, 0, '1' },
    { "size", 0, 0, 'S' },
    { "symlinks", 0, 0, 's' },
    { "hardlinks", 0, 0, 'H' },
    { "relink", 0, 0, 'l' },
    { "noempty", 0, 0, 'n' },
    { "delete", 0, 0, 'd' },
    { "version", 0, 0, 'v' },
    { "help", 0, 0, 'h' },
    { "noprompt", 0, 0, 'N' },
    { "summarize", 0, 0, 'm'},
    { "summary", 0, 0, 'm' },
    { 0, 0, 0, 0 }
  };
#define GETOPT getopt_long
#else
#define GETOPT getopt
#endif

  program_name = argv[0];

  oldargv = cloneargs(argc, argv);

  while ((opt = GETOPT(argc, argv, "frRq1Ss::HlndvhNm"
#ifndef OMIT_GETOPT_LONG
          , long_options, NULL
#endif
          )) != EOF) {
    switch (opt) {
    case 'f':
      SETFLAG(flags, F_OMITFIRST);
      break;
    case 'r':
      SETFLAG(flags, F_RECURSE);
      break;
    case 'R':
      SETFLAG(flags, F_RECURSEAFTER);
      break;
    case 'q':
      SETFLAG(flags, F_HIDEPROGRESS);
      break;
    case '1':
      SETFLAG(flags, F_DSAMELINE);
      break;
    case 'S':
      SETFLAG(flags, F_SHOWSIZE);
      break;
    case 's':
      SETFLAG(flags, F_FOLLOWLINKS);
      break;
    case 'H':
      SETFLAG(flags, F_CONSIDERHARDLINKS);
      break;
    case 'n':
      SETFLAG(flags, F_EXCLUDEEMPTY);
      break;
    case 'd':
      SETFLAG(flags, F_DELETEFILES);
      break;
    case 'v':
      printf("fdupes %s\n", VERSION);
      exit(0);
    case 'h':
      help_text();
      exit(1);
    case 'N':
      SETFLAG(flags, F_NOPROMPT);
      break;
    case 'm':
      SETFLAG(flags, F_SUMMARIZEMATCHES);
      break;

    default:
      fprintf(stderr, "Try `fdupes --help' for more information.\n");
      exit(1);
    }
  }

  if (optind >= argc) {
    errormsg("no directories specified\n");
    exit(1);
  }

  if (ISFLAG(flags, F_RECURSE) && ISFLAG(flags, F_RECURSEAFTER)) {
    errormsg("options --recurse and --recurse: are not compatible\n");
    exit(1);
  }

  if (ISFLAG(flags, F_SUMMARIZEMATCHES) && ISFLAG(flags, F_DELETEFILES)) {
    errormsg("options --summarize and --delete are not compatible\n");
    exit(1);
  }

  if (ISFLAG(flags, F_RECURSEAFTER)) {
    firstrecurse = nonoptafter("--recurse:", argc, oldargv, argv, optind);

    if (firstrecurse == argc)
      firstrecurse = nonoptafter("-R", argc, oldargv, argv, optind);

    if (firstrecurse == argc) {
      errormsg("-R option must be isolated from other options\n");
      exit(1);
    }

    /* F_RECURSE is not set for directories before --recurse: */
    for (x = optind; x < firstrecurse; x++)
      filecount += grokdir(argv[x], &files);

    /* Set F_RECURSE for directories after --recurse: */
    SETFLAG(flags, F_RECURSE);

    for (x = firstrecurse; x < argc; x++)
      filecount += grokdir(argv[x], &files);
  } else {
    for (x = optind; x < argc; x++)
      filecount += grokdir(argv[x], &files);
  }

  if (!files) {
    if (!ISFLAG(flags, F_HIDEPROGRESS)) fprintf(stderr, "\r%40s\r", " ");
    exit(0);
  }

  curfile = files;

  while (curfile) {
    if (!checktree)
      registerfile(&checktree, curfile);
    else
      match = checkmatch(&checktree, checktree, curfile);

    if (match != NULL) {
      file1 = fopen(curfile->d_name, "rb");
      if (!file1) {
        curfile = curfile->next;
        continue;
      }

      file2 = fopen((*match)->d_name, "rb");
      if (!file2) {
        fclose(file1);
        curfile = curfile->next;
        continue;
      }

      if (confirmmatch(file1, file2)) {
        registerpair(match, curfile, sort_pairs_by_mtime);

        /*match->hasdupes = 1;
        curfile->duplicates = match->duplicates;
        match->duplicates = curfile;*/
      }

      fclose(file1);
      fclose(file2);
    }

    curfile = curfile->next;

    if (!ISFLAG(flags, F_HIDEPROGRESS)) {
      fprintf(stderr, "\rProgress [%d/%d] %d%% ", progress, filecount,
       (int)((float) progress / (float) filecount * 100.0));
      progress++;
    }
  }

  if (!ISFLAG(flags, F_HIDEPROGRESS)) fprintf(stderr, "\r%40s\r", " ");

  if (ISFLAG(flags, F_DELETEFILES))
  {
    if (ISFLAG(flags, F_NOPROMPT))
    {
      deletefiles(files, 0, 0);
    }
    else
    {
      stdin = freopen("/dev/tty", "r", stdin);
      deletefiles(files, 1, stdin);
    }
  }

  else

    if (ISFLAG(flags, F_SUMMARIZEMATCHES))
      summarizematches(files);

    else

      printmatches(files);

  while (files) {
    curfile = files->next;
    free(files->d_name);
    free(files->crcsignature);
    free(files->crcpartial);
    free(files);
    files = curfile;
  }

  for (x = 0; x < argc; x++)
    free(oldargv[x]);

  free(oldargv);

  purgetree(checktree);

  return 0;
}

fdupes-1.51/INSTALL

Installing fdupes
--------------------------------------------------------------------
To install the program, issue the following commands:

make fdupes
su root
make install

This will install the program in /usr/local/bin. You may change this
to a different location by editing the Makefile. Please refer to
the Makefile for an explanation of compile-time options. If you're
having trouble compiling, please take a look at the Makefile.
UPGRADING NOTE: When upgrading from a version prior to 1.2, it should be
noted that the default installation directory for the fdupes man page
has changed from "/usr/man" to "/usr/local/man". If installing to the
default location you should delete the old man page before proceeding.
This file would be named "/usr/man/man1/fdupes.1".

A test directory is included so that you may familiarise yourself
with the way fdupes operates. You may test the program before
installing it by issuing a command such as "./fdupes testdir"
or "./fdupes -r testdir", just to name a couple of examples. Refer
to the documentation for information on valid options.

fdupes-1.51/README

Introduction
--------------------------------------------------------------------
FDUPES is a program for identifying duplicate files residing
within specified directories.

Usage
--------------------------------------------------------------------
Usage: fdupes [options] DIRECTORY...

 -r --recurse       for every directory given follow subdirectories
                    encountered within
 -R --recurse:      for each directory given after this option follow
                    subdirectories encountered within
 -s --symlinks      follow symlinks
 -H --hardlinks     normally, when two or more files point to the same
                    disk area they are treated as non-duplicates; this
                    option will change this behavior
 -n --noempty       exclude zero-length files from consideration
 -f --omitfirst     omit the first file in each set of matches
 -1 --sameline      list each set of matches on a single line
 -S --size          show size of duplicate files
 -q --quiet         hide progress indicator
 -d --delete        prompt user for files to preserve and delete all
                    others; important: under particular circumstances,
                    data may be lost when using this option together
                    with -s or --symlinks, or when specifying a
                    particular directory more than once; refer to the
                    fdupes documentation for additional information
 -v --version       display fdupes version
 -h --help          display this help message

Unless -1 or --sameline is specified, duplicate files are listed
together in groups, each file displayed on a separate line. The
groups are then separated from each other by blank lines.

When -1 or --sameline is specified, spaces and backslash characters (\)
appearing in a filename are preceded by a backslash character. For
instance, "with spaces" becomes "with\ spaces".

When using -d or --delete, care should be taken to insure against
accidental data loss. While no information will be immediately
lost, using this option together with -s or --symlinks can lead
to confusing information being presented to the user when prompted
for files to preserve. Specifically, a user could accidentally
preserve a symlink while deleting the file it points to. A similar
problem arises when specifying a particular directory more than
once. All files within that directory will be listed as their own
duplicates, leading to data loss should a user preserve a file
without its "duplicate" (the file itself!).
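
The escaping rule described above for -1 or --sameline can be sketched
in C roughly as follows. This is only an illustration: escape_copy is a
hypothetical helper, not a function found in fdupes, and it returns a
fresh copy of the name rather than modifying the original in place.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (an assumption, not part of fdupes): returns a
   newly allocated copy of `name` in which every character that appears
   in `escape_list` is preceded by a backslash. The caller is
   responsible for freeing the result. Returns NULL when out of memory. */
char *escape_copy(const char *escape_list, const char *name)
{
  size_t len = strlen(name);
  size_t i;
  size_t out = 0;
  char *result = malloc(len * 2 + 1); /* worst case: every char escaped */

  if (result == NULL)
    return NULL;

  for (i = 0; i < len; i++) {
    if (strchr(escape_list, name[i]) != NULL)
      result[out++] = '\\';           /* insert the escaping backslash */
    result[out++] = name[i];
  }
  result[out] = '\0';

  return result;
}
```

Calling escape_copy("\\ ", "with spaces") produces the string
"with\ spaces", so that names listed on a single line can still be
told apart by a program reading the output.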

Contact Information for Adrian Lopez
--------------------------------------------------------------------
email: adrian2@caribe.net

Legal Information
--------------------------------------------------------------------
FDUPES Copyright (c) 1999 Adrian Lopez

Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation files
(the "Software"), to deal in the Software without restriction,
including without limitation the rights to use, copy, modify, merge,
publish, distribute, sublicense, and/or sell copies of the Software,
and to permit persons to whom the Software is furnished to do so,
subject to the following conditions:

The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
fdupes-1.51/.svn/0000755000175000017500000000000012134556276013103 5ustar adrianadrianfdupes-1.51/.svn/entries0000644000175000017500000000000312134544625014463 0ustar adrianadrian12 fdupes-1.51/.svn/pristine/0000755000175000017500000000000012134556275014737 5ustar adrianadrianfdupes-1.51/.svn/pristine/1d/0000755000175000017500000000000012134554374015241 5ustar adrianadrianfdupes-1.51/.svn/pristine/1d/1d9037bc7f488901dc068f1df3818084c5125ec1.svn-base0000644000175000017500000006652612134544631024141 0ustar adrianadrian/* FDUPES Copyright (c) 1999-2002 Adrian Lopez Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
*/

#include <stdio.h>
#include <stdarg.h>
#include <string.h>
#include <sys/stat.h>
#include <dirent.h>
#include <unistd.h>
#include <stdlib.h>
#ifndef OMIT_GETOPT_LONG
#include <getopt.h>
#endif
#include <string.h>
#include <errno.h>

#ifndef EXTERNAL_MD5
#include "md5/md5.h"
#endif

#define ISFLAG(a,b) ((a & b) == b)
#define SETFLAG(a,b) (a |= b)

#define F_RECURSE           0x0001
#define F_HIDEPROGRESS      0x0002
#define F_DSAMELINE         0x0004
#define F_FOLLOWLINKS       0x0008
#define F_DELETEFILES       0x0010
#define F_EXCLUDEEMPTY      0x0020
#define F_CONSIDERHARDLINKS 0x0040
#define F_SHOWSIZE          0x0080
#define F_OMITFIRST         0x0100
#define F_RECURSEAFTER      0x0200
#define F_NOPROMPT          0x0400
#define F_SUMMARIZEMATCHES  0x0800

char *program_name;

unsigned long flags = 0;

#define CHUNK_SIZE 8192

#define INPUT_SIZE 256

#define PARTIAL_MD5_SIZE 4096

/*

TODO: Partial sums (for working with very large files).

typedef struct _signature
{
  md5_state_t state;
  md5_byte_t digest[16];
} signature_t;

typedef struct _signatures
{
  int num_signatures;
  signature_t *signatures;
} signatures_t;

*/

typedef struct _file {
  char *d_name;
  off_t size;
  char *crcpartial;
  char *crcsignature;
  dev_t device;
  ino_t inode;
  time_t mtime;
  int hasdupes; /* true only if file is first on duplicate chain */
  struct _file *duplicates;
  struct _file *next;
} file_t;

typedef struct _filetree {
  file_t *file;
  struct _filetree *left;
  struct _filetree *right;
} filetree_t;

void errormsg(char *message, ...)
{ va_list ap; va_start(ap, message); fprintf(stderr, "\r%40s\r%s: ", "", program_name); vfprintf(stderr, message, ap); } void escapefilename(char *escape_list, char **filename_ptr) { int x; int tx; char *tmp; char *filename; filename = *filename_ptr; tmp = (char*) malloc(strlen(filename) * 2 + 1); if (tmp == NULL) { errormsg("out of memory!\n"); exit(1); } for (x = 0, tx = 0; x < strlen(filename); x++) { if (strchr(escape_list, filename[x]) != NULL) tmp[tx++] = '\\'; tmp[tx++] = filename[x]; } tmp[tx] = '\0'; if (x != tx) { *filename_ptr = realloc(*filename_ptr, strlen(tmp) + 1); if (*filename_ptr == NULL) { errormsg("out of memory!\n"); exit(1); } strcpy(*filename_ptr, tmp); } } off_t filesize(char *filename) { struct stat s; if (stat(filename, &s) != 0) return -1; return s.st_size; } dev_t getdevice(char *filename) { struct stat s; if (stat(filename, &s) != 0) return 0; return s.st_dev; } ino_t getinode(char *filename) { struct stat s; if (stat(filename, &s) != 0) return 0; return s.st_ino; } time_t getmtime(char *filename) { struct stat s; if (stat(filename, &s) != 0) return 0; return s.st_mtime; } char **cloneargs(int argc, char **argv) { int x; char **args; args = (char **) malloc(sizeof(char*) * argc); if (args == NULL) { errormsg("out of memory!\n"); exit(1); } for (x = 0; x < argc; x++) { args[x] = (char*) malloc(strlen(argv[x]) + 1); if (args[x] == NULL) { free(args); errormsg("out of memory!\n"); exit(1); } strcpy(args[x], argv[x]); } return args; } int findarg(char *arg, int start, int argc, char **argv) { int x; for (x = start; x < argc; x++) if (strcmp(argv[x], arg) == 0) return x; return x; } /* Find the first non-option argument after specified option. 
*/ int nonoptafter(char *option, int argc, char **oldargv, char **newargv, int optind) { int x; int targetind; int testind; int startat = 1; targetind = findarg(option, 1, argc, oldargv); for (x = optind; x < argc; x++) { testind = findarg(newargv[x], startat, argc, oldargv); if (testind > targetind) return x; else startat = testind; } return x; } int grokdir(char *dir, file_t **filelistp) { DIR *cd; file_t *newfile; struct dirent *dirinfo; int lastchar; int filecount = 0; struct stat info; struct stat linfo; static int progress = 0; static char indicator[] = "-\\|/"; cd = opendir(dir); if (!cd) { errormsg("could not chdir to %s\n", dir); return 0; } while ((dirinfo = readdir(cd)) != NULL) { if (strcmp(dirinfo->d_name, ".") && strcmp(dirinfo->d_name, "..")) { if (!ISFLAG(flags, F_HIDEPROGRESS)) { fprintf(stderr, "\rBuilding file list %c ", indicator[progress]); progress = (progress + 1) % 4; } newfile = (file_t*) malloc(sizeof(file_t)); if (!newfile) { errormsg("out of memory!\n"); closedir(cd); exit(1); } else newfile->next = *filelistp; newfile->device = 0; newfile->inode = 0; newfile->crcsignature = NULL; newfile->crcpartial = NULL; newfile->duplicates = NULL; newfile->hasdupes = 0; newfile->d_name = (char*)malloc(strlen(dir)+strlen(dirinfo->d_name)+2); if (!newfile->d_name) { errormsg("out of memory!\n"); free(newfile); closedir(cd); exit(1); } strcpy(newfile->d_name, dir); lastchar = strlen(dir) - 1; if (lastchar >= 0 && dir[lastchar] != '/') strcat(newfile->d_name, "/"); strcat(newfile->d_name, dirinfo->d_name); if (filesize(newfile->d_name) == 0 && ISFLAG(flags, F_EXCLUDEEMPTY)) { free(newfile->d_name); free(newfile); continue; } if (stat(newfile->d_name, &info) == -1) { free(newfile->d_name); free(newfile); continue; } if (lstat(newfile->d_name, &linfo) == -1) { free(newfile->d_name); free(newfile); continue; } if (S_ISDIR(info.st_mode)) { if (ISFLAG(flags, F_RECURSE) && (ISFLAG(flags, F_FOLLOWLINKS) || !S_ISLNK(linfo.st_mode))) filecount += 
grokdir(newfile->d_name, filelistp); free(newfile->d_name); free(newfile); } else { if (S_ISREG(linfo.st_mode) || (S_ISLNK(linfo.st_mode) && ISFLAG(flags, F_FOLLOWLINKS))) { *filelistp = newfile; filecount++; } else { free(newfile->d_name); free(newfile); } } } } closedir(cd); return filecount; } #ifndef EXTERNAL_MD5 /* If EXTERNAL_MD5 is not defined, use L. Peter Deutsch's MD5 library. */ char *getcrcsignatureuntil(char *filename, off_t max_read) { int x; off_t fsize; off_t toread; md5_state_t state; md5_byte_t digest[16]; static md5_byte_t chunk[CHUNK_SIZE]; static char signature[16*2 + 1]; char *sigp; FILE *file; md5_init(&state); fsize = filesize(filename); if (max_read != 0 && fsize > max_read) fsize = max_read; file = fopen(filename, "rb"); if (file == NULL) { errormsg("error opening file %s\n", filename); return NULL; } while (fsize > 0) { toread = (fsize % CHUNK_SIZE) ? (fsize % CHUNK_SIZE) : CHUNK_SIZE; if (fread(chunk, toread, 1, file) != 1) { errormsg("error reading from file %s\n", filename); fclose(file); // bugfix return NULL; } md5_append(&state, chunk, toread); fsize -= toread; } md5_finish(&state, digest); sigp = signature; for (x = 0; x < 16; x++) { sprintf(sigp, "%02x", digest[x]); sigp = strchr(sigp, '\0'); } fclose(file); return signature; } char *getcrcsignature(char *filename) { return getcrcsignatureuntil(filename, 0); } char *getcrcpartialsignature(char *filename) { return getcrcsignatureuntil(filename, PARTIAL_MD5_SIZE); } #endif /* [#ifndef EXTERNAL_MD5] */ #ifdef EXTERNAL_MD5 /* If EXTERNAL_MD5 is defined, use md5sum program to calculate signatures. 
*/ char *getcrcsignature(char *filename) { static char signature[256]; char *command; char *separator; FILE *result; command = (char*) malloc(strlen(filename)+strlen(EXTERNAL_MD5)+2); if (command == NULL) { errormsg("out of memory\n"); exit(1); } sprintf(command, "%s %s", EXTERNAL_MD5, filename); result = popen(command, "r"); if (result == NULL) { errormsg("error invoking %s\n", EXTERNAL_MD5); exit(1); } free(command); if (fgets(signature, 256, result) == NULL) { errormsg("error generating signature for %s\n", filename); return NULL; } separator = strchr(signature, ' '); if (separator) *separator = '\0'; pclose(result); return signature; } #endif /* [#ifdef EXTERNAL_MD5] */ void purgetree(filetree_t *checktree) { if (checktree->left != NULL) purgetree(checktree->left); if (checktree->right != NULL) purgetree(checktree->right); free(checktree); } void getfilestats(file_t *file) { file->size = filesize(file->d_name); file->inode = getinode(file->d_name); file->device = getdevice(file->d_name); file->mtime = getmtime(file->d_name); } int registerfile(filetree_t **branch, file_t *file) { getfilestats(file); *branch = (filetree_t*) malloc(sizeof(filetree_t)); if (*branch == NULL) { errormsg("out of memory!\n"); exit(1); } (*branch)->file = file; (*branch)->left = NULL; (*branch)->right = NULL; return 1; } file_t **checkmatch(filetree_t **root, filetree_t *checktree, file_t *file) { int cmpresult; char *crcsignature; off_t fsize; /* If device and inode fields are equal one of the files is a hard link to the other or the files have been listed twice unintentionally. We don't want to flag these files as duplicates unless the user specifies otherwise. 
*/ if (!ISFLAG(flags, F_CONSIDERHARDLINKS) && (getinode(file->d_name) == checktree->file->inode) && (getdevice(file->d_name) == checktree->file->device)) return NULL; fsize = filesize(file->d_name); if (fsize < checktree->file->size) cmpresult = -1; else if (fsize > checktree->file->size) cmpresult = 1; else { if (checktree->file->crcpartial == NULL) { crcsignature = getcrcpartialsignature(checktree->file->d_name); if (crcsignature == NULL) return NULL; checktree->file->crcpartial = (char*) malloc(strlen(crcsignature)+1); if (checktree->file->crcpartial == NULL) { errormsg("out of memory\n"); exit(1); } strcpy(checktree->file->crcpartial, crcsignature); } if (file->crcpartial == NULL) { crcsignature = getcrcpartialsignature(file->d_name); if (crcsignature == NULL) return NULL; file->crcpartial = (char*) malloc(strlen(crcsignature)+1); if (file->crcpartial == NULL) { errormsg("out of memory\n"); exit(1); } strcpy(file->crcpartial, crcsignature); } cmpresult = strcmp(file->crcpartial, checktree->file->crcpartial); //if (cmpresult != 0) errormsg(" on %s vs %s\n", file->d_name, checktree->file->d_name); if (cmpresult == 0) { if (checktree->file->crcsignature == NULL) { crcsignature = getcrcsignature(checktree->file->d_name); if (crcsignature == NULL) return NULL; checktree->file->crcsignature = (char*) malloc(strlen(crcsignature)+1); if (checktree->file->crcsignature == NULL) { errormsg("out of memory\n"); exit(1); } strcpy(checktree->file->crcsignature, crcsignature); } if (file->crcsignature == NULL) { crcsignature = getcrcsignature(file->d_name); if (crcsignature == NULL) return NULL; file->crcsignature = (char*) malloc(strlen(crcsignature)+1); if (file->crcsignature == NULL) { errormsg("out of memory\n"); exit(1); } strcpy(file->crcsignature, crcsignature); } cmpresult = strcmp(file->crcsignature, checktree->file->crcsignature); //if (cmpresult != 0) errormsg("P on %s vs %s\n", //file->d_name, checktree->file->d_name); //else errormsg("P F on %s vs %s\n", 
file->d_name, //checktree->file->d_name); //printf("%s matches %s\n", file->d_name, checktree->file->d_name); } } if (cmpresult < 0) { if (checktree->left != NULL) { return checkmatch(root, checktree->left, file); } else { registerfile(&(checktree->left), file); return NULL; } } else if (cmpresult > 0) { if (checktree->right != NULL) { return checkmatch(root, checktree->right, file); } else { registerfile(&(checktree->right), file); return NULL; } } else { getfilestats(file); return &checktree->file; } } /* Do a bit-for-bit comparison in case two different files produce the same signature. Unlikely, but better safe than sorry. */ int confirmmatch(FILE *file1, FILE *file2) { unsigned char c1 = 0; unsigned char c2 = 0; size_t r1; size_t r2; fseek(file1, 0, SEEK_SET); fseek(file2, 0, SEEK_SET); do { r1 = fread(&c1, sizeof(c1), 1, file1); r2 = fread(&c2, sizeof(c2), 1, file2); if (c1 != c2) return 0; /* file contents are different */ } while (r1 && r2); if (r1 != r2) return 0; /* file lengths are different */ return 1; } void summarizematches(file_t *files) { int numsets = 0; double numbytes = 0.0; int numfiles = 0; file_t *tmpfile; while (files != NULL) { if (files->hasdupes) { numsets++; tmpfile = files->duplicates; while (tmpfile != NULL) { numfiles++; numbytes += files->size; tmpfile = tmpfile->duplicates; } } files = files->next; } if (numsets == 0) printf("No duplicates found.\n\n"); else { if (numbytes < 1024.0) printf("%d duplicate files (in %d sets), occupying %.0f bytes.\n\n", numfiles, numsets, numbytes); else if (numbytes <= (1000.0 * 1000.0)) printf("%d duplicate files (in %d sets), occupying %.1f kylobytes\n\n", numfiles, numsets, numbytes / 1000.0); else printf("%d duplicate files (in %d sets), occupying %.1f megabytes\n\n", numfiles, numsets, numbytes / (1000.0 * 1000.0)); } } void printmatches(file_t *files) { file_t *tmpfile; while (files != NULL) { if (files->hasdupes) { if (!ISFLAG(flags, F_OMITFIRST)) { if (ISFLAG(flags, F_SHOWSIZE)) printf("%ld 
byte%seach:\n", files->size, (files->size != 1) ? "s " : " "); if (ISFLAG(flags, F_DSAMELINE)) escapefilename("\\ ", &files->d_name); printf("%s%c", files->d_name, ISFLAG(flags, F_DSAMELINE)?' ':'\n'); } tmpfile = files->duplicates; while (tmpfile != NULL) { if (ISFLAG(flags, F_DSAMELINE)) escapefilename("\\ ", &tmpfile->d_name); printf("%s%c", tmpfile->d_name, ISFLAG(flags, F_DSAMELINE)?' ':'\n'); tmpfile = tmpfile->duplicates; } printf("\n"); } files = files->next; } } /* #define REVISE_APPEND "_tmp" char *revisefilename(char *path, int seq) { int digits; char *newpath; char *scratch; char *dot; digits = numdigits(seq); newpath = malloc(strlen(path) + strlen(REVISE_APPEND) + digits + 1); if (!newpath) return newpath; scratch = malloc(strlen(path) + 1); if (!scratch) return newpath; strcpy(scratch, path); dot = strrchr(scratch, '.'); if (dot) { *dot = 0; sprintf(newpath, "%s%s%d.%s", scratch, REVISE_APPEND, seq, dot + 1); } else { sprintf(newpath, "%s%s%d", path, REVISE_APPEND, seq); } free(scratch); return newpath; } */ int relink(char *oldfile, char *newfile) { dev_t od; dev_t nd; ino_t oi; ino_t ni; od = getdevice(oldfile); oi = getinode(oldfile); if (link(oldfile, newfile) != 0) return 0; // make sure we're working with the right file (the one we created) nd = getdevice(newfile); ni = getinode(newfile); if (nd != od || oi != ni) return 0; // file is not what we expected return 1; } void deletefiles(file_t *files, int prompt) { int counter; int groups = 0; int curgroup = 0; file_t *tmpfile; file_t *curfile; file_t **dupelist; int *preserve; char *preservestr; char *token; char *tstr; int number; int sum; int max = 0; int x; int i; curfile = files; while (curfile) { if (curfile->hasdupes) { counter = 1; groups++; tmpfile = curfile->duplicates; while (tmpfile) { counter++; tmpfile = tmpfile->duplicates; } if (counter > max) max = counter; } curfile = curfile->next; } max++; dupelist = (file_t**) malloc(sizeof(file_t*) * max); preserve = (int*) malloc(sizeof(int) 
* max); preservestr = (char*) malloc(INPUT_SIZE); if (!dupelist || !preserve || !preservestr) { errormsg("out of memory\n"); exit(1); } while (files) { if (files->hasdupes) { curgroup++; counter = 1; dupelist[counter] = files; if (prompt) printf("[%d] %s\n", counter, files->d_name); tmpfile = files->duplicates; while (tmpfile) { dupelist[++counter] = tmpfile; if (prompt) printf("[%d] %s\n", counter, tmpfile->d_name); tmpfile = tmpfile->duplicates; } if (prompt) printf("\n"); if (!prompt) /* preserve only the first file */ { preserve[1] = 1; for (x = 2; x <= counter; x++) preserve[x] = 0; } else /* prompt for files to preserve */ do { printf("Set %d of %d, preserve files [1 - %d, all]", curgroup, groups, counter); if (ISFLAG(flags, F_SHOWSIZE)) printf(" (%ld byte%seach)", files->size, (files->size != 1) ? "s " : " "); printf(": "); fflush(stdout); fgets(preservestr, INPUT_SIZE, stdin); i = strlen(preservestr) - 1; while (preservestr[i]!='\n'){ /* tail of buffer must be a newline */ tstr = (char*) realloc(preservestr, strlen(preservestr) + 1 + INPUT_SIZE); if (!tstr) { /* couldn't allocate memory, treat as fatal */ errormsg("out of memory!\n"); exit(1); } preservestr = tstr; if (!fgets(preservestr + i + 1, INPUT_SIZE, stdin)) break; /* stop if fgets fails -- possible EOF? */ i = strlen(preservestr)-1; } for (x = 1; x <= counter; x++) preserve[x] = 0; token = strtok(preservestr, " ,\n"); while (token != NULL) { if (strcasecmp(token, "all") == 0) for (x = 0; x <= counter; x++) preserve[x] = 1; number = 0; sscanf(token, "%d", &number); if (number > 0 && number <= counter) preserve[number] = 1; token = strtok(NULL, " ,\n"); } for (sum = 0, x = 1; x <= counter; x++) sum += preserve[x]; } while (sum < 1); /* make sure we've preserved at least one file */ printf("\n"); for (x = 1; x <= counter; x++) { if (preserve[x]) printf(" [+] %s\n", dupelist[x]->d_name); else { if (remove(dupelist[x]->d_name) == 0) { printf(" [-] %s\n", dupelist[x]->d_name); } else { printf(" [!] 
%s ", dupelist[x]->d_name); printf("-- unable to delete file!\n"); } } } printf("\n"); } files = files->next; } free(dupelist); free(preserve); free(preservestr); } int sort_pairs_by_arrival(file_t *f1, file_t *f2) { if (f2->duplicates != 0) return 1; return -1; } int sort_pairs_by_mtime(file_t *f1, file_t *f2) { if (f1->mtime < f2->mtime) return -1; else if (f1->mtime > f2->mtime) return 1; //return sort_pairs_by_arrival(f1, f2); return 0; } void registerpair(file_t **matchlist, file_t *newmatch, int (*comparef)(file_t *f1, file_t *f2)) { file_t *traverse; file_t *back; (*matchlist)->hasdupes = 1; back = 0; traverse = *matchlist; while (traverse) { if (comparef(newmatch, traverse) <= 0) { newmatch->duplicates = traverse; if (back == 0) { *matchlist = newmatch; // update pointer to head of list newmatch->hasdupes = 1; traverse->hasdupes = 0; // flag is only for first file in dupe chain } else back->duplicates = newmatch; break; } else { if (traverse->duplicates == 0) { traverse->duplicates = newmatch; if (back == 0) traverse->hasdupes = 1; break; } } back = traverse; traverse = traverse->duplicates; } } void help_text() { printf("Usage: fdupes [options] DIRECTORY...\n\n"); printf(" -r --recurse \tfor every directory given follow subdirectories\n"); printf(" \tencountered within\n"); printf(" -R --recurse: \tfor each directory given after this option follow\n"); printf(" \tsubdirectories encountered within\n"); printf(" -s --symlinks \tfollow symlinks\n"); printf(" -H --hardlinks \tnormally, when two or more files point to the same\n"); printf(" \tdisk area they are treated as non-duplicates; this\n"); printf(" \toption will change this behavior\n"); printf(" -n --noempty \texclude zero-length files from consideration\n"); printf(" -f --omitfirst \tomit the first file in each set of matches\n"); printf(" -1 --sameline \tlist each set of matches on a single line\n"); printf(" -S --size \tshow size of duplicate files\n"); printf(" -m --summarize \tsummarize dupe 
information\n"); printf(" -q --quiet \thide progress indicator\n"); printf(" -d --delete \tprompt user for files to preserve and delete all\n"); printf(" \tothers; important: under particular circumstances,\n"); printf(" \tdata may be lost when using this option together\n"); printf(" \twith -s or --symlinks, or when specifying a\n"); printf(" \tparticular directory more than once; refer to the\n"); printf(" \tfdupes documentation for additional information\n"); //printf(" -l --relink \t(description)\n"); printf(" -N --noprompt \ttogether with --delete, preserve the first file in\n"); printf(" \teach set of duplicates and delete the rest without\n"); printf(" \twithout prompting the user\n"); printf(" -v --version \tdisplay fdupes version\n"); printf(" -h --help \tdisplay this help message\n\n"); #ifdef OMIT_GETOPT_LONG printf("Note: Long options are not supported in this fdupes build.\n\n"); #endif } int main(int argc, char **argv) { int x; int opt; FILE *file1; FILE *file2; file_t *files = NULL; file_t *curfile; file_t **match = NULL; filetree_t *checktree = NULL; int filecount = 0; int progress = 0; char **oldargv; int firstrecurse; #ifndef OMIT_GETOPT_LONG static struct option long_options[] = { { "omitfirst", 0, 0, 'f' }, { "recurse", 0, 0, 'r' }, { "recursive", 0, 0, 'r' }, { "recurse:", 0, 0, 'R' }, { "recursive:", 0, 0, 'R' }, { "quiet", 0, 0, 'q' }, { "sameline", 0, 0, '1' }, { "size", 0, 0, 'S' }, { "symlinks", 0, 0, 's' }, { "hardlinks", 0, 0, 'H' }, { "relink", 0, 0, 'l' }, { "noempty", 0, 0, 'n' }, { "delete", 0, 0, 'd' }, { "version", 0, 0, 'v' }, { "help", 0, 0, 'h' }, { "noprompt", 0, 0, 'N' }, { "summarize", 0, 0, 'm'}, { "summary", 0, 0, 'm' }, { 0, 0, 0, 0 } }; #define GETOPT getopt_long #else #define GETOPT getopt #endif program_name = argv[0]; oldargv = cloneargs(argc, argv); while ((opt = GETOPT(argc, argv, "frRq1Ss::HlndvhNm" #ifndef OMIT_GETOPT_LONG , long_options, NULL #endif )) != EOF) { switch (opt) { case 'f': SETFLAG(flags, 
F_OMITFIRST); break; case 'r': SETFLAG(flags, F_RECURSE); break; case 'R': SETFLAG(flags, F_RECURSEAFTER); break; case 'q': SETFLAG(flags, F_HIDEPROGRESS); break; case '1': SETFLAG(flags, F_DSAMELINE); break; case 'S': SETFLAG(flags, F_SHOWSIZE); break; case 's': SETFLAG(flags, F_FOLLOWLINKS); break; case 'H': SETFLAG(flags, F_CONSIDERHARDLINKS); break; case 'n': SETFLAG(flags, F_EXCLUDEEMPTY); break; case 'd': SETFLAG(flags, F_DELETEFILES); break; case 'v': printf("fdupes %s\n", VERSION); exit(0); case 'h': help_text(); exit(1); case 'N': SETFLAG(flags, F_NOPROMPT); break; case 'm': SETFLAG(flags, F_SUMMARIZEMATCHES); break; default: fprintf(stderr, "Try `fdupes --help' for more information.\n"); exit(1); } } if (optind >= argc) { errormsg("no directories specified\n"); exit(1); } if (ISFLAG(flags, F_RECURSE) && ISFLAG(flags, F_RECURSEAFTER)) { errormsg("options --recurse and --recurse: are not compatible\n"); exit(1); } if (ISFLAG(flags, F_SUMMARIZEMATCHES) && ISFLAG(flags, F_DELETEFILES)) { errormsg("options --summarize and --delete are not compatible\n"); exit(1); } if (ISFLAG(flags, F_RECURSEAFTER)) { firstrecurse = nonoptafter("--recurse:", argc, oldargv, argv, optind); if (firstrecurse == argc) firstrecurse = nonoptafter("-R", argc, oldargv, argv, optind); if (firstrecurse == argc) { errormsg("-R option must be isolated from other options\n"); exit(1); } /* F_RECURSE is not set for directories before --recurse: */ for (x = optind; x < firstrecurse; x++) filecount += grokdir(argv[x], &files); /* Set F_RECURSE for directories after --recurse: */ SETFLAG(flags, F_RECURSE); for (x = firstrecurse; x < argc; x++) filecount += grokdir(argv[x], &files); } else { for (x = optind; x < argc; x++) filecount += grokdir(argv[x], &files); } if (!files) { if (!ISFLAG(flags, F_HIDEPROGRESS)) fprintf(stderr, "\r%40s\r", " "); exit(0); } curfile = files; while (curfile) { if (!checktree) registerfile(&checktree, curfile); else match = checkmatch(&checktree, checktree, 
curfile); if (match != NULL) { file1 = fopen(curfile->d_name, "rb"); if (!file1) { curfile = curfile->next; continue; } file2 = fopen((*match)->d_name, "rb"); if (!file2) { fclose(file1); curfile = curfile->next; continue; } if (confirmmatch(file1, file2)) { registerpair(match, curfile, sort_pairs_by_mtime); //match->hasdupes = 1; //curfile->duplicates = match->duplicates; //match->duplicates = curfile; } fclose(file1); fclose(file2); } curfile = curfile->next; if (!ISFLAG(flags, F_HIDEPROGRESS)) { fprintf(stderr, "\rProgress [%d/%d] %d%% ", progress, filecount, (int)((float) progress / (float) filecount * 100.0)); progress++; } } if (!ISFLAG(flags, F_HIDEPROGRESS)) fprintf(stderr, "\r%40s\r", " "); if (ISFLAG(flags, F_DELETEFILES)) { if (ISFLAG(flags, F_NOPROMPT)) deletefiles(files, 0); else deletefiles(files, 1); } else if (ISFLAG(flags, F_SUMMARIZEMATCHES)) summarizematches(files); else printmatches(files); while (files) { curfile = files->next; free(files->d_name); free(files->crcsignature); free(files->crcpartial); free(files); files = curfile; } for (x = 0; x < argc; x++) free(oldargv[x]); free(oldargv); purgetree(checktree); return 0; } fdupes-1.51/.svn/pristine/1d/1d9f71f8c1e8ae30a626dee15c17c8061285e32d.svn-base0000644000175000017500000006700212134554374024346 0ustar adrianadrian/* FDUPES Copyright (c) 1999-2002 Adrian Lopez Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. 
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. */

#include <stdio.h>
#include <stdarg.h>
#include <string.h>
#include <sys/stat.h>
#include <dirent.h>
#include <unistd.h>
#include <stdlib.h>
#ifndef OMIT_GETOPT_LONG
#include <getopt.h>
#endif
#include <string.h>
#include <errno.h>

#ifndef EXTERNAL_MD5
#include "md5/md5.h"
#endif

#define ISFLAG(a,b) ((a & b) == b)
#define SETFLAG(a,b) (a |= b)

#define F_RECURSE           0x0001
#define F_HIDEPROGRESS      0x0002
#define F_DSAMELINE         0x0004
#define F_FOLLOWLINKS       0x0008
#define F_DELETEFILES       0x0010
#define F_EXCLUDEEMPTY      0x0020
#define F_CONSIDERHARDLINKS 0x0040
#define F_SHOWSIZE          0x0080
#define F_OMITFIRST         0x0100
#define F_RECURSEAFTER      0x0200
#define F_NOPROMPT          0x0400
#define F_SUMMARIZEMATCHES  0x0800

char *program_name;

unsigned long flags = 0;

#define CHUNK_SIZE 8192

#define INPUT_SIZE 256

#define PARTIAL_MD5_SIZE 4096

/*

TODO: Partial sums (for working with very large files).

typedef struct _signature
{
  md5_state_t state;
  md5_byte_t digest[16];
} signature_t;

typedef struct _signatures
{
  int num_signatures;
  signature_t *signatures;
} signatures_t;

*/

typedef struct _file {
  char *d_name;
  off_t size;
  char *crcpartial;
  char *crcsignature;
  dev_t device;
  ino_t inode;
  time_t mtime;
  int hasdupes; /* true only if file is first on duplicate chain */
  struct _file *duplicates;
  struct _file *next;
} file_t;

typedef struct _filetree {
  file_t *file;
  struct _filetree *left;
  struct _filetree *right;
} filetree_t;

void errormsg(char *message, ...)
{ va_list ap; va_start(ap, message); fprintf(stderr, "\r%40s\r%s: ", "", program_name); vfprintf(stderr, message, ap); } void escapefilename(char *escape_list, char **filename_ptr) { int x; int tx; char *tmp; char *filename; filename = *filename_ptr; tmp = (char*) malloc(strlen(filename) * 2 + 1); if (tmp == NULL) { errormsg("out of memory!\n"); exit(1); } for (x = 0, tx = 0; x < strlen(filename); x++) { if (strchr(escape_list, filename[x]) != NULL) tmp[tx++] = '\\'; tmp[tx++] = filename[x]; } tmp[tx] = '\0'; if (x != tx) { *filename_ptr = realloc(*filename_ptr, strlen(tmp) + 1); if (*filename_ptr == NULL) { errormsg("out of memory!\n"); exit(1); } strcpy(*filename_ptr, tmp); } } off_t filesize(char *filename) { struct stat s; if (stat(filename, &s) != 0) return -1; return s.st_size; } dev_t getdevice(char *filename) { struct stat s; if (stat(filename, &s) != 0) return 0; return s.st_dev; } ino_t getinode(char *filename) { struct stat s; if (stat(filename, &s) != 0) return 0; return s.st_ino; } time_t getmtime(char *filename) { struct stat s; if (stat(filename, &s) != 0) return 0; return s.st_mtime; } char **cloneargs(int argc, char **argv) { int x; char **args; args = (char **) malloc(sizeof(char*) * argc); if (args == NULL) { errormsg("out of memory!\n"); exit(1); } for (x = 0; x < argc; x++) { args[x] = (char*) malloc(strlen(argv[x]) + 1); if (args[x] == NULL) { free(args); errormsg("out of memory!\n"); exit(1); } strcpy(args[x], argv[x]); } return args; } int findarg(char *arg, int start, int argc, char **argv) { int x; for (x = start; x < argc; x++) if (strcmp(argv[x], arg) == 0) return x; return x; } /* Find the first non-option argument after specified option. 
*/ int nonoptafter(char *option, int argc, char **oldargv, char **newargv, int optind) { int x; int targetind; int testind; int startat = 1; targetind = findarg(option, 1, argc, oldargv); for (x = optind; x < argc; x++) { testind = findarg(newargv[x], startat, argc, oldargv); if (testind > targetind) return x; else startat = testind; } return x; } int grokdir(char *dir, file_t **filelistp) { DIR *cd; file_t *newfile; struct dirent *dirinfo; int lastchar; int filecount = 0; struct stat info; struct stat linfo; static int progress = 0; static char indicator[] = "-\\|/"; cd = opendir(dir); if (!cd) { errormsg("could not chdir to %s\n", dir); return 0; } while ((dirinfo = readdir(cd)) != NULL) { if (strcmp(dirinfo->d_name, ".") && strcmp(dirinfo->d_name, "..")) { if (!ISFLAG(flags, F_HIDEPROGRESS)) { fprintf(stderr, "\rBuilding file list %c ", indicator[progress]); progress = (progress + 1) % 4; } newfile = (file_t*) malloc(sizeof(file_t)); if (!newfile) { errormsg("out of memory!\n"); closedir(cd); exit(1); } else newfile->next = *filelistp; newfile->device = 0; newfile->inode = 0; newfile->crcsignature = NULL; newfile->crcpartial = NULL; newfile->duplicates = NULL; newfile->hasdupes = 0; newfile->d_name = (char*)malloc(strlen(dir)+strlen(dirinfo->d_name)+2); if (!newfile->d_name) { errormsg("out of memory!\n"); free(newfile); closedir(cd); exit(1); } strcpy(newfile->d_name, dir); lastchar = strlen(dir) - 1; if (lastchar >= 0 && dir[lastchar] != '/') strcat(newfile->d_name, "/"); strcat(newfile->d_name, dirinfo->d_name); if (filesize(newfile->d_name) == 0 && ISFLAG(flags, F_EXCLUDEEMPTY)) { free(newfile->d_name); free(newfile); continue; } if (stat(newfile->d_name, &info) == -1) { free(newfile->d_name); free(newfile); continue; } if (lstat(newfile->d_name, &linfo) == -1) { free(newfile->d_name); free(newfile); continue; } if (S_ISDIR(info.st_mode)) { if (ISFLAG(flags, F_RECURSE) && (ISFLAG(flags, F_FOLLOWLINKS) || !S_ISLNK(linfo.st_mode))) filecount += 
grokdir(newfile->d_name, filelistp); free(newfile->d_name); free(newfile); } else { if (S_ISREG(linfo.st_mode) || (S_ISLNK(linfo.st_mode) && ISFLAG(flags, F_FOLLOWLINKS))) { *filelistp = newfile; filecount++; } else { free(newfile->d_name); free(newfile); } } } } closedir(cd); return filecount; } #ifndef EXTERNAL_MD5 /* If EXTERNAL_MD5 is not defined, use L. Peter Deutsch's MD5 library. */ char *getcrcsignatureuntil(char *filename, off_t max_read) { int x; off_t fsize; off_t toread; md5_state_t state; md5_byte_t digest[16]; static md5_byte_t chunk[CHUNK_SIZE]; static char signature[16*2 + 1]; char *sigp; FILE *file; md5_init(&state); fsize = filesize(filename); if (max_read != 0 && fsize > max_read) fsize = max_read; file = fopen(filename, "rb"); if (file == NULL) { errormsg("error opening file %s\n", filename); return NULL; } while (fsize > 0) { toread = (fsize % CHUNK_SIZE) ? (fsize % CHUNK_SIZE) : CHUNK_SIZE; if (fread(chunk, toread, 1, file) != 1) { errormsg("error reading from file %s\n", filename); fclose(file); return NULL; } md5_append(&state, chunk, toread); fsize -= toread; } md5_finish(&state, digest); sigp = signature; for (x = 0; x < 16; x++) { sprintf(sigp, "%02x", digest[x]); sigp = strchr(sigp, '\0'); } fclose(file); return signature; } char *getcrcsignature(char *filename) { return getcrcsignatureuntil(filename, 0); } char *getcrcpartialsignature(char *filename) { return getcrcsignatureuntil(filename, PARTIAL_MD5_SIZE); } #endif /* [#ifndef EXTERNAL_MD5] */ #ifdef EXTERNAL_MD5 /* If EXTERNAL_MD5 is defined, use md5sum program to calculate signatures. 
*/ char *getcrcsignature(char *filename) { static char signature[256]; char *command; char *separator; FILE *result; command = (char*) malloc(strlen(filename)+strlen(EXTERNAL_MD5)+2); if (command == NULL) { errormsg("out of memory\n"); exit(1); } sprintf(command, "%s %s", EXTERNAL_MD5, filename); result = popen(command, "r"); if (result == NULL) { errormsg("error invoking %s\n", EXTERNAL_MD5); exit(1); } free(command); if (fgets(signature, 256, result) == NULL) { errormsg("error generating signature for %s\n", filename); return NULL; } separator = strchr(signature, ' '); if (separator) *separator = '\0'; pclose(result); return signature; } #endif /* [#ifdef EXTERNAL_MD5] */ void purgetree(filetree_t *checktree) { if (checktree->left != NULL) purgetree(checktree->left); if (checktree->right != NULL) purgetree(checktree->right); free(checktree); } void getfilestats(file_t *file) { file->size = filesize(file->d_name); file->inode = getinode(file->d_name); file->device = getdevice(file->d_name); file->mtime = getmtime(file->d_name); } int registerfile(filetree_t **branch, file_t *file) { getfilestats(file); *branch = (filetree_t*) malloc(sizeof(filetree_t)); if (*branch == NULL) { errormsg("out of memory!\n"); exit(1); } (*branch)->file = file; (*branch)->left = NULL; (*branch)->right = NULL; return 1; } file_t **checkmatch(filetree_t **root, filetree_t *checktree, file_t *file) { int cmpresult; char *crcsignature; off_t fsize; /* If device and inode fields are equal one of the files is a hard link to the other or the files have been listed twice unintentionally. We don't want to flag these files as duplicates unless the user specifies otherwise. 
*/ if (!ISFLAG(flags, F_CONSIDERHARDLINKS) && (getinode(file->d_name) == checktree->file->inode) && (getdevice(file->d_name) == checktree->file->device)) return NULL; fsize = filesize(file->d_name); if (fsize < checktree->file->size) cmpresult = -1; else if (fsize > checktree->file->size) cmpresult = 1; else { if (checktree->file->crcpartial == NULL) { crcsignature = getcrcpartialsignature(checktree->file->d_name); if (crcsignature == NULL) return NULL; checktree->file->crcpartial = (char*) malloc(strlen(crcsignature)+1); if (checktree->file->crcpartial == NULL) { errormsg("out of memory\n"); exit(1); } strcpy(checktree->file->crcpartial, crcsignature); } if (file->crcpartial == NULL) { crcsignature = getcrcpartialsignature(file->d_name); if (crcsignature == NULL) return NULL; file->crcpartial = (char*) malloc(strlen(crcsignature)+1); if (file->crcpartial == NULL) { errormsg("out of memory\n"); exit(1); } strcpy(file->crcpartial, crcsignature); } cmpresult = strcmp(file->crcpartial, checktree->file->crcpartial); /*if (cmpresult != 0) errormsg(" on %s vs %s\n", file->d_name, checktree->file->d_name);*/ if (cmpresult == 0) { if (checktree->file->crcsignature == NULL) { crcsignature = getcrcsignature(checktree->file->d_name); if (crcsignature == NULL) return NULL; checktree->file->crcsignature = (char*) malloc(strlen(crcsignature)+1); if (checktree->file->crcsignature == NULL) { errormsg("out of memory\n"); exit(1); } strcpy(checktree->file->crcsignature, crcsignature); } if (file->crcsignature == NULL) { crcsignature = getcrcsignature(file->d_name); if (crcsignature == NULL) return NULL; file->crcsignature = (char*) malloc(strlen(crcsignature)+1); if (file->crcsignature == NULL) { errormsg("out of memory\n"); exit(1); } strcpy(file->crcsignature, crcsignature); } cmpresult = strcmp(file->crcsignature, checktree->file->crcsignature); /*if (cmpresult != 0) errormsg("P on %s vs %s\n", file->d_name, checktree->file->d_name); else errormsg("P F on %s vs %s\n", 
file->d_name, checktree->file->d_name);
      printf("%s matches %s\n", file->d_name, checktree->file->d_name);*/
    }
  }

  if (cmpresult < 0) {
    if (checktree->left != NULL) {
      return checkmatch(root, checktree->left, file);
    } else {
      registerfile(&(checktree->left), file);
      return NULL;
    }
  } else if (cmpresult > 0) {
    if (checktree->right != NULL) {
      return checkmatch(root, checktree->right, file);
    } else {
      registerfile(&(checktree->right), file);
      return NULL;
    }
  } else {
    getfilestats(file);
    return &checktree->file;
  }
}

/* Do a bit-for-bit comparison in case two different files produce the
   same signature. Unlikely, but better safe than sorry. */

int confirmmatch(FILE *file1, FILE *file2)
{
  unsigned char c1 = 0;
  unsigned char c2 = 0;
  size_t r1;
  size_t r2;

  fseek(file1, 0, SEEK_SET);
  fseek(file2, 0, SEEK_SET);

  do {
    r1 = fread(&c1, sizeof(c1), 1, file1);
    r2 = fread(&c2, sizeof(c2), 1, file2);

    if (c1 != c2) return 0; /* file contents are different */
  } while (r1 && r2);

  if (r1 != r2) return 0; /* file lengths are different */

  return 1;
}

void summarizematches(file_t *files)
{
  int numsets = 0;
  double numbytes = 0.0;
  int numfiles = 0;
  file_t *tmpfile;

  while (files != NULL)
  {
    if (files->hasdupes)
    {
      numsets++;

      /* count every file in the set except the first one */
      tmpfile = files->duplicates;
      while (tmpfile != NULL)
      {
        numfiles++;
        numbytes += files->size;
        tmpfile = tmpfile->duplicates;
      }
    }

    files = files->next;
  }

  if (numsets == 0)
    printf("No duplicates found.\n\n");
  else
  {
    if (numbytes < 1024.0)
      printf("%d duplicate files (in %d sets), occupying %.0f bytes.\n\n", numfiles, numsets, numbytes);
    else if (numbytes <= (1000.0 * 1000.0))
      printf("%d duplicate files (in %d sets), occupying %.1f kilobytes\n\n", numfiles, numsets, numbytes / 1000.0);
    else
      printf("%d duplicate files (in %d sets), occupying %.1f megabytes\n\n", numfiles, numsets, numbytes / (1000.0 * 1000.0));
  }
}

void printmatches(file_t *files)
{
  file_t *tmpfile;

  while (files != NULL) {
    if (files->hasdupes) {
      if (!ISFLAG(flags, F_OMITFIRST)) {
        if (ISFLAG(flags, F_SHOWSIZE)) printf("%lld
byte%seach:\n", files->size, (files->size != 1) ? "s " : " "); if (ISFLAG(flags, F_DSAMELINE)) escapefilename("\\ ", &files->d_name); printf("%s%c", files->d_name, ISFLAG(flags, F_DSAMELINE)?' ':'\n'); } tmpfile = files->duplicates; while (tmpfile != NULL) { if (ISFLAG(flags, F_DSAMELINE)) escapefilename("\\ ", &tmpfile->d_name); printf("%s%c", tmpfile->d_name, ISFLAG(flags, F_DSAMELINE)?' ':'\n'); tmpfile = tmpfile->duplicates; } printf("\n"); } files = files->next; } } /* #define REVISE_APPEND "_tmp" char *revisefilename(char *path, int seq) { int digits; char *newpath; char *scratch; char *dot; digits = numdigits(seq); newpath = malloc(strlen(path) + strlen(REVISE_APPEND) + digits + 1); if (!newpath) return newpath; scratch = malloc(strlen(path) + 1); if (!scratch) return newpath; strcpy(scratch, path); dot = strrchr(scratch, '.'); if (dot) { *dot = 0; sprintf(newpath, "%s%s%d.%s", scratch, REVISE_APPEND, seq, dot + 1); } else { sprintf(newpath, "%s%s%d", path, REVISE_APPEND, seq); } free(scratch); return newpath; } */ int relink(char *oldfile, char *newfile) { dev_t od; dev_t nd; ino_t oi; ino_t ni; od = getdevice(oldfile); oi = getinode(oldfile); if (link(oldfile, newfile) != 0) return 0; /* make sure we're working with the right file (the one we created) */ nd = getdevice(newfile); ni = getinode(newfile); if (nd != od || oi != ni) return 0; /* file is not what we expected */ return 1; } void deletefiles(file_t *files, int prompt, FILE *tty) { int counter; int groups = 0; int curgroup = 0; file_t *tmpfile; file_t *curfile; file_t **dupelist; int *preserve; char *preservestr; char *token; char *tstr; int number; int sum; int max = 0; int x; int i; curfile = files; while (curfile) { if (curfile->hasdupes) { counter = 1; groups++; tmpfile = curfile->duplicates; while (tmpfile) { counter++; tmpfile = tmpfile->duplicates; } if (counter > max) max = counter; } curfile = curfile->next; } max++; dupelist = (file_t**) malloc(sizeof(file_t*) * max); preserve = (int*) 
malloc(sizeof(int) * max); preservestr = (char*) malloc(INPUT_SIZE); if (!dupelist || !preserve || !preservestr) { errormsg("out of memory\n"); exit(1); } while (files) { if (files->hasdupes) { curgroup++; counter = 1; dupelist[counter] = files; if (prompt) printf("[%d] %s\n", counter, files->d_name); tmpfile = files->duplicates; while (tmpfile) { dupelist[++counter] = tmpfile; if (prompt) printf("[%d] %s\n", counter, tmpfile->d_name); tmpfile = tmpfile->duplicates; } if (prompt) printf("\n"); if (!prompt) /* preserve only the first file */ { preserve[1] = 1; for (x = 2; x <= counter; x++) preserve[x] = 0; } else /* prompt for files to preserve */ do { printf("Set %d of %d, preserve files [1 - %d, all]", curgroup, groups, counter); if (ISFLAG(flags, F_SHOWSIZE)) printf(" (%lld byte%seach)", files->size, (files->size != 1) ? "s " : " "); printf(": "); fflush(stdout); if (!fgets(preservestr, INPUT_SIZE, tty)) preservestr[0] = '\n'; /* treat fgets() failure as if nothing was entered */ i = strlen(preservestr) - 1; while (preservestr[i]!='\n'){ /* tail of buffer must be a newline */ tstr = (char*) realloc(preservestr, strlen(preservestr) + 1 + INPUT_SIZE); if (!tstr) { /* couldn't allocate memory, treat as fatal */ errormsg("out of memory!\n"); exit(1); } preservestr = tstr; if (!fgets(preservestr + i + 1, INPUT_SIZE, tty)) { preservestr[0] = '\n'; /* treat fgets() failure as if nothing was entered */ break; } i = strlen(preservestr)-1; } for (x = 1; x <= counter; x++) preserve[x] = 0; token = strtok(preservestr, " ,\n"); while (token != NULL) { if (strcasecmp(token, "all") == 0) for (x = 0; x <= counter; x++) preserve[x] = 1; number = 0; sscanf(token, "%d", &number); if (number > 0 && number <= counter) preserve[number] = 1; token = strtok(NULL, " ,\n"); } for (sum = 0, x = 1; x <= counter; x++) sum += preserve[x]; } while (sum < 1); /* make sure we've preserved at least one file */ printf("\n"); for (x = 1; x <= counter; x++) { if (preserve[x]) printf(" [+] %s\n", 
dupelist[x]->d_name); else { if (remove(dupelist[x]->d_name) == 0) { printf(" [-] %s\n", dupelist[x]->d_name); } else { printf(" [!] %s ", dupelist[x]->d_name); printf("-- unable to delete file!\n"); } } } printf("\n"); } files = files->next; } free(dupelist); free(preserve); free(preservestr); } int sort_pairs_by_arrival(file_t *f1, file_t *f2) { if (f2->duplicates != 0) return 1; return -1; } int sort_pairs_by_mtime(file_t *f1, file_t *f2) { if (f1->mtime < f2->mtime) return -1; else if (f1->mtime > f2->mtime) return 1; return 0; } void registerpair(file_t **matchlist, file_t *newmatch, int (*comparef)(file_t *f1, file_t *f2)) { file_t *traverse; file_t *back; (*matchlist)->hasdupes = 1; back = 0; traverse = *matchlist; while (traverse) { if (comparef(newmatch, traverse) <= 0) { newmatch->duplicates = traverse; if (back == 0) { *matchlist = newmatch; /* update pointer to head of list */ newmatch->hasdupes = 1; traverse->hasdupes = 0; /* flag is only for first file in dupe chain */ } else back->duplicates = newmatch; break; } else { if (traverse->duplicates == 0) { traverse->duplicates = newmatch; if (back == 0) traverse->hasdupes = 1; break; } } back = traverse; traverse = traverse->duplicates; } } void help_text() { printf("Usage: fdupes [options] DIRECTORY...\n\n"); printf(" -r --recurse \tfor every directory given follow subdirectories\n"); printf(" \tencountered within\n"); printf(" -R --recurse: \tfor each directory given after this option follow\n"); printf(" \tsubdirectories encountered within\n"); printf(" -s --symlinks \tfollow symlinks\n"); printf(" -H --hardlinks \tnormally, when two or more files point to the same\n"); printf(" \tdisk area they are treated as non-duplicates; this\n"); printf(" \toption will change this behavior\n"); printf(" -n --noempty \texclude zero-length files from consideration\n"); printf(" -f --omitfirst \tomit the first file in each set of matches\n"); printf(" -1 --sameline \tlist each set of matches on a single line\n"); 
printf(" -S --size \tshow size of duplicate files\n"); printf(" -m --summarize \tsummarize dupe information\n"); printf(" -q --quiet \thide progress indicator\n"); printf(" -d --delete \tprompt user for files to preserve and delete all\n"); printf(" \tothers; important: under particular circumstances,\n"); printf(" \tdata may be lost when using this option together\n"); printf(" \twith -s or --symlinks, or when specifying a\n"); printf(" \tparticular directory more than once; refer to the\n"); printf(" \tfdupes documentation for additional information\n"); /*printf(" -l --relink \t(description)\n");*/ printf(" -N --noprompt \ttogether with --delete, preserve the first file in\n"); printf(" \teach set of duplicates and delete the rest without\n"); printf(" \tprompting the user\n"); printf(" -v --version \tdisplay fdupes version\n"); printf(" -h --help \tdisplay this help message\n\n"); #ifdef OMIT_GETOPT_LONG printf("Note: Long options are not supported in this fdupes build.\n\n"); #endif } int main(int argc, char **argv) { int x; int opt; FILE *file1; FILE *file2; file_t *files = NULL; file_t *curfile; file_t **match = NULL; filetree_t *checktree = NULL; int filecount = 0; int progress = 0; char **oldargv; int firstrecurse; #ifndef OMIT_GETOPT_LONG static struct option long_options[] = { { "omitfirst", 0, 0, 'f' }, { "recurse", 0, 0, 'r' }, { "recursive", 0, 0, 'r' }, { "recurse:", 0, 0, 'R' }, { "recursive:", 0, 0, 'R' }, { "quiet", 0, 0, 'q' }, { "sameline", 0, 0, '1' }, { "size", 0, 0, 'S' }, { "symlinks", 0, 0, 's' }, { "hardlinks", 0, 0, 'H' }, { "relink", 0, 0, 'l' }, { "noempty", 0, 0, 'n' }, { "delete", 0, 0, 'd' }, { "version", 0, 0, 'v' }, { "help", 0, 0, 'h' }, { "noprompt", 0, 0, 'N' }, { "summarize", 0, 0, 'm'}, { "summary", 0, 0, 'm' }, { 0, 0, 0, 0 } }; #define GETOPT getopt_long #else #define GETOPT getopt #endif program_name = argv[0]; oldargv = cloneargs(argc, argv); while ((opt = GETOPT(argc, argv, "frRq1Ss::HlndvhNm" #ifndef OMIT_GETOPT_LONG , 
long_options, NULL #endif )) != EOF) { switch (opt) { case 'f': SETFLAG(flags, F_OMITFIRST); break; case 'r': SETFLAG(flags, F_RECURSE); break; case 'R': SETFLAG(flags, F_RECURSEAFTER); break; case 'q': SETFLAG(flags, F_HIDEPROGRESS); break; case '1': SETFLAG(flags, F_DSAMELINE); break; case 'S': SETFLAG(flags, F_SHOWSIZE); break; case 's': SETFLAG(flags, F_FOLLOWLINKS); break; case 'H': SETFLAG(flags, F_CONSIDERHARDLINKS); break; case 'n': SETFLAG(flags, F_EXCLUDEEMPTY); break; case 'd': SETFLAG(flags, F_DELETEFILES); break; case 'v': printf("fdupes %s\n", VERSION); exit(0); case 'h': help_text(); exit(1); case 'N': SETFLAG(flags, F_NOPROMPT); break; case 'm': SETFLAG(flags, F_SUMMARIZEMATCHES); break; default: fprintf(stderr, "Try `fdupes --help' for more information.\n"); exit(1); } } if (optind >= argc) { errormsg("no directories specified\n"); exit(1); } if (ISFLAG(flags, F_RECURSE) && ISFLAG(flags, F_RECURSEAFTER)) { errormsg("options --recurse and --recurse: are not compatible\n"); exit(1); } if (ISFLAG(flags, F_SUMMARIZEMATCHES) && ISFLAG(flags, F_DELETEFILES)) { errormsg("options --summarize and --delete are not compatible\n"); exit(1); } if (ISFLAG(flags, F_RECURSEAFTER)) { firstrecurse = nonoptafter("--recurse:", argc, oldargv, argv, optind); if (firstrecurse == argc) firstrecurse = nonoptafter("-R", argc, oldargv, argv, optind); if (firstrecurse == argc) { errormsg("-R option must be isolated from other options\n"); exit(1); } /* F_RECURSE is not set for directories before --recurse: */ for (x = optind; x < firstrecurse; x++) filecount += grokdir(argv[x], &files); /* Set F_RECURSE for directories after --recurse: */ SETFLAG(flags, F_RECURSE); for (x = firstrecurse; x < argc; x++) filecount += grokdir(argv[x], &files); } else { for (x = optind; x < argc; x++) filecount += grokdir(argv[x], &files); } if (!files) { if (!ISFLAG(flags, F_HIDEPROGRESS)) fprintf(stderr, "\r%40s\r", " "); exit(0); } curfile = files; while (curfile) { if (!checktree) 
      registerfile(&checktree, curfile);
    else
      match = checkmatch(&checktree, checktree, curfile);

    if (match != NULL) {
      file1 = fopen(curfile->d_name, "rb");
      if (!file1) {
        curfile = curfile->next;
        continue;
      }

      file2 = fopen((*match)->d_name, "rb");
      if (!file2) {
        fclose(file1);
        curfile = curfile->next;
        continue;
      }

      if (confirmmatch(file1, file2)) {
        registerpair(match, curfile, sort_pairs_by_mtime);

        /*match->hasdupes = 1;
        curfile->duplicates = match->duplicates;
        match->duplicates = curfile;*/
      }

      fclose(file1);
      fclose(file2);
    }

    curfile = curfile->next;

    if (!ISFLAG(flags, F_HIDEPROGRESS)) {
      fprintf(stderr, "\rProgress [%d/%d] %d%% ", progress, filecount,
       (int)((float) progress / (float) filecount * 100.0));
      progress++;
    }
  }

  if (!ISFLAG(flags, F_HIDEPROGRESS)) fprintf(stderr, "\r%40s\r", " ");

  if (ISFLAG(flags, F_DELETEFILES))
  {
    if (ISFLAG(flags, F_NOPROMPT))
    {
      deletefiles(files, 0, 0);
    }
    else
    {
      stdin = freopen("/dev/tty", "r", stdin);
      deletefiles(files, 1, stdin);
    }
  }
  else if (ISFLAG(flags, F_SUMMARIZEMATCHES))
    summarizematches(files);
  else
    printmatches(files);

  while (files) {
    curfile = files->next;
    free(files->d_name);
    free(files->crcsignature);
    free(files->crcpartial);
    free(files);
    files = curfile;
  }

  for (x = 0; x < argc; x++)
    free(oldargv[x]);

  free(oldargv);

  purgetree(checktree);

  return 0;
}

fdupes-1.51/.svn/pristine/cf/

fdupes-1.51/.svn/pristine/cf/cfa698ef88230fbe6862cb300268a3a647ecc71d.svn-base
six

fdupes-1.51/.svn/pristine/da/

fdupes-1.51/.svn/pristine/da/da39a3ee5e6b4b0d3255bfef95601890afd80709.svn-base

fdupes-1.51/.svn/pristine/6f/
adrianadrianfdupes-1.51/.svn/pristine/6f/6f02f5c07dcf985ec62ac8013e28ab399d428f7a.svn-base0000644000175000017500000000750312134554375024452 0ustar adrianadrianThe following list, organized by fdupes version, documents changes to fdupes. Every item on the list includes, inside square brackets, a list of indentifiers referring to the people who contributed that particular item. When more than one person is listed the person who contributed the patch or idea appears first, followed by those who've otherwise worked on that item. For a list of contributors names and identifiers please see the CONTRIBUTORS file. Changes from 1.50 to 1.51 - Added support for 64-bit file offsets on 32-bit systems. - Using tty for interactive input instead of regular stdin. This is to allow feeding filenames via stdin in future versions of fdupes without breaking interactive deletion feature. - Fixed some typos in --help. - Turned C++ style comments into C style comments. Changes from 1.40 to 1.50-PR2 - Fixed memory leak. [JB] - Added "--summarize" option. [AL] - Added "--recurse:" selective recursion option. [AL] - Added "--noprompt" option for totally automated deletion of duplicate files. - Now sorts duplicates (old to new) for consistent order when listing or deleteing duplicate files. - Now tests for early matching of files, which should help speed up the matching process when large files are involved. - Added warning whenever a file cannot be deleted. [CHL, AL] - Fixed bug where some files would not be closed after failure. [AL] - Fixed bug where confirmmatch() function wouldn't always deal properly with zero-length files. [AL] - Fixed bug where progress indicator would not be cleared when no files were found. [AL] - Removed experimental red-black tree code (it was slower on my system than the default code). [AL] - Modified md5/md5.c to avoid compiler warning. [CHL] - Changes to fdupes.c for compilation under platforms where getopt_long is unavailable. 
[LR, AL] - Changes to help text for clarity. [AL] - Various changes and improvements to Makefile. [PB, AL] Changes from 1.31 to 1.40 - Added option to omit the first file in each group of matches. [LM, AL] - Added escaping of filenames containing spaces when sameline option is specified. [AL] - Changed version indicator format from "fdupes version X.Y" to the simpler "fdupes X.Y". [AL] - Changed ordering of options appearing in the help text (--help), manpage, and README file. [AL] Changes from 1.30 to 1.31 - Added interactive option to preserve all files during delete procedure (something similar was already in place, but now it's official). [AL] - Updated delete procedure prompt format. [AL] - Cosmetic code changes. [AL] Changes from 1.20 to 1.30 - Added size option to display size of duplicates. [LB, AL] - Added missing typecast for proper compilation under g++. [LB] - Better handling of errors occurring during retrieval of a file's signature. [KK, AL] - No longer displays an error message when specified directories contain no files. [AL] - Added red-black tree structure (experimental compile-time option, disabled by default). [AL] Changes from 1.12 to 1.20 - Fixed bug where program would crash when files being scanned were named pipes or sockets. [FD] - Fix against security risk resulting from the use of a temporary file to store md5sum output. [FD, AL] - Using an external md5sum program is now optional. Started using L. Peter Deutsh's MD5 library instead. [FD, AL] - Added hardlinks option to distinguish between hard links and actual duplicate files. [FD, AL] - Added noempty option to exclude zero-length files from consideration [AL] Changes from 1.11 to 1.12 - Improved handling of extremely long input on preserve prompt (delete option). [SSD, AL] Changes from 1.1 to 1.11 - Started checking file sizes before signatures for better performance. [AB, AL] - Added fdupes manpage. 
[AB, AL] Changes from 1.0 to 1.1 - Added delete option for semi-automatic deletion of duplicate files. [AL] fdupes-1.51/.svn/pristine/44/0000755000175000017500000000000012134544631015157 5ustar adrianadrianfdupes-1.51/.svn/pristine/44/4430bb02f6ed700d4408eb307b25f8b1a25d93de.svn-base0000644000175000017500000000000512134544631024226 0ustar adrianadrianfive fdupes-1.51/.svn/pristine/b9/0000755000175000017500000000000012134556275015251 5ustar adrianadrianfdupes-1.51/.svn/pristine/b9/b9cb24548c74e0c61ae5d6749fc756edd1418f9a.svn-base0000644000175000017500000000010712134556275024451 0ustar adrianadrian# # VERSION determines the program's version number. # VERSION = 1.51 fdupes-1.51/.svn/pristine/70/0000755000175000017500000000000012134554375015164 5ustar adrianadrianfdupes-1.51/.svn/pristine/70/706833cca7f1bd17799bd19d7341c4528570654c.svn-base0000644000175000017500000000605312134554375024005 0ustar adrianadrian# # fdupes Makefile # ##################################################################### # Standand User Configuration Section # ##################################################################### # # PREFIX indicates the base directory used as the basis for the # determination of the actual installation directories. # Suggested values are "/usr/local", "/usr", "/pkgs/fdupes-$(VERSION)" # PREFIX = /usr/local # # When compiling for 32-bit systems, FILEOFFSET_64BIT must be enabled # for fdupes to handle files greater than (2<<31)-1 bytes. # FILEOFFSET_64BIT = -D_FILE_OFFSET_BITS=64 # # Certain platforms do not support long options (command line options). # To disable long options, uncomment the following line. # #OMIT_GETOPT_LONG = -DOMIT_GETOPT_LONG # # To use the md5sum program for calculating signatures (instead of the # built in MD5 message digest routines) uncomment the following # line (try this if you're having trouble with built in code). 
# #EXTERNAL_MD5 = -DEXTERNAL_MD5=\"md5sum\" ##################################################################### # Developer Configuration Section # ##################################################################### # # VERSION determines the program's version number. # include Makefile.inc/VERSION # # PROGRAM_NAME determines the installation name and manual page name # PROGRAM_NAME=fdupes # # BIN_DIR indicates directory where program is to be installed. # Suggested value is "$(PREFIX)/bin" # BIN_DIR = $(PREFIX)/bin # # MAN_DIR indicates directory where the fdupes man page is to be # installed. Suggested value is "$(PREFIX)/man/man1" # MAN_BASE_DIR = $(PREFIX)/man MAN_DIR = $(MAN_BASE_DIR)/man1 MAN_EXT = 1 # # Required External Tools # INSTALL = install # install : UCB/GNU Install compatiable #INSTALL = ginstall RM = rm -f MKDIR = mkdir -p #MKDIR = mkdirhier #MKDIR = mkinstalldirs # # Make Configuration # CC = gcc COMPILER_OPTIONS = -Wall -O -g CFLAGS= $(COMPILER_OPTIONS) -I. -DVERSION=\"$(VERSION)\" $(EXTERNAL_MD5) $(OMIT_GETOPT_LONG) $(FILEOFFSET_64BIT) INSTALL_PROGRAM = $(INSTALL) -c -m 0755 INSTALL_DATA = $(INSTALL) -c -m 0644 # # ADDITIONAL_OBJECTS - some platforms will need additional object files # to support features not supplied by their vendor. Eg: GNU getopt() # #ADDITIONAL_OBJECTS = getopt.o OBJECT_FILES = fdupes.o md5/md5.o $(ADDITIONAL_OBJECTS) ##################################################################### # no need to modify anything beyond this point # ##################################################################### all: fdupes fdupes: $(OBJECT_FILES) $(CC) $(CFLAGS) -o fdupes $(OBJECT_FILES) installdirs: test -d $(BIN_DIR) || $(MKDIR) $(BIN_DIR) test -d $(MAN_DIR) || $(MKDIR) $(MAN_DIR) install: fdupes installdirs $(INSTALL_PROGRAM) fdupes $(BIN_DIR)/$(PROGRAM_NAME) $(INSTALL_DATA) fdupes.1 $(MAN_DIR)/$(PROGRAM_NAME).$(MAN_EXT) clean: $(RM) $(OBJECT_FILES) $(RM) fdupes $(RM) *~ md5/*~ love: @echo You\'re not my type. 
Go find a human partner.

fdupes-1.51/.svn/pristine/99/

fdupes-1.51/.svn/pristine/99/99346e4e318333b4785bb199997c1fcc2be7546d.svn-base
with spaces

fdupes-1.51/.svn/pristine/a3/

fdupes-1.51/.svn/pristine/a3/a3170b78ced4cd2e8302c6d80436a1cd5a908f6f.svn-base
#
# VERSION determines the program's version number.
#
VERSION = 1.50-PR2

fdupes-1.51/.svn/pristine/a3/a34e525c92d7acf1e92ea5b99eaabaeb8ff4d1d6.svn-base
link two

fdupes-1.51/.svn/pristine/c7/

fdupes-1.51/.svn/pristine/c7/c7059bb19433cc3cabaa6236c83d56668a843dd2.svn-base
one

fdupes-1.51/.svn/pristine/4a/

fdupes-1.51/.svn/pristine/4a/4a4121ecd766ed16943a0c7b54c18f743e90c3f6.svn-base
four

fdupes-1.51/.svn/pristine/2f/

fdupes-1.51/.svn/pristine/2f/2f93a711e65db83f78c6cbccb255c8375b440223.svn-base
/*
  Copyright (C) 1999 Aladdin Enterprises.  All rights reserved.

  This software is provided 'as-is', without any express or implied
  warranty.  In no event will the authors be held liable for any damages
  arising from the use of this software.

  Permission is granted to anyone to use this software for any purpose,
  including commercial applications, and to alter it and redistribute it
  freely, subject to the following restrictions:

  1.
The origin of this software must not be misrepresented; you must not
     claim that you wrote the original software. If you use this software
     in a product, an acknowledgment in the product documentation would be
     appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not be
     misrepresented as being the original software.
  3. This notice may not be removed or altered from any source distribution.

  L. Peter Deutsch
  ghost@aladdin.com

 */
/*$Id: md5.c $ */
/*
  Independent implementation of MD5 (RFC 1321).

  This code implements the MD5 Algorithm defined in RFC 1321.
  It is derived directly from the text of the RFC and not from the
  reference implementation.

  The original and principal author of md5.c is L. Peter Deutsch
  <ghost@aladdin.com>.  Other authors are noted in the change history
  that follows (in reverse chronological order):

  contributors
    chl - Charles Longeau

  2002-05-31 chl Relocated string.h to avoid memcpy warning.
  1999-11-04 lpd Edited comments slightly for automatic TOC extraction.
  1999-10-18 lpd Fixed typo in header comment (ansi2knr rather than md5).
  1999-05-03 lpd Original version.
 */

#include "md5.h"
#include <string.h>

#ifdef TEST
/*
 * Compile with -DTEST to create a self-contained executable test program.
 * The test program should print out the same values as given in section
 * A.5 of RFC 1321, reproduced below.
*/ #include <stdio.h> int main(void) { static const char *const test[7] = { "", /*d41d8cd98f00b204e9800998ecf8427e*/ "a", /*0cc175b9c0f1b6a831c399e269772661*/ "abc", /*900150983cd24fb0d6963f7d28e17f72*/ "message digest", /*f96b697d7cb7938d525a2f31aaf161d0*/ "abcdefghijklmnopqrstuvwxyz", /*c3fcd3d76192e4007dfb496cca67e13b*/ "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789", /*d174ab98d277d9f5a5611c2c9f419d9f*/ "12345678901234567890123456789012345678901234567890123456789012345678901234567890" /*57edf4a22be3c955ac49da2e2107b67a*/ }; int i; for (i = 0; i < 7; ++i) { md5_state_t state; md5_byte_t digest[16]; int di; md5_init(&state); md5_append(&state, (const md5_byte_t *)test[i], strlen(test[i])); md5_finish(&state, digest); printf("MD5 (\"%s\") = ", test[i]); for (di = 0; di < 16; ++di) printf("%02x", digest[di]); printf("\n"); } return 0; } #endif /* TEST */ /* * For reference, here is the program that computed the T values. */ #if 0 #include <math.h> #include <stdio.h> int main(void) { int i; for (i = 1; i <= 64; ++i) { unsigned long v = (unsigned long)(4294967296.0 * fabs(sin((double)i))); printf("#define T%d 0x%08lx\n", i, v); } return 0; } #endif /* * End of T computation program. 
*/ #define T1 0xd76aa478 #define T2 0xe8c7b756 #define T3 0x242070db #define T4 0xc1bdceee #define T5 0xf57c0faf #define T6 0x4787c62a #define T7 0xa8304613 #define T8 0xfd469501 #define T9 0x698098d8 #define T10 0x8b44f7af #define T11 0xffff5bb1 #define T12 0x895cd7be #define T13 0x6b901122 #define T14 0xfd987193 #define T15 0xa679438e #define T16 0x49b40821 #define T17 0xf61e2562 #define T18 0xc040b340 #define T19 0x265e5a51 #define T20 0xe9b6c7aa #define T21 0xd62f105d #define T22 0x02441453 #define T23 0xd8a1e681 #define T24 0xe7d3fbc8 #define T25 0x21e1cde6 #define T26 0xc33707d6 #define T27 0xf4d50d87 #define T28 0x455a14ed #define T29 0xa9e3e905 #define T30 0xfcefa3f8 #define T31 0x676f02d9 #define T32 0x8d2a4c8a #define T33 0xfffa3942 #define T34 0x8771f681 #define T35 0x6d9d6122 #define T36 0xfde5380c #define T37 0xa4beea44 #define T38 0x4bdecfa9 #define T39 0xf6bb4b60 #define T40 0xbebfbc70 #define T41 0x289b7ec6 #define T42 0xeaa127fa #define T43 0xd4ef3085 #define T44 0x04881d05 #define T45 0xd9d4d039 #define T46 0xe6db99e5 #define T47 0x1fa27cf8 #define T48 0xc4ac5665 #define T49 0xf4292244 #define T50 0x432aff97 #define T51 0xab9423a7 #define T52 0xfc93a039 #define T53 0x655b59c3 #define T54 0x8f0ccc92 #define T55 0xffeff47d #define T56 0x85845dd1 #define T57 0x6fa87e4f #define T58 0xfe2ce6e0 #define T59 0xa3014314 #define T60 0x4e0811a1 #define T61 0xf7537e82 #define T62 0xbd3af235 #define T63 0x2ad7d2bb #define T64 0xeb86d391 static void md5_process(md5_state_t *pms, const md5_byte_t *data /*[64]*/) { md5_word_t a = pms->abcd[0], b = pms->abcd[1], c = pms->abcd[2], d = pms->abcd[3]; md5_word_t t; #ifndef ARCH_IS_BIG_ENDIAN # define ARCH_IS_BIG_ENDIAN 1 /* slower, default implementation */ #endif #if ARCH_IS_BIG_ENDIAN /* * On big-endian machines, we must arrange the bytes in the right * order. (This also works on machines of unknown byte order.) 
*/ md5_word_t X[16]; const md5_byte_t *xp = data; int i; for (i = 0; i < 16; ++i, xp += 4) X[i] = xp[0] + (xp[1] << 8) + (xp[2] << 16) + (xp[3] << 24); #else /* !ARCH_IS_BIG_ENDIAN */ /* * On little-endian machines, we can process properly aligned data * without copying it. */ md5_word_t xbuf[16]; const md5_word_t *X; if (!((data - (const md5_byte_t *)0) & 3)) { /* data are properly aligned */ X = (const md5_word_t *)data; } else { /* not aligned */ memcpy(xbuf, data, 64); X = xbuf; } #endif #define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32 - (n)))) /* Round 1. */ /* Let [abcd k s i] denote the operation a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< s). */ #define F(x, y, z) (((x) & (y)) | (~(x) & (z))) #define SET(a, b, c, d, k, s, Ti)\ t = a + F(b,c,d) + X[k] + Ti;\ a = ROTATE_LEFT(t, s) + b /* Do the following 16 operations. */ SET(a, b, c, d, 0, 7, T1); SET(d, a, b, c, 1, 12, T2); SET(c, d, a, b, 2, 17, T3); SET(b, c, d, a, 3, 22, T4); SET(a, b, c, d, 4, 7, T5); SET(d, a, b, c, 5, 12, T6); SET(c, d, a, b, 6, 17, T7); SET(b, c, d, a, 7, 22, T8); SET(a, b, c, d, 8, 7, T9); SET(d, a, b, c, 9, 12, T10); SET(c, d, a, b, 10, 17, T11); SET(b, c, d, a, 11, 22, T12); SET(a, b, c, d, 12, 7, T13); SET(d, a, b, c, 13, 12, T14); SET(c, d, a, b, 14, 17, T15); SET(b, c, d, a, 15, 22, T16); #undef SET /* Round 2. */ /* Let [abcd k s i] denote the operation a = b + ((a + G(b,c,d) + X[k] + T[i]) <<< s). */ #define G(x, y, z) (((x) & (z)) | ((y) & ~(z))) #define SET(a, b, c, d, k, s, Ti)\ t = a + G(b,c,d) + X[k] + Ti;\ a = ROTATE_LEFT(t, s) + b /* Do the following 16 operations. 
*/ SET(a, b, c, d, 1, 5, T17); SET(d, a, b, c, 6, 9, T18); SET(c, d, a, b, 11, 14, T19); SET(b, c, d, a, 0, 20, T20); SET(a, b, c, d, 5, 5, T21); SET(d, a, b, c, 10, 9, T22); SET(c, d, a, b, 15, 14, T23); SET(b, c, d, a, 4, 20, T24); SET(a, b, c, d, 9, 5, T25); SET(d, a, b, c, 14, 9, T26); SET(c, d, a, b, 3, 14, T27); SET(b, c, d, a, 8, 20, T28); SET(a, b, c, d, 13, 5, T29); SET(d, a, b, c, 2, 9, T30); SET(c, d, a, b, 7, 14, T31); SET(b, c, d, a, 12, 20, T32); #undef SET /* Round 3. */ /* Let [abcd k s t] denote the operation a = b + ((a + H(b,c,d) + X[k] + T[i]) <<< s). */ #define H(x, y, z) ((x) ^ (y) ^ (z)) #define SET(a, b, c, d, k, s, Ti)\ t = a + H(b,c,d) + X[k] + Ti;\ a = ROTATE_LEFT(t, s) + b /* Do the following 16 operations. */ SET(a, b, c, d, 5, 4, T33); SET(d, a, b, c, 8, 11, T34); SET(c, d, a, b, 11, 16, T35); SET(b, c, d, a, 14, 23, T36); SET(a, b, c, d, 1, 4, T37); SET(d, a, b, c, 4, 11, T38); SET(c, d, a, b, 7, 16, T39); SET(b, c, d, a, 10, 23, T40); SET(a, b, c, d, 13, 4, T41); SET(d, a, b, c, 0, 11, T42); SET(c, d, a, b, 3, 16, T43); SET(b, c, d, a, 6, 23, T44); SET(a, b, c, d, 9, 4, T45); SET(d, a, b, c, 12, 11, T46); SET(c, d, a, b, 15, 16, T47); SET(b, c, d, a, 2, 23, T48); #undef SET /* Round 4. */ /* Let [abcd k s t] denote the operation a = b + ((a + I(b,c,d) + X[k] + T[i]) <<< s). */ #define I(x, y, z) ((y) ^ ((x) | ~(z))) #define SET(a, b, c, d, k, s, Ti)\ t = a + I(b,c,d) + X[k] + Ti;\ a = ROTATE_LEFT(t, s) + b /* Do the following 16 operations. 
*/ SET(a, b, c, d, 0, 6, T49); SET(d, a, b, c, 7, 10, T50); SET(c, d, a, b, 14, 15, T51); SET(b, c, d, a, 5, 21, T52); SET(a, b, c, d, 12, 6, T53); SET(d, a, b, c, 3, 10, T54); SET(c, d, a, b, 10, 15, T55); SET(b, c, d, a, 1, 21, T56); SET(a, b, c, d, 8, 6, T57); SET(d, a, b, c, 15, 10, T58); SET(c, d, a, b, 6, 15, T59); SET(b, c, d, a, 13, 21, T60); SET(a, b, c, d, 4, 6, T61); SET(d, a, b, c, 11, 10, T62); SET(c, d, a, b, 2, 15, T63); SET(b, c, d, a, 9, 21, T64); #undef SET /* Then perform the following additions. (That is increment each of the four registers by the value it had before this block was started.) */ pms->abcd[0] += a; pms->abcd[1] += b; pms->abcd[2] += c; pms->abcd[3] += d; } void md5_init(md5_state_t *pms) { pms->count[0] = pms->count[1] = 0; pms->abcd[0] = 0x67452301; pms->abcd[1] = 0xefcdab89; pms->abcd[2] = 0x98badcfe; pms->abcd[3] = 0x10325476; } void md5_append(md5_state_t *pms, const md5_byte_t *data, int nbytes) { const md5_byte_t *p = data; int left = nbytes; int offset = (pms->count[0] >> 3) & 63; md5_word_t nbits = (md5_word_t)(nbytes << 3); if (nbytes <= 0) return; /* Update the message length. */ pms->count[1] += nbytes >> 29; pms->count[0] += nbits; if (pms->count[0] < nbits) pms->count[1]++; /* Process an initial partial block. */ if (offset) { int copy = (offset + nbytes > 64 ? 64 - offset : nbytes); memcpy(pms->buf + offset, p, copy); if (offset + copy < 64) return; p += copy; left -= copy; md5_process(pms, pms->buf); } /* Process full blocks. */ for (; left >= 64; p += 64, left -= 64) md5_process(pms, p); /* Process a final partial block. */ if (left) memcpy(pms->buf, p, left); } void md5_finish(md5_state_t *pms, md5_byte_t digest[16]) { static const md5_byte_t pad[64] = { 0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 }; md5_byte_t data[8]; int i; /* Save the length before padding. 
*/ for (i = 0; i < 8; ++i) data[i] = (md5_byte_t)(pms->count[i >> 2] >> ((i & 3) << 3)); /* Pad to 56 bytes mod 64. */ md5_append(pms, pad, ((55 - (pms->count[0] >> 3)) & 63) + 1); /* Append the length. */ md5_append(pms, data, 8); for (i = 0; i < 16; ++i) digest[i] = (md5_byte_t)(pms->abcd[i >> 2] >> ((i & 3) << 3)); } fdupes-1.51/.svn/pristine/7d/0000755000175000017500000000000012134544631015242 5ustar adrianadrianfdupes-1.51/.svn/pristine/7d/7db95721ce0051d62b982e79febd2726654a96fa.svn-base0000644000175000017500000000557612134544631024317 0ustar adrianadrian# # fdupes Makefile # ##################################################################### # Standand User Configuration Section # ##################################################################### # # PREFIX indicates the base directory used as the basis for the # determination of the actual installation directories. # Suggested values are "/usr/local", "/usr", "/pkgs/fdupes-$(VERSION)" # PREFIX = /usr/local # # Certain platforms do not support long options (command line options). # To disable long options, uncomment the following line. # #OMIT_GETOPT_LONG = -DOMIT_GETOPT_LONG # # To use the md5sum program for calculating signatures (instead of the # built in MD5 message digest routines) uncomment the following # line (try this if you're having trouble with built in code). # #EXTERNAL_MD5 = -DEXTERNAL_MD5=\"md5sum\" ##################################################################### # Developer Configuration Section # ##################################################################### # # VERSION determines the program's version number. # include Makefile.inc/VERSION # # PROGRAM_NAME determines the installation name and manual page name # PROGRAM_NAME=fdupes # # BIN_DIR indicates directory where program is to be installed. # Suggested value is "$(PREFIX)/bin" # BIN_DIR = $(PREFIX)/bin # # MAN_DIR indicates directory where the fdupes man page is to be # installed. 
Suggested value is "$(PREFIX)/man/man1" # MAN_BASE_DIR = $(PREFIX)/man MAN_DIR = $(MAN_BASE_DIR)/man1 MAN_EXT = 1 # # Required External Tools # INSTALL = install # install : UCB/GNU Install compatiable #INSTALL = ginstall RM = rm -f MKDIR = mkdir -p #MKDIR = mkdirhier #MKDIR = mkinstalldirs # # Make Configuration # CC = gcc COMPILER_OPTIONS = -Wall -O -g CFLAGS= $(COMPILER_OPTIONS) -I. -DVERSION=\"$(VERSION)\" $(EXTERNAL_MD5) $(EXPERIMENTAL_RBTREE) $(OMIT_GETOPT_LONG) INSTALL_PROGRAM = $(INSTALL) -c -m 0755 INSTALL_DATA = $(INSTALL) -c -m 0644 # # ADDITIONAL_OBJECTS - some platforms will need additional object files # to support features not supplied by their vendor. Eg: GNU getopt() # #ADDITIONAL_OBJECTS = getopt.o OBJECT_FILES = fdupes.o md5/md5.o $(ADDITIONAL_OBJECTS) ##################################################################### # no need to modify anything beyond this point # ##################################################################### all: fdupes fdupes: $(OBJECT_FILES) $(CC) $(CFLAGS) -o fdupes $(OBJECT_FILES) installdirs: test -d $(BIN_DIR) || $(MKDIR) $(BIN_DIR) test -d $(MAN_DIR) || $(MKDIR) $(MAN_DIR) install: fdupes installdirs $(INSTALL_PROGRAM) fdupes $(BIN_DIR)/$(PROGRAM_NAME) $(INSTALL_DATA) fdupes.1 $(MAN_DIR)/$(PROGRAM_NAME).$(MAN_EXT) clean: $(RM) $(OBJECT_FILES) $(RM) fdupes $(RM) *~ md5/*~ love: @echo You\'re not my type. Go find a human partner. fdupes-1.51/.svn/pristine/85/0000755000175000017500000000000012134544631015164 5ustar adrianadrianfdupes-1.51/.svn/pristine/85/8501050f63779a4a0ddae46d72ce8585754d87f6.svn-base0000644000175000017500000000536712134544631024103 0ustar adrianadrian.TH FDUPES 1 .\" NAME should be all caps, SECTION should be 1-8, maybe w/ subsection .\" other parms are allowed: see man(7), man(1) .SH NAME fdupes \- finds duplicate files in a given set of directories .SH SYNOPSIS .B fdupes [ .I options ] .I DIRECTORY \|.\|.\|. .SH "DESCRIPTION" Searches the given path for duplicate files. 
Such files are found by comparing file sizes and MD5 signatures, followed by a byte-by-byte comparison. .SH OPTIONS .TP .B -r --recurse for every directory given follow subdirectories encountered within .TP .B -R --recurse: for each directory given after this option follow subdirectories encountered within .TP .B -s --symlinks follow symlinked directories .TP .B -H --hardlinks normally, when two or more files point to the same disk area they are treated as non-duplicates; this option will change this behavior .TP .B -n --noempty exclude zero-length files from consideration .TP .B -f --omitfirst omit the first file in each set of matches .TP .B -1 --sameline list each set of matches on a single line .TP .B -S --size show size of duplicate files .TP .B -q --quiet hide progress indicator .TP .B -d --delete prompt user for files to preserve, deleting all others (see .B CAVEATS below) .TP .B -N --noprompt when used together with --delete, preserve the first file in each set of duplicates and delete the others without prompting the user .TP .B -v --version display fdupes version .TP .B -h --help display help .SH "SEE ALSO" .\" Always quote multiple words for .SH .BR md5sum (1) .SH NOTES Unless .B -1 or .B --sameline is specified, duplicate files are listed together in groups, each file displayed on a separate line. The groups are then separated from each other by blank lines. When .B -1 or .B --sameline is specified, spaces and backslash characters (\fB\e\fP) appearing in a filename are preceded by a backslash character. .SH CAVEATS If fdupes returns with an error message such as .B fdupes: error invoking md5sum it means the program has been compiled to use an external program to calculate MD5 signatures (otherwise, fdupes uses internal routines for this purpose), and an error has occurred while attempting to execute it. If this is the case, the specified program should be properly installed prior to running fdupes. 
When using .B \-d or .BR \-\-delete , care should be taken to insure against accidental data loss. When used together with options .B \-s or .BR \-\-symlinks , a user could accidentally preserve a symlink while deleting the file it points to. Furthermore, when specifying a particular directory more than once, all files within that directory will be listed as their own duplicates, leading to data loss should a user preserve a file without its "duplicate" (the file itself!). .SH AUTHOR Adrian Lopez fdupes-1.51/.svn/pristine/4d/0000755000175000017500000000000012134544631015237 5ustar adrianadrianfdupes-1.51/.svn/pristine/4d/4d40fc46ac7632656e7b7ed5c6a47011695e1335.svn-base0000644000175000017500000000001712134544631024121 0ustar adrianadrianlink recursed_afdupes-1.51/.svn/pristine/4d/4db195b8aa934d450b907c7f6d194f9d1eadf992.svn-base0000644000175000017500000000734512134544631024453 0ustar adrianadrianIntroduction -------------------------------------------------------------------- FDUPES is a program for identifying duplicate files residing within specified directories. Usage -------------------------------------------------------------------- Usage: fdupes [options] DIRECTORY... 
-r --recurse for every directory given follow subdirectories encountered within -R --recurse: for each directory given after this option follow subdirectories encountered within -s --symlinks follow symlinks -H --hardlinks normally, when two or more files point to the same disk area they are treated as non-duplicates; this option will change this behavior -n --noempty exclude zero-length files from consideration -f --omitfirst omit the first file in each set of matches -1 --sameline list each set of matches on a single line -S --size show size of duplicate files -q --quiet hide progress indicator -d --delete prompt user for files to preserve and delete all others; important: under particular circumstances, data may be lost when using this option together with -s or --symlinks, or when specifying a particular directory more than once; refer to the fdupes documentation for additional information -v --version display fdupes version -h --help display this help message Unless -1 or --sameline is specified, duplicate files are listed together in groups, each file displayed on a separate line. The groups are then separated from each other by blank lines. When -1 or --sameline is specified, spaces and backslash characters (\) appearing in a filename are preceded by a backslash character. For instance, "with spaces" becomes "with\ spaces". When using -d or --delete, care should be taken to insure against accidental data loss. While no information will be immediately lost, using this option together with -s or --symlinks can lead to confusing information being presented to the user when prompted for files to preserve. Specifically, a user could accidentally preserve a symlink while deleting the file it points to. A similar problem arises when specifying a particular directory more than once. All files within that directory will be listed as their own duplicates, leading to data loss should a user preserve a file without its "duplicate" (the file itself!). 
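The --sameline escaping rule described above (a backslash placed before every space and backslash) can be illustrated with a short C sketch. This is not the code from fdupes.c; the function name escape_filename and its buffer-size convention are assumptions made only for this example.

```c
#include <stddef.h>

/* Hypothetical helper (not taken from fdupes.c): copy `name` into `out`,
   placing a backslash before every space and every backslash.
   Returns the escaped length, or (size_t)-1 if `out` (capacity `cap`,
   including the terminating NUL) is too small. */
static size_t escape_filename(const char *name, char *out, size_t cap)
{
    size_t n = 0;

    for (; *name != '\0'; ++name) {
        if (*name == ' ' || *name == '\\') {
            if (n + 1 >= cap) return (size_t)-1;  /* no room for '\\' + NUL */
            out[n++] = '\\';
        }
        if (n + 1 >= cap) return (size_t)-1;      /* no room for char + NUL */
        out[n++] = *name;
    }
    if (cap == 0) return (size_t)-1;              /* not even room for NUL */
    out[n] = '\0';
    return n;
}
```

Called with the name "with spaces" and a sufficiently large buffer, the sketch produces "with\ spaces", matching the example given in the text.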
Contact Information for Adrian Lopez -------------------------------------------------------------------- email: adrian2@caribe.net Legal Information -------------------------------------------------------------------- FDUPES Copyright (c) 1999 Adrian Lopez Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. fdupes-1.51/.svn/pristine/e7/0000755000175000017500000000000012134544631015243 5ustar adrianadrianfdupes-1.51/.svn/pristine/e7/e73a4cd50904e0af47fda6e98d066e3436e9820a.svn-base0000644000175000017500000000567612134544631024371 0ustar adrianadrian/* Copyright (C) 1999 Aladdin Enterprises. All rights reserved. This software is provided 'as-is', without any express or implied warranty. In no event will the authors be held liable for any damages arising from the use of this software. Permission is granted to anyone to use this software for any purpose, including commercial applications, and to alter it and redistribute it freely, subject to the following restrictions: 1. 
The origin of this software must not be misrepresented; you must not claim that you wrote the original software. If you use this software in a product, an acknowledgment in the product documentation would be appreciated but is not required. 2. Altered source versions must be plainly marked as such, and must not be misrepresented as being the original software. 3. This notice may not be removed or altered from any source distribution. L. Peter Deutsch ghost@aladdin.com */ /*$Id: md5.h $ */ /* Independent implementation of MD5 (RFC 1321). This code implements the MD5 Algorithm defined in RFC 1321. It is derived directly from the text of the RFC and not from the reference implementation. The original and principal author of md5.h is L. Peter Deutsch . Other authors are noted in the change history that follows (in reverse chronological order): 1999-11-04 lpd Edited comments slightly for automatic TOC extraction. 1999-10-18 lpd Fixed typo in header comment (ansi2knr rather than md5); added conditionalization for C++ compilation from Martin Purschke . 1999-05-03 lpd Original version. */ #ifndef md5_INCLUDED # define md5_INCLUDED /* * This code has some adaptations for the Ghostscript environment, but it * will compile and run correctly in any environment with 8-bit chars and * 32-bit ints. Specifically, it assumes that if the following are * defined, they have the same meaning as in Ghostscript: P1, P2, P3, * ARCH_IS_BIG_ENDIAN. */ typedef unsigned char md5_byte_t; /* 8-bit byte */ typedef unsigned int md5_word_t; /* 32-bit word */ /* Define the state of the MD5 Algorithm. */ typedef struct md5_state_s { md5_word_t count[2]; /* message length in bits, lsw first */ md5_word_t abcd[4]; /* digest buffer */ md5_byte_t buf[64]; /* accumulate block */ } md5_state_t; #ifdef __cplusplus extern "C" { #endif /* Initialize the algorithm. */ #ifdef P1 void md5_init(P1(md5_state_t *pms)); #else void md5_init(md5_state_t *pms); #endif /* Append a string to the message. 
*/ #ifdef P3 void md5_append(P3(md5_state_t *pms, const md5_byte_t *data, int nbytes)); #else void md5_append(md5_state_t *pms, const md5_byte_t *data, int nbytes); #endif /* Finish the message and return the digest. */ #ifdef P2 void md5_finish(P2(md5_state_t *pms, md5_byte_t digest[16])); #else void md5_finish(md5_state_t *pms, md5_byte_t digest[16]); #endif #ifdef __cplusplus } /* end extern "C" */ #endif #endif /* md5_INCLUDED */ fdupes-1.51/.svn/pristine/e4/0000755000175000017500000000000012134544631015240 5ustar adrianadrianfdupes-1.51/.svn/pristine/e4/e490ace58b9e686f8c4858864e72de0fbc5a4f8d.svn-base0000644000175000017500000000037312134544631024550 0ustar adrianadrianThe MD5 library code residing within this directory was written by L. Peter Deutsch. Although distributed here with fdupes, the license for his MD5 library is different from the fdupes license. Please look at md5.c or md5.h for licensing information. fdupes-1.51/.svn/pristine/1e/0000755000175000017500000000000012134544631015235 5ustar adrianadrianfdupes-1.51/.svn/pristine/1e/1e7720a3460b8a84ac4ba27880d64526a3872f1c.svn-base0000644000175000017500000000000612134544631024101 0ustar adrianadrianthree fdupes-1.51/.svn/pristine/f7/0000755000175000017500000000000012134544631015244 5ustar adrianadrianfdupes-1.51/.svn/pristine/f7/f73d64088b09a8b3af29f141d61ce9e88d99fd02.svn-base0000644000175000017500000000210412134544631024370 0ustar adrianadrianInstalling fdupes -------------------------------------------------------------------- To install the program, issue the following commands: make fdupes su root make install This will install the program in /usr/local/bin. You may change this to a different location by editing the Makefile. Please refer to the Makefile for an explanation of compile-time options. If you're having trouble compiling, please take a look at the Makefile. 
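As one illustration of the compile-time options mentioned above: the Makefile passes the version string to the compiler as -DVERSION=\"$(VERSION)\". The sketch below shows one way C code can consume such a define and print it in the "fdupes X.Y" format. The helper name format_version and the "unknown" fallback are assumptions of this example, not taken from fdupes.c.

```c
#include <stdio.h>

/* Fallback so the sketch still compiles without -DVERSION=...;
   the text "unknown" is an assumption made for this example. */
#ifndef VERSION
#define VERSION "unknown"
#endif

/* Hypothetical helper: write "fdupes X.Y" into buf.
   Returns whatever snprintf returns (the length that was needed). */
static int format_version(char *buf, size_t size)
{
    return snprintf(buf, size, "fdupes %s", VERSION);
}
```

Compiling with something like cc -DVERSION=\"1.51\" would make the helper produce "fdupes 1.51"; without the define it falls back to "fdupes unknown".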
UPGRADING NOTE: When upgrading from a version prior to 1.2, it should be noted that the default installation directory for the fdupes man page has changed from "/usr/man" to "/usr/local/man". If installing to the default location you should delete the old man page before proceeding. This file would be named "/usr/man/man1/fdupes.1". A test directory is included so that you may familiarise yourself with the way fdupes operates. You may test the program before installing it by issuing a command such as "./fdupes testdir" or "./fdupes -r testdir", just to name a couple of examples. Refer to the documentation for information on valid options. fdupes-1.51/.svn/pristine/12/0000755000175000017500000000000012134544631015152 5ustar adrianadrianfdupes-1.51/.svn/pristine/12/1276bd8f6e7e8f9d3879bdcdf2d2e31c86886466.svn-base0000644000175000017500000000133412134544631024326 0ustar adrianadrianThe following people have contributed in some way to the development of fdupes. Please see the CHANGES file for detailed information on their contributions. Names are listed in alphabetical order. [AB] Adrian Bridgett (adrian.bridgett@iname.com) [AL] Adrian Lopez (adrian2@caribe.net) [CHL] Charles Longeau (chl@tuxfamily.org) [FD] Frank DENIS, a.k.a. Jedi/Sector One, a.k.a. DJ Chrysalis (j@4u.net) [JB] Jean-Baptiste () [KK] Kresimir Kukulj (madmax@pc-hrvoje.srce.hr) [LB] Laurent Bonnaud (Laurent.Bonnaud@iut2.upmf-grenoble.fr) [LM] Luca Montecchiani (m.luca@iname.com) [LR] Lukas Ruf (lukas@lpr.ch) [PB] Peter Bray (Sydney, Australia) [SSD] Steven S. Dick (ssd@nevets.oau.org) fdupes-1.51/.svn/pristine/35/0000755000175000017500000000000012134544631015157 5ustar adrianadrianfdupes-1.51/.svn/pristine/35/353762497616f46139c0f37010f673361d6221bc.svn-base0000644000175000017500000000322112134544631023454 0ustar adrianadrian- A bug with -S shows wrong results. 
- A bug causes the following behavior: $ fdupes --symlinks testdir testdir/with spaces b testdir/with spaces a testdir/zero_b testdir/zero_a testdir/symlink_two testdir/twice_one $ cp testdir/two testdir/two_again $ fdupes --symlinks testdir testdir/two_again testdir/two testdir/twice_one testdir/symlink_two testdir/with spaces b testdir/with spaces a testdir/zero_b testdir/zero_a ** This is not the desired behavior. Likewise: $ fdupes testdir testdir/with spaces b testdir/with spaces a testdir/zero_b testdir/zero_a testdir/twice_one testdir/two $ fdupes --symlinks testdir testdir/with spaces b testdir/with spaces a testdir/zero_b testdir/zero_a testdir/symlink_two testdir/twice_one - Don't assume that stat always works. - Add partial checksumming where instead of MD5ing whole files we MD5 and compare every so many bytes, caching these partial results for subsequent comparisons. - Option -R should not have to be separated from the rest, such that "fdupes -dR testdir", "fdupes -d -R testdir", "fdupes -Rd testdir", etc., all yield the same results. - Add option to highlight or identify symlinked files (suggest using --classify to identify symlinks with @ suffix... when specified, files containing @ are listed using \@). - Consider autodeletion option without user intervention. - Consider option to match only to files in specific directory. - Do a little commenting, to avoid rolling eyes and/or snickering. - Fix problem where MD5 collisions will result in one of the files not being registered (causing it to be ignored). 
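The partial checksumming item above could look roughly like the following C sketch, which hashes only the first `limit` bytes of a file so that files differing early can be told apart without reading them whole. Two assumptions are made loudly here: the name partial_signature is hypothetical, and 64-bit FNV-1a is swapped in for MD5 purely to keep the example self-contained. It also hashes a single leading chunk rather than "every so many bytes", and does no caching.

```c
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>

/* Hypothetical helper: hash at most `limit` bytes from the start of `fp`
   using 64-bit FNV-1a (a stand-in for MD5 in this sketch).
   Returns 0 on success and stores the hash in *out; returns -1 on a
   seek or read error. */
static int partial_signature(FILE *fp, size_t limit, uint64_t *out)
{
    unsigned char buf[4096];
    uint64_t h = 14695981039346656037ULL;    /* FNV-1a offset basis */
    size_t remaining = limit;

    if (fseek(fp, 0L, SEEK_SET) != 0) return -1;

    while (remaining > 0) {
        size_t want = remaining < sizeof buf ? remaining : sizeof buf;
        size_t got = fread(buf, 1, want, fp);
        size_t i;

        for (i = 0; i < got; ++i) {
            h ^= buf[i];
            h *= 1099511628211ULL;           /* FNV-1a prime */
        }
        remaining -= got;
        if (got < want) {
            if (ferror(fp)) return -1;       /* real read error */
            break;                           /* end of file reached */
        }
    }
    *out = h;
    return 0;
}
```

Two files whose partial signatures differ cannot be duplicates, so the full signature and byte-by-byte comparison would only be needed when the partial signatures agree.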
fdupes-1.51/.svn/pristine/7b/0000755000175000017500000000000012134544631015240 5ustar adrianadrianfdupes-1.51/.svn/pristine/7b/7bbef45b3bc70855010e02460717643125c3beca.svn-base0000644000175000017500000000000412134544631024142 0ustar adrianadriantwo fdupes-1.51/.svn/pristine/08/0000755000175000017500000000000012134544631015157 5ustar adrianadrianfdupes-1.51/.svn/pristine/08/08c4afaacabe55c58e4299959d27a332f5f0f258.svn-base0000644000175000017500000000674112134544631024363 0ustar adrianadrianThe following list, organized by fdupes version, documents changes to fdupes. Every item on the list includes, inside square brackets, a list of identifiers referring to the people who contributed that particular item. When more than one person is listed the person who contributed the patch or idea appears first, followed by those who've otherwise worked on that item. For a list of contributors' names and identifiers please see the CONTRIBUTORS file. Changes from 1.40 to 1.50-PR2 - Fixed memory leak. [JB] - Added "--summarize" option. [AL] - Added "--recurse:" selective recursion option. [AL] - Added "--noprompt" option for totally automated deletion of duplicate files. - Now sorts duplicates (old to new) for consistent order when listing or deleting duplicate files. - Now tests for early matching of files, which should help speed up the matching process when large files are involved. - Added warning whenever a file cannot be deleted. [CHL, AL] - Fixed bug where some files would not be closed after failure. [AL] - Fixed bug where confirmmatch() function wouldn't always deal properly with zero-length files. [AL] - Fixed bug where progress indicator would not be cleared when no files were found. [AL] - Removed experimental red-black tree code (it was slower on my system than the default code). [AL] - Modified md5/md5.c to avoid compiler warning. [CHL] - Changes to fdupes.c for compilation under platforms where getopt_long is unavailable. [LR, AL] - Changes to help text for clarity. 
[AL] - Various changes and improvements to Makefile. [PB, AL] Changes from 1.31 to 1.40 - Added option to omit the first file in each group of matches. [LM, AL] - Added escaping of filenames containing spaces when sameline option is specified. [AL] - Changed version indicator format from "fdupes version X.Y" to the simpler "fdupes X.Y". [AL] - Changed ordering of options appearing in the help text (--help), manpage, and README file. [AL] Changes from 1.30 to 1.31 - Added interactive option to preserve all files during delete procedure (something similar was already in place, but now it's official). [AL] - Updated delete procedure prompt format. [AL] - Cosmetic code changes. [AL] Changes from 1.20 to 1.30 - Added size option to display size of duplicates. [LB, AL] - Added missing typecast for proper compilation under g++. [LB] - Better handling of errors occurring during retrieval of a file's signature. [KK, AL] - No longer displays an error message when specified directories contain no files. [AL] - Added red-black tree structure (experimental compile-time option, disabled by default). [AL] Changes from 1.12 to 1.20 - Fixed bug where program would crash when files being scanned were named pipes or sockets. [FD] - Fix against security risk resulting from the use of a temporary file to store md5sum output. [FD, AL] - Using an external md5sum program is now optional. Started using L. Peter Deutsch's MD5 library instead. [FD, AL] - Added hardlinks option to distinguish between hard links and actual duplicate files. [FD, AL] - Added noempty option to exclude zero-length files from consideration. [AL] Changes from 1.11 to 1.12 - Improved handling of extremely long input on preserve prompt (delete option). [SSD, AL] Changes from 1.1 to 1.11 - Started checking file sizes before signatures for better performance. [AB, AL] - Added fdupes manpage. [AB, AL] Changes from 1.0 to 1.1 - Added delete option for semi-automatic deletion of duplicate files. 
[AL]
fdupes-1.51/fdupes.10000644000175000017500000000536712134544631013572 0ustar adrianadrian
.TH FDUPES 1
.\" NAME should be all caps, SECTION should be 1-8, maybe w/ subsection
.\" other parms are allowed: see man(7), man(1)
.SH NAME
fdupes \- finds duplicate files in a given set of directories
.SH SYNOPSIS
.B fdupes
[
.I options
]
.I DIRECTORY
\|.\|.\|.
.SH "DESCRIPTION"
Searches the given path for duplicate files. Such files are found by
comparing file sizes and MD5 signatures, followed by a
byte-by-byte comparison.
.SH OPTIONS
.TP
.B -r --recurse
for every directory given follow subdirectories encountered within
.TP
.B -R --recurse:
for each directory given after this option follow subdirectories
encountered within
.TP
.B -s --symlinks
follow symlinked directories
.TP
.B -H --hardlinks
normally, when two or more files point to the same disk area they are
treated as non-duplicates; this option will change this behavior
.TP
.B -n --noempty
exclude zero-length files from consideration
.TP
.B -f --omitfirst
omit the first file in each set of matches
.TP
.B -1 --sameline
list each set of matches on a single line
.TP
.B -S --size
show size of duplicate files
.TP
.B -q --quiet
hide progress indicator
.TP
.B -d --delete
prompt user for files to preserve, deleting all others (see
.B CAVEATS
below)
.TP
.B -N --noprompt
when used together with --delete, preserve the first file in each
set of duplicates and delete the others without prompting the user
.TP
.B -v --version
display fdupes version
.TP
.B -h --help
display help
.SH "SEE ALSO"
.\" Always quote multiple words for .SH
.BR md5sum (1)
.SH NOTES
Unless
.B -1
or
.B --sameline
is specified, duplicate files are listed
together in groups, each file displayed on a separate line. The
groups are then separated from each other by blank lines.

When
.B -1
or
.B --sameline
is specified, spaces and backslash characters (\fB\e\fP) appearing
in a filename are preceded by a backslash character.
.SH CAVEATS
If fdupes returns with an error message such as
.B fdupes: error invoking md5sum
it means the program has been compiled to use an external
program to calculate MD5 signatures (otherwise, fdupes uses
internal routines for this purpose), and an error has occurred
while attempting to execute it. If this is the case, the
specified program should be properly installed prior
to running fdupes.

When using
.B \-d
or
.BR \-\-delete ,
care should be taken to insure against
accidental data loss.

When used together with options
.B \-s
or
.BR \-\-symlinks ,
a user could accidentally
preserve a symlink while deleting the file it points to.

Furthermore, when specifying a particular directory more than
once, all files within that directory will be listed as their own
duplicates, leading to data loss should a user preserve a file
without its "duplicate" (the file itself!).
.SH AUTHOR
Adrian Lopez
fdupes-1.51/md5/0000755000175000017500000000000012134556236012700 5ustar adrianadrian
fdupes-1.51/md5/md5.h0000644000175000017500000000567612134544631013550 0ustar adrianadrian
/*
  Copyright (C) 1999 Aladdin Enterprises.  All rights reserved.

  This software is provided 'as-is', without any express or implied
  warranty.  In no event will the authors be held liable for any damages
  arising from the use of this software.

  Permission is granted to anyone to use this software for any purpose,
  including commercial applications, and to alter it and redistribute it
  freely, subject to the following restrictions:

  1. The origin of this software must not be misrepresented; you must not
     claim that you wrote the original software. If you use this software
     in a product, an acknowledgment in the product documentation would be
     appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not be
     misrepresented as being the original software.
  3. This notice may not be removed or altered from any source distribution.

  L. Peter Deutsch
  ghost@aladdin.com

 */
/*$Id: md5.h $ */
/*
  Independent implementation of MD5 (RFC 1321).

  This code implements the MD5 Algorithm defined in RFC 1321.
  It is derived directly from the text of the RFC and not from the
  reference implementation.

  The original and principal author of md5.h is L. Peter Deutsch.
  Other authors are noted in the change history
  that follows (in reverse chronological order):

  1999-11-04 lpd Edited comments slightly for automatic TOC extraction.
  1999-10-18 lpd Fixed typo in header comment (ansi2knr rather than md5);
        added conditionalization for C++ compilation from Martin
        Purschke.
  1999-05-03 lpd Original version.
 */

#ifndef md5_INCLUDED
#  define md5_INCLUDED

/*
 * This code has some adaptations for the Ghostscript environment, but it
 * will compile and run correctly in any environment with 8-bit chars and
 * 32-bit ints.  Specifically, it assumes that if the following are
 * defined, they have the same meaning as in Ghostscript: P1, P2, P3,
 * ARCH_IS_BIG_ENDIAN.
 */

typedef unsigned char md5_byte_t; /* 8-bit byte */
typedef unsigned int md5_word_t; /* 32-bit word */

/* Define the state of the MD5 Algorithm. */
typedef struct md5_state_s {
    md5_word_t count[2];        /* message length in bits, lsw first */
    md5_word_t abcd[4];         /* digest buffer */
    md5_byte_t buf[64];         /* accumulate block */
} md5_state_t;

#ifdef __cplusplus
extern "C"
{
#endif

/* Initialize the algorithm. */
#ifdef P1
void md5_init(P1(md5_state_t *pms));
#else
void md5_init(md5_state_t *pms);
#endif

/* Append a string to the message. */
#ifdef P3
void md5_append(P3(md5_state_t *pms, const md5_byte_t *data, int nbytes));
#else
void md5_append(md5_state_t *pms, const md5_byte_t *data, int nbytes);
#endif

/* Finish the message and return the digest. */
#ifdef P2
void md5_finish(P2(md5_state_t *pms, md5_byte_t digest[16]));
#else
void md5_finish(md5_state_t *pms, md5_byte_t digest[16]);
#endif

#ifdef __cplusplus
}  /* end extern "C" */
#endif

#endif /* md5_INCLUDED */
fdupes-1.51/md5/md5.c0000644000175000017500000002567012134544631013537 0ustar adrianadrian
/*
  Copyright (C) 1999 Aladdin Enterprises.  All rights reserved.

  This software is provided 'as-is', without any express or implied
  warranty.  In no event will the authors be held liable for any damages
  arising from the use of this software.

  Permission is granted to anyone to use this software for any purpose,
  including commercial applications, and to alter it and redistribute it
  freely, subject to the following restrictions:

  1. The origin of this software must not be misrepresented; you must not
     claim that you wrote the original software. If you use this software
     in a product, an acknowledgment in the product documentation would be
     appreciated but is not required.
  2. Altered source versions must be plainly marked as such, and must not be
     misrepresented as being the original software.
  3. This notice may not be removed or altered from any source distribution.

  L. Peter Deutsch
  ghost@aladdin.com

 */
/*$Id: md5.c $ */
/*
  Independent implementation of MD5 (RFC 1321).

  This code implements the MD5 Algorithm defined in RFC 1321.
  It is derived directly from the text of the RFC and not from the
  reference implementation.

  The original and principal author of md5.c is L. Peter Deutsch.
  Other authors are noted in the change history
  that follows (in reverse chronological order):

  contributors
  chl - Charles Longeau

  2002-05-31 chl Relocated string.h to avoid memcpy warning.
  1999-11-04 lpd Edited comments slightly for automatic TOC extraction.
  1999-10-18 lpd Fixed typo in header comment (ansi2knr rather than md5).
  1999-05-03 lpd Original version.
 */

#include "md5.h"
#include <string.h>

#ifdef TEST
/*
 * Compile with -DTEST to create a self-contained executable test program.
 * The test program should print out the same values as given in section
 * A.5 of RFC 1321, reproduced below.
 */
#include <stdio.h>      /* printf, used only by the test program */

int
main(void)
{
    static const char *const test[7] = {
        "", /*d41d8cd98f00b204e9800998ecf8427e*/
        "a", /*0cc175b9c0f1b6a831c399e269772661*/
        "abc", /*900150983cd24fb0d6963f7d28e17f72*/
        "message digest", /*f96b697d7cb7938d525a2f31aaf161d0*/
        "abcdefghijklmnopqrstuvwxyz", /*c3fcd3d76192e4007dfb496cca67e13b*/
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789",
                                /*d174ab98d277d9f5a5611c2c9f419d9f*/
        "12345678901234567890123456789012345678901234567890123456789012345678901234567890" /*57edf4a22be3c955ac49da2e2107b67a*/
    };
    int i;

    for (i = 0; i < 7; ++i) {
        md5_state_t state;
        md5_byte_t digest[16];
        int di;

        md5_init(&state);
        md5_append(&state, (const md5_byte_t *)test[i], (int)strlen(test[i]));
        md5_finish(&state, digest);
        printf("MD5 (\"%s\") = ", test[i]);
        for (di = 0; di < 16; ++di)
            printf("%02x", digest[di]);
        printf("\n");
    }
    return 0;
}
#endif /* TEST */


/*
 * For reference, here is the program that computed the T values.
 */
#if 0
#include <math.h>
#include <stdio.h>
int
main(void)
{
    int i;
    for (i = 1; i <= 64; ++i) {
        unsigned long v = (unsigned long)(4294967296.0 * fabs(sin((double)i)));
        printf("#define T%d 0x%08lx\n", i, v);
    }
    return 0;
}
#endif
/*
 * End of T computation program.
 */
#define T1 0xd76aa478
#define T2 0xe8c7b756
#define T3 0x242070db
#define T4 0xc1bdceee
#define T5 0xf57c0faf
#define T6 0x4787c62a
#define T7 0xa8304613
#define T8 0xfd469501
#define T9 0x698098d8
#define T10 0x8b44f7af
#define T11 0xffff5bb1
#define T12 0x895cd7be
#define T13 0x6b901122
#define T14 0xfd987193
#define T15 0xa679438e
#define T16 0x49b40821
#define T17 0xf61e2562
#define T18 0xc040b340
#define T19 0x265e5a51
#define T20 0xe9b6c7aa
#define T21 0xd62f105d
#define T22 0x02441453
#define T23 0xd8a1e681
#define T24 0xe7d3fbc8
#define T25 0x21e1cde6
#define T26 0xc33707d6
#define T27 0xf4d50d87
#define T28 0x455a14ed
#define T29 0xa9e3e905
#define T30 0xfcefa3f8
#define T31 0x676f02d9
#define T32 0x8d2a4c8a
#define T33 0xfffa3942
#define T34 0x8771f681
#define T35 0x6d9d6122
#define T36 0xfde5380c
#define T37 0xa4beea44
#define T38 0x4bdecfa9
#define T39 0xf6bb4b60
#define T40 0xbebfbc70
#define T41 0x289b7ec6
#define T42 0xeaa127fa
#define T43 0xd4ef3085
#define T44 0x04881d05
#define T45 0xd9d4d039
#define T46 0xe6db99e5
#define T47 0x1fa27cf8
#define T48 0xc4ac5665
#define T49 0xf4292244
#define T50 0x432aff97
#define T51 0xab9423a7
#define T52 0xfc93a039
#define T53 0x655b59c3
#define T54 0x8f0ccc92
#define T55 0xffeff47d
#define T56 0x85845dd1
#define T57 0x6fa87e4f
#define T58 0xfe2ce6e0
#define T59 0xa3014314
#define T60 0x4e0811a1
#define T61 0xf7537e82
#define T62 0xbd3af235
#define T63 0x2ad7d2bb
#define T64 0xeb86d391

static void
md5_process(md5_state_t *pms, const md5_byte_t *data /*[64]*/)
{
    md5_word_t
        a = pms->abcd[0], b = pms->abcd[1],
        c = pms->abcd[2], d = pms->abcd[3];
    md5_word_t t;

#ifndef ARCH_IS_BIG_ENDIAN
# define ARCH_IS_BIG_ENDIAN 1   /* slower, default implementation */
#endif
#if ARCH_IS_BIG_ENDIAN

    /*
     * On big-endian machines, we must arrange the bytes in the right
     * order.  (This also works on machines of unknown byte order.)
     */
    md5_word_t X[16];
    const md5_byte_t *xp = data;
    int i;

    /* Widen each byte before shifting so that bit 31 is never shifted
       into a signed int. */
    for (i = 0; i < 16; ++i, xp += 4)
        X[i] = (md5_word_t)xp[0] + ((md5_word_t)xp[1] << 8) +
               ((md5_word_t)xp[2] << 16) + ((md5_word_t)xp[3] << 24);

#else  /* !ARCH_IS_BIG_ENDIAN */

    /*
     * On little-endian machines, we can process properly aligned data
     * without copying it.
     */
    md5_word_t xbuf[16];
    const md5_word_t *X;

    if (!((data - (const md5_byte_t *)0) & 3)) {
        /* data are properly aligned */
        X = (const md5_word_t *)data;
    } else {
        /* not aligned */
        memcpy(xbuf, data, 64);
        X = xbuf;
    }
#endif

#define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

    /* Round 1. */
    /* Let [abcd k s i] denote the operation
       a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< s). */
#define F(x, y, z) (((x) & (y)) | (~(x) & (z)))
#define SET(a, b, c, d, k, s, Ti)\
  t = a + F(b,c,d) + X[k] + Ti;\
  a = ROTATE_LEFT(t, s) + b
    /* Do the following 16 operations. */
    SET(a, b, c, d,  0,  7,  T1);
    SET(d, a, b, c,  1, 12,  T2);
    SET(c, d, a, b,  2, 17,  T3);
    SET(b, c, d, a,  3, 22,  T4);
    SET(a, b, c, d,  4,  7,  T5);
    SET(d, a, b, c,  5, 12,  T6);
    SET(c, d, a, b,  6, 17,  T7);
    SET(b, c, d, a,  7, 22,  T8);
    SET(a, b, c, d,  8,  7,  T9);
    SET(d, a, b, c,  9, 12, T10);
    SET(c, d, a, b, 10, 17, T11);
    SET(b, c, d, a, 11, 22, T12);
    SET(a, b, c, d, 12,  7, T13);
    SET(d, a, b, c, 13, 12, T14);
    SET(c, d, a, b, 14, 17, T15);
    SET(b, c, d, a, 15, 22, T16);
#undef SET

    /* Round 2. */
    /* Let [abcd k s i] denote the operation
       a = b + ((a + G(b,c,d) + X[k] + T[i]) <<< s). */
#define G(x, y, z) (((x) & (z)) | ((y) & ~(z)))
#define SET(a, b, c, d, k, s, Ti)\
  t = a + G(b,c,d) + X[k] + Ti;\
  a = ROTATE_LEFT(t, s) + b
    /* Do the following 16 operations.
     */
    SET(a, b, c, d,  1,  5, T17);
    SET(d, a, b, c,  6,  9, T18);
    SET(c, d, a, b, 11, 14, T19);
    SET(b, c, d, a,  0, 20, T20);
    SET(a, b, c, d,  5,  5, T21);
    SET(d, a, b, c, 10,  9, T22);
    SET(c, d, a, b, 15, 14, T23);
    SET(b, c, d, a,  4, 20, T24);
    SET(a, b, c, d,  9,  5, T25);
    SET(d, a, b, c, 14,  9, T26);
    SET(c, d, a, b,  3, 14, T27);
    SET(b, c, d, a,  8, 20, T28);
    SET(a, b, c, d, 13,  5, T29);
    SET(d, a, b, c,  2,  9, T30);
    SET(c, d, a, b,  7, 14, T31);
    SET(b, c, d, a, 12, 20, T32);
#undef SET

    /* Round 3. */
    /* Let [abcd k s i] denote the operation
       a = b + ((a + H(b,c,d) + X[k] + T[i]) <<< s). */
#define H(x, y, z) ((x) ^ (y) ^ (z))
#define SET(a, b, c, d, k, s, Ti)\
  t = a + H(b,c,d) + X[k] + Ti;\
  a = ROTATE_LEFT(t, s) + b
    /* Do the following 16 operations. */
    SET(a, b, c, d,  5,  4, T33);
    SET(d, a, b, c,  8, 11, T34);
    SET(c, d, a, b, 11, 16, T35);
    SET(b, c, d, a, 14, 23, T36);
    SET(a, b, c, d,  1,  4, T37);
    SET(d, a, b, c,  4, 11, T38);
    SET(c, d, a, b,  7, 16, T39);
    SET(b, c, d, a, 10, 23, T40);
    SET(a, b, c, d, 13,  4, T41);
    SET(d, a, b, c,  0, 11, T42);
    SET(c, d, a, b,  3, 16, T43);
    SET(b, c, d, a,  6, 23, T44);
    SET(a, b, c, d,  9,  4, T45);
    SET(d, a, b, c, 12, 11, T46);
    SET(c, d, a, b, 15, 16, T47);
    SET(b, c, d, a,  2, 23, T48);
#undef SET

    /* Round 4. */
    /* Let [abcd k s i] denote the operation
       a = b + ((a + I(b,c,d) + X[k] + T[i]) <<< s). */
#define I(x, y, z) ((y) ^ ((x) | ~(z)))
#define SET(a, b, c, d, k, s, Ti)\
  t = a + I(b,c,d) + X[k] + Ti;\
  a = ROTATE_LEFT(t, s) + b
    /* Do the following 16 operations.
     */
    SET(a, b, c, d,  0,  6, T49);
    SET(d, a, b, c,  7, 10, T50);
    SET(c, d, a, b, 14, 15, T51);
    SET(b, c, d, a,  5, 21, T52);
    SET(a, b, c, d, 12,  6, T53);
    SET(d, a, b, c,  3, 10, T54);
    SET(c, d, a, b, 10, 15, T55);
    SET(b, c, d, a,  1, 21, T56);
    SET(a, b, c, d,  8,  6, T57);
    SET(d, a, b, c, 15, 10, T58);
    SET(c, d, a, b,  6, 15, T59);
    SET(b, c, d, a, 13, 21, T60);
    SET(a, b, c, d,  4,  6, T61);
    SET(d, a, b, c, 11, 10, T62);
    SET(c, d, a, b,  2, 15, T63);
    SET(b, c, d, a,  9, 21, T64);
#undef SET

    /* Then perform the following additions. (That is increment each
       of the four registers by the value it had before this block
       was started.) */
    pms->abcd[0] += a;
    pms->abcd[1] += b;
    pms->abcd[2] += c;
    pms->abcd[3] += d;
}

void
md5_init(md5_state_t *pms)
{
    pms->count[0] = pms->count[1] = 0;
    pms->abcd[0] = 0x67452301;
    pms->abcd[1] = 0xefcdab89;
    pms->abcd[2] = 0x98badcfe;
    pms->abcd[3] = 0x10325476;
}

void
md5_append(md5_state_t *pms, const md5_byte_t *data, int nbytes)
{
    const md5_byte_t *p = data;
    int left = nbytes;
    int offset = (pms->count[0] >> 3) & 63;
    md5_word_t nbits = (md5_word_t)(nbytes << 3);

    if (nbytes <= 0)
        return;

    /* Update the message length. */
    pms->count[1] += nbytes >> 29;
    pms->count[0] += nbits;
    if (pms->count[0] < nbits)
        pms->count[1]++;

    /* Process an initial partial block. */
    if (offset) {
        int copy = (offset + nbytes > 64 ? 64 - offset : nbytes);

        memcpy(pms->buf + offset, p, copy);
        if (offset + copy < 64)
            return;
        p += copy;
        left -= copy;
        md5_process(pms, pms->buf);
    }

    /* Process full blocks. */
    for (; left >= 64; p += 64, left -= 64)
        md5_process(pms, p);

    /* Process a final partial block. */
    if (left)
        memcpy(pms->buf, p, left);
}

void
md5_finish(md5_state_t *pms, md5_byte_t digest[16])
{
    static const md5_byte_t pad[64] = {
        0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
    };
    md5_byte_t data[8];
    int i;

    /* Save the length before padding.
     */
    for (i = 0; i < 8; ++i)
        data[i] = (md5_byte_t)(pms->count[i >> 2] >> ((i & 3) << 3));
    /* Pad to 56 bytes mod 64. */
    md5_append(pms, pad, ((55 - (pms->count[0] >> 3)) & 63) + 1);
    /* Append the length. */
    md5_append(pms, data, 8);
    for (i = 0; i < 16; ++i)
        digest[i] = (md5_byte_t)(pms->abcd[i >> 2] >> ((i & 3) << 3));
}
fdupes-1.51/md5/README0000644000175000017500000000037312134544631013557 0ustar adrianadrian
The MD5 library code residing within this directory was written by
L. Peter Deutsch. Although distributed here with fdupes, the license
for his MD5 library is different from the fdupes license. Please look
at md5.c or md5.h for licensing information.
fdupes-1.51/TODO0000644000175000017500000000322112134544631012675 0ustar adrianadrian
- A bug with -S shows wrong results.

- A bug causes the following behavior:

$ fdupes --symlinks testdir
testdir/with spaces b
testdir/with spaces a

testdir/zero_b
testdir/zero_a

testdir/symlink_two
testdir/twice_one

$ cp testdir/two testdir/two_again
$ fdupes --symlinks testdir
testdir/two_again
testdir/two
testdir/twice_one
testdir/symlink_two

testdir/with spaces b
testdir/with spaces a

testdir/zero_b
testdir/zero_a

** This is not the desired behavior. Likewise:

$ fdupes testdir
testdir/with spaces b
testdir/with spaces a

testdir/zero_b
testdir/zero_a

testdir/twice_one
testdir/two

$ fdupes --symlinks testdir
testdir/with spaces b
testdir/with spaces a

testdir/zero_b
testdir/zero_a

testdir/symlink_two
testdir/twice_one

- Don't assume that stat always works.

- Add partial checksumming where instead of MD5ing whole
  files we MD5 and compare every so many bytes, caching these
  partial results for subsequent comparisons.

- Option -R should not have to be separated from the rest,
  such that "fdupes -dR testdir", "fdupes -d -R testdir",
  "fdupes -Rd testdir", etc., all yield the same results.

- Add option to highlight or identify symlinked files (suggest
  using --classify to identify symlinks with @ suffix...
  when specified, files containing @ are listed using \@).

- Consider autodeletion option without user intervention.

- Consider option to match only to files in specific directory.

- Do a little commenting, to avoid rolling eyes and/or snickering.

- Fix problem where MD5 collisions will result in one of the files
  not being registered (causing it to be ignored).
fdupes-1.51/Makefile0000644000175000017500000000605312134552676013663 0ustar adrianadrian
#
# fdupes Makefile
#

#####################################################################
# Standard User Configuration Section                               #
#####################################################################

#
# PREFIX indicates the base directory used as the basis for the
# determination of the actual installation directories.
# Suggested values are "/usr/local", "/usr", "/pkgs/fdupes-$(VERSION)"
#
PREFIX = /usr/local

#
# When compiling for 32-bit systems, FILEOFFSET_64BIT must be enabled
# for fdupes to handle files greater than (1<<31)-1 bytes.
#
FILEOFFSET_64BIT = -D_FILE_OFFSET_BITS=64

#
# Certain platforms do not support long options (command line options).
# To disable long options, uncomment the following line.
#
#OMIT_GETOPT_LONG = -DOMIT_GETOPT_LONG

#
# To use the md5sum program for calculating signatures (instead of the
# built in MD5 message digest routines) uncomment the following
# line (try this if you're having trouble with built in code).
#
#EXTERNAL_MD5 = -DEXTERNAL_MD5=\"md5sum\"

#####################################################################
# Developer Configuration Section                                   #
#####################################################################

#
# VERSION determines the program's version number.
#
include Makefile.inc/VERSION

#
# PROGRAM_NAME determines the installation name and manual page name
#
PROGRAM_NAME=fdupes

#
# BIN_DIR indicates directory where program is to be installed.
# Suggested value is "$(PREFIX)/bin"
#
BIN_DIR = $(PREFIX)/bin

#
# MAN_DIR indicates directory where the fdupes man page is to be
# installed. Suggested value is "$(PREFIX)/man/man1"
#
MAN_BASE_DIR = $(PREFIX)/man
MAN_DIR = $(MAN_BASE_DIR)/man1
MAN_EXT = 1

#
# Required External Tools
#

INSTALL = install	# install : UCB/GNU Install compatible
#INSTALL = ginstall

RM      = rm -f

MKDIR   = mkdir -p
#MKDIR   = mkdirhier
#MKDIR   = mkinstalldirs

#
# Make Configuration
#
CC = gcc
COMPILER_OPTIONS = -Wall -O -g

CFLAGS= $(COMPILER_OPTIONS) -I. -DVERSION=\"$(VERSION)\" $(EXTERNAL_MD5) $(OMIT_GETOPT_LONG) $(FILEOFFSET_64BIT)

INSTALL_PROGRAM = $(INSTALL) -c -m 0755
INSTALL_DATA    = $(INSTALL) -c -m 0644

#
# ADDITIONAL_OBJECTS - some platforms will need additional object files
# to support features not supplied by their vendor. Eg: GNU getopt()
#
#ADDITIONAL_OBJECTS = getopt.o

OBJECT_FILES = fdupes.o md5/md5.o $(ADDITIONAL_OBJECTS)

#####################################################################
# no need to modify anything beyond this point                      #
#####################################################################

all: fdupes

fdupes: $(OBJECT_FILES)
	$(CC) $(CFLAGS) -o fdupes $(OBJECT_FILES)

installdirs:
	test -d $(BIN_DIR) || $(MKDIR) $(BIN_DIR)
	test -d $(MAN_DIR) || $(MKDIR) $(MAN_DIR)

install: fdupes installdirs
	$(INSTALL_PROGRAM)	fdupes   $(BIN_DIR)/$(PROGRAM_NAME)
	$(INSTALL_DATA)		fdupes.1 $(MAN_DIR)/$(PROGRAM_NAME).$(MAN_EXT)

clean:
	$(RM) $(OBJECT_FILES)
	$(RM) fdupes
	$(RM) *~ md5/*~

love:
	@echo You\'re not my type. Go find a human partner.