pax_global_header00006660000000000000000000000064124714116470014521gustar00rootroot0000000000000052 comment=224655761e320b6f3aaf1e5d73ffea98983f69ba fpart-0.9.2/000077500000000000000000000000001247141164700126455ustar00rootroot00000000000000fpart-0.9.2/COPYING000066400000000000000000000024521247141164700137030ustar00rootroot00000000000000Copyright (c) 2011-2015 Ganael LAPLANCHE All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: 1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. 2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
fpart-0.9.2/Changelog000066400000000000000000000062731247141164700144670ustar00rootroot000000000000002015/02/19, 0.9.2 ('What's that?'): - fpsync: add option '-S' to use sudo(1) for filesystem crawling and synchronizations - fpsync: add option '-O' to override default fpart(1) options - add RPM .spec file (see the contribs/package/rpm directory) 2015/02/06, 0.9.1 ('Let's play together'): - add tools/fpsync: a tool to sync directories in parallel using fpart and rsync. See fpsync(1) for more details. - fpart: print the number of files found in verbose mode only 2013/11/13, 0.9-1: (this minor change only impacts tarball users, fpart code itself has not changed) - Backport the following patch to config.guess: http://git.savannah.gnu.org/gitweb/?p=config.git;a=commitdiff;h=29900d3bff1ce445087ece5cb2cac425df1c2f74 it adds support for ppcle and ppc64le architectures. Submitted by: Madhu Pavan 2013/09/10, 0.9 ('I bite!'): - Code cleanup - Fix Debian bug #719338 (fix build on 32bit architectures) 2013/06/25, 0.8 ('Moving around'): - Keep environ(7) when forking hooks - Add sections about live mode and data migration in README - Use autotools and get rid of manual Makefiles 2013/02/18, 0.7 ('Can I take it ?'): - Added option -D to group leaf directories as single file entries - Big options and doc update/cleanup - Sync'd fts(3) code with current FreeBSD version (svn rev 245505) - Embedded fts(3) can now be used on GNU/Linux - Renamed option '-x' to '-b' - Added options '-y', '-Y', '-x' and '-X' to include or exclude files 2013/01/21, 0.6 ('Very funny'): - Options handling cleanup and various bugfixes - Added FPART_PID and FPART_HOOKTYPE hook variables - Added -z and -Z options to display empty directories 2013/01/09, 0.5 ('Cassounette'): - Added option '-L' (live mode) - Added options '-w' and '-W' (hooks to be used with live mode) - Fixed build on Solaris 9 - Removed option '-m' and associated code 2012/06/12, 0.4: - Now builds on Solaris (using FreeBSD's fts(3)) - Fix 
stack overflow when allocating file entry pointers' array
- Added preloading/rounding options -p, -q and -r. See fpart(1)
- Added more verbose messages (option '-v')
- Better error handling when unable to add file entries
  (mostly when running out of memory)
- Verbose mode now accepts several levels
- Added option '-m' (disabled by default), that tries to lower
  physical memory usage (at least during FS crawling) by using
  temporary file(s) and mmap(2) facility

2011/12/06, 0.3:
- Switch to fts(3)
- Replaced getline(3) calls with fgets(3) for compatibility
- New "handle arbitrary values" (-a) option
- Strings handling cleanup (stop using FILENAME_MAX)
- Various smaller changes

2011/11/24, 0.2:
- New "follow symbolic links" (-l) option
- Ending slash (if present in input path) is now left to allow
  following an initial symbolic link (without using option -l)
- New "do not cross file system boundaries" (-x) option
- Fpart now reads on stdin by default if no path is given
- File size are now written to stdout when displaying partition contents
- Various smaller changes

2011/11/18, 0.1:
- Initial version
fpart-0.9.2/Makefile.am000066400000000000000000000001561247141164700147030ustar00rootroot00000000000000
# Add Changelog to distribution package
EXTRA_DIST = Changelog
SUBDIRS = src
SUBDIRS += tools
SUBDIRS += man
fpart-0.9.2/README000066400000000000000000000244441247141164700135350ustar00rootroot00000000000000
Fpart - README

What is fpart ?
***************

Fpart is a tool that helps you sort file trees and pack them into bags (called
"partitions"). It is developed in C and available under the BSD license.

It splits a list of directories and file trees into a certain number of
partitions, trying to produce partitions with the same size and number of files.
It can also produce partitions with a given number of files or a limited size.

Once generated, partitions are either printed as file lists to stdout (default)
or to files.
Those lists can then be used by third party programs.

Fpart also includes a live mode, which allows it to crawl very large filesystems
and produce partitions on the fly. Hooks are available to act on those
partitions (e.g. immediately start a transfer using rsync(1)) without having to
wait for the filesystem traversal job to be finished. Used this way, fpart can
be seen as a powerful data migration tool.

Compatibility :
***************

Fpart has been successfully tested on :

* FreeBSD 7.x, 8.x, 9.x (i386, amd64)
* GNU/Linux (x86_64, arm)
* Solaris 9, 10 (Sparc, i386)
* NetBSD (amd64, alpha)
* Mac OS X (10.6, 10.8)

and will probably work on other operating systems too (*).

(*) fpart built as a static binary within a Debian (armel) chroot will give you
a powerful tool for backing up your Android (arm) phone ;-)

Examples - common usage :
*************************

The following will produce 3 partitions, with (approximately) the same size
and number of files. Three files : "var-parts.[0-2]", are generated as output :

$ fpart -n 3 -o var-parts /var
$ ls var-parts*
var-parts.0 var-parts.1 var-parts.2
$ head -n 2 var-parts.0
/var/some/file1
/var/some/file2

The following will produce partitions of 4.3 GB, containing music files ready
to be burnt to a DVD (for example). Files "music-parts.[0-n]", are generated as
output :

$ fpart -s 4617089843 -o music-parts /path/to/my/music

The following will produce partitions containing 10000 files each by examining
/usr first and then /home and display only partition 0 on stdout :

$ find /usr ! -type d | fpart -f 10000 -i - /home | grep '^0:'

The following will produce two partitions by re-using du(1) output. Fpart will
not examine the filesystem but instead re-use arbitrary values printed by du(1)
and sort them :

$ du * | fpart -n 2 -a

Examples - live mode :
**********************

By default, fpart will wait for FS crawling to terminate before generating and
displaying partitions.
If you use the live mode (option -L), fpart will display each partition as soon
as it is complete. You can combine this option with hooks; they will be
triggered just before (pre-part hook, option -w) or after (post-part hook,
option -W) partitions' completion.

Hooks provide several environment variables (see fpart(1)); they are a
convenient way of getting information about fpart's and partition's current
states. For example, ${FPART_PARTFILENAME} will contain the name of the output
file of the partition that has just been generated; using that variable within
a post-part hook lets you start manipulating the files just after partition
generation.

See the following example :

$ mkdir foo && touch foo/{bar,baz}
$ fpart -L -f 1 -o /tmp/part.out -W \
    'echo == ${FPART_PARTFILENAME} == ; cat ${FPART_PARTFILENAME}' foo/
== /tmp/part.out.0 ==
foo/bar
== /tmp/part.out.1 ==
foo/baz

This example crawls foo/ in live mode (option -L). For each file (option -f, 1
file per partition), it generates a partition into /tmp/part.out.<n> (option
-o; <n> is the partition index and will be automatically added by fpart) and
executes the following post-part hook (option -W) :

echo == ${FPART_PARTFILENAME} == ; cat ${FPART_PARTFILENAME}

This hook will display the name of the current partition's output file as well
as its contents.

Examples - migrating data (take 1) :
************************************

Here is a more complex example that will show you how to use fpart, GNU Parallel
and Rsync to split up a directory and immediately schedule data synchronization
of smaller lists of files, while FS crawling goes on. We will be synchronizing
data from /data/src to /data/dest.
First, go to the source directory (as rsync's --files-from option takes a file
list relative to its source directory) :

$ cd /data/src

Then, run fpart from here :

$ fpart -L -f 10000 -x '.snapshot' -x '.zfs' -Z -o /tmp/part.out -W \
    '/usr/local/bin/sem -j 3 "/usr/local/bin/rsync -av --files-from=${FPART_PARTFILENAME} /data/src/ /data/dest/"' .

This command will start fpart in live mode (option -L), making it generate
partitions during FS crawling. Fpart will produce partitions containing at most
10000 files each (option -f), will skip files and folders named '.snapshot' or
'.zfs' (option -x) and will list empty and non-accessible directories (option
-Z; that option is necessary when working with rsync to make sure the whole file
tree will be re-created within the destination directory). Last but not least,
each partition will be written to /tmp/part.out.<n> (option -o) and used within
the post-part hook (option -W), run immediately by fpart once the partition is
complete :

/usr/local/bin/sem -j 3 "/usr/local/bin/rsync -av --files-from=${FPART_PARTFILENAME} /data/src/ /data/dest/"

This hook is itself a nested command. It will run GNU Parallel's sem scheduler
(any other scheduler will do) to run at most 3 rsync processes in parallel.

The scheduler will finally trigger the following command :

/usr/local/bin/rsync -av --files-from=${FPART_PARTFILENAME} /data/src/ /data/dest/

where ${FPART_PARTFILENAME} will be part of rsync's environment when it runs and
contain the file name of the partition that has just been generated.

That's all, folks ! Pretty simple, isn't it ?

In this example, FS crawling and data transfer are run from the same -local-
machine, but you can use it as the basis of a much more sophisticated solution :
at work, by using a cluster of machines connected to our filers through NFS and
running Open Grid Scheduler, we successfully migrated over 400 TB of data.

Note : several fpart runs can be launched using the above example ; you will
perform incremental synchronizations.
That is, deleted files from the source directory will not be removed from
destination unless rsync's --delete option is used. Unfortunately, this option
cannot be used with a list of files (files that do not appear in the list are
just ignored). To use the --delete option in conjunction with fpart, you *have*
to provide rsync's --files-from option a list of directories (only). Fpart's -d
option can do the trick but it will probably not generate partitions having the
same size (if you choose to use that method anyway, do not forget to use rsync's
-r option too). Another - simpler - method if you plan to perform several
synchronizations is to use a single, last rsync to delete extra files from
destination using the --delete option. This simple rsync pass will not transfer
data but only delete files from destination directory.

Examples - migrating data (take 2) :
************************************

A program called 'fpsync' is provided within the tools/ directory. This tool is
a shell script that wraps fpart and rsync to launch several rsync processes in
parallel as presented in the previous section ("take 1"), but while the previous
example used GNU Parallel to schedule transfers, this wrapper provides its own
-embedded- scheduler. It can execute several rsync processes locally or launch
rsync transfers on several nodes (workers) through SSH.

The following examples show two typical usages :

$ fpsync -n 4 -f 1000 -s $((100 * 1024 * 1024)) \
    /data/src/ /data/dst/

will synchronize /data/src/ to /data/dst/ using 4 local workers, each one
transferring at most 1000 files and 100 MB per synchronization job.

$ fpsync -n 8 -f 1000 -s $((100 * 1024 * 1024)) \
    -w login@machine1 -w login@machine2 -d /mnt/nfs/fpsync \
    /data/src/ /data/dst/

will synchronize /data/src/ to /data/dst/ using the same transfer limits, but
through 8 concurrent synchronization jobs spread over two machines (machine1 and
machine2).
Those machines can both access /data/src/ and /data/dst/, as well as
/mnt/nfs/fpsync, which is fpsync's shared working directory.

Jobs can be interrupted and resumed using the -r option and the job id presented
when verbose mode (-v) is on.

See fpsync(1) to get a list of all supported options.

Limits :
********

Fpart will *NOT* modify data, it will *NOT* split your files !

As a consequence, if you have a directory containing several small files and a
huge one, it will be unable to produce partitions with the same size. Fpart does
magic, but not that much ;-)

If you provide several paths to fpart, it will examine all of them. If those
paths overlap or if the same path is specified more than once, the same files
will appear more than once within generated partitions. This is not a bug, fpart
does not deduplicate FS crawling results.

Installing :
************

For FreeBSD users, fpart is already available in ports, see sysutils/fpart.
Fpart is also available from official repositories on Debian and Ubuntu.

If a pre-compiled package is not available for your favourite operating system,
installing from sources is simple. First, if there is no 'configure' script in
the main directory, run :

$ autoreconf -i

(autoreconf comes from the GNU autotools), then run :

$ ./configure
$ make

to configure and build fpart.

Finally, install fpart (as root) :

# make install

See also :
**********

See fpart(1) and fpsync(1) for more details.

Article about data migration using fpart and rsync (GNU Linux Magazine #164 -
October 2013, french) :
http://connect.ed-diamond.com/GNU-Linux-Magazine/GLMF-164/Parallelisez-vos-transferts-de-fichiers

The partition problem is detailed here :
http://en.wikipedia.org/wiki/Partition_problem

and here :
http://en.wikipedia.org/wiki/Bin_packing_problem

I am sure you will also be interested in :
https://github.com/jbd/packo

which was developed by Jean-Baptiste Denis as the original proof of concept.
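
To give a rough idea of the partition problem mentioned above, here is a minimal
sketch of a greedy approach : entries are taken largest first and each one goes
to the lightest partition so far. It is *not* fpart's actual algorithm, only an
illustration ; the function name greedy_parts and its "index: path" output
format are assumptions made for this example. It reads "size(blank)path" lines
(such as those printed by du(1)) on stdin :

```shell
# Hypothetical helper (not part of fpart) : greedy "largest first" packing.
# $1    : number of partitions to produce
# stdin : "size<blank>path" lines, e.g. du(1) output
greedy_parts() {
    sort -rn | awk -v n="$1" '
        # Start with n empty partitions
        BEGIN { for (i = 0; i < n; i++) total[i] = 0 }
        {
            # Find the partition with the smallest total so far
            best = 0
            for (i = 1; i < n; i++)
                if (total[i] < total[best]) best = i
            total[best] += $1
            # Strip the size column and keep the path (which may contain blanks)
            sub(/^[ \t]*[^ \t]+[ \t]+/, "")
            printf "%d: %s\n", best, $0
        }'
}
```

It could be tried this way, the index prefix allowing a single partition to be
selected with grep(1) :

$ du -k * | greedy_parts 3 | grep '^0:'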
Author / Licence : ****************** Fpart has been written by Ganaël LAPLANCHE and is available under the BSD license (see COPYING for details). Thanks to Jean-Baptiste Denis for having given me the idea of this program ! Contributions : *************** FTS code comes from FreeBSD : lib/libc/gen/fts.c -> fts.c include/fts.h -> fts.h and is available under the BSD license. fpart-0.9.2/TODO000066400000000000000000000021271247141164700133370ustar00rootroot00000000000000TODO (ideas) : ************** Fpart: - Switch to ftw(3) for portability ? - Factorize code (err(3), ...) - Add an option to specify that a directory matching a path or a pattern should not be split but treated as a file entry - Add constraints, e.g. : force hardlinks to belong to the same partition - Make the program multithreaded for FS analysis - Display/accept size in a human-friendly format - Add timestamps before/after hooks - Improve sort by using, e.g. : http://en.wikipedia.org/wiki/External_sorting - Display total size in final status - As a second pass, remove partitions with no file (e.g. option -n with too many partitions, special partition #0 for option -s, ...) Autotools: - Use config.h (for program version, data types, header inclusion, build options info [...]) - Add -Wall to CFLAGS when possible (gcc, clang) Fpsync: - Add an option to exit after fpart pass (to generate jobs only) - Authorize rsync:// and ssh source and target URLs - Check if src_dir/ is the same on all workers (using a stat fingerprint) and use the same method for dst_dir/ (stop using a witness file) fpart-0.9.2/configure.ac000066400000000000000000000046231247141164700151400ustar00rootroot00000000000000AC_PREREQ([2.69]) AC_INIT([fpart], [0.9.2], [ganael.laplanche@martymac.org]) AC_CONFIG_SRCDIR([src/fpart.h]) AM_INIT_AUTOMAKE([foreign -Wall -Werror]) # Checks for programs. AC_PROG_CC([cc gcc]) AC_PROG_CC_C99 AM_PROG_CC_C_O AC_PROG_INSTALL # Checks for log10() in -lm AC_CHECK_LIB(m, log10) # Checks for header files. 
AC_CHECK_HEADERS([fcntl.h paths.h stdlib.h string.h strings.h sys/mount.h sys/param.h sys/statfs.h sys/statvfs.h sys/vfs.h unistd.h]) # Checks for typedefs, structures, and compiler characteristics. AC_TYPE_PID_T AC_TYPE_SIZE_T # Checks for library functions. AC_FUNC_FORK AC_FUNC_LSTAT_FOLLOWS_SLASHED_SYMLINK AC_FUNC_MALLOC AC_FUNC_REALLOC AC_CHECK_FUNCS([bzero fchdir getcwd memmove memset strchr strerror strrchr strtol]) # OS detection AC_CANONICAL_HOST case "${host_os}" in solaris*) host_os_solaris=true ;; linux*) host_os_linux=true ;; esac # Default value for embedded fts support dflt_embfts=false # Enabled on Solaris if test x$host_os_solaris = xtrue then dflt_embfts=true fi # Embedded fts option AC_ARG_ENABLE([embfts], [ --enable-embfts enable embedded fts], [case "${enableval}" in yes) embfts=true ;; no) embfts=false ;; *) AC_MSG_ERROR([bad value ${enableval} for --enable-embfts]) ;; esac],[embfts=${dflt_embfts}]) # Static build option AC_ARG_ENABLE([static], [ --enable-static build static binary], [case "${enableval}" in yes) static=true ;; no) static=false ;; *) AC_MSG_ERROR([bad value ${enableval} for --enable-static]) ;; esac],[static=false]) # Debug option AC_ARG_ENABLE([debug], [ --enable-debug turn on debugging], [case "${enableval}" in yes) debug=true ;; no) debug=false ;; *) AC_MSG_ERROR([bad value ${enableval} for --enable-debug]) ;; esac],[debug=false]) # Disable large file support on Linux when not using embedded fts # as Linux' fts.h cannot be used with _FILE_OFFSET_BITS==64 if test x$embfts = xfalse then if test x$host_os_linux = xtrue then enable_largefile=no fi fi # Large file support AC_SYS_LARGEFILE # Automake output AM_CONDITIONAL([DEBUG], [test x$debug = xtrue]) AM_CONDITIONAL([EMBEDDED_FTS], [test x$embfts = xtrue]) AM_CONDITIONAL([SOLARIS], [test x$host_os_solaris = xtrue]) AM_CONDITIONAL([LINUX], [test x$host_os_linux = xtrue]) AM_CONDITIONAL([STATIC], [test x$static = xtrue]) #AC_CONFIG_HEADERS([src/config.h]) AC_CONFIG_FILES([Makefile 
src/Makefile tools/Makefile man/Makefile]) AC_OUTPUT fpart-0.9.2/contribs/000077500000000000000000000000001247141164700144705ustar00rootroot00000000000000fpart-0.9.2/contribs/package/000077500000000000000000000000001247141164700160635ustar00rootroot00000000000000fpart-0.9.2/contribs/package/rpm/000077500000000000000000000000001247141164700166615ustar00rootroot00000000000000fpart-0.9.2/contribs/package/rpm/fpart.spec000066400000000000000000000025461247141164700206600ustar00rootroot00000000000000Name: fpart Version: 0.9.2 Release: 1%{?dist} Group: Applications/System License: BSD Summary: Fpart is a tool that sorts files and packs them into bags. URL: http://contribs.martymac.org BuildRoot: %{_tmppath}/%{name}-%{version}-%{release}-build-%(%{__id_u} -n) Source0: http://contribs.martymac.org/fpart/%{name}-%{version}.tar.gz %description Fpart is a tool that helps you sort file trees and pack them into bags (called "partitions"). It is developed in C and available under the BSD license. It splits a list of directories and file trees into a certain number of partitions, trying to produce partitions with the same size and number of files. It can also produce partitions with a given number of files or a limited size. %prep %setup -q %configure %build %{__make} %install %{__make} install DESTDIR=${RPM_BUILD_ROOT} %{__rm} -rf %{RPM_BUILD_ROOT} mkdir -p %{RPM_BUILD_ROOT}%{_docdir} %{__install} -p -m 0644 Changelog COPYING README TODO %{RPM_BUILD_ROOT}%{_docdir}/ %clean %{__rm} -rf %{RPM_BUILD_ROOT} %files %defattr(-,root,root,0755) %doc Changelog COPYING README TODO %{_mandir}/man1/fpart.1* %{_mandir}/man1/fpsync.1* %{_bindir}/fpart %{_bindir}/fpsync %changelog * Tue Feb 17 2015 Ganael Laplanche - 0.9.2 - Version 0.9.2 * Mon Feb 16 2015 Tru Huynh - 0.9.1 - Initial build of the package.
fpart-0.9.2/man/000077500000000000000000000000001247141164700134205ustar00rootroot00000000000000fpart-0.9.2/man/Makefile.am000066400000000000000000000000411247141164700154470ustar00rootroot00000000000000dist_man_MANS = fpart.1 fpsync.1 fpart-0.9.2/man/fpart.1000066400000000000000000000211621247141164700146200ustar00rootroot00000000000000.\" Copyright (c) 2011-2015 Ganael LAPLANCHE .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .Dd November 18, 2011 .Dt FPART 1 .Os .Sh NAME .Nm fpart .Nd Sort and pack files into partitions. 
.Sh SYNOPSIS .Nm .Op Fl h .Op Fl V .Fl n Ar num | Fl f Ar files | Fl s Ar size .Op Fl i Ar infile .Op Fl a .Op Fl o Ar outfile .Op Fl e .Op Fl v .Op Fl l .Op Fl b .Op Fl y Ar pattern .Op Fl Y Ar pattern .Op Fl x Ar pattern .Op Fl X Ar pattern .Op Fl z .Op Fl Z .Op Fl d Ar depth .Op Fl D .Op Fl L .Op Fl w Ar cmd .Op Fl W Ar cmd .Op Fl p Ar num .Op Fl q Ar num .Op Fl r Ar num .Op Ar FILE or DIR... .Sh DESCRIPTION The .Nm utility helps you sort file trees and pack them into bags (called "partitions"). .Sh GENERAL OPTIONS .Bl -tag -width indent .It Fl h Print help .It Fl V Print version .El .Sh PARTITION CONTROL .Bl -tag -width indent .It Ic -n Ar num Create exactly .Ar num partitions and try to generate partitions with the same size and number of files. This option cannot be used in conjunction with .Fl f , .Fl s or .Fl L . .It Ic -f Ar files Create partitions containing at most .Ar files files. This option can be used in conjunction with .Fl s and .Fl L . .It Ic -s Ar size Create partitions with a maximum size of .Ar size bytes. With this option, partition 0 may be used to handle files that do not fit in a regular partition, given the provided .Ar size limit. This option can be used in conjunction with .Fl f and .Fl L . .El .Sh INPUT CONTROL .Bl -tag -width indent .It Ic -i Ar infile Read file list from .Ar infile . If .Ar infile is .Dq Li "-" , then list is read from stdin. .It Fl a Input contains arbitrary values; just sort them (do not crawl filesystem). Input must follow the .Dq Li "size(blank)path" scheme. This option is incompatible with crawling-related options. .El .Sh OUTPUT CONTROL .Bl -tag -width indent .It Ic -o Ar outfile Output partitions' contents to .Ar outfile template. Multiple files will be generated given that template. Each .Ar outfile will get partition number as a suffix. 
If .Ar outfile is .Dq Li "-" , then partitions will be printed to stdout, with partition number used as a prefix (so you can grep partitions you are interested in, or do whatever you want). .It Fl e When adding directories (see .Sx DIRECTORY HANDLING ), add an ending .Dq Li "/" to each directory entry. .It Fl v Verbose mode (may be specified more than once). .El .Sh FILESYSTEM CRAWLING CONTROL .Bl -tag -width indent .It Fl l Follow symbolic links (default: do not follow). .It Fl b Do not cross filesystem boundaries (default: cross). .It Ic -y Ar pattern Include files or directories matching .Ar pattern only (and discard all other files). This option may be specified several times. It does not apply when computing size of directories to be added as leaf entries (the computed size will then include every file within directory). .It Ic -Y Ar pattern Same as .Fl y but case insensitive. This option may not be available on your platform (at least FreeBSD and GNU/Linux support it, Solaris does not). .It Ic -x Ar pattern Exclude files or directories matching .Ar pattern . This option can be used in conjunction with .Fl y and .Fl Y . In this case, exclusion is performed after. This option may be specified several times. It does not apply when computing size of directories to be added as leaf entries (the computed size will then include every file within directory). .It Ic -X Ar pattern Same as .Fl x but case insensitive. This option may not be available on your platform (at least FreeBSD and GNU/Linux support it, Solaris does not). .El .Sh DIRECTORY HANDLING .Bl -tag -width indent .It Fl z Pack empty directories. By default, fpart will pack files only (except when using the .Fl d or .Fl D options). This option can be useful for tools such as .Xr rsync 1 to be able to recreate a full file tree when used with fpart (e.g. using rsync's --files-from option). See the .Fl Z option to also pack un-readable directories. .It Fl Z Implies .Fl z . 
Treat un-readable directories as empty, causing them to be packed anyway. .It Ic -d Ar depth After a certain .Ar depth , pack directories instead of files (directories themselves will be added to partitions, instead of their content). .It Fl D Implies .Fl z . Pack leaf directories: if a directory contains files only, it will be packed as a single entry. .El .Sh LIVE MODE .Bl -tag -width indent .It Fl L Live mode (default: disabled). When using this mode, partitions will be generated while crawling filesystem. This option saves time and memory, but does not give partition 0 a special meaning (see option .Fl s ). As a consequence, it can generate partitions larger than the size specified with option .Fl s . This option can be used in conjunction with options .Fl f and .Fl s , but not with option .Fl n . .It Ic -w Ar cmd When using live mode, execute .Ar cmd when starting a new partition (before having opened next output file, if any). .Ar cmd is run in a specific environment that provides several variables describing the state of the program: .Ev FPART_HOOKTYPE ("pre-part" or "post-part"), .Ev FPART_PARTFILENAME (current partition's output file name), .Ev FPART_PARTNUMBER (current partition number), .Ev FPART_PARTSIZE (current partition size), .Ev FPART_PARTNUMFILES (number of files in current partition), .Ev FPART_PID (PID of fpart). Note that variables may or may not be defined, depending on requested options and current partition's state when the hook is triggered. Also, note that hooks are executed in a synchronous way while crawling filesystem, so 1) avoid executing commands that take a long time to return as it slows down filesystem crawling and 2) do not presume cwd (PWD) is the one fpart has been started in, as it is regularly changed to speed up crawling (use absolute paths within hooks). .It Ic -W Ar cmd Same as .Fl w , but executes .Ar cmd when finishing a partition (after having closed last output file, if any).
.El .Sh SIZE HANDLING .Bl -tag -width indent .It Ic -p Ar num Preload each partition with .Ar num bytes. .It Ic -q Ar num Overload each file size with .Ar num bytes. .It Ic -r Ar num Round each file size up to next .Ar num bytes multiple. This option can be used in conjunction with overloading, which is done *before* rounding. .El .Sh EXAMPLES Here are some examples: .Bl -tag -width indent .It Li "fpart -n 3 -o var-parts /var" Produce 3 partitions, with (hopefully) the same size and number of files. Three files: var-parts.0, var-parts.1 and var-parts.2 are generated as output. .It Li "fpart -s 4724464025 -o music-parts /path/to/music ./*.mp3" Produce partitions of 4.4 GB, containing music files from /path/to/music as well as MP3 files from current directory; with such a partition size, each partition content will be ready to be burnt to a DVD. Files music-parts.0 to music-parts.n, are generated as output. .It Li "find /usr ! -type d | fpart -f 10000 -i - /home | grep '^0:'" Produce partitions containing 10000 files each by examining /usr first and then /home and display only partition 0 on stdout. .It Li "du * | fpart -n 2 -a" Produce two partitions by using .Xr du 1 output. Fpart will not examine the file system but instead use arbitrary values printed by .Xr du 1 and sort them. .El .Sh SEE ALSO .Xr du 1 , .Xr find 1 , .Xr fpsync 1 , .Xr grep 1 , .Xr rsync 1 .Sh AUTHOR, AVAILABILITY Fpart has been written by .An Gana\(:el LAPLANCHE and is available under the BSD license on .Lk http://contribs.martymac.org .Sh BUGS No bug known (yet). fpart-0.9.2/man/fpsync.1000066400000000000000000000165411247141164700150130ustar00rootroot00000000000000.\" Copyright (c) 2015 Ganael LAPLANCHE .\" All rights reserved. .\" .\" Redistribution and use in source and binary forms, with or without .\" modification, are permitted provided that the following conditions .\" are met: .\" 1. 
Redistributions of source code must retain the above copyright .\" notice, this list of conditions and the following disclaimer. .\" 2. Redistributions in binary form must reproduce the above copyright .\" notice, this list of conditions and the following disclaimer in the .\" documentation and/or other materials provided with the distribution. .\" .\" THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND .\" ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE .\" IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE .\" ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE .\" FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL .\" DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS .\" OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) .\" HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT .\" LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY .\" OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF .\" SUCH DAMAGE. .\" .Dd January 27, 2015 .Dt FPSYNC 1 .Os .Sh NAME .Nm fpsync .Nd Synchronize directories in parallel using fpart and rsync. .Sh SYNOPSIS .Nm .Op Fl h .Op Fl v .Op Fl n Ar jobs .Op Fl f Ar files .Op Fl s Ar size .Op Fl w Ar wrks .Op Fl d Ar shdir .Op Fl t Ar tmpdir .Op Fl r Ar jobname .Op Fl o Ar rsyncopts .Op Fl O Ar fpartopts .Op Fl S .Pa src_dir/ .Pa dst_dir/ .Sh DESCRIPTION The .Nm tool synchronizes directories in parallel using .Xr fpart 1 and .Xr rsync 1 . It computes subsets of .Pa src_dir/ and spawns .Xr rsync 1 jobs to synchronize them to .Pa dst_dir/ . .sp Synchronization jobs can be executed either locally or remotely (using SSH workers, see option .Fl w ) and are executed on-the-fly while filesystem crawling goes on. This makes .Nm a good tool for migrating large filesystems. 
.Sh OPTIONS .Bl -tag -width indent .It Fl h Print help .It Fl v Verbose mode. Can be specified several times to increase verbosity level. .It Ic -n Ar jobs Start .Ar jobs concurrent sync jobs (either locally or remotely, see below). Default: .Sy 2 .It Ic -f Ar files Transfer at most .Ar files files per sync job. Default: .Sy 2000 .It Ic -s Ar size Transfer at most .Ar size bytes per sync job. .br Default: .Sy 4294967296 (4 GB) .It Ic -w Ar wrks Use remote SSH .Ar wrks to synchronize files. Synchronization jobs are executed locally when this option is not set. .Ar wrks is a space-separated list of login@machine connection strings and can be specified several times. You must be allowed to connect to those machines using an SSH key to avoid user interaction. .It Ic -d Ar shdir Set .Nm shared directory to .Ar shdir . This option is mandatory when using SSH workers and set by default to .Ar tmpdir when running locally. The specified directory must be an absolute path; it will be used to handle communications with SSH hosts (sharing partitions and log files) and, as a consequence, must be made available to all participating hosts (e.g. through a r/w NFS mount), including the master one running .Nm . .It Ic -t Ar tmpdir Set .Nm temporary directory to .Ar tmpdir . This directory remains local and does not need to be shared amongst SSH workers when using the .Fl w option. Default: .Pa /tmp/fpsync .It Ic -r Ar jobname Resume job .Ar jobname and restart synchronizing remaining partitions from a previous run. .Ar jobname can be obtained using verbose mode (see option .Fl v ) . Note that filesystem crawling is skipped when resuming a previous run. As a consequence, options .Fl f , .Fl s , .Fl o , .Fl O , .Fl S , .Pa src_dir/ , and .Pa dst_dir/ are ignored. .It Ic -o Ar rsyncopts Override default .Xr rsync 1 options with .Ar rsyncopts . Use this option with care as certain options are incompatible with a parallel usage (e.g. .Cm --delete ) . 
Default: .Cm -av --numeric-ids .It Ic -O Ar fpartopts Override default .Xr fpart 1 options with .Ar fpartopts . .br Default: .Cm -x .zfs -x .snapshot* -x .ckpt .It Fl S Sudo mode. Use .Xr sudo 8 for filesystem crawling and synchronizations. .It Pa src_dir/ Source directory. It must be absolute and available on all participating hosts (including the master one, running .Nm ) . .It Pa dst_dir/ Destination directory. It must be absolute and available on all participating workers. .El .Sh RUNNING FPSYNC Each .Nm run generates a unique .Ar jobname , which is displayed in verbose mode (see option .Fl v ) and within log files. You can use that .Ar jobname to resume a previous run (see option .Fl r ) . .Nm will then restart synchronizing data from the parts that were being synchronized at the time it stopped. .sp This unique feature gives the administrator the ability to stop .Nm and restart it later, without having to restart the whole filesystem crawling and synchronization process. Note that resuming is only possible when the filesystem crawling step has finished. .sp During synchronization, you can press CTRL-C to interrupt the process. The first CTRL-C prevents new synchronizations from being submitted and the process will wait for current synchronizations to be finished before exiting. If you press CTRL-C again, current synchronizations will be killed and .Nm will exit immediately. .sp On certain systems, CTRL-T can be pressed to get the status of current and remaining parts to be synchronized. This can also be achieved by sending a SIGINFO to the .Nm process. .sp Whether you use verbose mode or not, everything is logged within .Pa shdir/log/ . .Sh EXAMPLES Here are some examples: .Bl -tag -width indent .It Li "fpsync -n 4 /usr/src/ /var/src/" .sp Synchronizes .Pa /usr/src/ to .Pa /var/src/ using 4 local jobs. 
.It Li "fpsync -n 2 -w login@machine1 -w login@machine2 -d /mnt/fpsync /mnt/src/ /mnt/dst/" .sp Synchronizes .Pa /mnt/src/ to .Pa /mnt/dst/ using 2 concurrent jobs executed remotely on 2 SSH workers (machine1 and machine2). The shared directory is set to .Pa /mnt/fpsync and mounted on the machine running .Nm , as well as on machine1 and machine2. The source directory .Pa ( /mnt/src/ ) is also available on those 3 machines, while the destination directory .Pa ( /mnt/dst/ ) is mounted on SSH workers only (machine1 and machine2). .El .Sh LIMITS Parallelizing .Xr rsync 1 makes several options not usable, such as .Cm --delete . If your source directory is live while .Nm is running, you will have to delete extra files from destination directory. This is usually done by using a final -offline- .Xr rsync 1 pass that will use this option. .sp .Nm enqueues synchronization jobs on disk, within the .Pa tmpdir/queue directory. Be careful to host this queue on a filesystem that can handle fine-grained mtime timestamps (i.e. with a sub-second precision) if you want the queue to be processed in order when .Xr fpart 1 generates several jobs per second. On FreeBSD, .Xr VFS 9 timestamps' precision can be tuned using the 'vfs.timestamp_precision' sysctl. See .Xr vfs_timestamp 9 . .Sh SEE ALSO .Xr fpart 1 , .Xr rsync 1 , .Xr sudo 8 .Sh AUTHOR, AVAILABILITY Fpsync has been written by .An Gana\(:el LAPLANCHE and is available under the BSD license on .Lk http://contribs.martymac.org .Sh BUGS No bug known (yet). fpart-0.9.2/src/000077500000000000000000000000001247141164700134345ustar00rootroot00000000000000fpart-0.9.2/src/Makefile.am000066400000000000000000000010231247141164700154640ustar00rootroot00000000000000# Disable -I. 
AUTOMAKE_OPTIONS = nostdinc bin_PROGRAMS = fpart fpart_SOURCES = types.h utils.c utils.h options.c options.h partition.c partition.h file_entry.c file_entry.h dispatch.c dispatch.h fpart.c fpart.h fpart_CFLAGS = fpart_LDFLAGS = if DEBUG fpart_CFLAGS += -g -DDEBUG endif if EMBEDDED_FTS fpart_SOURCES += fts.c fts.h fpart_CFLAGS += -DEMBED_FTS endif if SOLARIS fpart_CFLAGS += -D_POSIX_C_SOURCE=200112L -D__EXTENSIONS__ endif if LINUX fpart_CFLAGS += -D_GNU_SOURCE endif if STATIC fpart_LDFLAGS += -static endif fpart-0.9.2/src/dispatch.c000066400000000000000000000253161247141164700154060ustar00rootroot00000000000000/*- * Copyright (c) 2011-2015 Ganael LAPLANCHE * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
*/ #include "types.h" #include "utils.h" #include "dispatch.h" /* NULL */ #include <stdlib.h> /* fprintf(3) */ #include <stdio.h> /* assert(3) */ #include <assert.h> /***************************** File entry dispatch functions *****************************/ /* Sort an array of file_entry pointers given file size, biggest to smallest This function is used by qsort(3) */ int sort_file_entry_p(const void *a, const void *b) { assert((a != NULL) && (*(struct file_entry **)a != NULL)); assert((b != NULL) && (*(struct file_entry **)b != NULL)); if((*(struct file_entry **)a)->size < (*(struct file_entry **)b)->size) return (1); else if((*(struct file_entry **)a)->size > (*(struct file_entry **)b)->size) return (-1); else return (0); } /* Dispatch file_entries by assigning them a partition number - a sorted array of file entry pointers must be provided as an argument - as well as a pointer to a double linked-list of partitions' head that will contain the total amount of data of each assigned file */ int dispatch_file_entry_p_by_size(struct file_entry **file_entry_p, fnum_t num_entries, struct partition *head, pnum_t num_parts) { assert(head != NULL); assert(num_parts > 0); fnum_t i = 0; /* check the index bound before dereferencing file_entry_p[i] */ while((file_entry_p != NULL) && (i < num_entries) && (file_entry_p[i] != NULL)) { /* find most appropriate partition */ pnum_t smallest_partition_index = find_smallest_partition_index(head); struct partition *smallest_partition = get_partition_at(head, smallest_partition_index); if(smallest_partition == NULL) { fprintf(stderr, "%s(): get_partition_at() returned NULL\n", __func__); return (1); } /* assign it */ file_entry_p[i]->partition_index = smallest_partition_index; #if defined(DEBUG) fprintf(stderr, "%s(): %s added to partition %d (%p)\n", __func__, file_entry_p[i]->path, file_entry_p[i]->partition_index, smallest_partition); #endif /* and load the partition with file size */ smallest_partition->size += file_entry_p[i]->size; smallest_partition->num_files++; i++; } return (0); } /* Dispatch empty file_entries (files 
with zero-byte size) from head by assigning them a more appropriate partition number. The idea is to get empty files spread across partitions and not get them all in the last one. - a double-linked list of partitions is provided as an argument */ int dispatch_empty_file_entries(struct file_entry *head, fnum_t num_entries, struct partition *part_head, pnum_t num_parts) { assert(head != NULL); assert(part_head != NULL); assert(num_parts > 0); /* backup head */ struct file_entry *start = head; /* first pass: count empty files */ fnum_t num_empty_entries = 0; while(head != NULL) { if(head->size == 0) num_empty_entries++; head = head->nextp; } /* go back to original head */ head = start; /* compute mean file entry number per partition */ fnum_t mean_files = (num_entries / num_parts); /* be sure to start at first partition as we are handling indexes here. Starting at first file_entry is not necessary as we would not corrupt any information, but just skip a few file entries */ rewind_list(part_head); /* for each empty file, associate it with the first partition having fewer files than mean_files */ while(head != NULL) { if(head->size == 0) { /* empty file found */ pnum_t j = 0; /* backup partition head */ struct partition *part_start = part_head; while(part_head != NULL) { if((head->partition_index != j) && (part_head->num_files < mean_files)) { struct partition *previous_partition = get_partition_at(part_start, head->partition_index); if(previous_partition == NULL) { fprintf(stderr, "%s(): " "get_partition_at() returned NULL\n", __func__); return (1); } /* unload the previous part (only affects the number of files, size does not change) */ previous_partition->num_files--; /* load the new part */ part_head->num_files++; /* assign new index to file entry */ head->partition_index = j; #if defined(DEBUG) fprintf(stderr, "%s(): %s (empty) re-assigned to partition " "%d (%p)\n", __func__, head->path, head->partition_index, part_head); #endif break; } part_head = 
part_head->nextp; j++; } /* go back to original head */ part_head = part_start; } head = head->nextp; } return (0); } /* Dispatch file_entries from head into partitions that will be created on-the-fly, with respect to max_entries (maximum files per partition) and max_size (max partition size) - must be called with *part_head == NULL (will create partitions) - if max_size > 0, partition 0 will hold files that cannot be held by other partitions - returns the number of parts created with part_head set to the last element */ pnum_t dispatch_file_entries_by_limits(struct file_entry *head, struct partition **part_head, fnum_t max_entries, fsize_t max_size, struct program_options *options) { assert(head != NULL); assert((part_head != NULL) && (*part_head == NULL)); assert(max_size >= 0); assert(options != NULL); /* number of partitions created, our return value */ pnum_t num_parts_created = 0; /* when max_size is used, create a default partition (partition 0) that will hold files that do not match criteria */ if(max_size > 0) { if(add_partitions(part_head, 1, options) != 0) { fprintf(stderr, "%s(): cannot init default partition\n", __func__); return (num_parts_created); } num_parts_created++; } struct partition *default_partition = *part_head; pnum_t default_partition_index = 0; /* create a first data partition and keep a pointer to it */ if(add_partitions(part_head, 1, options) != 0) { fprintf(stderr, "%s(): cannot create partition\n", __func__); return (num_parts_created); } num_parts_created++; struct partition *start_partition = *part_head; pnum_t start_partition_index = num_parts_created - 1; /* for each file, associate it with current partition (or default_partition) */ pnum_t current_partition_index = start_partition_index; while(head != NULL) { /* max_size provided and file size > max_size, associate file to default partition */ if((max_size > 0) && (head->size > max_size)) { head->partition_index = default_partition_index; default_partition->size += 
head->size; default_partition->num_files++; #if defined(DEBUG) fprintf(stderr, "%s(): %s added to partition %d (%p)\n", __func__, head->path, head->partition_index, default_partition); #endif } else { /* examine each partition */ while((*part_head) != NULL) { /* if file does not fit in partition */ if(((max_entries > 0) && (((*part_head)->num_files + 1) > max_entries)) || ((max_size > 0) && (((*part_head)->size + head->size) > max_size))) { /* and we reached last partition, chain a new one */ if((*part_head)->nextp == NULL) { if(add_partitions(part_head, 1, options) != 0) { fprintf(stderr, "%s(): cannot create partition\n", __func__); return (num_parts_created); } num_parts_created++; #if defined(DEBUG) fprintf(stderr, "%s(): chained one partition (%p)\n", __func__, *part_head); #endif } else { /* examine next partition */ *part_head = (*part_head)->nextp; } current_partition_index++; } else { /* file fits in current partition, add it */ head->partition_index = current_partition_index; (*part_head)->size += head->size; (*part_head)->num_files++; #if defined(DEBUG) fprintf(stderr, "%s(): %s added to partition %d (%p)\n", __func__, head->path, head->partition_index, *part_head); #endif /* examine next file */ break; } } assert(*part_head != NULL); } /* examine next file */ head = head->nextp; /* come back to the first partition */ current_partition_index = start_partition_index; *part_head = start_partition; } return (num_parts_created); } fpart-0.9.2/src/dispatch.h000066400000000000000000000037741247141164700154170ustar00rootroot00000000000000/*- * Copyright (c) 2011-2015 Ganael LAPLANCHE * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef _DISPATCH_H #define _DISPATCH_H #include "types.h" #include "partition.h" #include "file_entry.h" #include "options.h" int sort_file_entry_p(const void *a, const void *b); int dispatch_file_entry_p_by_size(struct file_entry **file_entry_p, fnum_t num_entries, struct partition *head, pnum_t num_parts); int dispatch_empty_file_entries(struct file_entry *head, fnum_t num_entries, struct partition *part_head, pnum_t num_parts); pnum_t dispatch_file_entries_by_limits(struct file_entry *head, struct partition **part_head, fnum_t max_entries, fsize_t max_size, struct program_options *options); #endif /* _DISPATCH_H */ fpart-0.9.2/src/file_entry.c000066400000000000000000001041331247141164700157420ustar00rootroot00000000000000/*- * Copyright (c) 2011-2015 Ganael LAPLANCHE * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. 
Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
*/ #include "types.h" #include "utils.h" #include "options.h" #include "file_entry.h" /* stat(2) */ #include <sys/types.h> #include <sys/stat.h> /* fprintf(3) */ #include <stdio.h> /* strerror(3), strlen(3) */ #include <string.h> /* errno */ #include <errno.h> /* malloc(3), exit(3) */ #include <stdlib.h> /* fts(3) */ #include <sys/types.h> #include <sys/stat.h> #if defined(EMBED_FTS) #include "fts.h" #else #include <fts.h> #endif /* open(2) */ #include <fcntl.h> /* close(2) */ #include <unistd.h> /* assert(3) */ #include <assert.h> /* wait(2) */ #include <sys/wait.h> /* _PATH_BSHELL */ #if defined(__sun) || defined(__sun__) #define _PATH_BSHELL "/bin/sh" #else #include <paths.h> #endif /* signal(3) */ #include <signal.h> #if defined(__GNUC__) static void kill_child(int) __attribute__((__noreturn__)); #endif /**************************** Live-mode related functions ****************************/ /* Status */ static struct { int fd; /* current file descriptor (if option '-o' used) */ char *filename; /* current file name */ pnum_t partition_index; /* current partition number */ fsize_t partition_size; /* current partition size */ fnum_t partition_num_files; /* number of files in current partition */ int exit_summary; /* 0 if every single hook exit()ed with 0, else 1 */ pid_t child_pid; } live_status = { STDOUT_FILENO, NULL, 0, 0, 0, 0, -1 }; /* Signal handler, kills child and exit() */ static void kill_child(int sig) { #if defined(DEBUG) fprintf(stderr, "%s(): killing child process %d\n", __func__, live_status.child_pid); #endif if(live_status.child_pid > 1) { killpg(live_status.child_pid, sig ? 
sig : SIGTERM); waitpid(live_status.child_pid, NULL, 0); } exit(EXIT_FAILURE); } /* Executes 'cmd' and waits for it to terminate - returns 0 if cmd has been executed and its return code was 0, else returns 1 */ int fpart_hook(const char *cmd, const struct program_options *options, const char *live_filename, const pnum_t *live_partition_index, const fsize_t *live_partition_size, const fnum_t *live_num_files) { assert(cmd != NULL); assert(options != NULL); int retval = 0; /* env variables' names */ char *env_fpart_hooktype_name = "FPART_HOOKTYPE"; char *env_fpart_partfilename_name = "FPART_PARTFILENAME"; char *env_fpart_partnumber_name = "FPART_PARTNUMBER"; char *env_fpart_partsize_name = "FPART_PARTSIZE"; char *env_fpart_partnumfiles_name = "FPART_PARTNUMFILES"; char *env_fpart_pid_name = "FPART_PID"; /* env variables' values */ char *env_fpart_hooktype_string = NULL; char *env_fpart_partfilename_string = NULL; char *env_fpart_partnumber_string = NULL; char *env_fpart_partsize_string = NULL; char *env_fpart_partnumfiles_string = NULL; char *env_fpart_pid_string = NULL; /* XXX As setenv(3)/unsetenv(3) are not available on all platforms, and there does not seem to be a standard way of unsetting variables through putenv(3), clone current environment (to avoid working on environ(7)) and add fpart variables. 
This is a convenient way of starting from a clean environment and add only needed FPART_* variables for each hook (putenv(3) would leave variables from a hook to another, even if next hooks do not need them) */ char **envp = clone_env(); if(envp == NULL) return (1); size_t malloc_size = 1; /* empty string */ /* determine the kind of hook we are in */ if(cmd == options->pre_part_hook) { assert(live_partition_index != NULL); assert(live_partition_size != NULL); assert(live_num_files != NULL); if(options->verbose >= OPT_VERBOSE) fprintf(stderr, "Executing pre-part #%d hook: '%s'\n", *live_partition_index, cmd); /* FPART_HOOKTYPE (pre-part) */ malloc_size = strlen(env_fpart_hooktype_name) + 1 + strlen("pre-part") + 1; if_not_malloc(env_fpart_hooktype_string, malloc_size, retval = 1; goto cleanup; ) snprintf(env_fpart_hooktype_string, malloc_size, "%s=%s", env_fpart_hooktype_name, "pre-part"); if(push_env(env_fpart_hooktype_string, &envp) != 0) { retval = 1; goto cleanup; } } else if(cmd == options->post_part_hook) { assert(live_partition_index != NULL); assert(live_partition_size != NULL); assert(live_num_files != NULL); if(options->verbose >= OPT_VERBOSE) fprintf(stderr, "Executing post-part #%d hook: '%s'\n", *live_partition_index, cmd); /* FPART_HOOKTYPE (post-part) */ malloc_size = strlen(env_fpart_hooktype_name) + 1 + strlen("post-part") + 1; if_not_malloc(env_fpart_hooktype_string, malloc_size, retval = 1; goto cleanup; ) snprintf(env_fpart_hooktype_string, malloc_size, "%s=%s", env_fpart_hooktype_name, "post-part"); if(push_env(env_fpart_hooktype_string, &envp) != 0) { retval = 1; goto cleanup; } } /* FPART_PARTFILENAME */ if(live_filename != NULL) { malloc_size = strlen(env_fpart_partfilename_name) + 1 + strlen(live_filename) + 1; if_not_malloc(env_fpart_partfilename_string, malloc_size, retval = 1; goto cleanup; ) snprintf(env_fpart_partfilename_string, malloc_size, "%s=%s", env_fpart_partfilename_name, live_filename); 
if(push_env(env_fpart_partfilename_string, &envp) != 0) { retval = 1; goto cleanup; } } /* FPART_PARTNUMBER */ if(live_partition_index != NULL) { malloc_size = strlen(env_fpart_partnumber_name) + 1 + get_num_digits(*live_partition_index) + 1; if_not_malloc(env_fpart_partnumber_string, malloc_size, retval = 1; goto cleanup; ) snprintf(env_fpart_partnumber_string, malloc_size, "%s=%d", env_fpart_partnumber_name, *live_partition_index); if(push_env(env_fpart_partnumber_string, &envp) != 0) { retval = 1; goto cleanup; } } /* FPART_PARTSIZE */ if(live_partition_size != NULL) { malloc_size = strlen(env_fpart_partsize_name) + 1 + get_num_digits(*live_partition_size) + 1; if_not_malloc(env_fpart_partsize_string, malloc_size, retval = 1; goto cleanup; ) snprintf(env_fpart_partsize_string, malloc_size, "%s=%lld", env_fpart_partsize_name, *live_partition_size); if(push_env(env_fpart_partsize_string, &envp) != 0) { retval = 1; goto cleanup; } } /* FPART_PARTNUMFILES */ if(live_num_files != NULL) { malloc_size = strlen(env_fpart_partnumfiles_name) + 1 + get_num_digits(*live_num_files) + 1; if_not_malloc(env_fpart_partnumfiles_string, malloc_size, retval = 1; goto cleanup; ) snprintf(env_fpart_partnumfiles_string, malloc_size, "%s=%llu", env_fpart_partnumfiles_name, *live_num_files); if(push_env(env_fpart_partnumfiles_string, &envp) != 0) { retval = 1; goto cleanup; } } /* FPART_PID */ pid_t fpart_pid = getpid(); malloc_size = strlen(env_fpart_pid_name) + 1 + get_num_digits(fpart_pid) + 1; if_not_malloc(env_fpart_pid_string, malloc_size, retval = 1; goto cleanup; ) snprintf(env_fpart_pid_string, malloc_size, "%s=%d", env_fpart_pid_name, (int)fpart_pid); if(push_env(env_fpart_pid_string, &envp) != 0) { retval = 1; goto cleanup; } /* fork child process */ int child_status = 0; switch(live_status.child_pid = fork()) { case -1: /* error */ fprintf(stderr, "fork(): %s\n", strerror(errno)); retval = 1; break; case 0: /* child */ { /* become process group leader */ 
if(setpgid(live_status.child_pid, 0) != 0) { fprintf(stderr, "%s(): setpgid(): %s\n", __func__, strerror(errno)); exit(EXIT_FAILURE); } execle(_PATH_BSHELL, "sh", "-c", cmd, (char *)NULL, envp); /* if reached, error */ exit(EXIT_FAILURE); } default: /* parent */ { /* child-killer signal handler */ signal(SIGTERM, kill_child); signal(SIGINT, kill_child); signal(SIGHUP, kill_child); pid_t wpid; do { wpid = wait(&child_status); } while((wpid != live_status.child_pid) && (wpid != -1)); /* reset actions for signals */ signal(SIGTERM, SIG_DFL); signal(SIGINT, SIG_DFL); signal(SIGHUP, SIG_DFL); /* reset child PID */ live_status.child_pid = -1; if(wpid == -1) { fprintf(stderr, "%s(): wait(): %s\n", __func__, strerror(errno)); retval = 1; } else { if(WIFEXITED(child_status)) { /* collect exit code */ if(WEXITSTATUS(child_status) != 0) { if(options->verbose >= OPT_VERBOSE) fprintf(stderr, "Hook '%s' exited with error %d\n", cmd, WEXITSTATUS(child_status)); retval = 1; } } else { if(options->verbose >= OPT_VERBOSE) fprintf(stderr, "Hook '%s' terminated prematurely\n", cmd); retval = 1; } } } break; } cleanup: if(envp != NULL) free(envp); if(env_fpart_hooktype_string != NULL) free(env_fpart_hooktype_string); if(env_fpart_partfilename_string != NULL) free(env_fpart_partfilename_string); if(env_fpart_partnumber_string != NULL) free(env_fpart_partnumber_string); if(env_fpart_partsize_string != NULL) free(env_fpart_partsize_string); if(env_fpart_partnumfiles_string != NULL) free(env_fpart_partnumfiles_string); if(env_fpart_pid_string != NULL) free(env_fpart_pid_string); return (retval); } /* Print or add a file entry (redirector) */ int handle_file_entry(struct file_entry **head, char *path, fsize_t size, struct program_options *options) { assert(options != NULL); if(options->live_mode == OPT_LIVEMODE) return (live_print_file_entry(path, size, options->out_filename, options)); else return (add_file_entry(head, path, size, options)); } /* Print a file entry */ int 
live_print_file_entry(char *path, fsize_t size, char *out_template, struct program_options *options) { assert(path != NULL); assert(options != NULL); assert(options->live_mode == OPT_LIVEMODE); /* beginning of a new partition */ if(live_status.partition_num_files == 0) { /* very first pass of first partition, preload first partition */ if(live_status.partition_index == 0) live_status.partition_size = options->preload_size; if(out_template != NULL) { /* compute live_status.filename "out_template.i\0" */ size_t malloc_size = strlen(out_template) + 1 + get_num_digits(live_status.partition_index) + 1; if_not_malloc(live_status.filename, malloc_size, return (1); ) snprintf(live_status.filename, malloc_size, "%s.%d", out_template, live_status.partition_index); } /* execute pre-partition hook */ if(options->pre_part_hook != NULL) { if(fpart_hook(options->pre_part_hook, options, live_status.filename, &live_status.partition_index, &live_status.partition_size, &live_status.partition_num_files) != 0) live_status.exit_summary = 1; } if(out_template != NULL) { /* open file */ if((live_status.fd = open(live_status.filename, O_WRONLY|O_CREAT|O_TRUNC, 0660)) < 0) { fprintf(stderr, "%s: %s\n", live_status.filename, strerror(errno)); free(live_status.filename); live_status.filename = NULL; return (1); } } } /* count file in */ live_status.partition_size += round_num(size + options->overload_size, options->round_size); live_status.partition_num_files++; if(out_template == NULL) { /* no template provided, just print to stdout */ fprintf(stdout, "%d (%lld): %s\n", live_status.partition_index, size, path); } else { /* print to fd */ size_t to_write = strlen(path); if((write(live_status.fd, path, to_write) != (ssize_t)to_write) || (write(live_status.fd, "\n", 1) != 1)) { fprintf(stderr, "%s\n", strerror(errno)); /* do not close(livefd) and free(live_status.filename) here because it will be useful and free'd in uninit_file_entries() below */ return (1); } } /* display added filename */ 
if(options->verbose >= OPT_VVERBOSE) fprintf(stderr, "%s\n", path); /* if end of partition reached */ if(((options->max_entries > 0) && (live_status.partition_num_files >= options->max_entries)) || ((options->max_size > 0) && (live_status.partition_size >= options->max_size))) { /* display added partition */ if(options->verbose >= OPT_VERBOSE) fprintf(stderr, "Filled part #%d: size = %lld, %lld file(s)\n", live_status.partition_index, live_status.partition_size, live_status.partition_num_files); /* close fd or flush buffer */ if(out_template == NULL) fflush(stdout); else close(live_status.fd); /* execute post-partition hook */ if(options->post_part_hook != NULL) { if(fpart_hook(options->post_part_hook, options, live_status.filename, &live_status.partition_index, &live_status.partition_size, &live_status.partition_num_files) != 0) live_status.exit_summary = 1; } if(out_template != NULL) { free(live_status.filename); live_status.filename = NULL; } /* reset current partition status */ live_status.partition_index++; live_status.partition_size = options->preload_size; live_status.partition_num_files = 0; } return (0); } /********************************************************* Double-linked list of file_entries manipulation functions *********************************************************/ /* Add a file entry to a double-linked list of file_entries - if head is NULL, creates a new file entry ; if not, chains a new file entry to it - returns with head set to the newly added element */ int add_file_entry(struct file_entry **head, char *path, fsize_t size, struct program_options *options) { assert(head != NULL); assert(path != NULL); assert(options != NULL); assert(options->live_mode == OPT_NOLIVEMODE); struct file_entry **current = head; /* current file_entry pointer address */ struct file_entry *previous = NULL; /* previous file_entry pointer */ /* backup current structure pointer and initialize a new structure */ previous = *current; if_not_malloc(*current, 
sizeof(struct file_entry), return (1); ) /* set head on first call */ if(*head == NULL) *head = *current; /* set current file data */ size_t malloc_size = strlen(path) + 1; if_not_malloc((*current)->path, malloc_size, free(*current); *current = previous; return (1); ) snprintf((*current)->path, malloc_size, "%s", path); (*current)->size = size + options->overload_size; (*current)->size = round_num((*current)->size, options->round_size); /* set current file entry's index and pointers */ (*current)->partition_index = 0; /* set during dispatch */ (*current)->nextp = NULL; /* set in next pass (see below) */ (*current)->prevp = previous; /* set previous' nextp pointer */ if(previous != NULL) previous->nextp = *current; /* display added filename */ if(options->verbose >= OPT_VVERBOSE) fprintf(stderr, "%s\n", (*current)->path); return (0); } /* Compare entries to list directories first - compar() function used by fts_open() when in leaf dirs mode */ static int #if (defined(__linux__) || defined(__NetBSD__)) && !defined(EMBED_FTS) fts_dirsfirst(const FTSENT **a, const FTSENT **b) #else fts_dirsfirst(const FTSENT * const *a, const FTSENT * const *b) #endif { assert(a != NULL); assert((*a) != NULL); assert(b != NULL); assert((*b) != NULL); if(((*a)->fts_info == FTS_NS) || ((*a)->fts_info == FTS_NSOK) || ((*b)->fts_info == FTS_NS) || ((*b)->fts_info == FTS_NSOK)) return (0); /* place non-directory entries after directory ones */ if(S_ISDIR((*a)->fts_statp->st_mode)) if(!S_ISDIR((*b)->fts_statp->st_mode)) return (-1); else return (0); else if(S_ISDIR((*b)->fts_statp->st_mode)) return (1); else return (0); } /* Initialize a double-linked list of file_entries from a path - file_path may be a file or directory - if head is NULL, creates a new list ; if not, chains a new list to it - increments *count with the number of files found - returns != 0 if critical error - returns with head set to the last element added */ int init_file_entries(char *file_path, struct file_entry **head, 
fnum_t *count, struct program_options *options) { assert(file_path != NULL); assert(head != NULL); assert(count != NULL); assert(options != NULL); /* prepare fts */ FTS *ftsp = NULL; FTSENT *p = NULL; int fts_options = (options->follow_symbolic_links == OPT_FOLLOWSYMLINKS) ? FTS_LOGICAL : FTS_PHYSICAL; fts_options |= (options->cross_fs_boundaries == OPT_NOCROSSFSBOUNDARIES) ? FTS_XDEV : 0; char *fts_argv[] = { file_path, NULL }; if((ftsp = fts_open(fts_argv, fts_options, (options->leaf_dirs == OPT_LEAFDIRS) ? &fts_dirsfirst : NULL)) == NULL) { fprintf(stderr, "%s: fts_open()\n", file_path); return (0); } /* current dir state */ unsigned char curdir_empty = 1; unsigned char curdir_dirsfound = 0; unsigned char curdir_addme = 0; while((p = fts_read(ftsp)) != NULL) { switch (p->fts_info) { /* misc errors */ case FTS_ERR: fprintf(stderr, "%s: %s\n", p->fts_path, strerror(p->fts_errno)); continue; /* errors for which we know there is a file or directory within current directory */ case FTS_DNR: /* un-readable directory */ { fprintf(stderr, "%s: %s\n", p->fts_path, strerror(p->fts_errno)); /* if requested by the -Z option, add directory anyway by simulating FTS_DP */ if(options->dnr_empty == OPT_DNREMPTY) { curdir_empty = 1; goto add_directory; } /* else, mark current dir as not empty */ curdir_empty = 0; curdir_dirsfound = 1; continue; } case FTS_NS: /* stat() error */ fprintf(stderr, "%s: %s\n", p->fts_path, strerror(p->fts_errno)); case FTS_NSOK: /* no stat(2) available (not requested) */ /* mark current dir as not empty */ curdir_empty = 0; continue; case FTS_DC: fprintf(stderr, "%s: filesystem loop detected\n", p->fts_path); case FTS_DOT: /* ignore "." and ".." 
*/ continue; case FTS_DP: { add_directory: /* if empty_dirs display requested and current dir is empty, add directory entry */ if((options->empty_dirs == OPT_EMPTYDIRS) && curdir_empty) curdir_addme = 1; /* if leaf dirs mode activated and current directory is a leaf, add directory entry */ if((options->leaf_dirs == OPT_LEAFDIRS) && (!curdir_dirsfound)) curdir_addme = 1; /* add directory if necessary */ if(curdir_addme) { fsize_t curdir_size = 0; char *curdir_entry_path = NULL; /* check for name validity regarding include/exclude options; directory is a leaf */ if(!valid_filename(p->fts_name, options, 1)) { if(options->verbose >= OPT_VERBOSE) fprintf(stderr, "Skipping directory: '%s'\n", p->fts_path); goto reset_directory; } /* count ending '/' and '\0', even if an ending '/' is not added */ size_t malloc_size = p->fts_pathlen + 1 + 1; if_not_malloc(curdir_entry_path, malloc_size, fts_close(ftsp); return (1); ) /* add slash if requested and necessary */ if((options->add_slash == OPT_ADDSLASH) && (p->fts_pathlen > 0) && (p->fts_path[p->fts_pathlen - 1] != '/')) snprintf(curdir_entry_path, malloc_size, "%s/", p->fts_path); else snprintf(curdir_entry_path, malloc_size, "%s", p->fts_path); /* compute current dir size */ if((p->fts_level > 0) && (options->cross_fs_boundaries == OPT_NOCROSSFSBOUNDARIES) && (p->fts_parent->fts_statp->st_dev != p->fts_statp->st_dev)) /* when using option -b, set size to 0 for mountpoint (non-root) directories */ curdir_size = 0; else if(curdir_empty) curdir_size = 0; else curdir_size = get_size(p->fts_accpath, p->fts_statp, options); /* add or display it */ if(handle_file_entry (head, curdir_entry_path, curdir_size, options) == 0) (*count)++; else { fprintf(stderr, "%s(): cannot add file entry\n", __func__); free(curdir_entry_path); fts_close(ftsp); return (1); } /* cleanup */ free(curdir_entry_path); } /* reset parent (now current) dir state */ reset_directory: curdir_empty = 0; curdir_dirsfound = 1; curdir_addme = 0; continue; } case 
FTS_D: { curdir_empty = 1; /* enter directory, mark it as empty */ curdir_dirsfound = 0; /* no dirs found yet */ /* check for name validity regarding exclude options */ if(!valid_filename(p->fts_name, options, 0)) { if(options->verbose >= OPT_VERBOSE) fprintf(stderr, "Skipping directory: '%s'\n", p->fts_path); fts_set(ftsp, p, FTS_SKIP); continue; } /* if dir_depth requested and reached, skip descendants but add directory entry (in post order) */ if((options->dir_depth != OPT_NODIRDEPTH) && (p->fts_level >= options->dir_depth)) { fts_set(ftsp, p, FTS_SKIP); curdir_addme = 1; /* as we have not crawled into this directory yet, remove the empty flag to allow a call to get_size() in FTS_DP */ curdir_empty = 0; } continue; } default: /* XXX default means remaining file types: FTS_F, FTS_SL, FTS_SLNONE, FTS_DEFAULT */ { curdir_empty = 0; /* mark current dir as non empty */ /* check for name validity regarding include/exclude options,*/ if(!valid_filename(p->fts_name, options, 1)) { if(options->verbose >= OPT_VERBOSE) fprintf(stderr, "Skipping file: '%s'\n", p->fts_path); continue; } /* skip file entry when in leaf dirs mode (option -D) and no directory has been found in current directory (i.e. we are in a leaf directory). 
We must have visited all directories first ; this is achieved by using a compar() function with fts_open() */ if((options->leaf_dirs == OPT_LEAFDIRS) && (!curdir_dirsfound)) continue; /* add or display it */ if(handle_file_entry (head, p->fts_path, get_size(p->fts_accpath, p->fts_statp, options), options) == 0) (*count)++; else { fprintf(stderr, "%s(): cannot add file entry\n", __func__); fts_close(ftsp); return (1); } continue; } } } if(errno != 0) { fprintf(stderr, "%s: fts_read()\n", file_path); fts_close(ftsp); return (1); } if(fts_close(ftsp) < 0) fprintf(stderr, "%s: fts_close()\n", file_path); return (0); } /* Un-initialize a double-linked list of file_entries */ void uninit_file_entries(struct file_entry *head, struct program_options *options) { assert(options != NULL); /* be sure to start from last file entry */ fastfw_list(head); struct file_entry *current = head; struct file_entry *prev = NULL; while(current != NULL) { if(current->path != NULL) { free(current->path); } prev = current->prevp; free(current); current = prev; } /* live mode */ if(options->live_mode == OPT_LIVEMODE) { /* display added partition */ if((options->verbose >= OPT_VERBOSE) && (live_status.partition_num_files > 0)) fprintf(stderr, "Filled part #%d: size = %lld, %lld file(s)\n", live_status.partition_index, live_status.partition_size, live_status.partition_num_files); /* flush buffer or close last file if necessary */ if(options->out_filename == NULL) fflush(stdout); else if(live_status.filename != NULL) close(live_status.fd); /* execute last post-partition hook */ if((options->post_part_hook != NULL) && (live_status.partition_num_files > 0)) { if(fpart_hook(options->post_part_hook, options, live_status.filename, &live_status.partition_index, &live_status.partition_size, &live_status.partition_num_files) != 0) live_status.exit_summary = 1; } if(live_status.filename != NULL) { free(live_status.filename); live_status.filename = NULL; } /* print hooks' exit codes summary */ 
if((options->verbose >= OPT_VERBOSE) && (live_status.exit_summary != 0)) fprintf(stderr, "Warning: at least one hook exited with error !\n"); } return; } /* Print a double-linked list of file_entries from head - if no filename template given, print to stdout */ int print_file_entries(struct file_entry *head, char *out_template, pnum_t num_parts) { assert(head != NULL); assert(num_parts > 0); /* no template provided, just print to stdout and return */ if(out_template == NULL) { while(head != NULL) { fprintf(stdout, "%d (%lld): %s\n", head->partition_index, head->size, head->path); head = head->nextp; } return (0); } /* a template has been provided; to avoid opening too many files, open chunks of FDs and do as many passes as necessary */ struct file_entry *start = head; pnum_t current_chunk = 0; /* current chunk */ pnum_t current_file_entry = 0; /* current file entry within chunk */ assert(PRINT_FE_CHUNKS > 0); while(((current_chunk * PRINT_FE_CHUNKS) + current_file_entry) < num_parts) { int fd[PRINT_FE_CHUNKS]; /* our file descriptors */ /* open as many file descriptors as needed to print num_parts partitions, up to PRINT_FE_CHUNKS per pass */ while((current_file_entry < PRINT_FE_CHUNKS) && (((current_chunk * PRINT_FE_CHUNKS) + current_file_entry) < num_parts)) { /* compute out_filename "out_template.i\0" */ char *out_filename = NULL; size_t malloc_size = strlen(out_template) + 1 + get_num_digits ((current_chunk * PRINT_FE_CHUNKS) + current_file_entry) + 1; if_not_malloc(out_filename, malloc_size, /* close all open descriptors and return */ pnum_t i; for(i = 0; i < current_file_entry; i++) close(fd[i]); return (1); ) snprintf(out_filename, malloc_size, "%s.%d", out_template, (current_chunk * PRINT_FE_CHUNKS) + current_file_entry); if((fd[current_file_entry] = open(out_filename, O_WRONLY|O_CREAT|O_TRUNC, 0660)) < 0) { fprintf(stderr, "%s: %s\n", out_filename, strerror(errno)); free(out_filename); /* close all open descriptors and return */ pnum_t i; for(i = 0; i < current_file_entry; i++)
close(fd[i]); return (1); } free(out_filename); current_file_entry++; } while(head != NULL) { if((head->partition_index >= (current_chunk * PRINT_FE_CHUNKS)) && (head->partition_index < ((current_chunk + 1) * PRINT_FE_CHUNKS))) { size_t to_write = strlen(head->path); if((write(fd[head->partition_index % PRINT_FE_CHUNKS], head->path, to_write) != (ssize_t)to_write) || (write(fd[head->partition_index % PRINT_FE_CHUNKS], "\n", 1) != 1)) { fprintf(stderr, "%s\n", strerror(errno)); /* close all open descriptors */ pnum_t i; for(i = 0; (i < PRINT_FE_CHUNKS) && (((current_chunk * PRINT_FE_CHUNKS) + i) < num_parts); i++) close(fd[i]); return (1); } } head = head->nextp; } /* come back to first entry */ head = start; /* close file descriptors */ pnum_t i; for(i = 0; (i < PRINT_FE_CHUNKS) && (((current_chunk * PRINT_FE_CHUNKS) + i) < num_parts); i++) close(fd[i]); current_file_entry = 0; current_chunk++; } return (0); } /*************************************************** Array of file_entry pointers manipulation functions ***************************************************/ /* Initialize an array of file_entry pointers from a double-linked list of file_entries (head) */ void init_file_entry_p(struct file_entry **file_entry_p, fnum_t num_entries, struct file_entry *head) { assert(file_entry_p != NULL); fnum_t i = 0; while((head != NULL) && (file_entry_p != NULL) && (i < num_entries)) { file_entry_p[i] = head; head = head->nextp; i++; } return; } fpart-0.9.2/src/file_entry.h000066400000000000000000000056671247141164700157630ustar00rootroot00000000000000/*- * Copyright (c) 2011-2015 Ganael LAPLANCHE * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef _FILE_ENTRY_H #define _FILE_ENTRY_H #include "types.h" #include "options.h" #include #if !defined(PRINT_FE_CHUNKS) #define PRINT_FE_CHUNKS 32 /* files per chunk when flushing partitions to disk */ #endif /* A file entry */ struct file_entry; struct file_entry { char *path; /* file name */ fsize_t size; /* size in bytes */ pnum_t partition_index; /* assigned partition index */ struct file_entry* nextp; /* next file_entry */ struct file_entry* prevp; /* previous one */ }; int fpart_hook(const char *cmd, const struct program_options *options, const char *live_filename, const pnum_t *live_partition_index, const fsize_t *live_partition_size, const fnum_t *live_num_files); int handle_file_entry(struct file_entry **head, char *path, fsize_t size, struct program_options *options); int live_print_file_entry(char *path, fsize_t size, char *out_template, struct program_options *options); int add_file_entry(struct file_entry **head, char *path, fsize_t size, struct program_options *options); int 
init_file_entries(char *file_path, struct file_entry **head, fnum_t *count, struct program_options *options); void uninit_file_entries(struct file_entry *head, struct program_options *options); int print_file_entries(struct file_entry *head, char *out_template, pnum_t num_parts); void init_file_entry_p(struct file_entry **file_entry_p, fnum_t num_entries, struct file_entry *head); #endif /* _FILE_ENTRY_H */ fpart-0.9.2/src/fpart.c000066400000000000000000000671431247141164700147270ustar00rootroot00000000000000/*- * Copyright (c) 2011-2015 Ganael LAPLANCHE * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
 */

#include "fpart.h"
#include "types.h"
#include "utils.h"
#include "options.h"
#include "partition.h"
#include "file_entry.h"
#include "dispatch.h"

/* NULL, exit(3) */
#include <stdlib.h>

/* fprintf(3), fopen(3), fclose(3), fgets(3), feof(3) */
#include <stdio.h>

/* getopt(3) */
#include <unistd.h>
#if !defined(__SunOS_5_9)
#include <getopt.h>
#endif

/* strlen(3) */
#include <string.h>

/* bzero(3) */
#include <strings.h>

/* errno */
#include <errno.h>

/* assert(3) */
#include <assert.h>

/* Print version */
static void
version(void)
{
    fprintf(stderr, "fpart v" FPART_VERSION "\n"
        "Copyright (c) 2011-2015 Ganael LAPLANCHE \n"
        "WWW: http://contribs.martymac.org\n");
    fprintf(stderr, "Build options: debug=");
#if defined(DEBUG)
    fprintf(stderr, "yes, fts=");
#else
    fprintf(stderr, "no, fts=");
#endif
#if defined(EMBED_FTS)
    fprintf(stderr, "embedded\n");
#else
    fprintf(stderr, "system\n");
#endif
}

/* Print usage */
static void
usage(void)
{
    fprintf(stderr, "Usage: fpart [OPTIONS] -n num | -f files | -s size "
        "[FILE or DIR...]\n");
    fprintf(stderr, "Sort and pack files into partitions.\n");
    fprintf(stderr, "\n");
    fprintf(stderr, "General options:\n");
    fprintf(stderr, " -h\tthis help\n");
    fprintf(stderr, " -V\tprint version\n");
    fprintf(stderr, "\n");
    fprintf(stderr, "Partition control:\n");
    fprintf(stderr, " -n\tpack files into partitions\n");
    fprintf(stderr, " -f\tlimit partitions to files\n");
    fprintf(stderr, " -s\tlimit partitions to bytes\n");
    fprintf(stderr, "\n");
    fprintf(stderr, "Input control:\n");
    fprintf(stderr, " -i\tread file list from "
        "(stdin if '-' is specified)\n");
    fprintf(stderr, " -a\tinput contains arbitrary values "
        "(do not crawl filesystem)\n");
    fprintf(stderr, "\n");
    fprintf(stderr, "Output control:\n");
    fprintf(stderr, " -o\toutput partitions to template "
        "(stdout if '-' is specified)\n");
    fprintf(stderr, " -e\tadd ending slash to directories\n");
    fprintf(stderr, " -v\tverbose mode (may be specified more than once to "
        "increase verbosity)\n");
    fprintf(stderr, "\n");
    fprintf(stderr, "Filesystem crawling control:\n");
fprintf(stderr, " -l\tfollow symbolic links\n"); fprintf(stderr, " -b\tdo not cross filesystem boundaries\n"); fprintf(stderr, " -y\tinclude files matching only (may be " "specified more than once)\n"); #if defined(_HAS_FNM_CASEFOLD) fprintf(stderr, " -Y\tsame as -y, but ignore case\n"); #endif fprintf(stderr, " -x\texclude files matching (may be specified " "more than once)\n"); #if defined(_HAS_FNM_CASEFOLD) fprintf(stderr, " -X\tsame as -x, but ignore case\n"); #endif fprintf(stderr, "\n"); fprintf(stderr, "Directory handling:\n"); fprintf(stderr, " -z\tpack empty directories " "(default: pack files only)\n"); fprintf(stderr, " -Z\ttreat un-readable directories as empty " "(implies -z)\n"); fprintf(stderr, " -d\tpack directories instead of files after a certain " "\n"); fprintf(stderr, " -D\tpack leaf directories (i.e. containing files only, " "implies -z)\n"); fprintf(stderr, "\n"); fprintf(stderr, "Live mode:\n"); fprintf(stderr, " -L\tlive mode: generate partitions during filesystem " "crawling\n"); fprintf(stderr, " -w\tpre-partition hook: execute at partition " "start\n"); fprintf(stderr, " -W\tpost-partition hook: execute at partition " "end\n"); fprintf(stderr, "\n"); fprintf(stderr, "Size handling:\n"); fprintf(stderr, " -p\tpreload each partition with bytes\n"); fprintf(stderr, " -q\toverload each file with bytes\n"); fprintf(stderr, " -r\tround each file size up to next bytes " "multiple\n"); fprintf(stderr, "\n"); fprintf(stderr, "Example: fpart -n 3 -o var-parts /var\n"); fprintf(stderr, "\n"); fprintf(stderr, "Please report bugs to Ganael LAPLANCHE " "\n"); return; } /* Handle one argument (either a path to crawl or an arbitrary value) and update file entries (head) - returns != 0 if a critical error occurred - returns with head set to the last element added - updates totalfiles with the number of elements added */ static int handle_argument(char *argument, fnum_t *totalfiles, struct file_entry **head, struct program_options *options) { 
assert(argument != NULL); assert(totalfiles != NULL); assert(head != NULL); assert(options != NULL); if(options->arbitrary_values == OPT_ARBITRARYVALUES) { /* handle arbitrary values */ fsize_t input_size = 0; char *input_path = NULL; if_not_malloc(input_path, strlen(argument) + 1, return (1); ) if(sscanf(argument, "%lld %[^\n]", &input_size, input_path) == 2) { if(handle_file_entry(head, input_path, input_size, options) == 0) (*totalfiles)++; else { fprintf(stderr, "%s(): cannot add file entry\n", __func__); free(input_path); return (1); } } else fprintf(stderr, "error parsing input values: %s\n", argument); /* cleanup */ free(input_path); } else { /* handle paths, must examine filesystem */ char *input_path = NULL; size_t input_path_len = strlen(argument); size_t malloc_size = input_path_len + 1; if_not_malloc(input_path, malloc_size, return (1); ) snprintf(input_path, malloc_size, "%s", argument); /* remove multiple ending slashes */ while((input_path_len > 1) && (input_path[input_path_len - 1] == '/') && (input_path[input_path_len - 2] == '/')) { input_path[input_path_len - 1] = '\0'; input_path_len--; } /* crawl path */ if(input_path[0] != '\0') { #if defined(DEBUG) fprintf(stderr, "init_file_entries(): examining %s\n", input_path); #endif if(init_file_entries(input_path, head, totalfiles, options) != 0) { fprintf(stderr, "%s(): cannot initialize file entries\n", __func__); free(input_path); return (1); } } /* cleanup */ free(input_path); } return (0); } /* Handle options parsing - initializes options structure using argc and argv (through pointers) - returns a value defined by the mask below */ static int handle_options(struct program_options *options, int *argcp, char ***argvp) { /* Return code mask */ #define FPART_OPTS_OK 0 /* OK */ #define FPART_OPTS_NOK 1 /* Error */ #define FPART_OPTS_EXIT 2 /* exit(3) required */ #define FPART_OPTS_USAGE 4 /* usage() call required */ #define FPART_OPTS_VERSION 8 /* version() call required */ assert(options != NULL); 
assert(argcp != NULL); assert(*argcp > 0); assert(argvp != NULL); assert(*argvp != NULL); /* Options handling */ extern char *optarg; extern int optind; int ch; while((ch = getopt(*argcp, *argvp, #if defined(_HAS_FNM_CASEFOLD) "?hVn:f:s:i:ao:evlby:Y:x:X:zZd:DLw:W:p:q:r:" #else "?hVn:f:s:i:ao:evlby:x:zZd:DLw:W:p:q:r:" #endif )) != -1) { switch(ch) { case '?': case 'h': return (FPART_OPTS_USAGE | FPART_OPTS_OK | FPART_OPTS_EXIT); case 'V': return (FPART_OPTS_VERSION | FPART_OPTS_OK | FPART_OPTS_EXIT); case 'n': { char *endptr = NULL; long num_parts = strtol(optarg, &endptr, 10); /* refuse values <= 0 and partially-converted arguments */ if((endptr == optarg) || (*endptr != '\0') || (num_parts <= 0)) return (FPART_OPTS_USAGE | FPART_OPTS_NOK | FPART_OPTS_EXIT); options->num_parts = (pnum_t)num_parts; break; } case 'f': { char *endptr = NULL; long long max_entries = strtoll(optarg, &endptr, 10); /* refuse values <= 0 and partially-converted arguments */ if((endptr == optarg) || (*endptr != '\0') || (max_entries <= 0)) return (FPART_OPTS_USAGE | FPART_OPTS_NOK | FPART_OPTS_EXIT); options->max_entries = (fnum_t)max_entries; break; } case 's': { char *endptr = NULL; long long max_size = strtoll(optarg, &endptr, 10); /* refuse values <= 0 and partially-converted arguments */ if((endptr == optarg) || (*endptr != '\0') || (max_size <= 0)) return (FPART_OPTS_USAGE | FPART_OPTS_NOK | FPART_OPTS_EXIT); options->max_size = (fsize_t)max_size; break; } case 'i': { /* check for empty argument */ if(strlen(optarg) == 0) break; /* replace previous filename if '-i' specified multiple times */ if(options->in_filename != NULL) free(options->in_filename); options->in_filename = abs_path(optarg); if(options->in_filename == NULL) { fprintf(stderr, "%s(): cannot determine absolute path for " "file '%s'\n", __func__, optarg); return (FPART_OPTS_NOK | FPART_OPTS_EXIT); } break; } case 'a': options->arbitrary_values = OPT_ARBITRARYVALUES; break; case 'o': { /* check for empty argument */ 
if(strlen(optarg) == 0) break; /* replace previous filename if '-o' specified multiple times */ if(options->out_filename != NULL) free(options->out_filename); /* '-' goes to stdout */ if((optarg[0] == '-') && (optarg[1] == '\0')) { options->out_filename = NULL; } else { options->out_filename = abs_path(optarg); if(options->out_filename == NULL) { fprintf(stderr, "%s(): cannot determine absolute path " "for file '%s'\n", __func__, optarg); return (FPART_OPTS_NOK | FPART_OPTS_EXIT); } } break; } case 'e': options->add_slash = OPT_ADDSLASH; break; case 'v': options->verbose++; break; case 'l': options->follow_symbolic_links = OPT_FOLLOWSYMLINKS; break; case 'b': options->cross_fs_boundaries = OPT_NOCROSSFSBOUNDARIES; break; case 'y': case 'Y': /* needs _HAS_FNM_CASEFOLD */ case 'x': case 'X': /* needs _HAS_FNM_CASEFOLD */ { char ***dst_list = NULL; unsigned int *dst_num = NULL; switch(ch) { case 'y': dst_list = &(options->include_files); dst_num = &(options->ninclude_files); break; case 'Y': dst_list = &(options->include_files_ci); dst_num = &(options->ninclude_files_ci); break; case 'x': dst_list = &(options->exclude_files); dst_num = &(options->nexclude_files); break; case 'X': dst_list = &(options->exclude_files_ci); dst_num = &(options->nexclude_files_ci); break; } /* check for empty argument */ if(strlen(optarg) == 0) break; /* push string */ if(str_push(dst_list, dst_num, optarg) != 0) return (FPART_OPTS_NOK | FPART_OPTS_EXIT); break; } case 'z': options->empty_dirs = OPT_EMPTYDIRS; break; case 'Z': options->dnr_empty = OPT_DNREMPTY; options->empty_dirs = OPT_EMPTYDIRS; break; case 'd': { char *endptr = NULL; long dir_depth = strtol(optarg, &endptr, 10); /* refuse values < 0 (-1 being used to disable this option) */ if((endptr == optarg) || (*endptr != '\0') || (dir_depth < 0)) return (FPART_OPTS_USAGE | FPART_OPTS_NOK | FPART_OPTS_EXIT); options->dir_depth = (int)dir_depth; break; } case 'D': options->leaf_dirs = OPT_LEAFDIRS; options->empty_dirs = 
OPT_EMPTYDIRS; break; case 'L': options->live_mode = OPT_LIVEMODE; break; case 'w': { /* check for empty argument */ size_t malloc_size = strlen(optarg) + 1; if(malloc_size <= 1) break; /* replace previous hook if '-w' specified multiple times */ if(options->pre_part_hook != NULL) free(options->pre_part_hook); if_not_malloc(options->pre_part_hook, malloc_size, return (FPART_OPTS_NOK | FPART_OPTS_EXIT); ) snprintf(options->pre_part_hook, malloc_size, "%s", optarg); break; } case 'W': { /* check for empty argument */ size_t malloc_size = strlen(optarg) + 1; if(malloc_size <= 1) break; /* replace previous hook if '-W' specified multiple times */ if(options->post_part_hook != NULL) free(options->post_part_hook); if_not_malloc(options->post_part_hook, malloc_size, return (FPART_OPTS_NOK | FPART_OPTS_EXIT); ) snprintf(options->post_part_hook, malloc_size, "%s", optarg); break; } case 'p': { char *endptr = NULL; long long preload_size = strtoll(optarg, &endptr, 10); /* refuse values <= 0 and partially-converted arguments */ if((endptr == optarg) || (*endptr != '\0') || (preload_size <= 0)) { fprintf(stderr, "Option -p requires a value greater than 0.\n"); return (FPART_OPTS_USAGE | FPART_OPTS_NOK | FPART_OPTS_EXIT); } options->preload_size = (fsize_t)preload_size; break; } case 'q': { char *endptr = NULL; long long overload_size = strtoll(optarg, &endptr, 10); /* refuse values <= 0 and partially-converted arguments */ if((endptr == optarg) || (*endptr != '\0') || (overload_size <= 0)) { fprintf(stderr, "Option -q requires a value greater than 0.\n"); return (FPART_OPTS_USAGE | FPART_OPTS_NOK | FPART_OPTS_EXIT); } options->overload_size = (fsize_t)overload_size; break; } case 'r': { char *endptr = NULL; long long round_size = strtoll(optarg, &endptr, 10); /* refuse values <= 1 and partially-converted arguments */ if((endptr == optarg) || (*endptr != '\0') || (round_size <= 1)) { fprintf(stderr, "Option -r requires a value greater than 1.\n"); return (FPART_OPTS_USAGE | 
FPART_OPTS_NOK | FPART_OPTS_EXIT); } options->round_size = (fsize_t)round_size; break; } } } *argcp -= optind; *argvp += optind; /* check for options consistency */ if((options->num_parts == DFLT_OPT_NUM_PARTS) && (options->max_entries == DFLT_OPT_MAX_ENTRIES) && (options->max_size == DFLT_OPT_MAX_SIZE)) { fprintf(stderr, "Please specify either -n, -f or -s.\n"); return (FPART_OPTS_USAGE | FPART_OPTS_NOK | FPART_OPTS_EXIT); } if((options->num_parts != DFLT_OPT_NUM_PARTS) && ((options->max_entries != DFLT_OPT_MAX_ENTRIES) || (options->max_size != DFLT_OPT_MAX_SIZE) || (options->live_mode != DFLT_OPT_LIVEMODE))) { fprintf(stderr, "Option -n is incompatible with options -f, -s and -L.\n"); return (FPART_OPTS_USAGE | FPART_OPTS_NOK | FPART_OPTS_EXIT); } if(options->arbitrary_values == OPT_ARBITRARYVALUES) { if((options->add_slash != DFLT_OPT_ADDSLASH) || (options->follow_symbolic_links != DFLT_OPT_FOLLOWSYMLINKS) || (options->cross_fs_boundaries != DFLT_OPT_CROSSFSBOUNDARIES) || (options->include_files != NULL) || (options->include_files_ci != NULL) || (options->exclude_files != NULL) || (options->exclude_files_ci != NULL) || (options->empty_dirs != DFLT_OPT_EMPTYDIRS) || (options->dnr_empty != DFLT_OPT_DNREMPTY) || (options->dir_depth != DFLT_OPT_DIR_DEPTH) || (options->leaf_dirs != DFLT_OPT_LEAFDIRS)) { fprintf(stderr, "Option -a is incompatible with crawling-related options.\n"); return (FPART_OPTS_USAGE | FPART_OPTS_NOK | FPART_OPTS_EXIT); } } if((options->live_mode == OPT_NOLIVEMODE) && ((options->pre_part_hook != NULL) || (options->post_part_hook != NULL))) { fprintf(stderr, "Hooks can only be used with option -L.\n"); return (FPART_OPTS_USAGE | FPART_OPTS_NOK | FPART_OPTS_EXIT); } if((options->in_filename == NULL) && (*argcp <= 0)) { /* no file specified, force stdin */ char *opt_input = "-"; size_t malloc_size = strlen(opt_input) + 1; if_not_malloc(options->in_filename, malloc_size, return (FPART_OPTS_NOK | FPART_OPTS_EXIT); ) snprintf(options->in_filename, 
malloc_size, "%s", opt_input); } return (FPART_OPTS_OK); } int main(int argc, char **argv) { fnum_t totalfiles = 0; /****************** Handle options ******************/ /* Program options */ struct program_options options; /* Set default options */ init_options(&options); /* Parse and initialize options */ int options_init_res = handle_options(&options, &argc, &argv); if(options_init_res & FPART_OPTS_USAGE) usage(); if(options_init_res & FPART_OPTS_VERSION) version(); if(options_init_res & FPART_OPTS_EXIT) { uninit_options(&options); exit(options_init_res & FPART_OPTS_NOK ? EXIT_FAILURE : EXIT_SUCCESS); } /************** Handle stdin ***************/ /* our main double-linked file list */ struct file_entry *head = NULL; if(options.verbose >= OPT_VERBOSE) fprintf(stderr, "Examining filesystem...\n"); /* work on each file provided through input file (or stdin) */ if(options.in_filename != NULL) { /* handle fd opening */ FILE *in_fp = NULL; if((options.in_filename[0] == '-') && (options.in_filename[1] == '\0')) { /* working from stdin */ in_fp = stdin; } else { /* working from a filename */ if((in_fp = fopen(options.in_filename, "r")) == NULL) { fprintf(stderr, "%s: %s\n", options.in_filename, strerror(errno)); uninit_options(&options); exit(EXIT_FAILURE); } } /* read fd and do the work */ char line[MAX_LINE_LENGTH]; char *line_end_p = NULL; bzero(line, MAX_LINE_LENGTH); while(fgets(line, MAX_LINE_LENGTH, in_fp) != NULL) { /* replace '\n' with '\0' */ if ((line_end_p = strchr(line, '\n')) != NULL) *line_end_p = '\0'; if(handle_argument(line, &totalfiles, &head, &options) != 0) { uninit_file_entries(head, &options); uninit_options(&options); exit(EXIT_FAILURE); } /* cleanup */ bzero(line, MAX_LINE_LENGTH); } /* check for error reading input */ if(ferror(in_fp) != 0) { fprintf(stderr, "error reading from input stream\n"); } /* cleanup */ if(in_fp != NULL) fclose(in_fp); } /****************** Handle arguments *******************/ /* now, work on each path provided as 
arguments */ int i; for(i = 0 ; i < argc ; i++) { if(handle_argument(argv[i], &totalfiles, &head, &options) != 0) { uninit_file_entries(head, &options); uninit_options(&options); exit(EXIT_FAILURE); } } /**************** Display status *****************/ /* come back to the first element */ rewind_list(head); /* no file found or live mode */ if((totalfiles <= 0) || (options.live_mode == OPT_LIVEMODE)) { uninit_file_entries(head, &options); /* display status */ if(options.verbose >= OPT_VERBOSE) fprintf(stderr, "%lld file(s) found.\n", totalfiles); uninit_options(&options); exit(EXIT_SUCCESS); } /* display status */ if(options.verbose >= OPT_VERBOSE) { fprintf(stderr, "%lld file(s) found.\n", totalfiles); fprintf(stderr, "Sorting entries...\n"); } /************************************************ Sort entries with a fixed number of partitions *************************************************/ /* our list of partitions */ struct partition *part_head = NULL; pnum_t num_parts = options.num_parts; /* sort files with a fixed number of partitions */ if(options.num_parts != DFLT_OPT_NUM_PARTS) { /* create a fixed-size array of pointers to sort */ struct file_entry **file_entry_p = NULL; if_not_malloc(file_entry_p, sizeof(struct file_entry *) * totalfiles, uninit_file_entries(head, &options); uninit_options(&options); exit(EXIT_FAILURE); ) /* initialize array */ init_file_entry_p(file_entry_p, totalfiles, head); /* sort array */ qsort(&file_entry_p[0], totalfiles, sizeof(struct file_entry *), &sort_file_entry_p); /* create a double-linked list of partitions which will hold dispatched files */ if(add_partitions(&part_head, options.num_parts, &options) != 0) { fprintf(stderr, "%s(): cannot init list of partitions\n", __func__); uninit_partitions(part_head); free(file_entry_p); uninit_file_entries(head, &options); uninit_options(&options); exit(EXIT_FAILURE); } /* come back to the first element */ rewind_list(part_head); /* dispatch files */ if(dispatch_file_entry_p_by_size
(file_entry_p, totalfiles, part_head, options.num_parts) != 0) { fprintf(stderr, "%s(): unable to dispatch file entries\n", __func__); uninit_partitions(part_head); free(file_entry_p); uninit_file_entries(head, &options); uninit_options(&options); exit(EXIT_FAILURE); } /* re-dispatch empty files */ if(dispatch_empty_file_entries (head, totalfiles, part_head, options.num_parts) != 0) { fprintf(stderr, "%s(): unable to dispatch empty file entries\n", __func__); uninit_partitions(part_head); free(file_entry_p); uninit_file_entries(head, &options); uninit_options(&options); exit(EXIT_FAILURE); } /* cleanup */ free(file_entry_p); } /*************************************************** Sort entries with a variable number of partitions ****************************************************/ /* sort files with a file number or size limit per-partitions. In this case, partitions are dynamically-created */ else { if((num_parts = dispatch_file_entries_by_limits (head, &part_head, options.max_entries, options.max_size, &options)) == 0) { fprintf(stderr, "%s(): unable to dispatch file entries\n", __func__); uninit_partitions(part_head); uninit_file_entries(head, &options); uninit_options(&options); exit(EXIT_FAILURE); } /* come back to the first element (we may have exited with part_head set to partition 1, after default partition) */ rewind_list(part_head); } /*********************** Print result and exit ************************/ /* print result summary */ print_partitions(part_head); if(options.verbose >= OPT_VERBOSE) fprintf(stderr, "Writing output lists...\n"); /* print file entries */ print_file_entries(head, options.out_filename, num_parts); if(options.verbose >= OPT_VERBOSE) fprintf(stderr, "Cleaning up...\n"); /* free stuff */ uninit_partitions(part_head); uninit_file_entries(head, &options); uninit_options(&options); exit(EXIT_SUCCESS); } fpart-0.9.2/src/fpart.h000066400000000000000000000030451247141164700147230ustar00rootroot00000000000000/*- * Copyright (c) 2011-2015 
Ganael LAPLANCHE * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef _FPART_H #define _FPART_H #define FPART_VERSION "0.9.2" /* maximum input line length, including '\n' and '\0' */ #define MAX_LINE_LENGTH 2048 #endif /* _FPART_H */ fpart-0.9.2/src/fts.c000066400000000000000000001007541247141164700144030ustar00rootroot00000000000000/*- * Copyright (c) 1990, 1993, 1994 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. 
Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 4. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. * * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * $OpenBSD: fts.c,v 1.22 1999/10/03 19:22:22 millert Exp $ * * This version of fts has been patched to build on Solaris and GNU/Linux. 
* Solaris notes : * - no FTS_WHITEOUT (sparse files) support * GNU/Linux notes : * - the FTS_NOSTAT speedup trick is disabled * - no FTS_WHITEOUT (sparse files) support */ #if 0 #if defined(LIBC_SCCS) && !defined(lint) static char sccsid[] = "@(#)fts.c 8.6 (Berkeley) 8/14/94"; #endif /* LIBC_SCCS and not lint */ #endif #if defined(__FreeBSD__) #include __FBSDID("$FreeBSD: head/lib/libc/gen/fts.c 241010 2012-09-27 22:05:54Z jilles $"); #include "/usr/src/lib/libc/include/namespace.h" #else #define _open open #define _close close #define _fstat fstat #define _dirfd dirfd #endif #include #include #include #if defined(__sun) || defined(__sun__) #include #include #include #endif #if defined(__linux__) #include #endif #include #include #include #include "fts.h" #include #include #include #if defined(__FreeBSD__) #include "/usr/src/lib/libc/include/un-namespace.h" #include "/usr/src/lib/libc/gen/gen-private.h" #endif #if defined(__sun) || defined(__sun__) || defined(__linux__) void * reallocf(void *ptr, size_t size) { void *nptr; nptr = realloc(ptr, size); /* * When the System V compatibility option (malloc "V" flag) is * in effect, realloc(ptr, 0) frees the memory and returns NULL. * So, to avoid double free, call free() only when size != 0. * realloc(ptr, 0) can't fail when ptr != NULL. */ if (!nptr && ptr && size != 0) free(ptr); return (nptr); } #endif static FTSENT *fts_alloc(FTS *, char *, size_t); static FTSENT *fts_build(FTS *, int); static void fts_lfree(FTSENT *); static void fts_load(FTS *, FTSENT *); static size_t fts_maxarglen(char * const *); static void fts_padjust(FTS *, FTSENT *); static int fts_palloc(FTS *, size_t); static FTSENT *fts_sort(FTS *, FTSENT *, size_t); static int fts_stat(FTS *, FTSENT *, int); static int fts_safe_changedir(FTS *, FTSENT *, int, char *); #if !defined(__linux__) static int fts_ufslinks(FTS *, const FTSENT *); #endif #define ISDOT(a) (a[0] == '.' && (!a[1] || (a[1] == '.' 
&& !a[2]))) #define CLR(opt) (sp->fts_options &= ~(opt)) #define ISSET(opt) (sp->fts_options & (opt)) #define SET(opt) (sp->fts_options |= (opt)) #define FCHDIR(sp, fd) (!ISSET(FTS_NOCHDIR) && fchdir(fd)) /* fts_build flags */ #define BCHILD 1 /* fts_children */ #define BNAMES 2 /* fts_children, names only */ #define BREAD 3 /* fts_read */ /* * Internal representation of an FTS, including extra implementation * details. The FTS returned from fts_open points to this structure's * ftsp_fts member (and can be cast to an _fts_private as required) */ struct _fts_private { FTS ftsp_fts; #if defined(__sun) || defined(__sun__) struct statvfs ftsp_statvfs; #else struct statfs ftsp_statfs; #endif dev_t ftsp_dev; int ftsp_linksreliable; }; #if !defined(__linux__) /* * The "FTS_NOSTAT" option can avoid a lot of calls to stat(2) if it * knows that a directory could not possibly have subdirectories. This * is decided by looking at the link count: a subdirectory would * increment its parent's link count by virtue of its own ".." entry. * This assumption only holds for UFS-like filesystems that implement * links and directories this way, so we must punt for others. */ static const char *ufslike_filesystems[] = { "ufs", "zfs", "nfs", "nfs4", "ext2fs", 0 }; #endif FTS * fts_open(argv, options, compar) char * const *argv; int options; int (*compar)(const FTSENT * const *, const FTSENT * const *); { struct _fts_private *priv; FTS *sp; FTSENT *p, *root; FTSENT *parent, *tmp; size_t len, nitems; /* Options check. */ if (options & ~FTS_OPTIONMASK) { errno = EINVAL; return (NULL); } /* fts_open() requires at least one path */ if (*argv == NULL) { errno = EINVAL; return (NULL); } /* Allocate/initialize the stream. */ if ((priv = calloc(1, sizeof(*priv))) == NULL) return (NULL); sp = &priv->ftsp_fts; sp->fts_compar = compar; sp->fts_options = options; /* Shush, GCC. */ tmp = NULL; /* Logical walks turn on NOCHDIR; symbolic links are too hard. 
*/ if (ISSET(FTS_LOGICAL)) SET(FTS_NOCHDIR); /* * Start out with 1K of path space, and enough, in any case, * to hold the user's paths. */ if (fts_palloc(sp, MAX(fts_maxarglen(argv), MAXPATHLEN))) goto mem1; /* Allocate/initialize root's parent. */ if ((parent = fts_alloc(sp, "", 0)) == NULL) goto mem2; parent->fts_level = FTS_ROOTPARENTLEVEL; /* Allocate/initialize root(s). */ for (root = NULL, nitems = 0; *argv != NULL; ++argv, ++nitems) { /* Don't allow zero-length paths. */ if ((len = strlen(*argv)) == 0) { errno = ENOENT; goto mem3; } /* Bail out on allocation failure rather than dereferencing NULL. */ if ((p = fts_alloc(sp, *argv, len)) == NULL) goto mem3; p->fts_level = FTS_ROOTLEVEL; p->fts_parent = parent; p->fts_accpath = p->fts_name; p->fts_info = fts_stat(sp, p, ISSET(FTS_COMFOLLOW)); /* Command-line "." and ".." are real directories. */ if (p->fts_info == FTS_DOT) p->fts_info = FTS_D; /* * If comparison routine supplied, traverse in sorted * order; otherwise traverse in the order specified. */ if (compar) { p->fts_link = root; root = p; } else { p->fts_link = NULL; if (root == NULL) tmp = root = p; else { tmp->fts_link = p; tmp = p; } } } if (compar && nitems > 1) root = fts_sort(sp, root, nitems); /* * Allocate a dummy pointer and make fts_read think that we've just * finished the node before the root(s); set p->fts_info to FTS_INIT * so that everything about the "current" node is ignored. */ if ((sp->fts_cur = fts_alloc(sp, "", 0)) == NULL) goto mem3; sp->fts_cur->fts_link = root; sp->fts_cur->fts_info = FTS_INIT; /* * If using chdir(2), grab a file descriptor pointing to dot to ensure * that we can get back here; this could be avoided for some paths, * but almost certainly not worth the effort. Slashes, symbolic links, * and ".." are all fairly nasty problems. Note, if we can't get the * descriptor we run anyway, just more slowly. 
*/ if (!ISSET(FTS_NOCHDIR) && (sp->fts_rfd = _open(".", O_RDONLY #if !defined(__sun) && !defined(__sun__) | O_CLOEXEC #endif , 0)) < 0) SET(FTS_NOCHDIR); return (sp); mem3: fts_lfree(root); free(parent); mem2: free(sp->fts_path); mem1: free(sp); return (NULL); } static void fts_load(FTS *sp, FTSENT *p) { size_t len; char *cp; /* * Load the stream structure for the next traversal. Since we don't * actually enter the directory until after the preorder visit, set * the fts_accpath field specially so the chdir gets done to the right * place and the user can access the first node. From fts_open it's * known that the path will fit. */ len = p->fts_pathlen = p->fts_namelen; memmove(sp->fts_path, p->fts_name, len + 1); if ((cp = strrchr(p->fts_name, '/')) && (cp != p->fts_name || cp[1])) { len = strlen(++cp); memmove(p->fts_name, cp, len + 1); p->fts_namelen = len; } p->fts_accpath = p->fts_path = sp->fts_path; sp->fts_dev = p->fts_dev; } int fts_close(FTS *sp) { FTSENT *freep, *p; int saved_errno; /* * This still works if we haven't read anything -- the dummy structure * points to the root list, so we step through to the end of the root * list which has a valid parent pointer. */ if (sp->fts_cur) { for (p = sp->fts_cur; p->fts_level >= FTS_ROOTLEVEL;) { freep = p; p = p->fts_link != NULL ? p->fts_link : p->fts_parent; free(freep); } free(p); } /* Free up child linked list, sort array, path buffer. */ if (sp->fts_child) fts_lfree(sp->fts_child); if (sp->fts_array) free(sp->fts_array); free(sp->fts_path); /* Return to original directory, save errno if necessary. */ if (!ISSET(FTS_NOCHDIR)) { saved_errno = fchdir(sp->fts_rfd) ? errno : 0; (void)_close(sp->fts_rfd); /* Set errno and return. */ if (saved_errno != 0) { /* Free up the stream pointer. */ free(sp); errno = saved_errno; return (-1); } } /* Free up the stream pointer. 
*/ free(sp); return (0); } /* * Special case of "/" at the end of the path so that slashes aren't * appended which would cause paths to be written as "....//foo". */ #define NAPPEND(p) \ (p->fts_path[p->fts_pathlen - 1] == '/' \ ? p->fts_pathlen - 1 : p->fts_pathlen) FTSENT * fts_read(FTS *sp) { FTSENT *p, *tmp; int instr; char *t; int saved_errno; /* If finished or unrecoverable error, return NULL. */ if (sp->fts_cur == NULL || ISSET(FTS_STOP)) return (NULL); /* Set current node pointer. */ p = sp->fts_cur; /* Save and zero out user instructions. */ instr = p->fts_instr; p->fts_instr = FTS_NOINSTR; /* Any type of file may be re-visited; re-stat and re-turn. */ if (instr == FTS_AGAIN) { p->fts_info = fts_stat(sp, p, 0); return (p); } /* * Following a symlink -- SLNONE test allows application to see * SLNONE and recover. If indirecting through a symlink, have * to keep a pointer to current location. If unable to get that * pointer, follow fails. */ if (instr == FTS_FOLLOW && (p->fts_info == FTS_SL || p->fts_info == FTS_SLNONE)) { p->fts_info = fts_stat(sp, p, 1); if (p->fts_info == FTS_D && !ISSET(FTS_NOCHDIR)) { if ((p->fts_symfd = _open(".", O_RDONLY #if !defined(__sun) && !defined(__sun__) | O_CLOEXEC #endif , 0)) < 0) { p->fts_errno = errno; p->fts_info = FTS_ERR; } else p->fts_flags |= FTS_SYMFOLLOW; } return (p); } /* Directory in pre-order. */ if (p->fts_info == FTS_D) { /* If skipped or crossed mount point, do post-order visit. */ if (instr == FTS_SKIP || (ISSET(FTS_XDEV) && p->fts_dev != sp->fts_dev)) { if (p->fts_flags & FTS_SYMFOLLOW) (void)_close(p->fts_symfd); if (sp->fts_child) { fts_lfree(sp->fts_child); sp->fts_child = NULL; } p->fts_info = FTS_DP; return (p); } /* Rebuild if only read the names and now traversing. */ if (sp->fts_child != NULL && ISSET(FTS_NAMEONLY)) { CLR(FTS_NAMEONLY); fts_lfree(sp->fts_child); sp->fts_child = NULL; } /* * Cd to the subdirectory. 
* * If have already read and now fail to chdir, whack the list * to make the names come out right, and set the parent errno * so the application will eventually get an error condition. * Set the FTS_DONTCHDIR flag so that when we logically change * directories back to the parent we don't do a chdir. * * If haven't read do so. If the read fails, fts_build sets * FTS_STOP or the fts_info field of the node. */ if (sp->fts_child != NULL) { if (fts_safe_changedir(sp, p, -1, p->fts_accpath)) { p->fts_errno = errno; p->fts_flags |= FTS_DONTCHDIR; for (p = sp->fts_child; p != NULL; p = p->fts_link) p->fts_accpath = p->fts_parent->fts_accpath; } } else if ((sp->fts_child = fts_build(sp, BREAD)) == NULL) { if (ISSET(FTS_STOP)) return (NULL); return (p); } p = sp->fts_child; sp->fts_child = NULL; goto name; } /* Move to the next node on this level. */ next: tmp = p; if ((p = p->fts_link) != NULL) { free(tmp); /* * If reached the top, return to the original directory (or * the root of the tree), and load the paths for the next root. */ if (p->fts_level == FTS_ROOTLEVEL) { if (FCHDIR(sp, sp->fts_rfd)) { SET(FTS_STOP); return (NULL); } fts_load(sp, p); return (sp->fts_cur = p); } /* * User may have called fts_set on the node. If skipped, * ignore. If followed, get a file descriptor so we can * get back if necessary. */ if (p->fts_instr == FTS_SKIP) goto next; if (p->fts_instr == FTS_FOLLOW) { p->fts_info = fts_stat(sp, p, 1); if (p->fts_info == FTS_D && !ISSET(FTS_NOCHDIR)) { if ((p->fts_symfd = _open(".", O_RDONLY #if !defined(__sun) && !defined(__sun__) | O_CLOEXEC #endif , 0)) < 0) { p->fts_errno = errno; p->fts_info = FTS_ERR; } else p->fts_flags |= FTS_SYMFOLLOW; } p->fts_instr = FTS_NOINSTR; } name: t = sp->fts_path + NAPPEND(p->fts_parent); *t++ = '/'; memmove(t, p->fts_name, p->fts_namelen + 1); return (sp->fts_cur = p); } /* Move up to the parent node. 
*/ p = tmp->fts_parent; free(tmp); if (p->fts_level == FTS_ROOTPARENTLEVEL) { /* * Done; free everything up and set errno to 0 so the user * can distinguish between error and EOF. */ free(p); errno = 0; return (sp->fts_cur = NULL); } /* NUL terminate the pathname. */ sp->fts_path[p->fts_pathlen] = '\0'; /* * Return to the parent directory. If at a root node or came through * a symlink, go back through the file descriptor. Otherwise, cd up * one directory. */ if (p->fts_level == FTS_ROOTLEVEL) { if (FCHDIR(sp, sp->fts_rfd)) { SET(FTS_STOP); return (NULL); } } else if (p->fts_flags & FTS_SYMFOLLOW) { if (FCHDIR(sp, p->fts_symfd)) { saved_errno = errno; (void)_close(p->fts_symfd); errno = saved_errno; SET(FTS_STOP); return (NULL); } (void)_close(p->fts_symfd); } else if (!(p->fts_flags & FTS_DONTCHDIR) && fts_safe_changedir(sp, p->fts_parent, -1, "..")) { SET(FTS_STOP); return (NULL); } p->fts_info = p->fts_errno ? FTS_ERR : FTS_DP; return (sp->fts_cur = p); } /* * Fts_set takes the stream as an argument although it's not used in this * implementation; it would be necessary if anyone wanted to add global * semantics to fts using fts_set. An error return is allowed for similar * reasons. */ /* ARGSUSED */ int fts_set(FTS *sp, FTSENT *p, int instr) { if (instr != 0 && instr != FTS_AGAIN && instr != FTS_FOLLOW && instr != FTS_NOINSTR && instr != FTS_SKIP) { errno = EINVAL; return (1); } p->fts_instr = instr; return (0); } FTSENT * fts_children(FTS *sp, int instr) { FTSENT *p; int fd; if (instr != 0 && instr != FTS_NAMEONLY) { errno = EINVAL; return (NULL); } /* Set current node pointer. */ p = sp->fts_cur; /* * Errno set to 0 so user can distinguish empty directory from * an error. */ errno = 0; /* Fatal errors stop here. */ if (ISSET(FTS_STOP)) return (NULL); /* Return logical hierarchy of user's arguments. */ if (p->fts_info == FTS_INIT) return (p->fts_link); /* * If not a directory being visited in pre-order, stop here. 
Could * allow FTS_DNR, assuming the user has fixed the problem, but the * same effect is available with FTS_AGAIN. */ if (p->fts_info != FTS_D /* && p->fts_info != FTS_DNR */) return (NULL); /* Free up any previous child list. */ if (sp->fts_child != NULL) fts_lfree(sp->fts_child); if (instr == FTS_NAMEONLY) { SET(FTS_NAMEONLY); instr = BNAMES; } else instr = BCHILD; /* * If using chdir on a relative path and called BEFORE fts_read does * its chdir to the root of a traversal, we can lose -- we need to * chdir into the subdirectory, and we don't know where the current * directory is, so we can't get back so that the upcoming chdir by * fts_read will work. */ if (p->fts_level != FTS_ROOTLEVEL || p->fts_accpath[0] == '/' || ISSET(FTS_NOCHDIR)) return (sp->fts_child = fts_build(sp, instr)); if ((fd = _open(".", O_RDONLY #if !defined(__sun) && !defined(__sun__) | O_CLOEXEC #endif , 0)) < 0) return (NULL); sp->fts_child = fts_build(sp, instr); if (fchdir(fd)) { (void)_close(fd); return (NULL); } (void)_close(fd); return (sp->fts_child); } #ifndef fts_get_clientptr #error "fts_get_clientptr not defined" #endif void * (fts_get_clientptr)(FTS *sp) { return (fts_get_clientptr(sp)); } #ifndef fts_get_stream #error "fts_get_stream not defined" #endif FTS * (fts_get_stream)(FTSENT *p) { return (fts_get_stream(p)); } void fts_set_clientptr(FTS *sp, void *clientptr) { sp->fts_clientptr = clientptr; } /* * This is the tricky part -- do not casually change *anything* in here. The * idea is to build the linked list of entries that are used by fts_children * and fts_read. There are lots of special cases. * * The real slowdown in walking the tree is the stat calls. If FTS_NOSTAT is * set and it's a physical walk (so that symbolic links can't be directories), * we can do things quickly. First, if it's a 4.4BSD file system, the type * of the file is in the directory entry. Otherwise, we assume that the number * of subdirectories in a node is equal to the number of links to the parent. 
* The former skips all stat calls. The latter skips stat calls in any leaf * directories and for any files after the subdirectories in the directory have * been found, cutting the stat calls by about 2/3. */ static FTSENT * fts_build(FTS *sp, int type) { struct dirent *dp; FTSENT *p, *head; FTSENT *cur, *tail; DIR *dirp; void *oldaddr; char *cp; int cderrno, descend, saved_errno, nostat, doadjust; #ifdef FTS_WHITEOUT int oflag; #endif long level; long nlinks; /* has to be signed because -1 is a magic value */ size_t dnamlen, len, maxlen, nitems; /* Set current node pointer. */ cur = sp->fts_cur; /* * Open the directory for reading. If this fails, we're done. * If being called from fts_read, set the fts_info field. */ #ifdef FTS_WHITEOUT if (ISSET(FTS_WHITEOUT)) oflag = DTF_NODUP | DTF_REWIND; else oflag = DTF_HIDEW | DTF_NODUP | DTF_REWIND; #else #define __opendir2(path, flag) opendir(path) #endif if ((dirp = __opendir2(cur->fts_accpath, oflag)) == NULL) { if (type == BREAD) { cur->fts_info = FTS_DNR; cur->fts_errno = errno; } return (NULL); } /* * Nlinks is the number of possible entries of type directory in the * directory if we're cheating on stat calls, 0 if we're not doing * any stat calls at all, -1 if we're doing stats on everything. */ if (type == BNAMES) { nlinks = 0; /* Be quiet about nostat, GCC. */ nostat = 0; } else if (ISSET(FTS_NOSTAT) && ISSET(FTS_PHYSICAL)) { #if !defined(__linux__) if (fts_ufslinks(sp, cur)) nlinks = cur->fts_nlink - (ISSET(FTS_SEEDOT) ? 0 : 2); else #endif nlinks = -1; nostat = 1; } else { nlinks = -1; nostat = 0; } #ifdef notdef (void)printf("nlinks == %d (cur: %d)\n", nlinks, cur->fts_nlink); (void)printf("NOSTAT %d PHYSICAL %d SEEDOT %d\n", ISSET(FTS_NOSTAT), ISSET(FTS_PHYSICAL), ISSET(FTS_SEEDOT)); #endif /* * If we're going to need to stat anything or we want to descend * and stay in the directory, chdir. If this fails we keep going, * but set a flag so we don't chdir after the post-order visit. 
* We won't be able to stat anything, but we can still return the * names themselves. Note, that since fts_read won't be able to * chdir into the directory, it will have to return different path * names than before, i.e. "a/b" instead of "b". Since the node * has already been visited in pre-order, have to wait until the * post-order visit to return the error. There is a special case * here, if there was nothing to stat then it's not an error to * not be able to stat. This is all fairly nasty. If a program * needed sorted entries or stat information, they had better be * checking FTS_NS on the returned nodes. */ cderrno = 0; if (nlinks || type == BREAD) { if (fts_safe_changedir(sp, cur, _dirfd(dirp), NULL)) { if (nlinks && type == BREAD) cur->fts_errno = errno; cur->fts_flags |= FTS_DONTCHDIR; descend = 0; cderrno = errno; } else descend = 1; } else descend = 0; /* * Figure out the max file name length that can be stored in the * current path -- the inner loop allocates more path as necessary. * We really wouldn't have to do the maxlen calculations here, we * could do them in fts_read before returning the path, but it's a * lot easier here since the length is part of the dirent structure. * * If not changing directories set a pointer so that can just append * each new name into the path. */ len = NAPPEND(cur); if (ISSET(FTS_NOCHDIR)) { cp = sp->fts_path + len; *cp++ = '/'; } else { /* GCC, you're too verbose. */ cp = NULL; } len++; maxlen = sp->fts_pathlen - len; level = cur->fts_level + 1; /* Read the directory, attaching each entry to the `link' pointer. 
*/ doadjust = 0; for (head = tail = NULL, nitems = 0; dirp && (dp = readdir(dirp));) { #if defined(__sun) || defined(__sun__) || defined(__linux__) dnamlen = strlen(dp->d_name); #else dnamlen = dp->d_namlen; #endif if (!ISSET(FTS_SEEDOT) && ISDOT(dp->d_name)) continue; if ((p = fts_alloc(sp, dp->d_name, dnamlen)) == NULL) goto mem1; if (dnamlen >= maxlen) { /* include space for NUL */ oldaddr = sp->fts_path; if (fts_palloc(sp, dnamlen + len + 1)) { /* * No more memory for path or structures. Save * errno, free up the current structure and the * structures already allocated. */ mem1: saved_errno = errno; if (p) free(p); fts_lfree(head); (void)closedir(dirp); cur->fts_info = FTS_ERR; SET(FTS_STOP); errno = saved_errno; return (NULL); } /* Did realloc() change the pointer? */ if (oldaddr != sp->fts_path) { doadjust = 1; if (ISSET(FTS_NOCHDIR)) cp = sp->fts_path + len; } maxlen = sp->fts_pathlen - len; } p->fts_level = level; p->fts_parent = sp->fts_cur; p->fts_pathlen = len + dnamlen; #ifdef FTS_WHITEOUT if (dp->d_type == DT_WHT) p->fts_flags |= FTS_ISW; #endif if (cderrno) { if (nlinks) { p->fts_info = FTS_NS; p->fts_errno = cderrno; } else p->fts_info = FTS_NSOK; p->fts_accpath = cur->fts_accpath; } else if (nlinks == 0 #ifdef DT_DIR || (nostat && dp->d_type != DT_DIR && dp->d_type != DT_UNKNOWN) #endif ) { p->fts_accpath = ISSET(FTS_NOCHDIR) ? p->fts_path : p->fts_name; p->fts_info = FTS_NSOK; } else { /* Build a file name for fts_stat to stat. */ if (ISSET(FTS_NOCHDIR)) { p->fts_accpath = p->fts_path; memmove(cp, p->fts_name, p->fts_namelen + 1); } else p->fts_accpath = p->fts_name; /* Stat it. */ p->fts_info = fts_stat(sp, p, 0); /* Decrement link count if applicable. */ if (nlinks > 0 && (p->fts_info == FTS_D || p->fts_info == FTS_DC || p->fts_info == FTS_DOT)) --nlinks; } /* We walk in directory order so "ls -f" doesn't get upset. 
*/ p->fts_link = NULL; if (head == NULL) head = tail = p; else { tail->fts_link = p; tail = p; } ++nitems; } if (dirp) (void)closedir(dirp); /* * If realloc() changed the address of the path, adjust the * addresses for the rest of the tree and the dir list. */ if (doadjust) fts_padjust(sp, head); /* * If not changing directories, reset the path back to original * state. */ if (ISSET(FTS_NOCHDIR)) sp->fts_path[cur->fts_pathlen] = '\0'; /* * If descended after called from fts_children or after called from * fts_read and nothing found, get back. At the root level we use * the saved fd; if one of fts_open()'s arguments is a relative path * to an empty directory, we wind up here with no other way back. If * can't get back, we're done. */ if (descend && (type == BCHILD || !nitems) && (cur->fts_level == FTS_ROOTLEVEL ? FCHDIR(sp, sp->fts_rfd) : fts_safe_changedir(sp, cur->fts_parent, -1, ".."))) { cur->fts_info = FTS_ERR; SET(FTS_STOP); return (NULL); } /* If didn't find anything, return NULL. */ if (!nitems) { if (type == BREAD) cur->fts_info = FTS_DP; return (NULL); } /* Sort the entries. */ if (sp->fts_compar && nitems > 1) head = fts_sort(sp, head, nitems); return (head); } static int fts_stat(FTS *sp, FTSENT *p, int follow) { FTSENT *t; dev_t dev; ino_t ino; struct stat *sbp, sb; int saved_errno; /* If user needs stat info, stat buffer already allocated. */ sbp = ISSET(FTS_NOSTAT) ? &sb : p->fts_statp; #ifdef FTS_WHITEOUT /* Check for whiteout. */ if (p->fts_flags & FTS_ISW) { if (sbp != &sb) { memset(sbp, '\0', sizeof(*sbp)); sbp->st_mode = S_IFWHT; } return (FTS_W); } #endif /* * If doing a logical walk, or application requested FTS_FOLLOW, do * a stat(2). If that fails, check for a non-existent symlink. If * fail, set the errno from the stat call. 
*/ if (ISSET(FTS_LOGICAL) || follow) { if (stat(p->fts_accpath, sbp)) { saved_errno = errno; if (!lstat(p->fts_accpath, sbp)) { errno = 0; return (FTS_SLNONE); } p->fts_errno = saved_errno; goto err; } } else if (lstat(p->fts_accpath, sbp)) { p->fts_errno = errno; err: memset(sbp, 0, sizeof(struct stat)); return (FTS_NS); } if (S_ISDIR(sbp->st_mode)) { /* * Set the device/inode. Used to find cycles and check for * crossing mount points. Also remember the link count, used * in fts_build to limit the number of stat calls. It is * understood that these fields are only referenced if fts_info * is set to FTS_D. */ dev = p->fts_dev = sbp->st_dev; ino = p->fts_ino = sbp->st_ino; p->fts_nlink = sbp->st_nlink; if (ISDOT(p->fts_name)) return (FTS_DOT); /* * Cycle detection is done by brute force when the directory * is first encountered. If the tree gets deep enough or the * number of symbolic links to directories is high enough, * something faster might be worthwhile. */ for (t = p->fts_parent; t->fts_level >= FTS_ROOTLEVEL; t = t->fts_parent) if (ino == t->fts_ino && dev == t->fts_dev) { p->fts_cycle = t; return (FTS_DC); } return (FTS_D); } if (S_ISLNK(sbp->st_mode)) return (FTS_SL); if (S_ISREG(sbp->st_mode)) return (FTS_F); return (FTS_DEFAULT); } /* * The comparison function takes pointers to pointers to FTSENT structures. * Qsort wants a comparison function that takes pointers to void. * (Both with appropriate levels of const-poisoning, of course!) * Use a trampoline function to deal with the difference. */ static int fts_compar(const void *a, const void *b) { FTS *parent; parent = (*(const FTSENT * const *)a)->fts_fts; return (*parent->fts_compar)(a, b); } static FTSENT * fts_sort(FTS *sp, FTSENT *head, size_t nitems) { FTSENT **ap, *p; /* * Construct an array of pointers to the structures and call qsort(3). * Reassemble the array in the order returned by qsort. If unable to * sort for memory reasons, return the directory entries in their * current order. 
Allocate enough space for the current needs plus * 40 so don't realloc one entry at a time. */ if (nitems > sp->fts_nitems) { sp->fts_nitems = nitems + 40; if ((sp->fts_array = reallocf(sp->fts_array, sp->fts_nitems * sizeof(FTSENT *))) == NULL) { sp->fts_nitems = 0; return (head); } } for (ap = sp->fts_array, p = head; p; p = p->fts_link) *ap++ = p; qsort(sp->fts_array, nitems, sizeof(FTSENT *), fts_compar); for (head = *(ap = sp->fts_array); --nitems; ++ap) ap[0]->fts_link = ap[1]; ap[0]->fts_link = NULL; return (head); } static FTSENT * fts_alloc(FTS *sp, char *name, size_t namelen) { FTSENT *p; size_t len; struct ftsent_withstat { FTSENT ent; struct stat statbuf; }; /* * The file name is a variable length array and no stat structure is * necessary if the user has set the nostat bit. Allocate the FTSENT * structure, the file name and the stat structure in one chunk, but * be careful that the stat structure is reasonably aligned. */ if (ISSET(FTS_NOSTAT)) len = sizeof(FTSENT) + namelen + 1; else len = sizeof(struct ftsent_withstat) + namelen + 1; if ((p = malloc(len)) == NULL) return (NULL); if (ISSET(FTS_NOSTAT)) { p->fts_name = (char *)(p + 1); p->fts_statp = NULL; } else { p->fts_name = (char *)((struct ftsent_withstat *)p + 1); p->fts_statp = &((struct ftsent_withstat *)p)->statbuf; } /* Copy the name and guarantee NUL termination. */ memcpy(p->fts_name, name, namelen); p->fts_name[namelen] = '\0'; p->fts_namelen = namelen; p->fts_path = sp->fts_path; p->fts_errno = 0; p->fts_flags = 0; p->fts_instr = FTS_NOINSTR; p->fts_number = 0; p->fts_pointer = NULL; p->fts_fts = sp; return (p); } static void fts_lfree(FTSENT *head) { FTSENT *p; /* Free a linked list of structures. */ while ((p = head)) { head = head->fts_link; free(p); } } /* * Allow essentially unlimited paths; find, rm, ls should all work on any tree. * Most systems will allow creation of paths much longer than MAXPATHLEN, even * though the kernel won't resolve them. 
Add the size (not just what's needed) * plus 256 bytes so don't realloc the path 2 bytes at a time. */ static int fts_palloc(FTS *sp, size_t more) { sp->fts_pathlen += more + 256; sp->fts_path = reallocf(sp->fts_path, sp->fts_pathlen); return (sp->fts_path == NULL); } /* * When the path is realloc'd, have to fix all of the pointers in structures * already returned. */ static void fts_padjust(FTS *sp, FTSENT *head) { FTSENT *p; char *addr = sp->fts_path; #define ADJUST(p) do { \ if ((p)->fts_accpath != (p)->fts_name) { \ (p)->fts_accpath = \ (char *)addr + ((p)->fts_accpath - (p)->fts_path); \ } \ (p)->fts_path = addr; \ } while (0) /* Adjust the current set of children. */ for (p = sp->fts_child; p; p = p->fts_link) ADJUST(p); /* Adjust the rest of the tree, including the current level. */ for (p = head; p->fts_level >= FTS_ROOTLEVEL;) { ADJUST(p); p = p->fts_link ? p->fts_link : p->fts_parent; } } static size_t fts_maxarglen(argv) char * const *argv; { size_t len, max; for (max = 0; *argv; ++argv) if ((len = strlen(*argv)) > max) max = len; return (max + 1); } /* * Change to dir specified by fd or p->fts_accpath without getting * tricked by someone changing the world out from underneath us. * Assumes p->fts_dev and p->fts_ino are filled in. */ static int fts_safe_changedir(FTS *sp, FTSENT *p, int fd, char *path) { int ret, oerrno, newfd; struct stat sb; newfd = fd; if (ISSET(FTS_NOCHDIR)) return (0); if (fd < 0 && (newfd = _open(path, O_RDONLY #if !defined(__sun) && !defined(__sun__) | O_CLOEXEC #endif , 0)) < 0) return (-1); if (_fstat(newfd, &sb)) { ret = -1; goto bail; } if (p->fts_dev != sb.st_dev || p->fts_ino != sb.st_ino) { errno = ENOENT; /* disinformation */ ret = -1; goto bail; } ret = fchdir(newfd); bail: oerrno = errno; if (fd < 0) (void)_close(newfd); errno = oerrno; return (ret); } #if !defined(__linux__) /* * Check if the filesystem for "ent" has UFS-style links. 
*/ static int fts_ufslinks(FTS *sp, const FTSENT *ent) { struct _fts_private *priv; const char **cpp; priv = (struct _fts_private *)sp; /* * If this node's device is different from the previous, grab * the filesystem information, and decide on the reliability * of the link information from this filesystem for stat(2) * avoidance. */ if (priv->ftsp_dev != ent->fts_dev) { #if defined(__sun) || defined(__sun__) if (statvfs(ent->fts_path, &priv->ftsp_statvfs) != -1) { #else if (statfs(ent->fts_path, &priv->ftsp_statfs) != -1) { #endif priv->ftsp_dev = ent->fts_dev; priv->ftsp_linksreliable = 0; for (cpp = ufslike_filesystems; *cpp; cpp++) { #if defined(__sun) || defined(__sun__) if (strcmp(priv->ftsp_statvfs.f_basetype, #else if (strcmp(priv->ftsp_statfs.f_fstypename, #endif *cpp) == 0) { priv->ftsp_linksreliable = 1; break; } } } else { priv->ftsp_linksreliable = 0; } } return (priv->ftsp_linksreliable); } #endif fpart-0.9.2/src/fts.h000066400000000000000000000133531247141164700144060ustar00rootroot00000000000000/* * Copyright (c) 1989, 1993 * The Regents of the University of California. All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * 3. Neither the name of the University nor the names of its contributors * may be used to endorse or promote products derived from this software * without specific prior written permission. 
* * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. * * @(#)fts.h 8.3 (Berkeley) 8/14/94 * $FreeBSD: head/include/fts.h 203964 2010-02-16 19:39:50Z imp $ */ #ifndef _FTS_H_ #define _FTS_H_ #if !defined(__FreeBSD__) && !defined(__linux__) #define MAX(a, b) ((a) >= (b) ? (a) : (b)) #define dirfd(X) ((X)->d_fd) #endif typedef struct { struct _ftsent *fts_cur; /* current node */ struct _ftsent *fts_child; /* linked list of children */ struct _ftsent **fts_array; /* sort array */ dev_t fts_dev; /* starting device # */ char *fts_path; /* path for this descent */ int fts_rfd; /* fd for root */ size_t fts_pathlen; /* sizeof(path) */ size_t fts_nitems; /* elements in the sort array */ int (*fts_compar) /* compare function */ (const struct _ftsent * const *, const struct _ftsent * const *); #define FTS_COMFOLLOW 0x001 /* follow command line symlinks */ #define FTS_LOGICAL 0x002 /* logical walk */ #define FTS_NOCHDIR 0x004 /* don't change directories */ #define FTS_NOSTAT 0x008 /* don't get stat info */ #define FTS_PHYSICAL 0x010 /* physical walk */ #define FTS_SEEDOT 0x020 /* return dot and dot-dot */ #define FTS_XDEV 0x040 /* don't cross devices */ #if !defined(__sun) && !defined(__sun__) && !defined(__linux__) #define FTS_WHITEOUT 0x080 /* return whiteout information */ 
#endif #define FTS_OPTIONMASK 0x0ff /* valid user option mask */ #define FTS_NAMEONLY 0x100 /* (private) child names only */ #define FTS_STOP 0x200 /* (private) unrecoverable error */ int fts_options; /* fts_open options, global flags */ void *fts_clientptr; /* thunk for sort function */ } FTS; typedef struct _ftsent { struct _ftsent *fts_cycle; /* cycle node */ struct _ftsent *fts_parent; /* parent directory */ struct _ftsent *fts_link; /* next file in directory */ long long fts_number; /* local numeric value */ #define fts_bignum fts_number /* XXX non-std, should go away */ void *fts_pointer; /* local address value */ char *fts_accpath; /* access path */ char *fts_path; /* root path */ int fts_errno; /* errno for this node */ int fts_symfd; /* fd for symlink */ size_t fts_pathlen; /* strlen(fts_path) */ size_t fts_namelen; /* strlen(fts_name) */ ino_t fts_ino; /* inode */ dev_t fts_dev; /* device */ nlink_t fts_nlink; /* link count */ #define FTS_ROOTPARENTLEVEL -1 #define FTS_ROOTLEVEL 0 long fts_level; /* depth (-1 to N) */ #define FTS_D 1 /* preorder directory */ #define FTS_DC 2 /* directory that causes cycles */ #define FTS_DEFAULT 3 /* none of the above */ #define FTS_DNR 4 /* unreadable directory */ #define FTS_DOT 5 /* dot or dot-dot */ #define FTS_DP 6 /* postorder directory */ #define FTS_ERR 7 /* error; errno is set */ #define FTS_F 8 /* regular file */ #define FTS_INIT 9 /* initialized only */ #define FTS_NS 10 /* stat(2) failed */ #define FTS_NSOK 11 /* no stat(2) requested */ #define FTS_SL 12 /* symbolic link */ #define FTS_SLNONE 13 /* symbolic link without target */ #define FTS_W 14 /* whiteout object */ int fts_info; /* user status for FTSENT structure */ #define FTS_DONTCHDIR 0x01 /* don't chdir .. 
to the parent */
#define FTS_SYMFOLLOW 0x02 /* followed a symlink to get here */
#define FTS_ISW       0x04 /* this is a whiteout object */
    unsigned fts_flags; /* private flags for FTSENT structure */

#define FTS_AGAIN   1 /* read node again */
#define FTS_FOLLOW  2 /* follow symbolic link */
#define FTS_NOINSTR 3 /* no instructions */
#define FTS_SKIP    4 /* discard node */
    int fts_instr; /* fts_set() instructions */

    struct stat *fts_statp; /* stat(2) information */
    char *fts_name; /* file name */
    FTS *fts_fts; /* back pointer to main FTS */
} FTSENT;

#if defined(__FreeBSD__)
#include <sys/cdefs.h>
#endif

#if defined(__FreeBSD__)
__BEGIN_DECLS
#endif
FTSENT *fts_children(FTS *, int);
int fts_close(FTS *);
void *fts_get_clientptr(FTS *);
#define fts_get_clientptr(fts) ((fts)->fts_clientptr)
FTS *fts_get_stream(FTSENT *);
#define fts_get_stream(ftsent) ((ftsent)->fts_fts)
FTS *fts_open(char * const *, int,
    int (*)(const FTSENT * const *, const FTSENT * const *));
FTSENT *fts_read(FTS *);
int fts_set(FTS *, FTSENT *, int);
void fts_set_clientptr(FTS *, void *);
#if defined(__FreeBSD__)
__END_DECLS
#endif

#endif /* !_FTS_H_ */
fpart-0.9.2/src/options.c000066400000000000000000000135521247141164700153010ustar00rootroot00000000000000
/*-
 * Copyright (c) 2011-2015 Ganael LAPLANCHE
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 */

#include "utils.h"
#include "options.h"

/* NULL */
#include <stdlib.h>

/* assert(3) */
#include <assert.h>

/*************************
 Program options functions
 *************************/

/* Initialize global options structure */
void
init_options(struct program_options *options)
{
    /* check our default values */
    assert(DFLT_OPT_NUM_PARTS >= 0);
    assert(DFLT_OPT_MAX_ENTRIES >= 0);
    assert(DFLT_OPT_MAX_SIZE >= 0);
    assert((DFLT_OPT_ARBITRARYVALUES == OPT_NOARBITRARYVALUES) ||
        (DFLT_OPT_ARBITRARYVALUES == OPT_ARBITRARYVALUES));
    assert((DFLT_OPT_ADDSLASH == OPT_NOADDSLASH) ||
        (DFLT_OPT_ADDSLASH == OPT_ADDSLASH));
    assert((DFLT_OPT_VERBOSE == OPT_NOVERBOSE) ||
        (DFLT_OPT_VERBOSE == OPT_VERBOSE) ||
        (DFLT_OPT_VERBOSE == OPT_VVERBOSE));
    assert((DFLT_OPT_FOLLOWSYMLINKS == OPT_FOLLOWSYMLINKS) ||
        (DFLT_OPT_FOLLOWSYMLINKS == OPT_NOFOLLOWSYMLINKS));
    assert((DFLT_OPT_CROSSFSBOUNDARIES == OPT_NOCROSSFSBOUNDARIES) ||
        (DFLT_OPT_CROSSFSBOUNDARIES == OPT_CROSSFSBOUNDARIES));
    assert((DFLT_OPT_EMPTYDIRS == OPT_NOEMPTYDIRS) ||
        (DFLT_OPT_EMPTYDIRS == OPT_EMPTYDIRS));
    assert((DFLT_OPT_DNREMPTY == OPT_NODNREMPTY) ||
        (DFLT_OPT_DNREMPTY == OPT_DNREMPTY));
    assert(DFLT_OPT_DIR_DEPTH >= OPT_NODIRDEPTH);
    assert((DFLT_OPT_LEAFDIRS == OPT_NOLEAFDIRS) ||
        (DFLT_OPT_LEAFDIRS
== OPT_LEAFDIRS)); assert((DFLT_OPT_LIVEMODE == OPT_NOLIVEMODE) || (DFLT_OPT_LIVEMODE == OPT_LIVEMODE)); assert(DFLT_OPT_PRELOAD_SIZE >= 0); assert(DFLT_OPT_OVERLOAD_SIZE >= 0); assert(DFLT_OPT_ROUND_SIZE >= 1); /* set default options */ options->num_parts = DFLT_OPT_NUM_PARTS; options->max_entries = DFLT_OPT_MAX_ENTRIES; options->max_size = DFLT_OPT_MAX_SIZE; options->in_filename = NULL; options->arbitrary_values = DFLT_OPT_ARBITRARYVALUES; options->out_filename = NULL; options->add_slash = DFLT_OPT_ADDSLASH; options->verbose = DFLT_OPT_VERBOSE; options->follow_symbolic_links = DFLT_OPT_FOLLOWSYMLINKS; options->cross_fs_boundaries = DFLT_OPT_CROSSFSBOUNDARIES; options->include_files = NULL; options->ninclude_files = 0; options->include_files_ci = NULL; options->ninclude_files_ci = 0; options->exclude_files = NULL; options->nexclude_files = 0; options->exclude_files_ci = NULL; options->nexclude_files_ci = 0; options->empty_dirs = DFLT_OPT_EMPTYDIRS; options->dnr_empty = DFLT_OPT_DNREMPTY; options->dir_depth = DFLT_OPT_DIR_DEPTH; options->leaf_dirs = DFLT_OPT_LEAFDIRS; options->live_mode = DFLT_OPT_LIVEMODE; options->pre_part_hook = NULL; options->post_part_hook = NULL; options->preload_size = DFLT_OPT_PRELOAD_SIZE; options->overload_size = DFLT_OPT_OVERLOAD_SIZE; options->round_size = DFLT_OPT_ROUND_SIZE; } /* Un-initialize global options structure */ void uninit_options(struct program_options *options) { options->round_size = DFLT_OPT_ROUND_SIZE; options->overload_size = DFLT_OPT_OVERLOAD_SIZE; options->preload_size = DFLT_OPT_PRELOAD_SIZE; if(options->post_part_hook != NULL) free(options->post_part_hook); if(options->pre_part_hook != NULL) free(options->pre_part_hook); options->live_mode = DFLT_OPT_LIVEMODE; options->leaf_dirs = DFLT_OPT_LEAFDIRS; options->dir_depth = DFLT_OPT_DIR_DEPTH; options->dnr_empty = DFLT_OPT_DNREMPTY; options->empty_dirs = DFLT_OPT_EMPTYDIRS; if(options->exclude_files_ci != NULL) str_cleanup(&(options->exclude_files_ci), 
&(options->nexclude_files_ci)); if(options->exclude_files != NULL) str_cleanup(&(options->exclude_files), &(options->nexclude_files)); if(options->include_files_ci != NULL) str_cleanup(&(options->include_files_ci), &(options->ninclude_files_ci)); if(options->include_files != NULL) str_cleanup(&(options->include_files), &(options->ninclude_files)); options->cross_fs_boundaries = DFLT_OPT_CROSSFSBOUNDARIES; options->follow_symbolic_links = DFLT_OPT_FOLLOWSYMLINKS; options->verbose = DFLT_OPT_VERBOSE; options->add_slash = DFLT_OPT_ADDSLASH; if(options->out_filename != NULL) free(options->out_filename); options->arbitrary_values = DFLT_OPT_ARBITRARYVALUES; if(options->in_filename != NULL) free(options->in_filename); options->max_size = DFLT_OPT_MAX_SIZE; options->max_entries = DFLT_OPT_MAX_ENTRIES; options->num_parts = DFLT_OPT_NUM_PARTS; } fpart-0.9.2/src/options.h000066400000000000000000000117511247141164700153050ustar00rootroot00000000000000/*- * Copyright (c) 2011-2015 Ganael LAPLANCHE * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. 
IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef _OPTIONS_H #define _OPTIONS_H #include "types.h" #include /* stat(2) */ #include #include /* Program options */ struct program_options { /* number of partitions (option -n) */ #define DFLT_OPT_NUM_PARTS 0 pnum_t num_parts; /* maximum files per partition (option -f) */ #define DFLT_OPT_MAX_ENTRIES 0 fnum_t max_entries; /* maximum partition size (option -s) */ #define DFLT_OPT_MAX_SIZE 0 fsize_t max_size; /* input file (option -i); NULL = undefined, "-" = stdin, "filename" */ char *in_filename; /* arbitrary values (option -a) */ #define OPT_NOARBITRARYVALUES 0 #define OPT_ARBITRARYVALUES 1 #define DFLT_OPT_ARBITRARYVALUES OPT_NOARBITRARYVALUES unsigned char arbitrary_values; /* output file (option -o); NULL = stdout, "filename" */ char *out_filename; /* add slash to directories (option -e) */ #define OPT_NOADDSLASH 0 #define OPT_ADDSLASH 1 #define DFLT_OPT_ADDSLASH OPT_NOADDSLASH unsigned char add_slash; /* verbose output (option -v) */ #define OPT_NOVERBOSE 0 #define OPT_VERBOSE 1 #define OPT_VVERBOSE 2 #define DFLT_OPT_VERBOSE OPT_NOVERBOSE unsigned char verbose; /* follow symbolic links (option -l) */ #define OPT_FOLLOWSYMLINKS 0 #define OPT_NOFOLLOWSYMLINKS 1 #define DFLT_OPT_FOLLOWSYMLINKS OPT_NOFOLLOWSYMLINKS unsigned char follow_symbolic_links; /* cross fs boundaries (option -b) */ #define OPT_NOCROSSFSBOUNDARIES 0 #define OPT_CROSSFSBOUNDARIES 1 #define DFLT_OPT_CROSSFSBOUNDARIES OPT_CROSSFSBOUNDARIES unsigned char cross_fs_boundaries; /* 
include files, case sensitive (option -y) */ char **include_files; unsigned int ninclude_files; /* include files, case insensitive (option -Y) */ char **include_files_ci; unsigned int ninclude_files_ci; /* exclude files, case sensitive (option -x) */ char **exclude_files; unsigned int nexclude_files; /* exclude files, case insensitive (option -X) */ char **exclude_files_ci; unsigned int nexclude_files_ci; /* include empty directories (option -z) */ #define OPT_NOEMPTYDIRS 0 #define OPT_EMPTYDIRS 1 #define DFLT_OPT_EMPTYDIRS OPT_NOEMPTYDIRS unsigned char empty_dirs; /* treat un-readable directories as empty (option -Z) */ #define OPT_NODNREMPTY 0 #define OPT_DNREMPTY 1 #define DFLT_OPT_DNREMPTY OPT_NODNREMPTY unsigned char dnr_empty; /* display directories after certain depth (option -d) */ #define OPT_NODIRDEPTH -1 #define DFLT_OPT_DIR_DEPTH OPT_NODIRDEPTH int dir_depth; /* pack leaf directories (option -D) */ #define OPT_NOLEAFDIRS 0 #define OPT_LEAFDIRS 1 #define DFLT_OPT_LEAFDIRS OPT_NOLEAFDIRS unsigned char leaf_dirs; /* live mode (option -L) */ #define OPT_NOLIVEMODE 0 #define OPT_LIVEMODE 1 #define DFLT_OPT_LIVEMODE OPT_NOLIVEMODE unsigned char live_mode; /* pre-partition hook (option -w) */ char *pre_part_hook; /* post-partition hook (option -W) */ char *post_part_hook; /* preload partitions (option -p) */ #define DFLT_OPT_PRELOAD_SIZE 0 fsize_t preload_size; /* overload file entries (option -q) */ #define DFLT_OPT_OVERLOAD_SIZE 0 fsize_t overload_size; /* round file size up (option -r) */ #define DFLT_OPT_ROUND_SIZE 1 fsize_t round_size; }; void init_options(struct program_options *options); void uninit_options(struct program_options *options); #endif /* _OPTIONS_H */ fpart-0.9.2/src/partition.c000066400000000000000000000111261247141164700156120ustar00rootroot00000000000000/*- * Copyright (c) 2011-2015 Ganael LAPLANCHE * All rights reserved. 
* * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. * * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. 
 */

#include "types.h"
#include "utils.h"
#include "options.h"
#include "partition.h"

/* fprintf(3) */
#include <stdio.h>

/* malloc(3) */
#include <stdlib.h>

/* assert(3) */
#include <assert.h>

/*******************************************************
 Double-linked list of partitions manipulation functions
 *******************************************************/

/* Add num_parts empty partitions to a double-linked list of partitions
   from head
   - if head is NULL, creates a new list; if not, chains a new list to it
   - returns with head set to the last element */
int
add_partitions(struct partition **head, pnum_t num_parts,
    struct program_options *options)
{
    assert(head != NULL);
    assert(num_parts > 0);
    assert(options != NULL);

    struct partition **current = head; /* current partition pointer address */
    struct partition *previous = NULL; /* previous partition pointer */

    pnum_t i = 0;
    while(i < num_parts) {
        /* backup current structure pointer and initialize a new structure */
        previous = *current;
        if_not_malloc(*current, sizeof(struct partition),
            return (1);
        )

        /* set head on first pass */
        if(*head == NULL)
            *head = *current;

        /* initialize partition data */
        (*current)->size = options->preload_size;
        (*current)->num_files = 0;
        (*current)->nextp = NULL; /* will be set in next pass (see below) */
        (*current)->prevp = previous;

        /* set previous' nextp pointer */
        if(previous != NULL)
            previous->nextp = *current;

        i++;
    }
    return (0);
}

/* Un-initialize a double-linked list of partitions */
void
uninit_partitions(struct partition *head)
{
    /* be sure to start from last partition */
    fastfw_list(head);

    struct partition *current = head;
    struct partition *prev = NULL;

    while(current != NULL) {
        prev = current->prevp;
        free(current);
        current = prev;
    }
    return;
}

/* Crawl partitions and return the least-loaded partition index */
pnum_t
find_smallest_partition_index(struct partition *head)
{
    assert(head != NULL);

    /* be sure to start at first partition */
    rewind_list(head);

    /* start values */
    fsize_t smallest_partition_value =
        head->size;
    pnum_t smallest_partition_index = 0;

    pnum_t i = 0;
    while(head != NULL) {
        if(head->size < smallest_partition_value) {
            smallest_partition_value = head->size;
            smallest_partition_index = i;
        }
        head = head->nextp;
        i++;
    }
    return (smallest_partition_index);
}

/* Return a pointer to a given partition */
struct partition *
get_partition_at(struct partition *head, pnum_t index)
{
    assert(head != NULL);

    /* be sure to start at first partition */
    rewind_list(head);

    pnum_t i = 0;
    while((head != NULL) && (i < index)) {
        head = head->nextp;
        i++;
    }
    return (head);
}

/* Print partitions from head */
void
print_partitions(struct partition *head)
{
    pnum_t i = 0;
    while(head != NULL) {
        /* pnum_t is unsigned int (%u), fsize_t is long long (%lld),
           fnum_t is unsigned long long (%llu) */
        fprintf(stderr, "Part #%u: size = %lld, %llu file(s)\n", i,
            head->size, head->num_files);
        head = head->nextp;
        i++;
    }
    return;
}
fpart-0.9.2/src/partition.h000066400000000000000000000041571247141164700156250ustar00rootroot00000000000000
/*-
 * Copyright (c) 2011-2015 Ganael LAPLANCHE
 * All rights reserved.
 *
 * Redistribution and use in source and binary forms, with or without
 * modification, are permitted provided that the following conditions
 * are met:
 * 1. Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED.
IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef _PARTITION_H #define _PARTITION_H #include "types.h" #include "options.h" #include /* A partition (group of file entries) */ struct partition; struct partition { fsize_t size; /* size in bytes */ fnum_t num_files; /* number of files */ struct partition* nextp; /* next partition */ struct partition* prevp; /* previous one */ }; int add_partitions(struct partition **head, pnum_t num_parts, struct program_options *options); void uninit_partitions(struct partition *head); pnum_t find_smallest_partition_index(struct partition *head); struct partition * get_partition_at(struct partition *head, pnum_t index); void print_partitions(struct partition *head); #endif /* _PARTITION_H */ fpart-0.9.2/src/types.h000066400000000000000000000034751247141164700147620ustar00rootroot00000000000000/*- * Copyright (c) 2011-2015 Ganael LAPLANCHE * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. 
* * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ #ifndef _TYPES_H #define _TYPES_H /* Handles the size of a file or partition. Must be signed and longer than off_t */ typedef long long fsize_t; /* Handles the number of files in a partition and the number of file entries. Must be unsigned and longer than ino_t */ typedef unsigned long long fnum_t; /* Handles the number of partitions. Must be unsigned and can be smaller than fnum_t */ typedef unsigned int pnum_t; #endif /* _TYPES_H */ fpart-0.9.2/src/utils.c000066400000000000000000000247011247141164700147440ustar00rootroot00000000000000/*- * Copyright (c) 2011-2015 Ganael LAPLANCHE * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. Redistributions of source code must retain the above copyright * notice, this list of conditions and the following disclaimer. * 2. Redistributions in binary form must reproduce the above copyright * notice, this list of conditions and the following disclaimer in the * documentation and/or other materials provided with the distribution. 
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 */

#include "types.h"
#include "utils.h"
#include "options.h"

/* log10(3) */
#include <math.h>

/* malloc(3) */
#include <stdlib.h>

/* fprintf(3), snprintf(3) */
#include <stdio.h>

/* fts(3) */
#include <sys/types.h>
#include <sys/stat.h>
#if defined(EMBED_FTS)
#include "fts.h"
#else
#include <fts.h>
#endif

/* strerror(3) */
#include <string.h>

/* errno */
#include <errno.h>

/* getcwd(3) */
#include <unistd.h>

/* MAXPATHLEN */
#include <sys/param.h>

/* strlen(3) */
#include <string.h>

/* assert(3) */
#include <assert.h>

/* opendir(3) */
#include <dirent.h>

/* fnmatch(3) */
#include <fnmatch.h>

/****************
 Helper functions
 ****************/

/* Return the number of digits necessary to print i */
unsigned int
get_num_digits(double i)
{
    if((int)i == 0)
        return (1);

    double logvalue = log10(i);
    return (logvalue >= 0 ? (unsigned int)logvalue + 1 : 0);
}

/* Return the size of a file or directory
   - a pointer to an existing stat must be provided */
fsize_t
get_size(char *file_path, struct stat *file_stat,
    struct program_options *options)
{
    assert(file_path != NULL);
    assert(file_stat != NULL);
    assert(options != NULL);

    fsize_t file_size = 0; /* current return value */

    /* if file_path is not a directory,
       return st_size for regular files (only) */
    if(!S_ISDIR(file_stat->st_mode)) {
        return (S_ISREG(file_stat->st_mode) ?
            file_stat->st_size : 0);
    }

    /* directory, use fts */
    FTS *ftsp = NULL;
    FTSENT *p = NULL;
    int fts_options = (options->follow_symbolic_links == OPT_FOLLOWSYMLINKS) ?
        FTS_LOGICAL : FTS_PHYSICAL;
    fts_options |= (options->cross_fs_boundaries == OPT_NOCROSSFSBOUNDARIES) ?
        FTS_XDEV : 0;

    char *fts_argv[] = { file_path, NULL };
    if((ftsp = fts_open(fts_argv, fts_options, NULL)) == NULL) {
        fprintf(stderr, "%s: fts_open()\n", file_path);
        return (0);
    }

    while((p = fts_read(ftsp)) != NULL) {
        switch (p->fts_info) {
            case FTS_DNR: /* un-readable directory */
            case FTS_ERR: /* misc error */
            case FTS_NS:  /* stat() error */
                fprintf(stderr, "%s: %s\n", p->fts_path,
                    strerror(p->fts_errno));
                continue;
            case FTS_DC:
                fprintf(stderr, "%s: filesystem loop detected\n",
                    p->fts_path);
                continue;
            case FTS_F:
                file_size += p->fts_statp->st_size;
                continue;
            /* skip everything else (only count regular files' size) */
            default:
                continue;
        }
    }

    if(errno != 0)
        fprintf(stderr, "%s: fts_read()\n", file_path);

    if(fts_close(ftsp) < 0)
        fprintf(stderr, "%s: fts_close()\n", file_path);

    return (file_size);
}

/* Return absolute path for given path
   - '/xxx' and '-' are considered absolute, i.e.
     will not be prefixed by cwd. Everything else will.
   - returned pointer must be manually freed later */
char *
abs_path(const char *path)
{
    assert(path != NULL);

    char *cwd = NULL; /* current working directory */
    char *abs = NULL; /* will be returned */
    size_t malloc_size = 0;

    if(path[0] == '\0') {
        errno = ENOENT;
        return (NULL);
    }

    if((path[0] != '/') &&
        ((path[0] != '-') || (path[1] != '\0'))) {
        /* relative path given */
        if_not_malloc(cwd, MAXPATHLEN,
            return (NULL);
        )
        if(getcwd(cwd, MAXPATHLEN) == NULL) {
            free(cwd);
            return (NULL);
        }
        malloc_size += strlen(cwd) + 1; /* cwd + '/' */
    }
    malloc_size += strlen(path) + 1; /* path + '\0' */

    if_not_malloc(abs, malloc_size,
        /* just print error message (within macro code) */
    )
    else {
        if(cwd != NULL)
            snprintf(abs, malloc_size, "%s/%s", cwd, path);
        else
            snprintf(abs, malloc_size, "%s", path);
    }

    if(cwd != NULL)
        free(cwd);

    return (abs);
}

/* Push str into array and update num
   - allocate memory for array if NULL
   - return 0 (success) or 1 (failure) */
int
str_push(char ***array, unsigned int *num, const char * const str)
{
    assert(array != NULL);
    assert(num != NULL);
    assert(str != NULL);
    assert(((*array == NULL) && (*num == 0)) ||
        ((*num > 0) && (*array != NULL)));

    /* allocate new string */
    char *tmp_str = NULL;
    size_t malloc_size = strlen(str) + 1;
    if_not_malloc(tmp_str, malloc_size,
        return (1);
    )
    snprintf(tmp_str, malloc_size, "%s", str);

    /* add new char *pointer to array */
    if_not_realloc(*array, sizeof(char *) * ((*num) + 1),
        free(tmp_str);
        return (1);
    )
    (*array)[*num] = tmp_str;
    *num += 1;

    return (0);
}

/* Cleanup str array
   - remove and free() every str from array
   - free() and NULL'ify array
   - update num */
void
str_cleanup(char ***array, unsigned int *num)
{
    assert(num != NULL);
    assert(array != NULL);
    assert(((*array == NULL) && (*num == 0)) ||
        ((*num > 0) && (*array != NULL)));

    while(*num > 0) {
        if((*array)[(*num) - 1] != NULL) {
            free((*array)[(*num) - 1]);
            (*array)[(*num) - 1] = NULL;
        }
        /* always step to the previous entry, so that a NULL entry
           cannot stall the loop */
        *num -= 1;
    }
    free(*array);
    *array = NULL;

    return;
}

/* Match str against array of str
   -
return 0 (no match) or 1 (match) */ int str_match(const char * const * const array, const unsigned int num, const char * const str, const unsigned char ignore_case) { assert(str != NULL); if(array == NULL) return (0); unsigned int i = 0; while(i < num) { if(fnmatch(array[i], str, ignore_case ? FNM_CASEFOLD : 0) == 0) return(1); i++; } return (0); } /* Validate a file name regarding program options - do not check inclusion lists for directories (we must be able to crawl the entire file hierarchy) - return 0 if file is not valid, 1 if it is */ int valid_filename(char *filename, struct program_options *options, unsigned char is_leaf) { assert(filename != NULL); assert(options != NULL); int valid = 1; #if defined(DEBUG) fprintf(stderr, "%s(): checking name validity for %s: %s\n", __func__, is_leaf ? "leaf" : "directory", filename); #endif /* check for includes (options -y and -Y), for leaves only */ if(is_leaf) { if((options->include_files != NULL) || (options->include_files_ci != NULL)) { /* switch to default exclude, unless file found in lists */ valid = 0; if(str_match((const char * const * const)(options->include_files), options->ninclude_files, filename, 0) || str_match((const char * const * const)(options->include_files_ci), options->ninclude_files_ci, filename, 1)) valid = 1; } } /* check for excludes (options -x and -X) */ if(str_match((const char * const * const)(options->exclude_files), options->nexclude_files, filename, 0) || str_match((const char * const * const)(options->exclude_files_ci), options->nexclude_files_ci, filename, 1)) valid = 0; #if defined(DEBUG) fprintf(stderr, "%s(): %s: %s, validity: %s\n", __func__, is_leaf ? "leaf" : "directory", filename, valid ? 
"valid" : "invalid"); #endif return (valid); } /* Create a copy of environ(7) and return its address - return a pointer to the copy or NULL if error - returned environ must be freed later */ char ** clone_env(void) { unsigned int env_size = 0; char **new_env = NULL; /* import original environ */ extern char **environ; /* compute environ size */ while(environ[env_size]) env_size++; /* ending NULL */ env_size++; size_t malloc_size = sizeof(char *) * env_size; if_not_malloc(new_env, malloc_size, /* just print error message (within macro code) */ ) else { /* copy each pointer, beginning from the ending NULL value */ while(env_size > 0) { new_env[env_size - 1] = environ[env_size - 1]; env_size--; } } return (new_env); } /* Push a str pointer to a cloned environ(7) - return enlarged environ through env - returned environ must be freed later - return 0 (success) or 1 (failure) */ int push_env(char *str, char ***env) { assert(str != NULL); assert(env != NULL); assert(*env != NULL); unsigned int env_size = 0; char **new_env = NULL; /* compute environ size */ while((*env)[env_size]) env_size++; /* add our pointer */ env_size++; /* add ending NULL */ env_size++; size_t malloc_size = sizeof(char *) * env_size; if_not_malloc(new_env, malloc_size, return (1); ) /* copy each pointer, beginning from the ending NULL value */ new_env[env_size - 1] = NULL; new_env[env_size - 2] = str; env_size -= 2; while(env_size > 0) { new_env[env_size - 1] = (*env)[env_size - 1]; env_size--; } /* free previous environment and update env */ free(*env); *env = new_env; return (0); } fpart-0.9.2/src/utils.h000066400000000000000000000067311247141164700147540ustar00rootroot00000000000000/*- * Copyright (c) 2011-2015 Ganael LAPLANCHE * All rights reserved. * * Redistribution and use in source and binary forms, with or without * modification, are permitted provided that the following conditions * are met: * 1. 
Redistributions of source code must retain the above copyright
 *    notice, this list of conditions and the following disclaimer.
 * 2. Redistributions in binary form must reproduce the above copyright
 *    notice, this list of conditions and the following disclaimer in the
 *    documentation and/or other materials provided with the distribution.
 *
 * THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
 * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
 * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
 * ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
 * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
 * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
 * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
 * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
 * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
 * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
 * SUCH DAMAGE.
 */

#ifndef _UTILS_H
#define _UTILS_H

#include "types.h"
#include "options.h"

/* stat(2) */
#include <sys/types.h>
#include <sys/stat.h>

/* fnmatch(3) and FNM_CASEFOLD
   FNM_CASEFOLD is a GNU extension and may not be available */
#include <fnmatch.h>
#if defined(FNM_CASEFOLD)
#define _HAS_FNM_CASEFOLD
#else
#define FNM_CASEFOLD 0
#if defined(DEBUG)
#warning FNM_CASEFOLD not supported by fnmatch(3), \
    options '-X' and '-Y' disabled
#endif
#endif

#define round_num(x, y) \
    ((((x) % (y)) != 0) ? (((x) / (y)) * (y) + (y)) : (x))

#define rewind_list(head) \
    { while((head) && (head)->prevp) { (head) = (head)->prevp; } }

#define fastfw_list(head) \
    { while((head) && (head)->nextp) { (head) = (head)->nextp; } }

#define min(x, y) (((x) <= (y)) ? 
(x) : (y)) #define if_not_malloc(ptr, size, err_action) \ ptr = malloc(size); \ if (ptr == NULL) { \ fprintf(stderr, "%s(): cannot allocate memory\n", __func__); \ err_action \ } #define if_not_realloc(ptr, size, err_action) \ ptr = realloc(ptr, size); \ if (ptr == NULL) { \ fprintf(stderr, "%s(): cannot reallocate memory\n", __func__); \ err_action \ } unsigned int get_num_digits(double i); fsize_t get_size(char *file_path, struct stat *file_stat, struct program_options *options); char *abs_path(const char *path); int str_push(char ***array, unsigned int *num, const char * const str); void str_cleanup(char ***array, unsigned int *num); int str_match(const char * const * const array, const unsigned int num, const char * const str, const unsigned char ignore_case); int valid_filename(char *filename, struct program_options *options, unsigned char is_leaf); char ** clone_env(void); int push_env(char *str, char ***env); #endif /* _UTILS_H */ fpart-0.9.2/tools/000077500000000000000000000000001247141164700140055ustar00rootroot00000000000000fpart-0.9.2/tools/Makefile.am000066400000000000000000000000321247141164700160340ustar00rootroot00000000000000dist_bin_SCRIPTS = fpsync fpart-0.9.2/tools/fpsync000077500000000000000000000665501247141164700152510ustar00rootroot00000000000000#!/bin/sh # Copyright (c) 2014-2015 Ganael LAPLANCHE # All rights reserved. # # Redistribution and use in source and binary forms, with or without # modification, are permitted provided that the following conditions # are met: # 1. Redistributions of source code must retain the above copyright # notice, this list of conditions and the following disclaimer. # 2. Redistributions in binary form must reproduce the above copyright # notice, this list of conditions and the following disclaimer in the # documentation and/or other materials provided with the distribution. 
#
# THIS SOFTWARE IS PROVIDED BY THE AUTHORS AND CONTRIBUTORS ``AS IS'' AND
# ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
# ARE DISCLAIMED. IN NO EVENT SHALL THE AUTHORS OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
# OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
# HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
# LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
# OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
# SUCH DAMAGE.

# This script is a simple wrapper showing how fpart can be used to migrate data.
# It uses fpart and rsync to spawn multiple rsync instances to migrate data
# from src_dir/ to dst_dir/. Rsync jobs can execute either locally or over SSH.
# This migration is incremental and will need a final rsync pass
# (rsync -av --delete src_dir/ dst_dir/) to remove extra files from dst_dir/.
FPSYNC_VERSION="0.9.2"

########## Default values for options

# Number of sync jobs to run in parallel ("workers", -n)
OPT_JOBS=2
# Same, but autodetected
#OPT_JOBS=$(sysctl -n hw.ncpu)  # On FreeBSD
#OPT_JOBS=$(nproc)              # On Linux
# Number of sync jobs from resumed run, read from the 'info' file
OPT_RJOBS=
# Maximum files per sync job (-f)
OPT_FPMAXPARTFILES="2000"
# Maximum bytes per sync job (-s)
OPT_FPMAXPARTSIZE="$((4 * 1024 * 1024 * 1024))" # 4 GB
# SSH workers (execute jobs locally if not defined, -w)
OPT_WRKRS=""
# Fpart shared dir (must be shared amongst all workers, -d)
OPT_FPSHDIR=""
# Temporary dir (local, used for queue management, -t)
OPT_TMPDIR="/tmp/fpsync"
# Job name for resume mode (-r)
OPT_JOBNAME=""
# Rsync options (-o)
OPT_RSYNC="-av --numeric-ids"
# Fpart options (-O)
OPT_FPART="-x .zfs -x .snapshot* -x .ckpt"
# Sudo mode (-S)
OPT_SUDO=""
# Verbose mode (-v)
OPT_VERBOSE="0"
# Source directory
OPT_SRCDIR=""
# Destination directory
OPT_DSTDIR=""
# Mail - Uncomment to receive a mail when the whole run has finished.
# The master machine (the one running this script) must be able to send mail
# using the 'mail' command. No CLI option is provided to change it.
#OPT_MAIL="address@mydomain.tld"

########## Various functions

# Print help
usage () {
    echo "fpsync v${FPSYNC_VERSION} - Sync directories in parallel using fpart and rsync"
    echo "Copyright (c) 2014-2015 Ganael LAPLANCHE"
    echo "WWW: http://contribs.martymac.org"
    echo "Usage: $0 [OPTIONS...] /src_dir/ /dst_dir/"
    echo "Usage: $0 [-r jobname] [OPTIONS...]"
    echo "  -n x        start <x> concurrent sync jobs"
    echo "  -f y        transfer at most <y> files per sync job"
    echo "  -s z        transfer at most <z> bytes per sync job"
    echo "  -w wrks     space-separated list of SSH workers"
    echo "              e.g.: -w 'login@host1 login@host2 login@host3'"
    echo "              or:   -w 'login@host1' -w 'login@host2' -w 'login@host3'"
    echo "              Jobs are executed locally if not specified (default)."
    echo "  -d /dir/    set fpsync shared dir to </dir/> (absolute path)"
    echo "              This option is mandatory when using SSH workers."
    echo "  -t /dir/    set fpsync temp dir to </dir/> (absolute path)"
    echo "  -r jobname  resume job <jobname>"
    echo "              (options -f, -s, -o, /src_dir/ and /dst_dir/"
    echo "              are ignored when resuming a previous run)"
    echo "  -o options  override default rsync options with <options>"
    echo "              Use this option with care as certain options are"
    echo "              incompatible with a parallel usage (e.g. --delete)"
    echo "  -O options  override default fpart options with <options>"
    echo "  -S          use sudo for filesystem crawling and synchronizations"
    echo "  -v          verbose mode (default: quiet)"
    echo "              This option can be specified several times to"
    echo "              increase verbosity level."
    echo "  -h          this help"
    echo "  /src_dir/   source directory (absolute path)"
    echo "  /dst_dir/   destination directory (absolute path)"
    echo "See fpsync(1) for more details."
}

# Print a message to stderr and exit with error code 1
end_die () {
    [ -n "$1" ] && echo "$1" 1>&2
    exit 1
}

# Print (to stdout) and log a message
# $1 = level (0 = quiet, 1 = verbose, >=2 more verbose)
# $2 = message to log
echo_log () {
    is_num "$1" && [ ${OPT_VERBOSE} -ge $1 ] && [ -n "$2" ] && \
        echo "$2"
    [ -n "$2" ] && \
        echo "$2" >> "${FPART_LOGFILE}"
}

# Check if $1 is an absolute path
is_abs_path() {
    echo "$1" | grep -qE "^/"
}

# Check if $1 is a number
is_num () {
    echo "$1" | grep -qE '^[0-9]+$'
}

# Parse user options and initialize OPT_* global variables
parse_opts () {
    local opt OPTARG OPTIND

    while getopts "n:f:s:w:d:t:r:o:O:Svh" opt
    do
        case "${opt}" in
        "n")
            if is_num "${OPTARG}" && [ ${OPTARG} -ge 1 ]
            then
                OPT_JOBS=${OPTARG}
            else
                end_die "Option -n expects a numeric value >= 1"
            fi
            ;;
        "f")
            if is_num "${OPTARG}" && [ ${OPTARG} -gt 0 ]
            then
                OPT_FPMAXPARTFILES=${OPTARG}
            else
                end_die "Option -f expects a numeric value > 0"
            fi
            ;;
        "s")
            if is_num "${OPTARG}" && [ ${OPTARG} -gt 0 ]
            then
                OPT_FPMAXPARTSIZE=${OPTARG}
            else
                end_die "Option -s expects a numeric value > 0"
            fi
            ;;
        "w")
            if [ 
-n "${OPTARG}" ] then OPT_WRKRS="${OPT_WRKRS} ${OPTARG}" else end_die "Invalid workers list supplied" fi ;; "d") if is_abs_path "${OPTARG}" then OPT_FPSHDIR="${OPTARG}" else end_die "Please supply an absolute path for shared dir" fi ;; "t") if is_abs_path "${OPTARG}" then OPT_TMPDIR="${OPTARG}" else end_die "Please supply an absolute path for temp dir" fi ;; "r") if [ -n "${OPTARG}" ] then OPT_JOBNAME="${OPTARG}" else end_die "Invalid job name supplied" fi ;; "o") if [ -n "${OPTARG}" ] then OPT_RSYNC="${OPTARG}" else end_die "Invalid rsync options supplied" fi ;; "O") if [ -n "${OPTARG}" ] then OPT_FPART="${OPTARG}" else end_die "Invalid fpart options supplied" fi ;; "S") OPT_SUDO="yes" ;; "v") OPT_VERBOSE="$((${OPT_VERBOSE} + 1))" ;; "h") usage exit 0 ;; *) usage end_die "Invalid option specified" ;; esac done shift $((${OPTIND} - 1)) # Validate OPT_FPSHDIR (shared directory) if [ -z "${OPT_WRKRS}" ] then # For local jobs, set shared directory to temporary directory [ -z "${OPT_FPSHDIR}" ] && \ OPT_FPSHDIR="${OPT_TMPDIR}" else # For remote ones, specifying a shared directory is mandatory [ -z "${OPT_FPSHDIR}" ] && \ end_die "Please supply a shared dir when specifying workers" fi # Check for src and dst dirs presence and validity if [ -z "${OPT_JOBNAME}" ] then if is_abs_path "$1" && is_abs_path "$2" then OPT_SRCDIR="$1" OPT_DSTDIR="$2" else usage end_die "Please supply an absolute path for both src_dir/ and dst_dir/" fi fi } ########## Work-related functions (in-memory, running-jobs handling) # Initialize WORK_FREEWORKERS by expanding OPT_WRKRS up to OPT_JOBS elements, # assigning a fixed number of slots to each worker. # Sanitize OPT_WRKRS if necessary. 
work_list_free_workers_init () { local _OPT_WRKRS_NUM=$(echo ${OPT_WRKRS} | awk '{print NF}') if [ ${_OPT_WRKRS_NUM} -gt 0 ] then local _i=0 while [ ${_i} -lt ${OPT_JOBS} ] do local _OPT_WRKRS_IDX="$((${_i} % ${_OPT_WRKRS_NUM} + 1))" WORK_FREEWORKERS="${WORK_FREEWORKERS} $(echo ${OPT_WRKRS} | awk '{print $'${_OPT_WRKRS_IDX}'}')" _i=$((${_i} + 1)) done else OPT_WRKRS="" WORK_FREEWORKERS="local" fi } # Pick-up next worker work_list_pick_next_free_worker () { echo "${WORK_FREEWORKERS}" | awk '{print $1}' } # Remove next worker from list work_list_trunc_next_free_worker () { WORK_FREEWORKERS="$(echo ${WORK_FREEWORKERS} | sed -E 's/^[[:space:]]*[^[:space:]]+[[:space:]]*//')" } # Push a work to the list of currently-running ones work_list_push () { if [ -n "$1" ] then WORK_LIST="${WORK_LIST} $1" WORK_NUM="$((${WORK_NUM} + 1))" fi } # Rebuild the currently-running jobs' list by examining each process' state work_list_refresh () { local _WORK_LIST="" local _WORK_NUM=0 for _JOB in ${WORK_LIST} do # If the process is still alive, keep it if ps "$(echo ${_JOB} | cut -d ':' -f 1)" 1>/dev/null 2>&1 then _WORK_LIST="${_WORK_LIST} ${_JOB}" _WORK_NUM="$((${_WORK_NUM} + 1))" # If not, put its worker to the free list else echo_log "2" "<= [QMGR] Job ${_JOB} finished" if [ -n "${OPT_WRKRS}" ] then WORK_FREEWORKERS="${WORK_FREEWORKERS} $(echo ${_JOB} | cut -d ':' -f 2)" fi fi done WORK_LIST=${_WORK_LIST} WORK_NUM=${_WORK_NUM} } ########## Jobs-related functions (on-disk, jobs' queue handling) # Initialize job queue and work directories job_queue_init () { mkdir -p "${JOBS_QUEUEDIR}" 2>/dev/null || \ end_die "Cannot create job queue directory ${JOBS_QUEUEDIR}" mkdir -p "${JOBS_WORKDIR}" 2>/dev/null || \ end_die "Cannot create job work directory ${JOBS_WORKDIR}" } # Dump job queue information to allow later resuming job_queue_info_dump () { # Create "info" file local _TMPMASK="$(umask)" umask "0077" touch "${JOBS_QUEUEDIR}/info" 2>/dev/null umask "${_TMPMASK}" # Dump information 
necessary for job resuming cat << EOF > "${JOBS_QUEUEDIR}/info" || \ end_die "Cannot record job information" # Job information used for resuming, do not edit ! OPT_RJOBS="${OPT_JOBS}" OPT_SRCDIR="${OPT_SRCDIR}" OPT_DSTDIR="${OPT_DSTDIR}" EOF } job_queue_info_load () { # Source info file and initialize a few variables . "${JOBS_QUEUEDIR}/info" || \ end_die "Cannot read job information" # Validate loaded options { is_num "${OPT_RJOBS}" && [ ${OPT_RJOBS} -ge 1 ] ;} || \ end_die "Invalid option value loaded from resumed job: OPT_RJOBS" { is_abs_path "${OPT_SRCDIR}" && is_abs_path "${OPT_DSTDIR}" ;} || \ end_die "Invalid options value loaded from resumed job: OPT_SRCDIR/OPT_DSTDIR" } # Set the "fp_done" (fpart done) flag within job queue job_queue_fp_done () { sleep 1 # Ensure this very last file gets created within the next second of # last job file's mtime. Necessary for filesystems that don't get # below the second for mtime precision (msdosfs). touch "${JOBS_QUEUEDIR}/fp_done" } # Set the "sl_done" (sync loop done) flag within job queue job_queue_sl_done () { touch "${JOBS_QUEUEDIR}/sl_done" } # Set the "sl_stop" (sync loop stop) flag within job queue job_queue_sl_stop () { touch "${JOBS_QUEUEDIR}/sl_stop" } # Handle first ^C: stop queue processing by setting the "sl_stop" flag # then wait for sync jobs to complete and display status before exiting sigint_handler () { SIGINT_COUNT="$((${SIGINT_COUNT} + 1))" job_queue_sl_stop if [ ${SIGINT_COUNT} -eq 1 ] then echo_log "1" "===> Interrupted. Waiting for running jobs to complete..." 
echo_log "1" "===> (hit ^C again to kill them and exit)" # Wait for queue processing to stop wait # Display current status before exiting [ ${OPT_VERBOSE} -ge 1 ] && siginfo_handler # Exit program end_die fi } # Handle subsequent ^C from within job_queue_loop(): kill sync processes # to fast-unlock the main process (waiting for child processes to exit) job_queue_loop_sigint_handler () { SIGINT_COUNT="$((${SIGINT_COUNT} + 1))" if [ ${SIGINT_COUNT} -eq 2 ] then echo_log "1" "===> Interrupted again, killing remaining jobs" for _JOB in ${WORK_LIST} do kill "$(echo ${_JOB} | cut -d ':' -f 1)" 1>/dev/null 2>&1 done # Wait for child processes to exit and let parent process end_die() wait fi } # Handle ^T: print info about queue status siginfo_handler () { local _jobs_total="$(cd "${FPART_PARTSDIR}" && ls -t1 | wc -l)" local _jobs_done="$(cd "${JOBS_WORKDIR}" && ls -t1 | grep -v 'fp_done' | wc -l)" local _jobs_percent="??" [ ${_jobs_total} -ge 1 ] && \ _jobs_percent="$(( (${_jobs_done} * 100) / ${_jobs_total} ))" # Trim values _jobs_total="$(( ${_jobs_total} + 0))" _jobs_done="$(( ${_jobs_done} + 0))" echo "<=== Parts done: ${_jobs_done}/${_jobs_total} (${_jobs_percent}%), remaining: $((${_jobs_total} - ${_jobs_done}))" } # Get next job name relative to ${JOBS_WORKDIR}/ # Returns empty string if no job is available # JOBS_QUEUEDIR can host several types of file : # : a sync job to perform # 'info': info file regarding this fpsync run # 'sl_stop': the 'immediate stop' flag, set when ^C is hit # 'fp_done': set when fpart has finished crawling src_dir/ and generated jobs # 'sl_done': set when job_queue_loop() terminates (either normally or stopped) job_queue_next () { local _NEXT="" if [ -f "${JOBS_QUEUEDIR}/sl_stop" ] then echo "sl_stop" else _NEXT=$(cd "${JOBS_QUEUEDIR}" && ls -rt1 | grep -v "info" | head -n 1) if [ -n "${_NEXT}" ] then mv "${JOBS_QUEUEDIR}/${_NEXT}" "${JOBS_WORKDIR}" || \ end_die "Cannot dequeue next job" echo "${_NEXT}" fi fi } # Main jobs' loop: pick up 
jobs within the queue directory and start them job_queue_loop () { echo_log "2" "===> [QMGR] Starting queue manager" # Trap SIGINT trap 'job_queue_loop_sigint_handler' 2 # Ignore SIGINFO from within loop, handled by the parent (master) process trap '' 29 local _NEXT="" while [ "${_NEXT}" != "fp_done" ] && [ "${_NEXT}" != "sl_stop" ] do local _PID="" if [ ${WORK_NUM} -lt ${OPT_JOBS} ] then _NEXT="$(job_queue_next)" if [ -n "${_NEXT}" ] && \ [ "${_NEXT}" != "fp_done" ] && \ [ "${_NEXT}" != "sl_stop" ] then if [ -z "${OPT_WRKRS}" ] then echo_log "2" "=> [QMGR] Starting job ${JOBS_WORKDIR}/${_NEXT} (local)" /bin/sh "${JOBS_WORKDIR}/${_NEXT}" & work_list_push "$!:local" else local _NEXT_HOST="$(work_list_pick_next_free_worker)" work_list_trunc_next_free_worker echo_log "2" "=> [QMGR] Starting job ${JOBS_WORKDIR}/${_NEXT} -> ${_NEXT_HOST}" "${SSH_BIN}" "${_NEXT_HOST}" '/bin/sh -s' \ < "${JOBS_WORKDIR}/${_NEXT}" & work_list_push "$!:${_NEXT_HOST}" fi fi fi work_list_refresh sleep 0.2 done if [ "${_NEXT}" = "fp_done" ] then echo_log "2" "<=== [QMGR] Done submitting jobs. Waiting for them to finish." else echo_log "2" "<=== [QMGR] Stopped. Waiting for jobs to finish." fi wait echo_log "2" "<=== [QMGR] Queue processed" # Set the 'sl_done' (sync done) flag to let the master process go job_queue_sl_done } ########## Program start (main() !) # Parse command-line options parse_opts "$@" ## Options' post-processing section # Job name initialization if [ -n "${OPT_JOBNAME}" ] then # Resume mode, check if OPT_JOBNAME exists if [ -d "${OPT_TMPDIR}/queue/${OPT_JOBNAME}" ] && \ [ -d "${OPT_TMPDIR}/work/${OPT_JOBNAME}" ] then FPART_JOBNAME="${OPT_JOBNAME}" else end_die "Could not find specified job's queue and work directories" fi else # Generate a unique job name. This job name *must* remain # unique from one job to another. 
FPART_JOBNAME="$(echo ${OPT_DSTDIR} | LC_ALL=C tr -dc '/a-zA-Z0-9' | sed -E -e 's|/*$||' -e 's|^.*/([^/]+)$|\1|' | head -c 16)-$(date '+%s')-$$" fi # Queue manager configuration. This queue remains local, even when using SSH. JOBS_QUEUEDIR="${OPT_TMPDIR}/queue/${FPART_JOBNAME}" # Queue dir. JOBS_WORKDIR="${OPT_TMPDIR}/work/${FPART_JOBNAME}" # Current jobs' dir. # Paths to executables that must exist locally FPART_BIN="$(which fpart)" SSH_BIN="$(which ssh)" MAIL_BIN="$(which mail)" # Paths to executables that must exist both locally and remotely SUDO_BIN="$(which sudo)" # Paths to executables that must exist either locally or remotely (depending # on if you use SSH or not). When using SSH, the following binaries must be # present at those paths on each worker. RSYNC_BIN="$(which rsync)" # Do we need sudo ? if [ -n "${OPT_SUDO}" ] then SUDO="${SUDO_BIN}" else SUDO="" fi # Fpart paths. Those ones must be shared amongst all nodes when using SSH # (e.g. through a NFS share mounted on *every* single node, including the master # 'job submitter'). FPART_PARTSDIR="${OPT_FPSHDIR}/parts/${FPART_JOBNAME}" FPART_PARTSTMPL="${FPART_PARTSDIR}/part" FPART_LOGDIR="${OPT_FPSHDIR}/log/${FPART_JOBNAME}" FPART_LOGFILE="${FPART_LOGDIR}/fpart.log" # Fpart hooks (black magic is here !) FPART_COMMAND="/bin/sh -c '${SUDO} ${RSYNC_BIN} ${OPT_RSYNC} \ --files-from=\\\"\${FPART_PARTFILENAME}\\\" \ \\\"${OPT_SRCDIR}/\\\" \ \\\"${OPT_DSTDIR}/\\\"' \ 1>\"${FPART_LOGDIR}/\${FPART_PARTNUMBER}.stdout\" \ 2>\"${FPART_LOGDIR}/\${FPART_PARTNUMBER}.stderr\"" FPART_POSTHOOK="echo \"${FPART_COMMAND}\" > \ \"${JOBS_QUEUEDIR}/\${FPART_PARTNUMBER}\" && \ [ ${OPT_VERBOSE} -ge 2 ] && \ echo \"=> [FPART] Partition \${FPART_PARTNUMBER} written\"" # [1] # [1] Be careful to host the job queue on a filesystem that can handle # fine-grained mtime timestamps (i.e. with a sub-second precision) if you want # the queue to be processed in order when fpart generates several job files per # second. 
# On FreeBSD, vfs timestamps' precision can be tuned using the # vfs.timestamp_precision sysctl. See vfs_timestamp(9). ## End of options' post-processing section, let's start for real now ! SIGINT_COUNT="0" # ^C counter WORK_NUM=0 # Current number of running processes WORK_LIST="" # Work PID[:WORKER] list WORK_FREEWORKERS="" # Free workers' list # Check for essential binaries [ ! -x "${FPART_BIN}" ] && \ end_die "Fpart is missing locally, check your configuration" [ -n "${OPT_WRKRS}" ] && [ ! -x "${SSH_BIN}" ] && \ end_die "SSH is missing locally, check your configuration" [ -n "${OPT_MAIL}" ] && [ ! -x "${MAIL_BIN}" ] && \ end_die "Mail is missing locally, check your configuration" [ -n "${OPT_SUDO}" ] && [ ! -x "${SUDO}" ] && \ end_die "Sudo is missing locally, check your configuration" # Create / check for fpart shared directories if [ -z "${OPT_JOBNAME}" ] then # For a new job, create those directories mkdir -p "${FPART_PARTSDIR}" 2>/dev/null || \ end_die "Cannot create partitions' output directory: ${FPART_PARTSDIR}" mkdir -p "${FPART_LOGDIR}" 2>/dev/null || \ end_die "Cannot create log directory: ${FPART_LOGDIR}" else # In resume mode, FPART_PARTSDIR and FPART_LOGDIR must already exist if [ ! -d "${FPART_PARTSDIR}" ] || [ ! -d "${FPART_LOGDIR}" ] then end_die "Could not find specified job's 'parts' and 'log' directories" fi fi # Create or update log file touch "${FPART_LOGFILE}" 2>/dev/null || \ end_die "Cannot create log file: ${FPART_LOGFILE}" # Create / check for job and work queues if [ -z "${OPT_JOBNAME}" ] then # For a new job, create those directories job_queue_init else # When resuming a job, check if : # - the last fpart pass has completed # (the 'fp_done' flag is present) # - the last fpsync pass has *not* completed # (the 'fp_done' flag is *still* present) # - we can get the number of workers previously implied # (the 'info' flag is present) # - the work queue exists if [ ! -f "${JOBS_QUEUEDIR}/fp_done" ] || [ ! 
-f "${JOBS_QUEUEDIR}/info" ] then end_die "Specified job is not resumable ('fp_done' or 'info' flag missing)" fi [ ! -d "${JOBS_WORKDIR}" ] && \ end_die "Specified job is not resumable (work queue missing)" # Job is resumable, try to reload job options and prepare queues job_queue_info_load # Remove the "sl_stop" and "sl_done" flags, if any rm -f "${JOBS_QUEUEDIR}/sl_stop" 2>/dev/null rm -f "${JOBS_QUEUEDIR}/sl_done" 2>/dev/null # Move potentially-incomplete jobs to the jobs queue so that they can be # executed again. We consider the worst-case scenario and resume OPT_RJOBS # last jobs, some of them being partially finished. for _file in \ $(cd "${JOBS_WORKDIR}" && ls -t1 | head -n "${OPT_RJOBS}") do mv "${JOBS_WORKDIR}/${_file}" "${JOBS_QUEUEDIR}" 2>/dev/null || end_die "Cannot resume specified job" done fi # Validate src_dir/ locally (needed for fpart) for first runs or local ones if [ -z "${OPT_JOBNAME}" ] || [ -z "${OPT_WRKRS}" ] then [ ! -d "${OPT_SRCDIR}" ] && \ end_die "Source directory does not exist (or is not a directory): ${OPT_SRCDIR}" fi # When using SSH, validate src_dir/ and dst_dir/ remotely and check for rsync # presence (this also allows checking SSH connectivity to each declared host) if [ -n "${OPT_WRKRS}" ] then echo_log "2" "=====> Validating requirements on SSH nodes..." _FIRST_HOST="$(echo ${OPT_WRKRS} | awk '{print $1}')" for _host in ${OPT_WRKRS} do # Check for sudo presence (it must be passwordless) if [ -n "${OPT_SUDO}" ] then "${SSH_BIN}" "${_host}" "${SUDO} /bin/sh -c ':' 2>/dev/null" || \ end_die "Sudo executable not found or requires password on target ${_host}" fi # Blindly try to create dst_dir/ as well as a witness file on the first # node. 
Using a witness file will allow us to check for its # presence/visibility from other nodes, avoiding "split-brain" # situations where dst_dir/ exists but is not shared amongst all nodes # (typically a local mount point where the shared storage area *should* # be mounted but isn't, for any reason). [ "${_host}" = "${_FIRST_HOST}" ] && \ "${SSH_BIN}" "${_host}" "/bin/sh -c 'mkdir -p \"${OPT_DSTDIR}\" && \ ${SUDO} touch \"${OPT_DSTDIR}/${FPART_JOBNAME}\"' 2>/dev/null" # Check for src_dir/ and dst_dir/ (witness file) "${SSH_BIN}" "${_host}" "/bin/sh -c '[ -d \"${OPT_SRCDIR}\" ]'" || \ end_die "Source directory does not exist on target ${_host} (or is not a directory): ${OPT_SRCDIR}" "${SSH_BIN}" "${_host}" "/bin/sh -c '[ -f \"${OPT_DSTDIR}/${FPART_JOBNAME}\" ]'" || \ end_die "Destination directory (shared) is not available on target ${_host}: ${OPT_DSTDIR}" # Finally, check for rsync presence "${SSH_BIN}" "${_host}" "/bin/sh -c '[ -x \"${RSYNC_BIN}\" ]'" || \ end_die "Rsync executable not found on target ${_host}" echo_log "2" "<=== ${_host}: OK" done # Remove witness file "${SSH_BIN}" "${_FIRST_HOST}" \ "/bin/sh -c '${SUDO} rm -f \"${OPT_DSTDIR}/${FPART_JOBNAME}\"' 2>/dev/null" unset _FIRST_HOST else # Local usage - create dst_dir/ and check for rsync presence if [ ! -d "${OPT_DSTDIR}" ] then mkdir -p "${OPT_DSTDIR}" 2>/dev/null || \ end_die "Cannot create destination directory: ${OPT_DSTDIR}" fi [ ! -x "${RSYNC_BIN}" ] && \ end_die "Rsync is missing locally, check your configuration" fi # Dispatch OPT_WRKRS into WORK_FREEWORKERS work_list_free_workers_init # Let's rock ! 
echo_log "2" "=====> [$$] Syncing ${OPT_SRCDIR} => ${OPT_DSTDIR}" echo_log "1" "===> Job name: ${FPART_JOBNAME}$([ -n "${OPT_JOBNAME}" ] && echo ' (resumed)')" echo_log "2" "===> Start time: $(date)" echo_log "2" "===> Concurrent sync jobs: ${OPT_JOBS}" echo_log "2" "===> Workers: $(echo "${OPT_WRKRS}" | sed -E -e 's/^[[:space:]]+//' -e 's/[[:space:]]+/ /g')$([ -z "${OPT_WRKRS}" ] && echo 'local')" echo_log "2" "===> Shared dir: ${OPT_FPSHDIR}" echo_log "2" "===> Temp dir: ${OPT_TMPDIR}" # The following options are ignored when resuming if [ -z "${OPT_JOBNAME}" ] then echo_log "2" "===> Max files per sync job: ${OPT_FPMAXPARTFILES}" echo_log "2" "===> Max bytes per sync job: ${OPT_FPMAXPARTSIZE}" echo_log "2" "===> Rsync options: \"${OPT_RSYNC}\"" fi # Record job information job_queue_info_dump # Set SIGINT and SIGINFO traps and start job_queue_loop trap 'sigint_handler' 2 trap 'siginfo_handler' 29 echo_log "2" "===> Use ^C to abort, ^T (SIGINFO) to display status" job_queue_loop& # When not resuming a previous job, start fpart if [ -z "${OPT_JOBNAME}" ] then echo_log "1" "===> Analyzing filesystem..." # Start fpart from src_dir/ directory and produce jobs within # ${JOBS_QUEUEDIR}/ cd "${OPT_SRCDIR}" && \ ${SUDO} "${FPART_BIN}" \ -f "${OPT_FPMAXPARTFILES}" \ -s "${OPT_FPMAXPARTSIZE}" \ -o "${FPART_PARTSTMPL}" ${OPT_FPART} -Z -L \ -W "${FPART_POSTHOOK}" . 2>&1 | tee -a "${FPART_LOGFILE}" fi # Tell job_queue_loop that crawling has finished job_queue_fp_done # Wait for job_queue_loop to terminate # Use an active wait to allow signal processing (^T) echo_log "1" "===> Waiting for sync jobs to complete..." while [ ! -f "${JOBS_QUEUEDIR}/sl_done" ] do sleep 0.2 done # Display final status [ ${OPT_VERBOSE} -ge 1 ] && siginfo_handler # Examine results and send an e-mail if requested RET=$(find "${FPART_LOGDIR}/" -name "*.stderr" ! -size 0) MSG=$( { [ -z "${RET}" ] && echo 'Fpsync completed without error.' 
;} || \ { echo "Fpsync completed with errors, see logs:" && echo "${RET}" ;} ) if [ -n "${OPT_MAIL}" ] then echo "${MSG}" | ${MAIL_BIN} -s "Fpsync job ${FPART_JOBNAME}" "${OPT_MAIL}" fi echo_log "1" "<=== ${MSG}" echo_log "2" "<=== End time: $(date)" [ -n "${RET}" ] && exit 1 exit 0