--- swish++-6.1.5.orig/debian/copyright
+++ swish++-6.1.5/debian/copyright
@@ -0,0 +1,133 @@
+Authors
+=======
+
+This package was debianized by Jim Pick on
+Fri, 13 Mar 1998 22:41:24 -0800.
+
+It has since then been maintained by Josip Rodin
+ and Michael Hummel .
+
+The current package is maintained by Kapil Hari Paranjape
+.
+
+In addition to the Debian Maintainers and the Upstream author the
+files in the debian/ directory have been copyrighted by
+Joey Hess ,
+Kai Großjohann ,
+Christoph Conrad and
+Simon Josefsson .
+
+The files in debian/oo_indexing are copyrighted by
+Bastian Kleineidam .
+
+The original source was found at
+  http://homepage.mac.com/pauljlucas/software/swish/
+
+The current source was downloaded from
+  http://swishplusplus.sourceforge.net/
+
+Upstream Author: Paul J. Lucas .
+
+Copyright
+=========
+
+  Copyright (C) 1998-2006 Paul J. Lucas
+  Copyright (C) 1997-1999 Joey Hess
+  Copyright (C) 1998 Kai Großjohann
+  Copyright (C) 2000-2001 Christoph Conrad
+  Copyright (C) 1998, 2000-2001 Simon Josefsson
+
+All the source files used to build the Debian package contain
+copyright statements like the above along with text like the
+following from index.c:
+
+  This program is free software; you can redistribute it and/or modify
+  it under the terms of the GNU General Public License as published by
+  the Free Software Foundation; either version 2 of the License, or
+  (at your option) any later version.
+
+  This program is distributed in the hope that it will be useful,
+  but WITHOUT ANY WARRANTY; without even the implied warranty of
+  MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+  GNU General Public License for more details.
+
+On the Debian system, the full text of the GNU General Public License can be
+found in the file `/usr/share/common-licenses/GPL'.
+
+OpenOffice Indexing Example files:
+
+The OpenOffice Indexing Example files are by Bastian Kleineidam
+They contain copyright statements like the following taken from
+oo.conf:
+
+  Copyright (C) 2004 Bastian Kleineidam
+  The contents of this file are released in the Public Domain
+
+
+Additional Files:
+
+The files fnmatch.[ch] in the source package are used only in the
+Win32 build. These are copyright DJ Delorie and contain the copyright
+notice:
+
+  Copyright (C) 1995 DJ Delorie, see COPYING.DJ for details
+
+The file copying.dj is reproduced here in the interest of having all
+copyright statements related to the source in one place. It reads:
+
+========copying.dj==============
+This is the file "copying.dj". It does NOT apply to any sources or
+binaries copyrighted by UCB Berkeley, the Free Software Foundation, or
+any other agency besides DJ Delorie and others who have agreed to
+allow their sources to be distributed under these terms.
+
+  Copyright Information for sources and executables that are marked
+  Copyright (C) DJ Delorie
+  7 Kim Lane
+  Rochester NH 03867-2954
+
+This document is Copyright (C) DJ Delorie and may be distributed
+verbatim, but changing it is not allowed.
+
+Source code copyright DJ Delorie is distributed under the terms of the
+GNU General Public Licence, with the following exceptions:
+
+* Sources used to build crt0.o, gcrt0.o, libc.a, libdbg.a, and
+  libemu.a are distributed under the terms of the GNU Library General
+  Public License, rather than the GNU GPL.
+
+* Any existing copyright or authorship information in any given source
+  file must remain intact. If you modify a source file, a notice to that
+  effect must be added to the authorship information in the source file.
+
+* Runtime binaries, as provided by DJ in DJGPP, may be distributed
+  without sources ONLY if the recipient is given sufficient information
+  to obtain a copy of djgpp themselves. This primarily applies to
+  go32-v2.exe, emu387.dxe, and stubedit.exe.
+
+* Runtime objects and libraries, as provided by DJ in DJGPP, when
+  linked into an application, may be distributed without sources ONLY
+  if the recipient is given sufficient information to obtain a copy of
+  djgpp themselves. This primarily applies to crt0.o and libc.a.
+
+-----
+
+Changes to source code copyright BSD or FSF by DJ Delorie fall under
+the terms of the original copyright.
+
+A copy of the files "COPYING" and "COPYING.LIB" are included with this
+document. If you did not receive a copy of these files, you may
+obtain one from whence this document was obtained, or by writing.
+
+[ At this point copying.dj refers to the old FSF address ]
+
+========end of copying.dj==============
+
+The file COPYING is the GPL text.
+On Debian systems, the full text of the GNU General Public
+License can be found in the file `/usr/share/common-licenses/GPL'.
+
+The file COPYING.LIB is the Library GPL text.
+On Debian systems, the full text of the Library GNU General Public
+License can be found in the file `/usr/share/common-licenses/LGPL-2'.
+
--- swish++-6.1.5.orig/debian/watch
+++ swish++-6.1.5/debian/watch
@@ -0,0 +1,5 @@
+# format version number, currently 3; this line is compulsory!
+version=3
+
+http://sf.net/swishplusplus/swishplusplus-([0-9.]*)\.tar\.gz
+http://sf.net/swishplusplus/swish\+\+-([0-9.]*)\.tar\.gz
--- swish++-6.1.5.orig/debian/fixmanpage
+++ swish++-6.1.5/debian/fixmanpage
@@ -0,0 +1,48 @@
+#!
/bin/sh -e
+
+# Fixes man pages because of renamed binaries
+
+TMPFILE=`mktemp /tmp/deb.swish++.XXXXXX`
+
+sed -e 's,\<index\>,index++,g' \
+    -e 's,to index++,to index,g' \
+    -e 's,the index++,the index,g' \
+    -e 's,index++ to,index to,g' \
+    -e 's,word index++,word index,g' \
+    -e 's,index++-file,index-file,g' \
+    -e 's,\\f3index\\f1,\\f3index++\\f1,g' \
+    -e 's,index++ file,index file,g' \
+    -e 's,swish++.index++,swish++.index,g' \
+    -e 's,To index++,To index,g' \
+    -e 's,no_index++,no_index,g' \
+    -e 's,index++\/elements,index\/elements,g' \
+    -e 's,index++ the,index the,g' \
+    -e 's,file index++,file index,g' \
+    -e 's,index++ of files,index of files,g' \
+    -e 's,met.-name index++,meta-name index,g' \
+    -e 's,file-index++,file-index,g' \
+    -e 's,an index++,an index,g' \
+    -e 's,An index++,An index,g' \
+    -e 's,single index++,single index,g' \
+    -e 's,own index++,own index,g' \
+    -e 's,a new index++,a new index,g' \
+    -e 's,generated index++, generated index,g' \
+    -e 's# not word, index++# not word, index#g' \
+    -e 's, --dump-index++, --dump-index,g' \
+    -e 's,search\>,search++,g' \
+    -e 's,search++ engine,search engine,g' \
+    -e 's,search++-intensive,search-intensive,g' \
+    -e 's,the search++,the search,g' \
+    -e 's,a search++,a search,g' \
+    -e 's,to search++,to search,g' \
+    -e 's,can search++,can search,g' \
+    -e 's,/tmp/search++.socket,/tmp/search.socket,g' \
+    -e 's,search++ results,search results,g' \
+    -e 's,search++ result,search result,g' \
+    -e 's,extract\>,extract++,g' \
+    -e 's,extract++ the,extract the,g' \
+    -e 's,o extract++,o extract,g' \
+    -e 's,splitmail\>,splitmail++,g' \
+    < $1 > $TMPFILE;
+
+mv -f $TMPFILE $1
--- swish++-6.1.5.orig/debian/control
+++ swish++-6.1.5/debian/control
@@ -0,0 +1,34 @@
+Source: swish++
+Section: web
+Priority: optional
+Maintainer: Kapil Hari Paranjape
+Build-Depends: debhelper (>= 4.0), zlib1g-dev, quilt (>= 0.45-3)
+Vcs-Svn: svn://svn.debian.org/svn/collab-maint/deb-maint/swish++/
+Standards-Version: 3.8.0
+Homepage:
http://swishplusplus.sourceforge.net/
+
+Package: swish++
+Architecture: any
+Depends: ${shlibs:Depends}, perl5
+Suggests: xpdf-utils, antiword
+Description: Simple Document Indexing System for Humans: C++ version
+ SWISH++ is a Unix-based file indexing and searching engine
+ (typically used to index and search files on web sites). It
+ was based on SWISH-E although SWISH++ is a complete rewrite.
+ .
+ SWISH++ features:
+  * Lightning-fast indexing
+  * Indexes META elements, ALT, and other attributes
+  * Selectively not index text within HTML or XHTML elements
+  * Intelligently index mail and news files
+  * Index Unix manual page files
+  * Apply filters to files on-the-fly prior to indexing
+  * Index non-text files such as Microsoft Office documents
+  * Modular indexing architecture
+  * Index new files incrementally
+  * Index remote web sites
+  * Handles large collections of files
+  * Lightning-fast searching
+  * Optional word stemming (suffix stripping)
+  * Ability to run as a search server
+  * Easy-to-parse results format
--- swish++-6.1.5.orig/debian/changelog
+++ swish++-6.1.5/debian/changelog
@@ -0,0 +1,398 @@
+swish++ (6.1.5-2.2) unstable; urgency=low
+
+  * Non-maintainer upload.
+  * Add fix_ftbfs_with_gcc4.7 patch.
+    Fix FTBFS with gcc 4.7 by adding 'this->' where needed.
+    Thanks to Cyril Brulebois for the patch. (Closes: #667386)
+
+ -- Salvatore Bonaccorso Thu, 17 May 2012 09:06:13 +0200
+
+swish++ (6.1.5-2.1) unstable; urgency=low
+
+  * Non-maintainer upload.
+  * Fix FTBFS with newer GCC 4.6 (Closes: #629813) and with binutils-gold
+    (Closes:#556479). Patches thanks to Georgios M. Zarkadas.
+
+ -- Ana Beatriz Guerrero Lopez Mon, 12 Sep 2011 19:37:31 +0200
+
+swish++ (6.1.5-2) unstable; urgency=low
+
+  * Added some more detailed remarks in README.Debian about
+    the use of swish++ in cron scripts based on remarks from
+    upstream author Paul J. Lucas.
+    - we do not need the print_failed_filter patch. Users
+      should run with "-v4" instead.
+    - the bugs #459611 and #461349 are marked "wontfix" now
+      that instructions on fixing these problems are provided.
+  * debian/watch: Added entry for take new upstream .tar.gz name.
+  * debian/rlimit: Ensure that hard limit is set with "-H" flag to
+    ulimit.
+  * debian/patches/fixincludes_gcc4.4: Added fix for gcc4.4 FTBFS
+    thanks to Martin MichlMayr. Closes: #504850.
+  * debian/control: Standards Version 3.8.0. No other changes
+    required.
+  * debian/patches/print_failed_filter: Removed. No longer required.
+
+ -- Kapil Hari Paranjape Thu, 05 Mar 2009 18:58:35 +0530
+
+swish++ (6.1.5-1) unstable; urgency=low
+
+  * New upstream release (6.1.5)
+  * debian/control:
+    - added Homepage: field pointing to sourceforge site.
+    - Standards Version: 3.7.3.
+  * Implemented a partial solution to #459611 and #461349
+    as a wrapper for filters which imposes resource limits:
+    - debian/rlimit: the wrapper script.
+    - debian/rules: modified the "binary-arch" rule to actually
+      install the script in examples directory.
+    - README.Debian: added documentation on this problem.
+  * debian/patches/print_failed_filter: generate error message
+    about failed filter. (Closes: #211513)
+  * debian/control: Added Vcs-Svn tag.
+  * Updated debhelper usage:
+    - debian/compat: set to '4'.
+    - debian/rules: removed definition of DH_COMPAT.
+
+ -- Kapil Hari Paranjape Thu, 21 Feb 2008 16:04:29 +0530
+
+swish++ (6.1.4-2) unstable; urgency=low
+
+  * debian/patches:
+    - add_explicit_cstdlib_dependencies: Add explicit
+      include for cstdlib. Thanks to Martin Michlmayr for the
+      patch. Closes: #417712.
+    - use_PATH_MAX_from_climits: Comment out upstream's
+      redefinition of PATH_MAX. Closes: #397955.
+    - fix_man_pages: Remove extra space in synopsis for man/man1/splitmail.1
+      Closes: #384988.
+    - fix_man4to5: Change references from section 4 to section 5 in
+      all relevant man pages.
+    - splitmail_junk_header: Make splitmail save a non-mail junk header
+      to a separate file. Closes: #420791.
+  * debian/watch: Fix url for sourceforge.
+  * debian/changelog: Fix typo in the previous entry closing #242238.
+  * debian/control: Change maintainer's e-mail address.
+  * debian/copyright: Edited out the old address of FSF.
+
+ -- Kapil Hari Paranjape Mon, 6 Aug 2007 07:13:25 +0530
+
+swish++ (6.1.4-1) unstable; urgency=low
+
+  * New Maintainer. Closes: #385743.
+  * New upstream release (6.1.4). Closes: #321674.
+    - Builds with g++ 4.1. Closes: #356160.
+  * debian/patches:
+    - incorporated patches under "quilt".
+    - patch from Andreas Jochens to cast char to int correctly for
+      64bit arches. Closes: #302561.
+  * debian/rules:
+    - updated for new location of scripts in source.
+    - changed "configure-stamp" to use /usr/share/quilt/quilt.make.
+      following suggestion of James Westby.
+    - replaced use of "debian/fixmanpages" by use of
+      quilt.
+    - changed "configure-stamp" to fix manpage location.
+      Closes: #282508.
+    - DH_COMPAT=4
+    - Use "distclean" target of toplevel Makefile.
+    - Removed call to "dh_undocumented".
+  * debian/copyright: included all copyright statements.
+  * debian/control:
+    - added homepage to description.
+    - updated "debhelper" dependency.
+    - added "quilt" dependency.
+    - update package description. Closes: #291070.
+    - Standards-Version 3.7.2. No changes required.
+  * debian/changelog: Acknowledge NMU's. Thanks to Frank Lichtenheld,
+    Martin Michlmayr. Closes: #302561, #356160.
+  * debian/:
+    - Removed the unused files "all_in_one.patch" and
+      "swish++.conf.patch".
+    - added "watch" file.
+  * debian/oo_indexing: Included OpenOffice indexing example from
+    Bastian Kleineidam. Closes: #242238.
+  * debian/patches/use_gcc_for_ld: Patched config/config.mk and
+    top-level GNUmakefile to use "gcc" instead of "g++" for linking.
+    This prevents unneeded dependencies. Thanks to Christian
+    Aichinger's scripts for pointing this out.
+
+ -- Kapil Hari Paranjape Mon, 23 Oct 2006 09:27:25 +0530
+
+swish++ (5.15.3-3.2) unstable; urgency=low
+
+  * NMU as part of the GCC 4.1 transition.
+  * patches/gcc41.patch: Remove extra qualification from C++ header
+    file (closes: #356160).
+
+ -- Martin Michlmayr Thu, 25 May 2006 18:01:50 +0200
+
+swish++ (5.15.3-3.1) unstable; urgency=medium
+
+  * Non-maintainer upload.
+  * Apply patch by Andreas Jochens to fix FTBFS with g++ 4.0
+    (Closes: #302561)
+  * Add missing Build-Depends on zlib1g-dev
+
+ -- Frank Lichtenheld Sat, 6 Aug 2005 21:31:30 +0200
+
+swish++ (5.15.3-3) unstable; urgency=low
+
+  * Getting swish on the UploadQueue
+
+ -- Michael Hummel Tue, 30 Mar 2004 16:02:39 +0200
+
+swish++ (5.15.3-2) unstable; urgency=low
+
+  * Loosing quotes in swish++.conf as well
+  * Fixing typo in search.1
+    (closes: #238172)
+
+ -- Michael Hummel Mon, 29 Mar 2004 21:03:43 +0200
+
+swish++ (5.15.3-1) unstable; urgency=low
+
+  * New upstream release
+  * Bumping Standards-Version to 3.6.1
+  * Including tab and newline in ShellFilenameEscapeChars[] and ShellFilenameDelimChars[]\
+    and loosing the quotes in the FAQ.Debian filter example.
Thanks to calvin for those
+    (closes: #238362)
+
+ -- Michael Hummel Mon, 29 Mar 2004 18:00:48 +0200
+
+swish++ (5.14.2-1) unstable; urgency=low
+
+  * New upstream release
+    (
+    Shell meta charakters are now escaped (closes: #205342)
+    Filenames with spaces are handled correctly by filter (closes: #194358)
+    )
+
+
+ -- Michael Hummel Tue, 19 Aug 2003 16:48:36 +0200
+
+swish++ (5.13.5-1) unstable; urgency=low
+
+  * New upstream release
+  * Changed FilterFile rules in swish++.conf
+    (closes: #194358)
+
+ -- Michael Hummel Tue, 27 May 2003 21:02:12 +0200
+
+swish++ (5.13.2-2) unstable; urgency=medium
+
+  * Applied a patch to fix the mod_man problem
+    (closes: #176796)
+
+ -- Michael Hummel Wed, 29 Jan 2003 07:41:09 +0100
+
+swish++ (5.13.2-1) unstable; urgency=low
+
+  * New upstream release
+  * Waived the original install procedure, because it was terribly opaque
+  * Switched to Standards-Version: 3.5.8
+
+ -- Michael Hummel Sat, 18 Jan 2003 14:26:15 +0100
+
+swish++ (5.12.1-1) unstable; urgency=low
+
+  * New upstream release
+  * WORD_THRESHOLD back at 250.000, because it's no longer only a compile time
+    option and can be lowered through an user swish++.conf file or a
+    command line switch (-W)
+  * Fixed an indentation problem in the search++ man page (closes: #171656)
+
+ -- Michael Hummel Thu, 5 Dec 2002 00:00:21 +0100
+
+swish++ (5.11-2) unstable; urgency=low
+
+  * Applied a pre.5.11.1 patch from Paul
+    (closes: #167576)
+
+ -- Michael Hummel Wed, 20 Nov 2002 17:24:58 +0100
+
+swish++ (5.11-1) unstable; urgency=low
+
+  * New upstream release (closes: #166932)
+  * Patches are now splitted and located in a directory of its own
+  * Moved all the daemon scripts and documentation to examples/daemon
+  * Some cosmetic cleanup - like fixing executable defaults in example scripts
+    (closes: #166242)
+  * Suggesting antiword for filtering and extracting options
+
+ -- Michael Hummel Sat, 2 Nov 2002 08:36:04 +0100
+
+swish++ (5.9.6-3) unstable; urgency=low
+
+  * Removed
offending searchd.8 man page; moved converted man page and
+    script to the examples directory; cleaned up links to the man page
+    (closes: #163542)
+  * Cosmetic clean up (rules)
+
+ -- Michael Hummel Sun, 6 Oct 2002 21:57:15 +0200
+
+swish++ (5.9.6-2) unstable; urgency=low
+
+  * Changing to 3.5.7 standard
+
+ -- Michael Hummel Thu, 3 Oct 2002 09:05:23 +0200
+
+swish++ (5.9.6-1) unstable; urgency=low
+
+  * New upstream release
+
+ -- Michael Hummel Fri, 13 Sep 2002 20:38:02 +0200
+
+swish++ (5.9.5-1) unstable; urgency=low
+
+  * New upstream release
+  * Compiled with g++-3.2
+    (closes: #160105)
+
+ -- Michael Hummel Sun, 8 Sep 2002 18:40:04 +0200
+
+swish++ (5.9.2-1) unstable; urgency=low
+
+  * New upstream release
+  * Some cosmetic changes (swishmutt.sh)
+  * Added a FAQ.Debian regarding the index of ro directories ...
+    (closes: #138553)
+  * Compiled with gcc-3.1 on i386 (see below ...)
+
+ -- Michael Hummel Sun, 14 Jul 2002 16:51:11 +0200
+
+swish++ (5.7-2) unstable; urgency=low
+
+  * Suggesting xpdf-utils instead of xpdf, due to changes in
+    the xpdf package(s)
+  * Moved /etc/swish++.conf to /usr/share/doc/swish++/
+  * Cosmetical changes in the examples directory
+
+ -- Michael Hummel Thu, 28 Feb 2002 22:03:28 +0100
+
+swish++ (5.7-1) unstable; urgency=medium
+
+  * New upstream release
+    (closes: #129389)
+  * Some cosmetic changes (README.Debian, swishmutt.sh)
+  * Fixed (old) sed script bug - no empty man page anymore
+    (closes: #133269)
+
+ -- Michael Hummel Mon, 11 Feb 2002 20:55:58 +0100
+
+swish++ (5.6-2) unstable; urgency=low
+
+  * Dirty compiler trick: needed gcc-3.0 on i386 and gcc-2.95 on the
+    rest. The i386 version is compiled with g++ linked to gcc-3.0.
+  * Depends now on libstdc++3 therefore.
+  * Optimization should speed up indexing (~30%)
+  * Lowered threshold to 100.000
+  * (closes: #130605)
+  * Added an email-indexing-example-directory (howto + a script example
+    from me + a patched version of Kai Grossjohann's nnir.el)
+
+ -- Michael Hummel Tue, 29 Jan 2002 19:00:54 +0100
+
+swish++ (5.6-1) unstable; urgency=high
+
+  * New upstream release
+  * Removed compiler optimization flags; will have to wait for a upstream
+    solution to re-include optimization. (Safe bet: trading speed for
+    random segfaults)
+  * (closes: #129390)
+
+ -- Michael Hummel Sun, 20 Jan 2002 16:30:17 +0100
+
+swish++ (5.5.3-1) unstable; urgency=medium
+
+  * Adopted this orphan
+    (closes: #88974)
+  * Assuming some more progressive default values in swish++.conf
+  * Suggesting xpdf (pdftotext) for attachment filtering ...
+  * Some polishing with fixmanpage
+  * Some new primitiveness (regarding the patches), as a gradual approach
+    to adoption
+  * New upstream release seems to fix the ia64 build bug -
+    built successfully on caballero
+    (closes: #119736)
+  * Added a minimal manpage for searchc
+
+ -- Michael Hummel Sun, 30 Dec 2001 01:16:03 +0100
+
+swish++ (5.1-0.1) unstable; urgency=medium
+
+  * New upstream release.
+  * Marking the package as orphaned.
+  * Policy compliance.
+  * Updated DBS.
+  * Renamed newly introduced upstream splitmail to splitmail++ due to a
+    conflict with metamail.
+  * Various polishing.
+  * Made with help from Al Stone .
+
+ -- Josip Rodin Mon, 9 Jul 2001 12:48:54 +0200
+
+swish++ (3.0.3-3) unstable; urgency=low
+
+  * chmod +x debian/fixmanpage. Closes: #51235
+  * Reclosing bugs that only got marked as fixed
+    by -1 upload, which was recognized as an NMU.
+    Closes: #38173, #43417, #45677, #49391
+
+ -- Jim Pick Thu, 25 Nov 1999 16:31:01 -0800
+
+swish++ (3.0.3-2) unstable; urgency=low
+
+  * Oops, messed up the source package. Second try.
+
+ -- Jim Pick Sun, 21 Nov 1999 12:23:36 -0800
+
+swish++ (3.0.3-1) unstable; urgency=low
+
+  * New upstream source.
+  * Changed to new packaging format.
+  * Closes: #38173, #43417, #45677, #49391
+
+ -- Jim Pick Sat, 20 Nov 1999 15:48:55 -0800
+
+swish++ (1.4.1-1) unstable; urgency=low
+
+  * New upstream source.
+
+ -- Jim Pick Wed, 6 Jan 1999 12:36:12 -0800
+
+swish++ (1.4b2-1) unstable; urgency=low
+
+  * New upstream source.
+
+ -- Jim Pick Thu, 17 Dec 1998 21:10:38 -0800
+
+swish++ (1.1b3-3) frozen unstable; urgency=low
+
+  * Oops, forgot to add frozen.
+
+ -- Jim Pick Tue, 15 Dec 1998 00:28:25 -0800
+
+swish++ (1.1b3-2) unstable; urgency=low
+
+  * Recompiled. fixes: BUG#29050
+  * Fixed location of gzip/gunzip. fixes: BUG#24742
+  * Added support for bzip2.
+
+ -- Jim Pick Mon, 14 Dec 1998 23:21:35 -0800
+
+swish++ (1.1b3-1) unstable frozen; urgency=low
+
+  * Upstream release.
+  * Renamed binaries to prevent name collisions (fixes bug #20536).
+  * Recompiled so it now works with new libstdc++
+
+ -- Jim Pick Tue, 12 May 1998 09:47:24 -0700
+
+swish++ (1.1b1-1) unstable; urgency=low
+
+  * Initial Release.
+
+ -- Jim Pick Fri, 13 Mar 1998 22:41:24 -0800
+
+
--- swish++-6.1.5.orig/debian/docs
+++ swish++-6.1.5/debian/docs
@@ -0,0 +1,3 @@
+README
+./debian/README.Debian
+./debian/FAQ.Debian
--- swish++-6.1.5.orig/debian/README.Debian
+++ swish++-6.1.5/debian/README.Debian
@@ -0,0 +1,86 @@
+Some Remarks on configuring and running swish++
+-----------------------------------------------
+
+The programs contained in this package are very well documented.
+So just a few comments:
+
+* You can find a sample configuration file swish++.conf in
+  /usr/share/doc/swish++/examples/
+  (moved from /etc as it is only a sample)
+
+* Some of the executables had to be renamed in order to avoid name
+  confusions (search --> search++ ...)
+
+
+* Have a look at www_example if you intend to use swish++ for web
+  indexes; so you have to tweak it as the author states.
+
+* I moved the whole daemon stuff to
+  /usr/share/doc/swish++/examples/daemon/.
The reason is that you
+  will need the daemon-mode only in very special environments where
+  you most likely want to set some compile time parameters
+  accordingly, which I can't presage either.
+
+* (Personal observation: Swish++ is really powerful indexing email
+  folders consisting of one file per message. For example
+  Gnus-nnml + nnir + swish++ is an amazing combination
+  see "/usr/share/doc/swish++/examples/email_indexing/")
+
+MH
+
+Running swish++ in cron jobs
+----------------------------
+
+(Ref: bugs.debian.org reports #459611 #461349 and #211513)
+
+First of all read the previous section and obey. The documentation of
+swish++ is really extensive and the upstream author has implemented a
+number of thoughtful features.
+
+Swish++ is often used as part of a cron job to index user file
+contents.
+
+Swish++ tries to be very quiet as it works and so when all goes well
+you don't get needless noise. However, when some file or filter
+generates an error, the error message may be too brief for the user
+of swish++ to locate and fix the problem. In brief, use the "-v"
+option with a suitable level during the next swish++ run to locate
+the problem. Of course, you can also choose to run with -v4 always
+and use some sort of filtering mechanism for your cron logs.
+
+Swish++ tries to do its work as fast as possible. Hence it tries to
+obtain all the available resources in order to finish its task
+quickly. There are times when you do not want this to happen. For
+example, in cron jobs you would like to apply resource limits.
+
+There is no uniform mechanism for applying resource limits to cron
+jobs. Typically, these are run with "nice" but that is certainly far
+from enough. For example a cron job may open a large number of files
+or run for too long etc.
+
+It is currently not possible for swish to solve this problem on its
+own.
Users should make judicious use of "ulimit" which is defined in
+all POSIX shells in order to set resource limits for child processes.
+
+One way to limit cron jobs is to replace your cron script with something like
+
+    #!/bin/sh
+    ulimit
+
+
+This will put limits on the entire cron job but not on an individual
+filtering process called by swish++.
+
+So another possible solution is to use some program like the script
+"rlimit" which is provided in the examples/ subdirectory which allows
+you to write a filter rule like:
+
+    FilterFile *.pdf rlimit -t 3600 -- pdftotext %f @%F.txt
+
+which will limit the time the filter will run for to 3600 seconds or
+1 hour. (Of course you will need to make "rlimit" executable and put
+it in the PATH where index++ will look for its filters. /usr/local/bin
+should work on Debian systems).
+
+Kapil Hari Paranjape Thu, 21 Feb 2008 12:17:22 +0530
+--
--- swish++-6.1.5.orig/debian/rules
+++ swish++-6.1.5/debian/rules
@@ -0,0 +1,129 @@
+#!/usr/bin/make -f
+# Sample debian/rules that uses debhelper.
+# GNU copyright 1997 to 1999 by Joey Hess.
+
+tmp := $(CURDIR)/debian/swish++
+renamed := extract index search splitmail
+man4to5 := man4/swish++.conf.4 man4/swish++.index.4
+
+install_file = install -p -o root -g root -m 644
+install_program = install -p -o root -g root -m 755
+install_script = install -p -o root -g root -m 755
+makedirectory = install -p -d -o root -g root -m 755
+# Uncomment this to turn on verbose mode.
+#export DH_VERBOSE=1
+
+# These are probably irrelevant since the source is g++
+CFLAGS = -g -Wall
+ifneq (,$(findstring noopt,$(DEB_BUILD_OPTIONS)))
+CFLAGS += -O0
+else
+CFLAGS += -O2
+endif
+ifeq (,$(findstring nostrip,$(DEB_BUILD_OPTIONS)))
+install_program += -s
+endif
+
+include /usr/share/quilt/quilt.make
+
+configure: configure-stamp
+
+configure-stamp: $(QUILT_STAMPFN)
+	dh_testdir
+	## Patches and configuring
+	#ln -s debian/patches
+	#quilt push -a || test $$?
= 2
+	# Fix location of man pages
+	mkdir -p $(CURDIR)/man/man5
+	for i in $(man4to5); do j=$$(echo $${i} | tr '4' '5'); cp -p man/$${i} man/$${j}; done
+	touch configure-stamp
+
+build: build-stamp
+
+build-stamp: configure-stamp
+	dh_testdir
+	# Add here commands to compile the package.
+	$(MAKE)
+	touch build-stamp
+
+clean: unpatch
+	dh_testdir
+	dh_testroot
+	## Reverse patches
+	#quilt pop -a || test $$? = 2
+	## Clean up quilt stuff
+	#rm -rf patches .pc
+	#rm -f build-stamp configure-stamp
+	# Add here commands to clean up after the build process.
+	$(MAKE) distclean
+	# Clean up a left over file
+	rm -f init_mod_vars.c
+	# Clean up the renamed man pages
+	for i in $(renamed); do rm -f $(CURDIR)/man/man1/$${i}++.1; done;
+	# Remove copied manpages
+	rm -rf $(CURDIR)/man/man5
+	rm -f build-stamp configure-stamp
+	dh_clean
+
+install: build
+	dh_testdir
+	dh_testroot
+	dh_clean -k
+#	$(MAKE) install DESTDIR=$(tmp)
+	$(makedirectory) $(tmp)/usr/lib/swish++
+	$(makedirectory) $(tmp)/usr/bin
+	$(install_file) WWW.pm $(tmp)/usr/lib/swish++/
+	$(install_program) search $(tmp)/usr/bin/search++
+	$(install_program) extract $(tmp)/usr/bin/extract++
+	$(install_program) index $(tmp)/usr/bin/index++
+	$(install_script) scripts/splitmail $(tmp)/usr/bin/splitmail++
+	$(install_script) scripts/httpindex $(tmp)/usr/bin/httpindex
+
+	cd $(CURDIR)/man/man1 && for i in $(renamed); do cp -p $${i}.1 $${i}++.1; done;
+
+	$(makedirectory) $(CURDIR)/debian/daemon
+	$(install_script) scripts/searchmonitor $(CURDIR)/debian/daemon/searchmonitor
+	$(install_script) scripts/searchc $(CURDIR)/debian/daemon/searchc
+	$(install_script) scripts/searchd $(CURDIR)/debian/daemon/searchd
+	# should customize the whole install as it's really opaque now
+
+
+# Build architecture-independent files here.
+binary-indep: build install
+# We have nothing to do by default.
+
+# Build architecture-dependent files here.
+binary-arch: build install
+	dh_testdir
+	dh_testroot
+#	dh_installdebconf
+	dh_installdocs
+	dh_installexamples $(CURDIR)/debian/email_indexing/ \
+		$(CURDIR)/debian/oo_indexing $(CURDIR)/debian/daemon/ \
+		$(CURDIR)/swish++.conf $(CURDIR)/www_example/ \
+		$(CURDIR)/debian/rlimit
+	dh_installmenu
+#	dh_installlogrotate
+#	dh_installemacsen
+#	dh_installpam
+#	dh_installmime
+#	dh_installinit
+	dh_installcron
+	dh_installman man/man1/splitmail++.1 man/man1/search++.1 man/man1/extract++.1 man/man1/index++.1 \
+		man/man1/httpindex.1 man/man3/WWW.3 man/man5/swish++.conf.5 man/man5/swish++.index.5
+	dh_installinfo
+	dh_installchangelogs Changes
+	dh_link
+	dh_strip
+	dh_compress
+	dh_fixperms
+#	dh_makeshlibs
+	dh_installdeb
+#	dh_perl
+	dh_shlibdeps
+	dh_gencontrol
+	dh_md5sums
+	dh_builddeb
+
+binary: binary-indep binary-arch
+.PHONY: build clean binary-indep binary-arch binary install configure
--- swish++-6.1.5.orig/debian/searchc.1
+++ swish++-6.1.5/debian/searchc.1
@@ -0,0 +1,87 @@
+.\" Hey, EMACS: -*- nroff -*-
+.\" First parameter, NAME, should be all caps
+.\" Second parameter, SECTION, should be 1-8, maybe w/ subsection
+.\" other parameters are allowed: see man(7), man(1)
+.TH SEARCHC 1 "December 30, 2001"
+.\" Please adjust this date whenever revising the manpage.
+.\"
+.\" Some roff macros, for reference:
+.\" .nh disable hyphenation
+.\" .hy enable hyphenation
+.\" .ad l left justify
+.\" .ad b justify to both left and right margins
+.\" .nf disable filling
+.\" .fi enable filling
+.\" .br insert line break
+.\" .sp insert n+1 empty lines
+.\" for manpage-specific macros, see man(7)
+.SH NAME
+searchc \- Simple search client script used mostly to test 'search++'
+when running as a server daemon.
+.SH SYNOPSIS
+.B searchc
+.RI [ options ] \ query
+.SH DESCRIPTION
+This manual page documents briefly the
+.B searchc command.
+This manual page was written for the Debian GNU/Linux distribution
+because the original program does not have a manual page.
+.PP
+.\" TeX users may be more comfortable with the \fB\fP and
+.\" \fI\fP escape sequences to invoke bold face and italics,
+.\" respectively.
+
+.SH OPTIONS
+.TP
+.B \-a socket_addr
+Host and port of socket address [default: *:1967]
+.TP
+.B \-c config_file
+Name of configuration file [default: $ConfigFile_Default]
+.TP
+.B \-d
+Dump query word indices and exit
+.TP
+.B \-D
+Dump entire word index and exit
+.TP
+.B \-h
+Print this help message
+.TP
+.B \-m max_results
+Maximum number of results [default: $ResultsMax_Default]
+.TP
+.B \-M
+Dump meta-name index and exit
+.TP
+.B \-r skip_results
+Number of initial results to skip [default: 0]
+.TP
+.B \-s
+Stem words prior to search [default: no]
+.TP
+.B \-S
+Dump stop-word index and exit
+.TP
+.B \-T
+Connect via TCP socket
+.TP
+.B \-u socket_file
+Name of socket file [default: $SocketFile_Default]
+.TP
+.B \-U
+Connect via Unix domain socket
+.TP
+.B \-V
+Print version number and exit
+.TP
+.B \-w size[,chars]
+Dump window of words around query words [default: 0]
+
+.SH SEE ALSO
+.BR search++ (1),
+.\" .BR searchmonitor (8),
+.BR /usr/share/doc/swish++/examples/daemon/.
+.SH AUTHOR
+This manual page was written by Michael Hummel ,
+for the Debian GNU/Linux system (but may be used by others).
--- swish++-6.1.5.orig/debian/FAQ.Debian
+++ swish++-6.1.5/debian/FAQ.Debian
@@ -0,0 +1,35 @@
+1. Indexing ro media or directories where you don't have
+write permission
+
+
+>> Package: swish++ Version: 5.7-1 Severity: wishlist
+
+>> Hello, It'd be really nice if index++ could do unzipping
+>> internally (zlib?) instead of temporarily making an
+>> unzipped copy in the location of the compressed files,
+>> using the filter FilterFile *.gz gunzip -c %f > @%F
+
+>> Or is there another solution for indexing a read-only
+>> directory (or medium eg. CD-ROM) of compressed files?
+
+> Good question!
+
+> Yes there is another way: check man swish++.conf.
+
+> For example (swish++.conf):
+
+> FilterFile *.gz gunzip -c %f > @/tmp/%B
+
+
+> So you could do, for example:
+
+
+> index++ -v3 -i ~/tmp/index_readonly_media -c
+ /path/to/your/swish++.conf /ro_or_dir_wo_write_permission
+
+2. Indexing files with spaces in the file or path name
+
+
+It works and there are FilterFile examples in the example swish++.conf.
+
+
--- swish++-6.1.5.orig/debian/rlimit
+++ swish++-6.1.5/debian/rlimit
@@ -0,0 +1,23 @@
+#!/bin/sh
+# Copyright (C) 2008 Kapil Hari Paranjape
+# The contents of this file are released in the Public Domain
+# A sample script to run a filter with
+# some limits set as by ulimit
+
+ulimargs="$*"
+cmd="$*"
+ulimargs=${ulimargs%%--*}
+cmd=${cmd##*--}
+
+ulimit -H ${ulimargs} > /dev/null 2>&1
+
+[ -n "$cmd" ] || exit 0
+
+if [ -z "$cmd" ]
+then
+	echo "usage:" $0 " -- "
+	exit 1
+fi
+
+exec ${cmd}
+
--- swish++-6.1.5.orig/debian/dirs
+++ swish++-6.1.5/debian/dirs
@@ -0,0 +1,4 @@
+usr/bin
+etc
+usr/lib/swish++
+usr/share/man
--- swish++-6.1.5.orig/debian/compat
+++ swish++-6.1.5/debian/compat
@@ -0,0 +1 @@
+4
--- swish++-6.1.5.orig/debian/searchc.txt
+++ swish++-6.1.5/debian/searchc.txt
@@ -0,0 +1,61 @@
+SEARCHC(1)                                                          SEARCHC(1)
+
+NAME
+       searchc - Simple search client script used mostly to test 'search++'
+       when running as a server daemon.
+
+SYNOPSIS
+       searchc [options] query
+
+DESCRIPTION
+       This manual page documents briefly the searchc command. This manual
+       page was written for the Debian GNU/Linux distribution because the
+       original program does not have a manual page.
+ +OPTIONS + -a socket_addr + Host and port of socket address [default: *:1967] + + -c config_file + Name of configuration file [default: $ConfigFile_Default] + + -d Dump query word indices and exit + + -D Dump entire word index and exit + + -h Print this help message + + -m max_results + Maximum number of results [default: $ResultsMax_Default] + + -M Dump meta-name index and exit + + -r skip_results + Number of initial results to skip [default: 0] + + -s Stem words prior to search [default: no] + + -S Dump stop-word index and exit + + -T Connect via TCP socket + + -u socket_file + Name of socket file [default: $SocketFile_Default] + + -U Connect via Unix domain socket + + -V Print version number and exit + + -w size[,chars] + Dump window of words around query words [default: 0] + +SEE ALSO + search++(1), + + /usr/share/doc/swish++/examples/daemon/. + +AUTHOR + This manual page was written by Michael Hummel , + for the Debian GNU/Linux system (but may be used by others). + + December 30, 2001 SEARCHC(1) --- swish++-6.1.5.orig/debian/daemon/searchmonitor +++ swish++-6.1.5/debian/daemon/searchmonitor @@ -0,0 +1,140 @@ +#! /bin/sh +# This code is Bourne Shell for maximal portability. +## +# SWISH++ +# searchmonitor -- search daemon monitor +# +# Copyright (C) 2001 Paul J. Lucas +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. 
+## + +## +# What stuff is called and where it's located. +## +LOG="logger -t search -p daemon" +SEARCH_DEFAULT="/usr/bin/search++" + +## +# You may need to set LD_LIBRARY_PATH to contain the directory of the C++ +# run-time library, e.g. libstdc++.so if g++ was used to compile SWISH++. +## +#LD_LIBRARY_PATH=/usr/local/lib +#export LD_LIBRARY_PATH + +########### You shouldn't have to change anything below this line. ############ + +ME=`basename $0` +USAGE="usage: $ME [-c conf_file] [-s search_path]" + +## +# Parse command-line options. +## +while getopts Bc:s: opt +do + case $opt in + B) NO_BACKGROUND=true ;; + c) CONF_FILE="$OPTARG" ;; + s) SEARCH="$OPTARG" ;; + ?) echo $USAGE >&2; exit 1 ;; + esac +done +shift `expr $OPTIND - 1` + +## +# Check for existence of configuration file. +## +if [ -n "$CONF_FILE" ] +then + [ -f "$CONF_FILE" ] || { echo "$ME: $CONF_FILE not found" >&2; exit 3; } +else + CONF_FILE="swish++.conf" + if [ ! -f "$CONF_FILE" ] + then + CONF_FILE="/etc/$CONF_FILE" + [ -f "$CONF_FILE" ] || + { echo "$ME: no configuration file found" >&2; exit 3; } + fi +fi + +## +# Check for existence of search. +## +[ -z "$SEARCH" ] && SEARCH="$SEARCH_DEFAULT" +[ -f "$SEARCH" ] || { echo "$ME: $SEARCH not found" >&2; exit 2; } + +## +# Determine the numeric value for SIGUSR2 that we use to kill search. By using +# a user-generated signal, if search gets killed and it was killed by SIGUSR2, +# then it MUST have been done by manual request. +## +KILL=`which kill` # ensure shell built-in isn't used +USR2=`$KILL -l USR2 2>&1` + +## +# If we haven't already put ourselves into the background, call ourselves in +# the background (the child process) and immediately exit (as the parent +# process). This is effectively how to fork(2) in shell. +## +[ -z "$NO_BACKGROUND" ] && { $0 -B -c "$CONF_FILE" -s "$SEARCH" & exit 0; } + +## +# Start search as a daemon and wait for it to exit. If it exited because of a +# condition that restarting might cure, restart it. 
+## +while true +do + $SEARCH -B -c $CONF_FILE + exit_code=$? + if [ $exit_code -lt 128 ] + then # it exited by itself + case $exit_code in + 1) $LOG.alert "configuration file error; NOT restarting"; exit 0 ;; + 2) $LOG.alert "usage error; NOT restarting"; exit 0 ;; + 40) $LOG.alert "can't read index file; NOT restarting"; exit 0 ;; + 51) $LOG.alert "can't write PID file; NOT restarting"; exit 0 ;; + 52) $LOG.alert "bad host/IP; NOT restarting"; exit 0 ;; + 53) $LOG.err "can't create TCP socket; restarting ..." ;; + 54) $LOG.err "can't create Unix socket; restarting ..." ;; + 55) $LOG.alert "can't delete Unix socket; NOT restarting"; exit 0 ;; + 56) $LOG.err "can't bind TCP socket; restarting ..." ;; + 57) $LOG.err "can't bind Unix socket; restarting ..." ;; + 58) $LOG.err "can't listen TCP socket; restarting ..." ;; + 59) $LOG.err "can't listen Unix socket; restarting ..." ;; + 60) $LOG.err "can't select; restarting ..." ;; + 61) $LOG.err "can't accept; restarting ..." ;; + 62) $LOG.err "can't fork; restarting ..." ;; + 63) $LOG.alert "can't cd /; NOT restarting"; exit 0 ;; + 64) $LOG.err "can't create thread; restarting ..." ;; + 65) $LOG.err "can't detach thread; restarting ..." ;; + 66) $LOG.err "can't init thread condition; restarting ..." ;; + 67) $LOG.err "can't init thread mutex; restarting ..." ;; + 68) $LOG.alert "No such user; NOT restarting"; exit 0 ;; + 69) $LOG.alert "No such group; NOT restarting"; exit 0 ;; + *) $LOG.err "exited with code $exit_code; restarting..." ;; + esac + else # it received a signal + sig=`expr $exit_code - 128` + if [ $sig -eq $USR2 ] + then + $LOG.notice "shut down by manual request via signal USR2" + exit 0 + else + $LOG.err "died on signal $sig; restarting..." + fi + fi + sleep 5 +done 2>/dev/null +# vim:set et sw=4 ts=4: --- swish++-6.1.5.orig/debian/daemon/searchc +++ swish++-6.1.5/debian/daemon/searchc @@ -0,0 +1,164 @@ +#! 
/usr/bin/perl +## +# SWISH++ +# searchc: Simple search client script used mostly to test 'search' when +# running as a server daemon. +# +# Copyright (C) 1999 Paul J. Lucas +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA. +## + +$ConfigFile_Default = 'swish++.conf'; +$ResultsFormat_Default = 'classic'; +$ResultsMax_Default = 100; +$SocketAddress_Default = '1967'; +$SocketFile_Default = '/tmp/search.socket'; + +########## You shouldn't have to change anything below this line. ############# + +## +# SEE ALSO +# +# Larry Wall, et al. "Programming Perl," 3rd ed., O'Reilly and +# Associates, Inc., Sebastopol, CA, 1996, pp. 439-440. +## + +require 5.003; + +use File::Basename; +use Getopt::Std; +use Socket; + +$me = basename( $0 ); +sub usage; + +########## Process command-line options ####################################### + +( getopts( 'a:c:dDF:hm:Mr:sSTu:UVw:' ) && !$opt_h ) || usage(); +die "$me: one of -[auTU] must be specified\n" + unless $opt_a || $opt_u || $opt_T || $opt_U; +die "$me: -T and -U are mutually exclusive\n" if $opt_T && $opt_U; + +( $ConfigFile = $opt_c ) ||= $ConfigFile_Default; +$SocketAddress = $SocketAddress_Default; +$SocketFile = $SocketFile_Default; + +## +# First, parse the config. file (if any); then override variables specified on +# the command line with options. 
+## +if ( open( CONF, $ConfigFile ) ) { + my $conf = join( '', grep( !/^\s*#/, <CONF> ) ); # without comments + close( CONF ); + $conf =~ /SocketAddress\s+(\S+)/im; + $SocketAddress = $1 if $1; + $conf =~ /SocketFile\s+(\S+)/im; + $SocketFile = $1 if $1; + $conf =~ /ResultsFormat\s+(\S+)/im; + $ResultsFormat = $1 if $1; + $conf =~ /ResultsMax\s+(\S+)/im; + $ResultsMax = $1 if $1; +} else { + die "$me: could not read configuration \"$ConfigFile\"\n" + if $ConfigFile ne $ConfigFile_Default; +} + +$ResultsFormat = $opt_F if $opt_F; +$ResultsMax = $opt_m if $opt_m; +$SocketAddress = $opt_a if $opt_a; +$SocketFile = $opt_u if $opt_u; + +## +# Build a command line to pass to 'search'. +## +unshift( @ARGV, '-d' ) if $opt_d; +unshift( @ARGV, '-D' ) if $opt_D; +unshift( @ARGV, "-F$ResultsFormat" ) if $ResultsFormat; +unshift( @ARGV, "-m$ResultsMax" ) if $ResultsMax; +unshift( @ARGV, '-M' ) if $opt_M; +unshift( @ARGV, "-r$opt_r" ) if $opt_r; +unshift( @ARGV, '-s' ) if $opt_s; +unshift( @ARGV, '-S' ) if $opt_S; +unshift( @ARGV, '-V' ) if $opt_V; +unshift( @ARGV, "-w$opt_w" ) if $opt_w; + +########## Main ############################################################### + +if ( $opt_T ) { + ## + # Connect to the 'search' server via a TCP socket. + ## + my( $host, $port ) = $SocketAddress =~ /(?:([^\s:]+):)?(\d+)/; + $host = 'localhost' if $host eq '' || $host =~ /^\*?$/; + my $iaddr = inet_aton( $host ) || + die "$me: \"$host\": bad or unknown host\n"; + socket( SEARCH, PF_INET, SOCK_STREAM, getprotobyname( 'tcp' ) ) || + die "$me: can not open socket: $!\n"; + connect( SEARCH, sockaddr_in( $port, $iaddr ) ) || + die "$me: can not connect to \"$SocketAddress\": $!\n"; +} else { + ## + # Connect to the 'search' server via a Unix domain socket. 
+ ## + socket( SEARCH, PF_UNIX, SOCK_STREAM, 0 ) || + die "$me: can not open socket: $!\n"; + connect( SEARCH, sockaddr_un( $SocketFile ) ) || + die "$me: can not connect to \"$SocketFile\": $!\n"; +} + +## +# We *MUST* set autoflush for the socket filehandle, otherwise the server +# thread will hang since I/O buffering will wait for the buffer to fill that +# will never happen since queries are short. See [Wall], p. 781. +## +select( (select( SEARCH ), $| = 1)[0] ); + +## +# We also *MUST* print a trailing newline since the server reads an entire line +# of input (so therefore it looks and waits for a newline). +## +print SEARCH 'search ', join( ' ', @ARGV ), "\n"; # send query to server +shutdown( SEARCH, 1 ); # finished sending + +print while <SEARCH>; # read results back +close( SEARCH ); +exit 0; + +########## Miscellaneous function(s) ########################################## + +sub usage { + die <&1` + +## +# Figure out how to do echo without a newline. This is stolen from Perl's +# Configure script. +## +(echo "hi there\c"; echo) >/tmp/.echotmp +if grep c /tmp/.echotmp >/dev/null 2>&1 +then c='' ; n='-n' +else c='\c'; n='' +fi +rm -f /tmp/.echotmp + +## +# If we're on a Linux system, use its start/stop functions that make the +# output look perty. +## +if [ -f /etc/rc.d/init.d/functions ] +then + . 
/etc/rc.d/init.d/functions + LINUX=true +fi + +case "$1" in + start) + echo $n "Starting $SEARCH$c" + rm -f $PID_FILE + OPTS="-c $CONF_FILE -s $SEARCH_PATH" + if [ -n "$LINUX" ] + then daemon $SEARCHMONITOR $OPTS + else $SEARCHMONITOR $OPTS & + fi + echo + ;; + stop) + [ -r $PID_FILE ] || { echo $PID_FILE not found >&2; exit 1; } + echo $n "Stopping $SEARCH$c" + if [ -n "$LINUX" ] + then killproc $SEARCH -$USR2 2>/dev/null + else $KILL -s USR2 `head -1 $PID_FILE` || exit 1 + fi + echo + ;; + restart) + $SCRIPT_DIR/$SEARCH stop || exit 1 + sleep 3 + $SCRIPT_DIR/$SEARCH start + ;; + *) + echo "usage: $0 { start | stop | restart }" >&2 + exit 1 + ;; +esac +# vim:set et sw=4 ts=4: --- swish++-6.1.5.orig/debian/oo_indexing/oo.conf +++ swish++-6.1.5/debian/oo_indexing/oo.conf @@ -0,0 +1,16 @@ +# Copyright (C) 2004 Bastian Kleineidam +# The contents of this file are released in the Public Domain +# swish configuration entries for indexing OpenOffice documents + +# the .openoffice suffix can be whatever you like, you must also +# adjust also the FilterFile entries +IncludeFile text *.openoffice + +# replace /home/user/temp with a temporary path you like +# note that oototext.sh must be found in the PATH +# see also OpenOffice Documents http://books.evc-cit.info/ +FilterFile *.sxd oototext.sh %f > @/home/user/temp/%B.openoffice +FilterFile *.sxw oototext.sh %f > @/home/user/temp/%B.openoffice +FilterFile *.sxc oototext.sh %f > @/home/user/temp/%B.openoffice +FilterFile *.sxi oototext.sh %f > @/home/user/temp/%B.openoffice + --- swish++-6.1.5.orig/debian/oo_indexing/oocontent.xsl +++ swish++-6.1.5/debian/oo_indexing/oocontent.xsl @@ -0,0 +1,58 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + --- swish++-6.1.5.orig/debian/oo_indexing/oometa.xsl +++ swish++-6.1.5/debian/oo_indexing/oometa.xsl @@ -0,0 +1,62 @@ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +Lang: + + + + + + + + + + + + + + + + : + + + + + + --- 
swish++-6.1.5.orig/debian/oo_indexing/oototext.sh +++ swish++-6.1.5/debian/oo_indexing/oototext.sh @@ -0,0 +1,21 @@ +#!/bin/sh -e +# Copyright (C) 2004 Bastian Kleineidam +# The contents of this file are released in the Public Domain +# convert openoffice documents to text +# suitable for full-text indexing +# usage: oototext.sh + +# requires: unzip, xsltproc + +case "$1" in +*.sxd|*.sxw|*.sxc|*.sxi) + unzip -p "$1" meta.xml | xsltproc --novalid oometa.xsl - + unzip -p "$1" content.xml | xsltproc --novalid oocontent.xsl - + break + ;; +*) + echo "Unsupported file type $1" + exit 1 + break + ;; +esac --- swish++-6.1.5.orig/debian/patches/configure +++ swish++-6.1.5/debian/patches/configure @@ -0,0 +1,89 @@ +Index: swish++-6.1.4/config.h +=================================================================== +--- swish++-6.1.4.orig/config.h 2006-09-17 18:45:07.000000000 +0530 ++++ swish++-6.1.4/config.h 2006-09-17 18:45:36.000000000 +0530 +@@ -186,11 +186,11 @@ + // Default maximum number of search results; this can be overridden + // either in a config. file or on the command line. + +-char const ShellFilenameDelimChars[] = " \t&;<>|"; ++char const ShellFilenameDelimChars[] = " \t\n&;<>|"; + // Characters in a Unix shell command that delimit file names. Note + // that this says "file" (not "path") names. + +-char const ShellFilenameEscapeChars[] = " !\"#$&'()*/;<>?[\\]^`{|}~"; ++char const ShellFilenameEscapeChars[] = " \t\n!\"#$&'()*/;<>?[\\]^`{|}~"; + // Characters in a file name that must be escaped when passed to a + // Unix shell. This is a superset of what are commonly referred to as + // "meta-characers" because the space and tab characters are included. +@@ -199,8 +199,8 @@ + #ifdef __CYGWIN__ + char const TempDirectory_Default[] = "/temp"; + #else +-#error You have not set TempDirectory_Default for your system. +-#error Comment out these lines after you have set it. ++// #error You have not set TempDirectory_Default for your system. 
++// #error Comment out these lines after you have set it. + + char const TempDirectory_Default[] = "/tmp"; + #endif +@@ -235,8 +235,8 @@ + // is discarded as being too frequent; this can be overridden either + // in a config. file or on the command line. + +-#error You have not set WordThreshold_Default for your system. +-#error Comment out these lines after you have set it. ++// #error You have not set WordThreshold_Default for your system. ++// #error Comment out these lines after you have set it. + + int const WordThreshold_Default = 250000; + // The word count past which partial indicies are generated and merged +Index: swish++-6.1.4/config/config.mk +=================================================================== +--- swish++-6.1.4.orig/config/config.mk 2006-09-17 18:45:18.000000000 +0530 ++++ swish++-6.1.4/config/config.mk 2006-09-17 18:45:42.000000000 +0530 +@@ -159,7 +159,7 @@ + # The command to remove files recursively and ignore errors; + # usually "rm -fr" for Unix or "erase" for Windows. + +-PERL:= /usr/local/bin/perl ++PERL:= /usr/bin/perl + # The full path to the Perl 5 executable; usually "/bin/perl" or + # "/usr/local/bin/perl" for Unix or "\Perl\bin\perl" for + # Windows. +@@ -251,28 +251,28 @@ + # + ############################################################################### + +-INSTALL:= $(ROOT)/install-sh ++INSTALL:= /usr/bin/install + # Install command; usually "$(ROOT)/install-sh". + +-I_ROOT:= /usr/local ++I_ROOT:= $(DESTDIR)/usr + # The top-level directory of where SWISH++ will be installed. + + I_BIN:= $(I_ROOT)/bin + # Where executables are installed; usually "$(I_ROOT)/bin". + +-I_ETC:= $(I_ROOT)/etc ++I_ETC:= $(DESTDIR)/etc + # Where .conf files are installed; usually "$(I_ROOT)/etc". + +-I_LIB:= $(I_ROOT)/lib ++I_LIB:= $(I_ROOT)/lib/swish++ + # Where libraries are installed; usually "$(I_ROOT)/lib". + +-I_MAN:= $(I_ROOT)/man ++I_MAN:= $(I_ROOT)/share/man + # Where manual pages are installed; usually "$(I_ROOT)/man". 
+ +-I_OWNER:= -o bin ++I_OWNER:= -o root + # The owner of the installed files. + +-I_GROUP:= -g bin ++I_GROUP:= -g root + # The group of the installed files. + + I_MODE:= -m 644 --- swish++-6.1.5.orig/debian/patches/include_cstddef_header +++ swish++-6.1.5/debian/patches/include_cstddef_header @@ -0,0 +1,14 @@ +From: "Georgios M. Zarkadas" +Subject: fix FTBFS with gcc 4.6 +diff --git a/encoded_char.h b/encoded_char.h +index 819c993..6b9a371 100644 +--- a/encoded_char.h ++++ b/encoded_char.h +@@ -25,6 +25,7 @@ + // standard + #include + #include ++#include + + // local + #include "iso8859-1.h" --- swish++-6.1.5.orig/debian/patches/fix_man_pages +++ swish++-6.1.5/debian/patches/fix_man_pages @@ -0,0 +1,694 @@ +Index: swish++-6.1.4/man/man1/extract.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/extract.1 2007-08-04 10:36:52.000000000 +0530 ++++ swish++-6.1.4/man/man1/extract.1 2007-08-04 10:37:26.000000000 +0530 +@@ -1,6 +1,6 @@ + .\" + .\" SWISH++ +-.\" extract.1 ++.\" extract++.1 + .\" + .\" Copyright (C) 1998 Paul J. Lucas + .\" +@@ -35,18 +35,18 @@ + .if !'\\$1'0' .sp + .. + .\" --------------------------------------------------------------------------- +-.TH \f3extract\fP 1 "November 1, 2002" "SWISH++" ++.TH \f3extract++\fP 1 "November 1, 2002" "SWISH++" + .SH NAME +-extract \- SWISH++ text extractor ++extract++ \- SWISH++ text extractor + .SH SYNOPSIS +-.B extract ++.B extract++ + [ + .I options + ] + .I directory... + .I file... + .SH DESCRIPTION +-.B extract ++.B extract++ + is the SWISH++ text extractor, + a utility to extract what text there is from a (mostly) binary file + (similar to the +@@ -99,7 +99,7 @@ + option or the + .B ExcludeFile + variable), are extracted, i.e., +-.B extract ++.B extract++ + assumes you know what you're doing when specifying filenames in this manner. 
+ .PP + Ordinarily, the text extracted from a file is written to another file +@@ -135,10 +135,10 @@ + (See the examples in + .BR swish++.conf (5).) + .SS Character Mapping and Word Determination +-.B extract ++.B extract++ + performs the same character mapping, character entity conversions, + and word determination heuristics used by +-.BR index (1) ++.BR index++ (1) + but also additionally: + .TP 4 + 1. +@@ -165,13 +165,13 @@ + characters or longer, e.g., ``\f(CW7F454C46\fP.'' + (Default is 5.) + .SS Motivation +-.B extract ++.B extract++ + was developed to be able to index non-text files in proprietary formats + such as Microsoft Office documents. + There are a couple of reasons why the functionality of +-.B extract ++.B extract++ + isn't simply built into +-.BR index (1): ++.BR index++ (1): + .TP 4 + 1. + Users who do not need to index such documents +@@ -180,7 +180,7 @@ + .TP + 2. + While +-.BR index (1) ++.BR index++ (1) + can uncompress files on the fly using filters also, + uncompressing them every time indexing is performed is excessive. + Text extraction, on the other hand, is done only once per file; +@@ -439,7 +439,7 @@ + To extract text from all Microsoft Office files on a web server: + .cS + cd /home/www/htdocs +-extract \-v3 \-e '*.doc' \-e '*.ppt' \-e '*.xls' . ++extract++ \-v3 \-e '*.doc' \-e '*.ppt' \-e '*.xls' . + .cE + .SS Filters + (See the examples in +@@ -473,7 +473,7 @@ + .TP + 2. + As with +-.BR index (1), ++.BR index++ (1), + the word-determination heuristics employed are heavily geared for English. + Using SWISH++ as-is to extract files in non-English languages + is not recommended. 
+@@ -484,8 +484,8 @@ + default configuration file name + .PD + .SH SEE ALSO +-.BR index (1), +-.BR search (1), ++.BR index++ (1), ++.BR search++ (1), + .BR strings (1), + .BR swish++.conf (5), + .BR glob (7) +Index: swish++-6.1.4/man/man1/index.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/index.1 2007-08-04 10:36:52.000000000 +0530 ++++ swish++-6.1.4/man/man1/index.1 2007-08-04 10:38:30.000000000 +0530 +@@ -1,6 +1,6 @@ + .\" + .\" SWISH++ +-.\" index.1 ++.\" index++.1 + .\" + .\" Copyright (C) 2003 Paul J. Lucas + .\" +@@ -35,18 +35,18 @@ + .if !'\\$1'0' .sp + .. + .\" --------------------------------------------------------------------------- +-.TH \f3index\f1 1 "March 25, 2004" "SWISH++" ++.TH \f3index++\f1 1 "March 25, 2004" "SWISH++" + .SH NAME +-index \- SWISH++ indexer ++index++ \- SWISH++ indexer + .SH SYNOPSIS +-.B index ++.B index++ + [ + .I options + ] + .I directory... + .I file... + .SH DESCRIPTION +-.B index ++.B index++ + is the SWISH++ file indexer. + It indexes the specified files + and files in the specified directories; +@@ -86,7 +86,7 @@ + option or the + .B ExcludeFile + variable), are indexed, i.e., +-.B index ++.B index++ + assumes you know what you're doing when specifying filenames in this manner. + .P + In any case, care must be taken not to specify files or subdirectories +@@ -159,11 +159,11 @@ + (See FILTERS in + .BR swish++.conf (5).) + .SS Incremental Indexing +-In order to add words from new documents to an existing index, ++In order to add words from new documents to an existing index++, + either the entire set of documents can be reindexed + or the new documents alone can be incrementally indexed. + In many cases, reindexing everything is sufficient since +-.B index ++.B index++ + is really fast. + For a very large document set, however, + this may use too many resources. +@@ -198,7 +198,7 @@ + .P + Another way around this problem is to do periodic full indexing. 
+ .SH INDEXING MODULES +-.B index ++.B index++ + is written in a modular fashion + where different types of files have different indexing modules. + Currently, there are 7 modules: +@@ -307,7 +307,7 @@ + .B ExcludeMeta + variables.) + Meta names can later be queried against specifically using +-.BR search (1). ++.BR search++ (1). + .TP + 7. + If a \f(CWTABLE\f1 element contains a \f(CWSUMMARY\f1 attribute, +@@ -330,7 +330,7 @@ + Values containing whitespace, however, must be quoted. + The specification is vague as to whether whitespace surrounding the \f(CW=\f1 + is legal, but +-.B index ++.B index++ + allows it. + .SS ID3 Module + ID3 tags are used to store audio meta information for MP3 files (generally). +@@ -374,7 +374,7 @@ + .B ExcludeMeta + variables.) + Meta names can later be queried against specifically using +-.BR search (1). ++.BR search++ (1). + .IP "" + For ID3v1.x, the recommended fields to be indexed are: + .BR album , +@@ -507,7 +507,7 @@ + .B ExcludeMeta + variables.) + Meta names can later be queried against specifically using +-.BR search (1). ++.BR search++ (1). + .IP "" + The recommended headers to be indexed are: + .BR Bcc , +@@ -571,7 +571,7 @@ + prior to indexing since there's no point in indexing a single mailbox: + every search result would return a rank of 100 for the same file. + Therefore, the +-.BR splitmail (1) ++.BR splitmail++ (1) + utility is included in the SWISH++ distribution. + .SS Manual Module + Additional processing is done for Unix manual page files. +@@ -618,7 +618,7 @@ + .B ExcludeMeta + variables.) + Meta names can later be queried against specifically using +-.BR search (1). ++.BR search++ (1). + .IP "" + The recommended sections to be indexed are: + .BR AUTHOR , +@@ -810,9 +810,9 @@ + .ns + .TP + .B \-\-incremental +-Incrementally add the indexed files and words to an existing index. 
+-The existing index is not touched; +-instead, a new index is created having the same pathname of the existing index ++Incrementally add the indexed files and words to an existing index++. ++The existing index++ is not touched; ++instead, a new index is created having the same pathname of the existing index++ + with ``\f(CW.new\f1'' appended. + .TP + .B \-l +@@ -1016,7 +1016,7 @@ + .BI \-\-word-threshold= n + The word count past which partial indices are generated and merged + since all the words are too big to fit into memory at the same time. +-If you index and your machine begins to swap like mad, ++If you index++ and your machine begins to swap like mad, + lower this value. + Only the super-user can specify a value larger + than the compiled-in default. +@@ -1170,11 +1170,11 @@ + .P + To index all HTML and text files on a web server: + .cS +-index \-v3 \-e 'html:*.*htm*' \-e 'text:*.txt' . ++index++ \-v3 \-e 'html:*.*htm*' \-e 'text:*.txt' . + .cE + To index all files not under directories named \f(CWCVS\f1: + .cS +-find . \-name CVS \-prune \-o \-type f \-a \-print | index \-e 'html:*.*htm*' \- ++find . \-name CVS \-prune \-o \-type f \-a \-print | index++ \-e 'html:*.*htm*' \- + .cE + .SS Windows Command-Lines + When using the Windows command interpreter, +@@ -1182,7 +1182,7 @@ + .I must + use double quotes: + .cS +-index \-v3 \-e "html:*.*htm*" \-e "text:*.txt" . ++index++ \-v3 \-e "html:*.*htm*" \-e "text:*.txt" . + .cE + This is a problem with Windows, not SWISH++. + (Double quotes will also work under Unix.) +@@ -1293,7 +1293,7 @@ + .TP + 2. + The word-determination heuristics employed are heavily geared for English. +-Using SWISH++ as-is to index and search files in non-English languages ++Using SWISH++ as-is to index and search++ files in non-English languages + is not recommended. + .TP + 3. +@@ -1336,11 +1336,11 @@ + .B TempDirectory + variable. 
+ .SH SEE ALSO +-.BR extract (1), ++.BR extract++ (1), + .BR find (1), + .BR nroff (1), +-.BR search (1), +-.BR splitmail (1), ++.BR search++ (1), ++.BR splitmail++ (1), + .BR swish++.conf (5), + .BR glob (7), + .BR man (7). +Index: swish++-6.1.4/man/man1/search.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/search.1 2007-08-04 10:36:52.000000000 +0530 ++++ swish++-6.1.4/man/man1/search.1 2007-08-04 10:37:26.000000000 +0530 +@@ -1,6 +1,6 @@ + .\" + .\" SWISH++ +-.\" search.1 ++.\" search++.1 + .\" + .\" Copyright (C) 2003 Paul J. Lucas + .\" +@@ -35,22 +35,22 @@ + .if !'\\$1'0' .sp + .. + .\" --------------------------------------------------------------------------- +-.TH \f3search\fP 1 "June 16, 2005" "SWISH++" ++.TH \f3search++\fP 1 "June 16, 2005" "SWISH++" + .SH NAME +-search \- SWISH++ searcher ++search++ \- SWISH++ searcher + .SH SYNOPSIS +-.B search ++.B search++ + [ + .I options + ] + .I query + .SH DESCRIPTION +-.B search ++.B search++ + is the SWISH++ searcher. +-It searches a previously generated index for the words specified in a query. ++It searches a previously generated index for the words specified in a query. + In addition to running from the command-line, + it can run as a daemon process +-functioning as a ``search server.'' ++functioning as a ``search++ server.'' + .SH QUERY INPUT + .SS Query Syntax + The formal grammar of a query is: +@@ -120,7 +120,7 @@ + see the EXAMPLES. + .SS Character Mapping and Word Determination + The same character mapping and word determination heuristics used by +-.BR index (1) ++.BR index++ (1) + are used on queries prior to searching. 
+ .SH RESULTS OUTPUT + .SS Result Components +@@ -224,7 +224,7 @@ + .cE 0 + .SH RUNNING AS A DAEMON PROCESS + .SS Description +-.B search ++.B search++ + can alternatively run as a daemon process + (via either the + .B \-b +@@ -233,7 +233,7 @@ + options or the + .B SearchDaemon + variable) +-functioning as a ``search server'' ++functioning as a ``search++ server'' + by listening to a Unix domain socket + (specified by either the + .B \-u +@@ -267,7 +267,7 @@ + .SS Clients and Requests + Search clients connect to a daemon via a socket + and send a query in the same manner as on the command line +-(including the first word being ``\f(CWsearch\f1''). ++(including the first word being ``\f(CWsearch++\f1''). + The only exception is that shell meta-characters + .I "must not" + be escaped (backslashed) since no shell is involved. +@@ -322,7 +322,7 @@ + variable.) + .SS Restrictions + A single daemon can search only a single index. +-To search multiple indices concurrently, ++To search++ multiple indices concurrently, + multiple daemons can be run, + each searching its own index and using its own socket. + An index +@@ -430,7 +430,7 @@ + .B "" + By default, + if executed from the command-line, +-.B search ++.B search++ + appears to return immediately; + however, it has merely + detached from the terminal +@@ -482,7 +482,7 @@ + .br + .ns + .TP +-.B \-\-dump-index ++.B \-\-dump-index++ + Dump the entire word index to standard output and exit. + .TP + .BI \-F f +@@ -551,7 +551,7 @@ + .BI \-\-socket-timeout= s + The number of seconds, + .IR s , +-a search client has to complete a query request ++a search client has to complete a query request + before the socket connection is closed. + (Default is 10.) + This is to prevent a client from connecting, not completing a request, +@@ -584,7 +584,7 @@ + .TP + .BI \-\-pid-file= f + The name of the file to record the process ID of +-.B search ++.B search++ + if running as a daemon. + (Default is none.) 
+.TP +@@ -713,7 +713,7 @@ + This option is available only under Mac OS X, + should be used only for version 10.4 (Tiger) or later, + and only when +-.B search ++.B search++ + will be started via + .BR launchd (8). + .SH CONFIGURATION FILE +@@ -991,7 +991,7 @@ + that would have additionally required both ``stephen'' and ``hawking'' + to be near ``hole'' or ``holes.'' + .SS Sending Queries to a Search Daemon +-To send a query request to a search daemon using Perl, ++To send a query request to a search daemon using Perl, + first open the socket and connect to the daemon + (see [Wall], pp. 439-440): + .cS +@@ -1014,7 +1014,7 @@ + select( (select( SEARCH ), $| = 1)[0] ); + .cE + Next, send a query request +-(beginning with the word ``search'' ++(beginning with the word ``search++'' + and any options just as with a command-line) + to the daemon via the socket filehandle + making sure to include a trailing newline +@@ -1022,7 +1022,7 @@ + (so therefore it looks and waits for a newline): + .cS + $query = 'mouse and computer'; +-print SEARCH "search $query\\n"; ++print SEARCH "search++ $query\\n"; + .cE + Finally, read the results back and print them: + .cS +@@ -1051,7 +1051,7 @@ + Malformed query. + .TP + 51 +-Attempted ``near'' search without word-position data. ++Attempted ``near'' search++ without word-position data. + .TP + 60 + Could not write to PID file. +@@ -1139,7 +1139,7 @@ + .TP + 2. + When run as a daemon using a TCP socket, +-there are no security restrictions on who may connect and search. ++there are no security restrictions on who may connect and search++. + The code to implement domain and IP address restrictions + isn't worth it since such things are better handled by firewalls and routers. 
+ .TP +@@ -1156,7 +1156,7 @@ + default index file name + .PD + .SH SEE ALSO +-.BR index (1), ++.BR index++ (1), + .BR perlfunc (1), + .BR exec (2), + .BR fork (2), +Index: swish++-6.1.4/man/man1/splitmail.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/splitmail.1 2007-08-04 10:37:04.000000000 +0530 ++++ swish++-6.1.4/man/man1/splitmail.1 2007-08-04 10:39:58.000000000 +0530 +@@ -1,6 +1,6 @@ + .\" + .\" SWISH++ +-.\" splitmail.1 ++.\" splitmail++.1 + .\" + .\" Copyright (C) 2000 Paul J. Lucas + .\" +@@ -35,21 +35,20 @@ + .if !'\\$1'0' .sp + .. + .\" --------------------------------------------------------------------------- +-.TH \f3splitmail\f1 1 "December 13, 2000" "SWISH++" ++.TH \f3splitmail++\f1 1 "December 13, 2000" "SWISH++" + .SH NAME +-splitmail \- split mailbox files prior to indexing ++splitmail++ \- split mailbox files prior to indexing + .SH SYNOPSIS +-.B splitmail \-p +-.I prefix ++.B splitmail++ \-p\fIprefix + .BI "[ " file " ]" + .SH DESCRIPTION +-.B splitmail ++.B splitmail++ + is a utility to split a mailbox file + (or standard input) + comprised of multiple messages + into multiple files of individual messages + to facilitate indexing with +-.BR index (1). ++.BR index++ (1). + The generated files have 5-digit increasing numbers + appended to a common prefix. + .SH OPTIONS +@@ -59,7 +58,7 @@ + .SH EXAMPLE + The command: + .cS +-splitmail \-p msg sent_messages ++splitmail++ \-p msg sent_messages + .cE + splits the mailbox \f(CWsent_messages\f1 into files named + \f(CWmsg.00001\f1, +@@ -68,7 +67,7 @@ + .SH NOTE + This utility hasn't been exhaustively tested. + .SH SEE ALSO +-.BR index (1). ++.BR index++ (1). + .SH AUTHOR + Paul J. 
Lucas + .RI < pauljlucas@mac.com > +Index: swish++-6.1.4/man/man1/httpindex.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/httpindex.1 2007-08-04 10:36:52.000000000 +0530 ++++ swish++-6.1.4/man/man1/httpindex.1 2007-08-04 10:37:26.000000000 +0530 +@@ -51,7 +51,7 @@ + .SH DESCRIPTION + .B httpindex + is a front-end for +-.BR index (1) ++.BR index++ (1) + to index files copied from remote servers using + .BR wget (1). + The files (in a copy of the remote directory structure) +@@ -80,7 +80,7 @@ + .SS httpindex Options + .B httpindex + accepts the same short options as +-.BR index (1) ++.BR index++ (1) + except for + .BR \-H , + .BR \-I , +@@ -124,7 +124,7 @@ + non-zero otherwise. + .SH CAVEATS + In addition to those for +-.BR index (1), ++.BR index++ (1), + .B httpindex + does not correctly handle the use of multiple + .BR \-e , +@@ -148,7 +148,7 @@ + httpindex \-e'html:*.html,text:*.txt' + .cE + .SH SEE ALSO +-.BR index (1), ++.BR index++ (1), + .BR wget (1), + .BR WWW (3) + .SH AUTHOR +Index: swish++-6.1.4/man/man4/swish++.conf.4 +=================================================================== +--- swish++-6.1.4.orig/man/man4/swish++.conf.4 2007-08-04 10:36:52.000000000 +0530 ++++ swish++-6.1.4/man/man4/swish++.conf.4 2007-08-04 10:37:26.000000000 +0530 +@@ -330,9 +330,9 @@ + mean that a file will be filtered + and subsequently indexed or extracted. + When +-.B index ++.B index++ + or +-.B extract ++.B extract++ + encounters a file having an extension for which a filter has been specified, + it performs the filename substitution(s) on it first + to determine what the target filename would be. +@@ -378,7 +378,7 @@ + Patterns can be useful for MIME types. 
+ For example: + .cS +-FilterAttachment application/*word extract \-f %f > @%F.txt ++FilterAttachment application/*word extract++ \-f %f > @%F.txt + .cE + can be used regardless of whether the MIME type is + \f(CWapplication/msword\f1 (the official MIME type for Microsoft Word documents) +@@ -386,7 +386,7 @@ + \f(CWapplication/vnd.ms-word\f1 (an older version). + .PP + The MIME types that are built into +-.BR index (1) ++.BR index++ (1) + are: + \f(CWtext/plain\f1, + \f(CWtext/enriched\f1 (but only if the RTF module is compiled in), +@@ -410,12 +410,12 @@ + .SH SEE ALSO + .BR bzip (1), + .BR compress (1), +-.BR extract (1), ++.BR extract++ (1), + .BR gunzip (1), + .BR gzip (1), +-.BR index (1), ++.BR index++ (1), + .BR pdftotext (1), +-.BR search (1), ++.BR search++ (1), + .BR uncompress (1), + .BR glob (7) + .PP +Index: swish++-6.1.4/man/man4/swish++.index.4 +=================================================================== +--- swish++-6.1.4.orig/man/man4/swish++.index.4 2007-08-04 10:28:52.000000000 +0530 ++++ swish++-6.1.4/man/man4/swish++.index.4 2007-08-04 10:37:26.000000000 +0530 +@@ -55,7 +55,7 @@ + .ft 2 + word index + stop-word index +- directory index ++ directory index++ + file index + meta-name index + .ft 1 +@@ -71,7 +71,7 @@ + pointing at the first character of a stop-word entry; + similarly, + every \f(CWdirectory_offset\f1 is an offset into the +-.I "directory index" ++.I "directory index++" + pointing at the first character of a directory entry; + similarly, + every \f(CWfile_offset\f1 is an offset into the +@@ -79,7 +79,7 @@ + pointing at the first byte of a file entry; + finally, + every \f(CWmeta_name_offset\f1 is an offset into the +-.I "mete-name index" ++.I "meta-name index" + pointing at the first character of a meta-name entry. + .P + The index file is written as it is so that it can be mapped into memory via the +@@ -197,7 +197,7 @@ + that is: every word is null-terminated. 
+ .SS Directory Entries + Every directory entry in the +-.I "directory index" ++.I "directory index++" + is of the form: + .cS + \f2directory-path\fP0 +@@ -213,7 +213,7 @@ + .cS + \f3\s+2{\s-2\fP\f2D\fP\f3\s+2}\s-2\fP\f2file-name\fP0\f3\s+2{\s-2\fP\f2S\fP\f3\s+2}{\s-2\fP\f2W\fP\f3\s+2}\s-2\fP\f2file-title\fP0 + .cE +-that is: the file's directory index ++that is: the file's directory index++ + .RI ( D ) + followed by a null-terminated file name + followed by the file's size in bytes +@@ -247,8 +247,8 @@ + Generated index files are machine-dependent + (size of data types and byte-order). + .SH SEE ALSO +-.BR index (1), +-.BR search (1) ++.BR index++ (1), ++.BR search++ (1) + .SH AUTHOR + Paul J. Lucas + .RI < pauljlucas@mac.com > --- swish++-6.1.5.orig/debian/patches/fix_splitmail_manpage +++ swish++-6.1.5/debian/patches/fix_splitmail_manpage @@ -0,0 +1,14 @@ +Index: swish++-6.1.4/man/man1/splitmail.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/splitmail.1 2007-08-03 15:20:09.000000000 +0530 ++++ swish++-6.1.4/man/man1/splitmail.1 2007-08-03 15:20:32.000000000 +0530 +@@ -39,8 +39,7 @@ + .SH NAME + splitmail \- split mailbox files prior to indexing + .SH SYNOPSIS +-.B splitmail \-p +-.I prefix ++.B splitmail \-p\fIprefix + .BI "[ " file " ]" + .SH DESCRIPTION + .B splitmail --- swish++-6.1.5.orig/debian/patches/enable_filters +++ swish++-6.1.5/debian/patches/enable_filters @@ -0,0 +1,21 @@ +Index: swish++-6.1.4/swish++.conf +=================================================================== +--- swish++-6.1.4.orig/swish++.conf 2006-09-17 18:48:04.000000000 +0530 ++++ swish++-6.1.4/swish++.conf 2006-09-17 18:48:28.000000000 +0530 +@@ -189,10 +189,12 @@ + # See http://www.wvware.com/ for information about the wvText program. + # See http://www.research.compaq.com/SRC/virtualpaper/pstotext.html for + # information about the pstotext program. 
+- +-#FilterFile *.bz2 bunzip2 -c %f > @%F +-#FilterFile *.gz gunzip -c %f > @%F +-#FilterFile *.Z uncompress -c %f > @%F ++# ++#Edited for Debian ++FilterFile *.bz2 bunzip2 -c %f > @%F ++FilterFile *.gz gunzip -c %f > @%F ++FilterFile *.Z uncompress -c %f > @%F ++#FilterFile *.doc antiword %f @%F.txt + #FilterFile *.doc wvText %f @%F.txt + #FilterFile *.pdf pdftotext %f @%F.txt + #FilterFile *.ps pstotext %f > @%F.txt --- swish++-6.1.5.orig/debian/patches/fix_ftbfs_with_gcc4.7 +++ swish++-6.1.5/debian/patches/fix_ftbfs_with_gcc4.7 @@ -0,0 +1,18 @@ +Description: Fix FTBFS with gcc 4.7 +Origin: vendor +Bug-Debian: http://bugs.debian.org/667386 +Forwarded: no +Author: Cyril Brulebois +Last-Update: 2012-05-17 + +--- a/my_set.h ++++ b/my_set.h +@@ -47,7 +47,7 @@ namespace PJL { + //***************************************************************************** + { + public: +- bool contains( T const &s ) const { return find( s ) != this->end(); } ++ bool contains( T const &s ) const { return this->find( s ) != this->end(); } + }; + + //***************************************************************************** --- swish++-6.1.5.orig/debian/patches/series +++ swish++-6.1.5/debian/patches/series @@ -0,0 +1,14 @@ +use_PATH_MAX_from_climits +add_explicit_cstdlib_dependencies +fix_man4to5 +configure +locations +enable_filters +minus_as_hyphen_in_manpages +fix_man_pages +use_gcc_for_ld +splitmail_junk_header +fixincludes_gcc4.4 +include_cstddef_header +fix_ftbfs_gold_bug +fix_ftbfs_with_gcc4.7 --- swish++-6.1.5.orig/debian/patches/add_explicit_cstdlib_dependencies +++ swish++-6.1.5/debian/patches/add_explicit_cstdlib_dependencies @@ -0,0 +1,132 @@ +Index: swish++-6.1.4/Group.c +=================================================================== +--- swish++-6.1.4.orig/Group.c 2005-01-03 01:40:25.000000000 +0530 ++++ swish++-6.1.4/Group.c 2007-08-03 11:09:21.000000000 +0530 +@@ -20,6 +20,7 @@ + */ + + // standard ++#include + #include /* needed by FreeBSD systems */ + 
#include /* for getgrnam(3) */ + +Index: swish++-6.1.4/IncludeMeta.c +=================================================================== +--- swish++-6.1.4.orig/IncludeMeta.c 2005-01-03 01:40:25.000000000 +0530 ++++ swish++-6.1.4/IncludeMeta.c 2007-08-03 11:09:21.000000000 +0530 +@@ -20,6 +20,7 @@ + */ + + // standard ++#include + #include + + // local +Index: swish++-6.1.4/User.c +=================================================================== +--- swish++-6.1.4.orig/User.c 2005-01-03 01:40:25.000000000 +0530 ++++ swish++-6.1.4/User.c 2007-08-03 11:09:21.000000000 +0530 +@@ -20,6 +20,7 @@ + */ + + // standard ++#include + #include /* for getpwnam(3) */ + + // local +Index: swish++-6.1.4/WordThreshold.c +=================================================================== +--- swish++-6.1.4.orig/WordThreshold.c 2005-01-03 01:40:25.000000000 +0530 ++++ swish++-6.1.4/WordThreshold.c 2007-08-03 11:09:21.000000000 +0530 +@@ -20,6 +20,7 @@ + */ + + // standard ++#include + #include /* needed by FreeBSD systems */ + #include /* for geteuid(2) */ + +Index: swish++-6.1.4/conf_bool.c +=================================================================== +--- swish++-6.1.4.orig/conf_bool.c 2005-01-03 01:40:25.000000000 +0530 ++++ swish++-6.1.4/conf_bool.c 2007-08-03 11:09:21.000000000 +0530 +@@ -20,6 +20,7 @@ + */ + + // standard ++#include + #include + + // local +Index: swish++-6.1.4/conf_enum.c +=================================================================== +--- swish++-6.1.4.orig/conf_enum.c 2005-01-03 01:40:25.000000000 +0530 ++++ swish++-6.1.4/conf_enum.c 2007-08-03 11:09:21.000000000 +0530 +@@ -20,6 +20,7 @@ + */ + + // standard ++#include + #include + + // local +Index: swish++-6.1.4/conf_percent.c +=================================================================== +--- swish++-6.1.4.orig/conf_percent.c 2005-01-03 01:40:25.000000000 +0530 ++++ swish++-6.1.4/conf_percent.c 2007-08-03 11:09:21.000000000 +0530 +@@ -20,6 +20,7 @@ + */ + + // standard ++#include + 
#include + + // local +Index: swish++-6.1.4/conf_string.c +=================================================================== +--- swish++-6.1.4.orig/conf_string.c 2005-01-03 01:40:25.000000000 +0530 ++++ swish++-6.1.4/conf_string.c 2007-08-03 11:09:21.000000000 +0530 +@@ -20,6 +20,7 @@ + */ + + // standard ++#include + #include + #include + +Index: swish++-6.1.4/conf_var.c +=================================================================== +--- swish++-6.1.4.orig/conf_var.c 2005-11-17 09:31:03.000000000 +0530 ++++ swish++-6.1.4/conf_var.c 2007-08-03 11:09:21.000000000 +0530 +@@ -21,6 +21,7 @@ + + // standard + #include ++#include + #include + #include + +Index: swish++-6.1.4/mod/html/mod_html.c +=================================================================== +--- swish++-6.1.4.orig/mod/html/mod_html.c 2005-01-03 01:40:26.000000000 +0530 ++++ swish++-6.1.4/mod/html/mod_html.c 2007-08-03 11:09:21.000000000 +0530 +@@ -23,6 +23,7 @@ + + // standard + #include ++#include + #include + #include /* for pair<> */ + #include +Index: swish++-6.1.4/stop_words.c +=================================================================== +--- swish++-6.1.4.orig/stop_words.c 2005-01-03 01:40:26.000000000 +0530 ++++ swish++-6.1.4/stop_words.c 2007-08-03 11:09:21.000000000 +0530 +@@ -21,6 +21,7 @@ + + // standard + #include ++#include + + // local + #include "config.h" --- swish++-6.1.5.orig/debian/patches/fix_ftbfs_gold_bug +++ swish++-6.1.5/debian/patches/fix_ftbfs_gold_bug @@ -0,0 +1,7 @@ +From: "Georgios M. 
Zarkadas" +Subject: Fix FTBFS with binutils-gold +--- swish++-6.1.4.orig/config/config.mk ++++ swish++-6.1.4/config/config.mk +@@ -186,1 +186,1 @@ +-STDCXXLINK:= -lstdc++ ++STDCXXLINK:= -lstdc++ -lgcc -lm --- swish++-6.1.5.orig/debian/patches/fix_man4to5 +++ swish++-6.1.5/debian/patches/fix_man4to5 @@ -0,0 +1,184 @@ +Index: swish++-6.1.4/man/man4/swish++.conf.4 +=================================================================== +--- swish++-6.1.4.orig/man/man4/swish++.conf.4 2005-07-04 00:19:31.000000000 +0530 ++++ swish++-6.1.4/man/man4/swish++.conf.4 2007-08-04 11:22:21.000000000 +0530 +@@ -1,6 +1,6 @@ + .\" + .\" SWISH++ +-.\" swish++.conf.4 ++.\" swish++.conf.5 + .\" + .\" Copyright (C) 1998 Paul J. Lucas + .\" +@@ -35,7 +35,7 @@ + .if !'\\$1'0' .sp + .. + .\" --------------------------------------------------------------------------- +-.TH "\f3swish++.conf\f1" 4 "June 16, 2005" "SWISH++" ++.TH "\f3swish++.conf\f1" 5 "June 16, 2005" "SWISH++" + .SH NAME + swish++.conf \- SWISH++ configuration file format + .SH DESCRIPTION +Index: swish++-6.1.4/man/man4/swish++.index.4 +=================================================================== +--- swish++-6.1.4.orig/man/man4/swish++.index.4 2004-03-30 02:38:02.000000000 +0530 ++++ swish++-6.1.4/man/man4/swish++.index.4 2007-08-04 11:22:21.000000000 +0530 +@@ -1,6 +1,6 @@ + .\" + .\" SWISH++ +-.\" swish++.index.4 ++.\" swish++.index.5 + .\" + .\" Copyright (C) 1998-2003 Paul J. Lucas + .\" +@@ -35,7 +35,7 @@ + .if !'\\$1'0' .sp + .. 
+ .\" --------------------------------------------------------------------------- +-.TH \f3swish++.index\f1 4 "March 29, 2004" "SWISH++" ++.TH \f3swish++.index\f1 5 "March 29, 2004" "SWISH++" + .SH NAME + swish++.index \- SWISH++ index file format + .SH SYNOPSIS +Index: swish++-6.1.4/man/man4/GNUmakefile +=================================================================== +--- swish++-6.1.4.orig/man/man4/GNUmakefile 2004-04-30 10:33:25.000000000 +0530 ++++ swish++-6.1.4/man/man4/GNUmakefile 2007-08-04 11:22:21.000000000 +0530 +@@ -22,7 +22,7 @@ + ########## You shouldn't have to change anything below this line. ############# + + ROOT:= ../.. +-SECT:= 4 ++SECT:= 5 + include $(ROOT)/config/man.mk + + # vim:set noet sw=8 ts=8: +Index: swish++-6.1.4/man/man1/extract.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/extract.1 2007-08-04 11:25:18.000000000 +0530 ++++ swish++-6.1.4/man/man1/extract.1 2007-08-04 11:20:29.000000000 +0530 +@@ -133,7 +133,7 @@ + configuration file variable, + files having particular patterns can be filtered prior to extraction. + (See the examples in +-.BR swish++.conf (4).) ++.BR swish++.conf (5).) + .SS Character Mapping and Word Determination + .B extract + performs the same character mapping, character entity conversions, +@@ -397,11 +397,11 @@ + .TP + .B FilterAttachment + (See FILTERS in +-.BR swish++.conf (4).) ++.BR swish++.conf (5).) + .TP + .B FilterFile + (See FILTERS in +-.BR swish++.conf (4).) ++.BR swish++.conf (5).) + .TP + .B FollowLinks + Same as +@@ -443,7 +443,7 @@ + .cE + .SS Filters + (See the examples in +-.BR swish++.conf (4).) ++.BR swish++.conf (5).) + .SH EXIT STATUS + Exits with one of the values given below: + .PP +@@ -487,7 +487,7 @@ + .BR index (1), + .BR search (1), + .BR strings (1), +-.BR swish++.conf (4), ++.BR swish++.conf (5), + .BR glob (7) + .PP + Adobe Systems Incorporated. 
+Index: swish++-6.1.4/man/man1/index.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/index.1 2007-08-04 11:25:18.000000000 +0530 ++++ swish++-6.1.4/man/man1/index.1 2007-08-04 11:20:29.000000000 +0530 +@@ -157,7 +157,7 @@ + e-mail attachments whose MIME types match particular patterns + can be filtered prior to indexing. + (See FILTERS in +-.BR swish++.conf (4).) ++.BR swish++.conf (5).) + .SS Incremental Indexing + In order to add words from new documents to an existing index, + either the entire set of documents can be reindexed +@@ -1066,11 +1066,11 @@ + .TP + .B FilterAttachment + (See FILTERS in +-.BR swish++.conf (4).) ++.BR swish++.conf (5).) + .TP + .B FilterFile + (See FILTERS in +-.BR swish++.conf (4).) ++.BR swish++.conf (5).) + .TP + .B FollowLinks + Same as +@@ -1247,7 +1247,7 @@ + if the \f(CWCLASS\f1 attribute were not there. + .SS Filters + (See Filters under EXAMPLES in +-.BR swish++.conf (4).) ++.BR swish++.conf (5).) + .SH EXIT STATUS + Exits with one of the values given below: + .P +@@ -1341,7 +1341,7 @@ + .BR nroff (1), + .BR search (1), + .BR splitmail (1), +-.BR swish++.conf (4), ++.BR swish++.conf (5), + .BR glob (7), + .BR man (7). 
+ .P +Index: swish++-6.1.4/man/man1/search.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/search.1 2007-08-04 11:25:18.000000000 +0530 ++++ swish++-6.1.4/man/man1/search.1 2007-08-04 11:20:29.000000000 +0530 +@@ -1165,7 +1165,7 @@ + .BR bind (3), + .BR listen (3), + .BR select (3), +-.BR swish++.conf (4), ++.BR swish++.conf (5), + .BR launchd (8), + .BR searchmonitor (8) + .PP +Index: swish++-6.1.4/man/man8/searchd.8 +=================================================================== +--- swish++-6.1.4.orig/man/man8/searchd.8 2007-08-04 11:25:18.000000000 +0530 ++++ swish++-6.1.4/man/man8/searchd.8 2007-08-04 11:20:29.000000000 +0530 +@@ -79,7 +79,7 @@ + .PD + .SH SEE ALSO + .BR search (1), +-.BR swish++.conf (4), ++.BR swish++.conf (5), + .BR init (8), + .BR searchmonitor (8) + .SH AUTHOR +Index: swish++-6.1.4/man/man8/searchmonitor.8 +=================================================================== +--- swish++-6.1.4.orig/man/man8/searchmonitor.8 2007-08-04 11:25:18.000000000 +0530 ++++ swish++-6.1.4/man/man8/searchmonitor.8 2007-08-04 11:20:29.000000000 +0530 +@@ -130,7 +130,7 @@ + .PD + .SH SEE ALSO + .BR search (1), +-.BR swish++.conf (4), ++.BR swish++.conf (5), + .BR searchd (8), + .BR syslogd (8) + .SH AUTHOR --- swish++-6.1.5.orig/debian/patches/fixincludes_gcc4.4 +++ swish++-6.1.5/debian/patches/fixincludes_gcc4.4 @@ -0,0 +1,17 @@ +From: Martin Michlmayr + +GCC 4.4 cleaned up some more C++ headers. You always have to #include +headers directly and cannot rely on things being included indirectly. 
+ +Index: swish++/fdbuf.c +=================================================================== +--- swish++.orig/fdbuf.c 2009-03-04 12:32:17.000000000 +0530 ++++ swish++/fdbuf.c 2009-03-04 12:32:35.000000000 +0530 +@@ -21,6 +21,7 @@ + + // standard + #include ++#include + #include + #include + --- swish++-6.1.5.orig/debian/patches/minus_as_hyphen_in_manpages +++ swish++-6.1.5/debian/patches/minus_as_hyphen_in_manpages @@ -0,0 +1,148 @@ +Index: swish++-6.1.4/man/man1/extract.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/extract.1 2006-09-17 21:48:53.000000000 +0530 ++++ swish++-6.1.4/man/man1/extract.1 2006-09-17 21:49:06.000000000 +0530 +@@ -204,7 +204,7 @@ + (but the last option in the group can take an argument), e.g., + \f(CW-lrv4\fP + is equivalent to +-\f(CW-l -r -v4\fP. ++\f(CW-l \-r \-v4\fP. + .PP + For a long option that takes an argument, + the argument is either taken to be the characters after a `\f(CW=\fP', if any, +@@ -439,7 +439,7 @@ + To extract text from all Microsoft Office files on a web server: + .cS + cd /home/www/htdocs +-extract -v3 -e '*.doc' -e '*.ppt' -e '*.xls' . ++extract \-v3 \-e '*.doc' \-e '*.ppt' \-e '*.xls' . 
+ .cE + .SS Filters + (See the examples in +Index: swish++-6.1.4/man/man1/httpindex.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/httpindex.1 2006-09-17 21:49:09.000000000 +0530 ++++ swish++-6.1.4/man/man1/httpindex.1 2006-09-17 21:49:28.000000000 +0530 +@@ -111,8 +111,8 @@ + To index all HTML and text files on a remote web server + keeping descriptions locally: + .cS +-wget -A html,txt -linf -t2 -rxnv -nh -w2 http://www.foo.com 2>&1 | +-httpindex -d -e'html:*.html,text:*.txt' ++wget \-A html,txt \-linf \-t2 \-rxnv \-nh \-w2 http://www.foo.com 2>&1 | ++httpindex \-d \-e'html:*.html,text:*.txt' + .cE + Note that you need to redirect + .BR wget (1)'s +@@ -141,11 +141,11 @@ + seperated by commas to a single one of those options. + For example, if you want to do: + .cS +-httpindex -e'html:*.html' -e'text:*.txt' ++httpindex \-e'html:*.html' \-e'text:*.txt' + .cE + do this instead: + .cS +-httpindex -e'html:*.html,text:*.txt' ++httpindex \-e'html:*.html,text:*.txt' + .cE + .SH SEE ALSO + .BR index (1), +Index: swish++-6.1.4/man/man1/index.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/index.1 2006-09-17 21:47:50.000000000 +0530 ++++ swish++-6.1.4/man/man1/index.1 2006-09-17 21:48:21.000000000 +0530 +@@ -660,7 +660,7 @@ + (but the last option in the group can take an argument), e.g., + \f(CW-lrv4\fP + is equivalent to +-\f(CW-l -r -v4\fP. ++\f(CW-l \-r \-v4\fP. + .P + For a long option that takes an argument, + the argument is either taken to be the characters after a `\f(CW=\fP', if any, +@@ -1170,11 +1170,11 @@ + .P + To index all HTML and text files on a web server: + .cS +-index -v3 -e 'html:*.*htm*' -e 'text:*.txt' . ++index \-v3 \-e 'html:*.*htm*' \-e 'text:*.txt' . + .cE + To index all files not under directories named \f(CWCVS\f1: + .cS +-find . -name CVS -prune -o -type f -a -print | index -e 'html:*.*htm*' - ++find . 
\-name CVS \-prune \-o \-type f \-a \-print | index \-e 'html:*.*htm*' \- + .cE + .SS Windows Command-Lines + When using the Windows command interpreter, +@@ -1182,7 +1182,7 @@ + .I must + use double quotes: + .cS +-index -v3 -e "html:*.*htm*" -e "text:*.txt" . ++index \-v3 \-e "html:*.*htm*" \-e "text:*.txt" . + .cE + This is a problem with Windows, not SWISH++. + (Double quotes will also work under Unix.) +Index: swish++-6.1.4/man/man1/search.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/search.1 2006-09-17 21:48:42.000000000 +0530 ++++ swish++-6.1.4/man/man1/search.1 2006-09-17 21:48:50.000000000 +0530 +@@ -347,7 +347,7 @@ + (but the last option in the group can take an argument), e.g., + \f(CW-Bq511\fP + is equivalent to +-\f(CW-B -q 511\fP. ++\f(CW-B \-q 511\fP. + .PP + For a long option that takes an argument, + the argument is either taken to be the characters after a `\f(CW=\fP', if any, +Index: swish++-6.1.4/man/man1/splitmail.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/splitmail.1 2006-09-17 21:48:24.000000000 +0530 ++++ swish++-6.1.4/man/man1/splitmail.1 2006-09-17 21:48:39.000000000 +0530 +@@ -39,7 +39,7 @@ + .SH NAME + splitmail \- split mailbox files prior to indexing + .SH SYNOPSIS +-.B splitmail -p ++.B splitmail \-p + .I prefix + .BI "[ " file " ]" + .SH DESCRIPTION +@@ -59,7 +59,7 @@ + .SH EXAMPLE + The command: + .cS +-splitmail -p msg sent_messages ++splitmail \-p msg sent_messages + .cE + splits the mailbox \f(CWsent_messages\f1 into files named + \f(CWmsg.00001\f1, +Index: swish++-6.1.4/man/man4/swish++.conf.4 +=================================================================== +--- swish++-6.1.4.orig/man/man4/swish++.conf.4 2006-09-17 21:49:29.000000000 +0530 ++++ swish++-6.1.4/man/man4/swish++.conf.4 2006-09-17 21:49:41.000000000 +0530 +@@ -291,9 +291,9 @@ + .B FilterFile + variable lines in a configuration file would be: + 
.cS +-FilterFile *.bz2 bunzip2 -c %f > @%F +-FilterFile *.gz gunzip -c %f > @%F +-FilterFile *.Z uncompress -c %f > @%F ++FilterFile *.bz2 bunzip2 \-c %f > @%F ++FilterFile *.gz gunzip \-c %f > @%F ++FilterFile *.Z uncompress \-c %f > @%F + .cE + Given that, a filename such as \f(CWfoo.txt.gz\f1 would become \f(CWfoo.txt\f1. + If files having \f(CWtxt\f1 extensions should be indexed, then it will be. +@@ -378,7 +378,7 @@ + Patterns can be useful for MIME types. + For example: + .cS +-FilterAttachment application/*word extract -f %f > @%F.txt ++FilterAttachment application/*word extract \-f %f > @%F.txt + .cE + can be used regardless of whether the MIME type is + \f(CWapplication/msword\f1 (the official MIME type for Microsoft Word documents) --- swish++-6.1.5.orig/debian/patches/splitmail_junk_header +++ swish++-6.1.5/debian/patches/splitmail_junk_header @@ -0,0 +1,54 @@ +Index: swish++-6.1.4/man/man1/splitmail.1 +=================================================================== +--- swish++-6.1.4.orig/man/man1/splitmail.1 2007-08-04 11:43:11.000000000 +0530 ++++ swish++-6.1.4/man/man1/splitmail.1 2007-08-04 11:48:18.000000000 +0530 +@@ -51,6 +51,10 @@ + .BR index++ (1). + The generated files have 5-digit increasing numbers + appended to a common prefix. ++If the mailbox file has some non-mail header, then this is ++dumped to a file with the same prefix and ++\f(CW_junk_header\f1 ++as the suffix. + .SH OPTIONS + .TP 12 + .BI \-p prefix +@@ -63,7 +67,8 @@ + splits the mailbox \f(CWsent_messages\f1 into files named + \f(CWmsg.00001\f1, + \f(CWmsg.00002\f1, +-and so on. ++and so on. Any non-mail header gets dumped to a file named ++\f(CWmsg_junk_header\f1. + .SH NOTE + This utility hasn't been exhaustively tested. 
+ .SH SEE ALSO +Index: swish++-6.1.4/scripts/splitmail.in +=================================================================== +--- swish++-6.1.4.orig/scripts/splitmail.in 2007-08-04 11:38:30.000000000 +0530 ++++ swish++-6.1.4/scripts/splitmail.in 2007-08-04 11:43:02.000000000 +0530 +@@ -2,8 +2,11 @@ + ## + # SWISH++ + # splitmail ++# Modified Sat, 04 Aug 2007 11:40:55 +0530 by KHP ++# to save "junk header" if any. + # + # Copyright (C) 2000 Paul J. Lucas ++# Copyright (C) 2007 Kapil Hari Paranjape + # + # This program is free software; you can redistribute it and/or modify + # it under the terms of the GNU General Public License as published by +@@ -30,6 +33,12 @@ + getopts( 'p:' ) or die "usage: $me -p prefix\n"; + die "$me: -p required\n" unless $opt_p; + ++# Lines added by KHP ++open( FILE, ">$opt_p." . "_junk_header" ) || ++ die "$me: can not create file\n"; ++$i=0; ++# End of lines added by KHP ++ + while ( <> ) { + if ( /^From / ) { + close( FILE ); --- swish++-6.1.5.orig/debian/patches/use_PATH_MAX_from_climits +++ swish++-6.1.5/debian/patches/use_PATH_MAX_from_climits @@ -0,0 +1,38 @@ +Index: swish++-6.1.4/util.h +=================================================================== +--- swish++-6.1.4.orig/util.h 2007-08-03 12:07:21.000000000 +0530 ++++ swish++-6.1.4/util.h 2007-08-03 12:08:51.000000000 +0530 +@@ -36,19 +36,20 @@ + #include + #include /* for _exit(2), geteuid(2) */ + +-// +-// POSIX.1 is, IMHO, brain-damaged in the way it makes you determine the +-// maximum path-name length, so we'll simply pick a sufficiently large constant +-// such as 1024. In practice, this is the actual value used on many SVR4 as +-// well as 4.3+BSD systems. +-// +-// See also: W. Richard Stevens. "Advanced Programming in the Unix +-// Environment," Addison-Wesley, Reading, MA, 1993. pp. 34-42. 
+-// +-#ifdef PATH_MAX +-#undef PATH_MAX +-#endif +-int const PATH_MAX = 1024; ++// Forgo upstream's way of setting PATH_MAX and use the value from climits ++// // ++// // POSIX.1 is, IMHO, brain-damaged in the way it makes you determine the ++// // maximum path-name length, so we'll simply pick a sufficiently large constant ++// // such as 1024. In practice, this is the actual value used on many SVR4 as ++// // well as 4.3+BSD systems. ++// // ++// // See also: W. Richard Stevens. "Advanced Programming in the Unix ++// // Environment," Addison-Wesley, Reading, MA, 1993. pp. 34-42. ++// // ++// #ifdef PATH_MAX ++// #undef PATH_MAX ++// #endif ++// int const PATH_MAX = 1024; + + // local + #include "exit_codes.h" --- swish++-6.1.5.orig/debian/patches/locations +++ swish++-6.1.5/debian/patches/locations @@ -0,0 +1,69 @@ +Index: swish++-6.1.4/scripts/searchd.in +=================================================================== +--- swish++-6.1.4.orig/scripts/searchd.in 2006-09-17 18:46:15.000000000 +0530 ++++ swish++-6.1.4/scripts/searchd.in 2006-09-17 18:47:05.000000000 +0530 +@@ -26,7 +26,7 @@ + ## + # What stuff is called and where it's located. + ## +-SEARCH="search" ++SEARCH="search++" + SEARCH_PATH="%%I_BIN%%/$SEARCH" + SEARCHMONITOR="%%I_BIN%%/searchmonitor" + CONF_FILE="/etc/swish++.conf" +Index: swish++-6.1.4/scripts/searchmonitor.in +=================================================================== +--- swish++-6.1.4.orig/scripts/searchmonitor.in 2006-09-17 18:46:38.000000000 +0530 ++++ swish++-6.1.4/scripts/searchmonitor.in 2006-09-17 18:47:20.000000000 +0530 +@@ -25,7 +25,7 @@ + # What stuff is called and where it's located. 
+ ## + LOG="logger -t search -p daemon" +-SEARCH_DEFAULT="%%I_BIN%%/search" ++SEARCH_DEFAULT="%%I_BIN%%/search++" + + ## + # You may need to set LD_LIBRARY_PATH to contain the directory of the C++ +Index: swish++-6.1.4/www_example/search.cgi +=================================================================== +--- swish++-6.1.4.orig/www_example/search.cgi 2006-09-17 18:46:30.000000000 +0530 ++++ swish++-6.1.4/www_example/search.cgi 2006-09-17 18:46:59.000000000 +0530 +@@ -1,4 +1,4 @@ +-#! /usr/local/bin/perl ++#! /usr/bin/perl + ############################################################################### + # + # NAME +@@ -30,19 +30,19 @@ + # + ############################################################################### + +-use lib qw( /home/www/swish++/lib/ ); ++use lib qw( /usr/lib/swish++ ); + # Put the path to where the WWW library is above. + require WWW; + +-$SWISH_BIN = '/usr/local/bin'; ++$SWISH_BIN = '/usr/bin'; + # The full path to the bin directory where you installed the + # SWISH++ executables. + +-$DOC_ROOT = '/home/www/httpd/htdocs'; ++$DOC_ROOT = '/var/www/htdocs'; + # The top-level directory for your document tree presumeably + # where the index was generated from. + +-$INDEX_FILE = '/home/www/swish++.index'; ++$INDEX_FILE = '/var/www/swish++.index'; + # The full path to the index file to be searched through. + + #$SOCKET_FILE = '/tmp/search.socket'; +@@ -125,7 +125,7 @@ + # legitimate options. If not given, it may be possible for a user to + # give options in the search terms. 
+ ## +- open( SEARCH, "$SWISH_BIN/search -i $INDEX_FILE @options -- $search |" ) ++ open( SEARCH, "$SWISH_BIN/search++ -i $INDEX_FILE @options -- $search |" ) + || die "open: $!"; + } + --- swish++-6.1.5.orig/debian/patches/use_gcc_for_ld +++ swish++-6.1.5/debian/patches/use_gcc_for_ld @@ -0,0 +1,71 @@ +Index: swish++-6.1.4/GNUmakefile +=================================================================== +--- swish++-6.1.4.orig/GNUmakefile 2006-09-26 16:38:07.000000000 +0530 ++++ swish++-6.1.4/GNUmakefile 2006-09-26 16:39:26.000000000 +0530 +@@ -109,7 +109,7 @@ + ZLIB_LINK:= + endif + I_LINK:= $(MOD_LINK) $(ENCODING_LINK) $(CHARSET_LINK) $(PTHREAD_LINK) \ +- -lm $(ZLIB_LINK) ++ $(ZLIB_LINK) $(STDCXXLINK) + + S_SOURCES:= enc_int.c \ + mmap_file.c \ +@@ -150,7 +150,7 @@ + User.c + endif + S_OBJECTS:= $(S_SOURCES:.c=.o) +-S_LINK:= $(SOCKET_LINK) $(PTHREAD_LINK) ++S_LINK:= $(SOCKET_LINK) $(PTHREAD_LINK) $(STDCXXLINK) + + E_SOURCES:= mmap_file.c \ + conf_var.c \ +@@ -175,7 +175,7 @@ + E_SOURCES+= fnmatch.c # see the comment in pattern_map.h + endif + E_OBJECTS:= $(E_SOURCES:.c=.o) +-E_LINK:= $(PTHREAD_LINK) ++E_LINK:= $(PTHREAD_LINK) $(STDCXXLINK) + + LIB_TARGET:= WWW.pm + +@@ -184,10 +184,10 @@ + ## + + extract: $(E_OBJECTS) +- $(CC) $(CFLAGS) -o $@ $^ $(E_LINK) ++ $(LD) $(CFLAGS) -o $@ $^ $(E_LINK) + + index: $(I_OBJECTS) $(CHARSET_LIB) $(ENCODING_LIB) $(MOD_LIBS) +- $(CC) $(CFLAGS) $(CHARSET_LIB_PATH) $(ENCODING_LIB_PATH) \ ++ $(LD) $(CFLAGS) $(CHARSET_LIB_PATH) $(ENCODING_LIB_PATH) \ + $(MOD_LIB_PATHS) -o $@ $(I_OBJECTS) $(I_LINK) + + init_modules.c: mod/*/mod_*.h init_modules-sh +@@ -197,7 +197,7 @@ + ./init_mod_vars-sh > $@ || $(RM) $@ + + search: $(S_OBJECTS) +- $(CC) $(CFLAGS) -o $@ $^ $(S_LINK) ++ $(LD) $(CFLAGS) -o $@ $^ $(S_LINK) + + $(CHARSET_LIB): FORCE + @$(MAKE) -C $(dir $@) DEBUGFLAGS="$(DEBUGFLAGS)" +Index: swish++-6.1.4/config/config.mk +=================================================================== +--- swish++-6.1.4.orig/config/config.mk 
2006-09-26 16:35:00.000000000 +0530 ++++ swish++-6.1.4/config/config.mk 2006-09-26 16:38:02.000000000 +0530 +@@ -179,6 +179,13 @@ + # doesn't have it or any equivalent since any errors from this + # command are ignored in the makefiles. + ++LD:= gcc ++# This is the command that provides linking since g++ links in ++# unnecessary extra stuff. ++ ++STDCXXLINK:= -lstdc++ ++# This is necessary to link C++ programs using gcc ++ + ############################################################################### + # + # C++ compiler --- swish++-6.1.5.orig/debian/email_indexing/nnir.el +++ swish++-6.1.5/debian/email_indexing/nnir.el @@ -0,0 +1,1362 @@ +;;; nnir.el --- search mail with various search engines -*- coding: iso-8859-1 -*- +;; Copyright (C) 1998 Kai Großjohann + +;; $Id: nnir.el,v 1.1.1.1 2003/06/06 04:55:44 fip Exp $ + +;; Author: Kai Großjohann +;; Keywords: news, mail, searching, ir, glimpse, wais +;; +;; Edited for Debian/GNU Linux +;; This file is not part of GNU Emacs. + +;; This is free software; you can redistribute it and/or modify +;; it under the terms of the GNU General Public License as published by +;; the Free Software Foundation; either version 2, or (at your option) +;; any later version. + +;; GNU Emacs is distributed in the hope that it will be useful, +;; but WITHOUT ANY WARRANTY; without even the implied warranty of +;; MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +;; GNU General Public License for more details. + +;; You should have received a copy of the GNU General Public License +;; along with GNU Emacs; see the file COPYING. If not, write to the +;; Free Software Foundation, Inc., 59 Temple Place - Suite 330, +;; Boston, MA 02111-1307, USA. + +;;; Commentary: + +;; The most recent version of this can always be fetched from the +;; following FTP site: +;; ls6-ftp.cs.uni-dortmund.de:/pub/src/emacs + +;; This code is still in the development stage but I'd like other +;; people to have a look at it. 
Please do not hesitate to contact me +;; with your ideas. + +;; What does it do? Well, it allows you to index your mail using some +;; search engine (freeWAIS-sf and Glimpse are currently supported), +;; then type `G G' in the Group buffer and issue a query to the search +;; engine. You will then get a buffer which shows all articles +;; matching the query, sorted by Retrieval Status Value (score). + +;; When looking at the retrieval result (in the Summary buffer) you +;; can type `G T' (aka M-x gnus-summary-nnir-goto-thread RET) on an +;; article. You will be teleported into the group this article came +;; from, showing the thread this article is part of. (See below for +;; restrictions.) + +;; The Lisp installation is simple: just put this file on your +;; load-path, byte-compile it, and load it from ~/.gnus or something. +;; This will install a new command `G G' in your Group buffer for +;; searching your mail. Note that you also need to configure a number +;; of variables, as described below. + +;; Restrictions: +;; +;; * Currently, this expects that you use nnml or another +;; one-file-per-message backend. +;; * It can only search one mail backend. +;; * There are restrictions to the Glimpse setup. +;; * There are restrictions to the Wais setup. +;; * gnus-summary-nnir-goto-thread: Fetches whole group first, before +;; limiting to the right articles. This is much too slow, of +;; course. May issue a query for number of articles to fetch; you +;; must accept the default of all articles at this point or things +;; may break. + +;; The Lisp setup involves setting a few variables and setting up the +;; search engine. The first variable to set is `nnir-mail-backend'. +;; For me, `gnus-secondary-select-methods' contains just one select +;; method, and this is also what I put in `nnir-mail-backend'. Type +;; `C-h v nnir-mail-backend RET' for more information -- the variable +;; documentation includes more details and a few examples. 
The second +;; variable to set is `nnir-search-engine'. Choose one of the engines +;; listed in `nnir-engines'. (Actually `nnir-engines' is an alist, +;; type `C-h v nnir-engines RET' for more information; this includes +;; examples for setting `nnir-search-engine', too.) + +;; You must also set up a search engine. I'll tell you about the two +;; search engines currently supported: + +;; 1. freeWAIS-sf +;; +;; As always with freeWAIS-sf, you need a so-called `format file'. I +;; use the following file: +;; +;; ,----- +;; | # Kai's format file for freeWAIS-sf for indexing mails. +;; | # Each mail is in a file, much like the MH format. +;; | +;; | # Document separator should never match -- each file is a document. +;; | record-sep: /^@this regex should never match@$/ +;; | +;; | # Searchable fields specification. +;; | +;; | region: /^[sS]ubject:/ /^[sS]ubject: */ +;; | subject "Subject header" stemming TEXT BOTH +;; | end: /^[^ \t]/ +;; | +;; | region: /^([tT][oO]|[cC][cC]):/ /^([tT][oO]|[cC][cC]): */ +;; | to "To and Cc headers" SOUNDEX BOTH +;; | end: /^[^ \t]/ +;; | +;; | region: /^[fF][rR][oO][mM]:/ /^[fF][rR][oO][mM]: */ +;; | from "From header" SOUNDEX BOTH +;; | end: /^[^ \t]/ +;; | +;; | region: /^$/ +;; | stemming TEXT GLOBAL +;; | end: /^@this regex should never match@$/ +;; `----- +;; +;; 1998-07-22: waisindex would dump core on me for large articles with +;; the above settings. I used /^$/ as the end regex for the global +;; field. That seemed to work okay. + +;; There is a Perl module called `WAIS.pm' which is available from +;; CPAN as well as ls6-ftp.cs.uni-dortmund.de:/pub/wais/Perl. This +;; module comes with a nifty tool called `makedb', which I use for +;; indexing. 
Here's my `makedb.conf': +;; +;; ,----- +;; | # Config file for makedb +;; | +;; | # Global options +;; | waisindex = /usr/local/bin/waisindex +;; | wais_opt = -stem -t fields +;; | # `-stem' option necessary when `stemming' is specified for the +;; | # global field in the *.fmt file +;; | +;; | # Own variables +;; | homedir = /home/kai +;; | +;; | # The mail database. +;; | database = mail +;; | files = `find $homedir/Mail -name \*[0-9] -print` +;; | dbdir = $homedir/.wais +;; | limit = 100 +;; `----- +;; +;; The Lisp setup involves the `nnir-wais-*' variables. The most +;; difficult to understand variable is probably +;; `nnir-wais-remove-prefix'. Here's what it does: the output of +;; `waissearch' basically contains the file name and the (full) +;; directory name. As Gnus works with group names rather than +;; directory names, the directory name is transformed into a group +;; name as follows: first, a prefix is removed from the (full) +;; directory name, then all `/' are replaced with `.'. The variable +;; `nnir-wais-remove-prefix' should contain a regex matching exactly +;; this prefix. It defaults to `$HOME/Mail/' (note the trailing +;; slash). + +;; 2. Glimpse +;; +;; The code expects you to have one Glimpse index which contains all +;; your mail files. The Lisp setup involves setting the +;; `nnir-glimpse-*' variables. The most difficult to understand +;; variable is probably `nnir-glimpse-remove-prefix', it corresponds +;; to `nnir-wais-remove-prefix', see above. The `nnir-glimpse-home' +;; variable should be set to the value of the `-H' option which allows +;; one to search this Glimpse index. I have indexed my whole home +;; directory with Glimpse, so I assume a default of `$HOME'. + +;; 3. Namazu +;; +;; The Namazu backend requires you to have one directory containing all +;; index files, this is controlled by the `nnir-namazu-index-directory' +;; variable. 
To function, the `nnir-namazu-remove-prefix' variable must +;; also be correct; see the documentation for `nnir-wais-remove-prefix' +;; above. +;; +;; It is particularly important not to pass any switches to namazu +;; that will change the output format. Good switches to use include +;; `--sort', `--ascending', `--early' and `--late'. Refer to the Namazu +;; documentation for further information on valid switches. +;; +;; To index my mail with the `mknmz' program I use the following +;; configuration file: +;; +;; ,---- +;; | package conf; # Don't remove this line! +;; | +;; | # Paths which will not be indexed. Don't use `^' or `$' anchors. +;; | $EXCLUDE_PATH = "spam|sent"; +;; | +;; | # Header fields which should be searchable. case-insensitive +;; | $REMAIN_HEADER = "from|date|message-id|subject"; +;; | +;; | # Searchable fields. case-insensitive +;; | $SEARCH_FIELD = "from|date|message-id|subject"; +;; | +;; | # The max length of a word. +;; | $WORD_LENG_MAX = 128; +;; | +;; | # The max length of a field. +;; | $MAX_FIELD_LENGTH = 256; +;; `---- +;; +;; My mail is stored in the directories ~/Mail/mail/, ~/Mail/lists/ and +;; ~/Mail/archive/, so to index them I go to the directory set in +;; `nnir-namazu-index-directory' and issue the following command. +;; +;; mknmz --mailnews ~/Mail/archive/ ~/Mail/mail/ ~/Mail/lists/ +;; +;; For maximum searching efficiency I have a cron job set to run this +;; command every four hours. + +;; Developer information: + +;; I have tried to make the code expandable. Basically, it is divided +;; into two layers. The upper layer is somewhat like the `nnvirtual' +;; or `nnkiboze' backends: given a specification of what articles to +;; show from another backend, it creates a group containing exactly +;; those articles. The lower layer issues a query to a search engine +;; and produces such a specification of what articles to show from the +;; other backend.
+ +;; The interface between the two layers consists of the single +;; function `nnir-run-query', which just selects the appropriate +;; function for the search engine one is using. The input to +;; `nnir-run-query' is a string, representing the query as input by +;; the user. The output of `nnir-run-query' is supposed to be a +;; vector, each element of which should in turn be a three-element +;; vector. The first element should be group name of the article, the +;; second element should be the article number, and the third element +;; should be the Retrieval Status Value (RSV) as returned from the +;; search engine. An RSV is the score assigned to the document by the +;; search engine. For Boolean search engines like Glimpse, the RSV is +;; always 1000 (or 1 or 100, or whatever you like). + +;; The sorting order of the articles in the summary buffer created by +;; nnir is based on the order of the articles in the above mentioned +;; vector, so that's where you can do the sorting you'd like. Maybe +;; it would be nice to have a way of displaying the search result +;; sorted differently? + +;; So what do you need to do when you want to add another search +;; engine? You write a function that executes the query. Temporary +;; data from the search engine can be put in `nnir-tmp-buffer'. This +;; function should return the list of articles as a vector, as +;; described above. Then, you need to register this backend in +;; `nnir-engines'. Then, users can choose the backend by setting +;; `nnir-search-engine'. + +;; Todo, or future ideas: + +;; * Make it so that Glimpse can also be called without `-F'. +;; +;; * It should be possible to restrict search to certain groups. +;; +;; * There is currently no error checking. +;; +;; * The summary buffer display is currently really ugly, with all the +;; added information in the subjects. How could I make this +;; prettier? 
+;; +;; * A function which can be called from an nnir summary buffer which +;; teleports you into the group the current article came from and +;; shows you the whole thread this article is part of. +;; Implementation suggestions? +;; (1998-07-24: There is now a preliminary implementation, but +;; it is much too slow and quite fragile.) +;; +;; * Support other mail backends. In particular, probably quite a few +;; people use nnfolder. How would one go about searching nnfolders +;; and producing the right data needed? The group name and the RSV +;; are simple, but what about the article number? +;; +;; * Support compressed mail files. Probably, just stripping off the +;; `.gz' or `.Z' file name extension is sufficient. +;; +;; * Support a find/grep combination. +;; +;; * At least for imap, the query is performed twice. +;; +;; * Support multiple mail backends. The information that is needed +;; by nnir could be put in the server parameters. (Use sensible +;; default values, though: include the name of the backend in the +;; default value such that people do not have to mess with the +;; server parameters if they don't want to.) It is not clear how to +;; do the user interface, though. Hm. Maybe offer the user a +;; completable list of backends to search? Or use the +;; process-marked groups to find out which backends to search? Or +;; always search all backends? +;; + +;; Have you got other ideas? + +;;; Setup Code: + +(defconst nnir-version "$Id: nnir.el,v 1.1.1.1 2003/06/06 04:55:44 fip Exp $" + "Version of NNIR.") + +(require 'cl) +(require 'nnoo) +(require 'gnus-group) +(require 'gnus-sum) +(eval-and-compile + (require 'gnus-util)) + +(nnoo-declare nnir) +(nnoo-define-basics nnir) + +(gnus-declare-backend "nnir" 'mail) + +;;; Developer Extension Variable: + +(defvar nnir-engines + '((glimpse nnir-run-glimpse + ((group . 
"Group spec: "))) + (wais nnir-run-waissearch + ()) + (excite nnir-run-excite-search + ()) + (imap nnir-run-imap + ()) + (swish++ nnir-run-swish++ + ((group . "Group spec: "))) + (swish-e nnir-run-swish-e + ((group . "Group spec: "))) + (namazu nnir-run-namazu + ())) +"Alist of supported search engines. +Each element in the alist is a three-element list (ENGINE FUNCTION ARGS). +ENGINE is a symbol designating the searching engine. FUNCTION is also +a symbol, giving the function that does the search. The third element +ARGS is a list of cons pairs (PARAM . PROMPT). When issuing a query, +the FUNCTION will issue a query for each of the PARAMs, using PROMPT. + +The value of `nnir-search-engine' must be one of the ENGINE symbols. +For example, use the following line for searching using freeWAIS-sf: + (setq nnir-search-engine 'wais) +Use the following line if you read your mail via IMAP and your IMAP +server supports searching: + (setq nnir-search-engine 'imap) +Note that you have to set additional variables for most backends. For +example, the `wais' backend needs the variables `nnir-wais-program', +`nnir-wais-database' and `nnir-wais-remove-prefix'. + +Add an entry here when adding a new search engine.") + +;;; User Customizable Variables: + +(defgroup nnir nil + "Search nnmh and nnml groups in Gnus with Glimpse, freeWAIS-sf, or EWS.") + +;; Mail backend. + +;; TODO: +;; If `nil', use server parameters to find out which server to search. CCC +;; +(defcustom nnir-mail-backend '(nnml "") + "*Specifies which backend should be searched. +More precisely, this is used to determine from which backend to fetch the +messages found. + +This must be equal to an existing server, so maybe it is best to use +something like the following: + (setq nnir-mail-backend (nth 0 gnus-secondary-select-methods)) +The above line works fine if the mail backend you want to search is +the first element of gnus-secondary-select-methods (`nth' starts counting +at zero)." 
+ :type '(sexp) + :group 'nnir) + +;; Search engine to use. + +(defcustom nnir-search-engine 'wais + "*The search engine to use. Must be a symbol. +See `nnir-engines' for a list of supported engines, and for example +settings of `nnir-search-engine'." + :type '(sexp) + :group 'nnir) + +;; Glimpse engine. + +(defcustom nnir-glimpse-program "glimpse" + "*Name of Glimpse executable." + :type '(string) + :group 'nnir) + +(defcustom nnir-glimpse-home (getenv "HOME") + "*Value of `-H' glimpse option. +`~' and environment variables must be expanded, see the functions +`expand-file-name' and `substitute-in-file-name'." + :type '(directory) + :group 'nnir) + +(defcustom nnir-glimpse-remove-prefix (concat (getenv "HOME") "/Mail/") + "*The prefix to remove from each file name returned by Glimpse +in order to get a group name (albeit with / instead of .). This is a +regular expression. + +For example, suppose that Glimpse returns file names such as +\"/home/john/Mail/mail/misc/42\". For this example, use the following +setting: (setq nnir-glimpse-remove-prefix \"/home/john/Mail/\") +Note the trailing slash. Removing this prefix gives \"mail/misc/42\". +`nnir' knows to remove the \"/42\" and to replace \"/\" with \".\" to +arrive at the correct group name, \"mail.misc\"." + :type '(regexp) + :group 'nnir) + +(defcustom nnir-glimpse-additional-switches '("-i") + "*A list of strings, to be given as additional arguments to glimpse. +The switches `-H', `-W', `-l' and `-y' are always used -- calling +glimpse without them does not make sense in our situation. +Suggested elements to put here are `-i' and `-w'. + +Note that this should be a list. Ie, do NOT use the following: + (setq nnir-glimpse-additional-switches \"-i -w\") ; wrong! +Instead, use this: + (setq nnir-glimpse-additional-switches '(\"-i\" \"-w\"))" + :type '(repeat (string)) + :group 'nnir) + +;; freeWAIS-sf. + +(defcustom nnir-wais-program "waissearch" + "*Name of waissearch executable." 
+ :type '(string) + :group 'nnir) + +(defcustom nnir-wais-database (expand-file-name "~/.wais/mail") + "*Name of Wais database containing the mail. + +Note that this should be a file name without extension. For example, +if you have a file /home/john/.wais/mail.fmt, use this: + (setq nnir-wais-database \"/home/john/.wais/mail\") +The string given here is passed to `waissearch -d' as-is." + :type '(file) + :group 'nnir) + +(defcustom nnir-wais-remove-prefix (concat (getenv "HOME") "/Mail/") + "*The prefix to remove from each directory name returned by waissearch +in order to get a group name (albeit with / instead of .). This is a +regular expression. + +This variable is similar to `nnir-glimpse-remove-prefix', only for Wais, +not Glimpse." + :type '(regexp) + :group 'nnir) + +;; EWS (Excite for Web Servers) engine. + +(defcustom nnir-excite-aquery-program "aquery.pl" + "*Name of the EWS query program. Should be `aquery.pl' or a path to same." + :type '(string) + :group 'nnir) + +(defcustom nnir-excite-collection "Mail" + "*Name of the EWS collection to search." + :type '(string) + :group 'nnir) + +(defcustom nnir-excite-remove-prefix (concat (getenv "HOME") "/Mail/") + "*The prefix to remove from each file name returned by EWS +in order to get a group name (albeit with / instead of .). This is a +regular expression. + +This variable is very similar to `nnir-glimpse-remove-prefix', except +that it is for EWS, not Glimpse." + :type '(regexp) + :group 'nnir) + +;; Swish++. Next three variables Copyright (C) 2000, 2001 Christoph +;; Conrad . +;; Swish++ home page: http://homepage.mac.com/pauljlucas/software/swish/ + +(defcustom nnir-swish++-configuration-file + (expand-file-name "~/Mail/swish++.conf") + "*Configuration file for swish++" + :type '(file) + :group 'nnir) + +(defcustom nnir-swish++-index-file + (concat (getenv "HOME") "/Mail/swish++.index") + "*Index file for swish++." 
+ :type '(file) + :group 'nnir) + +(defcustom nnir-swish-e-program "swish-e" + "*Name of swish-e search executable." + :type '(string) + :group 'nnir) + + +(defcustom nnir-swish++-program "search++" + "*Name of swish++ search executable." + :type '(string) + :group 'nnir) + +(defcustom nnir-swish++-additional-switches '() + "*A list of strings, to be given as additional arguments to swish++. + +Note that this should be a list. Ie, do NOT use the following: + (setq nnir-swish++-additional-switches \"-i -w\") ; wrong +Instead, use this: + (setq nnir-swish++-additional-switches '(\"-i\" \"-w\"))" + :type '(repeat (string)) + :group 'nnir) + +(defcustom nnir-swish++-remove-prefix (concat (getenv "HOME") "/Mail/") + "*The prefix to remove from each file name returned by swish++ +in order to get a group name (albeit with / instead of .). This is a +regular expression. + +This variable is very similar to `nnir-glimpse-remove-prefix', except +that it is for swish++, not Glimpse." + :type '(regexp) + :group 'nnir) + +;; Swish-E. Next three variables Copyright (C) 2000 Christoph Conrad +;; . +;; URL: http://sunsite.berkeley.edu/SWISH-E/ +;; New version: http://www.boe.es/swish-e + +(defcustom nnir-swish-e-index-file + (expand-file-name "~/Mail/index.swish-e") + "*Index file for swish-e." + :type '(file) + :group 'nnir) + +(defcustom nnir-swish-e-program "swish-e" + "*Name of swish-e search executable." + :type '(string) + :group 'nnir) + +(defcustom nnir-swish-e-additional-switches '() + "*A list of strings, to be given as additional arguments to swish-e. + +Note that this should be a list. 
Ie, do NOT use the following: + (setq nnir-swish-e-additional-switches \"-i -w\") ; wrong +Instead, use this: + (setq nnir-swish-e-additional-switches '(\"-i\" \"-w\"))" + :type '(repeat (string)) + :group 'nnir) + +(defcustom nnir-swish-e-remove-prefix (concat (getenv "HOME") "/Mail/") + "*The prefix to remove from each file name returned by swish-e +in order to get a group name (albeit with / instead of .). This is a +regular expression. + +This variable is very similar to `nnir-glimpse-remove-prefix', except +that it is for swish-e, not Glimpse." + :type '(regexp) + :group 'nnir) + +;; Namazu engine, see + +(defcustom nnir-namazu-program "namazu" + "*Name of Namazu search executable." + :type '(string) + :group 'nnir) + +(defcustom nnir-namazu-index-directory (expand-file-name "~/Mail/namazu/") + "*Index directory for Namazu." + :type '(directory) + :group 'nnir) + +(defcustom nnir-namazu-additional-switches '() + "*A list of strings, to be given as additional arguments to namazu. +The switches `-q', `-a', and `-s' are always used, very few other switches +make any sense in this context. + +Note that this should be a list. Ie, do NOT use the following: + (setq nnir-namazu-additional-switches \"-i -w\") ; wrong +Instead, use this: + (setq nnir-namazu-additional-switches '(\"-i\" \"-w\"))" + :type '(repeat (string)) + :group 'nnir) + +(defcustom nnir-namazu-remove-prefix (concat (getenv "HOME") "/Mail/") + "*The prefix to remove from each file name returned by Namazu +in order to get a group name (albeit with / instead of .). + +This variable is very similar to `nnir-glimpse-remove-prefix', except +that it is for Namazu, not Glimpse." 
+ :type '(directory) + :group 'nnir) + +;;; Internal Variables: + +(defvar nnir-current-query nil + "Internal: stores current query (= group name).") + +(defvar nnir-current-server nil + "Internal: stores current server (does it ever change?).") + +(defvar nnir-current-group-marked nil + "Internal: stores current list of process-marked groups.") + +(defvar nnir-artlist nil + "Internal: stores search result.") + +(defvar nnir-tmp-buffer " *nnir*" + "Internal: temporary buffer.") + +;;; Code: + +;; Gnus glue. + +(defun gnus-group-make-nnir-group (extra-parms query) + "Create an nnir group. Asks for query." + (interactive "P\nsQuery: ") + (let ((parms nil)) + (if extra-parms + (setq parms (nnir-read-parms query)) + (setq parms (list (cons 'query query)))) + (gnus-group-read-ephemeral-group + (concat "nnir:" (prin1-to-string parms)) '(nnir "") t + (cons (current-buffer) + gnus-current-window-configuration) + nil))) + +;; Emacs 19 compatibility? +(or (fboundp 'kbd) (defalias 'kbd 'read-kbd-macro)) + +(defun nnir-group-mode-hook () + (define-key gnus-group-mode-map + (if (fboundp 'read-kbd-macro) + (kbd "G G") + "GG") ; XEmacs 19 compat + 'gnus-group-make-nnir-group)) +(add-hook 'gnus-group-mode-hook 'nnir-group-mode-hook) + + + +;; Summary mode commands. + +(defun gnus-summary-nnir-goto-thread () + "Only applies to nnir groups. Go to group this article came from +and show thread that contains this article." 
+ (interactive) + (unless (eq 'nnir (car (gnus-find-method-for-group gnus-newsgroup-name))) + (error "Can't execute this command unless in nnir group.")) + (let* ((cur (gnus-summary-article-number)) + (backend-group (nnir-artlist-artitem-group nnir-artlist cur)) + (backend-number (nnir-artlist-artitem-number nnir-artlist cur))) + (gnus-group-read-ephemeral-group + backend-group + nnir-mail-backend + t ; activate + (cons (current-buffer) + 'summary) ; window config + nil + (list backend-number)) + (gnus-summary-limit (list backend-number)) + (gnus-summary-refer-thread))) + +(if (fboundp 'eval-after-load) + (eval-after-load "gnus-sum" + '(define-key gnus-summary-goto-map + "T" 'gnus-summary-nnir-goto-thread)) + (add-hook 'gnus-summary-mode-hook + (function (lambda () + (define-key gnus-summary-goto-map + "T" 'gnus-summary-nnir-goto-thread))))) + + + +;; Gnus backend interface functions. + +(deffoo nnir-open-server (server &optional definitions) + ;; Just set the server variables appropriately. + (nnoo-change-server 'nnir server definitions)) + +(deffoo nnir-request-group (group &optional server fast) + "GROUP is the query string." + (nnir-possibly-change-server server) + ;; Check for cache and return that if appropriate. + (if (and (equal group nnir-current-query) + (equal gnus-group-marked nnir-current-group-marked) + (or (null server) + (equal server nnir-current-server))) + nnir-artlist + ;; Cache miss. + (setq nnir-artlist (nnir-run-query group)) + (save-excursion + (set-buffer nntp-server-buffer) + (if (zerop (length nnir-artlist)) + (progn + (setq nnir-current-query nil + nnir-current-server nil + nnir-current-group-marked nil + nnir-artlist nil) + (nnheader-report 'nnir "Search produced empty results.")) + ;; Remember data for cache. 
+ (setq nnir-current-query group) + (when server (setq nnir-current-server server)) + (setq nnir-current-group-marked gnus-group-marked) + (nnheader-insert "211 %d %d %d %s\n" + (nnir-artlist-length nnir-artlist) ; total # + 1 ; first # + (nnir-artlist-length nnir-artlist) ; last # + group))))) ; group name + +(deffoo nnir-retrieve-headers (articles &optional group server fetch-old) + (save-excursion + (let ((artlist (copy-sequence articles)) + (idx 1) + (art nil) + (artitem nil) + (artgroup nil) (artno nil) + (artrsv nil) + (artfullgroup nil) + (novitem nil) + (novdata nil) + (foo nil)) + (while (not (null artlist)) + (setq art (car artlist)) + (or (numberp art) + (nnheader-report + 'nnir + "nnir-retrieve-headers doesn't grok message ids: %s" + art)) + (setq artitem (nnir-artlist-article nnir-artlist art)) + (setq artrsv (nnir-artitem-rsv artitem)) + (setq artgroup (nnir-artitem-group artitem)) + (setq artno (nnir-artitem-number artitem)) + (setq artfullgroup (nnir-group-full-name artgroup)) + ;; retrieve NOV or HEAD data for this article, transform into + ;; NOV data and prepend to `novdata' + (set-buffer nntp-server-buffer) + (case (setq foo (gnus-retrieve-headers (list artno) artfullgroup nil)) + (nov + (goto-char (point-min)) + (setq novitem (nnheader-parse-nov)) + (unless novitem + (pop-to-buffer nntp-server-buffer) + (error + "nnheader-parse-nov returned nil for article %s in group %s" + artno artfullgroup))) + (headers + (goto-char (point-min)) + (setq novitem (nnheader-parse-head)) + (unless novitem + (pop-to-buffer nntp-server-buffer) + (error + "nnheader-parse-head returned nil for article %s in group %s" + artno artfullgroup))) + (t (nnheader-report 'nnir "Don't support header type %s." 
foo))) + ;; replace article number in original group with article number + ;; in nnir group + (mail-header-set-number novitem idx) + (mail-header-set-from novitem + (mail-header-from novitem)) + (mail-header-set-subject + novitem + (format "[%d: %s/%d] %s" + artrsv artgroup artno + (mail-header-subject novitem))) + ;;-(mail-header-set-extra novitem nil) + (push novitem novdata) + (setq artlist (cdr artlist)) + (setq idx (1+ idx))) + (setq novdata (nreverse novdata)) + (set-buffer nntp-server-buffer) (erase-buffer) + (mapcar 'nnheader-insert-nov novdata) + 'nov))) + +(deffoo nnir-request-article (article + &optional group server to-buffer) + (save-excursion + (let* ((artitem (nnir-artlist-article nnir-artlist + article)) + (artgroup (nnir-artitem-group artitem)) + (artno (nnir-artitem-number artitem)) + ;; Bug? + ;; Why must we bind nntp-server-buffer here? It won't + ;; work if `buf' is used, say. (Of course, the set-buffer + ;; line below must then be updated, too.) + (nntp-server-buffer (or to-buffer nntp-server-buffer))) + (set-buffer nntp-server-buffer) + (erase-buffer) + (message "Requesting article %d from group %s" + artno + (nnir-group-full-name artgroup)) + (gnus-request-article artno (nnir-group-full-name artgroup) + nntp-server-buffer) + (cons artgroup artno)))) + + +(nnoo-define-skeleton nnir) + +;;; Search Engine Interfaces: + +;; Glimpse interface. +(defun nnir-run-glimpse (query &optional group) + "Run given query against glimpse. Returns a vector of (group name, file name) +pairs (also vectors, actually)." + (save-excursion + (let ((artlist nil) + (groupspec (cdr (assq 'group query))) + (qstring (cdr (assq 'query query)))) + (when (and group groupspec) + (error (concat "It does not make sense to use a group spec" + " with process-marked groups."))) + (when group + (setq groupspec (gnus-group-real-name group))) + (set-buffer (get-buffer-create nnir-tmp-buffer)) + (erase-buffer) + (if groupspec + (message "Doing glimpse query %s on %s..." 
query groupspec) + (message "Doing glimpse query %s..." query)) + (let* ((cp-list + `( ,nnir-glimpse-program + nil ; input from /dev/null + t ; output + nil ; don't redisplay + "-H" ,nnir-glimpse-home ; search home dir + "-W" ; match pattern in file + "-l" "-y" ; misc options + ,@nnir-glimpse-additional-switches + "-F" ,nnir-glimpse-remove-prefix ; restrict output to mail + ,qstring ; the query, in glimpse format + )) + (exitstatus + (progn + (message "%s args: %s" nnir-glimpse-program + (mapconcat 'identity (cddddr cp-list) " ")) + (apply 'call-process cp-list)))) + (unless (or (null exitstatus) + (zerop exitstatus)) + (nnheader-report 'nnir "Couldn't run glimpse: %s" exitstatus) + ;; Glimpse failure reason is in this buffer, show it if + ;; the user wants it. + (when (> gnus-verbose 6) + (display-buffer nnir-tmp-buffer)))) + (when groupspec + (keep-lines groupspec)) + (if groupspec + (message "Doing glimpse query %s on %s...done" query groupspec) + (message "Doing glimpse query %s...done" query)) + (sit-for 0) + ;; CCC: The following work of extracting group name and article + ;; number from the Glimpse output can probably better be done by + ;; just going through the buffer once, and plucking out the + ;; right information from each line. + ;; remove superfluous stuff from glimpse output + (goto-char (point-min)) + (delete-non-matching-lines "/[0-9]+$") + ;;(delete-matching-lines "\\.overview~?$") + (goto-char (point-min)) + (while (re-search-forward (concat "^" nnir-glimpse-remove-prefix) nil t) + (replace-match "")) + ;; separate group name from article number with \t + ;; XEmacs compatible version + (goto-char (point-max)) + (while (re-search-backward "/[0-9]+$" nil t) + (delete-char 1 nil) + (insert-char ?\t 1)) +; Emacs compatible version +; (goto-char (point-min)) +; (while (re-search-forward "\\(/\\)[0-9]+$" nil t) +; (replace-match "\t" t t nil 1)) + ;; replace / with . in group names + (subst-char-in-region (point-min) (point-max) ?/ ?. 
t) + ;; massage buffer to contain some Lisp; + ;; this depends on the artlist encoding internals + ;; maybe this dependency should be removed? + (goto-char (point-min)) + (while (not (eobp)) + (insert "[\"") + (skip-chars-forward "^\t") + (insert "\" ") + (end-of-line) + (insert " 1000 ]") ; 1000 = score + (forward-line 1)) + (insert "])\n") + (goto-char (point-min)) + (insert "(setq artlist [\n") + (eval-buffer) + (sort* artlist + (function (lambda (x y) + (if (string-lessp (nnir-artitem-group x) + (nnir-artitem-group y)) + t + (< (nnir-artitem-number x) + (nnir-artitem-number y)))))) + ))) + +;; freeWAIS-sf interface. +(defun nnir-run-waissearch (query &optional group) + "Run given query against waissearch. Returns vector of (group name, file name) +pairs (also vectors, actually)." + (when group + (error "The freeWAIS-sf backend cannot search specific groups.")) + (save-excursion + (let ((qstring (cdr (assq 'query query))) + (artlist nil) + (score nil) (artno nil) (dirnam nil) (group nil)) + (set-buffer (get-buffer-create nnir-tmp-buffer)) + (erase-buffer) + (message "Doing WAIS query %s..." query) + (call-process nnir-wais-program + nil ; input from /dev/null + t ; output to current buffer + nil ; don't redisplay + "-d" nnir-wais-database ; database to search + qstring) + (message "Massaging waissearch output...") + ;; remove superfluous lines + (keep-lines "Score:") + ;; extract data from result lines + (goto-char (point-min)) + (while (re-search-forward + "Score: +\\([0-9]+\\).*'\\([0-9]+\\) +\\([^']+\\)/'" nil t) + (setq score (match-string 1) + artno (match-string 2) + dirnam (match-string 3)) + (unless (string-match nnir-wais-remove-prefix dirnam) + (nnheader-report 'nnir "Dir name %s doesn't contain prefix %s" + dirnam nnir-wais-remove-prefix)) + (setq group (substitute ?.
?/ (replace-match "" t t dirnam))) + (push (vector group + (string-to-int artno) + (string-to-int score)) + artlist)) + (message "Massaging waissearch output...done") + (apply 'vector + (sort* artlist + (function (lambda (x y) + (> (nnir-artitem-rsv x) + (nnir-artitem-rsv y))))))))) + +;; EWS (Excite for Web Servers) interface +(defun nnir-run-excite-search (query &optional group) + "Run a given query against EWS. Returns vector of (group name, file name) +pairs (also vectors, actually)." + (when group + (error "Searching specific groups not implemented for EWS.")) + (save-excursion + (let ((qstring (cdr (assq 'query query))) + artlist group article-num article) + (setq nnir-current-query query) + (set-buffer (get-buffer-create nnir-tmp-buffer)) + (erase-buffer) + (message "Doing EWS query %s..." qstring) + (call-process nnir-excite-aquery-program + nil ; input from /dev/null + t ; output to current buffer + nil ; don't redisplay + nnir-excite-collection + (if (string= (substring qstring 0 1) "(") + qstring + (format "(concept %s)" qstring))) + (message "Gathering query output...") + + (goto-char (point-min)) + (while (re-search-forward + "^[0-9]+\\s-[0-9]+\\s-[0-9]+\\s-\\(\\S-*\\)" nil t) + (setq article (match-string 1)) + (unless (string-match + (concat "^" (regexp-quote nnir-excite-remove-prefix) + "\\(.*\\)/\\([0-9]+\\)") article) + (nnheader-report 'nnir "Dir name %s doesn't contain prefix %s" + article nnir-excite-remove-prefix)) + (setq group (substitute ?. ?/ (match-string 1 article))) + (setq article-num (match-string 2 article)) + (setq artlist (vconcat artlist (vector (vector group + (string-to-int article-num) + 1000))))) + (message "Gathering query output...done") + artlist))) + +;; IMAP interface. The following function is Copyright (C) 1998 Simon +;; Josefsson . +;; todo: +;; nnir invokes this two (2) times???! 
+;; we should not use nnimap at all but open our own server connection +;; we should not LIST * but use nnimap-list-pattern from defs +;; send queries as literals +;; handle errors + +(defun nnir-run-imap (query &optional group) + (require 'imap) + (require 'nnimap) + (unless group + (error "Must specify groups for IMAP searching.")) + (save-excursion + (let ((qstring (cdr (assq 'query query))) + (server (cadr nnir-mail-backend)) + (defs (caddr nnir-mail-backend)) + artlist buf) + (message "Opening server %s" server) + (condition-case () + (when (nnimap-open-server server defs) ;; xxx + (setq buf nnimap-server-buffer) ;; xxx + (message "Searching %s..." group) + (let ((arts 0) + (mbx (gnus-group-real-name group))) + (when (imap-mailbox-select mbx nil buf) + (mapcar + (lambda (artnum) + (push (vector mbx artnum 1) artlist) + (setq arts (1+ arts))) + (imap-search (concat "TEXT \"" qstring "\"") buf)) + (message "Searching %s... %d matches" mbx arts))) + (message "Searching %s...done" group)) + (quit nil)) + (reverse artlist)))) + +;; Swish++ interface. The following function is Copyright (C) 2000, +;; 2001 Christoph Conrad . +;; -cc- Todo +;; Search by +;; - group +;; Sort by +;; - rank (default) +;; - article number +;; - file size +;; - group +(defun nnir-run-swish++ (query &optional group) + "Run given query against swish++. +Returns a vector of (group name, file name) pairs (also vectors, +actually). + +Tested with swish++ 4.7 on GNU/Linux and with swish++ 5.0b2 on +Windows NT 4.0." + + (when group + (error "The swish++ backend cannot search specific groups.")) + + (save-excursion + (let ( (qstring (cdr (assq 'query query))) + (groupspec (cdr (assq 'group query))) + (artlist nil) + (score nil) (artno nil) (dirnam nil) (group nil) ) + + (when (equal "" qstring) + (error "swish++: You didn't enter anything.")) + + (set-buffer (get-buffer-create nnir-tmp-buffer)) + (erase-buffer) + + (if groupspec + (message "Doing swish++ query %s on %s..." 
qstring groupspec) + (message "Doing swish++ query %s..." qstring)) + + (let* ((cp-list `( ,nnir-swish++-program + nil ; input from /dev/null + t ; output + nil ; don't redisplay + "--index-file" ,nnir-swish++-index-file + "--config-file" ,nnir-swish++-configuration-file + ,@nnir-swish++-additional-switches + ,qstring ; the query, in swish++ format + )) + (exitstatus + (progn + (message "%s args: %s" nnir-swish++-program + (mapconcat 'identity (cddddr cp-list) " "));; ??? + (apply 'call-process cp-list)))) + (unless (or (null exitstatus) + (zerop exitstatus)) + (nnheader-report 'nnir "Couldn't run swish++: %s" exitstatus) + ;; swish++ failure reason is in this buffer, show it if + ;; the user wants it. + (when (> gnus-verbose 6) + (display-buffer nnir-tmp-buffer)))) + + ;; The results are output in the format of: + ;; V 4.7 Linux + ;; rank relative-path-name file-size file-title + ;; V 5.0b2: + ;; rank relative-path-name file-size topic?? + ;; where rank is an integer from 1 to 100. + (goto-char (point-min)) + (while (re-search-forward + "\\(^[0-9]+\\) \\([^ ]+\\) [0-9]+ \\(.*\\)$" nil t) + (setq score (match-string 1) + artno (file-name-nondirectory (match-string 2)) + dirnam (file-name-directory (match-string 2))) + + ;; don't match directories + (when (string-match "^[0-9]+$" artno) + (when (not (null dirnam)) + + ; maybe limit results to matching groups. + (when (or (not groupspec) + (string-match groupspec dirnam)) + + ;; remove nnir-swish++-remove-prefix from beginning of dirname + (when (string-match (concat "^" nnir-swish++-remove-prefix) + dirnam) + (setq dirnam (replace-match "" t t dirnam))) + + (setq dirnam (substring dirnam 0 -1)) + ;; eliminate all ".", "/", "\" from beginning. Always matches. + (string-match "^[./\\]*\\(.*\\)$" dirnam) + ;; "/" -> "." + (setq group (substitute ?. ?/ (match-string 1 dirnam))) + ;; "\\" -> "." + (setq group (substitute ?. 
?\\ group)) + + (push (vector group + (string-to-int artno) + (string-to-int score)) + artlist))))) + + (message "Massaging swish++ output...done") + + ;; Sort by score + (apply 'vector + (sort* artlist + (function (lambda (x y) + (> (nnir-artitem-rsv x) + (nnir-artitem-rsv y))))))))) + +;; Swish-E interface. The following function is Copyright (C) 2000, +;; 2001 by Christoph Conrad . +(defun nnir-run-swish-e (query &optional group) + "Run given query against swish-e. +Returns a vector of (group name, file name) pairs (also vectors, +actually). + +Tested with swish-e-2.0.1 on Windows NT 4.0." + + ;; swish-e crashes with empty parameter to "-w" on commandline... + (when group + (error "The swish-e backend cannot search specific groups.")) + + (save-excursion + (let ( (qstring (cdr (assq 'query query))) + (artlist nil) + (score nil) (artno nil) (dirnam nil) (group nil) ) + + (when (equal "" qstring) + (error "swish-e: You didn't enter anything.")) + + (set-buffer (get-buffer-create nnir-tmp-buffer)) + (erase-buffer) + + (message "Doing swish-e query %s..." query) + (let* ((cp-list `( ,nnir-swish-e-program + nil ; input from /dev/null + t ; output + nil ; don't redisplay + "-f" ,nnir-swish-e-index-file + ,@nnir-swish-e-additional-switches + "-w" + ,qstring ; the query, in swish-e format + )) + (exitstatus + (progn + (message "%s args: %s" nnir-swish-e-program + (mapconcat 'identity (cddddr cp-list) " ")) + (apply 'call-process cp-list)))) + (unless (or (null exitstatus) + (zerop exitstatus)) + (nnheader-report 'nnir "Couldn't run swish-e: %s" exitstatus) + ;; swish-e failure reason is in this buffer, show it if + ;; the user wants it. 
+ (when (> gnus-verbose 6) + (display-buffer nnir-tmp-buffer)))) + + ;; The results are output in the format of: + ;; rank path-name file-title file-size + (goto-char (point-min)) + (while (re-search-forward + "\\(^[0-9]+\\) \\([^ ]+\\) \"\\([^\"]+\\)\" [0-9]+$" nil t) + (setq score (match-string 1) + artno (match-string 3) + dirnam (file-name-directory (match-string 2))) + + ;; don't match directories + (when (string-match "^[0-9]+$" artno) + (when (not (null dirnam)) + + ;; remove nnir-swish-e-remove-prefix from beginning of dirname + (when (string-match (concat "^" nnir-swish-e-remove-prefix) + dirnam) + (setq dirnam (replace-match "" t t dirnam))) + + (setq dirnam (substring dirnam 0 -1)) + ;; eliminate all ".", "/", "\" from beginning. Always matches. + (string-match "^[./\\]*\\(.*\\)$" dirnam) + ;; "/" -> "." + (setq group (substitute ?. ?/ (match-string 1 dirnam))) + ;; Windows "\\" -> "." + (setq group (substitute ?. ?\\ group)) + + (push (vector group + (string-to-int artno) + (string-to-int score)) + artlist)))) + + (message "Massaging swish-e output...done") + + ;; Sort by score + (apply 'vector + (sort* artlist + (function (lambda (x y) + (> (nnir-artitem-rsv x) + (nnir-artitem-rsv y))))))))) + +;; Namazu interface +(defun nnir-run-namazu (query &optional group) + "Run given query against Namazu. Returns a vector of (group name, file name) +pairs (also vectors, actually). + +Tested with Namazu 2.0.6 on a GNU/Linux system." 
+ (when group + (error "The Namazu backend cannot search specific groups")) + (save-excursion + (let ( + (artlist nil) + (qstring (cdr (assq 'query query))) + (score nil) + (group nil) + (article nil) + (process-environment (copy-sequence process-environment)) + ) + (setenv "LC_MESSAGES" "C") + (set-buffer (get-buffer-create nnir-tmp-buffer)) + (erase-buffer) + (let* ((cp-list + `( ,nnir-namazu-program + nil ; input from /dev/null + t ; output + nil ; don't redisplay + "-q" ; don't be verbose + "-a" ; show all matches + "-s" ; use short format + ,@nnir-namazu-additional-switches + ,qstring ; the query, in namazu format + ,nnir-namazu-index-directory ; index directory + )) + (exitstatus + (progn + (message "%s args: %s" nnir-namazu-program + (mapconcat 'identity (cddddr cp-list) " ")) + (apply 'call-process cp-list)))) + (unless (or (null exitstatus) + (zerop exitstatus)) + (nnheader-report 'nnir "Couldn't run namazu: %s" exitstatus) + ;; Namazu failure reason is in this buffer, show it if + ;; the user wants it. + (when (> gnus-verbose 6) + (display-buffer nnir-tmp-buffer)))) + + ;; Namazu output looks something like this: + ;; 2. Re: Gnus agent expire broken (score: 55) + ;; /home/henrik/Mail/mail/sent/1310 (4,138 bytes) + + (goto-char (point-min)) + (while (re-search-forward + "^\\([0-9]+\\.\\).*\\((score: \\([0-9]+\\)\\))\n\\([^ ]+\\)" + nil t) + (setq score (match-string 3) + group (file-name-directory (match-string 4)) + article (file-name-nondirectory (match-string 4))) + + ;; make sure article and group is sane + (when (and (string-match "^[0-9]+$" article) + (not (null group))) + (when (string-match (concat "^" nnir-namazu-remove-prefix) group) + (setq group (replace-match "" t t group))) + + ;; remove trailing slash from groupname + (setq group (substring group 0 -1)) + + ;; stuff results into artlist vector + (push (vector (substitute ?. 
?/ group) + (string-to-int article) + (string-to-int score)) artlist))) + + ;; sort artlist by score + (apply 'vector + (sort* artlist + (function (lambda (x y) + (> (nnir-artitem-rsv x) + (nnir-artitem-rsv y))))))))) + +;;; Util Code: + +(defun nnir-read-parms (query) + "Reads additional search parameters according to `nnir-engines'." + (let ((parmspec (caddr (assoc nnir-search-engine nnir-engines)))) + (cons (cons 'query query) + (mapcar 'nnir-read-parm parmspec)))) + +(defun nnir-read-parm (parmspec) + "Reads a single search parameter. +`parmspec' is a cons cell, the car is a symbol, the cdr is a prompt." + (let ((sym (car parmspec)) + (prompt (cdr parmspec))) + (cons sym (read-string prompt)))) + +(defun nnir-run-query (query) + "Invoke appropriate search engine function (see `nnir-engines'). +If some groups were process-marked, run the query for each of the groups +and concat the results." + (let ((search-func (cadr (assoc nnir-search-engine nnir-engines))) + (q (car (read-from-string query)))) + (if gnus-group-marked + (apply 'append + (mapcar (lambda (x) + (funcall search-func q x)) + gnus-group-marked)) + (funcall search-func q nil)))) + +(defun nnir-group-full-name (shortname) + "For the given group name, return a full Gnus group name. +The Gnus backend/server information is added." + (gnus-group-prefixed-name shortname nnir-mail-backend)) + +(defun nnir-possibly-change-server (server) + (unless (and server (nnir-server-opened server)) + (nnir-open-server server))) + + +;; Data type article list. + +(defun nnir-artlist-length (artlist) + "Returns number of articles in artlist." + (length artlist)) + +(defun nnir-artlist-article (artlist n) + "Returns from ARTLIST the Nth artitem (counting starting at 1)." + (elt artlist (1- n))) + +(defun nnir-artitem-group (artitem) + "Returns the group from the ARTITEM." + (elt artitem 0)) + +(defun nnir-artlist-artitem-group (artlist n) + "Returns from ARTLIST the group of the Nth artitem (counting from 1)." 
+ (nnir-artitem-group (nnir-artlist-article artlist n))) + +(defun nnir-artitem-number (artitem) + "Returns the number from the ARTITEM." + (elt artitem 1)) + +(defun nnir-artlist-artitem-number (artlist n) + "Returns from ARTLIST the number of the Nth artitem (counting from 1)." + (nnir-artitem-number (nnir-artlist-article artlist n))) + +(defun nnir-artitem-rsv (artitem) + "Returns the Retrieval Status Value (RSV, score) from the ARTITEM." + (elt artitem 2)) + +(defun nnir-artlist-artitem-rsv (artlist n) + "Returns from ARTLIST the Retrieval Status Value of the Nth artitem +(counting from 1)." + (nnir-artitem-rsv (nnir-artlist-article artlist n))) + +;; unused? +(defun nnir-artlist-groups (artlist) + "Returns a list of all groups in the given ARTLIST." + (let ((res nil) + (with-dups nil)) + ;; from each artitem, extract group component + (setq with-dups (mapcar 'nnir-artitem-group artlist)) + ;; remove duplicates from above + (mapcar (function (lambda (x) (add-to-list 'res x))) + with-dups) + res)) + + +;; The end. +(provide 'nnir) --- swish++-6.1.5.orig/debian/email_indexing/swishmutt.sh +++ swish++-6.1.5/debian/email_indexing/swishmutt.sh @@ -0,0 +1,60 @@ +#!/bin/bash +#swishmutt +#MH 2002 ; starting point was: http://www.muttfr.org/gen.php3/2001/12/05/85,0,1,0.html +#awkward example just to show the possibilities of integration of swish++ and mutt +#You certainly could create a temporary Maildir and not a mbox like here +#You need the procmail-package for the mbox formater: formail +# +#Have a look at the howto ... 
+MAILHOME=~/mail +TMPRAW=$MAILHOME/sqmbox.raw +TMPMBOX=$MAILHOME/sqmbox.tmp +INDEXFILE=$MAILHOME/swish++.index +if [ -f $TMPRAW ]; then + rm -f $TMPRAW +fi +if [ -f $TMPMBOX ]; then + rm -f $TMPMBOX +fi +#remove old results + +echo -e "Keywords with less than 4 chars get ignored by swish++ \n Your query please: \n" + +read KEYWORD + +if [ -z "$KEYWORD" ]; then + echo -e "No keyword found\n" + echo -e "Your query please !: ( type q to quit )\n" + read KEYWORD + if [ -z "$KEYWORD" ]; then + echo -e "You didn't specify any keyword\n" + echo -e "Your query please !: ( type q to quit )\n" + read KEYWORD + elif [ "$KEYWORD" = "q" ]; then + exit 0 + fi +fi + +#The following assumes you created an index to $MAILHOME/swish++.index +#like: +# +#index++ -v3 -e 'mail:*' -c /PATH_TO/swish++.conf -i $MAILHOME/swish++.index ~/DIR/DIR1 ~/DIR/DIR2 ... +#if ~/DIR/DIR# is a Maildir you should give ~/DIR/DIR#/cur +# +for i in `search++ --index-file=$INDEXFILE "$KEYWORD" | cut -d" " --fields=2` + do + + if [ $i = "results:" -o $i = "ignored:" ]; then + continue +#don't try to read the "results" and "ignored" indications ;-) + fi + cat $MAILHOME$i >> $TMPRAW +done +if [ ! -f $TMPRAW ]; then + echo "No search results for the given keyword(s)" + exit 0 +#no file no result +fi +formail -ds < $TMPRAW >> $TMPMBOX +#violent approach, but can come in very handy repairing headers ... +exit 0 --- swish++-6.1.5.orig/debian/email_indexing/Email-Indexing-Mini-Howto.txt +++ swish++-6.1.5/debian/email_indexing/Email-Indexing-Mini-Howto.txt @@ -0,0 +1,154 @@ + ;-*- outline -*- + +"swish++ -- A Mail Indexing System for Humans" + +* swish++ + Email + Swish++ is great for indexing and later, of course, searching and + retrieving your old emails. + +!!! Swish++ searches on a per file basis, so its use only makes sense + with one-file-per-message systems (like Maildir, gnus nnml, mh ... 
and + not with mbox-based storage -- use grepmail instead or just switch + your mail system, especially with a high volume of archived messages) + +** Indexing of emails + +swish++ provides a specific module for this purpose. + +*** index example: + +pwd = ~/Mail +index++ -v3 -s stop_words -e 'mail:*' -E 'Incomin*' -E '*~' \ +./archive ./Ich ./drafts ./maintainer ./maintainer-debian + +-e denotes the module, -E excludes a file pattern and -s names a + specific stop-word file + +(man index++) + +*** cron + + If you always want "fresh" indexes, you need the help of cron. + +**** cron sample + +41 06 * * * /usr/bin/index++ -s /home/YOU/Mail/stop_words -e 'mail:*' -E \ +'Incomin*' -E '*~' -E '*.alt' \ +--config-file=/home/YOU/Mail/swish++.config \ +--index-file=/home/YOU/Mail/swish++.index /home/YOU/Mail/archive /home/YOU/Mail/Ich /home/YOU/Mail/drafts /home/YOU/Mail/maintainer 2>&1 + +"Index the mentioned directories every day at 6:41 (recursively - the +default), use a config file and create the index in /home/YOU/Mail; use +the mail module (-e 'mail:*') and don't index certain patterns (-E +...)." + + Or put the whole command in a script and execute it from cron: + +41 06 * * * /home/YOU/cron_swish++.sh 2>&1 + + +*** search example: + >search++ from = mhummel and swish + # results: 9 + 100 ./Ich/930 1175 Indexing + 58 ./selber/254 1110 Bug#88974 ITA: swish++ -- Simple Web Indexing System for + 47 ./selber/301 3270 Re: Bug#129390: swish++: index++ gets a segmentation fault 41 ./selber/338 939 Indexing + [...] + + Though this is already quite useful, it is not yet the full + comfort. I bet you want to read the rediscovered email with your + favorite mail reader. + + (for search++ options see man search++, of course) + + +* Mail Reader Integration of swish++ + +** Integration with emacs + gnus + nnir + +*** How to use emacs? + + Get one year of vacation and have some kind of remote approach to this digital + epic. + +*** How to use gnus? + + See above ... 
+ +*** gnus + nnir + swish++ + + In the examples directory you'll find a patched version of nnir.el + (Maybe the small patch will be included upstream by the time you are reading + this.) + +**** nnml-back-ends and nnir + +***** put nnir.el in your load-path + +***** Add the following to your ~/.gnus init-file + + (setq nnir-search-engine 'swish++) + ;;the following are the default values + (setq nnir-swish++-program "search++") ; the search executable in Debian + ;(setq nnir-swish++-index-file "/home/YOU/Mail/swish++.index"); + ;the index location mail is stored in ~/Mail normally -- the default + + (Have a look at Kai's comments in nnir.el. + Maybe you have to set `nnir-mail-backend', but with nnml -- and + having nnml as the default gnus-select-method -- the default is + fine.) + +***** Interaction within gnus + + Type G G and you will be prompted for a query (enter the same as with plain + search++ at the command line). + + The search results will form a new group. + + Also try G T and enjoy. + +** Integration with mutt + + mutt supports the Maildir format and mh + (http://www.reedmedia.net/misc/mail/mailbox-formats.html and + http://www.courier-mta.org/mbox-vs-maildir/#intro1) well; you don't + need to install qmail to be able to work with maildirs. Not even + the MTA needs maildir support; see procmail (even Mail::Audit does + the right thing on discovering a maildir, and delivers to the + Maildir/new directory). + + +*** swishmutt.sh + + (Caveat: You need the procmail package installed -- for the mbox + formatter) + + Copy this script to your ~/bin/ directory (and, of course, adapt the + settings to your needs). + It's just an example of how you could parse the swish++ results + for mutt, but after checking and adapting the variables it should + work out of the box. + Add something like: + + macro index "\ch" "!~/bin/swishmutt.sh\nc=sqmbox.tmp\n" + + folder-hook sqmbox.tmp set sort=mailbox-order + + + to your .muttrc file. 
(Make sure that mutt actually finds the + temporary mbox, sqmbox.tmp -- the script assumes + ~/mail/sqmbox.tmp.) + + The folder-hook is necessary to preserve the search ranking. (The + mailbox order already reflects it, so no mess with date-sent and the like.) + +**** Interaction within mutt + + The macro works from within a mail folder. Just press Ctrl-h and + you will be prompted for a query. + Then, if everything works, you will enter a temporary mbox + with the emails ordered according to the search++ ranking. + + +MH +