glimpse-4.18.7/CHANGES

Manually edited CHANGES file - now out of date.
See ChangeLog for latest update information.

Some notes from Peter on checking the code into RCS and some fixes to 4.1
are appended to the bottom of this file.

4.12.5 --> 4.12.6
- Fixes to configure script, thanks to Michael Heironimus
- Fixes to index/partition.c, index/io.c and index/build_in.c should
  resolve problem with missing hits on the first one or two keywords
  in the index. Thanks to Morey Hubin.
- Fix to sgrep.c solves problem of double-hit count with record
  delimiters. M. Hubin.

4.11 --> 4.12.5
Fix for using filters with structured indexes.
Added FILE_END_MARK constant so it is possible to configure for
filenames with spaces
Test-fix for core dump on large indexes (may not have solved problem).

4.1 --> 4.11
Fix for core dump on merge, cleanup makefiles.

4.0 --> 4.1
- Minor bug fixes and cleanup preparatory to final glimpse release.

3.6 --> 4.0
- Added support to extract titles from HTML pages in glimpseindex with
  the -X option. These files must have names that end in: html, htm,
  shtml, shtm (It is easy to extend these -- just see
  glimpse.h/EXTRACT_INFO_SUFFIX. The routine to extract titles is
  index/filetype.c/extract_info(). This can be modified in various ways
  to extract info from many filetypes.) The titles are appended to the
  corresponding filenames after a ' ', before storing the filenames in
  .glimpse_filenames. In this case, glimpseindex assumes that filenames
  don't have spaces in them.
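The "filename, then a ' ', then the title" layout described in the -X entry can be illustrated with a short sketch. This is not glimpse's own code: the function name and the sample paths are hypothetical, and the only rule taken from the notes is that the first blank ends the filename.

```python
def split_filename_entry(entry):
    """Split one stored entry at its first blank.

    Returns (filename, extra), where extra is the appended title or
    other information, or None when the entry holds a bare filename.
    Assumption: one entry per line, as in a plain list of filenames.
    """
    filename, sep, extra = entry.rstrip("\n").partition(" ")
    return filename, (extra if sep else None)


# Hypothetical entries, for illustration only.
print(split_filename_entry("/home/a/index.html My Home Page\n"))
print(split_filename_entry("/home/a/notes.txt"))
```

Splitting at the first blank is also why this scheme requires filenames without spaces.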
- Added support to glimpseindex to store not just the names of files
  that are indexed, but also some extra information (like a URL) after
  each file, when -F is used to provide the names of the files to be
  indexed to glimpseindex. This will be stored in .glimpse_filenames and
  .glimpse_filehash. The information (URL) must be separated from the
  actual file name by one blank ' '. In this case too, glimpse assumes
  that filenames don't have spaces in them.
- Added a -U option to glimpse to be able to interpret indices created
  with a -X or a -U option in glimpseindex. This is necessary since
  glimpse must know that the first ' ' (see above) signifies the end of
  the filename in .glimpse_filenames. When glimpse outputs matches, it
  will display the filename, the URL, and the title automatically. The
  user must be able to parse this info properly though!
- Added an option -X to glimpse to just output the names of files that
  do contain a match, in case glimpse is not able to open the file for
  reading. Without the -X option, glimpse will simply ignore the file
  and continue.
- Added "wgconvert", a program to compress and decompress neighborhoods
  in webglimpse. It can also be used to convert a file of filenames
  (that's used as a parameter for the -f option in glimpse) to a smaller
  binary representation, and vice versa. See file "index/convert.c".
  (9-10/96). The compression can change a filenames file to a file
  containing a bit mask representation of the set of files, or to a file
  containing a sparse set representation of these files. We recommend
  sparse-sets only.
- Added support in glimpse to read not just a set of filenames (with a
  -f option), but also a compressed set of filenames (with the -p
  option). The -p option allows you to utilize compressed
  `neighborhoods' (sets of filenames) to limit your search, without
  uncompressing them.
  The usage is: "-p filename:X:Y:Z" where "filename" is the file with
  compressed neighborhoods, X is an offset into that file (usually 0,
  must be a multiple of sizeof(int)), Y is the length glimpse must
  access from that file (if 0, then whole file; must be a multiple of
  sizeof(int)), and Z must be 2 (it indicates that "filename" has the
  sparse-set representation of compressed neighborhoods: the other
  values are for internal use only). Note that any colon ":" in filename
  must be escaped using a backslash \.
- Added limited support for NOT in glimpse. This works with index search
  (-N) or whole file scope for booleans (-W) only. (11/96).
  "Not" can be specified using "~". "Not" is most useful in expressions
  like "bad;~boy" or "woman,~girl"; or in "global not" expressions like
  "~{bad;boy}" or "~{woman,girl}".
  The semantics of ~ are as follows: the ~ works exactly as you would
  expect for index search (-N). For actual output, you will get all
  records with at least one of the specified patterns (bad, boy, woman,
  girl), that satisfy the boolean expression. That is, for example,
  "bad;~boy" will give you all records that contain "bad" but not "boy",
  in all files that contain "bad" but not "boy". However, if you search
  for "~{bad;boy}", glimpse/agrep will NOT output records that don't
  contain either bad or boy. They will only give you records that
  contain at least one of "bad" or "boy" but not both. This is logical
  since otherwise, a pattern like "~ZZZZIYIUYIUYIUYRR", for example,
  would force glimpse to output all records in all files...
  For index-search and actual file-search to be consistent, a ~ should
  be used only with -W. Glimpse exits with an error otherwise.
  Agrep can now also search for nots, and the semantics are the same as
  above, except that the boolean expressions are evaluated on a
  per-record basis, rather than a per-file basis like glimpse.
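As a rough illustration of the "-p filename:X:Y:Z" argument described earlier in this list, the sketch below splits such a string on unescaped colons and checks the stated constraints. It is not taken from the glimpse sources: the function name is hypothetical, sizeof(int) is assumed to be 4, and treating "\:" as the only escape sequence is an assumption.

```python
INT_SIZE = 4  # assumption: sizeof(int) on the machine that built the index


def parse_p_option(arg, int_size=INT_SIZE):
    """Parse 'filename:X:Y:Z' into (filename, offset, length, kind)."""
    fields, current, i = [], [], 0
    while i < len(arg):
        if arg[i] == "\\" and i + 1 < len(arg) and arg[i + 1] == ":":
            current.append(":")          # escaped colon belongs to the name
            i += 2
        elif arg[i] == ":":
            fields.append("".join(current))  # unescaped colon ends a field
            current = []
            i += 1
        else:
            current.append(arg[i])
            i += 1
    fields.append("".join(current))
    if len(fields) != 4:
        raise ValueError("expected filename:X:Y:Z")
    filename = fields[0]
    offset, length, kind = (int(f) for f in fields[1:])
    if offset < 0 or offset % int_size != 0:
        raise ValueError("X must be a non-negative multiple of sizeof(int)")
    if length < 0 or length % int_size != 0:
        raise ValueError("Y must be 0 or a multiple of sizeof(int)")
    if kind != 2:
        raise ValueError("Z must be 2 (sparse-set representation)")
    return filename, offset, length, kind


# Hypothetical arguments, for illustration only.
print(parse_p_option("nbhd:0:0:2"))
print(parse_p_option(r"my\:set:8:16:2"))
```

A length of 0 passes the multiple-of-sizeof(int) check unchanged, which matches the "if 0, then whole file" rule.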
- Added support to search for patterns with repeating strings (11/96):
  "{computer;science},{computer;chronicles}"
  This now works in agrep as well as glimpse. However, it is for simple
  patterns only (i.e., no regexp or spelling-errors). Previously, you
  were forced to say "{computer;{science,chronicles}}". This also fixes
  the "bug" where queries like "url=pat1;content=pat1" in Harvest did
  not work (the same pattern pat1 appears twice).
- Fixed some nagging memory leaks and segfaults on Solaris (10/96).
- Fixed multiple matches / missed matches problems with -W (11/96).

3.5 --> 3.6
- Many bug fixes and performance improvements to support webglimpse
- A -R option to glimpseindex to recompute .glimpse_filenames_index
  from a changed .glimpse_filenames. This allows users to move the index
  from one file system to another (where the absolute pathnames of the
  same files can be different), or convert all absolute pathnames in
  .glimpse_filenames to relative pathnames, and still use the existing
  index of that data.

3.0 --> 3.5
- added "-f filename" option to glimpse: it allows you to restrict the
  search to only those files whose names appear in "filename".
- fixed the agrep bug where -n was not working with ISO characters.
- Added -t to glimpseindex that sorts .glimpse_filenames by decreasing
  order of modify time (st_mtime in stat structure);
- Added -j option to glimpse to print time of file along with its name;
- Added "-Y days" option to print files that were modified "days" before
  the index was created.
- Added support for arbitrary characters in filenames
  (e.g. >, <, space, &...)

2.1 ---> 3.0
- added a data structure (in .glimpse_turbo) that speeds up queries
  using -w and -i considerably for large indexes. It is meant mostly for
  servers using glimpse (e.g., Harvest and glimpseHTTP servers), but it
  benefits everyone. With this "turbo" option, typical queries take less
  than a second even for very large indexes.
  This was so successful that we made it the default rather than an
  option (it used to be -T in some earlier versions). If the
  .glimpse_turbo file is deleted, glimpse will still work properly (but
  glimpseindex -f and -a require it).
- incremental indexing is now fully supported (even for -b). Deletion
  from the index is supported. glimpseindex -d filename(s) completely
  deletes the files from the index; glimpseindex -D filename(s) deletes
  the files only from the file list.
- the index has been improved in several ways (transparently except for
  speed and space). As a result, indices built with earlier versions of
  glimpseindex will not work with 3.0 -- you must reindex.
- several options were added to glimpseindex and glimpse:
  glimpseindex -E indexes all files without attempting to run the
  filetype filtering (but excluded files or suffixes still apply).
  glimpse -Q extends -N in a nice way giving much more information about
  the matches in the index.
  glimpse -L has more options: -L x | x:y | x:y:z
  if one number is given, it is a limit on the total number of matches.
  Glimpse outputs only the first x matches. If two numbers are given
  (x:y), then y is an added limit on the total number of files. If three
  numbers are given (x:y:z), then z is an added limit on the number of
  matches per file. If any of the x, y, or z is set to 0, it means to
  ignore it (in other words 0 = infinity in this case); for example,
  -L 0:10 will output all matches to the first 10 files that contain a
  match.
  (There are also some undocumented-as-yet options. We are running out
  of letters. Only -j and -Y are not used!)
- glimpse 3.0 still has a LOT of makefiles (one per architecture / OS).
  We hope to include autoconf support for glimpse in the future: but
  these should be sufficient for most purposes.
- several bugs were fixed, and the whole package is now more portable.
  Binaries and make files for the following platforms are now available:
  AIX-3.2.5, HPPA, HPMC68K, IBM-RS6000, Linux, SGI.
  (Binaries and make files for SUNOS4.1.1, SUNOS4.1.3, SOLARIS 5.3 and
  DEC OSF/1 (ALPHA) are available as usual.)
  See README.install for more details.

2.0 ---> 2.1
- Added the facility to run a glimpse server which reads the index into
  memory and stays in the background. Regular glimpse then submits
  queries to the server and echoes the replies. This can improve
  performance if the index is large since it doesn't have to be read-in
  for each query. Glimpse can contact (local or remote) servers using
  the -C, -J and -K options (see the man-pages for more details).
- Optimized the performance of glimpse for very large structured
  indexes: this is mostly relevant in Harvest.1.1. Such indexes now take
  half the space, the indexing can be done in half the time, and
  structured queries are faster by a factor of 2 to 5!
- Made code more portable: the code now runs on the following machines
  and operating systems: SUNOS ALPHA SOLARIS HPUX AIX LINUX
- Added much improved man pages for glimpse, glimpseserver and
  glimpseindex.
- Many bugs were fixed based on the reports received for glimpse.2.0 and
  Harvest.1.0. The code is now more robust, portable and readable.

1.1 ---> 2.0
- A "byte-level" indexing (glimpseindex -b) has been added, which mimics
  regular inverted indexes in that the exact location of each occurrence
  of each word (except for a stop list of common words) is indexed. The
  index itself is still searched with agrep so all options are still
  available. This option speeds up the search, sometimes considerably.
- Added customizable filtering support -z to glimpseindex and glimpse.
  glimpseindex -z consults the file .glimpse_filters and runs the
  programs listed there for each match. The best example is
  compress/decompress.
  If .glimpse_filters includes the line
      *.Z uncompress <
  then before indexing any file that matches the pattern "*.Z" (same
  syntax as the one for .glimpse_exclude) the command listed is executed
  first (assuming input is from stdin, which is why uncompress needs <)
  and its output (assuming it goes to stdout) is indexed. The file
  itself is not changed (i.e., it stays compressed). Then if glimpse -z
  is used, the same program is used on these files on the fly. Any
  program can be used (we run 'exec'). For example, one can filter out
  parts of files that should not be indexed. Note that this can slow
  down the search because the filters need to be run before files are
  searched.
- There is a new compression package that allows glimpse (and agrep) to
  search DIRECTLY in compressed files. A new compression routine, called
  cast, is included. Also, glimpseindex can automatically index files
  compressed with cast. More details on this will be published later.
- Queries can now include arbitrary combinations of ANDs, ORs, NOTs.
- Added option -F in cast, uncast, buildcast and glimpseindex to take
  filenames from stdin.
- Added a -L x option to glimpse to output only the first x matches.
- User can explicitly specify whether exclude or include has higher
  priority (the default is to prefer exclude, glimpseindex -i gives
  priority to include). For example, you can put * in .glimpse_exclude
  and then explicitly say which files you want to include.
- Added a -S x option to glimpseindex to allow the user to adjust the
  size of the stop-list under -o and -b.
- Added a -W option to change the scope of Boolean queries to be the
  whole file.
- Added support for structured queries in glimpse/glimpseindex
  (This was done for the Harvest project.)
- Many small corrections were made based on the bug-reports received for
  version 1.1 and the beta version of 2.0.

1.0 ---> 1.1
- Names of files/directories whose ABSOLUTE path names are given as
  input to glimpseindex are indexed "as they are".
  If their RELATIVE path names are given, THEN glimpseindex tries to
  construct their absolute path names. Path names are still absolute:
  they are NOT relative to where the index is stored.
- A new faster mgrep() (multi pattern search) has been added to agrep.
- Boolean search by glimpse is now faster: it uses the new mgrep routine
  and the limit on the number of simple patterns separated by boolean
  operations is no longer 32 (it is 256 = maximum pattern length).
- The maximum number of files which can be indexed at one go has been
  increased from 16000 to around 65000.
- Many small corrections were made based on the bug-reports received for
  version 1.0.

# /cs/usi/glimpse/ChangeLog
# $Id: CHANGES,v 1.4 2000/08/16 04:23:00 golda Exp $
# Created: Tue Apr 7 08:54:27 1998
# Peter A. Bigot
#
# Description:
# Master source area for glimpse indexing system.

Tue Apr 7 08:54:27 1998 Peter A. Bigot (pab@thalia.CS.Arizona.EDU)

  * Generated this area from the glimpse-4.1 source release. All files
    checked in as version 1.1, and tagged with symbolic name "r4-1"
    thusly:
      % for f in `find . -name RCS` ; do \
          (cd `dirname $f`; rcs -nr4-1: RCS/*) ; \
        done

  * Patches are in ./Patches.

  * Applied glimpse-4.1-4.1b.patch; log in plog-4.1-4.1b. System
    incorrectly attempted to patch ./defs.h instead of
    ./compress/defs.h; applied that one by hand. This patch combines
    Burra's changes with some general cleanup; see the patch file for
    details. These changes checked in and tagged as symbolic name r4-1b.
    agrep/agrep.h:1.2; ./get_index.c:1.2; ./main.c:1.2;
    ./agrep/compat.c:1.2; ./agrep/newmgrep.c:1.2; ./compress/cast.c:1.2;
    ./compress/defs.h:1.2; ./compress/main_tbuild.c:1.2;
    ./compress/tsimpletest.c:1.2; ./index/build_in.c:1.2;
    ./index/convert.c:1.2; ./index/filetype.c:1.2; ./index/glimpse.c:1.2;
    ./index/region.c:1.2; ./index/simpletest.c:1.2.

  * Built and applied glimpse-4.1b-4.1c-patch. This fixes a problem when
    the last line of the file does not end in a newline.
./agrep/agrep.c:1.2; ./agrep/bitap.c:1.2; ./agrep/io.c:1.2; ./agrep/newmgrep.c:1.3; ./index/build_in.c:1.3. Thu Aug 20 15:03:08 1998 Peter A. Bigot (pab@thalia.CS.Arizona.EDU) * (index/{glimpse.c,Makefile.linux}) Fix bug involving the TEMP_DIR fixes: syntax error in glimpse.c, need to define variable when building buildcast. 1.{4,2}. * (KNOWN_BUGS) Added this file which contains information on how to duplicate bugs that we know exist, but haven't figured out how to squash yet. glimpse-4.18.7/CONTRIBUTIONS000066400000000000000000000141201300371307100152310ustar00rootroot00000000000000We would like to acknowledge the members of harvest-dvl@cs.colorado.edu and everyone who had sent valuable bug-reports that helped to make glimpse more reliable, portable and easier to use. Especially, we would like to thank (in no particular order): "Marty Leisner" "Shirley Brown" Mark Eichin David Koski Jose Luis Pino James Binkley Benjamin Pierce Chris Dalton Rob Hartill nandu@cs.clemson.edu gmt (Gregg Townsend) Gabe Dalbec Jay Plett "Ullrich Hustadt" Pei Cao Michael Short Eric Johnson "Paul L. Clark" Charlie Stross Daniel Simmons Andrew Mauer "Curtis K. Wong" voelker@cs.washington.edu Vladimir G Ivanovic Bob Jackson raman@crl.dec.com "Michael S. Hart" Ray Schnitzler "David B. Rosen" rwk@integra.com jrb@cs.pdx.edu Brian Behlendorf rpe@pastek.cray.com (Roland Piquepaille) Dennis Grinberg Tom Phelps Eric Grosse douglis@research.att.com Jose Luis Pino ney@research.att.com joerg@pharmacy.isu.edu (Joerg Senekowitsch) George Hartzell Alastair Aitken CLMS leo@zycad.com (Leo Broukhis) Ray Allis (206) 865-3583 Paul Everitt "Bart J. Parliman" "Craig Bell (8-321-4036)" Vivek Khera forman@cs.washington.edu Vladimir Vukicevic barry@hal.com (Barry Bakalor) "Paul Pomes" Luca Toldo em@free.net (Evgeny Mironov) David A. Gernert Kent Lewallen "Gregory R. 
Olsen" Andries.Brouwer@cwi.nl Edgar Nielsen travis.winfrey@fi.gs.com (Travis Winfrey) Roy C Bixler John Cordell Bill Allocca dws@ssec.wisc.edu (DaviD W. Sanderson) barry@hal.com (Barry Bakalor) "Paul Pomes" vogelke@c-17igp.wpafb.af.mil Mark Metson (aa332@cfn.cs.dal.ca, gpurdy@fox.nstn.ns.ca) aeb@cwi.nl beebe@math.utah.edu (Nelson H. F. Beebe) jcasler@vnet.IBM.COM eaddy@scri.fsu.edu zeidenbe@ssc.wisc.edu seavey@OpenMarket.co dshaw@aplcomm.jhuapl.edu nandu@longs.att.com gencela@lafcol.lafayette.edu Keith Waclena "Charles F. Randall" Peter Marks - mark.dohm@teldta.com (Mark Dohm) "Daniel P. Zepeda" Binh Nguyen Alan Cunningham Ron Courtright rcourt@infinity.com jarausch@igpm.igpm.rwth-aachen.de (Helmut Jarausch) Dave Van Horn Alan.Harder@corp.sun.com Travis Taylor travis@ink2.ink.org Francesco Ruta ew@senate.be (Emmanuel Willems) "William Jaynes" ohst@informatik.hu-berlin.de (Daniel Ohst) Kurt Leinbach Pierre Violet Henrik.Martin@eua.ericsson.se (8 bit clean) gross@stimpy.ame.nd.edu (George B. Ross) Jim Meyering roger Firth Chet Murthy Todd White at RADium Technology Centre (Canada) stamer@merlin.physik.uni-oldenburg.DE (Heinrich Stamerjohanns) "CHRISM" (0) Jim Hurley "Piroz Mohseni" Cici.Mills@digicool.com (Ci-ci Mills) kaj.hejer@usit.uio.no (Kaj Hejer) Jonathan Shakes jrochkin@cs.oberlin.edu (Jonathan Rochkind) feyrer@rfhs1012.fh.uni-regensburg.de (Hubert Feyrer) Sudha Vaidyanathan Urmo Maeorg Tarvi Martens ericwolf@iquest.net (Eric Wolf) "O.Bartunov" Daniel Miles Kehoe Brett Bendickson Yvan Leclerc celuszak@bc-ad.bc.ca (Tom Celuszak) charles Tony Sprinzl Tai Jin Volker Ossenkopf bentson@grieg.seaslug.org (Randolph Bentson) koechlin@krug.inria.fr (Bruno Koechlin) "Steven J. Beaty" djk@clara.leather.chbi.co.uk () wysenet@iah.com (Larry E. 
Culver) Steve Karlovsky Duncan Fraser Doug Cooper Marc Paquette Keith Porterfield glenn@rockie.nsc.com (Glenn Newell) Emil Sit ada@mail2.umu.se Christer Holgersson Larry Schwimmer schwim@cyclone.stanford.edu Fred Douglis "Phil Kaslo" "Achim Bohnet" Alf-Christian Achilles p.d.stitt.kid0110@oasis.icl.co.uk Dachuan Zhang Howard Fear hsf@pagelus.com VaX#n8 Edwin Inglis Thomas Gries gries@epo.e-mail.com, gries@ibm.net Gerald Wildgruber, gewil@ue801be.ppp Davis Houlton SHIOZAKI Takehiko Jochen Schwarze wolinski@caissedesdepots.fr (Francis Wolinski) tobega@x.fra.se "David L. Fielding" wjones@tc.fluke.com (Warren Jones) vlad@mars.tecnomatix.com (Vlad Agranovsky) Udi Manber, Sun Wu and Burra Gopal glimpse-4.18.7/ChangeLog000066400000000000000000000451341300371307100150270ustar00rootroot000000000000002006-08-22 10:18 dkreil * index/glimpse.h: fixed version number [tt] 2006-03-24 19:36 root * index/: asearch.c, ss/hash.c, dir.c: Add "const" to match *exact* type cast requirment 2006-03-24 19:13 root * libtemplate/util/: , , , , ate/template.c, host.c: Fix type cast, missing return values 2006-03-20 09:02 root * test/check.sh: Fix for compare operator 2006-03-13 15:12 root * Makefile: Add check target suite for Makefile 2006-03-13 15:09 root * Makefile.NeXT, Makefile.alpha, Makefile.hp, Makefile.linux, Makefile.org, Makefile.rs6000, Makefile.sgi, Makefile.solaris, Makefile.sunos: Add check target to manually generated make files. 2006-03-13 14:31 root * test/check.sh: Removed duplicated log path and used $ERROR_LOG instead 2006-03-13 14:27 root * test/check.sh: Added code to log errors in tests in case they occur. 2006-03-13 12:35 root * main.c: Inital initialization of ret = 0. 
By default compiler should do it but we need this to suppress warnings 2006-03-13 12:33 root * index/io.c: Added type cast to "char *" to suppress warnings 2006-03-10 18:42 root * Makefile.in: Fix path to check.sh 2006-03-10 18:32 root * sample/check.sh: We don't need sample directory anymore, now we have test directory instead 2006-03-10 18:30 root * test/: check.sh, test.txt: Add check target suite, check.sh and test.txt file 2006-03-10 18:28 root * Makefile.in: Check target added 2006-03-09 22:03 julian * main.c, split.c: remove 2 warnings with main.c (type cast to sockaddr) and warning with split.c (wrong type cast to unsigned char pointer) 2006-02-03 15:04 golda * index/glimpse.h: version number... 2006-02-03 10:56 golda * README: bug report address 2006-02-03 10:37 golda * index/glimpse.c: fix for TEMP_DIR def for buildcast thanks to Kent Mein for noticing! 2006-02-03 10:12 golda * Makefile.in: What the hell happened to glimpseindex ? 2006-02-03 09:59 golda * libtemplate/include/: ccache.h, template.h, url.h: [no log message] 2006-02-03 09:56 golda * libtemplate/include/util.h: Always STRICT_ANSI 2006-02-03 09:53 golda * libtemplate/util/log.c: [no log message] 2006-02-03 09:36 golda * libtemplate/util/log.c: Never use varargs.h - nowadays we ALWAYS use stdarg.h, any time this century... 2005-08-07 01:35 golda * agrep/: agrep.c, preprocess.c: Use MAXREG, not static '30' 2004-11-08 18:02 golda * README: version # 2004-06-08 10:07 golda * index/: le, ure, , Makefile.in, io.c: Finally using autoconf 2004-06-08 10:02 golda * configure.in: finally got right combination of quotes for autoconf... 
2004-06-08 09:22 golda * sample/check.sh: Adding script for make check as per suggestion from Nelson Beebe 2004-05-30 16:35 golda * libtemplate/util/log.c: put varargs.h option back - we thought this woudln't be needed anymore but apparently is on some systems 2003-11-12 22:17 golda * libtemplate/: include/util.h, template/template.c, util/buffer.c, util/harvest.c, util/log.c, util/system.c: change name of log() function to glimpselog() so as not to conflict with math.h 2003-11-12 22:16 golda * index/glimpse.h: add .abra to list of EXTRACT_INFO_SUFFIX files to look for titles in. (We preserve the title tag when prefiltering html files) 2003-11-12 22:10 golda * dynfilters/htuml2txt.so: [no log message] 2003-11-12 22:10 golda * configure, configure.in: Trying to use only autoconf, and not a customized configure script 2003-11-12 21:50 golda * main.c: Add support for --help & --version (in a rather cheesy way...) 2003-11-05 18:00 golda * index/io.c: Move max_entry check before assigning partition, thanks to Jerrold Leichter for the fix! 2003-11-02 19:40 golda * index/glimpse.h: Updating list of suffixes, version # and date 2003-02-06 05:04 golda * main.c: Make glimpse -V return 0 (no error), not 1 - thanks to Bob Proulx 2003-02-06 04:46 golda * libtemplate/Makefile.hp: Add "-f Makefile.hp" pass down to the next invokation of make as per Bob Proulx 2003-01-25 13:15 golda * agrep/agrep.c: Enabled Bestmatch to work with Linenum as per Kevin McGrail (KAM) 2003-01-25 13:09 golda * index/glimpse.h: [no log message] 2003-01-25 13:09 golda * agrep/compat.c: Enable use of -B option along with -n (linenum) as per Kevin McGrail - probably never should have been disabled in first place 2002-11-29 17:47 golda * index/glimpse.h: [no log message] 2002-11-29 17:47 golda * agrep/: bitap.c, compat.c, io.c: Fix typo in bitap.c,io.c and compat.c that prevented compiling with --enable-pointers. 
Thanks to Clemens Fischer Correct error message about Bestmatch and Linenums, thanks to Kevin McGrail for pointing this out (and possible future fix to allow them to work together) 2002-10-10 22:28 golda * index/: build_in.c, io.c, utils.c: There was an error in build_in.c, a typo that prematurely terminated a for loop and resulted in missing hits for common search terms! Fixed now. This affected versions from 4.16.2 thru 4.17.0. If running one of those versions please upgrade to 4.17.1 2002-10-10 22:27 golda * dynfilters/htuml2txt.so: [no log message] 2002-09-27 14:41 golda * index/: dex.c, checksg.c, compat.c, maskgen.c, preprocess.c, build_in.c, glimpse.c, glimpse.h, partition.c: Should compile now under CYGWIN (v1.3.12 or later)! Thanks to Tom Hudson for the tip, with the latest cygwin it seems that all that is needed is to add some include lines. Please report problems or other changes needed! 2002-09-06 01:11 golda * glimpse.1, glimpseindex.1, glimpseserver.1: Updated man pages to reflect new URL's, thanks to Kang-Jin Lee for reporting this and providing the new Harvest doc locations. 2002-06-17 23:45 golda * index/: filetype.c, glimpse.h, io.c: Finaly fixed segfault bug with large indexes, by increasing size of guilty buffer. Actually this is better than checking for buffer overrun because if we just stop at overrun we miss hits. With new size no overruns should be able to occur, that is the correct solution. Has been tested on two sites that were experiencing the segfault and has fixed problem. 2002-06-17 23:44 golda * dynfilters/htuml2txt.so: [no log message] 2002-05-02 21:43 golda * index/convert.c: Use system 'mv' command instead of c lib; works better on some linux systems 2002-02-14 17:38 golda * index/glimpse.h: Corrected version number 2001-10-13 01:14 golda * index/: build_in.c, utils.c: Fix segfault on certain large indexes. Check that we have enough space in merge_in before adding to it. 
In utils.c, when counter can be incremented > once in a loop, make sure we stop in time. 2001-10-13 01:13 golda * ChangeLog: Don't need ChangeLog in CVS! 2001-08-20 22:06 golda * index/: le, ure, Makefile.in: Added 'INDEXLIBS' configure script variable to include only -ldl if needed in index/Makefile 2001-08-20 20:59 golda * configure: Trying to get executable set in cvs 2001-08-20 19:56 golda * index/Makefile.in: Remove hard ref to -ldl 2001-08-20 19:53 golda * Makefile.in, configure: Added check for -ldl rather than including automatically 2001-07-07 23:17 golda * README, README.install: Updated instructions for using configure script 2001-07-07 23:06 golda * index/glimpse.h: Updated version, url, etc 2001-07-07 23:04 golda * index/: le, le.in, Makefile.in: make puts binaries in ./bin subdirectory 2001-07-06 23:37 golda * index/glimpse.h: Add '.jhtml' to list of extensions of HTML files (to extract titles from) 2001-07-06 22:47 golda * dynfilters/Makefile.in, libtemplate/lib/Makefile.in: Adding "distclean:" option to makefiles 2001-07-05 10:38 golda * libtemplate/: lib/Makefile, template/Makefile, util/Makefile: Removing Makefiles from CVS, generated by configure 2001-07-05 10:37 golda * libtemplate/include/autoconf.h: autoconf.h should never have been checked into CVS - generated by configure 2001-07-05 10:37 golda * agrep/Makefile, compress/Makefile, index/Makefile, libtemplate/Makefile: Removing Makefiles from CVS because they are now generated by configure script 2001-07-05 10:30 golda * configure: Corrected help text 2001-07-05 03:32 golda * libtemplate/: Makefile, include/autoconf.h, include/autoconf.h.in, template/Makefile, util/Makefile: [no log message] 2001-07-05 03:32 golda * index/: ure, glimpse.h: Moved FILE_END_MARK option into configure script (--file-end-mark=) instead of glimpse.h 2001-07-04 13:22 golda * libtemplate/: le, efile, akefile.in, le, Makefile, include/autoconf.h, lib/Makefile, lib/Makefile.in, template/Makefile, util/Makefile: [no log 
message] 2001-07-04 13:22 golda * Makefile: Providing Makefile.in for configure script, for dynfilters may or may not work on all systems; we need to add a configure switch to turn them off if necessary. 2001-07-02 21:58 golda * agrep/bitap.c: Fix from Dan Slowik to fix line number reporting problem with agrep. 2001-05-20 21:49 golda * README.install: [no log message] 2001-05-20 21:48 golda * libtemplate/: , c, , , , , le.in, h, c, efile.in, t.c, le.in, t.c, e.h, , k.c, .h, Makefile.in, include/autoconf.h.in, template/Makefile.in, util/Makefile.in: Major fixes to configure script from Sang-yong Suh. Now it works! 2001-03-05 21:14 golda * README: Testing cvs problem 2001-01-03 17:27 golda * index/io.c: Added some ifdefs to not use dynamic lib (dlopen) commands when DYNHTMLFILTER is not defined. Avoids errors on some platforms (BSDi) 2000-12-06 16:11 golda * index/: recursive.c, ss/cast.c, ss/trecursive.c, ss/uncast.c, build_in.c, dir.c, filetype.c, glimpse.h, partition.c: Use strerror(errno) instead of arbitrary "permission denied or nonexistent" error message, as suggested by ariel. Fixes bug #45. 2000-11-16 22:47 golda * split.c: Added dirs "lib" and "bin" to repository, also for some reason split.c had never been checked in. 2000-08-15 21:23 golda * CHANGES, ChangeLog: Used cvs2cl.pl to generate up-to-date ChangeLog from CVS. All old manually entered change information is in CHANGES file. 2000-08-15 15:07 golda * main.c: tiny formatting change 2000-08-15 15:04 golda * main.c: Corrected help output for -t switch 2000-08-14 17:06 golda * index/: glimpse.h, io.c: Re-set name of filter file to .glimpse_filters, instead of the temporary name .glimpse_experimental_filters. This would have caused HTML tags in the output of glimpse 4.13.0a and 4.13.0, unless .glimpse_experimental_filters existed. Also corrected minor bug introduced in 4.12.6, in reading temp_rdelim (two lines were transposed). 
Probably didn't affect anything because the default delimiter is normally correct. 2000-07-16 22:34 golda * agrep/agrep.h: Changed leaf.attribute to type int to fix pointer to integer compiler warnings. Was used as int anyway, should not hurt anything. 2000-06-21 13:13 golda * COPYRIGHT: [no log message] 2000-05-30 13:01 golda * main.c: Corrected usage info for -t option on glimpse 2000-05-25 11:07 golda * index/: ters/Makefile.linux, ters/README, ters/htuml2txt.lex, ters/htuml2txt.so, ters/sotest.c, Makefile.linux, glimpse.h, io.c: Added Christian's changes to allow dynamic filters. I believe this has only been tested on Linux systems. --GV 2000-05-25 11:06 golda * Makefile, Makefile.linux: Added Christian's changes to allow dynamic filters 2000-04-20 13:27 golda * index/filetype.c: re-commented out debugging line 2000-03-08 11:51 golda * agrep/agrep.c: Patch from Dan Riley to fix glimpseserver crashing problems. Basically make sure multibuf is always set to NULL after it is freed, and freed before it is et to NULL. --GV 2000-01-20 06:04 golda * index/filetype.c: Added separator char before "xinfo" variable. This is needed to parse the URL out of the results and correctly make links in webglimpse's output. Fix as per Victor Gonzales T. --GV 2000-01-15 23:52 golda * index/filetype.c: Corrected problem with multiple-line titles and files with spaces in the names. FILE_END_MARK was being used in the wrong place, where multiple-line titles were being stuck together. --GV 2000-01-15 22:47 golda * agrep/checksg.c: Don't use mgrep() with delimiters - fix by Morey as per report by Michael O. --GV 2000-01-11 11:34 cpv298 * index/filetype.c: Fixed a nasty off-by-one error in extract_info() that clobbered memory past the end of arrays. glimpseindex -X should now stop segfaulting. 2000-01-11 11:32 cpv298 * index/memlook.c: The old version of memlook clobbered data past the end of an array. It also was prone to other buffer overruns. This is a complete rewrite. 
2000-01-11 11:23 test * README: Testing remote cvs --GV 1999-11-19 01:11 golda * libtemplate/util/: strdup.c, strerror.c: Removed sys_nerr and sys_errlist declarations from strerror.c and added sys/types.h to strdup.c as per Nelson Beebe. Nelson says this fixes build problems on Rhapsody 5.5 and GNU/Linux 2.2.5-22 at least. --GV 1999-11-03 15:41 golda * Makefile.linux: Checking in modified Makefile to compile cleanly on Linux. 1999-11-03 15:16 golda * index/glimpse.h: Checking in changes to allow spaces in filenames - FILE_END_MARK constant set here. 1999-11-03 15:00 golda * main.c: Checking in changes that allow spaces infilenames, using FILE_END_MARK rather than a fixed character (a space) to delimit filenames & extra info. 1999-11-03 14:40 golda * libtemplate/: include/autoconf.h.in, include/ccache.h, include/ccache_list.h, include/ccache_queue.h, include/config.h, include/gdbm.h, include/paths.h, include/paths.h.in, include/template.h, include/time_it.h, include/url.h, include/util.h, lib/Makefile.sunos, template/Attributes.html, template/Makefile.NeXT, template/Makefile.alpha, template/Makefile.hp, template/Makefile.in, template/Makefile.linux, template/Makefile.rs6000, template/Makefile.sgi, template/Makefile.solaris, template/Makefile.sunos, template/README, template/cksoif.c, template/iafa2soif.c, template/lsm2soif.c, template/mktemplate, template/netfind2soif.pl, template/pcindex2soif.c, template/print-attr.c, template/print-template.c, template/print-urlrefs.c, template/soif.pl, template/template.c, template/translate-urls.c: Checking into repository 1999-11-03 13:42 golda * libtemplate/util/: Makefile.NeXT, Makefile.alpha, Makefile.hp, Makefile.in, Makefile.linux, Makefile.rs6000, Makefile.sgi, Makefile.solaris, Makefile.sunos, README, buffer.c, harvest.c, host.c, log.c, strdup.c, strerror.c, string.c, system.c, xmalloc.c: Updating repository. 
1999-11-03 13:41 golda * libtemplate/: Makefile.NeXT, Makefile.alpha, Makefile.hp, Makefile.in, Makefile.linux, Makefile.rs6000, Makefile.sgi, Makefile.solaris, Makefile.sunos, README: [no log message] 1999-11-03 13:40 golda * compress/: Makefile, Makefile.NeXT, Makefile.alpha, Makefile.hp, Makefile.in, Makefile.linux, Makefile.rs6000, Makefile.sgi, Makefile.solaris, Makefile.sunos, README, cast.c, compress.chronicle, defs.h, hash.c, main_cast.c, main_tbuild.c, main_uncast.c, misc.c, quick.c, string.c, tbuild.c, test.c, tmemlook.c, trecursive.c, tsimpletest.c, uncast.c: Checking files into repository. 1999-11-03 13:39 golda * index/: Makefile.NeXT, Makefile.alpha, Makefile.hp, Makefile.linux, Makefile.rs6000, Makefile.sgi, Makefile.solaris, Makefile.sunos, README, index.chronicle, region.h: Checking files into repository. 1999-11-03 13:37 golda * agrep/: COPYRIGHT, Makefile.NeXT, Makefile.alpha, Makefile.hp, Makefile.linux, Makefile.rs6000, Makefile.sgi, Makefile.solaris, Makefile.sunos, README, agrep.1, agrep.algorithms, agrep.c, agrep.chronicle, agrep.h, asearch.c, asearch1.c, asplit.c, bitap.c, bitap.c.orig, checkfile.c, checkfile.h, checksg.c, compat.c, config.h, contribution.list, defs.h, delim.c, dummyfilters.c, dummysyscalls.c, follow.c, io.c, io.c.orig, main.c, maskgen.c, newmgrep.c, parse.c, preprocess.c, putils.c, re.h, recursive.c, utilities.c: Checking files into repository. --GV 1999-11-03 13:36 golda * ChangeLog, KNOWN_BUGS, Makefile.org, config.cache, config.log, config.status, genpatch: Cleaning up repository. --GV 1999-11-01 14:19 golda * CHANGES: Changes log - will be obsolete after 11/1/99 --G 1999-11-01 14:19 golda * index/: build_in.c, convert.c, filetype.c, io.c, partition.c: Bringing Glimpse 4.12.6 under CVS. Future changes will be checked in independently. 
General changes were: 4.12.5 --> 4.12.6 - Fixes to configure script, thanks to Michael Heironimus - Fixes to index/partition.c, index/io.c and index/build_in.c should resolve problem with missing hits on the first one or two keywords in the index. Thanks to Morey Hubin. - Fix to sgrep.c solves problem of double-hit count with record delimiters. M. Hubin. 4.11 --> 4.12.5 Fix for using filters with structured indexes. Added FILE_END_MARK constant so it is possible to configure for filenames with spaces Test-fix for core dump on large indexes (may not have solved problem). --G 1999-11-01 14:16 golda * index/: Makefile, Makefile.in: Adding makefiles to cvs. 1999-11-01 14:14 golda * libtemplate/: lib/Makefile, template/Makefile, util/Makefile: Adding to CVS repository 1999-11-01 14:13 golda * libtemplate/Makefile: Adding into new CVS repository 1999-11-01 13:41 golda * libtemplate/include/autoconf.h: Added to repository 1999-11-01 13:34 golda * CHANGES, Makefile.in, config.cache, config.log, config.status, configure, configure.in: Adding into cvs repository. 1999-11-01 13:33 golda * agrep/: Makefile, Makefile.in: Adding into CVS repository. 1999-11-01 13:32 golda * agrep/sgrep.c: Adding into CVS repository 1999-05-05 01:16 gvelez * Makefile: Adding agrep archive to my CVS repository 1999-05-05 01:12 gvelez * main.c: Changes to return value in keeping with man pages: 0 for some hits, 1 for no hits, 2 for error. 
--GB 1999-03-02 00:38 gvelez * index/: build_in.c, convert.c, dir.c, filetype.c, fixname.c, getword.c, glimpse.h, io.c, lib.c, memlook.c, partition.c, region.c, simpletest.c, utils.c: Added index directory to repository 1999-03-02 00:37 gvelez * index/glimpse.c: Added default for TEMP_DIR --G 1998-05-22 10:31 udi * main.c: changed all /tmp to use TEMP_DIR (for security reasons) 1998-04-27 09:11 pab * gentar: Initial revision 1998-04-07 09:17 pab * get_index.c, main.c: Patch to rev 4.1b 1998-04-07 08:54 pab * CHANGES, CONTRIBUTIONS, COPYRIGHT, Makefile, Makefile.NeXT, Makefile.alpha, Makefile.hp, Makefile.in, Makefile.linux, Makefile.rs6000, Makefile.sgi, Makefile.solaris, Makefile.sunos, README, README.install, communicate.c, configure, configure.in, defs.h, get_filename.c, get_index.c, glimpse.1, glimpse.chronicle, glimpseindex.1, glimpseserver.1, install-sh, main.c, mkinstalldirs, split.c: Initial revision glimpse-4.18.7/KNOWN_BUGS000066400000000000000000000017741300371307100147160ustar00rootroot00000000000000* Thu Aug 20 14:56:44 1998 agrep -v fails to write anything unless at least one line (possibly, a particular line) of the input matches the pattern that we're trying to mismatch. The non-glimpse agrep does not have this problem. E.g.: thalia[189]$ for p in O e n w o r ; do echo "Pattern '${p}':" ; echo "One@Two@Three" | tr '@' '\012' | agrep -v ${p} ; done Pattern 'O': Pattern 'e': Two Pattern 'n': Pattern 'w': One Pattern 'o': One Pattern 'r': One Two thalia[190]$ for p in O e n w o r ; do echo "Pattern '${p}':" ; echo "One@Two@Three" | tr '@' '\012' | agrep-2.04 -v ${p} ; done Pattern 'O': Two Three Pattern 'e': Two Pattern 'n': Two Three Pattern 'w': One Three Pattern 'o': One Three Pattern 'r': One Two The non-glimpse agrep uses bitap to do the search; the glimpse one uses bm. Some pre-condition is unsatisfied in the call to bm, because it overruns the input text buffer by a huge amount in attempting to find a pattern match. 
It isn't obvious to me where this bug is arising, or how to fix it. glimpse-4.18.7/LICENSE000066400000000000000000000015271300371307100142600ustar00rootroot00000000000000Anyone distributing the Glimpse code should include the following license: Copyright 1996, Arizona Board of Regents on behalf of The University of Arizona. Permission to use, copy, modify, and/or distribute this software for any purpose with or without fee is hereby granted, provided that the above copyright notice and this permission notice appear in all copies. THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE. glimpse-4.18.7/Makefile.NeXT000066400000000000000000000205241300371307100155260ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you have <dirent.h>, else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 0 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. 
CC = gcc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep # Include flags are not a part of CFLAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) 
$(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp 
$(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core 
a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/Makefile.alpha000066400000000000000000000206001300371307100157700ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". 
#STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = cc #gcc -traditional #cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep # Include flags is not a part of CLFAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O -Olimit 3000 #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) 
SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" 
HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" 
HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: 
$(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/Makefile.hp000066400000000000000000000205131300371307100153150ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 0 HAVE_SYS_DIR_H = 1 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. 
BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep # Include flags is not a part of CLFAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" 
SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) 
-l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) 
-DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/Makefile.in000066400000000000000000000133031300371307100153130ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. 
srcdir = @srcdir@ VPATH = @srcdir@ SHELL = /bin/sh CC = @CC@ LIBS = @LIBS@ CP = @CP@ STRIP = @STRIP@ INSTALL = @INSTALL@ INSTALL_PROGRAM = @INSTALL_PROGRAM@ INSTALL_DATA = @INSTALL_DATA@ INSTALL_MAN = ${INSTALL} -m 444 DEFS = DYNFILTER = @DYNFILTER@ prefix = @prefix@ exec_prefix = @exec_prefix@ binprefix = manprefix = bindir = $(exec_prefix)/bin libdir = $(exec_prefix)/lib mandir = $(prefix)/man/man1 manext = 1 MANUAL = glimpse.1 glimpseindex.1 glimpseserver.1 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib SUBDIRS = compress agrep libtemplate index $(DYNFILTER) LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep OPTIMIZEFLAGS = -O2 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include CFLAGS = $(INCLUDEFLAGS) $(DEFS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c all: build-sub @TARGET@ Sall: $(PROG) $(PROGSERVER) $(PROGINDEX) agrep: $(PROGAGREP) NOTSall: $(NOTSPROG) $(NOTSPROGSERVER) build-sub: # these empty dirs are not in git [ -d $(BINDIR) ] || mkdir $(BINDIR) [ -d $(LIBDIR) ] || 
mkdir $(LIBDIR) for d in $(SUBDIRS) ; do \ ( cd $$d; $(MAKE) ); \ done # Check target check: all $(SHELL) test/check.sh # INSTALL on Solaris should be carried one at a time. :-( install: all installdirs install-man for d in $(SUBDIRS) ; do \ ( cd $$d; $(MAKE) $@ ); \ done for d in $(BINDIR)/$(PROG) $(BINDIR)/$(PROGSERVER) ; do \ $(INSTALL) $$d $(bindir) ; \ done install-man: for d in $(MANUAL) ; do \ $(INSTALL_MAN) $$d $(mandir) ; \ done installdirs: mkinstalldirs $(srcdir)/mkinstalldirs $(bindir) $(mandir) clean: for d in $(SUBDIRS); do \ ( cd $$d; $(MAKE) $@ ); \ done rm -f main_server.o main_server.c main.o $(OBJS) core a.out config.log rm -f $(LIBDIR)/lib$(LIBCOMPRESS).a $(LIBDIR)/lib$(LIBAGREP).a rm -f $(BINDIR)/* distclean: clean for d in $(SUBDIRS); do \ ( cd $$d; $(MAKE) $@ ); \ done rm -f Makefile config.cache config.status $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LDFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(BINDIR)/$(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(LIBS) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LDFLAGS) -L$(LIBDIR) -o $(BINDIR)/$(PROG) main.o $(OBJS) -l$(LIBAGREP) $(LIBS) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" 
SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LDFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(BINDIR)/$(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(LIBS) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LDFLAGS) -L$(LIBDIR) -o $(BINDIR)/$(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(LIBS) main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) -c $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) -c $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) -c $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) -c $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) -c $(CFLAGS) -o $@ split.c glimpse-4.18.7/Makefile.linux000066400000000000000000000213001300371307100160400ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". 
#STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 HAVE_STRERROR = 1 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 1 # You might have to change this depending on your machine configuration. CC = gcc -mpentiumpro SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib DYNFILTERDIR = dynfilters LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = -ldl PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep DYNHTMLFILTER = dynfilters/htuml2txt.so # Include flags are not a part of CFLAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O2 #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) -DHAVE_STRERROR=$(HAVE_STRERROR)\ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c 
$(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) $(DYNHTMLFILTER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" 
HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_STRERROR="$(HAVE_STRERROR)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" 
SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(DYNHTMLFILTER): $(DYNFILTERDIR)/htuml2txt.lex cd $(DYNFILTERDIR); $(MAKE) -f Makefile.linux htuml2txt.so # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean cd $(DYNFILTERDIR); $(MAKE) -f Makefile.linux clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ 
$(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/Makefile.org000066400000000000000000000204031300371307100154730ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = gcc -traditional #cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. 
BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep # Include flags are not a part of CFLAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" 
SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) 
$(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h 
$(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/Makefile.rs6000000066400000000000000000000205431300371307100156430ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. 
ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep # Include flags are not a part of CFLAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ 
$(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a 
$(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" 
SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/Makefile.sgi000066400000000000000000000205451300371307100154750ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. 
# To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = # -lsun for Irix 5? 
PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep # Include flags are not a part of CFLAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" 
ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) 
-l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h 
$(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/Makefile.solaris000066400000000000000000000206141300371307100163640ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To complie for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. 
CC = gcc # -traditional #cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = -lsocket -lnsl PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep # Include flags is not a part of CLFAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: 
$(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) 
main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: 
-rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/Makefile.sunos000066400000000000000000000205361300371307100160620ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". 
#STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = gcc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep # Include flags is not a part of CLFAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) 
LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" 
STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" 
STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) 
$(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/README000066400000000000000000000045661300371307100141410ustar00rootroot00000000000000GLIMPSE 4.18: searching entire file systems (http://webglimpse.org/) (http://glimpse.cs.arizona.edu/) For installation instructions, see README.install Glimpse is a very powerful indexing and query system that allows you to search through all your files very quickly. It can be used by individuals for their personal file systems as well as by organizations for large data collections. Glimpse is also the basis of WebGlimpse, which provides search for web sites, and it is the default search engine in Harvest (see below). Glimpseindex, which you run by saying "glimpseindex DIR" builds an index of all text files in the tree rooted at DIR. (e.g., glimpseindex ~ indexes all your files.) With it, glimpse can search through all files much the same way as agrep (or any other grep), except that you don't have to specify file names and the search is fast. For example, glimpse -1 unbelievable will find all occurrences (in all your files!) of "unbelievable" allowing one spelling error; glimpse -F mail arizona will find all occurrences of "arizona" in all files with "mail" somewhere in their name; glimpse 'Arizona desert;windsurfing' will find all lines that contain both "Arizona desert" and "windsurfing". Glimpse supports three types of indexes: a tiny one (2-3% of the size of all files), a small one (7-9%), and a medium one (20-30%). The larger the index the faster the search. Glimpse supports most of agrep's options (agrep is our powerful version of grep, and it is part of glimpse) including approximate matching (e.g., finding misspelled words), Boolean queries, and even some limited forms of regular expressions. 
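As a rough illustration of what "allowing one spelling error" means in the glimpse -1 example above, the sketch below counts errors as single-character insertions, deletions, and substitutions. It is a plain dynamic-programming edit distance written for this README, not the algorithm agrep actually uses, and it compares whole words, whereas agrep matches patterns inside lines. The names edit_distance and matches_with_errors are hypothetical and do not exist in the glimpse sources.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of glimpse or agrep: classic
 * dynamic-programming edit distance between two strings, counting each
 * insertion, deletion, or substitution as one error.
 * Returns -1 if memory cannot be allocated. */
int edit_distance(const char *a, const char *b)
{
    size_t la = strlen(a), lb = strlen(b);
    size_t i, j;
    int result;
    /* row[j] holds the distance between the first i characters of a
     * and the first j characters of b; one row is reused for every i. */
    int *row = malloc((lb + 1) * sizeof *row);

    if (row == NULL)
        return -1;
    for (j = 0; j <= lb; j++)
        row[j] = (int)j;                /* empty prefix of a: j insertions */
    for (i = 1; i <= la; i++) {
        int diag = row[0];              /* value for (i-1, j-1) */
        row[0] = (int)i;                /* empty prefix of b: i deletions */
        for (j = 1; j <= lb; j++) {
            int up = row[j];            /* value for (i-1, j) */
            int best = diag + (a[i - 1] != b[j - 1]);   /* match or substitution */
            if (up + 1 < best)
                best = up + 1;          /* deletion from a */
            if (row[j - 1] + 1 < best)
                best = row[j - 1] + 1;  /* insertion into a */
            diag = up;
            row[j] = best;
        }
    }
    result = row[lb];
    free(row);
    return result;
}

/* Hypothetical helper: nonzero when word matches pattern with at most
 * k errors, which is the whole-word analogue of "glimpse -k pattern". */
int matches_with_errors(const char *pattern, const char *word, int k)
{
    int d = edit_distance(pattern, word);
    return d >= 0 && d <= k;
}
```

Under these assumptions, matches_with_errors("unbelievable", "unbelieveable", 1) is true, because the two spellings differ by a single inserted letter.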
The WWW home page for glimpse is in http://glimpse.cs.arizona.edu/ It includes links to the source, binaries for most UNIX systems, documentations, articles, and more. The WebGlimpse home page is in http://glimpse.cs.arizona.edu/webglimpse/ Harvest's WWW home page is http://harvest.cs.colorado.edu/ (Harvest is an integrated set of tools to gather, extract, organize, search, cache, and replicate relevant information across the Internet.) Mail glimpse-request@cs.arizona.edu to be added to the glimpse mailing list. Mail glimpse@cs.arizona.edu to report bugs, ask questions, discuss tricks for using glimpse, etc. (This is a moderated mailing list.) Udi Manber, Burra Gopal, and Sun Wu. Please report bugs online at http://webglimpse.net/contact.php glimpse-4.18.7/README.install000066400000000000000000000137461300371307100156060ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ This is version 4.1 of the glimpse package - a tool to search entire file systems. Please send any comments to glimpse@cs.arizona.edu. Check the file CHANGES for the changes since version 3.0 (there are many of them) and 4.0 (there are few). The files glimpse.1, glimpseindex.1, and glimpseserver.1 are the manual pages. Instructions for installing glimpse, glimpseindex, glimpseserver, and agrep: 1. Both the agrep and index directories have individual Makefiles which you can use independently. You can make everything by just typing make in the root glimpse directory. Sample Makefiles for various architectures have been provided; some have not been tested recently. We have a script "configure" in our distribution which has recently been very much improved. To generate makefiles for your system, run sh configure ( see ./configure --help for options ) Then run make make install to put binaries under /usr/local/bin. To install to a different directory, see the --prefix and --bindir options of configure. 
   This configure script has been tested successfully on

        Linux-2.4.2     gcc-2.96        RedHat-7.1
        Linux-2.2.16    egcs-2.91.66    RedHat-6.2
        Linux-2.2.12    gcc-2.8.1
        Solaris-2.5.1   gcc-2.7.2.1
        SunOS-4.1.3     gcc-2.8.0
        AIX-3.4         cc

2. To make individual binaries in a subdirectory "ddd", do the following:

        cd ./ddd ; make ; cd ..

3. To rebuild everything from scratch, do the following:

        make clean

   You can then proceed with the above steps.

4. The directory libtemplate contains code that was originally part of
   the Harvest source distribution. We believe the configuration there
   will work on all systems, but there are remnants of the Harvest
   configuration system still present. Don't be confused by them; they
   aren't used.

5. Binaries for several operating systems are available; see the
   download webpage (http://webglimpse.net/download.html) for details.

NOTES:
------

People on our mailing list have commented that the makefiles we provide
work on many other architectures too. We recommend that you do a
pairwise "diff" of these makefiles to find out whether they support the
options you need before trying to modify makefiles to suit your
environment. Often a few changes to compiler options, etc., are enough
to port glimpse to a new architecture / OS. Source code modifications
are usually not necessary.

Please mail us any changes to the Makefile (or the source) that are
necessary to port glimpse to your architecture, along with the
corresponding binaries, so that we can include them in our distribution.
We will appreciate any suggestions and will duly acknowledge all
contributions.

Some comments about portability (using the sample Makefiles):
-------------------------------------------------------------

6. You must define HAVE_DIRENT_H in agrep/Makefile, index/Makefile, and
   compress/Makefile to be 1 or 0 depending on whether your machine has
   /usr/include/dirent.h or /usr/include/sys/dir.h.
   We found that on most machines/OSs like SunOS4.1, Solaris, Ultrix,
   AIX, OSF/1, HPUX and SGI IRIX 5.3, HAVE_DIRENT_H should be 1.
   On NeXT, HAVE_SYS_DIR_H should be 1 (the rest should be 0).

7. On Solaris, "RANLIB" should be defined to be "true" in
   agrep/Makefile.solaris and compress/Makefile.solaris.

8. On Solaris (at least the version we have), the library archive
   program "ar" is in /usr/ccs/bin/ar instead of /usr/bin/ar. You must
   define "AR" in agrep/Makefile.solaris and compress/Makefile.solaris
   appropriately, or set your PATH to include the appropriate directory.

9. On Solaris you have to link the glimpse executables with the socket
   and nsl libraries by adding "-lsocket" and "-lnsl" to the make rules
   for "glimpse" and "glimpseserver".

10. On the DEC ALPHA and HP, the make variable "CC" was changed from
    "gcc" to "cc".

11. If you have the utime() routine and , define the make variable UTIME
    to 1 in glimpse/Makefile and compress/Makefile. Else define it to 0.

12. If you want to support the international character set
    (ISO_CHAR_SET), define the make variable ISO_CHAR_SET to 1 in
    glimpse/Makefile. Else define it to 0.

13. If you have the function strerror() on your system, #define
    HAVE_STRERROR to 1 in libtemplate/autoconf.h, else #undef it (or
    leave the definition in /**/). This is necessary on some BSD systems
    (some of our users have said so) and on NeXT (Gerald Wildgruber,
    gewil@ue801be.ppp). And on Irix 6.*, although the warning you get
    from the loader seems to be benign.

14. If you need to add any new macros or flags, you can edit the file
    glimpse/agrep/config.h and add whatever is needed to make porting
    easy on your machine / OS. This file is included throughout the
    glimpse source code.

15. On BSD, it may be necessary to define DONTUSESORT_T_OPTION in the
    Makefile for glimpseindex, since on some BSD systems the -T option
    of sort specifies an alternative record separator rather than a
    directory other than /tmp in which to store temporary sort files.
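The HAVE_DIRENT_H family of flags described in item 6 follows a common portability idiom: the Makefile passes each flag as 0 or 1, and the source picks one directory header accordingly. The sketch below shows that idiom in general form. It is not copied from the glimpse sources, so the fallback default, the dir_entry typedef, and the count_dir_entries helper are assumptions made only for this illustration.

```c
#include <stddef.h>
#include <sys/types.h>

/* Assumption for this sketch only: fall back to <dirent.h> when the
 * Makefile did not pass -DHAVE_DIRENT_H=0 or =1 on the command line. */
#ifndef HAVE_DIRENT_H
#define HAVE_DIRENT_H 1
#endif

/* Pick exactly one directory header. Flags passed as 0, or not passed
 * at all, evaluate to false in these tests. */
#if HAVE_DIRENT_H
#include <dirent.h>
typedef struct dirent dir_entry;
#elif HAVE_SYS_NDIR_H
#include <sys/ndir.h>
typedef struct direct dir_entry;
#elif HAVE_SYS_DIR_H
#include <sys/dir.h>
typedef struct direct dir_entry;
#elif HAVE_NDIR_H
#include <ndir.h>
typedef struct direct dir_entry;
#else
#error "no directory header selected; set one of the HAVE_*_H flags to 1"
#endif

/* Hypothetical helper: count the entries in a directory using whichever
 * header was selected above. Returns -1 if the directory cannot be
 * opened. */
int count_dir_entries(const char *path)
{
    DIR *dir = opendir(path);
    dir_entry *entry;
    int n = 0;

    if (dir == NULL)
        return -1;
    while ((entry = readdir(dir)) != NULL)
        n++;
    closedir(dir);
    return n;
}
```

Compiling this with, for example, -DHAVE_DIRENT_H=0 -DHAVE_SYS_DIR_H=1 would select <sys/dir.h> instead, which is the combination item 6 suggests for NeXT.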
Porting to other platforms: -------------------------- We provide, but do not support, pre-compiled glimpse binaries for a variety of systems; see the glimpse web page for details. If you port glimpse to a system and are willing to provide binaries for it, send mail to glimpse@cs.arizona.edu. Makefiles for previous versions of glimpse on the following platforms have been provided by these people. We do not have these systems available to us, so are unable to verify that they work with the current version, but they should be a good starting point. Platform ported to Person to contact AIX-3.2.5 stamer@merlin.physik.uni-oldenburg.DE (Heinrich Stamerjohanns) HPPA, HPMC68K Chris Dalton IBM-RS6000 "CHRISM" (0) NeXT gross@stimpy.ame.nd.edu Thanks for your interest in glimpse. Udi Manber, Burra Gopal, and Sun Wu. glimpse-4.18.7/README.md000066400000000000000000000007471300371307100145350ustar00rootroot00000000000000#### Glimpse This is the official repository for Glimpse. (further text to be added here) #### Agrep An essential part of Glimpse is Agrep, - the approximate GREP for fast fuzzy string searching. Files are searched for a string or regular expression, with approximate matching capabilities and user-definable records. Developed 1989-1991 by Udi Manber, Sun Wu et al. at the University of Arizona. The official Agrep repository can be found on https://github.com/Wikinaut/agrep . glimpse-4.18.7/agrep/000077500000000000000000000000001300371307100143445ustar00rootroot00000000000000glimpse-4.18.7/agrep/Makefile.NeXT000066400000000000000000000106111300371307100166200ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # You might have to change these depending on your machine configuration. # AR and RANLIB are the library-archive programs. On IRIX, RANLIB is not # required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!). 
AR = /bin/ar RANLIB = /bin/ranlib # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 0 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = gcc SHELL = /bin/sh # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ../bin/. and the agrep library in ../lib # You normally don't have to change them. BINDIR = ../bin LIBDIR = ../lib TCOMP = cast TCOMPDIR = ../compress AGREPDIR = ../agrep TEMPLATEDIR = ../libtemplate # You can change the target to use the "cast" (compression) library by changing: # all: $(NOTCPROG) # to: # all: $(PROG) # You must also define DOTCOMPRESSED below to be 1 instead of 0. DOTCOMPRESSED = 0 # Include flags is not a part of CLFAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) \ -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) MYDEFINEFLAGS = -DMEASURE_TIMES=0 -DAGREP_POINTER=1 -DDOTCOMPRESSED=$(DOTCOMPRESSED) CFLAGS = $(MYDEFINEFLAGS) $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS) OTHERLIBS = PROG = agrep NOTCPROG = notc$(PROG) all: $(NOTCPROG) cp $(PROG) $(BINDIR)/. 
LIB = $(LIBDIR)/lib$(PROG).a

HDRS = agrep.h checkfile.h re.h defs.h config.h

TCOMPLIBOBJ = \
	$(TCOMPDIR)/hash.o \
	$(TCOMPDIR)/string.o \
	$(TCOMPDIR)/misc.o \
	$(TCOMPDIR)/quick.o \
	$(TCOMPDIR)/cast.o \
	$(TCOMPDIR)/uncast.o \
	$(TCOMPDIR)/tsimpletest.o \
	$(TCOMPDIR)/tbuild.o\
	$(TCOMPDIR)/tmemlook.o

OBJS = \
	follow.o \
	asearch.o \
	asearch1.o \
	agrep.o \
	bitap.o \
	checkfile.o \
	compat.o \
	maskgen.o \
	parse.o \
	checksg.o \
	preprocess.o \
	delim.o \
	asplit.o \
	recursive.o \
	sgrep.o \
	newmgrep.o \
	utilities.o

$(PROG): $(OBJS) main.o $(LIBDIR)/lib$(TCOMP).a
	$(CC) -L$(LIBDIR) $(LINKFLAGS) -o $@ $(OBJS) main.o -l$(TCOMP) $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) $(TCOMPLIBOBJ)
	$(RANLIB) $(LIB)

$(LIBDIR)/lib$(TCOMP).a:
	cd $(TCOMPDIR) ; $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)"

$(NOTCPROG): $(OBJS) dummyfilters.o main.o
	$(CC) $(LINKFLAGS) -o $(PROG) $(OBJS) dummyfilters.o main.o $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) dummyfilters.o
	$(RANLIB) $(LIB)

clean:
	-rm -f $(LIB) $(OBJS) dummyfilters.o main.o core a.out $(PROG)

compat.o: agrep.h defs.h config.h
asearch.o: agrep.h defs.h config.h
asearch1.o: agrep.h defs.h config.h
bitap.o: agrep.h defs.h config.h
checkfile.o: agrep.h checkfile.h defs.h config.h
follow.o: re.h agrep.h defs.h config.h
main.o: agrep.h checkfile.h defs.h config.h dummysyscalls.c
agrep.o: agrep.h checkfile.h defs.h config.h
newmgrep.o: agrep.h defs.h config.h
maskgen.o: agrep.h defs.h config.h
next.o: agrep.h defs.h config.h
parse.o: re.h agrep.h defs.h config.h
preprocess.o: agrep.h defs.h config.h
checksg.o: agrep.h checkfile.h defs.h config.h
delim.o: agrep.h defs.h config.h
asplit.o: agrep.h defs.h config.h
sgrep.o: agrep.h defs.h config.h
abm.o: agrep.h defs.h config.h
utilities.o: re.h agrep.h defs.h config.h
dummyfilters.o: dummyfilters.c
glimpse-4.18.7/agrep/Makefile.alpha000066400000000000000000000107341300371307100170750ustar00rootroot00000000000000
# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved.

# You might have to change these depending on your machine configuration.
# AR and RANLIB are the library-archive programs. On Solaris, RANLIB is not
# required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!).
AR = ar #/usr/ccs/bin/ar #for Solaris
RANLIB = ranlib #true #for Solaris

# Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1).
HAVE_DIRENT_H = 1
HAVE_SYS_DIR_H = 0
HAVE_SYS_NDIR_H = 0
HAVE_NDIR_H = 0

# Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0.
UTIME = 1

# Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0.
ISO_CHAR_SET = 0

# You might have to change this depending on your machine configuration.
CC = cc #gcc -traditional #cc
SHELL = /bin/sh

# YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE

# The binaries will be made in ../bin/. and the agrep library in ../lib
# You normally don't have to change them.
BINDIR = ../bin
LIBDIR = ../lib
TCOMP = cast
TCOMPDIR = ../compress
AGREPDIR = ../agrep
TEMPLATEDIR = ../libtemplate

# You can change the target to use the "cast" (compression) library by changing:
# all: $(NOTCPROG)
# to:
# all: $(PROG)
# You must also define DOTCOMPRESSED below to be 1 instead of 0.
DOTCOMPRESSED = 0

# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O -Olimit 3000
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) \
	-DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
MYDEFINEFLAGS = -DMEASURE_TIMES=0 -DAGREP_POINTER=1 -DDOTCOMPRESSED=$(DOTCOMPRESSED)
CFLAGS = $(MYDEFINEFLAGS) $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS)
OTHERLIBS =

PROG = agrep
NOTCPROG = notc$(PROG)

all: $(NOTCPROG)
	cp $(PROG) $(BINDIR)/.

LIB = $(LIBDIR)/lib$(PROG).a

HDRS = agrep.h checkfile.h re.h defs.h config.h

TCOMPLIBOBJ = \
	$(TCOMPDIR)/hash.o \
	$(TCOMPDIR)/string.o \
	$(TCOMPDIR)/misc.o \
	$(TCOMPDIR)/quick.o \
	$(TCOMPDIR)/cast.o \
	$(TCOMPDIR)/uncast.o \
	$(TCOMPDIR)/tsimpletest.o \
	$(TCOMPDIR)/tbuild.o\
	$(TCOMPDIR)/tmemlook.o

OBJS = \
	follow.o \
	asearch.o \
	asearch1.o \
	agrep.o \
	bitap.o \
	checkfile.o \
	compat.o \
	maskgen.o \
	parse.o \
	checksg.o \
	preprocess.o \
	delim.o \
	asplit.o \
	recursive.o \
	sgrep.o \
	newmgrep.o \
	utilities.o

$(PROG): $(OBJS) main.o $(LIBDIR)/lib$(TCOMP).a
	$(CC) -L$(LIBDIR) $(LINKFLAGS) -o $@ $(OBJS) main.o -l$(TCOMP) $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) $(TCOMPLIBOBJ)
	$(RANLIB) $(LIB)

$(LIBDIR)/lib$(TCOMP).a:
	cd $(TCOMPDIR) ; $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)"

$(NOTCPROG): $(OBJS) dummyfilters.o main.o
	$(CC) $(LINKFLAGS) -o $(PROG) $(OBJS) dummyfilters.o main.o $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) dummyfilters.o
	$(RANLIB) $(LIB)

clean:
	-rm -f $(LIB) $(OBJS) dummyfilters.o main.o core a.out $(PROG)

compat.o: agrep.h defs.h config.h
asearch.o: agrep.h defs.h config.h
asearch1.o: agrep.h defs.h config.h
bitap.o: agrep.h defs.h config.h
checkfile.o: agrep.h checkfile.h defs.h config.h
follow.o: re.h agrep.h defs.h config.h
main.o: agrep.h checkfile.h defs.h config.h dummysyscalls.c
agrep.o: agrep.h checkfile.h defs.h config.h
newmgrep.o: agrep.h defs.h config.h
maskgen.o: agrep.h defs.h config.h
next.o: agrep.h defs.h config.h
parse.o: re.h agrep.h defs.h config.h
preprocess.o: agrep.h defs.h config.h
checksg.o: agrep.h checkfile.h defs.h config.h
delim.o: agrep.h defs.h config.h
asplit.o: agrep.h defs.h config.h
sgrep.o: agrep.h defs.h config.h
abm.o: agrep.h defs.h config.h
utilities.o: re.h agrep.h defs.h config.h
dummyfilters.o: dummyfilters.c
glimpse-4.18.7/agrep/Makefile.hp000066400000000000000000000106361300371307100164200ustar00rootroot00000000000000
# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved.

# You might have to change these depending on your machine configuration.
# AR and RANLIB are the library-archive programs. On Solaris, RANLIB is not
# required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!).
AR = ar #/usr/ccs/bin/ar #for Solaris
RANLIB = :

# Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1).
HAVE_DIRENT_H = 0
HAVE_SYS_DIR_H = 1
HAVE_SYS_NDIR_H = 0
HAVE_NDIR_H = 0

# Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0.
UTIME = 1

# Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0.
ISO_CHAR_SET = 0

# You might have to change this depending on your machine configuration.
CC = cc
SHELL = /bin/sh

# YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE

# The binaries will be made in ../bin/. and the agrep library in ../lib
# You normally don't have to change them.
BINDIR = ../bin
LIBDIR = ../lib
TCOMP = cast
TCOMPDIR = ../compress
AGREPDIR = ../agrep
TEMPLATEDIR = ../libtemplate

# You can change the target to use the "cast" (compression) library by changing:
# all: $(NOTCPROG)
# to:
# all: $(PROG)
# You must also define DOTCOMPRESSED below to be 1 instead of 0.
DOTCOMPRESSED = 0

# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) \
	-DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
MYDEFINEFLAGS = -DMEASURE_TIMES=0 -DAGREP_POINTER=1 -DDOTCOMPRESSED=$(DOTCOMPRESSED)
CFLAGS = $(MYDEFINEFLAGS) $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS)
OTHERLIBS =

PROG = agrep
NOTCPROG = notc$(PROG)

all: $(NOTCPROG)
	cp $(PROG) $(BINDIR)/.
LIB = $(LIBDIR)/lib$(PROG).a

HDRS = agrep.h checkfile.h re.h defs.h config.h

TCOMPLIBOBJ = \
	$(TCOMPDIR)/hash.o \
	$(TCOMPDIR)/string.o \
	$(TCOMPDIR)/misc.o \
	$(TCOMPDIR)/quick.o \
	$(TCOMPDIR)/cast.o \
	$(TCOMPDIR)/uncast.o \
	$(TCOMPDIR)/tsimpletest.o \
	$(TCOMPDIR)/tbuild.o\
	$(TCOMPDIR)/tmemlook.o

OBJS = \
	follow.o \
	asearch.o \
	asearch1.o \
	agrep.o \
	bitap.o \
	checkfile.o \
	compat.o \
	maskgen.o \
	parse.o \
	checksg.o \
	preprocess.o \
	delim.o \
	asplit.o \
	recursive.o \
	sgrep.o \
	newmgrep.o \
	utilities.o

$(PROG): $(OBJS) main.o $(LIBDIR)/lib$(TCOMP).a
	$(CC) -L$(LIBDIR) $(LINKFLAGS) -o $@ $(OBJS) main.o -l$(TCOMP) $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) $(TCOMPLIBOBJ)
	$(RANLIB) $(LIB)

$(LIBDIR)/lib$(TCOMP).a:
	cd $(TCOMPDIR) ; $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)"

$(NOTCPROG): $(OBJS) dummyfilters.o main.o
	$(CC) $(LINKFLAGS) -o $(PROG) $(OBJS) dummyfilters.o main.o $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) dummyfilters.o
	$(RANLIB) $(LIB)

clean:
	-rm -f $(LIB) $(OBJS) dummyfilters.o main.o core a.out $(PROG)

compat.o: agrep.h defs.h config.h
asearch.o: agrep.h defs.h config.h
asearch1.o: agrep.h defs.h config.h
bitap.o: agrep.h defs.h config.h
checkfile.o: agrep.h checkfile.h defs.h config.h
follow.o: re.h agrep.h defs.h config.h
main.o: agrep.h checkfile.h defs.h config.h dummysyscalls.c
agrep.o: agrep.h checkfile.h defs.h config.h
newmgrep.o: agrep.h defs.h config.h
maskgen.o: agrep.h defs.h config.h
next.o: agrep.h defs.h config.h
parse.o: re.h agrep.h defs.h config.h
preprocess.o: agrep.h defs.h config.h
checksg.o: agrep.h checkfile.h defs.h config.h
delim.o: agrep.h defs.h config.h
asplit.o: agrep.h defs.h config.h
sgrep.o: agrep.h defs.h config.h
abm.o: agrep.h defs.h config.h
utilities.o: re.h agrep.h defs.h config.h
dummyfilters.o: dummyfilters.c
glimpse-4.18.7/agrep/Makefile.in000066400000000000000000000064731300371307100164230ustar00rootroot00000000000000
# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved.

srcdir = @srcdir@
VPATH = @srcdir@

SHELL = /bin/sh

CC = @CC@
AR = @AR@
RANLIB = @RANLIB@
CP = @CP@
STRIP = @STRIP@
INSTALL = @INSTALL@
INSTALL_PROGRAM = @INSTALL_PROGRAM@
INSTALL_DATA = @INSTALL_DATA@
INSTALL_MAN = ${INSTALL} -m 444

DEFS =

prefix = @prefix@
exec_prefix = @exec_prefix@
binprefix =
manprefix =

bindir = $(exec_prefix)/bin
libdir = $(exec_prefix)/lib
mandir = $(prefix)/man/man1
manext = 1

MAN1 = agrep.1

# YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE

# The binaries will be made in ../bin/. and the agrep library in ../lib
# You normally don't have to change them.
BINDIR = ../bin
LIBDIR = ../lib
TCOMP = cast
TCOMPDIR = ../compress
AGREPDIR = ../agrep
TEMPLATEDIR = ../libtemplate

# You can change the target to use the "cast" (compression) library by changing:
# all: $(NOTCPROG)
# to:
# all: $(PROG)
# You must also define DOTCOMPRESSED below to be 1 instead of 0.
DOTCOMPRESSED = 0

OPTIMIZEFLAGS = -O2
INCLUDEFLAGS = -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
# AGREP_POINTER is defined in autoconf.h
MYDEFINEFLAGS = -DMEASURE_TIMES=0 -DDOTCOMPRESSED=$(DOTCOMPRESSED)
CFLAGS = $(MYDEFINEFLAGS) $(INCLUDEFLAGS) $(DEFS)
LDFLAGS =
OTHERLIBS =

PROG = agrep
NOTCPROG = notc$(PROG)
LIB = $(LIBDIR)/lib$(PROG).a

all: $(LIB) $(NOTCPROG)

install: all install-man
	$(INSTALL) $(PROG) $(bindir)

install-man: $(MAN1)
	$(INSTALL_MAN) $(MAN1) $(mandir)

clean:
	rm -f $(LIB) $(OBJS) dummyfilters.o main.o core a.out $(PROG)

distclean: clean
	rm -f Makefile

HDRS = agrep.h checkfile.h re.h defs.h config.h

TCOMPLIBOBJ = \
	$(TCOMPDIR)/hash.o \
	$(TCOMPDIR)/string.o \
	$(TCOMPDIR)/misc.o \
	$(TCOMPDIR)/quick.o \
	$(TCOMPDIR)/cast.o \
	$(TCOMPDIR)/uncast.o \
	$(TCOMPDIR)/tsimpletest.o \
	$(TCOMPDIR)/tbuild.o\
	$(TCOMPDIR)/tmemlook.o

OBJS = \
	follow.o \
	asearch.o \
	asearch1.o \
	agrep.o \
	bitap.o \
	checkfile.o \
	compat.o \
	maskgen.o \
	parse.o \
	checksg.o \
	preprocess.o \
	delim.o \
	asplit.o \
	recursive.o \
	sgrep.o \
	newmgrep.o \
	utilities.o

$(PROG): $(OBJS) main.o $(LIBDIR)/lib$(TCOMP).a
	$(CC) -L$(LIBDIR) $(LDFLAGS) -o $@ $(OBJS) main.o -l$(TCOMP) $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) $(TCOMPLIBOBJ)
	$(RANLIB) $(LIB)

$(LIBDIR)/lib$(TCOMP).a:
	cd $(TCOMPDIR) ; $(MAKE)

$(LIB): $(OBJS) dummyfilters.o
	$(AR) rcv $(LIB) $(OBJS) dummyfilters.o
	$(RANLIB) $(LIB)

$(NOTCPROG): $(OBJS) dummyfilters.o main.o
	$(CC) $(LDFLAGS) -o $(PROG) $(OBJS) dummyfilters.o main.o $(OTHERLIBS)

compat.o: agrep.h defs.h config.h
asearch.o: agrep.h defs.h config.h
asearch1.o: agrep.h defs.h config.h
bitap.o: agrep.h defs.h config.h
checkfile.o: agrep.h checkfile.h defs.h config.h
follow.o: re.h agrep.h defs.h config.h
main.o: agrep.h checkfile.h defs.h config.h dummysyscalls.c
agrep.o: agrep.h checkfile.h defs.h config.h
newmgrep.o: agrep.h defs.h config.h
maskgen.o: agrep.h defs.h config.h
next.o: agrep.h defs.h config.h
parse.o: re.h agrep.h defs.h config.h
preprocess.o: agrep.h defs.h config.h
checksg.o: agrep.h checkfile.h defs.h config.h
delim.o: agrep.h defs.h config.h
asplit.o: agrep.h defs.h config.h
sgrep.o: agrep.h defs.h config.h
abm.o: agrep.h defs.h config.h
utilities.o: re.h agrep.h defs.h config.h
dummyfilters.o: dummyfilters.c
glimpse-4.18.7/agrep/Makefile.linux000066400000000000000000000107011300371307100171410ustar00rootroot00000000000000
# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved.

# You might have to change these depending on your machine configuration.
# AR and RANLIB are the library-archive programs. On Solaris, RANLIB is not
# required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!).
AR = ar #/usr/ccs/bin/ar #for Solaris
RANLIB = ranlib #true #for Solaris

# Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1).
HAVE_DIRENT_H = 1
HAVE_SYS_DIR_H = 0
HAVE_SYS_NDIR_H = 0
HAVE_NDIR_H = 0

# Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0.
UTIME = 1

# Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0.
ISO_CHAR_SET = 0

# You might have to change this depending on your machine configuration.
CC = gcc -m486
SHELL = /bin/sh

# YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE

# The binaries will be made in ../bin/. and the agrep library in ../lib
# You normally don't have to change them.
BINDIR = ../bin
LIBDIR = ../lib
TCOMP = cast
TCOMPDIR = ../compress
AGREPDIR = ../agrep
TEMPLATEDIR = ../libtemplate

# You can change the target to use the "cast" (compression) library by changing:
# all: $(NOTCPROG)
# to:
# all: $(PROG)
# You must also define DOTCOMPRESSED below to be 1 instead of 0.
DOTCOMPRESSED = 0

# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O2
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) \
	-DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
MYDEFINEFLAGS = -DMEASURE_TIMES=0 -DAGREP_POINTER=1 -DDOTCOMPRESSED=$(DOTCOMPRESSED)
CFLAGS = $(MYDEFINEFLAGS) $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS)
OTHERLIBS =

PROG = agrep
NOTCPROG = notc$(PROG)

all: $(NOTCPROG)
	cp $(PROG) $(BINDIR)/.

LIB = $(LIBDIR)/lib$(PROG).a

HDRS = agrep.h checkfile.h re.h defs.h config.h

TCOMPLIBOBJ = \
	$(TCOMPDIR)/hash.o \
	$(TCOMPDIR)/string.o \
	$(TCOMPDIR)/misc.o \
	$(TCOMPDIR)/quick.o \
	$(TCOMPDIR)/cast.o \
	$(TCOMPDIR)/uncast.o \
	$(TCOMPDIR)/tsimpletest.o \
	$(TCOMPDIR)/tbuild.o\
	$(TCOMPDIR)/tmemlook.o

OBJS = \
	follow.o \
	asearch.o \
	asearch1.o \
	agrep.o \
	bitap.o \
	checkfile.o \
	compat.o \
	maskgen.o \
	parse.o \
	checksg.o \
	preprocess.o \
	delim.o \
	asplit.o \
	recursive.o \
	sgrep.o \
	newmgrep.o \
	utilities.o

$(PROG): $(OBJS) main.o $(LIBDIR)/lib$(TCOMP).a
	$(CC) -L$(LIBDIR) $(LINKFLAGS) -o $@ $(OBJS) main.o -l$(TCOMP) $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) $(TCOMPLIBOBJ)
	$(RANLIB) $(LIB)

$(LIBDIR)/lib$(TCOMP).a:
	cd $(TCOMPDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)"

$(NOTCPROG): $(OBJS) dummyfilters.o main.o
	$(CC) $(LINKFLAGS) -o $(PROG) $(OBJS) dummyfilters.o main.o $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) dummyfilters.o
	$(RANLIB) $(LIB)

clean:
	-rm -f $(LIB) $(OBJS) dummyfilters.o main.o core a.out $(PROG)

compat.o: agrep.h defs.h config.h
asearch.o: agrep.h defs.h config.h
asearch1.o: agrep.h defs.h config.h
bitap.o: agrep.h defs.h config.h
checkfile.o: agrep.h checkfile.h defs.h config.h
follow.o: re.h agrep.h defs.h config.h
main.o: agrep.h checkfile.h defs.h config.h dummysyscalls.c
agrep.o: agrep.h checkfile.h defs.h config.h
newmgrep.o: agrep.h defs.h config.h
maskgen.o: agrep.h defs.h config.h
next.o: agrep.h defs.h config.h
parse.o: re.h agrep.h defs.h config.h
preprocess.o: agrep.h defs.h config.h
checksg.o: agrep.h checkfile.h defs.h config.h
delim.o: agrep.h defs.h config.h
asplit.o: agrep.h defs.h config.h
sgrep.o: agrep.h defs.h config.h
abm.o: agrep.h defs.h config.h
utilities.o: re.h agrep.h defs.h config.h
dummyfilters.o: dummyfilters.c
glimpse-4.18.7/agrep/Makefile.rs6000000066400000000000000000000106271300371307100167430ustar00rootroot00000000000000
# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved.

# You might have to change these depending on your machine configuration.
# AR and RANLIB are the library-archive programs. On IRIX, RANLIB is not
# required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!).
AR = /usr/bin/ar
RANLIB = true #for IRIX

# Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1).
HAVE_DIRENT_H = 1
HAVE_SYS_DIR_H = 0
HAVE_SYS_NDIR_H = 0
HAVE_NDIR_H = 0

# Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0.
UTIME = 1

# Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0.
ISO_CHAR_SET = 0

# You might have to change this depending on your machine configuration.
CC = cc
SHELL = /bin/sh

# YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE

# The binaries will be made in ../bin/. and the agrep library in ../lib
# You normally don't have to change them.
BINDIR = ../bin
LIBDIR = ../lib
TCOMP = cast
TCOMPDIR = ../compress
AGREPDIR = ../agrep
TEMPLATEDIR = ../libtemplate

# You can change the target to use the "cast" (compression) library by changing:
# all: $(NOTCPROG)
# to:
# all: $(PROG)
# You must also define DOTCOMPRESSED below to be 1 instead of 0.
DOTCOMPRESSED = 0

# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) \
	-DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
MYDEFINEFLAGS = -DMEASURE_TIMES=0 -DAGREP_POINTER=1 -DDOTCOMPRESSED=$(DOTCOMPRESSED)
CFLAGS = $(MYDEFINEFLAGS) $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS)
OTHERLIBS =

PROG = agrep
NOTCPROG = notc$(PROG)

all: $(NOTCPROG)
	cp $(PROG) $(BINDIR)/.
LIB = $(LIBDIR)/lib$(PROG).a

HDRS = agrep.h checkfile.h re.h defs.h config.h

TCOMPLIBOBJ = \
	$(TCOMPDIR)/hash.o \
	$(TCOMPDIR)/string.o \
	$(TCOMPDIR)/misc.o \
	$(TCOMPDIR)/quick.o \
	$(TCOMPDIR)/cast.o \
	$(TCOMPDIR)/uncast.o \
	$(TCOMPDIR)/tsimpletest.o \
	$(TCOMPDIR)/tbuild.o\
	$(TCOMPDIR)/tmemlook.o

OBJS = \
	follow.o \
	asearch.o \
	asearch1.o \
	agrep.o \
	bitap.o \
	checkfile.o \
	compat.o \
	maskgen.o \
	parse.o \
	checksg.o \
	preprocess.o \
	delim.o \
	asplit.o \
	recursive.o \
	sgrep.o \
	newmgrep.o \
	utilities.o

$(PROG): $(OBJS) main.o $(LIBDIR)/lib$(TCOMP).a
	$(CC) -L$(LIBDIR) $(LINKFLAGS) -o $@ $(OBJS) main.o -l$(TCOMP) $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) $(TCOMPLIBOBJ)
	$(RANLIB) $(LIB)

$(LIBDIR)/lib$(TCOMP).a:
	cd $(TCOMPDIR) ; $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)"

$(NOTCPROG): $(OBJS) dummyfilters.o main.o
	$(CC) $(LINKFLAGS) -o $(PROG) $(OBJS) dummyfilters.o main.o $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) dummyfilters.o
	$(RANLIB) $(LIB)

clean:
	-rm -f $(LIB) $(OBJS) dummyfilters.o main.o core a.out $(PROG)

compat.o: agrep.h defs.h config.h
asearch.o: agrep.h defs.h config.h
asearch1.o: agrep.h defs.h config.h
bitap.o: agrep.h defs.h config.h
checkfile.o: agrep.h checkfile.h defs.h config.h
follow.o: re.h agrep.h defs.h config.h
main.o: agrep.h checkfile.h defs.h config.h dummysyscalls.c
agrep.o: agrep.h checkfile.h defs.h config.h
newmgrep.o: agrep.h defs.h config.h
maskgen.o: agrep.h defs.h config.h
next.o: agrep.h defs.h config.h
parse.o: re.h agrep.h defs.h config.h
preprocess.o: agrep.h defs.h config.h
checksg.o: agrep.h checkfile.h defs.h config.h
delim.o: agrep.h defs.h config.h
asplit.o: agrep.h defs.h config.h
sgrep.o: agrep.h defs.h config.h
abm.o: agrep.h defs.h config.h
utilities.o: re.h agrep.h defs.h config.h
dummyfilters.o: dummyfilters.c
glimpse-4.18.7/agrep/Makefile.sgi000066400000000000000000000106241300371307100165700ustar00rootroot00000000000000
# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved.

# You might have to change these depending on your machine configuration.
# AR and RANLIB are the library-archive programs. On IRIX, RANLIB is not
# required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!).
AR = /usr/bin/ar
RANLIB = true #for IRIX

# Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1).
HAVE_DIRENT_H = 1
HAVE_SYS_DIR_H = 0
HAVE_SYS_NDIR_H = 0
HAVE_NDIR_H = 0

# Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0.
UTIME = 1

# Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0.
ISO_CHAR_SET = 0

# You might have to change this depending on your machine configuration.
CC = cc
SHELL = /bin/sh

# YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE

# The binaries will be made in ../bin/. and the agrep library in ../lib
# You normally don't have to change them.
BINDIR = ../bin
LIBDIR = ../lib
TCOMP = cast
TCOMPDIR = ../compress
AGREPDIR = ../agrep
TEMPLATEDIR = ../libtemplate

# You can change the target to use the "cast" (compression) library by changing:
# all: $(NOTCPROG)
# to:
# all: $(PROG)
# You must also define DOTCOMPRESSED below to be 1 instead of 0.
DOTCOMPRESSED = 0

# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) \
	-DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
MYDEFINEFLAGS = -DMEASURE_TIMES=0 -DAGREP_POINTER=1 -DDOTCOMPRESSED=$(DOTCOMPRESSED)
CFLAGS = $(MYDEFINEFLAGS) $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS)
OTHERLIBS =

PROG = agrep
NOTCPROG = notc$(PROG)

all: $(NOTCPROG)
	cp $(PROG) $(BINDIR)/.

LIB = $(LIBDIR)/lib$(PROG).a

HDRS = agrep.h checkfile.h re.h defs.h config.h

TCOMPLIBOBJ = \
	$(TCOMPDIR)/hash.o \
	$(TCOMPDIR)/string.o \
	$(TCOMPDIR)/misc.o \
	$(TCOMPDIR)/quick.o \
	$(TCOMPDIR)/cast.o \
	$(TCOMPDIR)/uncast.o \
	$(TCOMPDIR)/tsimpletest.o \
	$(TCOMPDIR)/tbuild.o\
	$(TCOMPDIR)/tmemlook.o

OBJS = \
	follow.o \
	asearch.o \
	asearch1.o \
	agrep.o \
	bitap.o \
	checkfile.o \
	compat.o \
	maskgen.o \
	parse.o \
	checksg.o \
	preprocess.o \
	delim.o \
	asplit.o \
	recursive.o \
	sgrep.o \
	newmgrep.o \
	utilities.o

$(PROG): $(OBJS) main.o $(LIBDIR)/lib$(TCOMP).a
	$(CC) -L$(LIBDIR) $(LINKFLAGS) -o $@ $(OBJS) main.o -l$(TCOMP) $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) $(TCOMPLIBOBJ)
	$(RANLIB) $(LIB)

$(LIBDIR)/lib$(TCOMP).a:
	cd $(TCOMPDIR) ; $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)"

$(NOTCPROG): $(OBJS) dummyfilters.o main.o
	$(CC) $(LINKFLAGS) -o $(PROG) $(OBJS) dummyfilters.o main.o $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) dummyfilters.o
	$(RANLIB) $(LIB)

clean:
	-rm -f $(LIB) $(OBJS) dummyfilters.o main.o core a.out $(PROG)

compat.o: agrep.h defs.h config.h
asearch.o: agrep.h defs.h config.h
asearch1.o: agrep.h defs.h config.h
bitap.o: agrep.h defs.h config.h
checkfile.o: agrep.h checkfile.h defs.h config.h
follow.o: re.h agrep.h defs.h config.h
main.o: agrep.h checkfile.h defs.h config.h dummysyscalls.c
agrep.o: agrep.h checkfile.h defs.h config.h
newmgrep.o: agrep.h defs.h config.h
maskgen.o: agrep.h defs.h config.h
next.o: agrep.h defs.h config.h
parse.o: re.h agrep.h defs.h config.h
preprocess.o: agrep.h defs.h config.h
checksg.o: agrep.h checkfile.h defs.h config.h
delim.o: agrep.h defs.h config.h
asplit.o: agrep.h defs.h config.h
sgrep.o: agrep.h defs.h config.h
abm.o: agrep.h defs.h config.h
utilities.o: re.h agrep.h defs.h config.h
dummyfilters.o: dummyfilters.c
glimpse-4.18.7/agrep/Makefile.solaris000066400000000000000000000107011300371307100174560ustar00rootroot00000000000000
# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved.

# You might have to change these depending on your machine configuration.
# AR and RANLIB are the library-archive programs. On Solaris, RANLIB is not
# required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!).
AR = /usr/ccs/bin/ar #for Solaris
RANLIB = true #for Solaris

# Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1).
HAVE_DIRENT_H = 1
HAVE_SYS_DIR_H = 0
HAVE_SYS_NDIR_H = 0
HAVE_NDIR_H = 0

# Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0.
UTIME = 1

# Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0.
ISO_CHAR_SET = 0

# You might have to change this depending on your machine configuration.
CC = gcc -traditional #cc
SHELL = /bin/sh

# YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE

# The binaries will be made in ../bin/. and the agrep library in ../lib
# You normally don't have to change them.
BINDIR = ../bin
LIBDIR = ../lib
TCOMP = cast
TCOMPDIR = ../compress
AGREPDIR = ../agrep
TEMPLATEDIR = ../libtemplate

# You can change the target to use the "cast" (compression) library by changing:
# all: $(NOTCPROG)
# to:
# all: $(PROG)
# You must also define DOTCOMPRESSED below to be 1 instead of 0.
DOTCOMPRESSED = 0

# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) \
	-DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
MYDEFINEFLAGS = -DMEASURE_TIMES=0 -DAGREP_POINTER=1 -DDOTCOMPRESSED=$(DOTCOMPRESSED)
CFLAGS = $(MYDEFINEFLAGS) $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS)
OTHERLIBS =

PROG = agrep
NOTCPROG = notc$(PROG)

all: $(NOTCPROG)
	cp $(PROG) $(BINDIR)/.
LIB = $(LIBDIR)/lib$(PROG).a

HDRS = agrep.h checkfile.h re.h defs.h config.h

TCOMPLIBOBJ = \
	$(TCOMPDIR)/hash.o \
	$(TCOMPDIR)/string.o \
	$(TCOMPDIR)/misc.o \
	$(TCOMPDIR)/quick.o \
	$(TCOMPDIR)/cast.o \
	$(TCOMPDIR)/uncast.o \
	$(TCOMPDIR)/tsimpletest.o \
	$(TCOMPDIR)/tbuild.o\
	$(TCOMPDIR)/tmemlook.o

OBJS = \
	follow.o \
	asearch.o \
	asearch1.o \
	agrep.o \
	bitap.o \
	checkfile.o \
	compat.o \
	maskgen.o \
	parse.o \
	checksg.o \
	preprocess.o \
	delim.o \
	asplit.o \
	recursive.o \
	sgrep.o \
	newmgrep.o \
	utilities.o

$(PROG): $(OBJS) main.o $(LIBDIR)/lib$(TCOMP).a
	$(CC) -L$(LIBDIR) $(LINKFLAGS) -o $@ $(OBJS) main.o -l$(TCOMP) $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) $(TCOMPLIBOBJ)
	$(RANLIB) $(LIB)

$(LIBDIR)/lib$(TCOMP).a:
	cd $(TCOMPDIR) ; $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)"

$(NOTCPROG): $(OBJS) dummyfilters.o main.o
	$(CC) $(LINKFLAGS) -o $(PROG) $(OBJS) dummyfilters.o main.o $(OTHERLIBS)
	$(AR) rcv $(LIB) $(OBJS) dummyfilters.o
	$(RANLIB) $(LIB)

clean:
	-rm -f $(LIB) $(OBJS) dummyfilters.o main.o core a.out $(PROG)

compat.o: agrep.h defs.h config.h
asearch.o: agrep.h defs.h config.h
asearch1.o: agrep.h defs.h config.h
bitap.o: agrep.h defs.h config.h
checkfile.o: agrep.h checkfile.h defs.h config.h
follow.o: re.h agrep.h defs.h config.h
main.o: agrep.h checkfile.h defs.h config.h dummysyscalls.c
agrep.o: agrep.h checkfile.h defs.h config.h
newmgrep.o: agrep.h defs.h config.h
maskgen.o: agrep.h defs.h config.h
next.o: agrep.h defs.h config.h
parse.o: re.h agrep.h defs.h config.h
preprocess.o: agrep.h defs.h config.h
checksg.o: agrep.h checkfile.h defs.h config.h
delim.o: agrep.h defs.h config.h
asplit.o: agrep.h defs.h config.h
sgrep.o: agrep.h defs.h config.h
abm.o: agrep.h defs.h config.h
utilities.o: re.h agrep.h defs.h config.h
dummyfilters.o: dummyfilters.c
glimpse-4.18.7/agrep/Makefile.sunos000066400000000000000000000106721300371307100171600ustar00rootroot00000000000000
# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved.

# You might have to change these depending on your machine configuration.
# AR and RANLIB are the library-archive programs. On Solaris, RANLIB is not
# required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!).
AR = ar #/usr/ccs/bin/ar #for Solaris
RANLIB = ranlib #true #for Solaris

# Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1).
HAVE_DIRENT_H = 1
HAVE_SYS_DIR_H = 0
HAVE_SYS_NDIR_H = 0
HAVE_NDIR_H = 0

# Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0.
UTIME = 1

# Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0.
ISO_CHAR_SET = 0

# You might have to change this depending on your machine configuration.
CC = gcc
SHELL = /bin/sh

# YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE

# The binaries will be made in ../bin/. and the agrep library in ../lib
# You normally don't have to change them.
BINDIR = ../bin
LIBDIR = ../lib
TCOMP = cast
TCOMPDIR = ../compress
AGREPDIR = ../agrep
TEMPLATEDIR = ../libtemplate

# You can change the target to use the "cast" (compression) library by changing:
# all: $(NOTCPROG)
# to:
# all: $(PROG)
# You must also define DOTCOMPRESSED below to be 1 instead of 0.
DOTCOMPRESSED = 0 # Include flags is not a part of CLFAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) \ -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) MYDEFINEFLAGS = -DMEASURE_TIMES=0 -DAGREP_POINTER=1 -DDOTCOMPRESSED=$(DOTCOMPRESSED) CFLAGS = $(MYDEFINEFLAGS) $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS) OTHERLIBS = PROG = agrep NOTCPROG = notc$(PROG) all: $(NOTCPROG) cp $(PROG) $(BINDIR)/. LIB = $(LIBDIR)/lib$(PROG).a HDRS = agrep.h checkfile.h re.h defs.h config.h TCOMPLIBOBJ = \ $(TCOMPDIR)/hash.o \ $(TCOMPDIR)/string.o \ $(TCOMPDIR)/misc.o \ $(TCOMPDIR)/quick.o \ $(TCOMPDIR)/cast.o \ $(TCOMPDIR)/uncast.o \ $(TCOMPDIR)/tsimpletest.o \ $(TCOMPDIR)/tbuild.o\ $(TCOMPDIR)/tmemlook.o OBJS = \ follow.o \ asearch.o \ asearch1.o \ agrep.o \ bitap.o \ checkfile.o \ compat.o \ maskgen.o \ parse.o \ checksg.o \ preprocess.o \ delim.o \ asplit.o \ recursive.o \ sgrep.o \ newmgrep.o \ utilities.o $(PROG): $(OBJS) main.o $(LIBDIR)/lib$(TCOMP).a $(CC) -L$(LIBDIR) $(LINKFLAGS) -o $@ $(OBJS) main.o -l$(TCOMP) $(OTHERLIBS) $(AR) rcv $(LIB) $(OBJS) $(TCOMPLIBOBJ) $(RANLIB) $(LIB) $(LIBDIR)/lib$(TCOMP).a: cd $(TCOMPDIR) ; $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(NOTCPROG): $(OBJS) dummyfilters.o main.o $(CC) 
$(LINKFLAGS) -o $(PROG) $(OBJS) dummyfilters.o main.o $(OTHERLIBS) $(AR) rcv $(LIB) $(OBJS) dummyfilters.o $(RANLIB) $(LIB) clean: -rm -f $(LIB) $(OBJS) dummyfilters.o main.o core a.out $(PROG) compat.o: agrep.h defs.h config.h asearch.o: agrep.h defs.h config.h asearch1.o: agrep.h defs.h config.h bitap.o: agrep.h defs.h config.h checkfile.o: agrep.h checkfile.h defs.h config.h follow.o: re.h agrep.h defs.h config.h main.o: agrep.h checkfile.h defs.h config.h dummysyscalls.c agrep.o: agrep.h checkfile.h defs.h config.h newmgrep.o: agrep.h defs.h config.h maskgen.o: agrep.h defs.h config.h next.o: agrep.h defs.h config.h parse.o: re.h agrep.h defs.h config.h preprocess.o: agrep.h defs.h config.h checksg.o: agrep.h checkfile.h defs.h config.h delim.o: agrep.h defs.h config.h asplit.o: agrep.h defs.h config.h sgrep.o: agrep.h defs.h config.h abm.o: agrep.h defs.h config.h utilities.o: re.h agrep.h defs.h config.h dummyfilters.o: dummyfilters.c glimpse-4.18.7/agrep/README000066400000000000000000000142661300371307100152350ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ This is version 2.04 of agrep - a new tool for fast text searching allowing errors. NOTE: an actively developed branch of agrep is now maintained at https://github.com/Wikinaut/agrep/ and is recommended for any standalone use of agrep agrep is similar to egrep (or grep or fgrep), but it is much more general (and usually faster). The main changes from version 1.1 are 1) incorporating Boyer-Moore type filtering to speed up search considerably, 2) allowing multi patterns via the -f option; this is similar to fgrep, but from our experience agrep is much faster, 3) searching for "best match" without having to specify the number of errors allowed, and 4) ascii is no longer required. Several more options were added. To compile, simply run make in the agrep directory after untar'ing the tar file (tar -xf agrep-2.04.tar will do it). 
The three most significant features of agrep that are not supported by
the grep family are

1) the ability to search for approximate patterns;
   for example, "agrep -2 homogenos foo" will find homogeneous as well
   as any other word that can be obtained from homogenos with at most
   2 substitutions, insertions, or deletions.
   "agrep -B homogenos foo" will generate a message of the form
      best match has 2 errors, there are 5 matches, output them? (y/n)
2) agrep is record oriented rather than just line oriented; a record
   is by default a line, but it can be user defined;
   for example, "agrep -d '^From ' 'pizza' mbox"
   outputs all mail messages that contain the keyword "pizza".
   Another example: "agrep -d '$$' pattern foo" will output all
   paragraphs (separated by an empty line) that contain pattern.
3) multiple patterns with AND (or OR) logic queries.
   For example, "agrep -d '^From ' 'burger,pizza' mbox"
   outputs all mail messages containing at least one of the
   two keywords (, stands for OR).
   "agrep -d '^From ' 'good;pizza' mbox" outputs all mail messages
   containing both keywords.

Putting these options together one can ask queries like

   agrep -d '$$' -2 '<CACM>;TheAuthor;Curriculum;<198[5-9]>' bib

which outputs all paragraphs referencing articles in CACM between
1985 and 1989 by TheAuthor dealing with curriculum.
Two errors are allowed, but they cannot be in either CACM or the year
(the <> brackets forbid errors in the pattern between them).

Other features include searching for regular expressions (with or
without errors), unlimited wild cards, limiting the errors to only
insertions or only substitutions or any combination, allowing each
deletion, for example, to be counted as, say, 2 substitutions or
3 insertions, restricting parts of the query to be exact and parts to
be approximate, and many more.

agrep is available by anonymous ftp from cs.arizona.edu (IP 192.12.69.5)
as agrep/agrep-2.04.tar.Z (or in uncompressed form as agrep/agrep-2.04.tar).
The tar file contains the source code (in C), man pages (agrep.1),
and two additional files, agrep.algorithms and agrep.chronicle,
giving more information.
The agrep directory also includes two postscript files:
agrep.ps.1 is a technical report from June 1991
describing the design and implementation of agrep;
agrep.ps.2 is a copy of the paper as it appeared in the 1992
Winter USENIX conference.

Please mail bug reports (or any other comments) to
sw@cs.arizona.edu or to udi@cs.arizona.edu.

We would appreciate it if users notified us (at the address above)
of any extensions, improvements, or interesting uses of this software.

January 17, 1992

BUGS_fixed/option_update

1. remove multiple definitions of some global variables.
2. fix a bug in -G option.
3. fix a bug in -w option.
January 23, 1992
4. fix a bug in pipeline input.
5. make the definition of word-delimiter consistent.
March 16, 1992
6. add option '-y' which, if specified with -B option, will always
   output the best-matches without a prompt.
April 10, 1992
7. fix a bug regarding exit status.
April 15, 1992

-------------------------------------------------------------------------------
REVISIONS TO AGREP, FALL '93

8. Options can now be specified in a single group of characters after one '-'.
   - Sept 3rd 1993

9. Made agrep callable as a library routine from a separate function.
   The interface is: memagrep(argc, argv, searchbufferlen, searchbuffer),
   the pattern to be searched for and the options being specified EXACTLY
   as if they are being specified on the command line. The only difference
   is that instead of the file-names to look at, the user should specify
   a buffer and its length. Sample user programs are in ../user directory.
   In memagrep(), there are TWO peculiarities:
   1. Peculiarity #1 -- at the end of the buffer, the user must have
      N bytes of valid virtual memory, where N is the length of the
      pattern to be searched. This space is used by agrep to speed up
      the checking of the termination condition.
      Its contents are restored before memagrep() returns -- however,
      some space must be there... else you'll get SIGSEGV. I might trap
      segv and do a longjmp, but that'll be in a new version!
   2. The search buffer must begin with a newline so that it is easy
      for agrep to output matched lines. This also avoids some copying.
   Of course, if we copied the user's search buffer into another buffer
   which meets both the above conditions, memagrep() will no longer be
   fast -- and speed is the primary goal.
   - Sept 27th 1993

10. Added some filter-programs to make agrep search thru compressed files.
    Also added some features in the Makefile which allow the user to build
    an agrep with a dummyfilter so that agrep remains independent of
    tcompress. The definitions needed in agrep to interface with tcompress
    are in defs.h
    - Nov 10th 1993

11. Added a library interface for searching thru a specified set of files,
    fileagrep(), which is similar to memagrep(). This is used by glimpse.
    Had to modify some other things and fix some bugs (see CHANGES).
    - Dec 1993 (coding), Jan 1993 (debugging).

-------------------------------------------------------------------------------
CODING NOTE: sgrep.c and newmgrep.c use a similar while(fill_buf) loop with
start and end, while others use loops with an internal variable i.
glimpse-4.18.7/agrep/agrep.1000066400000000000000000000310711300371307100155260ustar00rootroot00000000000000
.TH AGREP l "Jan 17, 1992"
.SH NAME
agrep \- search a file for a string or regular expression, with approximate matching capabilities
.SH SYNOPSIS
.B agrep
[
.B \-#cdehiklnpstvwxBDGIS
]
.I pattern
[ -f
.I patternfile
]
[
.IR filename ".\|.\|. ]"
.SH DESCRIPTION
.B agrep
searches the input
.IR filenames
(standard input is the default, but see a warning under LIMITATIONS)
for records containing strings which either \fIexactly\fP or
\fIapproximately\fP match a pattern.
A record is by default a line, but it can be defined differently using
the -d option (see below).
Normally, each record found is copied to the standard output. Approximate matching allows finding records that contain the pattern with several errors including substitutions, insertions, and deletions. For example, Massechusets matches Massachusetts with two errors (one substitution and one insertion). Running .B agrep -2 Massechusets foo outputs all lines in foo containing any string with at most 2 errors from Massechusets. .LP .B agrep supports many kinds of queries including arbitrary wild cards, sets of patterns, and in general, regular expressions. See PATTERNS below. It supports most of the options supported by the .B grep family plus several more (but it is not 100% compatible with grep). For more information on the algorithms used by agrep see Wu and Manber, "Fast Text Searching With Errors," Technical report #91-11, Department of Computer Science, University of Arizona, June 1991 (available by anonymous ftp from cs.arizona.edu in agrep/agrep.ps.1), and Wu and Manber, "Agrep -- A Fast Approximate Pattern Searching Tool", To appear in USENIX Conference 1992 January (available by anonymous ftp from cs.arizona.edu in agrep/agrep.ps.2). .LP As with the rest of the \fBgrep\fP family, the characters .RB ` $ ', .RB `^ ', .RB ` \(** ', .RB ` [ ' , .RB ` ] ' , .RB ` \s+2^\s0 ', .RB ` | ', .RB ` ( ', .RB ` ) ', .RB ` ! ', and .RB ` \e ' can cause unexpected results when included in the .IR pattern , as these characters are also meaningful to the shell. To avoid these problems, one should always enclose the entire pattern argument in single quotes, i.e., 'pattern'. Do not use double quotes ("). .LP When .B agrep is applied to more than one input file, the name of the file is displayed preceding each line which matches the pattern. The filename is not displayed when processing a single file, so if you actually want the filename to appear, use .B /dev/null as a second file in the list. 
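The error count in the Massechusets example above can be checked with a
small stand-alone program. The sketch below is only an illustration and
is not taken from the agrep sources: edit_distance is a hypothetical
helper that computes the unit-cost edit distance between two whole
strings by dynamic programming. agrep itself looks for approximate
occurrences inside each record and uses different, faster algorithms.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Illustrative helper, not part of agrep: unit-cost edit distance
 * (substitutions, insertions, deletions) between two whole strings.
 * Returns -1 if the working row cannot be allocated. */
int edit_distance(const char *a, const char *b)
{
    size_t la, lb, i, j;
    int *row, result;

    assert(a != NULL && b != NULL);
    la = strlen(a);
    lb = strlen(b);

    /* One row of the dynamic-programming table is enough. */
    row = malloc((lb + 1) * sizeof *row);
    if (row == NULL)
        return -1;

    /* Turning the empty prefix of a into b[0..j) takes j insertions. */
    for (j = 0; j <= lb; j++)
        row[j] = (int)j;

    for (i = 1; i <= la; i++) {
        int diag = row[0];              /* table[i-1][j-1] */
        row[0] = (int)i;                /* i deletions reach the empty string */
        for (j = 1; j <= lb; j++) {
            int up = row[j];            /* table[i-1][j] */
            int best = diag + (a[i - 1] != b[j - 1]);  /* match or substitution */
            if (up + 1 < best)
                best = up + 1;          /* delete a[i-1] */
            if (row[j - 1] + 1 < best)
                best = row[j - 1] + 1;  /* insert b[j-1] */
            row[j] = best;
            diag = up;
        }
    }

    result = row[lb];
    free(row);
    return result;
}
```

Calling edit_distance("Massechusets", "Massachusetts") counts one
substitution (e for a) and one insertion (the second t), which is why
the example needs the -2 option rather than -1.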
.SH OPTIONS .TP .B \-\fI#\fP \fI#\fP is a non-negative integer (at most 8) specifying the maximum number of errors permitted in finding the approximate matches (defaults to zero). Generally, each insertion, deletion, or substitution counts as one error. It is possible to adjust the relative cost of insertions, deletions and substitutions (see -I -D and -S options). .TP .B \-c Display only the count of matching records. .TP .B \-d "'\fIdelim\fP'" Define \fIdelim\fP to be the separator between two records. The default value is '$', namely a record is by default a line. \fIdelim\fP can be a string of size at most 8 (with possible use of ^ and $), but not a regular expression. Text between two \fIdelim\fP's, before the first \fIdelim\fP, and after the last \fIdelim\fP is considered as one record. For example, -d '$$' defines paragraphs as records and -d '^From\ ' defines mail messages as records. .B agrep matches each record separately. This option does not currently work with regular expressions. .TP .BI \-e " pattern" Same as a simple .I pattern argument, but useful when the .I pattern begins with a .RB ` \- '. .TP .BI \-f " patternfile" .I patternfile contains a set of (simple) patterns. The output is all lines that match at least one of the patterns in .I patternfile. Currently, the \-f option works only for exact match and for simple patterns (any meta symbol is interpreted as a regular character); it is compatible only with \-c, \-h, \-i, \-l, \-s, \-v, \-w, and \-x options. see LIMITATIONS for size bounds. .TP .B \-h Do not display filenames. .TP .B \-i Case-insensitive search \(em e.g., "A" and "a" are considered equivalent. .TP .B \-k No symbol in the pattern is treated as a meta character. For example, agrep -k 'a(b|c)*d' foo will find the occurrences of a(b|c)*d in foo whereas agrep 'a(b|c)*d' foo will find substrings in foo that match the regular expression 'a(b|c)*d'. .TP .B \-l List only the files that contain a match. 
This option is useful for looking for files containing a certain pattern. For example, " agrep -l 'wonderful' * " will list the names of those files in current directory that contain the word 'wonderful'. .TP .B \-n Each line that is printed is prefixed by its record number in the file. .TP .B \-p Find records in the text that contain a supersequence of the pattern. For example, \fB agrep \-p DCS foo will match "Department of Computer Science." .TP .B \-s Work silently, that is, display nothing except error messages. This is useful for checking the error status. .TP .B \-t Output the record starting from the end of .I delim to (and including) the next .I delim. This is useful for cases where .I delim should come at the end of the record. .TP .B \-v Inverse mode \(em display only those records that .I do not contain the pattern. .TP .B \-w Search for the pattern as a word \(em i.e., surrounded by non-alphanumeric characters. The non-alphanumeric .B must surround the match; they cannot be counted as errors. For example, .B agrep -w -1 car will match cars, but not characters. .TP .B \-x The pattern must match the whole line. .TP .B \-y Used with \-B option. When \-y is on, agrep will always output the best matches without giving a prompt. .TP .B \-B Best match mode. When \-B is specified and no exact matches are found, agrep will continue to search until the closest matches (i.e., the ones with minimum number of errors) are found, at which point the following message will be shown: "the best match contains x errors, there are y matches, output them? (y/n)" The best match mode is not supported for standard input, e.g., pipeline input. When the \-#, \-c, or \-l options are specified, the \-B option is ignored. In general, \-B may be slower than \-#, but not by very much. .TP .B \-D\fIk\fP Set the cost of a deletion to \fIk\fP (\fIk\fP is a positive integer). This option does not currently work with regular expressions. .TP .B \-G Output the files that contain a match. 
.TP .B \-I\fIk\fP Set the cost of an insertion to \fIk\fP (\fIk\fP is a positive integer). This option does not currently work with regular expressions. .TP .B \-S\fIk\fP Set the cost of a substitution to \fIk\fP (\fIk\fP is a positive integer). This option does not currently work with regular expressions. .ne 4 .SH PATTERNS .LP \fIagrep\fP supports a large variety of patterns, including simple strings, strings with classes of characters, sets of strings, wild cards, and regular expressions. .TP \fBStrings\fP any sequence of characters, including the special symbols `^' for beginning of line and `$' for end of line. The special characters listed above ( .RB ` $ ', .RB `^ ', .RB ` \(** ', .RB ` [ ' , .RB ` \s+2^\s0 ', .RB ` | ', .RB ` ( ', .RB ` ) ', .RB ` ! ', and .RB ` \e ' ) should be preceded by `\\' if they are to be matched as regular characters. For example, \\^abc\\\\ corresponds to the string ^abc\\, whereas ^abc corresponds to the string abc at the beginning of a line. .TP \fBClasses of characters\fP a list of characters inside [] (in order) corresponds to any character from the list. For example, [a-ho-z] is any character between a and h or between o and z. The symbol `^' inside [] complements the list. For example, [^i-n] denote any character in the character set except character 'i' to 'n'. The symbol `^' thus has two meanings, but this is consistent with egrep. The symbol `.' (don't care) stands for any symbol (except for the newline symbol). .TP \fBBoolean operations\fP .B agrep supports an `and' operation `;' and an `or' operation `,', but not a combination of both. For example, 'fast;network' searches for all records containing both words. .TP \fBWild cards\fP The symbol '#' is used to denote a wild card. # matches zero or any number of arbitrary characters. For example, ex#e matches example. The symbol # is equivalent to .* in egrep. 
In fact, .* will work too, because it is a valid regular expression
(see below), but unless this is part of an actual regular expression,
# will work faster.
.TP
\fBCombination of exact and approximate matching\fP
any pattern inside angle brackets <> must match the text exactly even
if the match is with errors. For example, <mathemat>ics matches
mathematical with one error (replacing the last s with an a), but
mathe<matics> does not match mathematical no matter how many errors we
allow.
.TP
\fBRegular expressions\fP
The syntax of regular expressions in \fBagrep\fP is in general the same as
that for \fBegrep\fP. The union operation `|', Kleene closure `*',
and parentheses () are all supported.
Currently '+' is not supported.
Regular expressions are currently limited to approximately 30
characters (generally excluding meta characters). Some options
(\-d, \-w, \-f, \-t, \-x, \-D, \-I, \-S) do not
currently work with regular expressions.
The maximal number of errors for regular expressions that use '*'
or '|' is 4.
.SH EXAMPLES
.LP
.TP
agrep -2 -c ABCDEFG foo
gives the number of lines in file foo that contain ABCDEFG
within two errors.
.TP
agrep -1 -D2 -S2 'ABCD#YZ' foo
outputs the lines containing ABCD followed, within arbitrary
distance, by YZ, with up to one additional insertion
(-D2 and -S2 make deletions and substitutions too "expensive").
.TP
agrep -5 -p abcdefghij /usr/dict/words
outputs the list of all words containing at least 5 of the first 10
letters of the alphabet \fIin order\fR. (Try it: any list starting
with academia and ending with sacrilegious must mean something!)
.TP
agrep -1 'abc[0-9](de|fg)*[x-z]' foo
outputs the lines containing, within up to one error, the string
that starts with abc followed by one digit, followed by zero or more
repetitions of either de or fg, followed by either x, y, or z.
.TP
agrep -d '^From\ ' 'breakdown;internet' mbox
outputs all mail messages (the pattern '^From\ ' separates mail messages
in a mail file) that contain keywords 'breakdown' and 'internet'.
.TP
agrep -d '$$' -1 '<word1> <word2>' foo
finds all paragraphs that contain word1 followed by word2 with one
error in place of the blank.
In particular, if word1 is the last word in a line and word2
is the first word in the next line, then the space will be substituted
by a newline symbol and it will match.
Thus, this is a way to overcome separation by a newline.
Note that -d '$$' (or another delim which spans more than one line)
is necessary, because otherwise agrep searches only one line at a time.
.TP
agrep '^agrep'
outputs all the examples of the use of agrep in this man page.
.PD
.SH "SEE ALSO"
.BR ed (1),
.BR ex (1),
.BR grep (1V),
.BR sh (1),
.BR csh (1).
.SH BUGS/LIMITATIONS
Any bug reports or comments will be appreciated!
Please mail them to sw@cs.arizona.edu or udi@cs.arizona.edu
.LP
Regular expressions do not support the '+' operator (match 1 or more
instances of the preceding token). These can be searched for by using
this syntax in the pattern:
.sp
.in 1.0i
\&'\fIpattern\fB(\fIpattern\fB)*\fR'
.in
.sp
(search for strings containing one instance of the pattern, followed by
0 or more instances of the pattern).
.LP
The following can cause an infinite loop:
.B agrep pattern * > output_file.
If the number of matches is high, they may be deposited in
output_file before it is completely read leading to more matches of the
pattern within output_file (the matches are against the whole
directory). It's not clear whether this is a "bug" (grep will do the
same), but be warned.
.LP
The maximum size of the
.I patternfile
is limited to be 250Kb, and the maximum number of patterns
is limited to be 30,000.
.LP
Standard input is the default if no input file is given.
However, if standard input is keyed in directly (as opposed to through
a pipe, for example) agrep may not work for some non-simple patterns.
.LP
There is no size limit for simple patterns.
More complicated patterns are currently limited to approximately 30
characters.
Lines are limited to 1024 characters.
Records are limited to 48K, and may be truncated if they are larger
than that.
The limit of record length can be
changed by modifying the parameter Max_record in agrep.h.
.SH DIAGNOSTICS
Exit status is 0 if any matches are found,
1 if none, 2 for syntax errors or inaccessible files.
.SH AUTHORS
Sun Wu and Udi Manber, Department of Computer Science, University of
Arizona, Tucson, AZ 85721. {sw|udi}@cs.arizona.edu.
glimpse-4.18.7/agrep/agrep.algorithms000066400000000000000000000051101300371307100175320ustar00rootroot00000000000000
/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */

The implementation of agrep includes the following algorithms.
Except for exact matching of simple patterns, for which we use
a simple variation of the Boyer-Moore algorithm, all the algorithms
(listed below) were designed by Sun Wu and Udi Manber.

1. bitap: The most general algorithm inside agrep.
   It supports many extensions such as approximate regular expression
   pattern matching, non-uniform costs, simultaneous matching of
   multiple patterns, mixed exact/approximate matching, etc.
   The algorithm is described in agrep.ps.1.

2. mgrep: A sub-linear expected-time algorithm for matching a set of
   patterns. It assumes that the set of patterns contains k patterns,
   and that the shortest pattern is of size m.
   See agrep.ps.2 for a brief description of the algorithm.

3. amonkey: a Boyer-Moore style algorithm for approximate pattern matching.
   Let b = log_c (2*m), where c is the size of the alphabet.
   In the preprocessing, a table is built to determine
   whether a given substring of size b is in the pattern.
   Suppose we are looking for matches with at most k errors.
The search is done in two passes. In the first pass (the filtering pass), the areas in the text that have a possibility to contain the matches are marked. The second pass finds the matches in those marked areas. The search in the first pass is done in the following way. Suppose the end position of the pattern is currently aligned with position tx in the text. The algorithm scans backward from tx until either (k+1) blocks that do not occur in the pattern have been scanned, or the scan has passed position (tx-m+k). In the former case, pattern is shifted forward to align the beginning position of the pattern with one character after the position in the text where the scan was stopped. In the latter case, we marked tx-m to tx+m as a candidate area. 4. mmonkey: Combining the mgrep algorithm with a partition technique, we have an algorithm with the same time complexity as amonkey. For ASCII text and pattern, this algorithm is faster than amonkey. The principle of the partition technique is as follows. Let A and B be two strings of size m. If we partition A into (k+1) blocks, then the distance between A and B is > k if none of the blocks of A occur in B. This implies that to match A with no more than k errors, B has to contain a substring that matches exactly one block of A. A brief description can be found in agrep.ps.2. glimpse-4.18.7/agrep/agrep.c000066400000000000000000003604121300371307100156140ustar00rootroot00000000000000 /* * bgopal: (1993-4) added a library interface and removed some bugs: also * selectively modified many routines to work with our text-compression algo. 
*/ #include #include "agrep.h" #include "checkfile.h" #include #define PRINT(s) extern char **environ; extern int errno; int pattern_index; /* index in argv where the pattern is */ int glimpse_isserver=0; /* so that there is no user interaction */ int glimpse_call = 0; /* So that usage message is not printed twice */ int glimpse_clientdied=0;/* to quit search if glimpseserver's client dies */ int agrep_initialfd; /* Where does input come from? File/Memory? */ CHAR *agrep_inbuffer; int agrep_inlen; int agrep_inpointer; FILE *agrep_finalfp; /* Where does output go to? File/Memory? */ CHAR *agrep_outbuffer; int agrep_outlen; int agrep_outpointer; int execfd; /* used by exec called within agrep_search, set in agrep_init */ int multifd = -1; /* fd for multipattern search used in ^^ , set in ^^^^^^^^ */ extern char *pat_spool; #if DOTCOMPRESSED extern char *tc_pat_spool; #endif /* DOTCOMPRESSED */ char *multibuf=NULL; /* buffer to put the multiple patterns in */ int multilen = 0; /* length of the multibuf: not the #of multi-patterns! */ extern int pos_cnt; /* to re-initialize it to 0 for reg-exp search */ unsigned Mask[MAXSYM]; unsigned Init1, NO_ERR_MASK, Init[MaxError]; unsigned Bit[WORD+1]; CHAR buffer[BlockSize+Maxline+1]; /* should not be used anywhere: 10/18/93 */ unsigned Next[MaxNext], Next1[MaxNext]; unsigned wildmask, endposition, D_endpos; int LIMITOUTPUT; /* maximum number of matches we are going to allow */ int LIMITPERFILE; /* maximum number of matches per file we are going to allow */ int LIMITTOTALFILE; /* maximum number of files we are going to allow */ int EXITONERROR; /* return -1 or exit on error? 
*/ int REGEX, FASTREGEX, RE_ERR, FNAME, WHOLELINE, SIMPLEPATTERN; int COUNT, HEAD, TAIL, LINENUM, INVERSE, I, S, DD, AND, SGREP, JUMP; int NOOUTPUTZERO; int Num_Pat, PSIZE, prev_num_of_matched, num_of_matched, files_matched, SILENT, NOPROMPT, BESTMATCH, NOUPPER; int NOMATCH, TRUNCATE, FIRST_IN_RE, FIRSTOUTPUT; int WORDBOUND, DELIMITER, D_length, tc_D_length, original_D_length; int EATFIRST, OUTTAIL; int BYTECOUNT; int PRINTOFFSET; int PRINTRECORD; int PRINTNONEXISTENTFILE; int FILEOUT; int DNA; int APPROX; int PAT_FILE; /* multiple patterns from a given file */ char PAT_FILE_NAME[MAX_LINE_LEN]; int PAT_BUFFER; /* multiple patterns from a given buffer */ int CONSTANT; int RECURSIVE; int total_line; /* used in mgrep */ int D; int M; int TCOMPRESSED; int EASYSEARCH; /* 1 used only for compressed files: LITTLE/BIG */ int ALWAYSFILENAME = OFF; int POST_FILTER = OFF; int NEW_FILE = OFF; /* only when post-filter is used */ int PRINTFILENUMBER = OFF; int PRINTFILETIME = OFF; int PRINTPATTERN = OFF; int MULTI_OUTPUT = OFF; /* should mgrep print the matched line multiple times for each matched pattern or just once? 
*/ /* invisible to the user, used only by glimpse: cannot use -l since it is incompatible with stdin and -A is used for the index search (done next) */ /* Stuff to handle complicated boolean patterns */ int AComplexBoolean = 0; ParseTree *AParse = NULL; int anum_terminals = 0; ParseTree aterminals[MAXNUM_PAT]; char amatched_terminals[MAXNUM_PAT]; char aduplicates[MAXNUM_PAT][MAXNUM_PAT]; /* tells what other patterns are exactly equal to the i-th one */ char tc_aduplicates[MAXNUM_PAT][MAXNUM_PAT]; /* tells what other patterns are exactly equal to the i-th one */ #if MEASURE_TIMES /* timing variables */ int OUTFILTER_ms; int FILTERALGO_ms; int INFILTER_ms; #endif /*MEASURE_TIMES*/ CHAR **Textfiles = NULL; /* array of filenames to be searched */ int Numfiles = 0; /* indicates how many files in Textfiles */ int copied_from_argv = 0; /* were filenames copied from argv (should I free 'em)? */ CHAR old_D_pat[MaxDelimit * 2] = "\n"; /* to hold original D_pattern */ CHAR original_old_D_pat[MaxDelimit * 2] = "\n"; CHAR Pattern[MAXPAT], OldPattern[MAXPAT]; CHAR CurrentFileName[MAX_LINE_LEN]; long CurrentFileTime; int SetCurrentFileName = 0; /* dirty glimpse trick to make filters work: output seems to come from another file */ int SetCurrentFileTime = 0; /* dirty glimpse trick to avoid doing a stat to find the time */ int CurrentByteOffset; int SetCurrentByteOffset = 0; CHAR Progname[MAXNAME]; CHAR D_pattern[MaxDelimit * 2] = "\n; "; /* string which delimits records -- defaults to newline */ CHAR tc_D_pattern[MaxDelimit * 2] = "\n"; CHAR original_D_pattern[MaxDelimit * 2] = "\n; "; char COMP_DIR[MAX_LINE_LEN]; char FREQ_FILE[MAX_LINE_LEN], HASH_FILE[MAX_LINE_LEN], STRING_FILE[MAX_LINE_LEN]; /* interfacing with tcompress */ int NOFILENAME, /* Boolean flag, set for -h option */ FILENAMEONLY;/* Boolean flag, set for -l option */ extern int init(); int table[WORD][WORD]; CHAR *agrep_saved_pattern = NULL; /* to prevent multiple prepfs for each boolean search: crd@hplb.hpl.hp.com */ 
long aget_file_time(stbuf, name) struct stat *stbuf; char *name; { long ret = 0; struct stat mystbuf; if (stbuf != NULL) ret = stbuf->st_mtime; else { if (my_stat(name, &mystbuf) == -1) ret = 0; else ret = mystbuf.st_mtime; } return ret; } char * aprint_file_time(thetime) time_t thetime; { #if 0 char s[256], s1[16], s2[16], s3[16], s4[16], s5[16]; static char buffer[256]; strcpy(s, ctime(&thetime)); /* of the form: Sun Sep 16 01:03:52 1973\n\0 */ s[strlen(s) - 1] = '\0'; sscanf(s, "%s%s%s%s%s", s1, s2, s3, s4, s5); sprintf(buffer, ": %s %s %s", s2, s3, s5); /* ditch Sun 01:03:52 */ #else static char buffer[256]; buffer[0] = ':'; buffer[1] = ' '; strftime(&buffer[2], 256, "%h %e %Y", gmtime(&thetime)); #endif return &buffer[0]; } /* Called when multipattern search and pattern has not changed */ void reinit_value_partial() { num_of_matched = prev_num_of_matched = 0; errno = 0; FIRST_IN_RE = ON; } /* This must be called before every agrep_search to reset agrep globals */ void reinit_value() { int i, j; /* Added on 7th Oct 194 */ if (AParse) { if (AComplexBoolean) destroy_tree(AParse); AComplexBoolean = 0; AParse = 0; PAT_BUFFER = 0; if (multibuf != NULL) free(multibuf); /* this was allocated for arbit booleans, not multipattern search */ multibuf = NULL; multilen = 0; /* Cannot free multifd here since that is always allocated for multipattern search */ } for (i=0; i 0 ; i--) Bit[i] = Bit[i+1] << 1; for (i=0; i< MAXSYM; i++) Mask[i] = 0; /* bg: new things added on Mar 13 94 */ Init1 = 0; NO_ERR_MASK = 0; memset(Init, '\0', MaxError * sizeof(unsigned)); memset(Next, '\0', MaxNext * sizeof(unsigned)); memset(Next1, '\0', MaxNext * sizeof(unsigned)); wildmask = endposition = D_endpos = 0; for (i=0; i 0 && j < 10) { V[i] = V[i] | Bit[base + table[i][j++]]; } } Bit[base]=temp; if(M <= SHORTREG) { k = exponen(M); pp = 2*k; for(i=k; i>1); for(j=M; j>=1; j--) { if(n & Bit[WORD]) Next[i] = Next[i] | V[j]; n = (n>>1); } } return; } if(M > MAXREG) fprintf(stderr, "%s: regular 
expression too long\n", Progname); MM = M; if(M & 1) M=M+1; k = exponen(M/2); pp = 2*k; mid = MM/2; for(i=k; i>1); for(j=MM; j>mid ; j--) { if(n & Bit[WORD]) Next[i] = Next[i] | V[j-mid]; n = (n>>1); } n=i-k; Next1[i-k] = 0; for(j = 0; j>1); } } return; } int exponen(m) int m; { int i, ex; ex= 1; for (i=0; i MAXREG) { fprintf(stderr, "%s: regular expression too long, max is %d\n", Progname,MAXREG); if (!EXITONERROR){ errno = AGREP_ERROR; return -1; } else exit(2); } base = WORD - M; hh = M/2; for(i=WORD, j=0; j < hh ; i--, j++) LL = LL | Bit[i]; if(FIRST_IN_RE) compute_next(M, Next, Next1); /*SUN: try: change to memory allocation */ FIRST_IN_RE = 0; Newline = '\n'; Init[0] = Bit[base]; if(HEAD) Init[0] = Init[0] | Bit[base+1]; for(i=1; i<= D; i++) Init[i] = Init[i-1] | Next[Init[i-1]>>hh] | Next1[Init[i-1]&LL]; Init1 = Init[0] | 1; Init0 = Init[0]; r2 = r3 = Init[0]; for(k=0; k<= D; k++) { A[k] = B[k] = Init[k]; } if ( D == 0 ) { #if AGREP_POINTER if (Text != -1) { #endif /*AGREP_POINTER*/ alloc_buf(Text, &buffer, BlockSize+Maxline+1); while ((num_read = fill_buf(Text, buffer + Maxline, BlockSize)) > 0) { i=Maxline; end = num_read + Maxline; #if 0 /* pab: Don't do this here; it's done in bitap.fill_buf, * where we can handle eof on a block boundary right */ if((num_read < BlockSize) && buffer[end-1] != '\n') buffer[end++] = '\n'; #endif /* 0 */ if(FIRST_LOOP) { /* if first time in the loop add a newline */ buffer[i-1] = '\n'; /* in front the text. 
*/ i--; CurrentByteOffset --; FIRST_LOOP = 0; } /* RE1_PROCESS_WHEN_DZERO: the while-loop below */ while ( i < end ) { c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; if(c != Newline) { if(CMask != 0) { r1 = Init1 & r3; r2 = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | r1; } else { r2 = r3 & Init1; } } else { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = Init1 & r3; /* match against endofline */ r2 = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | r1; if(TAIL) r2 = (Next[r2>>hh] | Next1[r2&LL]) | r2; /* epsilon move */ if(( r2 & 1 ) ^ INVERSE) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i-1, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } r3 = Init0; r2 = (Next[r3>>hh] | Next1[r3&LL]) & CMask | Init0; /* match begin of line */ if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; if(c != Newline) { if(CMask != 0) { r1 = Init1 
& r2; r3 = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | r1; } else r3 = r2 & Init1; } /* if(NOT Newline) */ else { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = Init1 & r2; /* match against endofline */ r3 = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | r1; if(TAIL) r3 = ( Next[r3>>hh] | Next1[r3&LL] ) | r3; /* epsilon move */ if(( r3 & 1 ) ^ INVERSE) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i-1, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } r2 = Init0; r3 = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | Init0; /* match begin of line */ if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } } /* while i < end ... */ strncpy(buffer, buffer+num_read, Maxline); } /* end while fill_buf()... 
*/ free_buf(Text, buffer); return 0; #if AGREP_POINTER } else { /* within the memory buffer: assume it starts with a newline at position 0, the actual pattern follows that, and it ends with a '\n' */ num_read = agrep_inlen; buffer = (CHAR *)agrep_inbuffer; end = num_read; /* buffer[end-1] = '\n';*/ /* at end of the text. */ /* buffer[0] = '\n';*/ /* in front of the text. */ i = 0; /* An exact copy of the above RE1_PROCESS_WHEN_DZERO: the while-loop below */ while ( i < end ) { c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; if(c != Newline) { if(CMask != 0) { r1 = Init1 & r3; r2 = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | r1; } else { r2 = r3 & Init1; } } else { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = Init1 & r3; /* match against endofline */ r2 = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | r1; if(TAIL) r2 = (Next[r2>>hh] | Next1[r2&LL]) | r2; /* epsilon move */ if(( r2 & 1 ) ^ INVERSE) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i-1, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= 
num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } r3 = Init0; r2 = (Next[r3>>hh] | Next1[r3&LL]) & CMask | Init0; /* match begin of line */ if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; if(c != Newline) { if(CMask != 0) { r1 = Init1 & r2; r3 = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | r1; } else r3 = r2 & Init1; } /* if(NOT Newline) */ else { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = Init1 & r2; /* match against endofline */ r3 = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | r1; if(TAIL) r3 = ( Next[r3>>hh] | Next1[r3&LL] ) | r3; /* epsilon move */ if(( r3 & 1 ) ^ INVERSE) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i-1, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } r2 = Init0; r3 = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | Init0; /* match begin of line */ if (DELIMITER) 
CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } } /* while i < end ... */ return 0; } #endif /*AGREP_POINTER*/ } /* end if (D == 0) */ #if AGREP_POINTER if (Text != -1) { #endif /*AGREP_POINTER*/ while ((num_read = fill_buf(Text, buffer + Maxline, BlockSize)) > 0) { i=Maxline; end = Maxline + num_read; #if 0 /* pab: Don't do this here; it's done in bitap.fill_buf, * where we can handle eof on a block boundary right */ if((num_read < BlockSize) && buffer[end-1] != '\n') buffer[end++] = '\n'; #endif /* 0 */ if(FIRST_TIME) { /* if first time in the loop add a newline */ buffer[i-1] = '\n'; /* in front the text. */ i--; CurrentByteOffset --; FIRST_TIME = 0; } /* RE1_PROCESS_WHEN_DNOTZERO: the while loop below */ while (i < end ) { c = buffer[i]; CMask = Mask[c]; if(c != Newline) { if(CMask != 0) { r2 = B[0]; r1 = Init1 & r2; A[0] = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | r1; r3 = B[1]; r1 = Init1 & r3; r0 = r2 | A[0]; /* A[0] | B[0] */ A[1] = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | (( r2 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 1) goto Nextcharfile; r2 = B[2]; r1 = Init1 & r2; r0 = r3 | A[1]; A[2] = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 2) goto Nextcharfile; r3 = B[3]; r1 = Init1 & r3; r0 = r2 | A[2]; A[3] = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | ((r2 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 3) goto Nextcharfile; r2 = B[4]; r1 = Init1 & r2; r0 = r3 | A[3]; A[4] = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 4) goto Nextcharfile; } /* if(CMask) */ else { r2 = B[0]; A[0] = r2 & Init1; r3 = B[1]; r1 = Init1 & r3; r0 = r2 | A[0]; A[1] = ((r2 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 1) goto Nextcharfile; r2 = B[2]; r1 = Init1 & r2; r0 = r3 | A[1]; A[2] = ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 2) goto Nextcharfile; r3 = B[3]; r1 = Init1 & r3; r0 = r2 | A[2]; A[3] 
= ((r2 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 3) goto Nextcharfile; r2 = B[4]; r1 = Init1 & r2; r0 = r3 | A[3]; A[4] = ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 4) goto Nextcharfile; } } else { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = Init1 & B[D]; /* match against endofline */ A[D] = ((Next[B[D]>>hh] | Next1[B[D]&LL]) & CMask) | r1; if(TAIL) A[D] = ( Next[A[D]>>hh] | Next1[A[D]&LL] ) | A[D]; /* epsilon move */ if(( A[D] & 1 ) ^ INVERSE) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } for(k=0; k<=D; k++) B[k] = Init[0]; r1 = Init1 & B[0]; A[0] = (( Next[B[0]>>hh] | Next1[B[0]&LL]) & CMask) | r1; for(k=1; k<=D; k++) { r3 = B[k]; r1 = Init1 & r3; r2 = A[k-1] | B[k-1]; A[k] = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | ((B[k-1] | Next[r2>>hh] | Next1[r2&LL]) & r_NO_ERR) | r1; } if (DELIMITER) CurrentByteOffset += 1*D_length; 
else CurrentByteOffset += 1*1; } Nextcharfile: i=i+1; CurrentByteOffset ++; c = buffer[i]; CMask = Mask[c]; if(c != Newline) { if(CMask != 0) { r2 = A[0]; r1 = Init1 & r2; B[0] = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | r1; r3 = A[1]; r1 = Init1 & r3; r0 = B[0] | r2; B[1] = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | ((r2 | Next[r0>>hh] | Next1[r0&LL]) & r_NO_ERR) | r1 ; if(D == 1) goto Nextchar1file; r2 = A[2]; r1 = Init1 & r2; r0 = B[1] | r3; B[2] = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 2) goto Nextchar1file; r3 = A[3]; r1 = Init1 & r3; r0 = B[2] | r2; B[3] = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | ((r2 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 3) goto Nextchar1file; r2 = A[4]; r1 = Init1 & r2; r0 = B[3] | r3; B[4] = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 4) goto Nextchar1file; } /* if(CMask) */ else { r2 = A[0]; B[0] = r2 & Init1; r3 = A[1]; r1 = Init1 & r3; r0 = B[0] | r2; B[1] = ((r2 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 1) goto Nextchar1file; r2 = A[2]; r1 = Init1 & r2; r0 = B[1] | r3; B[2] = ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 2) goto Nextchar1file; r3 = A[3]; r1 = Init1 & r3; r0 = B[2] | r2; B[3] = ((r2 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 3) goto Nextchar1file; r2 = A[4]; r1 = Init1 & r2; r0 = B[3] | r3; B[4] = ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 4) goto Nextchar1file; } } /* if(NOT Newline) */ else { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = Init1 & A[D]; /* match against endofline */ B[D] = ((Next[A[D]>>hh] | Next1[A[D]&LL]) & CMask) | r1; if(TAIL) B[D] = ( Next[B[D]>>hh] | Next1[B[D]&LL] ) | B[D]; /* epsilon move */ if(( B[D] & 1 ) ^ INVERSE) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); 
else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } for(k=0; k<=D; k++) A[k] = Init0; r1 = Init1 & A[0]; B[0] = ((Next[A[0]>>hh] | Next1[A[0]&LL]) & CMask) | r1; for(k=1; k<=D; k++) { r3 = A[k]; r1 = Init1 & r3; r2 = A[k-1] | B[k-1]; B[k] = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | ((A[k-1] | Next[r2>>hh] | Next1[r2&LL]) & r_NO_ERR) | r1; } if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } Nextchar1file: i=i+1; CurrentByteOffset ++; } /* while i < end */ strncpy(buffer, buffer+num_read, Maxline); } /* while fill_buf... */ free_buf(Text, buffer); return 0; #if AGREP_POINTER } else { /* within the memory buffer: assume it starts with a newline at position 0, the actual pattern follows that, and it ends with a '\n' */ num_read = agrep_inlen; buffer = (CHAR *)agrep_inbuffer; end = num_read; /* buffer[end-1] = '\n';*/ /* at end of the text. */ /* buffer[0] = '\n';*/ /* in front of the text. 
*/ i = 0; /* An exact copy of the above RE1_PROCESS_WHEN_DNOTZERO: the while loop below */ while (i < end ) { c = buffer[i]; CMask = Mask[c]; if(c != Newline) { if(CMask != 0) { r2 = B[0]; r1 = Init1 & r2; A[0] = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | r1; r3 = B[1]; r1 = Init1 & r3; r0 = r2 | A[0]; /* A[0] | B[0] */ A[1] = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | (( r2 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 1) goto Nextcharmem; r2 = B[2]; r1 = Init1 & r2; r0 = r3 | A[1]; A[2] = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 2) goto Nextcharmem; r3 = B[3]; r1 = Init1 & r3; r0 = r2 | A[2]; A[3] = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | ((r2 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 3) goto Nextcharmem; r2 = B[4]; r1 = Init1 & r2; r0 = r3 | A[3]; A[4] = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 4) goto Nextcharmem; } /* if(CMask) */ else { r2 = B[0]; A[0] = r2 & Init1; r3 = B[1]; r1 = Init1 & r3; r0 = r2 | A[0]; A[1] = ((r2 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 1) goto Nextcharmem; r2 = B[2]; r1 = Init1 & r2; r0 = r3 | A[1]; A[2] = ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 2) goto Nextcharmem; r3 = B[3]; r1 = Init1 & r3; r0 = r2 | A[2]; A[3] = ((r2 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 3) goto Nextcharmem; r2 = B[4]; r1 = Init1 & r2; r0 = r3 | A[3]; A[4] = ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 4) goto Nextcharmem; } } else { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = Init1 & B[D]; /* match against endofline */ A[D] = ((Next[B[D]>>hh] | Next1[B[D]&LL]) & CMask) | r1; if(TAIL) A[D] = ( Next[A[D]>>hh] | Next1[A[D]&LL] ) | A[D]; /* epsilon move */ if(( A[D] & 1 ) ^ INVERSE) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", 
CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } for(k=0; k<=D; k++) B[k] = Init[0]; r1 = Init1 & B[0]; A[0] = (( Next[B[0]>>hh] | Next1[B[0]&LL]) & CMask) | r1; for(k=1; k<=D; k++) { r3 = B[k]; r1 = Init1 & r3; r2 = A[k-1] | B[k-1]; A[k] = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | ((B[k-1] | Next[r2>>hh] | Next1[r2&LL]) & r_NO_ERR) | r1; } if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } Nextcharmem: i=i+1; CurrentByteOffset ++; c = buffer[i]; CMask = Mask[c]; if(c != Newline) { if(CMask != 0) { r2 = A[0]; r1 = Init1 & r2; B[0] = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | r1; r3 = A[1]; r1 = Init1 & r3; r0 = B[0] | r2; B[1] = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | ((r2 | Next[r0>>hh] | Next1[r0&LL]) & r_NO_ERR) | r1 ; if(D == 1) goto Nextchar1mem; r2 = A[2]; r1 = Init1 & r2; r0 = B[1] | r3; B[2] = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 2) goto Nextchar1mem; r3 = A[3]; r1 = Init1 & r3; r0 = B[2] | r2; B[3] = ((Next[r3>>hh] | 
Next1[r3&LL]) & CMask) | ((r2 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 3) goto Nextchar1mem; r2 = A[4]; r1 = Init1 & r2; r0 = B[3] | r3; B[4] = ((Next[r2>>hh] | Next1[r2&LL]) & CMask) | ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 4) goto Nextchar1mem; } /* if(CMask) */ else { r2 = A[0]; B[0] = r2 & Init1; r3 = A[1]; r1 = Init1 & r3; r0 = B[0] | r2; B[1] = ((r2 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 1) goto Nextchar1mem; r2 = A[2]; r1 = Init1 & r2; r0 = B[1] | r3; B[2] = ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 2) goto Nextchar1mem; r3 = A[3]; r1 = Init1 & r3; r0 = B[2] | r2; B[3] = ((r2 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 3) goto Nextchar1mem; r2 = A[4]; r1 = Init1 & r2; r0 = B[3] | r3; B[4] = ((r3 | Next[r0>>hh] | Next1[r0&LL])&r_NO_ERR) | r1 ; if(D == 4) goto Nextchar1mem; } } /* if(NOT Newline) */ else { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = Init1 & A[D]; /* match against endofline */ B[D] = ((Next[A[D]>>hh] | Next1[A[D]&LL]) & CMask) | r1; if(TAIL) B[D] = ( Next[B[D]>>hh] | Next1[B[D]&LL] ) | B[D]; /* epsilon move */ if(( B[D] & 1 ) ^ INVERSE) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else 
agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } for(k=0; k<=D; k++) A[k] = Init0; r1 = Init1 & A[0]; B[0] = ((Next[A[0]>>hh] | Next1[A[0]&LL]) & CMask) | r1; for(k=1; k<=D; k++) { r3 = A[k]; r1 = Init1 & r3; r2 = A[k-1] | B[k-1]; B[k] = ((Next[r3>>hh] | Next1[r3&LL]) & CMask) | ((A[k-1] | Next[r2>>hh] | Next1[r2&LL]) & r_NO_ERR) | r1; } if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } Nextchar1mem: i=i+1; CurrentByteOffset ++; } /* while i < end */ return 0; } #endif /*AGREP_POINTER*/ } /* re1 */ int re(Text, M, D) int Text, M, D; { register unsigned i, c, r1, r2, r3, CMask, k, Newline, Init0, Init1, end; register unsigned r_even, r_odd, r_NO_ERR ; unsigned RMask[MAXSYM]; unsigned A[MaxRerror+1], B[MaxRerror+1]; int num_read, j=0, lasti, base, ResidueSize; int FIRST_TIME; /* Flag */ CHAR *buffer; base = WORD - M; k = 2*exponen(M); if(FIRST_IN_RE) { compute_next(M, Next, Next1); FIRST_IN_RE = 0; } for(i=0; i< MAXSYM; i++) RMask[i] = Mask[i]; r_NO_ERR = NO_ERR_MASK; Newline = '\n'; Init0 = Init[0] = Bit[base]; if(HEAD) Init0 = Init[0] = Init0 | Bit[base+1] ; for(i=1; i<= D; i++) Init[i] = Init[i-1] | Next[Init[i-1]]; /* can be out? */ Init1 = Init0 | 1; r2 = r3 = Init0; for(k=0; k<= D; k++) { A[k] = B[k] = Init[0]; } /* can be out? 
*/ FIRST_TIME = ON; alloc_buf(Text, &buffer, BlockSize+Maxline+1); if ( D == 0 ) { #if AGREP_POINTER if(Text != -1) { #endif /*AGREP_POINTER*/ lasti = Maxline; while ((num_read = fill_buf(Text, buffer + Maxline, BlockSize)) > 0) { i=Maxline; end = Maxline + num_read ; #if 0 /* pab: Don't do this here; it's done in bitap.fill_buf, * where we can handle eof on a block boundary right */ if((num_read < BlockSize) && buffer[end-1] != '\n') buffer[end++] = '\n'; #endif /* 0 */ if(FIRST_TIME) { buffer[i-1] = '\n'; i--; CurrentByteOffset --; FIRST_TIME = 0; } /* RE_PROCESS_WHEN_DZERO: the while-loop below */ while (i < end) { c = buffer[i++]; CurrentByteOffset ++; CMask = RMask[c]; if(c != Newline) { r1 = Init1 & r3; r2 = (Next[r3] & CMask) | r1; } else { r1 = Init1 & r3; /* match against '\n' */ r2 = Next[r3] & CMask | r1; j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; if(TAIL) r2 = Next[r2] | r2 ; /* epsilon move */ if(( r2 & 1) ^ INVERSE) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i-1, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || 
((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } lasti = i - 1; r3 = Init0; r2 = (Next[r3] & CMask) | Init0; if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } c = buffer[i++]; CurrentByteOffset ++; CMask = RMask[c]; if(c != Newline) { r1 = Init1 & r2; r3 = (Next[r2] & CMask) | r1; } else { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = Init1 & r2; /* match against endofline */ r3 = Next[r2] & CMask | r1; if(TAIL) r3 = Next[r3] | r3; if(( r3 & 1) ^ INVERSE) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i-1, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } lasti = i - 1; r2 = Init0; r3 = (Next[r2] & CMask) | Init0; /* match the newline */ if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } } /* while */ ResidueSize = Maxline + num_read - lasti; if(ResidueSize > Maxline) { 
ResidueSize = Maxline; } strncpy(buffer+Maxline-ResidueSize, buffer+lasti, ResidueSize); lasti = Maxline - ResidueSize; } /* while fill_buf() */ free_buf(Text, buffer); return 0; #if AGREP_POINTER } else { num_read = agrep_inlen; buffer = (CHAR *)agrep_inbuffer; end = num_read; /* buffer[end-1] = '\n';*/ /* at end of the text. */ /* buffer[0] = '\n';*/ /* in front of the text. */ i = 0; lasti = 1; /* An exact copy of the above RE_PROCESS_WHEN_DZERO: the while-loop below */ while (i < end) { c = buffer[i++]; CurrentByteOffset ++; CMask = RMask[c]; if(c != Newline) { r1 = Init1 & r3; r2 = (Next[r3] & CMask) | r1; } else { r1 = Init1 & r3; /* match against '\n' */ r2 = Next[r3] & CMask | r1; j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; if(TAIL) r2 = Next[r2] | r2 ; /* epsilon move */ if(( r2 & 1) ^ INVERSE) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i-1, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } lasti = i - 
1; r3 = Init0; r2 = (Next[r3] & CMask) | Init0; if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } c = buffer[i++]; CurrentByteOffset ++; CMask = RMask[c]; if(c != Newline) { r1 = Init1 & r2; r3 = (Next[r2] & CMask) | r1; } else { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = Init1 & r2; /* match against endofline */ r3 = Next[r2] & CMask | r1; if(TAIL) r3 = Next[r3] | r3; if(( r3 & 1) ^ INVERSE) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i-1, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } lasti = i - 1; r2 = Init0; r3 = (Next[r2] & CMask) | Init0; /* match the newline */ if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } } /* while */ /* If a residue is left for within-memory-buffer, since nothing can be "read" after that, we can ignore it: as if only 1 iteration of while */ return 0; } #endif /*AGREP_POINTER*/ } /* end if(D==0) */ #if 
AGREP_POINTER if (Text != -1) { #endif /*AGREP_POINTER*/ while ((num_read = fill_buf(Text, buffer + Maxline, BlockSize)) > 0) { i=Maxline; end = Maxline+num_read; #if 0 /* pab: Don't do this here; it's done in bitap.fill_buf, * where we can handle eof on a block boundary right */ if((num_read < BlockSize) && buffer[end-1] != '\n') buffer[end++] = '\n'; #endif /* 0 */ if(FIRST_TIME) { buffer[i-1] = '\n'; i--; CurrentByteOffset --; FIRST_TIME = 0; } /* RE_PROCESS_WHEN_DNOTZERO: the while-loop below */ while (i < end) { c = buffer[i++]; CurrentByteOffset ++; CMask = RMask[c]; if (c != Newline) { r_even = B[0]; r1 = Init1 & r_even; A[0] = (Next[r_even] & CMask) | r1; r_odd = B[1]; r1 = Init1 & r_odd; r2 = (r_even | Next[r_even|A[0]]) &r_NO_ERR; A[1] = (Next[r_odd] & CMask) | r2 | r1 ; if(D == 1) goto Nextcharfile; r_even = B[2]; r1 = Init1 & r_even; r2 = (r_odd | Next[r_odd|A[1]]) &r_NO_ERR; A[2] = (Next[r_even] & CMask) | r2 | r1 ; if(D == 2) goto Nextcharfile; r_odd = B[3]; r1 = Init1 & r_odd; r2 = (r_even | Next[r_even|A[2]]) &r_NO_ERR; A[3] = (Next[r_odd] & CMask) | r2 | r1 ; if(D == 3) goto Nextcharfile; r_even = B[4]; r1 = Init1 & r_even; r2 = (r_odd | Next[r_odd|A[3]]) &r_NO_ERR; A[4] = (Next[r_even] & CMask) | r2 | r1 ; goto Nextcharfile; } /* if NOT Newline */ else { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = Init1 & B[D]; /* match endofline */ A[D] = (Next[B[D]] & CMask) | r1; if(TAIL) A[D] = Next[A[D]] | A[D]; if((A[D] & 1) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; 
(outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i-1, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } for(k=0; k<= D; k++) { A[k] = B[k] = Init[k]; } r1 = Init1 & B[0]; A[0] = (Next[B[0]] & CMask) | r1; for(k=1; k<= D; k++) { r1 = Init1 & B[k]; r2 = (B[k-1] | Next[A[k-1]|B[k-1]]) &r_NO_ERR; A[k] = (Next[B[k]] & CMask) | r1 | r2; } if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } Nextcharfile: c = buffer[i]; CMask = RMask[c]; if(c != Newline) { r1 = Init1 & A[0]; B[0] = (Next[A[0]] & CMask) | r1; r1 = Init1 & A[1]; B[1] = (Next[A[1]] & CMask) | ((A[0] | Next[A[0] | B[0]]) & r_NO_ERR) | r1 ; if(D == 1) goto Nextchar1file; r1 = Init1 & A[2]; B[2] = (Next[A[2]] & CMask) | ((A[1] | Next[A[1] | B[1]]) &r_NO_ERR) | r1 ; if(D == 2) goto Nextchar1file; r1 = Init1 & A[3]; B[3] = (Next[A[3]] & CMask) | ((A[2] | Next[A[2] | B[2]])&r_NO_ERR) | r1 ; if(D == 3) goto Nextchar1file; r1 = Init1 & A[4]; B[4] = (Next[A[4]] & CMask) | ((A[3] | Next[A[3] | B[3]])&r_NO_ERR) | r1 ; goto Nextchar1file; } /* if(NOT Newline) */ else { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = Init1 & A[D]; /* match endofline */ B[D] = (Next[A[D]] & CMask) | r1; if(TAIL) B[D] = Next[B[D]] | B[D]; if((B[D] & 1) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; 
for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } for(k=0; k<= D; k++) { A[k] = B[k] = Init[k]; } r1 = Init1 & A[0]; B[0] = (Next[A[0]] & CMask) | r1; for(k=1; k<= D; k++) { r1 = Init1 & A[k]; r2 = (A[k-1] | Next[A[k-1]|B[k-1]])&r_NO_ERR; B[k] = (Next[A[k]] & CMask) | r1 | r2; } if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } Nextchar1file: i++; CurrentByteOffset ++; } /* while i < end */ strncpy(buffer, buffer+num_read, Maxline); } /* while fill_buf() */ free_buf(Text, buffer); return 0; #if AGREP_POINTER } else { num_read = agrep_inlen; buffer = (CHAR *)agrep_inbuffer; end = num_read; /* buffer[end-1] = '\n';*/ /* at end of the text. */ /* buffer[0] = '\n';*/ /* in front of the text. 
*/ i = 0; /* An exact copy of the above RE_PROCESS_WHEN_DNOTZERO: the while-loop below */ while (i < end) { c = buffer[i++]; CurrentByteOffset ++; CMask = RMask[c]; if (c != Newline) { r_even = B[0]; r1 = Init1 & r_even; A[0] = (Next[r_even] & CMask) | r1; r_odd = B[1]; r1 = Init1 & r_odd; r2 = (r_even | Next[r_even|A[0]]) &r_NO_ERR; A[1] = (Next[r_odd] & CMask) | r2 | r1 ; if(D == 1) goto Nextcharmem; r_even = B[2]; r1 = Init1 & r_even; r2 = (r_odd | Next[r_odd|A[1]]) &r_NO_ERR; A[2] = (Next[r_even] & CMask) | r2 | r1 ; if(D == 2) goto Nextcharmem; r_odd = B[3]; r1 = Init1 & r_odd; r2 = (r_even | Next[r_even|A[2]]) &r_NO_ERR; A[3] = (Next[r_odd] & CMask) | r2 | r1 ; if(D == 3) goto Nextcharmem; r_even = B[4]; r1 = Init1 & r_even; r2 = (r_odd | Next[r_odd|A[3]]) &r_NO_ERR; A[4] = (Next[r_even] & CMask) | r2 | r1 ; goto Nextcharmem; } /* if NOT Newline */ else { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = Init1 & B[D]; /* match endofline */ A[D] = (Next[B[D]] & CMask) | r1; if(TAIL) A[D] = Next[A[D]] | A[D]; if((A[D] & 1) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i-1, end, j)) 
{free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } for(k=0; k<= D; k++) { A[k] = B[k] = Init[k]; } r1 = Init1 & B[0]; A[0] = (Next[B[0]] & CMask) | r1; for(k=1; k<= D; k++) { r1 = Init1 & B[k]; r2 = (B[k-1] | Next[A[k-1]|B[k-1]]) &r_NO_ERR; A[k] = (Next[B[k]] & CMask) | r1 | r2; } if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } Nextcharmem: c = buffer[i]; CMask = RMask[c]; if(c != Newline) { r1 = Init1 & A[0]; B[0] = (Next[A[0]] & CMask) | r1; r1 = Init1 & A[1]; B[1] = (Next[A[1]] & CMask) | ((A[0] | Next[A[0] | B[0]]) & r_NO_ERR) | r1 ; if(D == 1) goto Nextchar1mem; r1 = Init1 & A[2]; B[2] = (Next[A[2]] & CMask) | ((A[1] | Next[A[1] | B[1]]) &r_NO_ERR) | r1 ; if(D == 2) goto Nextchar1mem; r1 = Init1 & A[3]; B[3] = (Next[A[3]] & CMask) | ((A[2] | Next[A[2] | B[2]])&r_NO_ERR) | r1 ; if(D == 3) goto Nextchar1mem; r1 = Init1 & A[4]; B[4] = (Next[A[4]] & CMask) | ((A[3] | Next[A[3] | B[3]])&r_NO_ERR) | r1 ; goto Nextchar1mem; } /* if(NOT Newline) */ else { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = Init1 & A[D]; /* match endofline */ B[D] = (Next[A[D]] & CMask) | r1; if(TAIL) B[D] = Next[B[D]] | B[D]; if((B[D] & 1) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } 
if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if (-1 == r_output(buffer, i, end, j)) {free_buf(Text, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } for(k=0; k<= D; k++) { A[k] = B[k] = Init[k]; } r1 = Init1 & A[0]; B[0] = (Next[A[0]] & CMask) | r1; for(k=1; k<= D; k++) { r1 = Init1 & A[k]; r2 = (A[k-1] | Next[A[k-1]|B[k-1]])&r_NO_ERR; B[k] = (Next[A[k]] & CMask) | r1 | r2; } if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } Nextchar1mem: i++; CurrentByteOffset ++; } /* while i < end */ return 0; } #endif /*AGREP_POINTER*/ } /* re */ int r_output (buffer, i, end, j) int i, end, j; CHAR *buffer; { int PRINTED = 0; int bp; if(i >= end) return 0; if ((j < 1) || (CurrentByteOffset < 0)) return 0; num_of_matched++; if(COUNT) return 0; if (SILENT) return 0; if(FNAME && (NEW_FILE || !POST_FILTER)) { char nextchar = (POST_FILTER == ON)?'\n':' '; char *prevstring = (POST_FILTER == ON)?"\n":""; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s%s", prevstring, CurrentFileName); else { int outindex; if (prevstring[0] != '\0') { if(agrep_outpointer + 1 >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else agrep_outbuffer[agrep_outpointer ++] = prevstring[0]; } for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != 
NULL) fprintf(agrep_finalfp, ":%c", nextchar); else { if (agrep_outpointer+2>= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else { agrep_outbuffer[agrep_outpointer++] = ':'; agrep_outbuffer[agrep_outpointer++] = nextchar; } } NEW_FILE = OFF; PRINTED = 1; } bp = i-1; while ((buffer[bp] != '\n') && (bp > 0)) bp--; if(LINENUM) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%d: ", j-1); else { char s[32]; int outindex; sprintf(s, "%d: ", j-1); for(outindex=0; (outindex+agrep_outpointer= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } while(bp <= i) agrep_outbuffer[agrep_outpointer ++] = buffer[bp++]; } } else if (PRINTED) { if (agrep_finalfp != NULL) fputc('\n', agrep_finalfp); else agrep_outbuffer[agrep_outpointer ++] = '\n'; PRINTED = 0; } return 0; } /* * Processes the options specified in argc and argv, and fetches the pattern. * Also sets the set of filenames to be searched for internally. Returns: -1 * if there is a serious error, 0 if there is no pattern or an error in getting * the file names, the length (> 0) of the pattern if there is no error. When a * 0 is returned, it means that at least the options were processed correctly. */ int agrep_init(argc, argv, initialfd, pattern_len, pattern_buffer) int argc; char *argv[]; int initialfd; int pattern_len; CHAR *pattern_buffer; { int i, j, seenlsq = 0; char c, *p; int filetype; char **original_argv = argv; char *home; int quitwhile; int NOOUTTAIL=OFF; initial_value(); if (pattern_len < 1) { fprintf(stderr, "agrep_init: pattern length %d too small\n", pattern_len); errno = 3; return -1; } agrep_initialfd = initialfd; strncpy(Progname, argv[0], MAXNAME); if (argc < 2) return agrep_usage(); printf(""); /* dummy statement which avoids program crash with SYS3175 when piping the output of complex AGREP results into a file. This bug is regarded as COMPILER-UNSPECIFIC. For sure, the problem SHOULD BE FIXED somewhere else in AGREP, later. 
[TG] 16.09.96 Thomas Gries gries@epo.e-mail.com, gries@ibm.net */ Pattern[0] = '\0'; while(--argc > 0 && (*++argv)[0] == '-') { /* argv is incremented automatically here */ p = argv[0]+1; /* ptr to first character after '-' */ c = *(argv[0]+1); quitwhile = OFF; while(!quitwhile && (*p != '\0')) { c = *p; switch(c) { case 'z' : NOOUTPUTZERO = ON; /* don't output files with 0 matches */ PRINT(printf("z\n"); ) break; case 'c' : COUNT = ON; /* output the # of matches */ PRINT(printf("c\n"); ) break; case 's' : SILENT = ON; /* silent mode */ PRINT(printf("s\n"); ) break; case 'p' : I = 0; /* insertion cost is 0 */ PRINT(printf("p\n"); ) break; case 'P' : PRINTPATTERN = 1; /* print pattern before every matched line */ PRINT(printf("p\n"); ) break; case 'x' : WHOLELINE = ON; /* match the whole line */ PRINT(printf("x\n"); ) if(WORDBOUND) { fprintf(stderr, "%s: illegal option combination (-x and -w)\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } break; case 'b' : BYTECOUNT = ON; PRINT(printf("b\n"); ) break; case 'q' : PRINTOFFSET = ON; PRINT(printf("q\n"); ) break; case 'u' : PRINTRECORD = OFF; PRINT(printf("u\n"); ) break; case 'X' : PRINTNONEXISTENTFILE = ON; PRINT(printf("X\n"); ) break; case 'g' : PRINTFILENUMBER = ON; PRINT(printf("g\n"); ) break; case 'j' : PRINTFILETIME = ON; PRINT(printf("@\n"); ) break; case 'L' : if ( *(p + 1) == '\0') {/* space after -L option */ if(argc <= 1) { fprintf(stderr, "%s: the -L option must have an output-limit argument\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } argv++; LIMITOUTPUT = LIMITTOTALFILE = LIMITPERFILE = 0; sscanf(argv[0], "%d:%d:%d", &LIMITOUTPUT, &LIMITTOTALFILE, &LIMITPERFILE); if ((LIMITOUTPUT < 0) || (LIMITTOTALFILE < 0) || (LIMITPERFILE < 0)) { fprintf(stderr, "%s: invalid output limit %s\n", Progname, argv[0]); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } argc--; } else { LIMITOUTPUT = LIMITTOTALFILE = 
LIMITPERFILE = 0; sscanf(p+1, "%d:%d:%d", &LIMITOUTPUT, &LIMITTOTALFILE, &LIMITPERFILE); if ((LIMITOUTPUT < 0) || (LIMITTOTALFILE < 0) || (LIMITPERFILE < 0)) { fprintf(stderr, "%s: invalid output limit %s\n", Progname, p+1); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } } /* else */ PRINT(printf("L\n"); ) quitwhile = ON; break; case 'd' : DELIMITER = ON; /* user defines delimiter */ PRINT(printf("d\n"); ) if ( *(p + 1) == '\0') {/* space after -d option */ if(argc <= 1) { fprintf(stderr, "%s: the -d option must have a delimiter argument\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } argv++; if ((D_length = strlen(argv[0])) > MaxDelimit) { fprintf(stderr, "%s: delimiter pattern too long (has > %d chars)\n", Progname, MaxDelimit); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } D_pattern[0] = '<'; strcpy(D_pattern+1, argv[0]); if (((argv[0][D_length-1] == '\n') || (argv[0][D_length-1] == '$') || (argv[0][D_length-1] == '^')) && (D_length == 1)) OUTTAIL = ON; argc--; PRINT(printf("space\n"); ) } else { if ((D_length = strlen(p + 1)) > MaxDelimit) { fprintf(stderr, "%s: delimiter pattern too long (has > %d chars)\n", Progname, MaxDelimit); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } D_pattern[0] = '<'; strcpy(D_pattern+1, p + 1); if ((((p+1)[D_length-1] == '\n') || ((p+1)[D_length-1] == '$') || ((p+1)[D_length-1] == '^')) && (D_length == 1)) OUTTAIL = ON; } /* else */ strcat(D_pattern, ">; "); D_length++; /* to count '<' as one */ PRINT(printf("D_pattern=%s\n", D_pattern); ) strcpy(original_D_pattern, D_pattern); original_D_length = D_length; quitwhile = ON; break; case 'H': if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: a directory name must follow the -H option\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return agrep_usage(); } else exit(2); } argv ++; strcpy(COMP_DIR, argv[0]); argc --; } else { 
strcpy(COMP_DIR, p+1); } quitwhile = ON; break; case 'e' : if ( *(p + 1) == '\0') {/* space after -e option */ if(argc <= 1) { fprintf(stderr, "%s: the -e option must have a pattern argument\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } argv++; if(argv[0][0] == '-') { /* not strictly necessary but no harm done */ Pattern[0] = '\\'; strcat(Pattern, (argv)[0]); } else strcat(Pattern, argv[0]); argc--; } else { if (*(p+1) == '-') { /* not strictly necessary but no harm done */ Pattern[0] = '\\'; strcat(Pattern, p+1); } else strcat (Pattern, p+1); } /* else */ PRINT(printf("Pattern=%s\n", Pattern); ) pattern_index = abs(argv - original_argv); quitwhile = ON; break; case 'k' : CONSTANT = ON; if ( *(p + 1) == '\0') {/* space after -e option */ if(argc <= 1) { fprintf(stderr, "%s: the -k option must have a pattern argument\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } argv++; strcat(Pattern, argv[0]); if((argc > 2) && (argv[1][0] == '-')) { fprintf(stderr, "%s: -k should be the last option in the command\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } argc--; } else { if((argc > 1) && (argv[1][0] == '-')) { fprintf(stderr, "%s: -k should be the last option in the command\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } strcat (Pattern, p+1); } /* else */ pattern_index = abs(argv - original_argv); quitwhile = ON; break; case 'f' : if (PAT_FILE == ON) { fprintf(stderr, "%s: multiple -f options\n", Progname); if (multifd >= 0) close(multifd); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } if (PAT_BUFFER == ON) { fprintf(stderr, "%s: -f and -m are incompatible\n", Progname); if (multibuf != NULL) free(multibuf); multibuf = NULL; multilen = 0; if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } PAT_FILE = ON; PRINT(printf("f\n"); ) argv++; argc--; if (argv[0] == NULL) { /* A -f option 
with a NULL file name is a NO-OP: stupid, but simplifies glimpse :-) */ PAT_FILE = OFF; quitwhile = ON; break; } if((multifd = open(argv[0], O_RDONLY)) < 0) { PAT_FILE = OFF; fprintf(stderr, "%s: can't open pattern file for reading: %s\n", Progname, argv[0]); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } PRINT(printf("file=%s\n", argv[0]); ) strcpy(PAT_FILE_NAME, argv[0]); if (prepf(multifd, NULL, 0) <= -1) { close(multifd); PAT_FILE = OFF; fprintf(stderr, "%s: error in processing pattern file: %s\n", Progname, argv[0]); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } quitwhile = ON; break; case 'm' : if (PAT_BUFFER == ON) { fprintf(stderr, "%s: multiple -m options\n", Progname); if (multibuf != NULL) free(multibuf); multibuf = NULL; multilen = 0; if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } if (PAT_FILE == ON) { fprintf(stderr, "%s: -f and -m are incompatible\n", Progname); if (multifd >= 0) close(multifd); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } PAT_BUFFER = ON; PRINT(printf("m\n"); ) argv ++; argc --; if ((argv[0] == NULL) || ((multilen = strlen(argv[0])) <= 0)) { /* A -m option with a NULL or empty pattern buffer is a NO-OP: stupid, but simplifies glimpse :-) */ PAT_BUFFER = OFF; if (multibuf != NULL) free(multibuf); multilen = 0; multibuf = NULL; } else { multibuf = (char *)malloc(multilen + 2); strcpy(multibuf, argv[0]); PRINT(printf("patterns=%s\n", multibuf); ) if (prepf(-1, multibuf, multilen) <= -1) { free(multibuf); multibuf = NULL; multilen = 0; PAT_BUFFER = OFF; fprintf(stderr, "%s: error in processing pattern buffer\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } } quitwhile = ON; break; case 'h' : NOFILENAME = ON; PRINT(printf("h\n"); ) break; case 'i' : NOUPPER = ON; PRINT(printf("i\n"); ) break; case 'l' : FILENAMEONLY = ON; PRINT(printf("l\n"); ) break; case 'n' : LINENUM = ON; /* output prefixed by line no*/ 
PRINT(printf("n\n"); ) break; case 'r' : RECURSIVE = ON; PRINT(printf("r\n"); ) break; case 'V' : printf("\nThis is agrep version %s, %s.\n\n", AGREP_VERSION, AGREP_DATE); return 0; case 'v' : INVERSE = ON; /* output no-matched lines */ PRINT(printf("v\n"); ) break; case 't' : OUTTAIL = ON; /* output from tail of delimiter */ PRINT(printf("t\n"); ) break; case 'o' : NOOUTTAIL = ON; /* output from front of delimiter */ PRINT(printf("t\n"); ) break; case 'B' : BESTMATCH = ON; PRINT(printf("B\n"); ) break; case 'w' : WORDBOUND = ON;/* match to words */ PRINT(printf("w\n"); ) if(WHOLELINE) { fprintf(stderr, "%s: illegal option combination (-w and -x)\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } break; case 'y' : NOPROMPT = ON; PRINT(printf("y\n"); ) break; case 'I' : I = atoi(p + 1); /* Insertion Cost */ JUMP = ON; quitwhile = ON; break; case 'S' : S = atoi(p + 1); /* Substitution Cost */ JUMP = ON; quitwhile = ON; break; case 'D' : DD = atoi(p + 1); /* Deletion Cost */ JUMP = ON; quitwhile = ON; break; case 'G' : FILEOUT = ON; COUNT = ON; break; case 'A': ALWAYSFILENAME = ON; break; case 'O': POST_FILTER = ON; break; case 'M': MULTI_OUTPUT = ON; break; case 'Z': break; /* no-op: used by glimpse */ default : if (isdigit(c)) { APPROX = ON; D = atoi(p); if (D > MaxError) { fprintf(stderr,"%s: the maximum number of errors is %d\n", Progname, MaxError); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } quitwhile = ON; /* note that even a number should occur at the end of a group of options, as f & e */ } else { fprintf(stderr, "%s: illegal option -%c\n",Progname, c); return agrep_usage(); } } /* switch(c) */ p ++; } } /* while (--argc > 0 && (*++argv)[0] == '-') */ if (NOOUTTAIL == ON) OUTTAIL = OFF; if (COMP_DIR[0] == '\0') { if ((home = (char *)getenv("HOME")) == NULL) { getcwd(COMP_DIR, MAX_LINE_LEN-1); fprintf(stderr, "using working-directory '%s' to locate dictionaries\n", COMP_DIR); } else strncpy(COMP_DIR, 
home, MAX_LINE_LEN); } strcpy(FREQ_FILE, COMP_DIR); strcat(FREQ_FILE, "/"); strcat(FREQ_FILE, DEF_FREQ_FILE); strcpy(HASH_FILE, COMP_DIR); strcat(HASH_FILE, "/"); strcat(HASH_FILE, DEF_HASH_FILE); strcpy(STRING_FILE, COMP_DIR); strcat(STRING_FILE, "/"); strcat(STRING_FILE, DEF_STRING_FILE); initialize_common(FREQ_FILE, 0); /* no error msgs */ if (FILENAMEONLY && NOFILENAME) { fprintf(stderr, "%s: -h and -l options are mutually exclusive\n",Progname); } if (COUNT && (FILENAMEONLY || NOFILENAME)) { FILENAMEONLY = OFF; if(!FILEOUT) NOFILENAME = OFF; } if (SILENT) { FILEOUT = 0; NOFILENAME = 1; PRINTRECORD = 0; FILENAMEONLY = 0; PRINTFILETIME = 0; BYTECOUNT = 0; PRINTOFFSET = 0; } if (!(PAT_FILE || PAT_BUFFER) && Pattern[0] == '\0') { /* Pattern not set with -e option */ if (argc <= 0) { agrep_usage(); return 0; } strcpy(Pattern, *argv); pattern_index = abs(argv - original_argv); argc--; argv++; } /* if multi-pattern search, just ignore any specified pattern altogether: treat it as a filename */ if (copied_from_argv) { for (i=0; i USERRANGE_MIN) && ( ((unsigned char *)Pattern)[i] <= USERRANGE_MAX)) { fprintf(stderr, "Warning: pattern has some meta-characters interpreted by agrep!\n"); break; } else if (Pattern[i] == '\\') i++; /* extra */ else if (Pattern[i] == '[') seenlsq = 1; else if ((Pattern[i] == '-') && !seenlsq) { for (j=M; j>=i; j--) Pattern[j+1] = Pattern[j]; /* right shift including '\0' */ Pattern[i] = '\\'; /* escape the - */ M ++; i++; } else if (Pattern[i] == ']') seenlsq = 0; } if (M > pattern_len - 1) { fprintf(stderr, "%s: pattern '%s' does not fit in specified buffer\n", Progname, Pattern); errno = 3; return 0; } if (pattern_buffer != Pattern) /* not from mem/file-agrep() */ strncpy(pattern_buffer, Pattern, M+1); /* copy \0 */ return M; } /* * User need not bother about initialfd. * Both functions return -1 on error, 0 if there was no pattern, * length (>=1) of pattern otherwise. 
*/ int memagrep_init(argc, argv, pattern_len, pattern_buffer) int argc; char *argv[]; int pattern_len; char *pattern_buffer; { return (agrep_init(argc, argv, -1, pattern_len, pattern_buffer)); } int fileagrep_init(argc, argv, pattern_len, pattern_buffer) int argc; char *argv[]; int pattern_len; char *pattern_buffer; { return (agrep_init(argc, argv, 3, pattern_len, pattern_buffer)); } /* returns -1 on error, num of matches (>=0) otherwise */ int agrep_search(pattern_len, pattern_buffer, initialfd, input_len, input, output_len, output) int pattern_len; CHAR *pattern_buffer; int initialfd; int input_len; void *input; int output_len; void *output; { int i; int filetype; int ret; int pattern_has_changed = 1; if ((multifd == -1) && (multibuf == NULL) && (pattern_len < 1)) { fprintf(stderr, "%s: pattern length %d too small\n", Progname, pattern_len); errno = 3; return -1; } if (pattern_len >= MAXPAT) { fprintf(stderr, "%s: pattern '%s' too long\n", Progname, pattern_buffer); errno = 3; return -1; } /* courtesy: crd@hplb.hpl.hp.com */ if (agrep_saved_pattern) { if (strcmp(agrep_saved_pattern, pattern_buffer)) { free(agrep_saved_pattern); agrep_saved_pattern = NULL; } else { pattern_has_changed = 0; } } if (! 
agrep_saved_pattern) { agrep_saved_pattern = (CHAR *)malloc(pattern_len+1); memcpy(agrep_saved_pattern, pattern_buffer, pattern_len); agrep_saved_pattern[pattern_len] = '\0'; } if (!pattern_has_changed) { reinit_value_partial(); } else { reinit_value(); if (pattern_buffer != Pattern) /* not from mem/file-agrep() */ strncpy(Pattern, pattern_buffer, pattern_len+1); /* copy \0 */ M = strlen(Pattern); } if (output == NULL) { fprintf(stderr, "%s: invalid output descriptor\n", Progname); return -1; } if (output_len <= 0) { agrep_finalfp = (FILE *)output; agrep_outlen = 0; agrep_outbuffer = NULL; agrep_outpointer = 0; } else { agrep_finalfp = NULL; agrep_outlen = output_len; agrep_outbuffer = (CHAR *)output; agrep_outpointer = 0; } agrep_initialfd = initialfd; execfd = initialfd; if (initialfd == -1) { agrep_inbuffer = (CHAR *)input; agrep_inlen = input_len; agrep_inpointer = 0; } else if ((input_len > 0) && (input != NULL)) { /* Copy the set of filenames into Textfiles */ if (copied_from_argv) { for (i=0; i; ", D_length is 1 + length of string PAT: see agrep.c/'d' */ preprocess_delimiter(D_pattern+1, D_length - 1, D_pattern, &D_length); /* D_pattern is the exact stuff we want to match, D_length is its strlen */ if ((tc_D_length = quick_tcompress(FREQ_FILE,HASH_FILE,D_pattern,D_length,tc_D_pattern,MaxDelimit*2,TC_EASYSEARCH)) <= 0) { strcpy(tc_D_pattern, D_pattern); tc_D_length = D_length; } /* printf("sgrep's delim=%s,%d tc_delim=%s,%d\n", D_pattern, D_length, tc_D_pattern, tc_D_length); */ } M = strlen(OldPattern); } } if (AParse) { /* boolean converted to multi-pattern search */ int prepf_ret= 0; if (pattern_has_changed) prepf_ret= prepf(-1, multibuf, multilen); if (prepf_ret <= -1) { if (AComplexBoolean) destroy_tree(AParse); AParse = 0; PAT_BUFFER = 0; if (multibuf != NULL) free(multibuf); /* this was allocated for arbit booleans, not multipattern search */ multibuf = NULL; multilen = 0; /* Cannot free multifd here since that is always allocated for multipattern 
search */ return -1; } } if (Numfiles > 1) FNAME = ON; if (NOFILENAME) FNAME = 0; if (ALWAYSFILENAME) FNAME = ON; /* used by glimpse ONLY: 15/dec/93 */ if (agrep_initialfd == -1) ret = exec(execfd, NULL); else if(RECURSIVE) ret = (recursive(Numfiles, Textfiles)); else ret = (exec(execfd, Textfiles)); return ret; } /* * User need not bother about initialfd. * Both functions return -1 on error, 0 otherwise. */ int memagrep_search(pattern_len, pattern_buffer, input_len, input_buffer, output_len, output) int pattern_len; char *pattern_buffer; int input_len; char *input_buffer; int output_len; void *output; { return(agrep_search(pattern_len, pattern_buffer, -1, input_len, input_buffer, output_len, output)); } int fileagrep_search(pattern_len, pattern_buffer, file_num, file_buffer, output_len, output) int pattern_len; char *pattern_buffer; int file_num; char **file_buffer; int output_len; void *output; { return(agrep_search(pattern_len, pattern_buffer, 3, file_num, file_buffer, output_len, output)); } /* * The original agrep_run() routine was split into agrep_search and agrep_init * so that the interface with glimpse could be made cleaner: see glimpse. * Now, the user can specify an initial set of options, and use them in future * searches. If agrep_init does not find the pattern, options are still SET. * In fileagrep_search, the user can specify a NEW set of files to be searched * after the options are processed (this is used in glimpse). * * Both functions return -1 on error, 0 otherwise. * * The arguments are self explanatory. The pattern should be specified in * one of the argvs. Options too can be specified in one of the argvs -- it * is exactly as if the options are being given to agrep at run time. * The only restrictions are that the input_buffer should begin with a '\n' * and after its end, there must be valid memory to store a copy of the pattern. 
*/ int memagrep(argc, argv, input_len, input_buffer, output_len, output) int argc; char *argv[]; int input_len; char *input_buffer; int output_len; void *output; { int ret; if ((ret = memagrep_init(argc, argv, MAXPAT, Pattern)) < 0) return -1; else if ((ret == 0) && (multifd == -1) && (multibuf == NULL)) return -1; /* ^^^ because one need not specify the pattern on the cmd line if -f OR -m */ return memagrep_search(ret, Pattern, input_len, input_buffer, output_len, output); } int fileagrep(argc, argv, output_len, output) int argc; char *argv[]; int output_len; void *output; { int ret; if ((ret = fileagrep_init(argc, argv, MAXPAT, Pattern)) < 0) return -1; else if ((ret == 0) && (multifd == -1) && (multibuf == NULL)) return -1; /* ^^^ because one need not specify the pattern on the cmd line if -f OR -m */ return fileagrep_search(ret, Pattern, 0, NULL, output_len, output); } /* * RETURNS: total number of matched lines in all files that were searched. * * The pattern(s) remain(s) constant irrespective of the number of files. * Hence, essentially, all the interface routines below have to be changed * so that they DON'T do that preprocessing again and again for multiple * files. This bug was found while interfacing agrep with cast. * * At present, sgrep() has been modified to have another parameter, * "samepattern" that tells it whether the pattern is the same as before. * Other functions too should have such a parameter and should not repeat * preprocessing for all patterns. Since preprocessing for a pattern to * be searched in compressed files is significant, this bug was found. * * - bgopal on 15/Nov/93. 
*/ int exec(fd, file_list) int fd; char **file_list; { int i; char c[8]; int ret = 0; /* no error */ if ((Numfiles > 1) && (NOFILENAME == OFF)) FNAME = ON; if ((-1 == compat())) return -1; /* check compatibility between options */ if (fd <= 0) { TCOMPRESSED = ON; /* there is a possibility that the data might be tuncompressible */ if (!SetCurrentByteOffset) CurrentByteOffset = 0; if((fd == 0) && FILENAMEONLY) { fprintf(stderr, "%s: -l option is not compatible with standard input\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } if(PAT_FILE || PAT_BUFFER) mgrep(fd, AParse); else { if(SGREP) ret = sgrep(OldPattern, strlen(OldPattern), fd, D, 0); else ret = bitap(old_D_pat, Pattern, fd, M, D); } if (ret <= -1) return -1; if (COUNT /* && ret */) { /* dirty solution for glimpse's -b! */ if(INVERSE && (PAT_FILE || PAT_BUFFER)) { /* inverse will never be set in glimpse */ if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%d\n", total_line-(num_of_matched - prev_num_of_matched)); else { char s[32]; int outindex; sprintf(s, "%d\n", total_line-(num_of_matched - prev_num_of_matched)); for(outindex=0; (outindex+agrep_outpointer 0 => Numfiles > 0 */ for (i = 0; i < Numfiles; i++, close(fd)) { prev_num_of_matched = num_of_matched; if (!SetCurrentByteOffset) CurrentByteOffset = 0; if (!SetCurrentFileName) { if (PRINTFILENUMBER) sprintf(CurrentFileName, "%d", i); else strcpy(CurrentFileName, file_list[i]); } if (!SetCurrentFileTime) { if (PRINTFILETIME) CurrentFileTime = aget_file_time(NULL, file_list[i]); } TCOMPRESSED = ON; if (!tuncompressible_filename(file_list[i], strlen(file_list[i]))) TCOMPRESSED = OFF; NEW_FILE = ON; if ((fd = my_open(file_list[i], O_RDONLY)) < /*=*/ 0) { if (PRINTNONEXISTENTFILE) printf("%s\n", CurrentFileName); else if (!glimpse_call) fprintf(stderr, "%s: can't open file for reading: %s\n",Progname, file_list[i]); } else { if(PAT_FILE || PAT_BUFFER) mgrep(fd, AParse); else { if(SGREP) ret = sgrep(OldPattern, 
strlen(OldPattern), fd, D, i); else ret = bitap(old_D_pat, Pattern, fd, M, D); } if (ret <= -1) { close(fd); return -1; } if (num_of_matched - prev_num_of_matched > 0) { NOMATCH = OFF; files_matched ++; } if (COUNT && !FILEOUT) { if( (INVERSE && (PAT_FILE || PAT_BUFFER)) && ((total_line - (num_of_matched - prev_num_of_matched)> 0) || !NOOUTPUTZERO) ) { if(FNAME && (NEW_FILE || !POST_FILTER)) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; close(fd); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; close(fd); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, ": %d\n", total_line - (num_of_matched - prev_num_of_matched)); else { char s[32]; int outindex; sprintf(s, ": %d\n", total_line - (num_of_matched - prev_num_of_matched)); for(outindex=0; (outindex+agrep_outpointer 0) || !NOOUTPUTZERO) ) { /* inverse is always 0 in glimpse, so we always come here */ if(FNAME && (NEW_FILE || !POST_FILTER)) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; close(fd); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; close(fd); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, ": %d\n", (num_of_matched - prev_num_of_matched)); else { char s[32]; int outindex; sprintf(s, ": %d\n", (num_of_matched - 
prev_num_of_matched)); for(outindex=0; (outindex+agrep_outpointer 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITTOTALFILE > 0) && (LIMITTOTALFILE <= files_matched))) { close(fd); break; /* done */ } } /* for i < Numfiles */ if(NOMATCH && BESTMATCH) { if(WORDBOUND || WHOLELINE || INVERSE) { SGREP = 0; if(-1 == preprocess(D_pattern, Pattern)) return -1; strcpy(old_D_pat, D_pattern); if((M = maskgen(Pattern, D)) == -1) return -1; } COUNT=ON; D=1; while(D 0) { if(PAT_FILE || PAT_BUFFER) mgrep(fd, AParse); else { if(SGREP) ret = sgrep(OldPattern,strlen(OldPattern),fd,D, i); else ret = bitap(old_D_pat,Pattern,fd,M,D); } if (ret <= -1) return -1; } /* else don't have to process PRINTNONEXISTENTFILE since must print only once */ if (glimpse_clientdied) { close(fd); return -1; } if (agrep_finalfp != NULL) fflush(agrep_finalfp); if ((((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITTOTALFILE > 0) && (LIMITTOTALFILE <= files_matched))) && (num_of_matched > prev_num_of_matched)) { close(fd); break; } } /* for i < Numfiles */ D++; } /* while */ if(num_of_matched - prev_num_of_matched > 0) { D--; errno = D; /* #of errors if proper return */ COUNT = 0; if(num_of_matched - prev_num_of_matched == 1) fprintf(stderr,"%s: 1 word matches ", Progname); else fprintf(stderr,"%s: %d words match ", Progname, num_of_matched - prev_num_of_matched); if(D==1) fprintf(stderr, "within 1 error"); else fprintf(stderr, "within %d errors", D); fflush(stderr); if(NOPROMPT) fprintf(stderr, "\n"); else { if(num_of_matched - prev_num_of_matched == 1) fprintf(stderr,"; search for it? (y/n)"); else fprintf(stderr,"; search for them? 
(y/n)"); c[0] = 'y'; if (!glimpse_isserver && (fgets(c, 4, stdin) == NULL)) goto CONT; if(c[0] != 'y') goto CONT; } for (i = 0; i < Numfiles; i++, close(fd)) { prev_num_of_matched = num_of_matched; CurrentByteOffset = 0; if (PRINTFILENUMBER) sprintf(CurrentFileName, "%d", i); else strcpy(CurrentFileName, file_list[i]); if (!SetCurrentFileTime) if (PRINTFILETIME) CurrentFileTime = aget_file_time(NULL, file_list[i]); NEW_FILE = ON; if ((fd = my_open(Textfiles[i], O_RDONLY)) > 0) { if(PAT_FILE || PAT_BUFFER) mgrep(fd, AParse); else { if(SGREP) ret = sgrep(OldPattern,strlen(OldPattern),fd,D, i); else ret = bitap(old_D_pat,Pattern,fd,M,D); } if (ret <= -1) { close(fd); return -1; } } /* else don't have to process PRINTNONEXISTENTFILE since must print only once */ if (glimpse_clientdied) { close(fd); return -1; } if (agrep_finalfp != NULL) fflush(agrep_finalfp); if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITTOTALFILE > 0) && (LIMITTOTALFILE <= files_matched))) { close(fd); break; /* done */ } } /* for i < Numfiles */ NOMATCH = 0; } } } CONT: if(EATFIRST) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else if (agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; EATFIRST = OFF; } if(num_of_matched - prev_num_of_matched > 0) NOMATCH = OFF; /* if(NOMATCH) return(0); */ /*printf("exec=%d\n", num_of_matched);*/ return(num_of_matched); } /* end of exec() */ /* Just output the contents of the file fname onto the std output */ int file_out(fname) char *fname; { int num_read; int fd; int i, len; CHAR buf[SIZE+2]; if(FNAME) { len = strlen(fname); if (agrep_finalfp != NULL) { fputc('\n', agrep_finalfp); for(i=0; i< len; i++) fputc(':', agrep_finalfp); fputc('\n', agrep_finalfp); fprintf(agrep_finalfp, "%s\n", fname); for(i=0; i< len; i++) fputc(':', agrep_finalfp); fputc('\n', agrep_finalfp); fflush(agrep_finalfp); } else { if (1+len+1+len+1+len+1+agrep_outpointer >= agrep_outlen) { 
OUTPUT_OVERFLOW; return -1; } agrep_outbuffer[agrep_outpointer++] = '\n'; for (i=0; i 0) write(1, buf, num_read); if (glimpse_clientdied) { close(fd); return -1; } } else { if ((num_read = fill_buf(fd, agrep_outbuffer + agrep_outpointer, agrep_outlen - agrep_outpointer)) > 0) agrep_outpointer += num_read; } close(fd); return 0; } int output(buffer, i1, i2, j) register CHAR *buffer; int i1, i2, j; { int PRINTED = 0; register CHAR *bp, *outend; if(i1 > i2) return 0; num_of_matched++; if(COUNT) return 0; if(SILENT) return 0; if(OUTTAIL || (!DELIMITER && (D_length == 1) && (D_pattern[0] == '\n')) ) { if (j>1) i1 = i1 + D_length; i2 = i2 + D_length; } if(DELIMITER) j = j+1; if(FIRSTOUTPUT) { if (buffer[i1] == '\n') { i1++; EATFIRST = ON; } FIRSTOUTPUT = 0; } if(TRUNCATE) { fprintf(stderr, "WARNING! some lines have been truncated in output record #%d\n", num_of_matched-1); } /* Why do we have to do this? */ while ((buffer[i1] == '\n') && (i1 <= i2)) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer < agrep_outlen) agrep_outbuffer[agrep_outpointer ++] = '\n'; else { OUTPUT_OVERFLOW; return -1; } } i1++; } if(FNAME && (NEW_FILE || !POST_FILTER)) { char nextchar = (POST_FILTER == ON)?'\n':' '; char *prevstring = (POST_FILTER == ON)?"\n":""; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s%s", prevstring, CurrentFileName); else { int outindex; if (prevstring[0] != '\0') { if(agrep_outpointer + 1 >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else agrep_outbuffer[agrep_outpointer ++] = prevstring[0]; } for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) 
fprintf(agrep_finalfp, ":%c", nextchar); else { if (agrep_outpointer+2>= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else { agrep_outbuffer[agrep_outpointer++] = ':'; agrep_outbuffer[agrep_outpointer++] = nextchar; } } NEW_FILE = OFF; PRINTED = 1; } if(LINENUM) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%d: ", j-1); else { char s[32]; int outindex; sprintf(s, "%d: ", j-1); for(outindex=0; (outindex+agrep_outpointer= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } while(bp <= outend) agrep_outbuffer[agrep_outpointer ++] = *bp++; } } else if (PRINTED) { if (agrep_finalfp != NULL) fputc('\n', agrep_finalfp); else agrep_outbuffer[agrep_outpointer ++] = '\n'; PRINTED = 0; } return 0; } int agrep_usage() { if (glimpse_call) return -1; fprintf(stderr, "usage: %s [-@#abcdehiklnoprstvwxyBDGIMSV] [-f patternfile] [-H dir] pattern [files]\n", Progname); fprintf(stderr, "\n"); fprintf(stderr, "summary of frequently used options:\n"); fprintf(stderr, "(For a more detailed listing see 'man agrep'.)\n"); fprintf(stderr, "-#: find matches with at most # errors\n"); fprintf(stderr, "-c: output the number of matched records\n"); fprintf(stderr, "-d: define record delimiter\n"); fprintf(stderr, "-h: do not output file names\n"); fprintf(stderr, "-i: case-insensitive search, e.g., 'a' = 'A'\n"); fprintf(stderr, "-l: output the names of files that contain a match\n"); fprintf(stderr, "-n: output record prefixed by record number\n"); fprintf(stderr, "-v: output those records that have no matches\n"); fprintf(stderr, "-w: pattern has to match as a word, e.g., 'win' will not match 'wind'\n"); fprintf(stderr, "-B: best match mode. 
find the closest matches to the pattern\n"); fprintf(stderr, "-G: output the files that contain a match\n"); fprintf(stderr, "-H 'dir': the cast-dictionary is located in directory 'dir'\n"); fprintf(stderr, "\n"); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } glimpse-4.18.7/agrep/agrep.chronicle000066400000000000000000000206641300371307100173420ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ Started in Feb 1991. This chronicle briefly describes the progress of agrep. Feb/91: The approximate pattern matching algorithm called 'bitap' (bit-parallel approximate pattern matching) is designed. The algorithm is a generalization of Baeza-Yates' "shift-or" algorithm for exact matching. Mar/91: Many extensions of the algorithm 'bitap' are found, especially for approximate regular expression pattern matching. Preliminary implementation of the algorithm showed a strong promise for a general-purpose fast approximate pattern-matching tool. Apr/91: Approximate regular expression pattern matching was implemented. The result is even better than expected. The design of the software tool is pinned down. (For example, record oriented, multi-pattern, AND/OR logic queries.) A partition technique for approximate pattern matching is used. May/91: The prototype of "agrep" is completed. A lot of debugging/optimization in this month. Jun/91: The first version of agrep is released. agrep 1.0 was announced and made available by anonymous ftp from cs.arizona.edu. Jul/91: A sub-linear expected-time algorithm, called "amonkey" for approximate pattern matching (for simple pattern) is designed. The algorithm has the same time complexity as that of Chang&Lawler but is much much faster in practice. The algorithm is based on a variation of Boyer-Moore technique, which we call "block-shifting." 
A sub-linear expected-time algorithm, called "mgrep", for matching a set of patterns is designed, based on the "block-shifting" technique combined with a hashing technique.

Aug/91: "amonkey" is implemented and incorporated into agrep. It is very fast for long patterns like DNA patterns. (But roughly the same for matching English words as the bitap algorithm using the partition technique.) Prototype of "mgrep" is implemented.

Sep/91: "mgrep" is incorporated into agrep to support the -f option. An algorithm for approximate pattern matching that combines the 'partition' technique with the sub-linear expected-time algorithm for multi-patterns is designed. Implementation shows it to be the fastest for ASCII text (and pattern). Boyer-Moore technique for exact matching is incorporated.

Nov/91: The final paper on "agrep", which is to appear in the USENIX conference (Jan 1992), is finished.

Jan/92: Some new options are added, such as finding best matches (-B) and file output (-G). The man pages are revised. agrep version 2.0 is released.
        Fixed the following bugs and changed the version to 2.01.
        1. -G option doesn't work correctly.
        2. multiple definitions of some global variables.
        3. -# with -w forced the first character of the pattern to be matched.

Mar/92: Fixed the following bugs and changed the version to 2.02.
        1. agrep sometimes misses some matches for pipeline input.
        2. the word delimiter was not defined consistently.

------------------------------------------------------------------------------
bgopal: The following changes were made to the original agrep during 1993-94:

1. Modifications to make main() take multiple options from the same '-' group:
   - the only modifications were in main.c.

2. Now, to make agrep take input from a buffer so that it can be used as a procedure from another program.
   Places where changes have to be done:
   - asearch.c/fill_buf(), bitap.c/fill_buf()
   - main.c/read() statements
   - mgrep.c/read() statements
   - sgrep.c/read() statements
   - probably don't have to change scanf in main.c where a y/n is asked.
   - probably don't have to change readdir in recursive.c.

   I have used fill_buf everywhere for reading things from a file. I have to verify whether this is actually used to take input in which it has to search for patterns or to read things REALLY from a file (-f option, file_out, etc.). If the former, then I can simply modify fill_buf to read from an fd or from an input string. How to specify that string / area of memory is a separate issue to be resolved during the weekend.

   I have resolved it. I've also made a library interface for agrep. So 2 is done.

3. Make errno = exit code whenever you return -1 instead of exiting.

4. See if there is a way to avoid copying of memory bytes in agrep by using pointer manipulation instead of fill_buf: a part of making agrep a callable routine. It is important to make it really fast, which is why this is being done.

   Solution:
   ---------
   I think I've solved the problem: but there is a restriction for in-memory pattern matching: THE SEARCHBUFFER HAS TO BEGIN WITH A NEWLINE -- otherwise we cannot avoid the copying. This fact can be checked in the library interface.

   There are some more problems whose solution I'm not sure of: ask Udi. The problems are:
   a. In asearch(), asearch0() and asearch1(), some data is copied after the data read in the buffer. Is that crucial? The same thing can be seen in bitap(). This is done when num_read < BlockSize -- why?
   b. In sgrep(), the whole buffer is filled with pat[m-1] so that bm() does not enter an infinite loop. Is that crucial if there is an equivalent of a single iteration of the while-fill_buf-loop?

   I have not modified prepf() to read the multi-pattern from memory rather than from a file. I have to modify it later (including agrep.c).
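   The in-memory restriction noted above (the search buffer has to begin with a newline, and, per the comment on memagrep() in agrep.c, there must be valid memory after its end to hold a copy of the pattern) could be met by a caller roughly as sketched below. This is only an illustration under stated assumptions: make_search_buffer() and its slack parameter are hypothetical names and are not part of agrep.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, NOT part of agrep: copy `text` into a freshly
 * allocated buffer that
 *   - begins with '\n' (the restriction noted in the chronicle), and
 *   - has `slack` spare, zeroed bytes after the text, so that a search
 *     routine may store a copy of the pattern past the end.
 * On success returns the buffer and stores the searchable length
 * (leading newline + text) in *buf_len. Returns NULL if malloc fails.
 * The caller releases the buffer with free().
 */
unsigned char *make_search_buffer(const char *text, size_t text_len,
                                  size_t slack, size_t *buf_len)
{
    unsigned char *buf;

    assert(text != NULL && buf_len != NULL);

    /* 1 byte for the leading newline, plus 1 extra byte for a final '\0' */
    buf = (unsigned char *)malloc(1 + text_len + slack + 1);
    if (buf == NULL)
        return NULL;

    buf[0] = '\n';                              /* required leading newline */
    memcpy(buf + 1, text, text_len);            /* the text to be searched  */
    memset(buf + 1 + text_len, 0, slack + 1);   /* room for a pattern copy  */

    *buf_len = 1 + text_len;
    return buf;
}
```

   A caller would then hand the returned buffer and *buf_len to the in-memory search interface as input_buffer and input_len. Choosing slack to be at least MAXPAT (256 in agrep.h) is an assumption made here, not something the chronicle states.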
Function fill_buf now simply reads from the fd given: it does not bother about pointer manipulation. Note: wherever there is a while(i lots of problems! *). **** These were completed and added into glimpse/glimpseindex in Spring 1994. 7. One other problems with agrep as a callable routine: the variable names used by agrep can clash with user defined variable names. Making agrep variables static is not going to help since they are accessed throughout agrep code. Making code reentrant is not the issue (it is almost impossible!). glimpse-4.18.7/agrep/agrep.h000066400000000000000000000124361300371307100156210ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ #ifndef _AGREP_H_ #define _AGREP_H_ #include #include #include #include #include #include "re.h" #include "defs.h" #include "config.h" #include #include #include #include #define MAXNUM_PAT 16 /* 32 parts of a pattern = width of expression-tree */ #define CHAR unsigned char #define MAXPAT 256 #define MAXPATT 256 #define MAXDELIM 8 /* Max size of a delimiter pattern */ #define SHORTREG 15 #define MAXREG 30 #define MAXNAME 256 #define Max_Pats 12 /* max num of patterns */ #define Max_Keys 12 /* max num of keywords */ #define Max_Psize 128 /* max size of a pattern counting all the characters */ #define Max_Keyword 31 /* the max size of a keyword */ #define WORD 32 /* the size of a word */ #define MaxError 8 /* the max number of errors allowed */ #define MaxRerror 4 /* the max number of erros for regular expression */ #define MaxDelimit 16 /* the max raw length of a user defined delimiter */ #define BlockSize 49152/* BlockSize is always >= Max_record */ #define Max_record 49152 #define SIZE 16384 /* BlockSIze in sgrep */ #define MAXLINE 1024 /* maxline in sgrep */ #define MAX_LINE_LEN 1024 #define Maxline 1024 #define RBLOCK 8192 #define RMAXLINE 1024 #define MaxNext 66000 #define ON 1 #define OFF 0 #define Compl 1 #define Maxresult 10000 #define MaxCan 2500 
#define MAX_DASHF_FILES 40000 #if 1 #define MAXSYM 256 /* ASCII */ #define WORDB 133 /* -w option */ #define LPARENT 134 /* ( */ #define RPARENT 135 /* ) */ #define LRANGE 136 /* [ */ #define RRANGE 137 /* ] */ #define LANGLE 138 /* < */ #define RANGLE 139 /* > */ #define NOTSYM 140 /* ^ */ #define WILDCD 141 /* wildcard */ #define ORSYM 142 /* | */ #define ORPAT 143 /* , */ #define ANDPAT 144 /* ; */ #define STAR 145 /* closure */ #define HYPHEN 129 /* - */ #define NOCARE 130 /* . */ #define NNLINE 131 /* special symbol for newline in begin of pattern*/ /* matches '\n' and NNLINE */ #define USERRANGE_MIN 128 /* min char in pattern of user: give warning */ #define USERRANGE_MAX 145 /* max char in pattern of user: give warning */ #else #define MAXSYM 256 /* ASCII */ #define WORDB 241 /* -w option */ #define LPARENT 242 /* ( */ #define RPARENT 243 /* ) */ #define LRANGE 244 /* [ */ #define RRANGE 245 /* ] */ #define LANGLE 246 /* < */ #define RANGLE 247 /* > */ #define NOTSYM 248 /* ^ */ #define WILDCD 249 /* wildcard */ #define ORSYM 250 /* | */ #define ORPAT 251 /* , */ #define ANDPAT 252 /* ; */ #define STAR 253 /* closure */ #define HYPHEN 237 /* - */ #define NOCARE 238 /* . 
*/ #define NNLINE 239 /* special symbol for newline in begin of pattern*/ /* matches '\n' and NNLINE */ #define USERRANGE_MIN 236 /* min char in pattern of user: give warning */ #define USERRANGE_MAX 255 /* max char in pattern of user: give warning */ #endif #define OUTPUT_OVERFLOW \ { /* fprintf(stderr, "Output buffer overflow after %d bytes @ %s:%d !!\n", agrep_outpointer, __FILE__, __LINE__) */\ errno = ERANGE;\ } extern unsigned char *forward_delimiter(), *backward_delimiter(); extern int exists_delimiter(); extern void preprocess_delimiter(); unsigned char *forward_delimiter(), *backward_delimiter(); int exists_tcompressed_word(); unsigned char * forward_tcompressed_word(), *backward_tcompressed_word(); void alloc_buf(), free_buf(); extern char *aprint_file_time(); #define AGREP_VERSION "3.0" #define AGREP_DATE "1994" /* To parse patterns in asplit.c */ #define AND_EXP 0x1 /* boolean ; -- remains set throughout */ #define OR_EXP 0x2 /* boolean , -- remains set throughout */ #define ATTR_EXP 0x4 /* set when = is next non-alpha char, remains set until next , or ; --> never used in agrep */ #define VAL_EXP 0x8 /* set all the time except when = is seen for first time --> never used in agrep */ #define ENDSUB_EXP 0x10 /* set when , or ; is seen: must unset ATTR_EXP now --> never used in agrep */ #define INTERNAL 1 #define LEAF 2 #define NOTPAT 0x1000 #define OPMASK 0x00ff typedef struct _ParseTree { short op; char type; char terminalindex; union { struct { struct _ParseTree *left, *right; } internal; struct { int attribute; /* never used in agrep */ unsigned char *value; } leaf; } data; } ParseTree; #define unget_token_bool(bufptr, tokenlen) (*(bufptr)) -= (tokenlen) #define dd(a,b) 1 #define AGREP_ERROR 123 /* errno = 123 means that glimpse should quit searching files: used for errors glimpse itself cannot detect but agrep can */ #if ISO_CHAR_SET /* From Henrik.Martin@eua.ericsson.se (Henrik Martin) */ #define IS_LOCALE_CHAR(c) ((isalnum((c)) || isxdigit((c)) || \ 
isspace((c)) || ispunct((c)) || iscntrl((c))) ? 1 : 0) #define ISASCII(c) IS_LOCALE_CHAR(c) #else #define ISASCII(c) isascii(c) #endif extern int my_open(); extern FILE *my_fopen(); extern int my_stat(); extern int my_fstat(); extern int my_lstat(); extern int special_get_name(); #endif /* _AGREP_H_ */ glimpse-4.18.7/agrep/asearch.c000066400000000000000000001063701300371307100161250ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ #include "agrep.h" #include extern unsigned Init1, Init[], Mask[], endposition, D_endpos, AND, NO_ERR_MASK; extern int DELIMITER, FILENAMEONLY, INVERSE, PRINTFILETIME; extern CHAR CurrentFileName[]; extern long CurrentFileTime; extern int I, num_of_matched, prev_num_of_matched, TRUNCATE; extern int CurrentByteOffset; extern int errno; extern CHAR *agrep_inbuffer; extern int agrep_inlen; extern int agrep_initialfd; extern int EXITONERROR; extern int agrep_inpointer; extern FILE *agrep_finalfp; extern CHAR *agrep_outbuffer; extern int agrep_outlen; extern int agrep_outpointer; extern int NEW_FILE, POST_FILTER; extern int LIMITOUTPUT, LIMITPERFILE; int asearch(old_D_pat, text, D) CHAR old_D_pat[]; int text; register unsigned D; { register unsigned i, c, r1, r2, CMask, r_NO_ERR, r_Init1; register unsigned A0, B0, A1, B1, endpos; unsigned A2, B2, A3, B3, A4, B4; unsigned A[MaxError+1], B[MaxError+1]; unsigned D_Mask; int end; int D_length, FIRSTROUND, ResidueSize, lasti, l, k, j=0; int printout_end; CHAR *buffer; /* CHAR *tempbuf = NULL; */ /* used only when text == -1 */ if (I == 0) Init1 = (unsigned)037777777777; if(D > 4) { return asearch0(old_D_pat, text, D); } D_length = strlen((const char *)old_D_pat); D_Mask = D_endpos; for ( i=1; i 0) { i = Max_record; end = Max_record + l ; if (FIRSTROUND) { i = Max_record - 1; if(DELIMITER) { for(k=0; k=D_length) j--; } FIRSTROUND = OFF; } if (l < BlockSize) { /* copy pattern and '\0' at end of buffer */ strncpy((char *)(buffer+end), (const 
char *)old_D_pat, D_length); buffer[end+D_length] = '\0'; end = end + D_length; } /* ASEARCH_PROCESS: the while-loop below */ while (i < end ) { c = buffer[i]; CMask = Mask[c]; r1 = r_Init1 & B0; A0 = ((B0 >>1 ) & CMask) | r1; r1 = r_Init1 & B1; r2 = B0 | (((A0 | B0) >> 1) & r_NO_ERR); A1 = ((B1 >>1 ) & CMask) | r2 | r1 ; if(D == 1) goto Nextcharfile; r1 = r_Init1 & B2; r2 = B1 | (((A1 | B1) >> 1) & r_NO_ERR); A2 = ((B2 >>1 ) & CMask) | r2 | r1 ; if(D == 2) goto Nextcharfile; r1 = r_Init1 & B3; r2 = B2 | (((A2 | B2) >> 1) & r_NO_ERR); A3 = ((B3 >>1 ) & CMask) | r2 | r1 ; if(D == 3) goto Nextcharfile; r1 = r_Init1 & B4; r2 = B3 | (((A3 | B3) >> 1) & r_NO_ERR); A4 = ((B4 >>1 ) & CMask) | r2 | r1 ; if(D == 4) goto Nextcharfile; Nextcharfile: i=i+1; CurrentByteOffset ++; if(A0 & endpos) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = A0; if ( D == 1) r1 = A1; if ( D == 2) r1 = A2; if ( D == 3) r1 = A3; if ( D == 4) r1 = A4; if(((AND == 1) && ((r1 & endposition) == endposition)) || ((AND == 0) && (r1 & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } /* if (text == -1) { memcpy(buffer+end-D_length, tempbuf, D_length+1); } */ free_buf(text, buffer); 
NEW_FILE = OFF; return 0; } printout_end = i - D_length - 1 ; if ((text != -1) && !(lasti >= Max_record + l - 1)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } else if ((text == -1) && !(lasti >= l)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(text, buffer); return 0; /* done */ } } lasti = i - D_length; /* point to starting position of D_pat */ TRUNCATE = OFF; for(k=0; k<= D; k++) { B[k] = Init[0]; } r1 = B[0] & Init1; A[0] = (((B[0]>>1) & CMask) | r1) & D_Mask; for(k=1; k<= D; k++) { r1 = Init1 & B[k]; r2 = B[k-1] | (((A[k-1] | B[k-1])>>1)&r_NO_ERR); A[k] = (((B[k]>>1)&CMask) | r1 | r2) ; } A0 = A[0]; B0 = B[0]; A1 = A[1]; B1 = B[1]; A2 = A[2]; B2 = B[2]; A3 = A[3]; B3 = B[3]; A4 = A[4]; B4 = B[4]; if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } c = buffer[i]; CMask = Mask[c]; r1 = r_Init1 & A0; B0 = ((A0 >> 1 ) & CMask) | r1; /* printf("Mask = %o, B0 = %on", CMask, B0); */ r1 = r_Init1 & A1; r2 = A0 | (((A0 | B0) >> 1) & r_NO_ERR); B1 = ((A1 >>1 ) & CMask) | r2 | r1 ; if(D == 1) goto Nextchar1file; r1 = r_Init1 & A2; r2 = A1 | (((A1 | B1) >> 1) & r_NO_ERR); B2 = ((A2 >>1 ) & CMask) | r2 | r1 ; if(D == 2) goto Nextchar1file; r1 = r_Init1 & A3; r2 = A2 | (((A2 | B2) >> 1) & r_NO_ERR); B3 = ((A3 >>1 ) & CMask) | r2 | r1 ; if(D == 3) goto Nextchar1file; r1 = r_Init1 & A4; r2 = A3 | (((A3 | B3) >> 1) & r_NO_ERR); B4 = ((A4 >>1 ) & CMask) | r2 | r1 ; if(D == 4) goto Nextchar1file; Nextchar1file: i=i+1; CurrentByteOffset ++; if(B0 & endpos) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = B0; if ( D == 1) r1 = B1; if ( D == 2) r1 = B2; if ( D == 3) r1 = B3; if ( D == 4) r1 = B4; if(((AND == 1) && ((r1 & endposition) == endposition)) || ((AND == 0) && (r1 & 
endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; free_buf(text, buffer); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } /* if (text == -1) { memcpy(buffer+end-D_length, tempbuf, D_length+1); } */ free_buf(text, buffer); NEW_FILE = OFF; return 0; } printout_end = i - D_length - 1 ; if((text != -1) && !(lasti >= Max_record + l - 1)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } else if ((text == -1) && !(lasti >= l)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(text, buffer); return 0; /* done */ } } lasti = i - D_length ; TRUNCATE = OFF; for(k=0; k<= D; k++) { A[k] = Init[0]; } r1 = A[0] & Init1; B[0] = (((A[0]>>1)&CMask) | r1) & D_Mask; for(k=1; k<= D; k++) { r1 = Init1 & A[k]; r2 = A[k-1] | (((A[k-1] | B[k-1])>>1)&r_NO_ERR); B[k] = (((A[k]>>1)&CMask) | r1 | r2) ; } A0 = A[0]; B0 = B[0]; A1 = A[1]; B1 = B[1]; A2 = A[2]; B2 = B[2]; A3 = A[3]; B3 = B[3]; A4 = A[4]; B4 = B[4]; if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } } if(l < BlockSize) { lasti = Max_record ; 
} else { ResidueSize = Max_record + l - lasti; if(ResidueSize > Max_record) { ResidueSize = Max_record; TRUNCATE = ON; } strncpy((char *)(buffer+Max_record-ResidueSize), (const char *)(buffer+lasti), ResidueSize); lasti = Max_record - ResidueSize; if(lasti == 0) lasti = 1; } } free_buf(text, buffer); return 0; #if AGREP_POINTER } else { lasti = 1; /* if (DELIMITER) tempbuf = (CHAR*)malloc(D_length + 1); */ buffer = (CHAR *)agrep_inbuffer; l = agrep_inlen; end = l; /* buffer[end-1] = '\n'; */ /* at end of the text. */ /* buffer[0] = '\n'; */ /* in front of the text. */ i = 0; if(DELIMITER) { for(k=0; k=D_length) j--; /* memcpy(tempbuf, buffer+end, D_length+1); strncpy(buffer+end, old_D_pat, D_length); buffer[end+D_length] = '\0'; end = end + D_length; */ } /* An exact copy of the above ASEARCH_PROCESS: the while-loop below */ while (i < end ) { c = buffer[i]; CMask = Mask[c]; r1 = r_Init1 & B0; A0 = ((B0 >>1 ) & CMask) | r1; r1 = r_Init1 & B1; r2 = B0 | (((A0 | B0) >> 1) & r_NO_ERR); A1 = ((B1 >>1 ) & CMask) | r2 | r1 ; if(D == 1) goto Nextcharmem; r1 = r_Init1 & B2; r2 = B1 | (((A1 | B1) >> 1) & r_NO_ERR); A2 = ((B2 >>1 ) & CMask) | r2 | r1 ; if(D == 2) goto Nextcharmem; r1 = r_Init1 & B3; r2 = B2 | (((A2 | B2) >> 1) & r_NO_ERR); A3 = ((B3 >>1 ) & CMask) | r2 | r1 ; if(D == 3) goto Nextcharmem; r1 = r_Init1 & B4; r2 = B3 | (((A3 | B3) >> 1) & r_NO_ERR); A4 = ((B4 >>1 ) & CMask) | r2 | r1 ; if(D == 4) goto Nextcharmem; Nextcharmem: i=i+1; CurrentByteOffset ++; if(A0 & endpos) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = A0; if ( D == 1) r1 = A1; if ( D == 2) r1 = A2; if ( D == 3) r1 = A3; if ( D == 4) r1 = A4; if(((AND == 1) && ((r1 & endposition) == endposition)) || ((AND == 0) && (r1 & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; 
(outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } /* if (text == -1) { memcpy(buffer+end-D_length, tempbuf, D_length+1); } */ free_buf(text, buffer); NEW_FILE = OFF; return 0; } printout_end = i - D_length - 1 ; if ((text != -1) && !(lasti >= Max_record + l - 1)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } else if ((text == -1) && !(lasti >= l)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(text, buffer); return 0; /* done */ } } lasti = i - D_length; /* point to starting position of D_pat */ TRUNCATE = OFF; for(k=0; k<= D; k++) { B[k] = Init[0]; } r1 = B[0] & Init1; A[0] = (((B[0]>>1) & CMask) | r1) & D_Mask; for(k=1; k<= D; k++) { r1 = Init1 & B[k]; r2 = B[k-1] | (((A[k-1] | B[k-1])>>1)&r_NO_ERR); A[k] = (((B[k]>>1)&CMask) | r1 | r2) ; } A0 = A[0]; B0 = B[0]; A1 = A[1]; B1 = B[1]; A2 = A[2]; B2 = B[2]; A3 = A[3]; B3 = B[3]; A4 = A[4]; B4 = B[4]; if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } c = buffer[i]; CMask = Mask[c]; r1 = r_Init1 & A0; B0 = ((A0 >> 1 ) & CMask) | r1; /* printf("Mask = %o, B0 = %on", CMask, B0); */ r1 = r_Init1 & A1; r2 = A0 | (((A0 | B0) >> 1) & r_NO_ERR); B1 = ((A1 >>1 ) & CMask) | r2 | r1 ; 
if(D == 1) goto Nextchar1mem; r1 = r_Init1 & A2; r2 = A1 | (((A1 | B1) >> 1) & r_NO_ERR); B2 = ((A2 >>1 ) & CMask) | r2 | r1 ; if(D == 2) goto Nextchar1mem; r1 = r_Init1 & A3; r2 = A2 | (((A2 | B2) >> 1) & r_NO_ERR); B3 = ((A3 >>1 ) & CMask) | r2 | r1 ; if(D == 3) goto Nextchar1mem; r1 = r_Init1 & A4; r2 = A3 | (((A3 | B3) >> 1) & r_NO_ERR); B4 = ((A4 >>1 ) & CMask) | r2 | r1 ; if(D == 4) goto Nextchar1mem; Nextchar1mem: i=i+1; CurrentByteOffset ++; if(B0 & endpos) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = B0; if ( D == 1) r1 = B1; if ( D == 2) r1 = B2; if ( D == 3) r1 = B3; if ( D == 4) r1 = B4; if(((AND == 1) && ((r1 & endposition) == endposition)) || ((AND == 0) && (r1 & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; free_buf(text, buffer); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } /* if (text == -1) { memcpy(buffer+end-D_length, tempbuf, D_length+1); } */ free_buf(text, buffer); NEW_FILE = OFF; return 0; } printout_end = i - D_length - 1 ; if((text != -1) && !(lasti >= Max_record + l - 1)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } else if ((text == -1) && !(lasti >= l)) { if (-1 == output(buffer, lasti, printout_end, j)) 
{free_buf(text, buffer); return -1;} } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(text, buffer); return 0; /* done */ } } lasti = i - D_length ; TRUNCATE = OFF; for(k=0; k<= D; k++) { A[k] = Init[0]; } r1 = A[0] & Init1; B[0] = (((A[0]>>1)&CMask) | r1) & D_Mask; for(k=1; k<= D; k++) { r1 = Init1 & A[k]; r2 = A[k-1] | (((A[k-1] | B[k-1])>>1)&r_NO_ERR); B[k] = (((A[k]>>1)&CMask) | r1 | r2) ; } A0 = A[0]; B0 = B[0]; A1 = A[1]; B1 = B[1]; A2 = A[2]; B2 = B[2]; A3 = A[3]; B3 = B[3]; A4 = A[4]; B4 = B[4]; if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } } /* if (DELIMITER) { memcpy(buffer+end, tempbuf, D_length+1); free(tempbuf); } */ return 0; } #endif /*AGREP_POINTER*/ } int asearch0(old_D_pat, text, D) CHAR old_D_pat[]; int text; register unsigned D; { register unsigned i, c, r1, r2, CMask, r_NO_ERR, r_Init1, end, endpos; unsigned A[MaxError+2], B[MaxError+2]; unsigned D_Mask; int D_length, FIRSTROUND, ResidueSize, lasti, l, k, j=0; int printout_end; CHAR *buffer; /* CHAR *tempbuf = NULL;*/ /* used only when text == -1 */ D_length = strlen((const char *)old_D_pat); D_Mask = D_endpos; for ( i=1; i 0) { i = Max_record; end = Max_record + l ; if (FIRSTROUND) { i = Max_record - 1; FIRSTROUND = OFF; } if (l < BlockSize) { strncpy((char *)(buffer+end), (const char *)old_D_pat, D_length); buffer[end+D_length] = '\0'; end = end + D_length; } /* ASEARCH0_PROCESS: the while-loop below */ while (i < end ) { c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; r1 = B[0] & r_Init1; A[0] = (((B[0] >> 1)) & CMask | r1 ) ; for(k=1; k<=D; k++) { r1 = r_Init1 & B[k]; r2 = B[k-1] | (((A[k-1]|B[k-1])>>1) & r_NO_ERR); A[k] = ((B[k] >> 1) & CMask) | r2 | r1; } if(A[0] & endpos) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = A[D]; if(((AND == 1) && ((r1 & endposition) == endposition)) || ((AND == 0) && (r1 & 
endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } /* if (text == -1) { memcpy(buffer+end-D_length, tempbuf, D_length+1); } */ free_buf(text, buffer); NEW_FILE = OFF; return 0; } printout_end = i - D_length - 1; if((text != -1) && !(lasti >= Max_record + l - 1)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } else if ((text == -1) && !(lasti >= l)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(text, buffer); return 0; /* done */ } } lasti = i - D_length; /* point to starting position of D_pat */ for(k=0; k<= D; k++) { B[k] = Init[0]; } r1 = B[0] & r_Init1; A[0] = (((B[0]>>1) & CMask) | r1) & D_Mask; for(k=1; k<= D; k++) { r1 = Init1 & B[k]; r2 = B[k-1] | (((A[k-1] | B[k-1])>>1)&r_NO_ERR); A[k] = (((B[k]>>1)&CMask) | r1 | r2) ; } if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; r1 = r_Init1 & A[0]; B[0] = ((A[0] >> 1 ) & CMask) | r1; for(k=1; k<=D; k++) { r1 = r_Init1 & 
A[k]; r2 = A[k-1] | (((A[k-1]|B[k-1])>>1) & r_NO_ERR); B[k] = ((A[k] >> 1) & CMask) | r2 | r1; } if(B[0] & endpos) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = B[D]; if(((AND == 1) && ((r1 & endposition) == endposition)) || ((AND == 0) && (r1 & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } /* if (text == -1) { memcpy(buffer+end-D_length, tempbuf, D_length+1); } */ free_buf(text, buffer); NEW_FILE = OFF; return 0; } printout_end = i - D_length -1 ; if((text != -1) && !(lasti >= Max_record + l - 1)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } else if ((text == -1) && !(lasti >= l)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(text, buffer); return 0; /* done */ } } lasti = i - D_length ; for(k=0; k<= D; k++) { A[k] = Init[0]; } r1 = A[0] & r_Init1; B[0] = (((A[0]>>1)&CMask) | r1) & D_Mask; for(k=1; k<= D; k++) { r1 = r_Init1 & A[k]; r2 = A[k-1] | (((A[k-1] | B[k-1])>>1)&r_NO_ERR); B[k] = (((A[k]>>1)&CMask) 
| r1 | r2) ; } if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } } if(l < BlockSize) { lasti = Max_record; } else { ResidueSize = Max_record + l - lasti; if(ResidueSize > Max_record) { ResidueSize = Max_record; TRUNCATE = ON; } strncpy((char *)(buffer+Max_record-ResidueSize), (const char *)(buffer+lasti), ResidueSize); lasti = Max_record - ResidueSize; if(lasti == 0) lasti = 1; } } free_buf(text, buffer); return 0; #if AGREP_POINTER } else { lasti = 1; /* if (DELIMITER) tempbuf = (CHAR*)malloc(D_length + 1); */ buffer = (CHAR *)agrep_inbuffer; l = agrep_inlen; end = l; /* buffer[end-1] = '\n';*/ /* at end of the text. */ /* buffer[0] = '\n';*/ /* in front of the text. */ i = 0; if(DELIMITER) { for(k=0; k=D_length) j--; /* memcpy(tempbuf, buffer+end, D_length+1); strncpy(buffer+end, old_D_pat, D_length); buffer[end+D_length] = '\0'; end = end + D_length; */ } /* An exact copy of the above ASEARCH0_PROCESS: the while-loop below */ while (i < end ) { c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; r1 = B[0] & r_Init1; A[0] = (((B[0] >> 1)) & CMask | r1 ) ; for(k=1; k<=D; k++) { r1 = r_Init1 & B[k]; r2 = B[k-1] | (((A[k-1]|B[k-1])>>1) & r_NO_ERR); A[k] = ((B[k] >> 1) & CMask) | r2 | r1; } if(A[0] & endpos) { if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; j++; r1 = A[D]; if(((AND == 1) && ((r1 & endposition) == endposition)) || ((AND == 0) && (r1 & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; 
free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } /* if (text == -1) { memcpy(buffer+end-D_length, tempbuf, D_length+1); } */ free_buf(text, buffer); NEW_FILE = OFF; return 0; } printout_end = i - D_length - 1; if((text != -1) && !(lasti >= Max_record + l - 1)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } else if ((text == -1) && !(lasti >= l)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(text, buffer); return 0; /* done */ } } lasti = i - D_length; /* point to starting position of D_pat */ for(k=0; k<= D; k++) { B[k] = Init[0]; } r1 = B[0] & r_Init1; A[0] = (((B[0]>>1) & CMask) | r1) & D_Mask; for(k=1; k<= D; k++) { r1 = Init1 & B[k]; r2 = B[k-1] | (((A[k-1] | B[k-1])>>1)&r_NO_ERR); A[k] = (((B[k]>>1)&CMask) | r1 | r2) ; } if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; r1 = r_Init1 & A[0]; B[0] = ((A[0] >> 1 ) & CMask) | r1; for(k=1; k<=D; k++) { r1 = r_Init1 & A[k]; r2 = A[k-1] | (((A[k-1]|B[k-1])>>1) & r_NO_ERR); B[k] = ((A[k] >> 1) & CMask) | r2 | r1; } if(B[0] & endpos) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; r1 = B[D]; if(((AND == 1) && ((r1 & endposition) == endposition)) || ((AND == 0) && (r1 & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; 
free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } /* if (text == -1) { memcpy(buffer+end-D_length, tempbuf, D_length+1); } */ free_buf(text, buffer); NEW_FILE = OFF; return 0; } printout_end = i - D_length -1 ; if((text != -1) && !(lasti >= Max_record + l - 1)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } else if ((text == -1) && !(lasti >= l)) { if (-1 == output(buffer, lasti, printout_end, j)) {free_buf(text, buffer); return -1;} } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(text, buffer); return 0; /* done */ } } lasti = i - D_length ; for(k=0; k<= D; k++) { A[k] = Init[0]; } r1 = A[0] & r_Init1; B[0] = (((A[0]>>1)&CMask) | r1) & D_Mask; for(k=1; k<= D; k++) { r1 = r_Init1 & A[k]; r2 = A[k-1] | (((A[k-1] | B[k-1])>>1)&r_NO_ERR); B[k] = (((A[k]>>1)&CMask) | r1 | r2) ; } if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } } /* if (DELIMITER) { memcpy(buffer+end, tempbuf, D_length+1); free(tempbuf); } */ return 0; } #endif /*AGREP_POINTER*/ } glimpse-4.18.7/agrep/asearch1.c000066400000000000000000000410041300371307100161760ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. 
*/ #include "agrep.h" #include extern unsigned Init1, Init[], Mask[], endposition, D_endpos; extern unsigned NO_ERR_MASK; extern int TRUNCATE, DELIMITER, AND, I, S, DD, INVERSE, FILENAMEONLY, PRINTFILETIME ; extern char CurrentFileName[]; extern long CurrentFileTime; extern int num_of_matched, prev_num_of_matched; extern int CurrentByteOffset; extern CHAR *agrep_inbuffer; extern int agrep_inlen; extern FILE *agrep_finalfp; extern CHAR *agrep_outbuffer; extern int agrep_outlen; extern int agrep_outpointer; extern int NEW_FILE, POST_FILTER; extern int LIMITOUTPUT, LIMITPERFILE; int asearch1(old_D_pat, Text, D) char old_D_pat[]; int Text; register unsigned D; { register unsigned end, i, r1, r3, r4, r5, CMask, D_Mask, k, endpos; register unsigned r_NO_ERR; unsigned A[MaxError*2+1], B[MaxError*2+1]; int D_length, ResidueSize, lasti, num_read, FIRSTROUND=1, j=0; CHAR *buffer; /* CHAR *tempbuf = NULL;*/ /* used only when Text == -1 */ if(I == 0) Init1 = (unsigned)037777777777; if(DD > D) DD = D+1; if(I > D) I = D+1; if(S > D) S = D+1; D_length = strlen(old_D_pat); r_NO_ERR = NO_ERR_MASK; D_Mask = D_endpos; for(i=1; i 0) { i=Max_record; end = Max_record + num_read; if(FIRSTROUND) { i = Max_record -1 ; if(DELIMITER) { for(k=0; k=D_length) j--; } FIRSTROUND = 0; } if(num_read < BlockSize) { strncpy(buffer+Max_record+num_read, old_D_pat, D_length); end = end + D_length; buffer[end] = '\0'; } /* ASEARCH1_PROCESS: the while-loop below */ while (i < end) { CMask = Mask[buffer[i++]]; CurrentByteOffset ++; r1 = Init1 & B[D]; A[D] = ((B[D] >> 1) & CMask ) | r1; for(k = r3; k <= r4; k++) /* r3 = D+1, r4 = 2*D */ { r5 = B[k]; r1 = Init1 & r5; A[k] = ((r5 >> 1) & CMask) | B[k-I] | (((A[k-DD] | B[k-S]) >>1) & r_NO_ERR) | r1 ; } if(A[D] & endpos) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; if(((AND == 1) && ((A[D*2] & endposition) == endposition)) || ((AND == 0) && (A[D*2] & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) 
{ num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } /* if (Text == -1) { memcpy(buffer+end-D_length, tempbuf, D_length+1); } */ free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if((Text != -1) && !(lasti >= Max_record + num_read - 1)) { if (-1 == output(buffer, lasti, i-D_length-1, j)) {free_buf(Text, buffer); return -1;} } else if ((Text == -1) && !(lasti >= num_read)) { if (-1 == output(buffer, lasti, i-D_length-1, j)) {free_buf(Text, buffer); return -1;} } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } lasti = i - D_length; TRUNCATE = OFF; for(k = D; k <= r4 ; k++) A[k] = B[k] = Init[0]; r1 = Init1 & B[D]; A[D] = (((B[D] >> 1) & CMask ) | r1) & D_Mask; for(k = r3; k <= r4; k++) /* r3 = D+1, r4 = 2*D */ { r5 = B[k]; r1 = Init1 & r5; A[k] = ((r5 >> 1) & CMask) | B[k-I] | (((A[k-DD] | B[k-S]) >>1) & r_NO_ERR) | r1 ; } if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } /* end if (A[D]&endpos) */ CMask = Mask[buffer[i++]]; CurrentByteOffset ++; r1 = A[D] & Init1; B[D] = ((A[D] >> 1) & CMask) | r1; for(k = r3; k <= r4; k++) { r1 = A[k] & Init1; B[k] = ((A[k] >> 1) & CMask) | A[k-I] | 
(((B[k-DD] | A[k-S]) >>1)&r_NO_ERR) | r1 ; } if(B[D] & endpos) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; if(((AND == 1) && ((B[r4] & endposition) == endposition)) || ((AND == 0) && (B[r4] & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } /* if (Text == -1) { memcpy(buffer+end-D_length, tempbuf, D_length+1); } */ free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if((Text != -1) && !(lasti >= Max_record + num_read - 1)) { if (-1 == output(buffer, lasti, i-D_length-1, j)) {free_buf(Text, buffer); return -1;} } else if ((Text == -1) && !(lasti >= num_read)) { if (-1 == output(buffer, lasti, i-D_length-1, j)) {free_buf(Text, buffer); return -1;} } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } lasti = i-D_length; TRUNCATE = OFF; for(k=D; k <= r4; k++) A[k] = B[k] = Init[0]; r1 = Init1 & A[D]; B[D] = (((A[D] >> 1) & CMask ) | r1) & D_Mask; for(k = r3; k <= r4; k++) /* r3 = D+1, r4 = 2*D */ { r5 = A[k]; r1 = Init1 & r5; B[k] = ((r5 >> 1) & CMask) | A[k-I] | (((B[k-DD] | A[k-S]) >>1) & r_NO_ERR) | r1 ; } if 
(DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } /* end if (B[D]&endpos) */ } ResidueSize = Max_record + num_read - lasti; if(ResidueSize > Max_record) { ResidueSize = Max_record; TRUNCATE = ON; } strncpy(buffer+Max_record-ResidueSize, buffer+lasti, ResidueSize); lasti = Max_record - ResidueSize; if(lasti < 0) lasti = 1; if(num_read < BlockSize) lasti = Max_record; } free_buf(Text, buffer); return 0; #if AGREP_POINTER } else { lasti = 1; /* if (DELIMITER) tempbuf = (CHAR*)malloc(D_length + 1); */ buffer = (CHAR *)agrep_inbuffer; num_read = agrep_inlen; end = num_read; /* buffer[end-1] = '\n';*/ /* at end of the text. */ /* buffer[0] = '\n';*/ /* in front of the text. */ i = 0; if(DELIMITER) { for(k=0; k=D_length) j--; /* memcpy(tempbuf, buffer+end, D_length+1); strncpy(buffer+end, old_D_pat, D_length); buffer[end+D_length] = '\0'; end = end + D_length; */ } /* An exact copy of the above ASEARCH1_PROCESS: the while-loop below */ while (i < end) { CMask = Mask[buffer[i++]]; CurrentByteOffset ++; r1 = Init1 & B[D]; A[D] = ((B[D] >> 1) & CMask ) | r1; for(k = r3; k <= r4; k++) /* r3 = D+1, r4 = 2*D */ { r5 = B[k]; r1 = Init1 & r5; A[k] = ((r5 >> 1) & CMask) | B[k-I] | (((A[k-DD] | B[k-S]) >>1) & r_NO_ERR) | r1 ; } if(A[D] & endpos) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; if(((AND == 1) && ((A[D*2] & endposition) == endposition)) || ((AND == 0) && (A[D*2] & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { 
OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } /* if (Text == -1) { memcpy(buffer+end-D_length, tempbuf, D_length+1); } */ free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if((Text != -1) && !(lasti >= Max_record + num_read - 1)) { if (-1 == output(buffer, lasti, i-D_length-1, j)) {free_buf(Text, buffer); return -1;} } else if ((Text == -1) && !(lasti >= num_read)) { if (-1 == output(buffer, lasti, i-D_length-1, j)) {free_buf(Text, buffer); return -1;} } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } lasti = i - D_length; TRUNCATE = OFF; for(k = D; k <= r4 ; k++) A[k] = B[k] = Init[0]; r1 = Init1 & B[D]; A[D] = (((B[D] >> 1) & CMask ) | r1) & D_Mask; for(k = r3; k <= r4; k++) /* r3 = D+1, r4 = 2*D */ { r5 = B[k]; r1 = Init1 & r5; A[k] = ((r5 >> 1) & CMask) | B[k-I] | (((A[k-DD] | B[k-S]) >>1) & r_NO_ERR) | r1 ; } if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } /* end if (A[D]&endpos) */ CMask = Mask[buffer[i++]]; CurrentByteOffset ++; r1 = A[D] & Init1; B[D] = ((A[D] >> 1) & CMask) | r1; for(k = r3; k <= r4; k++) { r1 = A[k] & Init1; B[k] = ((A[k] >> 1) & CMask) | A[k-I] | (((B[k-DD] | A[k-S]) >>1)&r_NO_ERR) | r1 ; } if(B[D] & endpos) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; if(((AND == 1) && ((B[r4] & endposition) == endposition)) || ((AND == 0) && (B[r4] & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) 
{ OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(Text, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } /* if (Text == -1) { memcpy(buffer+end-D_length, tempbuf, D_length+1); } */ free_buf(Text, buffer); NEW_FILE = OFF; return 0; } if((Text != -1) && !(lasti >= Max_record + num_read - 1)) { if (-1 == output(buffer, lasti, i-D_length-1, j)) {free_buf(Text, buffer); return -1;} } else if ((Text == -1) && !(lasti >= num_read)) { if (-1 == output(buffer, lasti, i-D_length-1, j)) {free_buf(Text, buffer); return -1;} } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(Text, buffer); return 0; /* done */ } } lasti = i-D_length; TRUNCATE = OFF; for(k=D; k <= r4; k++) A[k] = B[k] = Init[0]; r1 = Init1 & A[D]; B[D] = (((A[D] >> 1) & CMask ) | r1) & D_Mask; for(k = r3; k <= r4; k++) /* r3 = D+1, r4 = 2*D */ { r5 = A[k]; r1 = Init1 & r5; B[k] = ((r5 >> 1) & CMask) | A[k-I] | (((B[k-DD] | A[k-S]) >>1) & r_NO_ERR) | r1 ; } if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } /* end if (B[D]&endpos) */ } /* if (DELIMITER) { memcpy(buffer+end, tempbuf, D_length+1); free(tempbuf); } */ return 0; } #endif /*AGREP_POINTER*/ } glimpse-4.18.7/agrep/asplit.c000066400000000000000000000262451300371307100160150ustar00rootroot00000000000000/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. 
 */

#include "agrep.h"
#include "putils.c"

extern int checksg();

extern int D;
extern FILE *debug;

/* All borrowed from agrep.c; needed for searching the index */
extern ParseTree aterminals[MAXNUM_PAT];
extern int AComplexBoolean;

/*
 * Returns a pointer to where it found the distinguishing token (',' or ';').
 * The text from the previous value of begin up to that point is the current
 * pattern (not just the "words" in it). Returns NULL on a parse error.
 */
CHAR *
aparse_flat(begin, end, prev, next)
	CHAR	*begin;
	CHAR	*end;
	int	prev;
	int	*next;
{
	if (begin > end) {
		*next = prev;
		return end;
	}

	if (prev & ENDSUB_EXP) prev &= ~ATTR_EXP;
	if ((prev & ATTR_EXP) && !(prev & VAL_EXP)) prev |= VAL_EXP;

	while (begin <= end) {
		if (*begin == ',') {
			prev |= OR_EXP;
			prev |= VAL_EXP;
			prev |= ENDSUB_EXP;
			if (prev & AND_EXP) {
				fprintf(stderr, "asplit.c: parse error at character '%c'\n", *begin);
				return NULL;
			}
			*next = prev;
			return begin;
		}
		else if (*begin == ';') {
			prev |= AND_EXP;
			prev |= VAL_EXP;
			prev |= ENDSUB_EXP;
			if (prev & OR_EXP) {
				fprintf(stderr, "asplit.c: parse error at character '%c'\n", *begin);
				return NULL;
			}
			*next = prev;
			return begin;
		}
		else if (*begin == '\\') begin ++;	/* skip the backslash and the character it escapes */
		begin++;
	}

	*next = prev;
	return begin;
}

int
asplit_pattern_flat(APattern, AM, terminals, pnum_terminals, pAParse)
	CHAR	*APattern;
	int	AM;
	ParseTree terminals[MAXNUM_PAT];
	int	*pnum_terminals;
	int	*pAParse;
{
	CHAR	*buffer;
	CHAR	*buffer_pat;
	CHAR	*buffer_end;

	buffer = APattern;
	buffer_end = buffer + AM;
	*pAParse = 0;

	/*
	 * buffer is the running pointer, buffer_pat is the place where
	 * the distinguishing delimiter was found, buffer_end is the end.
	 */
	while (buffer_pat = aparse_flat(buffer, buffer_end, *pAParse, pAParse)) {
		/* there is no pattern before the distinguishing delimiter position: some agrep garbage */
		if (buffer_pat <= buffer) {
			buffer = buffer_pat+1;
			if (buffer_pat >= buffer_end) break;
			continue;
		}
		if (*pnum_terminals >= MAXNUM_PAT) {
			fprintf(stderr, "boolean expression has too many terms\n");
			return -1;
		}
		terminals[*pnum_terminals].op = 0;
		terminals[*pnum_terminals].type = LEAF;
		terminals[*pnum_terminals].terminalindex = *pnum_terminals;
		terminals[*pnum_terminals].data.leaf.attribute = 0;	/* default is no structure */
		terminals[*pnum_terminals].data.leaf.value = (CHAR *)malloc(buffer_pat - buffer + 2);
		memcpy(terminals[*pnum_terminals].data.leaf.value, buffer, buffer_pat - buffer);	/* without distinguishing delimiter */
		terminals[*pnum_terminals].data.leaf.value[buffer_pat - buffer] = '\0';
		(*pnum_terminals)++;
		if (buffer_pat >= buffer_end) break;
		buffer = buffer_pat+1;
	}
	if (buffer_pat == NULL) return -1;	/* got out of while loop because of NULL rather than break */
	return(*pnum_terminals);
}

/*
 * Recursive descent; C-style => AND + OR have equal priority => expressions must be bracketed explicitly, otherwise they are taken left->right.
 * Grammar:
 *	E = {E} | ~a | ~{E} | E ; E | E , E | a
 * Parser:
 *	One token of lookahead at each literal tells you what to do.
 *	~ has highest priority, ; and , have equal priority (left to right associativity), ~~ is not allowed.
*/ ParseTree * aparse_tree(buffer, len, bufptr, terminals, pnum_terminals) CHAR *buffer; int len; int *bufptr; ParseTree terminals[]; int *pnum_terminals; { int token, tokenlen; CHAR tokenbuf[MAXNAME]; int oldtokenlen; CHAR oldtokenbuf[MAXNAME]; ParseTree *t, *n, *leftn; token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen); switch(token) { case '{': /* (exp) */ if ((t = aparse_tree(buffer, len, bufptr, terminals, pnum_terminals)) == NULL) return NULL; if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) != '}') { fprintf(stderr, "asplit.c: parse error at offset %d\n", *bufptr); destroy_tree(t); return (NULL); } if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) == 'e') return t; switch(token) { /* must find boolean infix operator */ case ',': case ';': leftn = t; if ((t = aparse_tree(buffer, len, bufptr, terminals, pnum_terminals)) == NULL) return NULL; n = (ParseTree *)malloc(sizeof(ParseTree)); n->op = (token == ';') ? ANDPAT : ORPAT ; n->type = INTERNAL; n->data.internal.left = leftn; n->data.internal.right = t; return n; /* or end of parent sub expression */ case '}': unget_token_bool(bufptr, tokenlen); /* part of someone else who called me */ return t; default: destroy_tree(t); fprintf(stderr, "asplit.c: parse error at offset %d\n", *bufptr); return NULL; } /* Go one level deeper */ case '~': /* not exp */ if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) == 'e') return NULL; switch(token) { case 'a': if (*pnum_terminals >= MAXNUM_PAT) { fprintf(stderr, "Pattern expression too large (> %d)\n", MAXNUM_PAT); return NULL; } n = &terminals[*pnum_terminals]; n->op = 0; n->type = LEAF; n->terminalindex = (*pnum_terminals); n->data.leaf.attribute = 0; n->data.leaf.value = (unsigned char*)malloc(tokenlen + 2); memcpy(n->data.leaf.value, tokenbuf, tokenlen); n->data.leaf.value[tokenlen] = '\0'; (*pnum_terminals)++; n->op |= NOTPAT; t = n; break; case '{': if ((t = aparse_tree(buffer, len, bufptr, 
terminals, pnum_terminals)) == NULL) return NULL; if (t->op & NOTPAT) t->op &= ~NOTPAT; else t->op |= NOTPAT; if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) != '}') { fprintf(stderr, "asplit.c: parse error at offset %d\n", *bufptr); destroy_tree(t); return NULL; } break; default: fprintf(stderr, "asplit.c: parse error at offset %d\n", *bufptr); return NULL; } /* The resulting tree is in t. Now do another lookahead at this level */ if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) == 'e') return t; switch(token) { /* must find boolean infix operator */ case ',': case ';': leftn = t; if ((t = aparse_tree(buffer, len, bufptr, terminals, pnum_terminals)) == NULL) return NULL; n = (ParseTree *)malloc(sizeof(ParseTree)); n->op = (token == ';') ? ANDPAT : ORPAT ; n->type = INTERNAL; n->data.internal.left = leftn; n->data.internal.right = t; return n; case '}': unget_token_bool(bufptr, tokenlen); return t; default: destroy_tree(t); fprintf(stderr, "asplit.c: parse error at offset %d\n", *bufptr); return NULL; } case 'a': /* individual term (attr=val) */ if (tokenlen == 0) return NULL; memcpy(oldtokenbuf, tokenbuf, tokenlen); oldtokenlen = tokenlen; oldtokenbuf[oldtokenlen] = '\0'; token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen); switch(token) { case '}': /* part of case '{' above: else syntax error not detected but semantics ok */ unget_token_bool(bufptr, tokenlen); case 'e': /* endof input */ case ',': case ';': if (*pnum_terminals >= MAXNUM_PAT) { fprintf(stderr, "Pattern expression too large (> %d)\n", MAXNUM_PAT); return NULL; } n = &terminals[*pnum_terminals]; n->op = 0; n->type = LEAF; n->terminalindex = (*pnum_terminals); n->data.leaf.attribute = 0; n->data.leaf.value = (unsigned char*)malloc(oldtokenlen + 2); strcpy(n->data.leaf.value, oldtokenbuf); (*pnum_terminals)++; if ((token == 'e') || (token == '}')) return n; /* nothing after terminal in expression */ leftn = n; if ((t = aparse_tree(buffer, len, 
bufptr, terminals, pnum_terminals)) == NULL) return NULL; n = (ParseTree *)malloc(sizeof(ParseTree)); n->op = (token == ';') ? ANDPAT : ORPAT ; n->type = INTERNAL; n->data.internal.left = leftn; n->data.internal.right = t; return n; default: fprintf(stderr, "asplit.c: parse error at offset %d\n", *bufptr); return NULL; } case 'e': /* can't happen as I always do a lookahead above and return current tree if e */ default: fprintf(stderr, "asplit.c: parse error at offset %d\n", *bufptr); return NULL; } } int asplit_pattern(APattern, AM, terminals, pnum_terminals, pAParse) CHAR *APattern; int AM; ParseTree terminals[]; int *pnum_terminals; ParseTree **pAParse; { int bufptr = 0, ret, i, j; if (is_complex_boolean(APattern, AM)) { AComplexBoolean = 1; *pnum_terminals = 0; if ((*pAParse = aparse_tree(APattern, AM, &bufptr, terminals, pnum_terminals)) == NULL) return -1; /* print_tree(*pAParse, 0); */ return *pnum_terminals; } else { for (i=0; itype == LEAF) return ((tree->op & NOTPAT) ? (!matched_terminals[tree->terminalindex]) : (matched_terminals[tree->terminalindex])); else if (tree->type == INTERNAL) { if ((tree->op & OPMASK) == ANDPAT) { /* sequential evaluation */ if ((res = eval_tree(tree->data.internal.left, matched_terminals)) != 0) res = eval_tree(tree->data.internal.right, matched_terminals); return (tree->op & NOTPAT) ? !res : res; } else { /* sequential evaluation */ if ((res = eval_tree(tree->data.internal.left, matched_terminals)) == 0) res = eval_tree(tree->data.internal.right, matched_terminals); return (tree->op & NOTPAT) ? !res : res; } } else { fprintf(stderr, "Eval on bad tree: returning false\n"); return 0; /* safety sake, but cannot happen! 
*/ } } /* [first, last) = C-style range for which we want the words in terminal-values' patterns: 0..num_terminals for !ComplexBoolean, term/term otherwise */ int asplit_terminal(first, last, pat_buf, pat_ptr) int first, last; char *pat_buf; int *pat_ptr; { int word_length; int type; int num_pat; *pat_ptr = 0; num_pat = 0; for (; first= MAXNUM_PAT) { fprintf(stderr, "Warning: too many words in pattern (> %d): ignoring...\n", MAXNUM_PAT); break; } } return num_pat; } glimpse-4.18.7/agrep/bitap.c000066400000000000000000000436701300371307100156210ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ /* if the pattern is not simple fixed pattern, then after preprocessing */ /* and generating the masks, the program goes here. four cases: 1. */ /* the pattern is simple regular expression and no error, then do the */ /* matching here. 2. the pattern is simple regular expression and */ /* unit cost errors are allowed: then go to asearch(). */ /* 3. the pattern is simple regular expression, and the edit cost is */ /* not uniform, then go to asearch1(). */ /* if the pattern is regular expression then go to re() if M < 14, */ /* else go to re1() */ /* input parameters: old_D_pat: delimiter pattern. */ /* fd, input file descriptor, M: size of pattern, D: # of errors. 
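Illustration only (not part of bitap.c): case 1 above, exact matching with no errors, is a bit-parallel scan over the text. A minimal sketch of the textbook Shift-And form of that idea follows. The name shift_and_find is hypothetical, and unlike the loop in this file (which shifts right, as in r3 >> 1) the sketch shifts left, handles only literal patterns, and ignores record delimiters.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper, not part of agrep: exact search with the classic
// Shift-And scheme. Returns the offset of the first occurrence of pat in
// text, or -1 if there is none or the pattern does not fit in one word.
long shift_and_find(const char *text, const char *pat)
{
    unsigned long mask[UCHAR_MAX + 1] = {0};
    unsigned long state = 0, accept;
    size_t m, n, i;

    assert(text != NULL && pat != NULL);
    m = strlen(pat);
    n = strlen(text);

    // One state bit per pattern character.
    if (m == 0 || m > sizeof(unsigned long) * CHAR_BIT)
        return -1;

    // mask[c] has bit k set when pat[k] == c.
    for (i = 0; i < m; i++)
        mask[(unsigned char)pat[i]] |= 1UL << i;
    accept = 1UL << (m - 1);

    for (i = 0; i < n; i++) {
        // Extend every partial match by one character and start a new one.
        state = ((state << 1) | 1UL) & mask[(unsigned char)text[i]];
        if (state & accept)
            return (long)(i + 1 - m);
    }
    return -1;
}
```

For example, shift_and_find("approximate grep", "grep") reports offset 12. The per-character work is one shift, one OR and one AND regardless of pattern length, which is why the pattern length is capped at the machine word size.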
*/ #include #include "agrep.h" #include "memory.h" #include extern int CurrentByteOffset; extern unsigned Init1, D_endpos, endposition, Init[], Mask[], Bit[]; extern int LIMITOUTPUT, LIMITPERFILE; extern int DELIMITER, FILENAMEONLY, D_length, I, AND, REGEX, JUMP, INVERSE, PRINTFILETIME; extern char D_pattern[]; extern int TRUNCATE, DD, S; extern char Progname[], CurrentFileName[]; extern long CurrentFileTime; extern int num_of_matched, prev_num_of_matched; extern int agrep_initialfd; extern int EXITONERROR; extern int agrep_inlen; extern CHAR *agrep_inbuffer; extern int agrep_inpointer; extern CHAR *agrep_outbuffer; extern int agrep_outlen; extern int agrep_outpointer; extern FILE *agrep_finalfp; extern int errno; extern int NEW_FILE, POST_FILTER; /* bitap dispatches job */ int bitap(old_D_pat, Pattern, fd, M, D) char old_D_pat[], *Pattern; int fd, M, D; { unsigned char c; /* Patch to fix -n with ISO characters, "O.Bartunov" , S.Nazin (leng@sai.msu.su) */ register unsigned r1, r2, r3, CMask, i; register unsigned end, endpos, r_Init1; register unsigned D_Mask; int ResidueSize , FIRSTROUND, lasti, print_end, j, num_read; int k; CHAR *buffer; int NumBufferFills; D_length = strlen(old_D_pat); for(i=0; i 4) { fprintf(stderr, "%s: the maximum number of erorrs allowed for full regular expressions is 4\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } if (M <= SHORTREG) { return re(fd, M, D); /* SUN: need to find a even point */ } else { return re1(fd, M, D); } } if (D > 0 && JUMP == ON) { return asearch1(old_D_pat, fd, D); } if (D > 0) { return asearch(old_D_pat, fd, D); } if(I == 0) Init1 = (unsigned)037777777777; j=0; r_Init1 = Init1; r1 = r2 = r3 = Init[0]; endpos = D_endpos; D_Mask = D_endpos; for(i=1 ; i 0) { NumBufferFills++; i=Max_record; end = Max_record + num_read; if(FIRSTROUND) { i = Max_record - 1 ; if(DELIMITER) { for(k=0; k=D_length) j--; } FIRSTROUND = OFF; } if(num_read < BlockSize) { strncpy(buffer+Max_record+num_read, 
old_D_pat, D_length); end = end + D_length; buffer[end] = '\0'; } /* BITAP_PROCESS: the while-loop below */ while (i < end) { c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; r1 = r_Init1 & r3; r2 = (( r3 >> 1 ) & CMask) | r1; if ( r2 & endpos ) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; if(((AND == 1) && ((r2 & endposition) == endposition)) || ((AND == 0) && (r2 & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(fd, buffer); NEW_FILE = OFF; return 0; } print_end = i - D_length - 1; if ( ((fd != -1) && !(lasti >= Max_record+num_read - 1)) || ((fd == -1) && !(lasti >= num_read)) ) if (-1 == output(buffer, lasti, print_end, j - (NumBufferFills - 1))) { free_buf(fd, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(fd, buffer); return 0; /* done */ } } lasti = i - D_length; TRUNCATE = OFF; r2 = r3 = r1 = Init[0]; r1 = r_Init1 & r3; r2 = ((( r2 >> 1) & CMask) | r1 ) & D_Mask; if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; r1 = r_Init1 
& r2; r3 = (( r2 >> 1 ) & CMask) | r1; if ( r3 & endpos ) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; if(((AND == 1) && ((r3 & endposition) == endposition)) || ((AND == 0) && (r3 & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(fd, buffer); NEW_FILE = OFF; return 0; } print_end = i - D_length - 1; if ( ((fd != -1) && !(lasti >= Max_record+num_read - 1)) || ((fd == -1) && !(lasti >= num_read)) ) if (-1 == output(buffer, lasti, print_end, j - (NumBufferFills - 1))) { free_buf(fd, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(fd, buffer); return 0; /* done */ } } lasti = i - D_length ; TRUNCATE = OFF; r2 = r3 = r1 = Init[0]; r1 = r_Init1 & r2; r3 = ((( r2 >> 1) & CMask) | r1 ) & D_Mask; if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } } ResidueSize = num_read + Max_record - lasti; if(ResidueSize > Max_record) { ResidueSize = Max_record; TRUNCATE = ON; } strncpy(buffer+Max_record-ResidueSize, buffer+lasti, ResidueSize); lasti = Max_record - ResidueSize; if(lasti < 0) { lasti = 1; } if 
(((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(fd, buffer); return 0; /* done */ } } free_buf(fd, buffer); return 0; #if AGREP_POINTER } else { buffer = agrep_inbuffer; num_read = agrep_inlen; end = num_read; /* buffer[end-1] = '\n';*/ /* at end of the text. */ /* buffer[0] = '\n';*/ /* in front of the text. */ i = 0; lasti = 1; if(DELIMITER) { for(k=0; k=D_length) j--; } /* An exact copy of the above: BITAP_PROCESS: the while-loop below */ while (i < end) { c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; r1 = r_Init1 & r3; r2 = (( r3 >> 1 ) & CMask) | r1; if ( r2 & endpos ) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; if(((AND == 1) && ((r2 & endposition) == endposition)) || ((AND == 0) && (r2 & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(fd, buffer); NEW_FILE = OFF; return 0; } print_end = i - D_length - 1; if ( ((fd != -1) && !(lasti >= Max_record+num_read - 1)) || ((fd == -1) && !(lasti >= num_read)) ) if (-1 == output(buffer, lasti, print_end, j)) { free_buf(fd, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= 
num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(fd, buffer); return 0; /* done */ } } lasti = i - D_length; TRUNCATE = OFF; r2 = r3 = r1 = Init[0]; r1 = r_Init1 & r3; r2 = ((( r2 >> 1) & CMask) | r1 ) & D_Mask; if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; r1 = r_Init1 & r2; r3 = (( r2 >> 1 ) & CMask) | r1; if ( r3 & endpos ) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; if(((AND == 1) && ((r3 & endposition) == endposition)) || ((AND == 0) && (r3 & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(fd, buffer); NEW_FILE = OFF; return 0; } print_end = i - D_length - 1; if ( ((fd != -1) && !(lasti >= Max_record+num_read - 1)) || ((fd == -1) && !(lasti >= num_read)) ) if (-1 == output(buffer, lasti, print_end, j)) { free_buf(fd, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(fd, buffer); return 0; /* done */ } } lasti = i - D_length ; TRUNCATE = OFF; r2 = r3 = r1 = Init[0]; r1 = 
r_Init1 & r2; r3 = ((( r2 >> 1) & CMask) | r1 ) & D_Mask; if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } } return 0; } #endif /*AGREP_POINTER*/ } fill_buf(fd, buf, record_size) int fd, record_size; unsigned char *buf; { int num_read=1; int total_read=0; extern int glimpse_clientdied; static int havePending = 0; static int pendingChar = 0; if (fd >= 0) { /* Decrement record size so we have room for an appended * newline, if we might need one. */ if (0 == DELIMITER) { --record_size; } if (havePending) { havePending = 0; buf [total_read++] = pendingChar; } while(total_read < record_size && num_read > 0) { if (glimpse_clientdied) return 0; num_read = read(fd, buf+total_read, record_size - total_read); total_read = total_read + num_read; } if (0 < num_read) { /* We're stopping because the buffer is full. Save * the last char for the next time through. This * guarantees, if we just read the last char, that * on the next call we'll know that we still need to * append a delimiter, even though we didn't "read" * anything. */ havePending = 1; pendingChar = buf [--total_read]; } else { /* Stopping because we read the last char. This * resets state for the next call. */ havePending = 0; } if ((0 == num_read) && /* Reached end-of-file */ (0 < total_read) && /* Got something, maybe from pending */ (0 == DELIMITER) && /* Not expecting special delimiter */ ('\n' != buf [total_read-1])) { /* Default delimiter not present */ /* Add the default delimiter, so the last line of the * file (terminated with EOF instead of newline) isn't * quietly dropped. */ buf [total_read] = '\n'; ++total_read; } } #if AGREP_POINTER else return 0; /* should not call this function if buffer is a pointer to a user-specified region! */ #else /*AGREP_POINTER*/ else { /* simulate a file */ total_read = (record_size > (agrep_inlen - agrep_inpointer)) ? 
(agrep_inlen - agrep_inpointer) : record_size;
        memcpy(buf, agrep_inbuffer + agrep_inpointer, total_read);
        agrep_inpointer += total_read;
        /* printf("agrep_inpointer %d total_read %d\n", agrep_inpointer, total_read);*/
    }
#endif /*AGREP_POINTER*/
    if (glimpse_clientdied) return 0;
    return(total_read);
}

/*
 * In these functions no allocs/copying is done when
 * fd == -1, i.e., agrep is called to search within memory.
 */

void
alloc_buf(fd, buf, size)
int fd;
char **buf;
int size;
{
#if AGREP_POINTER
    if (fd != -1)
#endif /*AGREP_POINTER*/
        *buf = (char *)malloc(size);
}

void
free_buf(fd, buf)
int fd;
char *buf;
{
#if AGREP_POINTER
    if (fd != -1)
#endif /*AGREP_POINTER*/
        free(buf);
}
glimpse-4.18.7/agrep/bitap.c.orig000066400000000000000000000410561300371307100165540ustar00rootroot00000000000000
/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */
/* if the pattern is not simple fixed pattern, then after preprocessing */
/* and generating the masks, the program goes here. four cases: 1. */
/* the pattern is simple regular expression and no error, then do the */
/* matching here. 2. the pattern is simple regular expression and */
/* unit cost errors are allowed: then go to asearch(). */
/* 3. the pattern is simple regular expression, and the edit cost is */
/* not uniform, then go to asearch1(). */
/* if the pattern is regular expression then go to re() if M < 14, */
/* else go to re1() */
/* input parameters: old_D_pat: delimiter pattern. */
/* fd, input file descriptor, M: size of pattern, D: # of errors.
*/ #include "agrep.h" #include "memory.h" #include extern int CurrentByteOffset; extern unsigned Init1, D_endpos, endposition, Init[], Mask[], Bit[]; extern int LIMITOUTPUT, LIMITPERFILE; extern int DELIMITER, FILENAMEONLY, D_length, I, AND, REGEX, JUMP, INVERSE, PRINTFILETIME; extern char D_pattern[]; extern int TRUNCATE, DD, S; extern char Progname[], CurrentFileName[]; extern long CurrentFileTime; extern int num_of_matched, prev_num_of_matched; extern int agrep_initialfd; extern int EXITONERROR; extern int agrep_inlen; extern CHAR *agrep_inbuffer; extern int agrep_inpointer; extern CHAR *agrep_outbuffer; extern int agrep_outlen; extern int agrep_outpointer; extern FILE *agrep_finalfp; extern int errno; extern int NEW_FILE, POST_FILTER; /* bitap dispatches job */ int bitap(old_D_pat, Pattern, fd, M, D) char old_D_pat[], *Pattern; int fd, M, D; { unsigned char c; /* Patch to fix -n with ISO characters, "O.Bartunov" , S.Nazin (leng@sai.msu.su) */ register unsigned r1, r2, r3, CMask, i; register unsigned end, endpos, r_Init1; register unsigned D_Mask; int ResidueSize , FIRSTROUND, lasti, print_end, j, num_read; int k; CHAR *buffer; D_length = strlen(old_D_pat); for(i=0; i 4) { fprintf(stderr, "%s: the maximum number of erorrs allowed for full regular expressions is 4\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } if (M <= SHORTREG) { return re(fd, M, D); /* SUN: need to find a even point */ } else { return re1(fd, M, D); } } if (D > 0 && JUMP == ON) { return asearch1(old_D_pat, fd, D); } if (D > 0) { return asearch(old_D_pat, fd, D); } if(I == 0) Init1 = (unsigned)037777777777; j=0; r_Init1 = Init1; r1 = r2 = r3 = Init[0]; endpos = D_endpos; D_Mask = D_endpos; for(i=1 ; i 0) { i=Max_record; end = Max_record + num_read; if(FIRSTROUND) { i = Max_record - 1 ; if(DELIMITER) { for(k=0; k=D_length) j--; } FIRSTROUND = OFF; } if(num_read < BlockSize) { strncpy(buffer+Max_record+num_read, old_D_pat, D_length); end = end + D_length; 
buffer[end] = '\0'; } /* BITAP_PROCESS: the while-loop below */ while (i < end) { c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; r1 = r_Init1 & r3; r2 = (( r3 >> 1 ) & CMask) | r1; if ( r2 & endpos ) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; if(((AND == 1) && ((r2 & endposition) == endposition)) || ((AND == 0) && (r2 & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(fd, buffer); NEW_FILE = OFF; return 0; } print_end = i - D_length - 1; if ( ((fd != -1) && !(lasti >= Max_record+num_read - 1)) || ((fd == -1) && !(lasti >= num_read)) ) if (-1 == output(buffer, lasti, print_end, j)) { free_buf(fd, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(fd, buffer); return 0; /* done */ } } lasti = i - D_length; TRUNCATE = OFF; r2 = r3 = r1 = Init[0]; r1 = r_Init1 & r3; r2 = ((( r2 >> 1) & CMask) | r1 ) & D_Mask; if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; r1 = r_Init1 & r2; r3 = (( r2 >> 1 ) & CMask) | r1; if ( r3 & endpos ) { j++; if 
(DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; if(((AND == 1) && ((r3 & endposition) == endposition)) || ((AND == 0) && (r3 & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(fd, buffer); NEW_FILE = OFF; return 0; } print_end = i - D_length - 1; if ( ((fd != -1) && !(lasti >= Max_record+num_read - 1)) || ((fd == -1) && !(lasti >= num_read)) ) if (-1 == output(buffer, lasti, print_end, j)) { free_buf(fd, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(fd, buffer); return 0; /* done */ } } lasti = i - D_length ; TRUNCATE = OFF; r2 = r3 = r1 = Init[0]; r1 = r_Init1 & r2; r3 = ((( r2 >> 1) & CMask) | r1 ) & D_Mask; if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } } ResidueSize = num_read + Max_record - lasti; if(ResidueSize > Max_record) { ResidueSize = Max_record; TRUNCATE = ON; } strncpy(buffer+Max_record-ResidueSize, buffer+lasti, ResidueSize); lasti = Max_record - ResidueSize; if(lasti < 0) { lasti = 1; } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= 
num_of_matched - prev_num_of_matched))) { free_buf(fd, buffer); return 0; /* done */ } } free_buf(fd, buffer); return 0; #if AGREP_POINTER } else { buffer = agrep_inbuffer; num_read = agrep_inlen; end = num_read; /* buffer[end-1] = '\n';*/ /* at end of the text. */ /* buffer[0] = '\n';*/ /* in front of the text. */ i = 0; lasti = 1; if(DELIMITER) { for(k=0; k=D_length) j--; } /* An exact copy of the above: BITAP_PROCESS: the while-loop below */ while (i < end) { c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; r1 = r_Init1 & r3; r2 = (( r3 >> 1 ) & CMask) | r1; if ( r2 & endpos ) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; if(((AND == 1) && ((r2 & endposition) == endposition)) || ((AND == 0) && (r2 & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(fd, buffer); NEW_FILE = OFF; return 0; } print_end = i - D_length - 1; if ( ((fd != -1) && !(lasti >= Max_record+num_read - 1)) || ((fd == -1) && !(lasti >= num_read)) ) if (-1 == output(buffer, lasti, print_end, j)) { free_buf(fd, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { 
free_buf(fd, buffer); return 0; /* done */ } } lasti = i - D_length; TRUNCATE = OFF; r2 = r3 = r1 = Init[0]; r1 = r_Init1 & r3; r2 = ((( r2 >> 1) & CMask) | r1 ) & D_Mask; if (DELIMITER) CurrentByteOffset += 1*D_length; else CurrentByteOffset += 1*1; } c = buffer[i++]; CurrentByteOffset ++; CMask = Mask[c]; r1 = r_Init1 & r2; r3 = (( r2 >> 1 ) & CMask) | r1; if ( r3 & endpos ) { j++; if (DELIMITER) CurrentByteOffset -= D_length; else CurrentByteOffset -= 1; if(((AND == 1) && ((r3 & endposition) == endposition)) || ((AND == 0) && (r3 & endposition)) ^ INVERSE ) { if(FILENAMEONLY && (NEW_FILE || !POST_FILTER)) { num_of_matched++; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(fd, buffer); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(fd, buffer); NEW_FILE = OFF; return 0; } print_end = i - D_length - 1; if ( ((fd != -1) && !(lasti >= Max_record+num_read - 1)) || ((fd == -1) && !(lasti >= num_read)) ) if (-1 == output(buffer, lasti, print_end, j)) { free_buf(fd, buffer); return -1;} if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(fd, buffer); return 0; /* done */ } } lasti = i - D_length ; TRUNCATE = OFF; r2 = r3 = r1 = Init[0]; r1 = r_Init1 & r2; r3 = ((( r2 >> 1) & CMask) | r1 ) & D_Mask; if (DELIMITER) CurrentByteOffset += 
1*D_length;
                else CurrentByteOffset += 1*1;
            }
        }
        return 0;
    }
#endif /*AGREP_POINTER*/
}

fill_buf(fd, buf, record_size)
int fd, record_size;
unsigned char *buf;
{
    int num_read=1;
    int total_read=0;
    extern int glimpse_clientdied;

    if (fd >= 0) {
        while(total_read < record_size && num_read > 0) {
            if (glimpse_clientdied) return 0;
            num_read = read(fd, buf+total_read, record_size - total_read);
            total_read = total_read + num_read;
        }
    }
#if AGREP_POINTER
    else return 0; /* should not call this function if buffer is a pointer to a user-specified region! */
#else /*AGREP_POINTER*/
    else { /* simulate a file */
        total_read = (record_size > (agrep_inlen - agrep_inpointer)) ? (agrep_inlen - agrep_inpointer) : record_size;
        memcpy(buf, agrep_inbuffer + agrep_inpointer, total_read);
        agrep_inpointer += total_read;
        /* printf("agrep_inpointer %d total_read %d\n", agrep_inpointer, total_read);*/
    }
#endif /*AGREP_POINTER*/
    if (glimpse_clientdied) return 0;
    return(total_read);
}

/*
 * In these functions no allocs/copying is done when
 * fd == -1, i.e., agrep is called to search within memory.
 */

void
alloc_buf(fd, buf, size)
int fd;
char **buf;
int size;
{
#if AGREP_POINTER
    if (fd != -1)
#endif /*AGREP_POINTER*/
        *buf = (char *)malloc(size);
}

void
free_buf(fd, buf)
int fd;
char *buf;
{
#if AGREP_POINTER
    if (fd != -1)
#endif /*AGREP_POINTER*/
        free(buf);
}
glimpse-4.18.7/agrep/checkfile.c000066400000000000000000000045101300371307100164250ustar00rootroot00000000000000
/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved.
*/ /* * checkfile.c * takes a file name and checks to see if a file is a regular ascii file * */ #include #include #include #include #include #include #include "checkfile.h" #ifndef S_ISREG #define S_ISREG(mode) (0100000&(mode)) #endif #ifndef S_ISDIR #define S_ISDIR(mode) (0040000&(mode)) #endif #define MAXLINE 512 extern char Progname[]; extern int errno; unsigned char ibuf[MAXLINE]; /************************************************************************** * * check_file * input: filename or path (null-terminated character string) * returns: int (0 if file is a regular file, non-0 if not) * * uses stat(2) to see if a file is a regular file. * ***************************************************************************/ int check_file(fname) char *fname; { struct stat buf; if (my_stat(fname, &buf) != 0) { if (errno == ENOENT) return NOSUCHFILE; else return STATFAILED; } else { /* int ftype; if (S_ISREG(buf.st_mode)) { if ((ftype = samplefile(fname)) == ISASCIIFILE) { return ISASCIIFILE; } else if (ftype == ISBINARYFILE) { return ISBINARYFILE; } else if (ftype == OPENFAILED) { return OPENFAILED; } } if (S_ISDIR(buf.st_mode)) { return ISDIRECTORY; } if (S_ISBLK(buf.st_mode)) { return ISBLOCKFILE; } if (S_ISSOCK(buf.st_mode)) { return ISSOCKET; } */ return 0; } } /*************************************************************************** * * samplefile * reads in the first part of a file, and checks to see that it is * all ascii. 
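Illustration only (not part of checkfile.c): the sampling check described above amounts to scanning the bytes that were read and rejecting the sample at the first non-ASCII byte. The sketch below assumes that "ascii" means the high bit is clear; sample_is_ascii is a hypothetical name, and it works on a buffer that the caller has already filled rather than opening the file itself.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper, not taken from checkfile.c: report whether the
// first len bytes of buf are all 7-bit ASCII. Returns 1 if so, 0 otherwise.
int sample_is_ascii(const unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL || len == 0);
    for (i = 0; i < len; i++) {
        if (buf[i] > 0x7F)
            return 0;   // high bit set: treat the sample as binary
    }
    return 1;           // an empty sample counts as ASCII
}
```

Treating an empty sample as ASCII matches the disabled samplefile routine in this file, which also reports ISASCIIFILE when nothing could be read.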
 *
 ***************************************************************************/

/*
int samplefile(fname)
char *fname;
{
    char *p;
    int numread;
    int fd;

    if ((fd = open(fname, O_RDONLY)) == -1) {
        fprintf(stderr, "open failed on filename %s\n", fname);
        return OPENFAILED;
    }
    -comment- No need to use alloc_buf and free_buf here since we always read from a non-negative fd -tnemmoc-
    if (numread = fill_buf(fd, ibuf, MAXLINE)) {
        close(fd);
        p = ibuf;
        while (ISASCII(*p++) && --numread);
        if (!numread) {
            return(ISASCIIFILE);
        } else {
            return(ISBINARYFILE);
        }
    } else {
        close(fd);
        return(ISASCIIFILE);
    }
}
*/
glimpse-4.18.7/agrep/checkfile.h000066400000000000000000000003761300371307100164400ustar00rootroot00000000000000
/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */
#define NOSUCHFILE -3
#define OPENFAILED -2
#define STATFAILED -1
#define ISASCIIFILE 0
#define ISDIRECTORY 1
#define ISBLOCKFILE 2
#define ISSOCKET 3
#define ISBINARYFILE 4
glimpse-4.18.7/agrep/checksg.c000066400000000000000000000077131300371307100161270ustar00rootroot00000000000000
/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */
#include "agrep.h"
#include "checkfile.h"
#include
extern int errno;
extern CHAR Progname[MAXNAME];
extern int SGREP, PAT_FILE, PAT_BUFFER, EXITONERROR, SIMPLEPATTERN,
    CONSTANT, D, NOUPPER, JUMP, I, LINENUM, INVERSE, WORDBOUND, WHOLELINE,
    SILENT, DNA, BESTMATCH, DELIMITER;

/* Make it an interface routine that tells you whether mgrep can be used for the pattern or not: must sneak and access global variable D though... */
int
checksg(Pattern, D, set)
CHAR *Pattern;
int D;
int set; /* should I set flags SGREP and DNA?
not if called from glimpse via library */ { char c; int i, m; int NOTSGREP = 0; if (set) SGREP = OFF; m = strlen(Pattern); #if DEBUG fprintf(stderr, "checksg: len=%d, pat=%s, pat[len]=%d\n", m, Pattern, Pattern[m]); #endif if(!(PAT_FILE || PAT_BUFFER) && (m <= D)) { fprintf(stderr, "%s: size of pattern '%s' must be > #of errors %d\n", Progname, Pattern, D); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } SIMPLEPATTERN = ON; for (i=0; i < m; i++) { switch(Pattern[i]) { case ';' : SIMPLEPATTERN = OFF; goto outoffor; case ',' : SIMPLEPATTERN = OFF; goto outoffor; case '.' : SIMPLEPATTERN = OFF; goto outoffor; case '*' : SIMPLEPATTERN = OFF; goto outoffor; case '-' : SIMPLEPATTERN = OFF; goto outoffor; case '[' : SIMPLEPATTERN = OFF; goto outoffor; case ']' : SIMPLEPATTERN = OFF; goto outoffor; case '(' : SIMPLEPATTERN = OFF; goto outoffor; case ')' : SIMPLEPATTERN = OFF; goto outoffor; case '<' : SIMPLEPATTERN = OFF; goto outoffor; case '>' : SIMPLEPATTERN = OFF; goto outoffor; case '^' : /* NOTSGREP = 1; sgrep does it; bg 4/27/97 */ if(D > 0) { SIMPLEPATTERN = OFF; goto outoffor; } break; case '$' : /* NOTSGREP = 1; sgrep does it; bg 4/27/97 */ if(D > 0) { SIMPLEPATTERN = OFF; goto outoffor; } break; case '|' : SIMPLEPATTERN = OFF; goto outoffor; case '#' : SIMPLEPATTERN = OFF; goto outoffor; case '{': SIMPLEPATTERN = OFF; goto outoffor; case '}': SIMPLEPATTERN = OFF; goto outoffor; case '~': SIMPLEPATTERN = OFF; goto outoffor; case '\\' : { /* Should I DO the left shift Pattern including Pattern[m] which is '\0', or just ignore the next character after '\\'????? 
*/ if (set) { /* preprocess and maskgen figure out what to do */ i++; /* in addition to for loop ++ */ } else { /* maskgen won't be called if we can help it, so shift it to make it verbatim */ /* int j; for (j=i; j0)) return 0; /* errors, not simple */ if (NOUPPER && (D>0)) return 0; /* errors, not simple */ if (JUMP == ON) return 0; /* I, S, D costs, not simple */ if (DELIMITER) return 0; /* delimiters avoid mgrep */ if (I == 0) return 0; /* I has 0 cost not 1, not simple */ if (LINENUM) return 0; /* can't use mgrep, so not simple */ if (WORDBOUND && (D > 0)) return 0; /* errors, not simple */ if (WHOLELINE && (D > 0)) return 0; /* errors, not simple */ if (SILENT) return 1; /* dont care output, so dont care pat */ if (set) { if (!NOTSGREP || CONSTANT) SGREP = ON; if (m >= 16) DNA = ON; for(i=0; i set MUST be on */ for (i=0; i < m; i++) { switch(Pattern[i]) { case '\\' : for (j=i; j #include "agrep.h" #include extern int D; extern int FILENAMEONLY, APPROX, PAT_FILE, PAT_BUFFER, MULTI_OUTPUT, COUNT, INVERSE, BESTMATCH; extern FILEOUT; extern REGEX; extern DELIMITER; extern WHOLELINE; extern LINENUM; extern I, S, DD; extern JUMP; extern char Progname[MAXNAME]; extern int agrep_initialfd; extern int EXITONERROR; extern int errno; int compat() { if(BESTMATCH) { if(COUNT || FILENAMEONLY || APPROX || PAT_FILE) { BESTMATCH = 0; fprintf(stderr, "%s: -B option ignored when -c, -l, -f, or -# is on\n", Progname); } /* if (LINENUM) { BESTMATCH = 0; fprintf(stderr, "%s: -B option ignored when -n is on", Progname); *//* Currently, the BESTMATCH option disables -n but there * doesn't seem to be a reason for it. 
* compat.c modified while testing continues 10-26-2002 KAM *//* } */ } if (COUNT && LINENUM) { LINENUM = 0; fprintf(stderr, "%s: -n option ignored with -c\n", Progname); } if(PAT_FILE || PAT_BUFFER) { if(APPROX && (D > 0)) { fprintf(stderr, "%s: approximate matching is not supported with -f option\n", Progname); } /* if(INVERSE) { fprintf(stderr, "%s: -f and -v are not compatible\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } */ if(LINENUM) { fprintf(stderr, "%s: -f and -n are not compatible\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } /* if(DELIMITER) { fprintf(stderr, "%s: -f and -d are not compatible\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } */ } if (MULTI_OUTPUT && LINENUM) { fprintf(stderr, "%s: -M and -n are not compatible\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } if(JUMP) { if(REGEX) { fprintf(stderr, "%s: -D#, -I#, or -S# option is ignored for regular expression pattern\n", Progname); JUMP = 0; } if(I == 0 || S == 0 || DD == 0) { fprintf(stderr, "%s: the error cost cannot be 0\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } } if(DELIMITER) { if(WHOLELINE) { fprintf(stderr, "%s: -d and -x are not compatible\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } } if (INVERSE && (PAT_FILE || PAT_BUFFER) && MULTI_OUTPUT) { fprintf(stderr, "%s: -v and -M are not compatible\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } return 0; } glimpse-4.18.7/agrep/config.h000066400000000000000000000007661300371307100157730ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ /* * Definitions in this file will be visible throughout glimpse source code. * Any global flags or macros should be defined here.
*/

#if defined(__NeXT__)
#define getcwd(buf,size) getwd(buf) /* NB: unchecked target size--could overflow; BG: Ok since buffers are usually >= 256B */
#define S_ISREG(mode) (((mode) & (_S_IFMT)) == (_S_IFREG))
#define S_ISDIR(mode) (((mode) & (_S_IFMT)) == (_S_IFDIR))
#endif
glimpse-4.18.7/agrep/contribution.list000066400000000000000000000006431300371307100177630ustar00rootroot00000000000000
/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */

List of people (other than authors) who have contributed to agrep:

Chunghwa H. Rao
Gene Myers
Ricardo Baeza-Yates
Cliff Hathaway
Ric Anderson
Su-Ing Tsuei
Raphael Finkel
Andrew Hume
David W. Sanderson
William I. Chang
Jack Kirman
Dave Lutz
Tony Plate
Ken Lalonde
Mark Christopher
Dieter Becker
Ian Young
James M. Winget
John F. Stoffel
glimpse-4.18.7/agrep/defs.h000066400000000000000000000011331300371307100154340ustar00rootroot00000000000000
/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */

/* Must be the same as those defined in compress/defs.h */
#define SIGNATURE_LEN 16
#define TC_LITTLE_ENDIAN 1
#define TC_BIG_ENDIAN 0
#define TC_EASYSEARCH 0x1
#define TC_UNTILNEWLINE 0x2
#define TC_REMOVE 0x4
#define TC_OVERWRITE 0x8
#define TC_RECURSIVE 0x10
#define TC_ERRORMSGS 0x20
#define TC_SILENT 0x40
#define TC_NOPROMPT 0x80
#define TC_FILENAMESONSTDIN 0x100
#define COMP_SUFFIX ".CZ"
#define DEF_FREQ_FILE ".glimpse_quick"
#define DEF_HASH_FILE ".glimpse_compress"
#define DEF_STRING_FILE ".glimpse_uncompress"
glimpse-4.18.7/agrep/delim.c000066400000000000000000000064771300371307100156200ustar00rootroot00000000000000
/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved.
*/ #include "agrep.h" extern int EASYSEARCH, TCOMPRESSED; /* Accesses src completely before dest, so that dest can be = src */ void preprocess_delimiter(src, srclen, dest, pdestlen) unsigned char *src, *dest; int srclen, *pdestlen; { CHAR temp[Maxline]; int i, j; strcpy(temp, src); temp[srclen] = '\0'; for (i=0, j=0; i end) return -1; if (TCOMPRESSED == ON) return (exists_tcompressed_word(delim, len, begin, end - begin, EASYSEARCH)); for (curbegin = begin; curbegin + len <= end; curbegin ++) { for (curbuf = curbegin, curdelim = delim; curbuf < curbegin + len; curbuf ++, curdelim++) if (*curbuf != *curdelim) break; if (curbuf >= curbegin + len) return (curbegin - begin); } return -1; } /* return where delimiter begins or ends (=outtail): range = [begin, end) */ unsigned char * forward_delimiter(begin, end, delim, len, outtail) unsigned char *begin, *end, *delim; int len, outtail; { register unsigned char *curbegin, *curbuf, *curdelim; unsigned char *oldbegin = begin, *retval = begin; if (begin + len > end) { retval = end + 1; goto _ret; } if ((len == 1) && (*delim == '\n')) { begin ++; while ((begin < end) && (*begin != '\n')) begin ++; if (outtail && (*begin == '\n')) begin++; retval = begin; goto _ret; } if (TCOMPRESSED == ON) return forward_tcompressed_word(begin, end, delim, len, outtail, EASYSEARCH); for (curbegin = begin; curbegin + len <= end; curbegin ++) { for (curbuf = curbegin, curdelim = delim; curbuf < curbegin + len; curbuf ++, curdelim++) if (*curbuf != *curdelim) break; if (curbuf >= curbegin + len) break; } if (!outtail) retval = (curbegin <= end - len ? curbegin: end + 1); else retval = (curbegin <= end - len ? 
curbegin + len : end + 1); _ret: /* Guarantee that this skips at least one character */ if (retval <= oldbegin) return oldbegin + 1; else return retval; } /* return where the delimiter begins or ends (=outtail): range = [begin, end) */ unsigned char * backward_delimiter(end, begin, delim, len, outtail) unsigned char *end, *begin, *delim; int len, outtail; { register unsigned char *curbegin, *curbuf, *curdelim; if (end - len < begin) return begin; if ((len == 1) && (*delim == '\n')) { end --; while ((end > begin) && (*end != '\n')) end --; if (outtail && (*end == '\n')) end++; return end; } if (TCOMPRESSED == ON) return backward_tcompressed_word(end, begin, delim, len, outtail, EASYSEARCH); for (curbegin = end-len; curbegin >= begin; curbegin --) { for (curbuf = curbegin, curdelim = delim; curbuf < curbegin + len; curbuf ++, curdelim++) if (*curbuf != *curdelim) break; if (curbuf >= curbegin + len) break; } if (!outtail) return (curbegin >= begin ? curbegin : begin); else return (curbegin >= begin ? curbegin + len : begin); } glimpse-4.18.7/agrep/dummyfilters.c000066400000000000000000000026671300371307100172470ustar00rootroot00000000000000/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved.
*/ /* bgopal: used if search in compressed text files is not being performed */ /* Always say could not be compressed */ int quick_tcompress() { return 0; } /* Always say could not be uncompressed */ int quick_tuncompress() { return 0; } /* Always return uncompressible */ int tuncompressible() { return 0; } /* Always return uncompressible */ int tuncompressible_filename() { return 0; } /* Always return uncompressible */ int tuncompressible_file() { return 0; } /* Always return uncompressible */ int tuncompressible_fp() { return 0; } int exists_tcompressed_word() { return -1; } unsigned char * forward_tcompressed_word(begin, end, delim, len, outtail, flags) unsigned char *begin, *end, *delim; int len, outtail, flags; { return begin; } unsigned char * backward_tcompressed_word(end, begin, delim, len, outtail, flags) unsigned char *begin, *end, *delim; int len, outtail, flags; { return end; } int tcompress_file() { return 0; } int tuncompress_file() { return 0; } int initialize_tcompress() { return 0; } int initialize_tuncompress() { return 0; } int initialize_common() { return 0; } int uninitialize_tuncompress() { return 0; } int compute_dictionary() { return 0; } int uninitialize_common() { return 0; } int uninitialize_tcompress() { return 0; } int usemalloc = 0; int set_usemalloc() { return 0; } int unset_usemalloc() { return 0; } glimpse-4.18.7/agrep/dummysyscalls.c000066400000000000000000000011021300371307100174130ustar00rootroot00000000000000 /* These functions have been added here so that agrep/cast binaries will work independent of glimpse */ int my_open(name, flags, mode) char *name; int flags, mode; { return open(name, flags, mode); } FILE * my_fopen(name, flags) char *name; char *flags; { return fopen(name, flags); } int my_lstat(name, buf) char *name; struct stat *buf; { return lstat(name, buf); } int my_stat(name, buf) char *name; struct stat *buf; { return stat(name, buf); } int special_get_name(name, len, temp) char *name; int len; char *temp; { 
strcpy(temp, name); return 0; } glimpse-4.18.7/agrep/follow.c000066400000000000000000000116771300371307100160260ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ /* the functions in this file take a syntax tree for a regular expression and produce a DFA using the McNaughton-Yamada construction. */ #include #include #include #include #include "re.h" #define TRUE 1 extern Pset pset_union(); extern int pos_cnt; extern Re_node parse(); Re_lit_array lpos; /* extend_re() extends the RE by adding a ".*(" at the front and a ")" at the back. It returns NULL if memory cannot be allocated. */ char *extend_re(s) char *s; { char *s1; s1 = malloc((unsigned) strlen(s)+4+1); if (s1 == NULL) return NULL; return strcat(strcat(strcpy(s1, ".*("), s), ")"); } void free_pos(fpos, pos_cnt) Pset_array fpos; int pos_cnt; { Pset tpos, pos; int i; if ((fpos == NULL) || (*fpos == NULL)) return; for (i=0; i<=pos_cnt; i++) { pos = (*fpos)[i]; while (pos != NULL) { tpos = pos; pos = pos->nextpos; free(tpos); } } free(fpos); } /* Function to clear out a Ch_Set */ void free_cset(cset) Ch_Set cset; { Ch_Set tset; while (cset != NULL) { tset = cset; cset = cset->rest; free(tset->elt); free(tset); } } /* Function to clear out the tree of re-nodes */ void free_re(e) Re_node e; { if (e == NULL) return; /* * Was creating "reading freed memory", "freeing unallocated/freed memory" * errors. So abandoned it. Leaks are now up by 60B/call to 80B/call * -bg * Enabled on 26/Aug/1996 after changing pos routines to copy stuff rather than link up parents/children/etc. */ { Pset tpos, pos; int tofree = 0; if ((Lastpos(e)) != (Firstpos(e))) tofree = 1; pos = Lastpos(e); while (pos != NULL) { tpos = pos; pos = pos->nextpos; free(tpos); } Lastpos(e) = NULL; if (tofree) { pos = Firstpos(e); while (pos != NULL) { tpos = pos; pos = pos->nextpos; free(tpos); } Firstpos(e) = NULL; } } /* Enabled on 26/Aug/1996 after changing pos routines to copy stuff rather than link up parents/children/etc.
*/ switch (Op(e)) { case EOS: if (lit_type(Lit(e)) == C_SET) free_cset(lit_cset(Lit(e))); free(Lit(e)); break; case OPSTAR: free_re(Child(e)); break; case OPCAT: free_re(Lchild(e)); free_re(Rchild(e)); break; case OPOPT: free_re(Child(e)); break; case OPALT: free_re(Lchild(e)); free_re(Rchild(e)); break; case LITERAL: if (lit_type(Lit(e)) == C_SET) free_cset(lit_cset(Lit(e))); free(Lit(e)); break; default: fprintf(stderr, "free_re: unknown node type %d\n", Op(e)); } free(e); return; } /* mk_followpos() takes a syntax tree for a regular expression and traverses it once, computing the followpos function at each node and returns a pointer to an array whose ith element is a pointer to a list of position nodes, representing the positions in followpos(i). */ void mk_followpos_1(e, fpos) Re_node e; Pset_array fpos; { Pset pos; int i; switch (Op(e)) { case EOS: break; case OPSTAR: pos = Lastpos(e); while (pos != NULL) { i = pos->posnum; (*fpos)[i] = pset_union(Firstpos(e), (*fpos)[i], 1); pos = pos->nextpos; } mk_followpos_1(Child(e), fpos); break; case OPCAT: pos = Lastpos(Lchild(e)); while (pos != NULL) { i = pos->posnum; (*fpos)[i] = pset_union(Firstpos(Rchild(e)), (*fpos)[i], 1); pos = pos->nextpos; } mk_followpos_1(Lchild(e), fpos); mk_followpos_1(Rchild(e), fpos); break; case OPOPT: mk_followpos_1(Child(e), fpos); break; case OPALT: mk_followpos_1(Lchild(e), fpos); mk_followpos_1(Rchild(e), fpos); break; case LITERAL: break; default: fprintf(stderr, "mk_followpos: unknown node type %d\n", Op(e)); } return; } Pset_array mk_followpos(tree, npos) Re_node tree; int npos; { int i; Pset_array fpos; if (tree == NULL || npos < 0) return NULL; fpos = (Pset_array) malloc((unsigned) (npos+1)*sizeof(Pset)); if (fpos == NULL) return NULL; for (i = 0; i <= npos; i++) (*fpos)[i] = NULL; mk_followpos_1(tree, fpos); return fpos; } /* mk_poslist() sets a static array whose i_th element is a pointer to the RE-literal at position i. It returns 1 if everything is OK, 0 otherwise. 
*/ /* init performs initialization actions; it returns -1 in case of error, 0 if everything goes OK. */ int init(s, table) char *s; int table[32][32]; { Pset_array fpos; Re_node e; Pset l; int i, j; char *s1; if ((s1 = extend_re(s)) == NULL) return -1; if ((e = parse(s1)) == NULL) { free(s1); return -1; } free(s1); if ((fpos = mk_followpos(e, pos_cnt)) == NULL) { free_re(e); return -1; } for (i = 0; i <= pos_cnt; i += 1) { #ifdef Debug printf("followpos[%d] = ", i); #endif l = (*fpos)[i]; j = 0; for ( ; l != NULL; l = l->nextpos) { #ifdef Debug printf("%d ", l->posnum); #endif table[i][j] = l->posnum; j++; } #ifdef Debug printf("\n"); #endif } #ifdef Debug for (i=0; i <= pos_cnt; i += 1) { j = 0; while (table[i][j] != 0) { printf(" %d ", table[i][j]); j++; } printf("\n"); } #endif free_pos(fpos, pos_cnt); free_re(e); return (pos_cnt); } glimpse-4.18.7/agrep/io.c000066400000000000000000000030431300371307100151170ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ #include "agrep.h" /* AGREP_POINTER must be defined to be 1 always */ /* #define AGREP_POINTER 1 */ /* Removed since we now have a -DAGREP_POINTER=1 option in the Makefile */ fill_buf(fd, buf, record_size) int fd, record_size; unsigned char *buf; { int num_read=1; int total_read=0; if (fd >= 0) { --record_size; while(total_read < record_size && num_read > 0) { num_read = read(fd, buf+total_read, record_size - total_read); total_read = total_read + num_read; } if ((0 == num_read) && (0 < total_read)) { /* Add newline terminator */ buf [total_read] = '\n'; ++total_read; } } #if AGREP_POINTER else return 0; /* should not call this function if buf is a pointer to a user-specified region! */ #else /*AGREP_POINTER*/ else { /* simulate a file */ total_read = (record_size > (agrep_inlen - agrep_inpointer)) ? 
(agrep_inlen - agrep_inpointer) : record_size; memcpy(buf, agrep_inbuffer + agrep_inpointer, total_read); agrep_inpointer += total_read; } #endif /*AGREP_POINTER*/ return(total_read); } /* * In these functions no allocs/copying is done when * fd == -1, i.e., agrep is called to search within memory. */ alloc_buf(fd, buf, size) int fd; char **buf; int size; { #if AGREP_POINTER if (fd != -1) #endif /*AGREP_POINTER*/ *buf = (char *)malloc(size); } free_buf(fd, buf) int fd; char *buf; { #if AGREP_POINTER if (fd != -1) #endif /*AGREP_POINTER*/ free(buf); } glimpse-4.18.7/agrep/io.c.orig000066400000000000000000000025241300371307100160610ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ #include "agrep.h" /* AGREP_POINTER must be defined to be 1 always */ /* #define AGREP_POINTER 1 */ /* Removed since we now have a -DAGREP_POINTER=1 option in the Makefile */ fill_buf(fd, buf, record_size) int fd, record_size; unsigned char *buf; { int num_read=1; int total_read=0; if (fd >= 0) { while(total_read < record_size && num_read > 0) { num_read = read(fd, buf+total_read, record_size - total_read); total_read = total_read + num_read; } } #if AGREP_POINTER else return 0; /* should not call this function if buf is a pointer to a user-specified region! */ #else /*AGREP_POINTER*/ else { /* simulate a file */ total_read = (record_size > (agrep_inlen - agrep_inpointer)) ? (agrep_len - agrep_inpointer) : record_size; memcpy(buf, agrep_inbuffer + agrep_inpointer, total_read); agrep_inpointer += total_read; } #endif /*AGREP_POINTER*/ return(total_read); } /* * In these functions no allocs/copying is done when * fd == -1, i.e., agrep is called to search within memory. 
*/ alloc_buf(fd, buf, size) int fd; char **buf; int size; { #if AGREP_POINTER if (fd != -1) #endif /*AGREP_POINTER*/ *buf = (char *)malloc(size); } free_buf(fd, buf) int fd; char *buf; { #if AGREP_POINTER if (fd != -1) #endif /*AGREP_POINTER*/ free(buf); } glimpse-4.18.7/agrep/main.c000066400000000000000000000017211300371307100154350ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ #include #include "agrep.h" #if ISO_CHAR_SET #include /* support for 8bit character set: ew@senate.be */ #endif #if MEASURE_TIMES extern int INFILTER_ms, OUTFILTER_ms, FILTERALGO_ms; #endif /*MEASURE_TIMES*/ extern char Pattern[MAXPAT]; extern int EXITONERROR; #include "dummysyscalls.c" int main(argc, argv) int argc; char *argv[]; { int ret; #if ISO_CHAR_SET setlocale(LC_ALL,""); /* support for 8bit character set: ew@senate.be, Henrik.Martin@eua.ericsson.se, "O.Bartunov" , S.Nazin (leng@sai.msu.su) */ #endif EXITONERROR = 1; /* the only place where it is set to 1 */ ret = fileagrep(argc, argv, 0, stdout); #if MEASURE_TIMES fprintf(stderr, "ret = %d infilter = %d ms\toutfilter = %d ms\tfilteralgo = %d ms\n", ret, INFILTER_ms, OUTFILTER_ms, FILTERALGO_ms); #endif /*MEASURE_TIMES*/ if(ret<0) exit(2); if(ret==0) exit(1); exit(0); } glimpse-4.18.7/agrep/maskgen.c000066400000000000000000000143421300371307100161410ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. 
*/ #include "agrep.h" #include extern unsigned D_endpos, endposition, Init1, wildmask; extern Mask[], Bit[], Init[], NO_ERR_MASK; extern int AND, REGEX, NOUPPER, D_length; extern unsigned char Progname[]; extern int agrep_initialfd; extern int EXITONERROR; extern int errno; int maskgen(Pattern, D) unsigned char *Pattern; int D; { struct term { int flag; unsigned char class[WORD]; } position[WORD+10]; unsigned char c; int i, j, k, l, M, OR=0, EVEN = 0, base, No_error; #ifdef DEBUG fprintf(stderr, "maskgen: len=%d, pat=%s, D=%d\n", strlen(Pattern), Pattern, D); #endif for(i=0; i' (use \\<, \\> to search for <, >)\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } break; case LRANGE : if(No_error == ON) NO_ERR_MASK = NO_ERR_MASK | Bit[j]; i=i+1; if (Pattern[i] == NOTSYM) { position[j].flag = Compl; i++; } k=0; while (Pattern[i] != RRANGE && i < M) { if(Pattern[i] == HYPHEN) { position[j].class[k-1] = Pattern[i+1]; i=i+2; } else { position[j].class[k] = position[j].class[k+1] = Pattern[i]; k = k+2; i++; } } if(i == M) { fprintf(stderr, "%s: unmatched '[', ']' (use \\[, \\] to search for [, ])\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } position[j].class[k] = '\0'; j++; break; case RRANGE : fprintf(stderr, "%s: unmatched '[', ']' (use \\[, \\] to search for [, ])\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); break; case ORPAT : if(REGEX == ON || AND == ON) { fprintf(stderr, "illegal pattern: cannot handle OR (',') and AND (';')/regular-expressions simultaneously\n"); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } OR = ON; position[j].flag = 2; position[j].class[0] = '\0'; endposition = endposition | Bit[j++]; break; case ANDPAT : position[j].flag = 2; position[j].class[0] = '\0'; if(j > D_length) AND = ON; if(OR || (REGEX == ON && j>D_length)) { fprintf(stderr, "illegal pattern: cannot handle AND (';') and OR (',')/regular-expressions 
simultaneously\n"); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } endposition = endposition | Bit[j++]; break; /* case ' ' : if (Pattern[i-1] == ORPAT || Pattern[i-1] == ANDPAT) break; if(No_error == ON) NO_ERR_MASK = NO_ERR_MASK | Bit[j]; position[j].flag = 0; position[j].class[0] = position[j].class[1] = Pattern[i]; position[j++].class[2] = '\0'; break; */ case '\n' : NO_ERR_MASK = NO_ERR_MASK | Bit[j]; position[j].class[0] = position[j].class[1] = '\n'; position[j++].class[2] = '\0'; break; case WORDB : NO_ERR_MASK = NO_ERR_MASK | Bit[j]; position[j].class[0] = 1; position[j].class[1] = 47; position[j].class[2] = 58; position[j].class[3] = 64; position[j].class[4] = 91; position[j].class[5] = 96; position[j].class[6] = 123; position[j].class[7] = 127; position[j++].class[8] = '\0'; break; case NNLINE : NO_ERR_MASK |= Bit[j]; position[j].class[0] = position[j].class[1] = '\n'; position[j].class[2] = position[j].class[3] = NNLINE; position[j++].class[4] = '\0'; break; default : if(No_error == ON) NO_ERR_MASK = NO_ERR_MASK | Bit[j]; position[j].flag = 0; position[j].class[0] = position[j].class[1] = Pattern[i]; position[j++].class[2] = '\0'; } if(j > WORD) { fprintf(stderr, "%s: pattern too long (has > %d chars)\n", Progname, WORD); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } } if (EVEN != 0) { fprintf(stderr, "%s: unmatched '<', '>' (use \\<, \\> to search for <, >)\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } M = j - 1; base = WORD - M; wildmask = (wildmask >> base); endposition = (endposition >> base); NO_ERR_MASK = (NO_ERR_MASK >> 1) & (~Bit[1]); NO_ERR_MASK = ~NO_ERR_MASK >> (base-1); for (i=1; i<= WORD - M ; i++) Init[0] = Init[0] | Bit[i]; Init[0] = Init[0] | endposition; /* not necessary for Init[i], i>0, */ /* but at every beginning of the matching process append one no-match character to initialize the error vectors */ endposition = ( endposition << 1 ) + 1; Init1 =
(Init[0] | wildmask | endposition) ; D_endpos = ( endposition >> ( M - D_length ) ) << ( M - D_length); endposition = endposition ^ D_endpos; #ifdef DEBUG printf("endposition: %o\n", endposition); printf("no_err_mask: %o\n", NO_ERR_MASK); #endif for(c=0, i=0; i < MAXSYM; c++, i++) { for (k=1, l=0; k<=M ; k++, l=0) { while (position[k].class[l] != '\0') { if (position[k].class[l] == NOCARE && (c != '\n' || REGEX) ) { Mask[c] = Mask[c] | Bit[base + k]; break; } if (c >= position[k].class[l] && c <= position[k].class[l+1]) { Mask[c] = Mask[c] | Bit[base + k]; break; } l = l + 2; } if (position[k].flag == Compl) Mask[c] = Mask[c] ^ Bit[base+k]; } } if(NOUPPER) for(i=0; i #include #include #ifdef ultrix #include #endif #include #include "agrep.h" #include #define ddebug #define uchar unsigned char #undef MAXPAT #define MAXPAT 256 #undef MAXLINE #define MAXLINE 1024 #undef MAXSYM #define MAXSYM 256 #define MAXMEMBER1 32768 /* #define MAXMEMBER1 262144 */ /*2^18 */ #define MAXPATFILE 600000 #define BLOCKSIZE 16384 #define MAXHASH 32768 /* #define MAXHASH 262144 */ #define mask5 32767 #define max_num MAX_DASHF_FILES #if ISO_CHAR_SET #define W_DELIM 256 #else #define W_DELIM 128 #endif #define L_DELIM 10 #define Hbits 5 /* how much to shift to perform the hash */ extern char aduplicates[MAXNUM_PAT][MAXNUM_PAT]; /* tells what other patterns are exactly equal to the i-th one */ extern char tc_aduplicates[MAXNUM_PAT][MAXNUM_PAT]; /* tells what other patterns are exactly equal to the i-th one */ extern ParseTree aterminals[MAXNUM_PAT]; extern int AComplexBoolean; extern int LIMITOUTPUT, LIMITPERFILE; extern int BYTECOUNT, PRINTOFFSET, PRINTRECORD, CurrentByteOffset; extern int MULTI_OUTPUT; /* used by glimpse only if OR, never for AND */ extern int DELIMITER; extern CHAR D_pattern[MaxDelimit*2]; extern int D_length; extern CHAR tc_D_pattern[MaxDelimit*2]; extern int tc_D_length; extern COUNT, FNAME, SILENT, FILENAMEONLY, prev_num_of_matched, num_of_matched, PRINTFILETIME; 
extern INVERSE, OUTTAIL; extern WORDBOUND, WHOLELINE, NOUPPER; extern ParseTree *AParse; extern int AComplexPattern; extern unsigned char CurrentFileName[], Progname[]; extern long CurrentFileTime; extern total_line; extern agrep_initialfd; extern int EXITONERROR; extern int PRINTPATTERN; extern int agrep_inlen; extern CHAR *agrep_inbuffer; extern FILE *agrep_finalfp; extern int agrep_outpointer; extern int agrep_outlen; extern CHAR * agrep_outbuffer; extern int errno; extern int NEW_FILE, POST_FILTER; extern int tuncompressible(); extern int quick_tcompress(); extern int quick_tuncompress(); extern int TCOMPRESSED; extern int EASYSEARCH; extern char FREQ_FILE[MAX_LINE_LEN], HASH_FILE[MAX_LINE_LEN], STRING_FILE[MAX_LINE_LEN]; extern char PAT_FILE_NAME[MAX_LINE_LEN]; uchar SHIFT1[MAXMEMBER1]; int LONG = 0; int SHORT = 0; int p_size= 0; uchar tr[MAXSYM]; uchar tr1[MAXSYM]; int HASH[MAXHASH]; int Hash2[max_num]; uchar *PatPtr[max_num]; uchar *pat_spool = NULL; /* [MAXPATFILE+2*max_num+MAXPAT]; */ uchar *patt[max_num]; int pat_len[max_num]; int pat_indices[max_num]; /* pat_indices[p] gives the actual index in amatched_terminals: used only with AParse != 0 */ int num_pat; extern char amatched_terminals[MAXNUM_PAT]; /* which patterns have been matched in the current line?
Used only with AParse != 0, so max_num is not needed */ extern int anum_terminals; extern int AComplexBoolean; static void countline(); void acompute_duplicates(); #if DOTCOMPRESSED /* Equivalent variables for compression search */ uchar tc_SHIFT1[MAXMEMBER1]; int tc_LONG = 0; int tc_SHORT = 0; int tc_p_size= 0; uchar tc_tr[MAXSYM]; uchar tc_tr1[MAXSYM]; int tc_HASH[MAXHASH]; int tc_Hash2[max_num]; uchar *tc_PatPtr[max_num]; uchar *tc_pat_spool = NULL; /* [MAXPATFILE+2*max_num+MAXPAT]; */ uchar *tc_patt[max_num]; int tc_pat_len[max_num]; int tc_pat_indices[max_num]; /* pat_indices[p] gives the actual index in amatched_terminals: used only with AParse != 0 */ int tc_num_pat; /* must be the same as num_pat */ #endif /*DOTCOMPRESSED*/ static void f_prep(); static void f_prep1(); static void accumulate(); #if DOTCOMPRESSED static void tc_f_prep(); static void tc_f_prep1(); static void tc_accumulate(); #endif #ifdef perf_check int cshift=0, cshift0=0, chash=0; #endif /* * General idea behind output processing with delimiters, inverse, compression, etc. * CAUTION: In compressed files, we can search ONLY for simple patterns or their ;,. * Attempts to search for complex patterns / with errors might lead to spurious matches. * 1. Once we find the match, go back and forward to get the delimiters that surround * the matched region. * 2. If it is a compressed file, verify that the match is "real" (compressed files * can have pseudo matches, hence this filtering step is required). * 3. Increment num_of_matched. * 4. Process some output options which print stuff before the matched region is * printed. * 5. If there is compression, decompress and output the matched region. Otherwise * just output it as is. Remember, from step (1) we know the matched region. * 6. If inverse is set, then we must keep track of the end of the last matched region * in the variable lastout.
When there is a match, we must print everything from * lastout to the beginning of the current matched region (curtextbegin) and then * update lastout to point to the end of the current matched region (curtextend). * ALSO: if we exit from the main loops, we must output everything from the end * of the last matched region to the end of the input buffer. * 7. Delimiter handling in complex patterns is different: there the search is done * for a boolean and of the delimiter pattern and the actual pattern. * 8. For convenience and speed, the multipattern matching routines to handle * compressed files have been separated from their (normal) counterparts. * 9. One special note on handling complicated boolean patterns: the parse * tree will be the same for both compressed and uncomrpessed patterns and the * same amatched_terminals array will be used in both. BUT, the pat_spool and * pat_index, etc., will be different as they refer to the individual terminals. */ int prepf(mfp, mbuf, mlen) int mfp, mlen; unsigned char *mbuf; { int length=0, i, p=1; uchar *pat_ptr; unsigned Mask = 31; int num_read; unsigned char *buf; struct stat stbuf; int j, k; /* to implement \\ */ if ((mfp == -1) && ((mbuf == NULL) || (mlen <= 0))) return -1; if (mfp != -1) { if (fstat(mfp, &stbuf) == -1) { fprintf(stderr, "%s: cannot stat file: %s\n", Progname, PAT_FILE_NAME); return -1; } if (!S_ISREG(stbuf.st_mode)) { fprintf(stderr, "%s: pattern file not regular file: %s\n", Progname, PAT_FILE_NAME); return -1; } if (stbuf.st_size*2 > MAXPATFILE + 2*max_num) { fprintf(stderr, "%s: pattern file too large (> %d B): %s\n", Progname, (MAXPATFILE+2*max_num)/2, PAT_FILE_NAME); return -1; } if (pat_spool != NULL) free(pat_spool); pat_ptr = pat_spool = (unsigned char *)malloc(stbuf.st_size*2 + MAXPAT); alloc_buf(mfp, &buf, MAXPATFILE+2*BlockSize); while((num_read = fill_buf(mfp, buf+length, 2*BlockSize)) > 0) { length = length + num_read; if(length > MAXPATFILE) { fprintf(stderr, "%s: maximum pattern file 
size is %d\n", Progname, MAXPATFILE); if (!EXITONERROR) { errno = AGREP_ERROR; free_buf(mfp, buf); return -1; } else exit(2); } } } else { buf = mbuf; length = mlen; if (mlen*2 > MAXPATFILE + 2*max_num) { fprintf(stderr, "%s: pattern buffer too large (> %d B)\n", Progname, (MAXPATFILE+2*max_num)/2); return -1; } if (pat_spool != NULL) free(pat_spool); pat_ptr = pat_spool = (unsigned char *)malloc(mlen*2 + MAXPAT); } /* Now all the patterns are in buf */ buf[length] = '\n'; i=0; p=1; /* removed by Udi 11/8/94 - we now do WORDBOUND "by hand" if(WORDBOUND) { while(imax_num) { fprintf(stderr, "%s: maximum number of patterns is %d\n", Progname, max_num); if (!EXITONERROR) { errno = AGREP_ERROR; free_buf(mfp, buf); return -1; } else exit(2); } for(i=1; i<20; i++) *pat_ptr = i; /* boundary safety zone */ /* I might have to keep changing tr s.t. mgrep won't get confused with W_DELIM */ for(i=0; i< MAXSYM; i++) tr[i] = i; if(NOUPPER) { for (i=0; i 1) && ((patt[i][p-1] == '^') || (patt[i][p-1] == '$')) && (patt[i][p-2] != '\\')) patt[i][p-1] = '\n'; /* Added by bg, Dec 2nd 1994 */ for (k=0; k0?p-2:1):p); changed by Udi 11/8/94 */ #ifdef debug printf("prepf(): patt[%d]=%s, pat_len[%d]=%d\n", i, patt[i], i, pat_len[i]); #endif if(p!=0 && p < p_size) p_size = p; /* MIN */ } if(p_size == 0) { fprintf(stderr, "%s: the pattern file is empty\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; free_buf(mfp, buf); return -1; } else exit(2); } if(length > 400 && p_size > 2) LONG = 1; if(p_size == 1) SHORT = 1; for(i=0; i MAXPATFILE + 2*max_num) { fprintf(stderr, "%s: pattern buffer too large (> %d B)\n", Progname, (MAXPATFILE+2*max_num)/2); return -1; } if (tc_pat_spool != NULL) free(tc_pat_spool); pat_ptr = tc_pat_spool = (unsigned char *)malloc(length*2 + MAXPAT); #if MEASURE_TIMES gettimeofday(&initt, NULL); #endif /*MEASURE_TIMES*/ i=0; p=1; while(i < length) { tc_patt[p] = pat_ptr; while((*pat_ptr = buf[i++]) != '\n') pat_ptr++; *pat_ptr++ = 0; if ((tc_length = 
quick_tcompress(FREQ_FILE, HASH_FILE, tc_patt[p], strlen(tc_patt[p]), tc_buf, MAXPAT * 2 - 8, TC_EASYSEARCH)) > 0) { memcpy(tc_patt[p], tc_buf, tc_length); tc_patt[p][tc_length] = '\0'; pat_ptr = tc_patt[p] + tc_length + 1; /* character after '\0' */ } p++; } for(i=1; i<20; i++) *pat_ptr = i; /* boundary safety zone */ /* Ignore all other options: it is automatically W_DELIM */ for(i=0; i< MAXSYM; i++) tc_tr[i] = i; for(i=0; i< MAXSYM; i++) tc_tr1[i] = tc_tr[i]&Mask; tc_num_pat = p-1; tc_p_size = MAXPAT; for(i=1; i<=num_pat; i++) { p = strlen(tc_patt[i]); tc_pat_len[i] = p; #ifdef debug printf("prepf(): tc_patt[%d]=%s, tc_pat_len[%d]=%d\n", i, tc_patt[i], i, tc_pat_len[i]); #endif if(p!=0 && p < tc_p_size) tc_p_size = p; /* MIN */ } if(tc_p_size == 0) { /* cannot happen NOW */ fprintf(stderr, "%s: the pattern file is empty\n", Progname); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } if(length > 400 && tc_p_size > 2) tc_LONG = 1; if(tc_p_size == 1) tc_SHORT = 1; for(i=0; i 0) { buf_end = end = Max_record + num_read -1 ; oldCurrentByteOffset = CurrentByteOffset; if (first_time) { if ((TCOMPRESSED == ON) && tuncompressible(text+Max_record, num_read)) { EASYSEARCH = text[Max_record+SIGNATURE_LEN-1]; start += SIGNATURE_LEN; CurrentByteOffset += SIGNATURE_LEN; if (!EASYSEARCH) { fprintf(stderr, "not compressed for easy-search: can miss some matches in: %s\n", CurrentFileName); } } else TCOMPRESSED = OFF; first_time = 0; } if (!DELIMITER) { while(text[end] != r_newline && end > Max_record) end--; text[start-1] = r_newline; } else { unsigned char *newbuf = text + end + 1; newbuf = backward_delimiter(newbuf, text+Max_record, D_pattern, D_length, OUTTAIL); /* see agrep.c/'d' */ if (newbuf < text+Max_record+D_length) newbuf = text + end + 1; end = newbuf - text - 1; memcpy(text+start-D_length, D_pattern, D_length); } residue = buf_end - end + 1 ; if(INVERSE && COUNT) countline(text+Max_record, num_read); /* MGREP_PROCESS */ if (TCOMPRESSED) { /* 
separate functions since separate globals => too many if-statements within a single function makes it slow */ #if DOTCOMPRESSED if(tc_SHORT) { if (-1 == tc_m_short(text, start, end)) {free_buf(fd, text); return -1;}} else { if (-1 == tc_monkey1(text, start, end)) {free_buf(fd, text); return -1;}} #endif /*DOTCOMPRESSED*/ } else { if(SHORT) { if (-1 == m_short(text, start, end)) {free_buf(fd, text); return -1;}} else { if (-1 == monkey1(text, start, end)) {free_buf(fd, text); return -1;}} } if(FILENAMEONLY && (num_of_matched - prev_num_of_matched) && (NEW_FILE || !POST_FILTER)) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(fd, text); NEW_FILE = OFF; return 0; } CurrentByteOffset = oldCurrentByteOffset + end - start + 1; start = Max_record - residue; if(start < 0) { start = 1; } strncpy(text+start, text+end, residue); if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(fd, text); return 0; /* done */ } } /* end of while(num_read = ... 
*/ if (!DELIMITER) { text[start-1] = '\n'; text[start+residue] = '\n'; } else { if (start > D_length) memcpy(text+start-D_length, D_pattern, D_length); memcpy(text+start+residue, D_pattern, D_length); } end = start + residue; if(residue > 1) { if (TCOMPRESSED) { #if DOTCOMPRESSED if(tc_SHORT) tc_m_short(text, start, end); else tc_monkey1(text, start, end); #endif /*DOTCOMPRESSED*/ } else { if(SHORT) m_short(text, start, end); else monkey1(text, start, end); } if(FILENAMEONLY && (num_of_matched - prev_num_of_matched) && (NEW_FILE || !POST_FILTER)) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(fd, text); NEW_FILE = OFF; return 0; } } free_buf(fd, text); return (0); #if AGREP_POINTER } else { text = (unsigned char *)agrep_inbuffer; num_read = agrep_inlen; start = 0; buf_end = end = num_read - 1; oldCurrentByteOffset = CurrentByteOffset; if (first_time) { if ((TCOMPRESSED == ON) && tuncompressible(text+Max_record, num_read)) { EASYSEARCH = text[Max_record+SIGNATURE_LEN-1]; start += SIGNATURE_LEN; CurrentByteOffset += SIGNATURE_LEN; if (!EASYSEARCH) { fprintf(stderr, "not compressed for easy-search: can miss some matches in: %s\n", CurrentFileName); } } else TCOMPRESSED = OFF; first_time = 0; } if (!DELIMITER) while(text[end] != r_newline && end > 1) end--; else { unsigned char *newbuf 
= text + end + 1; newbuf = backward_delimiter(newbuf, text, D_pattern, D_length, OUTTAIL); /* see agrep.c/'d' */ if (newbuf < text+D_length) newbuf = text + end + 1; end = newbuf - text - 1; } /* text[0] = text[end] = r_newline; : the user must ensure that the delimiter is there at text[0] and occurs somewhere before text[end] */ if (INVERSE && COUNT) countline(text, num_read); /* An exact copy of the above MGREP_PROCESS */ if (TCOMPRESSED) { /* separate functions since separate globals => too many if-statements within a single function makes it slow */ #if DOTCOMPRESSED if(tc_SHORT) { if (-1 == tc_m_short(text, start, end)) {free_buf(fd, text); return -1;}} else { if (-1 == tc_monkey1(text, start, end)) {free_buf(fd, text); return -1;}} #endif /*DOTCOMPRESSED*/ } else { if(SHORT) { if (-1 == m_short(text, start, end)) {free_buf(fd, text); return -1;}} else { if (-1 == monkey1(text, start, end)) {free_buf(fd, text); return -1;}} } if(FILENAMEONLY && (num_of_matched - prev_num_of_matched) && (NEW_FILE || !POST_FILTER)) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(fd, text); NEW_FILE = OFF; return 0; } return 0; } #endif /*AGREP_POINTER*/ #ifdef perf_check fprintf(stderr,"Shifted %d times; shift=0 %d times; hash was = %d times\n",cshift, cshift0, chash); return 0; #endif } /* end 
mgrep */ static void countline(text, len) unsigned char *text; int len; { int i; for (i=0; i= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else agrep_outbuffer[agrep_outpointer ++] = prevstring[0]; } for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, ":%c", nextchar); else { if (agrep_outpointer+2>= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else { agrep_outbuffer[agrep_outpointer++] = ':'; agrep_outbuffer[agrep_outpointer++] = nextchar; } } NEW_FILE = OFF; PRINTED = 1; } if (PRINTPATTERN) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%d- ", pat_index); else { char s[32]; int outindex; sprintf(s, "%d- ", pat_index); for(outindex=0; (outindex+agrep_outpointer textend) return 0; if (WORDBOUND) { if (isalnum(*(unsigned char *)qx)) goto skip_output; if (isalnum(*(unsigned char *)(text-m1-1))) goto skip_output; } if (!DOWITHMASK) { /* Don't update CurrentByteOffset here: only before outputting properly */ if (!DELIMITER) { curtextbegin = text; while((curtextbegin > textbegin) && (*(--curtextbegin) != '\n')); if (*curtextbegin == '\n') curtextbegin ++; curtextend = curtextbegin /*text-m1*/; while((curtextend < textend) && (*curtextend != '\n')) curtextend ++; if (*curtextend == '\n') curtextend ++; } else { curtextbegin = backward_delimiter(text, textbegin, D_pattern, D_length, OUTTAIL); curtextend = forward_delimiter(curtextbegin /*text-m1*/, textend, D_pattern, D_length, OUTTAIL); } if (!OUTTAIL || INVERSE) textbegin = curtextend; else if (DELIMITER) textbegin = curtextend - D_length; else textbegin = curtextend - 1; } DOWITHMASK = 1; if (pat_index <= anum_terminals) { int 
iii; amatched_terminals[pat_index - 1] = 1; for (iii=0; iii= agrep_outlen) {\ OUTPUT_OVERFLOW;\ return -1;\ }\ else {\ memcpy(agrep_outbuffer + agrep_outpointer, curtextbegin, curtextend-curtextbegin);\ agrep_outpointer += curtextend - curtextbegin;\ }\ }\ }\ else if (PRINTED) {\ if (agrep_finalfp != NULL) fputc('\n', agrep_finalfp);\ else agrep_outbuffer[agrep_outpointer ++] = '\n';\ PRINTED = 0;\ }\ if ((change_text) && MULTI_OUTPUT) { /* next match starting from end of current */\ CurrentByteOffset += (oldtext + pat_len[pat_index] - 1 - text);\ text = oldtext + pat_len[pat_index] - 1;\ MATCHED = 0;\ }\ else if (change_text) {\ CurrentByteOffset += textbegin - text;\ text = textbegin;\ }\ }\ else { /* INVERSE */\ /* if(lastout < curtextbegin) OUT=1; */\ if (!SILENT) {\ if (agrep_finalfp != NULL)\ fwrite(lastout, 1, curtextbegin-lastout, agrep_finalfp);\ else {\ if (curtextbegin - lastout + agrep_outpointer >= agrep_outlen) {\ OUTPUT_OVERFLOW;\ return -1;\ }\ memcpy(agrep_outbuffer+agrep_outpointer, lastout, curtextbegin-lastout);\ agrep_outpointer += (curtextbegin-lastout);\ }\ }\ lastout=textbegin;\ if (change_text) {\ CurrentByteOffset += textbegin - text;\ text = textbegin;\ }\ }\ }\ else if (change_text) { /* COUNT */\ CurrentByteOffset += textbegin - text;\ text = textbegin;\ }\ if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) ||\ ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) return 0; /* done */\ DO_OUTPUT(1) } skip_output: if (MATCHED && !MULTI_OUTPUT && !AComplexBoolean) break; /* else look for more possible matches since we never know how many will match */ if (DOWITHMASK && (text >= curtextend - 1)) { DOWITHMASK = 0; if (AComplexBoolean && dd(curtextbegin, curtextend) && eval_tree(AParse, amatched_terminals)) { DO_OUTPUT(0) } if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); } } /* If I found some match and I am about to cross over a delimiter, then set DOWITHMASK to 0 and zero out the 
amatched_terminals */ if (DOWITHMASK && (text >= curtextend - 1)) { DOWITHMASK = 0; if (AComplexBoolean && dd(curtextbegin, curtextend) && eval_tree(AParse, amatched_terminals)) { DO_OUTPUT(0) } if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); } if(!MATCHED) shift = 1; /* || MULTI_OUTPUT is implicit */ else { MATCHED = 0; shift = m1 - 1 > 0 ? m1 - 1 : 1; } } /* If I found some match and I am about to cross over a delimiter, then set DOWITHMASK to 0 and zero out the amatched_terminals */ if (DOWITHMASK && (text >= curtextend - 1)) { DOWITHMASK = 0; if (AComplexBoolean && dd(curtextbegin, curtextend) && eval_tree(AParse, amatched_terminals)) { DO_OUTPUT(0) } if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); } text += shift; CurrentByteOffset += shift; } /* Do residual stuff: check if there was a match at the end of the line | check if rest of the buffer needs to be output due to inverse */ if (DOWITHMASK && (text >= curtextend - 1)) { DOWITHMASK = 0; if (AComplexBoolean && dd(curtextbegin, curtextend) && eval_tree(AParse, amatched_terminals)) { DO_OUTPUT(0) } if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); } if(INVERSE && !COUNT && (lastout <= textend)) { if (!SILENT) { if (agrep_finalfp != NULL) { while(lastout <= textend) fputc(*lastout++, agrep_finalfp); } else { if (textend - lastout + 1 + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, lastout, textend-lastout+1); agrep_outpointer += (textend-lastout+1); lastout = textend; } } } return 0; } #if DOTCOMPRESSED int tc_monkey1( text, start, end ) int start, end; register unsigned char *text; { int PRINTED = 0; unsigned char *oldtext; int pat_index; register uchar *textend; unsigned char *textbegin; unsigned char *curtextend; unsigned char *curtextbegin; register unsigned hash; register uchar shift; register int m1, Long=LONG; int MATCHED=0; register uchar *qx; register uchar *px; register int p, p_end; 
uchar *lastout; /* int OUT=0; */ int hash2; int j; int DOWITHMASK; struct timeval initt, finalt; int newlen; DOWITHMASK = 0; if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); textbegin = text + start; textend = text + end; m1 = tc_p_size-1; lastout = text+start; text = text + start + m1 -1; /* -1 to allow match to the first \n in case the pattern has ^ in front of it */ /* WORDBOUND adjustment not required */ while (text <= textend) { hash=tc_tr1[*text]; hash=(hash< textend) return 0; if (!DOWITHMASK) { /* Don't update CurrentByteOffset here: only before outputting properly */ if (!DELIMITER) { curtextbegin = text; while((curtextbegin > textbegin) && (*(--curtextbegin) != '\n')); if (*curtextbegin == '\n') curtextbegin ++; curtextend = curtextbegin /*text-m1*/; while((curtextend < textend) && (*curtextend != '\n')) curtextend ++; if (*curtextend == '\n') curtextend ++; } else { curtextbegin = backward_delimiter(text, textbegin, tc_D_pattern, tc_D_length, OUTTAIL); curtextend = forward_delimiter(curtextbegin /*text-m1*/, textend, tc_D_pattern, tc_D_length, OUTTAIL); } } /* else prev curtextbegin is OK: if full AND isn't found, DOWITHMASK is 0-ed so that we search at most 1 line below */ #if MEASURE_TIMES gettimeofday(&initt, NULL); #endif /*MEASURE_TIMES*/ /* Was it really a match in the compressed line from prev line in text to text + strlen(tc_pat_len[pat_index]? 
*/ if (-1==exists_tcompressed_word(tc_PatPtr[p], tc_pat_len[pat_index], curtextbegin, text - curtextbegin + tc_pat_len[pat_index], EASYSEARCH)) goto skip_output; #if MEASURE_TIMES gettimeofday(&finalt, NULL); FILTERALGO_ms += (finalt.tv_sec *1000 + finalt.tv_usec/1000) - (initt.tv_sec*1000 + initt.tv_usec/1000); #endif /*MEASURE_TIMES*/ if (!DOWITHMASK) { if (!OUTTAIL || INVERSE) textbegin = curtextend; else if (DELIMITER) textbegin = curtextend - D_length; else textbegin = curtextend - 1; } DOWITHMASK = 1; if (pat_index <= anum_terminals) { int iii; amatched_terminals[pat_index - 1] = 1; for (iii=0; iii 0) {\ if (newlen + agrep_outpointer >= agrep_outlen) {\ OUTPUT_OVERFLOW;\ return -1;\ }\ agrep_outpointer += newlen;\ }\ }\ /* #if MEASURE_TIMES\ gettimeofday(&finalt, NULL);\ OUTFILTER_ms += (finalt.tv_sec* 1000 + finalt.tv_usec/1000) - (initt.tv_sec*1000 + initt.tv_usec/1000);\ */ /*#endif MEASURE_TIMES */\ }\ else if (PRINTED) {\ if (agrep_finalfp != NULL) fputc('\n', agrep_finalfp);\ else agrep_outbuffer[agrep_outpointer ++] = '\n';\ PRINTED = 0;\ }\ if ((change_text) && MULTI_OUTPUT) { /* next match starting from end of current */\ CurrentByteOffset += (oldtext + tc_pat_len[pat_index] - 1 - text);\ text = oldtext + tc_pat_len[pat_index] - 1;\ MATCHED = 0;\ }\ else if (change_text) {\ CurrentByteOffset += textbegin - text;\ text = textbegin;\ }\ }\ else { /* INVERSE: Don't care about filtering time */\ /* if(lastout < curtextbegin) OUT=1; */\ if (!SILENT) {\ if (agrep_finalfp != NULL)\ newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, curtextbegin - lastout, agrep_finalfp, -1, EASYSEARCH);\ else {\ if ((newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, curtextbegin - lastout, agrep_outbuffer, agrep_outlen - agrep_outpointer, EASYSEARCH)) > 0) {\ if (newlen + agrep_outpointer >= agrep_outlen) {\ OUTPUT_OVERFLOW;\ return -1;\ }\ agrep_outpointer += newlen;\ }\ }\ }\ lastout=textbegin;\ if (change_text) {\ CurrentByteOffset += textbegin - 
text;\ text = textbegin;\ }\ }\ }\ else if (change_text) {\ CurrentByteOffset += textbegin - text;\ text = textbegin;\ }\ if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) ||\ ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) return 0; /* done */\ DO_OUTPUT(1) } skip_output: if (MATCHED && !MULTI_OUTPUT && !AComplexBoolean) break; /* else look for more possible matches since we never know how many will match */ if (DOWITHMASK && (text >= curtextend - 1)) { DOWITHMASK = 0; if (AComplexBoolean && dd(curtextbegin, curtextend) && eval_tree(AParse, amatched_terminals)) { DO_OUTPUT(0) } if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); } } /* If I found some match and I am about to cross over a delimiter, then set DOWITHMASK to 0 and zero out the amatched_terminals */ if (DOWITHMASK && (text >= curtextend - 1)) { DOWITHMASK = 0; if (AComplexBoolean && dd(curtextbegin, curtextend) && eval_tree(AParse, amatched_terminals)) { DO_OUTPUT(0) } if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); } if(!MATCHED) shift = 1; /* || MULTI_OUTPUT is implicit */ else { MATCHED = 0; shift = m1 - 1 > 0 ? 
m1 - 1 : 1; } } /* If I found some match and I am about to cross over a delimiter, then set DOWITHMASK to 0 and zero out the amatched_terminals */ if (DOWITHMASK && (text >= curtextend - 1)) { DOWITHMASK = 0; if (AComplexBoolean && dd(curtextbegin, curtextend) && eval_tree(AParse, amatched_terminals)) { DO_OUTPUT(0) } if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); } text += shift; CurrentByteOffset += shift; } /* Do residual stuff: check if there was a match at the end of the line | check if rest of the buffer needs to be output due to inverse */ if (DOWITHMASK && (text >= curtextend - 1)) { DOWITHMASK = 0; if (AComplexBoolean && dd(curtextbegin, curtextend) && eval_tree(AParse, amatched_terminals)) { DO_OUTPUT(0) } if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); } if (INVERSE && !COUNT && (lastout <= textend)) { if (!SILENT) { if (agrep_finalfp != NULL) newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, textend - lastout + 1, agrep_finalfp, -1, EASYSEARCH); else { if ((newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, textend - lastout + 1, agrep_outbuffer, agrep_outlen - agrep_outpointer, EASYSEARCH)) > 0) { if (newlen + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } } } return 0; } #endif /*DOTCOMPRESSED*/ /* shift is always 1: slight change in MATCHED semantics: it is set to 1 even if COUNT is set: previously, it wasn't set. Will it affect m_short?
*/ int m_short(text, start, end) int start, end; register uchar *text; { int m1=1; int PRINTED = 0; int pat_index; unsigned char *oldtext; register uchar *textend; unsigned char *textbegin; unsigned char *curtextend; unsigned char *curtextbegin; register int p, p_end; int MATCHED=0; /* int OUT=0; */ uchar *lastout; uchar *qx; uchar *px; int j; int DOWITHMASK; DOWITHMASK = 0; if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); textend = text + end; lastout = text + start; textbegin = text + start; text = text + start - 1 ; /* if (WORDBOUND || WHOLELINE) text = text-1; */ if (WHOLELINE) text = text-1; /* to accomodate the extra 2 W_delim */ while (++text <= textend) { CurrentByteOffset ++; p = HASH[tr[*text]]; p_end = HASH[tr[*text]+1]; while(p++ < p_end) { if (((pat_index = pat_indices[p]) <= 0) || (pat_len[pat_index] <= 0)) continue; #ifdef debug printf("m_short(): p=%d pat_index=%d off=%d\n", p, pat_index, textend - text); #endif px = PatPtr[p]; qx = text; while((*px!=0)&&(tr[*px] == tr[*qx])) { px++; qx++; } if (*px == 0) { if(text >= textend) return 0; if (WORDBOUND) { if (isalnum(*(unsigned char *)qx)) goto skip_output; if (isalnum(*(unsigned char *)(text-1))) goto skip_output; } if (!DOWITHMASK) { /* Don't update CurrentByteOffset here: only before outputting properly */ if (!DELIMITER) { curtextbegin = text; while((curtextbegin > textbegin) && (*(--curtextbegin) != '\n')); if (*curtextbegin == '\n') curtextbegin ++; curtextend = curtextbegin /*text-m1*/; while((curtextend < textend) && (*curtextend != '\n')) curtextend ++; if (*curtextend == '\n') curtextend ++; } else { curtextbegin = backward_delimiter(text, textbegin, D_pattern, D_length, OUTTAIL); curtextend = forward_delimiter(curtextbegin /*text-m1*/, textend, D_pattern, D_length, OUTTAIL); } if (!OUTTAIL || INVERSE) textbegin = curtextend; else if (DELIMITER) textbegin = curtextend - D_length; else textbegin = curtextend - 1; } /* else prev curtextbegin is OK: if full AND isn't found, 
DOWITHMASK is 0-ed so that we search at most 1 line below */ DOWITHMASK = 1; if (pat_index <= anum_terminals) { int iii; amatched_terminals[pat_index - 1] = 1; for (iii=0; iii= agrep_outlen) {\ OUTPUT_OVERFLOW;\ return -1;\ }\ else {\ memcpy(agrep_outbuffer + agrep_outpointer, curtextbegin, curtextend-curtextbegin);\ agrep_outpointer += curtextend - curtextbegin;\ }\ }\ }\ else if (PRINTED) {\ if (agrep_finalfp != NULL) fputc('\n', agrep_finalfp);\ else agrep_outbuffer[agrep_outpointer ++] = '\n';\ PRINTED = 0;\ }\ if ((change_text) && MULTI_OUTPUT) { /* next match starting from end of current */\ CurrentByteOffset += (oldtext + pat_len[pat_index] - 1 - text);\ text = oldtext + pat_len[pat_index] - 1;\ MATCHED = 0;\ }\ else if (change_text) {\ CurrentByteOffset += textbegin - text;\ text = textbegin;\ }\ }\ else {\ /* if(lastout < curtextbegin) OUT=1; */\ if (!SILENT) {\ if (agrep_finalfp != NULL)\ fwrite(lastout, 1, curtextbegin-lastout, agrep_finalfp);\ else {\ if (curtextbegin - lastout + agrep_outpointer >= agrep_outlen) {\ OUTPUT_OVERFLOW;\ return -1;\ }\ memcpy(agrep_outbuffer+agrep_outpointer, lastout, curtextbegin-lastout);\ agrep_outpointer += (curtextbegin-lastout);\ }\ }\ lastout=textbegin;\ if (change_text) {\ CurrentByteOffset += textbegin - text;\ text = textbegin;\ }\ }\ }\ else if (change_text) {\ CurrentByteOffset += textbegin - text;\ text = textbegin;\ }\ if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) ||\ ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) return 0; /* done */\ DO_OUTPUT(1) } skip_output: if(MATCHED && !MULTI_OUTPUT && !AComplexBoolean) break; /* else look for more possible matches */ if (DOWITHMASK && (text >= curtextend - 1)) { DOWITHMASK = 0; if (AComplexBoolean && dd(curtextbegin, curtextend) && eval_tree(AParse, amatched_terminals)) { DO_OUTPUT(0) } if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); } } /* If I found some match and I am about to cross over a delimiter, 
then set DOWITHMASK to 0 and zero out the amatched_terminals */ if (DOWITHMASK && (text >= curtextend - 1)) { DOWITHMASK = 0; if (AComplexBoolean && dd(curtextbegin, curtextend) && eval_tree(AParse, amatched_terminals)) { DO_OUTPUT(0) } if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); } if (MATCHED) text --; MATCHED = 0; } /* while */ CurrentByteOffset ++; /* Do residual stuff: check if there was a match at the end of the line | check if rest of the buffer needs to be output due to inverse */ if (DOWITHMASK && (text >= curtextend - 1)) { DOWITHMASK = 0; if (AComplexBoolean && dd(curtextbegin, curtextend) && eval_tree(AParse, amatched_terminals)) { DO_OUTPUT(0) } if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); } if(INVERSE && !COUNT && (lastout <= textend)) { if (!SILENT) { if (agrep_finalfp != NULL) { while(lastout <= textend) fputc(*lastout++, agrep_finalfp); } else { if (textend - lastout + 1 + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } /* copy exactly the range that was bounds-checked above (up to textend, as in monkey1) */ memcpy(agrep_outbuffer+agrep_outpointer, lastout, textend-lastout+1); agrep_outpointer += (textend-lastout+1); lastout = textend; } } } return 0; } #if DOTCOMPRESSED /* shift is always 1: slight change in MATCHED semantics: it is set to 1 even if COUNT is set: previously, it wasn't set. Will it affect m_short?
*/ int tc_m_short(text, start, end) int start, end; register uchar *text; { int m1=1; int PRINTED = 0; int pat_index; unsigned char *oldtext; register uchar *textend; unsigned char *textbegin; unsigned char *curtextend; unsigned char *curtextbegin; register int p, p_end; int MATCHED=0; /* int OUT=0; */ uchar *lastout; uchar *qx; uchar *px; int j; int DOWITHMASK; struct timeval initt, finalt; int newlen; DOWITHMASK = 0; if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); textend = text + end; lastout = text + start; text = text + start - 1 ; textbegin = text + start; /* WORDBOUND adjustment not required */ while (++text <= textend) { CurrentByteOffset ++; p = tc_HASH[tc_tr[*text]]; p_end = tc_HASH[tc_tr[*text]+1]; while(p++ < p_end) { if (((pat_index = tc_pat_indices[p]) <= 0) || (tc_pat_len[pat_index] <= 0)) continue; #ifdef debug printf("m_short(): p=%d pat_index=%d off=%d\n", p, pat_index, textend - text); #endif px = tc_PatPtr[p]; qx = text; while((*px!=0)&&(tc_tr[*px] == tc_tr[*qx])) { px++; qx++; } if (*px == 0) { if(text >= textend) return 0; if (!DOWITHMASK) { /* Don't update CurrentByteOffset here: only before outputting properly */ if (!DELIMITER) { curtextbegin = text; while((curtextbegin > textbegin) && (*(--curtextbegin) != '\n')); if (*curtextbegin == '\n') curtextbegin ++; curtextend = curtextbegin /*text-m1*/; while((curtextend < textend) && (*curtextend != '\n')) curtextend ++; if (*curtextend == '\n') curtextend ++; } else { curtextbegin = backward_delimiter(text, textbegin, tc_D_pattern, tc_D_length, OUTTAIL); curtextend = forward_delimiter(curtextbegin /*text-m1*/, textend, tc_D_pattern, tc_D_length, OUTTAIL); } } /* else prev curtextbegin is OK: if full AND isn't found, DOWITHMASK is 0-ed so that we search at most 1 line below */ #if MEASURE_TIMES gettimeofday(&initt, NULL); #endif /*MEASURE_TIMES*/ /* Was it really a match in the compressed line from prev line in text to text + strlen(tc_pat_len[pat_index]? 
*/ if (-1 == exists_tcompressed_word(tc_PatPtr[p], tc_pat_len[pat_index], curtextbegin, text - curtextbegin + tc_pat_len[pat_index], EASYSEARCH)) goto skip_output; #if MEASURE_TIMES gettimeofday(&finalt, NULL); FILTERALGO_ms += (finalt.tv_sec *1000 + finalt.tv_usec/1000) - (initt.tv_sec*1000 + initt.tv_usec/1000); #endif /*MEASURE_TIMES*/ if (!DOWITHMASK) { if (!OUTTAIL || INVERSE) textbegin = curtextend; else if (DELIMITER) textbegin = curtextend - D_length; else textbegin = curtextend - 1; } DOWITHMASK = 1; if (pat_index <= anum_terminals) { int iii; amatched_terminals[pat_index - 1] = 1; for (iii=0; iii 0) {\ if (newlen + agrep_outpointer >= agrep_outlen) {\ OUTPUT_OVERFLOW;\ return -1;\ }\ agrep_outpointer += newlen;\ }\ }\ /*#if MEASURE_TIMES\ gettimeofday(&finalt, NULL);\ OUTFILTER_ms += (finalt.tv_sec* 1000 + finalt.tv_usec/1000) - (initt.tv_sec*1000 + initt.tv_usec/1000);\ */ /*#endif MEASURE_TIMES*/\ }\ else if (PRINTED) {\ if (agrep_finalfp != NULL) fputc('\n', agrep_finalfp);\ else agrep_outbuffer[agrep_outpointer ++] = '\n';\ PRINTED = 0;\ }\ if ((change_text) && MULTI_OUTPUT) { /* next match starting from end of current */\ CurrentByteOffset += (oldtext + tc_pat_len[pat_index] - 1 - text);\ text = oldtext + tc_pat_len[pat_index] - 1;\ MATCHED = 0;\ }\ else if (change_text) {\ CurrentByteOffset += textbegin - text;\ text = textbegin;\ }\ }\ else { /* INVERSE: Don't care about filtering time */\ /* if(lastout < curtextbegin) OUT=1; */\ if (!SILENT) {\ if (agrep_finalfp != NULL)\ newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, curtextbegin - lastout, agrep_finalfp, -1, EASYSEARCH);\ else {\ if ((newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, curtextbegin - lastout, agrep_outbuffer, agrep_outlen - agrep_outpointer, EASYSEARCH)) > 0) {\ if (newlen + agrep_outpointer >= agrep_outlen) {\ OUTPUT_OVERFLOW;\ return -1;\ }\ agrep_outpointer += newlen;\ }\ }\ }\ lastout=textbegin;\ if (change_text) {\ CurrentByteOffset += textbegin - 
text;\ text = textbegin;\ }\ }\ }\ else if (change_text) {\ CurrentByteOffset += textbegin - text;\ text = textbegin;\ }\ if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) ||\ ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) return 0; /* done */\ DO_OUTPUT(1) } skip_output: if(MATCHED && !MULTI_OUTPUT && !AComplexBoolean) break; /* else look for more possible matches */ if (DOWITHMASK && (text >= curtextend - 1)) { DOWITHMASK = 0; if (AComplexBoolean && dd(curtextbegin, curtextend) && eval_tree(AParse, amatched_terminals)) { DO_OUTPUT(0) } if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); } } /* If I found some match and I am about to cross over a delimiter, then set DOWITHMASK to 0 and zero out the amatched_terminals */ if (DOWITHMASK && (text >= curtextend - 1)) { DOWITHMASK = 0; if (AComplexBoolean && dd(curtextbegin, curtextend) && eval_tree(AParse, amatched_terminals)) { DO_OUTPUT(0) } if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); } if (MATCHED) text--; MATCHED = 0; } /* while */ CurrentByteOffset ++; /* Do residual stuff: check if there was a match at the end of the line | check if rest of the buffer needs to be output due to inverse */ if (DOWITHMASK && (text >= curtextend - 1)) { DOWITHMASK = 0; if (AComplexBoolean && dd(curtextbegin, curtextend) && eval_tree(AParse, amatched_terminals)) { DO_OUTPUT(0) } if (AParse != 0) memset(amatched_terminals, '\0', anum_terminals); } if (INVERSE && !COUNT && (lastout <= textend)) { if (!SILENT) { if (agrep_finalfp != NULL) newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, textend - lastout + 1, agrep_finalfp, -1, EASYSEARCH); else { if ((newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, textend - lastout + 1, agrep_outbuffer, agrep_outlen - agrep_outpointer, EASYSEARCH)) > 0) { if (newlen + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } } } return 0; } #endif 
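/*
 * ILLUSTRATIVE SKETCH ONLY; not part of agrep and not called by it.
 *
 * m_short() above handles the case where the shortest pattern has length 1:
 * for every text position it looks at the bucket of patterns that start with
 * the current character (the range HASH[tr[*text]] .. HASH[tr[*text]+1]) and
 * compares each candidate character by character. The stand-alone sketch
 * below shows that first-character bucketing idea in isolation.
 *
 * Assumptions, all hypothetical and introduced only for this sketch:
 *   - the names sketch_index, sketch_build and sketch_count;
 *   - buckets are built with a plain counting sort (count, prefix-sum, fill),
 *     which is a guess at the intent of the table setup, not a transcription;
 *   - no translation table (tr), no WORDBOUND or delimiter handling, and no
 *     output: the sketch only counts (pattern, position) hits.
 */
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_NSYM   256   /* one bucket per possible first byte */
#define SKETCH_MAXPAT 64    /* arbitrary small limit for the sketch */

struct sketch_index {
    int start[SKETCH_NSYM + 1];      /* bucket c is pat[start[c] .. start[c+1]-1] */
    const char *pat[SKETCH_MAXPAT];  /* patterns grouped by first byte */
};

/* Group n non-empty patterns by their first byte. Returns 0, or -1 on bad input. */
static int sketch_build(struct sketch_index *ix, const char **pats, int n)
{
    int fill[SKETCH_NSYM];
    int i, c;

    assert(ix != NULL && pats != NULL);
    if (n < 0 || n > SKETCH_MAXPAT) return -1;

    memset(ix->start, 0, sizeof ix->start);
    for (i = 0; i < n; i++) {                   /* count patterns per first byte */
        if (pats[i][0] == '\0') return -1;      /* empty pattern: reject */
        ix->start[(unsigned char)pats[i][0] + 1]++;
    }
    for (c = 0; c < SKETCH_NSYM; c++)           /* prefix sums give bucket starts */
        ix->start[c + 1] += ix->start[c];
    for (c = 0; c < SKETCH_NSYM; c++)
        fill[c] = ix->start[c];
    for (i = 0; i < n; i++)                     /* drop each pattern into its bucket */
        ix->pat[fill[(unsigned char)pats[i][0]]++] = pats[i];
    return 0;
}

/* Count every (pattern, position) pair at which a pattern occurs in text. */
static int sketch_count(const struct sketch_index *ix, const char *text, size_t len)
{
    size_t pos;
    int hits = 0;

    for (pos = 0; pos < len; pos++) {
        unsigned char c = (unsigned char)text[pos];
        int p;
        /* only patterns whose first byte equals text[pos] are candidates */
        for (p = ix->start[c]; p < ix->start[c + 1]; p++) {
            size_t m = strlen(ix->pat[p]);
            if (m <= len - pos && memcmp(text + pos, ix->pat[p], m) == 0)
                hits++;
        }
    }
    return hits;
}
/*
 * Usage note for the sketch: build the index once with sketch_build(), then
 * call sketch_count() on each buffer. Shift is always 1 here, as in m_short();
 * the longer-pattern routines (monkey1) instead skip ahead using a shift table.
 */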
/*DOTCOMPRESSED*/ static void f_prep(pat_index, Pattern) uchar *Pattern; int pat_index; { int i, m; register unsigned hash=0; #ifdef debug puts(Pattern); #endif m = p_size; for (i=m-1; i>=(1+LONG); i--) { hash = (tr1[Pattern[i]]); hash = (hash << Hbits) + (tr1[Pattern[i-1]]); if(LONG) hash = (hash << Hbits) + (tr1[Pattern[i-2]] ); if(SHIFT1[hash] >= m-1-i) SHIFT1[hash] = m-1-i; } i=m-1; hash = (tr1[Pattern[i]]); hash = (hash << Hbits) + (tr1[Pattern[i-1]]); if(LONG) hash = (hash << Hbits) + (tr1[Pattern[i-2]] ); if(SHORT) hash=tr[Pattern[0]]; #ifdef debug printf("hash = %d\n", hash); #endif HASH[hash]++; return; } #if DOTCOMPRESSED static void tc_f_prep(pat_index, Pattern) uchar *Pattern; int pat_index; { int i, m; register unsigned hash=0; #ifdef debug puts(Pattern); #endif m = tc_p_size; for (i=m-1; i>=(1+tc_LONG); i--) { hash = (tc_tr1[Pattern[i]]); hash = (hash << Hbits) + (tc_tr1[Pattern[i-1]]); if(tc_LONG) hash = (hash << Hbits) + (tc_tr1[Pattern[i-2]] ); if(tc_SHIFT1[hash] >= m-1-i) tc_SHIFT1[hash] = m-1-i; } i=m-1; hash = (tc_tr1[Pattern[i]]); hash = (hash << Hbits) + (tc_tr1[Pattern[i-1]]); if(tc_LONG) hash = (hash << Hbits) + (tc_tr1[Pattern[i-2]] ); if(tc_SHORT) hash=tc_tr[Pattern[0]]; #ifdef debug printf("hash = %d\n", hash); #endif tc_HASH[hash]++; return; } #endif /*DOTCOMPRESSED*/ static void f_prep1(pat_index, Pattern) uchar *Pattern; int pat_index; { int i, m; int hash2; register unsigned hash; m = p_size; #ifdef debug puts(Pattern); #endif for (i=m-1; i>=(1+LONG); i--) { hash = (tr1[Pattern[i]]); hash = (hash << Hbits) + (tr1[Pattern[i-1]]); if(LONG) hash = (hash << Hbits) + (tr1[Pattern[i-2]] ); if(SHIFT1[hash] >= m-1-i) SHIFT1[hash] = m-1-i; } i=m-1; hash = (tr1[Pattern[i]]); hash = (hash << Hbits) + (tr1[Pattern[i-1]]); if(LONG) hash = (hash << Hbits) + (tr1[Pattern[i-2]] ); if(SHORT) hash=tr[Pattern[0]]; hash2 = (tr[Pattern[0]] << 8) + tr[Pattern[1]]; #ifdef debug printf("hash = %d, HASH[hash] = %d\n", hash, HASH[hash]); #endif 
PatPtr[HASH[hash]] = Pattern; pat_indices[HASH[hash]] = pat_index; Hash2[HASH[hash]] = hash2; HASH[hash]--; return; } #if DOTCOMPRESSED static void tc_f_prep1(pat_index, Pattern) uchar *Pattern; int pat_index; { int i, m; int hash2; register unsigned hash; m = tc_p_size; #ifdef debug puts(Pattern); #endif for (i=m-1; i>=(1+tc_LONG); i--) { hash = (tc_tr1[Pattern[i]]); hash = (hash << Hbits) + (tc_tr1[Pattern[i-1]]); if(tc_LONG) hash = (hash << Hbits) + (tc_tr1[Pattern[i-2]] ); if(tc_SHIFT1[hash] >= m-1-i) tc_SHIFT1[hash] = m-1-i; } i=m-1; hash = (tc_tr1[Pattern[i]]); hash = (hash << Hbits) + (tc_tr1[Pattern[i-1]]); if(tc_LONG) hash = (hash << Hbits) + (tc_tr1[Pattern[i-2]] ); if(tc_SHORT) hash=tc_tr[Pattern[0]]; hash2 = (tc_tr[Pattern[0]] << 8) + tc_tr[Pattern[1]]; #ifdef debug printf("hash = %d, tc_HASH[hash] = %d\n", hash, tc_HASH[hash]); #endif tc_PatPtr[tc_HASH[hash]] = Pattern; tc_pat_indices[tc_HASH[hash]] = pat_index; tc_Hash2[tc_HASH[hash]] = hash2; tc_HASH[hash]--; return; } #endif /*DOTCOMPRESSED*/ static void accumulate() { int i; for(i=1; i #include "re.h" #define FALSE 0 #define TRUE 1 #define NextChar(s) *(*s)++ #define Unexpected(s, c) (**s == NUL || **s == c) #define Invalid_range(x, y) (x == NUL || x == '-' || x == ']' || x < y) extern Stack Push(); extern Re_node Pop(); extern Re_node Top(); extern int Size(); extern Pset pset_union(); extern Pset create_pos(); extern void free_re(); int final_pos, pos_cnt = 0; /* retract_token() moves the string pointer back, effectively "unseeing" the last character seen. It is used only to retract a right paren -- the idea is that the incarnation of parse_re() that saw the corresponding left paren is supposed to take care of matching the right paren. This is necessary to prevent recursive calls from mistakenly eating up someone else's right parens. */ #define retract_token(s) --(*s) /* mk_leaf() creates a leaf node that is (usually) a literal node. 
*/ Re_node mk_leaf(opval, type, ch, cset) short opval, type; char ch; Ch_Set cset; { Re_node node; Re_Lit l; new_node(Re_Lit, l, l); new_node(Re_node, node, node); if (l == NULL || node == NULL) { if (l != NULL) free(l); if (node != NULL) free(node); return NULL; } lit_type(l) = type; lit_pos(l) = pos_cnt++; if (type == C_SET) lit_cset(l) = cset; else lit_char(l) = ch; /* type == C_LIT */ Op(node) = opval; Lit(node) = l; Nullable(node) = FALSE; Firstpos(node) = create_pos(lit_pos(l)); Lastpos(node) = Firstpos(node); return node; } /* parse_cset() takes a pointer to a pointer to a string and parses a prefix of it denoting a character set literal. It returns a pointer to a Re_node node, NULL if there is an error. */ Re_node parse_cset(s) char **s; { Ch_Set cs_ptr, curr_ptr, prev_ptr; char ch; Ch_Range range = NULL; if (Unexpected(s, ']')) return NULL; new_node(Ch_Set, curr_ptr, curr_ptr); cs_ptr = curr_ptr; while (!Unexpected(s, ']')) { new_node(Ch_Range, range, range); curr_ptr->elt = range; ch = NextChar(s); if (ch == '-') { free(range); free(curr_ptr); return NULL; /* invalid range */ } range->low_bd = ch; if (**s == NUL) { free(range); free(curr_ptr); return NULL; } else if (**s == '-') { /* character range */ (*s)++; if (Invalid_range(**s, ch)) { free(range); free(curr_ptr); return NULL; } else range->hi_bd = NextChar(s); } else range->hi_bd = ch; prev_ptr = curr_ptr; new_node(Ch_Set, curr_ptr, curr_ptr); prev_ptr->rest = curr_ptr; }; if (**s == ']') { prev_ptr->rest = NULL; return mk_leaf(LITERAL, C_SET, NUL, cs_ptr); } else { if (range != NULL) free(range); free(curr_ptr); return NULL; } } /* parse_cset */ /* parse_wildcard() "parses" a wildcard -- a wildcard is treated as a character range whose values span all ASCII values. parse_wildcard() creates a node representing such a range. 
*/ Re_node parse_wildcard() { Ch_Set s; Ch_Range r; new_node(Ch_Range, r, r); r->low_bd = ASCII_MIN; /* smallest ASCII value */ r->hi_bd = ASCII_MAX; /* greatest ASCII value */ new_node(Ch_Set, s, s); s->elt = r; s->rest = NULL; return mk_leaf(LITERAL, C_SET, NUL, s); } /* parse_chlit() parses a character literal. It is assumed that the character in question does not have any special meaning. It returns a pointer to a node for that literal. */ Re_node parse_chlit(ch) char ch; { if (ch == NUL) return NULL; else return mk_leaf(LITERAL, C_LIT, ch, NULL); } /* routine to free the malloced token */ void free_tok(next_token) Tok_node next_token; { if (next_token == NULL) return; switch (tok_type(next_token)) { case LITERAL: free_re(tok_val(next_token)); case EOS: case RPAREN: case LPAREN: case OPSTAR: case OPALT: case OPOPT: default: free(next_token); break; } } /* get_token() returns the next token -- this may be a character literal, a character set, an escaped character, a punctuation (i.e. parenthesis), or an operator. It traverses the character string representing the RE, given by a pointer s; leaves s positioned immediately after the unit it parsed, and returns a pointer to a token node for that unit. */ Tok_node get_token(s) char **s; { Tok_node rn = NULL; if (s == NULL || *s == NULL) return NULL; /* error */ new_node(Tok_node, rn, rn); if (**s == NUL) tok_type(rn) = EOS; /* end of string */ else { switch (**s) { case '.': /* wildcard */ tok_type(rn) = LITERAL; tok_val(rn) = parse_wildcard(); if (tok_val(rn) == NULL) { free_tok(rn); return NULL; } break; case '[': /* character set literal */ (*s)++; tok_type(rn) = LITERAL; tok_val(rn) = parse_cset(s); if (tok_val(rn) == NULL) { free_tok(rn); return NULL; } break; case '(': tok_type(rn) = LPAREN; break; case ')' : tok_type(rn) = RPAREN; break; case '*' : tok_type(rn) = OPSTAR; break; case '|' : tok_type(rn) = OPALT; break; case '?' 
: tok_type(rn) = OPOPT; break; case '\\': /* escaped character */ (*s)++; default : /* must be ordinary character */ tok_type(rn) = LITERAL; tok_val(rn) = parse_chlit(**s); if (tok_val(rn) == NULL) { free_tok(rn); return NULL; } break; } /* switch (**s) */ (*s)++; } /* else */ return rn; } /* cat2() takes a stack of RE-nodes and, if the stack contains more than one node, returns the stack obtained by condensing the top two nodes of the stack into a single CAT-node. If there is only one node on the stack, nothing is done. */ Stack cat2(stk) Stack *stk; { Re_node r; if (stk == NULL) return NULL; if (*stk == NULL || (*stk)->next == NULL) return *stk; new_node(Re_node, r, r); if (r == NULL) return NULL; /* can't allocate memory */ Op(r) = OPCAT; Rchild(r) = Pop(stk); Lchild(r) = Pop(stk); if (Push(stk, r) == NULL) { free_re(Rchild(r)); free_re(Lchild(r)); free(r); return NULL; } Nullable(r) = Nullable(Lchild(r)) && Nullable(Rchild(r)); if (Nullable(Lchild(r))) Firstpos(r) = pset_union(Firstpos(Lchild(r)), Firstpos(Rchild(r)), 0); else Firstpos(r) = pset_union(Firstpos(Lchild(r)), NULL, 0); /* added pset_union with NULL 26/Aug/1996 */ if (Nullable(Rchild(r))) Lastpos(r) = pset_union(Lastpos(Lchild(r)), Lastpos(Rchild(r)), 0); else Lastpos(r) = pset_union(Lastpos(Rchild(r)), NULL, 0); /* added pset_union with NULL 26/Aug/1996 */ return *stk; } /* wrap() takes a stack and an operator, takes the top element of the stack and "wraps" that operator around it, then puts this back on the stack and returns the resulting stack. 
*/ Stack wrap(s, opv) Stack *s; short opv; { Re_node r; if (s == NULL || *s == NULL) return NULL; new_node(Re_node, r, r); if (r == NULL) return NULL; Op(r) = opv; Child(r) = Pop(s); if (Push(s, r) == NULL) { free_re(Child(r)); free(r); return NULL; } Nullable(r) = TRUE; Firstpos(r) = pset_union(Firstpos(Child(r)), NULL, 0); /* added pset_union with NULL 26/Aug/1996 */ Lastpos(r) = pset_union(Lastpos(Child(r)), NULL, 0); /* added pset_union with NULL 26/Aug/1996 */ return *s; } /* mk_alt() takes a stack and a regular expression, creates an ALT-node from the top of the stack and the given RE, and replaces the top-of-stack by the resulting ALT-node. */ Stack mk_alt(s, r) Stack *s; Re_node r; { Re_node node; if (s == NULL || *s == NULL || r == NULL) return NULL; new_node(Re_node, node, node); if (node == NULL) return NULL; Op(node) = OPALT; Lchild(node) = Pop(s); Rchild(node) = r; if (Push(s, node) == NULL) return NULL; Nullable(node) = Nullable(Lchild(node)) || Nullable(Rchild(node)); Firstpos(node) = pset_union(Firstpos(Lchild(node)), Firstpos(Rchild(node)), 0); Lastpos(node) = pset_union(Lastpos(Lchild(node)), Lastpos(Rchild(node)), 0); return *s; } /* parse_re() takes a pointer to a string and traverses that string, returning a pointer to a syntax tree for the regular expression represented by that string, NULL if there is an error. 
*/ Re_node parse_re(s, end) char **s; short end; { Stack stk = NULL, ret = NULL, top, temp; Tok_node next_token, t1; Re_node re = NULL, val; if (s == NULL || *s == NULL) return NULL; while (TRUE) { ret = NULL; if ((next_token = get_token(s)) == NULL) return NULL; switch (tok_type(next_token)) { case RPAREN: retract_token(s); case EOS: if (end == tok_type(next_token)) { free_tok(next_token); top = cat2(&stk); val = Top(top); free(top); return val; } else { free_tok(next_token); return NULL; } case LPAREN: free_tok(next_token); re = parse_re(s, RPAREN); if ((ret = Push(&stk, re)) == NULL) { free_re(re); /* ZZZZZZZZZZZZZZZZZZ */ return NULL; } if ((t1 = get_token(s)) == NULL) { free_re(re); /* ZZZZZZZZZZZZZZZZZZ */ free(ret); return NULL; } if ((tok_type(t1) != RPAREN) || (re == NULL)) { free_re(re); /* ZZZZZZZZZZZZZZZZZZ */ free(ret); free_tok(t1); return NULL; } free_tok(t1); if (Size(stk) > 2) { temp = stk->next; stk->next = cat2(&temp); /* condense CAT nodes */ if (stk->next == NULL) { free_re(re); /* ZZZZZZZZZZZZZZZZZZ */ free(ret); return NULL; } else stk->size = stk->next->size + 1; } break; case OPSTAR: if ((ret = wrap(&stk, OPSTAR)) == NULL) { free_tok(next_token); return NULL; } free_tok(next_token); /* ZZZZZZZZZZZZZZZZZZ */ break; case OPOPT: if ((ret = wrap(&stk, OPOPT)) == NULL) { free_tok(next_token); return NULL; } free_tok(next_token); /* ZZZZZZZZZZZZZZZZZZ */ break; case OPALT: if ((ret = cat2(&stk)) == NULL) { free_tok(next_token); return NULL; } re = parse_re(s, end); if (re == NULL) { free(ret); free_tok(next_token); return NULL; } if (mk_alt(&stk, re) == NULL) { free(ret); free_tok(next_token); return NULL; } free_tok(next_token); /* ZZZZZZZZZZZZZZZZZZ */ break; case LITERAL: if ((ret = Push(&stk, tok_val(next_token))) == NULL) { free_tok(next_token); return NULL; } free(next_token); if (Size(stk) > 2) { temp = stk->next; stk->next = cat2(&temp); /* condense CAT nodes */ if (stk->next == NULL) { free(ret); return NULL; } else stk->size = 
stk->next->size + 1; } break; default: printf("parse_re: unknown token type %d\n", tok_type(next_token)); free_tok(next_token); /* ZZZZZZZZZZZZZZZZZZ */ break; } /* free_tok(next_token); */ } } /* parse() essentially just calls parse_re(). Its purpose is to stick an end-of-string token at the end of the syntax tree returned by parse_re(). It should really be done in parse_re, but the recursion there makes it more desirable to have it here. */ Re_node parse(s) char *s; { Re_node val, tree, temp; Stack top, stk = NULL; if ((tree = parse_re(&s, NUL)) == NULL) return NULL; if (Push(&stk, tree) == NULL) return NULL; temp = mk_leaf(EOS, C_LIT, NUL, NULL); if (temp == NULL || Push(&stk, temp) == NULL) return NULL; final_pos = --pos_cnt; top = cat2(&stk); val = Top(top); free(top); return val; } glimpse-4.18.7/agrep/preprocess.c000066400000000000000000000221431300371307100166770ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ /* substitute metachar with special symbol */ /* if regular expression, then set flag REGEX */ /* if REGEX and MULTIPAT then report error message, */ /* -w only for single word pattern. If WORDBOUND & MULTIWORD error */ /* process start of line, end of line symbol, */ /* process -w WORDBOUND option, append special symbol at begin&end of */ /* process -d option before this routine */ /* the delimiter pattern is in D_pattern (need to end with '; ') */ /* if '-t' (suggestion: how about -B) the pattern is passed to sgrep */ /* and doesn't go here */ /* in that case, -d is ignored? 
or not necessary */ /* upon return, Pattern contains the pattern to be processed by maskgen */ /* D_pattern contains transformed D_pattern */ #include "agrep.h" #include extern int PAT_FILE, PAT_BUFFER; extern ParseTree *AParse; extern int WHOLELINE, REGEX, FASTREGEX, RE_ERR, DELIMITER, TAIL, WORDBOUND; extern int HEAD; extern CHAR Progname[]; extern int D_length, tc_D_length; extern CHAR tc_D_pattern[MaxDelimit * 2]; extern int table[WORD][WORD]; extern int agrep_initialfd; extern int EXITONERROR; extern int errno; extern int multifd; extern char *multibuf; extern int multilen; extern int anum_terminals; extern ParseTree aterminals[MAXNUM_PAT]; extern char FREQ_FILE[MAX_LINE_LEN], HASH_FILE[MAX_LINE_LEN], STRING_FILE[MAX_LINE_LEN]; /* interfacing with tcompress */ extern int AComplexBoolean; int preprocess(D_pattern, Pattern) /* need two parameters */ CHAR D_pattern[], Pattern[]; { CHAR temp[Maxline], *r_pat, *old_pat; /* r_pat for r.e. */ CHAR old_D_pat[MaxDelimit*2]; int i, j=0, rp=0, m, t=0, num_pos, ANDON = 0; int d_end ; int IN_RANGE=0; int ret1, ret2; #if DEBUG fprintf(stderr, "preprocess: m=%d, pat=%s, PAT_FILE=%d, PAT_BUFFER=%d\n", strlen(Pattern), Pattern, PAT_FILE, PAT_BUFFER); #endif if ((m = strlen(Pattern)) <= 0) return 0; if (PAT_FILE || PAT_BUFFER) return 0; REGEX = OFF; FASTREGEX = OFF; old_pat = Pattern; /* to remember the starting position */ /* Check if pattern is a concatenation of ands OR ors of simple patterns */ multibuf = (char *)malloc(m * 2 + 2); /* worst case: a,a,a,a,a,a */ if (multibuf == NULL) goto normal_processing; /* if (WORDBOUND) goto normal_processing; */ multilen = 0; AParse = 0; ret1 = ret2 = 0; if (((ret1 = asplit_pattern(Pattern, m, aterminals, &anum_terminals, &AParse)) <= 0) || /* can change the pattern if simple boolean with {} */ ((ret2 = asplit_terminal(0, anum_terminals, multibuf, &multilen)) <= 0) || ((ret2 == 1) && !(aterminals[0].op & NOTPAT))) { /* must do normal processing */ if (AComplexBoolean && (AParse != 
NULL)) destroy_tree(AParse); /* so that direct exec invocations don't use AParse by mistake! */ #if DEBUG fprintf(stderr, "preprocess: split_pat = %d, split_term = %d, #terms = %d\n", ret1, ret2, anum_terminals); #endif /*DEBUG*/ /* if (ret2 == 1) { strcpy(Pattern, aterminals[0].data.leaf.value); m = strlen(Pattern); } */ m = strlen(Pattern); AParse = 0; free(multibuf); multibuf = NULL; multilen = 0; goto normal_processing; } /* This is quick processing */ if (AParse != 0) { /* successfully converted to ANDPAT/ORPAT */ PAT_BUFFER = 1; /* printf("preprocess(): converted= %d, patterns= %s", AParse, multibuf); */ /* Now I have to process the delimiter if any */ if (DELIMITER) { /* D_pattern is "; ", D_length is 1 + length of string PAT: see agrep.c/'d' */ preprocess_delimiter(D_pattern+1, D_length - 1, D_pattern, &D_length); /* D_pattern is the exact stuff we want to match, D_length is its strlen */ if ((tc_D_length = quick_tcompress(FREQ_FILE, HASH_FILE, D_pattern, D_length, tc_D_pattern, MaxDelimit*2, TC_EASYSEARCH)) <= 0) { strcpy(tc_D_pattern, D_pattern); tc_D_length = D_length; } /* printf("mgrep's delim=%s,%d tc_delim=%s,%d\n", D_pattern, D_length, tc_D_pattern, tc_D_length); */ } return 0; } /* else either unknown character, one simple pattern or none at all */ normal_processing: for(i=0; i< m; i++) { if(Pattern[i] == '\\') i++; else if(Pattern[i] == '|' || Pattern[i] == '*') REGEX = ON; } r_pat = (CHAR *) malloc(strlen(Pattern)+2*strlen(D_pattern) + 8); /* bug-report, From: Chris Dalton */ strcpy(temp, D_pattern); d_end = t = strlen(temp); /* size of D_pattern, including '; ' */ if (WHOLELINE) { temp[t++] = LANGLE; temp[t++] = NNLINE; temp[t++] = RANGLE; temp[t] = '\0'; strcat(temp, Pattern); m = strlen(temp); temp[m++] = LANGLE; temp[m++] = '\n'; temp[m++] = RANGLE; temp[m] = '\0'; } else { if (WORDBOUND) { temp[t++] = LANGLE; temp[t++] = WORDB; temp[t++] = RANGLE; temp[t] = '\0'; } strcat(temp, Pattern); m = strlen(temp); if (WORDBOUND) { temp[m++] = LANGLE; 
temp[m++] = WORDB; temp[m++] = RANGLE; } temp[m] = '\0'; } /* now temp contains augmented pattern , m it's size */ D_length = 0; for (i=0, j=0; i< d_end-2; i++) { switch(temp[i]) { case '\\' : i++; Pattern[j++] = temp[i]; old_D_pat[D_length++] = temp[i]; break; case '<' : Pattern[j++] = LANGLE; break; case '>' : Pattern[j++] = RANGLE; break; case '^' : Pattern[j++] = '\n'; old_D_pat[D_length++] = temp[i]; break; case '$' : Pattern[j++] = '\n'; old_D_pat[D_length++] = temp[i]; break; default : Pattern[j++] = temp[i]; old_D_pat[D_length++] = temp[i]; break; } } if(D_length > MAXDELIM) { fprintf(stderr, "%s: delimiter pattern too long (has > %d chars)\n", Progname, MAXDELIM); free(r_pat); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } Pattern[j++] = ANDPAT; old_D_pat[D_length] = '\0'; strcpy(D_pattern, old_D_pat); D_length++; /* Pattern[j++] = ' '; */ Pattern[j] = '\0'; rp = 0; if(REGEX) { r_pat[rp++] = '.'; /* if REGEX: always append '.' in front */ r_pat[rp++] = '('; Pattern[j++] = NOCARE; HEAD = ON; } for (i=d_end; i < m ; i++) { switch(temp[i]) { case '\\': i++; Pattern[j++] = temp[i]; r_pat[rp++] = 'o'; /* the symbol doesn't matter */ break; case '#': FASTREGEX = ON; if(REGEX) { Pattern[j++] = NOCARE; r_pat[rp++] = '.'; r_pat[rp++] = '*'; break; } Pattern[j++] = WILDCD; break; case '(': Pattern[j++] = LPARENT; r_pat[rp++] = '('; break; case ')': Pattern[j++] = RPARENT; r_pat[rp++] = ')'; break; case '[': Pattern[j++] = LRANGE; r_pat[rp++] = '['; IN_RANGE = ON; break; case ']': Pattern[j++] = RRANGE; r_pat[rp++] = ']'; IN_RANGE = OFF; break; case '<': Pattern[j++] = LANGLE; break; case '>': Pattern[j++] = RANGLE; break; case '^': if (temp[i-1] == '[') Pattern[j++] = NOTSYM; else Pattern[j++] = '\n'; r_pat[rp++] = '^'; break; case '$': Pattern[j++] = '\n'; r_pat[rp++] = '$'; break; case '.': Pattern[j++] = NOCARE; r_pat[rp++] = '.'; break; case '*': Pattern[j++] = STAR; r_pat[rp++] = '*'; break; case '|': Pattern[j++] = ORSYM; r_pat[rp++] = 
'|'; break; case ',': Pattern[j++] = ORPAT; RE_ERR = ON; break; case ';': if(ANDON) RE_ERR = ON; Pattern[j++] = ANDPAT; ANDON = ON; break; case '-': if(IN_RANGE) { Pattern[j++] = HYPHEN; r_pat[rp++] = '-'; } else { Pattern[j++] = temp[i]; r_pat[rp++] = temp[i]; } break; case NNLINE : Pattern[j++] = temp[i]; r_pat[rp++] = 'N'; break; default: Pattern[j++] = temp[i]; r_pat[rp++] = temp[i]; break; } } if(REGEX) { /* append ').' at end of regular expression */ r_pat[rp++] = ')'; r_pat[rp++] = '.'; Pattern[j++] = NOCARE; TAIL = ON; } Pattern[j] = '\0'; m = j; r_pat[rp] = '\0'; if(REGEX) { if(DELIMITER || WORDBOUND) { fprintf(stderr, "%s: -d or -w option is not supported for this pattern\n", Progname); free(r_pat); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } if(RE_ERR) { fprintf(stderr, "%s: illegal regular expression\n", Progname); free(r_pat); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } while(*Pattern != NOCARE && m-- > 0) Pattern++; /* point to . */ num_pos = init(r_pat, table); if(num_pos <= 0) { fprintf(stderr, "%s: illegal regular expression\n", Progname); free(r_pat); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } if(num_pos > MAXREG) { fprintf(stderr, "%s: regular expression too long, max is %d\n", Progname,MAXREG); free(r_pat); if (!EXITONERROR) { errno = AGREP_ERROR; return -1; } else exit(2); } strcpy(old_pat, Pattern); /* do real change to the Pattern to be returned */ free(r_pat); return 0; } /* if regex */ free(r_pat); return 0; } glimpse-4.18.7/agrep/putils.c000066400000000000000000000056051300371307100160360ustar00rootroot00000000000000/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. 
*/ #include "agrep.h" int is_complex_boolean(buffer, len) CHAR *buffer; int len; { int i = 0; CHAR cur = '\0'; while (i < len) { if (buffer[i] == '\\') i+=2; else if (buffer[i] == ',') { if ((cur == ';') || (cur == '~')) return 1; else cur = ','; i++; } else if (buffer[i] == ';') { if ((cur == ',') || (cur == '~')) return 1; else cur = ';'; i++; } /* else if ((buffer[i] == '~') || (buffer[i] == '{') || (buffer[i] == '}')) { */ else if (buffer[i] == '~') { /* even if pattern has just ~s... user must use -v option for single NOT */ return 1; } else i++; } return 0; } /* The possible tokens are: ; , a e ~ { } */ int get_token_bool(buffer, len, ptr, tokenbuf, tokenlen) CHAR *buffer, *tokenbuf; int len, *ptr, *tokenlen; { if ((*ptr>=len) || (buffer[*ptr] == '\n') || (buffer[*ptr] == '\0')) return 'e'; while ((*ptr=len) || (buffer[*ptr] == '\n') || (buffer[*ptr] == '\0')) return 'e'; if ((buffer[*ptr] == ',') || (buffer[*ptr] == ';') || (buffer[*ptr] == '~') || (buffer[*ptr] == '{') || (buffer[*ptr] == '}')) { tokenbuf[0] = buffer[*ptr]; *tokenlen = 1; return buffer[(*ptr)++]; } *tokenlen = 0; if (buffer[*ptr] == '\\') { tokenbuf[(*tokenlen)++] = buffer[(*ptr)++]; tokenbuf[(*tokenlen)++] = buffer[(*ptr)++]; } else tokenbuf[(*tokenlen)++] = buffer[(*ptr)++]; while ( (*ptr= len) */ } void print_tree(t, level) ParseTree *t; { int i; if (t == NULL) printf("NULL"); else if (t->type == LEAF) { for (i=0; iop, t->terminalindex, t->data.leaf.value); } else if (t->type == INTERNAL) { if (t->data.internal.left != NULL) print_tree(t->data.internal.left, level + 1); for (i=0; iop); if (t->data.internal.right != NULL) print_tree(t->data.internal.right, level + 1); } } void destroy_tree(t) ParseTree *t; { if (t == NULL) return; if (t->type == LEAF) { free(t->data.leaf.value); /* t itself should not be freed: static allocation */ } else if (t->type == INTERNAL) { if (t->data.internal.left != NULL) destroy_tree(t->data.internal.left); if (t->data.internal.right != NULL) 
destroy_tree(t->data.internal.right); free(t); } } glimpse-4.18.7/agrep/re.h000066400000000000000000000070441300371307100151300ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ /************************************************************* * * * Macros defining special characters. * * * *************************************************************/ #define NUL '\0' #define ASCII_MIN '\001' #define ASCII_MAX '\177' /************************************************************* * * * Macros defining lexical categories. * * * *************************************************************/ #define C_LIT 0 /* individual character literal */ #define C_SET 1 /* character set literal */ #define EOS 0 /* end-of-string */ #define LITERAL 1 #define OPSTAR 2 #define OPALT 3 #define OPOPT 4 #define OPCAT 5 #define LPAREN 6 #define RPAREN 7 /************************************************************* * * * Macros for manipulating syntax tree nodes. * * * *************************************************************/ #define lit_type(x) (x->l_type) #define lit_pos(x) (x->pos) #define lit_char(x) ((x->val).c) #define lit_cset(x) ((x->val).cset) #define tok_type(x) (x->type) #define tok_val(x) (x->val) #define tok_op(x) (x->val->op) #define tok_lit(x) ((x->val->refs).lit) #define Op(x) (x->op) #define Lit(x) ((x->refs).lit) #define Child(x) ((x->refs).child) #define Lchild(x) ((x->refs).children.l_child) #define Rchild(x) ((x->refs).children.r_child) #define Nullable(x) (x->nullable) #define Firstpos(x) (x->firstposn) #define Lastpos(x) (x->lastposn) /************************************************************* * * * Macros for manipulating DFA states and sets of states. 
* * * *************************************************************/ #define Positions(x) (x->posns) #define Final_St(x) (x->final) #define Goto(x, c) ((x->trans)[c]) #define Next_State(x) ((x)->next_state) /*************************************************************/ #define new_node(type, l, x) \ {\ extern void *malloc();\ \ (l) = (type) malloc(sizeof(*(x)));\ if ((l) == NULL) {\ fprintf(stderr, "malloc failure in new_node\n");\ exit(2);\ }\ memset((l), '\0', sizeof(*(x)));\ } typedef struct { /* character range literals */ char low_bd, hi_bd; } *Ch_Range; typedef struct ch_set { /* character set literals */ Ch_Range elt; /* rep. as list of ranges */ struct ch_set *rest; } *Ch_Set; typedef struct { /* regular expression literal */ int pos; /* position in syntax tree */ short l_type; /* type of literal */ union { char c; /* for character literals */ Ch_Set cset; /* for character sets */ } val; } *Re_Lit, *(*Re_lit_array)[]; typedef struct pnode { int posnum; struct pnode *nextpos; } *Pset, *(*Pset_array)[]; typedef struct rnode { /* regular expression node */ short op; /* operator at that node */ union { Re_Lit lit; /* child is a leaf node */ struct rnode *child; /* child of unary op */ struct { struct rnode *l_child; struct rnode *r_child; } children; /* children of binary op */ } refs; short nullable; Pset firstposn, lastposn; } *Re_node; typedef struct { /* token node */ short type; Re_node val; } *Tok_node; typedef struct snode { Re_node val; int size; struct snode *next; } *Stack; typedef struct dfa_st { Pset posns; int final; /* 1 if the state is a final state, 0 o/w */ struct dfa_st *trans[128]; } *Dfa_state; typedef struct dfa_stset { Dfa_state st; struct dfa_stset *next_state; } *Dfa_state_set; glimpse-4.18.7/agrep/recursive.c000066400000000000000000000063731300371307100165300ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. 
*/ /* The function of the program is to traverse the directory tree and collect path names. This program is derived from the C-programming language book. Originally, the program opened a directory file as a regular file, but that does not work. We have to open a directory file using the opendir system call, and use readdir() to read each entry of the directory. */ #include "autoconf.h" /* ../libtemplate/include */ #include #include #if ISO_CHAR_SET #include #endif #if HAVE_DIRENT_H # include # define NAMLEN(dirent) strlen((dirent)->d_name) #else # define dirent direct # define NAMLEN(dirent) (dirent)->d_namlen # if HAVE_SYS_NDIR_H # include # endif # if HAVE_SYS_DIR_H # include # endif # if HAVE_NDIR_H # include # endif #endif #include #include #define BUFSIZE 256 #define DIRSIZE 14 #define max_list 10 #ifndef S_ISREG #define S_ISREG(mode) (0100000&(mode)) #endif #ifndef S_ISDIR #define S_ISDIR(mode) (0040000&(mode)) #endif char *file_list[max_list*2]; int fdx=0; /* index of file_List */ extern int Numfiles; char name_buf[BUFSIZE]; void directory(); static void treewalk(); /* returns -1 if error, num of matches >= 0 otherwise */ int recursive(argc, argv) int argc; char **argv; { int i,j; int num = 0, ret; for(i=0; i< argc; i++) { strcpy(name_buf, argv[i]); treewalk(name_buf); if(fdx > 0) { Numfiles = fdx; if ((ret = exec(3, file_list)) == -1) return -1; num += ret; for(j=0; j 0) { strcpy(buf, *++argv); treewalk(buf); } } */ static void treewalk(name) char *name; { struct stat stbuf; int i; extern void *malloc(); /* printf(" In treewalk\n"); */ if(my_lstat(name, &stbuf) == -1) { fprintf(stderr, "permission denied or non-existent: %s\n", name); return; } if ((stbuf.st_mode & S_IFMT) == S_IFLNK) { return; } if (( stbuf.st_mode & S_IFMT) == S_IFDIR) directory(name); else { file_list[fdx] = (char *)malloc(BUFSIZE); strcpy(file_list[fdx++], name); /* printf(" %s\n", name); */ if(fdx >= max_list) { Numfiles = fdx; exec(3, file_list); for(i=0; i= name+BUFSIZE ) /* name too long */ { 
fprintf(stderr, "name too long: %.32s...\n", name); return; } if((dirp = opendir(name)) == NULL) { fprintf(stderr, "permission denied: %s\n", name); return; } *nbp++ = '/'; *nbp = '\0'; for (dp = readdir(dirp); dp != NULL; dp = readdir(dirp)) { if (dp->d_name[0] == '\0' || strcmp(dp->d_name, ".") == 0 || strcmp(dp->d_name, "..")==0) goto CONT; /* printf("dp->d_name = %s\n", dp->d_name); */ strcpy(nbp, dp->d_name); treewalk(name); CONT: ; } closedir (dirp); *--nbp = '\0'; /* restore name */ } glimpse-4.18.7/agrep/sgrep.c000066400000000000000000002355771300371307100156530ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ #include #include #include "agrep.h" #include #undef MAXSYM #define MAXSYM 256 #define MAXMEMBER 8192 #define CHARTYPE unsigned char #undef MaxError /* don't use agrep.h definition */ #define MaxError 20 #define MAXPATT 256 #undef MAXLINE #define MAXLINE 1024 #undef MAXNAME #define MAXNAME 256 #undef MaxCan /* don't use agrep.h definition */ #define MaxCan 2048 #define BLOCKSIZE 16384 #define MAX_SHIFT_2 4096 #undef ON #define ON 1 #undef OFF #define OFF 0 #define LOG_ASCII 8 #define LOG_DNA 3 #define MAXMEMBER_1 65536 #define LONG_EXAC 20 #define LONG_APPX 24 #if ISO_CHAR_SET #define W_DELIM 256 #else #define W_DELIM 128 #endif #include extern int tuncompressible(); extern int quick_tcompress(); extern int quick_tuncompress(); extern int DELIMITER, OUTTAIL; extern int D_length, tc_D_length; extern unsigned char D_pattern[MaxDelimit *2], tc_D_pattern[MaxDelimit *2]; extern int LIMITOUTPUT, LIMITPERFILE, INVERSE; extern int CurrentByteOffset; extern int BYTECOUNT; extern int PRINTOFFSET; extern int PRINTRECORD; extern int CONSTANT, COUNT, FNAME, SILENT, FILENAMEONLY, prev_num_of_matched, num_of_matched, PRINTFILETIME; extern int DNA ; /* DNA flag is set in checksg when pattern is DNA pattern and p_size > 16 */ extern WORDBOUND, WHOLELINE, NOUPPER; extern unsigned char CurrentFileName[], 
Progname[]; extern long CurrentFileTime; extern unsigned Mask[]; extern unsigned endposition; extern int agrep_inlen; extern CHARTYPE *agrep_inbuffer; extern int agrep_initialfd; extern FILE *agrep_finalfp; extern int agrep_outpointer; extern int agrep_outlen; extern CHARTYPE * agrep_outbuffer; extern int NEW_FILE, POST_FILTER; extern int EXITONERROR; extern int errno; extern int TCOMPRESSED; extern int EASYSEARCH; extern char FREQ_FILE[MAX_LINE_LEN], HASH_FILE[MAX_LINE_LEN], STRING_FILE[MAX_LINE_LEN]; #if MEASURE_TIMES /* timing variables */ extern int OUTFILTER_ms; extern int FILTERALGO_ms; extern int INFILTER_ms; #endif /*MEASURE_TIMES*/ unsigned char BSize; /* log_c m */ unsigned char char_map[MAXSYM]; /* data area */ int shift_1; CHARTYPE SHIFT[MAXSYM]; CHARTYPE MEMBER[MAXMEMBER]; CHARTYPE pat[MAXPATT]; unsigned Hashmask; char MEMBER_1[MAXMEMBER_1]; CHARTYPE TR[MAXSYM]; static void initmask(); static void am_preprocess(); static void m_preprocess(); static void prep(); static void prep4(); static void prep_bm(); /* * General idea behind output processing with delimiters, inverse, compression, etc. * CAUTION: In compressed files, we can search ONLY for simple patterns or their ;,. * Attempts to search for complex patterns / with errors might lead to spurious matches. * 1. Once we find the match, go back and forward to get the delimiters that surround * the matched region. * 2. If it is a compressed file, verify that the match is "real" (compressed files * can have pseudo matches hence this filtering step is required). * 3. Increment num_of_matched. * 4. Process some output options which print stuff before the matched region is * printed. * 5. If there is compression, decompress and output the matched region. Otherwise * just output it as is. Remember, from step (1) we know the matched region. * 6. If inverse is set, then we must keep track of the end of the last matched region * in the variable lastout. 
When there is a match, we must print everything from * lastout to the beginning of the current matched region (curtextbegin) and then * update lastout to point to the end of the current matched region (curtextend). * ALSO: if we exit from the main loops, we must output everything from the end * of the last matched region to the end of the input buffer. * 7. Delimiter handling in complex patterns is different: there the search is done * for a boolean and of the delimiter pattern and the actual pattern. */ /* skips over escaped characters */ unsigned char * mystrchr(s, c) unsigned char *s; int c; { unsigned char *t = s; while (*t) { if (*t == '\\') t++; else if (c == *t) return t; t ++; } return NULL; } void char_tr(pat, m) unsigned char *pat; int *m; { int i; unsigned char temp[MAXPATT]; for(i=0; i1) && (pat[m-2] != '\\') && ((pat[m-1] == '^') || (pat[m-1] == '$'))) pat[m-1] = '\n';\ }\ /* whether constant or not, interpret the escape character */\ for (k=0; k= MAXPATT) {\ fprintf(stderr, "%s: pattern too long (has > %d chars)\n", Progname, MAXPATT);\ if (!EXITONERROR) {\ errno = AGREP_ERROR;\ return -1;\ }\ else exit(2);\ }\ if(D == 0) {\ if(m > LONG_EXAC) m_preprocess(pat);\ else prep_bm(pat, m);\ }\ else if (DNA) prep4(pat, m);\ else if(m >= LONG_APPX) am_preprocess(pat);\ else {\ prep(pat, m, D);\ initmask(pat, Mask, m, 0, &endposition);\ } #if AGREP_POINTER if (fd != -1) { #endif /*AGREP_POINTER*/ alloc_buf(fd, &text, 2*BlockSize+2*Max_record+MAXPATT); text[offset-1] = '\n'; /* initial case */ for(i=0; i < Max_record; i++) text[i] = 0; /* security zone */ start = offset; if(WHOLELINE) { start--; CurrentByteOffset --; } while( (num_read = fill_buf(fd, text+offset, 2*BlockSize)) > 0) { buf_end = end = offset + num_read -1 ; oldCurrentByteOffset = CurrentByteOffset; if (first_time) { if ((TCOMPRESSED == ON) && tuncompressible(text+offset, num_read)) { EASYSEARCH = text[offset+SIGNATURE_LEN-1]; start += SIGNATURE_LEN; CurrentByteOffset += SIGNATURE_LEN; if 
(!EASYSEARCH) { fprintf(stderr, "not compressed for easy-search: can miss some matches in: %s\n", CurrentFileName); } #if MEASURE_TIMES gettimeofday(&initt, NULL); #endif /*MEASURE_TIMES*/ if (samepattern || ((newm = quick_tcompress(FREQ_FILE, HASH_FILE, pat, m, newpat, Max_record-8, EASYSEARCH)) > 0)) { oldm = m; oldpat = pat; m = newm; pat = newpat; } #if MEASURE_TIMES gettimeofday(&finalt, NULL); INFILTER_ms += (finalt.tv_sec*1000 + finalt.tv_usec/1000) - (initt.tv_sec*1000 + initt.tv_usec/1000); #endif /*MEASURE_TIMES*/ } else TCOMPRESSED = OFF; PROCESS_PATTERN /* must be AFTER we know that it is a compressed pattern... */ for(i=1; i<=m; i++) text[2*BlockSize+offset+i] = pat[m-1]; /* to make sure the skip loop in bm() won't go out of bound in later iterations */ first_time = 0; } if (!DELIMITER) { while ((text[end] != '\n') && (end > offset)) end--; text[start-1] = '\n'; } else { unsigned char *newbuf = text + end + 1; newbuf = backward_delimiter(newbuf, text+offset, D_pattern, D_length, OUTTAIL); /* see agrep.c/'d' */ if (newbuf < text+offset+D_length) newbuf = text + end + 1; end = newbuf - text - 1; memcpy(text+start-D_length, D_pattern, D_length); } residue = buf_end - end + 1 ; /* SGREP_PROCESS */ /* No harm in sending a few extra parameters even if they are unused: they are not accessed in monkey*()s */ if(D==0) { if(m > LONG_EXAC) { if (-1 == monkey(pat, m, text+start, text+end, oldpat, oldm)) { free_buf(fd, text); return -1; } } else { if (-1 == bm(pat, m, text+start, text+end, oldpat, oldm)) { free_buf(fd, text); return -1; } } } else { if(DNA) { if (-1 == monkey4( pat, m, text+start, text+end, D , oldpat, oldm )) { free_buf(fd, text); return -1; } } else { if(m >= LONG_APPX) { if (-1 == a_monkey(pat, m, text+start, text+end, D, oldpat, oldm)) { free_buf(fd, text); return -1; } } else { if (-1 == agrep(pat, m, text+start, text+end, D, oldpat, oldm)) { free_buf(fd, text); return -1; } } } } if(FILENAMEONLY && (num_of_matched - prev_num_of_matched) && 
(NEW_FILE || !POST_FILTER)) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(fd, text); NEW_FILE = OFF; return 0; } CurrentByteOffset = oldCurrentByteOffset + end - start + 1; /* for a new iteration: avoid complicated calculations below */ start = offset - residue ; if(start < Max_record) { start = Max_record; } /* strncpy(text+start, text+end, residue); */ memcpy(text+start, text+end, residue); start++; if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) { free_buf(fd, text); return 0; /* done */ } } /* end of while(num_read = ...) 
*/ if (!DELIMITER) { text[start-1] = '\n'; text[start+residue] = '\n'; } else { if (start > D_length) memcpy(text+start-D_length, D_pattern, D_length); memcpy(text+start+residue, D_pattern, D_length); } end = start + residue - 2; if(residue > 1) { /* SGREP_PROCESS */ /* No harm in sending a few extra parameters even if they are unused: they are not accessed in monkey*()s */ if(D==0) { if(m > LONG_EXAC) { if (-1 == monkey(pat, m, text+start, text+end, oldpat, oldm)) { free_buf(fd, text); return -1; } } else { if (-1 == bm(pat, m, text+start, text+end, oldpat, oldm)) { free_buf(fd, text); return -1; } } } else { if(DNA) { if (-1 == monkey4( pat, m, text+start, text+end, D , oldpat, oldm )) { free_buf(fd, text); return -1; } } else { if(m >= LONG_APPX) { if (-1 == a_monkey(pat, m, text+start, text+end, D, oldpat, oldm)) { free_buf(fd, text); return -1; } } else { if (-1 == agrep(pat, m, text+start, text+end, D, oldpat, oldm)) { free_buf(fd, text); return -1; } } } } if(FILENAMEONLY && (num_of_matched - prev_num_of_matched) && (NEW_FILE || !POST_FILTER)) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(fd, text); NEW_FILE = OFF; return 0; } } free_buf(fd, text); return 0; #if AGREP_POINTER } else { /* as if only one iteration of the while-loop and offset = 0 */ tempbuf = 
(CHARTYPE*)malloc(m); text = (CHARTYPE *)agrep_inbuffer; num_read = agrep_inlen; start = 0; buf_end = end = num_read - 1; #if 0 if (WHOLELINE) { start --; CurrentByteOffset --; } #endif if ((TCOMPRESSED == ON) && tuncompressible(text+1, num_read)) { EASYSEARCH = text[offset+SIGNATURE_LEN-1]; start += SIGNATURE_LEN; CurrentByteOffset += SIGNATURE_LEN; if (!EASYSEARCH) { fprintf(stderr, "not compressed for easy-search: can miss some matches in: %s\n", CurrentFileName); } #if MEASURE_TIMES gettimeofday(&initt, NULL); #endif /*MEASURE_TIMES*/ if (samepattern || ((newm = quick_tcompress(FREQ_FILE, HASH_FILE, pat, m, newpat, Max_record-8, EASYSEARCH)) > 0)) { oldm = m; oldpat = pat; m = newm; pat = newpat; } #if MEASURE_TIMES gettimeofday(&finalt, NULL); INFILTER_ms += (finalt.tv_sec*1000 + finalt.tv_usec/1000) - (initt.tv_sec*1000 + initt.tv_usec/1000); #endif /*MEASURE_TIMES*/ } else TCOMPRESSED = OFF; PROCESS_PATTERN /* must be after we know whether it is compressed or not */ memcpy(tempbuf, text+end+1, m); /* save portion being overwritten */ for(i=1; i<=m; i++) text[end+i] = pat[m-1]; /* to make sure the skip loop in bm() won't go out of bound in later iterations */ if (!DELIMITER) while(text[end] != '\n' && end > 1) end--; else { unsigned char *newbuf = text + end + 1; newbuf = backward_delimiter(newbuf, text, D_pattern, D_length, OUTTAIL); /* see agrep.c/'d' */ if (newbuf < text+offset+D_length) newbuf = text + end + 1; end = newbuf - text - 1; } /* text[0] = text[end] = r_newline; : the user must ensure that the delimiter is there at text[0] and occurs somewhere before text[end ] */ /* An exact copy of the above SGREP_PROCESS */ /* No harm in sending a few extra parameters even if they are unused: they are not accessed in monkey*()s */ if(D==0) { if(m > LONG_EXAC) { if (-1 == monkey(pat, m, text+start, text+end, oldpat, oldm)) { free_buf(fd, text); memcpy(text+end+1, tempbuf, m); /* restore */ free(tempbuf); return -1; } } else { if (-1 == bm(pat, m, text+start, 
text+end, oldpat, oldm)) { free_buf(fd, text); memcpy(text+end+1, tempbuf, m); /* restore */ free(tempbuf); return -1; } } } else { if(DNA) { if (-1 == monkey4( pat, m, text+start, text+end, D , oldpat, oldm )) { free_buf(fd, text); memcpy(text+end+1, tempbuf, m); /* restore */ free(tempbuf); return -1; } } else { if(m >= LONG_APPX) { if (-1 == a_monkey(pat, m, text+start, text+end, D, oldpat, oldm)) { free_buf(fd, text); memcpy(text+end+1, tempbuf, m); /* restore */ free(tempbuf); return -1; } } else { if (-1 == agrep(pat, m, text+start, text+end, D, oldpat, oldm)) { free_buf(fd, text); memcpy(text+end+1, tempbuf, m); /* restore */ free(tempbuf); return -1; } } } } if(FILENAMEONLY && (num_of_matched - prev_num_of_matched) && (NEW_FILE || !POST_FILTER)) { /* externally set */ if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", CurrentFileName); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "\n"); else { if (agrep_outpointer+1>=agrep_outlen) { OUTPUT_OVERFLOW; free_buf(fd, text); return -1; } else agrep_outbuffer[agrep_outpointer++] = '\n'; } free_buf(fd, text); NEW_FILE = OFF; } memcpy(text+end+1, tempbuf, m); /* restore */ free(tempbuf); return 0; } #endif /*AGREP_POINTER*/ } /* end sgrep */ /* SUN: bm assumes that the content of text[n]...text[n+m-1] is pat[m-1] such that the skip loop is guaranteed to terminate */ int bm(pat, m, text, textend, oldpat, oldm) CHARTYPE *text, *textend, *pat, *oldpat; int m, oldm; { int PRINTED = 0; register int shift; register int m1, j, d1; CHARTYPE *textbegin = text; int 
newlen; CHARTYPE *textstart; CHARTYPE *curtextbegin; CHARTYPE *curtextend; #if MEASURE_TIMES struct timeval initt, finalt; #endif CHARTYPE *lastout = text; d1 = shift_1; /* at least 1 */ m1 = m - 1; shift = 0; while (text <= textend) { textstart = text; shift = SHIFT[*(text += shift)]; while(shift) { shift = SHIFT[*(text += shift)]; shift = SHIFT[*(text += shift)]; shift = SHIFT[*(text += shift)]; } CurrentByteOffset += text - textstart; j = 0; while(TR[pat[m1 - j]] == TR[*(text - j)]) { if(++j == m) break; /* if statement can be saved, but for safety ... */ } if (j == m ) { if(text > textend) return 0; if(WORDBOUND) { /* if(isalnum(*(unsigned char *)(text+1))) goto CONT; --> fixed by SHIOZAKI Takehiko */ if((text+1 <= textend) && isalnum(*(unsigned char *)(text+1)) && isalnum(*(unsigned char *)text)) { shift = 1; /* bg 4/27/97 */ goto WCONT; /* as if there was no match */ } /* if(isalnum(*(unsigned char *)(text-m))) goto CONT; --> fixed by SHIOZAKI Takehiko */ if((textbegin <= (text-m)) && isalnum(*(unsigned char *)(text-m)) && isalnum(*(unsigned char *)(text-m+1))) { shift = 1; /* bg 4/27/97 */ goto WCONT; /* as if there was no match */ } /* changed by Udi 11/7/94 to avoid having to set TR[] to W_delim */ } if (TCOMPRESSED == ON) { /* Don't update CurrentByteOffset here: only before outputting properly */ if (!DELIMITER) { curtextbegin = text; while((curtextbegin > textbegin) && (*(--curtextbegin) != '\n')); if (*curtextbegin == '\n') curtextbegin ++; curtextend = curtextbegin; /*text-m*/; while((curtextend < textend) && (*curtextend != '\n')) curtextend ++; if (*curtextend == '\n') curtextend ++; } else { curtextbegin = backward_delimiter(text, textbegin, tc_D_pattern, tc_D_length, OUTTAIL); if (!OUTTAIL) { curtextend = forward_delimiter(curtextbegin+D_length/*text-m*/, textend, tc_D_pattern, tc_D_length, OUTTAIL); } else { curtextend = forward_delimiter(curtextbegin/*text-m*/, textend, tc_D_pattern, tc_D_length, OUTTAIL); } } } else { /* Don't update 
CurrentByteOffset here: only before outputting properly */ if (!DELIMITER) { curtextbegin = text; while((curtextbegin > textbegin) && (*(--curtextbegin) != '\n')); if (*curtextbegin == '\n') curtextbegin ++; curtextend = curtextbegin /*text-m*/; while((curtextend < textend) && (*curtextend != '\n')) curtextend ++; if (*curtextend == '\n') curtextend ++; } else { curtextbegin = backward_delimiter(text, textbegin, D_pattern, D_length, OUTTAIL); if (!OUTTAIL) { curtextend = forward_delimiter(curtextbegin+D_length/*text-m*/, textend, D_pattern, D_length, OUTTAIL); } else { curtextend = forward_delimiter(curtextbegin/*text-m*/, textend, D_pattern, D_length, OUTTAIL); } } } if (TCOMPRESSED == ON) { #if MEASURE_TIMES gettimeofday(&initt, NULL); #endif /*MEASURE_TIMES*/ if (-1 == exists_tcompressed_word(pat, m, curtextbegin, text - curtextbegin + m, EASYSEARCH)) goto CONT; /* as if there was no match */ #if MEASURE_TIMES gettimeofday(&finalt, NULL); FILTERALGO_ms += (finalt.tv_sec *1000 + finalt.tv_usec/1000) - (initt.tv_sec*1000 + initt.tv_usec/1000); #endif /*MEASURE_TIMES*/ } textbegin = curtextend; /* (curtextend - 1 > textbegin ? 
curtextend - 1 : curtextend); */ num_of_matched++; if(FILENAMEONLY) return 0; if(!COUNT) { if (!INVERSE) { if(FNAME && (NEW_FILE || !POST_FILTER)) { char nextchar = (POST_FILTER == ON)?'\n':' '; char *prevstring = (POST_FILTER == ON)?"\n":""; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s%s", prevstring, CurrentFileName); else { int outindex; if (prevstring[0] != '\0') { if(agrep_outpointer + 1 >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else agrep_outbuffer[agrep_outpointer ++] = prevstring[0]; } for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, ":%c", nextchar); else { if (agrep_outpointer+2>= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else { agrep_outbuffer[agrep_outpointer++] = ':'; agrep_outbuffer[agrep_outpointer++] = nextchar; } } NEW_FILE = OFF; PRINTED = 1; } if(BYTECOUNT) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%d= ", CurrentByteOffset); else { char s[32]; int outindex; sprintf(s, "%d=", CurrentByteOffset); for(outindex=0; (outindex+agrep_outpointer 0) { if (agrep_outpointer + newlen + 1 >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } #if MEASURE_TIMES gettimeofday(&finalt, NULL); OUTFILTER_ms += (finalt.tv_sec*1000 + finalt.tv_usec/1000) - (initt.tv_sec*1000 + initt.tv_usec/1000); #endif /*MEASURE_TIMES*/ } else { if (agrep_finalfp != NULL) { fwrite(curtextbegin, 1, curtextend - curtextbegin, agrep_finalfp); } else { if (agrep_outpointer + curtextend - curtextbegin >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, curtextbegin, curtextend-curtextbegin); agrep_outpointer 
+= curtextend - curtextbegin; } } } else if (PRINTED) { if (agrep_finalfp != NULL) fputc('\n', agrep_finalfp); else agrep_outbuffer[agrep_outpointer ++] = '\n'; PRINTED = 0; } } else { /* INVERSE */ if (!SILENT) { if (TCOMPRESSED == ON) { /* INVERSE: Don't care about filtering time */ if (agrep_finalfp != NULL) newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, curtextbegin - lastout, agrep_finalfp, -1, EASYSEARCH); else { if ((newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, curtextbegin - lastout, agrep_outbuffer, agrep_outlen - agrep_outpointer, EASYSEARCH)) > 0) { if (newlen + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } lastout=textbegin; CurrentByteOffset += textbegin - text; text = textbegin; } else { /* NOT TCOMPRESSED */ if (agrep_finalfp != NULL) fwrite(lastout, 1, curtextbegin-lastout, agrep_finalfp); else { if (curtextbegin - lastout + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, lastout, curtextbegin-lastout); agrep_outpointer += (curtextbegin - lastout); } lastout=textbegin; CurrentByteOffset += textbegin - text; text = textbegin; } /* TCOMPRESSED */ } /* !SILENT */ } /* INVERSE */ } else { /* COUNT */ CurrentByteOffset += textbegin - text; text = textbegin; } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) return 0; /* done */ CONT: if (m == 1) shift = 0; else shift = 1; /* ZZZZZZZZZZZZZZZZ check it out later */ } else shift = d1; WCONT: ; } if (!SILENT && INVERSE && !COUNT && (lastout <= textend)) { if (TCOMPRESSED == ON) { /* INVERSE: Don't care about filtering time */ if (agrep_finalfp != NULL) newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, textend - lastout + 1, agrep_finalfp, -1, EASYSEARCH); else { if ((newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, textend - lastout + 1, agrep_outbuffer, 
agrep_outlen - agrep_outpointer, EASYSEARCH)) > 0) { if (newlen + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } } else { /* NOT TCOMPRESSED */ if (agrep_finalfp != NULL) fwrite(lastout, 1, textend-lastout + 1, agrep_finalfp); else { if (textend - lastout + 1 + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, lastout, textend-lastout + 1); agrep_outpointer += (textend - lastout + 1); } } /* TCOMPRESSED */ } return 0; } /* initmask() initializes the mask table for the pattern */ /* endposition is a mask for the endposition of the pattern */ /* endposition will contain k mask bits if the pattern contains k fragments */ static void initmask(pattern, Mask, m, D, endposition) CHARTYPE *pattern; unsigned *Mask; register int m, D; unsigned *endposition; { register unsigned Bit1, c; register int i, j, frag_num; /* Bit1 = 1 << 31;*/ /* the first bit of Bit1 is 1, others 0. */ Bit1 = (unsigned)0x80000000; frag_num = D+1; *endposition = 0; for (i = 0; i < frag_num; i++) *endposition = *endposition | (Bit1 >> i); *endposition = *endposition >> (m - frag_num); for(i = 0; i < m; i++) if (pattern[i] == '^' || pattern[i] == '$') { pattern[i] = '\n'; } for(i = 0; i < MAXSYM; i++) Mask[i] = ~0; for(i = 0; i < m; i++) /* initialize the mask table */ { c = pattern[i]; for ( j = 0; j < m; j++) if( c == pattern[j] ) Mask[c] = Mask[c] & ~( Bit1 >> j ) ; } } static void prep(Pattern, M, D) /* preprocessing for partitioning_bm */ CHARTYPE *Pattern; /* can be fine-tuned to choose a better partition */ register int M, D; { register int i, j, k, p, shift; register unsigned m; unsigned hash, b_size = 3; m = M/(D+1); p = M - m*(D+1); for (i = 0; i < MAXSYM; i++) SHIFT[i] = m; for (i = M-1; i>=p ; i--) { shift = (M-1-i)%m; hash = Pattern[i]; if((int)(SHIFT[hash]) > (int)(shift)) SHIFT[hash] = shift; } #ifdef DEBUG for(i=0; i Candidate[cdx][1]) { Candidate[++cdx][0] = i-M-D-2; 
Candidate[cdx][1] = i+M+D; } else Candidate[cdx][1] = i+M+D; shift = d1; } else shift = d1; } CurrentByteOffset += (textbegin - text); text = textbegin; n = textend - textbegin; r_newline = '\n'; /* for those candidate areas, find the D-error matches */ if(Candidate[1][0] < 0) Candidate[1][0] = 0; endpos = endposition; /* the mask table and the endposition */ /* Bit1 = (1 << 31); */ Bit1 = (unsigned)0x80000000; oldbyteoffset = CurrentByteOffset; for(round = 0; round <= cdx; round++) { i = Candidate[round][0] ; if(Candidate[round][1] > n) Candidate[round][1] = n; if(i < 0) i = 0; CurrentByteOffset = oldbyteoffset+i; R1[0] = R2[0] = ~0; R1[1] = R2[1] = ~Bit1; for(k = 1; k <= D; k++) R1[k] = R2[k] = (R1[k-1] >> 1) & R1[k-1]; while (i < Candidate[round][1]) { c = text[i++]; CurrentByteOffset ++; if(c == r_newline) { for(k = 0 ; k <= D; k++) R1[k] = R2[k] = (~0 ); } r1 = Mask[c]; R1[0] = (R2[0] >> 1) | r1; for(k=1; k<=D; k++) R1[k] = ((R2[k] >> 1) | r1) & R2[k-1] & ((R1[k-1] & R2[k-1]) >> 1); if((R1[D] & endpos) == 0) { num_of_matched++; if(FILENAMEONLY) return 0; currentpos = i; if(i <= lastend) { CurrentByteOffset += lastend - i; i = lastend; } else { int oldcurrentpos = currentpos; if (-1 == s_output(text, &currentpos, textbegin, textend, &lastout, pat, M, oldpat, oldM)) return -1; CurrentByteOffset += currentpos - oldcurrentpos; i = currentpos; } lastend = i; for(k=0; k<=D; k++) R1[k] = R2[k] = ~0; if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) return 0; /* done */ } /* copying the code to save a few instructions. you need to understand the shift-or algorithm to figure this one... 
*/ c = text[i++]; CurrentByteOffset ++; if(c == r_newline) { for(k = 0 ; k <= D; k++) R1[k] = R2[k] = (~0 ); } r1 = Mask[c]; R2[0] = (R1[0] >> 1) | r1; for(k = 1; k <= D; k++) R2[k] = ((R1[k] >> 1) | r1) & R1[k-1] & ((R1[k-1] & R2[k-1]) >> 1); if((R2[D] & endpos) == 0) { currentpos = i; num_of_matched++; if(FILENAMEONLY) return 0; if(i <= lastend) { CurrentByteOffset += lastend - i; i = lastend; } else { int oldcurrentpos = currentpos; if (-1 == s_output(text, &currentpos, textbegin, textend, &lastout, pat, M, oldpat, oldM)) return -1; CurrentByteOffset += currentpos - oldcurrentpos; i = currentpos; } lastend = i; for(k=0; k<=D; k++) R1[k] = R2[k] = ~0; if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) return 0; /* done */ } } } if (!SILENT && INVERSE && !COUNT && (lastout <= textend)) { if (TCOMPRESSED == ON) { /* INVERSE: Don't care about filtering time */ if (agrep_finalfp != NULL) newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, textend - lastout + 1, agrep_finalfp, -1, EASYSEARCH); else { if ((newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, textend - lastout + 1, agrep_outbuffer, agrep_outlen - agrep_outpointer, EASYSEARCH)) > 0) { if (newlen + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } } else { /* NOT TCOMPRESSED */ if (agrep_finalfp != NULL) fwrite(lastout, 1, textend-lastout + 1, agrep_finalfp); else { if (textend - lastout + 1 + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, lastout, textend-lastout + 1); agrep_outpointer += (textend - lastout + 1); } } /* TCOMPRESSED */ } return 0; } /* Don't update CurrentByteOffset here: done by caller */ int s_output(text, i, textbegin, textend, lastout, pat, m, oldpat, oldm) int *i; /* in, out */ int m, oldm; CHARTYPE *text, *textbegin, *textend, *pat, *oldpat; CHARTYPE **lastout; /* in, out */ { 
int PRINTED = 0; int newlen; int oldi; CHARTYPE *curtextbegin; CHARTYPE *curtextend; #if MEASURE_TIMES struct timeval initt, finalt; #endif if(SILENT) return 0; if (TCOMPRESSED == ON) { if (!DELIMITER) { curtextbegin = text + *i; while((curtextbegin > textbegin) && (*(--curtextbegin) != '\n')); if (*curtextbegin == '\n') curtextbegin ++; curtextend = curtextbegin /*text -m + *i*/ /* + 1 agrep() has i++ */; while((curtextend < textend) && (*curtextend != '\n')) curtextend ++; if (*curtextend == '\n') curtextend ++; } else { curtextbegin = backward_delimiter(text + *i, text, tc_D_pattern, tc_D_length, OUTTAIL); curtextend = forward_delimiter(curtextbegin /*text -m + *i*/ /* + 1 agrep() has i++ */, textend, tc_D_pattern, tc_D_length, OUTTAIL); } } else { if (!DELIMITER) { curtextbegin = text + *i; while((curtextbegin > textbegin) && (*(--curtextbegin) != '\n')); if (*curtextbegin == '\n') curtextbegin ++; curtextend = curtextbegin /*text -m + *i*/ /* + 1 agrep() has i++ */; while((curtextend < textend) && (*curtextend != '\n')) curtextend ++; if (*curtextend == '\n') curtextend ++; } else { curtextbegin = backward_delimiter(text + *i, text, D_pattern, D_length, OUTTAIL); curtextend = forward_delimiter(curtextbegin /*text -m + *i*/ /* + 1 agrep() has i++ */, textend, D_pattern, D_length, OUTTAIL); } } if (TCOMPRESSED == ON) { #if MEASURE_TIMES gettimeofday(&initt, NULL); #endif /*MEASURE_TIMES*/ if (-1 == exists_tcompressed_word(pat, m, curtextbegin, text + *i - curtextbegin + m, EASYSEARCH)) { num_of_matched --; return 0; } #if MEASURE_TIMES gettimeofday(&finalt, NULL); FILTERALGO_ms += (finalt.tv_sec *1000 + finalt.tv_usec/1000) - (initt.tv_sec*1000 + initt.tv_usec/1000); #endif /*MEASURE_TIMES*/ } textbegin = curtextend; /*(curtextend - 1 > textbegin ? 
curtextend - 1 : curtextend); */ oldi = *i; *i += textbegin - (text + *i); if(COUNT) return 0; if (INVERSE) { if (TCOMPRESSED == ON) { /* INVERSE: Don't care about filtering time */ if (agrep_finalfp != NULL) newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, *lastout, curtextbegin - *lastout, agrep_finalfp, -1, EASYSEARCH); else { if ((newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, *lastout, curtextbegin - *lastout, agrep_outbuffer, agrep_outlen - agrep_outpointer, EASYSEARCH)) > 0) { if (newlen + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } *lastout=textbegin; CurrentByteOffset += textbegin - text; text = textbegin; } else { /* NOT TCOMPRESSED */ if (agrep_finalfp != NULL) fwrite(*lastout, 1, curtextbegin-*lastout, agrep_finalfp); else { if (curtextbegin - *lastout + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, *lastout, curtextbegin-*lastout); agrep_outpointer += (curtextbegin - *lastout); } *lastout=textbegin; CurrentByteOffset += textbegin - text; text = textbegin; } /* TCOMPRESSED */ return 0; } if(FNAME && (NEW_FILE || !POST_FILTER)) { char nextchar = (POST_FILTER == ON)?'\n':' '; char *prevstring = (POST_FILTER == ON)?"\n":""; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s%s", prevstring, CurrentFileName); else { int outindex; if (prevstring[0] != '\0') { if(agrep_outpointer + 1 >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else agrep_outbuffer[agrep_outpointer ++] = prevstring[0]; } for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, ":%c", 
nextchar); else { if (agrep_outpointer+2>= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else { agrep_outbuffer[agrep_outpointer++] = ':'; agrep_outbuffer[agrep_outpointer++] = nextchar; } } NEW_FILE = OFF; PRINTED = 1; } if(BYTECOUNT) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%d= ", CurrentByteOffset); else { char s[32]; int outindex; sprintf(s, "%d= ", CurrentByteOffset); for(outindex=0; (outindex+agrep_outpointer 0) { if (agrep_outpointer + newlen + 1 >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } #if MEASURE_TIMES gettimeofday(&finalt, NULL); OUTFILTER_ms += (finalt.tv_sec*1000 + finalt.tv_usec/1000) - (initt.tv_sec*1000 + initt.tv_usec/1000); #endif /*MEASURE_TIMES*/ } else { if (agrep_finalfp != NULL) { fwrite(curtextbegin, 1, curtextend - curtextbegin, agrep_finalfp); } else { if (agrep_outpointer + curtextend - curtextbegin >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer + agrep_outpointer, curtextbegin, curtextend - curtextbegin); agrep_outpointer += curtextend - curtextbegin; } } } else if (PRINTED) { if (agrep_finalfp != NULL) fputc('\n', agrep_finalfp); else agrep_outbuffer[agrep_outpointer ++] = '\n'; PRINTED = 0; } return 0; } static void prep_bm(Pattern, m) unsigned char *Pattern; register m; { int i; unsigned hash; unsigned char lastc; for (i = 0; i < MAXSYM; i++) SHIFT[i] = m; for (i = m-1; i>=0; i--) { hash = TR[Pattern[i]]; if((int)(SHIFT[hash]) >= (int)(m - 1)) SHIFT[hash] = m-1-i; } shift_1 = m-1; /* shift_1 records the previous occurrence of the last character of the pattern. When we match this last character but do not have a match, we can shift until we reach the next occurrence from the right. 
*/ lastc = TR[Pattern[m-1]]; for (i= m-2; i>=0; i--) { if(TR[Pattern[i]] == lastc ) { shift_1 = m-1 - i; i = -1; } } if(shift_1 == 0) shift_1 = 1; /* can never happen - Udi 11/7/94 */ if(NOUPPER) for(i=0; i textend) return 0; /* Udi: used to be >= for some reason */ /* added by Udi 11/7/94 */ if(WORDBOUND) { /* if(isalnum(*(unsigned char *)(text+1))) goto CONT; --> fixed by SHIOZAKI Takehiko */ if((text+1 <= textend) && isalnum(*(unsigned char *)(text+1)) && isalnum(*(unsigned char *)text)) { goto CONT; /* as if there was no match */ } /* if(isalnum(*(unsigned char *)(text-m))) goto CONT; --> fixed by SHIOZAKI Takehiko */ if((textbegin <= (text-m)) && isalnum(*(unsigned char *)(text-m)) && isalnum(*(unsigned char *)(text-m+1))) { goto CONT; /* as if there was no match */ } /* changed by Udi 11/7/94 to avoid having to set TR[] to W_delim */ } if (TCOMPRESSED == ON) { /* Don't update CurrentByteOffset here: only before outputting properly */ if (!DELIMITER) { curtextbegin = text; while((curtextbegin > textbegin) && (*(--curtextbegin) != '\n')); if (*curtextbegin == '\n') curtextbegin ++; curtextend = curtextbegin /*text-m*/; while((curtextend < textend) && (*curtextend != '\n')) curtextend ++; if (*curtextend == '\n') curtextend ++; } else { curtextbegin = backward_delimiter(text, textbegin, tc_D_pattern, tc_D_length, OUTTAIL); curtextend = forward_delimiter(curtextbegin /*text -m*/, textend, tc_D_pattern, tc_D_length, OUTTAIL); } } else { /* Don't update CurrentByteOffset here: only before outputting properly */ if (!DELIMITER) { curtextbegin = text; while((curtextbegin > textbegin) && (*(--curtextbegin) != '\n')); if (*curtextbegin == '\n') curtextbegin ++; curtextend = curtextbegin /*text-m*/; while((curtextend < textend) && (*curtextend != '\n')) curtextend ++; if (*curtextend == '\n') curtextend ++; } else { curtextbegin = backward_delimiter(text, textbegin, D_pattern, D_length, OUTTAIL); curtextend = forward_delimiter(curtextbegin/*text -m*/, textend, 
D_pattern, D_length, OUTTAIL); } } if (TCOMPRESSED == ON) { #if MEASURE_TIMES gettimeofday(&initt, NULL); #endif /*MEASURE_TIMES*/ if (-1 == exists_tcompressed_word(pat, m, curtextbegin, text - curtextbegin + m, EASYSEARCH)) goto CONT; /* as if there was no match */ #if MEASURE_TIMES gettimeofday(&finalt, NULL); FILTERALGO_ms += (finalt.tv_sec *1000 + finalt.tv_usec/1000) - (initt.tv_sec*1000 + initt.tv_usec/1000); #endif /*MEASURE_TIMES*/ } textbegin = curtextend; /*(curtextend - 1 > textbegin ? curtextend - 1 : curtextend); */ num_of_matched++; if(FILENAMEONLY) return 0; if (!COUNT) { if (!INVERSE) { if(FNAME && (NEW_FILE || !POST_FILTER)) { char nextchar = (POST_FILTER == ON)?'\n':' '; char *prevstring = (POST_FILTER == ON)?"\n":""; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s%s", prevstring, CurrentFileName); else { int outindex; if (prevstring[0] != '\0') { if(agrep_outpointer + 1 >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else agrep_outbuffer[agrep_outpointer ++] = prevstring[0]; } for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, ":%c", nextchar); else { if (agrep_outpointer+2>= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else { agrep_outbuffer[agrep_outpointer++] = ':'; agrep_outbuffer[agrep_outpointer++] = nextchar; } } NEW_FILE = OFF; PRINTED = 1; } if(BYTECOUNT) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%d= ", CurrentByteOffset); else { char s[32]; int outindex; sprintf(s, "%d= ", CurrentByteOffset); for(outindex=0; (outindex+agrep_outpointer 0) { if (agrep_outpointer + newlen + 1 >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } 
agrep_outpointer += newlen; } } #if MEASURE_TIMES gettimeofday(&finalt, NULL); OUTFILTER_ms += (finalt.tv_sec*1000 + finalt.tv_usec/1000) - (initt.tv_sec*1000 + initt.tv_usec/1000); #endif /*MEASURE_TIMES*/ } else { if (agrep_finalfp != NULL) { fwrite(curtextbegin, 1, curtextend - curtextbegin, agrep_finalfp); } else { if (agrep_outpointer + curtextend - curtextbegin >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, curtextbegin, curtextend-curtextbegin); agrep_outpointer += curtextend - curtextbegin; } } } else if (PRINTED) { if (agrep_finalfp != NULL) fputc('\n', agrep_finalfp); else agrep_outbuffer[agrep_outpointer ++] = '\n'; PRINTED = 0; } } else { /* INVERSE */ if (!SILENT) { if (TCOMPRESSED == ON) { /* INVERSE: Don't care about filtering time */ if (agrep_finalfp != NULL) newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, curtextbegin - lastout, agrep_finalfp, -1, EASYSEARCH); else { if ((newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, curtextbegin - lastout, agrep_outbuffer, agrep_outlen - agrep_outpointer, EASYSEARCH)) > 0) { if (newlen + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } lastout=textbegin; CurrentByteOffset += textbegin - text; text = textbegin; } else { /* NOT TCOMPRESSED */ if (agrep_finalfp != NULL) fwrite(lastout, 1, curtextbegin-lastout, agrep_finalfp); else { if (curtextbegin - lastout + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, lastout, curtextbegin-lastout); agrep_outpointer += (curtextbegin - lastout); } lastout=textbegin; CurrentByteOffset += textbegin - text; text = textbegin; } /* TCOMPRESSED */ } /* !SILENT */ } /* INVERSE */ } else { /* COUNT */ CurrentByteOffset += textbegin - text; text = textbegin; } /* Counteract the ++ below */ text --; CurrentByteOffset --; if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && 
(LIMITPERFILE <= num_of_matched - prev_num_of_matched))) return 0; /* done */ } CONT: text++; CurrentByteOffset ++; } if (!SILENT && INVERSE && !COUNT && (lastout <= textend)) { if (TCOMPRESSED == ON) { /* INVERSE: Don't care about filtering time */ if (agrep_finalfp != NULL) newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, textend - lastout + 1, agrep_finalfp, -1, EASYSEARCH); else { if ((newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, textend - lastout + 1, agrep_outbuffer, agrep_outlen - agrep_outpointer, EASYSEARCH)) > 0) { if (newlen + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } } else { /* NOT TCOMPRESSED */ if (agrep_finalfp != NULL) fwrite(lastout, 1, textend-lastout + 1, agrep_finalfp); else { if (textend - lastout + 1 + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, lastout, textend-lastout + 1); agrep_outpointer += (textend - lastout + 1); } } /* TCOMPRESSED */ } return 0; } /* a_monkey() the approximate monkey move */ int a_monkey( pat, m, text, textend, D ) register int m, D ; register CHARTYPE *text, *textend, *pat; { int PRINTED = 0; register CHARTYPE *oldtext; CHARTYPE *curtextbegin; CHARTYPE *curtextend; register unsigned hash, hashmask, suffix_error; register int m1 = m-1-D, pos; CHARTYPE *textbegin = text; CHARTYPE *textstart; CHARTYPE *lastout = text; int newlen; hashmask = Hashmask; oldtext = text; while (text < textend) { textstart = text; text = text+m1; suffix_error = 0; while(suffix_error <= D) { hash = *text--; while(MEMBER_1[hash]) { hash = ((hash << LOG_ASCII) + *(text--)) & hashmask; } suffix_error++; } CurrentByteOffset += text - textstart; if(text <= oldtext) { if((pos = verify(m, 2*m+D, D, pat, oldtext)) > 0) { CurrentByteOffset += (oldtext+pos - text); text = oldtext+pos; if(text > textend) return 0; /* Don't update CurrentByteOffset here: only before outputting properly */ if (TCOMPRESSED == ON) { 
if (!DELIMITER) { curtextbegin = text; while((curtextbegin > textbegin) && (*(--curtextbegin) != '\n')); if (*curtextbegin == '\n') curtextbegin ++; curtextend = curtextbegin /*text -m*/; while((curtextend < textend) && (*curtextend != '\n')) curtextend ++; if (*curtextend == '\n') curtextend ++; } else { curtextbegin = backward_delimiter(text, textbegin, tc_D_pattern, tc_D_length, OUTTAIL); curtextend = forward_delimiter(curtextbegin /*text -m*/, textend, tc_D_pattern, tc_D_length, OUTTAIL); } } else { if (!DELIMITER) { curtextbegin = text; while((curtextbegin > textbegin) && (*(--curtextbegin) != '\n')); if (*curtextbegin == '\n') curtextbegin ++; curtextend = curtextbegin/*text -m*/; while((curtextend < textend) && (*curtextend != '\n')) curtextend ++; if (*curtextend == '\n') curtextend ++; } else { curtextbegin = backward_delimiter(text, textbegin, D_pattern, D_length, OUTTAIL); curtextend = forward_delimiter(curtextbegin/*text -m*/, textend, D_pattern, D_length, OUTTAIL); } } textbegin = curtextend; /* (curtextend - 1 > textbegin ? 
curtextend - 1 : curtextend); */ num_of_matched++; if(FILENAMEONLY) return 0; if(!COUNT) { if (!INVERSE) { if(FNAME && (NEW_FILE || !POST_FILTER)) { char nextchar = (POST_FILTER == ON)?'\n':' '; char *prevstring = (POST_FILTER == ON)?"\n":""; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s%s", prevstring, CurrentFileName); else { int outindex; if (prevstring[0] != '\0') { if(agrep_outpointer + 1 >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else agrep_outbuffer[agrep_outpointer ++] = prevstring[0]; } for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, ":%c", nextchar); else { if (agrep_outpointer+2>= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else { agrep_outbuffer[agrep_outpointer++] = ':'; agrep_outbuffer[agrep_outpointer++] = nextchar; } } NEW_FILE = OFF; PRINTED = 1; } if(BYTECOUNT) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%d= ", CurrentByteOffset); else { char s[32]; int outindex; sprintf(s, "%d= ", CurrentByteOffset); for(outindex=0; (outindex+agrep_outpointer 0) { if (agrep_outpointer + newlen + 1 >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } #if MEASURE_TIMES gettimeofday(&finalt, NULL); OUTFILTER_ms += (finalt.tv_sec*1000 + finalt.tv_usec/1000) - (initt.tv_sec*1000 + initt.tv_usec/1000); #endif /*MEASURE_TIMES*/ } else { if (agrep_finalfp != NULL) { fwrite(curtextbegin, 1, curtextend - curtextbegin, agrep_finalfp); } else { if (agrep_outpointer + curtextend - curtextbegin >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, curtextbegin, curtextend-curtextbegin); 
agrep_outpointer += curtextend - curtextbegin; } } } else if (PRINTED) { if (agrep_finalfp != NULL) fputc('\n', agrep_finalfp); else agrep_outbuffer[agrep_outpointer ++] = '\n'; PRINTED = 0; } } else { /* INVERSE */ if (!SILENT) { if (TCOMPRESSED == ON) { /* INVERSE: Don't care about filtering time */ if (agrep_finalfp != NULL) newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, curtextbegin - lastout, agrep_finalfp, -1, EASYSEARCH); else { if ((newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, curtextbegin - lastout, agrep_outbuffer, agrep_outlen - agrep_outpointer, EASYSEARCH)) > 0) { if (newlen + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } lastout=textbegin; CurrentByteOffset += textbegin - text; text = textbegin; } else { /* NOT TCOMPRESSED */ if (agrep_finalfp != NULL) fwrite(lastout, 1, curtextbegin-lastout, agrep_finalfp); else { if (curtextbegin - lastout + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, lastout, curtextbegin-lastout); agrep_outpointer += (curtextbegin - lastout); } lastout=textbegin; CurrentByteOffset += textbegin - text; text = textbegin; } /* TCOMPRESSED */ } /* !SILENT */ } /* INVERSE */ } else { /* COUNT */ CurrentByteOffset += textbegin - text; text = textbegin; } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) return 0; /* done */ } else { CurrentByteOffset += (oldtext + m - text); text = oldtext + m; } } oldtext = text; } if (!SILENT && INVERSE && !COUNT && (lastout <= textend)) { if (TCOMPRESSED == ON) { /* INVERSE: Don't care about filtering time */ if (agrep_finalfp != NULL) newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, textend - lastout + 1, agrep_finalfp, -1, EASYSEARCH); else { if ((newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, textend - lastout + 1, agrep_outbuffer, 
agrep_outlen - agrep_outpointer, EASYSEARCH)) > 0) { if (newlen + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } } else { /* NOT TCOMPRESSED */ if (agrep_finalfp != NULL) fwrite(lastout, 1, textend-lastout + 1, agrep_finalfp); else { if (textend - lastout + 1 + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, lastout, textend-lastout + 1); agrep_outpointer += (textend - lastout + 1); } } /* TCOMPRESSED */ } return 0; } static void am_preprocess(Pattern) CHARTYPE *Pattern; { int i, m; m = strlen(Pattern); for (i = 1, Hashmask = 1 ; i<16 ; i++) Hashmask = (Hashmask << 1) + 1 ; for (i = 0; i < MAXMEMBER_1; i++) MEMBER_1[i] = 0; for (i = m-1; i>=0; i--) { MEMBER_1[Pattern[i]] = 1; } for (i = m-1; i > 0; i--) { MEMBER_1[(Pattern[i] << LOG_ASCII) + Pattern[i-1]] = 1; } } int verify(m, n, D, pat, text) register int m, n, D; CHARTYPE *pat, *text; { int A[MAXPATT], B[MAXPATT]; register int last = D; register int cost = 0; register int k, i, c; register int m1 = m+1; CHARTYPE *textend = text+n; CHARTYPE *textbegin = text; for (i = 0; i <= m1; i++) A[i] = B[i] = i; while (text < textend) { for (k = 1; k <= last; k++) { cost = B[k-1]+1; if (pat[k-1] != *text) { if (B[k]+1 < cost) cost = B[k]+1; if (A[k-1]+1 < cost) cost = A[k-1]+1; } else cost = cost -1; A[k] = cost; } if(pat[last] == *text++) { A[last+1] = B[last]; last++; } if(A[last] < D) A[last+1] = A[last++]+1; while (A[last] > D) last = last - 1; if(last >= m) return(text - textbegin - 1); if(*text == '\n') { last = D; for(c = 0; c<=m1; c++) A[c] = B[c] = c; } for (k = 1; k <= last; k++) { cost = A[k-1]+1; if (pat[k-1] != *text) { if (A[k]+1 < cost) cost = A[k]+1; if (B[k-1]+1 < cost) cost = B[k-1]+1; } else cost = cost -1; B[k] = cost; } if(pat[last] == *text++) { B[last+1] = A[last]; last++; } if(B[last] < D) B[last+1] = B[last++]+1; while (B[last] > D) last = last -1; if(last >= m) return(text - textbegin - 1); 
if(*text == '\n') { last = D; for(c = 0; c<=m1; c++) A[c] = B[c] = c; } } return(0); } /* preprocessing for monkey() */ static void m_preprocess(Pattern) CHARTYPE *Pattern; { int i, j, m; unsigned hash; m = strlen(Pattern); for (i = 0; i < MAX_SHIFT_2; i++) SHIFT_2[i] = m; for (i = m-1; i>=1; i--) { hash = TR[Pattern[i]]; hash = hash << 3; for (j = 0; j< MAXSYM; j++) { if(SHIFT_2[hash+j] == m) SHIFT_2[hash+j] = m-1; } hash = hash + TR[Pattern[i-1]]; if((int)(SHIFT_2[hash]) >= (int)(m - 1)) SHIFT_2[hash] = m-1-i; } shift_1 = m-1; for (i= m-2; i>=0; i--) { if(TR[Pattern[i]] == TR[Pattern[m-1]] ) { shift_1 = m-1 - i; i = -1; } } if(shift_1 == 0) shift_1 = 1; SHIFT_2[0] = 0; } /* monkey4() the approximate monkey move */ char *MEMBER_D = NULL; int monkey4( pat, m, text, textend, D ) register int m, D ; register unsigned char *text, *pat, *textend; { int PRINTED = 0; register unsigned char *oldtext; register unsigned hash, hashmask, suffix_error; register int m1=m-1-D, pos; CHARTYPE *textbegin = text; CHARTYPE *textstart; CHARTYPE *curtextbegin; CHARTYPE *curtextend; CHARTYPE *lastout = text; int newlen; hashmask = Hashmask; oldtext = text ; while (text < textend) { textstart = text; text = text + m1; suffix_error = 0; while(suffix_error <= D) { hash = char_map[*text--]; hash = ((hash << LOG_DNA) + char_map[*(text--)]) & hashmask; while(MEMBER_D[hash]) { hash = ((hash << LOG_DNA) + char_map[*(text--)]) & hashmask; } suffix_error++; } CurrentByteOffset += text - textstart; if(text <= oldtext) { if((pos = verify(m, 2*m+D, D, pat, oldtext)) > 0) { CurrentByteOffset += (oldtext+pos - text); text = oldtext+pos; if(text > textend) return 0; if (TCOMPRESSED == ON) { /* Don't update CurrentByteOffset here: only before outputting properly */ if (!DELIMITER) { curtextbegin = text; while((curtextbegin > textbegin) && (*(--curtextbegin) != '\n')); if (*curtextbegin == '\n') curtextbegin ++; curtextend = curtextbegin /*text -m*/; while((curtextend < textend) && (*curtextend != '\n')) 
curtextend ++; if (*curtextend == '\n') curtextend ++; } else { curtextbegin = backward_delimiter(text, textbegin, tc_D_pattern, tc_D_length, OUTTAIL); curtextend = forward_delimiter(curtextbegin/*text -m*/, textend, tc_D_pattern, tc_D_length, OUTTAIL); } } else { /* Don't update CurrentByteOffset here: only before outputting properly */ if (!DELIMITER) { curtextbegin = text; while((curtextbegin > textbegin) && (*(--curtextbegin) != '\n')); if (*curtextbegin == '\n') curtextbegin ++; curtextend = curtextbegin /*text -m*/; while((curtextend < textend) && (*curtextend != '\n')) curtextend ++; if (*curtextend == '\n') curtextend ++; } else { curtextbegin = backward_delimiter(text, textbegin, D_pattern, D_length, OUTTAIL); curtextend = forward_delimiter(curtextbegin/*text -m*/, textend, D_pattern, D_length, OUTTAIL); } } textbegin = curtextend; /*(curtextend - 1 > textbegin ? curtextend - 1 : curtextend); */ num_of_matched++; if(FILENAMEONLY) return 0; if(!COUNT) { if (!INVERSE) { if(FNAME && (NEW_FILE || !POST_FILTER)) { char nextchar = (POST_FILTER == ON)?'\n':' '; char *prevstring = (POST_FILTER == ON)?"\n":""; if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s%s", prevstring, CurrentFileName); else { int outindex; if (prevstring[0] != '\0') { if(agrep_outpointer + 1 >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else agrep_outbuffer[agrep_outpointer ++] = prevstring[0]; } for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } if (PRINTFILETIME) { char *s = aprint_file_time(CurrentFileTime); if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%s", s); else { int outindex; for(outindex=0; (outindex+agrep_outpointer=agrep_outlen)) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += outindex; } } if (agrep_finalfp != NULL) fprintf(agrep_finalfp, ":%c", nextchar); else { if (agrep_outpointer+2>= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } else { agrep_outbuffer[agrep_outpointer++] = ':'; 
agrep_outbuffer[agrep_outpointer++] = nextchar; } } NEW_FILE = OFF; PRINTED = 1; } if(BYTECOUNT) { if (agrep_finalfp != NULL) fprintf(agrep_finalfp, "%d= ", CurrentByteOffset); else { char s[32]; int outindex; sprintf(s, "%d= ", CurrentByteOffset); for(outindex=0; (outindex+agrep_outpointer 0) { if (agrep_outpointer + newlen + 1 >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } #if MEASURE_TIMES gettimeofday(&finalt, NULL); OUTFILTER_ms += (finalt.tv_sec*1000 + finalt.tv_usec/1000) - (initt.tv_sec*1000 + initt.tv_usec/1000); #endif /*MEASURE_TIMES*/ } else { if (agrep_finalfp != NULL) { fwrite(curtextbegin, 1, curtextend - curtextbegin, agrep_finalfp); } else { if (agrep_outpointer + curtextend - curtextbegin >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, curtextbegin, curtextend-curtextbegin); agrep_outpointer += curtextend - curtextbegin; } } } else if (PRINTED) { if (agrep_finalfp != NULL) fputc('\n', agrep_finalfp); else agrep_outbuffer[agrep_outpointer ++] = '\n'; PRINTED = 0; } } else { /* INVERSE */ if (!SILENT) { if (TCOMPRESSED == ON) { /* INVERSE: Don't care about filtering time */ if (agrep_finalfp != NULL) newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, curtextbegin - lastout, agrep_finalfp, -1, EASYSEARCH); else { if ((newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, curtextbegin - lastout, agrep_outbuffer, agrep_outlen - agrep_outpointer, EASYSEARCH)) > 0) { if (newlen + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } lastout=textbegin; CurrentByteOffset += textbegin + 1 - text; text = textbegin + 1; } else { /* NOT TCOMPRESSED */ if (agrep_finalfp != NULL) fwrite(lastout, 1, curtextbegin-lastout, agrep_finalfp); else { if (curtextbegin - lastout + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, lastout, curtextbegin-lastout); agrep_outpointer += 
(curtextbegin - lastout); } lastout=textbegin; CurrentByteOffset += textbegin + 1 - text; text = textbegin + 1; } /* TCOMPRESSED */ } /* !SILENT */ } /* INVERSE */ } else { /* COUNT */ CurrentByteOffset += textbegin + 1 - text; text = textbegin + 1 ; } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) return 0; /* done */ } else { CurrentByteOffset += (oldtext + m - text); text = oldtext + m; } } oldtext = text; } if (!SILENT && INVERSE && !COUNT && (lastout <= textend)) { if (TCOMPRESSED == ON) { /* INVERSE: Don't care about filtering time */ if (agrep_finalfp != NULL) newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, textend - lastout + 1, agrep_finalfp, -1, EASYSEARCH); else { if ((newlen = quick_tuncompress(FREQ_FILE, STRING_FILE, lastout, textend - lastout + 1, agrep_outbuffer, agrep_outlen - agrep_outpointer, EASYSEARCH)) > 0) { if (newlen + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } agrep_outpointer += newlen; } } } else { /* NOT TCOMPRESSED */ if (agrep_finalfp != NULL) fwrite(lastout, 1, textend-lastout + 1, agrep_finalfp); else { if (textend - lastout + 1 + agrep_outpointer >= agrep_outlen) { OUTPUT_OVERFLOW; return -1; } memcpy(agrep_outbuffer+agrep_outpointer, lastout, textend-lastout + 1); agrep_outpointer += (textend - lastout + 1); } } /* TCOMPRESSED */ } return 0; } static void prep4(Pattern, m) char *Pattern; int m; { int i, j, k; unsigned hash; for(i=0; i< MAXSYM; i++) char_map[i] = 0; /* map both lower- and upper-case DNA symbols to the same code */ char_map['a'] = char_map['A'] = 4; char_map['g'] = char_map['G'] = 1; char_map['t'] = char_map['T'] = 2; char_map['c'] = char_map['C'] = 3; char_map['n'] = char_map['N'] = 5; BSize = blog(4, m); for (i = 1, Hashmask = 1 ; i<(int)(BSize*LOG_DNA); i++) Hashmask = (Hashmask << 1) + 1 ; if (MEMBER_D != NULL) free(MEMBER_D); MEMBER_D = (char *) malloc((Hashmask+1) * sizeof(char)); #ifdef DEBUG printf("BSize = %d", BSize); #endif for (i=0; i <= 
Hashmask; i++) MEMBER_D[i] = 0; for (j=0; j < (int)BSize; j++) { for(i=m-1; i >= j; i--) { hash = 0; for(k=0; k <= j; k++) hash = (hash << LOG_DNA) +char_map[Pattern[i-k]]; #ifdef DEBUG printf("< %d >, ", hash); #endif MEMBER_D[hash] = 1; } } } int blog(base, m ) int base, m; { int i, exp; exp = base; m = m + m/2; for (i = 1; exp < m; i++) exp = exp * base; return(i); } glimpse-4.18.7/agrep/utilities.c000066400000000000000000000074211300371307100165270ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ /* this file contains various utility functions for accessing and manipulating regular expression syntax trees. */ #include #include #include "re.h" /************************************************************************/ /* */ /* the following routines implement an abstract data type "stack". */ /* */ /************************************************************************/ Stack Push(s, v) Stack *s; Re_node v; { Stack node; new_node(Stack, node, node); if (s == NULL || node == NULL) return NULL; /* can't allocate */ node->next = *s; node->val = v; if (*s == NULL) node->size = 1; else node->size = (*s)->size + 1; *s = node; return *s; } Re_node Pop(s) Stack *s; { Re_node node; Stack temp; if (s == NULL || *s == NULL) return NULL; else { temp = *s; node = (*s)->val; *s = (*s)->next; free(temp); return node; } } Re_node Top(s) Stack s; { if (s == NULL) return NULL; else return s->val; } int Size(s) Stack s; { if (s == NULL) return 0; else return s->size; } /************************************************************************/ /* */ /* the following routines manipulate sets of positions. */ /* */ /************************************************************************/ int occurs_in(n, p) int n; Pset p; { while (p != NULL) if (n == p->posnum) return 1; else p = p->nextpos; return 0; } /* pset_union() takes two position-sets and returns their union. 
*/ Pset pset_union(s1, s2, dontreplicate) Pset s1, s2; int dontreplicate; { Pset hd, curr, new = NULL; Pset replicas2 = NULL, temps2 = s2; /* code added: 26/Aug/96 */ /* Code added on 26/Aug/96 */ if (dontreplicate) replicas2 = s2; else while (temps2 != NULL) { new_node(Pset, new, new); if (new == NULL) return NULL; new->posnum = temps2->posnum; if (replicas2 == NULL) replicas2 = new; else curr->nextpos = new; curr = new; temps2 = temps2->nextpos; } hd = NULL; curr = NULL; while (s1 != NULL) { if (!occurs_in(s1->posnum, s2)) { new_node(Pset, new, new); if (new == NULL) return NULL; new->posnum = s1->posnum; if (hd == NULL) hd = new; else curr->nextpos = new; } curr = new; s1 = s1->nextpos; } if (hd == NULL) hd = replicas2; /* changed from s2: 26/Aug/96 */ else curr->nextpos = replicas2; /* changed from s2: 26/Aug/96 */ return hd; } /* create_pos() creates a position node with the position value given, then returns a pointer to this node. */ Pset create_pos(n) int n; { Pset x; new_node(Pset, x, x); if (x == NULL) return NULL; x->posnum = n; x->nextpos = NULL; return x; } /* eq_pset() takes two position sets and checks to see if they are equal. It returns 1 if the sets are equal, 0 if they are not. 
*/ int subset_pset(s1, s2) Pset s1, s2; { int subs = 1; while (s1 != NULL && subs != 0) { subs = 0; while (s2 != NULL && subs != 1) if (s1->posnum == s2->posnum) subs = 1; else s2 = s2->nextpos; s1 = s1->nextpos; } return subs; } int eq_pset(s1, s2) Pset s1, s2; { return subset_pset(s1, s2) && subset_pset(s2, s1); } int word_exists(word, wordlen, line, linelen) unsigned char *word, *line; int wordlen, linelen; { unsigned char oldchar, *lineend = line+linelen; int i; i = 0; while(line /* HAVE_SYS_SELECT_H is defined here */ #if SFS_COMPAT #if defined(__NeXT__) #include #else #include #endif #endif #include #include #include #include #include #include #include /* #include */ /* #include */ #include #if 1 #if defined(_IBMR2) #include #endif #else #if defined(HAVE_SYS_SELECT_H) #include #endif #endif #include "glimpse.h" #include "defs.h" int mystrlen(str, max) char *str; int max; { int i=0; while ((i 0) { #if SFS_COMPAT nread = syscall(SYS_read, fd, ptr, nleft); #else nread = read(fd, ptr, nleft); #endif if (nread < 0) return(nread); else if (nread == 0) break; /* EOF */ nleft -= nread; ptr += nread; } return (nbytes - nleft); } int writen(fd, ptr, nbytes) int fd; char *ptr; int nbytes; { int nleft, nwritten; nleft = nbytes; while (nleft > 0) { #if SFS_COMPAT nwritten = syscall(SYS_write, fd, ptr, nleft); #else nwritten = write(fd, ptr, nleft); #endif if (nwritten <= 0) return nwritten; nleft -= nwritten; ptr += nwritten; } return (nbytes - nleft); } int readline(sockfd, ptr, maxlen) int sockfd; char *ptr; int maxlen; { int n, rc; char c; for (n=1; n= 20)) return -1; if (((*pclstdout = fds[1]) < 0) || (*pclstdout >= 20)) return -1; if (((*pclstderr = fds[2]) < 0) || (*pclstderr >= 20)) return -1; return 0; } #endif /*USE_MSGHDR*/ int linearize(sockfd, reqbuf, reqlen, argc, argv, pid) int sockfd; int reqlen, argc; char *reqbuf, *argv[]; int pid; { int i; unsigned char array[4]; int ptr = 0; int len; array[0] = (pid & 0xff000000) >> 24; array[1] = (pid & 0xff0000) >> 
16; array[2] = (pid & 0xff00) >> 8; array[3] = (pid & 0xff); if (sockfd >= 0) { if (writen(sockfd, array, 4) < 4) return -1; } if (reqbuf != NULL) { if (ptr + 4 >= reqlen) return -1; memcpy(reqbuf+ptr, array, 4); ptr += 4; } array[0] = (argc & 0xff000000) >> 24; array[1] = (argc & 0xff0000) >> 16; array[2] = (argc & 0xff00) >> 8; array[3] = (argc & 0xff); if (sockfd >= 0) { if (writen(sockfd, array, 4) < 4) return -1; } if (reqbuf != NULL) { if (ptr + 4 >= reqlen) return -1; memcpy(reqbuf+ptr, array, 4); ptr += 4; } for (i=0; i= 0) { if (writen(sockfd, argv[i], len + 1) < len + 1) return -1; if (writen(sockfd, "\n", 1) < 1) return -1; /* so that we can do gets */ } if (reqbuf != NULL) { if (ptr + len + 2 >= reqlen) return -1; strcpy(reqbuf+ptr, argv[i]); ptr += len+1; reqbuf[ptr++] = '\0'; /* so that we can do strcpy */ } #if 0 printf("sending %s\n", argv[i]); #endif } return ptr; } int delinearize(sockfd, reqbuf, reqlen, pargc, pargv, ppid) int sockfd; int reqlen, *pargc; char *reqbuf, **pargv[]; int *ppid; { int i; char line[MAXLINE]; int len; int ptr = 0; unsigned char array[4]; *ppid = 0; *pargc = 0; *pargv = NULL; memset(array, '\0', 4); if (sockfd >= 0) if (readn(sockfd, array, 4) != 4) return -1; if (reqbuf != NULL) { if (ptr+4 >= reqlen) return -1; memcpy(array, reqbuf+ptr, 4); ptr += 4; } *ppid = (array[0] << 24) + (array[1] << 16) + (array[2] << 8) + array[3]; memset(array, '\0', 4); if (sockfd >= 0) if (readn(sockfd, array, 4) != 4) return -1; if (reqbuf != NULL) { if (ptr+4 >= reqlen) return -1; memcpy(array, reqbuf+ptr, 4); ptr += 4; } *pargc = (array[0] << 24) + (array[1] << 16) + (array[2] << 8) + array[3]; #if 0 printf("clargc=%x\n", *pargc); #endif /* VERY important, set hard-coded limit to MAX_ARGS*MAX_NAME_LEN; otherwise can cause the server to allocate TONS of memory */ if (*pargc <= 0 || *pargc >= (MAX_ARGS*MAX_NAME_LEN)) { *pargc = 0; return -1; } if ((*pargv = (char **)malloc(sizeof(char *) * *pargc)) == NULL) { /* no memory, so discard */ 
*pargc = 0; return - 1; } memset(*pargv, '\0', sizeof(char *) * *pargc); for (i=0; i<*pargc; i++) { if (sockfd >= 0) { if (readline(sockfd, line, MAXLINE) <= 0) return -1; if ((len = mystrlen(line, MAXLINE)) <= 0) { i--; continue; } if (((*pargv)[i] = (char *)malloc(len + 2)) == NULL) return -1; line[len] = '\0'; /* overwrite the '\n' */ strcpy((*pargv)[i], line); } if (reqbuf != NULL) { if ( ((len = mystrlen(reqbuf+ptr, reqlen-ptr)) <= 0) || (len >= MAXLINE) ) return -1; if (((*pargv)[i] = (char *)malloc(len + 2)) == NULL) return -1; strcpy((*pargv)[i], reqbuf+ptr); ptr += len + 2; } #if 0 printf("clargv[%x]=%s\n", i, (*pargv)[i]); #endif } return ptr; } int sendreq(sockfd, reqbuf, clstdin, clstdout, clstderr, clargc, clargv, clpid) int sockfd, clstdin, clstdout, clstderr, clargc, clpid; char reqbuf[MAX_ARGS*MAX_NAME_LEN], *clargv[]; { #if USE_MSGHDR struct iovec iov[1]; struct msghdr msg; int ret; int fds[3]; #endif /*USE_MSGHDR*/ #if USE_MSGHDR if ((ret = linearize(-1, reqbuf, MAX_ARGS*MAX_NAME_LEN, clargc, clargv, clpid)) < 0) return -1; fds[2] = clstdin; fds[1] = clstdout; fds[0] = clstderr; iov[0].iov_base = (char *) reqbuf; iov[0].iov_len = ret; msg.msg_iov = iov; msg.msg_iovlen = 1; msg.msg_name = (caddr_t) NULL; msg.msg_namelen = 0; msg.msg_accrights = (caddr_t) fds; msg.msg_accrightslen = 2 * sizeof(int); /* don't send clstdin */ errno = 0; #if SFS_COMPAT if ((ret = syscall(SYS_sendmsg, sockfd, &msg, 0)) < 0) { #else if ((ret = sendmsg(sockfd, &msg, 0)) < 0) { #endif #if 0 printf("sendmsg ret = %x, errno = %d\n", ret, errno); #endif return (-1); } #if 0 printf("sendreq %x %x %x, ret = %x, errno = %d\n", fds[0], fds[1], fds[2], ret, errno); #endif #else /*USE_MSGHDR*/ if (linearize(sockfd, (char *)NULL, MAX_ARGS*MAX_NAME_LEN, clargc, clargv, clpid) < 0) return -1; #endif /*USE_MSGHDR*/ return (0); } int getreq(sockfd, reqbuf, pclstdin, pclstdout, pclstderr, pclargc, pclargv, pclpid) int sockfd, *pclstdin, *pclstdout, *pclstderr, *pclargc, *pclpid; char 
reqbuf[MAX_ARGS*MAX_NAME_LEN], **pclargv[]; { #if USE_MSGHDR struct iovec iov[1]; struct msghdr msg; int ret; int fds[3]; #endif /*USE_MSGHDR*/ #if USE_MSGHDR iov[0].iov_base = (char *) reqbuf; iov[0].iov_len = MAX_ARGS * MAX_NAME_LEN; msg.msg_iov = iov; msg.msg_iovlen = 1; msg.msg_name = (caddr_t) NULL; msg.msg_namelen = 0; msg.msg_accrights = (caddr_t)fds; msg.msg_accrightslen = 2*sizeof(int); errno = 0; #if SFS_COMPAT if ((ret = syscall(SYS_recvmsg, sockfd, &msg, 0)) < 0) { #else if ((ret = recvmsg(sockfd, &msg, 0)) < 0) { #endif #if 0 printf("bad recvmsg: ret = %x, errno = %d\n", ret, errno); #endif return -1; } *pclstdin = fds[2]; *pclstdout = fds[1]; *pclstderr = fds[0]; /* assign (not compare) so that a failure in delinearize() is actually detected */ if ((ret = delinearize(-1, reqbuf, MAX_ARGS * MAX_NAME_LEN, pclargc, pclargv, pclpid)) < 0) return -1; #if 0 printf("getreq %x %x %x, ret = %x, errno = %d\n", fds[0], fds[1], fds[2], ret, errno); #endif #else /*USE_MSGHDR*/ if (delinearize(sockfd, (char *)NULL, MAX_ARGS * MAX_NAME_LEN, pclargc, pclargv, pclpid) < 0) return -1; *pclstdin = -1; *pclstdout = sockfd; *pclstderr = sockfd; #endif /*USE_MSGHDR*/ return (0); } glimpse-4.18.7/compress/000077500000000000000000000000001300371307100151015ustar00rootroot00000000000000glimpse-4.18.7/compress/Makefile.NeXT000066400000000000000000000057411300371307100173650ustar00rootroot00000000000000#/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. */ # Makefile for the compress library -- agrep should be linked with it in case # it wants to search for patterns in a compressed file. # You might have to change these depending on your machine configuration. # AR and RANLIB are the library-archive programs. On IRIX, RANLIB is not # required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!). CC = gcc AR = /bin/ar RANLIB = /bin/ranlib SHELL = /bin/sh # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). 
HAVE_DIRENT_H = 0 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8-bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE INDEXDIR = ../index AGREPDIR = ../agrep LIBDIR = ../lib BIN = ../bin TEMPLATEDIR = ../libtemplate all: lib tbuild cast uncast test cp tbuild $(BIN)/. cp cast $(BIN)/. cp uncast $(BIN)/. # Include flags are not a part of CFLAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include -I/usr/include/bsd/sys DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OTHERLIBS = LIBOBJ = hash.o string.o misc.o quick.o cast.o uncast.o tsimpletest.o tmemlook.o tbuild.o LIB = $(LIBDIR)/libcast.a lib: $(LIBOBJ) $(AR) rcv $(LIB) $(LIBOBJ) $(RANLIB) $(LIB) test: hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o $(CC) $(LINKFLAGS) -o test hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o $(OTHERLIBS) tbuild: hash.o string.o misc.o tbuild.o main_tbuild.o defs.h $(CC) $(LINKFLAGS) -o tbuild hash.o string.o misc.o tbuild.o main_tbuild.o $(OTHERLIBS) cast: main_cast.o $(LIB) $(CC) $(LINKFLAGS) -o cast main_cast.o $(LIBOBJ) $(OTHERLIBS) uncast: main_uncast.o $(LIB) $(CC) $(LINKFLAGS) -o uncast main_uncast.o $(LIBOBJ) $(OTHERLIBS) hash.o: defs.h $(INDEXDIR)/glimpse.h string.o: defs.h $(INDEXDIR)/glimpse.h misc.o: defs.h $(INDEXDIR)/glimpse.h quick.o: 
defs.h $(INDEXDIR)/glimpse.h cast.o: defs.h $(INDEXDIR)/glimpse.h uncast.o: defs.h $(INDEXDIR)/glimpse.h main_cast.o: defs.h $(INDEXDIR)/glimpse.h main_uncast.o: defs.h $(INDEXDIR)/glimpse.h tsimpletest.o: defs.h $(INDEXDIR)/glimpse.h tmemlook.o: defs.h $(INDEXDIR)/glimpse.h test.o : test.c clean: rm -f *.o $(LIB) core test cast uncast tbuild a.out glimpse-4.18.7/compress/Makefile.alpha000066400000000000000000000060061300371307100176270ustar00rootroot00000000000000#/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. */ # Makefile for the compress library -- agrep should be linked with it in case # it wants to search for patterns in a compressed file. # You might have to change these depending on your machine configuration. # AR and RANLIB are the library-archive programs. On Solaris, RANLIB is not # required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!). CC = cc AR = ar #/usr/ccs/bin/ar #for Solaris RANLIB = ranlib #true #for Solaris SHELL = /bin/sh # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE INDEXDIR = ../index AGREPDIR = ../agrep LIBDIR = ../lib BIN = ../bin TEMPLATEDIR = ../libtemplate all: lib tbuild cast uncast test cp tbuild $(BIN)/. cp cast $(BIN)/. cp uncast $(BIN)/. 
# Include flags are not a part of CFLAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O -Olimit 3000 #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OTHERLIBS = LIBOBJ = hash.o string.o misc.o quick.o cast.o uncast.o tsimpletest.o tmemlook.o tbuild.o LIB = $(LIBDIR)/libcast.a lib: $(LIBOBJ) $(AR) rcv $(LIB) $(LIBOBJ) $(RANLIB) $(LIB) test: hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o $(CC) $(LINKFLAGS) -o test hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o $(OTHERLIBS) tbuild: hash.o string.o misc.o tbuild.o main_tbuild.o defs.h $(CC) $(LINKFLAGS) -o tbuild hash.o string.o misc.o tbuild.o main_tbuild.o $(OTHERLIBS) cast: main_cast.o $(LIB) $(CC) $(LINKFLAGS) -o cast main_cast.o $(LIBOBJ) $(OTHERLIBS) uncast: main_uncast.o $(LIB) $(CC) $(LINKFLAGS) -o uncast main_uncast.o $(LIBOBJ) $(OTHERLIBS) hash.o: defs.h $(INDEXDIR)/glimpse.h string.o: defs.h $(INDEXDIR)/glimpse.h misc.o: defs.h $(INDEXDIR)/glimpse.h quick.o: defs.h $(INDEXDIR)/glimpse.h cast.o: defs.h $(INDEXDIR)/glimpse.h uncast.o: defs.h $(INDEXDIR)/glimpse.h main_cast.o: defs.h $(INDEXDIR)/glimpse.h main_uncast.o: defs.h $(INDEXDIR)/glimpse.h tsimpletest.o: defs.h $(INDEXDIR)/glimpse.h tmemlook.o: defs.h $(INDEXDIR)/glimpse.h test.o : test.c clean: rm -f *.o $(LIB) core test cast uncast tbuild a.out glimpse-4.18.7/compress/Makefile.hp000066400000000000000000000057411300371307100171560ustar00rootroot00000000000000#/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. 
*/ # Makefile for the compress library -- agrep should be linked with it in case # it wants to search for patterns in a compressed file. # You might have to change these depending on your machine configuration. # AR and RANLIB are the library-archive programs. On Solaris, RANLIB is not # required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!). CC = cc AR = ar #/usr/ccs/bin/ar #for Solaris RANLIB = : SHELL = /bin/sh # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 0 HAVE_SYS_DIR_H = 1 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE INDEXDIR = ../index AGREPDIR = ../agrep LIBDIR = ../lib BIN = ../bin TEMPLATEDIR = ../libtemplate all: lib tbuild cast uncast test cp tbuild $(BIN)/. cp cast $(BIN)/. cp uncast $(BIN)/. 
# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS)
OTHERLIBS =
LIBOBJ = hash.o string.o misc.o quick.o cast.o uncast.o tsimpletest.o tmemlook.o tbuild.o
LIB = $(LIBDIR)/libcast.a

lib: $(LIBOBJ)
	$(AR) rcv $(LIB) $(LIBOBJ)
	$(RANLIB) $(LIB)

test: hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o
	$(CC) $(LINKFLAGS) -o test hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o $(OTHERLIBS)

tbuild: hash.o string.o misc.o tbuild.o main_tbuild.o defs.h
	$(CC) $(LINKFLAGS) -o tbuild hash.o string.o misc.o tbuild.o main_tbuild.o $(OTHERLIBS)

cast: main_cast.o $(LIB)
	$(CC) $(LINKFLAGS) -o cast main_cast.o $(LIBOBJ) $(OTHERLIBS)

uncast: main_uncast.o $(LIB)
	$(CC) $(LINKFLAGS) -o uncast main_uncast.o $(LIBOBJ) $(OTHERLIBS)

hash.o: defs.h $(INDEXDIR)/glimpse.h
string.o: defs.h $(INDEXDIR)/glimpse.h
misc.o: defs.h $(INDEXDIR)/glimpse.h
quick.o: defs.h $(INDEXDIR)/glimpse.h
cast.o: defs.h $(INDEXDIR)/glimpse.h
uncast.o: defs.h $(INDEXDIR)/glimpse.h
main_cast.o: defs.h $(INDEXDIR)/glimpse.h
main_uncast.o: defs.h $(INDEXDIR)/glimpse.h
tsimpletest.o: defs.h $(INDEXDIR)/glimpse.h
tmemlook.o: defs.h $(INDEXDIR)/glimpse.h
test.o : test.c

clean:
	rm -f *.o $(LIB) core test cast uncast tbuild a.out
glimpse-4.18.7/compress/Makefile.in000066400000000000000000000044261300371307100171540ustar00rootroot00000000000000
#/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved.
*/ # Makefile for the compress library -- agrep should be linked with it in case # it wants to search for patterns in a compressed file. # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE srcdir = @srcdir@ VPATH = @srcdir@ SHELL = /bin/sh CC = @CC@ AR = @AR@ RANLIB = @RANLIB@ CP = @CP@ STRIP = @STRIP@ INSTALL = @INSTALL@ INSTALL_PROGRAM = @INSTALL_PROGRAM@ INSTALL_DATA = @INSTALL_DATA@ DEFS = prefix = @prefix@ exec_prefix = @exec_prefix@ binprefix = manprefix = bindir = $(exec_prefix)/bin libdir = $(exec_prefix)/lib mandir = $(prefix)/man/man1 manext = 1 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE INDEXDIR = ../index AGREPDIR = ../agrep LIBDIR = ../lib BIN = ../bin TEMPLATEDIR = ../libtemplate OPTIMIZEFLAGS = -O2 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include CFLAGS = $(INCLUDEFLAGS) $(OPTIMIZEFLAGS) LDFLAGS = OTHERLIBS = LIBOBJ = hash.o string.o misc.o quick.o cast.o uncast.o tsimpletest.o tmemlook.o tbuild.o LIB = $(LIBDIR)/libcast.a all: $(LIB) tbuild cast uncast install: all for i in tbuild cast uncast ; do \ $(INSTALL) $$i $(bindir) ; \ done install-man: clean: rm -f *.o $(LIB) core test cast uncast tbuild a.out distclean: clean rm -f Makefile $(LIB): $(LIBOBJ) $(AR) rcv $(LIB) $(LIBOBJ) $(RANLIB) $(LIB) test: hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o $(CC) $(LDFLAGS) -o test hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o $(OTHERLIBS) tbuild: hash.o string.o misc.o tbuild.o main_tbuild.o defs.h $(CC) $(LDFLAGS) -o tbuild hash.o string.o misc.o tbuild.o main_tbuild.o $(OTHERLIBS) cast: main_cast.o $(LIB) $(CC) $(LDFLAGS) -o cast main_cast.o $(LIBOBJ) $(OTHERLIBS) uncast: main_uncast.o $(LIB) $(CC) $(LDFLAGS) -o uncast main_uncast.o $(LIBOBJ) $(OTHERLIBS) hash.o: defs.h $(INDEXDIR)/glimpse.h string.o: defs.h $(INDEXDIR)/glimpse.h misc.o: defs.h $(INDEXDIR)/glimpse.h quick.o: defs.h $(INDEXDIR)/glimpse.h cast.o: defs.h $(INDEXDIR)/glimpse.h uncast.o: defs.h 
$(INDEXDIR)/glimpse.h main_cast.o: defs.h $(INDEXDIR)/glimpse.h main_uncast.o: defs.h $(INDEXDIR)/glimpse.h tsimpletest.o: defs.h $(INDEXDIR)/glimpse.h tmemlook.o: defs.h $(INDEXDIR)/glimpse.h test.o : test.c glimpse-4.18.7/compress/Makefile.linux000066400000000000000000000060011300371307100176740ustar00rootroot00000000000000#/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. */ # Makefile for the compress library -- agrep should be linked with it in case # it wants to search for patterns in a compressed file. # You might have to change these depending on your machine configuration. # AR and RANLIB are the library-archive programs. On Solaris, RANLIB is not # required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!). CC = gcc -m486 AR = ar #/usr/ccs/bin/ar #for Solaris RANLIB = ranlib #true #for Solaris SHELL = /bin/sh # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE INDEXDIR = ../index AGREPDIR = ../agrep LIBDIR = ../lib BIN = ../bin TEMPLATEDIR = ../libtemplate all: lib tbuild cast uncast test cp tbuild $(BIN)/. cp cast $(BIN)/. cp uncast $(BIN)/. 
# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O2
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS)
OTHERLIBS =
LIBOBJ = hash.o string.o misc.o quick.o cast.o uncast.o tsimpletest.o tmemlook.o tbuild.o
LIB = $(LIBDIR)/libcast.a

lib: $(LIBOBJ)
	$(AR) rcv $(LIB) $(LIBOBJ)
	$(RANLIB) $(LIB)

test: hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o
	$(CC) $(LINKFLAGS) -o test hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o $(OTHERLIBS)

tbuild: hash.o string.o misc.o tbuild.o main_tbuild.o defs.h
	$(CC) $(LINKFLAGS) -o tbuild hash.o string.o misc.o tbuild.o main_tbuild.o $(OTHERLIBS)

cast: main_cast.o $(LIB)
	$(CC) $(LINKFLAGS) -o cast main_cast.o $(LIBOBJ) $(OTHERLIBS)

uncast: main_uncast.o $(LIB)
	$(CC) $(LINKFLAGS) -o uncast main_uncast.o $(LIBOBJ) $(OTHERLIBS)

hash.o: defs.h $(INDEXDIR)/glimpse.h
string.o: defs.h $(INDEXDIR)/glimpse.h
misc.o: defs.h $(INDEXDIR)/glimpse.h
quick.o: defs.h $(INDEXDIR)/glimpse.h
cast.o: defs.h $(INDEXDIR)/glimpse.h
uncast.o: defs.h $(INDEXDIR)/glimpse.h
main_cast.o: defs.h $(INDEXDIR)/glimpse.h
main_uncast.o: defs.h $(INDEXDIR)/glimpse.h
tsimpletest.o: defs.h $(INDEXDIR)/glimpse.h
tmemlook.o: defs.h $(INDEXDIR)/glimpse.h
test.o : test.c

clean:
	rm -f *.o $(LIB) core test cast uncast tbuild a.out
glimpse-4.18.7/compress/Makefile.rs6000000066400000000000000000000057261300371307100175040ustar00rootroot00000000000000
#/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved.
*/ # Makefile for the compress library -- agrep should be linked with it in case # it wants to search for patterns in a compressed file. # You might have to change these depending on your machine configuration. # AR and RANLIB are the library-archive programs. On IRIX, RANLIB is not # required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!). CC = cc AR = /usr/bin/ar RANLIB = true #for IRIX SHELL = /bin/sh # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE INDEXDIR = ../index AGREPDIR = ../agrep LIBDIR = ../lib BIN = ../bin TEMPLATEDIR = ../libtemplate all: lib tbuild cast uncast test cp tbuild $(BIN)/. cp cast $(BIN)/. cp uncast $(BIN)/. 
# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS)
OTHERLIBS =
LIBOBJ = hash.o string.o misc.o quick.o cast.o uncast.o tsimpletest.o tmemlook.o tbuild.o
LIB = $(LIBDIR)/libcast.a

lib: $(LIBOBJ)
	$(AR) rcv $(LIB) $(LIBOBJ)
	$(RANLIB) $(LIB)

test: hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o
	$(CC) $(LINKFLAGS) -o test hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o $(OTHERLIBS)

tbuild: hash.o string.o misc.o tbuild.o main_tbuild.o defs.h
	$(CC) $(LINKFLAGS) -o tbuild hash.o string.o misc.o tbuild.o main_tbuild.o $(OTHERLIBS)

cast: main_cast.o $(LIB)
	$(CC) $(LINKFLAGS) -o cast main_cast.o $(LIBOBJ) $(OTHERLIBS)

uncast: main_uncast.o $(LIB)
	$(CC) $(LINKFLAGS) -o uncast main_uncast.o $(LIBOBJ) $(OTHERLIBS)

hash.o: defs.h $(INDEXDIR)/glimpse.h
string.o: defs.h $(INDEXDIR)/glimpse.h
misc.o: defs.h $(INDEXDIR)/glimpse.h
quick.o: defs.h $(INDEXDIR)/glimpse.h
cast.o: defs.h $(INDEXDIR)/glimpse.h
uncast.o: defs.h $(INDEXDIR)/glimpse.h
main_cast.o: defs.h $(INDEXDIR)/glimpse.h
main_uncast.o: defs.h $(INDEXDIR)/glimpse.h
tsimpletest.o: defs.h $(INDEXDIR)/glimpse.h
tmemlook.o: defs.h $(INDEXDIR)/glimpse.h
test.o : test.c

clean:
	rm -f *.o $(LIB) core test cast uncast tbuild a.out
glimpse-4.18.7/compress/Makefile.sgi000066400000000000000000000057261300371307100173340ustar00rootroot00000000000000
#/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved.
*/ # Makefile for the compress library -- agrep should be linked with it in case # it wants to search for patterns in a compressed file. # You might have to change these depending on your machine configuration. # AR and RANLIB are the library-archive programs. On IRIX, RANLIB is not # required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!). CC = cc AR = /usr/bin/ar RANLIB = true #for IRIX SHELL = /bin/sh # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE INDEXDIR = ../index AGREPDIR = ../agrep LIBDIR = ../lib BIN = ../bin TEMPLATEDIR = ../libtemplate all: lib tbuild cast uncast test cp tbuild $(BIN)/. cp cast $(BIN)/. cp uncast $(BIN)/. 
# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS)
OTHERLIBS =
LIBOBJ = hash.o string.o misc.o quick.o cast.o uncast.o tsimpletest.o tmemlook.o tbuild.o
LIB = $(LIBDIR)/libcast.a

lib: $(LIBOBJ)
	$(AR) rcv $(LIB) $(LIBOBJ)
	$(RANLIB) $(LIB)

test: hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o
	$(CC) $(LINKFLAGS) -o test hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o $(OTHERLIBS)

tbuild: hash.o string.o misc.o tbuild.o main_tbuild.o defs.h
	$(CC) $(LINKFLAGS) -o tbuild hash.o string.o misc.o tbuild.o main_tbuild.o $(OTHERLIBS)

cast: main_cast.o $(LIB)
	$(CC) $(LINKFLAGS) -o cast main_cast.o $(LIBOBJ) $(OTHERLIBS)

uncast: main_uncast.o $(LIB)
	$(CC) $(LINKFLAGS) -o uncast main_uncast.o $(LIBOBJ) $(OTHERLIBS)

hash.o: defs.h $(INDEXDIR)/glimpse.h
string.o: defs.h $(INDEXDIR)/glimpse.h
misc.o: defs.h $(INDEXDIR)/glimpse.h
quick.o: defs.h $(INDEXDIR)/glimpse.h
cast.o: defs.h $(INDEXDIR)/glimpse.h
uncast.o: defs.h $(INDEXDIR)/glimpse.h
main_cast.o: defs.h $(INDEXDIR)/glimpse.h
main_uncast.o: defs.h $(INDEXDIR)/glimpse.h
tsimpletest.o: defs.h $(INDEXDIR)/glimpse.h
tmemlook.o: defs.h $(INDEXDIR)/glimpse.h
test.o : test.c

clean:
	rm -f *.o $(LIB) core test cast uncast tbuild a.out
glimpse-4.18.7/compress/Makefile.solaris000066400000000000000000000057771300371307100202340ustar00rootroot00000000000000
#/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved.
*/ # Makefile for the compress library -- agrep should be linked with it in case # it wants to search for patterns in a compressed file. # You might have to change these depending on your machine configuration. # AR and RANLIB are the library-archive programs. On Solaris, RANLIB is not # required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!). CC = gcc -traditional #cc AR = /usr/ccs/bin/ar #for Solaris RANLIB = true #for Solaris SHELL = /bin/sh # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE INDEXDIR = ../index AGREPDIR = ../agrep LIBDIR = ../lib BIN = ../bin TEMPLATEDIR = ../libtemplate all: lib tbuild cast uncast test cp tbuild $(BIN)/. cp cast $(BIN)/. cp uncast $(BIN)/. 
# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS)
OTHERLIBS =
LIBOBJ = hash.o string.o misc.o quick.o cast.o uncast.o tsimpletest.o tmemlook.o tbuild.o
LIB = $(LIBDIR)/libcast.a

lib: $(LIBOBJ)
	$(AR) rcv $(LIB) $(LIBOBJ)
	$(RANLIB) $(LIB)

test: hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o
	$(CC) $(LINKFLAGS) -o test hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o $(OTHERLIBS)

tbuild: hash.o string.o misc.o tbuild.o main_tbuild.o defs.h
	$(CC) $(LINKFLAGS) -o tbuild hash.o string.o misc.o tbuild.o main_tbuild.o $(OTHERLIBS)

cast: main_cast.o $(LIB)
	$(CC) $(LINKFLAGS) -o cast main_cast.o $(LIBOBJ) $(OTHERLIBS)

uncast: main_uncast.o $(LIB)
	$(CC) $(LINKFLAGS) -o uncast main_uncast.o $(LIBOBJ) $(OTHERLIBS)

hash.o: defs.h $(INDEXDIR)/glimpse.h
string.o: defs.h $(INDEXDIR)/glimpse.h
misc.o: defs.h $(INDEXDIR)/glimpse.h
quick.o: defs.h $(INDEXDIR)/glimpse.h
cast.o: defs.h $(INDEXDIR)/glimpse.h
uncast.o: defs.h $(INDEXDIR)/glimpse.h
main_cast.o: defs.h $(INDEXDIR)/glimpse.h
main_uncast.o: defs.h $(INDEXDIR)/glimpse.h
tsimpletest.o: defs.h $(INDEXDIR)/glimpse.h
tmemlook.o: defs.h $(INDEXDIR)/glimpse.h
test.o : test.c

clean:
	rm -f *.o $(LIB) core test cast uncast tbuild a.out
glimpse-4.18.7/compress/Makefile.sunos000066400000000000000000000057721300371307100177220ustar00rootroot00000000000000
#/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved.
*/ # Makefile for the compress library -- agrep should be linked with it in case # it wants to search for patterns in a compressed file. # You might have to change these depending on your machine configuration. # AR and RANLIB are the library-archive programs. On Solaris, RANLIB is not # required (define it to true) and AR is in /usr/ccs/bin/ar (on our machine!). CC = gcc AR = ar #/usr/ccs/bin/ar #for Solaris RANLIB = ranlib #true #for Solaris SHELL = /bin/sh # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE INDEXDIR = ../index AGREPDIR = ../agrep LIBDIR = ../lib BIN = ../bin TEMPLATEDIR = ../libtemplate all: lib tbuild cast uncast test cp tbuild $(BIN)/. cp cast $(BIN)/. cp uncast $(BIN)/. 
# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS)
OTHERLIBS =
LIBOBJ = hash.o string.o misc.o quick.o cast.o uncast.o tsimpletest.o tmemlook.o tbuild.o
LIB = $(LIBDIR)/libcast.a

lib: $(LIBOBJ)
	$(AR) rcv $(LIB) $(LIBOBJ)
	$(RANLIB) $(LIB)

test: hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o
	$(CC) $(LINKFLAGS) -o test hash.o string.o misc.o test.o quick.o tsimpletest.o tmemlook.o cast.o uncast.o $(OTHERLIBS)

tbuild: hash.o string.o misc.o tbuild.o main_tbuild.o defs.h
	$(CC) $(LINKFLAGS) -o tbuild hash.o string.o misc.o tbuild.o main_tbuild.o $(OTHERLIBS)

cast: main_cast.o $(LIB)
	$(CC) $(LINKFLAGS) -o cast main_cast.o $(LIBOBJ) $(OTHERLIBS)

uncast: main_uncast.o $(LIB)
	$(CC) $(LINKFLAGS) -o uncast main_uncast.o $(LIBOBJ) $(OTHERLIBS)

hash.o: defs.h $(INDEXDIR)/glimpse.h
string.o: defs.h $(INDEXDIR)/glimpse.h
misc.o: defs.h $(INDEXDIR)/glimpse.h
quick.o: defs.h $(INDEXDIR)/glimpse.h
cast.o: defs.h $(INDEXDIR)/glimpse.h
uncast.o: defs.h $(INDEXDIR)/glimpse.h
main_cast.o: defs.h $(INDEXDIR)/glimpse.h
main_uncast.o: defs.h $(INDEXDIR)/glimpse.h
tsimpletest.o: defs.h $(INDEXDIR)/glimpse.h
tmemlook.o: defs.h $(INDEXDIR)/glimpse.h
test.o : test.c

clean:
	rm -f *.o $(LIB) core test cast uncast tbuild a.out
glimpse-4.18.7/compress/README000066400000000000000000000014501300371307100157610ustar00rootroot00000000000000
/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved.
*/

This directory contains the source code for the new text-compression algorithm.
The source is divided as follows:

1. main_comp.c, tcomp.c: source code for tcomp (compress algorithm).
   This also uses simpletest.c and memlook.c from ../index.
2. main_uncomp.c, tuncomp.c: source code for tuncomp (uncompress algorithm).
3. read_in.c: generates build, the procedure which builds the dictionary
   to be used by tuncomp and the hash_table used by tcomp. It uses and
   interprets the output of the indexing algorithm present in ../index;
   that software is glimpseindex (a part of glimpse).
4. hash.c: common routines used by tcomp and build.
5. string.c: common routines used by tuncomp and build.
6. misc.c, defs.h: common to all above.

glimpse-4.18.7/compress/cast.c000066400000000000000000000546751300371307100162200ustar00rootroot00000000000000
/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. */
/*
 * cast.c: main text compression routines. Exports tcompress() called
 * from main() in read_out.c, and one other simple routine
 * tcompressible_file(). This module can also be used from csearch.c.
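 *
 * A minimal sketch, not part of the original sources, of the calling
 * convention that tcompress() uses later in this file: a negative maximum
 * length means the matching pointer is a stdio stream, a non-negative one
 * means it is a memory buffer of that capacity. The names sink_t,
 * open_sink() and put_byte() are hypothetical and exist only for this
 * illustration.
 *
```c
#include <stdio.h>

// Hypothetical output sink mirroring the outfp/outbuf pair in tcompress().
typedef struct {
    FILE *fp;            // used when maxlen is negative
    unsigned char *buf;  // used when maxlen is zero or more
    int maxlen;          // negative: unbounded stream; else buffer capacity
    int len;             // bytes emitted so far
} sink_t;

// Interpret `data` as a stream or a buffer depending on the sign of maxlen.
static sink_t open_sink(void *data, int maxlen)
{
    sink_t s = { NULL, NULL, maxlen, 0 };
    if (maxlen < 0)
        s.fp = (FILE *)data;
    else
        s.buf = (unsigned char *)data;
    return s;
}

// Emit one byte; returns 0 when a bounded buffer is full, 1 otherwise.
// The bound check follows the source: it refuses once len + 1 >= maxlen.
static int put_byte(sink_t *s, int c)
{
    if (s->maxlen >= 0 && s->len + 1 >= s->maxlen)
        return 0;
    if (s->fp != NULL)
        putc(c, s->fp);
    if (s->buf != NULL)
        s->buf[s->len] = (unsigned char)c;
    s->len++;
    return 1;
}
```
 *
 * With this check a buffer of capacity 4 accepts three bytes and refuses
 * the fourth, which appears to be how the macros in this file bound their
 * writes as well.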
 */

#include "defs.h"
#include
#if defined(__NeXT__)
/* NeXT has no <utime.h> */
struct utimbuf {
	time_t actime;	/* access time */
	time_t modtime;	/* modification time */
};
#else
#include <utime.h>
#endif

#define ALNUMWORDS 1
#define MYEOF 0xffffffff

extern int RESERVED_CHARS;
extern int MAX_WORDS;
extern int SPECIAL_WORDS;
extern int BEGIN_SPECIAL_WORDS;
extern int END_SPECIAL_WORDS;
extern int NUM_SPECIAL_DELIMITERS;
extern int END_SPECIAL_DELIMITERS;
extern int ONE_VERBATIM;
extern int next_free_hash, next_free_str;
extern hash_entry freq_words_table[MAX_WORD_LEN+2][256];	/* 256 is the maximum possible number of special words */
extern char freq_words_strings[256][MAX_WORD_LEN+2];
extern int freq_words_lens[256];
extern char comp_signature[SIGNATURE_LEN];
extern hash_entry *compress_hash_table[HASH_TABLE_SIZE];
extern int usemalloc;

/* initialize and load dictionaries */
initialize_tcompress(hash_file, freq_file, flags)
	char *hash_file, *freq_file;
	int flags;
{
	FILE *hashfp;

	if (!initialize_common(freq_file, flags)) return 0;
	next_free_hash = 0;
	memset(compress_hash_table, '\0', sizeof(hash_entry *) * HASH_TABLE_SIZE);
	if (MAX_WORDS == 0) return 1;

	/* Load compress dictionary */
	if ((hashfp = fopen(hash_file, "r")) == NULL) {
		if (flags & TC_ERRORMSGS) {
			fprintf(stderr, "cannot open cast-dictionary file: %s\n", hash_file);
			fprintf(stderr, "(use -H to give a dictionary-dir or run 'buildcast' to make a dictionary)\n");
		}
		return 0;
	}
	if (!tbuild_hash(compress_hash_table, hashfp, -1)) {	/* read all bytes until end */
		fclose(hashfp);
		return 0;
	}
	fclose(hashfp);
	return 1;
}

uninitialize_tcompress()
{
	int i;
	hash_entry *e, *t;

	uninitialize_common();
	if (usemalloc) {
		/* walk every hash chain and free each entry together with its word */
		for (i=0; i<HASH_TABLE_SIZE; i++) {
			e = compress_hash_table[i];
			while (e != NULL) {
				t = e;
				e = e->next;
				free(t->word);
				free(t);
			}
		}
	}
	memset(compress_hash_table, '\0', sizeof(hash_entry *) * HASH_TABLE_SIZE);
	next_free_hash = next_free_str = 0;
}

/* TRUE if input file has been compressed already, FALSE otherwise */
int
already_tcompressed(buffer, length, flags)
	char *buffer;
	int length;
	int flags;
{
	char *sig =
comp_signature; if (!strncmp(buffer, sig, SIGNATURE_LEN - 1)) { if (flags & TC_ERRORMSGS) fprintf(stderr, "Already compressed,"); return 1; } return 0; } extern int initialize_common_done; /* TRUE if input file is an ascii file, FALSE otherwise */ int tcompressible(buffer, num_read, flags) char *buffer; int num_read; int flags; { if (!initialize_common_done) { if (flags & TC_ERRORMSGS) fprintf(stderr, "No cast-dictionary,"); return 0; } if(ttest_binary(buffer, num_read)) { if (flags & TC_ERRORMSGS) fprintf(stderr, "Binary data,"); return(0); } if(ttest_uuencode(buffer, num_read)) { if (flags & TC_ERRORMSGS) fprintf(stderr, "UUEncoded data,"); return(0); } if(ttest_postscript(buffer, num_read)) { if (flags & TC_ERRORMSGS) fprintf(stderr, "Postscript data,"); return(0); } if (already_tcompressed(buffer, num_read, flags)) return 0; return(1); } tcompressible_file(name, flags) char *name; int flags; { char buf[SAMPLE_SIZE + 2]; int num; FILE *fp = my_fopen(name, "r"); if (!initialize_common_done) { if (flags & TC_ERRORMSGS) fprintf(stderr, "No cast-dictionary,"); if (fp != NULL) fclose(fp); return 0; } if (fp == NULL) return 0; num = fread(buf, 1, SAMPLE_SIZE, fp); fclose(fp); return(tcompressible(buf, num, flags)); } tcompressible_fp(fp, flags) FILE *fp; int flags; { char buf[SAMPLE_SIZE + 2]; int num; if (!initialize_common_done) { if (flags & TC_ERRORMSGS) fprintf(stderr, "No cast-dictionary,"); return 0; } if (fp == stdin) return 1; num = fread(buf, 1, SAMPLE_SIZE, fp); return(tcompressible(buf, num, flags)); } /* ------------------------------------------------------------------------- tgetword(): get a word from stream pointed to by fp: a "word" is an ID = a stream of alphanumeric characters beginning with an alphanumeric char and ending with a non-alphanumeric character. The character following the word is returned, and the file pointer points to THIS character in the input stream. 
If there is no word beginning at the current position of the file pointer, tgetword simply behaves like getc(), i.e., just returns the character read. If the word is too long, then it fills up all the bytes it can and returns the character it could not fill up. To read a series of words without doing an ungetc() for the extra character read by tgetword, the caller can set *length to 1 and word[0] to the character returned by tgetword. This can make compress work even if infile = stdin. --------------------------------------------------------------------------*/ unsigned int tgetword(fp, buf, maxinlen, lenp, word, length) FILE *fp; char *buf; int maxinlen; int *lenp; char *word; int *length; { unsigned int c; #if !ALNUMWORDS if (*length > 0){ c = (unsigned char)word[*length - 1]; if (!isalpha(c)) goto not_alpha; else goto alpha; } if ((c = mygetc(fp, buf, maxinlen, lenp)) == MYEOF) return MYEOF; if (!isalpha(c)) { /* this might be a number */ if (!isdigit(c)) return c; word[*length] = c; (*length) ++; word[*length] = '\0'; not_alpha: while(isdigit(c = mygetc(fp, buf, maxinlen, lenp))) { if (*length >= MAX_NAME_LEN) return c; word[*length] = c; (*length) ++; word[*length] = '\0'; } return c; } else { /* this might be a dictionary word */ word[*length] = c; (*length) ++; word[*length] = '\0'; alpha: while(isalnum(c = mygetc(fp, buf, maxinlen, lenp))) { if (*length >= MAX_NAME_LEN) return c; word[*length] = c; (*length) ++; word[*length] = '\0'; } return c; } #else /*!ALNUMWORDS*/ if (*length > 0){ c = word[*length - 1]; } else { if ((c = mygetc(fp, buf, maxinlen, lenp)) == MYEOF) return MYEOF; if (!isalnum(c)) return c; word[(*length)++] = c; word[*length] = '\0'; } while(((c = mygetc(fp, buf, maxinlen, lenp)) != MYEOF) && (isalnum(c))) { if (*length >= MAX_NAME_LEN) return c; word[*length] = c; (*length) ++; word[*length] = '\0'; } return c; #endif /*!ALNUMWORDS*/ } /*-------------------------------------------------------------------- Skips a series of characters of 
the type skipc and sets the number of characters skipped. Used to compress multiple blanks, tabs & newlines. It returns the first character not equal to skipc. If there are no characters beginning at the current location of the file pointer which are equal to skipc, this function simply behaves as getc(). ---------------------------------------------------------------------*/ int skip(fp, buf, maxinlen, lenp, skipc, skiplen) FILE *fp; char *buf; int maxinlen; int *lenp; int skipc; int *skiplen; { unsigned int c; *skiplen = 1; /* c has already been read! */ while((c = mygetc(fp, buf, maxinlen, lenp)) == skipc) (*skiplen) ++; return c; } /* defined in misc.c */ extern char special_texts[]; extern char special_delimiters[]; int get_special_text_index(c) unsigned int c; { int i; for(i=0; i= 0) && (outlen + count1 + count2 >= maxoutlen)) return outlen;\ if (outfp != NULL) {\ for (i=0; i= 0) && (outlen + count1 + count2 >= maxoutlen)) return outlen;\ if (outfp != NULL) {\ for (i=0; i= 0) && (outlen + count1 + count2*2 >= maxoutlen)) return outlen;\ if (outfp != NULL) {\ for (i=0; i= 0) && (outlen + count1 + count2 >= maxoutlen)) return outlen;\ switch(skipc)\ {\ case ' ':\ if (outfp != NULL) {\ for (i=0; i= 0) && (outlen + 1 >= maxoutlen)) return outlen;\ if (outfp != NULL) putc(BEGIN_VERBATIM, outfp);\ if (outbuf != NULL) outbuf[outlen] = BEGIN_VERBATIM;\ outlen ++;\ }\ } #define POST_VERBATIM(v) \ {\ if (v) {\ v = 0;\ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen;\ if (outfp != NULL) putc(END_VERBATIM, outfp);\ if (outbuf != NULL) outbuf[outlen] = END_VERBATIM;\ outlen ++;\ }\ } #define EASY_PRE_VERBATIM(v) \ {\ if (easysearch) {\ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen;\ if (outfp != NULL) putc(ONE_VERBATIM, outfp);\ if (outbuf != NULL) outbuf[outlen] = ONE_VERBATIM;\ outlen ++;\ }\ else {\ PRE_VERBATIM(v)\ }\ } #define EASY_POST_VERBATIM(v) \ {\ if (easysearch) {\ POST_VERBATIM(v)\ }\ /* else ignore */\ } int 
get_special_word_index(word, len) char word[MAX_NAME_LEN]; int len; { register int comp; hash_entry *e; if ((len > MAX_WORD_LEN) || (SPECIAL_WORDS <= 0)) return -1; e = freq_words_table[len]; while((e != NULL) && (e->val.offset != -1)) { comp = strcmp(word, e->word); if (comp == 0) return e->val.offset; if (comp < 0) return -1; /* can't find it anyway */ e = e->next; } return -1; } /* Compresses input from indata and outputs it into outdata: returns number of chars in output */ int tcompress(indata, maxinlen, outdata, maxoutlen, flags) void *indata, *outdata; int maxinlen, maxoutlen; int flags; { unsigned char curword[MAX_NAME_LEN]; int curlen; int hashindex; hash_entry *e; unsigned int c; unsigned short encodedindex; int skiplen; int ret; int verbatim_state = 0; char *sig = comp_signature; FILE *infp = NULL, *outfp = NULL; unsigned char *inbuf = NULL, *outbuf = NULL; int outlen = 0, inlen = 0; int easysearch = flags&TC_EASYSEARCH; int untilnewline = flags&TC_UNTILNEWLINE; if (flags & TC_SILENT) return 0; if (easysearch) { ONE_VERBATIM = EASY_ONE_VERBATIM; NUM_SPECIAL_DELIMITERS = EASY_NUM_SPECIAL_DELIMITERS; END_SPECIAL_DELIMITERS = EASY_END_SPECIAL_DELIMITERS; } else { ONE_VERBATIM = HARD_ONE_VERBATIM; NUM_SPECIAL_DELIMITERS = HARD_NUM_SPECIAL_DELIMITERS; END_SPECIAL_DELIMITERS = HARD_END_SPECIAL_DELIMITERS; } if (maxinlen < 0) { infp = (FILE *)indata; } else { inbuf = (unsigned char *)indata; } if (maxoutlen < 0) { outfp = (FILE *)outdata; } else { outbuf = (unsigned char *)outdata; } /* Write signature and information about whether compression was context-free or not: first 16 bytes */ if (outfp != NULL) { if ((maxoutlen >= 0) && (outlen + SIGNATURE_LEN >= maxoutlen)) return outlen; if (0 == fwrite(sig, 1, SIGNATURE_LEN - 1, outfp)) return 0; if (easysearch) putc(1, outfp); else putc(0, outfp); outlen += SIGNATURE_LEN; } /* No need to put a signature OR easysearch when doing it in memory: caller must manipulate */ /* * The algorithm for compression is as 
follows: * * For each input word, we search and see if it is in the dictionary. * If it IS there, we just look at its word-index and output it. * Then, if the character immediately after the word is NOT a blank, * we output a second character indicating what it was. * * If it is not in the dictionary then we output it verbatim: for * verbatim output, we take care to merge consecutive verbatim outputs * by NOT putting delimiters between them (one start and one end * delimiter). * * If the input is not a word but a single character, then it can be: * 1. A special character, in which case we output its code. * 2. A blank character, in which case we keep getting more characters * to see how many blanks we get. At the first non-blank character, * we output a sequence of special characters which encode multiple * blanks (note: blanks can be spaces, tabs or newlines). * * Please refer to the state diagram for explanations. * I've used gotos since the termination condition is too complex. */ real_tgetword: curlen = 0; curword[0] = '\0'; concocted_tgetword: c = tgetword(infp, inbuf, maxinlen, &inlen, curword, &curlen); bypass_tgetword: if (curlen == 0) { /* only one character read and that is in c. */ switch(c) { case ' ': case '\t': case '\n': POST_VERBATIM(verbatim_state); /* need post-verbatim since there might be a LOT of blanks, etc. 
*/ ret = skip(infp, inbuf, maxinlen, &inlen, c, &skiplen); process_spaces(c, skiplen); if ((c == '\n') && untilnewline) return outlen; if (isalnum((unsigned char)ret)) { curword[0] = (unsigned char)ret; curword[1] = '\0'; curlen = 1; goto concocted_tgetword; } else if (ret != MYEOF) { c = (unsigned int)ret; goto bypass_tgetword; } /* else fall thru */ case MYEOF: return outlen; default: if ((ret = get_special_text_index(c)) != -1) { if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen; if (verbatim_state) { /* no need to do post-verbatim since only one character: optimization */ if (outfp != NULL) putc(c, outfp); if (outbuf != NULL) outbuf[outlen] = c; outlen ++; } else { if (outfp != NULL) putc(ret + BEGIN_SPECIAL_TEXTS, outfp); if (outbuf != NULL) outbuf[outlen] = ret + BEGIN_SPECIAL_TEXTS; outlen ++; } } else { /* * Has to be verbatim character: they have a ONE_VERBATIM before each * irrespective of verbatim_state. Otherwise there is no way to differentiate * one of our special characters from the same characters appearing in the * source. Hence binary files blow-up to twice their original size. * * Also, if it is a verbatim character that cannot be confused with one of OUR * special characters, then just put it in w/o changing verbatim state. Else * put a begin-verbatim before it and THEN output that character=saves 1 char. 
*/ if ((c != BEGIN_VERBATIM) && (c != END_VERBATIM)) { /* reduces to below if easysearch */ EASY_PRE_VERBATIM(verbatim_state) if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen; if (outfp != NULL) putc(c, outfp); if (outbuf != NULL) outbuf[outlen] = c; outlen ++; } else { /* like \ escape in C: \ is \\ */ if ((maxoutlen >= 0) && (outlen + 2 >= maxoutlen)) return outlen; if (outfp != NULL) putc(ONE_VERBATIM, outfp); if (outbuf != NULL) outbuf[outlen] = ONE_VERBATIM; outlen ++; if (outfp != NULL) putc(c, outfp); if (outbuf != NULL) outbuf[outlen] = c; outlen ++; } } goto real_tgetword; } } else /* curlen >= 1 */ { if (!easysearch && verbatim_state && (curlen <= 2)) { /* don't bother to close the verbatim state and put a 2byte index=saves 1 char */ /* outfp is NULL when compressing in memory, so guard it, mirror the write into outbuf and count the bytes in outlen */ if ((maxoutlen >= 0) && (outlen + curlen >= maxoutlen)) return outlen; if (outfp != NULL) fprintf(outfp, "%s", curword); if (outbuf != NULL) memcpy(outbuf+outlen, curword, curlen); outlen += curlen; curword[0] = '\0'; curlen = 0; goto bypass_tgetword; } else { if ((ret = get_special_word_index(curword, curlen)) != -1) { POST_VERBATIM(verbatim_state); /* printf("ret=%d word=%s\n", ret, curword); */ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen; if (outfp != NULL) putc(ret + BEGIN_SPECIAL_WORDS, outfp); if (outbuf != NULL) outbuf[outlen] = ret + BEGIN_SPECIAL_WORDS; outlen ++; } else if ((e = get_hash(compress_hash_table, curword, curlen, &hashindex)) != NULL) { #if 0 fprintf(stderr, "%x ", e->val.attribute.index); #endif /*0*/ encodedindex = encode_index(e->val.attribute.index); POST_VERBATIM(verbatim_state); if ((maxoutlen >= 0) && (outlen + sizeof(short) >= maxoutlen)) return outlen; if (outfp != NULL) { putc(((encodedindex & 0xff00)>>8), outfp); putc((encodedindex & 0x00ff), outfp); } if (outbuf != NULL) { outbuf[outlen] = ((encodedindex & 0xff00)>>8); outbuf[outlen + 1] = encodedindex & 0x00ff; } outlen += sizeof(short); } else goto NOT_IN_DICTIONARY; /* process_char_after_word: */ switch(c) { case ' ': goto real_tgetword; /* blank is a part of the word */ case MYEOF: if (easysearch) return outlen; if (outfp != NULL) putc(NOTBLANK, outfp); if (outbuf != 
NULL) outbuf[outlen] = NOTBLANK; outlen ++; return outlen; default: if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen; if ((ret = get_special_delimiter_index(c)) != -1) { if (outfp != NULL) putc((ret+BEGIN_SPECIAL_DELIMITERS), outfp); if (outbuf != NULL) outbuf[outlen] = ret + BEGIN_SPECIAL_DELIMITERS; outlen ++; goto real_tgetword; } else { if (outfp != NULL) putc(NOTBLANK, outfp); if (outbuf != NULL) outbuf[outlen] = NOTBLANK; outlen ++; if (!isalnum(c)) { curword[0] = '\0'; curlen = 0; goto bypass_tgetword; } else { /* might be a number which ended with an alphabet: ".. born in 1992AD" */ curword[0] = c; curword[1] = '\0'; curlen = 1; goto concocted_tgetword; } } } } NOT_IN_DICTIONARY: /* word not in dictionary */ PRE_VERBATIM(verbatim_state); if ((maxoutlen >= 0) && (outlen + curlen >= maxoutlen)) return outlen; if ((outfp != NULL) && (0 == fwrite(curword, sizeof(char), curlen, outfp))) return 0; if (outbuf != NULL) memcpy(outbuf+outlen, curword, curlen); outlen += curlen; EASY_POST_VERBATIM(verbatim_state); switch(c) { case MYEOF: /* Prefix searches still work since our scheme is context free */ return outlen; default: if (!isalnum(c)) { curword[0] = '\0'; curlen = 0; goto bypass_tgetword; } else { /* might be a number which ended with an alphabet: ".. 
born in 1992AD" */ curword[0] = c; curword[1] = '\0'; curlen = 1; goto concocted_tgetword; } } } } #define FUNCTION tcompress_file #define DIRECTORY tcompress_directory #include "trecursive.c" /* returns #bytes (>=0) in the compressed file, -1 if major error (not able to compress) */ tcompress_file(name, outname, flags) char *name, *outname; int flags; { FILE *fp; FILE *outfp; int inlen, ret; struct stat statbuf; /* struct timeval tvp[2]; */ struct utimbuf tvp; char tempname[MAX_LINE_LEN]; if (name == NULL) return -1; special_get_name(name, -1, tempname); inlen = strlen(tempname); if (-1 == stat(tempname, &statbuf)) { if (flags & TC_ERRORMSGS) fprintf(stderr, "permission denied or non-existent: %s\n", tempname); return -1; } if (S_ISDIR(statbuf.st_mode)) { if (flags & TC_RECURSIVE) return tcompress_directory(tempname, outname, flags); if (flags & TC_ERRORMSGS) fprintf(stderr, "skipping directory: %s\n", tempname); return -1; } if (!S_ISREG(statbuf.st_mode)) { if (flags & TC_ERRORMSGS) fprintf(stderr, "not a regular file, skipping: %s\n", tempname); return -1; } if ((fp = fopen(tempname, "r")) == NULL) { if (flags & TC_ERRORMSGS) fprintf(stderr, "permission denied or non-existent: %s\n", tempname); return -1; } if (!tcompressible_fp(fp, flags)) { if (flags & TC_ERRORMSGS) fprintf(stderr, " skipping: %s\n", tempname); fclose(fp); return -1; } rewind(fp); if (flags & TC_SILENT) { printf("%s\n", tempname); fclose(fp); return 0; } /* Create and open output file */ strncpy(outname, tempname, MAX_LINE_LEN); if (inlen + strlen(COMP_SUFFIX) + 1 >= MAX_LINE_LEN) { /* truncate so that the suffix AND its terminating '\0' still fit in MAX_LINE_LEN bytes (assumes outname is at least that big) */ outname[MAX_LINE_LEN - strlen(COMP_SUFFIX) - 1] = '\0'; fprintf(stderr, "very long file name %s: truncating to: %s\n", tempname, outname); } strcat(outname, COMP_SUFFIX); if (!access(outname, R_OK)) { /* output file exists */ if (!(flags & TC_OVERWRITE)) { fclose(fp); return 0; } else if (!(flags & TC_NOPROMPT)) { char s[8]; printf("overwrite %s? 
(y/n): ", outname); scanf("%c", s); if (s[0] != 'y') { fclose(fp); return 0; } } } if ((outfp = fopen(outname, "w")) == NULL) { fprintf(stderr, "cannot open for writing: %s\n", outname); fclose(fp); return -1; } ret = tcompress(fp, -1, outfp, -1, flags); if ((statbuf.st_size * (100 - COMP_ATLEAST))/100 < ret) { fprintf(stderr, "less than %d%% compression, skipping: %s\n", COMP_ATLEAST, tempname); fclose(fp); rewind(outfp); fclose(outfp); unlink(outname); return ret; } if ((ret > 0) && (flags & TC_REMOVE)) unlink(tempname); fclose(fp); fflush(outfp); fclose(outfp); /* tvp[0].tv_sec = statbuf.st_atime; tvp[0].tv_usec = 0; tvp[1].tv_sec = statbuf.st_mtime; tvp[1].tv_usec = 0; utimes(outname, tvp); */ tvp.actime = statbuf.st_atime; tvp.modtime = statbuf.st_mtime; utime(outname, &tvp); return ret; } glimpse-4.18.7/compress/compress.chronicle000066400000000000000000000002051300371307100206210ustar00rootroot00000000000000Started on 1st Sep 1994. 0. Completely integrated compression (cast, uncast) into glimpse/glimpseindex and agrep during Aug 1994. glimpse-4.18.7/compress/defs.h000066400000000000000000000177311300371307100162040ustar00rootroot00000000000000/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. */ /************************************************************************** * defs.h: contains definitions for our static/dictionary based * * compression scheme that is tailored for very fast search. * **************************************************************************/ #ifndef _DEFS_H_ #define _DEFS_H_ #include #include #include #include "glimpse.h" #undef COMP_SUFFIX #undef DEF_STRING_FILE #undef DEF_HASH_FILE #undef DEF_FREQ_FILE #undef SIGNATURE_LEN #define MIN_WORD_LEN 1 /* smaller words are not indexed: heuristics like special_texts etc. 
must be used: verbatim is good enough */ #define HASH_TABLE_SIZE MAX_64K_HASH #define SMALL_HASH_TABLE_SIZE MAX_4K_HASH #define HASH_ENTRY_SIZE 32 /* hash-file stores: name of len=24, a 5 digit int, a ' ' + a '\n' = 31 bytes + some padding once in a while */ #define DEF_BLOCKSIZE 4096 /* I/O unit size = OS page size */ #define MIN_BLOCKSIZE 512 /* granularity for above and below */ #define HASH_FILE_BLOCKS (HASH_TABLE_SIZE * HASH_ENTRY_SIZE / MIN_BLOCKSIZE) #define STRING_FILE_BLOCKS (HASH_TABLE_SIZE * MAX_WORD_LEN / MIN_BLOCKSIZE) #define MAX_SPECIAL_CHARS 32 /* Maximum # of special characters used during compress */ #define DEF_SPECIAL_WORDS 32 /* Special words for which 1B codes are reserved */ #define COMP_ATLEAST 10 /* At least 10% compression is needed */ #define COMP_SUFFIX ".CZ" /* Common suffix used for all compressed files: IT INCLUDES THE '.' !!! */ #define DEF_INDEX_FILE INDEX_FILE /* same as glimpse's */ #define DEF_STRING_FILE ".glimpse_uncompress" #define DEF_HASH_FILE ".glimpse_compress" #define DEF_FREQ_FILE ".glimpse_quick" #define DEF_THRESHOLD 16 /* 256? default for min bytes to be covered before storing in hash table */ #define MAX_THRESHOLD 65535 /* MAX_WORDS*MAX_THRESHOLD must be < 2**32 - 1 = maxoffset = maxdiskspace = integer */ #define MAX_LSB 254 /* 256 - |{'\0', '\n'}| */ #define DEF_MAX_WORDS (MAX_LSB*MAX_LSB) #define SAMPLE_SIZE 8192 /* amount of data read to determine file-type: NOT CALLED FOR STDIN! */ #define SIGNATURE_LEN 16 /* to avoid calling strlen: including \0! 
*/ typedef struct _hash_entry { struct _hash_entry *next; char *word; /* string itself */ union { int offset; /* offset into the dictionary file: used only while building compress's dict from glimpse's dict */ struct { short freq; /* number of times the word occurs -- provided it is in the dictionary */ short index; /* index into the string table */ } attribute; /* once freq > THRESHOLD, it's just an index into the string table: used only while compressing a file */ } val; } hash_entry; /* * The total number of special characters (1..4) CANNOT exceed MAX_SPECIAL_CHARS. * The arrangement is as follows: * 1. SPECIAL_TEXTS * 2. SPECIAL_SEPARATORS * 3. SPECIAL_DELIMITERS * 4. VERBATIM * 5. SPECIAL_WORDS * Any rearrangement of these can be done provided the BEGIN/END values * are defined properly: the NUMs remain the same. */ #define BEGIN_SPECIAL_CHARS 1 /* character 0 is never a part of any code */ #define END_SPECIAL_CHARS 30 /* Not including begin/end verbatim */ /* Special delimiters are text-sequences which can come after a word instead of a blank: this is a subset of the above with '\n' and '\t' */ #define EASY_NUM_SPECIAL_DELIMITERS 8 /* numbered from 1 .. 8 */ #define HARD_NUM_SPECIAL_DELIMITERS 9 /* extra: a special kind of newline */ #define SPECIAL_DELIMITERS { '.', ',', ':', '-', ';', '!', '"', '\'', '\n'} #define BEGIN_SPECIAL_DELIMITERS BEGIN_SPECIAL_CHARS #define EASY_END_SPECIAL_DELIMITERS 9 #define HARD_END_SPECIAL_DELIMITERS 10 /* Special separators are things that can separate two words: they are 2blanks, 2tabs or 2newlines */ #define NUM_SEPARATORS 7 /* numbered from 10 .. 16 */ #define NEWLINE '\n' /* = HARD_END_SPECIAL_DELIMITERS --> carefully chosen so that this is TRUE !!!! 
Speeds up searches */ #define NOTBLANK (NEWLINE + 1) /* acts like unputc(' ') if char after a word != blk OR sp-delims */ #define BLANK (NOTBLANK + 1) #define TAB (NOTBLANK + 2) #define TWOBLANKS (NOTBLANK + 3) /* Beginning of a sentence */ #define TWOTABS (NOTBLANK + 4) /* Indentation */ #define TWONEWLINES (NOTBLANK + 5) /* Beginning of a paragraph */ #define BEGIN_SEPARATORS 10 #define END_SEPARATORS 17 /* * An alternate way would be to have a code for BLANK and NBLANKS, TAB and NTABS, and NEWLINE and NNEWLINES: * in each of these cases, the byte occurring immediately next would determine the number of BLANKS/TABS/NEWLINES. * Though this works for a general number of cases, it needs two bytes of encoding: which makes us * wonder whether those cases occur commonly enough to waste two bytes to encode two blanks (common). * The present encoding guarantees 50% compression for any sequence of separators anyway, and is much simpler. */ /* Special texts are text-sequences which have 1-byte codes associated with them: these appear first among the special things */ #define NUM_SPECIAL_TEXTS 13 /* numbered from 17 .. 29 */ #define SPECIAL_TEXTS { '.', ',', ':', '-', ';', '!', '"', '\'', '#', '$', '%', '(', ')'} /* Could have used ?, @ and & too */ #define BEGIN_SPECIAL_TEXTS 17 #define END_SPECIAL_TEXTS 30 /* Characters for literal text */ #define BEGIN_VERBATIM 30 #define END_VERBATIM 31 #define EASY_ONE_VERBATIM EASY_END_SPECIAL_DELIMITERS #define HARD_ONE_VERBATIM BEGIN_VERBATIM /* Is not a printable ascii char since printable ascii starts at 32 */ /* BEGIN and END SPECIAL_WORDS are variables */ #if 0 /* THIS WON'T REALLY HELP SINCE SOURCE CODE RARELY HAS COMMON WORDS: KEYWORDS ARE VERY SMALL SO THEY HARDLY GIVE ANY COMPRESSION */ char special_program_chars[] = { '.', ',', ':', '-', '!', ';', '?', '+', '/', '\'', '"', '~', '`', '&', '@', '#', '$', '%', '^', '*', '=', '(', ')', '{', '}', '[', ']', '_', '|', '\\', '<', '>' }; #endif /*0*/ /* * Common exported functions. 
*/ unsigned short encode_index(); unsigned short decode_index(); unsigned int mygetc(); int is_little_endian(); int build_string(); int build_hash(); int dump_hash(); int dump_string(); int get_word_from_offset(); int dump_and_free_string_hash(); hash_entry *insert_hash(); hash_entry *get_hash(); int hash_it(); char * tescapesinglequote(); /* * The beauty of this allocation scheme is that "free" does not need to be implemented! * The total memory occupied by both the string and hash tables is appx 1.5 MB */ /* frees its own argument: it used to free a variable named e regardless of what was passed in */ #define hashfree(h) if (usemalloc) free(h); #define hashalloc(e) \ {\ if (usemalloc) (e) = (hash_entry *)malloc(sizeof(hash_entry));\ else {\ if (free_hash == NULL) free_hash = (hash_entry *)malloc(sizeof(hash_entry) * DEF_MAX_WORDS);\ if (free_hash == NULL) (e) = NULL;\ else (e) = ((next_free_hash >= DEF_MAX_WORDS) ? (NULL) : (&(free_hash[next_free_hash ++])));\ }\ if ((e) == NULL) {fprintf(stderr, "Out of memory in cast-hash-table!\n"); exit(2); }\ } #define strfree(s) if (usemalloc) free(s); /* called ONLY in the build procedure in which we can afford to be slow and do an strcpy since sizes of words are not determined: hardcoded in build_hash() */ #define stralloc(s, len) \ {\ if (usemalloc) (s) = (char *)malloc(len);\ else {\ if (free_str == NULL) free_str = (char *)malloc(AVG_WORD_LEN * DEF_MAX_WORDS);\ if (free_str == NULL) (s) = NULL;\ else (s) = ((next_free_str >= AVG_WORD_LEN * DEF_MAX_WORDS) ? (NULL) : (&(free_str[next_free_str]))); next_free_str += (len);\ }\ if ((s) == NULL) {fprintf(stderr, "Out of memory in cast-string-table!\n"); exit(2); }\ } /* There is no equivalent strtablealloc since it is hardcoded into build_string and is not used anywhere else */ /* Some flags corr. 
to user options: avoid global variables for options, pass flags as parameters */ #define TC_EASYSEARCH 0x1 #define TC_UNTILNEWLINE 0x2 #define TC_REMOVE 0x4 #define TC_OVERWRITE 0x8 #define TC_RECURSIVE 0x10 #define TC_ERRORMSGS 0x20 #define TC_SILENT 0x40 #define TC_NOPROMPT 0x80 #define TC_FILENAMESONSTDIN 0x100 #define CAST_VERSION "1.0" #define CAST_DATE "1994" #endif /*_DEFS_H_*/ glimpse-4.18.7/compress/hash.c000066400000000000000000000421201300371307100161670ustar00rootroot00000000000000/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. */ /* * hash.c: Hash table manipulation routines. Can be used to compute * the dictionary as well as compress files. */ #include "defs.h" int next_free_hash = 0; hash_entry *free_hash = NULL; /*[DEF_MAX_WORDS]; */ int next_free_str = 0; char *free_str = NULL; /*[DEF_MAX_WORDS * AVG_WORD_LEN]; */ extern int usemalloc; /* ----------------------------------------------------------------- input: a word (a string of ascii character terminated by NULL) output: a hash_value of the input word. hash function: if the word has length <= 4 the hash value is just a concatenation of the last four bits of the characters. if the word has length > 4, then after the above operation, the hash value is updated by adding each remaining character. (and AND with the 16-bits mask). ---------------------------------------------------------------- */ int thash64k(word, len) unsigned char *word; int len; { unsigned int hash_value=0; unsigned int mask_4=017; unsigned int mask_16=0177777; int i; if(len<=4) { for(i=0; iword, (char *)word)) break; else e = e->next; } return e; } /* * Assigns either the freq or the offset to the hash-entry. The kind of * information in the entry depends on the caller. Advice: different * hash-tables must be used to store information gathered during * the build operation and the compress operation by the appropriate * module. This can be specified by passing -1's for offset/freq resply. 
*/ hash_entry * insert_hash(hash_table, word, len, freq, offset) hash_entry *hash_table[HASH_TABLE_SIZE]; unsigned char *word; int len, freq, offset; { int i; hash_entry *e; e = get_hash(hash_table, word, len, &i); if (e == NULL) { hashalloc(e); stralloc(e->word, len + 2); strcpy(e->word, (char *)word); e->val.offset = 0; e->next = hash_table[i]; hash_table[i] = e; } if ((offset == -1) && (freq != -1)) { e->val.attribute.freq += freq; /* e->val.attribute.index has to be accessed from outside this function */ } else if ((offset != -1) && (freq == -1)) { e->val.offset = offset; /* used in building the string table from the dictionary */ } else { fprintf(stderr, "error in accessing hash-table [frequencies/offsets]. skipping...\n"); return (NULL); } #if 0 printf("%d %x\n", i, e); #endif /*0*/ return e; } /* * HASHFILE format: the hash-file is a sequence of "'\0' hash-index word-index word-name" * The '\0' is there to indicate that this is not a padded line. Padded lines simply have * a '\n' as the first character (words don't have '\0' or '\n'). The hash and word indices * are 2 unsigned short integers in binary, MSB first. The word name therefore starts from the * 5th character and continues until a '\0' or '\n' is encountered. The total size of the * hash-table is therefore (|avgwordlen|+5)*numwords = appx 12 * 50000 = .6MB. * Note that there can be multiple lines with the same hash-index. 
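 *
 * Worked example of the layout described above (illustrative values only,
 * not taken from a real dictionary): a record with hash-index 0x1A2B,
 * word-index 0x0103 and word-name "the" would occupy 8 bytes in the file:
 *     00 1A 2B 01 03 74 68 65
 * that is, the '\0' marker, the two indices MSB first, then 't' 'h' 'e'.
 * The byte that follows is either the '\0' marker of the next record or the
 * '\n' that begins padding, which is how a reader knows where the word-name stops.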
*/ /* used when computing compress's dictionary */ int dump_hash(hash_table, HASHFILE) hash_entry *hash_table[HASH_TABLE_SIZE]; unsigned char *HASHFILE; { int i; FILE *hashfp; int wordindex; hash_entry *e, *t; if ((hashfp = fopen((char *)HASHFILE, "w")) == NULL) { fprintf(stderr, "cannot open for writing: %s\n", HASHFILE); return 0; } /* We have a guarantee that the wordindex + 1 cannot exceed MAX_WORDS */ wordindex = 0; for(i=0; iword); t = e->next; strfree(e->word); hashfree(e); e = t; wordindex ++; } } fclose(hashfp); return wordindex; } /* * These are routines that operate on hash-tables of 4K size (used in tbuild.c) */ /* crazy hash function that operates on 4K hashtables */ thash4k(word, len) char *word; int len; { unsigned int hash_value=0; unsigned int mask_3=07; unsigned int mask_12=07777; int i; #if 0 /* discard prefix = the directory name */ if (len<=1) return 0; i = len-1; while(word[i] != '/') i--; if ((i > 0) && (word[i] == '/')) { word = &word[i+1]; len = strlen(word); } #endif /*0*/ if(len<=4) { for(i=0; iword, (char *)word)) break; else e = e->next; } return e; } hash_entry * insert_small_hash(hash_table, word, len, freq, offset) hash_entry *hash_table[SMALL_HASH_TABLE_SIZE]; unsigned char *word; int len, freq, offset; { int i; hash_entry *e; e = get_small_hash(hash_table, word, len, &i); if (e == NULL) { hashalloc(e); stralloc(e->word, len + 2); strcpy(e->word, (char *)word); e->val.offset = 0; e->next = hash_table[i]; hash_table[i] = e; } if ((offset == -1) && (freq != -1)) { e->val.attribute.freq += freq; /* e->val.attribute.index has to be accessed from outside this function */ } else if ((offset != -1) && (freq == -1)) { e->val.offset = offset; /* used in building the string table from the dictionary */ } else { fprintf(stderr, "error in accessing hash-table [frequencies/offsets]. 
skipping...\n"); return (NULL); } #if 0 printf("%d %x\n", i, e); #endif /*0*/ return e; } int dump_small_hash(hash_table, HASHFILE) hash_entry *hash_table[SMALL_HASH_TABLE_SIZE]; unsigned char *HASHFILE; { int i; FILE *hashfp; int wordindex; hash_entry *e, *t; if ((hashfp = fopen((char *)HASHFILE, "w")) == NULL) { fprintf(stderr, "cannot open for writing: %s\n", HASHFILE); return 0; } /* We have a guarantee that the wordindex + 1 cannot exceed MAX_WORDS */ wordindex = 0; for(i=0; iword, strlen(e->word)), wordindex, e->word); /* must look like I used 64K table */ t = e->next; strfree(e->word); hashfree(e); e = t; wordindex ++; } } fclose(hashfp); return wordindex; } /* * These are again routines that operate on big (64k) hash-tables */ /* used only during debugging to see if output = input */ int dump_hash_debug(hash_table, HASHFILE) hash_entry *hash_table[HASH_TABLE_SIZE]; unsigned char *HASHFILE; { int i; FILE *hashfp; hash_entry *e; if ((hashfp = fopen((char *)HASHFILE, "w")) == NULL) { fprintf(stderr, "cannot open for writing: %s\n", HASHFILE); return 0; } /* We have a guarantee that the wordindex + 1 cannot exceed MAX_WORDS */ for(i=0; ival.attribute.freq, e->val.attribute.index, e->word); e = e->next; } } fclose(hashfp); return 1; } /* * VERY particular to the format of the hash-table file: * -- does an fscanf+2atoi's+strlen all in one scan. * Returns 0 if you are in the padded area, -1 on EOF or a formatting error, * else the number of bytes read for this entry. */ int myhashread(fp, pint1, pint2, str, plen) FILE *fp; int *pint1; int *pint2; char *str; int *plen; { int numread; int int1, int2; int c; if((int1 = getc(fp)) == '\n') return 0; /* padded area */ if(int1 != 0) return -1; /* formatting error! 
*/ if ((int1 = getc(fp)) == EOF) return -1; if ((int2 = getc(fp)) == EOF) return -1; *pint1 = (int1 << 8) | int2; /* hashindex */ if ((int1 = getc(fp)) == EOF) return -1; if ((int2 = getc(fp)) == EOF) return -1; *pint2 = (int1 << 8) | int2; /* wordindex */ numread = 5; *plen = 0; /* wordname */ while((c = getc(fp)) != EOF) { if ( (c == '\0') || (c == '\n') ){ ungetc(c, fp); str[*plen] = '\0'; return numread; } str[(*plen)++] = c; numread ++; if (numread >= MAX_NAME_LEN) { str[*plen - 1] = '\0'; return numread; } } return -1; } int tbuild_hash(hash_table, hashfp, bytestoread) hash_entry *hash_table[HASH_TABLE_SIZE]; FILE *hashfp; int bytestoread; { int hashindex; int wordindex; int numread = 0; int ret; int len; char *word; char dummybuf[MAX_WORD_BUF]; hash_entry *e; if (bytestoread == -1) { /* read until end of file */ while (1) { if (usemalloc) word = dummybuf; else { if (free_str == NULL) free_str = (char *)malloc(AVG_WORD_LEN * DEF_MAX_WORDS); if (free_str == NULL) break; word = &free_str[next_free_str]; } if ((ret = myhashread(hashfp, &hashindex, &wordindex, word, &len)) == 0) continue; if (ret == -1) break; if ((hashindex >= HASH_TABLE_SIZE) || (hashindex < 0)) continue; /* ignore */ hashalloc(e); if (usemalloc) { if ((word = (char *)malloc(len + 2)) == NULL) break; strcpy(word, dummybuf); } else next_free_str += len + 2; e->word = word; e->val.attribute.freq = 0; /* just exists in compress's dict: not found in text-file yet! 
*/ e->val.attribute.index = wordindex; e->next = hash_table[hashindex]; hash_table[hashindex] = e; #if 0 printf("word=%s index=%d\n", word, wordindex); #endif /*0*/ } } else { /* read only a specified number of bytes */ while (bytestoread > numread) { if (usemalloc) word = dummybuf; else { if (free_str == NULL) free_str = (char *)malloc(AVG_WORD_LEN * DEF_MAX_WORDS); if (free_str == NULL) break; word = &free_str[next_free_str]; } if ((ret = myhashread(hashfp, &hashindex, &wordindex, word, &len)) <= 0) break; if ((hashindex >= HASH_TABLE_SIZE) || (hashindex < 0)) continue; /* ignore */ hashalloc(e); if (usemalloc) { if ((word = (char *)malloc(len + 2)) == NULL) break; strcpy(word, dummybuf); } else next_free_str += len + 2; e->word = word; e->val.attribute.freq = 0; /* just exists in compress's dict: not found in text-file yet! */ e->val.attribute.index = wordindex; e->next = hash_table[hashindex]; hash_table[hashindex] = e; wordindex ++; numread += ret; #if 0 printf("%d %d %s\n", hashindex, wordindex, word); #endif /*0*/ } } return (wordindex + 1); /* the highest indexed word + 1 */ } /* * Interprets srcbuf as a series of words separated by newlines and looks * for a complete occurrence of words in patbuf in it. If there IS an occurrence, * it builds the hash-table for THAT page. The hashfp must start at the * beginning on each call. */ int build_partial_hash(hash_table, hashfp, srcbuf, srclen, patbuf, patlen, blocksize, loaded_hash_table) hash_entry *hash_table[HASH_TABLE_SIZE]; FILE *hashfp; unsigned char *srcbuf; int srclen; unsigned char *patbuf; int patlen; int blocksize; char loaded_hash_table[HASH_FILE_BLOCKS]; { unsigned char *srcpos; unsigned char *srcinit, *srcend, dest[MAX_NAME_LEN]; int blockindex = 0; int i, initlen, endlen; unsigned char *strings[MAX_NAME_LEN]; /* maximum pattern length */ int numstrings = 0; int inword = 0; /* * Find all the relevant strings in the pattern. 
*/ i = 0; while(i= 0) && (strcmp((char *)strings[i], (char *)srcend) <= 0)) goto include_page; blockindex++; srcpos += (initlen + endlen + 2); continue; include_page: /* Include it if any of the patterns fit within this range */ if (loaded_hash_table[blockindex++]) continue; #if 0 printf("build_partial_hash: hashing words in page# %d\n", blockindex); #endif /*0*/ loaded_hash_table[blockindex - 1] = 1; fseek(hashfp, (blockindex-1)*blocksize, 0); tbuild_hash(hash_table, hashfp, blocksize); srcpos += (initlen + endlen + 2); } return 0; } pad_hash_file(filename, FILEBLOCKSIZE) unsigned char *filename; int FILEBLOCKSIZE; { FILE *outfp, *infp, *indexfp; int offset = 0, len; unsigned char buf[MAX_NAME_LEN]; int pid = getpid(); int i; unsigned char word[MAX_NAME_LEN]; unsigned char prev_word[MAX_NAME_LEN]; unsigned int hashindex, wordindex; char es1[MAX_LINE_LEN], es2[MAX_LINE_LEN]; if ((infp = fopen((char *)filename, "r")) == NULL) { fprintf(stderr, "cannot open for reading: %s\n", filename); exit(2); } sprintf((char *)buf, "%s.index", filename); if ((indexfp = fopen((const char *)buf, "w")) == NULL) { fprintf(stderr, "cannot open for writing: %s\n", buf); fclose(infp); exit(2); } sprintf((char *)buf, "%s.%d", filename, pid); if ((outfp = fopen((const char *)buf, "w")) == NULL) { fprintf(stderr, "cannot open for writing: %s\n", buf); fclose(infp); fclose(indexfp); exit(2); } if ((FILEBLOCKSIZE % MIN_BLOCKSIZE) != 0) { fprintf(stderr, "invalid block size %d: changing to %d\n", FILEBLOCKSIZE, MIN_BLOCKSIZE); FILEBLOCKSIZE = MIN_BLOCKSIZE; } fprintf(indexfp, "%d\n", FILEBLOCKSIZE); if ((char*)buf != fgets((char *)buf, MAX_NAME_LEN, infp)) goto end_of_input; len = strlen((char *)buf); sscanf((const char *)buf, "%d %d %s\n", &hashindex, &wordindex, word); putc(0, outfp); putc((hashindex & 0xff00)>>8, outfp); putc((hashindex & 0x00ff), outfp); putc((wordindex & 0xff00)>>8, outfp); putc((wordindex & 0x00ff), outfp); fprintf(outfp, "%s", word); buf[len-1] = '\0'; /* fgets gives 
you the newline too */ for (i=0; i< len; i++) if (isupper(buf[i])) buf[i] = tolower(buf[i]); for (i=len-2; i>=0; i--) if (buf[i] == ' ') { i++; break; } if (i < 0) i = 0; strcpy((char *)prev_word, (char *)&buf[i]); fprintf(indexfp, "%s", &buf[i]); /* the first word */ putc(0, indexfp); /* null terminated */ offset += strlen((char *)word)+5; while(fgets((char *)buf, MAX_NAME_LEN, infp) == (char *)buf) { len = strlen((char *)buf); if (offset + len > FILEBLOCKSIZE) { /* Put the last char of the prev. page */ fprintf(indexfp, "%s", prev_word); putc(0, indexfp); /* null terminated */ for (i=0; i>8, outfp); putc((hashindex & 0x00ff), outfp); putc((wordindex & 0xff00)>>8, outfp); putc((wordindex & 0x00ff), outfp); fprintf(outfp, "%s", word); buf[len-1] = '\0'; /* fgets gives you the newline too */ for (i=0; i< len; i++) if (isupper(buf[i])) buf[i] = tolower(buf[i]); for (i=len-2; i>=0; i--) if (buf[i] == ' ') { i++; break; } if (i < 0) i = 0; strcpy((char *)prev_word, (char *)&buf[i]); fprintf(indexfp, "%s", &buf[i]); /* store the first word at each page */ putc(0, indexfp); /* null terminated */ offset = 0; } else { sscanf((const char *)buf, "%d %d %s\n", &hashindex, &wordindex, word); putc(0, outfp); putc((hashindex & 0xff00)>>8, outfp); putc((hashindex & 0x00ff), outfp); putc((wordindex & 0xff00)>>8, outfp); putc((wordindex & 0x00ff), outfp); fprintf(outfp, "%s", word); buf[len-1] = '\0'; /* fgets gives you the newline too */ for (i=0; i=0; i--) if (buf[i] == ' ') { i++; break; } if (i < 0) i = 0; strcpy((char *)prev_word, (char *)&buf[i]); } offset += strlen((char *)word)+5; } fprintf(indexfp, "%s", prev_word); putc(0, indexfp); /* null terminated */ end_of_input: fclose(infp); fflush(outfp); fclose(outfp); fflush(indexfp); fclose(indexfp); sprintf((char *)buf, "exec %s '%s.%d' '%s'\n", SYSTEM_MV, tescapesinglequote(filename, es1), pid, tescapesinglequote(filename, es2)); system((const char *)buf); return(1); /* by default this function is declared to return int, but 
not explicitly declared, so we need to return something */ } glimpse-4.18.7/compress/main_cast.c000066400000000000000000000077251300371307100172160ustar00rootroot00000000000000/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. */ /* * main_cast.c : uses functions in hash.c and cast.c to implement "cast". */ #include "defs.h" #if ISO_CHAR_SET #include <locale.h> /* for setlocale() below */ #endif #include "dummysyscalls.c" extern char **environ; usage(progname) char *progname; { fprintf(stderr, "\nThis is cast version %s. Copyright (c) %s, University of Arizona.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); fprintf(stderr, "usage: %s [-help] [-F] [-H dir] [-V] [-d] [-e] [-o] [-r] [-s] [-y] sourcefiles\n", progname); fprintf(stderr, "summary of options (for a more detailed version, see 'man cast'):\n"); fprintf(stderr, "-help: output this menu\n"); fprintf(stderr, "-d: DO NOT delete sourcefiles after compress\n"); fprintf(stderr, "-e: DO NOT compress for easy search\n"); fprintf(stderr, "-o: DO NOT overwrite the existing compressed file (if any)\n"); fprintf(stderr, "-r: compress recursively\n"); fprintf(stderr, "-s: proceed silently and output the names of the compressible files\n"); fprintf(stderr, "-y: DO NOT prompt and always overwrite when used without -o\n"); fprintf(stderr, "-F: expect NAMES of files on stdin instead of data if there are no sourcefiles\n"); fprintf(stderr, "-H dir: the directory for dictionaries is 'dir' (default ~)\n"); fprintf(stderr, "\n"); exit(1); } main(argc, argv) int argc; char *argv[]; { char **filev; int filec; int i; /* counter on argc, and later, filec */ char hash_file[MAX_LINE_LEN]; char freq_file[MAX_LINE_LEN]; char comp_dir[MAX_LINE_LEN]; char name[MAX_LINE_LEN]; char outname[MAX_LINE_LEN]; char *home; int FLAGS; #if ISO_CHAR_SET setlocale(LC_ALL, ""); #endif filev = (char **)malloc(sizeof(char *) * argc); memset(filev, '\0', sizeof(char *)*argc); filec = 0; comp_dir[0] = '\0'; hash_file[0] = '\0'; freq_file[0] = '\0'; FLAGS = 0; FLAGS |= TC_ERRORMSGS | 
TC_REMOVE | TC_OVERWRITE | TC_EASYSEARCH; /* Look at options */ i=1; while (i < argc) { if (filec == 0 && !strcmp(argv[i], "-d")) { FLAGS &= ~TC_REMOVE; i ++; } else if (filec == 0 && !strcmp(argv[i], "-V")) { printf("\nThis is cast version %s, %s.\n\n", CAST_VERSION, CAST_DATE); return 0; } else if (filec == 0 && !strcmp(argv[i], "-e")) { FLAGS &= ~TC_EASYSEARCH; i ++; } else if (filec == 0 && !strcmp(argv[i], "-help")) { return usage(argv[0]); } else if (filec == 0 && !strcmp(argv[i], "-o")) { FLAGS &= ~TC_OVERWRITE; i ++; } else if (filec == 0 && !strcmp(argv[i], "-r")) { FLAGS |= TC_RECURSIVE; i ++; } else if (filec == 0 && !strcmp(argv[i], "-s")) { FLAGS |= TC_SILENT; i ++; } else if (filec == 0 && !strcmp(argv[i], "-y")) { FLAGS |= TC_NOPROMPT; i ++; } else if(filec == 0 && !strcmp(argv[i], "-F")) { FLAGS |= TC_FILENAMESONSTDIN; i ++; } else if (filec == 0 && !strcmp(argv[i], "-H")) { if (i + 1 >= argc) { fprintf(stderr, "%s: directory not specified after -H\n", argv[0]); return usage(argv[0]); } strcpy(comp_dir, argv[i+1]); i+=2; } else { filev[filec++] = argv[i++]; } } if (comp_dir[0] == '\0') { if ((home = (char *)getenv("HOME")) == NULL) { getcwd(comp_dir, MAXNAME-1); fprintf(stderr, "using working-directory '%s' to locate index\n", comp_dir); } else strncpy(comp_dir, home, MAXNAME); } strcpy(hash_file, comp_dir); strcat(hash_file, "/"); strcat(hash_file, DEF_HASH_FILE); strcpy(freq_file, comp_dir); strcat(freq_file, "/"); strcat(freq_file, DEF_FREQ_FILE); if (!initialize_tcompress(hash_file, freq_file, FLAGS)) exit(2); if (FLAGS & TC_SILENT) FLAGS &= ~TC_ERRORMSGS; /* now compress each file in filev array: if no files specified, compress stdin and put it on stdout */ if (filec == 0) { if (FLAGS & TC_FILENAMESONSTDIN) { while (fgets(name, MAX_LINE_LEN, stdin) == name) { tcompress_file(name, outname, FLAGS); } } else tcompress(stdin, -1, stdout, -1, FLAGS); } else for (i=0; i #endif extern int compute_dictionary(); extern char **environ; #include 
"dummysyscalls.c" usage(progname) char *progname; { fprintf(stderr, "usage: %s [-H directory] [-t threshold] [-l stop-list-size]\n", progname); fprintf(stderr, "defaults: %d %d %d ~\n", DEF_SPECIAL_WORDS, DEF_THRESHOLD, DEF_BLOCKSIZE); exit(1); } int main(argc, argv) int argc; unsigned char *argv[]; { char comp_dir[MAX_LINE_LEN]; int threshold, specialwords; int i = 1; char *home; #if ISO_CHAR_SET setlocale(LC_ALL, ""); #endif /* fill in default options */ comp_dir[0] = '\0'; threshold = DEF_THRESHOLD; specialwords = DEF_SPECIAL_WORDS; while(i < argc) { if (argv[i][0] != '-') return usage(argv[0]); else if (argv[i][1] == 'H') strcpy(comp_dir, argv[++i]); else if (argv[i][1] == 't') threshold = atoi(argv[++i]); else if (argv[i][1] == 'l') specialwords = atoi(argv[++i]); else if (argv[i][1] == 'V') { printf("\nThis is tbuild version %s. Copyright (c) %s, University of Arizona.\n\n", CAST_VERSION, CAST_DATE); } else return usage(argv[0]); i++; } if (comp_dir[0] == '\0') { if ((home = (char *)getenv("HOME")) == NULL) { getcwd(comp_dir, MAX_LINE_LEN-1); fprintf(stderr, "using working-directory '%s' to locate index\n", comp_dir); } else strncpy(comp_dir, home, MAX_LINE_LEN); } compute_dictionary(threshold, DISKBLOCKSIZE, specialwords, comp_dir); return 0; } glimpse-4.18.7/compress/main_uncast.c000066400000000000000000000104141300371307100175460ustar00rootroot00000000000000/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. */ /* * main_uncast.c: uses functions in hash.c and uncast.c to implement "uncast". */ #include "defs.h" #if ISO_CHAR_SET #include #endif #include "dummysyscalls.c" extern char **environ; usage(progname) char *progname; { fprintf(stderr, "\nThis is uncast version %s. 
Copyright (c) %s, University of Arizona.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); fprintf(stderr, "usage: %s [-help] [-F] [-H dir] [-V] [-d] [-o] [-r] [-s] [-y] sourcefiles\n", progname); fprintf(stderr, "summary of options (for a more detailed version, see 'man cast'):\n"); fprintf(stderr, "-help: output this menu\n"); fprintf(stderr, "-d: DO NOT remove source files after uncompress\n"); /* fprintf(stderr, "-e: is used with files NOT compressed for easy search\n"); */ fprintf(stderr, "-o: DO NOT overwrite the existing clear file (if any)\n"); fprintf(stderr, "-r: uncompress recursively\n"); fprintf(stderr, "-s: proceed silently and output the names of the uncompressible files\n"); fprintf(stderr, "-y: DO NOT prompt and always overwrite when used without -o\n"); fprintf(stderr, "-F: expect NAMES of files on stdin rather than data if there are no sourcefiles\n"); fprintf(stderr, "-H dir: the directory for dictionaries is 'dir' (default ~)\n"); fprintf(stderr, "\n"); exit(1); } main(argc, argv) int argc; char *argv[]; { char **filev; int filec; int i; /* counter on argc, and later, filec */ char freq_file[MAX_LINE_LEN]; char string_file[MAX_LINE_LEN]; char comp_dir[MAX_LINE_LEN]; char name[MAX_LINE_LEN]; char outname[MAX_LINE_LEN]; char *home; int FLAGS; int num_read; char buffer[SIGNATURE_LEN]; #if ISO_CHAR_SET setlocale(LC_ALL, ""); #endif filev = (char **)malloc(sizeof(char *) * argc); memset(filev, '\0', sizeof(char *)*argc); filec = 0; comp_dir[0] = '\0'; freq_file[0] = '\0'; string_file[0] = '\0'; FLAGS = 0; FLAGS |= TC_ERRORMSGS | TC_REMOVE | TC_OVERWRITE | TC_EASYSEARCH; /* Look at options */ i=1; while (i < argc) { if (filec == 0 && !strcmp(argv[i], "-d")) { FLAGS &= ~TC_REMOVE; i ++; } else if (filec == 0 && !strcmp(argv[i], "-V")) { printf("\nThis is uncast version %s, %s.\n\n", CAST_VERSION, CAST_DATE); return 0; } /* else if (filec == 0 && !strcmp(argv[i], "-e")) { FLAGS &= ~TC_EASYSEARCH; i ++; } */ if (filec == 0 && !strcmp(argv[i], "-help")) { return 
usage(argv[0]); } else if (filec == 0 && !strcmp(argv[i], "-o")) { FLAGS &= ~TC_OVERWRITE; i ++; } else if (filec == 0 && !strcmp(argv[i], "-r")) { FLAGS |= TC_RECURSIVE; i ++; } else if (filec == 0 && !strcmp(argv[i], "-s")) { FLAGS |= TC_SILENT; i ++; } else if (filec == 0 && !strcmp(argv[i], "-y")) { FLAGS |= TC_NOPROMPT; i ++; } else if(filec == 0 && !strcmp(argv[i], "-F")) { FLAGS |= TC_FILENAMESONSTDIN; i ++; } else if (filec == 0 && !strcmp(argv[i], "-H")) { if (i + 1 >= argc) { fprintf(stderr, "directory not specified after -H\n"); return usage(argv[0]); } strcpy(comp_dir, argv[i+1]); i+=2; } else { filev[filec++] = argv[i++]; } } if (comp_dir[0] == '\0') { if ((home = (char *)getenv("HOME")) == NULL) { getcwd(comp_dir, MAXNAME-1); fprintf(stderr, "using working-directory '%s' to locate index\n", comp_dir); } else strncpy(comp_dir, home, MAXNAME); } strcpy(string_file, comp_dir); strcat(string_file, "/"); strcat(string_file, DEF_STRING_FILE); strcpy(freq_file, comp_dir); strcat(freq_file, "/"); strcat(freq_file, DEF_FREQ_FILE); if (!initialize_tuncompress(string_file, freq_file, FLAGS)) exit(2); if (FLAGS & TC_SILENT) FLAGS &= ~TC_ERRORMSGS; /* now uncompress each file in filev array: if no files specified, uncompress stdin and put it on stdout */ if (filec == 0) { if (FLAGS & TC_FILENAMESONSTDIN) { while (fgets(name, MAX_LINE_LEN, stdin) == name) { tuncompress_file(name, outname, FLAGS); } } else { num_read = fread(buffer, 1, SIGNATURE_LEN - 1, stdin); if (!tuncompressible(buffer, num_read)) { fprintf(stderr, "signature does not match -- cannot uncompress stdin\n"); return -1; } tuncompress(stdin, -1, stdout, -1, FLAGS); } } else for (i=0; i= maxlen) return MYEOF; else c = (unsigned int)buf[*lenp]; } (*lenp) ++; return c; } myfpcopy(fp, src) FILE *fp; char *src; { int i=0; while(*src) { putc(*src, fp); src ++; i ++; } return i; } mystrcpy(dest, src) char *src, *dest; { int i=0; while(*dest = *src) { dest ++; src ++; i ++; } return i; } /* Returns 1 if 
little endian, 0 if big endian */ int get_endian() { union{ int x; struct { short y1; short y2; } y; } var; var.x = 0xffff0000; if (var.y.y1 == 0) return 1; else return 0; } /* * These procedures take care of the fact that the msb of the encoded * short cannot be < RESERVED_CHARS, and the lsb cannot be equal to '\n' or '\0'. */ unsigned char encode_msb(i) unsigned char i; { return i + RESERVED_CHARS; } unsigned char decode_msb(i) unsigned char i; { return i - RESERVED_CHARS; } unsigned char encode_lsb(i) unsigned char i; { if (i == '\0') return MAX_LSB; if (i == '\n') return MAX_LSB + 1; return i; } unsigned char decode_lsb(i) unsigned char i; { if (i == MAX_LSB) return '\0'; if (i == MAX_LSB + 1) return '\n'; return i; } unsigned short encode_index(i) unsigned short i; { unsigned char msb, lsb; msb = (i / MAX_LSB); lsb = (i % MAX_LSB); msb = encode_msb(msb); lsb = encode_lsb(lsb); return (msb << 8) | lsb; } unsigned short decode_index(i) unsigned short i; { unsigned char msb, lsb; msb = ((i & 0x0ff00) >> 8); lsb = (i & 0x00ff); msb = decode_msb(msb); lsb = decode_lsb(lsb); return (msb * MAX_LSB + lsb); } #if 0 /* This is bullshit */ unsigned short encode_index(i) unsigned short i; { unsigned int msb, lsb; top: msb = (i & 0xff00) >> 8; if ((i & 0x00ff) == '\n') { i = MAX_WORDS + msb; goto top; /* eliminate tail recursion */} lsb = (i & 0x00ff); msb += RESERVED_CHARS; return (0x0000ffff & ((msb << 8) | lsb)); } unsigned short decode_index(i) unsigned short i; { unsigned int msb, lsb, ret; msb = (i & 0xff00) >> 8; lsb = (i & 0x00ff); msb -= RESERVED_CHARS; ret = (0x0000ffff & ((msb << 8) | lsb)); if (ret >= MAX_WORDS) ret = (((ret - MAX_WORDS) << 8) | '\n'); return ret; } #endif /*0*/ char comp_signature[SIGNATURE_LEN]; /* SIGNATURE_LEN - 1 hex-chars terminated by '\0' */ /* returns the number of words read */ build_freq(freq_words_table, freq_words_strings, freq_words_lens, freq_file, flags) hash_entry freq_words_table[MAX_WORD_LEN+2][256]; char 
freq_words_strings[256][MAX_WORD_LEN+2]; int freq_words_lens[256]; char *freq_file; { FILE *fp = fopen(freq_file, "r"); int len, num, i, j; hash_entry *e; int numsofar = 0; int freq_words; memset(comp_signature, '\0', SIGNATURE_LEN); if (fp == NULL) { if (flags & TC_ERRORMSGS) { fprintf(stderr, "cannot open cast-dictionary file: %s\n", freq_file); fprintf(stderr, "(use -H to give a dictionary-dir or run 'buildcast' to make a dictionary)\n"); } return -1; } /* initialize the tables by accessing only those entries which will be used */ if (SIGNATURE_LEN != fread(comp_signature, 1, SIGNATURE_LEN, fp)) { if (flags & TC_ERRORMSGS) fprintf(stderr, "illegal cast signature in: %s\n", freq_file); fclose(fp); return -1; } comp_signature[SIGNATURE_LEN - 1] = '\0'; /* overwrite '\0' */ fscanf(fp, "%d\n", &freq_words); if ((freq_words < 0) || (freq_words > 256 - MAX_SPECIAL_CHARS)) { if (flags & TC_ERRORMSGS) fprintf(stderr, "illegal number of frequent words %d outside [0, %d] in: %s\n", freq_words, 256-MAX_SPECIAL_CHARS, freq_file); fclose(fp); return -1; } if (freq_words == 0) { fclose(fp); return 0; } for (i=0; i<=MAX_WORD_LEN; i++) { for (j=0; jword = &(freq_words_strings[numsofar + i][0]); if (1 != fscanf(fp, "%s\n", e->word)) { fclose(fp); return numsofar; } freq_words_lens[numsofar + i] = len; e->val.offset = numsofar + i; /* which-th special word is it? 
*/ if (i + 1 == num) e->next = NULL; else e->next = &(freq_words_table[len][i+1]); } numsofar += num; } fclose(fp); return numsofar; } int initialize_common_done = 0; /* Used in tcomp.c, tuncomp.c and csearch.c */ initialize_common(freq_file, flags) char *freq_file; int flags; { if (initialize_common_done == 1) return 1; if (SPECIAL_WORDS == -1) return 0; if ((freq_file == NULL) || (freq_file[0] == '\0')) return 0; /* courtesy: crd@hplb.hpl.hp.com */ if ((SPECIAL_WORDS = build_freq(freq_words_table, freq_words_strings, freq_words_lens, freq_file, flags)) == -1) return 0; BEGIN_SPECIAL_WORDS = MAX_SPECIAL_CHARS; RESERVED_CHARS = END_SPECIAL_WORDS = BEGIN_SPECIAL_WORDS + SPECIAL_WORDS; MAX_WORDS = MAX_LSB*(256-RESERVED_CHARS); /* upper byte must be > RESERVED_CHARS, lower byte must not be '\n' */ TC_FOUND_NOTBLANK = 0; TC_FOUND_BLANK = 0; initialize_common_done = 1; return 1; } uninitialize_common() { initialize_common_done = 0; return; } /* * Simple O(worlen*linelen) search since the average linelen is * guaranteed to be ~ 80/2, and the average wordlen, 2. * SHOULD WORK FOR ANY LEGITIMATE COMPRESSED STRING WITH EASY SEARCH */ int exists_tcompressed_word(word, wordlen, line, linelen, flags) CHAR *word, *line; int wordlen, linelen; { int i, j; #if 0 for (i=0; i linelen) return -1; if (flags & TC_EASYSEARCH) { for (i=0; i<=linelen-wordlen; i++) { if (word[0] == BEGIN_VERBATIM) while ((i <= linelen - wordlen) && (line[i] != BEGIN_VERBATIM)) i++; j = 0; while ((j < wordlen) && (i <= linelen - wordlen) && (word[j] == line[i+j])) j++; if (j >= wordlen) return i; if (i > linelen - wordlen) return -1; /* Goto next-pos for i. 
Remember: the for loop ALSO skips over one i */ if (line[i] >= RESERVED_CHARS) i++; else if (line[i] == BEGIN_VERBATIM) { i++; while ((i <= linelen - wordlen) && (line[i] != END_VERBATIM)) i++; if (i > linelen - wordlen) return -1; } else if (line[i] == EASY_ONE_VERBATIM) i++; } } else { for (i=0; i<=linelen-wordlen; i++) { if (word[0] == BEGIN_VERBATIM) while ((i <= linelen - wordlen) && (line[i] != BEGIN_VERBATIM)) i++; j = 0; while ((j < wordlen) && (i <= linelen - wordlen) && (word[j] == line[i+j])) j++; if (j >= wordlen) return i; if (i > linelen - wordlen) return -1; /* Goto next-pos for i. Remember: the for loop ALSO skips over one i */ if (line[i] >= RESERVED_CHARS) i++; else if (line[i] == BEGIN_VERBATIM) { i++; while ((i <= linelen - wordlen) && (line[i] != BEGIN_VERBATIM) && (line[i] != END_VERBATIM)) i++; if (i > linelen - wordlen) return -1; if (line[i] == BEGIN_VERBATIM) i--; /* counter-act the i++ */ } } } return -1; } /* * There is a problem here if we use these two routines to search for delimiters: * With outtail set, the implicit blank AFTER the word just before the beginning * of the record and a possible NOTBLANK after the end of the record might be missed. * No way to rectify it now unless we have flags to indicate if these things occurred. * That is why I have introduced TC_FOUND_NOTBLANK and TC_FOUND_BLANK. 
*/ /* return where the word begins or ends (=outtail): range = [begin, end) */ unsigned char * forward_tcompressed_word(begin, end, delim, len, outtail, flags) unsigned char *begin, *end, *delim; int len, outtail, flags; { register unsigned char *curend; register int pos; TC_FOUND_NOTBLANK = 0; if (begin + len > end) return end + 1; curend = begin; top: while ((curend <= end) && (*curend != '\n')) curend ++; if ((pos = exists_tcompressed_word(delim, len, begin, curend-begin, flags)) == -1) { curend ++; /* for next '\n' */ if (curend > end) return end + 1; begin = curend; goto top; } begin += pos; /* place where delimiter begins */ if (outtail) { TC_FOUND_NOTBLANK = 1; return begin + len; } else return begin; } /* return where the word begins or ends (=outtail): range = [begin, end) */ unsigned char * backward_tcompressed_word(end, begin, delim, len, outtail, flags) unsigned char *begin, *end, *delim; int len, outtail, flags; { register unsigned char *curbegin; register int pos; TC_FOUND_BLANK = 0; if (begin + len > end) return begin; curbegin = end; top: while ((curbegin > begin) && (*curbegin != '\n')) curbegin --; if ((pos = exists_tcompressed_word(delim, len, curbegin, end-curbegin, flags)) == -1) { curbegin --; /* for next '\n' */ if (curbegin < begin) return begin; end = curbegin; goto top; } curbegin += pos; /* place where delimiter begins */ if (outtail) { if ((curbegin + len < end) && (*(curbegin + len) != NOTBLANK)) TC_FOUND_BLANK = 1; return curbegin + len; } else return curbegin; } /* Escapes single quotes in "original" string with backslash (\) so that 
it can be passed on to the shell as a file name: returns its second argument for printf */ /* Called before passing any argument to the system() routine in glimpse or glimpseindex source code */ /* Works only if the new name is going to be passed as argument to the shell within two ''s */ char * tescapesinglequote(original, new) char *original, *new; { char *oldnew = new; while (*original != '\0') { if (*original == '\'') { *new ++ = '\''; /* close existing ' : this guy will be a part of a file name starting from a ' */ *new ++ = '\\'; /* add escape character */ *new ++ = '\''; /* add single quote from original here */ } *new ++ = *original ++; /* start the real single quote to continue existing file name if *original was ' */ } *new = *original; return oldnew; } glimpse-4.18.7/compress/quick.c000066400000000000000000000170751300371307100163730ustar00rootroot00000000000000/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. */ /* * quick.c: Used to search for a pattern in a compressed file. * * Algorithm: if the file (or stdin) is a compressed file, then: * + a. Read in the hash-table-index file. * + b. For each page in which the words of the pattern can be found: * build the hash-table using the words in exactly those pages. * + c. Now, call compress with the given pattern. * * + d. Call the normal search routines with the compressed pattern on * the input file. * + e. If the option is to count number of matches, just exit. * Otherwise we have to modify the r_output/output routines: * * + f. Read in the string-table-index file. * + g. For each page in which the word numbers of the input file can * be found: build the string-table using the words in exactly * those pages. * + h. Call uncompress with the input file line to be output and * output THIS line instead of the original matched line. * * Part of this will be in agrep and part of this here. 
*/ #include "defs.h" #include #include /* * The quick-functions can be called multiple number of times -- * they however open the hash, string and freq files only once. */ hash_entry *compress_hash_table[HASH_TABLE_SIZE]; /* used for compress: assume it is zeroed by C */ char loaded_hash_table[HASH_FILE_BLOCKS]; /* bit mask of loaded pages in hash-table: store chars since just 4K: speed is most imp. */ char *hashindexbuf; int hashindexsize; /* returns length of compressed pattern after filling up the compressed pattern in the user-supplied newpattern buffer */ int quick_tcompress(freq_file, hash_file, pattern, len, newpattern, maxnewlen, flags) char *freq_file; char *hash_file; CHAR *pattern; int len; void *newpattern; /* can be FILE* or CHAR* */ int *maxnewlen; int flags; { static FILE *hashfp = NULL, *hashindexfp = NULL; static char old_freq_file[MAX_LINE_LEN] = "", old_hash_file[MAX_LINE_LEN] = ""; static int blocksize; int newlen; if ((hashfp == NULL) || (strcmp(freq_file, old_freq_file)) || (strcmp(hash_file, old_hash_file))) { /* Have to do some initializations */ char s[256]; struct stat statbuf; if (hashfp != NULL) { uninitialize_tcompress(); fclose(hashfp); hashfp = NULL; } else memset(loaded_hash_table, '\0', HASH_FILE_BLOCKS); if (!initialize_common(freq_file, flags)) return 0; /* don't call initialize_tcompress since that will load the FULL hash table */ if ((hashfp = fopen(hash_file, "r")) == NULL) { if (flags & TC_ERRORMSGS) { fprintf(stderr, "cannot open cast-dictionary file: %s\n", hash_file); fprintf(stderr, "(use -H to give a dictionary-dir or run 'buildcast' to make a dictionary)\n"); } return 0; } sprintf(s, "%s.index", hash_file); if ((hashindexfp = fopen(s, "r")) == NULL) { if (flags & TC_ERRORMSGS) fprintf(stderr, "cannot open for reading: %s\n", s); fclose(hashfp); hashfp = NULL; return 0; } blocksize = 0; fscanf(hashindexfp, "%d\n", &blocksize); if (blocksize == 0) blocksize = DEF_BLOCKSIZE; if (fstat(fileno(hashindexfp), &statbuf) == -1) { 
fprintf(stderr, "error in quick_tcompress/fstat on '%s.index'\n", hash_file); fclose(hashfp); hashfp = NULL; fclose(hashindexfp); hashindexfp = NULL; return 0; } if ((hashindexbuf = (char *)malloc(statbuf.st_size + 1)) == NULL) { if (flags & TC_ERRORMSGS) fprintf(stderr, "quick_tcompress: malloc failure!\n"); fclose(hashfp); hashfp = NULL; fclose(hashindexfp); hashindexfp = NULL; return 0; } if ((hashindexsize = fread(hashindexbuf, 1, statbuf.st_size, hashindexfp)) == -1) { fprintf(stderr, "error in quick_tcompress/fread on '%s.index'\n", hash_file); fclose(hashfp); hashfp = NULL; fclose(hashindexfp); hashindexfp = NULL; return 0; } hashindexsize ++; /* st_size - bytes used up for blocksize in file + 1 <= st_size */ hashindexbuf[hashindexsize] = '\0'; fclose(hashindexfp); strcpy(old_freq_file, freq_file); strcpy(old_hash_file, hash_file); } else rewind(hashfp); /* Don't do it first time */ if (pattern[len-1] == '\0') len--; build_partial_hash(compress_hash_table, hashfp, hashindexbuf, hashindexsize, pattern, len, blocksize, loaded_hash_table); newlen = tcompress(pattern, len, newpattern, maxnewlen, flags); #if 0 printf("quick_tcompress: pat=%s len=%d newlen=%d newpat=", pattern, len, newlen); for (i=0; i in padded area */ c = '\0'; str[numread++] = c; return numread; } else str[numread++] = c; } str[numread] = '\0'; if (c == EOF) return -1; return numread; } int build_string(string_table, stringfp, bytestoread, initialwordindex) char *string_table[DEF_MAX_WORDS]; /*[MAX_WORD_LEN+2]; */ FILE *stringfp; int bytestoread; int initialwordindex; { int wordindex = initialwordindex; int numread = 0; int ret; char dummybuf[MAX_WORD_BUF]; char *word; if (bytestoread == -1) { /* read until end of file */ while (wordindex < MAX_WORDS) { if (usemalloc) word = dummybuf; else { if (free_strtable == NULL) free_strtable = (char *)malloc(AVG_WORD_LEN * DEF_MAX_WORDS); if (free_strtable == NULL) break; word = &free_strtable[next_free_strtable]; } if ((ret = mystringread(stringfp, 
word)) == 0) continue; if (ret == -1) break; if (usemalloc) { if ((word = (char *)malloc(ret + 2)) == NULL) break; strcpy(word, dummybuf); } else next_free_strtable += ret + 2; string_table[wordindex] = word; #if 0 printf("word=%s index=%d\n", string_table[wordindex], wordindex); #endif /*0*/ wordindex ++; } } else { /* read only the specified number of bytes */ while((wordindex < MAX_WORDS) && (bytestoread > numread)) { if (usemalloc) word = dummybuf; else { if (free_strtable == NULL) free_strtable = (char *)malloc(AVG_WORD_LEN * DEF_MAX_WORDS); if (free_strtable == NULL) break; word = &free_strtable[next_free_strtable]; } if ((ret = mystringread(stringfp, word)) <= 0) break; /* quit if EOF OR if padded area */ if (usemalloc) { if ((word = (char *)malloc(ret + 2)) == NULL) break; strcpy(word, dummybuf); } else next_free_strtable += ret + 2; string_table[wordindex] = word; #if 0 printf("word=%s index=%d\n", string_table[wordindex], wordindex); #endif /*0*/ wordindex ++; numread += ret; } } return wordindex; } /* * Interprets srcbuf as a set of srclen/2 short integers. It looks for all the * short-integers encoding words in the matched line and loads only those blocks * of the string table. Note: srcbuf must be aligned on a short-int boundary. */ int build_partial_string(string_table, stringfp, srcbuf, srclen, linebuf, linelen, blocksize, loaded_string_table) char *string_table[DEF_MAX_WORDS]; /* [MAX_WORD_LEN+2]; */ FILE *stringfp; unsigned char *srcbuf; int srclen; unsigned char *linebuf; int linelen; int blocksize; char loaded_string_table[STRING_FILE_BLOCKS]; { unsigned char *srcpos; int blockindex = 0; unsigned short srcinit, srcend; unsigned short wordnums[MAX_NAME_LEN]; /* maximum pattern length */ int numwordnums = 0; int i; /* * Find all the relevant wordnums in the line. 
*/ i = 0; while(i= srcinit) && (wordnums[i] <= srcend)) goto include_page; blockindex++; continue; include_page: /* Include it if any of the word-indices fit within this range */ if (loaded_string_table[blockindex++]) continue; #if 0 printf("build_partial_string: hashing words in page# %d\n", blockindex); #endif /*0*/ loaded_string_table[blockindex - 1] = 1; fseek(stringfp, (blockindex-1)*blocksize, 0); build_string(string_table, stringfp, blocksize, srcinit); } return 0; } pad_string_file(filename, FILEBLOCKSIZE) unsigned char *filename; int FILEBLOCKSIZE; { FILE *outfp, *infp, *indexfp; int offset = 0, len; unsigned char buf[MAX_NAME_LEN]; int pid = getpid(); int i; unsigned short wordindex = 0; char es1[MAX_LINE_LEN], es2[MAX_LINE_LEN]; if ((infp = fopen(filename, "r")) == NULL) { fprintf(stderr, "cannot open for reading: %s\n", filename); exit(2); } sprintf(buf, "%s.index", filename); if ((indexfp = fopen(buf, "w")) == NULL) { fprintf(stderr, "cannot open for writing: %s\n", buf); fclose(infp); exit(2); } sprintf(buf, "%s.%d", filename, pid); if ((outfp = fopen(buf, "w")) == NULL) { fprintf(stderr, "cannot open for writing: %s\n", buf); fclose(infp); fclose(indexfp); exit(2); } if ((FILEBLOCKSIZE % MIN_BLOCKSIZE) != 0) { fprintf(stderr, "invalid block size %d: changing to %d\n", FILEBLOCKSIZE, MIN_BLOCKSIZE); FILEBLOCKSIZE = MIN_BLOCKSIZE; } fprintf(indexfp, "%d\n", FILEBLOCKSIZE); buf[0] = '\0'; if ((char *)buf != fgets(buf, MAX_NAME_LEN, infp)) goto end_of_input; len = strlen((char *)buf); fputs(buf, outfp); fprintf(indexfp, "%d\n", wordindex); offset += len; wordindex ++; while(fgets(buf, MAX_NAME_LEN, infp) == (char *)buf) { len = strlen((char *)buf); if (offset + len > FILEBLOCKSIZE) { for (i=0; i MAX_THRESHOLD) { fprintf(stderr, "threshold must be in [%d, %d]\n", MIN_WORD_LEN, MAX_THRESHOLD); return -1; } if ((SPECIAL_WORDS < 0) || (SPECIAL_WORDS > 256-MAX_SPECIAL_CHARS)) { fprintf(stderr, "invalid special words %d: must be in [0, %d]\n", SPECIAL_WORDS, 
256-MAX_SPECIAL_CHARS); return -1; } RESERVED_CHARS = SPECIAL_WORDS + MAX_SPECIAL_CHARS; MAX_WORDS = MAX_LSB*(256-RESERVED_CHARS); if ((fp = fopen(index_file, "r")) == NULL) { fprintf(stderr, "cannot open for reading: %s\n", index_file); return -1; } sprintf(s, "/tmp/temp%d", pid); if ((tempfp = fopen(s, "w")) == NULL) { fprintf(stderr, "cannot open for writing: %s\n", s); fclose(fp); return -1; } while((c = getc(fp)) != EOF) { if (curchar >= MAX_NAME_LEN) { curchar = 0; while(((c = getc(fp)) != '\n') && (c != EOF)); if (c == EOF) break; else continue; } curline[curchar++] = (unsigned char)c; if (c == '\n') { /* reached end of record */ int i = 0; if (curline[0] == '%') { /* initial lines */ curchar = 0; continue; } curword[0] = '\0'; while ((i= MIN_WORD_LEN) && (curlen*curfreq >= THRESHOLD)) { fprintf(tempfp, "%d %d %s\n", curlen*curfreq, curoffset, curword); wordindex ++; } curoffset += curchar; /* offset cannot begin at 0: .index_list starts with %% and some junk */ curchar = 0; #if 0 printf("word=%s freq=%d\n", curword, curfreq); #endif /*0*/ } } fclose(fp); /* * Now choose MAX_WORDS words with the highest frequency. */ fflush(tempfp); fclose(tempfp); if (wordindex <= SPECIAL_WORDS) { fprintf(stderr, "Warning: very small dictionary with only %d words!\n", wordindex); } sprintf(s, "exec %s -n -r /tmp/temp%d > /tmp/sort%d\n", SYSTEM_SORT, pid, pid); system(s); sprintf(s, "exec %s /tmp/temp%d\n", SYSTEM_RM, pid); system(s); sprintf(s, "exec %s -%d /tmp/sort%d > /tmp/temp%d\n", SYSTEM_HEAD, MAX_WORDS, pid, pid); system(s); /* * The first ultra-frequent 32 words are stored in a separate table sorted by * lengths and within that according to alphabetical order (=canonical order). 
*/ sprintf(s, "/tmp/temp%d", pid); if ((tempfp = fopen(s, "r")) == NULL) { fprintf(stderr, "cannot open for reading: %s\n", s); return -1; } for (i=0; iword = (char *)malloc(len + 2); e->next = NULL; strcpy(e->word, (char *)s); /* I'm not worried about the offsets now */ t = &freq_strings_table[len]; while(*t != NULL) { if (strcmp((char *)s, (*t)->word) < 0) { e->next = *t; break; } t = &(*t)->next; } *t = e; /* valid in both cases */ freq_strings_num[len]++; } /* * Put all the other words in the hash/string tables */ for (; iword); p = e; e = e->next; free(p->word); free(p); } } fflush(freqfp); fclose(freqfp); if (!dump_small_hash(dict_hash_table, hash_file)) return -1; /* * Now sort chosen ones case-insensitively according to the name so that * those words all fall into the same page offset in the hash/string tables. */ /* Alter order of words in .hash_table */ sprintf(s, "/tmp/sort%d.a", pid); if ((awkfp = fopen(s, "w")) == NULL) { fprintf(stderr, "cannot open for writing: %s\n", s); return -1; } sprintf(s, "BEGIN {}\n{print $3 \" \" $2 \" \" $1}\nEND {}\n"); fwrite(s, 1, strlen((char *)s), awkfp); fflush(awkfp); fclose(awkfp); #if 0 sprintf(s, "cat /tmp/sort%d.a\n", pid); system(s); #endif /*0*/ #if 0 printf("stage1:"); getchar(); #endif /*0*/ sprintf(s, "exec %s -f /tmp/sort%d.a < '%s' > /tmp/sort%d\n", SYSTEM_AWK, pid, tescapesinglequote(hash_file, es1), pid); system(s); sprintf(s, "exec %s -d -f /tmp/sort%d > /tmp/temp%d\n", SYSTEM_SORT, pid, pid); system(s); sprintf(s, "/tmp/sort%d.a", pid); if ((awkfp = fopen(s, "w")) == NULL) { fprintf(stderr, "cannot open for writing: %s\n", s); return -1; } sprintf(s, "%s", "BEGIN {}\n{print $3 \" \" NR-1 \" \" $1}\nEND {}\n"); fwrite(s, 1, strlen((char *)s), awkfp); fflush(awkfp); fclose(awkfp); sprintf(s, "exec %s -f /tmp/sort%d.a < /tmp/temp%d > '%s'\n", SYSTEM_AWK, pid, pid, tescapesinglequote(hash_file, es1)); /* reorder and put in new word numbers */ system(s); #if 0 printf("stage2:"); getchar(); #endif /*0*/ /* 
Now extract string-table, which is the set of 2nd components of the hash-table */ sprintf(s, "/tmp/sort%d.a", pid); if ((awkfp = fopen(s, "w")) == NULL) { fprintf(stderr, "cannot open for writing: %s\n", s); return -1; } sprintf(s, "%s", "BEGIN {}\n{print $3}\nEND {}\n"); fwrite(s, 1, strlen((char *)s), awkfp); fflush(awkfp); fclose(awkfp); #if 0 sprintf(s, "cat /tmp/sort%d.a\n", pid); system(s); #endif /*0*/ sprintf(s, "exec %s -f /tmp/sort%d.a < '%s' > '%s'\n", SYSTEM_AWK, pid, tescapesinglequote(hash_file, es1), tescapesinglequote(string_file, es2)); system(s); #if 0 printf("stage3:"); getchar(); #endif /*0*/ /* * Now pad the hash-file and string files and store indices to words at * page boundary so that search() on compressed files can be made fast * -- it need not load the whole hash-table: just the page where the * word occurs. The index files are very small (< 1K) so read is fast. * The padded files are in binary -- this is what tcompress/tuncompress * read-in. This is done to save space. */ pad_hash_file(hash_file, FILEBLOCKSIZE); pad_string_file(string_file, FILEBLOCKSIZE); return 0; } glimpse-4.18.7/compress/test.c000066400000000000000000000017521300371307100162310ustar00rootroot00000000000000/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. 
*/ #include #include #include /* configured defines */ #if ISO_CHAR_SET #include #endif char src[256] = " industrial production because of energy and input shortages and labor\n"; char dest[256]; char srcsrc[256]; #include "dummysyscalls.c" main() { int i; int srclen = strlen(src); int destlen; int srcsrclen; #if ISO_CHAR_SET setlocale(LC_ALL, ""); #endif printf("going...\n"); destlen = quick_tcompress(".quick_lookup", ".compress_dictionary", src, srclen, dest, get_endian()); dest[63] = 0; printf("len=%d\n", destlen); for (i=0; i #include int tmemlook(pattern, text, length) unsigned char *pattern; unsigned char *text; int length; { unsigned char *text_end = text+length; unsigned char *text_begin = text; register unsigned char pat_0=pattern[0]; register unsigned char *px, *tx; if(pat_0 == '\n') { if(strncmp((char *)pattern+1, text, strlen((char *)pattern) -1) == 0) { return(0); } } /* this is a special case when the pattern is to begin of a line while the text match the pattern right at the beginning, in which case, '\n' won't be matched. */ pattern++; *text_end = pat_0 ; while(text < text_end) { while(*text++ != pat_0); if(text < text_end) { px = pattern; tx = text; while(*px == *tx) { px++; tx++; }; if(*px == '\0') { /* printf("begin matched\n"); */ return(text - text_begin); } } } return(-2); } glimpse-4.18.7/compress/trecursive.c000066400000000000000000000043511300371307100174430ustar00rootroot00000000000000/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. 
*/ #include "autoconf.h" /* ../libtemplate/include */ #include #include #if ISO_CHAR_SET #include #endif #if HAVE_DIRENT_H # include # define NAMLEN(dirent) strlen((dirent)->d_name) #else # define dirent direct # define NAMLEN(dirent) (dirent)->d_namlen # if HAVE_SYS_NDIR_H # include # endif # if HAVE_SYS_DIR_H # include # endif # if HAVE_NDIR_H # include # endif #endif #include #include #define DIRSIZE 14 #ifndef S_ISREG #define S_ISREG(mode) (0100000&(mode)) #endif #ifndef S_ISDIR #define S_ISDIR(mode) (0040000&(mode)) #endif #if 0 #define FUNCTION(x, y, z) treewalk(x, y, z) #define MAX_LINE_LEN 1024 main(argc, argv) int argc; char **argv; { char buf[MAX_LINE_LEN]; char outbuf[MAX_LINE_LEN]; int flags=0; #if ISO_CHAR_SET setlocale(LC_ALL, ""); #endif if (argc == 1) { strcpy(buf, "."); return treewalk(buf, outbuf, flags); } else while(--argc > 0) { strcpy(buf, *++argv); return treewalk(buf, outbuf, flags); } } int treewalk(name, outname, flags) char *name; char *outname; int flags; { struct stat stbuf; extern int puts(); if(my_lstat(name, &stbuf) == -1) { fprintf(stderr, "permission denied or non-existent: %s\n", name); return -1; } if ((stbuf.st_mode & S_IFMT) == S_IFLNK) return -1; if ((stbuf.st_mode & S_IFMT) == S_IFDIR) return DIRECTORY(name, outname, flags); if ((stbuf.st_mode & S_IFMT) == S_IFREG) return puts(name); } #endif /*0*/ int DIRECTORY(name, outname, flags) char *name, *outname; int flags; { struct dirent *dp; char *nbp; DIR *dirp; nbp = name + strlen(name); if( nbp+DIRSIZE+2 >= name+MAX_LINE_LEN ) { /* name too long */ fprintf(stderr, "name too long: %.32s...\n", name); return -1; } if((dirp = opendir(name)) == NULL) { fprintf(stderr, "permission denied: %s\n", name); return -1; } *nbp++ = '/'; for (dp = readdir(dirp); dp != NULL; dp = readdir(dirp)) { if (dp->d_name[0] == '\0' || strcmp(dp->d_name, ".") == 0 || strcmp(dp->d_name, "..")==0) { continue; } strcpy(nbp, dp->d_name); FUNCTION(name, outname, flags); } closedir (dirp); *--nbp = '\0'; /* 
restore name */
    return 0;
}
glimpse-4.18.7/compress/tsimpletest.c000066400000000000000000000046641300371307100176340ustar00rootroot00000000000000
/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */
/* simple tests which don't need to access indexing data structures */
#include <ctype.h>	/* isgraph, isspace */
#include <string.h>	/* strncmp, strstr */

#define b_sample_size 2048 /* the number of bytes sampled to determine whether a file is binary */
#define u_sample_size 1024 /* the number of bytes sampled to determine whether a file is uuencoded */

#if 0
/* ---------------------------------------------------------------------
   check for binary stream
--------------------------------------------------------------------- */
ttest_binary(buffer, length)
unsigned char *buffer;
int length;
{
    int i=0;
    int b_count=0;

    if(length > b_sample_size) length = b_sample_size;
    for(i=0; i<length; i++) {
        if(buffer[i] > 127) b_count++;
    }
    if(b_count*10 >= length) return(1);
    return(0);
}
#else /*0*/
/* Lets try this one instead: Chris Dalton */
ttest_binary(buffer, length)
unsigned char *buffer;
int length;
{
    int permitted_errors;

    if (length > b_sample_size) { length= b_sample_size; }
    permitted_errors= length/10;
    while (permitted_errors && length--) {
        if (!(isgraph(*buffer) || isspace(*buffer))) --permitted_errors;
        buffer++; /* advance to the next sampled byte; otherwise only buffer[0] is ever tested */
    }
    return (permitted_errors == 0);
}
#endif /*0*/

/* ---------------------------------------------------------------------
   check for uuencoded stream
--------------------------------------------------------------------- */
ttest_uuencode(buffer, length)
unsigned char *buffer;
int length;
{
    int i=0;
    int j;

    if(length > u_sample_size) length = u_sample_size;
    if(strncmp((char *)buffer, "begin", 5) == 0) {
        i=5;
        goto CONT;
    }
    i = tmemlook("\nbegin", buffer, length);
    if(i < 0) return(0);
CONT:
    while(buffer[i] != '\n' && i=length) return 0;
    buffer[i] = '\0';
    if ((first = (char *)strstr((char *)buffer, "PS-Adobe")) == NULL) {
        buffer[i] = '\n';
        return 0;
    }
    buffer[i] = '\n';
    return 1;
}
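The binary-stream check in tsimpletest.c samples a bounded prefix of the file and calls it binary once roughly one tenth of the sampled bytes are neither printable nor whitespace. The sketch below is a standalone variation on that idea, not the code glimpse ships: the names `looks_binary` and `SAMPLE_SIZE` are hypothetical, and unlike `ttest_binary` it counts bad bytes over the whole sample and reports an empty buffer as text.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define SAMPLE_SIZE 2048 /* hypothetical name; same value as b_sample_size */

/* Returns 1 if at least one tenth of the sampled bytes are neither
   printable (isgraph) nor whitespace (isspace), 0 otherwise.
   An empty buffer is reported as text. */
int looks_binary(const unsigned char *buf, size_t len)
{
    size_t i, bad = 0;

    assert(buf != NULL || len == 0);
    if (len > SAMPLE_SIZE)
        len = SAMPLE_SIZE;      /* only inspect a bounded prefix */
    for (i = 0; i < len; i++) {
        if (!isgraph(buf[i]) && !isspace(buf[i]))
            bad++;              /* control byte or non-ASCII in the C locale */
    }
    return len > 0 && bad * 10 >= len;
}
```

A caller would read the first block of a file into a buffer and pass the number of bytes actually read; the result depends on the current locale because `isgraph` and `isspace` do.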
glimpse-4.18.7/compress/uncast.c000066400000000000000000000457701300371307100165570ustar00rootroot00000000000000/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. */ /* * uncast.c: main text uncompression routines. Exports tuncompress() called * from main() in main_uncast.c, and one other simple routine * tuncompressible_file(). */ #include "defs.h" #include #if defined(__NeXT__) /* NeXT has no */ struct utimbuf { time_t actime; /* access time */ time_t modtime; /* modification time */ }; #else #include #endif #define MYEOF 0xffffffff extern int RESERVED_CHARS; extern int MAX_WORDS; extern int SPECIAL_WORDS; extern int BEGIN_SPECIAL_WORDS; extern int END_SPECIAL_WORDS; extern int NUM_SPECIAL_DELIMITERS; extern int END_SPECIAL_DELIMITERS; extern int ONE_VERBATIM; extern int TC_FOUND_BLANK, TC_FOUND_NOTBLANK; extern char comp_signature[SIGNATURE_LEN]; extern hash_entry freq_words_table[MAX_WORD_LEN+2][256]; /* 256 is the maximum possible number of special words */ extern char freq_words_strings[256][MAX_WORD_LEN+2]; extern int freq_words_lens[256]; extern char *compress_string_table[DEF_MAX_WORDS]; /*[MAX_WORD_LEN+2]; */ extern int usemalloc, next_free_strtable; initialize_tuncompress(string_file, freq_file, flags) char *string_file, *freq_file; int flags; { FILE *stringfp; if (!initialize_common(freq_file, flags)) return 0; next_free_strtable = 0; memset(compress_string_table, '\0', sizeof(char *) * DEF_MAX_WORDS); if (MAX_WORDS == 0) return 1; /* Load uncompress dictionary */ if ((stringfp = fopen(string_file, "r")) == NULL) { if (flags & TC_ERRORMSGS) { fprintf(stderr, "cannot open cast-dictionary file: %s\n", string_file); fprintf(stderr, "(use -H to give a dictionary-dir or run 'buildcast' to make a dictionary)\n"); } return 0; } if (!build_string(compress_string_table, stringfp, -1, 0)) { /* read all bytes until end */ fclose(stringfp); return 0; } fclose(stringfp); return 1; } uninitialize_tuncompress() { int i; uninitialize_common(); if 
(usemalloc) { for (i=0; i= 0) && (outlen + 2 >= maxoutlen)) return outlen;\ putc(' ', outfp);\ outlen ++;\ putc(' ', outfp);\ outlen ++;\ break;\ case BLANK:\ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen;\ putc(' ', outfp);\ outlen ++;\ break;\ \ case TWOTABS:\ if ((maxoutlen >= 0) && (outlen + 2 >= maxoutlen)) return outlen;\ putc('\t', outfp);\ outlen ++;\ putc('\t', outfp);\ outlen ++;\ break;\ case TAB:\ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen;\ putc('\t', outfp);\ outlen ++;\ break;\ \ case TWONEWLINES:\ if ((maxoutlen >= 0) && (outlen + 2 >= maxoutlen)) return outlen;\ putc('\n', outfp);\ outlen ++;\ putc('\n', outfp);\ outlen ++;\ break;\ case NEWLINE:\ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen;\ putc('\n', outfp);\ outlen ++;\ break;\ \ default:\ if ((c < END_SPECIAL_TEXTS) && (c >= BEGIN_SPECIAL_TEXTS)) {\ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen;\ putc(special_texts[c - BEGIN_SPECIAL_TEXTS], outfp); outlen ++;\ }\ else if ((c < END_SPECIAL_DELIMITERS) && (c >= BEGIN_SPECIAL_DELIMITERS)) {\ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen;\ putc(special_delimiters[c - BEGIN_SPECIAL_DELIMITERS], outfp); outlen ++;\ }\ else if ((c < END_SPECIAL_WORDS) && (c >= BEGIN_SPECIAL_WORDS)) {\ if ((maxoutlen >= 0) && (outlen + freq_words_lens[c - BEGIN_SPECIAL_WORDS] >= maxoutlen)) return outlen;\ fprintf(outfp, "%s", freq_words_strings[c - BEGIN_SPECIAL_WORDS]); outlen += freq_words_lens[c - BEGIN_SPECIAL_WORDS];\ }\ /* else should not have called this function */\ }\ }\ if (outbuf != NULL) {\ switch(c)\ {\ case TWOBLANKS:\ if ((maxoutlen >= 0) && (outlen + 2 >= maxoutlen)) return outlen;\ outbuf[outlen ++] = ' ';\ outbuf[outlen ++] = ' ';\ break;\ case BLANK:\ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen;\ outbuf[outlen ++] = ' ';\ break;\ \ case TWOTABS:\ if ((maxoutlen >= 0) && (outlen + 2 >= maxoutlen)) return outlen;\ outbuf[outlen ++] 
= '\t';\ outbuf[outlen ++] = '\t';\ break;\ case TAB:\ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen;\ outbuf[outlen ++] = '\t';\ break;\ \ case TWONEWLINES:\ if ((maxoutlen >= 0) && (outlen + 2 >= maxoutlen)) return outlen;\ outbuf[outlen ++] = '\n';\ outbuf[outlen ++] = '\n';\ break;\ case NEWLINE:\ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen;\ outbuf[outlen ++] = '\n';\ break;\ \ default:\ if ((c < END_SPECIAL_TEXTS) && (c >= BEGIN_SPECIAL_TEXTS)) {\ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen;\ outbuf[outlen ++] = special_texts[c - BEGIN_SPECIAL_TEXTS];\ }\ else if ((c < END_SPECIAL_DELIMITERS) && (c >= BEGIN_SPECIAL_DELIMITERS)) {\ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen;\ outbuf[outlen ++] = special_delimiters[c - BEGIN_SPECIAL_DELIMITERS];\ }\ else if ((c < END_SPECIAL_WORDS) && (c >= BEGIN_SPECIAL_WORDS)) {\ /* printf("-->%s\n", freq_words_strings[c-BEGIN_SPECIAL_WORDS]); */\ if ((maxoutlen >= 0) && (outlen + freq_words_lens[c - BEGIN_SPECIAL_WORDS] >= maxoutlen)) return outlen;\ memcpy(outbuf+outlen, freq_words_strings[c - BEGIN_SPECIAL_WORDS], freq_words_lens[c - BEGIN_SPECIAL_WORDS]);\ outlen += freq_words_lens[c - BEGIN_SPECIAL_WORDS]; \ }\ /* else should not have called this function */\ }\ }\ } int UNCAST_ERRORS = 0; /* Uncompresses input from indata and outputs it into outdata: returns number of chars in output */ int tuncompress(indata, maxinlen, outdata, maxoutlen, flags) void *indata, *outdata; int maxinlen, maxoutlen; int flags; { unsigned short index, dindex; unsigned int c; int verbatim_state = 0; int inlen, outlen = 0; FILE *infp = NULL, *outfp = NULL; unsigned char *inbuf = NULL, *outbuf = NULL; int easysearch = flags&TC_EASYSEARCH; int untilnewline = flags&TC_UNTILNEWLINE; if (flags & TC_SILENT) return 0; if (maxinlen < 0) { infp = (FILE *)indata; if ((easysearch = mygetc(infp, inbuf, maxinlen, &inlen)) == MYEOF) return outlen; /* ignore parameter: take 
from file */ inlen = SIGNATURE_LEN; } else { /* don't care about signature: user's responsibility */ inbuf = (unsigned char *)indata; inlen = 0; } if (maxoutlen < 0) { outfp = (FILE *)outdata; } else { outbuf = (unsigned char *)outdata; } if (easysearch) { ONE_VERBATIM = EASY_ONE_VERBATIM; NUM_SPECIAL_DELIMITERS = EASY_NUM_SPECIAL_DELIMITERS; END_SPECIAL_DELIMITERS = EASY_END_SPECIAL_DELIMITERS; } else { ONE_VERBATIM = HARD_ONE_VERBATIM; NUM_SPECIAL_DELIMITERS = HARD_NUM_SPECIAL_DELIMITERS; END_SPECIAL_DELIMITERS = HARD_END_SPECIAL_DELIMITERS; } if (TC_FOUND_BLANK) { if (outfp != NULL) putc(' ', outfp); if (outbuf != NULL) outbuf[outlen] = ' '; outlen ++; } TC_FOUND_BLANK = 0; /* default: use result of previous backward_tcompressed_word only */ /* * The algorithm, as expected, is a complete inverse of the compression * algorithm: see tcompress.c in this directory to understand this function. * I've used gotos since the termination condition is too complex. * The two sub-parts are exactly the same except for verbatim processing. * Actually, loop-unrolling was done here: you can combine them together but... 
*/ if (easysearch) { /* compress was done in a context-free way to speed up searches */ while(1) { if((c = mygetc(infp, inbuf, maxinlen, &inlen)) == MYEOF) return outlen; bypass_getc1: if (c == ONE_VERBATIM) { if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen; if ((c = mygetc(infp, inbuf, maxinlen, &inlen)) == MYEOF) return outlen; if (outfp != NULL) putc(c, outfp); /* no processing whatsoever */ if (outbuf != NULL) outbuf[outlen] = c; outlen ++; } else if (verbatim_state) { if (c == END_VERBATIM) verbatim_state = 0; else { if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen; if (outfp != NULL) putc(c, outfp); if (outbuf != NULL) outbuf[outlen] = c; outlen ++; } } else if (c < END_SPECIAL_CHARS) { process_special_char(c) if ( ((c == NEWLINE) || (c == TWONEWLINES)) && untilnewline) return outlen; } else if (c == BEGIN_VERBATIM) { if((c = mygetc(infp, inbuf, maxinlen, &inlen)) == MYEOF) return outlen; if ((maxoutlen >= 0) && (outlen + 1>= maxoutlen)) return outlen; if (outfp != NULL) putc(c, outfp); if (outbuf != NULL) outbuf[outlen] = c; outlen ++; verbatim_state = 1; } else if (c == END_VERBATIM) { /* not in verbatim state, still end_verbatim! */ verbatim_state = 0; fprintf(stderr, "error in decompression after %d chars [verbatim processing]. 
skipping...\n", inlen); UNCAST_ERRORS = 1; } else { above1: if (c < RESERVED_CHARS) { /* this is a special-word but not a special char */ process_special_char(c) } else { /* it is an index of a word in the dictionary since 1st byte >= RESERVED_CHARS */ index = c; index <<= 8; if ((c = mygetc(infp, inbuf, maxinlen, &inlen)) == MYEOF) return outlen; index |= c; dindex = decode_index(index); if(dindex < MAX_WORDS) { if ((maxoutlen >= 0) && (outlen + AVG_WORD_LEN >= maxoutlen)) return outlen; if (outfp != NULL) outlen += myfpcopy(outfp, compress_string_table[dindex]); if (outbuf != NULL) { outlen += mystrcpy(outbuf+outlen, compress_string_table[dindex]); } if ((maxoutlen >= 0) && (outlen >= maxoutlen)) return outlen; } else { fprintf(stderr, "error in decompression after %d chars [bad index %x]. skipping...\n", inlen, index); UNCAST_ERRORS = 1; } } /* process_char_after_word1: */ /* now to see what follows the word: a blank or a special delimiter or not-blank */ if((c = mygetc(infp, inbuf, maxinlen, &inlen)) == MYEOF) { if (!TC_FOUND_NOTBLANK) { if (outfp != NULL) putc(' ', outfp); if (outbuf != NULL) outbuf[outlen] = ' '; outlen ++; } TC_FOUND_NOTBLANK = 0; /* default: use result of previous forward_tcompressed_word only */ return outlen; } else if (c < MAX_SPECIAL_CHARS) { if ((c < END_SPECIAL_DELIMITERS) && (c >= BEGIN_SPECIAL_DELIMITERS)) { if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen; if (outfp != NULL) putc(special_delimiters[c - BEGIN_SPECIAL_DELIMITERS], outfp); if (outbuf != NULL) outbuf[outlen] = special_delimiters[c - BEGIN_SPECIAL_DELIMITERS]; outlen ++; } else if (c != NOTBLANK) { if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen; if (outfp != NULL) putc(' ', outfp); if (outbuf != NULL) outbuf[outlen] = ' '; outlen ++; goto bypass_getc1; } /* else go normal getc */ } else { /* can be one of the special_words or a dictionary index */ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen; if (outfp != NULL) 
putc(' ', outfp); if (outbuf != NULL) outbuf[outlen] = ' '; outlen ++; goto above1; } } } } else { /* compression was done in a context sensitive fashion w/o regards to search */ while(1) { if((c = mygetc(infp, inbuf, maxinlen, &inlen)) == MYEOF) return outlen; bypass_getc2: if (verbatim_state) { if (c == END_VERBATIM) verbatim_state = 0; else if (c == BEGIN_VERBATIM) goto verbatim_processing; else { if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen; if (outfp != NULL) putc(c, outfp); if (outbuf != NULL) outbuf[outlen] = c; outlen ++; } } else if (c < END_SPECIAL_CHARS) { process_special_char(c) if ( ((c == NEWLINE) || (c == TWONEWLINES)) && untilnewline) return outlen; } else if (c == BEGIN_VERBATIM) { verbatim_processing: if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen; if((c = mygetc(infp, inbuf, maxinlen, &inlen)) == MYEOF) return outlen; if (outfp != NULL) putc(c, outfp); if (outbuf != NULL) outbuf[outlen] = c; outlen ++; if ((c!=BEGIN_VERBATIM) && (c!=END_VERBATIM)) verbatim_state = 1; /* only _these_ are escape characters */ } else if (c == END_VERBATIM) { /* not in verbatim state, still end_verbatim! */ verbatim_state = 0; fprintf(stderr, "error in decompression after %d chars [verbatim processing]. 
skipping...\n", inlen); UNCAST_ERRORS = 1; } else { above2: if (c < RESERVED_CHARS) { /* this is a special-word but not a special char */ process_special_char(c) } else { /* it is an index of a word in the dictionary since 1st byte >= RESERVED_CHARS */ index = c; index <<= 8; if ((c = mygetc(infp, inbuf, maxinlen, &inlen)) == MYEOF) return outlen; index |= c; dindex = decode_index(index); if(dindex < MAX_WORDS) { if ((maxoutlen >= 0) && (outlen + AVG_WORD_LEN >= maxoutlen)) return outlen; if (outfp != NULL) outlen += myfpcopy(outfp, compress_string_table[dindex]); if (outbuf != NULL) { outlen += mystrcpy(outbuf+outlen, compress_string_table[dindex]); } if ((maxoutlen >= 0) && (outlen >= maxoutlen)) return outlen; } else { fprintf(stderr, "error in decompression after %d chars [bad index %x]. skipping...\n", inlen, index); UNCAST_ERRORS = 1; } } /* process_char_after_word2: */ /* now to see what follows the word: a blank or a special delimiter or not-blank */ if((c = mygetc(infp, inbuf, maxinlen, &inlen)) == MYEOF) { if (!TC_FOUND_NOTBLANK) { if (outfp != NULL) putc(' ', outfp); if (outbuf != NULL) outbuf[outlen] = ' '; outlen ++; } TC_FOUND_NOTBLANK = 0; /* default: use result of previous forward_tcompressed_word only */ return outlen; } else if (c < MAX_SPECIAL_CHARS) { if ((c < END_SPECIAL_DELIMITERS) && (c >= BEGIN_SPECIAL_DELIMITERS)) { if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen; if (outfp != NULL) putc(special_delimiters[c - BEGIN_SPECIAL_DELIMITERS], outfp); if (outbuf != NULL) outbuf[outlen] = special_delimiters[c - BEGIN_SPECIAL_DELIMITERS]; outlen ++; } else if (c != NOTBLANK) { if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen; if (outfp != NULL) putc(' ', outfp); if (outbuf != NULL) outbuf[outlen] = ' '; outlen ++; goto bypass_getc2; } /* else go normal getc */ } else { /* can be one of the special_words or a dictionary index */ if ((maxoutlen >= 0) && (outlen + 1 >= maxoutlen)) return outlen; if (outfp != NULL) 
putc(' ', outfp); if (outbuf != NULL) outbuf[outlen] = ' '; outlen ++; goto above2; } } } } } #define FUNCTION tuncompress_file #define DIRECTORY tuncompress_directory #include "trecursive.c" /* returns #bytes (>=0) in the uncompressed file, -1 if major error (not able to uncompress) */ int tuncompress_file(name, outname, flags) char *name; char *outname; int flags; { FILE *fp; FILE *outfp; int inlen, ret; struct stat statbuf; /* struct timeval tvp[2]; */ struct utimbuf tvp; char tempname[MAX_LINE_LEN]; if (name == NULL) return -1; special_get_name(name, -1, tempname); if (-1 == stat(tempname, &statbuf)) { if (flags & TC_ERRORMSGS) fprintf(stderr, "permission denied or non-existent: %s\n", tempname); return -1; } if (S_ISDIR(statbuf.st_mode)) { if (flags & TC_RECURSIVE) return tuncompress_directory(tempname, outname, flags); if (flags & TC_ERRORMSGS) fprintf(stderr, "skipping directory: %s\n", tempname); return -1; } if (!S_ISREG(statbuf.st_mode)) { if (flags & TC_ERRORMSGS) fprintf(stderr, "not a regular file, skipping: %s\n", tempname); return -1; } inlen = strlen(tempname); if (!tuncompressible_filename(tempname, inlen)) { if (!(flags & TC_RECURSIVE) && (flags & TC_ERRORMSGS)) fprintf(stderr, "no %s extension, skipping: %s\n", COMP_SUFFIX, tempname); return -1; } if ((fp = fopen(tempname, "r")) == NULL) { if (flags & TC_ERRORMSGS) fprintf(stderr, "permission denied or non-existent: %s\n", tempname); return -1; } if (!tuncompressible_fp(fp)) { if (flags & TC_ERRORMSGS) fprintf(stderr, "signature does not match, skipping: %s\n", tempname); fclose(fp); return -1; } if (flags & TC_SILENT) { printf("%s\n", tempname); fclose(fp); return 0; } /* Create and open output file */ strncpy(outname, tempname, MAX_LINE_LEN); outname[inlen - strlen(COMP_SUFFIX)] = '\0'; if (!access(outname, R_OK)) { if (!(flags & TC_OVERWRITE)) { fclose(fp); return 0; } else if (!(flags & TC_NOPROMPT)) { char s[8]; printf("overwrite %s? 
(y/n): ", outname); scanf("%c", s); if (s[0] != 'y') { fclose(fp); return 0; } } } if ((outfp = fopen(outname, "w")) == NULL) { if (flags & TC_ERRORMSGS) fprintf(stderr, "cannot open for writing: %s\n", outname); fclose(fp); return -1; } UNCAST_ERRORS = 0; if ( ((ret = tuncompress(fp, -1, outfp, -1, flags)) > 0) && !UNCAST_ERRORS && (flags & TC_REMOVE)) { unlink(tempname); } fclose(fp); fflush(outfp); fclose(outfp); /* tvp[0].tv_sec = statbuf.st_atime; tvp[0].tv_usec = 0; tvp[1].tv_sec = statbuf.st_mtime; tvp[1].tv_usec = 0; utimes(outname, tvp); */ tvp.actime = statbuf.st_atime; tvp.modtime = statbuf.st_mtime; utime(outname, &tvp); return ret; } glimpse-4.18.7/configure000077500000000000000000005176011300371307100151670ustar00rootroot00000000000000#! /bin/sh # Guess values for system-dependent variables and create Makefiles. # Generated by GNU Autoconf 2.57. # # Copyright 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001, 2002 # Free Software Foundation, Inc. # This configure script is free software; the Free Software Foundation # gives unlimited permission to copy, distribute and modify it. ## --------------------- ## ## M4sh Initialization. ## ## --------------------- ## # Be Bourne compatible if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then emulate sh NULLCMD=: # Zsh 3.x and 4.x performs word splitting on ${1+"$@"}, which # is contrary to our usage. Disable this feature. alias -g '${1+"$@"}'='"$@"' elif test -n "${BASH_VERSION+set}" && (set -o posix) >/dev/null 2>&1; then set -o posix fi # Support unset when possible. if (FOO=FOO; unset FOO) >/dev/null 2>&1; then as_unset=unset else as_unset=false fi # Work around bugs in pre-3.0 UWIN ksh. $as_unset ENV MAIL MAILPATH PS1='$ ' PS2='> ' PS4='+ ' # NLS nuisances. 
for as_var in \ LANG LANGUAGE LC_ADDRESS LC_ALL LC_COLLATE LC_CTYPE LC_IDENTIFICATION \ LC_MEASUREMENT LC_MESSAGES LC_MONETARY LC_NAME LC_NUMERIC LC_PAPER \ LC_TELEPHONE LC_TIME do if (set +x; test -n "`(eval $as_var=C; export $as_var) 2>&1`"); then eval $as_var=C; export $as_var else $as_unset $as_var fi done # Required to use basename. if expr a : '\(a\)' >/dev/null 2>&1; then as_expr=expr else as_expr=false fi if (basename /) >/dev/null 2>&1 && test "X`basename / 2>&1`" = "X/"; then as_basename=basename else as_basename=false fi # Name of the executable. as_me=`$as_basename "$0" || $as_expr X/"$0" : '.*/\([^/][^/]*\)/*$' \| \ X"$0" : 'X\(//\)$' \| \ X"$0" : 'X\(/\)$' \| \ . : '\(.\)' 2>/dev/null || echo X/"$0" | sed '/^.*\/\([^/][^/]*\)\/*$/{ s//\1/; q; } /^X\/\(\/\/\)$/{ s//\1/; q; } /^X\/\(\/\).*/{ s//\1/; q; } s/.*/./; q'` # PATH needs CR, and LINENO needs CR and PATH. # Avoid depending upon Character Ranges. as_cr_letters='abcdefghijklmnopqrstuvwxyz' as_cr_LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ' as_cr_Letters=$as_cr_letters$as_cr_LETTERS as_cr_digits='0123456789' as_cr_alnum=$as_cr_Letters$as_cr_digits # The user is always right. if test "${PATH_SEPARATOR+set}" != set; then echo "#! /bin/sh" >conf$$.sh echo "exit 0" >>conf$$.sh chmod +x conf$$.sh if (PATH="/nonexistent;."; conf$$.sh) >/dev/null 2>&1; then PATH_SEPARATOR=';' else PATH_SEPARATOR=: fi rm -f conf$$.sh fi as_lineno_1=$LINENO as_lineno_2=$LINENO as_lineno_3=`(expr $as_lineno_1 + 1) 2>/dev/null` test "x$as_lineno_1" != "x$as_lineno_2" && test "x$as_lineno_3" = "x$as_lineno_2" || { # Find who we are. Look in the path if we contain no path at all # relative or not. case $0 in *[\\/]* ) as_myself=$0 ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. test -r "$as_dir/$0" && as_myself=$as_dir/$0 && break done ;; esac # We did not find ourselves, most probably we were run as `sh COMMAND' # in which case we are not to be found in the path. 
if test "x$as_myself" = x; then as_myself=$0 fi if test ! -f "$as_myself"; then { echo "$as_me: error: cannot find myself; rerun with an absolute path" >&2 { (exit 1); exit 1; }; } fi case $CONFIG_SHELL in '') as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in /bin$PATH_SEPARATOR/usr/bin$PATH_SEPARATOR$PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for as_base in sh bash ksh sh5; do case $as_dir in /*) if ("$as_dir/$as_base" -c ' as_lineno_1=$LINENO as_lineno_2=$LINENO as_lineno_3=`(expr $as_lineno_1 + 1) 2>/dev/null` test "x$as_lineno_1" != "x$as_lineno_2" && test "x$as_lineno_3" = "x$as_lineno_2" ') 2>/dev/null; then $as_unset BASH_ENV || test "${BASH_ENV+set}" != set || { BASH_ENV=; export BASH_ENV; } $as_unset ENV || test "${ENV+set}" != set || { ENV=; export ENV; } CONFIG_SHELL=$as_dir/$as_base export CONFIG_SHELL exec "$CONFIG_SHELL" "$0" ${1+"$@"} fi;; esac done done ;; esac # Create $as_me.lineno as a copy of $as_myself, but with $LINENO # uniformly replaced by the line number. The first 'sed' inserts a # line-number line before each line; the second 'sed' does the real # work. The second script uses 'N' to pair each line-number line # with the numbered line, and appends trailing '-' during # substitution so that $LINENO is not a special case at line end. # (Raja R Harinath suggested sed '=', and Paul Eggert wrote the # second 'sed' script. Blame Lee E. McMahon for sed's syntax. :-) sed '=' <$as_myself | sed ' N s,$,-, : loop s,^\(['$as_cr_digits']*\)\(.*\)[$]LINENO\([^'$as_cr_alnum'_]\),\1\2\1\3, t loop s,-$,, s,^['$as_cr_digits']*\n,, ' >$as_me.lineno && chmod +x $as_me.lineno || { echo "$as_me: error: cannot create $as_me.lineno; rerun with a POSIX shell" >&2 { (exit 1); exit 1; }; } # Don't try to exec as it changes $[0], causing all sort of problems # (the dirname of $[0] is not the place where we might find the # original and so on. Autoconf is especially sensible to this). . ./$as_me.lineno # Exit status is that of the last command. 
exit } case `echo "testing\c"; echo 1,2,3`,`echo -n testing; echo 1,2,3` in *c*,-n*) ECHO_N= ECHO_C=' ' ECHO_T=' ' ;; *c*,* ) ECHO_N=-n ECHO_C= ECHO_T= ;; *) ECHO_N= ECHO_C='\c' ECHO_T= ;; esac if expr a : '\(a\)' >/dev/null 2>&1; then as_expr=expr else as_expr=false fi rm -f conf$$ conf$$.exe conf$$.file echo >conf$$.file if ln -s conf$$.file conf$$ 2>/dev/null; then # We could just check for DJGPP; but this test a) works b) is more generic # and c) will remain valid once DJGPP supports symlinks (DJGPP 2.04). if test -f conf$$.exe; then # Don't use ln at all; we don't have any links as_ln_s='cp -p' else as_ln_s='ln -s' fi elif ln conf$$.file conf$$ 2>/dev/null; then as_ln_s=ln else as_ln_s='cp -p' fi rm -f conf$$ conf$$.exe conf$$.file if mkdir -p . 2>/dev/null; then as_mkdir_p=: else as_mkdir_p=false fi as_executable_p="test -f" # Sed expression to map a string onto a valid CPP name. as_tr_cpp="sed y%*$as_cr_letters%P$as_cr_LETTERS%;s%[^_$as_cr_alnum]%_%g" # Sed expression to map a string onto a valid variable name. as_tr_sh="sed y%*+%pp%;s%[^_$as_cr_alnum]%_%g" # IFS # We need space, tab and new line, in precisely that order. as_nl=' ' IFS=" $as_nl" # CDPATH. $as_unset CDPATH # Name of the host. # hostname on some systems (SVR3.2, Linux) returns a bogus exit status, # so uname gets run too. ac_hostname=`(hostname || uname -n) 2>/dev/null | sed 1q` exec 6>&1 # # Initializations. # ac_default_prefix=/usr/local ac_config_libobj_dir=. cross_compiling=no subdirs= MFLAGS= MAKEFLAGS= SHELL=${CONFIG_SHELL-/bin/sh} # Maximum number of lines to put in a shell here document. # This variable seems obsolete. It should probably be removed, and # only ac_max_sed_lines should be used. : ${ac_max_here_lines=38} # Identity of this package. PACKAGE_NAME= PACKAGE_TARNAME= PACKAGE_VERSION= PACKAGE_STRING= PACKAGE_BUGREPORT= ac_unique_file="get_filename.c" # Factoring default headers for most tests. 
ac_includes_default="\
#include <stdio.h>
#if HAVE_SYS_TYPES_H
# include <sys/types.h>
#endif
#if HAVE_SYS_STAT_H
# include <sys/stat.h>
#endif
#if STDC_HEADERS
# include <stdlib.h>
# include <stddef.h>
#else
# if HAVE_STDLIB_H
#  include <stdlib.h>
# endif
#endif
#if HAVE_STRING_H
# if !STDC_HEADERS && HAVE_MEMORY_H
#  include <memory.h>
# endif
# include <string.h>
#endif
#if HAVE_STRINGS_H
# include <strings.h>
#endif
#if HAVE_INTTYPES_H
# include <inttypes.h>
#else
# if HAVE_STDINT_H
#  include <stdint.h>
# endif
#endif
#if HAVE_UNISTD_H
# include <unistd.h>
#endif"

ac_subst_vars='SHELL PATH_SEPARATOR PACKAGE_NAME PACKAGE_TARNAME PACKAGE_VERSION PACKAGE_STRING PACKAGE_BUGREPORT exec_prefix prefix program_transform_name bindir sbindir libexecdir datadir sysconfdir sharedstatedir localstatedir libdir includedir oldincludedir infodir mandir build_alias host_alias target_alias DEFS ECHO_C ECHO_N ECHO_T LIBS CC CFLAGS LDFLAGS CPPFLAGS ac_ct_CC EXEEXT OBJEXT AR RANLIB ac_ct_RANLIB LN_S LEX LEXLIB LEX_OUTPUT_ROOT STRIP CP INSTALL_PROGRAM INSTALL_SCRIPT INSTALL_DATA CPP EGREP TARGET HAVE_STRDUP LEXFLAGS DYNFILTER_TARGET DYNFILTER_CFLAGS DYNFILTER LIBOBJS LTLIBOBJS'
ac_subst_files=''

# Initialize some variables set by options.
ac_init_help=
ac_init_version=false
# The variables have the same names as the options, with
# dashes changed to underlines.
cache_file=/dev/null
exec_prefix=NONE
no_create=
no_recursion=
prefix=NONE
program_prefix=NONE
program_suffix=NONE
program_transform_name=s,x,x,
silent=
site=
srcdir=
verbose=
x_includes=NONE
x_libraries=NONE

# Installation directory options.
# These are left unexpanded so users can "make install exec_prefix=/foo"
# and all the variables that are supposed to be based on exec_prefix
# by default will actually change.
# Use braces instead of parens because sh, perl, etc. also accept them.
bindir='${exec_prefix}/bin' sbindir='${exec_prefix}/sbin' libexecdir='${exec_prefix}/libexec' datadir='${prefix}/share' sysconfdir='${prefix}/etc' sharedstatedir='${prefix}/com' localstatedir='${prefix}/var' libdir='${exec_prefix}/lib' includedir='${prefix}/include' oldincludedir='/usr/include' infodir='${prefix}/info' mandir='${prefix}/man' ac_prev= for ac_option do # If the previous option needs an argument, assign it. if test -n "$ac_prev"; then eval "$ac_prev=\$ac_option" ac_prev= continue fi ac_optarg=`expr "x$ac_option" : 'x[^=]*=\(.*\)'` # Accept the important Cygnus configure options, so we can diagnose typos. case $ac_option in -bindir | --bindir | --bindi | --bind | --bin | --bi) ac_prev=bindir ;; -bindir=* | --bindir=* | --bindi=* | --bind=* | --bin=* | --bi=*) bindir=$ac_optarg ;; -build | --build | --buil | --bui | --bu) ac_prev=build_alias ;; -build=* | --build=* | --buil=* | --bui=* | --bu=*) build_alias=$ac_optarg ;; -cache-file | --cache-file | --cache-fil | --cache-fi \ | --cache-f | --cache- | --cache | --cach | --cac | --ca | --c) ac_prev=cache_file ;; -cache-file=* | --cache-file=* | --cache-fil=* | --cache-fi=* \ | --cache-f=* | --cache-=* | --cache=* | --cach=* | --cac=* | --ca=* | --c=*) cache_file=$ac_optarg ;; --config-cache | -C) cache_file=config.cache ;; -datadir | --datadir | --datadi | --datad | --data | --dat | --da) ac_prev=datadir ;; -datadir=* | --datadir=* | --datadi=* | --datad=* | --data=* | --dat=* \ | --da=*) datadir=$ac_optarg ;; -disable-* | --disable-*) ac_feature=`expr "x$ac_option" : 'x-*disable-\(.*\)'` # Reject names that are not valid shell variable names. expr "x$ac_feature" : ".*[^-_$as_cr_alnum]" >/dev/null && { echo "$as_me: error: invalid feature name: $ac_feature" >&2 { (exit 1); exit 1; }; } ac_feature=`echo $ac_feature | sed 's/-/_/g'` eval "enable_$ac_feature=no" ;; -enable-* | --enable-*) ac_feature=`expr "x$ac_option" : 'x-*enable-\([^=]*\)'` # Reject names that are not valid shell variable names. 
expr "x$ac_feature" : ".*[^-_$as_cr_alnum]" >/dev/null && { echo "$as_me: error: invalid feature name: $ac_feature" >&2 { (exit 1); exit 1; }; } ac_feature=`echo $ac_feature | sed 's/-/_/g'` case $ac_option in *=*) ac_optarg=`echo "$ac_optarg" | sed "s/'/'\\\\\\\\''/g"`;; *) ac_optarg=yes ;; esac eval "enable_$ac_feature='$ac_optarg'" ;; -exec-prefix | --exec_prefix | --exec-prefix | --exec-prefi \ | --exec-pref | --exec-pre | --exec-pr | --exec-p | --exec- \ | --exec | --exe | --ex) ac_prev=exec_prefix ;; -exec-prefix=* | --exec_prefix=* | --exec-prefix=* | --exec-prefi=* \ | --exec-pref=* | --exec-pre=* | --exec-pr=* | --exec-p=* | --exec-=* \ | --exec=* | --exe=* | --ex=*) exec_prefix=$ac_optarg ;; -gas | --gas | --ga | --g) # Obsolete; use --with-gas. with_gas=yes ;; -help | --help | --hel | --he | -h) ac_init_help=long ;; -help=r* | --help=r* | --hel=r* | --he=r* | -hr*) ac_init_help=recursive ;; -help=s* | --help=s* | --hel=s* | --he=s* | -hs*) ac_init_help=short ;; -host | --host | --hos | --ho) ac_prev=host_alias ;; -host=* | --host=* | --hos=* | --ho=*) host_alias=$ac_optarg ;; -includedir | --includedir | --includedi | --included | --include \ | --includ | --inclu | --incl | --inc) ac_prev=includedir ;; -includedir=* | --includedir=* | --includedi=* | --included=* | --include=* \ | --includ=* | --inclu=* | --incl=* | --inc=*) includedir=$ac_optarg ;; -infodir | --infodir | --infodi | --infod | --info | --inf) ac_prev=infodir ;; -infodir=* | --infodir=* | --infodi=* | --infod=* | --info=* | --inf=*) infodir=$ac_optarg ;; -libdir | --libdir | --libdi | --libd) ac_prev=libdir ;; -libdir=* | --libdir=* | --libdi=* | --libd=*) libdir=$ac_optarg ;; -libexecdir | --libexecdir | --libexecdi | --libexecd | --libexec \ | --libexe | --libex | --libe) ac_prev=libexecdir ;; -libexecdir=* | --libexecdir=* | --libexecdi=* | --libexecd=* | --libexec=* \ | --libexe=* | --libex=* | --libe=*) libexecdir=$ac_optarg ;; -localstatedir | --localstatedir | --localstatedi | 
--localstated \ | --localstate | --localstat | --localsta | --localst \ | --locals | --local | --loca | --loc | --lo) ac_prev=localstatedir ;; -localstatedir=* | --localstatedir=* | --localstatedi=* | --localstated=* \ | --localstate=* | --localstat=* | --localsta=* | --localst=* \ | --locals=* | --local=* | --loca=* | --loc=* | --lo=*) localstatedir=$ac_optarg ;; -mandir | --mandir | --mandi | --mand | --man | --ma | --m) ac_prev=mandir ;; -mandir=* | --mandir=* | --mandi=* | --mand=* | --man=* | --ma=* | --m=*) mandir=$ac_optarg ;; -nfp | --nfp | --nf) # Obsolete; use --without-fp. with_fp=no ;; -no-create | --no-create | --no-creat | --no-crea | --no-cre \ | --no-cr | --no-c | -n) no_create=yes ;; -no-recursion | --no-recursion | --no-recursio | --no-recursi \ | --no-recurs | --no-recur | --no-recu | --no-rec | --no-re | --no-r) no_recursion=yes ;; -oldincludedir | --oldincludedir | --oldincludedi | --oldincluded \ | --oldinclude | --oldinclud | --oldinclu | --oldincl | --oldinc \ | --oldin | --oldi | --old | --ol | --o) ac_prev=oldincludedir ;; -oldincludedir=* | --oldincludedir=* | --oldincludedi=* | --oldincluded=* \ | --oldinclude=* | --oldinclud=* | --oldinclu=* | --oldincl=* | --oldinc=* \ | --oldin=* | --oldi=* | --old=* | --ol=* | --o=*) oldincludedir=$ac_optarg ;; -prefix | --prefix | --prefi | --pref | --pre | --pr | --p) ac_prev=prefix ;; -prefix=* | --prefix=* | --prefi=* | --pref=* | --pre=* | --pr=* | --p=*) prefix=$ac_optarg ;; -program-prefix | --program-prefix | --program-prefi | --program-pref \ | --program-pre | --program-pr | --program-p) ac_prev=program_prefix ;; -program-prefix=* | --program-prefix=* | --program-prefi=* \ | --program-pref=* | --program-pre=* | --program-pr=* | --program-p=*) program_prefix=$ac_optarg ;; -program-suffix | --program-suffix | --program-suffi | --program-suff \ | --program-suf | --program-su | --program-s) ac_prev=program_suffix ;; -program-suffix=* | --program-suffix=* | --program-suffi=* \ | --program-suff=* 
| --program-suf=* | --program-su=* | --program-s=*) program_suffix=$ac_optarg ;; -program-transform-name | --program-transform-name \ | --program-transform-nam | --program-transform-na \ | --program-transform-n | --program-transform- \ | --program-transform | --program-transfor \ | --program-transfo | --program-transf \ | --program-trans | --program-tran \ | --progr-tra | --program-tr | --program-t) ac_prev=program_transform_name ;; -program-transform-name=* | --program-transform-name=* \ | --program-transform-nam=* | --program-transform-na=* \ | --program-transform-n=* | --program-transform-=* \ | --program-transform=* | --program-transfor=* \ | --program-transfo=* | --program-transf=* \ | --program-trans=* | --program-tran=* \ | --progr-tra=* | --program-tr=* | --program-t=*) program_transform_name=$ac_optarg ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil) silent=yes ;; -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb) ac_prev=sbindir ;; -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \ | --sbi=* | --sb=*) sbindir=$ac_optarg ;; -sharedstatedir | --sharedstatedir | --sharedstatedi \ | --sharedstated | --sharedstate | --sharedstat | --sharedsta \ | --sharedst | --shareds | --shared | --share | --shar \ | --sha | --sh) ac_prev=sharedstatedir ;; -sharedstatedir=* | --sharedstatedir=* | --sharedstatedi=* \ | --sharedstated=* | --sharedstate=* | --sharedstat=* | --sharedsta=* \ | --sharedst=* | --shareds=* | --shared=* | --share=* | --shar=* \ | --sha=* | --sh=*) sharedstatedir=$ac_optarg ;; -site | --site | --sit) ac_prev=site ;; -site=* | --site=* | --sit=*) site=$ac_optarg ;; -srcdir | --srcdir | --srcdi | --srcd | --src | --sr) ac_prev=srcdir ;; -srcdir=* | --srcdir=* | --srcdi=* | --srcd=* | --src=* | --sr=*) srcdir=$ac_optarg ;; -sysconfdir | --sysconfdir | --sysconfdi | --sysconfd | --sysconf \ | --syscon | --sysco | --sysc | --sys | --sy) ac_prev=sysconfdir ;; -sysconfdir=* 
| --sysconfdir=* | --sysconfdi=* | --sysconfd=* | --sysconf=* \ | --syscon=* | --sysco=* | --sysc=* | --sys=* | --sy=*) sysconfdir=$ac_optarg ;; -target | --target | --targe | --targ | --tar | --ta | --t) ac_prev=target_alias ;; -target=* | --target=* | --targe=* | --targ=* | --tar=* | --ta=* | --t=*) target_alias=$ac_optarg ;; -v | -verbose | --verbose | --verbos | --verbo | --verb) verbose=yes ;; -version | --version | --versio | --versi | --vers | -V) ac_init_version=: ;; -with-* | --with-*) ac_package=`expr "x$ac_option" : 'x-*with-\([^=]*\)'` # Reject names that are not valid shell variable names. expr "x$ac_package" : ".*[^-_$as_cr_alnum]" >/dev/null && { echo "$as_me: error: invalid package name: $ac_package" >&2 { (exit 1); exit 1; }; } ac_package=`echo $ac_package| sed 's/-/_/g'` case $ac_option in *=*) ac_optarg=`echo "$ac_optarg" | sed "s/'/'\\\\\\\\''/g"`;; *) ac_optarg=yes ;; esac eval "with_$ac_package='$ac_optarg'" ;; -without-* | --without-*) ac_package=`expr "x$ac_option" : 'x-*without-\(.*\)'` # Reject names that are not valid shell variable names. expr "x$ac_package" : ".*[^-_$as_cr_alnum]" >/dev/null && { echo "$as_me: error: invalid package name: $ac_package" >&2 { (exit 1); exit 1; }; } ac_package=`echo $ac_package | sed 's/-/_/g'` eval "with_$ac_package=no" ;; --x) # Obsolete; use --with-x. 
with_x=yes ;; -x-includes | --x-includes | --x-include | --x-includ | --x-inclu \ | --x-incl | --x-inc | --x-in | --x-i) ac_prev=x_includes ;; -x-includes=* | --x-includes=* | --x-include=* | --x-includ=* | --x-inclu=* \ | --x-incl=* | --x-inc=* | --x-in=* | --x-i=*) x_includes=$ac_optarg ;; -x-libraries | --x-libraries | --x-librarie | --x-librari \ | --x-librar | --x-libra | --x-libr | --x-lib | --x-li | --x-l) ac_prev=x_libraries ;; -x-libraries=* | --x-libraries=* | --x-librarie=* | --x-librari=* \ | --x-librar=* | --x-libra=* | --x-libr=* | --x-lib=* | --x-li=* | --x-l=*) x_libraries=$ac_optarg ;; -*) { echo "$as_me: error: unrecognized option: $ac_option Try \`$0 --help' for more information." >&2 { (exit 1); exit 1; }; } ;; *=*) ac_envvar=`expr "x$ac_option" : 'x\([^=]*\)='` # Reject names that are not valid shell variable names. expr "x$ac_envvar" : ".*[^_$as_cr_alnum]" >/dev/null && { echo "$as_me: error: invalid variable name: $ac_envvar" >&2 { (exit 1); exit 1; }; } ac_optarg=`echo "$ac_optarg" | sed "s/'/'\\\\\\\\''/g"` eval "$ac_envvar='$ac_optarg'" export $ac_envvar ;; *) # FIXME: should be removed in autoconf 3.0. echo "$as_me: WARNING: you should use --build, --host, --target" >&2 expr "x$ac_option" : ".*[^-._$as_cr_alnum]" >/dev/null && echo "$as_me: WARNING: invalid host type: $ac_option" >&2 : ${build_alias=$ac_option} ${host_alias=$ac_option} ${target_alias=$ac_option} ;; esac done if test -n "$ac_prev"; then ac_option=--`echo $ac_prev | sed 's/_/-/g'` { echo "$as_me: error: missing argument to $ac_option" >&2 { (exit 1); exit 1; }; } fi # Be sure to have absolute paths. for ac_var in exec_prefix prefix do eval ac_val=$`echo $ac_var` case $ac_val in [\\/$]* | ?:[\\/]* | NONE | '' ) ;; *) { echo "$as_me: error: expected an absolute directory name for --$ac_var: $ac_val" >&2 { (exit 1); exit 1; }; };; esac done # Be sure to have absolute paths. 
for ac_var in bindir sbindir libexecdir datadir sysconfdir sharedstatedir \ localstatedir libdir includedir oldincludedir infodir mandir do eval ac_val=$`echo $ac_var` case $ac_val in [\\/$]* | ?:[\\/]* ) ;; *) { echo "$as_me: error: expected an absolute directory name for --$ac_var: $ac_val" >&2 { (exit 1); exit 1; }; };; esac done # There might be people who depend on the old broken behavior: `$host' # used to hold the argument of --host etc. # FIXME: To remove some day. build=$build_alias host=$host_alias target=$target_alias # FIXME: To remove some day. if test "x$host_alias" != x; then if test "x$build_alias" = x; then cross_compiling=maybe echo "$as_me: WARNING: If you wanted to set the --build type, don't use --host. If a cross compiler is detected then cross compile mode will be used." >&2 elif test "x$build_alias" != "x$host_alias"; then cross_compiling=yes fi fi ac_tool_prefix= test -n "$host_alias" && ac_tool_prefix=$host_alias- test "$silent" = yes && exec 6>/dev/null # Find the source files, if location was not specified. if test -z "$srcdir"; then ac_srcdir_defaulted=yes # Try the directory containing this script, then its parent. ac_confdir=`(dirname "$0") 2>/dev/null || $as_expr X"$0" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$0" : 'X\(//\)[^/]' \| \ X"$0" : 'X\(//\)$' \| \ X"$0" : 'X\(/\)' \| \ . : '\(.\)' 2>/dev/null || echo X"$0" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/; q; } /^X\(\/\/\)[^/].*/{ s//\1/; q; } /^X\(\/\/\)$/{ s//\1/; q; } /^X\(\/\).*/{ s//\1/; q; } s/.*/./; q'` srcdir=$ac_confdir if test ! -r $srcdir/$ac_unique_file; then srcdir=.. fi else ac_srcdir_defaulted=no fi if test ! -r $srcdir/$ac_unique_file; then if test "$ac_srcdir_defaulted" = yes; then { echo "$as_me: error: cannot find sources ($ac_unique_file) in $ac_confdir or .." 
>&2 { (exit 1); exit 1; }; } else { echo "$as_me: error: cannot find sources ($ac_unique_file) in $srcdir" >&2 { (exit 1); exit 1; }; } fi fi (cd $srcdir && test -r ./$ac_unique_file) 2>/dev/null || { echo "$as_me: error: sources are in $srcdir, but \`cd $srcdir' does not work" >&2 { (exit 1); exit 1; }; } srcdir=`echo "$srcdir" | sed 's%\([^\\/]\)[\\/]*$%\1%'` ac_env_build_alias_set=${build_alias+set} ac_env_build_alias_value=$build_alias ac_cv_env_build_alias_set=${build_alias+set} ac_cv_env_build_alias_value=$build_alias ac_env_host_alias_set=${host_alias+set} ac_env_host_alias_value=$host_alias ac_cv_env_host_alias_set=${host_alias+set} ac_cv_env_host_alias_value=$host_alias ac_env_target_alias_set=${target_alias+set} ac_env_target_alias_value=$target_alias ac_cv_env_target_alias_set=${target_alias+set} ac_cv_env_target_alias_value=$target_alias ac_env_CC_set=${CC+set} ac_env_CC_value=$CC ac_cv_env_CC_set=${CC+set} ac_cv_env_CC_value=$CC ac_env_CFLAGS_set=${CFLAGS+set} ac_env_CFLAGS_value=$CFLAGS ac_cv_env_CFLAGS_set=${CFLAGS+set} ac_cv_env_CFLAGS_value=$CFLAGS ac_env_LDFLAGS_set=${LDFLAGS+set} ac_env_LDFLAGS_value=$LDFLAGS ac_cv_env_LDFLAGS_set=${LDFLAGS+set} ac_cv_env_LDFLAGS_value=$LDFLAGS ac_env_CPPFLAGS_set=${CPPFLAGS+set} ac_env_CPPFLAGS_value=$CPPFLAGS ac_cv_env_CPPFLAGS_set=${CPPFLAGS+set} ac_cv_env_CPPFLAGS_value=$CPPFLAGS ac_env_CPP_set=${CPP+set} ac_env_CPP_value=$CPP ac_cv_env_CPP_set=${CPP+set} ac_cv_env_CPP_value=$CPP # # Report the --help message. # if test "$ac_init_help" = "long"; then # Omit some internal or obsolete options to make the list less imposing. # This message is too long to be a string in the A/UX 3.1 sh. cat <<_ACEOF \`configure' configures this package to adapt to many kinds of systems. Usage: $0 [OPTION]... [VAR=VALUE]... To assign environment variables (e.g., CC, CFLAGS...), specify them as VAR=VALUE. See below for descriptions of some of the useful variables. Defaults for the options are specified in brackets. 
Configuration: -h, --help display this help and exit --help=short display options specific to this package --help=recursive display the short help of all the included packages -V, --version display version information and exit -q, --quiet, --silent do not print \`checking...' messages --cache-file=FILE cache test results in FILE [disabled] -C, --config-cache alias for \`--cache-file=config.cache' -n, --no-create do not create output files --srcdir=DIR find the sources in DIR [configure dir or \`..'] _ACEOF cat <<_ACEOF Installation directories: --prefix=PREFIX install architecture-independent files in PREFIX [$ac_default_prefix] --exec-prefix=EPREFIX install architecture-dependent files in EPREFIX [PREFIX] By default, \`make install' will install all the files in \`$ac_default_prefix/bin', \`$ac_default_prefix/lib' etc. You can specify an installation prefix other than \`$ac_default_prefix' using \`--prefix', for instance \`--prefix=\$HOME'. For better control, use the options below. Fine tuning of the installation directories: --bindir=DIR user executables [EPREFIX/bin] --sbindir=DIR system admin executables [EPREFIX/sbin] --libexecdir=DIR program executables [EPREFIX/libexec] --datadir=DIR read-only architecture-independent data [PREFIX/share] --sysconfdir=DIR read-only single-machine data [PREFIX/etc] --sharedstatedir=DIR modifiable architecture-independent data [PREFIX/com] --localstatedir=DIR modifiable single-machine data [PREFIX/var] --libdir=DIR object code libraries [EPREFIX/lib] --includedir=DIR C header files [PREFIX/include] --oldincludedir=DIR C header files for non-gcc [/usr/include] --infodir=DIR info documentation [PREFIX/info] --mandir=DIR man documentation [PREFIX/man] _ACEOF cat <<\_ACEOF _ACEOF fi if test -n "$ac_init_help"; then cat <<\_ACEOF Optional Features: --disable-FEATURE do not include FEATURE (same as --enable-FEATURE=no) --enable-FEATURE[=ARG] include FEATURE [ARG=yes] --enable-structured-queries enable structured queries 
--disable-iso-charset disable iso charset (may be slightly faster if you don't care about upper-ascii characters) --enable-sfs-compat Support SFS compatibility --enable-pointer blah --enable-measure-times blah --enable-warnings Add -Wall to CFLAGS --enable-strip Strip binaries Optional Packages: --with-PACKAGE[=ARG] use PACKAGE [ARG=yes] --without-PACKAGE do not use PACKAGE (same as --with-PACKAGE=no) --with-file-end-mark=CHAR use character CHAR as filename delimiter ' ' most often set to '\t' in order to index filenames with spaces; must match Webglimpse setting in lib/wgHeader.pm. Some influential environment variables: CC C compiler command CFLAGS C compiler flags LDFLAGS linker flags, e.g. -L if you have libraries in a nonstandard directory CPPFLAGS C/C++ preprocessor flags, e.g. -I if you have headers in a nonstandard directory CPP C preprocessor Use these variables to override the choices made by `configure' or to help it to find libraries and programs with nonstandard names/locations. _ACEOF fi if test "$ac_init_help" = "recursive"; then # If there are subdirs, report their specific --help. ac_popdir=`pwd` for ac_dir in : $ac_subdirs_all; do test "x$ac_dir" = x: && continue test -d $ac_dir || continue ac_builddir=. if test "$ac_dir" != .; then ac_dir_suffix=/`echo "$ac_dir" | sed 's,^\.[\\/],,'` # A "../" for each directory in $ac_dir_suffix. ac_top_builddir=`echo "$ac_dir_suffix" | sed 's,/[^\\/]*,../,g'` else ac_dir_suffix= ac_top_builddir= fi case $srcdir in .) # No --srcdir option. We are building in place. ac_srcdir=. if test -z "$ac_top_builddir"; then ac_top_srcdir=. else ac_top_srcdir=`echo $ac_top_builddir | sed 's,/$,,'` fi ;; [\\/]* | ?:[\\/]* ) # Absolute path. ac_srcdir=$srcdir$ac_dir_suffix; ac_top_srcdir=$srcdir ;; *) # Relative path. ac_srcdir=$ac_top_builddir$srcdir$ac_dir_suffix ac_top_srcdir=$ac_top_builddir$srcdir ;; esac # Don't blindly perform a `cd "$ac_dir"/$ac_foo && pwd` since $ac_foo can be # absolute. 
ac_abs_builddir=`cd "$ac_dir" && cd $ac_builddir && pwd` ac_abs_top_builddir=`cd "$ac_dir" && cd ${ac_top_builddir}. && pwd` ac_abs_srcdir=`cd "$ac_dir" && cd $ac_srcdir && pwd` ac_abs_top_srcdir=`cd "$ac_dir" && cd $ac_top_srcdir && pwd` cd $ac_dir # Check for guested configure; otherwise get Cygnus style configure. if test -f $ac_srcdir/configure.gnu; then echo $SHELL $ac_srcdir/configure.gnu --help=recursive elif test -f $ac_srcdir/configure; then echo $SHELL $ac_srcdir/configure --help=recursive elif test -f $ac_srcdir/configure.ac || test -f $ac_srcdir/configure.in; then echo $ac_configure --help else echo "$as_me: WARNING: no configuration information is in $ac_dir" >&2 fi cd $ac_popdir done fi test -n "$ac_init_help" && exit 0 if $ac_init_version; then cat <<\_ACEOF Copyright 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001, 2002 Free Software Foundation, Inc. This configure script is free software; the Free Software Foundation gives unlimited permission to copy, distribute and modify it. _ACEOF exit 0 fi exec 5>config.log cat >&5 <<_ACEOF This file contains any messages produced by compilers while running configure, to aid debugging if configure makes a mistake. It was created by $as_me, which was generated by GNU Autoconf 2.57. Invocation command line was $ $0 $@ _ACEOF { cat <<_ASUNAME ## --------- ## ## Platform. 
## ## --------- ## hostname = `(hostname || uname -n) 2>/dev/null | sed 1q` uname -m = `(uname -m) 2>/dev/null || echo unknown` uname -r = `(uname -r) 2>/dev/null || echo unknown` uname -s = `(uname -s) 2>/dev/null || echo unknown` uname -v = `(uname -v) 2>/dev/null || echo unknown` /usr/bin/uname -p = `(/usr/bin/uname -p) 2>/dev/null || echo unknown` /bin/uname -X = `(/bin/uname -X) 2>/dev/null || echo unknown` /bin/arch = `(/bin/arch) 2>/dev/null || echo unknown` /usr/bin/arch -k = `(/usr/bin/arch -k) 2>/dev/null || echo unknown` /usr/convex/getsysinfo = `(/usr/convex/getsysinfo) 2>/dev/null || echo unknown` hostinfo = `(hostinfo) 2>/dev/null || echo unknown` /bin/machine = `(/bin/machine) 2>/dev/null || echo unknown` /usr/bin/oslevel = `(/usr/bin/oslevel) 2>/dev/null || echo unknown` /bin/universe = `(/bin/universe) 2>/dev/null || echo unknown` _ASUNAME as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. echo "PATH: $as_dir" done } >&5 cat >&5 <<_ACEOF ## ----------- ## ## Core tests. ## ## ----------- ## _ACEOF # Keep a trace of the command line. # Strip out --no-create and --no-recursion so they do not pile up. # Strip out --silent because we don't want to record it for future runs. # Also quote any args containing shell meta-characters. # Make two passes to allow for proper duplicate-argument suppression. 
ac_configure_args= ac_configure_args0= ac_configure_args1= ac_sep= ac_must_keep_next=false for ac_pass in 1 2 do for ac_arg do case $ac_arg in -no-create | --no-c* | -n | -no-recursion | --no-r*) continue ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil) continue ;; *" "*|*" "*|*[\[\]\~\#\$\^\&\*\(\)\{\}\\\|\;\<\>\?\"\']*) ac_arg=`echo "$ac_arg" | sed "s/'/'\\\\\\\\''/g"` ;; esac case $ac_pass in 1) ac_configure_args0="$ac_configure_args0 '$ac_arg'" ;; 2) ac_configure_args1="$ac_configure_args1 '$ac_arg'" if test $ac_must_keep_next = true; then ac_must_keep_next=false # Got value, back to normal. else case $ac_arg in *=* | --config-cache | -C | -disable-* | --disable-* \ | -enable-* | --enable-* | -gas | --g* | -nfp | --nf* \ | -q | -quiet | --q* | -silent | --sil* | -v | -verb* \ | -with-* | --with-* | -without-* | --without-* | --x) case "$ac_configure_args0 " in "$ac_configure_args1"*" '$ac_arg' "* ) continue ;; esac ;; -* ) ac_must_keep_next=true ;; esac fi ac_configure_args="$ac_configure_args$ac_sep'$ac_arg'" # Get rid of the leading space. ac_sep=" " ;; esac done done $as_unset ac_configure_args0 || test "${ac_configure_args0+set}" != set || { ac_configure_args0=; export ac_configure_args0; } $as_unset ac_configure_args1 || test "${ac_configure_args1+set}" != set || { ac_configure_args1=; export ac_configure_args1; } # When interrupted or exit'd, cleanup temporary files, and complete # config.log. We remove comments because anyway the quotes in there # would cause problems or look ugly. # WARNING: Be sure not to use single quotes in there, as some shells, # such as our DU 5.0 friend, will then `close' the trap. trap 'exit_status=$? # Save into config.log some information that might help in debugging. { echo cat <<\_ASBOX ## ---------------- ## ## Cache variables. 
## ## ---------------- ## _ASBOX echo # The following way of writing the cache mishandles newlines in values, { (set) 2>&1 | case `(ac_space='"'"' '"'"'; set | grep ac_space) 2>&1` in *ac_space=\ *) sed -n \ "s/'"'"'/'"'"'\\\\'"'"''"'"'/g; s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1='"'"'\\2'"'"'/p" ;; *) sed -n \ "s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1=\\2/p" ;; esac; } echo cat <<\_ASBOX ## ----------------- ## ## Output variables. ## ## ----------------- ## _ASBOX echo for ac_var in $ac_subst_vars do eval ac_val=$`echo $ac_var` echo "$ac_var='"'"'$ac_val'"'"'" done | sort echo if test -n "$ac_subst_files"; then cat <<\_ASBOX ## ------------- ## ## Output files. ## ## ------------- ## _ASBOX echo for ac_var in $ac_subst_files do eval ac_val=$`echo $ac_var` echo "$ac_var='"'"'$ac_val'"'"'" done | sort echo fi if test -s confdefs.h; then cat <<\_ASBOX ## ----------- ## ## confdefs.h. ## ## ----------- ## _ASBOX echo sed "/^$/d" confdefs.h | sort echo fi test "$ac_signal" != 0 && echo "$as_me: caught signal $ac_signal" echo "$as_me: exit $exit_status" } >&5 rm -f core core.* *.core && rm -rf conftest* confdefs* conf$$* $ac_clean_files && exit $exit_status ' 0 for ac_signal in 1 2 13 15; do trap 'ac_signal='$ac_signal'; { (exit 1); exit 1; }' $ac_signal done ac_signal=0 # confdefs.h avoids OS command line length limits that DEFS can exceed. rm -rf conftest* confdefs.h # AIX cpp loses on an empty file, so make sure it contains at least a newline. echo >confdefs.h # Predefined preprocessor variables. cat >>confdefs.h <<_ACEOF #define PACKAGE_NAME "$PACKAGE_NAME" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_TARNAME "$PACKAGE_TARNAME" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_VERSION "$PACKAGE_VERSION" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_STRING "$PACKAGE_STRING" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_BUGREPORT "$PACKAGE_BUGREPORT" _ACEOF # Let the site file select an alternate cache file if it wants to. 
# Prefer explicitly selected file to automatically selected ones. if test -z "$CONFIG_SITE"; then if test "x$prefix" != xNONE; then CONFIG_SITE="$prefix/share/config.site $prefix/etc/config.site" else CONFIG_SITE="$ac_default_prefix/share/config.site $ac_default_prefix/etc/config.site" fi fi for ac_site_file in $CONFIG_SITE; do if test -r "$ac_site_file"; then { echo "$as_me:$LINENO: loading site script $ac_site_file" >&5 echo "$as_me: loading site script $ac_site_file" >&6;} sed 's/^/| /' "$ac_site_file" >&5 . "$ac_site_file" fi done if test -r "$cache_file"; then # Some versions of bash will fail to source /dev/null (special # files actually), so we avoid doing that. if test -f "$cache_file"; then { echo "$as_me:$LINENO: loading cache $cache_file" >&5 echo "$as_me: loading cache $cache_file" >&6;} case $cache_file in [\\/]* | ?:[\\/]* ) . $cache_file;; *) . ./$cache_file;; esac fi else { echo "$as_me:$LINENO: creating cache $cache_file" >&5 echo "$as_me: creating cache $cache_file" >&6;} >$cache_file fi # Check that the precious variables saved in the cache have kept the same # value. 
ac_cache_corrupted=false for ac_var in `(set) 2>&1 | sed -n 's/^ac_env_\([a-zA-Z_0-9]*\)_set=.*/\1/p'`; do eval ac_old_set=\$ac_cv_env_${ac_var}_set eval ac_new_set=\$ac_env_${ac_var}_set eval ac_old_val="\$ac_cv_env_${ac_var}_value" eval ac_new_val="\$ac_env_${ac_var}_value" case $ac_old_set,$ac_new_set in set,) { echo "$as_me:$LINENO: error: \`$ac_var' was set to \`$ac_old_val' in the previous run" >&5 echo "$as_me: error: \`$ac_var' was set to \`$ac_old_val' in the previous run" >&2;} ac_cache_corrupted=: ;; ,set) { echo "$as_me:$LINENO: error: \`$ac_var' was not set in the previous run" >&5 echo "$as_me: error: \`$ac_var' was not set in the previous run" >&2;} ac_cache_corrupted=: ;; ,);; *) if test "x$ac_old_val" != "x$ac_new_val"; then { echo "$as_me:$LINENO: error: \`$ac_var' has changed since the previous run:" >&5 echo "$as_me: error: \`$ac_var' has changed since the previous run:" >&2;} { echo "$as_me:$LINENO: former value: $ac_old_val" >&5 echo "$as_me: former value: $ac_old_val" >&2;} { echo "$as_me:$LINENO: current value: $ac_new_val" >&5 echo "$as_me: current value: $ac_new_val" >&2;} ac_cache_corrupted=: fi;; esac # Pass precious variables to config.status. if test "$ac_new_set" = set; then case $ac_new_val in *" "*|*" "*|*[\[\]\~\#\$\^\&\*\(\)\{\}\\\|\;\<\>\?\"\']*) ac_arg=$ac_var=`echo "$ac_new_val" | sed "s/'/'\\\\\\\\''/g"` ;; *) ac_arg=$ac_var=$ac_new_val ;; esac case " $ac_configure_args " in *" '$ac_arg' "*) ;; # Avoid dups. Use of quotes ensures accuracy. 
*) ac_configure_args="$ac_configure_args '$ac_arg'" ;; esac fi done if $ac_cache_corrupted; then { echo "$as_me:$LINENO: error: changes in the environment can compromise the build" >&5 echo "$as_me: error: changes in the environment can compromise the build" >&2;} { { echo "$as_me:$LINENO: error: run \`make distclean' and/or \`rm $cache_file' and start over" >&5 echo "$as_me: error: run \`make distclean' and/or \`rm $cache_file' and start over" >&2;} { (exit 1); exit 1; }; } fi ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu ac_config_headers="$ac_config_headers libtemplate/include/autoconf.h" ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}gcc", so it can be a program name with args. set dummy ${ac_tool_prefix}gcc; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_CC+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_CC="${ac_tool_prefix}gcc" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then echo "$as_me:$LINENO: result: $CC" >&5 echo "${ECHO_T}$CC" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi fi if test -z "$ac_cv_prog_CC"; then ac_ct_CC=$CC # Extract the first word of "gcc", so it can be a program name with args. set dummy gcc; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_ac_ct_CC+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$ac_ct_CC"; then ac_cv_prog_ac_ct_CC="$ac_ct_CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_ac_ct_CC="gcc" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi ac_ct_CC=$ac_cv_prog_ac_ct_CC if test -n "$ac_ct_CC"; then echo "$as_me:$LINENO: result: $ac_ct_CC" >&5 echo "${ECHO_T}$ac_ct_CC" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi CC=$ac_ct_CC else CC="$ac_cv_prog_CC" fi if test -z "$CC"; then if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}cc", so it can be a program name with args. set dummy ${ac_tool_prefix}cc; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_CC+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_CC="${ac_tool_prefix}cc" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then echo "$as_me:$LINENO: result: $CC" >&5 echo "${ECHO_T}$CC" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi fi if test -z "$ac_cv_prog_CC"; then ac_ct_CC=$CC # Extract the first word of "cc", so it can be a program name with args. set dummy cc; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_ac_ct_CC+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$ac_ct_CC"; then ac_cv_prog_ac_ct_CC="$ac_ct_CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_ac_ct_CC="cc" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi ac_ct_CC=$ac_cv_prog_ac_ct_CC if test -n "$ac_ct_CC"; then echo "$as_me:$LINENO: result: $ac_ct_CC" >&5 echo "${ECHO_T}$ac_ct_CC" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi CC=$ac_ct_CC else CC="$ac_cv_prog_CC" fi fi if test -z "$CC"; then # Extract the first word of "cc", so it can be a program name with args. set dummy cc; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_CC+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else ac_prog_rejected=no as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then if test "$as_dir/$ac_word$ac_exec_ext" = "/usr/ucb/cc"; then ac_prog_rejected=yes continue fi ac_cv_prog_CC="cc" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done if test $ac_prog_rejected = yes; then # We found a bogon in the path, so make sure we never use it. set dummy $ac_cv_prog_CC shift if test $# != 0; then # We chose a different compiler from the bogus one. # However, it has the same basename, so the bogon will be chosen # first if we set CC to just the basename; use the full file name. shift ac_cv_prog_CC="$as_dir/$ac_word${1+' '}$@" fi fi fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then echo "$as_me:$LINENO: result: $CC" >&5 echo "${ECHO_T}$CC" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi fi if test -z "$CC"; then if test -n "$ac_tool_prefix"; then for ac_prog in cl do # Extract the first word of "$ac_tool_prefix$ac_prog", so it can be a program name with args. set dummy $ac_tool_prefix$ac_prog; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_CC+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_CC="$ac_tool_prefix$ac_prog" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then echo "$as_me:$LINENO: result: $CC" >&5 echo "${ECHO_T}$CC" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi test -n "$CC" && break done fi if test -z "$CC"; then ac_ct_CC=$CC for ac_prog in cl do # Extract the first word of "$ac_prog", so it can be a program name with args. set dummy $ac_prog; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_ac_ct_CC+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$ac_ct_CC"; then ac_cv_prog_ac_ct_CC="$ac_ct_CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_ac_ct_CC="$ac_prog" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi ac_ct_CC=$ac_cv_prog_ac_ct_CC if test -n "$ac_ct_CC"; then echo "$as_me:$LINENO: result: $ac_ct_CC" >&5 echo "${ECHO_T}$ac_ct_CC" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi test -n "$ac_ct_CC" && break done CC=$ac_ct_CC fi fi test -z "$CC" && { { echo "$as_me:$LINENO: error: no acceptable C compiler found in \$PATH See \`config.log' for more details." >&5 echo "$as_me: error: no acceptable C compiler found in \$PATH See \`config.log' for more details." >&2;} { (exit 1); exit 1; }; } # Provide some information about the compiler. 
echo "$as_me:$LINENO:" \ "checking for C compiler version" >&5 ac_compiler=`set X $ac_compile; echo $2` { (eval echo "$as_me:$LINENO: \"$ac_compiler --version </dev/null >&5\"") >&5 (eval $ac_compiler --version </dev/null >&5) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } { (eval echo "$as_me:$LINENO: \"$ac_compiler -v </dev/null >&5\"") >&5 (eval $ac_compiler -v </dev/null >&5) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } { (eval echo "$as_me:$LINENO: \"$ac_compiler -V </dev/null >&5\"") >&5 (eval $ac_compiler -V </dev/null >&5) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { ; return 0; } _ACEOF ac_clean_files_save=$ac_clean_files ac_clean_files="$ac_clean_files a.out a.exe b.out" # Try to create an executable without -o first, disregard a.out. # It will help us diagnose broken compilers, and finding out an intuition # of exeext. echo "$as_me:$LINENO: checking for C compiler default output" >&5 echo $ECHO_N "checking for C compiler default output... $ECHO_C" >&6 ac_link_default=`echo "$ac_link" | sed 's/ -o *conftest[^ ]*//'` if { (eval echo "$as_me:$LINENO: \"$ac_link_default\"") >&5 (eval $ac_link_default) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; then # Find the output, starting from the most likely. This scheme is # not robust to junk in `.', hence go to wildcards (a.*) only as a last # resort. # Be careful to initialize this variable, since it used to be cached. # Otherwise an old cache value of `no' led to `EXEEXT = no' in a Makefile. ac_cv_exeext= # b.out is created by i960 compilers. 
for ac_file in a_out.exe a.exe conftest.exe a.out conftest a.* conftest.* b.out do test -f "$ac_file" || continue case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.o | *.obj ) ;; conftest.$ac_ext ) # This is the source file. ;; [ab].out ) # We found the default executable, but exeext='' is most # certainly right. break;; *.* ) ac_cv_exeext=`expr "$ac_file" : '[^.]*\(\..*\)'` # FIXME: I believe we export ac_cv_exeext for Libtool, # but it would be cool to find out if it's true. Does anybody # maintain Libtool? --akim. export ac_cv_exeext break;; * ) break;; esac done else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 { { echo "$as_me:$LINENO: error: C compiler cannot create executables See \`config.log' for more details." >&5 echo "$as_me: error: C compiler cannot create executables See \`config.log' for more details." >&2;} { (exit 77); exit 77; }; } fi ac_exeext=$ac_cv_exeext echo "$as_me:$LINENO: result: $ac_file" >&5 echo "${ECHO_T}$ac_file" >&6 # Check the compiler produces executables we can run. If not, either # the compiler is broken, or we cross compile. echo "$as_me:$LINENO: checking whether the C compiler works" >&5 echo $ECHO_N "checking whether the C compiler works... $ECHO_C" >&6 # FIXME: These cross compiler hacks should be removed for Autoconf 3.0 # If not cross compiling, check that we can run a simple program. if test "$cross_compiling" != yes; then if { ac_try='./$ac_file' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then cross_compiling=no else if test "$cross_compiling" = maybe; then cross_compiling=yes else { { echo "$as_me:$LINENO: error: cannot run C compiled programs. If you meant to cross compile, use \`--host'. See \`config.log' for more details." >&5 echo "$as_me: error: cannot run C compiled programs. If you meant to cross compile, use \`--host'. 
See \`config.log' for more details." >&2;} { (exit 1); exit 1; }; } fi fi fi echo "$as_me:$LINENO: result: yes" >&5 echo "${ECHO_T}yes" >&6 rm -f a.out a.exe conftest$ac_cv_exeext b.out ac_clean_files=$ac_clean_files_save # Check the compiler produces executables we can run. If not, either # the compiler is broken, or we cross compile. echo "$as_me:$LINENO: checking whether we are cross compiling" >&5 echo $ECHO_N "checking whether we are cross compiling... $ECHO_C" >&6 echo "$as_me:$LINENO: result: $cross_compiling" >&5 echo "${ECHO_T}$cross_compiling" >&6 echo "$as_me:$LINENO: checking for suffix of executables" >&5 echo $ECHO_N "checking for suffix of executables... $ECHO_C" >&6 if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; then # If both `conftest.exe' and `conftest' are `present' (well, observable) # catch `conftest.exe'. For instance with Cygwin, `ls conftest' will # work properly (i.e., refer to `conftest.exe'), while it won't with # `rm'. for ac_file in conftest.exe conftest conftest.*; do test -f "$ac_file" || continue case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.o | *.obj ) ;; *.* ) ac_cv_exeext=`expr "$ac_file" : '[^.]*\(\..*\)'` export ac_cv_exeext break;; * ) break;; esac done else { { echo "$as_me:$LINENO: error: cannot compute suffix of executables: cannot compile and link See \`config.log' for more details." >&5 echo "$as_me: error: cannot compute suffix of executables: cannot compile and link See \`config.log' for more details." >&2;} { (exit 1); exit 1; }; } fi rm -f conftest$ac_cv_exeext echo "$as_me:$LINENO: result: $ac_cv_exeext" >&5 echo "${ECHO_T}$ac_cv_exeext" >&6 rm -f conftest.$ac_ext EXEEXT=$ac_cv_exeext ac_exeext=$EXEEXT echo "$as_me:$LINENO: checking for suffix of object files" >&5 echo $ECHO_N "checking for suffix of object files... 
$ECHO_C" >&6 if test "${ac_cv_objext+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { ; return 0; } _ACEOF rm -f conftest.o conftest.obj if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; then for ac_file in `(ls conftest.o conftest.obj; ls conftest.*) 2>/dev/null`; do case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg ) ;; *) ac_cv_objext=`expr "$ac_file" : '.*\.\(.*\)'` break;; esac done else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 { { echo "$as_me:$LINENO: error: cannot compute suffix of object files: cannot compile See \`config.log' for more details." >&5 echo "$as_me: error: cannot compute suffix of object files: cannot compile See \`config.log' for more details." >&2;} { (exit 1); exit 1; }; } fi rm -f conftest.$ac_cv_objext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_objext" >&5 echo "${ECHO_T}$ac_cv_objext" >&6 OBJEXT=$ac_cv_objext ac_objext=$OBJEXT echo "$as_me:$LINENO: checking whether we are using the GNU C compiler" >&5 echo $ECHO_N "checking whether we are using the GNU C compiler... $ECHO_C" >&6 if test "${ac_cv_c_compiler_gnu+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { #ifndef __GNUC__ choke me #endif ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_compiler_gnu=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_compiler_gnu=no fi rm -f conftest.$ac_objext conftest.$ac_ext ac_cv_c_compiler_gnu=$ac_compiler_gnu fi echo "$as_me:$LINENO: result: $ac_cv_c_compiler_gnu" >&5 echo "${ECHO_T}$ac_cv_c_compiler_gnu" >&6 GCC=`test $ac_compiler_gnu = yes && echo yes` ac_test_CFLAGS=${CFLAGS+set} ac_save_CFLAGS=$CFLAGS CFLAGS="-g" echo "$as_me:$LINENO: checking whether $CC accepts -g" >&5 echo $ECHO_N "checking whether $CC accepts -g... $ECHO_C" >&6 if test "${ac_cv_prog_cc_g+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_prog_cc_g=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_prog_cc_g=no fi rm -f conftest.$ac_objext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_prog_cc_g" >&5 echo "${ECHO_T}$ac_cv_prog_cc_g" >&6 if test "$ac_test_CFLAGS" = set; then CFLAGS=$ac_save_CFLAGS elif test $ac_cv_prog_cc_g = yes; then if test "$GCC" = yes; then CFLAGS="-g -O2" else CFLAGS="-g" fi else if test "$GCC" = yes; then CFLAGS="-O2" else CFLAGS= fi fi echo "$as_me:$LINENO: checking for $CC option to accept ANSI C" >&5 echo $ECHO_N "checking for $CC option to accept ANSI C... $ECHO_C" >&6 if test "${ac_cv_prog_cc_stdc+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_cv_prog_cc_stdc=no ac_save_CC=$CC cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <stdarg.h> #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> /* Most of the following tests are stolen from RCS 5.7's src/conf.sh. */ struct buf { int x; }; FILE * (*rcsopen) (struct buf *, struct stat *, int); static char *e (p, i) char **p; int i; { return p[i]; } static char *f (char * (*g) (char **, int), char **p, ...) { char *s; va_list v; va_start (v,p); s = g (p, va_arg (v,int)); va_end (v); return s; } int test (int i, double x); struct s1 {int (*f) (int a);}; struct s2 {int (*f) (double a);}; int pairnames (int, char **, FILE *(*)(struct buf *, struct stat *, int), int, int); int argc; char **argv; int main () { return f (e, argv, 0) != argv[0] || f (e, argv, 1) != argv[1]; ; return 0; } _ACEOF # Don't try gcc -ansi; that turns off useful extensions and # breaks some systems' header files.
# AIX -qlanglvl=ansi # Ultrix and OSF/1 -std1 # HP-UX 10.20 and later -Ae # HP-UX older versions -Aa -D_HPUX_SOURCE # SVR4 -Xc -D__EXTENSIONS__ for ac_arg in "" -qlanglvl=ansi -std1 -Ae "-Aa -D_HPUX_SOURCE" "-Xc -D__EXTENSIONS__" do CC="$ac_save_CC $ac_arg" rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_prog_cc_stdc=$ac_arg break else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext done rm -f conftest.$ac_ext conftest.$ac_objext CC=$ac_save_CC fi case "x$ac_cv_prog_cc_stdc" in x|xno) echo "$as_me:$LINENO: result: none needed" >&5 echo "${ECHO_T}none needed" >&6 ;; *) echo "$as_me:$LINENO: result: $ac_cv_prog_cc_stdc" >&5 echo "${ECHO_T}$ac_cv_prog_cc_stdc" >&6 CC="$CC $ac_cv_prog_cc_stdc" ;; esac # Some people use a C++ compiler to compile C. Since we use `exit', # in C++ we need to declare it. In case someone uses the same compiler # for both compiling C and C++ we need to have the C++ compiler decide # the declaration of exit, since it's the most demanding environment. cat >conftest.$ac_ext <<_ACEOF #ifndef __cplusplus choke me #endif _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then for ac_declaration in \ ''\ '#include <stdlib.h>' \ 'extern "C" void std::exit (int) throw (); using std::exit;' \ 'extern "C" void std::exit (int); using std::exit;' \ 'extern "C" void exit (int) throw ();' \ 'extern "C" void exit (int);' \ 'void exit (int);' do cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <stdlib.h> $ac_declaration int main () { exit (42); ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then : else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 continue fi rm -f conftest.$ac_objext conftest.$ac_ext cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ $ac_declaration int main () { exit (42); ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$?
= $ac_status" >&5 (exit $ac_status); }; }; then break else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext conftest.$ac_ext done rm -f conftest* if test -n "$ac_declaration"; then echo '#ifdef __cplusplus' >>confdefs.h echo $ac_declaration >>confdefs.h echo '#endif' >>confdefs.h fi else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext conftest.$ac_ext ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu # Extract the first word of "ar", so it can be a program name with args. set dummy ar; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_path_AR+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else case $AR in [\\/]* | ?:[\\/]*) ac_cv_path_AR="$AR" # Let the user override the test with a path. ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_path_AR="$as_dir/$ac_word$ac_exec_ext" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done test -z "$ac_cv_path_AR" && ac_cv_path_AR="ar" ;; esac fi AR=$ac_cv_path_AR if test -n "$AR"; then echo "$as_me:$LINENO: result: $AR" >&5 echo "${ECHO_T}$AR" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi if test -z "$AR" ; then # Extract the first word of "ar", so it can be a program name with args. set dummy ar; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... 
$ECHO_C" >&6 if test "${ac_cv_path_AR+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else case $AR in [\\/]* | ?:[\\/]*) ac_cv_path_AR="$AR" # Let the user override the test with a path. ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in /usr/ccs/bin/ar do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_path_AR="$as_dir/$ac_word$ac_exec_ext" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done ;; esac fi AR=$ac_cv_path_AR if test -n "$AR"; then echo "$as_me:$LINENO: result: $AR" >&5 echo "${ECHO_T}$AR" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi if test -z "$AR" ; then { { echo "$as_me:$LINENO: error: no acceptable ar found in \$PATH:/usr/ccs/bin/ar" >&5 echo "$as_me: error: no acceptable ar found in \$PATH:/usr/ccs/bin/ar" >&2;} { (exit 1); exit 1; }; } fi fi if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}ranlib", so it can be a program name with args. set dummy ${ac_tool_prefix}ranlib; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_RANLIB+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$RANLIB"; then ac_cv_prog_RANLIB="$RANLIB" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_RANLIB="${ac_tool_prefix}ranlib" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi RANLIB=$ac_cv_prog_RANLIB if test -n "$RANLIB"; then echo "$as_me:$LINENO: result: $RANLIB" >&5 echo "${ECHO_T}$RANLIB" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi fi if test -z "$ac_cv_prog_RANLIB"; then ac_ct_RANLIB=$RANLIB # Extract the first word of "ranlib", so it can be a program name with args. set dummy ranlib; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_ac_ct_RANLIB+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$ac_ct_RANLIB"; then ac_cv_prog_ac_ct_RANLIB="$ac_ct_RANLIB" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_ac_ct_RANLIB="ranlib" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done test -z "$ac_cv_prog_ac_ct_RANLIB" && ac_cv_prog_ac_ct_RANLIB=":" fi fi ac_ct_RANLIB=$ac_cv_prog_ac_ct_RANLIB if test -n "$ac_ct_RANLIB"; then echo "$as_me:$LINENO: result: $ac_ct_RANLIB" >&5 echo "${ECHO_T}$ac_ct_RANLIB" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi RANLIB=$ac_ct_RANLIB else RANLIB="$ac_cv_prog_RANLIB" fi echo "$as_me:$LINENO: checking whether ln -s works" >&5 echo $ECHO_N "checking whether ln -s works... 
$ECHO_C" >&6 LN_S=$as_ln_s if test "$LN_S" = "ln -s"; then echo "$as_me:$LINENO: result: yes" >&5 echo "${ECHO_T}yes" >&6 else echo "$as_me:$LINENO: result: no, using $LN_S" >&5 echo "${ECHO_T}no, using $LN_S" >&6 fi for ac_prog in flex lex do # Extract the first word of "$ac_prog", so it can be a program name with args. set dummy $ac_prog; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_LEX+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$LEX"; then ac_cv_prog_LEX="$LEX" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_LEX="$ac_prog" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi LEX=$ac_cv_prog_LEX if test -n "$LEX"; then echo "$as_me:$LINENO: result: $LEX" >&5 echo "${ECHO_T}$LEX" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi test -n "$LEX" && break done test -n "$LEX" || LEX=":" if test -z "$LEXLIB" then echo "$as_me:$LINENO: checking for yywrap in -lfl" >&5 echo $ECHO_N "checking for yywrap in -lfl... $ECHO_C" >&6 if test "${ac_cv_lib_fl_yywrap+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lfl $LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. 
*/ char yywrap (); int main () { yywrap (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_lib_fl_yywrap=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_lib_fl_yywrap=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_lib_fl_yywrap" >&5 echo "${ECHO_T}$ac_cv_lib_fl_yywrap" >&6 if test $ac_cv_lib_fl_yywrap = yes; then LEXLIB="-lfl" else echo "$as_me:$LINENO: checking for yywrap in -ll" >&5 echo $ECHO_N "checking for yywrap in -ll... $ECHO_C" >&6 if test "${ac_cv_lib_l_yywrap+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-ll $LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char yywrap (); int main () { yywrap (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_lib_l_yywrap=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_lib_l_yywrap=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_lib_l_yywrap" >&5 echo "${ECHO_T}$ac_cv_lib_l_yywrap" >&6 if test $ac_cv_lib_l_yywrap = yes; then LEXLIB="-ll" fi fi fi if test "x$LEX" != "x:"; then echo "$as_me:$LINENO: checking lex output file root" >&5 echo $ECHO_N "checking lex output file root... $ECHO_C" >&6 if test "${ac_cv_prog_lex_root+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else # The minimal lex program is just a single line: %%. But some broken lexes # (Solaris, I think it was) want two %% lines, so accommodate them. cat >conftest.l <<_ACEOF %% %% _ACEOF { (eval echo "$as_me:$LINENO: \"$LEX conftest.l\"") >&5 (eval $LEX conftest.l) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } if test -f lex.yy.c; then ac_cv_prog_lex_root=lex.yy elif test -f lexyy.c; then ac_cv_prog_lex_root=lexyy else { { echo "$as_me:$LINENO: error: cannot find output from $LEX; giving up" >&5 echo "$as_me: error: cannot find output from $LEX; giving up" >&2;} { (exit 1); exit 1; }; } fi fi echo "$as_me:$LINENO: result: $ac_cv_prog_lex_root" >&5 echo "${ECHO_T}$ac_cv_prog_lex_root" >&6 rm -f conftest.l LEX_OUTPUT_ROOT=$ac_cv_prog_lex_root echo "$as_me:$LINENO: checking whether yytext is a pointer" >&5 echo $ECHO_N "checking whether yytext is a pointer... $ECHO_C" >&6 if test "${ac_cv_prog_lex_yytext_pointer+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else # POSIX says lex can declare yytext either as a pointer or an array; the # default is implementation-dependent. Figure out which it is, since # not all implementations provide the %pointer and %array declarations. 
ac_cv_prog_lex_yytext_pointer=no echo 'extern char *yytext;' >>$LEX_OUTPUT_ROOT.c ac_save_LIBS=$LIBS LIBS="$LIBS $LEXLIB" cat >conftest.$ac_ext <<_ACEOF `cat $LEX_OUTPUT_ROOT.c` _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_prog_lex_yytext_pointer=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_save_LIBS rm -f "${LEX_OUTPUT_ROOT}.c" fi echo "$as_me:$LINENO: result: $ac_cv_prog_lex_yytext_pointer" >&5 echo "${ECHO_T}$ac_cv_prog_lex_yytext_pointer" >&6 if test $ac_cv_prog_lex_yytext_pointer = yes; then cat >>confdefs.h <<\_ACEOF #define YYTEXT_POINTER 1 _ACEOF fi fi if test "x$LEX" = "xflex" ; then DYNFILTER_TARGET=htuml2txt.so LEXFLAGS="-F -8" else DYNFILTER_TARGET=htuml2txt LEXFLAGS= fi if test "$ac_cv_c_compiler_gnu" = "yes" ; then DYNFILTER_CFLAGS="-O3 -fomit-frame-pointer" else DYNFILTER_CFLAGS="-O" fi if test "`uname`" = "Linux" ; then DYNFILTER=dynfilters else DYNFILTER= fi # Extract the first word of "strip", so it can be a program name with args. set dummy strip; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_path_STRIP+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else case $STRIP in [\\/]* | ?:[\\/]*) ac_cv_path_STRIP="$STRIP" # Let the user override the test with a path. ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_path_STRIP="$as_dir/$ac_word$ac_exec_ext" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done test -z "$ac_cv_path_STRIP" && ac_cv_path_STRIP="strip" ;; esac fi STRIP=$ac_cv_path_STRIP if test -n "$STRIP"; then echo "$as_me:$LINENO: result: $STRIP" >&5 echo "${ECHO_T}$STRIP" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi # Extract the first word of "cp", so it can be a program name with args. set dummy cp; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_path_CP+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else case $CP in [\\/]* | ?:[\\/]*) ac_cv_path_CP="$CP" # Let the user override the test with a path. ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_path_CP="$as_dir/$ac_word$ac_exec_ext" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done test -z "$ac_cv_path_CP" && ac_cv_path_CP="cp" ;; esac fi CP=$ac_cv_path_CP if test -n "$CP"; then echo "$as_me:$LINENO: result: $CP" >&5 echo "${ECHO_T}$CP" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi if test -z "$CP" ; then { { echo "$as_me:$LINENO: error: no cp found in \$PATH, something weird is going on." >&5 echo "$as_me: error: no cp found in \$PATH, something weird is going on." >&2;} { (exit 1); exit 1; }; } fi ac_aux_dir= for ac_dir in $srcdir $srcdir/.. 
$srcdir/../..; do if test -f $ac_dir/install-sh; then ac_aux_dir=$ac_dir ac_install_sh="$ac_aux_dir/install-sh -c" break elif test -f $ac_dir/install.sh; then ac_aux_dir=$ac_dir ac_install_sh="$ac_aux_dir/install.sh -c" break elif test -f $ac_dir/shtool; then ac_aux_dir=$ac_dir ac_install_sh="$ac_aux_dir/shtool install -c" break fi done if test -z "$ac_aux_dir"; then { { echo "$as_me:$LINENO: error: cannot find install-sh or install.sh in $srcdir $srcdir/.. $srcdir/../.." >&5 echo "$as_me: error: cannot find install-sh or install.sh in $srcdir $srcdir/.. $srcdir/../.." >&2;} { (exit 1); exit 1; }; } fi ac_config_guess="$SHELL $ac_aux_dir/config.guess" ac_config_sub="$SHELL $ac_aux_dir/config.sub" ac_configure="$SHELL $ac_aux_dir/configure" # This should be Cygnus configure. # Find a good install program. We prefer a C program (faster), # so one script is as good as another. But avoid the broken or # incompatible versions: # SysV /etc/install, /usr/sbin/install # SunOS /usr/etc/install # IRIX /sbin/install # AIX /bin/install # AmigaOS /C/install, which installs bootblocks on floppy discs # AIX 4 /usr/bin/installbsd, which doesn't work without a -g flag # AFS /usr/afsws/bin/install, which mishandles nonexistent args # SVR4 /usr/ucb/install, which tries to use the nonexistent group "staff" # ./install, which can be erroneously created by make from ./install.sh. echo "$as_me:$LINENO: checking for a BSD-compatible install" >&5 echo $ECHO_N "checking for a BSD-compatible install... $ECHO_C" >&6 if test -z "$INSTALL"; then if test "${ac_cv_path_install+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. # Account for people who put trailing slashes in PATH elements. case $as_dir/ in ./ | .// | /cC/* | \ /etc/* | /usr/sbin/* | /usr/etc/* | /sbin/* | /usr/afsws/bin/* | \ /usr/ucb/* ) ;; *) # OSF1 and SCO ODT 3.0 have their own names for install. 
# Don't use installbsd from OSF since it installs stuff as root # by default. for ac_prog in ginstall scoinst install; do for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_prog$ac_exec_ext"; then if test $ac_prog = install && grep dspmsg "$as_dir/$ac_prog$ac_exec_ext" >/dev/null 2>&1; then # AIX install. It has an incompatible calling convention. : elif test $ac_prog = install && grep pwplus "$as_dir/$ac_prog$ac_exec_ext" >/dev/null 2>&1; then # program-specific install script used by HP pwplus--don't use. : else ac_cv_path_install="$as_dir/$ac_prog$ac_exec_ext -c" break 3 fi fi done done ;; esac done fi if test "${ac_cv_path_install+set}" = set; then INSTALL=$ac_cv_path_install else # As a last resort, use the slow shell script. We don't cache a # path for INSTALL within a source directory, because that will # break other packages using the cache if that directory is # removed, or if the path is relative. INSTALL=$ac_install_sh fi fi echo "$as_me:$LINENO: result: $INSTALL" >&5 echo "${ECHO_T}$INSTALL" >&6 # Use test -z because SunOS4 sh mishandles braces in ${var-val}. # It thinks the first close brace ends the variable substitution. test -z "$INSTALL_PROGRAM" && INSTALL_PROGRAM='${INSTALL}' test -z "$INSTALL_SCRIPT" && INSTALL_SCRIPT='${INSTALL}' test -z "$INSTALL_DATA" && INSTALL_DATA='${INSTALL} -m 644' ac_header_dirent=no for ac_hdr in dirent.h sys/ndir.h sys/dir.h ndir.h; do as_ac_Header=`echo "ac_cv_header_dirent_$ac_hdr" | $as_tr_sh` echo "$as_me:$LINENO: checking for $ac_hdr that defines DIR" >&5 echo $ECHO_N "checking for $ac_hdr that defines DIR... $ECHO_C" >&6 if eval "test \"\${$as_ac_Header+set}\" = set"; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. 
*/ #include <sys/types.h> #include <$ac_hdr> int main () { if ((DIR *) 0) return 0; ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then eval "$as_ac_Header=yes" else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 eval "$as_ac_Header=no" fi rm -f conftest.$ac_objext conftest.$ac_ext fi echo "$as_me:$LINENO: result: `eval echo '${'$as_ac_Header'}'`" >&5 echo "${ECHO_T}`eval echo '${'$as_ac_Header'}'`" >&6 if test `eval echo '${'$as_ac_Header'}'` = yes; then cat >>confdefs.h <<_ACEOF #define `echo "HAVE_$ac_hdr" | $as_tr_cpp` 1 _ACEOF ac_header_dirent=$ac_hdr; break fi done # Two versions of opendir et al. are in -ldir and -lx on SCO Xenix. if test $ac_header_dirent = dirent.h; then echo "$as_me:$LINENO: checking for library containing opendir" >&5 echo $ECHO_N "checking for library containing opendir... $ECHO_C" >&6 if test "${ac_cv_search_opendir+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_func_search_save_LIBS=$LIBS ac_cv_search_opendir=no cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char opendir (); int main () { opendir (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$?
= $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_search_opendir="none required" else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext if test "$ac_cv_search_opendir" = no; then for ac_lib in dir; do LIBS="-l$ac_lib $ac_func_search_save_LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char opendir (); int main () { opendir (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_search_opendir="-l$ac_lib" break else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext done fi LIBS=$ac_func_search_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_search_opendir" >&5 echo "${ECHO_T}$ac_cv_search_opendir" >&6 if test "$ac_cv_search_opendir" != no; then test "$ac_cv_search_opendir" = "none required" || LIBS="$ac_cv_search_opendir $LIBS" fi else echo "$as_me:$LINENO: checking for library containing opendir" >&5 echo $ECHO_N "checking for library containing opendir... 
$ECHO_C" >&6 if test "${ac_cv_search_opendir+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_func_search_save_LIBS=$LIBS ac_cv_search_opendir=no cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char opendir (); int main () { opendir (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_search_opendir="none required" else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext if test "$ac_cv_search_opendir" = no; then for ac_lib in x; do LIBS="-l$ac_lib $ac_func_search_save_LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char opendir (); int main () { opendir (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_search_opendir="-l$ac_lib" break else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext done fi LIBS=$ac_func_search_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_search_opendir" >&5 echo "${ECHO_T}$ac_cv_search_opendir" >&6 if test "$ac_cv_search_opendir" != no; then test "$ac_cv_search_opendir" = "none required" || LIBS="$ac_cv_search_opendir $LIBS" fi fi #Contribution by VaX#n8 ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu echo "$as_me:$LINENO: checking how to run the C preprocessor" >&5 echo $ECHO_N "checking how to run the C preprocessor... $ECHO_C" >&6 # On Suns, sometimes $CPP names a directory. if test -n "$CPP" && test -d "$CPP"; then CPP= fi if test -z "$CPP"; then if test "${ac_cv_prog_CPP+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else # Double quotes because CPP needs to be expanded for CPP in "$CC -E" "$CC -E -traditional-cpp" "/lib/cpp" do ac_preproc_ok=false for ac_c_preproc_warn_flag in '' yes do # Use a header file that comes with gcc, so configuring glibc # with a fresh cross-compiler works. # Prefer <limits.h> to <assert.h> if __STDC__ is defined, since # <limits.h> exists even on freestanding compilers. # On the NeXT, cc -E runs the code through the compiler's parser, # not just through cpp. "Syntax error" is here to catch this case. cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. 
*/ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif Syntax error _ACEOF if { (eval echo "$as_me:$LINENO: \"$ac_cpp conftest.$ac_ext\"") >&5 (eval $ac_cpp conftest.$ac_ext) 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null; then if test -s conftest.err; then ac_cpp_err=$ac_c_preproc_warn_flag else ac_cpp_err= fi else ac_cpp_err=yes fi if test -z "$ac_cpp_err"; then : else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 # Broken: fails on valid input. continue fi rm -f conftest.err conftest.$ac_ext # OK, works on sane cases. Now check whether non-existent headers # can be detected and how. cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <ac_nonexistent.h> _ACEOF if { (eval echo "$as_me:$LINENO: \"$ac_cpp conftest.$ac_ext\"") >&5 (eval $ac_cpp conftest.$ac_ext) 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null; then if test -s conftest.err; then ac_cpp_err=$ac_c_preproc_warn_flag else ac_cpp_err= fi else ac_cpp_err=yes fi if test -z "$ac_cpp_err"; then # Broken: success on invalid input. continue else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 # Passes both tests. ac_preproc_ok=: break fi rm -f conftest.err conftest.$ac_ext done # Because of `break', _AC_PREPROC_IFELSE's cleaning code was skipped. 
rm -f conftest.err conftest.$ac_ext if $ac_preproc_ok; then break fi done ac_cv_prog_CPP=$CPP fi CPP=$ac_cv_prog_CPP else ac_cv_prog_CPP=$CPP fi echo "$as_me:$LINENO: result: $CPP" >&5 echo "${ECHO_T}$CPP" >&6 ac_preproc_ok=false for ac_c_preproc_warn_flag in '' yes do # Use a header file that comes with gcc, so configuring glibc # with a fresh cross-compiler works. # Prefer <limits.h> to <assert.h> if __STDC__ is defined, since # <limits.h> exists even on freestanding compilers. # On the NeXT, cc -E runs the code through the compiler's parser, # not just through cpp. "Syntax error" is here to catch this case. cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif Syntax error _ACEOF if { (eval echo "$as_me:$LINENO: \"$ac_cpp conftest.$ac_ext\"") >&5 (eval $ac_cpp conftest.$ac_ext) 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null; then if test -s conftest.err; then ac_cpp_err=$ac_c_preproc_warn_flag else ac_cpp_err= fi else ac_cpp_err=yes fi if test -z "$ac_cpp_err"; then : else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 # Broken: fails on valid input. continue fi rm -f conftest.err conftest.$ac_ext # OK, works on sane cases. Now check whether non-existent headers # can be detected and how. cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <ac_nonexistent.h> _ACEOF if { (eval echo "$as_me:$LINENO: \"$ac_cpp conftest.$ac_ext\"") >&5 (eval $ac_cpp conftest.$ac_ext) 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); } >/dev/null; then if test -s conftest.err; then ac_cpp_err=$ac_c_preproc_warn_flag else ac_cpp_err= fi else ac_cpp_err=yes fi if test -z "$ac_cpp_err"; then # Broken: success on invalid input. continue else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 # Passes both tests. ac_preproc_ok=: break fi rm -f conftest.err conftest.$ac_ext done # Because of `break', _AC_PREPROC_IFELSE's cleaning code was skipped. rm -f conftest.err conftest.$ac_ext if $ac_preproc_ok; then : else { { echo "$as_me:$LINENO: error: C preprocessor \"$CPP\" fails sanity check See \`config.log' for more details." >&5 echo "$as_me: error: C preprocessor \"$CPP\" fails sanity check See \`config.log' for more details." >&2;} { (exit 1); exit 1; }; } fi ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu echo "$as_me:$LINENO: checking for egrep" >&5 echo $ECHO_N "checking for egrep... $ECHO_C" >&6 if test "${ac_cv_prog_egrep+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if echo a | (grep -E '(a|b)') >/dev/null 2>&1 then ac_cv_prog_egrep='grep -E' else ac_cv_prog_egrep='egrep' fi fi echo "$as_me:$LINENO: result: $ac_cv_prog_egrep" >&5 echo "${ECHO_T}$ac_cv_prog_egrep" >&6 EGREP=$ac_cv_prog_egrep echo "$as_me:$LINENO: checking for ANSI C header files" >&5 echo $ECHO_N "checking for ANSI C header files... $ECHO_C" >&6 if test "${ac_cv_header_stdc+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. 
*/ #include <stdlib.h> #include <stdarg.h> #include <string.h> #include <float.h> int main () { ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? 
echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_header_stdc=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_header_stdc=no fi rm -f conftest.$ac_objext conftest.$ac_ext if test $ac_cv_header_stdc = yes; then # SunOS 4.x string.h does not declare mem*, contrary to ANSI. cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <string.h> _ACEOF if (eval "$ac_cpp conftest.$ac_ext") 2>&5 | $EGREP "memchr" >/dev/null 2>&1; then : else ac_cv_header_stdc=no fi rm -f conftest* fi if test $ac_cv_header_stdc = yes; then # ISC 2.0.2 stdlib.h does not declare free, contrary to ANSI. cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <stdlib.h> _ACEOF if (eval "$ac_cpp conftest.$ac_ext") 2>&5 | $EGREP "free" >/dev/null 2>&1; then : else ac_cv_header_stdc=no fi rm -f conftest* fi if test $ac_cv_header_stdc = yes; then # /bin/cc in Irix-4.0.5 gets non-ANSI ctype macros unless using -ansi. if test "$cross_compiling" = yes; then : else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <ctype.h> #if ((' ' & 0x0FF) == 0x020) # define ISLOWER(c) ('a' <= (c) && (c) <= 'z') # define TOUPPER(c) (ISLOWER(c) ? 'A' + ((c) - 'a') : (c)) #else # define ISLOWER(c) \ (('a' <= (c) && (c) <= 'i') \ || ('j' <= (c) && (c) <= 'r') \ || ('s' <= (c) && (c) <= 'z')) # define TOUPPER(c) (ISLOWER(c) ? 
((c) | 0x40) : (c)) #endif #define XOR(e, f) (((e) && !(f)) || (!(e) && (f))) int main () { int i; for (i = 0; i < 256; i++) if (XOR (islower (i), ISLOWER (i)) || toupper (i) != TOUPPER (i)) exit(2); exit (0); } _ACEOF rm -f conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='./conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then : else echo "$as_me: program exited with status $ac_status" >&5 echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ( exit $ac_status ) ac_cv_header_stdc=no fi rm -f core core.* *.core gmon.out bb.out conftest$ac_exeext conftest.$ac_objext conftest.$ac_ext fi fi fi echo "$as_me:$LINENO: result: $ac_cv_header_stdc" >&5 echo "${ECHO_T}$ac_cv_header_stdc" >&6 if test $ac_cv_header_stdc = yes; then cat >>confdefs.h <<\_ACEOF #define STDC_HEADERS 1 _ACEOF fi # On IRIX 5.3, sys/types and inttypes.h are conflicting. for ac_header in sys/types.h sys/stat.h stdlib.h string.h memory.h strings.h \ inttypes.h stdint.h unistd.h do as_ac_Header=`echo "ac_cv_header_$ac_header" | $as_tr_sh` echo "$as_me:$LINENO: checking for $ac_header" >&5 echo $ECHO_N "checking for $ac_header... $ECHO_C" >&6 if eval "test \"\${$as_ac_Header+set}\" = set"; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ $ac_includes_default #include <$ac_header> _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then eval "$as_ac_Header=yes" else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 eval "$as_ac_Header=no" fi rm -f conftest.$ac_objext conftest.$ac_ext fi echo "$as_me:$LINENO: result: `eval echo '${'$as_ac_Header'}'`" >&5 echo "${ECHO_T}`eval echo '${'$as_ac_Header'}'`" >&6 if test `eval echo '${'$as_ac_Header'}'` = yes; then cat >>confdefs.h <<_ACEOF #define `echo "HAVE_$ac_header" | $as_tr_cpp` 1 _ACEOF fi done for ac_header in fcntl.h sys/file.h sys/time.h unistd.h sys/select.h do as_ac_Header=`echo "ac_cv_header_$ac_header" | $as_tr_sh` if eval "test \"\${$as_ac_Header+set}\" = set"; then echo "$as_me:$LINENO: checking for $ac_header" >&5 echo $ECHO_N "checking for $ac_header... $ECHO_C" >&6 if eval "test \"\${$as_ac_Header+set}\" = set"; then echo $ECHO_N "(cached) $ECHO_C" >&6 fi echo "$as_me:$LINENO: result: `eval echo '${'$as_ac_Header'}'`" >&5 echo "${ECHO_T}`eval echo '${'$as_ac_Header'}'`" >&6 else # Is the header compilable? echo "$as_me:$LINENO: checking $ac_header usability" >&5 echo $ECHO_N "checking $ac_header usability... $ECHO_C" >&6 cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ $ac_includes_default #include <$ac_header> _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_header_compiler=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_header_compiler=no fi rm -f conftest.$ac_objext conftest.$ac_ext echo "$as_me:$LINENO: result: $ac_header_compiler" >&5 echo "${ECHO_T}$ac_header_compiler" >&6 # Is the header present? echo "$as_me:$LINENO: checking $ac_header presence" >&5 echo $ECHO_N "checking $ac_header presence... $ECHO_C" >&6 cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <$ac_header> _ACEOF if { (eval echo "$as_me:$LINENO: \"$ac_cpp conftest.$ac_ext\"") >&5 (eval $ac_cpp conftest.$ac_ext) 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null; then if test -s conftest.err; then ac_cpp_err=$ac_c_preproc_warn_flag else ac_cpp_err= fi else ac_cpp_err=yes fi if test -z "$ac_cpp_err"; then ac_header_preproc=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_header_preproc=no fi rm -f conftest.err conftest.$ac_ext echo "$as_me:$LINENO: result: $ac_header_preproc" >&5 echo "${ECHO_T}$ac_header_preproc" >&6 # So? What about this header? case $ac_header_compiler:$ac_header_preproc in yes:no ) { echo "$as_me:$LINENO: WARNING: $ac_header: accepted by the compiler, rejected by the preprocessor!" >&5 echo "$as_me: WARNING: $ac_header: accepted by the compiler, rejected by the preprocessor!" >&2;} { echo "$as_me:$LINENO: WARNING: $ac_header: proceeding with the preprocessor's result" >&5 echo "$as_me: WARNING: $ac_header: proceeding with the preprocessor's result" >&2;} ( cat <<\_ASBOX ## ------------------------------------ ## ## Report this to bug-autoconf@gnu.org. 
## ## ------------------------------------ ## _ASBOX ) | sed "s/^/$as_me: WARNING: /" >&2 ;; no:yes ) { echo "$as_me:$LINENO: WARNING: $ac_header: present but cannot be compiled" >&5 echo "$as_me: WARNING: $ac_header: present but cannot be compiled" >&2;} { echo "$as_me:$LINENO: WARNING: $ac_header: check for missing prerequisite headers?" >&5 echo "$as_me: WARNING: $ac_header: check for missing prerequisite headers?" >&2;} { echo "$as_me:$LINENO: WARNING: $ac_header: proceeding with the preprocessor's result" >&5 echo "$as_me: WARNING: $ac_header: proceeding with the preprocessor's result" >&2;} ( cat <<\_ASBOX ## ------------------------------------ ## ## Report this to bug-autoconf@gnu.org. ## ## ------------------------------------ ## _ASBOX ) | sed "s/^/$as_me: WARNING: /" >&2 ;; esac echo "$as_me:$LINENO: checking for $ac_header" >&5 echo $ECHO_N "checking for $ac_header... $ECHO_C" >&6 if eval "test \"\${$as_ac_Header+set}\" = set"; then echo $ECHO_N "(cached) $ECHO_C" >&6 else eval "$as_ac_Header=$ac_header_preproc" fi echo "$as_me:$LINENO: result: `eval echo '${'$as_ac_Header'}'`" >&5 echo "${ECHO_T}`eval echo '${'$as_ac_Header'}'`" >&6 fi if test `eval echo '${'$as_ac_Header'}'` = yes; then cat >>confdefs.h <<_ACEOF #define `echo "HAVE_$ac_header" | $as_tr_cpp` 1 _ACEOF fi done for ac_header in sys/dir.h sys/ndir.h strerr.h do as_ac_Header=`echo "ac_cv_header_$ac_header" | $as_tr_sh` if eval "test \"\${$as_ac_Header+set}\" = set"; then echo "$as_me:$LINENO: checking for $ac_header" >&5 echo $ECHO_N "checking for $ac_header... $ECHO_C" >&6 if eval "test \"\${$as_ac_Header+set}\" = set"; then echo $ECHO_N "(cached) $ECHO_C" >&6 fi echo "$as_me:$LINENO: result: `eval echo '${'$as_ac_Header'}'`" >&5 echo "${ECHO_T}`eval echo '${'$as_ac_Header'}'`" >&6 else # Is the header compilable? echo "$as_me:$LINENO: checking $ac_header usability" >&5 echo $ECHO_N "checking $ac_header usability... 
$ECHO_C" >&6 cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ $ac_includes_default #include <$ac_header> _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_header_compiler=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_header_compiler=no fi rm -f conftest.$ac_objext conftest.$ac_ext echo "$as_me:$LINENO: result: $ac_header_compiler" >&5 echo "${ECHO_T}$ac_header_compiler" >&6 # Is the header present? echo "$as_me:$LINENO: checking $ac_header presence" >&5 echo $ECHO_N "checking $ac_header presence... $ECHO_C" >&6 cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <$ac_header> _ACEOF if { (eval echo "$as_me:$LINENO: \"$ac_cpp conftest.$ac_ext\"") >&5 (eval $ac_cpp conftest.$ac_ext) 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null; then if test -s conftest.err; then ac_cpp_err=$ac_c_preproc_warn_flag else ac_cpp_err= fi else ac_cpp_err=yes fi if test -z "$ac_cpp_err"; then ac_header_preproc=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_header_preproc=no fi rm -f conftest.err conftest.$ac_ext echo "$as_me:$LINENO: result: $ac_header_preproc" >&5 echo "${ECHO_T}$ac_header_preproc" >&6 # So? What about this header? 
case $ac_header_compiler:$ac_header_preproc in yes:no ) { echo "$as_me:$LINENO: WARNING: $ac_header: accepted by the compiler, rejected by the preprocessor!" >&5 echo "$as_me: WARNING: $ac_header: accepted by the compiler, rejected by the preprocessor!" >&2;} { echo "$as_me:$LINENO: WARNING: $ac_header: proceeding with the preprocessor's result" >&5 echo "$as_me: WARNING: $ac_header: proceeding with the preprocessor's result" >&2;} ( cat <<\_ASBOX ## ------------------------------------ ## ## Report this to bug-autoconf@gnu.org. ## ## ------------------------------------ ## _ASBOX ) | sed "s/^/$as_me: WARNING: /" >&2 ;; no:yes ) { echo "$as_me:$LINENO: WARNING: $ac_header: present but cannot be compiled" >&5 echo "$as_me: WARNING: $ac_header: present but cannot be compiled" >&2;} { echo "$as_me:$LINENO: WARNING: $ac_header: check for missing prerequisite headers?" >&5 echo "$as_me: WARNING: $ac_header: check for missing prerequisite headers?" >&2;} { echo "$as_me:$LINENO: WARNING: $ac_header: proceeding with the preprocessor's result" >&5 echo "$as_me: WARNING: $ac_header: proceeding with the preprocessor's result" >&2;} ( cat <<\_ASBOX ## ------------------------------------ ## ## Report this to bug-autoconf@gnu.org. ## ## ------------------------------------ ## _ASBOX ) | sed "s/^/$as_me: WARNING: /" >&2 ;; esac echo "$as_me:$LINENO: checking for $ac_header" >&5 echo $ECHO_N "checking for $ac_header... 
$ECHO_C" >&6 if eval "test \"\${$as_ac_Header+set}\" = set"; then echo $ECHO_N "(cached) $ECHO_C" >&6 else eval "$as_ac_Header=$ac_header_preproc" fi echo "$as_me:$LINENO: result: `eval echo '${'$as_ac_Header'}'`" >&5 echo "${ECHO_T}`eval echo '${'$as_ac_Header'}'`" >&6 fi if test `eval echo '${'$as_ac_Header'}'` = yes; then cat >>confdefs.h <<_ACEOF #define `echo "HAVE_$ac_header" | $as_tr_cpp` 1 _ACEOF fi done echo "$as_me:$LINENO: checking whether time.h and sys/time.h may both be included" >&5 echo $ECHO_N "checking whether time.h and sys/time.h may both be included... $ECHO_C" >&6 if test "${ac_cv_header_time+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <sys/types.h> #include <sys/time.h> #include <time.h> int main () { if ((struct tm *) 0) return 0; ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_header_time=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_header_time=no fi rm -f conftest.$ac_objext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_header_time" >&5 echo "${ECHO_T}$ac_cv_header_time" >&6 if test $ac_cv_header_time = yes; then cat >>confdefs.h <<\_ACEOF #define TIME_WITH_SYS_TIME 1 _ACEOF fi echo "$as_me:$LINENO: checking for an ANSI C-conforming const" >&5 echo $ECHO_N "checking for an ANSI C-conforming const... 
$ECHO_C" >&6 if test "${ac_cv_c_const+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. 
*/ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { /* FIXME: Include the comments suggested by Paul. */ #ifndef __cplusplus /* Ultrix mips cc rejects this. */ typedef int charset[2]; const charset x; /* SunOS 4.1.1 cc rejects this. */ char const *const *ccp; char **p; /* NEC SVR4.0.2 mips cc rejects this. */ struct point {int x, y;}; static struct point const zero = {0,0}; /* AIX XL C 1.02.0.0 rejects this. It does not let you subtract one const X* pointer from another in an arm of an if-expression whose if-part is not a constant expression */ const char *g = "string"; ccp = &g + (g ? g-g : 0); /* HPUX 7.0 cc rejects these. */ ++ccp; p = (char**) ccp; ccp = (char const *const *) p; { /* SCO 3.2v4 cc rejects this. */ char *t; char const *s = 0 ? (char *) 0 : (char const *) 0; *t++ = 0; } { /* Someone thinks the Sun supposedly-ANSI compiler will reject this. */ int x[] = {25, 17}; const int *foo = &x[0]; ++foo; } { /* Sun SC1.0 ANSI compiler rejects this -- but not the above. */ typedef const int *iptr; iptr p = 0; ++p; } { /* AIX XL C 1.02.0.0 rejects this saying "k.c", line 2.27: 1506-025 (S) Operand must be a modifiable lvalue. */ struct s { int j; const int *ap[3]; }; struct s *b; b->j = 5; } { /* ULTRIX-32 V3.1 (Rev 9) vcc rejects this */ const int foo = 10; } #endif ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_c_const=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_c_const=no fi rm -f conftest.$ac_objext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_c_const" >&5 echo "${ECHO_T}$ac_cv_c_const" >&6 if test $ac_cv_c_const = no; then cat >>confdefs.h <<\_ACEOF #define const _ACEOF fi echo "$as_me:$LINENO: checking return type of signal handlers" >&5 echo $ECHO_N "checking return type of signal handlers... $ECHO_C" >&6 if test "${ac_cv_type_signal+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <sys/types.h> #include <signal.h> #ifdef signal # undef signal #endif #ifdef __cplusplus extern "C" void (*signal (int, void (*)(int)))(int); #else void (*signal ()) (); #endif int main () { int i; ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_type_signal=void else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_type_signal=int fi rm -f conftest.$ac_objext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_type_signal" >&5 echo "${ECHO_T}$ac_cv_type_signal" >&6 cat >>confdefs.h <<_ACEOF #define RETSIGTYPE $ac_cv_type_signal _ACEOF echo "$as_me:$LINENO: checking whether utime accepts a null argument" >&5 echo $ECHO_N "checking whether utime accepts a null argument... 
$ECHO_C" >&6 if test "${ac_cv_func_utime_null+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else rm -f conftest.data; >conftest.data # Sequent interprets utime(file, 0) to mean use start of epoch. Wrong. if test "$cross_compiling" = yes; then ac_cv_func_utime_null=no else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ $ac_includes_default int main () { struct stat s, t; exit (!(stat ("conftest.data", &s) == 0 && utime ("conftest.data", (long *)0) == 0 && stat ("conftest.data", &t) == 0 && t.st_mtime >= s.st_mtime && t.st_mtime - s.st_mtime < 120)); ; return 0; } _ACEOF rm -f conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='./conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_func_utime_null=yes else echo "$as_me: program exited with status $ac_status" >&5 echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ( exit $ac_status ) ac_cv_func_utime_null=no fi rm -f core core.* *.core gmon.out bb.out conftest$ac_exeext conftest.$ac_objext conftest.$ac_ext fi rm -f core core.* *.core fi echo "$as_me:$LINENO: result: $ac_cv_func_utime_null" >&5 echo "${ECHO_T}$ac_cv_func_utime_null" >&6 if test $ac_cv_func_utime_null = yes; then cat >>confdefs.h <<\_ACEOF #define HAVE_UTIME_NULL 1 _ACEOF fi rm -f conftest.data #AC_CHECK_FUNCS(getcwd gethostname gettimeofday mkdir rmdir select socket strdup strftime strstr) # Need this for libtemplate for ac_func in strdup strerror do as_ac_var=`echo "ac_cv_func_$ac_func" | $as_tr_sh` echo "$as_me:$LINENO: checking for $ac_func" >&5 echo $ECHO_N "checking for $ac_func... 
$ECHO_C" >&6 if eval "test \"\${$as_ac_var+set}\" = set"; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* System header to define __stub macros and hopefully few prototypes, which can conflict with char $ac_func (); below. Prefer <limits.h> to <assert.h> if __STDC__ is defined, since <limits.h> exists even on freestanding compilers. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" { #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char $ac_func (); /* The GNU C library defines this for functions which it implements to always fail with ENOSYS. Some functions are actually named something starting with __ and the normal name is an alias. */ #if defined (__stub_$ac_func) || defined (__stub___$ac_func) choke me #else char (*f) () = $ac_func; #endif #ifdef __cplusplus } #endif int main () { return f != $ac_func; ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then eval "$as_ac_var=yes" else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 eval "$as_ac_var=no" fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext fi echo "$as_me:$LINENO: result: `eval echo '${'$as_ac_var'}'`" >&5 echo "${ECHO_T}`eval echo '${'$as_ac_var'}'`" >&6 if test `eval echo '${'$as_ac_var'}'` = yes; then cat >>confdefs.h <<_ACEOF #define `echo "HAVE_$ac_func" | $as_tr_cpp` 1 _ACEOF fi done # fmod, logb, and frexp are found in -lm on most systems. # On HPUX 9.01, -lm does not contain logb, so check for sqrt. echo "$as_me:$LINENO: checking for sqrt in -lm" >&5 echo $ECHO_N "checking for sqrt in -lm... $ECHO_C" >&6 if test "${ac_cv_lib_m_sqrt+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lm $LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char sqrt (); int main () { sqrt (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_lib_m_sqrt=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_lib_m_sqrt=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_lib_m_sqrt" >&5 echo "${ECHO_T}$ac_cv_lib_m_sqrt" >&6 if test $ac_cv_lib_m_sqrt = yes; then cat >>confdefs.h <<_ACEOF #define HAVE_LIBM 1 _ACEOF LIBS="-lm $LIBS" fi echo "$as_me:$LINENO: checking for dlopen in -lc" >&5 echo $ECHO_N "checking for dlopen in -lc... $ECHO_C" >&6 if test "${ac_cv_lib_c_dlopen+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lc $LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char dlopen (); int main () { dlopen (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_lib_c_dlopen=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_lib_c_dlopen=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_lib_c_dlopen" >&5 echo "${ECHO_T}$ac_cv_lib_c_dlopen" >&6 if test $ac_cv_lib_c_dlopen = yes; then cat >>confdefs.h <<_ACEOF #define HAVE_LIBC 1 _ACEOF LIBS="-lc $LIBS" else echo "$as_me:$LINENO: checking for dlopen in -ldl" >&5 echo $ECHO_N "checking for dlopen in -ldl... $ECHO_C" >&6 if test "${ac_cv_lib_dl_dlopen+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-ldl $LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char dlopen (); int main () { dlopen (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_lib_dl_dlopen=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_lib_dl_dlopen=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_lib_dl_dlopen" >&5 echo "${ECHO_T}$ac_cv_lib_dl_dlopen" >&6 if test $ac_cv_lib_dl_dlopen = yes; then LIBS="$LIBS -ldl" fi fi #AC_CHECK_LIB(resolv, gethostbyname) #AC_CHECK_LIB(nsl, gethostname, [LIBS="$LIBS -lnsl"]) #AC_CHECK_LIB(socket, setsockopt, [LIBS="$LIBS -lsocket"]) #Contribution by: Larry Schwimmer schwim@cyclone.stanford.edu echo "$as_me:$LINENO: checking for connect" >&5 echo $ECHO_N "checking for connect... $ECHO_C" >&6 if test "${ac_cv_func_connect+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* System header to define __stub macros and hopefully few prototypes, which can conflict with char connect (); below. Prefer <limits.h> to <assert.h> if __STDC__ is defined, since <limits.h> exists even on freestanding compilers. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" { #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char connect (); /* The GNU C library defines this for functions which it implements to always fail with ENOSYS. Some functions are actually named something starting with __ and the normal name is an alias. 
*/ #if defined (__stub_connect) || defined (__stub___connect) choke me #else char (*f) () = connect; #endif #ifdef __cplusplus } #endif int main () { return f != connect; ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_func_connect=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_func_connect=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_func_connect" >&5 echo "${ECHO_T}$ac_cv_func_connect" >&6 if test $ac_cv_func_connect = yes; then : else ac_check_socket=1 fi if test "$ac_check_socket" = 1; then echo "$as_me:$LINENO: checking for main in -lsocket" >&5 echo $ECHO_N "checking for main in -lsocket... $ECHO_C" >&6 if test "${ac_cv_lib_socket_main+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lsocket $LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { main (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_lib_socket_main=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_lib_socket_main=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_lib_socket_main" >&5 echo "${ECHO_T}$ac_cv_lib_socket_main" >&6 if test $ac_cv_lib_socket_main = yes; then LIBS="$LIBS -lsocket" else ac_check_both=1 fi fi if test "$ac_check_both" = 1; then ac_old_libs=$LIBS LIBS="$LIBS -lsocket -lnsl" echo "$as_me:$LINENO: checking for accept" >&5 echo $ECHO_N "checking for accept... $ECHO_C" >&6 if test "${ac_cv_func_accept+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* System header to define __stub macros and hopefully few prototypes, which can conflict with char accept (); below. Prefer <limits.h> to <assert.h> if __STDC__ is defined, since <limits.h> exists even on freestanding compilers. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" { #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char accept (); /* The GNU C library defines this for functions which it implements to always fail with ENOSYS. Some functions are actually named something starting with __ and the normal name is an alias. */ #if defined (__stub_accept) || defined (__stub___accept) choke me #else char (*f) () = accept; #endif #ifdef __cplusplus } #endif int main () { return f != accept; ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_func_accept=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_func_accept=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_func_accept" >&5 echo "${ECHO_T}$ac_cv_func_accept" >&6 if test $ac_cv_func_accept = yes; then checknsl=0 else LIBS=$ac_old_libs fi fi echo "$as_me:$LINENO: checking for gethostbyname" >&5 echo $ECHO_N "checking for gethostbyname... $ECHO_C" >&6 if test "${ac_cv_func_gethostbyname+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* System header to define __stub macros and hopefully few prototypes, which can conflict with char gethostbyname (); below. Prefer <limits.h> to <assert.h> if __STDC__ is defined, since <limits.h> exists even on freestanding compilers. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" { #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char gethostbyname (); /* The GNU C library defines this for functions which it implements to always fail with ENOSYS. Some functions are actually named something starting with __ and the normal name is an alias. 
*/ #if defined (__stub_gethostbyname) || defined (__stub___gethostbyname) choke me #else char (*f) () = gethostbyname; #endif #ifdef __cplusplus } #endif int main () { return f != gethostbyname; ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_func_gethostbyname=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_func_gethostbyname=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_func_gethostbyname" >&5 echo "${ECHO_T}$ac_cv_func_gethostbyname" >&6 if test $ac_cv_func_gethostbyname = yes; then : else echo "$as_me:$LINENO: checking for main in -lnsl" >&5 echo $ECHO_N "checking for main in -lnsl... $ECHO_C" >&6 if test "${ac_cv_lib_nsl_main+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lnsl $LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { main (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_lib_nsl_main=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_lib_nsl_main=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_lib_nsl_main" >&5 echo "${ECHO_T}$ac_cv_lib_nsl_main" >&6 if test $ac_cv_lib_nsl_main = yes; then LIBS="$LIBS -lnsl" fi fi # Check whether --with-file-end-mark or --without-file-end-mark was given. if test "${with_file_end_mark+set}" = set; then withval="$with_file_end_mark" cat >>confdefs.h <<_ACEOF #define FILE_END_MARK '$withval' _ACEOF else cat >>confdefs.h <<\_ACEOF #define FILE_END_MARK ' ' _ACEOF fi; # Check whether --enable-structured-queries or --disable-structured-queries was given. if test "${enable_structured_queries+set}" = set; then enableval="$enable_structured_queries" cat >>confdefs.h <<\_ACEOF #define STRUCTURED_QUERIES 1 _ACEOF TARGET=Sall else cat >>confdefs.h <<\_ACEOF #define STRUCTURED_QUERIES 0 _ACEOF TARGET=NOTSall fi; # Check whether --enable-iso-charset or --disable-iso-charset was given. if test "${enable_iso_charset+set}" = set; then enableval="$enable_iso_charset" use_iso=$enableval else use_iso=yes fi; # Check whether --enable-sfs-compat or --disable-sfs-compat was given. if test "${enable_sfs_compat+set}" = set; then enableval="$enable_sfs_compat" cat >>confdefs.h <<\_ACEOF #define SFS_COMPAT 1 _ACEOF else cat >>confdefs.h <<\_ACEOF #define SFS_COMPAT 0 _ACEOF fi; # Check whether --enable-pointer or --disable-pointer was given. if test "${enable_pointer+set}" = set; then enableval="$enable_pointer" else cat >>confdefs.h <<\_ACEOF #define AGREP_POINTER 1 _ACEOF fi; # Check whether --enable-measure-times or --disable-measure-times was given. 
if test "${enable_measure_times+set}" = set; then enableval="$enable_measure_times" cat >>confdefs.h <<\_ACEOF #define MEASURE_TIMES 1 _ACEOF fi; # Check whether --enable-warnings or --disable-warnings was given. if test "${enable_warnings+set}" = set; then enableval="$enable_warnings" CFLAGS="$CFLAGS -Wall" fi; # Check whether --enable-strip or --disable-strip was given. if test "${enable_strip+set}" = set; then enableval="$enable_strip" else STRIP="" fi; if test $use_iso = yes; then cat >>confdefs.h <<\_ACEOF #define ISO_CHAR_SET 1 _ACEOF else cat >>confdefs.h <<\_ACEOF #define ISO_CHAR_SET 0 _ACEOF fi ac_config_files="$ac_config_files Makefile index/Makefile compress/Makefile agrep/Makefile dynfilters/Makefile libtemplate/Makefile libtemplate/util/Makefile libtemplate/template/Makefile libtemplate/lib/Makefile" cat >confcache <<\_ACEOF # This file is a shell script that caches the results of configure # tests run on this system so they can be shared between configure # scripts and configure runs, see configure's option --config-cache. # It is not useful on other systems. If it contains results you don't # want to keep, you may remove or edit it. # # config.status only pays attention to the cache file if you give it # the --recheck option to rerun configure. # # `ac_cv_env_foo' variables (set or unset) will be overridden when # loading this file, other *unset* `ac_cv_foo' will be assigned the # following values. _ACEOF # The following way of writing the cache mishandles newlines in values, # but we know of no workaround that is simple, portable, and efficient. # So, don't put newlines in cache variables' values. # Ultrix sh set writes to stderr and can't be redirected directly, # and sets the high bit in the cache file unless we assign to the vars. { (set) 2>&1 | case `(ac_space=' '; set | grep ac_space) 2>&1` in *ac_space=\ *) # `set' does not quote correctly, so add quotes (double-quote # substitution turns \\\\ into \\, and sed turns \\ into \). 
sed -n \ "s/'/'\\\\''/g; s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1='\\2'/p" ;; *) # `set' quotes correctly as required by POSIX, so do not add quotes. sed -n \ "s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1=\\2/p" ;; esac; } | sed ' t clear : clear s/^\([^=]*\)=\(.*[{}].*\)$/test "${\1+set}" = set || &/ t end /^ac_cv_env/!s/^\([^=]*\)=\(.*\)$/\1=${\1=\2}/ : end' >>confcache if diff $cache_file confcache >/dev/null 2>&1; then :; else if test -w $cache_file; then test "x$cache_file" != "x/dev/null" && echo "updating cache $cache_file" cat confcache >$cache_file else echo "not updating unwritable cache $cache_file" fi fi rm -f confcache test "x$prefix" = xNONE && prefix=$ac_default_prefix # Let make expand exec_prefix. test "x$exec_prefix" = xNONE && exec_prefix='${prefix}' # VPATH may cause trouble with some makes, so we remove $(srcdir), # ${srcdir} and @srcdir@ from VPATH if srcdir is ".", strip leading and # trailing colons and then remove the whole line if VPATH becomes empty # (actually we leave an empty line to preserve line numbers). if test "x$srcdir" = x.; then ac_vpsub='/^[ ]*VPATH[ ]*=/{ s/:*\$(srcdir):*/:/; s/:*\${srcdir}:*/:/; s/:*@srcdir@:*/:/; s/^\([^=]*=[ ]*\):*/\1/; s/:*$//; s/^[^=]*=[ ]*$//; }' fi DEFS=-DHAVE_CONFIG_H ac_libobjs= ac_ltlibobjs= for ac_i in : $LIBOBJS; do test "x$ac_i" = x: && continue # 1. Remove the extension, and $U if already installed. ac_i=`echo "$ac_i" | sed 's/\$U\././;s/\.o$//;s/\.obj$//'` # 2. Add them. ac_libobjs="$ac_libobjs $ac_i\$U.$ac_objext" ac_ltlibobjs="$ac_ltlibobjs $ac_i"'$U.lo' done LIBOBJS=$ac_libobjs LTLIBOBJS=$ac_ltlibobjs : ${CONFIG_STATUS=./config.status} ac_clean_files_save=$ac_clean_files ac_clean_files="$ac_clean_files $CONFIG_STATUS" { echo "$as_me:$LINENO: creating $CONFIG_STATUS" >&5 echo "$as_me: creating $CONFIG_STATUS" >&6;} cat >$CONFIG_STATUS <<_ACEOF #! $SHELL # Generated by $as_me. # Run this file to recreate the current configuration. 
# Compiler output produced by configure, useful for debugging # configure, is in config.log if it exists. debug=false ac_cs_recheck=false ac_cs_silent=false SHELL=\${CONFIG_SHELL-$SHELL} _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF ## --------------------- ## ## M4sh Initialization. ## ## --------------------- ## # Be Bourne compatible if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then emulate sh NULLCMD=: # Zsh 3.x and 4.x performs word splitting on ${1+"$@"}, which # is contrary to our usage. Disable this feature. alias -g '${1+"$@"}'='"$@"' elif test -n "${BASH_VERSION+set}" && (set -o posix) >/dev/null 2>&1; then set -o posix fi # Support unset when possible. if (FOO=FOO; unset FOO) >/dev/null 2>&1; then as_unset=unset else as_unset=false fi # Work around bugs in pre-3.0 UWIN ksh. $as_unset ENV MAIL MAILPATH PS1='$ ' PS2='> ' PS4='+ ' # NLS nuisances. for as_var in \ LANG LANGUAGE LC_ADDRESS LC_ALL LC_COLLATE LC_CTYPE LC_IDENTIFICATION \ LC_MEASUREMENT LC_MESSAGES LC_MONETARY LC_NAME LC_NUMERIC LC_PAPER \ LC_TELEPHONE LC_TIME do if (set +x; test -n "`(eval $as_var=C; export $as_var) 2>&1`"); then eval $as_var=C; export $as_var else $as_unset $as_var fi done # Required to use basename. if expr a : '\(a\)' >/dev/null 2>&1; then as_expr=expr else as_expr=false fi if (basename /) >/dev/null 2>&1 && test "X`basename / 2>&1`" = "X/"; then as_basename=basename else as_basename=false fi # Name of the executable. as_me=`$as_basename "$0" || $as_expr X/"$0" : '.*/\([^/][^/]*\)/*$' \| \ X"$0" : 'X\(//\)$' \| \ X"$0" : 'X\(/\)$' \| \ . : '\(.\)' 2>/dev/null || echo X/"$0" | sed '/^.*\/\([^/][^/]*\)\/*$/{ s//\1/; q; } /^X\/\(\/\/\)$/{ s//\1/; q; } /^X\/\(\/\).*/{ s//\1/; q; } s/.*/./; q'` # PATH needs CR, and LINENO needs CR and PATH. # Avoid depending upon Character Ranges. 
as_cr_letters='abcdefghijklmnopqrstuvwxyz' as_cr_LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ' as_cr_Letters=$as_cr_letters$as_cr_LETTERS as_cr_digits='0123456789' as_cr_alnum=$as_cr_Letters$as_cr_digits # The user is always right. if test "${PATH_SEPARATOR+set}" != set; then echo "#! /bin/sh" >conf$$.sh echo "exit 0" >>conf$$.sh chmod +x conf$$.sh if (PATH="/nonexistent;."; conf$$.sh) >/dev/null 2>&1; then PATH_SEPARATOR=';' else PATH_SEPARATOR=: fi rm -f conf$$.sh fi as_lineno_1=$LINENO as_lineno_2=$LINENO as_lineno_3=`(expr $as_lineno_1 + 1) 2>/dev/null` test "x$as_lineno_1" != "x$as_lineno_2" && test "x$as_lineno_3" = "x$as_lineno_2" || { # Find who we are. Look in the path if we contain no path at all # relative or not. case $0 in *[\\/]* ) as_myself=$0 ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. test -r "$as_dir/$0" && as_myself=$as_dir/$0 && break done ;; esac # We did not find ourselves, most probably we were run as `sh COMMAND' # in which case we are not to be found in the path. if test "x$as_myself" = x; then as_myself=$0 fi if test ! -f "$as_myself"; then { { echo "$as_me:$LINENO: error: cannot find myself; rerun with an absolute path" >&5 echo "$as_me: error: cannot find myself; rerun with an absolute path" >&2;} { (exit 1); exit 1; }; } fi case $CONFIG_SHELL in '') as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in /bin$PATH_SEPARATOR/usr/bin$PATH_SEPARATOR$PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for as_base in sh bash ksh sh5; do case $as_dir in /*) if ("$as_dir/$as_base" -c ' as_lineno_1=$LINENO as_lineno_2=$LINENO as_lineno_3=`(expr $as_lineno_1 + 1) 2>/dev/null` test "x$as_lineno_1" != "x$as_lineno_2" && test "x$as_lineno_3" = "x$as_lineno_2" ') 2>/dev/null; then $as_unset BASH_ENV || test "${BASH_ENV+set}" != set || { BASH_ENV=; export BASH_ENV; } $as_unset ENV || test "${ENV+set}" != set || { ENV=; export ENV; } CONFIG_SHELL=$as_dir/$as_base export CONFIG_SHELL exec "$CONFIG_SHELL" "$0" ${1+"$@"} fi;; esac done done ;; esac # Create $as_me.lineno as a copy of $as_myself, but with $LINENO # uniformly replaced by the line number. The first 'sed' inserts a # line-number line before each line; the second 'sed' does the real # work. The second script uses 'N' to pair each line-number line # with the numbered line, and appends trailing '-' during # substitution so that $LINENO is not a special case at line end. # (Raja R Harinath suggested sed '=', and Paul Eggert wrote the # second 'sed' script. Blame Lee E. McMahon for sed's syntax. :-) sed '=' <$as_myself | sed ' N s,$,-, : loop s,^\(['$as_cr_digits']*\)\(.*\)[$]LINENO\([^'$as_cr_alnum'_]\),\1\2\1\3, t loop s,-$,, s,^['$as_cr_digits']*\n,, ' >$as_me.lineno && chmod +x $as_me.lineno || { { echo "$as_me:$LINENO: error: cannot create $as_me.lineno; rerun with a POSIX shell" >&5 echo "$as_me: error: cannot create $as_me.lineno; rerun with a POSIX shell" >&2;} { (exit 1); exit 1; }; } # Don't try to exec as it changes $[0], causing all sort of problems # (the dirname of $[0] is not the place where we might find the # original and so on. Autoconf is especially sensible to this). . ./$as_me.lineno # Exit status is that of the last command. 
exit } case `echo "testing\c"; echo 1,2,3`,`echo -n testing; echo 1,2,3` in *c*,-n*) ECHO_N= ECHO_C=' ' ECHO_T=' ' ;; *c*,* ) ECHO_N=-n ECHO_C= ECHO_T= ;; *) ECHO_N= ECHO_C='\c' ECHO_T= ;; esac if expr a : '\(a\)' >/dev/null 2>&1; then as_expr=expr else as_expr=false fi rm -f conf$$ conf$$.exe conf$$.file echo >conf$$.file if ln -s conf$$.file conf$$ 2>/dev/null; then # We could just check for DJGPP; but this test a) works b) is more generic # and c) will remain valid once DJGPP supports symlinks (DJGPP 2.04). if test -f conf$$.exe; then # Don't use ln at all; we don't have any links as_ln_s='cp -p' else as_ln_s='ln -s' fi elif ln conf$$.file conf$$ 2>/dev/null; then as_ln_s=ln else as_ln_s='cp -p' fi rm -f conf$$ conf$$.exe conf$$.file if mkdir -p . 2>/dev/null; then as_mkdir_p=: else as_mkdir_p=false fi as_executable_p="test -f" # Sed expression to map a string onto a valid CPP name. as_tr_cpp="sed y%*$as_cr_letters%P$as_cr_LETTERS%;s%[^_$as_cr_alnum]%_%g" # Sed expression to map a string onto a valid variable name. as_tr_sh="sed y%*+%pp%;s%[^_$as_cr_alnum]%_%g" # IFS # We need space, tab and new line, in precisely that order. as_nl=' ' IFS=" $as_nl" # CDPATH. $as_unset CDPATH exec 6>&1 # Open the log real soon, to keep \$[0] and so on meaningful, and to # report actual input values of CONFIG_FILES etc. instead of their # values after options handling. Logging --version etc. is OK. exec 5>>config.log { echo sed 'h;s/./-/g;s/^.../## /;s/...$/ ##/;p;x;p;x' <<_ASBOX ## Running $as_me. ## _ASBOX } >&5 cat >&5 <<_CSEOF This file was extended by $as_me, which was generated by GNU Autoconf 2.57. Invocation command line was CONFIG_FILES = $CONFIG_FILES CONFIG_HEADERS = $CONFIG_HEADERS CONFIG_LINKS = $CONFIG_LINKS CONFIG_COMMANDS = $CONFIG_COMMANDS $ $0 $@ _CSEOF echo "on `(hostname || uname -n) 2>/dev/null | sed 1q`" >&5 echo >&5 _ACEOF # Files that config.status was made for. 
if test -n "$ac_config_files"; then echo "config_files=\"$ac_config_files\"" >>$CONFIG_STATUS fi if test -n "$ac_config_headers"; then echo "config_headers=\"$ac_config_headers\"" >>$CONFIG_STATUS fi if test -n "$ac_config_links"; then echo "config_links=\"$ac_config_links\"" >>$CONFIG_STATUS fi if test -n "$ac_config_commands"; then echo "config_commands=\"$ac_config_commands\"" >>$CONFIG_STATUS fi cat >>$CONFIG_STATUS <<\_ACEOF ac_cs_usage="\ \`$as_me' instantiates files from templates according to the current configuration. Usage: $0 [OPTIONS] [FILE]... -h, --help print this help, then exit -V, --version print version number, then exit -q, --quiet do not print progress messages -d, --debug don't remove temporary files --recheck update $as_me by reconfiguring in the same conditions --file=FILE[:TEMPLATE] instantiate the configuration file FILE --header=FILE[:TEMPLATE] instantiate the configuration header FILE Configuration files: $config_files Configuration headers: $config_headers Report bugs to <bug-autoconf@gnu.org>." _ACEOF cat >>$CONFIG_STATUS <<_ACEOF ac_cs_version="\\ config.status configured by $0, generated by GNU Autoconf 2.57, with options \\"`echo "$ac_configure_args" | sed 's/[\\""\`\$]/\\\\&/g'`\\" Copyright 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001 Free Software Foundation, Inc. This config.status script is free software; the Free Software Foundation gives unlimited permission to copy, distribute and modify it." srcdir=$srcdir INSTALL="$INSTALL" _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF # If no files are specified by the user, then we need to provide a default # value. But we need to know if files were specified by the user. ac_need_defaults=: while test $# != 0 do case $1 in --*=*) ac_option=`expr "x$1" : 'x\([^=]*\)='` ac_optarg=`expr "x$1" : 'x[^=]*=\(.*\)'` ac_shift=: ;; -*) ac_option=$1 ac_optarg=$2 ac_shift=shift ;; *) # This is not an option, so the user has probably given explicit # arguments. 
ac_option=$1 ac_need_defaults=false;; esac case $ac_option in # Handling of the options. _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF -recheck | --recheck | --rechec | --reche | --rech | --rec | --re | --r) ac_cs_recheck=: ;; --version | --vers* | -V ) echo "$ac_cs_version"; exit 0 ;; --he | --h) # Conflict between --help and --header { { echo "$as_me:$LINENO: error: ambiguous option: $1 Try \`$0 --help' for more information." >&5 echo "$as_me: error: ambiguous option: $1 Try \`$0 --help' for more information." >&2;} { (exit 1); exit 1; }; };; --help | --hel | -h ) echo "$ac_cs_usage"; exit 0 ;; --debug | --d* | -d ) debug=: ;; --file | --fil | --fi | --f ) $ac_shift CONFIG_FILES="$CONFIG_FILES $ac_optarg" ac_need_defaults=false;; --header | --heade | --head | --hea ) $ac_shift CONFIG_HEADERS="$CONFIG_HEADERS $ac_optarg" ac_need_defaults=false;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil | --si | --s) ac_cs_silent=: ;; # This is an error. -*) { { echo "$as_me:$LINENO: error: unrecognized option: $1 Try \`$0 --help' for more information." >&5 echo "$as_me: error: unrecognized option: $1 Try \`$0 --help' for more information." >&2;} { (exit 1); exit 1; }; } ;; *) ac_config_targets="$ac_config_targets $1" ;; esac shift done ac_configure_extra_args= if $ac_cs_silent; then exec 6>/dev/null ac_configure_extra_args="$ac_configure_extra_args --silent" fi _ACEOF cat >>$CONFIG_STATUS <<_ACEOF if \$ac_cs_recheck; then echo "running $SHELL $0 " $ac_configure_args \$ac_configure_extra_args " --no-create --no-recursion" >&6 exec $SHELL $0 $ac_configure_args \$ac_configure_extra_args --no-create --no-recursion fi _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF for ac_config_target in $ac_config_targets do case "$ac_config_target" in # Handling of arguments. 
"Makefile" ) CONFIG_FILES="$CONFIG_FILES Makefile" ;; "index/Makefile" ) CONFIG_FILES="$CONFIG_FILES index/Makefile" ;; "compress/Makefile" ) CONFIG_FILES="$CONFIG_FILES compress/Makefile" ;; "agrep/Makefile" ) CONFIG_FILES="$CONFIG_FILES agrep/Makefile" ;; "dynfilters/Makefile" ) CONFIG_FILES="$CONFIG_FILES dynfilters/Makefile" ;; "libtemplate/Makefile" ) CONFIG_FILES="$CONFIG_FILES libtemplate/Makefile" ;; "libtemplate/util/Makefile" ) CONFIG_FILES="$CONFIG_FILES libtemplate/util/Makefile" ;; "libtemplate/template/Makefile" ) CONFIG_FILES="$CONFIG_FILES libtemplate/template/Makefile" ;; "libtemplate/lib/Makefile" ) CONFIG_FILES="$CONFIG_FILES libtemplate/lib/Makefile" ;; "libtemplate/include/autoconf.h" ) CONFIG_HEADERS="$CONFIG_HEADERS libtemplate/include/autoconf.h" ;; *) { { echo "$as_me:$LINENO: error: invalid argument: $ac_config_target" >&5 echo "$as_me: error: invalid argument: $ac_config_target" >&2;} { (exit 1); exit 1; }; };; esac done # If the user did not use the arguments to specify the items to instantiate, # then the envvar interface is used. Set only those that are not. # We use the long form for the default assignment because of an extremely # bizarre bug on SunOS 4.1.3. if $ac_need_defaults; then test "${CONFIG_FILES+set}" = set || CONFIG_FILES=$config_files test "${CONFIG_HEADERS+set}" = set || CONFIG_HEADERS=$config_headers fi # Have a temporary directory for convenience. Make it in the build tree # simply because there is no reason to put it here, and in addition, # creating and moving files from /tmp can sometimes cause problems. # Create a temporary directory, and hook for its removal unless debugging. $debug || { trap 'exit_status=$?; rm -rf $tmp && exit $exit_status' 0 trap '{ (exit 1); exit 1; }' 1 2 13 15 } # Create a (secure) tmp directory for tmp files. 
{ tmp=`(umask 077 && mktemp -d -q "./confstatXXXXXX") 2>/dev/null` && test -n "$tmp" && test -d "$tmp" } || { tmp=./confstat$$-$RANDOM (umask 077 && mkdir $tmp) } || { echo "$me: cannot create a temporary directory in ." >&2 { (exit 1); exit 1; } } _ACEOF cat >>$CONFIG_STATUS <<_ACEOF # # CONFIG_FILES section. # # No need to generate the scripts if there are no CONFIG_FILES. # This happens for instance when ./config.status config.h if test -n "\$CONFIG_FILES"; then # Protect against being on the right side of a sed subst in config.status. sed 's/,@/@@/; s/@,/@@/; s/,;t t\$/@;t t/; /@;t t\$/s/[\\\\&,]/\\\\&/g; s/@@/,@/; s/@@/@,/; s/@;t t\$/,;t t/' >\$tmp/subs.sed <<\\CEOF s,@SHELL@,$SHELL,;t t s,@PATH_SEPARATOR@,$PATH_SEPARATOR,;t t s,@PACKAGE_NAME@,$PACKAGE_NAME,;t t s,@PACKAGE_TARNAME@,$PACKAGE_TARNAME,;t t s,@PACKAGE_VERSION@,$PACKAGE_VERSION,;t t s,@PACKAGE_STRING@,$PACKAGE_STRING,;t t s,@PACKAGE_BUGREPORT@,$PACKAGE_BUGREPORT,;t t s,@exec_prefix@,$exec_prefix,;t t s,@prefix@,$prefix,;t t s,@program_transform_name@,$program_transform_name,;t t s,@bindir@,$bindir,;t t s,@sbindir@,$sbindir,;t t s,@libexecdir@,$libexecdir,;t t s,@datadir@,$datadir,;t t s,@sysconfdir@,$sysconfdir,;t t s,@sharedstatedir@,$sharedstatedir,;t t s,@localstatedir@,$localstatedir,;t t s,@libdir@,$libdir,;t t s,@includedir@,$includedir,;t t s,@oldincludedir@,$oldincludedir,;t t s,@infodir@,$infodir,;t t s,@mandir@,$mandir,;t t s,@build_alias@,$build_alias,;t t s,@host_alias@,$host_alias,;t t s,@target_alias@,$target_alias,;t t s,@DEFS@,$DEFS,;t t s,@ECHO_C@,$ECHO_C,;t t s,@ECHO_N@,$ECHO_N,;t t s,@ECHO_T@,$ECHO_T,;t t s,@LIBS@,$LIBS,;t t s,@CC@,$CC,;t t s,@CFLAGS@,$CFLAGS,;t t s,@LDFLAGS@,$LDFLAGS,;t t s,@CPPFLAGS@,$CPPFLAGS,;t t s,@ac_ct_CC@,$ac_ct_CC,;t t s,@EXEEXT@,$EXEEXT,;t t s,@OBJEXT@,$OBJEXT,;t t s,@AR@,$AR,;t t s,@RANLIB@,$RANLIB,;t t s,@ac_ct_RANLIB@,$ac_ct_RANLIB,;t t s,@LN_S@,$LN_S,;t t s,@LEX@,$LEX,;t t s,@LEXLIB@,$LEXLIB,;t t s,@LEX_OUTPUT_ROOT@,$LEX_OUTPUT_ROOT,;t t 
s,@STRIP@,$STRIP,;t t s,@CP@,$CP,;t t s,@INSTALL_PROGRAM@,$INSTALL_PROGRAM,;t t s,@INSTALL_SCRIPT@,$INSTALL_SCRIPT,;t t s,@INSTALL_DATA@,$INSTALL_DATA,;t t s,@CPP@,$CPP,;t t s,@EGREP@,$EGREP,;t t s,@TARGET@,$TARGET,;t t s,@HAVE_STRDUP@,$HAVE_STRDUP,;t t s,@LEXFLAGS@,$LEXFLAGS,;t t s,@DYNFILTER_TARGET@,$DYNFILTER_TARGET,;t t s,@DYNFILTER_CFLAGS@,$DYNFILTER_CFLAGS,;t t s,@DYNFILTER@,$DYNFILTER,;t t s,@LIBOBJS@,$LIBOBJS,;t t s,@LTLIBOBJS@,$LTLIBOBJS,;t t CEOF _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF # Split the substitutions into bite-sized pieces for seds with # small command number limits, like on Digital OSF/1 and HP-UX. ac_max_sed_lines=48 ac_sed_frag=1 # Number of current file. ac_beg=1 # First line for current file. ac_end=$ac_max_sed_lines # Line after last line for current file. ac_more_lines=: ac_sed_cmds= while $ac_more_lines; do if test $ac_beg -gt 1; then sed "1,${ac_beg}d; ${ac_end}q" $tmp/subs.sed >$tmp/subs.frag else sed "${ac_end}q" $tmp/subs.sed >$tmp/subs.frag fi if test ! -s $tmp/subs.frag; then ac_more_lines=false else # The purpose of the label and of the branching condition is to # speed up the sed processing (if there are no `@' at all, there # is no need to browse any of the substitutions). # These are the two extra sed commands mentioned above. (echo ':t /@[a-zA-Z_][a-zA-Z_0-9]*@/!b' && cat $tmp/subs.frag) >$tmp/subs-$ac_sed_frag.sed if test -z "$ac_sed_cmds"; then ac_sed_cmds="sed -f $tmp/subs-$ac_sed_frag.sed" else ac_sed_cmds="$ac_sed_cmds | sed -f $tmp/subs-$ac_sed_frag.sed" fi ac_sed_frag=`expr $ac_sed_frag + 1` ac_beg=$ac_end ac_end=`expr $ac_end + $ac_max_sed_lines` fi done if test -z "$ac_sed_cmds"; then ac_sed_cmds=cat fi fi # test -n "$CONFIG_FILES" _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF for ac_file in : $CONFIG_FILES; do test "x$ac_file" = x: && continue # Support "outfile[:infile[:infile...]]", defaulting infile="outfile.in". 
case $ac_file in - | *:- | *:-:* ) # input from stdin cat >$tmp/stdin ac_file_in=`echo "$ac_file" | sed 's,[^:]*:,,'` ac_file=`echo "$ac_file" | sed 's,:.*,,'` ;; *:* ) ac_file_in=`echo "$ac_file" | sed 's,[^:]*:,,'` ac_file=`echo "$ac_file" | sed 's,:.*,,'` ;; * ) ac_file_in=$ac_file.in ;; esac # Compute @srcdir@, @top_srcdir@, and @INSTALL@ for subdirectories. ac_dir=`(dirname "$ac_file") 2>/dev/null || $as_expr X"$ac_file" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$ac_file" : 'X\(//\)[^/]' \| \ X"$ac_file" : 'X\(//\)$' \| \ X"$ac_file" : 'X\(/\)' \| \ . : '\(.\)' 2>/dev/null || echo X"$ac_file" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/; q; } /^X\(\/\/\)[^/].*/{ s//\1/; q; } /^X\(\/\/\)$/{ s//\1/; q; } /^X\(\/\).*/{ s//\1/; q; } s/.*/./; q'` { if $as_mkdir_p; then mkdir -p "$ac_dir" else as_dir="$ac_dir" as_dirs= while test ! -d "$as_dir"; do as_dirs="$as_dir $as_dirs" as_dir=`(dirname "$as_dir") 2>/dev/null || $as_expr X"$as_dir" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$as_dir" : 'X\(//\)[^/]' \| \ X"$as_dir" : 'X\(//\)$' \| \ X"$as_dir" : 'X\(/\)' \| \ . : '\(.\)' 2>/dev/null || echo X"$as_dir" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/; q; } /^X\(\/\/\)[^/].*/{ s//\1/; q; } /^X\(\/\/\)$/{ s//\1/; q; } /^X\(\/\).*/{ s//\1/; q; } s/.*/./; q'` done test ! -n "$as_dirs" || mkdir $as_dirs fi || { { echo "$as_me:$LINENO: error: cannot create directory \"$ac_dir\"" >&5 echo "$as_me: error: cannot create directory \"$ac_dir\"" >&2;} { (exit 1); exit 1; }; }; } ac_builddir=. if test "$ac_dir" != .; then ac_dir_suffix=/`echo "$ac_dir" | sed 's,^\.[\\/],,'` # A "../" for each directory in $ac_dir_suffix. ac_top_builddir=`echo "$ac_dir_suffix" | sed 's,/[^\\/]*,../,g'` else ac_dir_suffix= ac_top_builddir= fi case $srcdir in .) # No --srcdir option. We are building in place. ac_srcdir=. if test -z "$ac_top_builddir"; then ac_top_srcdir=. else ac_top_srcdir=`echo $ac_top_builddir | sed 's,/$,,'` fi ;; [\\/]* | ?:[\\/]* ) # Absolute path. 
ac_srcdir=$srcdir$ac_dir_suffix; ac_top_srcdir=$srcdir ;; *) # Relative path. ac_srcdir=$ac_top_builddir$srcdir$ac_dir_suffix ac_top_srcdir=$ac_top_builddir$srcdir ;; esac # Don't blindly perform a `cd "$ac_dir"/$ac_foo && pwd` since $ac_foo can be # absolute. ac_abs_builddir=`cd "$ac_dir" && cd $ac_builddir && pwd` ac_abs_top_builddir=`cd "$ac_dir" && cd ${ac_top_builddir}. && pwd` ac_abs_srcdir=`cd "$ac_dir" && cd $ac_srcdir && pwd` ac_abs_top_srcdir=`cd "$ac_dir" && cd $ac_top_srcdir && pwd` case $INSTALL in [\\/$]* | ?:[\\/]* ) ac_INSTALL=$INSTALL ;; *) ac_INSTALL=$ac_top_builddir$INSTALL ;; esac if test x"$ac_file" != x-; then { echo "$as_me:$LINENO: creating $ac_file" >&5 echo "$as_me: creating $ac_file" >&6;} rm -f "$ac_file" fi # Let's still pretend it is `configure' which instantiates (i.e., don't # use $as_me), people would be surprised to read: # /* config.h. Generated by config.status. */ if test x"$ac_file" = x-; then configure_input= else configure_input="$ac_file. " fi configure_input=$configure_input"Generated from `echo $ac_file_in | sed 's,.*/,,'` by configure." # First look for the input files in the build tree, otherwise in the # src tree. 
ac_file_inputs=`IFS=: for f in $ac_file_in; do case $f in -) echo $tmp/stdin ;; [\\/$]*) # Absolute (can't be DOS-style, as IFS=:) test -f "$f" || { { echo "$as_me:$LINENO: error: cannot find input file: $f" >&5 echo "$as_me: error: cannot find input file: $f" >&2;} { (exit 1); exit 1; }; } echo $f;; *) # Relative if test -f "$f"; then # Build tree echo $f elif test -f "$srcdir/$f"; then # Source tree echo $srcdir/$f else # /dev/null tree { { echo "$as_me:$LINENO: error: cannot find input file: $f" >&5 echo "$as_me: error: cannot find input file: $f" >&2;} { (exit 1); exit 1; }; } fi;; esac done` || { (exit 1); exit 1; } _ACEOF cat >>$CONFIG_STATUS <<_ACEOF sed "$ac_vpsub $extrasub _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF :t /@[a-zA-Z_][a-zA-Z_0-9]*@/!b s,@configure_input@,$configure_input,;t t s,@srcdir@,$ac_srcdir,;t t s,@abs_srcdir@,$ac_abs_srcdir,;t t s,@top_srcdir@,$ac_top_srcdir,;t t s,@abs_top_srcdir@,$ac_abs_top_srcdir,;t t s,@builddir@,$ac_builddir,;t t s,@abs_builddir@,$ac_abs_builddir,;t t s,@top_builddir@,$ac_top_builddir,;t t s,@abs_top_builddir@,$ac_abs_top_builddir,;t t s,@INSTALL@,$ac_INSTALL,;t t " $ac_file_inputs | (eval "$ac_sed_cmds") >$tmp/out rm -f $tmp/stdin if test x"$ac_file" != x-; then mv $tmp/out $ac_file else cat $tmp/out rm -f $tmp/out fi done _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF # # CONFIG_HEADER section. # # These sed commands are passed to sed as "A NAME B NAME C VALUE D", where # NAME is the cpp macro being defined and VALUE is the value it is being given. # # ac_d sets the value in "#define NAME VALUE" lines. ac_dA='s,^\([ ]*\)#\([ ]*define[ ][ ]*\)' ac_dB='[ ].*$,\1#\2' ac_dC=' ' ac_dD=',;t' # ac_u turns "#undef NAME" without trailing blanks into "#define NAME VALUE". ac_uA='s,^\([ ]*\)#\([ ]*\)undef\([ ][ ]*\)' ac_uB='$,\1#\2define\3' ac_uC=' ' ac_uD=',;t' for ac_file in : $CONFIG_HEADERS; do test "x$ac_file" = x: && continue # Support "outfile[:infile[:infile...]]", defaulting infile="outfile.in". 
case $ac_file in - | *:- | *:-:* ) # input from stdin cat >$tmp/stdin ac_file_in=`echo "$ac_file" | sed 's,[^:]*:,,'` ac_file=`echo "$ac_file" | sed 's,:.*,,'` ;; *:* ) ac_file_in=`echo "$ac_file" | sed 's,[^:]*:,,'` ac_file=`echo "$ac_file" | sed 's,:.*,,'` ;; * ) ac_file_in=$ac_file.in ;; esac test x"$ac_file" != x- && { echo "$as_me:$LINENO: creating $ac_file" >&5 echo "$as_me: creating $ac_file" >&6;} # First look for the input files in the build tree, otherwise in the # src tree. ac_file_inputs=`IFS=: for f in $ac_file_in; do case $f in -) echo $tmp/stdin ;; [\\/$]*) # Absolute (can't be DOS-style, as IFS=:) test -f "$f" || { { echo "$as_me:$LINENO: error: cannot find input file: $f" >&5 echo "$as_me: error: cannot find input file: $f" >&2;} { (exit 1); exit 1; }; } echo $f;; *) # Relative if test -f "$f"; then # Build tree echo $f elif test -f "$srcdir/$f"; then # Source tree echo $srcdir/$f else # /dev/null tree { { echo "$as_me:$LINENO: error: cannot find input file: $f" >&5 echo "$as_me: error: cannot find input file: $f" >&2;} { (exit 1); exit 1; }; } fi;; esac done` || { (exit 1); exit 1; } # Remove the trailing spaces. sed 's/[ ]*$//' $ac_file_inputs >$tmp/in _ACEOF # Transform confdefs.h into two sed scripts, `conftest.defines' and # `conftest.undefs', that substitutes the proper values into # config.h.in to produce config.h. The first handles `#define' # templates, and the second `#undef' templates. # And first: Protect against being on the right side of a sed subst in # config.status. Protect against being in an unquoted here document # in config.status. rm -f conftest.defines conftest.undefs # Using a here document instead of a string reduces the quoting nightmare. # Putting comments in sed scripts is not portable. # # `end' is used to avoid that the second main sed command (meant for # 0-ary CPP macros) applies to n-ary macro definitions. # See the Autoconf documentation for `clear'. 
cat >confdef2sed.sed <<\_ACEOF s/[\\&,]/\\&/g s,[\\$`],\\&,g t clear : clear s,^[ ]*#[ ]*define[ ][ ]*\([^ (][^ (]*\)\(([^)]*)\)[ ]*\(.*\)$,${ac_dA}\1${ac_dB}\1\2${ac_dC}\3${ac_dD},gp t end s,^[ ]*#[ ]*define[ ][ ]*\([^ ][^ ]*\)[ ]*\(.*\)$,${ac_dA}\1${ac_dB}\1${ac_dC}\2${ac_dD},gp : end _ACEOF # If some macros were called several times there might be several times # the same #defines, which is useless. Nevertheless, we may not want to # sort them, since we want the *last* AC-DEFINE to be honored. uniq confdefs.h | sed -n -f confdef2sed.sed >conftest.defines sed 's/ac_d/ac_u/g' conftest.defines >conftest.undefs rm -f confdef2sed.sed # This sed command replaces #undef with comments. This is necessary, for # example, in the case of _POSIX_SOURCE, which is predefined and required # on some systems where configure will not decide to define it. cat >>conftest.undefs <<\_ACEOF s,^[ ]*#[ ]*undef[ ][ ]*[a-zA-Z_][a-zA-Z_0-9]*,/* & */, _ACEOF # Break up conftest.defines because some shells have a limit on the size # of here documents, and old seds have small limits too (100 cmds). echo ' # Handle all the #define templates only if necessary.' >>$CONFIG_STATUS echo ' if grep "^[ ]*#[ ]*define" $tmp/in >/dev/null; then' >>$CONFIG_STATUS echo ' # If there are no defines, we may have an empty if/fi' >>$CONFIG_STATUS echo ' :' >>$CONFIG_STATUS rm -f conftest.tail while grep . conftest.defines >/dev/null do # Write a limited-size here document to $tmp/defines.sed. echo ' cat >$tmp/defines.sed <<CEOF' >>$CONFIG_STATUS # Speed up: don't consider the non `#define' lines. echo '/^[ ]*#[ ]*define/!b' >>$CONFIG_STATUS # Work around the forget-to-reset-the-flag bug.
echo 't clr' >>$CONFIG_STATUS echo ': clr' >>$CONFIG_STATUS sed ${ac_max_here_lines}q conftest.defines >>$CONFIG_STATUS echo 'CEOF sed -f $tmp/defines.sed $tmp/in >$tmp/out rm -f $tmp/in mv $tmp/out $tmp/in ' >>$CONFIG_STATUS sed 1,${ac_max_here_lines}d conftest.defines >conftest.tail rm -f conftest.defines mv conftest.tail conftest.defines done rm -f conftest.defines echo ' fi # grep' >>$CONFIG_STATUS echo >>$CONFIG_STATUS # Break up conftest.undefs because some shells have a limit on the size # of here documents, and old seds have small limits too (100 cmds). echo ' # Handle all the #undef templates' >>$CONFIG_STATUS rm -f conftest.tail while grep . conftest.undefs >/dev/null do # Write a limited-size here document to $tmp/undefs.sed. echo ' cat >$tmp/undefs.sed <<CEOF' >>$CONFIG_STATUS # Speed up: don't consider the non `#undef' echo '/^[ ]*#[ ]*undef/!b' >>$CONFIG_STATUS # Work around the forget-to-reset-the-flag bug. echo 't clr' >>$CONFIG_STATUS echo ': clr' >>$CONFIG_STATUS sed ${ac_max_here_lines}q conftest.undefs >>$CONFIG_STATUS echo 'CEOF sed -f $tmp/undefs.sed $tmp/in >$tmp/out rm -f $tmp/in mv $tmp/out $tmp/in ' >>$CONFIG_STATUS sed 1,${ac_max_here_lines}d conftest.undefs >conftest.tail rm -f conftest.undefs mv conftest.tail conftest.undefs done rm -f conftest.undefs cat >>$CONFIG_STATUS <<\_ACEOF # Let's still pretend it is `configure' which instantiates (i.e., don't # use $as_me), people would be surprised to read: # /* config.h. Generated by config.status. */ if test x"$ac_file" = x-; then echo "/* Generated by configure. */" >$tmp/config.h else echo "/* $ac_file. Generated by configure. 
*/" >$tmp/config.h fi cat $tmp/in >>$tmp/config.h rm -f $tmp/in if test x"$ac_file" != x-; then if diff $ac_file $tmp/config.h >/dev/null 2>&1; then { echo "$as_me:$LINENO: $ac_file is unchanged" >&5 echo "$as_me: $ac_file is unchanged" >&6;} else ac_dir=`(dirname "$ac_file") 2>/dev/null || $as_expr X"$ac_file" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$ac_file" : 'X\(//\)[^/]' \| \ X"$ac_file" : 'X\(//\)$' \| \ X"$ac_file" : 'X\(/\)' \| \ . : '\(.\)' 2>/dev/null || echo X"$ac_file" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/; q; } /^X\(\/\/\)[^/].*/{ s//\1/; q; } /^X\(\/\/\)$/{ s//\1/; q; } /^X\(\/\).*/{ s//\1/; q; } s/.*/./; q'` { if $as_mkdir_p; then mkdir -p "$ac_dir" else as_dir="$ac_dir" as_dirs= while test ! -d "$as_dir"; do as_dirs="$as_dir $as_dirs" as_dir=`(dirname "$as_dir") 2>/dev/null || $as_expr X"$as_dir" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$as_dir" : 'X\(//\)[^/]' \| \ X"$as_dir" : 'X\(//\)$' \| \ X"$as_dir" : 'X\(/\)' \| \ . : '\(.\)' 2>/dev/null || echo X"$as_dir" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/; q; } /^X\(\/\/\)[^/].*/{ s//\1/; q; } /^X\(\/\/\)$/{ s//\1/; q; } /^X\(\/\).*/{ s//\1/; q; } s/.*/./; q'` done test ! -n "$as_dirs" || mkdir $as_dirs fi || { { echo "$as_me:$LINENO: error: cannot create directory \"$ac_dir\"" >&5 echo "$as_me: error: cannot create directory \"$ac_dir\"" >&2;} { (exit 1); exit 1; }; }; } rm -f $ac_file mv $tmp/config.h $ac_file fi else cat $tmp/config.h rm -f $tmp/config.h fi done _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF { (exit 0); exit 0; } _ACEOF chmod +x $CONFIG_STATUS ac_clean_files=$ac_clean_files_save # configure is writing to config.log, and then calls config.status. # config.status does its own redirection, appending to config.log. # Unfortunately, on DOS this fails, as config.log is still kept open # by configure, so config.status won't be able to write to it; its # output is simply discarded. 
So we exec the FD to /dev/null, # effectively closing config.log, so it can be properly (re)opened and # appended to by config.status. When coming back to configure, we # need to make the FD available again. if test "$no_create" != yes; then ac_cs_success=: ac_config_status_args= test "$silent" = yes && ac_config_status_args="$ac_config_status_args --quiet" exec 5>/dev/null $SHELL $CONFIG_STATUS $ac_config_status_args || ac_cs_success=false exec 5>>config.log # Use ||, not &&, to avoid exiting from the if with $? = 1, which # would make configure fail if this is the last instruction. $ac_cs_success || { (exit 1); exit 1; } fi glimpse-4.18.7/configure.in000066400000000000000000000102721300371307100155610ustar00rootroot00000000000000dnl Process this file with autoconf to produce a configure script. AC_INIT(get_filename.c) AC_CONFIG_HEADER(libtemplate/include/autoconf.h) AC_PROG_CC dnl Check to see where to find ar. -- mkh AC_PATH_PROG(AR, ar, ar) if test -z "$AR" ; then AC_PATH_PROG(AR, ar, , /usr/ccs/bin/ar) if test -z "$AR" ; then AC_MSG_ERROR([no acceptable ar found in \$PATH:/usr/ccs/bin/ar]) fi fi AC_PROG_RANLIB AC_PROG_LN_S dnl configure for dynfilter AC_PROG_LEX if test "x$LEX" = "xflex" ; then DYNFILTER_TARGET=htuml2txt.so LEXFLAGS="-F -8" else DYNFILTER_TARGET=htuml2txt LEXFLAGS= fi if test "$ac_cv_prog_gcc" = "yes" ; then DYNFILTER_CFLAGS="-O3 -fomit-frame-pointer" else DYNFILTER_CFLAGS="-O" fi if test "`uname`" = "Linux" ; then DYNFILTER=dynfilters else DYNFILTER= fi dnl Check for strip, to support the --enable-strip option. -- mkh AC_PATH_PROG(STRIP, strip, strip) dnl Check for cp, this should not be a problem. -- mkh AC_PATH_PROG(CP, cp, cp) if test -z "$CP" ; then AC_MSG_ERROR([no cp found in \$PATH, something weird is going on.]) fi AC_PROG_INSTALL dnl Checks for header files. 
AC_HEADER_DIRENT #Contribution by VaX#n8 AC_HEADER_STDC AC_CHECK_HEADERS(fcntl.h sys/file.h sys/time.h unistd.h sys/select.h) dnl XXX sysuh AC_CHECK_HEADERS(sys/dir.h sys/ndir.h strerr.h) dnl Checks for typedefs, structures, and compiler characteristics. AC_HEADER_TIME dnl ######### compiler characteristics AC_C_CONST dnl Checks for library functions. AC_TYPE_SIGNAL AC_FUNC_UTIME_NULL #AC_CHECK_FUNCS(getcwd gethostname gettimeofday mkdir rmdir select socket strdup strftime strstr) # Need this for libtemplate AC_CHECK_FUNCS(strdup strerror) dnl Check for libraries # fmod, logb, and frexp are found in -lm on most systems. # On HPUX 9.01, -lm does not contain logb, so check for sqrt. AC_CHECK_LIB(m, sqrt) AC_CHECK_LIB(c, dlopen, , [AC_CHECK_LIB(dl, dlopen, [LIBS="$LIBS -ldl"])]) #AC_CHECK_LIB(resolv, gethostbyname) #AC_CHECK_LIB(nsl, gethostname, [LIBS="$LIBS -lnsl"]) #AC_CHECK_LIB(socket, setsockopt, [LIBS="$LIBS -lsocket"]) #Contribution by: Larry Schwimmer schwim@cyclone.stanford.edu AC_CHECK_FUNC(connect, , ac_check_socket=1) if test "$ac_check_socket" = 1; then AC_CHECK_LIB(socket, main, LIBS="$LIBS -lsocket", ac_check_both=1) fi if test "$ac_check_both" = 1; then ac_old_libs=$LIBS LIBS="$LIBS -lsocket -lnsl" AC_CHECK_FUNC(accept, checknsl=0, LIBS=$ac_old_libs) fi AC_CHECK_FUNC(gethostbyname, , AC_CHECK_LIB(nsl, main, [LIBS="$LIBS -lnsl"])) dnl Optional stuff AC_ARG_WITH(file-end-mark, [ --with-file-end-mark=CHAR use character CHAR as filename delimiter [' '] most often set to '\t' in order to index filenames with spaces; must match Webglimpse setting in lib/wgHeader.pm.], AC_DEFINE_UNQUOTED(FILE_END_MARK,'$withval'), AC_DEFINE(FILE_END_MARK,[' '])) dnl AC_DEFINE(FILE_END_MARK, [$withval]) dnl TARGET=Sall, dnl AC_DEFINE(FILE_END_MARK, [' ']) dnl TARGET=Sall) AC_ARG_ENABLE(structured-queries, [ --enable-structured-queries enable structured queries], AC_DEFINE(STRUCTURED_QUERIES,1) TARGET=Sall, AC_DEFINE(STRUCTURED_QUERIES,0) TARGET=NOTSall) 
AC_ARG_ENABLE(iso-charset, [ --disable-iso-charset disable iso charset (may be slightly faster if you don't care about upper-ascii characters)], [use_iso=$enableval], [use_iso=yes]) AC_ARG_ENABLE(sfs-compat, [ --enable-sfs-compat Support SFS compatibility], AC_DEFINE(SFS_COMPAT,1), AC_DEFINE(SFS_COMPAT,0)) AC_ARG_ENABLE(pointer, [ --enable-pointer blah], , AC_DEFINE(AGREP_POINTER)) AC_ARG_ENABLE(measure-times, [ --enable-measure-times blah], AC_DEFINE(MEASURE_TIMES)) AC_ARG_ENABLE(warnings, [ --enable-warnings Add -Wall to CFLAGS], [CFLAGS="$CFLAGS -Wall"]) AC_ARG_ENABLE(strip, [ --enable-strip Strip binaries], , [STRIP=""]) if test $use_iso = yes; then AC_DEFINE(ISO_CHAR_SET,1) else AC_DEFINE(ISO_CHAR_SET,0) fi dnl local substitute AC_SUBST(TARGET) AC_SUBST(HAVE_STRDUP) AC_SUBST(LEXFLAGS) AC_SUBST(DYNFILTER_TARGET) AC_SUBST(DYNFILTER_CFLAGS) AC_SUBST(DYNFILTER) AC_OUTPUT(Makefile index/Makefile compress/Makefile agrep/Makefile dynfilters/Makefile libtemplate/Makefile libtemplate/util/Makefile libtemplate/template/Makefile libtemplate/lib/Makefile) glimpse-4.18.7/defs.h000066400000000000000000000025001300371307100143350ustar00rootroot00000000000000#ifndef _GIMPSE_DEFS_H_ #define _GIMPSE_DEFS_H_ #include /* autoconf defines */ #define MAX_ARGS 80 /* English alphabets + numbers + pattern + progname + arguments + extras */ #define MAXFILEOPT 1024 /* includes length of args too: #args is <= MAX_ARGS */ #define BLOCKSIZE 8192 /* For compression: what is the optimal unit of disk i/o = n * pagesize */ /* * These are some parameters that allow us to switch between offset computation * and just index computation when the index is built at a byte-level: since * offset computation is a waste if we can't narrow down search enough (since * we must look all over and the lists become too long => bottleneck). This may * not be needed if we used trees to store intervals --- we'll do it later :-).
 */ #define MAX_DISPARITY 100 /* if least frequent word occurs in < 1/100 times most frequent word, resort to agrep: don't intersect lists (byte-level) */ #define MIN_OCCURRENCES 20 /* Min no. of occurrences before we check for highly frequent words using MAX_UNION */ #define MAX_UNION 500 /* Don't even perform the Union of offsets if least < 1/500 times most freq word (we are on track of stop list kinda words) */ #define MAX_ABSOLUTE MAX_SORTLINE_LEN /* Don't even perform the Union of offsets if a word occurs more than 16K times (independent of #of files) */ #endif glimpse-4.18.7/dynfilters/000077500000000000000000000000001300371307100154315ustar00rootroot00000000000000glimpse-4.18.7/dynfilters/Makefile.in000066400000000000000000000055741300371307100175110ustar00rootroot00000000000000# Set this variable to the name of your C compiler CC=gcc # Provide optimization flags here. These settings are fine for egcs and # gcc >= 2.95. If you use an older gcc version, change -O3 to -O2. If you # use a non-gcc compiler, change this line to # CFLAGS=-O CFLAGS=-O3 -fomit-frame-pointer # Additional definitions you would like to pass to the compiler. You usually # can leave this line alone. DEFS= # The name of the linker. Usually, specifying the same name as the C compiler # is fine. Change this to "ld" if in doubt. LD=$(CC) # Additional flags you would like to pass to the linker. You usually can leave # this line alone. LDFLAGS= # Specify the name of your lex program here. If you have flex installed, it is # highly recommended that you leave this line alone. If you use AT&T lex, # change this line to "lex". You then also need to change the next variable, # see below. LEX=flex # Flags you pass to the lexer. These are flex-specific. If you use AT&T lex, # change this line to "LEXFLAGS=" LEXFLAGS=-F -8 # lex and flex require that you link against an additional library. If you use # AT&T lex, change this to "-ll". 
LEXLIB=-lfl # You can ignore the next lines if you do not want to build the filter # in a shared library. Unless you take part in the experimental glimpse # project that allows filters to be specified in shared libraries, you # can leave the rest of the makefile safely alone. #------------------------------------------------------------------------ # If you build a shared library, some compilers, such as gcc, require you # to build "position-independent" (i.e. relocatable) code. Some other # compilers do not require anything. Check your compiler manual on how to # build shared libraries and clear this line if necessary. SHAREDCFLAGS=-fPIC # This line is passed to the linker, to indicate that it should build a # shared library. Consult your compiler manual on how to build shared # libraries, and change this line if necessary. SHAREDLDFLAGS=-shared # This line specifies the library, in which the functions for loading # a shared library at runtime reside. More specifically, look for the # library that defines dlopen, dlsym, and dlclose. DLLIB=-ldl # Do not change this line SHAREDDEFS=-DSHARED_OBJECT htuml2txt: lex.yy.c $(CC) $(CFLAGS) $(DEFS) -c lex.yy.c $(LD) $(LDFLAGS) -o htuml2txt lex.yy.o $(LEXLIB) htuml2txt.so: lex.yy.c $(CC) $(CFLAGS) $(DEFS) $(SHAREDCFLAGS) $(SHAREDDEFS) -c lex.yy.c $(LD) $(LDFLAGS) $(SHAREDLDFLAGS) -o htuml2txt.so lex.yy.o $(LEXLIB) sotest: sotest.c $(CC) $(CFLAGS) $(DEFS) -c sotest.c $(LD) $(LDFLAGS) -o sotest sotest.o $(DLLIB) lex.yy.c: htuml2txt.lex $(LEX) $(LEXFLAGS) htuml2txt.lex all: htuml2txt htuml2txt.so clean: rm -f *.o lex.yy.c core distclean: clean rm -f htuml2txt htuml2txt.so Makefile install: all test: all echo "Doing regression test ..." alltest: test sotest echo "Doing extended regression test ... " glimpse-4.18.7/dynfilters/Makefile.linux000066400000000000000000000054651300371307100202410ustar00rootroot00000000000000# Set this variable to the name of your C compiler CC=gcc # Provide optimization flags here. 
These settings are fine for egcs and # gcc >= 2.95. If you use an older gcc version, change -O3 to -O2. If you # use a non-gcc compiler, change this line to # CFLAGS=-O CFLAGS=-O3 -fomit-frame-pointer # Additional definitions you would like to pass to the compiler. You usually # can leave this line alone. DEFS= # The name of the linker. Usually, specifying the same name as the C compiler # is fine. Change this to "ld" if in doubt. LD=$(CC) # Additional flags you would like to pass to the linker. You usually can leave # this line alone. LDFLAGS= # Specify the name of your lex program here. If you have flex installed, it is # highly recommended that you leave this line alone. If you use AT&T lex, # change this line to "lex". You then also need to change the next variable, # see below. LEX=flex # Flags you pass to the lexer. These are flex-specific. If you use AT&T lex, # change this line to "LEXFLAGS=" LEXFLAGS=-F -8 # lex and flex require that you link against an additional library. If you use # AT&T lex, change this to "-ll". LEXLIB=-lfl # You can ignore the next lines if you do not want to build the filter # in a shared library. Unless you take part in the experimental glimpse # project that allows filters to be specified in shared libraries, you # can leave the rest of the makefile safely alone. #------------------------------------------------------------------------ # If you build a shared library, some compilers, such as gcc, require you # to build "position-independent" (i.e. relocatable) code. Some other # compilers do not require anything. Check your compiler manual on how to # build shared libraries and clear this line if necessary. SHAREDCFLAGS=-fPIC # This line is passed to the linker, to indicate that it should build a # shared library. Consult your compiler manual on how to build shared # libraries, and change this line if necessary. 
SHAREDLDFLAGS=-shared # This line specifies the library, in which the functions for loading # a shared library at runtime reside. More specifically, look for the # library that defines dlopen, dlsym, and dlclose. DLLIB=-ldl # Do not change this line SHAREDDEFS=-DSHARED_OBJECT htuml2txt: lex.yy.c $(CC) $(CFLAGS) $(DEFS) -c lex.yy.c $(LD) $(LDFLAGS) -o htuml2txt lex.yy.o $(LEXLIB) htuml2txt.so: lex.yy.c $(CC) $(CFLAGS) $(DEFS) $(SHAREDCFLAGS) $(SHAREDDEFS) -c lex.yy.c $(LD) $(LDFLAGS) $(SHAREDLDFLAGS) -o htuml2txt.so lex.yy.o $(LEXLIB) sotest: sotest.c $(CC) $(CFLAGS) $(DEFS) -c sotest.c $(LD) $(LDFLAGS) -o sotest sotest.o $(DLLIB) lex.yy.c: htuml2txt.lex $(LEX) $(LEXFLAGS) htuml2txt.lex all: htuml2txt htuml2txt.so clean: rm -f *.o lex.yy.c core test: all echo "Doing regression test ..." alltest: test sotest echo "Doing extended regression test ... " glimpse-4.18.7/dynfilters/README000066400000000000000000000060461300371307100163170ustar00rootroot00000000000000These files constitute fixes and enhancements for Webglimpse 1.7.1. Feel free to contact me at cvogler@gradient.cis.upenn.edu with any questions or concerns. Overview: ========= htuml2txt.lex - a set of lex rules intended to replace the functionality htuml2txt.pl. Can be built with either flex or an 8bit-clean AT&T lex. Makefile.linux - Makefile for building an optimized htuml2txt on Linux. Adapting it to other systems should be trivial. Makefile.flex - A generic makefile for flex Makefile.att - A generic makefile for AT&T lex variants makecron.patch - A small patch that makes sure that the indexing phase of Webglimpse also uses an HTML filter (html2txt, htuml2txt, or htuml2txt.pl). It does so by adding the -z switch to the argument list of glimpseindex. htuml2txt.lex ============= Description: A faster HTML filter for WebGlimpse than htuml2txt.pl. I found that the spawning of all the perl processes by glimpse was way too expensive to be practical. 
In particular, searching 2000 files for a frequently occurring term took more than 30 seconds on a PII-400/Linux 2.2.5 machine. Rewriting the filter as a set of lex rules sped up the search by a factor of 6, which is about on par with the plain html2txt filter.

You need either flex(1) from the Free Software Foundation (http://www.fsf.org), or an 8bit-clean version of AT&T lex(1) to build the filter correctly. Systems that have a C compiler installed usually also have at least one of these tools installed. I tested the filter successfully on Linux 2.2 (using flex), SGI IRIX 6.4 (using both flex and AT&T lex), and Solaris 2.6 (using both flex and AT&T lex). When in doubt, I recommend using flex. It is freely available, and it is much faster and more robust than the AT&T variants of lex.

If you are using Linux and have egcs installed, you can build an optimized version of htuml2txt with this command:

    make -f Makefile.linux

Otherwise, if you are using flex on any system, you can build the filter with this command:

    make -f Makefile.flex

Finally, if you prefer using AT&T lex, you can build the filter with this command:

    make -f Makefile.att

Any of these commands should build the file "htuml2txt". To install and use the filter, copy this file to the lib directory in your Webglimpse home directory. Also, edit the file ".glimpse_filters" in your database directory and replace all occurrences of "htuml2txt.pl" with "htuml2txt".

makecron.patch
==============

This is a small patch that ensures that Webglimpse uses the HTML filters during index creation. This can greatly reduce the number of files that Webglimpse has to search. For some search terms, I observed a speedup of a factor of 2 with this change.

To apply the patch, change to the Webglimpse home directory and run this command:

    patch -p1 < /makecron.patch

Afterwards, run makecron and reconfigure the archives to make sure that the changes propagate to all archives.
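
The ".glimpse_filters" edit described in the htuml2txt.lex section above can be scripted. The following is only a minimal sketch, not part of the distribution: the function name switch_html_filter and the ".bak" backup suffix are assumptions made for illustration, and the path to ".glimpse_filters" has to be supplied by you.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with glimpse or Webglimpse):
# replace every "htuml2txt.pl" in a .glimpse_filters file with "htuml2txt".
# Usage: switch_html_filter /path/to/database/.glimpse_filters
switch_html_filter() {
    filters_file=$1

    if [ ! -f "$filters_file" ]; then
        echo "no such file: $filters_file" >&2
        return 1
    fi

    # Keep a backup copy so the change is easy to undo.
    cp "$filters_file" "$filters_file.bak" || return 1

    # Rewrite from the backup into the original file; in-place editing
    # with sed is avoided because it is not portable across systems.
    sed 's/htuml2txt\.pl/htuml2txt/g' "$filters_file.bak" > "$filters_file"
}
```

If the result is not what you expected, copying the ".bak" file back over ".glimpse_filters" restores the previous configuration.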
glimpse-4.18.7/dynfilters/htuml2txt.lex000066400000000000000000000405531300371307100201250ustar00rootroot00000000000000/******** * $Id: htuml2txt.lex,v 1.1 2000/05/25 18:07:05 golda Exp $ * $Log: htuml2txt.lex,v $ * Revision 1.1 2000/05/25 18:07:05 golda * Added Christian's changes to allow dynamic filters. I believe this has only been tested on Linux * systems. --GV * * Revision 1.5 1999/11/06 21:25:07 cvogler * - Fixed bug that did not recognize the end of a comment correctly. * * Revision 1.4 1999/11/06 06:55:08 cvogler * - Added support for &gt; and &lt; (greater than, and less than). * - Fixed problems with the matching rules for non-spacing tags that * caused linefeeds to be incorrectly suppressed. As a result, jumping * to line numbers from webglimpse searches did not work. * * * htuml2txt.lex * * A faster HTML filter for WebGlimpse than htuml2txt.pl. I found that * the spawning of all the perl processes by glimpse was way too expensive * to be practical. In particular, searching 2000 files for a frequently * occurring term took more than 30 seconds on a PII-400/Linux 2.2.5 * machine. Rewriting the filter as a set of lex rules reduced the search * time to 5 seconds, which is on par with the simple html2txt filter. * * Suggested options for compiling on i386/Linux with egcs 1.1.2/flex 2.5.4: * flex -F -8 htuml2txt.lex * gcc -O3 -fomit-frame-pointer -o htuml2txt lex.yy.c -lfl * * Note: For a smaller, slightly slower executable, omit the -F switch in * the call to flex. * * Caution: The -8 switch MUST be specified if -f or -F is specified! * * Note: It is also necessary to edit .glimpse_filters in the * WebGlimpse database directories. * * Suggested options for compiling with AT&T-style lex: * lex htuml2txt.lex * cc -O -o htuml2txt lex.yy.c -ll * * Written on 5/16/1999 by Christian Vogler * Send bug reports and suggestions to cvogler@gradient.cis.upenn.edu. 
******/ STRING \"([^\"\n\\]|\\\")*\" WHITE [\ \t] /* HTML tags that are to be eliminated altogether, without even a */ /* substitution with a space */ A [aA] B [bB] I [iI] EM [eE][mM] FONT [fF][oO][nN][tT] STRONG [sS][tT][rR][oO][nN][gG] BIG [bB][iI][gG] SUP [sS][uU][pP] SUB [sS][uU][bB] U [uU] STRIKE [sS][tT][rR][iI][kK][eE] STYLE [sS][tT][yY][lL][eE] NSPTAGS ({A}|{B}|{I}|{EM}|{FONT}|{STRONG}|{BIG}|{SUP}|{SUB}|{U}|{STRIKE}|{STYLE}) /* These allocate the necessary space to make AT&T lex work. */ /* flex ignores them. */ %e 4000 %p 10000 %n 2000 /* treat inside of HTML comments and tags specially, to ensure that */ /* everything inside them is eliminated, even if they contain quotes */ %s COMMENT %s TAG %s BEGINTAG %% [^\-\"\n\r]+ {/* This ruleset eats up all */} -+[^\-\>\"\n\r]+ {/* HTML comments */} -\> {/* none */} {STRING} {/* none */} -{2,}\> BEGIN(INITIAL); [^\"\>\r\n]+ {/* This ruleset discards all */} {STRING} {/* HTML tags */} \> BEGIN(INITIAL); {WHITE}+ {/* eat whitespace to find tag name */} !-- BEGIN(COMMENT); /* HTML comment */ \/ {/* eat slash in tags */} {NSPTAGS} BEGIN(TAG); /* tag to be eliminated altogether */ \> { fputc(' ', yyout); BEGIN(INITIAL); /* whoa. Empty tag?!? 
Replace with space */ }; [A-Za-z0-9]+ | [^\r\n] { fputc(' ', yyout); BEGIN(TAG); /* all else is a tag to be replaced with a space */ } \< BEGIN(BEGINTAG); /* tag that must be analyzed further (comment, spacing tag, non-spacing tag) */   fputc(' ', yyout); /* replace special */ ¡ fputc('¡', yyout); /* HTML odes with */ ¡ fputc('¡', yyout); /* corresponding ISO */ ¢ fputc('¢', yyout); /* codes */ ¢ fputc('¢', yyout); £ fputc('£', yyout); £ fputc('£', yyout); ¤ fputc('¤', yyout); ¤ fputc('¤', yyout); ¥ fputc('¥', yyout); ¥ fputc('¥', yyout); ¦ fputc('¦', yyout); ¦ fputc('¦', yyout); § fputc('§', yyout); § fputc('§', yyout); ¨ fputc('¨', yyout); ¨ fputc('¨', yyout); © fputc('©', yyout); © fputc('©', yyout); ª fputc('ª', yyout); ª fputc('ª', yyout); « fputc('«', yyout); « fputc('«', yyout); ¬ fputc('¬', yyout); ¬ fputc('¬', yyout); ­ fputc('\\', yyout); ­ fputc('\\', yyout); ® fputc('®', yyout); ® fputc('®', yyout); ¯ fputc('¯', yyout); ¯ fputc('¯', yyout); ° fputc('°', yyout); ° fputc('°', yyout); ± fputc('±', yyout); ± fputc('±', yyout); ² fputc('²', yyout); ² fputc('²', yyout); ³ fputc('³', yyout); ³ fputc('³', yyout); ´ fputc('´', yyout); ´ fputc('´', yyout); µ fputc('µ', yyout); µ fputc('µ', yyout); ¶ fputc('¶', yyout); ¶ fputc('¶', yyout); · fputc('·', yyout); · fputc('·', yyout); ¸ fputc('¸', yyout); ¸ fputc('¸', yyout); ¹ fputc('¹', yyout); ¹ fputc('¹', yyout); º fputc('º', yyout); º fputc('º', yyout); » fputc('»', yyout); » fputc('»', yyout); ¼ fputc('¼', yyout); ¼ fputc('¼', yyout); ½ fputc('½', yyout); ½ fputc('½', yyout); ¾ fputc('¾', yyout); ¾ fputc('¾', yyout); ¿ fputc('¿', yyout); ¿ fputc('¿', yyout); À fputc('À', yyout); À fputc('À', yyout); Á fputc('Á', yyout); Á fputc('Á', yyout); Â fputc('Â', yyout); ˆ fputc('Â', yyout); Ã fputc('Ã', yyout); Ã fputc('Ã', yyout); Ä fputc('Ä', yyout); Ä fputc('Ä', yyout); Å fputc('Å', yyout); ˚ fputc('Å', yyout); Æ fputc('Æ', yyout); Æ fputc('Æ', yyout); Ç fputc('Ç', yyout); Ç fputc('Ç', yyout); È fputc('È', yyout); È 
fputc('È', yyout); É fputc('É', yyout); É fputc('É', yyout); Ê fputc('Ê', yyout); Ê fputc('Ê', yyout); Ë fputc('Ë', yyout); Ë fputc('Ë', yyout); Ì fputc('Ì', yyout); Ì fputc('Ì', yyout); Í fputc('Í', yyout); Í fputc('Í', yyout); Î fputc('Î', yyout); Î fputc('Î', yyout); Ï fputc('Ï', yyout); Ï fputc('Ï', yyout); Ð fputc('Ð', yyout); Ð fputc('Ð', yyout); Ñ fputc('Ñ', yyout); Ñ fputc('Ñ', yyout); Ò fputc('Ò', yyout); Ò fputc('Ò', yyout); Ó fputc('Ó', yyout); Ó fputc('Ó', yyout); Ô fputc('Ô', yyout); Ô fputc('Ô', yyout); Õ fputc('Õ', yyout); Õ fputc('Õ', yyout); Ö fputc('Ö', yyout); Ö fputc('Ö', yyout); × fputc('×', yyout); × fputc('×', yyout); Ø fputc('Ø', yyout); Ø fputc('Ø', yyout); Ù fputc('Ù', yyout); Ù fputc('Ù', yyout); Ú fputc('Ú', yyout); Ú fputc('Ú', yyout); Û fputc('Û', yyout); Û fputc('Û', yyout); Ü fputc('Ü', yyout); Ü fputc('Ü', yyout); Ý fputc('Ý', yyout); Ý fputc('Ý', yyout); Þ fputc('Þ', yyout); Þ fputc('Þ', yyout); ß fputc('ß', yyout); ß fputc('ß', yyout); à fputc('à', yyout); à fputc('à', yyout); á fputc('á', yyout); á fputc('á', yyout); â fputc('â', yyout); â fputc('â', yyout); ã fputc('ã', yyout); ã fputc('ã', yyout); ä fputc('ä', yyout); ä fputc('ä', yyout); å fputc('å', yyout); å fputc('å', yyout); æ fputc('æ', yyout); æ fputc('æ', yyout); ç fputc('ç', yyout); ç fputc('ç', yyout); è fputc('è', yyout); è fputc('è', yyout); é fputc('é', yyout); é fputc('é', yyout); ê fputc('ê', yyout); ê fputc('ê', yyout); ë fputc('ë', yyout); ë fputc('ë', yyout); ì fputc('ì', yyout); ì fputc('ì', yyout); í fputc('í', yyout); í fputc('í', yyout); î fputc('î', yyout); î fputc('î', yyout); ï fputc('ï', yyout); ï fputc('ï', yyout); ð fputc('ð', yyout); &ieth; fputc('ð', yyout); ñ fputc('ñ', yyout); ñ fputc('ñ', yyout); ò fputc('ò', yyout); ò fputc('ò', yyout); ó fputc('ó', yyout); ó fputc('ó', yyout); ô fputc('ô', yyout); ô fputc('ô', yyout); õ fputc('õ', yyout); õ fputc('õ', yyout); ö fputc('ö', yyout); ö fputc('ö', yyout); ÷ fputc('÷', yyout); ÷ fputc('÷', yyout); ø 
fputc('ø', yyout); ø fputc('ø', yyout); ù fputc('ù', yyout); ù fputc('ù', yyout); ú fputc('ú', yyout); ú fputc('ú', yyout); û fputc('û', yyout); û fputc('û', yyout); ü fputc('ü', yyout); ü fputc('ü', yyout); ý fputc('ý', yyout); ý fputc('ý', yyout); þ fputc('þ', yyout); þ fputc('þ', yyout); ÿ fputc('ÿ', yyout); ÿ fputc('ÿ', yyout); " fputc('\"', yyout); " fputc('\"', yyout); & fputc('&', yyout); & fputc('&', yyout); > fputc('>', yyout); > fputc('>', yyout); < fputc('<', yyout); < fputc('<', yyout); %% /* Define this if the filter is to be loaded as a shared library. This is an experimental option and requires patches to glimpse, at least up to version 4.12.6. The resulting speedup in searches is impressive and well worth the hassle. For the patch and instructions, contact cvogler@gradient.cis.upenn.edu. These patches might be merged into the main glimpse source tree in the future. */ #ifdef SHARED_OBJECT int filter_func(FILE *in, FILE *out) { yyout = out; yyrestart(in); BEGIN(INITIAL); /* necessary to put scanner in known state if previous file contained syntax errors, or unbalanced <, >, " */ while (yylex()) ; return 0; /* all o.k. */ } #else /* filter is loaded as standalone external glimpse filter process. This is the default. 
*/
int main(void)
{
  while (yylex())
    ;
  return 1;
}
#endif
glimpse-4.18.7/dynfilters/sotest.c000066400000000000000000000007441300371307100171230ustar00rootroot00000000000000
#include <stdio.h>
#include <dlfcn.h>

int main(void)
{
  int (*filter)(FILE *in, FILE *out);
  void *handle;
  char *error;

  /* load the shared-object build of the filter from the current directory */
  if ((handle = dlopen("./htuml2txt.so", RTLD_NOW)) == NULL) {
    fprintf(stderr, "sotest: %s\n", dlerror());
    return 1;
  }
  filter = dlsym(handle, "filter_func");
  if ((error = dlerror()) != NULL) {
    fprintf(stderr, "sotest: %s\n", error);
    return 1;
  }
  (*filter)(stdin, stdout);
  dlclose(handle);
  return 0;
}
glimpse-4.18.7/genpatch000066400000000000000000000007461300371307100147710ustar00rootroot00000000000000
#!/bin/sh
# $Id: genpatch,v 1.1 1999/11/03 20:36:24 golda Exp $
PATH=/bin:/usr/bin:/usr/local/bin ; export PATH
ROOT=${1-.}
RLOGFLAGS="-L -R"
tmpfile="/tmp/findco$$"
for rcsdir in `find ${ROOT} -name RCS -type d -print` ; do
  rlog ${RLOGFLAGS} ${rcsdir}/* > ${tmpfile}
  if [ -s "${tmpfile}" ] ; then
    echo "# Files in ${rcsdir}:"
    for f in `cat ${tmpfile}` ; do
      f2=`echo $f | sed -e 's@RCS/@@' -e 's@,v$@@'`
      rcsdiff -c ${f2}
    done
  fi
done
rm -f ${tmpfile}
glimpse-4.18.7/gentar000077500000000000000000000014751300371307100144630ustar00rootroot00000000000000
#!/bin/sh
# $Id: gentar,v 1.1 1998/04/27 16:11:23 pab Exp $
#
# Build a tar file image of this directory, checking out files from RCS.

# What version to build?
RELVER=${1-DEV}
srcdir="./glimpse-${RELVER}-src"

# Safety check---don't overwrite existing directory
if [ -d "${srcdir}" ] ; then
  echo "$0: Please remove existing source archive ${srcdir}"
  exit 1
fi

# Get the hierarchy first
dirs=`find . -type d`

# Now create the duplication area
mkdir ${srcdir}
cdir=`pwd`

# Duplicate the directory hierarchy; if the directory has an RCS area,
# check out its files, then remove the RCS link.
for d in ${dirs} ; do mkdir -p ${srcdir}/${d} if [ -e ${d}/RCS ] ; then (cd ${srcdir}/${d} ; ln -s ${cdir}/${d}/RCS ; co -f RCS/* ; rm RCS) fi done # Put all that into a tar file tar cf glimpse-${RELVER}-src.tar ${srcdir} glimpse-4.18.7/get_filename.c000066400000000000000000000645401300371307100160420ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ #include #include #include #include "glimpse.h" #include #define CHAR unsigned char /* ---------------------------------------------------------------------- get_filenames() input: an index table, (an index vector, i-th entry is ON if i-th partition is to be searched.), the partition table in src_index_set[] and the list of all files in "NAME_LIST". output: the list of filenames to be searched. ------------------------------------------------------------------------- */ #if BG_DEBUG extern FILE *debug; #endif /*BG_DEBUG*/ extern int p_table[MAX_PARTITION]; extern CHAR **GTextfiles; extern CHAR **GTextfilenames; extern int *GFileIndex; extern int GNumfiles; extern CHAR GProgname[]; extern CHAR FileNamePat[]; extern int MATCHFILE; extern int agrep_outpointer; extern int mask_int[32]; extern int OneFilePerBlock; extern char INDEX_DIR[MAX_LINE_LEN]; extern unsigned int *multi_dest_index_set[MAXNUM_PAT]; extern int file_num; /* in index/io.c */ int bigbuffer_size; int first_line_len = 0; char *bigbuffer = NULL; /* constant buffer to read all filenames in NAME_LIST */ char *outputbuffer = NULL; /* keeps changing: used for -F search via memagrep */ int outputbuffer_len = 0; extern int REAL_PARTITION, REAL_INDEX_BUF, MAX_ALL_INDEX, FILEMASK_SIZE; read_filenames() { struct stat st; unsigned char buffer[MAX_NAME_SIZE]; char *currptr; int i; /* one time processing: assumes during one run of glimpse, the index remains constant! 
*/ if (bigbuffer == NULL) { FILE *fp = fopen(NAME_LIST, "r"); if (fp == NULL) { fprintf(stderr, "Can't open for reading: %s/%s\n", INDEX_DIR, NAME_LIST); exit(2); } if (-1 == stat(NAME_LIST, &st)) { fclose(fp); fprintf(stderr, "Can't stat: %s/%s\n", INDEX_DIR, NAME_LIST); exit(2); } fgets(buffer, MAX_NAME_SIZE, fp); first_line_len = strlen(buffer); bigbuffer_size = st.st_size - first_line_len; sscanf(buffer, "%d", &file_num); if ((file_num < 0) || (file_num > MaxNum24bPartition)) { fclose(fp); fprintf(stderr, "Error in reading: %s/%s\n", INDEX_DIR, NAME_LIST); exit(2); } if (file_num == 0) { fclose(fp); fprintf(stderr, "Warning: No files were indexed! Exiting...\n"); exit(2); } initialize_data_structures(file_num); for (i=0; i DEF_MAX_INDEX_PERCENT/2) && (num_blocks > MaxNum8bPartition)) return slow_mask_filenames(index_vect, infile); for (i=0; i= maxcount) { /* first time (not compressing into smaller code since I want it to be similar to slow_mask... below) */ ret = (num_blocks - 1)*sizeof(int); num_read += ret; maxcount = num_blocks; for (i=0; i<=ret /* to process last one also */; i+=sizeof(int), count++) { readoffset = temp_bigbuffer_offset[1 + i/sizeof(int)]; /* printf("readoffset=%d\n", readoffset); */ if ((offset >= prevreadoffset) && (offset < readoffset)) { /* printf("count=%d\n", count); */ if (OneFilePerBlock) multi_dest_index_set[0][block2index(temp_bigbuffer_index[count])] |= mask_int[temp_bigbuffer_index[count] % 32]; else { for (; l= p_table[l]) && (temp_bigbuffer_index[count] < p_table[l+1])) { multi_dest_index_set[0][l] = 1; break; /* out of for */ } } /* can't come here without break: if it does (serious!) 
will break out w/o setting anything */ } prevreadoffset = readoffset; i += sizeof(int); count ++; found = 1; break; /* out of for */ } prevreadoffset = readoffset; } } else { for (; i<=ret /* to process last one also */; i+=sizeof(int), count++) { readoffset = temp_bigbuffer_offset[1 + i/sizeof(int)]; /* printf("readoffset=%d\n", readoffset); */ if ((offset >= prevreadoffset) && (offset < readoffset)) { /* printf("count=%d\n", count); */ if (OneFilePerBlock) multi_dest_index_set[0][block2index(temp_bigbuffer_index[count])] |= mask_int[temp_bigbuffer_index[count] % 32]; else { for (; l= p_table[l]) && (temp_bigbuffer_index[count] < p_table[l+1])) { multi_dest_index_set[0][l] = 1; break; /* out of for */ } } /* can't come here without break: if it does (serious!) will break out without setting anything */ } prevreadoffset = readoffset; i += sizeof(int); count ++; found = 1; break; /* out of for */ } prevreadoffset = readoffset; } } } } /* Now AND the incoming mask with the one constructed above */ if (OneFilePerBlock) { for (i=0; i= maxcount) { if (num_read= prevreadoffset) && (offset < readoffset)) { /* printf("count=%d\n", count); */ if (OneFilePerBlock) multi_dest_index_set[0][block2index(count)] |= mask_int[count % 32]; else { for (; l= p_table[l]) && (count < p_table[l+1])) { multi_dest_index_set[0][l] = 1; break; /* out of for */ } } /* can't come here without break: if it does (serious!) will break out w/o setting anything */ } prevreadoffset = readoffset; i += sizeof(int); count ++; found = 1; break; /* out of for */ } prevreadoffset = readoffset; } } else if ((offset >= prevreadoffset) && (offset < name_list_size)) { /* printf("count=%d\n", count); */ if (OneFilePerBlock) multi_dest_index_set[0][block2index(count)] |= mask_int[count % 32]; else { for (; l= p_table[l]) && (count < p_table[l+1])) { multi_dest_index_set[0][l] = 1; break; /* out of for */ } } /* can't come here without break: if it does (serious!) 
will break out without setting anything */ } count ++; found = 1; } else goto endofinput; /* since this offset >= name_list_size and there's no more input after that */ } else { for (; i= prevreadoffset) && (offset < readoffset)) { /* printf("count=%d\n", count); */ if (OneFilePerBlock) multi_dest_index_set[0][block2index(count)] |= mask_int[count % 32]; else { for (; l= p_table[l]) && (count < p_table[l+1])) { multi_dest_index_set[0][l] = 1; break; /* out of for */ } } /* can't come here without break: if it does (serious!) will break out without setting anything */ } prevreadoffset = readoffset; i += sizeof(int); count ++; found = 1; break; /* out of for */ } prevreadoffset = readoffset; } } } } endofinput: /* Now AND the incoming mask with the one constructed above */ if (OneFilePerBlock) { for (i=0; i 0) ? round(file_num, 8*sizeof(int)) : MAX_PARTITION); i++) if(index_vect[i]) fprintf(debug, "i=%d,%x\n", i, index_vect[i]); #endif /*BG_DEBUG*/ GNumfiles = 0; filesseen = 0; endptr = beginptr = bigbuffer + MAX_PAT; if(MATCHFILE == OFF) { /* just copy the filenames */ if (OneFilePerBlock) { for (i=0; i= file_num) goto end_files; end_of_loop1: beginptr = endptr = endptr + 1; /* skip over '\n' */ filesseen ++; } } } } /* one file per block */ else { /* Just the outer for-loop and initial begin/end values are different: rest is same */ for (i=0; i 0) { start = p_table[i]; end = p_table[i+1]; if (start >= end) continue; #if BG_DEBUG fprintf(debug, "start=%d, end=%d\n", start, end); #endif /*BG_DEBUG*/ /* * skip over so many filenames and get the filenames to copy. * NOTE: successive "start"s ALWAYS increase. 
*/ while(filesseen < start) { while(*beginptr != '\n') beginptr ++; beginptr ++; /* skip over '\n' */ filesseen ++; } endptr = beginptr; while (filesseen < end) { while(*endptr != '\n') endptr ++; if (endptr == beginptr + 1) goto end_of_loop2; /* null name of non-existent file */ *endptr = '\0'; /* return with all the names you COULD get */ if ((GTextfiles[GNumfiles] = (CHAR *)strdup(beginptr)) == NULL) { *endptr = '\n'; fprintf(stderr, "Out of memory at: %s:%d\n", __FILE__, __LINE__); return; } GFileIndex[GNumfiles] = filesseen; *endptr = '\n'; if (++GNumfiles >= file_num) goto end_files; end_of_loop2: beginptr = endptr = endptr + 1; /* skip over '\n' */ filesseen ++; } } } } } else { /* search and copy matched filenames */ extern int REGEX, FASTREGEX, D, WORDBOUND; /* agrep global which tells us whether the pattern is a regular expression or not, and if there are errors w/ -w */ int myREGEX, myFASTREGEX, myD, myWORDBOUND; errno = 0; if ((dummylen = memagrep_init(argc, argv, MAX_PAT, dummypat)) <= 0) goto end_files; memcpy(tempbuf, bigbuffer, bigbuffer_size >= MAX_PAT ? MAX_PAT*3 : MAX_PAT*2 + bigbuffer_size); ret = memagrep_search(dummylen, dummypat, dummylen*2, beginptr, outputbuffer_len, outputbuffer); memcpy(bigbuffer, tempbuf, bigbuffer_size >= MAX_PAT ? 
MAX_PAT*3 : MAX_PAT*2 + bigbuffer_size); myREGEX = REGEX; myFASTREGEX = FASTREGEX; myD = D; myWORDBOUND = WORDBOUND; if (OneFilePerBlock) { for (i=0; i 0) { #if BG_DEBUG { char c = outputbuffer[agrep_outpointer + 1]; outputbuffer[agrep_outpointer + 1] = '\0'; fprintf(debug, "OUTPUTBUFFER=%s\n", outputbuffer); outputbuffer[agrep_outpointer + 1] = c; } #endif /*BG_DEBUG*/ k = prevk = 0; #if EACHOPTION #else while (outputbuffer[k] == '\n') { k ++; prevk ++; } #endif while(k+1= file_num) goto end_files; k = prevk = k+1; } } } else { index_vect[i] &= ~mask_int[j]; /* remove it from the list: used if ByteLevelIndex */ } end_of_loop3: beginptr = endptr = endptr + 1; } } } /* one file per block */ else { /* Just the outer for-loop and initial begin/end values are different: rest is same */ for (i=0; i 0) { start = p_table[i]; end = p_table[i+1]; if (start >= end) continue; #if BG_DEBUG fprintf(debug, "start=%d, end=%d\n", start, end); #endif /*BG_DEBUG*/ /* * skip over so many filenames and get the region to search = * beginptr to endptr: NOTE: successive "start"s ALWAYS increase. 
*/ while(filesseen < start) { while(*beginptr != '\n') beginptr ++; beginptr ++; /* skip over '\n' */ filesseen ++; } beginptr --; /* I need '\n' for memory search */ endptr = beginptr+1; while (filesseen < end) { while(*endptr != '\n') endptr ++; endptr ++; /* skip over '\n' */ filesseen ++; } endptr --; /* I need '\n' for memory search */ if (endptr == beginptr + 1) goto end_of_loop4; /* null name of non-existent file */ #if BG_DEBUG *endptr = '\0'; fprintf(debug, "From %d searching:\n%s\n", filesseen, beginptr+1); *endptr = '\n'; #endif /*BG_DEBUG*/ /* if file in the partition matches then copy it */ #if EACHOPTION if (myREGEX || myFASTREGEX || (myD && myWORDBOUND)) ret = memagrep_search(dummylen, dummypat, endptr-beginptr + 1, beginptr, outputbuffer_len, outputbuffer); else ret = memagrep_search(dummylen, dummypat, endptr-beginptr/* + 1*/, beginptr+1, outputbuffer_len, outputbuffer); #else /* beginptr points to '\n', entptr+1 points to '\n' */ ret = memagrep_search(dummylen, dummypat, endptr-beginptr+1, beginptr, outputbuffer_len, outputbuffer); #endif if (ret > 0) { k = prevk = 0; #if EACHOPTION #else while (outputbuffer[k] == '\n') { k ++; prevk ++; } #endif while(k+1= file_num) goto end_files; k = prevk = k+1; } } } else { index_vect[i] = 0; /* mask it off */ } end_of_loop4: beginptr = endptr = endptr + 1; } } } } end_files: #if BG_DEBUG fprintf(debug, "The following %d filenames are ON\n", GNumfiles); for (i=0; i #if BG_DEBUG extern FILE *debug; #endif /*BG_DEBUG*/ extern char INDEX_DIR[MAX_LINE_LEN]; extern int Only_first; extern int PRINTAPPXFILEMATCH; extern int OneFilePerBlock; extern int StructuredIndex; extern int WHOLEFILESCOPE; extern unsigned int *dest_index_set; extern unsigned char *dest_index_buf; extern int mask_int[32]; extern int errno; extern int ByteLevelIndex; extern int RecordLevelIndex; extern int rdelim_len; extern char rdelim[MAX_LINE_LEN]; extern char old_rdelim[MAX_LINE_LEN]; extern int NOBYTELEVEL; extern int OPTIMIZEBYTELEVEL; 
extern int RegionLimit; extern int PRINTINDEXLINE; extern struct offsets **src_offset_table; extern unsigned int *multi_dest_index_set[MAXNUM_PAT]; extern struct offsets **multi_dest_offset_table[MAXNUM_PAT]; extern char *index_argv[MAX_ARGS]; extern int index_argc; extern CHAR GProgname[MAXNAME]; extern FILE *indexfp, *minifp; extern int REAL_PARTITION, REAL_INDEX_BUF, MAX_ALL_INDEX, FILEMASK_SIZE; extern int p_table[MAX_PARTITION]; extern int GNumpartitions; extern int INVERSE; /* agrep's global: need here to implement ~ in index-search */ extern int last_Y_filenumber; #define USEFREQUENCIES 0 /* set to one if we want to stop collecting offsets sometimes since words "look" like they are in the stop list... */ free_list(p1) struct offsets **p1; { struct offsets *tp1; while (*p1 != NULL) { tp1 = *p1; *p1 = (*p1)->next; my_free(tp1, sizeof(struct offsets)); } } /* Unions offset lists list2 with list1 sorted in increasing order (deletes elements from list2) => changes both list1 and list2: f += #elems added */ sorted_union(list1, list2, f, pf, cf) struct offsets **list1, **list2; int *f, pf, cf; { register struct offsets **p1 = list1, *p2; register int count = *f; /* don't update *f if setting NOBYTELEVEL */ if (!RecordLevelIndex && NOBYTELEVEL) { /* cannot come here! 
*/ free_list(list1); free_list(list2); return; } #if USEFREQUENCIES if (!RecordLevelIndex && ( ((pf > MIN_OCCURRENCES) && (count > MAX_UNION * pf)) || (count > MAX_ABSOLUTE) || ((count > MIN_OCCURRENCES) && (pf > MAX_UNION * count)) || (pf > MAX_ABSOLUTE) )) { /* enough if we check the second condition at the beginning since it won't surely be satisfied after this when count ++ */ NOBYTELEVEL = 1; return; } #endif while (*list2 != NULL) { /* extract 1st element, update list2 */ p2 = *list2; *list2 = (*list2)->next; p2->next = NULL; /* find position to insert p2, and do so */ p1 = list1; while (((*p1) != NULL) && ((*p1)->offset < p2->offset)) p1 = &(*p1)->next; if (*p1 == NULL) { /* end of list1: append list2 to it and return */ *p1 = p2; p2->next = *list2; *list2 = NULL; if (cf > 0) count = *f + cf; #if USEFREQUENCIES if (!RecordLevelIndex && ( ((pf > MIN_OCCURRENCES) && (count > MAX_UNION * pf)) || (count > MAX_ABSOLUTE))) { NOBYTELEVEL = 1; return; } #endif *f = count; return; } else if (p2->offset == (*p1)->offset) my_free(p2, sizeof(struct offsets)); else { p2->next = *p1; *p1 = p2; count ++; #if USEFREQUENCIES if (!RecordLevelIndex && ( ((pf > MIN_OCCURRENCES) && (count > MAX_UNION * pf)) || (count > MAX_ABSOLUTE) )) { NOBYTELEVEL = 1; return; } #endif /* update list1 */ list1 = &(*p1)->next; } } *f = count; } /* Intersects offset lists list2 with list1 sorted in increasing order (deletes elements from list2) => changes both list1 and list2 */ sorted_intersection(filenum, list1, list2, f) struct offsets **list1, **list2; int *f; { register struct offsets **p1 = list1, *p2, *tp1; register int diff; struct offsets *tp; if (!RecordLevelIndex && NOBYTELEVEL) { /* cannot come here! 
*/ free_list(list1); free_list(list2); return; } /* NOT NECESSARY SINCE done INITIALIZED TO 0 ON CREATION AND MADE 0 BELOW tp = *list1; while (tp != NULL) { tp->done = 0; tp = tp->next; } */ #if 0 printf("sorted_intersection BEGIN: list1=\n\t"); tp = *list1; while (tp != NULL) { printf("%d ", tp->offset); tp = tp->next; } printf("\n"); printf("list2=\n\t"); tp = *list2; while (tp != NULL) { printf("%d ", tp->offset); tp = tp->next; } printf("\n"); #endif /* find position to intersect list2, and do so: REMEBER: list1 is in increasing order, and so is list2 !!! */ p1 = list1; while ( ((*p1) != NULL) && (*list2 != NULL) ) { diff = (*list2)->offset - (*p1)->offset; if ( (diff == 0) || (!RecordLevelIndex && (diff >= -RegionLimit) && (diff <= RegionLimit)) ) { (*p1)->done = 1; /* p1 is in */ p1 = &(*p1)->next; /* Can't increment p2 here since it might keep others after p1 also in */ } else { if (diff < 0) { p2 = *list2; *list2 = (*list2)->next; my_free(p2, sizeof(struct offsets)); /* p1 can intersect with list2's next */ } else { if((*p1)->done && 0) p1 = &(*p1)->next; /* imposs */ /* THIS CHECK ALWAYS YEILDS 0 FROM 25/08/1996: bgopal@cs.arizona.edu */ else { tp1 = *p1; *p1 = (*p1)->next; my_free(tp1, sizeof(struct offsets)); (*f) --; } /* list2 can intersect with p1's next */ } } } while (*list2 != NULL) { p2 = *list2; *list2 = (*list2)->next; my_free(p2, sizeof(struct offsets)); } p1 = list1; while (*p1 != NULL) { if ((*p1)->done == 0) { tp1 = *p1; *p1 = (*p1)->next; my_free(tp1, sizeof(struct offsets)); (*f) --; } else { (*p1)->done = 0; /* for the next round! 
*/ p1 = &(*p1)->next; } } #if 0 printf("sorted_intersection END: list1=\n\t"); tp = *list1; while (tp != NULL) { printf("%d ", tp->offset); tp = tp->next; } printf("\n"); printf("list2=\n\t"); tp = *list2; while (tp != NULL) { printf("%d ", tp->offset); tp = tp->next; } printf("\n"); #endif } purge_offsets(p1) struct offsets **p1; { struct offsets *tp1; while (*p1 != NULL) { if ((*p1)->sign == 0) { tp1 = *p1; (*p1) = (*p1)->next; my_free(tp1, sizeof(struct offsets)); } else p1 = &(*p1)->next; } } /* Returns 1 if it is a Universal set, 0 otherwise. Constraint: WORD_END_MARK/ALL_INDEX_MARK must occur at or after buffer[0] */ get_set(buffer, set, offset_table, patlen, pattern, patattr, outfile, partfp, frequency, prevfreq) unsigned char *buffer; unsigned int *set; struct offsets **offset_table; int patlen; char *pattern; int patattr; FILE *outfile; FILE *partfp; int *frequency, prevfreq; { int bdx2, j; int ret; int x=0, y=0, diff, even_words=1, prevy; int indexattr = 0; struct offsets *o, *tailo, *heado; int delim = encode8b(0); int curfreq = 0; unsigned char c; /* buffer[0] is '\n', search must start from buffer[1] */ bdx2 = 1; if (OneFilePerBlock) while((bdx2= REAL_INDEX_BUF+1) return 0; if (StructuredIndex) { if (StructuredIndex < MaxNum8bPartition - 1) { indexattr = decode8b(buffer[bdx2+1]); } else { indexattr = decode16b((buffer[bdx2+1] << 8) | (buffer[bdx2 + 2])); } /* printf("i=%d p=%d\n", indexattr, patattr); */ if ((patattr > 0) && (indexattr != patattr)) { #if BG_DEBUG fprintf(debug, "indexattr=%d DOES NOT MATCH patattr=%d\n", indexattr, patattr); #endif /*BG_DEBUG*/ return 0; } } if (PRINTINDEXLINE) { c = buffer[bdx2]; buffer[bdx2] = '\0'; printf("%s %d", &buffer[1], indexattr); buffer[bdx2] = c; if (c == ALL_INDEX_MARK) printf(" ! 
"); else printf(" : "); } if (OneFilePerBlock && (buffer[bdx2] == ALL_INDEX_MARK)) { /* A intersection Univ-set = A: so src_index_set won't change; A union Univ-set = Univ-set: so src_index_set = all 1s */ #if BG_DEBUG buffer[bdx2] = '\0'; fprintf(debug, "All indices search for %s\n", buffer + 1); buffer[bdx2] = ALL_INDEX_MARK; #endif /*BG_DEBUG*/ set[REAL_PARTITION - 1] = 1; for(bdx2=0; bdx2= OneFilePerBlock) break; set[bdx2] |= mask_int[j]; } set[REAL_PARTITION - 1] = 1; if (ByteLevelIndex && !RecordLevelIndex) NOBYTELEVEL = 1; /* With RecordLevelIndex, I want NOBYTELEVEL to be unused (i.e., !NOBYTELEVEL is always true) */ return 1; } else if (!OneFilePerBlock) { /* check only if index+partitions are NOT split */ #if BG_DEBUG buffer[bdx2] = '\0'; fprintf(debug, "memagrep-line: %s\t\tpattern: %s\n", buffer, pattern); #endif /*BG_DEBUG*/ /* ignore if pattern with all its options matches block number sequence: bg+udi: Feb/16/93 */ buffer[bdx2] = '\n'; /* memagrep needs buffer to end with '\n' */ if ((ret = memagrep_search(patlen, pattern, bdx2+1, buffer, 0, outfile)) <= 0) return 0; else buffer[bdx2] = WORD_END_MARK; } if ((StructuredIndex > 0) && (StructuredIndex < MaxNum8bPartition - 1)) bdx2 ++; else if (StructuredIndex > 0) bdx2 += 2; bdx2++; /* bdx2 now points to the first byte of the offset */ even_words = 1; /* Code identical to that in merge_in() in glimpseindex */ if (OneFilePerBlock) { get_block_numbers(&buffer[bdx2], &buffer[bdx2], partfp); while((bdx2 0) && (x >= last_Y_filenumber)) continue; set[block2index(x)] |= block2mask(x); if (PRINTINDEXLINE) { printf("%d [", x); } prevy = 0; if (ByteLevelIndex) { heado = tailo = NULL; curfreq = 0; while ((bdx2MIN_OCCURRENCES)&&(curfreq+*frequency > MAX_UNION*prevfreq)) || (curfreq+*frequency > MAX_ABSOLUTE)) #else 1 #endif ) ) { /* These o's will be in sorted order. Just collect all of them and merge with &offset_table[x]. 
*/ o = (struct offsets *)my_malloc(sizeof(struct offsets)); o->offset = y; o->next = NULL; o->sign = o->done = 0; if (heado == NULL) { heado = o; tailo = o; } else { tailo->next = o; tailo = o; } } else if (!RecordLevelIndex) { if (heado != NULL) free_list(&heado); /* printf("1 "); */ NOBYTELEVEL = 1; /* can't return since have to or the bitmasks */ } if ((bdx2 0) && (p_table[buffer[bdx2]] >= last_Y_filenumber)) { bdx2 ++; continue; } if (PRINTINDEXLINE) { for (j=p_table[buffer[bdx2]]; j 0) && (j >= last_Y_filenumber)) break; else printf("%d [] ", j); } set[buffer[bdx2]] = 1; bdx2++; } } if (PRINTINDEXLINE) { printf("\n"); } return 0; } /* * This is a very simple function: it gets the list of matched lines from the index, * and sets the block numbers corr. to files that need to be searched in "index_tab". * It also sets the file-offsets that have to be searched in "offset_tab" (byte-level). */ get_index(infile, index_tab, offset_tab, pattern, patlen, patattr, index_argv, index_argc, outfile, partfp, parse, first_time) char *infile; unsigned int *index_tab; struct offsets **offset_tab; char *pattern; int patlen; int patattr; char *index_argv[]; int index_argc; FILE *outfile; FILE *partfp; int parse; int first_time; { int i=0, j, iii; FILE *f_in; struct offsets **offsetptr = multi_dest_offset_table[0]; /* cannot be NULL if ByteLevelIndex: main.c takes care of that */ int ret=0; if (OneFilePerBlock && (parse & OR_EXP) && (index_tab[REAL_PARTITION - 1] == 1)) return 0; if (((infile == NULL) || !strcmp(infile, "")) /* || (index_tab == NULL) || (offset_tab == NULL) || (pattern == NULL)*/) return -1; if((f_in = fopen(infile, "r")) == NULL) { fprintf(stderr, "%s: can't open for reading: %s/%s\n", GProgname, INDEX_DIR, infile); return -1; } if (OneFilePerBlock) for(i=0; i= OneFilePerBlock) break; if (dest_index_set[i] & mask_int[j]) dest_index_set[i] &= ~mask_int[j]; else dest_index_set[i] |= mask_int[j]; } } else { for(i=0; i=GNumpartitions-1) break; /* STUPID: get_table 
returns 1 + part_num, where part_num was no. of partitions glimpseindex found */ if ((i == 0) || (i == '\n')) continue; if (dest_index_set[i]) dest_index_set[i] = 0; else dest_index_set[i] = 1; } } } /* Take intersection if parse=ANDPAT or 0 (one terminal pattern), union if OR_EXP; Take care of universal sets in index_tab[REAL_PARTITION - 1] */ if (OneFilePerBlock) { if (parse & OR_EXP) { if (ret) { ret_is_1: index_tab[REAL_PARTITION - 1] = 1; for(i=0; i= OneFilePerBlock) break; index_tab[i] |= mask_int[j]; } if (ByteLevelIndex && !RecordLevelIndex && !NOBYTELEVEL && !(Only_first && !PRINTAPPXFILEMATCH)) for (i=0; i= OneFilePerBlock) break; index_tab[i] |= mask_int[j]; } } first_time = 0; if (ByteLevelIndex && !RecordLevelIndex && !NOBYTELEVEL && !(Only_first && !PRINTAPPXFILEMATCH)) for (i=0; i 0) ? round(OneFilePerBlock, 8*sizeof(int)) : MAX_PARTITION); i++) { if(index_tab[i]) fprintf(debug, "%d,%x\n", i, index_tab[i]); } #endif /*BG_DEBUG*/ fclose(f_in); return 0; } /* * Same as above, but uses mgrep to search the index for many patterns at one go, * and interprets the output obtained from the -M and -P options (set in main.c). 
*/ mgrep_get_index(infile, index_tab, offset_tab, pat_list, pat_lens, pat_attr, mgrep_pat_index, num_mgrep_pat, patbufpos, index_argv, index_argc, outfile, partfp, parse, first_time) char *infile; unsigned int *index_tab; struct offsets **offset_tab; char *pat_list[]; int pat_lens[]; int pat_attr[]; int mgrep_pat_index[]; int num_mgrep_pat; int patbufpos; char *index_argv[]; int index_argc; FILE *outfile; FILE *partfp; int parse; int first_time; { int i=0, j, temp, iii, jjj; FILE *f_in; int ret; int x=0, y=0, even_words=1; int patnum; unsigned int *setptr; struct offsets **offsetptr; CHAR dummypat[MAX_PAT]; int dummylen=0; char allindexmark[MAXNUM_PAT]; int k; int sorted[MAXNUM_PAT], min, max; if (OneFilePerBlock && (parse & OR_EXP) && (index_tab[REAL_PARTITION - 1] == 1)) return 0; /* Do the mgrep() */ if ((f_in = fopen(infile, "w")) == NULL) { fprintf(stderr, "%s: run out of file descriptors!\n", GProgname); return -1; } errno = 0; if ((ret = fileagrep(index_argc, index_argv, 0, f_in)) < 0) { fprintf(stderr, "%s: error in searching index\n", HARVEST_PREFIX); fclose(f_in); return -1; } fflush(f_in); fclose(f_in); f_in = NULL; index_argv[patbufpos] = NULL; /* For index-search with memgrep and get-filenames */ dummypat[0] = '\0'; if ((dummylen = memagrep_init(index_argc, index_argv, MAX_PAT, dummypat)) <= 0) { fclose(f_in); return -1; } /* Interpret the result */ if((f_in = fopen(infile, "r")) == NULL) { fprintf(stderr, "%s: can't open for reading: %s/%s\n", GProgname, INDEX_DIR, infile); return -1; } if (OneFilePerBlock) { for (patnum=0; patnum num_mgrep_pat)) continue; /* error! 
*/ setptr = multi_dest_index_set[patnum - 1]; offsetptr = multi_dest_offset_table[patnum - 1]; for(k=0; dest_index_buf[k] != ' '; k++); dest_index_buf[k] = '\n'; if (!allindexmark[patnum - 1]) allindexmark[patnum - 1] = (char)get_set(&dest_index_buf[k], setptr, offsetptr, pat_lens[mgrep_pat_index[patnum-1]], pat_list[mgrep_pat_index[patnum-1]], pat_attr[mgrep_pat_index[patnum-1]], outfile, partfp, &setptr[REAL_PARTITION - 2], min); /* To test the maximum disparity to stop unions within above */ if (!allindexmark[patnum-1]) min = setptr[REAL_PARTITION - 2]; for (patnum=0; patnum multi_dest_index_set[max][REAL_PARTITION - 2]) max = patnum; } /* Sort them according to the lengths of the lists in increasing order: min first */ for (patnum=0; patnum MAX_DISPARITY * multi_dest_index_set[sorted[0]][REAL_PARTITION - 2])) { NOBYTELEVEL = 1; /* printf("4 "); */ for (iii=0; iii= OneFilePerBlock) break; index_tab[i] |= mask_int[j]; } if (ByteLevelIndex && !RecordLevelIndex && !NOBYTELEVEL && !(Only_first && !PRINTAPPXFILEMATCH)) /* collect as many offsets as possible with RecordLevelIndex: free offset_tables at the end of process_query() */ for (i=0; i= OneFilePerBlock) break; index_tab[i] |= mask_int[j]; } } first_time = 0; if (ByteLevelIndex && !RecordLevelIndex && !NOBYTELEVEL && !(Only_first && !PRINTAPPXFILEMATCH)) /* collect as many offsets as possible with RecordLevelIndex: free offset_tables at the end of process_query() */ for (i=0; i 0) ? 
round(OneFilePerBlock, 8*sizeof(int)) : MAX_PARTITION); i++) { if(index_tab[i]) fprintf(debug, "%d,%x\n", i, index_tab[i]); } #endif /*BG_DEBUG*/ fclose(f_in); return 0; } /* All borrowed from main.c and are needed for searching the index */ extern CHAR *pat_list[MAXNUM_PAT]; /* complete words within global pattern */ extern int pat_lens[MAXNUM_PAT]; /* their lengths */ extern int pat_attr[MAXNUM_PAT]; /* set of attributes */ extern int num_pat; extern CHAR pat_buf[(MAXNUM_PAT + 2)*MAXPAT]; extern int pat_ptr; extern int is_mgrep_pat[MAXNUM_PAT]; extern int mgrep_pat_index[MAXNUM_PAT]; extern int num_mgrep_pat; extern unsigned int *src_index_set; extern struct offsets **src_offset_table; extern char tempfile[]; extern int patindex; extern int patbufpos; extern ParseTree terminals[MAXNUM_PAT]; extern int GBESTMATCH; /* Should I change -B to -# where # = no. of errors? */ extern int bestmatcherrors; /* set during index search, used later on */ extern FILE *partfp; /* glimpse partitions */ extern FILE *nullfp; /* to discard output: agrep -s doesn't work properly */ extern int ComplexBoolean; extern int num_terminals; #if 0 extern struct token *hash_table[MAX_64K_HASH]; #else /*0*/ extern int mini_array_len; #endif /*0*/ extern int WORDBOUND, NOUPPER, D, LINENUM; int veryfastsearch(argc, argv, num_pat, pat_list, pat_lens, minifp) int argc; char *argv[]; int num_pat; CHAR *pat_list[MAXNUM_PAT]; int pat_lens[MAXNUM_PAT]; FILE *minifp; { /* * Figure out from options if very fast search is possible. */ if (minifp == NULL) return 0; if (!OneFilePerBlock) return 0; /* you did not build index for speed anyway */ if (!(WORDBOUND && NOUPPER && (D<=0))) return 0; if (LINENUM) return 0; return 1; /* if ((num_mgrep_pat == num_pat) || ((1 == num_pat) && (1 == checksg(pat_list[0], D, 0)))) return 1; */ /* either all >= 2 patterns are mgrep-able (simple) or there is just one simple pattern: i.e., "cast" can be used! 
*/ /* return 0; */ } int mini_agrep(inword, inlen, outfp) CHAR *inword; int inlen; FILE *outfp; { static struct stat st; static int statted = 0; unsigned char s[MAX_LINE_LEN], word[MAX_NAME_LEN]; long beginoffset, endoffset, curroffset; unsigned char c; int j, num = 0, cmp, len; if (!statted) { sprintf((char*)s, "%s/%s", INDEX_DIR, INDEX_FILE); if (stat(s, &st) == -1) { fprintf(stderr, "Can't stat file: %s\n", s); exit(2); } statted = 1; } j = 0; while (*inword) { if (*inword == '\\') { inword++; continue; } if (isupper(*(unsigned char *)inword)) word[j] = tolower(*(unsigned char *)inword); else word[j] = *inword; j++; inword ++; } word[j] = '\0'; len = j; if (!get_mini(word, len, &beginoffset, &endoffset, 0, mini_array_len, minifp)) return 0; if (endoffset == -1) endoffset = st.st_size; if (endoffset <= beginoffset) return 0; /* We must find all occurrences of the word (in all attributes) so can't quit when we find the first match */ fseek(indexfp, beginoffset, 0); curroffset = ftell(indexfp); /* = beginoffset */ while ((curroffset < endoffset) && (fgets(s, MAX_LINE_LEN, indexfp) != NULL)) { j = 0; while ((j < MAX_LINE_LEN) && (s[j] != WORD_END_MARK) && (s[j] != ALL_INDEX_MARK) && (s[j] != '\0') && (s[j] != '\n')) j++; if ((j >= MAX_LINE_LEN) || (s[j] == '\0') || (s[j] == '\n')) { curroffset = ftell(indexfp); continue; } /* else it is WORD_END_MARK or ALL_INDEX_MARK */ c = s[j]; s[j] = '\0'; cmp = strcmp(word, s); #if WORD_SORTED if (cmp < 0) break; /* since index is sorted by word */ else #endif /* WORD_SORTED */ if (cmp != 0) { /* not IDENTICALLY EQUAL */ s[j] = c; curroffset = ftell(indexfp); continue; } s[j] = c; fputs(s, outfp); num++; curroffset = ftell(indexfp); } return num; } /* Returns the number of times a successful search was conducted: unused info at present. 
*/ fillup_target(result_index_set, result_offset_table, parse) unsigned int *result_index_set; struct offsets **result_offset_table; long parse; { int i=0; FILE *tmpfp; int dummylen = 0; char dummypat[MAX_PAT]; int successes = 0, ret; int first_time = 1; extern int veryfast; int prev_INVERSE = INVERSE; veryfast = veryfastsearch(index_argc, index_argv, num_pat, pat_list, pat_lens, minifp); while (i < num_pat) { if (!veryfast) { if (is_mgrep_pat[i] && (num_mgrep_pat > 1)) { /* do later */ i++; continue; } strcpy(index_argv[patindex], pat_list[i]); /* i-th pattern in its right position */ } /* printf("pat_list[%d] = %s\n", i, pat_list[i]); */ if ((tmpfp = fopen(tempfile, "w")) == NULL) { fprintf(stderr, "%s: cannot open for writing: %s, errno=%d\n", GProgname, tempfile, errno); return(-1); } errno = 0; /* do we need to check is_mgrep_pat[i] here? */ if (veryfast && is_mgrep_pat[i]) { ret = mini_agrep(pat_list[i], pat_lens[i], tmpfp); } /* If this is the glimpse server, since the process doesn't die, most of its data pages might still remain in memory */ else if ((ret = fileagrep(index_argc, index_argv, 0, tmpfp)) < 0) { /* reinitialization here takes care of agrep_argv changes AFTER split_pattern */ fprintf(stderr, "%s: error in searching index\n", HARVEST_PREFIX); fclose(tmpfp); return(-1); } /* Now, the output of index search is in tempfile: need to use files here since index is too large */ fflush(tmpfp); fclose(tmpfp); tmpfp = NULL; /* Keep track of the maximum number of errors: will never enter veryfast */ if (GBESTMATCH) { if (errno > bestmatcherrors) bestmatcherrors = errno; } /* At this point, all index-search options are properly set due to the above fileagrep */ INVERSE = prev_INVERSE; if (-1 ==get_index(tempfile, result_index_set, result_offset_table, pat_list[i], pat_lens[i], pat_attr[i], index_argv, index_argc, nullfp, partfp, parse, first_time)) return(-1); successes ++; first_time = 0; i++; } fflush(stderr); if (veryfast) return successes; /* For 
index-search with memgrep in mgrep_get_index, and get-filenames */ dummypat[0] = '\0'; if ((dummylen = memagrep_init(index_argc, index_argv, MAX_PAT, dummypat)) <= 0) return(-1); if (num_mgrep_pat > 1) { CHAR *old_buf = (CHAR *)index_argv[patbufpos]; /* avoid my_free and re-my_malloc */ index_argv[patbufpos] = (char*)pat_buf; /* this contains all the patterns with the right -m and -M options */ #if BG_DEBUG fprintf(debug, "pat_buf = %s\n", pat_buf); #endif /*BG_DEBUG*/ strcpy(index_argv[patindex], "-z"); /* no-op: patterns are in patbufpos; also avoid shift-left of index_argv */ if (-1 == mgrep_get_index(tempfile, result_index_set, result_offset_table, pat_list, pat_lens, pat_attr, mgrep_pat_index, num_mgrep_pat, patbufpos, index_argv, index_argc, nullfp, partfp, parse, first_time)) { index_argv[patbufpos] = (char *)old_buf; /* else will my_free array! */ fprintf(stderr, "%s: error in searching index\n", HARVEST_PREFIX); return(-1); } successes ++; first_time = 0; index_argv[patbufpos] = (char *)old_buf; } return successes; } /* * Now, I search the index by doing an in-order traversal of the boolean parse tree starting at GParse. * The results at each node are stored in src_offset_table and src_index_set. Before the right child is * evaluated, results of the left child are stored in curr_offset_table and curr_index_set (accumulators) * and are unioned/intersected/noted with the right child's results (which get stored in src_...) and * passed on above. The accumulators are allocated at each internal node and freed after evaluation. * Left to right evaluation is good since number of curr_offset_tables that exist simultaneously depends * entirely on the maximum depth of a right branch (REAL_PARTITION is small so it won't make a difference). 
*/ int search_index(tree) ParseTree *tree; { int prev_INVERSE; int i, j, iii; int first_time = 0; /* since it is used AFTER left child has been computed */ unsigned int *curr_index_set = NULL; struct offsets **curr_offset_table = NULL; if (ComplexBoolean) { /* recursive */ if (tree == NULL) return -1; if (tree->type == LEAF) { /* always AND pat of individual words at each term: initialize accordingly */ if (OneFilePerBlock) { for(i=0; iterminalindex, tree->terminalindex+1) <= 0) return -1; prev_INVERSE = INVERSE; /* agrep's global to implement NOT */ if (tree->op & NOTPAT) INVERSE = 1; if (fillup_target(src_index_set, src_offset_table, AND_EXP) <= 0) return -1; INVERSE = prev_INVERSE; return 1; } else if (tree->type == INTERNAL) { /* Search the left node and see if the right node can be searched */ if (search_index(tree->data.internal.left) <= 0) return -1; if (OneFilePerBlock && ((tree->op & OPMASK) == ORPAT) && (src_index_set[REAL_PARTITION - 1] == 1)) goto quit; /* nothing to do */ if ((tree->data.internal.right == NULL) || (tree->data.internal.right->type == 0)) return -1; /* uninitialized: see main.c */ curr_index_set = (unsigned int *)my_malloc(sizeof(int)*REAL_PARTITION); memset(curr_index_set, '\0', sizeof(int)*REAL_PARTITION); /* Save previous src_index_set and src_offset_table in fresh accumulators */ if (OneFilePerBlock) { memcpy(curr_index_set, src_index_set, sizeof(int)*REAL_PARTITION); curr_index_set[REAL_PARTITION - 1] = src_index_set[REAL_PARTITION - 1]; src_index_set[REAL_PARTITION - 1] = 0; curr_index_set[REAL_PARTITION - 2] = src_index_set[REAL_PARTITION - 2]; src_index_set[REAL_PARTITION - 2] = 0; } else memcpy(curr_index_set, src_index_set, MAX_PARTITION * sizeof(int)); if (ByteLevelIndex && !NOBYTELEVEL && (RecordLevelIndex || !(Only_first && !PRINTAPPXFILEMATCH))) { if ((curr_offset_table = (struct offsets **)my_malloc(sizeof(struct offsets *) * OneFilePerBlock)) == NULL) { fprintf(stderr, "%s: malloc failure at: %s:%d\n", GProgname, 
__FILE__, __LINE__); my_free(curr_index_set, REAL_PARTITION*sizeof(int)); return -1; } memcpy(curr_offset_table, src_offset_table, OneFilePerBlock * sizeof(struct offsets *)); memset(src_offset_table, '\0', sizeof(struct offsets *) * OneFilePerBlock); } /* Now evaluate the right node which automatically put the results in src_index_set/src_offset_table */ if (search_index(tree->data.internal.right) <= 0) { if (curr_offset_table != NULL) free(curr_offset_table); my_free(curr_index_set, REAL_PARTITION*sizeof(int)); return -1; } /* * Alpha substitution of the code in get_index(): * index_tab <- src_index_set * dest_index_table <- curr_index_set * offset_tab <- src_offset_table * dest_offset_table <- curr_offset_table * ret <- src_index_set[REAL_PARTITION - 1] for ORPAT, curr_index_set for ANDPAT * frequency = src_index_set[REAL_PARTITION - 2] in both ORPAT and ANDPAT * first_time <- 0 * return 0 <- goto quit * Slight difference since we want the results to go to src rather than curr. */ if (OneFilePerBlock) { if ((tree->op & OPMASK) == ORPAT) { if (src_index_set[REAL_PARTITION - 1] == 1) { /* curr..[..] can never be 1 since we would have quit above itself */ ret_is_1: src_index_set[REAL_PARTITION - 1] = 1; for(i=0; i= OneFilePerBlock) break; src_index_set[i] |= mask_int[j]; } if (ByteLevelIndex && !NOBYTELEVEL && !(Only_first && !PRINTAPPXFILEMATCH)) for (i=0; i= OneFilePerBlock) break; src_index_set[i] |= mask_int[j]; } } first_time = 0; if (ByteLevelIndex && !NOBYTELEVEL && !(Only_first && !PRINTAPPXFILEMATCH)) for (i=0; iop & OPMASK) == ORPAT) for(i=0; iop & NOTPAT) { if (ByteLevelIndex) { /* Can't recover the discarded offsets */ fprintf(stderr, "%s: can't handle NOT of AND/OR terms with ByteLevelIndex: please simplify the query\n", HARVEST_PREFIX); my_free(curr_index_set, REAL_PARTITION*sizeof(int)); return -1; } if (OneFilePerBlock) for (i=0; i 0 are displayed. .TP .B \-C tells glimpse to send its queries to \fIglimpseserver\fP. 
.TP .B \-d "'\fIdelim\fP'" Define \fIdelim\fP to be the separator between two records. The default value is '$', namely a record is by default a line. \fIdelim\fP can be a string of size at most 8 (with possible use of ^ and $), but not a regular expression. Text between two \fIdelim\fP's, before the first \fIdelim\fP, and after the last \fIdelim\fP is considered as one record. For example, -d '$$' defines paragraphs as records and -d '^From\ ' defines mail messages as records. \fIglimpse\fP matches each record separately. \fBThis option does not currently work with regular expressions.\fP The -d option is especially useful for Boolean AND queries, because the patterns need not appear in the same line but in the same record. For example, \fIglimpse -F mail -d '^From\ ' 'glimpse;arizona;announcement'\fR will output all mail messages (in their entirety) that have the 3 patterns anywhere in the message (or the header), assuming that files with 'mail' in their name contain mail messages. If you want the scope of the record to be the whole file, use the -W option. \fBGlimpse warning\fP: Use this option with care. If the delimiter is set to match mail messages, for example, and glimpse finds the pattern in a regular file, it may not find the delimiter and will therefore output the whole file. (The -t option - see below - can be used to put the \fIdelim\fP at the end of the record.) \fBPerformance Note:\fP Agrep (and glimpse) resorts to more complex search when the \-d option is used. The search is slower and unfortunately no more than 32 characters can be used in the pattern. .TP .B \-D\fIk\fP Set the cost of a deletion to \fIk\fP (\fIk\fP is a positive integer). This option does not currently work with regular expressions. .TP .BI \-e " pattern" Same as a simple .I pattern argument, but useful when the .I pattern begins with a .RB ` \- '. .TP .B \-E prints the lines in the index (as they appear in the index) which match the pattern. 
Used mostly for debugging and maintenance of the index. This is not an option that a user needs to know about. .TP .B \-f \fIfile_name\fR this option has a different meaning for agrep than for glimpse: In glimpse, only the files whose names are listed in \fIfile_name\fP are matched. (The file names have to appear as in .glimpse_filenames.) In agrep, the file_name contains the list of the patterns that are searched. (Starting at version 3.6, this option for glimpse is much faster for large files.) .TP .B \-F \fIfile_pattern\fR limits the search to those files whose name (including the whole path) matches \fIfile_pattern\fP. This option can be used in a variety of applications to provide limited search even for one large index. If \fIfile_pattern\fP matches a directory, then all files with this directory on their path will be considered. To limit the search to actual file names, use $ at the end of the pattern. \fIfile_pattern\fP can be a regular expression and even a Boolean pattern. This option is implemented by running agrep \fIfile_pattern\fP on the list of file names obtained from the index. Therefore, searching the index itself takes the same amount of time, but limiting the second phase of the search to only a few files can speed up the search significantly. For example, .sp 1 glimpse -F 'src#\\.c$' needle .sp 1 will search for needle in all .c files with src somewhere along the path. The -F \fIfile_pattern\fP must appear before the search pattern (e.g., glimpse needle -F '\\.c$' will not work). It is possible to use some of agrep's options when matching file names. In this case all options as well as the file_pattern should be in quotes. (-B and -v do not work very well as part of a file_pattern.) For example, .sp glimpse -F '-1 \\.html' pattern .sp will allow one spelling error when matching .html to the file names (so ".htm" and ".shtml" will match as well). 
.sp glimpse -F '-v \\.c$' counter .sp will search for 'counter' in all files \fIexcept\fP for .c files. .TP .B \-g prints the file number (its position in the .glimpse_filenames file) rather than its name. .TP .B \-G Output the (whole) files that contain a match. .TP .B \-h Do not display filenames. .TP .B \-H \fIdirectory_name\fR searches for the index and the other .glimpse files in \fIdirectory_name\fP. The default is the home directory. This option is useful, for example, if several different indexes are maintained for different archives (e.g., one for mail messages, one for source code, one for articles). .TP .B \-i Case-insensitive search \(em e.g., "A" and "a" are considered equivalent. Glimpse's index stores all patterns in lower case (see LIMITATIONS below). \fBPerformance Note:\fP When \-i is used together with the \-w option, the search may become much faster. It is recommended to have \-i and \-w as defaults, for example, through an alias. We use the following alias in our .cshrc file .br alias glwi 'glimpse -w -i' .TP .B \-I\fIk\fP Set the cost of an insertion to \fIk\fP (\fIk\fP is a positive integer). This option does not currently work with regular expressions. .TP .B \-j If the index was constructed with the -t option, then \-j will output the files' last modification dates in addition to everything else. There are no major performance penalties for this option. .TP .B \-J \fIhost_name\fP used in conjunction with glimpseserver (\-C) to connect to one particular server. .TP .B \-k No symbol in the pattern is treated as a meta character. For example, glimpse -k 'a(b|c)*d' will find the occurrences of a(b|c)*d whereas glimpse 'a(b|c)*d' will find substrings that match the regular expression 'a(b|c)*d'. (The only exception is ^ at the beginning of the pattern and $ at the end of the pattern, which are still interpreted in the usual way. Use \\^ or \\$ if you need them verbatim.)
.TP .B \-K \fIport_number\fP used in conjunction with glimpseserver (\-C) to connect to one particular server at the specified TCP port number. .TP .B \-l Output only the names of files that contain a match. This option differs from the \-N option in that the files themselves \fIare\fP searched, but the matching lines are not shown. .TP .B \-L x | x:y | x:y:z if one number is given, it is a limit on the total number of matches. Glimpse outputs only the first x matches. If \-l is used (i.e., only file names are sought), then the limit is on the number of files; otherwise, the limit is on the number of records. If two numbers are given (x:y), then y is an added limit on the total number of files. If three numbers are given (x:y:z), then z is an added limit on the number of matches per file. If any of x, y, or z is set to 0, it is ignored (in other words, 0 = infinity in this case); for example, \-L 0:10 will output all matches in the first 10 files that contain a match. This option is particularly useful for servers that need to limit the amount of output provided to clients. .TP .B \-m used for glimpse internals. .TP .B \-M used for glimpse internals. .TP .B \-n Each matching record (line) is prefixed by its record (line) number in the file. \fBPerformance Note:\fP To compute the record/line number, agrep needs to search for all record delimiters (or line breaks), which can slow down the search. .TP .B \-N searches only the index (so the search is faster). If the index was built with -o or -b, then the result is the number of files that have a potential match plus a prompt to ask if you want to see the file names. (If \-y is used, then there is no prompt and the names of the files will be shown.) This could be a way to get the matching file names without even having access to the files themselves. However, because only the index is searched, some potential matches may not be real matches. In other words, with \-N you will not miss any file but you may get extra files.
For example, since the index stores everything in lower case, a case-sensitive query may match a file that has only a case-insensitive match. Boolean queries may match a file that has all the keywords but not in the same line (indexing with \-b allows glimpse to figure out whether the keywords are close, but it cannot figure out from the index whether they are exactly on the same line or in the same record without looking at the file). If the index was not built with \-o or \-b, then this option outputs the number of \fIblocks\fP matching the pattern. This is useful as an indication of how long the search will take. The files are usually partitioned into 200-250 blocks. The file \fB.glimpse_statistics\fP contains the total number of blocks (or \fBglimpse -N a\fP will give a pretty good estimate; only blocks with no occurrences of 'a' will be missed). .TP .B \-o the opposite of \-t: the delimiter is not output at the tail, but at the beginning of the matched record. .TP .B \-O the file names are not printed before every matched record; instead, each filename is printed just once, and all the matched records within it are printed after it. .TP .B \-p (from version 4.0B1 only) Supports reading a compressed set of filenames. The -p option allows you to utilize compressed `neighborhoods' (sets of filenames) to limit your search, without uncompressing them. Added mostly for WebGlimpse. The usage is: .br "-p filename:X:Y:Z" where "filename" is the file with compressed neighborhoods, X is an offset into that file (usually 0, must be a multiple of sizeof(int)), Y is the length glimpse must access from that file (if 0, then whole file; must be a multiple of sizeof(int)), and Z must be 2 (it indicates that "filename" has the sparse-set representation of compressed neighborhoods: the other values are for internal use only). Note that any colon ":" in filename must be escaped using a backslash \. .TP .B \-P used for glimpse internals.
.TP .B \-q prints the offsets of the beginning and end of each matched record. The difference between \-q and \-b is that \-b prints the offsets of the actual matched string, while \-q prints the offsets of the whole record where the match occurred. The output format is @x{y}, where x is the beginning offset and y is the end offset. .TP .B \-Q when used together with \-N, glimpse displays not only the filename where the match occurs but also the exact occurrences (offsets) as seen in the index. This option is relevant only if the index was built with -b; otherwise, the offsets are not available in the index. This option is ignored unless it is used with \-N. .TP .B \-r This option is an agrep option and it will be ignored in glimpse, unless glimpse is used with a file name at the end which makes it run as agrep. If the file name is a directory name, the \-r option will search (recursively) the whole directory and everything below it. (The glimpse index will not be used.) .TP .B \-R \fIk\fP defines the maximum size (in bytes) of a record. The maximum value (which is the default) is 48K. Defining the maximum to be lower than the default may speed up some searches. .TP .B \-s Work silently, that is, display nothing except error messages. This is useful for checking the error status. .TP .B \-S\fIk\fP Set the cost of a substitution to \fIk\fP (\fIk\fP is a positive integer). This option does not currently work with regular expressions. .TP .B \-t Similar to the \-d option, except that the delimiter is assumed to appear at the \fIend\fP of the record. Glimpse will output the record starting from the end of .I delim to (and including) the next .I delim. (See warning for the \-d option.) .TP .B \-T directory Use \fIdirectory\fP as a place where temporary files are built. (Glimpse produces some small temporary files usually in /tmp.)
This option is useful mainly in the context of structured queries for the Harvest project, where the temporary files may be non-trivial, and the /tmp directory may not have enough space for them. .TP .B \-U (starting at version 4.0B1) Interprets an index created with the -X or the -U option in glimpseindex. Useful mostly for WebGlimpse or similar web applications. When glimpse outputs matches, it will display the filename, the URL, and the title automatically. .TP .B \-v (This option is an agrep option and it will be ignored in glimpse, unless glimpse is used with a file name at the end which makes it run as agrep.) Output all records/lines that do \fInot\fP contain a match. (Glimpse does not support the NOT operator yet.) .TP .B \-V prints the current version of glimpse. .TP .B \-w Search for the pattern as a word \(em i.e., surrounded by non-alphanumeric characters. For example, \fIglimpse -w car\fR will match car, but not characters and not car10. The non-alphanumeric \fImust\fP surround the match; they cannot be counted as errors. This option does not work with regular expressions. \fBPerformance Note:\fP When \-w is used together with the \-i option, the search may become much faster. The \-w will not work with $, ^, and _ (see BUGS below). It is recommended to have \-i and \-w as defaults, for example, through an alias. We use the following alias in our .cshrc file .br alias glwi 'glimpse -w -i' .TP .B \-W The default for Boolean AND queries is that they cover one record (the default for a record is one line) at a time. For example, glimpse 'good;bad' will output all lines containing both 'good' and 'bad'. The \-W option changes the scope of Booleans to be the whole file. Within a file glimpse will output all matches to any of the patterns. So, glimpse -W 'good;bad' will output all lines containing 'good' \fIor\fP 'bad', but only in files that contain both patterns. The NOT operator '~' can be used only with \-W. It is described later on. 
The OR operator is essentially unaffected (unless it is in combination with the other Boolean operations). For structured queries, the scope is always the whole attribute or file. .TP .B \-x The pattern must match the whole line. (This option is translated to -w when the index is searched and it is used only when the actual text is searched. It is of limited use in glimpse.) .TP .B \-X (from version 4.0B1 only) Output the names of files that contain a match even if these files have been deleted since the index was built. Without this option glimpse will simply ignore these files. .TP .B \-y Do not prompt. Proceed with the match as if the answer to any prompt is y. Servers (or any other scripts) using glimpse will probably want to use this option. .TP .B \-Y \fIk\fP If the index was constructed with the -t option, then \-Y \fIk\fP will output only matches to files that were created or modified within the last \fIk\fP days. There are no major performance penalties for this option. .TP .B \-z Allow customizable filtering, using the file .glimpse_filters to run the programs listed there for each match. The best example is compress/decompress. If .glimpse_filters includes the line .br *.Z uncompress < .br (separated by tabs) then before indexing any file that matches the pattern "*.Z" (same syntax as the one for .glimpse_exclude) the command listed is executed first (assuming input is from stdin, which is why uncompress needs <) and its output (assuming it goes to stdout) is indexed. The file itself is not changed (i.e., it stays compressed). Then if glimpse -z is used, the same program is used on these files on the fly. Any program can be used (we run 'exec'). For example, one can filter out parts of files that should not be indexed. Glimpseindex tries to apply all filters in .glimpse_filters in the order they are given.
For example, if you want to uncompress a file and then extract some part of it, put the uncompress command (the example above) first and then another line that specifies the extraction. Note that this can slow down the search because the filters need to be run before files are searched. (See also glimpseindex.) .TP .B \-Z No op. (It's useful for glimpse's internals. Trust us.) .LP The characters .RB ` $ ', .RB `^ ', .RB ` \(** ', .RB ` [ ' , .RB ` ] ' , .RB ` \s+2^\s0 ', .RB ` | ', .RB ` ( ', .RB ` ) ', .RB ` ! ', and .RB ` \e ' can cause unexpected results when included in the .IR pattern , as these characters are also meaningful to the shell. To avoid these problems, enclose the entire pattern in single quotes, i.e., 'pattern'. Do not use double quotes ("). .ne 4 .SH PATTERNS .LP \fIglimpse\fP supports a large variety of patterns, including simple strings, strings with classes of characters, sets of strings, wild cards, and regular expressions (see LIMITATIONS). .TP \fBStrings \fP Strings are any sequence of characters, including the special symbols `^' for beginning of line and `$' for end of line. The following special characters ( .RB ` $ ', .RB `^ ', .RB ` \(** ', .RB ` [ ' , .RB ` \s+2^\s0 ', .RB ` | ', .RB ` ( ', .RB ` ) ', .RB ` ! ', and .RB ` \e ' ) as well as the following meta characters special to glimpse (and agrep): .RB ` ; ', .RB ` , ', .RB ` # ', .RB ` < ', .RB ` > ', .RB ` - ', and .RB ` . ', should be preceded by `\\' if they are to be matched as regular characters. For example, \\^abc\\\\ corresponds to the string ^abc\\, whereas ^abc corresponds to the string abc at the beginning of a line. .TP \fBClasses of characters\fP a list of characters inside [] (in order) corresponds to any character from the list. For example, [a-ho-z] is any character between a and h or between o and z. The symbol `^' inside [] complements the list. For example, [^i-n] denotes any character in the character set except the characters 'i' to 'n'.
The symbol `^' thus has two meanings, but this is consistent with egrep. The symbol `.' (don't care) stands for any symbol (except for the newline symbol). .TP \fBBoolean operations\fP .B Glimpse supports an `AND' operation denoted by the symbol `;', an `OR' operation denoted by the symbol `,', a limited version of a 'NOT' operation (starting at version 4.0B1) denoted by the symbol `~', or any combination. For example, \fIglimpse 'pizza;cheeseburger'\fR will output all lines containing both patterns. \fIglimpse -F 'gnu;\\.c$' 'define;DEFAULT'\fR will output all lines containing both 'define' and 'DEFAULT' (anywhere in the line, not necessarily in order) in files whose name contains 'gnu' and ends with .c. \fIglimpse '{political,computer};science'\fR will match 'political science' or 'science of computers'. The NOT operation works only together with the -W option, and it generally applies only to the whole file rather than to individual records. Its output may sometimes seem counterintuitive. Use with care. \fIglimpse -W 'fame;~glory'\fR will output all lines containing 'fame' in all files that contain 'fame' but do not contain 'glory'. This is the most common use of NOT, and in this case it works as expected. \fIglimpse -W '~{fame;glory}'\fR will be limited to files that do not contain both words, and will output all lines containing one of them. .TP \fBWild cards\fP The symbol '#' is used to denote a sequence of any number (including 0) of arbitrary characters (see LIMITATIONS). The symbol # is equivalent to .* in egrep. In fact, .* will work too, because it is a valid regular expression (see below), but unless this is part of an actual regular expression, # will work faster. (Currently glimpse is experiencing some problems with #.) .TP \fBCombination of exact and approximate matching\fP Any pattern inside angle brackets <> must match the text exactly even if the match is with errors.
For example, <mathemat>ics matches mathematical with one error (replacing the last s with an a), but mathe<matics> does not match mathematical no matter how many errors are allowed. (This option is buggy at the moment.) .TP \fBRegular expressions\fP Since the index is word based, a regular expression must match words that appear in the index for glimpse to find it. Glimpse first strips all non-alphabetic characters from the regular expression, and searches the index for all remaining words. It then applies the regular expression matching algorithm to the files found in the index. For example, \fIglimpse\fP 'abc.*xyz' will search the index for all files that contain both 'abc' and 'xyz', and then search directly for 'abc.*xyz' in those files. (If you use glimpse \-w 'abc.*xyz', then 'abcxyz' will not be found, because glimpse will think that abc and xyz need to be matched as whole words.) The syntax of regular expressions in \fBglimpse\fP is in general the same as that for \fBagrep\fP. The union operation `|', Kleene closure `*', and parentheses () are all supported. Currently '+' is not supported. Regular expressions are currently limited to approximately 30 characters (generally excluding meta characters). Some options (\-d, \-w, \-t, \-x, \-D, \-I, \-S) do not currently work with regular expressions. The maximal number of errors for regular expressions that use '*' or '|' is 4. (See LIMITATIONS.) .TP \fBStructured queries\fP Glimpse supports some form of structured queries using Harvest's SOIF format. See STRUCTURED QUERIES below for details. .SH EXAMPLES .LP (Run "glimpse '^glimpse' this-file" to get a list of all examples, some of which were given earlier.) .TP glimpse -F 'haystack.h$' needle finds all occurrences of needle in all haystack.h files. .TP glimpse -2 -F html Anestesiology outputs all occurrences of Anestesiology with at most two errors in files with html somewhere in their full name.
.TP glimpse -l -F '\\.c$' variablename lists the names of all .c files that contain variablename (the -l option lists file names rather than output the matched lines). .TP glimpse -F 'mail;1993' 'windsurfing;Arizona' finds all lines containing \fIwindsurfing\fP and \fIArizona\fP in all files having `mail' and '1993' somewhere in their full name. .TP glimpse -F mail 't.j@#uk' finds all mail addresses (search only files with mail somewhere in their name) from the uk, where the login name ends with t.j, where the . stands for any one character. (This is very useful to find a login name of someone whose middle name you don't know.) .TP glimpse -F mbox -h -G . > MBOX concatenates all files whose name matches `mbox' into one big one. .SH "SEARCHING IN COMPRESSED FILES" .LP Glimpse includes an optional new compression program, called \fIcast\fP, which allows glimpse (and agrep) to search the compressed files without having to decompress them. The search is actually significantly faster when the files are compressed. However, we have not tested \fIcast\fP as thoroughly as we would have liked, and a mishap in a compression algorithm can cause loss of data, so we recommend at this point to use \fIcast\fP very carefully. We do not support or maintain cast. (Unless you specifically use \fIcast\fP, the default is to ignore it.) .SH "GLIMPSEINDEX FILES" .LP All files used by glimpse are located at the directory(ies) where the index(es) is (are) stored and have .glimpse_ as a prefix. The first two files (.glimpse_exclude and .glimpse_include) are optionally supplied by the user. The other files are built and read by glimpse. .LP .IP "\fB.glimpse_exclude\fR" contains a list of files that glimpseindex is explicitly told to ignore. In general, the syntax of .glimpse_exclude/include is the same as that of agrep (or any other grep). The lines in the .glimpse_exclude file are matched to the file names, and if they match, the files are excluded. 
Notice that agrep matches to parts of the string! e.g., agrep /ftp/pub will match /home/ftp/pub and /ftp/pub/whatever. So, if you want to exclude /ftp/pub/core, you just list it, as is, in the .glimpse_exclude file. If you put "/home/ftp/pub/cdrom" in .glimpse_exclude, every file name that matches that string will be excluded, meaning all files below it. You can use ^ to indicate the beginning of a file name, and $ to indicate the end of one, and you can use * and ? in the usual way. For example /ftp/*html will exclude /ftp/pub/foo.html, but will also exclude /home/ftp/pub/html/whatever; if you want to exclude files that start with /ftp and end with html use ^/ftp*html$ Notice that putting a * at the beginning or at the end is redundant (in fact, in this case glimpseindex will remove the * when it does the indexing). No other meta characters are allowed in .glimpse_exclude (e.g., don't use .* or # or |). Lines with * or ? must have no more than 30 characters. Notice that, although the index itself will not be indexed, the list of file names (.glimpse_filenames) will be indexed unless it is explicitly listed in .glimpse_exclude. .IP "\fB.glimpse_filters\fR" See the description above for the -z option. .IP "\fB.glimpse_include\fR" contains a list of files that glimpseindex is explicitly told to \fIinclude\fP in the index even though they may look like non-text files. Symbolic links are followed by glimpseindex only if they are specifically included here. If a file is in both .glimpse_exclude and .glimpse_include it will be excluded. .IP "\fB.glimpse_filenames\fP" contains the list of all indexed file names, one per line. This is an ASCII file that can also be used with agrep to search for a file name leading to a fast find command. For example, .br glimpse 'count#\\.c$' ~/.glimpse_filenames .br will output the names of all (indexed) .c files that have 'count' in their name (including anywhere on the path from the index). 
Setting the following alias in the .login file may be useful:
.br
alias findfile 'glimpse -h \!:1 ~/.glimpse_filenames'
.IP "\fB.glimpse_index\fP"
contains the index.
The index consists of lines, each starting with a word followed by a list of block numbers (unless the -o or -b options are used, in which case each word is followed by an offset into the file .glimpse_partitions where all pointers are kept).
The block/file numbers are stored in binary form, so this is not an ASCII file.
.IP "\fB.glimpse_messages\fP"
contains the output of the -w option (see above).
.IP "\fB.glimpse_partitions\fP"
contains the partition of the indexed space into blocks and, when the index is built with the -o or -b options, some part of the index.
This file is used internally by glimpse and it is a non-ASCII file.
.IP "\fB.glimpse_statistics\fP"
contains some statistics about the makeup of the index.
Useful for some advanced applications and customization of glimpse.
.IP "\fB.glimpse_turbo\fP"
An added data structure (used under glimpseindex -o or -b only) that helps to speed up queries significantly for large indexes.
Its size is 0.25MB.
Glimpse will work without it if needed.
.SH "STRUCTURED QUERIES"
Glimpse can search for Boolean combinations of "attribute=value" terms by using the Harvest SOIF parser library (in glimpse/libtemplate).
To search this way, the index must be made by using the -s option of glimpseindex (this can be used in conjunction with other glimpseindex options).
For glimpse and glimpseindex to recognize "structured" files, they must be in SOIF format.
In this format, each value is prefixed by an attribute-name with the size of the value (in bytes) present in "{}" after the name of the attribute.
For example, the following lines are part of an SOIF file:
.br
.nf
type{17}:       Directory-Listing
md5{32}:        3858c73d68616df0ed58a44d306b12ba
.fi
Any string can serve as an attribute name.
Glimpse "pattern;type=Directory-Listing" will search for "pattern" only in files whose type is "Directory-Listing". The file itself is considered to be one "object" and its name/url appears as the first attribute with an "@" prefix; e.g., @FILE { http://xxx... } The scope of Boolean operations changes from records (lines) to whole files when structured queries are used in glimpse (since individual query terms can look at different attributes and they may not be "covered" by the record/line). Note that glimpse can only search for patterns in the value parts of the SOIF file: there are some attributes (like the TTL, MD5, etc.) that are interpreted by Harvest's internal routines. See RFC 2655 for more detailed information of the SOIF format. .SH "REFERENCES" .IP 1. U. Manber and S. Wu, "GLIMPSE: A Tool to Search Through Entire File Systems," \fIUsenix Winter 1994 Technical Conference\fP (best paper award), San Francisco (January 1994), pp. 23\-32. Also, Technical Report #TR 93-34, Dept. of Computer Science, University of Arizona, October 1993 (a postscript file is available by anonymous ftp at ftp://webglimpse.net/pub/glimpse/TR93-34.ps). .IP 2. S. Wu and U. Manber, "Fast Text Searching Allowing Errors," \fICommunications of the ACM\fP \fB35\fP (October 1992), pp. 83\-91. .SH "SEE ALSO" .BR agrep (1), .BR ed (1), .BR ex (1), .BR glimpseindex (1), .BR glimpseserver (1), .BR grep (1), .BR sh (1), .BR csh (1). .SH LIMITATIONS .LP The index of glimpse is word based. A pattern that contains more than one word cannot be found in the index. The way glimpse overcomes this weakness is by splitting any multi-word pattern into its set of words and looking for all of them in the index. For example, \fBglimpse 'linear programming'\fR will first consult the index to find all files containing both \fIlinear\fP and \fIprogramming\fP, and then apply agrep to find the combined pattern. 
This is usually an effective solution, but it can be slow for cases where both words are very common, but their combination is not.
.LP
As was mentioned in the section on PATTERNS above, some characters serve as meta characters for glimpse and need to be preceded by '\\' to search for them.
The most common examples are the characters '.' (which stands for a wild card), and '*' (the Kleene closure).
So, "glimpse ab.de" will match abcde, but "glimpse ab\\.de" will not, and "glimpse ab*de" will not match ab*de, but "glimpse ab\\*de" will.
The meta character - is translated automatically to a hyphen unless it appears between [] (in which case it denotes a range of characters).
.LP
The index of glimpse stores all patterns in lower case.
When glimpse searches the index it first converts all patterns to lower case, finds the appropriate files, and then searches the actual files using the original patterns.
So, for example, \fIglimpse ABCXYZ\fR will first find all files containing abcxyz in any combination of lower and upper cases, and then search these files directly, so only the right cases will be found.
One problem with this approach is discovering misspellings that are caused by wrong cases.
For example, \fIglimpse -B abcXYZ\fR will first search the index for the best match to abcxyz (because the pattern is converted to lower case); it will find that there are matches with no errors, and will go to those files to search them directly, this time with the original upper cases.
If the closest match is, say, AbcXYZ, glimpse may miss it, because it doesn't expect an error.
Another problem is speed.
If you search for "ATT", it will look at the index for "att".
Unless you use -w to match the whole word, glimpse may have to search all files containing, for example, "Seattle" which has "att" in it.
.LP
There is no size limit for simple patterns and simple patterns within Boolean expressions.
More complicated patterns, such as regular expressions, are currently limited to approximately 30 characters. Lines are limited to 1024 characters. Records are limited to 48K, and may be truncated if they are larger than that. The limit of record length can be changed by modifying the parameter Max_record in agrep.h. .LP Glimpseindex does not index words of size > 64. .SH BUGS .LP In some rare cases, regular expressions using * or # may not match correctly. .LP A query that contains no alphanumeric characters is not recommended (unless glimpse is used as agrep and the file names are provided). This is an understatement. .LP The notion of "match to the whole word" (the \-w option) can be tricky sometimes. For example, glimpse -w 'word$' will not match 'word' appearing at the end of a line, because the extra '$' makes the pattern more than just one simple word. The same thing can happen with ^ and with _. To be on the safe side, use the -w option only when the patterns are actual words. .LP Please send bug reports or comments to gvelez@webglimpse.net. .SH DIAGNOSTICS Exit status is 0 if any matches are found, 1 if none, 2 for syntax errors or inaccessible files. .SH AUTHORS Udi Manber and Burra Gopal, Department of Computer Science, University of Arizona, and Sun Wu, the National Chung-Cheng University, Taiwan. Now maintained by Golda Velez at Internet WorkShop (Email: gvelez@webglimpse.net) glimpse-4.18.7/glimpse.chronicle000066400000000000000000000067331300371307100166070ustar00rootroot000000000000000. Created file on 2/May/94 1. Added patches to main.c to use sizeof(char*) instead of 4 in relevant places. Same for other pointer mallocs. -- bg 2 May 1994 2. Successfully ported to DEC-ALPHA: changes to routines in agrep/agrep.c -- bg 10 May 1994 3. Added the new mgrep routine. Removed bugs (pattern too long) related to using shift-or/and algorithm to search for booleans: now it uses mgrep. -- bg 26-30 May 1994 4. Added delimiter processing even with -f and -m. 
-- bg 31 May - 2 June 1994
5. Added structured queries support in June 1994
   -- Syntax of glimpse:
      glimpse 'a1=v1,a2=v2...' (series of ORs)
      glimpse 'a1=v1;a2=v2...' (series of ANDs)
   -- Syntax of glimpseindex: glimpseindex -s
   -- NOTES: v1, v2 etc. must lie within the range of a1, a2, etc., i.e., if a1 is in the region [offset, offset+len] in a file, v1 must lie completely within that range.
      A new glimpse-file called .glimpse_attributes is created to hold the attributes discovered while indexing with -s.
      A-V searches may not give an error message if the index is not built with -s.
6. Added glimpseserver to speed up queries by reading in the index ahead of time, during July 1994.
7. Integrated compression into glimpse during July 1994. The new option to glimpseindex can now index compressed files (those compressed with tcomp).
8. Added multipattern search for simple patterns with -w in compressed files during August 1994.
9. Ability to take input files from command line (-F) (Sep 10/94)
10. Added byte level index support (Sep 23/94)
11. Added support for arbitrary boolean expressions (Oct 10/94)
12. Added support for arbitrary filtering with -z option (Oct 94)
13. Completely integrated glimpseserver into Harvest (Jan 95)
14. Sped up structured queries and -W option (filtering) (Feb 95)
15. Added -W -z support (June 1995)
16. Added undocumented options for index-search/analysis (SFS_COMPAT, Jun 1995)
17. Removed bugs related to structured queries and booleans w/ -b (Jul 1995)
18. Added -z and structured queries support (Aug 1995)
19. Changed maximum pattern and indexable word sizes (Sep 1995)
20. Made it portable to various architectures (see README.install, Oct 1995)
21. Added "-f filename" option to glimpse: it allows you to restrict the search to only those files whose names appear in "filename" (Jan 1996).
22. Fixed the agrep bug where -n was not working with ISO chars (Jan 1996).
23.
Added -t to glimpseindex that sorts .glimpse_filenames by decreasing order of modify time (st_mtime in stat structure); Added -@ option to glimpse to print time of file along with its name; (Feb 1996) 24. Added "-Y days" option to print files that were modified "days" before the index was created (Mar 1996). 25. Added support for handling filenames/directorynames with special characters: they can now have ' " & > < ! etc., whatever and glimpse works just fine. 26. Added conversion program for neighborhood manipulation in webglimpse (9/96). 27. Added limited support for NOT in glimpse (index search or -W only) (11/96). 28. Added support to search for patterns with repeating strings (11/96): "{computer;science},{computer;chronicles}" This now works in agrep as well as glimpse. However, its for simple patterns only (i.e., no regexp or spelling-errors). 29. Fixed some nagging memory leaks and segfaults on Solaris (10/96). 30. Fixed multiple matches / missed matches problems with -W (11/96). 31. Release of version 4.1 (10/97) glimpse-4.18.7/glimpse/000077500000000000000000000000001300371307100147065ustar00rootroot00000000000000glimpse-4.18.7/glimpse/CHANGES000066400000000000000000000367211300371307100157120ustar00rootroot00000000000000 Manually edited CHANGES file - now out of date. See ChangeLog for latest update information. Some notes from Peter on checking the code into RCS and some fixes to 4.1 are appended to the bottom of this file. 4.12.5 --> 4.12.6 - Fixes to configure script, thanks to Michael Heironimus - Fixes to index/partition.c, index/io.c and index/build_in.c should resolve problem with missing hits on the first one or two keywords in the index. Thanks to Morey Hubin. - Fix to sgrep.c solves problem of double-hit count with record delimiters. M. Hubin. 4.11 --> 4.12.5 Fix for using filters with structured indexes. 
Added FILE_END_MARK constant so it is possible to configure for filenames with spaces.
Test-fix for core dump on large indexes (may not have solved problem).

4.1 --> 4.11
Fix for core dump on merge, cleanup makefiles.

4.0 --> 4.1
- Minor bug fixes and cleanup preparatory to final glimpse release.

3.6 --> 4.0
- Added support to extract titles from HTML pages in glimpseindex with the -X option. These files must have names that end in: html, htm, shtml, shtm (It is easy to extend these -- just see glimpse.h/EXTRACT_INFO_SUFFIX. The routine to extract titles is index/filetype.c/extract_info(). This can be modified in various ways to extract info from many filetypes.) The titles are appended to the corresponding filenames after a ' ', before storing the filenames in .glimpse_filenames. In this case, glimpseindex assumes that filenames don't have spaces in them.
- Added support to glimpseindex to store not just the names of files that are indexed, but also some extra information (like a URL) after each file, when -F is used to provide the names of the files to be indexed to glimpseindex. This will be stored in .glimpse_filenames and .glimpse_filehash. The information (URL) must be separated from the actual file name by one blank ' '. In this case too, glimpse assumes that filenames don't have spaces in them.
- Added a -U option to glimpse to be able to interpret indices created with a -X or a -U option in glimpseindex. This is necessary since glimpse must know that the first ' ' (see above) signifies the end of the filename in .glimpse_filenames. When glimpse outputs matches, it will display the filename, the URL, and the title automatically. The user must be able to parse this info properly though!
- Added an option -X to glimpse to just output the names of files that do contain a match, in case glimpse is not able to open the file for reading. Without the -X option, glimpse will simply ignore the file and continue.
- Added "wgconvert", a program to compress and decompress neighborhoods in webglimpse. It can also be used to convert a file of filenames (that's used as a parameter for the -f option in glimpse) to a smaller binary representation, and vice versa. See file "index/convert.c". (9-10/96). The compression can change a filenames file to a file containing a bit mask representation of the set of files, or to a file containing a sparse set representation of these files. We recommend sparse-sets only.
- Added support in glimpse to read not just a set of filenames (with a -f option), but also a compressed set of filenames (with the -p option). The -p option allows you to utilize compressed `neighborhoods' (sets of filenames) to limit your search, without uncompressing them. The usage is: "-p filename:X:Y:Z" where "filename" is the file with compressed neighborhoods, X is an offset into that file (usually 0, must be a multiple of sizeof(int)), Y is the length glimpse must access from that file (if 0, then whole file; must be a multiple of sizeof(int)), and Z must be 2 (it indicates that "filename" has the sparse-set representation of compressed neighborhoods: the other values are for internal use only). Note that any colon ":" in filename must be escaped using a backslash \.
- Added limited support for NOT in glimpse. This works with index search (-N) or whole file scope for booleans (-W) only. (11/96). "Not" can be specified using "~". "Not" is most useful in expressions like "bad;~boy" or "woman,~girl"; or in "global not" expressions like "~{bad;boy}" or "~{woman,girl}". The semantics of ~ is as follows: the ~ works exactly as you would expect for index search (-N). For actual output, you will get all records with at least one of the specified patterns (bad, boy, woman, girl), that satisfy the boolean expression. That is, for example, "bad;~boy" will give you all records that contain "bad" but not "boy", in all files that contain "bad" but not "boy".
However, if you search for "~{bad;boy}", glimpse/agrep will NOT output records that don't contain either bad or boy. They will only give you records that contain at least one of "bad" or "boy" but not both. This is logical since otherwise, a pattern like "~ZZZZIYIUYIUYIUYRR", for example, would force glimpse to output all records in all files... For index-search and actual file-search to be consistent, a ~ should be used only with -W. Glimpse exits with an error otherwise. Agrep can now also search for nots, and the semantics are the same as above, except that the boolean expressions are evaluated on a per-record basis, rather than a per-file basis like glimpse.
- Added support to search for patterns with repeating strings (11/96): "{computer;science},{computer;chronicles}" This now works in agrep as well as glimpse. However, it's for simple patterns only (i.e., no regexp or spelling-errors). Previously, you were forced to say "{computer;{science,chronicles}}". This also fixes the "bug" where queries like "url=pat1;content=pat1" in Harvest did not work (the same pattern pat1 appears twice).
- Fixed some nagging memory leaks and segfaults on Solaris (10/96).
- Fixed multiple matches / missed matches problems with -W (11/96).

3.5 --> 3.6
- Many bug fixes and performance improvements to support webglimpse
- A -R option to glimpseindex to recompute .glimpse_filenames_index from a changed .glimpse_filenames. This allows users to move the index from one file system to another (where the absolute pathnames of the same files can be different), or convert all absolute pathnames in .glimpse_filenames to relative pathnames, and still use the existing index of that data.

3.0 --> 3.5
- added "-f filename" option to glimpse: it allows you to restrict the search to only those files whose names appear in "filename".
- fixed the agrep bug where -n was not working with ISO characters.
- Added -t to glimpseindex that sorts .glimpse_filenames by decreasing order of modify time (st_mtime in stat structure);
- Added -j option to glimpse to print time of file along with its name;
- Added "-Y days" option to print files that were modified "days" before the index was created.
- Added support for arbitrary characters in filenames (e.g. >, <, space, &...)

2.1 ---> 3.0
- added a data structure (in .glimpse_turbo) that speeds up queries using -w and -i considerably for large indexes. It is meant mostly for servers using glimpse (e.g., Harvest and glimpseHTTP servers), but it benefits everyone. With this "turbo" option, typical queries take less than a second even for very large indexes. This was so successful that we made it the default rather than an option (it used to be -T in some earlier versions). If the .glimpse_turbo file is deleted, glimpse will still work properly (but glimpseindex -f and -a require it).
- incremental indexing is now fully supported (even for -b). Deletion from the index is supported. glimpseindex -d filename(s) completely deletes the files from the index; glimpseindex -D filename(s) deletes the files only from the file list.
- the index has been improved in several ways (transparently except for speed and space). As a result, indices built with earlier versions of glimpseindex will not work with 3.0 -- you must reindex.
- several options were added to glimpseindex and glimpse:
  glimpseindex -E indexes all files without attempting to run the filetype filtering (but excluded files or suffixes still apply).
  glimpse -Q extends -N in a nice way giving much more information about the matches in the index.
  glimpse -L has more options: -L x | x:y | x:y:z
  if one number is given, it is a limit on the total number of matches. Glimpse outputs only the first x matches. If two numbers are given (x:y), then y is an added limit on the total number of files.
  If three numbers are given (x:y:z), then z is an added limit on the number of matches per file. If any of the x, y, or z is set to 0, it means to ignore it (in other words 0 = infinity in this case); for example, -L 0:10 will output all matches in the first 10 files that contain a match.
  (There are also some undocumented-as-yet options. We are running out of letters. Only -j and -Y are not used!)
- glimpse 3.0 still has a LOT of makefiles (one per architecture / OS). We hope to include autoconf support for glimpse in the future: but these should be sufficient for most purposes.
- several bugs were fixed, and the whole package is now more portable. Binaries and make files for the following platforms are now available: AIX-3.2.5, HPPA, HPMC68K, IBM-RS6000, Linux, SGI. (Binaries and make files for SUNOS4.1.1, SUNOS4.1.3, SOLARIS 5.3 and DEC OSF/1 (ALPHA) are available as usual.) See README.install for more details.

2.0 ---> 2.1
- Added the facility to run a glimpse server which reads the index into memory and stays in the background. Regular glimpse then submits queries to the server and echoes the replies. This can improve performance if the index is large since it doesn't have to be read in for each query. Glimpse can contact (local or remote) servers using the -C, -J and -K options (see the man-pages for more details).
- Optimized the performance of glimpse for very large structured indexes: this is mostly relevant in Harvest.1.1. Such indexes now take half the space, the indexing can be done in half the time, and structured queries are faster by a factor of 2 to 5!
- Made code more portable: the code now runs on the following machines and operating systems: SUNOS, ALPHA, SOLARIS, HPUX, AIX, LINUX
- Added much improved man pages for glimpse, glimpseserver and glimpseindex.
- Many bugs were fixed based on the reports received for glimpse.2.0 and Harvest.1.0. The code is now more robust, portable and readable.
1.1 ---> 2.0 - A "byte-level" indexing (glimpseindex -b) has been added, which mimics regular inverted indexes in that the exact location of each occurrence of each word (except for a stop list of common words) is indexed. The index itself is still searched with agrep so all options are still available. This option speeds up the search, sometimes considerably. - Added customizable filtering support -z to glimpseindex and glimpse. glimpseindex -z consults the file .glimpse_filters and performs the programs listed there for each match. The best example is compress/decompress. If .glimpse_filters include the line *.Z uncompress < then before indexing any file that matches the pattern "*.Z" (same syntax as the one for .glimpse_exclude) the command listed is executed first (assuming input is from stdin, which is why uncompress needs <) and its output (assuming it goes to stdout) is indexed. The file itself is not changed (i.e., it stays compressed). Then if glimpse -z is used, the same program is used on these files on the fly. Any program can be used (we run 'exec'). For example, one can filter out parts of files that should not be indexed. Note that this can slow down the search because the filters need to be run before files are searched. - There is a new compression package that allows glimpse (and agrep) to search DIRECTLY in compressed files. A new compression routine, called cast, is included. Also, glimpseindex can automatically index files compressed with cast. More details on this will be published later. - Queries can now include arbitrary combinations of ANDs, ORs, NOTs. - Added option -F in cast, uncast, buildcast and glimpseindex to take filenames from stdin. - Added a -L x option to glimpse to output only the first x matches. - User can explicitly specify whether exclude or include has higher priority (the default is to prefer exclude, glimpseindex -i gives priority to include). 
For example, you can put * in .glimpse_exclude and then explicitly say which files you want to include. - Added a -S x option to glimpseindex to allow the user to adjust the size of the stop-list under -o and -b. - Added a -W option to change the scope of Boolean queries to be the whole file. - Added support for structured queries in glimpse/glimpseindex (This was done for the Harvest project.) - Many small corrections were made based on the bug-reports received for version 1.1 and the beta version of 2.0. 1.0 ---> 1.1 - Names of files/directories whose ABSOLUTE path names are given as input to glimpseindex are indexed "as they are". If their RELATIVE path names are given, THEN glimpseindex tries to construct their absolute path names. Path names are still absolute: they are NOT relative to where the index is stored. - A new faster mgrep() (multi pattern search) has been added to agrep. - Boolean search by glimpse is now faster: it uses the new mgrep routine and the limit on the number of simple patterns separated by boolean operations is no longer 32 (it is 256 = maximum pattern length). - The maximum number of files which can be indexed at one go has been increased from 16000 to around 65000. - Many small corrections were made based on the bug-reports received for version 1.0. # /cs/usi/glimpse/ChangeLog # $Id: CHANGES,v 1.4 2000/08/16 04:23:00 golda Exp $ # Created: Tue Apr 7 08:54:27 1998 # Peter A. Bigot # # Description: # Master source area for glimpse indexing system. Tue Apr 7 08:54:27 1998 Peter A. Bigot (pab@thalia.CS.Arizona.EDU) * Generated this area from the glimpse-4.1 source release. All files checked in as version 1.1, and tagged with symbolic name "r4-1" thusly: % for f in `find . -name RCS` ; do \ (cd `dirname $f`; rcs -nr4-1: RCS/*) ; \ done * Patches are in ./Patches. * Applied glimpse-4.1-4.1b.patch; log in plog-4.1-4.1b. System incorrectly attempted to patch ./defs.h instead of ./compress/defs.h; applied that one by hand. 
This patch combines Burra's changes with some general cleanup; see the patch file for details. These changes checked in and tagged as symbolic name r4-1b. agrep/agrep.h:1.2; ./get_index.c:1.2; ./main.c:1.2; ./agrep/compat.c:1.2; ./agrep/newmgrep.c:1.2; ./compress/cast.c:1.2; ./compress/defs.h :1.2; ./compress/main_tbuild.c:1.2; ./compress/tsimpletest.c:1.2; ./index/build_in.c:1.2; ./index/convert.c:1.2; ./index/filetype.c:1.2; ./index/glimpse.c:1.2; ./index/region.c:1.2; ./index/simpletest.c:1.2. * Built and applied glimpse-4.1b-4.1c-patch. This fixes a problem when the last line of the file does not end in a newline. ./agrep/agrep.c:1.2; ./agrep/bitap.c:1.2; ./agrep/io.c:1.2; ./agrep/newmgrep.c:1.3; ./index/build_in.c:1.3. Thu Aug 20 15:03:08 1998 Peter A. Bigot (pab@thalia.CS.Arizona.EDU) * (index/{glimpse.c,Makefile.linux}) Fix bug involving the TEMP_DIR fixes: syntax error in glimpse.c, need to define variable when building buildcast. 1.{4,2}. * (KNOWN_BUGS) Added this file which contains information on how to duplicate bugs that we know exist, but haven't figured out how to squash yet. glimpse-4.18.7/glimpse/CONTRIBUTIONS000066400000000000000000000141201300371307100166710ustar00rootroot00000000000000We would like to acknowledge the members of harvest-dvl@cs.colorado.edu and everyone who had sent valuable bug-reports that helped to make glimpse more reliable, portable and easier to use. Especially, we would like to thank (in no particular order): "Marty Leisner" "Shirley Brown" Mark Eichin David Koski Jose Luis Pino James Binkley Benjamin Pierce Chris Dalton Rob Hartill nandu@cs.clemson.edu gmt (Gregg Townsend) Gabe Dalbec Jay Plett "Ullrich Hustadt" Pei Cao Michael Short Eric Johnson "Paul L. Clark" Charlie Stross Daniel Simmons Andrew Mauer "Curtis K. Wong" voelker@cs.washington.edu Vladimir G Ivanovic Bob Jackson raman@crl.dec.com "Michael S. Hart" Ray Schnitzler "David B. 
Rosen" rwk@integra.com jrb@cs.pdx.edu Brian Behlendorf rpe@pastek.cray.com (Roland Piquepaille) Dennis Grinberg Tom Phelps Eric Grosse douglis@research.att.com Jose Luis Pino ney@research.att.com joerg@pharmacy.isu.edu (Joerg Senekowitsch) George Hartzell Alastair Aitken CLMS leo@zycad.com (Leo Broukhis) Ray Allis (206) 865-3583 Paul Everitt "Bart J. Parliman" "Craig Bell (8-321-4036)" Vivek Khera forman@cs.washington.edu Vladimir Vukicevic barry@hal.com (Barry Bakalor) "Paul Pomes" Luca Toldo em@free.net (Evgeny Mironov) David A. Gernert Kent Lewallen "Gregory R. Olsen" Andries.Brouwer@cwi.nl Edgar Nielsen travis.winfrey@fi.gs.com (Travis Winfrey) Roy C Bixler John Cordell Bill Allocca dws@ssec.wisc.edu (DaviD W. Sanderson) barry@hal.com (Barry Bakalor) "Paul Pomes" vogelke@c-17igp.wpafb.af.mil Mark Metson (aa332@cfn.cs.dal.ca, gpurdy@fox.nstn.ns.ca) aeb@cwi.nl beebe@math.utah.edu (Nelson H. F. Beebe) jcasler@vnet.IBM.COM eaddy@scri.fsu.edu zeidenbe@ssc.wisc.edu seavey@OpenMarket.co dshaw@aplcomm.jhuapl.edu nandu@longs.att.com gencela@lafcol.lafayette.edu Keith Waclena "Charles F. Randall" Peter Marks - mark.dohm@teldta.com (Mark Dohm) "Daniel P. Zepeda" Binh Nguyen Alan Cunningham Ron Courtright rcourt@infinity.com jarausch@igpm.igpm.rwth-aachen.de (Helmut Jarausch) Dave Van Horn Alan.Harder@corp.sun.com Travis Taylor travis@ink2.ink.org Francesco Ruta ew@senate.be (Emmanuel Willems) "William Jaynes" ohst@informatik.hu-berlin.de (Daniel Ohst) Kurt Leinbach Pierre Violet Henrik.Martin@eua.ericsson.se (8 bit clean) gross@stimpy.ame.nd.edu (George B. 
Ross) Jim Meyering roger Firth Chet Murthy Todd White at RADium Technology Centre (Canada) stamer@merlin.physik.uni-oldenburg.DE (Heinrich Stamerjohanns) "CHRISM" (0) Jim Hurley "Piroz Mohseni" Cici.Mills@digicool.com (Ci-ci Mills) kaj.hejer@usit.uio.no (Kaj Hejer) Jonathan Shakes jrochkin@cs.oberlin.edu (Jonathan Rochkind) feyrer@rfhs1012.fh.uni-regensburg.de (Hubert Feyrer) Sudha Vaidyanathan Urmo Maeorg Tarvi Martens ericwolf@iquest.net (Eric Wolf) "O.Bartunov" Daniel Miles Kehoe Brett Bendickson Yvan Leclerc celuszak@bc-ad.bc.ca (Tom Celuszak) charles Tony Sprinzl Tai Jin Volker Ossenkopf bentson@grieg.seaslug.org (Randolph Bentson) koechlin@krug.inria.fr (Bruno Koechlin) "Steven J. Beaty" djk@clara.leather.chbi.co.uk () wysenet@iah.com (Larry E. Culver) Steve Karlovsky Duncan Fraser Doug Cooper Marc Paquette Keith Porterfield glenn@rockie.nsc.com (Glenn Newell) Emil Sit ada@mail2.umu.se Christer Holgersson Larry Schwimmer schwim@cyclone.stanford.edu Fred Douglis "Phil Kaslo" "Achim Bohnet" Alf-Christian Achilles p.d.stitt.kid0110@oasis.icl.co.uk Dachuan Zhang Howard Fear hsf@pagelus.com VaX#n8 Edwin Inglis Thomas Gries gries@epo.e-mail.com, gries@ibm.net Gerald Wildgruber, gewil@ue801be.ppp Davis Houlton SHIOZAKI Takehiko Jochen Schwarze wolinski@caissedesdepots.fr (Francis Wolinski) tobega@x.fra.se "David L. 
Fielding" wjones@tc.fluke.com (Warren Jones) vlad@mars.tecnomatix.com (Vlad Agranovsky) Udi Manber, Sun Wu and Burra Gopal glimpse-4.18.7/glimpse/ChangeLog000066400000000000000000000337111300371307100164650ustar00rootroot000000000000002008-03-27 13:26 golda * README.install: added Fedora 8 2006-08-22 10:18 dkreil * index/glimpse.h: fixed version number [tt] 2006-03-24 19:36 root * index/: asearch.c, ss/hash.c, dir.c: Add "const" to match *exact* type cast requirement 2006-03-24 19:13 root * index/: asearch.c, ss/hash.c, dir.c, glimpse.c: Fix type cast, missing return values 2006-03-13 15:12 root * Makefile: Add check target suite for Makefile 2006-03-13 15:09 root * Makefile.NeXT, Makefile.alpha, Makefile.hp, Makefile.linux, Makefile.org, Makefile.rs6000, Makefile.sgi, Makefile.solaris, Makefile.sunos: Add check target to manually generated make files. 2006-03-13 12:35 root * main.c: Initial initialization of ret = 0. By default compiler should do it but we need this to suppress warnings 2006-03-10 18:42 root * Makefile.in: Fix path to check.sh 2006-03-10 18:28 root * Makefile.in: Check target added 2006-03-09 22:03 julian * main.c, split.c: remove 2 warnings with main.c (type cast to sockaddr) and warning with split.c (wrong type cast to unsigned char pointer) 2006-02-03 15:04 golda * index/glimpse.h: version number... 2006-02-03 10:56 golda * README: bug report address 2006-02-03 10:37 golda * index/glimpse.c: fix for TEMP_DIR def for buildcast thanks to Kent Mein for noticing! 2006-02-03 10:12 golda * Makefile.in: What the hell happened to glimpseindex ? 2005-08-07 01:35 golda * agrep/: agrep.c, preprocess.c: Use MAXREG, not static '30' 2004-11-08 18:02 golda * README: version # 2004-06-08 10:07 golda * index/: le, ure, , Makefile.in: Finally using autoconf 2004-06-08 10:02 golda * configure.in: finally got right combination of quotes for autoconf... 2003-11-12 22:16 golda * index/glimpse.h: add .abra to list of EXTRACT_INFO_SUFFIX files to look for titles in.
(We preserve the title tag when prefiltering html files) 2003-11-12 22:10 golda * dynfilters/htuml2txt.so: [no log message] 2003-11-12 22:10 golda * configure, configure.in: Trying to use only autoconf, and not a customized configure script 2003-11-12 21:50 golda * main.c: Add support for --help & --version (in a rather cheesy way...) 2003-11-02 19:40 golda * index/glimpse.h: Updating list of suffixes, version # and date 2003-02-06 05:04 golda * main.c: Make glimpse -V return 0 (no error), not 1 - thanks to Bob Proulx 2003-01-25 13:15 golda * agrep/agrep.c: Enabled Bestmatch to work with Linenum as per Kevin McGrail (KAM) 2003-01-25 13:09 golda * index/glimpse.h: [no log message] 2003-01-25 13:09 golda * agrep/compat.c: Enable use of -B option along with -n (linenum) as per Kevin McGrail - probably never should have been disabled in first place 2002-11-29 17:47 golda * index/glimpse.h: [no log message] 2002-11-29 17:47 golda * agrep/: bitap.c, compat.c, io.c: Fix typo in bitap.c,io.c and compat.c that prevented compiling with --enable-pointers. Thanks to Clemens Fischer Correct error message about Bestmatch and Linenums, thanks to Kevin McGrail for pointing this out (and possible future fix to allow them to work together) 2002-10-10 22:28 golda * index/build_in.c: There was an error in build_in.c, a typo that prematurely terminated a for loop and resulted in missing hits for common search terms! Fixed now. This affected versions from 4.16.2 thru 4.17.0. If running one of those versions please upgrade to 4.17.1 2002-10-10 22:27 golda * dynfilters/htuml2txt.so: [no log message] 2002-09-27 14:41 golda * index/: dex.c, checksg.c, compat.c, maskgen.c, preprocess.c, build_in.c, glimpse.c, glimpse.h: Should compile now under CYGWIN (v1.3.12 or later)! Thanks to Tom Hudson for the tip, with the latest cygwin it seems that all that is needed is to add some include lines. Please report problems or other changes needed! 
2002-09-06 01:11 golda * glimpse.1, glimpseindex.1, glimpseserver.1: Updated man pages to reflect new URLs, thanks to Kang-Jin Lee for reporting this and providing the new Harvest doc locations. 2002-06-17 23:45 golda * index/: filetype.c, glimpse.h: Finally fixed segfault bug with large indexes, by increasing size of guilty buffer. Actually this is better than checking for buffer overrun because if we just stop at overrun we miss hits. With new size no overruns should be able to occur, that is the correct solution. Has been tested on two sites that were experiencing the segfault and has fixed problem. 2002-06-17 23:44 golda * dynfilters/htuml2txt.so: [no log message] 2002-05-02 21:43 golda * index/convert.c: Use system 'mv' command instead of c lib; works better on some linux systems 2002-02-14 17:38 golda * index/glimpse.h: Corrected version number 2001-10-13 01:14 golda * index/build_in.c: Fix segfault on certain large indexes. Check that we have enough space in merge_in before adding to it. In utils.c, when counter can be incremented > once in a loop, make sure we stop in time. 2001-10-13 01:13 golda * ChangeLog: Don't need ChangeLog in CVS!
2001-08-20 22:06 golda * index/: le, ure, Makefile.in: Added 'INDEXLIBS' configure script variable to include only -ldl if needed in index/Makefile 2001-08-20 20:59 golda * configure: Trying to get executable set in cvs 2001-08-20 19:56 golda * index/Makefile.in: Remove hard ref to -ldl 2001-08-20 19:53 golda * Makefile.in, configure: Added check for -ldl rather than including automatically 2001-07-07 23:17 golda * README, README.install: Updated instructions for using configure script 2001-07-07 23:06 golda * index/glimpse.h: Updated version, url, etc 2001-07-07 23:04 golda * index/: le, le.in, Makefile.in: make puts binaries in ./bin subdirectory 2001-07-06 23:37 golda * index/glimpse.h: Add '.jhtml' to list of extensions of HTML files (to extract titles from) 2001-07-06 22:47 golda * dynfilters/Makefile.in: Adding "distclean:" option to makefiles 2001-07-05 10:37 golda * index/: Makefile, ss/Makefile, Makefile: Removing Makefiles from CVS because they are now generated by configure script 2001-07-05 10:30 golda * configure: Corrected help text 2001-07-05 03:32 golda * index/: ure, glimpse.h: Moved FILE_END_MARK option into configure script (--file-end-mark=) instead of glimpse.h 2001-07-04 13:22 golda * agrep/Makefile, compress/Makefile, dynfilters/Makefile.in, index/Makefile: [no log message] 2001-07-04 13:22 golda * Makefile: Providing Makefile.in for configure script, for dynfilters may or may not work on all systems; we need to add a configure switch to turn them off if necessary. 2001-07-02 21:58 golda * agrep/bitap.c: Fix from Dan Slowik to fix line number reporting problem with agrep. 2001-05-20 21:49 golda * README.install: [no log message] 2001-05-20 21:48 golda * index/: le.in, icate.c, ure, ure.in, , , Makefile.in, agrep.h, bitap.c, ss/Makefile.in, ss/test.c, Makefile.in, convert.c, glimpse.h: Major fixes to configure script from Sang-yong Suh. Now it works! 
2001-03-05 21:14 golda * README: Testing cvs problem 2000-12-06 16:11 golda * index/: recursive.c, ss/cast.c, ss/trecursive.c, ss/uncast.c, build_in.c, dir.c, filetype.c, glimpse.h: Use strerror(errno) instead of arbitrary "permission denied or nonexistent" error message, as suggested by ariel. Fixes bug #45. 2000-11-16 22:47 golda * split.c: Added dirs "lib" and "bin" to repository, also for some reason split.c had never been checked in. 2000-08-15 21:23 golda * CHANGES, ChangeLog: Used cvs2cl.pl to generate up-to-date ChangeLog from CVS. All old manually entered change information is in CHANGES file. 2000-08-15 15:07 golda * main.c: tiny formatting change 2000-08-15 15:04 golda * main.c: Corrected help output for -t switch 2000-08-14 17:06 golda * index/glimpse.h: Re-set name of filter file to .glimpse_filters, instead of the temporary name .glimpse_experimental_filters. This would have caused HTML tags in the output of glimpse 4.13.0a and 4.13.0, unless .glimpse_experimental_filters existed. Also corrected minor bug introduced in 4.12.6, in reading temp_rdelim (two lines were transposed). Probably didn't affect anything because the default delimiter is normally correct. 2000-07-16 22:34 golda * agrep/agrep.h: Changed leaf.attribute to type int to fix pointer to integer compiler warnings. Was used as int anyway, should not hurt anything. 2000-06-21 13:13 golda * COPYRIGHT: [no log message] 2000-05-30 13:01 golda * main.c: Corrected usage info for -t option on glimpse 2000-05-25 11:07 golda * index/: ters/Makefile.linux, ters/README, ters/htuml2txt.lex, ters/htuml2txt.so, ters/sotest.c, Makefile.linux, glimpse.h: Added Christian's changes to allow dynamic filters. I believe this has only been tested on Linux systems. 
--GV 2000-05-25 11:06 golda * Makefile, Makefile.linux: Added Christian's changes to allow dynamic filters 2000-04-20 13:27 golda * index/filetype.c: re-commented out debugging line 2000-03-08 11:51 golda * agrep/agrep.c: Patch from Dan Riley to fix glimpseserver crashing problems. Basically make sure multibuf is always set to NULL after it is freed, and freed before it is set to NULL. --GV 2000-01-20 06:04 golda * index/filetype.c: Added separator char before "xinfo" variable. This is needed to parse the URL out of the results and correctly make links in webglimpse's output. Fix as per Victor Gonzales T. --GV 2000-01-15 23:52 golda * index/filetype.c: Corrected problem with multiple-line titles and files with spaces in the names. FILE_END_MARK was being used in the wrong place, where multiple-line titles were being stuck together. --GV 2000-01-15 22:47 golda * agrep/checksg.c: Don't use mgrep() with delimiters - fix by Morey as per report by Michael O. --GV 2000-01-11 11:34 cpv298 * index/filetype.c: Fixed a nasty off-by-one error in extract_info() that clobbered memory past the end of arrays. glimpseindex -X should now stop segfaulting. 2000-01-11 11:23 test * README: Testing remote cvs --GV 1999-11-03 15:41 golda * Makefile.linux: Checking in modified Makefile to compile cleanly on Linux. 1999-11-03 15:16 golda * index/glimpse.h: Checking in changes to allow spaces in filenames - FILE_END_MARK constant set here. 1999-11-03 15:00 golda * main.c: Checking in changes that allow spaces in filenames, using FILE_END_MARK rather than a fixed character (a space) to delimit filenames & extra info.
1999-11-03 13:40 golda * compress/: Makefile, Makefile.NeXT, Makefile.alpha, Makefile.hp, Makefile.in, Makefile.linux, Makefile.rs6000, Makefile.sgi, Makefile.solaris, Makefile.sunos, README, cast.c, compress.chronicle, defs.h, hash.c, main_cast.c, main_tbuild.c, main_uncast.c, misc.c, quick.c, string.c, tbuild.c, test.c, tmemlook.c, trecursive.c, tsimpletest.c, uncast.c: Checking files into repository. 1999-11-03 13:39 golda * index/: Makefile.NeXT, Makefile.alpha, Makefile.hp, Makefile.linux, Makefile.rs6000, Makefile.sgi, Makefile.solaris, Makefile.sunos, README: Checking files into repository. 1999-11-03 13:37 golda * agrep/: COPYRIGHT, Makefile.NeXT, Makefile.alpha, Makefile.hp, Makefile.linux, Makefile.rs6000, Makefile.sgi, Makefile.solaris, Makefile.sunos, README, agrep.1, agrep.algorithms, agrep.c, agrep.chronicle, agrep.h, asearch.c, asearch1.c, asplit.c, bitap.c, bitap.c.orig, checkfile.c, checkfile.h, checksg.c, compat.c, config.h, contribution.list, defs.h, delim.c, dummyfilters.c, dummysyscalls.c, follow.c, io.c, io.c.orig, main.c, maskgen.c, newmgrep.c, parse.c, preprocess.c, putils.c, re.h, recursive.c, utilities.c: Checking files into repository. --GV 1999-11-03 13:36 golda * ChangeLog, KNOWN_BUGS, Makefile.org, config.cache, config.log, config.status, genpatch: Cleaning up repository. --GV 1999-11-01 14:19 golda * CHANGES: Changes log - will be obsolete after 11/1/99 --G 1999-11-01 14:19 golda * index/: build_in.c, convert.c, filetype.c: Bringing Glimpse 4.12.6 under CVS. Future changes will be checked in independently. General changes were: 4.12.5 --> 4.12.6 - Fixes to configure script, thanks to Michael Heironimus - Fixes to index/partition.c, index/io.c and index/build_in.c should resolve problem with missing hits on the first one or two keywords in the index. Thanks to Morey Hubin. - Fix to sgrep.c solves problem of double-hit count with record delimiters. M. Hubin. 4.11 --> 4.12.5 Fix for using filters with structured indexes.
Added FILE_END_MARK constant so it is possible to configure for filenames with spaces Test-fix for core dump on large indexes (may not have solved problem). --G 1999-11-01 14:16 golda * index/: Makefile, Makefile.in: Adding makefiles to cvs. 1999-11-01 13:34 golda * CHANGES, Makefile.in, config.cache, config.log, config.status, configure, configure.in: Adding into cvs repository. 1999-11-01 13:33 golda * agrep/: Makefile, Makefile.in: Adding into CVS repository. 1999-11-01 13:32 golda * agrep/sgrep.c: Adding into CVS repository 1999-05-05 01:16 gvelez * Makefile: Adding agrep archive to my CVS repository 1999-05-05 01:12 gvelez * main.c: Changes to return value in keeping with man pages: 0 for some hits, 1 for no hits, 2 for error. --GB 1999-03-02 00:38 gvelez * index/: build_in.c, convert.c, dir.c, filetype.c, fixname.c, getword.c, glimpse.h: Added index directory to repository 1999-03-02 00:37 gvelez * index/glimpse.c: Added default for TEMP_DIR --G 1998-05-22 10:31 udi * main.c: changed all /tmp to use TEMP_DIR (for security reasons) 1998-04-27 09:11 pab * gentar: Initial revision 1998-04-07 09:17 pab * get_index.c, main.c: Patch to rev 4.1b 1998-04-07 08:54 pab * CHANGES, CONTRIBUTIONS, COPYRIGHT, Makefile, Makefile.NeXT, Makefile.alpha, Makefile.hp, Makefile.in, Makefile.linux, Makefile.rs6000, Makefile.sgi, Makefile.solaris, Makefile.sunos, README, README.install, communicate.c, configure, configure.in, defs.h, get_filename.c, get_index.c, glimpse.1, glimpse.chronicle, glimpseindex.1, glimpseserver.1, install-sh, main.c, mkinstalldirs, split.c: Initial revision glimpse-4.18.7/glimpse/KNOWN_BUGS000066400000000000000000000017741300371307100163560ustar00rootroot00000000000000* Thu Aug 20 14:56:44 1998 agrep -v fails to write anything unless at least one line (possibly, a particular line) of the input matches the pattern that we're trying to mismatch. The non-glimpse agrep does not have this problem. 
E.g.: thalia[189]$ for p in O e n w o r ; do echo "Pattern '${p}':" ; echo "One@Two@Three" | tr '@' '\012' | agrep -v ${p} ; done Pattern 'O': Pattern 'e': Two Pattern 'n': Pattern 'w': One Pattern 'o': One Pattern 'r': One Two thalia[190]$ for p in O e n w o r ; do echo "Pattern '${p}':" ; echo "One@Two@Three" | tr '@' '\012' | agrep-2.04 -v ${p} ; done Pattern 'O': Two Three Pattern 'e': Two Pattern 'n': Two Three Pattern 'w': One Three Pattern 'o': One Three Pattern 'r': One Two The non-glimpse agrep uses bitap to do the search; the glimpse one uses bm. Some pre-condition is unsatisfied in the call to bm, because it overruns the input text buffer by a huge amount in attempting to find a pattern match. It isn't obvious to me where this bug is arising, or how to fix it. glimpse-4.18.7/glimpse/Makefile.NeXT000066400000000000000000000205241300371307100171660ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 0 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = gcc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. 
BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep # Include flags is not a part of CLFAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" 
SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) 
-l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c 
$(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/glimpse/Makefile.alpha000066400000000000000000000206001300371307100174300ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. 
UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = cc #gcc -traditional #cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep # Include flags is not a part of CLFAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O -Olimit 3000 #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h 
$(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) 
-l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" 
HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c 
glimpse-4.18.7/glimpse/Makefile.hp000066400000000000000000000205131300371307100167550ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 0 HAVE_SYS_DIR_H = 1 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. 
BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep # Include flags are not part of SUBDIRCFLAGS and SUBDIRLINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" 
SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) 
-l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) 
-DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/glimpse/Makefile.in000066400000000000000000000131551300371307100167600ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. 
srcdir = @srcdir@ VPATH = @srcdir@ SHELL = /bin/sh CC = @CC@ LIBS = @LIBS@ CP = @CP@ STRIP = @STRIP@ INSTALL = @INSTALL@ INSTALL_PROGRAM = @INSTALL_PROGRAM@ INSTALL_DATA = @INSTALL_DATA@ INSTALL_MAN = ${INSTALL} -m 444 DEFS = DYNFILTER = @DYNFILTER@ prefix = @prefix@ exec_prefix = @exec_prefix@ binprefix = manprefix = bindir = $(exec_prefix)/bin libdir = $(exec_prefix)/lib mandir = $(prefix)/man/man1 manext = 1 MANUAL = glimpse.1 glimpseindex.1 glimpseserver.1 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib SUBDIRS = compress agrep libtemplate index $(DYNFILTER) LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep OPTIMIZEFLAGS = -O2 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include CFLAGS = $(INCLUDEFLAGS) $(DEFS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c all: build-sub @TARGET@ Sall: $(PROG) $(PROGSERVER) $(PROGINDEX) agrep: $(PROGAGREP) NOTSall: $(NOTSPROG) $(NOTSPROGSERVER) build-sub: for d in $(SUBDIRS) ; do \ ( cd $$d; $(MAKE) ); \ done # Check target check: all $(SHELL) 
test/check.sh # INSTALL on Solaris should be carried one at a time. :-( install: all installdirs install-man for d in $(SUBDIRS) ; do \ ( cd $$d; $(MAKE) $@ ); \ done for d in $(BINDIR)/$(PROG) $(BINDIR)/$(PROGSERVER) ; do \ $(INSTALL) $$d $(bindir) ; \ done install-man: for d in $(MANUAL) ; do \ $(INSTALL_MAN) $$d $(mandir) ; \ done installdirs: mkinstalldirs $(srcdir)/mkinstalldirs $(bindir) $(mandir) clean: for d in $(SUBDIRS); do \ ( cd $$d; $(MAKE) $@ ); \ done rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(PROG) $(PROGSERVER) config.log rm -f $(LIBDIR)/lib$(LIBCOMPRESS).a $(LIBDIR)/lib$(LIBAGREP).a rm -f $(BINDIR)/* distclean: clean for d in $(SUBDIRS); do \ ( cd $$d; $(MAKE) $@ ); \ done rm -f Makefile config.cache config.status $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LDFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(BINDIR)/$(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(LIBS) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LDFLAGS) -L$(LIBDIR) -o $(BINDIR)/$(PROG) main.o $(OBJS) -l$(LIBAGREP) $(LIBS) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" 
HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LDFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(BINDIR)/$(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(LIBS) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LDFLAGS) -L$(LIBDIR) -o $(BINDIR)/$(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(LIBS) main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) -c $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) -c $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) -c $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) -c $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) -c $(CFLAGS) -o $@ split.c glimpse-4.18.7/glimpse/Makefile.linux000066400000000000000000000213001300371307100175000ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). 
HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 HAVE_STRERROR = 1 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 1 # You might have to change this depending on your machine configuration. CC = gcc -mpentiumpro SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib DYNFILTERDIR = dynfilters LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = -ldl PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep DYNHTMLFILTER = dynfilters/htuml2txt.so # Include flags are not part of SUBDIRCFLAGS and SUBDIRLINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O2 #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) -DHAVE_STRERROR=$(HAVE_STRERROR)\ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ 
split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) $(DYNHTMLFILTER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" 
SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_STRERROR="$(HAVE_STRERROR)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" 
ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(DYNHTMLFILTER): $(DYNFILTERDIR)/htuml2txt.lex cd $(DYNFILTERDIR); $(MAKE) -f Makefile.linux htuml2txt.so # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean cd $(DYNFILTERDIR); $(MAKE) -f Makefile.linux clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) 
$(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/glimpse/Makefile.org000066400000000000000000000204031300371307100171330ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = gcc -traditional #cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. 
BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = PROG = glimpse PROGSERVER = glimpseserver NOTSPROG = nots$(PROG) NOTSPROGSERVER = nots$(PROGSERVER) PROGINDEX = index/glimpseindex PROGAGREP = agrep/agrep # Include flags are not part of SUBDIRCFLAGS and SUBDIRLINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" 
SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) 
$(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h 
$(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/glimpse/Makefile.rs6000000066400000000000000000000205431300371307100173030ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. 
ISO_CHAR_SET = 0

# You might have to change this depending on your machine configuration.
CC = cc
SHELL = /bin/sh

# For compatibility with SFS, define this flag (internal only)
SFS_COMPAT = 0

# YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE

# The binaries will be made in ./bin/ and the libraries are assumed to
# be in ./lib. You normally don't have to change them.
# NOTE: GLIMPSEDIR can be relative or absolute.
GLIMPSEDIR = ..
BINDIR = bin
AGREPDIR = agrep
INDEXDIR = index
COMPRESSDIR = compress
TEMPLATEDIR = libtemplate
LIBDIR = lib
LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib
LIBAGREP = agrep
LIBCOMPRESS = cast
LIBTEMPLATE = template
LIBUTIL = util
OTHERLIBS =

PROG = glimpse
PROGSERVER = glimpseserver
NOTSPROG = nots$(PROG)
NOTSPROGSERVER = nots$(PROGSERVER)
PROGINDEX = index/glimpseindex
PROGAGREP = agrep/agrep

# Include flags are not part of SUBDIRCFLAGS and SUBDIRLINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \
	-DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS)

OBJS = get_filename.o \
	get_index.o \
	split.o \
	$(INDEXDIR)/region.o \
	$(INDEXDIR)/getword.o \
	$(INDEXDIR)/filetype.o \
	$(INDEXDIR)/simpletest.o \
	$(INDEXDIR)/memlook.o \
	$(INDEXDIR)/lib.o \
	$(INDEXDIR)/io.o

HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h

SRC = main.c \
	get_filename.c \
	get_index.c \
	split.c \
	$(INDEXDIR)/region.c \
	$(INDEXDIR)/getword.c \
	$(INDEXDIR)/filetype.c \
$(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a 
$(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" 
SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/glimpse/Makefile.sgi000066400000000000000000000205451300371307100171350ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. 
# To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall all: Sall # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ./bin/. and the libraries are assumed to # be in ./lib . You normally don't have to change them. # NOTE: GLIMPSEDIR can be relative or absolute. GLIMPSEDIR = .. BINDIR = bin AGREPDIR = agrep INDEXDIR = index COMPRESSDIR = compress TEMPLATEDIR = libtemplate LIBDIR = lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBCOMPRESS = cast LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = # -lsun for Irix 5? 
PROG = glimpse
PROGSERVER = glimpseserver
NOTSPROG = nots$(PROG)
NOTSPROGSERVER = nots$(PROGSERVER)
PROGINDEX = index/glimpseindex
PROGAGREP = agrep/agrep

# Include flags are not part of SUBDIRCFLAGS and SUBDIRLINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \
	-DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS)

OBJS = get_filename.o \
	get_index.o \
	split.o \
	$(INDEXDIR)/region.o \
	$(INDEXDIR)/getword.o \
	$(INDEXDIR)/filetype.o \
	$(INDEXDIR)/simpletest.o \
	$(INDEXDIR)/memlook.o \
	$(INDEXDIR)/lib.o \
	$(INDEXDIR)/io.o

HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h

SRC = main.c \
	get_filename.c \
	get_index.c \
	split.c \
	$(INDEXDIR)/region.c \
	$(INDEXDIR)/getword.c \
	$(INDEXDIR)/filetype.c \
	$(INDEXDIR)/simpletest.c \
	$(INDEXDIR)/memlook.c \
	$(INDEXDIR)/io.c

Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER)

NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER)

$(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a
	cd $(INDEXDIR) ; $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)"
ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) 
-l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h 
$(INDEXDIR)/glimpse.h
	$(CC) $(CFLAGS) -o $@ get_filename.c

get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h
	$(CC) $(CFLAGS) -o $@ get_index.c

split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h
	$(CC) $(CFLAGS) -o $@ split.c

$(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h
	$(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c

$(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h
	$(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c

$(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h
	$(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c

$(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h
	$(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c

$(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h
	$(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c

$(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h
	$(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c

$(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h
	$(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c
glimpse-4.18.7/glimpse/Makefile.solaris000066400000000000000000000206141300371307100200240ustar00rootroot00000000000000
# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved.

# To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1".
#STRUCTURED_QUERIES = 0
STRUCTURED_QUERIES = 1
#all: NOTSall
all: Sall

# Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1).
HAVE_DIRENT_H = 1
HAVE_SYS_DIR_H = 0
HAVE_SYS_NDIR_H = 0
HAVE_NDIR_H = 0

# Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0.
UTIME = 1

# Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0.
ISO_CHAR_SET = 0

# You might have to change this depending on your machine configuration.
CC = gcc # -traditional #cc
SHELL = /bin/sh

# For compatibility with SFS, define this flag (internal only)
SFS_COMPAT = 0

# YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE

# The binaries will be made in ./bin/ and the libraries are assumed to
# be in ./lib. You normally don't have to change them.
# NOTE: GLIMPSEDIR can be relative or absolute.
GLIMPSEDIR = ..
BINDIR = bin
AGREPDIR = agrep
INDEXDIR = index
COMPRESSDIR = compress
TEMPLATEDIR = libtemplate
LIBDIR = lib
LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib
LIBAGREP = agrep
LIBCOMPRESS = cast
LIBTEMPLATE = template
LIBUTIL = util
OTHERLIBS = -lsocket -lnsl

PROG = glimpse
PROGSERVER = glimpseserver
NOTSPROG = nots$(PROG)
NOTSPROGSERVER = nots$(PROGSERVER)
PROGINDEX = index/glimpseindex
PROGAGREP = agrep/agrep

# Include flags are not part of SUBDIRCFLAGS and SUBDIRLINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \
	-DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS)

OBJS = get_filename.o \
	get_index.o \
	split.o \
	$(INDEXDIR)/region.o \
	$(INDEXDIR)/getword.o \
	$(INDEXDIR)/filetype.o \
	$(INDEXDIR)/simpletest.o \
	$(INDEXDIR)/memlook.o \
	$(INDEXDIR)/lib.o \
	$(INDEXDIR)/io.o

HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h

SRC = main.c \
	get_filename.c \
	get_index.c \
	split.c \
	$(INDEXDIR)/region.c \
	$(INDEXDIR)/getword.c \
	$(INDEXDIR)/filetype.c \
	$(INDEXDIR)/simpletest.c \
	$(INDEXDIR)/memlook.c \
	$(INDEXDIR)/io.c

Sall:
$(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) 
main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: 
-rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/glimpse/Makefile.sunos000066400000000000000000000205361300371307100175220ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". 
#STRUCTURED_QUERIES = 0
STRUCTURED_QUERIES = 1
#all: NOTSall
all: Sall

# Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1).
HAVE_DIRENT_H = 1
HAVE_SYS_DIR_H = 0
HAVE_SYS_NDIR_H = 0
HAVE_NDIR_H = 0

# Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0.
UTIME = 1

# Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0.
ISO_CHAR_SET = 0

# You might have to change this depending on your machine configuration.
CC = gcc
SHELL = /bin/sh

# For compatibility with SFS, define this flag (internal only)
SFS_COMPAT = 0

# YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE

# The binaries will be made in ./bin/ and the libraries are assumed to
# be in ./lib. You normally don't have to change them.
# NOTE: GLIMPSEDIR can be relative or absolute.
GLIMPSEDIR = ..
BINDIR = bin
AGREPDIR = agrep
INDEXDIR = index
COMPRESSDIR = compress
TEMPLATEDIR = libtemplate
LIBDIR = lib
LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib
LIBAGREP = agrep
LIBCOMPRESS = cast
LIBTEMPLATE = template
LIBUTIL = util
OTHERLIBS =

PROG = glimpse
PROGSERVER = glimpseserver
NOTSPROG = nots$(PROG)
NOTSPROGSERVER = nots$(PROGSERVER)
PROGINDEX = index/glimpseindex
PROGAGREP = agrep/agrep

# Include flags are not part of SUBDIRCFLAGS and SUBDIRLINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \
	-DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) -DSFS_COMPAT=$(SFS_COMPAT)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDES) $(SUBDIRLINKFLAGS) OBJS = get_filename.o \ get_index.o \ split.o \ $(INDEXDIR)/region.o \ $(INDEXDIR)/getword.o \ $(INDEXDIR)/filetype.o \ $(INDEXDIR)/simpletest.o \ $(INDEXDIR)/memlook.o \ $(INDEXDIR)/lib.o\ $(INDEXDIR)/io.o HDRS = $(INDEXDIR)/glimpse.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(AGREPDIR)/re.h $(INDEXDIR)/region.h SRC = main.c \ get_filename.c \ get_index.c \ split.c \ $(INDEXDIR)/region.c \ $(INDEXDIR)/getword.c \ $(INDEXDIR)/filetype.c \ $(INDEXDIR)/simpletest.c \ $(INDEXDIR)/memlook.c \ $(INDEXDIR)/io.c Sall: $(PROGINDEX) $(PROGAGREP) $(PROG) $(PROGSERVER) NOTSall: $(PROGINDEX) $(PROGAGREP) $(NOTSPROG) $(NOTSPROGSERVER) $(PROGINDEX): $(PROGAGREP) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(INDEXDIR) ; $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROGAGREP): $(LIBDIR)/lib$(LIBCOMPRESS).a cd $(AGREPDIR) ; $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBCOMPRESS).a: $(HDRS) cd $(COMPRESSDIR); $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" 
STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(PROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(NOTSPROG): main.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROG) main.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR) $(PROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -L$(LIBTEMPLATEDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(NOTSPROGSERVER): main_server.o $(OBJS) $(SRC) $(HDRS) $(LIBDIR)/lib$(LIBAGREP).a $(LIBDIR)/lib$(LIBCOMPRESS).a $(CC) $(LINKFLAGS) -L$(LIBDIR) -o $(PROGSERVER) main_server.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROGSERVER) $(BINDIR) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" 
STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR); $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" # Check target check: all $(SHELL) test/check.sh clean: -rm -f main_server.o main_server.c main.o $(OBJS) core a.out $(LIBDIR)/lib$(LIBAGREP).a $(PROG) $(PROGSERVER) cd $(AGREPDIR); $(MAKE) clean cd $(INDEXDIR) ; $(MAKE) clean cd $(COMPRESSDIR); $(MAKE) clean cd $(TEMPLATEDIR); $(MAKE) clean main_server.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h cp main.c main_server.c $(CC) $(CFLAGS) -DISSERVER=1 -o $@ main_server.c main.o: main.c defs.h $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -DISSERVER=0 -o $@ main.c get_filename.o: get_filename.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_filename.c get_index.o: get_index.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ get_index.c split.o: split.c $(AGREPDIR)/agrep.h $(AGREPDIR)/checkfile.h $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ split.c $(INDEXDIR)/lib.o: $(INDEXDIR)/lib.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/lib.c $(INDEXDIR)/io.o: $(INDEXDIR)/io.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/io.c $(INDEXDIR)/region.o: $(INDEXDIR)/region.c $(INDEXDIR)/glimpse.h $(INDEXDIR)/region.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/region.c $(INDEXDIR)/getword.o: $(INDEXDIR)/getword.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/getword.c $(INDEXDIR)/filetype.o: $(INDEXDIR)/filetype.c $(INDEXDIR)/glimpse.h $(CC) 
$(CFLAGS) -o $@ $(INDEXDIR)/filetype.c $(INDEXDIR)/simpletest.o: $(INDEXDIR)/simpletest.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/simpletest.c $(INDEXDIR)/memlook.o: $(INDEXDIR)/memlook.c $(INDEXDIR)/glimpse.h $(CC) $(CFLAGS) -o $@ $(INDEXDIR)/memlook.c glimpse-4.18.7/glimpse/README000066400000000000000000000045661300371307100156010ustar00rootroot00000000000000GLIMPSE 4.18: searching entire file systems (http://webglimpse.net/) (http://glimpse.cs.arizona.edu/) For installation instructions, see README.install Glimpse is a very powerful indexing and query system that allows you to search through all your files very quickly. It can be used by individuals for their personal file systems as well as by organizations for large data collections. Glimpse is also the basis of WebGlimpse, which provides search for web sites, and it is the default search engine in Harvest (see below). Glimpseindex, which you run by saying "glimpseindex DIR" builds an index of all text files in the tree rooted at DIR. (e.g., glimpseindex ~ indexes all your files.) With it, glimpse can search through all files much the same way as agrep (or any other grep), except that you don't have to specify file names and the search is fast. For example, glimpse -1 unbelievable will find all occurrences (in all your files!) of "unbelievable" allowing one spelling error; glimpse -F mail arizona will find all occurrences of "arizona" in all files with "mail" somewhere in their name; glimpse 'Arizona desert;windsurfing' will find all lines that contain both "Arizona desert" and "windsurfing". Glimpse supports three types of indexes: a tiny one (2-3% of the size of all files), a small one (7-9%), and a medium one (20-30%). The larger the index the faster the search. 
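As a rough illustration of the index-size percentages above, the following sketch estimates how much disk space each index type might need for a given amount of text. The function name estimate_index_sizes is hypothetical (it is not part of glimpse), and the ranges are simply the figures quoted above; actual index sizes will depend on the files being indexed.

```shell
#!/bin/sh
# Hypothetical helper, not part of glimpse: estimate index size ranges
# from the percentages quoted in this README.
# Usage: estimate_index_sizes TOTAL_BYTES
estimate_index_sizes() {
    total=$1
    # Each entry is "name low% high%", taken from the README text.
    for spec in "tiny 2 3" "small 7 9" "medium 20 30"; do
        set -- $spec    # split the entry into $1 (name), $2 (low), $3 (high)
        echo "$1: $((total * $2 / 100))-$((total * $3 / 100)) bytes"
    done
}

# Example: about 1 GB of indexed text.
estimate_index_sizes 1000000000
```

The trade-off stays the same as described above: the larger estimates buy faster searches.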
Glimpse supports most of agrep's options (agrep is our powerful version of grep, and it is part of glimpse) including approximate matching (e.g., finding misspelled words), Boolean queries, and even some limited forms of regular expressions.

The WWW home page for glimpse is at http://glimpse.cs.arizona.edu/
It includes links to the source, binaries for most UNIX systems, documentation, articles, and more.
The WebGlimpse home page is at http://glimpse.cs.arizona.edu/webglimpse/
Harvest's WWW home page is http://harvest.cs.colorado.edu/ (Harvest is an integrated set of tools to gather, extract, organize, search, cache, and replicate relevant information across the Internet.)

Mail glimpse-request@cs.arizona.edu to be added to the glimpse mailing list.
Mail glimpse@cs.arizona.edu to report bugs, ask questions, discuss tricks for using glimpse, etc. (This is a moderated mailing list.)

Udi Manber, Burra Gopal, and Sun Wu.

Please report bugs online at http://webglimpse.net/contact.php
glimpse-4.18.7/glimpse/README.install000066400000000000000000000137671300371307100172510ustar00rootroot00000000000000
/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */

This is version 4.1 of the glimpse package - a tool to search entire file systems. Please send any comments to glimpse@cs.arizona.edu. Check the file CHANGES for the changes since version 3.0 (there are many of them) and 4.0 (there are few). The files glimpse.1, glimpseindex.1, and glimpseserver.1 are the manual pages.

Instructions for installing glimpse, glimpseindex, glimpseserver, and agrep:

1. Both the agrep and index directories have individual Makefiles which you can use independently. You can make everything by just typing make in the root glimpse directory. Sample Makefiles for various architectures have been provided; some have not been tested recently. We have a script "configure" in our distribution which has recently been very much improved.
To generate makefiles for your system, run

    sh configure        ( see ./configure --help for options )

Then run

    make
    make install

to put binaries under /usr/local/bin. To install to a different directory, see the --prefix and --bindir options of configure.

This configure script has been tested successfully on

    Linux-2.4.2      gcc-2.96       RedHat-7.1
    Linux-2.2.16     egcs-2.91.66   RedHat-6.2
    Linux-2.2.12     gcc-2.8.1
    Fedora 8 (Linux)
    Solaris-2.5.1    gcc-2.7.2.1
    SunOS-4.1.3      gcc-2.8.0
    AIX-3.4          cc

2. To make individual binaries in a subdirectory "ddd", do the following:

    cd ./ddd ; make ; cd ..

3. To rebuild everything from scratch, do the following:

    make clean

   You can then proceed with the above steps.

4. The directory libtemplate contains code that was originally part of the Harvest source distribution. We believe the configuration there will work on all systems, but there are remnants of the Harvest configuration system still present. Don't be confused by them; they aren't used.

5. Binaries for several operating systems are available; see the download webpage (http://webglimpse.net/download.html) for details.

NOTES:
------
People in our mailing list have commented that the make files we provide work on many other architectures too. We recommend that you do a pairwise "diff" of these makefiles to find out whether they support the options you need before trying to modify makefiles to suit your environment. Often a few changes to compiler options, etc., are enough to port glimpse to a new architecture / OS. Source code modifications are usually not necessary. We request you to mail us any changes to the Makefile (or the source) that are necessary to port glimpse to your architecture, and the corresponding binaries, so that we can include it in our distribution. We will appreciate any suggestions and will duly acknowledge all contributions.

Some comments about portability (using the sample Makefiles):
-------------------------------------------------------------
6.
You must define HAVE_DIRENT_H in agrep/Makefile, index/Makefile, compress/Makefile to be 1 or 0 depending on whether your machine has /usr/include/dirent.h or /usr/include/sys/dir.h. We found that on most machines/OSs like SunOS4.1, Solaris, Ultrix, AIX, OSF/1, HPUX and SGI IRIX 5.3, HAVE_DIRENT_H should be 1. On NeXT, HAVE_SYS_DIR_H should be 1 (the rest should be 0).

7. On Solaris, "RANLIB" should be defined to be "true" in agrep/Makefile.solaris and compress/Makefile.solaris.

8. On Solaris (at least the version we have), the library archive program "ar" is in /usr/ccs/bin/ar instead of /usr/bin/ar. You must define "AR" in agrep/Makefile.solaris and compress/Makefile.solaris appropriately or set your PATH to include the appropriate directory name.

9. On Solaris you have to link the glimpse executables with the socket and nsl libraries by adding "-lsocket" and "-lnsl" to the make rules for "glimpse" and "glimpseserver".

10. On the DEC ALPHA and HP, the make variable "CC" was changed from "gcc" to "cc".

11. If you have the utime() routine and , define the make variable UTIME to 1 in glimpse/Makefile and compress/Makefile. Else define it to 0.

12. If you want to support the international character set (ISO_CHAR_SET), define the make variable ISO_CHAR_SET to 1 in glimpse/Makefile. Else define it to 0.

13. If you have the function strerror() on your system, #define HAVE_STRERROR to 1 in libtemplate/autoconf.h, else #undef it (or leave the definition in /**/). This is necessary on some BSD systems (some of our users have said so) and on NeXT (Gerald Wildgruber, gewil@ue801be.ppp). And on Irix 6.*, although the warning you get from the loader seems to be benign.

14. If you need to add any new macros or flags, you can edit the file: glimpse/agrep/config.h and add whatever is needed to make porting easy on your machine / OS. This file is included throughout glimpse source code.

15.
On BSD, it may be necessary to define DONTUSESORT_T_OPTION in the Makefile for glimpseindex since -T in some BSD systems is defined as an alternative record separator rather than to specify a directory other than /tmp to store sort-temporary files.

Porting to other platforms:
--------------------------
We provide, but do not support, pre-compiled glimpse binaries for a variety of systems; see the glimpse web page for details. If you port glimpse to a system and are willing to provide binaries for it, send mail to glimpse@cs.arizona.edu.

Makefiles for previous versions of glimpse on the following platforms have been provided by these people. We do not have these systems available to us, so are unable to verify that they work with the current version, but they should be a good starting point.

    Platform ported to    Person to contact
    AIX-3.2.5             stamer@merlin.physik.uni-oldenburg.DE (Heinrich Stamerjohanns)
    HPPA, HPMC68K         Chris Dalton
    IBM-RS6000            "CHRISM" (0)
    NeXT                  gross@stimpy.ame.nd.edu

Thanks for your interest in glimpse.

Udi Manber, Burra Gopal, and Sun Wu.
glimpse-4.18.7/glimpse/communicate.c000066400000000000000000000225171300371307100173650ustar00rootroot00000000000000/* rewritten so that it uses no library routines */ #include /* HAVE_SYS_SELECT_H is defined here */ #if SFS_COMPAT #if defined(__NeXT__) #include #else #include #endif #endif #include #include #include #include #include #include #include /* #include */ /* #include */ #include #if 1 #if defined(_IBMR2) #include #endif #else #if defined(HAVE_SYS_SELECT_H) #include #endif #endif #include "glimpse.h" #include "defs.h" int mystrlen(str, max) char *str; int max; { int i=0; while ((i 0) { #if SFS_COMPAT nread = syscall(SYS_read, fd, ptr, nleft); #else nread = read(fd, ptr, nleft); #endif if (nread < 0) return(nread); else if (nread == 0) break; /* EOF */ nleft -= nread; ptr += nread; } return (nbytes - nleft); } int writen(fd, ptr, nbytes) int fd; char *ptr; int nbytes; { int nleft, nwritten; nleft = nbytes; while (nleft > 0) { #if SFS_COMPAT nwritten = syscall(SYS_write, fd, ptr, nleft); #else nwritten = write(fd, ptr, nleft); #endif if (nwritten <= 0) return nwritten; nleft -= nwritten; ptr += nwritten; } return (nbytes - nleft); } int readline(sockfd, ptr, maxlen) int sockfd; char *ptr; int maxlen; { int n, rc; char c; for (n=1; n= 20)) return -1; if (((*pclstdout = fds[1]) < 0) || (*pclstdout >= 20)) return -1; if (((*pclstderr = fds[2]) < 0) || (*pclstderr >= 20)) return -1; return 0; } #endif /*USE_MSGHDR*/ int linearize(sockfd, reqbuf, reqlen, argc, argv, pid) int sockfd; int reqlen, argc; char *reqbuf, *argv[]; int pid; { int i; unsigned char array[4]; int ptr = 0; int len; array[0] = (pid & 0xff000000) >> 24; array[1] = (pid & 0xff0000) >> 16; array[2] = (pid & 0xff00) >> 8; array[3] = (pid & 0xff); if (sockfd >= 0) { if (writen(sockfd, array, 4) < 4) return -1; } if (reqbuf != NULL) { if (ptr + 4 >= reqlen) return -1; memcpy(reqbuf+ptr, array, 4); ptr += 4; } array[0] = (argc & 0xff000000) >> 24; array[1] = (argc & 0xff0000) >> 16; array[2] = 
(argc & 0xff00) >> 8; array[3] = (argc & 0xff); if (sockfd >= 0) { if (writen(sockfd, array, 4) < 4) return -1; } if (reqbuf != NULL) { if (ptr + 4 >= reqlen) return -1; memcpy(reqbuf+ptr, array, 4); ptr += 4; } for (i=0; i= 0) { if (writen(sockfd, argv[i], len + 1) < len + 1) return -1; if (writen(sockfd, "\n", 1) < 1) return -1; /* so that we can do gets */ } if (reqbuf != NULL) { if (ptr + len + 2 >= reqlen) return -1; strcpy(reqbuf+ptr, argv[i]); ptr += len+1; reqbuf[ptr++] = '\0'; /* so that we can do strcpy */ } #if 0 printf("sending %s\n", argv[i]); #endif } return ptr; } int delinearize(sockfd, reqbuf, reqlen, pargc, pargv, ppid) int sockfd; int reqlen, *pargc; char *reqbuf, **pargv[]; int *ppid; { int i; char line[MAXLINE]; int len; int ptr = 0; unsigned char array[4]; *ppid = 0; *pargc = 0; *pargv = NULL; memset(array, '\0', 4); if (sockfd >= 0) if (readn(sockfd, array, 4) != 4) return -1; if (reqbuf != NULL) { if (ptr+4 >= reqlen) return -1; memcpy(array, reqbuf+ptr, 4); ptr += 4; } *ppid = (array[0] << 24) + (array[1] << 16) + (array[2] << 8) + array[3]; memset(array, '\0', 4); if (sockfd >= 0) if (readn(sockfd, array, 4) != 4) return -1; if (reqbuf != NULL) { if (ptr+4 >= reqlen) return -1; memcpy(array, reqbuf+ptr, 4); ptr += 4; } *pargc = (array[0] << 24) + (array[1] << 16) + (array[2] << 8) + array[3]; #if 0 printf("clargc=%x\n", *pargc); #endif /* VERY important, set hard-coded limit to MAX_ARGS*MAX_NAME_LEN; otherwise can cause the server to allocate TONS of memory */ if (*pargc <= 0 || *pargc >= (MAX_ARGS*MAX_NAME_LEN)) { *pargc = 0; return -1; } if ((*pargv = (char **)malloc(sizeof(char *) * *pargc)) == NULL) { /* no memory, so discard */ *pargc = 0; return - 1; } memset(*pargv, '\0', sizeof(char *) * *pargc); for (i=0; i<*pargc; i++) { if (sockfd >= 0) { if (readline(sockfd, line, MAXLINE) <= 0) return -1; if ((len = mystrlen(line, MAXLINE)) <= 0) { i--; continue; } if (((*pargv)[i] = (char *)malloc(len + 2)) == NULL) return -1; line[len] = 
'\0';	/* overwrite the '\n' */
			strcpy((*pargv)[i], line);
		}
		if (reqbuf != NULL) {
			if ( ((len = mystrlen(reqbuf+ptr, reqlen-ptr)) <= 0) || (len >= MAXLINE) ) return -1;
			if (((*pargv)[i] = (char *)malloc(len + 2)) == NULL) return -1;
			strcpy((*pargv)[i], reqbuf+ptr);
			ptr += len + 2;
		}
#if	0
		printf("clargv[%x]=%s\n", i, (*pargv)[i]);
#endif
	}
	return ptr;
}

int
sendreq(sockfd, reqbuf, clstdin, clstdout, clstderr, clargc, clargv, clpid)
	int	sockfd, clstdin, clstdout, clstderr, clargc, clpid;
	char	reqbuf[MAX_ARGS*MAX_NAME_LEN], *clargv[];
{
#if	USE_MSGHDR
	struct iovec	iov[1];
	struct msghdr	msg;
	int	ret;
	int	fds[3];
#endif	/*USE_MSGHDR*/

#if	USE_MSGHDR
	if ((ret = linearize(-1, reqbuf, MAX_ARGS*MAX_NAME_LEN, clargc, clargv, clpid)) < 0) return -1;
	fds[2] = clstdin;
	fds[1] = clstdout;
	fds[0] = clstderr;
	iov[0].iov_base = (char *) reqbuf;
	iov[0].iov_len = ret;
	msg.msg_iov = iov;
	msg.msg_iovlen = 1;
	msg.msg_name = (caddr_t) NULL;
	msg.msg_namelen = 0;
	msg.msg_accrights = (caddr_t) fds;
	msg.msg_accrightslen = 2 * sizeof(int);	/* don't send clstdin */
	errno = 0;
#if	SFS_COMPAT
	if ((ret = syscall(SYS_sendmsg, sockfd, &msg, 0)) < 0) {
#else
	if ((ret = sendmsg(sockfd, &msg, 0)) < 0) {
#endif
#if	0
		printf("sendmsg ret = %x, errno = %d\n", ret, errno);
#endif
		return (-1);
	}
#if	0
	printf("sendreq %x %x %x, ret = %x, errno = %d\n", fds[0], fds[1], fds[2], ret, errno);
#endif
#else	/*USE_MSGHDR*/
	if (linearize(sockfd, (char *)NULL, MAX_ARGS*MAX_NAME_LEN, clargc, clargv, clpid) < 0) return -1;
#endif	/*USE_MSGHDR*/
	return (0);
}

int
getreq(sockfd, reqbuf, pclstdin, pclstdout, pclstderr, pclargc, pclargv, pclpid)
	int	sockfd, *pclstdin, *pclstdout, *pclstderr, *pclargc, *pclpid;
	char	reqbuf[MAX_ARGS*MAX_NAME_LEN], **pclargv[];
{
#if	USE_MSGHDR
	struct iovec	iov[1];
	struct msghdr	msg;
	int	ret;
	int	fds[3];
#endif	/*USE_MSGHDR*/

#if	USE_MSGHDR
	iov[0].iov_base = (char *) reqbuf;
	iov[0].iov_len = MAX_ARGS * MAX_NAME_LEN;
	msg.msg_iov = iov;
	msg.msg_iovlen = 1;
	msg.msg_name = (caddr_t) NULL;
	msg.msg_namelen = 0;
	msg.msg_accrights = (caddr_t)fds;
	msg.msg_accrightslen = 2*sizeof(int);
	errno = 0;
#if	SFS_COMPAT
	if ((ret = syscall(SYS_recvmsg, sockfd, &msg, 0)) < 0) {
#else
	if ((ret = recvmsg(sockfd, &msg, 0)) < 0) {
#endif
#if	0
		printf("bad recvmsg: ret = %x, errno = %d\n", ret, errno);
#endif
		return -1;
	}
	*pclstdin = fds[2];
	*pclstdout = fds[1];
	*pclstderr = fds[0];
	if ((ret = delinearize(-1, reqbuf, MAX_ARGS * MAX_NAME_LEN, pclargc, pclargv, pclpid)) < 0) return -1;
#if	0
	printf("getreq %x %x %x, ret = %x, errno = %d\n", fds[0], fds[1], fds[2], ret, errno);
#endif
#else	/*USE_MSGHDR*/
	if (delinearize(sockfd, (char *)NULL, MAX_ARGS * MAX_NAME_LEN, pclargc, pclargv, pclpid) < 0) return -1;
	*pclstdin = -1;
	*pclstdout = sockfd;
	*pclstderr = sockfd;
#endif	/*USE_MSGHDR*/
	return (0);
}
glimpse-4.18.7/glimpse/configure000077500000000000000000005176011300371307100166270ustar00rootroot00000000000000
#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.57.
#
# Copyright 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001, 2002
# Free Software Foundation, Inc.
# This configure script is free software; the Free Software Foundation
# gives unlimited permission to copy, distribute and modify it.
## --------------------- ##
## M4sh Initialization.  ##
## --------------------- ##

# Be Bourne compatible
if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then
  emulate sh
  NULLCMD=:
  # Zsh 3.x and 4.x performs word splitting on ${1+"$@"}, which
  # is contrary to our usage.  Disable this feature.
  alias -g '${1+"$@"}'='"$@"'
elif test -n "${BASH_VERSION+set}" && (set -o posix) >/dev/null 2>&1; then
  set -o posix
fi

# Support unset when possible.
if (FOO=FOO; unset FOO) >/dev/null 2>&1; then
  as_unset=unset
else
  as_unset=false
fi

# Work around bugs in pre-3.0 UWIN ksh.
$as_unset ENV MAIL MAILPATH
PS1='$ '
PS2='> '
PS4='+ '

# NLS nuisances.
for as_var in \ LANG LANGUAGE LC_ADDRESS LC_ALL LC_COLLATE LC_CTYPE LC_IDENTIFICATION \ LC_MEASUREMENT LC_MESSAGES LC_MONETARY LC_NAME LC_NUMERIC LC_PAPER \ LC_TELEPHONE LC_TIME do if (set +x; test -n "`(eval $as_var=C; export $as_var) 2>&1`"); then eval $as_var=C; export $as_var else $as_unset $as_var fi done # Required to use basename. if expr a : '\(a\)' >/dev/null 2>&1; then as_expr=expr else as_expr=false fi if (basename /) >/dev/null 2>&1 && test "X`basename / 2>&1`" = "X/"; then as_basename=basename else as_basename=false fi # Name of the executable. as_me=`$as_basename "$0" || $as_expr X/"$0" : '.*/\([^/][^/]*\)/*$' \| \ X"$0" : 'X\(//\)$' \| \ X"$0" : 'X\(/\)$' \| \ . : '\(.\)' 2>/dev/null || echo X/"$0" | sed '/^.*\/\([^/][^/]*\)\/*$/{ s//\1/; q; } /^X\/\(\/\/\)$/{ s//\1/; q; } /^X\/\(\/\).*/{ s//\1/; q; } s/.*/./; q'` # PATH needs CR, and LINENO needs CR and PATH. # Avoid depending upon Character Ranges. as_cr_letters='abcdefghijklmnopqrstuvwxyz' as_cr_LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ' as_cr_Letters=$as_cr_letters$as_cr_LETTERS as_cr_digits='0123456789' as_cr_alnum=$as_cr_Letters$as_cr_digits # The user is always right. if test "${PATH_SEPARATOR+set}" != set; then echo "#! /bin/sh" >conf$$.sh echo "exit 0" >>conf$$.sh chmod +x conf$$.sh if (PATH="/nonexistent;."; conf$$.sh) >/dev/null 2>&1; then PATH_SEPARATOR=';' else PATH_SEPARATOR=: fi rm -f conf$$.sh fi as_lineno_1=$LINENO as_lineno_2=$LINENO as_lineno_3=`(expr $as_lineno_1 + 1) 2>/dev/null` test "x$as_lineno_1" != "x$as_lineno_2" && test "x$as_lineno_3" = "x$as_lineno_2" || { # Find who we are. Look in the path if we contain no path at all # relative or not. case $0 in *[\\/]* ) as_myself=$0 ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. test -r "$as_dir/$0" && as_myself=$as_dir/$0 && break done ;; esac # We did not find ourselves, most probably we were run as `sh COMMAND' # in which case we are not to be found in the path. 
if test "x$as_myself" = x; then as_myself=$0 fi if test ! -f "$as_myself"; then { echo "$as_me: error: cannot find myself; rerun with an absolute path" >&2 { (exit 1); exit 1; }; } fi case $CONFIG_SHELL in '') as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in /bin$PATH_SEPARATOR/usr/bin$PATH_SEPARATOR$PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for as_base in sh bash ksh sh5; do case $as_dir in /*) if ("$as_dir/$as_base" -c ' as_lineno_1=$LINENO as_lineno_2=$LINENO as_lineno_3=`(expr $as_lineno_1 + 1) 2>/dev/null` test "x$as_lineno_1" != "x$as_lineno_2" && test "x$as_lineno_3" = "x$as_lineno_2" ') 2>/dev/null; then $as_unset BASH_ENV || test "${BASH_ENV+set}" != set || { BASH_ENV=; export BASH_ENV; } $as_unset ENV || test "${ENV+set}" != set || { ENV=; export ENV; } CONFIG_SHELL=$as_dir/$as_base export CONFIG_SHELL exec "$CONFIG_SHELL" "$0" ${1+"$@"} fi;; esac done done ;; esac # Create $as_me.lineno as a copy of $as_myself, but with $LINENO # uniformly replaced by the line number. The first 'sed' inserts a # line-number line before each line; the second 'sed' does the real # work. The second script uses 'N' to pair each line-number line # with the numbered line, and appends trailing '-' during # substitution so that $LINENO is not a special case at line end. # (Raja R Harinath suggested sed '=', and Paul Eggert wrote the # second 'sed' script. Blame Lee E. McMahon for sed's syntax. :-) sed '=' <$as_myself | sed ' N s,$,-, : loop s,^\(['$as_cr_digits']*\)\(.*\)[$]LINENO\([^'$as_cr_alnum'_]\),\1\2\1\3, t loop s,-$,, s,^['$as_cr_digits']*\n,, ' >$as_me.lineno && chmod +x $as_me.lineno || { echo "$as_me: error: cannot create $as_me.lineno; rerun with a POSIX shell" >&2 { (exit 1); exit 1; }; } # Don't try to exec as it changes $[0], causing all sort of problems # (the dirname of $[0] is not the place where we might find the # original and so on. Autoconf is especially sensible to this). . ./$as_me.lineno # Exit status is that of the last command. 
exit } case `echo "testing\c"; echo 1,2,3`,`echo -n testing; echo 1,2,3` in *c*,-n*) ECHO_N= ECHO_C=' ' ECHO_T=' ' ;; *c*,* ) ECHO_N=-n ECHO_C= ECHO_T= ;; *) ECHO_N= ECHO_C='\c' ECHO_T= ;; esac if expr a : '\(a\)' >/dev/null 2>&1; then as_expr=expr else as_expr=false fi rm -f conf$$ conf$$.exe conf$$.file echo >conf$$.file if ln -s conf$$.file conf$$ 2>/dev/null; then # We could just check for DJGPP; but this test a) works b) is more generic # and c) will remain valid once DJGPP supports symlinks (DJGPP 2.04). if test -f conf$$.exe; then # Don't use ln at all; we don't have any links as_ln_s='cp -p' else as_ln_s='ln -s' fi elif ln conf$$.file conf$$ 2>/dev/null; then as_ln_s=ln else as_ln_s='cp -p' fi rm -f conf$$ conf$$.exe conf$$.file if mkdir -p . 2>/dev/null; then as_mkdir_p=: else as_mkdir_p=false fi as_executable_p="test -f" # Sed expression to map a string onto a valid CPP name. as_tr_cpp="sed y%*$as_cr_letters%P$as_cr_LETTERS%;s%[^_$as_cr_alnum]%_%g" # Sed expression to map a string onto a valid variable name. as_tr_sh="sed y%*+%pp%;s%[^_$as_cr_alnum]%_%g" # IFS # We need space, tab and new line, in precisely that order. as_nl=' ' IFS=" $as_nl" # CDPATH. $as_unset CDPATH # Name of the host. # hostname on some systems (SVR3.2, Linux) returns a bogus exit status, # so uname gets run too. ac_hostname=`(hostname || uname -n) 2>/dev/null | sed 1q` exec 6>&1 # # Initializations. # ac_default_prefix=/usr/local ac_config_libobj_dir=. cross_compiling=no subdirs= MFLAGS= MAKEFLAGS= SHELL=${CONFIG_SHELL-/bin/sh} # Maximum number of lines to put in a shell here document. # This variable seems obsolete. It should probably be removed, and # only ac_max_sed_lines should be used. : ${ac_max_here_lines=38} # Identity of this package. PACKAGE_NAME= PACKAGE_TARNAME= PACKAGE_VERSION= PACKAGE_STRING= PACKAGE_BUGREPORT= ac_unique_file="get_filename.c" # Factoring default headers for most tests. 
ac_includes_default="\ #include #if HAVE_SYS_TYPES_H # include #endif #if HAVE_SYS_STAT_H # include #endif #if STDC_HEADERS # include # include #else # if HAVE_STDLIB_H # include # endif #endif #if HAVE_STRING_H # if !STDC_HEADERS && HAVE_MEMORY_H # include # endif # include #endif #if HAVE_STRINGS_H # include #endif #if HAVE_INTTYPES_H # include #else # if HAVE_STDINT_H # include # endif #endif #if HAVE_UNISTD_H # include #endif" ac_subst_vars='SHELL PATH_SEPARATOR PACKAGE_NAME PACKAGE_TARNAME PACKAGE_VERSION PACKAGE_STRING PACKAGE_BUGREPORT exec_prefix prefix program_transform_name bindir sbindir libexecdir datadir sysconfdir sharedstatedir localstatedir libdir includedir oldincludedir infodir mandir build_alias host_alias target_alias DEFS ECHO_C ECHO_N ECHO_T LIBS CC CFLAGS LDFLAGS CPPFLAGS ac_ct_CC EXEEXT OBJEXT AR RANLIB ac_ct_RANLIB LN_S LEX LEXLIB LEX_OUTPUT_ROOT STRIP CP INSTALL_PROGRAM INSTALL_SCRIPT INSTALL_DATA CPP EGREP TARGET HAVE_STRDUP LEXFLAGS DYNFILTER_TARGET DYNFILTER_CFLAGS DYNFILTER LIBOBJS LTLIBOBJS' ac_subst_files='' # Initialize some variables set by options. ac_init_help= ac_init_version=false # The variables have the same names as the options, with # dashes changed to underlines. cache_file=/dev/null exec_prefix=NONE no_create= no_recursion= prefix=NONE program_prefix=NONE program_suffix=NONE program_transform_name=s,x,x, silent= site= srcdir= verbose= x_includes=NONE x_libraries=NONE # Installation directory options. # These are left unexpanded so users can "make install exec_prefix=/foo" # and all the variables that are supposed to be based on exec_prefix # by default will actually change. # Use braces instead of parens because sh, perl, etc. also accept them. 
bindir='${exec_prefix}/bin' sbindir='${exec_prefix}/sbin' libexecdir='${exec_prefix}/libexec' datadir='${prefix}/share' sysconfdir='${prefix}/etc' sharedstatedir='${prefix}/com' localstatedir='${prefix}/var' libdir='${exec_prefix}/lib' includedir='${prefix}/include' oldincludedir='/usr/include' infodir='${prefix}/info' mandir='${prefix}/man' ac_prev= for ac_option do # If the previous option needs an argument, assign it. if test -n "$ac_prev"; then eval "$ac_prev=\$ac_option" ac_prev= continue fi ac_optarg=`expr "x$ac_option" : 'x[^=]*=\(.*\)'` # Accept the important Cygnus configure options, so we can diagnose typos. case $ac_option in -bindir | --bindir | --bindi | --bind | --bin | --bi) ac_prev=bindir ;; -bindir=* | --bindir=* | --bindi=* | --bind=* | --bin=* | --bi=*) bindir=$ac_optarg ;; -build | --build | --buil | --bui | --bu) ac_prev=build_alias ;; -build=* | --build=* | --buil=* | --bui=* | --bu=*) build_alias=$ac_optarg ;; -cache-file | --cache-file | --cache-fil | --cache-fi \ | --cache-f | --cache- | --cache | --cach | --cac | --ca | --c) ac_prev=cache_file ;; -cache-file=* | --cache-file=* | --cache-fil=* | --cache-fi=* \ | --cache-f=* | --cache-=* | --cache=* | --cach=* | --cac=* | --ca=* | --c=*) cache_file=$ac_optarg ;; --config-cache | -C) cache_file=config.cache ;; -datadir | --datadir | --datadi | --datad | --data | --dat | --da) ac_prev=datadir ;; -datadir=* | --datadir=* | --datadi=* | --datad=* | --data=* | --dat=* \ | --da=*) datadir=$ac_optarg ;; -disable-* | --disable-*) ac_feature=`expr "x$ac_option" : 'x-*disable-\(.*\)'` # Reject names that are not valid shell variable names. expr "x$ac_feature" : ".*[^-_$as_cr_alnum]" >/dev/null && { echo "$as_me: error: invalid feature name: $ac_feature" >&2 { (exit 1); exit 1; }; } ac_feature=`echo $ac_feature | sed 's/-/_/g'` eval "enable_$ac_feature=no" ;; -enable-* | --enable-*) ac_feature=`expr "x$ac_option" : 'x-*enable-\([^=]*\)'` # Reject names that are not valid shell variable names. 
expr "x$ac_feature" : ".*[^-_$as_cr_alnum]" >/dev/null && { echo "$as_me: error: invalid feature name: $ac_feature" >&2 { (exit 1); exit 1; }; } ac_feature=`echo $ac_feature | sed 's/-/_/g'` case $ac_option in *=*) ac_optarg=`echo "$ac_optarg" | sed "s/'/'\\\\\\\\''/g"`;; *) ac_optarg=yes ;; esac eval "enable_$ac_feature='$ac_optarg'" ;; -exec-prefix | --exec_prefix | --exec-prefix | --exec-prefi \ | --exec-pref | --exec-pre | --exec-pr | --exec-p | --exec- \ | --exec | --exe | --ex) ac_prev=exec_prefix ;; -exec-prefix=* | --exec_prefix=* | --exec-prefix=* | --exec-prefi=* \ | --exec-pref=* | --exec-pre=* | --exec-pr=* | --exec-p=* | --exec-=* \ | --exec=* | --exe=* | --ex=*) exec_prefix=$ac_optarg ;; -gas | --gas | --ga | --g) # Obsolete; use --with-gas. with_gas=yes ;; -help | --help | --hel | --he | -h) ac_init_help=long ;; -help=r* | --help=r* | --hel=r* | --he=r* | -hr*) ac_init_help=recursive ;; -help=s* | --help=s* | --hel=s* | --he=s* | -hs*) ac_init_help=short ;; -host | --host | --hos | --ho) ac_prev=host_alias ;; -host=* | --host=* | --hos=* | --ho=*) host_alias=$ac_optarg ;; -includedir | --includedir | --includedi | --included | --include \ | --includ | --inclu | --incl | --inc) ac_prev=includedir ;; -includedir=* | --includedir=* | --includedi=* | --included=* | --include=* \ | --includ=* | --inclu=* | --incl=* | --inc=*) includedir=$ac_optarg ;; -infodir | --infodir | --infodi | --infod | --info | --inf) ac_prev=infodir ;; -infodir=* | --infodir=* | --infodi=* | --infod=* | --info=* | --inf=*) infodir=$ac_optarg ;; -libdir | --libdir | --libdi | --libd) ac_prev=libdir ;; -libdir=* | --libdir=* | --libdi=* | --libd=*) libdir=$ac_optarg ;; -libexecdir | --libexecdir | --libexecdi | --libexecd | --libexec \ | --libexe | --libex | --libe) ac_prev=libexecdir ;; -libexecdir=* | --libexecdir=* | --libexecdi=* | --libexecd=* | --libexec=* \ | --libexe=* | --libex=* | --libe=*) libexecdir=$ac_optarg ;; -localstatedir | --localstatedir | --localstatedi | 
--localstated \ | --localstate | --localstat | --localsta | --localst \ | --locals | --local | --loca | --loc | --lo) ac_prev=localstatedir ;; -localstatedir=* | --localstatedir=* | --localstatedi=* | --localstated=* \ | --localstate=* | --localstat=* | --localsta=* | --localst=* \ | --locals=* | --local=* | --loca=* | --loc=* | --lo=*) localstatedir=$ac_optarg ;; -mandir | --mandir | --mandi | --mand | --man | --ma | --m) ac_prev=mandir ;; -mandir=* | --mandir=* | --mandi=* | --mand=* | --man=* | --ma=* | --m=*) mandir=$ac_optarg ;; -nfp | --nfp | --nf) # Obsolete; use --without-fp. with_fp=no ;; -no-create | --no-create | --no-creat | --no-crea | --no-cre \ | --no-cr | --no-c | -n) no_create=yes ;; -no-recursion | --no-recursion | --no-recursio | --no-recursi \ | --no-recurs | --no-recur | --no-recu | --no-rec | --no-re | --no-r) no_recursion=yes ;; -oldincludedir | --oldincludedir | --oldincludedi | --oldincluded \ | --oldinclude | --oldinclud | --oldinclu | --oldincl | --oldinc \ | --oldin | --oldi | --old | --ol | --o) ac_prev=oldincludedir ;; -oldincludedir=* | --oldincludedir=* | --oldincludedi=* | --oldincluded=* \ | --oldinclude=* | --oldinclud=* | --oldinclu=* | --oldincl=* | --oldinc=* \ | --oldin=* | --oldi=* | --old=* | --ol=* | --o=*) oldincludedir=$ac_optarg ;; -prefix | --prefix | --prefi | --pref | --pre | --pr | --p) ac_prev=prefix ;; -prefix=* | --prefix=* | --prefi=* | --pref=* | --pre=* | --pr=* | --p=*) prefix=$ac_optarg ;; -program-prefix | --program-prefix | --program-prefi | --program-pref \ | --program-pre | --program-pr | --program-p) ac_prev=program_prefix ;; -program-prefix=* | --program-prefix=* | --program-prefi=* \ | --program-pref=* | --program-pre=* | --program-pr=* | --program-p=*) program_prefix=$ac_optarg ;; -program-suffix | --program-suffix | --program-suffi | --program-suff \ | --program-suf | --program-su | --program-s) ac_prev=program_suffix ;; -program-suffix=* | --program-suffix=* | --program-suffi=* \ | --program-suff=* 
| --program-suf=* | --program-su=* | --program-s=*) program_suffix=$ac_optarg ;; -program-transform-name | --program-transform-name \ | --program-transform-nam | --program-transform-na \ | --program-transform-n | --program-transform- \ | --program-transform | --program-transfor \ | --program-transfo | --program-transf \ | --program-trans | --program-tran \ | --progr-tra | --program-tr | --program-t) ac_prev=program_transform_name ;; -program-transform-name=* | --program-transform-name=* \ | --program-transform-nam=* | --program-transform-na=* \ | --program-transform-n=* | --program-transform-=* \ | --program-transform=* | --program-transfor=* \ | --program-transfo=* | --program-transf=* \ | --program-trans=* | --program-tran=* \ | --progr-tra=* | --program-tr=* | --program-t=*) program_transform_name=$ac_optarg ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil) silent=yes ;; -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb) ac_prev=sbindir ;; -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \ | --sbi=* | --sb=*) sbindir=$ac_optarg ;; -sharedstatedir | --sharedstatedir | --sharedstatedi \ | --sharedstated | --sharedstate | --sharedstat | --sharedsta \ | --sharedst | --shareds | --shared | --share | --shar \ | --sha | --sh) ac_prev=sharedstatedir ;; -sharedstatedir=* | --sharedstatedir=* | --sharedstatedi=* \ | --sharedstated=* | --sharedstate=* | --sharedstat=* | --sharedsta=* \ | --sharedst=* | --shareds=* | --shared=* | --share=* | --shar=* \ | --sha=* | --sh=*) sharedstatedir=$ac_optarg ;; -site | --site | --sit) ac_prev=site ;; -site=* | --site=* | --sit=*) site=$ac_optarg ;; -srcdir | --srcdir | --srcdi | --srcd | --src | --sr) ac_prev=srcdir ;; -srcdir=* | --srcdir=* | --srcdi=* | --srcd=* | --src=* | --sr=*) srcdir=$ac_optarg ;; -sysconfdir | --sysconfdir | --sysconfdi | --sysconfd | --sysconf \ | --syscon | --sysco | --sysc | --sys | --sy) ac_prev=sysconfdir ;; -sysconfdir=* 
| --sysconfdir=* | --sysconfdi=* | --sysconfd=* | --sysconf=* \ | --syscon=* | --sysco=* | --sysc=* | --sys=* | --sy=*) sysconfdir=$ac_optarg ;; -target | --target | --targe | --targ | --tar | --ta | --t) ac_prev=target_alias ;; -target=* | --target=* | --targe=* | --targ=* | --tar=* | --ta=* | --t=*) target_alias=$ac_optarg ;; -v | -verbose | --verbose | --verbos | --verbo | --verb) verbose=yes ;; -version | --version | --versio | --versi | --vers | -V) ac_init_version=: ;; -with-* | --with-*) ac_package=`expr "x$ac_option" : 'x-*with-\([^=]*\)'` # Reject names that are not valid shell variable names. expr "x$ac_package" : ".*[^-_$as_cr_alnum]" >/dev/null && { echo "$as_me: error: invalid package name: $ac_package" >&2 { (exit 1); exit 1; }; } ac_package=`echo $ac_package| sed 's/-/_/g'` case $ac_option in *=*) ac_optarg=`echo "$ac_optarg" | sed "s/'/'\\\\\\\\''/g"`;; *) ac_optarg=yes ;; esac eval "with_$ac_package='$ac_optarg'" ;; -without-* | --without-*) ac_package=`expr "x$ac_option" : 'x-*without-\(.*\)'` # Reject names that are not valid shell variable names. expr "x$ac_package" : ".*[^-_$as_cr_alnum]" >/dev/null && { echo "$as_me: error: invalid package name: $ac_package" >&2 { (exit 1); exit 1; }; } ac_package=`echo $ac_package | sed 's/-/_/g'` eval "with_$ac_package=no" ;; --x) # Obsolete; use --with-x. 
with_x=yes ;; -x-includes | --x-includes | --x-include | --x-includ | --x-inclu \ | --x-incl | --x-inc | --x-in | --x-i) ac_prev=x_includes ;; -x-includes=* | --x-includes=* | --x-include=* | --x-includ=* | --x-inclu=* \ | --x-incl=* | --x-inc=* | --x-in=* | --x-i=*) x_includes=$ac_optarg ;; -x-libraries | --x-libraries | --x-librarie | --x-librari \ | --x-librar | --x-libra | --x-libr | --x-lib | --x-li | --x-l) ac_prev=x_libraries ;; -x-libraries=* | --x-libraries=* | --x-librarie=* | --x-librari=* \ | --x-librar=* | --x-libra=* | --x-libr=* | --x-lib=* | --x-li=* | --x-l=*) x_libraries=$ac_optarg ;; -*) { echo "$as_me: error: unrecognized option: $ac_option Try \`$0 --help' for more information." >&2 { (exit 1); exit 1; }; } ;; *=*) ac_envvar=`expr "x$ac_option" : 'x\([^=]*\)='` # Reject names that are not valid shell variable names. expr "x$ac_envvar" : ".*[^_$as_cr_alnum]" >/dev/null && { echo "$as_me: error: invalid variable name: $ac_envvar" >&2 { (exit 1); exit 1; }; } ac_optarg=`echo "$ac_optarg" | sed "s/'/'\\\\\\\\''/g"` eval "$ac_envvar='$ac_optarg'" export $ac_envvar ;; *) # FIXME: should be removed in autoconf 3.0. echo "$as_me: WARNING: you should use --build, --host, --target" >&2 expr "x$ac_option" : ".*[^-._$as_cr_alnum]" >/dev/null && echo "$as_me: WARNING: invalid host type: $ac_option" >&2 : ${build_alias=$ac_option} ${host_alias=$ac_option} ${target_alias=$ac_option} ;; esac done if test -n "$ac_prev"; then ac_option=--`echo $ac_prev | sed 's/_/-/g'` { echo "$as_me: error: missing argument to $ac_option" >&2 { (exit 1); exit 1; }; } fi # Be sure to have absolute paths. for ac_var in exec_prefix prefix do eval ac_val=$`echo $ac_var` case $ac_val in [\\/$]* | ?:[\\/]* | NONE | '' ) ;; *) { echo "$as_me: error: expected an absolute directory name for --$ac_var: $ac_val" >&2 { (exit 1); exit 1; }; };; esac done # Be sure to have absolute paths. 
for ac_var in bindir sbindir libexecdir datadir sysconfdir sharedstatedir \ localstatedir libdir includedir oldincludedir infodir mandir do eval ac_val=$`echo $ac_var` case $ac_val in [\\/$]* | ?:[\\/]* ) ;; *) { echo "$as_me: error: expected an absolute directory name for --$ac_var: $ac_val" >&2 { (exit 1); exit 1; }; };; esac done # There might be people who depend on the old broken behavior: `$host' # used to hold the argument of --host etc. # FIXME: To remove some day. build=$build_alias host=$host_alias target=$target_alias # FIXME: To remove some day. if test "x$host_alias" != x; then if test "x$build_alias" = x; then cross_compiling=maybe echo "$as_me: WARNING: If you wanted to set the --build type, don't use --host. If a cross compiler is detected then cross compile mode will be used." >&2 elif test "x$build_alias" != "x$host_alias"; then cross_compiling=yes fi fi ac_tool_prefix= test -n "$host_alias" && ac_tool_prefix=$host_alias- test "$silent" = yes && exec 6>/dev/null # Find the source files, if location was not specified. if test -z "$srcdir"; then ac_srcdir_defaulted=yes # Try the directory containing this script, then its parent. ac_confdir=`(dirname "$0") 2>/dev/null || $as_expr X"$0" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$0" : 'X\(//\)[^/]' \| \ X"$0" : 'X\(//\)$' \| \ X"$0" : 'X\(/\)' \| \ . : '\(.\)' 2>/dev/null || echo X"$0" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/; q; } /^X\(\/\/\)[^/].*/{ s//\1/; q; } /^X\(\/\/\)$/{ s//\1/; q; } /^X\(\/\).*/{ s//\1/; q; } s/.*/./; q'` srcdir=$ac_confdir if test ! -r $srcdir/$ac_unique_file; then srcdir=.. fi else ac_srcdir_defaulted=no fi if test ! -r $srcdir/$ac_unique_file; then if test "$ac_srcdir_defaulted" = yes; then { echo "$as_me: error: cannot find sources ($ac_unique_file) in $ac_confdir or .." 
>&2 { (exit 1); exit 1; }; } else { echo "$as_me: error: cannot find sources ($ac_unique_file) in $srcdir" >&2 { (exit 1); exit 1; }; } fi fi (cd $srcdir && test -r ./$ac_unique_file) 2>/dev/null || { echo "$as_me: error: sources are in $srcdir, but \`cd $srcdir' does not work" >&2 { (exit 1); exit 1; }; } srcdir=`echo "$srcdir" | sed 's%\([^\\/]\)[\\/]*$%\1%'` ac_env_build_alias_set=${build_alias+set} ac_env_build_alias_value=$build_alias ac_cv_env_build_alias_set=${build_alias+set} ac_cv_env_build_alias_value=$build_alias ac_env_host_alias_set=${host_alias+set} ac_env_host_alias_value=$host_alias ac_cv_env_host_alias_set=${host_alias+set} ac_cv_env_host_alias_value=$host_alias ac_env_target_alias_set=${target_alias+set} ac_env_target_alias_value=$target_alias ac_cv_env_target_alias_set=${target_alias+set} ac_cv_env_target_alias_value=$target_alias ac_env_CC_set=${CC+set} ac_env_CC_value=$CC ac_cv_env_CC_set=${CC+set} ac_cv_env_CC_value=$CC ac_env_CFLAGS_set=${CFLAGS+set} ac_env_CFLAGS_value=$CFLAGS ac_cv_env_CFLAGS_set=${CFLAGS+set} ac_cv_env_CFLAGS_value=$CFLAGS ac_env_LDFLAGS_set=${LDFLAGS+set} ac_env_LDFLAGS_value=$LDFLAGS ac_cv_env_LDFLAGS_set=${LDFLAGS+set} ac_cv_env_LDFLAGS_value=$LDFLAGS ac_env_CPPFLAGS_set=${CPPFLAGS+set} ac_env_CPPFLAGS_value=$CPPFLAGS ac_cv_env_CPPFLAGS_set=${CPPFLAGS+set} ac_cv_env_CPPFLAGS_value=$CPPFLAGS ac_env_CPP_set=${CPP+set} ac_env_CPP_value=$CPP ac_cv_env_CPP_set=${CPP+set} ac_cv_env_CPP_value=$CPP # # Report the --help message. # if test "$ac_init_help" = "long"; then # Omit some internal or obsolete options to make the list less imposing. # This message is too long to be a string in the A/UX 3.1 sh. cat <<_ACEOF \`configure' configures this package to adapt to many kinds of systems. Usage: $0 [OPTION]... [VAR=VALUE]... To assign environment variables (e.g., CC, CFLAGS...), specify them as VAR=VALUE. See below for descriptions of some of the useful variables. Defaults for the options are specified in brackets. 
Configuration: -h, --help display this help and exit --help=short display options specific to this package --help=recursive display the short help of all the included packages -V, --version display version information and exit -q, --quiet, --silent do not print \`checking...' messages --cache-file=FILE cache test results in FILE [disabled] -C, --config-cache alias for \`--cache-file=config.cache' -n, --no-create do not create output files --srcdir=DIR find the sources in DIR [configure dir or \`..'] _ACEOF cat <<_ACEOF Installation directories: --prefix=PREFIX install architecture-independent files in PREFIX [$ac_default_prefix] --exec-prefix=EPREFIX install architecture-dependent files in EPREFIX [PREFIX] By default, \`make install' will install all the files in \`$ac_default_prefix/bin', \`$ac_default_prefix/lib' etc. You can specify an installation prefix other than \`$ac_default_prefix' using \`--prefix', for instance \`--prefix=\$HOME'. For better control, use the options below. Fine tuning of the installation directories: --bindir=DIR user executables [EPREFIX/bin] --sbindir=DIR system admin executables [EPREFIX/sbin] --libexecdir=DIR program executables [EPREFIX/libexec] --datadir=DIR read-only architecture-independent data [PREFIX/share] --sysconfdir=DIR read-only single-machine data [PREFIX/etc] --sharedstatedir=DIR modifiable architecture-independent data [PREFIX/com] --localstatedir=DIR modifiable single-machine data [PREFIX/var] --libdir=DIR object code libraries [EPREFIX/lib] --includedir=DIR C header files [PREFIX/include] --oldincludedir=DIR C header files for non-gcc [/usr/include] --infodir=DIR info documentation [PREFIX/info] --mandir=DIR man documentation [PREFIX/man] _ACEOF cat <<\_ACEOF _ACEOF fi if test -n "$ac_init_help"; then cat <<\_ACEOF Optional Features: --disable-FEATURE do not include FEATURE (same as --enable-FEATURE=no) --enable-FEATURE[=ARG] include FEATURE [ARG=yes] --enable-structured-queries enable structured queries 
--disable-iso-charset disable iso charset (may be slightly faster if you don't care about upper-ascii characters) --enable-sfs-compat Support SFS compatibility --enable-pointer blah --enable-measure-times blah --enable-warnings Add -Wall to CFLAGS --enable-strip Strip binaries Optional Packages: --with-PACKAGE[=ARG] use PACKAGE [ARG=yes] --without-PACKAGE do not use PACKAGE (same as --with-PACKAGE=no) --with-file-end-mark=CHAR use character CHAR as filename delimiter ' ' most often set to '\t' in order to index filenames with spaces; must match Webglimpse setting in lib/wgHeader.pm. Some influential environment variables: CC C compiler command CFLAGS C compiler flags LDFLAGS linker flags, e.g. -L<lib dir> if you have libraries in a nonstandard directory <lib dir> CPPFLAGS C/C++ preprocessor flags, e.g. -I<include dir> if you have headers in a nonstandard directory <include dir> CPP C preprocessor Use these variables to override the choices made by `configure' or to help it to find libraries and programs with nonstandard names/locations. _ACEOF fi if test "$ac_init_help" = "recursive"; then # If there are subdirs, report their specific --help. ac_popdir=`pwd` for ac_dir in : $ac_subdirs_all; do test "x$ac_dir" = x: && continue test -d $ac_dir || continue ac_builddir=. if test "$ac_dir" != .; then ac_dir_suffix=/`echo "$ac_dir" | sed 's,^\.[\\/],,'` # A "../" for each directory in $ac_dir_suffix. ac_top_builddir=`echo "$ac_dir_suffix" | sed 's,/[^\\/]*,../,g'` else ac_dir_suffix= ac_top_builddir= fi case $srcdir in .) # No --srcdir option. We are building in place. ac_srcdir=. if test -z "$ac_top_builddir"; then ac_top_srcdir=. else ac_top_srcdir=`echo $ac_top_builddir | sed 's,/$,,'` fi ;; [\\/]* | ?:[\\/]* ) # Absolute path. ac_srcdir=$srcdir$ac_dir_suffix; ac_top_srcdir=$srcdir ;; *) # Relative path. ac_srcdir=$ac_top_builddir$srcdir$ac_dir_suffix ac_top_srcdir=$ac_top_builddir$srcdir ;; esac # Don't blindly perform a `cd "$ac_dir"/$ac_foo && pwd` since $ac_foo can be # absolute. 
ac_abs_builddir=`cd "$ac_dir" && cd $ac_builddir && pwd` ac_abs_top_builddir=`cd "$ac_dir" && cd ${ac_top_builddir}. && pwd` ac_abs_srcdir=`cd "$ac_dir" && cd $ac_srcdir && pwd` ac_abs_top_srcdir=`cd "$ac_dir" && cd $ac_top_srcdir && pwd` cd $ac_dir # Check for guested configure; otherwise get Cygnus style configure. if test -f $ac_srcdir/configure.gnu; then echo $SHELL $ac_srcdir/configure.gnu --help=recursive elif test -f $ac_srcdir/configure; then echo $SHELL $ac_srcdir/configure --help=recursive elif test -f $ac_srcdir/configure.ac || test -f $ac_srcdir/configure.in; then echo $ac_configure --help else echo "$as_me: WARNING: no configuration information is in $ac_dir" >&2 fi cd $ac_popdir done fi test -n "$ac_init_help" && exit 0 if $ac_init_version; then cat <<\_ACEOF Copyright 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001, 2002 Free Software Foundation, Inc. This configure script is free software; the Free Software Foundation gives unlimited permission to copy, distribute and modify it. _ACEOF exit 0 fi exec 5>config.log cat >&5 <<_ACEOF This file contains any messages produced by compilers while running configure, to aid debugging if configure makes a mistake. It was created by $as_me, which was generated by GNU Autoconf 2.57. Invocation command line was $ $0 $@ _ACEOF { cat <<_ASUNAME ## --------- ## ## Platform. 
## ## --------- ## hostname = `(hostname || uname -n) 2>/dev/null | sed 1q` uname -m = `(uname -m) 2>/dev/null || echo unknown` uname -r = `(uname -r) 2>/dev/null || echo unknown` uname -s = `(uname -s) 2>/dev/null || echo unknown` uname -v = `(uname -v) 2>/dev/null || echo unknown` /usr/bin/uname -p = `(/usr/bin/uname -p) 2>/dev/null || echo unknown` /bin/uname -X = `(/bin/uname -X) 2>/dev/null || echo unknown` /bin/arch = `(/bin/arch) 2>/dev/null || echo unknown` /usr/bin/arch -k = `(/usr/bin/arch -k) 2>/dev/null || echo unknown` /usr/convex/getsysinfo = `(/usr/convex/getsysinfo) 2>/dev/null || echo unknown` hostinfo = `(hostinfo) 2>/dev/null || echo unknown` /bin/machine = `(/bin/machine) 2>/dev/null || echo unknown` /usr/bin/oslevel = `(/usr/bin/oslevel) 2>/dev/null || echo unknown` /bin/universe = `(/bin/universe) 2>/dev/null || echo unknown` _ASUNAME as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. echo "PATH: $as_dir" done } >&5 cat >&5 <<_ACEOF ## ----------- ## ## Core tests. ## ## ----------- ## _ACEOF # Keep a trace of the command line. # Strip out --no-create and --no-recursion so they do not pile up. # Strip out --silent because we don't want to record it for future runs. # Also quote any args containing shell meta-characters. # Make two passes to allow for proper duplicate-argument suppression. 
ac_configure_args= ac_configure_args0= ac_configure_args1= ac_sep= ac_must_keep_next=false for ac_pass in 1 2 do for ac_arg do case $ac_arg in -no-create | --no-c* | -n | -no-recursion | --no-r*) continue ;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil) continue ;; *" "*|*" "*|*[\[\]\~\#\$\^\&\*\(\)\{\}\\\|\;\<\>\?\"\']*) ac_arg=`echo "$ac_arg" | sed "s/'/'\\\\\\\\''/g"` ;; esac case $ac_pass in 1) ac_configure_args0="$ac_configure_args0 '$ac_arg'" ;; 2) ac_configure_args1="$ac_configure_args1 '$ac_arg'" if test $ac_must_keep_next = true; then ac_must_keep_next=false # Got value, back to normal. else case $ac_arg in *=* | --config-cache | -C | -disable-* | --disable-* \ | -enable-* | --enable-* | -gas | --g* | -nfp | --nf* \ | -q | -quiet | --q* | -silent | --sil* | -v | -verb* \ | -with-* | --with-* | -without-* | --without-* | --x) case "$ac_configure_args0 " in "$ac_configure_args1"*" '$ac_arg' "* ) continue ;; esac ;; -* ) ac_must_keep_next=true ;; esac fi ac_configure_args="$ac_configure_args$ac_sep'$ac_arg'" # Get rid of the leading space. ac_sep=" " ;; esac done done $as_unset ac_configure_args0 || test "${ac_configure_args0+set}" != set || { ac_configure_args0=; export ac_configure_args0; } $as_unset ac_configure_args1 || test "${ac_configure_args1+set}" != set || { ac_configure_args1=; export ac_configure_args1; } # When interrupted or exit'd, cleanup temporary files, and complete # config.log. We remove comments because anyway the quotes in there # would cause problems or look ugly. # WARNING: Be sure not to use single quotes in there, as some shells, # such as our DU 5.0 friend, will then `close' the trap. trap 'exit_status=$? # Save into config.log some information that might help in debugging. { echo cat <<\_ASBOX ## ---------------- ## ## Cache variables. 
## ## ---------------- ## _ASBOX echo # The following way of writing the cache mishandles newlines in values, { (set) 2>&1 | case `(ac_space='"'"' '"'"'; set | grep ac_space) 2>&1` in *ac_space=\ *) sed -n \ "s/'"'"'/'"'"'\\\\'"'"''"'"'/g; s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1='"'"'\\2'"'"'/p" ;; *) sed -n \ "s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1=\\2/p" ;; esac; } echo cat <<\_ASBOX ## ----------------- ## ## Output variables. ## ## ----------------- ## _ASBOX echo for ac_var in $ac_subst_vars do eval ac_val=$`echo $ac_var` echo "$ac_var='"'"'$ac_val'"'"'" done | sort echo if test -n "$ac_subst_files"; then cat <<\_ASBOX ## ------------- ## ## Output files. ## ## ------------- ## _ASBOX echo for ac_var in $ac_subst_files do eval ac_val=$`echo $ac_var` echo "$ac_var='"'"'$ac_val'"'"'" done | sort echo fi if test -s confdefs.h; then cat <<\_ASBOX ## ----------- ## ## confdefs.h. ## ## ----------- ## _ASBOX echo sed "/^$/d" confdefs.h | sort echo fi test "$ac_signal" != 0 && echo "$as_me: caught signal $ac_signal" echo "$as_me: exit $exit_status" } >&5 rm -f core core.* *.core && rm -rf conftest* confdefs* conf$$* $ac_clean_files && exit $exit_status ' 0 for ac_signal in 1 2 13 15; do trap 'ac_signal='$ac_signal'; { (exit 1); exit 1; }' $ac_signal done ac_signal=0 # confdefs.h avoids OS command line length limits that DEFS can exceed. rm -rf conftest* confdefs.h # AIX cpp loses on an empty file, so make sure it contains at least a newline. echo >confdefs.h # Predefined preprocessor variables. cat >>confdefs.h <<_ACEOF #define PACKAGE_NAME "$PACKAGE_NAME" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_TARNAME "$PACKAGE_TARNAME" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_VERSION "$PACKAGE_VERSION" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_STRING "$PACKAGE_STRING" _ACEOF cat >>confdefs.h <<_ACEOF #define PACKAGE_BUGREPORT "$PACKAGE_BUGREPORT" _ACEOF # Let the site file select an alternate cache file if it wants to. 
# Prefer explicitly selected file to automatically selected ones. if test -z "$CONFIG_SITE"; then if test "x$prefix" != xNONE; then CONFIG_SITE="$prefix/share/config.site $prefix/etc/config.site" else CONFIG_SITE="$ac_default_prefix/share/config.site $ac_default_prefix/etc/config.site" fi fi for ac_site_file in $CONFIG_SITE; do if test -r "$ac_site_file"; then { echo "$as_me:$LINENO: loading site script $ac_site_file" >&5 echo "$as_me: loading site script $ac_site_file" >&6;} sed 's/^/| /' "$ac_site_file" >&5 . "$ac_site_file" fi done if test -r "$cache_file"; then # Some versions of bash will fail to source /dev/null (special # files actually), so we avoid doing that. if test -f "$cache_file"; then { echo "$as_me:$LINENO: loading cache $cache_file" >&5 echo "$as_me: loading cache $cache_file" >&6;} case $cache_file in [\\/]* | ?:[\\/]* ) . $cache_file;; *) . ./$cache_file;; esac fi else { echo "$as_me:$LINENO: creating cache $cache_file" >&5 echo "$as_me: creating cache $cache_file" >&6;} >$cache_file fi # Check that the precious variables saved in the cache have kept the same # value. 
ac_cache_corrupted=false for ac_var in `(set) 2>&1 | sed -n 's/^ac_env_\([a-zA-Z_0-9]*\)_set=.*/\1/p'`; do eval ac_old_set=\$ac_cv_env_${ac_var}_set eval ac_new_set=\$ac_env_${ac_var}_set eval ac_old_val="\$ac_cv_env_${ac_var}_value" eval ac_new_val="\$ac_env_${ac_var}_value" case $ac_old_set,$ac_new_set in set,) { echo "$as_me:$LINENO: error: \`$ac_var' was set to \`$ac_old_val' in the previous run" >&5 echo "$as_me: error: \`$ac_var' was set to \`$ac_old_val' in the previous run" >&2;} ac_cache_corrupted=: ;; ,set) { echo "$as_me:$LINENO: error: \`$ac_var' was not set in the previous run" >&5 echo "$as_me: error: \`$ac_var' was not set in the previous run" >&2;} ac_cache_corrupted=: ;; ,);; *) if test "x$ac_old_val" != "x$ac_new_val"; then { echo "$as_me:$LINENO: error: \`$ac_var' has changed since the previous run:" >&5 echo "$as_me: error: \`$ac_var' has changed since the previous run:" >&2;} { echo "$as_me:$LINENO: former value: $ac_old_val" >&5 echo "$as_me: former value: $ac_old_val" >&2;} { echo "$as_me:$LINENO: current value: $ac_new_val" >&5 echo "$as_me: current value: $ac_new_val" >&2;} ac_cache_corrupted=: fi;; esac # Pass precious variables to config.status. if test "$ac_new_set" = set; then case $ac_new_val in *" "*|*" "*|*[\[\]\~\#\$\^\&\*\(\)\{\}\\\|\;\<\>\?\"\']*) ac_arg=$ac_var=`echo "$ac_new_val" | sed "s/'/'\\\\\\\\''/g"` ;; *) ac_arg=$ac_var=$ac_new_val ;; esac case " $ac_configure_args " in *" '$ac_arg' "*) ;; # Avoid dups. Use of quotes ensures accuracy. 
*) ac_configure_args="$ac_configure_args '$ac_arg'" ;; esac fi done if $ac_cache_corrupted; then { echo "$as_me:$LINENO: error: changes in the environment can compromise the build" >&5 echo "$as_me: error: changes in the environment can compromise the build" >&2;} { { echo "$as_me:$LINENO: error: run \`make distclean' and/or \`rm $cache_file' and start over" >&5 echo "$as_me: error: run \`make distclean' and/or \`rm $cache_file' and start over" >&2;} { (exit 1); exit 1; }; } fi ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu ac_config_headers="$ac_config_headers libtemplate/include/autoconf.h" ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}gcc", so it can be a program name with args. set dummy ${ac_tool_prefix}gcc; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_CC+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_CC="${ac_tool_prefix}gcc" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then echo "$as_me:$LINENO: result: $CC" >&5 echo "${ECHO_T}$CC" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi fi if test -z "$ac_cv_prog_CC"; then ac_ct_CC=$CC # Extract the first word of "gcc", so it can be a program name with args. set dummy gcc; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_ac_ct_CC+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$ac_ct_CC"; then ac_cv_prog_ac_ct_CC="$ac_ct_CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_ac_ct_CC="gcc" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi ac_ct_CC=$ac_cv_prog_ac_ct_CC if test -n "$ac_ct_CC"; then echo "$as_me:$LINENO: result: $ac_ct_CC" >&5 echo "${ECHO_T}$ac_ct_CC" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi CC=$ac_ct_CC else CC="$ac_cv_prog_CC" fi if test -z "$CC"; then if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}cc", so it can be a program name with args. set dummy ${ac_tool_prefix}cc; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_CC+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_CC="${ac_tool_prefix}cc" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then echo "$as_me:$LINENO: result: $CC" >&5 echo "${ECHO_T}$CC" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi fi if test -z "$ac_cv_prog_CC"; then ac_ct_CC=$CC # Extract the first word of "cc", so it can be a program name with args. set dummy cc; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_ac_ct_CC+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$ac_ct_CC"; then ac_cv_prog_ac_ct_CC="$ac_ct_CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_ac_ct_CC="cc" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi ac_ct_CC=$ac_cv_prog_ac_ct_CC if test -n "$ac_ct_CC"; then echo "$as_me:$LINENO: result: $ac_ct_CC" >&5 echo "${ECHO_T}$ac_ct_CC" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi CC=$ac_ct_CC else CC="$ac_cv_prog_CC" fi fi if test -z "$CC"; then # Extract the first word of "cc", so it can be a program name with args. set dummy cc; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_CC+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else ac_prog_rejected=no as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then if test "$as_dir/$ac_word$ac_exec_ext" = "/usr/ucb/cc"; then ac_prog_rejected=yes continue fi ac_cv_prog_CC="cc" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done if test $ac_prog_rejected = yes; then # We found a bogon in the path, so make sure we never use it. set dummy $ac_cv_prog_CC shift if test $# != 0; then # We chose a different compiler from the bogus one. # However, it has the same basename, so the bogon will be chosen # first if we set CC to just the basename; use the full file name. shift ac_cv_prog_CC="$as_dir/$ac_word${1+' '}$@" fi fi fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then echo "$as_me:$LINENO: result: $CC" >&5 echo "${ECHO_T}$CC" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi fi if test -z "$CC"; then if test -n "$ac_tool_prefix"; then for ac_prog in cl do # Extract the first word of "$ac_tool_prefix$ac_prog", so it can be a program name with args. set dummy $ac_tool_prefix$ac_prog; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_CC+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$CC"; then ac_cv_prog_CC="$CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_CC="$ac_tool_prefix$ac_prog" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi CC=$ac_cv_prog_CC if test -n "$CC"; then echo "$as_me:$LINENO: result: $CC" >&5 echo "${ECHO_T}$CC" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi test -n "$CC" && break done fi if test -z "$CC"; then ac_ct_CC=$CC for ac_prog in cl do # Extract the first word of "$ac_prog", so it can be a program name with args. set dummy $ac_prog; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_ac_ct_CC+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$ac_ct_CC"; then ac_cv_prog_ac_ct_CC="$ac_ct_CC" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_ac_ct_CC="$ac_prog" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi ac_ct_CC=$ac_cv_prog_ac_ct_CC if test -n "$ac_ct_CC"; then echo "$as_me:$LINENO: result: $ac_ct_CC" >&5 echo "${ECHO_T}$ac_ct_CC" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi test -n "$ac_ct_CC" && break done CC=$ac_ct_CC fi fi test -z "$CC" && { { echo "$as_me:$LINENO: error: no acceptable C compiler found in \$PATH See \`config.log' for more details." >&5 echo "$as_me: error: no acceptable C compiler found in \$PATH See \`config.log' for more details." >&2;} { (exit 1); exit 1; }; } # Provide some information about the compiler. 
echo "$as_me:$LINENO:" \ "checking for C compiler version" >&5 ac_compiler=`set X $ac_compile; echo $2` { (eval echo "$as_me:$LINENO: \"$ac_compiler --version </dev/null >&5\"") >&5 (eval $ac_compiler --version </dev/null >&5) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } { (eval echo "$as_me:$LINENO: \"$ac_compiler -v </dev/null >&5\"") >&5 (eval $ac_compiler -v </dev/null >&5) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } { (eval echo "$as_me:$LINENO: \"$ac_compiler -V </dev/null >&5\"") >&5 (eval $ac_compiler -V </dev/null >&5) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { ; return 0; } _ACEOF ac_clean_files_save=$ac_clean_files ac_clean_files="$ac_clean_files a.out a.exe b.out" # Try to create an executable without -o first, disregard a.out. # It will help us diagnose broken compilers, and finding out an intuition # of exeext. echo "$as_me:$LINENO: checking for C compiler default output" >&5 echo $ECHO_N "checking for C compiler default output... $ECHO_C" >&6 ac_link_default=`echo "$ac_link" | sed 's/ -o *conftest[^ ]*//'` if { (eval echo "$as_me:$LINENO: \"$ac_link_default\"") >&5 (eval $ac_link_default) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; then # Find the output, starting from the most likely. This scheme is # not robust to junk in `.', hence go to wildcards (a.*) only as a last # resort. # Be careful to initialize this variable, since it used to be cached. # Otherwise an old cache value of `no' led to `EXEEXT = no' in a Makefile. ac_cv_exeext= # b.out is created by i960 compilers. 
for ac_file in a_out.exe a.exe conftest.exe a.out conftest a.* conftest.* b.out do test -f "$ac_file" || continue case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.o | *.obj ) ;; conftest.$ac_ext ) # This is the source file. ;; [ab].out ) # We found the default executable, but exeext='' is most # certainly right. break;; *.* ) ac_cv_exeext=`expr "$ac_file" : '[^.]*\(\..*\)'` # FIXME: I believe we export ac_cv_exeext for Libtool, # but it would be cool to find out if it's true. Does anybody # maintain Libtool? --akim. export ac_cv_exeext break;; * ) break;; esac done else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 { { echo "$as_me:$LINENO: error: C compiler cannot create executables See \`config.log' for more details." >&5 echo "$as_me: error: C compiler cannot create executables See \`config.log' for more details." >&2;} { (exit 77); exit 77; }; } fi ac_exeext=$ac_cv_exeext echo "$as_me:$LINENO: result: $ac_file" >&5 echo "${ECHO_T}$ac_file" >&6 # Check the compiler produces executables we can run. If not, either # the compiler is broken, or we cross compile. echo "$as_me:$LINENO: checking whether the C compiler works" >&5 echo $ECHO_N "checking whether the C compiler works... $ECHO_C" >&6 # FIXME: These cross compiler hacks should be removed for Autoconf 3.0 # If not cross compiling, check that we can run a simple program. if test "$cross_compiling" != yes; then if { ac_try='./$ac_file' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then cross_compiling=no else if test "$cross_compiling" = maybe; then cross_compiling=yes else { { echo "$as_me:$LINENO: error: cannot run C compiled programs. If you meant to cross compile, use \`--host'. See \`config.log' for more details." >&5 echo "$as_me: error: cannot run C compiled programs. If you meant to cross compile, use \`--host'. 
See \`config.log' for more details." >&2;} { (exit 1); exit 1; }; } fi fi fi echo "$as_me:$LINENO: result: yes" >&5 echo "${ECHO_T}yes" >&6 rm -f a.out a.exe conftest$ac_cv_exeext b.out ac_clean_files=$ac_clean_files_save # Check the compiler produces executables we can run. If not, either # the compiler is broken, or we cross compile. echo "$as_me:$LINENO: checking whether we are cross compiling" >&5 echo $ECHO_N "checking whether we are cross compiling... $ECHO_C" >&6 echo "$as_me:$LINENO: result: $cross_compiling" >&5 echo "${ECHO_T}$cross_compiling" >&6 echo "$as_me:$LINENO: checking for suffix of executables" >&5 echo $ECHO_N "checking for suffix of executables... $ECHO_C" >&6 if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; then # If both `conftest.exe' and `conftest' are `present' (well, observable) # catch `conftest.exe'. For instance with Cygwin, `ls conftest' will # work properly (i.e., refer to `conftest.exe'), while it won't with # `rm'. for ac_file in conftest.exe conftest conftest.*; do test -f "$ac_file" || continue case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg | *.o | *.obj ) ;; *.* ) ac_cv_exeext=`expr "$ac_file" : '[^.]*\(\..*\)'` export ac_cv_exeext break;; * ) break;; esac done else { { echo "$as_me:$LINENO: error: cannot compute suffix of executables: cannot compile and link See \`config.log' for more details." >&5 echo "$as_me: error: cannot compute suffix of executables: cannot compile and link See \`config.log' for more details." >&2;} { (exit 1); exit 1; }; } fi rm -f conftest$ac_cv_exeext echo "$as_me:$LINENO: result: $ac_cv_exeext" >&5 echo "${ECHO_T}$ac_cv_exeext" >&6 rm -f conftest.$ac_ext EXEEXT=$ac_cv_exeext ac_exeext=$EXEEXT echo "$as_me:$LINENO: checking for suffix of object files" >&5 echo $ECHO_N "checking for suffix of object files... 
$ECHO_C" >&6 if test "${ac_cv_objext+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { ; return 0; } _ACEOF rm -f conftest.o conftest.obj if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; then for ac_file in `(ls conftest.o conftest.obj; ls conftest.*) 2>/dev/null`; do case $ac_file in *.$ac_ext | *.xcoff | *.tds | *.d | *.pdb | *.xSYM | *.bb | *.bbg ) ;; *) ac_cv_objext=`expr "$ac_file" : '.*\.\(.*\)'` break;; esac done else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 { { echo "$as_me:$LINENO: error: cannot compute suffix of object files: cannot compile See \`config.log' for more details." >&5 echo "$as_me: error: cannot compute suffix of object files: cannot compile See \`config.log' for more details." >&2;} { (exit 1); exit 1; }; } fi rm -f conftest.$ac_cv_objext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_objext" >&5 echo "${ECHO_T}$ac_cv_objext" >&6 OBJEXT=$ac_cv_objext ac_objext=$OBJEXT echo "$as_me:$LINENO: checking whether we are using the GNU C compiler" >&5 echo $ECHO_N "checking whether we are using the GNU C compiler... $ECHO_C" >&6 if test "${ac_cv_c_compiler_gnu+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { #ifndef __GNUC__ choke me #endif ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_compiler_gnu=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_compiler_gnu=no fi rm -f conftest.$ac_objext conftest.$ac_ext ac_cv_c_compiler_gnu=$ac_compiler_gnu fi echo "$as_me:$LINENO: result: $ac_cv_c_compiler_gnu" >&5 echo "${ECHO_T}$ac_cv_c_compiler_gnu" >&6 GCC=`test $ac_compiler_gnu = yes && echo yes` ac_test_CFLAGS=${CFLAGS+set} ac_save_CFLAGS=$CFLAGS CFLAGS="-g" echo "$as_me:$LINENO: checking whether $CC accepts -g" >&5 echo $ECHO_N "checking whether $CC accepts -g... $ECHO_C" >&6 if test "${ac_cv_prog_cc_g+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_prog_cc_g=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_prog_cc_g=no fi rm -f conftest.$ac_objext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_prog_cc_g" >&5 echo "${ECHO_T}$ac_cv_prog_cc_g" >&6 if test "$ac_test_CFLAGS" = set; then CFLAGS=$ac_save_CFLAGS elif test $ac_cv_prog_cc_g = yes; then if test "$GCC" = yes; then CFLAGS="-g -O2" else CFLAGS="-g" fi else if test "$GCC" = yes; then CFLAGS="-O2" else CFLAGS= fi fi echo "$as_me:$LINENO: checking for $CC option to accept ANSI C" >&5 echo $ECHO_N "checking for $CC option to accept ANSI C... $ECHO_C" >&6 if test "${ac_cv_prog_cc_stdc+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_cv_prog_cc_stdc=no ac_save_CC=$CC cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <stdarg.h> #include <stdio.h> #include <sys/types.h> #include <sys/stat.h> /* Most of the following tests are stolen from RCS 5.7's src/conf.sh. */ struct buf { int x; }; FILE * (*rcsopen) (struct buf *, struct stat *, int); static char *e (p, i) char **p; int i; { return p[i]; } static char *f (char * (*g) (char **, int), char **p, ...) { char *s; va_list v; va_start (v,p); s = g (p, va_arg (v,int)); va_end (v); return s; } int test (int i, double x); struct s1 {int (*f) (int a);}; struct s2 {int (*f) (double a);}; int pairnames (int, char **, FILE *(*)(struct buf *, struct stat *, int), int, int); int argc; char **argv; int main () { return f (e, argv, 0) != argv[0] || f (e, argv, 1) != argv[1]; ; return 0; } _ACEOF # Don't try gcc -ansi; that turns off useful extensions and # breaks some systems' header files.
# AIX -qlanglvl=ansi # Ultrix and OSF/1 -std1 # HP-UX 10.20 and later -Ae # HP-UX older versions -Aa -D_HPUX_SOURCE # SVR4 -Xc -D__EXTENSIONS__ for ac_arg in "" -qlanglvl=ansi -std1 -Ae "-Aa -D_HPUX_SOURCE" "-Xc -D__EXTENSIONS__" do CC="$ac_save_CC $ac_arg" rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_prog_cc_stdc=$ac_arg break else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext done rm -f conftest.$ac_ext conftest.$ac_objext CC=$ac_save_CC fi case "x$ac_cv_prog_cc_stdc" in x|xno) echo "$as_me:$LINENO: result: none needed" >&5 echo "${ECHO_T}none needed" >&6 ;; *) echo "$as_me:$LINENO: result: $ac_cv_prog_cc_stdc" >&5 echo "${ECHO_T}$ac_cv_prog_cc_stdc" >&6 CC="$CC $ac_cv_prog_cc_stdc" ;; esac # Some people use a C++ compiler to compile C. Since we use `exit', # in C++ we need to declare it. In case someone uses the same compiler # for both compiling C and C++ we need to have the C++ compiler decide # the declaration of exit, since it's the most demanding environment. cat >conftest.$ac_ext <<_ACEOF #ifndef __cplusplus choke me #endif _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then for ac_declaration in '' '#include <stdlib.h>' 'extern "C" void std::exit (int) throw (); using std::exit;' 'extern "C" void std::exit (int); using std::exit;' 'extern "C" void exit (int) throw ();' 'extern "C" void exit (int);' 'void exit (int);' do cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <stdlib.h> $ac_declaration int main () { exit (42); ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then : else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 continue fi rm -f conftest.$ac_objext conftest.$ac_ext cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ $ac_declaration int main () { exit (42); ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$?
= $ac_status" >&5 (exit $ac_status); }; }; then break else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext conftest.$ac_ext done rm -f conftest* if test -n "$ac_declaration"; then echo '#ifdef __cplusplus' >>confdefs.h echo $ac_declaration >>confdefs.h echo '#endif' >>confdefs.h fi else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext conftest.$ac_ext ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu # Extract the first word of "ar", so it can be a program name with args. set dummy ar; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_path_AR+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else case $AR in [\\/]* | ?:[\\/]*) ac_cv_path_AR="$AR" # Let the user override the test with a path. ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_path_AR="$as_dir/$ac_word$ac_exec_ext" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done test -z "$ac_cv_path_AR" && ac_cv_path_AR="ar" ;; esac fi AR=$ac_cv_path_AR if test -n "$AR"; then echo "$as_me:$LINENO: result: $AR" >&5 echo "${ECHO_T}$AR" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi if test -z "$AR" ; then # Extract the first word of "ar", so it can be a program name with args. set dummy ar; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... 
$ECHO_C" >&6 if test "${ac_cv_path_AR+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else case $AR in [\\/]* | ?:[\\/]*) ac_cv_path_AR="$AR" # Let the user override the test with a path. ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in /usr/ccs/bin do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_path_AR="$as_dir/$ac_word$ac_exec_ext" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done ;; esac fi AR=$ac_cv_path_AR if test -n "$AR"; then echo "$as_me:$LINENO: result: $AR" >&5 echo "${ECHO_T}$AR" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi if test -z "$AR" ; then { { echo "$as_me:$LINENO: error: no acceptable ar found in \$PATH:/usr/ccs/bin" >&5 echo "$as_me: error: no acceptable ar found in \$PATH:/usr/ccs/bin" >&2;} { (exit 1); exit 1; }; } fi fi if test -n "$ac_tool_prefix"; then # Extract the first word of "${ac_tool_prefix}ranlib", so it can be a program name with args. set dummy ${ac_tool_prefix}ranlib; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_RANLIB+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$RANLIB"; then ac_cv_prog_RANLIB="$RANLIB" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=.
for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_RANLIB="${ac_tool_prefix}ranlib" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi RANLIB=$ac_cv_prog_RANLIB if test -n "$RANLIB"; then echo "$as_me:$LINENO: result: $RANLIB" >&5 echo "${ECHO_T}$RANLIB" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi fi if test -z "$ac_cv_prog_RANLIB"; then ac_ct_RANLIB=$RANLIB # Extract the first word of "ranlib", so it can be a program name with args. set dummy ranlib; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_ac_ct_RANLIB+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$ac_ct_RANLIB"; then ac_cv_prog_ac_ct_RANLIB="$ac_ct_RANLIB" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_ac_ct_RANLIB="ranlib" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done test -z "$ac_cv_prog_ac_ct_RANLIB" && ac_cv_prog_ac_ct_RANLIB=":" fi fi ac_ct_RANLIB=$ac_cv_prog_ac_ct_RANLIB if test -n "$ac_ct_RANLIB"; then echo "$as_me:$LINENO: result: $ac_ct_RANLIB" >&5 echo "${ECHO_T}$ac_ct_RANLIB" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi RANLIB=$ac_ct_RANLIB else RANLIB="$ac_cv_prog_RANLIB" fi echo "$as_me:$LINENO: checking whether ln -s works" >&5 echo $ECHO_N "checking whether ln -s works... 
$ECHO_C" >&6 LN_S=$as_ln_s if test "$LN_S" = "ln -s"; then echo "$as_me:$LINENO: result: yes" >&5 echo "${ECHO_T}yes" >&6 else echo "$as_me:$LINENO: result: no, using $LN_S" >&5 echo "${ECHO_T}no, using $LN_S" >&6 fi for ac_prog in flex lex do # Extract the first word of "$ac_prog", so it can be a program name with args. set dummy $ac_prog; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_prog_LEX+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if test -n "$LEX"; then ac_cv_prog_LEX="$LEX" # Let the user override the test. else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_prog_LEX="$ac_prog" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done fi fi LEX=$ac_cv_prog_LEX if test -n "$LEX"; then echo "$as_me:$LINENO: result: $LEX" >&5 echo "${ECHO_T}$LEX" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi test -n "$LEX" && break done test -n "$LEX" || LEX=":" if test -z "$LEXLIB" then echo "$as_me:$LINENO: checking for yywrap in -lfl" >&5 echo $ECHO_N "checking for yywrap in -lfl... $ECHO_C" >&6 if test "${ac_cv_lib_fl_yywrap+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lfl $LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. 
*/ char yywrap (); int main () { yywrap (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_lib_fl_yywrap=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_lib_fl_yywrap=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_lib_fl_yywrap" >&5 echo "${ECHO_T}$ac_cv_lib_fl_yywrap" >&6 if test $ac_cv_lib_fl_yywrap = yes; then LEXLIB="-lfl" else echo "$as_me:$LINENO: checking for yywrap in -ll" >&5 echo $ECHO_N "checking for yywrap in -ll... $ECHO_C" >&6 if test "${ac_cv_lib_l_yywrap+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-ll $LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char yywrap (); int main () { yywrap (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_lib_l_yywrap=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_lib_l_yywrap=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_lib_l_yywrap" >&5 echo "${ECHO_T}$ac_cv_lib_l_yywrap" >&6 if test $ac_cv_lib_l_yywrap = yes; then LEXLIB="-ll" fi fi fi if test "x$LEX" != "x:"; then echo "$as_me:$LINENO: checking lex output file root" >&5 echo $ECHO_N "checking lex output file root... $ECHO_C" >&6 if test "${ac_cv_prog_lex_root+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else # The minimal lex program is just a single line: %%. But some broken lexes # (Solaris, I think it was) want two %% lines, so accommodate them. cat >conftest.l <<_ACEOF %% %% _ACEOF { (eval echo "$as_me:$LINENO: \"$LEX conftest.l\"") >&5 (eval $LEX conftest.l) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } if test -f lex.yy.c; then ac_cv_prog_lex_root=lex.yy elif test -f lexyy.c; then ac_cv_prog_lex_root=lexyy else { { echo "$as_me:$LINENO: error: cannot find output from $LEX; giving up" >&5 echo "$as_me: error: cannot find output from $LEX; giving up" >&2;} { (exit 1); exit 1; }; } fi fi echo "$as_me:$LINENO: result: $ac_cv_prog_lex_root" >&5 echo "${ECHO_T}$ac_cv_prog_lex_root" >&6 rm -f conftest.l LEX_OUTPUT_ROOT=$ac_cv_prog_lex_root echo "$as_me:$LINENO: checking whether yytext is a pointer" >&5 echo $ECHO_N "checking whether yytext is a pointer... $ECHO_C" >&6 if test "${ac_cv_prog_lex_yytext_pointer+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else # POSIX says lex can declare yytext either as a pointer or an array; the # default is implementation-dependent. Figure out which it is, since # not all implementations provide the %pointer and %array declarations. 
ac_cv_prog_lex_yytext_pointer=no echo 'extern char *yytext;' >>$LEX_OUTPUT_ROOT.c ac_save_LIBS=$LIBS LIBS="$LIBS $LEXLIB" cat >conftest.$ac_ext <<_ACEOF `cat $LEX_OUTPUT_ROOT.c` _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_prog_lex_yytext_pointer=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_save_LIBS rm -f "${LEX_OUTPUT_ROOT}.c" fi echo "$as_me:$LINENO: result: $ac_cv_prog_lex_yytext_pointer" >&5 echo "${ECHO_T}$ac_cv_prog_lex_yytext_pointer" >&6 if test $ac_cv_prog_lex_yytext_pointer = yes; then cat >>confdefs.h <<\_ACEOF #define YYTEXT_POINTER 1 _ACEOF fi fi if test "x$LEX" = "xflex" ; then DYNFILTER_TARGET=htuml2txt.so LEXFLAGS="-F -8" else DYNFILTER_TARGET=htuml2txt LEXFLAGS= fi if test "$ac_cv_c_compiler_gnu" = "yes" ; then DYNFILTER_CFLAGS="-O3 -fomit-frame-pointer" else DYNFILTER_CFLAGS="-O" fi if test "`uname`" = "Linux" ; then DYNFILTER=dynfilters else DYNFILTER= fi # Extract the first word of "strip", so it can be a program name with args. set dummy strip; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_path_STRIP+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else case $STRIP in [\\/]* | ?:[\\/]*) ac_cv_path_STRIP="$STRIP" # Let the user override the test with a path. ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_path_STRIP="$as_dir/$ac_word$ac_exec_ext" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done test -z "$ac_cv_path_STRIP" && ac_cv_path_STRIP="strip" ;; esac fi STRIP=$ac_cv_path_STRIP if test -n "$STRIP"; then echo "$as_me:$LINENO: result: $STRIP" >&5 echo "${ECHO_T}$STRIP" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi # Extract the first word of "cp", so it can be a program name with args. set dummy cp; ac_word=$2 echo "$as_me:$LINENO: checking for $ac_word" >&5 echo $ECHO_N "checking for $ac_word... $ECHO_C" >&6 if test "${ac_cv_path_CP+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else case $CP in [\\/]* | ?:[\\/]*) ac_cv_path_CP="$CP" # Let the user override the test with a path. ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_word$ac_exec_ext"; then ac_cv_path_CP="$as_dir/$ac_word$ac_exec_ext" echo "$as_me:$LINENO: found $as_dir/$ac_word$ac_exec_ext" >&5 break 2 fi done done test -z "$ac_cv_path_CP" && ac_cv_path_CP="cp" ;; esac fi CP=$ac_cv_path_CP if test -n "$CP"; then echo "$as_me:$LINENO: result: $CP" >&5 echo "${ECHO_T}$CP" >&6 else echo "$as_me:$LINENO: result: no" >&5 echo "${ECHO_T}no" >&6 fi if test -z "$CP" ; then { { echo "$as_me:$LINENO: error: no cp found in \$PATH, something weird is going on." >&5 echo "$as_me: error: no cp found in \$PATH, something weird is going on." >&2;} { (exit 1); exit 1; }; } fi ac_aux_dir= for ac_dir in $srcdir $srcdir/.. 
$srcdir/../..; do if test -f $ac_dir/install-sh; then ac_aux_dir=$ac_dir ac_install_sh="$ac_aux_dir/install-sh -c" break elif test -f $ac_dir/install.sh; then ac_aux_dir=$ac_dir ac_install_sh="$ac_aux_dir/install.sh -c" break elif test -f $ac_dir/shtool; then ac_aux_dir=$ac_dir ac_install_sh="$ac_aux_dir/shtool install -c" break fi done if test -z "$ac_aux_dir"; then { { echo "$as_me:$LINENO: error: cannot find install-sh or install.sh in $srcdir $srcdir/.. $srcdir/../.." >&5 echo "$as_me: error: cannot find install-sh or install.sh in $srcdir $srcdir/.. $srcdir/../.." >&2;} { (exit 1); exit 1; }; } fi ac_config_guess="$SHELL $ac_aux_dir/config.guess" ac_config_sub="$SHELL $ac_aux_dir/config.sub" ac_configure="$SHELL $ac_aux_dir/configure" # This should be Cygnus configure. # Find a good install program. We prefer a C program (faster), # so one script is as good as another. But avoid the broken or # incompatible versions: # SysV /etc/install, /usr/sbin/install # SunOS /usr/etc/install # IRIX /sbin/install # AIX /bin/install # AmigaOS /C/install, which installs bootblocks on floppy discs # AIX 4 /usr/bin/installbsd, which doesn't work without a -g flag # AFS /usr/afsws/bin/install, which mishandles nonexistent args # SVR4 /usr/ucb/install, which tries to use the nonexistent group "staff" # ./install, which can be erroneously created by make from ./install.sh. echo "$as_me:$LINENO: checking for a BSD-compatible install" >&5 echo $ECHO_N "checking for a BSD-compatible install... $ECHO_C" >&6 if test -z "$INSTALL"; then if test "${ac_cv_path_install+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. # Account for people who put trailing slashes in PATH elements. case $as_dir/ in ./ | .// | /cC/* | \ /etc/* | /usr/sbin/* | /usr/etc/* | /sbin/* | /usr/afsws/bin/* | \ /usr/ucb/* ) ;; *) # OSF1 and SCO ODT 3.0 have their own names for install. 
# Don't use installbsd from OSF since it installs stuff as root # by default. for ac_prog in ginstall scoinst install; do for ac_exec_ext in '' $ac_executable_extensions; do if $as_executable_p "$as_dir/$ac_prog$ac_exec_ext"; then if test $ac_prog = install && grep dspmsg "$as_dir/$ac_prog$ac_exec_ext" >/dev/null 2>&1; then # AIX install. It has an incompatible calling convention. : elif test $ac_prog = install && grep pwplus "$as_dir/$ac_prog$ac_exec_ext" >/dev/null 2>&1; then # program-specific install script used by HP pwplus--don't use. : else ac_cv_path_install="$as_dir/$ac_prog$ac_exec_ext -c" break 3 fi fi done done ;; esac done fi if test "${ac_cv_path_install+set}" = set; then INSTALL=$ac_cv_path_install else # As a last resort, use the slow shell script. We don't cache a # path for INSTALL within a source directory, because that will # break other packages using the cache if that directory is # removed, or if the path is relative. INSTALL=$ac_install_sh fi fi echo "$as_me:$LINENO: result: $INSTALL" >&5 echo "${ECHO_T}$INSTALL" >&6 # Use test -z because SunOS4 sh mishandles braces in ${var-val}. # It thinks the first close brace ends the variable substitution. test -z "$INSTALL_PROGRAM" && INSTALL_PROGRAM='${INSTALL}' test -z "$INSTALL_SCRIPT" && INSTALL_SCRIPT='${INSTALL}' test -z "$INSTALL_DATA" && INSTALL_DATA='${INSTALL} -m 644' ac_header_dirent=no for ac_hdr in dirent.h sys/ndir.h sys/dir.h ndir.h; do as_ac_Header=`echo "ac_cv_header_dirent_$ac_hdr" | $as_tr_sh` echo "$as_me:$LINENO: checking for $ac_hdr that defines DIR" >&5 echo $ECHO_N "checking for $ac_hdr that defines DIR... $ECHO_C" >&6 if eval "test \"\${$as_ac_Header+set}\" = set"; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. 
*/ #include <sys/types.h> #include <$ac_hdr> int main () { if ((DIR *) 0) return 0; ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then eval "$as_ac_Header=yes" else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 eval "$as_ac_Header=no" fi rm -f conftest.$ac_objext conftest.$ac_ext fi echo "$as_me:$LINENO: result: `eval echo '${'$as_ac_Header'}'`" >&5 echo "${ECHO_T}`eval echo '${'$as_ac_Header'}'`" >&6 if test `eval echo '${'$as_ac_Header'}'` = yes; then cat >>confdefs.h <<_ACEOF #define `echo "HAVE_$ac_hdr" | $as_tr_cpp` 1 _ACEOF ac_header_dirent=$ac_hdr; break fi done # Two versions of opendir et al. are in -ldir and -lx on SCO Xenix. if test $ac_header_dirent = dirent.h; then echo "$as_me:$LINENO: checking for library containing opendir" >&5 echo $ECHO_N "checking for library containing opendir... $ECHO_C" >&6 if test "${ac_cv_search_opendir+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_func_search_save_LIBS=$LIBS ac_cv_search_opendir=no cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char opendir (); int main () { opendir (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$?
= $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_search_opendir="none required" else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext if test "$ac_cv_search_opendir" = no; then for ac_lib in dir; do LIBS="-l$ac_lib $ac_func_search_save_LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char opendir (); int main () { opendir (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_search_opendir="-l$ac_lib" break else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext done fi LIBS=$ac_func_search_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_search_opendir" >&5 echo "${ECHO_T}$ac_cv_search_opendir" >&6 if test "$ac_cv_search_opendir" != no; then test "$ac_cv_search_opendir" = "none required" || LIBS="$ac_cv_search_opendir $LIBS" fi else echo "$as_me:$LINENO: checking for library containing opendir" >&5 echo $ECHO_N "checking for library containing opendir... 
$ECHO_C" >&6 if test "${ac_cv_search_opendir+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_func_search_save_LIBS=$LIBS ac_cv_search_opendir=no cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char opendir (); int main () { opendir (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_search_opendir="none required" else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext if test "$ac_cv_search_opendir" = no; then for ac_lib in x; do LIBS="-l$ac_lib $ac_func_search_save_LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char opendir (); int main () { opendir (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_search_opendir="-l$ac_lib" break else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext done fi LIBS=$ac_func_search_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_search_opendir" >&5 echo "${ECHO_T}$ac_cv_search_opendir" >&6 if test "$ac_cv_search_opendir" != no; then test "$ac_cv_search_opendir" = "none required" || LIBS="$ac_cv_search_opendir $LIBS" fi fi #Contribution by VaX#n8 ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu echo "$as_me:$LINENO: checking how to run the C preprocessor" >&5 echo $ECHO_N "checking how to run the C preprocessor... $ECHO_C" >&6 # On Suns, sometimes $CPP names a directory. if test -n "$CPP" && test -d "$CPP"; then CPP= fi if test -z "$CPP"; then if test "${ac_cv_prog_CPP+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else # Double quotes because CPP needs to be expanded for CPP in "$CC -E" "$CC -E -traditional-cpp" "/lib/cpp" do ac_preproc_ok=false for ac_c_preproc_warn_flag in '' yes do # Use a header file that comes with gcc, so configuring glibc # with a fresh cross-compiler works. # Prefer <limits.h> to <assert.h> if __STDC__ is defined, since # <limits.h> exists even on freestanding compilers. # On the NeXT, cc -E runs the code through the compiler's parser, # not just through cpp. "Syntax error" is here to catch this case. cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. 
*/ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif Syntax error _ACEOF if { (eval echo "$as_me:$LINENO: \"$ac_cpp conftest.$ac_ext\"") >&5 (eval $ac_cpp conftest.$ac_ext) 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null; then if test -s conftest.err; then ac_cpp_err=$ac_c_preproc_warn_flag else ac_cpp_err= fi else ac_cpp_err=yes fi if test -z "$ac_cpp_err"; then : else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 # Broken: fails on valid input. continue fi rm -f conftest.err conftest.$ac_ext # OK, works on sane cases. Now check whether non-existent headers # can be detected and how. cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <ac_nonexistent.h> _ACEOF if { (eval echo "$as_me:$LINENO: \"$ac_cpp conftest.$ac_ext\"") >&5 (eval $ac_cpp conftest.$ac_ext) 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null; then if test -s conftest.err; then ac_cpp_err=$ac_c_preproc_warn_flag else ac_cpp_err= fi else ac_cpp_err=yes fi if test -z "$ac_cpp_err"; then # Broken: success on invalid input. continue else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 # Passes both tests. ac_preproc_ok=: break fi rm -f conftest.err conftest.$ac_ext done # Because of `break', _AC_PREPROC_IFELSE's cleaning code was skipped. 
rm -f conftest.err conftest.$ac_ext if $ac_preproc_ok; then break fi done ac_cv_prog_CPP=$CPP fi CPP=$ac_cv_prog_CPP else ac_cv_prog_CPP=$CPP fi echo "$as_me:$LINENO: result: $CPP" >&5 echo "${ECHO_T}$CPP" >&6 ac_preproc_ok=false for ac_c_preproc_warn_flag in '' yes do # Use a header file that comes with gcc, so configuring glibc # with a fresh cross-compiler works. # Prefer <limits.h> to <assert.h> if __STDC__ is defined, since # <limits.h> exists even on freestanding compilers. # On the NeXT, cc -E runs the code through the compiler's parser, # not just through cpp. "Syntax error" is here to catch this case. cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif Syntax error _ACEOF if { (eval echo "$as_me:$LINENO: \"$ac_cpp conftest.$ac_ext\"") >&5 (eval $ac_cpp conftest.$ac_ext) 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null; then if test -s conftest.err; then ac_cpp_err=$ac_c_preproc_warn_flag else ac_cpp_err= fi else ac_cpp_err=yes fi if test -z "$ac_cpp_err"; then : else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 # Broken: fails on valid input. continue fi rm -f conftest.err conftest.$ac_ext # OK, works on sane cases. Now check whether non-existent headers # can be detected and how. cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <ac_nonexistent.h> _ACEOF if { (eval echo "$as_me:$LINENO: \"$ac_cpp conftest.$ac_ext\"") >&5 (eval $ac_cpp conftest.$ac_ext) 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); } >/dev/null; then if test -s conftest.err; then ac_cpp_err=$ac_c_preproc_warn_flag else ac_cpp_err= fi else ac_cpp_err=yes fi if test -z "$ac_cpp_err"; then # Broken: success on invalid input. continue else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 # Passes both tests. ac_preproc_ok=: break fi rm -f conftest.err conftest.$ac_ext done # Because of `break', _AC_PREPROC_IFELSE's cleaning code was skipped. rm -f conftest.err conftest.$ac_ext if $ac_preproc_ok; then : else { { echo "$as_me:$LINENO: error: C preprocessor \"$CPP\" fails sanity check See \`config.log' for more details." >&5 echo "$as_me: error: C preprocessor \"$CPP\" fails sanity check See \`config.log' for more details." >&2;} { (exit 1); exit 1; }; } fi ac_ext=c ac_cpp='$CPP $CPPFLAGS' ac_compile='$CC -c $CFLAGS $CPPFLAGS conftest.$ac_ext >&5' ac_link='$CC -o conftest$ac_exeext $CFLAGS $CPPFLAGS $LDFLAGS conftest.$ac_ext $LIBS >&5' ac_compiler_gnu=$ac_cv_c_compiler_gnu echo "$as_me:$LINENO: checking for egrep" >&5 echo $ECHO_N "checking for egrep... $ECHO_C" >&6 if test "${ac_cv_prog_egrep+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else if echo a | (grep -E '(a|b)') >/dev/null 2>&1 then ac_cv_prog_egrep='grep -E' else ac_cv_prog_egrep='egrep' fi fi echo "$as_me:$LINENO: result: $ac_cv_prog_egrep" >&5 echo "${ECHO_T}$ac_cv_prog_egrep" >&6 EGREP=$ac_cv_prog_egrep echo "$as_me:$LINENO: checking for ANSI C header files" >&5 echo $ECHO_N "checking for ANSI C header files... $ECHO_C" >&6 if test "${ac_cv_header_stdc+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <stdlib.h> #include <stdarg.h> #include <string.h> #include <float.h> int main () { ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? 
echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_header_stdc=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_header_stdc=no fi rm -f conftest.$ac_objext conftest.$ac_ext if test $ac_cv_header_stdc = yes; then # SunOS 4.x string.h does not declare mem*, contrary to ANSI. cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <string.h> _ACEOF if (eval "$ac_cpp conftest.$ac_ext") 2>&5 | $EGREP "memchr" >/dev/null 2>&1; then : else ac_cv_header_stdc=no fi rm -f conftest* fi if test $ac_cv_header_stdc = yes; then # ISC 2.0.2 stdlib.h does not declare free, contrary to ANSI. cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <stdlib.h> _ACEOF if (eval "$ac_cpp conftest.$ac_ext") 2>&5 | $EGREP "free" >/dev/null 2>&1; then : else ac_cv_header_stdc=no fi rm -f conftest* fi if test $ac_cv_header_stdc = yes; then # /bin/cc in Irix-4.0.5 gets non-ANSI ctype macros unless using -ansi. if test "$cross_compiling" = yes; then : else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <ctype.h> #if ((' ' & 0x0FF) == 0x020) # define ISLOWER(c) ('a' <= (c) && (c) <= 'z') # define TOUPPER(c) (ISLOWER(c) ? 'A' + ((c) - 'a') : (c)) #else # define ISLOWER(c) \ (('a' <= (c) && (c) <= 'i') \ || ('j' <= (c) && (c) <= 'r') \ || ('s' <= (c) && (c) <= 'z')) # define TOUPPER(c) (ISLOWER(c) ? 
((c) | 0x40) : (c)) #endif #define XOR(e, f) (((e) && !(f)) || (!(e) && (f))) int main () { int i; for (i = 0; i < 256; i++) if (XOR (islower (i), ISLOWER (i)) || toupper (i) != TOUPPER (i)) exit(2); exit (0); } _ACEOF rm -f conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='./conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then : else echo "$as_me: program exited with status $ac_status" >&5 echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ( exit $ac_status ) ac_cv_header_stdc=no fi rm -f core core.* *.core gmon.out bb.out conftest$ac_exeext conftest.$ac_objext conftest.$ac_ext fi fi fi echo "$as_me:$LINENO: result: $ac_cv_header_stdc" >&5 echo "${ECHO_T}$ac_cv_header_stdc" >&6 if test $ac_cv_header_stdc = yes; then cat >>confdefs.h <<\_ACEOF #define STDC_HEADERS 1 _ACEOF fi # On IRIX 5.3, sys/types and inttypes.h are conflicting. for ac_header in sys/types.h sys/stat.h stdlib.h string.h memory.h strings.h \ inttypes.h stdint.h unistd.h do as_ac_Header=`echo "ac_cv_header_$ac_header" | $as_tr_sh` echo "$as_me:$LINENO: checking for $ac_header" >&5 echo $ECHO_N "checking for $ac_header... $ECHO_C" >&6 if eval "test \"\${$as_ac_Header+set}\" = set"; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ $ac_includes_default #include <$ac_header> _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then eval "$as_ac_Header=yes" else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 eval "$as_ac_Header=no" fi rm -f conftest.$ac_objext conftest.$ac_ext fi echo "$as_me:$LINENO: result: `eval echo '${'$as_ac_Header'}'`" >&5 echo "${ECHO_T}`eval echo '${'$as_ac_Header'}'`" >&6 if test `eval echo '${'$as_ac_Header'}'` = yes; then cat >>confdefs.h <<_ACEOF #define `echo "HAVE_$ac_header" | $as_tr_cpp` 1 _ACEOF fi done for ac_header in fcntl.h sys/file.h sys/time.h unistd.h sys/select.h do as_ac_Header=`echo "ac_cv_header_$ac_header" | $as_tr_sh` if eval "test \"\${$as_ac_Header+set}\" = set"; then echo "$as_me:$LINENO: checking for $ac_header" >&5 echo $ECHO_N "checking for $ac_header... $ECHO_C" >&6 if eval "test \"\${$as_ac_Header+set}\" = set"; then echo $ECHO_N "(cached) $ECHO_C" >&6 fi echo "$as_me:$LINENO: result: `eval echo '${'$as_ac_Header'}'`" >&5 echo "${ECHO_T}`eval echo '${'$as_ac_Header'}'`" >&6 else # Is the header compilable? echo "$as_me:$LINENO: checking $ac_header usability" >&5 echo $ECHO_N "checking $ac_header usability... $ECHO_C" >&6 cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ $ac_includes_default #include <$ac_header> _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_header_compiler=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_header_compiler=no fi rm -f conftest.$ac_objext conftest.$ac_ext echo "$as_me:$LINENO: result: $ac_header_compiler" >&5 echo "${ECHO_T}$ac_header_compiler" >&6 # Is the header present? echo "$as_me:$LINENO: checking $ac_header presence" >&5 echo $ECHO_N "checking $ac_header presence... $ECHO_C" >&6 cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <$ac_header> _ACEOF if { (eval echo "$as_me:$LINENO: \"$ac_cpp conftest.$ac_ext\"") >&5 (eval $ac_cpp conftest.$ac_ext) 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null; then if test -s conftest.err; then ac_cpp_err=$ac_c_preproc_warn_flag else ac_cpp_err= fi else ac_cpp_err=yes fi if test -z "$ac_cpp_err"; then ac_header_preproc=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_header_preproc=no fi rm -f conftest.err conftest.$ac_ext echo "$as_me:$LINENO: result: $ac_header_preproc" >&5 echo "${ECHO_T}$ac_header_preproc" >&6 # So? What about this header? case $ac_header_compiler:$ac_header_preproc in yes:no ) { echo "$as_me:$LINENO: WARNING: $ac_header: accepted by the compiler, rejected by the preprocessor!" >&5 echo "$as_me: WARNING: $ac_header: accepted by the compiler, rejected by the preprocessor!" >&2;} { echo "$as_me:$LINENO: WARNING: $ac_header: proceeding with the preprocessor's result" >&5 echo "$as_me: WARNING: $ac_header: proceeding with the preprocessor's result" >&2;} ( cat <<\_ASBOX ## ------------------------------------ ## ## Report this to bug-autoconf@gnu.org. 
## ## ------------------------------------ ## _ASBOX ) | sed "s/^/$as_me: WARNING: /" >&2 ;; no:yes ) { echo "$as_me:$LINENO: WARNING: $ac_header: present but cannot be compiled" >&5 echo "$as_me: WARNING: $ac_header: present but cannot be compiled" >&2;} { echo "$as_me:$LINENO: WARNING: $ac_header: check for missing prerequisite headers?" >&5 echo "$as_me: WARNING: $ac_header: check for missing prerequisite headers?" >&2;} { echo "$as_me:$LINENO: WARNING: $ac_header: proceeding with the preprocessor's result" >&5 echo "$as_me: WARNING: $ac_header: proceeding with the preprocessor's result" >&2;} ( cat <<\_ASBOX ## ------------------------------------ ## ## Report this to bug-autoconf@gnu.org. ## ## ------------------------------------ ## _ASBOX ) | sed "s/^/$as_me: WARNING: /" >&2 ;; esac echo "$as_me:$LINENO: checking for $ac_header" >&5 echo $ECHO_N "checking for $ac_header... $ECHO_C" >&6 if eval "test \"\${$as_ac_Header+set}\" = set"; then echo $ECHO_N "(cached) $ECHO_C" >&6 else eval "$as_ac_Header=$ac_header_preproc" fi echo "$as_me:$LINENO: result: `eval echo '${'$as_ac_Header'}'`" >&5 echo "${ECHO_T}`eval echo '${'$as_ac_Header'}'`" >&6 fi if test `eval echo '${'$as_ac_Header'}'` = yes; then cat >>confdefs.h <<_ACEOF #define `echo "HAVE_$ac_header" | $as_tr_cpp` 1 _ACEOF fi done for ac_header in sys/dir.h sys/ndir.h strerr.h do as_ac_Header=`echo "ac_cv_header_$ac_header" | $as_tr_sh` if eval "test \"\${$as_ac_Header+set}\" = set"; then echo "$as_me:$LINENO: checking for $ac_header" >&5 echo $ECHO_N "checking for $ac_header... $ECHO_C" >&6 if eval "test \"\${$as_ac_Header+set}\" = set"; then echo $ECHO_N "(cached) $ECHO_C" >&6 fi echo "$as_me:$LINENO: result: `eval echo '${'$as_ac_Header'}'`" >&5 echo "${ECHO_T}`eval echo '${'$as_ac_Header'}'`" >&6 else # Is the header compilable? echo "$as_me:$LINENO: checking $ac_header usability" >&5 echo $ECHO_N "checking $ac_header usability... 
$ECHO_C" >&6 cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ $ac_includes_default #include <$ac_header> _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_header_compiler=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_header_compiler=no fi rm -f conftest.$ac_objext conftest.$ac_ext echo "$as_me:$LINENO: result: $ac_header_compiler" >&5 echo "${ECHO_T}$ac_header_compiler" >&6 # Is the header present? echo "$as_me:$LINENO: checking $ac_header presence" >&5 echo $ECHO_N "checking $ac_header presence... $ECHO_C" >&6 cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <$ac_header> _ACEOF if { (eval echo "$as_me:$LINENO: \"$ac_cpp conftest.$ac_ext\"") >&5 (eval $ac_cpp conftest.$ac_ext) 2>conftest.er1 ac_status=$? grep -v '^ *+' conftest.er1 >conftest.err rm -f conftest.er1 cat conftest.err >&5 echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } >/dev/null; then if test -s conftest.err; then ac_cpp_err=$ac_c_preproc_warn_flag else ac_cpp_err= fi else ac_cpp_err=yes fi if test -z "$ac_cpp_err"; then ac_header_preproc=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_header_preproc=no fi rm -f conftest.err conftest.$ac_ext echo "$as_me:$LINENO: result: $ac_header_preproc" >&5 echo "${ECHO_T}$ac_header_preproc" >&6 # So? What about this header? 
case $ac_header_compiler:$ac_header_preproc in yes:no ) { echo "$as_me:$LINENO: WARNING: $ac_header: accepted by the compiler, rejected by the preprocessor!" >&5 echo "$as_me: WARNING: $ac_header: accepted by the compiler, rejected by the preprocessor!" >&2;} { echo "$as_me:$LINENO: WARNING: $ac_header: proceeding with the preprocessor's result" >&5 echo "$as_me: WARNING: $ac_header: proceeding with the preprocessor's result" >&2;} ( cat <<\_ASBOX ## ------------------------------------ ## ## Report this to bug-autoconf@gnu.org. ## ## ------------------------------------ ## _ASBOX ) | sed "s/^/$as_me: WARNING: /" >&2 ;; no:yes ) { echo "$as_me:$LINENO: WARNING: $ac_header: present but cannot be compiled" >&5 echo "$as_me: WARNING: $ac_header: present but cannot be compiled" >&2;} { echo "$as_me:$LINENO: WARNING: $ac_header: check for missing prerequisite headers?" >&5 echo "$as_me: WARNING: $ac_header: check for missing prerequisite headers?" >&2;} { echo "$as_me:$LINENO: WARNING: $ac_header: proceeding with the preprocessor's result" >&5 echo "$as_me: WARNING: $ac_header: proceeding with the preprocessor's result" >&2;} ( cat <<\_ASBOX ## ------------------------------------ ## ## Report this to bug-autoconf@gnu.org. ## ## ------------------------------------ ## _ASBOX ) | sed "s/^/$as_me: WARNING: /" >&2 ;; esac echo "$as_me:$LINENO: checking for $ac_header" >&5 echo $ECHO_N "checking for $ac_header... 
$ECHO_C" >&6 if eval "test \"\${$as_ac_Header+set}\" = set"; then echo $ECHO_N "(cached) $ECHO_C" >&6 else eval "$as_ac_Header=$ac_header_preproc" fi echo "$as_me:$LINENO: result: `eval echo '${'$as_ac_Header'}'`" >&5 echo "${ECHO_T}`eval echo '${'$as_ac_Header'}'`" >&6 fi if test `eval echo '${'$as_ac_Header'}'` = yes; then cat >>confdefs.h <<_ACEOF #define `echo "HAVE_$ac_header" | $as_tr_cpp` 1 _ACEOF fi done echo "$as_me:$LINENO: checking whether time.h and sys/time.h may both be included" >&5 echo $ECHO_N "checking whether time.h and sys/time.h may both be included... $ECHO_C" >&6 if test "${ac_cv_header_time+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <sys/types.h> #include <sys/time.h> #include <time.h> int main () { if ((struct tm *) 0) return 0; ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_header_time=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_header_time=no fi rm -f conftest.$ac_objext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_header_time" >&5 echo "${ECHO_T}$ac_cv_header_time" >&6 if test $ac_cv_header_time = yes; then cat >>confdefs.h <<\_ACEOF #define TIME_WITH_SYS_TIME 1 _ACEOF fi echo "$as_me:$LINENO: checking for an ANSI C-conforming const" >&5 echo $ECHO_N "checking for an ANSI C-conforming const... $ECHO_C" >&6 if test "${ac_cv_c_const+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. 
*/ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { /* FIXME: Include the comments suggested by Paul. */ #ifndef __cplusplus /* Ultrix mips cc rejects this. */ typedef int charset[2]; const charset x; /* SunOS 4.1.1 cc rejects this. */ char const *const *ccp; char **p; /* NEC SVR4.0.2 mips cc rejects this. */ struct point {int x, y;}; static struct point const zero = {0,0}; /* AIX XL C 1.02.0.0 rejects this. It does not let you subtract one const X* pointer from another in an arm of an if-expression whose if-part is not a constant expression */ const char *g = "string"; ccp = &g + (g ? g-g : 0); /* HPUX 7.0 cc rejects these. */ ++ccp; p = (char**) ccp; ccp = (char const *const *) p; { /* SCO 3.2v4 cc rejects this. */ char *t; char const *s = 0 ? (char *) 0 : (char const *) 0; *t++ = 0; } { /* Someone thinks the Sun supposedly-ANSI compiler will reject this. */ int x[] = {25, 17}; const int *foo = &x[0]; ++foo; } { /* Sun SC1.0 ANSI compiler rejects this -- but not the above. */ typedef const int *iptr; iptr p = 0; ++p; } { /* AIX XL C 1.02.0.0 rejects this saying "k.c", line 2.27: 1506-025 (S) Operand must be a modifiable lvalue. */ struct s { int j; const int *ap[3]; }; struct s *b; b->j = 5; } { /* ULTRIX-32 V3.1 (Rev 9) vcc rejects this */ const int foo = 10; } #endif ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_c_const=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_c_const=no fi rm -f conftest.$ac_objext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_c_const" >&5 echo "${ECHO_T}$ac_cv_c_const" >&6 if test $ac_cv_c_const = no; then cat >>confdefs.h <<\_ACEOF #define const _ACEOF fi echo "$as_me:$LINENO: checking return type of signal handlers" >&5 echo $ECHO_N "checking return type of signal handlers... $ECHO_C" >&6 if test "${ac_cv_type_signal+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ #include <sys/types.h> #include <signal.h> #ifdef signal # undef signal #endif #ifdef __cplusplus extern "C" void (*signal (int, void (*)(int)))(int); #else void (*signal ()) (); #endif int main () { int i; ; return 0; } _ACEOF rm -f conftest.$ac_objext if { (eval echo "$as_me:$LINENO: \"$ac_compile\"") >&5 (eval $ac_compile) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest.$ac_objext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_type_signal=void else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_type_signal=int fi rm -f conftest.$ac_objext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_type_signal" >&5 echo "${ECHO_T}$ac_cv_type_signal" >&6 cat >>confdefs.h <<_ACEOF #define RETSIGTYPE $ac_cv_type_signal _ACEOF echo "$as_me:$LINENO: checking whether utime accepts a null argument" >&5 echo $ECHO_N "checking whether utime accepts a null argument... 
$ECHO_C" >&6 if test "${ac_cv_func_utime_null+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else rm -f conftest.data; >conftest.data # Sequent interprets utime(file, 0) to mean use start of epoch. Wrong. if test "$cross_compiling" = yes; then ac_cv_func_utime_null=no else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ $ac_includes_default int main () { struct stat s, t; exit (!(stat ("conftest.data", &s) == 0 && utime ("conftest.data", (long *)0) == 0 && stat ("conftest.data", &t) == 0 && t.st_mtime >= s.st_mtime && t.st_mtime - s.st_mtime < 120)); ; return 0; } _ACEOF rm -f conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='./conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_func_utime_null=yes else echo "$as_me: program exited with status $ac_status" >&5 echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ( exit $ac_status ) ac_cv_func_utime_null=no fi rm -f core core.* *.core gmon.out bb.out conftest$ac_exeext conftest.$ac_objext conftest.$ac_ext fi rm -f core core.* *.core fi echo "$as_me:$LINENO: result: $ac_cv_func_utime_null" >&5 echo "${ECHO_T}$ac_cv_func_utime_null" >&6 if test $ac_cv_func_utime_null = yes; then cat >>confdefs.h <<\_ACEOF #define HAVE_UTIME_NULL 1 _ACEOF fi rm -f conftest.data #AC_CHECK_FUNCS(getcwd gethostname gettimeofday mkdir rmdir select socket strdup strftime strstr) # Need this for libtemplate for ac_func in strdup strerror do as_ac_var=`echo "ac_cv_func_$ac_func" | $as_tr_sh` echo "$as_me:$LINENO: checking for $ac_func" >&5 echo $ECHO_N "checking for $ac_func... 
$ECHO_C" >&6 if eval "test \"\${$as_ac_var+set}\" = set"; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* System header to define __stub macros and hopefully few prototypes, which can conflict with char $ac_func (); below. Prefer <limits.h> to <assert.h> if __STDC__ is defined, since <limits.h> exists even on freestanding compilers. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" { #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char $ac_func (); /* The GNU C library defines this for functions which it implements to always fail with ENOSYS. Some functions are actually named something starting with __ and the normal name is an alias. */ #if defined (__stub_$ac_func) || defined (__stub___$ac_func) choke me #else char (*f) () = $ac_func; #endif #ifdef __cplusplus } #endif int main () { return f != $ac_func; ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then eval "$as_ac_var=yes" else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 eval "$as_ac_var=no" fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext fi echo "$as_me:$LINENO: result: `eval echo '${'$as_ac_var'}'`" >&5 echo "${ECHO_T}`eval echo '${'$as_ac_var'}'`" >&6 if test `eval echo '${'$as_ac_var'}'` = yes; then cat >>confdefs.h <<_ACEOF #define `echo "HAVE_$ac_func" | $as_tr_cpp` 1 _ACEOF fi done # fmod, logb, and frexp are found in -lm on most systems. # On HPUX 9.01, -lm does not contain logb, so check for sqrt. echo "$as_me:$LINENO: checking for sqrt in -lm" >&5 echo $ECHO_N "checking for sqrt in -lm... $ECHO_C" >&6 if test "${ac_cv_lib_m_sqrt+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lm $LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char sqrt (); int main () { sqrt (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_lib_m_sqrt=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_lib_m_sqrt=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_lib_m_sqrt" >&5 echo "${ECHO_T}$ac_cv_lib_m_sqrt" >&6 if test $ac_cv_lib_m_sqrt = yes; then cat >>confdefs.h <<_ACEOF #define HAVE_LIBM 1 _ACEOF LIBS="-lm $LIBS" fi echo "$as_me:$LINENO: checking for dlopen in -lc" >&5 echo $ECHO_N "checking for dlopen in -lc... $ECHO_C" >&6 if test "${ac_cv_lib_c_dlopen+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lc $LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char dlopen (); int main () { dlopen (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_lib_c_dlopen=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_lib_c_dlopen=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_lib_c_dlopen" >&5 echo "${ECHO_T}$ac_cv_lib_c_dlopen" >&6 if test $ac_cv_lib_c_dlopen = yes; then cat >>confdefs.h <<_ACEOF #define HAVE_LIBC 1 _ACEOF LIBS="-lc $LIBS" else echo "$as_me:$LINENO: checking for dlopen in -ldl" >&5 echo $ECHO_N "checking for dlopen in -ldl... $ECHO_C" >&6 if test "${ac_cv_lib_dl_dlopen+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-ldl $LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char dlopen (); int main () { dlopen (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_lib_dl_dlopen=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_lib_dl_dlopen=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_lib_dl_dlopen" >&5 echo "${ECHO_T}$ac_cv_lib_dl_dlopen" >&6 if test $ac_cv_lib_dl_dlopen = yes; then LIBS="$LIBS -ldl" fi fi #AC_CHECK_LIB(resolv, gethostbyname) #AC_CHECK_LIB(nsl, gethostname, [LIBS="$LIBS -lnsl"]) #AC_CHECK_LIB(socket, setsockopt, [LIBS="$LIBS -lsocket"]) #Contribution by: Larry Schwimmer schwim@cyclone.stanford.edu echo "$as_me:$LINENO: checking for connect" >&5 echo $ECHO_N "checking for connect... $ECHO_C" >&6 if test "${ac_cv_func_connect+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* System header to define __stub macros and hopefully few prototypes, which can conflict with char connect (); below. Prefer <limits.h> to <assert.h> if __STDC__ is defined, since <limits.h> exists even on freestanding compilers. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" { #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char connect (); /* The GNU C library defines this for functions which it implements to always fail with ENOSYS. Some functions are actually named something starting with __ and the normal name is an alias.
*/ #if defined (__stub_connect) || defined (__stub___connect) choke me #else char (*f) () = connect; #endif #ifdef __cplusplus } #endif int main () { return f != connect; ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_func_connect=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_func_connect=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_func_connect" >&5 echo "${ECHO_T}$ac_cv_func_connect" >&6 if test $ac_cv_func_connect = yes; then : else ac_check_socket=1 fi if test "$ac_check_socket" = 1; then echo "$as_me:$LINENO: checking for main in -lsocket" >&5 echo $ECHO_N "checking for main in -lsocket... $ECHO_C" >&6 if test "${ac_cv_lib_socket_main+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lsocket $LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { main (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_lib_socket_main=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_lib_socket_main=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_lib_socket_main" >&5 echo "${ECHO_T}$ac_cv_lib_socket_main" >&6 if test $ac_cv_lib_socket_main = yes; then LIBS="$LIBS -lsocket" else ac_check_both=1 fi fi if test "$ac_check_both" = 1; then ac_old_libs=$LIBS LIBS="$LIBS -lsocket -lnsl" echo "$as_me:$LINENO: checking for accept" >&5 echo $ECHO_N "checking for accept... $ECHO_C" >&6 if test "${ac_cv_func_accept+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* System header to define __stub macros and hopefully few prototypes, which can conflict with char accept (); below. Prefer <limits.h> to <assert.h> if __STDC__ is defined, since <limits.h> exists even on freestanding compilers. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" { #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char accept (); /* The GNU C library defines this for functions which it implements to always fail with ENOSYS. Some functions are actually named something starting with __ and the normal name is an alias. */ #if defined (__stub_accept) || defined (__stub___accept) choke me #else char (*f) () = accept; #endif #ifdef __cplusplus } #endif int main () { return f != accept; ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$?
= $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_func_accept=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_func_accept=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_func_accept" >&5 echo "${ECHO_T}$ac_cv_func_accept" >&6 if test $ac_cv_func_accept = yes; then checknsl=0 else LIBS=$ac_old_libs fi fi echo "$as_me:$LINENO: checking for gethostbyname" >&5 echo $ECHO_N "checking for gethostbyname... $ECHO_C" >&6 if test "${ac_cv_func_gethostbyname+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ /* System header to define __stub macros and hopefully few prototypes, which can conflict with char gethostbyname (); below. Prefer <limits.h> to <assert.h> if __STDC__ is defined, since <limits.h> exists even on freestanding compilers. */ #ifdef __STDC__ # include <limits.h> #else # include <assert.h> #endif /* Override any gcc2 internal prototype to avoid an error. */ #ifdef __cplusplus extern "C" { #endif /* We use char because int might match the return type of a gcc2 builtin and then its argument prototype would still apply. */ char gethostbyname (); /* The GNU C library defines this for functions which it implements to always fail with ENOSYS. Some functions are actually named something starting with __ and the normal name is an alias.
*/ #if defined (__stub_gethostbyname) || defined (__stub___gethostbyname) choke me #else char (*f) () = gethostbyname; #endif #ifdef __cplusplus } #endif int main () { return f != gethostbyname; ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_func_gethostbyname=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_func_gethostbyname=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext fi echo "$as_me:$LINENO: result: $ac_cv_func_gethostbyname" >&5 echo "${ECHO_T}$ac_cv_func_gethostbyname" >&6 if test $ac_cv_func_gethostbyname = yes; then : else echo "$as_me:$LINENO: checking for main in -lnsl" >&5 echo $ECHO_N "checking for main in -lnsl... $ECHO_C" >&6 if test "${ac_cv_lib_nsl_main+set}" = set; then echo $ECHO_N "(cached) $ECHO_C" >&6 else ac_check_lib_save_LIBS=$LIBS LIBS="-lnsl $LIBS" cat >conftest.$ac_ext <<_ACEOF #line $LINENO "configure" /* confdefs.h. */ _ACEOF cat confdefs.h >>conftest.$ac_ext cat >>conftest.$ac_ext <<_ACEOF /* end confdefs.h. */ int main () { main (); ; return 0; } _ACEOF rm -f conftest.$ac_objext conftest$ac_exeext if { (eval echo "$as_me:$LINENO: \"$ac_link\"") >&5 (eval $ac_link) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? = $ac_status" >&5 (exit $ac_status); } && { ac_try='test -s conftest$ac_exeext' { (eval echo "$as_me:$LINENO: \"$ac_try\"") >&5 (eval $ac_try) 2>&5 ac_status=$? echo "$as_me:$LINENO: \$? 
= $ac_status" >&5 (exit $ac_status); }; }; then ac_cv_lib_nsl_main=yes else echo "$as_me: failed program was:" >&5 sed 's/^/| /' conftest.$ac_ext >&5 ac_cv_lib_nsl_main=no fi rm -f conftest.$ac_objext conftest$ac_exeext conftest.$ac_ext LIBS=$ac_check_lib_save_LIBS fi echo "$as_me:$LINENO: result: $ac_cv_lib_nsl_main" >&5 echo "${ECHO_T}$ac_cv_lib_nsl_main" >&6 if test $ac_cv_lib_nsl_main = yes; then LIBS="$LIBS -lnsl" fi fi # Check whether --with-file-end-mark or --without-file-end-mark was given. if test "${with_file_end_mark+set}" = set; then withval="$with_file_end_mark" cat >>confdefs.h <<_ACEOF #define FILE_END_MARK '$withval' _ACEOF else cat >>confdefs.h <<\_ACEOF #define FILE_END_MARK ' ' _ACEOF fi; # Check whether --enable-structured-queries or --disable-structured-queries was given. if test "${enable_structured_queries+set}" = set; then enableval="$enable_structured_queries" cat >>confdefs.h <<\_ACEOF #define STRUCTURED_QUERIES 1 _ACEOF TARGET=Sall else cat >>confdefs.h <<\_ACEOF #define STRUCTURED_QUERIES 0 _ACEOF TARGET=NOTSall fi; # Check whether --enable-iso-charset or --disable-iso-charset was given. if test "${enable_iso_charset+set}" = set; then enableval="$enable_iso_charset" use_iso=$enableval else use_iso=yes fi; # Check whether --enable-sfs-compat or --disable-sfs-compat was given. if test "${enable_sfs_compat+set}" = set; then enableval="$enable_sfs_compat" cat >>confdefs.h <<\_ACEOF #define SFS_COMPAT 1 _ACEOF else cat >>confdefs.h <<\_ACEOF #define SFS_COMPAT 0 _ACEOF fi; # Check whether --enable-pointer or --disable-pointer was given. if test "${enable_pointer+set}" = set; then enableval="$enable_pointer" else cat >>confdefs.h <<\_ACEOF #define AGREP_POINTER 1 _ACEOF fi; # Check whether --enable-measure-times or --disable-measure-times was given.
if test "${enable_measure_times+set}" = set; then enableval="$enable_measure_times" cat >>confdefs.h <<\_ACEOF #define MEASURE_TIMES 1 _ACEOF fi; # Check whether --enable-warnings or --disable-warnings was given. if test "${enable_warnings+set}" = set; then enableval="$enable_warnings" CFLAGS="$CFLAGS -Wall" fi; # Check whether --enable-strip or --disable-strip was given. if test "${enable_strip+set}" = set; then enableval="$enable_strip" else STRIP="" fi; if test $use_iso = yes; then cat >>confdefs.h <<\_ACEOF #define ISO_CHAR_SET 1 _ACEOF else cat >>confdefs.h <<\_ACEOF #define ISO_CHAR_SET 0 _ACEOF fi ac_config_files="$ac_config_files Makefile index/Makefile compress/Makefile agrep/Makefile dynfilters/Makefile libtemplate/Makefile libtemplate/util/Makefile libtemplate/template/Makefile libtemplate/lib/Makefile" cat >confcache <<\_ACEOF # This file is a shell script that caches the results of configure # tests run on this system so they can be shared between configure # scripts and configure runs, see configure's option --config-cache. # It is not useful on other systems. If it contains results you don't # want to keep, you may remove or edit it. # # config.status only pays attention to the cache file if you give it # the --recheck option to rerun configure. # # `ac_cv_env_foo' variables (set or unset) will be overridden when # loading this file, other *unset* `ac_cv_foo' will be assigned the # following values. _ACEOF # The following way of writing the cache mishandles newlines in values, # but we know of no workaround that is simple, portable, and efficient. # So, don't put newlines in cache variables' values. # Ultrix sh set writes to stderr and can't be redirected directly, # and sets the high bit in the cache file unless we assign to the vars. { (set) 2>&1 | case `(ac_space=' '; set | grep ac_space) 2>&1` in *ac_space=\ *) # `set' does not quote correctly, so add quotes (double-quote # substitution turns \\\\ into \\, and sed turns \\ into \). 
sed -n \ "s/'/'\\\\''/g; s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1='\\2'/p" ;; *) # `set' quotes correctly as required by POSIX, so do not add quotes. sed -n \ "s/^\\([_$as_cr_alnum]*_cv_[_$as_cr_alnum]*\\)=\\(.*\\)/\\1=\\2/p" ;; esac; } | sed ' t clear : clear s/^\([^=]*\)=\(.*[{}].*\)$/test "${\1+set}" = set || &/ t end /^ac_cv_env/!s/^\([^=]*\)=\(.*\)$/\1=${\1=\2}/ : end' >>confcache if diff $cache_file confcache >/dev/null 2>&1; then :; else if test -w $cache_file; then test "x$cache_file" != "x/dev/null" && echo "updating cache $cache_file" cat confcache >$cache_file else echo "not updating unwritable cache $cache_file" fi fi rm -f confcache test "x$prefix" = xNONE && prefix=$ac_default_prefix # Let make expand exec_prefix. test "x$exec_prefix" = xNONE && exec_prefix='${prefix}' # VPATH may cause trouble with some makes, so we remove $(srcdir), # ${srcdir} and @srcdir@ from VPATH if srcdir is ".", strip leading and # trailing colons and then remove the whole line if VPATH becomes empty # (actually we leave an empty line to preserve line numbers). if test "x$srcdir" = x.; then ac_vpsub='/^[ ]*VPATH[ ]*=/{ s/:*\$(srcdir):*/:/; s/:*\${srcdir}:*/:/; s/:*@srcdir@:*/:/; s/^\([^=]*=[ ]*\):*/\1/; s/:*$//; s/^[^=]*=[ ]*$//; }' fi DEFS=-DHAVE_CONFIG_H ac_libobjs= ac_ltlibobjs= for ac_i in : $LIBOBJS; do test "x$ac_i" = x: && continue # 1. Remove the extension, and $U if already installed. ac_i=`echo "$ac_i" | sed 's/\$U\././;s/\.o$//;s/\.obj$//'` # 2. Add them. ac_libobjs="$ac_libobjs $ac_i\$U.$ac_objext" ac_ltlibobjs="$ac_ltlibobjs $ac_i"'$U.lo' done LIBOBJS=$ac_libobjs LTLIBOBJS=$ac_ltlibobjs : ${CONFIG_STATUS=./config.status} ac_clean_files_save=$ac_clean_files ac_clean_files="$ac_clean_files $CONFIG_STATUS" { echo "$as_me:$LINENO: creating $CONFIG_STATUS" >&5 echo "$as_me: creating $CONFIG_STATUS" >&6;} cat >$CONFIG_STATUS <<_ACEOF #! $SHELL # Generated by $as_me. # Run this file to recreate the current configuration. 
# Compiler output produced by configure, useful for debugging # configure, is in config.log if it exists. debug=false ac_cs_recheck=false ac_cs_silent=false SHELL=\${CONFIG_SHELL-$SHELL} _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF ## --------------------- ## ## M4sh Initialization. ## ## --------------------- ## # Be Bourne compatible if test -n "${ZSH_VERSION+set}" && (emulate sh) >/dev/null 2>&1; then emulate sh NULLCMD=: # Zsh 3.x and 4.x performs word splitting on ${1+"$@"}, which # is contrary to our usage. Disable this feature. alias -g '${1+"$@"}'='"$@"' elif test -n "${BASH_VERSION+set}" && (set -o posix) >/dev/null 2>&1; then set -o posix fi # Support unset when possible. if (FOO=FOO; unset FOO) >/dev/null 2>&1; then as_unset=unset else as_unset=false fi # Work around bugs in pre-3.0 UWIN ksh. $as_unset ENV MAIL MAILPATH PS1='$ ' PS2='> ' PS4='+ ' # NLS nuisances. for as_var in \ LANG LANGUAGE LC_ADDRESS LC_ALL LC_COLLATE LC_CTYPE LC_IDENTIFICATION \ LC_MEASUREMENT LC_MESSAGES LC_MONETARY LC_NAME LC_NUMERIC LC_PAPER \ LC_TELEPHONE LC_TIME do if (set +x; test -n "`(eval $as_var=C; export $as_var) 2>&1`"); then eval $as_var=C; export $as_var else $as_unset $as_var fi done # Required to use basename. if expr a : '\(a\)' >/dev/null 2>&1; then as_expr=expr else as_expr=false fi if (basename /) >/dev/null 2>&1 && test "X`basename / 2>&1`" = "X/"; then as_basename=basename else as_basename=false fi # Name of the executable. as_me=`$as_basename "$0" || $as_expr X/"$0" : '.*/\([^/][^/]*\)/*$' \| \ X"$0" : 'X\(//\)$' \| \ X"$0" : 'X\(/\)$' \| \ . : '\(.\)' 2>/dev/null || echo X/"$0" | sed '/^.*\/\([^/][^/]*\)\/*$/{ s//\1/; q; } /^X\/\(\/\/\)$/{ s//\1/; q; } /^X\/\(\/\).*/{ s//\1/; q; } s/.*/./; q'` # PATH needs CR, and LINENO needs CR and PATH. # Avoid depending upon Character Ranges. 
as_cr_letters='abcdefghijklmnopqrstuvwxyz' as_cr_LETTERS='ABCDEFGHIJKLMNOPQRSTUVWXYZ' as_cr_Letters=$as_cr_letters$as_cr_LETTERS as_cr_digits='0123456789' as_cr_alnum=$as_cr_Letters$as_cr_digits # The user is always right. if test "${PATH_SEPARATOR+set}" != set; then echo "#! /bin/sh" >conf$$.sh echo "exit 0" >>conf$$.sh chmod +x conf$$.sh if (PATH="/nonexistent;."; conf$$.sh) >/dev/null 2>&1; then PATH_SEPARATOR=';' else PATH_SEPARATOR=: fi rm -f conf$$.sh fi as_lineno_1=$LINENO as_lineno_2=$LINENO as_lineno_3=`(expr $as_lineno_1 + 1) 2>/dev/null` test "x$as_lineno_1" != "x$as_lineno_2" && test "x$as_lineno_3" = "x$as_lineno_2" || { # Find who we are. Look in the path if we contain no path at all # relative or not. case $0 in *[\\/]* ) as_myself=$0 ;; *) as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in $PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. test -r "$as_dir/$0" && as_myself=$as_dir/$0 && break done ;; esac # We did not find ourselves, most probably we were run as `sh COMMAND' # in which case we are not to be found in the path. if test "x$as_myself" = x; then as_myself=$0 fi if test ! -f "$as_myself"; then { { echo "$as_me:$LINENO: error: cannot find myself; rerun with an absolute path" >&5 echo "$as_me: error: cannot find myself; rerun with an absolute path" >&2;} { (exit 1); exit 1; }; } fi case $CONFIG_SHELL in '') as_save_IFS=$IFS; IFS=$PATH_SEPARATOR for as_dir in /bin$PATH_SEPARATOR/usr/bin$PATH_SEPARATOR$PATH do IFS=$as_save_IFS test -z "$as_dir" && as_dir=. 
for as_base in sh bash ksh sh5; do case $as_dir in /*) if ("$as_dir/$as_base" -c ' as_lineno_1=$LINENO as_lineno_2=$LINENO as_lineno_3=`(expr $as_lineno_1 + 1) 2>/dev/null` test "x$as_lineno_1" != "x$as_lineno_2" && test "x$as_lineno_3" = "x$as_lineno_2" ') 2>/dev/null; then $as_unset BASH_ENV || test "${BASH_ENV+set}" != set || { BASH_ENV=; export BASH_ENV; } $as_unset ENV || test "${ENV+set}" != set || { ENV=; export ENV; } CONFIG_SHELL=$as_dir/$as_base export CONFIG_SHELL exec "$CONFIG_SHELL" "$0" ${1+"$@"} fi;; esac done done ;; esac # Create $as_me.lineno as a copy of $as_myself, but with $LINENO # uniformly replaced by the line number. The first 'sed' inserts a # line-number line before each line; the second 'sed' does the real # work. The second script uses 'N' to pair each line-number line # with the numbered line, and appends trailing '-' during # substitution so that $LINENO is not a special case at line end. # (Raja R Harinath suggested sed '=', and Paul Eggert wrote the # second 'sed' script. Blame Lee E. McMahon for sed's syntax. :-) sed '=' <$as_myself | sed ' N s,$,-, : loop s,^\(['$as_cr_digits']*\)\(.*\)[$]LINENO\([^'$as_cr_alnum'_]\),\1\2\1\3, t loop s,-$,, s,^['$as_cr_digits']*\n,, ' >$as_me.lineno && chmod +x $as_me.lineno || { { echo "$as_me:$LINENO: error: cannot create $as_me.lineno; rerun with a POSIX shell" >&5 echo "$as_me: error: cannot create $as_me.lineno; rerun with a POSIX shell" >&2;} { (exit 1); exit 1; }; } # Don't try to exec as it changes $[0], causing all sort of problems # (the dirname of $[0] is not the place where we might find the # original and so on. Autoconf is especially sensible to this). . ./$as_me.lineno # Exit status is that of the last command. 
exit } case `echo "testing\c"; echo 1,2,3`,`echo -n testing; echo 1,2,3` in *c*,-n*) ECHO_N= ECHO_C=' ' ECHO_T=' ' ;; *c*,* ) ECHO_N=-n ECHO_C= ECHO_T= ;; *) ECHO_N= ECHO_C='\c' ECHO_T= ;; esac if expr a : '\(a\)' >/dev/null 2>&1; then as_expr=expr else as_expr=false fi rm -f conf$$ conf$$.exe conf$$.file echo >conf$$.file if ln -s conf$$.file conf$$ 2>/dev/null; then # We could just check for DJGPP; but this test a) works b) is more generic # and c) will remain valid once DJGPP supports symlinks (DJGPP 2.04). if test -f conf$$.exe; then # Don't use ln at all; we don't have any links as_ln_s='cp -p' else as_ln_s='ln -s' fi elif ln conf$$.file conf$$ 2>/dev/null; then as_ln_s=ln else as_ln_s='cp -p' fi rm -f conf$$ conf$$.exe conf$$.file if mkdir -p . 2>/dev/null; then as_mkdir_p=: else as_mkdir_p=false fi as_executable_p="test -f" # Sed expression to map a string onto a valid CPP name. as_tr_cpp="sed y%*$as_cr_letters%P$as_cr_LETTERS%;s%[^_$as_cr_alnum]%_%g" # Sed expression to map a string onto a valid variable name. as_tr_sh="sed y%*+%pp%;s%[^_$as_cr_alnum]%_%g" # IFS # We need space, tab and new line, in precisely that order. as_nl=' ' IFS=" $as_nl" # CDPATH. $as_unset CDPATH exec 6>&1 # Open the log real soon, to keep \$[0] and so on meaningful, and to # report actual input values of CONFIG_FILES etc. instead of their # values after options handling. Logging --version etc. is OK. exec 5>>config.log { echo sed 'h;s/./-/g;s/^.../## /;s/...$/ ##/;p;x;p;x' <<_ASBOX ## Running $as_me. ## _ASBOX } >&5 cat >&5 <<_CSEOF This file was extended by $as_me, which was generated by GNU Autoconf 2.57. Invocation command line was CONFIG_FILES = $CONFIG_FILES CONFIG_HEADERS = $CONFIG_HEADERS CONFIG_LINKS = $CONFIG_LINKS CONFIG_COMMANDS = $CONFIG_COMMANDS $ $0 $@ _CSEOF echo "on `(hostname || uname -n) 2>/dev/null | sed 1q`" >&5 echo >&5 _ACEOF # Files that config.status was made for. 
if test -n "$ac_config_files"; then echo "config_files=\"$ac_config_files\"" >>$CONFIG_STATUS fi if test -n "$ac_config_headers"; then echo "config_headers=\"$ac_config_headers\"" >>$CONFIG_STATUS fi if test -n "$ac_config_links"; then echo "config_links=\"$ac_config_links\"" >>$CONFIG_STATUS fi if test -n "$ac_config_commands"; then echo "config_commands=\"$ac_config_commands\"" >>$CONFIG_STATUS fi cat >>$CONFIG_STATUS <<\_ACEOF ac_cs_usage="\ \`$as_me' instantiates files from templates according to the current configuration. Usage: $0 [OPTIONS] [FILE]... -h, --help print this help, then exit -V, --version print version number, then exit -q, --quiet do not print progress messages -d, --debug don't remove temporary files --recheck update $as_me by reconfiguring in the same conditions --file=FILE[:TEMPLATE] instantiate the configuration file FILE --header=FILE[:TEMPLATE] instantiate the configuration header FILE Configuration files: $config_files Configuration headers: $config_headers Report bugs to ." _ACEOF cat >>$CONFIG_STATUS <<_ACEOF ac_cs_version="\\ config.status configured by $0, generated by GNU Autoconf 2.57, with options \\"`echo "$ac_configure_args" | sed 's/[\\""\`\$]/\\\\&/g'`\\" Copyright 1992, 1993, 1994, 1995, 1996, 1998, 1999, 2000, 2001 Free Software Foundation, Inc. This config.status script is free software; the Free Software Foundation gives unlimited permission to copy, distribute and modify it." srcdir=$srcdir INSTALL="$INSTALL" _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF # If no file are specified by the user, then we need to provide default # value. By we need to know if files were specified by the user. ac_need_defaults=: while test $# != 0 do case $1 in --*=*) ac_option=`expr "x$1" : 'x\([^=]*\)='` ac_optarg=`expr "x$1" : 'x[^=]*=\(.*\)'` ac_shift=: ;; -*) ac_option=$1 ac_optarg=$2 ac_shift=shift ;; *) # This is not an option, so the user has probably given explicit # arguments. 
ac_option=$1 ac_need_defaults=false;; esac case $ac_option in # Handling of the options. _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF -recheck | --recheck | --rechec | --reche | --rech | --rec | --re | --r) ac_cs_recheck=: ;; --version | --vers* | -V ) echo "$ac_cs_version"; exit 0 ;; --he | --h) # Conflict between --help and --header { { echo "$as_me:$LINENO: error: ambiguous option: $1 Try \`$0 --help' for more information." >&5 echo "$as_me: error: ambiguous option: $1 Try \`$0 --help' for more information." >&2;} { (exit 1); exit 1; }; };; --help | --hel | -h ) echo "$ac_cs_usage"; exit 0 ;; --debug | --d* | -d ) debug=: ;; --file | --fil | --fi | --f ) $ac_shift CONFIG_FILES="$CONFIG_FILES $ac_optarg" ac_need_defaults=false;; --header | --heade | --head | --hea ) $ac_shift CONFIG_HEADERS="$CONFIG_HEADERS $ac_optarg" ac_need_defaults=false;; -q | -quiet | --quiet | --quie | --qui | --qu | --q \ | -silent | --silent | --silen | --sile | --sil | --si | --s) ac_cs_silent=: ;; # This is an error. -*) { { echo "$as_me:$LINENO: error: unrecognized option: $1 Try \`$0 --help' for more information." >&5 echo "$as_me: error: unrecognized option: $1 Try \`$0 --help' for more information." >&2;} { (exit 1); exit 1; }; } ;; *) ac_config_targets="$ac_config_targets $1" ;; esac shift done ac_configure_extra_args= if $ac_cs_silent; then exec 6>/dev/null ac_configure_extra_args="$ac_configure_extra_args --silent" fi _ACEOF cat >>$CONFIG_STATUS <<_ACEOF if \$ac_cs_recheck; then echo "running $SHELL $0 " $ac_configure_args \$ac_configure_extra_args " --no-create --no-recursion" >&6 exec $SHELL $0 $ac_configure_args \$ac_configure_extra_args --no-create --no-recursion fi _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF for ac_config_target in $ac_config_targets do case "$ac_config_target" in # Handling of arguments. 
"Makefile" ) CONFIG_FILES="$CONFIG_FILES Makefile" ;; "index/Makefile" ) CONFIG_FILES="$CONFIG_FILES index/Makefile" ;; "compress/Makefile" ) CONFIG_FILES="$CONFIG_FILES compress/Makefile" ;; "agrep/Makefile" ) CONFIG_FILES="$CONFIG_FILES agrep/Makefile" ;; "dynfilters/Makefile" ) CONFIG_FILES="$CONFIG_FILES dynfilters/Makefile" ;; "libtemplate/Makefile" ) CONFIG_FILES="$CONFIG_FILES libtemplate/Makefile" ;; "libtemplate/util/Makefile" ) CONFIG_FILES="$CONFIG_FILES libtemplate/util/Makefile" ;; "libtemplate/template/Makefile" ) CONFIG_FILES="$CONFIG_FILES libtemplate/template/Makefile" ;; "libtemplate/lib/Makefile" ) CONFIG_FILES="$CONFIG_FILES libtemplate/lib/Makefile" ;; "libtemplate/include/autoconf.h" ) CONFIG_HEADERS="$CONFIG_HEADERS libtemplate/include/autoconf.h" ;; *) { { echo "$as_me:$LINENO: error: invalid argument: $ac_config_target" >&5 echo "$as_me: error: invalid argument: $ac_config_target" >&2;} { (exit 1); exit 1; }; };; esac done # If the user did not use the arguments to specify the items to instantiate, # then the envvar interface is used. Set only those that are not. # We use the long form for the default assignment because of an extremely # bizarre bug on SunOS 4.1.3. if $ac_need_defaults; then test "${CONFIG_FILES+set}" = set || CONFIG_FILES=$config_files test "${CONFIG_HEADERS+set}" = set || CONFIG_HEADERS=$config_headers fi # Have a temporary directory for convenience. Make it in the build tree # simply because there is no reason to put it here, and in addition, # creating and moving files from /tmp can sometimes cause problems. # Create a temporary directory, and hook for its removal unless debugging. $debug || { trap 'exit_status=$?; rm -rf $tmp && exit $exit_status' 0 trap '{ (exit 1); exit 1; }' 1 2 13 15 } # Create a (secure) tmp directory for tmp files. 
{ tmp=`(umask 077 && mktemp -d -q "./confstatXXXXXX") 2>/dev/null` && test -n "$tmp" && test -d "$tmp" } || { tmp=./confstat$$-$RANDOM (umask 077 && mkdir $tmp) } || { echo "$me: cannot create a temporary directory in ." >&2 { (exit 1); exit 1; } } _ACEOF cat >>$CONFIG_STATUS <<_ACEOF # # CONFIG_FILES section. # # No need to generate the scripts if there are no CONFIG_FILES. # This happens for instance when ./config.status config.h if test -n "\$CONFIG_FILES"; then # Protect against being on the right side of a sed subst in config.status. sed 's/,@/@@/; s/@,/@@/; s/,;t t\$/@;t t/; /@;t t\$/s/[\\\\&,]/\\\\&/g; s/@@/,@/; s/@@/@,/; s/@;t t\$/,;t t/' >\$tmp/subs.sed <<\\CEOF s,@SHELL@,$SHELL,;t t s,@PATH_SEPARATOR@,$PATH_SEPARATOR,;t t s,@PACKAGE_NAME@,$PACKAGE_NAME,;t t s,@PACKAGE_TARNAME@,$PACKAGE_TARNAME,;t t s,@PACKAGE_VERSION@,$PACKAGE_VERSION,;t t s,@PACKAGE_STRING@,$PACKAGE_STRING,;t t s,@PACKAGE_BUGREPORT@,$PACKAGE_BUGREPORT,;t t s,@exec_prefix@,$exec_prefix,;t t s,@prefix@,$prefix,;t t s,@program_transform_name@,$program_transform_name,;t t s,@bindir@,$bindir,;t t s,@sbindir@,$sbindir,;t t s,@libexecdir@,$libexecdir,;t t s,@datadir@,$datadir,;t t s,@sysconfdir@,$sysconfdir,;t t s,@sharedstatedir@,$sharedstatedir,;t t s,@localstatedir@,$localstatedir,;t t s,@libdir@,$libdir,;t t s,@includedir@,$includedir,;t t s,@oldincludedir@,$oldincludedir,;t t s,@infodir@,$infodir,;t t s,@mandir@,$mandir,;t t s,@build_alias@,$build_alias,;t t s,@host_alias@,$host_alias,;t t s,@target_alias@,$target_alias,;t t s,@DEFS@,$DEFS,;t t s,@ECHO_C@,$ECHO_C,;t t s,@ECHO_N@,$ECHO_N,;t t s,@ECHO_T@,$ECHO_T,;t t s,@LIBS@,$LIBS,;t t s,@CC@,$CC,;t t s,@CFLAGS@,$CFLAGS,;t t s,@LDFLAGS@,$LDFLAGS,;t t s,@CPPFLAGS@,$CPPFLAGS,;t t s,@ac_ct_CC@,$ac_ct_CC,;t t s,@EXEEXT@,$EXEEXT,;t t s,@OBJEXT@,$OBJEXT,;t t s,@AR@,$AR,;t t s,@RANLIB@,$RANLIB,;t t s,@ac_ct_RANLIB@,$ac_ct_RANLIB,;t t s,@LN_S@,$LN_S,;t t s,@LEX@,$LEX,;t t s,@LEXLIB@,$LEXLIB,;t t s,@LEX_OUTPUT_ROOT@,$LEX_OUTPUT_ROOT,;t t 
s,@STRIP@,$STRIP,;t t s,@CP@,$CP,;t t s,@INSTALL_PROGRAM@,$INSTALL_PROGRAM,;t t s,@INSTALL_SCRIPT@,$INSTALL_SCRIPT,;t t s,@INSTALL_DATA@,$INSTALL_DATA,;t t s,@CPP@,$CPP,;t t s,@EGREP@,$EGREP,;t t s,@TARGET@,$TARGET,;t t s,@HAVE_STRDUP@,$HAVE_STRDUP,;t t s,@LEXFLAGS@,$LEXFLAGS,;t t s,@DYNFILTER_TARGET@,$DYNFILTER_TARGET,;t t s,@DYNFILTER_CFLAGS@,$DYNFILTER_CFLAGS,;t t s,@DYNFILTER@,$DYNFILTER,;t t s,@LIBOBJS@,$LIBOBJS,;t t s,@LTLIBOBJS@,$LTLIBOBJS,;t t CEOF _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF # Split the substitutions into bite-sized pieces for seds with # small command number limits, like on Digital OSF/1 and HP-UX. ac_max_sed_lines=48 ac_sed_frag=1 # Number of current file. ac_beg=1 # First line for current file. ac_end=$ac_max_sed_lines # Line after last line for current file. ac_more_lines=: ac_sed_cmds= while $ac_more_lines; do if test $ac_beg -gt 1; then sed "1,${ac_beg}d; ${ac_end}q" $tmp/subs.sed >$tmp/subs.frag else sed "${ac_end}q" $tmp/subs.sed >$tmp/subs.frag fi if test ! -s $tmp/subs.frag; then ac_more_lines=false else # The purpose of the label and of the branching condition is to # speed up the sed processing (if there are no `@' at all, there # is no need to browse any of the substitutions). # These are the two extra sed commands mentioned above. (echo ':t /@[a-zA-Z_][a-zA-Z_0-9]*@/!b' && cat $tmp/subs.frag) >$tmp/subs-$ac_sed_frag.sed if test -z "$ac_sed_cmds"; then ac_sed_cmds="sed -f $tmp/subs-$ac_sed_frag.sed" else ac_sed_cmds="$ac_sed_cmds | sed -f $tmp/subs-$ac_sed_frag.sed" fi ac_sed_frag=`expr $ac_sed_frag + 1` ac_beg=$ac_end ac_end=`expr $ac_end + $ac_max_sed_lines` fi done if test -z "$ac_sed_cmds"; then ac_sed_cmds=cat fi fi # test -n "$CONFIG_FILES" _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF for ac_file in : $CONFIG_FILES; do test "x$ac_file" = x: && continue # Support "outfile[:infile[:infile...]]", defaulting infile="outfile.in". 
case $ac_file in - | *:- | *:-:* ) # input from stdin cat >$tmp/stdin ac_file_in=`echo "$ac_file" | sed 's,[^:]*:,,'` ac_file=`echo "$ac_file" | sed 's,:.*,,'` ;; *:* ) ac_file_in=`echo "$ac_file" | sed 's,[^:]*:,,'` ac_file=`echo "$ac_file" | sed 's,:.*,,'` ;; * ) ac_file_in=$ac_file.in ;; esac # Compute @srcdir@, @top_srcdir@, and @INSTALL@ for subdirectories. ac_dir=`(dirname "$ac_file") 2>/dev/null || $as_expr X"$ac_file" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$ac_file" : 'X\(//\)[^/]' \| \ X"$ac_file" : 'X\(//\)$' \| \ X"$ac_file" : 'X\(/\)' \| \ . : '\(.\)' 2>/dev/null || echo X"$ac_file" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/; q; } /^X\(\/\/\)[^/].*/{ s//\1/; q; } /^X\(\/\/\)$/{ s//\1/; q; } /^X\(\/\).*/{ s//\1/; q; } s/.*/./; q'` { if $as_mkdir_p; then mkdir -p "$ac_dir" else as_dir="$ac_dir" as_dirs= while test ! -d "$as_dir"; do as_dirs="$as_dir $as_dirs" as_dir=`(dirname "$as_dir") 2>/dev/null || $as_expr X"$as_dir" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$as_dir" : 'X\(//\)[^/]' \| \ X"$as_dir" : 'X\(//\)$' \| \ X"$as_dir" : 'X\(/\)' \| \ . : '\(.\)' 2>/dev/null || echo X"$as_dir" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/; q; } /^X\(\/\/\)[^/].*/{ s//\1/; q; } /^X\(\/\/\)$/{ s//\1/; q; } /^X\(\/\).*/{ s//\1/; q; } s/.*/./; q'` done test ! -n "$as_dirs" || mkdir $as_dirs fi || { { echo "$as_me:$LINENO: error: cannot create directory \"$ac_dir\"" >&5 echo "$as_me: error: cannot create directory \"$ac_dir\"" >&2;} { (exit 1); exit 1; }; }; } ac_builddir=. if test "$ac_dir" != .; then ac_dir_suffix=/`echo "$ac_dir" | sed 's,^\.[\\/],,'` # A "../" for each directory in $ac_dir_suffix. ac_top_builddir=`echo "$ac_dir_suffix" | sed 's,/[^\\/]*,../,g'` else ac_dir_suffix= ac_top_builddir= fi case $srcdir in .) # No --srcdir option. We are building in place. ac_srcdir=. if test -z "$ac_top_builddir"; then ac_top_srcdir=. else ac_top_srcdir=`echo $ac_top_builddir | sed 's,/$,,'` fi ;; [\\/]* | ?:[\\/]* ) # Absolute path. 
ac_srcdir=$srcdir$ac_dir_suffix; ac_top_srcdir=$srcdir ;; *) # Relative path. ac_srcdir=$ac_top_builddir$srcdir$ac_dir_suffix ac_top_srcdir=$ac_top_builddir$srcdir ;; esac # Don't blindly perform a `cd "$ac_dir"/$ac_foo && pwd` since $ac_foo can be # absolute. ac_abs_builddir=`cd "$ac_dir" && cd $ac_builddir && pwd` ac_abs_top_builddir=`cd "$ac_dir" && cd ${ac_top_builddir}. && pwd` ac_abs_srcdir=`cd "$ac_dir" && cd $ac_srcdir && pwd` ac_abs_top_srcdir=`cd "$ac_dir" && cd $ac_top_srcdir && pwd` case $INSTALL in [\\/$]* | ?:[\\/]* ) ac_INSTALL=$INSTALL ;; *) ac_INSTALL=$ac_top_builddir$INSTALL ;; esac if test x"$ac_file" != x-; then { echo "$as_me:$LINENO: creating $ac_file" >&5 echo "$as_me: creating $ac_file" >&6;} rm -f "$ac_file" fi # Let's still pretend it is `configure' which instantiates (i.e., don't # use $as_me), people would be surprised to read: # /* config.h. Generated by config.status. */ if test x"$ac_file" = x-; then configure_input= else configure_input="$ac_file. " fi configure_input=$configure_input"Generated from `echo $ac_file_in | sed 's,.*/,,'` by configure." # First look for the input files in the build tree, otherwise in the # src tree. 
ac_file_inputs=`IFS=: for f in $ac_file_in; do case $f in -) echo $tmp/stdin ;; [\\/$]*) # Absolute (can't be DOS-style, as IFS=:) test -f "$f" || { { echo "$as_me:$LINENO: error: cannot find input file: $f" >&5 echo "$as_me: error: cannot find input file: $f" >&2;} { (exit 1); exit 1; }; } echo $f;; *) # Relative if test -f "$f"; then # Build tree echo $f elif test -f "$srcdir/$f"; then # Source tree echo $srcdir/$f else # /dev/null tree { { echo "$as_me:$LINENO: error: cannot find input file: $f" >&5 echo "$as_me: error: cannot find input file: $f" >&2;} { (exit 1); exit 1; }; } fi;; esac done` || { (exit 1); exit 1; } _ACEOF cat >>$CONFIG_STATUS <<_ACEOF sed "$ac_vpsub $extrasub _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF :t /@[a-zA-Z_][a-zA-Z_0-9]*@/!b s,@configure_input@,$configure_input,;t t s,@srcdir@,$ac_srcdir,;t t s,@abs_srcdir@,$ac_abs_srcdir,;t t s,@top_srcdir@,$ac_top_srcdir,;t t s,@abs_top_srcdir@,$ac_abs_top_srcdir,;t t s,@builddir@,$ac_builddir,;t t s,@abs_builddir@,$ac_abs_builddir,;t t s,@top_builddir@,$ac_top_builddir,;t t s,@abs_top_builddir@,$ac_abs_top_builddir,;t t s,@INSTALL@,$ac_INSTALL,;t t " $ac_file_inputs | (eval "$ac_sed_cmds") >$tmp/out rm -f $tmp/stdin if test x"$ac_file" != x-; then mv $tmp/out $ac_file else cat $tmp/out rm -f $tmp/out fi done _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF # # CONFIG_HEADER section. # # These sed commands are passed to sed as "A NAME B NAME C VALUE D", where # NAME is the cpp macro being defined and VALUE is the value it is being given. # # ac_d sets the value in "#define NAME VALUE" lines. ac_dA='s,^\([ ]*\)#\([ ]*define[ ][ ]*\)' ac_dB='[ ].*$,\1#\2' ac_dC=' ' ac_dD=',;t' # ac_u turns "#undef NAME" without trailing blanks into "#define NAME VALUE". ac_uA='s,^\([ ]*\)#\([ ]*\)undef\([ ][ ]*\)' ac_uB='$,\1#\2define\3' ac_uC=' ' ac_uD=',;t' for ac_file in : $CONFIG_HEADERS; do test "x$ac_file" = x: && continue # Support "outfile[:infile[:infile...]]", defaulting infile="outfile.in". 
case $ac_file in - | *:- | *:-:* ) # input from stdin cat >$tmp/stdin ac_file_in=`echo "$ac_file" | sed 's,[^:]*:,,'` ac_file=`echo "$ac_file" | sed 's,:.*,,'` ;; *:* ) ac_file_in=`echo "$ac_file" | sed 's,[^:]*:,,'` ac_file=`echo "$ac_file" | sed 's,:.*,,'` ;; * ) ac_file_in=$ac_file.in ;; esac test x"$ac_file" != x- && { echo "$as_me:$LINENO: creating $ac_file" >&5 echo "$as_me: creating $ac_file" >&6;} # First look for the input files in the build tree, otherwise in the # src tree. ac_file_inputs=`IFS=: for f in $ac_file_in; do case $f in -) echo $tmp/stdin ;; [\\/$]*) # Absolute (can't be DOS-style, as IFS=:) test -f "$f" || { { echo "$as_me:$LINENO: error: cannot find input file: $f" >&5 echo "$as_me: error: cannot find input file: $f" >&2;} { (exit 1); exit 1; }; } echo $f;; *) # Relative if test -f "$f"; then # Build tree echo $f elif test -f "$srcdir/$f"; then # Source tree echo $srcdir/$f else # /dev/null tree { { echo "$as_me:$LINENO: error: cannot find input file: $f" >&5 echo "$as_me: error: cannot find input file: $f" >&2;} { (exit 1); exit 1; }; } fi;; esac done` || { (exit 1); exit 1; } # Remove the trailing spaces. sed 's/[ ]*$//' $ac_file_inputs >$tmp/in _ACEOF # Transform confdefs.h into two sed scripts, `conftest.defines' and # `conftest.undefs', that substitutes the proper values into # config.h.in to produce config.h. The first handles `#define' # templates, and the second `#undef' templates. # And first: Protect against being on the right side of a sed subst in # config.status. Protect against being in an unquoted here document # in config.status. rm -f conftest.defines conftest.undefs # Using a here document instead of a string reduces the quoting nightmare. # Putting comments in sed scripts is not portable. # # `end' is used to avoid that the second main sed command (meant for # 0-ary CPP macros) applies to n-ary macro definitions. # See the Autoconf documentation for `clear'. 
cat >confdef2sed.sed <<\_ACEOF s/[\\&,]/\\&/g s,[\\$`],\\&,g t clear : clear s,^[ ]*#[ ]*define[ ][ ]*\([^ (][^ (]*\)\(([^)]*)\)[ ]*\(.*\)$,${ac_dA}\1${ac_dB}\1\2${ac_dC}\3${ac_dD},gp t end s,^[ ]*#[ ]*define[ ][ ]*\([^ ][^ ]*\)[ ]*\(.*\)$,${ac_dA}\1${ac_dB}\1${ac_dC}\2${ac_dD},gp : end _ACEOF # If some macros were called several times there might be several times # the same #defines, which is useless. Nevertheless, we may not want to # sort them, since we want the *last* AC-DEFINE to be honored. uniq confdefs.h | sed -n -f confdef2sed.sed >conftest.defines sed 's/ac_d/ac_u/g' conftest.defines >conftest.undefs rm -f confdef2sed.sed # This sed command replaces #undef with comments. This is necessary, for # example, in the case of _POSIX_SOURCE, which is predefined and required # on some systems where configure will not decide to define it. cat >>conftest.undefs <<\_ACEOF s,^[ ]*#[ ]*undef[ ][ ]*[a-zA-Z_][a-zA-Z_0-9]*,/* & */, _ACEOF # Break up conftest.defines because some shells have a limit on the size # of here documents, and old seds have small limits too (100 cmds). echo ' # Handle all the #define templates only if necessary.' >>$CONFIG_STATUS echo ' if grep "^[ ]*#[ ]*define" $tmp/in >/dev/null; then' >>$CONFIG_STATUS echo ' # If there are no defines, we may have an empty if/fi' >>$CONFIG_STATUS echo ' :' >>$CONFIG_STATUS rm -f conftest.tail while grep . conftest.defines >/dev/null do # Write a limited-size here document to $tmp/defines.sed. echo ' cat >$tmp/defines.sed <>$CONFIG_STATUS # Speed up: don't consider the non `#define' lines. echo '/^[ ]*#[ ]*define/!b' >>$CONFIG_STATUS # Work around the forget-to-reset-the-flag bug. 
echo 't clr' >>$CONFIG_STATUS echo ': clr' >>$CONFIG_STATUS sed ${ac_max_here_lines}q conftest.defines >>$CONFIG_STATUS echo 'CEOF sed -f $tmp/defines.sed $tmp/in >$tmp/out rm -f $tmp/in mv $tmp/out $tmp/in ' >>$CONFIG_STATUS sed 1,${ac_max_here_lines}d conftest.defines >conftest.tail rm -f conftest.defines mv conftest.tail conftest.defines done rm -f conftest.defines echo ' fi # grep' >>$CONFIG_STATUS echo >>$CONFIG_STATUS # Break up conftest.undefs because some shells have a limit on the size # of here documents, and old seds have small limits too (100 cmds). echo ' # Handle all the #undef templates' >>$CONFIG_STATUS rm -f conftest.tail while grep . conftest.undefs >/dev/null do # Write a limited-size here document to $tmp/undefs.sed. echo ' cat >$tmp/undefs.sed <>$CONFIG_STATUS # Speed up: don't consider the non `#undef' echo '/^[ ]*#[ ]*undef/!b' >>$CONFIG_STATUS # Work around the forget-to-reset-the-flag bug. echo 't clr' >>$CONFIG_STATUS echo ': clr' >>$CONFIG_STATUS sed ${ac_max_here_lines}q conftest.undefs >>$CONFIG_STATUS echo 'CEOF sed -f $tmp/undefs.sed $tmp/in >$tmp/out rm -f $tmp/in mv $tmp/out $tmp/in ' >>$CONFIG_STATUS sed 1,${ac_max_here_lines}d conftest.undefs >conftest.tail rm -f conftest.undefs mv conftest.tail conftest.undefs done rm -f conftest.undefs cat >>$CONFIG_STATUS <<\_ACEOF # Let's still pretend it is `configure' which instantiates (i.e., don't # use $as_me), people would be surprised to read: # /* config.h. Generated by config.status. */ if test x"$ac_file" = x-; then echo "/* Generated by configure. */" >$tmp/config.h else echo "/* $ac_file. Generated by configure. 
*/" >$tmp/config.h fi cat $tmp/in >>$tmp/config.h rm -f $tmp/in if test x"$ac_file" != x-; then if diff $ac_file $tmp/config.h >/dev/null 2>&1; then { echo "$as_me:$LINENO: $ac_file is unchanged" >&5 echo "$as_me: $ac_file is unchanged" >&6;} else ac_dir=`(dirname "$ac_file") 2>/dev/null || $as_expr X"$ac_file" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$ac_file" : 'X\(//\)[^/]' \| \ X"$ac_file" : 'X\(//\)$' \| \ X"$ac_file" : 'X\(/\)' \| \ . : '\(.\)' 2>/dev/null || echo X"$ac_file" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/; q; } /^X\(\/\/\)[^/].*/{ s//\1/; q; } /^X\(\/\/\)$/{ s//\1/; q; } /^X\(\/\).*/{ s//\1/; q; } s/.*/./; q'` { if $as_mkdir_p; then mkdir -p "$ac_dir" else as_dir="$ac_dir" as_dirs= while test ! -d "$as_dir"; do as_dirs="$as_dir $as_dirs" as_dir=`(dirname "$as_dir") 2>/dev/null || $as_expr X"$as_dir" : 'X\(.*[^/]\)//*[^/][^/]*/*$' \| \ X"$as_dir" : 'X\(//\)[^/]' \| \ X"$as_dir" : 'X\(//\)$' \| \ X"$as_dir" : 'X\(/\)' \| \ . : '\(.\)' 2>/dev/null || echo X"$as_dir" | sed '/^X\(.*[^/]\)\/\/*[^/][^/]*\/*$/{ s//\1/; q; } /^X\(\/\/\)[^/].*/{ s//\1/; q; } /^X\(\/\/\)$/{ s//\1/; q; } /^X\(\/\).*/{ s//\1/; q; } s/.*/./; q'` done test ! -n "$as_dirs" || mkdir $as_dirs fi || { { echo "$as_me:$LINENO: error: cannot create directory \"$ac_dir\"" >&5 echo "$as_me: error: cannot create directory \"$ac_dir\"" >&2;} { (exit 1); exit 1; }; }; } rm -f $ac_file mv $tmp/config.h $ac_file fi else cat $tmp/config.h rm -f $tmp/config.h fi done _ACEOF cat >>$CONFIG_STATUS <<\_ACEOF { (exit 0); exit 0; } _ACEOF chmod +x $CONFIG_STATUS ac_clean_files=$ac_clean_files_save # configure is writing to config.log, and then calls config.status. # config.status does its own redirection, appending to config.log. # Unfortunately, on DOS this fails, as config.log is still kept open # by configure, so config.status won't be able to write to it; its # output is simply discarded. 
So we exec the FD to /dev/null, # effectively closing config.log, so it can be properly (re)opened and # appended to by config.status. When coming back to configure, we # need to make the FD available again. if test "$no_create" != yes; then ac_cs_success=: ac_config_status_args= test "$silent" = yes && ac_config_status_args="$ac_config_status_args --quiet" exec 5>/dev/null $SHELL $CONFIG_STATUS $ac_config_status_args || ac_cs_success=false exec 5>>config.log # Use ||, not &&, to avoid exiting from the if with $? = 1, which # would make configure fail if this is the last instruction. $ac_cs_success || { (exit 1); exit 1; } fi glimpse-4.18.7/glimpse/configure.in000066400000000000000000000102721300371307100172210ustar00rootroot00000000000000dnl Process this file with autoconf to produce a configure script. AC_INIT(get_filename.c) AC_CONFIG_HEADER(libtemplate/include/autoconf.h) AC_PROG_CC dnl Check to see where to find ar. -- mkh AC_PATH_PROG(AR, ar, ar) if test -z "$AR" ; then AC_PATH_PROG(AR, ar, , /usr/ccs/bin/ar) if test -z "$AR" ; then AC_MSG_ERROR([no acceptable ar found in \$PATH:/usr/ccs/bin/ar]) fi fi AC_PROG_RANLIB AC_PROG_LN_S dnl configure for dynfilter AC_PROG_LEX if test "x$LEX" = "xflex" ; then DYNFILTER_TARGET=htuml2txt.so LEXFLAGS="-F -8" else DYNFILTER_TARGET=htuml2txt LEXFLAGS= fi if test "$ac_cv_prog_gcc" = "yes" ; then DYNFILTER_CFLAGS="-O3 -fomit-frame-pointer" else DYNFILTER_CFLAGS="-O" fi if test "`uname`" = "Linux" ; then DYNFILTER=dynfilters else DYNFILTER= fi dnl Check for strip, to support the --enable-strip option. -- mkh AC_PATH_PROG(STRIP, strip, strip) dnl Check for cp, this should not be a problem. -- mkh AC_PATH_PROG(CP, cp, cp) if test -z "$CP" ; then AC_MSG_ERROR([no cp found in \$PATH, something weird is going on.]) fi AC_PROG_INSTALL dnl Checks for header files. 
AC_HEADER_DIRENT #Contribution by VaX#n8 AC_HEADER_STDC AC_CHECK_HEADERS(fcntl.h sys/file.h sys/time.h unistd.h sys/select.h) dnl XXX sysuh AC_CHECK_HEADERS(sys/dir.h sys/ndir.h strerr.h) dnl Checks for typedefs, structures, and compiler characteristics. AC_HEADER_TIME dnl ######### compiler characteristics AC_C_CONST dnl Checks for library functions. AC_TYPE_SIGNAL AC_FUNC_UTIME_NULL #AC_CHECK_FUNCS(getcwd gethostname gettimeofday mkdir rmdir select socket strdup strftime strstr) # Need this for libtemplate AC_CHECK_FUNCS(strdup strerror) dnl Check for libraries # fmod, logb, and frexp are found in -lm on most systems. # On HPUX 9.01, -lm does not contain logb, so check for sqrt. AC_CHECK_LIB(m, sqrt) AC_CHECK_LIB(c, dlopen, , [AC_CHECK_LIB(dl, dlopen, [LIBS="$LIBS -ldl"])]) #AC_CHECK_LIB(resolv, gethostbyname) #AC_CHECK_LIB(nsl, gethostname, [LIBS="$LIBS -lnsl"]) #AC_CHECK_LIB(socket, setsockopt, [LIBS="$LIBS -lsocket"]) #Contribution by: Larry Schwimmer schwim@cyclone.stanford.edu AC_CHECK_FUNC(connect, , ac_check_socket=1) if test "$ac_check_socket" = 1; then AC_CHECK_LIB(socket, main, LIBS="$LIBS -lsocket", ac_check_both=1) fi if test "$ac_check_both" = 1; then ac_old_libs=$LIBS LIBS="$LIBS -lsocket -lnsl" AC_CHECK_FUNC(accept, checknsl=0, LIBS=$ac_old_libs) fi AC_CHECK_FUNC(gethostbyname, , AC_CHECK_LIB(nsl, main, [LIBS="$LIBS -lnsl"])) dnl Optional stuff AC_ARG_WITH(file-end-mark, [ --with-file-end-mark=CHAR use character CHAR as filename delimiter [' '] most often set to '\t' in order to index filenames with spaces; must match Webglimpse setting in lib/wgHeader.pm.], AC_DEFINE_UNQUOTED(FILE_END_MARK,'$withval'), AC_DEFINE(FILE_END_MARK,[' '])) dnl AC_DEFINE(FILE_END_MARK, [$withval]) dnl TARGET=Sall, dnl AC_DEFINE(FILE_END_MARK, [' ']) dnl TARGET=Sall) AC_ARG_ENABLE(structured-queries, [ --enable-structured-queries enable structured queries], AC_DEFINE(STRUCTURED_QUERIES,1) TARGET=Sall, AC_DEFINE(STRUCTURED_QUERIES,0) TARGET=NOTSall) 
AC_ARG_ENABLE(iso-charset, [ --disable-iso-charset disable iso charset (may be slightly faster if you don't care about upper-ascii characters)], [use_iso=$enableval], [use_iso=yes]) AC_ARG_ENABLE(sfs-compat, [ --enable-sfs-compat Support SFS compatibility], AC_DEFINE(SFS_COMPAT,1), AC_DEFINE(SFS_COMPAT,0)) AC_ARG_ENABLE(pointer, [ --enable-pointer blah], , AC_DEFINE(AGREP_POINTER)) AC_ARG_ENABLE(measure-times, [ --enable-measure-times blah], AC_DEFINE(MEASURE_TIMES)) AC_ARG_ENABLE(warnings, [ --enable-warnings Add -Wall to CFLAGS], [CFLAGS="$CFLAGS -Wall"]) AC_ARG_ENABLE(strip, [ --enable-strip Strip binaries], , [STRIP=""]) if test $use_iso = yes; then AC_DEFINE(ISO_CHAR_SET,1) else AC_DEFINE(ISO_CHAR_SET,0) fi dnl local substitute AC_SUBST(TARGET) AC_SUBST(HAVE_STRDUP) AC_SUBST(LEXFLAGS) AC_SUBST(DYNFILTER_TARGET) AC_SUBST(DYNFILTER_CFLAGS) AC_SUBST(DYNFILTER) AC_OUTPUT(Makefile index/Makefile compress/Makefile agrep/Makefile dynfilters/Makefile libtemplate/Makefile libtemplate/util/Makefile libtemplate/template/Makefile libtemplate/lib/Makefile) glimpse-4.18.7/glimpse/defs.h000066400000000000000000000025001300371307100157750ustar00rootroot00000000000000#ifndef _GIMPSE_DEFS_H_ #define _GIMPSE_DEFS_H_ #include /* autoconf defines */ #define MAX_ARGS 80 /* English alphabets + numbers + pattern + progname + arguments + extras */ #define MAXFILEOPT 1024 /* includes length of args too: #args is <= MAX_ARGS */ #define BLOCKSIZE 8192 /* For compression: what is the optimal unit of disk i/o = n * pagesize */ /* * These are some parameters that allow us to switch between offset computation * and just index computation when the index is built at a byte-level: since * offset computation is a waste if we can't narrow down search enough (since * we must look all over and the lists become too long => bottleneck). This may * not be needed if we used trees to store intervals --- we'll do it later :-).
*/ #define MAX_DISPARITY 100 /* if least frequent word occurs in < 1/100 times most frequent word, resort to agrep: don't intersect lists (byte-level) */ #define MIN_OCCURRENCES 20 /* Min no. of occurrences before we check for highly frequent words using MAX_UNION */ #define MAX_UNION 500 /* Don't even perform the Union of offsets if least < 1/500 times most freq word (we are on track of stop list kinda words) */ #define MAX_ABSOLUTE MAX_SORTLINE_LEN /* Don't even perform the Union of offsets if a word occurs more than 16K times (independent of #of files) */ #endif glimpse-4.18.7/glimpse/genpatch000066400000000000000000000007461300371307100164310ustar00rootroot00000000000000#!/bin/sh # $Id: genpatch,v 1.1 1999/11/03 20:36:24 golda Exp $ PATH=/bin:/usr/bin:/usr/local/bin ; export PATH ROOT=${1-.} RLOGFLAGS="-L -R" tmpfile="/tmp/findco$$" for rcsdir in `find ${ROOT} -name RCS -type d -print` ; do rlog ${RLOGFLAGS} ${rcsdir}/* > ${tmpfile} if [ -s "${tmpfile}" ] ; then echo "# Files in ${rcsdir}:" for f in `cat ${tmpfile}` ; do f2=`echo $f | sed -e 's@RCS/@@' -e 's@,v$@@'` rcsdiff -c ${f2} done fi done rm -f ${tmpfile} glimpse-4.18.7/glimpse/gentar000077500000000000000000000014751300371307100161230ustar00rootroot00000000000000#!/bin/sh # $Id: gentar,v 1.1 1998/04/27 16:11:23 pab Exp $ # # Build a tar file image of this directory, checking out files from RCS. # What version to build? RELVER=${1-DEV} srcdir="./glimpse-${RELVER}-src" # Safety check---don't overwrite existing directory if [ -d "${srcdir}" ] ; then echo "$0: Please remove existing source archive ${srcdir}" exit 1 fi # Get the hierarchy first dirs=`find . -type d` # Now create the duplication area mkdir ${srcdir} cdir=`pwd` # Duplicate the directory hierarchy; if the directory has an RCS area, # check out its files, then remove the RCS link.
for d in ${dirs} ; do mkdir -p ${srcdir}/${d} if [ -e ${d}/RCS ] ; then (cd ${srcdir}/${d} ; ln -s ${cdir}/${d}/RCS ; co -f RCS/* ; rm RCS) fi done # Put all that into a tar file tar cf glimpse-${RELVER}-src.tar ${srcdir} glimpse-4.18.7/glimpse/get_filename.c000066400000000000000000000645401300371307100175020ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ #include #include #include #include "glimpse.h" #include #define CHAR unsigned char /* ---------------------------------------------------------------------- get_filenames() input: an index table, (an index vector, i-th entry is ON if i-th partition is to be searched.), the partition table in src_index_set[] and the list of all files in "NAME_LIST". output: the list of filenames to be searched. ------------------------------------------------------------------------- */ #if BG_DEBUG extern FILE *debug; #endif /*BG_DEBUG*/ extern int p_table[MAX_PARTITION]; extern CHAR **GTextfiles; extern CHAR **GTextfilenames; extern int *GFileIndex; extern int GNumfiles; extern CHAR GProgname[]; extern CHAR FileNamePat[]; extern int MATCHFILE; extern int agrep_outpointer; extern int mask_int[32]; extern int OneFilePerBlock; extern char INDEX_DIR[MAX_LINE_LEN]; extern unsigned int *multi_dest_index_set[MAXNUM_PAT]; extern int file_num; /* in index/io.c */ int bigbuffer_size; int first_line_len = 0; char *bigbuffer = NULL; /* constant buffer to read all filenames in NAME_LIST */ char *outputbuffer = NULL; /* keeps changing: used for -F search via memagrep */ int outputbuffer_len = 0; extern int REAL_PARTITION, REAL_INDEX_BUF, MAX_ALL_INDEX, FILEMASK_SIZE; read_filenames() { struct stat st; unsigned char buffer[MAX_NAME_SIZE]; char *currptr; int i; /* one time processing: assumes during one run of glimpse, the index remains constant! 
*/ if (bigbuffer == NULL) { FILE *fp = fopen(NAME_LIST, "r"); if (fp == NULL) { fprintf(stderr, "Can't open for reading: %s/%s\n", INDEX_DIR, NAME_LIST); exit(2); } if (-1 == stat(NAME_LIST, &st)) { fclose(fp); fprintf(stderr, "Can't stat: %s/%s\n", INDEX_DIR, NAME_LIST); exit(2); } fgets(buffer, MAX_NAME_SIZE, fp); first_line_len = strlen(buffer); bigbuffer_size = st.st_size - first_line_len; sscanf(buffer, "%d", &file_num); if ((file_num < 0) || (file_num > MaxNum24bPartition)) { fclose(fp); fprintf(stderr, "Error in reading: %s/%s\n", INDEX_DIR, NAME_LIST); exit(2); } if (file_num == 0) { fclose(fp); fprintf(stderr, "Warning: No files were indexed! Exiting...\n"); exit(2); } initialize_data_structures(file_num); for (i=0; i DEF_MAX_INDEX_PERCENT/2) && (num_blocks > MaxNum8bPartition)) return slow_mask_filenames(index_vect, infile); for (i=0; i= maxcount) { /* first time (not compressing into smaller code since I want it to be similar to slow_mask... below) */ ret = (num_blocks - 1)*sizeof(int); num_read += ret; maxcount = num_blocks; for (i=0; i<=ret /* to process last one also */; i+=sizeof(int), count++) { readoffset = temp_bigbuffer_offset[1 + i/sizeof(int)]; /* printf("readoffset=%d\n", readoffset); */ if ((offset >= prevreadoffset) && (offset < readoffset)) { /* printf("count=%d\n", count); */ if (OneFilePerBlock) multi_dest_index_set[0][block2index(temp_bigbuffer_index[count])] |= mask_int[temp_bigbuffer_index[count] % 32]; else { for (; l= p_table[l]) && (temp_bigbuffer_index[count] < p_table[l+1])) { multi_dest_index_set[0][l] = 1; break; /* out of for */ } } /* can't come here without break: if it does (serious!) 
will break out w/o setting anything */ } prevreadoffset = readoffset; i += sizeof(int); count ++; found = 1; break; /* out of for */ } prevreadoffset = readoffset; } } else { for (; i<=ret /* to process last one also */; i+=sizeof(int), count++) { readoffset = temp_bigbuffer_offset[1 + i/sizeof(int)]; /* printf("readoffset=%d\n", readoffset); */ if ((offset >= prevreadoffset) && (offset < readoffset)) { /* printf("count=%d\n", count); */ if (OneFilePerBlock) multi_dest_index_set[0][block2index(temp_bigbuffer_index[count])] |= mask_int[temp_bigbuffer_index[count] % 32]; else { for (; l= p_table[l]) && (temp_bigbuffer_index[count] < p_table[l+1])) { multi_dest_index_set[0][l] = 1; break; /* out of for */ } } /* can't come here without break: if it does (serious!) will break out without setting anything */ } prevreadoffset = readoffset; i += sizeof(int); count ++; found = 1; break; /* out of for */ } prevreadoffset = readoffset; } } } } /* Now AND the incoming mask with the one constructed above */ if (OneFilePerBlock) { for (i=0; i= maxcount) { if (num_read= prevreadoffset) && (offset < readoffset)) { /* printf("count=%d\n", count); */ if (OneFilePerBlock) multi_dest_index_set[0][block2index(count)] |= mask_int[count % 32]; else { for (; l= p_table[l]) && (count < p_table[l+1])) { multi_dest_index_set[0][l] = 1; break; /* out of for */ } } /* can't come here without break: if it does (serious!) will break out w/o setting anything */ } prevreadoffset = readoffset; i += sizeof(int); count ++; found = 1; break; /* out of for */ } prevreadoffset = readoffset; } } else if ((offset >= prevreadoffset) && (offset < name_list_size)) { /* printf("count=%d\n", count); */ if (OneFilePerBlock) multi_dest_index_set[0][block2index(count)] |= mask_int[count % 32]; else { for (; l= p_table[l]) && (count < p_table[l+1])) { multi_dest_index_set[0][l] = 1; break; /* out of for */ } } /* can't come here without break: if it does (serious!) 
will break out without setting anything */ } count ++; found = 1; } else goto endofinput; /* since this offset >= name_list_size and there's no more input after that */ } else { for (; i= prevreadoffset) && (offset < readoffset)) { /* printf("count=%d\n", count); */ if (OneFilePerBlock) multi_dest_index_set[0][block2index(count)] |= mask_int[count % 32]; else { for (; l= p_table[l]) && (count < p_table[l+1])) { multi_dest_index_set[0][l] = 1; break; /* out of for */ } } /* can't come here without break: if it does (serious!) will break out without setting anything */ } prevreadoffset = readoffset; i += sizeof(int); count ++; found = 1; break; /* out of for */ } prevreadoffset = readoffset; } } } } endofinput: /* Now AND the incoming mask with the one constructed above */ if (OneFilePerBlock) { for (i=0; i 0) ? round(file_num, 8*sizeof(int)) : MAX_PARTITION); i++) if(index_vect[i]) fprintf(debug, "i=%d,%x\n", i, index_vect[i]); #endif /*BG_DEBUG*/ GNumfiles = 0; filesseen = 0; endptr = beginptr = bigbuffer + MAX_PAT; if(MATCHFILE == OFF) { /* just copy the filenames */ if (OneFilePerBlock) { for (i=0; i= file_num) goto end_files; end_of_loop1: beginptr = endptr = endptr + 1; /* skip over '\n' */ filesseen ++; } } } } /* one file per block */ else { /* Just the outer for-loop and initial begin/end values are different: rest is same */ for (i=0; i 0) { start = p_table[i]; end = p_table[i+1]; if (start >= end) continue; #if BG_DEBUG fprintf(debug, "start=%d, end=%d\n", start, end); #endif /*BG_DEBUG*/ /* * skip over so many filenames and get the filenames to copy. * NOTE: successive "start"s ALWAYS increase. 
*/ while(filesseen < start) { while(*beginptr != '\n') beginptr ++; beginptr ++; /* skip over '\n' */ filesseen ++; } endptr = beginptr; while (filesseen < end) { while(*endptr != '\n') endptr ++; if (endptr == beginptr + 1) goto end_of_loop2; /* null name of non-existent file */ *endptr = '\0'; /* return with all the names you COULD get */ if ((GTextfiles[GNumfiles] = (CHAR *)strdup(beginptr)) == NULL) { *endptr = '\n'; fprintf(stderr, "Out of memory at: %s:%d\n", __FILE__, __LINE__); return; } GFileIndex[GNumfiles] = filesseen; *endptr = '\n'; if (++GNumfiles >= file_num) goto end_files; end_of_loop2: beginptr = endptr = endptr + 1; /* skip over '\n' */ filesseen ++; } } } } } else { /* search and copy matched filenames */ extern int REGEX, FASTREGEX, D, WORDBOUND; /* agrep global which tells us whether the pattern is a regular expression or not, and if there are errors w/ -w */ int myREGEX, myFASTREGEX, myD, myWORDBOUND; errno = 0; if ((dummylen = memagrep_init(argc, argv, MAX_PAT, dummypat)) <= 0) goto end_files; memcpy(tempbuf, bigbuffer, bigbuffer_size >= MAX_PAT ? MAX_PAT*3 : MAX_PAT*2 + bigbuffer_size); ret = memagrep_search(dummylen, dummypat, dummylen*2, beginptr, outputbuffer_len, outputbuffer); memcpy(bigbuffer, tempbuf, bigbuffer_size >= MAX_PAT ? 
MAX_PAT*3 : MAX_PAT*2 + bigbuffer_size); myREGEX = REGEX; myFASTREGEX = FASTREGEX; myD = D; myWORDBOUND = WORDBOUND; if (OneFilePerBlock) { for (i=0; i 0) { #if BG_DEBUG { char c = outputbuffer[agrep_outpointer + 1]; outputbuffer[agrep_outpointer + 1] = '\0'; fprintf(debug, "OUTPUTBUFFER=%s\n", outputbuffer); outputbuffer[agrep_outpointer + 1] = c; } #endif /*BG_DEBUG*/ k = prevk = 0; #if EACHOPTION #else while (outputbuffer[k] == '\n') { k ++; prevk ++; } #endif while(k+1= file_num) goto end_files; k = prevk = k+1; } } } else { index_vect[i] &= ~mask_int[j]; /* remove it from the list: used if ByteLevelIndex */ } end_of_loop3: beginptr = endptr = endptr + 1; } } } /* one file per block */ else { /* Just the outer for-loop and initial begin/end values are different: rest is same */ for (i=0; i 0) { start = p_table[i]; end = p_table[i+1]; if (start >= end) continue; #if BG_DEBUG fprintf(debug, "start=%d, end=%d\n", start, end); #endif /*BG_DEBUG*/ /* * skip over so many filenames and get the region to search = * beginptr to endptr: NOTE: successive "start"s ALWAYS increase. 
*/ while(filesseen < start) { while(*beginptr != '\n') beginptr ++; beginptr ++; /* skip over '\n' */ filesseen ++; } beginptr --; /* I need '\n' for memory search */ endptr = beginptr+1; while (filesseen < end) { while(*endptr != '\n') endptr ++; endptr ++; /* skip over '\n' */ filesseen ++; } endptr --; /* I need '\n' for memory search */ if (endptr == beginptr + 1) goto end_of_loop4; /* null name of non-existent file */ #if BG_DEBUG *endptr = '\0'; fprintf(debug, "From %d searching:\n%s\n", filesseen, beginptr+1); *endptr = '\n'; #endif /*BG_DEBUG*/ /* if file in the partition matches then copy it */ #if EACHOPTION if (myREGEX || myFASTREGEX || (myD && myWORDBOUND)) ret = memagrep_search(dummylen, dummypat, endptr-beginptr + 1, beginptr, outputbuffer_len, outputbuffer); else ret = memagrep_search(dummylen, dummypat, endptr-beginptr/* + 1*/, beginptr+1, outputbuffer_len, outputbuffer); #else /* beginptr points to '\n', entptr+1 points to '\n' */ ret = memagrep_search(dummylen, dummypat, endptr-beginptr+1, beginptr, outputbuffer_len, outputbuffer); #endif if (ret > 0) { k = prevk = 0; #if EACHOPTION #else while (outputbuffer[k] == '\n') { k ++; prevk ++; } #endif while(k+1= file_num) goto end_files; k = prevk = k+1; } } } else { index_vect[i] = 0; /* mask it off */ } end_of_loop4: beginptr = endptr = endptr + 1; } } } } end_files: #if BG_DEBUG fprintf(debug, "The following %d filenames are ON\n", GNumfiles); for (i=0; i #if BG_DEBUG extern FILE *debug; #endif /*BG_DEBUG*/ extern char INDEX_DIR[MAX_LINE_LEN]; extern int Only_first; extern int PRINTAPPXFILEMATCH; extern int OneFilePerBlock; extern int StructuredIndex; extern int WHOLEFILESCOPE; extern unsigned int *dest_index_set; extern unsigned char *dest_index_buf; extern int mask_int[32]; extern int errno; extern int ByteLevelIndex; extern int RecordLevelIndex; extern int rdelim_len; extern char rdelim[MAX_LINE_LEN]; extern char old_rdelim[MAX_LINE_LEN]; extern int NOBYTELEVEL; extern int OPTIMIZEBYTELEVEL; 
extern int RegionLimit; extern int PRINTINDEXLINE; extern struct offsets **src_offset_table; extern unsigned int *multi_dest_index_set[MAXNUM_PAT]; extern struct offsets **multi_dest_offset_table[MAXNUM_PAT]; extern char *index_argv[MAX_ARGS]; extern int index_argc; extern CHAR GProgname[MAXNAME]; extern FILE *indexfp, *minifp; extern int REAL_PARTITION, REAL_INDEX_BUF, MAX_ALL_INDEX, FILEMASK_SIZE; extern int p_table[MAX_PARTITION]; extern int GNumpartitions; extern int INVERSE; /* agrep's global: need here to implement ~ in index-search */ extern int last_Y_filenumber; #define USEFREQUENCIES 0 /* set to one if we want to stop collecting offsets sometimes since words "look" like they are in the stop list... */ free_list(p1) struct offsets **p1; { struct offsets *tp1; while (*p1 != NULL) { tp1 = *p1; *p1 = (*p1)->next; my_free(tp1, sizeof(struct offsets)); } } /* Unions offset lists list2 with list1 sorted in increasing order (deletes elements from list2) => changes both list1 and list2: f += #elems added */ sorted_union(list1, list2, f, pf, cf) struct offsets **list1, **list2; int *f, pf, cf; { register struct offsets **p1 = list1, *p2; register int count = *f; /* don't update *f if setting NOBYTELEVEL */ if (!RecordLevelIndex && NOBYTELEVEL) { /* cannot come here! 
*/ free_list(list1); free_list(list2); return; } #if USEFREQUENCIES if (!RecordLevelIndex && ( ((pf > MIN_OCCURRENCES) && (count > MAX_UNION * pf)) || (count > MAX_ABSOLUTE) || ((count > MIN_OCCURRENCES) && (pf > MAX_UNION * count)) || (pf > MAX_ABSOLUTE) )) { /* enough if we check the second condition at the beginning since it won't surely be satisfied after this when count ++ */ NOBYTELEVEL = 1; return; } #endif while (*list2 != NULL) { /* extract 1st element, update list2 */ p2 = *list2; *list2 = (*list2)->next; p2->next = NULL; /* find position to insert p2, and do so */ p1 = list1; while (((*p1) != NULL) && ((*p1)->offset < p2->offset)) p1 = &(*p1)->next; if (*p1 == NULL) { /* end of list1: append list2 to it and return */ *p1 = p2; p2->next = *list2; *list2 = NULL; if (cf > 0) count = *f + cf; #if USEFREQUENCIES if (!RecordLevelIndex && ( ((pf > MIN_OCCURRENCES) && (count > MAX_UNION * pf)) || (count > MAX_ABSOLUTE))) { NOBYTELEVEL = 1; return; } #endif *f = count; return; } else if (p2->offset == (*p1)->offset) my_free(p2, sizeof(struct offsets)); else { p2->next = *p1; *p1 = p2; count ++; #if USEFREQUENCIES if (!RecordLevelIndex && ( ((pf > MIN_OCCURRENCES) && (count > MAX_UNION * pf)) || (count > MAX_ABSOLUTE) )) { NOBYTELEVEL = 1; return; } #endif /* update list1 */ list1 = &(*p1)->next; } } *f = count; } /* Intersects offset lists list2 with list1 sorted in increasing order (deletes elements from list2) => changes both list1 and list2 */ sorted_intersection(filenum, list1, list2, f) struct offsets **list1, **list2; int *f; { register struct offsets **p1 = list1, *p2, *tp1; register int diff; struct offsets *tp; if (!RecordLevelIndex && NOBYTELEVEL) { /* cannot come here! 
*/ free_list(list1); free_list(list2); return; } /* NOT NECESSARY SINCE done INITIALIZED TO 0 ON CREATION AND MADE 0 BELOW tp = *list1; while (tp != NULL) { tp->done = 0; tp = tp->next; } */ #if 0 printf("sorted_intersection BEGIN: list1=\n\t"); tp = *list1; while (tp != NULL) { printf("%d ", tp->offset); tp = tp->next; } printf("\n"); printf("list2=\n\t"); tp = *list2; while (tp != NULL) { printf("%d ", tp->offset); tp = tp->next; } printf("\n"); #endif /* find position to intersect list2, and do so: REMEMBER: list1 is in increasing order, and so is list2 !!! */ p1 = list1; while ( ((*p1) != NULL) && (*list2 != NULL) ) { diff = (*list2)->offset - (*p1)->offset; if ( (diff == 0) || (!RecordLevelIndex && (diff >= -RegionLimit) && (diff <= RegionLimit)) ) { (*p1)->done = 1; /* p1 is in */ p1 = &(*p1)->next; /* Can't increment p2 here since it might keep others after p1 also in */ } else { if (diff < 0) { p2 = *list2; *list2 = (*list2)->next; my_free(p2, sizeof(struct offsets)); /* p1 can intersect with list2's next */ } else { if((*p1)->done && 0) p1 = &(*p1)->next; /* imposs */ /* THIS CHECK ALWAYS YIELDS 0 FROM 25/08/1996: bgopal@cs.arizona.edu */ else { tp1 = *p1; *p1 = (*p1)->next; my_free(tp1, sizeof(struct offsets)); (*f) --; } /* list2 can intersect with p1's next */ } } } while (*list2 != NULL) { p2 = *list2; *list2 = (*list2)->next; my_free(p2, sizeof(struct offsets)); } p1 = list1; while (*p1 != NULL) { if ((*p1)->done == 0) { tp1 = *p1; *p1 = (*p1)->next; my_free(tp1, sizeof(struct offsets)); (*f) --; } else { (*p1)->done = 0; /* for the next round!
*/ p1 = &(*p1)->next; } } #if 0 printf("sorted_intersection END: list1=\n\t"); tp = *list1; while (tp != NULL) { printf("%d ", tp->offset); tp = tp->next; } printf("\n"); printf("list2=\n\t"); tp = *list2; while (tp != NULL) { printf("%d ", tp->offset); tp = tp->next; } printf("\n"); #endif } purge_offsets(p1) struct offsets **p1; { struct offsets *tp1; while (*p1 != NULL) { if ((*p1)->sign == 0) { tp1 = *p1; (*p1) = (*p1)->next; my_free(tp1, sizeof(struct offsets)); } else p1 = &(*p1)->next; } } /* Returns 1 if it is a Universal set, 0 otherwise. Constraint: WORD_END_MARK/ALL_INDEX_MARK must occur at or after buffer[0] */ get_set(buffer, set, offset_table, patlen, pattern, patattr, outfile, partfp, frequency, prevfreq) unsigned char *buffer; unsigned int *set; struct offsets **offset_table; int patlen; char *pattern; int patattr; FILE *outfile; FILE *partfp; int *frequency, prevfreq; { int bdx2, j; int ret; int x=0, y=0, diff, even_words=1, prevy; int indexattr = 0; struct offsets *o, *tailo, *heado; int delim = encode8b(0); int curfreq = 0; unsigned char c; /* buffer[0] is '\n', search must start from buffer[1] */ bdx2 = 1; if (OneFilePerBlock) while((bdx2= REAL_INDEX_BUF+1) return 0; if (StructuredIndex) { if (StructuredIndex < MaxNum8bPartition - 1) { indexattr = decode8b(buffer[bdx2+1]); } else { indexattr = decode16b((buffer[bdx2+1] << 8) | (buffer[bdx2 + 2])); } /* printf("i=%d p=%d\n", indexattr, patattr); */ if ((patattr > 0) && (indexattr != patattr)) { #if BG_DEBUG fprintf(debug, "indexattr=%d DOES NOT MATCH patattr=%d\n", indexattr, patattr); #endif /*BG_DEBUG*/ return 0; } } if (PRINTINDEXLINE) { c = buffer[bdx2]; buffer[bdx2] = '\0'; printf("%s %d", &buffer[1], indexattr); buffer[bdx2] = c; if (c == ALL_INDEX_MARK) printf(" ! 
"); else printf(" : "); } if (OneFilePerBlock && (buffer[bdx2] == ALL_INDEX_MARK)) { /* A intersection Univ-set = A: so src_index_set won't change; A union Univ-set = Univ-set: so src_index_set = all 1s */ #if BG_DEBUG buffer[bdx2] = '\0'; fprintf(debug, "All indices search for %s\n", buffer + 1); buffer[bdx2] = ALL_INDEX_MARK; #endif /*BG_DEBUG*/ set[REAL_PARTITION - 1] = 1; for(bdx2=0; bdx2= OneFilePerBlock) break; set[bdx2] |= mask_int[j]; } set[REAL_PARTITION - 1] = 1; if (ByteLevelIndex && !RecordLevelIndex) NOBYTELEVEL = 1; /* With RecordLevelIndex, I want NOBYTELEVEL to be unused (i.e., !NOBYTELEVEL is always true) */ return 1; } else if (!OneFilePerBlock) { /* check only if index+partitions are NOT split */ #if BG_DEBUG buffer[bdx2] = '\0'; fprintf(debug, "memagrep-line: %s\t\tpattern: %s\n", buffer, pattern); #endif /*BG_DEBUG*/ /* ignore if pattern with all its options matches block number sequence: bg+udi: Feb/16/93 */ buffer[bdx2] = '\n'; /* memagrep needs buffer to end with '\n' */ if ((ret = memagrep_search(patlen, pattern, bdx2+1, buffer, 0, outfile)) <= 0) return 0; else buffer[bdx2] = WORD_END_MARK; } if ((StructuredIndex > 0) && (StructuredIndex < MaxNum8bPartition - 1)) bdx2 ++; else if (StructuredIndex > 0) bdx2 += 2; bdx2++; /* bdx2 now points to the first byte of the offset */ even_words = 1; /* Code identical to that in merge_in() in glimpseindex */ if (OneFilePerBlock) { get_block_numbers(&buffer[bdx2], &buffer[bdx2], partfp); while((bdx2 0) && (x >= last_Y_filenumber)) continue; set[block2index(x)] |= block2mask(x); if (PRINTINDEXLINE) { printf("%d [", x); } prevy = 0; if (ByteLevelIndex) { heado = tailo = NULL; curfreq = 0; while ((bdx2MIN_OCCURRENCES)&&(curfreq+*frequency > MAX_UNION*prevfreq)) || (curfreq+*frequency > MAX_ABSOLUTE)) #else 1 #endif ) ) { /* These o's will be in sorted order. Just collect all of them and merge with &offset_table[x]. 
*/ o = (struct offsets *)my_malloc(sizeof(struct offsets)); o->offset = y; o->next = NULL; o->sign = o->done = 0; if (heado == NULL) { heado = o; tailo = o; } else { tailo->next = o; tailo = o; } } else if (!RecordLevelIndex) { if (heado != NULL) free_list(&heado); /* printf("1 "); */ NOBYTELEVEL = 1; /* can't return since have to or the bitmasks */ } if ((bdx2 0) && (p_table[buffer[bdx2]] >= last_Y_filenumber)) { bdx2 ++; continue; } if (PRINTINDEXLINE) { for (j=p_table[buffer[bdx2]]; j 0) && (j >= last_Y_filenumber)) break; else printf("%d [] ", j); } set[buffer[bdx2]] = 1; bdx2++; } } if (PRINTINDEXLINE) { printf("\n"); } return 0; } /* * This is a very simple function: it gets the list of matched lines from the index, * and sets the block numbers corr. to files that need to be searched in "index_tab". * It also sets the file-offsets that have to be searched in "offset_tab" (byte-level). */ get_index(infile, index_tab, offset_tab, pattern, patlen, patattr, index_argv, index_argc, outfile, partfp, parse, first_time) char *infile; unsigned int *index_tab; struct offsets **offset_tab; char *pattern; int patlen; int patattr; char *index_argv[]; int index_argc; FILE *outfile; FILE *partfp; int parse; int first_time; { int i=0, j, iii; FILE *f_in; struct offsets **offsetptr = multi_dest_offset_table[0]; /* cannot be NULL if ByteLevelIndex: main.c takes care of that */ int ret=0; if (OneFilePerBlock && (parse & OR_EXP) && (index_tab[REAL_PARTITION - 1] == 1)) return 0; if (((infile == NULL) || !strcmp(infile, "")) /* || (index_tab == NULL) || (offset_tab == NULL) || (pattern == NULL)*/) return -1; if((f_in = fopen(infile, "r")) == NULL) { fprintf(stderr, "%s: can't open for reading: %s/%s\n", GProgname, INDEX_DIR, infile); return -1; } if (OneFilePerBlock) for(i=0; i= OneFilePerBlock) break; if (dest_index_set[i] & mask_int[j]) dest_index_set[i] &= ~mask_int[j]; else dest_index_set[i] |= mask_int[j]; } } else { for(i=0; i=GNumpartitions-1) break; /* STUPID: get_table 
returns 1 + part_num, where part_num was no. of partitions glimpseindex found */ if ((i == 0) || (i == '\n')) continue; if (dest_index_set[i]) dest_index_set[i] = 0; else dest_index_set[i] = 1; } } } /* Take intersection if parse=ANDPAT or 0 (one terminal pattern), union if OR_EXP; Take care of universal sets in index_tab[REAL_PARTITION - 1] */ if (OneFilePerBlock) { if (parse & OR_EXP) { if (ret) { ret_is_1: index_tab[REAL_PARTITION - 1] = 1; for(i=0; i= OneFilePerBlock) break; index_tab[i] |= mask_int[j]; } if (ByteLevelIndex && !RecordLevelIndex && !NOBYTELEVEL && !(Only_first && !PRINTAPPXFILEMATCH)) for (i=0; i= OneFilePerBlock) break; index_tab[i] |= mask_int[j]; } } first_time = 0; if (ByteLevelIndex && !RecordLevelIndex && !NOBYTELEVEL && !(Only_first && !PRINTAPPXFILEMATCH)) for (i=0; i 0) ? round(OneFilePerBlock, 8*sizeof(int)) : MAX_PARTITION); i++) { if(index_tab[i]) fprintf(debug, "%d,%x\n", i, index_tab[i]); } #endif /*BG_DEBUG*/ fclose(f_in); return 0; } /* * Same as above, but uses mgrep to search the index for many patterns at one go, * and interprets the output obtained from the -M and -P options (set in main.c). 
*/ mgrep_get_index(infile, index_tab, offset_tab, pat_list, pat_lens, pat_attr, mgrep_pat_index, num_mgrep_pat, patbufpos, index_argv, index_argc, outfile, partfp, parse, first_time) char *infile; unsigned int *index_tab; struct offsets **offset_tab; char *pat_list[]; int pat_lens[]; int pat_attr[]; int mgrep_pat_index[]; int num_mgrep_pat; int patbufpos; char *index_argv[]; int index_argc; FILE *outfile; FILE *partfp; int parse; int first_time; { int i=0, j, temp, iii, jjj; FILE *f_in; int ret; int x=0, y=0, even_words=1; int patnum; unsigned int *setptr; struct offsets **offsetptr; CHAR dummypat[MAX_PAT]; int dummylen=0; char allindexmark[MAXNUM_PAT]; int k; int sorted[MAXNUM_PAT], min, max; if (OneFilePerBlock && (parse & OR_EXP) && (index_tab[REAL_PARTITION - 1] == 1)) return 0; /* Do the mgrep() */ if ((f_in = fopen(infile, "w")) == NULL) { fprintf(stderr, "%s: run out of file descriptors!\n", GProgname); return -1; } errno = 0; if ((ret = fileagrep(index_argc, index_argv, 0, f_in)) < 0) { fprintf(stderr, "%s: error in searching index\n", HARVEST_PREFIX); fclose(f_in); return -1; } fflush(f_in); fclose(f_in); f_in = NULL; index_argv[patbufpos] = NULL; /* For index-search with memgrep and get-filenames */ dummypat[0] = '\0'; if ((dummylen = memagrep_init(index_argc, index_argv, MAX_PAT, dummypat)) <= 0) { fclose(f_in); return -1; } /* Interpret the result */ if((f_in = fopen(infile, "r")) == NULL) { fprintf(stderr, "%s: can't open for reading: %s/%s\n", GProgname, INDEX_DIR, infile); return -1; } if (OneFilePerBlock) { for (patnum=0; patnum num_mgrep_pat)) continue; /* error! 
*/ setptr = multi_dest_index_set[patnum - 1]; offsetptr = multi_dest_offset_table[patnum - 1]; for(k=0; dest_index_buf[k] != ' '; k++); dest_index_buf[k] = '\n'; if (!allindexmark[patnum - 1]) allindexmark[patnum - 1] = (char)get_set(&dest_index_buf[k], setptr, offsetptr, pat_lens[mgrep_pat_index[patnum-1]], pat_list[mgrep_pat_index[patnum-1]], pat_attr[mgrep_pat_index[patnum-1]], outfile, partfp, &setptr[REAL_PARTITION - 2], min); /* To test the maximum disparity to stop unions within above */ if (!allindexmark[patnum-1]) min = setptr[REAL_PARTITION - 2]; for (patnum=0; patnum multi_dest_index_set[max][REAL_PARTITION - 2]) max = patnum; } /* Sort them according to the lengths of the lists in increasing order: min first */ for (patnum=0; patnum MAX_DISPARITY * multi_dest_index_set[sorted[0]][REAL_PARTITION - 2])) { NOBYTELEVEL = 1; /* printf("4 "); */ for (iii=0; iii= OneFilePerBlock) break; index_tab[i] |= mask_int[j]; } if (ByteLevelIndex && !RecordLevelIndex && !NOBYTELEVEL && !(Only_first && !PRINTAPPXFILEMATCH)) /* collect as many offsets as possible with RecordLevelIndex: free offset_tables at the end of process_query() */ for (i=0; i= OneFilePerBlock) break; index_tab[i] |= mask_int[j]; } } first_time = 0; if (ByteLevelIndex && !RecordLevelIndex && !NOBYTELEVEL && !(Only_first && !PRINTAPPXFILEMATCH)) /* collect as many offsets as possible with RecordLevelIndex: free offset_tables at the end of process_query() */ for (i=0; i 0) ? 
round(OneFilePerBlock, 8*sizeof(int)) : MAX_PARTITION); i++) { if(index_tab[i]) fprintf(debug, "%d,%x\n", i, index_tab[i]); } #endif /*BG_DEBUG*/ fclose(f_in); return 0; } /* All borrowed from main.c and are needed for searching the index */ extern CHAR *pat_list[MAXNUM_PAT]; /* complete words within global pattern */ extern int pat_lens[MAXNUM_PAT]; /* their lengths */ extern int pat_attr[MAXNUM_PAT]; /* set of attributes */ extern int num_pat; extern CHAR pat_buf[(MAXNUM_PAT + 2)*MAXPAT]; extern int pat_ptr; extern int is_mgrep_pat[MAXNUM_PAT]; extern int mgrep_pat_index[MAXNUM_PAT]; extern int num_mgrep_pat; extern unsigned int *src_index_set; extern struct offsets **src_offset_table; extern char tempfile[]; extern int patindex; extern int patbufpos; extern ParseTree terminals[MAXNUM_PAT]; extern int GBESTMATCH; /* Should I change -B to -# where # = no. of errors? */ extern int bestmatcherrors; /* set during index search, used later on */ extern FILE *partfp; /* glimpse partitions */ extern FILE *nullfp; /* to discard output: agrep -s doesn't work properly */ extern int ComplexBoolean; extern int num_terminals; #if 0 extern struct token *hash_table[MAX_64K_HASH]; #else /*0*/ extern int mini_array_len; #endif /*0*/ extern int WORDBOUND, NOUPPER, D, LINENUM; int veryfastsearch(argc, argv, num_pat, pat_list, pat_lens, minifp) int argc; char *argv[]; int num_pat; CHAR *pat_list[MAXNUM_PAT]; int pat_lens[MAXNUM_PAT]; FILE *minifp; { /* * Figure out from options if very fast search is possible. */ if (minifp == NULL) return 0; if (!OneFilePerBlock) return 0; /* you did not build index for speed anyway */ if (!(WORDBOUND && NOUPPER && (D<=0))) return 0; if (LINENUM) return 0; return 1; /* if ((num_mgrep_pat == num_pat) || ((1 == num_pat) && (1 == checksg(pat_list[0], D, 0)))) return 1; */ /* either all >= 2 patterns are mgrep-able (simple) or there is just one simple pattern: i.e., "cast" can be used! 
*/ /* return 0; */ } int mini_agrep(inword, inlen, outfp) CHAR *inword; int inlen; FILE *outfp; { static struct stat st; static int statted = 0; unsigned char s[MAX_LINE_LEN], word[MAX_NAME_LEN]; long beginoffset, endoffset, curroffset; unsigned char c; int j, num = 0, cmp, len; if (!statted) { sprintf((char*)s, "%s/%s", INDEX_DIR, INDEX_FILE); if (stat(s, &st) == -1) { fprintf(stderr, "Can't stat file: %s\n", s); exit(2); } statted = 1; } j = 0; while (*inword) { if (*inword == '\\') { inword++; continue; } if (isupper(*(unsigned char *)inword)) word[j] = tolower(*(unsigned char *)inword); else word[j] = *inword; j++; inword ++; } word[j] = '\0'; len = j; if (!get_mini(word, len, &beginoffset, &endoffset, 0, mini_array_len, minifp)) return 0; if (endoffset == -1) endoffset = st.st_size; if (endoffset <= beginoffset) return 0; /* We must find all occurrences of the word (in all attributes) so can't quit when we find the first match */ fseek(indexfp, beginoffset, 0); curroffset = ftell(indexfp); /* = beginoffset */ while ((curroffset < endoffset) && (fgets(s, MAX_LINE_LEN, indexfp) != NULL)) { j = 0; while ((j < MAX_LINE_LEN) && (s[j] != WORD_END_MARK) && (s[j] != ALL_INDEX_MARK) && (s[j] != '\0') && (s[j] != '\n')) j++; if ((j >= MAX_LINE_LEN) || (s[j] == '\0') || (s[j] == '\n')) { curroffset = ftell(indexfp); continue; } /* else it is WORD_END_MARK or ALL_INDEX_MARK */ c = s[j]; s[j] = '\0'; cmp = strcmp(word, s); #if WORD_SORTED if (cmp < 0) break; /* since index is sorted by word */ else #endif /* WORD_SORTED */ if (cmp != 0) { /* not IDENTICALLY EQUAL */ s[j] = c; curroffset = ftell(indexfp); continue; } s[j] = c; fputs(s, outfp); num++; curroffset = ftell(indexfp); } return num; } /* Returns the number of times a successful search was conducted: unused info at present. 
*/ fillup_target(result_index_set, result_offset_table, parse) unsigned int *result_index_set; struct offsets **result_offset_table; long parse; { int i=0; FILE *tmpfp; int dummylen = 0; char dummypat[MAX_PAT]; int successes = 0, ret; int first_time = 1; extern int veryfast; int prev_INVERSE = INVERSE; veryfast = veryfastsearch(index_argc, index_argv, num_pat, pat_list, pat_lens, minifp); while (i < num_pat) { if (!veryfast) { if (is_mgrep_pat[i] && (num_mgrep_pat > 1)) { /* do later */ i++; continue; } strcpy(index_argv[patindex], pat_list[i]); /* i-th pattern in its right position */ } /* printf("pat_list[%d] = %s\n", i, pat_list[i]); */ if ((tmpfp = fopen(tempfile, "w")) == NULL) { fprintf(stderr, "%s: cannot open for writing: %s, errno=%d\n", GProgname, tempfile, errno); return(-1); } errno = 0; /* do we need to check is_mgrep_pat[i] here? */ if (veryfast && is_mgrep_pat[i]) { ret = mini_agrep(pat_list[i], pat_lens[i], tmpfp); } /* If this is the glimpse server, since the process doesn't die, most of its data pages might still remain in memory */ else if ((ret = fileagrep(index_argc, index_argv, 0, tmpfp)) < 0) { /* reinitialization here takes care of agrep_argv changes AFTER split_pattern */ fprintf(stderr, "%s: error in searching index\n", HARVEST_PREFIX); fclose(tmpfp); return(-1); } /* Now, the output of index search is in tempfile: need to use files here since index is too large */ fflush(tmpfp); fclose(tmpfp); tmpfp = NULL; /* Keep track of the maximum number of errors: will never enter veryfast */ if (GBESTMATCH) { if (errno > bestmatcherrors) bestmatcherrors = errno; } /* At this point, all index-search options are properly set due to the above fileagrep */ INVERSE = prev_INVERSE; if (-1 ==get_index(tempfile, result_index_set, result_offset_table, pat_list[i], pat_lens[i], pat_attr[i], index_argv, index_argc, nullfp, partfp, parse, first_time)) return(-1); successes ++; first_time = 0; i++; } fflush(stderr); if (veryfast) return successes; /* For 
index-search with memgrep in mgrep_get_index, and get-filenames */ dummypat[0] = '\0'; if ((dummylen = memagrep_init(index_argc, index_argv, MAX_PAT, dummypat)) <= 0) return(-1); if (num_mgrep_pat > 1) { CHAR *old_buf = (CHAR *)index_argv[patbufpos]; /* avoid my_free and re-my_malloc */ index_argv[patbufpos] = (char*)pat_buf; /* this contains all the patterns with the right -m and -M options */ #if BG_DEBUG fprintf(debug, "pat_buf = %s\n", pat_buf); #endif /*BG_DEBUG*/ strcpy(index_argv[patindex], "-z"); /* no-op: patterns are in patbufpos; also avoid shift-left of index_argv */ if (-1 == mgrep_get_index(tempfile, result_index_set, result_offset_table, pat_list, pat_lens, pat_attr, mgrep_pat_index, num_mgrep_pat, patbufpos, index_argv, index_argc, nullfp, partfp, parse, first_time)) { index_argv[patbufpos] = (char *)old_buf; /* else will my_free array! */ fprintf(stderr, "%s: error in searching index\n", HARVEST_PREFIX); return(-1); } successes ++; first_time = 0; index_argv[patbufpos] = (char *)old_buf; } return successes; } /* * Now, I search the index by doing an in-order traversal of the boolean parse tree starting at GParse. * The results at each node are stored in src_offset_table and src_index_set. Before the right child is * evaluated, results of the left child are stored in curr_offset_table and curr_index_set (accumulators) * and are unioned/intersected/noted with the right child's results (which get stored in src_...) and * passed on above. The accumulators are allocated at each internal node and freed after evaluation. * Left to right evaluation is good since number of curr_offset_tables that exist simultaneously depends * entirely on the maximum depth of a right branch (REAL_PARTITION is small so it won't make a difference). 
*/ int search_index(tree) ParseTree *tree; { int prev_INVERSE; int i, j, iii; int first_time = 0; /* since it is used AFTER left child has been computed */ unsigned int *curr_index_set = NULL; struct offsets **curr_offset_table = NULL; if (ComplexBoolean) { /* recursive */ if (tree == NULL) return -1; if (tree->type == LEAF) { /* always AND pat of individual words at each term: initialize accordingly */ if (OneFilePerBlock) { for(i=0; iterminalindex, tree->terminalindex+1) <= 0) return -1; prev_INVERSE = INVERSE; /* agrep's global to implement NOT */ if (tree->op & NOTPAT) INVERSE = 1; if (fillup_target(src_index_set, src_offset_table, AND_EXP) <= 0) return -1; INVERSE = prev_INVERSE; return 1; } else if (tree->type == INTERNAL) { /* Search the left node and see if the right node can be searched */ if (search_index(tree->data.internal.left) <= 0) return -1; if (OneFilePerBlock && ((tree->op & OPMASK) == ORPAT) && (src_index_set[REAL_PARTITION - 1] == 1)) goto quit; /* nothing to do */ if ((tree->data.internal.right == NULL) || (tree->data.internal.right->type == 0)) return -1; /* uninitialized: see main.c */ curr_index_set = (unsigned int *)my_malloc(sizeof(int)*REAL_PARTITION); memset(curr_index_set, '\0', sizeof(int)*REAL_PARTITION); /* Save previous src_index_set and src_offset_table in fresh accumulators */ if (OneFilePerBlock) { memcpy(curr_index_set, src_index_set, sizeof(int)*REAL_PARTITION); curr_index_set[REAL_PARTITION - 1] = src_index_set[REAL_PARTITION - 1]; src_index_set[REAL_PARTITION - 1] = 0; curr_index_set[REAL_PARTITION - 2] = src_index_set[REAL_PARTITION - 2]; src_index_set[REAL_PARTITION - 2] = 0; } else memcpy(curr_index_set, src_index_set, MAX_PARTITION * sizeof(int)); if (ByteLevelIndex && !NOBYTELEVEL && (RecordLevelIndex || !(Only_first && !PRINTAPPXFILEMATCH))) { if ((curr_offset_table = (struct offsets **)my_malloc(sizeof(struct offsets *) * OneFilePerBlock)) == NULL) { fprintf(stderr, "%s: malloc failure at: %s:%d\n", GProgname, 
__FILE__, __LINE__); my_free(curr_index_set, REAL_PARTITION*sizeof(int)); return -1; } memcpy(curr_offset_table, src_offset_table, OneFilePerBlock * sizeof(struct offsets *)); memset(src_offset_table, '\0', sizeof(struct offsets *) * OneFilePerBlock); } /* Now evaluate the right node which automatically put the results in src_index_set/src_offset_table */ if (search_index(tree->data.internal.right) <= 0) { if (curr_offset_table != NULL) free(curr_offset_table); my_free(curr_index_set, REAL_PARTITION*sizeof(int)); return -1; } /* * Alpha substitution of the code in get_index(): * index_tab <- src_index_set * dest_index_table <- curr_index_set * offset_tab <- src_offset_table * dest_offset_table <- curr_offset_table * ret <- src_index_set[REAL_PARTITION - 1] for ORPAT, curr_index_set for ANDPAT * frequency = src_index_set[REAL_PARTITION - 2] in both ORPAT and ANDPAT * first_time <- 0 * return 0 <- goto quit * Slight difference since we want the results to go to src rather than curr. */ if (OneFilePerBlock) { if ((tree->op & OPMASK) == ORPAT) { if (src_index_set[REAL_PARTITION - 1] == 1) { /* curr..[..] can never be 1 since we would have quit above itself */ ret_is_1: src_index_set[REAL_PARTITION - 1] = 1; for(i=0; i= OneFilePerBlock) break; src_index_set[i] |= mask_int[j]; } if (ByteLevelIndex && !NOBYTELEVEL && !(Only_first && !PRINTAPPXFILEMATCH)) for (i=0; i= OneFilePerBlock) break; src_index_set[i] |= mask_int[j]; } } first_time = 0; if (ByteLevelIndex && !NOBYTELEVEL && !(Only_first && !PRINTAPPXFILEMATCH)) for (i=0; iop & OPMASK) == ORPAT) for(i=0; iop & NOTPAT) { if (ByteLevelIndex) { /* Can't recover the discarded offsets */ fprintf(stderr, "%s: can't handle NOT of AND/OR terms with ByteLevelIndex: please simplify the query\n", HARVEST_PREFIX); my_free(curr_index_set, REAL_PARTITION*sizeof(int)); return -1; } if (OneFilePerBlock) for (i=0; i 0 are displayed. .TP .B \-C tells glimpse to send its queries to \fIglimpseserver\fP. 
.TP .B \-d "'\fIdelim\fP'" Define \fIdelim\fP to be the separator between two records. The default value is '$', namely a record is by default a line. \fIdelim\fP can be a string of size at most 8 (with possible use of ^ and $), but not a regular expression. Text between two \fIdelim\fP's, before the first \fIdelim\fP, and after the last \fIdelim\fP is considered as one record. For example, -d '$$' defines paragraphs as records and -d '^From\ ' defines mail messages as records. \fIglimpse\fP matches each record separately. \fBThis option does not currently work with regular expressions.\fP The -d option is especially useful for Boolean AND queries, because the patterns need not appear in the same line but in the same record. For example, \fIglimpse -F mail -d '^From\ ' 'glimpse;arizona;announcement'\fR will output all mail messages (in their entirety) that have the 3 patterns anywhere in the message (or the header), assuming that files with 'mail' in their name contain mail messages. If you want the scope of the record to be the whole file, use the -W option. \fBGlimpse warning\fP: Use this option with care. If the delimiter is set to match mail messages, for example, and glimpse finds the pattern in a regular file, it may not find the delimiter and will therefore output the whole file. (The -t option - see below - can be used to put the \fIdelim\fP at the end of the record.) \fBPerformance Note:\fP Agrep (and glimpse) resorts to more complex search when the \-d option is used. The search is slower and unfortunately no more than 32 characters can be used in the pattern. .TP .B \-D\fIk\fP Set the cost of a deletion to \fIk\fP (\fIk\fP is a positive integer). This option does not currently work with regular expressions. .TP .BI \-e " pattern" Same as a simple .I pattern argument, but useful when the .I pattern begins with a .RB ` \- '. .TP .B \-E prints the lines in the index (as they appear in the index) which match the pattern. 
Used mostly for debugging and maintenance of the index. This is not an option that a user needs to know about. .TP .B \-f \fIfile_name\fR this option has a different meaning for agrep than for glimpse: In glimpse, only the files whose names are listed in \fIfile_name\fP are matched. (The file names have to appear as in .glimpse_filenames.) In agrep, the file_name contains the list of the patterns that are searched. (Starting at version 3.6, this option for glimpse is much faster for large files.) .TP .B \-F \fIfile_pattern\fR limits the search to those files whose name (including the whole path) matches \fIfile_pattern\fP. This option can be used in a variety of applications to provide limited search even for one large index. If \fIfile_pattern\fP matches a directory, then all files with this directory on their path will be considered. To limit the search to actual file names, use $ at the end of the pattern. \fIfile_pattern\fP can be a regular expression and even a Boolean pattern. This option is implemented by running agrep \fIfile_pattern\fP on the list of file names obtained from the index. Therefore, searching the index itself takes the same amount of time, but limiting the second phase of the search to only a few files can speed up the search significantly. For example, .sp 1 glimpse -F 'src#\\.c$' needle .sp 1 will search for needle in all .c files with src somewhere along the path. The -F \fIfile_pattern\fP must appear before the search pattern (e.g., glimpse needle -F '\\.c$' will not work). It is possible to use some of agrep's options when matching file names. In this case all options as well as the file_pattern should be in quotes. (-B and -v do not work very well as part of a file_pattern.) For example, .sp glimpse -F '-1 \\.html' pattern .sp will allow one spelling error when matching .html to the file names (so ".htm" and ".shtml" will match as well). 
.sp glimpse -F '-v \\.c$' counter .sp will search for 'counter' in all files \fIexcept\fP for .c files. .TP .B \-g prints the file number (its position in the .glimpse_filenames file) rather than its name. .TP .B \-G Output the (whole) files that contain a match. .TP .B \-h Do not display filenames. .TP .B \-H \fIdirectory_name\fR searches for the index and the other .glimpse files in \fIdirectory_name\fP. The default is the home directory. This option is useful, for example, if several different indexes are maintained for different archives (e.g., one for mail messages, one for source code, one for articles). .TP .B \-i Case-insensitive search \(em e.g., "A" and "a" are considered equivalent. Glimpse's index stores all patterns in lower case (see LIMITATIONS below). \fBPerformance Note:\fP When \-i is used together with the \-w option, the search may become much faster. It is recommended to have \-i and \-w as defaults, for example, through an alias. We use the following alias in our .cshrc file: .br alias glwi 'glimpse -w -i' .TP .B \-I\fIk\fP Set the cost of an insertion to \fIk\fP (\fIk\fP is a positive integer). This option does not currently work with regular expressions. .TP .B \-j If the index was constructed with the -t option, then \-j will output the files' last modification dates in addition to everything else. There are no major performance penalties for this option. .TP .B \-J \fIhost_name\fP used in conjunction with glimpseserver (\-C) to connect to one particular server. .TP .B \-k No symbol in the pattern is treated as a meta character. For example, glimpse -k 'a(b|c)*d' will find the occurrences of a(b|c)*d whereas glimpse 'a(b|c)*d' will find substrings that match the regular expression 'a(b|c)*d'. (The only exception is ^ at the beginning of the pattern and $ at the end of the pattern, which are still interpreted in the usual way. Use \\^ or \\$ if you need them verbatim.)
.TP .B \-K \fIport_number\fP used in conjunction with glimpseserver (\-C) to connect to one particular server at the specified TCP port number. .TP .B \-l Output only the names of files that contain a match. This option differs from the \-N option in that the files themselves \fIare\fP searched, but the matching lines are not shown. .TP .B \-L x | x:y | x:y:z if one number is given, it is a limit on the total number of matches. Glimpse outputs only the first x matches. If \-l is used (i.e., only file names are sought), then the limit is on the number of files; otherwise, the limit is on the number of records. If two numbers are given (x:y), then y is an added limit on the total number of files. If three numbers are given (x:y:z), then z is an added limit on the number of matches per file. If any of x, y, or z is set to 0, that limit is ignored (in other words 0 = infinity in this case); for example, \-L 0:10 will output all matches in the first 10 files that contain a match. This option is particularly useful for servers that need to limit the amount of output provided to clients. .TP .B \-m used for glimpse internals. .TP .B \-M used for glimpse internals. .TP .B \-n Each matching record (line) is prefixed by its record (line) number in the file. \fBPerformance Note:\fP To compute the record/line number, agrep needs to search for all record delimiters (or line breaks), which can slow down the search. .TP .B \-N searches only the index (so the search is faster). If the index was built with -o or -b then the result is the number of files that have a potential match plus a prompt to ask if you want to see the file names. (If \-y is used, then there is no prompt and the names of the files will be shown.) This could be a way to get the matching file names without even having access to the files themselves. However, because only the index is searched, some potential matches may not be real matches. In other words, with \-N you will not miss any file but you may get extra files.
For example, since the index stores everything in lower case, a case-sensitive query may match a file that has only a case-insensitive match. Boolean queries may match a file that has all the keywords but not in the same line (indexing with \-b allows glimpse to figure out whether the keywords are close, but it cannot figure out from the index whether they are exactly on the same line or in the same record without looking at the file). If the index was not built with \-o or \-b, then this option outputs the number of \fIblocks\fP matching the pattern. This is useful as an indication of how long the search will take. All files are usually partitioned into 200-250 blocks. The file \fB.glimpse_statistics\fP contains the total number of blocks (or \fBglimpse -N a\fP will give a pretty good estimate; only blocks with no occurrences of 'a' will be missed). .TP .B \-o the opposite of \-t: the delimiter is not output at the tail, but at the beginning of the matched record. .TP .B \-O the file names are not printed before every matched record; instead, each filename is printed just once, and all the matched records within it are printed after it. .TP .B \-p (from version 4.0B1 only) Supports reading a compressed set of filenames. The -p option allows you to utilize compressed `neighborhoods' (sets of filenames) to limit your search, without uncompressing them. Added mostly for WebGlimpse. The usage is: .br "-p filename:X:Y:Z" where "filename" is the file with compressed neighborhoods, X is an offset into that file (usually 0, must be a multiple of sizeof(int)), Y is the length glimpse must access from that file (if 0, then the whole file; must be a multiple of sizeof(int)), and Z must be 2 (it indicates that "filename" has the sparse-set representation of compressed neighborhoods: the other values are for internal use only). Note that any colon ":" in filename must be escaped using a backslash \. .TP .B \-P used for glimpse internals.
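.LP
The argument format of the \-p option described above can be illustrated with a small parser. This is a hedged Python sketch, not code taken from glimpse: the function name parse_p_option is hypothetical, and INT_SIZE = 4 is an assumption about sizeof(int) on the machine that wrote the neighborhood file.
.nf
```python
INT_SIZE = 4          # assumption: sizeof(int) on the machine that wrote the file
BACKSLASH = chr(92)   # spelled this way so the listing contains no escape sequences


def parse_p_option(arg):
    """Split 'filename:X:Y:Z' into (filename, offset, length, kind)."""
    parts = arg.rsplit(":", 3)          # X, Y and Z never contain a colon
    if len(parts) != 4:
        raise ValueError("expected filename:X:Y:Z")
    name, x, y, z = parts
    name = name.replace(BACKSLASH + ":", ":")   # undo escaped colons in the name
    offset, length, kind = int(x), int(y), int(z)
    if offset < 0 or offset % INT_SIZE != 0:
        raise ValueError("X must be a non-negative multiple of sizeof(int)")
    if length < 0 or length % INT_SIZE != 0:
        raise ValueError("Y must be a non-negative multiple of sizeof(int)")
    if kind != 2:
        raise ValueError("Z must be 2 (sparse-set representation)")
    return name, offset, length, kind
```
.fi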
.TP .B \-q prints the offsets of the beginning and end of each matched record. The difference between \-q and \-b is that \-b prints the offsets of the actual matched string, while \-q prints the offsets of the whole record where the match occurred. The output format is @x{y}, where x is the beginning offset and y is the end offset. .TP .B \-Q when used together with \-N, glimpse displays not only the filename where the match occurs, but also the exact occurrences (offsets) as seen in the index. This option is relevant only if the index was built with -b; otherwise, the offsets are not available in the index. This option is ignored unless it is used with \-N. .TP .B \-r This option is an agrep option and it will be ignored in glimpse, unless glimpse is used with a file name at the end which makes it run as agrep. If the file name is a directory name, the \-r option will search (recursively) the whole directory and everything below it. (The glimpse index will not be used.) .TP .B \-R \fIk\fP defines the maximum size (in bytes) of a record. The maximum value (which is the default) is 48K. Defining the maximum to be lower than the default may speed up some searches. .TP .B \-s Work silently, that is, display nothing except error messages. This is useful for checking the error status. .TP .B \-S\fIk\fP Set the cost of a substitution to \fIk\fP (\fIk\fP is a positive integer). This option does not currently work with regular expressions. .TP .B \-t Similar to the \-d option, except that the delimiter is assumed to appear at the \fIend\fP of the record. Glimpse will output the record starting from the end of .I delim to (and including) the next .I delim. (See warning for the \-d option.) .TP .B \-T directory Use \fIdirectory\fP as a place where temporary files are built. (Glimpse produces some small temporary files usually in /tmp.)
This option is useful mainly in the context of structured queries for the Harvest project, where the temporary files may be non-trivial, and the /tmp directory may not have enough space for them. .TP .B \-U (starting at version 4.0B1) Interprets an index created with the -X or the -U option in glimpseindex. Useful mostly for WebGlimpse or similar web applications. When glimpse outputs matches, it will display the filename, the URL, and the title automatically. .TP .B \-v (This option is an agrep option and it will be ignored in glimpse, unless glimpse is used with a file name at the end which makes it run as agrep.) Output all records/lines that do \fInot\fP contain a match. (Glimpse does not support the NOT operator yet.) .TP .B \-V prints the current version of glimpse. .TP .B \-w Search for the pattern as a word \(em i.e., surrounded by non-alphanumeric characters. For example, \fIglimpse -w car\fR will match car, but not characters and not car10. The non-alphanumeric \fImust\fP surround the match; they cannot be counted as errors. This option does not work with regular expressions. \fBPerformance Note:\fP When \-w is used together with the \-i option, the search may become much faster. The \-w will not work with $, ^, and _ (see BUGS below). It is recommended to have \-i and \-w as defaults, for example, through an alias. We use the following alias in our .cshrc file .br alias glwi 'glimpse -w -i' .TP .B \-W The default for Boolean AND queries is that they cover one record (the default for a record is one line) at a time. For example, glimpse 'good;bad' will output all lines containing both 'good' and 'bad'. The \-W option changes the scope of Booleans to be the whole file. Within a file glimpse will output all matches to any of the patterns. So, glimpse -W 'good;bad' will output all lines containing 'good' \fIor\fP 'bad', but only in files that contain both patterns. The NOT operator '~' can be used only with \-W. It is described later on. 
The OR operator is essentially unaffected (unless it is in combination with the other Boolean operations). For structured queries, the scope is always the whole attribute or file. .TP .B \-x The pattern must match the whole line. (This option is translated to -w when the index is searched and it is used only when the actual text is searched. It is of limited use in glimpse.) .TP .B \-X (from version 4.0B1 only) Output the names of files that contain a match even if these files have been deleted since the index was built. Without this option glimpse will simply ignore these files. .TP .B \-y Do not prompt. Proceed with the match as if the answer to any prompt is y. Servers (or any other scripts) using glimpse will probably want to use this option. .TP .B \-Y \fIk\fP If the index was constructed with the -t option, then \-Y x will output only matches to files that were created or modified within the last x days. There are no major performance penalties for this option. .TP .B \-z Allow customizable filtering, using the file .glimpse_filters to perform the programs listed there for each match. The best example is compress/decompress. If .glimpse_filters include the line .br *.Z uncompress < .br (separated by tabs) then before indexing any file that matches the pattern "*.Z" (same syntax as the one for .glimpse_exclude) the command listed is executed first (assuming input is from stdin, which is why uncompress needs <) and its output (assuming it goes to stdout) is indexed. The file itself is not changed (i.e., it stays compressed). Then if glimpse -z is used, the same program is used on these files on the fly. Any program can be used (we run 'exec'). For example, one can filter out parts of files that should not be indexed. Glimpseindex tries to apply all filters in .glimpse_filters in the order they are given. 
For example, if you want to uncompress a file and then extract some part of it, put the compression command (the example above) first and then another line that specifies the extraction. Note that this can slow down the search because the filters need to be run before files are searched. (See also glimpseindex.) .TP .B \-Z No op. (It's useful for glimpse's internals. Trust us.) .LP The characters .RB ` $ ', .RB `^ ', .RB ` \(** ', .RB ` [ ' , .RB ` ] ' , .RB ` \s+2^\s0 ', .RB ` | ', .RB ` ( ', .RB ` ) ', .RB ` ! ', and .RB ` \e ' can cause unexpected results when included in the .IR pattern , as these characters are also meaningful to the shell. To avoid these problems, enclose the entire pattern in single quotes, i.e., 'pattern'. Do not use double quotes ("). .ne 4 .SH PATTERNS .LP \fIglimpse\fP supports a large variety of patterns, including simple strings, strings with classes of characters, sets of strings, wild cards, and regular expressions (see LIMITATIONS). .TP \fBStrings \fP Strings are any sequence of characters, including the special symbols `^' for beginning of line and `$' for end of line. The following special characters ( .RB ` $ ', .RB `^ ', .RB ` \(** ', .RB ` [ ' , .RB ` \s+2^\s0 ', .RB ` | ', .RB ` ( ', .RB ` ) ', .RB ` ! ', and .RB ` \e ' ) as well as the following meta characters special to glimpse (and agrep): .RB ` ; ', .RB ` , ', .RB ` # ', .RB ` < ', .RB ` > ', .RB ` - ', and .RB ` . ', should be preceded by `\\' if they are to be matched as regular characters. For example, \\^abc\\\\ corresponds to the string ^abc\\, whereas ^abc corresponds to the string abc at the beginning of a line. .TP \fBClasses of characters\fP a list of characters inside [] (in order) corresponds to any character from the list. For example, [a-ho-z] is any character between a and h or between o and z. The symbol `^' inside [] complements the list. For example, [^i-n] denote any character in the character set except character 'i' to 'n'. 
The symbol `^' thus has two meanings, but this is consistent with egrep. The symbol `.' (don't care) stands for any symbol (except for the newline symbol). .TP \fBBoolean operations\fP .B Glimpse supports an `AND' operation denoted by the symbol `;', an `OR' operation denoted by the symbol `,', a limited version of a 'NOT' operation (starting at version 4.0B1) denoted by the symbol `~', or any combination. For example, \fIglimpse 'pizza;cheeseburger'\fR will output all lines containing both patterns. \fIglimpse -F 'gnu;\\.c$' 'define;DEFAULT'\fR will output all lines containing both 'define' and 'DEFAULT' (anywhere in the line, not necessarily in order) in files whose name contains 'gnu' and ends with .c. \fIglimpse '{political,computer};science'\fR will match 'political science' or 'science of computers'. The NOT operation works only together with the -W option and it generally applies only to the whole file rather than to individual records. Its output may sometimes seem counterintuitive. Use with care. \fIglimpse -W 'fame;~glory'\fR will output all lines containing 'fame' in all files that contain 'fame' but do not contain 'glory'. This is the most common use of NOT, and in this case it works as expected. \fIglimpse -W '~{fame;glory}'\fR will be limited to files that do not contain both words, and will output all lines containing one of them. .TP \fBWild cards\fP The symbol '#' is used to denote a sequence of any number (including 0) of arbitrary characters (see LIMITATIONS). The symbol # is equivalent to .* in egrep. In fact, .* will work too, because it is a valid regular expression (see below), but unless this is part of an actual regular expression, # will work faster. (Currently glimpse is experiencing some problems with #.) .TP \fBCombination of exact and approximate matching\fP Any pattern inside angle brackets <> must match the text exactly even if the match is with errors.
For example, <mathemat>ics matches mathematical with one error (replacing the last s with an a), but mathe<matics> does not match mathematical no matter how many errors are allowed. (This option is buggy at the moment.) .TP \fBRegular expressions\fP Since the index is word based, a regular expression must match words that appear in the index for glimpse to find it. Glimpse first strips all non-alphabetic characters from the regular expression, and searches the index for all remaining words. It then applies the regular expression matching algorithm to the files found in the index. For example, \fIglimpse\fP 'abc.*xyz' will search the index for all files that contain both 'abc' and 'xyz', and then search directly for 'abc.*xyz' in those files. (If you use glimpse \-w 'abc.*xyz', then 'abcxyz' will not be found, because glimpse will think that abc and xyz must each match a whole word.) The syntax of regular expressions in \fBglimpse\fP is in general the same as that for \fBagrep\fP. The union operation `|', Kleene closure `*', and parentheses () are all supported. Currently '+' is not supported. Regular expressions are currently limited to approximately 30 characters (generally excluding meta characters). Some options (\-d, \-w, \-t, \-x, \-D, \-I, \-S) do not currently work with regular expressions. The maximal number of errors for regular expressions that use '*' or '|' is 4. (See LIMITATIONS.) .TP \fBstructured queries\fP Glimpse supports some form of structured queries using Harvest's SOIF format. See STRUCTURED QUERIES below for details. .SH EXAMPLES .LP (Run "glimpse '^glimpse' this-file" to get a list of all examples, some of which were given earlier.) .TP glimpse -F 'haystack.h$' needle finds all needles in all haystack.h's files. .TP glimpse -2 -F html Anestesiology outputs all occurrences of Anestesiology with two errors in files with html somewhere in their full name.
.TP glimpse -l -F '\\.c$' variablename lists the names of all .c files that contain variablename (the -l option lists file names rather than output the matched lines). .TP glimpse -F 'mail;1993' 'windsurfing;Arizona' finds all lines containing \fIwindsurfing\fP and \fIArizona\fP in all files having `mail' and '1993' somewhere in their full name. .TP glimpse -F mail 't.j@#uk' finds all mail addresses (search only files with mail somewhere in their name) from the uk, where the login name ends with t.j, where the . stands for any one character. (This is very useful to find a login name of someone whose middle name you don't know.) .TP glimpse -F mbox -h -G . > MBOX concatenates all files whose name matches `mbox' into one big one. .SH "SEARCHING IN COMPRESSED FILES" .LP Glimpse includes an optional new compression program, called \fIcast\fP, which allows glimpse (and agrep) to search the compressed files without having to decompress them. The search is actually significantly faster when the files are compressed. However, we have not tested \fIcast\fP as thoroughly as we would have liked, and a mishap in a compression algorithm can cause loss of data, so we recommend at this point to use \fIcast\fP very carefully. We do not support or maintain cast. (Unless you specifically use \fIcast\fP, the default is to ignore it.) .SH "GLIMPSEINDEX FILES" .LP All files used by glimpse are located at the directory(ies) where the index(es) is (are) stored and have .glimpse_ as a prefix. The first two files (.glimpse_exclude and .glimpse_include) are optionally supplied by the user. The other files are built and read by glimpse. .LP .IP "\fB.glimpse_exclude\fR" contains a list of files that glimpseindex is explicitly told to ignore. In general, the syntax of .glimpse_exclude/include is the same as that of agrep (or any other grep). The lines in the .glimpse_exclude file are matched to the file names, and if they match, the files are excluded. 
Notice that agrep matches to parts of the string! e.g., agrep /ftp/pub will match /home/ftp/pub and /ftp/pub/whatever. So, if you want to exclude /ftp/pub/core, you just list it, as is, in the .glimpse_exclude file. If you put "/home/ftp/pub/cdrom" in .glimpse_exclude, every file name that matches that string will be excluded, meaning all files below it. You can use ^ to indicate the beginning of a file name, and $ to indicate the end of one, and you can use * and ? in the usual way. For example /ftp/*html will exclude /ftp/pub/foo.html, but will also exclude /home/ftp/pub/html/whatever; if you want to exclude files that start with /ftp and end with html use ^/ftp*html$ Notice that putting a * at the beginning or at the end is redundant (in fact, in this case glimpseindex will remove the * when it does the indexing). No other meta characters are allowed in .glimpse_exclude (e.g., don't use .* or # or |). Lines with * or ? must have no more than 30 characters. Notice that, although the index itself will not be indexed, the list of file names (.glimpse_filenames) will be indexed unless it is explicitly listed in .glimpse_exclude. .IP "\fB.glimpse_filters\fR" See the description above for the -z option. .IP "\fB.glimpse_include\fR" contains a list of files that glimpseindex is explicitly told to \fIinclude\fP in the index even though they may look like non-text files. Symbolic links are followed by glimpseindex only if they are specifically included here. If a file is in both .glimpse_exclude and .glimpse_include it will be excluded. .IP "\fB.glimpse_filenames\fP" contains the list of all indexed file names, one per line. This is an ASCII file that can also be used with agrep to search for a file name leading to a fast find command. For example, .br glimpse 'count#\\.c$' ~/.glimpse_filenames .br will output the names of all (indexed) .c files that have 'count' in their name (including anywhere on the path from the index). 
Setting the following alias in the .login file may be useful: .br alias findfile 'glimpse -h \!:1 ~/.glimpse_filenames' .IP "\fB.glimpse_index\fP" contains the index. The index consists of lines, each starting with a word followed by a list of block numbers (unless the -o or -b options are used, in which case each word is followed by an offset into the file .glimpse_partitions where all pointers are kept). The block/file numbers are stored in binary form, so this is not an ASCII file. .IP "\fB.glimpse_messages\fP" contains the output of the -w option (see above). .IP "\fB.glimpse_partitions\fP" contains the partition of the indexed space into blocks and, when the index is built with the -o or -b options, some part of the index. This file is used internally by glimpse and it is a non-ASCII file. .IP "\fB.glimpse_statistics\fP" contains some statistics about the makeup of the index. Useful for some advanced applications and customization of glimpse. .IP "\fB.glimpse_turbo\fP" An added data structure (used under glimpseindex -o or -b only) that helps to speed up queries significantly for large indexes. Its size is 0.25MB. Glimpse will work without it if needed. .SH "STRUCTURED QUERIES" Glimpse can search for Boolean combinations of "attribute=value" terms by using the Harvest SOIF parser library (in glimpse/libtemplate). To search this way, the index must be made by using the -s option of glimpseindex (this can be used in conjunction with other glimpseindex options). For glimpse and glimpseindex to recognize "structured" files, they must be in SOIF format. In this format, each value is prefixed by an attribute-name with the size of the value (in bytes) present in "{}" after the name of the attribute. For example, the following lines are part of an SOIF file: .br .nf type{17}: Directory-Listing md5{32}: 3858c73d68616df0ed58a44d306b12ba .fi Any string can serve as an attribute name.
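.LP
The two SOIF lines shown above follow a "name{size}: value" shape. The following Python sketch reads one such line and uses the declared byte count to cut the value. It is only an illustration, not the Harvest libtemplate parser: the function name is hypothetical, and it assumes a single-line value separated from the colon by blanks or a tab.
.nf
```python
def parse_soif_line(line):
    """Return (attribute, value) from a line shaped like 'name{size}: value'."""
    head, sep, rest = line.partition("}:")
    if not sep or "{" not in head:
        raise ValueError("not an attribute line")
    name, _, size_text = head.partition("{")
    size = int(size_text)                    # declared size of the value in bytes
    data = rest.lstrip().encode("utf-8")     # assumption: separator is whitespace
    if len(data) < size:
        raise ValueError("value is shorter than the declared size")
    return name.strip(), data[:size].decode("utf-8")
```
.fi
.LP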
Glimpse "pattern;type=Directory-Listing" will search for "pattern" only in files whose type is "Directory-Listing". The file itself is considered to be one "object" and its name/url appears as the first attribute with an "@" prefix; e.g., @FILE { http://xxx... } The scope of Boolean operations changes from records (lines) to whole files when structured queries are used in glimpse (since individual query terms can look at different attributes and they may not be "covered" by the record/line). Note that glimpse can only search for patterns in the value parts of the SOIF file: there are some attributes (like the TTL, MD5, etc.) that are interpreted by Harvest's internal routines. See RFC 2655 for more detailed information of the SOIF format. .SH "REFERENCES" .IP 1. U. Manber and S. Wu, "GLIMPSE: A Tool to Search Through Entire File Systems," \fIUsenix Winter 1994 Technical Conference\fP (best paper award), San Francisco (January 1994), pp. 23\-32. Also, Technical Report #TR 93-34, Dept. of Computer Science, University of Arizona, October 1993 (a postscript file is available by anonymous ftp at ftp://webglimpse.net/pub/glimpse/TR93-34.ps). .IP 2. S. Wu and U. Manber, "Fast Text Searching Allowing Errors," \fICommunications of the ACM\fP \fB35\fP (October 1992), pp. 83\-91. .SH "SEE ALSO" .BR agrep (1), .BR ed (1), .BR ex (1), .BR glimpseindex (1), .BR glimpseserver (1), .BR grep (1), .BR sh (1), .BR csh (1). .SH LIMITATIONS .LP The index of glimpse is word based. A pattern that contains more than one word cannot be found in the index. The way glimpse overcomes this weakness is by splitting any multi-word pattern into its set of words and looking for all of them in the index. For example, \fBglimpse 'linear programming'\fR will first consult the index to find all files containing both \fIlinear\fP and \fIprogramming\fP, and then apply agrep to find the combined pattern. 
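.LP
The two-phase lookup just described can be sketched with a toy example in Python. The in-memory dictionary stands in for the real index and a plain substring test stands in for agrep; both are simplifying assumptions, and the function names are hypothetical.
.nf
```python
def build_index(files):
    """Map each lower-cased word to the set of file names containing it."""
    index = {}
    for name, text in files.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(name)
    return index


def search(pattern, files, index):
    """Phase 1: intersect the index entries of each word. Phase 2: scan."""
    candidates = set(files)
    for word in pattern.lower().split():
        candidates &= index.get(word, set())
    # The index only shows that every word occurs somewhere in the file,
    # so each candidate is checked again against the combined pattern.
    return sorted(name for name in candidates if pattern in files[name])
```
.fi
.LP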
This is usually an effective solution, but it can be slow for cases where both words are very common, but their combination is not. .LP As was mentioned in the section on PATTERNS above, some characters serve as meta characters for glimpse and need to be preceded by '\\' to search for them. The most common examples are the characters '.' (which stands for a wild card), and '*' (the Kleene closure). So, "glimpse ab.de" will match abcde, but "glimpse ab\\.de" will not, and "glimpse ab*de" will not match ab*de, but "glimpse ab\\*de" will. The meta character - is translated automatically to a hyphen unless it appears between [] (in which case it denotes a range of characters). .LP The index of glimpse stores all patterns in lower case. When glimpse searches the index it first converts all patterns to lower case, finds the appropriate files, and then searches the actual files using the original patterns. So, for example, \fIglimpse ABCXYZ\fR will first find all files containing abcxyz in any combination of lower and upper cases, and then search these files directly, so only the right cases will be found. One problem with this approach is discovering misspellings that are caused by wrong cases. For example, \fIglimpse -B abcXYZ\fR will first search the index for the best match to abcxyz (because the pattern is converted to lower case); it will find that there are matches with no errors, and will go to those files to search them directly, this time with the original upper cases. If the closest match is, say AbcXYZ, glimpse may miss it, because it doesn't expect an error. Another problem is speed. If you search for "ATT", it will look at the index for "att". Unless you use -w to match the whole word, glimpse may have to search all files containing, for example, "Seattle" which has "att" in it. .LP There is no size limit for simple patterns and simple patterns within Boolean expressions.
More complicated patterns, such as regular expressions, are currently limited to approximately 30 characters. Lines are limited to 1024 characters. Records are limited to 48K, and may be truncated if they are larger than that. The limit of record length can be changed by modifying the parameter Max_record in agrep.h. .LP Glimpseindex does not index words of size > 64. .SH BUGS .LP In some rare cases, regular expressions using * or # may not match correctly. .LP A query that contains no alphanumeric characters is not recommended (unless glimpse is used as agrep and the file names are provided). This is an understatement. .LP The notion of "match to the whole word" (the \-w option) can be tricky sometimes. For example, glimpse -w 'word$' will not match 'word' appearing at the end of a line, because the extra '$' makes the pattern more than just one simple word. The same thing can happen with ^ and with _. To be on the safe side, use the -w option only when the patterns are actual words. .LP Please send bug reports or comments to gvelez@webglimpse.net. .SH DIAGNOSTICS Exit status is 0 if any matches are found, 1 if none, 2 for syntax errors or inaccessible files. .SH AUTHORS Udi Manber and Burra Gopal, Department of Computer Science, University of Arizona, and Sun Wu, the National Chung-Cheng University, Taiwan. Now maintained by Golda Velez at Internet WorkShop (Email: gvelez@webglimpse.net)
glimpse-4.18.7/glimpse/glimpse.chronicle
0. Created file on 2/May/94 1. Added patches to main.c to use sizeof(char*) instead of 4 in relevant places. Same for other pointer mallocs. -- bg 2 May 1994 2. Successfully ported to DEC-ALPHA: changes to routines in agrep/agrep.c -- bg 10 May 1994 3. Added the new mgrep routine. Removed bugs (pattern too long) related to using shift-or/and algorithm to search for booleans: now it uses mgrep. -- bg 26-30 May 1994 4.
Added delimiter processing even with -f and -m. -- bg 31 May - 2 June 1994 5. Added structured queries support in June 1994 -- Syntax of glimpse: glimpse 'a1=v1,a2=v2...' (series of ORs) glimpse 'a1=v1;a2=v2...' (series of ANDs) -- Syntax of glimpseindex: glimpseindex -s -- NOTES: v1, v2 etc. must lie within the range of a1, a2, etc., i.e., if a1 is in the region [offset, offset+len] in a file, v1 must lie completely within that range. A new glimpse-file called .glimpse_attributes is created to hold the attributes discovered while indexing with -s. A-V searches may not give out an error message if the index is not built with -s. 6. Added glimpseserver to speed up queries by reading in the index ahead of time, during July 1994. 7. Integrated compression into glimpse during July 1994. The new option to glimpseindex can now index compressed files (those compressed with tcomp). 8. Added multipattern search for simple patterns with -w in compressed files during August 1994. 9. Ability to take input files from command line (-F) (Sep 10/94) 10. Added byte level index support (Sep 23/94) 11. Added support for arbitrary boolean expressions (Oct 10/94) 12. Added support for arbitrary filtering with -z option (Oct 94) 13. Completely integrated glimpseserver into Harvest (Jan 95) 14. Speeded up structured queries and -W option (filtering) (Feb 95) 15. Added -W -z support (June 1995) 16. Added undocumented options for index-search/analysis (SFS_COMPAT, Jun 1995) 17. Removed bugs related to structured queries and booleans w/ -b (Jul 1995) 18. Added -z and structured queries support (Aug 1995) 19. Changed maximum pattern and indexable word sizes (Sep 1995) 20. Made it portable to various architectures (see README.install, Oct 1995) 21. Added "-f filename" option to glimpse: it allows you to restrict the search to only those files whose names appear in "filename" (Jan 1996). 22. Fixed the agrep bug where -n was not working with ISO chars (Jan 1996). 23.
Added -t to glimpseindex that sorts .glimpse_filenames by decreasing order of modify time (st_mtime in stat structure); Added -@ option to glimpse to print time of file along with its name; (Feb 1996) 24. Added "-Y days" option to print files that were modified "days" before the index was created (Mar 1996). 25. Added support for handling filenames/directorynames with special characters: they can now have ' " & > < ! etc., whatever and glimpse works just fine. 26. Added conversion program for neighborhood manipulation in webglimpse (9/96). 27. Added limited support for NOT in glimpse (index search or -W only) (11/96). 28. Added support to search for patterns with repeating strings (11/96): "{computer;science},{computer;chronicles}" This now works in agrep as well as glimpse. However, it is for simple patterns only (i.e., no regexp or spelling-errors). 29. Fixed some nagging memory leaks and segfaults on Solaris (10/96). 30. Fixed multiple matches / missed matches problems with -W (11/96). 31. Release of version 4.1 (10/97)
glimpse-4.18.7/glimpse/glimpseindex.1
.TH GLIMPSEINDEX l "November 10, 1997" .SH NAME \fIglimpseindex 4.1\fP - index whole file systems to be searched by glimpse .SH OVERVIEW \fIGlimpse\fP (which stands for GLobal IMPlicit SEarch) is a popular UNIX indexing and query system that allows you to search through a large set of files very quickly. Glimpseindex is the indexing program for glimpse. Glimpse supports most of \fIagrep\fP's options (\fIagrep\fP is our powerful version of \fIgrep\fP) including approximate matching (e.g., finding misspelled words), Boolean queries, and even some limited forms of regular expressions. It is used in the same way, except that you don't have to specify file names.
So, if you are looking for a \fIneedle\fP anywhere in your file system, all you have to do is say \fIglimpse needle\fR and all lines containing \fIneedle\fP will appear preceded by the file name. See man glimpse for details on how to use glimpse. .LP Glimpseindex provides three indexing options: a tiny index (2-3% of the total size of all files), a small index (7-8%) and a medium-size index (20-30%). Search times are normally better with larger indexes (although unless files are quite large, the small index is just about as good as the medium one). To index all your files, you say \fIglimpseindex ~\fR for tiny index (where ~ stands for the home directory), \fIglimpseindex -o ~\fR for small index, and \fIglimpseindex -b ~\fR for medium. .LP Please submit bug reports or comments at http://webglimpse.net/bugzilla/ Mail majordomo@webglimpse.net with SUBSCRIBE WGUSERS in the message body to be added to the webglimpse mailing list, where glimpse discussion is also directed. HTML version of these manual pages can be found in http://webglimpse.net/docs/glimpseindexhelp.html Also, see the glimpse home pages in http://webglimpse.net/glimpse/ .SH SYNOPSIS .B glimpseindex [ \fB\-abEfFiInostT \-w \fInumber\fP \-dD \fIfilename(s) \-H \fIdirectory\fP \-M \fInumber\fP \-S \fInumber\fP\fR ] \fIdirectory_name[s]\fR .SH INTRODUCTION \fIGlimpseindex\fP builds an index of all text files in all the directories specified and all their subdirectories (recursively). It is also possible to build several separate indexes (possibly even overlapping). The simplest way to index your files is to say .LP \fIglimpseindex -o ~\fP .LP The index consists of several files (described in detail below), all with the prefix \fI.glimpse_\fR stored in the user's home directory (unless otherwise specified with the -H option). Files with one of the following suffixes are not indexed: ".o", ".gz", ".Z", ".z", ".hqx", ".zip", ".tar". (Unless the -z option is used, see below.) 
In addition, glimpseindex attempts to determine whether a file is a text file and does not index files that it thinks are not text files. Numbers are not indexed unless the -n option is used. It is possible to prevent specified files from being indexed by adding their names to the .glimpse_exclude file (described below). The -o option builds a larger index than without it (typically about 7-8% vs. 2-3% without -o) allowing for a faster search (1-5 times faster). The -b builds an even larger index and allows an even faster search some of the time (-b is helpful mostly when large files are present). There is an incremental indexing option \fI-f\fR, which updates an existing index by determining which files have been created or modified since the index was built and adding them to the index (see -f). Glimpseindex is reasonably fast, taking about 20 minutes to index 15,000 files of about 200MB (on a DEC Alpha 233) and 2-4 minutes to update an existing index. (Your mileage may vary.) It is also possible to increment the index by adding a specific file (the -a option). .LP Once an index is built, searching for \fIpattern\fP is as easy as saying .LP \fIglimpse pattern\fR .LP (See man glimpse for all glimpse's options and features.) .SH "A DETAILED DESCRIPTION OF GLIMPSEINDEX" .LP Glimpse does not automatically index files. You have to tell it to do it. This can be done manually, but a better way is to set it to run every night. It is probably a good idea to run glimpseindex manually for the first time to be sure it works properly. The following is a simple script to run glimpseindex every night. We assume that this script is stored in a file called glimpse.script: .LP glimpseindex -o -t -w 5000 ~ >& .glimpse_out .br at -m 0300 glimpse.script .br (It might be interesting to collect all the outputs of glimpse by changing >& to >>& so that the file .glimpse_out maintains a history. In this case the file must be created before the first time >>& is used.
If you use ksh, replace '>& .glimpse_out' with '> .glimpse_out 2>&1'.) .LP Glimpseindex stores the names of all the files that it indexed in the file .glimpse_filenames. Each file is listed by its full path name as obtained at the time the files were indexed. For example, /usr1/udi/file1. Glimpse uses this full name when it performs the search, so the name must match the current name. This may become a problem when the indexing and the search are done from different machines (e.g., through NFS), which may cause the path names to be different. For example, /tmp_mnt/R/xxx/xxx/usr1/udi/file1. (The same is true for several other .glimpse files. See below.) .LP Glimpseindex does not follow symbolic links unless they are explicitly included in the .glimpse_include file (described below). .LP Glimpseindex makes an effort to identify non-text files such as binary files, compressed files, uuencoded files, postscript files, binhex files, etc. These files are automatically skipped. In addition, all files whose names end with `.o', `.gz', `.Z', `.z', `.hqx', `.zip', or `.tar' will not be indexed (unless they are specifically included in .glimpse_include - see below). .LP The options for glimpseindex are as follows: .TP .B \-a adds the given file[s] and/or directories to an existing index. Any given directory will be traversed recursively and all files will be indexed (unless they appear in .glimpse_exclude; see below). Using this option is generally much faster than indexing everything from scratch, although in rare cases the index may not be as good. If for some reason the index is full (which can happen unless -o or -b is used) glimpseindex -a will produce an error message and will exit without changing the original index. .TP .B \-b builds a medium-size index (20-30% of the size of all files), allowing faster search. This option forces glimpseindex to store an exact (byte level) pointer to each occurrence of each word (except for some very common words belonging to the stop list).
.TP .B \-B uses a hash table that is 4 times bigger (256k entries instead of 64K) to speed up indexing. The memory usage will increase typically by about 2 MB. This option is only for indexing speed; it does not affect the final index. .TP .B \-d filename(s) deletes the given file(s) from the index. .TP .B \-D filename(s) deletes the given file(s) from the list of file names, but not from the index. This is much faster than -d, and the file(s) will not be found by glimpse. However, the index itself will not become smaller. .TP .B \-E does not run a check on file types. Glimpse normally attempts to exclude non-text files, but this attempt is not always perfect. With \-E, glimpseindex indexes all files, except those that are specifically excluded in .glimpse_exclude and those whose file names end with one of the excluded suffixes. .TP .B \-f incremental indexing. \fIglimpseindex\fP scans all files and adds to the index only those files that were created or modified after the current index was built. If there is no current index or if this procedure fails, \fIglimpseindex\fP automatically reverts to the default mode (which is to index everything from scratch). This option may create an inefficient index for several reasons, one of which is that deleted files are not really deleted from the index. Unless changes are small, mostly additions, and -o is used, we suggest to use the default mode as much as possible. .TP .B \-F Glimpseindex receives the list of files to index from standard input. .TP .B \-H directory Put or update the index and all other .glimpse files (listed below) in "directory". The default is the home directory. When glimpse is run, the -H option must be used to direct glimpse to this directory, because glimpse assumes that the index is in the home directory (see also the -H option in glimpse). 
.TP .B \-i Make .glimpse_include (SEE GLIMPSEINDEX FILES) take precedence over .glimpse_exclude, so that, for example, one can exclude everything (by putting *) and then explicitly include files. .TP .B \-I Instead of indexing, only show (print to standard out) the list of files that would be indexed. It is useful for filtering purposes. ("glimpseindex -I dir | glimpseindex -F" is the same as "glimpseindex dir".) .TP .B \-M x Tells glimpseindex to use x MB of memory for temporary tables. The more memory you allow the faster glimpseindex will run. The default is x=2. The value of x must be a positive integer. Glimpseindex will need more memory than x for other things, and glimpseindex may perform some 'forks', so you'll have to experiment if you want to use this option. WARNING: If x is too large you may run out of swap space. .TP .B \-n Index numbers as well as text. The default is not to index numbers. This is useful when searching for dates or other identifying numbers, but it may make the index very large if there are lots of numbers. In general, glimpseindex strips away any non-alphabetic character. For example, the string abc123 will be indexed as abc if the -n option is not used and as abc123 if it is used. Glimpse provides warnings (in .glimpse_messages) for all files in which more than half the words that were added to the index from that file had digits in them (this is an attempt to identify data files that should probably not be indexed). One can use the .glimpse_exclude file to exclude data files or any other files. (See GLIMPSEINDEX FILES.) .TP .B \-o Build a small index rather than tiny (meaning 7-9% of the sizes of all files - your mileage may vary) allowing faster search. This option forces glimpseindex to allocate one block per file (a block usually contains many files). A detailed explanation of how blocks affect glimpse can be found in the glimpse article. (See also LIMITATIONS.) 
.TP .B \-R Recompute .glimpse_filenames_index from .glimpse_filenames. The file .glimpse_filenames_index speeds up processing. Glimpseindex usually computes it automatically. However, if for some reason one wants to change the path names of the files listed in .glimpse_filenames, then running glimpseindex -R recomputes .glimpse_filenames_index. This is useful if the index is computed on one machine, but is used on another (with the same hierarchy). The names of the files listed in .glimpse_filenames are used at run time, so changing them can be done at any time in any way (as long as only the names, not the contents, are changed). This is not really an option in the regular sense; rather, it is a program by itself, and it is meant as a post-processing step. (Available only from version 3.6.) .TP .B \-s supports structured queries. This option was added to support the Harvest project and it is applicable mostly in that context. See STRUCTURED QUERIES below for more information and also http://harvest.sourceforge.net/ for more information about the Harvest project. .TP .B \-S k The number k determines the size of the \fIstop-list\fP. The stop-list consists of words that are too common and are not indexed (e.g., 'the' or 'and'). Instead of having a fixed stop-list, glimpseindex figures out the words that are too common for every index separately. The rules are different for the different indexing options. The tiny index contains all words (the savings from a stop-list are too small to bother). For the small index (-o), the number k is a percentage threshold. A word will be in the stop list if it appears in at least k% of all files. The default value is 80%. (If there are fewer than 256 files, then the stop-list is not maintained.) The medium index (-b) counts all occurrences of all words, and a word is added to the stop-list if it appears at least k times per MByte. The default value is 500. A query that includes a stop list word is of course less efficient.
(See also LIMITATIONS below.) .TP .B \-t (A new option in version 3.5.) The order in which files are indexed is determined by scanning the directories, which is mostly arbitrary. With the \-t option, combined with either \-o or \-b, the indexed files are stored in reverse order of modification age (younger files first). Results of queries are then automatically returned in this order. Furthermore, glimpse can filter results by age; for example, asking to look at only files that are at most 5 days old. .TP .B \-T builds the turbo file. Starting at version 3.0, this is the default, so using this option has no effect. .TP .B \-w k Glimpseindex does a reasonable, but not a perfect, job of determining which files should not be indexed. Sometimes a large text file should not be indexed; for example, a dictionary may match most queries. The -w option stores in a file called .glimpse_messages (in the same directory as the index) the list of all files that contribute at least \fIk\fP new words to the index. The user can look at this list of files and decide which should or should not be indexed. The file .glimpse_exclude contains files that will not be indexed (see more below). We recommend setting \fIk\fP to about 1000. This is not an exact measure. For example, if the same file appears twice, then the second copy will not contribute any new words to the dictionary (but if you exclude the first copy and index again, the second copy will contribute). .TP .B \-X (starting at version 4.0B1) Extract titles from HTML pages and add the titles to the index (in .glimpse_filenames). (This feature was added to improve the performance of WebGlimpse.) Works only on files whose names end with .html, .htm, .shtml, or .shtm. (See glimpse.h/EXTRACT_INFO_SUFFIX to add to these suffixes.) The routine to extract titles is called extract_info, in index/filetype.c. This feature can be modified in various ways to extract info from many filetypes.
The titles are appended to the corresponding filenames with a space separator. Glimpseindex assumes that filenames don't have spaces in them. .TP .B \-z Allow customizable filtering, using the file .glimpse_filters to run the programs listed there on each matching file. The best example is compress/decompress. If .glimpse_filters includes the line .br *.Z uncompress < .br (separated by tabs) then before indexing any file that matches the pattern "*.Z" (same syntax as the one for .glimpse_exclude) the command listed is executed first (assuming input is from stdin, which is why uncompress needs <) and its output (assuming it goes to stdout) is indexed. The file itself is not changed (i.e., it stays compressed). Then if glimpse -z is used, the same program is run on these files on the fly. Any program can be used (we run 'exec'). For example, one can filter out parts of files that should not be indexed. Glimpseindex tries to apply all filters in .glimpse_filters in the order they are given. For example, if you want to uncompress a file and then extract some part of it, put the uncompress command (the example above) first and then another line that specifies the extraction. Note that this can slow down the search because the filters need to be run before files are searched. .SH "GLIMPSEINDEX FILES" .LP All files used by glimpse are located in the directory(ies) where the index(es) is (are) stored and have .glimpse_ as a prefix. The first two files (.glimpse_exclude and .glimpse_include) are optionally supplied by the user. The other files are built and read by glimpse. .LP .IP "\fB.glimpse_exclude\fR" contains a list of files that glimpseindex is explicitly told to ignore. In general, the syntax of .glimpse_exclude/include is the same as that of agrep (or any other grep). The lines in the .glimpse_exclude file are matched to the file names, and if they match, the files are excluded. Notice that agrep matches parts of the string!
e.g., agrep /ftp/pub will match /home/ftp/pub and /ftp/pub/whatever. So, if you want to exclude /ftp/pub/core, you just list it, as is, in the .glimpse_exclude file. If you put "/home/ftp/pub/cdrom" in .glimpse_exclude, every file name that matches that string will be excluded, meaning all files below it. You can use ^ to indicate the beginning of a file name, and $ to indicate the end of one, and you can use * and ? in the usual way. For example /ftp/*html will exclude /ftp/pub/foo.html, but will also exclude /home/ftp/pub/html/whatever; if you want to exclude files that start with /ftp and end with html use ^/ftp*html$ Notice that putting a * at the beginning or at the end is redundant (in fact, in this case glimpseindex will remove the * when it does the indexing). No other meta characters are allowed in .glimpse_exclude (e.g., don't use .* or # or |). Lines with * or ? must have no more than 30 characters. Notice that, although the index itself will not be indexed, the list of file names (.glimpse_filenames) will be indexed unless it is explicitly listed in .glimpse_exclude. .IP "\fB.glimpse_filters\fR" See the description above for the -z option. .IP "\fB.glimpse_include\fR" contains a list of files that glimpseindex is explicitly told to \fIinclude\fP in the index even though they may look like non-text files. Symbolic links are followed by glimpseindex only if they are specifically included here. The syntax is the same as the one for .glimpse_exclude (see there). If a file is in both .glimpse_exclude and .glimpse_include it will be excluded unless -i is used. .IP "\fB.glimpse_filenames\fP" contains the list of all indexed file names, one per line. This is an ASCII file that can also be used with agrep to search for a file name leading to a fast find command. For example, .br glimpse 'count#\\.c$' ~/.glimpse_filenames .br will output the names of all (indexed) .c files that have 'count' in their name (including anywhere on the path from the index). 
Setting the following alias in the .login file may be useful: .br alias findfile 'glimpse -h \!:1 ~/.glimpse_filenames' .IP "\fB.glimpse_index\fP" contains the index. The index consists of lines, each starting with a word followed by a list of block numbers (unless the -o or -b options are used, in which case each word is followed by an offset into the file .glimpse_partitions where all pointers are kept). The block/file numbers are stored in binary form, so this is not an ASCII file. .IP "\fB.glimpse_messages\fP" contains the output of the -w option (see above). .IP "\fB.glimpse_partitions\fP" contains the partition of the indexed space into blocks and, when the index is built with the -o or -b options, some part of the index. This file is used internally by glimpse and it is a non-ASCII file. .IP "\fB.glimpse_statistics\fP" contains some statistics about the makeup of the index. Useful for some advanced applications and customization of glimpse. .SH "STRUCTURED QUERIES" Glimpse can search for Boolean combinations of "attribute=value" terms by using the Harvest SOIF parser library (in glimpse/libtemplate). To search this way, the index must be made by using the -s option of glimpseindex (this can be used in conjunction with other glimpseindex options). For glimpse and glimpseindex to recognize "structured" files, they must be in SOIF format. In this format, each value is prefixed by an attribute-name with the size of the value (in bytes) present in "{}" after the name of the attribute. For example, the following lines are part of an SOIF file: .br .nf type{17}: Directory-Listing md5{32}: 3858c73d68616df0ed58a44d306b12ba .fi Any string can serve as an attribute name. Glimpse "pattern;type=Directory-Listing" will search for "pattern" only in files whose type is "Directory-Listing". The file itself is considered to be one "object" and its name/url appears as the first attribute with an "@" prefix; e.g., @FILE { http://xxx...
} The scope of Boolean operations changes from records (lines) to whole files when structured queries are used in glimpse (since individual query terms can look at different attributes and they may not be "covered" by the record/line). Note that glimpse can only search for patterns in the value parts of the SOIF file: there are some attributes (like the TTL, MD5, etc.) that are interpreted by Harvest's internal routines. See RFC 2655 for more detailed information of the SOIF format. .SH "REFERENCES" .IP 1. U. Manber and S. Wu, "GLIMPSE: A Tool to Search Through Entire File Systems," \fIUsenix Winter 1994 Technical Conference\fP (best paper award), San Francisco (January 1994), pp. 23\-32. Also, Technical Report #TR 93-34, Dept. of Computer Science, University of Arizona, October 1993 (a postscript file is available by anonymous ftp at ftp://webglimpse.net/pub/glimpse/TR93-34.ps). .IP 2. S. Wu and U. Manber, "Fast Text Searching Allowing Errors," \fICommunications of the ACM\fP \fB35\fP (October 1992), pp. 83\-91. .SH "SEE ALSO" .BR agrep (1), .BR ed (1), .BR ex (1), .BR glimpse (1), .BR glimpseserver (1), .BR grep (1V), .BR sh (1), .BR csh (1). .SH LIMITATIONS .LP The index of glimpse is word based. A pattern that contains more than one word cannot be found in the index. The way glimpse overcomes this weakness is by splitting any multi-word pattern into its set of words and looking for all of them in the index. For example, \fBglimpse 'linear programming'\fR will first consult the index to find all files containing both \fIlinear\fP and \fIprogramming\fP, and then apply agrep to find the combined pattern. This is usually an effective solution, but it can be slow for cases where both words are very common, but their combination is not. .LP The index of glimpse stores all patterns in lower case. When glimpse searches the index it first converts all patterns to lower case, finds the appropriate files, and then searches the actual files using the original patterns. 
So, for example, \fIglimpse ABCXYZ\fR will first find all files containing abcxyz in any combination of lower and upper cases, and then search these files directly, so only the right cases will be found. One problem with this approach is discovering misspellings that are caused by wrong cases. For example, \fIglimpse -B abcXYZ\fR will first search the index for the best match to abcxyz (because the pattern is converted to lower case); it will find that there are matches with no errors, and will go to those files to search them directly, this time with the original upper cases. If the closest match is, say AbcXYZ, glimpse may miss it, because it doesn't expect an error. Another problem is speed. If you search for "ATT", it will look at the index for "att". Unless you use -w to match the whole word, glimpse may have to search all files containing, for example, "Seattle" which has "att" in it. .LP There is no size limit for simple patterns and simple patterns with Boolean AND or OR. More complicated patterns are currently limited to approximately 30 characters. Lines are limited to 1024 characters. Records are limited to 48K, and may be truncated if they are larger than that. The limit of record length can be changed by modifying the parameter Max_record in agrep.h. .LP Each line in .glimpse_exclude or .glimpse_include that contains a * or a ? must not exceed 30 characters in length. .LP Glimpseindex does not index words of size > 64. .LP A medium-size index (-b) may actually lead to slower query times if the files are all very small. .LP Under -b, it may be impossible to make the stop list empty. Glimpseindex uses the "sort" routine, and all occurrences of a word appear at some point on one line. Sort limits the size of lines it can handle (the value depends on the platform; ours is 16KB). If the lines are too big, the word is added to the stop list.
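.LP
As a rough illustration of the two-phase search described in this section (a lookup on lower-cased words, followed by a direct search of the candidate files with the original pattern), the following sketch imitates the idea with standard tools only. It does not call glimpse or glimpseindex. The temporary directory, the file names a.txt, b.txt and c.txt, and the use of grep in place of both the word index and agrep are assumptions made purely for the example.

```shell
#!/bin/sh
# Toy imitation of the two-phase search; grep stands in for the word
# index and for agrep (an assumption for illustration, not glimpse itself).
set -eu

dir=$(mktemp -d)          # hypothetical scratch area; assumes a path without spaces
trap 'rm -rf "$dir"' EXIT

printf 'Linear Programming is covered here.\n' > "$dir/a.txt"
printf 'linear algebra first.\nprogramming comes later.\n' > "$dir/b.txt"
printf 'nothing relevant.\n' > "$dir/c.txt"

# Phase 1: keep only files that contain every word of the pattern,
# ignoring case, much as a lower-cased word index would.
phase1=$(grep -il 'linear' "$dir"/*.txt | xargs grep -il 'programming')

# Phase 2: search just those candidates for the original pattern,
# with the original case and word order.
phase2=$(grep -l 'Linear Programming' $phase1)

for f in $phase1; do echo "candidate: $(basename "$f")"; done
for f in $phase2; do echo "match: $(basename "$f")"; done
```

Here b.txt survives the first phase because it contains both words, but it is rejected in the second phase because the words never appear together. This is the situation in which the approach costs extra time: many candidates, few real matches.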
.SH BUGS .LP Please submit bug reports or comments at http://webglimpse.net/bugzilla/ .SH DIAGNOSTICS (Only in version 3.6 and above.) .br exit status 0: terminated normally; .br exit status 1: glimpseindex errors (e.g., bad option combos, no files were indexed, etc.) .br exit status 2: system errors (e.g., write failed, sort failed, malloc failed). .SH AUTHORS Udi Manber and Burra Gopal, Department of Computer Science, University of Arizona, and Sun Wu, the National Chung-Cheng University, Taiwan. Now maintained by Golda Velez at Internet WorkShop (Email: gvelez@webglimpse.net) glimpse-4.18.7/glimpse/glimpseserver.1000066400000000000000000000061071300371307100176630ustar00rootroot00000000000000.TH GLIMPSESERVER l "October 13, 1997" .SH NAME \fIglimpseserver 4.1\fP - a server version of the glimpse searching package. .SH OVERVIEW \fIGlimpse\fP is an indexing and query system that allows you to search through all your files very quickly. The use of glimpse in servers that handle frequent queries is growing, which is why we wrote glimpseserver to make searches more efficient. Glimpseserver starts a process that listens for queries, runs glimpse, and sends the answers back. The main advantage is that the index is read into memory only once, saving a lot of I/O. Glimpse communicates with glimpseserver through a given port number. See the warning about security below. .LP .SH SYNOPSIS .B glimpseserver [ \fB\-H \fIdir\fP \-K \fIport\fP \-J \fIhost\fP. ] .SH "DESCRIPTION" .LP .TP .B \-H \fIdir\fP specifies the directory of the index. Similar to the \-H option of glimpse. The default directory is the value of the environment variable $HOME if that is set, otherwise it is the current directory. .TP .B \-K \fIport\fP this is the TCP port for communication: glimpseserver waits for requests on this port and clients that want to search using the index specified by the \-H option must use this port (by calling glimpse -K). The default port number is 2001.
.TP .B \-J \fIhost\fP the name of the host. The default is the host where glimpseserver is running, which is probably the only possibility anyway. .SH "RESTARTING" .LP If a new index is created by running glimpseindex every night, restarting a new glimpseserver is now easier: simply send a SIGUSR2 (signal #31 - i.e., "kill -31 pid") to glimpseserver; it then re-reads the NEW index and is ready to serve requests again. (A SIGHUP, i.e., signal #1, can also be sent instead of SIGUSR2 to make the glimpseserver re-read the new index.) The recommended way to do a fresh indexing while the server is still running is: .br send SIGSTOP to glimpseserver .br do the indexing .br send SIGUSR2 to glimpseserver .br send SIGCONT to glimpseserver (to ask it to continue after stop) .br The SIGSTOP is required so that glimpseserver doesn't answer any queries while the indexing is going on. .SH "WARNING" .LP Glimpseserver should be used only for public servers. Any client that knows the port number can get any information available in the index (and port numbers are not that secret). When glimpse is run as a standalone application it requires read permission of the index and all the files. When glimpse uses the \-C option to communicate with glimpseserver, glimpse (the client) does not require any permission, because glimpseserver does all the searching. So, we recommend not to run glimpseserver on any data that should be protected. Glimpseserver is meant to be used for public data. .SH "SEE ALSO" .BR glimpse (1), .BR glimpseindex (1), .SH BUGS .LP Please submit bug reports or comments at http://webglimpse.net/bugzilla/ .SH AUTHORS Udi Manber and Burra Gopal, Department of Computer Science, University of Arizona, and Sun Wu, the National Chung-Cheng University, Taiwan. 
Now maintained by Golda Velez at Internet WorkShop (Email: gvelez@webglimpse.net) glimpse-4.18.7/glimpse/install-sh000077500000000000000000000112441300371307100167140ustar00rootroot00000000000000#!/bin/sh # # install - install a program, script, or datafile # This comes from X11R5. # # Calling this script install-sh is preferred over install.sh, to prevent # `make' implicit rules from creating a file called install from it # when there is no Makefile. # # This script is compatible with the BSD install script, but was written # from scratch. # # set DOITPROG to echo to test this script # Don't use :- since 4.3BSD and earlier shells don't like it. doit="${DOITPROG-}" # put in absolute paths if you don't have them in your path; or use env. vars. mvprog="${MVPROG-mv}" cpprog="${CPPROG-cp}" chmodprog="${CHMODPROG-chmod}" chownprog="${CHOWNPROG-chown}" chgrpprog="${CHGRPPROG-chgrp}" stripprog="${STRIPPROG-strip}" rmprog="${RMPROG-rm}" mkdirprog="${MKDIRPROG-mkdir}" tranformbasename="" transform_arg="" instcmd="$mvprog" chmodcmd="$chmodprog 0755" chowncmd="" chgrpcmd="" stripcmd="" rmcmd="$rmprog -f" mvcmd="$mvprog" src="" dst="" dir_arg="" while [ x"$1" != x ]; do case $1 in -c) instcmd="$cpprog" shift continue;; -d) dir_arg=true shift continue;; -m) chmodcmd="$chmodprog $2" shift shift continue;; -o) chowncmd="$chownprog $2" shift shift continue;; -g) chgrpcmd="$chgrpprog $2" shift shift continue;; -s) stripcmd="$stripprog" shift continue;; -t=*) transformarg=`echo $1 | sed 's/-t=//'` shift continue;; -b=*) transformbasename=`echo $1 | sed 's/-b=//'` shift continue;; *) if [ x"$src" = x ] then src=$1 else # this colon is to work around a 386BSD /bin/sh bug : dst=$1 fi shift continue;; esac done if [ x"$src" = x ] then echo "install: no input file specified" exit 1 else true fi if [ x"$dir_arg" != x ]; then dst=$src src="" if [ -d $dst ]; then instcmd=: else instcmd=mkdir fi else # Waiting for this to be detected by the "$instcmd $src $dsttmp" command # might cause 
directories to be created, which would be especially bad # if $src (and thus $dsttmp) contains '*'. if [ -f $src -o -d $src ] then true else echo "install: $src does not exist" exit 1 fi if [ x"$dst" = x ] then echo "install: no destination specified" exit 1 else true fi # If destination is a directory, append the input filename; if your system # does not like double slashes in filenames, you may need to add some logic if [ -d $dst ] then dst="$dst"/`basename $src` else true fi fi ## this sed command emulates the dirname command dstdir=`echo $dst | sed -e 's,[^/]*$,,;s,/$,,;s,^$,.,'` # Make sure that the destination directory exists. # this part is taken from Noah Friedman's mkinstalldirs script # Skip lots of stat calls in the usual case. if [ ! -d "$dstdir" ]; then defaultIFS=' ' IFS="${IFS-${defaultIFS}}" oIFS="${IFS}" # Some sh's can't handle IFS=/ for some reason. IFS='%' set - `echo ${dstdir} | sed -e 's@/@%@g' -e 's@^%@/@'` IFS="${oIFS}" pathcomp='' while [ $# -ne 0 ] ; do pathcomp="${pathcomp}${1}" shift if [ ! -d "${pathcomp}" ] ; then $mkdirprog "${pathcomp}" else true fi pathcomp="${pathcomp}/" done fi if [ x"$dir_arg" != x ] then $doit $instcmd $dst && if [ x"$chowncmd" != x ]; then $doit $chowncmd $dst; else true ; fi && if [ x"$chgrpcmd" != x ]; then $doit $chgrpcmd $dst; else true ; fi && if [ x"$stripcmd" != x ]; then $doit $stripcmd $dst; else true ; fi && if [ x"$chmodcmd" != x ]; then $doit $chmodcmd $dst; else true ; fi else # If we're going to rename the final executable, determine the name now. if [ x"$transformarg" = x ] then dstfile=`basename $dst` else dstfile=`basename $dst $transformbasename | sed $transformarg`$transformbasename fi # don't allow the sed command to completely eliminate the filename if [ x"$dstfile" = x ] then dstfile=`basename $dst` else true fi # Make a temp file name in the proper directory. 
dsttmp=$dstdir/#inst.$$# # Move or copy the file name to the temp name $doit $instcmd $src $dsttmp && trap "rm -f ${dsttmp}" 0 && # and set any options; do chmod last to preserve setuid bits # If any of these fail, we abort the whole thing. If we want to # ignore errors from any of these, just make sure not to ignore # errors from the above "$doit $instcmd $src $dsttmp" command. if [ x"$chowncmd" != x ]; then $doit $chowncmd $dsttmp; else true;fi && if [ x"$chgrpcmd" != x ]; then $doit $chgrpcmd $dsttmp; else true;fi && if [ x"$stripcmd" != x ]; then $doit $stripcmd $dsttmp; else true;fi && if [ x"$chmodcmd" != x ]; then $doit $chmodcmd $dsttmp; else true;fi && # Now rename the file to the real destination. $doit $rmcmd -f $dstdir/$dstfile && $doit $mvcmd $dsttmp $dstdir/$dstfile fi && exit 0 glimpse-4.18.7/glimpse/main.c000066400000000000000000003636031300371307100160110ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ /* bgopal: (1993-4) redesigned/rewritten using agrep's library interface */ #include #include #include #include "glimpse.h" #include "defs.h" #include #include "checkfile.h" #include #include #include #include /* for flock definition */ #if ISO_CHAR_SET #include /* support for 8bit character set */ #endif #define CLIENTSERVER 1 #define USE_MSGHDR 0 #define USE_UNIXDOMAIN 0 #define DEBUG 0 #define DEF_SERV_PORT 2001 #define MIN_SERV_PORT 1024 #define MAX_SERV_PORT 30000 #define SERVER_QUEUE_SIZE 10 /* number of requests to buffer up while processing one request = 5 */ /* Borrowed from C-Lib */ extern char **environ; extern int errno; #if CLIENTSERVER #include "communicate.c" #endif /*CLIENTSERVER*/ /* For client-server protocol */ CHAR *SERV_HOST = NULL; int SERV_PORT; char glimpse_reqbuf[MAX_ARGS*MAX_NAME_LEN]; extern int glimpse_clientdied; /* set if signal received about dead socket: need agrep variable so that exec() can return quickly */ int glimpse_reinitialize = 0; /* Borrowed from 
agrep.c */ extern int D_length; /* global variable in agrep */ extern int D; /* global variable in agrep */ extern int pattern_index; /* These are used for byte level index search */ extern CHAR CurrentFileName[MAX_LINE_LEN]; extern int SetCurrentFileName; extern int CurrentByteOffset; extern int SetCurrentByteOffset; extern long CurrentFileTime; extern int SetCurrentFileTime; extern int execfd; extern int agrep_initialfd; extern CHAR *agrep_inbuffer; extern int agrep_inlen; extern int agrep_inpointer; extern FILE *agrep_finalfp; extern CHAR *agrep_outbuffer; extern int agrep_outlen; extern int agrep_outpointer; extern int glimpse_call; /* prevent agrep from printing out its usage */ extern int glimpse_isserver; /* prevent agrep from asking for user input */ int first_search = 1; /* intra/interaction in process_query() and glimpse_search() */ #if ISSERVER int RemoteFiles = 0; /* Are the files present locally or remotely? If on, then -NQ is automatically added to all search options for each query */ #endif /* Borrowed from index/io.c */ extern int InfoAfterFilename; extern int OneFilePerBlock; extern int StructuredIndex; extern unsigned int *dest_index_set; extern unsigned char *dest_index_buf; extern unsigned int *src_index_set; extern unsigned char *src_index_buf; extern unsigned char *merge_index_buf; extern int mask_int[32]; extern int indexable_char[256]; int test_indexable_char[256]; extern int p_table[MAX_PARTITION]; extern int GMAX_WORD_SIZE; extern int IndexNumber; /* used in getword() */ extern int InterpretSpecial; /* used to "not-split" agrep-regexps */ extern int UseFilters; /* defined in build_in.c, used for filtering routines in io.c */ extern int ByteLevelIndex; extern int RecordLevelIndex; extern int rdelim_len; extern char rdelim[MAX_LINE_LEN]; extern char old_rdelim[MAX_LINE_LEN]; extern int file_num; extern int REAL_PARTITION, REAL_INDEX_BUF, MAX_ALL_INDEX, FILEMASK_SIZE; /* Borrowed from get_filename.c */ extern int bigbuffer_size; extern char 
*bigbuffer; extern char *outputbuffer; /* OPTIONS/FLAGS */ int veryfast = 0; int CONTACT_SERVER = 0; /* Should client try to call server at all or just process query on its own? */ int NOBYTELEVEL = 0; /* Some cases where we cannot do byte level fast-search: ALWAYS 0 if !ByteLevelIndex */ int OPTIMIZEBYTELEVEL = 0; /* Some cases where we don't want to do byte level search since number of files is small */ int GCONSTANT = 0; /* should pattern be taken as-is or parsed? */ int GLIMITOUTPUT = 0; /* max no. of output lines: 0=>infinity=default=nolimit */ int GLIMITTOTALFILE = 0; /* max no. of files to match: 0=>infinity=default=nolimit */ int GLIMITPERFILE = 0; /* not used in glimpse */ int GBESTMATCH = 0; /* Should I change -B to -# where # = no. of errors? */ int GRECURSIVE = 0; int GNOPROMPT = 0; int GBYTECOUNT = 0; int GPRINTFILENUMBER = 0; int GPRINTFILETIME = 0; int GOUTTAIL = 0; int GFILENAMEONLY = 0; /* how to do it if it is an and expression in structured queries */ int GNOFILENAME=0; int GPRINTNONEXISTENTFILE = 0; /* if filename is not there in index, then at least let user know its name */ int MATCHFILE = 0; int PRINTATTR = 0; int PRINTINDEXLINE = 0; int Pat_as_is=0; int Only_first=0; /* Do index search only */ int PRINTAPPXFILEMATCH=0; /* Print places in file where match occurs: useful with -b only to analyse the index */ int GCOUNT=0; /* print number of matches rather than actual matches: used only when PRINTAPPX = 1 */ int HINTSFROMUSER=0; /* The user gives the hints about where we should search (result of adding -EQNgy) */ int WHOLEFILESCOPE=0; /* used only when foundattr is NOT set: otherwise, scope is whole file anyway */ int foundattr=0; /* set in split.c -- != 0 only when StructuredIndex AND query is structured */ int foundnot=0; /* set in split.c -- != 0 only when the not operator (~) is present in the pattern */ int FILENAMESINFILE=0; /* whether the user is providing an explicit list of filenames to be searched for pattern (if absent, then means all 
files) */ int BITFIELDFILE=0; /* Based on contribution From ada@mail2.umu.se Fri Jul 12 01:56 MST 1996; Christer Holgersson, Sen. SysNet Mgr, Umea University/SUNET, Sweden */ int BITFIELDOFFSET=0; int BITFIELDLENGTH=0; int BITFIELDENDIAN=0; int GNumDays = 0; /* whether the user wants files modified within these many days before creating the index: only >0 makes sense */ /* structured queries */ CHAR ***attr_vals; /* matrix of char pointers: row=max #of attributes, col=max possible values */ CHAR **attr_found; /* did the expression corr. to each value in attr_vals match? */ ParseTree *GParse; /* what kind of expression corr. to attr are we looking for */ /* arbitrary booleans */ ParseTree terminals[MAXNUM_PAT]; /* parse tree's terminal node pointers pt. to elements of this array; also used outside */ char matched_terminals[MAXNUM_PAT]; /* ...[i] is 1 if i'th terminal matched: used in filter_output and eval_tree */ int num_terminals; /* number of terminal patterns */ int ComplexBoolean=0; /* 1 if we need to use parse trees and the eval function */ /* index search */ CHAR *pat_list[MAXNUM_PAT]; /* complete words within global pattern */ int pat_lens[MAXNUM_PAT]; /* their lengths */ int pat_attr[MAXNUM_PAT]; /* set of attributes */ int is_mgrep_pat[MAXNUM_PAT]; int mgrep_pat_index[MAXNUM_PAT]; int num_mgrep_pat; CHAR pat_buf[(MAXNUM_PAT + 2)*MAXPAT]; int pat_ptr = 0; extern char INDEX_DIR[MAX_LINE_LEN]; char *TEMP_DIR = NULL; /* directory to store glimpse temporary files, usually /tmp unless -T is specified */ char indexnumberbuf[256]; /* to read in first few lines of the index */ char *index_argv[MAX_ARGS]; int index_argc = 0; int bestmatcherrors=0; /* set during index search, used later on */ int patindex; int patbufpos = -1; char tempfile[MAX_NAME_LEN]; char *filenames_file = NULL; char *bitfield_file = NULL; /* agrep search */ char *agrep_argv[MAX_ARGS]; int agrep_argc = 0; CHAR *FileOpt; /* the option list after -F */ int fileopt_length; CHAR GPattern[MAXPAT]; int 
GM; CHAR APattern[MAXPAT]; int AM; CHAR GD_pattern[MAXPAT]; int GD_length; CHAR **GTextfiles; CHAR **GTextfilenames; int *GFileIndex; int GNumfiles; int GNumpartitions; CHAR GProgname[MAXNAME]; /* persistent file descriptors */ #if BG_DEBUG FILE *debug; /* file descriptor for debugging output */ #endif /*BG_DEBUG*/ FILE *timesfp = NULL; FILE *timesindexfp = NULL; FILE *indexfp = NULL; /* glimpse index */ FILE *partfp = NULL; /* glimpse partitions */ FILE *minifp = NULL; /* glimpse turbo */ FILE *nullfp = NULL; /* to discard output: agrep -s doesn't work properly */ int svstdin = 0, svstdout = 1, svstderr = 2; static int one = 1; /* to set socket option so that glimpseserver releases socket after death */ /* Index manipulation */ struct offsets **src_offset_table; struct offsets **multi_dest_offset_table[MAXNUM_PAT]; unsigned int *multi_dest_index_set[MAXNUM_PAT]; extern free_list(); struct stat index_stat_buf, file_stat_buf; int timesindexsize = 0; int last_Y_filenumber = 0; /* Direct agrep access for bytelevel-indices */ extern int COUNT, INVERSE, TCOMPRESSED, NOFILENAME, POST_FILTER, OUTTAIL, BYTECOUNT, SILENT, NEW_FILE, LIMITOUTPUT, LIMITPERFILE, LIMITTOTALFILE, PRINTRECORD, DELIMITER, SILENT, FILENAMEONLY, num_of_matched, prev_num_of_matched, FILEOUT; CHAR matched_region[MAX_REGION_LIMIT*2 + MAXPATT*2]; int RegionLimit=DEFAULT_REGION_LIMIT; /* Returns number of matched records/lines. 
Uses agrep's options to output stuff nicely; never called with RecordLevelIndex set */ int glimpse_search(AM, APattern, GD_length, GD_pattern, realfilename, filename, fileindex, src_offset_table, outfp) int AM; unsigned char APattern[]; int GD_length; unsigned char GD_pattern[]; char *realfilename; char *filename; int fileindex; struct offsets *src_offset_table[]; FILE *outfp; { FILE *infp; char sig[SIGNATURE_LEN]; struct offsets **p1, *tp1; CHAR *text, *curtextend, *curtextbegin, c; int times; int num, ret=0, totalret = 0; int prevoffset = 0, begininterval = 0, endinterval = -1; CHAR *beginregionptr = 0, *endregionptr = 0; int beginpage = 0, endpage = -1; static int MAXTIMES, MAXPGTIMES, pagesize; static int first_time = 1; /* * If can't open file for read, quit * For each offset for that file: * seek to that point * go back until delimiter, go forward until delimiter, output it: MAX_REGION_LIMIT is 16K on either side. * read in units of RegionLimit * before outputting matched record, use options to put prefixes (or use memagrep which does everything?) * Algorithm changed: don't read same page in twice. */ if (first_time) { pagesize = DISKBLOCKSIZE; MAXTIMES = ((MAX_REGION_LIMIT / RegionLimit) > 1) ? (MAX_REGION_LIMIT / RegionLimit) : 1; MAXPGTIMES = ((MAX_REGION_LIMIT / pagesize) > 1) ? 
(MAX_REGION_LIMIT / pagesize) : 1; first_time = 0; } /* Safety: must end/begin with delim */ memcpy(matched_region, GD_pattern, GD_length); memcpy(matched_region+MAXPATT+2*MAX_REGION_LIMIT, GD_pattern, GD_length); text = &matched_region[MAX_REGION_LIMIT+MAXPATT]; if ((infp = my_fopen(filename, "r")) == NULL) return 0; NEW_FILE = ON; #if 0 /* Cannot search in .CZ files since offset computations will be incorrect */ TCOMPRESSED = ON; if (!tuncompressible_filename(file_list[i], strlen(file_list[i]))) TCOMPRESSED = OFF; num_read = fread(sig, 1, SIGNATURE_LEN, infp); if ((TCOMPRESSED == ON) && tuncompressible(sig, num_read)) { EASYSEARCH = sig[SIGNATURE_LEN-1]; if (!EASYSEARCH) { fprintf(stderr, "not compressed for easy-search: can miss some matches in: %s\n", CurrentFileName); /* not filename!!! */ } } else TCOMPRESSED = OFF; #endif /*0*/ p1 = &src_offset_table[fileindex]; while (*p1 != NULL) { if ( (begininterval <= (*p1)->offset) && (endinterval > (*p1)->offset) ) { /* already covered this area */ #if DEBUG printf("ignoring %d in [%d,%d]\n", (*p1)->offset, begininterval, endinterval); #endif /*DEBUG*/ tp1 = *p1; *p1 = (*p1)->next; my_free(tp1, sizeof(struct offsets)); continue; } TCOMPRESSED = OFF; #if 1 if ( (beginpage <= (*p1)->offset) && (endpage >= (*p1)->offset) && (text + ((*p1)->offset - prevoffset) + GD_length < endregionptr)) { /* beginregionptr = curtextend - GD_length; /* prevent next curtextbegin to go behind previous curtextend (!) 
*/ text += ((*p1)->offset - prevoffset); prevoffset = (*p1)->offset; if (!((curtextend = forward_delimiter(text, endregionptr, GD_pattern, GD_length, 1)) < endregionptr)) goto fresh_read; if (!((curtextbegin = backward_delimiter(text, beginregionptr, GD_pattern, GD_length, 0)) > beginregionptr)) goto fresh_read; } else { /* NOT within an area already read: must read another page: if record overlaps page, might read page twice: no time to fix */ fresh_read: prevoffset = (*p1)->offset; text = &matched_region[MAX_REGION_LIMIT+MAXPATT]; /* middle: points to occurrence of pattern */ endpage = beginpage = ((*p1)->offset / pagesize) * pagesize; /* endpage = (((*p1)->offset + pagesize) / pagesize) * pagesize */ endregionptr = beginregionptr = text - ((*p1)->offset - beginpage); /* overlay physical place starting from this logical point */ /* endregionptr = text + (endpage - (*p1)->offset); */ curtextbegin = curtextend = text; times = 0; while (times < MAXPGTIMES) { fseek(infp, endpage, 0); num = (&matched_region[MAX_REGION_LIMIT*2+MAXPATT] - endregionptr < pagesize) ? (&matched_region[MAX_REGION_LIMIT*2+MAXPATT] - endregionptr) : pagesize; if ((num = fread(endregionptr, 1, num, infp)) <= 0) break; endpage += num; endregionptr += num; if (endregionptr <= text) { curtextend = text; /* error in value of offset: file was modified and offsets no longer true: your RISK!
*/ break; } if (((curtextend = forward_delimiter(text, endregionptr, GD_pattern, GD_length, 1)) < endregionptr) || (endregionptr >= &matched_region[MAX_REGION_LIMIT*2 + MAXPATT])) break; times ++; } times = 0; while (times < MAXPGTIMES) { /* I have already read the initial page since endpage is beginpage initially */ if ((curtextbegin = backward_delimiter(text, beginregionptr, GD_pattern, GD_length, 0)) > beginregionptr) break; if (beginpage > 0) { if (beginregionptr - pagesize < &matched_region[MAXPATT]) { if ((num = beginregionptr - &matched_region[MAXPATT]) <= 0) break; } else num = pagesize; beginpage -= num; beginregionptr -= num; } else break; times ++; fseek(infp, beginpage, 0); fread(beginregionptr, 1, num, infp); } } #else /*1*/ /* Find forward delimiter (including delimiter) */ times = 0; fseek(infp, (*p1)->offset, 0); while (times < MAXTIMES) { if ((num = fread(text+RegionLimit*times, 1, RegionLimit, infp)) > 0) curtextend = forward_delimiter(text, text+RegionLimit*times+num, GD_pattern, GD_length, 1); if ((curtextend < text+RegionLimit*times+num) || (num < RegionLimit)) break; times ++; } /* Find backward delimiter (including delimiter) */ times = 0; while (times < MAXTIMES) { num = ((*p1)->offset - RegionLimit*(times+1)) > 0 ? 
((*p1)->offset - RegionLimit*(times+1)) : 0; fseek(infp, num, 0); if (num > 0) { fread(text-RegionLimit*(times+1), 1, RegionLimit, infp); curtextbegin = backward_delimiter(text, text-RegionLimit*(times+1), GD_pattern, GD_length, 0); } else { fread(text-RegionLimit*times-(*p1)->offset, 1, (*p1)->offset, infp); curtextbegin = backward_delimiter(text, text-RegionLimit*times-(*p1)->offset, GD_pattern, GD_length, 0); } if ((num <= 0) || (curtextbegin > text-RegionLimit*(times+1))) break; times ++; } #endif /*1*/ /* set interval and delete the entry */ begininterval = (*p1)->offset - (text - curtextbegin); endinterval = (*p1)->offset + (curtextend - text); if (strncmp(curtextbegin, GD_pattern, GD_length)) { /* always pass enclosing delimiters to agrep; since we have seen text before curtextbegin + we have space, we can overwrite */ memcpy(curtextbegin - GD_length, GD_pattern, GD_length); curtextbegin -= GD_length; } #if DEBUG c = *curtextend; *curtextend = '\0'; printf("%s [%d < %d < %d], text = %d: %s\n", CurrentFileName, begininterval, (*p1)->offset, endinterval, text, curtextbegin); *curtextend = c; #endif /*DEBUG*/ tp1 = *p1; *p1 = (*p1)->next; my_free(tp1, sizeof(struct offsets)); if (curtextend <= curtextbegin) continue; /* error in offsets/delims */ /* * Don't call memagrep since that is heavy weight. Call exec * directly after doing agrep_search()'s preprocessing here. * PS: can add agrep variable not to do delim search if called from here * since that prevents unnecessarily scanning the buffer for the 2nd time. 
*/ CurrentByteOffset = begininterval+1; SetCurrentByteOffset = 1; first_search = 1; if (first_search) { if ((ret = memagrep_search(AM, APattern, curtextend-curtextbegin, curtextbegin, 0, outfp)) > 0) totalret ++; /* += ret */ else if ((ret < 0) && (errno == AGREP_ERROR)) { fclose(infp); return -1; } first_search = 0; } else { /* All agrep globals are properly set: has a bug because agrep's globals aren't properly reinitialized without agrep_search :-( */ agrep_finalfp = (FILE *)outfp; agrep_outlen = 0; agrep_outbuffer = NULL; agrep_outpointer = 0; execfd = agrep_initialfd = -1; agrep_inbuffer = curtextbegin; agrep_inlen = curtextend - curtextbegin; agrep_inpointer = 0; if ((ret = exec(-1, NULL)) > 0) totalret ++; /* += ret; */ else if ((ret < 0) && (errno == AGREP_ERROR)) { fclose(infp); return -1; } } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) break; /* done */ if ((totalret > 0) && FILENAMEONLY) break; } /* while *p1 != NULL */ SetCurrentByteOffset = 0; fclose(infp); if (totalret > 0) { /* dirty solution: must handle part of agrep here */ if (COUNT && !FILEOUT && !SILENT) { if(!NOFILENAME) fprintf(outfp, "%s: %d\n", CurrentFileName, totalret); else fprintf(outfp, "%d\n", totalret); } else if (FILEOUT) { file_out(realfilename); } } return totalret; } /* Sets lastfilenumber that needs to be searched: rest must be discarded */ int process_Y_option(num_files, num_days, fp) int num_files, num_days; FILE *fp; { CHAR arrayend[4]; last_Y_filenumber = 0; if ((num_days <= 0) || (fp == NULL) || (timesindexsize <= 0)) return 0; last_Y_filenumber = num_files; if (num_days * sizeof(int) >= timesindexsize) return 0; /* everything will be within so many days */ if (fseek(fp, num_days*sizeof(int), 0) == -1) return -1; fread(arrayend, 1, 4, fp); if ((last_Y_filenumber = (arrayend[0] << 24) | (arrayend[1] << 16) | (arrayend[2] << 8) | arrayend[3]) > num_files) last_Y_filenumber = 
num_files; if (last_Y_filenumber == 0) { last_Y_filenumber = 1; printf("Warning: no files modified in the last %d days were found in the index.\nSearching only the most recently modified file...\n", num_days); } return 0; } read_index(indexdir) char indexdir[MAXNAME]; { char *home; char s[MAXNAME]; int ret; if (indexdir[0] == '\0') { if ((home = (char *)getenv("HOME")) == NULL) { getcwd(indexdir, MAXNAME-1); fprintf(stderr, "using working-directory '%s' to locate index\n", indexdir); } else { strncpy(indexdir, home, MAXNAME-1); indexdir[MAXNAME-1] = '\0'; /* strncpy does not terminate if $HOME fills the buffer */ } } ret = chdir(indexdir); if (getcwd(INDEX_DIR, MAXNAME-1) == NULL) strcpy(INDEX_DIR, indexdir); if (ret < 0) { fprintf(stderr, "using working-directory '%s' to locate index\n", INDEX_DIR); } sprintf(s, "%s", INDEX_FILE); indexfp = fopen(s, "r"); if(indexfp == NULL) { fprintf(stderr, "can't open glimpse index-file %s/%s\n", INDEX_DIR, INDEX_FILE); fprintf(stderr, "(use -H to give an index-directory or run 'glimpseindex' to make an index)\n"); return -1; } if (stat(s, &index_stat_buf) == -1) { fprintf(stderr, "can't stat %s/%s\n", INDEX_DIR, s); fclose(indexfp); return -1; } sprintf(s, "%s", P_TABLE); partfp = fopen(s, "r"); if(partfp == NULL) { fprintf(stderr, "can't open glimpse partition-table %s/%s\n", INDEX_DIR, P_TABLE); fprintf(stderr, "(use -H to specify an index-directory or run glimpseindex to make an index)\n"); fclose(indexfp); return -1; } sprintf(s, "%s", DEF_TIME_FILE); timesfp = fopen(s, "r"); sprintf(s, "%s.index", DEF_TIME_FILE); timesindexfp = fopen(s, "r"); if (timesindexfp != NULL) { struct stat st; fstat(fileno(timesindexfp), &st); timesindexsize = st.st_size; } /* Get options */ #if BG_DEBUG debug = fopen(DEBUG_FILE, "w+"); if(debug == NULL) { fprintf(stderr, "can't open file %s/%s, errno=%d\n", INDEX_DIR, DEBUG_FILE, errno); return(-1); } #endif /*BG_DEBUG*/ fgets(indexnumberbuf, 256, indexfp); if(strstr(indexnumberbuf, "1234567890")) IndexNumber = ON; else IndexNumber = OFF; fscanf(indexfp, "%%%d\n", &OneFilePerBlock); if
(OneFilePerBlock < 0) { ByteLevelIndex = ON; OneFilePerBlock = -OneFilePerBlock; } else if (OneFilePerBlock == 0) { GNumpartitions = get_table(P_TABLE, p_table, MAX_PARTITION, 0); } fscanf(indexfp, "%%%d%s\n", &StructuredIndex, old_rdelim); /* Set WHOLEFILESCOPE for do-it-yourself request processing at client */ WHOLEFILESCOPE = 1; if (StructuredIndex <= 0) { if (StructuredIndex == -2) { RecordLevelIndex = 1; strcpy(rdelim, old_rdelim); rdelim_len = strlen(rdelim); preprocess_delimiter(rdelim, rdelim_len, rdelim, &rdelim_len); } WHOLEFILESCOPE = 0; StructuredIndex = 0; PRINTATTR = 0; /* doesn't make sense: must not go into filter_output */ } else if (-1 == (StructuredIndex = attr_load_names(ATTRIBUTE_FILE))) { fprintf(stderr, "error in reading attribute file %s/%s\n", INDEX_DIR, ATTRIBUTE_FILE); return(-1); } #if BG_DEBUG fprintf(debug, "buf = %s OneFilePerBlock=%d StructuredIndex=%d\n", indexnumberbuf, OneFilePerBlock, StructuredIndex); #endif /*BG_DEBUG*/ sprintf(s, "%s", MINI_FILE); minifp = fopen(s, "r"); /* if (minifp==NULL && OneFilePerBlock) fprintf(stderr, "Can't open for reading: %s/%s --- cannot do very fast search\n", INDEX_DIR, MINI_FILE); */ if (OneFilePerBlock && glimpse_isserver && (minifp != NULL)) read_mini(indexfp, minifp); read_filenames(); /* Once IndexNumber info is available */ set_indexable_char(indexable_char); set_indexable_char(test_indexable_char); set_special_char(indexable_char); return 0; } #define CLEANUP \ {\ int q, k;\ if (timesfp != NULL) fclose(timesfp);\ if (timesindexfp != NULL) fclose(timesindexfp);\ if (indexfp != NULL) fclose(indexfp);\ if (partfp != NULL) fclose(partfp);\ if (minifp != NULL) fclose(minifp);\ if (nullfp != NULL) fclose(nullfp);\ indexfp = partfp = minifp = nullfp = NULL;\ if (ByteLevelIndex) {\ if (src_offset_table != NULL) for (k=0; k QUIT CURRENT REQUEST. 
*/ int ignore_signal[32] = { 0, 0, 0, 1, 1, 1, 1, 1, 1, /* all the tracing stuff: since default action is to dump core */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0 }; /* resource lost: since default action is to dump core */ /* S.t. sockets don't persist: they sometimes have a bad habit of doing so */ void cleanup() { int i; /* ^C in the middle of a client call */ if (svstderr != 2) { close(2); dup(svstderr); } fprintf(stderr, "server cleaning up...\n"); CLEANUP; for (i=0; i<64; i++) close(i); exit(3); } void reinitialize(s) int s; { /* To force main-while loop call reinitialize_server() after do_select() */ glimpse_reinitialize = 1; #ifdef __svr4__ /* Solaris 2.3 insists that you reset the signal handler */ (void)signal(s, reinitialize); #endif } #define QUITREQUESTMSG "glimpseserver: aborting request...\n" /* S.t. one request doesn't keep server occupied too long, when client already quits */ void quitrequest(s) int s; { /* * Don't write onto stderr, since 2 is duped to sockfd => can cause recursive signal! * Also, don't print error message more than once for quitting one request. The * server receives signals for EVERY write it attempts when it finds a match: I could * not find a way to prevent it, but agrep/bitap.c/fill_buf() was fixed to limit it. * -- bg on 16th Feb 1995 */ if (!glimpse_clientdied && (s != SIGUSR1)) /* USR1 is a "friendly" cleanup message */ write(svstderr, QUITREQUESTMSG, strlen(QUITREQUESTMSG)); glimpse_clientdied = 1; #ifdef __svr4__ /* Solaris 2.3 insists that you reset the signal handler */ (void)signal(s, quitrequest); #endif } /* The client receives this signal when an output/input pipe is broken, etc. 
It simply exits from the current request */ void exitrequest() { glimpse_clientdied = 1; } main(argc, argv) int argc; char *argv[]; { int ret = 0, tried = 0; char indexdir[MAXNAME]; char **oldargv = argv; int oldargc = argc; #if CLIENTSERVER int sockfd, newsockfd, clilen, len, clpid; int clout; #if USE_UNIXDOMAIN struct sockaddr_un cli_addr, serv_addr; #else /*USE_UNIXDOMAIN*/ struct sockaddr_in cli_addr, serv_addr; struct hostent *hp; #endif /*USE_UNIXDOMAIN*/ int cli_len; int clargc; char **clargv; int clstdin, clstdout, clstderr; int i; char array[4]; char *p, c; #endif /*CLIENTSERVER*/ int quitwhile; #if ISO_CHAR_SET setlocale(LC_ALL,""); /* support for 8bit character set: ew@senate.be, Henrik.Martin@eua.ericsson.se */ #endif #if CLIENTSERVER && ISSERVER glimpse_isserver = 1; /* I am the server */ #else /*CLIENTSERVER && ISSERVER*/ if (argc <= 1) { usage(); /* Client needs at least 1 argument */ exit(1); } #endif /*CLIENTSERVER && ISSERVER*/ #define RETURNMAIN(val)\ {\ CLEANUP;\ if (val < 0) exit (2);\ else if (val == 0) exit (1);\ else exit (0);\ } SERV_HOST = (CHAR *)my_malloc(MAXNAME); #if !SYSCALLTESTING /* once-only initialization */ init_filename_hashtable(); src_offset_table = NULL; for (i=0; i MAX_ARGS) goto doityourself; #endif /*!ISSERVER*/ #if !SYSCALLTESTING while((--argc > 0) && (*++argv)[0] == '-' ) { p = argv[0] + 1; /* ptr to first character after '-' */ c = *(argv[0]+1); if (*p == '-') { /* cheesy hack to support --version and --help options */ if (*(p+1) == 'v') { c = 'V'; } else if (*(p+1) == 'h') { c = '?'; } } quitwhile = OFF; while (!quitwhile && (*p != '\0')) { switch(c) { /* Look for -H option at server (only one that makes sense); if client has a -H, then it goes to doityourself */ case 'H' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: a directory name must follow the -H option\n", GProgname); RETURNMAIN(usageS()); } argv ++; strcpy(indexdir, argv[0]); argc --; } else { strcpy(indexdir, p+1); }
quitwhile = ON; break; /* Recognized by both client and server */ case 'J' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the server host name must follow the -J option\n", GProgname); #if ISSERVER RETURNMAIN(usageS()); #else /*ISSERVER*/ RETURNMAIN(usage()); #endif /*ISSERVER*/ } argv ++; strcpy(SERV_HOST, argv[0]); argc --; } else { strcpy(SERV_HOST, p+1); } quitwhile = ON; break; /* Recognized by both client and server */ case 'K' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the server port must follow the -K option\n", GProgname); #if ISSERVER RETURNMAIN(usageS()); #else /*ISSERVER*/ RETURNMAIN(usage()); #endif /*ISSERVER*/ } argv ++; SERV_PORT = atoi(argv[0]); argc --; } else { SERV_PORT = atoi(p+1); } if ((SERV_PORT < MIN_SERV_PORT) || (SERV_PORT > MAX_SERV_PORT)) { fprintf(stderr, "Bad server port %d: must be in [%d, %d]: using default %d\n", SERV_PORT, MIN_SERV_PORT, MAX_SERV_PORT, DEF_SERV_PORT); SERV_PORT = DEF_SERV_PORT; } quitwhile = ON; break; #if ISSERVER #if SFS_COMPAT case 'R' : RemoteFiles = ON; break; case 'Z' : /* No op */ break; #endif case 'V' : printf("\nThis is glimpseindex version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); RETURNMAIN(1); case '?'
: RETURNMAIN(usageS()); /* server cannot recognize any other option */ default : fprintf(stderr, "%s: server cannot recognize option: '%s'\n", GProgname, p); RETURNMAIN(usageS()); #else /*ISSERVER*/ /* These have 1 argument each, so must do quitwhile */ case 'd' : case 'e' : case 'f' : case 'k' : case 'D' : case 'F' : case 'I' : case 'L' : case 'R' : case 'S' : case 'T' : case 'Y' : case 'p' : if (argv[0][2] == '\0') {/* space after - option */ if(argc <= 1) { fprintf(stderr, "%s: the '-%c' option must have an argument\n", GProgname, c); RETURNMAIN(usage()); } argv++; argc--; } quitwhile = ON; break; /* These are illegal */ case 'm' : case 'v' : fprintf(stderr, "%s: illegal option: '-%c'\n", GProgname, c); RETURNMAIN(usage()); /* They can't be patterns and filenames since they start with a -, these don't have arguments */ case '!' : case 'a' : case 'b' : case 'c' : case 'h' : case 'i' : case 'j' : case 'l' : case 'n' : case 'o' : case 'q' : case 'r' : case 's' : case 't' : case 'u' : case 'g' : case 'w' : case 'x' : case 'y' : case 'z' : case 'A' : case 'B' : case 'E' : case 'G' : case 'M' : case 'N' : case 'O' : case 'P' : case 'Q' : case 'U' : case 'W' : case 'X' : case 'Z' : break; case 'C': CONTACT_SERVER = 1; break; case 'V' : printf("\nThis is glimpse version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); RETURNMAIN(1); case '?': RETURNMAIN(usage()); default : if (isdigit(c)) quitwhile = ON; else { fprintf(stderr, "%s: illegal option: '-%c'\n", GProgname, c); RETURNMAIN(usage()); } break; #endif /*ISSERVER*/ } /* switch(c) */ p ++; c = *p; } } #else CONTACT_SERVER = 1; argc=0; #endif #if !ISSERVER /* Next arg must be the pattern: Check if the user wants to run the client as agrep, or doesn't want to contact the server */ if ((argc > 1) || (!CONTACT_SERVER)) goto doityourself; #endif /*!ISSERVER*/ argv = oldargv; argc = oldargc; #endif /*CLIENTSERVER*/ #if ISSERVER && CLIENTSERVER if (-1 == read_index(indexdir)) RETURNMAIN(ret); /* Install signal handlers so 
that glimpseserver doesn't continue to run when sockets get broken, etc. */ for (i=0; i<32; i++) if (ignore_signal[i]) signal(i, SIG_IGN); signal(SIGHUP, cleanup); signal(SIGINT, cleanup); if (((void (*)())-1 == signal(SIGPIPE, quitrequest)) || ((void (*)())-1 == signal(SIGUSR1, quitrequest)) || #ifndef SCO ((void (*)())-1 == signal(SIGURG, quitrequest)) || #endif ((void (*)())-1 == signal(SIGUSR2, reinitialize)) || ((void (*)())-1 == signal(SIGHUP, reinitialize))) { /* Check for return values here since they ensure reliability */ fprintf(stderr, "glimpseserver: Unable to install signal-handlers.\n"); RETURNMAIN(-1); } #if USE_UNIXDOMAIN if ((sockfd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) { fprintf(stderr, "server cannot open socket for communication.\n"); RETURNMAIN(-1); } char TMP_FILE_NAME[256]; strcpy(TMP_FILE_NAME,TEMP_DIR) ; strcat(TMP_FILE_NAME,"/.glimpse_server"); unlink(TMP_FILE_NAME); memset((char *)&serv_addr, '\0', sizeof(serv_addr)); serv_addr.sun_family = AF_UNIX; strcpy(serv_addr.sun_path, TMP_FILE_NAME); /* < 108 ! */ len = strlen(serv_addr.sun_path) + sizeof(serv_addr.sun_family); #else /*USE_UNIXDOMAIN*/ if ((sockfd = socket(PF_INET, SOCK_STREAM, 0)) < 0) { perror("glimpseserver: Cannot create socket"); RETURNMAIN(-1); } memset((char *)&serv_addr, '\0', sizeof(serv_addr)); serv_addr.sin_family = AF_INET; serv_addr.sin_port = htons(SERV_PORT); #if 0 /* use host-names not internet style d.d.d.d notation */ serv_addr.sin_addr.s_addr = htonl(INADDR_ANY); #else /* * We only want to accept connections from glimpse clients * on the SERV_HOST, do not use INADDR_ANY! 
*/ if ((hp = gethostbyname(SERV_HOST)) == NULL) { perror("glimpseserver: Cannot resolve host"); RETURNMAIN(-1); } memcpy((caddr_t)&serv_addr.sin_addr, hp->h_addr, hp->h_length); #endif /*0*/ len = sizeof(serv_addr); #endif /*USE_UNIXDOMAIN*/ /* test code for glimpse server, get it to release socket when it dies: contribution by Sheldon Smoker */ if((setsockopt(sockfd,SOL_SOCKET,SO_REUSEADDR,(char *)&one,sizeof(one))) == -1) { fprintf(stderr,"glimpseserver: could not set socket option\n"); perror("setsockopt"); exit(1); } /* end test code */ if (bind(sockfd, (struct sockaddr *)&serv_addr, len) < 0) { perror("glimpseserver: Cannot bind to socket"); RETURNMAIN(-1); } listen(sockfd, SERVER_QUEUE_SIZE); printf("glimpseserver: On-line (pid = %d, port = %d) waiting for request...\n", getpid(), SERV_PORT); fflush(stdout); /* must fflush to print on server stdout */ while (1) { /* * Spin until sockfd is ready to do a non-blocking accept(2). * We only wait for 15 seconds, because SunOS may * swap us out if we block for 20 seconds or more. * -- Courtesy: Darren Hardy, hardy@cs.colorado.edu */ if ((ret = do_select(sockfd, 15)) == 0) { if ((errno == EINTR) && glimpse_reinitialize) { glimpse_reinitialize = 0; CLEANUP; close(sockfd); sleep(IC_PORTRELEASE); reinitialize_server(oldargc, oldargv); } continue; } else if (ret != 1) continue; /* get parameters */ ret = 0; clargc = 0; clargv = NULL; cli_len = sizeof(cli_addr); if ((newsockfd = accept(sockfd, (struct sockaddr *)&cli_addr, &cli_len)) < 0) continue; if (getreq(newsockfd, glimpse_reqbuf, &clstdin, &clstdout, &clstderr, &clargc, &clargv, &clpid) < 0) { ret = -1; #if DEBUG printf("getreq errno: %d\n", errno); #endif /*DEBUG*/ goto end_process; } #if DEBUG printf("server processing request on %x\n", newsockfd); #endif /*DEBUG*/ /* * Server doesn't wait for response, no point using svstdin = dup(0); close(0); dup(clstdin); close(clstdin); */ /* * This is wrong since clstderr == clstdout!
svstdout = dup(1); close(1); dup(clstdout); close(clstdout); svstderr = dup(2); close(2); dup(clstderr); close(clstderr); */ svstdout = dup(1); svstderr = dup(2); close(1); close(2); dup(clstdout); dup(clstderr); close(clstdout); close(clstderr); /* * IMPORTANT: Unbuffered I/O to the client! * Done for Harvest since partial results might be * needed and fflush will not flush partial results * to the client if we type ^C and kill it: it puts * them into /dev/null. This way, output is unbuffered * and the client sees at least some results if killed. */ setbuf(stdout, NULL); setbuf(stderr, NULL); glimpse_call = 0; glimpse_clientdied = 0; ret = process_query(clargc, clargv, newsockfd); /* * Server doesn't wait for response, no point using close(0); dup(svstdin); close(svstdin); svstdin = 0; */ if (glimpse_clientdied) { /* * This code is *ONLY* used as a safety net now. * The old problem was that users would see portions * of previous (and usually) unrelated queries! * glimpseserver now uses unbuffered I/O to the * client so all previous fwrite's to now are * gone. But since this is such a nasty problem * we flush stdout to /dev/null just in case. 
*/ clout = open("/dev/null", O_WRONLY); close(1); dup(clout); close(clout); fflush(stdout); } /* Restore svstdout and svstderr to stdout/stderr */ close(1); dup(svstdout); close(svstdout); svstdout = 1; close(2); dup(svstderr); close(svstderr); svstderr = 2; end_process: #if USE_MSGHDR /* send reply and cleanup */ array[0] = (ret & 0xff000000) >> 24; array[1] = (ret & 0xff0000) >> 16; array[2] = (ret & 0xff00) >> 8; array[3] = (ret & 0xff); writen(newsockfd, array, 4); #endif /*USE_MSGHDR*/ #if DEBUG write(1, "done\n", 5); #endif /*DEBUG*/ for (i=0; ih_addr, hp->h_length); #endif /*0*/ len = sizeof(serv_addr); #endif /*USE_UNIXDOMAIN*/ if (connect(sockfd, (struct sockaddr *)&serv_addr, len) < 0) { char errbuf[4096]; sprintf(errbuf, "glimpse: Cannot contact glimpseserver: %s, port %d:", SERV_HOST, SERV_PORT); perror(errbuf); /* perror(SERV_HOST); */ #if DEBUG printf("connect errno: %d\n", errno); #endif /*DEBUG*/ close(sockfd); if ((errno == ECONNREFUSED) && (tried < 4)) { tried ++; goto trynewsocket; } goto doityourself; } if (sendreq(sockfd, glimpse_reqbuf, fileno(stdin), fileno(stdout), fileno(stderr), argc, argv, getpid()) < 0) { perror("sendreq"); #if DEBUG printf("sendreq errno: %d\n", errno); #endif /*DEBUG*/ close(sockfd); goto doityourself; } #if USE_MSGHDR if (readn(sockfd, array, 4) != 4) { close(sockfd); goto doityourself; } ret = (array[0] << 24) + (array[1] << 16) + (array[2] << 8) + array[3]; #else /*USE_MSGHDR*/ { /* * Dump everything the server writes into the socket onto * stdout until EOF/error. Do this in a way so that *everything* * the server sends is dumped to stdout by the client. The * client might die suddenly via ^C or SIGTERM, but we still * want the results.
*/ char tmpbuf[1024]; int n; while ((n = read(sockfd, tmpbuf, 1024)) > 0) { write(fileno(stdout), tmpbuf, n); } } #endif /*USE_MSGHDR*/ close(sockfd); RETURNMAIN(ret); doityourself: #if DEBUG printf("doing it myself :-(\n"); #endif /*DEBUG*/ #endif /*CLIENTSERVER*/ setbuf(stdout, NULL); /* Unbuffered I/O to always get every result */ setbuf(stderr, NULL); glimpse_call = 0; glimpse_clientdied = 0; ret = process_query(oldargc, oldargv, fileno(stdin)); RETURNMAIN(ret); #endif /*ISSERVER && CLIENTSERVER*/ } process_query(argc, argv, newsockfd) int argc; char *argv[]; int newsockfd; { int searchpercent; int num_blocks; int num_read; int i, j; int iii; /* Udi */ int jjj; char c; char *p; int ret; int jj; int quitwhile; char indexdir[MAX_LINE_LEN]; char temp_filenames_file[MAX_LINE_LEN]; char temp_bitfield_file[MAX_LINE_LEN]; char TEMP_FILE[MAX_LINE_LEN]; char temp_file[MAX_LINE_LEN]; int oldargc = argc; char **oldargv = argv; CHAR dummypat[MAX_PAT]; int dummylen=0; int my_M_index, my_P_index, my_b_index, my_A_index, my_l_index = -1, my_B_index = -1; char **outname; int gnum_of_matched = 0; int gprev_num_of_matched = 0; int gfiles_matched = 0; int foundpat = 0; int wholefilescope=0; int nobytelevelmustbeon=0; long get_file_time(); if ((argc <= 0) || (argv == NULL)) { errno = EINVAL; return -1; } /* * Macro to destroy EVERYTHING before return since we might want to make this a * library function later on: convention is that after destroy, objects are made * NULL throughout the source code, and are all set to NULL at initialization time. * DO agrep_argv, index_argv and FileOpt my_malloc/my_free optimizations later. * my_free calls have 2nd parameter = 0 if the size is not easily determinable. 
*/ #define RETURN(val) \ {\ int q,k;\ \ first_search = 0;\ for (k=0; k MAX_ARGS) { #if ISSERVER fprintf(stderr, "too many arguments %d obtained on server!\n", argc); #endif /*ISSERVER*/ i = fileagrep(oldargc, oldargv, 0, stdout); RETURN(i); } /* * Process what options you can, then call fileagrep_init() to set * options in agrep and get the pattern. Then, call fileagrep_search(). * Begin by copying options into agrep_argv assuming glimpse was not * called as agrep (optimistic :-). */ agrep_argc = 0; for (i=0; i= MAX_ARGS) { fprintf(stderr, "%s: too many options!\n", GProgname); RETURN(usage()); } agrep_argv[agrep_argc] = (char *)my_malloc(sizeof(char *)); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = 'z'; agrep_argv[agrep_argc][2] = '\0'; agrep_argc ++; /* In glimpse, you should always print pattern when using mgrep (user can't do -f or -m)! */ if (agrep_argc + 1 >= MAX_ARGS) { fprintf(stderr, "%s: too many options!\n", GProgname); RETURN(usage()); } agrep_argv[agrep_argc] = (char *)my_malloc(sizeof(char *)); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = 'P'; agrep_argv[agrep_argc][2] = '\0'; my_P_index = agrep_argc; agrep_argc ++; /* In glimpse, you should always output multiple when doing mgrep */ if (agrep_argc + 1 >= MAX_ARGS) { fprintf(stderr, "%s: too many options!\n", GProgname); RETURN(usage()); } agrep_argv[agrep_argc] = (char *)my_malloc(sizeof(char *)); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = 'M'; agrep_argv[agrep_argc][2] = '\0'; my_M_index = agrep_argc; agrep_argc ++; /* In glimpse, you should print the byte offset if there is a structured query */ if (agrep_argc + 1 >= MAX_ARGS) { fprintf(stderr, "%s: too many options!\n", GProgname); RETURN(usage()); } agrep_argv[agrep_argc] = (char *)my_malloc(sizeof(char *)); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = 'b'; agrep_argv[agrep_argc][2] = '\0'; my_b_index = agrep_argc; agrep_argc ++; /* In glimpse, you should always have space for 
doing -m if required */ if (agrep_argc + 2 >= MAX_ARGS) { fprintf(stderr, "%s: too many options!\n", GProgname); RETURN(usage()); } agrep_argv[agrep_argc] = (char *)my_malloc(sizeof(char *)); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = 'm'; agrep_argv[agrep_argc][2] = '\0'; agrep_argc ++; agrep_argv[agrep_argc] = (char *)my_malloc(2); /* no op */ agrep_argv[agrep_argc][0] = '\0'; agrep_argc ++; /* Add -A option to print filenames as default */ if (agrep_argc + 1 >= MAX_ARGS) { fprintf(stderr, "%s: too many options!\n", GProgname); RETURN(usage()); } agrep_argv[agrep_argc] = (char *)my_malloc(sizeof(char *)); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = 'A'; agrep_argv[agrep_argc][2] = '\0'; my_A_index = agrep_argc; agrep_argc ++; while((agrep_argc < MAX_ARGS) && (--argc > 0) && (*++argv)[0] == '-' ) { p = argv[0] + 1; /* ptr to first character after '-' */ c = *(argv[0]+1); quitwhile = OFF; while (!quitwhile && (*p != '\0')) { c = *p; switch(c) { case 'F' : MATCHFILE = ON; FileOpt = (CHAR *)my_malloc(MAXFILEOPT); if (*(p + 1) == '\0') {/* space after - option */ if(argc <= 1) { fprintf(stderr, "%s: a file pattern must follow the -F option\n", GProgname); RETURN(usage()); } argv++; /* >= leaves room for the terminating '\0' that strcpy writes */ if ((dummylen = strlen(argv[0])) >= MAXFILEOPT) { fprintf(stderr, "%s: -F option list too long\n", GProgname); RETURN(usage()); } strcpy(FileOpt, argv[0]); argc--; } else { if ((dummylen = strlen(p+1)) >= MAXFILEOPT) { fprintf(stderr, "%s: -F option list too long\n", GProgname); RETURN(usage()); } strcpy(FileOpt, p+1); } /* else */ quitwhile = ON; break; /* search the index only and output the number of blocks */ case 'N' : Only_first = ON; break ; /* also keep track of the matches in each file */ case 'Q' : PRINTAPPXFILEMATCH = ON; break ; case 'U' : InfoAfterFilename = ON; break; case '!'
: HINTSFROMUSER = ON; break; /* go to home directory to find the index: even if server overwrites indexdir here, it won't overwrite INDEX_DIR until read_index() */ case 'H' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: a directory name must follow the -H option\n", GProgname); RETURN(usage()); } argv ++; #if !ISSERVER strcpy(indexdir, argv[0]); #endif /*!ISSERVER*/ argc --; } #if !ISSERVER else { strcpy(indexdir, p+1); } agrep_argv[agrep_argc] = (char *)my_malloc(4); strcpy(agrep_argv[agrep_argc], "-H"); agrep_argc ++; agrep_argv[agrep_argc] = (char *)my_malloc(strlen(indexdir) + 2); strcpy(agrep_argv[agrep_argc], indexdir); agrep_argc ++; #endif /*!ISSERVER*/ quitwhile = ON; break; #if ISSERVER && SFS_COMPAT /* INDEX_DIR will be already set since this is the server, so we can direclty xfer the .glimpse_* files */ case '.' : strcpy(TEMP_FILE, INDEX_DIR); strcpy(temp_file, "."); strcat(TEMP_FILE, "/."); if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: a file name must follow the -. option\n", GProgname); RETURN(usage()); } argv ++; strcat(TEMP_FILE, argv[0]); strcat(temp_file, argv[0]); argc --; } else { strcat(TEMP_FILE, p+1); strcat(temp_file, p+1); } if (!strcmp(temp_file, INDEX_FILE) || !strcmp(temp_file, FILTER_FILE) || !strcmp(temp_file, ATTRIBUTE_FILE) || !strcmp(temp_file, MINI_FILE) || !strcmp(temp_file, P_TABLE) || !strcmp(temp_file, PROHIBIT_LIST) || !strcmp(temp_file, INCLUDE_LIST) || !strcmp(temp_file, NAME_LIST) || !strcmp(temp_file, NAME_LIST_INDEX) || !strcmp(temp_file, NAME_HASH) || !strcmp(temp_file, NAME_HASH_INDEX) || !strcmp(temp_file, DEF_STAT_FILE) || !strcmp(temp_file, DEF_MESSAGE_FILE) || !strcmp(temp_file, DEF_TIME_FILE)) { if ((ret = open(TEMP_FILE, O_RDONLY, 0)) <= 0) RETURN(ret); while ((num_read = read(ret, matched_region, MAX_REGION_LIMIT*2)) > 0) { write(1 /* NOT TO newsockfd since that was got by a syscall!!! 
*/, matched_region, num_read); } close(ret); } quitwhile = ON; RETURN(0); #endif /* ISSERVER */ /* go to temp directory to create temp files */ case 'T' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: a directory name must follow the -T option\n", GProgname); RETURN(usage()); } argv ++; strcpy(TEMP_DIR, argv[0]); argc --; } else { strcpy(TEMP_DIR, p+1); } sprintf(tempfile, "%s/.glimpse_tmp.%d", TEMP_DIR, getpid()); quitwhile = ON; break; /* To get files within some number of days before indexing was done */ case 'Y': if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the number of days must follow the -Y option\n", GProgname); RETURN(usage()); } argv ++; GNumDays = atoi(argv[0]); argc --; } else { GNumDays = atoi(p+1); } if (GNumDays <= 0) { fprintf(stderr, "%s: the number of days %d must be > 0\n", GProgname, GNumDays); RETURN(usage()); } quitwhile = ON; break; case 'R' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the record size must follow the -R option\n", GProgname); RETURN(usage()); } argv ++; RegionLimit = atoi(argv[0]); argc --; } else { RegionLimit = atoi(p+1); } if ((RegionLimit <= 0) || (RegionLimit > MAX_REGION_LIMIT)) { fprintf(stderr, "Bad record size %d: must be in [%d, %d]: using default %d\n", RegionLimit, 1, MAX_REGION_LIMIT, DEFAULT_REGION_LIMIT); RegionLimit = DEFAULT_REGION_LIMIT; } quitwhile = ON; break; /* doesn't matter if we overwrite the value in the client since the same value would have been picked up in main() anyway */ case 'J' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the server host name must follow the -J option\n", GProgname); RETURN(usageS()); } argv ++; #if !ISSERVER strcpy(SERV_HOST, argv[0]); #endif /*!ISSERVER*/ argc --; } #if !ISSERVER else { strcpy(SERV_HOST, p+1); } #endif /*!ISSERVER*/ quitwhile = ON; break; /* doesn't matter if we overwrite the value in 
the client since the same value would have been picked up in main() anyway */ case 'K' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the server port must follow the -K option\n", GProgname); RETURN(usage()); } argv ++; #if !ISSERVER SERV_PORT = atoi(argv[0]); #endif /*!ISSERVER*/ argc --; } #if !ISSERVER else { SERV_PORT = atoi(p+1); } if ((SERV_PORT < MIN_SERV_PORT) || (SERV_PORT > MAX_SERV_PORT)) { fprintf(stderr, "Bad server port %d: must be in [%d, %d]: using default %d\n", SERV_PORT, MIN_SERV_PORT, MAX_SERV_PORT, DEF_SERV_PORT); SERV_PORT = DEF_SERV_PORT; } #endif /*!ISSERVER*/ quitwhile = ON; break; /* Based on contribution From ada@mail2.umu.se Fri Jul 12 01:56 MST 1996; Christer Holgersson, Sen. SysNet Mgr, Umea University/SUNET, Sweden */ /* the bit-mask corresponding to the set of filenames within which the pattern should be searched is explicitly provided in a filename (absolute path name) */ case 'p' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the bitfield file [and an offset/length/endian separated by :] must follow the -p option\n", GProgname); RETURN(usage()); } argv ++; strcpy(bitfield_file, argv[0]); argc --; } else { strcpy(bitfield_file, p+1); } /* Find offset and length into bitfield file */ { int iiii = 0; BITFIELDOFFSET=0; BITFIELDLENGTH=0; BITFIELDENDIAN=0; iiii = 0; while (bitfield_file[iiii] != '\0') { if (bitfield_file[iiii] == '\\') { iiii ++; if (bitfield_file[iiii] == '\0') break; if (bitfield_file[iiii] == ':') { strcpy(&bitfield_file[iiii-1], &bitfield_file[iiii]); } else iiii ++; continue; } if (bitfield_file[iiii] == ':') { bitfield_file[iiii] = '\0'; sscanf(&bitfield_file[iiii+1], "%d:%d:%d", &BITFIELDOFFSET, &BITFIELDLENGTH, &BITFIELDENDIAN); if ((BITFIELDOFFSET < 0) || (BITFIELDLENGTH < 0) || (BITFIELDENDIAN < 0)) { fprintf(stderr, "Wrong offset %d or length %d or endian %d of bitfield file\n", BITFIELDOFFSET, BITFIELDLENGTH,
BITFIELDENDIAN); RETURN(usage()); } break; } iiii++; } #if BG_DEBUG /* print the file name, not the BITFIELDFILE flag, for %s */ fprintf(debug, "BITFIELD %s : %d : %d : %d\n", bitfield_file, BITFIELDOFFSET, BITFIELDLENGTH, BITFIELDENDIAN); #endif } if (bitfield_file[0] != '/') { getcwd(temp_bitfield_file, MAX_LINE_LEN-1); strcat(temp_bitfield_file, "/"); strcat(temp_bitfield_file, bitfield_file); strcpy(bitfield_file, temp_bitfield_file); } BITFIELDFILE = 1; quitwhile = ON; break; /* the set of filenames within which the pattern should be searched is explicitly provided in a filename (absolute path name) */ case 'f' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the filenames file must follow the -f option\n", GProgname); RETURN(usage()); } argv ++; strcpy(filenames_file, argv[0]); argc --; } else { strcpy(filenames_file, p+1); } if (filenames_file[0] != '/') { getcwd(temp_filenames_file, MAX_LINE_LEN-1); strcat(temp_filenames_file, "/"); strcat(temp_filenames_file, filenames_file); strcpy(filenames_file, temp_filenames_file); } FILENAMESINFILE = 1; quitwhile = ON; break; case 'C' : CONTACT_SERVER = 1; break; case 'a' : PRINTATTR = 1; break; case 'E': PRINTINDEXLINE = 1; break; case 'W': wholefilescope = 1; break; case 'z' : UseFilters = 1; break; case 'r' : GRECURSIVE = 1; break; case 'V' : printf("\nThis is glimpse version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); RETURN(0); /* Must let 'm' fall thru to default once multipatterns are done in agrep */ case 'm' : case 'v' : fprintf(stderr, "%s: illegal option: '-%c'\n", GProgname, c); RETURN(usage()); case 'I' : case 'D' : case 'S' : /* There is no space after these options */ agrep_argv[agrep_argc] = (char *)my_malloc(strlen(argv[0]) + 2); agrep_argv[agrep_argc][0] = '-'; strcpy(agrep_argv[agrep_argc] + 1, p); agrep_argc ++; quitwhile = ON; break; case 'l': GFILENAMEONLY = 1; my_l_index = agrep_argc; agrep_argv[agrep_argc] = (char *)my_malloc(4); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = c;
agrep_argv[agrep_argc][2] = '\0'; agrep_argc ++; break; /* * Copy the set of options for agrep: put them in separate argvs * even if they are together after one '-' (easier to process). * These are agrep options which glimpse has to peek into. */ default: agrep_argv[agrep_argc] = (char *)my_malloc(16); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = c; agrep_argv[agrep_argc][2] = '\0'; agrep_argc ++; if (c == 'n') { nobytelevelmustbeon=1; } else if (c == 'X') GPRINTNONEXISTENTFILE = 1; else if (c == 'j') GPRINTFILETIME = 1; else if (c == 'b') GBYTECOUNT = 1; else if (c == 'g') GPRINTFILENUMBER = 1; else if (c == 't') GOUTTAIL = 1; else if (c == 'y') GNOPROMPT = 1; else if (c == 'h') GNOFILENAME = 1; else if (c == 'c') GCOUNT = 1; else if (c == 'B') { GBESTMATCH = 1; my_B_index = agrep_argc - 1; } /* the following options are followed by a parameter */ else if ((c == 'e') || (c == 'd') || (c == 'L') || (c == 'k')) { if (*(p + 1) == '\0') {/* space after - option */ if(argc <= 1) { fprintf(stderr, "%s: the '-%c' option must have an argument\n", GProgname, c); RETURN(usage()); } argv++; if ( (c == 'd') && ((D_length = strlen(argv[0])) > MAX_NAME_SIZE) ) { fprintf(stderr, "%s: delimiter pattern too long (has > %d chars)\n", GProgname, MAX_NAME_SIZE); RETURN(usage()); /* Should this be RegionLimit if ByteLevelIndex? */ } else if (c == 'L') { GLIMITOUTPUT = GLIMITTOTALFILE = GLIMITPERFILE = 0; sscanf(argv[0], "%d:%d:%d", &GLIMITOUTPUT, &GLIMITTOTALFILE, &GLIMITPERFILE); if ((GLIMITOUTPUT < 0) || (GLIMITTOTALFILE < 0) || (GLIMITPERFILE < 0)) { fprintf(stderr, "%s: invalid output limit %s\n", GProgname, argv[0]); RETURN(usage()); } } agrep_argv[agrep_argc] = (char *)my_malloc(strlen(argv[0]) + 2); strcpy(agrep_argv[agrep_argc], argv[0]); if (c == 'd') { preprocess_delimiter(argv[0], D_length, GD_pattern, &GD_length); if (GOUTTAIL == 2) GOUTTAIL = 0; /* Should this be RegionLimit if ByteLevelIndex? 
*/ } if (c == 'k') GCONSTANT = 1; argc--; } else { if ( (c == 'd') && ((D_length = strlen(p+1)) > MAX_NAME_SIZE) ) { fprintf(stderr, "%s: delimiter pattern too long (has > %d chars)\n", GProgname, MAX_NAME_SIZE); RETURN(usage()); /* Should this be RegionLimit if ByteLevelIndex? */ } else if (c == 'L') { GLIMITOUTPUT = GLIMITTOTALFILE = GLIMITPERFILE = 0; sscanf(p+1, "%d:%d:%d", &GLIMITOUTPUT, &GLIMITTOTALFILE, &GLIMITPERFILE); if ((GLIMITOUTPUT < 0) || (GLIMITTOTALFILE < 0) || (GLIMITPERFILE < 0)) { fprintf(stderr, "%s: invalid output limit %s\n", GProgname, p+1); RETURN(usage()); } } agrep_argv[agrep_argc] = (char *)my_malloc(strlen(p+1) + 2); strcpy(agrep_argv[agrep_argc], p+1); if (c == 'd') { preprocess_delimiter(p+1, D_length-2, GD_pattern, &GD_length); if (GOUTTAIL == 2) GOUTTAIL = 0; /* Should this be RegionLimit if ByteLevelIndex? */ } if (c == 'k') GCONSTANT = 1; } agrep_argc ++; #if DEBUG fprintf(stderr, "%d = %s\n", agrep_argc, agrep_argv[agrep_argc - 1]); #endif /*DEBUG*/ quitwhile = ON; if ((c == 'e') || (c == 'k')) foundpat = 1; } /* else it is something that glimpse doesn't know and agrep needs to look at */ break; /* from default: */ } /* switch(c) */ p ++; } } /* while (--argc > 0 && (*++argv)[0] == '-') */ /* exitloop: */ if ((GBESTMATCH == ON) && (MATCHFILE == ON) && (Only_first == ON)) fprintf(stderr, "%s: Warning: the number of matches may be incorrect when -B is used with -F.\n", HARVEST_PREFIX); if (GOUTTAIL) GOUTTAIL = 1; if (GNOFILENAME) { agrep_argv[my_A_index][1] = 'Z'; /* ignore the -A option */ } #if ISSERVER if (RemoteFiles) { /* force -NQ so that won't start looking for files! 
*/ Only_first = ON; PRINTAPPXFILEMATCH = ON; } #endif if (argc > 0) { /* copy the rest of the options the pattern and the filenames if any verbatim */ for (i=0; i= MAX_ARGS) break; agrep_argv[agrep_argc] = (char *)my_malloc(strlen(argv[0]) + 2); strcpy(agrep_argv[agrep_argc], argv[0]); agrep_argc ++; argv ++; } if (!foundpat) argc --; } #if 0 for (j=0; j 0, glimpse * runs as agrep: otherwise, it searches index, etc. */ if (argc <= 0) { if (RecordLevelIndex) { /* based on work done for robint@zedcor.com Robin Thomas, Art Today, Tucson, AZ */ /* if ((D_length > 0) && strcmp(GD_pattern, rdelim)) { fprintf(stderr, "Index created for delimiter `%s': cannot search with delimiter `%s'\n", rdelim, GD_pattern); RETURN(-1); } SHOULD I HAVE THIS CHECK? MAYBE GD_pattern is a SUBSTRING OF rdelim??? But this is safest thing to do... robint@zedcor.com */ RegionLimit = 0; /* region is EXACTLY the same record number, not a portion of a file within some offset+length */ } glimpse_call = 1; /* Initialize some data structures, read the index */ if (GRECURSIVE == 1) { fprintf(stderr, "illegal option: '-r'\n"); RETURN(usage()); } num_terminals = 0; GParse = NULL; memset(terminals, '\0', sizeof(ParseTree) * MAXNUM_PAT); #if !ISSERVER if (-1 == read_index(indexdir)) RETURN(-1); #endif /*!ISSERVER*/ /* This handles the -n option with ByteLevelIndex: disabled as of now, else should go into file search... 
*/ if (nobytelevelmustbeon && (ByteLevelIndex && !RecordLevelIndex)) { /* with RecordLevelIndex, we'll do search, so don't set NOBYTELEVEL */ /* fprintf(stderr, "Warning: -n option used with byte-level index: must SEARCH the files\n"); */ NOBYTELEVEL=ON; } WHOLEFILESCOPE = (WHOLEFILESCOPE || wholefilescope); if (ByteLevelIndex) { /* Must zero them here in addition to index search so that RETURN macro runs correctly */ if ((src_offset_table == NULL) && ((src_offset_table = (struct offsets **)my_malloc(sizeof(struct offsets *) * OneFilePerBlock)) == NULL)) exit(2); memset(src_offset_table, '\0', sizeof(struct offsets *) * OneFilePerBlock); for (i=0; i= GM) { fprintf(stderr, "%s: pattern '%s' has no characters that were indexed: glimpse cannot search for it\n", HARVEST_PREFIX, GPattern); for (j=0; j 0) { dest_index_buf[num+1] = '\n'; if (!strncmp(dest_index_buf, "END", strlen("END"))) break; i = j = 0; while ((joffset = y; o->next = NULL; o->sign = o->done = 0; if (heado == NULL) { heado = o; tailo = o; } else { tailo->next = o; tailo = o; } if (dest_index_buf[i] == FILE_END_MARK) goto onemorey; src_offset_table[x] = heado; } /* printf("]\n"); */ num = readline(newsockfd, dest_index_buf, REAL_INDEX_BUF); } goto search_files; } /* * Copy the agrep-options that are relevant to index search into * index_argv (see man-pages for which options are relevant). * Also, adjust patindex whenever options are skipped over. * NOTE: agrep_argv does NOT contain two options after one '-'. 
*/ index_argc = 0; for (j=0; j= OneFilePerBlock) break; src_index_set[iii] |= mask_int[jjj]; } } else for(iii=0; iii OneFilePerBlock) num_blocks = OneFilePerBlock; /* roundoff */ } else { for (iii=0; iii search=%d optimize=%d times=%d all=%d blocks=%d len=%d pat=%s scope=%d\n", NOBYTELEVEL, OPTIMIZEBYTELEVEL, src_index_set[REAL_PARTITION - 2], src_index_set[REAL_PARTITION - 1], num_blocks, strlen(APattern), APattern, WHOLEFILESCOPE); #endif /*DEBUG*/ /* Based on contribution From ada@mail2.umu.se Fri Jul 12 01:56 MST 1996; Christer Holgersson, Sen. SysNet Mgr, Umea University/SUNET, Sweden */ if (BITFIELDFILE) { int i, len = -1, nextchar; FILE *fp; fp = fopen(bitfield_file, "r"); if (fp != NULL) { if (BITFIELDENDIAN >= 2) { /* is a BIG-ENDIAN 4B integer list of indexes of files in .glimpse_filenames (sparse set) */ if (BITFIELDLENGTH == 0) BITFIELDLENGTH = file_num; if (BITFIELDOFFSET > 0) fseek(fp, BITFIELDOFFSET, (long)0); if (OneFilePerBlock) { for (i=0; i 0) fseek(fp, BITFIELDOFFSET, (long)0); if (OneFilePerBlock) { for (i=0; i MAX_PARTITION: see io.c */ } i++; } fclose(fp); if (i <= 0) { fprintf(stderr, "Error in reading %d bytes from offset %d in bitfield file %s ... ignoring it\n", BITFIELDOFFSET, BITFIELDLENGTH, bitfield_file); /* ignore bitfield_file */ } else { /* intersect files in bitfield_file with those that were obtained after pattern search */ if (OneFilePerBlock) { for (i=0; i OneFilePerBlock) num_blocks = OneFilePerBlock; /* roundoff */ } else { for (iii=0; iii 0) ? OneFilePerBlock : GNumpartitions, (OneFilePerBlock > 0) ? "files" : "blocks"); if (num_blocks > 0) { char cc[8]; cc[0] = 'y'; #if !ISSERVER if (!GNOPROMPT) { fprintf(stderr, "Do you want to see the file names? 
(y/n)"); fgets(cc, 4, stdin); } #endif /*!ISSERVER*/ if (!SILENT && (cc[0] == 'y')) { if (PRINTAPPXFILEMATCH && Only_first && GPRINTFILENUMBER) { printf("BEGIN %d %d %d\n", bestmatcherrors, NOBYTELEVEL, OPTIMIZEBYTELEVEL); } for (jjj=0; jjj 0) && (jjj >= GLIMITOUTPUT)) break; if (ByteLevelIndex && !NOBYTELEVEL && (src_index_set[REAL_PARTITION - 1] != 1) && (src_offset_table[GFileIndex[jjj]] == NULL)) continue; if (GPRINTFILENUMBER) printf("%d", GFileIndex[jjj]); else printf("%s", GTextfiles[jjj]); if (PRINTAPPXFILEMATCH) { if (GCOUNT) { int n = 0; printf(": "); if (ByteLevelIndex && (src_offset_table != NULL)) { struct offsets *p1 = src_offset_table[GFileIndex[jjj]]; while (p1 != NULL) { n ++; p1 = p1->next; } } else n = 1; /* there is atleast 1 match */ printf("%d", n); } else { printf(" ["); if (ByteLevelIndex && (src_offset_table != NULL)) { struct offsets *p1 = src_offset_table[GFileIndex[jjj]]; while (p1 != NULL) { printf(" %d", p1->offset); p1 = p1->next; } } printf("]"); } } printf("\n"); } if (PRINTAPPXFILEMATCH && Only_first && GPRINTFILENUMBER) { printf("END\n"); } } } RETURN(0); } /* end of Only_first */ if (!OneFilePerBlock) searchpercent = num_blocks*100/GNumpartitions; else searchpercent = num_blocks * 100 / OneFilePerBlock; #if BG_DEBUG fprintf(debug, "searchpercent = %d, num_blocks = %d\n", searchpercent, num_blocks); #endif /*BG_DEBUG*/ #if !ISSERVER if (!GNOPROMPT && (searchpercent > MAX_SEARCH_PERCENT)) { char cc[8]; cc[0] = 'y'; fprintf(stderr, "Your query may search about %d%% of the total space! Continue? (y/n)", searchpercent); fgets(cc, 4, stdin); if (cc[0] != 'y') RETURN(0); } if (ByteLevelIndex && !RecordLevelIndex && (searchpercent > DEF_MAX_INDEX_PERCENT)) NOBYTELEVEL = 1; /* with RecordLevelIndex, I don't just want to stop collecting offsets just because searchpercent > .... 
*/ #endif /*!ISSERVER*/ } /* end of !MATCHFILE */ else { /* set up the right options for -F in index_argv/index_argc itself since they will no longer be used */ index_argc=0; strcpy(index_argv[0], GProgname); /* adding the -h option, which is safer for -F */ index_argc ++; index_argv[index_argc][0] = '-'; index_argv[index_argc][1] = 'h'; index_argv[index_argc][2] = '\0'; index_argc ++; /* new code: bgopal, Feb/8/94: deleted udi's code here */ j = 0; while (FileOpt[j] == '-') { j++; while ((FileOpt[j] != ' ') && (FileOpt[j] != '\0') && (FileOpt[j] != '\n')) { if (j >= MAX_ARGS - 1) { fprintf(stderr, "%s: too many options after -F: %s\n", GProgname, FileOpt); RETURN(usage()); } index_argv[index_argc][0] = '-'; index_argv[index_argc][1] = FileOpt[j]; index_argv[index_argc][2] = '\0'; index_argc ++; j++; } if ((FileOpt[j] == '\0') || (FileOpt[j] == '\n')) break; if ((FileOpt[j] == ' ') && (FileOpt[j-1] == '-')) { fprintf(stderr, "%s: illegal option: '-' after -F\n", GProgname); RETURN(usage()); } else if (FileOpt[j] == ' ') while(FileOpt[j] == ' ') j++; } while(FileOpt[j] == ' ') j++; fileopt_length = strlen(FileOpt); strncpy(index_argv[index_argc],FileOpt+j,fileopt_length-j); index_argv[index_argc][fileopt_length-j] = '\0'; index_argc++; my_free(FileOpt, MAXFILEOPT); FileOpt = NULL; #if BG_DEBUG fprintf(debug, "pattern to check with -F = %s\n",index_argv[index_argc-1]); #endif /*BG_DEBUG*/ #if DEBUG fprintf(stderr, "-F : "); for (jj=0; jj < index_argc; jj++) fprintf(stderr, " %s ",index_argv[jj]); fprintf(stderr, "\n"); #endif /*DEBUG*/ fflush(stdout); get_filenames(src_index_set, index_argc, index_argv, dummylen, dummypat, file_num); /* Assume #files per partitions is appx constant */ if (OneFilePerBlock) num_blocks = GNumfiles; else num_blocks = GNumfiles * GNumpartitions / p_table[GNumpartitions - 1]; if (Only_first) { /* search the index only */ fprintf(stderr, "There are matches to %d out of %d %s\n", num_blocks, (OneFilePerBlock > 0) ? 
OneFilePerBlock : GNumpartitions, (OneFilePerBlock > 0) ? "files" : "blocks"); if (num_blocks > 0) { char cc[8]; cc[0] = 'y'; #if !ISSERVER if (!GNOPROMPT) { fprintf(stderr, "Do you want to see the file names? (y/n)"); fgets(cc, 4, stdin); } #endif /*!ISSERVER*/ if (!SILENT && (cc[0] == 'y')) { if (PRINTAPPXFILEMATCH && Only_first && GPRINTFILENUMBER) { printf("BEGIN %d %d %d\n", bestmatcherrors, NOBYTELEVEL, OPTIMIZEBYTELEVEL); } for (jjj=0; jjj 0) && (jjj >= GLIMITOUTPUT)) break; if (ByteLevelIndex && !NOBYTELEVEL && (src_index_set[REAL_PARTITION - 1] != 1) && (src_offset_table[GFileIndex[jjj]] == NULL)) continue; if (GPRINTFILENUMBER) printf("%d", GFileIndex[jjj]); else printf("%s", GTextfiles[jjj]); if (PRINTAPPXFILEMATCH) { if (GCOUNT) { int n = 0; printf(": "); if (ByteLevelIndex && (src_offset_table != NULL)) { struct offsets *p1 = src_offset_table[GFileIndex[jjj]]; while (p1 != NULL) { n ++; p1 = p1->next; } } else n = 1; /* there is atleast 1 match */ printf("%d", n); } else { printf("["); if (ByteLevelIndex && (src_offset_table != NULL)) { struct offsets *p1 = src_offset_table[GFileIndex[jjj]]; while (p1 != NULL) { printf(" %d", p1->offset); p1 = p1->next; } } printf("]"); } } printf("\n"); } if (PRINTAPPXFILEMATCH && Only_first && GPRINTFILENUMBER) { printf("END\n"); } } } RETURN(0); } /* end of Only_first */ if (OneFilePerBlock) searchpercent = GNumfiles * 100 / OneFilePerBlock; else searchpercent = GNumfiles * 100 / p_table[GNumpartitions - 1]; #if BG_DEBUG fprintf(debug, "searchpercent = %d, num_files = %d\n", searchpercent, p_table[GNumpartitions - 1]); #endif /*BG_DEBUG*/ #if !ISSERVER if (!GNOPROMPT && (searchpercent > MAX_SEARCH_PERCENT)) { char cc[8]; cc[0] = 'y'; fprintf(stderr, "Your query may search about %d%% of the total space! Continue? 
(y/n)", searchpercent); fgets(cc, 4, stdin); if (cc[0] != 'y') RETURN(0); } if (ByteLevelIndex && !RecordLevelIndex && (searchpercent > DEF_MAX_INDEX_PERCENT)) NOBYTELEVEL = 1; /* with RecordLevelIndex, I don't just want to stop collecting offsets just because searchpercent > .... */ #endif /*!ISSERVER*/ } /* At this point, I have the set of files to search */ search_files: /* Replace -B by the number of errors if best-match */ if (GBESTMATCH && (my_B_index >= 0)) { sprintf(&agrep_argv[my_B_index][1], "%d", bestmatcherrors); #if BG_DEBUG fprintf(debug, "Changing -B to -%d\n", bestmatcherrors); #endif /*BG_DEBUG*/ } agrep_argv[my_M_index][1] = 'Z'; agrep_argv[my_P_index][1] = 'Z'; #if 0 for (iii=0; iii 0) { gnum_of_matched += ret; gfiles_matched ++; } SetCurrentFileName = 0; if (GPRINTFILETIME) SetCurrentFileTime = 0; if (GLIMITOUTPUT > 0) { if (GLIMITOUTPUT <= gnum_of_matched) break; LIMITOUTPUT = GLIMITOUTPUT - gnum_of_matched; } if (GLIMITTOTALFILE > 0) { if (GLIMITTOTALFILE <= gfiles_matched) break; LIMITTOTALFILE = GLIMITTOTALFILE - gfiles_matched; } if ((ret < 0) && (errno == AGREP_ERROR)) break; if (glimpse_clientdied) break; fflush(stdout); } } else { for (i=0; i index_stat_buf.st_mtime) { /* fprintf(stderr, "Warning: file modified after indexing: must SEARCH %s\n", CurrentFileName); */ free_list(&src_offset_table[GFileIndex[i]]); first_search = 1; if ((ret = fileagrep_search(AM, APattern, 1, &GTextfiles[i], 0, stdout)) > 0) { gnum_of_matched += ret; gfiles_matched ++; } } else if ((ret = glimpse_search(AM, APattern, GD_length, GD_pattern, GTextfiles[i], GTextfiles[i], GFileIndex[i], src_offset_table, stdout)) > 0) { gnum_of_matched += ret; gfiles_matched ++; } SetCurrentFileName = 0; if (GPRINTFILETIME) SetCurrentFileTime = 0; if (GLIMITOUTPUT > 0) { if (GLIMITOUTPUT <= gnum_of_matched) break; LIMITOUTPUT = GLIMITOUTPUT - gnum_of_matched; } if (GLIMITTOTALFILE > 0) { if (GLIMITTOTALFILE <= gfiles_matched) break; LIMITTOTALFILE = GLIMITTOTALFILE -
gfiles_matched; } if ((ret < 0) && (errno == AGREP_ERROR)) break; if (glimpse_clientdied) break; fflush(stdout); } } } /* end of !UseFilters */ else { sprintf(outname[0], "%s/.glimpse_apply.%d", TEMP_DIR, getpid()); for (i=0; i index_stat_buf.st_mtime)) { first_search = 1; if ((ret = fileagrep_search(AM, APattern, 1, outname, 0, stdout)) > 0) { gnum_of_matched += ret; gfiles_matched ++; } } else { if (file_stat_buf.st_mtime > index_stat_buf.st_mtime) { /* fprintf(stderr, "Warning: file modified after indexing: must SEARCH %s\n", CurrentFileName); */ free_list(&src_offset_table[GFileIndex[i]]); first_search = 1; if ((ret = fileagrep_search(AM, APattern, 1, outname, 0, stdout)) > 0) { gnum_of_matched += ret; gfiles_matched ++; } } else if ((ret = glimpse_search(AM, APattern, GD_length, GD_pattern, GTextfiles[i], outname[0], GFileIndex[i], src_offset_table, stdout)) > 0) { gfiles_matched ++; gnum_of_matched += ret; } } SetCurrentFileName = 0; if (GPRINTFILETIME) SetCurrentFileTime = 0; } else { if (!ByteLevelIndex || RecordLevelIndex || NOBYTELEVEL) { first_search = 1; if ((ret = fileagrep_search(AM, APattern, 1, &GTextfiles[i], 0, stdout)) > 0) { gnum_of_matched += ret; gfiles_matched ++; } } else { SetCurrentFileName = 1; if (GPRINTFILENUMBER) sprintf(CurrentFileName, "%d", GFileIndex[i]); else strcpy(CurrentFileName, GTextfiles[i]); if (my_stat(GTextfiles[i], &file_stat_buf) == -1) { if (GPRINTNONEXISTENTFILE) printf("%s\n", CurrentFileName); continue; } if (GPRINTFILETIME) { SetCurrentFileTime = 1; CurrentFileTime = get_file_time(timesfp, &file_stat_buf, GTextfiles[i], GFileIndex[i]); } if (file_stat_buf.st_mtime > index_stat_buf.st_mtime) { /* fprintf(stderr, "Warning: file modified after indexing: must SEARCH %s\n", CurrentFileName); */ free_list(&src_offset_table[GFileIndex[i]]); first_search = 1; if ((ret = fileagrep_search(AM, APattern, 1, &GTextfiles[i], 0, stdout)) > 0) { gnum_of_matched += ret; gfiles_matched ++; } } else if ((ret = glimpse_search(AM,
APattern, GD_length, GD_pattern, GTextfiles[i], GTextfiles[i], GFileIndex[i], src_offset_table, stdout)) > 0) { gnum_of_matched += ret; gfiles_matched ++; } SetCurrentFileName = 0; if (GPRINTFILETIME) SetCurrentFileTime = 0; } } if (GLIMITOUTPUT > 0) { if (GLIMITOUTPUT <= gnum_of_matched) break; LIMITOUTPUT = GLIMITOUTPUT - gnum_of_matched; } if (GLIMITTOTALFILE > 0) { if (GLIMITTOTALFILE <= gfiles_matched) break; LIMITTOTALFILE = GLIMITTOTALFILE - gfiles_matched; } if ((ret < 0) && (errno == AGREP_ERROR)) break; if (glimpse_clientdied) break; fflush(stdout); } } } /* end of WHOLEFILESCOPE <= 0 */ else { FILE *tmpfp = NULL; /* to store structured query-search output */ int OLDFILENAMEONLY;/* don't use FILENAMEONLY for agrepping the stuff: handle it in filtering */ int OLDLIMITOUTPUT; /* don't use LIMITs for search: only for filtering=identify_region(): agrep NEVER changes these 3 */ int OLDLIMITPERFILE; int OLDLIMITTOTALFILE; int OLDPRINTRECORD; /* don't use PRINTRECORD for search: only after filter_output() recognizes boolean in wholefilescope */ int OLDCOUNT; /* don't use OLDCOUNT for search: only after filter_output() recognizes boolean in wholefilescope */ if (!UseFilters) { for (i=0; i index_stat_buf.st_mtime) { /* fprintf(stderr, "Warning: file modified after indexing: must SEARCH %s\n", CurrentFileName); */ free_list(&src_offset_table[GFileIndex[i]]); first_search = 1; ret = fileagrep_search(AM, APattern, 1, &GTextfiles[i], 0, tmpfp); } else ret = glimpse_search(AM, APattern, GD_length, GD_pattern, GTextfiles[i], GTextfiles[i], GFileIndex[i], src_offset_table, tmpfp); } SetCurrentFileName = 0; if (GPRINTFILETIME) SetCurrentFileTime = 0; fflush(tmpfp); fclose(tmpfp); tmpfp = NULL; if ((ret < 0) && (errno == AGREP_ERROR)) break; #if DEBUG printf("done search\n"); fflush(stdout); #endif /*DEBUG*/ FILENAMEONLY = OLDFILENAMEONLY; LIMITOUTPUT = OLDLIMITOUTPUT; LIMITPERFILE = OLDLIMITPERFILE; LIMITTOTALFILE = OLDLIMITTOTALFILE; PRINTRECORD = OLDPRINTRECORD; COUNT =
OLDCOUNT; if (!UseFilters) { ret = filter_output(GTextfiles[i], tempfile, GParse, GD_pattern, GD_length, GOUTTAIL, nullfp, StructuredIndex); } else { ret = filter_output(outname[0], tempfile, GParse, GD_pattern, GD_length, GOUTTAIL, nullfp, StructuredIndex); unlink(outname[0]); } gnum_of_matched += (ret > 0) ? ret : 0; gfiles_matched += (ret > 0) ? 1 : 0; if (GLIMITOUTPUT > 0) { if (GLIMITOUTPUT <= gnum_of_matched) break; LIMITOUTPUT = GLIMITOUTPUT - gnum_of_matched; } if (GLIMITTOTALFILE > 0) { if (GLIMITTOTALFILE <= gfiles_matched) break; LIMITTOTALFILE = GLIMITTOTALFILE - gfiles_matched; } if (glimpse_clientdied) break; fflush(stdout); } } else { /* we should try to apply the filter (we come here with -W -z, say) */ sprintf(outname[0], "%s/.glimpse_apply.%d", TEMP_DIR, getpid()); for (i=0; i index_stat_buf.st_mtime)) { first_search = 1; ret = fileagrep_search(AM, APattern, 1, outname, 0, tmpfp); } else { if (file_stat_buf.st_mtime > index_stat_buf.st_mtime) { /* fprintf(stderr, "Warning: file modified after indexing: must SEARCH %s\n", CurrentFileName); */ free_list(&src_offset_table[GFileIndex[i]]); first_search = 1; ret = fileagrep_search(AM, APattern, 1, outname, 0, tmpfp); } else ret = glimpse_search(AM, APattern, GD_length, GD_pattern, GTextfiles[i], outname[0], GFileIndex[i], src_offset_table, tmpfp); } } else { if (!ByteLevelIndex || RecordLevelIndex || NOBYTELEVEL) { if (GPRINTFILETIME) { SetCurrentFileTime = 1; CurrentFileTime = get_file_time(timesfp, NULL, GTextfiles[i], GFileIndex[i]); } first_search = 1; ret = fileagrep_search(AM, APattern, 1, &GTextfiles[i], 0, tmpfp); } else { if (my_stat(GTextfiles[i], &file_stat_buf) == -1) { if (GPRINTNONEXISTENTFILE) printf("%s\n", CurrentFileName); fclose(tmpfp); continue; } if (GPRINTFILETIME) { SetCurrentFileTime = 1; CurrentFileTime = get_file_time(timesfp, &file_stat_buf, GTextfiles[i], GFileIndex[i]); } if (file_stat_buf.st_mtime > index_stat_buf.st_mtime) { /* fprintf(stderr, "Warning: file modified after
indexing: must SEARCH %s\n", CurrentFileName); */ free_list(&src_offset_table[GFileIndex[i]]); first_search = 1; ret = fileagrep_search(AM, APattern, 1, &GTextfiles[i], 0, tmpfp); } else ret = glimpse_search(AM, APattern, GD_length, GD_pattern, GTextfiles[i], GTextfiles[i], GFileIndex[i], src_offset_table, tmpfp); } } SetCurrentFileName = 0; if (GPRINTFILETIME) SetCurrentFileTime = 0; fflush(tmpfp); fclose(tmpfp); tmpfp = NULL; if ((ret < 0) && (errno == AGREP_ERROR)) break; #if DEBUG printf("done search\n"); fflush(stdout); #endif /*DEBUG*/ FILENAMEONLY = OLDFILENAMEONLY; LIMITOUTPUT = OLDLIMITOUTPUT; LIMITPERFILE = OLDLIMITPERFILE; LIMITTOTALFILE = OLDLIMITTOTALFILE; PRINTRECORD = OLDPRINTRECORD; COUNT = OLDCOUNT; if (!UseFilters) { /* Added to do structured queries from Webglimpse */ ret = filter_output(GTextfiles[i], tempfile, GParse, GD_pattern, GD_length, GOUTTAIL, nullfp, StructuredIndex); } else { ret = filter_output(outname[0], tempfile, GParse, GD_pattern, GD_length, GOUTTAIL, nullfp, StructuredIndex); } gnum_of_matched += (ret > 0) ? ret : 0; gfiles_matched += (ret > 0) ? 
1 : 0; if (GLIMITOUTPUT > 0) { if (GLIMITOUTPUT <= gnum_of_matched) break; LIMITOUTPUT = GLIMITOUTPUT - gnum_of_matched; } if (GLIMITTOTALFILE > 0) { if (GLIMITTOTALFILE <= gfiles_matched) break; LIMITTOTALFILE = GLIMITTOTALFILE - gfiles_matched; } if (glimpse_clientdied) break; fflush(stdout); } } } if (errno == AGREP_ERROR) { fprintf(stderr, "%s: error in options or arguments to `agrep'\n", HARVEST_PREFIX); } RETURN(gnum_of_matched); } else { /* argc > 0: simply call agrep */ #if DEBUG for (i=0; i 0) || (residue > 0)) { total_read += num_read; if (num_read <= 0) { final_end = filter_buf + residue; num_read = residue; residue = 0; } else { num_read += residue; final_end = (CHAR *)backward_delimiter(filter_buf + num_read, filter_buf, GD_pattern, GD_length, GOUTTAIL); residue = filter_buf + num_read - final_end; } #if DEBUG fprintf(stderr, "filter_buf=%x final_end=%x residue=%x last_chars=%c%c%c num_read=%x\n", filter_buf, final_end, residue, *(final_end-2), *(final_end-1), *(final_end), num_read); #endif /*DEBUG*/ current_begin = previous_begin = filter_buf; #if 1 current_end = (CHAR *)forward_delimiter(filter_buf, filter_buf + num_read, GD_pattern, GD_length, GOUTTAIL); /* skip over prefixes like filename */ if (!GOUTTAIL) current_end = (CHAR *)forward_delimiter((long)current_end + GD_length, final_end, GD_pattern, GD_length, GOUTTAIL); #else /*1*/ current_end = (CHAR *)forward_delimiter(filter_buf+1, final_end, GD_pattern, GD_length, GOUTTAIL); #endif /*1*/ #if DEBUG fprintf(stderr, "current_begin=%x current_end=%x\n", current_begin, current_end); #endif /*DEBUG*/ while (current_end <= final_end) { previous_begin = current_begin; /* look for %d= */ byteoff = -1; while (current_begin < current_end) { if (isdigit(*current_begin)) { skiplen = getbyteoff(current_begin, &byteoff); #if BG_DEBUG fprintf(debug, "byteoff=%d skiplen=%d\n", byteoff, skiplen); #endif /*BG_DEBUG*/ if ((skiplen < 0) || (byteoff < 0)) { current_begin ++; continue; } else break; } else 
current_begin ++; } #if DEBUG printf("current_begin=%x current_end=%x final_end=%x residue=%x num_read=%x\n", current_begin, current_end, final_end, residue, num_read); #endif /*DEBUG*/ #if DEBUG printf("byteoff=%d skiplen=%d\n", byteoff, skiplen); #endif /*DEBUG*/ if ((skiplen < 0) || (byteoff < 0)) { /* output the whole line as it is: there is nothing to strip (e.g., -l) */ #if 0 /* This is an error: -l is now handled completely inside filter_output --> agrep won't processes it when -W */ if (!SILENT) fwrite(previous_begin, 1, current_end-previous_begin, displayfp); numprinted ++; #endif } else if ( (num_attr <= 0) || (((attribute = region_identify(byteoff, 0)) < num_attr) && (attribute >= 0)) ) { /* prefix is from previous_begin to current_begin. Skip skiplen from current_begin. Rest until current_end is valid output */ if (num_attr <= 0) attribute = 0; #if BG_DEBUG fprintf(debug, "region@%d=%d\n", byteoff, attribute); #endif /*BG_DEBUG*/ c1 = *(current_begin + skiplen - 1); c2 = *(current_end + 1); printed = 0; for (i=0; i already_matched[%d] = %d, going to look at '%s'\n", i, matched_terminals[i], terminals[i].data.leaf.value); #endif if (matched_terminals[i] && (GFILENAMEONLY || FILEOUT || printed || ((LIMITOUTPUT > 0) && (numprinted >= LIMITOUTPUT)) || ((LIMITPERFILE > 0) && (numprinted >= LIMITPERFILE)))) continue; if ((terminals[i].data.leaf.attribute == 0) || ((int)(terminals[i].data.leaf.attribute) == attribute)) { *(current_begin + skiplen - 1) = '\n'; *(current_end + 1) = '\n'; OLDLIMITOUTPUT = LIMITOUTPUT; LIMITOUTPUT = 0; OLDLIMITPERFILE = LIMITPERFILE; LIMITPERFILE = 0; OLDLIMITTOTALFILE = LIMITTOTALFILE; LIMITTOTALFILE = 0; if (memagrep_search( strlen(terminals[i].data.leaf.value), terminals[i].data.leaf.value, current_end - current_begin - skiplen + 1, current_begin + skiplen - 1, 0, nullfp) > 0) { LIMITOUTPUT = OLDLIMITOUTPUT; LIMITPERFILE = OLDLIMITPERFILE; LIMITTOTALFILE = OLDLIMITTOTALFILE; #if 0 *(current_end + 1) = '\0'; printf("--> search 
succeeded for %s in %s\n", terminals[i].data.leaf.value, previous_begin); #endif /*0*/ *(current_begin + skiplen - 1) = c1; *(current_end + 1) = c2; matched_terminals[i] = 1; /* must reevaluate/set since don't know if it should be printed */ if (!(((LIMITOUTPUT > 0) && (numprinted >= LIMITOUTPUT)) || ((LIMITPERFILE > 0) && (numprinted >= LIMITPERFILE))) && !printed) { /* see if it was useful later */ if (!COUNT && !FILEOUT && !SILENT) { fwrite(previous_begin, 1, current_begin - previous_begin, displayfp); if (PRINTATTR) fprintf(displayfp, "%s# ", (attrname = attr_id_to_name(attribute)) == NULL ? "(null)" : attrname); if (GBYTECOUNT) fprintf(displayfp, "%d= ", byteoff); if (PRINTRECORD) { fwrite(current_begin + skiplen, 1, current_end - current_begin - skiplen, displayfp); } else { if (*(current_begin + skiplen) == '@') { int iii = 0; while (current_begin[skiplen + iii] != '}') fputc(current_begin[skiplen + iii++], displayfp); fputc('}', displayfp); } fputc('\n', displayfp); } } printed = 1; numprinted ++; } } else { LIMITOUTPUT = OLDLIMITOUTPUT; LIMITPERFILE = OLDLIMITPERFILE; LIMITTOTALFILE = OLDLIMITTOTALFILE; #if 0 *(current_end + 1) = '\0'; printf("--> search failed for %s in %s\n", terminals[i].data.leaf.value, previous_begin); #endif /*0*/ *(current_begin + skiplen - 1) = c1; *(current_end + 1) = c2; } } } if (!success) { if (ComplexBoolean) { success = eval_tree(GParse, matched_terminals); } else { if ((long)GParse & AND_EXP) { success = 0; for (ii=0; ii= num_terminals) success = 1; } else { success = 0; /* cannot come to filter_output in this case unless -a! 
*/ } } } /* optimize options that do not need all the matched lines */ if (success) { if (GFILENAMEONLY) { if (GPRINTFILETIME) { /* from bug fix message by Dr Jaime Prilusky lsprilus@weizmann.weizmann.ac.il jaimep@terminator.pdb.bnl.gov */ if (!SILENT) fprintf(stdout, "%s%s\n", CurrentFileName, aprint_file_time(CurrentFileTime)); } else { if (!SILENT) fprintf(stdout, "%s\n", CurrentFileName); } if (storefp != NULL) fclose(storefp); /* don't bother to flush! */ storefp = NULL; goto unlink_and_quit; } else if (FILEOUT) { /* file_out(infile); */ if (storefp != NULL) fclose(storefp); /* don't bother to flush! */ storefp = NULL; goto unlink_and_quit; } } } if (success && (((LIMITOUTPUT > 0) && (numprinted >= LIMITOUTPUT)) || ((LIMITPERFILE > 0) && (numprinted >= LIMITPERFILE)))) goto double_break; if (glimpse_clientdied) goto double_break; if (current_end >= final_end) break; current_begin = current_end; if (!GOUTTAIL) current_end = (CHAR *)forward_delimiter((long)current_end + GD_length, final_end, GD_pattern, GD_length, GOUTTAIL); else current_end = (CHAR *)forward_delimiter(current_end, final_end, GD_pattern, GD_length, GOUTTAIL); #if DEBUG fprintf(stderr, "current_begin=%x current_end=%x\n", current_begin, current_end); #endif /*DEBUG*/ } if (residue > 0) { memcpy(filter_buf, final_end, residue); memcpy(filter_buf+residue, GD_pattern, GD_length); } } double_break: /* Come here on normal exit or when the current agrep-output is no longer of any use */ if (!success && (total_read > 0)) { if (ComplexBoolean) { success = eval_tree(GParse, matched_terminals); } else { if ((long)GParse & AND_EXP) { success = 0; for (ii=0; ii= num_terminals) success = 1; } else { success = 0; /* cannot come to filter_output in this case unless -a! 
*/ } } } /* Print the temporary output onto stdout if search was successful; unlink the temporary file */ if (success) { if (GFILENAMEONLY) { /* all other output options are useless since they all deal with the MATCHED line */ if (GPRINTFILETIME) { /* from bug fix message by Dr Jaime Prilusky lsprilus@weizmann.weizmann.ac.il jaimep@terminator.pdb.bnl.gov */ if (!SILENT) fprintf(stdout, "%s%s\n", CurrentFileName, aprint_file_time(CurrentFileTime)); } else { if (!SILENT) fprintf(stdout, "%s\n", CurrentFileName); } if (!SILENT) fprintf(stdout, "%s\n", CurrentFileName); if (storefp != NULL) fclose(storefp); /* don't bother to flush! */ storefp = NULL; } else if (COUNT && !FILEOUT) { if (!SILENT) { if(!NOFILENAME) fprintf(stdout, "%s: %d\n", CurrentFileName, numprinted); else fprintf(stdout, "%d\n", numprinted); } if (storefp != NULL) fclose(storefp); /* don't bother to flush! */ storefp = NULL; } else if (FILEOUT) { /* file_out(infile); */ if (storefp != NULL) fclose(storefp); /* don't bother to flush!
*/ storefp = NULL; } else if (storefp != NULL) { fflush(storefp); fclose(storefp); #if DEBUG printf("STOREOUTPUT\n"); sprintf(s, "exec cat %s/.glimpse_storeoutput.%d\n", TEMP_DIR, getpid()); system(s); #endif /*DEBUG*/ sprintf(s, "%s/.glimpse_storeoutput.%d", TEMP_DIR, getpid()); if ((storefp = fopen(s, "r")) != NULL) { if (!SILENT) while (fgets(s, MAX_LINE_LEN, storefp) != NULL) fputs(s, stdout); fclose(storefp); } storefp = NULL; } } else { if (storefp != NULL) fclose(storefp); /* else don't bother to flush */ } unlink_and_quit: sprintf(s, "%s/.glimpse_storeoutput.%d", TEMP_DIR, getpid()); unlink(s); if (StructuredIndex) region_destroy(); fclose(outfp); if (GFILENAMEONLY) { if (numprinted > 0) return 1; else return 0; } else if (ComplexBoolean || ((long)GParse & AND_EXP)) { if (success) return numprinted; else return 0; } else { /* must be -a */ return numprinted; } } usage() { fprintf(stderr, "\nThis is glimpse version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); fprintf(stderr, "usage: %s [-#abceghijklnprstwxyBCDEGIMNPQSVWZ] [-d DEL] [-f FILE] [-F PAT] [-H DIR] [-J HOST] [-K PORT] [-L X[:Y:Z]] [-R X] [-T DIR] [-Y D] pattern [files]", GProgname); fprintf(stderr, "\n"); fprintf(stderr, "List of options (see %s for more details):\n", GLIMPSE_URL); fprintf(stderr, "\n"); fprintf(stderr, "-#: find matches with at most # errors\n"); fprintf(stderr, "-a: print attribute names (useful only for Harvest SOIF format)\n"); fprintf(stderr, "-b: print the byte offset of the record from the beginning of the file\n"); fprintf(stderr, "-B: best match mode: find the closest matches to the pattern\n"); fprintf(stderr, "-c: output the number of matched records\n"); fprintf(stderr, "-C: send queries to glimpseserver\n"); fprintf(stderr, "-d DEL: define record delimiter DEL\n"); fprintf(stderr, "-D x: adjust the cost of deletions to x\n"); fprintf(stderr, "-e: for patterns starting with -\n"); fprintf(stderr, "-E: print matching lines as they appear in the index (useful in 
-EQNg)\n"); fprintf(stderr, "-f FILE: restrict the search to files whose names appear in FILE\n"); fprintf(stderr, "-F PAT: restrict the search to files matching PAT\n"); fprintf(stderr, "-g: print the file number (in the index)\n"); fprintf(stderr, "-G: output the (whole) files that contain a match\n"); fprintf(stderr, "-h: do not output file names before matched record\n"); fprintf(stderr, "-H DIR: the glimpse index is located in directory DIR\n"); fprintf(stderr, "-i: case-insensitive search, e.g., 'a' = 'A'\n"); fprintf(stderr, "-I x: adjust the cost of insertions to x\n"); fprintf(stderr, "-j: output file modification dates (if -t was used for the indexing)\n"); fprintf(stderr, "-J HOST: send queries to glimpseserver at HOST\n"); fprintf(stderr, "-k: use the pattern as is (no meta characters)\n"); fprintf(stderr, "-K PORT: send queries to glimpseserver at TCP port number PORT\n"); fprintf(stderr, "-l: output only the names of files that contain a match\n"); fprintf(stderr, "-L X[:Y:Z]: limit the output to X records [Y files, Z matches per file]\n"); fprintf(stderr, "-n: output record prefixed by record number\n"); fprintf(stderr, "-N: search only the index (may not be precise for some queries) \n"); fprintf(stderr, "-o: delimiter is output at the beginning of the matched record\n"); fprintf(stderr, "-O: file names are printed only once per file\n"); fprintf(stderr, "-p FILE:off:len:endian: restrict the search to the files whose names\n\tare specified as a bit-field OR sparse-set in FILE\n"); fprintf(stderr, "-P: print the pattern that matched before the matched record\n"); fprintf(stderr, "-q: print the offsets of the beginning and end of each matched record\n"); fprintf(stderr, "-Q: (with -N) print offsets of matches from (only the large) index\n"); fprintf(stderr, "-r: (used only for agrep) - recursive search\n"); fprintf(stderr, "-R X: set the maximum size of a record to X\n"); fprintf(stderr, "-s: display nothing except error messages\n"); fprintf(stderr, 
"-S x: adjust the cost of substitutions to x\n"); fprintf(stderr, "-t: use in combination with -d DEL so that the delimiter DEL appears\n\tat the end of each output record instead of the beginning\n"); fprintf(stderr, "-T DIR: temporary files are put in directory DIR (instead of /tmp)\n"); fprintf(stderr, "-u: do not output matched records (useful in -qbug)\n"); fprintf(stderr, "-U: interpret .glimpse_filenames when -U / -X option is used in glimpseindex\n"); fprintf(stderr, "-v: (works ONLY for agrep) - output all records that do not contain a match\n"); fprintf(stderr, "-V,--version: print the current version of glimpse\n"); fprintf(stderr, "-w: pattern has to match as a word, e.g., 'win' will not match 'wind'\n"); fprintf(stderr, "-W: the scope of Booleans is the whole file (except for structured queries)\n"); fprintf(stderr, "-x: the pattern must match the whole line\n"); fprintf(stderr, "-X: if an indexed file that matches 'pattern' doesn't exist, print its name\n"); fprintf(stderr, "-y: no prompt\n"); fprintf(stderr, "-Y D: output only matches in files that were updated in the last D days\n"); fprintf(stderr, "-z: customizable filtering using the .glimpse_filters file\n"); fprintf(stderr, "-Z: no op\n"); fprintf(stderr, "--help: this message\n"); fprintf(stderr, "\n"); fprintf(stderr, "For questions about glimpse, please contact: `%s'\n", GLIMPSE_EMAIL); return -1; /* useful if we make glimpse into a library */ /* * Undocumented Option Combinations for SFS (like RPC calls) * print file number of match instead of file name: -g * print enclosing offsets of matched record: -q * NOT print matched record: -u * E.G. USAGE: -qbug (b prints offset of pattern: can also use -lg or -Ng) * look only at index: -E * look at matched offsets in files as seen in index (w/o searching): -QN * E.G. 
USAGE: -EQNgy * read the -EQNg or just -QNg output from stdin and perform actual search w/o * searching the index (take hints from user): -U * NOTE: can't use U unless QNg are all used together (e.g., BEGIN/END won't be printed) */ } usageS() { fprintf(stderr, "\nThis is glimpse server version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); fprintf(stderr, "usage: %s [-H DIR] [-J HOST] [-K PORT]", GProgname); fprintf(stderr, "\n"); fprintf(stderr, "-H DIR: the glimpse index is located in directory DIR\n"); fprintf(stderr, "-J HOST: the host name (string) clients must use / server runs on\n"); fprintf(stderr, "-K PORT: the port (short integer) clients must use / server runs on\n"); fprintf(stderr, "\n"); fprintf(stderr, "For questions about glimpse, please contact `%s'\n", GLIMPSE_EMAIL); return -1; /* useful if we make glimpse into a library */ } #if CLIENTSERVER /* * do_select() - based on select_loop() from the Harvest Broker. * -- Courtesy: Darren Hardy, hardy@cs.colorado.edu */ int do_select(sock, sec) int sock; /* the socket to wait for */ int sec; /* the number of seconds to wait */ { struct timeval to; fd_set qready; int err; if (sock < 0 || sec < 0) return 0; FD_ZERO(&qready); FD_SET(sock, &qready); to.tv_sec = sec; to.tv_usec = 0; if ((err = select(sock + 1, &qready, NULL, NULL, &to)) < 0) { if (errno == EINTR) return 0; perror("select"); return -1; } if (err == 0) return 0; /* If there's someone waiting to get it, let them through */ return (FD_ISSET(sock, &qready) ? 1 : 0); } #endif /* CLIENTSERVER */ glimpse-4.18.7/glimpse/mkinstalldirs000077500000000000000000000012131300371307100175110ustar00rootroot00000000000000#! 
/bin/sh # mkinstalldirs --- make directory hierarchy # Author: Noah Friedman # Created: 1993-05-16 # Last modified: 1994-03-25 # Public domain errstatus=0 for file in ${1+"$@"} ; do set fnord `echo ":$file" | sed -ne 's/^:\//#/;s/^://;s/\// /g;s/^#/\//;p'` shift pathcomp= for d in ${1+"$@"} ; do pathcomp="$pathcomp$d" case "$pathcomp" in -* ) pathcomp=./$pathcomp ;; esac if test ! -d "$pathcomp"; then echo "mkdir $pathcomp" 1>&2 mkdir "$pathcomp" || errstatus=$? fi pathcomp="$pathcomp/" done done exit $errstatus # mkinstalldirs ends here glimpse-4.18.7/glimpse/split.c000066400000000000000000000473661300371307100162250ustar00rootroot00000000000000/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. */ #include "glimpse.h" extern CHAR *getword(); extern int checksg(); extern int D; extern CHAR GProgname[MAXNAME]; extern FILE *debug; extern int StructuredIndex; extern int WHOLEFILESCOPE; extern int ByteLevelIndex; extern int ComplexBoolean; extern int foundattr; extern int foundnot; /* returns where it found the distinguishing token: until that from prev value of begin is the current pattern (not just the "words" in it) */ CHAR * parse_flat(begin, end, prev, next) CHAR *begin; CHAR *end; int prev; int *next; { if (begin > end) { *next = prev; return end; } if (prev & ENDSUB_EXP) prev &= ~ATTR_EXP; if ((prev & ATTR_EXP) && !(prev & VAL_EXP)) prev |= VAL_EXP; while (begin <= end) { if (*begin == ',') { prev |= OR_EXP; prev |= VAL_EXP; prev |= ENDSUB_EXP; if (prev & AND_EXP) { fprintf(stderr, "%s: parse error at character '%c'\n", GProgname, *begin); return NULL; } *next = prev; return begin; } else if (*begin == ';') { prev |= AND_EXP; prev |= VAL_EXP; prev |= ENDSUB_EXP; if (prev & OR_EXP) { fprintf(stderr, "%s: parse error at character '%c'\n", GProgname, *begin); return NULL; } *next = prev; return begin; } else if (*begin == '=') { if (StructuredIndex <= 0) begin++; /* don't care about = since just another character */ else { if (prev & ATTR_EXP) { 
fprintf(stderr, "%s: syntax error: only ',' and ';' can follow 'attribute=value'\n", GProgname); return NULL; } prev |= ATTR_EXP; /* remains an ATTR_EXP until a new ',' OR ';' */ prev &= ~VAL_EXP; *next = prev; return begin; } } else if (*begin == '\\') begin ++; /* skip two things */ begin++; } *next = prev; return begin; } int split_pattern_flat(GPattern, GM, APattern, terminals, pnum_terminals, pGParse, num_attr) CHAR *GPattern; int GM; CHAR *APattern; ParseTree terminals[]; int *pnum_terminals; int *pGParse; /* doesn't interpret it as a tree */ int num_attr; { int j, k = 0, l = 0, len = 0; int current_attr; CHAR *buffer; CHAR *buffer_pat; CHAR *buffer_end; char tempbuf[MAX_LINE_LEN]; memset(APattern, '\0', MAXPAT); buffer = GPattern; buffer_end = buffer + GM; j=0; *pGParse = 0; current_attr = 0; foundattr = 0; /* * buffer is the runnning pointer, buffer_pat is the place where * the distinguishing delimiter was found, buffer_end is the end. */ while (buffer_pat = parse_flat(buffer, buffer_end, *pGParse, pGParse)) { /* there is no pattern until after the distinguishing delimiter position: some agrep garbage */ if (buffer_pat <= buffer) { buffer = buffer_pat+1; if (buffer_pat >= buffer_end) break; continue; } if ((*pGParse & ATTR_EXP) && !(*pGParse & VAL_EXP)) { /* fresh attribute */ foundattr=1; memcpy(tempbuf, buffer, buffer_pat - buffer); tempbuf[buffer_pat - buffer] = '\0'; len = strlen(tempbuf); for (k = 0; k= num_attr)) { buffer[buffer_pat - buffer] = '\0'; fprintf(stderr, "%s: unknown attribute name '%s'\n", GProgname, buffer); return -1; } buffer = buffer_pat+1; /* immediate next character after distinguishing delimiter */ if (buffer_pat >= buffer_end) break; continue; } else { /* attribute's value OR raw-value */ if (*pnum_terminals >= MAXNUM_PAT) { fprintf(stderr, "%s: boolean expression has too many terms\n", GProgname); return -1; } terminals[*pnum_terminals].op = 0; terminals[*pnum_terminals].type = LEAF; terminals[*pnum_terminals].terminalindex = 
*pnum_terminals; terminals[*pnum_terminals].data.leaf.attribute = current_attr; /* default is no structure */ terminals[*pnum_terminals].data.leaf.value = (CHAR *)malloc(buffer_pat - buffer + 2); memcpy(terminals[*pnum_terminals].data.leaf.value, buffer, buffer_pat - buffer); /* without distinguishing delimiter */ terminals[*pnum_terminals].data.leaf.value[buffer_pat - buffer] = '\0'; if (foundattr || WHOLEFILESCOPE) { memcpy(&APattern[j], buffer, buffer_pat - buffer); j += buffer_pat - buffer; /* NOT including the distinguishing delimiter at buffer_pat, or '\0' */ APattern[j++] = (*(buffer_pat + 1) == '\0' ? '\0' : ','); /* always search for OR, do filtering at the end */ #if BG_DEBUG fprintf(debug, "current_attr = %d, val = %s\n", current_attr, terminals[*pnum_terminals].data.leaf.value); #endif /*BG_DEBUG*/ } else { memcpy(&APattern[j], buffer, buffer_pat + 1 - buffer); j += buffer_pat + 1 - buffer; /* including the distinguishing delimiter at buffer_pat, or '\0' */ } (*pnum_terminals)++; } if (*pGParse & ENDSUB_EXP) current_attr = 0; /* remains 0 until next fresh attribute */ if (buffer_pat >= buffer_end) break; buffer = buffer_pat+1; } if (buffer_pat == NULL) return -1; /* got out of while loop because of NULL rather than break */ APattern[j] = '\0'; if (foundattr || WHOLEFILESCOPE) /* then search must always be OR since scope is over whole files */ for (j=0; APattern[j] != '\0'; j++) if (APattern[j] == '\\') j++; else if (APattern[j] == ';') APattern[j] = ','; return(*pnum_terminals); } extern int is_complex_boolean(); /* use the one in agrep/asplit.c */ extern int get_token_bool(); /* use the one in agrep/asplit.c */ /* Spaces ARE significant: 'a1=v1' and 'a1=v1 ' and 'a1 =v1' etc. 
are NOT identical */ int get_attribute_value(pattr, pval, tokenbuf, tokenlen, num_attr) int *pattr, tokenlen; CHAR **pval, *tokenbuf; { CHAR tempbuf[MAXNAME]; int i = 0, j = 0, k = 0, l = 0; while (i < tokenlen) { if (tokenbuf[i] == '\\') { tempbuf[j++] = tokenbuf[i++]; tempbuf[j++] = tokenbuf[i++]; } else if (StructuredIndex) { if (tokenbuf[i] == '=') { i++; /* skip over = : now @ 1st char of value */ tempbuf[j] = '\0'; for (k=0; k= num_attr) ) { /* named a non-existent attribute */ fprintf(stderr, "%s: unknown attribute name '%s'\n", GProgname, tempbuf); return 0; } *pval = (CHAR *)malloc(tokenlen - i + 2); memcpy(*pval, &tokenbuf[i], tokenlen - i); (*pval)[tokenlen - i] = '\0'; foundattr = 1; return 1; } else tempbuf[j++] = tokenbuf[i++]; /* consider = as just another char */ } else tempbuf[j++] = tokenbuf[i++]; /* no attribute parsing */ } /* Not a structured expression */ tempbuf[j] = '\0'; *pval = (CHAR *)malloc(j + 2); memcpy(*pval, tempbuf, j); (*pval)[j] = '\0'; return 1; } extern destroy_tree(); /* use the one in agrep/asplit.c */ /* * Recursive descent; C-style => AND + OR have equal priority => must bracketize expressions appropriately or will go left->right. * Also strips out attribute names since agrep doesn't understand them: copies resulting pattern for agrep-ing into apattern. * Grammar: * E = {E} | ~a | ~{E} | E ; E | E , E | a * Parser: * One look ahead at each literal will tell you what to do. * ~ has highest priority, ; and , have equal priority (left to right associativity), ~~ is not allowed. 
*/ ParseTree * parse_tree(buffer, len, bufptr, apattern, apatptr, terminals, pnum_terminals, num_attr) CHAR *buffer; int len; int *bufptr; CHAR *apattern; int *apatptr; ParseTree terminals[]; int *pnum_terminals; int num_attr; { int token, tokenlen; CHAR tokenbuf[MAXNAME]; int oldtokenlen; CHAR oldtokenbuf[MAXNAME]; ParseTree *t, *n, *leftn; token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen); switch(token) { case '{': /* (exp) */ apattern[(*apatptr)++] = '{'; if ((t = parse_tree(buffer, len, bufptr, apattern, apatptr, terminals, pnum_terminals, num_attr)) == NULL) return NULL; if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) != '}') { fprintf(stderr, "%s: parse error at offset %d\n", GProgname, *bufptr); destroy_tree(t); return (NULL); } apattern[(*apatptr)++] = '}'; if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) == 'e') return t; switch(token) { /* must find boolean infix operator */ case ',': case ';': apattern[(*apatptr)++] = token; leftn = t; if ((t = parse_tree(buffer, len, bufptr, apattern, apatptr, terminals, pnum_terminals, num_attr)) == NULL) return NULL; n = (ParseTree *)malloc(sizeof(ParseTree)); n->op = (token == ';') ? 
ANDPAT : ORPAT ; n->type = INTERNAL; n->data.internal.left = leftn; n->data.internal.right = t; return n; /* or end of parent sub expression */ case '}': unget_token_bool(bufptr, tokenlen); /* part of someone else who called me */ return t; default: destroy_tree(t); fprintf(stderr, "%s: parse error at offset %d\n", GProgname, *bufptr); return NULL; } /* Go one level deeper */ case '~': /* not exp */ foundnot = 1; apattern[(*apatptr)++] = '~'; if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) == 'e') return NULL; switch(token) { case 'a': if (*pnum_terminals >= MAXNUM_PAT) { fprintf(stderr, "%s: pattern expression too long (> %d terms)\n", GProgname, MAXNUM_PAT); return NULL; } n = &terminals[*pnum_terminals]; n->op = 0; n->type = LEAF; n->terminalindex = (*pnum_terminals); n->data.leaf.value = NULL; n->data.leaf.attribute = 0; if (!get_attribute_value((int *)&n->data.leaf.attribute, &n->data.leaf.value, tokenbuf, tokenlen, num_attr)) return NULL; strcpy(&apattern[*apatptr], n->data.leaf.value); *apatptr += strlen(n->data.leaf.value); (*pnum_terminals)++; n->op |= NOTPAT; t = n; break; case '{': apattern[(*apatptr)++] = token; if ((t = parse_tree(buffer, len, bufptr, apattern, apatptr, terminals, pnum_terminals, num_attr)) == NULL) return NULL; if (t->op & NOTPAT) t->op &= ~NOTPAT; else t->op |= NOTPAT; if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) != '}') { fprintf(stderr, "%s: parse error at offset %d\n", GProgname, *bufptr); destroy_tree(t); return NULL; } apattern[(*apatptr)++] = '}'; break; default: fprintf(stderr, "%s: parse error at offset %d\n", GProgname, *bufptr); return NULL; } /* The resulting tree is in t. 
Now do another lookahead at this level */ if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) == 'e') return t; switch(token) { /* must find boolean infix operator */ case ',': case ';': apattern[(*apatptr)++] = token; leftn = t; if ((t = parse_tree(buffer, len, bufptr, apattern, apatptr, terminals, pnum_terminals, num_attr)) == NULL) return NULL; n = (ParseTree *)malloc(sizeof(ParseTree)); n->op = (token == ';') ? ANDPAT : ORPAT ; n->type = INTERNAL; n->data.internal.left = leftn; n->data.internal.right = t; return n; case '}': unget_token_bool(bufptr, tokenlen); return t; default: destroy_tree(t); fprintf(stderr, "%s: parse error at offset %d\n", GProgname, *bufptr); return NULL; } case 'a': /* individual term (attr=val) */ if (tokenlen == 0) return NULL; memcpy(oldtokenbuf, tokenbuf, tokenlen); oldtokenlen = tokenlen; oldtokenbuf[oldtokenlen] = '\0'; token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen); switch(token) { case '}': /* part of case '{' above: else syntax error not detected but semantics ok */ unget_token_bool(bufptr, tokenlen); case 'e': /* endof input */ case ',': case ';': if (*pnum_terminals >= MAXNUM_PAT) { fprintf(stderr, "%s: pattern expression too long (> %d terms)\n", GProgname, MAXNUM_PAT); return NULL; } n = &terminals[*pnum_terminals]; n->op = 0; n->type = LEAF; n->terminalindex = (*pnum_terminals); n->data.leaf.value = NULL; n->data.leaf.attribute = 0; if (!get_attribute_value((int *)&n->data.leaf.attribute, &n->data.leaf.value, oldtokenbuf, oldtokenlen, num_attr)) return NULL; strcpy(&apattern[*apatptr], n->data.leaf.value); *apatptr += strlen(n->data.leaf.value); (*pnum_terminals)++; if ((token == 'e') || (token == '}')) return n; /* nothing after terminal in expression */ leftn = n; apattern[(*apatptr)++] = token; if ((t = parse_tree(buffer, len, bufptr, apattern, apatptr, terminals, pnum_terminals, num_attr)) == NULL) return NULL; n = (ParseTree *)malloc(sizeof(ParseTree)); n->op = (token == ';') ? 
ANDPAT : ORPAT ; n->type = INTERNAL; n->data.internal.left = leftn; n->data.internal.right = t; return n; default: fprintf(stderr, "%s: parse error at offset %d\n", GProgname, *bufptr); return NULL; } case 'e': /* can't happen as I always do a lookahead above and return current tree if e */ default: fprintf(stderr, "%s: parse error at offset %d\n", GProgname, *bufptr); return NULL; } } int split_pattern(GPattern, GM, APattern, terminals, pnum_terminals, pGParse, num_attr) CHAR *GPattern; int GM; CHAR *APattern; ParseTree terminals[]; int *pnum_terminals; ParseTree **pGParse; int num_attr; { int bufptr = 0, apatptr = 0, ret, i, j; foundattr = 0; if (is_complex_boolean(GPattern, GM)) { ComplexBoolean = 1; *pnum_terminals = 0; if ((*pGParse = parse_tree(GPattern, GM, &bufptr, APattern, &apatptr, terminals, pnum_terminals, num_attr)) == NULL) return -1; /* print_tree(*pGParse, 0); */ APattern[apatptr] = '\0'; if (foundattr || WHOLEFILESCOPE) { /* Search in agrep must always be OR since scope is whole file */ int i, j; for (i=0; i= buffer_end) break; continue; } if ((type = checksg(word, D, 0)) == -1) return -1; if (!type && ComplexBoolean) { fprintf(stderr, "%s: query has complex patterns (like '.*') or options (like -n)\n... cannot search for arbitrary booleans\n", GProgname); return -1; } #if 0 DISABLED IN GLIMPSE NOW SINCE MGREP HANDLES DUPLICATES -- IT WAS ALWAYS ABLE TO HANDLE SUPERSTRINGS/SUBSTRINGS...: bgopal, Nov 19, 1996 if (type) { /* Check if superstring: if so, ditch word */ for (i=0; i= buffer_end) break; continue; } /* Check if substring: delete all superstrings */ for (i=0; i= buffer_end) break; if(num_pat >= MAXNUM_PAT) { fprintf(stderr, "%s: Warning! too many words in pattern (> %d): ignoring...\n", GProgname, MAXNUM_PAT); break; } } } for (i=0; i& .glimpse_out .br at -m 0300 glimpse.script .br (It might be interesting to collect all the outputs of glimpse by changing >& to >>& so that the file .glimpse_out maintains a history. 
In this case the file must be created before the first time >>& is used. If you use ksh, replace '>&' with '2>&1'.)
.LP
Glimpseindex stores the names of all the files that it indexed in the file .glimpse_filenames. Each file is listed by its full path name as obtained at the time the files were indexed. For example, /usr1/udi/file1. Glimpse uses this full name when it performs the search, so the name must match the current name. This may become a problem when the indexing and the search are done from different machines (e.g., through NFS), which may cause the path names to be different. For example, /tmp_mnt/R/xxx/xxx/usr1/udi/file1. (The same is true for several other .glimpse files. See below.)
.LP
Glimpseindex does not follow symbolic links unless they are explicitly included in the .glimpse_include file (described below).
.LP
Glimpseindex makes an effort to identify non-text files such as binary files, compressed files, uuencoded files, postscript files, binhex files, etc. These files are automatically not indexed. In addition, all files whose names end with `.o', `.gz', `.Z', `.z', `.hqx', `.zip', or `.tar' will not be indexed (unless they are specifically included in .glimpse_include - see below).
.LP
The options for glimpseindex are as follows:
.TP
.B \-a
adds the given file[s] and/or directories to an existing index. Any given directory will be traversed recursively and all files will be indexed (unless they appear in .glimpse_exclude; see below). Using this option is generally much faster than indexing everything from scratch, although in rare cases the index may not be as good. If for some reason the index is full (which can happen unless -o or -b are used) glimpseindex -a will produce an error message and will exit without changing the original index.
.TP
.B \-b
builds a medium-size index (20-30% of the size of all files), allowing faster search.
This option forces glimpseindex to store an exact (byte level) pointer to each occurrence of each word (except for some very common words belonging to the stop list).
.TP
.B \-B
uses a hash table that is 4 times bigger (256K entries instead of 64K) to speed up indexing. Memory usage will typically increase by about 2 MB. This option is only for indexing speed; it does not affect the final index.
.TP
.B \-d filename(s)
deletes the given file(s) from the index.
.TP
.B \-D filename(s)
deletes the given file(s) from the list of file names, but not from the index. This is much faster than -d, and the file(s) will not be found by glimpse. However, the index itself will not become smaller.
.TP
.B \-E
does not run a check on file types. Glimpseindex normally attempts to exclude non-text files, but this attempt is not always perfect. With \-E, glimpseindex indexes all files, except those that are specifically excluded in .glimpse_exclude and those whose file names end with one of the excluded suffixes.
.TP
.B \-f
incremental indexing. \fIglimpseindex\fP scans all files and adds to the index only those files that were created or modified after the current index was built. If there is no current index or if this procedure fails, \fIglimpseindex\fP automatically reverts to the default mode (which is to index everything from scratch). This option may create an inefficient index for several reasons, one of which is that deleted files are not really deleted from the index. Unless changes are small, mostly additions, and -o is used, we suggest using the default mode as much as possible.
.TP
.B \-F
Glimpseindex receives the list of files to index from standard input.
.TP
.B \-H directory
Put or update the index and all other .glimpse files (listed below) in "directory". The default is the home directory.
When glimpse is run, the -H option must be used to direct glimpse to this directory, because glimpse assumes that the index is in the home directory (see also the -H option in glimpse). .TP .B \-i Make .glimpse_include (SEE GLIMPSEINDEX FILES) take precedence over .glimpse_exclude, so that, for example, one can exclude everything (by putting *) and then explicitly include files. .TP .B \-I Instead of indexing, only show (print to standard out) the list of files that would be indexed. It is useful for filtering purposes. ("glimpseindex -I dir | glimpseindex -F" is the same as "glimpseindex dir".) .TP .B \-M x Tells glimpseindex to use x MB of memory for temporary tables. The more memory you allow the faster glimpseindex will run. The default is x=2. The value of x must be a positive integer. Glimpseindex will need more memory than x for other things, and glimpseindex may perform some 'forks', so you'll have to experiment if you want to use this option. WARNING: If x is too large you may run out of swap space. .TP .B \-n Index numbers as well as text. The default is not to index numbers. This is useful when searching for dates or other identifying numbers, but it may make the index very large if there are lots of numbers. In general, glimpseindex strips away any non-alphabetic character. For example, the string abc123 will be indexed as abc if the -n option is not used and as abc123 if it is used. Glimpse provides warnings (in .glimpse_messages) for all files in which more than half the words that were added to the index from that file had digits in them (this is an attempt to identify data files that should probably not be indexed). One can use the .glimpse_exclude file to exclude data files or any other files. (See GLIMPSEINDEX FILES.) .TP .B \-o Build a small index rather than tiny (meaning 7-9% of the sizes of all files - your mileage may vary) allowing faster search. 
This option forces glimpseindex to allocate one block per file (a block usually contains many files). A detailed explanation of how blocks affect glimpse can be found in the glimpse article. (See also LIMITATIONS.) .TP .B \-R Recompute .glimpse_filenames_index from .glimpse_filenames. The file .glimpse_filenames_index speeds up processing. Glimpseindex usually computes it automatically. However, if for some reason one wants to change the path names of the files listed in .glimpse_filenames, then running glimpseindex -R recomputes .glimpse_filenames_index. This is useful if the index is computed on one machine, but is used on another (with the same hierarchy). The names of the files listed in .glimpse_filenames are used at run time, so changing them can be done at any time in any way (as long as only the names, not the content, are changed). This is not really an option in the regular sense; rather, it is a program by itself, and it is meant as a post-processing step. (Available only from version 3.6.) .TP .B \-s supports structured queries. This option was added to support the Harvest project and it is applicable mostly in that context. See STRUCTURED QUERIES below for more information and also http://harvest.sourceforge.net/ for more information about the Harvest project. .TP .B \-S k The number k determines the size of the \fIstop-list\fP. The stop-list consists of words that are too common and are not indexed (e.g., 'the' or 'and'). Instead of having a fixed stop-list, glimpseindex figures out the words that are too common for every index separately. The rules are different for the different indexing options. The tiny index contains all words (the savings from a stop-list are too small to bother). For the small index (-o), the number k is a percentage threshold. A word will be in the stop list if it appears in at least k% of all files. The default value is 80%. (If there are fewer than 256 files, then the stop-list is not maintained.)
The medium index (-b) counts all occurrences of all words, and a word is added to the stop-list if it appears at least k times per MByte. The default value is 500. A query that includes a stop list word is of course less efficient. (See also LIMITATIONS below.) .TP .B \-t (A new option in version 3.5.) The order in which files are indexed is determined by scanning the directories, which is mostly arbitrary. With the \-t option, combined with either \-o or \-b, the indexed files are stored in reversed order of modification age (younger files first). Results of queries are then automatically returned in this order. Furthermore, glimpse can filter results by age; for example, asking to look at only files that are at most 5 days old. .TP .B \-T builds the turbo file. Starting at version 3.0, this is the default, so using this option has no effect. .TP .B \-w k Glimpseindex does a reasonable, but not a perfect, job of determining which files should not be indexed. Sometimes a large text file should not be indexed; for example, a dictionary may match most queries. The -w option stores in a file called .glimpse_messages (in the same directory as the index) the list of all files that contribute at least \fIk\fP new words to the index. The user can look at this list of files and decide which should or should not be indexed. The file .glimpse_exclude contains files that will not be indexed (see more below). We recommend setting \fIk\fP to about 1000. This is not an exact measure. For example, if the same file appears twice, then the second copy will not contribute any new words to the dictionary (but if you exclude the first copy and index again, the second copy will contribute). .TP .B \-X (starting at version 4.0B1) Extract titles from HTML pages and add the titles to the index (in .glimpse_filenames). (This feature was added to improve the performance of WebGlimpse.) Works only on files whose names end with .html, .htm, .shtml, and .shtm.
(see glimpse.h/EXTRACT_INFO_SUFFIX to add to these suffixes.) The routine to extract titles is called extract_info, in index/filetype.c. This feature can be modified in various ways to extract info from many filetypes. The titles are appended to the corresponding filenames with a space separator. Glimpseindex assumes that filenames don't have spaces in them. .TP .B \-z Allow customizable filtering, using the file .glimpse_filters to perform the programs listed there for each match. The best example is compress/decompress. If .glimpse_filters include the line .br *.Z uncompress < .br (separated by tabs) then before indexing any file that matches the pattern "*.Z" (same syntax as the one for .glimpse_exclude) the command listed is executed first (assuming input is from stdin, which is why uncompress needs <) and its output (assuming it goes to stdout) is indexed. The file itself is not changed (i.e., it stays compressed). Then if glimpse -z is used, the same program is used on these files on the fly. Any program can be used (we run 'exec'). For example, one can filter out parts of files that should not be indexed. Glimpseindex tries to apply all filters in .glimpse_filters in the order they are given. For example, if you want to uncompress a file and then extract some part of it, put the compression command (the example above) first and then another line that specifies the extraction. Note that this can slow down the search because the filters need to be run before files are searched. .SH "GLIMPSEINDEX FILES" .LP All files used by glimpse are located at the directory(ies) where the index(es) is (are) stored and have .glimpse_ as a prefix. The first two files (.glimpse_exclude and .glimpse_include) are optionally supplied by the user. The other files are built and read by glimpse. .LP .IP "\fB.glimpse_exclude\fR" contains a list of files that glimpseindex is explicitly told to ignore. 
In general, the syntax of .glimpse_exclude/include is the same as that of agrep (or any other grep). The lines in the .glimpse_exclude file are matched to the file names, and if they match, the files are excluded. Notice that agrep matches to parts of the string! e.g., agrep /ftp/pub will match /home/ftp/pub and /ftp/pub/whatever. So, if you want to exclude /ftp/pub/core, you just list it, as is, in the .glimpse_exclude file. If you put "/home/ftp/pub/cdrom" in .glimpse_exclude, every file name that matches that string will be excluded, meaning all files below it. You can use ^ to indicate the beginning of a file name, and $ to indicate the end of one, and you can use * and ? in the usual way. For example /ftp/*html will exclude /ftp/pub/foo.html, but will also exclude /home/ftp/pub/html/whatever; if you want to exclude files that start with /ftp and end with html use ^/ftp*html$ Notice that putting a * at the beginning or at the end is redundant (in fact, in this case glimpseindex will remove the * when it does the indexing). No other meta characters are allowed in .glimpse_exclude (e.g., don't use .* or # or |). Lines with * or ? must have no more than 30 characters. Notice that, although the index itself will not be indexed, the list of file names (.glimpse_filenames) will be indexed unless it is explicitly listed in .glimpse_exclude. .IP "\fB.glimpse_filters\fR" See the description above for the -z option. .IP "\fB.glimpse_include\fR" contains a list of files that glimpseindex is explicitly told to \fIinclude\fP in the index even though they may look like non-text files. Symbolic links are followed by glimpseindex only if they are specifically included here. The syntax is the same as the one for .glimpse_exclude (see there). If a file is in both .glimpse_exclude and .glimpse_include it will be excluded unless -i is used. .IP "\fB.glimpse_filenames\fP" contains the list of all indexed file names, one per line. 
This is an ASCII file that can also be used with agrep to search for a file name, which gives a fast find command. For example, .br glimpse 'count#\\.c$' ~/.glimpse_filenames .br will output the names of all (indexed) .c files that have 'count' in their name (including anywhere on the path from the index). Setting the following alias in the .login file may be useful: .br alias findfile 'glimpse -h \!:1 ~/.glimpse_filenames' .IP "\fB.glimpse_index\fP" contains the index. The index consists of lines, each starting with a word followed by a list of block numbers (unless the -o or -b options are used, in which case each word is followed by an offset into the file .glimpse_partitions where all pointers are kept). The block/file numbers are stored in binary form, so this is not an ASCII file. .IP "\fB.glimpse_messages\fP" contains the output of the -w option (see above). .IP "\fB.glimpse_partitions\fP" contains the partition of the indexed space into blocks and, when the index is built with the -o or -b options, some part of the index. This file is used internally by glimpse and it is a non-ASCII file. .IP "\fB.glimpse_statistics\fP" contains some statistics about the makeup of the index. Useful for some advanced applications and customization of glimpse. .SH "STRUCTURED QUERIES" Glimpse can search for Boolean combinations of "attribute=value" terms by using the Harvest SOIF parser library (in glimpse/libtemplate). To search this way, the index must be made by using the -s option of glimpseindex (this can be used in conjunction with other glimpseindex options). For glimpse and glimpseindex to recognize "structured" files, they must be in SOIF format. In this format, each value is prefixed by an attribute-name with the size of the value (in bytes) present in "{}" after the name of the attribute.
For example, the following lines are part of an SOIF file: .br .nf type{17}: Directory-Listing md5{32}: 3858c73d68616df0ed58a44d306b12ba .fi Any string can serve as an attribute name. Glimpse "pattern;type=Directory-Listing" will search for "pattern" only in files whose type is "Directory-Listing". The file itself is considered to be one "object" and its name/url appears as the first attribute with an "@" prefix; e.g., @FILE { http://xxx... } The scope of Boolean operations changes from records (lines) to whole files when structured queries are used in glimpse (since individual query terms can look at different attributes and they may not be "covered" by the record/line). Note that glimpse can only search for patterns in the value parts of the SOIF file: there are some attributes (like the TTL, MD5, etc.) that are interpreted by Harvest's internal routines. See RFC 2655 for more detailed information on the SOIF format. .SH "REFERENCES" .IP 1. U. Manber and S. Wu, "GLIMPSE: A Tool to Search Through Entire File Systems," \fIUsenix Winter 1994 Technical Conference\fP (best paper award), San Francisco (January 1994), pp. 23\-32. Also, Technical Report #TR 93-34, Dept. of Computer Science, University of Arizona, October 1993 (a postscript file is available by anonymous ftp at ftp://webglimpse.net/pub/glimpse/TR93-34.ps). .IP 2. S. Wu and U. Manber, "Fast Text Searching Allowing Errors," \fICommunications of the ACM\fP \fB35\fP (October 1992), pp. 83\-91. .SH "SEE ALSO" .BR agrep (1), .BR ed (1), .BR ex (1), .BR glimpse (1), .BR glimpseserver (1), .BR grep (1V), .BR sh (1), .BR csh (1). .SH LIMITATIONS .LP The index of glimpse is word based. A pattern that contains more than one word cannot be found in the index. The way glimpse overcomes this weakness is by splitting any multi-word pattern into its set of words and looking for all of them in the index.
For example, \fBglimpse 'linear programming'\fR will first consult the index to find all files containing both \fIlinear\fP and \fIprogramming\fP, and then apply agrep to find the combined pattern. This is usually an effective solution, but it can be slow when both words are very common while their combination is not. .LP The index of glimpse stores all patterns in lower case. When glimpse searches the index it first converts all patterns to lower case, finds the appropriate files, and then searches the actual files using the original patterns. So, for example, \fIglimpse ABCXYZ\fR will first find all files containing abcxyz in any combination of lower and upper cases, and then search these files directly, so only the right cases will be found. One problem with this approach is discovering misspellings that are caused by wrong cases. For example, \fIglimpse -B abcXYZ\fR will first search the index for the best match to abcxyz (because the pattern is converted to lower case); it will find that there are matches with no errors, and will go to those files to search them directly, this time with the original upper cases. If the closest match is, say, AbcXYZ, glimpse may miss it, because it doesn't expect an error. Another problem is speed. If you search for "ATT", it will look at the index for "att". Unless you use -w to match the whole word, glimpse may have to search all files containing, for example, "Seattle" which has "att" in it. .LP There is no size limit for simple patterns and simple patterns with Boolean AND or OR. More complicated patterns are currently limited to approximately 30 characters. Lines are limited to 1024 characters. Records are limited to 48K, and may be truncated if they are larger than that. The limit of record length can be changed by modifying the parameter Max_record in agrep.h. .LP Each line in .glimpse_exclude or .glimpse_include that contains a * or a ? must not exceed 30 characters in length.
.LP Glimpseindex does not index words of size > 64. .LP A medium-size index (-b) may lead to actually slower query times if the files are all very small. .LP Under -b, it may be impossible to make the stop list empty. Glimpseindex is using the "sort" routine, and all occurrences of a word appear at some point on one line. Sort is limiting the size of lines it can handle (the value depends on the platform; ours is 16KB). If the lines are too big, the word is added to the stop list. .SH BUGS .LP Please submit bug reports or comments at http://webglimpse.net/bugzilla/ .SH DIAGNOSTICS (Only in version 3.6 and above.) .br exit status 0: terminated normally; .br exit status 1: glimpseindex errors (e.g., bad option combos, no files were indexed, etc.) .br exit status 2: system errors (e.g., write failed, sort failed, malloc failed). .SH AUTHORS Udi Manber and Burra Gopal, Department of Computer Science, University of Arizona, and Sun Wu, the National Chung-Cheng University, Taiwan. Now maintained by Golda Velez at Internet WorkShop (Email: gvelez@webglimpse.net) glimpse-4.18.7/glimpseserver.1000066400000000000000000000061071300371307100162230ustar00rootroot00000000000000.TH GLIMPSESERVER l "October 13, 1997" .SH NAME \fIglimpseserver 4.1\fP - a server version of the glimpse searching package. .SH OVERVIEW \fIGlimpse\fP is an indexing and query system that allows you to search through all your files very quickly. The use of glimpse in servers that handle frequent queries is growing, which is why we wrote glimpseserver to make searches more efficient. Glimpseserver starts a process that listens to queries, runs glimpse, and sends the answers back. The main advantage is that the index is read only once into memory saving a lot of IO. Glimpse communicates with glimpseserver through a given port number. See the warning about security below. .LP .SH SYNOPSIS .B glimpseserver [ \fB\-H \fIdir\fP \-K \fIport\fP \-J \fIhost\fP. 
] .SH "DESCRIPTION" .LP .TP .B \-H \fIdir\fP specifies the directory of the index. Similar to the \-H option of glimpse. The default directory is the value of the environment variable $HOME if that is set, otherwise it is the current directory. .TP .B \-K \fIport\fP this is the TCP port for communication: glimpseserver waits for requests on this port and clients that want to search using the index in the directory specified by the \-H option must use this port (by calling glimpse -K). The default port number is 2001. .TP .B \-J \fIhost\fP the name of the host. The default is the host where glimpseserver is running, which is probably the only possibility anyway. .SH "RESTARTING" .LP If a new index is created by running glimpseindex every night, restarting a new glimpseserver is now easier: simply send a SIGUSR2 to glimpseserver (signal #31 on some systems, i.e., "kill -31 pid"; the number varies by platform, so "kill -USR2 pid" is the portable form); it then re-reads the NEW index and is ready to serve requests again. (A SIGHUP, i.e., signal #1, can also be sent instead of SIGUSR2 to make the glimpseserver re-read the new index.) The recommended way to do a fresh indexing while the server is still running is: .br send SIGSTOP to glimpseserver .br do the indexing .br send SIGUSR2 to glimpseserver .br send SIGCONT to glimpseserver (to ask it to continue after stop) .br The SIGSTOP is required so that glimpseserver doesn't answer any queries while the indexing is going on. .SH "WARNING" .LP Glimpseserver should be used only for public servers. Any client that knows the port number can get any information available in the index (and port numbers are not that secret). When glimpse is run as a standalone application it requires read permission of the index and all the files. When glimpse uses the \-C option to communicate with glimpseserver, glimpse (the client) does not require any permission, because glimpseserver does all the searching. So, we recommend not to run glimpseserver on any data that should be protected. Glimpseserver is meant to be used for public data.
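.SH "EXAMPLES"
.LP
The re-indexing sequence described under RESTARTING can be sketched as the following shell commands.
This is a minimal sketch rather than a tested procedure: the process id and both paths are placeholders (assumptions) to be replaced with local values, and symbolic signal names are used because signal numbers differ between systems.
.br
.nf
pid=12345   # placeholder: process id of the running glimpseserver
kill -STOP "$pid"   # stop answering queries while indexing
glimpseindex -H /path/to/indexdir /path/to/files   # placeholder paths
kill -USR2 "$pid"   # ask the server to re-read the new index
kill -CONT "$pid"   # resume; the pending SIGUSR2 is handled after this
.fi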
.SH "SEE ALSO" .BR glimpse (1), .BR glimpseindex (1), .SH BUGS .LP Please submit bug reports or comments at http://webglimpse.net/bugzilla/ .SH AUTHORS Udi Manber and Burra Gopal, Department of Computer Science, University of Arizona, and Sun Wu, the National Chung-Cheng University, Taiwan. Now maintained by Golda Velez at Internet WorkShop (Email: gvelez@webglimpse.net) glimpse-4.18.7/index/000077500000000000000000000000001300371307100143555ustar00rootroot00000000000000glimpse-4.18.7/index/Makefile.NeXT000066400000000000000000000145031300371307100166350ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall notswgconvert all: Sall wgconvert # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 0 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = gcc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ../bin/. and the agrep library is assumed to # be in ../lib . You normally don't have to change them. # NOTE: INDEXDIR can be a relative or absolute path name. 
BINDIR = ../bin AGREPDIR = ../agrep INDEXDIR = ../index TEMPLATEDIR = ../libtemplate LIBAGREPDIR = ../lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = PROG = glimpseindex NOTSPROG = nots$(PROG) CASTPROG = buildcast NOTSCASTPROG = nots$(CASTPROG) # Include flags is not a part of CLFAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) \ -DDONTUSESORT_T_OPTION=1 -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS) TEST = test OBJS = region.o \ dir.o \ io.o\ build_in.o \ filetype.o \ simpletest.o\ getword.o \ memlook.o \ lib.o \ partition.o Sall: $(CASTPROG) $(PROG) NOTSall: $(NOTSCASTPROG) $(NOTSPROG) #$(TEST): test.o $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a # $(CC) $(LINKFLAGS) -L$(LIBTEMPLATEDIR) -o $@ test.c -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) $(CASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(CASTPROG) $(BINDIR)/. $(NOTSCASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(CASTPROG) $(BINDIR)/. 
$(PROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. $(NOTSPROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. wgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp wgconvert $(BINDIR)/. notswgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) $(OTHERLIBS) cp wgconvert $(BINDIR)/. 
$(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBAGREPDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR) ; $(MAKE) -f Makefile.NeXT CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" region.o: region.c region.h $(CC) -c $(CFLAGS) region.c getword.o: getword.c $(CC) -c $(CFLAGS) getword.c dir.o: dir.c $(CC) -c $(CFLAGS) dir.c lib.o: lib.c $(CC) -c $(CFLAGS) lib.c partition.o: partition.c $(CC) -c $(CFLAGS) partition.c glimpse.o: glimpse.c glimpse.h region.h $(CC) -c $(CFLAGS) -DBUILDCAST=0 glimpse.c buildcast.o: glimpse.c glimpse.h region.h cp glimpse.c buildcast.c $(CC) -c $(CFLAGS) -DBUILDCAST=1 -o buildcast.o buildcast.c io.o: io.c $(CC) -c $(CFLAGS) io.c build_in.o: build_in.c $(CC) -c $(CFLAGS) build_in.c filetype.o: filetype.c $(CC) -c $(CFLAGS) filetype.c simpletest.o: simpletest.c $(CC) -c $(CFLAGS) simpletest.c memlook.o: memlook.c $(CC) -c $(CFLAGS) memlook.c clean: -rm -f 
$(OBJS) glimpse.o convert.o buildcast.o buildcast.c core a.out $(PROG) $(CASTPROG) wgconvert $(OBJS): glimpse.h region.h glimpse-4.18.7/index/Makefile.alpha000066400000000000000000000145561300371307100171140ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall notswgconvert all: Sall wgconvert # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = cc #gcc -traditional #cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ../bin/. and the agrep library is assumed to # be in ../lib . You normally don't have to change them. # NOTE: INDEXDIR can be a relative or absolute path name. 
BINDIR = ../bin AGREPDIR = ../agrep INDEXDIR = ../index TEMPLATEDIR = ../libtemplate LIBAGREPDIR = ../lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBTEMPLATE = template LIBUTIL = util OTHERLIBS = PROG = glimpseindex NOTSPROG = nots$(PROG) CASTPROG = buildcast NOTSCASTPROG = nots$(CASTPROG) # Include flags is not a part of CLFAGS and LINKFLAGS since path names from subdirs can be different OPTIMIZEFLAGS = -O -Olimit 3000 #PROFILEFLAGS = -p #DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \ -DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) \ -DDONTUSESORT_T_OPTION=1 -DSFS_COMPAT=$(SFS_COMPAT) SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS) CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS) SUBDIRLINKFLAGS = $(PROFILEFLAGS) LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS) TEST = test OBJS = region.o \ dir.o \ io.o\ build_in.o \ filetype.o \ simpletest.o\ getword.o \ memlook.o \ lib.o \ partition.o Sall: $(CASTPROG) $(PROG) NOTSall: $(NOTSCASTPROG) $(NOTSPROG) #$(TEST): test.o $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a # $(CC) $(LINKFLAGS) -L$(LIBTEMPLATEDIR) -o $@ test.c -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) $(CASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(CASTPROG) $(BINDIR)/. $(NOTSCASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(CASTPROG) $(BINDIR)/. 
$(PROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. $(NOTSPROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. wgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp wgconvert $(BINDIR)/. notswgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) $(OTHERLIBS) cp wgconvert $(BINDIR)/. 
$(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBAGREPDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR) ; $(MAKE) -f Makefile.alpha CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" region.o: region.c region.h $(CC) -c $(CFLAGS) region.c getword.o: getword.c $(CC) -c $(CFLAGS) getword.c dir.o: dir.c $(CC) -c $(CFLAGS) dir.c lib.o: lib.c $(CC) -c $(CFLAGS) lib.c partition.o: partition.c $(CC) -c $(CFLAGS) partition.c glimpse.o: glimpse.c glimpse.h region.h $(CC) -c $(CFLAGS) -DBUILDCAST=0 glimpse.c buildcast.o: glimpse.c glimpse.h region.h cp glimpse.c buildcast.c $(CC) -c $(CFLAGS) -DBUILDCAST=1 -o buildcast.o buildcast.c io.o: io.c $(CC) -c $(CFLAGS) io.c build_in.o: build_in.c $(CC) -c $(CFLAGS) build_in.c filetype.o: filetype.c $(CC) -c $(CFLAGS) filetype.c simpletest.o: simpletest.c $(CC) -c $(CFLAGS) simpletest.c memlook.o: memlook.c $(CC) -c $(CFLAGS) memlook.c clean: -rm -f 
$(OBJS) glimpse.o convert.o buildcast.o buildcast.c core a.out $(PROG) $(CASTPROG) wgconvert $(OBJS): glimpse.h region.h glimpse-4.18.7/index/Makefile.hp000066400000000000000000000145021300371307100164250ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall notswgconvert all: Sall wgconvert # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 0 HAVE_SYS_DIR_H = 1 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ../bin/. and the agrep library is assumed to # be in ../lib . You normally don't have to change them. # NOTE: INDEXDIR can be a relative or absolute path name. 
BINDIR = ../bin
AGREPDIR = ../agrep
INDEXDIR = ../index
TEMPLATEDIR = ../libtemplate
LIBAGREPDIR = ../lib
LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib
LIBAGREP = agrep
LIBTEMPLATE = template
LIBUTIL = util
OTHERLIBS =
PROG = glimpseindex
NOTSPROG = nots$(PROG)
CASTPROG = buildcast
NOTSCASTPROG = nots$(CASTPROG)

# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \
	-DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) \
	-DDONTUSESORT_T_OPTION=1 -DSFS_COMPAT=$(SFS_COMPAT)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS)

TEST = test

OBJS = region.o \
	dir.o \
	io.o\
	build_in.o \
	filetype.o \
	simpletest.o\
	getword.o \
	memlook.o \
	lib.o \
	partition.o

Sall: $(CASTPROG) $(PROG)

NOTSall: $(NOTSCASTPROG) $(NOTSPROG)

#$(TEST): test.o $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a
#	$(CC) $(LINKFLAGS) -L$(LIBTEMPLATEDIR) -o $@ test.c -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS)

$(CASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a
	$(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS)
	cp $(CASTPROG) $(BINDIR)/.

$(NOTSCASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a
	$(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS)
	cp $(CASTPROG) $(BINDIR)/.
$(PROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. $(NOTSPROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. wgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp wgconvert $(BINDIR)/. notswgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) $(OTHERLIBS) cp wgconvert $(BINDIR)/. 
$(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBAGREPDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR) ; $(MAKE) -f Makefile.hp CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" region.o: region.c region.h $(CC) -c $(CFLAGS) region.c getword.o: getword.c $(CC) -c $(CFLAGS) getword.c dir.o: dir.c $(CC) -c $(CFLAGS) dir.c lib.o: lib.c $(CC) -c $(CFLAGS) lib.c partition.o: partition.c $(CC) -c $(CFLAGS) partition.c glimpse.o: glimpse.c glimpse.h region.h $(CC) -c $(CFLAGS) -DBUILDCAST=0 glimpse.c buildcast.o: glimpse.c glimpse.h region.h cp glimpse.c buildcast.c $(CC) -c $(CFLAGS) -DBUILDCAST=1 -o buildcast.o buildcast.c io.o: io.c $(CC) -c $(CFLAGS) io.c build_in.o: build_in.c $(CC) -c $(CFLAGS) build_in.c filetype.o: filetype.c $(CC) -c $(CFLAGS) filetype.c simpletest.o: simpletest.c $(CC) -c $(CFLAGS) simpletest.c memlook.o: memlook.c $(CC) -c $(CFLAGS) memlook.c clean: -rm -f $(OBJS) 
glimpse.o convert.o buildcast.o buildcast.c core a.out $(PROG) $(CASTPROG) wgconvert $(OBJS): glimpse.h region.h glimpse-4.18.7/index/Makefile.in000066400000000000000000000104761300371307100164320ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. srcdir = @srcdir@ VPATH = @srcdir@ SHELL = /bin/sh CC = @CC@ LIBS = @LIBS@ AR = @AR@ RANLIB = @RANLIB@ CP = @CP@ STRIP = @STRIP@ INSTALL = @INSTALL@ INSTALL_PROGRAM = @INSTALL_PROGRAM@ INSTALL_DATA = @INSTALL_DATA@ DEFS = prefix = @prefix@ exec_prefix = @exec_prefix@ binprefix = manprefix = bindir = $(exec_prefix)/bin libdir = $(exec_prefix)/lib mandir = $(prefix)/man/man1 manext = 1 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ../bin/. and the agrep library is assumed to # be in ../lib . You normally don't have to change them. # NOTE: INDEXDIR can be a relative or absolute path name. BINDIR = ../bin AGREPDIR = ../agrep INDEXDIR = ../index TEMPLATEDIR = ../libtemplate LIBAGREPDIR = ../lib LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib LIBAGREP = agrep LIBTEMPLATE = template LIBUTIL = util PROG = glimpseindex NOTSPROG = nots$(PROG) CASTPROG = buildcast NOTSCASTPROG = nots$(CASTPROG) OPTIMIZEFLAGS = -O2 INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include DEFINEFLAGS = $(DEFS) CFLAGS = $(INCLUDEFLAGS) TEST = test OBJS = region.o \ dir.o \ io.o\ build_in.o \ filetype.o \ simpletest.o\ getword.o \ memlook.o \ lib.o \ partition.o all: @TARGET@ Sall: $(CASTPROG) $(PROG) wgconvert NOTSall: $(NOTSCASTPROG) $(NOTSPROG) notswgconvert install: all install-man for i in $(BINDIR)/$(CASTPROG) $(BINDIR)/$(PROG) $(BINDIR)/wgconvert ; do \ $(INSTALL) $$i $(bindir) ; \ done install-man: clean: rm -f $(OBJS) glimpse.o convert.o buildcast.o buildcast.c core a.out $(PROG) $(CASTPROG) wgconvert distclean: clean rm -f Makefile #$(TEST): test.o $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a # $(CC) $(LDFLAGS) -L$(LIBTEMPLATEDIR) 
-o $@ test.c -l$(LIBTEMPLATE) -l$(LIBUTIL) $(CASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LDFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(BINDIR)/$(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(LIBS) $(NOTSCASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LDFLAGS) -L$(LIBAGREPDIR) -o $(BINDIR)/$(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) $(LIBS) $(PROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LDFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(BINDIR)/$(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(LIBS) $(NOTSPROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LDFLAGS) -L$(LIBAGREPDIR) -o $(BINDIR)/$(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) $(LIBS) wgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LDFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(BINDIR)/wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(LIBS) notswgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LDFLAGS) -L$(LIBAGREPDIR) -o $(BINDIR)/wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) $(LIBS) $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) $(LIBAGREPDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR) ; $(MAKE) region.o: region.c region.h $(CC) -c $(CFLAGS) region.c getword.o: getword.c $(CC) -c $(CFLAGS) getword.c dir.o: dir.c $(CC) -c $(CFLAGS) dir.c lib.o: lib.c $(CC) -c $(CFLAGS) lib.c partition.o: partition.c $(CC) -c $(CFLAGS) 
partition.c glimpse.o: glimpse.c glimpse.h region.h $(CC) -c $(CFLAGS) -DBUILDCAST=0 glimpse.c buildcast.o: glimpse.c glimpse.h region.h rm -f buildcast.c cp glimpse.c buildcast.c $(CC) -c $(CFLAGS) -DBUILDCAST=1 -o buildcast.o buildcast.c io.o: io.c $(CC) -c $(CFLAGS) io.c build_in.o: build_in.c $(CC) -c $(CFLAGS) build_in.c filetype.o: filetype.c $(CC) -c $(CFLAGS) filetype.c simpletest.o: simpletest.c $(CC) -c $(CFLAGS) simpletest.c memlook.o: memlook.c $(CC) -c $(CFLAGS) memlook.c $(OBJS): glimpse.h region.h glimpse-4.18.7/index/Makefile.linux000066400000000000000000000145461300371307100171650ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall notswgconvert all: Sall wgconvert # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = gcc -m486 SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ../bin/. and the agrep library is assumed to # be in ../lib . You normally don't have to change them. # NOTE: INDEXDIR can be a relative or absolute path name. 
BINDIR = ../bin
AGREPDIR = ../agrep
INDEXDIR = ../index
TEMPLATEDIR = ../libtemplate
LIBAGREPDIR = ../lib
LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib
LIBAGREP = agrep
LIBTEMPLATE = template
LIBUTIL = util
OTHERLIBS = -ldl
PROG = glimpseindex
NOTSPROG = nots$(PROG)
CASTPROG = buildcast
NOTSCASTPROG = nots$(CASTPROG)

# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O2
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \
	-DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) \
	-DDONTUSESORT_T_OPTION=1 -DSFS_COMPAT=$(SFS_COMPAT)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS)

TEST = test

OBJS = region.o \
	dir.o \
	io.o\
	build_in.o \
	filetype.o \
	simpletest.o\
	getword.o \
	memlook.o \
	lib.o \
	partition.o

Sall: $(CASTPROG) $(PROG)

NOTSall: $(NOTSCASTPROG) $(NOTSPROG)

#$(TEST): test.o $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a
#	$(CC) $(LINKFLAGS) -L$(LIBTEMPLATEDIR) -o $@ test.c -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS)

$(CASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a
	$(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS)
	cp $(CASTPROG) $(BINDIR)/.

$(NOTSCASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a
	$(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS)
	cp $(CASTPROG) $(BINDIR)/.
$(PROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. $(NOTSPROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. wgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp wgconvert $(BINDIR)/. notswgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) $(OTHERLIBS) cp wgconvert $(BINDIR)/. 
$(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBAGREPDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR) ; $(MAKE) -f Makefile.linux CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" region.o: region.c region.h $(CC) -c $(CFLAGS) region.c getword.o: getword.c $(CC) -c $(CFLAGS) getword.c dir.o: dir.c $(CC) -c $(CFLAGS) dir.c lib.o: lib.c $(CC) -c $(CFLAGS) lib.c partition.o: partition.c $(CC) -c $(CFLAGS) partition.c glimpse.o: glimpse.c glimpse.h region.h $(CC) -c $(CFLAGS) -DBUILDCAST=0 glimpse.c buildcast.o: glimpse.c glimpse.h region.h rm -f buildcast.c cp glimpse.c buildcast.c $(CC) -c $(CFLAGS) -DBUILDCAST=1 -o buildcast.o buildcast.c io.o: io.c $(CC) -c $(CFLAGS) io.c build_in.o: build_in.c $(CC) -c $(CFLAGS) build_in.c filetype.o: filetype.c $(CC) -c $(CFLAGS) filetype.c simpletest.o: simpletest.c $(CC) -c $(CFLAGS) simpletest.c memlook.o: memlook.c $(CC) -c $(CFLAGS) 
memlook.c clean: -rm -f $(OBJS) glimpse.o convert.o buildcast.o buildcast.c core a.out $(PROG) $(CASTPROG) wgconvert $(OBJS): glimpse.h region.h glimpse-4.18.7/index/Makefile.rs6000000066400000000000000000000145161300371307100167550ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall notswgconvert all: Sall wgconvert # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ../bin/. and the agrep library is assumed to # be in ../lib . You normally don't have to change them. # NOTE: INDEXDIR can be a relative or absolute path name. 
BINDIR = ../bin
AGREPDIR = ../agrep
INDEXDIR = ../index
TEMPLATEDIR = ../libtemplate
LIBAGREPDIR = ../lib
LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib
LIBAGREP = agrep
LIBTEMPLATE = template
LIBUTIL = util
OTHERLIBS =
PROG = glimpseindex
NOTSPROG = nots$(PROG)
CASTPROG = buildcast
NOTSCASTPROG = nots$(CASTPROG)

# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \
	-DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) \
	-DDONTUSESORT_T_OPTION=1 -DSFS_COMPAT=$(SFS_COMPAT)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS)

TEST = test

OBJS = region.o \
	dir.o \
	io.o\
	build_in.o \
	filetype.o \
	simpletest.o\
	getword.o \
	memlook.o \
	lib.o \
	partition.o

Sall: $(CASTPROG) $(PROG)

NOTSall: $(NOTSCASTPROG) $(NOTSPROG)

#$(TEST): test.o $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a
#	$(CC) $(LINKFLAGS) -L$(LIBTEMPLATEDIR) -o $@ test.c -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS)

$(CASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a
	$(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS)
	cp $(CASTPROG) $(BINDIR)/.

$(NOTSCASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a
	$(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS)
	cp $(CASTPROG) $(BINDIR)/.
$(PROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. $(NOTSPROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. wgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp wgconvert $(BINDIR)/. notswgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) $(OTHERLIBS) cp wgconvert $(BINDIR)/. 
$(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBAGREPDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR) ; $(MAKE) -f Makefile.rs6000 CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" region.o: region.c region.h $(CC) -c $(CFLAGS) region.c getword.o: getword.c $(CC) -c $(CFLAGS) getword.c dir.o: dir.c $(CC) -c $(CFLAGS) dir.c lib.o: lib.c $(CC) -c $(CFLAGS) lib.c partition.o: partition.c $(CC) -c $(CFLAGS) partition.c glimpse.o: glimpse.c glimpse.h region.h $(CC) -c $(CFLAGS) -DBUILDCAST=0 glimpse.c buildcast.o: glimpse.c glimpse.h region.h cp glimpse.c buildcast.c $(CC) -c $(CFLAGS) -DBUILDCAST=1 -o buildcast.o buildcast.c io.o: io.c $(CC) -c $(CFLAGS) io.c build_in.o: build_in.c $(CC) -c $(CFLAGS) build_in.c filetype.o: filetype.c $(CC) -c $(CFLAGS) filetype.c simpletest.o: simpletest.c $(CC) -c $(CFLAGS) simpletest.c memlook.o: memlook.c $(CC) -c $(CFLAGS) memlook.c clean: -rm 
-f $(OBJS) glimpse.o convert.o buildcast.o buildcast.c core a.out $(PROG) $(CASTPROG) wgconvert $(OBJS): glimpse.h region.h glimpse-4.18.7/index/Makefile.sgi000066400000000000000000000145051300371307100166030ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall notswgconvert all: Sall wgconvert # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ../bin/. and the agrep library is assumed to # be in ../lib . You normally don't have to change them. # NOTE: INDEXDIR can be a relative or absolute path name. 
BINDIR = ../bin
AGREPDIR = ../agrep
INDEXDIR = ../index
TEMPLATEDIR = ../libtemplate
LIBAGREPDIR = ../lib
LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib
LIBAGREP = agrep
LIBTEMPLATE = template
LIBUTIL = util
OTHERLIBS =
PROG = glimpseindex
NOTSPROG = nots$(PROG)
CASTPROG = buildcast
NOTSCASTPROG = nots$(CASTPROG)

# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \
	-DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) \
	-DDONTUSESORT_T_OPTION=1 -DSFS_COMPAT=$(SFS_COMPAT)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS)

TEST = test

OBJS = region.o \
	dir.o \
	io.o\
	build_in.o \
	filetype.o \
	simpletest.o\
	getword.o \
	memlook.o \
	lib.o \
	partition.o

Sall: $(CASTPROG) $(PROG)

NOTSall: $(NOTSCASTPROG) $(NOTSPROG)

#$(TEST): test.o $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a
#	$(CC) $(LINKFLAGS) -L$(LIBTEMPLATEDIR) -o $@ test.c -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS)

$(CASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a
	$(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS)
	cp $(CASTPROG) $(BINDIR)/.

$(NOTSCASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a
	$(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS)
	cp $(CASTPROG) $(BINDIR)/.
$(PROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. $(NOTSPROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. wgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp wgconvert $(BINDIR)/. notswgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) $(OTHERLIBS) cp wgconvert $(BINDIR)/. 
$(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBAGREPDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR) ; $(MAKE) -f Makefile.sgi CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" region.o: region.c region.h $(CC) -c $(CFLAGS) region.c getword.o: getword.c $(CC) -c $(CFLAGS) getword.c dir.o: dir.c $(CC) -c $(CFLAGS) dir.c lib.o: lib.c $(CC) -c $(CFLAGS) lib.c partition.o: partition.c $(CC) -c $(CFLAGS) partition.c glimpse.o: glimpse.c glimpse.h region.h $(CC) -c $(CFLAGS) -DBUILDCAST=0 glimpse.c buildcast.o: glimpse.c glimpse.h region.h cp glimpse.c buildcast.c $(CC) -c $(CFLAGS) -DBUILDCAST=1 -o buildcast.o buildcast.c io.o: io.c $(CC) -c $(CFLAGS) io.c build_in.o: build_in.c $(CC) -c $(CFLAGS) build_in.c filetype.o: filetype.c $(CC) -c $(CFLAGS) filetype.c simpletest.o: simpletest.c $(CC) -c $(CFLAGS) simpletest.c memlook.o: memlook.c $(CC) -c $(CFLAGS) memlook.c clean: -rm -f $(OBJS) 
glimpse.o convert.o buildcast.o buildcast.c core a.out $(PROG) $(CASTPROG) wgconvert $(OBJS): glimpse.h region.h glimpse-4.18.7/index/Makefile.solaris000066400000000000000000000145431300371307100174770ustar00rootroot00000000000000# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. # To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1". #STRUCTURED_QUERIES = 0 STRUCTURED_QUERIES = 1 #all: NOTSall notswgconvert all: Sall wgconvert # Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1). HAVE_DIRENT_H = 1 HAVE_SYS_DIR_H = 0 HAVE_SYS_NDIR_H = 0 HAVE_NDIR_H = 0 # Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0. UTIME = 1 # Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0. ISO_CHAR_SET = 0 # You might have to change this depending on your machine configuration. CC = gcc -traditional #cc SHELL = /bin/sh # For compatibility with SFS, define this flag (internal only) SFS_COMPAT = 0 # YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE # The binaries will be made in ../bin/. and the agrep library is assumed to # be in ../lib . You normally don't have to change them. # NOTE: INDEXDIR can be a relative or absolute path name. 
BINDIR = ../bin
AGREPDIR = ../agrep
INDEXDIR = ../index
TEMPLATEDIR = ../libtemplate
LIBAGREPDIR = ../lib
LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib
LIBAGREP = agrep
LIBTEMPLATE = template
LIBUTIL = util
OTHERLIBS =
PROG = glimpseindex
NOTSPROG = nots$(PROG)
CASTPROG = buildcast
NOTSCASTPROG = nots$(CASTPROG)

# Include flags are not part of CFLAGS and LINKFLAGS since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \
	-DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) \
	-DDONTUSESORT_T_OPTION=1 -DSFS_COMPAT=$(SFS_COMPAT)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS)

TEST = test

OBJS = region.o \
	dir.o \
	io.o\
	build_in.o \
	filetype.o \
	simpletest.o\
	getword.o \
	memlook.o \
	lib.o \
	partition.o

Sall: $(CASTPROG) $(PROG)

NOTSall: $(NOTSCASTPROG) $(NOTSPROG)

#$(TEST): test.o $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a
#	$(CC) $(LINKFLAGS) -L$(LIBTEMPLATEDIR) -o $@ test.c -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS)

$(CASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a
	$(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS)
	cp $(CASTPROG) $(BINDIR)/.

$(NOTSCASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a
	$(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS)
	cp $(CASTPROG) $(BINDIR)/.
$(PROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. $(NOTSPROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. wgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp wgconvert $(BINDIR)/. notswgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) $(OTHERLIBS) cp wgconvert $(BINDIR)/. 
$(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBAGREPDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR) ; $(MAKE) -f Makefile.solaris CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" region.o: region.c region.h $(CC) -c $(CFLAGS) region.c getword.o: getword.c $(CC) -c $(CFLAGS) getword.c dir.o: dir.c $(CC) -c $(CFLAGS) dir.c lib.o: lib.c $(CC) -c $(CFLAGS) lib.c partition.o: partition.c $(CC) -c $(CFLAGS) partition.c glimpse.o: glimpse.c glimpse.h region.h $(CC) -c $(CFLAGS) -DBUILDCAST=0 glimpse.c buildcast.o: glimpse.c glimpse.h region.h cp glimpse.c buildcast.c $(CC) -c $(CFLAGS) -DBUILDCAST=1 -o buildcast.o buildcast.c io.o: io.c $(CC) -c $(CFLAGS) io.c build_in.o: build_in.c $(CC) -c $(CFLAGS) build_in.c filetype.o: filetype.c $(CC) -c $(CFLAGS) filetype.c simpletest.o: simpletest.c $(CC) -c $(CFLAGS) simpletest.c memlook.o: memlook.c $(CC) -c $(CFLAGS) memlook.c clean: 
	-rm -f $(OBJS) glimpse.o convert.o buildcast.o buildcast.c core a.out $(PROG) $(CASTPROG) wgconvert

$(OBJS): glimpse.h region.h

glimpse-4.18.7/index/Makefile.sunos
# Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal.  All Rights Reserved.

# To compile for structured queries, make "all: Sall" and "STRUCTURED_QUERIES=1".
#STRUCTURED_QUERIES = 0
STRUCTURED_QUERIES = 1
#all: NOTSall notswgconvert
all: Sall wgconvert

# Define HAVE_DIRENT_H to be 1 when you don't have else define it to be 0 (in this case, one of the other 3 flags may need to be defined to be 1).
HAVE_DIRENT_H = 1
HAVE_SYS_DIR_H = 0
HAVE_SYS_NDIR_H = 0
HAVE_NDIR_H = 0

# Define UTIME to be 1 if you have the utime() routine on your system. Else define it to be 0.
UTIME = 1

# Define ISO_CHAR_SET to be 1 if you want to use the international 8bit character set. Else define it to be 0.
ISO_CHAR_SET = 0

# You might have to change this depending on your machine configuration.
CC = gcc
SHELL = /bin/sh

# For compatibility with SFS, define this flag (internal only)
SFS_COMPAT = 0

# YOU DON'T HAVE TO CHANGE ANYTHING BELOW THIS LINE

# The binaries will be made in ../bin/. and the agrep library is assumed to
# be in ../lib . You normally don't have to change them.
# NOTE: INDEXDIR can be a relative or absolute path name.
BINDIR = ../bin
AGREPDIR = ../agrep
INDEXDIR = ../index
TEMPLATEDIR = ../libtemplate
LIBAGREPDIR = ../lib
LIBTEMPLATEDIR = $(TEMPLATEDIR)/lib
LIBAGREP = agrep
LIBTEMPLATE = template
LIBUTIL = util
OTHERLIBS =

PROG = glimpseindex
NOTSPROG = nots$(PROG)
CASTPROG = buildcast
NOTSCASTPROG = nots$(CASTPROG)

# Include flags are not part of SUBDIRCFLAGS and SUBDIRLINKFLAGS, since path names from subdirs can be different
OPTIMIZEFLAGS = -O
#PROFILEFLAGS = -p
#DEBUGFLAGS = -g -DBG_DEBUG=1 -DDEBUG=1
INCLUDEFLAGS = -I$(INDEXDIR) -I$(AGREPDIR) -I$(TEMPLATEDIR)/include
DEFINEFLAGS = -DSTRUCTURED_QUERIES=$(STRUCTURED_QUERIES) -DHAVE_DIRENT_H=$(HAVE_DIRENT_H) -DHAVE_SYS_DIR_H=$(HAVE_SYS_DIR_H) \
	-DHAVE_SYS_NDIR_H=$(HAVE_SYS_NDIR_H) -DHAVE_NDIR_H=$(HAVE_NDIR_H) -DUTIME=$(UTIME) -DISO_CHAR_SET=$(ISO_CHAR_SET) \
	-DDONTUSESORT_T_OPTION=1 -DSFS_COMPAT=$(SFS_COMPAT)
SUBDIRCFLAGS = -c $(DEFINEFLAGS) $(OPTIMIZEFLAGS) $(PROFILEFLAGS) $(DEBUGFLAGS)
CFLAGS = $(INCLUDEFLAGS) $(SUBDIRCFLAGS)
SUBDIRLINKFLAGS = $(PROFILEFLAGS)
LINKFLAGS = $(INCLUDEFLAGS) $(SUBDIRLINKFLAGS)

TEST = test

OBJS = region.o \
	dir.o \
	io.o \
	build_in.o \
	filetype.o \
	simpletest.o \
	getword.o \
	memlook.o \
	lib.o \
	partition.o

Sall: $(CASTPROG) $(PROG)

NOTSall: $(NOTSCASTPROG) $(NOTSPROG)

#$(TEST): test.o $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a
#	$(CC) $(LINKFLAGS) -L$(LIBTEMPLATEDIR) -o $@ test.c -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS)

$(CASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a
	$(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS)
	cp $(CASTPROG) $(BINDIR)/.

$(NOTSCASTPROG): buildcast.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a
	$(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(CASTPROG) buildcast.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS)
	cp $(CASTPROG) $(BINDIR)/.
$(PROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. $(NOTSPROG): glimpse.o $(OBJS) $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o $(PROG) glimpse.o $(OBJS) -l$(LIBAGREP) $(OTHERLIBS) cp $(PROG) $(BINDIR)/. wgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -L$(LIBTEMPLATEDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) -l$(LIBTEMPLATE) -l$(LIBUTIL) $(OTHERLIBS) cp wgconvert $(BINDIR)/. notswgconvert: convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o $(LIBAGREPDIR)/lib$(LIBAGREP).a $(CC) $(LINKFLAGS) -L$(LIBAGREPDIR) -o wgconvert convert.o io.o simpletest.o filetype.o region.o memlook.o getword.o -l$(LIBAGREP) $(OTHERLIBS) cp wgconvert $(BINDIR)/. 
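#
# Example (a sketch under assumptions, not from the original authors): the
# nots* targets above link against libagrep only. Assuming the sources compile
# with STRUCTURED_QUERIES=0, as the commented-out defaults at the top of this
# file suggest (this has not been verified here), a build without
# structured-query support could be requested without editing the file:
#
#	make -f Makefile.sunos STRUCTURED_QUERIES=0 NOTSall notswgconvert
#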
$(LIBTEMPLATEDIR)/lib$(LIBTEMPLATE).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBTEMPLATEDIR)/lib$(LIBUTIL).a: cd $(TEMPLATEDIR) ; $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" $(LIBAGREPDIR)/lib$(LIBAGREP).a: cd $(AGREPDIR) ; $(MAKE) -f Makefile.sunos CC="$(CC)" SUBDIRCFLAGS="$(SUBDIRCFLAGS)" SUBDIRLINKFLAGS="$(SUBDIRLINKFLAGS)" SHELL="$(SHELL)" HAVE_DIRENT_H="$(HAVE_DIRENT_H)" HAVE_SYS_DIR_H="$(HAVE_SYS_DIR_H)" HAVE_SYS_NDIR_H="$(HAVE_SYS_NDIR_H)" HAVE_NDIR_H="$(HAVE_NDIR_H)" UTIME="$(UTIME)" STRUCTURED_QUERIES="$(STRUCTURED_QUERIES)" ISO_CHAR_SET="$(ISO_CHAR_SET)" SFS_COMPAT="$(SFS_COMPAT)" region.o: region.c region.h $(CC) -c $(CFLAGS) region.c getword.o: getword.c $(CC) -c $(CFLAGS) getword.c dir.o: dir.c $(CC) -c $(CFLAGS) dir.c lib.o: lib.c $(CC) -c $(CFLAGS) lib.c partition.o: partition.c $(CC) -c $(CFLAGS) partition.c glimpse.o: glimpse.c glimpse.h region.h $(CC) -c $(CFLAGS) -DBUILDCAST=0 glimpse.c buildcast.o: glimpse.c glimpse.h region.h cp glimpse.c buildcast.c $(CC) -c $(CFLAGS) -DBUILDCAST=1 -o buildcast.o buildcast.c io.o: io.c $(CC) -c $(CFLAGS) io.c build_in.o: build_in.c $(CC) -c $(CFLAGS) build_in.c filetype.o: filetype.c $(CC) -c $(CFLAGS) filetype.c simpletest.o: simpletest.c $(CC) -c $(CFLAGS) simpletest.c memlook.o: memlook.c $(CC) -c $(CFLAGS) memlook.c clean: -rm -f 
$(OBJS) glimpse.o convert.o buildcast.o buildcast.c core a.out $(PROG) $(CASTPROG) wgconvert

$(OBJS): glimpse.h region.h

glimpse-4.18.7/index/README
This contains the source for the program "glimpseindex".

glimpse-4.18.7/index/build_in.c
/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal.  All Rights Reserved. */
/* ./glimpse/index/build_in.c */

/* --------------------------------------------------------------
    build_index():  build an index list from a set of files.
    INPUT:  a set of file names     char **name_list[];
            a partition table       int p_table[];
    OUTPUT: an index list;          char *index_list;
    The index list is a char string laid out as follows:
    each entry of the index list contains two parts, a name and its
    indices, where the name is an ASCII character string and the indices
    are a list of small integers (each stored as an unsigned char).
    We use newline as the 'record delimiter' (a 'record' is logically
    a word associated with its indices), and WORD_END_MARK to separate
    a word from its list of indices (so that fscanf %s works).
    Since we restrict the maximum number of partitions to 255,
    a byte is enough to represent an index value. Note that there
    cannot be a partition numbered '\n'.
An example index list: (in logical view) this 12 19 \n is 9 17 12 18 19 \n an 7 12 \n example 16 \n -----------------------------------------------------------------------*/ #include "glimpse.h" #include #define debugt #define BINARY 1 /* #define SW_DEBUG the original sw output of index set */ /* This flag must always be defined: it is used only in build_in.c */ /* #define UDI_DEBUG the original outputs of each indexed file */ /* Some variables used throughout */ #if BG_DEBUG extern FILE *LOGFILE; /* file descriptor for LOG output */ #endif /*BG_DEBUG*/ extern FILE *STATFILE; /* file descriptor for statistical data about indexed files */ extern FILE *MESSAGEFILE; /* file descriptor for important messages meant for the user */ extern char INDEX_DIR[MAX_LINE_LEN]; extern char sync_path[MAX_LINE_LEN]; extern struct stat istbuf; extern struct stat excstbuf; extern struct stat incstbuf; void insert_h(); void insert_index(); extern int ICurrentFileOffset; extern int NextICurrentFileOffset; /* Some options used throughout */ extern int OneFilePerBlock; extern int IndexNumber; extern int CountWords; extern int StructuredIndex; extern int InterpretSpecial; extern int total_size; extern int MAXWORDSPERFILE; extern int NUMERICWORDPERCENT; extern int AddToIndex; extern int DeleteFromIndex; extern int FastIndex; extern int BuildDictionary; extern int BuildDictionaryExisting; extern int CompressAfterBuild; extern int IncludeHigherPriority; extern int FilenamesOnStdin; extern int UseFilters; extern int ByteLevelIndex; extern int RecordLevelIndex; extern int StoreByteOffset; extern int rdelim_len; extern char rdelim[MAX_LINE_LEN]; extern char old_rdelim[MAX_LINE_LEN]; /* int IndexUnderscore; */ extern int IndexableFile; extern int MAX_INDEX_PERCENT; extern int MAX_PER_MB; extern int I_THRESHOLD; extern int usemalloc; extern int BigHashTable; extern int AddedMaxWordsMessage; extern int AddedMixedWordsMessage; extern int icount; /* count the number of my_malloc for indices structure 
*/ extern int hash_icount; /* to see how much was added to the current hash table */ extern int save_icount; /* to see how much was added to the index by the current file */ extern int numeric_icount; /* to see how many numeric words were there in the current file */ extern int num_filter; extern int filter_len[MAX_FILTER]; extern CHAR *filter[MAX_FILTER]; extern CHAR *filter_command[MAX_FILTER]; extern int REAL_PARTITION, REAL_INDEX_BUF, MAX_ALL_INDEX, FILEMASK_SIZE; extern int mask_int[32]; struct indices *deletedlist = NULL; char **name_list[MAXNUM_INDIRECT]; unsigned int *disable_list = NULL; int *size_list[MAXNUM_INDIRECT]; /* temporary area to store size of each file */ extern int p_table[MAX_PARTITION]; int p_size_list[MAX_PARTITION]; /* sum of the sizes of the files in each partition */ int part_num; /* number of partitions */ extern int memory_usage; /* borrowd from getword.c */ extern int PrintedLongWordWarning; extern int indexable_char[256]; extern char *getword(); extern int file_num; extern int old_file_num; extern int attr_num; extern int bp; /* buffer pointer */ extern unsigned char word[MAX_WORD_BUF]; extern int FirstTraverse1; extern struct indices *ip; extern int HashTableSize; struct token **hash_table; /*[MAX_64K_HASH];*/ build_index() { int i; if (AddToIndex || FastIndex) { FirstTraverse1 = OFF; } if ((total_size < LIMIT_64K_HASH*1024*1024) || !BigHashTable) { hash_table = (struct token **)my_malloc(sizeof(struct token *) * MAX_64K_HASH); HashTableSize = MAX_64K_HASH; } else { hash_table = (struct token **)my_malloc(sizeof(struct token *) * MAX_256K_HASH); HashTableSize = MAX_256K_HASH; } build_hash(); /* traverse1(); ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ removed on oct/8/96, bgopal, to see if crazysegvs disappear on lec */ return; } /* ---------------------------------------------------------------------- traverse() function: traverse the hash list of indices = a hash list is a array of linked list, where every node in a linked list contains a 
word whose hash_value is the same. While traversing the hash list, traverse() output a stream of index list. It also frees the memory used in hash_table. ------------------------------------------------------------------------*/ #define CRAZYSEGV 0 traverse() { int numseencount = 0; int numelements; int numonline; int i, j, attribute; struct token *tp, *tp_old; struct indices *ip, *ip_old; #if !CRAZYSEGV FILE *f_out; #else unsigned char onechar[4]; unsigned char onestring[MAX_LINE_LEN]; int f_out; #endif char s[MAX_LINE_LEN]; char *word; int x = -1, y=0, diff, temp, even_words=1; /* 0 is an even number */ int fputcerr; /* added by dgh 5-8-96 */ #ifdef SW_DEBUG printf("in traverse()\n"); #endif sprintf(s, "%s/%s", INDEX_DIR, I2); #if !CRAZYSEGV if ((f_out = fopen(s, "w")) == NULL) { #else if ((f_out = open(s, O_WRONLY|O_CREAT|O_TRUNC, 0600)) == -1) { #endif fprintf(stderr, "Cannot open %s for writing\n", s); exit(2); } for(i=0; iword; while(*word != '\0') { /* copy the word to output */ #if !CRAZYSEGV fputcerr=fputc(*word++, f_out);/* change from putc to fputc */ /* by dgh, 8-5-96 */ #else write(f_out, word, 1); word++; #endif } /* Look for stop lists */ if (OneFilePerBlock && !ByteLevelIndex && (file_num > MaxNum8bPartition) && (tp->totalcount > (file_num * MAX_INDEX_PERCENT / 100))) { #if !CRAZYSEGV putc(ALL_INDEX_MARK, f_out); #else onechar[0] = ALL_INDEX_MARK; write(f_out, onechar, 1); #endif if (StructuredIndex) { /* force big-endian as usual */ attribute = encode16b(tp->attribute); #if !CRAZYSEGV putc((attribute&0x0000ff00)>>8, f_out); putc((attribute&0x000000ff), f_out); #else onechar[0] = (attribute&0x0000ff00)>>8; onechar[1] = (attribute&0x000000ff); write(f_out, onechar, 2); #endif } #if !CRAZYSEGV putc(DONT_CONFUSE_SORT, f_out); #else onechar[0] = DONT_CONFUSE_SORT; write(f_out, onechar, 1); #endif goto next_token; } else if (ByteLevelIndex && (tp->totalcount > ( (((total_size>>20) > 0) && ((total_size>>20)*MAX_PER_MB < MAX_ALL_INDEX)) ? 
				((total_size>>20) * MAX_PER_MB) : MAX_ALL_INDEX) )) {
#if	!CRAZYSEGV
				putc(ALL_INDEX_MARK, f_out);
#else
				onechar[0] = ALL_INDEX_MARK;
				write(f_out, onechar, 1);
#endif
				if (StructuredIndex) {	/* force big-endian as usual */
					attribute = encode16b(tp->attribute);
#if	!CRAZYSEGV
					putc((attribute&0x0000ff00)>>8, f_out);
					putc((attribute&0x000000ff), f_out);
#else
					onechar[0] = (attribute&0x0000ff00)>>8;
					onechar[1] = (attribute&0x000000ff);
					write(f_out, onechar, 2);
#endif
				}
#if	!CRAZYSEGV
				putc(DONT_CONFUSE_SORT, f_out);
#else
				onechar[0] = DONT_CONFUSE_SORT;
				write(f_out, onechar, 1);
#endif
				goto next_token;
			}

#if	!CRAZYSEGV
			putc(WORD_END_MARK, f_out);
#else
			onechar[0] = WORD_END_MARK;
			write(f_out, onechar, 1);
#endif
			if (StructuredIndex) {	/* force big-endian as usual */
				attribute = encode16b(tp->attribute);
#if	!CRAZYSEGV
				putc((attribute&0x0000ff00)>>8, f_out);
				putc((attribute&0x000000ff), f_out);
#else
				onechar[0] = (attribute&0x0000ff00)>>8;
				onechar[1] = (attribute&0x000000ff);
				write(f_out, onechar, 2);
#endif
			}

			numonline = 0;
			x = -1;
			y = 0;
			even_words = 1;

			ip = tp->ip;	/* traverse the indices list */
			ip_old = ip;
			numelements = 0;
			while(ip != NULL) {
				numelements ++;
				if (CountWords) {
#if	!CRAZYSEGV
					fprintf(f_out, "%d", ip->offset[0]);
#else
					sprintf(onestring, "%d", ip->offset[0]);
					write(f_out, onestring, strlen(onestring));
#endif
				}
				else {
					if (ByteLevelIndex) {
						for (j=0; j < INDEX_SET_SIZE; j++) {
							if (ip->index[j] == INDEX_ELEM_FREE) continue;
							if ((ip->offset[j] <= y) && (y > 0) && (x == ip->index[j])) {
								/* consecutive offsets not increasing in same file! */
								fprintf(stderr, "ignoring (%d, %d) > (%d, %d)\n", x, y, ip->index[j], ip->offset[j]);
								continue;	/* error! */
							}
							if (numonline >= MAX_PER_LINE) {
								/* terminate current line since it is too late to put ALL_INDEX_MARK now ...
Unfortunate since sort is screwedup */ #if !CRAZYSEGV putc('\n', f_out); #else onechar[0] = '\n'; write(f_out, onechar, 1); #endif #if 0 putc('\n', stdout); #endif /*0*/ word = tp->word; while(*word != '\0') { /* copy the word to output */ #if !CRAZYSEGV putc(*word++, f_out); #else write(f_out, word, 1); word ++; #endif } #if !CRAZYSEGV putc(WORD_END_MARK, f_out); #else onechar[0] = WORD_END_MARK; write(f_out, onechar, 1); #endif if (StructuredIndex) { /* force big-endian as usual */ attribute = encode16b(tp->attribute); #if !CRAZYSEGV putc((attribute&0x0000ff00)>>8, f_out); putc((attribute&0x000000ff), f_out); #else onechar[0] = (attribute&0x0000ff00)>>8; onechar[1] = (attribute&0x000000ff); write(f_out, onechar, 2); #endif } numonline = 0; x = -1; /* to force code below to output it as if it is a fresh file */ y = 0; /* must output first offset as is, rather than difference */ } if (x != ip->index[j]) { if (x != -1) { temp = encode8b(0); #if !CRAZYSEGV putc(temp, f_out); /* can never ordinarily happen since ICurrentFileOffset is always ++d => delimiter (unless RecordLevelIndex) */ #else onechar[0] = temp; write(f_out, onechar, 1); #endif } if (file_num <= MaxNum8bPartition) { x = encode8b(ip->index[j]); #if !CRAZYSEGV putc(x&0x000000ff, f_out); #else onechar[0] = x&0x000000ff; write(f_out, onechar, 1); #endif } else if (file_num <= MaxNum16bPartition) { x = encode16b(ip->index[j]); #if !CRAZYSEGV putc((x&0x0000ff00)>>8, f_out); putc(x&0x000000ff, f_out); #else onechar[0] = (x&0x0000ff00)>>8; onechar[1] = x&0x000000ff; write(f_out, onechar, 2); #endif } else { x = encode24b(ip->index[j]); #if !CRAZYSEGV putc((x&0x00ff0000)>>16, f_out); putc((x&0x0000ff00)>>8, f_out); putc(x&0x000000ff, f_out); #else onechar[0] = (x&0x00ff0000)>>16; onechar[1] = (x&0x0000ff00)>>8; onechar[2] = x&0x000000ff; write(f_out, onechar, 3); #endif } x = ip->index[j]; /* for next round */ #if 0 printf("#######x=%d ", x); #endif /*0*/ y = 0; } diff = ip->offset[j] - y; y = ip->offset[j]; if 
(diff < MaxNum1BPartition) { temp = encode8b(diff); #if !CRAZYSEGV putc(temp, f_out); #else onechar[0] = temp; write(f_out, onechar, 1); #endif } else if (diff < MaxNum2BPartition) { temp = encode8b((diff/MaxNum8bPartition) | 0x40); #if !CRAZYSEGV putc(temp, f_out); #else onechar[0] = temp; write(f_out, onechar, 1); #endif temp = encode8b(diff % MaxNum8bPartition); #if !CRAZYSEGV putc(temp, f_out); #else onechar[0] = temp; write(f_out, onechar, 1); #endif } else if (diff < MaxNum3BPartition) { temp = encode8b((diff/MaxNum16bPartition) | 0x80); #if !CRAZYSEGV putc(temp, f_out); #else onechar[0] = temp; write(f_out, onechar, 1); #endif temp = encode16b(diff % MaxNum16bPartition); #if !CRAZYSEGV putc((temp & 0x0000ff00) >> 8, f_out); putc(temp & 0x000000ff, f_out); #else onechar[0] = (temp & 0x0000ff00) >> 8; onechar[1] = temp & 0x000000ff; write(f_out, onechar, 2); #endif } else { temp = encode8b((diff/MaxNum24bPartition) | 0xc0); #if !CRAZYSEGV putc(temp, f_out); #else onechar[0] = temp; write(f_out, onechar, 1); #endif temp = encode24b(diff % MaxNum24bPartition); #if !CRAZYSEGV putc((temp & 0x00ff0000) >> 16, f_out); putc((temp & 0x0000ff00) >> 8, f_out); putc(temp & 0x000000ff, f_out); #else onechar[0] = (temp & 0x00ff0000) >> 16; onechar[1] = (temp & 0x0000ff00) >> 8; onechar[2] = temp & 0x000000ff; write(f_out, onechar, 3); #endif } numonline ++; } } /* ByteLevelIndex */ else if (OneFilePerBlock) { if (file_num <= MaxNum8bPartition) { for(j=0; j < INDEX_SET_SIZE; j++) { if (ip->index[j] == INDEX_ELEM_FREE) continue; #if !CRAZYSEGV putc(encode8b(ip->index[j]), f_out); #else onechar[0] = encode8b(ip->index[j]); write(f_out, onechar, 1); #endif } } else if (file_num <= MaxNum12bPartition) { for(j=0; j < INDEX_SET_SIZE; j++) { if (ip->index[j] == INDEX_ELEM_FREE) continue; x = encode12b(ip->index[j]); if (even_words) { #if !CRAZYSEGV putc(x & 0x000000ff, f_out); /* lsb */ #else onechar[0] = x & 0x000000ff; write(f_out, onechar, 1); #endif y = (x & 0x00000f00)>>8; /* 
msb */ even_words = 0; } else { /* odd number of words so far */ y |= (x&0x00000f00)>>4; /* msb of x into msb of y */ #if !CRAZYSEGV putc(y, f_out); putc(x&0x000000ff, f_out); #else onechar[0] = y; onechar[1] = x&0x000000ff; write(f_out, onechar, 2); #endif even_words = 1; } } } else if (file_num <= MaxNum16bPartition) { for(j=0; j < INDEX_SET_SIZE; j++) { if (ip->index[j] == INDEX_ELEM_FREE) continue; x = encode16b(ip->index[j]); #if !CRAZYSEGV putc((x&0x0000ff00)>>8, f_out); putc(x&0x000000ff, f_out); #else onechar[0] = (x&0x0000ff00)>>8; onechar[1] = x&0x000000ff; write(f_out, onechar, 2); #endif } } else { for(j=0; j < INDEX_SET_SIZE; j++) { if (ip->index[j] == INDEX_ELEM_FREE) continue; x = encode24b(ip->index[j]); #if !CRAZYSEGV putc((x&0x00ff0000)>>16, f_out); putc((x&0x0000ff00)>>8, f_out); putc(x&0x000000ff, f_out); #else onechar[0] = (x&0x00ff0000)>>16; onechar[1] = (x&0x0000ff00)>>8; onechar[2] = x&0x000000ff; write(f_out, onechar, 3); #endif } } } /* OneFilePerBlock */ else { /* normal partitions */ for(j=0; j < INDEX_SET_SIZE; j++) { if (ip->index[j] == INDEX_ELEM_FREE) continue; #if !CRAZYSEGV putc(ip->index[j], f_out); #else onechar[0] = ip->index[j]; write(f_out, onechar, 1); #endif } } } ip = ip->next_i; /* go to next indices */ indicesfree(ip_old, sizeof(struct indices)); ip_old = ip; } if (!ByteLevelIndex && OneFilePerBlock && !even_words && (file_num > MaxNum8bPartition) && (file_num <= MaxNum12bPartition)) { #if !CRAZYSEGV putc(y, f_out); #else onechar[0] = y; write(f_out, onechar, 1); #endif } next_token: #if !CRAZYSEGV if (putc('\n', f_out) == EOF) { #else onechar[0] = '\n'; if (write(f_out, onechar, 1) <= 0) { #endif fprintf(stderr, "Error: write failed at %s:%d\n", __FILE__, __LINE__); exit(2); } tp = tp->next_t; /* go to next token */ #if 0 fprintf(stderr, "numelements=%d\n", numelements); #endif /*0*/ #if BG_DEBUG memory_usage -= (strlen(tp_old->word) + 1); #endif /*BG_DEBUG*/ wordfree(tp_old->word, 0); tokenfree(tp_old, sizeof(struct 
token)); tp_old = tp; numseencount ++; } } tokenallfree(); indicesallfree(); wordallfree(); #if BG_DEBUG fprintf(stderr, "out of traverse(): saved/freed %d tokens: new usage: %d\n", numseencount, memory_usage); #endif #if !CRAZYSEGV fflush(f_out); fclose(f_out); #else close(f_out); #endif } traverse1() { FILE *i1, *i2, *i3; int ret; char s[MAX_LINE_LEN], es1[MAX_LINE_LEN], es2[MAX_LINE_LEN], es3[MAX_LINE_LEN]; char s1[MAX_LINE_LEN]; extern int errno; static int maxsortlinelen = 0; int i; if (maxsortlinelen <= 0) { if (file_num < MaxNum8bPartition) maxsortlinelen = round((MaxNum8bPartition * sizeof(int) + MAX_NAME_SIZE), MAX_LINE_LEN) * MAX_LINE_LEN; else if (file_num < MaxNum12bPartition) maxsortlinelen = round((MaxNum12bPartition * sizeof(int) + MAX_NAME_SIZE), MAX_LINE_LEN) * MAX_LINE_LEN; else maxsortlinelen = MAX_SORTLINE_LEN; } traverse(); /* will produce .i2 and my_free allocated memory */ #if USESORT_Z_OPTION #if DONTUSESORT_T_OPTION || SFS_COMPAT sprintf(s, "exec %s -z %d '%s/%s' > '%s/%s'\n", SYSTEM_SORT, maxsortlinelen, escapesinglequote(INDEX_DIR, es1), I2, escapesinglequote(INDEX_DIR, es2), O2); #else sprintf(s, "exec %s -T '%s' -z %d '%s/%s' > '%s/%s'\n", SYSTEM_SORT, escapesinglequote(INDEX_DIR, es1), maxsortlinelen, escapesinglequote(INDEX_DIR, es2), I2, escapesinglequote(INDEX_DIR, es3), O2); #endif #else #if DONTUSESORT_T_OPTION || SFS_COMPAT sprintf(s, "exec %s '%s/%s' > '%s/%s'\n", SYSTEM_SORT, escapesinglequote(INDEX_DIR, es1), I2, escapesinglequote(INDEX_DIR, es2), O2); #else sprintf(s, "exec %s -T '%s' '%s/%s' > '%s/%s'\n", SYSTEM_SORT, escapesinglequote(INDEX_DIR, es1), escapesinglequote(INDEX_DIR, es2), I2, escapesinglequote(INDEX_DIR, es3), O2); #endif #endif #ifdef SW_DEBUG printf("%s", s); #endif if((ret=system(s)) != 0) { sprintf(s1, "system('%s') failed at:\n\t File=%s, Line=%d, Errno=%d", s, __FILE__, __LINE__, errno); perror(s1); fprintf(stderr, "Please try to run the program again\n(If there's no memory, increase the swap area / 
don't use -M and -B options)\n"); exit(2); } #ifdef SW_DEBUG printf("mv .o2 .i2\n"); fflush(stdout); #endif #if SFS_COMPAT sprintf(s, "%s/%s", INDEX_DIR, O2); sprintf(s1, "%s/%s", INDEX_DIR, I2); rename(s, s1); #else sprintf(s, "exec %s '%s/%s' '%s/%s'\n", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), O2, escapesinglequote(INDEX_DIR, es2), I2); system(s); #endif system(sync_path); /* sync() has a bug */ #if 0 printf("traversed\n"); sprintf(s, "exec %s -10 '%s/%s'\n", SYSTEM_HEAD, escapesinglequote(INDEX_DIR, es1), I2); system(s); #endif /*0*/ /* * This flag is set from outside iff build-fast | build-addto option is set. */ if(FirstTraverse1) { /* Mention whether numbers are indexed */ if(IndexNumber) sprintf(s, "exec %s %%1234567890 > '%s/%s'\n", SYSTEM_ECHO, escapesinglequote(INDEX_DIR, es1), INDEX_FILE); else sprintf(s, "exec %s %% > '%s/%s'\n", SYSTEM_ECHO, escapesinglequote(INDEX_DIR,es1), INDEX_FILE); system(s); /* Put the magic number: 0 if not 1file/blk, numfiles otherwise */ if (OneFilePerBlock) { if (ByteLevelIndex) sprintf(s, "exec %s %%-%d >> '%s/%s'\n", SYSTEM_ECHO, file_num, escapesinglequote(INDEX_DIR, es1), INDEX_FILE); else sprintf(s, "exec %s %%%d >> '%s/%s'\n", SYSTEM_ECHO, file_num, escapesinglequote(INDEX_DIR, es1), INDEX_FILE); } else sprintf(s, "exec %s %%0 >> '%s/%s'\n", SYSTEM_ECHO, escapesinglequote(INDEX_DIR, es1), INDEX_FILE); system(s); /* Put the magic number: 0 if not structured index, 1 if so */ if (StructuredIndex) sprintf(s, "exec %s %%%d >> '%s/%s'\n", SYSTEM_ECHO, attr_num, escapesinglequote(INDEX_DIR, es1), INDEX_FILE); else if (RecordLevelIndex) sprintf(s, "exec %s %%-2 %s >> '%s/%s'\n", SYSTEM_ECHO, old_rdelim, escapesinglequote(INDEX_DIR, es1), INDEX_FILE); else sprintf(s, "exec %s %%0 >> '%s/%s'\n", SYSTEM_ECHO, escapesinglequote(INDEX_DIR, es1), INDEX_FILE); system(s); #ifdef SW_DEBUG sprintf(s, "exec %s -l %s/.glimpse*\n", SYSTEM_LS, escapesinglequote(INDEX_DIR, es1)); system(s); #endif sprintf(s, "exec %s '%s/%s' >> 
'%s/%s'\n", SYSTEM_CAT, escapesinglequote(INDEX_DIR, es1), I2, escapesinglequote(INDEX_DIR, es2), INDEX_FILE); system(s); #if SFS_COMPAT sprintf(s, "%s/%s", INDEX_DIR, I2); unlink(s); #else sprintf(s, "exec %s '%s/%s'\n", SYSTEM_RM, escapesinglequote(INDEX_DIR, es1), I2); system(s); #endif #ifdef SW_DEBUG sprintf(s, "exec %s -l %s/.glimpse*\n", SYSTEM_LS, escapesinglequote(INDEX_DIR, es1)); system(s); #endif #if 0 printf("catted\n"); sprintf(s, "exec %s -10 '%s/%s'\n", SYSTEM_HEAD, escapesinglequote(INDEX_DIR, es1), INDEX_FILE); system(s); #endif /*0*/ FirstTraverse1 = 0; system(sync_path); /* sync() has a bug */ return; } /* else not first-traverse */ sprintf(s, "%s/%s", INDEX_DIR, INDEX_FILE); if((i1 = fopen(s, "r")) == NULL) { /* new stuff */ fprintf(stderr, "can't open %s for reading\n", s); exit(2); } sprintf(s, "%s/%s", INDEX_DIR, I2); if((i2 = fopen(s, "r")) == NULL) { /* old stuff */ fprintf(stderr, "can't open %s for reading\n", s); exit(2); } sprintf(s, "%s/%s", INDEX_DIR, I3); if((i3 = fopen(s, "w")) == NULL) { /* result */ fprintf(stderr, "can't open %s for writing\n", s); exit(2); } /* Copy the 3 option fields (indexnumber, onefileperblock, structuredqueries) */ fgets(s, 256, i1); s[255] = '\0'; fputs(s, i3); fgets(s, 256, i1); s[255] = '\0'; fputs(s, i3); fgets(s, 256, i1); s[255] = '\0'; fputs(s, i3); merge_in(i2, i1, i3); /* merge_in(i1, i2, i3); */ #ifdef BG_DEBUG fprintf(stderr, "out of merge_in()\n"); #endif /*BG_DEBUG*/ fclose(i1); fflush(i2); fclose(i2); fflush(i3); fclose(i3); system(sync_path); /* sync() has a bug */ #ifdef SW_DEBUG printf("mv .i3 %s\n", INDEX_FILE); fflush(stdout); #endif #if SFS_COMPAT sprintf(s, "%s/%s", INDEX_DIR, I3); sprintf(s1, "%s/%s", INDEX_DIR, INDEX_FILE); rename(s, s1); #else sprintf(s, "exec %s '%s/%s' '%s/%s'", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), I3, escapesinglequote(INDEX_DIR, es2), INDEX_FILE); system(s); #endif /* #ifdef SW_DEBUG */ #if 0 printf("ls -l .i2 %s\n", INDEX_FILE); fflush(stdout); 
sprintf(s, "exec %s -l %s/.glimpse*", SYSTEM_LS, escapesinglequote(INDEX_DIR, es1)); printf("%d\n", system(s)); #endif #if 0 printf("merged\n"); sprintf(s, "exec %s -10 '%s/%s'\n", SYSTEM_HEAD, escapesinglequote(INDEX_DIR, es1), INDEX_FILE); system(s); #endif /*0*/ } /* -------------------------------------------------------------------- build_hash(): input: a set of filenames in name_list[], a partition table p_table[] output: a hash table hash_table[]. -----------------------------------------------------------------------*/ build_hash() { int fd; /* opened file number */ int i, pn; /* pn: current partition */ int num_read; char word[256]; struct stat stbuf; int offset; int toread; unsigned char *buffer; /* running pointer for getword = place where reads begin */ unsigned char *bx; /* running pointer for read-loop, initially buffer */ unsigned char *buffer_end; /* place where getword should stop */ unsigned char *buffer_begin;/* constant pointer to beginning */ unsigned char *next_record; /* pointer that tells where the current record ends: if buffer (returned by getword) is >= this, increment ICurrentFileOffset */ unsigned char *last_record; /* pointer that tells where the last record ends: may or may not be > buffer_end, but surely <= bx the last byte read */ int residue; /* extra variable to store buffer_begin + BLOCK_SIZE - buffer_end */ int tried_once = 0; int attribute; int ret; char outname[MAX_LINE_LEN]; char *unlinkname = NULL; int pid = getpid(); if (StructuredIndex) region_initialize(); init_hash_table(); #ifdef debug printf("entering build_hash(), part_num=%d\n", part_num); #endif tried_once = 0; try_again_1: buffer_begin = buffer = (unsigned char *) my_malloc(sizeof(char)* BLOCK_SIZE + 10); /* always read in units of BLOCK_SIZE or less */ if(buffer == NULL) { fprintf(stderr, "not enough memory in build_hash\n"); if (tried_once) return; traverse1(); init_hash_table(); tried_once = 1; goto try_again_1; } bx = buffer; if (OneFilePerBlock) { for(i=0; i 
0) { /* do not remove old .TZ file */ if (StructuredIndex && (-1 == region_create(outname))) { fprintf(stderr, "permission denied or non-existent file: %s\n", LIST_GET(name_list, i)); remove_filename(i, -1); continue; } if (((fd = my_open(outname, O_RDONLY, 0)) == -1) ) { fprintf(stderr, "permission denied or non-existent file: %s\n", LIST_GET(name_list, i)); remove_filename(i, -1); if (StructuredIndex) region_destroy(); /* cannot happen! */ unlink(outname); continue; } unlinkname = outname; goto index_file1; } /* Try to apply the filter */ sprintf(outname, "%s/.glimpse_apply.%d", INDEX_DIR, pid); if ((ret = apply_filter(LIST_GET(name_list, i), outname)) == 1) { /* Some pattern matched AND some filter was successful */ if (StructuredIndex && (-1 == region_create(outname))) { fprintf(stderr, "permission denied or non-existent file: %s\n", LIST_GET(name_list, i)); remove_filename(i, -1); continue; } if (((fd = my_open(outname, O_RDONLY)) == -1) ) { /* error: shouldn't have returned 1! */ fprintf(stderr, "permission denied or non-existent file: %s\n", LIST_GET(name_list, i)); remove_filename(i, -1); if (StructuredIndex) region_destroy(); /* cannot happen! */ unlink(outname); continue; } unlinkname = outname; goto index_file1; } else if (ret == 2) { /* Some pattern matched but no filter was successful */ if (filetype(LIST_GET(name_list, i), 0, NULL, NULL)) { /* try to index input file if it satisfies filetype */ remove_filename(i, -1); unlink(outname); continue; } unlinkname = outname; } if (StructuredIndex && (-1 == region_create(LIST_GET(name_list, i)))) { fprintf(stderr, "permission denied or non-existent file: %s\n", LIST_GET(name_list, i)); remove_filename(i, -1); continue; } if (((fd = my_open(LIST_GET(name_list, i), O_RDONLY, 0)) == -1) ) { fprintf(stderr, "permission denied or non-existent file: %s\n", LIST_GET(name_list, i)); remove_filename(i, -1); if (StructuredIndex) region_destroy(); /* cannot happen! 
*/ if (unlinkname != NULL) unlink(unlinkname); continue; } index_file1: #ifdef SW_DEBUG if (AddToIndex || FastIndex) printf("adding words of %s in %d\n", LIST_GET(name_list,i), i); printf("%s\n", LIST_GET(name_list, i)); #endif /* my_stat(LIST_GET(name_list, i), &stbuf); Chris Dalton */ fstat(fd, &stbuf); #ifdef SW_DEBUG printf("filesize: %d\n", stbuf.st_size); #endif #ifdef UDI_DEBUG printf("%s ", LIST_GET(name_list, i)); printf("size: %d ", stbuf.st_size); #endif /* buffer always points to a BLOCK_SIZE block of allocated memory */ buffer = buffer_begin; residue = 0; if (RecordLevelIndex) { if (!StoreByteOffset) NextICurrentFileOffset = ICurrentFileOffset = 1; else NextICurrentFileOffset = ICurrentFileOffset = 0; } for (offset = 0; offset < stbuf.st_size; offset += BLOCK_SIZE) { offset -= residue; if (!RecordLevelIndex) NextICurrentFileOffset = ICurrentFileOffset = offset; toread = offset + BLOCK_SIZE >= stbuf.st_size ? stbuf.st_size - offset : BLOCK_SIZE; lseek(fd, offset, SEEK_SET); bx= buffer; num_read = 0; while ((toread > 0) && ((num_read = read(fd, bx, toread)) < toread)) { if (num_read <= 0) { buffer = bx; fprintf(stderr, "read error on file %s at offset %d\n", LIST_GET(name_list, i), offset); goto break_break1; /* C doesn't have break; break; */ } bx += num_read; toread -= num_read; } if (num_read >= toread) { bx += num_read; toread -= num_read; } buffer_end = bx; residue = 0; if (buffer_end == buffer_begin + BLOCK_SIZE) { if (RecordLevelIndex) { buffer_end = backward_delimiter(buffer_end /* NOT bx */, buffer, rdelim, rdelim_len, 0); } else { while ((INDEXABLE(*(buffer_end-1))) && (buffer_end > buffer_begin + MAX_WORD_SIZE)) buffer_end --; } residue = buffer_begin + BLOCK_SIZE - buffer_end; /* if (residue > 0) printf("residue = %d in %s at %d\n", residue, LIST_GET(name_list, i), offset); */ } if (RecordLevelIndex) { next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); } bx = buffer; PrintedLongWordWarning = 0; while 
((buffer=(unsigned char *) getword(LIST_GET(name_list, i), word, buffer, buffer_end, &attribute, &next_record)) < buffer_end) { if (RecordLevelIndex) { if (buffer >= next_record) { next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); if (StoreByteOffset) ICurrentFileOffset += next_record - buffer; else ICurrentFileOffset ++; } } /* printf("%s\n", word); */ if(word[0] == '\0') continue; if(icount - hash_icount >= I_THRESHOLD) { #if BG_DEBUG fprintf(LOGFILE, "reached I_THRESHOLD at %d\n", icount - hash_icount); #endif /*BG_DEBUG*/ traverse1(); init_hash_table(); hash_icount = icount; } insert_h(word, i, attribute); } if (word[0] != '\0') { /* printf("%s\n", word); */ if(icount - hash_icount >= I_THRESHOLD) { #if BG_DEBUG fprintf(LOGFILE, "reached I_THRESHOLD at %d\n", icount - hash_icount); #endif /*BG_DEBUG*/ traverse1(); init_hash_table(); hash_icount = icount; } insert_h(word, i, attribute); } if (RecordLevelIndex) { if (buffer >= next_record) { /* next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); */ ICurrentFileOffset ++; } } buffer = buffer_begin; next_record = buffer; } break_break1: close(fd); if (unlinkname != NULL) unlink(unlinkname); #ifdef UDI_DEBUG printf("add to index: %d\n",icount-save_icount); #endif if ((MAXWORDSPERFILE > 0) && (icount-save_icount > MAXWORDSPERFILE)) { fprintf(MESSAGEFILE, "%d words are contributed by %s\n", icount-save_icount, LIST_GET(name_list, i)); AddedMaxWordsMessage = ON; } if (IndexNumber && NUMERICWORDPERCENT && (numeric_icount * 100 > (icount - save_icount) * NUMERICWORDPERCENT) && (icount - save_icount > MIN_WORDS)) { fprintf(MESSAGEFILE, "NUMBERS occur in %d%% of %d words contributed by %s\n", (numeric_icount * 100)/(icount - save_icount), icount - save_icount, LIST_GET(name_list, i)); AddedMixedWordsMessage = ON; } numeric_icount=0; save_icount=icount; if (StructuredIndex) region_destroy(); } traverse1(); init_hash_table(); hash_icount = icount; my_free(buffer_begin, 
BLOCK_SIZE + 10); return; } for(pn=1; pn < part_num; pn++) /* partition # 0 is not accessed */ { if (pn == '\n') continue; /* There cannot be a partition # '\n' or 0: see partition.c */ for(i=p_table[pn]; i 0) { /* do not remove old .TZ file */ if (StructuredIndex && (-1 == region_create(outname))) { fprintf(stderr, "permission denied or non-existent file: %s\n", LIST_GET(name_list, i)); remove_filename(i, -1); continue; } if (((fd = my_open(outname, O_RDONLY, 0)) == -1) ) { fprintf(stderr, "permission denied or non-existent file: %s\n", LIST_GET(name_list, i)); remove_filename(i, -1); if (StructuredIndex) region_destroy(); /* cannot happen! */ unlink(outname); continue; } if (BuildDictionary && CompressAfterBuild) strcpy(LIST_GET(name_list, i), outname); /* name of clear file will be smaller, so enough space */ else unlinkname = outname; goto index_file2; } /* Try to apply the filter */ sprintf(outname, "%s/.glimpse_apply.%d", INDEX_DIR, pid); if ((ret = apply_filter(LIST_GET(name_list, i), outname)) == 1) { /* Some pattern matched AND some filter was successful */ if (StructuredIndex && (-1 == region_create(outname))) { fprintf(stderr, "permission denied or non-existent file: %s\n", LIST_GET(name_list, i)); remove_filename(i, -1); continue; } if (((fd = my_open(outname, O_RDONLY)) == -1) ) { /* error: shouldn't have returned 1! */ fprintf(stderr, "permission denied or non-existent file: %s\n", LIST_GET(name_list, i)); remove_filename(i, -1); if (StructuredIndex) region_destroy(); /* cannot happen! 
*/ unlink(outname); continue; } unlinkname = outname; goto index_file2; } else if (ret == 2) { /* Some pattern matched but no filter was successful */ if (filetype(LIST_GET(name_list, i), 0, NULL, NULL)) { /* try to index input file if it satisfies filetype */ remove_filename(i, -1); unlink(outname); continue; } unlinkname = outname; } if (StructuredIndex && (-1 == region_create(LIST_GET(name_list, i)))) { fprintf(stderr, "permission denied or non-existent file: %s\n", LIST_GET(name_list, i)); remove_filename(i, -1); continue; } if (((fd = my_open(LIST_GET(name_list, i), O_RDONLY)) == -1) ) { fprintf(stderr, "permission denied or non-existent file: %s\n", LIST_GET(name_list, i)); remove_filename(i, -1); if (StructuredIndex) region_destroy(); /* cannot happen! */ if (unlinkname != NULL) unlink(unlinkname); continue; } index_file2: #ifdef SW_DEBUG if (AddToIndex || FastIndex) printf("adding words of %s in %d\n", LIST_GET(name_list, i), pn); printf("%s\n", LIST_GET(name_list, i)); #endif /* my_stat(LIST_GET(name_list, i), &stbuf); Chris Dalton */ fstat(fd, &stbuf); #ifdef SW_DEBUG printf("filesize: %d\n", stbuf.st_size); #endif #ifdef UDI_DEBUG printf("%s ", LIST_GET(name_list, i)); printf("size: %d ", stbuf.st_size); #endif /* buffer always points to a BLOCK_SIZE block of allocated memory */ buffer = buffer_begin; residue = 0; if (RecordLevelIndex) { if (!StoreByteOffset) NextICurrentFileOffset = ICurrentFileOffset = 1; else NextICurrentFileOffset = ICurrentFileOffset = 0; } for (offset = 0; offset < stbuf.st_size; offset += BLOCK_SIZE) { offset -= residue; if (!RecordLevelIndex) NextICurrentFileOffset = ICurrentFileOffset = offset; toread = offset + BLOCK_SIZE >= stbuf.st_size ? 
stbuf.st_size - offset : BLOCK_SIZE; lseek(fd, offset, SEEK_SET); bx= buffer; num_read = 0; while ((toread > 0) && ((num_read = read(fd, bx, toread)) < toread)) { if (num_read <= 0) { buffer = bx; fprintf(stderr, "read error on file %s at offset %d\n", LIST_GET(name_list, i), offset); goto break_break2; /* C doesn't have break; break; */ } bx += num_read; toread -= num_read; } if (num_read >= toread) { bx += num_read; toread -= num_read; } buffer_end = bx; residue = 0; if (buffer_end == buffer_begin + BLOCK_SIZE) { if (RecordLevelIndex) { buffer_end = backward_delimiter(buffer_end /* NOT bx */, buffer, rdelim, rdelim_len, 0); } else { while ((INDEXABLE(*(buffer_end-1))) && (buffer_end > buffer_begin + MAX_WORD_SIZE)) buffer_end --; } residue = buffer_begin + BLOCK_SIZE - buffer_end; /* if (residue > 0) printf("residue = %d in %s at %d\n", residue, LIST_GET(name_list, i), offset); */ } if (RecordLevelIndex) { next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); } bx = buffer; PrintedLongWordWarning = 0; while ((buffer=(unsigned char *) getword(LIST_GET(name_list, i), word, buffer, buffer_end, &attribute, &next_record)) < buffer_end) { if (RecordLevelIndex) { if (buffer >= next_record) { next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); if (StoreByteOffset) ICurrentFileOffset += next_record - buffer; else ICurrentFileOffset ++; } } /* printf("%s\n", word); */ if(word[0] == '\0') continue; if(icount - hash_icount >= I_THRESHOLD) { #if BG_DEBUG fprintf(LOGFILE, "reached I_THRESHOLD at %d\n", icount - hash_icount); #endif /*BG_DEBUG*/ traverse1(); init_hash_table(); hash_icount = icount; } insert_h(word, pn, attribute); } if (word[0] != '\0') { /* printf("%s\n", word); */ if(icount - hash_icount >= I_THRESHOLD) { #if BG_DEBUG fprintf(LOGFILE, "reached I_THRESHOLD at %d\n", icount - hash_icount); #endif /*BG_DEBUG*/ traverse1(); init_hash_table(); hash_icount = icount; } insert_h(word, pn, attribute); } if 
(RecordLevelIndex) { if (buffer >= next_record) { /* next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); */ ICurrentFileOffset ++; } } buffer = buffer_begin; next_record = buffer; } break_break2: close(fd); if (unlinkname != NULL) unlink(unlinkname); #ifdef UDI_DEBUG printf("add to index: %d\n",icount-save_icount); #endif if ((MAXWORDSPERFILE > 0) && (icount-save_icount > MAXWORDSPERFILE)) { fprintf(MESSAGEFILE, "%d words are contributed by %s\n", icount-save_icount, LIST_GET(name_list, i)); AddedMaxWordsMessage = ON; } if (IndexNumber && NUMERICWORDPERCENT && (numeric_icount * 100 > (icount - save_icount) * NUMERICWORDPERCENT) && (icount - save_icount > MIN_WORDS)) { fprintf(MESSAGEFILE, "NUMBERS occur in %d%% of %d words contributed by %s\n", (numeric_icount * 100)/(icount - save_icount), icount - save_icount, LIST_GET(name_list, i)); AddedMixedWordsMessage = ON; } numeric_icount=0; save_icount=icount; if (StructuredIndex) region_destroy(); } } traverse1(); init_hash_table(); hash_icount = icount; my_free(buffer_begin, BLOCK_SIZE + 10); } init_hash_table() { int i; for(i=0; iword) == 0) && (tp->attribute == attribute)) { insert_index(tp, pn); tried_once = 0; return; /* already in there */ } tp_bak = tp; tp = tp->next_t; } /* this is a new word, insert it */ if((tp = (struct token *) tokenalloc(sizeof(struct token))) == NULL) { tp_bak = NULL; if (tried_once == 1) { fprintf(stderr, "not enough memory in insert_h1 at icount=%d. skipping...\n", icount); tried_once = 0; return; /* ignore word altogether */ } traverse1(); init_hash_table(); tried_once = 1; /* memory allocation failed in malloc#1 */ insert_h(word, pn, attribute); /* next call can't fail here since traverse() calls *allfree() */ return; } if((tp->word = (char *) wordalloc(sizeof(char) * (wordlen+1))) == NULL) { tp_bak = NULL; if (tried_once == 2) { fprintf(stderr, "not enough memory in insert_h2 at icount=%d. 
skipping...\n", icount);
            tokenfree(tp, sizeof(struct token));
            tried_once = 0;
            return; /* ignore word altogether */
        }
        tokenfree(tp, sizeof(struct token));
        traverse1();
        init_hash_table();
        tried_once = 2; /* memory allocation failed in malloc#2 */
        insert_h(word, pn, attribute); /* next call can't fail here or above since traverse() calls *allfree() */
        return;
    }
    strcpy(tp->word, word);
    tp->attribute = attribute;

    /* the index list has a first index */
    if((iip = (struct indices *) indicesalloc(sizeof(struct indices))) == NULL) {
        tp_bak = NULL;
        if (tried_once == 3) {
            fprintf(stderr, "not enough memory in insert_h3 at icount=%d. skipping...\n", icount);
            wordfree(tp->word, wordlen + 1);
            tokenfree(tp, sizeof(struct token));
            tried_once = 0;
            return; /* ignore word altogether */
        }
        wordfree(tp->word, wordlen + 1);
        tokenfree(tp, sizeof(struct token));
        traverse1();
        init_hash_table();
        tried_once = 3; /* memory allocation failed in malloc#3 */
        insert_h(word, pn, attribute); /* next call can't fail here or above or above-above since traverse() calls *allfree() */
        return;
    }
    icount++;
    if (IndexNumber && NUMERICWORDPERCENT) {
        int i=0;
        while(word[i] != '\0') {
            if (!isalpha(((unsigned char *)word)[i])) break;
            i++;
        }
        if (word[i] != '\0') numeric_icount ++;
    }

#ifdef SW_DEBUG
    if((icount & 01777) == 0)
        printf("icount = %d\n", icount);
#endif
    if (!CountWords) {
        iip->index[0] = pn;
        iip->offset[0] = ICurrentFileOffset;
    }
    for (j=1; j<INDEX_SET_SIZE; j++) iip->index[j] = INDEX_ELEM_FREE;
    /* assign both head and tail */
    iip->next_i = NULL;
    tp->ip = iip;
    tp->lastip = iip;
    if(tp_bak == NULL) hash_table[hash_value] = tp;
    else tp_bak->next_t = tp;
    tp->next_t = NULL;
    tp->totalcount = 1;
    tried_once = 0; /* now sure that there has been no memory allocation failure while inserting this word */
    return;
}

/* -------------------------------------------------------------------
insert_index(): insert an index, i.e., pn, into an indices structure.
The indices structure is a linked list where the 'first' one is always
the active indices structure. When the active one is filled with 8 indices,
a new indices structure is created and becomes the active one.
tp points to the token structure, so tp->ip is the active indices
structure (tp->lastip when ByteLevelIndex is set, since new structures
are then appended at the tail).
THERE ARE NO STATE CHANGES UNLESS WE ARE SURE THAT MALLOCS WON'T FAIL: BG
------------------------------------------------------------------- */
void
insert_index(tp, pn)
struct token *tp; /* insert an index into an indices structure */
int pn;
{
    struct indices *iip, *temp;
    struct indices *ip = (ByteLevelIndex ? tp->lastip : tp->ip);
    static int tried_once = 0;
    int j;

    if (CountWords) {
        /* I am not interested in maintaining where a word occurs: only the number of times it occurs */
        ip->offset[0] ++;
        return;
    }

    /* Check for stop-list */
    if (OneFilePerBlock && !ByteLevelIndex && (file_num > MaxNum8bPartition) && (tp->totalcount > (file_num * MAX_INDEX_PERCENT / 100))) return;
    if (ByteLevelIndex && (tp->totalcount > ( (((total_size>>20) > 0) && ((total_size>>20)*MAX_PER_MB < MAX_ALL_INDEX)) ? ((total_size>>20) * MAX_PER_MB) : MAX_ALL_INDEX) )) return;

    if (ByteLevelIndex) {
        for (j=INDEX_SET_SIZE; j>0; j--) {
            if(ip->index[j-1] == INDEX_ELEM_FREE) continue;
            if ((ip->index[j-1] == pn) && (ip->offset[j-1] == ICurrentFileOffset)) return; /* in identical position */
            else break;
        }
    }
    else {
        for (j=INDEX_SET_SIZE; j>0; j--) {
            if(ip->index[j-1] == INDEX_ELEM_FREE) continue;
            if (ip->index[j-1] == pn) return; /* current word is not the first appearance in partition pn */
            else break;
        }
    }

    /* ip->index[j] is the place to insert new pn provided j < INDEX_SET_SIZE */
    if(j < INDEX_SET_SIZE) {
        ip->offset[j] = ICurrentFileOffset;
        ip->index[j] = pn;
        return;
    }
    if((iip = (struct indices *) indicesalloc(sizeof(struct indices)))==NULL) {
        if (tried_once == 1) {
            fprintf(stderr, "not enough memory in insert_index at icount=%d. 
skipping...\n", icount);
            tried_once = 0;
            return; /* ignore index altogether */
        }
        traverse1();
        init_hash_table();
        tried_once = 1; /* memory allocation failed in malloc#1 */
        insert_index(tp, pn);
        return;
    }
    icount++;
    if (ByteLevelIndex) { /* insert at the end */
        tp->lastip->next_i = iip;
        iip->next_i = NULL;
        tp->lastip = iip;
    }
    else {
        iip->next_i = tp->ip;
        tp->ip = iip;
    }
    iip->offset[0] = ICurrentFileOffset;
    iip->index[0] = pn;
    for (j=1; j<INDEX_SET_SIZE; j++) iip->index[j] = INDEX_ELEM_FREE;
    tp->totalcount ++;

    if ( (OneFilePerBlock && !ByteLevelIndex && (file_num > MaxNum8bPartition) && (tp->totalcount > (file_num * MAX_INDEX_PERCENT / 100))) ||
         (ByteLevelIndex && (tp->totalcount > ( (((total_size>>20) > 0) && ((total_size>>20)*MAX_PER_MB < MAX_ALL_INDEX)) ? ((total_size>>20) * MAX_PER_MB) : MAX_ALL_INDEX) )) ) {
        for (iip=tp->ip; iip != NULL; temp = iip, iip = iip->next_i, indicesfree(temp, sizeof(struct indices)));
        tp->ip = NULL; /* never need to insert anything else here */
    }
    /* printf("returning from insert_index()\n"); fflush(stderr); */
    tried_once = 0;
    return;
}

/* Scan the indexed "word" from an index line: see io.c/merge_splits() */
scanword(word, buffer, buffer_end, attr)
unsigned char *word, *buffer, *buffer_end;
unsigned int *attr;
{
    int i = MAX_WORD_SIZE;

    while ((i-- != 0) && (buffer <= buffer_end) && (*buffer != ALL_INDEX_MARK) && (*buffer != WORD_END_MARK) && (*buffer != '\n') && (*buffer != '\0'))
        *word ++ = *buffer ++;
    *word = '\0';
    *attr = encode16b(0);
    if (StructuredIndex) {
        if ((*buffer == ALL_INDEX_MARK) || (*buffer == WORD_END_MARK)) {
            buffer ++;
            *attr = ((*buffer) << 8) | (*(buffer + 1));
        }
    }
}

/* Globals used in merge, and also in glimpse's main.c */
extern unsigned int *src_index_set;
extern unsigned int *dest_index_set;
extern unsigned char *src_index_buf;
extern unsigned char *dest_index_buf;
extern unsigned char *merge_index_buf;

/* merge index file f1 and f2, then put the result in index file f3 */
merge_in(f1, f2, f3)
FILE *f1, *f2, *f3;
{
    int src_mark, dest_mark;
int src_num, dest_num; int src_end_pt, dest_end_pt; int cmp=0; /* the result of strcmp */ int bdx, bdx1, bdx2, merge_len, i, j; int TAIL1=0; char word1[MAX_WORD_SIZE+6]; /* used only for strcmp() */ char word2[MAX_WORD_SIZE+6]; /* used only for strcmp() */ unsigned int attr1, attr2; int x=0, y=0, even_words = 1; int merge_buf_size=REAL_INDEX_BUF; /* LOOK OUT FOR: [memset, fgets, endpt-forloop, scanword] 4-tuples: invariant */ #if debug printf("in merge_in()\n"); fflush(stdout); #endif memset(dest_index_buf, '\0', REAL_INDEX_BUF); if (fgets(dest_index_buf, REAL_INDEX_BUF, f2) == NULL) dest_index_buf[0] = '\0'; else dest_index_buf[REAL_INDEX_BUF - 1] = '\0'; dest_end_pt = strlen(dest_index_buf); scanword(word2, dest_index_buf, dest_index_buf+dest_end_pt, &attr2); #ifdef debug printf("word2 = %s\n", word2); #endif memset(src_index_buf, '\0', REAL_INDEX_BUF); while(fgets(src_index_buf, REAL_INDEX_BUF, f1)) { src_index_buf[REAL_INDEX_BUF - 1] = '\0'; src_end_pt = strlen(src_index_buf); scanword(word1, src_index_buf, src_index_buf+src_end_pt, &attr1); #ifdef debug printf("word1 = %s\n", word1); #endif while (((cmp = strncmp(word1, word2, MAX_WORD_SIZE+4)) > 0) || (StructuredIndex && (cmp == 0) && (attr1 > attr2))) { fputs(dest_index_buf, f3); memset(dest_index_buf, '\0', dest_end_pt+2); if(fgets(dest_index_buf, REAL_INDEX_BUF, f2) == NULL) { dest_index_buf[REAL_INDEX_BUF - 1] = '\0'; TAIL1 = ON; break; } dest_index_buf[REAL_INDEX_BUF - 1] = '\0'; dest_end_pt = strlen(dest_index_buf); scanword(word2, dest_index_buf, dest_index_buf+dest_end_pt, &attr2); } if(TAIL1 == ON) break; if ((cmp == 0) && (attr1 == attr2)) { /* we need to join the index of word1 and word2 */ #ifdef debug printf("joining src_index_buf and dest_index_buf\n"); printf("src_index_buf = %s", src_index_buf); printf("dest_index_buf = %s", dest_index_buf); #endif if (!CountWords && !ByteLevelIndex) { /* have to look for common indices and exclude them */ int oldbdx1, oldbdx2; merge_index_buf[0] = '\0'; 
merge_len = 0; oldbdx1 = bdx1 = 0; /* src_index_buf[src_end_pt] is '\0', src_index_buf[src_end_pt-1] is '\n' */ while((bdx1 < src_end_pt) && (src_index_buf[bdx1] != WORD_END_MARK) && (src_index_buf[bdx1] != ALL_INDEX_MARK)) bdx1 ++; if ((bdx1 > oldbdx1) && (bdx1 < src_end_pt)) { /* src_index_buf[bdx1] is the word-end-mark */ src_mark = src_index_buf[bdx1]; src_index_buf[bdx1] = '\0'; /* terminate word */ strcpy(merge_index_buf, src_index_buf); /* save the word itself */ merge_len = strlen(src_index_buf); /* merge_index_buf[merge_len] is '\0', merge_index_buf[merge_len-1] is a part of the word */ bdx1 ++; /* skip word end marker */ if (StructuredIndex) bdx1 += 2; /* skip attribute field */ } even_words = 1; src_num = 0; if (OneFilePerBlock) memset((char *)src_index_set, '\0', sizeof(int)*REAL_PARTITION); else memset((char *)src_index_set, '\0', sizeof(int) * (MAX_PARTITION + 1)); while(bdx1 < src_end_pt - 1) { if (OneFilePerBlock) { x = 0; if (file_num <= MaxNum8bPartition) { x = decode8b(src_index_buf[bdx1]); bdx1 ++; } else if (file_num <= MaxNum12bPartition) { if (even_words) { x = ((src_index_buf[bdx1+1] & 0x0000000f) << 8) | src_index_buf[bdx1]; x = decode12b(x); bdx1 += 2; even_words = 0; } else { /* odd number of words so far */ x = ((src_index_buf[bdx1-1] & 0x000000f0) << 4) | src_index_buf[bdx1]; x = decode12b(x); bdx1 ++; even_words = 1; } } else if (file_num <= MaxNum16bPartition) { x = (src_index_buf[bdx1] << 8) | src_index_buf[bdx1+1]; x = decode16b(x); bdx1 += 2; } else { x = (src_index_buf[bdx1] << 16) | (src_index_buf[bdx1+1] << 8) | src_index_buf[bdx1+2]; x = decode24b(x); bdx1 += 3; } src_index_set[block2index(x)] |= mask_int[x % (8*sizeof(int))]; src_num ++; } else src_index_set[src_num++] = src_index_buf[bdx1++]; } oldbdx2 = bdx2 = 0; /* dest_index_buf[dest_end_pt] is '\0', dest_index_buf[dest_end_pt-1] is '\n' */ while((bdx2 < dest_end_pt) && (dest_index_buf[bdx2] != WORD_END_MARK) && (dest_index_buf[bdx2] != ALL_INDEX_MARK)) bdx2 ++; if ((bdx2 
> oldbdx2) && (bdx2 < dest_end_pt)) { /* dest_index_buf[bdx2] is the word-end-mark */ dest_mark = dest_index_buf[bdx2]; dest_index_buf[bdx2] = '\0'; /* terminate word */ if (merge_len == 0) { strcpy(merge_index_buf, dest_index_buf); /* save the word itself */ merge_len = strlen(merge_index_buf); /* merge_index_buf[merge_len] is '\0', merge_index_buf[merge_len-1] is a part of the word */ } bdx2 ++; /* skip word end marker */ if (StructuredIndex) bdx2 += 2; /* skip attribute field */ } even_words = 1; dest_num = 0; if (OneFilePerBlock) memset((char *)dest_index_set, '\0', sizeof(int)*REAL_PARTITION); else memset((char *)dest_index_set, '\0', sizeof(int) * (MAX_PARTITION + 1)); while(bdx2 < dest_end_pt - 1) { if (OneFilePerBlock) { x = 0; if (file_num <= MaxNum8bPartition) { x = decode8b(dest_index_buf[bdx2]); bdx2 ++; } else if (file_num <= MaxNum12bPartition) { if (even_words) { x = ((dest_index_buf[bdx2+1] & 0x0000000f) << 8) | dest_index_buf[bdx2]; x = decode12b(x); bdx2 += 2; even_words = 0; } else { /* odd number of words so far */ x = ((dest_index_buf[bdx2-1] & 0x000000f0) << 4) | dest_index_buf[bdx2]; x = decode12b(x); bdx2 ++; even_words = 1; } } else if (file_num <= MaxNum16bPartition) { x = (dest_index_buf[bdx2] << 8) | dest_index_buf[bdx2+1]; x = decode16b(x); bdx2 += 2; } else { x = (dest_index_buf[bdx2] << 16) | (dest_index_buf[bdx2+1] << 8) | dest_index_buf[bdx2+2]; x = decode24b(x); bdx2 += 3; } dest_index_set[block2index(x)] |= mask_int[x % (8*sizeof(int))]; dest_num ++; } else dest_index_set[dest_num++] = dest_index_buf[bdx2++]; } even_words = 1; /* prevent buffer overflows */ if (merge_len > 0) { if (merge_len > merge_buf_size - 6) { merge_buf_size += 20; merge_index_buf = realloc(merge_index_buf, merge_buf_size); fprintf(stderr,"Realloc + 20 new size is %d",merge_buf_size); memset(merge_index_buf + merge_len,'\0',merge_buf_size-merge_len); } if(OneFilePerBlock && ((src_mark == ALL_INDEX_MARK) || (dest_mark == ALL_INDEX_MARK) || ((file_num > 
MaxNum8bPartition) && (src_num + dest_num > file_num*MAX_INDEX_PERCENT / 100)) )) { merge_index_buf[merge_len++] = ALL_INDEX_MARK; if (StructuredIndex) { merge_index_buf[merge_len++] = (attr1 & 0xff00) >> 8; merge_index_buf[merge_len++] = (attr1 & 0xff); } if (file_num <= MaxNum8bPartition) merge_index_buf[merge_len ++] = encode8b(DONT_CONFUSE_SORT); else if (file_num <= MaxNum12bPartition) { merge_index_buf[merge_len ++] = (encode12b(DONT_CONFUSE_SORT) & 0x00000f00) >> 8; merge_index_buf[merge_len ++] = encode12b(DONT_CONFUSE_SORT) & 0x000000ff; } else if (file_num <= MaxNum16bPartition) { merge_index_buf[merge_len ++] = (encode16b(DONT_CONFUSE_SORT) & 0x0000ff00) >> 8; merge_index_buf[merge_len ++] = encode16b(DONT_CONFUSE_SORT) & 0x000000ff; } else { merge_index_buf[merge_len ++] = (encode24b(DONT_CONFUSE_SORT) & 0x00ff0000) >> 16; merge_index_buf[merge_len ++] = (encode24b(DONT_CONFUSE_SORT) & 0x0000ff00) >> 8; merge_index_buf[merge_len ++] = encode24b(DONT_CONFUSE_SORT) & 0x000000ff; } goto final_merge; } merge_index_buf[merge_len++] = WORD_END_MARK; if (StructuredIndex) { merge_index_buf[merge_len++] = (attr1 & 0xff00) >> 8; merge_index_buf[merge_len++] = (attr1 & 0xff); } if (OneFilePerBlock) { for (i=0; i (merge_buf_size - 3*8*sizeof(int) - 1)) { merge_buf_size *= 2; merge_index_buf = realloc(merge_index_buf, merge_buf_size); fprintf(stderr,"Had to Realloc merge buffer (#2), new size is %d\n",merge_buf_size); memset(merge_index_buf + merge_len,'\0',merge_buf_size-merge_len); } if (dest_index_set[i]) for (j=0; j<8*sizeof(int); j++) if (dest_index_set[i] & mask_int[j]) { x = i*8*sizeof(int) + j; if (file_num <= MaxNum8bPartition) { merge_index_buf[merge_len++] = encode8b(x); } else if (file_num <= MaxNum12bPartition) { x = encode12b(x); if (even_words) { merge_index_buf[merge_len++] = x & 0x000000ff; /* lsb */ y = (x & 0x00000f00)>>8; /* msb */ even_words = 0; } else { /* odd number of words so far */ y |= (x&0x00000f00)>>4; /* msb of x into msb of y */ 
                            merge_index_buf[merge_len ++] = y;
                            merge_index_buf[merge_len ++] = x&0x000000ff;
                            even_words = 1;
                        }
                    }
                    else if (file_num <= MaxNum16bPartition) {
                        x = encode16b(x);
                        merge_index_buf[merge_len ++] = (x&0x0000ff00)>>8;
                        merge_index_buf[merge_len ++] = x&0x000000ff;
                    }
                    else {
                        x = encode24b(x);
                        merge_index_buf[merge_len ++] = (x&0x00ff0000)>>16;
                        merge_index_buf[merge_len ++] = (x&0x0000ff00)>>8;
                        merge_index_buf[merge_len ++] = x&0x000000ff;
                    }
                }
            }
            if (!even_words && (file_num > MaxNum8bPartition) && (file_num <= MaxNum12bPartition))
                merge_index_buf[merge_len ++] = y;
        }
        else { /* normal indexing */
            for (i=0; i<src_num; i++) {
                merge_index_buf[merge_len++] = src_index_set[i];
                if (merge_len > (merge_buf_size - 3)) {
                    merge_buf_size *= 2;
                    merge_index_buf = realloc(merge_index_buf, merge_buf_size);
                    fprintf(stderr,"Had to Realloc merge buffer (#3), new size is %d\n",merge_buf_size);
                    memset(merge_index_buf + merge_len,'\0',merge_buf_size-merge_len);
                }
            }
            for (j=0; j<dest_num; j++) {
                for (i=0; i<src_num; i++)
                    if (dest_index_set[j] == src_index_set[i]) break;
                if (i>=src_num) /* did not find match */
                    merge_index_buf[merge_len++] = dest_index_set[j];
                /* Prevent buffer overflow */
                if (merge_len > (merge_buf_size - 2)) {
                    merge_buf_size *= 2;
                    merge_index_buf = realloc(merge_index_buf, merge_buf_size);
                    fprintf(stderr,"Realloc #4, new size is %d\n",merge_buf_size);
                    memset(merge_index_buf + merge_len,'\0',merge_buf_size-merge_len);
                }
                /* Doesn't matter if dest_index_set is int-array (merge_index_buf being char array) since dest_index_set has only a char */
            }
        }

    final_merge:
        merge_index_buf[merge_len++] = '\n';
        merge_index_buf[merge_len] = '\0';
        fputs(merge_index_buf, f3);
        /* fprintf(stderr, "%d+%d=%d ", src_end_pt, dest_end_pt, merge_len); */
    } /* merge_len > 0 */
}
else if (CountWords) { /* indices are frequencies, so just merge them: OneFilePerBlock is ignored */
    strcpy(merge_index_buf, src_index_buf);
    bdx = strlen(merge_index_buf); /* merge_index_buf[bdx] is '\0', merge_index_buf[bdx-1] is '\n' */
    if (bdx > 1) bdx--; /* now merge_index_buf[bdx] is '\n', merge_index_buf[bdx-1] is the last index */
    bdx2 = 0;
    /* find the first index */
    if (IndexNumber)
while(isalnum(dest_index_buf[bdx2])) bdx2 ++; else while(isalpha(dest_index_buf[bdx2])) bdx2++; /* to skip over the word-end marker of dest_index_buf (which is a blank) */ if (bdx2 > 0) bdx2 ++; if (StructuredIndex) bdx2 += 2; /* this is a nop since CountWords and StructuredIndex don't make sense together */ if (bdx >= 1) { merge_index_buf[bdx++] = ' '; /* blank separated fscanf-able list of integers representing counts */ } /* append the indices of word1 to the buffer */ if (dest_index_buf[bdx2] > 0) { while((dest_index_buf[bdx2]>0)&&(bdx2= src_end_pt) || (src_index_buf[bdx1] == ALL_INDEX_MARK) || (src_end_pt + dest_end_pt >= MAX_SORTLINE_LEN)) { putc(ALL_INDEX_MARK, f3); if (StructuredIndex) { putc((attr1&0xff00) >> 8, f3); putc((attr1&0xff), f3); } putc(DONT_CONFUSE_SORT, f3); putc('\n', f3); } else { /* dest can be all index mark */ bdx2 = 0; while ((bdx2= dest_end_pt) || (dest_index_buf[bdx2] == ALL_INDEX_MARK)) { putc(ALL_INDEX_MARK, f3); if (StructuredIndex) { putc((attr1&0xff00) >> 8, f3); putc((attr1&0xff), f3); } putc(DONT_CONFUSE_SORT, f3); putc('\n', f3); } else { /* we have to put out both the lists */ putc(WORD_END_MARK, f3); bdx1 ++; /* skip over WORD_END_MARK */ if (StructuredIndex) { putc((attr1&0xff00) >> 8, f3); putc((attr1&0xff), f3); bdx1 += 2; } while ((bdx1 < src_end_pt) && (src_index_buf[bdx1] != '\n') && (src_index_buf[bdx1] != '\0')) putc(src_index_buf[bdx1++], f3); fputc(encode8b(0), f3); /* instead of the '\n' after end of src_index_buf */ bdx2 ++; /* skip over WORD_END_MARK */ if (StructuredIndex) bdx2 += 2; while ((bdx2 < dest_end_pt) && (dest_index_buf[bdx2] != '\n') && (dest_index_buf[bdx2] != '\0')) putc(dest_index_buf[bdx2++], f3); putc('\n', f3); } } } #if debug printf("merge_index_buf = %s", merge_index_buf); #endif /*debug*/ memset(dest_index_buf, '\0', dest_end_pt+2); if(fgets(dest_index_buf, REAL_INDEX_BUF, f2) == 0) { dest_index_buf[REAL_INDEX_BUF - 1] = '\0'; TAIL1 = ON; break; } dest_index_buf[REAL_INDEX_BUF - 1] = '\0'; 
dest_end_pt = strlen(dest_index_buf); scanword(word2, dest_index_buf, dest_index_buf+dest_end_pt, &attr2); } else { /* word1 < word2, so output src_index_buf */ fputs(src_index_buf, f3); } memset(src_index_buf, '\0', src_end_pt+2); } if(TAIL1) { if(cmp != 0) fputs(src_index_buf, f3); memset(src_index_buf, '\0', src_end_pt+2); while(fgets(src_index_buf, REAL_INDEX_BUF, f1)) { src_index_buf[REAL_INDEX_BUF - 1] = '\0'; src_end_pt = strlen(src_index_buf); fputs(src_index_buf, f3); memset(src_index_buf, '\0', src_end_pt+2); } } else { /* output the tail of f2 */ fputs(dest_index_buf, f3); memset(dest_index_buf, '\0', dest_end_pt+2); while(fgets(dest_index_buf, REAL_INDEX_BUF, f2)) { dest_index_buf[REAL_INDEX_BUF - 1] = '\0'; dest_end_pt = strlen(dest_index_buf); fputs(dest_index_buf, f3); memset(dest_index_buf, '\0', dest_end_pt+2); } } return; } remove_filename(fileindex, new_partition) int fileindex, new_partition; { if ((fileindex < 0) || (fileindex >= MaxNum24bPartition)) return; #if BG_DEBUG fprintf(LOGFILE, "removing %s from index\n", LIST_GET(name_list, fileindex)); memory_usage -= (strlen(LIST_GET(name_list, fileindex)) + 2); #endif /*BG_DEBUG*/ my_free(LIST_GET(name_list, fileindex), 0); LIST_SUREGET(name_list, fileindex) = NULL; if ((disable_list != NULL) && (fileindex < old_file_num)) disable_list[block2index(fileindex)] |= mask_int[fileindex % (8*sizeof(int))]; } /* returns the set of deleted files in the struct indices format: note that by construction, the list is sorted according to fileindex */ struct indices* get_removed_indices() { int i, j; char *name; struct indices *head = NULL, **tail, *new; tail = &head; for (i=0; iindex[INDEX_SET_SIZE - 1] != INDEX_ELEM_FREE)) { new = (struct indices *)my_malloc(sizeof(struct indices)); memset(new, '\0', sizeof(struct indices)); for (j=0; jindex[j] = INDEX_ELEM_FREE; if (*tail != NULL) { (*tail)->next_i = new; tail = &(*tail)->next_i; } else head = new; } for (j=0; jindex[j] == INDEX_ELEM_FREE) break; /* j must 
be < INDEX_SET_SIZE */ (*tail)->index[j] = i; } } return head; } /* returns a -ve number if there is no newfileindex for this file (deleted from index), or the new index otherwise */ /* length_of_deletedlist = MaxNum24bPartition + 1 - get_new_index(deletedlist, MaxNum24bPartition + 1); */ get_new_index(deletedlist, oldfileindex) struct indices *deletedlist; int oldfileindex; { int j; int reduction = 0; struct indices *head = deletedlist; while (head!=NULL) { for (j=0; jindex[j] == INDEX_ELEM_FREE) return oldfileindex; /* crossed the limit */ else if (oldfileindex == head->index[j]) return -1; /* oldfileindex has been deleted now */ else if (oldfileindex > head->index[j]) oldfileindex --; else /* oldfileindex < head->index[j] */ return oldfileindex; /* no more have been deleted before oldfileindex */ } head = head->next_i; } return oldfileindex; /* crossed the limit */ } delete_removed_indices(deletedlist) struct indices *deletedlist; { int j; int reduction = 0; struct indices *head = deletedlist, *temp; while (head!=NULL) { temp = head; head = head->next_i; my_free(temp, sizeof(struct indices)); } return reduction; } /* The savings in index space are not worth it if you call this function w/o OneFilePerBlock: returns new_file_num (after purging index of deleted files) */ int purge_index() { int new_file_num; int i, firsti; /* firsti used to eliminate dead words from index */ char s[MAX_LINE_LEN], es1[MAX_LINE_LEN], es2[MAX_LINE_LEN], temp_rdelim[MAX_LINE_LEN]; char s1[MAX_LINE_LEN]; int j; unsigned char c; FILE *i_in; FILE *i_out; int offset, index; char indexnumberbuf[256]; int onefileperblock, structuredindex; int even_words, dest_even_words; int delim = encode8b(0); int x, y, new_x; int prevy, diff, oldj; if (!OneFilePerBlock) return file_num; #if 0 /* initialized in glimpse.c now */ if ((deletedlist = get_removed_indices()) == NULL) return file_num; printf("into purge_index()\n"); #endif new_file_num = file_num - ( MaxNum24bPartition + 1 - 
get_new_index(deletedlist, MaxNum24bPartition + 1) ); /* since encoding may change */ /* else, we have already done merge-splits and the result is in INDEX_FILE */ #if 0 printf("purging indexing from %d to %d files\n", file_num, new_file_num); #endif sprintf(s, "%s/%s", INDEX_DIR, INDEX_FILE); if((i_in = fopen(s, "r")) == NULL) { fprintf(stderr, "can't open %s for reading\n", s); exit(2); } sprintf(s, "%s/.glimpse_merge.%d", INDEX_DIR, getpid()); if ((i_out = fopen(s, "w")) == NULL) { fprintf(stderr, "cannot open for writing: %s\n", s); exit(2); } /* modified the original in glimpse's main.c */ fgets(indexnumberbuf, 256, i_in); fputs(indexnumberbuf, i_out); fscanf(i_in, "%%%d\n", &onefileperblock); if (ByteLevelIndex) fprintf(i_out, "%%-%d\n", new_file_num); /* #of files might have changed due to -f/-a */ else fprintf(i_out, "%%%d\n", new_file_num); /* This was the stupidest thing of all! */ if ( !fscanf(i_in, "%%%d\n", &structuredindex) ) /* New file format may not have delimiter. Fixed 10/25/99 as per mhubin. 
--GV */ fscanf(i_in, "%%%d%s\n", &structuredindex, temp_rdelim); if (structuredindex <= 0) structuredindex = 0; if (RecordLevelIndex) fprintf(i_out, "%%-2 %s\n", old_rdelim); /* robint@zedcor.com */ else fprintf(i_out, "%%%d\n", attr_num); /* attributes might have been added during last merge */ while (fgets(src_index_buf, REAL_INDEX_BUF, i_in)) { i = 0; j = 0; while ((j < REAL_INDEX_BUF) && (src_index_buf[j] != WORD_END_MARK) && (src_index_buf[j] != ALL_INDEX_MARK) && (src_index_buf[j] != '\0') && (src_index_buf[j] != '\n')) dest_index_buf[i++] = src_index_buf[j++]; if ((j >= REAL_INDEX_BUF) || (src_index_buf[j] == '\0') || (src_index_buf[j] == '\n')) continue; /* else it is WORD_END_MARK or ALL_INDEX_MARK */ c = src_index_buf[j]; dest_index_buf[i++] = src_index_buf[j++]; if (structuredindex) { /* convert all attributes to 2B to make merge_in()s easy in build_in.c */ if (structuredindex < MaxNum8bPartition - 1) { dest_index_buf[i++] = src_index_buf[j++]; } else { dest_index_buf[i++] = src_index_buf[j++]; dest_index_buf[i++] = src_index_buf[j++]; } } if (c == ALL_INDEX_MARK) { dest_index_buf[i++] = DONT_CONFUSE_SORT; dest_index_buf[i++] = '\n'; dest_index_buf[i] = '\0'; fputs(dest_index_buf, i_out); continue; } /* src_index_buf[j] points to the first byte of the block numbers, dest_index_buf[i] points to the first empty byte to fill up */ firsti = i; /* * Main while loop: Copied from glimpse/get_index.c/get_set() with minor modifications. 
*/ even_words = 1; dest_even_words = 1; while((j= 0) { /* since encoding may change */ if (ByteLevelIndex) { if (new_file_num <= MaxNum8bPartition) { x = encode8b(new_x); dest_index_buf[i++] = (x&0x000000ff); } else if (new_file_num <= MaxNum16bPartition) { x = encode16b(new_x); dest_index_buf[i++] = ((x&0x0000ff00)>>8); dest_index_buf[i++] = (x&0x000000ff); } else { x = encode24b(new_x); dest_index_buf[i++] = ((x&0x00ff0000)>>16); dest_index_buf[i++] = ((x&0x0000ff00)>>8); dest_index_buf[i++] = (x&0x000000ff); } } /* ByteLevelIndex */ else if (OneFilePerBlock) { if (new_file_num <= MaxNum8bPartition) { dest_index_buf[i++] = (encode8b(new_x)); } else if (new_file_num <= MaxNum12bPartition) { x = encode12b(new_x); if (dest_even_words) { dest_index_buf[i++] = (x & 0x000000ff); /* lsb */ y = (x & 0x00000f00)>>8; /* msb */ dest_even_words = 0; } else { /* odd number of words so far */ y |= (x&0x00000f00)>>4; /* msb of x into msb of y */ dest_index_buf[i++] = (y); dest_index_buf[i++] = (x&0x000000ff); dest_even_words = 1; } } else if (file_num <= MaxNum16bPartition) { x = encode16b(new_x); dest_index_buf[i++] = ((x&0x0000ff00)>>8); dest_index_buf[i++] = (x&0x000000ff); } else { x = encode24b(new_x); dest_index_buf[i++] = ((x&0x00ff0000)>>16); dest_index_buf[i++] = ((x&0x0000ff00)>>8); dest_index_buf[i++] = (x&0x000000ff); } } /* OneFilePerBlock */ } /* else don't put anything */ prevy = 0; if (ByteLevelIndex) { while ((j= 0) { memcpy(&dest_index_buf[i], &src_index_buf[oldj], j - oldj); i += j - oldj; } if ((j= 0) { dest_index_buf[i++] = src_index_buf[j++]; } else j ++; break; } } /* else don't put anything */ } } if (!ByteLevelIndex && OneFilePerBlock && !dest_even_words && (new_file_num > MaxNum8bPartition) && (new_file_num <= MaxNum12bPartition)) dest_index_buf[i++] = (y); if (firsti < i) { dest_index_buf[i++] = '\n'; dest_index_buf[i] = '\0'; fputs(dest_index_buf, i_out); } else { /* word with no references, so ditch it: fill up junk for safety */ dest_index_buf[i] = 
'\n'; dest_index_buf[i+1] = '\n'; } } fclose(i_in); fflush(i_out); fclose(i_out); #if SFS_COMPAT sprintf(s, "%s/.glimpse_merge.%d", INDEX_DIR, getpid()); sprintf(s1, "%s/%s", INDEX_DIR, INDEX_FILE); rename(s, s1); #else sprintf(s, "exec %s '%s/.glimpse_merge.%d' '%s/%s'", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), getpid(), escapesinglequote(INDEX_DIR, es2), INDEX_FILE); system(s); #endif return new_file_num; } /* This does not affect global variables other than disable_list: used only in incremental indexing */ initialize_disable_list(files) int files; { int _FILEMASK_SIZE = ((files + 1)/(8*sizeof(int)) + 4); int _REAL_PARTITION = (_FILEMASK_SIZE + 4); if (disable_list == NULL) disable_list = (unsigned int *)my_malloc(sizeof(int)*_REAL_PARTITION); memset(disable_list, '\0', sizeof(int) * _REAL_PARTITION); /* nothing is disabled initially */ } glimpse-4.18.7/index/convert.c000066400000000000000000001025021300371307100162010ustar00rootroot00000000000000/**********************************************************************************************/ /* convert.c: Program to inter-convert different representations of neighbourhood sets */ /* */ /* Uses: to compress neighbourhood sets for faster search/uncompress for viewing/editing them */ /* Author: Burra Gopal, bgopal@cs.arizona.edu, Sep 7-8 1996: WebGlimpse support */ /**********************************************************************************************/ #include /* configured defs */ #include "glimpse.h" #include #include #if ISO_CHAR_SET #include /* support for 8bit character set:ew@senate.be */ #endif #include #define IS_LITTLE_ENDIAN 1 #define IS_BIG_ENDIAN 0 #define IS_INDICES 1 #define IS_BITS 2 #define IS_NAMES 3 #define USUALBUFFER_SIZE (MAX_LINE_LEN*64) /* Exported routines */ int element2name(/*int, out char*, int, int, int*/); int mem_element2name(/*int, char*, unsigned char*, unsigned char*, int*/); int name2element(/*out int*, char*, int, int, int, int*/); int mem_name2element(/*out int*, 
char*, int, unsigned char*, unsigned char*, int*/); int do_conversion(/*FILE*, FILE*, int, int, int, int, int, unsigned int *, int, int*/); int change_format(/*int, int, int, int, int, int, char *, char **/); /* Imported routines */ int hashNk(/*char *, int*/);/* from io.c */ /* Internal routines */ int discardinfo(/*char **/); int allocate_and_fill(/* out unsigned char **, int, char *, int*/); /* Imported variables */ extern int errno; extern int get_index_type(); /* from io.c */ extern int file_num; /* from io.c */ extern int mask_int[32]; /* from io.c */ extern int BigFilenameHashTable; /* from io.c */ extern int InfoAfterFilename; /* from io.c */ /* Internal variables */ /* Variables related to options (i/p-->o/p types)*/ int InputType, OutputType, InputEndian, OutputEndian, InputFilenames, ReadIntoMemory; char glimpseindex_dir[MAX_LINE_LEN]; char filename_prefix[MAX_LINE_LEN]; /* Variables related to ReadIntoMemory option (I/O efficiency) */ unsigned char *filenames_buffer, *filenames_index_buffer, *filehash_buffer, *filehash_index_buffer; int filenames_len, filenames_index_len, filehash_len, filehash_index_len; int fdname, fdname_index, fdhash, fdhash_index; unsigned char usualbuffer[USUALBUFFER_SIZE]; /* Variables for statistics */ int hash_misses = 0; /******************************************************** * Discards information after ' ' in filename * * Returns: 0 if it found info to discard, -1 otherwise * * Assumes: file ends with '\0' * * CHANGED from ' ' to FILE_END_MARK 6/7/99 --GB * ********************************************************/ int discardinfo(file) char file[]; { int k; if (InfoAfterFilename) { k = 0; while (file[k] != '\0') { if (file[k] == '\\') { k ++; if (file[k] == '\0') break; k++; continue; } else { if (file[k] == FILE_END_MARK) { file[k] = '\0'; return 0; } k++; continue; } } } /* pab23feb98: return -1 if !InfoAfterFilename */ return -1; } 
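/*
 * Illustrative sketch of the scan that discardinfo() performs, written as a
 * standalone helper so it can be tried without glimpse.h. The name
 * example_strip_info and the explicit `mark' parameter are assumptions made
 * for this example only; the real routine uses FILE_END_MARK and is a no-op
 * unless InfoAfterFilename is set. A backslash escapes the character that
 * follows it, so an escaped mark does not end the filename.
 */
static int example_strip_info(char *file, char mark)
{
	int k = 0;

	while (file[k] != '\0') {
		if (file[k] == '\\') {		/* skip the backslash and the escaped character */
			k++;
			if (file[k] == '\0') break;
			k++;
		} else if (file[k] == mark) {	/* first unescaped mark: cut the string here */
			file[k] = '\0';
			return 0;
		} else k++;
	}
	return -1;	/* nothing to discard */
}
/*
 * With mark == ' ', "/a/b.html http://x/b.html" is cut down to "/a/b.html"
 * (returns 0), while a name whose only blank is escaped with a backslash is
 * left untouched (returns -1).
 */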
/******************************************************************************************** * Allocates the "buffer" of size "len" and fills it up with "len" amount of data from "fd" * * Returns: 0 on success, -1 on failure (i.e., if allocation fails or can't read fully) * ********************************************************************************************/ int allocate_and_fill(buffer, len, filename, fd) unsigned char **buffer; int len; char *filename; int fd; { if ((len <= 0) || ((*buffer = (unsigned char *)my_malloc(len)) == NULL)) { fprintf(stderr, "Disable -M option: cannot allocate memory for %s\n", filename); return -1; } if (len != read(fd, *buffer, len)) { fprintf(stderr, "Disable -M option: cannot read %s\n", filename); return -1; } return 0; } /************************************************************************************** * Finds filename for given element (index#: every element points to indexed object) * * Returns: -1 if error and 0 on success * * See glimpse/index/io.c/save_datastructures() for the format of the names-file * **************************************************************************************/ int element2name(element, file, fd, fdi, files_used) int element; char file[]; /* out */ int fd, fdi; /* fd=filenames fd, fdi=filenames_index fd */ int files_used; { int k, offset, lastoffset = -1, len; unsigned char array[4]; if ((element < 0) || (element >= files_used)) { errno = EINVAL; return -1; } lseek(fdi, (long)element*4, SEEK_SET); if (read(fdi, array, 4) != 4) { errno = ENOENT; return -1; } offset = (array[0] << 24) | (array[1] << 16) | (array[2] << 8) | array[3]; if (read(fdi, array, 4) == 4) { lastoffset = (array[0] << 24) | (array[1] << 16) | (array[2] << 8) | array[3]; } if (lseek(fd, (long)offset, SEEK_SET) == -1) { fprintf(stderr, ".glimpse_filenames: can't seek to %d\n", offset); return -1; } if (lastoffset != -1) len = read(fd, file, lastoffset - offset); else len = read(fd, file, MAX_LINE_LEN); if (len == 
-1) { errno = ENOENT; return -1; } if (len > 0) file[len - 1] = '\0'; else file[0] = '\0'; /* separated by '\n', so zero that out: if empty file, will get its strlen() to be 0, as expected; a zero-length read must not write to file[-1] */ if (InfoAfterFilename) discardinfo(file); return 0; } /************************************************************************************** * Finds filename for given element (index#: every element points to indexed object) * * Returns: -1 if error and 0 on success * * See glimpse/index/io.c/save_datastructures() for the format of the names-file * * Works by reading in-memory copy of the files * **************************************************************************************/ int mem_element2name(element, file, filenames_buffer, filenames_index_buffer, files_used) int element; char file[]; /* out */ unsigned char *filenames_buffer, *filenames_index_buffer; int files_used; { int i, offset, lastoffset = -1, len; if ((element < 0) || (element >= files_used) || (element >= filenames_index_len)) { errno = EINVAL; return -1; } i = element*4; offset = (filenames_index_buffer[i] << 24) | (filenames_index_buffer[i+1] << 16) | (filenames_index_buffer[i+2] << 8) | filenames_index_buffer[i+3]; if (element == files_used - 1) lastoffset = filenames_len; else lastoffset = (filenames_index_buffer[i+4] << 24) | (filenames_index_buffer[i+5] << 16) | (filenames_index_buffer[i+6] << 8) | filenames_index_buffer[i+7]; /* fprintf(stderr, "element=%d offset=%d, lastoffset=%d, filenames_len=%d, files_used=%d\n", element, offset, lastoffset, filenames_len, files_used); */ if ((offset < 0) || (offset > filenames_len) || (lastoffset < 0) || (lastoffset > filenames_len) || (offset >= lastoffset)) { errno = ENOENT; return -1; } if (lastoffset - offset >= MAX_LINE_LEN) { errno = EINVAL; return -1; } memcpy(file, &filenames_buffer[offset], lastoffset-offset); file[lastoffset - offset - 1] = '\0'; /* separated by '\n', so zero that out: if empty file, will get its strlen() to be 0, as expected */ if (InfoAfterFilename) 
discardinfo(file); return 0; } /***************************************************************************************** * Returns: element (index#) for given filename (every element points to indexed object) * * Returns: -1 if error (assuming that element#s are >= 0, ofcourse...) * * See glimpse/index/io.c/save_datastructures() for the format of the hash-file * *****************************************************************************************/ int name2element(pelement, file, len, fd, fdi, files_used) int *pelement; /* out */ char file[]; int len; int fd, fdi; /* fd=filehash fd, fdi=filehash_index fd */ int files_used; { int malloced = 0, ret, i, k, foundblank=0, offset, lastoffset = -1, hash, size; unsigned char *buffer, array[4]; if ((len <= 0) || (len >= MAX_LINE_LEN)) { errno = EINVAL; return -1; } hash = hashNk(file, len); /* fprintf(stderr, "len=%d file=%s hash=%d\n", len, file, hash); */ if (lseek(fdi, (long)hash*4, SEEK_SET) == -1) { fprintf(stderr, ".glimpse_filehash_index: can't seek to %d\n", hash*4); return -1; } if ((ret = read(fdi, array, 4)) != 4) { fprintf(stderr, "read only %d bytes from %d\n", ret, hash*4); errno = ENOENT; return -1; } offset = (array[0] << 24) | (array[1] << 16) | (array[2] << 8) | array[3]; /* fprintf(stderr, "offset=%d\n", offset); */ if (read(fdi, array, 4) == 4) { lastoffset = (array[0] << 24) | (array[1] << 16) | (array[2] << 8) | array[3]; } else lastoffset = lseek(fd, (long)0, SEEK_END /*2*/ /* from end */); /* so that next time I get prev-value = file size */ /* fprintf(stderr, "lastoffset=%d\n", lastoffset); */ size = lastoffset - offset; if (size <= 1) { errno = ENOENT; return -1; } if (size < USUALBUFFER_SIZE) buffer = usualbuffer; else { buffer = (unsigned char *)my_malloc(size); malloced = 1; } /* fprintf(stderr, "hash=%d offset=%d lastoffset=%d size=%d\n", hash, offset, lastoffset, size); */ lseek(fd, (long)offset, SEEK_SET); if (size != read(fd, buffer, size)) { if (malloced) my_free((char *)buffer, size); 
errno = ENOENT; return -1; } /* fprintf(stderr, "buffer=%s\n", buffer+4); */ for (i=0; i= 0, ofcourse...) * * See glimpse/index/io.c/save_datastructures() for the format of the hash-file * * Works by reading in-memory copy of the files * *****************************************************************************************/ mem_name2element(pelement, file, len, filehash_buffer, filehash_index_buffer, files_used) int *pelement; /* out */ char *file; int len; unsigned char *filehash_buffer, *filehash_index_buffer; int files_used; { int lasti, ret, i, k, foundblank=0, offset, lastoffset = -1, hash, size; unsigned char *buffer; if ((len <= 0) || (len >= MAX_LINE_LEN)) { errno = EINVAL; return -1; } hash = hashNk(file, len); i = hash*4; offset = (filehash_index_buffer[i] << 24) | (filehash_index_buffer[i+1] << 16) | (filehash_index_buffer[i+2] << 8) | filehash_index_buffer[i+3]; if (BigFilenameHashTable) lasti = MAX_64K_HASH - 1; else lasti = MAX_4K_HASH - 1; if (i == lasti) lastoffset = filehash_len; else lastoffset = (filehash_index_buffer[i+4] << 24) | (filehash_index_buffer[i+5] << 16) | (filehash_index_buffer[i+6] << 8) | filehash_index_buffer[i+7]; if ((offset < 0) || (offset > filehash_len) || (lastoffset < 0) || (lastoffset > filehash_len) || (offset >= lastoffset)) { errno = ENOENT; return -1; } size = lastoffset - offset; if (size <= 1) { errno = ENOENT; return -1; } /* fprintf(stderr, "hash=%d offset=%d lastoffset=%d size=%d\n", hash, offset, lastoffset, size); */ buffer = &filehash_buffer[offset]; for (i=0; i%d %x\n", name, i, mask_int[i%(8*sizeof(int))]); */ index_set[block2index(i)] |= mask_int[i%(8*sizeof(int))]; if (OutputType == IS_INDICES) { /* indices is always bigendian */ putc(((i & 0xff000000) >> 24)&0xff, outputfile); putc(((i & 0x00ff0000) >> 16)&0xff, outputfile); putc(((i & 0x0000ff00) >> 8)&0xff, outputfile); putc((i & 0x000000ff), outputfile); } } } if (OutputType == IS_BITS) { for (i=0; i> 24)&0xff, outputfile); putc(((index_set[i] & 
0x00ff0000) >> 16)&0xff, outputfile); putc(((index_set[i] & 0x0000ff00) >> 8)&0xff, outputfile); putc((index_set[i] & 0x000000ff), outputfile); } else if (OutputEndian == IS_LITTLE_ENDIAN) { /* little */ putc((index_set[i] & 0x000000ff), outputfile); putc(((index_set[i] & 0x0000ff00) >> 8)&0xff, outputfile); putc(((index_set[i] & 0x00ff0000) >> 16)&0xff, outputfile); putc(((index_set[i] & 0xff000000) >> 24)&0xff, outputfile); } } } } else if (InputType == IS_INDICES) { /* indices is always bigendian */ while ((nextchar = getc(inputfile)) != EOF) { nextchar = nextchar & 0xff; i = nextchar << 24; if ((nextchar = getc(inputfile)) == EOF) break; nextchar = nextchar & 0xff; i |= nextchar << 16; if ((nextchar = getc(inputfile)) == EOF) break; nextchar = nextchar & 0xff; i |= nextchar << 8; if ((nextchar = getc(inputfile)) == EOF) break; nextchar = nextchar & 0xff; i |= nextchar; if (indextype != 0) { if (i < file_num) index_set[block2index(i)] |= mask_int[i%(8*sizeof(int))]; } else { if (i < MAX_PARTITION) index_set[i] = 1; } if (OutputType == IS_NAMES) { if (ReadIntoMemory) ret = mem_element2name(i, outname, filenames_buffer, filenames_index_buffer, file_num); else ret = element2name(i, outname, fdname, fdname_index, file_num); if (ret != -1) fprintf(outputfile, "%s\n", outname); } } if (OutputType == IS_BITS) { for (i=0; i> 24)&0xff, outputfile); putc(((index_set[i] & 0x00ff0000) >> 16)&0xff, outputfile); putc(((index_set[i] & 0x0000ff00) >> 8)&0xff, outputfile); putc((index_set[i] & 0x000000ff), outputfile); } else if (OutputEndian == IS_LITTLE_ENDIAN) { /* little */ putc((index_set[i] & 0x000000ff), outputfile); putc(((index_set[i] & 0x0000ff00) >> 8)&0xff, outputfile); putc(((index_set[i] & 0x00ff0000) >> 16)&0xff, outputfile); putc(((index_set[i] & 0xff000000) >> 24)&0xff, outputfile); } } } } else if (InputType == IS_BITS) { i = 0; while ((i < sizeof(int) * index_set_size) && (nextchar = getc(inputfile)) != EOF) { nextchar = nextchar & 0x000000ff; /* 
fprintf(stderr, "nextchar=%x\n", nextchar); */ if (indextype != 0) { if (InputEndian == IS_LITTLE_ENDIAN) { /* little-endian: little end of integer was dumped first in bitfield_file */ index_set[i/4] |= (nextchar << (8*(i%4))); } else if (InputEndian == IS_BIG_ENDIAN) { /* big-endian: big end of integer is first was dumped first in bitfield_file */ index_set[i/4] |= (nextchar << (8*(4-1-(i%4)))); } } else { if (i < MAX_PARTITION) { /* interpretation of "bit" changes without OneFilePerBlock */ index_set[i] = (nextchar != 0) ? 1 : 0; } else break; /* BITFIELDLENGTH, by above definition, is always > MAX_PARTITION: see io.c */ } i++; } for (i=0; i>24, outputfile); putc((m&0x00ff0000)>>16, outputfile); putc((m&0x0000ff00)>>8, outputfile); putc((m&0x000000ff), outputfile); } } } } } return 0; } /********************************************************************** * Calls do_conversion() to convert storage format of a set of files; * * Optimizes some cases by reading important files into memory. 
* * Returns: 0 on success, -1 on failure * **********************************************************************/ int change_format(InputFilenames, ReadIntoMemory, InputType, OutputType, InputEndian, OutputEndian, glimpseindex_dir, filename_prefix) int InputFilenames; int ReadIntoMemory; int InputType; int OutputType; int InputEndian; int OutputEndian; char *glimpseindex_dir; char *filename_prefix; { char outname[MAX_LINE_LEN]; /* place where converted output is stored */ char s[MAX_LINE_LEN]; /* temp buffer */ char realname[MAX_LINE_LEN]; /* name after prefix of neighbourhood file is added to it */ char name[MAX_LINE_LEN]; /* name of file gotten from stdin: only if (InputFilenames) */ int lastslash, name_len, indextype, indexnumber, structuredindex, recordlevelindex, temp_attr_num, bytelevelindex; /*indextype*/ int i, ret; /* for-loop/return-value */ int num_input_filenames; /* for statistics */ char temp_rdelim[MAX_LINE_LEN]; /*indextype*/ struct stat istbuf; /*indexstat*/ struct stat fstbuf; /*filestat*/ unsigned int *index_set, index_set_size; /*neighbourhood's bitmap representation*/ FILE *inputfile, *outputfile; /*file to be converted/file to store converted output: only if (InputFilenames) */ /* Options set: read index */ sprintf(s, "%s/%s", glimpseindex_dir, INDEX_FILE); if (-1 == stat(s, &istbuf)) { fprintf(stderr, "Cannot find index in directory `%s'\n\tuse `-H dir' to specify a glimpse index directory\n", glimpseindex_dir); return usage(); } /* Find out existing index of words and partitions/filenumbers */ indextype = get_index_type(s, &indexnumber, &indextype, &structuredindex, temp_rdelim); if (structuredindex == -2) { recordlevelindex = 1; bytelevelindex = 1; } if (structuredindex <= 0) structuredindex = 0; else { temp_attr_num = structuredindex; structuredindex = 1; } if (indextype == 0) { file_num = MAX_PARTITION; /*tiny*/ index_set_size = MAX_PARTITION; } else { if (indextype > 0) file_num = indextype; /*small*/ else file_num = -indextype; 
/*medium*/ index_set_size = ((file_num + 8*sizeof(int) - 1)/(8*sizeof(int))); } index_set = (unsigned int *)my_malloc(index_set_size * sizeof(unsigned int)); memset(index_set, '\0', index_set_size * sizeof(unsigned int)); sprintf(name, "%s/%s", glimpseindex_dir, NAME_LIST); if ((fdname = open(name, O_RDONLY, 0)) == -1) { fprintf(stderr, "Cannot open for reading: %s\n", name); return -1; } fstbuf.st_size = 0; fstat(fdname, &fstbuf); if (ReadIntoMemory) { filenames_len = fstbuf.st_size; filenames_buffer = NULL; if (allocate_and_fill(&filenames_buffer, filenames_len, name, fdname) == -1) { close(fdname); if (filenames_buffer != NULL) my_free(filenames_buffer, filenames_len); return -1; } close(fdname); } sprintf(name, "%s/%s", glimpseindex_dir, NAME_LIST_INDEX); if ((fdname_index = open(name, O_RDONLY, 0)) == -1) { fprintf(stderr, "Cannot open for reading: %s\n", name); if (!ReadIntoMemory) { close(fdname); } else { if (filenames_buffer != NULL) my_free(filenames_buffer, filenames_len); } return -1; } fstbuf.st_size = 0; fstat(fdname_index, &fstbuf); if (ReadIntoMemory) { filenames_index_len = fstbuf.st_size; filenames_index_buffer = NULL; if (allocate_and_fill(&filenames_index_buffer, filenames_index_len, name, fdname_index) == -1) { close(fdname_index); if (filenames_buffer != NULL) my_free(filenames_buffer, filenames_len); if (filenames_index_buffer != NULL) my_free(filenames_index_buffer, filenames_index_len); return -1; } close(fdname_index); } sprintf(name, "%s/%s", glimpseindex_dir, NAME_HASH); if ((fdhash = open(name, O_RDONLY, 0)) == -1) { fprintf(stderr, "Cannot open for reading: %s\n", name); fprintf(stderr, "To change formats, the index must be built using `glimpseindex -h ...'\n"); if (!ReadIntoMemory) { close(fdname); close(fdname_index); } else { if (filenames_buffer != NULL) my_free(filenames_buffer, filenames_len); if (filenames_index_buffer != NULL) my_free(filenames_index_buffer, filenames_index_len); } return -1; } fstbuf.st_size = 0; fstat(fdhash, 
&fstbuf); if (ReadIntoMemory) { filehash_len = fstbuf.st_size; filehash_buffer = NULL; if (allocate_and_fill(&filehash_buffer, filehash_len, name, fdhash) == -1) { close(fdhash); if (filenames_buffer != NULL) my_free(filenames_buffer, filenames_len); if (filenames_index_buffer != NULL) my_free(filenames_index_buffer, filenames_index_len); if (filehash_buffer != NULL) my_free(filehash_buffer, filehash_len); return -1; } close(fdhash); } sprintf(name, "%s/%s", glimpseindex_dir, NAME_HASH_INDEX); if ((fdhash_index = open(name, O_RDONLY, 0)) == -1) { fprintf(stderr, "Cannot open for reading: %s\n", name); fprintf(stderr, "To change formats, the index must be built using `glimpseindex -h ...'\n"); if (!ReadIntoMemory) { close(fdname); close(fdname_index); close(fdhash); } else { if (filenames_buffer != NULL) my_free(filenames_buffer, filenames_len); if (filenames_index_buffer != NULL) my_free(filenames_index_buffer, filenames_index_len); if (filehash_buffer != NULL) my_free(filehash_buffer, filehash_len); } return -1; } fstbuf.st_size = 0; fstat(fdhash_index, &fstbuf); if (fstbuf.st_size == MAX_64K_HASH * 4) BigFilenameHashTable = 1; else if (fstbuf.st_size == MAX_4K_HASH * 4) BigFilenameHashTable = 0; else { fprintf(stderr, "Corrupted file: %s\n", name); if (!ReadIntoMemory) { close(fdname); close(fdname_index); close(fdhash); close(fdhash_index); } else { if (filenames_buffer != NULL) my_free(filenames_buffer, filenames_len); if (filenames_index_buffer != NULL) my_free(filenames_index_buffer, filenames_index_len); if (filehash_buffer != NULL) my_free(filehash_buffer, filehash_len); } return -1; } if (ReadIntoMemory) { if (BigFilenameHashTable) filehash_index_len = MAX_64K_HASH * 4; else filehash_index_len = MAX_4K_HASH * 4; /* filehash_index_len = fstbuf.st_size; */ filehash_index_buffer = NULL; if (allocate_and_fill(&filehash_index_buffer, filehash_index_len, name, fdhash_index) == -1) { close(fdhash_index); if (filenames_buffer != NULL) my_free(filenames_buffer, 
filenames_len); if (filenames_index_buffer != NULL) my_free(filenames_index_buffer, filenames_index_len); if (filehash_buffer != NULL) my_free(filehash_buffer, filehash_len); if (filehash_index_buffer != NULL) my_free(filehash_index_buffer, filehash_index_len); return -1; } close(fdhash_index); } /* fprintf(stderr, "file_num=%d, indextype=%d, structuredindex=%d, index_set_size=%d\n", file_num, indextype, structuredindex, index_set_size); */ /* Initialize statistics information */ hash_misses = 0; /* Do actual conversion */ if (!InputFilenames) ret = do_conversion(stdin, stdout, indextype, InputType, OutputType, InputEndian, OutputEndian, index_set, index_set_size, ReadIntoMemory); else { sprintf(outname, "./.wgconvert.%d", getpid()); /* place where converted neighbourhoods are written (./ => same file system as input :-) */ /* convert file by file: if there is an error in converting one file, go to the next one!!! */ num_input_filenames = 0; while (fgets(name, MAX_LINE_LEN, stdin) != NULL) { num_input_filenames ++; name_len = strlen(name); name[name_len - 1] = '\0'; /* Figure out filename and put the -P prefix before it */ lastslash = -1; for (i=0; i= 0) { memcpy(realname, name, lastslash+1); realname[lastslash+1] = '\0'; } else realname[0] = '\0'; strcat(realname, filename_prefix); strcat(realname, &name[lastslash+1]); /* Call do_conversion() and check if it worked OK */ if ((inputfile = fopen(realname, "r")) == NULL) { fprintf(stderr, "Can't open for reading: %s\n", realname); continue; } if ((fstat(fileno(inputfile), &fstbuf) == -1) || (fstbuf.st_size <= 0)) { fprintf(stderr, "Zero sized file: %s\n", realname); fclose(inputfile); continue; } if ((outputfile = fopen(outname, "w")) == NULL) { fprintf(stderr, "Can't open for writing: %s\n", outname); fclose(inputfile); continue; } do_conversion(inputfile, outputfile, indextype, InputType, OutputType, InputEndian, OutputEndian, index_set, index_set_size, ReadIntoMemory); fclose(inputfile); fflush(outputfile); if 
((fstat(fileno(outputfile), &fstbuf) == -1) || (fstbuf.st_size <= 0)) { fprintf(stderr, "Zero sized output for: %s\n", realname); fclose(outputfile); continue; } fclose(outputfile); /* move the converted neighbourhood file into the old neighbourhood file */ #if 1 sprintf(s, "mv -f %s %s", outname, realname); if (system(s) == -1) fprintf(stderr, "Errno=%d -- could not execute: %s\n", errno, s); #else if (rename(outname, realname) == -1) fprintf(stderr, "Errno=%d -- could not rename %s as %s\n", errno, outname, realname); #endif } unlink(outname); ret = 0; } /* Cleanup and return */ if (!ReadIntoMemory) { close(fdname); close(fdname_index); close(fdhash); close(fdhash_index); } else { if (filenames_buffer != NULL) my_free(filenames_buffer, filenames_len); if (filenames_index_buffer != NULL) my_free(filenames_index_buffer, filenames_index_len); if (filehash_buffer != NULL) my_free(filehash_buffer, filehash_len); if (filehash_index_buffer != NULL) my_free(filehash_index_buffer, filehash_index_len); } my_free(index_set, index_set_size*sizeof(int)); #if 1 if (InputFilenames && (InputType == IS_NAMES)) printf("hash_misses=%d num_input_filenames=%d\n", hash_misses, num_input_filenames); #endif return ret; } /*************************************** * Processes options * * Returns 0 on success, -1 on failure * ***************************************/ int main(argc, argv) int argc; char *argv[]; { /* Initialize */ InfoAfterFilename = 0; InputFilenames = 0; ReadIntoMemory = 0; filenames_buffer = filenames_index_buffer = filehash_buffer = filehash_index_buffer = NULL; filename_prefix[0] = '\0'; InputType = 0; OutputType = 0; InputEndian = IS_BIG_ENDIAN; OutputEndian = IS_BIG_ENDIAN; glimpseindex_dir[0] = '\0'; /* Read options (to know what they mean, check usage() below */ while (argc > 1) { if (strcmp(argv[1], "-ni") == 0) { InputType = IS_NAMES; OutputType = IS_INDICES; argc --; argv ++; } else if (strcmp(argv[1], "-in") == 0) { InputType = IS_INDICES; OutputType = IS_NAMES; 
argc --; argv ++; } else if (strcmp(argv[1], "-nb") == 0) { InputType = IS_NAMES; OutputType = IS_BITS; argc --; argv ++; } else if (strcmp(argv[1], "-bn") == 0) { InputType = IS_BITS; OutputType = IS_NAMES; argc --; argv ++; } else if (strcmp(argv[1], "-ib") == 0) { InputType = IS_INDICES; OutputType = IS_BITS; argc --; argv ++; } else if (strcmp(argv[1], "-bi") == 0) { InputType = IS_BITS; OutputType = IS_INDICES; argc --; argv ++; } else if (strcmp(argv[1], "-lo") == 0) { OutputEndian = IS_LITTLE_ENDIAN; argc --; argv ++; } else if (strcmp(argv[1], "-li") == 0) { InputEndian = IS_LITTLE_ENDIAN; argc --; argv ++; } else if (strcmp(argv[1], "-H") == 0) { if (argc == 2) { fprintf(stderr, "-H should be followed by a directory name\n"); return usage(); } strncpy(glimpseindex_dir, argv[2], MAX_LINE_LEN - 1); glimpseindex_dir[MAX_LINE_LEN - 1] = '\0'; /* strncpy does not terminate on truncation */ argc -= 2; argv += 2; } else if (strcmp(argv[1], "-P") == 0) { if (argc == 2) { fprintf(stderr, "-P should be followed by a prefix for filenames\n"); return usage(); } strncpy(filename_prefix, argv[2], MAX_LINE_LEN - 1); filename_prefix[MAX_LINE_LEN - 1] = '\0'; /* strncpy does not terminate on truncation */ argc -= 2; argv += 2; } else if (strcmp(argv[1], "-F") == 0) { InputFilenames = 1; argc --; argv ++; } else if (strcmp(argv[1], "-M") == 0) { ReadIntoMemory = 1; argc --; argv ++; } else if (strcmp(argv[1], "-U") == 0) { InfoAfterFilename = 1; argc --; argv ++; } else { fprintf(stderr, "Invalid option %s\n", argv[1]); return usage(); } } /* Check for errors */ if ((InputType == 0) || (OutputType == 0)) { fprintf(stderr, "Must specify one of: -ib -bi -ni -in -nb -bn\n"); return usage(); } return change_format(InputFilenames, ReadIntoMemory, InputType, OutputType, InputEndian, OutputEndian, glimpseindex_dir, filename_prefix); } /***************************************** * Prints out a help/usage message * * Returns: nothing (exits from program) * *****************************************/ int usage() { fprintf(stderr, "\nusage: wgconvert {-ni,-in,-bn,-nb,-ib,-bi} [-li,-lo] [-F] [-H dir] [-M] [-P prefix] outfile\n"); fprintf(stderr, "`wgconvert' is used to 
change the format of neighbourhood files in `webglimpse'\n"); fprintf(stderr, "To change formats, the index must be built using `glimpseindex -h ...'\n\n"); fprintf(stderr, "There are 3 formats available:\n"); fprintf(stderr, "\t1. Complete path names of files (n)\n"); fprintf(stderr, "\t2. Indices of the files in .glimpse_filenames (i)\n"); fprintf(stderr, "\t3. A bit-mask of total#files bits for files in the neighborhood (b)\n"); fprintf(stderr, "We recommend options 1 or 2 since they are easy to use. To use #3, you must\n"); fprintf(stderr, "specify the proper `endian', since glimpse's -p option reads bits in 4B units.\n\n"); fprintf(stderr, "-ni: input is names file, output is indices file\n"); fprintf(stderr, "-in: input is indices file, output is names file\n"); fprintf(stderr, "-bn: input is bit-field file, output is names file\n"); fprintf(stderr, "-nb: input is names file, output is bit-field file\n"); fprintf(stderr, "-ib: input is indices file, output is bit-field file\n"); fprintf(stderr, "-bi: input is bit-field file, output is indices file\n"); fprintf(stderr, "-li: input bit-field file is little-endian (default big endian)\n"); fprintf(stderr, "-lo: output bit-field file is little-endian (default big endian)\n"); fprintf(stderr, "-F: expect filenames on stdin, not data of an input file\n\tIn this case, this program will convert each filename one by one\n"); fprintf(stderr, "-H dir: glimpse's index is in directory `dir'\n"); fprintf(stderr, "-M: cache some .glimpse* files in memory for speed\n\tUseful with -F when a lot of files are being wgconvert-ed at the same time\n"); fprintf(stderr, "-P prefix: prefix for filenames when -F option is used\n\tIf file=`/a/b.html', with `-P .nh.', wgconvert will access `/a/.nh.b.html'\n"); fprintf(stderr, "\nFor questions about wgconvert, please contact: `%s'\n", GLIMPSE_EMAIL); exit(2); return -1; /* so that the compiler doesn't cry */ } 
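/*
 * Illustration only, not part of wgconvert: the usage text above says the
 * bit-mask format holds one bit per indexed file and that glimpse's -p option
 * reads it in 4-byte units, which is why -li/-lo exist. The sketch below shows
 * one way a list of file indices could be packed into such a mask for either
 * byte order. The function name sketch_indices_to_bitmask and the bit
 * numbering (file i maps to bit i%32 of unit i/32, counted from the least
 * significant bit) are assumptions made for this example; the real layout is
 * defined by the conversion code earlier in this file and by mask_int[].
 */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: pack file indices into a bit mask made of 4-byte units.
 * Returns the number of bytes written, or 0 if the arguments are unusable.
 */
size_t sketch_indices_to_bitmask(const int *indices, size_t num_indices,
                                 int total_files, int little_endian,
                                 unsigned char *out, size_t out_len)
{
    size_t num_bytes, i;

    if (total_files <= 0) return 0;
    /* round the mask up to a whole number of 4-byte units */
    num_bytes = (((size_t)total_files + 31) / 32) * 4;
    if (out == NULL || out_len < num_bytes) return 0;
    memset(out, 0, num_bytes);

    for (i = 0; i < num_indices; i++) {
        size_t idx, unit, bit, byte_in_unit, pos;

        /* silently skip indices outside the index (assumed policy) */
        if (indices[i] < 0 || indices[i] >= total_files) continue;
        idx = (size_t)indices[i];
        unit = idx / 32;            /* which 4-byte unit holds this file */
        bit = idx % 32;             /* bit inside that unit, 0 = least significant */
        byte_in_unit = bit / 8;     /* 0 = least significant byte of the unit */
        /* big-endian stores the most significant byte first */
        pos = little_endian ? byte_in_unit : 3 - byte_in_unit;
        out[unit * 4 + pos] |= (unsigned char)(1u << (bit % 8));
    }
    return num_bytes;
}
```

/*
 * The same set of indices yields different byte sequences for the two byte
 * orders, so a mask written with -lo must be read back with -li.
 */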
glimpse-4.18.7/index/dir.c000066400000000000000000000421141300371307100153010ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ /* ./glimpse/index/dir.c */ /* The function of the program is to traverse the directory tree and print the size of the files in the tree. This program is derived from the C-programming language book. It opens a directory file using the opendir system call, and uses readdir() to read each entry of the directory. */ #include "autoconf.h" /* ../libtemplate/include */ #include #include #if HAVE_DIRENT_H # include <dirent.h> # define NAMLEN(dirent) strlen((dirent)->d_name) #else # define dirent direct # define NAMLEN(dirent) (dirent)->d_namlen # if HAVE_SYS_NDIR_H # include <sys/ndir.h> # endif # if HAVE_SYS_DIR_H # include <sys/dir.h> # endif # if HAVE_NDIR_H # include <ndir.h> # endif #endif #include #include #define BUFSIZE 256 #define DIRSIZE 14 #include "glimpse.h" #undef MAX_LIST #define MAX_LIST 100000 /* Removed on 16/Feb/1996 because changed type returned by lib_fstat to S_IFLNK #if SFS_COMPAT #define FS_TYPEMASK 0x700000 #define FS_LINK 0x300000 #endif */ extern FILE *TIMEFILE; #if BG_DEBUG extern FILE *LOGFILE; #endif /*BG_DEBUG*/ extern FILE *MESSAGEFILE; int ndx = 0; /* file index */ extern char **name_list[MAXNUM_INDIRECT]; /* store the file names */ extern int *size_list[MAXNUM_INDIRECT]; /* store the sizes of the files */ extern unsigned int *disable_list; /* store whether to DISABLE indexing or not: only with FastIndex or AddToIndex */ extern int file_num; extern int file_id; /* borrowed from filetype.c */ extern char INDEX_DIR[MAX_LINE_LEN]; extern int AddToIndex; extern int DeleteFromIndex; extern int FastIndex; extern int OneFilePerBlock; extern int IncludeHigherPriority; extern int BuildDictionaryExisting; extern int IndexEverything; extern int printed_warning; extern int SortByTime; extern int p_table[]; extern FILE *STATFILE; extern int ExtractInfo; extern int IndexableFile; extern int files_per_partition; extern
int new_partition; extern int files_in_partition; extern struct stat istbuf; /* imported from glimpse.c */ extern int memory_usage; extern int mask_int[]; extern char exin_argv[8]; extern int exin_argc; extern char current_dir_buf[2*MAX_LINE_LEN + 4]; /* must have space to store pattern after directory name */ extern unsigned char dummypat[MAX_PAT]; extern int dummylen; extern FILE *dummyout; extern struct stat excstbuf; extern struct stat incstbuf; extern struct stat filstbuf; extern int num_filter; extern int filter_len[MAX_FILTER]; extern CHAR *filter[MAX_FILTER]; extern CHAR *filter_command[MAX_FILTER]; /* * Exclude/Include priorities with exclude > include (IncludeHigherPriority = OFF = default): * 1. Command line arguments (inclusion --> exclude list is never applied) * 2. Exclude list (exclusion) * 3. Include list (inclusion) * 4. Symbolic links (exclusion) * 5. filter processing (inclusion --> so that binary files that can be filtered are not excluded) * 6. filetype (exclusion) * * Exclude/Include priorities with include > exclude (IncludeHigherPriority = ON = -i): * 1. Command line arguments (inclusion --> exclude list is never applied) * 2. Include list (inclusion) * 3. Symbolic links (exclusion --> applying exclude list is unnecessary: optimization) * 4. Exclude list (exclusion) * 5. filter processing (inclusion --> so that binary files that can be filtered are not excluded) * 6. 
filetype (exclusion) */ char outname[MAX_LINE_LEN]; char inname[MAX_LINE_LEN]; fsize(name, pat, pat_len, num_pat, inc, inc_len, num_inc, toplevel) char *name; char **pat; int *pat_len; int num_pat; char **inc; int *inc_len; int num_inc; int toplevel; { struct stat stbuf; int i; int fileindex = -1; int force_include = 0; int len_current_dir_buf = strlen(current_dir_buf) + 1; /* includes the '\0' which is going to be replaced by '\n' below */ int name_len; char *t1; char xinfo[MAX_LINE_LEN], temp[MAX_LINE_LEN]; int xinfo_len = 0; if ((name == NULL) || (*name == '\0')) return 0; name_len = strlen(name); /* name[name_len] is '\0' */ #ifdef SW_DEBUG printf("num_pat= %d num_inc= %d\n", num_pat, num_inc); printf("name= %s\n", name); #endif /* * Find out what to exclude, what to include and skip * over symbolic links that don't HAVE to be included. * Some Extra get_filename_index calls are done but * that won't cost you anything (just #ing twice). */ /* Check if cache set in glimpse.c is correct */ if (!IndexableFile && !DeleteFromIndex && FastIndex && ((fileindex = get_filename_index(name, name_list, file_num)) != -1) && (disable_list[block2index(fileindex)] & mask_int[fileindex % (8*sizeof(int))])) { if (num_pat <= 0) { if (num_inc <= 0) return 0; else if (incstbuf.st_ctime <= istbuf.st_ctime) return 0; } else { if (num_inc <= 0) { if (excstbuf.st_ctime <= istbuf.st_ctime) return 0; } else if ((excstbuf.st_ctime <= istbuf.st_ctime) && (incstbuf.st_ctime <= istbuf.st_ctime)) return 0; } } #define PROCESS_EXIT \ {\ if (AddToIndex || FastIndex || DeleteFromIndex) {\ if ((fileindex = get_filename_index(name, name_list, file_num)) != -1) \ remove_filename(fileindex, new_partition);\ }\ } #define PROCESS_EXCLUDE \ {\ if (!toplevel) for(i=0; i<num_pat; i++) {\ if (pat_len[i] > 0) {\ name[name_len] = '\0';\ if (strstr(name, (const char *)pat[i]) != NULL) {\ PROCESS_EXIT;\ return 0;\ }\ }\ else { /* must call memagrep */\ int ret;\ name[name_len] = '\n'; /* memagrep wants names to end with '\n': '\0' is not
necessary */\ /* printf("i=%d patlen=%d pat=%s inlen=%d input=%s\n", i, -pat_len[i], pat[i], len_current_dir_buf, current_dir_buf); */\ if (((pat_len[i] == -2) && (pat[i][0] == '.') && (pat[i][1] == '*')) ||\ ((ret = memagrep_search(-pat_len[i], pat[i], len_current_dir_buf, current_dir_buf, 0, dummyout)) > 0))\ {\ /* printf("excluding with %d %s\n", ret, name); */\ name[name_len] = '\0'; /* restore */\ PROCESS_EXIT;\ return 0; \ }\ /* else printf("ret=%d\n");*/\ }\ }\ name[name_len] = '\0';\ } #define PROCESS_INCLUDE \ {\ /*\ * When include has higher priority, we want to include directories\ * by default and match the include patterns only against filenames.\ * Based on bug reports for glimpse-2.1. bg: 2/mar/95.\ */\ if (IncludeHigherPriority && ((stbuf.st_mode & S_IFMT) == S_IFDIR)) force_include = 1;\ else for (i=0; i<num_inc; i++) {\ if (inc_len[i] > 0) {\ name[name_len] = '\0';\ if (strstr(name, (const char *)inc[i]) != NULL) {\ force_include = 1;\ break;\ }\ }\ else { /* must call memagrep */\ name[name_len] = '\n'; /* memagrep wants names to end with '\n': '\0' is not necessary */\ /* printf("pat=%s input=%s\n", pat[i], current_dir_buf); */\ if (((inc_len[i] == -2) && (inc[i][0] == '.') && (inc[i][1] == '*')) ||\ (memagrep_search(-inc_len[i], inc[i], len_current_dir_buf, current_dir_buf, 0, dummyout) > 0))\ {\ force_include = 1;\ break;\ }\ }\ }\ name[name_len] = '\0'; /* restore */\ if (toplevel) force_include = 1;\ } #define PROCESS_FILTER \ {\ /*\ * Filters should be processed independent of .include since they might have to be\ * excluded first. However, they must be processed before filetype since legitimate\ * files like *.Z might be excluded by it.
Based on bug reports for glimpse-3.5: bg: 11/Apr/96.\ */\ if (!force_include) for (i=0; i<num_filter; i++) {\ if (filter_len[i] > 0) {\ name[name_len] = '\0';\ if (strstr(name, (const char *)filter[i]) != NULL) {\ force_include = 1;\ break;\ }\ }\ else { /* must call memagrep */\ name[name_len] = '\n'; /* memagrep wants names to end with '\n': '\0' is not necessary */\ /* printf("pat=%s input=%s\n", pat[i], current_dir_buf); */\ if (((filter_len[i] == -1) && (filter[i][0] == '.') && (filter[i][1] == '*')) ||\ (memagrep_search(-filter_len[i], filter[i], len_current_dir_buf, current_dir_buf, 0, dummyout) > 0))\ {\ force_include = 1;\ break;\ }\ }\ }\ name[name_len] = '\0'; /* restore */\ } if(my_lstat(name, &stbuf) == -1) { if (IndexableFile) return 0; /* Can happen for command line arguments, not stuff obtained from fsize_directory() */ #if BG_DEBUG fprintf(LOGFILE, "cannot find %s -- not indexing\n", name); #endif /*BG_DEBUG*/ PROCESS_EXIT; return 0; } /* Else lstat has all the requisite information */ /* Removed on 16/Feb/1996 because changed type returned by lib_fstat to S_IFLNK #if SFS_COMPAT if ((stbuf.st_spare1 & FS_TYPEMASK) == FS_LINK) return 0; #endif */ if ((stbuf.st_mode & S_IFMT) == S_IFLNK) { /* if (IndexableFile) return 0; ---> not correct! must process include/exclude with -I too */ PROCESS_INCLUDE; if (!force_include) { #if BG_DEBUG fprintf(LOGFILE, "%s is a symbolic link -- not indexing\n", name); #endif /*BG_DEBUG*/ PROCESS_EXIT; return 0; } if (-1 == my_stat(name, &stbuf)) { #if BG_DEBUG fprintf(LOGFILE, "cannot find target of symbolic link %s -- not indexing\n", name); #endif /*BG_DEBUG*/ PROCESS_EXIT; return 0; } } else /* if (!IndexableFile) ---> not correct! must process include/exclude with -I too */ { /* Put exclude include processing here... stat all the time: that is faster than former!
*/ if (FastIndex && ((fileindex = get_filename_index(name, name_list, file_num)) != -1)) { /* Don't process exclude/include if the file `name' is older then the index AND the exclude/include file is older then the index */ if (IncludeHigherPriority) { if (!((stbuf.st_ctime <= istbuf.st_ctime) && (incstbuf.st_ctime <= istbuf.st_ctime))) PROCESS_INCLUDE; if (!force_include && !((stbuf.st_ctime <= istbuf.st_ctime) && (excstbuf.st_ctime <= istbuf.st_ctime))) PROCESS_EXCLUDE; } else { if (!((stbuf.st_ctime <= istbuf.st_ctime) && (excstbuf.st_ctime <= istbuf.st_ctime))) PROCESS_EXCLUDE; if (!((stbuf.st_ctime <= istbuf.st_ctime) && (incstbuf.st_ctime <= istbuf.st_ctime))) PROCESS_INCLUDE; } if (!((stbuf.st_ctime <= istbuf.st_ctime) && (filstbuf.st_ctime <= istbuf.st_ctime))) PROCESS_FILTER; } else { /* Either AddToIndex or fresh indexing or previously excluded file: process exclude and include */ if (IncludeHigherPriority) { PROCESS_INCLUDE; if (!force_include) PROCESS_EXCLUDE; } else { PROCESS_EXCLUDE; PROCESS_INCLUDE; } PROCESS_FILTER; } } /* Here, the file exists and has not been excluded -- possibly has been included */ index_everything: if ((stbuf.st_mode & S_IFMT) == S_IFDIR) { if (-1 == fsize_directory(name, pat, pat_len, num_pat, inc, inc_len, num_inc)) return -1; } else if ((stbuf.st_mode & S_IFMT) == S_IFREG) { /* regular file */ if (IndexableFile) { if (!filetype(name, IndexEverything?2:1, NULL, NULL)) printf("%s\n", name); return 0; } if (DeleteFromIndex) { if ((fileindex = get_filename_index(name, name_list, file_num)) != -1) { remove_filename(fileindex, new_partition); } /* else doesn't exist in index, so doesn't matter */ return 0; } file_id ++; if (BuildDictionaryExisting) { /* Don't even store the names of the files that are not uncompressible */ if (file_num >= MaxNum24bPartition) { fprintf(stderr, "Too many files in index: indexing the first %d only.\n", MaxNum24bPartition); return -1; } if (tuncompress_file(name, outname, TC_EASYSEARCH | TC_OVERWRITE | 
TC_NOPROMPT) <= 0) return 0; file_num++; t1 = (char *) my_malloc(strlen(outname) + 2); strcpy(t1, outname); /* name_list[ndx] = t1; */ LIST_ADD(name_list, ndx, t1, char*); /* size_list[ndx] = stbuf.st_size;*/ LIST_ADD(size_list, ndx, stbuf.st_size, int); ndx ++; return 0; } #ifdef SW_DEBUG printf("%s: ", name); #endif if (AddToIndex || FastIndex) { if ((fileindex = get_filename_index(name, name_list, file_num)) != -1) { LIST_ADD(size_list, fileindex, stbuf.st_size, int); if (FastIndex && (stbuf.st_ctime <= istbuf.st_ctime)) disable_list[block2index(fileindex)] |= mask_int[fileindex % (8*sizeof(int))]; else { /* AddToIndex or file was modified (=> its type might have changed!) */ if (filetype(name, IndexEverything?2:1, &xinfo_len, xinfo)) { if (!force_include) { remove_filename(fileindex, new_partition); return 0; } else { #if BG_DEBUG fprintf(LOGFILE, "overriding and indexing: %s\n", name); #endif /*BG_DEBUG*/ } } if (ExtractInfo && (xinfo_len > 0)/* && (special_get_name(name, name_len, temp) != -1) NOT NEEDED since name is got from UNIX */) { my_free(LIST_SUREGET(name_list, fileindex)); t1 = (char *)my_malloc(strlen(name) + xinfo_len + 3); strcpy(t1, name); strcat(t1, " "); strcat(t1, xinfo); LIST_ADD(name_list, fileindex, t1, char*); change_filename(name, name_len, fileindex, t1); } disable_list[block2index(fileindex)] &= ~(mask_int[fileindex % (8*sizeof(int))]); } } else { /* new file not in filenames so no point in checking */ if(filetype(name, IndexEverything?2:1, &xinfo_len, xinfo)) { if (!force_include) return 0; else { #if BG_DEBUG fprintf(LOGFILE, "overriding and indexing: %s\n", name); #endif /*BG_DEBUG*/ } } if (file_num >= MaxNum24bPartition) { fprintf(stderr, "Too many files in index: indexing the first %d only.\n", MaxNum24bPartition); return -1; } if (ExtractInfo && (xinfo_len > 0)) { t1 = (char *)my_malloc(strlen(name) + xinfo_len + 3); strcpy(t1, name); strcat(t1, " "); strcat(t1, xinfo); } else { t1 = (char *)my_malloc(strlen(name) + 2); 
strcpy(t1, name); } /* name_list[file_num] = t1; */ LIST_ADD(name_list, file_num, t1, char*); /* size_list[file_num] = stbuf.st_size; */ LIST_ADD(size_list, file_num, stbuf.st_size, int); insert_filename(LIST_GET(name_list, file_num), file_num); file_num ++; if (!OneFilePerBlock) { if (files_in_partition + 1 > files_per_partition) { if (new_partition + 1 > MaxNumPartition) { if (!printed_warning) { printed_warning = 1; if (AddToIndex) { fprintf(MESSAGEFILE, "Warning: partition-table overflow! Fresh indexing recommended.\n"); } else { fprintf(MESSAGEFILE, "Warning: partition-table overflow! Commencing fresh indexing...\n"); return -1; } } } else new_partition++; files_in_partition = 0; /* so that we don't get into this if-branch until another files_per_partition new files are seen */ } p_table[new_partition] = file_num; files_in_partition ++; } } } else { /* Fresh indexing: very simple -- add everything */ if(filetype(name, IndexEverything?2:1, &xinfo_len, xinfo)) { if (!force_include) return 0; else { #if BG_DEBUG fprintf(LOGFILE, "overriding and indexing: %s\n", name); #endif /*BG_DEBUG*/ } } if (file_num >= MaxNum24bPartition) { fprintf(stderr, "Too many files in index: indexing the first %d only.\n", MaxNum24bPartition); return -1; } if (SortByTime) fprintf(TIMEFILE, "%ld %d\n", stbuf.st_mtime, file_num); file_num++; if (ExtractInfo && (xinfo_len > 0)) { t1 = (char *)my_malloc(strlen(name) + xinfo_len + 3); strcpy(t1, name); strcat(t1, " "); strcat(t1, xinfo); } else { t1 = (char *) my_malloc(strlen(name) + 2); strcpy(t1, name); } /* name_list[ndx] = t1; */ LIST_ADD(name_list, ndx, t1, char*); /* size_list[ndx] = stbuf.st_size; */ LIST_ADD(size_list, ndx, stbuf.st_size, int); ndx++; } } return 0; } /* uses the space in the same "name" to get names of files in that directory and calls fsize */ /* pat, pat_len, num_pat, inc, inc_len, num_inc are just used for recursive calls to fsize */ /* special_get_name() doesn't have to be done since glimpseindex indexes just
files, not directories, so dir's have no URL information, etc. */ fsize_directory(name, pat, pat_len, num_pat, inc, inc_len, num_inc) char *name; char **pat; int *pat_len; int num_pat; char **inc; int *inc_len; int num_inc; { struct dirent *dp; char *nbp, *nep; int i; DIR *dirp; /* printf("in fsize_directory, name= %s\n",name); */ if ((name == NULL) || (*name == '\0')) return 0; nbp = name + strlen(name); if( nbp+DIRSIZE+2 >= name+BUFSIZE ) /* name too long */ { fprintf(stderr, "name too long: %s\n", name); return 0; } if((dirp = opendir(name)) == NULL) { fprintf(stderr, "permission denied or non-existent directory: %s\n", name); return 0; } *nbp++ = '/'; for (dp = readdir(dirp); dp != NULL; dp = readdir(dirp)) { if (dp->d_name[0] == '\0' || strcmp(dp->d_name, ".") == 0 || strcmp(dp->d_name, "..") == 0) continue; for(i=0, nep=nbp; (dp->d_name[i] != '\0') && (nep < name+BUFSIZ-1); i++) *nep++ = dp->d_name[i]; if (dp->d_name[i] != '\0') { *nep = '\0'; fprintf(stderr, "name too long: %s\n", name); continue; } *nep = '\0'; /* printf("name= %s\n", name); */ if (-1 == fsize(name, pat, pat_len, num_pat, inc, inc_len, num_inc, 0)) return -1; } closedir (dirp); *--nbp = '\0'; /* restore name */ return 0; } glimpse-4.18.7/index/filetype.c000066400000000000000000000233611300371307100163470ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ /* ./glimpse/index/filetype.c */ /* -------------------------------------------------------------------------- this function detect whether a given file is of special type which we do not want to index. if so, then return(1) else return (0). a file is said to be binary if more than 10% of character > 128 in the sampled input. a file is a uuencoded file if (maybe after mail header), there is a "begin" followed by 3 digits, and no lower case character. statistics we are concerned of: 1) average word length: should not be greater than 10. 
2) index density: (the number of different words v.s. number of words). -----------------------------------------------------------------------------*/ #include "glimpse.h" #define SAMPLE_SIZE 8192 #define EXTRACT_SAMPLE_SIZE (MAX_LINE_LEN*2) /* must be lesser than above: used to get info to be stored ALONG with filename */ /* suggested fix: ldrolez@usa.net */ #define WORD_THRESHOLD 18 /* the ratio between number of characters and delimiters (blanks or \n) above which the file is determined to be hqx or other non-natural language text */ #if BG_DEBUG extern FILE *LOGFILE; #endif /*BG_DEBUG*/ char *member[MAX_4K_HASH]; int member_tag[MAX_4K_HASH]; int file_id; extern char *getword(); extern char INDEX_DIR[MAX_LINE_LEN]; extern int ExtractInfo; extern int InfoAfterFilename; char *extract_info_suffix[] = EXTRACT_INFO_SUFFIX; /* * dosuffix > 0 => processes suffixes (build_in.c after filtering); * dosuffix > 0 but != 1 => processes suffixes only (IndexEverything, dir.c where we don't want to read files); * dosuffix == 0 => processes other ad-hoc file checks (Default, dir.c where we want to discard un-indexable files). */ int filetype(name, dosuffix, xinfo_len, xinfo) char *name; int dosuffix; int *xinfo_len; /* length of information extracted */ char xinfo[MAX_LINE_LEN]; /* atmost 1K info can be extracted */ { unsigned char buffer[SAMPLE_SIZE+1]; int num_read; int BINARY=0; int UUENCODED=0; int fd; int i, name_len = strlen(name); int extract_only = 0; char name_buffer[MAX_LINE_LEN]; char *tempname; if (InfoAfterFilename || ExtractInfo) { special_get_name(name, name_len, name_buffer); tempname = name_buffer; } else tempname = name; name_len = strlen(tempname); /* printf("\tname=%s dosuffix=%d xinfo_len=%x *=%d\n", tempname, dosuffix, xinfo_len, (xinfo_len == NULL) ? 
-1 : *xinfo_len); */ if (xinfo_len != NULL) *xinfo_len = 0; if (!dosuffix) goto nosuffix; if (!strcmp(COMP_SUFFIX, &tempname[name_len-strlen(COMP_SUFFIX)])) return 0; if (test_special_suffix(tempname)) { /* printf("\t\tspecial suffix \n"); */ #if BG_DEBUG fprintf(LOGFILE, "special suffix: %s -- not indexing\n", name); #endif /*BG_DEBUG*/ return 1; } if (dosuffix != 1) { if (!ExtractInfo || (xinfo_len == NULL) || (xinfo == NULL)) return 0; extract_only = 1; } nosuffix: if((fd = my_open(tempname, 0)) < 0) { /* This is the only thing the user might want to know: suppress other warnings */ fprintf(stderr, "permission denied or non-existent file: %s\n", name); return(1); } if ((num_read = read(fd, buffer, extract_only?EXTRACT_SAMPLE_SIZE:SAMPLE_SIZE)) <= 0) { #if BG_DEBUG fprintf(LOGFILE, "no data: %s -- not indexing\n", name); #endif /*BG_DEBUG*/ close(fd); return 1; } if (extract_only) goto extract; if (test_postscript(buffer, num_read)) { /* printf("\t\tpostscript\n"); */ #if BG_DEBUG fprintf(LOGFILE, "postscript file: %s -- not indexing\n", name); #endif /*BG_DEBUG*/ close(fd); return 1; } BINARY = test_binary(buffer, num_read); if(BINARY == ON) { /* printf("\t\tbinary\n"); */ #if BG_DEBUG fprintf(LOGFILE, "binary file: %s -- not indexing\n", name); #endif /*BG_DEBUG*/ close(fd); return(1); } /* now check for uuencoded file */ UUENCODED = test_uuencode(buffer, num_read); if(UUENCODED == ON) { /* printf("\t\tuuencoded\n"); */ #if BG_DEBUG fprintf(LOGFILE, "uuencoded file: %s -- not indexing\n", name); #endif /*BG_DEBUG*/ close(fd); return(1); } if(heavy_index(tempname, buffer, num_read)) { /* printf("\t\theavy_index\n"); */ #if BG_DEBUG fprintf(LOGFILE, "heavy index file: %s -- not indexing\n ", name); #endif /*BG_DEBUG*/ close(fd); return(1); } if(hqx(tempname, buffer, num_read)) { /* printf("\t\thqx\n"); */ #if BG_DEBUG fprintf(LOGFILE, "too few real words: %s -- not indexing\n", name); #endif /*BG_DEBUG*/ close(fd); return(1); } extract: if (ExtractInfo && 
(xinfo_len != NULL) && (xinfo != NULL)) { /* This can be replaced by checks for in the file somewhere, but suffixes are faster and easier and enough in most cases */ for (i=0; i %s\n", name, xinfo); */ return k; } k = 0; xinfo[k++] = FILE_END_MARK; /* We need xinfo to start with a dividing character, usually space or tab */ /* There was a hard to find off-by-one error here that caused random crashes. We need an extra byte for the terminating 0, so loop only up to max_len - 1. - CV 01/10/00 */ while ((i %s\n", name, xinfo); */ return k; } xinfo[k] = '\0'; /* printf("-X on %s --> %s\n", name, xinfo); */ return k; } /* ---------------------------------------------------------------------- check for heavy index file. the function first test block 1 (of SAMPLE_SIZE bytes). the file is determined to be heavy index file if index_ratio > 0.9 and num_words > 500 ??? ---------------------------------------------------------------------- */ heavy_index(name, buffer, num_read) char *name; char *buffer; int num_read; { char *buffer_end; int hash_value; int new_word_num=0; int word_num=0; char word[256]; buffer_end = &buffer[num_read]; while((buffer = getword(name, word, buffer, buffer_end, NULL, NULL)) < buffer_end) { if(word[0] == '\0') continue; word_num++; hash_value = hash4k(word, strlen(word)); if(member_tag[hash_value] != file_id) { new_word_num++; member_tag[hash_value] = file_id; } } if(new_word_num * 100 >= word_num * 83 && word_num >= 500) return(1); #ifdef debug printf("%s: new_word_num=%d, word_num=%d\n", name, new_word_num, word_num); #endif return(0); } /* ---------------------------------------------------------------------- check for hqx encoded files or other files with long lines, for example, postscript files, core files, and others. the function first test block 1 (of SAMPLE_SIZE bytes). the file is determined to be bad if the ratio of blanks or newlines is too small. 
---------------------------------------------------------------------- */ hqx(name, buffer, num_read) char *name; char *buffer; int num_read; { int i; char c; int sep=0; if (num_read < 2048) return(0) ; for (i=0; i < num_read ; i++) { c=buffer[i]; if (c == '\n' || c == ' ' || c == '/') sep++; /* the '/' is for list of file names. */ /* the \n is for lists of words, but should be excluded really so that dictionaries are excluded */ } if (!sep) return(1); if (num_read/sep > WORD_THRESHOLD) return(1); else return(0); } glimpse-4.18.7/index/fixname.c000066400000000000000000000006161300371307100161530ustar00rootroot00000000000000#include main() { int offset = 0; char buffer[1024]; fgets(buffer, 1024, stdin); /* skip over num. of file names */ offset += strlen(buffer); while (fgets(buffer, 1024, stdin) != NULL) { putc((offset & 0xff000000) >> 24, stdout); putc((offset & 0xff0000) >> 16, stdout); putc((offset & 0xff00) >> 8, stdout); putc((offset & 0xff), stdout); offset += strlen(buffer); } } glimpse-4.18.7/index/getword.c000066400000000000000000000231441300371307100162000ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. 
*/ /* ./glimpse/index/getword.c */ #include "glimpse.h" extern FILE *MESSAGEFILE; extern int NextICurrentFileOffset, ICurrentFileOffset; int StructuredIndex = 0; extern int RecordLevelIndex; extern int StoreByteOffset; extern int rdelim_len; extern char rdelim[MAX_LINE_LEN]; int WORD_TOO_LONG = 0; int IndexNumber = 0; int CountWords = 0; int InterpretSpecial = 0; int indexable_char[256]; int GMAX_WORD_SIZE = MAX_WORD_SIZE; int PrintedLongWordWarning = 0; #define ALL_LOWER 0 /* default, what you start with: all are possible */ #define FIRST_UPPER 1 /* only first one seen is upper: 0 is impossible */ #define ALL_UPPER 2 /* all seen so far are upper: 2 and 3 are possible */ #define MIXED 3 /* neither of the above 3 */ #define ALPHANUM 1 #define ALPHAONLY 2 #define NUMONLY 3 /* ------------------------------------------------------------------------- getword(): get a word from stream pointed to by buffer. a word is a string of alpha-numeric characters. After the word is gotten, return a new pointer that points to a alpha-numeric character. For the first call to such function when the first character is not a alpha-numeric character, getword() only adjust the pointer to point to a alpha-numeric character. --------------------------------------------------------------------------*/ unsigned char *getword(filename, word, buffer, buffer_end, pattr, next_record) unsigned char *filename; unsigned char *word; unsigned char *buffer; unsigned char *buffer_end; int *pattr; unsigned char **next_record; { int word_length=0; unsigned char c, *wp=word; unsigned char *oldword=word; unsigned char *old_buffer = buffer; int previslsq = 0; int withinsq = 0; static int times = 0; if (!RecordLevelIndex) ICurrentFileOffset = NextICurrentFileOffset; if (pattr != NULL) *pattr = 0; if (CountWords) { /* don't convert case, ignore special, don't bother about offsets. 
*/ unsigned char *temp_buffer; int flag = ALL_LOWER; for(temp_buffer = buffer; (temp_buffer - buffer < GMAX_WORD_SIZE) && (temp_buffer < buffer_end); temp_buffer ++) { if (!INDEXABLE(*temp_buffer)) break; if (isupper(*temp_buffer)) { if (flag == ALL_LOWER) { if (temp_buffer == buffer) flag = FIRST_UPPER; else { flag = MIXED; break; } } else if (flag == FIRST_UPPER) { if (temp_buffer == buffer + 1) flag = ALL_UPPER; else { flag = MIXED; break; } } else continue; /* must be ALL_UPPER -> let it remain so */ } else if (islower(*temp_buffer)) { if (flag == ALL_LOWER) continue; else if (flag == FIRST_UPPER) continue; else if (flag == ALL_UPPER) { flag = MIXED; break; } } /* else, not alphabet: ignore */ } if (flag == MIXED) { /* discard mixed words since they cannot be indexed */ word[0] = '\0'; if (IndexNumber) while(isalnum(*temp_buffer++)); else while(isalpha(*temp_buffer++)); return temp_buffer; } while(buffer < buffer_end) { if(INDEXABLE(*buffer)) { *word++ = *buffer ++; word_length++; } else { while((buffer< buffer_end) && !(INDEXABLE(*buffer))) buffer++; break; } if(word_length > GMAX_WORD_SIZE) { word = wp; WORD_TOO_LONG = ON; while((buffer < buffer_end) && INDEXABLE(*buffer)) buffer++; /* skip current long word */ break; } } } else { /* convert case, maybe interpret special */ while(buffer < buffer_end) { if (INDEXABLE(*buffer) || withinsq) { /* if (!RecordLevelIndex) ICurrentFileOffset is in the right place */ if (*buffer == '[') { previslsq = 1; withinsq = 1; } else { previslsq = 0; if (*buffer == ']') withinsq = 0; } if ((*buffer == '-') && !withinsq) { /* terminate word here */ buffer ++; if (!RecordLevelIndex) ICurrentFileOffset ++; else if ((next_record != NULL) && (buffer >= *next_record)) { *next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); if (StoreByteOffset) ICurrentFileOffset += *next_record - buffer; else ICurrentFileOffset ++; } break; } if (isupper(*buffer)) *word++ = tolower(*buffer++); else *word++ = *buffer++; if 
(RecordLevelIndex && (next_record != NULL) && (buffer >= *next_record)) { *next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); if (StoreByteOffset) ICurrentFileOffset += *next_record - buffer; else ICurrentFileOffset ++; } word_length++; } else if (INDEXABLE('[') && (*buffer == '^') && previslsq) { *word ++ = *buffer ++; word_length ++; if (RecordLevelIndex && (next_record != NULL) && (buffer >= *next_record)) { *next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); if (StoreByteOffset) ICurrentFileOffset += *next_record - buffer; else ICurrentFileOffset ++; } previslsq = 0; } else { /* !INDEXABLE character */ previslsq = 0; if (InterpretSpecial && (*buffer == '\\')) { /* skip two things AND terminate word HERE */ if (buffer < buffer_end - 1) { buffer += 2; if (word_length <= 0) { if (RecordLevelIndex) { buffer -= 2; buffer ++; if ((next_record != NULL) && (buffer >= *next_record)) { *next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); if (StoreByteOffset) ICurrentFileOffset += *next_record - buffer; else ICurrentFileOffset ++; } buffer ++; if ((next_record != NULL) && (buffer >= *next_record)) { *next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); if (StoreByteOffset) ICurrentFileOffset += *next_record - buffer; else ICurrentFileOffset ++; } } else { ICurrentFileOffset += 2; } } } else if (buffer < buffer_end) { buffer ++; if (word_length <= 0) { if (RecordLevelIndex && (next_record != NULL) && (buffer >= *next_record)) { *next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); if (StoreByteOffset) ICurrentFileOffset += *next_record - buffer; else ICurrentFileOffset ++; } else { ICurrentFileOffset ++; } } } } else { if (word_length <= 0) while((buffer < buffer_end) && !(INDEXABLE(*buffer))) { buffer++; if (!RecordLevelIndex) ICurrentFileOffset ++; else if ((next_record != NULL) && (buffer >= *next_record)) { *next_record = forward_delimiter(buffer, 
buffer_end, rdelim, rdelim_len, 0); if (StoreByteOffset) ICurrentFileOffset += *next_record - buffer; else ICurrentFileOffset ++; } } /* else, it is a !INDEXABLE character AFTER seeing a legit word: terminate right here and let next loop worry about it */ /* --> better offset computation; should if() be for ByteLevelIndex???? ZZZZZZZZZZZZZZZZZZZZZZZZZZZZZ */ else if (RecordLevelIndex) break; else while((buffer < buffer_end) && !(INDEXABLE(*buffer))) { buffer++; if (RecordLevelIndex && (next_record != NULL) && (buffer >= *next_record)) { *next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); if (StoreByteOffset) ICurrentFileOffset += *next_record - buffer; else ICurrentFileOffset ++; } } } break; } if(word_length > GMAX_WORD_SIZE) { word = wp; WORD_TOO_LONG = ON; while((buffer < buffer_end) && INDEXABLE(*buffer)) { buffer++; /* skip current long word */ if (RecordLevelIndex && (next_record != NULL) && (buffer >= *next_record)) { *next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); if (StoreByteOffset) ICurrentFileOffset += *next_record - buffer; else ICurrentFileOffset ++; } } break; } } if (RecordLevelIndex && (next_record != NULL) && (buffer >= *next_record)) { *next_record = forward_delimiter(buffer, buffer_end, rdelim, rdelim_len, 0); if (StoreByteOffset) ICurrentFileOffset += *next_record - buffer; else ICurrentFileOffset ++; } } if(WORD_TOO_LONG) { c = wp[GMAX_WORD_SIZE]; wp[GMAX_WORD_SIZE] = '\0'; if (!PrintedLongWordWarning) { fprintf(MESSAGEFILE, "Warning: ignoring very long word '%s' (with > %d chars) in %s\n", oldword, GMAX_WORD_SIZE, filename); PrintedLongWordWarning = 1; } wp[GMAX_WORD_SIZE] = c; *wp = '\0'; } *word = '\0'; WORD_TOO_LONG = 0; if ((pattr != NULL) && (word_length > 0) && (StructuredIndex)) *pattr = region_identify(ICurrentFileOffset, 0); if (!RecordLevelIndex) NextICurrentFileOffset += (buffer <= old_buffer) ? 
        1 : (buffer - old_buffer); /* beginning of next word, at least 1 */
#if 0
    if (!strcmp(wp, "to")) {
        printf("%d ", ICurrentFileOffset);
        if (++times > 20) {
            printf("\n");
            times = 0;
        }
    }
#endif
    return(buffer);
}

set_indexable_char(indexable_char)
    int indexable_char[256];
{
    int i;

    /* Saves a lot of calls during run-time! */
    for (i=0; i<256; i++) {
        if(!ISASCII((unsigned char)i) && !isalpha((unsigned char)i)) indexable_char[i] = 0;
        else if(IndexNumber) indexable_char[i] = isalnum((unsigned char)i);
        else indexable_char[i] = isalpha((unsigned char)i);
    }
    indexable_char['_'] = 1;
}

set_special_char(special_char)
    int special_char[256];
{
    /*
     * Set all special characters interpreted by agrep to 1.
     * Assume set_indexable_char has been done on it.
     */
    special_char['-'] = 1;
    /* special_char[','] = 1; */
    /* special_char[';'] = 1; */
    /* special_char['.'] = 1; */
    /* special_char['#'] = 1; */
    /* special_char['|'] = 1; */
    special_char['['] = 1;
    special_char[']'] = 1;
    /* special_char['('] = 1; */
    /* special_char[')'] = 1; */
    /* special_char['>'] = 1; */
    /* special_char['<'] = 1; */
    /* special_char['^'] = 1; */
    /* special_char['$'] = 1; */
    /* special_char['+'] = 1; */
}
glimpse-4.18.7/index/glimpse.c000066400000000000000000001160761300371307100161740ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved.
*/ /* ./glimpse/index/glimpse.c */ #include "glimpse.h" #include #include #if ISO_CHAR_SET #include /* support for 8bit character set:ew@senate.be */ #endif #include extern char **environ; extern int errno; extern FILE *TIMEFILE; /* file descriptor for sorting .glimpse_filenames by time */ #if BG_DEBUG extern FILE *LOGFILE; /* file descriptor for LOG output */ #endif /*BG_DEBUG*/ extern FILE *STATFILE; /* file descriptor for statistical data about indexed files */ extern FILE *MESSAGEFILE; /* file descriptor for important messages meant for the user */ extern char INDEX_DIR[MAX_LINE_LEN]; extern struct stat istbuf; #ifdef BUILDCAST /* TEMP_DIR is normally defined in ../main.c; if we're building * buildcast, that's not linked in, so we need to define one here. */ /* char * TEMP_DIR = NULL; */ static char * TEMP_DIR = "/tmp"; #else extern char *TEMP_DIR; /* directory to store glimpse temporary files, usually /tmp unless -T is specified */ #endif /* BUILDCAST */ extern int indexable_char[256]; extern int GenerateHash; extern int KeepFilenames; extern int OneFilePerBlock; extern int IndexNumber; extern int CountWords; extern int StructuredIndex; extern int attr_num; extern int MAXWORDSPERFILE; extern int NUMERICWORDPERCENT; extern int AddToIndex; extern int DeleteFromIndex; extern int PurgeIndex; extern int FastIndex; extern int BuildDictionary; extern int BuildDictionaryExisting; extern int CompressAfterBuild; extern int IncludeHigherPriority; extern int FilenamesOnStdin; extern int ExtractInfo; extern int InfoAfterFilename; extern int FirstWordOfInfoIsKey; extern int UseFilters; extern int ByteLevelIndex; extern int RecordLevelIndex; extern int StoreByteOffset; extern char rdelim[MAX_LINE_LEN]; extern char old_rdelim[MAX_LINE_LEN]; extern int rdelim_len; /* extern int IndexUnderscore; */ extern int IndexableFile; extern int MAX_PER_MB, MAX_INDEX_PERCENT; extern int I_THRESHOLD; extern int BigHashTable; extern int BigFilenameHashTable; extern int IndexEverything; 
extern int BuildTurbo; extern int SortByTime; extern int AddedMaxWordsMessage; extern int AddedMixedWordsMessage; extern int file_num; extern int old_file_num; extern int new_file_num; extern int file_id; extern int part_num; extern char **name_list[MAXNUM_INDIRECT]; extern int p_table[MAX_PARTITION]; extern int *size_list[MAXNUM_INDIRECT]; extern int p_size_list[]; extern unsigned int *disable_list; extern int memory_usage; extern int mask_int[]; extern int REAL_PARTITION, REAL_INDEX_BUF, MAX_ALL_INDEX, FILEMASK_SIZE; extern struct indices *deletedlist; extern char sync_path[MAX_LINE_LEN]; extern int ATLEASTONEFILE; extern set_usemalloc(); /* compress/misc.c */ char IProgname[MAX_LINE_LEN]; int ModifyFilenamesIndex = 0; /* * Has newnum crossed the boundary of an encoding? This is so rare that we * needn't optimize it by changing the format of the old index and reusing it. */ cross_boundary(oldnum, newnum) int oldnum, newnum; { int ret; if (oldnum <= 0) return 0; ret = ( ((oldnum <= MaxNum8bPartition) && (newnum > MaxNum8bPartition)) || ((oldnum <= MaxNum12bPartition) && (newnum > MaxNum12bPartition)) || ((oldnum <= MaxNum16bPartition) && (newnum > MaxNum16bPartition)) ); if (ret) fprintf(MESSAGEFILE, "Must change index format. Commencing fresh indexing...\n"); return ret; } determine_sync() { char S[1024], s1[256], s2[256]; FILE *fp; int i, ret; strcpy(sync_path, "sync"); sprintf(S, "exec whereis sync > %s/zz.%d", TEMP_DIR,getpid()); /* Change it to use which: not urgent. 
*/
    system(S);
    sprintf(S, "%s/zz.%d", TEMP_DIR,getpid());
    if ((fp = fopen(S, "r")) == NULL) {
        /* printf("11111\n"); */
        return 0;
    }
    if ((ret = fread(S, 1, sizeof(S)-1, fp)) <= 0) {
        sprintf(S, "%s/zz.%d", TEMP_DIR,getpid());
        unlink(S);
        fclose(fp);
        /* printf("22222\n"); */
        return 0;
    }
    S [ret] = 0; /* terminate string */
    sprintf(s1, "%s/zz.%d", TEMP_DIR,getpid());
    unlink(s1);
    fclose(fp);
    /* printf("read: %s\n", S); */
    s2[0] = '\0'; /* sscanf leaves s2 untouched if the output has only one token */
    sscanf(S, "%255s%255s", s1, s2); /* field widths keep tokens within s1[256] and s2[256] */
    /* printf("s1=%s s2=%s\n", s1, s2); */
    if (strncmp(s1, "sync", 4)) {
        /* printf("33333\n"); */
        return 0;
    }
    if (!strcmp(s2, "") || !strcmp(s2, " ")) {
        /* printf("44444\n"); */
        return 0;
    }
    if (strstr(s2, "sync") == NULL) {
        /* printf("55555\n"); */
        return 0;
    }
    strcpy(sync_path, s2);
    /* printf("Using sync in: %s\n", sync_path); */
    return 1;
}

main(argc, argv)
    int argc;
    char **argv;
{
    int pid = getpid();
    int i, m = 0;
    char *indexdir, es1[MAX_LINE_LEN], es2[MAX_LINE_LEN];
    char s[MAX_LINE_LEN], s1[MAX_LINE_LEN];
    char working_dir[MAX_LINE_LEN];
    FILE *tmpfp;
    char hash_file[MAX_LINE_LEN], string_file[MAX_LINE_LEN], freq_file[MAX_LINE_LEN];
    char tmpbuf[1024];
    struct stat stbuf;
    char name[MAX_LINE_LEN];
    char outname[MAX_LINE_LEN];
    int specialwords, threshold;
    int backup;
    struct indices *get_removed_indices();
    struct timeval tv;

#if ISO_CHAR_SET
    setlocale(LC_ALL,""); /* support for 8bit character set: ew@senate.be, Henrik.Martin@eua.ericsson.se */
#endif
    BuildDictionary = ON;
    set_usemalloc();
    srand(pid);
    umask(077);
    determine_sync();

    INDEX_DIR[0] = '\0';
    specialwords = threshold = -1; /* so that compute_dictionary can use defaults not visible here */
    strncpy(IProgname, argv[0], MAX_LINE_LEN);
    IProgname[MAX_LINE_LEN-1] = '\0'; /* strncpy does not terminate a truncated copy */
    memset(size_list, '\0', sizeof(int *) * MAXNUM_INDIRECT); /* free it once partition successfully calculates p_size_list */
    memset(name_list, '\0', sizeof(char **) * MAXNUM_INDIRECT);
    memset(p_size_list, '\0', sizeof(int) * MAX_PARTITION);
    build_filename_hashtable((char *)NULL, 0);

    /*
     * Process options.
*/ while (argc > 1) { if (strcmp(argv[1], "-help") == 0) { return usage(1); } #if !BUILDCAST else if (strcmp(argv[1], "-R") == 0) { ModifyFilenamesIndex = 1; argc --; argv ++; } else if (strcmp(argv[1], "-V") == 0) { printf("\nThis is glimpseindex version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); return(0); } else if (strcmp(argv[1], "-T") == 0) { BuildTurbo = ON; argc --; argv ++; } else if (strcmp(argv[1], "-I") == 0) { IndexableFile = ON; argc --; argv ++; } else if(strcmp(argv[1], "-a") == 0) { AddToIndex = ON; argc--; argv++; } else if(strcmp(argv[1], "-b") == 0) { ByteLevelIndex = ON; argc--; argv++; } else if(strcmp(argv[1], "-O") == 0) { StoreByteOffset = ON; argc--; argv++; } else if(strcmp(argv[1], "-r") == 0) { ByteLevelIndex = ON; RecordLevelIndex = ON; if (argc <= 2) { fprintf(stderr, "The -r option must be followed by a delimiter\n"); return usage(1); } else { strncpy(rdelim, argv[2], MAX_LINE_LEN); rdelim[MAX_LINE_LEN-1] = '\0'; rdelim_len = strlen(rdelim); strcpy(old_rdelim, rdelim); argc -= 2; argv += 2; } } else if(strcmp(argv[1], "-c") == 0) { CountWords = ON; argc--; argv++; } else if(strcmp(argv[1], "-d") == 0) { DeleteFromIndex = ON; argc --; argv ++; } else if(strcmp(argv[1], "-D") == 0) { PurgeIndex = OFF; argc --; argv ++; } else if(strcmp(argv[1], "-f") == 0) { FastIndex = ON; argc--; argv++; } else if (strcmp(argv[1], "-o") == 0) { OneFilePerBlock = ON; argc --; argv ++; } else if (strcmp(argv[1], "-s") == 0) { StructuredIndex = ON; argc --; argv ++; } else if(strcmp(argv[1], "-z") == 0) { UseFilters = ON; argc--; argv++; } else if(strcmp(argv[1], "-t") == 0) { SortByTime = ON; argc--; argv++; } else if (strcmp(argv[1], "-C") == 0) { BigFilenameHashTable = 1; argc --; argv ++; } #else /*!BUILDCAST*/ else if (strcmp(argv[1], "-V") == 0) { printf("\nThis is buildcast version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); return(0); } else if(strcmp(argv[1], "-C") == 0) { CompressAfterBuild = ON; argc --; argv ++; } else 
if(strcmp(argv[1], "-E") == 0) { BuildDictionaryExisting = ON; argc --; argv ++; } else if (strcmp(argv[1], "-t") == 0) { if ((argc <= 2) || !(isdigit(argv[2][0]))) { return usage(1); } else { threshold = atoi(argv[2]); argc -= 2; argv += 2; } } else if (strcmp(argv[1], "-l") == 0) { if ((argc <= 2) || !(isdigit(argv[2][0]))) { return usage(1); } else { specialwords = atoi(argv[2]); argc -= 2; argv += 2; } } #endif /*!BUILDCAST*/ else if (strcmp(argv[1], "-M") == 0) { if (argc == 2) { fprintf(stderr, "-M should be followed by the amount of memory in MB for indexing words\n"); return usage(1); } m = atoi(argv[2]); if (m < 1) { fprintf(stderr, "Ignoring -M %d (< 1 MB). Using default value of about 2 MB\n", m); return usage(1); } else { /* * Calculate I_THRESHOLD approximately. Note: 2*1024*1024*2 / (2*24 + 32 + 12) = 47662, DEF_I_THRESHOLD = 40000, so OK * N * sizeofindices + N*(avgwordlen + sizeoftoken)/indicespertoken <= mem * elemsperset = occurrences/indicespertoken * N <= mem * occurrences / (sizeofindices*indicespertoken + avgwordlen + sizeoftoken) */ I_THRESHOLD = m * 1024 * 1024 * (INDICES_PER_TOKEN) / (INDICES_PER_TOKEN * sizeof(struct indices) + sizeof(struct token) + AVG_WORD_LEN); fprintf(stderr, "Using %d words as threshold before merge\n", I_THRESHOLD/INDICES_PER_TOKEN); } argc -= 2; argv += 2; } else if (strcmp(argv[1], "-w") == 0) { if (argc == 2) { fprintf(stderr, "-w should be followed by the number of words\n"); return usage(1); } MAXWORDSPERFILE = atoi(argv[2]); argc -= 2; argv += 2; } else if (strcmp(argv[1], "-S") == 0) { if (argc == 2) { fprintf(stderr, "-S should be followed by the stop list limit\n"); return usage(1); } MAX_PER_MB = MAX_INDEX_PERCENT = atoi(argv[2]); argc -= 2; argv += 2; } else if(strcmp(argv[1], "-n") == 0) { IndexNumber = ON; if ((argc <= 2) || !(isdigit(argv[2][0]))) { /* -n has no arg */ argc --; argv ++; } else { NUMERICWORDPERCENT = atoi(argv[2]); if ((NUMERICWORDPERCENT > 100) || (NUMERICWORDPERCENT < 0)) { 
                    fprintf(stderr, "The percentage of numeric words must be in [0..100]\n");
                    return usage(1);
                }
                argc-=2; argv+=2;
            }
        }
        else if(strcmp(argv[1], "-h") == 0) {
            /* I want to generate .glimpse_filehash and .glimpse_filehash_index */
            GenerateHash = ON;
            argc --; argv ++;
        }
        else if(strcmp(argv[1], "-i") == 0) {
            IncludeHigherPriority = ON;
            argc --; argv ++;
        }
        else if(strcmp(argv[1], "-k") == 0) {
            /* I want to know what files were there before: used in SFS to compute new sets from old ones */
            KeepFilenames = ON;
            argc --; argv ++;
        }
        else if (strcmp(argv[1], "-B") == 0) {
            BigHashTable = 1;
            argc --; argv ++;
        }
        else if (strcmp(argv[1], "-E") == 0) {
            IndexEverything = 1; /* without doing stat tests, etc. */
            argc --; argv ++;
        }
        else if(strcmp(argv[1], "-F") == 0) {
            FilenamesOnStdin = ON;
            argc--; argv++;
        }
        else if(strcmp(argv[1], "-X") == 0) {
            /* extract some info to append after a ' ' to filename in filename-buffer */
            ExtractInfo = ON;
            argc--; argv++;
        }
        else if(strcmp(argv[1], "-U") == 0) {
            /* some information is there after blank after filename on same line as filename-buffer: makes sense only with -F */
            InfoAfterFilename = ON;
            argc--; argv++;
        }
        else if(strcmp(argv[1], "-K") == 0) {
            /* first word of above info/or the extracted info is the key */
            FirstWordOfInfoIsKey = ON;
            argc--; argv++;
        }
        /*
        else if(strcmp(argv[1], "-u") == 0) {
            IndexUnderscore = ON;
            argc--; argv++;
        }
        */
        else if (strcmp(argv[1], "-H") == 0) {
            if (argc == 2) {
                fprintf(stderr, "-H should be followed by a directory name\n");
                return usage(1);
            }
            strncpy(INDEX_DIR, argv[2], MAX_LINE_LEN);
            INDEX_DIR[MAX_LINE_LEN-1] = '\0'; /* strncpy does not terminate a truncated copy */
            argc -= 2; argv += 2;
        }
        else break; /* rest are directory names */
    }

    if (RecordLevelIndex && StructuredIndex) {
        fprintf(stderr, "-r and -s are not compatible!\n");
        return usage(1);
    }

    if (StoreByteOffset && !RecordLevelIndex) {
        fprintf(stderr, "ignoring -O since -r was not specified\n");
        StoreByteOffset = OFF;
    }

    if (InfoAfterFilename && !FilenamesOnStdin) {
        fprintf(stderr, "-U works only when -F is specified!\n");
        return usage(1);
    }

    if
(FirstWordOfInfoIsKey && !(InfoAfterFilename || ExtractInfo)) { fprintf(stderr, "-K works only when one of -X or -U are specified!\n"); return usage(1); } if (RecordLevelIndex) { /* printf("old_rdelim = %s rdelim = %s rdelim_len = %d\n", old_rdelim, rdelim, rdelim_len); */ preprocess_delimiter(rdelim, rdelim_len, rdelim, &rdelim_len); /* printf("processed rdelim = %s rdelim_len = %d\n", rdelim, rdelim_len); */ } if (ModifyFilenamesIndex) { int offset = 0; char buffer[1024]; FILE *filefp, *indexfp; sprintf(buffer, "%s/%s", INDEX_DIR, NAME_LIST); if ((filefp = fopen(buffer, "r")) == NULL) { fprintf(stderr, "Cannot open %s for reading\n", buffer); exit(2); } sprintf(buffer, "%s/%s.tmp", INDEX_DIR, NAME_LIST_INDEX); if ((indexfp = fopen(buffer, "w")) == NULL) { fprintf(stderr, "Cannot open %s for writing\n", buffer); exit(2); } fgets(buffer, 1024, filefp); /* skip over num. of file names */ offset += strlen(buffer); while (fgets(buffer, 1024, filefp) != NULL) { putc((offset & 0xff000000) >> 24, indexfp); putc((offset & 0xff0000) >> 16, indexfp); putc((offset & 0xff00) >> 8, indexfp); putc((offset & 0xff), indexfp); offset += strlen(buffer); } fflush(filefp); fclose(filefp); fflush(indexfp); fclose(indexfp); #if SFS_COMPAT sprintf(s, "%s/%s.tmp", INDEX_DIR, NAME_LIST_INDEX); sprintf(s1, "%s/%s", INDEX_DIR, NAME_LIST_INDEX); return rename(s, s1); #else sprintf(buffer, "mv %s/%s.tmp %s/%s", INDEX_DIR, NAME_LIST_INDEX, INDEX_DIR, NAME_LIST_INDEX); return system(buffer); #endif } BuildTurbo = ON; /* always ON: user can remove .glimpse_turbo if not needed */ /* * Look for invalid option combos. 
     */
    if ((argc<=1) && (!FilenamesOnStdin) && !FastIndex) {
        return usage(1);
    }

    if (DeleteFromIndex && (AddToIndex || CountWords || IndexableFile)) {
        /* With -f, it is automatic for files not found in OS but present in index; without it, an explicit set of files is required as argument on cmdline */
        fprintf(stderr, "-d cannot be used with -I, -a or -c (see man pages)\n");
        exit(1);
    }

    if (ByteLevelIndex) {
        if (MAX_PER_MB <= 0) {
            fprintf(stderr, "Stop list limit (#of occurrences per MB) '%d' must be > 0\n", MAX_PER_MB);
            exit(1);
        }
    }
    else if (OneFilePerBlock) {
        if ((MAX_INDEX_PERCENT <= 0) || (MAX_INDEX_PERCENT > 100)) {
            fprintf(stderr, "Stop list limit (%% of occurrences in files) '%d' must be in (0, 100]\n", MAX_INDEX_PERCENT);
            exit(1);
        }
    }

    /*
     * Find the index directory since it is used in all options.
     */
    if (INDEX_DIR[0] == '\0') {
        if ((indexdir = getenv("HOME")) == NULL) {
            getcwd(INDEX_DIR, MAX_LINE_LEN-1);
            fprintf(stderr, "Using working-directory '%s' to store index\n\n", INDEX_DIR);
        }
        else {
            strncpy(INDEX_DIR, indexdir, MAX_LINE_LEN);
            INDEX_DIR[MAX_LINE_LEN-1] = '\0'; /* strncpy does not terminate a truncated copy */
        }
    }
    getcwd(working_dir, MAX_LINE_LEN - 1);
    if (-1 == chdir(INDEX_DIR)) {
        fprintf(stderr, "Cannot change directory to %s\n", INDEX_DIR);
        return usage(0);
    }
    getcwd(INDEX_DIR, MAX_LINE_LEN - 1); /* must be absolute path name */
    chdir(working_dir); /* get back to where you were */

    if (IndexableFile) { /* traverse the given directories and output names of files that are indexable on stdout */
        SortByTime = OFF;
        partition(argc, argv);
        return 0;
    }
    else {
#if BUILDCAST
        printf("\nThis is buildcast version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE);
#else /*BUILDCAST*/
        printf("\nThis is glimpseindex version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE);
#endif /*BUILDCAST*/
    }

    if (ByteLevelIndex) {
#if 0
        /* We'll worry about these things later */
        if (AddToIndex || DeleteFromIndex || FastIndex) {
            fprintf(stderr, "Fresh indexing recommended: -a, -d and -f are not supported with -b as yet\n");
            exit(1);
        }
        AddToIndex = FastIndex = OFF;
#endif
        CountWords =
OFF; OneFilePerBlock = ON; } if (SortByTime) { if (DeleteFromIndex || AddToIndex) { fprintf(stderr, "Fresh indexing recommended: -a and -d are not supported with -t as yet\n"); exit(1); } FastIndex = OFF; /* automatically shuts it off as of now: we shall optimize -t with -f later */ } /* * CONVENTION: all the relevant output is on stdout; warnings/errors are on stderr. * Initialize / open important files. */ read_filters(INDEX_DIR, UseFilters); freq_file[0] = hash_file[0] = string_file[0] = '\0'; strcpy(freq_file, INDEX_DIR); strcat(freq_file, "/"); strcat(freq_file, DEF_FREQ_FILE); strcpy(hash_file, INDEX_DIR); strcat(hash_file, "/"); strcat(hash_file, DEF_HASH_FILE); strcpy(string_file, INDEX_DIR); strcat(string_file, "/"); strcat(string_file, DEF_STRING_FILE); initialize_tuncompress(string_file, freq_file, 0); sprintf(s, "%s/%s", INDEX_DIR, DEF_TIME_FILE); if((TIMEFILE = fopen(s, "w")) == 0) { fprintf(stderr, "can't open %s for writing\n", s); exit(2); } #if BG_DEBUG sprintf(s, "%s/%s", INDEX_DIR, DEF_LOG_FILE); if((LOGFILE = fopen(s, "w")) == 0) { fprintf(stderr, "can't open %s for writing\n", s); LOGFILE = stderr; } #endif /*BG_DEBUG*/ sprintf(s, "%s/%s", INDEX_DIR, DEF_MESSAGE_FILE); if((MESSAGEFILE = fopen(s, "w")) == 0) { fprintf(stderr, "can't open %s for writing\n", s); MESSAGEFILE = stderr; } sprintf(s, "%s/%s", INDEX_DIR, DEF_STAT_FILE); if((STATFILE = fopen(s, "a")) == 0) { fprintf(stderr, "can't open %s for appending\n", s); STATFILE = stderr; } gettimeofday(&tv, NULL); #if BUILDCAST fprintf(STATFILE, "\nThis is buildcast version %s, %s. %s", GLIMPSE_VERSION, GLIMPSE_DATE, ctime(&tv.tv_sec)); #else fprintf(STATFILE, "\nThis is glimpseindex version %s, %s. 
%s", GLIMPSE_VERSION, GLIMPSE_DATE, ctime(&tv.tv_sec)); #endif #if BG_DEBUG fprintf(LOGFILE, "Index Directory = %s\n\n", INDEX_DIR); #endif /*BG_DEBUG*/ if (MAXWORDSPERFILE != 0) fprintf(MESSAGEFILE, "Index: maximum number of indexed words per file = %d\n", MAXWORDSPERFILE); else fprintf(MESSAGEFILE, "Index: maximum number of indexed words per file = infinity\n"); fprintf(MESSAGEFILE, "Index: maximum percentage of numeric words per file = %d\n", NUMERICWORDPERCENT); set_indexable_char(indexable_char); #if BUILDCAST CountWords = ON; AddToIndex = OFF; FastIndex = OFF; /* Save old search-dictionaries */ sprintf(s, "%s/.glimpse_index", INDEX_DIR); if (!access(s, R_OK)) { sprintf(s, "%s/.glimpse_tempdir.%d", INDEX_DIR, pid); if (-1 == mkdir(s, 0700)) { fprintf(stderr, "cannot create temporary directory %s\n", s); return -1; } #if SFS_COMPAT sprintf(s, "%s/%s", INDEX_DIR, INDEX_FILE); sprintf(s1, "%s/.glimpse_tempdir.%d", INDEX_DIR, pid); rename(s, s1); #else sprintf(s, "exec %s -f '%s/%s' '%s/.glimpse_tempdir.%d'\n", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), INDEX_FILE, escapesinglequote(INDEX_DIR, es2), pid); system(s); #endif #if SFS_COMPAT sprintf(s, "%s/%s", INDEX_DIR, P_TABLE); sprintf(s1, "%s/.glimpse_tempdir.%d", INDEX_DIR, pid); rename(s, s1); #else sprintf(s, "exec %s -f '%s/%s' '%s/.glimpse_tempdir.%d'\n", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), P_TABLE, escapesinglequote(INDEX_DIR, es2), pid); system(s); #endif #if SFS_COMPAT sprintf(s, "%s/%s", INDEX_DIR, NAME_LIST); sprintf(s1, "%s/.glimpse_tempdir.%d", INDEX_DIR, pid); rename(s, s1); #else sprintf(s, "exec %s -f '%s/%s' '%s/.glimpse_tempdir.%d'\n", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), NAME_LIST, escapesinglequote(INDEX_DIR, es2), pid); system(s); #endif #if SFS_COMPAT sprintf(s, "%s/%s", INDEX_DIR, NAME_LIST_INDEX); sprintf(s1, "%s/.glimpse_tempdir.%d", INDEX_DIR, pid); rename(s, s1); #else sprintf(s, "exec %s -f '%s/%s' '%s/.glimpse_tempdir.%d'\n", SYSTEM_MV, 
escapesinglequote(INDEX_DIR, es1), NAME_LIST_INDEX, escapesinglequote(INDEX_DIR, es1), pid); system(s); #endif #if SFS_COMPAT sprintf(s, "%s/%s", INDEX_DIR, NAME_HASH); sprintf(s1, "%s/.glimpse_tempdir.%d", INDEX_DIR, pid); rename(s, s1); #else sprintf(s, "exec %s -f '%s/%s' '%s/.glimpse_tempdir.%d'\n", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), NAME_HASH, escapesinglequote(INDEX_DIR, es2), pid); system(s); #endif #if SFS_COMPAT sprintf(s, "%s/%s", INDEX_DIR, NAME_HASH_INDEX); sprintf(s1, "%s/.glimpse_tempdir.%d", INDEX_DIR, pid); rename(s, s1); #else sprintf(s, "exec %s -f '%s/%s' '%s/.glimpse_tempdir.%d'\n", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), NAME_HASH_INDEX, escapesinglequote(INDEX_DIR, es2), pid); system(s); #endif #if SFS_COMPAT sprintf(s, "%s/%s", INDEX_DIR, MINI_FILE); sprintf(s1, "%s/.glimpse_tempdir.%d", INDEX_DIR, pid); rename(s, s1); #else sprintf(s, "exec %s -f '%s/%s' '%s/.glimpse_tempdir.%d'\n", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), MINI_FILE, escapesinglequote(INDEX_DIR, es2), pid); system(s); #endif #if SFS_COMPAT sprintf(s, "%s/%s", INDEX_DIR, DEF_STAT_FILE); sprintf(s1, "%s/.glimpse_tempdir.%d", INDEX_DIR, pid); rename(s, s1); #else sprintf(s, "exec %s -f '%s/%s' '%s/.glimpse_tempdir.%d'\n", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), DEF_STAT_FILE, escapesinglequote(INDEX_DIR, es2), pid); system(s); #endif /* Don't save messages, log, debug, etc. 
*/ sprintf(s, "%s/.glimpse_attributes", INDEX_DIR); if (!access(s, R_OK)) { #if SFS_COMPAT sprintf(s, "%s/%s", INDEX_DIR, ATTRIBUTE_FILE); sprintf(s1, "%s/.glimpse_tempdir.%d", INDEX_DIR, pid); rename(s, s1); #else sprintf(s, "exec %s -f '%s/%s' '%s/.glimpse_tempdir.%d'\n", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), ATTRIBUTE_FILE, escapesinglequote(INDEX_DIR, es2), pid); system(s); #endif } } /* Backup old cast-dictionaries: don't use move since indexing might want to use them */ sprintf(s, "%s/.glimpse_quick", INDEX_DIR); if (!access(s, R_OK)) { /* there are previous cast dictionaries */ backup = rand(); sprintf(s, "%s/.glimpse_backup.%x", INDEX_DIR, backup); if (-1 == mkdir(s, 0700)) { fprintf(stderr, "cannot create backup directory %s\n", s); return -1; } sprintf(s, "exec %s -f '%s/.glimpse_quick' '%s/.glimpse_backup.%x'\n", SYSTEM_CP, escapesinglequote(INDEX_DIR, es1), escapesinglequote(INDEX_DIR, es2), backup); system(s); sprintf(s, "exec %s -f '%s/.glimpse_compress' '%s/.glimpse_backup.%x'\n", SYSTEM_CP, escapesinglequote(INDEX_DIR, es1), escapesinglequote(INDEX_DIR, es2), backup); system(s); sprintf(s, "exec %s -f '%s/.glimpse_compress.index' '%s/.glimpse_backup.%x'\n", SYSTEM_CP, escapesinglequote(INDEX_DIR, es1), escapesinglequote(INDEX_DIR, es2), backup); system(s); sprintf(s, "exec %s -f '%s/.glimpse_uncompress' '%s/.glimpse_backup.%x'\n", SYSTEM_CP, escapesinglequote(INDEX_DIR, es1), escapesinglequote(INDEX_DIR, es2), backup); system(s); sprintf(s, "exec %s -f '%s/.glimpse_uncompress.index' '%s/.glimpse_backup.%x'\n", SYSTEM_CP, escapesinglequote(INDEX_DIR, es1), escapesinglequote(INDEX_DIR, es2), backup); system(s); printf("Saved previous cast-dictionary in %s/.glimpse_backup.%x\n", INDEX_DIR, backup); } /* Now index these files, and build new dictionaries */ partition(argc, argv); initialize_data_structures(file_num); old_file_num = file_num; build_index(); cleanup(); save_data_structures(); destroy_filename_hashtable(); uninitialize_common(); 
uninitialize_tcompress(); uninitialize_tuncompress(); compute_dictionary(threshold, DISKBLOCKSIZE, specialwords, INDEX_DIR); if (CompressAfterBuild) { /* For the new compression */ if (!initialize_tcompress(hash_file, freq_file, TC_ERRORMSGS)) goto docleanup; printf("Compressing files with new dictionary...\n"); /* Use the set of file-names collected during partition() / modified during build_hash */ for(i=0; i %d words to the index: check %s\n", MAXWORDSPERFILE, DEF_MESSAGE_FILE); if (AddedMixedWordsMessage) printf("Some files had numerals in > %d%% of the indexed words: check %s\n", NUMERICWORDPERCENT, DEF_MESSAGE_FILE); printf("\nIndex-directory: \"%s\"\nGlimpse-files created here:\n", INDEX_DIR); chdir(INDEX_DIR); sprintf(s, "exec %s -l .glimpse_* > %s/%d\n", SYSTEM_LS, TEMP_DIR,pid); system(s); sprintf(s, "%s/%d", TEMP_DIR,pid); if ((tmpfp = fopen(s, "r")) != NULL) { memset(tmpbuf, '\0', 1024); while(fgets(tmpbuf, 1024, tmpfp) != NULL) fputs(tmpbuf, stdout); fflush(tmpfp); fclose(tmpfp); unlink(s); } else fprintf(stderr, "cannot open %s to `cat': check %s for .glimpse - files\n", s, INDEX_DIR); #endif /*BUILDCAST*/ if (!ATLEASTONEFILE) exit(1); return 0; } cleanup() { char s[MAX_LINE_LEN]; sprintf(s, "%s/%s", INDEX_DIR, I1); unlink(s); sprintf(s, "%s/%s", INDEX_DIR, I2); unlink(s); sprintf(s, "%s/%s", INDEX_DIR, I3); unlink(s); sprintf(s, "%s/%s", INDEX_DIR, O1); unlink(s); sprintf(s, "%s/%s", INDEX_DIR, O2); unlink(s); sprintf(s, "%s/%s", INDEX_DIR, O3); unlink(s); sprintf(s, "%s/.glimpse_apply.%d", INDEX_DIR, getpid()); unlink(s); return(1); /* We return 1, because function expect int as return type */ } #if !BUILDCAST usage(flag) int flag; { if (flag) fprintf(stderr, "\nThis is glimpseindex version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); fprintf(stderr, "usage: %s [-help] [-a] [-d] [-f] [-i] [-n [X]] [-o] [-r delim] [-s] [-t] [-w X] [-B] [-F] [-H DIR] [-I] [-M X] [-R] [-S X] [-T] [-V] NAMES\n", IProgname); fprintf(stderr, "List of options (see %s for 
more details):\n", GLIMPSE_URL); fprintf(stderr, "-help: outputs this menu\n"); fprintf(stderr, "-a: add given files/directories to an existing index\n"); fprintf(stderr, "-b: build a (large) byte-level index \n"); fprintf(stderr, "-B: use a hash table that is 4 times bigger (256k entries instead of 64K) \n"); fprintf(stderr, "-d NAMES: delete (file or directory) NAMES from an existing index\n"); fprintf(stderr, "-D NAMES: delete NAMES from the list of files (but not from the index!)\n"); fprintf(stderr, "-E: do not run a check on file types\n"); fprintf(stderr, "-f: incremental indexing (add all newly modified files)\n"); fprintf(stderr, "-F: the list of files to index is obtained from standard input\n"); fprintf(stderr, "-h: generates some hash-tables for WebGlimpse\n"); fprintf(stderr, "-H DIR: the index is put in directory DIR\n"); fprintf(stderr, "-i: make .glimpse_include take precedence over .glimpse_exclude\n"); fprintf(stderr, "-I: output the list of files that would be indexed (but don't index)\n"); fprintf(stderr, "-M X: use X MBytes of memory for temporary tables\n"); fprintf(stderr, "-n [X]: index numbers as well as words; warn (into .glimpse_messages)\n\tif file adds > X%% numeric words: default is %d\n", DEF_NUMERIC_WORD_PERCENT); fprintf(stderr, "-o: build a small (rather than tiny) size index (the recommended option!)\n"); /*fprintf(stderr, "-O: when using -r option, store byte offset of each record,\n\tinstead of the record number, for faster access\n");*/ fprintf(stderr, "-r delim: build an index at the granularity of delimiter `delim'\n\tto do booleans by reading ONLY the index\n"); fprintf(stderr, "-R: recompute .glimpse_filenames_index from .glimpse_filenames if it changes\n"); fprintf(stderr, "-s: build index to support structured (Harvest SOIF type) queries\n"); fprintf(stderr, "-S X: adjust the size of the stop list\n"); fprintf(stderr, "-t: sort the indexed files by date and time (most recent first)\n"); fprintf(stderr, "-T: build 
.glimpse_turbo for very fast search with -i -w in glimpse\n"); fprintf(stderr, "-U: there is extra information after filenames: works only with -F\n"); fprintf(stderr, "-w X: warn (into .glimpse_messages) if a file adds >= X words to the index\n"); fprintf(stderr, "-X: extract titles of all documents with .html, .htm, .shtm, .shtml suffix\n"); fprintf(stderr, "-z: customizable filtering using .glimpse_filters \n"); fprintf(stderr, "\n"); fprintf(stderr, "For questions about glimpse, please contact: `%s'\n", GLIMPSE_EMAIL); exit(1); } #else /*!BUILDCAST*/ usage(flag) int flag; { if (flag) fprintf(stderr, "\nThis is buildcast version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); fprintf(stderr, "usage: %s [-help] [-t] [-i] [-l] [-n [X]] [-w X] [-C] [-E] [-F] [-H DIR] [-V] NAMES\n", IProgname); fprintf(stderr, "summary of frequently used options\n(for a more detailed listing see 'man cast'):\n"); fprintf(stderr, "-help: output this menu\n"); fprintf(stderr, "-n [X]: index numbers as well as words; warn (into .glimpse_messages)\n\tif file adds > X%% numeric words: default is %d\n", DEF_NUMERIC_WORD_PERCENT); fprintf(stderr, "-w X: warn if a file adds > X words to the index\n"); fprintf(stderr, "-C: compress files with the new dictionary after building it\n"); fprintf(stderr, "-E: build cast dictionary using existing compressed files only\n"); fprintf(stderr, "-F: expect filenames on stdin (useful for pipelining)\n"); fprintf(stderr, "-H DIR: .glimpse-files should be in directory DIR: default is '~'\n"); fprintf(stderr, "\n"); fprintf(stderr, "For questions about glimpse, please contact: `%s'\n", GLIMPSE_EMAIL); exit(1); return(1); /* We type return 1, because function expect int as return type */ } #endif /*!BUILDCAST*/ glimpse-4.18.7/index/glimpse.h000066400000000000000000000317651300371307100162020ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. 
*/ /* ./glimpse/index/pirs.h */ #include /* include configured defs */ #ifndef _GLIMPSE_H_ #define _GLIMPSE_H_ #include #include #include #include /*#include */ #include #include #include #include #undef log #include "agrep.h" #ifndef S_ISREG /* #define S_ISREG(mode) (0100000&(mode)) */ #define S_ISREG(mode) (((mode) & (_S_IFMT)) == (_S_IFREG)) #endif #ifndef S_ISDIR /* #define S_ISDIR(mode) (0040000&(mode)) */ #define S_ISDIR(mode) (((mode) & (_S_IFMT)) == (_S_IFDIR)) #endif #define IC_PORTRELEASE 20 /* time till used TCP port is released */ #ifndef ON #define ON 1 #endif #ifndef OFF #define OFF 0 #endif #ifndef CHAR #define CHAR unsigned char #endif #define MAX_INCLUSIVE 256 /* max number of inclusive patterns for files to be indexed even if filetype.c says otherwise. */ #define MAX_EXCLUSIVE 256 /* max number of exclusive patterns for not_to_be_indexed files */ #define MAX_FILTER 256 /* max number of filter patterns */ #define DEF_I_THRESHOLD 40000 /* 100000 originally, debugging 10000 */ #define AVG_OCCURRENCES 8 /* #of places a word occurs on average: sizeof(.glimpse_partitions)/`wc -l .glimpse_index`: divisible by INDEX_SET_SIZE */ #define MAX_LIST 0177777 #define DEFAULT_PART_SIZE (1 << 13) #define MAX_64K_HASH (64*1024) #define MAX_256K_HASH (256*1024) #define MAX_4K_HASH (4*1024) #define DISKBLOCKSIZE 8192 #define BLOCK_SIZE (1024*64) #define MAX_PARTITION 255 #define MaxNumPartition 250 /* it's not 255, since there is fragmentation*/ /* The idea behind our encoding is: dividend = divisor * quotient + remainder */ #define MaxNum4bPartition (16 - 2) /* since 10 and 0 can't be in LSB/MSB */ #define MaxNum8bPartition (256 - 2) #define MaxNum12bPartition (MaxNum4bPartition*MaxNum8bPartition) #define MaxNum16bPartition (MaxNum8bPartition*MaxNum8bPartition) #define MaxNum24bPartition (MaxNum8bPartition*MaxNum16bPartition) #define MaxNum32bPartition (MaxNum8bPartition*MaxNum24bPartition) /* These help in encoding byte-level indices: 1st byte's top 2 bits tell the 
#of bytes - 1 in offset-difference encoding; offset-diff 0 => new file follows */ #define MaxNum1BPartition (MaxNum8bPartition & 0x3f) /* 62: top byte is 0x00 | x % MaxNum8bPartition === x; just encode x */ #define MaxNum2BPartition (MaxNum1BPartition * MaxNum8bPartition) /* top byte = 0x40 | x / MaxNum8bPartition; rest is x % ~; encode both separately */ #define MaxNum3BPartition (MaxNum1BPartition * MaxNum16bPartition) /* top byte = 0x80 | x / MaxNum16bPartition; rest is x % ~; encode both separately */ #define MaxNum4BPartition (MaxNum1BPartition * MaxNum24bPartition) /* top byte = 0xc0 | x / MaxNum24bPartition; rest is x % ~; encode both separately */ #define DEF_NUMERIC_WORD_PERCENT 50 /* warn if > this many % of words added by file are numeric */ #define MIN_WORDS 50 /* before we inform about numeric words */ #define MAX_SEARCH_PERCENT 20 /* warn user if searching > this % of blocks */ #define DEF_MAX_INDEX_PERCENT 80 /* if word in > 80%, say everywhere for one-file-per-block */ #define DONT_CONFUSE_SORT 1 #define WORD_END_MARK 2 #define ALL_INDEX_MARK 3 /* If this, then word is in > 60% of blocks */ #define ATTR_END_MARK 4 /* After list of attributes before file offset/block numbers */ #define AVG_WORD_LEN 12 /* average word length is 8-9 including '\0': have safety margin */ #define MAX_NAME_SIZE 256 #define MAX_NAME_LEN MAX_NAME_SIZE #define MaxNameLength MAX_NAME_SIZE #define MAX_LINE_SIZE 1024 #define MAX_LINE_LEN 1024 #define MAX_SORTLINE_LEN (MAX_LINE_LEN * 16) /* Can be ((MaxNum16bPartition*sizeof(int)+MAX_NAME_LEN)*MAX_INDEX_PERCENT/100) in the worst case */ #define MAX_NAME_BUF MAX_NAME_SIZE #define MAX_WORD_SIZE 64 /* w/o '\0'; was 24 in 2.1 */ #define MAX_WORD_LEN MAX_WORD_SIZE #define MAX_WORD_BUF 80 /* was 32 in 2.1 */ #define MAX_PAT 256 #define MAXNUM_INDIRECT MaxNum8bPartition #define MAX_INDEX_BUF (MAX_PARTITION + 1 + 2*MAX_WORD_BUF + 2) /* index line length without OneFilePerBlock */ #define DEF_REAL_INDEX_BUF (MaxNum16bPartition*3 + 
2*MAX_WORD_BUF + 2) /* index line length with OneFilePerBlock */ /* Must write fresh code to calculate these sets based by multiplying defaults below with round(file_num, MaxNum16bPartition) */ #define DEF_FILESET_SIZE MaxNum16bPartition /* used when OneFilePerBlock is ON */ #define DEF_FILEMASK_SIZE (DEF_FILESET_SIZE/(8*sizeof(int)) + 4) /* bit mask of files */ #define DEF_REAL_PARTITION (DEF_FILEMASK_SIZE + 4) /* must be > MAX_PARTITION + 1 */ /* block must be in 0..DEF_FILESET_SIZE-1, and integers should represent bit-masks */ #define block2index(i) (i/(8*sizeof(int))) #define block2mask(i) (1<<(i%(8*sizeof(int)))) /* not used */ #define round(x, y) (((x)+(y)-1)/(y)) #define FILES_PER_PARTITION(x) (16 + round(x, MAX_PARTITION)*16) /* 16 is minimum length of buffer: thereafter, allow noise upto 16 times average */ #define LIST_GET(list, elem) ((list[(elem)/MaxNum16bPartition] == 0) ? (0) : (list[(elem)/MaxNum16bPartition][(elem)%MaxNum16bPartition])) #define LIST_SUREGET(list, elem) (list[(elem)/MaxNum16bPartition][(elem)%MaxNum16bPartition]) #define LIST_ADD(list, elem, what, type) \ {\ int index = (elem /*+ 1*/)/MaxNum16bPartition;\ if (list[index] == NULL) {\ list[index] = (type *)malloc(sizeof(type)*MaxNum16bPartition);\ memset(list[index], '\0', sizeof(type)*MaxNum16bPartition);\ }\ LIST_SUREGET(list, elem) = what;\ } #define DEFAULT_REGION_LIMIT 256 /* default limit for a record: for ByteLevelIndex: pattern is ignored since can't avoid false matches w/o search */ #define MAX_REGION_LIMIT Max_record /* max amount of space I am going allocate for a record bounded by a delimiter: was 16384! 
Fixed -bg */ #define MAX_PER_LINE (MAX_SORTLINE_LEN / 2) /* #of words that can occur on one line before we split it up: not implemented at present */ #define DEF_MAX_PER_MB 500 /* Maximum number of times a word should occur in a megabyte before we say its everywhere */ #define DEF_ALL_INDEX 10000 /* Must be < DEF_MAX_ALL_INDEX */ #define DEF_MAX_ALL_INDEX (DEF_REAL_INDEX_BUF / 2) /* THIS * 2 must be < DEF_REAL_INDEX_BUF to prevent seg-faults! */ /* Default file names */ /* temporarily changed filter configuration file name to avoid conflicts between stable and experimental versions of glimpse. --CV 9/14/99 */ #define FILTER_FILE ".glimpse_filters" #define ATTRIBUTE_FILE ".glimpse_attributes" #define INDEX_FILE ".glimpse_index" #define MINI_FILE ".glimpse_turbo" #define P_TABLE ".glimpse_partitions" #define NAME_LIST ".glimpse_filenames" #define NAME_LIST_INDEX ".glimpse_filenames_index" #define NAME_HASH ".glimpse_filehash" #define NAME_HASH_INDEX ".glimpse_filehash_index" #define DEF_TIME_FILE ".glimpse_filetimes" #define DEF_LOG_FILE ".glimpse_log" #define DEF_MESSAGE_FILE ".glimpse_messages" #define DEF_STAT_FILE ".glimpse_statistics" #define PROHIBIT_LIST ".glimpse_exclude" #define INCLUDE_LIST ".glimpse_include" #define DEBUG_FILE ".glimpse_debug" #define I2 ".glimpse_tmpi2" #define I3 ".glimpse_tmpi3" #define I1 ".glimpse_tmpi1" #define O1 ".glimpse_tmpo1" #define O2 ".glimpse_tmpo2" #define O3 ".glimpse_tmpo3" #define DEF_LOCK_FILE ".glimpse_lock" #define HARVEST_PREFIX "glimpse" /* so that Darren can filterout error messages a user should see from the stuff outputted by glimpse on an error */ #define MASK_INT \ { 0x00000001, 0x00000002, 0x00000004, 0x00000008, 0x00000010, 0x00000020, 0x00000040, 0x00000080,\ 0x00000100, 0x00000200, 0x00000400, 0x00000800, 0x00001000, 0x00002000, 0x00004000, 0x00008000,\ 0x00010000, 0x00020000, 0x00040000, 0x00080000, 0x00100000, 0x00200000, 0x00400000, 0x00800000,\ 0x01000000, 0x02000000, 0x04000000, 0x08000000, 0x10000000, 
0x20000000, 0x40000000, 0x80000000\ } #define INDEXABLE(c) (indexable_char[c]) #if SFS_COMPAT #define IGNORED_SUFFIXES {".glimpse_filehash", ".glimpse_filehash.prev", ".glimpse_filehash_index", ".glimpse_filehash_index.prev", ".glimpse_filenames", ".glimpse_filenames.prev", ".glimpse_filenames_index", ".glimpse_filenames_index.prev", ".glimpse_filetimes", ".glimpse_index", ".glimpse_partitions", ".glimpse_statistics", ".glimpse_messages", ".glimpse_exclude", ".glimpse_include", ".glimpse_filters", ".glimpse_attributes", ".glimpse_turbo"} #define NUM_SUFFIXES 18 #else #define IGNORED_SUFFIXES {"gz", "Z", "z", "zip", "o", "hqx", "tar", "glimpse_times", "glimpse_index", "glimpse_partitions"} #define NUM_SUFFIXES 10 #endif #define EXTRACT_INFO_SUFFIX {".htm", ".html", ".shtm", ".shtml", ".jhtml", ".phtml",".HTM",".HTML", ".abra"} #define NUM_EXTRACT_INFO_SUFFIX 9 /* Version and release year: same for glimpse and glimpseindex since glimpse HAS to interpret glimpseindex */ #define GLIMPSE_VERSION "4.18.7" #define GLIMPSE_DATE "2015" #define GLIMPSE_EMAIL "gvelez17@gmail.com" #define GLIMPSE_URL "http://webglimpse.net/" /* Some extern functions used in structured queries */ extern int attr_name_to_id(), attr_load_names(), attr_dump_names(); extern char *attr_id_to_name(); /* Data structures for hash-tables in build_in.c */ struct token { /* each token stores a unique word and unique attribute */ struct token *next_t; /* keep it a pointer even with tokenalloc to keep build_in.c same */ char *word; struct indices *ip; /* points to the head of the list of indices */ struct indices *lastip; /* tail of this list = last element (for increasing order insertion) */ unsigned int attribute; unsigned int totalcount;/* no.
of indices structures in a token */ }; #define INDEX_SET_SIZE 4 #define INDEX_ELEM_FREE (MaxNum24bPartition + 1) /* can never be equal to a partition value */ struct indices { struct indices *next_i; /* keep it a pointer even with indexalloc to keep build_in.c same */ /*unsigned*/ int index[INDEX_SET_SIZE]; /* changed from char, 31/3/94 */ /*unsigned*/ int offset[INDEX_SET_SIZE]; /* added 19/9/94 */ }; /* Added 20/9/94 for get_index.c in glimpse (make it more efficient in space later) */ struct offsets { struct offsets *next; int offset; /* NOT unsigned!!! */ short sign; /* if 0, then indeterminate (bothways), 1 then +ve, -1 then -ve */ short done; /* if 0, then this did not have an intersection now, else it has had it */ }; #define INDICES_PER_TOKEN (AVG_OCCURRENCES/INDEX_SET_SIZE) /* average no. of struct indices per struct token: purely empirical result :-) */ /* Memory allocators: in io.c */ extern char *my_malloc(); extern int my_free(); extern FILE *my_fopen(); extern int my_open(), my_stat(), my_lstat(); extern char *wordalloc(); extern int wordfree(); extern int allwordfree(); extern struct indices *indicesalloc(); extern int indicesfree(); extern int allindicesfree(); extern struct token *tokenalloc(); extern int tokenfree(); extern int alltokenfree(); #define LIMIT_64K_HASH 50 /* size of total stuff to be indexed in MB after which 256K hash tables make more sense with the -B option */ #define hashword(word, wordlen) (((total_size < LIMIT_64K_HASH*1024*1024) || !BigHashTable) ? (hash64k(word, wordlen)) : (hash256k(word, wordlen))); /* * Just stores the word, wordlength and offset present in a line of the index in a structure (when made with -o or -b). * Doesn't store the attribute since we just need a hint into .glimpse_index from where agrep should begin search. */ #define WORD_SORTED 0 #if WORD_SORTED struct mini { char *word; long offset; }; /* Region searched with strcmp. 
#of regions = mini_array_len = (`wc -l .glimpse_index` - 3) / WORDS_PER_REGION */ #define WORDS_PER_REGION 128 #else /* WORD_SORTED */ struct mini { long offset; }; /* Range of each mini_array entry is words with same hash32k value => 32K offsets into the index need to be stored */ #define MINI_ARRAY_LEN (64*1024) #endif /* WORD_SORTED */ /* For incremental indexing only */ typedef struct _name_hashelement { struct _name_hashelement *next; char *name; int name_len; int index; } name_hashelement; /* * Limit on number of files is MaxNum24bPartition. To change it, you need * to add encode/decode code everywhere, INDEX_ELEM_FREE and MAXNUM_INDIRECT. * * Limit on number of attributes is MaxNum16bPartition. To change it, you * need to add encode/decode code everywhere. That is: merge_splits(), * save_data_structures(), traverse(), merge_in() and scanword() * in glimpseindex; get_set() in glimpse; and printx.c. * * No need to change any other data structures. */ /* Names of various system commands used in glimpseindex: use mv/rm etc rather than rename()/unlink() since former don't return unless parent-dir is sync-ed */ #define SYSTEM_SORT "sort" /* replace with different sort with longer lines. Later write a procedure for sort that doesn't need system() */ #define SYSTEM_LS "ls" #define SYSTEM_MV "mv" /* this doesn't work with SFS */ #define SYSTEM_RM "rm" /* this doesn't work with SFS */ #define SYSTEM_CAT "cat" #define SYSTEM_HEAD "head" #define SYSTEM_CP "cp" #define SYSTEM_ECHO "echo" #define SYSTEM_WC "wc" #define SYSTEM_AWK "awk" /* used at present only in "cast" package */ extern char *escapesinglequote(); #endif /* _GLIMPSE_H_ */ glimpse-4.18.7/index/index.chronicle000066400000000000000000000105141300371307100173550ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ Started in Aug 1993. 0. 
bgopal: The new indexing mechanism is totally different from the original one (written by udi and sun wu) -- the only thing common between the two is the format of the index and the partitioning algorithm (v. simple algo).
1. Changed pirs.c/main()/line16 to make argc>1 check before accessing argv.
2. Added a leading bit in the index values to distinguish them from the next word. This was mentioned but never implemented (comment in build_in.c).
3. Removed simple binary file and uuencoded file testing from filetype.c and put it into a new file simpletest.c so that the compress module can use it too.
4. Removed tolower in getword() so that I can index Linear, LINEAR and linear depending on the relative frequency. Else, compression becomes a problem.
5. Added case-check (allupper, alllower, onlyfirstupper-restlower) routine to getword() in getword.c -- does this only in case '-c' was specified.
6. Modified insert_h() and insert_index() procedures in build_in.c to store the count of words rather than the partition numbers if CountWords == ON.
7. Modified pirs.c to take the option -c for CountWords instead of gathering partition information (i.e., when we don't want .index_list for searching but for the EXACT frequency of occurrence of different words).
8. Modified merge_in.c to merge counts of similar words occurring in two different files rather than the partition numbers: the output of build_in when the CountWords option is set is: a word followed by end-of-word-mark followed by a list of (fprintf, not fwrite) counts separated by blanks, ended with a newline.
9. Changed the files "everywhere" to account for malloc-failures (try again after purging the hash-table once: if fail again, THEN exit).
A. Changed the algorithm for build_hash -- it did not index all files. Block-copied the code in the inner while loop after the loop-terminates.
B. Removed leading bit!
Now sort gave problems on partition#0, so ignored partition#0 altogether like partition#'\n' was ignored to figure out the end of the current input line/word.
C. Removed all references to pirs everywhere: it is now "glimpse" -- 1/28/94.
D. Bug fixes relating to $HOME not being there in the environment.
E. Bug fix related to "very small directories" (partitioning algorithm).
F. Fixed BIG bug related to memory leaks which can cause aborts... not sure if this was the reason for deadlocks (schwartz's bug) but ran ok for 280MB.
G. Fixed a bug related to very small indices (with one partition only).
H. Added a facility to have one file per block, i.e., each file is in one partition all by itself: a MAJOR change was done to many data-structures and encode/decode functions were added so that sort/gets don't get confused. -- bg, 23-30 Mar 1994
I. In fast index, the old index may be destroyed and built again. In add to index, it is never destroyed: things are only added to it. In add to index, the old guys are NOT checked for modification, etc, and all the new ones are added. Whereas in FastIndex, even the new ones are checked for modification date. In both, non-existent files are removed but the holes are not filled. The fastest way to add a new set of files is to use -f. This is same as saying -f AND -a except that the old index is never rebuilt with -a. (The index MIGHT need rebuilding if it was not found or partitions overflowed.) (Does this make sense? :-) -- bg, 20-22 Apr 1994
J. Changed STAT, MESSAGE, LOG (filenames) to STATFILE, MESSAGEFILE, LOGFILE to avoid name clashes with some C-lib variables. -- bg, 29 Apr 1994
K. Changed dir.c and partition.c to take care of absolute path names on the command line itself: now, everything on the command line is forced to be indexed (esp. symlinks which were excluded by default earlier). -- bg, 2 May 1994
L. Increased maximum number of files that can be indexed to 254*254 = 64516. -- bg, 4 May 1994
M.
Added ability to index structured files during June/July 1994.
N. Added ability to index compressed files, and automatically create compress dictionaries (for cast) with -z option during Aug 1994.
O. Added user option -i to make include have higher priority than exclude during Aug 1994.
P. Completed incremental indexing support during June 1995
glimpse-4.18.7/index/io.c000066400000000000000000001376211300371307100151420ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ /* ./glimpse/index/io.c */ #include "glimpse.h" #include #include #include #include #include extern char INDEX_DIR[MAX_LINE_LEN]; extern int memory_usage; #include "utils.c" int REAL_INDEX_BUF = DEF_REAL_INDEX_BUF, MAX_ALL_INDEX = DEF_MAX_ALL_INDEX, FILEMASK_SIZE = DEF_FILEMASK_SIZE, REAL_PARTITION = DEF_REAL_PARTITION; /* Escapes single quotes in "original" string with a backslash (\) so that it can be passed on to the shell as a file name: returns its second argument for printf */ /* Called before passing any argument to the system() routine in glimpse or glimpseindex source code */ /* Works only if the new name is going to be passed as argument to the shell within two ''s */ char * escapesinglequote(original, new) char *original, *new; { char *oldnew = new; while (*original != '\0') { if (*original == '\'') { *new ++ = '\''; /* close existing ' : this guy will be a part of a file name starting from a ' */ *new ++ = '\\'; /* add escape character */ *new ++ = '\''; /* add single quote from original here */ } *new ++ = *original ++; /* start the real single quote to continue existing file name if *original was ' */ } *new = *original; return oldnew; } /* -------------------------------------------------------------------- get_array_of_lines() input: an input filename, address of the table, maximum number of entries of the table, and an overflow handling flag. output: a set of strings in the table.
when overflow is ON, the function returns after the table is filled. otherwise the function will exit if overflow occurs. In normal return, the function returns the number of entries read. ----------------------------------------------------------------------*/ get_array_of_lines(inputfile, table, max_entry, overflow_ok) char *inputfile; char **table[]; int max_entry; /* max number of entries in the table */ int overflow_ok; /* flag for handling overflow */ { int tx=0; /* index for table */ FILE *file_in; unsigned char buffer[MAX_NAME_BUF]; char *np; int line_length; int num_lines; if((file_in = fopen(inputfile, "r")) == NULL) { if (overflow_ok) return 0; fprintf(stderr, "can't open for reading: %s\n", inputfile); exit(2); } fgets(buffer, MAX_NAME_BUF, file_in); sscanf(buffer, "%d", &num_lines); if ((num_lines < 0) || (num_lines > MaxNum24bPartition)) { fclose(file_in); if (overflow_ok) return 0; fprintf(stderr, "Error in reading: %s\n", inputfile); exit(2); } while(fgets(buffer, MAX_NAME_BUF, file_in)) { line_length = strlen(buffer); if (line_length == 1) continue; buffer[line_length-1] = '\0'; /* discard the '\n' */ #if BG_DEBUG np = (char *) my_malloc(sizeof(char) * (line_length + 2)); #else /*BG_DEBUG*/ np = (char *) my_malloc(sizeof(char) * (line_length + 2)); #endif /*BG_DEBUG*/ if(np == NULL) { int i=0; fclose(file_in); for (i=0; i max_entry) { fclose(file_in); if(overflow_ok) { fclose(file_in); return(tx); } fprintf(stderr, "overflow in get_array_of_lines()\n"); exit(2); } } fclose(file_in); return(tx); /* return number of lines read */ } /* -------------------------------------------------------------------- get_table(): input: an input filename, address of the table, maximum number of entries of the table, and a overflow handling flag. output: a set of integers in the table. when overflow_ok is ON, the function returns after the table is filled. otherwise the function will exit if overflow occurs. 
In normal return, the function returns the number of entries read. ----------------------------------------------------------------------*/ int get_table(inputfile, table, max_entry, overflow_ok) char *inputfile; int table[]; int max_entry; int overflow_ok; { int val = 0; int c = 0; FILE *file_in; int tx=0; /* number of entries read */ if((file_in = fopen(inputfile, "r")) == NULL) { if (overflow_ok) return 0; fprintf(stderr, "can't open %s for reading\n", inputfile); exit(2); } while((c = getc(file_in)) != EOF) { val = c << 24; if ((c = getc(file_in)) == EOF) break; val |= c << 16; if ((c = getc(file_in)) == EOF) break; val |= c << 8; if ((c = getc(file_in)) == EOF) break; val |= c; if(tx > max_entry) { if(!overflow_ok) { fprintf(stderr, "in get_table: table overflow\n"); exit(2); } break; } table[tx++] = val; } fclose(file_in); return(tx); } get_index_type(s, dashn, num, attr, delim) char s[]; int *dashn, *num, *attr; char delim[]; { FILE *fp = fopen(s, "r"); char buf[MAX_LINE_LEN]; *dashn = *num = *attr = 0; *delim = '\0'; if (fp == NULL) return 0; fscanf(fp, "%s\n%%%d\n%%%d%s\n", buf, num, attr, delim); /* printf("get_index_type(): %s %d %d %s\n", buf, num, attr, delim); */ fclose(fp); if (strstr(buf, "1234567890")) *dashn = ON; return *num; } /* Read offset from srcbuf first so that you can use it with srcbuf=destbuf */ get_block_numbers(srcbuf, destbuf, partfp) unsigned char *srcbuf, *destbuf; FILE *partfp; { int offset, pat_size; static int printederror = 0; /* Does not do caching of blocks seen so far: done in OS hopefully */ offset = (srcbuf[0] << 24) | (srcbuf[1] << 16) | (srcbuf[2] << 8) | (srcbuf[3]); pat_size = decode32b(offset); if (-1 == fseek(partfp, pat_size, 0)) { if (!printederror) { fprintf(stderr, "Warning! 
Error in the format of the index!\n"); printederror = 1; } } destbuf[0] = '\n'; destbuf[1] = '\0'; destbuf[2] = '\0'; destbuf[3] = '\0'; if (fgets(destbuf, REAL_INDEX_BUF - MAX_WORD_BUF - 1, partfp) == NULL) { destbuf[0] = '\n'; destbuf[1] = '\0'; destbuf[2] = '\0'; destbuf[3] = '\0'; } } int num_filter=0; int filter_len[MAX_FILTER]; CHAR *filter[MAX_FILTER]; CHAR *filter_command[MAX_FILTER]; struct stat filstbuf; /* Prototype for filter entry point in a shared library -- CV 9/14/99 */ typedef int (*FILTER_FUNC)(FILE *, FILE *); /* Holds addresses of entry points -- CV 9/14/99 */ static FILTER_FUNC filter_func[MAX_FILTER]; /* Holds shared library handles, one per filter and shared library -- CV 9/14/99 */ static void *filter_handle[MAX_FILTER]; /* Loads shared filter libraries. The criterion, upon which this function decides whether a filter is an external program or a shared library is as follows: If the name of the filter command ends in ".so", it assumes that the filter is a shared library; otherwise it assumes an external program. apply_filters() finds out what kind the ith filter is used by looking at filter_func[i]. If it is non-null, it is a pointer to the entry point in a shared library. Otherwise, the filter is an external program. 
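A minimal sketch of the suffix rule described above, kept apart from the real loader. The helper name looks_like_shared_filter() is hypothetical and is not part of glimpse; it only mirrors the length and ".so" test that load_dyn_filters() applies to each filter_command[i], and it makes no dlopen() or dlsym() call.

```c
#include <string.h>

// Hypothetical helper, for illustration only.
// Returns 1 when a filter command would be treated as a shared library
// under the rule above, 0 when it would be run as an external program.
// Like load_dyn_filters(), it requires more than 4 characters in total,
// so a 4-character name such as "a.so" is not treated as a library.
static int looks_like_shared_filter(const char *command)
{
    size_t len = strlen(command);

    if (len <= 4)
        return 0;
    // Compare the last three characters with the ".so" suffix.
    return strcmp(command + len - 3, ".so") == 0;
}
```

Under this rule a command such as ./html2txt.so would be handed to dlopen(), while a command such as gzip -dc would go through system(); both command strings are made up for illustration.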
-- CV 9/14/99 */ #ifndef RTLD_NOW /* dummy function for old system such as SunOS-4.1.3 */ static void load_dyn_filters(void) {} #else static void load_dyn_filters(void) { int i, len, success; char *so_pos; char *error; memset(filter_func, '\0', sizeof (filter_func)); memset(filter_handle, '\0', sizeof (filter_handle)); for (i = 0; i < num_filter; i++) { success = 1; len = strlen(filter_command[i]); if (len > 4) { /* find location in string where .so suffix should be */ so_pos = (char *)(filter_command[i] + len - 3); if (strcmp(so_pos, ".so") == 0) { /* fprintf(stderr, "Loading %s\n", filter_command[i]); */ if ((filter_handle[i] = dlopen(filter_command[i], RTLD_NOW)) == NULL) { success = 0; error = dlerror(); } if (success) { filter_func[i] = dlsym(filter_handle[i], "filter_func"); if ((error = dlerror()) != NULL) success = 0; } if (! success) { fputs("Warning: Error while loading filter\n", stderr); fputs(error, stderr); if (filter_handle[i] != NULL) { /* lib was already loaded when error happened */ dlclose(filter_handle[i]); } /* Since an error occurred, should we disable the filter entirely for this command? If yes, HOW? -- CV */ } } } } } #endif /* RTLD_NOW */ /* Applies a filter to a single file, based on whether the filter is a command or in a shared library. filter_number is the filter's index in the filter table. 
Returns 0 on success, passes on error code from the filter otherwise -- CV 9/14/99 */ static int apply_one_filter(int filter_number, const char *in_name, const char *out_name) { int ret = 0; if (filter_func[filter_number] != NULL) { /* in shared library */ FILE *in = NULL, *out = NULL; in = fopen(in_name, "r"); if (in != NULL) out = fopen(out_name, "w"); if (in == NULL || out == NULL) ret = errno; else ret = (*filter_func[filter_number])(in, out); if (out != NULL) fclose(out); if (in != NULL) fclose(in); } else { /* in external program */ char escaped_in[MAX_LINE_LEN], escaped_out[MAX_LINE_LEN]; char command[2 * MAX_LINE_LEN]; escapesinglequote(in_name, escaped_in); escapesinglequote(out_name, escaped_out); sprintf(command, "exec %s '%s' > '%s'", filter_command[filter_number], escaped_in, escaped_out); /* FIXME: use snprintf() where available to avoid stupid buffer overruns. But security risk should be low, because no user-supplied data goes in here. -- CV 9/14/99. */ ret = system(command); } return ret; } /* Copies a file verbatim. It could be a little better optimized by using fread() and fwrite(), but speed is not critical here. Returns 0 on success, error code otherwise. -- CV 9/14/99 */ static int copy_file(const char *source, const char *destination) { FILE *src = NULL, *dest = NULL; int c, ret = 0; src = fopen(source, "r"); if (src != NULL) dest = fopen(destination, "w"); if (src == NULL || dest == NULL) ret = errno; else { /* FIXME: Better error checking here. But it's not really critical, because these are only temporary files. 
-- CV 9/14/99 */ while ((c = fgetc(src)) != EOF) fputc(c, dest); } if (dest != NULL) fclose(dest); if (src != NULL) fclose(src); return ret; } read_filters(index_dir, dofilter) char *index_dir; int dofilter; { int len; int patlen; int patpos; int commandpos; FILE *filterfile; char filterbuf[MAX_LINE_LEN]; char tempbuf[MAX_LINE_LEN]; char s[MAX_LINE_LEN]; num_filter = 0; memset(filter, '\0', sizeof(CHAR *) * MAX_FILTER); memset(filter_command, '\0', sizeof(CHAR *) * MAX_FILTER); memset(filter_len, '\0', sizeof(int) * MAX_FILTER); if (!dofilter) return; sprintf(s, "%s/%s", index_dir, FILTER_FILE); filterfile = fopen(s, "r"); if(filterfile == NULL) { /* fprintf(stderr, "can't open filter file %s\n", s); -- no need */ num_filter = 0; } else if (fstat(fileno(filterfile), &filstbuf) == -1) { num_filter = 0; } else { while((num_filter < MAX_FILTER) && fgets(filterbuf, MAX_LINE_LEN, filterfile)) { if ((len = strlen(filterbuf)) < 1) continue; filterbuf[len-1] = '\0'; commandpos = 0; while ((commandpos < len) && ((filterbuf[commandpos] == ' ') || (filterbuf[commandpos] == '\t'))) commandpos ++; /* leading spaces */ if (commandpos >= len) continue; if (filterbuf[commandpos] == '\'') { commandpos ++; patpos = commandpos; patlen = 0; while (commandpos < len) { if (filterbuf[commandpos] == '\\') { commandpos += 2; patlen += 2; } else if (filterbuf[commandpos] != '\'') { commandpos ++; patlen ++; } else break; } if ((commandpos >= len) || (patlen <= 0)) continue; commandpos ++; } else { patpos = commandpos; patlen = 0; while ((commandpos < len) && (filterbuf[commandpos] != ' ') && (filterbuf[commandpos] != '\t')) { commandpos ++; patlen ++; } while ((commandpos < len) && ((filterbuf[commandpos] == ' ') || (filterbuf[commandpos] == '\t'))) commandpos ++; if (commandpos >= len) continue; } memcpy(tempbuf, &filterbuf[patpos], patlen); tempbuf[patlen] = '\0'; if ((filter_len[num_filter] = convert2agrepregexp(tempbuf, patlen)) == 0) continue; /* inplace conversion */ 
filter[num_filter] = (unsigned char *) strdup(tempbuf); filter_command[num_filter] = (unsigned char *)strdup(&filterbuf[commandpos]); num_filter ++; } fclose(filterfile); } load_dyn_filters(); /* load filters in shared libraries -- CV 9/14/99 */ } /* 1 if filter application was successful and the output (>1B) is in outname, 2 if some pattern matched but there is no output, 0 otherwise: sep 15-18 '94 */ /* memagrep is initialized in partition.c for calls from dir.c, and it is already done by the time we call this function from main.c */ apply_filter(inname, outname) char *inname, *outname; /* outname is in-out, inname is in */ { int i; char name[MAX_LINE_LEN], es1[MAX_LINE_LEN], es2[MAX_LINE_LEN]; int name_len = strlen(inname); char s[MAX_LINE_LEN]; FILE *dummyout; FILE *dummyin; char dummybuf[4]; char prevoutname[MAX_LINE_LEN]; char newoutname[MAX_LINE_LEN]; char tempoutname[MAX_LINE_LEN]; char tempinname[MAX_LINE_LEN]; int ret = 0; int unlink_prevoutname = 0; if (num_filter <= 0) return 0; if ((dummyout = fopen("/dev/null", "w")) == NULL) return 0; /* ready for memagrep */ name[0] = '\n'; special_get_name(inname, name_len, tempinname); name_len = strlen(tempinname); strcpy(name+1, tempinname); strcpy(prevoutname, tempinname); strcpy(newoutname, outname); /* Current properly filtered output is always in prevoutname */ for(i=0; i<num_filter; i++) { if (filter_len[i] > 0) { char *suffix; name[name_len + 1] = '\0'; if ((suffix = strstr(name+1, filter[i])) != NULL) { /* Chris Dalton */ if (ret == 0) ret = 2; /* yes, it matched: now apply the command and get the output */ /* printf("filtering %s\n", name); */ /* new filter function -- CV 9/14/99 */ apply_one_filter(i, prevoutname, newoutname); if (((dummyin = my_fopen(newoutname, "r")) == NULL) || (fread(dummybuf, 1, 1, dummyin) <= 0)) { if (dummyin != NULL) fclose(dummyin); unlink(newoutname); continue; } /* Filter was successful: output exists and has at least 1 byte in it */ fclose(dummyin); if (unlink_prevoutname) { unlink(prevoutname); strcpy(tempoutname,
prevoutname); strcpy(prevoutname, newoutname); strcpy(newoutname, tempoutname); } else { strcpy(prevoutname, newoutname); sprintf(newoutname, "%s.o", prevoutname); } ret = 1; unlink_prevoutname = 1; #if 1 /* if the matched text was a proper suffix of the name, */ /* remove the suffix just processed before examining the */ /* name again. Chris Dalton */ /* And I don't know what the equivalent thing is with */ /* memagrep_search: since it doesn't return a pointer to */ /* the place where the match occured. Burra Gopal */ if (strcmp(filter[i], suffix) == 0) { name_len -= strlen(suffix); *suffix= '\0'; } #endif /*1*/ if (strlen(newoutname) >= MAX_LINE_LEN - 1) break; } } else { /* must call memagrep */ name[name_len + 1] = '\n'; /* memagrep wants names to end with '\n': '\0' is not necessary */ /* printf("i=%d filterlen=%d filter=%s inlen=%d input=%s\n", i, -filter_len[i], filter[i], len_current_dir_buf, current_dir_buf); */ if (((filter_len[i] == -2) && (filter[i][0] == '.') && (filter[i][1] == '*')) || (memagrep_search(-filter_len[i], filter[i], name_len + 2, name, 0, dummyout) > 0)) { if (ret == 0) ret = 2; /* yes, it matched: now apply the command and get the output */ /* printf("filtering %s\n", name); */ /* new filter function -- CV 9/14/99 */ apply_one_filter(i, prevoutname, newoutname); if (((dummyin = my_fopen(newoutname, "r")) == NULL) || (fread(dummybuf, 1, 1, dummyin) <= 0)) { if (dummyin != NULL) fclose(dummyin); unlink(newoutname); continue; } /* Filter was successful: output exists and has atleast 1 byte in it */ fclose(dummyin); if (unlink_prevoutname) { unlink(prevoutname); strcpy(tempoutname, prevoutname); strcpy(prevoutname, newoutname); strcpy(newoutname, tempoutname); } else { strcpy(prevoutname, newoutname); sprintf(newoutname, "%s.o", prevoutname); } ret = 1; unlink_prevoutname = 1; if (strlen(newoutname) >= MAX_LINE_LEN - 1) break; } } } if (ret == 1) strcpy(outname, prevoutname); else { /* dummy filter that copies input to output: caller can 
use tempinname but this has easy interface */ /* replaced system() call with a simple copy function. -- CV 9/14/99 */ copy_file(tempinname, outname); } fclose(dummyout); return ret; } /* Use a modified wais stoplist to do this with simple strcmp's in a for loop */ static_stop_list(word) char *word; { return 0; } /* This is the stuff that used to be present in the old build_in.c */ /* Some variables used throughout */ FILE *TIMEFILE; /* file descriptor for sorting .glimpse_filenames by time */ #if BG_DEBUG FILE *LOGFILE; /* file descriptor for LOG output */ #endif /*BG_DEBUG*/ FILE *STATFILE; /* file descriptor for statistical data about indexed files */ FILE *MESSAGEFILE; /* file descriptor for important messages meant for the user */ char INDEX_DIR[MAX_LINE_LEN]; char sync_path[MAX_LINE_LEN]; struct stat istbuf; struct stat excstbuf; struct stat incstbuf; int ICurrentFileOffset; int NextICurrentFileOffset; /* Some options used throughout */ int GenerateHash = OFF; int KeepFilenames = OFF; int OneFilePerBlock = OFF; int total_size = 0; int total_deleted = 0; int MAXWORDSPERFILE = 0; int NUMERICWORDPERCENT = DEF_NUMERIC_WORD_PERCENT; int AddToIndex = OFF; int DeleteFromIndex = OFF; int PurgeIndex = ON; int FastIndex = OFF; int BuildDictionary = OFF; int BuildDictionaryExisting = OFF; int CompressAfterBuild = OFF; int IncludeHigherPriority = OFF; int FilenamesOnStdin = OFF; int ExtractInfo = OFF; int InfoAfterFilename = OFF; int FirstWordOfInfoIsKey = OFF; int UseFilters = OFF; int ByteLevelIndex = OFF; int RecordLevelIndex = OFF; /* When we want a -o like index but want to do booleans on a per-record basis directly from index: robint@zedcor.com */ /* This type of index doesn't make sense with attributes since they span > 1 record; hence StructuredIndex == -2 => this = ON */ int StoreByteOffset = OFF; /* In RecordLevelIndex, store record # for each word or byte offset of the record: record # is the default (12/12/96) */ char rdelim[MAX_LINE_LEN]; char 
old_rdelim[MAX_LINE_LEN]; int rdelim_len = 0; /* int IndexUnderscore = OFF; */ int IndexableFile = OFF; int MAX_INDEX_PERCENT = DEF_MAX_INDEX_PERCENT; int MAX_PER_MB = DEF_MAX_PER_MB; int I_THRESHOLD = DEF_I_THRESHOLD; int BigHashTable = OFF; int IndexEverything = OFF; int HashTableSize = MAX_64K_HASH; int BuildTurbo = OFF; int SortByTime = OFF; int AddedMaxWordsMessage = OFF; int AddedMixedWordsMessage = OFF; int icount=0; /* count the number of my_malloc for indices structure */ int hash_icount=0; /* to see how much was added to the current hash table */ int save_icount=0; /* to see how much was added to the index by the current file */ int numeric_icount=0; /* to see how many numeric words were there in the current file */ int mask_int[32] = MASK_INT; int p_table[MAX_PARTITION]; int memory_usage = 0; char * my_malloc(len) int len; { char *s; static int i=100; if ((s = malloc(len)) != NULL) memory_usage += len; else fprintf(stderr, "malloc failed after memory_usage = %x Bytes\n", memory_usage); /* Don't exit since might do traverse here: exit in glimpse though */ #if BG_DEBUG printf("m:%x ", memory_usage); i--; if (i==0) { printf("\n"); i = 100; } #endif /*BG_DEBUG*/ return s; } my_free(ptr, size) void *ptr; int size; { if (ptr) free(ptr); memory_usage -= size; #if BG_DEBUG printf("f:%x ", memory_usage); #endif /*BG_DEBUG*/ } int file_num = 0; int old_file_num = 0; /* upto what file number should disable list be accessed: < file_num if incremental indexing */ int new_file_num = -1; /* after purging index, how many files are left: for save_data_structures() */ int bp=0; /* buffer pointer */ unsigned char word[MAX_WORD_BUF]; int FirstTraverse1 = ON; struct indices *ip; /* Globals used in merge, and also in glimpse's get_index.c */ unsigned int *src_index_set = NULL; unsigned int *dest_index_set = NULL; unsigned char *src_index_buf = NULL; unsigned char *dest_index_buf = NULL; unsigned char *merge_index_buf = NULL; /* * Routines for zonal memory allocation for 
glimpseindex and very fast search in glimpse. */ int next_free_token = 0; struct token *free_token = NULL; /*[I_THRESHOLD/AVG_OCCURRENCES]; */ int next_free_indices = 0; struct indices *free_indices = NULL; /*[I_THRESHOLD]; */ int next_free_word = 0; char *free_word = NULL; /*[I_THRESHOLD/AVG_OCCURRENCES * AVG_WORD_LEN]; */ extern int usemalloc; /* * The beauty of this allocation scheme is that "free" does not need to be implemented! */ tokenallfree() { next_free_token = 0; } tokenfree(e, len) struct token *e; int len; { if (usemalloc) my_free(e, sizeof(struct token)); } struct token * tokenalloc(len) int len; { struct token *e; if (usemalloc) (e) = (struct token *)my_malloc(sizeof(struct token)); else { if (free_token == NULL) free_token = (struct token *)my_malloc(sizeof(struct token) * I_THRESHOLD / INDICES_PER_TOKEN); if (free_token == NULL) {fprintf(stderr, "malloc failure in tokenalloc()\n"); exit(2);} else (e) = ((next_free_token >= I_THRESHOLD / INDICES_PER_TOKEN) ? (NULL) : (&(free_token[next_free_token ++]))); } return e; } indicesallfree() { next_free_indices = 0; } indicesfree(e, len) struct indices *e; int len; { if (usemalloc) my_free(e, sizeof(struct indices)); } struct indices * indicesalloc(len) int len; { struct indices *e; if (usemalloc) (e) = (struct indices *)my_malloc(sizeof(struct indices)); else { if (free_indices == NULL) free_indices = (struct indices *)my_malloc(sizeof(struct indices) * I_THRESHOLD); if (free_indices == NULL) {fprintf(stderr, "malloc failure in indicesalloc()\n"); exit(2);} else (e) = ((next_free_indices >= I_THRESHOLD) ? 
(NULL) : (&(free_indices[next_free_indices ++]))); } return e; } /* For words in a token structure */ wordallfree() { next_free_word = 0; } wordfree(s, len) char *s; int len; { if (usemalloc) my_free(s, len); } char * wordalloc(len) int len; { char *s; if (usemalloc) (s) = (char *)my_malloc(len); else { if (free_word == NULL) free_word = (char *)my_malloc(AVG_WORD_LEN * I_THRESHOLD/INDICES_PER_TOKEN); if (free_word == NULL) {fprintf(stderr, "malloc failure in wordalloc()\n"); exit(2); } else (s) = ((next_free_word + len + 2 >= AVG_WORD_LEN * I_THRESHOLD/INDICES_PER_TOKEN) ? (NULL) : (&(free_word[next_free_word]))); if (s != NULL) next_free_word += (len); /* 2 for 1 char word with '\0' */ } return s; } struct mini *mini_array = NULL; int mini_array_len = 0; #if WORD_SORTED /* * Routines that operate on the index using the mini-index. * * The index is a list of words+delim+attr+offset+\n sorted * by the word (using strcmp). * * The mini-index keeps track of the offsets in the index * where every WORDS_PER_REGION-th word in the index occurs. * There is no direct way for glimpse to seek into the mini * file for the exact offset of this word since unlike hash * values words are of variable length. * * This is small enough to be kept in memory and searched * directly with full word case insensitive string compares * with binary search. For 256000 words in index there will be * 256000/128 = 2000 words in mini-index that will occupy * 2000*32 (avgword + off + delim/attr + sizeof(struct mini)), * which is less than 16 pages (can always be resident in mem). * * We just need to string search log_2(2000) + 128 words of * length 12B each in the worst case ===> VERY FAST. This is * not the best possible but space is the limit. If we hash the * whole index/regions in the index, we need TOO MUCH memory. */ /* * Binary search mini_array[beginindex..endindex); return 1 if success, 0 if failure. 
* Sets begin and end offsets for direct search; initially beginindex=0, endindex=mini_array_len */ int get_mini(word, len, beginoffset, endoffset, beginindex, endindex, minifp) unsigned char *word; int len; long *beginoffset, *endoffset; int beginindex, endindex; FILE *minifp; { int cmp, midindex; if ((mini_array == NULL) || (mini_array_len <= 0)) return 0; midindex = beginindex + (endindex - beginindex)/2; cmp = strcmp(word, mini_array[midindex].word); if (cmp < 0) { /* word DEFINITELY BEFORE midindex (but still at or after beginindex) */ if (beginindex >= midindex) { /* range of search is just ONE element in array */ *beginoffset = mini_array[midindex].offset; if (midindex + 1 < mini_array_len) { *endoffset = mini_array[midindex + 1].offset; } else *endoffset = -1; /* go till end of file */ return 1; } else return get_mini(word, len, beginoffset, endoffset, beginindex, midindex); } else { /* word DEFINITELY AT OR AFTER midindex (but still before endindex) */ if ((cmp == 0) || (endindex <= midindex + 1)) { /* range of search is just ONE element in array */ *beginoffset = mini_array[midindex].offset; if (midindex + 1 < mini_array_len) { *endoffset = mini_array[midindex + 1].offset; } else *endoffset = -1; /* go till end of file */ return 1; } else return get_mini(word, len, beginoffset, endoffset, midindex, endindex); } } /* Returns: #of words in mini_array if success or already read, -1 if failure */ int read_mini(indexfp, minifp) FILE *indexfp, *minifp; /* indexfp pointing right to first line of word+... 
*/ { unsigned char s[MAX_LINE_LEN], word[MAX_NAME_LEN]; int wordnum = 0, wordlen; long offset; struct stat st; if ((mini_array != NULL) && (mini_array_len > 0)) return mini_array_len; if (minifp == NULL) return 0; if (fstat(fileno(minifp), &st) == -1) { /* s is not filled in at this point, so do not print it */ fprintf(stderr, "Can't stat mini-index file\n"); return -1; } rewind(minifp); fscanf(minifp, "%d\n", &mini_array_len); if ((mini_array_len <= 0) || (mini_array_len > (st.st_size / 4 /* \n, space, 1char offset, 1char word */))) { fprintf(stderr, "Error in format of mini-index file\n"); return -1; } mini_array = (struct mini *)my_malloc(sizeof(struct mini) * mini_array_len); if (mini_array == NULL) return -1; /* my_malloc has already reported the failure */ memset(mini_array, '\0', sizeof(struct mini) * mini_array_len); while ((wordnum < mini_array_len) && (fscanf(minifp, "%s %ld\n", word, &offset) != EOF)) { wordlen = strlen((char *)word); mini_array[wordnum].word = (char *)my_malloc(wordlen + 2); strcpy((char *)mini_array[wordnum].word, (char *)word); mini_array[wordnum].offset = offset; wordnum ++; } return mini_array_len; } dump_mini(indexfile) char *indexfile; { unsigned char s[MAX_LINE_LEN], word[MAX_NAME_LEN]; FILE *indexfp; FILE *minifp; int wordnum = 0, j, attr_num; long offset; /* offset is the offset of the beginning of the word */ char temp_rdelim[MAX_LINE_LEN]; temp_rdelim[0] = '\0'; /* Initialize just in case.
10/25/99 --GV */ if ((indexfp = fopen(indexfile, "r")) == NULL) { fprintf(stderr, "Can't open for reading: %s\n", indexfile); return; } sprintf(s, "%s/%s.tmp", INDEX_DIR, MINI_FILE); if ((minifp = fopen(s, "w")) == NULL) { fprintf(stderr, "Can't open for writing: %s\n", s); fclose(indexfp); return; } fgets(s, 256, indexfp); /* indexnumbers */ fgets(s, 256, indexfp); /* onefileperblock */ fscanf(indexfp, "%%%d%s\n", &attr_num, temp_rdelim); /* structured index */ offset = ftell(indexfp); while (fgets(s, MAX_LINE_LEN, indexfp) != NULL) { if ((wordnum % WORDS_PER_REGION) == 0) { j = 0; while ((j < MAX_LINE_LEN) && (s[j] != WORD_END_MARK) && (s[j] != ALL_INDEX_MARK) && (s[j] != '\n')) j++; if ((j >= MAX_LINE_LEN) || (s[j] == '\n')) { wordnum ++; offset = ftell(indexfp); continue; } /* else it is WORD_END_MARK or ALL_INDEX_MARK */ s[j] = '\0'; strcpy((char *)word, (char *)s); if (fprintf(minifp, "%s %ld\n", word, offset) < 0) { /* fprintf returns a negative value on error */ fprintf(stderr, "Error: write failed at %s:%d\n", __FILE__, __LINE__); break; } mini_array_len ++; } wordnum ++; offset = ftell(indexfp); } fclose(indexfp); fflush(minifp); fclose(minifp); /* * Add amount of space needed for mini_array at the beginning */ sprintf(s, "%s/%s", INDEX_DIR, MINI_FILE); if ((minifp = fopen(s, "w")) == NULL) { fprintf(stderr, "Can't open for writing: %s\n", s); goto end; } sprintf(s, "%s/%s.tmp", INDEX_DIR, MINI_FILE); if ((indexfp = fopen(s, "r")) == NULL) { fprintf(stderr, "Can't open for reading: %s\n", s); fclose(minifp); goto end; } fprintf(minifp, "%d\n", mini_array_len); while (fgets(s, MAX_LINE_LEN, indexfp) != NULL) { fputs(s, minifp); } fclose(indexfp); /* close the temporary file before it is unlinked below */ fflush(minifp); fclose(minifp); end: sprintf(s, "%s/%s.tmp", INDEX_DIR, MINI_FILE); unlink(s); return; } #else /* WORD_SORTED */ int get_mini(word, len, beginoffset, endoffset, beginindex, endindex, minifp) unsigned char *word; int len; long *beginoffset, *endoffset; int beginindex, endindex; FILE *minifp; { int index; unsigned char array[sizeof(int)]; extern int
glimpse_isserver; /* in agrep/agrep.c */ index = hash64k(word, len); if ((mini_array == NULL) || (mini_array_len <= 0) || !glimpse_isserver) { if (minifp == NULL) return 0; fseek(minifp, (long)(index*sizeof(int)), 0); if (fread((void *)array, sizeof(int), 1, minifp) != 1) return 0; *beginoffset = decode32b((array[0] << 24) | (array[1] << 16) | (array[2] << 8) | array[3]); if (fread((void *)array, sizeof(int), 1, minifp) != 1) *endoffset = -1; else *endoffset = decode32b((array[0] << 24) | (array[1] << 16) | (array[2] << 8) | array[3]); return 1; } *beginoffset = mini_array[index].offset; if (index + 1 < endindex) *endoffset = mini_array[index + 1].offset; else *endoffset = -1; return 1; } /* Returns: #of words in mini_array if success or already read, -1 if failure */ int read_mini(indexfp, minifp) FILE *indexfp, *minifp; /* indexfp pointing right to first line of word+... */ { unsigned char s[MAX_LINE_LEN], array[sizeof(int)]; int offset, hash_value; if ((mini_array != NULL) && (mini_array_len > 0)) return mini_array_len; if (minifp == NULL) return 0; rewind(minifp); mini_array_len = MINI_ARRAY_LEN; mini_array = (struct mini *)my_malloc(sizeof(struct mini) * mini_array_len); memset(mini_array, '\0', sizeof(struct mini) * mini_array_len); hash_value = 0; /* line# I am going to scan */ offset = 0; while ((hash_value < MINI_ARRAY_LEN) && (fread((void *)array, sizeof(int), 1, minifp) == 1)) { offset = (array[0] << 24) | (array[1] << 16) | (array[2] << 8) | array[3]; mini_array[hash_value++].offset = decode32b(offset); } for (; hash_value= linelen) || (s[j] == '\n') || (s[j] == '\0')) { continue; } /* else it is WORD_END_MARK or ALL_INDEX_MARK */ c = s[j]; s[j] = '\0'; hash_value = hash64k(s, j); s[j] = c; fprintf(newindexfp, "%d ", hash_value); if (fputs(s, newindexfp) == EOF) { fprintf(stderr, "Error: write failed at %s:%d\n", __FILE__, __LINE__); exit(2); } } fclose(indexfp); fflush(newindexfp); fclose(newindexfp); #if SFS_COMPAT unlink(indexfile); #else sprintf(s, 
"exec %s '%s'", SYSTEM_RM, escapesinglequote(indexfile, es1)); system(s); #endif #if DONTUSESORT_T_OPTION || SFS_COMPAT sprintf(s, "exec %s -n '%s.tmp' > '%s'\n", SYSTEM_SORT, escapesinglequote(indexfile, es1), escapesinglequote(indexfile, es2)); #else sprintf(s, "exec %s -n -T '%s' '%s.tmp' > '%s'\n", SYSTEM_SORT, escapesinglequote(INDEX_DIR, es1), escapesinglequote(indexfile, es2), escapesinglequote(indexfile, es3)); #endif rc = system(s); if (rc >> 8) { fprintf (stderr, "'sort' command:\n"); fprintf (stderr, " %s\n", s); fprintf (stderr, "failed with exit status %d\n", rc>>8); exit(2); } #if SFS_COMPAT sprintf(s, "%s.tmp", indexfile); unlink(s); #else sprintf(s, "exec %s '%s.tmp'", SYSTEM_RM, escapesinglequote(indexfile, es1)); system(s); #endif system(sync_path); /* sync() has a BUG */ /* * Now dump the mini-file's offsets and create the stripped index file */ if ((indexfp = fopen(indexfile, "r")) == NULL) { fprintf(stderr, "Can't open for reading: %s\n", indexfile); exit(2); } sprintf(s, "%s.tmp", indexfile); if ((newindexfp = fopen(s, "w")) == NULL) { fprintf(stderr, "Can't open for writing: %s\n", s); fclose(indexfp); exit(2); } sprintf(s, "%s/%s", INDEX_DIR, MINI_FILE); if ((minifp = fopen(s, "w")) == NULL) { fprintf(stderr, "Can't open for writing: %s\n", s); fclose(indexfp); fclose(newindexfp); exit(2); } fputs(indexnumber, newindexfp); fputs(onefileperblock, newindexfp); if (attr_num != -2) fprintf(newindexfp, "%%%d\n", attr_num); else fprintf(newindexfp, "%%%d %s\n", attr_num, temp_rdelim); prev_hash_value = -1; hash_value = 0; offset = ftell(newindexfp); while (fgets(s, MAX_LINE_LEN, indexfp) != NULL) { linelen = strlen(s); t = s; while ((*t != ' ') && (t < s + linelen)) t++; if (t >= s + linelen) continue; *t = '\0'; sscanf(s, "%d", &hash_value); t ++; /* points to first character of the beginning of s */ fputs(t, newindexfp); if (hash_value != prev_hash_value) { for (j=prev_hash_value + 1; j<=hash_value; j++) { eoffset = encode32b((int)offset); 
putc((eoffset & 0xff000000) >> 24, minifp); putc((eoffset & 0xff0000) >> 16, minifp); putc((eoffset & 0xff00) >> 8, minifp); if (putc((eoffset & 0xff), minifp) == EOF) { fprintf(stderr, "Error: write failed at %s:%d\n", __FILE__, __LINE__); exit(2); } } prev_hash_value = hash_value; } offset = ftell(newindexfp); } for (hash_value = prev_hash_value + 1; hash_value> 24, minifp); putc((eoffset & 0xff0000) >> 16, minifp); putc((eoffset & 0xff00) >> 8, minifp); if (putc((eoffset & 0xff), minifp) == EOF) { fprintf(stderr, "Error: write failed at %s:%d\n", __FILE__, __LINE__); exit(2); } } fclose(indexfp); fflush(newindexfp); fclose(newindexfp); fflush(minifp); fclose(minifp); #if SFS_COMPAT unlink(indexfile); #else sprintf(s, "exec %s '%s'", SYSTEM_RM, escapesinglequote(indexfile, es1)); system(s); #endif #if SFS_COMPAT sprintf(s, "%s.tmp", indexfile); rename(s, indexfile); #else sprintf(s, "exec %s '%s.tmp' '%s'\n", SYSTEM_MV, escapesinglequote(indexfile, es1), escapesinglequote(indexfile, es2)); system(s); #endif system(sync_path); /* sync() has a BUG */ } #endif /* WORD_SORTED */ /* Creates data structures that are related to the number of files present in * ".glimpse_filenames". These data structures are: * 1. index sets -- use my_malloc * 2. index bufs -- use my_malloc * Once this is done, this function can be called directly from glimpse/get_filenames() * and that can use all sets/bufs data structures directly. * This doesn't care how name_list() is created to be an array of arrays to be able to * add/delete dynamically from it: this uses malloc completely. * But: * disable_list (which is used only inside glimpse_index) must be malloced separately. * multi_dest_index_set (which is used only inside glimpse) must be malloced separately. 
*/ initialize_data_structures(files) int files; { FILEMASK_SIZE = ((files + 1)/(8*sizeof(int)) + 4); REAL_PARTITION = (FILEMASK_SIZE + 4); if (REAL_PARTITION < MAX_PARTITION + 2) REAL_PARTITION = MAX_PARTITION + 2; REAL_INDEX_BUF = ((files + 1)*3 + 3*8*sizeof(int) + 6*MAX_WORD_BUF + 2); /* index line length with OneFilePerBlock (and/or ByteLevelIndex) */ /* increased by *3 + 24*sizeof(int) to avoid segfaults --GV */ if (REAL_INDEX_BUF < MAX_SORTLINE_LEN) REAL_INDEX_BUF = MAX_SORTLINE_LEN; MAX_ALL_INDEX = (REAL_INDEX_BUF / 2); if (src_index_set == NULL) src_index_set = (unsigned int *)my_malloc(sizeof(int)*REAL_PARTITION); memset(src_index_set, '\0', sizeof(int) * REAL_PARTITION); if (dest_index_set == NULL) dest_index_set = (unsigned int *)my_malloc(sizeof(int)*REAL_PARTITION); memset(dest_index_set, '\0', sizeof(int) * REAL_PARTITION); /* malloc a few extra bytes above REAL_INDEX_BUF */ /* the value of REAL_INDEX_BUF is used to terminate loops, but in the loop up to 4 chars may be assigned */ if (src_index_buf == NULL) src_index_buf = (unsigned char *)my_malloc(sizeof(char)*(REAL_INDEX_BUF+4)); memset(src_index_buf, '\0', sizeof(char)*(REAL_INDEX_BUF+4)); if (dest_index_buf == NULL) dest_index_buf = (unsigned char *)my_malloc(sizeof(char)*(REAL_INDEX_BUF + 4)); memset(dest_index_buf, '\0', sizeof(char)*(REAL_INDEX_BUF+4)); if (merge_index_buf == NULL) merge_index_buf = (unsigned char *)my_malloc(sizeof(char)*(REAL_INDEX_BUF + 4)); memset(merge_index_buf, '\0', sizeof(char)*(REAL_INDEX_BUF+4)); } destroy_data_structures() { if (src_index_set != NULL) free(src_index_set); src_index_set = NULL; if (dest_index_set != NULL) free(dest_index_set); dest_index_set = NULL; if (src_index_buf != NULL) free(src_index_buf); src_index_buf = NULL; if (dest_index_buf != NULL) free(dest_index_buf); dest_index_buf = NULL; if (merge_index_buf != NULL) free(merge_index_buf); merge_index_buf = NULL; } /* We MUST be able to parse name as: "goodoldunixfilename firstwordofotherinfo 
restofotherinfo_whichifNULL_willnotbeprecededbyblanklikeitdoeshere\n" */ /* len is strlen(name), begin points to ^ and end points to ^: the firstwordofotherinfo can be used to create .glimpse_filehash when -U ON */ /* Restriction: the 3 strings above cannot contain '\n' or '\0' or ' ' */ /* returns 0 if parsing was successful, -1 if error */ /* begin/end values are NOT stored for each file (painful!), so this function may be called multiple times for the same name: caller MUST save if reqd. */ int special_parse_name(name, len, begin, end) char *name; int len; int *begin, *end; { int i; int index; *begin = -1; *end = -1; if (InfoAfterFilename || ExtractInfo) { /* Glimpse will ALWAYS terminate filename at first blank (no ' ', '\n', '\0' in filename) */ /* Trying to use FILE_END_MARK instead of blank! --GB 6/7/99 */ for (i=0; i= *end) { *end = *begin - 1; *begin = 0; /* was returning -1 before, but if can't find any "firstwordofinfo", then just use the first word in buffer for indexing... */ } return 0; } } else { *begin = 0; *end = len; return 0; } } /* Puts the actual name of the file in the file-system into temp (caller must pass buffer that is large enough to hold it...)
*/ int special_get_name(name, len, temp) char *name; int len; char *temp; { int begin=-1, end=-1; if (name == NULL) return -1; if (len < 0) len = strlen(name); if (len <= 0) { errno = EINVAL; return -1; } if (special_parse_name(name, len, &begin, &end) == -1) return -1; if ((begin >= MAX_LINE_LEN) || (len >= MAX_LINE_LEN)) { errno = ENAMETOOLONG; return -1; } if (begin > 0) { /* points to first element of the information (like URL) stored after filename */ memcpy(temp, name, begin-1); temp[begin-1] = '\0'; } else { /* no other information stored with filename */ memcpy(temp, name, len); temp[len] = '\0'; } return 0; } /* Must NOT write into name or flag since they may be passed as "const" char* on some systems */ FILE * my_fopen(name, flag) char *name; char *flag; { int len; char temp[MAX_LINE_LEN]; if (name == NULL) return NULL; len = strlen(name); if (special_get_name(name, len, temp) == -1) return NULL; return fopen(temp, flag); } int my_open(name, flag, mode) char *name; int flag, mode; { int len; char temp[MAX_LINE_LEN]; if (name == NULL) return -1; len = strlen(name); if (special_get_name(name, len, temp) == -1) return -1; return open(temp, flag, mode); } int my_stat(name, buf) char *name; struct stat *buf; { int len; char temp[MAX_LINE_LEN]; if (name == NULL) return -1; len = strlen(name); if (special_get_name(name, len, temp) == -1) return -1; return stat(temp, buf); } int my_lstat(name, buf) char *name; struct stat *buf; { int len; char temp[MAX_LINE_LEN]; if (name == NULL) return -1; len = strlen(name); if (special_get_name(name, len, temp) == -1) return -1; return lstat(temp, buf); } /* Changed hash-routines to look at exactly that portion of the filename that occurs before the first blank character, */ /* and use that to compare names: Oct/96 --- But lose efficiency since must parse name everytime: at least 1 string copy */ /* Using FILE_END_MARK instead of blank. 
--GB 6/7/99 */ name_hashelement *name_hashtable[MAX_64K_HASH]; /* if (!BigFilenameHashTable) then only the first 4K entries in it are used */ /* * Returns the index of the name if the it is found amongst the set * of files in name_array; -1 otherwise. */ int get_filename_index(name) char *name; { int index; int len; int i, begin=-1, end=-1; /* int skips=0; */ name_hashelement *e; char *temp; int temp_len; if (name == NULL) return -1; len = strlen(name); if (special_parse_name(name, len, &begin, &end) == -1) return -1; if ((begin >= MAX_LINE_LEN) || (len >= MAX_LINE_LEN)) { errno = ENAMETOOLONG; return -1; } temp = name; if (begin > 0) { /* points to first element of the information (like URL) stored after filename */ temp_len = begin - 1; } else { /* no other information stored with filename */ temp_len = len; } if (FirstWordOfInfoIsKey) index = hashNk(name, end-begin); else { /* hash on filename */ if (begin <= 0) index = hashNk(name, len); else index = hashNk(name, begin-1); } e = name_hashtable[index]; while((e != NULL) && (strncmp(temp, e->name, temp_len))) { /* skips ++; */ e = e->next; } /* fprintf(STATFILE, "skips = %d\n", skips); */ if (e == NULL) return -1; return e->index; } insert_filename(name, name_index) char *name; int name_index; { int len; int index; int i, begin=-1, end=-1; name_hashelement **pe; char *temp; int temp_len; if (name == NULL) return; len = strlen(name); if (special_parse_name(name, len, &begin, &end) == -1) return; if ((begin >= MAX_LINE_LEN) || (len >= MAX_LINE_LEN)) { errno = ENAMETOOLONG; return; } temp = name; if (begin > 0) { /* points to first element of the information (like URL) stored after filename */ temp_len = begin - 1; } else { /* no other information stored with filename */ temp_len = len; } if (FirstWordOfInfoIsKey) index = hashNk(name, end-begin); else { /* hash on filename */ if (begin <= 0) index = hashNk(name, len); else index = hashNk(name, begin-1); } pe = &name_hashtable[index]; while((*pe != NULL) && 
(strncmp((*pe)->name, temp, temp_len))) pe = &(*pe)->next; if ((*pe) != NULL) return; if ((*pe = (name_hashelement *)my_malloc(sizeof(name_hashelement))) == NULL) { fprintf(stderr, "malloc failure in insert_filename %s:%d\n", __FILE__, __LINE__); exit(2); } (*pe)->next = NULL; #if 0 if (((*pe)->name = (char *)my_malloc(len + 2)) == NULL) { fprintf(stderr, "malloc failure in insert_filename %s:%d\n", __FILE__, __LINE__); exit(2); } strcpy((*pe)->name, name); #else (*pe)->name = name; #endif (*pe)->name_len = strlen(name); (*pe)->index = name_index; } change_filename(name, len, index, newname) char *name; int len; int index; char *newname; { name_hashelement **pe, *t; char temp[MAX_LINE_LEN]; int temp_len; if (special_get_name(name, len, temp) == -1) return; temp_len = strlen(temp); pe = &name_hashtable[index]; while((*pe != NULL) && (strncmp((*pe)->name, temp, temp_len))) pe = &(*pe)->next; if ((*pe) == NULL) return; #if 0 my_free((*pe)->name); #endif (*pe)->name = newname; return; } delete_filename(name, name_index) char *name; int name_index; { int len; int index; int i, begin=-1, end=-1; name_hashelement **pe, *t; char *temp; int temp_len; if (name == NULL) return; len = strlen(name); if (special_parse_name(name, len, &begin, &end) == -1) return; if ((begin >= MAX_LINE_LEN) || (len >= MAX_LINE_LEN)) { errno = ENAMETOOLONG; return; } temp = name; if (begin > 0) { /* points to first element of the information (like URL) stored after filename */ temp_len = begin - 1; } else { /* no other information stored with filename */ temp_len = len; } if (FirstWordOfInfoIsKey) index = hashNk(name, end-begin); else { /* hash on filename */ if (begin <= 0) index = hashNk(name, len); else index = hashNk(name, begin-1); } pe = &name_hashtable[index]; while((*pe != NULL) && (strncmp((*pe)->name, temp, temp_len))) pe = &(*pe)->next; if ((*pe) == NULL) return; t = *pe; *pe = (*pe)->next; #if 0 my_free(t->name); #endif my_free(t, sizeof(name_hashelement)); return; } 
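/*
 * Illustrative sketch only; not part of the original glimpse sources.
 * The hash-table routines above compare names on the part that precedes
 * any extra information (such as a URL) stored after the filename.
 * The helpers below show that idea in isolation.
 * Assumptions: the separator is passed in explicitly, because the value of
 * FILE_END_MARK is not shown here; example_key_len() and example_same_key()
 * are hypothetical names. Unlike the strncmp() calls above, which only
 * compare a prefix, example_same_key() also requires both filename parts to
 * have the same length, so "a.txt" is not treated as equal to "a.txt2".
 */
#include <stddef.h>
#include <string.h>

/* Length of the filename part: everything before the first sep,
 * or the whole string when sep does not occur. */
static size_t example_key_len(const char *entry, char sep)
{
	const char *p = strchr(entry, sep);
	return (p == NULL) ? strlen(entry) : (size_t)(p - entry);
}

/* Returns 1 if both entries carry the same filename part, 0 otherwise. */
static int example_same_key(const char *a, const char *b, char sep)
{
	size_t la = example_key_len(a, sep);
	size_t lb = example_key_len(b, sep);
	return (la == lb) && (memcmp(a, b, la) == 0);
}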
init_filename_hashtable() { int i; for (i=0; inext; #if 0 my_free(t->name); #endif my_free(t, sizeof(name_hashelement)); } *pe = NULL; } built_filename_hashtable = 0; } long get_file_time(fp, stbuf, name, i) FILE *fp; struct stat *stbuf; char *name; int i; { CHAR array[sizeof(long)]; int xx; long ret = 0; struct stat mystbuf; if (fp != NULL) { fseek(fp, i*sizeof(long), 0); fread(array, sizeof(long), 1, fp); for (xx=0; xxst_mtime; } else { if (my_stat(name, &mystbuf) == -1) ret = 0; else ret = mystbuf.st_mtime; } return ret; } glimpse-4.18.7/index/lib.c000066400000000000000000000007051300371307100152710ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ /* ./glimpse/index/lib.c */ #include unsigned char *strdup(str) unsigned char *str; { int len; unsigned char *str1, *str1_bak; extern char *my_malloc(); len = strlen(str); str1 = (unsigned char *) my_malloc(len + 2); if(str1 == NULL) { fprintf(stderr, "malloc failure\n"); exit(2); } str1_bak = str1; while(*str1++ = *str++); return(str1_bak); } glimpse-4.18.7/index/memlook.c000066400000000000000000000043671300371307100161760ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ /* ./glimpse/index/newmemlook.c */ /* by Christian Vogler, Jan 9, 2000 */ /* I completely rewrote memlook, because in its original form the function has too many problems. It is prone to several buffer overruns. In addition, it puts a sentinel value past the end of the buffer, which may corrupt other data structures. The purpose of this function is to match a string in an area of text. If the string starts with a newline, it is matched as well, *except* at the very beginning of the text area. In this latter case, the newline is ignored. Returns the index of the first character of the matching string in the text area (excluding beginning newline), or -2 if it cannot be found. 
*/ #include #include int memlook(const unsigned char *pattern, const unsigned char *text, int length) { int pattern_length; const unsigned char *text_end = text + length; const unsigned char *pat_compare = pattern; /* next pattern character to be compared */ const unsigned char *text_ptr = text; /* next text location to be searched for pattern */ const unsigned char *text_compare; /* next text character to be compared in a pattern comparision */ int found = 0; pattern_length = strlen((char *) pattern); /* First check if pattern starts with a newline. If yes, ignore newline and try to match rest of pattern at beginning of text. */ if (pattern[0] == '\n') pat_compare++; while (text_ptr + pattern_length <= text_end) { text_compare = text_ptr; while (pat_compare < pattern + pattern_length) { if (*pat_compare != *text_compare) break; pat_compare++; text_compare++; } found = (pat_compare == pattern + pattern_length); if (found) break; /* pattern has been found */ /* not found? then reset pattern comparison character pointer and try next text character */ pat_compare = pattern; text_ptr++; } if (found) { /* ignore beginning newline in pattern when returning position */ if (pattern[0] == '\n' && *text_ptr == '\n') text_ptr++; return text_ptr - text; } else return (-2); } glimpse-4.18.7/index/partition.c000066400000000000000000001136631300371307100165440ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. 
*/ /* ./glimpse/index/partition.c */ #include "glimpse.h" #include #include #include extern int BigFilenameHashTable; extern int DeleteFromIndex; extern int FastIndex; extern int FilenamesOnStdin; extern char INDEX_DIR[MAX_LINE_LEN]; extern char sync_path[MAX_LINE_LEN]; extern int file_num; /* the number of files */ extern int new_file_num; /* the new number of files after purging some from index */ extern char **name_list[MAXNUM_INDIRECT]; /* to store the file names */ extern int *size_list[MAXNUM_INDIRECT]; /* store size of each file */ extern int p_table[MAX_PARTITION]; /* partition table, the i-th partition begins at p_table[i] and ends at p_tables[i+1] */ extern int p_size_list[MAX_PARTITION]; /* sum of the sizes of the files in each partition */ extern int part_num; /* number of partitions, 1 initially since partition # 0 is not accessed */ extern int built_filename_hashtable; extern name_hashelement *name_hashtable[MAX_64K_HASH]; extern int total_size; /* total size of the directory */ extern int total_deleted; /* number of files being deleted */ int part_size=DEFAULT_PART_SIZE; /* partition size */ int new_partition; int files_per_partition; int files_in_partition; int ATLEASTONEFILE = 0; extern int errno; char patbuf[MAX_PAT]; extern unsigned char *src_index_buf; extern unsigned char *dest_index_buf; extern int REAL_PARTITION, REAL_INDEX_BUF, MAX_ALL_INDEX, FILEMASK_SIZE; extern int memory_usage; extern struct indices *deletedlist; extern FILE *TIMEFILE; extern FILE *STATFILE; extern FILE *MESSAGEFILE; extern struct stat excstbuf; extern struct stat incstbuf; extern int GenerateHash; extern int KeepFilenames; extern int OneFilePerBlock; extern int ByteLevelIndex; extern int RecordLevelIndex; extern int rdelim_len; extern char rdelim[MAX_LINE_LEN]; extern char old_rdelim[MAX_LINE_LEN]; extern int StructuredIndex; extern int attr_num; extern char INDEX_DIR[MAX_LINE_LEN]; extern int AddToIndex; extern int IndexableFile; extern int BuildTurbo; extern int 
SortByTime; char *exin_argv[8]; int exin_argc; char current_dir_buf[2*MAX_LINE_LEN + 4]; /* must have space to store pattern after directory name */ unsigned char dummypat[MAX_PAT]; int dummylen; FILE *dummyout; partition(dir_num, dir_name) char **dir_name; int dir_num; { int num_pat=0; int num_inc=0; int len; long thetime; long prevtime; int theindex; int firsttime = 1; int xx; struct timeval tv; FILE *tmp_TIMEFILE; FILE *index_TIMEFILE; int ret; char **temp_name_list; int *temp_size_list; int temp_file_num; char S[MAX_LINE_LEN], S1[MAX_LINE_LEN], es1[MAX_LINE_LEN], es2[MAX_LINE_LEN], es3[MAX_LINE_LEN]; int pat_len[MAX_EXCLUSIVE]; int inc_len[MAX_EXCLUSIVE]; CHAR *inc[MAX_INCLUSIVE]; /* store the patterns used to mask in files */ CHAR *pat[MAX_EXCLUSIVE]; /* store the patterns that are used to mask out those files that are not to be indexed */ int MinPartNum; /* minimum number of partitions */ int i=0, j; int subtotal=0; int pdx = 0; /* index pointer for p_table */ FILE *patfile; /* file descriptor for prohibit pattern file */ FILE *incfile; /* file descriptor for include pattern file */ char *current_dir; /* must have '\n' before directory name */ char s[MAX_LINE_LEN]; char working_dir[MAX_LINE_LEN]; struct stat sbuf; current_dir_buf[0] = '\n'; current_dir_buf[1] = '\0'; current_dir = &current_dir_buf[1]; /* if (IndexableFile) goto directlytofsize; */ if ((dummyout = fopen("/dev/null", "w")) == NULL) return -1; exin_argv[0] = "glimpseindex"; exin_argv[1] = "dummypat"; exin_argc = 2; if ((dummylen = memagrep_init(exin_argc, exin_argv, MAX_PAT, dummypat)) <= 0) return -1; /* exclude/include pattern search */ sprintf(s, "%s/%s", INDEX_DIR, PROHIBIT_LIST); patfile = fopen(s, "r"); if(patfile == NULL) { /* fprintf(stderr, "can't open exclude-pattern file\n"); -- no need!
*/ num_pat = 0; } else { while((num_pat < MAX_EXCLUSIVE) && fgets(patbuf, MAX_PAT, patfile)) { if ((len = strlen(patbuf)) < 1) continue; patbuf[len-1] = '\0'; if ((pat_len[num_pat] = convert2agrepregexp(patbuf, len-1)) == 0) continue; pat[num_pat++] = (unsigned char *) strdup(patbuf); } fclose(patfile); } #if 0 printf("num_pat %d\n", num_pat); for(i=0; i 0)) { fflush(TIMEFILE); fclose(TIMEFILE); #if USESORT_Z_OPTION #if DONTUSESORT_T_OPTION || SFS_COMPAT sprintf(S, "exec %s -n -r -z %d '%s/%s' > '%s/%s.tmp'\n", SYSTEM_SORT, maxsortlinelen, escapesinglequote(INDEX_DIR, es1), DEF_TIME_FILE, escapesinglequote(INDEX_DIR, es2), DEF_TIME_FILE); #else sprintf(S, "exec %s -n -r -T '%s' -z %d '%s/%s' > '%s/%s.tmp'\n", SYSTEM_SORT, escapesinglequote(INDEX_DIR, es1), maxsortlinelen, escapesinglequote(INDEX_DIR, es2), DEF_TIME_FILE, escapesinglequote(INDEX_DIR, es3), DEF_TIME_FILE); #endif #else #if DONTUSESORT_T_OPTION || SFS_COMPAT sprintf(S, "exec %s -n -r '%s/%s' > '%s/%s.tmp'\n", SYSTEM_SORT, escapesinglequote(INDEX_DIR, es1), DEF_TIME_FILE, escapesinglequote(INDEX_DIR, es2), DEF_TIME_FILE); #else sprintf(S, "exec %s -n -r -T '%s' '%s/%s' > '%s/%s.tmp'\n", SYSTEM_SORT, escapesinglequote(INDEX_DIR, es1), escapesinglequote(INDEX_DIR, es2), DEF_TIME_FILE, escapesinglequote(INDEX_DIR, es3), DEF_TIME_FILE); #endif #endif #ifdef BG_DEBUG printf("%s", S); #endif if((ret=system(S)) != 0) { sprintf(S1, "system('%s') failed at:\n\t File=%s, Line=%d, Errno=%d", S, __FILE__, __LINE__, errno); perror(S1); fprintf(stderr, "Please try to run the program again\n(If there's no memory, increase the swap area / don't use -M and -B options)\n"); sprintf(S, "%s/%s", INDEX_DIR, DEF_TIME_FILE); unlink(S); exit(2); } sprintf(S, "%s/%s.tmp", INDEX_DIR, DEF_TIME_FILE); if ((tmp_TIMEFILE = fopen(S, "r")) == NULL) { fprintf(stderr, "can't open %s for reading\n", S); unlink(S); exit(2); } sprintf(S, "%s/%s", INDEX_DIR, DEF_TIME_FILE); if ((TIMEFILE = fopen(S, "w")) == NULL) { fprintf(stderr, "can't 
open %s for writing\n", S); unlink(S); exit(2); } sprintf(S, "%s/%s.index", INDEX_DIR, DEF_TIME_FILE); if ((index_TIMEFILE = fopen(S, "w")) == NULL) { fprintf(stderr, "can't open %s for writing\n", S); unlink(S); exit(2); } /* Get the sorted times from .glimpse_times.tmp; dump the exact times for each file in .glimpse_times; dump per-day file# in .glimpse_times.index */ gettimeofday(&tv, NULL); temp_file_num = 0; temp_name_list = (char **)my_malloc(sizeof(char *) * file_num); memset(temp_name_list, '\0', sizeof(char *) * file_num); temp_size_list = (int *)my_malloc(sizeof(int) * file_num); memset(temp_size_list, '\0', sizeof(int) * file_num); prevtime = tv.tv_sec; while (fscanf(tmp_TIMEFILE, "%ld %d", &thetime, &theindex) == 2) { temp_name_list[temp_file_num] = LIST_GET(name_list, theindex); temp_size_list[temp_file_num] = LIST_GET(size_list, theindex); for (xx=0; xx>(8*(sizeof(long) - xx - 1)), TIMEFILE); /* fprintf(TIMEFILE, "%d %d\n", thetime, (prevtime - thetime)/86400); */ if (firsttime) { for (i=0; i<(prevtime - thetime + 86399)/86400; i++) { for (xx=0; xx>(8*(sizeof(int) - xx - 1)), index_TIMEFILE); /* fprintf(index_TIMEFILE, "%d\n", temp_file_num); */ } } else { for (i=0; i<(prevtime - thetime)/86400; i++) { for (xx=0; xx>(8*(sizeof(int) - xx - 1)), index_TIMEFILE); /* fprintf(index_TIMEFILE, "%d\n", temp_file_num); */ } } temp_file_num ++; if (!firsttime) prevtime -= i*86400; else if (i>0) prevtime -= (i-1)*86400; firsttime = 0; } if (temp_file_num != file_num) { fprintf(stderr, "error in sort: File=%s, Line=%d\n", __FILE__, __LINE__); exit(2); } /* Change the lists to be sorted now; free temporary lists */ for (i=0; i MaxNumPartition) { printed_warning = 1; if (AddToIndex) { fprintf(MESSAGEFILE, "Warning: partition-table overflow! Fresh indexing recommended.\n"); } else { fprintf(MESSAGEFILE, "Warning: partition-table overflow! 
Commencing fresh indexing...\n"); return partition(dir_num, dir_name); } } } if ((dir_num <= 1) && FilenamesOnStdin) while (fgets(current_dir, MAX_LINE_LEN, stdin) == current_dir) { current_dir[strlen(current_dir)-1] = '\0'; /* overwrite \n with \0 */ /* Get absolute path name of the directory or file being indexed */ if (-1 == my_stat(current_dir, &sbuf)) { fprintf(stderr, "permission denied or non-existent: %s\n", current_dir); continue; } if ((S_ISDIR(sbuf.st_mode)) && (current_dir[0] != '/')) { getcwd(working_dir, MAX_LINE_LEN - 1); if (-1 == chdir(current_dir)) { fprintf(stderr, "Cannot chdir to %s\n", current_dir); continue; } getcwd(current_dir, MAX_LINE_LEN - 1); chdir(working_dir); } if (!DeleteFromIndex) printf("Indexing \"%s\" ...\n", current_dir); fsize(current_dir, pat, pat_len, num_pat, inc, inc_len, num_inc, 0); /* the file names will be in name_list[]: NOT TOP LEVEL!!! Mar/11/96 */ } else for(i=1; i= 0) && (new_file_num <= file_num)) file_num = new_file_num; /* only if purge_index() was called: -f/-a/-d only */ /* Dump attributes */ if (StructuredIndex && (attr_num > 0)) { int ret; sprintf(s, "%s/%s", INDEX_DIR, ATTRIBUTE_FILE); if (-1 == (ret = attr_dump_names(s))) { fprintf(stderr, "can't open %s for writing\n", s); exit(2); } } /* Dump partition table; change index if necessary */ sprintf(s, "%s/%s", INDEX_DIR, P_TABLE); if((p_out = fopen(s, "w")) == NULL) { fprintf(stderr, "can't open for writing: %s\n", s); exit(2); } if (!OneFilePerBlock) { #ifdef SW_DEBUG printf("part_num = %d, part_size = %d\n", part_num, part_size); #endif for(i=0; i<=part_num; i++) { /* Assumes sizeof(int) is 32bits, which is true even for ALPHA */ putc((p_table[i] & 0xff000000) >> 24, p_out); putc((p_table[i] & 0x00ff0000) >> 16, p_out); putc((p_table[i] & 0x0000ff00) >> 8, p_out); if (putc((p_table[i] & 0x000000ff), p_out) == EOF) { fprintf(stderr, "Error: write failed at %s:%d\n", __FILE__, __LINE__); exit(2); } if (i==part_num) break; if (p_table[i] == p_table[i+1]) { 
fprintf(STATFILE, "part_num = %d, files = none, part_size = 0\n",i); continue; } fprintf(STATFILE, "part_num = %d, files = %d .. %d, part_size = %d\n", i, p_table[i], p_table[i+1] - 1, p_size_list[i]); } if (StructuredIndex) { /* check if we can reduce default 2B attributeids to smaller ones */ sprintf(s, "%s/.glimpse_split.%d", INDEX_DIR, getpid()); if((i_out = fopen(s, "w")) == NULL) { fprintf(stderr, "can't open %s for writing\n", s); exit(2); } sprintf(s, "%s/%s", INDEX_DIR, INDEX_FILE); if((i_in = fopen(s, "r")) == NULL) { fprintf(stderr, "can't open %s for reading\n", s); exit(2); } /* modified the original in glimpse's main.c */ fgets(indexnumberbuf, 256, i_in); fputs(indexnumberbuf, i_out); fscanf(i_in, "%%%d\n", &onefileperblock); fprintf(i_out, "%%%d\n", onefileperblock); /* If #of files change, then they are added to a new partition, which is updated above */ if ( !fscanf(i_in, "%%%d\n", &structuredindex) ) /* temp_rdelim may not be present in new-style indexes. Fixed by mhubin 10/25/99 --GV */ fscanf(i_in, "%%%d%s\n", &structuredindex, temp_rdelim); if (structuredindex <= 0) structuredindex = 0; if (RecordLevelIndex) fprintf(i_out, "%%-2 %s\n", old_rdelim); /* robint@zedcor.com (CANNOT HAPPEN SINCE RecordLevel AND Strucured ARE NOT COMPATIBLE!!!) 
*/ else fprintf(i_out, "%%%d\n", attr_num); /* attributes might have been added during last merge */ while(fgets(src_index_buf, REAL_INDEX_BUF, i_in)) { j = 0; while ((j < REAL_INDEX_BUF) && (src_index_buf[j] != WORD_END_MARK) && (src_index_buf[j] != ALL_INDEX_MARK) && (src_index_buf[j] != '\0') && (src_index_buf[j] != '\n')) j++; if ((j >= REAL_INDEX_BUF) || (src_index_buf[j] == '\0') || (src_index_buf[j] == '\n')) continue; /* else it is WORD_END_MARK or ALL_INDEX_MARK */ c = src_index_buf[j+1]; src_index_buf[j+1] = '\0'; fputs(src_index_buf, i_out); src_index_buf[j+1] = c; index=decode16b((src_index_buf[j+1] << 8) | (src_index_buf[j+2])); if ((attr_num > 0) && (attr_num < MaxNum8bPartition - 1)) { putc(encode8b(index), i_out); } else if (attr_num > 0) { putc(src_index_buf[j+1], i_out); putc(src_index_buf[j+2], i_out); } j += 3; if (fputs(src_index_buf+j, i_out) == EOF) { /* Rest of the partitions information */ fprintf(stderr, "Error: write failed at %s:%d\n", __FILE__, __LINE__); exit(2); } } fclose(i_in); fflush(i_out); fclose(i_out); #if SFS_COMPAT sprintf(s, "%s/.glimpse_split.%d", INDEX_DIR, getpid()); sprintf(s1, "%s/%s", INDEX_DIR, INDEX_FILE); rename(s, s1); #else sprintf(s, "exec %s '%s/.glimpse_split.%d' '%s/%s'", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), getpid(), escapesinglequote(INDEX_DIR, es2), INDEX_FILE); system(s); #endif } } else { /* Don't care about individual file sizes in statistics since the user can look at it anyway by ls -l! 
*/ sprintf(s, "%s/.glimpse_split.%d", INDEX_DIR, getpid()); if((i_out = fopen(s, "w")) == NULL) { fprintf(stderr, "can't open %s for writing\n", s); exit(2); } sprintf(s, "%s/%s", INDEX_DIR, INDEX_FILE); if((i_in = fopen(s, "r")) == NULL) { fprintf(stderr, "can't open %s for reading\n", s); exit(2); } /* modified the original in glimpse's main.c */ fgets(indexnumberbuf, 256, i_in); fputs(indexnumberbuf, i_out); fscanf(i_in, "%%%d\n", &onefileperblock); if (ByteLevelIndex) fprintf(i_out, "%%-%d\n", file_num); /* #of files might have changed due to -f/-a */ else fprintf(i_out, "%%%d\n", file_num); /* This was the stupidest thing of all! */ if ( !fscanf(i_in, "%%%d\n", &structuredindex) ) /* 10/25/99 as per mhubin --GV */ fscanf(i_in, "%%%d%s\n", &structuredindex, temp_rdelim); if (structuredindex <= 0) structuredindex = 0; if (RecordLevelIndex) fprintf(i_out, "%%-2 %s\n", old_rdelim); /* robint@zedcor.com */ else fprintf(i_out, "%%%d\n", attr_num); /* attributes might have been added during last merge */ part_size = 0; /* current offset in the p_table file */ while(fgets(src_index_buf, REAL_INDEX_BUF, i_in)) { j = 0; while ((j < REAL_INDEX_BUF) && (src_index_buf[j] != WORD_END_MARK) && (src_index_buf[j] != ALL_INDEX_MARK) && (src_index_buf[j] != '\n')) j++; if ((j >= REAL_INDEX_BUF) || (src_index_buf[j] == '\n')) continue; /* else it is WORD_END_MARK or ALL_INDEX_MARK */ c = src_index_buf[j+1]; src_index_buf[j+1] = '\0'; fputs(src_index_buf, i_out); src_index_buf[j+1] = c; c = src_index_buf[j]; if (StructuredIndex) { index = decode16b((src_index_buf[j+1] << 8) | (src_index_buf[j+2])); if ((attr_num > 0) && (attr_num < MaxNum8bPartition - 1)) { putc(encode8b(index), i_out); } else if (attr_num > 0) { putc(src_index_buf[j+1], i_out); putc(src_index_buf[j+2], i_out); } j += 2; } if (c == ALL_INDEX_MARK) { putc(DONT_CONFUSE_SORT, i_out); if (putc('\n', i_out) == EOF) { fprintf(stderr, "Error: write failed at %s:%d\n", __FILE__, __LINE__); exit(2); } continue; } offset = 
encode32b(part_size); putc((offset & 0xff000000) >> 24, i_out); /* force big-endian */ putc((offset & 0x00ff0000) >> 16, i_out); putc((offset & 0x0000ff00) >> 8, i_out); putc((offset & 0x000000ff), i_out); if (putc('\n', i_out) == EOF) { fprintf(stderr, "Error: write failed at %s:%d\n", __FILE__, __LINE__); exit(2); } j++; /* @first byte of the block numbers */ while((src_index_buf[j] != '\n') && (src_index_buf[j] != '\0')) { putc(src_index_buf[j++], p_out); part_size ++; } if (putc('\n', p_out) == EOF) { fprintf(stderr, "Error: write failed at %s:%d\n", __FILE__, __LINE__); exit(2); } part_size ++; } fclose(i_in); fflush(i_out); fclose(i_out); #if SFS_COMPAT sprintf(s, "%s/.glimpse_split.%d", INDEX_DIR, getpid()); sprintf(s1, "%s/%s", INDEX_DIR, INDEX_FILE); rename(s, s1); #else sprintf(s, "exec %s '%s/.glimpse_split.%d' '%s/%s'", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), getpid(), escapesinglequote(INDEX_DIR, es2), INDEX_FILE); system(s); #endif system(sync_path); /* sync() has a BUG */ sprintf(s, "%s/%s", INDEX_DIR, INDEX_FILE); if (BuildTurbo) dump_mini(s); } fflush(p_out); fclose(p_out); /* Dump file names */ if (KeepFilenames) { sprintf(s, "exec %s '%s/%s' '%s/%s.prev'", SYSTEM_CP, escapesinglequote(INDEX_DIR, es1), NAME_LIST, escapesinglequote(INDEX_DIR, es2), NAME_LIST); system(s); sprintf(s, "exec %s '%s/%s' '%s/%s.prev'", SYSTEM_CP, escapesinglequote(INDEX_DIR, es1), NAME_LIST_INDEX, escapesinglequote(INDEX_DIR, es2), NAME_LIST_INDEX); system(s); } sprintf(s, "%s/%s", INDEX_DIR, NAME_LIST); if((f_out = fopen(s, "w")) == NULL) { fprintf(stderr, "can't open %s for writing\n", s); exit(2); } sprintf(s, "%s/%s", INDEX_DIR, NAME_LIST_INDEX); if((i_out = fopen(s, "w")) == NULL) { fprintf(stderr, "can't open %s for writing\n", s); exit(2); } fprintf(f_out, "%d\n", file_num); for(i=0,offset=ftell(f_out); i> 24, i_out); putc((offset&0xff0000) >> 16, i_out); putc((offset&0xff00) >> 8, i_out); putc((offset&0xff), i_out); fputs(LIST_GET(name_list, i), f_out); 
putc('\n', f_out); offset += strlen(LIST_GET(name_list, i)) + 1; } else { /* else empty line to indicate file that was removed = HOLE */ if (name_list_size == file_num) { putc((offset&0xff000000) >> 24, i_out); putc((offset&0xff0000) >> 16, i_out); putc((offset&0xff00) >> 8, i_out); putc((offset&0xff), i_out); putc('\n', f_out); offset += 1; } } /* else there are no holes since index was purged, so don't put anything */ } if (!ATLEASTONEFILE) { fprintf(MESSAGEFILE, "Warning: number of files in the index is zero!\n"); } fflush(f_out); fclose(f_out); fflush(i_out); fclose(i_out); if (GenerateHash) { /* Dump file hash: don't want to keep filenames in hash-order like index since adding a file can shift many hash-values and change the whole index! */ if (KeepFilenames) { sprintf(s, "exec %s '%s/%s' '%s/%s.prev'", SYSTEM_CP, escapesinglequote(INDEX_DIR, es1), NAME_HASH, escapesinglequote(INDEX_DIR, es2), NAME_HASH); system(s); sprintf(s, "exec %s '%s/%s' '%s/%s.prev'", SYSTEM_CP, escapesinglequote(INDEX_DIR, es1), NAME_HASH_INDEX, escapesinglequote(INDEX_DIR, es2), NAME_HASH_INDEX); system(s); } sprintf(s, "%s/%s", INDEX_DIR, NAME_HASH); if((f_out = fopen(s, "w")) == NULL) { fprintf(stderr, "can't open %s for writing\n", s); exit(2); } sprintf(s, "%s/%s", INDEX_DIR, NAME_HASH_INDEX); if((i_out = fopen(s, "w")) == NULL) { fprintf(stderr, "can't open %s for writing\n", s); exit(2); } if (!built_filename_hashtable) build_filename_hashtable(name_list, file_num); hashtablesize = (BigFilenameHashTable ? 
MAX_64K_HASH : MAX_4K_HASH); for (i=0,offset=ftell(f_out); i> 24, i_out); putc((offset&0xff0000) >> 16, i_out); putc((offset&0xff00) >> 8, i_out); putc((offset&0xff), i_out); e = name_hashtable[i]; while(e!=NULL) { if ((index = get_new_index(deletedlist, e->index)) < 0) { e = e->next; continue; } putc(((index)&0xff000000)>>24, f_out); putc(((index)&0xff0000)>>16, f_out); putc(((index)&0xff00)>>8, f_out); putc(((index)&0xff), f_out); offset += 4; fputs(e->name, f_out); fputc('\0', f_out); /* so that I can do direct strcmp */ offset += strlen(e->name) + 1; e = e->next; } } fflush(f_out); fclose(f_out); fflush(i_out); fclose(i_out); } #if 0 fflush(stdout); printf("AFTER SAVE_DATA_STRUCTURES:\n"); sprintf(s, "exec %s -lg .glimpse_*", SYSTEM_LS); system(s); sprintf(s, "exec %s .glimpse_index", SYSTEM_WC); system(s); getchar(); #endif /*0*/ return 0; } /* Merges the index split by save_data_structures into a single index */ merge_splits() { FILE *i_in; FILE *p_in; FILE *i_out; char s[MAX_LINE_LEN], s1[MAX_LINE_LEN], es1[MAX_LINE_LEN], es2[MAX_LINE_LEN], es3[MAX_LINE_LEN], temp_rdelim[MAX_LINE_LEN]; int j, index; unsigned char c; char indexnumberbuf[256]; int onefileperblock, structuredindex, i, recordlevelindex; #if 0 fflush(stdout); printf("BEFORE MERGE_SPLITS:\n"); sprintf(s, "exec %s -lg .glimpse_*", SYSTEM_LS); system(s); sprintf(s, "exec %s .glimpse_index", SYSTEM_HEAD); system(s); getchar(); #endif /*0*/ temp_rdelim[0] = '\0'; /* Initialize in case not read. 
10/25/99 --GV */ sprintf(s, "%s/%s", INDEX_DIR, P_TABLE); if ((p_in = fopen(s, "r")) == NULL) { fprintf(stderr, "cannot open for reading: %s\n", s); exit(2); } sprintf(s, "%s/%s", INDEX_DIR, INDEX_FILE); if ((i_in = fopen(s, "r")) == NULL) { fprintf(stderr, "cannot open for reading: %s\n", s); exit(2); } sprintf(s, "%s/.glimpse_merge.%d", INDEX_DIR, getpid()); if ((i_out = fopen(s, "w")) == NULL) { fprintf(stderr, "cannot open for writing: %s\n", s); exit(2); } /* modified the original in glimpse's main.c */ fgets(indexnumberbuf, 256, i_in); fputs(indexnumberbuf, i_out); fscanf(i_in, "%%%d\n", &onefileperblock); fprintf(i_out, "%%%d\n", onefileperblock); if ( !fscanf(i_in, "%%%d\n", &structuredindex) ) /* 10/25/99 as per mhubin --GV */ fscanf(i_in, "%%%d%s\n", &structuredindex, temp_rdelim); recordlevelindex = (structuredindex == -2) ? 1 : 0; /* always assign: the flag was previously left uninitialized unless structuredindex == -2 */ if (structuredindex <= 0) structuredindex = 0; if (recordlevelindex) fprintf(i_out, "%%-2 %s\n", temp_rdelim); else fprintf(i_out, "%%%d\n", structuredindex); printf("merge: %s\n", temp_rdelim); #if !WORD_SORTED if (!DeleteFromIndex || FastIndex) { /* a new index is going to be built in this case: must sort by word */ fclose(i_in); sprintf(s, "%s/%s", INDEX_DIR, MINI_FILE); if ((i_in = fopen(s, "r")) != NULL) { /* minifile exists */ #if DONTUSESORT_T_OPTION || SFS_COMPAT sprintf(s, "exec %s '%s/%s' > '%s/%s.tmp'", SYSTEM_SORT, escapesinglequote(INDEX_DIR, es1), INDEX_FILE, escapesinglequote(INDEX_DIR, es2), INDEX_FILE); #else sprintf(s, "exec %s -T '%s' '%s/%s' > '%s/%s.tmp'", SYSTEM_SORT, escapesinglequote(INDEX_DIR, es1), escapesinglequote(INDEX_DIR, es2), INDEX_FILE, escapesinglequote(INDEX_DIR, es3), INDEX_FILE); #endif system(s); #if SFS_COMPAT sprintf(s, "%s/%s.tmp", INDEX_DIR, INDEX_FILE); sprintf(s1, "%s/%s", INDEX_DIR, INDEX_FILE); rename(s, s1); #else sprintf(s, "exec %s '%s/%s.tmp' '%s/%s'", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), INDEX_FILE, escapesinglequote(INDEX_DIR, es2), INDEX_FILE); system(s); #endif
system(sync_path); /* sync() has a BUG */ fclose(i_in); } sprintf(s, "%s/%s", INDEX_DIR, INDEX_FILE); if ((i_in = fopen(s, "r")) == NULL) { fprintf(stderr, "cannot open for reading: %s\n", s); exit(2); } /* skip the 1st 3 lines which might get jumbled up */ fgets(s, MAX_LINE_LEN, i_in); fgets(s, MAX_LINE_LEN, i_in); fgets(s, MAX_LINE_LEN, i_in); } #endif /* !WORD_SORTED */ while (fgets(src_index_buf, REAL_INDEX_BUF, i_in)) { j = 0; while ((j < REAL_INDEX_BUF) && (src_index_buf[j] != WORD_END_MARK) && (src_index_buf[j] != ALL_INDEX_MARK) && (src_index_buf[j] != '\0') && (src_index_buf[j] != '\n')) j++; if ((j >= REAL_INDEX_BUF) || (src_index_buf[j] == '\0') || (src_index_buf[j] == '\n')) continue; /* else it is WORD_END_MARK or ALL_INDEX_MARK */ c = src_index_buf[j+1]; src_index_buf[j+1] = '\0'; fputs(src_index_buf, i_out); src_index_buf[j+1] = c; c = src_index_buf[j]; if (structuredindex) { /* convert all attributes to 2B to make merge_in()s easy in build_in.c */ if (structuredindex < MaxNum8bPartition - 1) { index = encode16b(decode8b(src_index_buf[j+1])); putc((index & 0x0000ff00) >> 8, i_out); putc(index & 0x000000ff, i_out); j ++; } else { putc(src_index_buf[j+1], i_out); putc(src_index_buf[j+2], i_out); j += 2; } } if (c == ALL_INDEX_MARK) { putc(DONT_CONFUSE_SORT, i_out); putc('\n', i_out); continue; } /* src_index_buf[j+1] points to the first byte of the offset */ get_block_numbers(&src_index_buf[j+1], &dest_index_buf[0], p_in); j = 0; /* first byte of the block numbers */ while ((dest_index_buf[j] != '\n') && (dest_index_buf[j] != '\0')) { putc(dest_index_buf[j], i_out); dest_index_buf[j] = '\0'; j++; } if (putc('\n', i_out) == EOF) { fprintf(stderr, "Error: write failed at %s:%d\n", __FILE__, __LINE__); exit(2); } } fclose(i_in); fclose(p_in); fflush(i_out); fclose(i_out); #if SFS_COMPAT sprintf(s, "%s/.glimpse_merge.%d", INDEX_DIR, getpid()); sprintf(s1, "%s/%s", INDEX_DIR, INDEX_FILE); rename(s, s1); #else sprintf(s, "exec %s '%s/.glimpse_merge.%d' 
'%s/%s'", SYSTEM_MV, escapesinglequote(INDEX_DIR, es1), getpid(), escapesinglequote(INDEX_DIR, es2), INDEX_FILE); system(s); #endif #if 0 fflush(stdout); printf("AFTER MERGE_SPLITS:\n"); sprintf(s, "exec %s -lg .glimpse_*", SYSTEM_LS); system(s); sprintf(s, "exec %s .glimpse_index", SYSTEM_HEAD); system(s); getchar(); #endif /*0*/ } glimpse-4.18.7/index/region.c000066400000000000000000000323571300371307100160160ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ /* From mail received from Bill Camargo and Darren Hardy in June 1994 */ #include #include "region.h" /* * Exports the following routines. Any filtering/attr-val parsing mechanism * can be integrated into glimpse and glimpseindex with this interface. */ char * /* attrname = */ attr_id_to_name(/* int attrid */); int /* attrid = */ attr_name_to_id(/* char *attrname */); int attr_dump_names(/* char *filename */); int attr_load_names(/* char *filename */); int attr_free_table(/* void */); int region_initialize(/* void */); int region_destroy(/* void */); int region_create(/* char *filename */); int /* attrid = */ region_identify(/* int offset_in_file, int len_of_region */); #if BG_DEBUG extern int memory_usage; #endif /* BG_DEBUG*/ #if STRUCTURED_QUERIES printsp() { int x; printf("stack at %x\n", &x); } /*****************************************************************************/ #define ATTR_HASH_TABLE_SIZE 256 /* must be a power of 16=multiple of 4 bits */ #define ATTR_HASH_TABLE_MASK 0xff /* bits that mask off the bits in TABLE_SIZE */ #define ATTR_HASH_STEP_SIZE 2 /* #of nibbles that make up TABLE_SIZE */ attr_element_t *attr_hash_table[ATTR_HASH_TABLE_SIZE]; char **attr_name_table = NULL; int attr_num = 0; int attr_maxid = 0; /* English language characters have all info in lowest 4 bits */ int attr_hash_index(word, len) char *word; int len; { int i=0, j, index = 0, temp; for (i=0; i+ATTR_HASH_STEP_SIZE<=len; i+=ATTR_HASH_STEP_SIZE) { temp = 0; for
(j=0; j attr_maxid)) return NULL; else return attr_name_table[id]; } /* * returns the attribute number associated with name, 0 for no attribute -- * NOTE: name may not be null terminated and you are not allowed to alter it. * called during indexing and search. */ int attr_name_to_id(name, len) char *name; int len; { int index = attr_hash_index(name, len); attr_element_t *e = attr_hash_table[index]; #if 0 char c = name[len]; name[len] = '\0'; fprintf(stderr, "attr=%s @ %d?\n", name, index); fflush(stderr); name[len] = c; #endif /*0*/ while(e != NULL) { if (!strncmp(e->attribute, name, len)) break; else e = e->next; } if (e!=NULL) { #if 0 fprintf(stderr, "foundid=%d\n", e->attributeid); #endif /*0*/ return e->attributeid; } return 0; } /* * returns the attribute number (> 0) for the attribute "name". It adds the * name as a newly seen attribute if it doesn't exist already (using #tables). * called in region_create, which is called during indexing. */ attr_insert_name(name, len) char *name; int len; { int index = attr_hash_index(name, len); attr_element_t **pe = &attr_hash_table[index], *e; while(*pe != NULL) { if (!strcmp((*pe)->attribute, name)) break; else pe = &(*pe)->next; } if (*pe!=NULL) return (*pe)->attributeid; e = (attr_element_t *)my_malloc(sizeof(attr_element_t)); e->attribute = (char *)my_malloc(len + 2); strncpy(e->attribute, name, len + 1); e->attributeid = (++attr_num); e->next = NULL; *pe = e; #if 0 fprintf(stderr, "inserting %s %d\n", name, attr_num); #endif /*0*/ return e->attributeid; } /* * frees current hash table of attr-value pairs. * called after dump in indexing, and at end of search (after previous load). 
*/ int attr_free_table() { int i; attr_element_t *e, *temp; for (i=0; inext; #if BG_DEBUG memory_usage -= strlen(e->attribute) + 2; #endif /*BG_DEBUG*/ my_free(e->attribute, 0); my_free(e, sizeof(attr_element_t)); e = temp; } attr_hash_table[i] = NULL; } if (attr_name_table != NULL) { my_free(attr_name_table, sizeof(attr_element_t *) * ATTR_HASH_TABLE_SIZE); attr_name_table = NULL; } return 0; } /* Looks for embedded attributes and copies the real attribute into dest */ attr_extract(dest, src) char *dest, *src; { char *oldsrc = src; check_again: if (!strncmp("embed<", src, 6) || !strncmp("Embed<", src, 6) || !strncmp("EMBED<", src, 6)) { src += 6; while ((*src != '>') && (*src != '\0')) src++; if (*src == '\0') { strcpy(dest, oldsrc); return; } while (!isalnum(*(unsigned char *)src)) src ++; /* assuming type names are .. */ oldsrc = src; goto check_again; } strcpy(dest, src); return; } /* * dumps the attribute-list into a file name (id, name, \n) * into the file specified and then destroys the hash table. * Returns #of attributes dumped into the file, -1 if error. * called at the end of indexing. */ int attr_dump_names(filename) char *filename; { int i=0; int ret = -1; FILE *fp = fopen(filename, "w"); attr_element_t *e; #if 0 printf("in dump attr\n"); #endif /*0*/ if (fp == NULL) return -1; ret = 0; for (i=0; iattributeid, e->attribute); e = e->next; ret ++; } fputc('\n', fp); } fflush(fp); fclose(fp); return ret; } /* * constructs a hash-table of attributes by reading them from the file. * Returns #of attributes read from the file, -1 if error. * Does not recompute hash-indices of attributes. * called before searching for attr=val pairs. 
*/ int attr_load_names(filename) char *filename; { int index = 0, ret = 0; FILE *fp = fopen(filename, "r"); attr_element_t *e; int c = 0; char temp[1024]; /* max attr name */ char buffer[1024+32];/* max attr id pair */ int i; int id; attr_maxid = 0; memset(attr_hash_table, '\0', sizeof(attr_element_t *) * ATTR_HASH_TABLE_SIZE); if (fp == NULL) return -1; while ((c = getc(fp)) != EOF) { if (c == '\n') { index ++; continue; } ungetc(c, fp); /* fscanf screws up fp and skips over trailing space characters (\t,\n, ) */ i=0; /* stop at EOF or a full buffer too, so a truncated or malformed file cannot loop forever or overrun buffer[] */ while (((c=getc(fp)) != ' ') && (c != EOF) && (i < (int)sizeof(buffer) - 1)) buffer[i++] = c; buffer[i] = '\0'; #if 0 printf("buffer=%s\n", buffer); #endif /*0*/ id = 0; /* a failed parse leaves id at 0, so the entry is skipped below */ sscanf(buffer, "%d,%1023s", &id, temp); temp[1023] = '\0'; #if 0 printf("read attr=%s,%d @ %d\n", temp, id, index); #endif /*0*/ if (id <= 0) continue; e = (attr_element_t *)my_malloc(sizeof(attr_element_t)); e->attributeid = id; if (id > attr_maxid) attr_maxid = id; e->attribute = (char *)my_malloc(strlen(temp) + 2); strcpy(e->attribute, temp); e->next = attr_hash_table[index]; attr_hash_table[index] = e; ret ++; if (index >= ATTR_HASH_TABLE_SIZE - 1) break; } fclose(fp); attr_name_table = (char **)my_malloc(sizeof(char *) * (ret=(ret >= (attr_maxid + 1) ? ret : (attr_maxid + 1)))); memset(attr_name_table, '\0', sizeof(char *) * ret); for (i=0; iattributeid] = e->attribute; e = e->next; } } return ret; } /***************************************************************************/ region_t *current_regions, *nextpos; /* nextpos is hint into list */ /* * Called during indexing before region_create. * returns 0. */ int region_initialize() { attr_num = 0; attr_name_table = NULL; memset(attr_hash_table, '\0', sizeof(attr_element_t *) * ATTR_HASH_TABLE_SIZE); current_regions = nextpos = NULL; return 0; } /* * creates a data structure containing the list of attributes * which occur at increasing offsets in the given file -- future * region_identify() calls use the "current" data structure. * returns 0 if success, -1 if it cannot open the file.
*/ int region_create(name) char *name; { FILE *fp; AVList *al; region_t *prl, *rl, *lastrl; Template *t; char temp[1024]; current_regions = nextpos = NULL; if ((fp = my_fopen(name, "r")) == NULL) return -1; init_parse_template_file(fp); lastrl = NULL; while ((t = parse_template()) != NULL) { /* do insertion sort of list returned by parse_template using offsets */ if ((t->url != NULL) && (strlen(t->url) > 0)) { rl = (region_t *)my_malloc(sizeof(region_t)); /* Darren Hardy's Voodo :-) */ /* The SOIF looks like this: @TTYPE { URL\n */ /* t->offset points to the @ */ /* rl->offset points to the space before URL */ /* rl->length includes the entire URL */ rl->offset = t->offset + strlen(t->template_type) + 3; rl->length = strlen(t->url) + 1; rl->attributeid = attr_insert_name("url", 3); if ((lastrl != NULL) && (lastrl->offset <= rl->offset)) { /* go forward */ prl = lastrl; while (prl->next != NULL) { if (prl->next->offset > rl->offset) { rl->prev = prl; rl->next = prl->next; prl->next->prev = rl; prl->next = rl; lastrl = rl; break; } else prl = prl->next; } if (prl->next == NULL) { rl->next = NULL; rl->prev = prl; prl->next = rl; lastrl = rl; } } else { /* must go backwards and find the right place to insert */ prl = lastrl; while (prl != NULL) { if (prl->offset < rl->offset) { rl->prev = prl; rl->next = prl->next; if (prl->next != NULL) prl->next->prev = rl; prl->next = rl; lastrl = rl; break; } else prl = prl->prev; } if (prl == NULL) { rl->next = current_regions; if (current_regions != NULL) current_regions->prev = rl; rl->prev = NULL; current_regions = rl; lastrl = rl; } } #if 0 printf("region url=[%d,%d]\n", rl->offset, rl->offset+rl->length); #endif /*0*/ } al = t->list; while(al != NULL) { rl = (region_t *)my_malloc(sizeof(region_t)); rl->offset = al->data->offset; rl->length = al->data->vsize; attr_extract(temp, al->data->attribute); rl->attributeid = attr_insert_name(temp, strlen(temp)); if ((lastrl != NULL) && (lastrl->offset <= rl->offset)) { /* go forward 
*/ prl = lastrl; while (prl->next != NULL) { if (prl->next->offset > rl->offset) { rl->prev = prl; rl->next = prl->next; prl->next->prev = rl; prl->next = rl; lastrl = rl; break; } else prl = prl->next; } if (prl->next == NULL) { rl->next = NULL; rl->prev = prl; prl->next = rl; lastrl = rl; } } else { /* must go backwards and find the right place to insert */ prl = lastrl; while (prl != NULL) { if (prl->offset < rl->offset) { rl->prev = prl; rl->next = prl->next; if (prl->next != NULL) prl->next->prev = rl; prl->next = rl; lastrl = rl; break; } else prl = prl->prev; } if (prl == NULL) { rl->next = current_regions; if (current_regions != NULL) current_regions->prev = rl; rl->prev = NULL; current_regions = rl; lastrl = rl; } } #if 0 printf("region %s=[%d,%d]\n", al->data->attribute, rl->offset, rl->offset+rl->length); #endif /*0*/ al = al->next; } free_template(t); } finish_parse_template(); nextpos = current_regions; fclose(fp); return 0; } /* * frees the data structure created for the current file above. * returns 0. */ int region_destroy() { region_t *rl = current_regions, *trl; while (rl != NULL) { trl = rl; rl = rl->next; free(trl); } current_regions = nextpos = NULL; return 0; } /* * returns attribute number [1..num_attr] which covers (inclusive) * the region * [offset, offset+len] in the "current" file, 0 if none. * called during indexing after region_create, and search after * attr_load_names. Do not need sophisticated interval trees here! 
*/ int region_identify(offset, len) int offset, len; { region_t *rl; if (nextpos == NULL) nextpos = current_regions; rl = nextpos; while (rl!=NULL) { if (rl->offset > offset + len) goto backwards; /* definitely before: can be earlier region OR hole */ else if ((rl->offset <= offset) && (rl->offset + rl->length >= offset + len)) return rl->attributeid; /* definitely within */ else if (rl->offset + rl->length < offset) nextpos = rl = rl->next; /* definitely after: later region */ else return 0; /* overlapping: error */ } return 0; /* reached end of file */ backwards: while (rl!=NULL) { if (rl->offset > offset + len) nextpos = rl = rl->prev; /* definitely before: earlier region */ else if ((rl->offset <= offset) && (rl->offset + rl->length >= offset + len)) return rl->attributeid; /* definitely within */ else if (rl->offset + rl->length < offset) return 0; /* hole */ else return 0; /* overlapping: error */ } return 0; /* reached start of file */ } #else /*STRUCTURED_QUERIES*/ int attr_num = 0; char *attr_id_to_name(id) int id; { return NULL; } int attr_name_to_id(name) char *name; { return 0; } int attr_dump_names(name) char *name; { return 0; } int attr_load_names(name) char *name; { return 0; } int attr_free_table() { return 0; } int region_initialize() { return 0; } int region_desrtroy() { return 0; } int region_create(name) char *name; { return 0; } int region_destroy() { return 0; } int region_identify(offset, len) int offset, len; { return 0; } #endif /*STRUCTURED_QUERIES*/ glimpse-4.18.7/index/region.h000066400000000000000000000015501300371307100160120ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved.
*/ /* Constructed from the mail messages received from Bill Camargo in June 1994 */ #ifndef _REGION_H_ #define _REGION_H_ #include #include /* autoconf defines STRUCTURED_QUERIES */ #if STRUCTURED_QUERIES /* These are imports from Bill's stuff */ #include "util.h" #include "template.h" #endif /*STRUCTURED_QUERIES*/ /* These are mine */ typedef struct REGION { int length; int offset; int attributeid; struct REGION *next, *prev; } region_t; /* Assuming there are no more than 2^(8*sizeof(int)) attributes */ typedef struct ATTR_ELEMENT { struct ATTR_ELEMENT *next; int attributeid; char *attribute; /* pointer to the one in the hash entry */ } attr_element_t; extern FILE *my_fopen(); extern char *my_malloc(); extern int my_free(); #endif /*_REGION_H_*/ glimpse-4.18.7/index/simpletest.c000066400000000000000000000063201300371307100167130ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. */ /* simple tests which don't need to access indexing data structures */ #include #include "glimpse.h" #define b_sample_size 2048 /* the number of bytes sampled to determine whether a file is binary */ #define u_sample_size 1024 /* the number of bytes sampled to determine whether a file is uuencoded */ extern UseFilters; #if 0 /* --------------------------------------------------------------------- check for binary stream --------------------------------------------------------------------- */ test_binary(buffer, length) unsigned char *buffer; int length; { int i=0; int b_count=0; if(length > b_sample_size) length = b_sample_size; for(i=0; i 127) b_count++; } if(b_count*10 >= length) return(1); return(0); } #else /*0*/ /* Let's try this one instead: Chris Dalton */ test_binary(buffer, length) unsigned char *buffer; int length; { int permitted_errors; if (length > b_sample_size) { length= b_sample_size; } permitted_errors= length/10; while (permitted_errors && length--) { if (!(isgraph(*buffer) || isspace(*buffer))) --permitted_errors;
buffer ++; } /* printf("\t\t\tpermerr=%d in %d\n", permitted_errors, length); */ return (permitted_errors == 0); } #endif /*0*/ /* --------------------------------------------------------------------- check for uuencoded stream --------------------------------------------------------------------- */ test_uuencode(buffer, length) unsigned char *buffer; int length; { int i=0; int j; if(length > u_sample_size) length = u_sample_size; if(strncmp((char *)buffer, "begin", 5) == 0) { i=5; goto CONT; } i = memlook("\nbegin", buffer, length); if(i < 0) return(0); CONT: while(buffer[i] != '\n' && i=length) return 0; buffer[i] = '\0'; if ((first = (char *)strstr((char *)buffer, "PS-Adobe")) == NULL) { buffer[i] = '\n'; return 0; } buffer[i] = '\n'; return 1; } char *suffixlist[NUM_SUFFIXES] = IGNORED_SUFFIXES; int test_special_suffix(name) char *name; { int len = strlen(name); int j, i = len-1; char *suffix; #if SFS_COMPAT while(i>=0) if (name[i] == '/') break; else i--; if (i<0) return 0; /* no suffix: can be directory... */ else suffix = &name[i+1]; for (j=0; j= NUM_SUFFIXES) return 0; return 1; #else while(i>=0) if (name[i] == '.') break; else i--; if (i<0) return 0; /* no suffix: can be directory... 
*/ else suffix = &name[i+1]; for (j=0; j= NUM_SUFFIXES) return 0; return 1; #endif } glimpse-4.18.7/index/utils.c000066400000000000000000000234571300371307100156740ustar00rootroot00000000000000#include "glimpse.h" int BigFilenameHashTable = OFF; #define SIGNIFICANT_HASH_REGION 24 /* n is guaranteed to be < MaxNum4bPartition */ int encode4b(n) int n; { if (n=='\0') return MaxNum4bPartition; if (n=='\n') return MaxNum4bPartition+1; return n; } int decode4b(n) int n; { if (n==MaxNum4bPartition) return '\0'; if (n==MaxNum4bPartition+1) return '\n'; return n; } /* n is guaranteed to be < MaxNum8bPartition */ int encode8b(n) int n; { if (n=='\0') return MaxNum8bPartition; if (n=='\n') return MaxNum8bPartition+1; return n; } int decode8b(n) int n; { if (n==MaxNum8bPartition) return '\0'; if (n==MaxNum8bPartition+1) return '\n'; return n; } /* n is guaranteed to be < MaxNum12bPartition */ int encode12b(n) int n; { unsigned char msb, lsb; msb = (n / MaxNum8bPartition); lsb = (n % MaxNum8bPartition); msb = encode4b(msb); lsb = encode8b(lsb); return (msb<<8)|lsb; } int decode12b(n) int n; { unsigned char msb, lsb; msb = ((n&0x00000f00) >> 8); lsb = (n&0x000000ff); msb = decode4b(msb); lsb = decode8b(lsb); return (msb * MaxNum8bPartition) + lsb; } /* n is guaranteed to be < MaxNum16bPartition */ int encode16b(n) int n; { unsigned char msb, lsb; msb = (n / MaxNum8bPartition); lsb = (n % MaxNum8bPartition); msb = encode8b(msb); lsb = encode8b(lsb); return (msb<<8)|lsb; } int decode16b(n) int n; { unsigned char msb, lsb; msb = ((n&0x0000ff00) >> 8); lsb = (n&0x000000ff); msb = decode8b(msb); lsb = decode8b(lsb); return (msb * MaxNum8bPartition) + lsb; } /* n is guaranteed to be < MaxNum24bPartition */ int encode24b(n) int n; { unsigned short msb, lsb; msb = (n / MaxNum16bPartition); lsb = (n % MaxNum16bPartition); msb = encode8b(msb); lsb = encode16b(lsb); return (msb<<16)|lsb; } int decode24b(n) int n; { unsigned short msb, lsb; msb = ((n&0x00ff0000) >> 16); lsb = 
(n&0x0000ffff); msb = decode8b(msb); lsb = decode16b(lsb); return (msb * MaxNum16bPartition) + lsb; } /* n is guaranteed to be < MaxNum32bPartition */ int encode32b(n) int n; { unsigned short msb, lsb; msb = (n / MaxNum16bPartition); lsb = (n % MaxNum16bPartition); msb = encode16b(msb); lsb = encode16b(lsb); return (msb<<16)|lsb; } int decode32b(n) int n; { unsigned short msb, lsb; msb = ((n&0xffff0000) >> 16); lsb = (n&0x0000ffff); msb = decode16b(msb); lsb = decode16b(lsb); return (msb * MaxNum16bPartition) + lsb; } /* * converts file-names with *,. and ? and converts it to # \. and ? ALL OTHER agrep-special characters are masked off. * if the filename is NOT a regular expression involving ? or *, it leaves the name untouched and returns the string * length of the file name (so that we can avoid memagrep calls): otherwise, it returns the -ve strlength of the name * after performing the above conversion: hence we never need to call agrep if the length is +ve. */ int convert2agrepregexp(buf, len) char *buf; int len; { char tbuf[MAX_PAT+2]; int i=0, j=0; /* Ignore '*' at the beginning and '*' at the end */ if (len < 1) return 0; if ( ((len == 1) && (buf[len-1] == '*')) || ((len >= 2) && (buf[len-1] == '*') && (buf[len-2] != '\\')) ) { buf[len-1] = '\0'; len--; } if (buf[0] == '*') { for (i=0; i= len) return len; i = j = 0; while ((i') || (buf[i] == '<')|| /* (buf[i] == '^') || (buf[i] == '$') || */ (buf[i] == '+')|| (buf[i] == '{') || (buf[i] == '}') || (buf[i] == '~')){ tbuf[j++] = '\\'; tbuf[j++] = buf[i]; i++; } /* Interpret ONLY ?
and * in file-names */ else if (buf[i] == '?') { tbuf[j++] = '.'; i++; } else if (buf[i] == '*') { tbuf[j++] = '.'; tbuf[j++] = '*'; i++; } else tbuf[j++] = buf[i++]; } if (j >= MAX_PAT) { tbuf[MAX_PAT-1] = '\0'; fprintf(stderr, "glimpseindex: pattern '%s' too long\n", buf); j = MAX_PAT - 1; } else { tbuf[j] = '\0'; } strcpy(buf, tbuf); #if 0 printf("%s=%d\n", buf, j); #endif /*0*/ return -j; /* strlen-compatible, -ve to indicate memagrep must be called */ } /* ----------------------------------------------------------------- input: a word (a string of ascii character terminated by NULL) output: a hash_value of the input word. hash function: if the word has length <= 4 the hash value is just a concatenation of the last four bits of the characters. if the word has length > 4, then after the above operation, the hash value is updated by adding each remaining character. (and AND with the 16-bits mask). bug-fixes in all hashing functions: Chris Dalton ---------------------------------------------------------------- */ int hash64k(word, len) char *word; int len; { unsigned int hash_value=0; unsigned int mask_4=017; unsigned int mask_16=0177777; int i; if(len<=4) { for(i=0; i 5 bits is waste since there are only 26 lower case letters */ int hash32k(word, len) char *word; int len; { unsigned int hash_value=0; unsigned int mask_5=037; unsigned int mask_15=077777; int i; if(len<=3) { for(i=0; i=len-6; i--) { hash_value = (hash_value << 4) | (file[i]&mask_4); /* hash_value = hash_value & mask_16; */ } for(i=len-7; i>=len-SIGNIFICANT_HASH_REGION-2; i--) hash_value = mask_16 & (hash_value + file[i]); return(hash_value & mask_16); } #else else { for(i=len-SIGNIFICANT_HASH_REGION-2; i=len-6; i--) { hash_value = (hash_value << 3) | (file[i]&mask_3); /* hash_value = hash_value & mask_16; */ } for(i=len-7; i>=len-SIGNIFICANT_HASH_REGION-2; i--) hash_value = mask_12 & (hash_value + file[i]); return(hash_value & mask_12); } #else else { for(i=len-SIGNIFICANT_HASH_REGION-2; i doesn't 
define. */ /* #undef gid_t */ /* Define if you have alloca.h and it should be used (not Ultrix). */ /* #undef HAVE_ALLOCA_H */ /* Define if you support file names longer than 14 characters. */ /* #undef HAVE_LONG_FILE_NAMES */ /* Define if your struct stat has st_blksize. */ /* #undef HAVE_ST_BLKSIZE */ /* Define if on MINIX. */ /* #undef _MINIX */ /* Define if you don't have dirent.h, but have ndir.h. */ /* #undef NDIR */ /* Define to `long' if <sys/types.h> doesn't define. */ /* #undef off_t */ /* Define if the system does not provide POSIX.1 features except with this defined. */ /* #undef _POSIX_1_SOURCE */ /* Define if you need to in order for stat and other things to work. */ /* #undef _POSIX_SOURCE */ /* Define as the return type of signal handlers (int or void). */ #define RETSIGTYPE void /* Define if the setvbuf function takes the buffering type as its second argument and the buffer pointer as the third, as on System V before release 3. */ /* #undef SETVBUF_REVERSED */ /* If using the C implementation of alloca, define if you know the direction of stack growth for your system; otherwise it will be automatically deduced at run-time. STACK_DIRECTION > 0 => grows toward higher addresses STACK_DIRECTION < 0 => grows toward lower addresses STACK_DIRECTION = 0 => direction of growth unknown */ /* #undef STACK_DIRECTION */ /* Define if the `S_IS*' macros in <sys/stat.h> do not work properly. */ /* #undef STAT_MACROS_BROKEN */ /* Define if you have the ANSI C header files. */ #define STDC_HEADERS 1 /* Define if you don't have dirent.h, but have sys/dir.h. */ /* #undef SYSDIR */ /* Define if you don't have dirent.h, but have sys/ndir.h. */ /* #undef SYSNDIR */ /* Define to `int' if <sys/types.h> doesn't define. */ /* #undef uid_t */ /* Define if the closedir function returns void instead of int. */ /* #undef VOID_CLOSEDIR */ /* Define if your processor stores words with the most significant byte first (like Motorola and SPARC, unlike Intel and VAX).
*/ /* #undef WORDS_BIGENDIAN */ /* The number of bytes in an int. */ /* #undef SIZEOF_INT */ /* The number of bytes in a long. */ /* #undef SIZEOF_LONG */ /* Define if you have bcopy. */ /* #undef HAVE_BCOPY */ /* Define if you have bzero. */ /* #undef HAVE_BZERO */ /* Define if you have flock. */ /* #undef HAVE_FLOCK */ /* Define if you have fsync. */ /* #undef HAVE_FSYNC */ /* Define if you have ftruncate. */ /* #undef HAVE_FTRUNCATE */ /* Define if you have getcwd. */ /* #undef HAVE_GETCWD */ /* Define if you have getdtablesize. */ /* #undef HAVE_GETDTABLESIZE */ /* Define if you have lrand48. */ /* #undef HAVE_LRAND48 */ /* Define if you have memmove. */ /* #undef HAVE_MEMMOVE */ /* Define if you have mktime. */ /* #undef HAVE_MKTIME */ /* Define if you have nice. */ /* #undef HAVE_NICE */ /* Define if you have on_exit. */ /* #undef HAVE_ON_EXIT */ /* Define if you have random. */ /* #undef HAVE_RANDOM */ /* Define if you have rename. */ /* #undef HAVE_RENAME */ /* Define if you have setlinebuf. */ /* #undef HAVE_SETLINEBUF */ /* Define if you have setrlimit. */ /* #undef HAVE_SETRLIMIT */ /* Define if you have srand48. */ /* #undef HAVE_SRAND48 */ /* Define if you have srandom. */ /* #undef HAVE_SRANDOM */ /* Define if you have sysconf. */ /* #undef HAVE_SYSCONF */ /* Define if you have timegm. */ /* #undef HAVE_TIMEGM */ /* Define if you have usleep. */ /* #undef HAVE_USLEEP */ /* Define if you have vfork. */ /* #undef HAVE_VFORK */ /* Define if you have the <arpa/inet.h> header file. */ /* #undef HAVE_ARPA_INET_H */ /* Define if you have the <config.h> header file. */ /* #undef HAVE_CONFIG_H */ /* Define if you have the <memory.h> header file. */ #define HAVE_MEMORY_H 1 /* Define if you have the <netinet/in.h> header file. */ /* #undef HAVE_NETINET_IN_H */ /* Define if you have the <stdlib.h> header file. */ #define HAVE_STDLIB_H 1 /* Define if you have the <string.h> header file. */ #define HAVE_STRING_H 1 /* Define if you have the <sys/syslog.h> header file. */ /* #undef HAVE_SYS_SYSLOG_H */ /* Define if you have the <sys/types.h> header file.
*/ #define HAVE_SYS_TYPES_H 1 /* Define if you have the <syslog.h> header file. */ /* #undef HAVE_SYSLOG_H */ /* Define if you have the dbm library (-ldbm). */ /* #undef HAVE_LIBDBM */ /* Define if you have the fl library (-lfl). */ /* #undef HAVE_LIBFL */ /* Define if you have the malloc library (-lmalloc). */ /* #undef HAVE_LIBMALLOC */ /* Define if you have the ndbm library (-lndbm). */ /* #undef HAVE_LIBNDBM */ /* Define if you have the nsl library (-lnsl). */ /* #undef HAVE_LIBNSL */ /* Define if you have the resolv library (-lresolv). */ /* #undef HAVE_LIBRESOLV */ /* Define if you have the seq library (-lseq). */ /* #undef HAVE_LIBSEQ */ /* Define if you have the socket library (-lsocket). */ /* #undef HAVE_LIBSOCKET */ #endif /* _AUTOCONF_H_ */ glimpse-4.18.7/libtemplate/include/autoconf.h.in000066400000000000000000000144051300371307100215730ustar00rootroot00000000000000/* ** libtemplate/include/autoconf.h.in. */ /* ------------------------------------------------- SYSUH - begin update There are too many compiler DEFS. This will be a problem on some systems because of the length limit of the command line. Therefore, I moved the compiler DEFS here so that the make output is more readable. The following definitions will be automatically updated by configure. ----------------------------------------------------------------------*/ #ifndef _AUTOCONF_H_ #define _AUTOCONF_H_ #define HAVE_DIRENT_H 0 #define HAVE_FCNTL_H 0 #define HAVE_SYS_FILE_H 0 #define HAVE_SYS_TIME_H 0 #define HAVE_UNISTD_H 0 #define HAVE_SYS_SELECT_H 0 #define HAVE_SYS_DIR_H 0 #define TIME_WITH_SYS_TIME 0 #define HAVE_UTIME_NULL 0 #define HAVE_STRDUP 0 #define HAVE_STRERROR 0 #define HAVE_LIBM 0 #define STRUCTURED_QUERIES 0 #define ISO_CHAR_SET 0 #define SFS_COMPAT 0 #define AGREP_POINTER 0 #define FILE_END_MARK @FILE_END_MARK@ #define RETSIGTYPE @RETSIGTYPE@ /* ----------------------------------------- SYSUH - end update. ---------------------------------------------------- */ /* Define if on AIX 3.
System headers sometimes define this. We just want to avoid a redefinition error message. */ #ifndef _ALL_SOURCE #undef _ALL_SOURCE #endif /* Define if using alloca.c. */ #undef C_ALLOCA /* Define to empty if the keyword does not work. */ #undef const /* Define to one of _getb67, GETB67, getb67 for Cray-2 and Cray-YMP systems. This function is required for alloca.c support on those systems. */ #undef CRAY_STACKSEG_END /* Define to the type of elements in the array set by `getgroups'. Usually this is either `int' or `gid_t'. */ #undef GETGROUPS_T /* Define to `int' if <sys/types.h> doesn't define. */ #undef gid_t /* Define if you have alloca.h and it should be used (not Ultrix). */ #undef HAVE_ALLOCA_H /* Define if you support file names longer than 14 characters. */ #undef HAVE_LONG_FILE_NAMES /* Define if your struct stat has st_blksize. */ #undef HAVE_ST_BLKSIZE /* Define if on MINIX. */ #undef _MINIX /* Define if you don't have dirent.h, but have ndir.h. */ #undef NDIR /* Define to `long' if <sys/types.h> doesn't define. */ #undef off_t /* Define if the system does not provide POSIX.1 features except with this defined. */ #undef _POSIX_1_SOURCE /* Define if you need to in order for stat and other things to work. */ #undef _POSIX_SOURCE /* Define as the return type of signal handlers (int or void). */ #undef RETSIGTYPE /* Define if the setvbuf function takes the buffering type as its second argument and the buffer pointer as the third, as on System V before release 3. */ #undef SETVBUF_REVERSED /* If using the C implementation of alloca, define if you know the direction of stack growth for your system; otherwise it will be automatically deduced at run-time. STACK_DIRECTION > 0 => grows toward higher addresses STACK_DIRECTION < 0 => grows toward lower addresses STACK_DIRECTION = 0 => direction of growth unknown */ #undef STACK_DIRECTION /* Define if the `S_IS*' macros in <sys/stat.h> do not work properly. */ #undef STAT_MACROS_BROKEN /* Define if you have the ANSI C header files.
*/ #undef STDC_HEADERS /* Define if you don't have dirent.h, but have sys/dir.h. */ #undef SYSDIR /* Define if you don't have dirent.h, but have sys/ndir.h. */ #undef SYSNDIR /* Define to `int' if <sys/types.h> doesn't define. */ #undef uid_t /* Define if the closedir function returns void instead of int. */ #undef VOID_CLOSEDIR /* Define if your processor stores words with the most significant byte first (like Motorola and SPARC, unlike Intel and VAX). */ #undef WORDS_BIGENDIAN /* The number of bytes in an int. */ #undef SIZEOF_INT /* The number of bytes in a long. */ #undef SIZEOF_LONG /* Define if you have bcopy. */ #undef HAVE_BCOPY /* Define if you have bzero. */ #undef HAVE_BZERO /* Define if you have flock. */ #undef HAVE_FLOCK /* Define if you have fsync. */ #undef HAVE_FSYNC /* Define if you have ftruncate. */ #undef HAVE_FTRUNCATE /* Define if you have getcwd. */ #undef HAVE_GETCWD /* Define if you have getdtablesize. */ #undef HAVE_GETDTABLESIZE /* Define if you have lrand48. */ #undef HAVE_LRAND48 /* Define if you have memmove. */ #undef HAVE_MEMMOVE /* Define if you have mktime. */ #undef HAVE_MKTIME /* Define if you have nice. */ #undef HAVE_NICE /* Define if you have on_exit. */ #undef HAVE_ON_EXIT /* Define if you have random. */ #undef HAVE_RANDOM /* Define if you have rename. */ #undef HAVE_RENAME /* Define if you have setlinebuf. */ #undef HAVE_SETLINEBUF /* Define if you have setrlimit. */ #undef HAVE_SETRLIMIT /* Define if you have srand48. */ #undef HAVE_SRAND48 /* Define if you have srandom. */ #undef HAVE_SRANDOM /* Define if you have sysconf. */ #undef HAVE_SYSCONF /* Define if you have timegm. */ #undef HAVE_TIMEGM /* Define if you have usleep. */ #undef HAVE_USLEEP /* Define if you have vfork. */ #undef HAVE_VFORK /* Define if you have the <arpa/inet.h> header file. */ #undef HAVE_ARPA_INET_H /* Define if you have the <config.h> header file. */ #undef HAVE_CONFIG_H /* Define if you have the <memory.h> header file. */ #undef HAVE_MEMORY_H /* Define if you have the <netinet/in.h> header file.
*/ #undef HAVE_NETINET_IN_H /* Define if you have the <stdlib.h> header file. */ #undef HAVE_STDLIB_H /* Define if you have the <string.h> header file. */ #undef HAVE_STRING_H /* Define if you have the <sys/syslog.h> header file. */ #undef HAVE_SYS_SYSLOG_H /* Define if you have the <sys/types.h> header file. */ #undef HAVE_SYS_TYPES_H /* Define if you have the <syslog.h> header file. */ #undef HAVE_SYSLOG_H /* Define if you have the dbm library (-ldbm). */ #undef HAVE_LIBDBM /* Define if you have the fl library (-lfl). */ #undef HAVE_LIBFL /* Define if you have the malloc library (-lmalloc). */ #undef HAVE_LIBMALLOC /* Define if you have the ndbm library (-lndbm). */ #undef HAVE_LIBNDBM /* Define if you have the nsl library (-lnsl). */ #undef HAVE_LIBNSL /* Define if you have the resolv library (-lresolv). */ #undef HAVE_LIBRESOLV /* Define if you have the seq library (-lseq). */ #undef HAVE_LIBSEQ /* Define if you have the socket library (-lsocket). */ #undef HAVE_LIBSOCKET #endif /* _AUTOCONF_H_ */ glimpse-4.18.7/libtemplate/include/ccache.h000066400000000000000000000173671300371307100205700ustar00rootroot00000000000000/* * ccache.h - FTP Connection Cache * * David Merkel & Mark Peterson, University of Colorado - Boulder, July 1994 * * $Id: ccache.h,v 1.2 2006/02/03 16:59:14 golda Exp $ * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. * * */ #ifndef _CCACHE_H_ #define _CCACHE_H_ #include <stdio.h> /* for FILE */ #include "url.h" /* for URL */ #include "config.h" #ifndef _PARAMS #define _PARAMS(ARGS) ARGS #endif /* _PARAMS */ typedef char Datum; typedef int Boolean; #define HASH_SLOTS (256) #define MIN_CONNECTIONS 3 /* Min number of connections to maintain.
*/ #define MIN_TIMEOUT 5 /* Min timeout in minutes. */ typedef enum { INACTIVE, MEMORY_ONLY, FILE_ONLY, OPTIMIZE, TEMP } BufferStatus; typedef struct SockCntlRec { int timerid; /* ID of the expiration timer. */ URL *socketInfo; /* The URL structure */ int theSocket; unsigned long theHostIP; char *incomDataBuf; /* Incoming data buffer */ FILE *incomDataFile; /* Incoming data file buffer */ int incomDataSize; /* Number of bytes in buffer */ BufferStatus sockStateStor; /* Default status for buffering */ struct SockCntlRec *hashPrev; /* Hash table previous */ struct SockCntlRec *hashNext; /* Hash table next */ struct SockCntlRec *listPrev; /* TimeOut list previous */ struct SockCntlRec *listNext; /* TimeOut list next */ } SockCntlRec; typedef struct InitConfigRec { int maxConnections; /* max # of socks to keep open at one time. */ long timeOut; /* TimeOut time in seconds. */ } InitConfigRec; enum ptype { STR, INT, POINTER, MD5 }; typedef enum ptype PType; #define MAX_LINE_LENGTH 1024 #define MAX_FILENAME_LENGTH 1024 #define HOST_NAME_LENGTH 1024 #define SERV_REPLY_LENGTH 3 #define BUFFER_SIZE 1024 /* buffer size for read calls */ #define INIT_FILE_SIZE 50000 /* bufsz if read files in mem */ #define REALLOC_BLOCK 2048 /* blk sz on realloc calls */ #define BACK_LOG 5 #define ACCEPT_TIMEOUT 5 /* time in seconds to timeout */ #define READ_TIMEOUT 5 /* if no data on line */ #define MULTI_LINE_CODE '-' /* * Controls newly created files permissions, as per chmod() arguments. 
* Final permissions are determined by process umask settings */ #define INIT_PERMISSION 0666 /* 'rw-rw-rw-' */ #define MAX_MESSAGE_LENGTH 50 #define PORT_MESSAGE_LENGTH 12 /* ftp message defines */ #define CONNECT "CONNECT" #define USER "USER" #define PASSWD "PASS" #define MODE "MODE" #define TYPE "TYPE" #define RETRIEVE "RETR" #define PORT "PORT" #define REINIT "REIN" #define DISCONNECT "QUIT" /* numerical defines for previous for fast compares */ #define CONNECT_CHK 0 #define USER_CHK 1 #define PASSWD_CHK 2 #define MODE_CHK 3 #define TYPE_CHK 4 #define RETRIEVE_CHK 5 #define PORT_CHK 6 #define REINIT_CHK 7 #define DISCONNECT_CHK 8 /* ftp mode and type defines */ #define IMAGE 'I' #define STREAM 'S' /* ftp server reply codes */ #define DATA_CONN_OPEN (0x31323500) /* "125" */ #define START_TRANS (0x31353000) /* "150" */ #define CMD_OKAY (0x32303000) /* "200" */ #define CLOSING (0x32323100) /* "221" */ #define CONNECT_EST (0x32323000) /* "220" */ #define TRANS_SUCCESS (0x32323600) /* "226" */ #define USER_LOGIN (0x32333000) /* "230" */ #define SEND_PASS (0x33333100) /* "331" */ typedef struct data_return { Boolean inMemory; /* set if return file in mem */ Boolean useTempFile; /* set if use temp file */ char *buffer; /* for memory return */ char fileName[MAX_FILENAME_LENGTH]; /* save if not in memory */ long fileSize; /* data size */ } DataReturn; typedef short ERRCODE; typedef int CacheErr; #define SERV_NOT_RDY (0x31323000) /*"120" */ #define NEED_ACCOUNT (0x33333200) /*"332" */ #define SERV_NOT_AVAIL (0x34323100) /*"421" */ #define SYNTAX_ERR (0x35303000) /*"500" */ #define SYNTAX_ERR_PARM (0x35303100) /*"501" */ #define CMD_NOT_IMPL (0x35303200) /*"502" */ #define BAD_CMD_SEQ (0x35303300) /*"503" */ #define CMD_UNIMP_PARM (0x35303400) /*"504" */ #define NOT_LOGD_IN (0x35333000) /*"530" */ #define FILE_NOT_FOUND (0x35353000) /*"550" */ #define noErr 0 #define srvNotRdy 1 #define needAccnt 2 #define srvNotAvl 3 #define syntaxErr 4 #define cmdNotImp 5 #define 
badCmdSeq 6 #define cmdNImpParam 7 #define notLogdIn 8 #define initSockErr 9 #define readSockErr 10 #define getSockErr 11 #define getHostErr 12 #define connectErr 13 #define memoryErr 14 #define writeSockErr 15 #define setSockOptErr 16 #define getHNameErr 17 #define getHBNameErr 18 #define getSNameErr 19 #define bindErr 20 #define fileOpenErr 22 #define writeFileErr 23 #define tmpNameErr 24 #define acceptTOut 25 #define readTOut 26 #define argInvalid 27 #define noReply 28 #define badParam 29 #define urlErr 30 #define badurlType 31 #define fileNotFound 32 #define NO_ERROR 0 #define INIT_SOCKET_ERR -1 #define READ_FROM_SOCK_ERR -2 #define GET_SOCKET_ERR -3 #define GETHOST_ERR -4 #define CONNECT_ERR -5 #define MEMORY_ERROR -6 #define WRITE_TO_SOCK_ERR -7 #define SET_SOCKOPT_ERR -8 #define GET_HOSTNAME_ERR -9 #define GET_HOSTBYNAME_ERR -10 #define GET_SOCKNAME_ERR -11 #define BIND_ERR -12 #define FILE_OPEN_ERR -14 #define WRITE_FILE_ERR -15 #define CANT_GET_TMPNAME -16 #define SERV_REPLY_ERROR -20 #define NO_PASS_REQ -21 #define ACCEPT_TIMEOUT_ERR -22 #define READ_TIMEOUT_ERR -23 #define ARGUMENT_INVALID -24 #define NO_REPLY_PRESENT -25 #define BAD_PARAM_ERR -50 #define URL_ERR -51 #define BAD_URL_TYPE -52 #ifndef DEBUG #undef DEBUG 4 /* debug level */ #endif #ifndef TRUE #define TRUE 1 #endif #ifndef FALSE #define FALSE 0 #endif #define BACK_LOG 5 #define INIT_URL_LEN 256 #define INIT_PARAM_LEN 50 #define REALLOC_BLK 20 #define LINE_FEED '\n' #define CARRG_RET '\r' #define PARAM_END '.' #define BLOCK_END '!' 
#define TERM_LEN 3 #define MD5_LEN 32 /* timeouts for read calls on sockets */ /* timeout for server receiving client request messages */ #define SERVER_TIMEOUT 5 /* timeout for util calls reading params from socket */ #define PARAM_TIMEOUT 5 /* timeout for client waiting for server response */ #define CLIENT_TIMEOUT 120 /* ftp.c */ int FTPInit _PARAMS((u_long, int, char *)); int Login _PARAMS((char *, char *, int, Boolean, char *)); int Disconnect _PARAMS((int, char *)); int Retrieve _PARAMS((char *, int,int,Boolean,Boolean, char *, DataReturn *)); /* ftp_util.c */ int InitSocket _PARAMS((u_long, int )); int ReadServReply _PARAMS((int, char *)); int SendMessage _PARAMS((char *, char *, int)); void ReadOutText _PARAMS((int, Boolean, char *)); int CheckServReply _PARAMS((int, char *)); int PrepareDataConnect _PARAMS((int, Boolean)); int RetrieveFile _PARAMS((int, DataReturn *)); Boolean DisconnectOne _PARAMS(()); CacheErr GetError _PARAMS(()); char *GetFTPError _PARAMS(()); void DoError _PARAMS((CacheErr, char *)); /* ccache_util.c */ int MyRead _PARAMS((int, char *, int, int)); int AddURL _PARAMS((URL *, char **, int, int, Boolean)); int GetURL _PARAMS((URL *, int)); int GetParam _PARAMS((char **, PType, int)); int AddParam _PARAMS((char *, PType, char **, int *, int, Boolean)); int SocketWrite _PARAMS((int, char *, int)); void PrintURL _PARAMS((URL *)); unsigned long gethostinhex _PARAMS((char *)); void SockInit _PARAMS((InitConfigRec *)); DataReturn *SockGetData _PARAMS((URL *, BufferStatus, char *)); void ShutDownCache _PARAMS(()); void DestroyDataReturn _PARAMS((DataReturn *)); #endif /* _CCACHE_H_ */ glimpse-4.18.7/libtemplate/include/ccache_list.h000066400000000000000000000041161300371307100216070ustar00rootroot00000000000000/* * ccache_list.h -- definitions for CS 2270 linked list package * * Mark Peterson 9/91 * * $Id: ccache_list.h,v 1.1 1999/11/03 21:40:57 golda Exp $ * * ---------------------------------------------------------------------- * Copyright 
(c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. * */ #ifndef _CCACHE_LIST_H_ #define _CCACHE_LIST_H_ typedef struct list_node { /*list node type */ struct list_node *next; /*points to the following node */ struct list_node *previous; /*points to the previous node */ Datum *data; /*stores data record in list */ } List_Node; typedef struct { /*list header node */ List_Node *first; /*points to the first node */ List_Node *last; /*points to the last node */ unsigned int count; /*keeps count of the number of nodes in list */ int (*compare) (); /*A compare function */ } Linked_List; /* **The list toolkit functions: */ Linked_List *list_create(); /*initialize list header block */ void list_destroy(); /*destroy list header block */ List_Node *list_insert(); /*insert a new node in the list */ Datum *list_delete(); /*delete a node from the list */ List_Node *list_find(); /*find a node in the list */ Boolean list_apply(); /*apply a function to each node in the list */ /*Built in Macros */ #define list_first(head) ((head)->first) /*find the first node in the list */ #define list_next(node) ((node)->next) /*find the next node in the list */ #define list_last(head) ((head)->last) /*find the last node in the list */ #define list_previous(node) ((node)->previous) /*find the previous node in the list */ #define list_getdata(node) ((node)->data) /*get the data from the list */ void list_putdata(); /*modify the data in the node in the list */ #define list_length(head) ((head)->count) /*find the length of the list */ #endif glimpse-4.18.7/libtemplate/include/ccache_queue.h000066400000000000000000000030131300371307100217530ustar00rootroot00000000000000/* * ccache_queue.h - * * Mark Peterson 9/91 * * $Id: ccache_queue.h,v 1.1 1999/11/03 
21:40:57 golda Exp $ * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. * */ #ifndef _CCACHE_QUEUE_H_ #define _CCACHE_QUEUE_H_ #include "ccache_list.h" typedef List_Node Queue_Node; /*Define a Queue node same as a List node. */ typedef Linked_List Queue; /*Define a Queue header same as a List header.*/ /* **The queue toolkit contains: */ Queue *queue_create(); /*initialize queue header block */ void queue_destroy(); /*destroy queue header block */ Boolean enqueue(); /*insert a new node at the end of the queue */ Datum *dequeue(); /*delete a node from the head of the queue */ Boolean queue_apply(); /*apply a function to each node in the queue */ Boolean queue_empty(); /*is the queue empty? */ /*Special functions designed for use in a priority timer queue. */ /*For use specifically with the time_it.h package. */ Boolean tenqueue(); /*Enqueues a timer value. */ int tdequeue(); /*Dequeues a timer value. */ Datum *head(); #define queue_length(q_head) ((q_head)->count) /*Return the number of items in the queue. */ #endif glimpse-4.18.7/libtemplate/include/config.h000066400000000000000000000143511300371307100206150ustar00rootroot00000000000000/* * config.h - Master configuration file for the Harvest system. * * Darren Hardy, hardy@cs.colorado.edu, July 1994 * * $Id: config.h,v 1.1 1999/11/03 21:40:57 golda Exp $ * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona.
* Michael F. Schwartz of the University of Colorado at Boulder. */ #ifndef _CONFIG_H_ #define _CONFIG_H_ #include "autoconf.h" /* For GNU autoconf variables */ #include "paths.h" /* For GNU autoconf program/subst variables */ /* * USE_TMPDIR - default temporary directory into which files are extracted. */ #ifndef USE_TMPDIR #define USE_TMPDIR "/tmp" #endif /* * GENERATE_KEYWORDS - Gatherer will automatically generate a case * insensitive, unique, sorted keyword list for each content summary. */ #ifndef GENERATE_KEYWORDS #define GENERATE_KEYWORDS #endif /* * USE_LOCAL_CACHE - define if you want to use the Gatherer's local disk cache */ #ifndef USE_LOCAL_CACHE #define USE_LOCAL_CACHE #endif /* * CACHE_TTL - # of seconds after which local disk cache files are invalid */ #ifndef CACHE_TTL #define CACHE_TTL (1 * 7 * 24 * 60 * 60) /* 1 week */ #endif /* * USE_CCACHE - define if you want to use the FTP connection cache for liburl */ #ifndef USE_CCACHE #undef USE_CCACHE #endif /**************************************************************************** *--------------------------------------------------------------------------* * DO *NOT* MAKE ANY CHANGES below here unless you know what you're doing...* *--------------------------------------------------------------------------* ****************************************************************************/ /* * NO_STRDUP - define if standard C library doesn't have strdup(3). */ #ifndef NO_STRDUP #ifndef HAVE_STRDUP #define NO_STRDUP #endif #endif /* * NO_STRERROR - define if standard C library doesn't have strerror(3). */ #ifndef NO_STRERROR #ifndef HAVE_STRERROR #define NO_STRERROR #endif #endif /* * MAX_TYPES is the max # of types that the type recognition supports. */ #ifndef MAX_TYPES #define MAX_TYPES 512 #endif /* * CMD_TAR - command for tar */ #ifndef CMD_TAR #define CMD_TAR "tar" #endif /* * USE_BYNAME - name of the configuration file for by name type recog.
*/ #ifndef USE_BYNAME #define USE_BYNAME "byname.cf" #endif /* * USE_BYCONTENT - name of the configuration file for file content type recog. */ #ifndef USE_BYCONTENT #define USE_BYCONTENT "bycontent.cf" #endif /* * USE_BYURL - name of the configuration file for by URL type recog. */ #ifndef USE_BYURL #define USE_BYURL "byurl.cf" #endif /* * USE_MAGIC - default name and location of the magic file. */ #ifndef USE_MAGIC #define USE_MAGIC "magic" #endif /* * USE_STOPLIST - name of the stoplist configuration file */ #ifndef USE_STOPLIST #define USE_STOPLIST "stoplist.cf" #endif /* * USE_MD5 - generates MD5 (cryptographic checksums) for each retrieved file. */ #ifndef USE_MD5 #define USE_MD5 #endif /* * GDBM_GROWTH_BUG - reorganizes db after too many replaces */ #ifndef GDBM_GROWTH_BUG #undef GDBM_GROWTH_BUG #endif /* * REAL_FILE_URLS - causes the Gatherer to interpret 'file' URLs as * specified by Mosaic. If the hostname field is the same as the * current host, then the URL is treated as a local file, otherwise, * the 'file' URL is treated as an 'ftp' URL. */ #ifndef REAL_FILE_URLS #undef REAL_FILE_URLS #endif /* * TRANSLATE_LOCAL_URLS - causes the Gatherer to intercept certain * URLs and retrieve them through the local file system interface. */ #ifndef TRANSLATE_LOCAL_URLS #define TRANSLATE_LOCAL_URLS #endif /* * LOG_TIMES - each log message is prepended with the current time. */ #ifndef LOG_TIMES #define LOG_TIMES #endif /* * USE_LOG_SYNC - tries to synchronize multiple processes writing to a log file */ #ifndef USE_LOG_SYNC #define USE_LOG_SYNC #endif /* * XFER_TIMEOUT is the number of seconds that liburl will wait on a read() * before giving up. */ #ifndef XFER_TIMEOUT #define XFER_TIMEOUT 120 #endif /* * USE_CONFIRM_HOST - url_open() will check with DNS (or whatever) * to confirm that the hostname in the URL is valid if this is defined.
*/ #ifndef USE_CONFIRM_HOST #undef USE_CONFIRM_HOST #endif /* * USE_PCINDEX - defines .unnest types for the PC software Gatherer */ #ifndef USE_PCINDEX #define USE_PCINDEX #endif /* * Define _HARVEST_AIX_ for the RS/6000 AIX port. */ #ifndef _HARVEST_AIX_ #undef _HARVEST_AIX_ #endif #if defined(USE_POSIX_REGEX) || defined(USE_GNU_REGEX) #include #elif defined(USE_BSD_REGEX) extern int re_comp(), re_exec(); #endif #ifdef USE_POSIX_REGEX #ifndef USE_RE_SYNTAX #define USE_RE_SYNTAX REG_EXTENDED /* default Syntax */ #endif #endif /* internal quicksum needs good regex support */ #ifdef USE_POSIX_REGEX #ifndef USE_QUICKSUM #define USE_QUICKSUM #endif #ifndef USE_QUICKSUM_FILE #define USE_QUICKSUM_FILE "quick-sum.cf" #endif #endif #ifdef MEMORY_LEAKS #ifndef NO_STRDUP #define NO_STRDUP /* use our version of strdup() */ #endif #endif #ifndef BLKDEV_IOSIZE #include /* try to find it... */ #endif #ifdef BLKDEV_IOSIZE #define MIN_XFER BLKDEV_IOSIZE /* minimum number of bytes per disk xfer */ #else #define MIN_XFER 512 /* make reasonable guess */ #endif #ifndef BUFSIZ #include /* try to find it... */ #ifndef BUFSIZ #define BUFSIZ 4096 /* make reasonable guess */ #endif #endif #if defined(SYSTYPE_SYSV) || defined(__svr4__) /* System V system */ #define _HARVEST_SYSV_ #elif defined(__hpux) /* HP-UX - SysV-like? */ #define _HARVEST_HPUX_ #elif defined(__osf__) /* OSF/1 */ #define _HARVEST_OSF_ #elif defined(__linux__) /* Linux */ #define _HARVEST_LINUX_ #elif defined(_SYSTYPE_SYSV) || defined(__SYSTYPE_SYSV) /* other SysV */ #define _HARVEST_SYSV_ #elif defined(_SYSTYPE_SVR4) || defined(SYSTYPE_SVR4) /* other Sys4 */ #define _HARVEST_SYSV_ /* fake as sysv */ #else #define _HARVEST_BSD_ #endif #endif /* _CONFIG_H_ */ glimpse-4.18.7/libtemplate/include/gdbm.h000066400000000000000000000110571300371307100202610ustar00rootroot00000000000000/* gdbm.h - The include file for dbm users. */ /* This file is part of GDBM, the GNU data base manager, by Philip A. Nelson. 
Copyright (C) 1990, 1991, 1993 Free Software Foundation, Inc. GDBM is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2, or (at your option) any later version. GDBM is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with GDBM; see the file COPYING. If not, write to the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. You may contact the author by: e-mail: phil@cs.wwu.edu us-mail: Philip A. Nelson Computer Science Department Western Washington University Bellingham, WA 98226 *************************************************************************/ /* Protection for multiple includes. */ #ifndef _GDBM_H_ #define _GDBM_H_ /* Parameters to gdbm_open for READERS, WRITERS, and WRITERS who can create the database. */ #define GDBM_READER 0 /* A reader. */ #define GDBM_WRITER 1 /* A writer. */ #define GDBM_WRCREAT 2 /* A writer. Create the db if needed. */ #define GDBM_NEWDB 3 /* A writer. Always create a new db. */ #define GDBM_FAST 16 /* Write fast! => No fsyncs. */ /* Parameters to gdbm_store for simple insertion or replacement in the case that the key is already in the database. */ #define GDBM_INSERT 0 /* Never replace old data with new. */ #define GDBM_REPLACE 1 /* Always replace old data with new. */ /* Parameters to gdbm_setopt, specifing the type of operation to perform. */ #define GDBM_CACHESIZE 1 /* Set the cache size. */ #define GDBM_FASTMODE 2 /* Toggle fast mode. */ /* The data and key structure. This structure is defined for compatibility. */ typedef struct { char *dptr; int dsize; } datum; /* The file information header. This is good enough for most applications. 
*/ typedef struct {int dummy[10];} *GDBM_FILE; /* Determine if the C(++) compiler requires complete function prototype */ #if __STDC__ || defined(__cplusplus) || defined(c_plusplus) #define GDBM_Proto(x) x #else #define GDBM_Proto(x) () #endif /* NeedFunctionPrototypes */ /* External variable, the gdbm build release string. */ extern char *gdbm_version; /* GDBM C++ support */ #if defined(__cplusplus) || defined(c_plusplus) extern "C" { #endif /* These are the routines! */ extern GDBM_FILE gdbm_open GDBM_Proto(( char *file, int block_size, int flags, int mode, void (*fatal_func)() )); extern void gdbm_close GDBM_Proto(( GDBM_FILE dbf )); extern int gdbm_store GDBM_Proto(( GDBM_FILE dbf, datum key, datum content, int flags )); extern datum gdbm_fetch GDBM_Proto(( GDBM_FILE dbf, datum key )); extern int gdbm_delete GDBM_Proto(( GDBM_FILE dbf, datum key )); extern datum gdbm_firstkey GDBM_Proto(( GDBM_FILE dbf )); extern datum gdbm_nextkey GDBM_Proto(( GDBM_FILE dbf, datum key )); extern int gdbm_reorganize GDBM_Proto(( GDBM_FILE dbf )); extern void gdbm_sync GDBM_Proto(( GDBM_FILE dbf )); extern int gdbm_exists GDBM_Proto(( GDBM_FILE dbf, datum key )); extern int gdbm_setopt GDBM_Proto(( GDBM_FILE dbf, int optflag, int *optval, int optlen )); #if defined(__cplusplus) || defined(c_plusplus) } #endif /* gdbm sends back the following error codes in the variable gdbm_errno. 
*/ typedef enum { GDBM_NO_ERROR, GDBM_MALLOC_ERROR, GDBM_BLOCK_SIZE_ERROR, GDBM_FILE_OPEN_ERROR, GDBM_FILE_WRITE_ERROR, GDBM_FILE_SEEK_ERROR, GDBM_FILE_READ_ERROR, GDBM_BAD_MAGIC_NUMBER, GDBM_EMPTY_DATABASE, GDBM_CANT_BE_READER, GDBM_CANT_BE_WRITER, GDBM_READER_CANT_DELETE, GDBM_READER_CANT_STORE, GDBM_READER_CANT_REORGANIZE, GDBM_UNKNOWN_UPDATE, GDBM_ITEM_NOT_FOUND, GDBM_REORGANIZE_FAILED, GDBM_CANNOT_REPLACE, GDBM_ILLEGAL_DATA, GDBM_OPT_ALREADY_SET, GDBM_OPT_ILLEGAL} gdbm_error; extern gdbm_error gdbm_errno; /* extra prototypes */ /* GDBM C++ support */ #if defined(__cplusplus) || defined(c_plusplus) extern "C" { #endif extern const char *gdbm_strerror GDBM_Proto(( gdbm_error error )); #if defined(__cplusplus) || defined(c_plusplus) } #endif #endif glimpse-4.18.7/libtemplate/include/paths.h000066400000000000000000000006551300371307100204710ustar00rootroot00000000000000/* Generated automatically from paths.h.in by configure. */ #ifndef _PATHS_H_ #define _PATHS_H_ #ifndef CMD_PERL #define CMD_PERL "/usr/local/perl" #endif #ifndef CMD_GZIP #define CMD_GZIP "/usr/local/gzip" #endif #ifndef CMD_GUNZIP #define CMD_GUNZIP "/usr/local/gunzip" #endif #ifndef CMD_UNZIP #define CMD_UNZIP "/usr/local/unzip" #endif #ifndef CMD_UNCOMPRESS #define CMD_UNCOMPRESS "/usr/ucb/uncompress" #endif #endif glimpse-4.18.7/libtemplate/include/paths.h.in000066400000000000000000000005321300371307100210700ustar00rootroot00000000000000#ifndef _PATHS_H_ #define _PATHS_H_ #ifndef CMD_PERL #define CMD_PERL "@CMD_PERL@" #endif #ifndef CMD_GZIP #define CMD_GZIP "@CMD_GZIP@" #endif #ifndef CMD_GUNZIP #define CMD_GUNZIP "@CMD_GUNZIP@" #endif #ifndef CMD_UNZIP #define CMD_UNZIP "@CMD_UNZIP@" #endif #ifndef CMD_UNCOMPRESS #define CMD_UNCOMPRESS "@CMD_UNCOMPRESS@" #endif #endif glimpse-4.18.7/libtemplate/include/template.h000066400000000000000000000074111300371307100211620ustar00rootroot00000000000000/* * template.h - SOIF template processing support * * Darren Hardy, hardy@cs.colorado.edu, July 1994 * 
* $Id: template.h,v 1.2 2006/02/03 16:59:14 golda Exp $ * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. */ #ifndef _TEMPLATE_H_ #define _TEMPLATE_H_ #include #include "config.h" /* Attribute-Value pair */ struct attr_value { char *attribute; /* Attribute string; '\0' terminated */ char *value; /* Value data; not '\0' terminated */ size_t vsize; /* # of bytes in the value data */ size_t offset; /* Starting byte of the value data in input */ }; typedef struct attr_value AVPair; /* List of Attribute-Value pairs */ struct attr_value_list { AVPair *data; struct attr_value_list *next; }; typedef struct attr_value_list AVList; /* SOIF Template structure */ struct template { char *template_type; /* type of template */ char *url; /* URL for template */ AVList *list; /* List of Attribute-Value pairs */ size_t offset; /* Starting byte of the template; the @ */ size_t length; /* # of bytes that the template covers */ }; typedef struct template Template; /* Common Attribute Tags */ #define T_ABSTRACT "Abstract" #define T_AUTHOR "Author" #define T_FILESIZE "File-Size" #define T_FULLTEXT "Full-Text" #define T_GENERATION "Generation-Time" #define T_GHOST "Gatherer-Host" #define T_GNAME "Gatherer-Name" #define T_GVERSION "Gatherer-Version" #define T_KEYS "Keywords" #define T_LMT "Last-Modification-Time" #define T_MD5 "MD5" #define T_NESTED "Nested-Filename" #define T_PARTTEXT "Partial-Text" #define T_REFERENCE "References" #define T_REFRESH "Refresh-Rate" #define T_RELATED "Related" #define T_TITLE "Title" #define T_TTL "Time-to-Live" #define T_TYPE "Type" #define T_UPDATE "Update-Time" #define T_URL "URL" /* Backwards compatibility */ #define 
T_FILETYPE T_TYPE #define T_TIMESTAMP T_LMT /* Common time_t numbers */ #define HOUR ((time_t) 60 * 60) #define DAY ((time_t) HOUR * 24) #define WEEK ((time_t) DAY * 7) #define MONTH ((time_t) WEEK * 4) #define YEAR ((time_t) MONTH * 12) #ifndef _PARAMS #define _PARAMS(ARGS) ARGS #endif /* _PARAMS */ /* Templates */ Template *create_template _PARAMS((char *, char *)); Template *embed_template _PARAMS((Template *, Template *)); void free_template _PARAMS((Template *)); /* Attribute-Value pairs */ AVList *create_AVList _PARAMS((char *, char *, int)); void add_AVList _PARAMS((AVList *, char *, char *, int)); void FAST_add_AVList _PARAMS((AVList *, char *, char *, int)); void append_AVList _PARAMS((AVList *, char *, char *, int)); AVPair *extract_AVPair _PARAMS((AVList *, char *)); AVList *merge_AVList _PARAMS((AVList *, AVList *)); AVList *sort_AVList _PARAMS((AVList *)); AVList *sink_embedded _PARAMS((AVList *)); int exists_AVList _PARAMS((AVList *, char *)); void free_AVPair _PARAMS((AVPair *)); void free_AVList _PARAMS((AVList *)); /* Printing Templates */ Buffer *init_print_template _PARAMS((FILE *)); void print_template _PARAMS((Template *)); void print_template_header _PARAMS((Template *)); void print_template_body _PARAMS((Template *)); void print_template_trailer _PARAMS((Template *)); void finish_print_template _PARAMS(()); /* Parsing Templates */ void init_parse_template_file _PARAMS((FILE *)); void init_parse_template_string _PARAMS((char *, int)); Template *parse_template _PARAMS(()); void finish_parse_template _PARAMS(()); int is_parse_end_of_input _PARAMS(()); #endif /* _TEMPLATE_H_ */ glimpse-4.18.7/libtemplate/include/time_it.h000066400000000000000000000021301300371307100207720ustar00rootroot00000000000000/* * time_it.h * * $Id: time_it.h,v 1.1 1999/11/03 21:40:57 golda Exp $ * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. 
* Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. * */ #ifndef _TIME_IT_H_ #define _TIME_IT_H_ /*Automatically include timer.h only if not previously included. */ /*Simple structure to track blocked processes in an OS. */ typedef struct timer_record { int eventnum; /*Tracks the order of items enqueued. */ int time_in_secs; /*Non-cumulative time in seconds for process to remain blocked in queue. */ int (*ProctoCall) (); /*A pointer to the blocked procedure. */ } Time_Node; void InitTimer(); int SetTimer(); Boolean CancelTimer(); void HandleTimerSignal(); void Freeze(); void Thaw(); void DisplayQueue(); #endif glimpse-4.18.7/libtemplate/include/url.h000066400000000000000000000061531300371307100201530ustar00rootroot00000000000000/* * url.h - URL Processing (parsing & retrieval) * * Darren Hardy, hardy@cs.colorado.edu, April 1994 * * $Id: url.h,v 1.2 2006/02/03 16:59:14 golda Exp $ * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. * */ #ifndef _URL_H_ #define _URL_H_ #include "config.h" #include /* * The supported URLs look like: * * file://host/pathname * gopher://host[:port][/TypeDigitGopherRequest] * http://host[:port][/[pathname][#name][?search]] * ftp://[user[:password]@]host[:port][/pathname] * * where host is either a fully qualified hostname, an IP number, or * a relative hostname. * * For http, any '#name' or '?search' directives are ignored. * For ftp, any user, password, or port directives are unsupported. 
*/ struct url { char *url; /* Complete, normalized URL */ int type; /* file, ftp, http, gopher, etc. */ char *raw_pathname; /* pathname portion of the URL, w/ escapes */ char *pathname; /* pathname portion of the URL, w/o escapes */ char *host; /* fully qualified hostname */ int port; /* TCP/IP port */ /* Information for FTP processing */ char *user; /* Login name for ftp */ char *password; /* password for ftp */ /* Information for Gopher processing */ int gophertype; /* Numeric type for gopher request */ /* Information for HTTP processing */ char *http_version; /* HTTP/1.0 Version */ int http_status_code; /* HTTP/1.0 Status Code */ char *http_reason_line; /* HTTP/1.0 Reason Line */ char *http_mime_hdr; /* HTTP/1.0 MIME Response Header */ /* Information for local copy processing */ char *filename; /* local filename */ FILE *fp; /* ptr to local filename */ time_t lmt; /* Last-Modification-Time */ #ifdef USE_MD5 char *md5; /* MD5 value of URL */ #endif }; typedef struct url URL; enum url_types { /* Constants for URL types */ URL_UNKNOWN, URL_FILE, URL_FTP, URL_GOPHER, URL_HTTP, URL_NEWS, URL_NOP, URL_TELNET, URL_WAIS }; #ifndef _PARAMS #define _PARAMS(ARGS) ARGS #endif /* _PARAMS */ URL *url_open _PARAMS((char *)); int url_read _PARAMS((char *, int, int, URL *)); int url_retrieve _PARAMS((URL *)); void url_close _PARAMS((URL *)); void init_url _PARAMS(()); void finish_url _PARAMS(()); void url_purge _PARAMS(()); URL *dup_url _PARAMS((URL *)); void print_url _PARAMS((URL *)); int http_get _PARAMS((URL *)); int ftp_get _PARAMS((URL *)); int gopher_get _PARAMS((URL *)); #ifdef USE_LOCAL_CACHE void init_cache _PARAMS(()); char *lookup_cache _PARAMS((char *)); time_t lmt_cache _PARAMS((char *)); void add_cache _PARAMS((char *, char *)); void finish_cache _PARAMS(()); void expire_cache _PARAMS(()); #endif #ifdef USE_CCACHE void url_initCache _PARAMS((int, long)); void url_shutdowncache _PARAMS(()); #endif #endif /* _URL_H_ */ 
glimpse-4.18.7/libtemplate/include/util.h000066400000000000000000000066261300371307100203330ustar00rootroot00000000000000/* * util.h - Common utilities for the Essence system * * Darren Hardy, hardy@cs.colorado.edu, April 1994 * * $Id: util.h,v 1.3 2006/02/03 16:56:35 golda Exp $ * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. */ #ifndef _UTIL_H_ #define _UTIL_H_ #include #include "config.h" /* Buffer structure for buffer management routines */ struct gbuf { /* Growing and shrinking buffer */ char *data; /* Data buffer */ int length; /* Current length of data buffer */ int size; /* Size allocated in the Data buffer */ int default_size; /* Default size of the Data buffer */ }; typedef struct gbuf Buffer; /* Growing buffer */ #define stradd_buffer(b,s) add_buffer((b), (s), strlen(s)) #ifndef _PARAMS #define _PARAMS(ARGS) ARGS #endif /* _PARAMS */ /* from buffer.c Buffer manipulation routines */ Buffer *create_buffer _PARAMS((int)); /* New buffer */ void grow_buffer _PARAMS((Buffer *)); /* Increase buffer size */ void increase_buffer _PARAMS((Buffer *, int)); /* Increase buffer size */ void shrink_buffer _PARAMS((Buffer *)); /* Reduce buffer size */ void add_buffer _PARAMS((Buffer *, char *, int));/* Add data to a buffer */ void free_buffer _PARAMS((Buffer *)); /* Clean up a buffer */ /* from host.c */ char *getfullhostname _PARAMS(()); /* Fully qualified hostname */ char *getmylogin _PARAMS(()); /* getlogin(3) clone */ char *getrealhost _PARAMS((char *)); /* Real DNS hostname */ /* from log.c */ void init_log _PARAMS((FILE *, FILE *)); /* Initialize log routines */ void init_log3 _PARAMS((char *,FILE *,FILE *)); /* Initialize log 
routines */ void log_errno _PARAMS((char *)); /* Same as perror(3) */ void fatal_errno _PARAMS((char *)); /* Same as perror(3) & exit */ #include void glimpselog _PARAMS((char *, ...)); /* Log a message */ void errorlog _PARAMS((char *, ...)); /* Log an error message */ void fatal _PARAMS((char *, ...)); /* Log error msg and exit */ #ifdef NO_STRDUP /* from strdup.c */ char *strdup _PARAMS((const char *)); /* Duplicate a string */ #endif /* from string.c */ void parse_argv _PARAMS((char **, char *)); /* Parse a command string */ /* from system.c */ int do_system _PARAMS((char *)); /* Wrapper for system(3) */ int run_cmd _PARAMS((char *)); /* Simple system(3) */ void do_system_lifetime _PARAMS((char *, int)); /* Limited system(3) */ void close_all_fds _PARAMS((int)); /* Closes all fd's */ /* from xmalloc.c */ void *xmalloc _PARAMS((size_t)); /* Wrapper for malloc(3) */ void *xrealloc _PARAMS((void *, size_t)); /* Wrapper for realloc(3) */ void xfree _PARAMS((void *)); /* Wrapper for free(3) */ /* from harvest.c */ char *harvest_bindir _PARAMS((void)); char *harvest_libdir _PARAMS((void)); char *harvest_topdir _PARAMS((void)); void harvest_add_path _PARAMS((char *)); #define harvest_add_gatherer_path() harvest_add_path("gatherer:") #define harvest_add_broker_path() harvest_add_path("broker:") #define harvest_add_cache_path() harvest_add_path("cache:") #define harvest_add_replicator_path() harvest_add_path("replicator:") #endif /* _UTIL_H_ */ glimpse-4.18.7/libtemplate/lib/000077500000000000000000000000001300371307100163165ustar00rootroot00000000000000glimpse-4.18.7/libtemplate/lib/Makefile.in000066400000000000000000000001231300371307100203570ustar00rootroot00000000000000all: install: clean: rm -f *.a core distclean realclean: clean rm -f Makefile glimpse-4.18.7/libtemplate/lib/Makefile.sunos000066400000000000000000000000501300371307100211170ustar00rootroot00000000000000all: install: clean: -rm -f *.a core 
glimpse-4.18.7/libtemplate/template/000077500000000000000000000000001300371307100173635ustar00rootroot00000000000000glimpse-4.18.7/libtemplate/template/Attributes.html000066400000000000000000000051731300371307100224050ustar00rootroot00000000000000 Attributes supported by the Gatherer

Attributes supported by the Gatherer

Abstract
Brief abstract about the object.
Author
Author(s) of the object.
Description
Brief description about the object.
File-Size
Number of bytes in the object.
Full-Text
Entire contents of the object.
Gatherer-Host
Host on which the Gatherer ran to extract information from the object.
Gatherer-Name
Name of the Gatherer that extracted information from the object (e.g., Full-Text, Selected-Text, or Terse).
Gatherer-Port
Port number on the Gatherer-Host that serves the Gatherer's information.
Gatherer-Version
Version number of the Gatherer.
Generation-Time
The time that the Gatherer generated the content summary for the object.
Keywords
Searchable keywords extracted from the object.
Last-Modification-Time
The time that the object was last modified.
MD5
MD5 16-byte checksum of the object.
Refresh-Rate
How often the Broker attempts to update the content summary.
Time-to-Live
The time at which the content summary is no longer valid.
Type
The object's type. Some example types are:

	Archive
	Audio
	Awk
	Backup
	Binary
	C
	CHeader
	Command
	Compressed
	CompressedTar
	Configuration
	Data
	Directory
	DotFile
	Dvi
	FAQ
	FYI
	Font
	FormattedText
	GDBM
	GNUCompressed
	GNUCompressedTar
	HTML
	Image
	Internet-Draft
	MacCompressed
	Mail
	Makefile
	ManPage
	Object
	OtherCode
	PCCompressed
	Patch
	Perl
	PostScript
	RCS
	README
	RFC
	SCCS
	ShellArchive
	Tar
	Tcl
	Tex
	Text
	Troff
	Uuencoded
	WaisSource
Update-Time
The time that the Gatherer updated (generated) the content summary for the object.
URL-References
Any URL references present within HTML objects.


Darren R. Hardy, hardy@cs.colorado.edu
glimpse-4.18.7/libtemplate/template/Makefile.NeXT000066400000000000000000000040611300371307100216410ustar00rootroot00000000000000# # Makefile for the SOIF template processing code # # Darren Hardy, hardy@cs.colorado.edu, May 1994 # # $Id: Makefile.NeXT,v 1.1 1999/11/03 21:41:04 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = gcc INSTALL = cp #install -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = /bin/ranlib XTRA_LIBS = -lm LN_S = ln -s DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG DEBUG_LIBS = CFLAGS = $(DEBUG) -I../include OBJS = template.o LIBDIR = ../lib LDFLAGS = -L$(LIBDIR) LIBS = -ltemplate -lutil $(DEBUG_LIBS) $(XTRA_LIBS) LIBFILE = libtemplate.a BINS = cksoif print-template print-attr \ lsm2soif iafa2soif pcindex2soif translate-urls all: $(LIBFILE) install-lib #$(BINS) mktemplate $(LIBFILE): $(OBJS) ar r $@ $(OBJS) $(RANLIB) $@ clean: -rm -f core $(OBJS) $(LIBFILE) $(BINS) *.o #realclean: # -rm -f Makefile mktemplate install: install-lib @for f in $(BINS) mktemplate; do \ echo $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ done -rm -f $(INSTALL_BINDIR)/LSM.unnest $(LN_S) $(INSTALL_BINDIR)/lsm2soif $(INSTALL_BINDIR)/LSM.unnest install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) cksoif: cksoif.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-template: print-template.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-attr: print-attr.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) template2html: template2html.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) translate-urls: translate-urls.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) lsm2soif: lsm2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) iafa2soif: iafa2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) print-urlrefs: print-urlrefs.o $(CC) -o $@ 
$@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) pcindex2soif: pcindex2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) glimpse-4.18.7/libtemplate/template/Makefile.alpha000066400000000000000000000040721300371307100221120ustar00rootroot00000000000000# # Makefile for the SOIF template processing code # # Darren Hardy, hardy@cs.colorado.edu, May 1994 # # $Id: Makefile.alpha,v 1.1 1999/11/03 21:41:04 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = cc INSTALL = cp #installbsd -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = ranlib XTRA_LIBS = -lresolv -lm LN_S = ln -s DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG DEBUG_LIBS = CFLAGS = $(DEBUG) -I../include OBJS = template.o LIBDIR = ../lib LDFLAGS = -L$(LIBDIR) LIBS = -ltemplate -lutil $(DEBUG_LIBS) $(XTRA_LIBS) LIBFILE = libtemplate.a BINS = cksoif print-template print-attr \ lsm2soif iafa2soif pcindex2soif translate-urls all: $(LIBFILE) install-lib #$(BINS) mktemplate $(LIBFILE): $(OBJS) ar r $@ $(OBJS) $(RANLIB) $@ clean: -rm -f core $(OBJS) $(LIBFILE) $(BINS) *.o #realclean: # -rm -f Makefile mktemplate install: install-lib @for f in $(BINS) mktemplate; do \ echo $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ done -rm -f $(INSTALL_BINDIR)/LSM.unnest $(LN_S) $(INSTALL_BINDIR)/lsm2soif $(INSTALL_BINDIR)/LSM.unnest install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) cksoif: cksoif.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-template: print-template.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-attr: print-attr.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) template2html: template2html.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) translate-urls: translate-urls.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) lsm2soif: lsm2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) 
-lgdbm -lmd5 $(XTRA_LIBS) iafa2soif: iafa2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) print-urlrefs: print-urlrefs.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) pcindex2soif: pcindex2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) glimpse-4.18.7/libtemplate/template/Makefile.hp000066400000000000000000000040571300371307100214370ustar00rootroot00000000000000# # Makefile for the SOIF template processing code # # Darren Hardy, hardy@cs.colorado.edu, May 1994 # # $Id: Makefile.hp,v 1.1 1999/11/03 21:41:04 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = cc INSTALL = cp #install -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = : XTRA_LIBS = -lresolv -lm LN_S = ln -s DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG DEBUG_LIBS = CFLAGS = $(DEBUG) -I../include OBJS = template.o LIBDIR = ../lib LDFLAGS = -L$(LIBDIR) LIBS = -ltemplate -lutil $(DEBUG_LIBS) $(XTRA_LIBS) LIBFILE = libtemplate.a BINS = cksoif print-template print-attr \ lsm2soif iafa2soif pcindex2soif translate-urls all: $(LIBFILE) install-lib #$(BINS) mktemplate $(LIBFILE): $(OBJS) ar r $@ $(OBJS) $(RANLIB) $@ clean: -rm -f core $(OBJS) $(LIBFILE) $(BINS) *.o #realclean: # -rm -f Makefile mktemplate install: install-lib @for f in $(BINS) mktemplate; do \ echo $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ done -rm -f $(INSTALL_BINDIR)/LSM.unnest $(LN_S) $(INSTALL_BINDIR)/lsm2soif $(INSTALL_BINDIR)/LSM.unnest install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) cksoif: cksoif.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-template: print-template.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-attr: print-attr.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) template2html: template2html.o $(CC) -o $@ $@.o 
$(LDFLAGS) $(LIBS) translate-urls: translate-urls.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) lsm2soif: lsm2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) iafa2soif: iafa2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) print-urlrefs: print-urlrefs.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) pcindex2soif: pcindex2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) glimpse-4.18.7/libtemplate/template/Makefile.in000066400000000000000000000041541300371307100214340ustar00rootroot00000000000000# # Makefile for the SOIF template processing code # # Darren Hardy, hardy@cs.colorado.edu, May 1994 # # $Id: Makefile.in,v 1.2 2001/05/21 04:48:59 golda Exp $ # srcdir = @srcdir@ VPATH = @srcdir@ prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = @CC@ INSTALL = @INSTALL@ INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} RANLIB = @RANLIB@ XTRA_LIBS = -lresolv -lm LN_S = @LN_S@ DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG DEBUG_LIBS = CFLAGS = $(DEBUG) -I../include OBJS = template.o LIBDIR = ../lib LDFLAGS = -L$(LIBDIR) LIBS = -ltemplate -lutil $(DEBUG_LIBS) $(XTRA_LIBS) LIBFILE = libtemplate.a BINS = cksoif print-template print-attr \ lsm2soif iafa2soif pcindex2soif translate-urls all: $(LIBFILE) install-lib install: all install-man: clean: rm -f core $(OBJS) $(LIBFILE) $(BINS) *.o distclean realclean: clean rm -f Makefile mktemplate install-bin: all @for f in $(BINS) mktemplate; do \ echo $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ done -rm -f $(INSTALL_BINDIR)/LSM.unnest $(LN_S) $(INSTALL_BINDIR)/lsm2soif $(INSTALL_BINDIR)/LSM.unnest install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) $(LIBFILE): $(OBJS) ar r $@ $(OBJS) $(RANLIB) $@ cksoif: cksoif.o $(CC) -o $@ $@.o 
$(LDFLAGS) $(LIBS) print-template: print-template.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-attr: print-attr.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) template2html: template2html.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) translate-urls: translate-urls.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) lsm2soif: lsm2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) iafa2soif: iafa2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) print-urlrefs: print-urlrefs.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) pcindex2soif: pcindex2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) glimpse-4.18.7/libtemplate/template/Makefile.linux000066400000000000000000000040761300371307100221700ustar00rootroot00000000000000# # Makefile for the SOIF template processing code # # Darren Hardy, hardy@cs.colorado.edu, May 1994 # # $Id: Makefile.linux,v 1.1 1999/11/03 21:41:04 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = gcc -m486 INSTALL = cp #install -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = ranlib XTRA_LIBS = -lresolv -lm LN_S = ln -s DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG DEBUG_LIBS = CFLAGS = $(DEBUG) -I../include OBJS = template.o LIBDIR = ../lib LDFLAGS = -L$(LIBDIR) LIBS = -ltemplate -lutil $(DEBUG_LIBS) $(XTRA_LIBS) LIBFILE = libtemplate.a BINS = cksoif print-template print-attr \ lsm2soif iafa2soif pcindex2soif translate-urls all: $(LIBFILE) install-lib #$(BINS) mktemplate $(LIBFILE): $(OBJS) ar r $@ $(OBJS) $(RANLIB) $@ clean: -rm -f core $(OBJS) $(LIBFILE) $(BINS) *.o #realclean: # -rm -f Makefile mktemplate install: install-lib @for f in $(BINS) mktemplate; do \ echo $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ done -rm -f $(INSTALL_BINDIR)/LSM.unnest $(LN_S) $(INSTALL_BINDIR)/lsm2soif $(INSTALL_BINDIR)/LSM.unnest install-lib: 
$(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) cksoif: cksoif.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-template: print-template.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-attr: print-attr.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) template2html: template2html.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) translate-urls: translate-urls.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) lsm2soif: lsm2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) iafa2soif: iafa2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) print-urlrefs: print-urlrefs.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) pcindex2soif: pcindex2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) glimpse-4.18.7/libtemplate/template/Makefile.rs6000000066400000000000000000000040541300371307100217570ustar00rootroot00000000000000# # Makefile for the SOIF template processing code # # Darren Hardy, hardy@cs.colorado.edu, May 1994 # # $Id: Makefile.rs6000,v 1.1 1999/11/03 21:41:04 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = cc INSTALL = cp #install -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = true XTRA_LIBS = -lm LN_S = ln -s DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG DEBUG_LIBS = CFLAGS = $(DEBUG) -I../include OBJS = template.o LIBDIR = ../lib LDFLAGS = -L$(LIBDIR) LIBS = -ltemplate -lutil $(DEBUG_LIBS) $(XTRA_LIBS) LIBFILE = libtemplate.a BINS = cksoif print-template print-attr \ lsm2soif iafa2soif pcindex2soif translate-urls all: $(LIBFILE) install-lib #$(BINS) mktemplate $(LIBFILE): $(OBJS) ar r $@ $(OBJS) $(RANLIB) $@ clean: -rm -f core $(OBJS) $(LIBFILE) $(BINS) *.o #realclean: # -rm -f Makefile mktemplate install: install-lib @for f in $(BINS) mktemplate; do \ echo $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ 
$(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ done -rm -f $(INSTALL_BINDIR)/LSM.unnest $(LN_S) $(INSTALL_BINDIR)/lsm2soif $(INSTALL_BINDIR)/LSM.unnest install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) cksoif: cksoif.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-template: print-template.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-attr: print-attr.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) template2html: template2html.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) translate-urls: translate-urls.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) lsm2soif: lsm2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) iafa2soif: iafa2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) print-urlrefs: print-urlrefs.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) pcindex2soif: pcindex2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) glimpse-4.18.7/libtemplate/template/Makefile.sgi000066400000000000000000000040511300371307100216040ustar00rootroot00000000000000# # Makefile for the SOIF template processing code # # Darren Hardy, hardy@cs.colorado.edu, May 1994 # # $Id: Makefile.sgi,v 1.1 1999/11/03 21:41:04 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = cc INSTALL = cp #install -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = true XTRA_LIBS = -lm LN_S = ln -s DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG DEBUG_LIBS = CFLAGS = $(DEBUG) -I../include OBJS = template.o LIBDIR = ../lib LDFLAGS = -L$(LIBDIR) LIBS = -ltemplate -lutil $(DEBUG_LIBS) $(XTRA_LIBS) LIBFILE = libtemplate.a BINS = cksoif print-template print-attr \ lsm2soif iafa2soif pcindex2soif translate-urls all: $(LIBFILE) install-lib #$(BINS) mktemplate $(LIBFILE): $(OBJS) ar r $@ $(OBJS) $(RANLIB) $@ clean: -rm -f core $(OBJS) $(LIBFILE) $(BINS) *.o 
#realclean: # -rm -f Makefile mktemplate install: install-lib @for f in $(BINS) mktemplate; do \ echo $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ done -rm -f $(INSTALL_BINDIR)/LSM.unnest $(LN_S) $(INSTALL_BINDIR)/lsm2soif $(INSTALL_BINDIR)/LSM.unnest install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) cksoif: cksoif.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-template: print-template.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-attr: print-attr.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) template2html: template2html.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) translate-urls: translate-urls.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) lsm2soif: lsm2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) iafa2soif: iafa2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) print-urlrefs: print-urlrefs.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) pcindex2soif: pcindex2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) glimpse-4.18.7/libtemplate/template/Makefile.solaris000066400000000000000000000040701300371307100224770ustar00rootroot00000000000000# # Makefile for the SOIF template processing code # # Darren Hardy, hardy@cs.colorado.edu, May 1994 # # $Id: Makefile.solaris,v 1.1 1999/11/03 21:41:04 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = gcc INSTALL = cp #install -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = true XTRA_LIBS = -lresolv -lm LN_S = ln -s DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG DEBUG_LIBS = CFLAGS = $(DEBUG) -I../include OBJS = template.o LIBDIR = ../lib LDFLAGS = -L$(LIBDIR) LIBS = -ltemplate -lutil $(DEBUG_LIBS) $(XTRA_LIBS) LIBFILE = libtemplate.a BINS = cksoif print-template print-attr \ lsm2soif iafa2soif pcindex2soif 
translate-urls all: $(LIBFILE) install-lib #$(BINS) mktemplate $(LIBFILE): $(OBJS) ar r $@ $(OBJS) $(RANLIB) $@ clean: -rm -f core $(OBJS) $(LIBFILE) $(BINS) *.o #realclean: # -rm -f Makefile mktemplate install: install-lib @for f in $(BINS) mktemplate; do \ echo $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ done -rm -f $(INSTALL_BINDIR)/LSM.unnest $(LN_S) $(INSTALL_BINDIR)/lsm2soif $(INSTALL_BINDIR)/LSM.unnest install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) cksoif: cksoif.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-template: print-template.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-attr: print-attr.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) template2html: template2html.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) translate-urls: translate-urls.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) lsm2soif: lsm2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) iafa2soif: iafa2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) print-urlrefs: print-urlrefs.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) pcindex2soif: pcindex2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) glimpse-4.18.7/libtemplate/template/Makefile.sunos000066400000000000000000000040701300371307100221720ustar00rootroot00000000000000# # Makefile for the SOIF template processing code # # Darren Hardy, hardy@cs.colorado.edu, May 1994 # # $Id: Makefile.sunos,v 1.1 1999/11/03 21:41:04 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = gcc INSTALL = cp #install -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = ranlib XTRA_LIBS = -lresolv -lm LN_S = ln -s DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG DEBUG_LIBS = CFLAGS = $(DEBUG) -I../include OBJS = template.o LIBDIR = ../lib LDFLAGS = 
-L$(LIBDIR) LIBS = -ltemplate -lutil $(DEBUG_LIBS) $(XTRA_LIBS) LIBFILE = libtemplate.a BINS = cksoif print-template print-attr \ lsm2soif iafa2soif pcindex2soif translate-urls all: $(LIBFILE) install-lib #$(BINS) mktemplate $(LIBFILE): $(OBJS) ar r $@ $(OBJS) $(RANLIB) $@ clean: -rm -f core $(OBJS) $(LIBFILE) $(BINS) *.o #realclean: # -rm -f Makefile mktemplate install: install-lib @for f in $(BINS) mktemplate; do \ echo $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ $(INSTALL_BIN) $$f $(INSTALL_BINDIR); \ done -rm -f $(INSTALL_BINDIR)/LSM.unnest $(LN_S) $(INSTALL_BINDIR)/lsm2soif $(INSTALL_BINDIR)/LSM.unnest install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) cksoif: cksoif.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-template: print-template.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) print-attr: print-attr.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) template2html: template2html.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) translate-urls: translate-urls.o $(CC) -o $@ $@.o $(LDFLAGS) $(LIBS) lsm2soif: lsm2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) iafa2soif: iafa2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) print-urlrefs: print-urlrefs.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) pcindex2soif: pcindex2soif.o $(CC) -o $@ $@.o $(LDFLAGS) -lurl $(LIBS) -lgdbm -lmd5 $(XTRA_LIBS) glimpse-4.18.7/libtemplate/template/README000066400000000000000000000003331300371307100202420ustar00rootroot00000000000000This directory holds the SOIF template processing library as well as some other utilities for manipulating SOIF templates. template.c is the SOIF template library; everything else is related. 
-Darren Hardy, June 1994
glimpse-4.18.7/libtemplate/template/cksoif.c000066400000000000000000000024001300371307100210010ustar00rootroot00000000000000
static char rcsid[] = "$Id: cksoif.c,v 1.1 1999/11/03 21:41:04 golda Exp $";
/*
 * cksoif - Reads in templates from stdin and prints whether or not
 * each one is a legal SOIF template.
 *
 * Darren Hardy, hardy@cs.colorado.edu, November 1994
 *
 * ----------------------------------------------------------------------
 * Copyright (c) 1994, 1995. All rights reserved.
 *
 *   Mic Bowman of Transarc Corporation.
 *   Peter Danzig of the University of Southern California.
 *   Darren R. Hardy of the University of Colorado at Boulder.
 *   Udi Manber of the University of Arizona.
 *   Michael F. Schwartz of the University of Colorado at Boulder.
 *
 */
/* System headers required by the calls below */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "util.h"
#include "template.h"

int main(argc, argv)
int argc;
char *argv[];
{
    int n = 0, nok = 0;
    Template *t;

    init_parse_template_file(stdin);
    while (1) {
        n++;
        t = parse_template();
        if (t == NULL && is_parse_end_of_input())
            break;
        if (t == NULL) {
            printf("Attempt %d: Invalid SOIF\n", n);
            continue;
        }
        free_template(t);
        printf("Attempt %d: Valid SOIF\n", n);
        nok++;
    }
    finish_parse_template();
    printf("Successfully parsed %d SOIF objects, in %d attempts\n", nok, n - 1);
    exit(0);
}
glimpse-4.18.7/libtemplate/template/iafa2soif.c000066400000000000000000000047011300371307100213740ustar00rootroot00000000000000
static char rcsid[] = "$Id: iafa2soif.c,v 1.1 1999/11/03 21:41:04 golda Exp $";
/*
 * iafa2soif - Converts IAFA templates to SOIF.
 *
 * Usage: iafa2soif url local-file
 *
 * Darren Hardy, hardy@cs.colorado.edu, June 1994
 *
 * ----------------------------------------------------------------------
 * Copyright (c) 1994, 1995. All rights reserved.
 *
 *   Mic Bowman of Transarc Corporation.
 *   Peter Danzig of the University of Southern California.
 *   Darren R. Hardy of the University of Colorado at Boulder.
 *   Udi Manber of the University of Arizona.
 *   Michael F.
Schwartz of the University of Colorado at Boulder. */ #include #include #include "util.h" #include "url.h" #include "template.h" static void usage() { fprintf(stderr, "Usage: iafa2soif url local-file\n"); exit(1); } #define isiafa(c) (isalnum(c) || ((c) == '-') || ((c) == '#')) static void do_iafatosoif(url, filename) char *url; char *filename; { char buf[BUFSIZ], *s, *p; char attr[BUFSIZ]; char value[BUFSIZ]; int i; Template *t; FILE *fp; URL *up; if ((up = url_open(url)) == NULL) { errorlog("Cannot open URL: %s\n", url); return; } /* Build the template */ t = create_template(NULL, up->url); /* Read the file and build a SOIF template from it */ if ((fp = fopen(filename, "r")) == NULL) { log_errno(filename); url_close(up); return; } attr[0] = '\0'; while (fgets(buf, BUFSIZ, fp)) { if ((s = strrchr(buf, '\n')) != NULL) *s = '\0'; if (strlen(buf) < 1) continue; if ((s = strchr(buf, ':')) == NULL) { append_AVList(t->list, attr, buf, strlen(buf)); continue; } for (p = buf, i = 0; p < s && isiafa(*p); p++, i++) attr[i] = *p; attr[i] = '\0'; if (i < 1) continue; /* null attribute */ while (*s != '\0' && (*s == ':' || isspace(*s))) s++; if (strlen(s) < 1) /* empty line */ continue; strcpy(value, s); if (t->list) append_AVList(t->list, attr, value, strlen(value)); else t->list = create_AVList(attr, value, strlen(value)); } fclose(fp); /* Print out the template */ (void) init_print_template(stdout); print_template(t); finish_print_template(); free_template(t); url_close(up); return; } int main(argc, argv) int argc; char *argv[]; { char *url, *filename; if (argc != 3) usage(); url = strdup(argv[1]); filename = strdup(argv[2]); init_log(stderr, stderr); init_url(); do_iafatosoif(url, filename); finish_url(); exit(0); } glimpse-4.18.7/libtemplate/template/lsm2soif.c000066400000000000000000000072741300371307100212770ustar00rootroot00000000000000static char rcsid[] = "$Id: lsm2soif.c,v 1.1 1999/11/03 21:41:04 golda Exp $"; /* * lsm2soif - Converts Linux Software Maps (lsm) to 
SOIF. * * Usage: lsm2soif url local-file * * Darren Hardy, hardy@cs.colorado.edu, June 1994 * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. */ #include #include #include "util.h" #include "url.h" #include "template.h" /* Local functions */ static void do_lsmtosoif(); /* Local variables */ static int n_flag = 0; static void usage() { fprintf(stderr, "Usage: lsm2soif url local-file\n"); exit(1); } static void do_lsmtosoif(url, filename) char *url; char *filename; { char buf[BUFSIZ], attr[BUFSIZ], value[BUFSIZ]; char *sv, *pv, *fv, *s, *p; int i; Template *t; FILE *fp; URL *up; AVPair *site_avp, *path_avp, *file_avp; if ((up = url_open(url)) == NULL) { errorlog("Cannot open URL: %s\n", url); return; } /* Build the template */ t = create_template(NULL, up->url); /* Read the file and build a SOIF template from it */ if ((fp = fopen(filename, "r")) == NULL) { log_errno(filename); url_close(up); return; } while (fgets(buf, BUFSIZ, fp)) { if ((s = strrchr(buf, '\n')) != NULL) *s = '\0'; if ((s = strchr(buf, '=')) == NULL) continue; /* not an LSM line */ for (p = buf, i = 0; p < s && !isspace(*p); p++, i++) attr[i] = *p; attr[i] = '\0'; if (i < 1) continue; /* null attribute */ if (isdigit(attr[--i])) attr[i] = '\0'; /* strip attribute number */ /* Make Desc lines Description lines */ if (!strcmp(attr, "Desc")) { strcpy(attr, "Description"); } while (*s != '\0' && (*s == '=' || isspace(*s))) s++; if (!strcmp(attr, "Site") || !strcmp(attr, "Path") || !strcmp(attr, "File")) { if ((p = strchr(s, ' ')) != NULL) *p = '\0'; if ((p = strchr(s, '\t')) != NULL) *p = '\0'; } if (strlen(s) < 1) /* empty line */ continue; strcpy(value, s); if (t->list) 
append_AVList(t->list, attr, value, strlen(value)); else t->list = create_AVList(attr, value, strlen(value)); } fclose(fp); /* Reset t->url to the file that the LSM points to, if possible */ site_avp = extract_AVPair(t->list, "Site"); path_avp = extract_AVPair(t->list, "Path"); file_avp = extract_AVPair(t->list, "File"); if (site_avp && path_avp && file_avp) { sv = strdup(site_avp->value); pv = strdup(path_avp->value); fv = strdup(file_avp->value); for (p = sv; *p && !isspace(*p); p++); *p = '\0'; for (p = pv; *p && !isspace(*p); p++); *p = '\0'; for (p = fv; *p && !isspace(*p); p++); *p = '\0'; if (*pv == '/' && *fv == '/') sprintf(buf, "ftp://%s%s%s", sv, pv, fv); else if (*pv == '/' && *fv != '/') sprintf(buf, "ftp://%s%s/%s", sv, pv, fv); else if (*pv != '/' && *fv == '/') sprintf(buf, "ftp://%s/%s%s", sv, pv, fv); else sprintf(buf, "ftp://%s/%s/%s", sv, pv, fv); xfree(t->url); t->url = strdup(buf); xfree(sv); xfree(pv); xfree(fv); } /* Print out the template */ (void) init_print_template(stdout); print_template(t); finish_print_template(); free_template(t); url_close(up); return; } int main(argc, argv) int argc; char *argv[]; { char *url, *filename; if (argc != 3) usage(); url = strdup(argv[1]); filename = strdup(argv[2]); init_log(stderr, stderr); init_url(); do_lsmtosoif(url, filename); finish_url(); exit(0); } glimpse-4.18.7/libtemplate/template/netfind2soif.pl000077500000000000000000000054251300371307100223230ustar00rootroot00000000000000: # *-*-perl-*-* eval 'exec perl -S $0 "$@"' if $running_under_some_shell; # # netfind2soif.pl - Converts the Netfind seed database into a SOIF stream. # Groups the SOIF templates by second-level domains. 
# # Darren Hardy, hardy@cs.colorado.edu, January 1995 # # $Id: netfind2soif.pl,v 1.1 1999/11/03 21:41:04 golda Exp $ # # The netfind seed database has records that look like this: # # %D domain-name # %O organization-name # %H hostnames # # Generates SOIF that looks like this: # # @DOMAIN { nop:domain-name # Embed-Domain{x}: foo.domain-name # Embed-Organization{x}: organization-name # Embed-Hosts{x}: hosts # Embed-Domain{x}: bar.domain-name # Embed-Organization{x}: organization-name # Embed-Hosts{x}: hosts # } # or # @DOMAIN { nop:domain-name # Domain{x}: domain-name # Organization{x}: organization-name # Hosts{x}: hosts # } # $ENV{'HARVEST_HOME'} = "/usr/local/harvest" if (!defined($ENV{'HARVEST_HOME'})); unshift(@INC, "$ENV{'HARVEST_HOME'}/lib"); # use local files $ENV{'TMPDIR'} = "/tmp" if (!defined($ENV{'TMPDIR'})); require 'soif.pl'; $sld_file = "$ENV{'TMPDIR'}/nfdomains.$$"; $slo_file = "$ENV{'TMPDIR'}/nforgs.$$"; $slh_file = "$ENV{'TMPDIR'}/nfhosts.$$"; &do_reset(); while (<>) { chop; $d = $1, next if (/^%D\s*(.*)$/io); $o = $1, next if (/^%O\s*(.*)$/io); $h = $1, next if (/^%H\s*(.*)$/io); if (defined($d) && defined($o) && defined($h)) { if ($d =~ /^(\d+)\.(\d+)\.(\d+)$/o) { $sld = "$1.$2"; } elsif ($d =~ /^([+\-\w]+\.[+\-\w]+)$/o) { $sld = $1; } elsif ($d =~ /^.*\.([+\-\w]+\.[+\-\w]+)$/o) { $sld = $1; } else { next; } $sl_domain{$sld} .= "$d\n"; if ($o eq "") { $sl_org{$sld} .= "unspecified\n"; } else { $sl_org{$sld} .= "$o\n"; } if ($h eq "") { $sl_host{$sld} .= "unspecified\n"; } else { $sl_host{$sld} .= "$h\n"; } &do_reset(); } } while (($key, $value) = each %sl_domain) { @domains = split(/\n/, $sl_domain{$key}); @orgs = split(/\n/, $sl_org{$key}); @hosts = split(/\n/, $sl_host{$key}); undef %record; delete $sl_domain{$key}; delete $sl_org{$key}; delete $sl_host{$key}; if ($#domains < 0 || $#orgs < 0 || $#hosts < 0) { print STDERR "Empty Record!\n"; next; } $n = $#domains + 1; if ($n == 1) { $record{"Domain"} = $domains[0]; $record{"Organization"} = 
$orgs[0] if ($orgs[0] ne "unspecified"); $record{"Hosts"} = $hosts[0] if ($hosts[0] ne "unspecified"); } else { for ($i = 1; $i <= $n; $i++) { $record{"Embed<$i>-Domain"} = $domains[$i-1]; $record{"Embed<$i>-Organization"} = $orgs[$i-1] if ($orgs[$i-1] ne "unspecified"); $record{"Embed<$i>-Hosts"} = $hosts[$i-1] if ($hosts[$i-1] ne "unspecified"); } } &soif'print("DOMAIN", "nop:$key", %record); } exit(0); sub do_reset { undef $d; undef $o; undef $h; undef $sld; } glimpse-4.18.7/libtemplate/template/pcindex2soif.c000066400000000000000000000113711300371307100221270ustar00rootroot00000000000000static char rcsid[] = "$Id: pcindex2soif.c,v 1.1 1999/11/03 21:41:04 golda Exp $"; /* * pcindex2soif.c - Converts Nebula's PC Index data file into SOIF * * Usage: pcindex2soif * * Input data file format: * First line: * ("root1" "root2" ... "rootn") * Other object lines: * (("." "p1" "p2" .. "name") "date" size "desc") * Directory lines: * (("." "p1" "p2" .. "name") "desc" "root") * * Has some ugly parsing; anyone know of a good lisp-style data reader in C? * Doesn't work with the Directory lines. * * Added the 'Archive-Site' tag. * * Darren Hardy, hardy@cs.colorado.edu, July 1994 * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. 
*/ #include #include #include #include "util.h" #include "url.h" #include "template.h" /* * grab_wrapped_value - returns the text of the largest value wrapped by * char c[0] and char c[1] */ char *grab_wrapped_value(buf, c) char **buf, *c; { static char s[BUFSIZ]; char *q; if ((q = strchr(*buf, c[0])) == NULL) return(NULL); *buf = q + 1; /* mark beginning */ strcpy(s, *buf); /* copy */ if ((q = strrchr(s, c[1])) != NULL) { *q = '\0'; /* terminate match */ q = strrchr(*buf, c[1]); *buf = q + 1; /* advance buf ptr */ } else { return(NULL); } return(s); } /* * grab_next_value - returns the text in the next smallest value wrapped * by char c[0] and char c[1] */ char *grab_next_value(buf, c) char **buf, *c; { static char s[BUFSIZ]; char *q; if ((q = strchr(*buf, c[0])) == NULL) return(NULL); *buf = q + 1; /* mark beginning */ strcpy(s, *buf); /* copy */ if ((q = strchr(s, c[1])) != NULL) { *q = '\0'; /* terminate match */ q = strchr(*buf, c[1]); *buf = q + 1; /* advance buf ptr */ } else { return(NULL); } return(s); } int main(argc, argv) int argc; char *argv[]; { char buf[BUFSIZ], url[BUFSIZ]; char *s, *q, *p, *tmp, *line, *tmp2; URL *rootup; char archive_site[BUFSIZ]; Template *t; strcpy(archive_site, "Unknown"); init_log(stderr, stderr); #ifdef HAVE_SETLINEBUF setlinebuf(stderr); setlinebuf(stdout); #endif /* Grab the Root URL from the first line of the data file */ if (fgets(buf, BUFSIZ, stdin) == NULL) fatal("Cannot read first line.\n"); tmp = buf; tmp2 = grab_wrapped_value(&tmp, "()"); if ((q = grab_next_value(&tmp2, "\"\"")) == NULL) fatal("Cannot read root URL.\n"); if ((rootup = url_open(q)) == NULL) fatal("Cannot parse URL: %s\n", q); if (strstr(rootup->url, "cica") != NULL) { strcpy(archive_site, "CICA DOS"); } else if (strstr(rootup->url, "garbo") != NULL) { strcpy(archive_site, "Garbo DOS (Finland)"); } else if (strstr(rootup->url, "hobbes") != NULL) { strcpy(archive_site, "Hobbes OS/2"); } else if (strstr(rootup->url, "ulowell") != NULL) { 
strcpy(archive_site, "U. Lowell DOS Games"); } else if (strstr(rootup->url, "oakland") != NULL) { strcpy(archive_site, "Oakland U. DOS"); } else if (strstr(rootup->url, "umich") != NULL) { strcpy(archive_site, "U. Michigan DOS"); } /* Now read all of the lines */ while (fgets(buf, BUFSIZ, stdin) != NULL) { strcpy(url, rootup->url); /* Grab the entry in parens */ tmp = buf; line = grab_wrapped_value(&tmp, "()"); tmp = strdup(line); /* Build the URL */ s = grab_next_value(&tmp, "()"); tmp2 = strdup(s); while ((p = grab_next_value(&tmp2, "\"\"")) != NULL) { if (!strcmp(p, ".")) continue; strcat(url, "/"); strcat(url, p); } if ((t = create_template(NULL, url)) == NULL) fatal("Cannot create a template for URL: %s\n", url); /* Grab the Date */ if ((p = grab_next_value(&tmp, "\"\"")) == NULL) fatal("Could not grab the date: %s\n", buf); t->list = create_AVList("Archive-Site", archive_site, strlen(archive_site)); if (p != NULL && strlen(p) >= 6) /* yymmdd */ add_AVList(t->list, "ASCII-Date", p, strlen(p)); /* Grab the size */ if ((p = grab_next_value(&tmp, " ")) == NULL) fatal("Could not grab the size: %s\n", buf); if (strcmp(p, "0") != 0 && strlen(p) > 0) add_AVList(t->list, "File-Size", p, strlen(p)); /* Grab the Description */ if ((p = grab_next_value(&tmp, "\"\"")) == NULL) fatal("Could not grab the description: %s\n", buf); if (p != NULL && strlen(p) > 0) add_AVList(t->list, "Description", p, strlen(p)); init_print_template(stdout); print_template(t); finish_print_template(); free_template(t); } exit(0); } glimpse-4.18.7/libtemplate/template/print-attr.c000066400000000000000000000024561300371307100216420ustar00rootroot00000000000000static char rcsid[] = "$Id: print-attr.c,v 1.1 1999/11/03 21:41:04 golda Exp $"; /* * print-attr - Reads in a template from stdin and prints the data * associated with the given attributed to stdout. 
 *
 * Darren Hardy, hardy@cs.colorado.edu, April 1994
 *
 * ----------------------------------------------------------------------
 * Copyright (c) 1994, 1995. All rights reserved.
 *
 *   Mic Bowman of Transarc Corporation.
 *   Peter Danzig of the University of Southern California.
 *   Darren R. Hardy of the University of Colorado at Boulder.
 *   Udi Manber of the University of Arizona.
 *   Michael F. Schwartz of the University of Colorado at Boulder.
 */
/* System headers required by the calls below */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "util.h"
#include "template.h"

static void usage()
{
    fprintf(stderr, "Usage: print-attr Attribute\n");
    exit(1);
}

int main(argc, argv)
int argc;
char *argv[];
{
    Template *template;
    AVPair *avp;
    char *attr;

    if (argc != 2)
        usage();
    attr = strdup(argv[1]);

    init_parse_template_file(stdin);
    while ((template = parse_template()) != NULL) {
        avp = extract_AVPair(template->list, attr);
        if (avp) {
            /* Values may contain arbitrary bytes, so write vsize bytes */
            fwrite(avp->value, 1, avp->vsize, stdout);
            putchar('\n');
        }
        free_template(template);
    }
    finish_parse_template();
    exit(0);
}
glimpse-4.18.7/libtemplate/template/print-template.c000066400000000000000000000027501300371307100225000ustar00rootroot00000000000000
static char rcsid[] = "$Id: print-template.c,v 1.1 1999/11/03 21:41:04 golda Exp $";
/*
 * print-template - Reads in a template from stdin and prints it to stdout.
 * Used to test template parsing and printing.
 *
 * Darren Hardy, hardy@cs.colorado.edu, February 1994
 *
 * ----------------------------------------------------------------------
 * Copyright (c) 1994, 1995. All rights reserved.
 *
 *   Mic Bowman of Transarc Corporation.
 *   Peter Danzig of the University of Southern California.
 *   Darren R. Hardy of the University of Colorado at Boulder.
 *   Udi Manber of the University of Arizona.
 *   Michael F. Schwartz of the University of Colorado at Boulder.
 */
/* System headers required by the calls below (time() needs <time.h>) */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
#include "util.h"
#include "template.h"

static void add_print_time(t)
Template *t;
{
    char buf[BUFSIZ];

    sprintf(buf, "%u", (unsigned int) time(NULL));
    add_AVList(t->list, "Print-Time", buf, strlen(buf));
}

int main(argc, argv)
int argc;
char *argv[];
{
    Template *template;
    Buffer *b;

    init_parse_template_file(stdin);    /* Read Template from stdin */
    while ((template = parse_template()) != NULL) {
        add_print_time(template);       /* Add new Attribute-Value */
        b = init_print_template(NULL);  /* Print Template to Buffer */
        print_template(template);
        fwrite(b->data, 1, b->length, stdout);  /* Buffer to stdout */
        finish_print_template();        /* Clean up */
        free_template(template);
    }
    finish_parse_template();
    exit(0);
}
glimpse-4.18.7/libtemplate/template/print-urlrefs.c000066400000000000000000000051221300371307100223430ustar00rootroot00000000000000
static char rcsid[] = "$Id: print-urlrefs.c,v 1.1 1999/11/03 21:41:04 golda Exp $";
/*
 * print-urlrefs - Reads SOIF and prints normalized URLs from the
 * URL-References attribute. Used to extract URLs from HTML object sums.
 *
 * Usage: print-urlrefs
 *
 * Darren Hardy, hardy@cs.colorado.edu, July 1994
 *
 * ----------------------------------------------------------------------
 * Copyright (c) 1994, 1995. All rights reserved.
 *
 *   Mic Bowman of Transarc Corporation.
 *   Peter Danzig of the University of Southern California.
 *   Darren R. Hardy of the University of Colorado at Boulder.
 *   Udi Manber of the University of Arizona.
 *   Michael F. Schwartz of the University of Colorado at Boulder.
 *
 */
/* System headers required by the calls below */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "util.h"
#include "url.h"
#include "template.h"

static void print_urlrefs(t)
Template *t;
{
    AVPair *avp;
    char *s, url[BUFSIZ];
    URL *up;

    if ((avp = extract_AVPair(t->list, "URL-References")) == NULL)
        return;

    /* For each line in the data, grab the URL */
    for (s = strtok(avp->value, "\n"); s != NULL; s = strtok(NULL, "\n")) {
        url[0] = '\0';

        /* Remove poorly formatted lines */
        if (strchr(s, '=') || strchr(s, ' ') || strchr(s, '<'))
            continue;
        /* Compare all 7 characters of "mailto:", including the colon */
        if (strncmp(s, "mailto:", 7) == 0)
            continue;
        if (strncmp(s, "http:", 5) == 0 && !strstr(s, "://"))
            continue;

        if (strstr(s, "://") != NULL) {
            /* Is this URL ok as-is? If so, save it */
            strcpy(url, s);
        } else if (s[0] == '/') {
            /*
             * This URL is relative to the top of t->url.
             * Assumes t->url starts with "http://"; thishost points
             * just past that prefix, so the prefix is written back in full.
             */
            char *thishost = t->url + strlen("http://"), *z;

            z = strchr(thishost, '/');
            if (z != NULL)
                *z = '\0';      /* temporarily cut t->url after the host */
            sprintf(url, "http://%s%s", thishost, s);
            if (z != NULL)
                *z = '/';       /* restore t->url */
        } else {
            /* This URL is relative to t->url */
            char *z = strdup(t->url), *p;

            if ((p = strrchr(z, '/')) != NULL)
                *p = '\0';
            sprintf(url, "%s/%s", z, s);
            xfree(z);
        }

        /* If the URL is set, then parse it and print if ok */
        if (url[0]) {
            if ((up = url_open(url)) != NULL) {
                printf("%s\n", up->url);
                url_close(up);
            }
        }
    }
}

int main(argc, argv)
int argc;
char *argv[];
{
    Template *template;

    init_parse_template_file(stdin);
    while ((template = parse_template()) != NULL) {
        print_urlrefs(template);
        printf("%s\n", template->url);
        free_template(template);
    }
    finish_parse_template();
    exit(0);
}
glimpse-4.18.7/libtemplate/template/soif.pl000077500000000000000000000054631300371307100206730ustar00rootroot00000000000000
#-*-perl-*-
#
# soif.pl - Processing for the SOIF format.
# # Darren Hardy, hardy@cs.colorado.edu, January 1995 # # $Id: soif.pl,v 1.1 1999/11/03 21:41:04 golda Exp $ # ####################################################################### # Usage: # # require 'soif.pl'; # # $soif'input = 'WHATEVER'; # defaults to STDIN # ($ttype, $url, %SOIF) = &soif'parse(); # foreach $k (sort keys %SOIF) { # print "KEY: $k\n"; # print "DATA: $SOIF{$k}\n"; # } # exit(0); # ####################################################################### # Copyright (c) 1994, 1995. All rights reserved. # # Mic Bowman of Transarc Corporation. # Peter Danzig of the University of Southern California. # Darren R. Hardy of the University of Colorado at Boulder. # Udi Manber of the University of Arizona. # Michael F. Schwartz of the University of Colorado at Boulder. # package soif; $soif'debug = 0; $soif'input = 'STDIN'; $soif'output = 'STDOUT'; $soif'sort_on_output = 1; # # soif'parse - $soif'input is the file descriptor from which to read SOIF. # Returns an associative array containing the SOIF, # the template type, and the URL. # sub soif'parse { print "Inside soif'parse.\n" if ($soif'debug); return () if (eof($soif'input)); # DW local($template_type) = "UNKNOWN"; local($url) = "UNKNOWN"; local(%SOIF); undef %SOIF; while (<$soif'input>) { print "READING input line: $_\n" if ($soif'debug); last if (/^\@\S+\s*{\s*\S+\s*$/o); } if (/^\@(\S+)\s*{\s*(\S+)\s*$/o) { $template_type = $1, $url = $2 } else { return ($template_type, $url, %SOIF); # done } while (<$soif'input>) { if (/^\s*([^{]+){(\d+)}:\t(.*\n)/o) { $attr = $1; $vsize = $2; $value = $3; if (length($value) < $vsize) { $nleft = $vsize - length($value); $end_value = ""; $x = read($soif'input, $end_value, $nleft); die "Cannot read $nleft bytes: $!" if ($x != $nleft); $value .= $end_value; undef $end_value; } chop($SOIF{$attr} = $value); next; } last if (/^}/o); } return ($template_type, $url, %SOIF); } # # soif'print - $soif'output is the file descriptor to write SOIF. 
# sub soif'print { print "Inside soif'print.\n" if ($soif'debug); local($template_type, $url, %SOIF) = @_; # Write SOIF header, body, and trailer print $soif'output "\@$template_type { $url\n"; if ($soif'sort_on_output) { foreach $k (sort keys %SOIF) { next if (length($SOIF{$k}) < 1); &soif'print_item($k, $SOIF{$k}); } } else { foreach $k (keys %SOIF) { next if (length($SOIF{$k}) < 1); &soif'print_item($k, $SOIF{$k}); } } print $soif'output "}\n"; } sub soif'print_item { local($k, $v) = @_; print $soif'output "$k" , "{", length($v), "}:\t"; print $soif'output $v, "\n"; } 1; glimpse-4.18.7/libtemplate/template/template.c000066400000000000000000000414671300371307100213560ustar00rootroot00000000000000static char rcsid[] = "$Id: template.c,v 1.3 2006/03/25 02:13:55 root Exp $"; /* * template.c - SOIF Object ("template") processing code for Harvest * * Darren Hardy, hardy@cs.colorado.edu, February 1994 * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. 
 */
/* System headers required by the calls below (strcasecmp() is in <strings.h>) */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include "util.h"
#include "template.h"

/* Local functions */
static void output_char();
static void output_buffer();
static int attribute_cmp();

/*
 * create_AVList() - Creates an Attribute-Value node to include in an AVList
 */
AVList *create_AVList(attr, value, vsize)
char *attr;
char *value;
int vsize;
{
    static AVList *l;

    l = xmalloc(sizeof(AVList));
    l->data = xmalloc(sizeof(AVPair));
    l->data->value = xmalloc(vsize + 1);
    l->data->attribute = (char *)strdup(attr);
    l->data->vsize = vsize;
    memcpy(l->data->value, value, l->data->vsize);
    l->data->value[l->data->vsize] = '\0';  /* value is always NUL-terminated */
    l->data->offset = -1;
    l->next = NULL;

    return (l);
}

/*
 * free_AVList() - Cleans up an AVList
 */
void free_AVList(list)
AVList *list;
{
    AVList *walker = list, *t;

    while (walker) {
        if (walker->data)
            free_AVPair(walker->data);
        t = walker;
        walker = walker->next;
        xfree(t);
    }
}

/*
 * free_AVPair() - Cleans up an AVPair
 */
void free_AVPair(avp)
AVPair *avp;
{
    if (!avp)
        return;
    if (avp->attribute)
        xfree(avp->attribute);
    if (avp->value)
        xfree(avp->value);
    xfree(avp);
}

/*
 * add_offset() - Adds the offset value to the AVPair matching attribute.
 */
void add_offset(l, attr, off)
AVList *l;
char *attr;
size_t off;
{
    AVPair *avp = extract_AVPair(l, attr);

    if (avp != NULL)
        avp->offset = off;
}

/*
 * extract_AVPair() - Searches for the given attribute in the AVList.
 * Does a case insensitive match on the attributes. Returns NULL
 * on error; otherwise returns the matching AVPair.
 */
AVPair *extract_AVPair(list, attribute)
AVList *list;
char *attribute;
{
    AVList *walker;

    for (walker = list; walker; walker = walker->next) {
        if (!strcasecmp(walker->data->attribute, attribute))
            return (walker->data);
    }
    return (NULL);
}

/*
 * exists_AVList() - Checks to see if an AVPair exists for the given
 * attribute. Returns non-zero if it does; 0 if it doesn't.
 */
int exists_AVList(list, attr)
AVList *list;
char *attr;
{
    return (extract_AVPair(list, attr) != NULL ?
		1 : 0);
}

/*
 * add_AVList() - Adds the Attribute-Value pair to the given AVList
 */
void add_AVList(list, attr, value, vsize)
AVList *list;
char *attr;
char *value;
int vsize;
{
	AVList *walker = list;

	if (list == NULL)
		return;

	/* move to the end of the list, and add a node */
	for (;;) {
		/* Don't add a duplicate Attribute, just replace it */
		if (!strcasecmp(attr, walker->data->attribute)) {
			xfree(walker->data->value);
			walker->data->vsize = vsize;
			walker->data->value = xmalloc(vsize + 1);
			memcpy(walker->data->value, value, vsize);
			walker->data->value[vsize] = '\0';
			return;
		}
		/* The last node is checked for a duplicate as well */
		if (walker->next == NULL)
			break;
		walker = walker->next;
	}
	walker->next = create_AVList(attr, value, vsize);
}

/*
 * FAST_add_AVList() - Quick version of add_AVList().  Doesn't check
 * for duplicates.  attr MUST be unique to the list.
 */
void FAST_add_AVList(list, attr, value, vsize)
AVList *list;
char *attr;
char *value;
int vsize;
{
	AVList *t;

	if (list == NULL)
		return;

	t = create_AVList(attr, value, vsize);
	t->next = list->next;
	list->next = t;
}

/*
 * merge_AVList() - Merges the b list into the a list.  If the AVPair
 * in b exists in the a list, then the data is replaced.  Otherwise,
 * the data is appended to the list.
 */
AVList *merge_AVList(a, b)
AVList *a, *b;
{
	AVList *walker = b;
	AVPair *avp;

	if (a == NULL)
		return (NULL);

	while (walker) {
		avp = extract_AVPair(a, walker->data->attribute);
		if (avp != NULL) {
			/* replace the data; keep the value NUL-terminated */
			xfree(avp->value);
			avp->value = xmalloc(walker->data->vsize + 1);
			memcpy(avp->value, walker->data->value,
			       walker->data->vsize);
			avp->value[walker->data->vsize] = '\0';
			avp->vsize = walker->data->vsize;
			avp->offset = walker->data->offset;
		} else {
			/* append it to 'a' */
			add_AVList(a, walker->data->attribute,
				   walker->data->value, walker->data->vsize);
			add_offset(a, walker->data->attribute,
				   walker->data->offset);
		}
		walker = walker->next;
	}
	return (a);
}

/*
 * append_AVList() - Adds the Attribute-Value pair to the given AVList.
 * If the attr is present in the list, then it appends the value to
 * the previous value.
 */
void append_AVList(list, attr, value, vsize)
AVList *list;
char *attr;
char *value;
int vsize;
{
	AVPair *avp;
	char *buf;

	if ((avp = extract_AVPair(list, attr)) == NULL) {
		add_AVList(list, attr, value, vsize);
	} else {
		/* append the new value to the old one after a newline */
		buf = xmalloc(avp->vsize + vsize + 2);
		memcpy(buf, avp->value, avp->vsize);
		buf[avp->vsize] = '\n';
		memcpy(buf + avp->vsize + 1, value, vsize);
		buf[avp->vsize + 1 + vsize] = '\0';
		xfree(avp->value);
		avp->value = buf;
		avp->vsize += vsize + 1;
		avp->offset = -1;
	}
}

/*
 * create_template() - Creates a new Template structure.
 */
Template *create_template(type, url)
char *type;
char *url;
{
	static Template *t = NULL;

	t = xmalloc(sizeof(Template));
	if (type == NULL)
		t->template_type = (char *)strdup("FILE");
	else
		t->template_type = (char *)strdup(type);
	t->url = (char *)strdup(url);
	t->list = NULL;
	t->offset = -1;
	t->length = -1;
	return (t);
}

/*
 * free_template() - Cleans up the template.
 */
void free_template(t)
Template *t;
{
	if (!t)
		return;
	if (t->list)
		free_AVList(t->list);
	if (t->template_type)
		xfree(t->template_type);
	if (t->url)
		xfree(t->url);
	xfree(t);
}

/*
 * Template Parsing and Printing code
 *
 * Template Parsing can read from memory or from a file.
 * Template Printing can write to memory or to a file.
 */

static FILE *outputfile = NULL;	/* user's file */
Buffer *bout = NULL;

/*
 * init_print_template() - Print template to memory buffer or to
 * a file if fp is not NULL.  Returns NULL if printing to a file;
 * otherwise returns a pointer to the Buffer where the data is stored.
 */
Buffer *init_print_template(fp)
FILE *fp;
{
	if (fp) {
		outputfile = fp;
		return (NULL);
	} else {
		bout = create_buffer(BUFSIZ);
		return (bout);
	}
}

/*
 * output_char() - writes a single character to memory or a file.
 */
static void output_char(c)
char c;
{
	output_buffer(&c, 1);
}

/*
 * output_buffer() - writes a buffer to memory or a file.
 */
static void output_buffer(s, sz)
char *s;
int sz;
{
	if (outputfile)
		fwrite(s, sizeof(char), sz, outputfile);
	else
		add_buffer(bout, s, sz);
}

/*
 * print_template() - Prints a SOIF Template structure into a file
 * or into memory.  MUST call init_print_template() before, and
 * finish_print_template() after.
 */
void print_template(template)
Template *template;
{
	/* Estimate the buffer size to prevent too many realloc() calls */
	if (outputfile == NULL) {
		AVList *walker;
		int n = 0;

		for (walker = template->list; walker; walker = walker->next)
			n += walker->data->vsize;
		if (bout->length + n > bout->size)
			increase_buffer(bout, n);	/* need more */
	}
	print_template_header(template);
	print_template_body(template);
	print_template_trailer(template);
}

void print_template_header(template)
Template *template;
{
	char buf[BUFSIZ];

	sprintf(buf, "@%s { %s\n", template->template_type, template->url);
	output_buffer(buf, strlen(buf));
}

void print_template_body(template)
Template *template;
{
	char buf[BUFSIZ];
	AVList *walker;

	for (walker = template->list; walker; walker = walker->next) {
		if (walker->data->vsize == 0)
			continue;
		/* Write out an Attribute value pair */
		sprintf(buf, "%s{%u}:\t", walker->data->attribute,
			(unsigned int) walker->data->vsize);
		output_buffer(buf, strlen(buf));
		output_buffer(walker->data->value, walker->data->vsize);
		output_char('\n');
	}
}

void print_template_trailer(template)
Template *template;
{
	output_char('}');
	output_char('\n');
	if (outputfile != NULL)
		fflush(outputfile);
}

/*
 * finish_print_template() - Cleanup after printing template.
 * Buffer is no longer valid.
 */
void finish_print_template()
{
	outputfile = NULL;
	if (bout)
		free_buffer(bout);
	bout = NULL;
}

/* Parsing templates */

static char *inputbuf = NULL;
static FILE *inputfile = NULL;
static int inputbufsz = 0, curp = 0;
static size_t inputoffset = 0, inputlength = 0;

void init_parse_template_file(fp)
FILE *fp;
{
	inputfile = fp;
	inputoffset = ftell(fp);
	inputlength = 0;
}

void init_parse_template_string(s, sz)
char *s;
int sz;
{
	inputbuf = s;
	inputbufsz = sz;
	curp = 0;
	inputfile = NULL;
	inputoffset = 0;
	inputlength = 0;
}

void finish_parse_template()
{
	inputfile = NULL;
	curp = 0;
	inputbufsz = 0;
}

int is_parse_end_of_input()
{
	if (inputfile != NULL)
		return (feof(inputfile));
	return (curp >= inputbufsz || inputbuf[curp] == '\0');
}

/*
 * input_char() -> performs c = input_char();
 * c must be an int so that EOF stays distinct from every data byte.
 */
#define input_char() \
	if (inputfile != NULL) { \
		inputoffset++; \
		inputlength++; \
		c = fgetc(inputfile); \
	} else if (curp >= inputbufsz || inputbuf[curp] == '\0') { \
		c = EOF; \
	} else { \
		inputoffset++; \
		inputlength++; \
		c = (unsigned char) inputbuf[curp++]; \
	}

static void backup_char(x)
char x;
{
	inputoffset--;
	inputlength--;
	if (inputfile != NULL)
		ungetc(x, inputfile);
	else
		curp--;
	return;
}

#define skip_whitespace() \
	while (1) { \
		input_char(); \
		if (c == EOF) return(NULL); \
		if (!isspace(c)) { backup_char(c); break; } \
	}

#define skip_tab() \
	while (1) { \
		input_char(); \
		if (c == EOF) return(NULL); \
		if (c != '\t') { backup_char(c); break; } \
	}

#define skip_whitespace_and(a) \
	while (1) { \
		input_char(); \
		if (c == EOF) return(NULL); \
		if (c == a) continue; \
		if (!isspace(c)) { backup_char(c); break; } \
	}

#define skip_whitespace_break() \
	while (1) { \
		input_char(); \
		if (c == EOF) { done = 1; break; }\
		if (c == '}') { done = 1; break; }\
		if (!isspace(c)) { backup_char(c); break; } \
	}

#define grab_token() \
	p = &buf[0]; \
	while (1) { \
		input_char(); \
		if (c == EOF) return(NULL); \
		if (isspace(c)) { backup_char(c); break; } \
		*p++ = c; \
		if (p == &buf[BUFSIZ-1]) return(NULL); \
	} \
	*p = '\0';

#define grab_attribute() \
	p = buf; \
	while (1) { \
		input_char(); \
		if (c == EOF) return(NULL); \
		if (c == '{') break; \
		if (c == '}') break; \
		*p++ = c; \
		if (p == &buf[BUFSIZ-1]) return(NULL); \
	} \
	*p = '\0';

/*
 * parse_template() - Returns a Template structure for the template
 * stored in memory or in a file.  MUST call init_parse_template_file()
 * or init_parse_template_string() before, and finish_parse_template()
 * after.  Returns NULL on error.
 */
Template *parse_template()
{
	static Template *template = NULL;
	char buf[BUFSIZ], *p, *attribute, *value;
	int vsize, i, done = 0, c;
	size_t voffset;

	template = xmalloc(sizeof(Template));
	while (1) {		/* Find starting point: @ */
		input_char();
		if (c == EOF) {
			xfree(template);
			return (NULL);
		}
		if (c == '@')
			break;
	}
	template->offset = inputoffset;

	/* Get Template-Type */
	grab_token();
	template->template_type = (char *)strdup(buf);

	/* Get URL */
	skip_whitespace_and('{');
	grab_token();
	template->url = (char *)strdup(buf);
	template->list = NULL;

#ifdef DEBUG
	glimpselog("Grabbing Template Object: %s %s\n",
		   template->template_type, template->url);
#endif
	while (1) {
		/* Get Attribute name and value */
		skip_whitespace_break();
		if (done == 1)
			break;
		grab_attribute();
		attribute = (char *)strdup(buf);
#ifdef DEBUG
		glimpselog("Grabbed Attribute: %s\n", attribute);
#endif
		grab_attribute();
		vsize = atoi(buf);

		/* Get Value */
		input_char();
		if (c != ':') {
			free_template(template);
			xfree(attribute);
			return (NULL);
		}
		input_char();
		if (c != '\t') {
			free_template(template);
			xfree(attribute);
			return (NULL);
		}

		/* This is a very tight loop, so optimize */
		value = xmalloc(vsize + 1);
		voffset = inputoffset;
		if (inputfile == NULL) {	/* normal one-by-one */
			for (i = 0; i < vsize; i++) {
				input_char();
				value[i] = c;
			}
			value[i] = '\0';
		} else {		/* do the fast file copy */
			if (fread(value, 1, vsize, inputfile) != vsize) {
				free_template(template);
				xfree(attribute);
				xfree(value);
				return(NULL);
			}
			inputoffset += vsize;
			inputlength += vsize;
		}
		if
		    (template->list == NULL) {
			template->list = create_AVList(attribute, value, vsize);
		} else
			FAST_add_AVList(template->list,attribute,value,vsize);
		add_offset(template->list, attribute, voffset);
		xfree(attribute);
		xfree(value);
	}
	template->length = inputlength;
	return (template);
}

/* Sorting Attribute-Value Lists */

/*
 * attribute_cmp() - strcmp(3) for attributes.  Works with "Embed"
 * attributes so that they are first sorted by number, then by attribute.
 * Does case insensitive compares.
 */
static int attribute_cmp(a, b)
char *a, *b;
{
	if ((tolower(a[0]) == 'e') && (tolower(b[0]) == 'e') &&	/* quickie */
	    !strncasecmp(a, "embed", 5) && !strncasecmp(b, "embed", 5)) {
		char *p, *q;
		int an, bn;

		p = strchr(a, '<');	/* Find embedded number */
		q = strchr(a, '>');
		if (!p || !q)
			return (strcasecmp(a, b));	/* bail */
		*q = '\0';
		an = atoi(p + 1);
		*q = '>';

		p = strchr(b, '<');	/* Find embedded number */
		q = strchr(b, '>');
		if (!p || !q)
			return (strcasecmp(a, b));	/* bail */
		*q = '\0';
		bn = atoi(p + 1);
		*q = '>';

		if (an != bn)	/* If numbers are different */
			return (an < bn ? -1 : 1);

		/* otherwise, do strcmp on attr */
		return (strcasecmp(strchr(a, '>') + 1, strchr(b, '>') + 1));
	}
	return (strcasecmp(a, b));
}

/*
 * sort_AVList() - Uses an insertion sort to sort the AVList by attribute.
 * Returns the new head of the list.
 */
AVList *sort_AVList(avl)
AVList *avl;
{
	AVList *walker, *n, *a, *t;
	static AVList *head;
	int (*acmp) ();

	acmp = attribute_cmp;

	/* An empty list is already sorted */
	if (avl == NULL)
		return (NULL);

	/* Set the first node to be the head of the new list */
	head = avl;
	walker = avl->next;
	head->next = NULL;

	while (walker) {
		/* Pick off this node */
		n = walker;
		walker = walker->next;
		n->next = NULL;

		/* Find insertion point */
		for (a = head; a->next &&
		     acmp(a->next->data->attribute, n->data->attribute) < 0;
		     a = a->next);
		if (a == head) {	/* prepend to list */
			if (acmp(a->data->attribute, n->data->attribute) < 0) {
				/* As the second node */
				t = a->next;
				a->next = n;
				n->next = t;
			} else {
				/* As the first node */
				head = n;
				n->next = a;
			}
		} else {	/* insert into list */
			t = a->next;
			a->next = n;
			n->next = t;
		}
	}
	return (head);
}

/*
 * embed_template() - Embeds the given Template t into the Template template.
 * Returns NULL on error; otherwise returns template.
 */
Template *embed_template(t, template)
Template *t, *template;
{
	int nembed = 0;		/* number of embedded documents in t */
	AVList *walker;
	char *p, *q, buf[BUFSIZ];

	/* Find out what the last embedded document in template is */
	for (walker = template->list; walker; walker = walker->next) {
		if (strncasecmp(walker->data->attribute, "embed<", 6))
			continue;
		p = strchr(walker->data->attribute, '<') + 1;
		if ((q = strchr(walker->data->attribute, '>')) != NULL)
			*q = '\0';
		else
			continue;
		nembed = (nembed < atoi(p)) ? atoi(p) : nembed;
		*q = '>';	/* replace */
	}
#ifdef DEBUG
	glimpselog("%s has %d embedded documents\n", template->url, nembed);
#endif

	/* Now add all of the fields from t into template */
	nembed++;
	for (walker = t->list; walker; walker = walker->next) {
		sprintf(buf, "Embed<%d>-%s", nembed, walker->data->attribute);
		FAST_add_AVList(template->list, buf, walker->data->value,
				walker->data->vsize);
		if (walker->data->offset != -1)
			add_offset(template->list, buf, walker->data->offset);
	}
	return (template);
}

/*
 * sink_embedded() - Places all of the embedded attributes at the bottom
 * of the list.
*Must* be sorted first. */ AVList *sink_embedded(list) AVList *list; { AVList *start, *end, *walker, *last, *t; static AVList *head; for (walker = last = head = list, start = end = NULL; walker != NULL; last = walker, walker = walker->next) { if (!strncasecmp(walker->data->attribute, "embed", 5)) { start = start ? start : last; } else if (start != NULL) { end = end ? end : last; } } if (start == NULL || end == NULL) { /* No embedded section, or at bottom of list */ return (head); } else if (start == head) { last->next = start; /* Embed section at top of list */ head = end->next; end->next = NULL; } else { /* Embed section at middle of list */ t = start->next; last->next = t; start->next = end->next; end->next = NULL; } return (head); } glimpse-4.18.7/libtemplate/template/translate-urls.c000066400000000000000000000026341300371307100225140ustar00rootroot00000000000000static char rcsid[] = "$Id: translate-urls.c,v 1.1 1999/11/03 21:41:04 golda Exp $"; /* * translate-urls - Reads SOIF and prints the new SOIF's with the * translated URLs from the * * Usage: translate-urls [(-access | -host | -path) old new] * * Darren Hardy, hardy@cs.colorado.edu, July 1994 * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. 
*/ #include #include #include #include "util.h" #include "url.h" #include "template.h" static void translate_url(t) Template *t; { } int main(argc, argv) int argc; char *argv[]; { Template *template; Buffer *b; init_parse_template_file(stdin); /* Read Template from stdin */ while (1) { template = parse_template(); if (template == NULL && is_parse_end_of_input()) break; if (template == NULL) continue; translate_url(template); b = init_print_template(NULL); print_template(template); fwrite(b->data, 1, b->length, stdout); finish_print_template(); free_template(template); } finish_parse_template(); exit(0); } glimpse-4.18.7/libtemplate/util/000077500000000000000000000000001300371307100165255ustar00rootroot00000000000000glimpse-4.18.7/libtemplate/util/Makefile.NeXT000066400000000000000000000020271300371307100210030ustar00rootroot00000000000000# # Makefile for the utilities source directory # # $Id: Makefile.NeXT,v 1.1 1999/11/03 20:42:14 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = gcc INSTALL = cp #install -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = /bin/ranlib DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG INCLUDE = -I../include CFLAGS = $(DEBUG) $(INCLUDE) LIBFILE = libutil.a LIBDIR = ../lib # Delete strerror.o, because it conflicts with definition in # /lib/libsys_s.a on NeXT systems OBJS = buffer.o host.o log.o strdup.o system.o \ string.o xmalloc.o all: $(LIBFILE) install-lib ctags: @ctags -w *.c clean: -rm -f core *.o $(LIBFILE) tags #realclean: clean # -rm -rf Makefile install: install-lib install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) $(LIBFILE): $(OBJS) ar r $(LIBFILE) $(OBJS) $(RANLIB) $(LIBFILE) glimpse-4.18.7/libtemplate/util/Makefile.alpha000066400000000000000000000017021300371307100212510ustar00rootroot00000000000000# # Makefile 
for the utilities source directory # # $Id: Makefile.alpha,v 1.1 1999/11/03 20:42:14 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = cc INSTALL = cp #installbsd -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = ranlib DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG INCLUDE = -I../include CFLAGS = $(DEBUG) $(INCLUDE) LIBFILE = libutil.a LIBDIR = ../lib OBJS = buffer.o host.o log.o strdup.o system.o strerror.o \ string.o xmalloc.o all: $(LIBFILE) install-lib ctags: @ctags -w *.c clean: -rm -f core *.o $(LIBFILE) tags #realclean: clean # -rm -rf Makefile install: install-lib install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) $(LIBFILE): $(OBJS) ar r $(LIBFILE) $(OBJS) $(RANLIB) $(LIBFILE) glimpse-4.18.7/libtemplate/util/Makefile.hp000066400000000000000000000016671300371307100206050ustar00rootroot00000000000000# # Makefile for the utilities source directory # # $Id: Makefile.hp,v 1.1 1999/11/03 20:42:14 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = cc INSTALL = cp #install -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = : DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG INCLUDE = -I../include CFLAGS = $(DEBUG) $(INCLUDE) LIBFILE = libutil.a LIBDIR = ../lib OBJS = buffer.o host.o log.o strdup.o system.o strerror.o \ string.o xmalloc.o all: $(LIBFILE) install-lib ctags: @ctags -w *.c clean: -rm -f core *.o $(LIBFILE) tags #realclean: clean # -rm -rf Makefile install: install-lib install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) $(LIBFILE): $(OBJS) ar r $(LIBFILE) $(OBJS) $(RANLIB) $(LIBFILE) 
glimpse-4.18.7/libtemplate/util/Makefile.in000066400000000000000000000017641300371307100206020ustar00rootroot00000000000000# # Makefile for the utilities source directory # # $Id: Makefile.in,v 1.2 2001/05/21 04:49:00 golda Exp $ # srcdir = @srcdir@ VPATH = @srcdir@ prefix = @prefix@ INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = @CC@ INSTALL = @INSTALL@ INSTALL_BIN = @INSTALL_PROGRAM@ INSTALL_FILE = @INSTALL_DATA@ RANLIB = @RANLIB@ DEFS = DEBUG = INCLUDE = -I../include CFLAGS = $(DEFS) $(INCLUDE) LIBFILE = libutil.a LIBDIR = ../lib OBJS = buffer.o host.o log.o strdup.o system.o strerror.o \ string.o xmalloc.o all: $(LIBFILE) install-lib install: all install-man: ctags: @ctags -w *.c clean: rm -f core *.o $(LIBFILE) tags distclean realclean: clean rm -f Makefile install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) $(LIBFILE): $(OBJS) ar r $(LIBFILE) $(OBJS) $(RANLIB) $(LIBFILE) glimpse-4.18.7/libtemplate/util/Makefile.linux000066400000000000000000000017061300371307100213270ustar00rootroot00000000000000# # Makefile for the utilities source directory # # $Id: Makefile.linux,v 1.1 1999/11/03 20:42:14 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = gcc -m486 INSTALL = cp #install -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = ranlib DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG INCLUDE = -I../include CFLAGS = $(DEBUG) $(INCLUDE) LIBFILE = libutil.a LIBDIR = ../lib OBJS = buffer.o host.o log.o strdup.o system.o strerror.o \ string.o xmalloc.o all: $(LIBFILE) install-lib ctags: @ctags -w *.c clean: -rm -f core *.o $(LIBFILE) tags #realclean: clean # -rm -rf Makefile install: install-lib install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) 
$(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) $(LIBFILE): $(OBJS) ar r $(LIBFILE) $(OBJS) $(RANLIB) $(LIBFILE) glimpse-4.18.7/libtemplate/util/Makefile.rs6000000066400000000000000000000016761300371307100211300ustar00rootroot00000000000000# # Makefile for the utilities source directory # # $Id: Makefile.rs6000,v 1.1 1999/11/03 20:42:14 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = cc INSTALL = cp #install -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = true DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG INCLUDE = -I../include CFLAGS = $(DEBUG) $(INCLUDE) LIBFILE = libutil.a LIBDIR = ../lib OBJS = buffer.o host.o log.o strdup.o system.o strerror.o \ string.o xmalloc.o all: $(LIBFILE) install-lib ctags: @ctags -w *.c clean: -rm -f core *.o $(LIBFILE) tags #realclean: clean # -rm -rf Makefile install: install-lib install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) $(LIBFILE): $(OBJS) ar r $(LIBFILE) $(OBJS) $(RANLIB) $(LIBFILE) glimpse-4.18.7/libtemplate/util/Makefile.sgi000066400000000000000000000016731300371307100207550ustar00rootroot00000000000000# # Makefile for the utilities source directory # # $Id: Makefile.sgi,v 1.1 1999/11/03 20:42:14 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = cc INSTALL = cp #install -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = true DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG INCLUDE = -I../include CFLAGS = $(DEBUG) $(INCLUDE) LIBFILE = libutil.a LIBDIR = ../lib OBJS = buffer.o host.o log.o strdup.o system.o strerror.o \ string.o xmalloc.o all: $(LIBFILE) install-lib ctags: @ctags -w *.c clean: -rm -f core *.o $(LIBFILE) tags #realclean: clean # -rm -rf Makefile install: install-lib 
install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) $(LIBFILE): $(OBJS) ar r $(LIBFILE) $(OBJS) $(RANLIB) $(LIBFILE) glimpse-4.18.7/libtemplate/util/Makefile.solaris000066400000000000000000000017001300371307100216360ustar00rootroot00000000000000# # Makefile for the utilities source directory # # $Id: Makefile.solaris,v 1.1 1999/11/03 20:42:14 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = gcc INSTALL = cp #install -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = true DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG INCLUDE = -I../include CFLAGS = $(DEBUG) $(INCLUDE) LIBFILE = libutil.a LIBDIR = ../lib OBJS = buffer.o host.o log.o strdup.o system.o strerror.o \ string.o xmalloc.o all: $(LIBFILE) install-lib ctags: @ctags -w *.c clean: -rm -f core *.o $(LIBFILE) tags #realclean: clean # -rm -rf Makefile install: install-lib install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) $(LIBFILE): $(OBJS) ar r $(LIBFILE) $(OBJS) $(RANLIB) $(LIBFILE) glimpse-4.18.7/libtemplate/util/Makefile.sunos000066400000000000000000000017001300371307100213310ustar00rootroot00000000000000# # Makefile for the utilities source directory # # $Id: Makefile.sunos,v 1.1 1999/11/03 20:42:14 golda Exp $ # prefix = /usr/local/harvest INSTALL_BINDIR = $(prefix)/bin INSTALL_LIBDIR = $(prefix)/lib INSTALL_MANDIR = $(prefix)/man SHELL = /bin/sh CC = gcc INSTALL = cp #install -c INSTALL_BIN = ${INSTALL} INSTALL_FILE = ${INSTALL} #-m 644 RANLIB = ranlib DEBUG = $(DEBUG_TOP) #-O #-g #-DDEBUG INCLUDE = -I../include CFLAGS = $(DEBUG) $(INCLUDE) LIBFILE = libutil.a LIBDIR = ../lib OBJS = buffer.o host.o log.o strdup.o system.o strerror.o \ string.o xmalloc.o all: $(LIBFILE) install-lib ctags: @ctags -w *.c 
clean: -rm -f core *.o $(LIBFILE) tags #realclean: clean # -rm -rf Makefile install: install-lib install-lib: $(LIBDIR)/$(LIBFILE) $(LIBDIR)/$(LIBFILE): $(LIBFILE) $(INSTALL_FILE) $(LIBFILE) $(LIBDIR)/$(LIBFILE) $(RANLIB) $(LIBDIR)/$(LIBFILE) $(LIBFILE): $(OBJS) ar r $(LIBFILE) $(OBJS) $(RANLIB) $(LIBFILE) glimpse-4.18.7/libtemplate/util/README000066400000000000000000000005541300371307100174110ustar00rootroot00000000000000This directory has some generally useful functions. buffer.c Buffer management host.c DNS and host-specific routines log.c Simple, uniform logging strdup.c strdup(3) implementation strerror.c strerror(3) implementation string.c string processing system.c routines for running system(3) xmalloc.c wrappers for memory management -Darren Hardy, July 1994 glimpse-4.18.7/libtemplate/util/buffer.c000066400000000000000000000046661300371307100201560ustar00rootroot00000000000000static char rcsid[] = "$Id: buffer.c,v 1.2 2003/11/13 05:17:39 golda Exp $"; /* * buffer.c - Simple dynamic buffer management. * * Darren Hardy, hardy@cs.colorado.edu, February 1994 * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. * */ #include #include #include "util.h" /* * create_buffer() - Creates a buffer of default_size bytes allocated. */ Buffer *create_buffer(default_size) int default_size; { static Buffer *b = NULL; b = xmalloc(sizeof(Buffer)); b->size = b->default_size = default_size; b->data = xmalloc(b->size); b->length = 0; #ifdef DEBUG glimpselog("Creating buffer of %d bytes\n", b->size); #endif return (b); } /* * increase_buffer() - Increase the buffer so that it holds sz more bytes. 
*/ void increase_buffer(b, sz) Buffer *b; int sz; { b->size += sz; b->data = xrealloc(b->data, b->size); #ifdef DEBUG glimpselog("Growing buffer by %d bytes to %d bytes\n", sz, b->size); #endif } /* * grow_buffer() - increases the buffer size by the default size */ void grow_buffer(b) Buffer *b; { increase_buffer(b, b->default_size); } /* * shrink_buffer() - restores a buffer back to its original size. * all data is lost. */ void shrink_buffer(b) Buffer *b; { b->length = 0; if (b->size == b->default_size) /* nothing to do */ return; if (b->data) xfree(b->data); b->size = b->default_size; b->data = xmalloc(b->size); #ifdef DEBUG glimpselog("Shrinking buffer to %d bytes\n", b->size); #endif } /* * free_buffer() - Cleans up after a buffer. */ void free_buffer(b) Buffer *b; { if (b == NULL) return; #ifdef DEBUG glimpselog("Freeing buffer of %d bytes\n", b->size); #endif if (b->data) xfree(b->data); xfree(b); } /* * add_buffer() - Adds the sz bytes of s to the Buffer b. */ void add_buffer(b, s, sz) Buffer *b; char *s; int sz; { if (sz < 1) return; if (b->length + sz + 1 > b->size) increase_buffer(b, sz); if (sz > 1) memcpy(&b->data[b->length], s, sz); else b->data[b->length] = *s; b->length += sz; b->data[b->length] = '\0'; /* add NULL to current position */ } glimpse-4.18.7/libtemplate/util/harvest.c000066400000000000000000000053541300371307100203540ustar00rootroot00000000000000static char rcsid[] = "$Id: harvest.c,v 1.2 2003/11/13 05:17:39 golda Exp $"; /* * harvest.c - Routines specific to the Harvest installation * * Darren Hardy, hardy@cs.colorado.edu, December 1994 * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. 
* */ #include #include #include #include #include "util.h" #define DEFAULT_HARVEST_HOME "/usr/local/harvest" /* * harvest_bindir() - Returns a static buffer that contains the * pathname that contains the binaries for Harvest. */ char *harvest_bindir() { static char bindir[MAXPATHLEN + 1]; char *s; if ((s = getenv("HARVEST_HOME")) != NULL) sprintf(bindir, "%s/bin", s); else sprintf(bindir, "%s/bin", DEFAULT_HARVEST_HOME); return (bindir); } /* * harvest_libdir() - Returns a static buffer that contains the * pathname that contains the libraries for Harvest. */ char *harvest_libdir() { static char libdir[MAXPATHLEN + 1]; char *s; if ((s = getenv("HARVEST_HOME")) != NULL) sprintf(libdir, "%s/lib", s); else sprintf(libdir, "%s/lib", DEFAULT_HARVEST_HOME); return (libdir); } /* * harvest_topdir() - Returns a static buffer that contains the * pathname that contains the libraries for Harvest. */ char *harvest_topdir() { static char topdir[MAXPATHLEN + 1]; char *s; if ((s = getenv("HARVEST_HOME")) != NULL) sprintf(topdir, "%s", s); else sprintf(topdir, "%s", DEFAULT_HARVEST_HOME); return (topdir); } /* * add_harvest_to_path() - If xtra is not-NULL, then it will * add harvest_libdir() + xtra to the path as well. 
For example, * a Gatherer process would call: * add_harvest_to_path("gatherer:") * to add $harvest_libdir/ and $harvest_libdir/gatherer */ void harvest_add_path(xtra) char *xtra; { char *s = getenv("PATH"), *newpath, *oldpath, *q; char *tmpxtra; if (s == NULL) fatal("This process does not have a PATH environment variable"); newpath = xmalloc(strlen(s) + BUFSIZ); sprintf(newpath, "PATH=%s", s); sprintf(newpath + strlen(newpath), ":%s", harvest_bindir()); if (xtra != NULL) { tmpxtra = strdup(xtra); q = strtok(tmpxtra, ":"); while (q != NULL) { sprintf(newpath + strlen(newpath), ":%s/%s", harvest_libdir(), q); q = strtok(NULL, ":"); } xfree(tmpxtra); } #ifdef DEBUG glimpselog("Adding new PATH to environment: %s\n", newpath); #endif (void) putenv(newpath); } glimpse-4.18.7/libtemplate/util/host.c000066400000000000000000000050061300371307100176470ustar00rootroot00000000000000static char rcsid[] = "$Id: host.c,v 1.2 2006/03/25 02:13:55 root Exp $"; /* * host.c - Retrieves full DNS name of the current host * * Darren Hardy, hardy@cs.colorado.edu, April 1994 * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. * */ #include #include #include #include #include #include #include #include #include #include #include #include #include "util.h" #ifndef MAXHOSTNAMELEN #define MAXHOSTNAMELEN 256 #endif /* * getfullhostname() - Returns the fully qualified name of the current * host, or NULL on error. Pointer is only valid until the next call * to the gethost*() functions. 
*/ char *getfullhostname() { struct hostent *hp; char buf[MAXHOSTNAMELEN + 1]; extern int gethostname(); /* UNIX system call */ if (gethostname(buf, MAXHOSTNAMELEN) < 0) return (NULL); if ((hp = gethostbyname(buf)) == NULL) return (NULL); return (hp->h_name); } /* * getmylogin() - Returns the login for the pid of the current process, * or "nobody" if there is not login associated with the pid of the * current process. Intended to be a replacement for braindead getlogin(3). */ char *getmylogin() { static char *nobody_str = "nobody"; uid_t myuid = getuid(); struct passwd *pwp; pwp = getpwuid(myuid); if (pwp == NULL || pwp->pw_name == NULL) return (nobody_str); return (pwp->pw_name); } /* * getrealhost() - Returns the real fully qualified name of the given * host or IP number, or NULL on error. */ char *getrealhost(s) char *s; { char *q; struct hostent *hp; int is_octet = 1, ndots = 0; unsigned int addr = 0; if (s == NULL || *s == '\0') return (NULL); for (q = s; *q; q++) { if (*q == '.') { ndots++; continue; } else if (!isdigit(*q)) { /* [^0-9] is a name */ is_octet = 0; break; } } if (ndots != 3) is_octet = 0; if (is_octet) { addr = inet_addr(s); hp = gethostbyaddr((char *) &addr, sizeof(unsigned int), AF_INET); } else { hp = gethostbyname(s); } return ((char *)(hp != NULL ? strdup(hp->h_name) : NULL)); } glimpse-4.18.7/libtemplate/util/log.c000066400000000000000000000066651300371307100174670ustar00rootroot00000000000000static char rcsid[] = "$Id: log.c,v 1.5 2006/02/03 16:53:26 golda Exp $"; /* * log.c - Logging facilities for Essence system. * * Darren Hardy, hardy@cs.colorado.edu, February 1994 * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. 
Schwartz of the University of Colorado at Boulder. */ #include #include #include #include #include #include #include #include "util.h" /* Local functions */ static void standard_msg(); static void log_flush(); /* Local variables */ static FILE *fp_log = NULL; static FILE *fp_errs = NULL; static int pid; static char *pname = NULL; #if defined(USE_LOG_SYNC) && defined(HAVE_FLOCK) static void lock_file(fp) FILE *fp; { if (flock(fileno(fp), LOCK_EX) < 0) log_errno("flock"); if (fseek(fp, 0, SEEK_END) < 0) log_errno("fseek"); } static void unlock_file(fp) FILE *fp; { if (flock(fileno(fp), LOCK_UN) < 0) log_errno("flock"); } #else #define lock_file(fp) /* nops */ #define unlock_file(fp) #endif /* * init_log() - Initializes the logging routines. glimpselog() prints to * FILE *a, and errorlog() prints to FILE *b; */ void init_log(a, b) FILE *a, *b; { fp_log = a; fp_errs = b; pid = getpid(); pname = NULL; } void init_log3(pn, a, b) char *pn; FILE *a, *b; { fp_log = a; fp_errs = b; pid = getpid(); pname = strdup(pn); } /* * glimpselog() - used like printf(3). Prints message to the log stream * set by init_log(); does nothing if no log stream is set. */ void glimpselog(char *fmt,...) { va_list ap; if (fp_log == NULL) return; va_start(ap, fmt); lock_file(fp_log); standard_msg(fp_log); vfprintf(fp_log, fmt, ap); va_end(ap); log_flush(fp_log); unlock_file(fp_log); } /* * errorlog() - used like printf(3). Prints error message to the error * stream set by init_log(); does nothing if no error stream is set. */ void errorlog(char *fmt,...) { va_list ap; if (fp_errs == NULL) return; va_start(ap, fmt); lock_file(fp_errs); standard_msg(fp_errs); fprintf(fp_errs, "ERROR: "); vfprintf(fp_errs, fmt, ap); va_end(ap); log_flush(fp_errs); unlock_file(fp_errs); } /* * fatal() - used like printf(3). Prints error message to the error * stream set by init_log() and exits */ void fatal(char *fmt,...)
{ va_list ap; if (fp_errs == NULL) exit(1); va_start(ap, fmt); if (fp_errs == NULL) exit(1); lock_file(fp_errs); standard_msg(fp_errs); fprintf(fp_errs, "FATAL: "); vfprintf(fp_errs, fmt, ap); va_end(ap); log_flush(fp_errs); unlock_file(fp_errs); exit(1); } /* * log_errno() - Same as perror(); doesn't print when errno == 0 */ void log_errno(s) char *s; { if (errno != 0) errorlog("%s: %s\n", s, strerror(errno)); } /* * fatal_errno() - Same as perror() */ void fatal_errno(s) char *s; { fatal("%s: %s\n", s, strerror(errno)); } /* * standard_msg() - Prints the standard pid and timestamp */ static void standard_msg(fp) FILE *fp; { if (pname != NULL) fprintf(fp, "%7s: ", pname); else fprintf(fp, "%7d: ", pid); #ifdef LOG_TIMES { time_t t = time(NULL); char buf[BUFSIZ]; strftime(buf, BUFSIZ - 1, "%y%m%d %H:%M:%S:", localtime(&t)); fprintf(fp, "%s ", buf); } #endif } static void log_flush(fp) FILE *fp; { fflush(fp); } glimpse-4.18.7/libtemplate/util/strdup.c000066400000000000000000000020051300371307100202070ustar00rootroot00000000000000static char rcsid[] = "$Id: strdup.c,v 1.2 1999/11/19 08:11:49 golda Exp $"; /* * strdup.c - string duplication * * Darren Hardy, hardy@cs.colorado.edu, February 1994 * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. 
* */ #include #include #include "util.h" #ifdef NO_STRDUP /* * strdup() - same as strdup(3) */ char *strdup(s) const char *s; { static char *p = NULL; int sz; if (s == NULL) return (NULL); sz = strlen(s); p = xmalloc((size_t) sz + 1); /* allocate memory for string */ memcpy(p, s, sz); /* copy string */ p[sz] = '\0'; /* terminate string */ return (p); } #endif glimpse-4.18.7/libtemplate/util/strerror.c000066400000000000000000000015311300371307100205530ustar00rootroot00000000000000static char rcsid[] = "$Id: strerror.c,v 1.2 1999/11/19 08:11:49 golda Exp $"; /* * strerror.c - print message associated with errno * * Darren Hardy, hardy@cs.colorado.edu, April 1994 * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. * */ #include #include "util.h" #ifdef NO_STRERROR /* * strerror() - same as strerror(3) */ char *strerror(n) int n; { if (n < 0 || n >= sys_nerr) return (NULL); return (sys_errlist[n]); } #endif glimpse-4.18.7/libtemplate/util/string.c000066400000000000000000000027031300371307100202010ustar00rootroot00000000000000static char rcsid[] = "$Id: string.c,v 1.1 1999/11/03 20:42:14 golda Exp $"; /* * string.c - Simple string manipulation * * Darren Hardy, hardy@cs.colorado.edu, June 1994 * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. 
* */ #include #include #include #include "util.h" #ifndef isquote #define isquote(c) (((c) == '\"') || ((c) == '\'')) #endif /* * parse_argv() - Parses the command string to build an argv list. * Supports simple quoting. argv is large enough to support the * command string. */ void parse_argv(argv, cmd) char *argv[]; char *cmd; { char *tmp, *p, *q; int i = 0; p = q = tmp = strdup(cmd); while (1) { if (isquote(*q)) { p++; q++; while (*q && !isquote(*q)) q++; if (isquote(*q)) *q++ = '\0'; else if (*q == '\0') break; } else { while (*q && !isspace(*q)) q++; } while (isspace(*q) && !isquote(*q)) *q++ = '\0'; if (*q == '\0') { argv[i++] = strdup(p); break; } if (*p) argv[i++] = strdup(p); p = q; } argv[i] = NULL; xfree(tmp); } glimpse-4.18.7/libtemplate/util/system.c000066400000000000000000000103701300371307100202160ustar00rootroot00000000000000static char rcsid[] = "$Id: system.c,v 1.2 2003/11/13 05:17:39 golda Exp $"; /* * system.c - system(3) routines for Essence system. * * Darren Hardy, hardy@cs.colorado.edu, February 1994 * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder. * */ #include #include #include #include #include #include "util.h" #ifdef HAVE_SETRLIMIT #include #include #endif static void redirect_stdout(); /* * do_system() - calls system(3). */ int do_system(cmd) char *cmd; { #ifdef DEBUG glimpselog("RUNNING as shell: %s\n", cmd); #endif return (system(cmd)); } /* * run_cmd() - simplified system(3). Parses the command, will redirect * stdout and then fork/exec() to save a sh process. 
*/ int run_cmd(cmd) char *cmd; { int pid, status = 0; #ifdef DEBUG glimpselog("RUNNING: %s\n", cmd); #endif /* ** PURIFY: ** use fork() here instead of vfork(). With vfork parent and child ** share memory space. In the child we strdup a bunch of argv's ** which would otherwise never get free'd causing a memory leak in ** the parent. */ if ((pid = fork()) < 0) { log_errno("run_cmd: fork"); return (1); } if (pid == 0) { /* child */ char *argv[64], buf[BUFSIZ]; int i; memset(argv, '\0', sizeof(char *) * 64); parse_argv(argv, cmd); for (i = 0; argv[i] != NULL; i++) { if (argv[i][0] == '>' && argv[i + 1] != NULL) { argv[i] = NULL; redirect_stdout(argv[++i]); break; } } execvp(argv[0], argv); sprintf(buf, "execvp: %s", argv[0]); log_errno(buf); _exit(1); } /* parent */ (void) waitpid(pid, &status, 0); return (WEXITSTATUS(status)); } static void redirect_stdout(filename) char *filename; { int fd; if (filename == NULL) return; if ((fd = open(filename, O_RDWR | O_CREAT | O_TRUNC, 0644)) < 0) { log_errno(filename); return; } close(1); dup2(fd, 1); /* make stdout */ } static int dsl_pid = -1; static void alarm_handler() { #ifdef DEBUG glimpselog("do_system_lifetime: Caught signal. Killing %d.\n", dsl_pid); #endif (void) kill(dsl_pid, SIGTERM); sleep(1); (void) kill(dsl_pid, SIGKILL); } /* * do_system_lifetime() - forks and execs cmd like run_cmd(). The child * is killed if it is still running after lifetime seconds. */ void do_system_lifetime(cmd, lifetime) char *cmd; int lifetime; { #ifdef DEBUG glimpselog("RUNNING: %s\n", cmd); glimpselog("do_system_lifetime: Lifetime is %d seconds.\n", lifetime); #endif signal(SIGALRM, alarm_handler); alarm(lifetime); /* ** PURIFY: ** use fork() here instead of vfork(). With vfork parent and child ** share memory space. In the child we strdup a bunch of argv's ** which would otherwise never get free'd causing a memory leak in ** the parent.
*/ if ((dsl_pid = fork()) < 0) { log_errno("fork"); return; } if (dsl_pid) { /* parent */ (void) waitpid(dsl_pid, (int *) NULL, 0); alarm(0); return; } else { /* child */ char *argv[64]; char buf[BUFSIZ]; int i; alarm(0); memset(argv, '\0', sizeof(char *) * 64); parse_argv(argv, cmd); for (i = 0; argv[i] != NULL; i++) { if (argv[i][0] == '>' && argv[i + 1] != NULL) { argv[i] = NULL; redirect_stdout(argv[++i]); break; } } #if defined(HAVE_SETRLIMIT) && defined(RLIMIT_CPU) { struct rlimit rlp; rlp.rlim_cur = rlp.rlim_max = lifetime; (void) setrlimit(RLIMIT_CPU, &rlp); } #endif execvp(argv[0], argv); sprintf(buf, "execvp: %s", argv[0]); log_errno(buf); _exit(1); } } /* * close_all_fds() - closes all of the file descriptors starting with start. */ void close_all_fds(start) int start; { int i; #if defined(HAVE_GETDTABLESIZE) for (i = start; i < getdtablesize(); i++) { #elif defined(HAVE_SYSCONF) && defined(_SC_OPEN_MAX) for (i = start; i < sysconf(_SC_OPEN_MAX); i++) { #elif defined(OPEN_MAX) for (i = start; i < OPEN_MAX; i++) { #else for (i = start; i < 64; i++) { #endif (void) close(i); } } glimpse-4.18.7/libtemplate/util/xmalloc.c000066400000000000000000000032261300371307100203330ustar00rootroot00000000000000static char rcsid[] = "$Id: xmalloc.c,v 1.1 1999/11/03 20:42:14 golda Exp $"; /* * xmalloc.c - Memory allocation for Essence system. * * Darren Hardy, hardy@cs.colorado.edu, February 1994 * * ---------------------------------------------------------------------- * Copyright (c) 1994, 1995. All rights reserved. * * Mic Bowman of Transarc Corporation. * Peter Danzig of the University of Southern California. * Darren R. Hardy of the University of Colorado at Boulder. * Udi Manber of the University of Arizona. * Michael F. Schwartz of the University of Colorado at Boulder.
* */ #include #include #include #include "util.h" #ifdef MEMORY_LEAKS #include "leak.h" #endif /* * xmalloc() - same as malloc(3) except if malloc() returns NULL, then * xmalloc() prints an error message and calls exit(3). So xmalloc() * always returns non-NULL values. */ void *xmalloc(sz) size_t sz; { static void *p; #ifdef MEMORY_LEAKS leak_logging = 1; #endif if ((p = malloc(sz)) == NULL) { errorlog("malloc: Out of memory! Exiting...\n"); exit(1); } memset(p, '\0', sz); /* NULL out the memory */ return (p); } /* * xfree() - same as free(3). */ void xfree(s) void *s; { #ifdef MEMORY_LEAKS leak_logging = 1; #endif if (s != NULL) { free(s); } } /* * xrealloc() - same as realloc(3). Exits on error, so always returns * non-NULL values. */ void *xrealloc(s, sz) void *s; size_t sz; { static void *p; #ifdef MEMORY_LEAKS leak_logging = 1; #endif if ((p = realloc(s, sz)) == NULL) { errorlog("realloc: Out of memory! Exiting...\n"); exit(1); } return (p); } glimpse-4.18.7/main.c000066400000000000000000003636031300371307100143510ustar00rootroot00000000000000/* Copyright (c) 1994 Sun Wu, Udi Manber, Burra Gopal. All Rights Reserved. 
*/ /* bgopal: (1993-4) redesigned/rewritten using agrep's library interface */ #include #include #include #include "glimpse.h" #include "defs.h" #include #include "checkfile.h" #include #include #include #include /* for flock definition */ #if ISO_CHAR_SET #include /* support for 8bit character set */ #endif #define CLIENTSERVER 1 #define USE_MSGHDR 0 #define USE_UNIXDOMAIN 0 #define DEBUG 0 #define DEF_SERV_PORT 2001 #define MIN_SERV_PORT 1024 #define MAX_SERV_PORT 30000 #define SERVER_QUEUE_SIZE 10 /* number of requests to buffer up while processing one request = 5 */ /* Borrowed from C-Lib */ extern char **environ; extern int errno; #if CLIENTSERVER #include "communicate.c" #endif /*CLIENTSERVER*/ /* For client-server protocol */ CHAR *SERV_HOST = NULL; int SERV_PORT; char glimpse_reqbuf[MAX_ARGS*MAX_NAME_LEN]; extern int glimpse_clientdied; /* set if signal received about dead socket: need agrep variable so that exec() can return quickly */ int glimpse_reinitialize = 0; /* Borrowed from agrep.c */ extern int D_length; /* global variable in agrep */ extern int D; /* global variable in agrep */ extern int pattern_index; /* These are used for byte level index search */ extern CHAR CurrentFileName[MAX_LINE_LEN]; extern int SetCurrentFileName; extern int CurrentByteOffset; extern int SetCurrentByteOffset; extern long CurrentFileTime; extern int SetCurrentFileTime; extern int execfd; extern int agrep_initialfd; extern CHAR *agrep_inbuffer; extern int agrep_inlen; extern int agrep_inpointer; extern FILE *agrep_finalfp; extern CHAR *agrep_outbuffer; extern int agrep_outlen; extern int agrep_outpointer; extern int glimpse_call; /* prevent agrep from printing out its usage */ extern int glimpse_isserver; /* prevent agrep from asking for user input */ int first_search = 1; /* intra/interaction in process_query() and glimpse_search() */ #if ISSERVER int RemoteFiles = 0; /* Are the files present locally or remotely? 
If on, then -NQ is automatically added to all search options for each query */ #endif /* Borrowed from index/io.c */ extern int InfoAfterFilename; extern int OneFilePerBlock; extern int StructuredIndex; extern unsigned int *dest_index_set; extern unsigned char *dest_index_buf; extern unsigned int *src_index_set; extern unsigned char *src_index_buf; extern unsigned char *merge_index_buf; extern int mask_int[32]; extern int indexable_char[256]; int test_indexable_char[256]; extern int p_table[MAX_PARTITION]; extern int GMAX_WORD_SIZE; extern int IndexNumber; /* used in getword() */ extern int InterpretSpecial; /* used to "not-split" agrep-regexps */ extern int UseFilters; /* defined in build_in.c, used for filtering routines in io.c */ extern int ByteLevelIndex; extern int RecordLevelIndex; extern int rdelim_len; extern char rdelim[MAX_LINE_LEN]; extern char old_rdelim[MAX_LINE_LEN]; extern int file_num; extern int REAL_PARTITION, REAL_INDEX_BUF, MAX_ALL_INDEX, FILEMASK_SIZE; /* Borrowed from get_filename.c */ extern int bigbuffer_size; extern char *bigbuffer; extern char *outputbuffer; /* OPTIONS/FLAGS */ int veryfast = 0; int CONTACT_SERVER = 0; /* Should client try to call server at all or just process query on its own? */ int NOBYTELEVEL = 0; /* Some cases where we cannot do byte level fast-search: ALWAYS 0 if !ByteLevelIndex */ int OPTIMIZEBYTELEVEL = 0; /* Some cases where we don't want to do byte level search since number of files is small */ int GCONSTANT = 0; /* should pattern be taken as-is or parsed? */ int GLIMITOUTPUT = 0; /* max no. of output lines: 0=>infinity=default=nolimit */ int GLIMITTOTALFILE = 0; /* max no. of files to match: 0=>infinity=default=nolimit */ int GLIMITPERFILE = 0; /* not used in glimpse */ int GBESTMATCH = 0; /* Should I change -B to -# where # = no. of errors? 
*/ int GRECURSIVE = 0; int GNOPROMPT = 0; int GBYTECOUNT = 0; int GPRINTFILENUMBER = 0; int GPRINTFILETIME = 0; int GOUTTAIL = 0; int GFILENAMEONLY = 0; /* how to do it if it is an and expression in structured queries */ int GNOFILENAME=0; int GPRINTNONEXISTENTFILE = 0; /* if filename is not there in index, then at least let user know its name */ int MATCHFILE = 0; int PRINTATTR = 0; int PRINTINDEXLINE = 0; int Pat_as_is=0; int Only_first=0; /* Do index search only */ int PRINTAPPXFILEMATCH=0; /* Print places in file where match occurs: useful with -b only to analyse the index */ int GCOUNT=0; /* print number of matches rather than actual matches: used only when PRINTAPPX = 1 */ int HINTSFROMUSER=0; /* The user gives the hints about where we should search (result of adding -EQNgy) */ int WHOLEFILESCOPE=0; /* used only when foundattr is NOT set: otherwise, scope is whole file anyway */ int foundattr=0; /* set in split.c -- != 0 only when StructuredIndex AND query is structured */ int foundnot=0; /* set in split.c -- != 0 only when the not operator (~) is present in the pattern */ int FILENAMESINFILE=0; /* whether the user is providing an explicit list of filenames to be searched for pattern (if absent, then means all files) */ int BITFIELDFILE=0; /* Based on contribution From ada@mail2.umu.se Fri Jul 12 01:56 MST 1996; Christer Holgersson, Sen. SysNet Mgr, Umea University/SUNET, Sweden */ int BITFIELDOFFSET=0; int BITFIELDLENGTH=0; int BITFIELDENDIAN=0; int GNumDays = 0; /* whether the user wants files modified within these many days before creating the index: only >0 makes sense */ /* structured queries */ CHAR ***attr_vals; /* matrix of char pointers: row=max #of attributes, col=max possible values */ CHAR **attr_found; /* did the expression corr. to each value in attr_vals match? */ ParseTree *GParse; /* what kind of expression corr. to attr are we looking for */ /* arbitrary booleans */ ParseTree terminals[MAXNUM_PAT]; /* parse tree's terminal node pointers pt. 
to elements of this array; also used outside */ char matched_terminals[MAXNUM_PAT]; /* ...[i] is 1 if i'th terminal matched: used in filter_output and eval_tree */ int num_terminals; /* number of terminal patterns */ int ComplexBoolean=0; /* 1 if we need to use parse trees and the eval function */ /* index search */ CHAR *pat_list[MAXNUM_PAT]; /* complete words within global pattern */ int pat_lens[MAXNUM_PAT]; /* their lengths */ int pat_attr[MAXNUM_PAT]; /* set of attributes */ int is_mgrep_pat[MAXNUM_PAT]; int mgrep_pat_index[MAXNUM_PAT]; int num_mgrep_pat; CHAR pat_buf[(MAXNUM_PAT + 2)*MAXPAT]; int pat_ptr = 0; extern char INDEX_DIR[MAX_LINE_LEN]; char *TEMP_DIR = NULL; /* directory to store glimpse temporary files, usually /tmp unless -T is specified */ char indexnumberbuf[256]; /* to read in first few lines of the index */ char *index_argv[MAX_ARGS]; int index_argc = 0; int bestmatcherrors=0; /* set during index search, used later on */ int patindex; int patbufpos = -1; char tempfile[MAX_NAME_LEN]; char *filenames_file = NULL; char *bitfield_file = NULL; /* agrep search */ char *agrep_argv[MAX_ARGS]; int agrep_argc = 0; CHAR *FileOpt; /* the option list after -F */ int fileopt_length; CHAR GPattern[MAXPAT]; int GM; CHAR APattern[MAXPAT]; int AM; CHAR GD_pattern[MAXPAT]; int GD_length; CHAR **GTextfiles; CHAR **GTextfilenames; int *GFileIndex; int GNumfiles; int GNumpartitions; CHAR GProgname[MAXNAME]; /* persistent file descriptors */ #if BG_DEBUG FILE *debug; /* file descriptor for debugging output */ #endif /*BG_DEBUG*/ FILE *timesfp = NULL; FILE *timesindexfp = NULL; FILE *indexfp = NULL; /* glimpse index */ FILE *partfp = NULL; /* glimpse partitions */ FILE *minifp = NULL; /* glimpse turbo */ FILE *nullfp = NULL; /* to discard output: agrep -s doesn't work properly */ int svstdin = 0, svstdout = 1, svstderr = 2; static int one = 1; /* to set socket option so that glimpseserver releases socket after death */ /* Index manipulation */ struct offsets 
**src_offset_table; struct offsets **multi_dest_offset_table[MAXNUM_PAT]; unsigned int *multi_dest_index_set[MAXNUM_PAT]; extern free_list(); struct stat index_stat_buf, file_stat_buf; int timesindexsize = 0; int last_Y_filenumber = 0; /* Direct agrep access for bytelevel-indices */ extern int COUNT, INVERSE, TCOMPRESSED, NOFILENAME, POST_FILTER, OUTTAIL, BYTECOUNT, SILENT, NEW_FILE, LIMITOUTPUT, LIMITPERFILE, LIMITTOTALFILE, PRINTRECORD, DELIMITER, SILENT, FILENAMEONLY, num_of_matched, prev_num_of_matched, FILEOUT; CHAR matched_region[MAX_REGION_LIMIT*2 + MAXPATT*2]; int RegionLimit=DEFAULT_REGION_LIMIT; /* Returns number of matched records/lines. Uses agrep's options to output stuff nicely; never called with RecordLevelIndex set */ int glimpse_search(AM, APattern, GD_length, GD_pattern, realfilename, filename, fileindex, src_offset_table, outfp) int AM; unsigned char APattern[]; int GD_length; unsigned char GD_pattern[]; char *realfilename; char *filename; int fileindex; struct offsets *src_offset_table[]; FILE *outfp; { FILE *infp; char sig[SIGNATURE_LEN]; struct offsets **p1, *tp1; CHAR *text, *curtextend, *curtextbegin, c; int times; int num, ret=0, totalret = 0; int prevoffset = 0, begininterval = 0, endinterval = -1; CHAR *beginregionptr = 0, *endregionptr = 0; int beginpage = 0, endpage = -1; static int MAXTIMES, MAXPGTIMES, pagesize; static int first_time = 1; /* * If can't open file for read, quit * For each offset for that file: * seek to that point * go back until delimiter, go forward until delimiter, output it: MAX_REGION_LIMIT is 16K on either side. * read in units of RegionLimit * before outputting matched record, use options to put prefixes (or use memagrep which does everything?) * Algorithm changed: don't read same page in twice. */ if (first_time) { pagesize = DISKBLOCKSIZE; MAXTIMES = ((MAX_REGION_LIMIT / RegionLimit) > 1) ? (MAX_REGION_LIMIT / RegionLimit) : 1; MAXPGTIMES = ((MAX_REGION_LIMIT / pagesize) > 1) ? 
(MAX_REGION_LIMIT / pagesize) : 1; first_time = 0; } /* Safety: must end/begin with delim */ memcpy(matched_region, GD_pattern, GD_length); memcpy(matched_region+MAXPATT+2*MAX_REGION_LIMIT, GD_pattern, GD_length); text = &matched_region[MAX_REGION_LIMIT+MAXPATT]; if ((infp = my_fopen(filename, "r")) == NULL) return 0; NEW_FILE = ON; #if 0 /* Cannot search in .CZ files since offset computations will be incorrect */ TCOMPRESSED = ON; if (!tuncompressible_filename(file_list[i], strlen(file_list[i]))) TCOMPRESSED = OFF; num_read = fread(sig, 1, SIGNATURE_LEN, infp); if ((TCOMPRESSED == ON) && tuncompressible(sig, num_read)) { EASYSEARCH = sig[SIGNATURE_LEN-1]; if (!EASYSEARCH) { fprintf(stderr, "not compressed for easy-search: can miss some matches in: %s\n", CurrentFileName); /* not filename!!! */ } } else TCOMPRESSED = OFF; #endif /*0*/ p1 = &src_offset_table[fileindex]; while (*p1 != NULL) { if ( (begininterval <= (*p1)->offset) && (endinterval > (*p1)->offset) ) { /* already covered this area */ #if DEBUG printf("ignoring %d in [%d,%d]\n", (*p1)->offset, begininterval, endinterval); #endif /*DEBUG*/ tp1 = *p1; *p1 = (*p1)->next; my_free(tp1, sizeof(struct offsets)); continue; } TCOMPRESSED = OFF; #if 1 if ( (beginpage <= (*p1)->offset) && (endpage >= (*p1)->offset) && (text + ((*p1)->offset - prevoffset) + GD_length < endregionptr)) { /* beginregionptr = curtextend - GD_length; /* prevent next curtextbegin to go behind previous curtextend (!) 
*/ text += ((*p1)->offset - prevoffset); prevoffset = (*p1)->offset; if (!((curtextend = forward_delimiter(text, endregionptr, GD_pattern, GD_length, 1)) < endregionptr)) goto fresh_read; if (!((curtextbegin = backward_delimiter(text, beginregionptr, GD_pattern, GD_length, 0)) > beginregionptr)) goto fresh_read; } else { /* NOT within an area already read: must read another page: if record overlapps page, might read page twice: no time to fix */ fresh_read: prevoffset = (*p1)->offset; text = &matched_region[MAX_REGION_LIMIT+MAXPATT]; /* middle: points to occurrence of pattern */ endpage = beginpage = ((*p1)->offset / pagesize) * pagesize; /* endpage = (((*p1)->offset + pagesize) / pagesize) * pagesize */ endregionptr = beginregionptr = text - ((*p1)->offset - beginpage); /* overlay physical place starting from this logical point */ /* endregionptr = text + (endpage - (*p1)->offset); */ curtextbegin = curtextend = text; times = 0; while (times < MAXPGTIMES) { fseek(infp, endpage, 0); num = (&matched_region[MAX_REGION_LIMIT*2+MAXPATT] - endregionptr < pagesize) ? (&matched_region[MAX_REGION_LIMIT*2+MAXPATT] - endregionptr) : pagesize; if ((num = fread(endregionptr, 1, num, infp)) <= 0) break; endpage += num; endregionptr += num; if (endregionptr <= text) { curtextend = text; /* error in value of offset: file was modified and offsets no longer true: your RISK! 
*/ break; } if (((curtextend = forward_delimiter(text, endregionptr, GD_pattern, GD_length, 1)) < endregionptr) || (endregionptr >= &matched_region[MAX_REGION_LIMIT*2 + MAXPATT])) break; times ++; } times = 0; while (times < MAXPGTIMES) { /* I have already read the initial page since endpage is beginpage initially */ if ((curtextbegin = backward_delimiter(text, beginregionptr, GD_pattern, GD_length, 0)) > beginregionptr) break; if (beginpage > 0) { if (beginregionptr - pagesize < &matched_region[MAXPATT]) { if ((num = beginregionptr - &matched_region[MAXPATT]) <= 0) break; } else num = pagesize; beginpage -= num; beginregionptr -= num; } else break; times ++; fseek(infp, beginpage, 0); fread(beginregionptr, 1, num, infp); } } #else /*1*/ /* Find forward delimiter (including delimiter) */ times = 0; fseek(infp, (*p1)->offset, 0); while (times < MAXTIMES) { if ((num = fread(text+RegionLimit*times, 1, RegionLimit, infp)) > 0) curtextend = forward_delimiter(text, text+RegionLimit*times+num, GD_pattern, GD_length, 1); if ((curtextend < text+RegionLimit*times+num) || (num < RegionLimit)) break; times ++; } /* Find backward delimiter (including delimiter) */ times = 0; while (times < MAXTIMES) { num = ((*p1)->offset - RegionLimit*(times+1)) > 0 ? 
((*p1)->offset - RegionLimit*(times+1)) : 0; fseek(infp, num, 0); if (num > 0) { fread(text-RegionLimit*(times+1), 1, RegionLimit, infp); curtextbegin = backward_delimiter(text, text-RegionLimit*(times+1), GD_pattern, GD_length, 0); } else { fread(text-RegionLimit*times-(*p1)->offset, 1, (*p1)->offset, infp); curtextbegin = backward_delimiter(text, text-RegionLimit*times-(*p1)->offset, GD_pattern, GD_length, 0); } if ((num <= 0) || (curtextbegin > text-RegionLimit*(times+1))) break; times ++; } #endif /*1*/ /* set interval and delete the entry */ begininterval = (*p1)->offset - (text - curtextbegin); endinterval = (*p1)->offset + (curtextend - text); if (strncmp(curtextbegin, GD_pattern, GD_length)) { /* always pass enclosing delimiters to agrep; since we have seen text before curtextbegin + we have space, we can overwrite */ memcpy(curtextbegin - GD_length, GD_pattern, GD_length); curtextbegin -= GD_length; } #if DEBUG c = *curtextend; *curtextend = '\0'; printf("%s [%d < %d < %d], text = %d: %s\n", CurrentFileName, begininterval, (*p1)->offset, endinterval, text, curtextbegin); *curtextend = c; #endif /*DEBUG*/ tp1 = *p1; *p1 = (*p1)->next; my_free(tp1, sizeof(struct offsets)); if (curtextend <= curtextbegin) continue; /* error in offsets/delims */ /* * Don't call memagrep since that is heavy weight. Call exec * directly after doing agrep_search()'s preprocessing here. * PS: can add agrep variable not to do delim search if called from here * since that prevents unnecessarily scanning the buffer for the 2nd time. 
*/ CurrentByteOffset = begininterval+1; SetCurrentByteOffset = 1; first_search = 1; if (first_search) { if ((ret = memagrep_search(AM, APattern, curtextend-curtextbegin, curtextbegin, 0, outfp)) > 0) totalret ++; /* += ret */ else if ((ret < 0) && (errno == AGREP_ERROR)) { fclose(infp); return -1; } first_search = 0; } else { /* All agrep globals are properly set: has a bug because agrep's globals aren't properly reinitialized without agrep_search :-( */ agrep_finalfp = (FILE *)outfp; agrep_outlen = 0; agrep_outbuffer = NULL; agrep_outpointer = 0; execfd = agrep_initialfd = -1; agrep_inbuffer = curtextbegin; agrep_inlen = curtextend - curtextbegin; agrep_inpointer = 0; if ((ret = exec(-1, NULL)) > 0) totalret ++; /* += ret; */ else if ((ret < 0) && (errno == AGREP_ERROR)) { fclose(infp); return -1; } } if (((LIMITOUTPUT > 0) && (LIMITOUTPUT <= num_of_matched)) || ((LIMITPERFILE > 0) && (LIMITPERFILE <= num_of_matched - prev_num_of_matched))) break; /* done */ if ((totalret > 0) && FILENAMEONLY) break; } /* while *p1 != NULL */ SetCurrentByteOffset = 0; fclose(infp); if (totalret > 0) { /* dirty solution: must handle part of agrep here */ if (COUNT && !FILEOUT && !SILENT) { if(!NOFILENAME) fprintf(outfp, "%s: %d\n", CurrentFileName, totalret); else fprintf(outfp, "%d\n", totalret); } else if (FILEOUT) { file_out(realfilename); } } return totalret; } /* Sets lastfilenumber that needs to be searched: rest must be discarded */ int process_Y_option(num_files, num_days, fp) int num_files, num_days; FILE *fp; { CHAR arrayend[4]; last_Y_filenumber = 0; if ((num_days <= 0) || (fp == NULL) || (timesindexsize <= 0)) return 0; last_Y_filenumber = num_files; if (num_days * sizeof(int) >= timesindexsize) return 0; /* everything will be within so many days */ if (fseek(fp, num_days*sizeof(int), 0) == -1) return -1; fread(arrayend, 1, 4, fp); if ((last_Y_filenumber = (arrayend[0] << 24) | (arrayend[1] << 16) | (arrayend[2] << 8) | arrayend[3]) > num_files) last_Y_filenumber = 
num_files; if (last_Y_filenumber == 0) { last_Y_filenumber = 1; printf("Warning: no files modified in the last %d days were found in the index.\nSearching only the most recently modified file...\n", num_days); } return 0; } read_index(indexdir) char indexdir[MAXNAME]; { char *home; char s[MAXNAME]; int ret; if (indexdir[0] == '\0') { if ((home = (char *)getenv("HOME")) == NULL) { getcwd(indexdir, MAXNAME-1); fprintf(stderr, "using working-directory '%s' to locate index\n", indexdir); } else strncpy(indexdir, home, MAXNAME); } ret = chdir(indexdir); if (getcwd(INDEX_DIR, MAXNAME-1) == NULL) strcpy(INDEX_DIR, indexdir); if (ret < 0) { fprintf(stderr, "using working-directory '%s' to locate index\n", INDEX_DIR); } sprintf(s, "%s", INDEX_FILE); indexfp = fopen(s, "r"); if(indexfp == NULL) { fprintf(stderr, "can't open glimpse index-file %s/%s\n", INDEX_DIR, INDEX_FILE); fprintf(stderr, "(use -H to give an index-directory or run 'glimpseindex' to make an index)\n"); return -1; } if (stat(s, &index_stat_buf) == -1) { fprintf(stderr, "can't stat %s/%s\n", INDEX_DIR, s); fclose(indexfp); return -1; } sprintf(s, "%s", P_TABLE); partfp = fopen(s, "r"); if(partfp == NULL) { fprintf(stderr, "can't open glimpse partition-table %s/%s\n", INDEX_DIR, P_TABLE); fprintf(stderr, "(use -H to specify an index-directory or run glimpseindex to make an index)\n"); fclose(indexfp); return -1; } sprintf(s, "%s", DEF_TIME_FILE); timesfp = fopen(s, "r"); sprintf(s, "%s.index", DEF_TIME_FILE); timesindexfp = fopen(s, "r"); if (timesindexfp != NULL) { struct stat st; fstat(fileno(timesindexfp), &st); timesindexsize = st.st_size; } /* Get options */ #if BG_DEBUG debug = fopen(DEBUG_FILE, "w+"); if(debug == NULL) { fprintf(stderr, "can't open file %s/%s, errno=%d\n", INDEX_DIR, DEBUG_FILE, errno); return(-1); } #endif /*BG_DEBUG*/ fgets(indexnumberbuf, 256, indexfp); if(strstr(indexnumberbuf, "1234567890")) IndexNumber = ON; else IndexNumber = OFF; fscanf(indexfp, "%%%d\n", &OneFilePerBlock); if 
(OneFilePerBlock < 0) { ByteLevelIndex = ON; OneFilePerBlock = -OneFilePerBlock; } else if (OneFilePerBlock == 0) { GNumpartitions = get_table(P_TABLE, p_table, MAX_PARTITION, 0); } fscanf(indexfp, "%%%d%s\n", &StructuredIndex, old_rdelim); /* Set WHOLEFILESCOPE for do-it-yourself request processing at client */ WHOLEFILESCOPE = 1; if (StructuredIndex <= 0) { if (StructuredIndex == -2) { RecordLevelIndex = 1; strcpy(rdelim, old_rdelim); rdelim_len = strlen(rdelim); preprocess_delimiter(rdelim, rdelim_len, rdelim, &rdelim_len); } WHOLEFILESCOPE = 0; StructuredIndex = 0; PRINTATTR = 0; /* doesn't make sense: must not go into filter_output */ } else if (-1 == (StructuredIndex = attr_load_names(ATTRIBUTE_FILE))) { fprintf(stderr, "error in reading attribute file %s/%s\n", INDEX_DIR, ATTRIBUTE_FILE); return(-1); } #if BG_DEBUG fprintf(debug, "buf = %s OneFilePerBlock=%d StructuredIndex=%d\n", indexnumberbuf, OneFilePerBlock, StructuredIndex); #endif /*BG_DEBUG*/ sprintf(s, "%s", MINI_FILE); minifp = fopen(s, "r"); /* if (minifp==NULL && OneFilePerBlock) fprintf(stderr, "Can't open for reading: %s/%s --- cannot do very fast search\n", INDEX_DIR, MINI_FILE); */ if (OneFilePerBlock && glimpse_isserver && (minifp != NULL)) read_mini(indexfp, minifp); read_filenames(); /* Once IndexNumber info is available */ set_indexable_char(indexable_char); set_indexable_char(test_indexable_char); set_special_char(indexable_char); return 0; } #define CLEANUP \ {\ int q, k;\ if (timesfp != NULL) fclose(timesfp);\ if (timesindexfp != NULL) fclose(timesindexfp);\ if (indexfp != NULL) fclose(indexfp);\ if (partfp != NULL) fclose(partfp);\ if (minifp != NULL) fclose(minifp);\ if (nullfp != NULL) fclose(nullfp);\ indexfp = partfp = minifp = nullfp = NULL;\ if (ByteLevelIndex) {\ if (src_offset_table != NULL) for (k=0; k QUIT CURRENT REQUEST. 
*/ int ignore_signal[32] = { 0, 0, 0, 1, 1, 1, 1, 1, 1, /* all the tracing stuff: since default action is to dump core */ 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0 }; /* resource lost: since default action is to dump core */ /* S.t. sockets don't persist: they sometimes have a bad habit of doing so */ void cleanup() { int i; /* ^C in the middle of a client call */ if (svstderr != 2) { close(2); dup(svstderr); } fprintf(stderr, "server cleaning up...\n"); CLEANUP; for (i=0; i<64; i++) close(i); exit(3); } void reinitialize(s) int s; { /* To force main-while loop call reinitialize_server() after do_select() */ glimpse_reinitialize = 1; #ifdef __svr4__ /* Solaris 2.3 insists that you reset the signal handler */ (void)signal(s, reinitialize); #endif } #define QUITREQUESTMSG "glimpseserver: aborting request...\n" /* S.t. one request doesn't keep server occupied too long, when client already quits */ void quitrequest(s) int s; { /* * Don't write onto stderr, since 2 is duped to sockfd => can cause recursive signal! * Also, don't print error message more than once for quitting one request. The * server receives signals for EVERY write it attempts when it finds a match: I could * not find a way to prevent it, but agrep/bitap.c/fill_buf() was fixed to limit it. * -- bg on 16th Feb 1995 */ if (!glimpse_clientdied && (s != SIGUSR1)) /* USR1 is a "friendly" cleanup message */ write(svstderr, QUITREQUESTMSG, strlen(QUITREQUESTMSG)); glimpse_clientdied = 1; #ifdef __svr4__ /* Solaris 2.3 insists that you reset the signal handler */ (void)signal(s, quitrequest); #endif } /* The client receives this signal when an output/input pipe is broken, etc. 
It simply exits from the current request */ void exitrequest() { glimpse_clientdied = 1; } main(argc, argv) int argc; char *argv[]; { int ret = 0, tried = 0; char indexdir[MAXNAME]; char **oldargv = argv; int oldargc = argc; #if CLIENTSERVER int sockfd, newsockfd, clilen, len, clpid; int clout; #if USE_UNIXDOMAIN struct sockaddr_un cli_addr, serv_addr; #else /*USE_UNIXDOMAIN*/ struct sockaddr_in cli_addr, serv_addr; struct hostent *hp; #endif /*USE_UNIXDOMAIN*/ int cli_len; int clargc; char **clargv; int clstdin, clstdout, clstderr; int i; char array[4]; char *p, c; #endif /*CLIENTSERVER*/ int quitwhile; #if ISO_CHAR_SET setlocale(LC_ALL,""); /* support for 8bit character set: ew@senate.be, Henrik.Martin@eua.ericsson.se */ #endif #if CLIENTSERVER && ISSERVER glimpse_isserver = 1; /* I am the server */ #else /*CLIENTSERVER && ISSERVER*/ if (argc <= 1) { usage(); /* Client needs at least 1 argument */ exit(1); } #endif /*CLIENTSERVER && ISSERVER*/ #define RETURNMAIN(val)\ {\ CLEANUP;\ if (val < 0) exit (2);\ else if (val == 0) exit (1);\ else exit (0);\ } SERV_HOST = (CHAR *)my_malloc(MAXNAME); #if !SYSCALLTESTING /* once-only initialization */ init_filename_hashtable(); src_offset_table = NULL; for (i=0; i MAX_ARGS) goto doityourself; #endif /*!ISSERVER*/ #if !SYSCALLTESTING while((--argc > 0) && (*++argv)[0] == '-' ) { p = argv[0] + 1; /* ptr to first character after '-' */ c = *(argv[0]+1); if (*p == '-') { /* cheesy hack to support --version and --help options */ if (*(p+1) == 'v') { c = 'V'; } else if (*(p+1) == 'h') { c = '?'; } } quitwhile = OFF; while (!quitwhile && (*p != '\0')) { switch(c) { /* Look for -H option at server (only one that makes sense); if client has a -H, then it goes to doityourself */ case 'H' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: a directory name must follow the -H option\n", GProgname); RETURNMAIN(usageS()); } argv ++; strcpy(indexdir, argv[0]); argc --; } else { strcpy(indexdir, p+1); } 
quitwhile = ON; break; /* Recognized by both client and server */ case 'J' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the server host name must follow the -J option\n", GProgname); #if ISSERVER RETURNMAIN(usageS()); #else /*ISSERVER*/ RETURNMAIN(usage()); #endif /*ISSERVER*/ } argv ++; strcpy(SERV_HOST, argv[0]); argc --; } else { strcpy(SERV_HOST, p+1); } quitwhile = ON; break; /* Recognized by both client and server */ case 'K' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the server port must follow the -K option\n", GProgname); #if ISSERVER RETURNMAIN(usageS()); #else /*ISSERVER*/ RETURNMAIN(usage()); #endif /*ISSERVER*/ } argv ++; SERV_PORT = atoi(argv[0]); argc --; } else { SERV_PORT = atoi(p+1); } if ((SERV_PORT < MIN_SERV_PORT) || (SERV_PORT > MAX_SERV_PORT)) { fprintf(stderr, "Bad server port %d: must be in [%d, %d]: using default %d\n", SERV_PORT, MIN_SERV_PORT, MAX_SERV_PORT, DEF_SERV_PORT); SERV_PORT = DEF_SERV_PORT; } quitwhile = ON; break; #if ISSERVER #if SFS_COMPAT case 'R' : RemoteFiles = ON; break; case 'Z' : /* No op */ break; #endif case 'V' : printf("\nThis is glimpseindex version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); RETURNMAIN(1); case '?' 
: RETURNMAIN(usageS()); /* server cannot recognize any other option */ default : fprintf(stderr, "%s: server cannot recognize option: '%s'\n", GProgname, p); RETURNMAIN(usageS()); #else /*ISSERVER*/ /* These have 1 argument each, so must do quitwhile */ case 'd' : case 'e' : case 'f' : case 'k' : case 'D' : case 'F' : case 'I' : case 'L' : case 'R' : case 'S' : case 'T' : case 'Y' : case 'p' : if (argv[0][2] == '\0') {/* space after - option */ if(argc <= 1) { fprintf(stderr, "%s: the '-%c' option must have an argument\n", GProgname, c); RETURNMAIN(usage()); } argv++; argc--; } quitwhile = ON; break; /* These are illegal */ case 'm' : case 'v' : fprintf(stderr, "%s: illegal option: '-%c'\n", GProgname, c); RETURNMAIN(usage()); /* They can't be patterns and filenames since they start with a -, these don't have arguments */ case '!' : case 'a' : case 'b' : case 'c' : case 'h' : case 'i' : case 'j' : case 'l' : case 'n' : case 'o' : case 'q' : case 'r' : case 's' : case 't' : case 'u' : case 'g' : case 'w' : case 'x' : case 'y' : case 'z' : case 'A' : case 'B' : case 'E' : case 'G' : case 'M' : case 'N' : case 'O' : case 'P' : case 'Q' : case 'U' : case 'W' : case 'X' : case 'Z' : break; case 'C': CONTACT_SERVER = 1; break; case 'V' : printf("\nThis is glimpse version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); RETURNMAIN(1); case '?': RETURNMAIN(usage()); default : if (isdigit(c)) quitwhile = ON; else { fprintf(stderr, "%s: illegal option: '-%c'\n", GProgname, c); RETURNMAIN(usage()); } break; #endif /*ISSERVER*/ } /* switch(c) */ p ++; c = *p; } } #else CONTACT_SERVER = 1; argc=0; #endif #if !ISSERVER /* Next arg must be the pattern: Check if the user wants to run the client as agrep, or doesn't want to contact the server */ if ((argc > 1) || (!CONTACT_SERVER)) goto doityourself; #endif /*!ISSERVER*/ argv = oldargv; argc = oldargc; #endif /*CLIENTSERVER*/ #if ISSERVER && CLIENTSERVER if (-1 == read_index(indexdir)) RETURNMAIN(ret); /* Install signal handlers so 
that glimpseserver doesn't continue to run when sockets get broken, etc. */ for (i=0; i<32; i++) if (ignore_signal[i]) signal(i, SIG_IGN); signal(SIGHUP, cleanup); signal(SIGINT, cleanup); if (((void (*)())-1 == signal(SIGPIPE, quitrequest)) || ((void (*)())-1 == signal(SIGUSR1, quitrequest)) || #ifndef SCO ((void (*)())-1 == signal(SIGURG, quitrequest)) || #endif ((void (*)())-1 == signal(SIGUSR2, reinitialize)) || ((void (*)())-1 == signal(SIGHUP, reinitialize))) { /* Check for return values here since they ensure reliability */ fprintf(stderr, "glimpseserver: Unable to install signal-handlers.\n"); RETURNMAIN(-1); } #if USE_UNIXDOMAIN if ((sockfd = socket(AF_UNIX, SOCK_STREAM, 0)) < 0) { fprintf(stderr, "server cannot open socket for communication.\n"); RETURNMAIN(-1); } char TMP_FILE_NAME[256]; strcpy(TMP_FILE_NAME,TEMP_DIR) ; strcat(TMP_FILE_NAME,"/.glimpse_server"); unlink(TMP_FILE_NAME); memset((char *)&serv_addr, '\0', sizeof(serv_addr)); serv_addr.sun_family = AF_UNIX; strcpy(serv_addr.sun_path, TMP_FILE_NAME); /* < 108 ! */ len = strlen(serv_addr.sun_path) + sizeof(serv_addr.sun_family); #else /*USE_UNIXDOMAIN*/ if ((sockfd = socket(PF_INET, SOCK_STREAM, 0)) < 0) { perror("glimpseserver: Cannot create socket"); RETURNMAIN(-1); } memset((char *)&serv_addr, '\0', sizeof(serv_addr)); serv_addr.sin_family = AF_INET; serv_addr.sin_port = htons(SERV_PORT); #if 0 /* use host-names not internet style d.d.d.d notation */ serv_addr.sin_addr.s_addr = htonl(INADDR_ANY); #else /* * We only want to accept connections from glimpse clients * on the SERV_HOST, do not use INADDR_ANY! 
*/ if ((hp = gethostbyname(SERV_HOST)) == NULL) { perror("glimpseserver: Cannot resolve host"); RETURNMAIN(-1); } memcpy((caddr_t)&serv_addr.sin_addr, hp->h_addr, hp->h_length); #endif /*0*/ len = sizeof(serv_addr); #endif /*USE_UNIXDOMAIN*/ /* test code for glimpse server, get it to release socket when it dies: contribution by Sheldon Smoker */ if((setsockopt(sockfd,SOL_SOCKET,SO_REUSEADDR,(char *)&one,sizeof(one))) == -1) { fprintf(stderr,"glimpseserver: could not set socket option\n"); perror("setsockopt"); exit(1); } /* end test code */ if (bind(sockfd, (struct sockaddr *)&serv_addr, len) < 0) { perror("glimpseserver: Cannot bind to socket"); RETURNMAIN(-1); } listen(sockfd, SERVER_QUEUE_SIZE); printf("glimpseserver: On-line (pid = %d, port = %d) waiting for request...\n", getpid(), SERV_PORT); fflush(stdout); /* must fflush to print on server stdout */ while (1) { /* * Spin until sockfd is ready to do a non-blocking accept(2). * We only wait for 15 seconds, because SunOS may * swap us out if we block for 20 seconds or more. * -- Courtesy: Darren Hardy, hardy@cs.colorado.edu */ if ((ret = do_select(sockfd, 15)) == 0) { if ((errno == EINTR) && glimpse_reinitialize) { glimpse_reinitialize = 0; CLEANUP; close(sockfd); sleep(IC_PORTRELEASE); reinitialize_server(oldargc, oldargv); } continue; } else if (ret != 1) continue; /* get parameters */ ret = 0; clargc = 0; clargv = NULL; cli_len = sizeof(cli_addr); if ((newsockfd = accept(sockfd, (struct sockaddr *)&cli_addr, &cli_len)) < 0) continue; if (getreq(newsockfd, glimpse_reqbuf, &clstdin, &clstdout, &clstderr, &clargc, &clargv, &clpid) < 0) { ret = -1; #if DEBUG printf("getreq errno: %d\n", errno); #endif /*DEBUG*/ goto end_process; } #if DEBUG printf("server processing request on %x\n", newsockfd); #endif /*DEBUG*/ /* * Server doesn't wait for response, no point using svstdin = dup(0); close(0); dup(clstdin); close(clstdin); */ /* * This is wrong since clstderr == clstdout! 
svstdout = dup(1); close(1); dup(clstdout); close(clstdout); svstderr = dup(2); close(2); dup(clstderr); close(clstderr); */ svstdout = dup(1); svstderr = dup(2); close(1); close(2); dup(clstdout); dup(clstderr); close(clstdout); close(clstderr); /* * IMPORTANT: Unbuffered I/O to the client! * Done for Harvest since partial results might be * needed and fflush will not flush partial results * to the client if we type ^C and kill it: it puts * them into /dev/null. This way, output is unbuffered * and the client sees at least some results if killed. */ setbuf(stdout, NULL); setbuf(stderr, NULL); glimpse_call = 0; glimpse_clientdied = 0; ret = process_query(clargc, clargv, newsockfd); /* * Server doesn't wait for response, no point using close(0); dup(svstdin); close(svstdin); svstdin = 0; */ if (glimpse_clientdied) { /* * This code is *ONLY* used as a safety net now. * The old problem was that users would see portions * of previous (and usually) unrelated queries! * glimpseserver now uses unbuffered I/O to the * client so all previous fwrite's to now are * gone. But since this is such a nasty problem * we flush stdout to /dev/null just in case. 
*/ clout = open("/dev/null", O_WRONLY); close(1); dup(clout); close(clout); fflush(stdout); } /* Restore svstdout and svstderr to stdout/stderr */ close(1); dup(svstdout); close(svstdout); svstdout = 1; close(2); dup(svstderr); close(svstderr); svstderr = 2; end_process: #if USE_MSGHDR /* send reply and cleanup */ array[0] = (ret & 0xff000000) >> 24; array[1] = (ret & 0xff0000) >> 16; array[2] = (ret & 0xff00) >> 8; array[3] = (ret & 0xff); writen(newsockfd, array, 4); #endif /*USE_MSGHDR*/ #if DEBUG write(1, "done\n", 5); #endif /*DEBUG*/ for (i=0; ih_addr, hp->h_length); #endif /*0*/ len = sizeof(serv_addr); #endif /*USE_UNIXDOMAIN*/ if (connect(sockfd, (struct sockaddr *)&serv_addr, len) < 0) { char errbuf[4096]; sprintf(errbuf, "glimpse: Cannot contact glimpseserver: %s, port %d:", SERV_HOST, SERV_PORT); perror(errbuf); /* perror(SERV_HOST); */ #if DEBUG printf("connect errno: %d\n", errno); #endif /*DEBUG*/ close(sockfd); if ((errno == ECONNREFUSED) && (tried < 4)) { tried ++; goto trynewsocket; } goto doityourself; } if (sendreq(sockfd, glimpse_reqbuf, fileno(stdin), fileno(stdout), fileno(stderr), argc, argv, getpid()) < 0) { perror("sendreq"); #if DEBUG printf("sendreq errno: %d\n", errno); #endif /*DEBUG*/ close(sockfd); goto doityourself; } #if USE_MSGHDR if (readn(sockfd, array, 4) != 4) { close(sockfd); goto doityourself; } ret = (array[0] << 24) + (array[1] << 16) + (array[2] << 8) + array[3]; #else /*USE_MSGHDR*/ { /* * Dump everything the server writes into the socket onto * stdout until EOF/error. Do this in a way so that *everything* * the server sends is dumped to stdout by the client. The * client might die suddenly via ^C or SIGTERM, but we still * want the results. 
*/ char tmpbuf[1024]; int n; while ((n = read(sockfd, tmpbuf, 1024)) > 0) { write(fileno(stdout), tmpbuf, n); } } #endif /*USE_MSGHDR*/ close(sockfd); RETURNMAIN(ret); doityourself: #if DEBUG printf("doing it myself :-(\n"); #endif /*DEBUG*/ #endif /*CLIENTSERVER*/ setbuf(stdout, NULL); /* Unbuffered I/O to always get every result */ setbuf(stderr, NULL); glimpse_call = 0; glimpse_clientdied = 0; ret = process_query(oldargc, oldargv, fileno(stdin)); RETURNMAIN(ret); #endif /*ISSERVER && CLIENTSERVER*/ } process_query(argc, argv, newsockfd) int argc; char *argv[]; int newsockfd; { int searchpercent; int num_blocks; int num_read; int i, j; int iii; /* Udi */ int jjj; char c; char *p; int ret; int jj; int quitwhile; char indexdir[MAX_LINE_LEN]; char temp_filenames_file[MAX_LINE_LEN]; char temp_bitfield_file[MAX_LINE_LEN]; char TEMP_FILE[MAX_LINE_LEN]; char temp_file[MAX_LINE_LEN]; int oldargc = argc; char **oldargv = argv; CHAR dummypat[MAX_PAT]; int dummylen=0; int my_M_index, my_P_index, my_b_index, my_A_index, my_l_index = -1, my_B_index = -1; char **outname; int gnum_of_matched = 0; int gprev_num_of_matched = 0; int gfiles_matched = 0; int foundpat = 0; int wholefilescope=0; int nobytelevelmustbeon=0; long get_file_time(); if ((argc <= 0) || (argv == NULL)) { errno = EINVAL; return -1; } /* * Macro to destroy EVERYTHING before return since we might want to make this a * library function later on: convention is that after destroy, objects are made * NULL throughout the source code, and are all set to NULL at initialization time. * DO agrep_argv, index_argv and FileOpt my_malloc/my_free optimizations later. * my_free calls have 2nd parameter = 0 if the size is not easily determinable. 
*/ #define RETURN(val) \ {\ int q,k;\ \ first_search = 0;\ for (k=0; k MAX_ARGS) { #if ISSERVER fprintf(stderr, "too many arguments %d obtained on server!\n", argc); #endif /*ISSERVER*/ i = fileagrep(oldargc, oldargv, 0, stdout); RETURN(i); } /* * Process what options you can, then call fileagrep_init() to set * options in agrep and get the pattern. Then, call fileagrep_search(). * Begin by copying options into agrep_argv assuming glimpse was not * called as agrep (optimistic :-). */ agrep_argc = 0; for (i=0; i= MAX_ARGS) { fprintf(stderr, "%s: too many options!\n", GProgname); RETURN(usage()); } agrep_argv[agrep_argc] = (char *)my_malloc(sizeof(char *)); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = 'z'; agrep_argv[agrep_argc][2] = '\0'; agrep_argc ++; /* In glimpse, you should always print pattern when using mgrep (user can't do -f or -m)! */ if (agrep_argc + 1 >= MAX_ARGS) { fprintf(stderr, "%s: too many options!\n", GProgname); RETURN(usage()); } agrep_argv[agrep_argc] = (char *)my_malloc(sizeof(char *)); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = 'P'; agrep_argv[agrep_argc][2] = '\0'; my_P_index = agrep_argc; agrep_argc ++; /* In glimpse, you should always output multiple when doing mgrep */ if (agrep_argc + 1 >= MAX_ARGS) { fprintf(stderr, "%s: too many options!\n", GProgname); RETURN(usage()); } agrep_argv[agrep_argc] = (char *)my_malloc(sizeof(char *)); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = 'M'; agrep_argv[agrep_argc][2] = '\0'; my_M_index = agrep_argc; agrep_argc ++; /* In glimpse, you should print the byte offset if there is a structured query */ if (agrep_argc + 1 >= MAX_ARGS) { fprintf(stderr, "%s: too many options!\n", GProgname); RETURN(usage()); } agrep_argv[agrep_argc] = (char *)my_malloc(sizeof(char *)); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = 'b'; agrep_argv[agrep_argc][2] = '\0'; my_b_index = agrep_argc; agrep_argc ++; /* In glimpse, you should always have space for 
doing -m if required */ if (agrep_argc + 2 >= MAX_ARGS) { fprintf(stderr, "%s: too many options!\n", GProgname); RETURN(usage()); } agrep_argv[agrep_argc] = (char *)my_malloc(sizeof(char *)); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = 'm'; agrep_argv[agrep_argc][2] = '\0'; agrep_argc ++; agrep_argv[agrep_argc] = (char *)my_malloc(2); /* no op */ agrep_argv[agrep_argc][0] = '\0'; agrep_argc ++; /* Add -A option to print filenames as default */ if (agrep_argc + 1 >= MAX_ARGS) { fprintf(stderr, "%s: too many options!\n", GProgname); RETURN(usage()); } agrep_argv[agrep_argc] = (char *)my_malloc(sizeof(char *)); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = 'A'; agrep_argv[agrep_argc][2] = '\0'; my_A_index = agrep_argc; agrep_argc ++; while((agrep_argc < MAX_ARGS) && (--argc > 0) && (*++argv)[0] == '-' ) { p = argv[0] + 1; /* ptr to first character after '-' */ c = *(argv[0]+1); quitwhile = OFF; while (!quitwhile && (*p != '\0')) { c = *p; switch(c) { case 'F' : MATCHFILE = ON; FileOpt = (CHAR *)my_malloc(MAXFILEOPT); if (*(p + 1) == '\0') {/* space after - option */ if(argc <= 1) { fprintf(stderr, "%s: a file pattern must follow the -F option\n", GProgname); RETURN(usage()); } argv++; if ((dummylen = strlen(argv[0])) > MAXFILEOPT) { fprintf(stderr, "%s: -F option list too long\n", GProgname); RETURN(usage()); } strcpy(FileOpt, argv[0]); argc--; } else { if ((dummylen = strlen(p+1)) > MAXFILEOPT) { fprintf(stderr, "%s: -F option list too long\n", GProgname); RETURN(usage()); } strcpy(FileOpt, p+1); } /* else */ quitwhile = ON; break; /* search the index only and output the number of blocks */ case 'N' : Only_first = ON; break ; /* also keep track of the matches in each file */ case 'Q' : PRINTAPPXFILEMATCH = ON; break ; case 'U' : InfoAfterFilename = ON; break; case '!' 
: HINTSFROMUSER = ON; break; /* go to home directory to find the index: even if server overwrites indexdir here, it won't overwrite INDEX_DIR until read_index() */ case 'H' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: a directory name must follow the -H option\n", GProgname); RETURN(usage()); } argv ++; #if !ISSERVER strcpy(indexdir, argv[0]); #endif /*!ISSERVER*/ argc --; } #if !ISSERVER else { strcpy(indexdir, p+1); } agrep_argv[agrep_argc] = (char *)my_malloc(4); strcpy(agrep_argv[agrep_argc], "-H"); agrep_argc ++; agrep_argv[agrep_argc] = (char *)my_malloc(strlen(indexdir) + 2); strcpy(agrep_argv[agrep_argc], indexdir); agrep_argc ++; #endif /*!ISSERVER*/ quitwhile = ON; break; #if ISSERVER && SFS_COMPAT /* INDEX_DIR will be already set since this is the server, so we can directly xfer the .glimpse_* files */ case '.' : strcpy(TEMP_FILE, INDEX_DIR); strcpy(temp_file, "."); strcat(TEMP_FILE, "/."); if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: a file name must follow the -. option\n", GProgname); RETURN(usage()); } argv ++; strcat(TEMP_FILE, argv[0]); strcat(temp_file, argv[0]); argc --; } else { strcat(TEMP_FILE, p+1); strcat(temp_file, p+1); } if (!strcmp(temp_file, INDEX_FILE) || !strcmp(temp_file, FILTER_FILE) || !strcmp(temp_file, ATTRIBUTE_FILE) || !strcmp(temp_file, MINI_FILE) || !strcmp(temp_file, P_TABLE) || !strcmp(temp_file, PROHIBIT_LIST) || !strcmp(temp_file, INCLUDE_LIST) || !strcmp(temp_file, NAME_LIST) || !strcmp(temp_file, NAME_LIST_INDEX) || !strcmp(temp_file, NAME_HASH) || !strcmp(temp_file, NAME_HASH_INDEX) || !strcmp(temp_file, DEF_STAT_FILE) || !strcmp(temp_file, DEF_MESSAGE_FILE) || !strcmp(temp_file, DEF_TIME_FILE)) { if ((ret = open(TEMP_FILE, O_RDONLY, 0)) <= 0) RETURN(ret); while ((num_read = read(ret, matched_region, MAX_REGION_LIMIT*2)) > 0) { write(1 /* NOT TO newsockfd since that was got by a syscall!!! 
*/, matched_region, num_read); } close(ret); } quitwhile = ON; RETURN(0); #endif /* ISSERVER */ /* go to temp directory to create temp files */ case 'T' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: a directory name must follow the -T option\n", GProgname); RETURN(usage()); } argv ++; strcpy(TEMP_DIR, argv[0]); argc --; } else { strcpy(TEMP_DIR, p+1); } sprintf(tempfile, "%s/.glimpse_tmp.%d", TEMP_DIR, getpid()); quitwhile = ON; break; /* To get files within some number of days before indexing was done */ case 'Y': if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the number of days must follow the -Y option\n", GProgname); RETURN(usage()); } argv ++; GNumDays = atoi(argv[0]); argc --; } else { GNumDays = atoi(p+1); } if (GNumDays <= 0) { fprintf(stderr, "%s: the number of days %d must be > 0\n", GProgname, GNumDays); RETURN(usage()); } quitwhile = ON; break; case 'R' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the record size must follow the -R option\n", GProgname); RETURN(usage()); } argv ++; RegionLimit = atoi(argv[0]); argc --; } else { RegionLimit = atoi(p+1); } if ((RegionLimit <= 0) || (RegionLimit > MAX_REGION_LIMIT)) { fprintf(stderr, "Bad record size %d: must be in [%d, %d]: using default %d\n", RegionLimit, 1, MAX_REGION_LIMIT, DEFAULT_REGION_LIMIT); RegionLimit = DEFAULT_REGION_LIMIT; } quitwhile = ON; break; /* doesn't matter if we overwrite the value in the client since the same value would have been picked up in main() anyway */ case 'J' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the server host name must follow the -J option\n", GProgname); RETURN(usageS()); } argv ++; #if !ISSERVER strcpy(SERV_HOST, argv[0]); #endif /*!ISSERVER*/ argc --; } #if !ISSERVER else { strcpy(SERV_HOST, p+1); } #endif /*!ISSERVER*/ quitwhile = ON; break; /* doesn't matter if we overwrite the value in 
the client since the same value would have been picked up in main() anyway */ case 'K' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the server port must follow the -K option\n", GProgname); RETURN(usage()); } argv ++; #if !ISSERVER SERV_PORT = atoi(argv[0]); #endif /*!ISSERVER*/ argc --; } #if !ISSERVER else { SERV_PORT = atoi(p+1); } if ((SERV_PORT < MIN_SERV_PORT) || (SERV_PORT > MAX_SERV_PORT)) { fprintf(stderr, "Bad server port %d: must be in [%d, %d]: using default %d\n", SERV_PORT, MIN_SERV_PORT, MAX_SERV_PORT, DEF_SERV_PORT); SERV_PORT = DEF_SERV_PORT; } #endif /*!ISSERVER*/ quitwhile = ON; break; /* Based on contribution From ada@mail2.umu.se Fri Jul 12 01:56 MST 1996; Christer Holgersson, Sen. SysNet Mgr, Umea University/SUNET, Sweden */ /* the bit-mask corresponding to the set of filenames within which the pattern should be searched is explicitly provided in a filename (absolute path name) */ case 'p' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the bitfield file [and an offset/length/endian separated by :] must follow the -p option\n", GProgname); RETURN(usage()); } argv ++; strcpy(bitfield_file, argv[0]); argc --; } else { strcpy(bitfield_file, p+1); } /* Find offset and length into bitfield file */ { int iiii = 0; BITFIELDOFFSET=0; BITFIELDLENGTH=0; BITFIELDENDIAN=0; iiii = 0; while (bitfield_file[iiii] != '\0') { if (bitfield_file[iiii] == '\\') { iiii ++; if (bitfield_file[iiii] == '\0') break; if (bitfield_file[iiii] == ':') { strcpy(&bitfield_file[iiii-1], &bitfield_file[iiii]); } else iiii ++; continue; } if (bitfield_file[iiii] == ':') { bitfield_file[iiii] = '\0'; sscanf(&bitfield_file[iiii+1], "%d:%d:%d", &BITFIELDOFFSET, &BITFIELDLENGTH, &BITFIELDENDIAN); if ((BITFIELDOFFSET < 0) || (BITFIELDLENGTH < 0) || (BITFIELDENDIAN < 0)) { fprintf(stderr, "Wrong offset %d or length %d or endian %d of bitfield file\n", BITFIELDOFFSET, BITFIELDLENGTH, 
BITFIELDENDIAN); RETURN(usage()); } break; } iiii++; } #if BG_DEBUG fprintf(debug, "BITFIELD %s : %d : %d : %d\n", BITFIELDFILE, BITFIELDOFFSET, BITFIELDLENGTH, BITFIELDENDIAN); #endif } if (bitfield_file[0] != '/') { getcwd(temp_bitfield_file, MAX_LINE_LEN-1); strcat(temp_bitfield_file, "/"); strcat(temp_bitfield_file, bitfield_file); strcpy(bitfield_file, temp_bitfield_file); } BITFIELDFILE = 1; quitwhile = ON; break; /* the set of filenames within which the pattern should be searched is explicitly provided in a filename (absolute path name) */ case 'f' : if (*(p + 1) == '\0') {/* space after - option */ if (argc <= 1) { fprintf(stderr, "%s: the filenames file must follow the -f option\n", GProgname); RETURN(usage()); } argv ++; strcpy(filenames_file, argv[0]); argc --; } else { strcpy(filenames_file, p+1); } if (filenames_file[0] != '/') { getcwd(temp_filenames_file, MAX_LINE_LEN-1); strcat(temp_filenames_file, "/"); strcat(temp_filenames_file, filenames_file); strcpy(filenames_file, temp_filenames_file); } FILENAMESINFILE = 1; quitwhile = ON; break; case 'C' : CONTACT_SERVER = 1; break; case 'a' : PRINTATTR = 1; break; case 'E': PRINTINDEXLINE = 1; break; case 'W': wholefilescope = 1; break; case 'z' : UseFilters = 1; break; case 'r' : GRECURSIVE = 1; break; case 'V' : printf("\nThis is glimpse version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); RETURN(0); /* Must let 'm' fall thru to default once multipatterns are done in agrep */ case 'm' : case 'v' : fprintf(stderr, "%s: illegal option: '-%c'\n", GProgname, c); RETURN(usage()); case 'I' : case 'D' : case 'S' : /* There is no space after these options */ agrep_argv[agrep_argc] = (char *)my_malloc(strlen(argv[0]) + 2); agrep_argv[agrep_argc][0] = '-'; strcpy(agrep_argv[agrep_argc] + 1, p); agrep_argc ++; quitwhile = ON; break; case 'l': GFILENAMEONLY = 1; my_l_index = agrep_argc; agrep_argv[agrep_argc] = (char *)my_malloc(4); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = c; 
agrep_argv[agrep_argc][2] = '\0'; agrep_argc ++; break; /* * Copy the set of options for agrep: put them in separate argvs * even if they are together after one '-' (easier to process). * These are agrep options which glimpse has to peek into. */ default: agrep_argv[agrep_argc] = (char *)my_malloc(16); agrep_argv[agrep_argc][0] = '-'; agrep_argv[agrep_argc][1] = c; agrep_argv[agrep_argc][2] = '\0'; agrep_argc ++; if (c == 'n') { nobytelevelmustbeon=1; } else if (c == 'X') GPRINTNONEXISTENTFILE = 1; else if (c == 'j') GPRINTFILETIME = 1; else if (c == 'b') GBYTECOUNT = 1; else if (c == 'g') GPRINTFILENUMBER = 1; else if (c == 't') GOUTTAIL = 1; else if (c == 'y') GNOPROMPT = 1; else if (c == 'h') GNOFILENAME = 1; else if (c == 'c') GCOUNT = 1; else if (c == 'B') { GBESTMATCH = 1; my_B_index = agrep_argc - 1; } /* the following options are followed by a parameter */ else if ((c == 'e') || (c == 'd') || (c == 'L') || (c == 'k')) { if (*(p + 1) == '\0') {/* space after - option */ if(argc <= 1) { fprintf(stderr, "%s: the '-%c' option must have an argument\n", GProgname, c); RETURN(usage()); } argv++; if ( (c == 'd') && ((D_length = strlen(argv[0])) > MAX_NAME_SIZE) ) { fprintf(stderr, "%s: delimiter pattern too long (has > %d chars)\n", GProgname, MAX_NAME_SIZE); RETURN(usage()); /* Should this be RegionLimit if ByteLevelIndex? */ } else if (c == 'L') { GLIMITOUTPUT = GLIMITTOTALFILE = GLIMITPERFILE = 0; sscanf(argv[0], "%d:%d:%d", &GLIMITOUTPUT, &GLIMITTOTALFILE, &GLIMITPERFILE); if ((GLIMITOUTPUT < 0) || (GLIMITTOTALFILE < 0) || (GLIMITPERFILE < 0)) { fprintf(stderr, "%s: invalid output limit %s\n", GProgname, argv[0]); RETURN(usage()); } } agrep_argv[agrep_argc] = (char *)my_malloc(strlen(argv[0]) + 2); strcpy(agrep_argv[agrep_argc], argv[0]); if (c == 'd') { preprocess_delimiter(argv[0], D_length, GD_pattern, &GD_length); if (GOUTTAIL == 2) GOUTTAIL = 0; /* Should this be RegionLimit if ByteLevelIndex? 
*/ } if (c == 'k') GCONSTANT = 1; argc--; } else { if ( (c == 'd') && ((D_length = strlen(p+1)) > MAX_NAME_SIZE) ) { fprintf(stderr, "%s: delimiter pattern too long (has > %d chars)\n", GProgname, MAX_NAME_SIZE); RETURN(usage()); /* Should this be RegionLimit if ByteLevelIndex? */ } else if (c == 'L') { GLIMITOUTPUT = GLIMITTOTALFILE = GLIMITPERFILE = 0; sscanf(p+1, "%d:%d:%d", &GLIMITOUTPUT, &GLIMITTOTALFILE, &GLIMITPERFILE); if ((GLIMITOUTPUT < 0) || (GLIMITTOTALFILE < 0) || (GLIMITPERFILE < 0)) { fprintf(stderr, "%s: invalid output limit %s\n", GProgname, p+1); RETURN(usage()); } } agrep_argv[agrep_argc] = (char *)my_malloc(strlen(p+1) + 2); strcpy(agrep_argv[agrep_argc], p+1); if (c == 'd') { preprocess_delimiter(p+1, D_length-2, GD_pattern, &GD_length); if (GOUTTAIL == 2) GOUTTAIL = 0; /* Should this be RegionLimit if ByteLevelIndex? */ } if (c == 'k') GCONSTANT = 1; } agrep_argc ++; #if DEBUG fprintf(stderr, "%d = %s\n", agrep_argc, agrep_argv[agrep_argc - 1]); #endif /*DEBUG*/ quitwhile = ON; if ((c == 'e') || (c == 'k')) foundpat = 1; } /* else it is something that glimpse doesn't know and agrep needs to look at */ break; /* from default: */ } /* switch(c) */ p ++; } } /* while (--argc > 0 && (*++argv)[0] == '-') */ /* exitloop: */ if ((GBESTMATCH == ON) && (MATCHFILE == ON) && (Only_first == ON)) fprintf(stderr, "%s: Warning: the number of matches may be incorrect when -B is used with -F.\n", HARVEST_PREFIX); if (GOUTTAIL) GOUTTAIL = 1; if (GNOFILENAME) { agrep_argv[my_A_index][1] = 'Z'; /* ignore the -A option */ } #if ISSERVER if (RemoteFiles) { /* force -NQ so that won't start looking for files! 
*/ Only_first = ON; PRINTAPPXFILEMATCH = ON; } #endif if (argc > 0) { /* copy the rest of the options the pattern and the filenames if any verbatim */ for (i=0; i= MAX_ARGS) break; agrep_argv[agrep_argc] = (char *)my_malloc(strlen(argv[0]) + 2); strcpy(agrep_argv[agrep_argc], argv[0]); agrep_argc ++; argv ++; } if (!foundpat) argc --; } #if 0 for (j=0; j 0, glimpse * runs as agrep: otherwise, it searches index, etc. */ if (argc <= 0) { if (RecordLevelIndex) { /* based on work done for robint@zedcor.com Robin Thomas, Art Today, Tucson, AZ */ /* if ((D_length > 0) && strcmp(GD_pattern, rdelim)) { fprintf(stderr, "Index created for delimiter `%s': cannot search with delimiter `%s'\n", rdelim, GD_pattern); RETURN(-1); } SHOULD I HAVE THIS CHECK? MAYBE GD_pattern is a SUBSTRING OF rdelim??? But this is safest thing to do... robint@zedcor.com */ RegionLimit = 0; /* region is EXACTLY the same record number, not a portion of a file within some offset+length */ } glimpse_call = 1; /* Initialize some data structures, read the index */ if (GRECURSIVE == 1) { fprintf(stderr, "illegal option: '-r'\n"); RETURN(usage()); } num_terminals = 0; GParse = NULL; memset(terminals, '\0', sizeof(ParseTree) * MAXNUM_PAT); #if !ISSERVER if (-1 == read_index(indexdir)) RETURN(-1); #endif /*!ISSERVER*/ /* This handles the -n option with ByteLevelIndex: disabled as of now, else should go into file search... 
*/ if (nobytelevelmustbeon && (ByteLevelIndex && !RecordLevelIndex)) { /* with RecordLevelIndex, we'll do search, so don't set NOBYTELEVEL */ /* fprintf(stderr, "Warning: -n option used with byte-level index: must SEARCH the files\n"); */ NOBYTELEVEL=ON; } WHOLEFILESCOPE = (WHOLEFILESCOPE || wholefilescope); if (ByteLevelIndex) { /* Must zero them here in addition to index search so that RETURN macro runs correctly */ if ((src_offset_table == NULL) && ((src_offset_table = (struct offsets **)my_malloc(sizeof(struct offsets *) * OneFilePerBlock)) == NULL)) exit(2); memset(src_offset_table, '\0', sizeof(struct offsets *) * OneFilePerBlock); for (i=0; i= GM) { fprintf(stderr, "%s: pattern '%s' has no characters that were indexed: glimpse cannot search for it\n", HARVEST_PREFIX, GPattern); for (j=0; j 0) { dest_index_buf[num+1] = '\n'; if (!strncmp(dest_index_buf, "END", strlen("END"))) break; i = j = 0; while ((joffset = y; o->next = NULL; o->sign = o->done = 0; if (heado == NULL) { heado = o; tailo = o; } else { tailo->next = o; tailo = o; } if (dest_index_buf[i] == FILE_END_MARK) goto onemorey; src_offset_table[x] = heado; } /* printf("]\n"); */ num = readline(newsockfd, dest_index_buf, REAL_INDEX_BUF); } goto search_files; } /* * Copy the agrep-options that are relevant to index search into * index_argv (see man-pages for which options are relevant). * Also, adjust patindex whenever options are skipped over. * NOTE: agrep_argv does NOT contain two options after one '-'. 
*/ index_argc = 0; for (j=0; j= OneFilePerBlock) break; src_index_set[iii] |= mask_int[jjj]; } } else for(iii=0; iii OneFilePerBlock) num_blocks = OneFilePerBlock; /* roundoff */ } else { for (iii=0; iii search=%d optimize=%d times=%d all=%d blocks=%d len=%d pat=%s scope=%d\n", NOBYTELEVEL, OPTIMIZEBYTELEVEL, src_index_set[REAL_PARTITION - 2], src_index_set[REAL_PARTITION - 1], num_blocks, strlen(APattern), APattern, WHOLEFILESCOPE); #endif /*DEBUG*/ /* Based on contribution From ada@mail2.umu.se Fri Jul 12 01:56 MST 1996; Christer Holgersson, Sen. SysNet Mgr, Umea University/SUNET, Sweden */ if (BITFIELDFILE) { int i, len = -1, nextchar; FILE *fp; fp = fopen(bitfield_file, "r"); if (fp != NULL) { if (BITFIELDENDIAN >= 2) { /* is a BIG-ENDIAN 4B integer list of indexes of files in .glimpse_filenames (sparse set) */ if (BITFIELDLENGTH == 0) BITFIELDLENGTH = file_num; if (BITFIELDOFFSET > 0) fseek(fp, BITFIELDOFFSET, (long)0); if (OneFilePerBlock) { for (i=0; i 0) fseek(fp, BITFIELDOFFSET, (long)0); if (OneFilePerBlock) { for (i=0; i MAX_PARTITION: see io.c */ } i++; } fclose(fp); if (i <= 0) { fprintf(stderr, "Error in reading %d bytes from offset %d in bitfield file %s ... ignoring it\n", BITFIELDOFFSET, BITFIELDLENGTH, bitfield_file); /* ignore bitfield_file */ } else { /* intersect files in bitfield_file with those that were obtained after pattern search */ if (OneFilePerBlock) { for (i=0; i OneFilePerBlock) num_blocks = OneFilePerBlock; /* roundoff */ } else { for (iii=0; iii 0) ? OneFilePerBlock : GNumpartitions, (OneFilePerBlock > 0) ? "files" : "blocks"); if (num_blocks > 0) { char cc[8]; cc[0] = 'y'; #if !ISSERVER if (!GNOPROMPT) { fprintf(stderr, "Do you want to see the file names? 
(y/n)"); fgets(cc, 4, stdin); } #endif /*!ISSERVER*/ if (!SILENT && (cc[0] == 'y')) { if (PRINTAPPXFILEMATCH && Only_first && GPRINTFILENUMBER) { printf("BEGIN %d %d %d\n", bestmatcherrors, NOBYTELEVEL, OPTIMIZEBYTELEVEL); } for (jjj=0; jjj 0) && (jjj >= GLIMITOUTPUT)) break; if (ByteLevelIndex && !NOBYTELEVEL && (src_index_set[REAL_PARTITION - 1] != 1) && (src_offset_table[GFileIndex[jjj]] == NULL)) continue; if (GPRINTFILENUMBER) printf("%d", GFileIndex[jjj]); else printf("%s", GTextfiles[jjj]); if (PRINTAPPXFILEMATCH) { if (GCOUNT) { int n = 0; printf(": "); if (ByteLevelIndex && (src_offset_table != NULL)) { struct offsets *p1 = src_offset_table[GFileIndex[jjj]]; while (p1 != NULL) { n ++; p1 = p1->next; } } else n = 1; /* there is atleast 1 match */ printf("%d", n); } else { printf(" ["); if (ByteLevelIndex && (src_offset_table != NULL)) { struct offsets *p1 = src_offset_table[GFileIndex[jjj]]; while (p1 != NULL) { printf(" %d", p1->offset); p1 = p1->next; } } printf("]"); } } printf("\n"); } if (PRINTAPPXFILEMATCH && Only_first && GPRINTFILENUMBER) { printf("END\n"); } } } RETURN(0); } /* end of Only_first */ if (!OneFilePerBlock) searchpercent = num_blocks*100/GNumpartitions; else searchpercent = num_blocks * 100 / OneFilePerBlock; #if BG_DEBUG fprintf(debug, "searchpercent = %d, num_blocks = %d\n", searchpercent, num_blocks); #endif /*BG_DEBUG*/ #if !ISSERVER if (!GNOPROMPT && (searchpercent > MAX_SEARCH_PERCENT)) { char cc[8]; cc[0] = 'y'; fprintf(stderr, "Your query may search about %d%% of the total space! Continue? (y/n)", searchpercent); fgets(cc, 4, stdin); if (cc[0] != 'y') RETURN(0); } if (ByteLevelIndex && !RecordLevelIndex && (searchpercent > DEF_MAX_INDEX_PERCENT)) NOBYTELEVEL = 1; /* with RecordLevelIndex, I don't just want to stop collecting offsets just because searchpercent > .... 
*/ #endif /*!ISSERVER*/ } /* end of !MATCHFILE */ else { /* set up the right options for -F in index_argv/index_argc itself since they will no longer be used */ index_argc=0; strcpy(index_argv[0], GProgname); /* adding the -h option, which is safer for -F */ index_argc ++; index_argv[index_argc][0] = '-'; index_argv[index_argc][1] = 'h'; index_argv[index_argc][2] = '\0'; index_argc ++; /* new code: bgopal, Feb/8/94: deleted udi's code here */ j = 0; while (FileOpt[j] == '-') { j++; while ((FileOpt[j] != ' ') && (FileOpt[j] != '\0') && (FileOpt[j] != '\n')) { if (j >= MAX_ARGS - 1) { fprintf(stderr, "%s: too many options after -F: %s\n", GProgname, FileOpt); RETURN(usage()); } index_argv[index_argc][0] = '-'; index_argv[index_argc][1] = FileOpt[j]; index_argv[index_argc][2] = '\0'; index_argc ++; j++; } if ((FileOpt[j] == '\0') || (FileOpt[j] == '\n')) break; if ((FileOpt[j] == ' ') && (FileOpt[j-1] == '-')) { fprintf(stderr, "%s: illegal option: '-' after -F\n", GProgname); RETURN(usage()); } else if (FileOpt[j] == ' ') while(FileOpt[j] == ' ') j++; } while(FileOpt[j] == ' ') j++; fileopt_length = strlen(FileOpt); strncpy(index_argv[index_argc],FileOpt+j,fileopt_length-j); index_argv[index_argc][fileopt_length-j] = '\0'; index_argc++; my_free(FileOpt, MAXFILEOPT); FileOpt = NULL; #if BG_DEBUG fprintf(debug, "pattern to check with -F = %s\n",index_argv[index_argc-1]); #endif /*BG_DEBUG*/ #if DEBUG fprintf(stderr, "-F : "); for (jj=0; jj < index_argc; jj++) fprintf(stderr, " %s ",index_argv[jj]); fprintf(stderr, "\n"); #endif /*DEBUG*/ fflush(stdout); get_filenames(src_index_set, index_argc, index_argv, dummylen, dummypat, file_num); /* Assume #files per partitions is appx constant */ if (OneFilePerBlock) num_blocks = GNumfiles; else num_blocks = GNumfiles * GNumpartitions / p_table[GNumpartitions - 1]; if (Only_first) { /* search the index only */ fprintf(stderr, "There are matches to %d out of %d %s\n", num_blocks, (OneFilePerBlock > 0) ? 
OneFilePerBlock : GNumpartitions, (OneFilePerBlock > 0) ? "files" : "blocks"); if (num_blocks > 0) { char cc[8]; cc[0] = 'y'; #if !ISSERVER if (!GNOPROMPT) { fprintf(stderr, "Do you want to see the file names? (y/n)"); fgets(cc, 4, stdin); } #endif /*!ISSERVER*/ if (!SILENT && (cc[0] == 'y')) { if (PRINTAPPXFILEMATCH && Only_first && GPRINTFILENUMBER) { printf("BEGIN %d %d %d\n", bestmatcherrors, NOBYTELEVEL, OPTIMIZEBYTELEVEL); } for (jjj=0; jjj 0) && (jjj >= GLIMITOUTPUT)) break; if (ByteLevelIndex && !NOBYTELEVEL && (src_index_set[REAL_PARTITION - 1] != 1) && (src_offset_table[GFileIndex[jjj]] == NULL)) continue; if (GPRINTFILENUMBER) printf("%d", GFileIndex[jjj]); else printf("%s", GTextfiles[jjj]); if (PRINTAPPXFILEMATCH) { if (GCOUNT) { int n = 0; printf(": "); if (ByteLevelIndex && (src_offset_table != NULL)) { struct offsets *p1 = src_offset_table[GFileIndex[jjj]]; while (p1 != NULL) { n ++; p1 = p1->next; } } else n = 1; /* there is atleast 1 match */ printf("%d", n); } else { printf("["); if (ByteLevelIndex && (src_offset_table != NULL)) { struct offsets *p1 = src_offset_table[GFileIndex[jjj]]; while (p1 != NULL) { printf(" %d", p1->offset); p1 = p1->next; } } printf("]"); } } printf("\n"); } if (PRINTAPPXFILEMATCH && Only_first && GPRINTFILENUMBER) { printf("END\n"); } } } RETURN(0); } /* end of Only_first */ if (OneFilePerBlock) searchpercent = GNumfiles * 100 / OneFilePerBlock; else searchpercent = GNumfiles * 100 / p_table[GNumpartitions - 1]; #if BG_DEBUG fprintf(debug, "searchpercent = %d, num_files = %d\n", searchpercent, p_table[GNumpartitions - 1]); #endif /*BG_DEBUG*/ #if !ISSERVER if (!GNOPROMPT && (searchpercent > MAX_SEARCH_PERCENT)) { char cc[8]; cc[0] = 'y'; fprintf(stderr, "Your query may search about %d%% of the total space! Continue? 
(y/n)", searchpercent); fgets(cc, 4, stdin); if (cc[0] != 'y') RETURN(0); } if (ByteLevelIndex && !RecordLevelIndex && (searchpercent > DEF_MAX_INDEX_PERCENT)) NOBYTELEVEL = 1; /* with RecordLevelIndex, I don't just want to stop collecting offsets just because searchpercent > .... */ #endif /*!ISSERVER*/ } /* At this point, I have the set of files to search */ search_files: /* Replace -B by the number of errors if best-match */ if (GBESTMATCH && (my_B_index >= 0)) { sprintf(&agrep_argv[my_B_index][1], "%d", bestmatcherrors); #if BG_DEBUG fprintf(debug, "Changing -B to -%d\n", bestmatcherrors); #endif /*BG_DEBUG*/ } agrep_argv[my_M_index][1] = 'Z'; agrep_argv[my_P_index][1] = 'Z'; #if 0 for (iii=0; iii 0) { gnum_of_matched += ret; gfiles_matched ++; } SetCurrentFileName = 0; if (GPRINTFILETIME) SetCurrentFileTime = 0; if (GLIMITOUTPUT > 0) { if (GLIMITOUTPUT <= gnum_of_matched) break; LIMITOUTPUT = GLIMITOUTPUT - gnum_of_matched; } if (GLIMITTOTALFILE > 0) { if (GLIMITTOTALFILE <= gfiles_matched) break; LIMITTOTALFILE = GLIMITTOTALFILE - gfiles_matched; } if ((ret < 0) && (errno == AGREP_ERROR)) break; if (glimpse_clientdied) break; fflush(stdout); } } else { for (i=0; i index_stat_buf.st_mtime) { /* fprintf(stderr, "Warning: file modified after indexing: must SEARCH %s\n", CurrentFileName); */ free_list(&src_offset_table[GFileIndex[i]]); first_search = 1; if ((ret = fileagrep_search(AM, APattern, 1, &GTextfiles[i], 0, stdout)) > 0) { gnum_of_matched += ret; gfiles_matched ++; } } else if ((ret = glimpse_search(AM, APattern, GD_length, GD_pattern, GTextfiles[i], GTextfiles[i], GFileIndex[i], src_offset_table, stdout)) > 0) { gnum_of_matched += ret; gfiles_matched ++; } SetCurrentFileName = 0; if (GPRINTFILETIME) SetCurrentFileTime = 0; if (GLIMITOUTPUT > 0) { if (GLIMITOUTPUT <= gnum_of_matched) break; LIMITOUTPUT = GLIMITOUTPUT - gnum_of_matched; } if (GLIMITTOTALFILE > 0) { if (GLIMITTOTALFILE <= gfiles_matched) break; LIMITTOTALFILE = GLIMITTOTALFILE - 
gfiles_matched; } if ((ret < 0) && (errno == AGREP_ERROR)) break; if (glimpse_clientdied) break; fflush(stdout); } } } /* end of !UseFilters */ else { sprintf(outname[0], "%s/.glimpse_apply.%d", TEMP_DIR, getpid()); for (i=0; i index_stat_buf.st_mtime)) { first_search = 1; if ((ret = fileagrep_search(AM, APattern, 1, outname, 0, stdout)) > 0) { gnum_of_matched += ret; gfiles_matched ++; } } else { if (file_stat_buf.st_mtime > index_stat_buf.st_mtime) { /* fprintf(stderr, "Warning: file modified after indexing: must SEARCH %s\n", CurrentFileName); */ free_list(&src_offset_table[GFileIndex[i]]); first_search = 1; if ((ret = fileagrep_search(AM, APattern, 1, outname, 0, stdout)) > 0) { gnum_of_matched += ret; gfiles_matched ++; } } else if ((ret = glimpse_search(AM, APattern, GD_length, GD_pattern, GTextfiles[i], outname[0], GFileIndex[i], src_offset_table, stdout)) > 0) { gfiles_matched ++; gnum_of_matched += ret; } } SetCurrentFileName = 0; if (GPRINTFILETIME) SetCurrentFileTime = 0; } else { if (!ByteLevelIndex || RecordLevelIndex || NOBYTELEVEL) { first_search = 1; if ((ret = fileagrep_search(AM, APattern, 1, &GTextfiles[i], 0, stdout)) > 0) { gnum_of_matched += ret; gfiles_matched ++; } } else { SetCurrentFileName = 1; if (GPRINTFILENUMBER) sprintf(CurrentFileName, "%d", GFileIndex[i]); else strcpy(CurrentFileName, GTextfiles[i]); if (my_stat(GTextfiles[i], &file_stat_buf) == -1) { if (GPRINTNONEXISTENTFILE) printf("%s\n", CurrentFileName); continue; } if (GPRINTFILETIME) { SetCurrentFileTime = 1; CurrentFileTime = get_file_time(timesfp, &file_stat_buf, GTextfiles[i], GFileIndex[i]); } if (file_stat_buf.st_mtime > index_stat_buf.st_mtime) { /* fprintf(stderr, "Warning: file modified after indexing: must SEARCH %s\n", CurrentFileName); */ free_list(&src_offset_table[GFileIndex[i]]); first_search = 1; if ((ret = fileagrep_search(AM, APattern, 1, &GTextfiles[i], 0, stdout)) > 0) { gnum_of_matched += ret; gfiles_matched ++; } } else if ((ret = glimpse_search(AM, 
APattern, GD_length, GD_pattern, GTextfiles[i], GTextfiles[i], GFileIndex[i], src_offset_table, stdout)) > 0) { gnum_of_matched += ret; gfiles_matched ++; } SetCurrentFileName = 0; if (GPRINTFILETIME) SetCurrentFileTime = 0; } } if (GLIMITOUTPUT > 0) { if (GLIMITOUTPUT <= gnum_of_matched) break; LIMITOUTPUT = GLIMITOUTPUT - gnum_of_matched; } if (GLIMITTOTALFILE > 0) { if (GLIMITTOTALFILE <= gfiles_matched) break; LIMITTOTALFILE = GLIMITTOTALFILE - gfiles_matched; } if ((ret < 0) && (errno == AGREP_ERROR)) break; if (glimpse_clientdied) break; fflush(stdout); } } } /* end of WHOLEFILESCOPE <= 0 */ else { FILE *tmpfp = NULL; /* to store structured query-search output */ int OLDFILENAMEONLY;/* don't use FILENAMEONLY for agrepping the stuff: handle it in filtering */ int OLDLIMITOUTPUT; /* don't use LIMITs for search: only for filtering=identify_region(): agrep NEVER changes these 3 */ int OLDLIMITPERFILE; int OLDLIMITTOTALFILE; int OLDPRINTRECORD; /* don't use PRINTRECORD for search: only after filter_output() recognizes boolean in wholefilescope */ int OLDCOUNT; /* don't use OLDCOUNT for search: only after filter_output() recognizes boolean in wholefilescope */ if (!UseFilters) { for (i=0; i index_stat_buf.st_mtime) { /* fprintf(stderr, "Warning: file modified after indexing: must SEARCH %s\n", CurrentFileName); */ free_list(&src_offset_table[GFileIndex[i]]); first_search = 1; ret = fileagrep_search(AM, APattern, 1, &GTextfiles[i], 0, tmpfp); } else ret = glimpse_search(AM, APattern, GD_length, GD_pattern, GTextfiles[i], GTextfiles[i], GFileIndex[i], src_offset_table, tmpfp); } SetCurrentFileName = 0; if (GPRINTFILETIME) SetCurrentFileTime = 0; fflush(tmpfp); fclose(tmpfp); tmpfp = NULL; if ((ret < 0) && (errno == AGREP_ERROR)) break; #if DEBUG printf("done search\n"); fflush(stdout); #endif /*DEBUG*/ FILENAMEONLY = OLDFILENAMEONLY; LIMITOUTPUT = OLDLIMITOUTPUT; LIMITPERFILE = OLDLIMITPERFILE; LIMITTOTALFILE = OLDLIMITTOTALFILE; PRINTRECORD = OLDPRINTRECORD; COUNT = 
OLDCOUNT; if (!UseFilters) { ret = filter_output(GTextfiles[i], tempfile, GParse, GD_pattern, GD_length, GOUTTAIL, nullfp, StructuredIndex); } else { ret = filter_output(outname[0], tempfile, GParse, GD_pattern, GD_length, GOUTTAIL, nullfp, StructuredIndex); unlink(outname[0]); } gnum_of_matched += (ret > 0) ? ret : 0; gfiles_matched += (ret > 0) ? 1 : 0; if (GLIMITOUTPUT > 0) { if (GLIMITOUTPUT <= gnum_of_matched) break; LIMITOUTPUT = GLIMITOUTPUT - gnum_of_matched; } if (GLIMITTOTALFILE > 0) { if (GLIMITTOTALFILE <= gfiles_matched) break; LIMITTOTALFILE = GLIMITTOTALFILE - gfiles_matched; } if (glimpse_clientdied) break; fflush(stdout); } } else { /* we should try to apply the filter (we come here with -W -z, say) */ sprintf(outname[0], "%s/.glimpse_apply.%d", TEMP_DIR, getpid()); for (i=0; i index_stat_buf.st_mtime)) { first_search = 1; ret = fileagrep_search(AM, APattern, 1, outname, 0, tmpfp); } else { if (file_stat_buf.st_mtime > index_stat_buf.st_mtime) { /* fprintf(stderr, "Warning: file modified after indexing: must SEARCH %s\n", CurrentFileName); */ free_list(&src_offset_table[GFileIndex[i]]); first_search = 1; ret = fileagrep_search(AM, APattern, 1, outname, 0, tmpfp); } else ret = glimpse_search(AM, APattern, GD_length, GD_pattern, GTextfiles[i], outname[0], GFileIndex[i], src_offset_table, tmpfp); } } else { if (!ByteLevelIndex || RecordLevelIndex || NOBYTELEVEL) { if (GPRINTFILETIME) { SetCurrentFileTime = 1; CurrentFileTime = get_file_time(timesfp, NULL, GTextfiles[i], GFileIndex[i]); } first_search = 1; ret = fileagrep_search(AM, APattern, 1, &GTextfiles[i], 0, tmpfp); } else { if (my_stat(GTextfiles[i], &file_stat_buf) == -1) { if (GPRINTNONEXISTENTFILE) printf("%s\n", CurrentFileName); fclose(tmpfp); continue; } if (GPRINTFILETIME) { SetCurrentFileTime = 1; CurrentFileTime = get_file_time(timesfp, &file_stat_buf, GTextfiles[i], GFileIndex[i]); } if (file_stat_buf.st_mtime > index_stat_buf.st_mtime) { /* fprintf(stderr, "Warning: file modified after 
indexing: must SEARCH %s\n", CurrentFileName); */ free_list(&src_offset_table[GFileIndex[i]]); first_search = 1; ret = fileagrep_search(AM, APattern, 1, &GTextfiles[i], 0, tmpfp); } else ret = glimpse_search(AM, APattern, GD_length, GD_pattern, GTextfiles[i], GTextfiles[i], GFileIndex[i], src_offset_table, tmpfp); } } SetCurrentFileName = 0; if (GPRINTFILETIME) SetCurrentFileTime = 0; fflush(tmpfp); fclose(tmpfp); tmpfp = NULL; if ((ret < 0) && (errno == AGREP_ERROR)) break; #if DEBUG printf("done search\n"); fflush(stdout); #endif /*DEBUG*/ FILENAMEONLY = OLDFILENAMEONLY; LIMITOUTPUT = OLDLIMITOUTPUT; LIMITPERFILE = OLDLIMITPERFILE; LIMITTOTALFILE = OLDLIMITTOTALFILE; PRINTRECORD = OLDPRINTRECORD; COUNT = OLDCOUNT; if (!UseFilters) { /* Added to do structured queries from Webglimpse */ ret = filter_output(GTextfiles[i], tempfile, GParse, GD_pattern, GD_length, GOUTTAIL, nullfp, StructuredIndex); } else { ret = filter_output(outname[0], tempfile, GParse, GD_pattern, GD_length, GOUTTAIL, nullfp, StructuredIndex); } gnum_of_matched += (ret > 0) ? ret : 0; gfiles_matched += (ret > 0) ? 
1 : 0; if (GLIMITOUTPUT > 0) { if (GLIMITOUTPUT <= gnum_of_matched) break; LIMITOUTPUT = GLIMITOUTPUT - gnum_of_matched; } if (GLIMITTOTALFILE > 0) { if (GLIMITTOTALFILE <= gfiles_matched) break; LIMITTOTALFILE = GLIMITTOTALFILE - gfiles_matched; } if (glimpse_clientdied) break; fflush(stdout); } } } if (errno == AGREP_ERROR) { fprintf(stderr, "%s: error in options or arguments to `agrep'\n", HARVEST_PREFIX); } RETURN(gnum_of_matched); } else { /* argc > 0: simply call agrep */ #if DEBUG for (i=0; i 0) || (residue > 0)) { total_read += num_read; if (num_read <= 0) { final_end = filter_buf + residue; num_read = residue; residue = 0; } else { num_read += residue; final_end = (CHAR *)backward_delimiter(filter_buf + num_read, filter_buf, GD_pattern, GD_length, GOUTTAIL); residue = filter_buf + num_read - final_end; } #if DEBUG fprintf(stderr, "filter_buf=%x final_end=%x residue=%x last_chars=%c%c%c num_read=%x\n", filter_buf, final_end, residue, *(final_end-2), *(final_end-1), *(final_end), num_read); #endif /*DEBUG*/ current_begin = previous_begin = filter_buf; #if 1 current_end = (CHAR *)forward_delimiter(filter_buf, filter_buf + num_read, GD_pattern, GD_length, GOUTTAIL); /* skip over prefixes like filename */ if (!GOUTTAIL) current_end = (CHAR *)forward_delimiter((long)current_end + GD_length, final_end, GD_pattern, GD_length, GOUTTAIL); #else /*1*/ current_end = (CHAR *)forward_delimiter(filter_buf+1, final_end, GD_pattern, GD_length, GOUTTAIL); #endif /*1*/ #if DEBUG fprintf(stderr, "current_begin=%x current_end=%x\n", current_begin, current_end); #endif /*DEBUG*/ while (current_end <= final_end) { previous_begin = current_begin; /* look for %d= */ byteoff = -1; while (current_begin < current_end) { if (isdigit(*current_begin)) { skiplen = getbyteoff(current_begin, &byteoff); #if BG_DEBUG fprintf(debug, "byteoff=%d skiplen=%d\n", byteoff, skiplen); #endif /*BG_DEBUG*/ if ((skiplen < 0) || (byteoff < 0)) { current_begin ++; continue; } else break; } else 
current_begin ++; } #if DEBUG printf("current_begin=%x current_end=%x final_end=%x residue=%x num_read=%x\n", current_begin, current_end, final_end, residue, num_read); #endif /*DEBUG*/ #if DEBUG printf("byteoff=%d skiplen=%d\n", byteoff, skiplen); #endif /*DEBUG*/ if ((skiplen < 0) || (byteoff < 0)) { /* output the whole line as it is: there is nothing to strip (e.g., -l) */ #if 0 /* This is an error: -l is now handled completely inside filter_output --> agrep won't processes it when -W */ if (!SILENT) fwrite(previous_begin, 1, current_end-previous_begin, displayfp); numprinted ++; #endif } else if ( (num_attr <= 0) || (((attribute = region_identify(byteoff, 0)) < num_attr) && (attribute >= 0)) ) { /* prefix is from previous_begin to current_begin. Skip skiplen from current_begin. Rest until current_end is valid output */ if (num_attr <= 0) attribute = 0; #if BG_DEBUG fprintf(debug, "region@%d=%d\n", byteoff, attribute); #endif /*BG_DEBUG*/ c1 = *(current_begin + skiplen - 1); c2 = *(current_end + 1); printed = 0; for (i=0; i already_matched[%d] = %d, going to look at '%s'\n", i, matched_terminals[i], terminals[i].data.leaf.value); #endif if (matched_terminals[i] && (GFILENAMEONLY || FILEOUT || printed || ((LIMITOUTPUT > 0) && (numprinted >= LIMITOUTPUT)) || ((LIMITPERFILE > 0) && (numprinted >= LIMITPERFILE)))) continue; if ((terminals[i].data.leaf.attribute == 0) || ((int)(terminals[i].data.leaf.attribute) == attribute)) { *(current_begin + skiplen - 1) = '\n'; *(current_end + 1) = '\n'; OLDLIMITOUTPUT = LIMITOUTPUT; LIMITOUTPUT = 0; OLDLIMITPERFILE = LIMITPERFILE; LIMITPERFILE = 0; OLDLIMITTOTALFILE = LIMITTOTALFILE; LIMITTOTALFILE = 0; if (memagrep_search( strlen(terminals[i].data.leaf.value), terminals[i].data.leaf.value, current_end - current_begin - skiplen + 1, current_begin + skiplen - 1, 0, nullfp) > 0) { LIMITOUTPUT = OLDLIMITOUTPUT; LIMITPERFILE = OLDLIMITPERFILE; LIMITTOTALFILE = OLDLIMITTOTALFILE; #if 0 *(current_end + 1) = '\0'; printf("--> search 
succeeded for %s in %s\n", terminals[i].data.leaf.value, previous_begin); #endif /*0*/ *(current_begin + skiplen - 1) = c1; *(current_end + 1) = c2; matched_terminals[i] = 1; /* must reevaluate/set since don't know if it should be printed */ if (!(((LIMITOUTPUT > 0) && (numprinted >= LIMITOUTPUT)) || ((LIMITPERFILE > 0) && (numprinted >= LIMITPERFILE))) && !printed) { /* see if it was useful later */ if (!COUNT && !FILEOUT && !SILENT) { fwrite(previous_begin, 1, current_begin - previous_begin, displayfp); if (PRINTATTR) fprintf(displayfp, "%s# ", (attrname = attr_id_to_name(attribute)) == NULL ? "(null)" : attrname); if (GBYTECOUNT) fprintf(displayfp, "%d= ", byteoff); if (PRINTRECORD) { fwrite(current_begin + skiplen, 1, current_end - current_begin - skiplen, displayfp); } else { if (*(current_begin + skiplen) == '@') { int iii = 0; while (current_begin[skiplen + iii] != '}') fputc(current_begin[skiplen + iii++], displayfp); fputc('}', displayfp); } fputc('\n', displayfp); } } printed = 1; numprinted ++; } } else { LIMITOUTPUT = OLDLIMITOUTPUT; LIMITPERFILE = OLDLIMITPERFILE; LIMITTOTALFILE = OLDLIMITTOTALFILE; #if 0 *(current_end + 1) = '\0'; printf("--> search failed for %s in %s\n", terminals[i].data.leaf.value, previous_begin); #endif /*0*/ *(current_begin + skiplen - 1) = c1; *(current_end + 1) = c2; } } } if (!success) { if (ComplexBoolean) { success = eval_tree(GParse, matched_terminals); } else { if ((long)GParse & AND_EXP) { success = 0; for (ii=0; ii= num_terminals) success = 1; } else { success = 0; /* cannot come to filter_output in this case unless -a! 
*/ } } } /* optimize options that do not need all the matched lines */ if (success) { if (GFILENAMEONLY) { if (GPRINTFILETIME) { /* from bug fix message by Dr Jaime Prilusky lsprilus@weizmann.weizmann.ac.il jaimep@terminator.pdb.bnl.gov */ if (!SILENT) fprintf(stdout, "%s%s\n", CurrentFileName, aprint_file_time(CurrentFileTime)); } else { if (!SILENT) fprintf(stdout, "%s\n", CurrentFileName); } if (storefp != NULL) fclose(storefp); /* don't bother to flush! */ storefp = NULL; goto unlink_and_quit; } else if (FILEOUT) { /* file_out(infile); */ if (storefp != NULL) fclose(storefp); /* don't bother to flush! */ storefp = NULL; goto unlink_and_quit; } } } if (success && (((LIMITOUTPUT > 0) && (numprinted >= LIMITOUTPUT)) || ((LIMITPERFILE > 0) && (numprinted >= LIMITPERFILE)))) goto double_break; if (glimpse_clientdied) goto double_break; if (current_end >= final_end) break; current_begin = current_end; if (!GOUTTAIL) current_end = (CHAR *)forward_delimiter((long)current_end + GD_length, final_end, GD_pattern, GD_length, GOUTTAIL); else current_end = (CHAR *)forward_delimiter(current_end, final_end, GD_pattern, GD_length, GOUTTAIL); #if DEBUG fprintf(stderr, "current_begin=%x current_end=%x\n", current_begin, current_end); #endif /*DEBUG*/ } if (residue > 0) { memcpy(filter_buf, final_end, residue); memcpy(filter_buf+residue, GD_pattern, GD_length); } } double_break: /* Come here on normal exit or when the current agrep-output is no longer of any use */ if (!success && (total_read > 0)) { if (ComplexBoolean) { success = eval_tree(GParse, matched_terminals); } else { if ((long)GParse & AND_EXP) { success = 0; for (ii=0; ii= num_terminals) success = 1; } else { success = 0; /* cannot come to filter_output in this case unless -a! 
*/ } } } /* Print the temporary output onto stdout if search was successful; unlink the temporary file */ if (success) { if (GFILENAMEONLY) { /* all other output options are useless since they all deal with the MATCHED line */ if (GPRINTFILETIME) { /* from bug fix message by Dr Jaime Prilusky lsprilus@weizmann.weizmann.ac.il jaimep@terminator.pdb.bnl.gov */ if (!SILENT) fprintf(stdout, "%s%s\n", CurrentFileName, aprint_file_time(CurrentFileTime)); } else { if (!SILENT) fprintf(stdout, "%s\n", CurrentFileName); } if (!SILENT) fprintf(stdout, "%s\n", CurrentFileName); if (storefp != NULL) fclose(storefp); /* don't bother to flush! */ storefp = NULL; } else if (COUNT && !FILEOUT) { if (!SILENT) { if(!NOFILENAME) fprintf(stdout, "%s: %d\n", CurrentFileName, numprinted); else fprintf(stdout, "%d\n", numprinted); } if (storefp != NULL) fclose(storefp); /* don't bother to flush! */ storefp = NULL; } else if (FILEOUT) { /* file_out(infile); */ if (storefp != NULL) fclose(storefp); /* don't bother to flush! 
*/ storefp = NULL; } else if (storefp != NULL) { fflush(storefp); fclose(storefp); #if DEBUG printf("STOREOUTPUT\n"); sprintf(s, "exec cat %s/.glimpse_storeoutput.%d\n", TEMP_DIR, getpid()); system(s); #endif /*DEBUG*/ sprintf(s, "%s/.glimpse_storeoutput.%d", TEMP_DIR, getpid()); if ((storefp = fopen(s, "r")) != NULL) { if (!SILENT) while (fgets(s, MAX_LINE_LEN, storefp) != NULL) fputs(s, stdout); fclose(storefp); } storefp = NULL; } } else { if (storefp != NULL) fclose(storefp); /* else don't bother to flush */ } unlink_and_quit: sprintf(s, "%s/.glimpse_storeoutput.%d", TEMP_DIR, getpid()); unlink(s); if (StructuredIndex) region_destroy(); fclose(outfp); if (GFILENAMEONLY) { if (numprinted > 0) return 1; else return 0; } else if (ComplexBoolean || ((long)GParse & AND_EXP)) { if (success) return numprinted; else return 0; } else { /* must be -a */ return numprinted; } } usage() { fprintf(stderr, "\nThis is glimpse version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE); fprintf(stderr, "usage: %s [-#abceghijklnprstwxyBCDEGIMNPQSVWZ] [-d DEL] [-f FILE] [-F PAT] [-H DIR] [-J HOST] [-K PORT] [-L X[:Y:Z]] [-R X] [-T DIR] [-Y D] pattern [files]", GProgname); fprintf(stderr, "\n"); fprintf(stderr, "List of options (see %s for more details):\n", GLIMPSE_URL); fprintf(stderr, "\n"); fprintf(stderr, "-#: find matches with at most # errors\n"); fprintf(stderr, "-a: print attribute names (useful only for Harvest SOIF format)\n"); fprintf(stderr, "-b: print the byte offset of the record from the beginning of the file\n"); fprintf(stderr, "-B: best match mode: find the closest matches to the pattern\n"); fprintf(stderr, "-c: output the number of matched records\n"); fprintf(stderr, "-C: send queries to glimpseserver\n"); fprintf(stderr, "-d DEL: define record delimiter DEL\n"); fprintf(stderr, "-D x: adjust the cost of deletions to x\n"); fprintf(stderr, "-e: for patterns starting with -\n"); fprintf(stderr, "-E: print matching lines as they appear in the index (useful in 
-EQNg)\n"); fprintf(stderr, "-f FILE: restrict the search to files whose names appear in FILE\n"); fprintf(stderr, "-F PAT: restrict the search to files matching PAT\n"); fprintf(stderr, "-g: print the file number (in the index)\n"); fprintf(stderr, "-G: output the (whole) files that contain a match\n"); fprintf(stderr, "-h: do not output file names before matched record\n"); fprintf(stderr, "-H DIR: the glimpse index is located in directory DIR\n"); fprintf(stderr, "-i: case-insensitive search, e.g., 'a' = 'A'\n"); fprintf(stderr, "-I x: adjust the cost of insertions to x\n"); fprintf(stderr, "-j: output file modification dates (if -t was used for the indexing)\n"); fprintf(stderr, "-J HOST: send queries to glimpseserver at HOST\n"); fprintf(stderr, "-k: use the pattern as is (no meta characters)\n"); fprintf(stderr, "-K PORT: send queries to glimpseserver at TCP port number PORT\n"); fprintf(stderr, "-l: output only the names of files that contain a match\n"); fprintf(stderr, "-L X[:Y:Z]: limit the output to X records [Y files, Z matches per file]\n"); fprintf(stderr, "-n: output record prefixed by record number\n"); fprintf(stderr, "-N: search only the index (may not be precise for some queries) \n"); fprintf(stderr, "-o: delimiter is output at the beginning of the matched record\n"); fprintf(stderr, "-O: file names are printed only once per file\n"); fprintf(stderr, "-p FILE:off:len:endian: restrict the search to the files whose names\n\tare specified as a bit-field OR sparse-set in FILE\n"); fprintf(stderr, "-P: print the pattern that matched before the matched record\n"); fprintf(stderr, "-q: print the offsets of the beginning and end of each matched record\n"); fprintf(stderr, "-Q: (with -N) print offsets of matches from (only the large) index\n"); fprintf(stderr, "-r: (used only for agrep) - recursive search\n"); fprintf(stderr, "-R X: set the maximum size of a record to X\n"); fprintf(stderr, "-s: display nothing except error messages\n"); fprintf(stderr, 
"-S x: adjust the cost of substitutions to x\n"); fprintf(stderr, "-t: use in combination with -d DEL so that the delimiter DEL appears\n\tat the end of each output record instead of the beginning\n"); fprintf(stderr, "-T DIR: temporary files are put in directory DIR (instead of /tmp)\n"); fprintf(stderr, "-u: do not output matched records (useful in -qbug)\n"); fprintf(stderr, "-U: interpret .glimpse_filenames when -U / -X option is used in glimpseindex\n"); fprintf(stderr, "-v: (works ONLY for agrep) - output all records that do not contain a match\n"); fprintf(stderr, "-V,--version: print the current version of glimpse\n"); fprintf(stderr, "-w: pattern has to match as a word, e.g., 'win' will not match 'wind'\n"); fprintf(stderr, "-W: the scope of Booleans is the whole file (except for structured queries)\n"); fprintf(stderr, "-x: the pattern must match the whole line\n"); fprintf(stderr, "-X: if an indexed file that matches 'pattern' doesn't exist, print its name\n"); fprintf(stderr, "-y: no prompt\n"); fprintf(stderr, "-Y D: output only matches in files that were updated in the last D days\n"); fprintf(stderr, "-z: customizable filtering using the .glimpse_filters file\n"); fprintf(stderr, "-Z: no op\n"); fprintf(stderr, "--help: this message\n"); fprintf(stderr, "\n"); fprintf(stderr, "For questions about glimpse, please contact: `%s'\n", GLIMPSE_EMAIL); return -1; /* useful if we make glimpse into a library */ /* * Undocumented Option Combinations for SFS (like RPC calls) * print file number of match instead of file name: -g * print enclosing offsets of matched record: -q * NOT print matched record: -u * E.G. USAGE: -qbug (b prints offset of pattern: can also use -lg or -Ng) * look only at index: -E * look at matched offsets in files as seen in index (w/o searching): -QN * E.G. 
USAGE: -EQNgy
	 * read the -EQNg or just -QNg output from stdin and perform actual search w/o
	 * searching the index (take hints from user): -U
	 * NOTE: can't use U unless QNg are all used together (e.g., BEGIN/END won't be printed)
	 */
}

usageS()
{
	fprintf(stderr, "\nThis is glimpse server version %s, %s.\n\n", GLIMPSE_VERSION, GLIMPSE_DATE);
	fprintf(stderr, "usage: %s [-H DIR] [-J HOST] [-K PORT]", GProgname);
	fprintf(stderr, "\n");
	fprintf(stderr, "-H DIR: the glimpse index is located in directory DIR\n");
	fprintf(stderr, "-J HOST: the host name (string) clients must use / server runs on\n");
	fprintf(stderr, "-K PORT: the port (short integer) clients must use / server runs on\n");
	fprintf(stderr, "\n");
	fprintf(stderr, "For questions about glimpse, please contact `%s'\n", GLIMPSE_EMAIL);

	return -1;	/* useful if we make glimpse into a library */
}

#if CLIENTSERVER
/*
 * do_select() - based on select_loop() from the Harvest Broker.
 * -- Courtesy: Darren Hardy, hardy@cs.colorado.edu
 */
int
do_select(sock, sec)
	int sock;	/* the socket to wait for */
	int sec;	/* the number of seconds to wait */
{
	struct timeval to;
	fd_set qready;
	int err;

	if (sock < 0 || sec < 0)
		return 0;

	FD_ZERO(&qready);
	FD_SET(sock, &qready);
	to.tv_sec = sec;
	to.tv_usec = 0;
	if ((err = select(sock + 1, &qready, NULL, NULL, &to)) < 0) {
		if (errno == EINTR)
			return 0;
		perror("select");
		return -1;
	}
	if (err == 0)
		return 0;

	/* If there's someone waiting to get it, let them through */
	return (FD_ISSET(sock, &qready) ? 1 : 0);
}
#endif /* CLIENTSERVER */
glimpse-4.18.7/mkinstalldirs000077500000000000000000000012131300371307100160510ustar00rootroot00000000000000
#! /bin/sh
# mkinstalldirs --- make directory hierarchy
# Author: Noah Friedman
# Created: 1993-05-16
# Last modified: 1994-03-25
# Public domain

errstatus=0

for file in ${1+"$@"} ; do
   set fnord `echo ":$file" | sed -ne 's/^:\//#/;s/^://;s/\// /g;s/^#/\//;p'`
   shift

   pathcomp=
   for d in ${1+"$@"} ; do
     pathcomp="$pathcomp$d"
     case "$pathcomp" in
       -* ) pathcomp=./$pathcomp ;;
     esac

     if test ! -d "$pathcomp"; then
        echo "mkdir $pathcomp" 1>&2
        mkdir "$pathcomp" || errstatus=$?
     fi

     pathcomp="$pathcomp/"
   done
done

exit $errstatus

# mkinstalldirs ends here
glimpse-4.18.7/split.c000066400000000000000000000473661300371307100145650ustar00rootroot00000000000000
/* Copyright (c) 1994 Burra Gopal, Udi Manber. All Rights Reserved. */

#include "glimpse.h"

extern CHAR *getword();
extern int checksg();
extern int D;
extern CHAR GProgname[MAXNAME];
extern FILE *debug;
extern int StructuredIndex;
extern int WHOLEFILESCOPE;
extern int ByteLevelIndex;
extern int ComplexBoolean;
extern int foundattr;
extern int foundnot;

/* returns where it found the distinguishing token: until that from prev value of begin is the current pattern (not just the "words" in it) */
CHAR *
parse_flat(begin, end, prev, next)
	CHAR *begin;
	CHAR *end;
	int prev;
	int *next;
{
	if (begin > end) {
		*next = prev;
		return end;
	}

	if (prev & ENDSUB_EXP) prev &= ~ATTR_EXP;
	if ((prev & ATTR_EXP) && !(prev & VAL_EXP)) prev |= VAL_EXP;

	while (begin <= end) {
		if (*begin == ',') {
			prev |= OR_EXP;
			prev |= VAL_EXP;
			prev |= ENDSUB_EXP;
			if (prev & AND_EXP) {
				fprintf(stderr, "%s: parse error at character '%c'\n", GProgname, *begin);
				return NULL;
			}
			*next = prev;
			return begin;
		}
		else if (*begin == ';') {
			prev |= AND_EXP;
			prev |= VAL_EXP;
			prev |= ENDSUB_EXP;
			if (prev & OR_EXP) {
				fprintf(stderr, "%s: parse error at character '%c'\n", GProgname, *begin);
				return NULL;
			}
			*next = prev;
			return begin;
		}
		else if (*begin == '=') {
			if (StructuredIndex <= 0) begin++;	/* don't care about = since just another character */
			else {
				if (prev & ATTR_EXP) {
					fprintf(stderr, "%s: syntax error: only ',' and ';' can follow 'attribute=value'\n", GProgname);
					return NULL;
				}
				prev |= ATTR_EXP;	/* remains an ATTR_EXP until a new ',' OR ';' */
				prev &= ~VAL_EXP;
				*next = prev;
				return begin;
			}
		}
		else if (*begin == '\\') begin ++;	/* skip two things */
		begin++;
	}

	*next = prev;
	return begin;
}

int
split_pattern_flat(GPattern, GM, APattern, terminals, pnum_terminals, pGParse, num_attr)
	CHAR *GPattern;
	int GM;
	CHAR *APattern;
	ParseTree terminals[];
	int *pnum_terminals;
	int *pGParse;	/* doesn't interpret it as a tree */
	int num_attr;
{
	int j, k = 0, l = 0, len = 0;
	int current_attr;
	CHAR *buffer;
	CHAR *buffer_pat;
	CHAR *buffer_end;
	char tempbuf[MAX_LINE_LEN];

	memset(APattern, '\0', MAXPAT);
	buffer = GPattern;
	buffer_end = buffer + GM;
	j=0;
	*pGParse = 0;
	current_attr = 0;
	foundattr = 0;

	/*
	 * buffer is the running pointer, buffer_pat is the place where
	 * the distinguishing delimiter was found, buffer_end is the end.
	 */
	while (buffer_pat = parse_flat(buffer, buffer_end, *pGParse, pGParse)) {
		/* there is no pattern until after the distinguishing delimiter position: some agrep garbage */
		if (buffer_pat <= buffer) {
			buffer = buffer_pat+1;
			if (buffer_pat >= buffer_end) break;
			continue;
		}
		if ((*pGParse & ATTR_EXP) && !(*pGParse & VAL_EXP)) {	/* fresh attribute */
			foundattr=1;
			memcpy(tempbuf, buffer, buffer_pat - buffer);
			tempbuf[buffer_pat - buffer] = '\0';
			len = strlen(tempbuf);
			for (k = 0; k= num_attr)) {
				buffer[buffer_pat - buffer] = '\0';
				fprintf(stderr, "%s: unknown attribute name '%s'\n", GProgname, buffer);
				return -1;
			}
			buffer = buffer_pat+1;	/* immediate next character after distinguishing delimiter */
			if (buffer_pat >= buffer_end) break;
			continue;
		}
		else {	/* attribute's value OR raw-value */
			if (*pnum_terminals >= MAXNUM_PAT) {
				fprintf(stderr, "%s: boolean expression has too many terms\n", GProgname);
				return -1;
			}
			terminals[*pnum_terminals].op = 0;
			terminals[*pnum_terminals].type = LEAF;
			terminals[*pnum_terminals].terminalindex = *pnum_terminals;
			terminals[*pnum_terminals].data.leaf.attribute = current_attr;	/* default is no structure */
			terminals[*pnum_terminals].data.leaf.value = (CHAR *)malloc(buffer_pat - buffer + 2);
			memcpy(terminals[*pnum_terminals].data.leaf.value, buffer, buffer_pat - buffer);	/* without distinguishing delimiter */
			terminals[*pnum_terminals].data.leaf.value[buffer_pat - buffer] = '\0';
			if (foundattr || WHOLEFILESCOPE) {
				memcpy(&APattern[j], buffer, buffer_pat - buffer);
				j += buffer_pat - buffer;	/* NOT including the distinguishing delimiter at buffer_pat, or '\0' */
				APattern[j++] = (*(buffer_pat + 1) == '\0' ? '\0' : ',');	/* always search for OR, do filtering at the end */
#if BG_DEBUG
				fprintf(debug, "current_attr = %d, val = %s\n", current_attr, terminals[*pnum_terminals].data.leaf.value);
#endif /*BG_DEBUG*/
			}
			else {
				memcpy(&APattern[j], buffer, buffer_pat + 1 - buffer);
				j += buffer_pat + 1 - buffer;	/* including the distinguishing delimiter at buffer_pat, or '\0' */
			}
			(*pnum_terminals)++;
		}
		if (*pGParse & ENDSUB_EXP) current_attr = 0;	/* remains 0 until next fresh attribute */
		if (buffer_pat >= buffer_end) break;
		buffer = buffer_pat+1;
	}
	if (buffer_pat == NULL) return -1;	/* got out of while loop because of NULL rather than break */
	APattern[j] = '\0';

	if (foundattr || WHOLEFILESCOPE) {	/* then search must always be OR since scope is over whole files */
		for (j=0; APattern[j] != '\0'; j++) {
			if (APattern[j] == '\\') j++;
			else if (APattern[j] == ';') APattern[j] = ',';
		}
	}

	return(*pnum_terminals);
}

extern int is_complex_boolean();	/* use the one in agrep/asplit.c */
extern int get_token_bool();	/* use the one in agrep/asplit.c */

/* Spaces ARE significant: 'a1=v1' and 'a1=v1 ' and 'a1 =v1' etc.
are NOT identical */ int get_attribute_value(pattr, pval, tokenbuf, tokenlen, num_attr) int *pattr, tokenlen; CHAR **pval, *tokenbuf; { CHAR tempbuf[MAXNAME]; int i = 0, j = 0, k = 0, l = 0; while (i < tokenlen) { if (tokenbuf[i] == '\\') { tempbuf[j++] = tokenbuf[i++]; tempbuf[j++] = tokenbuf[i++]; } else if (StructuredIndex) { if (tokenbuf[i] == '=') { i++; /* skip over = : now @ 1st char of value */ tempbuf[j] = '\0'; for (k=0; k= num_attr) ) { /* named a non-existent attribute */ fprintf(stderr, "%s: unknown attribute name '%s'\n", GProgname, tempbuf); return 0; } *pval = (CHAR *)malloc(tokenlen - i + 2); memcpy(*pval, &tokenbuf[i], tokenlen - i); (*pval)[tokenlen - i] = '\0'; foundattr = 1; return 1; } else tempbuf[j++] = tokenbuf[i++]; /* consider = as just another char */ } else tempbuf[j++] = tokenbuf[i++]; /* no attribute parsing */ } /* Not a structured expression */ tempbuf[j] = '\0'; *pval = (CHAR *)malloc(j + 2); memcpy(*pval, tempbuf, j); (*pval)[j] = '\0'; return 1; } extern destroy_tree(); /* use the one in agrep/asplit.c */ /* * Recursive descent; C-style => AND + OR have equal priority => must bracketize expressions appropriately or will go left->right. * Also strips out attribute names since agrep doesn't understand them: copies resulting pattern for agrep-ing into apattern. * Grammar: * E = {E} | ~a | ~{E} | E ; E | E , E | a * Parser: * One look ahead at each literal will tell you what to do. * ~ has highest priority, ; and , have equal priority (left to right associativity), ~~ is not allowed. 
*/ ParseTree * parse_tree(buffer, len, bufptr, apattern, apatptr, terminals, pnum_terminals, num_attr) CHAR *buffer; int len; int *bufptr; CHAR *apattern; int *apatptr; ParseTree terminals[]; int *pnum_terminals; int num_attr; { int token, tokenlen; CHAR tokenbuf[MAXNAME]; int oldtokenlen; CHAR oldtokenbuf[MAXNAME]; ParseTree *t, *n, *leftn; token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen); switch(token) { case '{': /* (exp) */ apattern[(*apatptr)++] = '{'; if ((t = parse_tree(buffer, len, bufptr, apattern, apatptr, terminals, pnum_terminals, num_attr)) == NULL) return NULL; if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) != '}') { fprintf(stderr, "%s: parse error at offset %d\n", GProgname, *bufptr); destroy_tree(t); return (NULL); } apattern[(*apatptr)++] = '}'; if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) == 'e') return t; switch(token) { /* must find boolean infix operator */ case ',': case ';': apattern[(*apatptr)++] = token; leftn = t; if ((t = parse_tree(buffer, len, bufptr, apattern, apatptr, terminals, pnum_terminals, num_attr)) == NULL) return NULL; n = (ParseTree *)malloc(sizeof(ParseTree)); n->op = (token == ';') ? 
ANDPAT : ORPAT ; n->type = INTERNAL; n->data.internal.left = leftn; n->data.internal.right = t; return n; /* or end of parent sub expression */ case '}': unget_token_bool(bufptr, tokenlen); /* part of someone else who called me */ return t; default: destroy_tree(t); fprintf(stderr, "%s: parse error at offset %d\n", GProgname, *bufptr); return NULL; } /* Go one level deeper */ case '~': /* not exp */ foundnot = 1; apattern[(*apatptr)++] = '~'; if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) == 'e') return NULL; switch(token) { case 'a': if (*pnum_terminals >= MAXNUM_PAT) { fprintf(stderr, "%s: pattern expression too long (> %d terms)\n", GProgname, MAXNUM_PAT); return NULL; } n = &terminals[*pnum_terminals]; n->op = 0; n->type = LEAF; n->terminalindex = (*pnum_terminals); n->data.leaf.value = NULL; n->data.leaf.attribute = 0; if (!get_attribute_value((int *)&n->data.leaf.attribute, &n->data.leaf.value, tokenbuf, tokenlen, num_attr)) return NULL; strcpy(&apattern[*apatptr], n->data.leaf.value); *apatptr += strlen(n->data.leaf.value); (*pnum_terminals)++; n->op |= NOTPAT; t = n; break; case '{': apattern[(*apatptr)++] = token; if ((t = parse_tree(buffer, len, bufptr, apattern, apatptr, terminals, pnum_terminals, num_attr)) == NULL) return NULL; if (t->op & NOTPAT) t->op &= ~NOTPAT; else t->op |= NOTPAT; if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) != '}') { fprintf(stderr, "%s: parse error at offset %d\n", GProgname, *bufptr); destroy_tree(t); return NULL; } apattern[(*apatptr)++] = '}'; break; default: fprintf(stderr, "%s: parse error at offset %d\n", GProgname, *bufptr); return NULL; } /* The resulting tree is in t. 
Now do another lookahead at this level */
		if ((token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen)) == 'e') return t;
		switch(token) {
		/* must find boolean infix operator */
		case ',':
		case ';':
			apattern[(*apatptr)++] = token;
			leftn = t;
			if ((t = parse_tree(buffer, len, bufptr, apattern, apatptr, terminals, pnum_terminals, num_attr)) == NULL) return NULL;
			n = (ParseTree *)malloc(sizeof(ParseTree));
			n->op = (token == ';') ? ANDPAT : ORPAT ;
			n->type = INTERNAL;
			n->data.internal.left = leftn;
			n->data.internal.right = t;
			return n;

		case '}':
			unget_token_bool(bufptr, tokenlen);
			return t;

		default:
			destroy_tree(t);
			fprintf(stderr, "%s: parse error at offset %d\n", GProgname, *bufptr);
			return NULL;
		}

	case 'a':	/* individual term (attr=val) */
		if (tokenlen == 0) return NULL;
		memcpy(oldtokenbuf, tokenbuf, tokenlen);
		oldtokenlen = tokenlen;
		oldtokenbuf[oldtokenlen] = '\0';
		token = get_token_bool(buffer, len, bufptr, tokenbuf, &tokenlen);
		switch(token) {
		case '}':	/* part of case '{' above: else syntax error not detected but semantics ok */
			unget_token_bool(bufptr, tokenlen);
			/* fall through */
		case 'e':	/* end of input */
		case ',':
		case ';':
			if (*pnum_terminals >= MAXNUM_PAT) {
				fprintf(stderr, "%s: pattern expression too long (> %d terms)\n", GProgname, MAXNUM_PAT);
				return NULL;
			}
			n = &terminals[*pnum_terminals];
			n->op = 0;
			n->type = LEAF;
			n->terminalindex = (*pnum_terminals);
			n->data.leaf.value = NULL;
			n->data.leaf.attribute = 0;
			if (!get_attribute_value((int *)&n->data.leaf.attribute, &n->data.leaf.value, oldtokenbuf, oldtokenlen, num_attr)) return NULL;
			strcpy(&apattern[*apatptr], n->data.leaf.value);
			*apatptr += strlen(n->data.leaf.value);
			(*pnum_terminals)++;
			if ((token == 'e') || (token == '}')) return n;	/* nothing after terminal in expression */
			leftn = n;
			apattern[(*apatptr)++] = token;
			if ((t = parse_tree(buffer, len, bufptr, apattern, apatptr, terminals, pnum_terminals, num_attr)) == NULL) return NULL;
			n = (ParseTree *)malloc(sizeof(ParseTree));
			n->op = (token == ';') ?
ANDPAT : ORPAT ; n->type = INTERNAL; n->data.internal.left = leftn; n->data.internal.right = t; return n; default: fprintf(stderr, "%s: parse error at offset %d\n", GProgname, *bufptr); return NULL; } case 'e': /* can't happen as I always do a lookahead above and return current tree if e */ default: fprintf(stderr, "%s: parse error at offset %d\n", GProgname, *bufptr); return NULL; } } int split_pattern(GPattern, GM, APattern, terminals, pnum_terminals, pGParse, num_attr) CHAR *GPattern; int GM; CHAR *APattern; ParseTree terminals[]; int *pnum_terminals; ParseTree **pGParse; int num_attr; { int bufptr = 0, apatptr = 0, ret, i, j; foundattr = 0; if (is_complex_boolean(GPattern, GM)) { ComplexBoolean = 1; *pnum_terminals = 0; if ((*pGParse = parse_tree(GPattern, GM, &bufptr, APattern, &apatptr, terminals, pnum_terminals, num_attr)) == NULL) return -1; /* print_tree(*pGParse, 0); */ APattern[apatptr] = '\0'; if (foundattr || WHOLEFILESCOPE) { /* Search in agrep must always be OR since scope is whole file */ int i, j; for (i=0; i= buffer_end) break; continue; } if ((type = checksg(word, D, 0)) == -1) return -1; if (!type && ComplexBoolean) { fprintf(stderr, "%s: query has complex patterns (like '.*') or options (like -n)\n... cannot search for arbitrary booleans\n", GProgname); return -1; } #if 0 DISABLED IN GLIMPSE NOW SINCE MGREP HANDLES DUPLICATES -- IT WAS ALWAYS ABLE TO HANDLE SUPERSTRINGS/SUBSTRINGS...: bgopal, Nov 19, 1996 if (type) { /* Check if superstring: if so, ditch word */ for (i=0; i= buffer_end) break; continue; } /* Check if substring: delete all superstrings */ for (i=0; i= buffer_end) break; if(num_pat >= MAXNUM_PAT) { fprintf(stderr, "%s: Warning! 
too many words in pattern (> %d): ignoring...\n", GProgname, MAXNUM_PAT); break; } } } for (i=0; i>"$ERROR_LOG"
    echo "Exit code: ${GLIMPSEINDEX_EXIT_CODE}" >>"$ERROR_LOG"
    echo "Error output: ${GLIMPSEINDEX_OUTPUT}" >>"$ERROR_LOG"
    echo "" >>"$ERROR_LOG"
fi

#----------------------------------------------
# Analyse indexing results
#----------------------------------------------
GREP_INDEX=`grep "glimpse" .glimpse_index`
if [ -f ".glimpse_index" -a -n "$GREP_INDEX" ]; then
    echo "ok"
else
    echo "fail"
fi

#----------------------------------------------
# Perform boolean search using generated db
#----------------------------------------------
echo ""
echo "run test 2 [2 of 2]"
echo -en "search test... "
GLIMPSE_OUTPUT=`../bin/glimpse -c -h -i -y -H . 'test;suite'`
GLIMPSE_EXIT_CODE="$?"
if [ "$GLIMPSE_EXIT_CODE" -ne "0" ]; then
    echo "Error occurred when running command: ../bin/glimpse -c -h -i -y -H . 'test;suite'" >>"$ERROR_LOG"
    echo "Exit code: ${GLIMPSE_EXIT_CODE}" >>"$ERROR_LOG"
    echo "Error output: ${GLIMPSE_OUTPUT}" >>"$ERROR_LOG"
    echo "" >>"$ERROR_LOG"
fi

#----------------------------------------------
# Analyse search results
#----------------------------------------------
if [ "$GLIMPSE_OUTPUT" = "1" ]; then
    echo "ok"
else
    echo "fail"
fi

#----------------------------------------------
# Clean up
#----------------------------------------------
rm -f .glimpse_*
cd ..
echo ""
echo "Done"
glimpse-4.18.7/test/test.txt000066400000000000000000000000311300371307100157370ustar00rootroot00000000000000
Basic glimpse test suite.