apt-xapian-index-0.47ubuntu13/0000755000000000000000000000000013073450556013120 5ustar apt-xapian-index-0.47ubuntu13/data/0000755000000000000000000000000013070503156014021 5ustar apt-xapian-index-0.47ubuntu13/data/org.debian.AptXapianIndex.conf0000644000000000000000000000130113070503156021547 0ustar apt-xapian-index-0.47ubuntu13/data/org.debian.aptxapianindex.policy0000644000000000000000000000141713070503156022271 0ustar AptXapianIndex http://www.enricozini.org/sw/apt-xapian-index/ Update the xapian index System policy prevents updating xapian index no yes apt-xapian-index-0.47ubuntu13/data/org.debian.AptXapianIndex.service0000644000000000000000000000016713070503156022273 0ustar [D-BUS Service] Name=org.debian.AptXapianIndex Exec=/usr/share/apt-xapian-index/update-apt-xapian-index-dbus User=root apt-xapian-index-0.47ubuntu13/runtests0000755000000000000000000000076513070503156014735 0ustar #!/bin/sh export AXI_PLUGIN_DIR=`pwd`/plugins export AXI_DB_PATH=`pwd`/testdb export AXI_CACHE_PATH=`pwd`/testdb export PYTHONPATH="$PYTHONPATH:`pwd`" case "$1" in --cov) echo "Run with code coverage..." 
shift nosetests -w test --with-coverage --cover-package axi --cover-html --cover-html-dir=test-coverage "$@" echo "Coverage information found at file://`pwd`/test/test-coverage" ;; --clean) rm -rf test/test-coverage test/.coverage testdb ;; *) nosetests -w test "$@" ;; esac apt-xapian-index-0.47ubuntu13/ACKNOWLEDGEMENTS0000644000000000000000000000014613070503156015366 0ustar Enrico's work in versions 0.31 to 0.35 has been sponsored by the Fuss Project: http://www.fuss.bz.it/ apt-xapian-index-0.47ubuntu13/testrun0000755000000000000000000000017613070503156014546 0ustar #!/bin/sh export AXI_PLUGIN_DIR=plugins export AXI_DB_PATH=testdb export AXI_CACHE_PATH=testdb ./update-apt-xapian-index "$@" apt-xapian-index-0.47ubuntu13/debian/0000755000000000000000000000000013073450556014342 5ustar apt-xapian-index-0.47ubuntu13/debian/manpages0000644000000000000000000000004613070503156016050 0ustar update-apt-xapian-index.8 axi-cache.1 apt-xapian-index-0.47ubuntu13/debian/postrm0000755000000000000000000000042513070503156015605 0ustar #!/bin/sh set -e if [ "$1" = "remove" -o "$1" = "purge" ]; then echo "Removing index /var/lib/apt-xapian-index..." rm -rf /var/lib/apt-xapian-index fi if [ "$1" = "remove" -o "$1" = "purge" ]; then rm -f /usr/share/apt-xapian-index/plugins/*.pyc fi #DEBHELPER# exit 0 apt-xapian-index-0.47ubuntu13/debian/docs0000644000000000000000000000003013070503156015176 0ustar README ACKNOWLEDGEMENTS apt-xapian-index-0.47ubuntu13/debian/copyright0000644000000000000000000000252513070503156016271 0ustar apt-xapian-index was packaged by Enrico Zini on Mon Oct 15 13:58:29 BST 2007 Upstream author: Enrico Zini Copyright (C) 2007 Enrico Zini License: This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. On Debian systems, the complete text of the GNU General Public License can be found in /usr/share/common-licenses/GPL file. The Debian packaging is licensed under the GNU General Public License version 2 or above as well. Some example files are licensed under the terms of the Do What The Fuck You Want To Public License, Version 2: This program is free software. It comes without any warranty, to the extent permitted by applicable law. You can redistribute it and/or modify it under the terms of the Do What The Fuck You Want To Public License, Version 2, as published by Sam Hocevar. See http://sam.zoy.org/wtfpl/COPYING for more details. apt-xapian-index-0.47ubuntu13/debian/bash-completion0000644000000000000000000000002713070503156017340 0ustar axi-cache.sh axi-cache apt-xapian-index-0.47ubuntu13/debian/changelog0000644000000000000000000007163013073450556016223 0ustar apt-xapian-index (0.47ubuntu13) zesty; urgency=medium * d/control: Change the dependency on python3-xapian to a versioned one so that on upgrades the package will be installed and python3-xapian1.3 will be removed. (LP: #1680090) -- Brian Murray Wed, 12 Apr 2017 09:10:22 -0700 apt-xapian-index (0.47ubuntu12) zesty; urgency=medium * d/control: Drop the explicit Depends/Build-Depends on python3-xapian1.3 since that was a development snapshot we only used in our port to Python 3. Straight up python3-xapian (1.4) has everything we need. 
(LP: #1660216) -- Barry Warsaw Mon, 03 Apr 2017 13:29:18 -0400 apt-xapian-index (0.47ubuntu11) yakkety; urgency=medium * debian/patches/05_python3.patch: - properly apply the change by Carlo Vanini, previous upload was rubbish -- Harald Sitter Mon, 15 Aug 2016 14:39:37 +0200 apt-xapian-index (0.47ubuntu10) yakkety; urgency=medium * debian/patches/05_python3.patch: - string.split has been removed in python3 - open subprocess stdout as text stream - Thanks to Carlo Vanini for the patch. -- Brian Murray Tue, 09 Aug 2016 16:43:22 -0700 apt-xapian-index (0.47ubuntu9) yakkety; urgency=medium * debian/patches/07_glib_import.patch: Fix the import of GLib and GObject. (LP: #1579834) -- Carlo Vanini Fri, 22 Jul 2016 13:25:11 -0700 apt-xapian-index (0.47ubuntu8) xenial; urgency=medium * debian/patches/05_python3.patch: del the indexer instance explicitly before sys.exit(), so that it gets decrefed and freed before Python's implicit shutdown machinery can potentially leave an empty os module for ServerProgress.__del__() to find. (LP: #1530518) -- Barry Warsaw Wed, 13 Apr 2016 17:36:22 -0400 apt-xapian-index (0.47ubuntu7) xenial; urgency=medium * debian/patches/04_bilingual.patch: Make some Python 2-only constructs in update-apt-xapian-index-dbus bilingual, and add a missing import. (LP: #1541407) -- Barry Warsaw Wed, 03 Feb 2016 09:54:17 -0500 apt-xapian-index (0.47ubuntu6) xenial; urgency=medium * debian/patches/06_32bit_sizes.patch: Watch out for package sizes that don't fit in 32 bits. (LP: #1527745) -- Barry Warsaw Fri, 18 Dec 2015 16:47:43 -0500 apt-xapian-index (0.47ubuntu5) xenial; urgency=medium * debian/patches/04_bilingual.patch: When using --force --update on update-apt-xapian-index, be sure package names passed to python3-apt are strings, not bytes. 
(LP: #1526267) -- Barry Warsaw Wed, 16 Dec 2015 14:12:44 -0500 apt-xapian-index (0.47ubuntu4) xenial; urgency=medium * debian/patches/04_bilingual.patch: Be sure to open files passed to pickle.{load,dump}() in binary mode. -- Barry Warsaw Tue, 15 Dec 2015 16:23:18 -0500 apt-xapian-index (0.47ubuntu3) xenial; urgency=medium * debian/patches: - 04_bilingual.patch: Port the code to work in Python 2.7, 3.4, and 3.5. - 05_python3.patch: Switch the default to Python 3. (LP: #1516688) * debian/compat: Bump to version 9. * debian/control: - Bump debhelper version to >= 9. - Switch dependencies to their Python 3 version. - Use X-Python3-Version instead. * debian/rules: - Use --with=python3 and --build-system=pybuild - Add override_dh_python3 to force /usr/bin/python3 shebang. - Run the tests in an override_dh_auto_test target. -- Barry Warsaw Tue, 24 Nov 2015 11:13:01 -0500 apt-xapian-index (0.47ubuntu2) xenial; urgency=medium * debian/patches/03_stick_to_main.patch: Use libdbus-1-dev instead of libept-dev as a test package since the latter is not in main, which will cause FTBFS in Ubuntu. -- Barry Warsaw Mon, 23 Nov 2015 17:29:00 -0500 apt-xapian-index (0.47ubuntu1) xenial; urgency=low * Merge from Debian unstable. Remaining changes: - Prefer native packages over foreign packages. - Do not crash if the DB is already locked. (LP: #590998) - Do not modify the DB "in-place" with --update - debian/patches/01_axi_cjk_support.patch: - Activate the CJK support when indexing the database. - debian/patches/02_axi-pkgname-mangled-term.patch: - add XPM term that contains a mangled version of the pkgname. - debian/postinst: - Do not build the DB in the postinst. - debian/rules: - quilt with dh --with python2 - plugins/app-install.py: - Ignore file not found errors due to a race condition. (LP: #752195) - Restore the D-Bus API since software-center is not yet removed from Ubuntu: - Re-add the update-apt-xapian-index-dbus script. 
- Re-add the data directory containing the D-Bus configuration files. - Re-add the d/dirs entries for the D-Bus files. - Re-add the d/rules to install the D-Bus files. - Remove the d/preinst, edit the d/postinst and d/postrm scripts since we don't want to remove the D-Bus conffile. * Dropped: - debian/patches/04_catch_invalid_desktop_file.patch: no longer needed - debian/patches/03_policykit_translations.patch: Applied directly to Ubuntu copy of D-Bus files. - debian/CVE-2013-1064.patch: Applied directly to the Ubuntu copy. -- Barry Warsaw Mon, 23 Nov 2015 10:56:00 -0500 apt-xapian-index (0.47) unstable; urgency=low [ Enrico Zini ] * s/UNRELEASED/unstable/ in 0.46 changelog. Closes: #719940 * Removed dbus support files, not needed anymore since software-center has been removed from sid and testing. Closes: #724837 * Ported to dh-python2 * Updated Standards-Version, no changes required. [ Elena Grandi ] * Use defaults when values file is broken. Closes: #736500 -- Enrico Zini Sun, 24 Aug 2014 10:44:58 -0700 apt-xapian-index (0.46ubuntu1) utopic; urgency=low * Merge from Debian unstable. 
Remaining changes: - prefer native packages over foreign packages - do not crash if the DB is already locked (LP: #590998) - do not modify the DB "in-place" with --update - debian/patches/01_axi_cjk_support.patch: - Activate the CJK support when indexing the database - debian/patches/02_axi-pkgname-mangled-term.patch: - add XPM term that contains a mangled version of the pkgname - debian/patches/03_policykit_translations.patch: - remove underscores from description - debian/patches/CVE-2013-1064.patch: + pass system-bus-name as a subject - debian/postinst: - do not build the DB in the postinst - debian/rules: - quilt with dh --with python2 - plugins/app-install.py: - ignore file not found errors due to a race condition (LP: #752195) * Drop: - debian/patches/04_catch_invalid_desktop_file.patch: no longer needed -- Michael Vogt Tue, 03 Jun 2014 21:29:13 +0200 apt-xapian-index (0.46) unstable; urgency=low * Fixed schrödinbug in app-install-data locale language parsing * Handle pyxdg ParsingErrors. Thanks Thomas Kluyver for the patch. Closes: #714594 * Don't touch COMP_WORDBREAKS in bash completion, thanks Jonathan Nieder for the patch. Closes: #711876. * Fixed errors in package description, thanks Chad Dunlap for the patch. Closes: #684787. * Also trap UnicodeDecodeError among pyxdg's possible exceptions. Closes: #695073. -- Enrico Zini Fri, 23 Nov 2012 21:46:48 +0100 apt-xapian-index (0.45ubuntu4) trusty; urgency=medium * Rebuild to drop files installed into /usr/share/pyshared. -- Matthias Klose Sun, 23 Feb 2014 13:46:10 +0000 apt-xapian-index (0.45ubuntu3) saucy; urgency=low * SECURITY UPDATE: possible privilege escalation via policykit UID lookup race. - debian/patches/CVE-2013-1064.patch: pass system-bus-name as a subject instead of pid so policykit can get the information from the system bus in update-apt-xapian-index-dbus. 
- CVE-2013-1064 -- Marc Deslauriers Wed, 18 Sep 2013 12:40:20 -0400 apt-xapian-index (0.45ubuntu2) raring; urgency=low * debian/patches/01_axi_cjk_support.patch: - updated for the new version, fix the build [ Thomas Kluyver ] * debian/patches/04_catch_invalid_desktop_file.patch: Catch an exception from parsing invalid .desktop files for app-install. -- Sebastien Bacher Tue, 27 Nov 2012 19:59:39 +0100 apt-xapian-index (0.45ubuntu1) raring; urgency=low * Merge from Debian unstable. Remaining changes: - prefer native packages over foreign packages - do not crash if the DB is already locked (LP: #590998) - do not modify the DB "in-place" with --update - debian/patches/01_axi_cjk_support.patch: - Activate the CJK support when indexing the database - Don't call update-python-modules in the postinst. LP: #856627. - fix spelling errors - add XPM term that contains a mangled version of the pkgname - debian/patches/03_policykit_translations.patch: - remove underscores from description -- Michael Vogt Tue, 27 Nov 2012 16:40:11 +0100 apt-xapian-index (0.45) unstable; urgency=low * Gracefully deal with plugins failing in their info() function. * Fixed typo in exception name. Closes: #639143. -- Enrico Zini Wed, 28 Dec 2011 14:34:39 +0100 apt-xapian-index (0.44ubuntu8) raring; urgency=low * debian/patches/03_policykit_translations.patch: remove underscores from description and message tags in the policykit policy, since there are no translations currently being done for apt-xapian-index and this causes error messages to be spewed when the policy file gets parsed. 
-- Mathieu Trudel-Lapierre Wed, 31 Oct 2012 10:24:35 +0100 apt-xapian-index (0.44ubuntu7) quantal; urgency=low * add XPM term that contains a mangled version of the pkgname where all "-" are replaced with "_" to workaround that the queryparser considers "-" a special char -- Michael Vogt Fri, 17 Aug 2012 13:35:36 +0200 apt-xapian-index (0.44ubuntu6) quantal; urgency=low * Fixed two Spelling errors (allows to changed to allows one to) (maintan changed to maintain) in the debian/control file -- Chad Dunlap Mon, 13 Aug 2012 13:46:19 -0400 apt-xapian-index (0.44ubuntu5) precise; urgency=low * Rebuild to drop python2.6 dependencies. -- Matthias Klose Sat, 31 Dec 2011 02:00:43 +0000 apt-xapian-index (0.44ubuntu4) oneiric; urgency=low * Don't call update-python-modules in the postinst. LP: #856627. -- Matthias Klose Sat, 24 Sep 2011 14:42:18 +0200 apt-xapian-index (0.44ubuntu3) oneiric; urgency=low * axi/indexer.py: - prefer native packages over foreign packages (unless there is no native version or the foreign package is installed) (LP: #830508) -- Michael Vogt Wed, 21 Sep 2011 13:48:47 +0200 apt-xapian-index (0.44ubuntu2) oneiric; urgency=low * debian/patches/01_axi_cjk_support.patch: - Activate the CJK support when indexing the database * debian/control, debian/rules: - add quilt for above patch -- Didier Roche Thu, 01 Sep 2011 18:26:04 +0200 apt-xapian-index (0.44ubuntu1) oneiric; urgency=low * Merge from debian unstable. Remaining changes: - do not crash if the DB is already locked (LP: #590998) - move to dh_python2 - do not modify the DB "in-place" with --update * ignore file not found errors when checking for the mtime (LP: #752195), thanks to Brian Murray -- Michael Vogt Tue, 26 Jul 2011 09:02:49 +0200 apt-xapian-index (0.44) unstable; urgency=low * Applied trailing comma patch by Cédric Boutillier. Closes: #630774 * Applied function prototype fix by Cédric Boutillier. Closes: #605376, #630343 * Applied dbus policy patch by Michael Vogt. 
Closes: #611268 * s/axi-search/axi-cache in the description, thanks to Jack Bates. Closes: #621035 * axi-cache: improve output messages when no results are found or when paging past the last result. Closes: #623478 * Applied slightly tweaked postinst patch by Michael Vogt, to index with ionice in postinst and to force a full reindex after some significant version upgrade. Closes: #603824 * Applied patch by Michael Vogt to fix type of "start-time" for policykit. Closes: #605591 * Implemented axi-cache search --all. Closes: #610051 * Fixed server locking and slave progress reporting. Closes: #611366 -- Enrico Zini Wed, 22 Jun 2011 00:50:25 +0200 apt-xapian-index (0.43ubuntu1) oneiric; urgency=low * Merge from debian unstable. Remaining changes: - when upgrading, ensure the index is fully rebuild (in the background) to ensure that we get updated information in /var/lib/apt-xapian-index/{index.values} and that the index fully utilizes the new plugins (LP: #646018) - use ionice for the index building - do not crash if the DB is already locked (LP: #590998) - data/org.debian.AptXapianIndex.conf: fix policy - move to dh_python2 - update-apt-xapian-index-dbus: + fix type of "start-time" for policykit (LP: #675533) -- Michael Vogt Fri, 17 Jun 2011 10:51:30 +0200 apt-xapian-index (0.43) unstable; urgency=low * Implemented axi-cache info. Closes: #602600 * Added relations plugin to index package relations. * Implemented axi-cache rdepends using the index. * Implemented axi-cache rdetails to show details of reverse dependencies. * Changed apt.progress.text.OpProgress subclass signature to match API.
Closes: #628560 -- Enrico Zini Thu, 02 Jun 2011 22:59:31 +0200 apt-xapian-index (0.42) unstable; urgency=low * Added ruby examples, thanks to Daniel Brumbaugh Keeney -- Enrico Zini Wed, 09 Mar 2011 14:08:34 +0000 apt-xapian-index (0.41ubuntu7) oneiric; urgency=low * switch from python-support to dh_python2 -- Michael Vogt Fri, 10 Jun 2011 13:22:42 +0200 apt-xapian-index (0.41ubuntu6) natty; urgency=low * debian/cron.weekly: - do not modify the DB "in-place" with --update to avoid software-center seeing a corrupted database when it has it open at the same time -- Michael Vogt Mon, 18 Apr 2011 11:15:27 +0200 apt-xapian-index (0.41ubuntu5) natty; urgency=low * update-apt-xapian-index: - do not crash if the DB is already locked, thanks to Martin Schaaf (LP: #590998) -- Michael Vogt Fri, 28 Jan 2011 14:29:21 +0100 apt-xapian-index (0.41ubuntu4) natty; urgency=low * data/org.debian.AptXapianIndex.conf: - fix typo -- Michael Vogt Thu, 27 Jan 2011 15:57:50 +0100 apt-xapian-index (0.41ubuntu3) natty; urgency=low * data/org.debian.AptXapianIndex.conf: - update policy to avoid warning in software-center on startup -- Michael Vogt Thu, 27 Jan 2011 15:39:54 +0100 apt-xapian-index (0.41ubuntu2) natty; urgency=low * update-apt-xapian-index-dbus: - fix type of "start-time" for policykit (LP: #675533) -- Michael Vogt Wed, 01 Dec 2010 17:45:52 +0100 apt-xapian-index (0.41ubuntu1) natty; urgency=low * Merge from debian unstable. Remaining changes: - when upgrading, ensure the index is fully rebuild (in the background) to ensure that we get updated information in /var/lib/apt-xapian-index/{index.values} and that the index fully utilizes the new plugins (LP: #646018) - use ionice for the index building -- Michael Vogt Wed, 17 Nov 2010 17:50:20 +0100 apt-xapian-index (0.41) unstable; urgency=low * Fixed typo in dbus config * Fixed DeprecationWarning in set_sort_by_value. Closes: #601880. * Reset cached sort value if --sort is not provided for search. Closes: #601881.
-- Enrico Zini Sat, 06 Nov 2010 11:25:07 +0000 apt-xapian-index (0.40) unstable; urgency=low * Xapian cache moved to /var/cache/apt-xapian-index. Closes: #594675. -- Enrico Zini Sun, 03 Oct 2010 11:39:28 +0100 apt-xapian-index (0.39ubuntu1) maverick; urgency=low * debian/postinst: - when upgrading, ensure the index is fully rebuild (in the background) to ensure that we get updated information in /var/lib/apt-xapian-index/{index.values} and that the index fully utilizes the new plugins (LP: #646018) -- Michael Vogt Thu, 23 Sep 2010 16:04:11 +0200 apt-xapian-index (0.39) unstable; urgency=low [ Enrico Zini ] * Fixed tests on Ubuntu (Thanks Michael Vogt for the patch) * Added cataloged-times plugin by Michael Vogt * Handle multiple invocations of indexer via dbus * Described axi-cache in the package description [ Colin Watson ] * Fix more crashes when the Dir::Cache::pkgcache file doesn't exist, along the same lines as Martin Pitt's change in 0.38 (LP: #267330). * Use the new MSetItem attribute API, introduced in Xapian 1.0.0, rather than the sequence API which was removed in 1.1.0 (closes: #595916). * Create the XDG cache directory with appropriate permissions if it doesn't exist. * Set better NAME sections in manual pages. -- Colin Watson Fri, 10 Sep 2010 13:10:42 +0100 apt-xapian-index (0.38) unstable; urgency=low [ Martin Pitt ] * plugins/apttags.py, AptTags.info(): If the Dir::Cache::pkgcache file does not exist (such as in our test suite, or simply if the system disables it), do not crash but return timestamp == 0, as per documentation. (LP: #267330) * plugins/descriptions.py, indexDeb822(): Fix KeyError on "Description" when running in a non-English locale. Instead, look for a translated key and index that one. * axi/indexer.py, setupIndexing(): Round timestamps when comparing them. This fixes the test suite failing on almost-but-not-quite-identical timestamps. * debian/rules: Run the test suite during build. 
Add the necessary python libs (-debian, -xapian, -apt, and -nose) as build dependencies. -- Enrico Zini Mon, 21 Jun 2010 13:39:11 +0100 apt-xapian-index (0.37) unstable; urgency=low * Move #DEBHELPER# at the beginning of postinst, otherwise update-python-modules -p doesn't seem to always work. Closes: #581811 -- Enrico Zini Mon, 24 May 2010 17:31:08 +0100 apt-xapian-index (0.36) unstable; urgency=low * Do not use ionice in cron job inside virtual environments. Patch by Raoul Bhatia. Closes: #581930. * Removed leftover debugging print when python-xdg is not available. Closes: #581906 * Do not require a password for a simple update-apt-xapian-index run via dbus. Patch from Ubuntu by Michael Vogt. Closes: #582428. -- Enrico Zini Sun, 23 May 2010 14:39:02 +0100 apt-xapian-index (0.35) unstable; urgency=low * Tolerate (and if --verbose, report) .desktop file with invalid popcon fields * Added missing import. Closes: #581736 * Run update-python-modules -p before updating the index in postinst. 
Closes: #581811 -- Enrico Zini Sun, 16 May 2010 09:33:58 +0100 apt-xapian-index (0.34) unstable; urgency=low * Added aliases plugin, to feed synonyms to the index * Tolerate older versions of python-debian * Give a nicer error message if run with not enough permissions * Added acknowledgements file mentioning sponsorship by the Fuss project -- Enrico Zini Thu, 13 May 2010 14:19:43 +0100 apt-xapian-index (0.33) unstable; urgency=low * Added missing import, fixing indexing of multilanguage descriptions -- Enrico Zini Wed, 12 May 2010 22:31:31 +0100 apt-xapian-index (0.32) unstable; urgency=low * Tolerate plugins' init functions that do not expect any arguments -- Enrico Zini Wed, 12 May 2010 21:42:43 +0100 apt-xapian-index (0.31) unstable; urgency=low [ David Paleino ] * debian/rules: set COLUMNS envvar when calling help2man (Closes: #577525) * debian/cron.weekly: - pass --update to update-apt-xapian-index, try to be less invasive during background runs (LP: #363695) - don't run the indexer when on battery power * axi-cache, update-xapian-index: update version number [ Axel Rutz ] * debian/cron.weekly: - inserted missing space in 'ionice -c 3' - give the indexer maximum niceness (LP: #363695) [ Enrico Zini ] * Switch to debhelper7 and distutils * Reorganised code in modules * Added a test suite * Added indexer for app-install .desktop file information -- Enrico Zini Wed, 12 May 2010 21:42:36 +0100 apt-xapian-index (0.30) unstable; urgency=low [ Enrico Zini ] * axi-cache: fix behaviour of again with no parameters * axi-cache: AND terms by default instead of OR * axi-cache: remove AND, OR and NOT from partial expressions when providing tab completion candidates * axi-cache: suggest tags from a preset list of facets when completing "axi-cache search " * axi-cache: when completing "axi-cache again " suggest context-sensitive terms from the previous "search" query * axi-cache: implemented showpkg * axi-cache: implemented showsrc * axi-cache: implemented depends, rdepends,
policy, madison [ David Paleino ] * axi-cache.sh: move common "else" clause out of the last case..esac -- David Paleino Mon, 12 Apr 2010 18:37:53 +0200 apt-xapian-index (0.29) unstable; urgency=low * axi-cache: don't die horribly if a package exists in a-x-i but not in apt -- Enrico Zini Sun, 11 Apr 2010 09:21:41 +0200 apt-xapian-index (0.28) unstable; urgency=low * Added Homepage: field * Implemented axi-cache show and David provided its completion * Allow to run via dbus (Thanks to Michael Vogt) -- Enrico Zini Sat, 10 Apr 2010 17:51:06 +0200 apt-xapian-index (0.27) unstable; urgency=low [ Enrico Zini ] * Added axi-cache to search the index * Add spellchecking information to the database [ David Paleino ] * Added axi-cache.sh bash-completion snippet * Install bash-completion snippet using dh_bash-completion * Added myself to Uploaders -- David Paleino Fri, 09 Apr 2010 23:19:15 +0200 apt-xapian-index (0.26) unstable; urgency=low * Use the new module name for python-debian. Closes: #573935 -- Enrico Zini Fri, 19 Mar 2010 21:00:08 +0000 apt-xapian-index (0.25) unstable; urgency=low * Upgrade to the new python-apt API. Thanks Julian Andres Klode for the patch. Closes: #572052 -- Enrico Zini Mon, 01 Mar 2010 11:56:55 +0000 apt-xapian-index (0.24) unstable; urgency=low * Fixed deprecation warnings, thanks to Matt Kraai. Closes: #570219 * Fixed more deprecation warnings. * Disable all warnings when run with --quiet. Not sure it is a good idea, not sure it is a bad idea either. -- Enrico Zini Sun, 28 Feb 2010 23:28:45 +0000 apt-xapian-index (0.23) unstable; urgency=low * Applied patch by Matt Kraai. Closes: #547074 * Better output for initial runs. Closes: #547126 * Check for ionice before using it. Closes: #562078 * Updated Standards-version. No changes needed. -- Enrico Zini Sun, 14 Feb 2010 18:51:21 +0100 apt-xapian-index (0.22) unstable; urgency=low * Applied patch by Michael Vogt. Closes: #536857. 
- Run the weekly update with nice and ionice - Fix bug in --updates when it is selecting the wrong index -- Enrico Zini Wed, 15 Jul 2009 11:44:24 +0100 apt-xapian-index (0.21) unstable; urgency=low * Applied patch from Michael Vogt to implement a --update option that performs an incremental update of only those packages whose version has changed * Use "auto path" instead of "flint path" in the Xapian link file, so that we can transparently handle new formats in the future * Added "exit 0" at the end of maintainer scripts -- Enrico Zini Tue, 07 Jul 2009 16:28:48 +0100 apt-xapian-index (0.20) unstable; urgency=low * Ported to use the new method names from python-debian. It seems to have become trendy to rename methods in libraries. I dread to think what will happen when it will become trendy to migrate to python 3. Closes: #526587. * Bumped Standards-Version, no changes needed * Depends on ${misc:Depends} -- Enrico Zini Sun, 03 May 2009 21:07:16 +0100 apt-xapian-index (0.19) unstable; urgency=low * Ported to use the Version class from new python-apt. * Depend on fixed version of new python-apt. Closes: #521346, #523737, #523747. -- Enrico Zini Mon, 13 Apr 2009 20:44:41 +0100 apt-xapian-index (0.18) unstable; urgency=low * Run the background update niced. * Check policy-rc.d to see if it should start update-apt-xapian-index in the background. Closes: #516728. -- Enrico Zini Tue, 24 Feb 2009 15:08:57 +0000 apt-xapian-index (0.17) unstable; urgency=medium * Work around python-apt bug #513315 until they fix it. Closes: #515791. -- Enrico Zini Sat, 21 Feb 2009 11:08:06 +0000 apt-xapian-index (0.16) unstable; urgency=low * Create /var/lib/apt-xapian-index/ in a way that is free of race conditions. Closes: #506766. * Fix cron job to be quiet if the package is removed but not purged. Closes: #502607. * Simplified searchcloud example. 
-- Enrico Zini Sat, 24 Jan 2009 10:45:05 +0000 apt-xapian-index (0.15) unstable; urgency=low * Properly implement detection of concurrent updates * If a concurrent update is detected, its status will be printed * Use a xapian stub database instead of a symlink -- Enrico Zini Thu, 07 Aug 2008 12:28:41 +0200 apt-xapian-index (0.14) unstable; urgency=low * Added --batch-mode to produce machine-readable output, to use for GUI feedback in package managers. Thanks to Petr Rockai. * Reformatted debian/copyright * Shortened short description -- Enrico Zini Thu, 31 Jul 2008 14:57:32 +0200 apt-xapian-index (0.13) unstable; urgency=low * Upload properly. -- Enrico Zini Sun, 20 Jul 2008 07:55:50 +0100 apt-xapian-index (0.12) unstable; urgency=low * Added axi-searchcloud.py example * Fix some lintian warnings * In postinst, build the index in background if it has never been done before -- Enrico Zini Sat, 19 Jul 2008 21:24:16 +0100 apt-xapian-index (0.11) unstable; urgency=low * Use -f when removing data on purge, to avoid complaining if it is missing. -- Enrico Zini Sun, 06 Jul 2008 17:48:18 +0200 apt-xapian-index (0.10) unstable; urgency=low * Applied patch from Michael Vogt. Closes: #487677. -- Enrico Zini Sun, 06 Jul 2008 16:58:20 +0200 apt-xapian-index (0.9) unstable; urgency=low * Remove the index on remove as well as on purge. * Properly unquote language names from translation files, and ignore translation records with strange Description fields. Closes: #472953 -- Enrico Zini Fri, 20 Jun 2008 09:49:48 +0100 apt-xapian-index (0.8) unstable; urgency=low * Fixed translations plugin to handle when there are no translations in the system. Closes: #471957 -- Enrico Zini Fri, 21 Mar 2008 19:22:09 +0800 apt-xapian-index (0.7) unstable; urgency=low * Remove *.pyc from the plugin directory on remove or purge. 
Closes: #467193 * Added plugin to index sections * Added plugin to index translated descriptions -- Enrico Zini Mon, 17 Mar 2008 13:02:42 +0000 apt-xapian-index (0.6) unstable; urgency=low * Added --pkgfile option to index arbitrary Package files instead of the APT cache. -- Enrico Zini Tue, 19 Feb 2008 11:19:08 +0000 apt-xapian-index (0.5) unstable; urgency=low * Updated XS-Vcs* fields to the new location in git under collab-maint * Cron job does not fail if the package is removed. Closes: #461571 * Fixed apttags.py plugin -- Enrico Zini Thu, 07 Feb 2008 11:28:14 +0000 apt-xapian-index (0.4) unstable; urgency=low * Added versioned dependency on python-xapian. Closes: #447382. * Added examples from my blog posts -- Enrico Zini Mon, 22 Oct 2007 15:34:38 +0100 apt-xapian-index (0.3) experimental; urgency=low * Install the examples properly -- Enrico Zini Tue, 16 Oct 2007 22:47:04 +0100 apt-xapian-index (0.2) experimental; urgency=low * Added license headers * Added examples -- Enrico Zini Tue, 16 Oct 2007 22:39:27 +0100 apt-xapian-index (0.1) experimental; urgency=low * Initial release. * This package will replace ept-cache reindex for the task of maintaining a system-wide index of Debian package metadata. -- Enrico Zini Tue, 16 Oct 2007 12:20:43 +0100 apt-xapian-index-0.47ubuntu13/debian/cron.weekly0000644000000000000000000000133213070503156016514 0ustar #!/bin/sh CMD=/usr/sbin/update-apt-xapian-index # ionice should not be called in a virtual environment # (similar to man-db cronjobs) egrep -q '(envID|VxID):.*[1-9]' /proc/self/status || IONICE=/usr/bin/ionice # Check if we're on battery if which on_ac_power >/dev/null 2>&1; then on_ac_power >/dev/null 2>&1 ON_BATTERY=$? # Here we use "-eq 1" instead of "-ne 0" because # on_ac_power could also return 255, which means # it can't tell whether we are on AC or not. In # that case, run update-a-x-i nevertheless. 
[ "$ON_BATTERY" -eq 1 ] && exit 0 fi # Rebuild the index if [ -x "$CMD" ] then if [ -x "$IONICE" ] then nice -n 19 $IONICE -c 3 $CMD --quiet else nice -n 19 $CMD --quiet fi fi apt-xapian-index-0.47ubuntu13/debian/vercheck0000755000000000000000000000116313070503156016053 0ustar #!/bin/sh VERSION_SRC1=`grep ^VERSION= update-apt-xapian-index | sed -re 's/^.+"([^"]+)".*/\1/'` VERSION_SRC2=`grep ^VERSION= axi-cache | sed -re 's/^.+"([^"]+)".*/\1/'` VERSION_DEB=`head -n 1 debian/changelog | sed -re 's/[^(]+\(([^)]+)\).+/\1/'` VERSION="$VERSION_SRC1" if [ "$VERSION_SRC1" != "$VERSION_DEB" ] then echo "Version mismatch between update-a-x-i source ($VERSION_SRC1) and debian/changelog ($VERSION_DEB)" >&2 exit 1 fi if [ "$VERSION_SRC2" != "$VERSION_DEB" ] then echo "Version mismatch between axi-cache source ($VERSION_SRC2) and debian/changelog ($VERSION_DEB)" >&2 exit 1 fi echo "$VERSION" exit 0 apt-xapian-index-0.47ubuntu13/debian/examples0000644000000000000000000000001313070503156016065 0ustar examples/* apt-xapian-index-0.47ubuntu13/debian/dirs0000644000000000000000000000032013070503156015211 0ustar usr/bin usr/sbin usr/share/doc usr/share/apt-xapian-index usr/share/apt-xapian-index/aliases usr/share/apt-xapian-index/plugins etc/dbus-1/system.d usr/share/dbus-1/system-services usr/share/polkit-1/actions apt-xapian-index-0.47ubuntu13/debian/compat0000644000000000000000000000000213070503156015530 0ustar 9 apt-xapian-index-0.47ubuntu13/debian/postinst0000644000000000000000000000224013070503156016136 0ustar #!/bin/sh -e #DEBHELPER# # ionice should not be called in a virtual environment # (similar to man-db cronjobs) if [ -x /usr/bin/ionice ] && ! 
    egrep -q '(envID|VxID):.*[1-9]' /proc/self/status
then
    IONICE="/usr/bin/ionice -c3"
else
    IONICE=""
fi

case "$1" in
    configure)
        # Just checking the main directory with -d should prevent the indexing
        # to be started while an indexing is already going on, as the first
        # thing that update-apt-xapian-index does is to create the directory if
        # it is missing
        #
        # we also full-regenerate the index on upgrades from older versions
        # because the weekly --update cron job will not use new plugins for
        # already indexed packages
        if [ ! -d /var/lib/apt-xapian-index ] || dpkg --compare-versions "$2" lt-nl "0.39"
        then
            if [ ! -x /usr/sbin/policy-rc.d ] || /usr/sbin/policy-rc.d apt-xapian-index start
            then
                echo "apt-xapian-index: Building new index in background..."
                $IONICE nice /usr/sbin/update-apt-xapian-index --force --quiet &
            fi
        fi
        ;;
esac

exit 0

apt-xapian-index-0.47ubuntu13/debian/control0000644000000000000000000000352313073450535015745 0ustar
Source: apt-xapian-index
Section: admin
Priority: optional
Maintainer: Ubuntu Developers
XSBC-Original-Maintainer: Enrico Zini
Uploaders: David Paleino
Build-Depends: debhelper (>= 9), quilt
Build-Depends-Indep: help2man, python3-all, dh-python, bash-completion (>= 1:1.0-1~), python3-xapian (>= 1.4.3-1), python3-apt (>= 0.7.93.2), python3-debian (>= 0.1.14), python3-nose
Standards-Version: 3.9.5.0
Vcs-Git: git://git.debian.org/git/collab-maint/apt-xapian-index.git
Vcs-Browser: http://git.debian.org/?p=collab-maint/apt-xapian-index.git
Homepage: http://www.enricozini.org/sw/apt-xapian-index/
X-Python3-Version: >= 3.4

Package: apt-xapian-index
Architecture: all
Depends: python3-xapian (>= 1.4.3-1), python3-apt (>= 0.7.93.2), python3-debian (>= 0.1.14), ${python3:Depends}, ${misc:Depends}
Suggests: app-install-data, python3-xdg
Description: maintenance and search tools for a Xapian index of Debian packages
 This package provides update-apt-xapian-index, a tool to maintain a Xapian index of Debian package information in /var/lib/apt-xapian-index,
and axi-cache, a command line search tool that uses the index. . axi-cache allows one to search packages very quickly, and it also interfaces with the shell command line completion in a smart way, providing context-sensitive keyword and tag suggestions even before the search command is actually run. . update-apt-xapian-index allows plugins to be installed in /usr/share/apt-xapian-index to index all sorts of extra information, such as Debtags tags, popcon information, package ratings and anything else that would fit. . The index generated by update-apt-xapian-index is self-documenting, as it contains an autogenerated README file with information on the index layout and all the data that can be found in it. apt-xapian-index-0.47ubuntu13/debian/patches/0000755000000000000000000000000013070503156015761 5ustar apt-xapian-index-0.47ubuntu13/debian/patches/03_stick_to_main.patch0000644000000000000000000000322013070503156022124 0ustar --- a/test/test_indexer.py +++ b/test/test_indexer.py @@ -8,7 +8,7 @@ import tools import xapian -def smallcache(pkglist=["apt", "libept-dev", "gedit"]): +def smallcache(pkglist=["apt", "libdbus-1-dev", "gedit"]): class sc(object): def __init__(self, cache): self._pkgs = pkglist @@ -69,7 +69,7 @@ def testDeb822Rebuild(self): pkgfile = os.path.join(axi.XAPIANDBPATH, "packages") - subprocess.check_call("apt-cache show apt libept-dev gedit > " + pkgfile, shell=True) + subprocess.check_call("apt-cache show apt libdbus-1-dev gedit > " + pkgfile, shell=True) # No other indexers are running, ensure lock succeeds self.assert_(self.indexer.lock()) @@ -90,7 +90,7 @@ # Perform the initial indexing progress = axi.indexer.SilentProgress() pre_indexer = axi.indexer.Indexer(progress, True) - pre_indexer._test_wrap_apt_cache(smallcache(["apt", "libept-dev", "gedit"])) + pre_indexer._test_wrap_apt_cache(smallcache(["apt", "libdbus-1-dev", "gedit"])) self.assert_(pre_indexer.lock()) self.assert_(pre_indexer.setupIndexing()) pre_indexer.rebuild() @@ -133,7 
+133,7 @@ self.assertFalse("XAPIAN_CJK_NGRAM" in os.environ) progress = axi.indexer.SilentProgress() pre_indexer = axi.indexer.Indexer(progress, True) - pre_indexer._test_wrap_apt_cache(smallcache(["apt", "libept-dev", "gedit"])) + pre_indexer._test_wrap_apt_cache(smallcache(["apt", "libdbus-1-dev", "gedit"])) self.assert_(pre_indexer.lock()) self.assert_(pre_indexer.setupIndexing()) pre_indexer.rebuild() apt-xapian-index-0.47ubuntu13/debian/patches/07_glib_import.patch0000644000000000000000000000253113070503156021620 0ustar ## Description: fix import of GLib and GObject ## Author: Carlo Vanini ## Bug-Ubuntu: https://bugs.launchpad.net/bugs/1579834 ## Last-Update: 2016-07-07 Index: apt-xapian-index-0.47ubuntu9/tests/dbus-update-apt-xapian-index.py =================================================================== --- apt-xapian-index-0.47ubuntu9.orig/tests/dbus-update-apt-xapian-index.py +++ apt-xapian-index-0.47ubuntu9/tests/dbus-update-apt-xapian-index.py @@ -4,7 +4,10 @@ from __future__ import print_function import dbus import os -import glib +try: + from gi.repository import GLib as glib +except ImportError: + import glib import dbus.mainloop.glib dbus.mainloop.glib.DBusGMainLoop(set_as_default=True) Index: apt-xapian-index-0.47ubuntu9/update-apt-xapian-index-dbus =================================================================== --- apt-xapian-index-0.47ubuntu9.orig/update-apt-xapian-index-dbus +++ apt-xapian-index-0.47ubuntu9/update-apt-xapian-index-dbus @@ -6,8 +6,14 @@ import string import subprocess try: - import glib - import gobject + try: + from gi.repository import GLib as glib + except ImportError: + import glib + try: + from gi.repository import GObject as gobject + except ImportError: + import gobject import dbus import dbus.service import dbus.mainloop.glib apt-xapian-index-0.47ubuntu13/debian/patches/04_bilingual.patch0000644000000000000000000013607613070503156021270 0ustar Description: Port the code to work bilingually in Python 2.7, 3.4, and 
3.5. Author: Barry Warsaw Bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=804099 --- a/axi-cache +++ b/axi-cache @@ -25,8 +25,11 @@ # - save NAME save the query to be recalled later with @NAME (or if notmuch # has a syntax for saved queries, recycle it) +from __future__ import print_function + +from operator import itemgetter from optparse import OptionParser -from cStringIO import StringIO +from six import StringIO import sys import os, os.path import axi @@ -54,8 +57,8 @@ except ImportError: from debian_bundle import deb822 helponly = False -except ImportError, e: - print >>sys.stderr, "%s: only help functions are implemented, for the sake of help2man" % str(e) +except ImportError as e: + print("%s: only help functions are implemented, for the sake of help2man" % str(e), file=sys.stderr) helponly = True if not helponly: @@ -74,9 +77,9 @@ def readVocabulary(): try: fin = open(DEBTAGS_VOCABULARY) - except Exception, e: + except Exception as e: # Only show this when being verbose - print >>sys.stderr, "Cannot read %s: %s. Please install `debtags' t" % (DEBTAGS_VOCABULARY, str(e)) + print("Cannot read %s: %s. 
Please install `debtags' t" % (DEBTAGS_VOCABULARY, str(e)), file=sys.stderr) return None, None facets = dict() tags = dict() @@ -142,9 +145,9 @@ if os.path.exists(CACHEFILE): try: self.cache.read(CACHEFILE) - except Exception, e: - print >>sys.stderr, e - print >>sys.stderr, "ignoring %s which seems to be corrupted" % CACHEFILE + except Exception as e: + print(e, file=sys.stderr) + print("ignoring %s which seems to be corrupted" % CACHEFILE, file=sys.stderr) self.dirty = False self.facets = None @@ -154,7 +157,7 @@ "Save the state so we find it next time" if self.dirty: if not os.path.exists(XDG_CACHE_HOME): - os.makedirs(XDG_CACHE_HOME, mode=0700) + os.makedirs(XDG_CACHE_HOME, mode=0o700) self.cache.write(open(CACHEFILE, "w")) self.dirty = False @@ -400,7 +403,7 @@ import datetime import axi.indexer # General info - print "Main data directory:", axi.XAPIANDBPATH + print("Main data directory:", axi.XAPIANDBPATH) try: cur_timestamp = os.path.getmtime(axi.XAPIANDBSTAMP) cur_time = time.strftime("%c", time.localtime(cur_timestamp)) @@ -408,22 +411,22 @@ except e: cur_timestamp = 0 cur_time = "not available: " + str(e) - print "Update timestamp: %s (%s)" % (axi.XAPIANDBSTAMP, cur_time) + print("Update timestamp: %s (%s)" % (axi.XAPIANDBSTAMP, cur_time)) try: index_loc = open(axi.XAPIANINDEX).read().split(" ", 1)[1].strip() index_loc = "pointing to " + index_loc except e: index_loc = "not available: " + str(e) - print "Index location: %s (%s)" % (axi.XAPIANINDEX, index_loc) + print("Index location: %s (%s)" % (axi.XAPIANINDEX, index_loc)) def fileinfo(fname): if os.path.exists(fname): return fname else: return fname + " (available from next reindex)" - print "Documentation of index contents:", fileinfo(axi.XAPIANDBDOC) - print "Documentation of available prefixes:", fileinfo(axi.XAPIANDBPREFIXES) - print "Documentation of available values:", fileinfo(axi.XAPIANDBVALUES) - print "Plugin directory:", axi.PLUGINDIR + print("Documentation of index contents:", 
fileinfo(axi.XAPIANDBDOC)) + print("Documentation of available prefixes:", fileinfo(axi.XAPIANDBPREFIXES)) + print("Documentation of available values:", fileinfo(axi.XAPIANDBVALUES)) + print("Plugin directory:", axi.PLUGINDIR) # Aggregated plugin information # { name: { path=path, desc=desc, status=status } } @@ -432,7 +435,7 @@ # Aggregated value information # { valuename: { val=num, desc=shortdesc, plugins=[plugin names] } } values, descs = axi.readValueDB() - values = dict(((a, dict(val=b, desc=descs[a], plugins=[])) for a, b in values.iteritems())) + values = dict(((a, dict(val=b, desc=descs[a], plugins=[])) for a, b in values.items())) # Aggregated data source information # { pathname: { desc=shortdesc, plugins=[plugin names] } } @@ -497,34 +500,34 @@ #for name, info in sorted(plugins.iteritems(), key=lambda x:x[0]): # print " %s:" % info["path"] # print " ", info["desc"] - print "Plugin status:" - maxname = max((len(x) for x in plugins.iterkeys())) - for name, info in sorted(plugins.iteritems(), key=lambda x:x[0]): - print " ", name.ljust(maxname), info["status"] + print("Plugin status:") + maxname = max((len(x) for x in plugins)) + for name, info in sorted(plugins.items(), key=itemgetter(0)): + print(" ", name.ljust(maxname), info["status"]) # Value information - print "Values:" - maxname = max((len(x) for x in values.iterkeys())) - print " ", "Value".ljust(maxname), "Code", "Provided by" - for name, val in sorted(values.iteritems(), key=lambda x:x[0]): + print("Values:") + maxname = max((len(x) for x in values)) + print(" ", "Value".ljust(maxname), "Code", "Provided by") + for name, val in sorted(values.items(), key=itemgetter(0)): plugins = val.get("plugins", []) if plugins: provider = ", ".join(plugins) else: provider = "update-apt-xapian-index" - print " ", name.ljust(maxname), "%4d" % int(val["val"]), provider + print(" ", name.ljust(maxname), "%4d" % int(val["val"]), provider) # Source files information - print "Data sources:" + print("Data sources:") 
maxpath = 0 maxdesc = 0 - for k, v in sources.iteritems(): + for k, v in sources.items(): if len(k) > maxpath: maxpath = len(k) if len(v["desc"]) > maxdesc: maxdesc = len(v["desc"]) - print " ", "Source".ljust(maxpath), "Description".ljust(maxdesc), "Used by" - for path, info in sources.iteritems(): + print(" ", "Source".ljust(maxpath), "Description".ljust(maxdesc), "Used by") + for path, info in sources.items(): provider = ", ".join(info.get("plugins", [])) - print " ", path.ljust(maxpath), info["desc"].ljust(maxdesc), provider + print(" ", path.ljust(maxpath), info["desc"].ljust(maxdesc), provider) return 0 @@ -555,15 +558,15 @@ terms.update((str(term) for term in self.db.db.synonym_keys(self.args[0]))) terms.update((term.term for term in self.db.db.allterms(self.args[0]))) for term in sorted(terms): - print term + print(term) for term in self.db.db.allterms("XT" + self.args[0]): - print term.term[2:] + print(term.term[2:]) return 0 elif self.opts.tabcomplete == "plain" and not self.args: # Show a preset list of tags for facet in ["interface", "role", "use", "works-with"]: for term in self.db.db.allterms("XT" + facet + "::"): - print term.term[2:] + print(term.term[2:]) return 0 else: # Context-sensitive hints @@ -584,8 +587,8 @@ try: self.db.build_query() - except SavedStateError, e: - print >>sys.stderr, "%s: maybe you need to run '%s search' first?" % (self.name, str(e)) + except SavedStateError as e: + print("%s: maybe you need to run '%s search' first?" % (self.name, str(e)), file=sys.stderr) return 1 self.print_matches(self.db.get_matches(first = 0)) @@ -600,7 +603,7 @@ self.db.set_query_args(qargs, secondary=True) try: self.db.build_query() - except SavedStateError, e: + except SavedStateError as e: return 0 self.print_completions(self.db.get_matches(first = 0)) return 0 @@ -610,8 +613,8 @@ self.db = DB() try: self.db.build_query() - except SavedStateError, e: - print >>sys.stderr, "%s: maybe you need to run '%s search' first?" 
% (self.name, str(e)) + except SavedStateError as e: + print("%s: maybe you need to run '%s search' first?" % (self.name, str(e)), file=sys.stderr) return 1 count = int(args[0]) if args else 20 @@ -624,8 +627,8 @@ self.db = DB() try: self.db.build_query() - except SavedStateError, e: - print >>sys.stderr, "%s: maybe you need to run '%s search' first?" % (self.name, str(e)) + except SavedStateError as e: + print("%s: maybe you need to run '%s search' first?" % (self.name, str(e)), file=sys.stderr) return 1 count = int(args[0]) if args else 20 @@ -657,15 +660,15 @@ pkg = m.document.get_data() if partial is not None and not pkg.startswith(partial): continue if pkg not in blacklist: - print pkg - except SavedStateError, e: + print(pkg) + except SavedStateError as e: return 0 else: # Prefix expand for term in self.db.db.allterms("XP" + (partial or "")): pkg = term.term[2:] if pkg not in blacklist: - print pkg + print(pkg) return 0 def do_showpkg(self, args): @@ -687,11 +690,11 @@ "rdepends pkgname[s]: run apt-cache rdepends pkgname[s]" db = DB() for name in args: - print name - print "Reverse Depends:" + print(name) + print("Reverse Depends:") for pfx in ("XRD", "XRR", "XRS", "XRE", "XRP", "XRB", "XRC"): for pkg in db.get_rdeps(name, pfx): - print " ", pkg + print(" ", pkg) complete_rdepends = complete_show def do_rdetails(self, args): @@ -708,7 +711,7 @@ ("XRC", "con")): deps = list(db.get_rdeps(name, pfx)) if not deps: continue - print name, tag, " ".join(deps) + print(name, tag, " ".join(deps)) complete_rdetails = complete_show def do_policy(self, args): @@ -722,22 +725,22 @@ complete_madison = complete_show def format_help(self, out): - print >>out, "Commands:" + print("Commands:", file=out) itemlist = [] maxusage = 0 - for k, v in sorted(self.__class__.__dict__.iteritems()): + for k, v in sorted(self.__class__.__dict__.items()): if not k.startswith("do_"): continue line = self.name + " " + v.__doc__ usage, desc = line.split(": ", 1) if len(usage) > maxusage: 
maxusage = len(usage) itemlist.append([usage, desc]) - print >>out, " search commands:" + print(" search commands:", file=out) for usage, desc in [x for x in itemlist if "run apt-cache" not in x[1]]: - print >>out, " %-*.*s %s" % (maxusage, maxusage, usage, desc) - print >>out, " apt-cache front-ends:" + print(" %-*.*s %s" % (maxusage, maxusage, usage, desc), file=out) + print(" apt-cache front-ends:", file=out) for usage, desc in [x for x in itemlist if "run apt-cache" in x[1]]: - print >>out, " %-*.*s %s" % (maxusage, maxusage, usage, desc) + print(" %-*.*s %s" % (maxusage, maxusage, usage, desc), file=out) def do_help(self, args): "help: show a summary of commands" @@ -761,9 +764,9 @@ exclude = [x for x in exclude if x.lower() not in self.BOOLWORDS] for s in self.clean_suggestions(self.db.get_suggestions(count=10, filter=DB.BasicFilter(stemmer=self.db.stem, exclude=exclude, prefix=prefix))): if s.startswith("tag:"): - print s[4:] + print(s[4:]) else: - print s + print(s) def print_matches(self, matches): if self.opts.tags: @@ -785,31 +788,31 @@ tag = self.db.unprefix(res.term)[4:] desc = describe_tag(tag) if desc is None: - print "%i%% %s" % (int(score * 100), tag) + print("%i%% %s" % (int(score * 100), tag)) else: - print "%i%% %s -- %s" % (int(score * 100), tag, desc) + print("%i%% %s -- %s" % (int(score * 100), tag, desc)) else: est = matches.get_matches_estimated() first = matches.get_firstitem() count = matches.size() - print "%i results found." % est + print("%i results found." % est) if count != 0: - print "Results %i-%i:" % (first + 1, first + count) + print("Results %i-%i:" % (first + 1, first + count)) elif first != 0: - print "No more results to show" + print("No more results to show") self.print_all_matches((m for m in matches)) if first == 0: sc = self.db.get_spelling_correction() if sc: - print "Did you mean:", sc, "?" 
+ print("Did you mean:", sc, "?") sugg = self.clean_suggestions(self.db.get_suggestions(count=7, filter=DB.TermFilter(stemmer=self.db.stem, exclude=self.args))) - print "More terms:", " ".join(sugg) + print("More terms:", " ".join(sugg)) stags = self.clean_suggestions(self.db.get_suggestions(count=7, filter=DB.TagFilter())) - print "More tags:", " ".join([x[4:] for x in stags]) + print("More tags:", " ".join([x[4:] for x in stags])) if first + count < est: - print "`%s more' will give more results" % self.name + print("`%s more' will give more results" % self.name) if first > 0: - print "`%s again' will restart the search" % self.name + print("`%s again' will restart the search" % self.name) def print_all_matches(self, matches_iter): """ @@ -824,9 +827,9 @@ except KeyError: pkg = None if pkg is not None and pkg.candidate: - print "%i%% %s - %s" % (m.percent, name, pkg.candidate.summary) + print("%i%% %s - %s" % (m.percent, name, pkg.candidate.summary)) else: - print "%i%% %s - (unknown by apt)" % (m.percent, name) + print("%i%% %s - (unknown by apt)" % (m.percent, name)) def perform(self): self.cmd = "help" if not self.args else self.args.pop(0) @@ -838,7 +841,7 @@ else: f = getattr(self, "do_" + self.cmd, None) if f is None: - print >>sys.stderr, "Invalid command: `%s'.\n" % self.cmd + print("Invalid command: `%s'.\n" % self.cmd, file=sys.stderr) self.do_help(self.args) return 1 return f(self.args) --- a/axi/__init__.py +++ b/axi/__init__.py @@ -18,6 +18,8 @@ # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # +from __future__ import print_function + import os import os.path import sys @@ -59,7 +61,7 @@ mo = re_value.match(line) if not mo: if not quiet: - print >>sys.stderr, "%s:%d: line is not `name value [# description]': ignored" % (pathname, idx+1) + print("%s:%d: line is not `name value [# description]': ignored" % (pathname, idx+1), file=sys.stderr) continue # Parse the number name = mo.group(1) @@ -70,10 +72,10 @@ descs[name] = desc 
if not values: raise BrokenIndexError - except (OSError, IOError, BrokenIndexError), e: + except (OSError, IOError, BrokenIndexError) as e: # If we can't read the database, fallback to defaults if not quiet: - print >>sys.stderr, "%s: %s. Falling back on a default value database" % (pathname, e) + print("%s: %s. Falling back on a default value database" % (pathname, e), file=sys.stderr) values = DEFAULT_VALUES descs = DEFAULT_VALUE_DESCS return values, descs --- a/axi/indexer.py +++ b/axi/indexer.py @@ -19,6 +19,8 @@ # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # +from __future__ import print_function + import axi import sys import os @@ -31,8 +33,10 @@ import itertools import time import re -import urllib -import cPickle as pickle +import pickle + +from operator import itemgetter +from six.moves.urllib_parse import unquote APTLISTDIR="/var/lib/apt/lists" @@ -59,7 +63,7 @@ self.info = self.obj.info(**kw) except TypeError: self.info = self.obj.info() - except Exception, e: + except Exception as e: if progress: progress.warning("Plugin %s initialisation failed: %s" % (fname, str(e))) self.obj = None @@ -105,7 +109,7 @@ for f in os.listdir(APTLISTDIR): mo = tfile.search(f) if not mo: continue - langs.add(urllib.unquote(mo.group(1))) + langs.add(unquote(mo.group(1))) return langs @@ -120,36 +124,36 @@ self.is_verbose = False def begin(self, task): self.task = task - print "%s..." % self.task, + print("%s..." % self.task, end='') sys.stdout.flush() self.halfway = True def progress(self, percent): - print "\r%s... %d%%" % (self.task, percent), + print("\r%s... %d%%" % (self.task, percent), end='') sys.stdout.flush() self.halfway = True def end(self): - print "\r%s: done. " % self.task + print("\r%s: done. 
" % self.task) self.halfway = False def verbose(self, *args): if not self.is_verbose: return if self.halfway: - print - print " ".join(args) + print() + print(" ".join(args)) self.halfway = False def notice(self, *args): if self.halfway: - print - print >>sys.stderr, " ".join(args) + print() + print(" ".join(args), file=sys.stderr) self.halfway = False def warning(self, *args): if self.halfway: - print - print >>sys.stderr, " ".join(args) + print() + print(" ".join(args), file=sys.stderr) self.halfway = False def error(self, *args): if self.halfway: - print - print >>sys.stderr, " ".join(args) + print() + print(" ".join(args), file=sys.stderr) self.halfway = False class BatchProgress: @@ -160,25 +164,25 @@ self.task = None def begin(self, task): self.task = task - print "begin: %s\n" % self.task, + print("begin: %s\n" % self.task, end='') sys.stdout.flush() def progress(self, percent): - print "progress: %d/100\n" % percent, + print("progress: %d/100\n" % percent, end='') sys.stdout.flush() def end(self): - print "done: %s\n" % self.task + print("done: %s\n" % self.task) sys.stdout.flush() def verbose(self, *args): - print "verbose: %s" % (" ".join(args)) + print("verbose: %s" % (" ".join(args))) sys.stdout.flush() def notice(self, *args): - print "notice: %s" % (" ".join(args)) + print("notice: %s" % (" ".join(args))) sys.stdout.flush() def warning(self, *args): - print "warning: %s" % (" ".join(args)) + print("warning: %s" % (" ".join(args))) sys.stdout.flush() def error(self, *args): - print "error: %s" % (" ".join(args)) + print("error: %s" % (" ".join(args))) sys.stdout.flush() class SilentProgress: @@ -196,9 +200,9 @@ def notice(self, *args): pass def warning(self, *args): - print >>sys.stderr, " ".join(args) + print(" ".join(args), file=sys.stderr) def error(self, *args): - print >>sys.stderr, " ".join(args) + print(" ".join(args), file=sys.stderr) class ClientProgress: """ @@ -304,7 +308,7 @@ try: sock = self.sock.accept()[0] 
self.proxied.append(ServerSenderProgress(sock, self.task)) - except socket.error, e: + except socket.error as e: if e.args[0] != errno.EAGAIN: raise pass @@ -339,13 +343,12 @@ with ExecutionTime("db flush"): db.flush() """ - import time def __init__(self, info=""): self.info = info def __enter__(self): self.now = time.time() def __exit__(self, type, value, stack): - print "%s: %s" % (self.info, time.time() - self.now) + print("%s: %s" % (self.info, time.time() - self.now)) class Indexer(object): """ @@ -375,7 +378,7 @@ try: # Try to create it anyway os.mkdir(pathname) - except OSError, e: + except OSError as e: if e.errno != errno.EEXIST: # If we got an error besides path already existing, fail raise @@ -438,7 +441,7 @@ # Wrap the current progress with the server sender self.progress = ServerProgress(self.progress) return True - except IOError, e: + except IOError as e: if e.errno == errno.EACCES or e.errno == errno.EAGAIN: return False else: @@ -484,7 +487,7 @@ cur_timestamp = os.path.getmtime(axi.XAPIANDBSTAMP) else: cur_timestamp = 0 - except OSError, e: + except OSError as e: cur_timestamp = 0 self.progress.notice("Reading current timestamp failed: %s. Assuming the index has not been created yet." % e) @@ -584,6 +587,14 @@ seen = set() for fname in fnames: infd = open(fname) + # In Python 3, you cannot tell() on a file object while + # iterating so you have to pop the hood to get at the + # underlying buffer. In Python 2, the file object doesn't have + # this attribute, or this restriction. 
+ try: + tellable = infd.buffer + except AttributeError: + tellable = infd # Get file size to compute progress total = os.fstat(infd.fileno())[6] for idx, pkg in enumerate(deb822.Deb822.iter_paragraphs(infd)): @@ -593,7 +604,7 @@ # Print approximate progress by checking the current read position # against the file size if total > 0 and idx % 200 == 0: - cur = infd.tell() + cur = tellable.tell() self.progress.progress(100*cur/total) yield self.get_document_from_deb822(pkg) @@ -615,6 +626,9 @@ if idx % 5000 == 0: self.progress.progress(100*idx/count) doc = db.get_document(m.docid) pkg = doc.get_data() + if bytes is not str and isinstance(pkg, bytes): + # python3-apt requires strings. + pkg = pkg.decode('utf-8') # this will return '' if there is no value 0, which is fine because it # will fail the comparison with the candidate version causing a reindex dbver = doc.get_value(0) @@ -640,12 +654,12 @@ """ try: db = xapian.WritableDatabase(pathname, xapian.DB_CREATE_OR_OPEN) - except xapian.DatabaseLockError, e: + except xapian.DatabaseLockError: self.progress.warning("DB Update failed, database locked") return # Make sure the index CJK-compatible - if self.is_cjk_enabled() and db.get_metadata("cjk_ngram") != "1": + if self.is_cjk_enabled() and db.get_metadata("cjk_ngram") != b"1": self.progress.notice("The index %s is not CJK-compatible, rebuilding it" % axi.XAPIANINDEX) return self.rebuild() @@ -747,9 +761,8 @@ self.progress.verbose("Installing the new index.") #os.symlink(tmpidxfname, axi.XAPIANDBPATH + "/index.tmp") - out = open(axi.XAPIANINDEX + ".tmp", "w") - print >>out, "auto", os.path.abspath(dbdir) - out.close() + with open(axi.XAPIANINDEX + ".tmp", "w") as out: + print("auto", os.path.abspath(dbdir), file=out) os.rename(axi.XAPIANINDEX + ".tmp", axi.XAPIANINDEX) # Remove all other index.* directories that are not the newly created one @@ -781,9 +794,8 @@ Write the prefix information on the given file """ self.progress.verbose("Writing prefix information to %s." 
% pathname) - out = open(pathname+".tmp", "w") - - print >>out, textwrap.dedent(""" + with open(pathname+".tmp", "w") as out: + print(textwrap.dedent(""" # This file contains the information about keyword prefixes used in the # APT Xapian index. # @@ -794,31 +806,30 @@ # This file lists terms with their index prefix, their queryparser # prefix, whether queryparser should treat it as boolean or # probabilistic and a short description. - """).lstrip() + """).lstrip(), file=out) - # Aggregate and normalise prefix information from all the plugins - prefixes = dict() - for addon in self.plugins: - for p in addon.info.get("prefixes", []): - idx = p.get("idx", None) - if idx is None: continue - qp = p.get("qp", None) - type = p.get("type", None) - desc = p.get("desc", None) - # TODO: warn of inconsistencies (plugins that disagree on qp or type) - old = prefixes.setdefault(idx, dict()) - if qp: old.setdefault("qp", qp) - if type: old.setdefault("type", type) - if desc: old.setdefault("desc", desc) - - for name, info in sorted(prefixes.iteritems(), key=lambda x: x[0]): - print >>out, "%s\t%s\t%s\t# %s" % ( - name, - info.get("qp", "-"), - info.get("type", "-"), - info.get("desc", "(description is missing)")) + # Aggregate and normalise prefix information from all the plugins + prefixes = dict() + for addon in self.plugins: + for p in addon.info.get("prefixes", []): + idx = p.get("idx", None) + if idx is None: continue + qp = p.get("qp", None) + type = p.get("type", None) + desc = p.get("desc", None) + # TODO: warn of inconsistencies (plugins that disagree on qp or type) + old = prefixes.setdefault(idx, dict()) + if qp: old.setdefault("qp", qp) + if type: old.setdefault("type", type) + if desc: old.setdefault("desc", desc) + + for name, info in sorted(prefixes.items(), key=itemgetter(0)): + print("%s\t%s\t%s\t# %s" % ( + name, + info.get("qp", "-"), + info.get("type", "-"), + info.get("desc", "(description is missing)")), file=out) - out.close() # Atomic update of the 
documentation os.rename(pathname+".tmp", pathname) @@ -827,9 +838,9 @@ Write the value information on the given file """ self.progress.verbose("Writing value information to %s." % pathname) - out = open(pathname+".tmp", "w") + with open(pathname+".tmp", "w") as out: - print >>out, textwrap.dedent(""" + print(textwrap.dedent(""" # This file contains the mapping between names of numeric values indexed in the # APT Xapian index and their index # @@ -841,13 +852,12 @@ # The format is exactly like /etc/services with name, number and optional # aliases, with the difference that the second column does not use the # "/protocol" part, which would be meaningless here. - """).lstrip() + """).lstrip(), file=out) - for name, idx in sorted(self.values.iteritems(), key=lambda x: x[1]): - desc = self.values_desc[name] - print >>out, "%s\t%d\t# %s" % (name, idx, desc) + for name, idx in sorted(self.values.items(), key=itemgetter(1)): + desc = self.values_desc[name] + print("%s\t%d\t# %s" % (name, idx, desc), file=out) - out.close() # Atomic update of the documentation os.rename(pathname+".tmp", pathname) @@ -871,8 +881,8 @@ self.progress.notice("Skipping documentation for plugin", addon.filename) # Write the documentation in pathname - out = open(pathname+".tmp", "w") - print >>out, textwrap.dedent(""" + with open(pathname+".tmp", "w") as out: + print(textwrap.dedent(""" =============== Database layout =============== @@ -888,12 +898,12 @@ numeric values is found in ``%s``. The data sources used for indexing are: - """).lstrip() % (axi.XAPIANDBPATH, axi.XAPIANDBVALUES) + """).lstrip() % (axi.XAPIANDBPATH, axi.XAPIANDBVALUES), file=out) - for d in docinfo: - print >>out, " * %s: %s" % (d['name'], d['shortDesc']) + for d in docinfo: + print(" * %s: %s" % (d['name'], d['shortDesc']), file=out) - print >>out, textwrap.dedent(""" + print(textwrap.dedent(""" This Xapian index follows the conventions for term prefixes described in ``/usr/share/doc/xapian-omega/termprefixes.txt.gz``. 
@@ -912,14 +922,13 @@ Active data sources ------------------- - """) - for d in docinfo: - print >>out, d['name'] - print >>out, '='*len(d['name']) - print >>out, textwrap.dedent(d['fullDoc']) - print >>out + """), file=out) + for d in docinfo: + print(d['name'], file=out) + print('='*len(d['name']), file=out) + print(textwrap.dedent(d['fullDoc']), file=out) + print(file=out) - out.close() # Atomic update of the documentation os.rename(pathname+".tmp", pathname) --- a/examples/aptxapianindex.py +++ b/examples/aptxapianindex.py @@ -4,6 +4,8 @@ # To Public License, Version 2, as published by Sam Hocevar. See # http://sam.zoy.org/wtfpl/COPYING for more details. +from __future__ import print_function + import os, re import xapian @@ -75,7 +77,7 @@ # If a filter was requested, AND it with the query return xapian.Query(xapian.Query.OP_AND, filterdb[filtername], query) else: - raise RuntimeError("Invalid filter type. Try one of " + ", ".join(sorted(filterdb.keys()))) + raise RuntimeError("Invalid filter type. Try one of " + ", ".join(sorted(filterdb))) else: return query @@ -85,8 +87,8 @@ """ # Display the top 20 results, sorted by how well they match cache = apt.Cache() - print "%i results found." % mset.get_matches_estimated() - print "Results 1-%i:" % mset.size() + print("%i results found." % mset.get_matches_estimated()) + print("Results 1-%i:" % mset.size()) for m in mset: # /var/lib/apt-xapian-index/README tells us that the Xapian document data # is the package name. 
@@ -98,7 +100,7 @@ # Print the match, together with the short description if pkg.candidate: - print "%i%% %s - %s" % (m.percent, name, pkg.candidate.summary) + print("%i%% %s - %s" % (m.percent, name, pkg.candidate.summary)) def readValueDB(pathname): """ @@ -116,23 +118,22 @@ # Split the line fields = splitter.split(line) if len(fields) < 2: - print >>sys.stderr, "Ignoring line %s:%d: only 1 value found when I need at least the value name and number" % (pathname, idx+1) + print("Ignoring line %s:%d: only 1 value found when I need at least the value name and number" % (pathname, idx+1), file=sys.stderr) continue # Parse the number try: number = int(fields[1]) except ValueError: - print >>sys.stderr, "Ignoring line %s:%d: the second column (\"%s\") must be a number" % (pathname, idx+1, fields[1]) + print("Ignoring line %s:%d: the second column (\"%s\") must be a number" % (pathname, idx+1, fields[1]), file=sys.stderr) continue values[fields[0]] = number for alias in fields[2:]: values[alias] = number - except OSError, e: + except OSError as e: # If we can't read the database, fallback to defaults - print >>sys.stderr, "Cannot read %s: %s. Using a minimal default configuration" % (pathname, e) + print("Cannot read %s: %s. Using a minimal default configuration" % (pathname, e), file=sys.stderr) values = dict( installedsize = 1, packagesize = 2 ) return values - --- a/examples/axi-query-expand.py +++ b/examples/axi-query-expand.py @@ -18,6 +18,8 @@ # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +from __future__ import print_function + from optparse import OptionParser import sys @@ -83,9 +85,9 @@ # Print it out. Note that some terms have a prefix from the database: can we # filter them out? Indeed: Xapian allow to give a filter to get_eset. # Read on... 
-print -print "Terms that could improve the search:", -print ", ".join(["%s (%.2f%%)" % (res.term, res.weight) for res in eset]) +print() +print("Terms that could improve the search:", end='') +print(", ".join(["%s (%.2f%%)" % (res.term, res.weight) for res in eset])) # You can also abuse this feature to show what are the tags that are most @@ -108,8 +110,8 @@ eset = enquire.get_eset(10, rset, Filter()) # Print out the resulting tags -print -print "Tags that could improve the search:", -print ", ".join(["%s (%.2f%%)" % (res.term[2:], res.weight) for res in eset]) +print() +print("Tags that could improve the search:", end='') +print(", ".join(["%s (%.2f%%)" % (res.term[2:], res.weight) for res in eset])) sys.exit(0) --- a/examples/axi-query-pkgtype.py +++ b/examples/axi-query-pkgtype.py @@ -19,6 +19,8 @@ # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +from __future__ import print_function + from optparse import OptionParser import sys @@ -79,7 +81,7 @@ # If a filter was requested, AND it with the query query = xapian.Query(xapian.Query.OP_AND, query, filterdb[options.type]) else: - print >>sys.stderr, "Invalid filter type. Try one of", ", ".join(sorted(filterdb.keys())) + print("Invalid filter type. Try one of", ", ".join(sorted(filterdb)), file=sys.stderr) sys.exit(1) @@ -90,8 +92,8 @@ # Display the top 20 results, sorted by how well they match cache = apt.Cache() matches = enquire.get_mset(0, 20) -print "%i results found." % matches.get_matches_estimated() -print "Results 1-%i:" % matches.size() +print("%i results found." % matches.get_matches_estimated()) +print("Results 1-%i:" % matches.size()) for m in matches: # /var/lib/apt-xapian-index/README tells us that the Xapian document data # is the package name. 
@@ -103,6 +105,6 @@ if pkg.candidate: # Print the match, together with the short description - print "%i%% %s - %s" % (m.percent, name, pkg.candidate.summary) + print("%i%% %s - %s" % (m.percent, name, pkg.candidate.summary)) sys.exit(0) --- a/examples/axi-query-simple.py +++ b/examples/axi-query-simple.py @@ -18,6 +18,8 @@ # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +from __future__ import print_function + from optparse import OptionParser import sys @@ -106,8 +108,8 @@ # Display the top 20 results, sorted by how well they match cache = apt.Cache() matches = enquire.get_mset(0, 20) -print "%i results found." % matches.get_matches_estimated() -print "Results 1-%i:" % matches.size() +print("%i results found." % matches.get_matches_estimated()) +print("Results 1-%i:" % matches.size()) for m in matches: # /var/lib/apt-xapian-index/README tells us that the Xapian document data # is the package name. @@ -119,6 +121,6 @@ if pkg.candidate: # Print the match, together with the short description - print "%i%% %s - %s" % (m.percent, name, pkg.candidate.summary) + print("%i%% %s - %s" % (m.percent, name, pkg.candidate.summary)) sys.exit(0) --- a/examples/axi-query-tags.py +++ b/examples/axi-query-tags.py @@ -18,6 +18,8 @@ # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +from __future__ import print_function + from optparse import OptionParser import sys @@ -91,6 +93,6 @@ # Print out the results for res in eset: - print "%.2f %s" % (res.weight, res.term[2:]) + print("%.2f %s" % (res.weight, res.term[2:])) sys.exit(0) --- a/examples/axi-query.py +++ b/examples/axi-query.py @@ -20,6 +20,8 @@ # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # +from __future__ import print_function + from aptxapianindex import * from optparse import OptionParser import sys @@ -96,12 +98,12 @@ # 
Display the results. cache = apt.Cache() matches = enquire.get_mset(0, 20) -print "%i results found." % matches.get_matches_estimated() -print "Results 1-%i:" % matches.size() +print("%i results found." % matches.get_matches_estimated()) +print("Results 1-%i:" % matches.size()) for m in matches: name = m.document.get_data() pkg = cache[name] if pkg.candidate: - print "%i%% %s - %s" % (m.percent, name, pkg.candidate.summary) + print("%i%% %s - %s" % (m.percent, name, pkg.candidate.summary)) sys.exit(0) --- a/plugins/app-install.py +++ b/plugins/app-install.py @@ -3,10 +3,9 @@ from xdg.Exceptions import ParsingError from xdg import Locale HAS_XDG=True -except ImportError, e: +except ImportError as e: HAS_XDG=False -import axi.indexer import xapian import os, os.path @@ -55,7 +54,7 @@ # Add an "app-popcon" value with popcon rank try: popcon = int(entry.get("X-AppInstall-Popcon")) - except ValueError, e: + except ValueError as e: if self.progress: self.progress.verbose("%s: parsing X-AppInstall-Popcon: %s" % (fname, str(e))) popcon = -1 --- a/plugins/cataloged_time.py +++ b/plugins/cataloged_time.py @@ -3,7 +3,7 @@ HAS_APT=True except ImportError: HAS_APT=False -import cPickle +import pickle import os import os.path import time @@ -69,7 +69,7 @@ self._packages_cataloged_file = None if (self._packages_cataloged_file and os.path.exists(self._packages_cataloged_file)): - self._package_cataloged_time = cPickle.load(open(self._packages_cataloged_file)) + self._package_cataloged_time = pickle.load(open(self._packages_cataloged_file, 'rb')) else: self._package_cataloged_time = {} self.now = time.time() @@ -123,8 +123,8 @@ Called when the indexing is finihsed """ if self._packages_cataloged_file: - f=open(self._packages_cataloged_file+".new", "w") - res = cPickle.dump(self._package_cataloged_time, f) + f=open(self._packages_cataloged_file+".new", "wb") + res = pickle.dump(self._package_cataloged_time, f) f.close() os.rename(self._packages_cataloged_file+".new", 
self._packages_cataloged_file) --- a/plugins/descriptions.py +++ b/plugins/descriptions.py @@ -129,7 +129,7 @@ self.indexer.index_text_without_positions(pkg["Description"]) else: # check if we have a translated description - for k in pkg.keys(): + for k in pkg: if k.startswith('Description-'): self.indexer.index_text_without_positions(pkg[k]) break --- a/plugins/translated-desc.py +++ b/plugins/translated-desc.py @@ -5,12 +5,15 @@ HAS_APT=False import xapian import re -import os, os.path, urllib +import os +import codecs try: from debian import deb822 except ImportError: from debian_bundle import deb822 +from six.moves.urllib_parse import unquote + APTLISTDIR="/var/lib/apt/lists" def translationFiles(langs=None): @@ -21,7 +24,7 @@ mo = tfile.search(f) if not mo: continue if langs and not mo.group(1) in langs: continue - yield urllib.unquote(mo.group(1)), os.path.join(APTLISTDIR, f) + yield unquote(mo.group(1)), os.path.join(APTLISTDIR, f) class Indexer: def __init__(self, lang, file): @@ -38,13 +41,14 @@ # Read the translated descriptions self.descs = dict() desckey = "Description-"+self.lang - for pkg in deb822.Deb822.iter_paragraphs(open(file)): - # I need this if because in some translation files, some packages - # have a different Description header. For example, in the -de - # translations, I once found a Description-de.noguide: header - # instead of Description-de: - if desckey in pkg: - self.descs[pkg["Package"]] = pkg[desckey] + with codecs.open(file, 'r', encoding='utf-8') as fp: + for pkg in deb822.Deb822.iter_paragraphs(fp): + # I need this if because in some translation files, some + # packages have a different Description header. 
For example, + # in the -de translations, I once found a + # Description-de.noguide: header instead of Description-de: + if desckey in pkg: + self.descs[pkg["Package"]] = pkg[desckey] def index(self, document): name = document.get_data() --- a/runtests +++ b/runtests @@ -15,6 +15,6 @@ rm -rf test/test-coverage test/.coverage testdb ;; *) - nosetests -w test "$@" + python -m nose -w test "$@" ;; esac --- a/setup.py +++ b/setup.py @@ -15,9 +15,9 @@ author=['Enrico Zini'], author_email=['enrico@debian.org'], url='http://www.enricozini.org/sw/apt-xapian-index/', - install_requires = [ - "debian", "apt", "xapian", - ], + ## install_requires = [ + ## "debian", "apt", "xapian", + ## ], license='GPL', platforms='any', packages=['axi'], --- a/test/test_indexer.py +++ b/test/test_indexer.py @@ -1,6 +1,6 @@ # -*- coding: utf-8 -*- import unittest -import sys, os.path +import os import axi import axi.indexer import shutil @@ -26,7 +26,7 @@ def __getitem__(self, name): if name not in self._pkgs: - raise KeyError, "`%s' not in wrapped cache" % name + raise KeyError("`%s' not in wrapped cache" % name) return self._cache[name] return sc @@ -167,10 +167,10 @@ # Ensure that we created a new index (because we reindexed to get CJK) self.assertNotEqual(open(axi.XAPIANINDEX).read(), curidx) - + # Double check that we set the CJK flag in the index db = xapian.Database(axi.XAPIANINDEX) - self.assertEqual(db.get_metadata("cjk_ngram"), "1") + self.assertEqual(db.get_metadata("cjk_ngram"), b"1") def testIncrementalRebuildFromEmpty(self): # Prepare an incremental update --- a/tests/dbus-update-apt-xapian-index.py +++ b/tests/dbus-update-apt-xapian-index.py @@ -1,5 +1,7 @@ #!/usr/bin/python +from __future__ import print_function + import dbus import os import glib @@ -7,10 +9,10 @@ dbus.mainloop.glib.DBusGMainLoop(set_as_default=True) def finished(res): - print "finished: ", res + print("finished: ", res) def progress(percent): - print "progress: ", percent + print("progress: ", percent) 
system_bus = dbus.SystemBus() --- /dev/null +++ b/tox.ini @@ -0,0 +1,16 @@ +[tox] +envlist = {nocov,cov}-{py27,py34,py35} + +[testenv] +sitepackages = True +indexserver = + default = http://missing.example.com +usedevelop = True +setenv = + AXI_PLUGIN_DIR={toxinidir}/plugins + AXI_DB_PATH={toxinidir}/testdb + AXI_CACHE_PATH={toxinidir}/testdb + PYTHONPATH="$PYTHONPATH:{toxinidir}" +commands = + nocov: python -m nose -w test -vv + cov: python -m nose -w test --with-coverage --cover-package axi --cover-html --cover-html-dir=test-coverage-{envname} --- a/update-apt-xapian-index +++ b/update-apt-xapian-index @@ -22,6 +22,8 @@ # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # +from __future__ import print_function + import os # Activate support for the CJK tokenizer os.environ["XAPIAN_CJK_NGRAM"] = "1" @@ -88,10 +90,10 @@ # Lock the session so that we prevent concurrent updates try: locked = indexer.lock() -except OSError, e: +except OSError as e: import errno if e.errno == errno.EACCES: - print >>sys.stderr, "You probably need to be root to do this." 
+ print("You probably need to be root to do this.", file=sys.stderr) sys.exit(1) raise if not locked: --- a/update-apt-xapian-index-dbus +++ b/update-apt-xapian-index-dbus @@ -11,7 +11,8 @@ import dbus import dbus.service import dbus.mainloop.glib -except ImportError, e: +except ImportError as e: + import sys sys.stderr.write("Failed to import '%s', can not use dbus" % e) sys.exit(1) @@ -86,7 +87,7 @@ def update_async(self, force, update_only, sender=None, conn=None): if not self._authWithPolicyKit(sender, conn, "org.debian.aptxapianindex.update"): - raise PermissionDeniedError, "Permission denied by policy" + raise PermissionDeniedError("Permission denied by policy") # do not start update-apt-xapian-index twice, the clients will # get the finished signal from the previous running one if self._active_axi: apt-xapian-index-0.47ubuntu13/debian/patches/02_axi-pkgname-mangled-term.patch0000644000000000000000000000113213070503156024053 0ustar --- a/axi/indexer.py +++ b/axi/indexer.py @@ -531,6 +531,10 @@ # Index the package name with a special prefix, to be able to find this # document by exact package name match document.add_term("XP"+pkg.name) + # the query parser is very unhappy about "-" in the pkgname, this + # breaks e.g. FLAG_WILDCARD based matching, so we add a mangled + # name here + document.add_term("XPM"+pkg.name.replace("-","_")) # Have all the various plugins index their things for addon in self.plugins: addon.obj.index(document, pkg) apt-xapian-index-0.47ubuntu13/debian/patches/06_32bit_sizes.patch0000644000000000000000000000363713070503156021460 0ustar Description: On 32 bit systems, large packages like astrometry-data-2mass-00 can have an installed size that doesn't fit in 32 bits, generating an OverflowError. Xapian doesn't catch this and thus returns a value with an exception set, causing a chained SystemError. In either case, catch the exception and treat it as if the size is -1, i.e. don't add it to the document. 
Author: Barry Warsaw Bug: https://bugs.launchpad.net/ubuntu/+source/xapian1.3-bindings/+bug/1527745 --- a/plugins/sizes.py +++ b/plugins/sizes.py @@ -95,9 +95,15 @@ return if self.val_inst_size != -1: - document.add_value(self.val_inst_size, xapian.sortable_serialise(instSize)); + try: + document.add_value(self.val_inst_size, xapian.sortable_serialise(instSize)); + except (OverflowError, SystemError): + pass if self.val_pkg_size != -1: - document.add_value(self.val_pkg_size, xapian.sortable_serialise(pkgSize)); + try: + document.add_value(self.val_pkg_size, xapian.sortable_serialise(pkgSize)); + except (OverflowError, SystemError): + pass def indexDeb822(self, document, pkg): """ @@ -116,9 +122,15 @@ return if self.val_inst_size != -1: - document.add_value(self.val_inst_size, xapian.sortable_serialise(instSize)); + try: + document.add_value(self.val_inst_size, xapian.sortable_serialise(instSize)); + except (OverflowError, SystemError): + pass if self.val_pkg_size != -1: - document.add_value(self.val_pkg_size, xapian.sortable_serialise(pkgSize)); + try: + document.add_value(self.val_pkg_size, xapian.sortable_serialise(pkgSize)); + except (OverflowError, SystemError): + pass def init(): """ apt-xapian-index-0.47ubuntu13/debian/patches/series0000644000000000000000000000024013070503156017172 0ustar 01_axi_cjk_support.patch 02_axi-pkgname-mangled-term.patch 03_stick_to_main.patch 04_bilingual.patch 05_python3.patch 06_32bit_sizes.patch 07_glib_import.patch apt-xapian-index-0.47ubuntu13/debian/patches/01_axi_cjk_support.patch0000644000000000000000000001117013070503156022506 0ustar === modified file 'axi-cache' --- a/axi-cache +++ b/axi-cache @@ -40,6 +40,9 @@ XDG_CACHE_HOME = os.environ.get("XDG_CACHE_HOME", os.path.expanduser("~/.cache")) CACHEFILE = os.path.join(XDG_CACHE_HOME, "axi-cache.state") +# Activate support for the CJK tokenizer +os.environ["XAPIAN_CJK_NGRAM"] = "1" + try: from ConfigParser import RawConfigParser import re --- a/axi/indexer.py +++ 
b/axi/indexer.py @@ -627,6 +627,9 @@ self.progress.end() return unchanged, outdated, obsolete + def is_cjk_enabled (self): + return "XAPIAN_CJK_NGRAM" in os.environ + def updateIndex(self, pathname): """ Update the index @@ -636,6 +639,12 @@ except xapian.DatabaseLockError, e: self.progress.warning("DB Update failed, database locked") return + + # Make sure the index CJK-compatible + if self.is_cjk_enabled() and db.get_metadata("cjk_ngram") != "1": + self.progress.notice("The index %s is not CJK-compatible, rebuilding it" % axi.XAPIANINDEX) + return self.rebuild() + cache = self.aptcache() count = len(cache) @@ -692,6 +701,11 @@ # Create a new Xapian index db = xapian.WritableDatabase(pathname, xapian.DB_CREATE_OR_OVERWRITE) + + # Mark the new index as CJK-enabled if relevant + if self.is_cjk_enabled(): + db.set_metadata("cjk_ngram", "1") + # It seems to be faster without transactions, at the moment #db.begin_transaction(False) --- a/test/test_indexer.py +++ b/test/test_indexer.py @@ -6,6 +6,7 @@ import shutil import subprocess import tools +import xapian def smallcache(pkglist=["apt", "libept-dev", "gedit"]): class sc(object): @@ -61,6 +62,11 @@ # Ensure that we have an index self.assertCleanIndex() + def testAptRebuildWithCJK(self): + os.environ["XAPIAN_CJK_NGRAM"] = "1" + self.testAptRebuild() + del os.environ["XAPIAN_CJK_NGRAM"] + def testDeb822Rebuild(self): pkgfile = os.path.join(axi.XAPIANDBPATH, "packages") subprocess.check_call("apt-cache show apt libept-dev gedit > " + pkgfile, shell=True) @@ -117,6 +123,55 @@ # Ensure that we did not create a new index self.assertEqual(open(axi.XAPIANINDEX).read(), curidx) + def testIncrementalRebuildWithCJK(self): + os.environ["XAPIAN_CJK_NGRAM"] = "1" + self.testIncrementalRebuild() + del os.environ["XAPIAN_CJK_NGRAM"] + + def testIncrementalRebuildUpgradingtoCJK(self): + # Perform the initial indexing without CJK enabled + self.assertFalse("XAPIAN_CJK_NGRAM" in os.environ) + progress = axi.indexer.SilentProgress() + 
pre_indexer = axi.indexer.Indexer(progress, True) + pre_indexer._test_wrap_apt_cache(smallcache(["apt", "libept-dev", "gedit"])) + self.assert_(pre_indexer.lock()) + self.assert_(pre_indexer.setupIndexing()) + pre_indexer.rebuild() + pre_indexer = None + curidx = open(axi.XAPIANINDEX).read() + + # Ensure that we have an index + self.assertCleanIndex() + + # Prepare an incremental update + self.indexer._test_wrap_apt_cache(smallcache(["apt", "coreutils", "gedit"])) + + # No other indexers are running, ensure lock succeeds + self.assert_(self.indexer.lock()) + + # An index exists the plugin modification timestamps are the same, so + # we need to force the indexer to run + self.assert_(not self.indexer.setupIndexing()) + self.assert_(self.indexer.setupIndexing(force=True)) + + # Perform a rebuild with CJK enabled + os.environ["XAPIAN_CJK_NGRAM"] = "1" + self.indexer.incrementalUpdate() + del os.environ["XAPIAN_CJK_NGRAM"] + + # Close the indexer + self.indexer = None + + # Ensure that we have an index + self.assertCleanIndex() + + # Ensure that we created a new index (because we reindexed to get CJK) + self.assertNotEqual(open(axi.XAPIANINDEX).read(), curidx) + + # Double check that we set the CJK flag in the index + db = xapian.Database(axi.XAPIANINDEX) + self.assertEqual(db.get_metadata("cjk_ngram"), "1") + def testIncrementalRebuildFromEmpty(self): # Prepare an incremental update self.indexer._test_wrap_apt_cache(smallcache()) --- a/update-apt-xapian-index +++ b/update-apt-xapian-index @@ -22,6 +22,11 @@ # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # +import os +# Activate support for the CJK tokenizer +os.environ["XAPIAN_CJK_NGRAM"] = "1" + + # # Main program body # apt-xapian-index-0.47ubuntu13/debian/patches/05_python3.patch0000644000000000000000000001441213070503156020714 0ustar Description: Switch to Python 3. 
Author: Barry Warsaw Bug: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=804099 Index: apt-xapian-index-0.47ubuntu10/axi-cache =================================================================== --- apt-xapian-index-0.47ubuntu10.orig/axi-cache +++ apt-xapian-index-0.47ubuntu10/axi-cache @@ -1,4 +1,4 @@ -#!/usr/bin/python +#!/usr/bin/python3 # coding: utf-8 # Index: apt-xapian-index-0.47ubuntu10/examples/axi-query-adaptivecutoff.py =================================================================== --- apt-xapian-index-0.47ubuntu10.orig/examples/axi-query-adaptivecutoff.py +++ apt-xapian-index-0.47ubuntu10/examples/axi-query-adaptivecutoff.py @@ -1,4 +1,4 @@ -#!/usr/bin/python +#!/usr/bin/python3 # axi-query-adaptivecutoff - Use an adaptive cutoff to select results # Index: apt-xapian-index-0.47ubuntu10/examples/axi-query-expand.py =================================================================== --- apt-xapian-index-0.47ubuntu10.orig/examples/axi-query-expand.py +++ apt-xapian-index-0.47ubuntu10/examples/axi-query-expand.py @@ -1,4 +1,4 @@ -#!/usr/bin/python +#!/usr/bin/python3 # axi-query-expand - Query and show possible expansions # Index: apt-xapian-index-0.47ubuntu10/examples/axi-query-pkgtype.py =================================================================== --- apt-xapian-index-0.47ubuntu10.orig/examples/axi-query-pkgtype.py +++ apt-xapian-index-0.47ubuntu10/examples/axi-query-pkgtype.py @@ -1,4 +1,4 @@ -#!/usr/bin/python +#!/usr/bin/python3 # axi-query-pkgtype - Like axi-query-simple.py, but with a simple # result filter Index: apt-xapian-index-0.47ubuntu10/examples/axi-query-similar.py =================================================================== --- apt-xapian-index-0.47ubuntu10.orig/examples/axi-query-similar.py +++ apt-xapian-index-0.47ubuntu10/examples/axi-query-similar.py @@ -1,4 +1,4 @@ -#!/usr/bin/python +#!/usr/bin/python3 # axi-query-similar - Show packages similar to a given one # Index: 
apt-xapian-index-0.47ubuntu10/examples/axi-query-simple.py =================================================================== --- apt-xapian-index-0.47ubuntu10.orig/examples/axi-query-simple.py +++ apt-xapian-index-0.47ubuntu10/examples/axi-query-simple.py @@ -1,4 +1,4 @@ -#!/usr/bin/python +#!/usr/bin/python3 # axi-query-simple - apt-cache search replacement using apt-xapian-index # Index: apt-xapian-index-0.47ubuntu10/examples/axi-query-tags.py =================================================================== --- apt-xapian-index-0.47ubuntu10.orig/examples/axi-query-tags.py +++ apt-xapian-index-0.47ubuntu10/examples/axi-query-tags.py @@ -1,4 +1,4 @@ -#!/usr/bin/python +#!/usr/bin/python3 # axi-query-tags - Look for Debtags tags by keyword # Index: apt-xapian-index-0.47ubuntu10/examples/axi-query.py =================================================================== --- apt-xapian-index-0.47ubuntu10.orig/examples/axi-query.py +++ apt-xapian-index-0.47ubuntu10/examples/axi-query.py @@ -1,4 +1,4 @@ -#!/usr/bin/python +#!/usr/bin/python3 # # axi-query - Example program to query the apt-xapian-index Index: apt-xapian-index-0.47ubuntu10/examples/axi-searchasyoutype.py =================================================================== --- apt-xapian-index-0.47ubuntu10.orig/examples/axi-searchasyoutype.py +++ apt-xapian-index-0.47ubuntu10/examples/axi-searchasyoutype.py @@ -1,4 +1,4 @@ -#!/usr/bin/python +#!/usr/bin/python3 # axi-searchasyoutype - Search-as-you-type demo # Index: apt-xapian-index-0.47ubuntu10/examples/axi-searchcloud.py =================================================================== --- apt-xapian-index-0.47ubuntu10.orig/examples/axi-searchcloud.py +++ apt-xapian-index-0.47ubuntu10/examples/axi-searchcloud.py @@ -1,4 +1,4 @@ -#!/usr/bin/python +#!/usr/bin/python3 # axi-searchasyoutype - Search-as-you-type demo # Index: apt-xapian-index-0.47ubuntu10/runtests =================================================================== --- 
apt-xapian-index-0.47ubuntu10.orig/runtests +++ apt-xapian-index-0.47ubuntu10/runtests @@ -15,6 +15,6 @@ case "$1" in rm -rf test/test-coverage test/.coverage testdb ;; *) - python -m nose -w test "$@" + python3 -m nose -w test "$@" ;; esac Index: apt-xapian-index-0.47ubuntu10/tests/dbus-update-apt-xapian-index.py =================================================================== --- apt-xapian-index-0.47ubuntu10.orig/tests/dbus-update-apt-xapian-index.py +++ apt-xapian-index-0.47ubuntu10/tests/dbus-update-apt-xapian-index.py @@ -1,4 +1,4 @@ -#!/usr/bin/python +#!/usr/bin/python3 from __future__ import print_function Index: apt-xapian-index-0.47ubuntu10/update-apt-xapian-index =================================================================== --- apt-xapian-index-0.47ubuntu10.orig/update-apt-xapian-index +++ apt-xapian-index-0.47ubuntu10/update-apt-xapian-index @@ -1,4 +1,4 @@ -#!/usr/bin/python +#!/usr/bin/python3 # -*- coding: utf-8 -*- # @@ -110,4 +110,6 @@ if opts.update: else: indexer.rebuild(opts.pkgfile) +# Free the resources explicitly. 
See LP: #1530518 +del indexer sys.exit(0) Index: apt-xapian-index-0.47ubuntu10/update-apt-xapian-index-dbus =================================================================== --- apt-xapian-index-0.47ubuntu10.orig/update-apt-xapian-index-dbus +++ apt-xapian-index-0.47ubuntu10/update-apt-xapian-index-dbus @@ -2,7 +2,6 @@ import logging import os -import string import subprocess try: @@ -56,7 +55,8 @@ class AptXapianIndexDBusService(dbus.ser logging.debug("Emitting UpdateProgress: %s" % percent) def _update_apt_xapian_index(self, cmd): - p = subprocess.Popen(cmd, stdout=subprocess.PIPE) + p = subprocess.Popen(cmd, stdout=subprocess.PIPE, + universal_newlines=True) self._active_axi = p while True: while gobject.main_context_default().pending(): @@ -68,7 +68,7 @@ class AptXapianIndexDBusService(dbus.ser if not line: continue try: - (op, progress) = string.split(line, sep=":", maxsplit=1) + (op, progress) = line.split(sep=":", maxsplit=1) if op == "progress": percent = int(progress.split("/")[0]) self.UpdateProgress(percent) apt-xapian-index-0.47ubuntu13/debian/rules0000755000000000000000000000407613070503156015421 0ustar #!/usr/bin/make -f VERSION=$(shell debian/vercheck) RELEASE_PACKAGE=apt-xapian-index PFX=$(CURDIR)/debian/apt-xapian-index %: dh $@ --with python3,quilt --buildsystem=pybuild override_dh_auto_build: dh_auto_build help2man --name='rebuild the Apt Xapian Index' --section=8 --no-info ./update-apt-xapian-index > update-apt-xapian-index.8 COLUMNS=200 help2man --name='query the Apt Xapian Index' --section=1 --no-info ./axi-cache > axi-cache.1 override_dh_python3: dh_python3 --shebang /usr/bin/python3 override_dh_auto_test: ifeq (, $(findstring nocheck, $(DEB_BUILD_OPTIONS))) # run test suite ./runtests -v endif override_dh_auto_install: dh_auto_install mv $(PFX)/usr/bin/update-apt-xapian-index \ $(PFX)/usr/sbin/update-apt-xapian-index # Install the plugins install -o root -g root -m 755 -d $(PFX)/usr/share/apt-xapian-index/plugins install -o root -g root -m 
644 aliases/* $(PFX)/usr/share/apt-xapian-index/aliases/ install -o root -g root -m 644 plugins/*.py $(PFX)/usr/share/apt-xapian-index/plugins/ # Install the dbus stuff install -o root -g root -m 755 update-apt-xapian-index-dbus $(PFX)/usr/share/apt-xapian-index install -o root -g root -m 644 data/org.debian.AptXapianIndex.conf $(PFX)/etc/dbus-1/system.d install -o root -g root -m 644 data/org.debian.AptXapianIndex.service $(PFX)/usr/share/dbus-1/system-services/org.debian.AptXapianIndex.service install -o root -g root -m 644 data/org.debian.aptxapianindex.policy $(PFX)/usr/share/polkit-1/actions # Install bash completion dh_bash-completion override_dh_auto_clean: dh_auto_clean find . -name "*.pyc" -delete rm -rf testdb rm -f update-apt-xapian-index.8 axi-cache.1 vercheck: debian/vercheck > /dev/null debsrc: vercheck git-buildpackage -S -us -uc rm -f ../$(RELEASE_PACKAGE)_$(VERSION)_source.changes #release: clean # ( cd .. && FAKEROOTKEY= LD_PRELOAD= sbuild -c sid -A --nolog -s $(RELEASE_PACKAGE)_$(RELEASE_VERSION).dsc ) # ( cd .. && FAKEROOTKEY= LD_PRELOAD= lintian $(RELEASE_PACKAGE)_$(RELEASE_VERSION)_*.changes ) # git tag -s -m "Tagged version $(RELEASE_VERSION)" v$(RELEASE_VERSION) apt-xapian-index-0.47ubuntu13/axi/0000755000000000000000000000000013070503156013671 5ustar apt-xapian-index-0.47ubuntu13/axi/__init__.py0000644000000000000000000000604413070503156016006 0ustar # # axi - apt-xapian-index python modules # # Copyright (C) 2007--2010 Enrico Zini # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
# # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # import os import os.path import sys import re # Setup configuration PLUGINDIR = os.environ.get("AXI_PLUGIN_DIR", "/usr/share/apt-xapian-index/plugins") XAPIANDBPATH = os.environ.get("AXI_DB_PATH", "/var/lib/apt-xapian-index") XAPIANDBSTAMP = os.path.join(XAPIANDBPATH, "update-timestamp") XAPIANDBLOCK = os.path.join(XAPIANDBPATH, "update-lock") XAPIANDBUPDATESOCK = os.path.join(XAPIANDBPATH, "update-socket") XAPIANDBVALUES = os.path.join(XAPIANDBPATH, "values") XAPIANDBPREFIXES = os.path.join(XAPIANDBPATH, "prefixes") XAPIANDBDOC = os.path.join(XAPIANDBPATH, "README") XAPIANINDEX = os.path.join(XAPIANDBPATH, "index") XAPIANCACHEPATH = os.environ.get("AXI_CACHE_PATH", "/var/cache/apt-xapian-index") # Default value database in case one cannot be read DEFAULT_VALUES = dict(version=0, installedsize=1, packagesize=2) DEFAULT_VALUE_DESCS = dict( version="package version", installedsize="installed size", packagesize="package size" ) def readValueDB(pathname=XAPIANDBVALUES, quiet=False): """ Read the "/etc/services"-style database of value indices """ try: re_empty = re.compile("^\s*(?:#.*)?$") re_value = re.compile("^(\S+)\s+(\d+)(?:\s*#\s*(.+))?$") values = {} descs = {} for idx, line in enumerate(open(pathname)): # Skip empty lines and comments if re_empty.match(line): continue # Parse the rest mo = re_value.match(line) if not mo: if not quiet: print >>sys.stderr, "%s:%d: line is not `name value [# description]': ignored" % (pathname, idx+1) continue # Parse the number name = mo.group(1) number = int(mo.group(2)) desc = mo.group(3) or "" values[name] = number descs[name] = desc if not values: raise BrokenIndexError except (OSError, IOError, BrokenIndexError), e: # If we can't read the database, fallback to defaults if not quiet: print >>sys.stderr, "%s: %s. 
Falling back on a default value database" % (pathname, e) values = DEFAULT_VALUES descs = DEFAULT_VALUE_DESCS return values, descs class BrokenIndexError(Exception): pass apt-xapian-index-0.47ubuntu13/axi/indexer.py0000644000000000000000000010002713070503156015701 0ustar # -*- coding: utf-8 -*- # # axi/indexer.py - apt-xapian-index indexer # # Copyright (C) 2007--2010 Enrico Zini # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # import axi import sys import os import imp import socket, errno import fcntl import textwrap import xapian import shutil import itertools import time import re import urllib import cPickle as pickle APTLISTDIR="/var/lib/apt/lists" class Addon: """ Indexer plugin wrapper """ def __init__(self, fname, progress=None, **kw): self.filename = os.path.basename(fname) self.name = os.path.splitext(self.filename)[0] oldpath = sys.path try: sys.path.append(os.path.dirname(fname)) self.module = imp.load_source("axi.plugin_" + self.name, fname) finally: sys.path = oldpath try: self.obj = self.module.init(**kw) except TypeError: self.obj = self.module.init() if self.obj: try: try: self.info = self.obj.info(**kw) except TypeError: self.info = self.obj.info() except Exception, e: if progress: progress.warning("Plugin %s initialisation failed: %s" % (fname, str(e))) self.obj = None def finished(self): if 
hasattr(self.obj, "finished"): self.obj.finished() def send_extra_info(self, **kw): func = getattr(self.obj, "send_extra_info", None) if func is not None: func(**kw) class Plugins(list): def __init__(self, **kw): """ Read the plugins, in sorted order. Pass all the keyword args to the plugin init """ if "langs" not in kw: kw["langs"] = self.scan_available_languages() progress = kw.get("progress", None) self.disabled = [] for fname in sorted(os.listdir(axi.PLUGINDIR)): # Skip non-pythons, hidden files and python sources starting with '_' if fname[0] in ['.', '_'] or not fname.endswith(".py"): continue fullname = os.path.join(axi.PLUGINDIR, fname) if not os.path.isfile(fullname): continue if progress: progress.verbose("Reading plugin %s." % fullname) addon = Addon(fullname, **kw) if addon.obj != None: self.append(addon) def scan_available_languages(self): # Languages we index langs = set() # Look for files like: ftp.uk.debian.org_debian_dists_sid_main_i18n_Translation-it # And extract the language code at the end tfile = re.compile(r"_i18n_Translation-([^-]+)$") for f in os.listdir(APTLISTDIR): mo = tfile.search(f) if not mo: continue langs.add(urllib.unquote(mo.group(1))) return langs class Progress: """ Normal progress report to stdout """ def __init__(self): self.task = None self.halfway = False self.is_verbose = False def begin(self, task): self.task = task print "%s..." % self.task, sys.stdout.flush() self.halfway = True def progress(self, percent): print "\r%s... %d%%" % (self.task, percent), sys.stdout.flush() self.halfway = True def end(self): print "\r%s: done. 
" % self.task self.halfway = False def verbose(self, *args): if not self.is_verbose: return if self.halfway: print print " ".join(args) self.halfway = False def notice(self, *args): if self.halfway: print print >>sys.stderr, " ".join(args) self.halfway = False def warning(self, *args): if self.halfway: print print >>sys.stderr, " ".join(args) self.halfway = False def error(self, *args): if self.halfway: print print >>sys.stderr, " ".join(args) self.halfway = False class BatchProgress: """ Machine readable progress report """ def __init__(self): self.task = None def begin(self, task): self.task = task print "begin: %s\n" % self.task, sys.stdout.flush() def progress(self, percent): print "progress: %d/100\n" % percent, sys.stdout.flush() def end(self): print "done: %s\n" % self.task sys.stdout.flush() def verbose(self, *args): print "verbose: %s" % (" ".join(args)) sys.stdout.flush() def notice(self, *args): print "notice: %s" % (" ".join(args)) sys.stdout.flush() def warning(self, *args): print "warning: %s" % (" ".join(args)) sys.stdout.flush() def error(self, *args): print "error: %s" % (" ".join(args)) sys.stdout.flush() class SilentProgress: """ Quiet progress report """ def begin(self, task): pass def progress(self, percent): pass def end(self): pass def verbose(self, *args): pass def notice(self, *args): pass def warning(self, *args): print >>sys.stderr, " ".join(args) def error(self, *args): print >>sys.stderr, " ".join(args) class ClientProgress: """ Client-side progress report, reporting progress from another running indexer """ def __init__(self, progress): self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) self.sock.settimeout(None) self.sock.connect(axi.XAPIANDBUPDATESOCK) self.progress = progress def loop(self): hasBegun = False while True: msg = self.sock.recv(4096) try: args = pickle.loads(msg) except EOFError: self.progress.error("The other update has stopped") return action = args[0] args = args[1:] if action == "begin": 
self.progress.begin(*args) hasBegun = True elif action == "progress": if not hasBegun: self.progress.begin(args[0]) hasBegun = True self.progress.progress(*args[1:]) elif action == "end": if not hasBegun: self.progress.begin(args[0]) hasBegun = True self.progress.end(*args[1:]) elif action == "verbose": self.progress.verbose(*args) elif action == "notice": self.progress.notice(*args) elif action == "warning": self.progress.warning(*args) elif action == "error": self.progress.error(*args) elif action == "alldone": break else: self.progress.error("unknown action '%s' from other update-apt-xapian-index. Arguments: '%s'" % (action, ", ".join(map(repr, args)))) class ServerSenderProgress: """ Server endpoint for client-server progress report """ def __init__(self, sock, task = None): self.sock = sock self.task = task def __del__(self): self._send(pickle.dumps(("alldone",))) def _send(self, text): try: self.sock.send(text) except: pass def begin(self, task): self.task = task self._send(pickle.dumps(("begin", self.task))) def progress(self, percent): self._send(pickle.dumps(("progress", self.task, percent))) def end(self): self._send(pickle.dumps(("end", self.task))) def verbose(self, *args): self._send(pickle.dumps(("verbose",) + args)) def notice(self, *args): self._send(pickle.dumps(("notice",) + args)) def warning(self, *args): self._send(pickle.dumps(("warning",) + args)) def error(self, *args): self._send(pickle.dumps(("error",) + args)) class ServerProgress: """ Send progress report to any progress object, as well as to client indexers """ def __init__(self, mine): self.task = None self.proxied = [mine] self.sockfile = axi.XAPIANDBUPDATESOCK try: os.unlink(self.sockfile) except OSError: pass self.sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) self.sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1) self.sock.bind(axi.XAPIANDBUPDATESOCK) self.sock.setblocking(False) self.sock.listen(5) # Disallowing unwanted people to mess with the file is automatic, as # the socket has the ownership of the user we're
using, and people # can't connect to it unless they can write to it def __del__(self): self.sock.close() os.unlink(self.sockfile) def _check(self): try: sock = self.sock.accept()[0] self.proxied.append(ServerSenderProgress(sock, self.task)) except socket.error, e: if e.args[0] != errno.EAGAIN: raise pass def begin(self, task): self._check() self.task = task for x in self.proxied: x.begin(task) def progress(self, percent): self._check() for x in self.proxied: x.progress(percent) def end(self): self._check() for x in self.proxied: x.end() def verbose(self, *args): self._check() for x in self.proxied: x.verbose(*args) def notice(self, *args): self._check() for x in self.proxied: x.notice(*args) def warning(self, *args): self._check() for x in self.proxied: x.warning(*args) def error(self, *args): self._check() for x in self.proxied: x.error(*args) class ExecutionTime(object): """ Helper that can be used in with statements to have a simple measure of the timing of a particular block of code, e.g. 
with ExecutionTime("db flush"): db.flush() """ import time def __init__(self, info=""): self.info = info def __enter__(self): self.now = time.time() def __exit__(self, type, value, stack): print "%s: %s" % (self.info, time.time() - self.now) class Indexer(object): """ The indexer """ def __init__(self, progress, quietapt=False): self.progress = progress self.quietapt = quietapt self.verbose = getattr(progress, "is_verbose", False) # Timestamp of the most recent data source self.ds_timestamp = 0 # Apt cache instantiated on demand self.apt_cache = None # Loaded plugins self.plugins = None # OS file handle for the lock file self.lockfd = None # Ensure the database and cache directories exist self.ensure_dir_exists(axi.XAPIANDBPATH) self.ensure_dir_exists(axi.XAPIANCACHEPATH) def ensure_dir_exists(self, pathname): """ Create a directory if missing, but do not complain if it already exists """ try: # Try to create it anyway os.mkdir(pathname) except OSError, e: if e.errno != errno.EEXIST: # If we got an error besides path already existing, fail raise elif not os.path.isdir(pathname): # If that path already exists, but is not a directory, also fail raise def _test_wrap_apt_cache(self, wrapper): """ Wrap the apt-cache in some proxy object. 
This is used to give tests some control over the apt cache results """ if self.apt_cache is not None: raise RuntimeError("the cache has already been instantiated") # Instantiate the cache self.aptcache() # Wrap it self.apt_cache = wrapper(self.apt_cache) def aptcache(self): if not self.apt_cache: #import warnings ## Yes, apt, thanks, I know, the api isn't stable, thank you so very much ##warnings.simplefilter('ignore', FutureWarning) #warnings.filterwarnings("ignore","apt API not stable yet") import apt import apt_pkg #warnings.resetwarnings() if self.quietapt: class AptSilentProgress(apt.progress.text.OpProgress) : def __init__(self): pass def done(self): pass def update(self,percent=None): pass aptprogress = AptSilentProgress() else: aptprogress = None # memonly=True: force apt to not write a pkgcache.bin apt_pkg.init_config() self.apt_cache = apt.Cache(memonly=True, progress=aptprogress) return self.apt_cache def lock(self): """ Lock the session to prevent further updates. @returns True if the session is locked False if another indexer is running """ # Lock the session so that we prevent concurrent updates self.lockfd = os.open(axi.XAPIANDBLOCK, os.O_RDWR | os.O_CREAT) try: fcntl.lockf(self.lockfd, fcntl.LOCK_EX | fcntl.LOCK_NB) # Wrap the current progress with the server sender self.progress = ServerProgress(self.progress) return True except IOError, e: if e.errno == errno.EACCES or e.errno == errno.EAGAIN: return False else: raise def slave(self): """ Attach to a running indexer and report its progress. Return when the other indexer has finished. """ self.progress.notice("Another update is already running: showing its progress.") childProgress = ClientProgress(self.progress) childProgress.loop() def setupIndexing(self, force=False, system=True): """ Setup indexing: read plugins, check timestamps... 
@param force: if True, reindex also if the index is up to date @return: True if there is something to index False if there is no need of indexing """ # Read values database #values = readValueDB(VALUESCONF, progress) # Read the plugins, in sorted order self.plugins = Plugins(progress=self.progress, system=system) # Ensure that we have something to do if len(self.plugins) == 0: self.progress.notice("No indexing plugins found in %s" % axi.PLUGINDIR) return False # Get the most recent modification timestamp of the data sources self.ds_timestamp = max([x.info['timestamp'] for x in self.plugins]) # Get the timestamp of the last database update try: if os.path.exists(axi.XAPIANDBSTAMP): cur_timestamp = os.path.getmtime(axi.XAPIANDBSTAMP) else: cur_timestamp = 0 except OSError, e: cur_timestamp = 0 self.progress.notice("Reading current timestamp failed: %s. Assuming the index has not been created yet." % e) if self.verbose: self.progress.verbose("Most recent dataset: %s." % time.ctime(self.ds_timestamp)) self.progress.verbose("Most recent update for: %s." % time.ctime(cur_timestamp)) # See if we need an update if cur_timestamp != 0 and int(self.ds_timestamp+.5) <= int(cur_timestamp+0.5): if force: self.progress.notice("The index %s is up to date, but rebuilding anyway as requested." 
% axi.XAPIANDBPATH) else: self.progress.notice("The index %s is up to date" % axi.XAPIANDBPATH) return False # Build the value database self.progress.verbose("Aggregating value information.") # Read existing value database to keep ids stable in a system self.values, self.values_desc = axi.readValueDB(quiet=True) values_seq = max(self.values.values()) + 1 for addon in self.plugins: for v in addon.info.get("values", []): if v['name'] in self.values: continue self.values[v['name']] = values_seq values_seq += 1 self.values_desc[v['name']] = v['desc'] # Tell the plugins to do the long initialisation bits self.progress.verbose("Initializing plugins.") for addon in self.plugins: addon.obj.init(dict(values=self.values), self.progress) return True def get_document_from_apt(self, pkg): """ Get a xapian.Document for the given apt package record """ document = xapian.Document() # The document data is the package name document.set_data(pkg.name) # add information about the version of the package in slot 0 document.add_value(0, pkg.candidate.version) # Index the package name with a special prefix, to be able to find this # document by exact package name match document.add_term("XP"+pkg.name) # Have all the various plugins index their things for addon in self.plugins: addon.obj.index(document, pkg) return document def get_document_from_deb822(self, pkg): """ Get a xapian.Document for the given deb822 package record """ document = xapian.Document() # The document data is the package name document.set_data(pkg["Package"]) # add information about the version of the package in slot 0 document.add_value(0, pkg["Version"]) # Index the package name with a special prefix, to be able to find this # document by exact package name match document.add_term("XP"+pkg["Package"]) # Have all the various plugins index their things for addon in self.plugins: addon.obj.indexDeb822(document, pkg) return document def gen_documents_apt(self): """ Generate Xapian documents from an apt cache """ cache = 
self.aptcache() count = len(cache) for idx, pkg in enumerate(cache): if not pkg.candidate: continue # multiarch: do not index foreign arch if there is a native # archive version available (unless the pkg is installed) if (not pkg.installed and ":" in pkg.name and pkg.name.split(":")[0] in cache): continue # Print progress if idx % 200 == 0: self.progress.progress(100*idx/count) yield self.get_document_from_apt(pkg) def gen_documents_deb822(self, fnames): try: from debian import deb822 except ImportError: from debian_bundle import deb822 seen = set() for fname in fnames: infd = open(fname) # Get file size to compute progress total = os.fstat(infd.fileno())[6] for idx, pkg in enumerate(deb822.Deb822.iter_paragraphs(infd)): name = pkg["Package"] if name in seen: continue seen.add(name) # Print approximate progress by checking the current read position # against the file size if total > 0 and idx % 200 == 0: cur = infd.tell() self.progress.progress(100*cur/total) yield self.get_document_from_deb822(pkg) def compareCacheToDb(self, cache, db): """ Compare the apt cache to the database and return dicts of the form (pkgname, docid) for the following states: unchanged - no new version since the last update outdated - a new version since the last update obsolete - no longer in the apt cache """ unchanged = {} outdated = {} obsolete = {} self.progress.begin("Reading Xapian index") count = db.get_doccount() for (idx, m) in enumerate(db.postlist("")): if idx % 5000 == 0: self.progress.progress(100*idx/count) doc = db.get_document(m.docid) pkg = doc.get_data() # this will return '' if there is no value 0, which is fine because it # will fail the comparison with the candidate version causing a reindex dbver = doc.get_value(0) # check if the package no longer exists if not cache.has_key(pkg) or not cache[pkg].candidate: obsolete[pkg] = m.docid # check if we have a new version, we do not have to delete # the record, elif cache[pkg].candidate.version != dbver: outdated[pkg] = 
m.docid # it's a valid package and we know about it already else: unchanged[pkg] = m.docid self.progress.end() return unchanged, outdated, obsolete def updateIndex(self, pathname): """ Update the index """ try: db = xapian.WritableDatabase(pathname, xapian.DB_CREATE_OR_OPEN) except xapian.DatabaseLockError, e: self.progress.warning("DB Update failed, database locked") return cache = self.aptcache() count = len(cache) unchanged, outdated, obsolete = self.compareCacheToDb(cache, db) self.progress.verbose("Unchanged versions: %s, outdated versions: %s, " "obsolete versions: %s" % (len(unchanged), len(outdated), len(obsolete))) self.progress.begin("Updating Xapian index") for a in self.plugins: a.send_extra_info(db=db, aptcache=cache) for idx, pkg in enumerate(cache): if idx % 1000 == 0: self.progress.progress(100*idx/count) if not pkg.candidate: continue if pkg.name in unchanged: continue elif pkg.name in outdated: # update the existing db.replace_document(outdated[pkg.name], self.get_document_from_apt(pkg)) else: # add the new ones db.add_document(self.get_document_from_apt(pkg)) # and remove the obsoletes for docid in obsolete.values(): db.delete_document(docid) # finished for a in self.plugins: a.finished() db.flush() self.progress.end() def incrementalUpdate(self): if not os.path.exists(axi.XAPIANINDEX): self.progress.notice("No Xapian index built yet: falling back to full rebuild") return self.rebuild() dbkind, dbpath = open(axi.XAPIANINDEX).readline().split() self.updateIndex(dbpath) # Update the index timestamp if not os.path.exists(axi.XAPIANDBSTAMP): open(axi.XAPIANDBSTAMP, "w").close() if self.ds_timestamp > 0: os.utime(axi.XAPIANDBSTAMP, (self.ds_timestamp, self.ds_timestamp)) def buildIndex(self, pathname, documents, addoninfo={}): """ Create a new Xapian index with the content provided by the plugins """ self.progress.begin("Rebuilding Xapian index") # Create a new Xapian index db = xapian.WritableDatabase(pathname, xapian.DB_CREATE_OR_OVERWRITE) # It seems
to be faster without transactions, at the moment #db.begin_transaction(False) for a in self.plugins: a.send_extra_info(db=db) # Add all generated documents to the index for doc in documents: db.add_document(doc) #db.commit_transaction(); for a in self.plugins: a.finished() db.flush() self.progress.end() def rebuild(self, pkgfiles=None): # Create a new Xapian index with the content provided by the plugins # Xapian takes care of preventing concurrent updates and removing the old # database if it's left over by a previous crashed update # Generate a new index name for idx in itertools.count(1): tmpidxfname = "index.%d" % idx dbdir = os.path.join(axi.XAPIANCACHEPATH, tmpidxfname) if not os.path.exists(dbdir): break; if pkgfiles: generator = self.gen_documents_deb822(pkgfiles) else: for a in self.plugins: a.send_extra_info(aptcache=self.aptcache()) generator = self.gen_documents_apt() self.buildIndex(dbdir, generator) # Update the 'index' symlink to point at the new index self.progress.verbose("Installing the new index.") #os.symlink(tmpidxfname, axi.XAPIANDBPATH + "/index.tmp") out = open(axi.XAPIANINDEX + ".tmp", "w") print >>out, "auto", os.path.abspath(dbdir) out.close() os.rename(axi.XAPIANINDEX + ".tmp", axi.XAPIANINDEX) # Remove all other index.* directories that are not the newly created one def cleanoldcaches(dir): for file in os.listdir(dir): if not file.startswith("index."): continue # Don't delete what we just created if file == tmpidxfname: continue fullpath = os.path.join(dir, file) # Only delete directories if not os.path.isdir(fullpath): continue self.progress.verbose("Removing old index %s." 
% fullpath) shutil.rmtree(fullpath) cleanoldcaches(axi.XAPIANDBPATH) cleanoldcaches(axi.XAPIANCACHEPATH) # Commit the changes and update the last update timestamp if not os.path.exists(axi.XAPIANDBSTAMP): open(axi.XAPIANDBSTAMP, "w").close() if self.ds_timestamp > 0: os.utime(axi.XAPIANDBSTAMP, (self.ds_timestamp, self.ds_timestamp)) self.writeValues() self.writePrefixes() self.writeDoc() def writePrefixes(self, pathname=axi.XAPIANDBPREFIXES): """ Write the prefix information on the given file """ self.progress.verbose("Writing prefix information to %s." % pathname) out = open(pathname+".tmp", "w") print >>out, textwrap.dedent(""" # This file contains the information about keyword prefixes used in the # APT Xapian index. # # Xapian allows to prefix some terms so they can be told apart from # normal keywords: this is used for example with Debtags tags, and # stemmed forms. # # This file lists terms with their index prefix, their queryparser # prefix, whether queryparser should treat it as boolean or # probabilistic and a short description. 
""").lstrip() # Aggregate and normalise prefix information from all the plugins prefixes = dict() for addon in self.plugins: for p in addon.info.get("prefixes", []): idx = p.get("idx", None) if idx is None: continue qp = p.get("qp", None) type = p.get("type", None) desc = p.get("desc", None) # TODO: warn of inconsistencies (plugins that disagree on qp or type) old = prefixes.setdefault(idx, dict()) if qp: old.setdefault("qp", qp) if type: old.setdefault("type", type) if desc: old.setdefault("desc", desc) for name, info in sorted(prefixes.iteritems(), key=lambda x: x[0]): print >>out, "%s\t%s\t%s\t# %s" % ( name, info.get("qp", "-"), info.get("type", "-"), info.get("desc", "(description is missing)")) out.close() # Atomic update of the documentation os.rename(pathname+".tmp", pathname) def writeValues(self, pathname=axi.XAPIANDBVALUES): """ Write the value information on the given file """ self.progress.verbose("Writing value information to %s." % pathname) out = open(pathname+".tmp", "w") print >>out, textwrap.dedent(""" # This file contains the mapping between names of numeric values indexed in the # APT Xapian index and their index # # Xapian allows to index numeric values as well as keywords and to use them for # all sorts of useful querying tricks. However, every numeric value needs to # have a unique index, and this configuration file is needed to record which # indices are allocated and to provide a mnemonic name for them. # # The format is exactly like /etc/services with name, number and optional # aliases, with the difference that the second column does not use the # "/protocol" part, which would be meaningless here. 
""").lstrip() for name, idx in sorted(self.values.iteritems(), key=lambda x: x[1]): desc = self.values_desc[name] print >>out, "%s\t%d\t# %s" % (name, idx, desc) out.close() # Atomic update of the documentation os.rename(pathname+".tmp", pathname) def writeDoc(self, pathname=axi.XAPIANDBDOC): """ Write the documentation in the given file """ self.progress.verbose("Writing documentation to %s." % pathname) # Collect the documentation docinfo = [] for addon in self.plugins: try: doc = addon.obj.doc() if doc != None: docinfo.append(dict( name = doc['name'], shortDesc = doc['shortDesc'], fullDoc = doc['fullDoc'])) except: # If a plugin has problem returning documentation, don't worry about it self.progress.notice("Skipping documentation for plugin", addon.filename) # Write the documentation in pathname out = open(pathname+".tmp", "w") print >>out, textwrap.dedent(""" =============== Database layout =============== This Xapian database indexes Debian package information. To query the database, open it as ``%s/index``. Data are indexed either as terms or as values. Words found in package descriptions are indexed lowercase, and all other kinds of terms have an uppercase prefix as documented below. Numbers are indexed as Xapian numeric values. A list of the meaning of the numeric values is found in ``%s``. The data sources used for indexing are: """).lstrip() % (axi.XAPIANDBPATH, axi.XAPIANDBVALUES) for d in docinfo: print >>out, " * %s: %s" % (d['name'], d['shortDesc']) print >>out, textwrap.dedent(""" This Xapian index follows the conventions for term prefixes described in ``/usr/share/doc/xapian-omega/termprefixes.txt.gz``. Extra Debian data sources can define more extended prefixes (starting with ``X``): their meaning is documented below together with the rest of the data source documentation. At the very least, at least the package name (with the ``XP`` prefix) will be present in every document in the database. 
This allows quick lookup of a Xapian document by package name. The user data associated to a Xapian document is the package name. ------------------- Active data sources ------------------- """) for d in docinfo: print >>out, d['name'] print >>out, '='*len(d['name']) print >>out, textwrap.dedent(d['fullDoc']) print >>out out.close() # Atomic update of the documentation os.rename(pathname+".tmp", pathname) apt-xapian-index-0.47ubuntu13/README0000644000000000000000000000413413070503156013772 0ustar Repo: git://git.debian.org/git/collab-maint/apt-xapian-index.git http://git.debian.org/?p=collab-maint/apt-xapian-index.git Idea: - Many data sources each maintain their own data, but provide a plugin to index that data in a single, big Xapian database. - Every data source has its own db (possibly easy to read plaintext) in /var/lib/somewhere - Every data source has a tool to keep its own db up to date (e.g. downloading new data from the net, or whatever) - Every data source installs a plugin in /usr/share/apt-xapian-index/plugins that adds information from the data source into Xapian documents during indexing Technicalities: - There is a central update procedure, but it is fed enough data to do differential updates whenever Xapian makes it faster to do so - Next to the database there is a README file with information about how the index is built and how it can be queried. Every indexing plugin adds information to this README - Xapian values are looked up by index. Indexes are given names using a configuration file in the style of /etc/services, located at /etc/apt/xapian-index-values.conf Writing a plugin: - Take a look at plugins/template.py: it contains all the methods and full documentation on what they should do. - Take a look at the other plugins for examples: there are many of them.
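As a rough illustration of the interface that axi/indexer.py calls on a plugin (a module-level init(), then info(), init(), index(), indexDeb822() and doc() on the object it returns), a minimal plugin could look like the sketch below. The class name, the XSEC term prefix, the "section:" query prefix, the "bool" type string and the choice of timestamp=0 are assumptions made for this example only; plugins/template.py remains the authoritative description of each method. The sketch is written to run on both Python 2 and Python 3.

```python
# Hypothetical plugin sketch: indexes the archive section of each package.
# "XSEC", "section:" and the class name are illustrative assumptions.

class SectionPlugin(object):
    def info(self, **kw):
        """Describe the data source; the indexer reads 'timestamp',
        'values' and 'prefixes' from the returned dict."""
        return dict(
            # Assumption: this plugin has no data file of its own, so it
            # reports 0 and never makes the index look out of date.
            timestamp=0,
            # No Xapian value slots are requested by this sketch.
            values=[],
            # Term prefixes, in the shape aggregated by writePrefixes().
            prefixes=[
                dict(idx="XSEC", qp="section:", type="bool",
                     desc="Archive section of the package"),
            ],
        )

    def init(self, info, progress):
        """Long initialisation step; info['values'] maps names to slots."""
        self.values = info["values"]

    def doc(self):
        """Documentation merged into the README next to the database."""
        return dict(
            name="Section",
            shortDesc="archive section of the package",
            fullDoc="The section is indexed as a term with the XSEC prefix.",
        )

    def index(self, document, pkg):
        """Index a package coming from the apt cache."""
        # Assumption: the candidate version exposes a 'section' attribute.
        section = getattr(pkg.candidate, "section", None)
        if section:
            document.add_term("XSEC" + section)

    def indexDeb822(self, document, pkg):
        """Index a package coming from a deb822 paragraph."""
        if "Section" in pkg:
            document.add_term("XSEC" + pkg["Section"])


def init(**kw):
    """Entry point called by the indexer; returning None disables the plugin."""
    return SectionPlugin()
```

The indexer only loads files from the plugin directory whose names end in .py and do not start with '.' or '_', so a plugin like this would be saved under an ordinary name there.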
Packages with Xapian bindings that can be used for querying the database: - C++: libxapian-dev - Perl: libsearch-xapian-perl - Ruby: libxapian-ruby1.8 - Python: python-xapian - Tcl: tclxapian - PHP: php5-xapian The C++ API documentation is in the package xapian-doc. The documentation of the other languages is in the same package as the bindings. Examples can be found in xapian-examples, as well as in apt-xapian-index. Some low-level tools to access the database can be found in xapian-tools. Please see http://www.xapian.org and http://www.xapian.org/docs/ To do: - Example queries - Example scripts - Document libept transition - Libept transition - Move the debtags plugin into the debtags package - Popcon data source - Iterating data source apt-xapian-index-0.47ubuntu13/setup.py0000644000000000000000000000135113070503156014622 0ustar #!/usr/bin/env python import sys import os.path from distutils.core import setup for line in open(os.path.join(os.path.dirname(sys.argv[0]),'update-apt-xapian-index')): if line.startswith('VERSION='): version = eval(line.split('=')[-1]) setup(name='apt-xapian-index', version=version, description='Xapian index of Debian packages', # long_description='' author=['Enrico Zini'], author_email=['enrico@debian.org'], url='http://www.enricozini.org/sw/apt-xapian-index/', install_requires = [ "debian", "apt", "xapian", ], license='GPL', platforms='any', packages=['axi'], # py_modules=[''], scripts=['update-apt-xapian-index', 'axi-cache'], ) apt-xapian-index-0.47ubuntu13/examples/0000755000000000000000000000000013070503156014726 5ustar apt-xapian-index-0.47ubuntu13/examples/axi-searchcloud.py0000755000000000000000000002017613070503156020364 0ustar #!/usr/bin/python # axi-searchasyoutype - Search-as-you-type demo # # Copyright (C) 2008 Matteo Zandi, Enrico Zini # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2
of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # Note: this program needs python-gtkhtml2 to work from optparse import OptionParser import sys VERSION="0.1" # Let's start with a simple command line parser with help class Parser(OptionParser): def __init__(self, *args, **kwargs): OptionParser.__init__(self, *args, **kwargs) def error(self, msg): sys.stderr.write("%s: error: %s\n\n" % (self.get_prog_name(), msg)) self.print_help(sys.stderr) sys.exit(2) parser = Parser(usage="usage: %prog [options] keywords", version="%prog "+ VERSION, description="Query the Apt Xapian index. 
Command line arguments can be keywords or Debtags tags") parser.add_option("-t", "--type", help="package type, one of 'game', 'gui', 'cmdline' or 'editor'") (options, args) = parser.parse_args() import gtk, pygtk # Import the rest here so we don't need dependencies to be installed only to # print commandline help import os, math import xapian from aptxapianindex import * from debian import deb822 import gtkhtml2 # Instantiate a xapian.Database object for read only access to the index db = xapian.Database(XAPIANDB) # Instantiate the APT cache as well cache = apt.Cache() # Facet name -> Short description facets = dict() # Tag name -> Short description tags = dict() for p in deb822.Deb822.iter_paragraphs(open("/var/lib/debtags/vocabulary", "r")): if "Description" not in p: continue desc = p["Description"].split("\n", 1)[0] if "Tag" in p: tags[p["Tag"]] = desc elif "Facet" in p: facets[p["Facet"]] = desc def SimpleOrQuery(input_terms): """ Simple OR query. input_terms is an array of words. Returns a (packages, taglist) pair, both empty if there is nothing to show. """ if len(input_terms) == 0: # No text given: abort, returning the same pair shape the caller unpacks return [], [] # To understand the following code, keep in mind that we do # search-as-you-type, so the last word may be partially typed. if len(input_terms[-1]) == 1: # If the last term has only one character, we ignore it waiting for # the user to type more. A better implementation can set up a # timer to disable this some time after the user stopped typing, to # give meaningful results to searches like "programming in c" input_terms = input_terms[:-1] if len(input_terms) == 0: return [], [] # Convert the words into terms for the query terms = termsForSimpleQuery(input_terms) # Since the last word can be partially typed, we add all words that # begin with the last one. terms.extend([x.term for x in db.allterms(input_terms[-1])]) # Build the query query = xapian.Query(xapian.Query.OP_OR, terms) # Add the simple user filter, if requested.
This is to show that even # when we do search-as-you-type, everything works as usual query = addSimpleFilterToQuery(query, options.type) # Perform the query enquire = xapian.Enquire(db) enquire.set_query(query) # This does the adaptive cutoff trick on the query results (see # axi-query-adaptivecutoff.py) mset = enquire.get_mset(0, 1) try: topWeight = mset[0].weight except IndexError: return [], [] enquire.set_cutoff(0, topWeight * 0.7) # Select the first 30 documents as the key ones to use to compute relevant # terms packages = [] rset = xapian.RSet() for m in enquire.get_mset(0, 30): rset.add_document(m.docid) name = m.document.get_data() score = m.percent try: pkg = cache[name] except KeyError: continue # pkg.candidate may be none try: shortdesc = pkg.candidate.summary except AttributeError: continue packages.append((score, name, shortdesc)) class Filter(xapian.ExpandDecider): def __call__(self, term): #return term[0].islower() or term[:2] == "XT" return term[:2] == "XT" def format(k): if k in tags: facet = k.split("::", 1)[0] if facet in facets: return "%s: %s" % (facets[facet], tags[k]) else: return "%s" % tags[k] else: return k taglist = [] maxscore = None for res in enquire.get_eset(15, rset, Filter()): # Normalise the score in the interval [0, 1] weight = math.log(res.weight) if maxscore == None: maxscore = weight tag = res.term[2:] taglist.append( (tag, format(tag), float(weight) / maxscore) ) taglist.sort(key=lambda x:x[0]) if len(taglist) == 0: return [], [] return packages, taglist def mark_text_up(result_list): # 0-100 score, key (facet::tag), description document = gtkhtml2.Document() document.clear() document.open_stream("text/html") document.write_stream(""" """) for tag, desc, score in result_list: document.write_stream('%s ' % (tag, score*150, desc)) document.write_stream("") document.close_stream() return document class Demo: def __init__(self): w = gtk.Window() w.connect('destroy', gtk.main_quit) w.set_default_size(400, 400) self.model = 
gtk.ListStore(int, str, str) treeview = gtk.TreeView() treeview.show() treeview.set_model(self.model) cell_pct = gtk.CellRendererText() column_pct = gtk.TreeViewColumn ("Percent", cell_pct, text=0) column_pct.set_sort_column_id(0) treeview.append_column(column_pct) cell_name = gtk.CellRendererText() column_name = gtk.TreeViewColumn ("Name", cell_name, text=1) column_name.set_sort_column_id(0) treeview.append_column(column_name) cell_desc = gtk.CellRendererText() column_desc = gtk.TreeViewColumn ("Name", cell_desc, text=2) column_desc.set_sort_column_id(0) treeview.append_column(column_desc) document = gtkhtml2.Document() document.clear() document.open_stream("text/html") document.write_stream("Welcome, enter some text to start!") document.close_stream() self.view = gtkhtml2.View() self.view.set_document(document) scrolledwin = gtk.ScrolledWindow() scrolledwin.show() scrolledwin.set_policy(gtk.POLICY_NEVER, gtk.POLICY_AUTOMATIC) scrolledwin.add(treeview) vbox = gtk.VBox(False, 0) vbox.pack_start(scrolledwin, True, True, 0) vbox.pack_start(self.view, True, True, 0) self.entry = gtk.Entry() self.entry.connect('changed', self.on_entry_changed) vbox.pack_start(self.entry, False, True, 0) w.add(vbox) w.show_all() gtk.main() def on_entry_changed(self, widget, *args): packageslist, tags = SimpleOrQuery(widget.get_text().split()) self.model.clear() for item in packageslist: self.model.append(item) doc = mark_text_up(tags) doc.connect('link_clicked', self.on_link_clicked) self.view.set_document(doc) def on_link_clicked(self, document, link): self.entry.set_text(link + " " + self.entry.get_text().lstrip()) if __name__ == "__main__": demo = Demo() apt-xapian-index-0.47ubuntu13/examples/axi-query-adaptivecutoff.py0000755000000000000000000000513413070503156022234 0ustar #!/usr/bin/python # axi-query-adaptivecutoff - Use an adaptive cutoff to select results # # Copyright (C) 2007 Enrico Zini # # This program is free software; you can redistribute it and/or modify # it under the 
terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA from optparse import OptionParser import sys VERSION="0.1" # Let's start with a simple command line parser with help class Parser(OptionParser): def __init__(self, *args, **kwargs): OptionParser.__init__(self, *args, **kwargs) def error(self, msg): sys.stderr.write("%s: error: %s\n\n" % (self.get_prog_name(), msg)) self.print_help(sys.stderr) sys.exit(2) parser = Parser(usage="usage: %prog [options] keywords", version="%prog "+ VERSION, description="Query the Apt Xapian index. 
Command line arguments can be keywords or Debtags tags")
parser.add_option("-t", "--type", help="package type, one of 'game', 'gui', 'cmdline' or 'editor'")

(options, args) = parser.parse_args()

# Import the rest here so we don't need dependencies to be installed only to
# print commandline help
import os
import xapian
from aptxapianindex import *

# Instantiate a xapian.Database object for read only access to the index
db = xapian.Database(XAPIANDB)

# Build the base query as seen in axi-query-simple.py
query = xapian.Query(xapian.Query.OP_OR, termsForSimpleQuery(args))

# Add the simple user filter, if requested
query = addSimpleFilterToQuery(query, options.type)

# Perform the query
enquire = xapian.Enquire(db)
enquire.set_query(query)

# Retrieve the first result, and check its relevance
matches = enquire.get_mset(0, 1)
try:
    topWeight = matches[0].weight
except IndexError:
    # No results at all: there is no top weight to base the cutoff on
    print "No results found."
    sys.exit(0)

# Tell Xapian that we only want results that are at least 70% as good as that
enquire.set_cutoff(0, topWeight * 0.7)

# Now we have a meaningful cutoff, so we can get a larger number of results:
# thanks to the cutoff, approximate results will stop before starting to be
# really bad.
matches = enquire.get_mset(0, 200)

# Display the results
show_mset(matches)

sys.exit(0)
apt-xapian-index-0.47ubuntu13/examples/axi-searchasyoutype.py0000755000000000000000000001453013070503156021315 0ustar
#!/usr/bin/python

# axi-searchasyoutype - Search-as-you-type demo
#
# Copyright (C) 2007 Enrico Zini
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA from optparse import OptionParser import sys VERSION="0.1" # Let's start with a simple command line parser with help class Parser(OptionParser): def __init__(self, *args, **kwargs): OptionParser.__init__(self, *args, **kwargs) def error(self, msg): sys.stderr.write("%s: error: %s\n\n" % (self.get_prog_name(), msg)) self.print_help(sys.stderr) sys.exit(2) parser = Parser(usage="usage: %prog [options] keywords", version="%prog "+ VERSION, description="Query the Apt Xapian index. Command line arguments can be keywords or Debtags tags") parser.add_option("-t", "--type", help="package type, one of 'game', 'gui', 'cmdline' or 'editor'") (options, args) = parser.parse_args() # Import the rest here so we don't need dependencies to be installed only to # print commandline help import os import xapian from aptxapianindex import * # Instantiate a xapian.Database object for read only access to the index db = xapian.Database(XAPIANDB) # Instantiate the APT cache as well cache = apt.Cache() import curses import curses.wrapper import re class Results: """ Show the results of a query while we type it """ def __init__(self, stdscr): maxy, maxx = stdscr.getmaxyx() self.size = maxy-1 self.splitline = re.compile(r'\s+') self.win = curses.newwin(self.size, maxx, 0, 0) self.win.clear() def noresults(self, suggestion = "type more"): self.win.clear() self.win.addstr(0, 0, "No results, " + suggestion, curses.A_BOLD) self.win.refresh() def update(self, line): """ Show the results of the search done using the given line of text """ line = line.lower().strip() if len(line) == 0: # No text given: abort self.noresults() return # Split the line in words args = self.splitline.split(line) if len(args) == 0: # No text given: abort self.noresults() return # To understand the following code, keep 
in mind that we do # search-as-you-type, so the last word may be partially typed. if len(args[-1]) == 1: # If the last term has only one character, we ignore it waiting for # the user to type more. A better implementation can set up a # timer to disable this some time after the user stopped typing, to # give meaningful results to searches like "programming in c" args = args[:-1] if len(args) == 0: self.noresults() return # Convert the words into terms for the query terms = termsForSimpleQuery(args) # Since the last word can be partially typed, we add all words that # begin with the last one. terms.extend([x.term for x in db.allterms(args[-1])]) # Build the query query = xapian.Query(xapian.Query.OP_OR, terms) # Add the simple user filter, if requested. This is to show that even # when we do search-as-you-type, everything works as usual query = addSimpleFilterToQuery(query, options.type) # Perform the query enquire = xapian.Enquire(db) enquire.set_query(query) # This does the adaptive cutoff trick on the query results (see # axi-query-adaptivecutoff.py) mset = enquire.get_mset(0, 1) try: topWeight = mset[0].weight except IndexError: self.noresults("change your query") return enquire.set_cutoff(0, topWeight * 0.7) # Retrieve as many results as we can show mset = enquire.get_mset(0, self.size - 1) # Redraw the window self.win.clear() # Header self.win.addstr(0, 0, "%i results found." % mset.get_matches_estimated(), curses.A_BOLD) # Results for y, m in enumerate(mset): # /var/lib/apt-xapian-index/README tells us that the Xapian document data # is the package name. 
            name = m.document.get_data()

            # Get the package record out of the Apt cache, so we can retrieve the short
            # description
            pkg = cache[name]

            if pkg.candidate:
                # Print the match, together with the short description
                self.win.addstr(y+1, 0, "%i%% %s - %s" % (m.percent, name, pkg.candidate.summary))

        self.win.refresh()

# Read the query from the keyboard, one key at a time, and refresh the results
# whenever the typed line changes
class Input:
    def __init__(self, stdscr, results):
        maxy, maxx = stdscr.getmaxyx()
        self.results = results
        self.win = curses.newwin(1, maxx, maxy-1, 0)
        self.win.bkgdset(' ', curses.A_REVERSE)
        self.win.clear()
        self.win.addstr(0, 0, "> ", curses.A_REVERSE)
        self.line = ""

    def mainloop(self):
        old = ""
        while True:
            c = self.win.getch()
            if c == 10 or c == 27:
                # Enter or Escape: quit
                break
            elif c == 127:
                # Backspace: delete the last character, if any
                if len(self.line) > 0:
                    self.line = self.line[:-1]
            else:
                self.line += chr(c)
            self.win.clear()
            self.win.addstr(0, 0, "> " + self.line, curses.A_REVERSE)
            self.win.refresh()
            if self.line != old:
                self.results.update(self.line)
                old = self.line

def main(stdscr):
    results = Results(stdscr)
    input = Input(stdscr, results)
    stdscr.refresh()
    input.mainloop()

curses.wrapper(main)

sys.exit(0)
apt-xapian-index-0.47ubuntu13/examples/aptxapianindex.py0000644000000000000000000001247613070503156020327 0ustar
# This program is free software. It comes without any warranty, to
# the extent permitted by applicable law. You can redistribute it
# and/or modify it under the terms of the Do What The Fuck You Want
# To Public License, Version 2, as published by Sam Hocevar. See
# http://sam.zoy.org/wtfpl/COPYING for more details.

import os, re, sys
import xapian
import warnings

# This tells python-apt that we've seen the warning about the API not being
# stable yet, and we don't want to see it every time we run the program
warnings.filterwarnings("ignore","apt API not stable yet")
import apt
warnings.resetwarnings()

# Setup configuration
XAPIANDBPATH = os.environ.get("AXI_DB_PATH", "/var/lib/apt-xapian-index")
XAPIANDB = XAPIANDBPATH + "/index"
XAPIANDBVALUES = XAPIANDBPATH + "/values"

# This is our little database of simple Debtags filters we provide: the name
# entered by the user in "--type" maps to a piece of Xapian query
filterdb = dict(
    # We can do simple AND queries...
    game = xapian.Query(xapian.Query.OP_AND, ('XTuse::gameplaying', 'XTrole::program')),
    # Or we can do complicated binary expressions...
    gui = xapian.Query(xapian.Query.OP_AND, xapian.Query('XTrole::program'),
                xapian.Query(xapian.Query.OP_OR, 'XTinterface::x11', 'XTinterface::3d')),
    cmdline = xapian.Query(xapian.Query.OP_AND, 'XTrole::program', 'XTinterface::commandline'),
    editor = xapian.Query(xapian.Query.OP_AND, 'XTrole::program', 'XTuse::editing')
    # Feel free to invent more
)

def termsForSimpleQuery(keywords):
    """
    Given a list of user-supplied keywords, build the list of terms that will
    go in a simple Xapian query.

    If a term is lowercase and contains '::', then it's considered to be a
    Debtags tag.
    """
    stemmer = xapian.Stem("english")
    terms = []
    for word in keywords:
        if word.islower() and word.find("::") != -1:
            # FIXME: A better way could be to look up arguments in
            # /var/lib/debtags/vocabulary
            #
            # According to /var/lib/apt-xapian-index/README, Debtags tags are
            # indexed with the 'XT' prefix.
            terms.append("XT"+word)
        else:
            # If it is not a Debtags tag, then we consider it a normal keyword.
            word = word.lower()
            terms.append(word)
            # If the word has a stemmed version, add it to the query.
            # /var/lib/apt-xapian-index/README tells us that stemmed terms have a
            # 'Z' prefix.
            stem = stemmer(word)
            if stem != word:
                terms.append("Z"+stem)
    return terms

def addSimpleFilterToQuery(query, filtername):
    """
    If filtername is not None, look up the name in the simple filter database
    and add its filter to the query. Returns the enhanced query.
    """
    # See if the user wants to use one of the result filters
    if filtername:
        if filtername in filterdb:
            # If a filter was requested, AND it with the query
            return xapian.Query(xapian.Query.OP_AND, filterdb[filtername], query)
        else:
            raise RuntimeError("Invalid filter type. Try one of " + ", ".join(sorted(filterdb.keys())))
    else:
        return query

def show_mset(mset):
    """
    Show a Xapian result mset as a list of packages and their short descriptions
    """
    # Display all the results in the mset, sorted by how well they match
    cache = apt.Cache()
    print "%i results found." % mset.get_matches_estimated()
    print "Results 1-%i:" % mset.size()
    for m in mset:
        # /var/lib/apt-xapian-index/README tells us that the Xapian document data
        # is the package name.
        name = m.document.get_data()

        # Get the package record out of the Apt cache, so we can retrieve the short
        # description
        pkg = cache[name]

        # Print the match, together with the short description
        if pkg.candidate:
            print "%i%% %s - %s" % (m.percent, name, pkg.candidate.summary)

def readValueDB(pathname):
    """
    Read the "/etc/services"-style database of value indices
    """
    try:
        rmcomments = re.compile(r"\s*(#.*)?$")
        splitter = re.compile(r"\s+")
        values = {}
        for idx, line in enumerate(open(pathname)):
            # Remove comments and trailing spaces
            line = rmcomments.sub("", line)
            # Skip empty lines
            if len(line) == 0: continue
            # Split the line
            fields = splitter.split(line)
            if len(fields) < 2:
                print >>sys.stderr, "Ignoring line %s:%d: only 1 value found when I need at least the value name and number" % (pathname, idx+1)
                continue
            # Parse the number
            try:
                number = int(fields[1])
            except ValueError:
                print >>sys.stderr, "Ignoring line %s:%d: the second column (\"%s\") must be a number" % (pathname, idx+1, fields[1])
                continue
            values[fields[0]] = number
            for alias in fields[2:]:
                values[alias] = number
    except (IOError, OSError), e:
        # If we can't read the database, fall back to defaults.
        # In Python 2, open() raises IOError, which OSError alone does not catch.
        print >>sys.stderr, "Cannot read %s: %s. Using a minimal default configuration" % (pathname, e)
        values = dict(
            installedsize = 1,
            packagesize = 2
        )
    return values
apt-xapian-index-0.47ubuntu13/examples/axi-query-pkgtype.py0000755000000000000000000001005513070503156020711 0ustar
#!/usr/bin/python

# axi-query-pkgtype - Like axi-query-simple.py, but with a simple
# result filter
#
# Copyright (C) 2007 Enrico Zini
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

from optparse import OptionParser
import sys

VERSION="0.1"

# Let's start with a simple command line parser with help
class Parser(OptionParser):
    def __init__(self, *args, **kwargs):
        OptionParser.__init__(self, *args, **kwargs)

    def error(self, msg):
        sys.stderr.write("%s: error: %s\n\n" % (self.get_prog_name(), msg))
        self.print_help(sys.stderr)
        sys.exit(2)

parser = Parser(usage="usage: %prog [options] keywords",
                version="%prog "+ VERSION,
                description="Query the Apt Xapian index. 
Command line arguments can be keywords or Debtags tags") parser.add_option("-t", "--type", help="package type, one of 'game', 'gui', 'cmdline' or 'editor'") (options, args) = parser.parse_args() # Import the rest here so we don't need dependencies to be installed only to # print commandline help import os import xapian import warnings from aptxapianindex import * # This tells python-apt that we've seen the warning about the API not being # stable yet, and we don't want to see every time we run the program warnings.filterwarnings("ignore","apt API not stable yet") import apt warnings.resetwarnings() # This is our little database of simple Debtags filters we provide: the name # entered by the user in "--type" maps to a piece of Xapian query filterdb = dict( # We can do simple AND queries... game = xapian.Query(xapian.Query.OP_AND, ('XTuse::gameplaying', 'XTrole::program')), # Or we can do complicate binary expressions... gui = xapian.Query(xapian.Query.OP_AND, xapian.Query('XTrole::program'), xapian.Query(xapian.Query.OP_OR, 'XTinterface::x11', 'XTinterface::3d')), cmdline = xapian.Query(xapian.Query.OP_AND, 'XTrole::program', 'XTinterface::commandline'), editor = xapian.Query(xapian.Query.OP_AND, 'XTrole::program', 'XTuse::editing')) # Feel free to invent more # Instantiate a xapian.Database object for read only access to the index db = xapian.Database(XAPIANDB) # Build the base query as seen in axi-query-simple.py query = xapian.Query(xapian.Query.OP_OR, termsForSimpleQuery(args)) # See if the user wants to use one of the result filters if options.type: if options.type in filterdb: # If a filter was requested, AND it with the query query = xapian.Query(xapian.Query.OP_AND, query, filterdb[options.type]) else: print >>sys.stderr, "Invalid filter type. 
Try one of", ", ".join(sorted(filterdb.keys())) sys.exit(1) # Perform the query, all the rest is as in axi-query-simple.py enquire = xapian.Enquire(db) enquire.set_query(query) # Display the top 20 results, sorted by how well they match cache = apt.Cache() matches = enquire.get_mset(0, 20) print "%i results found." % matches.get_matches_estimated() print "Results 1-%i:" % matches.size() for m in matches: # /var/lib/apt-xapian-index/README tells us that the Xapian document data # is the package name. name = m.document.get_data() # Get the package record out of the Apt cache, so we can retrieve the short # description pkg = cache[name] if pkg.candidate: # Print the match, together with the short description print "%i%% %s - %s" % (m.percent, name, pkg.candidate.summary) sys.exit(0) apt-xapian-index-0.47ubuntu13/examples/axi-query-simple.py0000755000000000000000000001070113070503156020515 0ustar #!/usr/bin/python # axi-query-simple - apt-cache search replacement using apt-xapian-index # # Copyright (C) 2007 Enrico Zini # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
# # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA from optparse import OptionParser import sys VERSION="0.1" # Let's start with a simple command line parser with help class Parser(OptionParser): def __init__(self, *args, **kwargs): OptionParser.__init__(self, *args, **kwargs) def error(self, msg): sys.stderr.write("%s: error: %s\n\n" % (self.get_prog_name(), msg)) self.print_help(sys.stderr) sys.exit(2) parser = Parser(usage="usage: %prog [options]", version="%prog "+ VERSION, description="Query the Apt Xapian index. Command line arguments can be keywords or Debtags tags") (options, args) = parser.parse_args() # Import the rest here so we don't need dependencies to be installed only to # print commandline help import os import xapian import warnings # This tells python-apt that we've seen the warning about the API not being # stable yet, and we don't want to see every time we run the program warnings.filterwarnings("ignore","apt API not stable yet") import apt warnings.resetwarnings() # Setup configuration XAPIANDBPATH = os.environ.get("AXI_DB_PATH", "/var/lib/apt-xapian-index") XAPIANDB = XAPIANDBPATH + "/index" # Instantiate a xapian.Database object for read only access to the index db = xapian.Database(XAPIANDB) # Stemmer function to generate stemmed search keywords stemmer = xapian.Stem("english") # Build the terms that will go in the query terms = [] for word in args: if word.islower() and word.find("::") != -1: # If it's lowercase and it contains '::', then we consider it a Debtags # tag. A better way could be to look up arguments in # /var/lib/debtags/vocabulary # # According to /var/lib/apt-xapian-index/README, Debtags tags are # indexed with the 'XT' prefix. terms.append("XT"+word) else: # If it is not a Debtags tag, then we consider it a normal keyword. 
        word = word.lower()
        terms.append(word)
        # If the word has a stemmed version, add it to the query.
        # /var/lib/apt-xapian-index/README tells us that stemmed terms have a
        # 'Z' prefix.
        stem = stemmer(word)
        if stem != word:
            terms.append("Z"+stem)

# OR the terms together into a Xapian query.
#
# One may ask, why OR and not AND? The reason is that, contrary to
# apt-cache, Xapian scores results according to how well they matched.
#
# Matches that match all the terms will score higher than the others, so if we
# build an OR query what we really have is an AND query that gracefully
# degrades to looser matches once the perfect results run out.
#
# This allows stemmed searches to work nicely: if you look for 'editing', then
# the query will be 'editing OR Zedit'. Packages with the word 'editing' will
# match both and score higher, and packages with the word 'edited' will still
# match 'Zedit' and be included in the results.
query = xapian.Query(xapian.Query.OP_OR, terms)

# Perform the query
enquire = xapian.Enquire(db)
enquire.set_query(query)

# Display the top 20 results, sorted by how well they match
cache = apt.Cache()
matches = enquire.get_mset(0, 20)
print "%i results found." % matches.get_matches_estimated()
print "Results 1-%i:" % matches.size()
for m in matches:
    # /var/lib/apt-xapian-index/README tells us that the Xapian document data
    # is the package name.
name = m.document.get_data() # Get the package record out of the Apt cache, so we can retrieve the short # description pkg = cache[name] if pkg.candidate: # Print the match, together with the short description print "%i%% %s - %s" % (m.percent, name, pkg.candidate.summary) sys.exit(0) apt-xapian-index-0.47ubuntu13/examples/axi-query-expand.py0000755000000000000000000000753713070503156020520 0ustar #!/usr/bin/python # axi-query-expand - Query and show possible expansions # # Copyright (C) 2007 Enrico Zini # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA from optparse import OptionParser import sys VERSION="0.1" # Let's start with a simple command line parser with help class Parser(OptionParser): def __init__(self, *args, **kwargs): OptionParser.__init__(self, *args, **kwargs) def error(self, msg): sys.stderr.write("%s: error: %s\n\n" % (self.get_prog_name(), msg)) self.print_help(sys.stderr) sys.exit(2) parser = Parser(usage="usage: %prog [options] keywords", version="%prog "+ VERSION, description="Query the Apt Xapian index. 
Command line arguments can be keywords or Debtags tags")
parser.add_option("-t", "--type", help="package type, one of 'game', 'gui', 'cmdline' or 'editor'")

(options, args) = parser.parse_args()

# Import the rest here so we don't need dependencies to be installed only to
# print commandline help
import os
import xapian
from aptxapianindex import *

# Instantiate a xapian.Database object for read only access to the index
db = xapian.Database(XAPIANDB)

# Build the base query as seen in axi-query-simple.py
query = xapian.Query(xapian.Query.OP_OR, termsForSimpleQuery(args))

# Add the simple user filter, if requested
query = addSimpleFilterToQuery(query, options.type)

# Perform the query
enquire = xapian.Enquire(db)
enquire.set_query(query)

# Retrieve the top 20 results
matches = enquire.get_mset(0, 20)

# Display the results
show_mset(matches)

# Now, we ask Xapian which terms in the index are most relevant to this
# search. This can be used to suggest to the user the most useful ways of
# refining the search.

# Use all the documents retrieved above as the key ones to compute relevant
# terms
rset = xapian.RSet()
for m in matches:
    rset.add_document(m.docid)

# This is the "Expansion set" for the search: the 10 most relevant terms
eset = enquire.get_eset(10, rset)

# Print it out. Note that some terms have a prefix from the database: can we
# filter them out? Indeed: Xapian allows passing a filter to get_eset.
# Read on...
print
print "Terms that could improve the search:",
print ", ".join(["%s (%.2f%%)" % (res.term, res.weight) for res in eset])

# You can also abuse this feature to show which tags are most related to the
# search results. This allows you to turn a search based on keywords into a
# search based on semantic attributes, which would be an absolutely stunning
# feature in a GUI.

# We can do it thanks to Xapian allowing us to specify a filter for the output of
# get_eset.
This filter filters out all the keywords that are not tags, or # that were in the list of query terms. class Filter(xapian.ExpandDecider): def __call__(self, term): """ Return true if we want the term, else false """ return term[:2] == "XT" # This is the "Expansion set" for the search: the 10 most relevant terms that # match the filter eset = enquire.get_eset(10, rset, Filter()) # Print out the resulting tags print print "Tags that could improve the search:", print ", ".join(["%s (%.2f%%)" % (res.term[2:], res.weight) for res in eset]) sys.exit(0) apt-xapian-index-0.47ubuntu13/examples/ruby/0000755000000000000000000000000013070503156015707 5ustar apt-xapian-index-0.47ubuntu13/examples/ruby/axi-searchasyoutype.rb0000644000000000000000000001267113070503156022252 0ustar #!/usr/bin/env ruby # # axi-searchasyoutype - Search-as-you-type demo # # Copyright (C) 2007 Enrico Zini # Copyright (C) 2008 Daniel Brumbaugh Keeney # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # ############################# # # INCOMPLETE, does not work # ############################# require 'optparse' type = nil OptionParser.new do |opts| opts.program_name = 'axi-query-pkgtype.rb' opts.version = '0.1' opts.release = '1203587714' opts.banner = "Query the Apt Xapian index. 
Command line arguments can be keywords or Debtags tags" opts.on '-t', '--type TYPE', "package type, one of 'game', 'gui', 'cmdline' or 'editor'" do |v| type = v.to_sym end opts.on_tail("-h", "--help", "Show this message") do puts opts exit end end.parse! rescue ( puts 'try axi-query-pkgtype.rb --help'; exit 2 ) args = ARGV.collect do |i| i.dup; end # Import the rest here so we don't need dependencies to be installed only to # print commandline help require 'xapian' require 'aptxapianindex' require 'curses' # Instantiate a xapian.Database object for read only access to the index db = Xapian::Database.new(XAPIANDB) # Show the results of a query while we type it class Results def initialize stdscr maxy, maxx = stdscr.getmaxyx size = maxy-1 # splitline = /\s+/ win = curses.newwin(size, maxx, 0, 0) win.clear end def noresults(suggestion = "type more") win.clear win.addstr(0, 0, "No results, " + suggestion, curses.A_BOLD) win.refresh end # Show the results of the search done using the given line of text def update(line) line = line.lower.strip if line.length == 0 # No text given: abort noresults return nil end # Split the line in words args = splitline.split(line) if args.length == 0 # No text given: abort noresults return nil end # To understand the following code, keep in mind that we do # search-as-you-type, so the last word may be partially typed. if args[-1].length == 1 # If the last term has only one character, we ignore it waiting for # the user to type more. A better implementation can set up a # timer to disable this some time after the user stopped typing, to # give meaningful results to searches like "programming in c" args = args[0..-2] if args.length == 0 self.noresults return nil end end # Convert the words into terms for the query terms = terms_for_simple_query(args) # Since the last word can be partially typed, we add all words that # begin with the last one. 
terms.extend(db.allterms(args[-1]).collect do |x| x.term; end) # Build the query query = Xapian::Query.new(Xapian::Query::OP_OR, terms) # Add the simple user filter, if requested. This is to show that even # when we do search-as-you-type, everything works as usual query = addSimpleFilterToQuery(query, type) # Perform the query enquire = Xapian::Enquire.new(db) enquire.query = query # This does the adaptive cutoff trick on the query results (see # axi-query-adaptivecutoff.py) mset = enquire.mset(0, 1) begin top_weight = mset[0].weight rescue IndexError noresults 'change your query' return nil end enquire.set_cutoff(0, top_weight * 0.7) # Retrieve as many results as we can show mset = enquire.mset(0, self.size - 1) # Redraw the window self.win.clear # Header self.win.addstr(0, 0, "%i results found." % mset.matches_estimated, curses.A_BOLD) # Results mset.each_pair do |y, m| # /var/lib/apt-xapian-index/README tells us that the Xapian document data # is the package name. name = m.document.data # Print the match, together with the short description self.win.addstr(y+1, 0, "%i%% %s - %s" % [m.percent, name, pkg.summary]) end self.win.refresh end end # Build the base query as seen in axi-query-simple.py class Input def initialize stdscr, results maxy, maxx = stdscr.getmaxyx results = results win = curses.newwin(1, maxx, maxy-1, 0) win.bkgdset(' ', curses.A_REVERSE) win.clear win.addstr(0, 0, "> ", curses.A_REVERSE) line = "" end def mainloop old = "" loop do c = win.getch break if c == 10 or c == 27 if c == 127 control = true unless line.empty? 
line = line[0..-2] end else line << chr(c) end win.clear win.addstr(0, 0, "> " + self.line, curses.A_REVERSE) win.refresh unless line == old results.update(self.line) old = line end end end end def main(stdscr) results = Results.new(stdscr) input = Input(stdscr, results) stdscr.refresh input.mainloop end main nil # curses.wrapper(main) apt-xapian-index-0.47ubuntu13/examples/ruby/axi-query-tags.rb0000644000000000000000000000610413070503156021115 0ustar #!/usr/bin/env ruby # axi-query-tags - Look for Debtags tags by keyword # # Copyright (C) 2007 Enrico Zini # Copyright (C) 2008 Daniel Brumbaugh Keeney # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA require 'optparse' type = nil OptionParser.new do |opts| opts.program_name = 'axi-query-pkgtype.rb' opts.version = '0.1' opts.release = '1203587714' opts.banner = "Query the Apt Xapian index. Command line arguments can be keywords or Debtags tags" opts.on '-t', '--type TYPE', "package type, one of 'game', 'gui', 'cmdline' or 'editor'" do |v| type = v.to_sym end opts.on_tail("-h", "--help", "Show this message") do puts opts exit end end.parse! 
rescue ( puts 'try axi-query-pkgtype.rb --help'; exit 2 ) args = ARGV.collect do |i| i.dup; end # Import the rest here so we don't need dependencies to be installed only to # print commandline help require 'xapian' require 'aptxapianindex' # Instantiate a xapian.Database object for read only access to the index db = Xapian::Database.new(XAPIANDB) # Stemmer function to generate stemmed search keywords stemmer = Xapian::Stem.new("english") # Build the base query query = Xapian::Query.new(Xapian::Query::OP_OR, terms_for_simple_query(args)) # Perform the query enquire = Xapian::Enquire.new(db) enquire.query = query # Now, instead of showing the results of the query, we ask Xapian what are the # terms in the index that are most relevant to this search. # Normally, you would use the results to suggest the user possible ways for # refining the search. I instead abuse this feature to see what are the tags # that are most related to the search results. # Select the first 10 documents as the key ones to use to compute relevant # terms rset = Xapian::RSet.new enquire.mset(0, 5).matches.each do |m| # TODO: use adaptive quality threshold here rset.add_document(m.docid) end # Xapian supports providing a filter object, to say that we are only interested # in some terms. # This one filters out all the keywords that are not tags, or that were in the # list of query terms. 
class Filter < Xapian::ExpandDecider # Return true if we want the term, else false def __call__ term term[0..1] == "XT" end end # This is the "Expansion set" for the search: the 10 most relevant terms that # match the filter eset = enquire.eset(10, rset, Filter.new) # Print out the results eset.terms.each do |res| puts "%.2f %s" % [res.weight, res.name[2..-1]] end apt-xapian-index-0.47ubuntu13/examples/ruby/axi-query-similar.rb0000644000000000000000000000455613070503156021630 0ustar #!/usr/bin/env ruby # # axi-query-similar - Show packages similar to a given one # # Copyright (C) 2007 Enrico Zini # Copyright (C) 2008 Daniel Brumbaugh Keeney # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
#
# http://svn.debian.org/wsvn/collab-maint/deb-maint/apt-xapian-index/trunk/examples/axi-query-similar.py?op=file&rev=5455&sc=0

require 'xapian'
require 'aptxapianindex'

# Instantiate a Xapian::Database object for read-only access to the index
DB = Xapian::Database.new(XAPIANDB)

# Get the document corresponding to the package with the given name
def doc_for_package pkgname
  # Query the term with the package name
  query = Xapian::Query.new("XP" + pkgname)
  enquire = Xapian::Enquire.new(DB)
  enquire.query = query

  # Get the top result only
  matches = enquire.mset(0, 1)
  if matches.size == 0
    nil
  else
    m = matches.matches.first
    m.document
  end
end

# Build a term list with all the terms in the given packages
terms = Array.new
a = Array.new
ARGV.each do |pkgname|
  a << "XP#{pkgname}"

  # Get the document corresponding to the package name
  doc = doc_for_package pkgname
  next unless doc

  # Retrieve all the terms in the document, skipping the package name
  # terms (those with the two-character 'XP' prefix)
  doc.terms.each do |t|
    if t.term.length < 2 or t.term[0..1] != 'XP'
      terms << t.term
    end
  end
end

# Build the query: an OR of all the collected terms, AND NOT the input packages
query = Xapian::Query.new(Xapian::Query::OP_AND_NOT,
    # Terms we want
    Xapian::Query.new(Xapian::Query::OP_OR, terms),
    # AND NOT the input packages
    Xapian::Query.new(Xapian::Query::OP_OR, a))
a = nil

# Perform the query
enquire = Xapian::Enquire.new(DB)
enquire.query = query

# Retrieve the top 20 results
matches = enquire.mset(0, 20)

# Display the results
show_mset(matches)
apt-xapian-index-0.47ubuntu13/examples/ruby/axi-query-simple.rb0000644000000000000000000000676413070503156021464 0ustar
#!/usr/bin/env ruby
#
# axi-query-simple - apt-cache search replacement using apt-xapian-index
#
# Copyright (C) 2007 Enrico Zini
# Copyright (C) 2008 Daniel Brumbaugh Keeney
#
# This program is free software; you can redistribute it and/or modify
# it under
the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
# http://svn.debian.org/wsvn/collab-maint/deb-maint/apt-xapian-index/trunk/examples/axi-query-simple.py?op=file&rev=5455&sc=0

require 'xapian'

# Setup configuration
XAPIANDBPATH = "/var/lib/apt-xapian-index"
XAPIANDB = XAPIANDBPATH + "/index"

# Instantiate a Xapian::Database object for read-only access to the index
db = Xapian::Database.new( XAPIANDB )

# Stemmer function to generate stemmed search keywords
stemmer = Xapian::Stem.new 'english'

# Build the terms that will go in the query
terms = []
ARGV.each do |word|
  word = word.dup
  # downcase! returns nil when the word was already lowercase; otherwise it
  # lowercases the word in place, which is what the else branch relies on.
  if not word.downcase! and word['::']
    # If it's lowercase and it contains '::', then we consider it a Debtags
    # tag. A better way could be to look up arguments in
    # /var/lib/debtags/vocabulary
    #
    # According to /var/lib/apt-xapian-index/README, Debtags tags are
    # indexed with the 'XT' prefix.
    terms << ( 'XT' << word )
  else
    # If it is not a Debtags tag, then we consider it a normal keyword.
    terms << word
    # If the word has a stemmed version, add it to the query.
    # /var/lib/apt-xapian-index/README tells us that stemmed terms have a
    # 'Z' prefix.
    stem = stemmer.call(word)
    terms << ('Z' << stem) unless stem == word
  end
end

# OR the terms together into a Xapian query.
#
# One may ask, why OR and not AND? The reason is that, unlike
# apt-cache, Xapian scores results according to how well they matched.
#
# Documents that match all the terms will score higher than the others, so if we
# build an OR query what we really have is an AND query that gracefully
# degrades to the closest matches when it runs out of perfect results.
#
# This allows stemmed searches to work nicely: if you look for 'editing', then
# the query will be 'editing OR Zedit'. Packages with the word 'editing' will
# match both and score higher, and packages with the word 'edited' will still
# match 'Zedit' and be included in the results.
query = Xapian::Query.new(Xapian::Query::OP_OR, terms)

# Perform the query
enquire = Xapian::Enquire.new db
enquire.query = query

# Display the top 20 results, sorted by how well they match
matches = enquire.mset(0, 20)
puts "%i results found." % matches.matches_estimated
puts "Results 1-%i:" % matches.size
matches.matches.each do |m|
  # /var/lib/apt-xapian-index/README tells us that the Xapian document data
  # is the package name.
  name = m.document.data

  # Get the package record out of the Apt cache, so we can retrieve the short
  # description
  #pkg = Apt::cache[name]

  # Print the match, together with the short description
  puts "%i%% %s - %s" % [m.percent, name, 'summary not available']
end
apt-xapian-index-0.47ubuntu13/examples/ruby/axi-query-pkgtype.rb0000644000000000000000000000676513070503156021657 0ustar
#!/usr/bin/env ruby
#
# axi-query-pkgtype - Like axi-query-simple.py, but with a simple
# result filter
#
# Copyright (C) 2007 Enrico Zini
# Copyright (C) 2008 Daniel Brumbaugh Keeney
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
# # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA require 'optparse' type = nil OptionParser.new do |opts| opts.program_name = 'axi-query-pkgtype.rb' opts.version = '0.1' opts.release = '1203587714' opts.banner = "Query the Apt Xapian index. Command line arguments can be keywords or Debtags tags" opts.on '-t', '--type TYPE', "package type, one of 'game', 'gui', 'cmdline' or 'editor'" do |v| type = v.to_sym end opts.on_tail("-h", "--help", "Show this message") do puts opts exit end end.parse! rescue ( puts 'try axi-query-pkgtype.rb --help'; exit 2 ) args = ARGV.collect do |i| i.dup; end # Import the rest here so we don't need dependencies to be installed only to # print commandline help require 'xapian' require 'aptxapianindex' # This is our little database of simple Debtags filters we provide: the name # entered by the user in "--type" maps to a piece of Xapian query FILTER_DB = { # We can do simple AND queries... :game => Xapian::Query.new(Xapian::Query::OP_AND, ['XTuse::gameplaying', 'XTrole::program']), # Or we can do complicate binary expressions... :gui => Xapian::Query.new( Xapian::Query::OP_AND, Xapian::Query.new('XTrole::program'), Xapian::Query.new(Xapian::Query::OP_OR, 'XTinterface::x11', 'XTinterface::3d')), :cmdline => Xapian::Query.new(Xapian::Query::OP_AND, 'XTrole::program', 'XTinterface::commandline'), :editor => Xapian::Query.new(Xapian::Query::OP_AND, 'XTrole::program', 'XTuse::editing') # Feel free to invent more } # Instantiate a xapian.Database object for read only access to the index db = Xapian::Database.new(XAPIANDB) # Build the base query as seen in axi-query-simple.py query = Xapian::Query.new(Xapian::Query::OP_OR, terms_for_simple_query(args)) # See if the user wants to use one of the result filters if type if FILTER_DB.has_key? 
type # If a filter was requested, AND it with the query query = Xapian::Query.new(Xapian::Query::OP_AND, query, FILTER_DB[type]) else $stderr.puts "Invalid filter type. Try one of %s" % FILTER_DB.keys.join(', ') exit 1 end end # Perform the query, all the rest is as in axi-query-simple.py enquire = Xapian::Enquire.new(db) enquire.query = query # Display the top 20 results, sorted by how well they match matches = enquire.mset(0, 20) puts "%i results found." % matches.matches_estimated puts "Results 1-%i:" % matches.size matches.matches.each do |m| # /var/lib/apt-xapian-index/README tells us that the Xapian document data # is the package name. name = m.document.data # Print the match, together with the short description puts "%i%% %s - %s" % [m.percent, name, 'summary not available'] end apt-xapian-index-0.47ubuntu13/examples/ruby/axi-query.rb0000644000000000000000000000620013070503156020156 0ustar #!/usr/bin/env ruby # # axi-query - Example program to query the apt-xapian-index # # Copyright (C) 2007 Enrico Zini # Copyright (C) 2008 Daniel Brumbaugh Keeney # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA require 'optparse' type = nil OptionParser.new do |opts| opts.program_name = 'axi-query-pkgtype.rb' opts.version = '0.1' opts.release = '1203587714' opts.banner = "Query the Apt Xapian index. 
Command line arguments can be keywords or Debtags tags"
  opts.on '-t', '--type TYPE', "package type, one of 'game', 'gui', 'cmdline' or 'editor'" do |v|
    type = v.to_sym
  end
  # Ported from the --sort option of axi-query.py. A global is used because
  # a local variable first assigned inside this block would not be visible
  # after parsing.
  opts.on '-s', '--sort VALUE', "sort by the given value, as listed in the values file" do |v|
    $sort_key = v
  end
  opts.on_tail("-h", "--help", "Show this message") do
    puts opts
    exit
  end
end.parse! rescue ( puts 'try axi-query-pkgtype.rb --help'; exit 2 )
args = ARGV.collect do |i| i.dup; end

# Import the rest here so we don't need dependencies to be installed only to
# print commandline help
require 'xapian'
# Provides XAPIANDB, XAPIANDBVALUES and read_value_db
require 'aptxapianindex'

# Access the Xapian index
db = Xapian::Database.new(XAPIANDB)

# Build the query
stemmer = Xapian::Stem.new("english")
terms = []
args.each do |word|
  if word == word.downcase and word.include?('::')
    # If it's lowercase and contains '::', it's a tag
    # TODO: lookup in debtags' vocabulary instead
    terms << ("XT" + word)
  else
    # Else we make a term
    word = word.downcase
    terms << word
    stem = stemmer.call(word)
    # If it has stemming, add that to the query, too
    terms << ("Z" + stem) unless stem == word
  end
end
query = Xapian::Query.new(Xapian::Query::OP_OR, terms)

# Perform the query
enquire = Xapian::Enquire.new(db)
enquire.query = query

if $sort_key
  values = read_value_db(XAPIANDBVALUES)

  # If we don't sort by relevance, we need to specify a cutoff in order to
  # remove poor results from the output
  #
  # Note: ept-cache implements an adaptive cutoff as follows:
  # 1. Retrieve only one result, with default sorting. Read its relevance as
  #    the maximum relevance.
  # 2. Set the cutoff as some percentage of the maximum relevance
  # 3. Set sort by the wanted value
  # 4. Perform the query
  enquire.cutoff!(60)

  # Sort by the requested value.
  # NOTE: the method name sort_by_value! is assumed by analogy with cutoff!
  # (used in axi-query-adaptivecutoff.rb); check it against the installed
  # Ruby bindings.
  enquire.sort_by_value!(values[$sort_key])
end

# Display the results.
# No Ruby interface to the Apt cache is used here, so package summaries are
# not looked up.
#cache = Apt::cache
matches = enquire.mset(0, 20)
puts "%i results found."
% matches.matches_estimated
puts "Results 1-%i:" % matches.size
matches.matches.each do |m|
  # /var/lib/apt-xapian-index/README tells us that the Xapian document data
  # is the package name.
  name = m.document.data
  # Looking up the short description would need the Apt cache
  #pkg = cache[name]
  puts "%i%% %s - %s" % [m.percent, name, 'summary not available']
end
apt-xapian-index-0.47ubuntu13/examples/ruby/axi-query-expand.rb0000644000000000000000000000740313070503156021441 0ustar
#!/usr/bin/env ruby
# axi-query-expand - Query and show possible expansions
#
# Copyright (C) 2007 Enrico Zini
# Copyright (C) 2008 Daniel Brumbaugh Keeney
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

require 'optparse'

type = nil

OptionParser.new do |opts|
  opts.program_name = 'axi-query-pkgtype.rb'
  opts.version = '0.1'
  opts.release = '1203587714'
  opts.banner = "Query the Apt Xapian index. Command line arguments can be keywords or Debtags tags"
  opts.on '-t', '--type TYPE', "package type, one of 'game', 'gui', 'cmdline' or 'editor'" do |v|
    type = v.to_sym
  end
  opts.on_tail("-h", "--help", "Show this message") do
    puts opts
    exit
  end
end.parse!
rescue ( puts 'try axi-query-pkgtype.rb --help'; exit 2 )
args = ARGV.collect do |i| i.dup; end

# Import the rest here so we don't need dependencies to be installed only to
# print commandline help
require 'xapian'
require 'aptxapianindex'

# Instantiate a Xapian::Database object for read-only access to the index
db = Xapian::Database.new(XAPIANDB)

# Build the base query as seen in axi-query-simple.rb
query = Xapian::Query.new(Xapian::Query::OP_OR, terms_for_simple_query(args))

# Add the simple user filter, if requested
query = add_simple_filter_to_query(query, type)

# Perform the query
enquire = Xapian::Enquire.new(db)
enquire.query = query

# Retrieve the top 20 results
matches = enquire.mset(0, 20)

# Display the results
show_mset(matches)

# Now, we ask Xapian which terms in the index are most relevant to
# this search. This can be used to suggest to the user the most useful ways of
# refining the search.

# Use the documents retrieved above as the key ones to compute relevant
# terms
rset = Xapian::RSet.new
matches.matches.each do |m|
  rset.add_document m.docid
end

# This is the "Expansion set" for the search: the 10 most relevant terms
eset = enquire.eset(10, rset)

# Print it out. Note that some terms have a prefix from the database: can we
# filter them out? Indeed: Xapian allows passing a filter to get_eset.
# Read on...
puts ''
puts "Terms that could improve the search:"
terms = eset.terms.collect do |res|
  "%s (%.2f%%)" % [res.name, res.weight]
end
puts terms.join(', ')
terms = nil

# You can also abuse this feature to show which tags are most
# related to the search results. This allows you to turn a search based on
# keywords into a search based on semantic attributes, which would be an
# absolutely stunning feature in a GUI.

# We can do it because Xapian lets us specify a filter for the output of
# get_eset. This filter filters out all the keywords that are not tags, or
# that were in the list of query terms.
class Filter < Xapian::ExpandDecider # Return true if we want the term, else false def __call__ term term[0..1] == "XT" end end # This is the "Expansion set" for the search: the 10 most relevant terms that # match the filter eset = enquire.eset(10, rset, Filter.new) # Print out the resulting tags puts '' puts "Terms that could improve the search:" terms = eset.terms.collect do |res| "%s (%.2f%%)" % [res.name[2..-1], res.weight] end puts terms.join(', ') apt-xapian-index-0.47ubuntu13/examples/ruby/aptxapianindex.rb0000644000000000000000000001127113070503156021253 0ustar # This program is free software. It comes without any warranty, to # the extent permitted by applicable law. You can redistribute it # and/or modify it under the terms of the Do What The Fuck You Want # To Public License, Version 2, as published by Sam Hocevar. See # http://sam.zoy.org/wtfpl/COPYING for more details. # Setup configuration # This tells python-apt that we've seen the warning about the API not being # stable yet, and we don't want to see every time we run the program require 'xapian' # Setup configuration XAPIANDBPATH = '/var/lib/apt-xapian-index' XAPIANDB = XAPIANDBPATH + '/index' XAPIANDBVALUES = XAPIANDBPATH + '/values' # This is our little database of simple Debtags filters we provide: the name # entered by the user in "--type" maps to a piece of Xapian query FILTER_DB = { # We can do simple AND queries... :game => Xapian::Query.new(Xapian::Query::OP_AND, ['XTuse::gameplaying', 'XTrole::program']), # Or we can do complicate binary expressions... 
  :gui => Xapian::Query.new(
      Xapian::Query::OP_AND,
      Xapian::Query.new('XTrole::program'),
      Xapian::Query.new(Xapian::Query::OP_OR, 'XTinterface::x11', 'XTinterface::3d')),
  :cmdline => Xapian::Query.new(Xapian::Query::OP_AND, 'XTrole::program', 'XTinterface::commandline'),
  :editor => Xapian::Query.new(Xapian::Query::OP_AND, 'XTrole::program', 'XTuse::editing')
  # Feel free to invent more
}

=begin
Given a list of user-supplied keywords, build the list of terms that will go
in a simple Xapian query.

If a term is lowercase and contains '::', then it's considered to be a Debtags
tag.
=end
def terms_for_simple_query keywords
  stemmer = Xapian::Stem.new("english")
  terms = []
  keywords.each do |word|
    # downcase! returns nil when the word was already lowercase; otherwise it
    # lowercases the word in place, which is what the else branch relies on.
    if not word.downcase! and word['::']
      # FIXME: A better way could be to look up arguments in
      # /var/lib/debtags/vocabulary
      #
      # According to /var/lib/apt-xapian-index/README, Debtags tags are
      # indexed with the 'XT' prefix.
      terms << ( "XT" << word )
    else
      # If it is not a Debtags tag, then we consider it a normal keyword.
      terms << word
      # If the word has a stemmed version, add it to the query.
      # /var/lib/apt-xapian-index/README tells us that stemmed terms have a
      # 'Z' prefix.
      stem = stemmer.call(word)
      terms << ( 'Z' << stem ) unless stem == word
    end
  end
  terms
end

=begin
If filtername is not nil, look up the simple filter database for the name and
add its filter to the query. Returns the enhanced query.
=end
def add_simple_filter_to_query query, filtername
  # See if the user wants to use one of the result filters
  if filtername
    if FILTER_DB.include? filtername
      # If a filter was requested, AND it with the query
      Xapian::Query.new(Xapian::Query::OP_AND, FILTER_DB[filtername], query)
    else
      raise RuntimeError, "Invalid filter type. Try one of %s" % FILTER_DB.keys.join(', ')
    end
  else
    query
  end
end

# Show a Xapian result mset as a list of packages and their short descriptions
def show_mset mset
  # Display the top 20 results, sorted by how well they match
  puts "%i results found."
% mset.matches_estimated
  puts "Results 1-%i:" % mset.size
  mset.matches.each do |m|
    # /var/lib/apt-xapian-index/README tells us that the Xapian document data
    # is the package name.
    name = m.document.data
    # Print the match, together with the short description
    puts "%i%% %s - %s" % [m.percent, name, 'summary not available']
  end
end

# Read the "/etc/services"-style database of value indices
def read_value_db pathname
  begin
    rmcomments = /\s*(#.*)?$/
    splitter = /\s+/
    values = {}
    File.open pathname, 'r' do |io|
      while line = io.gets
        # Remove comments, trailing spaces and the line terminator
        line = line.gsub(rmcomments, "")
        # Skip empty lines
        next if line.empty?
        # Split the line
        fields = line.split(splitter)
        if fields.length < 2
          # io.lineno is already the 1-based number of the line just read
          $stderr.puts "Ignoring line %s:%d: only 1 value found when I need at least the value name and number" % [pathname, io.lineno]
          next
        end
        # Parse the number (String#to_i never raises, so validate explicitly)
        unless fields[1] =~ /\A\d+\z/
          $stderr.puts "Ignoring line %s:%d: the second column (\"%s\") must be a number" % [pathname, io.lineno, fields[1]]
          next
        end
        number = fields[1].to_i
        values[fields[0]] = number
        # Any further columns are aliases for the same value
        fields[2..-1].each do |a|
          values[a] = number
        end
      end
    end
  rescue => e
    # If we can't read the database, fall back to defaults. The keys are
    # strings, like the names read from the file.
    $stderr.puts "Cannot read %s: %s. Using a minimal default configuration" % [pathname, e]
    values = { 'installedsize' => 1, 'packagesize' => 2 }
  end
  values
end
apt-xapian-index-0.47ubuntu13/examples/ruby/axi-query-adaptivecutoff.rb0000644000000000000000000000476013070503156023171 0ustar
#!/usr/bin/env ruby
# axi-query-adaptivecutoff - Use an adaptive cutoff to select results
#
# Copyright (C) 2007 Enrico Zini
# Copyright (C) 2008 Daniel Brumbaugh Keeney
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program; if not, write to the Free Software
# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA

require 'optparse'

type = nil

OptionParser.new do |opts|
  opts.program_name = 'axi-query-pkgtype.rb'
  opts.version = '0.1'
  opts.release = '1203587714'
  opts.banner = "Query the Apt Xapian index. Command line arguments can be keywords or Debtags tags"
  opts.on '-t', '--type TYPE', "package type, one of 'game', 'gui', 'cmdline' or 'editor'" do |v|
    type = v.to_sym
  end
  opts.on_tail("-h", "--help", "Show this message") do
    puts opts
    exit
  end
end.parse! rescue ( puts 'try axi-query-pkgtype.rb --help'; exit 2 )
args = ARGV.collect do |i| i.dup; end

# Import the rest here so we don't need dependencies to be installed only to
# print commandline help
require 'xapian'
require 'aptxapianindex'

# Instantiate a Xapian::Database object for read-only access to the index
db = Xapian::Database.new(XAPIANDB)

# Build the base query as seen in axi-query-simple.rb
query = Xapian::Query.new(Xapian::Query::OP_OR, terms_for_simple_query(args))

# Add the simple user filter, if requested
query = add_simple_filter_to_query(query, type)

# Perform the query
enquire = Xapian::Enquire.new(db)
enquire.query = query

# Retrieve the first result, and check its relevance
matches = enquire.mset(0, 1)
if matches.size == 0
  # Without a top result there is no weight to derive a cutoff from
  puts "0 results found."
  exit 0
end
top_weight = matches.matches.first.weight

# Tell Xapian that we only want results that are at least 70% as good as that
enquire.cutoff!(0, top_weight * 0.7)

# Now we have a meaningful cutoff, so we can get a larger number of results:
# thanks to the cutoff, approximate results will stop before starting to be
# really bad.
matches = enquire.mset(0, 200) # Display the results show_mset(matches) apt-xapian-index-0.47ubuntu13/examples/README0000644000000000000000000000044413070503156015610 0ustar Feel free to send me more examples, or to send me ports of these examples to other languages. This is a list of currently available Xapian ports: - C++: libxapian-dev - Perl: libsearch-xapian-perl - Ruby: libxapian-ruby1.8 - Python: python-xapian - Tcl: tclxapian - PHP: php5-xapian apt-xapian-index-0.47ubuntu13/examples/axi-query.py0000755000000000000000000000662313070503156017236 0ustar #!/usr/bin/python # # axi-query - Example program to query the apt-xapian-index # # Copyright (C) 2007 Enrico Zini # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # from aptxapianindex import * from optparse import OptionParser import sys VERSION="0.1" class Parser(OptionParser): def __init__(self, *args, **kwargs): OptionParser.__init__(self, *args, **kwargs) def error(self, msg): sys.stderr.write("%s: error: %s\n\n" % (self.get_prog_name(), msg)) self.print_help(sys.stderr) sys.exit(2) parser = Parser(usage="usage: %prog [options]", version="%prog "+ VERSION, description="Query the Apt Xapian index. 
Command line arguments can be keywords or Debtags tags") parser.add_option("-s", "--sort", help="sort by the given value, as listed in %s" % XAPIANDBVALUES) (options, args) = parser.parse_args() import os import xapian import warnings # Yes, apt, thanks, I know, the api isn't stable, thank you so very much #warnings.simplefilter('ignore', FutureWarning) warnings.filterwarnings("ignore","apt API not stable yet") import apt warnings.resetwarnings() # Access the Xapian index db = xapian.Database(XAPIANDB) # Build the query stemmer = xapian.Stem("english") terms = [] for word in args: if word.islower() and word.find("::") != -1: # If it's lowercase and contains, :: it's a tag # TODO: lookup in debtags' vocabulary instead terms.append("XT"+word) else: # Else we make a term word = word.lower() terms.append(word) stem = stemmer(word) # If it has stemming, add that to the query, too if stem != word: terms.append("Z"+stem) query = xapian.Query(xapian.Query.OP_OR, terms) # Perform the query enquire = xapian.Enquire(db) enquire.set_query(query) if options.sort: values = readValueDB(XAPIANDBVALUES) # If we don't sort by relevance, we need to specify a cutoff in order to # remove poor results from the output # # Note: ept-cache implements an adaptive cutoff as follows: # 1. Retrieve only one result, with default sorting. Read its relevance as # the maximum relevance. # 2. Set the cutoff as some percentage of the maximum relevance # 3. Set sort by the wanted value # 4. Perform the query enquire.set_cutoff(60) # Sort by the requested value enquire.set_sort_by_value(values[options.sort]) # Display the results. cache = apt.Cache() matches = enquire.get_mset(0, 20) print "%i results found." 
% matches.get_matches_estimated() print "Results 1-%i:" % matches.size() for m in matches: name = m.document.get_data() pkg = cache[name] if pkg.candidate: print "%i%% %s - %s" % (m.percent, name, pkg.candidate.summary) sys.exit(0) apt-xapian-index-0.47ubuntu13/examples/axi-query-similar.py0000755000000000000000000000562713070503156020677 0ustar #!/usr/bin/python # axi-query-similar - Show packages similar to a given one # # Copyright (C) 2007 Enrico Zini # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA from optparse import OptionParser import sys VERSION="0.1" # Let's start with a simple command line parser with help class Parser(OptionParser): def __init__(self, *args, **kwargs): OptionParser.__init__(self, *args, **kwargs) def error(self, msg): sys.stderr.write("%s: error: %s\n\n" % (self.get_prog_name(), msg)) self.print_help(sys.stderr) sys.exit(2) parser = Parser(usage="usage: %prog [options] package(s)", version="%prog "+ VERSION, description="Find the packages similar to the given ones") (options, args) = parser.parse_args() # Import the rest here so we don't need dependencies to be installed only to # print commandline help import os import xapian from aptxapianindex import * # Instantiate a xapian.Database object for read only access to the index db = xapian.Database(XAPIANDB) def docForPackage(pkgname): "Get the 
document corresponding to the package with the given name" # Query the term with the package name query = xapian.Query("XP"+pkgname) enquire = xapian.Enquire(db) enquire.set_query(query) # Get the top result only matches = enquire.get_mset(0, 1) if matches.size() == 0: return None else: m = matches[0] return m.document # Build a term list with all the terms in the given packages terms = [] for pkgname in args: # Get the document corresponding to the package name doc = docForPackage(pkgname) if not doc: continue # Retrieve all the terms in the document for t in doc.termlist(): if len(t.term) < 2 or t.term[:2] != 'XP': terms.append(t.term) # Build the big OR query query = xapian.Query(xapian.Query.OP_AND_NOT, # Terms we want xapian.Query(xapian.Query.OP_OR, terms), # AND NOT the input packages xapian.Query(xapian.Query.OP_OR, ["XP"+name for name in args])) # Perform the query enquire = xapian.Enquire(db) enquire.set_query(query) # Retrieve the top 20 results matches = enquire.get_mset(0, 20) # Display the results show_mset(matches) sys.exit(0) apt-xapian-index-0.47ubuntu13/examples/axi-query-tags.py0000755000000000000000000000616213070503156020170 0ustar #!/usr/bin/python # axi-query-tags - Look for Debtags tags by keyword # # Copyright (C) 2007 Enrico Zini # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
# # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA from optparse import OptionParser import sys VERSION="0.1" # Let's start with a simple command line parser with help class Parser(OptionParser): def __init__(self, *args, **kwargs): OptionParser.__init__(self, *args, **kwargs) def error(self, msg): sys.stderr.write("%s: error: %s\n\n" % (self.get_prog_name(), msg)) self.print_help(sys.stderr) sys.exit(2) parser = Parser(usage="usage: %prog [options] keywords", version="%prog "+ VERSION, description="Find the Debtags tags corresponding to some keywords") (options, args) = parser.parse_args() # Import the rest here so we don't need dependencies to be installed only to # print commandline help import os import xapian from aptxapianindex import * # Instantiate a xapian.Database object for read only access to the index db = xapian.Database(XAPIANDB) # Build the base query query = xapian.Query(xapian.Query.OP_OR, termsForSimpleQuery(args)) # Perform the query enquire = xapian.Enquire(db) enquire.set_query(query) # Now, instead of showing the results of the query, we ask Xapian what are the # terms in the index that are most relevant to this search. # Normally, you would use the results to suggest the user possible ways for # refining the search. I instead abuse this feature to see what are the tags # that are most related to the search results. # Use an adaptive cutoff to avoid to pick bad results as references matches = enquire.get_mset(0, 1) topWeight = matches[0].weight enquire.set_cutoff(0, topWeight * 0.7) # Select the first 10 documents as the key ones to use to compute relevant # terms rset = xapian.RSet() for m in enquire.get_mset(0, 10): rset.add_document(m.docid) # Xapian supports providing a filter object, to say that we are only interested # in some terms. 
# This one filters out all the keywords that are not tags, or that were in the # list of query terms. class Filter(xapian.ExpandDecider): def __call__(self, term): """ Return true if we want the term, else false """ return term[:2] == "XT" # This is the "Expansion set" for the search: the 10 most relevant terms that # match the filter eset = enquire.get_eset(10, rset, Filter()) # Print out the results for res in eset: print "%.2f %s" % (res.weight, res.term[2:]) sys.exit(0) apt-xapian-index-0.47ubuntu13/axi-cache0000755000000000000000000010122513070503156014661 0ustar #!/usr/bin/python # coding: utf-8 # # axi-cache - Query apt-xapian-index database # # Copyright (C) 2010 Enrico Zini # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
# # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # # TODO: # - save NAME save the query to be recalled later with @NAME (or if notmuch # has a syntax for saved queries, recycle it) from optparse import OptionParser from cStringIO import StringIO import sys import os, os.path import axi VERSION="0.47" # Setup configuration DEBTAGS_VOCABULARY = "/var/lib/debtags/vocabulary" #XDG_CONFIG_HOME = os.environ.get("XDG_CONFIG_HOME", os.expanduser("~/.config")) XDG_CACHE_HOME = os.environ.get("XDG_CACHE_HOME", os.path.expanduser("~/.cache")) CACHEFILE = os.path.join(XDG_CACHE_HOME, "axi-cache.state") try: from ConfigParser import RawConfigParser import re import math import xapian import apt try: from debian import deb822 except ImportError: from debian_bundle import deb822 helponly = False except ImportError, e: print >>sys.stderr, "%s: only help functions are implemented, for the sake of help2man" % str(e) helponly = True if not helponly: class SavedStateError(Exception): pass class AptSilentProgress(apt.progress.text.OpProgress) : "Quiet progress so we don't get cache loading messages from Apt" def __init__(self): pass def done(self): pass def update(self, percent=None): pass def readVocabulary(): try: fin = open(DEBTAGS_VOCABULARY) except Exception, e: # Only show this when being verbose print >>sys.stderr, "Cannot read %s: %s. 
Please install `debtags' t" % (DEBTAGS_VOCABULARY, str(e)) return None, None facets = dict() tags = dict() for entry in deb822.Deb822.iter_paragraphs(fin): if "Facet" in entry: facets[entry["Facet"]] = entry elif "Tag" in entry: tags[entry["Tag"]] = entry return facets, tags class DB(object): class BasicFilter(xapian.ExpandDecider): def __init__(self, stemmer=None, exclude=None, prefix=None): super(DB.BasicFilter, self).__init__() self.stem = stemmer if stemmer else lambda x:x self.exclude = set([self.stem(x) for x in exclude]) if exclude else set() self.prefix = prefix def __call__(self, term): if len(term) < 4: return False if self.prefix is not None: # Skip leading uppercase chars t = term while t and t[0].isupper(): t = t[1:] if not t.startswith(self.prefix): return False if self.stem(term) in self.exclude: return False if term.startswith("XT") or term.startswith("XS"): return True return term[0].islower() class TermFilter(BasicFilter): def __call__(self, term): if len(term) < 4: return False if self.stem(term) in self.exclude: return False return term[0].islower() class TagFilter(xapian.ExpandDecider): def __call__(self, term): return term.startswith("XT") def __init__(self): # Access the Xapian index self.db = xapian.Database(axi.XAPIANINDEX) self.stem = xapian.Stem("english") # Build query parser self.qp = xapian.QueryParser() self.qp.set_default_op(xapian.Query.OP_AND) self.qp.set_database(self.db) self.qp.set_stemmer(self.stem) self.qp.set_stemming_strategy(xapian.QueryParser.STEM_SOME) self.qp.add_prefix("pkg", "XP") self.qp.add_boolean_prefix("tag", "XT") self.qp.add_boolean_prefix("sec", "XS") #notmuch->value_range_processor = new Xapian::NumberValueRangeProcessor (NOTMUCH_VALUE_TIMESTAMP); #notmuch->query_parser->add_valuerangeprocessor (notmuch->value_range_processor); # Read state from previous runs self.cache = RawConfigParser() if os.path.exists(CACHEFILE): try: self.cache.read(CACHEFILE) except Exception, e: print >>sys.stderr, e print 
>>sys.stderr, "ignoring %s which seems to be corrupted" % CACHEFILE self.dirty = False self.facets = None self.tags = None def save(self): "Save the state so we find it next time" if self.dirty: if not os.path.exists(XDG_CACHE_HOME): os.makedirs(XDG_CACHE_HOME, mode=0700) self.cache.write(open(CACHEFILE, "w")) self.dirty = False def vocabulary(self): if self.facets is None: self.facets, self.tags = readVocabulary() return self.facets, self.tags def unprefix(self, term): "Convert DB prefixes to user prefixes" if term.startswith("XT"): return "tag:" + term[2:] elif term.startswith("XS"): return "sec:" + term[2:] elif term.startswith("XP"): return "pkg:" + term[2:] return term def is_tag(self, t): return self.db.term_exists("XT" + t) def set_query_args(self, args, secondary=False): def term_or_tag(t): if "::" in t and self.is_tag(t): return "tag:" + t else: return t args = map(term_or_tag, args) self.set_query_string(" ".join(args), secondary=secondary) def set_query_string(self, q, secondary=False): "Set the query in the cache" if not secondary: self.set_cache_last("query", q) self.unset_cache_last("secondary query") else: self.set_cache_last("secondary query", q) self.unset_cache_last("skip") def set_sort(self, key=None, cutoff=60): "Set sorting method (default is by relevance)" if key is None: self.unset_cache_last("sort") self.unset_cache_last("cutoff") else: self.set_cache_last("sort", key) self.set_cache_last("cutoff", str(cutoff)) self.unset_cache_last("skip") def build_query(self): "Build query from cached query info" q = self.get_cache_last("query") if not self.cache.has_option("last", "query"): raise SavedStateError("no saved query") self.query = self.qp.parse_query(q, xapian.QueryParser.FLAG_BOOLEAN | xapian.QueryParser.FLAG_LOVEHATE | xapian.QueryParser.FLAG_BOOLEAN_ANY_CASE | xapian.QueryParser.FLAG_WILDCARD | xapian.QueryParser.FLAG_PURE_NOT | xapian.QueryParser.FLAG_SPELLING_CORRECTION | xapian.QueryParser.FLAG_AUTO_SYNONYMS) secondary = 
self.get_cache_last("secondary query", None) if secondary: secondary = self.qp.parse_query(secondary, xapian.QueryParser.FLAG_BOOLEAN | xapian.QueryParser.FLAG_LOVEHATE | xapian.QueryParser.FLAG_BOOLEAN_ANY_CASE | xapian.QueryParser.FLAG_WILDCARD | xapian.QueryParser.FLAG_PURE_NOT | xapian.QueryParser.FLAG_SPELLING_CORRECTION | xapian.QueryParser.FLAG_AUTO_SYNONYMS) self.query = xapian.Query(xapian.Query.OP_AND, self.query, secondary) # print "Query:", self.query # Build the enquire with the query self.enquire = xapian.Enquire(self.db) self.enquire.set_query(self.query) sort = self.get_cache_last("sort") if sort is not None: values, descs = axi.readValueDB() # If we don't sort by relevance, we need to specify a cutoff in order to # remove poor results from the output # # Note: ept-cache implements an adaptive cutoff as follows: # 1. Retrieve only one result, with default sorting. Read its relevance as # the maximum relevance. # 2. Set the cutoff as some percentage of the maximum relevance # 3. Set sort by the wanted value # 4. Perform the query # TODO: didn't this use to work? #self.enquire.set_cutoff(int(self.get_cache_last("cutoff", 60))) reverse = sort[0] == '-' or sort[-1] == '-' sort = sort.strip('-') # Sort by the requested value self.enquire.set_sort_by_value(values[sort], reverse) def get_matches(self, first=None, count=20): """ Return a Xapian mset with the next page of results. 
""" if first is None: first = int(self.get_cache_last("skip", 0)) self.set_cache_last("lastfirst", first) self.set_cache_last("skip", first + count) matches = self.enquire.get_mset(first, count) return matches def get_all_matches(self, first=None): """ Generate Xapian matches for all query matches """ if first is None: first = int(self.get_cache_last("skip", 0)) self.unset_cache_last("lastfirst") self.unset_cache_last("skip") while True: matches = self.enquire.get_mset(first, 100) count = matches.size() if count == 0: break for m in matches: yield m first += 100 def get_spelling_correction(self): return self.qp.get_corrected_query_string() def get_suggestions(self, count=10, filter=None): """ Compute suggestions for more terms Return a Xapian ESet """ # Use the first 30 results as the key ones to use to compute relevant # terms rset = xapian.RSet() for m in self.enquire.get_mset(0, 30): rset.add_document(m.docid) # Get results, optionally filtered if filter is None: filter = self.BasicFilter() return self.enquire.get_eset(count, rset, filter) # ConfigParser access wrappers with lots of extra ifs, needed because the # ConfigParser API has been designed to throw exceptions in the most stupid # places one could possibly conceive def get_cache_last(self, key, default=None): if self.cache.has_option("last", key): return self.cache.get("last", key) return default def set_cache_last(self, key, val): if not self.cache.has_section("last"): self.cache.add_section("last") self.cache.set("last", key, val) self.dirty = True def unset_cache_last(self, key): if not self.cache.has_section("last"): return self.cache.remove_option("last", key) def get_rdeps(self, name, pfx): "Return all the rdeps of type @pfx for package @name" enquire = xapian.Enquire(self.db) enquire.set_query(xapian.Query(pfx+name)) first = 0 while True: found = 0 for m in enquire.get_mset(first, first + 20): found += 1 yield m.document.get_data() if found < 20: break first += 20 class Cmdline(object): BOOLWORDS 
= set(["and", "or", "not"]) def __init__(self): self.name = "axi-cache" class Parser(OptionParser): def __init__(self, cmdline, *args, **kwargs): OptionParser.__init__(self, *args, **kwargs) self.cmdline = cmdline def error(self, msg): sys.stderr.write("%s: error: %s\n\n" % (self.get_prog_name(), msg)) self.print_help(sys.stderr) sys.exit(2) def print_help(self, fd=sys.stdout): commands = StringIO() self.cmdline.format_help(commands) buf = StringIO() # Still in 2010 Cmdline is not an object. Oh dear. OptionParser.print_help(self, buf) buf = buf.getvalue().replace("ENDDESC", "\n\n" + commands.getvalue()[:-1]) fd.write(buf) self.parser = Parser(self, usage="usage: %prog [options] command [args]", version="%prog "+ VERSION, description="Query the Apt Xapian index.ENDDESC") self.parser.add_option("-s", "--sort", help="sort by the given value, as listed in %s. Add a '-' to reverse sort order" % axi.XAPIANDBVALUES) self.parser.add_option("--tags", action="store_true", help="show matching tags, rather than packages") self.parser.add_option("--tabcomplete", action="store", metavar="TYPE", help="suggest words for tab completion of the current command line (type is 'plain' or 'partial')") self.parser.add_option("--last", action="store_true", help="use 'show --last' to limit tab completion to only the packages from the last search results") self.parser.add_option("--all", action="store_true", help="disable pagination and always show all results. 
Note that search results are normally sorted by relevance, so you may find meaningless results at the end of the output") (opts, args) = self.parser.parse_args() self.opts = opts self.args = args if opts.tags and opts.all: self.parser.error("--tags conflicts with --all") def tabcomplete_query_args(self): if self.opts.tabcomplete == "partial" and self.args: queryargs = self.args[:-1] partial = self.args[-1] else: queryargs = self.args partial = None # Remove trailing boolean terms while queryargs and queryargs[-1].lower() in self.BOOLWORDS: queryargs = queryargs[:-1] return queryargs, partial def do_info(self, args): "info: print information about the apt-xapian-index environment" import time import datetime import axi.indexer # General info print "Main data directory:", axi.XAPIANDBPATH try: cur_timestamp = os.path.getmtime(axi.XAPIANDBSTAMP) cur_time = time.strftime("%c", time.localtime(cur_timestamp)) cur_time = "last updated: " + cur_time except Exception, e: cur_timestamp = 0 cur_time = "not available: " + str(e) print "Update timestamp: %s (%s)" % (axi.XAPIANDBSTAMP, cur_time) try: index_loc = open(axi.XAPIANINDEX).read().split(" ", 1)[1].strip() index_loc = "pointing to " + index_loc except Exception, e: index_loc = "not available: " + str(e) print "Index location: %s (%s)" % (axi.XAPIANINDEX, index_loc) def fileinfo(fname): if os.path.exists(fname): return fname else: return fname + " (available from next reindex)" print "Documentation of index contents:", fileinfo(axi.XAPIANDBDOC) print "Documentation of available prefixes:", fileinfo(axi.XAPIANDBPREFIXES) print "Documentation of available values:", fileinfo(axi.XAPIANDBVALUES) print "Plugin directory:", axi.PLUGINDIR # Aggregated plugin information # { name: { path=path, desc=desc, status=status } } plugins = dict() # Aggregated value information # { valuename: { val=num, desc=shortdesc, plugins=[plugin names] } } values, descs = axi.readValueDB() values = dict(((a, dict(val=b, desc=descs[a], plugins=[])) for a, b in
values.iteritems())) # Aggregated data source information # { pathname: { desc=shortdesc, plugins=[plugin names] } } sources = dict() # Per-plugin info all_plugins_names = set((x for x in os.listdir(axi.PLUGINDIR) if x.endswith(".py"))) for plugin in axi.indexer.Plugins(): all_plugins_names.remove(plugin.filename) doc = plugin.obj.doc() or dict() # Last update status / whether an update is due # TODO: check if the data from the plugin is present in the index ts = plugin.info["timestamp"] if cur_timestamp == 0: status = "enabled, not indexed" elif ts == 0: status = "enabled, up to date" elif ts <= cur_timestamp: delta = datetime.timedelta(seconds=cur_timestamp-ts) status = "enabled, up to date (%s older than index)" % str(delta) else: delta = datetime.timedelta(seconds=ts-cur_timestamp) status = "enabled, needs indexing (%s newer than index)" % str(delta) desc = doc.get("shortDesc", "(description unavailable)") plugins[plugin.name] = dict( path=os.path.join(axi.PLUGINDIR, plugin.filename), desc=desc, status=status, ) # Aggregate info about values for v in plugin.info.get("values", []): name = v.get("name", None) if name is None: continue if name not in values: values[name] = dict(val=None, desc=v.get("desc", "(unknown)"), plugins=[]) values[name].setdefault("plugins", []).append(plugin.name) # Data files used by the plugin for s in plugin.info.get("sources", []): path = s.get("path", "(unknown)") if path not in sources: sources[path] = dict(desc=s.get("desc", "(unknown)"), plugins=[]) sources[path].setdefault("plugins", []).append(plugin.name) # Disabled plugins (no plugin object is loaded for these, so no description is available) for plugin in sorted(all_plugins_names): name = os.path.splitext(plugin)[0] plugins[name] = dict( path=os.path.join(axi.PLUGINDIR, plugin), desc="(plugin disabled, description unavailable)", status="disabled", ) # Plugin information # { name: { path=path, desc=desc, status=status } } #print "Plugins:" #maxname = max((len(x) for x in plugins.iterkeys())) #for name,
info in sorted(plugins.iteritems(), key=lambda x:x[0]): # print " %s:" % info["path"] # print " ", info["desc"] print "Plugin status:" maxname = max((len(x) for x in plugins.iterkeys())) for name, info in sorted(plugins.iteritems(), key=lambda x:x[0]): print " ", name.ljust(maxname), info["status"] # Value information print "Values:" maxname = max((len(x) for x in values.iterkeys())) print " ", "Value".ljust(maxname), "Code", "Provided by" for name, val in sorted(values.iteritems(), key=lambda x:x[0]): plugins = val.get("plugins", []) if plugins: provider = ", ".join(plugins) else: provider = "update-apt-xapian-index" print " ", name.ljust(maxname), "%4d" % int(val["val"]), provider # Source files information print "Data sources:" maxpath = 0 maxdesc = 0 for k, v in sources.iteritems(): if len(k) > maxpath: maxpath = len(k) if len(v["desc"]) > maxdesc: maxdesc = len(v["desc"]) print " ", "Source".ljust(maxpath), "Description".ljust(maxdesc), "Used by" for path, info in sources.iteritems(): provider = ", ".join(info.get("plugins", [])) print " ", path.ljust(maxpath), info["desc"].ljust(maxdesc), provider return 0 def do_search(self, args): "search [terms]: start a new search" self.db = DB() self.db.set_query_args(args) if self.opts.sort: self.db.set_sort(self.opts.sort) else: self.db.set_sort(None) self.db.build_query() if self.opts.all: self.print_all_matches(self.db.get_all_matches()) else: self.print_matches(self.db.get_matches()) if not self.opts.tabcomplete: self.db.save() return 0 def complete_search(self): self.db = DB() if self.opts.tabcomplete == "partial" and len(self.args) == 1: # Simple prefix search terms = set() terms.update((str(term) for term in self.db.db.synonym_keys(self.args[0]))) terms.update((term.term for term in self.db.db.allterms(self.args[0]))) for term in sorted(terms): print term for term in self.db.db.allterms("XT" + self.args[0]): print term.term[2:] return 0 elif self.opts.tabcomplete == "plain" and not self.args: # Show a preset list 
of tags for facet in ["interface", "role", "use", "works-with"]: for term in self.db.db.allterms("XT" + facet + "::"): print term.term[2:] return 0 else: # Context-sensitive hints qargs, partial = self.tabcomplete_query_args() self.db.set_query_args(qargs) self.db.build_query() self.print_completions(self.db.get_matches()) return 0 def do_again(self, args): "again [query]: repeat the last search, possibly adding query terms" self.db = DB() self.db.set_query_args(args, secondary=True) if self.opts.sort: self.db.set_sort(self.opts.sort) else: self.db.set_sort(None) try: self.db.build_query() except SavedStateError, e: print >>sys.stderr, "%s: maybe you need to run '%s search' first?" % (self.name, str(e)) return 1 self.print_matches(self.db.get_matches(first = 0)) if not self.opts.tabcomplete: self.db.save() return 0 def complete_again(self): self.db = DB() qargs, partial = self.tabcomplete_query_args() self.db.set_query_args(qargs, secondary=True) try: self.db.build_query() except SavedStateError, e: return 0 self.print_completions(self.db.get_matches(first = 0)) return 0 def do_last(self, args): "last [count]: show the last results again" self.db = DB() try: self.db.build_query() except SavedStateError, e: print >>sys.stderr, "%s: maybe you need to run '%s search' first?" % (self.name, str(e)) return 1 count = int(args[0]) if args else 20 first = int(self.db.get_cache_last("lastfirst", 0)) self.print_matches(self.db.get_matches(first=first, count=count)) return 0 def do_more(self, args): "more [count]: show more terms from the last search" self.db = DB() try: self.db.build_query() except SavedStateError, e: print >>sys.stderr, "%s: maybe you need to run '%s search' first?" 
% (self.name, str(e)) return 1 count = int(args[0]) if args else 20 self.print_matches(self.db.get_matches(count=count)) if not self.opts.tabcomplete: self.db.save() return 0 def do_show(self, args): "show pkgname[s]: run apt-cache show pkgname[s]" os.execvp("/usr/bin/apt-cache", ["apt-cache", "show"] + args) def complete_show(self): self.db = DB() if self.opts.tabcomplete == "partial" and self.args: blacklist = set(self.args[:-1]) partial = self.args[-1] else: blacklist = set(self.args) partial = None if self.opts.last: # Expand from output of last query try: self.db.build_query() first = int(self.db.get_cache_last("lastfirst", 0)) for m in self.db.get_matches(first = first): pkg = m.document.get_data() if partial is not None and not pkg.startswith(partial): continue if pkg not in blacklist: print pkg except SavedStateError, e: return 0 else: # Prefix expand for term in self.db.db.allterms("XP" + (partial or "")): pkg = term.term[2:] if pkg not in blacklist: print pkg return 0 def do_showpkg(self, args): "showpkg pkgname[s]: run apt-cache showpkg pkgname[s]" os.execvp("/usr/bin/apt-cache", ["apt-cache", "showpkg"] + args) complete_showpkg = complete_show def do_showsrc(self, args): "showsrc pkgname[s]: run apt-cache showsrc pkgname[s]" os.execvp("/usr/bin/apt-cache", ["apt-cache", "showsrc"] + args) complete_showsrc = complete_show def do_depends(self, args): "depends pkgname[s]: run apt-cache depends pkgname[s]" os.execvp("/usr/bin/apt-cache", ["apt-cache", "depends"] + args) complete_depends = complete_show def do_rdepends(self, args): "rdepends pkgname[s]: run apt-cache rdepends pkgname[s]" db = DB() for name in args: print name print "Reverse Depends:" for pfx in ("XRD", "XRR", "XRS", "XRE", "XRP", "XRB", "XRC"): for pkg in db.get_rdeps(name, pfx): print " ", pkg complete_rdepends = complete_show def do_rdetails(self, args): "rdetails pkgname[s]: show details of reverse relationships for the given packages" db = DB() for name in args: for pfx, tag in ( ("XRD", 
"dep"), ("XRR", "rec"), ("XRS", "sug"), ("XRE", "enh"), ("XRP", "pre"), ("XRB", "bre"), ("XRC", "con")): deps = list(db.get_rdeps(name, pfx)) if not deps: continue print name, tag, " ".join(deps) complete_rdetails = complete_show def do_policy(self, args): "policy pkgname[s]: run apt-cache policy pkgname[s]" os.execvp("/usr/bin/apt-cache", ["apt-cache", "policy"] + args) complete_policy = complete_show def do_madison(self, args): "madison pkgname[s]: run apt-cache madison pkgname[s]" os.execvp("/usr/bin/apt-cache", ["apt-cache", "madison"] + args) complete_madison = complete_show def format_help(self, out): print >>out, "Commands:" itemlist = [] maxusage = 0 for k, v in sorted(self.__class__.__dict__.iteritems()): if not k.startswith("do_"): continue line = self.name + " " + v.__doc__ usage, desc = line.split(": ", 1) if len(usage) > maxusage: maxusage = len(usage) itemlist.append([usage, desc]) print >>out, " search commands:" for usage, desc in [x for x in itemlist if "run apt-cache" not in x[1]]: print >>out, " %-*.*s %s" % (maxusage, maxusage, usage, desc) print >>out, " apt-cache front-ends:" for usage, desc in [x for x in itemlist if "run apt-cache" in x[1]]: print >>out, " %-*.*s %s" % (maxusage, maxusage, usage, desc) def do_help(self, args): "help: show a summary of commands" self.parser.print_help() return 0 def clean_suggestions(self, eset): res = [] for r in eset: res.append(self.db.unprefix(r.term)) #res.sort() return res def print_completions(self, matches): if self.opts.tabcomplete == "partial": prefix = self.args[-1] if self.args else None exclude = self.args[:-1] else: prefix = None exclude = self.args exclude = [x for x in exclude if x.lower() not in self.BOOLWORDS] for s in self.clean_suggestions(self.db.get_suggestions(count=10, filter=DB.BasicFilter(stemmer=self.db.stem, exclude=exclude, prefix=prefix))): if s.startswith("tag:"): print s[4:] else: print s def print_matches(self, matches): if self.opts.tags: facets, tags = self.db.vocabulary() 
def describe_tag(tag): f, t = tag.split("::") try: fd = facets[f]["Description"].split("\n", 1)[0].strip() td = tags[tag]["Description"].split("\n", 1)[0].strip() return "%s: %s" % (fd, td) except KeyError: return None maxscore = None for res in self.db.get_suggestions(count=20, filter=DB.TagFilter()): # Normalise the score in the interval [0, 1] weight = math.log(res.weight) if maxscore == None: maxscore = weight score = float(weight) / maxscore tag = self.db.unprefix(res.term)[4:] desc = describe_tag(tag) if desc is None: print "%i%% %s" % (int(score * 100), tag) else: print "%i%% %s -- %s" % (int(score * 100), tag, desc) else: est = matches.get_matches_estimated() first = matches.get_firstitem() count = matches.size() print "%i results found." % est if count != 0: print "Results %i-%i:" % (first + 1, first + count) elif first != 0: print "No more results to show" self.print_all_matches((m for m in matches)) if first == 0: sc = self.db.get_spelling_correction() if sc: print "Did you mean:", sc, "?" 
sugg = self.clean_suggestions(self.db.get_suggestions(count=7, filter=DB.TermFilter(stemmer=self.db.stem, exclude=self.args))) print "More terms:", " ".join(sugg) stags = self.clean_suggestions(self.db.get_suggestions(count=7, filter=DB.TagFilter())) print "More tags:", " ".join([x[4:] for x in stags]) if first + count < est: print "`%s more' will give more results" % self.name if first > 0: print "`%s again' will restart the search" % self.name def print_all_matches(self, matches_iter): """ Given an iterator of Xapian matches, print them all out nicely formatted """ aptcache = apt.Cache(progress=AptSilentProgress()) for m in matches_iter: name = m.document.get_data() try: pkg = aptcache[name] except KeyError: pkg = None if pkg is not None and pkg.candidate: print "%i%% %s - %s" % (m.percent, name, pkg.candidate.summary) else: print "%i%% %s - (unknown by apt)" % (m.percent, name) def perform(self): self.cmd = "help" if not self.args else self.args.pop(0) if self.opts.tabcomplete is not None: f = getattr(self, "complete_" + self.cmd, None) if f is None: return 0 return f() else: f = getattr(self, "do_" + self.cmd, None) if f is None: print >>sys.stderr, "Invalid command: `%s'.\n" % self.cmd self.do_help(self.args) return 1 return f(self.args) if __name__ == "__main__": ui = Cmdline() if helponly: sys.exit(0) sys.exit(ui.perform()) apt-xapian-index-0.47ubuntu13/tests/0000755000000000000000000000000013070503156014252 5ustar apt-xapian-index-0.47ubuntu13/tests/dbus-update-apt-xapian-index.py0000644000000000000000000000106413070503156022207 0ustar #!/usr/bin/python import dbus import os import glib import dbus.mainloop.glib dbus.mainloop.glib.DBusGMainLoop(set_as_default=True) def finished(res): print "finished: ", res def progress(percent): print "progress: ", percent system_bus = dbus.SystemBus() axi = dbus.Interface(system_bus.get_object("org.debian.AptXapianIndex","/"), "org.debian.AptXapianIndex") axi.connect_to_signal("UpdateFinished", finished) 
axi.connect_to_signal("UpdateProgress", progress) # force, update_only axi.update_async(True, True) glib.MainLoop().run() apt-xapian-index-0.47ubuntu13/axi-cache.sh0000644000000000000000000000323613070503156015272 0ustar # axi-cache(1) completion # # © 2010, David Paleino # # This file is released under GPL-2, or later. # This is for development only. It's normally defined inside bash-completion. #have() { return 0; } have axi-cache && _axi_cache() { local cur prev cmd COMPREPLY=() type _get_comp_words_by_ref &>/dev/null && { _get_comp_words_by_ref -n: cur prev } || { cur=$(_get_cword ":") prev=${COMP_WORDS[$COMP_CWORD-1]} } cmd=${COMP_WORDS[1]} case "$prev" in *axi-cache*) COMPREPLY=( $(compgen -W "help more search show again showpkg showsrc depends rdepends policy madison" -- "$cur") ) return 0 ;; --sort) COMPREPLY=( $(compgen -W "$(egrep ^[a-z] /var/lib/apt-xapian-index/values | awk -F"\t" '{print $1}')" -- "$cur") ) return 0 ;; esac case "$cmd" in search|again) if [[ "$cur" == -* ]]; then COMPREPLY=( $(compgen -W "--sort --tags" -- "$cur") ) return 0 fi ;; show|showpkg|showsrc|depends|rdepends|policy|madison) if [[ "$cur" == -* ]]; then COMPREPLY=( $(compgen -W "--last" -- "$cur") ) return 0 fi ;; *) return 0 ;; esac if [ -n "$cur" ]; then #unset COMP_WORDS[$COMP_CWORD] COMPREPLY=( $(compgen -W "$(${COMP_WORDS[@]} --tabcomplete=partial)" -- "$cur") ) else COMPREPLY=( $(compgen -W "$(${COMP_WORDS[@]} --tabcomplete=plain)" -- "$cur") ) fi return 0 } && complete -F _axi_cache axi-cache apt-xapian-index-0.47ubuntu13/update-apt-xapian-index-dbus0000755000000000000000000000745313070503156020431 0ustar #!/usr/bin/python import logging import os import string import subprocess import sys try: import glib import gobject import dbus import dbus.service import dbus.mainloop.glib except ImportError, e: sys.stderr.write("Failed to import '%s', can not use dbus" % e) sys.exit(1) class PermissionDeniedError(dbus.DBusException): " permission denied by policy " pass class
AptXapianIndexDBusService(dbus.service.Object): DBUS_INTERFACE_NAME = "org.debian.AptXapianIndex" def __init__(self): bus_name = dbus.service.BusName(self.DBUS_INTERFACE_NAME, bus=dbus.SystemBus()) dbus.service.Object.__init__(self, bus_name, '/') self._active_axi = None def _authWithPolicyKit(self, sender, connection, priv): system_bus = dbus.SystemBus() obj = system_bus.get_object("org.freedesktop.PolicyKit1", "/org/freedesktop/PolicyKit1/Authority", "org.freedesktop.PolicyKit1.Authority") policykit = dbus.Interface(obj, "org.freedesktop.PolicyKit1.Authority") subject = ('system-bus-name', { 'name': dbus.String(sender, variant_level = 1) } ) details = { '' : '' } flags = dbus.UInt32(1) # AllowUserInteraction = 0x00000001 cancel_id = '' (ok, notused, details) = policykit.CheckAuthorization( subject, priv, details, flags, cancel_id) return ok @dbus.service.signal(dbus_interface=DBUS_INTERFACE_NAME, signature="b") def UpdateFinished(self, res): logging.debug("Emitting UpdateFinished: %s" % res) @dbus.service.signal(dbus_interface=DBUS_INTERFACE_NAME, signature="i") def UpdateProgress(self, percent): logging.debug("Emitting UpdateProgress: %s" % percent) def _update_apt_xapian_index(self, cmd): p = subprocess.Popen(cmd, stdout=subprocess.PIPE) self._active_axi = p while True: while gobject.main_context_default().pending(): gobject.main_context_default().iteration() res = p.poll() if res is not None: break line = p.stdout.readline().strip() if not line: continue try: (op, progress) = string.split(line, sep=":", maxsplit=1) if op == "progress": percent = int(progress.split("/")[0]) self.UpdateProgress(percent) except ValueError: pass # axi finished self._active_axi = None # emit finish signal self.UpdateFinished(res == 0) @dbus.service.method(DBUS_INTERFACE_NAME, in_signature='bb', out_signature='', sender_keyword='sender', connection_keyword='conn') def update_async(self, force, update_only, sender=None, conn=None): if not self._authWithPolicyKit(sender, conn, 
"org.debian.aptxapianindex.update"): raise PermissionDeniedError, "Permission denied by policy" # do not start update-apt-xapian-index twice, the clients will # get the finished signal from the previous running one if self._active_axi: return cmd = ["/usr/sbin/update-apt-xapian-index", "--batch-mode"] if force: cmd.append("--force") if update_only: cmd.append("--update") glib.timeout_add(100, self._update_apt_xapian_index, cmd) if __name__ == "__main__": dbus.mainloop.glib.DBusGMainLoop(set_as_default=True) server = AptXapianIndexDBusService() gobject.MainLoop().run() apt-xapian-index-0.47ubuntu13/update-apt-xapian-index0000755000000000000000000000655013070503156017473 0ustar #!/usr/bin/python # -*- coding: utf-8 -*- # # update-apt-xapian-index - Maintain a system-wide Xapian index of Debian # package information # # Copyright (C) 2007--2010 Enrico Zini # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
# # You should have received a copy of the GNU General Public License # along with this program; if not, write to the Free Software # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA # # # Main program body # # Minimal imports so we are always able to print command line help from optparse import OptionParser import sys import warnings VERSION="0.47" class Parser(OptionParser): def __init__(self, *args, **kwargs): OptionParser.__init__(self, *args, **kwargs) def error(self, msg): sys.stderr.write("%s: error: %s\n\n" % (self.get_prog_name(), msg)) self.print_help(sys.stderr) sys.exit(2) parser = Parser(usage="usage: %prog [options]", version="%prog "+ VERSION, description="Rebuild the Apt Xapian index") parser.add_option("-q", "--quiet", action="store_true", help="quiet mode: only output fatal errors") parser.add_option("-v", "--verbose", action="store_true", help="verbose mode") parser.add_option("-f", "--force", action="store_true", help="force database rebuild even if it's already up to date") parser.add_option("--pkgfile", action="append", help="do not use the APT cache, but the given Package file") parser.add_option("--batch-mode", action="store_true", help="use progress reporting suitable for programmatic parsing.") parser.add_option("-u","--update", action="store_true", help="incremental update, reindexing only those packages whose version has changed since the last run") (opts, args) = parser.parse_args() # Rest of the imports here import axi import axi.indexer #if opts.quiet: print "quiet" #if opts.verbose: print "verbose" #if opts.force: print "force" # Instantiate the progress report if opts.batch_mode: quietapt = False progress = axi.indexer.BatchProgress() elif opts.quiet: quietapt = True progress = axi.indexer.SilentProgress() warnings.filterwarnings("ignore","") else: quietapt = False progress = axi.indexer.Progress() if opts.verbose: progress.is_verbose = True # Create the indexer indexer = axi.indexer.Indexer(progress,
quietapt) # Lock the session so that we prevent concurrent updates try: locked = indexer.lock() except OSError, e: import errno if e.errno == errno.EACCES: print >>sys.stderr, "You probably need to be root to do this." sys.exit(1) raise if not locked: indexer.slave() sys.exit(0) # Set up the indexer and check that we have something to do if not indexer.setupIndexing(force=opts.force, system=opts.pkgfile is None): sys.exit(0) if opts.update: # update only mode indexer.incrementalUpdate() else: indexer.rebuild(opts.pkgfile) sys.exit(0) apt-xapian-index-0.47ubuntu13/aliases/0000755000000000000000000000000013070503156014531 5ustar apt-xapian-index-0.47ubuntu13/aliases/popular-apps0000644000000000000000000000033313070503156017076 0ustar # Aliases expanding names of popular applications excel XToffice::spreadsheet powerpoint XToffice::presentation photoshop XTworks-with::image:raster coreldraw XTworks-with::image:vector autocad XTworks-with::3dmodel apt-xapian-index-0.47ubuntu13/test/0000755000000000000000000000000013070503156014067 5ustar apt-xapian-index-0.47ubuntu13/test/tools.py0000755000000000000000000000137113070503156015606 0ustar # -*- coding: utf-8 -*- import unittest import axi import axi.indexer import os.path class AxiTestBase(unittest.TestCase): def assertCleanIndex(self): self.assert_(os.path.exists(axi.XAPIANINDEX)) self.assert_(os.path.exists(axi.XAPIANDBSTAMP)) self.assert_(os.path.exists(axi.XAPIANDBVALUES)) self.assert_(os.path.exists(axi.XAPIANDBDOC)) self.assert_(not os.path.exists(axi.XAPIANDBUPDATESOCK)) # Index is clean and up to date: an indexer should tell us that it does # not need to run progress = axi.indexer.SilentProgress() indexer = axi.indexer.Indexer(progress, True) self.assert_(indexer.lock()) self.assert_(not indexer.setupIndexing()) indexer = None apt-xapian-index-0.47ubuntu13/test/test_indexer.py0000644000000000000000000002660313070503156017145 0ustar # -*- coding: utf-8 -*- import unittest import sys, os.path import axi import 
axi.indexer import shutil import subprocess import tools def smallcache(pkglist=["apt", "libept-dev", "gedit"]): class sc(object): def __init__(self, cache): self._pkgs = pkglist self._cache = cache def has_key(self, name): return name in self._pkgs def __len__(self): return len(self._pkgs) def __iter__(self): for p in self._pkgs: yield self._cache[p] def __getitem__(self, name): if name not in self._pkgs: raise KeyError, "`%s' not in wrapped cache" % name return self._cache[name] return sc class TestIndexer(tools.AxiTestBase): def setUp(self): # Remove the text index if it exists if os.path.exists(axi.XAPIANDBPATH): shutil.rmtree(axi.XAPIANDBPATH) # Prepare a quiet indexer progress = axi.indexer.SilentProgress() self.indexer = axi.indexer.Indexer(progress, True) def tearDown(self): # Explicitly set indexer to none, otherwise in the next setUp we rmtree # testdb before the indexer had a chance to delete its lock self.indexer = None def testAptRebuild(self): self.indexer._test_wrap_apt_cache(smallcache()) # No other indexers are running, ensure lock succeeds self.assert_(self.indexer.lock()) # No index exists, so the indexer should decide it needs to run self.assert_(self.indexer.setupIndexing()) # Perform a rebuild self.indexer.rebuild() # Close the indexer self.indexer = None # Ensure that we have an index self.assertCleanIndex() def testDeb822Rebuild(self): pkgfile = os.path.join(axi.XAPIANDBPATH, "packages") subprocess.check_call("apt-cache show apt libept-dev gedit > " + pkgfile, shell=True) # No other indexers are running, ensure lock succeeds self.assert_(self.indexer.lock()) # No index exists, so the indexer should decide it needs to run self.assert_(self.indexer.setupIndexing()) # Perform a rebuild self.indexer.rebuild([pkgfile]) # Close the indexer self.indexer = None # Ensure that we have an index self.assertCleanIndex() def testIncrementalRebuild(self): # Perform the initial indexing progress = axi.indexer.SilentProgress() pre_indexer = 
axi.indexer.Indexer(progress, True) pre_indexer._test_wrap_apt_cache(smallcache(["apt", "libept-dev", "gedit"])) self.assert_(pre_indexer.lock()) self.assert_(pre_indexer.setupIndexing()) pre_indexer.rebuild() pre_indexer = None curidx = open(axi.XAPIANINDEX).read() # Ensure that we have an index self.assertCleanIndex() # Prepare an incremental update self.indexer._test_wrap_apt_cache(smallcache(["apt", "coreutils", "gedit"])) # No other indexers are running, ensure lock succeeds self.assert_(self.indexer.lock()) # An index exists the plugin modification timestamps are the same, so # we need to force the indexer to run self.assert_(not self.indexer.setupIndexing()) self.assert_(self.indexer.setupIndexing(force=True)) # Perform a rebuild self.indexer.incrementalUpdate() # Close the indexer self.indexer = None # Ensure that we have an index self.assertCleanIndex() # Ensure that we did not create a new index self.assertEqual(open(axi.XAPIANINDEX).read(), curidx) def testIncrementalRebuildFromEmpty(self): # Prepare an incremental update self.indexer._test_wrap_apt_cache(smallcache()) # No other indexers are running, ensure lock succeeds self.assert_(self.indexer.lock()) # No index exists, so the indexer should decide it needs to run self.assert_(self.indexer.setupIndexing()) # Perform an incremental rebuild, which should fall back on a normal # one self.indexer.incrementalUpdate() # Close the indexer self.indexer = None # Ensure that we have an index self.assertCleanIndex() # def test_url(self): # """ Environ: URL building """ # request.bind({'HTTP_HOST':'example.com'}, None) # self.assertEqual('http://example.com/', request.url) # request.bind({'SERVER_NAME':'example.com'}, None) # self.assertEqual('http://example.com/', request.url) # request.bind({'SERVER_NAME':'example.com', 'SERVER_PORT':'81'}, None) # self.assertEqual('http://example.com:81/', request.url) # request.bind({'wsgi.url_scheme':'https', 'SERVER_NAME':'example.com'}, None) # 
self.assertEqual('https://example.com:80/', request.url) # request.bind({'HTTP_HOST':'example.com', 'PATH_INFO':'/path', 'QUERY_STRING':'1=b&c=d', 'SCRIPT_NAME':'/sp'}, None) # self.assertEqual('http://example.com/sp/path?1=b&c=d', request.url) # # def test_dict_access(self): # """ Environ: request objects are environment dicts """ # e = {} # wsgiref.util.setup_testing_defaults(e) # request.bind(e, None) # for k, v in e.iteritems(): # self.assertTrue(k in request) # self.assertTrue(request[k] == v) # # def test_header_access(self): # """ Environ: Request objects decode headers """ # e = {} # wsgiref.util.setup_testing_defaults(e) # e['HTTP_SOME_HEADER'] = 'some value' # request.bind(e, None) # request['HTTP_SOME_OTHER_HEADER'] = 'some other value' # self.assertTrue('Some-Header' in request.header) # self.assertTrue(request.header['Some-Header'] == 'some value') # self.assertTrue(request.header['Some-Other-Header'] == 'some other value') # # # def test_cookie(self): # """ Environ: COOKIES """ # t = dict() # t['a=a'] = {'a': 'a'} # t['a=a; b=b'] = {'a': 'a', 'b':'b'} # t['a=a; a=b'] = {'a': 'b'} # for k, v in t.iteritems(): # request.bind({'HTTP_COOKIE': k}, None) # self.assertEqual(v, request.COOKIES) # # def test_get(self): # """ Environ: GET data """ # e = {} # e['QUERY_STRING'] = 'a=a&a=1&b=b&c=c' # request.bind(e, None) # self.assertTrue('a' in request.GET) # self.assertTrue('b' in request.GET) # self.assertEqual(['a','1'], request.GET.getall('a')) # self.assertEqual(['b'], request.GET.getall('b')) # self.assertEqual('1', request.GET['a']) # self.assertEqual('b', request.GET['b']) # # def test_post(self): # """ Environ: POST data """ # sq = u'a=a&a=1&b=b&c=c'.encode('utf8') # e = {} # wsgiref.util.setup_testing_defaults(e) # e['wsgi.input'].write(sq) # e['wsgi.input'].seek(0) # e['CONTENT_LENGTH'] = str(len(sq)) # e['REQUEST_METHOD'] = "POST" # request.bind(e, None) # self.assertTrue('a' in request.POST) # self.assertTrue('b' in request.POST) # 
self.assertEqual(['a','1'], request.POST.getall('a')) # self.assertEqual(['b'], request.POST.getall('b')) # self.assertEqual('1', request.POST['a']) # self.assertEqual('b', request.POST['b']) # # def test_params(self): # """ Environ: GET and POST are combined in request.param """ # e = {} # wsgiref.util.setup_testing_defaults(e) # e['wsgi.input'].write(tob('b=b&c=p')) # e['wsgi.input'].seek(0) # e['CONTENT_LENGTH'] = '7' # e['QUERY_STRING'] = 'a=a&c=g' # e['REQUEST_METHOD'] = "POST" # request.bind(e, None) # self.assertEqual(['a','b','c'], sorted(request.params.keys())) # self.assertEqual('p', request.params['c']) # # def test_getpostleak(self): # """ Environ: GET and POST should not leak into each other """ # e = {} # wsgiref.util.setup_testing_defaults(e) # e['wsgi.input'].write(u'b=b'.encode('utf8')) # e['wsgi.input'].seek(0) # e['CONTENT_LENGTH'] = '3' # e['QUERY_STRING'] = 'a=a' # e['REQUEST_METHOD'] = "POST" # request.bind(e, None) # self.assertEqual(['a'], request.GET.keys()) # self.assertEqual(['b'], request.POST.keys()) # # def test_body(self): # """ Environ: Request.body should behave like a file object factory """ # e = {} # wsgiref.util.setup_testing_defaults(e) # e['wsgi.input'].write(u'abc'.encode('utf8')) # e['wsgi.input'].seek(0) # e['CONTENT_LENGTH'] = str(3) # request.bind(e, None) # self.assertEqual(u'abc'.encode('utf8'), request.body.read()) # self.assertEqual(u'abc'.encode('utf8'), request.body.read(3)) # self.assertEqual(u'abc'.encode('utf8'), request.body.readline()) # self.assertEqual(u'abc'.encode('utf8'), request.body.readline(3)) # # def test_bigbody(self): # """ Environ: Request.body should handle big uploads using files """ # e = {} # wsgiref.util.setup_testing_defaults(e) # e['wsgi.input'].write((u'x'*1024*1000).encode('utf8')) # e['wsgi.input'].seek(0) # e['CONTENT_LENGTH'] = str(1024*1000) # request.bind(e, None) # self.assertTrue(hasattr(request.body, 'fileno')) # self.assertEqual(1024*1000, len(request.body.read())) #
self.assertEqual(1024, len(request.body.read(1024))) # self.assertEqual(1024*1000, len(request.body.readline())) # self.assertEqual(1024, len(request.body.readline(1024))) # # def test_tobigbody(self): # """ Environ: Request.body should truncate to Content-Length bytes """ # e = {} # wsgiref.util.setup_testing_defaults(e) # e['wsgi.input'].write((u'x'*1024).encode('utf8')) # e['wsgi.input'].seek(0) # e['CONTENT_LENGTH'] = '42' # request.bind(e, None) # self.assertEqual(42, len(request.body.read())) # self.assertEqual(42, len(request.body.read(1024))) # self.assertEqual(42, len(request.body.readline())) # self.assertEqual(42, len(request.body.readline(1024))) # #class TestMultipart(unittest.TestCase): # def test_multipart(self): # """ Environ: POST (multipart files and multible values per key) """ # fields = [('field1','value1'), ('field2','value2'), ('field2','value3')] # files = [('file1','filename1.txt','content1'), ('file2','filename2.py',u'äöü')] # e = tools.multipart_environ(fields=fields, files=files) # request.bind(e, None) # # File content # self.assertTrue('file1' in request.POST) # self.assertEqual('content1', request.POST['file1'].file.read()) # # File name and meta data # self.assertTrue('file2' in request.POST) # self.assertEqual('filename2.py', request.POST['file2'].filename) # # UTF-8 files # x = request.POST['file2'].file.read() # if sys.version_info >= (3,0,0): # x = x.encode('ISO-8859-1') # self.assertEqual(u'äöü'.encode('utf8'), x) # # No file # self.assertTrue('file3' not in request.POST) # # Field (single) # self.assertEqual('value1', request.POST['field1']) # # Field (multi) # self.assertEqual(2, len(request.POST.getall('field2'))) # self.assertEqual(['value2', 'value3'], request.POST.getall('field2')) if __name__ == '__main__': unittest.main() apt-xapian-index-0.47ubuntu13/plugins/0000755000000000000000000000000013070503156014571 5ustar apt-xapian-index-0.47ubuntu13/plugins/descriptions.py0000644000000000000000000001146313070503156017656 
0ustar try: import apt import apt_pkg HAS_APT=True except ImportError: HAS_APT=False import xapian import re import os, os.path class Descriptions: def info(self): """ Return general information about the plugin. The information returned is a dict with various keywords: timestamp (required) the last modified timestamp of this data source. This will be used to see if we need to update the database or not. A timestamp of 0 means that this data source is either missing or always up to date. values (optional) an array of dicts { name: name, desc: description }, one for every numeric value indexed by this data source. Note that this method can be called before init. The idea is that, if the timestamp shows that this plugin is currently not needed, then the long initialisation can just be skipped. """ res = dict( timestamp=0, prefixes=[ dict(idx="Z", qp=None, type=None, desc="Stemmed forms of keywords", ldesc="This contains the stemmed forms of keywords as generated by" " TermGenerator and matched by QueryParser"), ], ) if not HAS_APT: return res if not hasattr(apt_pkg, "config"): return res fname = apt_pkg.config.find_file("Dir::Cache::pkgcache") if not os.path.exists(fname): return res res["sources"] = [dict(path=fname, desc="APT index")] res["timestamp"] = os.path.getmtime(fname) return res def init(self, info, progress): """ If needed, perform long initialisation tasks here. info is a dictionary with useful information. Currently it contains the following values: "values": a dict mapping index mnemonics to index numbers The progress indicator can be used to report progress. """ self.stemmer = xapian.Stem("english") self.indexer = xapian.TermGenerator() self.indexer.set_stemmer(self.stemmer) def send_extra_info(self, db=None, **kw): """ Receive extra parameters from the indexer. This may be called more than once, but after init(). 
We are using this to get the database instance """ if db is not None: self.indexer.set_flags(xapian.TermGenerator.FLAG_SPELLING) self.indexer.set_database(db) def doc(self): """ Return documentation information for this data source. The documentation information is a dictionary with these keys: name: the name for this data source shortDesc: a short description fullDoc: the full description as a chapter in ReST format """ return dict( name = "Package descriptions", shortDesc = "terms extracted from the package descriptions using Xapian's TermGenerator", fullDoc = """ The Descriptions data source simply uses Xapian's TermGenerator to tokenise and index the package descriptions. Currently this creates normal terms as well as stemmed terms prefixed with ``Z``. """ ) def index(self, document, pkg): """ Update the document with the information from this data source. document is the document to update pkg is the python-apt Package object for this package """ self.indexer.set_document(document) # Index the record self.indexer.index_text_without_positions(pkg.name) version = pkg.candidate if version is not None: self.indexer.index_text_without_positions(version.raw_description) if not HAS_APT: def index(self, document, pkg): pass def indexDeb822(self, document, pkg): """ Update the document with the information from this data source. This is alternative to index, and it is used when indexing with package data taken from a custom Packages file. document is the document to update pkg is the Deb822 object for this package """ self.indexer.set_document(document) # Index the record self.indexer.index_text_without_positions(pkg["Package"]) if 'Description' in pkg: self.indexer.index_text_without_positions(pkg["Description"]) else: # check if we have a translated description for k in pkg.keys(): if k.startswith('Description-'): self.indexer.index_text_without_positions(pkg[k]) break def init(**kw): """ Create and return the plugin object. 
""" return Descriptions() apt-xapian-index-0.47ubuntu13/plugins/template.py0000644000000000000000000000653713070503156016771 0ustar class Template: def info(self): """ Return general information about the plugin. The information returned is a dict with various keywords: timestamp (required) the last modified timestamp of this data source. This will be used to see if we need to update the database or not. A timestamp of 0 means that this data source is either missing or always up to date. values (optional) an array of dicts { name: name, desc: description }, one for every numeric value indexed by this data source. sources (optional) ad array of dicts { path: pathname, desc: description }, one for every data file accessed by this plugin. A directory can be provided as path, meaning "it accesses all sorts of files inside this directory": for example the APT index, or the app-install-data files. Use [] to mean "no sources". Note that this method can be called before init. The idea is that, if the timestamp shows that this plugin is currently not needed, then the long initialisation can just be skipped. """ return dict(timestamp = 0, values = [], sources=[]) def init(self, info, progress): """ If needed, perform long initialisation tasks here. info is a dictionary with useful information. Currently it contains the following values: "values": a dict mapping index mnemonics to index numbers The progress indicator can be used to report progress. """ pass def doc(self): """ Return documentation information for this data source. The documentation information is a dictionary with these keys: name: the name for this data source shortDesc: a short description fullDoc: the full description as a chapter in ReST format """ #return dict( # name = "template", # shortDesc = "Example plugin that does nothing", # fullDoc = """ # Here goes full documentation in ReST format # """ #) # Return None if you don't want this file to appear in the # documentation. 
Probably, only the Template plugin should do that. return None def index(self, document, pkg): """ Update the document with the information from this data source. document is the document to update pkg is the python-apt Package object for this package """ pass def indexDeb822(self, document, pkg): """ Update the document with the information from this data source. This is an alternative to index, and it is used when indexing with package data taken from a custom Packages file. document is the document to update pkg is the Deb822 object for this package """ pass def finished(self): """ Called when the indexing is finished """ def init(**kw): """ Create and return the plugin object. Keyword arguments can hold various kinds of information passed by the indexer. Currently documented is: - langs: sequence of languages for which we are building the index Return None here to disable this plugin. """ return Template() apt-xapian-index-0.47ubuntu13/plugins/debtags.py0000644000000000000000000001115313070503156016555 0ustar # Add debtags tags to the index import re, os, os.path try: from debian import debtags except ImportError: from debian_bundle import debtags DEBTAGSDB = os.environ.get("AXI_DEBTAGS_DB", "/var/lib/debtags/package-tags") class Debtags: def info(self): """ Return general information about the plugin. The information returned is a dict with various keywords: timestamp (required) the last modified timestamp of this data source. This will be used to see if we need to update the database or not. A timestamp of 0 means that this data source is either missing or always up to date. values (optional) an array of dicts { name: name, desc: description }, one for every numeric value indexed by this data source. Note that this method can be called before init. The idea is that, if the timestamp shows that this plugin is currently not needed, then the long initialisation can just be skipped.
""" return dict( timestamp=os.path.getmtime(DEBTAGSDB), sources=[dict(path=DEBTAGSDB, desc="Debtags tag information")], prefixes=[ dict(idx="XT", qp="tag:", type="bool", desc="Debtags tag", ldesc="Debtags package categories"), ], ) def init(self, info, progress): """ If needed, perform long initialisation tasks here. info is a dictionary with useful information. Currently it contains the following values: "values": a dict mapping index mnemonics to index numbers The progress indicator can be used to report progress. """ progress.begin("Reading Debtags database") self.db = debtags.DB() # Read full database, skipping special and todo tags tagFilter = re.compile(r"^special::.+$|^.+::TODO$") self.db.read(open(DEBTAGSDB, "r"), lambda x: not tagFilter.match(x)) progress.end() def doc(self): """ Return documentation information for this data source. The documentation information is a dictionary with these keys: name: the name for this data source shortDesc: a short description fullDoc: the full description as a chapter in ReST format """ return dict( name = "Debtags", shortDesc = "Debtags tag information", fullDoc = """ The Debtags data source indexes Debtags tags as terms with the ``XT`` prefix; for example: 'XTrole::program'. Using the ``XT`` terms, queries can be enhanced with semantic information. Xapian's support for complex expressions in queries can be used to great effect: for example:: XTrole::program AND XTuse::gameplaying AND (XTinterface::x11 OR XTinterface::3d) ``XT`` terms can also be used to improve the quality of search results. For example, the ``gimp`` package would not usually show up when searching the terms ``image editor``. This can be solved using the following technique: 1. Perform a normal query 2. Put the first 5 or so results in an Rset 3. Call Enquire::get_eset using the Rset and an expand filter that only accepts ``XT`` terms. This gives you the tags that are most relevant to the query. 4. 
Add the resulting terms to the initial query, and search again. The Debtags data source is provided by the ``debtags`` package. """ ) def index(self, document, pkg): """ Update the document with the information from this data source. document is the document to update pkg is the python-apt Package object for this package """ for tag in self.db.tags_of_package(pkg.name): document.add_term("XT"+tag) def indexDeb822(self, document, pkg): """ Update the document with the information from this data source. This is alternative to index, and it is used when indexing with package data taken from a custom Packages file. document is the document to update pkg is the Deb822 object for this package """ for tag in self.db.tags_of_package(pkg["Package"]): document.add_term("XT"+tag) def init(**kw): """ Create and return the plugin object. """ if not os.path.exists(DEBTAGSDB): return None return Debtags() apt-xapian-index-0.47ubuntu13/plugins/cataloged_time.py0000644000000000000000000001153313070503156020107 0ustar try: import apt_pkg HAS_APT=True except ImportError: HAS_APT=False import cPickle import os import os.path import time import xapian class CatalogedTime: PERSISTANCE_DIR = "/var/lib/apt-xapian-index/" CATALOGED_NAME = "cataloged_times.p" def info(self): """ Return general information about the plugin. The information returned is a dict with various keywords: timestamp (required) the last modified timestamp of this data source. This will be used to see if we need to update the database or not. A timestamp of 0 means that this data source is either missing or always up to date. values (optional) an array of dicts { name: name, desc: description }, one for every numeric value indexed by this data source. Note that this method can be called before init. The idea is that, if the timestamp shows that this plugin is currently not needed, then the long initialisation can just be skipped. 
""" res = dict(timestamp=0, values=[ dict(name="catalogedtime", desc="Cataloged timestamp"), ], sources=[ dict( path=os.path.join(self.PERSISTANCE_DIR, self.CATALOGED_NAME), desc="first-seen information for every package" ) ]) if not HAS_APT: return res if not hasattr(apt_pkg, "config"): return res fname = apt_pkg.config.find_file("Dir::Cache::pkgcache") if not os.path.exists(fname): return res res["timestamp"] = os.path.getmtime(fname) return res def init(self, info, progress): """ If needed, perform long initialisation tasks here. info is a dictionary with useful information. Currently it contains the following values: "values": a dict mapping index mnemonics to index numbers The progress indicator can be used to report progress. """ if os.access(self.PERSISTANCE_DIR, os.W_OK): if not os.path.exists(self.PERSISTANCE_DIR): os.makedirs(self.PERSISTANCE_DIR) self._packages_cataloged_file = os.path.join(self.PERSISTANCE_DIR, self.CATALOGED_NAME) else: self._packages_cataloged_file = None if (self._packages_cataloged_file and os.path.exists(self._packages_cataloged_file)): self._package_cataloged_time = cPickle.load(open(self._packages_cataloged_file)) else: self._package_cataloged_time = {} self.now = time.time() values = info['values'] self.value = values.get("catalogedtime", -1) def doc(self): """ Return documentation information for this data source. The documentation information is a dictionary with these keys: name: the name for this data source shortDesc: a short description fullDoc: the full description as a chapter in ReST format """ return dict( name = "Cataloged time", shortDesc = "store the timestamp when the package was first cataloged", fullDoc = """ This datasource simply stores a value with the timestamp when the package was first cataloged. This is useful to e.g. implement a 'Whats new' feature. """ ) def index(self, document, pkg): """ Update the document with the information from this data source. 
document is the document to update pkg is the python-apt Package object for this package """ time = self._package_cataloged_time.get(pkg.name, self.now) self._package_cataloged_time[pkg.name] = time document.add_value(self.value, xapian.sortable_serialise(time)) def indexDeb822(self, document, pkg): """ Update the document with the information from this data source. This is an alternative to index, and it is used when indexing with package data taken from a custom Packages file. document is the document to update pkg is the Deb822 object for this package """ pass def finished(self): """ Called when the indexing is finished """ if self._packages_cataloged_file: f=open(self._packages_cataloged_file+".new", "w") res = cPickle.dump(self._package_cataloged_time, f) f.close() os.rename(self._packages_cataloged_file+".new", self._packages_cataloged_file) def init(**kw): """ Create and return the plugin object. """ if not HAS_APT: return None return CatalogedTime() apt-xapian-index-0.47ubuntu13/plugins/translated-desc.py0000644000000000000000000001336313070503156020226 0ustar try: import apt HAS_APT=True except ImportError: HAS_APT=False import xapian import re import os, os.path, urllib try: from debian import deb822 except ImportError: from debian_bundle import deb822 APTLISTDIR="/var/lib/apt/lists" def translationFiles(langs=None): # Look for files like: ftp.uk.debian.org_debian_dists_sid_main_i18n_Translation-it # And extract the language code at the end tfile = re.compile(r"_i18n_Translation-([^-]+)$") for f in os.listdir(APTLISTDIR): mo = tfile.search(f) if not mo: continue if langs and not mo.group(1) in langs: continue yield urllib.unquote(mo.group(1)), os.path.join(APTLISTDIR, f) class Indexer: def __init__(self, lang, file): self.lang = lang self.xlang = lang.split("_")[0] self.indexer = xapian.TermGenerator() # Get a stemmer for this language, if available try: self.stemmer = xapian.Stem(self.xlang) self.indexer.set_stemmer(self.stemmer) except
xapian.InvalidArgumentError: pass # Read the translated descriptions self.descs = dict() desckey = "Description-"+self.lang for pkg in deb822.Deb822.iter_paragraphs(open(file)): # I need this if because in some translation files, some packages # have a different Description header. For example, in the -de # translations, I once found a Description-de.noguide: header # instead of Description-de: if desckey in pkg: self.descs[pkg["Package"]] = pkg[desckey] def index(self, document): name = document.get_data() self.indexer.set_document(document) self.indexer.index_text_without_positions(self.descs.get(name, "")) class TranslatedDescriptions: def __init__(self, langs): self.langs = langs def info(self): """ Return general information about the plugin. The information returned is a dict with various keywords: timestamp (required) the last modified timestamp of this data source. This will be used to see if we need to update the database or not. A timestamp of 0 means that this data source is either missing or always up to date. values (optional) an array of dicts { name: name, desc: description }, one for every numeric value indexed by this data source. Note that this method can be called before init. The idea is that, if the timestamp shows that this plugin is currently not needed, then the long initialisation can just be skipped. """ if not HAS_APT: return dict(timestamp = 0) filelist = translationFiles(self.langs) maxts = max([0] + [os.path.getmtime(f) for l, f in filelist]) return dict( timestamp=maxts, sources=[dict(path=f, desc="%s translation" % l) for l, f in filelist], prefixes=[ dict(idx="Z", qp=None, type=None, desc="Stemmed forms of keywords", ldesc="This contains the stemmed forms of keywords as generated by" " TermGenerator and matched by QueryParser"), ], ) def init(self, info, progress): """ If needed, perform long initialisation tasks here. info is a dictionary with useful information. 
Currently it contains the following values: "values": a dict mapping index mnemonics to index numbers The progress indicator can be used to report progress. """ self.indexers = [] for lang, file in translationFiles(self.langs): progress.begin("Reading %s translations from %s" % (lang, file)) self.indexers.append(Indexer(lang, file)) progress.end() def doc(self): """ Return documentation information for this data source. The documentation information is a dictionary with these keys: name: the name for this data source shortDesc: a short description fullDoc: the full description as a chapter in ReST format """ return dict( name = "Translated package descriptions", shortDesc = "terms extracted from the translated package descriptions using Xapian's TermGenerator", fullDoc = """ The TranslatedDescriptions data source reads translated description files from %s, then uses Xapian's TermGenerator to tokenise and index their content. Currently this creates normal terms as well as stemmed terms prefixed with ``Z``. """ % APTLISTDIR ) def index(self, document, pkg): """ Update the document with the information from this data source. document is the document to update pkg is the python-apt Package object for this package """ for i in self.indexers: i.index(document) def indexDeb822(self, document, pkg): """ Update the document with the information from this data source. This is alternative to index, and it is used when indexing with package data taken from a custom Packages file. document is the document to update pkg is the Deb822 object for this package """ for i in self.indexers: i.index(document) def init(langs=None, **kw): """ Create and return the plugin object. 
""" if not HAS_APT: return None if not langs: return None files = [f for l, f in translationFiles(langs)] if len(files) == 0: return None return TranslatedDescriptions(langs=langs) apt-xapian-index-0.47ubuntu13/plugins/sizes.py0000644000000000000000000001035613070503156016305 0ustar try: import apt import apt_pkg HAS_APT=True except ImportError: HAS_APT=False import xapian import os, os.path class Sizes: def info(self, **kw): """ Return general information about the plugin. The information returned is a dict with various keywords: timestamp (required) the last modified timestamp of this data source. This will be used to see if we need to update the database or not. A timestamp of 0 means that this data source is either missing or always up to date. values (optional) an array of dicts { name: name, desc: description }, one for every numeric value indexed by this data source. Note that this method can be called before init. The idea is that, if the timestamp shows that this plugin is currently not needed, then the long initialisation can just be skipped. """ res = dict( timestamp=0, values=[ dict(name = "installedsize", desc = "installed size"), dict(name = "packagesize", desc = "package size") ], ) if kw.get("system", True): if not HAS_APT: return res file = apt_pkg.config.find_file("Dir::Cache::pkgcache") if not os.path.exists(file): return res ts = os.path.getmtime(file) else: file = "(stdin)" ts = 0 res["sources"] = [dict(path=file, desc="APT index")] res["timestamp"] = ts return res def doc(self): """ Return documentation information for this data source. The documentation information is a dictionary with these keys: name: the name for this data source shortDesc: a short description fullDoc: the full description as a chapter in ReST format """ return dict( name = "Sizes", shortDesc = "package sizes indexed as values", fullDoc = """ The Sizes data source indexes the package size and the installed size as the ``packagesize`` and ``installedsize`` Xapian values. 
""" ) def init(self, info, progress): """ If needed, perform long initialisation tasks here. info is a dictionary with useful information. Currently it contains the following values: "values": a dict mapping index mnemonics to index numbers The progress indicator can be used to report progress. """ # Read the value indexes we will use values = info['values'] self.val_inst_size = values.get("installedsize", -1) self.val_pkg_size = values.get("packagesize", -1) def index(self, document, pkg): """ Update the document with the information from this data source. document is the document to update pkg is the python-apt Package object for this package """ ver = pkg.candidate if ver is None: return try: instSize = ver.installed_size pkgSize = ver.size except: return if self.val_inst_size != -1: document.add_value(self.val_inst_size, xapian.sortable_serialise(instSize)); if self.val_pkg_size != -1: document.add_value(self.val_pkg_size, xapian.sortable_serialise(pkgSize)); def indexDeb822(self, document, pkg): """ Update the document with the information from this data source. This is alternative to index, and it is used when indexing with package data taken from a custom Packages file. document is the document to update pkg is the Deb822 object for this package """ try: instSize = int(pkg["Installed-Size"]) pkgSize = int(pkg["Size"]) except: return if self.val_inst_size != -1: document.add_value(self.val_inst_size, xapian.sortable_serialise(instSize)); if self.val_pkg_size != -1: document.add_value(self.val_pkg_size, xapian.sortable_serialise(pkgSize)); def init(): """ Create and return the plugin object. 
""" return Sizes() apt-xapian-index-0.47ubuntu13/plugins/apttags.py0000644000000000000000000001266013070503156016613 0ustar # Add debtags tags to the index try: import apt import apt_pkg HAS_APT=True except ImportError: HAS_APT=False import re, os, os.path DEBTAGSDB = os.environ.get("AXI_DEBTAGS_DB", "/var/lib/debtags/package-tags") class AptTags(object): def __init__(self): self.re_expand = re.compile(r"\b([^{]+)\{([^}]+)\}") self.re_split = re.compile(r"\s*,\s*") def info(self): """ Return general information about the plugin. The information returned is a dict with various keywords: timestamp (required) the last modified timestamp of this data source. This will be used to see if we need to update the database or not. A timestamp of 0 means that this data source is either missing or always up to date. values (optional) an array of dicts { name: name, desc: description }, one for every numeric value indexed by this data source. Note that this method can be called before init. The idea is that, if the timestamp shows that this plugin is currently not needed, then the long initialisation can just be skipped. """ if not HAS_APT: return dict(timestamp = 0) if not hasattr(apt_pkg, "config"): return dict(timestamp = 0) file = apt_pkg.config.find_file("Dir::Cache::pkgcache") if not os.path.exists(file): return dict(timestamp = 0) return dict( timestamp=os.path.getmtime(file), sources=[dict(path=file, desc="APT index")], prefixes=[ dict(idx="XT", qp="tag:", type="bool", desc="Debtags tag", ldesc="Debtags package categories"), ], ) def init(self, info, progress): """ If needed, perform long initialisation tasks here. info is a dictionary with useful information. Currently it contains the following values: "values": a dict mapping index mnemonics to index numbers The progress indicator can be used to report progress. """ pass def doc(self): """ Return documentation information for this data source. 
The documentation information is a dictionary with these keys: name: the name for this data source shortDesc: a short description fullDoc: the full description as a chapter in ReST format """ return dict( name = "Apt tags", shortDesc = "Debtags tag information from the Packages file", fullDoc = """ The Apt tags data source indexes Debtags tags as found in the Packages file as terms with the ``XT`` prefix; for example: 'XTrole::program'. Using the ``XT`` terms, queries can be enhanced with semantic information. Xapian's support for complex expressions in queries can be used to great effect: for example:: XTrole::program AND XTuse::gameplaying AND (XTinterface::x11 OR XTinterface::3d) ``XT`` terms can also be used to improve the quality of search results. For example, the ``gimp`` package would not usually show up when searching the terms ``image editor``. This can be solved using the following technique: 1. Perform a normal query 2. Put the first 5 or so results in an Rset 3. Call Enquire::get_eset using the Rset and an expand filter that only accepts ``XT`` terms. This gives you the tags that are most relevant to the query. 4. Add the resulting terms to the initial query, and search again. The Apt tags data source will not work when Debtags is installed, as Debtags is able to provide a better set of tags. """ ) def _parse_and_index(self, tagstring, document): """ Parse tagstring into tags, and index the tags """ def expandTags(mo): root = mo.group(1) ends = self.re_split.split(mo.group(2)) return ", ".join([root + x for x in ends]) tagstring = self.re_expand.sub(expandTags, tagstring) for tag in self.re_split.split(tagstring): document.add_term("XT"+tag) def index(self, document, pkg): """ Update the document with the information from this data source. 
document is the document to update pkg is the python-apt Package object for this package """ ver = pkg.candidate if ver is None: return rec = ver.record if rec is None: return try: self._parse_and_index(rec['Tag'], document) except KeyError: return def indexDeb822(self, document, pkg): """ Update the document with the information from this data source. This is alternative to index, and it is used when indexing with package data taken from a custom Packages file. document is the document to update pkg is the Deb822 object for this package """ try: self._parse_and_index(pkg['Tag'], document) except KeyError: return def init(**kw): """ Create and return the plugin object. """ if os.path.exists(DEBTAGSDB): return None return AptTags() apt-xapian-index-0.47ubuntu13/plugins/app-install.py0000644000000000000000000002212413070503156017370 0ustar try: from xdg.DesktopEntry import DesktopEntry from xdg.Exceptions import ParsingError from xdg import Locale HAS_XDG=True except ImportError, e: HAS_XDG=False import axi.indexer import xapian import os, os.path APPINSTALLDIR="/usr/share/app-install/desktop/" class Indexer: def __init__(self, lang, val_popcon, progress=None): self.val_popcon = val_popcon self.progress = progress if lang is None: lang = "en" self.lang = lang self.xlang = lang.split("_")[0] self.xdglangs = Locale.expand_languages([lang]) self.indexer = xapian.TermGenerator() # Get a stemmer for this language, if available try: self.stemmer = xapian.Stem(self.xlang) self.indexer.set_stemmer(self.stemmer) except xapian.InvalidArgumentError: pass def index(self, document, fname, entry): # Index a single term "XD", marking that the package contains .desktop # files document.add_term("XD") # Index the name of the .desktop file, with prefix XDF document.add_term("XDF" + fname) # Index keywords retrieved in this indexer's language self.indexer.set_document(document) oldlangs = Locale.langs try: Locale.langs = self.xdglangs 
self.indexer.index_text_without_positions(entry.getName()) self.indexer.index_text_without_positions(entry.getGenericName()) self.indexer.index_text_without_positions(entry.getComment()) finally: Locale.langs = oldlangs # Index .desktop categories, with prefix XDT for cat in entry.getCategories(): document.add_term("XDT"+cat) # Add an "app-popcon" value with popcon rank try: popcon = int(entry.get("X-AppInstall-Popcon")) except ValueError, e: if self.progress: self.progress.verbose("%s: parsing X-AppInstall-Popcon: %s" % (fname, str(e))) popcon = -1 if self.val_popcon != -1: document.add_value(self.val_popcon, xapian.sortable_serialise(popcon)); class AppInstall(object): def __init__(self, langs, progress): self.langs = langs self.progress = progress def info(self): """ Return general information about the plugin. The information returned is a dict with various keywords: timestamp (required) the last modified timestamp of this data source. This will be used to see if we need to update the database or not. A timestamp of 0 means that this data source is either missing or always up to date. values (optional) an array of dicts { name: name, desc: description }, one for every numeric value indexed by this data source. Note that this method can be called before init. The idea is that, if the timestamp shows that this plugin is currently not needed, then the long initialisation can just be skipped. """ maxts = 0 for f in os.listdir(APPINSTALLDIR): if f[0] == '.' 
or not f.endswith(".desktop"): continue try: ts = os.path.getmtime(os.path.join(APPINSTALLDIR, f)) if ts > maxts: maxts = ts except OSError: # ignore file not found #752195 (potential race) pass return dict( timestamp = maxts, values = [ dict(name = "app-popcon", desc = "app-install .desktop popcon rank"), ], sources = [ dict(path=APPINSTALLDIR, desc=".desktop files provided by app-install-data"), ], prefixes = [ dict(idx="XD", qp=None, type=None, desc="Marker to indicate that the package contains .desktop files", ldesc="Only 'XD' can be present in the index. This is used to efficiently" " filter packages that have a .desktop file"), dict(idx="XDF", qp=None, type=None, desc="File name of the .desktop file", ldesc="This is the name of a .desktop file contained in the package." " There could be more than one"), dict(idx="XDT", qp=None, type="bool", desc="Categories from .desktop files", ldesc="This is similar to a tag, but filled with the categories found" " in .desktop files"), dict(idx="Z", qp=None, type=None, desc="Stemmed forms of keywords", ldesc="This contains the stemmed forms of keywords as generated by" " TermGenerator and matched by QueryParser"), ]) def init(self, info, progress): """ If needed, perform long initialisation tasks here. info is a dictionary with useful information. Currently it contains the following values: "values": a dict mapping index mnemonics to index numbers The progress indicator can be used to report progress. """ # Read the value indexes we will use values = info['values'] self.val_popcon = values.get("app-popcon", -1) self.indexers = [Indexer(lang, self.val_popcon, progress) for lang in [None] + list(self.langs)] self.entries = {} progress.begin("Reading .desktop files from %s" % APPINSTALLDIR) for f in os.listdir(APPINSTALLDIR): if f[0] == '.'
or not f.endswith(".desktop"): continue try: entry = DesktopEntry(os.path.join(APPINSTALLDIR, f)) except (ValueError, ParsingError, UnicodeDecodeError): # Invalid .desktop files can cause a ValueError. From PyXDG 0.25, # that case will be turned into a ParsingError. continue pkg = entry.get("X-AppInstall-Package") self.entries.setdefault(pkg, []).append((f, entry)) progress.end() def send_extra_info(self, db=None, **kw): """ Receive extra parameters from the indexer. This may be called more than once, but after init(). We are using this to get the database instance """ if db is not None: for i in self.indexers: i.indexer.set_flags(xapian.TermGenerator.FLAG_SPELLING) i.indexer.set_database(db) def doc(self): """ Return documentation information for this data source. The documentation information is a dictionary with these keys: name: the name for this data source shortDesc: a short description fullDoc: the full description as a chapter in ReST format """ return dict( name = "app-install information", shortDesc = "terms, categories and popcon values extracted from the app-install .desktop files", fullDoc = """ The AppInstall data source reads .desktop files from %s and adds the following terms: * keywords from the .desktop descriptions, via Xapian's TermGenerator, in all requested locales; * .desktop categories, with prefix XDT; * name of .desktop file, with prefix XDF; * a single term "XD", marking that the package contains .desktop files. It also adds an "app-popcon" value with popcon ranks from the app-install .desktop files. """ % APPINSTALLDIR ) def index(self, document, pkg): """ Update the document with the information from this data source. 
document is the document to update pkg is the python-apt Package object for this package """ name = document.get_data() for e in self.entries.get(name, []): fname, entry = e for i in self.indexers: i.index(document, fname, entry) def indexDeb822(self, document, pkg): """ Update the document with the information from this data source. This is alternative to index, and it is used when indexing with package data taken from a custom Packages file. document is the document to update pkg is the Deb822 object for this package """ name = document.get_data() for e in self.entries.get(name, []): fname, entry = e for i in self.indexers: i.index(document, fname, entry) def init(langs=None, progress=None, **kw): """ Create and return the plugin object. """ # If we don't have app-install data, skip it if not os.path.isdir(APPINSTALLDIR): return None # If we don't have python-xdg, skip it if not HAS_XDG: if progress: progress.verbose("please install python-xdg if you want to index app-install-data files") return None return AppInstall(langs=langs, progress=progress) apt-xapian-index-0.47ubuntu13/plugins/sections.py0000644000000000000000000000662513070503156017003 0ustar try: import apt import apt_pkg HAS_APT=True except ImportError: HAS_APT=False import xapian import os, os.path class Sections: def info(self, **kw): """ Return general information about the plugin. The information returned is a dict with various keywords: timestamp (required) the last modified timestamp of this data source. This will be used to see if we need to update the database or not. A timestamp of 0 means that this data source is either missing or always up to date. values (optional) an array of dicts { name: name, desc: description }, one for every numeric value indexed by this data source. Note that this method can be called before init. The idea is that, if the timestamp shows that this plugin is currently not needed, then the long initialisation can just be skipped. 
""" res = dict( timestamp=0, prefixes=[ dict(idx="XS", qp="sec:", type="bool", desc="Package section", ldesc="Debian package section, max one per package"), ], ) if kw.get("system", True): if not HAS_APT: return res file = apt_pkg.config.find_file("Dir::Cache::pkgcache") if not os.path.exists(file): return res ts = os.path.getmtime(file) else: file = "(stdin)" ts = 0 res["timestamp"] = ts res["sources"] = [dict(path=file, desc="APT index")] return res def init(self, info, progress): """ If needed, perform long initialisation tasks here. info is a dictionary with useful information. Currently it contains the following values: "values": a dict mapping index mnemonics to index numbers The progress indicator can be used to report progress. """ pass def doc(self): """ Return documentation information for this data source. The documentation information is a dictionary with these keys: name: the name for this data source shortDesc: a short description fullDoc: the full description as a chapter in ReST format """ return dict( name = "Package sections", shortDesc = "Debian package sections", fullDoc = """ The section is indexed literally, with the prefix XS. """ ) def index(self, document, pkg): """ Update the document with the information from this data source. document is the document to update pkg is the python-apt Package object for this package """ sec = pkg.section if sec: document.add_term("XS"+sec.lower()) def indexDeb822(self, document, pkg): """ Update the document with the information from this data source. This is alternative to index, and it is used when indexing with package data taken from a custom Packages file. document is the document to update pkg is the Deb822 object for this package """ sec = pkg["Section"] if sec: document.add_term("XS"+sec.lower()) def init(**kw): """ Create and return the plugin object. 
""" return Sections() apt-xapian-index-0.47ubuntu13/plugins/aliases.py0000644000000000000000000001075313070503156016572 0ustar import xapian import os, os.path AXI_ALIASES = os.environ.get("AXI_ALIASES", "/etc/apt-xapian-index/aliases/:/usr/share/apt-xapian-index/aliases/") def read_db(progress=None): aliases = [] maxts = 0 files = [] for d in AXI_ALIASES.split(":"): if not os.path.isdir(d): continue for f in os.listdir(d): if f[0] == '.': continue fname = os.path.join(d, f) ts = os.path.getmtime(fname) if ts > maxts: maxts = ts if progress: progress.verbose("Reading aliases from %s..." % fname) info = dict(path=fname) for idx, line in enumerate(open(fname)): line = line.strip() if idx == 0 and line[0] == '#': # Take a comment at start of file as file description info["desc"] = line[1:].strip() continue # Skip comments and empty lines if not line or line[0] == '#': continue line = line.split() aliases.append(line) info.setdefault("desc", "synonyms for well-known terms") files.append(info) return maxts, aliases, files class Aliases: def __init__(self, maxts, db, files): self.maxts = maxts self.db = db self.files = files def info(self): """ Return general information about the plugin. The information returned is a dict with various keywords: timestamp (required) the last modified timestamp of this data source. This will be used to see if we need to update the database or not. A timestamp of 0 means that this data source is either missing or always up to date. values (optional) an array of dicts { name: name, desc: description }, one for every numeric value indexed by this data source. Note that this method can be called before init. The idea is that, if the timestamp shows that this plugin is currently not needed, then the long initialisation can just be skipped. """ return dict(timestamp=self.maxts, sources=self.files) def init(self, info, progress): """ If needed, perform long initialisation tasks here. info is a dictionary with useful information. 
Currently it contains the following values: "values": a dict mapping index mnemonics to index numbers The progress indicator can be used to report progress. """ pass def send_extra_info(self, db=None, **kw): """ Receive extra parameters from the indexer. This may be called more than once, but after init(). We are using this to get the database instance """ if db is not None: for row in self.db: for a in row[1:]: db.add_synonym(row[0], a) def doc(self): """ Return documentation information for this data source. The documentation information is a dictionary with these keys: name: the name for this data source shortDesc: a short description fullDoc: the full description as a chapter in ReST format """ return dict( name = "Package aliases", shortDesc = "aliases for well-known programs", fullDoc = """ The Aliases data source does not change documents in the index, but adds synonyms to the database. Synonyms make it possible to obtain good results when searching for well-known software names, even if such software does not exist in Debian. """ ) def index(self, document, pkg): """ Update the document with the information from this data source. document is the document to update pkg is the python-apt Package object for this package """ pass def indexDeb822(self, document, pkg): """ Update the document with the information from this data source. This is an alternative to index, and it is used when indexing with package data taken from a custom Packages file. document is the document to update pkg is the Deb822 object for this package """ pass def init(progress=None, **kw): """ Create and return the plugin object.
""" maxts, db, files = read_db(progress) if not db: return None return Aliases(maxts, db, files) apt-xapian-index-0.47ubuntu13/plugins/relations.py0000644000000000000000000001261313070503156017146 0ustar try: import apt import apt_pkg HAS_APT=True except ImportError: HAS_APT=False import os, os.path import re class Relations: def __init__(self): self.prefix_desc=[ dict(idx="XRD", qp="reldep:", type="bool", desc="Relation: depends", ldesc="Depends: relationship, package names only"), dict(idx="XRR", qp="relrec:", type="bool", desc="Relation: recommends", ldesc="Recommends: relationship, package names only"), dict(idx="XRS", qp="relsug:", type="bool", desc="Relation: suggests", ldesc="Suggests: relationship, package names only"), dict(idx="XRE", qp="relenh:", type="bool", desc="Relation: enhances", ldesc="Enhances: relationship, package names only"), dict(idx="XRP", qp="relpre:", type="bool", desc="Relation: pre-depends", ldesc="Pre-Depends: relationship, package names only"), dict(idx="XRB", qp="relbre:", type="bool", desc="Relation: breaks", ldesc="Breaks: relationship, package names only"), dict(idx="XRC", qp="relcon:", type="bool", desc="Relation: conflicts", ldesc="Conflicts: relationship, package names only"), ] self.prefixes = [(d["idx"], d["ldesc"][:d["ldesc"].find(":")]) for d in self.prefix_desc] self.re_split = re.compile(r"\s*[|,]\s*") def info(self, **kw): """ Return general information about the plugin. The information returned is a dict with various keywords: timestamp (required) the last modified timestamp of this data source. This will be used to see if we need to update the database or not. A timestamp of 0 means that this data source is either missing or always up to date. values (optional) an array of dicts { name: name, desc: description }, one for every numeric value indexed by this data source. Note that this method can be called before init.
The idea is that, if the timestamp shows that this plugin is currently not needed, then the long initialisation can just be skipped. """ res = dict( timestamp=0, prefixes=self.prefix_desc, ) if kw.get("system", True): if not HAS_APT: return res file = apt_pkg.config.find_file("Dir::Cache::pkgcache") if not os.path.exists(file): return res ts = os.path.getmtime(file) else: file = "(stdin)" ts = 0 res["sources"] = [dict(path=file, desc="APT index")] res["timestamp"] = ts return res def init(self, info, progress): """ If needed, perform long initialisation tasks here. info is a dictionary with useful information. Currently it contains the following values: "values": a dict mapping index mnemonics to index numbers The progress indicator can be used to report progress. """ pass def doc(self): """ Return documentation information for this data source. The documentation information is a dictionary with these keys: name: the name for this data source shortDesc: a short description fullDoc: the full description as a chapter in ReST format """ return dict( name = "Package relationships", shortDesc = "Debian package relationships", fullDoc = """ Indexes one term per relationship declared with other packages. All relationship terms have prefixes starting with XR plus an extra prefix letter per relationship type. Terms are built using only the package names in the relationship fields: versioning and boolean operators are ignored. """ ) def _index_rel(self, pfx, val, doc): """ Extract all package names from @val and index them as terms with prefix @pfx """ for name in self.re_split.split(val): doc.add_term(pfx + name.split(None, 1)[0]) def index(self, document, pkg): """ Update the document with the information from this data source. 
document is the document to update pkg is the python-apt Package object for this package """ ver = pkg.candidate if ver is None: return rec = ver.record if rec is None: return for pfx, field in self.prefixes: val = rec.get(field, None) if val is None: continue self._index_rel(pfx, val, document) def indexDeb822(self, document, pkg): """ Update the document with the information from this data source. This is alternative to index, and it is used when indexing with package data taken from a custom Packages file. document is the document to update pkg is the Deb822 object for this package """ for pfx, field in self.prefixes: val = pkg.get(field, None) if val is None: continue self._index_rel(pfx, val, document) def init(**kw): """ Create and return the plugin object. """ return Relations()
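
The Relations plugin above turns a relationship field into one term per package name, dropping version constraints and treating "|" alternatives the same as "," separators. The standalone sketch below reproduces that splitting logic outside Xapian so it can be tried in isolation. The helper name relation_terms and the sample Depends value are hypothetical, introduced only for illustration; the regular expression and the split(None, 1)[0] step are taken from Relations.__init__ and Relations._index_rel.

```python
import re

# Same separator pattern as Relations.re_split: a comma or a pipe,
# with any surrounding whitespace.
RE_SPLIT = re.compile(r"\s*[|,]\s*")


def relation_terms(prefix, value):
    """Hypothetical helper: build the terms that _index_rel would add.

    Each piece of the field is reduced to its first whitespace-separated
    word (the package name), so "(>= 2.14)" style constraints are dropped.
    """
    return [prefix + name.split(None, 1)[0] for name in RE_SPLIT.split(value)]


# Illustrative Depends value, not taken from a real package.
depends = "libc6 (>= 2.14), python | python3, debconf (>= 0.5)"
terms = relation_terms("XRD", depends)
print(terms)
```

With the XRD prefix registered under the "reldep:" query prefix, a term such as XRDlibc6 is what a query like reldep:libc6 would be expected to match.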