pysolr-3.8.1/PKG-INFO

Metadata-Version: 2.1
Name: pysolr
Version: 3.8.1
Summary: Lightweight python wrapper for Apache Solr.
Home-page: https://github.com/django-haystack/pysolr/
Author: Daniel Lindsley
Author-email: daniel@toastdriven.com
License: BSD
Description:

======
pysolr
======

``pysolr`` is a lightweight Python wrapper for `Apache Solr`_. It provides an
interface that queries the server and returns results based on the query.

.. _`Apache Solr`: http://lucene.apache.org/solr/

Status
======

.. image:: https://secure.travis-ci.org/django-haystack/pysolr.png
   :target: https://secure.travis-ci.org/django-haystack/pysolr

`Changelog `_

Features
========

* Basic operations such as selecting, updating & deleting.
* Index optimization.
* `"More Like This" `_ support (if set up in Solr).
* `Spelling correction `_ (if set up in Solr).
* Timeout support.
* SolrCloud awareness

Requirements
============

* Python 2.7 - 3.6
* Requests 2.9.1+
* **Optional** - ``simplejson``
* **Optional** - ``kazoo`` for SolrCloud mode

Installation
============

pysolr is on PyPI:

.. code-block:: console

    $ pip install pysolr

Or if you want to install directly from the repository:
``python setup.py install``, or drop the ``pysolr.py`` file anywhere on your
``PYTHONPATH``.

Usage
=====

Basic usage looks like:

.. code-block:: python

    # If on Python 2.X
    from __future__ import print_function

    import pysolr

    # Setup a Solr instance. The timeout is optional.
    solr = pysolr.Solr('http://localhost:8983/solr/', timeout=10, auth=)

    # How you'd index data.
    solr.add([
        {
            "id": "doc_1",
            "title": "A test document",
        },
        {
            "id": "doc_2",
            "title": "The Banana: Tasty or Dangerous?",
            "_doc": [
                { "id": "child_doc_1", "title": "peel" },
                { "id": "child_doc_2", "title": "seed" },
            ]
        },
    ])

    # Note that the add method has commit=True by default, so this is
    # immediately committed to your index.

    # You can index a parent/child document relationship by
    # associating a list of child documents with the special key '_doc'. This
    # is helpful for queries that join together conditions on children and parent
    # documents.

    # Later, searching is easy. In the simple case, just a plain Lucene-style
    # query is fine.
    results = solr.search('bananas')

    # The ``Results`` object stores total results found, by default the top
    # ten most relevant results and any additional data like
    # facets/highlighting/spelling/etc.
    print("Saw {0} result(s).".format(len(results)))

    # Just loop over it to access the results.
    for result in results:
        print("The title is '{0}'.".format(result['title']))

    # For a more advanced query, say involving highlighting, you can pass
    # additional options to Solr.
    results = solr.search('bananas', **{
        'hl': 'true',
        'hl.fragsize': 10,
    })

    # You can also perform More Like This searches, if your Solr is configured
    # correctly.
    similar = solr.more_like_this(q='id:doc_2', mltfl='text')

    # Finally, you can delete either individual documents,
    solr.delete(id='doc_1')

    # also in batches...
    solr.delete(id=['doc_1', 'doc_2'])

    # ...or all documents.
    solr.delete(q='*:*')

.. code-block:: python

    # For SolrCloud mode, initialize your Solr like this:

    zookeeper = pysolr.ZooKeeper("zkhost1:2181,zkhost2:2181,zkhost3:2181")
    solr = pysolr.SolrCloud(zookeeper, "collection1", auth=)

Multicore Index
~~~~~~~~~~~~~~~

Simply point the URL to the index core:

.. code-block:: python

    # Setup a Solr instance. The timeout is optional.
    solr = pysolr.Solr('http://localhost:8983/solr/core_0/', timeout=10)

Custom Request Handlers
~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # Setup a Solr instance. The trailing slash is optional.
    solr = pysolr.Solr('http://localhost:8983/solr/core_0/', search_handler='/autocomplete', use_qt_param=False)

If ``use_qt_param`` is ``True``, the handler name must exactly match what is
configured in ``solrconfig.xml``, including the leading slash if any (though
with the ``qt`` parameter Solr does not require a leading slash). If
``use_qt_param`` is ``False`` (the default), the leading and trailing slashes
can be omitted.

If ``search_handler`` is not specified, pysolr will default to ``/select``.

The handlers for MoreLikeThis, Update, Terms etc. all default to the values set
in the ``solrconfig.xml`` that Solr ships with: ``mlt``, ``update``, ``terms``
etc. The specific methods of pysolr's ``Solr`` class (like ``more_like_this``,
``suggest_terms`` etc.) accept a ``handler`` kwarg to override that value. This
includes the ``search`` method. Setting a handler in ``search`` explicitly
overrides the ``search_handler`` setting (if any).

Custom Authentication
~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # Setup a Solr instance in a Kerberized environment
    from requests_kerberos import HTTPKerberosAuth, OPTIONAL
    kerberos_auth = HTTPKerberosAuth(mutual_authentication=OPTIONAL, sanitize_mutual_error_response=False)

    solr = pysolr.Solr('http://localhost:8983/solr/', auth=kerberos_auth)

.. code-block:: python

    # Setup a SolrCloud instance in a Kerberized environment
    from requests_kerberos import HTTPKerberosAuth, OPTIONAL
    kerberos_auth = HTTPKerberosAuth(mutual_authentication=OPTIONAL, sanitize_mutual_error_response=False)

    zookeeper = pysolr.ZooKeeper("zkhost1:2181/solr, zkhost2:2181,...,zkhostN:2181")
    solr = pysolr.SolrCloud(zookeeper, "collection", auth=kerberos_auth)

If your Solr servers run off https
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # Setup a Solr instance in an https environment
    solr = pysolr.Solr('https://localhost:8983/solr/', verify='path/to/cert.pem')

.. code-block:: python

    # Setup a SolrCloud instance in an https environment

    zookeeper = pysolr.ZooKeeper("zkhost1:2181/solr, zkhost2:2181,...,zkhostN:2181")
    solr = pysolr.SolrCloud(zookeeper, "collection", verify='path/to/cert.pem')

Custom Commit Policy
~~~~~~~~~~~~~~~~~~~~

.. code-block:: python

    # Setup a Solr instance. The trailing slash is optional.
    # All requests to Solr will result in a commit
    solr = pysolr.Solr('http://localhost:8983/solr/core_0/', search_handler='/autocomplete', always_commit=True)

``always_commit`` tells the ``Solr`` object whether to commit by default on
every Solr request. Be sure to set this to ``True`` if you are upgrading from a
version where the default policy was to always commit. Functions like ``add``
and ``delete`` still let you override the default by passing the ``commit``
kwarg.

It is generally good practice to limit the number of commits to Solr. Excessive
commits risk opening too many searchers or using too many system resources.

LICENSE
=======

``pysolr`` is licensed under the New BSD license.

Running Tests
=============

The ``run-tests.py`` script will automatically perform the steps below and is
recommended for testing by default unless you need more control.
Running a test Solr instance
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Downloading, configuring and running Solr 4 looks like this::

    ./start-solr-test-server.sh

Running the tests
~~~~~~~~~~~~~~~~~

The test suite requires the unittest2 library:

Python 2::

    python -m unittest2 tests

Python 3::

    python3 -m unittest tests

Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: BSD License
Classifier: Operating System :: OS Independent
Classifier: Programming Language :: Python
Classifier: Topic :: Internet :: WWW/HTTP :: Indexing/Search
Classifier: Programming Language :: Python :: 2
Classifier: Programming Language :: Python :: 3
Provides-Extra: solrcloud

pysolr-3.8.1/get-solr-download-url.py

#!/usr/bin/env python
# encoding: utf-8

from __future__ import absolute_import, division, print_function, unicode_literals

from itertools import chain
import sys

import requests

# Try to import urljoin from the Python 3 reorganized stdlib first and fall
# back to the Python 2 location:
try:
    from urllib.parse import urljoin
except ImportError:
    from urlparse import urljoin

if len(sys.argv) != 2:
    print('Usage: %s SOLR_VERSION' % sys.argv[0], file=sys.stderr)
    sys.exit(1)

solr_version = sys.argv[1]
tarball = 'solr-{0}.tgz'.format(solr_version)
dist_path = 'lucene/solr/{0}/{1}'.format(solr_version, tarball)

download_url = urljoin('https://archive.apache.org/dist/', dist_path)

mirror_response = requests.get("https://www.apache.org/dyn/mirrors/mirrors.cgi/%s?asjson=1" % dist_path)

if not mirror_response.ok:
    print('Apache mirror request returned HTTP %d' % mirror_response.status_code, file=sys.stderr)
    sys.exit(1)

mirror_data = mirror_response.json()

# Since the Apache mirrors are often unreliable and releases may disappear without notice we'll
# try the preferred mirror, all of the alternates and backups, and fall back to the main Apache
# archive server:
for base_url in chain((mirror_data['preferred'], ), mirror_data['http'], mirror_data['backup'],
                      ('https://archive.apache.org/dist/', )):
    test_url = urljoin(base_url, mirror_data['path_info'])

    # The Apache mirror script's response format has recently changed to exclude the actual file paths:
    if not test_url.endswith(tarball):
        test_url = urljoin(test_url, dist_path)

    if requests.head(test_url, allow_redirects=True).status_code == 200:
        download_url = test_url
        break
else:
    print('None of the Apache mirrors have %s' % dist_path, file=sys.stderr)
    sys.exit(1)

print(download_url)

pysolr-3.8.1/LICENSE

Copyright (c) Joseph Kocherhans, Jacob Kaplan-Moss, Daniel Lindsley.
All rights reserved.

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

  1. Redistributions of source code must retain the above copyright notice,
     this list of conditions and the following disclaimer.

  2. Redistributions in binary form must reproduce the above copyright
     notice, this list of conditions and the following disclaimer in the
     documentation and/or other materials provided with the distribution.

  3. Neither the name of pysolr nor the names of its contributors may be used
     to endorse or promote products derived from this software without
     specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND
ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED.
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR
ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
(INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES;
LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON
ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

pysolr-3.8.1/pysolr.egg-info/SOURCES.txt

.gitchangelog.rc
.gitignore
.travis.yml
AUTHORS
CHANGELOG.rst
LICENSE
MANIFEST.in
README.rst
get-solr-download-url.py
pysolr.py
run-tests.py
setup.cfg
setup.py
start-solr-test-server.sh
tox.ini
.github/issue_template.md
.github/pull_request_template.md
.github/stale.yml
pysolr.egg-info/PKG-INFO
pysolr.egg-info/SOURCES.txt
pysolr.egg-info/dependency_links.txt
pysolr.egg-info/requires.txt
pysolr.egg-info/top_level.txt
tests/__init__.py
tests/test_admin.py
tests/test_client.py
tests/test_cloud.py
tests/utils.py

pysolr-3.8.1/pysolr.egg-info/requires.txt

requests>=2.9.1

[solrcloud]
kazoo==2.2

pysolr-3.8.1/pysolr.egg-info/top_level.txt

pysolr

pysolr-3.8.1/pysolr.egg-info/dependency_links.txt

pysolr-3.8.1/start-solr-test-server.sh

#!/bin/bash

set -e

# Redirect output to log files when stdin is not a
# TTY:
if [ ! -t 0 ]; then
    exec 1>test-solr.stdout.log 2>test-solr.stderr.log
fi

SOLR_VERSION=4.10.4

ROOT=$(cd `dirname $0`; pwd)
APP=$ROOT/solr-app
PIDS=$ROOT/solr.pids
export SOLR_ARCHIVE="solr-${SOLR_VERSION}.tgz"
LOGS=$ROOT/logs

cd $ROOT

function download_solr() {
    if [ -d "${HOME}/download-cache/" ]; then
        export SOLR_ARCHIVE="${HOME}/download-cache/${SOLR_ARCHIVE}"
    fi

    if [ -f ${SOLR_ARCHIVE} ]; then
        # If the tarball doesn't extract cleanly, remove it so it'll download again:
        tar -tf ${SOLR_ARCHIVE} > /dev/null || rm ${SOLR_ARCHIVE}
    fi

    if [ ! -f ${SOLR_ARCHIVE} ]; then
        SOLR_DOWNLOAD_URL=$(python get-solr-download-url.py $SOLR_VERSION)
        curl -Lo $SOLR_ARCHIVE ${SOLR_DOWNLOAD_URL} || (echo "Unable to download ${SOLR_DOWNLOAD_URL}"; exit 2)
    fi
}

function extract_solr() {
    APP=solr-app
    echo "Extracting Solr ${SOLR_VERSION} to `pwd`/$APP"
    rm -rf $APP
    mkdir $APP
    tar -C $APP -xf ${SOLR_ARCHIVE} --strip-components 1 solr-${SOLR_VERSION}
}

function prepare_solr_home() {
    SOLR_HOME=$1
    HOST=$2
    echo "Preparing SOLR_HOME at $SOLR_HOME for host $HOST"
    APP=$(pwd)/solr-app
    mkdir -p ${SOLR_HOME}
    cp solr-app/example/solr/solr.xml ${SOLR_HOME}/
    cp solr-app/example/solr/zoo.cfg ${SOLR_HOME}/
}

function prepare_core() {
    SOLR_HOME=$1
    CORE=$2
    echo "Preparing core $CORE"
    CORE_DIR=${SOLR_HOME}/${CORE}
    mkdir -p ${CORE_DIR}
    cp -r solr-app/example/solr/collection1/conf ${CORE_DIR}/
    perl -p -i -e 's|\n \n\n\n