yahoo.search (version 1.19, Thu Jul 7 14:22:16 PDT 2005)
/home/leif/hack/pysearch/yahoo/search/__init__.py
Yahoo Search Web Services
This module implements a set of classes and functions to work with the
Yahoo Search Web Services. All results from these services are properly
formatted XML, and this package facilitates proper parsing of these
result sets. Some of the features include:
* Extensible API, with replaceable backend XML parsers and
I/O interface.
* Type and value checking on search parameters, including
automatic type conversion (when appropriate and possible)
* Flexible return format, including DOM objects, or fully
parsed result objects
You can either instantiate a search object directly, or use the factory
function create_search() from the factory module. The supported classes
of searches are:
NewsSearch - News article search
VideoSearch - Video and movie search
ImageSearch - Image search
LocalSearch - Local area search
WebSearch - Web search
ContextSearch - Web search with a context
RelatedSuggestion - Web search Related Suggestion
SpellingSuggestion - Web search Spelling Suggestion
TermExtraction - Term Extraction service
AlbumSearch - Find information about albums
ArtistSearch - Information on a particular musical performer
SongDownload - Find links to various song providers of a song
PodcastSearch - Search for a Podcast site/feed
SongSearch - Provide information about songs
PageData - Shows a list of all pages belonging to a domain
InlinkData - Shows the pages from other sites linking to a page
The different sub-classes of search support different sets of query
parameters. For details on all allowed parameters, please consult the
specific module documentation.
Each of these parameters is implemented as an attribute of each
respective class. For example, you can set parameters like:
from yahoo.search.web import WebSearch
app_id = "YahooDemo"
srch = WebSearch(app_id)
srch.query = "Leif Hedstrom"
srch.results = 40
or, if you are using the factory function:
from yahoo.search.factory import create_search
app_id = "YahooDemo"
srch = create_search("Web", app_id, query="Leif Hedstrom", results=40)
if srch is not None:
    # srch object ready to use
    ...
else:
    print "error"
or, the last alternative, a combination of the previous two:
from yahoo.search import web
app_id = "YahooDemo"
srch = web.WebSearch(app_id, query="Leif Hedstrom", results=40)
To retrieve a certain parameter value, simply access it as any normal
attribute:
print "Searched for ", srch.query
For more information on these parameters, and their allowed values, please
see the official Yahoo Search Services documentation available at
http://developer.yahoo.net/
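To make the idea of checked parameters and a factory function concrete, here is a minimal sketch. It is not the yahoo.search implementation: DemoSearch, make_search, SEARCH_TYPES and the 1 to 50 range for results are all hypothetical names and values chosen for this illustration, and the case-insensitive lookup is an assumption. The sketch is written for Python 3 and does not need the yahoo.search package.

```python
# Hypothetical stand-ins; none of these names come from yahoo.search.

class DemoSearch(object):
    """Search object whose query parameters are validated attributes."""

    # parameter name -> (type to convert to, validity check or None)
    # The 1..50 range is an arbitrary choice for this demo.
    PARAMS = {
        "query": (str, None),
        "results": (int, lambda n: 1 <= n <= 50),
    }

    def __init__(self, app_id, **params):
        # app_id is internal state, so bypass the parameter checks for it.
        object.__setattr__(self, "app_id", app_id)
        for name, value in params.items():
            setattr(self, name, value)

    def __setattr__(self, name, value):
        if name not in self.PARAMS:
            raise AttributeError("unknown parameter: %s" % name)
        convert, is_valid = self.PARAMS[name]
        try:
            # Automatic type conversion, e.g. "40" -> 40.
            value = convert(value)
        except (TypeError, ValueError):
            raise TypeError("%s must be convertible to %s"
                            % (name, convert.__name__))
        if is_valid is not None and not is_valid(value):
            raise ValueError("%s is out of range: %r" % (name, value))
        object.__setattr__(self, name, value)


# Registry used by the factory; lookup is case-insensitive by assumption.
SEARCH_TYPES = {"demo": DemoSearch}


def make_search(kind, app_id, **params):
    """Return a search object for `kind`, or None if the kind is unknown."""
    cls = SEARCH_TYPES.get(kind.lower())
    if cls is None:
        return None
    return cls(app_id, **params)


srch = make_search("Demo", "YahooDemo", query="Leif Hedstrom", results="40")
print(srch.query, srch.results)
print(make_search("nope", "YahooDemo"))
```

Returning None for an unknown kind mirrors the "if srch is not None" check used with create_search() above; raising an exception instead would be an equally reasonable design.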
Once the webservice object has been created, you can retrieve a parsed
object (typically a DOM object) using the get_results() method:
dom = srch.get_results()
This DOM object contains all results, and can be used as is. For easier
use of the results, you can use the built-in results factory, which will
traverse the entire DOM object, and create a list of results objects.
results = srch.parse_results(dom)
or, by using the implicit call to get_results():
results = srch.parse_results()
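The traversal described above, where a DOM object is walked and turned into a list of result objects, can be sketched roughly as follows. This is an illustration only and not the package's results factory: the sample XML, its element names, and the Result, text_of and parse_results helpers are assumptions made up for this sketch, which uses Python 3 and the standard xml.dom.minidom module.

```python
from xml.dom import minidom

# Made-up sample response; the element names are assumptions for this demo.
SAMPLE_XML = """<ResultSet>
  <Result><Title>First</Title><Summary>First page</Summary><Url>http://example.com/1</Url></Result>
  <Result><Title>Second</Title><Summary>Second page</Summary><Url>http://example.com/2</Url></Result>
</ResultSet>"""


class Result(dict):
    """Dictionary whose keys can also be read as attributes."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)


def text_of(node):
    """Concatenate the text nodes directly under a DOM element."""
    return "".join(child.data for child in node.childNodes
                   if child.nodeType == child.TEXT_NODE)


def parse_results(dom):
    """Walk every <Result> element and build one Result per element."""
    results = []
    for element in dom.getElementsByTagName("Result"):
        fields = Result()
        for child in element.childNodes:
            if child.nodeType == child.ELEMENT_NODE:
                fields[child.tagName] = text_of(child)
        results.append(fields)
    return results


dom = minidom.parseString(SAMPLE_XML)
for res in parse_results(dom):
    # Both access styles work on the hypothetical Result class.
    print("%s -> %s" % (res["Summary"], res.Url))
```

The dict subclass is one simple way to allow both res.Url and res['Summary'] style access; how the real result objects support this is not specified here.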
The default XML parser and results factories should be adequate for most
users, so use parse_results() when possible. However, both the XML
parser and the results parser can easily be overridden. See the examples
below for details. More information about the DOM parsers is available
in the yahoo.search.dom module, and its subclasses.
EXAMPLES:
This simple application will create a search object using the first
command line argument as the "type" (e.g. "web" or "news"), and all
subsequent arguments form the query string:
#!/usr/bin/python
import sys
from yahoo.search.factory import create_search
service = sys.argv[1]
query = " ".join(sys.argv[2:])
app_id = "YahooDemo"
srch = create_search(service, app_id, query=query, results=5)
if srch is None:
    # Unknown service name, fall back to a plain web search
    srch = create_search("Web", app_id, query=query, results=5)
dom = srch.get_results()
results = srch.parse_results(dom)
for res in results:
    url = res.Url
    summary = res['Summary']
    print "%s -> %s" % (summary, url)
The same example using the PyXML 4DOM parser:
#!/usr/bin/python
import sys
from yahoo.search.factory import create_search
from xml.dom.ext.reader import Sax2
query = " ".join(sys.argv[2:])
srch = create_search(sys.argv[1], "YahooDemo", query=query, results=5)
if srch is not None:
    reader = Sax2.Reader()
    srch.install_xml_parser(reader.fromStream)
    .
    .
    .
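The reason a parser can be swapped in like this is that the parser is just a callable taking a stream. A minimal sketch of that design, under stated assumptions: DemoService is a hypothetical stand-in and not part of yahoo.search, its install_xml_parser() method only mirrors the name used above, and the xml_parser constructor argument is an invented name. The two parsers used are from the Python 3 standard library.

```python
import io
import xml.etree.ElementTree as ElementTree
from xml.dom import minidom


class DemoService(object):
    """Hypothetical stand-in that parses a response with a swappable parser."""

    def __init__(self, xml_parser=minidom.parse):
        # The parser is any callable that accepts a file-like object.
        self._xml_parser = xml_parser

    def install_xml_parser(self, xml_parser):
        """Replace the parser used for all later responses."""
        self._xml_parser = xml_parser

    def get_results(self, stream):
        """Parse the response stream with whichever parser is installed."""
        return self._xml_parser(stream)


# Made-up response body standing in for a web service reply.
RESPONSE = b"<ResultSet><Result><Url>http://example.com/</Url></Result></ResultSet>"

service = DemoService()
dom = service.get_results(io.BytesIO(RESPONSE))
print(dom.documentElement.tagName)

# Swap in a different parser; the caller now gets an ElementTree instead.
service.install_xml_parser(ElementTree.parse)
tree = service.get_results(io.BytesIO(RESPONSE))
print(tree.getroot().tag)
```

Note that after swapping parsers the returned object type changes, so any code that walks the result has to match the parser that produced it.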
The last example will produce the same query, but uses an HTTP proxy
for the request:
#!/usr/bin/python
import sys
from yahoo.search.factory import create_search
import urllib2
query = " ".join(sys.argv[2:])
srch = create_search(sys.argv[1], "YahooDemo", query=query, results=5)
if srch is not None:
    proxy = urllib2.ProxyHandler({"http" : "http://octopus:3128"})
    opener = urllib2.build_opener(proxy)
    srch.install_opener(opener)
    .
    .
    .
You can obviously "mix and match" as necessary here. I'm using the
installer methods above for clarity; the APIs also allow you to pass those
custom handlers as arguments (see the documentation below).
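For readers on Python 3, where the urllib2 module was merged into urllib.request, the proxy opener from the last example can be built as sketched below. This only shows the standard library side and sends no request. Whether this package accepts such an opener under Python 3 is not covered here and should be treated as an open question.

```python
import urllib.request

# "octopus:3128" is the proxy host and port from the example above.
proxy = urllib.request.ProxyHandler({"http": "http://octopus:3128"})
opener = urllib.request.build_opener(proxy)

# No request is sent; this only lists the handlers the opener will use.
print([type(handler).__name__ for handler in opener.handlers])
```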
Package Contents

Data
__all__ = ['web', 'news', 'video', 'image', 'local', 'term', 'audio', 'site']
__author__ = 'Leif Hedstrom <leif@ogre.com>'
__date__ = 'Thu Jul 7 14:22:16 PDT 2005'
__revision__ = '$Id: __init__.py,v 1.19 2007/09/11 21:38:43 zwoop Exp $'
__version__ = '$Revision: 1.19 $'

Author
Leif Hedstrom <leif@ogre.com>