feedparser-5.1.3/0000775000175000017500000000000012061174742013642 5ustar kurtkurt00000000000000feedparser-5.1.3/README0000664000175000017500000000470712027504622014526 0ustar kurtkurt00000000000000feedparser - Parse Atom and RSS feeds in Python. Copyright (c) 2010-2012 Kurt McKee Copyright (c) 2002-2008 Mark Pilgrim feedparser is open source. See the LICENSE file for more information. Installation ============ Feedparser can be installed using distutils or setuptools by running: $ python setup.py install If you're using Python 3, feedparser will automatically be updated by the 2to3 tool; installation should be seamless across Python 2 and Python 3. There's one caveat, however: sgmllib.py was deprecated in Python 2.6 and is no longer included in the Python 3 standard library. Because feedparser currently relies on sgmllib.py to handle illformed feeds (among other things), it's a useful library to have installed. If your feedparser download included a copy of sgmllib.py, it's probably called sgmllib3.py, and you can simply rename the file to sgmllib.py. It will not be automatically installed using the command above, so you will have to manually copy it to somewhere in your Python path. If a copy of sgmllib.py was not included in your feedparser download, you can grab a copy from the Python 2 standard library (preferably from the Python 2.7 series) and run the 2to3 tool on it: $ 2to3 -w sgmllib.py If you copied sgmllib.py from a Python 2.6 or 2.7 installation you'll additionally need to edit the resulting file to remove the `warnpy3k` lines at the top of the file. There should be four lines at the top of the file that you can delete. Because sgmllib.py is a part of the Python codebase, it's licensed under the Python Software Foundation License. 
You can find a copy of that license at python.org: http://docs.python.org/license.html Documentation ============= The feedparser documentation is available on the web at: http://packages.python.org/feedparser It is also included in its source format, ReST, in the docs/ directory. To build the documentation you'll need the Sphinx package, which is available at: http://sphinx.pocoo.org/ You can then build HTML pages using a command similar to: $ sphinx-build -b html docs/ fpdocs This will produce HTML documentation in the fpdocs/ directory. Testing ======= Feedparser has an extensive test suite that has been growing for a decade. If you'd like to run the tests yourself, you can run the following command: $ python feedparsertest.py This will spawn an HTTP server that will listen on port 8097. The tests will fail if that port is in use. feedparser-5.1.3/PKG-INFO0000664000175000017500000000237012061174742014741 0ustar kurtkurt00000000000000Metadata-Version: 1.1 Name: feedparser Version: 5.1.3 Summary: Universal feed parser, handles RSS 0.9x, RSS 1.0, RSS 2.0, CDF, Atom 0.3, and Atom 1.0 feeds Home-page: http://code.google.com/p/feedparser/ Author: Kurt McKee Author-email: contactme@kurtmckee.org License: UNKNOWN Download-URL: http://code.google.com/p/feedparser/ Description: UNKNOWN Keywords: atom,cdf,feed,parser,rdf,rss Platform: POSIX Platform: Windows Classifier: Development Status :: 5 - Production/Stable Classifier: Intended Audience :: Developers Classifier: License :: OSI Approved Classifier: Operating System :: OS Independent Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: 2 Classifier: Programming Language :: Python :: 2.4 Classifier: Programming Language :: Python :: 2.5 Classifier: Programming Language :: Python :: 2.6 Classifier: Programming Language :: Python :: 2.7 Classifier: Programming Language :: Python :: 3 Classifier: Programming Language :: Python :: 3.0 Classifier: Programming Language :: Python :: 3.1 
Classifier: Programming Language :: Python :: 3.2 Classifier: Programming Language :: Python :: 3.3 Classifier: Topic :: Software Development :: Libraries :: Python Modules Classifier: Topic :: Text Processing :: Markup :: XML feedparser-5.1.3/MANIFEST.in0000664000175000017500000000027312027504622015376 0ustar kurtkurt00000000000000recursive-include feedparser/tests *.xml *.gz *.z recursive-include docs *.rst *.py *.css include feedparser/feedparsertest.py include feedparser/sgmllib3.py include LICENSE include NEWS feedparser-5.1.3/LICENSE0000664000175000017500000000617212027504622014651 0ustar kurtkurt00000000000000Universal Feed Parser (feedparser.py), its testing harness (feedparsertest.py), and its unit tests (everything in the tests/ directory) are released under the following license: ----- begin license block ----- Copyright (c) 2010-2012 Kurt McKee Copyright (c) 2002-2008 Mark Pilgrim All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 'AS IS' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. ----- end license block ----- Universal Feed Parser documentation (everything in the docs/ directory) is released under the following license: ----- begin license block ----- Copyright 2004-2008 Mark Pilgrim. All rights reserved. Redistribution and use in source (Sphinx ReST) and "compiled" forms (HTML, PDF, PostScript, RTF and so forth) with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code (Sphinx ReST) must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in compiled form (converted to HTML, PDF, PostScript, RTF and other formats) must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. THIS DOCUMENTATION IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS 'AS IS' AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS DOCUMENTATION, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. feedparser-5.1.3/docs/0000775000175000017500000000000012061174741014571 5ustar kurtkurt00000000000000feedparser-5.1.3/docs/reference-entry-content.rst0000664000175000017500000000771512027504622022077 0ustar kurtkurt00000000000000.. _reference.entry.content: :py:attr:`entries[i].content` ============================= A list of dictionaries with details about the full content of the entry. Atom feeds may contain multiple content elements. Clients should render as many of them as possible, based on the type and the client's abilities. .. _reference.entry.content.value: :py:attr:`entries[i].content[j].value` -------------------------------------- The value of this piece of content. If this contains :abbr:`HTML (HyperText Markup Language)` or :abbr:`XHTML (Extensible HyperText Markup Language)`, it is :ref:`sanitized <advanced.sanitization>` by default. If this contains :abbr:`HTML (HyperText Markup Language)` or :abbr:`XHTML (Extensible HyperText Markup Language)`, certain (X)HTML elements within this value may contain relative :abbr:`URI (Uniform Resource Identifier)`\s. If so, they are :ref:`resolved according to a set of rules <advanced.base>`. If this contains :abbr:`HTML (HyperText Markup Language)` or :abbr:`XHTML (Extensible HyperText Markup Language)`, it will be :ref:`parsed for microformats <advanced.microformats>`. .. _reference.entry.content.type: :py:attr:`entries[i].content[j].type` ------------------------------------- The content type of this piece of content. 
Most likely values for `type`: * :mimetype:`text/plain` * :mimetype:`text/html` * :mimetype:`application/xhtml+xml` For Atom feeds, the content type is taken from the type attribute, which defaults to :mimetype:`text/plain` if not specified. For :abbr:`RSS (Rich Site Summary)` feeds, the content type is auto-determined by inspecting the content, and defaults to :mimetype:`text/html`. Note that this may cause silent data loss if the value contains plain text with angle brackets. There is nothing I can do about this problem; it is a limitation of :abbr:`RSS (Rich Site Summary)`. Future enhancement: some versions of :abbr:`RSS (Rich Site Summary)` clearly specify that certain values default to :mimetype:`text/plain`, and :program:`Universal Feed Parser` should respect this, but it doesn't yet. .. _reference.entry.content.language: :py:attr:`entries[i].content[j].language` ----------------------------------------- The language of this piece of content. :py:attr:`~entries[i].content[j].language` is supposed to be a language code, as specified by :rfc:`3066`, but publishers have been known to publish random values like "English" or "German". :program:`Universal Feed Parser` does not do any parsing or normalization of language codes. :py:attr:`~entries[i].content[j].language` may come from the element's xml:lang attribute, or it may inherit from a parent element's xml:lang, or the :mailheader:`Content-Language` :abbr:`HTTP (Hypertext Transfer Protocol)` header. If the feed does not specify a language, :py:attr:`~entries[i].content[j].language` will be ``None``, the :program:`Python` null value. .. _reference.entry.content.base: :py:attr:`entries[i].content[j].base` ------------------------------------- The original base :abbr:`URI (Uniform Resource Identifier)` for links within this piece of content. :py:attr:`~entries[i].content[j].base` is only useful in rare situations and can usually be ignored. 
It is the original base :abbr:`URI (Uniform Resource Identifier)` for this value, as specified by the element's xml:base attribute, or a parent element's xml:base, or the appropriate :abbr:`HTTP (Hypertext Transfer Protocol)` header, or the :abbr:`URI (Uniform Resource Identifier)` of the feed. (See :ref:`advanced.base` for more details.) By the time you see it, :program:`Universal Feed Parser` has already resolved relative links in all values where it makes sense to do so. *Clients should never need to manually resolve relative links.* .. rubric:: Comes from * /atom03:feed/atom03:entry/atom03:content * /atom10:feed/atom10:entry/atom10:content * /rdf:RDF/rdf:item/content:encoded * /rss/channel/item/body * /rss/channel/item/content:encoded * /rss/channel/item/fullitem * /rss/channel/item/xhtml:body feedparser-5.1.3/docs/reference-feed-icon.rst0000664000175000017500000000042212027504622021103 0ustar kurtkurt00000000000000:py:attr:`feed.icon` ==================== A URL to a small icon representing the feed. If this is a relative :abbr:`URI (Uniform Resource Identifier)`, it is :ref:`resolved according to a set of rules <advanced.base>`. .. rubric:: Comes from * /atom10:feed/atom10:icon feedparser-5.1.3/docs/microformats.rst0000664000175000017500000001522212027504622020027 0ustar kurtkurt00000000000000.. _advanced.microformats: Microformats ============ An emerging trend in feed syndication is the inclusion of `microformats`_. Besides the semantics defined by individual feed formats, publishers can add additional semantics using rel and class attributes in embedded :abbr:`HTML (HyperText Markup Language)` content. .. _microformats: http://microformats.org/ .. note:: To parse microformats, :program:`Universal Feed Parser` relies on a third-party library called `Beautiful Soup`_, which is distributed separately. If Beautiful Soup is not installed, :program:`Universal Feed Parser` will silently skip microformats parsing. .. 
_Beautiful Soup: http://www.crummy.com/software/BeautifulSoup/ The following elements are parsed for microformats: * :ref:`reference.entry.summary_detail.value` * :ref:`reference.entry.content.value` .. _advanced.microformats.relenclosure: rel=enclosure ------------- The `rel=enclosure`_ microformat provides a way for embedded :abbr:`HTML (HyperText Markup Language)` content to specify that a certain link should be treated as an :ref:`enclosure <reference.entry.enclosures>`. :program:`Universal Feed Parser` looks for links within embedded markup that meet any of the following conditions: .. _rel=enclosure: http://microformats.org/wiki/rel-enclosure * rel attribute contains enclosure (note: rel attributes can contain a list of space-separated values) * type attribute starts with audio/ * type attribute starts with video/ * type attribute starts with application/ but does not end with xml * href attribute ends with one of the following file extensions: :file:`.7z`, :file:`.avi`, :file:`.bin`, :file:`.bz2`, :file:`.deb`, :file:`.dmg`, :file:`.exe`, :file:`.gz`, :file:`.hqx`, :file:`.img`, :file:`.iso`, :file:`.jar`, :file:`.m4a`, :file:`.m4v`, :file:`.mp2`, :file:`.mp3`, :file:`.mp4`, :file:`.msi`, :file:`.ogg`, :file:`.ogm`, :file:`.rar`, :file:`.rpm`, :file:`.sit`, :file:`.sitx`, :file:`.tar`, :file:`.tbz2`, :file:`.tgz`, :file:`.wma`, :file:`.wmv`, :file:`.z`, :file:`.zip` When :program:`Universal Feed Parser` finds a link that satisfies any of these conditions, it adds it to :ref:`reference.entry.enclosures`. .. rubric:: Parsing embedded enclosures .. sourcecode:: python >>> import feedparser >>> d = feedparser.parse('http://feedparser.org/docs/examples/rel-enclosure.xml') >>> d.entries[0].enclosures [{u'href': u'http://example.com/movie.mp4', 'title': u'awesome movie'}] .. _advanced.microformats.reltag: rel=tag ------- The `rel=tag`_ microformat allows you to define :ref:`tags <reference.entry.tags>` within embedded :abbr:`HTML (HyperText Markup Language)` content. 
:program:`Universal Feed Parser` looks for these attribute values in embedded markup and maps them to :ref:`reference.entry.tags`. .. _rel=tag: http://microformats.org/wiki/rel-tag .. rubric:: Parsing embedded tags .. sourcecode:: python >>> import feedparser >>> d = feedparser.parse('http://feedparser.org/docs/examples/rel-tag.xml') >>> d.entries[0].tags [{'term': u'tech', 'scheme': u'http://del.icio.us/tag/', 'label': u'Technology'}] .. _advanced.microformats.xfn: :abbr:`XFN (XHTML Friends Network)` ----------------------------------- The `XFN`_ microformat allows you to define human relationships between :abbr:`URI (Uniform Resource Identifier)`\s. For example, you could link from your weblog to your spouse's weblog with the ``rel="spouse"`` relation. It is intended primarily for "blogrolls" or other static lists of links, but the relations can occur anywhere in :abbr:`HTML (HyperText Markup Language)` content. If found, :program:`Universal Feed Parser` will return the :abbr:`XFN (XHTML Friends Network)` information in :ref:`reference.entry.xfn`. .. _XFN: http://microformats.org/wiki/XFN :program:`Universal Feed Parser` supports all of the relationships listed in the `XFN 1.1 profile`_, as well as the following variations: .. _XFN 1.1 profile: http://gmpg.org/xfn/11 * ``coworker`` in addition to ``co-worker`` * ``coresident`` in addition to ``co-resident`` * ``relative`` in addition to ``kin`` * ``brother`` and ``sister`` in addition to ``sibling`` * ``husband`` and ``wife`` in addition to ``spouse`` .. rubric:: Parsing :abbr:`XFN (XHTML Friends Network)` relationships .. sourcecode:: python >>> import feedparser >>> d = feedparser.parse('http://feedparser.org/docs/examples/xfn.xml') >>> person = d.entries[0].xfn[0] >>> person.name u'John Doe' >>> person.href u'http://example.com/johndoe' >>> person.relationships [u'coworker', u'friend'] .. 
_advanced.microformats.hcard: hCard ----- The `hCard`_ microformat allows you to embed address book information within :abbr:`HTML (HyperText Markup Language)` content. If :program:`Universal Feed Parser` finds an hCard within supported elements, it converts it into an RFC 2426-compliant vCard and returns it in :ref:`reference.entry.vcard`. .. _hCard: http://microformats.org/wiki/hcard .. rubric:: Converting embedded hCard markup into a vCard .. sourcecode:: python >>> import feedparser >>> d = feedparser.parse('http://feedparser.org/docs/examples/hcard.xml') >>> print d.entries[0].vcard BEGIN:vCard VERSION:3.0 FN:Frank Dawson N:Dawson;Frank ADR;TYPE=work,postal,parcel:;;6544 Battleford Drive;Raleigh;NC;27613-3502;U .S.A. TEL;TYPE=WORK,VOICE,MSG:+1-919-676-9515 TEL;TYPE=WORK,FAX:+1-919-676-9564 EMAIL;TYPE=internet,pref:Frank_Dawson at Lotus.com EMAIL;TYPE=internet:fdawson at earthlink.net ORG:Lotus Development Corporation URL:http://home.earthlink.net/~fdawson END:vCard BEGIN:vCard VERSION:3.0 FN:Tim Howes N:Howes;Tim ADR;TYPE=work:;;501 E. Middlefield Rd.;Mountain View;CA;94043;U.S.A. TEL;TYPE=WORK,VOICE,MSG:+1-415-937-3419 TEL;TYPE=WORK,FAX:+1-415-528-4164 EMAIL;TYPE=internet:howes at netscape.com ORG:Netscape Communications Corp. END:vCard .. note:: There are a growing number of microformats, and :program:`Universal Feed Parser` does not parse all of them. However, both the rel and class attributes survive :ref:`HTML sanitizing `, so applications built on :program:`Universal Feed Parser` that wish to parse additional microformat content are free to do so. .. seealso:: * `Microformats.org `_ * `rel=enclosure specification `_ * `rel=tag specification `_ * `XFN specification `_ * `hCard specification `_ feedparser-5.1.3/docs/reference-feed-title.rst0000664000175000017500000000146512027504622021304 0ustar kurtkurt00000000000000.. _reference.feed.title: :py:attr:`feed.title` ===================== The title of the feed. 
If this contains :abbr:`HTML (HyperText Markup Language)` or :abbr:`XHTML (Extensible HyperText Markup Language)`, it is :ref:`sanitized <advanced.sanitization>` by default. If this contains :abbr:`HTML (HyperText Markup Language)` or :abbr:`XHTML (Extensible HyperText Markup Language)`, certain (X)HTML elements within this value may contain relative :abbr:`URI (Uniform Resource Identifier)`\s. If so, they are :ref:`resolved according to a set of rules <advanced.base>`. .. rubric:: Comes from * /atom03:feed/atom03:title * /atom10:feed/atom10:title * /rdf:RDF/rdf:channel/dc:title * /rdf:RDF/rdf:channel/rdf:title * /rss/channel/dc:title * /rss/channel/title .. seealso:: * :ref:`reference.feed.title_detail` feedparser-5.1.3/docs/reference-modified.rst0000664000175000017500000000107612027504622021040 0ustar kurtkurt00000000000000:py:attr:`modified` =================== The last-modified date of the feed, as specified in the :abbr:`HTTP (Hypertext Transfer Protocol)` headers. The purpose of :py:attr:`modified` is explained more fully in :ref:`http.etag`. .. tip:: :py:attr:`modified` will only be present if the feed was retrieved from a web server, and only if the web server provided a Last-Modified :abbr:`HTTP (Hypertext Transfer Protocol)` header for the feed. If the feed was parsed from a local file or from a string in memory, :py:attr:`modified` will not be present. feedparser-5.1.3/docs/reference-namespaces.rst0000664000175000017500000000112212027504622021367 0ustar kurtkurt00000000000000.. _reference.namespaces: :py:attr:`namespaces` ===================== A dictionary of all :abbr:`XML (Extensible Markup Language)` namespaces defined in the feed, as ``{prefix: namespaceURI}``. .. note:: The prefixes listed in the :py:attr:`namespaces` dictionary may not match the prefixes defined in the original feed. See :ref:`advanced.namespaces` for more details. .. 
tip:: This element always exists, although it may be an empty dictionary if the feed does not define any namespaces (such as an :abbr:`RSS (Rich Site Summary)` 2.0 feed with no extensions). feedparser-5.1.3/docs/reference-feed-errorreportsto.rst0000664000175000017500000000034412027504622023271 0ustar kurtkurt00000000000000.. _reference.feed.errorreportsto: :py:attr:`feed.errorreportsto` ============================== An email address for reporting errors in the feed itself. .. rubric:: Comes from * /rdf:RDF/admin:errorReportsTo/@rdf:resource feedparser-5.1.3/docs/reference-entry-title_detail.rst0000664000175000017500000000714412027504622023064 0ustar kurtkurt00000000000000.. _reference.entry.title_detail: :py:attr:`entries[i].title_detail` ================================== A dictionary with details about the entry title. .. _reference.entry.title_detail.value: :py:attr:`entries[i].title_detail.value` ---------------------------------------- Same as :ref:`reference.entry.title`. If this contains :abbr:`HTML (HyperText Markup Language)` or :abbr:`XHTML (Extensible HyperText Markup Language)`, it is :ref:`sanitized <advanced.sanitization>` by default. If this contains :abbr:`HTML (HyperText Markup Language)` or :abbr:`XHTML (Extensible HyperText Markup Language)`, certain (X)HTML elements within this value may contain relative :abbr:`URI (Uniform Resource Identifier)`\s. If so, they are :ref:`resolved according to a set of rules <advanced.base>`. :py:attr:`entries[i].title_detail.type` --------------------------------------- The content type of the entry title. Most likely values for :py:attr:`~entries[i].title_detail.type`: * :mimetype:`text/plain` * :mimetype:`text/html` * :mimetype:`application/xhtml+xml` For Atom feeds, the content type is taken from the type attribute, which defaults to :mimetype:`text/plain` if not specified. For :abbr:`RSS (Rich Site Summary)` feeds, the content type is auto-determined by inspecting the content, and defaults to :mimetype:`text/html`. 
Note that this may cause silent data loss if the value contains plain text with angle brackets. There is nothing I can do about this problem; it is a limitation of :abbr:`RSS (Rich Site Summary)`. Future enhancement: some versions of :abbr:`RSS (Rich Site Summary)` clearly specify that certain values default to :mimetype:`text/plain`, and :program:`Universal Feed Parser` should respect this, but it doesn't yet. :py:attr:`entries[i].title_detail.language` ------------------------------------------- The language of the entry title. :py:attr:`~entries[i].title_detail.language` is supposed to be a language code, as specified by `RFC 3066`_, but publishers have been known to publish random values like "English" or "German". :program:`Universal Feed Parser` does not do any parsing or normalization of language codes. .. _RFC 3066: http://www.ietf.org/rfc/rfc3066.txt :py:attr:`~entries[i].title_detail.language` may come from the element's xml:lang attribute, or it may inherit from a parent element's xml:lang, or the Content-Language :abbr:`HTTP (Hypertext Transfer Protocol)` header. If the feed does not specify a language, :py:attr:`~entries[i].title_detail.language` will be ``None``, the :program:`Python` null value. :py:attr:`entries[i].title_detail.base` --------------------------------------- The original base :abbr:`URI (Uniform Resource Identifier)` for links within the entry title. :py:attr:`~entries[i].title_detail.base` is only useful in rare situations and can usually be ignored. It is the original base :abbr:`URI (Uniform Resource Identifier)` for this value, as specified by the element's xml:base attribute, or a parent element's xml:base, or the appropriate :abbr:`HTTP (Hypertext Transfer Protocol)` header, or the :abbr:`URI (Uniform Resource Identifier)` of the feed. (See :ref:`advanced.base` for more details.) By the time you see it, :program:`Universal Feed Parser` has already resolved relative links in all values where it makes sense to do so. 
*Clients should never need to manually resolve relative links.* .. rubric:: Comes from * /atom10:feed/atom10:entry/atom10:title * /atom03:feed/atom03:entry/atom03:title * /rss/channel/item/title * /rss/channel/item/dc:title * /rdf:RDF/rdf:item/rdf:title * /rdf:RDF/rdf:item/dc:title .. seealso:: * :ref:`reference.entry.title` feedparser-5.1.3/docs/changes-32.rst0000664000175000017500000000206712027504622017157 0ustar kurtkurt00000000000000Changes in version 3.2 ====================== :program:`Universal Feed Parser` 3.2 was released on July 3, 2004. - use :file:`cjkcodecs` and :file:`iconv_codec` if available - always convert feed to UTF-8 before passing to :abbr:`XML (Extensible Markup Language)` parser - completely revamped logic for determining character encoding and attempting :abbr:`XML (Extensible Markup Language)` parsing (much faster) - increased default timeout to 20 seconds - test for presence of ``Location`` header on redirects - added tests for many alternate character encodings - support various :abbr:`EBCDIC` encodings - support UTF-16BE and UTF-16LE with or without a :abbr:`BOM (Byte Order Mark)` - support UTF-8 with a :abbr:`BOM (Byte Order Mark)` - support UTF-32BE and UTF-32LE with or without a :abbr:`BOM (Byte Order Mark)` - fixed crashing bug if no :abbr:`XML (Extensible Markup Language)` parsers are available - added support for ``Content-encoding: deflate`` - send blank ``Accept-encoding`` header if neither :file:`gzip` nor :file:`zlib` modules are available feedparser-5.1.3/docs/resolving-relative-links.rst0000664000175000017500000002376112027504622022270 0ustar kurtkurt00000000000000.. _advanced.base: Relative Link Resolution ======================== Many feed elements and attributes are :abbr:`URI (Uniform Resource Identifier)`\s. :program:`Universal Feed Parser` resolves relative :abbr:`URI (Uniform Resource Identifier)`\s according to the `XML:Base <http://www.w3.org/TR/xmlbase/>`_ specification. 
We'll see how that works in a minute, but first let's talk about which values are treated as :abbr:`URI (Uniform Resource Identifier)`\s. Which Values Are :abbr:`URI (Uniform Resource Identifier)`\s ------------------------------------------------------------ These feed elements are treated as :abbr:`URI (Uniform Resource Identifier)`\s, and resolved if they are relative: * :ref:`reference.entry.author_detail.href` * :ref:`reference.entry.comments` * :ref:`reference.entry.contributors.href` * :ref:`reference.entry.enclosures.href` * :ref:`reference.entry.id` * :ref:`reference.entry.license` * :ref:`reference.entry.link` * :ref:`reference.entry.links.href` * :ref:`reference.entry.publisher_detail.href` * :ref:`reference.entry.source.author_detail.href` * :ref:`reference.entry.source.contributors.href` * :ref:`reference.entry.source.links.href` * :ref:`reference.feed.author_detail.href` * :ref:`reference.feed.contributors.href` * :ref:`reference.feed.docs` * :ref:`reference.feed.generator_detail.href` * :ref:`reference.feed.id` * :ref:`reference.feed.image.href` * :ref:`reference.feed.image.link` * :ref:`reference.feed.license` * :ref:`reference.feed.link` * :ref:`reference.feed.links.href` * :ref:`reference.feed.publisher_detail.href` * :ref:`reference.feed.textinput.link` In addition, several feed elements may contain :abbr:`HTML (HyperText Markup Language)` or :abbr:`XHTML (Extensible HyperText Markup Language)` markup. Certain elements and attributes in :abbr:`HTML (HyperText Markup Language)` can be relative :abbr:`URI (Uniform Resource Identifier)`\s, and :program:`Universal Feed Parser` will resolve these :abbr:`URI (Uniform Resource Identifier)`\s according to the same rules as the feed elements listed above. These feed elements may contain :abbr:`HTML (HyperText Markup Language)` or :abbr:`XHTML (Extensible HyperText Markup Language)` markup. 
In Atom feeds, whether these elements are treated as :abbr:`HTML (HyperText Markup Language)` depends on the value of the type attribute. In :abbr:`RSS (Rich Site Summary)` feeds, these values are always treated as :abbr:`HTML (HyperText Markup Language)`. * :ref:`reference.entry.content.value` * :ref:`reference.entry.summary` (:ref:`reference.entry.summary_detail.value`) * :ref:`reference.entry.title` (:ref:`reference.entry.title_detail.value`) * :ref:`reference.feed.info` (:ref:`reference.feed.info_detail.value`) * :ref:`reference.feed.rights` (:ref:`reference.feed.rights_detail.value`) * :ref:`reference.feed.subtitle` (:ref:`reference.feed.subtitle_detail.value`) * :ref:`reference.feed.title` (:ref:`reference.feed.title_detail.value`) When any of these feed elements contains :abbr:`HTML (HyperText Markup Language)` or :abbr:`XHTML (Extensible HyperText Markup Language)` markup, the following :abbr:`HTML (HyperText Markup Language)` elements are treated as :abbr:`URI (Uniform Resource Identifier)`\s and are resolved if they are relative: * * * *
* * *
* * * *
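
To make the idea of resolving a relative :abbr:`URI (Uniform Resource Identifier)` against a base concrete, here is a minimal sketch using only the Python 3 standard library. It is not :program:`Universal Feed Parser`'s own code, and it does not model xml:base inheritance; the ``resolve`` helper and the example.org addresses are hypothetical names chosen for illustration.

```python
from urllib.parse import urljoin


def resolve(base, href):
    """Resolve href against base (hypothetical helper, not part of feedparser).

    urljoin leaves absolute URIs untouched and applies the usual
    relative-reference rules to everything else.
    """
    return urljoin(base, href)


# Assumed base URI, e.g. the address a feed was fetched from.
base = "http://example.org/feeds/index.xml"

print(resolve(base, "images/logo.png"))           # path relative to the feed's directory
print(resolve(base, "/about"))                    # path relative to the site root
print(resolve(base, "http://example.com/other"))  # already absolute, returned unchanged
```

Because the parser resolves links before returning them, client code should not normally need a helper like this; the sketch only shows what "resolved if they are relative" means in practice.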