magpierss-0.72/AUTHORS
kellan
magpierss-0.72/ChangeLog
2005-10-28 14:11 kellan
* extlib/Snoopy.class.inc: a better solution
2005-10-28 11:51 kellan
* extlib/Snoopy.class.inc: fix arbitrary code execution
vulnerability when using curl+ssl
http://www.sec-consult.com/216.html
2005-03-08 10:46 kellan
* rss_parse.inc: fix bug w/ atom and date normalization
2005-02-09 14:59 kellan
* rss_fetch.inc: fix stale cache bug
2005-01-28 02:27 kellan
* rss_parse.inc: support php w/o array_change_case
2005-01-23 20:02 kellan
* rss_fetch.inc: fix cache bug introduced by charset encoding
2005-01-12 09:14 kellan
* rss_cache.inc, rss_fetch.inc: more sanity checks for when things
go wrong
2004-12-12 13:44 kellan
* INSTALL, rss_cache.inc, rss_utils.inc: detab
2004-11-23 20:15 kellan
* rss_parse.inc: fix calling iconv instead of mb_convert_encoding
2004-11-22 02:11 kellan
* CHANGES, ChangeLog, rss_parse.inc, scripts/magpie_debug.php: last
bit of tidying
2004-11-22 01:45 kellan
* rss_fetch.inc: detab, bump version
2004-11-22 01:43 kellan
* rss_parse.inc: was filtering too much
2004-11-22 00:03 kellan
* rss_fetch.inc, rss_parse.inc: cache on $url . $output_encoding
otherwise we can get munged output
2004-11-21 23:52 kellan
* rss_parse.inc: add WARNING
2004-11-21 23:45 kellan
* rss_parse.inc: don't set ERROR on notice or warning (rss_fetch
dies on parse errors)
2004-11-21 23:44 kellan
* rss_fetch.inc: add encoding defines (fix timeout error reporting)
2004-11-21 20:21 kellan
* rss_parse.inc: incorporate steve's patch
2004-11-21 19:26 kellan
* rss_parse.inc: remove old debugging functions, totally
arbitrarily. might break stuff. can't really explain why i'm
doing this.
2004-10-28 15:52 kellan
* rss_parse.inc: fixed '=' instead of '=='
2004-10-26 00:48 kellan
* rss_parse.inc: change epoch to timestamp to conform w/ php naming
conventions
2004-06-15 12:00 kellan
* rss_parse.inc: [no log message]
2004-04-26 14:16 kellan
* rss_fetch.inc: bump version
2004-04-26 12:36 kellan
* rss_parse.inc: fix field doubling
2004-04-24 17:47 kellan
* CHANGES, ChangeLog: updated
2004-04-24 17:35 kellan
* rss_fetch.inc: bumped version
2004-04-24 16:52 kellan
* rss_parse.inc: support arbitrary atom content constructs
some refactoring
2004-04-24 16:15 kellan
* rss_parse.inc: support summary content construct. add normalize
function
2004-03-27 16:29 kellan
* extlib/Snoopy.class.inc: accept self-signed certs
2004-03-27 12:53 kellan
* extlib/Snoopy.class.inc: fixed SSL support * set status * set
error on bad curl
(also ripped out big chunks of dead weight (submit_form) which
were getting in my way)
2004-01-25 02:25 kellan
* rss_parse.inc: make RSS 1.0's rdf:about available
2004-01-25 02:07 kellan
* rss_parse.inc: clean up text, and line formats. add support item
rdf:about
2004-01-24 23:40 kellan
* CHANGES, ChangeLog: update changes
2004-01-24 23:37 kellan
* rss_fetch.inc: updated version
2004-01-24 23:35 kellan
* rss_parse.inc: whitespace
2004-01-24 23:23 kellan
* extlib/Snoopy.class.inc: support badly formatted http headers
2004-01-24 23:20 kellan
* rss_parse.inc: added alpha atom parsing support
2003-06-25 22:34 kellan
* extlib/Snoopy.class.inc: fixed fread 4.3.2 compatibility problems
2003-06-13 11:31 kellan
* rss_fetch.inc: reset cache on 304
2003-06-12 21:37 kellan
* rss_cache.inc, rss_fetch.inc, rss_parse.inc, rss_utils.inc:
bumped up version numbers
2003-06-12 21:32 kellan
* htdocs/index.html: updated news
2003-06-12 21:27 kellan
* NEWS: a manual blog :)
2003-06-12 21:22 kellan
* htdocs/index.html: fully qualified img
2003-06-12 21:20 kellan
* htdocs/index.html: clean up. added badge.
2003-06-12 21:04 kellan
* rss_utils.inc: clean up regex
2003-06-12 21:02 kellan
* rss_cache.inc: suppress some warnings
2003-05-30 20:44 kellan
* extlib/Snoopy.class.inc: more comments, cleaned up notice
2003-05-30 15:14 kellan
* extlib/Snoopy.class.inc: don't advertise gzip support if the user
hasn't built php with gzinflate support
2003-05-12 22:32 kellan
* ChangeLog: changes
2003-05-12 22:11 kellan
* htdocs/index.html: announce 0.5
2003-05-12 21:42 kellan
* htdocs/index.html: change
2003-05-12 21:39 kellan
* rss_fetch.inc: use gzip
2003-05-12 21:37 kellan
* extlib/Snoopy.class.inc: added support for gzip encoded content
negotiation
2003-05-12 21:32 kellan
* rss_cache.inc, rss_fetch.inc, rss_parse.inc, rss_utils.inc: fixed
typos
2003-04-26 21:44 kellan
* rss_parse.inc: fix minor typo
2003-04-18 08:19 kellan
* htdocs/cookbook.html: updated cookbook to show more code for
limiting items
2003-03-03 16:02 kellan
* rss_parse.inc, scripts/magpie_slashbox.php: committed (or
adapted) patch from Nicola (www.technick.com) to quell 'Undefined
Indexes' notices
2003-03-03 15:59 kellan
* rss_fetch.inc: committed patch from nicola (www.technick.com) to
quell 'undefined indexes' notices.
* Magpie now automatically includes its version in the
user-agent, & whether caching is turned on.
2003-02-12 01:22 kellan
* CHANGES, ChangeLog: ChangeLog now auto-generated by cvs2cl
2003-02-12 00:21 kellan
* rss_fetch.inc: better errors, hopefully stomped on pesky notices
2003-02-12 00:19 kellan
* rss_parse.inc: check to see if xml is supported, if not die
also throw better xml errors
2003-02-12 00:18 kellan
* rss_cache.inc: hopefully cleared up some notices that were being
thrown into the log
fixed a debug statement that was being called as an error
2003-02-12 00:15 kellan
* scripts/: magpie_simple.php, magpie_slashbox.php: moved
magpie_simple to magpie_slashbox, and replaced it with a simpler
demo.
2003-02-12 00:02 kellan
* INSTALL, README, TROUBLESHOOTING: Improved documentation. Better
install instructions.
TROUBLESHOOTING cover common installation and usage problems
2003-01-22 14:40 kellan
* htdocs/cookbook.html: added cookbook.html
2003-01-21 23:47 kellan
* cookbook: a magpie cookbook
2003-01-20 10:09 kellan
* ChangeLog: updated
2003-01-20 09:23 kellan
* scripts/simple_smarty.php: minor clean up
2003-01-20 09:15 kellan
* scripts/README: added smarty url
2003-01-20 09:14 kellan
* magpie_simple.php, htdocs/index.html, scripts/README,
scripts/magpie_debug.php, scripts/magpie_simple.php,
scripts/simple_smarty.php,
scripts/smarty_plugin/modifier.rss_date_parse.php,
scripts/templates/simple.smarty: Added scripts directory for
examples on how to use MagpieRSS
magpie_simple - is a simple example magpie_debug - spew all the
information from a parsed RSS feed simple_smarty - example of
using magpie with Smarty template system
smarty_plugin/modifier.rss_date_parse.php - support file for the
smarty demo templates/simple.smarty - template for the smarty demo
2003-01-20 09:11 kellan
* rss_fetch.inc, rss_parse.inc: changes to error handling to give
script authors more access to magpie's errors.
added method magpie_error() to retrieve global MAGPIE_ERROR
variable for when fetch_rss() returns false
2002-10-26 19:02 kellan
* htdocs/index.html: putting the website under source control
2002-10-26 18:43 kellan
* AUTHORS, ChangeLog, INSTALL, README: some documentation to make
it all look official :)
2002-10-25 23:04 kellan
* magpie_simple.php: quxx
2002-10-25 23:04 kellan
* rss_parse.inc: added support for textinput and image
2002-10-25 19:23 kellan
* magpie_simple.php, rss_cache.inc, rss_fetch.inc, rss_parse.inc,
rss_utils.inc: switched to using Snoopy for fetching remote RSS
files.
added support for conditional gets
2002-10-25 19:22 kellan
* rss_cache.inc, rss_fetch.inc, rss_parse.inc, rss_utils.inc:
Change comment style to slavishly imitate the phpinsider style
found in Smarty and Snoopy :)
2002-10-25 19:18 kellan
* extlib/Snoopy.class.inc: added Snoopy in order to support
conditional gets
2002-10-23 23:19 kellan
* magpie_simple.php, rss_cache.inc, rss_fetch.inc, rss_parse.inc:
MAJOR CLEANUP!
* rss_fetch got rid of the options array, replaced it with a more
PHP-like solution of using defines. constants are setup, with
defaults, in the function init()
got rid of the idiom of passing back an array, it was awkward to
deal with in PHP, and unusual (and consequently confusing to
people). now i return true/false values, and try to setup error
string where appropriate (rss_cache has the most complete example
of this)
change the logic for interacting with the cache
* rss_cache major re-working of how errors are handled. tried to
make the code more resilient. the cache is now much more aware
of MAX_AGE, where before this was being driven out of rss_fetch
(which was silly)
* rss_parse properly handles xml parse errors. used to sail
along blithely unaware.
2002-09-11 11:11 kellan
* rss_cache.inc, rss_parse.inc, magpie_simple.php, rss_fetch.inc,
rss_utils.inc: Initial revision
2002-09-11 11:11 kellan
* rss_cache.inc, rss_parse.inc, magpie_simple.php, rss_fetch.inc,
rss_utils.inc: initial import
magpierss-0.72/CHANGES
Version 0.72
-----------
- fix security exploit: http://www.sec-consult.com/216.html
Version 0.7
-----------
- support for input and output charset encoding
based on the work in FoF, uses iconv or mbstring if available
Version 0.6
-----------
- basic support for Atom syndication format
including support for Atom content constructs
- fixed support for private feeds (HTTP Auth and SSL)
(thanks to silverorange.com for providing test feeds)
- support for some broken webservers
Version 0.52
-----------
- support GZIP content negotiation
- PHP 4.3.2 support
Version 0.4
-----------
- improved error handling, better access for script authors
- included example scripts of working with MagpieRSS
- new Smarty plugin for RSS date parsing
Version 0.3
-----------
- added support for conditional gets (Last-Modified, ETag)
- now use Snoopy to handle fetching RSS files
Version 0.2
-----------
- MAJOR CLEAN UP
- removed kludgy $options array in favour of constants
- phased out returning arrays
- added better error handling
- re-worked comments
magpierss-0.72/cookbook
MAGPIERSS RECIPES: Cooking with Corbies
"Four and twenty blackbirds baked in a pie."
1. LIMIT THE NUMBER OF HEADLINES (AKA ITEMS) RETURNED.
PROBLEM:
You want to display the 10 (or 3) most recent headlines, but the RSS feed
contains 15.
SOLUTION:
$num_items = 10;
$rss = fetch_rss($url);
$items = array_slice($rss->items, 0, $num_items);
DISCUSSION:
Rather than trying to limit the number of items Magpie parses, a much simpler
and more flexible approach is to take a "slice" of the array of items. And
array_slice() is smart enough to do the right thing if the feed has fewer items
than $num_items.
See: http://www.php.net/array_slice
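A quick way to convince yourself of the "fewer items" case is to slice a
stand-in array instead of a real feed. This is only a sketch: the $items array
below is made-up sample data, where Magpie would hand you $rss->items.

```php
<?php
// Stand-in for $rss->items: a feed that carries only three items (sample data).
$items = array(
    array('title' => 'First headline'),
    array('title' => 'Second headline'),
    array('title' => 'Third headline'),
);

$num_items = 10;

// Asking for more items than exist simply returns everything there is.
$latest = array_slice($items, 0, $num_items);

foreach ($latest as $item) {
    echo $item['title'], "\n";
}
```

No length check is needed before slicing, which is why the recipe stays three
lines long.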
2. DISPLAY A CUSTOM ERROR MESSAGE IF SOMETHING GOES WRONG
PROBLEM:
You don't want Magpie's error messages showing up if something goes wrong.
SOLUTION:
# Magpie throws USER_WARNINGS only
# so you can cloak these, by only showing ERRORs
error_reporting(E_ERROR);
# check the return value of fetch_rss()
$rss = fetch_rss($url);
if ( $rss ) {
...display rss feed...
}
else {
echo "An error occurred! " .
"Consider donating more $$$ for restoration of services." .
" Error Message: " . magpie_error();
}
DISCUSSION:
MagpieRSS triggers a warning in a number of circumstances. The 2 most common
circumstances are: if the specified RSS file isn't properly formed (usually
because it includes illegal HTML), or if Magpie can't download the remote RSS
file, and there is no cached version.
If you don't want your users to see these warnings, change your error_reporting
settings to only display ERRORs. Another option is to turn off display_errors,
so that WARNINGs and NOTICEs still go to the error_log but not to the web pages.
You can do this with:
# you can also do this in your php.ini file
ini_set('display_errors', 0);
See: http://www.php.net/error_reporting,
http://www.php.net/ini_set,
http://www.php.net/manual/en/ref.errorfunc.php
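A third option, not covered by this recipe, is to catch the warnings yourself
with PHP's set_error_handler(). The sketch below is an illustration only:
capture_warning() and $captured are hypothetical names, and the warning is
simulated with trigger_error() rather than raised by Magpie.

```php
<?php
// Hypothetical collector: warnings end up here instead of on the page.
$captured = array();

// Hypothetical handler name; PHP calls it with the error level and message.
function capture_warning($errno, $errstr) {
    global $captured;
    $captured[] = $errstr;
    return true; // tell PHP the warning has been dealt with
}

// Only intercept user-level warnings; everything else is left alone.
set_error_handler('capture_warning', E_USER_WARNING);

// Simulated failure, standing in for a warning raised during a fetch.
trigger_error('simulated feed failure', E_USER_WARNING);

restore_error_handler();

foreach ($captured as $message) {
    echo "Feed problem: ", $message, "\n";
}
```

The collected messages can then be logged or shown in whatever form suits your
page.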
3. GENERATE A NEW RSS FEED
PROBLEM:
Create an RSS feed for other people to use.
SOLUTION:
Use Useful Inc's RSSWriter (http://usefulinc.com/rss/rsswriter/)
DISCUSSION:
An example of turning a Magpie-parsed RSS object back into an RSS file is
forthcoming. In the meantime, RSSWriter has great documentation.
4. DISPLAY HEADLINES MORE RECENT THAN X DATE
PROBLEM:
You only want to display headlines that were published on, or after a certain
date.
SOLUTION:
require 'rss_utils.inc';
# get all headlines published today
$today = getdate();
# today, 12AM
$date = mktime(0,0,0,$today['mon'], $today['mday'], $today['year']);
$rss = fetch_rss($url);
foreach ( $rss->items as $item ) {
$published = parse_w3cdtf($item['dc']['date']);
if ( $published >= $date ) {
echo "Title: " . $item['title'];
echo "Published: " . date("h:i:s A", $published);
echo "<p>";
}
}
DISCUSSION:
This recipe only works for RSS 1.0 feeds that include the <dc:date> field
(which is very good RSS style). parse_w3cdtf() is defined in
rss_utils.inc, and parses RSS style dates into Unix epoch
seconds.
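To make the date conversion less of a black box, here is a rough sketch of
turning a W3CDTF string into epoch seconds. It is not Magpie's parse_w3cdtf():
w3cdtf_to_epoch() is a hypothetical name, it only handles the full
date-plus-time form, and it assumes UTC when no timezone is given.

```php
<?php
// Hypothetical stand-in for parse_w3cdtf(); returns -1 on unparseable input.
function w3cdtf_to_epoch($str) {
    $pat = '/^(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2})(?::(\d{2}))?(Z|[+-]\d{2}:\d{2})?$/';
    if (!preg_match($pat, $str, $m)) {
        return -1;
    }
    // Seconds are optional in W3CDTF.
    $sec = (isset($m[6]) && $m[6] !== '') ? (int)$m[6] : 0;

    // Interpret the fields as UTC first...
    $epoch = gmmktime((int)$m[4], (int)$m[5], $sec, (int)$m[2], (int)$m[3], (int)$m[1]);

    // ...then undo the zone offset, if one was given (assumption: none means UTC).
    $tz = isset($m[7]) ? $m[7] : 'Z';
    if ($tz !== 'Z' && $tz !== '') {
        $sign = ($tz[0] === '-') ? -1 : 1;
        $offset = $sign * ((int)substr($tz, 1, 2) * 3600 + (int)substr($tz, 4, 2) * 60);
        $epoch -= $offset;
    }
    return $epoch;
}

echo w3cdtf_to_epoch('2002-06-01T11:00:00Z'), "\n";
```

The result is an ordinary Unix timestamp, so it can be compared against
mktime() values as in the recipe above.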
5. PARSE A LOCAL FILE CONTAINING RSS
PROBLEM:
MagpieRSS provides fetch_rss(), which takes a URL and returns a
parsed RSS object, but what if you want to parse a file stored locally that
doesn't have a URL?
SOLUTION:
require_once('rss_parse.inc');
$rss_file = 'some_rss_file.rdf';
$rss_string = read_file($rss_file);
$rss = new MagpieRSS( $rss_string );
if ( $rss and !$rss->ERROR) {
...display rss...
}
else {
echo "Error: " . $rss->ERROR;
}
# efficiently read a file into a string
# in php >= 4.3.0 you can simply use file_get_contents()
#
function read_file($filename) {
$fh = fopen($filename, 'r') or die($php_errormsg);
$rss_string = fread($fh, filesize($filename) );
fclose($fh);
return $rss_string;
}
DISCUSSION:
Here we are using MagpieRSS's RSS parser directly, without the convenience wrapper
of fetch_rss(). We read the contents of the RSS file into a
string, and pass it to the parser constructor. Notice also that error handling
is subtly different: the parser reports problems through $rss->ERROR rather than magpie_error().
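On PHP 4.3.0 and later the file-reading helper can shrink to a
file_get_contents() call. The sketch below assumes a hypothetical helper name,
read_feed_file(), and returns false instead of dying so the caller can decide
what to show; the temporary file is demo data only.

```php
<?php
// Hypothetical replacement for read_file(): returns the file contents,
// or false when the file is missing or unreadable.
function read_feed_file($filename) {
    if (!is_readable($filename)) {
        return false;
    }
    return file_get_contents($filename);
}

// Demo: write a tiny placeholder document to a temp file and read it back.
$tmp = tempnam(sys_get_temp_dir(), 'rss');
file_put_contents($tmp, '<rss/>');

$rss_string = read_feed_file($tmp);
if ($rss_string === false) {
    echo "Error: could not read $tmp\n";
} else {
    echo strlen($rss_string), " bytes read\n";
}
unlink($tmp);
```

The returned string is what you would pass to the parser constructor in the
solution above.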
Why?
I wrote MagpieRSS out of a frustration with the limitations of existing
solutions. In particular, many of the existing PHP solutions seemed to:
* use a parser based on regular expressions, making for an inherently
  fragile solution
* only support early versions of RSS
* discard all the interesting information besides item title, description,
  and link
* not build proper separation between parsing the RSS and displaying it
In particular I failed to find any PHP RSS parsers that could sufficiently
parse RSS 1.0 feeds, to be useful on the RSS based event feeds we generate
at Protest.net.
Features
Easy to Use
As simple as:
require('rss_fetch.inc');
$rss = fetch_rss($url);
Parses RSS 0.9 - RSS 1.0
Parses most RSS formats, including support for
1.0 modules and limited
namespace support. RSS is packed into convenient data structures; easy to
use in PHP, and appropriate for passing to a templating system, like
Smarty.
Integrated Object Cache
Caching the parsed RSS means that the 2nd request is fast, and that
including the rss_fetch call in your PHP page won't destroy your performance,
and force you to rely on an external cron job. And it happens transparently.
Makes extensive use of constants to allow overriding default behaviour, and
installation on shared hosts.
Modular
* rss_fetch.inc - wraps a simple interface (fetch_rss())
  around the library.
* rss_parse.inc - provides the RSS parser, and the RSS object
* rss_cache.inc - a simple (no GC) object cache, optimized for RSS objects
* rss_utils.inc - utility functions for working with RSS. currently
  provides parse_w3cdtf(), for parsing W3CDTF into epoch seconds.
Magpie's approach to parsing RSS
Magpie takes a naive, and inclusive approach. Absolutely
non-validating, as long as the RSS feed is well formed, Magpie will
cheerfully parse new, and never before seen tags in your RSS feeds.
This makes it very simple to support the varied versions of RSS, but
forces the consumer of an RSS feed to be cognizant of how it is
structured (at least if you want to do something fancy).
Magpie parses a RSS feed into a simple object, with 4 fields:
channel, items, image, and
textinput.
channel
$rss->channel contains key-value pairs of all tags, without
nested tags, found between the root tag (<rdf:RDF>, or <rss>)
and the end of the document.
items
$rss->items is an array of associative arrays, each one
describing a single item. An example that looks like:
<item rdf:about="http://protest.net/NorthEast/calendrome.cgi?span=event&ID=210257">
<title>Weekly Peace Vigil</title>
<link>http://protest.net/NorthEast/calendrome.cgi?span=event&ID=210257</link>
<description>Wear a white ribbon</description>
<dc:subject>Peace</dc:subject>
<ev:startdate>2002-06-01T11:00:00</ev:startdate>
<ev:location>Northampton, MA</ev:location>
<ev:enddate>2002-06-01T12:00:00</ev:enddate>
<ev:type>Protest</ev:type>
</item>
Is parsed, and pushed on the $rss->items array as:
array(
    'title' => 'Weekly Peace Vigil',
    'link' => 'http://protest.net/NorthEast/calendrome.cgi?span=event&ID=210257',
    'description' => 'Wear a white ribbon',
    'dc' => array (
        'subject' => 'Peace'
    ),
    'ev' => array (
        'startdate' => '2002-06-01T11:00:00',
        'enddate' => '2002-06-01T12:00:00',
        'type' => 'Protest',
        'location' => 'Northampton, MA'
    )
);
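Because namespaced tags land in nested arrays, reading them takes one extra
index, and a feed may omit any of them. The sketch below works on a trimmed
copy of the sample item; item_field() is a hypothetical helper, not part of
Magpie.

```php
<?php
// Trimmed copy of the sample item shown above.
$item = array(
    'title' => 'Weekly Peace Vigil',
    'dc' => array('subject' => 'Peace'),
    'ev' => array(
        'startdate' => '2002-06-01T11:00:00',
        'location'  => 'Northampton, MA',
    ),
);

// Hypothetical helper: read $item[$ns][$name], falling back to $default
// when the feed did not supply that tag.
function item_field($item, $ns, $name, $default = '') {
    if (isset($item[$ns]) && is_array($item[$ns]) && isset($item[$ns][$name])) {
        return $item[$ns][$name];
    }
    return $default;
}

echo item_field($item, 'ev', 'location'), "\n";
echo item_field($item, 'dc', 'creator', 'unknown'), "\n";
```

Checking with isset() first avoids 'undefined index' notices when a feed
leaves a module out.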
image and textinput
$rss->image and $rss->textinput are associative arrays
including name-value pairs for anything found between the respective parent
tags.
coded by: kellan (at) protest.net, feedback is always appreciated.
magpierss-0.72/INSTALL
REQUIREMENTS
MagpieRSS requires a recent PHP 4+ (developed with 4.2.0)
with xml (expat) support.
Optionally:
* PHP5 with libxml2 support.
* cURL for SSL support
* iconv (preferred) or mbstring for expanded character set support
QUICK START
Magpie consists of 4 files (rss_fetch.inc, rss_parse.inc, rss_cache.inc,
and rss_utils.inc), and the directory extlib (which contains a modified
version of the Snoopy HTTP client)
Copy these 5 resources to a directory named 'magpierss' in the same
directory as your PHP script.
At the top of your script add the following line:
require_once('magpierss/rss_fetch.inc');
Now you can use the fetch_rss() method:
$rss = fetch_rss($url);
Done. That's it. See README for more details on using MagpieRSS.
NEXT STEPS
Important: you'll probably want to get the cache directory working in
order to speed up your application, and not abuse the webserver you're
downloading the RSS from.
Optionally you can install MagpieRSS in your PHP include path in order to
make it available server wide.
Lastly you might want to look through the constants in rss_fetch.inc to see if
there is anything you want to override (the defaults are pretty good)
For more info, or if you have trouble, see TROUBLESHOOTING
SETTING UP CACHING
Magpie has built-in transparent caching. With caching Magpie will only
fetch and parse RSS feeds when there is new content. Without this feature
your pages will be slow, and the sites serving the RSS feed will be annoyed
with you.
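The freshness decision behind the cache can be pictured as a three-way check on
the cache file's age. This is a sketch in the spirit of rss_cache.inc, not its
code: cache_status() is a hypothetical name, and the file and max age used in
the demo are made up.

```php
<?php
// Hypothetical helper: classify a cache file as MISS, HIT or STALE.
function cache_status($cache_file, $max_age) {
    clearstatcache(); // make sure we see the file's current mtime
    if (!file_exists($cache_file)) {
        return 'MISS';            // nothing cached yet: must fetch
    }
    $age = time() - filemtime($cache_file);
    return ($age < $max_age) ? 'HIT' : 'STALE';
}

// Demo with a temp file standing in for a cached feed.
$file = tempnam(sys_get_temp_dir(), 'mag');
echo cache_status($file, 3600), "\n";   // just created

touch($file, time() - 7200);            // pretend it was cached two hours ago
echo cache_status($file, 3600), "\n";

unlink($file);
echo cache_status($file, 3600), "\n";
```

Only the STALE and MISS cases cost a request to the remote server, which is
what keeps both your pages and the feed's host happy.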
** Simple and Automatic **
By default Magpie will try to create a cache directory named 'cache' in the
same directory as your PHP script.
** Creating a Local Cache Directory **
Often the automatic creation will fail, because your webserver doesn't have
sufficient permissions to create the directory. In that case, create it yourself.
Exact instructions for how to do this will vary from install to install and
platform to platform. The steps are:
1. Make a directory named 'cache'
2. Give the web server write access to that directory.
An example of how to do this on Debian would be:
1. mkdir /path/to/script/cache
2. chgrp www-data /path/to/script/cache
3. chmod 775 /path/to/script/cache
On other Unixes you'll need to change 'www-data' to what ever user Apache
runs as. (on MacOS X the user would be 'www')
** Cache in /tmp **
Sometimes you won't be able to create a local cache directory. Some reasons
might be:
1. No shell account
2. Insufficient permissions to change ownership of a directory
3. Webserver runs as 'nobody'
In these situations using a cache directory in /tmp can often be a good
option.
The drawback is /tmp is public, so anyone on the box can read the cache
files. Usually RSS feeds are public information, so you'll have to decide
how much of an issue that is.
To use /tmp as your cache directory you need to add the following line to
your script:
define('MAGPIE_CACHE_DIR', '/tmp/magpie_cache');
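If you are not sure which location will be writable on a given host, one
possible approach is to test a few candidates before defining the constant.
The sketch below is an assumption-laden illustration: pick_cache_dir() is a
hypothetical helper, the candidate paths are examples, and the define() has to
run before fetch_rss() is called for it to take effect.

```php
<?php
// Hypothetical helper: return the first candidate that exists and is
// writable by the current (web server) user, or false if none qualifies.
function pick_cache_dir($candidates) {
    foreach ($candidates as $dir) {
        if (is_dir($dir) && is_writable($dir)) {
            return $dir;
        }
    }
    return false;
}

// Example candidates: a local 'cache' directory first, then the system temp dir.
$dir = pick_cache_dir(array(
    dirname(__FILE__) . DIRECTORY_SEPARATOR . 'cache',
    sys_get_temp_dir(),
));

if ($dir !== false && !defined('MAGPIE_CACHE_DIR')) {
    define('MAGPIE_CACHE_DIR', $dir);
}
```

The same caveat applies as above: anything under a shared temp directory is
readable by other users on the box.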
** Global Cache **
If you have several applications using Magpie, you can create a single
shared cache directory, either using the /tmp cache, or somewhere else on
the system.
The upside is that you'll distribute fetching and parsing feeds across
several applications.
INSTALLING MAGPIE SERVER WIDE
Rather than following the Quick Start instructions, which require you to have
a copy of Magpie per application, you can place it in some
shared location.
** Adding Magpie to Your Include Path **
Copy the 5 resources (rss_fetch.inc, rss_parse.inc, rss_cache.inc,
rss_utils.inc, and extlib) to a directory named 'magpierss' in your include
path. Now any PHP file on your system can use Magpie with:
require_once('magpierss/rss_fetch.inc');
Different installs have different include paths, and you'll have to figure
out what your include_path is.
From shell you can try:
php -i | grep 'include_path'
Alternatively you can create a phpinfo.php file which contains:
<?php phpinfo(); ?>
Debian's default is:
/usr/share/php
(though a more ideologically pure location would be /usr/local/share/php)
Apple's default include path is:
/usr/lib/php
While the Entropy PHP build seems to use:
/usr/local/php/lib/php
magpierss-0.72/NEWS
MagpieRSS News
MAGPIERSS 0.51 RELEASED
* important bugfix!
* fix "silent failure" when PHP doesn't have zlib
FEED ON FEEDS USES MAGPIE
* web-based RSS aggregator built with Magpie
* easy to install, easy to use.
http://minutillo.com/steve/feedonfeeds/
MAGPIERSS 0.5 RELEASED
* supports transparent HTTP gzip content negotiation for reduced bandwidth usage
* quashed some undefined index notices
MAGPIERSS 0.46 RELEASED
* minor release, more error handling clean up
* documentation fixes, simpler example
* new troubleshooting guide for installation and usage problems
http://magpierss.sourceforge.net/TROUBLESHOOTING
MAGPIE NEWS AS RSS
* releases, bug fixes, related stories in RSS
MAGPIERSS COOKBOOK: SIMPLE PHP RSS HOW TOS
* answers some of the most frequently asked Magpie questions
* feedback, suggestions, requests, recipes welcome
http://magpierss.sourceforge.net/cookbook.html
MAGPIERSS 0.4 RELEASED!
* improved error handling, more flexibility for script authors, backwards compatible
* new and better examples! including using MagpieRSS and Smarty
* new Smarty plugin for RSS date parsing
http://smarty.php.net
INFINITE PENGUIN NOW SUPPORTS MAGPIE 0.3
* simple, sophisticated RSS viewer
* includes auto-generated javascript ticker from RSS feed
http://www.infinitepenguins.net/rss/
TRAUMWIND RELEASES REX BACKEND FOR MAGPIERSS
* drop in support using regex based XML parser
* parses improperly formed XML that chokes expat
http://traumwind.de/blog/magpie/magpie_alike.php
MAGPIERSS 0.3 RELEASED!
* Support added for HTTP Conditional GETs.
http://fishbowl.pastiche.org/archives/001132.html
MAGPIERSS 0.2!
* Major clean up of the code. Easier to use.
* Simpler install on shared hosts.
* Better documentation and comments.
magpierss-0.72/README
NAME
MagpieRSS - a simple RSS integration tool
SYNOPSIS
require_once('rss_fetch.inc');
$url = $_GET['url'];
$rss = fetch_rss( $url );
echo "Channel Title: " . $rss->channel['title'] . "\n";
DESCRIPTION
MagpieRSS is an XML-based RSS parser in PHP. It attempts to be "PHP-like",
and simple to use.
Some features include:
* supports RSS 0.9 - 1.0, with limited RSS 2.0 support
* supports namespaces, and modules, including mod_content and mod_event
* open minded [1]
* simple, functional interface, to object oriented backend parser
* automatic caching of parsed RSS objects makes it easy to integrate
* supports conditional GET with Last-Modified, and ETag
* uses constants for easy override of default behaviour
* heavily commented
1. By open minded I mean Magpie will accept any tag it finds in good faith that
it was supposed to be here. For strict validation, look elsewhere.
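The conditional GET feature listed above boils down to sending back the
validators the server gave you last time. As a rough sketch (not Magpie's or
Snoopy's code), the request headers could be assembled like this;
conditional_get_headers() and the sample values are hypothetical.

```php
<?php
// Hypothetical helper: build the request headers for a conditional GET
// from the ETag and Last-Modified values saved with the cached copy.
function conditional_get_headers($etag, $last_modified) {
    $headers = array();
    if ($etag) {
        $headers['If-None-Match'] = $etag;
    }
    if ($last_modified) {
        $headers['If-Modified-Since'] = $last_modified;
    }
    return $headers;
}

// Sample validators, as they might have been stored from an earlier response.
$headers = conditional_get_headers('"abc123"', 'Sat, 01 Jun 2002 11:00:00 GMT');
foreach ($headers as $name => $value) {
    echo "$name: $value\n";
}
```

A server that honours these headers answers 304 Not Modified when nothing has
changed, so the cached object can be reused without downloading the feed again.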
GETTING STARTED
COPYRIGHT:
Copyright(c) 2002 kellan@protest.net. All rights reserved.
This software is released under the GNU General Public License.
Please read the disclaimer at the top of the Snoopy.class.inc file.
magpierss-0.72/rss_cache.inc
<?php
/*
* Version: 0.51
* License: GPL
*
* The latest version of MagpieRSS can be obtained from:
* http://magpierss.sourceforge.net
*
* For questions, help, comments, discussion, etc., please join the
* Magpie mailing list:
* http://lists.sourceforge.net/lists/listinfo/magpierss-general
*
*/
class RSSCache {
var $BASE_CACHE = './cache'; // where the cache files are stored
var $MAX_AGE = 3600; // when are files stale, default one hour
var $ERROR = ""; // accumulate error messages
function RSSCache ($base='', $age='') {
if ( $base ) {
$this->BASE_CACHE = $base;
}
if ( $age ) {
$this->MAX_AGE = $age;
}
// attempt to make the cache directory
if ( ! file_exists( $this->BASE_CACHE ) ) {
$status = @mkdir( $this->BASE_CACHE, 0755 );
// if make failed
if ( ! $status ) {
$this->error(
"Cache couldn't make dir '" . $this->BASE_CACHE . "'."
);
}
}
}
/*=======================================================================*\
Function: set
Purpose: add an item to the cache, keyed on url
Input: url from which the rss file was fetched
Output: cache file name on success, 0 on failure
\*=======================================================================*/
function set ($url, $rss) {
$this->ERROR = "";
$cache_file = $this->file_name( $url );
$fp = @fopen( $cache_file, 'w' );
if ( ! $fp ) {
$this->error(
"Cache unable to open file for writing: $cache_file"
);
return 0;
}
$data = $this->serialize( $rss );
fwrite( $fp, $data );
fclose( $fp );
return $cache_file;
}
/*=======================================================================*\
Function: get
Purpose: fetch an item from the cache
Input: url from which the rss file was fetched
Output: cached object on HIT, false on MISS
\*=======================================================================*/
function get ($url) {
$this->ERROR = "";
$cache_file = $this->file_name( $url );
if ( ! file_exists( $cache_file ) ) {
$this->debug(
"Cache doesn't contain: $url (cache file: $cache_file)"
);
return 0;
}
$fp = @fopen($cache_file, 'r');
if ( ! $fp ) {
$this->error(
"Failed to open cache file for reading: $cache_file"
);
return 0;
}
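if ( $filesize = filesize($cache_file) ) {
$data = fread( $fp, $filesize );
fclose( $fp ); // release the handle once the data is read
$rss = $this->unserialize( $data );
return $rss;
}
fclose( $fp ); // empty cache file: nothing to return
return 0;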
}
/*=======================================================================*\
Function: check_cache
Purpose: check a url for membership in the cache
and whether the object is older than MAX_AGE (ie. STALE)
Input: url from which the rss file was fetched
Output: 'HIT', 'STALE', or 'MISS'
\*=======================================================================*/
function check_cache ( $url ) {
$this->ERROR = "";
$filename = $this->file_name( $url );
if ( file_exists( $filename ) ) {
// find how long ago the file was added to the cache
// and whether that is longer than MAX_AGE
$mtime = filemtime( $filename );
$age = time() - $mtime;
if ( $this->MAX_AGE > $age ) {
// object exists and is current
return 'HIT';
}
else {
// object exists but is old
return 'STALE';
}
}
else {
// object does not exist
return 'MISS';
}
}
function cache_age( $cache_key ) {
$filename = $this->file_name( $cache_key );
if ( file_exists( $filename ) ) {
$mtime = filemtime( $filename );
$age = time() - $mtime;
return $age;
}
else {
return -1;
}
}
/*=======================================================================*\
Function: serialize
\*=======================================================================*/
function serialize ( $rss ) {
return serialize( $rss );
}
/*=======================================================================*\
Function: unserialize
\*=======================================================================*/
function unserialize ( $data ) {
return unserialize( $data );
}
/*=======================================================================*\
Function: file_name
Purpose: map url to location in cache
Input: url from which the rss file was fetched
Output: a file name
\*=======================================================================*/
function file_name ($url) {
$filename = md5( $url );
return join( DIRECTORY_SEPARATOR, array( $this->BASE_CACHE, $filename ) );
}
/*=======================================================================*\
Function: error
Purpose: register error
\*=======================================================================*/
function error ($errormsg, $lvl=E_USER_WARNING) {
// append PHP's error message if track_errors enabled
if ( isset($php_errormsg) ) {
$errormsg .= " ($php_errormsg)";
}
$this->ERROR = $errormsg;
if ( MAGPIE_DEBUG ) {
trigger_error( $errormsg, $lvl);
}
else {
error_log( $errormsg, 0);
}
}
function debug ($debugmsg, $lvl=E_USER_NOTICE) {
if ( MAGPIE_DEBUG ) {
$this->error("MagpieRSS [debug] $debugmsg", $lvl);
}
}
}
?>
magpierss-0.72/rss_fetch.inc
<?php
/*
* License: GPL
*
* The latest version of MagpieRSS can be obtained from:
* http://magpierss.sourceforge.net
*
* For questions, help, comments, discussion, etc., please join the
* Magpie mailing list:
* magpierss-general@lists.sourceforge.net
*
*/
// Setup MAGPIE_DIR for use on hosts that don't include
// the current path in include_path.
// with thanks to rajiv and smarty
if (!defined('DIR_SEP')) {
define('DIR_SEP', DIRECTORY_SEPARATOR);
}
if (!defined('MAGPIE_DIR')) {
define('MAGPIE_DIR', dirname(__FILE__) . DIR_SEP);
}
require_once( MAGPIE_DIR . 'rss_parse.inc' );
require_once( MAGPIE_DIR . 'rss_cache.inc' );
// for including 3rd party libraries
define('MAGPIE_EXTLIB', MAGPIE_DIR . 'extlib' . DIR_SEP);
require_once( MAGPIE_EXTLIB . 'Snoopy.class.inc');
/*
* CONSTANTS - redefine these in your script to change the
* behaviour of fetch_rss(). Currently, most options affect the cache
*
* MAGPIE_CACHE_ON - Should Magpie cache parsed RSS objects?
* For me a built in cache was essential to creating a "PHP-like"
* feel to Magpie, see rss_cache.inc for rationale
*
*
* MAGPIE_CACHE_DIR - Where should Magpie cache parsed RSS objects?
* This should be a location that the webserver can write to. If this
 * directory does not already exist Magpie will try to be smart and create
* it. This will often fail for permissions reasons.
*
*
* MAGPIE_CACHE_AGE - How long to store cached RSS objects? In seconds.
*
*
* MAGPIE_CACHE_FRESH_ONLY - If remote fetch fails, throw error
* instead of returning stale object?
*
* MAGPIE_DEBUG - Display debugging notices?
*
*/
/*=======================================================================*\
Function: fetch_rss
Purpose: return RSS object for the given url
maintain the cache
Input: url of RSS file
Output: parsed RSS object (see rss_parse.inc)
NOTES ON CACHING:
If caching is on (MAGPIE_CACHE_ON) fetch_rss will first check the cache.
NOTES ON RETRIEVING REMOTE FILES:
If conditional gets are on (MAGPIE_CONDITIONAL_GET_ON) fetch_rss will
return a cached object, and touch the cache object upon receiving a
304.
NOTES ON FAILED REQUESTS:
If there is an HTTP error while fetching an RSS object, the cached
version will be returned, if it exists (and if MAGPIE_CACHE_FRESH_ONLY is off)
\*=======================================================================*/
define('MAGPIE_VERSION', '0.72');
$MAGPIE_ERROR = "";
function fetch_rss ($url) {
// initialize constants
init();
if ( !isset($url) ) {
error("fetch_rss called without a url");
return false;
}
// if cache is disabled
if ( !MAGPIE_CACHE_ON ) {
// fetch file, and parse it
$resp = _fetch_remote_file( $url );
if ( is_success( $resp->status ) ) {
return _response_to_rss( $resp );
}
else {
error("Failed to fetch $url and cache is off");
return false;
}
}
// else cache is ON
else {
// Flow
// 1. check cache
// 2. if there is a hit, make sure it's fresh
// 3. if cached obj fails freshness check, fetch remote
// 4. if remote fails, return stale object, or error
$cache = new RSSCache( MAGPIE_CACHE_DIR, MAGPIE_CACHE_AGE );
if (MAGPIE_DEBUG and $cache->ERROR) {
debug($cache->ERROR, E_USER_WARNING);
}
$cache_status = 0; // response of check_cache
$request_headers = array(); // HTTP headers to send with fetch
$rss = 0; // parsed RSS object
$errormsg = 0; // errors, if any
// store parsed XML by desired output encoding
// as character munging happens at parse time
$cache_key = $url . MAGPIE_OUTPUT_ENCODING;
if (!$cache->ERROR) {
// return cache HIT, MISS, or STALE
$cache_status = $cache->check_cache( $cache_key);
}
// if object cached, and cache is fresh, return cached obj
if ( $cache_status == 'HIT' ) {
$rss = $cache->get( $cache_key );
if ( isset($rss) and $rss ) {
// should be cache age
$rss->from_cache = 1;
if ( MAGPIE_DEBUG > 1) {
debug("MagpieRSS: Cache HIT", E_USER_NOTICE);
}
return $rss;
}
}
// else attempt a conditional get
// setup headers
if ( $cache_status == 'STALE' ) {
$rss = $cache->get( $cache_key );
if ( $rss and $rss->etag and $rss->last_modified ) {
$request_headers['If-None-Match'] = $rss->etag;
// the conditional GET date validator is If-Modified-Since (RFC 2616, 14.25)
$request_headers['If-Modified-Since'] = $rss->last_modified;
}
}
$resp = _fetch_remote_file( $url, $request_headers );
if (isset($resp) and $resp) {
if ($resp->status == '304' ) {
// we have the most current copy
if ( MAGPIE_DEBUG > 1) {
debug("Got 304 for $url");
}
// reset cache on 304 (at Minutillo's insistent prodding)
$cache->set($cache_key, $rss);
return $rss;
}
elseif ( is_success( $resp->status ) ) {
$rss = _response_to_rss( $resp );
if ( $rss ) {
if (MAGPIE_DEBUG > 1) {
debug("Fetch successful");
}
// add object to cache
$cache->set( $cache_key, $rss );
return $rss;
}
}
else {
$errormsg = "Failed to fetch $url ";
if ( $resp->status == '-100' ) {
$errormsg .= "(Request timed out after " . MAGPIE_FETCH_TIME_OUT . " seconds)";
}
elseif ( $resp->error ) {
# compensate for Snoopy's annoying habit of tacking
# on '\n'
$http_error = substr($resp->error, 0, -2);
$errormsg .= "(HTTP Error: $http_error)";
}
else {
$errormsg .= "(HTTP Response: " . $resp->response_code .')';
}
}
}
else {
$errormsg = "Unable to retrieve RSS file for unknown reasons.";
}
// else fetch failed
// attempt to return cached object, unless only fresh objects are wanted
if ( $rss and !MAGPIE_CACHE_FRESH_ONLY ) {
if ( MAGPIE_DEBUG ) {
debug("Returning STALE object for $url");
}
return $rss;
}
// else we totally failed
error( $errormsg );
return false;
} // end if ( !MAGPIE_CACHE_ON ) {
} // end fetch_rss()
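/*=======================================================================*\
Example (illustrative sketch, not part of the original source):
The conditional GET step in the flow above sends the validators saved from
the previous response. HTTP/1.1 names them If-None-Match (for the ETag) and
If-Modified-Since (for the Last-Modified date). The helper name and the
array standing in for a cached RSS object are hypothetical. Unlike
fetch_rss(), this sketch sends whichever validator is available rather
than requiring both.

```php
<?php
// Hypothetical helper: build conditional GET headers from a cached entry.
// $cached stands in for a cached RSS object with etag / last_modified.
function sketch_conditional_headers($cached) {
    $headers = array();
    if (!empty($cached['etag'])) {
        $headers['If-None-Match'] = $cached['etag'];
    }
    if (!empty($cached['last_modified'])) {
        $headers['If-Modified-Since'] = $cached['last_modified'];
    }
    return $headers;
}

$headers = sketch_conditional_headers(array(
    'etag'          => '"abc123"',
    'last_modified' => 'Fri, 28 Oct 2005 14:11:00 GMT',
));
print_r($headers);
```

A server that honours these headers answers 304 with an empty body when
the feed is unchanged, which is the case fetch_rss() handles by returning
the cached object.
\*=======================================================================*/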
/*=======================================================================*\
Function: error
Purpose: set MAGPIE_ERROR, and trigger error
\*=======================================================================*/
function error ($errormsg, $lvl=E_USER_WARNING) {
global $MAGPIE_ERROR;
// append PHP's error message if track_errors enabled
if ( isset($php_errormsg) ) {
$errormsg .= " ($php_errormsg)";
}
if ( $errormsg ) {
$errormsg = "MagpieRSS: $errormsg";
$MAGPIE_ERROR = $errormsg;
trigger_error( $errormsg, $lvl);
}
}
function debug ($debugmsg, $lvl=E_USER_NOTICE) {
trigger_error("MagpieRSS [debug] $debugmsg", $lvl);
}
/*=======================================================================*\
Function: magpie_error
Purpose: accessor for the magpie error variable
\*=======================================================================*/
function magpie_error ($errormsg="") {
global $MAGPIE_ERROR;
if ( isset($errormsg) and $errormsg ) {
$MAGPIE_ERROR = $errormsg;
}
return $MAGPIE_ERROR;
}
/*=======================================================================*\
Function: _fetch_remote_file
Purpose: retrieve an arbitrary remote file
Input: url of the remote file
headers to send along with the request (optional)
Output: an HTTP response object (see Snoopy.class.inc)
\*=======================================================================*/
function _fetch_remote_file ($url, $headers = "" ) {
// Snoopy is an HTTP client in PHP
$client = new Snoopy();
$client->agent = MAGPIE_USER_AGENT;
$client->read_timeout = MAGPIE_FETCH_TIME_OUT;
$client->use_gzip = MAGPIE_USE_GZIP;
if (is_array($headers) ) {
$client->rawheaders = $headers;
}
@$client->fetch($url);
return $client;
}
/*=======================================================================*\
Function: _response_to_rss
Purpose: parse an HTTP response object into an RSS object
Input: an HTTP response object (see Snoopy)
Output: parsed RSS object (see rss_parse)
\*=======================================================================*/
function _response_to_rss ($resp) {
$rss = new MagpieRSS( $resp->results, MAGPIE_OUTPUT_ENCODING, MAGPIE_INPUT_ENCODING, MAGPIE_DETECT_ENCODING );
// if RSS parsed successfully
if ( $rss and !$rss->ERROR) {
// find Etag, and Last-Modified
foreach($resp->headers as $h) {
// 2003-03-02 - Nicola Asuni (www.tecnick.com) - fixed bug "Undefined offset: 1"
if (strpos($h, ": ")) {
list($field, $val) = explode(": ", $h, 2);
}
else {
$field = $h;
$val = "";
}
if ( $field == 'ETag' ) {
$rss->etag = $val;
}
if ( $field == 'Last-Modified' ) {
$rss->last_modified = $val;
}
}
return $rss;
} // else construct error message
else {
$errormsg = "Failed to parse RSS file.";
if ($rss) {
$errormsg .= " (" . $rss->ERROR . ")";
}
error($errormsg);
return false;
} // end if ($rss and !$rss->error)
}
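/*=======================================================================*\
Example (illustrative sketch, not part of the original source):
The header loop above splits each raw line on the first ": ". A slightly
more defensive variant is sketched here, assuming the header lines may
still carry their trailing line ending and that field names can arrive in
any letter case. The helper name and the sample headers are hypothetical.

```php
<?php
// Hypothetical helper: split raw "Field: value" lines into a map.
function sketch_parse_headers($lines) {
    $out = array();
    foreach ($lines as $h) {
        $h = rtrim($h, "\r\n");
        if (strpos($h, ': ') === false) {
            continue; // status line or malformed header
        }
        list($field, $val) = explode(': ', $h, 2);
        // header field names are case-insensitive, so normalise them
        $out[strtolower($field)] = $val;
    }
    return $out;
}

$parsed = sketch_parse_headers(array(
    "HTTP/1.1 200 OK\r\n",
    "ETag: \"abc123\"\r\n",
    "Last-Modified: Fri, 28 Oct 2005 14:11:00 GMT\r\n",
));
print_r($parsed);
```

Limiting explode() to two parts keeps the colons inside the date value
intact.
\*=======================================================================*/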
/*=======================================================================*\
Function: init
Purpose: setup constants with default values
check for user overrides
\*=======================================================================*/
function init () {
if ( defined('MAGPIE_INITALIZED') ) {
return;
}
else {
define('MAGPIE_INITALIZED', true);
}
if ( !defined('MAGPIE_CACHE_ON') ) {
define('MAGPIE_CACHE_ON', true);
}
if ( !defined('MAGPIE_CACHE_DIR') ) {
define('MAGPIE_CACHE_DIR', './cache');
}
if ( !defined('MAGPIE_CACHE_AGE') ) {
define('MAGPIE_CACHE_AGE', 60*60); // one hour
}
if ( !defined('MAGPIE_CACHE_FRESH_ONLY') ) {
define('MAGPIE_CACHE_FRESH_ONLY', false);
}
if ( !defined('MAGPIE_OUTPUT_ENCODING') ) {
define('MAGPIE_OUTPUT_ENCODING', 'ISO-8859-1');
}
if ( !defined('MAGPIE_INPUT_ENCODING') ) {
define('MAGPIE_INPUT_ENCODING', null);
}
if ( !defined('MAGPIE_DETECT_ENCODING') ) {
define('MAGPIE_DETECT_ENCODING', true);
}
if ( !defined('MAGPIE_DEBUG') ) {
define('MAGPIE_DEBUG', 0);
}
if ( !defined('MAGPIE_USER_AGENT') ) {
$ua = 'MagpieRSS/'. MAGPIE_VERSION . ' (+http://magpierss.sf.net';
if ( MAGPIE_CACHE_ON ) {
$ua = $ua . ')';
}
else {
$ua = $ua . '; No cache)';
}
define('MAGPIE_USER_AGENT', $ua);
}
if ( !defined('MAGPIE_FETCH_TIME_OUT') ) {
define('MAGPIE_FETCH_TIME_OUT', 5); // 5 second timeout
}
// use gzip encoding to fetch rss files if supported?
if ( !defined('MAGPIE_USE_GZIP') ) {
define('MAGPIE_USE_GZIP', true);
}
}
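/*=======================================================================*\
Example (illustrative sketch, not part of the original source):
init() only defines a constant when the calling script has not defined it
first, which is how user overrides work. The pattern is shown here with
hypothetical SKETCH_ constants and a hypothetical helper so that it runs
without Magpie loaded.

```php
<?php
// Hypothetical helper mirroring the pattern used in init():
// define a constant only if the caller has not already done so.
function sketch_define_default($name, $value) {
    if (!defined($name)) {
        define($name, $value);
    }
}

// the calling script overrides one setting before initialization...
define('SKETCH_CACHE_AGE', 30 * 60);

// ...and the defaults fill in whatever is still undefined
sketch_define_default('SKETCH_CACHE_AGE', 60 * 60);
sketch_define_default('SKETCH_CACHE_DIR', './cache');

echo SKETCH_CACHE_AGE, ' ', SKETCH_CACHE_DIR, "\n";
```

With Magpie itself the same idea means calling define('MAGPIE_CACHE_AGE', ...)
in your script before the first call to fetch_rss().
\*=======================================================================*/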
// NOTE: the following code should really be in Snoopy, or at least
// somewhere other than rss_fetch!
/*=======================================================================*\
HTTP STATUS CODE PREDICATES
These functions attempt to classify an HTTP status code
based on RFC 2616 and RFC 2518.
All of them take an HTTP status code as input, and return true or false
All this code is adapted from LWP's HTTP::Status.
\*=======================================================================*/
/*=======================================================================*\
Function: is_info
Purpose: return true if Informational status code
\*=======================================================================*/
function is_info ($sc) {
return $sc >= 100 && $sc < 200;
}
/*=======================================================================*\
Function: is_success
Purpose: return true if Successful status code
\*=======================================================================*/
function is_success ($sc) {
return $sc >= 200 && $sc < 300;
}
/*=======================================================================*\
Function: is_redirect
Purpose: return true if Redirection status code
\*=======================================================================*/
function is_redirect ($sc) {
return $sc >= 300 && $sc < 400;
}
/*=======================================================================*\
Function: is_error
Purpose: return true if Error status code
\*=======================================================================*/
function is_error ($sc) {
return $sc >= 400 && $sc < 600;
}
/*=======================================================================*\
Function: is_client_error
Purpose: return true if Error status code, and it's a client error
\*=======================================================================*/
function is_client_error ($sc) {
return $sc >= 400 && $sc < 500;
}
/*=======================================================================*\
Function: is_server_error
Purpose: return true if Error status code, and it's a server error
\*=======================================================================*/
function is_server_error ($sc) {
return $sc >= 500 && $sc < 600;
}
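/*=======================================================================*\
Example (illustrative sketch, not part of the original source):
The predicates above partition status codes by range. A single classifier
over the same ranges is sketched below; the function name is hypothetical.
It also shows why fetch_rss() tests for 304 before calling is_success(),
and that the -100 timeout status falls outside every range.

```php
<?php
// Hypothetical classifier using the same ranges as the predicates.
function sketch_status_class($sc) {
    if ($sc >= 100 && $sc < 200) return 'info';
    if ($sc >= 200 && $sc < 300) return 'success';
    if ($sc >= 300 && $sc < 400) return 'redirect';
    if ($sc >= 400 && $sc < 500) return 'client_error';
    if ($sc >= 500 && $sc < 600) return 'server_error';
    return 'unknown';
}

// status codes are passed as numeric strings, as fetch_rss() compares them
foreach (array('200', '304', '404', '503', '-100') as $sc) {
    echo $sc, ' => ', sketch_status_class($sc), "\n";
}
```
\*=======================================================================*/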
?>
magpierss-0.72/rss_parse.inc 0100644 0000765 0000765 00000046270 10213344272 015450 0 ustar kellan kellan
<?php
/**
* @version 0.7a
* @license GPL
*
*/
define('RSS', 'RSS');
define('ATOM', 'Atom');
require_once (MAGPIE_DIR . 'rss_utils.inc');
/**
* Hybrid parser, and object, takes RSS as a string and returns a simple object.
*
* see: rss_fetch.inc for a simpler interface with integrated caching support
*
*/
class MagpieRSS {
var $parser;
var $current_item = array(); // item currently being parsed
var $items = array(); // collection of parsed items
var $channel = array(); // hash of channel fields
var $textinput = array();
var $image = array();
var $feed_type;
var $feed_version;
var $encoding = ''; // output encoding of parsed rss
var $_source_encoding = ''; // only set if we have to parse xml prolog
var $ERROR = "";
var $WARNING = "";
// define some constants
var $_CONTENT_CONSTRUCTS = array('content', 'summary', 'info', 'title', 'tagline', 'copyright');
var $_KNOWN_ENCODINGS = array('UTF-8', 'US-ASCII', 'ISO-8859-1');
// parser variables, useless if you're not a parser, treat as private
var $stack = array(); // parser stack
var $inchannel = false;
var $initem = false;
var $incontent = false; // if in Atom field
var $intextinput = false;
var $inimage = false;
var $current_namespace = false;
/**
 * Set up XML parser, parse source, and return populated RSS object.
*
* @param string $source string containing the RSS to be parsed
*
* NOTE: Probably a good idea to leave the encoding options alone unless
* you know what you're doing as PHP's character set support is
* a little weird.
*
* NOTE: A lot of this is unnecessary but harmless with PHP5
*
*
 * @param string $output_encoding output the parsed RSS in this character
 * set. Defaults to ISO-8859-1, as this is PHP's
 * default.
*
* NOTE: might be changed to UTF-8 in future
* versions.
*
* @param string $input_encoding the character set of the incoming RSS source.
* Leave blank and Magpie will try to figure it
* out.
*
*
* @param bool $detect_encoding if false Magpie won't attempt to detect
* source encoding. (caveat emptor)
*
*/
function MagpieRSS ($source, $output_encoding='ISO-8859-1',
$input_encoding=null, $detect_encoding=true)
{
# if PHP xml isn't compiled in, die
#
if (!function_exists('xml_parser_create')) {
$this->error( "Failed to load PHP's XML Extension. " .
"http://www.php.net/manual/en/ref.xml.php",
E_USER_ERROR );
}
list($parser, $source) = $this->create_parser($source,
$output_encoding, $input_encoding, $detect_encoding);
# PHP 8 returns an XMLParser object here rather than a resource
if ( !is_resource($parser) and !is_object($parser) ) {
$this->error( "Failed to create an instance of PHP's XML parser. " .
"http://www.php.net/manual/en/ref.xml.php",
E_USER_ERROR );
}
$this->parser = $parser;
# pass in parser, and a reference to this object
# setup handlers
#
xml_set_object( $this->parser, $this );
xml_set_element_handler($this->parser,
'feed_start_element', 'feed_end_element' );
xml_set_character_data_handler( $this->parser, 'feed_cdata' );
$status = xml_parse( $this->parser, $source );
if (! $status ) {
$errorcode = xml_get_error_code( $this->parser );
if ( $errorcode != XML_ERROR_NONE ) {
$xml_error = xml_error_string( $errorcode );
$error_line = xml_get_current_line_number($this->parser);
$error_col = xml_get_current_column_number($this->parser);
$errormsg = "$xml_error at line $error_line, column $error_col";
$this->error( $errormsg );
}
}
xml_parser_free( $this->parser );
$this->normalize();
}
function feed_start_element($p, $element, &$attrs) {
$el = $element = strtolower($element);
$attrs = array_change_key_case($attrs, CASE_LOWER);
// check for a namespace, and split if found
$ns = false;
if ( strpos( $element, ':' ) ) {
// explode() replaces split(), which was removed in PHP 7
list($ns, $el) = explode( ':', $element, 2);
}
if ( $ns and $ns != 'rdf' ) {
$this->current_namespace = $ns;
}
# if feed type isn't set, then this is first element of feed
# identify feed from root element
#
if (!isset($this->feed_type) ) {
if ( $el == 'rdf' ) {
$this->feed_type = RSS;
$this->feed_version = '1.0';
}
elseif ( $el == 'rss' ) {
$this->feed_type = RSS;
$this->feed_version = $attrs['version'];
}
elseif ( $el == 'feed' ) {
$this->feed_type = ATOM;
$this->feed_version = $attrs['version'];
$this->inchannel = true;
}
return;
}
if ( $el == 'channel' )
{
$this->inchannel = true;
}
elseif ($el == 'item' or $el == 'entry' )
{
$this->initem = true;
if ( isset($attrs['rdf:about']) ) {
$this->current_item['about'] = $attrs['rdf:about'];
}
}
// if we're in the default namespace of an RSS feed,
// record textinput or image fields
elseif (
$this->feed_type == RSS and
$this->current_namespace == '' and
$el == 'textinput' )
{
$this->intextinput = true;
}
elseif (
$this->feed_type == RSS and
$this->current_namespace == '' and
$el == 'image' )
{
$this->inimage = true;
}
# handle atom content constructs
elseif ( $this->feed_type == ATOM and in_array($el, $this->_CONTENT_CONSTRUCTS) )
{
// avoid clashing w/ RSS mod_content
if ($el == 'content' ) {
$el = 'atom_content';
}
$this->incontent = $el;
}
// if inside an Atom content construct (e.g. content or summary) field treat tags as text
elseif ($this->feed_type == ATOM and $this->incontent )
{
// if tags are inlined, then flatten
$attrs_str = join(' ',
array_map('map_attrs',
array_keys($attrs),
array_values($attrs) ) );
$this->append_content( "<$element $attrs_str>" );
array_unshift( $this->stack, $el );
}
// Atom supports many links per containing element.
// Magpie treats link elements of type rel='alternate'
// as being equivalent to RSS's simple link element.
//
elseif ($this->feed_type == ATOM and $el == 'link' )
{
if ( isset($attrs['rel']) and $attrs['rel'] == 'alternate' )
{
$link_el = 'link';
}
else {
$link_el = 'link_' . $attrs['rel'];
}
$this->append($link_el, $attrs['href']);
}
// set stack[0] to current element
else {
array_unshift($this->stack, $el);
}
}
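/*=======================================================================*\
Example (illustrative sketch, not part of the original source):
The namespace handling at the top of feed_start_element() can be tried in
isolation. The helper name is hypothetical. The upper-case input assumes
the XML parser's default case folding, which is why the element name is
lower-cased first.

```php
<?php
// Hypothetical helper: split a qualified element name into
// (namespace prefix, local name), lower-cased as the parser does.
function sketch_split_element($element) {
    $el = strtolower($element);
    $ns = false;
    if (strpos($el, ':')) {
        list($ns, $el) = explode(':', $el, 2);
    }
    return array($ns, $el);
}

print_r(sketch_split_element('DC:DATE'));
print_r(sketch_split_element('TITLE'));
```

A prefixed element such as dc:date ends up under $item['dc']['date'],
while an unprefixed one is stored directly under its own name.
\*=======================================================================*/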
function feed_cdata ($p, $text) {
if ($this->feed_type == ATOM and $this->incontent)
{
$this->append_content( $text );
}
else {
$current_el = join('_', array_reverse($this->stack));
$this->append($current_el, $text);
}
}
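/*=======================================================================*\
Example (illustrative sketch, not part of the original source):
feed_cdata() names a field by joining the parser stack from outermost to
innermost element. Elements are pushed on the front of the stack, so the
stack is reversed before joining. The element names below are made up for
illustration.

```php
<?php
$stack = array();
array_unshift($stack, 'author'); // an author element opens
array_unshift($stack, 'name');   // a name element opens inside it

// outermost first: array('author', 'name')
$field = join('_', array_reverse($stack));
echo $field, "\n";

array_shift($stack);             // the name element closes
```

Text found inside the nested name element would therefore be stored under
the key author_name.
\*=======================================================================*/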
function feed_end_element ($p, $el) {
$el = strtolower($el);
if ( $el == 'item' or $el == 'entry' )
{
$this->items[] = $this->current_item;
$this->current_item = array();
$this->initem = false;
}
elseif ($this->feed_type == RSS and $this->current_namespace == '' and $el == 'textinput' )
{
$this->intextinput = false;
}
elseif ($this->feed_type == RSS and $this->current_namespace == '' and $el == 'image' )
{
$this->inimage = false;
}
elseif ($this->feed_type == ATOM and in_array($el, $this->_CONTENT_CONSTRUCTS) )
{
$this->incontent = false;
}
elseif ($el == 'channel' or $el == 'feed' )
{
$this->inchannel = false;
}
elseif ($this->feed_type == ATOM and $this->incontent ) {
// balance tags properly
// note: i don't think this is actually necessary
if ( $this->stack[0] == $el )
{
$this->append_content("</$el>");
}
else {
$this->append_content("<$el />");
}
array_shift( $this->stack );
}
else {
array_shift( $this->stack );
}
$this->current_namespace = false;
}
function concat (&$str1, $str2="") {
if (!isset($str1) ) {
$str1="";
}
$str1 .= $str2;
}
function append_content($text) {
if ( $this->initem ) {
$this->concat( $this->current_item[ $this->incontent ], $text );
}
elseif ( $this->inchannel ) {
$this->concat( $this->channel[ $this->incontent ], $text );
}
}
// smart append - field and namespace aware
function append($el, $text) {
if (!$el) {
return;
}
if ( $this->current_namespace )
{
if ( $this->initem ) {
$this->concat(
$this->current_item[ $this->current_namespace ][ $el ], $text);
}
elseif ($this->inchannel) {
$this->concat(
$this->channel[ $this->current_namespace][ $el ], $text );
}
elseif ($this->intextinput) {
$this->concat(
$this->textinput[ $this->current_namespace][ $el ], $text );
}
elseif ($this->inimage) {
$this->concat(
$this->image[ $this->current_namespace ][ $el ], $text );
}
}
else {
if ( $this->initem ) {
$this->concat(
$this->current_item[ $el ], $text);
}
elseif ($this->intextinput) {
$this->concat(
$this->textinput[ $el ], $text );
}
elseif ($this->inimage) {
$this->concat(
$this->image[ $el ], $text );
}
elseif ($this->inchannel) {
$this->concat(
$this->channel[ $el ], $text );
}
}
}
function normalize () {
// if atom populate rss fields
if ( $this->is_atom() ) {
$this->channel['description'] = $this->channel['tagline'];
for ( $i = 0; $i < count($this->items); $i++) {
$item = $this->items[$i];
if ( isset($item['summary']) )
$item['description'] = $item['summary'];
if ( isset($item['atom_content']))
$item['content']['encoded'] = $item['atom_content'];
$atom_date = (isset($item['issued']) ) ? $item['issued'] : $item['modified'];
if ( $atom_date ) {
$epoch = @parse_w3cdtf($atom_date);
if ($epoch and $epoch > 0) {
$item['date_timestamp'] = $epoch;
}
}
$this->items[$i] = $item;
}
}
elseif ( $this->is_rss() ) {
$this->channel['tagline'] = $this->channel['description'];
for ( $i = 0; $i < count($this->items); $i++) {
$item = $this->items[$i];
if ( isset($item['description']))
$item['summary'] = $item['description'];
if ( isset($item['content']['encoded'] ) )
$item['atom_content'] = $item['content']['encoded'];
if ( $this->is_rss() == '1.0' and isset($item['dc']['date']) ) {
$epoch = @parse_w3cdtf($item['dc']['date']);
if ($epoch and $epoch > 0) {
$item['date_timestamp'] = $epoch;
}
}
elseif ( isset($item['pubdate']) ) {
$epoch = @strtotime($item['pubdate']);
if ($epoch > 0) {
$item['date_timestamp'] = $epoch;
}
}
$this->items[$i] = $item;
}
}
}
function is_rss () {
if ( $this->feed_type == RSS ) {
return $this->feed_version;
}
else {
return false;
}
}
function is_atom() {
if ( $this->feed_type == ATOM ) {
return $this->feed_version;
}
else {
return false;
}
}
/**
* return XML parser, and possibly re-encoded source
*
*/
function create_parser($source, $out_enc, $in_enc, $detect) {
// PHP 5 and later (comparing only the first digit misses PHP 7, 8, ...)
if ( version_compare(phpversion(), '5.0.0', '>=') ) {
$parser = $this->php5_create_parser($in_enc, $detect);
}
else {
list($parser, $source) = $this->php4_create_parser($source, $in_enc, $detect);
}
if ($out_enc) {
$this->encoding = $out_enc;
xml_parser_set_option($parser, XML_OPTION_TARGET_ENCODING, $out_enc);
}
return array($parser, $source);
}
/**
* Instantiate an XML parser under PHP5
*
* PHP5 will do a fine job of detecting input encoding
* if passed an empty string as the encoding.
*
* All hail libxml2!
*
*/
function php5_create_parser($in_enc, $detect) {
// by default php5 does a fine job of detecting input encodings
if(!$detect && $in_enc) {
return xml_parser_create($in_enc);
}
else {
return xml_parser_create('');
}
}
/**
 * Instantiate an XML parser under PHP4
*
* Unfortunately PHP4's support for character encodings
* and especially XML and character encodings sucks. As
* long as the documents you parse only contain characters
 * from the ISO-8859-1 character set (a superset of ASCII,
 * whose characters are all representable in UTF-8) you're fine. However once you
* step out of that comfy little world things get mad, bad,
* and dangerous to know.
*
* The following code is based on SJM's work with FoF
* @see http://minutillo.com/steve/weblog/2004/6/17/php-xml-and-character-encodings-a-tale-of-sadness-rage-and-data-loss
*
*/
function php4_create_parser($source, $in_enc, $detect) {
if ( !$detect ) {
return array(xml_parser_create($in_enc), $source);
}
if (!$in_enc) {
// look for an encoding declaration in the XML prolog
if (preg_match('/<\?xml.*encoding=[\'"](.*?)[\'"].*\?>/m', $source, $m)) {
$in_enc = strtoupper($m[1]);
$this->_source_encoding = $in_enc;
}
else {
$in_enc = 'UTF-8';
}
}
if ($this->known_encoding($in_enc)) {
return array(xml_parser_create($in_enc), $source);
}
// the detected encoding is not one of the simple encodings PHP knows
// attempt to use the iconv extension to
// cast the XML to a known encoding
// @see http://php.net/iconv
if (function_exists('iconv')) {
$encoded_source = iconv($in_enc,'UTF-8', $source);
if ($encoded_source) {
return array(xml_parser_create('UTF-8'), $encoded_source);
}
}
// iconv didn't work, try mb_convert_encoding
// @see http://php.net/mbstring
if(function_exists('mb_convert_encoding')) {
$encoded_source = mb_convert_encoding($source, 'UTF-8', $in_enc );
if ($encoded_source) {
return array(xml_parser_create('UTF-8'), $encoded_source);
}
}
// else
$this->error("Feed is in an unsupported character encoding. ($in_enc) " .
"You may see strange artifacts, and mangled characters.",
E_USER_NOTICE);
return array(xml_parser_create(), $source);
}
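/*=======================================================================*\
Example (illustrative sketch, not part of the original source):
Detecting the source encoding comes down to reading the encoding
declaration from the XML prolog and falling back to UTF-8 when there is
none. The helper name and the pattern are assumptions made for this
sketch, not the library's own code.

```php
<?php
// Hypothetical helper: read the declared encoding from an XML prolog,
// falling back to UTF-8 when no encoding is declared.
function sketch_sniff_encoding($source) {
    $pat = '/^\s*<\?xml[^>]*encoding=[\'"]([^\'"]+)[\'"]/';
    if (preg_match($pat, $source, $m)) {
        return strtoupper($m[1]);
    }
    return 'UTF-8';
}

echo sketch_sniff_encoding('<?xml version="1.0" encoding="iso-8859-2"?><rss/>'), "\n";
echo sketch_sniff_encoding('<rss/>'), "\n";
```

An encoding outside UTF-8, US-ASCII and ISO-8859-1 is the case where the
method above converts the source with iconv or mbstring before parsing.
\*=======================================================================*/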
function known_encoding($enc) {
$enc = strtoupper($enc);
if ( in_array($enc, $this->_KNOWN_ENCODINGS) ) {
return $enc;
}
else {
return false;
}
}
function error ($errormsg, $lvl=E_USER_WARNING) {
// append PHP's error message if track_errors enabled
if ( isset($php_errormsg) ) {
$errormsg .= " ($php_errormsg)";
}
if ( MAGPIE_DEBUG ) {
trigger_error( $errormsg, $lvl);
}
else {
error_log( $errormsg, 0);
}
$notices = E_USER_NOTICE|E_NOTICE;
if ( $lvl&$notices ) {
$this->WARNING = $errormsg;
} else {
$this->ERROR = $errormsg;
}
}
} // end class MagpieRSS
function map_attrs($k, $v) {
return "$k=\"$v\"";
}
// patch to support medieval versions of PHP4.1.x,
// courtesy, Ryan Currie, ryan@digibliss.com
if (!function_exists('array_change_key_case')) {
define("CASE_UPPER",1);
define("CASE_LOWER",0);
function array_change_key_case($array,$case=CASE_LOWER) {
// compare (==) rather than assign (=), and name the callbacks as strings
if ($case == CASE_UPPER) $cmd = 'strtoupper';
else $cmd = 'strtolower';
$output = array();
foreach($array as $key=>$value) {
$output[$cmd($key)]=$value;
}
return $output;
}
}
?>
magpierss-0.72/rss_utils.inc 0100644 0000765 0000765 00000004010 10157110566 015464 0 ustar kellan kellan
<?php
/*
* Version: 0.51
* License: GPL
*
 * The latest version of MagpieRSS can be obtained from:
* http://magpierss.sourceforge.net
*
* For questions, help, comments, discussion, etc., please join the
* Magpie mailing list:
* magpierss-general@lists.sourceforge.net
*/
/*======================================================================*\
Function: parse_w3cdtf
Purpose: parse a W3CDTF date into unix epoch
NOTE: http://www.w3.org/TR/NOTE-datetime
\*======================================================================*/
function parse_w3cdtf ( $date_str ) {
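# regex to match w3cdtf
$pat = "/(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2})(:(\d{2}))?(?:([-+])(\d{2}):?(\d{2})|(Z))?/";
if ( preg_match( $pat, $date_str, $match ) ) {
# group 6 is ":SS" including the colon; group 7 holds the bare (optional) seconds
list( $year, $month, $day, $hours, $minutes ) =
array( $match[1], $match[2], $match[3], $match[4], $match[5] );
$seconds = ( isset($match[7]) and $match[7] !== '' ) ? $match[7] : 0;
# calc epoch for current date assuming GMT
$epoch = gmmktime( $hours, $minutes, $seconds, $month, $day, $year);
$offset = 0;
# group 11 captures the literal "Z"
if ( isset($match[11]) and $match[11] == 'Z' ) {
# zulu time, aka GMT
}
else {
# groups 8-10 are unset when the date carries no zone designator
list( $tz_mod, $tz_hour, $tz_min ) =
array( isset($match[8]) ? $match[8] : '',
isset($match[9]) ? $match[9] : 0,
isset($match[10]) ? $match[10] : 0 );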
# zero out the variables
if ( ! $tz_hour ) { $tz_hour = 0; }
if ( ! $tz_min ) { $tz_min = 0; }
$offset_secs = (($tz_hour*60)+$tz_min)*60;
# is timezone ahead of GMT? then subtract offset
#
if ( $tz_mod == '+' ) {
$offset_secs = $offset_secs * -1;
}
$offset = $offset_secs;
}
$epoch = $epoch + $offset;
return $epoch;
}
else {
return -1;
}
}
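/*=======================================================================*\
Example (illustrative sketch, not part of the original source):
The offset arithmetic can be checked with a small stand-alone version:
read the wall-clock fields as if they were GMT, then subtract the offset
for zones ahead of GMT (or add it for zones behind). The function name
and the simplified pattern are assumptions made for this sketch.

```php
<?php
// Hypothetical re-implementation for illustration only.
function sketch_w3cdtf_to_epoch($date_str) {
    $pat = '/(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2})(?::(\d{2}))?(?:([-+])(\d{2}):?(\d{2})|Z)?/';
    if (!preg_match($pat, $date_str, $m)) {
        return -1;
    }
    // seconds are optional in W3CDTF
    $seconds = (isset($m[6]) && $m[6] !== '') ? (int) $m[6] : 0;
    // interpret the wall-clock fields as if they were GMT...
    $epoch = gmmktime((int) $m[4], (int) $m[5], $seconds,
                      (int) $m[2], (int) $m[3], (int) $m[1]);
    // ...then undo the zone offset: "+02:00" is two hours ahead of GMT
    if (isset($m[7]) && $m[7] !== '') {
        $offset = ((int) $m[8] * 60 + (int) $m[9]) * 60;
        $epoch += ($m[7] == '+') ? -$offset : $offset;
    }
    return $epoch;
}

$a = sketch_w3cdtf_to_epoch('2005-10-28T14:11:00+02:00');
$b = sketch_w3cdtf_to_epoch('2005-10-28T12:11:00Z');
var_dump($a === $b);
```

Both strings name the same instant, so the two results are equal.
\*=======================================================================*/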
?>
magpierss-0.72/scripts/ 0040755 0000765 0000765 00000000000 10333221327 014433 5 ustar kellan kellan magpierss-0.72/scripts/magpie_debug.php 0100755 0000765 0000765 00000003546 10333220302 017554 0 ustar kellan kellan Example Output";
echo "Channel: " . $rss->channel['title'] . "
Error: PHP compiled without XML support (--with-xml), Magpie won't work without PHP support for XML. \n";
exit;
}
else {
echo "OK: Found an XML parser. \n";
}
if ( ! function_exists('gzinflate') ) {
echo "Warning: PHP compiled without Zlib support (--with-zlib). No support for GZIP encoding. \n";
}
else {
echo "OK: Support for GZIP encoding. \n";
}
if ( ! (function_exists('iconv') or function_exists('mb_convert_encoding') ) ) {
echo "Warning: No support for iconv (--with-iconv) or multi-byte strings (--enable-mbstring). " .
"No support for character set munging. \n";
}
else {
echo "OK: Support for character munging. \n";
}
}
?>
magpierss-0.72/scripts/magpie_simple.php 0100755 0000765 0000765 00000001453 10333220742 017762 0 ustar kellan kellan channel['title'] . "
This is a simple example script. If this were a real script we probably wouldn't allow strangers to submit random URLs, and we certainly wouldn't simply echo anything passed in the URL. Additionally, it's a bad idea to leave this example script lying around.