pax_global_header 0000666 0000000 0000000 00000000064 12671525356 0014526 g ustar 00root root 0000000 0000000 52 comment=4fb2eb5365cbc0fd2e0c26ca748777d6c2539763
php-mf2-0.3.0/ 0000775 0000000 0000000 00000000000 12671525356 0012777 5 ustar 00root root 0000000 0000000 php-mf2-0.3.0/.gitignore 0000664 0000000 0000000 00000000102 12671525356 0014760 0 ustar 00root root 0000000 0000000 .DS_Store
/nbproject
composer.phar
/vendor/
/tmp
.idea/
/bin/test
php-mf2-0.3.0/.travis.yml 0000664 0000000 0000000 00000000127 12671525356 0015110 0 ustar 00root root 0000000 0000000 language: php
php:
- 5.4
- 5.5
- 5.6
- nightly
before_script: composer install
php-mf2-0.3.0/LICENSE.md 0000664 0000000 0000000 00000015574 12671525356 0014417 0 ustar 00root root 0000000 0000000 # Creative Commons Legal Code
## CC0 1.0 Universal
http://creativecommons.org/publicdomain/zero/1.0
Official translations of this legal tool are available

> CREATIVE COMMONS CORPORATION IS NOT A LAW FIRM AND DOES NOT PROVIDE LEGAL SERVICES. DISTRIBUTION OF THIS DOCUMENT DOES NOT CREATE AN ATTORNEY-CLIENT RELATIONSHIP. CREATIVE COMMONS PROVIDES THIS INFORMATION ON AN "AS-IS" BASIS. CREATIVE COMMONS MAKES NO WARRANTIES REGARDING THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED HEREUNDER, AND DISCLAIMS LIABILITY FOR DAMAGES RESULTING FROM THE USE OF THIS DOCUMENT OR THE INFORMATION OR WORKS PROVIDED HEREUNDER.
### _Statement of Purpose_
The laws of most jurisdictions throughout the world automatically confer exclusive Copyright and Related Rights (defined below) upon the creator and subsequent owner(s) (each and all, an "owner") of an original work of authorship and/or a database (each, a "Work").
Certain owners wish to permanently relinquish those rights to a Work for the purpose of contributing to a commons of creative, cultural and scientific works ("Commons") that the public can reliably and without fear of later claims of infringement build upon, modify, incorporate in other works, reuse and redistribute as freely as possible in any form whatsoever and for any purposes, including without limitation commercial purposes. These owners may contribute to the Commons to promote the ideal of a free culture and the further production of creative, cultural and scientific works, or to gain reputation or greater distribution for their Work in part through the use and efforts of others.
For these and/or other purposes and motivations, and without any expectation of additional consideration or compensation, the person associating CC0 with a Work (the "Affirmer"), to the extent that he or she is an owner of Copyright and Related Rights in the Work, voluntarily elects to apply CC0 to the Work and publicly distribute the Work under its terms, with knowledge of his or her Copyright and Related Rights in the Work and the meaning and intended legal effect of CC0 on those rights.
**1. Copyright and Related Rights.** A Work made available under CC0 may be protected by copyright and related or neighboring rights ("Copyright and Related Rights"). Copyright and Related Rights include, but are not limited to, the following:
1. the right to reproduce, adapt, distribute, perform, display, communicate, and translate a Work;
2. moral rights retained by the original author(s) and/or performer(s);
3. publicity and privacy rights pertaining to a person's image or likeness depicted in a Work;
4. rights protecting against unfair competition in regards to a Work, subject to the limitations in paragraph 4(a), below;
5. rights protecting the extraction, dissemination, use and reuse of data in a Work;
6. database rights (such as those arising under Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases, and under any national implementation thereof, including any amended or successor version of such directive); and
7. other similar, equivalent or corresponding rights throughout the world based on applicable law or treaty, and any national implementations thereof.
**2. Waiver.** To the greatest extent permitted by, but not in contravention of, applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and unconditionally waives, abandons, and surrenders all of Affirmer's Copyright and Related Rights and associated claims and causes of action, whether now known or unknown (including existing as well as future claims and causes of action), in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes the Waiver for the benefit of each member of the public at large and to the detriment of Affirmer's heirs and successors, fully intending that such Waiver shall not be subject to revocation, rescission, cancellation, termination, or any other legal or equitable action to disrupt the quiet enjoyment of the Work by the public as contemplated by Affirmer's express Statement of Purpose.
**3. Public License Fallback.** Should any part of the Waiver for any reason be judged legally invalid or ineffective under applicable law, then the Waiver shall be preserved to the maximum extent permitted taking into account Affirmer's express Statement of Purpose. In addition, to the extent the Waiver is so judged Affirmer hereby grants to each affected person a royalty-free, non transferable, non sublicensable, non exclusive, irrevocable and unconditional license to exercise Affirmer's Copyright and Related Rights in the Work (i) in all territories worldwide, (ii) for the maximum duration provided by applicable law or treaty (including future time extensions), (iii) in any current or future medium and for any number of copies, and (iv) for any purpose whatsoever, including without limitation commercial, advertising or promotional purposes (the "License"). The License shall be deemed effective as of the date CC0 was applied by Affirmer to the Work. Should any part of the License for any reason be judged legally invalid or ineffective under applicable law, such partial invalidity or ineffectiveness shall not invalidate the remainder of the License, and in such case Affirmer hereby affirms that he or she will not (i) exercise any of his or her remaining Copyright and Related Rights in the Work or (ii) assert any associated claims and causes of action with respect to the Work, in either case contrary to Affirmer's express Statement of Purpose.
**4. Limitations and Disclaimers.**
1. No trademark or patent rights held by Affirmer are waived, abandoned, surrendered, licensed or otherwise affected by this document.
2. Affirmer offers the Work as-is and makes no representations or warranties of any kind concerning the Work, express, implied, statutory or otherwise, including without limitation warranties of title, merchantability, fitness for a particular purpose, non infringement, or the absence of latent or other defects, accuracy, or the present or absence of errors, whether or not discoverable, all to the greatest extent permissible under applicable law.
3. Affirmer disclaims responsibility for clearing rights of other persons that may apply to the Work or any use thereof, including without limitation any person's Copyright and Related Rights in the Work. Further, Affirmer disclaims responsibility for obtaining any necessary consents, permissions or other rights required for any use of the Work.
4. Affirmer understands and acknowledges that Creative Commons is not a party to this document and has no duty or obligation with respect to this CC0 or use of the Work.
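php-mf2-0.3.0/Mf2/ 0000775 0000000 0000000 00000000000 12671525356 0013423 5 ustar 00root root 0000000 0000000 php-mf2-0.3.0/Mf2/Parser.php 0000664 0000000 0000000 00000135340 12671525356 0015376 0 ustar 00root root 0000000 0000000 <?php
namespace Mf2;
use DOMDocument;
use DOMElement;
use DOMXPath;
use Exception;
use SplObjectStorage;
/**
* Parse Microformats2
*
* Functional shortcut for the most common case of parsing microformats2 from HTML.
*
* Example usage:
*
* use Mf2;
* $output = Mf2\parse('<span class="h-card">Barnaby Walters</span>');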
* echo json_encode($output, JSON_PRETTY_PRINT);
*
* Produces:
*
* {
* "items": [
* {
* "type": ["h-card"],
* "properties": {
* "name": ["Barnaby Walters"]
* }
* }
* ],
* "rels": {}
* }
*
* @param string|DOMDocument $input The HTML string or DOMDocument object to parse
* @param string $url The URL the input document was found at, for relative URL resolution
* @param bool $convertClassic whether or not to convert classic microformats
* @return array Canonical MF2 array structure
*/
function parse($input, $url = null, $convertClassic = true) {
$parser = new Parser($input, $url);
return $parser->parse($convertClassic);
}
/**
* Fetch microformats2
*
* Given a URL, fetches it (following up to 5 redirects) and, if the content-type appears to be HTML, returns the parsed
* microformats2 array structure.
*
* Note that even if the response code is a 4XX or 5XX error, the body is still parsed when the content-type is HTML-like,
* as there are legitimate cases where error pages might contain useful microformats (for example a deleted
* h-entry resulting in a 410 Gone page with a stub h-entry explaining the reason for deletion). Look in $curlInfo['http_code']
* for the actual status code.
*
* @param string $url The URL to fetch
* @param bool $convertClassic (optional, default true) whether or not to convert classic microformats
* @param &array $curlInfo (optional) the results of curl_getinfo will be placed in this variable for debugging
* @return array|null canonical microformats2 array structure on success, null on failure
*/
function fetch($url, $convertClassic = true, &$curlInfo=null) {
$ch = curl_init();
curl_setopt($ch, CURLOPT_URL, $url);
curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
curl_setopt($ch, CURLOPT_HEADER, 0);
curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 1);
curl_setopt($ch, CURLOPT_MAXREDIRS, 5);
$html = curl_exec($ch);
$info = $curlInfo = curl_getinfo($ch);
curl_close($ch);
if (strpos(strtolower($info['content_type']), 'html') === false) {
// The content was not delivered as HTML, do not attempt to parse it.
return null;
}
# ensure the final URL is used to resolve relative URLs
$url = $info['url'];
return parse($html, $url, $convertClassic);
}
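/*
The content-type gate in fetch() can be tried in isolation. What follows is a minimal standalone sketch, not part of the library: sketchLooksLikeHtml() is a hypothetical name, and the non-string guard is an assumption added here for responses that report no content type at all.

```php
<?php
// Hypothetical helper mirroring the substring check used in fetch().
function sketchLooksLikeHtml($contentType) {
    if (!is_string($contentType)) {
        return false; // no Content-Type was reported for the response
    }
    // Any media type containing "html" counts, e.g. application/xhtml+xml.
    return strpos(strtolower($contentType), 'html') !== false;
}

$isHtml = sketchLooksLikeHtml('text/html; charset=utf-8');
$isJson = sketchLooksLikeHtml('application/json');
```

Because the check is a plain substring match, it is deliberately permissive; a stricter caller could parse the media type before deciding.
*/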
/**
* Unicode to HTML Entities
* @param string $input String containing characters to convert into HTML entities
* @return string
*/
function unicodeToHtmlEntities($input) {
return mb_convert_encoding($input, 'HTML-ENTITIES', mb_detect_encoding($input));
}
/**
* Collapse Whitespace
*
* Collapses any sequences of whitespace within a string into a single space
* character.
*
* @deprecated since v0.2.3
* @param string $str
* @return string
*/
function collapseWhitespace($str) {
// \s already covers newlines; the old class [\s|\n] also swallowed literal "|" characters.
return preg_replace('/\s+/', ' ', $str);
}
function unicodeTrim($str) {
// this is cheating. TODO: find a better way if this causes any problems
$str = str_replace(mb_convert_encoding('&nbsp;', 'UTF-8', 'HTML-ENTITIES'), ' ', $str);
$str = preg_replace('/^\s+/', '', $str);
return preg_replace('/\s+$/', '', $str);
}
/**
* Microformat Name From Class string
*
* Given the value of @class, get the relevant mf classnames (e.g. h-card,
* p-name).
*
* @param string $class A space delimited list of classnames
* @param string $prefix The prefix to look for
* @return array The matching classnames (full names for the h- prefix, otherwise with the prefix stripped); empty if none are found
*/
function mfNamesFromClass($class, $prefix='h-') {
$class = str_replace(array(' ', "\t", "\n"), ' ', $class);
$classes = explode(' ', $class);
$matches = array();
foreach ($classes as $classname) {
$compare_classname = ' ' . $classname;
$compare_prefix = ' ' . $prefix;
if (strstr($compare_classname, $compare_prefix) !== false && ($compare_classname != $compare_prefix)) {
$matches[] = ($prefix === 'h-') ? $classname : substr($classname, strlen($prefix));
}
}
return $matches;
}
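/*
A standalone sketch of the prefix matching above, for experimenting outside the library. sketchNamesFromClass() is a hypothetical name, and splitting on any run of whitespace with preg_split() is an assumption of this sketch rather than what mfNamesFromClass() itself does.

```php
<?php
// Hypothetical helper: collect classnames that start with $prefix.
// Root (h-) names are kept whole; property names lose their prefix.
function sketchNamesFromClass($class, $prefix = 'h-') {
    $matches = array();
    foreach (preg_split('/\s+/', trim($class), -1, PREG_SPLIT_NO_EMPTY) as $classname) {
        // A bare prefix such as "p-" names nothing, so skip it.
        if (strpos($classname, $prefix) === 0 && $classname !== $prefix) {
            $matches[] = ($prefix === 'h-') ? $classname : substr($classname, strlen($prefix));
        }
    }
    return $matches;
}

$roots = sketchNamesFromClass('h-card vcard p-author');
$props = sketchNamesFromClass('h-card vcard p-author', 'p-');
```

Here $roots holds the full root name and $props the bare property name, which is the asymmetry the original function encodes for the h- prefix.
*/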
/**
* Get Nested µf Property Name From Class
*
* Returns all the p-, u-, dt- or e- prefixed classnames it finds in a
* space-separated string.
*
* @param string $class
* @return array
*/
function nestedMfPropertyNamesFromClass($class) {
$prefixes = array('p-', 'u-', 'dt-', 'e-');
$propertyNames = array();
$class = str_replace(array(' ', "\t", "\n"), ' ', $class);
foreach (explode(' ', $class) as $classname) {
foreach ($prefixes as $prefix) {
// Check if $classname is a valid property classname for $prefix.
if (mb_substr($classname, 0, mb_strlen($prefix)) == $prefix && $classname != $prefix) {
$propertyName = mb_substr($classname, mb_strlen($prefix));
$propertyNames[$propertyName][] = $prefix;
}
}
}
foreach ($propertyNames as $property => $prefixes) {
$propertyNames[$property] = array_unique($prefixes);
}
return $propertyNames;
}
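/*
The shape of the returned map (property name => list of prefixes) is easiest to see on a small input. This is a standalone sketch under stated assumptions: sketchPropertyPrefixes() is a hypothetical name, and it uses a single regular expression instead of the nested loops above.

```php
<?php
// Hypothetical helper: map each property name to the prefixes it was found under.
function sketchPropertyPrefixes($class) {
    $found = array();
    foreach (preg_split('/\s+/', trim($class), -1, PREG_SPLIT_NO_EMPTY) as $name) {
        // The (.+) part rejects bare prefixes such as "u-".
        if (preg_match('/^(p|u|dt|e)-(.+)$/', $name, $m)) {
            $found[$m[2]][] = $m[1] . '-';
        }
    }
    // Drop repeated prefixes per property while keeping the property keys.
    return array_map('array_unique', $found);
}

$map = sketchPropertyPrefixes('h-card p-author u-author p-author');
```

A class such as "p-author u-author" therefore yields one property with two prefixes, which is the conflicting-prefix case that parseH() has to choose between.
*/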
/**
* Wraps mfNamesFromClass to handle an element as input (common)
*
* @param DOMElement $e The element to get the classname for
* @param string $prefix The prefix to look for
* @return array See return value of mfNamesFromClass()
*/
function mfNamesFromElement(\DOMElement $e, $prefix = 'h-') {
$class = $e->getAttribute('class');
return mfNamesFromClass($class, $prefix);
}
/**
* Wraps nestedMfPropertyNamesFromClass to handle an element as input
*/
function nestedMfPropertyNamesFromElement(\DOMElement $e) {
$class = $e->getAttribute('class');
return nestedMfPropertyNamesFromClass($class);
}
/**
* Converts various time formats to HH:MM
* @param string $time The time to convert
* @return string
*/
function convertTimeFormat($time) {
$hh = $mm = $ss = '';
preg_match('/(\d{1,2}):?(\d{2})?:?(\d{2})?(a\.?m\.?|p\.?m\.?)?/i', $time, $matches);
// If no am/pm is specified:
if (empty($matches[4])) {
return $time;
} else {
// Otherwise, am/pm is specified.
$meridiem = strtolower(str_replace('.', '', $matches[4]));
// Hours.
$hh = $matches[1];
// Add 12 to hours if pm applies.
if ($meridiem == 'pm' && ($hh < 12)) {
$hh += 12;
}
$hh = str_pad($hh, 2, '0', STR_PAD_LEFT);
// Minutes.
$mm = (empty($matches[2]) ) ? '00' : $matches[2];
// Seconds, only if supplied.
if (!empty($matches[3])) {
$ss = $matches[3];
}
if (empty($ss)) {
return sprintf('%s:%s', $hh, $mm);
}
else {
return sprintf('%s:%s:%s', $hh, $mm, $ss);
}
}
}
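/*
The am/pm handling can be explored with a standalone sketch. sketchTo24Hour() is a hypothetical name, the anchored pattern is stricter than the one used above, and mapping 12am to 00 is an assumption of this sketch that convertTimeFormat() does not make.

```php
<?php
// Hypothetical helper: convert "9pm", "9:30 p.m." or "12:15:30am" to 24-hour form.
function sketchTo24Hour($time) {
    $pattern = '/^(\d{1,2})(?::(\d{2}))?(?::(\d{2}))?\s*(a\.?m\.?|p\.?m\.?)$/i';
    if (!preg_match($pattern, trim($time), $m)) {
        return $time; // no am/pm marker, so leave the value untouched
    }
    $meridiem = strtolower(str_replace('.', '', $m[4]));
    $hh = (int) $m[1];
    if ($meridiem === 'pm' && $hh < 12) {
        $hh += 12;
    } elseif ($meridiem === 'am' && $hh === 12) {
        $hh = 0; // assumption of this sketch: 12am means midnight
    }
    // Minutes default to 00; seconds are appended only when supplied.
    $out = sprintf('%02d:%s', $hh, $m[2] === '' ? '00' : $m[2]);
    return $m[3] === '' ? $out : $out . ':' . $m[3];
}

$evening = sketchTo24Hour('9pm');
$untouched = sketchTo24Hour('14:30');
```

Values without a meridiem pass through unchanged, matching the early return in the function above.
*/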
function applySrcsetUrlTransformation($srcset, $transformation) {
return implode(', ', array_filter(array_map(function ($srcsetPart) use ($transformation) {
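// Split the candidate into URL and descriptor on the first run of whitespace.
// (explode() treats its first argument as one literal delimiter, not a character list.)
$parts = preg_split('/\s+/', trim($srcsetPart), 2);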
$parts[0] = rtrim($parts[0]);
if (empty($parts[0])) { return false; }
$parts[0] = call_user_func($transformation, $parts[0]);
return $parts[0] . (empty($parts[1]) ? '' : ' ' . $parts[1]);
}, explode(',', trim($srcset)))));
}
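/*
A standalone sketch of the srcset rewriting idea: split on commas, split each candidate into URL and descriptor, transform only the URL. sketchMapSrcset() is a hypothetical name, and the naive comma split is an assumption that breaks for URLs which themselves contain commas.

```php
<?php
// Hypothetical helper: apply $transform to the URL of each srcset candidate.
function sketchMapSrcset($srcset, $transform) {
    $out = array();
    foreach (explode(',', $srcset) as $candidate) {
        // Split "url descriptor" on the first run of whitespace.
        $parts = preg_split('/\s+/', trim($candidate), 2);
        if ($parts[0] === '') {
            continue; // empty candidate, e.g. from a trailing comma
        }
        $url = call_user_func($transform, $parts[0]);
        $out[] = isset($parts[1]) ? $url . ' ' . $parts[1] : $url;
    }
    return implode(', ', $out);
}

$absolute = sketchMapSrcset('small.jpg 1x, large.jpg 2x,', function ($url) {
    return 'http://example.com/' . $url;
});
```

The descriptors ("1x", "2x") are carried over untouched, so only the URL part of each candidate changes.
*/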
/**
* Microformats2 Parser
*
* A class which holds state for parsing microformats2 from HTML.
*
* Example usage:
*
* use Mf2;
* $parser = new Mf2\Parser('<p class="h-card">Barnaby Walters</p>');
* $output = $parser->parse();
*/
class Parser {
/** @var string The baseurl (if any) to use for this parse */
public $baseurl;
/** @var DOMXPath object which can be used to query over any fragment*/
public $xpath;
/** @var DOMDocument */
public $doc;
/** @var SplObjectStorage */
protected $parsed;
public $jsonMode;
/**
* Constructor
*
* @param DOMDocument|string $input The data to parse. A string of HTML or a DOMDocument
* @param string $url The URL of the parsed document, for relative URL resolution
* @param boolean $jsonMode Whether or not to use a stdClass instance for an empty `rels` dictionary. This breaks PHP looping over rels, but allows the output to be correctly serialized as JSON.
*/
public function __construct($input, $url = null, $jsonMode = false) {
libxml_use_internal_errors(true);
if (is_string($input)) {
$doc = new DOMDocument();
@$doc->loadHTML(unicodeToHtmlEntities($input));
} elseif (is_a($input, 'DOMDocument')) {
$doc = $input;
} else {
$doc = new DOMDocument();
@$doc->loadHTML('');
}
$this->xpath = new DOMXPath($doc);
$baseurl = $url;
foreach ($this->xpath->query('//base[@href]') as $base) {
$baseElementUrl = $base->getAttribute('href');
if (parse_url($baseElementUrl, PHP_URL_SCHEME) === null) {
// The base element URL is relative, so resolve it against the document URL.
$baseurl = resolveUrl($url, $baseElementUrl);
} else {
$baseurl = $baseElementUrl;
}
break;
}
// Ignore <template> elements as per the HTML5 spec
foreach ($this->xpath->query('//template') as $templateEl) {
$templateEl->parentNode->removeChild($templateEl);
}
$this->baseurl = $baseurl;
$this->doc = $doc;
$this->parsed = new SplObjectStorage();
$this->jsonMode = $jsonMode;
}
private function elementPrefixParsed(\DOMElement $e, $prefix) {
if (!$this->parsed->contains($e))
$this->parsed->attach($e, array());
$prefixes = $this->parsed[$e];
$prefixes[] = $prefix;
$this->parsed[$e] = $prefixes;
}
private function isElementParsed(\DOMElement $e, $prefix) {
if (!$this->parsed->contains($e))
return false;
$prefixes = $this->parsed[$e];
if (!in_array($prefix, $prefixes))
return false;
return true;
}
private function resolveChildUrls(DOMElement $el) {
$hyperlinkChildren = $this->xpath->query('.//*[@src or @href or @srcset or @data]', $el);
foreach ($hyperlinkChildren as $child) {
if ($child->hasAttribute('href'))
$child->setAttribute('href', $this->resolveUrl($child->getAttribute('href')));
if ($child->hasAttribute('src'))
$child->setAttribute('src', $this->resolveUrl($child->getAttribute('src')));
if ($child->hasAttribute('srcset'))
$child->setAttribute('srcset', applySrcsetUrlTransformation($child->getAttribute('srcset'), [$this, 'resolveUrl']));
if ($child->hasAttribute('data'))
$child->setAttribute('data', $this->resolveUrl($child->getAttribute('data')));
}
}
public function textContent(DOMElement $el) {
$excludeTags = array('noframes', 'noscript', 'script', 'style', 'frame', 'frameset');
if (isset($el->tagName) and in_array(strtolower($el->tagName), $excludeTags)) {
return '';
}
$this->resolveChildUrls($el);
$clonedEl = $el->cloneNode(true);
foreach ($this->xpath->query('.//img', $clonedEl) as $imgEl) {
$newNode = $this->doc->createTextNode($imgEl->getAttribute($imgEl->hasAttribute('alt') ? 'alt' : 'src'));
$imgEl->parentNode->replaceChild($newNode, $imgEl);
}
foreach ($excludeTags as $tagName) {
foreach ($this->xpath->query(".//{$tagName}", $clonedEl) as $elToRemove) {
$elToRemove->parentNode->removeChild($elToRemove);
}
}
return $this->innerText($clonedEl);
}
/**
* This method attempts to return a better 'innerText' representation than DOMNode::textContent
*
* @param DOMElement|DOMText $el
* @param bool $implied when parsing for implied name for h-*, rules may be slightly different
* @see: https://github.com/glennjones/microformat-shiv/blob/dev/lib/text.js
*/
public function innerText($el, $implied=false) {
$out = '';
$blockLevelTags = array('h1', 'h2', 'h3', 'h4', 'h5', 'h6', 'p', 'hr', 'pre', 'table',
'address', 'article', 'aside', 'blockquote', 'caption', 'col', 'colgroup', 'dd', 'div',
'dt', 'dir', 'fieldset', 'figcaption', 'figure', 'footer', 'form', 'header', 'hgroup',
'li', 'map', 'menu', 'nav', 'optgroup', 'option', 'section', 'tbody', 'textarea',
'tfoot', 'th', 'thead', 'tr', 'td', 'ul', 'ol', 'dl', 'details');
$excludeTags = array('noframes', 'noscript', 'script', 'style', 'frame', 'frameset');
// PHP DOMDocument doesn’t correctly handle whitespace around elements it doesn’t recognise.
$unsupportedTags = array('data');
if (isset($el->tagName)) {
if (in_array(strtolower($el->tagName), $excludeTags)) {
return $out;
} else if ($el->tagName == 'img') {
if ($el->getAttribute('alt') !== '') {
return $el->getAttribute('alt');
} else if (!$implied && $el->getAttribute('src') !== '') {
return $this->resolveUrl($el->getAttribute('src'));
}
} else if ($el->tagName == 'area' and $el->getAttribute('alt') !== '') {
return $el->getAttribute('alt');
} else if ($el->tagName == 'abbr' and $el->getAttribute('title') !== '') {
return $el->getAttribute('title');
}
}
// if node is a text node get its text
if (isset($el->nodeType) && $el->nodeType === 3) {
$out .= $el->textContent;
}
// get the text of the child nodes
if ($el->childNodes && $el->childNodes->length > 0) {
for ($j = 0; $j < $el->childNodes->length; $j++) {
$text = $this->innerText($el->childNodes->item($j), $implied);
if (!is_null($text)) {
$out .= $text;
}
}
}
if (isset($el->tagName)) {
// if it’s a block level tag add an additional space at the end
if (in_array(strtolower($el->tagName), $blockLevelTags)) {
$out .= ' ';
} elseif ($implied and in_array(strtolower($el->tagName), $unsupportedTags)) {
$out .= ' ';
} else if (strtolower($el->tagName) == 'br') {
// else if it’s a br, replace with newline
$out .= "\n";
}
}
return ($out === '') ? NULL : $out;
}
// TODO: figure out if this has problems with sms: and geo: URLs
public function resolveUrl($url) {
// If the URL is seriously malformed it’s probably beyond the scope of this
// parser to try to do anything with it.
if (parse_url($url) === false) {
return $url;
}
// per issue #40 valid URLs could have a space on either side
$url = trim($url);
$scheme = parse_url($url, PHP_URL_SCHEME);
if (empty($scheme) and !empty($this->baseurl)) {
return resolveUrl($this->baseurl, $url);
} else {
return $url;
}
}
// Parsing Functions
/**
* Parse value-class/value-title on an element, joining with $separator if
* there are multiple.
*
* @param \DOMElement $e
* @param string $separator = '' if multiple value-title elements, join with this string
* @return string|null the parsed value or null if value-class or -title aren’t in use
*/
public function parseValueClassTitle(\DOMElement $e, $separator = '') {
$valueClassElements = $this->xpath->query('./*[contains(concat(" ", @class, " "), " value ")]', $e);
if ($valueClassElements->length !== 0) {
// Process value-class stuff
$val = '';
foreach ($valueClassElements as $el) {
$val .= $this->textContent($el);
}
return unicodeTrim($val);
}
$valueTitleElements = $this->xpath->query('./*[contains(concat(" ", @class, " "), " value-title ")]', $e);
if ($valueTitleElements->length !== 0) {
// Process value-title stuff
$val = '';
foreach ($valueTitleElements as $el) {
$val .= $el->getAttribute('title');
}
return unicodeTrim($val);
}
// No value-title or -class in this element
return null;
}
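/*
The XPath idiom used above, padding @class with spaces so that " value " matches whole class tokens only, can be checked with a few lines of standalone PHP. The markup and variable names below are made up for illustration and assume the DOM extension is available.

```php
<?php
// One exact "value" token, one lookalike ("value-title") and one multi-class element.
$doc = new DOMDocument();
$doc->loadHTML('<div><span class="value">a</span><span class="value-title" title="b">x</span><span class="big value">c</span></div>');
$xpath = new DOMXPath($doc);

// concat(" ", @class, " ") pads the attribute so " value " cannot match inside "value-title".
$texts = array();
foreach ($xpath->query('//*[contains(concat(" ", @class, " "), " value ")]') as $el) {
    $texts[] = $el->textContent;
}
```

After the loop $texts contains the text of the first and third spans only. Note that the padding trick assumes class tokens are separated by plain spaces; tabs or newlines in the attribute would need normalising first.
*/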
/**
* Given an element with class="p-*", get its value
*
* @param DOMElement $p The element to parse
* @return string The plaintext value of $p, dependent on type
* @todo Make this adhere to value-class
*/
public function parseP(\DOMElement $p) {
$classTitle = $this->parseValueClassTitle($p, ' ');
if ($classTitle !== null) {
return $classTitle;
}
$this->resolveChildUrls($p);
if ($p->tagName == 'img' and $p->getAttribute('alt') !== '') {
$pValue = $p->getAttribute('alt');
} elseif ($p->tagName == 'area' and $p->getAttribute('alt') !== '') {
$pValue = $p->getAttribute('alt');
} elseif ($p->tagName == 'abbr' and $p->getAttribute('title') !== '') {
$pValue = $p->getAttribute('title');
} elseif (in_array($p->tagName, array('data', 'input')) and $p->getAttribute('value') !== '') {
$pValue = $p->getAttribute('value');
} else {
$pValue = unicodeTrim($this->innerText($p));
}
return $pValue;
}
/**
* Given an element with class="u-*", get the value of the URL
*
* @param DOMElement $u The element to parse
* @return string The plaintext value of $u, dependent on type
* @todo make this adhere to value-class
*/
public function parseU(\DOMElement $u) {
// getAttribute() returns '' rather than null for a missing attribute, so test with hasAttribute().
if (($u->tagName == 'a' or $u->tagName == 'area') and $u->hasAttribute('href')) {
$uValue = $u->getAttribute('href');
} elseif (in_array($u->tagName, array('img', 'audio', 'video', 'source')) and $u->hasAttribute('src')) {
$uValue = $u->getAttribute('src');
} elseif ($u->tagName == 'object' and $u->hasAttribute('data')) {
$uValue = $u->getAttribute('data');
}
if (isset($uValue)) {
return $this->resolveUrl($uValue);
}
$classTitle = $this->parseValueClassTitle($u);
if ($classTitle !== null) {
return $classTitle;
} elseif ($u->tagName == 'abbr' and $u->hasAttribute('title')) {
return $u->getAttribute('title');
} elseif (in_array($u->tagName, array('data', 'input')) and $u->hasAttribute('value')) {
return $u->getAttribute('value');
} else {
return unicodeTrim($this->textContent($u));
}
}
/**
* Given an element with class="dt-*", get the value of the datetime as a string
*
* @param DOMElement $dt The element to parse
* @param array $dates Array of dates processed so far
* @return string The datetime string found
*/
public function parseDT(\DOMElement $dt, &$dates = array()) {
// Check for value-class pattern
$valueClassChildren = $this->xpath->query('./*[contains(concat(" ", @class, " "), " value ") or contains(concat(" ", @class, " "), " value-title ")]', $dt);
$dtValue = false;
if ($valueClassChildren->length > 0) {
// They’re using value-class
$dateParts = array();
foreach ($valueClassChildren as $e) {
if (strstr(' ' . $e->getAttribute('class') . ' ', ' value-title ')) {
$title = $e->getAttribute('title');
if (!empty($title))
$dateParts[] = $title;
}
elseif ($e->tagName == 'img' or $e->tagName == 'area') {
// Use @alt
$alt = $e->getAttribute('alt');
if (!empty($alt))
$dateParts[] = $alt;
}
elseif ($e->tagName == 'data') {
// Use @value, otherwise innertext
$value = $e->hasAttribute('value') ? $e->getAttribute('value') : unicodeTrim($e->nodeValue);
if (!empty($value))
$dateParts[] = $value;
}
elseif ($e->tagName == 'abbr') {
// Use @title, otherwise innertext
$title = $e->hasAttribute('title') ? $e->getAttribute('title') : unicodeTrim($e->nodeValue);
if (!empty($title))
$dateParts[] = $title;
}
elseif ($e->tagName == 'del' or $e->tagName == 'ins' or $e->tagName == 'time') {
// Use @datetime if available, otherwise innertext
$dtAttr = ($e->hasAttribute('datetime')) ? $e->getAttribute('datetime') : unicodeTrim($e->nodeValue);
if (!empty($dtAttr))
$dateParts[] = $dtAttr;
}
else {
if (!empty($e->nodeValue))
$dateParts[] = unicodeTrim($e->nodeValue);
}
}
// Look through dateParts
$datePart = '';
$timePart = '';
foreach ($dateParts as $part) {
// Is this part a full ISO8601 datetime?
if (preg_match('/^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}(?::\d{2})?(?:Z|[+-]\d{2}:?\d{2})?$/', $part)) {
// Break completely, we’ve got our value.
$dtValue = $part;
break;
} else {
// Is the current part a valid time(+TZ?) AND no other time representation has been found?
if ((preg_match('/\d{1,2}:\d{1,2}(Z?[+-]\d{2}:?\d{2})?/', $part) or preg_match('/\d{1,2}[ap]m/', $part)) and empty($timePart)) {
$timePart = $part;
} elseif (preg_match('/\d{4}-\d{2}-\d{2}/', $part) and empty($datePart)) {
// Is the current part a valid date AND no other date representation has been found?
$datePart = $part;
}
if ( !empty($datePart) && !in_array($datePart, $dates) ) {
$dates[] = $datePart;
}
$dtValue = '';
if ( empty($datePart) && !empty($timePart) ) {
$timePart = convertTimeFormat($timePart);
$dtValue = unicodeTrim($timePart, 'T');
}
else if ( !empty($datePart) && empty($timePart) ) {
$dtValue = rtrim($datePart, 'T');
}
else {
$timePart = convertTimeFormat($timePart);
$dtValue = rtrim($datePart, 'T') . 'T' . unicodeTrim($timePart, 'T');
}
}
}
} else {
// Not using value-class (phew).
if ($dt->tagName == 'img' or $dt->tagName == 'area') {
// Use @alt
// Is it an entire dt?
$alt = $dt->getAttribute('alt');
if (!empty($alt))
$dtValue = $alt;
} elseif (in_array($dt->tagName, array('data'))) {
// Use @value, otherwise innertext
// Is it an entire dt?
$value = $dt->getAttribute('value');
if (!empty($value))
$dtValue = $value;
else
$dtValue = $this->textContent($dt);
} elseif ($dt->tagName == 'abbr') {
// Use @title, otherwise innertext
// Is it an entire dt?
$title = $dt->getAttribute('title');
if (!empty($title))
$dtValue = $title;
else
$dtValue = $this->textContent($dt);
} elseif ($dt->tagName == 'del' or $dt->tagName == 'ins' or $dt->tagName == 'time') {
// Use @datetime if available, otherwise innertext
// Is it an entire dt?
$dtAttr = $dt->getAttribute('datetime');
if (!empty($dtAttr))
$dtValue = $dtAttr;
else
$dtValue = $this->textContent($dt);
} else {
$dtValue = $this->textContent($dt);
}
if (preg_match('/(\d{4}-\d{2}-\d{2})/', $dtValue, $matches)) {
$dates[] = $matches[0];
}
}
/**
* if $dtValue is only a time and there are recently parsed dates,
* form the full date-time using the most recently parsed dt- value
*/
if ((preg_match('/^\d{1,2}:\d{1,2}(Z?[+-]\d{2}:?\d{2})?/', $dtValue) or preg_match('/^\d{1,2}[ap]m/', $dtValue)) && !empty($dates)) {
$dtValue = convertTimeFormat($dtValue);
$dtValue = end($dates) . 'T' . unicodeTrim($dtValue, 'T');
}
return $dtValue;
}
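/*
The final step of parseDT(), borrowing the most recently seen date for a time-only value, reduces to a few lines. This is a standalone sketch with a hypothetical name, sketchImplyDate(); it assumes the time is already in 24-hour form and skips the am/pm conversion done above.

```php
<?php
// Hypothetical helper: prefix a bare time with the most recently parsed date.
function sketchImplyDate($value, array $dates) {
    // Only time-like values ("18:30", "9:05:10") borrow a date, and only if one exists.
    if (preg_match('/^\d{1,2}:\d{2}/', $value) && !empty($dates)) {
        return end($dates) . 'T' . $value;
    }
    return $value;
}

$end = sketchImplyDate('18:30', array('2016-03-01', '2016-03-02'));
```

This is how a dt-end that contains only a time can inherit its date from a dt-start parsed earlier in the same microformat.
*/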
/**
* Given the root element of some embedded markup, return a string representing that markup
*
* @param DOMElement $e The element to parse
* @return array|string An array with 'html' and 'value' keys, or the value-class/value-title string if one is present
*
* @todo need to mark this element as e- parsed so it doesn’t get parsed as its parent’s e-* too
*/
public function parseE(\DOMElement $e) {
$classTitle = $this->parseValueClassTitle($e);
if ($classTitle !== null)
return $classTitle;
// Expand relative URLs within children of this element
// TODO: as it is this is not relative to only children, make this .// and rerun tests
$this->resolveChildUrls($e);
$html = '';
foreach ($e->childNodes as $node) {
$html .= $node->C14N();
}
return array(
'html' => $html,
'value' => unicodeTrim($this->innerText($e))
);
}
private function removeTags(\DOMElement &$e, $tagName) {
while(($r = $e->getElementsByTagName($tagName)) && $r->length) {
$r->item(0)->parentNode->removeChild($r->item(0));
}
}
/**
* Recursively parse microformats
*
* @param DOMElement $e The element to parse
* @return array A representation of the values contained within microformat $e
*/
public function parseH(\DOMElement $e) {
// If it’s already been parsed (e.g. is a child mf), skip
if ($this->parsed->contains($e))
return null;
// Get current µf name
$mfTypes = mfNamesFromElement($e, 'h-');
// Initialise var to store the representation in
$return = array();
$children = array();
$dates = array();
// Handle nested microformats (h-*)
foreach ($this->xpath->query('.//*[contains(concat(" ", @class)," h-")]', $e) as $subMF) {
// Parse
$result = $this->parseH($subMF);
// If result was already parsed, skip it
if (null === $result)
continue;
// In most cases, the value attribute of the nested microformat should be the p- parsed value of the element.
// The only times this is different is when the microformat is nested under certain prefixes, which are handled below.
$result['value'] = $this->parseP($subMF);
// Does this µf have any property names other than h-*?
$properties = nestedMfPropertyNamesFromElement($subMF);
if (!empty($properties)) {
// Yes! It’s a nested property µf
foreach ($properties as $property => $prefixes) {
// Note: handling microformat nesting under multiple conflicting prefixes is not currently specified by the mf2 parsing spec.
$prefixSpecificResult = $result;
if (in_array('p-', $prefixes)) {
$prefixSpecificResult['value'] = $prefixSpecificResult['properties']['name'][0];
} elseif (in_array('e-', $prefixes)) {
$eParsedResult = $this->parseE($subMF);
$prefixSpecificResult['html'] = $eParsedResult['html'];
$prefixSpecificResult['value'] = $eParsedResult['value'];
} elseif (in_array('u-', $prefixes)) {
$prefixSpecificResult['value'] = (empty($result['properties']['url'])) ? $this->parseU($subMF) : reset($result['properties']['url']);
}
$return[$property][] = $prefixSpecificResult;
}
} else {
// No, it’s a child µf
$children[] = $result;
}
// Make sure this sub-mf won’t get parsed as a µf or property
// TODO: Determine if clearing this is required?
$this->elementPrefixParsed($subMF, 'h');
$this->elementPrefixParsed($subMF, 'p');
$this->elementPrefixParsed($subMF, 'u');
$this->elementPrefixParsed($subMF, 'dt');
$this->elementPrefixParsed($subMF, 'e');
}
if($e->tagName == 'area') {
$coords = $e->getAttribute('coords');
$shape = $e->getAttribute('shape');
}
// Handle p-*
foreach ($this->xpath->query('.//*[contains(concat(" ", @class) ," p-")]', $e) as $p) {
if ($this->isElementParsed($p, 'p'))
continue;
$pValue = $this->parseP($p);
// Add the value to the array for its p- properties
foreach (mfNamesFromElement($p, 'p-') as $propName) {
if (!empty($propName))
$return[$propName][] = $pValue;
}
// Make sure this sub-mf won’t get parsed as a top level mf
$this->elementPrefixParsed($p, 'p');
}
// Handle u-*
foreach ($this->xpath->query('.//*[contains(concat(" ", @class)," u-")]', $e) as $u) {
if ($this->isElementParsed($u, 'u'))
continue;
$uValue = $this->parseU($u);
// Add the value to the array for its property types
foreach (mfNamesFromElement($u, 'u-') as $propName) {
$return[$propName][] = $uValue;
}
// Make sure this sub-mf won’t get parsed as a top level mf
$this->elementPrefixParsed($u, 'u');
}
// Handle dt-*
foreach ($this->xpath->query('.//*[contains(concat(" ", @class), " dt-")]', $e) as $dt) {
if ($this->isElementParsed($dt, 'dt'))
continue;
$dtValue = $this->parseDT($dt, $dates);
if ($dtValue) {
// Add the value to the array for dt- properties
foreach (mfNamesFromElement($dt, 'dt-') as $propName) {
$return[$propName][] = $dtValue;
}
}
// Make sure this sub-mf won’t get parsed as a top level mf
$this->elementPrefixParsed($dt, 'dt');
}
// Handle e-*
foreach ($this->xpath->query('.//*[contains(concat(" ", @class)," e-")]', $e) as $em) {
if ($this->isElementParsed($em, 'e'))
continue;
$eValue = $this->parseE($em);
if ($eValue) {
// Add the value to the array for e- properties
foreach (mfNamesFromElement($em, 'e-') as $propName) {
$return[$propName][] = $eValue;
}
}
// Make sure this sub-mf won’t get parsed as a top level mf
$this->elementPrefixParsed($em, 'e');
}
// Implied Properties
// Check for p-name
if (!array_key_exists('name', $return)) {
try {
// Look for img @alt
if (($e->tagName == 'img' or $e->tagName == 'area') and $e->getAttribute('alt') != '')
throw new Exception($e->getAttribute('alt'));
if ($e->tagName == 'abbr' and $e->hasAttribute('title'))
throw new Exception($e->getAttribute('title'));
// Look for nested img @alt
foreach ($this->xpath->query('./img[count(preceding-sibling::*)+count(following-sibling::*)=0]', $e) as $em) {
$emNames = mfNamesFromElement($em, 'h-');
if (empty($emNames) && $em->getAttribute('alt') != '') {
throw new Exception($em->getAttribute('alt'));
}
}
// Look for nested area @alt
foreach ($this->xpath->query('./area[count(preceding-sibling::*)+count(following-sibling::*)=0]', $e) as $em) {
$emNames = mfNamesFromElement($em, 'h-');
if (empty($emNames) && $em->getAttribute('alt') != '') {
throw new Exception($em->getAttribute('alt'));
}
}
// Look for double nested img @alt
foreach ($this->xpath->query('./*[count(preceding-sibling::*)+count(following-sibling::*)=0]/img[count(preceding-sibling::*)+count(following-sibling::*)=0]', $e) as $em) {
$emNames = mfNamesFromElement($em, 'h-');
if (empty($emNames) && $em->getAttribute('alt') != '') {
throw new Exception($em->getAttribute('alt'));
}
}
// Look for double nested area @alt
foreach ($this->xpath->query('./*[count(preceding-sibling::*)+count(following-sibling::*)=0]/area[count(preceding-sibling::*)+count(following-sibling::*)=0]', $e) as $em) {
$emNames = mfNamesFromElement($em, 'h-');
if (empty($emNames) && $em->getAttribute('alt') != '') {
throw new Exception($em->getAttribute('alt'));
}
}
throw new Exception($this->innerText($e, true));
} catch (Exception $exc) {
$return['name'][] = unicodeTrim($exc->getMessage());
}
}
// Check for u-photo
if (!array_key_exists('photo', $return)) {
// Look for img @src
try {
if ($e->tagName == 'img')
throw new Exception($e->getAttribute('src'));
// Look for nested img @src
foreach ($this->xpath->query('./img[count(preceding-sibling::*)+count(following-sibling::*)=0]', $e) as $em) {
if ($em->getAttribute('src') != '')
throw new Exception($em->getAttribute('src'));
}
// Look for double nested img @src
foreach ($this->xpath->query('./*[count(preceding-sibling::*)+count(following-sibling::*)=0]/img[count(preceding-sibling::*)+count(following-sibling::*)=0]', $e) as $em) {
if ($em->getAttribute('src') != '')
throw new Exception($em->getAttribute('src'));
}
} catch (Exception $exc) {
$return['photo'][] = $this->resolveUrl($exc->getMessage());
}
}
// Check for u-url
if (!array_key_exists('url', $return)) {
// Look for a/area @href
if ($e->tagName == 'a' or $e->tagName == 'area')
$url = $e->getAttribute('href');
// Look for nested a @href
foreach ($this->xpath->query('./a[count(preceding-sibling::a)+count(following-sibling::a)=0]', $e) as $em) {
$emNames = mfNamesFromElement($em, 'h-');
if (empty($emNames)) {
$url = $em->getAttribute('href');
break;
}
}
// Look for nested area @href
foreach ($this->xpath->query('./area[count(preceding-sibling::area)+count(following-sibling::area)=0]', $e) as $em) {
$emNames = mfNamesFromElement($em, 'h-');
if (empty($emNames)) {
$url = $em->getAttribute('href');
break;
}
}
if (!empty($url))
$return['url'][] = $this->resolveUrl($url);
}
// Make sure things are in alphabetical order
sort($mfTypes);
// Phew. Return the final result.
$parsed = array(
'type' => $mfTypes,
'properties' => $return
);
if (!empty($shape)) {
$parsed['shape'] = $shape;
}
if (!empty($coords)) {
$parsed['coords'] = $coords;
}
if (!empty($children)) {
$parsed['children'] = array_values(array_filter($children));
}
return $parsed;
}
/**
* Parse Rels and Alternates
*
* Returns [$rels, $alternatives]. If the $rels value is to be empty, i.e. there are no links on the page
* with a rel value *not* containing `alternate`, then the type of $rels depends on $this->jsonMode. If set
* to true, it will be a stdClass instance, optimising for JSON serialisation. Otherwise (the default case),
* it will be an empty array.
*/
public function parseRelsAndAlternates() {
$rels = array();
$alternates = array();
// Iterate through all a, area and link elements with rel attributes
foreach ($this->xpath->query('//*[@rel and @href]') as $hyperlink) {
if ($hyperlink->getAttribute('rel') == '')
continue;
// Resolve the href
$href = $this->resolveUrl($hyperlink->getAttribute('href'));
// Split up the rel into space-separated values
$linkRels = array_filter(explode(' ', $hyperlink->getAttribute('rel')));
// If alternate in rels, create alternate structure, append
if (in_array('alternate', $linkRels)) {
$alt = array(
'url' => $href,
'rel' => implode(' ', array_diff($linkRels, array('alternate')))
);
if ($hyperlink->hasAttribute('media'))
$alt['media'] = $hyperlink->getAttribute('media');
if ($hyperlink->hasAttribute('hreflang'))
$alt['hreflang'] = $hyperlink->getAttribute('hreflang');
if ($hyperlink->hasAttribute('title'))
$alt['title'] = $hyperlink->getAttribute('title');
if ($hyperlink->hasAttribute('type'))
$alt['type'] = $hyperlink->getAttribute('type');
if ($hyperlink->nodeValue)
$alt['text'] = $hyperlink->nodeValue;
$alternates[] = $alt;
} else {
foreach ($linkRels as $rel) {
$rels[$rel][] = $href;
}
}
}
if (empty($rels) and $this->jsonMode) {
$rels = new stdClass();
}
return array($rels, $alternates);
}
/**
* Kicks off the parsing routine
*
* If `$convertClassic` is set, classic microformats classnames are converted to
* their microformats2 equivalents before parsing.
*
* If a DOMElement is set as the $context, only descendants of that element will
* be parsed for microformats.
*
* @param bool $convertClassic whether or not to convert classic microformats markup first. Defaults to true
* @param DOMElement $context optionally an element from which to parse microformats
* @return array An array containing all the µfs found in the current document
*/
public function parse($convertClassic = true, DOMElement $context = null) {
$mfs = array();
if ($convertClassic) {
$this->convertLegacy();
}
$mfElements = null === $context
? $this->xpath->query('//*[contains(concat(" ", @class), " h-")]')
: $this->xpath->query('.//*[contains(concat(" ", @class), " h-")]', $context);
// Parse microformats
foreach ($mfElements as $node) {
// For each microformat
$result = $this->parseH($node);
// Add the value to the array for this property type
$mfs[] = $result;
}
// Parse rels
list($rels, $alternates) = $this->parseRelsAndAlternates();
$top = array(
'items' => array_values(array_filter($mfs)),
'rels' => $rels
);
if (count($alternates))
$top['alternates'] = $alternates;
return $top;
}
/**
* Parse From ID
*
* Given an ID, parse all microformats which are children of the element with
* that ID.
*
* Note that rel values are still document-wide.
*
* If an element with the ID is not found, an empty skeleton mf2 array structure
* will be returned.
*
* @param string $id
* @param bool $convertClassic = true whether or not to convert classic microformats markup before parsing
* @return array
*/
public function parseFromId($id, $convertClassic=true) {
$matches = $this->xpath->query("//*[@id='{$id}']");
if ($matches === false || $matches->length === 0)
return array('items' => array(), 'rels' => array(), 'alternates' => array());
return $this->parse($convertClassic, $matches->item(0));
}
/**
* Convert Legacy Classnames
*
* Adds microformats2 classnames into a document containing only legacy
* semantic classnames.
*
* @return Parser $this
*/
public function convertLegacy() {
$doc = $this->doc;
$xp = new DOMXPath($doc);
// replace all roots
foreach ($this->classicRootMap as $old => $new) {
foreach ($xp->query('//*[contains(concat(" ", @class, " "), " ' . $old . ' ") and not(contains(concat(" ", @class, " "), " ' . $new . ' "))]') as $el) {
$el->setAttribute('class', $el->getAttribute('class') . ' ' . $new);
}
}
foreach ($this->classicPropertyMap as $oldRoot => $properties) {
$newRoot = $this->classicRootMap[$oldRoot];
foreach ($properties as $old => $new) {
foreach ($xp->query('//*[contains(concat(" ", @class, " "), " ' . $oldRoot . ' ")]//*[contains(concat(" ", @class, " "), " ' . $old . ' ") and not(contains(concat(" ", @class, " "), " ' . $new . ' "))]') as $el) {
$el->setAttribute('class', $el->getAttribute('class') . ' ' . $new);
}
}
}
return $this;
}
/**
* XPath Query
*
* Runs an XPath query over the current document. Works in exactly the same
* way as DOMXPath::query.
*
* @param string $expression
* @param DOMNode $context
* @return DOMNodeList
*/
public function query($expression, $context = null) {
return $this->xpath->query($expression, $context);
}
/**
* Classic Root Classname map
*/
public $classicRootMap = array(
'vcard' => 'h-card',
'hfeed' => 'h-feed',
'hentry' => 'h-entry',
'hrecipe' => 'h-recipe',
'hresume' => 'h-resume',
'vevent' => 'h-event',
'hreview' => 'h-review',
'hproduct' => 'h-product'
);
public $classicPropertyMap = array(
'vcard' => array(
'fn' => 'p-name',
'url' => 'u-url',
'honorific-prefix' => 'p-honorific-prefix',
'given-name' => 'p-given-name',
'additional-name' => 'p-additional-name',
'family-name' => 'p-family-name',
'honorific-suffix' => 'p-honorific-suffix',
'nickname' => 'p-nickname',
'email' => 'u-email',
'logo' => 'u-logo',
'photo' => 'u-photo',
'uid' => 'u-uid',
'category' => 'p-category',
'adr' => 'p-adr h-adr',
'extended-address' => 'p-extended-address',
'street-address' => 'p-street-address',
'locality' => 'p-locality',
'region' => 'p-region',
'postal-code' => 'p-postal-code',
'country-name' => 'p-country-name',
'label' => 'p-label',
'geo' => 'p-geo h-geo',
'latitude' => 'p-latitude',
'longitude' => 'p-longitude',
'tel' => 'p-tel',
'note' => 'p-note',
'bday' => 'dt-bday',
'key' => 'u-key',
'org' => 'p-org',
'organization-name' => 'p-organization-name',
'organization-unit' => 'p-organization-unit',
),
'hentry' => array(
'entry-title' => 'p-name',
'entry-summary' => 'p-summary',
'entry-content' => 'e-content',
'published' => 'dt-published',
'updated' => 'dt-updated',
'author' => 'p-author h-card',
'category' => 'p-category',
'geo' => 'p-geo h-geo',
'latitude' => 'p-latitude',
'longitude' => 'p-longitude',
),
'hrecipe' => array(
'fn' => 'p-name',
'ingredient' => 'p-ingredient',
'yield' => 'p-yield',
'instructions' => 'e-instructions',
'duration' => 'dt-duration',
'nutrition' => 'p-nutrition',
'photo' => 'u-photo',
'summary' => 'p-summary',
'author' => 'p-author h-card'
),
'hresume' => array(
'summary' => 'p-summary',
'contact' => 'h-card p-contact',
'education' => 'h-event p-education',
'experience' => 'h-event p-experience',
'skill' => 'p-skill',
'affiliation' => 'p-affiliation h-card',
),
'vevent' => array(
'dtstart' => 'dt-start',
'dtend' => 'dt-end',
'duration' => 'dt-duration',
'description' => 'p-description',
'summary' => 'p-name',
'url' => 'u-url',
'category' => 'p-category',
'location' => 'h-card',
'geo' => 'p-location h-geo'
),
'hreview' => array(
'summary' => 'p-name',
'fn' => 'p-item h-item p-name', // doesn’t work properly, see spec
'photo' => 'u-photo', // of the item being reviewed (p-item h-item u-photo)
'url' => 'u-url', // of the item being reviewed (p-item h-item u-url)
'reviewer' => 'p-reviewer p-author h-card',
'dtreviewed' => 'dt-reviewed',
'rating' => 'p-rating',
'best' => 'p-best',
'worst' => 'p-worst',
'description' => 'p-description'
),
'hproduct' => array(
'fn' => 'p-name',
'photo' => 'u-photo',
'brand' => 'p-brand',
'category' => 'p-category',
'description' => 'p-description',
'identifier' => 'u-identifier',
'url' => 'u-url',
'review' => 'p-review h-review',
'price' => 'p-price'
)
);
}
function parseUriToComponents($uri) {
$result = array(
'scheme' => null,
'authority' => null,
'path' => null,
'query' => null,
'fragment' => null
);
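$u = @parse_url($uri);
// parse_url() returns false for seriously malformed URLs; treat those as having no components
if(!is_array($u))
return $result;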
if(array_key_exists('scheme', $u))
$result['scheme'] = $u['scheme'];
if(array_key_exists('host', $u)) {
if(array_key_exists('user', $u))
$result['authority'] = $u['user'];
if(array_key_exists('pass', $u))
$result['authority'] .= ':' . $u['pass'];
if(array_key_exists('user', $u) || array_key_exists('pass', $u))
$result['authority'] .= '@';
$result['authority'] .= $u['host'];
if(array_key_exists('port', $u))
$result['authority'] .= ':' . $u['port'];
}
if(array_key_exists('path', $u))
$result['path'] = $u['path'];
if(array_key_exists('query', $u))
$result['query'] = $u['query'];
if(array_key_exists('fragment', $u))
$result['fragment'] = $u['fragment'];
return $result;
}
function resolveUrl($baseURI, $referenceURI) {
$target = array(
'scheme' => null,
'authority' => null,
'path' => null,
'query' => null,
'fragment' => null
);
# 5.2.1 Pre-parse the Base URI
# The base URI (Base) is established according to the procedure of
# Section 5.1 and parsed into the five main components described in
# Section 3
$base = parseUriToComponents($baseURI);
# If base path is blank (http://example.com) then set it to /
# (I can't tell if this is actually in the RFC or not, but seems like it makes sense)
if($base['path'] == null)
$base['path'] = '/';
# 5.2.2. Transform References
# The URI reference is parsed into the five URI components
# (R.scheme, R.authority, R.path, R.query, R.fragment) = parse(R);
$reference = parseUriToComponents($referenceURI);
# A non-strict parser may ignore a scheme in the reference
# if it is identical to the base URI's scheme.
# TODO
if($reference['scheme']) {
$target['scheme'] = $reference['scheme'];
$target['authority'] = $reference['authority'];
$target['path'] = removeDotSegments($reference['path']);
$target['query'] = $reference['query'];
} else {
if($reference['authority']) {
$target['authority'] = $reference['authority'];
$target['path'] = removeDotSegments($reference['path']);
$target['query'] = $reference['query'];
} else {
if($reference['path'] == '') {
$target['path'] = $base['path'];
if($reference['query']) {
$target['query'] = $reference['query'];
} else {
$target['query'] = $base['query'];
}
} else {
if(substr($reference['path'], 0, 1) == '/') {
$target['path'] = removeDotSegments($reference['path']);
} else {
$target['path'] = mergePaths($base, $reference);
$target['path'] = removeDotSegments($target['path']);
}
$target['query'] = $reference['query'];
}
$target['authority'] = $base['authority'];
}
$target['scheme'] = $base['scheme'];
}
$target['fragment'] = $reference['fragment'];
# 5.3 Component Recomposition
$result = '';
if($target['scheme']) {
$result .= $target['scheme'] . ':';
}
if($target['authority']) {
$result .= '//' . $target['authority'];
}
$result .= $target['path'];
if($target['query']) {
$result .= '?' . $target['query'];
}
if($target['fragment']) {
$result .= '#' . $target['fragment'];
} elseif($referenceURI == '#') {
$result .= '#';
}
return $result;
}
# 5.2.3 Merge Paths
function mergePaths($base, $reference) {
# If the base URI has a defined authority component and an empty
# path,
if($base['authority'] && $base['path'] == null) {
# then return a string consisting of "/" concatenated with the
# reference's path; otherwise,
$merged = '/' . $reference['path'];
} else {
if(($pos=strrpos($base['path'], '/')) !== false) {
# return a string consisting of the reference's path component
# appended to all but the last segment of the base URI's path (i.e.,
# excluding any characters after the right-most "/" in the base URI
# path,
$merged = substr($base['path'], 0, $pos + 1) . $reference['path'];
} else {
# or excluding the entire base URI path if it does not contain
# any "/" characters).
$merged = $reference['path'];
}
}
return $merged;
}
# 5.2.4.A Remove leading ../ or ./
function removeLeadingDotSlash(&$input) {
if(substr($input, 0, 3) == '../') {
$input = substr($input, 3);
} elseif(substr($input, 0, 2) == './') {
$input = substr($input, 2);
}
}
# 5.2.4.B Replace leading /. with /
function removeLeadingSlashDot(&$input) {
if(substr($input, 0, 3) == '/./') {
$input = '/' . substr($input, 3);
} else {
$input = '/' . substr($input, 2);
}
}
# 5.2.4.C Given leading /../ remove component from output buffer
function removeOneDirLevel(&$input, &$output) {
if(substr($input, 0, 4) == '/../') {
$input = '/' . substr($input, 4);
} else {
$input = '/' . substr($input, 3);
}
$output = substr($output, 0, strrpos($output, '/'));
}
# 5.2.4.D Remove . and .. if it's the only thing in the input
function removeLoneDotDot(&$input) {
if($input == '.') {
$input = substr($input, 1);
} else {
$input = substr($input, 2);
}
}
# 5.2.4.E Move one segment from input to output
function moveOneSegmentFromInput(&$input, &$output) {
if(substr($input, 0, 1) != '/') {
$pos = strpos($input, '/');
} else {
$pos = strpos($input, '/', 1);
}
if($pos === false) {
$output .= $input;
$input = '';
} else {
$output .= substr($input, 0, $pos);
$input = substr($input, $pos);
}
}
# 5.2.4 Remove Dot Segments
function removeDotSegments($path) {
# 1. The input buffer is initialized with the now-appended path
# components and the output buffer is initialized to the empty
# string.
$input = $path;
$output = '';
$step = 0;
# 2. While the input buffer is not empty, loop as follows:
while($input) {
$step++;
if(substr($input, 0, 3) == '../' || substr($input, 0, 2) == './') {
# A. If the input buffer begins with a prefix of "../" or "./",
# then remove that prefix from the input buffer; otherwise,
removeLeadingDotSlash($input);
} elseif(substr($input, 0, 3) == '/./' || $input == '/.') {
# B. if the input buffer begins with a prefix of "/./" or "/.",
# where "." is a complete path segment, then replace that
# prefix with "/" in the input buffer; otherwise,
removeLeadingSlashDot($input);
} elseif(substr($input, 0, 4) == '/../' || $input == '/..') {
# C. if the input buffer begins with a prefix of "/../" or "/..",
# where ".." is a complete path segment, then replace that
# prefix with "/" in the input buffer and remove the last
# segment and its preceding "/" (if any) from the output
# buffer; otherwise,
removeOneDirLevel($input, $output);
} elseif($input == '.' || $input == '..') {
# D. if the input buffer consists only of "." or "..", then remove
# that from the input buffer; otherwise,
removeLoneDotDot($input);
} else {
# E. move the first path segment in the input buffer to the end of
# the output buffer and any subsequent characters up to, but not including,
# the next "/" character or the end of the input buffer
moveOneSegmentFromInput($input, $output);
}
}
return $output;
}
# php-mf2
[Build status on Travis CI](http://travis-ci.org/indieweb/php-mf2)
php-mf2 is a pure, generic [microformats-2](http://microformats.org/wiki/microformats-2) parser. It makes HTML as easy to consume as JSON.
Instead of having a hard-coded list of all the different microformats, it follows a set of procedures to handle different property types (e.g. `p-` for plaintext, `u-` for URL, etc). This allows for a very small and maintainable parser.
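As a rough illustration of this prefix-driven approach, the sketch below groups an element's class names by microformats2 prefix. It is not php-mf2's actual code (the real parser works on DOM nodes), and `splitMfClasses` is a hypothetical helper invented for this example.

```php
<?php
// Hypothetical helper for illustration only; not part of php-mf2.
function splitMfClasses($classAttr) {
    $result = array('h' => array(), 'p' => array(), 'u' => array(), 'dt' => array(), 'e' => array());
    foreach (preg_split('/\s+/', trim($classAttr), -1, PREG_SPLIT_NO_EMPTY) as $class) {
        // Match a known prefix followed by a non-empty name, e.g. "p-name" or "dt-start"
        if (preg_match('/^(h|p|u|dt|e)-(.+)$/', $class, $m)) {
            // Root names keep their "h-" prefix; property names lose theirs
            $result[$m[1]][] = ($m[1] === 'h') ? $class : $m[2];
        }
    }
    return $result;
}

$classes = splitMfClasses('h-card p-name u-url sidebar');
// $classes['h'] is array('h-card'), $classes['p'] is array('name'), $classes['u'] is array('url')
```

Because the handling depends only on the prefix, a class such as `p-some-new-property` would need no change to this kind of code.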
## Installation
There are two ways of installing php-mf2. I **highly recommend** installing php-mf2 using [Composer](http://getcomposer.org). The rest of the documentation assumes that you have done so.
To install using Composer, run `./composer.phar require mf2/mf2:~0.3`
If you can’t or don’t want to use Composer, then php-mf2 can be installed the old way by downloading [`/Mf2/Parser.php`](https://raw.githubusercontent.com/indieweb/php-mf2/master/Mf2/Parser.php), adding it to your project and requiring it from files you want to call its functions from, like this:
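```php
<?php
// Adjust the path to wherever you placed Parser.php
require_once 'Mf2/Parser.php';
```
Version tags are GPG-signed (see the v0.2.9 changelog entry). Possible issues when verifying a signed tag: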
* **Git complains that there’s no such tag**: check for a .git file in the source folder; odds are you have the prefer-dist setting enabled and composer is just extracting a zip rather than checking out from git.
* **Git complains the gpg command doesn’t exist**: If you successfully imported my key then you obviously do have gpg installed, but you might have gpg2, whereas git looks for gpg. Solution: tell git which binary to use: `git config --global gpg.program 'gpg2'`
## Usage
php-mf2 is PSR-0 autoloadable, so simply include Composer’s auto-generated autoload file (`/vendor/autoload.php`) and you can start using it. These two functions cover most situations:
* To fetch microformats from a URL, call `Mf2\fetch($url)`
* To parse microformats from HTML, call `Mf2\parse($html, $url)`, where `$url` is the URL from which `$html` was loaded, if any. This parameter is required for correct relative URL parsing and must not be left out unless parsing HTML which is not loaded from the web.
## Examples
### Fetching microformats from a page
```php
<?php
$output = Mf2\parse('<a class="h-card" href="https://waterpigs.co.uk/">Barnaby Walters</a>');
```
`$output` is a canonical microformats2 array structure like:
```json
{
"items": [{
"type": ["h-card"],
"properties": {
"name": ["Barnaby Walters"],
"url": ["https://waterpigs.co.uk/"]
}
}],
"rels": {}
}
```
If no microformats are found, `items` will be an empty array.
Note that, whilst the property prefixes are stripped, the prefix of the `h-*` classname(s) in the "type" array is retained.
### Parsing a document with relative URLs
Most of the time you’ll be getting your input HTML from a URL. You should pass that URL as the second parameter to `Mf2\parse()` so that any relative URLs in the document can be resolved. For example, say you got the following HTML from `http://example.org`:
```html
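<div class="h-card"><span class="p-name">Mr. Example</span> <img class="u-photo" src="/photo.png" alt="" /></div>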
```
Parsing like this:
```php
$output = Mf2\parse($html, 'http://example.org');
```
will result in the following output, with relative URLs made absolute:
```json
{
"items": [{
"type": ["h-card"],
"properties": {
"photo": ["http://example.org/photo.png"]
}
}],
"rels": {}
}
```
php-mf2 correctly handles relative URL resolution according to the URI and HTML specs, including correct use of the `<base>` element.
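To make the idea of resolving a reference against a base URL concrete, here is a deliberately minimal sketch. It assumes a well-formed `http(s)` base URL and handles only absolute URLs, absolute-path references and simple relative paths; dot segments, queries, fragments and protocol-relative references are ignored. `resolveSimple` is a hypothetical name and is not the function php-mf2 uses internally.

```php
<?php
// Illustrative only: a small subset of RFC 3986 reference resolution.
// $base would be the document URL, or the href of a <base> element if the page has one.
function resolveSimple($base, $ref) {
    if (parse_url($ref, PHP_URL_SCHEME) !== null) {
        return $ref; // already absolute
    }
    $b = parse_url($base);
    $origin = $b['scheme'] . '://' . $b['host'] . (isset($b['port']) ? ':' . $b['port'] : '');
    if (substr($ref, 0, 1) === '/') {
        return $origin . $ref; // absolute path: replaces the base path entirely
    }
    // Relative path: append to the base path up to and including its last "/"
    $path = isset($b['path']) ? $b['path'] : '/';
    return $origin . substr($path, 0, strrpos($path, '/') + 1) . $ref;
}

echo resolveSimple('http://example.org', '/photo.png'), "\n"; // http://example.org/photo.png
```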
### Parsing `rel` and `rel=alternate` values
php-mf2 also parses any link relations in the document, placing them into two top-level arrays — one for `rel=alternate` and another for all other rel values, e.g. when parsing:
```html
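<a rel="me" href="https://twitter.com/barnabywalters">Me on twitter</a>
<link rel="alternate etc" href="http://example.com/notes.atom" />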
```
will result in the following keys:
```json
{
"items": [],
"rels": {
"me": ["https://twitter.com/barnabywalters"]
},
"alternates": [{
"url": "http://example.com/notes.atom",
"rel": "etc"
}]
}
```
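The split between `rels` and `alternates` can be sketched without a DOM at all. The helper below is hypothetical (`splitRels` is not a php-mf2 function) and takes already-resolved `array(relAttribute, href)` pairs as input, which is an assumption made to keep the example self-contained.

```php
<?php
// Hypothetical stand-alone helper; input is a list of array(relAttribute, resolvedHref) pairs.
function splitRels(array $links) {
    $rels = array();
    $alternates = array();
    foreach ($links as $link) {
        list($relAttr, $href) = $link;
        $values = preg_split('/\s+/', trim($relAttr), -1, PREG_SPLIT_NO_EMPTY);
        if (in_array('alternate', $values)) {
            // "alternate" itself is dropped; any remaining values stay in "rel"
            $alternates[] = array(
                'url' => $href,
                'rel' => implode(' ', array_diff($values, array('alternate')))
            );
        } else {
            foreach ($values as $rel) {
                $rels[$rel][] = $href;
            }
        }
    }
    return array($rels, $alternates);
}

list($rels, $alternates) = splitRels(array(
    array('me', 'https://twitter.com/barnabywalters'),
    array('alternate etc', 'http://example.com/notes.atom'),
));
```

With this input, `$rels` and `$alternates` hold the same values as the `rels` and `alternates` keys in the JSON above.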
Protip: if you’re not bothered about the microformats2 data and just want rels and alternates, you can improve performance by creating a `Mf2\Parser` object (see below) and calling `->parseRelsAndAlternates()` instead of `->parse()`, e.g.
```php
<?php
$parser = new Mf2\Parser($html, $url);
list($rels, $alternates) = $parser->parseRelsAndAlternates();
```
### Debugging Mf2\fetch
`Mf2\fetch()` will attempt to parse any response served with “HTML” in the content-type, regardless of what the status code is. If it receives a non-HTML response it will return null.
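To learn what the HTTP status code for any request was, or to learn more about the request, pass a variable as the third parameter to `Mf2\fetch()`; it will be filled with the contents of `curl_getinfo()`.
### Creating a Parser object for more control
For more control, create a `Mf2\Parser` object yourself. For example, to parse only the microformats under a particular element, call `parseFromId()` or pass a context element to `parse()`:
```php
<?php
$doc = 'Ignored content <div id="parse-from-here"><span class="h-card">This shows up</span></div> yet more ignored content';
$parser = new Mf2\Parser($doc);
$parser->parseFromId('parse-from-here'); // returns a document with only the h-card descended from div#parse-from-here
$elementIWant = $parser->query('an xpath query')->item(0);
$parser->parse(true, $elementIWant); // returns a document with only mfs under the selected element
```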
### Generating output for JSON serialization with JSON-mode
Due to a quirk with the way PHP arrays work, there is an edge case ([reported](https://github.com/indieweb/php-mf2/issues/29) by Tom Morris) in which a document with no rel values, when serialised as JSON, results in an empty array as the rels value rather than an empty object. Replacing this in code with a stdClass breaks PHP iteration over the values.
As of version 0.2.6, the default behaviour is back to being PHP-friendly, so if you want to produce results specifically for serialisation as JSON (for example if you run a HTML -> JSON service, or want to run tests against JSON fixtures), enable JSON mode:
```php
// …by passing true as the third constructor argument:
$jsonParser = new Mf2\Parser($html, $url, true);
```
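The underlying PHP behaviour can be seen with plain `json_encode()`, with no php-mf2 involved. The two arrays below are hand-written stand-ins for parser output, not real results:

```php
<?php
// An empty PHP array encodes as a JSON array, while an empty stdClass encodes as a JSON object
$phpFriendly = array('items' => array(), 'rels' => array());
$jsonFriendly = array('items' => array(), 'rels' => new stdClass());

echo json_encode($phpFriendly), "\n";  // {"items":[],"rels":[]}
echo json_encode($jsonFriendly), "\n"; // {"items":[],"rels":{}}
```

The first form can be iterated with `foreach` like any other array; the second serialises to the object that JSON consumers expect for `rels`.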
### Classic Microformats Markup
php-mf2 has some support for parsing classic microformats markup. It’s enabled by default, but can be turned off by calling `Mf2\parse($html, $url, false);` or `$parser->parse(false);` if you’re instantiating a parser yourself.
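Roughly speaking, the conversion adds microformats2 classnames alongside the classic ones before parsing. The sketch below shows the idea on a plain class attribute string; `upgradeClassAttr` is a hypothetical helper and `$rootMap` is only a three-entry excerpt of the root mapping, so treat it as an illustration rather than the parser's implementation.

```php
<?php
// A three-entry excerpt of the classic root mapping, for illustration
$rootMap = array('vcard' => 'h-card', 'hentry' => 'h-entry', 'vevent' => 'h-event');

// Hypothetical helper: append the mf2 equivalent of each classic root class, if not already present
function upgradeClassAttr($classAttr, array $map) {
    $classes = preg_split('/\s+/', trim($classAttr), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($map as $old => $new) {
        if (in_array($old, $classes) && !in_array($new, $classes)) {
            $classes[] = $new;
        }
    }
    return implode(' ', $classes);
}

echo upgradeClassAttr('vcard author', $rootMap), "\n"; // vcard author h-card
```

The classic classname is kept, so existing stylesheets and scripts that rely on it are unaffected.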
In previous versions of php-mf2 you could also add your own class mappings — officially this is no longer supported.
* If the built in mappings don’t successfully parse some classic microformats markup then raise an issue and we’ll fix it.
* If you want to screen-scrape websites which don’t use mf2 into mf2 data structures, consider contributing to [php-mf2-shim](https://github.com/indieweb/php-mf2-shim)
* If you *really* need to make one-off changes to the default mappings… It is possible. But you have to figure it out for yourself ;)
## Security
**No filtering of content takes place in Mf2\Parser, so treat its output as you would any untrusted data from the source of the parsed document.**
Some tips:
* All content apart from the 'html' key in dictionaries produced by parsing an `e-*` property is not HTML-escaped. For example, `<span class="p-name">&lt;code&gt;</span>` will result in `"name": ["<code>"]`. At the very least, HTML-escape all properties before echoing them out in HTML.
* If you’re using the raw HTML content under the 'html' key of dictionaries produced by parsing `e-*` properties, you SHOULD purify the HTML before displaying it to prevent injection of arbitrary code. For PHP I recommend using [HTML Purifier](http://htmlpurifier.org).
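A minimal sketch of the first tip, using PHP's built-in `htmlspecialchars()`. The `$name` value is a hand-written stand-in for a plaintext property taken from parser output:

```php
<?php
// Stand-in for a plaintext property value from parser output; it is not HTML-escaped
$name = '<code>';

// Escape before echoing into an HTML page
$safe = htmlspecialchars($name, ENT_QUOTES, 'UTF-8');
echo '<p>', $safe, '</p>', "\n"; // <p>&lt;code&gt;</p>
```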
TODO: move this section to a security/consumption best practices page on the wiki
## Contributing
Issues and bug reports are very welcome. If you know how to write tests then please do so as code always expresses problems and intent much better than English, and gives me a way of measuring whether or not fixes have actually solved your problem. If you don’t know how to write tests, don’t worry :) Just include as much useful information in the issue as you can.
Pull requests are very welcome. Please try to maintain stylistic, structural and naming consistency with the existing codebase, and don’t be too upset if I make naming changes :)
### How to make a Pull Request
1. Fork the repo to your github account
2. Clone a copy to your computer (simply installing php-mf2 using composer only works for using it, not developing it)
3. Install the dev dependencies with `./composer.phar install`
4. Run PHPUnit with `./vendor/bin/phpunit`
5. Make your changes
6. Add PHPUnit tests for your changes, either in an existing test file if suitable, or a new one
7. Make sure your tests pass (`./vendor/bin/phpunit`), preferably on more than one supported PHP version (5.4 or later)
8. Go to your fork of the repo on github.com and make a pull request, preferably with a short summary, detailed description and references to issues/parsing specs as appropriate
9. Bask in the warm feeling of having contributed to a piece of free software
### Testing
There are currently two separate test suites: one, in `tests/Mf2`, is written in phpunit, containing many microformats parsing examples as well as internal parser tests and regression tests for specific issues over php-mf2’s history. Run it with `./vendor/bin/phpunit`.
The other, in `tests/test-suite`, is a custom test harness which hooks up php-mf2 to the cross-platform microformats test suite. Each test consists of a HTML file and a corresponding JSON file, and the suite can be run with `php ./tests/test-suite/test-suite.php`.
Currently php-mf2 passes the majority of its own test cases, and a good percentage of the cross-platform tests. Contributors should ALWAYS test against the PHPUnit suite to ensure any changes don’t negatively impact php-mf2, and SHOULD run the cross-platform suite, especially if you’re changing parsing behaviour.
### Changelog
#### v0.3.0
2016-03-14
* Requires PHP 5.4 at minimum (PHP 5.3 is EOL)
* Licensed under CC0 rather than MIT
* Merges Pull requests #70, #73, #74, #75, #77, #80, #82, #83, #85 and #86.
* Variety of small bug fixes and features including improved whitespace support, removal of style and script contents from plaintext properties
* All PHPUnit tests passing finally
Many thanks to @aaronpk, @diplix, @dissolve, @dymcx, @gRegorLove, @jeena, @veganstraightedge and @voxpelli for all your hard work opening issues and sending and merging PRs!
#### v0.2.12
2015-07-12
* Merges pull requests [#65](https://github.com/indieweb/php-mf2/pull/65), [#66](https://github.com/indieweb/php-mf2/pull/66) and [#67](https://github.com/indieweb/php-mf2/pull/67).
* Fixes issue [#64](https://github.com/indieweb/php-mf2/issues/64).
Many thanks to @aaronpk, @gRegorLove and @kylewm for contributions, @aaronpk and @kevinmarks for PR management and @tantek for issue reporting!
#### v0.2.11
2015-07-10
#### v0.2.10
2015-04-29
* Merged [#58](https://github.com/indieweb/php-mf2/pull/58), fixing some parsing bugs and adding support for area element parsing. Thanks so much for your hard work and patience, Ben!
#### v0.2.9
2014-08-06
* Added backcompat classmap for hProduct, associated tests
* Started GPG signing version tags as barnaby@waterpigs.co.uk, fingerprint CBC7 7876 BF7C 9637 B6AE 77BA 7D49 834B 0416 CFA3
#### v0.2.8
2014-07-17
* Fixed issue #51 causing php-mf2 to not work with PHP 5.3
* Fixed issue #52 correctly handling the `<template>` element by ignoring it
* Fixed issue #53 improving the plaintext parsing of `<img>` elements
#### v0.2.7
2014-06-18
* Added `Mf2\fetch()` which fetches content from a URL and returns parsed microformats
* Added implied `dt-end` discovery (thanks for all your hard work, @gRegorLove!)
* Fixed issue causing classnames like `blah e- blah` to produce properties with numeric keys (thanks @aaronpk and @gRegorLove)
* Fixed issue causing resolved URLs to not include port numbers (thanks @aaronpk)
#### v0.2.6
* Added JSON mode as long-term fix for #29
* Fixed bug causing microformats nested under multiple property names to be parsed only once
#### v0.2.5
* Removed conditional replacing empty rel list with stdclass. Original purpose was to make JSON-encoding the output from the parser correct but it also caused Fatal Errors due to trying to treat stdclass as array.
#### v0.2.4
#### v0.2.3
* Made p-* parsing consistent with implied name parsing
* Stopped collapsing whitespace in p-* properties
* Implemented unicodeTrim which removes `&nbsp;` characters as well as regex `\s`
* Added support for implied name via abbr[title]
* Prevented excessively nested value-class elements from being parsed incorrectly, removed incorrect separator which was getting added in some cases
* Updated u-* parsing to be spec-compliant, matching [href] before value-class and only attempting URL resolution for URL attributes
* Added support for input[value] parsing
* Tests for all the above
#### v0.2.2
* Made resolveUrl method public, allowing advanced parsers and subclasses to make use of it
* Fixed bug causing multiple duplicate property values to appear
#### v0.2.1
* Fixed bug causing classic microformats property classnames to not be parsed correctly
#### v0.2.0 (BREAKING CHANGES)
* Namespace change from mf2 to Mf2, for PSR-0 compatibility
* `Mf2\parse()` function added to simplify the most common case of just parsing some HTML
* Updated e-* property parsing rules to match mf2 parsing spec — instead of producing inconsistent HTML content, it now produces dictionaries like
{
"html": "The Content",
"value": "The Content"
}
* Removed `htmlSafe` options as new e-* parsing rules make them redundant
* Moved a whole load of static functions out of the class and into standalone functions
* Changed autoloading to always include Parser.php instead of using classmap
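
To illustrate how the `html` and `value` keys of an e-* property can differ once the content has embedded markup, here is a rough sketch built on PHP's DOM extension. It is not the parser's own code: the markup, variable names and the serialisation approach are all assumptions chosen for illustration.

```php
<?php
// Hypothetical markup; the class name mirrors an e-* property.
$markup = '<div class="e-content">The <em>Content</em></div>';

$doc = new DOMDocument();
$doc->loadHTML($markup);
$element = $doc->getElementsByTagName('div')->item(0);

// "html": serialise each child node, keeping the embedded markup.
$html = '';
foreach ($element->childNodes as $child) {
    $html .= $doc->saveHTML($child);
}

// "value": the plain text content with all tags dropped.
$property = array('html' => $html, 'value' => $element->textContent);

print_r($property);
```

Consumers that need safe plain text can read `value`, while those that want to preserve formatting can read `html` and sanitise it themselves.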
#### v0.1.23
* Made some changes to the way back-compatibility with classic microformats is handled, ignoring classic property classnames inside mf2 roots and outside classic roots
* Deprecated ability to add new classmaps, removed twitter classmap. Use [php-mf2-shim](http://github.com/indieweb/php-mf2-shim) instead, it’s better
#### v0.1.22
* Converts classic microformats by default
#### v0.1.21
* Removed webignition dependency, also removing ext-intl dependency. php-mf2 is now a standalone, single file library again
* Replaced webignition URL resolving with custom code passing almost all tests, courtesy of Aaron Parecki
#### v0.1.20
* Added in almost-perfect custom URL resolving code
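
A core step of relative URL resolution is removing `.` and `..` path segments. The sketch below follows the "remove dot segments" procedure from RFC 3986 section 5.2.4; `sketchRemoveDotSegments` is a hypothetical name and this is not the library's own implementation, only an illustration of the technique.

```php
<?php
// Hypothetical helper illustrating RFC 3986 section 5.2.4.
function sketchRemoveDotSegments($path) {
    $input = (string) $path;
    $output = '';
    while ($input !== '') {
        if (strpos($input, '../') === 0) {
            // Drop a leading "../".
            $input = (string) substr($input, 3);
        } elseif (strpos($input, './') === 0) {
            // Drop a leading "./".
            $input = (string) substr($input, 2);
        } elseif (strpos($input, '/./') === 0) {
            // Replace a leading "/./" with "/".
            $input = (string) substr($input, 2);
        } elseif ($input === '/.') {
            $input = '/';
        } elseif (strpos($input, '/../') === 0 || $input === '/..') {
            // Replace the prefix with "/" and drop the last output segment.
            $input = ($input === '/..') ? '/' : (string) substr($input, 3);
            $pos = strrpos($output, '/');
            $output = ($pos === false) ? '' : (string) substr($output, 0, $pos);
        } elseif ($input === '.' || $input === '..') {
            $input = '';
        } else {
            // Move the first path segment from input to output.
            $next = strpos($input, '/', 1);
            if ($next === false) {
                $output .= $input;
                $input = '';
            } else {
                $output .= substr($input, 0, $next);
                $input = (string) substr($input, $next);
            }
        }
    }
    return $output;
}

echo sketchRemoveDotSegments('/a/b/c/./../../g'), "\n"; // /a/g
echo sketchRemoveDotSegments('a/b/c'), "\n";            // a/b/c
```

The two printed cases correspond to paths exercised by the project's URL tests.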
#### v0.1.19 (2013-06-11)
* Required stable version of webignition/absolute-url-resolver, hopefully resolving versioning problems
#### v0.1.18 (2013-06-05)
* Fixed problems with isElementParsed, causing elements to be incorrectly parsed
* Cleaned up some test files
#### v0.1.17
* Rewrote some PHP 5.4 array syntax which crept into 0.1.16 so php-mf2 still works on PHP 5.3
* Fixed a bug causing weird partial microformats to be added to parent microformats if they had doubly property-nested children
* Finally actually licensed this project under a real license (MIT, in composer.json)
* Suggested barnabywalters/mf-cleaner in composer.json
#### v0.1.16
* Ability to parse from only an ID
* Context DOMElement can be passed to $parse
* Parser::query runs XPath queries on the current document
* When parsing e-* properties, elements with @src, @data or @href have relative URLs resolved in the output
#### v0.1.15
* Added html-safe options
* Added rel+rel-alternate parsing
## License
php-mf2 is dedicated to the public domain using Creative Commons -- CC0 1.0 Universal.
http://creativecommons.org/publicdomain/zero/1.0
php-mf2-0.3.0/bin/ 0000775 0000000 0000000 00000000000 12671525356 0013547 5 ustar 00root root 0000000 0000000 php-mf2-0.3.0/bin/fetch-mf2 0000775 0000000 0000000 00000001564 12671525356 0015256 0 ustar 00root root 0000000 0000000 #!/usr/bin/env php
=5.4.0"
},
"require-dev": {
"phpunit/phpunit": "3.7.*"
},
"autoload": {
"files": ["Mf2/Parser.php"]
},
"license": "CC0",
"suggest": {
"barnabywalters/mf-cleaner": "To more easily handle the canonical data php-mf2 gives you"
}
}
php-mf2-0.3.0/composer.lock 0000664 0000000 0000000 00000035276 12671525356 0015515 0 ustar 00root root 0000000 0000000 {
"_readme": [
"This file locks the dependencies of your project to a known state",
"Read more about it at http://getcomposer.org/doc/01-basic-usage.md#composer-lock-the-lock-file",
"This file is @generated automatically"
],
"hash": "4ddb858a7bdae5163307bd104589bd95",
"packages": [],
"packages-dev": [
{
"name": "phpunit/php-code-coverage",
"version": "1.2.18",
"source": {
"type": "git",
"url": "https://github.com/sebastianbergmann/php-code-coverage.git",
"reference": "fe2466802556d3fe4e4d1d58ffd3ccfd0a19be0b"
},
"dist": {
"type": "zip",
"url": "https://api.github.com/repos/sebastianbergmann/php-code-coverage/zipball/fe2466802556d3fe4e4d1d58ffd3ccfd0a19be0b",
"reference": "fe2466802556d3fe4e4d1d58ffd3ccfd0a19be0b",
"shasum": ""
},
"require": {
"php": ">=5.3.3",
"phpunit/php-file-iterator": ">=1.3.0@stable",
"phpunit/php-text-template": ">=1.2.0@stable",
"phpunit/php-token-stream": ">=1.1.3,<1.3.0"
},
"require-dev": {
"phpunit/phpunit": "3.7.*@dev"
},
"suggest": {
"ext-dom": "*",
"ext-xdebug": ">=2.0.5"
},
"type": "library",
"extra": {
"branch-alias": {
"dev-master": "1.2.x-dev"
}
},
"autoload": {
"classmap": [
"PHP/"
]
},
"notification-url": "https://packagist.org/downloads/",
"include-path": [
""
],
"license": [
"BSD-3-Clause"
],
"authors": [
{
"name": "Sebastian Bergmann",
"email": "sb@sebastian-bergmann.de",
"role": "lead"
}
],
"description": "Library that provides collection, processing, and rendering functionality for PHP code coverage information.",
"homepage": "https://github.com/sebastianbergmann/php-code-coverage",
"keywords": [
"coverage",
"testing",
"xunit"
],
"time": "2014-09-02 10:13:14"
},
{
"name": "phpunit/php-file-iterator",
"version": "1.4.0",
"source": {
"type": "git",
"url": "https://github.com/sebastianbergmann/php-file-iterator.git",
"reference": "a923bb15680d0089e2316f7a4af8f437046e96bb"
},
"dist": {
"type": "zip",
"url": "https://api.github.com/repos/sebastianbergmann/php-file-iterator/zipball/a923bb15680d0089e2316f7a4af8f437046e96bb",
"reference": "a923bb15680d0089e2316f7a4af8f437046e96bb",
"shasum": ""
},
"require": {
"php": ">=5.3.3"
},
"type": "library",
"extra": {
"branch-alias": {
"dev-master": "1.4.x-dev"
}
},
"autoload": {
"classmap": [
"src/"
]
},
"notification-url": "https://packagist.org/downloads/",
"license": [
"BSD-3-Clause"
],
"authors": [
{
"name": "Sebastian Bergmann",
"email": "sb@sebastian-bergmann.de",
"role": "lead"
}
],
"description": "FilterIterator implementation that filters files based on a list of suffixes.",
"homepage": "https://github.com/sebastianbergmann/php-file-iterator/",
"keywords": [
"filesystem",
"iterator"
],
"time": "2015-04-02 05:19:05"
},
{
"name": "phpunit/php-text-template",
"version": "1.2.0",
"source": {
"type": "git",
"url": "https://github.com/sebastianbergmann/php-text-template.git",
"reference": "206dfefc0ffe9cebf65c413e3d0e809c82fbf00a"
},
"dist": {
"type": "zip",
"url": "https://api.github.com/repos/sebastianbergmann/php-text-template/zipball/206dfefc0ffe9cebf65c413e3d0e809c82fbf00a",
"reference": "206dfefc0ffe9cebf65c413e3d0e809c82fbf00a",
"shasum": ""
},
"require": {
"php": ">=5.3.3"
},
"type": "library",
"autoload": {
"classmap": [
"Text/"
]
},
"notification-url": "https://packagist.org/downloads/",
"include-path": [
""
],
"license": [
"BSD-3-Clause"
],
"authors": [
{
"name": "Sebastian Bergmann",
"email": "sb@sebastian-bergmann.de",
"role": "lead"
}
],
"description": "Simple template engine.",
"homepage": "https://github.com/sebastianbergmann/php-text-template/",
"keywords": [
"template"
],
"time": "2014-01-30 17:20:04"
},
{
"name": "phpunit/php-timer",
"version": "1.0.5",
"source": {
"type": "git",
"url": "https://github.com/sebastianbergmann/php-timer.git",
"reference": "19689d4354b295ee3d8c54b4f42c3efb69cbc17c"
},
"dist": {
"type": "zip",
"url": "https://api.github.com/repos/sebastianbergmann/php-timer/zipball/19689d4354b295ee3d8c54b4f42c3efb69cbc17c",
"reference": "19689d4354b295ee3d8c54b4f42c3efb69cbc17c",
"shasum": ""
},
"require": {
"php": ">=5.3.3"
},
"type": "library",
"autoload": {
"classmap": [
"PHP/"
]
},
"notification-url": "https://packagist.org/downloads/",
"include-path": [
""
],
"license": [
"BSD-3-Clause"
],
"authors": [
{
"name": "Sebastian Bergmann",
"email": "sb@sebastian-bergmann.de",
"role": "lead"
}
],
"description": "Utility class for timing",
"homepage": "https://github.com/sebastianbergmann/php-timer/",
"keywords": [
"timer"
],
"time": "2013-08-02 07:42:54"
},
{
"name": "phpunit/php-token-stream",
"version": "1.2.2",
"source": {
"type": "git",
"url": "https://github.com/sebastianbergmann/php-token-stream.git",
"reference": "ad4e1e23ae01b483c16f600ff1bebec184588e32"
},
"dist": {
"type": "zip",
"url": "https://api.github.com/repos/sebastianbergmann/php-token-stream/zipball/ad4e1e23ae01b483c16f600ff1bebec184588e32",
"reference": "ad4e1e23ae01b483c16f600ff1bebec184588e32",
"shasum": ""
},
"require": {
"ext-tokenizer": "*",
"php": ">=5.3.3"
},
"type": "library",
"extra": {
"branch-alias": {
"dev-master": "1.2-dev"
}
},
"autoload": {
"classmap": [
"PHP/"
]
},
"notification-url": "https://packagist.org/downloads/",
"include-path": [
""
],
"license": [
"BSD-3-Clause"
],
"authors": [
{
"name": "Sebastian Bergmann",
"email": "sb@sebastian-bergmann.de",
"role": "lead"
}
],
"description": "Wrapper around PHP's tokenizer extension.",
"homepage": "https://github.com/sebastianbergmann/php-token-stream/",
"keywords": [
"tokenizer"
],
"time": "2014-03-03 05:10:30"
},
{
"name": "phpunit/phpunit",
"version": "3.7.38",
"source": {
"type": "git",
"url": "https://github.com/sebastianbergmann/phpunit.git",
"reference": "38709dc22d519a3d1be46849868aa2ddf822bcf6"
},
"dist": {
"type": "zip",
"url": "https://api.github.com/repos/sebastianbergmann/phpunit/zipball/38709dc22d519a3d1be46849868aa2ddf822bcf6",
"reference": "38709dc22d519a3d1be46849868aa2ddf822bcf6",
"shasum": ""
},
"require": {
"ext-ctype": "*",
"ext-dom": "*",
"ext-json": "*",
"ext-pcre": "*",
"ext-reflection": "*",
"ext-spl": "*",
"php": ">=5.3.3",
"phpunit/php-code-coverage": "~1.2",
"phpunit/php-file-iterator": "~1.3",
"phpunit/php-text-template": "~1.1",
"phpunit/php-timer": "~1.0",
"phpunit/phpunit-mock-objects": "~1.2",
"symfony/yaml": "~2.0"
},
"require-dev": {
"pear-pear.php.net/pear": "1.9.4"
},
"suggest": {
"phpunit/php-invoker": "~1.1"
},
"bin": [
"composer/bin/phpunit"
],
"type": "library",
"extra": {
"branch-alias": {
"dev-master": "3.7.x-dev"
}
},
"autoload": {
"classmap": [
"PHPUnit/"
]
},
"notification-url": "https://packagist.org/downloads/",
"include-path": [
"",
"../../symfony/yaml/"
],
"license": [
"BSD-3-Clause"
],
"authors": [
{
"name": "Sebastian Bergmann",
"email": "sebastian@phpunit.de",
"role": "lead"
}
],
"description": "The PHP Unit Testing framework.",
"homepage": "http://www.phpunit.de/",
"keywords": [
"phpunit",
"testing",
"xunit"
],
"time": "2014-10-17 09:04:17"
},
{
"name": "phpunit/phpunit-mock-objects",
"version": "1.2.3",
"source": {
"type": "git",
"url": "https://github.com/sebastianbergmann/phpunit-mock-objects.git",
"reference": "5794e3c5c5ba0fb037b11d8151add2a07fa82875"
},
"dist": {
"type": "zip",
"url": "https://api.github.com/repos/sebastianbergmann/phpunit-mock-objects/zipball/5794e3c5c5ba0fb037b11d8151add2a07fa82875",
"reference": "5794e3c5c5ba0fb037b11d8151add2a07fa82875",
"shasum": ""
},
"require": {
"php": ">=5.3.3",
"phpunit/php-text-template": ">=1.1.1@stable"
},
"suggest": {
"ext-soap": "*"
},
"type": "library",
"autoload": {
"classmap": [
"PHPUnit/"
]
},
"notification-url": "https://packagist.org/downloads/",
"include-path": [
""
],
"license": [
"BSD-3-Clause"
],
"authors": [
{
"name": "Sebastian Bergmann",
"email": "sb@sebastian-bergmann.de",
"role": "lead"
}
],
"description": "Mock Object library for PHPUnit",
"homepage": "https://github.com/sebastianbergmann/phpunit-mock-objects/",
"keywords": [
"mock",
"xunit"
],
"time": "2013-01-13 10:24:48"
},
{
"name": "symfony/yaml",
"version": "v2.6.6",
"target-dir": "Symfony/Component/Yaml",
"source": {
"type": "git",
"url": "https://github.com/symfony/Yaml.git",
"reference": "174f009ed36379a801109955fc5a71a49fe62dd4"
},
"dist": {
"type": "zip",
"url": "https://api.github.com/repos/symfony/Yaml/zipball/174f009ed36379a801109955fc5a71a49fe62dd4",
"reference": "174f009ed36379a801109955fc5a71a49fe62dd4",
"shasum": ""
},
"require": {
"php": ">=5.3.3"
},
"require-dev": {
"symfony/phpunit-bridge": "~2.7"
},
"type": "library",
"extra": {
"branch-alias": {
"dev-master": "2.6-dev"
}
},
"autoload": {
"psr-0": {
"Symfony\\Component\\Yaml\\": ""
}
},
"notification-url": "https://packagist.org/downloads/",
"license": [
"MIT"
],
"authors": [
{
"name": "Symfony Community",
"homepage": "http://symfony.com/contributors"
},
{
"name": "Fabien Potencier",
"email": "fabien@symfony.com"
}
],
"description": "Symfony Yaml Component",
"homepage": "http://symfony.com",
"time": "2015-03-30 15:54:10"
}
],
"aliases": [],
"minimum-stability": "stable",
"stability-flags": [],
"prefer-stable": false,
"prefer-lowest": false,
"platform": {
"php": ">=5.3.0"
},
"platform-dev": []
}
php-mf2-0.3.0/phpunit.xml 0000664 0000000 0000000 00000000316 12671525356 0015210 0 ustar 00root root 0000000 0000000
tests/Mf2
php-mf2-0.3.0/tests/ 0000775 0000000 0000000 00000000000 12671525356 0014141 5 ustar 00root root 0000000 0000000 php-mf2-0.3.0/tests/Mf2/ 0000775 0000000 0000000 00000000000 12671525356 0014565 5 ustar 00root root 0000000 0000000 php-mf2-0.3.0/tests/Mf2/ClassicMicroformatsTest.php 0000664 0000000 0000000 00000045113 12671525356 0022111 0 ustar 00root root 0000000 0000000 µf2 functionality.
*
* Mainly based off BC tables on http://microformats.org/wiki/microformats2#v2_vocabularies
*/
class ClassicMicroformatsTest extends PHPUnit_Framework_TestCase {
public function setUp() {
date_default_timezone_set('Europe/London');
}
public function testParsesClassicHcard() {
$input = '<div class="vcard">
<span class="fn">Barnaby Walters</span> is a person.
</div>';
$expected = '{"items": [{"type": ["h-card"], "properties": {"name": ["Barnaby Walters"]}}], "rels": {}}';
$parser = new Parser($input, '', true);
$this->assertJsonStringEqualsJsonString(json_encode($parser->parse()), $expected);
}
public function testParsesClassicHEntry() {
$input = '<div class="hentry">
<h1 class="entry-title">microformats2 Is Great</h1>
<p class="entry-summary">yes yes it is.</p>
</div>';
$expected = '{"items": [{"type": ["h-entry"], "properties": {"name": ["microformats2 Is Great"], "summary": ["yes yes it is."]}}], "rels": {}}';
$parser = new Parser($input, '', true);
$this->assertJsonStringEqualsJsonString(json_encode($parser->parse()), $expected);
}
public function testIgnoresClassicClassnamesUnderMf2Root() {
$input = <<
Not Me
I wrote this
EOT;
$parser = new Parser($input);
$result = $parser->parse();
$this->assertEquals('I wrote this', $result['items'][0]['properties']['author'][0]['properties']['name'][0]);
}
public function testIgnoresClassicPropertyClassnamesOutsideClassicRoots() {
$input = <<Mr. Invisible
EOT;
$parser = new Parser($input);
$result = $parser->parse();
$this->assertCount(0, $result['items']);
}
public function testParsesFBerrimanClassicHEntry() {
$input = <<
April was pretty decent. I got to attend two very good conferences and I got to speak at them.
EOT;
$parser = new Parser($input);
$result = $parser->parse();
$e = $result['items'][0];
$this->assertContains('h-entry', $e['type']);
}
public function testParsesSnarfedOrgArticleCorrectly() {
$input = file_get_contents(__DIR__ . '/snarfed.org.html');
$result = Mf2\parse($input, 'http://snarfed.org/2013-10-23_oauth-dropins');
}
public function testParsesHProduct() {
$input = <<<'EOT'
All-steel construction with non-skid rubber baseSpring-loaded inner channel prevents jamsAvailable in black, burgundy and beige <br /> <li>Staples up to 20 sheets</li>Select an ItemAll-steel construction with non-skid rubber baseSpring-loaded inner channel prevents jamsAvailable in black, burgundy and beige <br /> <li>Staples up to 20 sheets</li>Swingline® 747® Classic Desktop Staplers18.35USDhttp://www.staples-3p.com/s7/is/image/Staples/s0021414_sc7?$std$http://www.staples-3p.com/s7/is/image/Staples/s0021414_sc7?$thb$http://www.staples-3p.com/s7/is/image/Staples/s0021414/Swingline-747-Classic-Desktop-Staplers/product_SS264184All-steel construction with non-skid rubber baseFull stripStaples up to 20 sheetsEachAll-steel construction with non-skid rubber baseFull stripStaples up to 20 sheetsSwingline® 747® Classic Desktop Full Strip Stapler, 20 Sheet Capacity, Black18.35USDhttp://www.staples-3p.com/s7/is/image/Staples/s0021412_sc7?$std$http://www.staples-3p.com/s7/is/image/Staples/s0021412_sc7?$thb$http://www.staples-3p.com/s7/is/image/Staples/s0021412/Swingline-747-Classic-Desktop-Full-Strip-Stapler-20-Sheet-Capacity-Black/product_264184All-steel construction with non-skid rubber baseSpring-loaded inner channel prevents jamsBurgundy <br /> <li>Staples up to 20 sheets</li>EachAll-steel construction with non-skid rubber baseSpring-loaded inner channel prevents jamsBurgundy <br /> <li>Staples up to 20 sheets</li>Swingline® 747® Classic Desktop Stapler, Burgundy19.59USDhttp://www.staples-3p.com/s7/is/image/Staples/m000240695_sc7?$std$http://www.staples-3p.com/s7/is/image/Staples/m000240695_sc7?$thb$http://www.staples-3p.com/s7/is/image/Staples/m000240695/Swingline-747-Classic-Desktop-Stapler-Burgundy/product_41373220 sheet capacity with Swingline S.F.® 4® StaplesDurable metal constructionStapler opens for tacking abilityEach20 sheet capacity with Swingline S.F.® 4® StaplesDurable metal constructionStapler opens for tacking abilitySwingline® 747® Rio Red Stapler, 20 
Sheet Capacity39.49USDhttp://www.staples-3p.com/s7/is/image/Staples/s0446269_sc7?$std$http://www.staples-3p.com/s7/is/image/Staples/s0446269_sc7?$thb$http://www.staples-3p.com/s7/is/image/Staples/s0446269/Swingline-747-Rio-Red-Stapler-20-Sheet-Capacity/product_562485
EOT;
$result = Mf2\parse($input, 'http://www.staples.com/Swingline-747-Rio-Red-Stapler-20-Sheet-Capacity/product_562485');
$this->assertCount(4, $result['items']);
}
/**
* @see https://github.com/indieweb/php-mf2/issues/81
*/
public function test_vevent() {
$input = <<< EOT
');
}
/**
* Test microformats nested under u-* property classnames derive value: key from parsing as u-*
*/
public function testMicroformatsNestedUnderUPropertyClassnamesDeriveValueCorrectly() {
$input = '
';
$mf = Mf2\parse($input);
$this->assertEquals($mf['items'][0]['properties']['url'][0]['value'], 'This should be the value');
}
public function testMicroformatsNestedUnderUPropertyClassnamesDeriveValueFromURL() {
$input = '
';
$parser = new Parser($input);
$output = $parser->parse();
$this->assertArrayHasKey('start', $output['items'][0]['properties']);
$this->assertEquals('2012-10-07T21:18', $output['items'][0]['properties']['start'][0]);
}
/**
* @group parseDT
* @group valueClass
*/
public function testAbbrYYYY_MM_DD__HH_MM() {
$input = '
some day at 21:18
';
$parser = new Parser($input);
$output = $parser->parse();
$this->assertArrayHasKey('start', $output['items'][0]['properties']);
$this->assertEquals('2012-10-07T21:18', $output['items'][0]['properties']['start'][0]);
}
/**
* @group parseDT
* @group valueClass
*/
public function testYYYY_MM_DD__HHpm() {
$input = '
2012-10-07 at 9pm
';
$parser = new Parser($input);
$output = $parser->parse();
$this->assertArrayHasKey('start', $output['items'][0]['properties']);
$this->assertEquals('2012-10-07T21:00', $output['items'][0]['properties']['start'][0]);
}
/**
* @group parseDT
* @group valueClass
*/
public function testYYYY_MM_DD__HH_MMpm() {
$input = '
2012-10-07 at 9:00pm
';
$parser = new Parser($input);
$output = $parser->parse();
$this->assertArrayHasKey('start', $output['items'][0]['properties']);
$this->assertEquals('2012-10-07T21:00', $output['items'][0]['properties']['start'][0]);
}
/**
* @group parseDT
* @group valueClass
*/
public function testYYYY_MM_DD__HH_MM_SSpm() {
$input = '
2012-10-07 at 9:00:00pm
';
$parser = new Parser($input);
$output = $parser->parse();
$this->assertArrayHasKey('start', $output['items'][0]['properties']);
$this->assertEquals('2012-10-07T21:00:00', $output['items'][0]['properties']['start'][0]);
}
/**
* This test name refers to the value-class used within the dt-end.
* @group parseDT
* @group valueClass
*/
public function testImpliedDTEndWithValueClass() {
$input = '
2014-06-04 at 18:3019:30
';
$parser = new Parser($input);
$output = $parser->parse();
$this->assertArrayHasKey('start', $output['items'][0]['properties']);
$this->assertArrayHasKey('end', $output['items'][0]['properties']);
$this->assertEquals('2014-06-04T18:30', $output['items'][0]['properties']['start'][0]);
$this->assertEquals('2014-06-04T19:30', $output['items'][0]['properties']['end'][0]);
}
/**
* This test name refers to the lack of value-class within the dt-end.
* @group parseDT
* @group valueClass
*/
public function testImpliedDTEndWithoutValueClass() {
$input = '
';
//$parser = new Parser($input);
$output = Mf2\parse($input);
$this->assertArrayHasKey('content', $output['items'][0]['properties']);
$this->assertEquals('Here is a load of embedded markup', $output['items'][0]['properties']['content'][0]['html']);
$this->assertEquals('Here is a load of embedded markup', $output['items'][0]['properties']['content'][0]['value']);
}
public function testParseEResolvesRelativeLinks() {
$input = '
', $output['items'][0]['properties']['published'][0]);
$this->assertEquals('', $output['items'][0]['properties']['url'][0]);
}
public function testHtmlEncodesImpliedProperties() {
$input = '<name>';
$parser = new Parser($input);
$output = $parser->parse();
$this->assertEquals('', $output['items'][0]['properties']['name'][0]);
$this->assertEquals('', $output['items'][0]['properties']['url'][0]);
$this->assertEquals('', $output['items'][0]['properties']['photo'][0]);
}
public function testParsesRelValues() {
$input = '<a rel="author" href="http://example.com">Mr. Author</a>';
$parser = new Parser($input);
$output = $parser->parse();
$this->assertArrayHasKey('rels', $output);
$this->assertEquals('http://example.com', $output['rels']['author'][0]);
}
public function testParsesRelAlternateValues() {
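$input = '<a rel="alternate home" href="http://example.org" hreflang="de" media="screen" type="text/html" title="German Homepage Link">German Homepage</a>';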
$parser = new Parser($input);
$output = $parser->parse();
$this->assertArrayHasKey('alternates', $output);
$this->assertEquals('http://example.org', $output['alternates'][0]['url']);
$this->assertEquals('home', $output['alternates'][0]['rel']);
$this->assertEquals('de', $output['alternates'][0]['hreflang']);
$this->assertEquals('screen', $output['alternates'][0]['media']);
$this->assertEquals('text/html', $output['alternates'][0]['type']);
$this->assertEquals('German Homepage Link', $output['alternates'][0]['title']);
$this->assertEquals('German Homepage', $output['alternates'][0]['text']);
}
public function testParseFromIdOnlyReturnsMicroformatsWithinThatId() {
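$input = <<<EOT
<div class="h-entry"><span class="p-name">Not Included</span></div>
<div id="parse-here"><span class="h-card">Included</span></div>
<div class="h-entry"><span class="p-name">Not Included</span></div>
EOT;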
$parser = new Parser($input);
$output = $parser->parseFromId('parse-here');
$this->assertCount(1, $output['items']);
$this->assertEquals('Included', $output['items'][0]['properties']['name'][0]);
}
/**
* Issue #21 github.com/indieweb/php-mf2/issues/21
*/
public function testDoesntAddArraysWithOnlyValueForAlreadyParsedNestedMicroformats() {
$input = <<
Nested Author
Real Author
EOT;
$parser = new Parser($input);
$output = $parser->parse();
$this->assertCount(1, $output['items'][0]['properties']['author']);
}
public function testParsesNestedMicroformatsWithClassnamesInAnyOrder() {
$input = <<
Name
EOT;
$parser = new Parser($input);
$output = $parser->parse();
$this->assertCount(1, $output['items'][0]['properties']['in-reply-to']);
$this->assertEquals('Name', $output['items'][0]['properties']['in-reply-to'][0]['properties']['name'][0]);
}
/**
* @group network
*/
public function testFetchMicroformats() {
$mf = Mf2\fetch('http://waterpigs.co.uk/');
$this->assertArrayHasKey('items', $mf);
$mf = Mf2\fetch('http://waterpigs.co.uk/photo.jpg', null, $curlInfo);
$this->assertNull($mf);
$this->assertContains('jpeg', $curlInfo['content_type']);
}
/**
* @see https://github.com/indieweb/php-mf2/issues/48
*/
public function testIgnoreClassesEndingInHyphen() {
$input = '<span class="h-entry"> <span class="e-">foo</span> </span>';
$parser = new Parser($input);
$output = $parser->parse();
$this->assertArrayNotHasKey('0', $output['items'][0]['properties']);
}
/**
* @see https://github.com/indieweb/php-mf2/issues/52
* @see https://github.com/tommorris/mf2py/commit/92740deb7e19b8f1e7fbf6bec001cf52f2b07e99
*/
public function testIgnoresTemplateElements() {
$result = Mf2\parse('<template class="h-card">Tom Morris</template>');
$this->assertCount(0, $result['items']);
}
/**
* @see https://github.com/indieweb/php-mf2/issues/53
* @see http://microformats.org/wiki/microformats2-parsing#parsing_an_e-_property
*/
public function testConvertsNestedImgElementToAltOrSrc() {
$input = <<
It is a strange thing to see a
EOT;
$result = Mf2\parse($input, 'http://waterpigs.co.uk/articles/five-legged-elephant');
$this->assertEquals('It is a strange thing to see a five legged elephant', $result['items'][0]['properties']['content'][0]['value']);
}
// parser not respecting not[h-*] in rule "else if .h-x>a[href]:only-of-type:not[.h-*] then use that [href] for url"
public function testNotImpliedUrlFromHCard() {
$input = <<John Q
EOT;
$parser = new Parser($input);
$output = $parser->parse();
$this->assertArrayNotHasKey('url', $output['items'][0]['properties']);
}
public function testAreaTag() {
$input = <<
EOT;
$parser = new Parser($input);
$output = $parser->parse();
$this->assertEquals('Person Bee', $output['items'][0]['properties']['name'][0]);
$this->assertEquals('rect', $output['items'][0]['properties']['category'][0]['shape']);
$this->assertEquals('100,100,120,120', $output['items'][0]['properties']['category'][0]['coords']);
$this->assertEquals('Person Bee', $output['items'][0]['properties']['category'][0]['value']);
}
public function testParseHcardInCategory() {
$input = <<Alice tagged
Bob Smith in
EOT;
$parser = new Parser($input);
$output = $parser->parse();
$this->assertContains('h-entry', $output['items'][0]['type']);
$this->assertArrayHasKey('category', $output['items'][0]['properties']);
$this->assertContains('h-card', $output['items'][0]['properties']['category'][0]['type']);
$this->assertArrayHasKey('name', $output['items'][0]['properties']['category'][0]['properties']);
$this->assertEquals('Bob Smith', $output['items'][0]['properties']['category'][0]['properties']['name'][0]);
$this->assertArrayHasKey('url', $output['items'][0]['properties']['category'][0]['properties']);
$this->assertEquals('http://b.example.com/', $output['items'][0]['properties']['category'][0]['properties']['url'][0]);
}
public function testApplyTransformationToSrcset() {
$transformation = function ($url) {
return 'https://example.com/' . ltrim($url, '/');
};
// Example from https://developers.whatwg.org/edits.html#attr-img-srcset
$srcset = 'banner-HD.jpeg 2x, banner-phone.jpeg 100w, banner-phone-HD.jpeg 100w 2x';
$result = Mf2\applySrcsetUrlTransformation($srcset, $transformation);
$this->assertEquals('https://example.com/banner-HD.jpeg 2x, https://example.com/banner-phone.jpeg 100w, https://example.com/banner-phone-HD.jpeg 100w 2x', $result);
}
/**
* @see https://github.com/indieweb/php-mf2/issues/84
*/
public function testRelativeURLResolvedWithFinalURL() {
$mf = Mf2\fetch('http://aaron.pk/4Zn5');
$this->assertEquals('https://aaronparecki.com/2014/12/23/5/photo.jpeg', $mf['items'][0]['properties']['photo'][0]);
}
public function testScriptTagContentsRemovedFromTextValue() {
$input = <<
Hello World
EOT;
$parser = new Parser($input);
$output = $parser->parse();
$this->assertContains('h-entry', $output['items'][0]['type']);
$this->assertContains('Hello World', $output['items'][0]['properties']['content'][0]);
$this->assertNotContains('alert', $output['items'][0]['properties']['content'][0]);
}
public function testScriptElementContentsRemovedFromAllPlaintextValues() {
$input = <<containedcontained
EOT;
$parser = new Parser($input);
$output = $parser->parse();
$this->assertNotContains('not contained', $output['items'][0]['properties']['published'][0]);
$this->assertNotContains('not contained', $output['items'][0]['properties']['url'][0]);
}
public function testScriptTagContentsNotRemovedFromHTMLValue() {
$input = <<
Hello World
EOT;
$parser = new Parser($input);
$output = $parser->parse();
$this->assertContains('h-entry', $output['items'][0]['type']);
$this->assertContains('Hello World', $output['items'][0]['properties']['content'][0]['value']);
$this->assertContains('Hello World', $output['items'][0]['properties']['content'][0]['html']);
# The script and style tags should be removed from plaintext results but left in HTML results.
$this->assertContains('alert', $output['items'][0]['properties']['content'][0]['html']);
$this->assertNotContains('alert', $output['items'][0]['properties']['content'][0]['value']);
$this->assertContains('visibility', $output['items'][0]['properties']['content'][0]['html']);
$this->assertNotContains('visibility', $output['items'][0]['properties']['content'][0]['value']);
}
public function testWhitespaceBetweenElements() {
$input = <<
I'm attending
Homebrew Website Club at Quip
Thanks for hosting!
EOT;
$parser = new Parser($input);
$output = $parser->parse();
$this->assertContains('h-entry', $output['items'][0]['type']);
$this->assertNotContains('attendingHomebrew', $output['items'][0]['properties']['name'][0]);
}
}
php-mf2-0.3.0/tests/Mf2/URLTest.php 0000664 0000000 0000000 00000023714 12671525356 0016607 0 ustar 00root root 0000000 0000000 assertEquals('one/two', $input);
$input = './one/two';
mf2\removeLeadingDotSlash($input);
$this->assertEquals('one/two', $input);
}
public function testRemoveLeadingSlashDot() {
$input = '/./one/two';
mf2\removeLeadingSlashDot($input);
$this->assertEquals('/one/two', $input);
$input = '/.';
mf2\removeLeadingSlashDot($input);
$this->assertEquals('/', $input);
$input = '/./../';
mf2\removeLeadingSlashDot($input);
$this->assertEquals('/../', $input);
$input = '/./../../g';
mf2\removeLeadingSlashDot($input);
$this->assertEquals('/../../g', $input);
}
public function testRemoveOneDirLevel() {
$input = '/../../g';
$output = '/a/b/c';
mf2\removeOneDirLevel($input, $output);
$this->assertEquals('/../g', $input);
$this->assertEquals('/a/b', $output);
$input = '/..';
$output = '/a/b/c';
mf2\removeOneDirLevel($input, $output);
$this->assertEquals('/', $input);
$this->assertEquals('/a/b', $output);
}
public function testRemoveLoneDotDot() {
$input = '.';
mf2\removeLoneDotDot($input);
$this->assertEquals('', $input);
$input = '..';
mf2\removeLoneDotDot($input);
$this->assertEquals('', $input);
}
public function testMoveOneSegmentFromInput() {
$input = '/a/b/c/./../../g';
$output = '';
mf2\moveOneSegmentFromInput($input, $output);
$this->assertEquals('/b/c/./../../g', $input);
$this->assertEquals('/a', $output);
$input = '/b/c/./../../g';
$output = '/a';
mf2\moveOneSegmentFromInput($input, $output);
$this->assertEquals('/c/./../../g', $input);
$this->assertEquals('/a/b', $output);
$input = '/c/./../../g';
$output = '/a/b';
mf2\moveOneSegmentFromInput($input, $output);
$this->assertEquals('/./../../g', $input);
$this->assertEquals('/a/b/c', $output);
$input = '/g';
$output = '/a';
mf2\moveOneSegmentFromInput($input, $output);
$this->assertEquals('', $input);
$this->assertEquals('/a/g', $output);
}
/**
* @dataProvider removeDotSegmentsData
*/
public function testRemoveDotSegments($assert, $path, $expected) {
$actual = mf2\removeDotSegments($path);
$this->assertEquals($expected, $actual, $assert);
}
public function removeDotSegmentsData() {
return array(
array('Should remove .. and .',
'/a/b/c/./../../g', '/a/g'),
array('Should remove ../..',
'/a/b/c/d/../../../g', '/a/g'),
array('Should not add leading slash',
'a/b/c', 'a/b/c'),
);
}
public function testNoPathOnBase() {
$actual = mf2\resolveUrl('http://example.com', '');
$this->assertEquals('http://example.com/', $actual);
$actual = mf2\resolveUrl('http://example.com', '#');
$this->assertEquals('http://example.com/#', $actual);
$actual = mf2\resolveUrl('http://example.com', '#thing');
$this->assertEquals('http://example.com/#thing', $actual);
}
public function testMisc() {
$expected = 'http://a/b/c/g';
$actual = mf2\resolveUrl('http://a/b/c/d;p?q', './g');
$this->assertEquals($expected, $actual);
$expected = 'http://a/b/c/g/';
$actual = mf2\resolveUrl('http://a/b/c/d;p?q', './g/');
$this->assertEquals($expected, $actual);
$expected = 'http://a/b/';
$actual = mf2\resolveUrl('http://a/b/c/d;p?q', '..');
$this->assertEquals($expected, $actual);
}
/** as per https://github.com/indieweb/php-mf2/issues/35 */
public function testResolvesProtocolRelativeUrlsCorrectly() {
$expected = 'http://cdn.example.org/thing/asset.css';
$actual = Mf2\resolveUrl('http://example.com', '//cdn.example.org/thing/asset.css');
$this->assertEquals($expected, $actual);
$expected = 'https://cdn.example.org/thing/asset.css';
$actual = Mf2\resolveUrl('https://example.com', '//cdn.example.org/thing/asset.css');
$this->assertEquals($expected, $actual);
}
/**
* @dataProvider testData
*/
public function testReturnsUrlIfAbsolute($assert, $base, $url, $expected) {
$actual = mf2\resolveUrl($base, $url);
$this->assertEquals($expected, $actual, $assert);
}
public function testData() {
// seriously, please update to PHP 5.4 so I can use nice array syntax ;)
// fail message, base, url, expected
$cases = array(
array('Should return absolute URL unchanged',
'http://example.com', 'http://example.com', 'http://example.com'),
array('Should return root given blank path',
'http://example.com', '', 'http://example.com/'),
array('Should return input unchanged given full URL and blank path',
'http://example.com/something', '', 'http://example.com/something'),
array('Should handle blank base URL',
'', 'http://example.com', 'http://example.com'),
array('Should resolve fragment ID',
'http://example.com', '#thing', 'http://example.com/#thing'),
array('Should resolve blank fragment ID',
'http://example.com', '#', 'http://example.com/#'),
array('Should resolve same level URL',
'http://example.com', 'thing', 'http://example.com/thing'),
array('Should resolve directory level URL',
'http://example.com', './thing', 'http://example.com/thing'),
array('Should resolve parent level URL at root level',
'http://example.com', '../thing', 'http://example.com/thing'),
array('Should resolve nested URL',
'http://example.com/something', 'another', 'http://example.com/another'),
array('Should ignore query strings in base url',
'http://example.com/index.php?url=http://example.org', '/thing', 'http://example.com/thing'),
array('Should resolve query strings',
'http://example.com/thing', '?stuff=yes', 'http://example.com/thing?stuff=yes'),
array('Should resolve dir level query strings',
'http://example.com', './?thing=yes', 'http://example.com/?thing=yes'),
array('Should resolve up one level from root domain',
'http://example.com', 'path/to/the/../file', 'http://example.com/path/to/file'),
array('Should resolve up one level from base with path',
'http://example.com/path/the', 'to/the/../file', 'http://example.com/path/to/file'),
// Tests from webignition library
array('relative add host from base',
'http://www.example.com', 'server.php', 'http://www.example.com/server.php'),
array('relative add scheme host pass from base',
'http://:pass@www.example.com', 'server.php', 'http://:pass@www.example.com/server.php'),
array('relative add scheme host user pass from base',
'http://user:pass@www.example.com', 'server.php', 'http://user:pass@www.example.com/server.php'),
array('relative base has file path',
'http://example.com/index.html', 'example.html', 'http://example.com/example.html'),
array('input has absolute path',
'http://www.example.com/pathOne/pathTwo/pathThree', '/server.php?param1=value1', 'http://www.example.com/server.php?param1=value1'),
array('test absolute url with path',
'http://www.example.com/', 'http://www.example.com/pathOne', 'http://www.example.com/pathOne'),
array('testRelativePathIsTransformedIntoCorrectAbsoluteUrl',
'http://www.example.com/pathOne/pathTwo/pathThree', 'server.php?param1=value1', 'http://www.example.com/pathOne/pathTwo/server.php?param1=value1'),
array('testAbsolutePathHasDotDotDirectoryAndSourceHasFileName',
'http://www.example.com/pathOne/index.php', '../jquery.js', 'http://www.example.com/jquery.js'),
array('testAbsolutePathHasDotDotDirectoryAndSourceHasDirectoryWithTrailingSlash',
'http://www.example.com/pathOne/', '../jquery.js', 'http://www.example.com/jquery.js'),
array('testAbsolutePathHasDotDotDirectoryAndSourceHasDirectoryWithoutTrailingSlash',
'http://www.example.com/pathOne', '../jquery.js', 'http://www.example.com/jquery.js'),
array('testAbsolutePathHasDotDirectoryAndSourceHasFilename',
'http://www.example.com/pathOne/index.php', './jquery.js', 'http://www.example.com/pathOne/jquery.js'),
array('testAbsolutePathHasDotDirectoryAndSourceHasDirectoryWithTrailingSlash',
'http://www.example.com/pathOne/', './jquery.js', 'http://www.example.com/pathOne/jquery.js'),
array('testAbsolutePathHasDotDirectoryAndSourceHasDirectoryWithoutTrailingSlash',
'http://www.example.com/pathOne', './jquery.js', 'http://www.example.com/jquery.js'),
array('testAbsolutePathIncludesPortNumber',
'http://example.com:8080/index.html', '/photo.jpg', 'http://example.com:8080/photo.jpg')
);
// PHP 5.4 and earlier return a different result, but either is acceptable
if(version_compare(PHP_VERSION, '5.5.0', '<')) {
$cases[] = array('relative add scheme host user from base',
'http://user:@www.example.com', 'server.php', 'http://user@www.example.com/server.php');
} else {
$cases[] = array('relative add scheme host user from base',
'http://user:@www.example.com', 'server.php', 'http://user:@www.example.com/server.php');
}
// Test cases from RFC
// http://tools.ietf.org/html/rfc3986#section-5.4
$rfcTests = array(
array("g:h", "g:h"),
array("g", "http://a/b/c/g"),
array("./g", "http://a/b/c/g"),
array("g/", "http://a/b/c/g/"),
array("/g", "http://a/g"),
array("//g", "http://g"),
array("?y", "http://a/b/c/d;p?y"),
array("g?y", "http://a/b/c/g?y"),
array("#s", "http://a/b/c/d;p?q#s"),
array("g#s", "http://a/b/c/g#s"),
array("g?y#s", "http://a/b/c/g?y#s"),
array(";x", "http://a/b/c/;x"),
array("g;x", "http://a/b/c/g;x"),
array("g;x?y#s", "http://a/b/c/g;x?y#s"),
array("", "http://a/b/c/d;p?q"),
array(".", "http://a/b/c/"),
array("./", "http://a/b/c/"),
array("..", "http://a/b/"),
array("../", "http://a/b/"),
array("../g", "http://a/b/g"),
array("../..", "http://a/"),
array("../../", "http://a/"),
array("../../g", "http://a/g")
);
foreach($rfcTests as $i=>$test) {
$cases[] = array(
'test rfc ' . $i, 'http://a/b/c/d;p?q', $test[0], $test[1]
);
}
return $cases;
}
}
php-mf2-0.3.0/tests/Mf2/bootstrap.php 0000664 0000000 0000000 00000000075 12671525356 0017315 0 ustar 00root root 0000000 0000000
fberriman | a blog for frances
Austin! One of my favourite cities (mostly because I love tacos). Was very pleased to be asked to return to this conference after I spoke there last year. The day was remarkable, if only because it’s one of the first conferences in a very long time where I actually watched all of the talks (although Rebecca, being on before me, may have only had half of my attention). Really a very well curated day, and I felt very lucky to be in the line-up.
Alex was not overly prescriptive in what I should talk about, but suggested he liked the content of last year and would like a little more on that. So, I decided to pick an aspect of it that I felt was important to us at GDS and fundamental to the success of our Design Principles.
For me, it’s been our honesty and simple language. The words that we’ve used to talk about user needs, technical aspects of the site and the ethos have been plain and no-nonsense. I think this is hugely down to the strength of a team that has the confidence to cut through bullshit and say what it really means – Russell and Sarah are particularly brilliant at this, and have had huge parts to play in getting this cult of simple down in writing.
The tech scene is sort of rife with nonsense words. Buzzwords and clichés and the new name for the next big thing, which is actually the new name for the same old sensible thing – but with better marketing and a twitter hashtag. Ugh. I want a lot less of that in our world.
So, I picked on a few of these and showed a few examples from how we’re dealing with them at GDS. I believe the video for that talk is out now, but the slides are here.
I attended this conference last year – definitely a favourite for its surprisingly sunny weather and for being one of the most friendly events I had been to in 2012. So, I was really glad to get to come back and share our Design Principles with the crowd.
It was very similar to the talk I gave at TXJS last year, except we’ve done a whole lot more at GDS since June of last year – we released v1.0 of gov.uk, and a bunch of other stuff like the performance platform, Inside Government (and the 24 departments) and foreign travel advice, to name a few. I showcased some of these things, and then went through the design principles with the lovely, receptive, Polish audience and it seemed to go over rather well. The slides for this version of the talk are here.
Three days are a lot for a conference, but it was really high quality throughout and the breadth of subjects was really great. I wouldn’t recommend putting the party on the second night again, however – that last morning was something of a challenge. :)
I’ve had a fair few people ask about the Jawbone Up I’ve been wearing since November (the second version, not the recalled first one – although, as you’ll read, perhaps this one should have been too). Here’s how I’ve found it.
The good
The reason I waited on the Up, over say the Nike Fuelband, was because I wanted a wrist-wearable tracker plus sleep data. The FitBit One has a wearable night-time band, but it looks rather large and cumbersome and I didn’t want a clothes-clip tracker in the day time (where do dress wearers clip them?).
The Up band’s size is really good and it’s comfy and it doesn’t look ridiculous.
I like the sleep tracking, although I feel like it’s not terribly accurate – if I wake and don’t move around much, it doesn’t record it as a waking period – but it’s accurate enough to collect the information I’m interested in.
I have been a bad sleeper for a long time, but having actual data about the length of time I’ve been asleep and awake has helped reduce my anxiety about a bad night’s sleep (it always feels like a lifetime when you’re awake in the middle of the night and don’t want to be – but turns out it isn’t), which in turn has helped improve how well I go to sleep generally, I think.
I also like the smart-alarm – before I’d put off looking at the clock to see the time, but the gentle nudge that, yes, it is about time I got up is really useful, and again, anxiety reducing.
The steps tracking seems fine. I’ve never bothered to calibrate it, since I don’t do much exercise except walking – but it seems to match the distances I do regularly around the city. It’s fun – I’m not competing, so it’s mostly just interesting. I hear from others that it basically can’t cope with running or cycling, though.
The bad
It broke. Twice. The first time, it broke after about 6 weeks – the vibration feature (needed for the smart-alarm and idle alert) just stopped working for no apparent reason. At the time, the Up band wasn’t out in the UK, so Jawbone were not willing to replace it (ugh) but when I said I’d be in New York for a week, they agreed to courier me a replacement to the Google office there while I was in town – which I think was really just a nice act on the part of one very good customer service rep I’d met on the support forums. Had I not been on the forum or nagged on twitter, I suspect I’d have been left out of pocket.
Unfortunately, the second band stopped working a couple of months later. The smart-alarm feature became temperamental and often wouldn’t go off at all, and the button on the end of the band had become dislodged and no longer clicked. This time, the band was out in the UK, and they sent me another one immediately.
I’ve been wearing the third one for about a week and I honestly expect it’ll break soon, too, sadly. Edd, who originally picked up my first band for me while he was in New York, had his first and second bands break too (the second after only 2 weeks) – so the statistical data I have available to me is not very favourable and a quick look through the forums will find most people in similar situations.
The other stuff
They just released third-party app integration, but sadly on iOS devices only (I use a Nexus 4 day to day, so syncing with an iOS device is an extra annoyance if I want to use those features). I expect that’ll help make the data the band is recording more interesting.
Otherwise, these are things I wish it had:
A visible meter or something on the band. I have to sync it with my phone to find out how I’m doing. It doesn’t even tell me the time. I feel like it’s not providing me with much in return for the space it’s taking up on my wrist.
There’s no web view – the only way to share the data is through facebook (meh) or if your friend is also an Up user (which is basically no one). I’d like to be able to let my husband see my sleep data – then he’ll know that I’m just grumpy because I’m tired. He can sneak a look at it on my phone, I guess, but it would just be nice to have a public view somewhere on the web.
The food and mood logging is boring and pointless. It may be that the new app integration gives this value, but it was onerous and I gave up after a week. The insights offered to you only ever related to steps and sleep, so no matter how much food and mood you logged, it was for your own entertainment only. These features appear to be rather tacked-on.
Some people complain about the lack of wireless sync as a deal-breaker (you sync it via the mic jack). This personally doesn’t greatly bother me (longer battery life is a reasonable trade), but given that I have to take it off my arm to find out anything about it, as mentioned above, then I think it would have been preferable in this case to sync wirelessly.
But these are all minor gripes – I’d recommend it but for the fact that they clearly have not managed to make a band that doesn’t expire every 2 months.
I’m mostly just hoping this band will hold out long enough for the delivery of the Fitbit Flex I just pre-ordered.
Update: My 3rd band has the same smart alarm fault. Sigh.
As I mentioned, I wrote for the Pastry Box Project for all of 2012.
Now, it’s hopefully going to be printed in dead tree form with the royalties going to the Red Cross. That’s kind of nice, as are many of the fancier offerings at the higher tiers (hand press? illustrations? all sorts!).