mwclient-0.10.0/LICENSE.md
Copyright (c) 2006-2013 Bryan Tong Minh
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation
files (the "Software"), to deal in the Software without
restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following
conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
mwclient-0.10.0/MANIFEST.in
include README.md
include LICENSE.md
mwclient-0.10.0/PKG-INFO
Metadata-Version: 2.1
Name: mwclient
Version: 0.10.0
Summary: MediaWiki API client
Home-page: https://github.com/btongminh/mwclient
Author: Bryan Tong Minh
Author-email: bryan.tongminh@gmail.com
License: MIT
Description:

# mwclient
[![Build status][build-status-img]](https://travis-ci.org/mwclient/mwclient)
[![Test coverage][test-coverage-img]](https://coveralls.io/r/mwclient/mwclient)
[![Code health][code-health-img]](https://landscape.io/github/mwclient/mwclient/master)
[![Latest version][latest-version-img]](https://pypi.python.org/pypi/mwclient)
[![MIT license][mit-license-img]](http://opensource.org/licenses/MIT)
[![Documentation status][documentation-status-img]](http://mwclient.readthedocs.io/en/latest/)
[![Issue statistics][issue-statistics-img]](http://isitmaintained.com/project/mwclient/mwclient)
[![Gitter chat][gitter-chat-img]](https://gitter.im/mwclient/mwclient)

[build-status-img]: https://img.shields.io/travis/mwclient/mwclient.svg
[test-coverage-img]: https://img.shields.io/coveralls/mwclient/mwclient.svg
[code-health-img]: https://landscape.io/github/mwclient/mwclient/master/landscape.svg?style=flat
[latest-version-img]: https://img.shields.io/pypi/v/mwclient.svg
[mit-license-img]: https://img.shields.io/github/license/mwclient/mwclient.svg
[documentation-status-img]: https://readthedocs.org/projects/mwclient/badge/
[issue-statistics-img]: http://isitmaintained.com/badge/resolution/mwclient/mwclient.svg
[gitter-chat-img]: https://img.shields.io/gitter/room/mwclient/mwclient.svg

mwclient is a lightweight Python client library to the
[MediaWiki API](https://mediawiki.org/wiki/API)
which provides access to most API functionality.
It works with Python 2.7 as well as 3.4 and above,
and supports MediaWiki 1.16 and above.
For functions not available in the current MediaWiki,
a `MediaWikiVersionError` is raised.
The current stable
[version 0.10.0](https://github.com/mwclient/mwclient/archive/v0.10.0.zip)
is [available through PyPI](https://pypi.python.org/pypi/mwclient):
```
$ pip install mwclient
```
The current [development version](https://github.com/mwclient/mwclient)
can be installed from GitHub:
```
$ pip install git+git://github.com/mwclient/mwclient.git
```
Please see the [changelog
document](https://github.com/mwclient/mwclient/blob/master/CHANGELOG.md)
for a list of changes.
## Documentation
Up-to-date documentation is hosted [at Read the Docs](http://mwclient.readthedocs.io/en/latest/).
It includes a user guide to get started using mwclient, a reference guide,
implementation and development notes.
There is also some documentation on the [GitHub wiki](https://github.com/mwclient/mwclient/wiki)
that hasn't been ported yet.
If you want to help, you're welcome!
## Contributing
Patches are welcome! See [this page](https://mwclient.readthedocs.io/en/latest/development/)
for information on how to get started with mwclient development.
Keywords: mediawiki wikipedia
Platform: UNKNOWN
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Description-Content-Type: text/markdown
mwclient-0.10.0/README.md
# mwclient
[![Build status][build-status-img]](https://travis-ci.org/mwclient/mwclient)
[![Test coverage][test-coverage-img]](https://coveralls.io/r/mwclient/mwclient)
[![Code health][code-health-img]](https://landscape.io/github/mwclient/mwclient/master)
[![Latest version][latest-version-img]](https://pypi.python.org/pypi/mwclient)
[![MIT license][mit-license-img]](http://opensource.org/licenses/MIT)
[![Documentation status][documentation-status-img]](http://mwclient.readthedocs.io/en/latest/)
[![Issue statistics][issue-statistics-img]](http://isitmaintained.com/project/mwclient/mwclient)
[![Gitter chat][gitter-chat-img]](https://gitter.im/mwclient/mwclient)

[build-status-img]: https://img.shields.io/travis/mwclient/mwclient.svg
[test-coverage-img]: https://img.shields.io/coveralls/mwclient/mwclient.svg
[code-health-img]: https://landscape.io/github/mwclient/mwclient/master/landscape.svg?style=flat
[latest-version-img]: https://img.shields.io/pypi/v/mwclient.svg
[mit-license-img]: https://img.shields.io/github/license/mwclient/mwclient.svg
[documentation-status-img]: https://readthedocs.org/projects/mwclient/badge/
[issue-statistics-img]: http://isitmaintained.com/badge/resolution/mwclient/mwclient.svg
[gitter-chat-img]: https://img.shields.io/gitter/room/mwclient/mwclient.svg

mwclient is a lightweight Python client library to the
[MediaWiki API](https://mediawiki.org/wiki/API)
which provides access to most API functionality.
It works with Python 2.7 as well as 3.4 and above,
and supports MediaWiki 1.16 and above.
For functions not available in the current MediaWiki,
a `MediaWikiVersionError` is raised.
The current stable
[version 0.10.0](https://github.com/mwclient/mwclient/archive/v0.10.0.zip)
is [available through PyPI](https://pypi.python.org/pypi/mwclient):
```
$ pip install mwclient
```
The current [development version](https://github.com/mwclient/mwclient)
can be installed from GitHub:
```
$ pip install git+git://github.com/mwclient/mwclient.git
```
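A minimal usage sketch (the hostname, page title and credentials below are placeholders; adjust them for your wiki):
```
import mwclient

# Connect to a wiki over HTTPS. The hostname is a placeholder.
site = mwclient.Site('en.wikipedia.org')

# Read a page.
page = site.pages['Example']
print(page.text())

# Writing requires logging in first (credentials are placeholders).
site.login('MyBotUser', 'my-bot-password')
```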
Please see the [changelog
document](https://github.com/mwclient/mwclient/blob/master/CHANGELOG.md)
for a list of changes.
## Documentation
Up-to-date documentation is hosted [at Read the Docs](http://mwclient.readthedocs.io/en/latest/).
It includes a user guide to get started using mwclient, a reference guide,
implementation and development notes.
There is also some documentation on the [GitHub wiki](https://github.com/mwclient/mwclient/wiki)
that hasn't been ported yet.
If you want to help, you're welcome!
## Contributing
Patches are welcome! See [this page](https://mwclient.readthedocs.io/en/latest/development/)
for information on how to get started with mwclient development.
mwclient-0.10.0/mwclient/__init__.py
"""
Copyright (c) 2006-2011 Bryan Tong Minh
Permission is hereby granted, free of charge, to any person
obtaining a copy of this software and associated documentation
files (the "Software"), to deal in the Software without
restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the
Software is furnished to do so, subject to the following
conditions:
The above copyright notice and this permission notice shall be
included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES
OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY,
WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
OTHER DEALINGS IN THE SOFTWARE.
"""
from mwclient.errors import * # noqa: F401, F403 # pylint: disable=unused-import
from mwclient.client import Site, __version__ # noqa: F401 # pylint: disable=unused-import
import logging
import warnings
# Show DeprecationWarning
warnings.simplefilter('always', DeprecationWarning)
# Logging: Add a null handler to avoid "No handler found" warnings.
try:
from logging import NullHandler
except ImportError:
class NullHandler(logging.Handler):
def emit(self, record):
pass
logging.getLogger(__name__).addHandler(NullHandler())
mwclient-0.10.0/mwclient/client.py
# encoding=utf-8
import warnings
import logging
from six import text_type
import six
from collections import OrderedDict
try:
import json
except ImportError:
import simplejson as json
import requests
from requests.auth import HTTPBasicAuth, AuthBase
from requests_oauthlib import OAuth1
import mwclient.errors as errors
import mwclient.listing as listing
from mwclient.sleep import Sleepers
from mwclient.util import parse_timestamp, read_in_chunks
try:
import gzip
except ImportError:
gzip = None
__version__ = '0.10.0'
log = logging.getLogger(__name__)
USER_AGENT = 'mwclient/{} ({})'.format(__version__, 'https://github.com/mwclient/mwclient')
class Site(object):
"""A MediaWiki site identified by its hostname.
>>> import mwclient
>>> site = mwclient.Site('en.wikipedia.org')
Do not include the leading "http://".
Mwclient assumes that the script path (where index.php and api.php are located)
is '/w/'. If the site uses a different script path, you must specify this
(path must end in a '/').
Examples:
>>> site = mwclient.Site('vim.wikia.com', path='/')
>>> site = mwclient.Site('sourceforge.net', path='/apps/mediawiki/mwclient/')
"""
api_limit = 500
def __init__(self, host, path='/w/', ext='.php', pool=None, retry_timeout=30,
max_retries=25, wait_callback=lambda *x: None, clients_useragent=None,
max_lag=3, compress=True, force_login=True, do_init=True, httpauth=None,
reqs=None, consumer_token=None, consumer_secret=None, access_token=None,
access_secret=None, client_certificate=None, custom_headers=None, scheme='https'):
# Setup member variables
self.host = host
self.path = path
self.ext = ext
self.credentials = None
self.compress = compress
self.max_lag = text_type(max_lag)
self.force_login = force_login
self.requests = reqs or {}
self.scheme = scheme
if 'timeout' not in self.requests:
self.requests['timeout'] = 30 # seconds
if consumer_token is not None:
auth = OAuth1(consumer_token, consumer_secret, access_token, access_secret)
elif isinstance(httpauth, (list, tuple)):
auth = HTTPBasicAuth(*httpauth)
elif httpauth is None or isinstance(httpauth, (AuthBase,)):
auth = httpauth
else:
raise RuntimeError('Authentication is not a tuple or an instance of AuthBase')
self.sleepers = Sleepers(max_retries, retry_timeout, wait_callback)
# Site properties
self.blocked = False # Whether current user is blocked
self.hasmsg = False # Whether current user has new messages
self.groups = [] # Groups current user belongs to
self.rights = [] # Rights current user has
self.tokens = {} # Edit tokens of the current user
self.version = None
self.namespaces = self.default_namespaces
self.writeapi = False
# Setup connection
if pool is None:
self.connection = requests.Session()
self.connection.auth = auth
if client_certificate:
self.connection.cert = client_certificate
# Set User-Agent header field
if clients_useragent:
ua = clients_useragent + ' ' + USER_AGENT
else:
ua = USER_AGENT
self.connection.headers['User-Agent'] = ua
if custom_headers:
self.connection.headers.update(custom_headers)
else:
self.connection = pool
# Page generators
self.pages = listing.PageList(self)
self.categories = listing.PageList(self, namespace=14)
self.images = listing.PageList(self, namespace=6)
# Compat page generators
self.Pages = self.pages
self.Categories = self.categories
self.Images = self.images
# Initialization status
self.initialized = False
# Upload chunk size in bytes
self.chunk_size = 1048576
if do_init:
try:
self.site_init()
except errors.APIError as e:
if e.args[0] == 'mwoauth-invalid-authorization':
raise errors.OAuthAuthorizationError(self, e.code, e.info)
# Private wiki, do init after login
if e.args[0] not in {u'unknown_action', u'readapidenied'}:
raise
def site_init(self):
if self.initialized:
info = self.get('query', meta='userinfo', uiprop='groups|rights')
userinfo = info['query']['userinfo']
self.username = userinfo['name']
self.groups = userinfo.get('groups', [])
self.rights = userinfo.get('rights', [])
self.tokens = {}
return
meta = self.get('query', meta='siteinfo|userinfo',
siprop='general|namespaces', uiprop='groups|rights',
retry_on_error=False)
# Extract site info
self.site = meta['query']['general']
self.namespaces = {
namespace['id']: namespace.get('*', '')
for namespace in six.itervalues(meta['query']['namespaces'])
}
self.writeapi = 'writeapi' in self.site
self.version = self.version_tuple_from_generator(self.site['generator'])
# Require MediaWiki version >= 1.16
self.require(1, 16)
# User info
userinfo = meta['query']['userinfo']
self.username = userinfo['name']
self.groups = userinfo.get('groups', [])
self.rights = userinfo.get('rights', [])
self.initialized = True
@staticmethod
def version_tuple_from_generator(string, prefix='MediaWiki '):
"""Return a version tuple from a MediaWiki Generator string.
Example:
"MediaWiki 1.5.1" → (1, 5, 1)
Args:
prefix (str): The expected prefix of the string
"""
if not string.startswith(prefix):
raise errors.MediaWikiVersionError('Unknown generator {}'.format(string))
version = string[len(prefix):].split('.')
def split_num(s):
"""Split the string on the first non-digit character.
Returns:
A tuple of the digit part as int and, if available,
the rest of the string.
"""
i = 0
while i < len(s):
if s[i] < '0' or s[i] > '9':
break
i += 1
if s[i:]:
return (int(s[:i]), s[i:], )
else:
return (int(s[:i]), )
version_tuple = sum((split_num(s) for s in version), ())
if len(version_tuple) < 2:
raise errors.MediaWikiVersionError('Unknown MediaWiki {}'
.format('.'.join(version)))
return version_tuple
default_namespaces = {
0: u'', 1: u'Talk', 2: u'User', 3: u'User talk', 4: u'Project',
5: u'Project talk', 6: u'Image', 7: u'Image talk', 8: u'MediaWiki',
9: u'MediaWiki talk', 10: u'Template', 11: u'Template talk', 12: u'Help',
13: u'Help talk', 14: u'Category', 15: u'Category talk',
-1: u'Special', -2: u'Media'
}
def __repr__(self):
return "" % (self.host, self.path)
def get(self, action, *args, **kwargs):
"""Perform a generic API call using GET.
This is just a shorthand for calling api() with http_method='GET'.
All arguments will be passed on.
Returns:
The raw response from the API call, as a dictionary.
"""
return self.api(action, 'GET', *args, **kwargs)
def post(self, action, *args, **kwargs):
"""Perform a generic API call using POST.
This is just a shorthand for calling api() with http_method='POST'.
All arguments will be passed on.
Returns:
The raw response from the API call, as a dictionary.
"""
return self.api(action, 'POST', *args, **kwargs)
def api(self, action, http_method='POST', *args, **kwargs):
"""Perform a generic API call and handle errors.
All arguments will be passed on.
Example:
To get coordinates from the GeoData MediaWiki extension at English Wikipedia:
>>> site = Site('en.wikipedia.org')
>>> result = site.api('query', prop='coordinates', titles='Oslo|Copenhagen')
>>> for page in result['query']['pages'].values():
... if 'coordinates' in page:
            ...         print('{} {} {}'.format(page['title'],
            ...               page['coordinates'][0]['lat'],
            ...               page['coordinates'][0]['lon']))
Oslo 59.95 10.75
Copenhagen 55.6761 12.5683
Returns:
The raw response from the API call, as a dictionary.
"""
kwargs.update(args)
if action == 'query' and 'continue' not in kwargs:
kwargs['continue'] = ''
if action == 'query':
if 'meta' in kwargs:
kwargs['meta'] += '|userinfo'
else:
kwargs['meta'] = 'userinfo'
if 'uiprop' in kwargs:
kwargs['uiprop'] += '|blockinfo|hasmsg'
else:
kwargs['uiprop'] = 'blockinfo|hasmsg'
sleeper = self.sleepers.make()
while True:
info = self.raw_api(action, http_method, **kwargs)
if not info:
info = {}
if self.handle_api_result(info, sleeper=sleeper):
return info
def handle_api_result(self, info, kwargs=None, sleeper=None):
if sleeper is None:
sleeper = self.sleepers.make()
try:
userinfo = info['query']['userinfo']
except KeyError:
userinfo = ()
if 'blockedby' in userinfo:
self.blocked = (userinfo['blockedby'], userinfo.get('blockreason', u''))
else:
self.blocked = False
self.hasmsg = 'messages' in userinfo
self.logged_in = 'anon' not in userinfo
if 'warnings' in info:
for module, warning in info['warnings'].items():
if '*' in warning:
log.warning(warning['*'])
if 'error' in info:
if info['error'].get('code') in {u'internal_api_error_DBConnectionError',
u'internal_api_error_DBQueryError'}:
sleeper.sleep()
return False
# cope with https://phabricator.wikimedia.org/T106066
if (
info['error'].get('code') == u'mwoauth-invalid-authorization'
and 'Nonce already used' in info['error'].get('info')
):
log.warning('retrying due to nonce error https://phabricator.wikimedia.org/T106066')
sleeper.sleep()
return False
if 'query' in info['error']:
# Semantic Mediawiki does not follow the standard error format
raise errors.APIError(None, info['error']['query'], kwargs)
if '*' in info['error']:
raise errors.APIError(info['error']['code'],
info['error']['info'], info['error']['*'])
raise errors.APIError(info['error']['code'],
info['error']['info'], kwargs)
return True
@staticmethod
def _query_string(*args, **kwargs):
kwargs.update(args)
qs1 = [(k, v) for k, v in six.iteritems(kwargs) if k not in {'wpEditToken', 'token'}]
qs2 = [(k, v) for k, v in six.iteritems(kwargs) if k in {'wpEditToken', 'token'}]
return OrderedDict(qs1 + qs2)
def raw_call(self, script, data, files=None, retry_on_error=True, http_method='POST'):
"""
Perform a generic request and return the raw text.
In the event of a network problem, or a HTTP response with status code 5XX,
we'll wait and retry the configured number of times before giving up
if `retry_on_error` is True.
`requests.exceptions.HTTPError` is still raised directly for
HTTP responses with status codes in the 4XX range, and invalid
HTTP responses.
Args:
script (str): Script name, usually 'api'.
data (dict): Post data
files (dict): Files to upload
retry_on_error (bool): Retry on connection error
http_method (str): The HTTP method, defaults to 'POST'
Returns:
The raw text response.
"""
headers = {}
if self.compress and gzip:
headers['Accept-Encoding'] = 'gzip'
sleeper = self.sleepers.make((script, data))
scheme = self.scheme
host = self.host
if isinstance(host, (list, tuple)):
warnings.warn(
'Specifying host as a tuple is deprecated as of mwclient 0.10.0. '
+ 'Please use the new scheme argument instead.',
DeprecationWarning
)
scheme, host = host
url = '{scheme}://{host}{path}{script}{ext}'.format(scheme=scheme, host=host,
path=self.path, script=script,
ext=self.ext)
while True:
try:
args = {'files': files, 'headers': headers}
for k, v in self.requests.items():
args[k] = v
if http_method == 'GET':
args['params'] = data
else:
args['data'] = data
stream = self.connection.request(http_method, url, **args)
if stream.headers.get('x-database-lag'):
wait_time = int(stream.headers.get('retry-after'))
log.warning('Database lag exceeds max lag. '
'Waiting for {} seconds'.format(wait_time))
sleeper.sleep(wait_time)
elif stream.status_code == 200:
return stream.text
elif stream.status_code < 500 or stream.status_code > 599:
stream.raise_for_status()
else:
if not retry_on_error:
stream.raise_for_status()
log.warning('Received {status} response: {text}. '
'Retrying in a moment.'
.format(status=stream.status_code,
text=stream.text))
sleeper.sleep()
except requests.exceptions.ConnectionError:
# In the event of a network problem
# (e.g. DNS failure, refused connection, etc),
# Requests will raise a ConnectionError exception.
if not retry_on_error:
raise
log.warning('Connection error. Retrying in a moment.')
sleeper.sleep()
def raw_api(self, action, http_method='POST', *args, **kwargs):
"""Send a call to the API."""
try:
retry_on_error = kwargs.pop('retry_on_error')
except KeyError:
retry_on_error = True
kwargs['action'] = action
kwargs['format'] = 'json'
data = self._query_string(*args, **kwargs)
res = self.raw_call('api', data, retry_on_error=retry_on_error,
http_method=http_method)
try:
return json.loads(res, object_pairs_hook=OrderedDict)
except ValueError:
if res.startswith('MediaWiki API is not enabled for this site.'):
raise errors.APIDisabledError
raise errors.InvalidResponse(res)
def raw_index(self, action, http_method='POST', *args, **kwargs):
"""Sends a call to index.php rather than the API."""
kwargs['action'] = action
kwargs['maxlag'] = self.max_lag
data = self._query_string(*args, **kwargs)
return self.raw_call('index', data, http_method=http_method)
def require(self, major, minor, revision=None, raise_error=True):
if self.version is None:
if raise_error is None:
return
raise RuntimeError('Site %s has not yet been initialized' % repr(self))
if revision is None:
if self.version[:2] >= (major, minor):
return True
elif raise_error:
raise errors.MediaWikiVersionError(
'Requires version {required[0]}.{required[1]}, '
'current version is {current[0]}.{current[1]}'
.format(required=(major, minor),
current=(self.version[:2]))
)
else:
return False
else:
raise NotImplementedError
# Actions
def email(self, user, text, subject, cc=False):
"""
Send email to a specified user on the wiki.
>>> try:
... site.email('SomeUser', 'Some message', 'Some subject')
... except mwclient.errors.NoSpecifiedEmailError as e:
            ...     print('The user does not accept email, or has not specified an email address.')
Args:
user (str): User name of the recipient
text (str): Body of the email
subject (str): Subject of the email
cc (bool): True to send a copy of the email to yourself (default is False)
Returns:
Dictionary of the JSON response
Raises:
NoSpecifiedEmailError (mwclient.errors.NoSpecifiedEmailError): if recipient does not accept email
EmailError (mwclient.errors.EmailError): on other errors
"""
token = self.get_token('email')
try:
info = self.post('emailuser', target=user, subject=subject,
text=text, ccme=cc, token=token)
except errors.APIError as e:
if e.args[0] == u'noemail':
raise errors.NoSpecifiedEmail(user, e.args[1])
raise errors.EmailError(*e)
return info
def login(self, username=None, password=None, cookies=None, domain=None):
"""
Login to the wiki using a username and password. The method returns
        nothing if the login was successful, but raises an error if it was not.
Args:
username (str): MediaWiki username
password (str): MediaWiki password
cookies (dict): Custom cookies to include with the log-in request.
domain (str): Sends domain name for authentication; used by some
MediaWiki plug-ins like the 'LDAP Authentication' extension.
Raises:
LoginError (mwclient.errors.LoginError): Login failed, the reason can be
obtained from e.code and e.info (where e is the exception object) and
will be one of the API:Login errors. The most common error code is
"Failed", indicating a wrong username or password.
MaximumRetriesExceeded: API call to log in failed and was retried until all
retries were exhausted. This will not occur if the credentials are merely
incorrect. See MaximumRetriesExceeded for possible reasons.
APIError: An API error occurred. Rare, usually indicates an internal server error.
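        Example (host and credentials are placeholders):
            >>> site = mwclient.Site('test.wikipedia.org')
            >>> site.login('MyBotUser', 'my-bot-password')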
"""
if username and password:
self.credentials = (username, password, domain)
if cookies:
self.connection.cookies.update(cookies)
if self.credentials:
sleeper = self.sleepers.make()
kwargs = {
'lgname': self.credentials[0],
'lgpassword': self.credentials[1]
}
if self.credentials[2]:
kwargs['lgdomain'] = self.credentials[2]
# Try to login using the scheme for MW 1.27+. If the wiki is read protected,
# it is not possible to get the wiki version upfront using the API, so we just
# have to try. If the attempt fails, we try the old method.
try:
kwargs['lgtoken'] = self.get_token('login')
except (errors.APIError, KeyError):
log.debug('Failed to get login token, MediaWiki is older than 1.27.')
while True:
login = self.post('login', **kwargs)
if login['login']['result'] == 'Success':
break
elif login['login']['result'] == 'NeedToken':
kwargs['lgtoken'] = login['login']['token']
elif login['login']['result'] == 'Throttled':
sleeper.sleep(int(login['login'].get('wait', 5)))
else:
raise errors.LoginError(self, login['login']['result'],
login['login']['reason'])
self.site_init()
def get_token(self, type, force=False, title=None):
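        """Fetch (and cache) an action token of the given type for the current user.

        On MediaWiki 1.24 and later most token types are folded into the generic
        'csrf' token, so `type` is rewritten accordingly; older wikis are queried
        with the legacy `intoken` parameter instead. Tokens are cached on the Site
        object; pass `force=True` to fetch a fresh one.
        """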
if self.version is None or self.version[:2] >= (1, 24):
# The 'csrf' (cross-site request forgery) token introduced in 1.24 replaces
# the majority of older tokens, like edittoken and movetoken.
if type not in {'watch', 'patrol', 'rollback', 'userrights', 'login'}:
type = 'csrf'
if type not in self.tokens:
self.tokens[type] = '0'
if self.tokens.get(type, '0') == '0' or force:
if self.version is None or self.version[:2] >= (1, 24):
# We use raw_api() rather than api() because api() is adding "userinfo"
                # to the query, and this raises a 'readapidenied' error if the wiki is read
# protected and we're trying to fetch a login token.
info = self.raw_api('query', 'GET', meta='tokens', type=type)
self.handle_api_result(info)
# Note that for read protected wikis, we don't know the version when
# fetching the login token. If it's < 1.27, the request below will
# raise a KeyError that we should catch.
self.tokens[type] = info['query']['tokens']['%stoken' % type]
else:
if title is None:
# Some dummy title was needed to get a token prior to 1.24
title = 'Test'
info = self.post('query', titles=title,
prop='info', intoken=type)
for i in six.itervalues(info['query']['pages']):
if i['title'] == title:
self.tokens[type] = i['%stoken' % type]
return self.tokens[type]
def upload(self, file=None, filename=None, description='', ignore=False,
file_size=None, url=None, filekey=None, comment=None):
"""Upload a file to the site.
Note that one of `file`, `filekey` and `url` must be specified, but not
more than one. For normal uploads, you specify `file`.
Args:
file (str): File object or stream to upload.
filename (str): Destination filename, don't include namespace
prefix like 'File:'
description (str): Wikitext for the file description page.
ignore (bool): True to upload despite any warnings.
file_size (int): Deprecated in mwclient 0.7
url (str): URL to fetch the file from.
filekey (str): Key that identifies a previous upload that was
stashed temporarily.
comment (str): Upload comment. Also used as the initial page text
for new files if `description` is not specified.
Example:
>>> client.upload(open('somefile', 'rb'), filename='somefile.jpg',
description='Some description')
Returns:
JSON result from the API.
Raises:
errors.InsufficientPermission
requests.exceptions.HTTPError
"""
if file_size is not None:
# Note that DeprecationWarning is hidden by default since Python 2.7
warnings.warn(
'file_size is deprecated since mwclient 0.7',
DeprecationWarning
)
if filename is None:
raise TypeError('filename must be specified')
if len([x for x in [file, filekey, url] if x is not None]) != 1:
raise TypeError("exactly one of 'file', 'filekey' and 'url' must be specified")
image = self.Images[filename]
if not image.can('upload'):
raise errors.InsufficientPermission(filename)
if comment is None:
comment = description
text = None
else:
comment = comment
text = description
if file is not None:
if not hasattr(file, 'read'):
file = open(file, 'rb')
content_size = file.seek(0, 2)
file.seek(0)
if self.version[:2] >= (1, 20) and content_size > self.chunk_size:
return self.chunk_upload(file, filename, ignore, comment, text)
predata = {
'action': 'upload',
'format': 'json',
'filename': filename,
'comment': comment,
'text': text,
'token': image.get_token('edit'),
}
if ignore:
predata['ignorewarnings'] = 'true'
if url:
predata['url'] = url
# sessionkey was renamed to filekey in MediaWiki 1.18
# https://phabricator.wikimedia.org/rMW5f13517e36b45342f228f3de4298bb0fe186995d
if self.version[:2] < (1, 18):
predata['sessionkey'] = filekey
else:
predata['filekey'] = filekey
postdata = predata
files = None
if file is not None:
# Workaround for https://github.com/mwclient/mwclient/issues/65
# ----------------------------------------------------------------
# Since the filename in Content-Disposition is not interpreted,
# we can send some ascii-only dummy name rather than the real
# filename, which might contain non-ascii.
files = {'file': ('fake-filename', file)}
sleeper = self.sleepers.make()
while True:
data = self.raw_call('api', postdata, files)
info = json.loads(data)
if not info:
info = {}
if self.handle_api_result(info, kwargs=predata, sleeper=sleeper):
response = info.get('upload', {})
break
if file is not None:
file.close()
return response
def chunk_upload(self, file, filename, ignorewarnings, comment, text):
"""Upload a file to the site in chunks.
This method is called by `Site.upload` if you are connecting to a newer
MediaWiki installation, so it's normally not necessary to call this
method directly.
Args:
file (file-like object): File object or stream to upload.
            filename (str): Destination filename.
            ignorewarnings (bool): True to upload despite any warnings.
            comment (str): Upload comment.
            text (str): Initial page text for new files.
"""
image = self.Images[filename]
content_size = file.seek(0, 2)
file.seek(0)
params = {
'action': 'upload',
'format': 'json',
'stash': 1,
'offset': 0,
'filename': filename,
'filesize': content_size,
'token': image.get_token('edit'),
}
if ignorewarnings:
params['ignorewarnings'] = 'true'
sleeper = self.sleepers.make()
offset = 0
for chunk in read_in_chunks(file, self.chunk_size):
while True:
data = self.raw_call('api', params, files={'chunk': chunk})
info = json.loads(data)
if self.handle_api_result(info, kwargs=params, sleeper=sleeper):
response = info.get('upload', {})
break
offset += chunk.tell()
chunk.close()
log.debug('%s: Uploaded %d of %d bytes', filename, offset, content_size)
params['filekey'] = response['filekey']
if response['result'] == 'Continue':
params['offset'] = response['offset']
elif response['result'] == 'Success':
file.close()
break
else:
                # Some kind of error or warning occurred. In any case, we do not
# get the parameters we need to continue, so we should return
# the response now.
file.close()
return response
del params['action']
del params['stash']
del params['offset']
params['comment'] = comment
params['text'] = text
return self.post('upload', **params)
def parse(self, text=None, title=None, page=None, prop=None,
redirects=False, mobileformat=False):
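        """Parse wikitext or an existing page via the MediaWiki parse API.

        Pass wikitext in `text` (optionally with `title` for context), or the
        name of an existing page in `page`. The contents of the response's
        'parse' key are returned.

        Example (illustrative):
            >>> site.parse(text="Some '''wikitext'''")
        """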
kwargs = {}
if text is not None:
kwargs['text'] = text
if title is not None:
kwargs['title'] = title
if page is not None:
kwargs['page'] = page
if prop is not None:
kwargs['prop'] = prop
if redirects:
kwargs['redirects'] = '1'
if mobileformat:
kwargs['mobileformat'] = '1'
result = self.post('parse', **kwargs)
return result['parse']
# def block(self): TODO?
# def unblock: TODO?
# def patrol: TODO?
# def import: TODO?
# Lists
def allpages(self, start=None, prefix=None, namespace='0', filterredir='all',
minsize=None, maxsize=None, prtype=None, prlevel=None,
limit=None, dir='ascending', filterlanglinks='all', generator=True,
end=None):
"""Retrieve all pages on the wiki as a generator."""
pfx = listing.List.get_prefix('ap', generator)
kwargs = dict(listing.List.generate_kwargs(
pfx, ('from', start), ('to', end), prefix=prefix,
minsize=minsize, maxsize=maxsize, prtype=prtype, prlevel=prlevel,
namespace=namespace, filterredir=filterredir, dir=dir,
filterlanglinks=filterlanglinks,
))
return listing.List.get_list(generator)(self, 'allpages', 'ap',
limit=limit, return_values='title',
**kwargs)
def allimages(self, start=None, prefix=None, minsize=None, maxsize=None, limit=None,
dir='ascending', sha1=None, sha1base36=None, generator=True, end=None):
"""Retrieve all images on the wiki as a generator."""
pfx = listing.List.get_prefix('ai', generator)
kwargs = dict(listing.List.generate_kwargs(
pfx, ('from', start), ('to', end), prefix=prefix,
minsize=minsize, maxsize=maxsize,
dir=dir, sha1=sha1, sha1base36=sha1base36,
))
return listing.List.get_list(generator)(self, 'allimages', 'ai', limit=limit,
return_values='timestamp|url',
**kwargs)
def alllinks(self, start=None, prefix=None, unique=False, prop='title',
namespace='0', limit=None, generator=True, end=None):
"""Retrieve a list of all links on the wiki as a generator."""
pfx = listing.List.get_prefix('al', generator)
kwargs = dict(listing.List.generate_kwargs(pfx, ('from', start), ('to', end),
prefix=prefix,
prop=prop, namespace=namespace))
if unique:
kwargs[pfx + 'unique'] = '1'
return listing.List.get_list(generator)(self, 'alllinks', 'al', limit=limit,
return_values='title', **kwargs)
def allcategories(self, start=None, prefix=None, dir='ascending', limit=None,
generator=True, end=None):
"""Retrieve all categories on the wiki as a generator."""
pfx = listing.List.get_prefix('ac', generator)
kwargs = dict(listing.List.generate_kwargs(pfx, ('from', start), ('to', end),
prefix=prefix, dir=dir))
return listing.List.get_list(generator)(self, 'allcategories', 'ac', limit=limit,
**kwargs)
def allusers(self, start=None, prefix=None, group=None, prop=None, limit=None,
witheditsonly=False, activeusers=False, rights=None, end=None):
"""Retrieve all users on the wiki as a generator."""
kwargs = dict(listing.List.generate_kwargs('au', ('from', start), ('to', end),
prefix=prefix,
group=group, prop=prop,
rights=rights,
witheditsonly=witheditsonly,
activeusers=activeusers))
return listing.List(self, 'allusers', 'au', limit=limit, **kwargs)
def blocks(self, start=None, end=None, dir='older', ids=None, users=None, limit=None,
prop='id|user|by|timestamp|expiry|reason|flags'):
"""Retrieve blocks as a generator.
Each block is a dictionary containing:
- user: the username or IP address of the user
- id: the ID of the block
- timestamp: when the block was added
- expiry: when the block runs out (infinity for indefinite blocks)
- reason: the reason they are blocked
- allowusertalk: key is present (empty string) if the user is allowed to edit their user talk page
- by: the administrator who blocked the user
- nocreate: key is present (empty string) if the user's ability to create accounts has been disabled.
"""
# TODO: Fix. Fix what?
kwargs = dict(listing.List.generate_kwargs('bk', start=start, end=end, dir=dir,
ids=ids, users=users, prop=prop))
return listing.List(self, 'blocks', 'bk', limit=limit, **kwargs)
def deletedrevisions(self, start=None, end=None, dir='older', namespace=None,
limit=None, prop='user|comment'):
# TODO: Fix
kwargs = dict(listing.List.generate_kwargs('dr', start=start, end=end, dir=dir,
namespace=namespace, prop=prop))
return listing.List(self, 'deletedrevs', 'dr', limit=limit, **kwargs)
def exturlusage(self, query, prop=None, protocol='http', namespace=None, limit=None):
r"""Retrieve the list of pages that link to a particular domain or URL, as a generator.
This API call mirrors the Special:LinkSearch function on-wiki.
Query can be a domain like 'bbc.co.uk'.
Wildcards can be used, e.g. '\*.bbc.co.uk'.
Alternatively, a query can contain a full domain name and some or all of a URL:
e.g. '\*.wikipedia.org/wiki/\*'
        See https://www.mediawiki.org/wiki/API:Exturlusage for details.
        The generator returns dictionaries containing four keys:
- url: the URL linked to.
- ns: namespace of the wiki page
- pageid: the ID of the wiki page
- title: the page title.
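        Example (illustrative):
            >>> for link in site.exturlusage('*.example.com', limit=10):
            ...     print(link['title'], link['url'])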
"""
kwargs = dict(listing.List.generate_kwargs('eu', query=query, prop=prop,
protocol=protocol, namespace=namespace))
return listing.List(self, 'exturlusage', 'eu', limit=limit, **kwargs)
def logevents(self, type=None, prop=None, start=None, end=None,
dir='older', user=None, title=None, limit=None, action=None):
"""Retrieve logevents as a generator."""
kwargs = dict(listing.List.generate_kwargs('le', prop=prop, type=type, start=start,
end=end, dir=dir, user=user,
title=title, action=action))
return listing.List(self, 'logevents', 'le', limit=limit, **kwargs)
def checkuserlog(self, user=None, target=None, limit=10, dir='older',
start=None, end=None):
"""Retrieve checkuserlog items as a generator."""
kwargs = dict(listing.List.generate_kwargs('cul', target=target, start=start,
end=end, dir=dir, user=user))
return listing.NestedList('entries', self, 'checkuserlog', 'cul',
limit=limit, **kwargs)
# def protectedtitles requires 1.15
def random(self, namespace, limit=20):
"""Retrieve a generator of random pages from a particular namespace.
limit specifies the number of random articles retrieved.
namespace is a namespace identifier integer.
Generator contains dictionary with namespace, page ID and title.
"""
kwargs = dict(listing.List.generate_kwargs('rn', namespace=namespace))
return listing.List(self, 'random', 'rn', limit=limit, **kwargs)
def recentchanges(self, start=None, end=None, dir='older', namespace=None,
prop=None, show=None, limit=None, type=None, toponly=None):
"""List recent changes to the wiki, à la Special:Recentchanges.
"""
kwargs = dict(listing.List.generate_kwargs('rc', start=start, end=end, dir=dir,
namespace=namespace, prop=prop,
show=show, type=type,
toponly='1' if toponly else None))
return listing.List(self, 'recentchanges', 'rc', limit=limit, **kwargs)
def revisions(self, revids, prop='ids|timestamp|flags|comment|user'):
"""Get data about a list of revisions.
See also the `Page.revisions()` method.
API doc: https://www.mediawiki.org/wiki/API:Revisions
Example: Get revision text for two revisions:
>>> for revision in site.revisions([689697696, 689816909], prop='content'):
            ...     print(revision['*'])
Args:
revids (list): A list of (max 50) revisions.
prop (str): Which properties to get for each revision.
Returns:
A list of revisions
"""
kwargs = {
'prop': 'revisions',
'rvprop': prop,
'revids': '|'.join(map(text_type, revids))
}
revisions = []
pages = self.get('query', **kwargs).get('query', {}).get('pages', {}).values()
for page in pages:
for revision in page.get('revisions', ()):
revision['pageid'] = page.get('pageid')
revision['pagetitle'] = page.get('title')
revision['timestamp'] = parse_timestamp(revision['timestamp'])
revisions.append(revision)
return revisions
def search(self, search, namespace='0', what=None, redirects=False, limit=None):
"""Perform a full text search.
API doc: https://www.mediawiki.org/wiki/API:Search
Example:
>>> for result in site.search('prefix:Template:Citation/'):
... print(result.get('title'))
Args:
search (str): The query string
namespace (int): The namespace to search (default: 0)
what (str): Search scope: 'text' for fulltext, or 'title' for titles only.
Depending on the search backend,
both options may not be available.
For instance
                `CirrusSearch <https://www.mediawiki.org/wiki/Help:CirrusSearch>`_
doesn't support 'title', but instead provides an "intitle:"
query string filter.
redirects (bool): Include redirect pages in the search
(option removed in MediaWiki 1.23).
Returns:
mwclient.listings.List: Search results iterator
"""
kwargs = dict(listing.List.generate_kwargs('sr', search=search,
namespace=namespace, what=what))
if redirects:
kwargs['srredirects'] = '1'
return listing.List(self, 'search', 'sr', limit=limit, **kwargs)
def usercontributions(self, user, start=None, end=None, dir='older', namespace=None,
prop=None, show=None, limit=None, uselang=None):
"""
List the contributions made by a given user to the wiki, à la Special:Contributions.
API doc: https://www.mediawiki.org/wiki/API:Usercontribs
"""
kwargs = dict(listing.List.generate_kwargs('uc', user=user, start=start, end=end,
dir=dir, namespace=namespace,
prop=prop, show=show))
return listing.List(self, 'usercontribs', 'uc', limit=limit, uselang=uselang,
**kwargs)
def users(self, users, prop='blockinfo|groups|editcount'):
"""
Get information about a list of users.
API doc: https://www.mediawiki.org/wiki/API:Users
"""
return listing.List(self, 'users', 'us', ususers='|'.join(users), usprop=prop)
def watchlist(self, allrev=False, start=None, end=None, namespace=None, dir='older',
prop=None, show=None, limit=None):
"""
List the pages on the current user's watchlist.
API doc: https://www.mediawiki.org/wiki/API:Watchlist
"""
kwargs = dict(listing.List.generate_kwargs('wl', start=start, end=end,
namespace=namespace, dir=dir,
prop=prop, show=show))
if allrev:
kwargs['wlallrev'] = '1'
return listing.List(self, 'watchlist', 'wl', limit=limit, **kwargs)
def expandtemplates(self, text, title=None, generatexml=False):
"""
Takes wikitext (text) and expands templates.
API doc: https://www.mediawiki.org/wiki/API:Expandtemplates
"""
kwargs = {}
        if title is not None:
kwargs['title'] = title
if generatexml:
kwargs['generatexml'] = '1'
result = self.get('expandtemplates', text=text, **kwargs)
if generatexml:
return result['expandtemplates']['*'], result['parsetree']['*']
else:
return result['expandtemplates']['*']
def ask(self, query, title=None):
"""
Ask a query against Semantic MediaWiki.
API doc: https://semantic-mediawiki.org/wiki/Ask_API
Returns:
Generator for retrieving all search results, with each answer as a dictionary.
If the query is invalid, an APIError is raised. A valid query with zero
results will not raise any error.
Examples:
>>> query = "[[Category:my cat]]|[[Has name::a name]]|?Has property"
>>> for answer in site.ask(query):
            ...     for title, data in answer.items():
            ...         print(title)
            ...         print(data)
"""
kwargs = {}
        if title is not None:
kwargs['title'] = title
offset = 0
while offset is not None:
results = self.raw_api('ask', query=u'{query}|offset={offset}'.format(
query=query, offset=offset), http_method='GET', **kwargs)
self.handle_api_result(results) # raises APIError on error
offset = results.get('query-continue-offset')
answers = results['query'].get('results', [])
if isinstance(answers, dict):
# In older versions of Semantic MediaWiki (at least until 2.3.0)
# a list was returned. In newer versions an object is returned
# with the page title as key.
answers = [answer for answer in answers.values()]
for answer in answers:
yield answer
mwclient-0.10.0/mwclient/errors.py
class MwClientError(RuntimeError):
pass
class MediaWikiVersionError(MwClientError):
pass
class APIDisabledError(MwClientError):
pass
class MaximumRetriesExceeded(MwClientError):
pass
class APIError(MwClientError):
def __init__(self, code, info, kwargs):
self.code = code
self.info = info
super(APIError, self).__init__(code, info, kwargs)
class InsufficientPermission(MwClientError):
pass
class UserBlocked(InsufficientPermission):
pass
class EditError(MwClientError):
pass
class ProtectedPageError(EditError, InsufficientPermission):
def __init__(self, page, code=None, info=None):
self.page = page
self.code = code
self.info = info
def __str__(self):
if self.info is not None:
return self.info
return 'You do not have the "edit" right.'
class FileExists(EditError):
pass
class LoginError(MwClientError):
def __init__(self, site, code, info):
super(LoginError, self).__init__(
site,
            {'result': code, 'reason': info}  # For backwards-compatibility
)
self.site = site
self.code = code
self.info = info
def __str__(self):
return self.info
class OAuthAuthorizationError(LoginError):
pass
class AssertUserFailedError(MwClientError):
def __init__(self):
super(AssertUserFailedError, self).__init__((
'By default, mwclient protects you from accidentally editing '
'without being logged in. If you actually want to edit without '
'logging in, you can set force_login on the Site object to False.'
))
def __str__(self):
return self.args[0]
class EmailError(MwClientError):
pass
class NoSpecifiedEmail(EmailError):
pass
class NoWriteApi(MwClientError):
pass
class InvalidResponse(MwClientError):
def __init__(self, response_text=None):
super(InvalidResponse, self).__init__((
'Did not get a valid JSON response from the server. Check that '
'you used the correct hostname. If you did, the server might '
'be wrongly configured or experiencing temporary problems.'),
response_text
)
self.response_text = response_text
def __str__(self):
return self.args[0]
class InvalidPageTitle(MwClientError):
pass
mwclient-0.10.0/mwclient/ex.py
import client
import requests
def read_config(config_files, **predata):
cfg = {}
for config_file in config_files:
cfg.update(_read_config_file(
config_file, predata))
return cfg
def _read_config_file(_config_file, predata):
_file = open(_config_file)
exec(_file, globals(), predata)
_file.close()
for _k, _v in predata.iteritems():
if not _k.startswith('_'):
yield _k, _v
for _k, _v in locals().iteritems():
if not _k.startswith('_'):
yield _k, _v
class SiteList(object):
def __init__(self):
self.sites = {}
def __getitem__(self, key):
if key not in self.sites:
self.sites[key] = {}
return self.sites[key]
def __iter__(self):
return self.sites.itervalues()
class ConfiguredSite(client.Site):
def __init__(self, *config_files, **kwargs):
self.config = read_config(config_files, sites=SiteList())
if 'name' in kwargs:
self.config.update(self.config['sites'][kwargs['name']])
do_login = 'username' in self.config and 'password' in self.config
super(ConfiguredSite, self).__init__(
host=self.config['host'],
path=self.config['path'],
ext=self.config.get('ext', '.php'),
do_init=not do_login,
retry_timeout=self.config.get('retry_timeout', 30),
max_retries=self.config.get('max_retries', -1),
)
if do_login:
self.login(self.config['username'],
self.config['password'])
class ConfiguredPool(list):
def __init__(self, *config_files):
self.config = read_config(config_files, sites=SiteList())
self.pool = requests.Session()
config = dict([(k, v) for k, v in self.config.iteritems()
if k != 'sites'])
for site in self.config['sites']:
cfg = config.copy()
cfg.update(site)
site.update(cfg)
do_login = 'username' in site and 'password' in site
self.append(client.Site(host=site['host'],
path=site['path'], ext=site.get('ext', '.php'),
pool=self.pool, do_init=not do_login,
retry_timeout=site.get('retry_timeout', 30),
max_retries=site.get('max_retries', -1)))
if do_login:
self[-1].login(site['username'], site['password'])
self[-1].config = site
mwclient-0.10.0/mwclient/image.py
import mwclient.listing
import mwclient.page
class Image(mwclient.page.Page):
def __init__(self, site, name, info=None):
super(Image, self).__init__(site, name, info,
extra_properties={'imageinfo': (('iiprop', 'timestamp|user|comment|url|size|sha1|metadata|archivename'), )})
self.imagerepository = self._info.get('imagerepository', '')
self.imageinfo = self._info.get('imageinfo', ({}, ))[0]
def imagehistory(self):
"""
Get file revision info for the given file.
API doc: https://www.mediawiki.org/wiki/API:Imageinfo
"""
return mwclient.listing.PageProperty(self, 'imageinfo', 'ii',
iiprop='timestamp|user|comment|url|size|sha1|metadata|archivename')
def imageusage(self, namespace=None, filterredir='all', redirect=False,
limit=None, generator=True):
"""
List pages that use the given file.
API doc: https://www.mediawiki.org/wiki/API:Imageusage
"""
prefix = mwclient.listing.List.get_prefix('iu', generator)
kwargs = dict(mwclient.listing.List.generate_kwargs(prefix, title=self.name, namespace=namespace, filterredir=filterredir))
if redirect:
kwargs['%sredirect' % prefix] = '1'
return mwclient.listing.List.get_list(generator)(self.site, 'imageusage', 'iu', limit=limit, return_values='title', **kwargs)
def duplicatefiles(self, limit=None):
"""
List duplicates of the current file.
API doc: https://www.mediawiki.org/wiki/API:Duplicatefiles
"""
return mwclient.listing.PageProperty(self, 'duplicatefiles', 'df', dflimit=limit)
def download(self, destination=None):
"""
Download the file. If `destination` is given, the file will be written
directly to the stream. Otherwise the file content will be stored in memory
and returned (with the risk of running out of memory for large files).
Recommended usage:
>>> with open(filename, 'wb') as fd:
... image.download(fd)
Args:
destination (file object): Destination file
"""
url = self.imageinfo['url']
if destination is not None:
res = self.site.connection.get(url, stream=True)
for chunk in res.iter_content(1024):
destination.write(chunk)
else:
return self.site.connection.get(url).content
def __repr__(self):
return "" % (self.name.encode('utf-8'), self.site)
mwclient-0.10.0/mwclient/listing.py
import six
import six.moves
from six import text_type
from mwclient.util import parse_timestamp
import mwclient.page
import mwclient.image
class List(object):
"""Base class for lazy iteration over api response content
This is a class providing lazy iteration. This means that the
content is loaded in chunks as long as the response hints at
continuing content.
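    Example (illustrative; `site` is an initialized mwclient.Site):
        >>> for user in site.allusers(limit=5):
        ...     print(user['name'])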
"""
def __init__(self, site, list_name, prefix,
limit=None, return_values=None, max_items=None,
*args, **kwargs):
# NOTE: Fix limit
self.site = site
self.list_name = list_name
self.generator = 'list'
self.prefix = prefix
kwargs.update(args)
self.args = kwargs
if limit is None:
limit = site.api_limit
self.args[self.prefix + 'limit'] = text_type(limit)
self.count = 0
self.max_items = max_items
self._iter = iter(six.moves.range(0))
self.last = False
self.result_member = list_name
self.return_values = return_values
def __iter__(self):
return self
def __next__(self):
if self.max_items is not None:
if self.count >= self.max_items:
raise StopIteration
        # For filtered lists, we might have to do several requests
# to get the next element due to miser mode.
# See: https://github.com/mwclient/mwclient/issues/194
while True:
try:
item = six.next(self._iter)
if item is not None:
break
except StopIteration:
if self.last:
raise
self.load_chunk()
self.count += 1
if 'timestamp' in item:
item['timestamp'] = parse_timestamp(item['timestamp'])
if isinstance(self, GeneratorList):
return item
if type(self.return_values) is tuple:
return tuple((item[i] for i in self.return_values))
if self.return_values is not None:
return item[self.return_values]
return item
def next(self, *args, **kwargs):
""" For Python 2.x support """
return self.__next__(*args, **kwargs)
def load_chunk(self):
"""Query a new chunk of data
If the query is empty, `raise StopIteration`.
Else, update the iterator accordingly.
If 'continue' is in the response, it is added to `self.args`
(new style continuation, added in MediaWiki 1.21).
If not, but 'query-continue' is in the response, query its
item called `self.list_name` and add this to `self.args` (old
style continuation).
Else, set `self.last` to True.
"""
data = self.site.get(
'query', (self.generator, self.list_name),
*[(text_type(k), v) for k, v in six.iteritems(self.args)]
)
if not data:
# Non existent page
raise StopIteration
# Process response if not empty.
# See: https://github.com/mwclient/mwclient/issues/194
if 'query' in data:
self.set_iter(data)
if data.get('continue'):
# New style continuation, added in MediaWiki 1.21
self.args.update(data['continue'])
elif self.list_name in data.get('query-continue', ()):
# Old style continuation
self.args.update(data['query-continue'][self.list_name])
else:
self.last = True
def set_iter(self, data):
"""Set `self._iter` to the API response `data`."""
if self.result_member not in data['query']:
self._iter = iter(six.moves.range(0))
elif type(data['query'][self.result_member]) is list:
self._iter = iter(data['query'][self.result_member])
else:
self._iter = six.itervalues(data['query'][self.result_member])
def __repr__(self):
return "" % (self.list_name, self.site)
@staticmethod
def generate_kwargs(_prefix, *args, **kwargs):
kwargs.update(args)
for key, value in six.iteritems(kwargs):
if value is not None and value is not False:
yield _prefix + key, value
@staticmethod
def get_prefix(prefix, generator=False):
return ('g' if generator else '') + prefix
@staticmethod
def get_list(generator=False):
return GeneratorList if generator else List
class NestedList(List):
def __init__(self, nested_param, *args, **kwargs):
super(NestedList, self).__init__(*args, **kwargs)
self.nested_param = nested_param
def set_iter(self, data):
self._iter = iter(data['query'][self.result_member][self.nested_param])
class GeneratorList(List):
"""Lazy-loaded list of Page, Image or Category objects
While the standard List class yields raw response data
(optionally filtered based on the value of List.return_values),
this subclass turns the data into Page, Image or Category objects.
"""
def __init__(self, site, list_name, prefix, *args, **kwargs):
super(GeneratorList, self).__init__(site, list_name, prefix,
*args, **kwargs)
self.args['g' + self.prefix + 'limit'] = self.args[self.prefix + 'limit']
del self.args[self.prefix + 'limit']
self.generator = 'generator'
self.args['prop'] = 'info|imageinfo'
self.args['inprop'] = 'protection'
self.result_member = 'pages'
self.page_class = mwclient.page.Page
def __next__(self):
info = super(GeneratorList, self).__next__()
if info['ns'] == 14:
return Category(self.site, u'', info)
if info['ns'] == 6:
return mwclient.image.Image(self.site, u'', info)
return mwclient.page.Page(self.site, u'', info)
def load_chunk(self):
# Put this here so that the constructor does not fail
# on uninitialized sites
self.args['iiprop'] = 'timestamp|user|comment|url|size|sha1|metadata|archivename'
return super(GeneratorList, self).load_chunk()
class Category(mwclient.page.Page, GeneratorList):
def __init__(self, site, name, info=None, namespace=None):
mwclient.page.Page.__init__(self, site, name, info)
kwargs = {}
kwargs['gcmtitle'] = self.name
if namespace:
kwargs['gcmnamespace'] = namespace
GeneratorList.__init__(self, site, 'categorymembers', 'cm', **kwargs)
def __repr__(self):
return "" % (self.name.encode('utf-8'), self.site)
def members(self, prop='ids|title', namespace=None, sort='sortkey',
dir='asc', start=None, end=None, generator=True):
prefix = self.get_prefix('cm', generator)
kwargs = dict(self.generate_kwargs(prefix, prop=prop, namespace=namespace,
sort=sort, dir=dir, start=start, end=end, title=self.name))
return self.get_list(generator)(self.site, 'categorymembers', 'cm', **kwargs)
class PageList(GeneratorList):
def __init__(self, site, prefix=None, start=None, namespace=0, redirects='all', end=None):
self.namespace = namespace
kwargs = {}
if prefix:
kwargs['gapprefix'] = prefix
if start:
kwargs['gapfrom'] = start
if end:
kwargs['gapto'] = end
super(PageList, self).__init__(site, 'allpages', 'ap',
gapnamespace=text_type(namespace),
gapfilterredir=redirects,
**kwargs)
def __getitem__(self, name):
return self.get(name, None)
def get(self, name, info=()):
"""Return the page of name `name` as an object.
If self.namespace is not zero, use {namespace}:{name} as the
page name, otherwise guess the namespace from the name using
`self.guess_namespace`.
Returns:
One of Category, Image or Page (default), according to namespace.
"""
if self.namespace != 0:
full_page_name = u"{namespace}:{name}".format(
namespace=self.site.namespaces[self.namespace],
name=name,
)
namespace = self.namespace
else:
full_page_name = name
try:
namespace = self.guess_namespace(name)
except AttributeError:
                # raised when `name` doesn't have a `startswith` attribute
namespace = 0
cls = {
14: Category,
6: mwclient.image.Image,
}.get(namespace, mwclient.page.Page)
return cls(self.site, full_page_name, info)
def guess_namespace(self, name):
"""Guess the namespace from name
If name starts with any of the site's namespaces' names or
default_namespaces, use that. Else, return zero.
Args:
name (str): The pagename as a string (having `.startswith`)
Returns:
The id of the guessed namespace or zero.
"""
for ns in self.site.namespaces:
if ns == 0:
continue
if name.startswith(u'%s:' % self.site.namespaces[ns].replace(' ', '_')):
return ns
elif ns in self.site.default_namespaces:
if name.startswith(u'%s:' % self.site.default_namespaces[ns].replace(' ', '_')):
return ns
return 0
class PageProperty(List):
def __init__(self, page, prop, prefix, *args, **kwargs):
super(PageProperty, self).__init__(page.site, prop, prefix,
titles=page.name,
*args, **kwargs)
self.page = page
self.generator = 'prop'
def set_iter(self, data):
for page in six.itervalues(data['query']['pages']):
if page['title'] == self.page.name:
self._iter = iter(page.get(self.list_name, ()))
return
raise StopIteration
class PagePropertyGenerator(GeneratorList):
def __init__(self, page, prop, prefix, *args, **kwargs):
super(PagePropertyGenerator, self).__init__(page.site, prop, prefix,
titles=page.name,
*args, **kwargs)
self.page = page
class RevisionsIterator(PageProperty):
def load_chunk(self):
if 'rvstartid' in self.args and 'rvstart' in self.args:
del self.args['rvstart']
return super(RevisionsIterator, self).load_chunk()
mwclient-0.10.0/mwclient/page.py 0000664 0003720 0003720 00000045770 13521150461 017377 0 ustar travis travis 0000000 0000000 import six
from six import text_type
import time
from mwclient.util import parse_timestamp
import mwclient.listing
import mwclient.errors
class Page(object):
def __init__(self, site, name, info=None, extra_properties=None):
if type(name) is type(self):
self.__dict__.update(name.__dict__)
return
self.site = site
self.name = name
self._textcache = {}
if not info:
if extra_properties:
prop = 'info|' + '|'.join(six.iterkeys(extra_properties))
extra_props = []
for extra_prop in six.itervalues(extra_properties):
extra_props.extend(extra_prop)
else:
prop = 'info'
extra_props = ()
if type(name) is int:
info = self.site.get('query', prop=prop, pageids=name,
inprop='protection', *extra_props)
else:
info = self.site.get('query', prop=prop, titles=name,
inprop='protection', *extra_props)
info = six.next(six.itervalues(info['query']['pages']))
self._info = info
if 'invalid' in info:
raise mwclient.errors.InvalidPageTitle(info.get('invalidreason'))
self.namespace = info.get('ns', 0)
self.name = info.get('title', u'')
if self.namespace:
self.page_title = self.strip_namespace(self.name)
else:
self.page_title = self.name
self.touched = parse_timestamp(info.get('touched'))
self.revision = info.get('lastrevid', 0)
self.exists = 'missing' not in info
self.length = info.get('length')
self.protection = {
i['type']: (i['level'], i['expiry'])
for i in info.get('protection', ())
if i
}
self.redirect = 'redirect' in info
self.pageid = info.get('pageid', None)
self.contentmodel = info.get('contentmodel', None)
self.pagelanguage = info.get('pagelanguage', None)
self.restrictiontypes = info.get('restrictiontypes', None)
self.last_rev_time = None
self.edit_time = None
def redirects_to(self):
""" Returns the redirect target page, or None if the page is not a redirect page."""
info = self.site.get('query', prop='pageprops', titles=self.name, redirects='')['query']
if 'redirects' in info:
for page in info['redirects']:
if page['from'] == self.name:
return Page(self.site, page['to'])
return None
else:
return None
def resolve_redirect(self):
""" Returns the redirect target page, or the current page if it's not a redirect page."""
target_page = self.redirects_to()
if target_page is None:
return self
else:
return target_page
def __repr__(self):
return "" % (self.name.encode('utf-8'), self.site)
def __unicode__(self):
return self.name
@staticmethod
def strip_namespace(title):
if title[0] == ':':
title = title[1:]
return title[title.find(':') + 1:]
@staticmethod
def normalize_title(title):
# TODO: Make site dependent
title = title.strip()
if title[0] == ':':
title = title[1:]
title = title[0].upper() + title[1:]
title = title.replace(' ', '_')
return title
def can(self, action):
"""Check if the current user has the right to carry out some action
with the current page.
Example:
>>> page.can('edit')
True
"""
level = self.protection.get(action, (action, ))[0]
if level == 'sysop':
level = 'editprotected'
return level in self.site.rights
def get_token(self, type, force=False):
return self.site.get_token(type, force, title=self.name)
def text(self, section=None, expandtemplates=False, cache=True, slot='main'):
"""Get the current wikitext of the page, or of a specific section.
If the page does not exist, an empty string is returned. By
default, results will be cached and if you call text() again
with the same section and expandtemplates the result will come
from the cache. The cache is stored on the instance, so it
lives as long as the instance does.
Args:
section (int): numbered section or `None` to get the whole page (default: `None`)
expandtemplates (bool): set to `True` to expand templates (default: `False`)
cache (bool): set to `False` to disable caching (default: `True`)
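        Example (illustrative, assuming ``site`` is an initialized Site):
            >>> page = site.pages['Main Page']
            >>> wikitext = page.text()          # whole page
            >>> intro = page.text(section=0)    # only the lead section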
"""
if not self.can('read'):
raise mwclient.errors.InsufficientPermission(self)
if not self.exists:
return u''
if section is not None:
section = text_type(section)
key = hash((section, expandtemplates))
if cache and key in self._textcache:
return self._textcache[key]
revs = self.revisions(prop='content|timestamp', limit=1, section=section, slots=slot)
try:
rev = next(revs)
if 'slots' in rev:
text = rev['slots'][slot]['*']
else:
text = rev['*']
self.last_rev_time = rev['timestamp']
except StopIteration:
text = u''
self.last_rev_time = None
if not expandtemplates:
self.edit_time = time.gmtime()
else:
# The 'rvexpandtemplates' option was removed in MediaWiki 1.32, so we have to make
# an extra API call. See: https://github.com/mwclient/mwclient/issues/214
text = self.site.expandtemplates(text)
if cache:
self._textcache[key] = text
return text
def save(self, text, summary=u'', minor=False, bot=True, section=None, **kwargs):
"""Update the text of a section or the whole page by performing an edit operation.
"""
if not self.site.logged_in and self.site.force_login:
raise mwclient.errors.AssertUserFailedError()
if self.site.blocked:
raise mwclient.errors.UserBlocked(self.site.blocked)
if not self.can('edit'):
raise mwclient.errors.ProtectedPageError(self)
if not self.site.writeapi:
raise mwclient.errors.NoWriteApi(self)
data = {}
if minor:
data['minor'] = '1'
if not minor:
data['notminor'] = '1'
if self.last_rev_time:
data['basetimestamp'] = time.strftime('%Y%m%d%H%M%S', self.last_rev_time)
if self.edit_time:
data['starttimestamp'] = time.strftime('%Y%m%d%H%M%S', self.edit_time)
if bot:
data['bot'] = '1'
if section:
data['section'] = section
data.update(kwargs)
if self.site.force_login:
data['assert'] = 'user'
def do_edit():
result = self.site.post('edit', title=self.name, text=text,
summary=summary, token=self.get_token('edit'),
**data)
if result['edit'].get('result').lower() == 'failure':
raise mwclient.errors.EditError(self, result['edit'])
return result
try:
result = do_edit()
except mwclient.errors.APIError as e:
if e.code == 'badtoken':
# Retry, but only once to avoid an infinite loop
self.get_token('edit', force=True)
try:
result = do_edit()
except mwclient.errors.APIError as e:
self.handle_edit_error(e, summary)
else:
self.handle_edit_error(e, summary)
# 'newtimestamp' is not included if no change was made
if 'newtimestamp' in result['edit'].keys():
self.last_rev_time = parse_timestamp(result['edit'].get('newtimestamp'))
# Workaround for https://phabricator.wikimedia.org/T211233
for cookie in self.site.connection.cookies:
if 'PostEditRevision' in cookie.name:
self.site.connection.cookies.clear(cookie.domain, cookie.path, cookie.name)
# clear the page text cache
self._textcache = {}
return result['edit']
def handle_edit_error(self, e, summary):
if e.code == 'editconflict':
raise mwclient.errors.EditError(self, summary, e.info)
elif e.code in {'protectedtitle', 'cantcreate', 'cantcreate-anon',
'noimageredirect-anon', 'noimageredirect', 'noedit-anon',
'noedit', 'protectedpage', 'cascadeprotected',
'customcssjsprotected',
'protectednamespace-interface', 'protectednamespace'}:
raise mwclient.errors.ProtectedPageError(self, e.code, e.info)
elif e.code == 'assertuserfailed':
raise mwclient.errors.AssertUserFailedError()
else:
raise e
def move(self, new_title, reason='', move_talk=True, no_redirect=False):
"""Move (rename) page to new_title.
If user account is an administrator, specify no_redirect as True to not
leave a redirect.
If user does not have permission to move page, an InsufficientPermission
exception is raised.
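        Example:
            >>> page.move('New title', reason='Fixing a typo in the title')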
"""
if not self.can('move'):
raise mwclient.errors.InsufficientPermission(self)
if not self.site.writeapi:
raise mwclient.errors.NoWriteApi(self)
data = {}
if move_talk:
data['movetalk'] = '1'
if no_redirect:
data['noredirect'] = '1'
result = self.site.post('move', ('from', self.name), to=new_title,
token=self.get_token('move'), reason=reason, **data)
return result['move']
def delete(self, reason='', watch=False, unwatch=False, oldimage=False):
"""Delete page.
If user does not have permission to delete page, an InsufficientPermission
exception is raised.
"""
if not self.can('delete'):
raise mwclient.errors.InsufficientPermission(self)
if not self.site.writeapi:
raise mwclient.errors.NoWriteApi(self)
data = {}
if watch:
data['watch'] = '1'
if unwatch:
data['unwatch'] = '1'
if oldimage:
data['oldimage'] = oldimage
result = self.site.post('delete', title=self.name,
token=self.get_token('delete'),
reason=reason, **data)
return result['delete']
def purge(self):
"""Purge server-side cache of page. This will re-render templates and other
dynamic content.
"""
self.site.post('purge', titles=self.name)
# def watch: requires 1.14
# Properties
def backlinks(self, namespace=None, filterredir='all', redirect=False,
limit=None, generator=True):
"""List pages that link to the current page, similar to Special:Whatlinkshere.
API doc: https://www.mediawiki.org/wiki/API:Backlinks
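        Example:
            >>> for linking_page in page.backlinks():
            ...     print(linking_page.name)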
"""
prefix = mwclient.listing.List.get_prefix('bl', generator)
kwargs = dict(mwclient.listing.List.generate_kwargs(
prefix, namespace=namespace, filterredir=filterredir,
))
if redirect:
kwargs['%sredirect' % prefix] = '1'
kwargs[prefix + 'title'] = self.name
return mwclient.listing.List.get_list(generator)(
self.site, 'backlinks', 'bl', limit=limit, return_values='title',
**kwargs
)
def categories(self, generator=True, show=None):
"""List categories used on the current page.
API doc: https://www.mediawiki.org/wiki/API:Categories
Args:
generator (bool): Return generator (Default: True)
show (str): Set to 'hidden' to only return hidden categories
or '!hidden' to only return non-hidden ones.
Returns:
mwclient.listings.PagePropertyGenerator
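        Example (illustrative output):
            >>> [c.name for c in page.categories()]
            ['Category:1879 births', 'Category:1955 deaths']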
"""
prefix = mwclient.listing.List.get_prefix('cl', generator)
kwargs = dict(mwclient.listing.List.generate_kwargs(
prefix, show=show
))
if generator:
return mwclient.listing.PagePropertyGenerator(
self, 'categories', 'cl', **kwargs
)
else:
# TODO: return sortkey if wanted
return mwclient.listing.PageProperty(
self, 'categories', 'cl', return_values='title', **kwargs
)
def embeddedin(self, namespace=None, filterredir='all', limit=None, generator=True):
"""List pages that transclude the current page.
API doc: https://www.mediawiki.org/wiki/API:Embeddedin
Args:
namespace (int): Restricts search to a given namespace (Default: None)
filterredir (str): How to filter redirects, either 'all' (default),
'redirects' or 'nonredirects'.
limit (int): Maximum amount of pages to return per request
generator (bool): Return generator (Default: True)
Returns:
mwclient.listings.List: Page iterator
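        Example:
            >>> for transcluding_page in page.embeddedin(limit=10):
            ...     print(transcluding_page.name)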
"""
prefix = mwclient.listing.List.get_prefix('ei', generator)
kwargs = dict(mwclient.listing.List.generate_kwargs(prefix, namespace=namespace,
filterredir=filterredir))
kwargs[prefix + 'title'] = self.name
return mwclient.listing.List.get_list(generator)(
self.site, 'embeddedin', 'ei', limit=limit, return_values='title',
**kwargs
)
def extlinks(self):
"""List external links from the current page.
API doc: https://www.mediawiki.org/wiki/API:Extlinks
"""
return mwclient.listing.PageProperty(self, 'extlinks', 'el', return_values='*')
def images(self, generator=True):
"""List files/images embedded in the current page.
API doc: https://www.mediawiki.org/wiki/API:Images
"""
if generator:
return mwclient.listing.PagePropertyGenerator(self, 'images', '')
else:
return mwclient.listing.PageProperty(self, 'images', '',
return_values='title')
def iwlinks(self):
"""List interwiki links from the current page.
API doc: https://www.mediawiki.org/wiki/API:Iwlinks
"""
return mwclient.listing.PageProperty(self, 'iwlinks', 'iw',
return_values=('prefix', '*'))
def langlinks(self, **kwargs):
"""List interlanguage links from the current page.
API doc: https://www.mediawiki.org/wiki/API:Langlinks
"""
return mwclient.listing.PageProperty(self, 'langlinks', 'll',
return_values=('lang', '*'),
**kwargs)
def links(self, namespace=None, generator=True, redirects=False):
"""List links to other pages from the current page.
API doc: https://www.mediawiki.org/wiki/API:Links
"""
prefix = mwclient.listing.List.get_prefix('pl', generator)
kwargs = dict(mwclient.listing.List.generate_kwargs(prefix, namespace=namespace))
if redirects:
kwargs['redirects'] = '1'
if generator:
return mwclient.listing.PagePropertyGenerator(self, 'links', 'pl', **kwargs)
else:
return mwclient.listing.PageProperty(self, 'links', 'pl', return_values='title',
**kwargs)
def revisions(self, startid=None, endid=None, start=None, end=None,
dir='older', user=None, excludeuser=None, limit=50,
prop='ids|timestamp|flags|comment|user',
expandtemplates=False, section=None,
diffto=None, slots=None, uselang=None):
"""List revisions of the current page.
API doc: https://www.mediawiki.org/wiki/API:Revisions
Args:
startid (int): Revision ID to start listing from.
endid (int): Revision ID to stop listing at.
start (str): Timestamp to start listing from.
end (str): Timestamp to end listing at.
dir (str): Direction to list in: 'older' (default) or 'newer'.
user (str): Only list revisions made by this user.
excludeuser (str): Exclude revisions made by this user.
limit (int): The maximum number of revisions to return per request.
prop (str): Which properties to get for each revision,
default: 'ids|timestamp|flags|comment|user'
expandtemplates (bool): Expand templates in rvprop=content output
section (int): If rvprop=content is set, only retrieve the contents of this section.
diffto (str): Revision ID to diff each revision to. Use "prev",
"next" and "cur" for the previous, next and current
revision respectively.
slots (str): The content slot (Mediawiki >= 1.32) to retrieve
content from.
uselang (str): Language to use for parsed edit comments and other
localized messages.
Returns:
mwclient.listings.List: Revision iterator
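        Example:
            >>> for rev in page.revisions(limit=5, prop='ids|timestamp|user|comment'):
            ...     print(rev['revid'], rev['user'])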
"""
kwargs = dict(mwclient.listing.List.generate_kwargs('rv', startid=startid, endid=endid,
start=start, end=end, user=user,
excludeuser=excludeuser, diffto=diffto,
slots=slots))
if self.site.version[:2] < (1, 32) and 'rvslots' in kwargs:
# https://github.com/mwclient/mwclient/issues/199
del kwargs['rvslots']
kwargs['rvdir'] = dir
kwargs['rvprop'] = prop
kwargs['uselang'] = uselang
if expandtemplates:
kwargs['rvexpandtemplates'] = '1'
if section is not None:
kwargs['rvsection'] = section
return mwclient.listing.RevisionsIterator(self, 'revisions', 'rv', limit=limit,
**kwargs)
def templates(self, namespace=None, generator=True):
"""List templates used on the current page.
API doc: https://www.mediawiki.org/wiki/API:Templates
"""
prefix = mwclient.listing.List.get_prefix('tl', generator)
kwargs = dict(mwclient.listing.List.generate_kwargs(prefix, namespace=namespace))
if generator:
return mwclient.listing.PagePropertyGenerator(self, 'templates', prefix,
**kwargs)
else:
return mwclient.listing.PageProperty(self, 'templates', prefix,
return_values='title', **kwargs)
mwclient-0.10.0/mwclient/sleep.py 0000664 0003720 0003720 00000003060 13521150461 017555 0 ustar travis travis 0000000 0000000 import time
import logging
from mwclient.errors import MaximumRetriesExceeded
log = logging.getLogger(__name__)
class Sleepers(object):
def __init__(self, max_retries, retry_timeout, callback=lambda *x: None):
self.max_retries = max_retries
self.retry_timeout = retry_timeout
self.callback = callback
def make(self, args=None):
return Sleeper(args, self.max_retries, self.retry_timeout, self.callback)
class Sleeper(object):
"""
For any given operation, a `Sleeper` object keeps count of the number of
retries. For each retry, the sleep time increases until the max number of
retries is reached and a `MaximumRetriesExceeded` is raised. The sleeper
object should be discarded once the operation is successful.
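    Example (a minimal sketch; sleep times grow with ``retry_timeout``):
        >>> sleepers = Sleepers(max_retries=5, retry_timeout=10)
        >>> sleeper = sleepers.make()
        >>> sleeper.sleep()   # first retry: sleeps 0 seconds
        >>> sleeper.sleep()   # second retry: sleeps 10 seconds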
"""
def __init__(self, args, max_retries, retry_timeout, callback):
self.args = args
self.retries = 0
self.max_retries = max_retries
self.retry_timeout = retry_timeout
self.callback = callback
def sleep(self, min_time=0):
"""
Sleep a minimum of `min_time` seconds.
The actual sleeping time will increase with the number of retries.
"""
self.retries += 1
if self.retries > self.max_retries:
raise MaximumRetriesExceeded(self, self.args)
self.callback(self, self.retries, self.args)
timeout = self.retry_timeout * (self.retries - 1)
if timeout < min_time:
timeout = min_time
log.debug('Sleeping for %d seconds', timeout)
time.sleep(timeout)
mwclient-0.10.0/mwclient/util.py 0000664 0003720 0003720 00000001040 13521150461 017416 0 ustar travis travis 0000000 0000000 import time
import io
def parse_timestamp(t):
"""Parses a string containing a timestamp.
Args:
t (str): A string containing a timestamp.
Returns:
time.struct_time: A timestamp.
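    Example:
        >>> parse_timestamp('2015-01-02T20:18:36Z').tm_year
        2015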
"""
if t is None or t == '0000-00-00T00:00:00Z':
return time.struct_time((0, 0, 0, 0, 0, 0, 0, 0, 0))
return time.strptime(t, '%Y-%m-%dT%H:%M:%SZ')
def read_in_chunks(stream, chunk_size):
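    """Read from a file-like object in chunks of `chunk_size` bytes.
    Args:
        stream: A file-like object to read from.
        chunk_size (int): Maximum number of bytes to read per chunk.
    Yields:
        io.BytesIO: A buffer wrapping the next chunk of data.
    """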
while True:
data = stream.read(chunk_size)
if not data:
break
yield io.BytesIO(data)
mwclient-0.10.0/mwclient.egg-info/ 0000775 0003720 0003720 00000000000 13521150513 017564 5 ustar travis travis 0000000 0000000 mwclient-0.10.0/mwclient.egg-info/PKG-INFO 0000664 0003720 0003720 00000007405 13521150512 020666 0 ustar travis travis 0000000 0000000 Metadata-Version: 2.1
Name: mwclient
Version: 0.10.0
Summary: MediaWiki API client
Home-page: https://github.com/btongminh/mwclient
Author: Bryan Tong Minh
Author-email: bryan.tongminh@gmail.com
License: MIT
Description:
# mwclient
[![Build status][build-status-img]](https://travis-ci.org/mwclient/mwclient)
[![Test coverage][test-coverage-img]](https://coveralls.io/r/mwclient/mwclient)
[![Code health][code-health-img]](https://landscape.io/github/mwclient/mwclient/master)
[![Latest version][latest-version-img]](https://pypi.python.org/pypi/mwclient)
[![MIT license][mit-license-img]](http://opensource.org/licenses/MIT)
[![Documentation status][documentation-status-img]](http://mwclient.readthedocs.io/en/latest/)
[![Issue statistics][issue-statistics-img]](http://isitmaintained.com/project/tldr-pages/tldr)
[![Gitter chat][gitter-chat-img]](https://gitter.im/mwclient/mwclient)
[build-status-img]: https://img.shields.io/travis/mwclient/mwclient.svg
[test-coverage-img]: https://img.shields.io/coveralls/mwclient/mwclient.svg
[code-health-img]: https://landscape.io/github/mwclient/mwclient/master/landscape.svg?style=flat
[latest-version-img]: https://img.shields.io/pypi/v/mwclient.svg
[mit-license-img]: https://img.shields.io/github/license/mwclient/mwclient.svg
[documentation-status-img]: https://readthedocs.org/projects/mwclient/badge/
[issue-statistics-img]: http://isitmaintained.com/badge/resolution/tldr-pages/tldr.svg
[gitter-chat-img]: https://img.shields.io/gitter/room/mwclient/mwclient.svg
mwclient is a lightweight Python client library to the
[MediaWiki API](https://mediawiki.org/wiki/API)
which provides access to most API functionality.
It works with Python 2.7 as well as 3.4 and above,
and supports MediaWiki 1.16 and above.
For functions not available in the current MediaWiki,
a `MediaWikiVersionError` is raised.
The current stable
[version 0.10.0](https://github.com/mwclient/mwclient/archive/v0.10.0.zip)
is [available through PyPI](https://pypi.python.org/pypi/mwclient):
```
$ pip install mwclient
```
The current [development version](https://github.com/mwclient/mwclient)
can be installed from GitHub:
```
$ pip install git+git://github.com/mwclient/mwclient.git
```
Please see the [changelog
document](https://github.com/mwclient/mwclient/blob/master/CHANGELOG.md)
for a list of changes.
## Documentation
Up-to-date documentation is hosted [at Read the Docs](http://mwclient.readthedocs.io/en/latest/).
It includes a user guide to get started using mwclient, a reference guide,
implementation and development notes.
There is also some documentation on the [GitHub wiki](https://github.com/mwclient/mwclient/wiki)
that hasn't been ported yet.
If you want to help, you're welcome!
## Contributing
Patches are welcome! See [this page](https://mwclient.readthedocs.io/en/latest/development/)
for information on how to get started with mwclient development.
Keywords: mediawiki wikipedia
Platform: UNKNOWN
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 2.7
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Description-Content-Type: text/markdown
mwclient-0.10.0/mwclient.egg-info/SOURCES.txt 0000664 0003720 0003720 00000000761 13521150513 021454 0 ustar travis travis 0000000 0000000 LICENSE.md
MANIFEST.in
README.md
setup.cfg
setup.py
mwclient/__init__.py
mwclient/client.py
mwclient/errors.py
mwclient/ex.py
mwclient/image.py
mwclient/listing.py
mwclient/page.py
mwclient/sleep.py
mwclient/util.py
mwclient.egg-info/PKG-INFO
mwclient.egg-info/SOURCES.txt
mwclient.egg-info/dependency_links.txt
mwclient.egg-info/requires.txt
mwclient.egg-info/top_level.txt
mwclient.egg-info/zip-safe
test/test_client.py
test/test_listing.py
test/test_page.py
test/test_sleep.py
test/test_util.py mwclient-0.10.0/mwclient.egg-info/dependency_links.txt 0000664 0003720 0003720 00000000001 13521150512 023631 0 ustar travis travis 0000000 0000000
mwclient-0.10.0/mwclient.egg-info/requires.txt 0000664 0003720 0003720 00000000026 13521150512 022161 0 ustar travis travis 0000000 0000000 requests-oauthlib
six
mwclient-0.10.0/mwclient.egg-info/top_level.txt 0000664 0003720 0003720 00000000011 13521150512 022305 0 ustar travis travis 0000000 0000000 mwclient
mwclient-0.10.0/mwclient.egg-info/zip-safe 0000664 0003720 0003720 00000000001 13521150512 021213 0 ustar travis travis 0000000 0000000
mwclient-0.10.0/setup.cfg 0000664 0003720 0003720 00000000711 13521150513 016070 0 ustar travis travis 0000000 0000000 [bumpversion]
current_version = 0.10.0
commit = True
tag = True
[aliases]
test = pytest
[bumpversion:file:setup.py]
search = version='{current_version}'
replace = version='{new_version}'
[bumpversion:file:mwclient/client.py]
[bumpversion:file:README.md]
[bdist_wheel]
universal = 1
[tool:pytest]
addopts = --cov mwclient test
[flake8]
ignore =
E501 # Line too long
W503 # Line break before binary operator
[egg_info]
tag_build =
tag_date = 0
mwclient-0.10.0/setup.py 0000664 0003720 0003720 00000002412 13521150461 015763 0 ustar travis travis 0000000 0000000 #!/usr/bin/env python
# encoding=utf-8
from __future__ import print_function
import os
import sys
from setuptools import setup
here = os.path.abspath(os.path.dirname(__file__))
README = open(os.path.join(here, 'README.md')).read()
needs_pytest = set(['pytest', 'test', 'ptr']).intersection(sys.argv)
pytest_runner = ['pytest-runner'] if needs_pytest else []
setup(name='mwclient',
version='0.10.0', # Use bumpversion to update
description='MediaWiki API client',
long_description=README,
long_description_content_type='text/markdown',
classifiers=[
'Programming Language :: Python',
'Programming Language :: Python :: 2.7',
'Programming Language :: Python :: 3.5',
'Programming Language :: Python :: 3.6',
'Programming Language :: Python :: 3.7',
],
keywords='mediawiki wikipedia',
author='Bryan Tong Minh',
author_email='bryan.tongminh@gmail.com',
url='https://github.com/btongminh/mwclient',
license='MIT',
packages=['mwclient'],
install_requires=['requests-oauthlib', 'six'],
setup_requires=pytest_runner,
tests_require=['pytest', 'pytest-cov', 'mock',
'responses>=0.3.0', 'responses!=0.6.0'],
zip_safe=True
)
mwclient-0.10.0/test/ 0000775 0003720 0003720 00000000000 13521150513 015227 5 ustar travis travis 0000000 0000000 mwclient-0.10.0/test/test_client.py 0000664 0003720 0003720 00000062100 13521150461 020117 0 ustar travis travis 0000000 0000000 # encoding=utf-8
from __future__ import print_function
from six import StringIO
import unittest
import pytest
import mwclient
import logging
import requests
import responses
import pkg_resources # part of setuptools
import mock
import time
from requests_oauthlib import OAuth1
try:
import json
except ImportError:
import simplejson as json
if __name__ == "__main__":
print()
print("Note: Running in stand-alone mode. Consult the README")
print(" (section 'Contributing') for advice on running tests.")
print()
logging.basicConfig(level=logging.DEBUG)
class TestCase(unittest.TestCase):
def metaResponse(self, **kwargs):
tpl = '{"query":{"general":{"generator":"MediaWiki %(version)s"},"namespaces":{"-1":{"*":"Special","canonical":"Special","case":"first-letter","id":-1},"-2":{"*":"Media","canonical":"Media","case":"first-letter","id":-2},"0":{"*":"","case":"first-letter","content":"","id":0},"1":{"*":"Talk","canonical":"Talk","case":"first-letter","id":1,"subpages":""},"10":{"*":"Template","canonical":"Template","case":"first-letter","id":10,"subpages":""},"100":{"*":"Test namespace 1","canonical":"Test namespace 1","case":"first-letter","id":100,"subpages":""},"101":{"*":"Test namespace 1 talk","canonical":"Test namespace 1 talk","case":"first-letter","id":101,"subpages":""},"102":{"*":"Test namespace 2","canonical":"Test namespace 2","case":"first-letter","id":102,"subpages":""},"103":{"*":"Test namespace 2 talk","canonical":"Test namespace 2 talk","case":"first-letter","id":103,"subpages":""},"11":{"*":"Template talk","canonical":"Template talk","case":"first-letter","id":11,"subpages":""},"1198":{"*":"Translations","canonical":"Translations","case":"first-letter","id":1198,"subpages":""},"1199":{"*":"Translations talk","canonical":"Translations talk","case":"first-letter","id":1199,"subpages":""},"12":{"*":"Help","canonical":"Help","case":"first-letter","id":12,"subpages":""},"13":{"*":"Help talk","canonical":"Help talk","case":"first-letter","id":13,"subpages":""},"14":{"*":"Category","canonical":"Category","case":"first-letter","id":14},"15":{"*":"Category talk","canonical":"Category talk","case":"first-letter","id":15,"subpages":""},"2":{"*":"User","canonical":"User","case":"first-letter","id":2,"subpages":""},"2500":{"*":"VisualEditor","canonical":"VisualEditor","case":"first-letter","id":2500},"2501":{"*":"VisualEditor talk","canonical":"VisualEditor talk","case":"first-letter","id":2501},"2600":{"*":"Topic","canonical":"Topic","case":"first-letter","defaultcontentmodel":"flow-board","id":2600},"3":{"*":"User talk","canonical":"User talk","case":"first-letter","id":3,"subpages":""},"4":{"*":"Wikipedia","canonical":"Project","case":"first-letter","id":4,"subpages":""},"460":{"*":"Campaign","canonical":"Campaign","case":"case-sensitive","defaultcontentmodel":"Campaign","id":460},"461":{"*":"Campaign talk","canonical":"Campaign talk","case":"case-sensitive","id":461},"5":{"*":"Wikipedia talk","canonical":"Project talk","case":"first-letter","id":5,"subpages":""},"6":{"*":"File","canonical":"File","case":"first-letter","id":6},"7":{"*":"File talk","canonical":"File talk","case":"first-letter","id":7,"subpages":""},"710":{"*":"TimedText","canonical":"TimedText","case":"first-letter","id":710},"711":{"*":"TimedText talk","canonical":"TimedText talk","case":"first-letter","id":711},"8":{"*":"MediaWiki","canonical":"MediaWiki","case":"first-letter","id":8,"subpages":""},"828":{"*":"Module","canonical":"Module","case":"first-letter","id":828,"subpages":""},"829":{"*":"Module talk","canonical":"Module talk","case":"first-letter","id":829,"subpages":""},"866":{"*":"CNBanner","canonical":"CNBanner","case":"first-letter","id":866},"867":{"*":"CNBanner talk","canonical":"CNBanner talk","case":"first-letter","id":867,"subpages":""},"9":{"*":"MediaWiki talk","canonical":"MediaWiki talk","case":"first-letter","id":9,"subpages":""},"90":{"*":"Thread","canonical":"Thread","case":"first-letter","id":90},"91":{"*":"Thread talk","canonical":"Thread talk","case":"first-letter","id":91},"92":{"*":"Summary","canonical":"Summary","case":"first-letter","id":92},"93":{"*":"Summary talk","canonical":"Summary 
talk","case":"first-letter","id":93}},"userinfo":{"anon":"","groups":["*"],"id":0,"name":"127.0.0.1","rights": %(rights)s}}}'
tpl = tpl % {'version': kwargs.get('version', '1.24wmf17'),
'rights': json.dumps(kwargs.get('rights', ["createaccount", "read", "edit", "createpage", "createtalk", "writeapi", "editmyusercss", "editmyuserjs", "viewmywatchlist", "editmywatchlist", "viewmyprivateinfo", "editmyprivateinfo", "editmyoptions", "centralauth-merge", "abusefilter-view", "abusefilter-log", "translate", "vipsscaler-test", "upload"]))
}
res = json.loads(tpl)
if kwargs.get('writeapi', True):
res['query']['general']['writeapi'] = ''
return res
def metaResponseAsJson(self, **kwargs):
return json.dumps(self.metaResponse(**kwargs))
def httpShouldReturn(self, body=None, callback=None, scheme='https', host='test.wikipedia.org', path='/w/',
script='api', headers=None, status=200, method='GET'):
url = '{scheme}://{host}{path}{script}.php'.format(scheme=scheme, host=host, path=path, script=script)
mock = responses.GET if method == 'GET' else responses.POST
if body is None:
responses.add_callback(mock, url, callback=callback)
else:
responses.add(mock, url, body=body, content_type='application/json',
adding_headers=headers, status=status)
def stdSetup(self):
self.httpShouldReturn(self.metaResponseAsJson())
site = mwclient.Site('test.wikipedia.org')
responses.reset()
return site
def makePageResponse(self, title='Dummy.jpg', **kwargs):
# Creates a dummy page response
pageinfo = {
"contentmodel": "wikitext",
"lastrevid": 112353797,
"length": 389,
"ns": 6,
"pageid": 738154,
"pagelanguage": "en",
"protection": [],
"title": title,
"touched": "2014-09-10T20:37:25Z"
}
pageinfo.update(**kwargs)
res = {
"query": {
"pages": {
"9": pageinfo
}
}
}
return json.dumps(res)
class TestClient(TestCase):
def setUp(self):
pass
def testVersion(self):
# The version specified in setup.py should equal the one specified in client.py
version = pkg_resources.require("mwclient")[0].version
assert version == mwclient.__version__
@responses.activate
def test_https_as_default(self):
# 'https' should be the default scheme
self.httpShouldReturn(self.metaResponseAsJson(), scheme='https')
site = mwclient.Site('test.wikipedia.org')
assert len(responses.calls) == 1
assert responses.calls[0].request.method == 'GET'
@responses.activate
def test_max_lag(self):
# Client should wait and retry if lag exceeds max-lag
def request_callback(request):
if len(responses.calls) == 0:
return (200, {'x-database-lag': '0', 'retry-after': '0'}, '')
else:
return (200, {}, self.metaResponseAsJson())
self.httpShouldReturn(callback=request_callback, scheme='https')
site = mwclient.Site('test.wikipedia.org')
assert len(responses.calls) == 2
assert 'retry-after' in responses.calls[0].response.headers
assert 'retry-after' not in responses.calls[1].response.headers
@responses.activate
def test_http_error(self):
# Client should raise HTTPError
self.httpShouldReturn('Uh oh', scheme='https', status=400)
with pytest.raises(requests.exceptions.HTTPError):
site = mwclient.Site('test.wikipedia.org')
@responses.activate
def test_force_http(self):
# Setting http should work
self.httpShouldReturn(self.metaResponseAsJson(), scheme='http')
site = mwclient.Site('test.wikipedia.org', scheme='http')
assert len(responses.calls) == 1
@responses.activate
def test_user_agent_is_sent(self):
        # A user-specified user agent should be sent to the server
self.httpShouldReturn(self.metaResponseAsJson())
site = mwclient.Site('test.wikipedia.org', clients_useragent='MyFabulousClient')
assert 'MyFabulousClient' in responses.calls[0].request.headers['user-agent']
@responses.activate
def test_custom_headers_are_sent(self):
# Custom headers should be sent to the server
self.httpShouldReturn(self.metaResponseAsJson())
site = mwclient.Site('test.wikipedia.org', custom_headers={'X-Wikimedia-Debug': 'host=mw1099.eqiad.wmnet; log'})
assert 'host=mw1099.eqiad.wmnet; log' in responses.calls[0].request.headers['X-Wikimedia-Debug']
@responses.activate
def test_basic_request(self):
self.httpShouldReturn(self.metaResponseAsJson())
site = mwclient.Site('test.wikipedia.org')
assert 'action=query' in responses.calls[0].request.url
assert 'meta=siteinfo%7Cuserinfo' in responses.calls[0].request.url
@responses.activate
def test_httpauth_defaults_to_basic_auth(self):
self.httpShouldReturn(self.metaResponseAsJson())
site = mwclient.Site('test.wikipedia.org', httpauth=('me', 'verysecret'))
assert isinstance(site.connection.auth, requests.auth.HTTPBasicAuth)
@responses.activate
def test_httpauth_raise_error_on_invalid_type(self):
self.httpShouldReturn(self.metaResponseAsJson())
with pytest.raises(RuntimeError):
site = mwclient.Site('test.wikipedia.org', httpauth=1)
@responses.activate
def test_oauth(self):
self.httpShouldReturn(self.metaResponseAsJson())
site = mwclient.Site('test.wikipedia.org',
consumer_token='a', consumer_secret='b',
access_token='c', access_secret='d')
assert isinstance(site.connection.auth, OAuth1)
@responses.activate
def test_api_disabled(self):
# Should raise APIDisabledError if API is not enabled
self.httpShouldReturn('MediaWiki API is not enabled for this site.')
with pytest.raises(mwclient.errors.APIDisabledError):
site = mwclient.Site('test.wikipedia.org')
@responses.activate
def test_version(self):
# Should parse the MediaWiki version number correctly
self.httpShouldReturn(self.metaResponseAsJson(version='1.16'))
site = mwclient.Site('test.wikipedia.org')
assert site.initialized is True
assert site.version == (1, 16)
@responses.activate
def test_min_version(self):
# Should raise MediaWikiVersionError if API version is < 1.16
self.httpShouldReturn(self.metaResponseAsJson(version='1.15'))
with pytest.raises(mwclient.errors.MediaWikiVersionError):
site = mwclient.Site('test.wikipedia.org')
@responses.activate
def test_private_wiki(self):
# Should not raise error
self.httpShouldReturn(json.dumps({
'error': {
'code': 'readapidenied',
'info': 'You need read permission to use this module'
}
}))
site = mwclient.Site('test.wikipedia.org')
assert site.initialized is False
# ----- Use standard setup for rest
@responses.activate
def test_headers(self):
# Content-type should be 'application/x-www-form-urlencoded' for POST requests
site = self.stdSetup()
self.httpShouldReturn('{}', method='POST')
site.post('purge', title='Main Page')
assert len(responses.calls) == 1
assert 'content-type' in responses.calls[0].request.headers
assert responses.calls[0].request.headers['content-type'] == 'application/x-www-form-urlencoded'
@responses.activate
def test_raw_index(self):
# Initializing the client should result in one request
site = self.stdSetup()
self.httpShouldReturn('Some data', script='index')
site.raw_index(action='purge', title='Main Page', http_method='GET')
assert len(responses.calls) == 1
@responses.activate
def test_api_error_response(self):
# Test that APIError is thrown on error response
site = self.stdSetup()
self.httpShouldReturn(json.dumps({
'error': {
'code': 'assertuserfailed',
'info': 'Assertion that the user is logged in failed',
'*': 'See https://en.wikipedia.org/w/api.php for API usage'
}
}), method='POST')
with pytest.raises(mwclient.errors.APIError) as excinfo:
site.api(action='edit', title='Wikipedia:Sandbox')
assert excinfo.value.code == 'assertuserfailed'
assert excinfo.value.info == 'Assertion that the user is logged in failed'
assert len(responses.calls) == 1
@responses.activate
def test_smw_error_response(self):
# Test that APIError is thrown on error response from SMW
site = self.stdSetup()
self.httpShouldReturn(json.dumps({
'error': {
'query': u'Certains « [[ » dans votre requête n’ont pas été clos par des « ]] » correspondants.'
}
}), method='GET')
with pytest.raises(mwclient.errors.APIError) as excinfo:
list(site.ask('test'))
assert excinfo.value.code is None
assert excinfo.value.info == u'Certains « [[ » dans votre requête n’ont pas été clos par des « ]] » correspondants.'
assert len(responses.calls) == 1
@responses.activate
def test_smw_response_v0_5(self):
# Test that the old SMW results format is handled
site = self.stdSetup()
self.httpShouldReturn(json.dumps({
"query": {
"results": [
{
"exists": "",
"fulltext": "Indeks (bibliotekfag)",
"fullurl": "http://example.com/wiki/Indeks_(bibliotekfag)",
"namespace": 0,
"printouts": [
{
"0": "1508611329",
"label": "Endringsdato"
}
]
},
{
"exists": "",
"fulltext": "Serendipitet",
"fullurl": "http://example.com/wiki/Serendipitet",
"namespace": 0,
"printouts": [
{
"0": "1508611394",
"label": "Endringsdato"
}
]
}
],
"serializer": "SMW\\Serializers\\QueryResultSerializer",
"version": 0.5
}
}), method='GET')
answers = set(result['fulltext'] for result in site.ask('test'))
assert answers == set(('Serendipitet', 'Indeks (bibliotekfag)'))
@responses.activate
def test_smw_response_v2(self):
# Test that the new SMW results format is handled
site = self.stdSetup()
self.httpShouldReturn(json.dumps({
"query": {
"results": {
"Indeks (bibliotekfag)": {
"exists": "1",
"fulltext": "Indeks (bibliotekfag)",
"fullurl": "http://example.com/wiki/Indeks_(bibliotekfag)",
"namespace": 0,
"printouts": {
"Endringsdato": [{
"raw": "1/2017/10/17/22/50/4/0",
"label": "Endringsdato"
}]
}
},
"Serendipitet": {
"exists": "1",
"fulltext": "Serendipitet",
"fullurl": "http://example.com/wiki/Serendipitet",
"namespace": 0,
"printouts": {
"Endringsdato": [{
"raw": "1/2017/10/17/22/50/4/0",
"label": "Endringsdato"
}]
}
}
},
"serializer": "SMW\\Serializers\\QueryResultSerializer",
"version": 2
}
}), method='GET')
answers = set(result['fulltext'] for result in site.ask('test'))
assert answers == set(('Serendipitet', 'Indeks (bibliotekfag)'))
@responses.activate
def test_repr(self):
# Test repr()
site = self.stdSetup()
        assert repr(site) == "<Site object 'test.wikipedia.org/w/'>"
class TestLogin(TestCase):
@mock.patch('mwclient.client.Site.site_init')
@mock.patch('mwclient.client.Site.raw_api')
def test_old_login_flow(self, raw_api, site_init):
# The login flow used before MW 1.27 that starts with a action=login POST request
login_token = 'abc+\\'
def side_effect(*args, **kwargs):
if 'lgtoken' not in kwargs:
return {
'login': {'result': 'NeedToken', 'token': login_token}
}
elif 'lgname' in kwargs:
assert kwargs['lgtoken'] == login_token
return {
'login': {'result': 'Success'}
}
raw_api.side_effect = side_effect
site = mwclient.Site('test.wikipedia.org')
site.login('myusername', 'mypassword')
call_args = raw_api.call_args_list
assert len(call_args) == 3
assert call_args[0] == mock.call('query', 'GET', meta='tokens', type='login')
assert call_args[1] == mock.call('login', 'POST', lgname='myusername', lgpassword='mypassword')
assert call_args[2] == mock.call('login', 'POST', lgname='myusername', lgpassword='mypassword', lgtoken=login_token)
@mock.patch('mwclient.client.Site.site_init')
@mock.patch('mwclient.client.Site.raw_api')
def test_new_login_flow(self, raw_api, site_init):
# The login flow used from MW 1.27 that starts with a meta=tokens GET request
login_token = 'abc+\\'
def side_effect(*args, **kwargs):
if kwargs.get('meta') == 'tokens':
return {
'query': {'tokens': {'logintoken': login_token}}
}
elif 'lgname' in kwargs:
assert kwargs['lgtoken'] == login_token
return {
'login': {'result': 'Success'}
}
raw_api.side_effect = side_effect
site = mwclient.Site('test.wikipedia.org')
site.login('myusername', 'mypassword')
call_args = raw_api.call_args_list
assert len(call_args) == 2
assert call_args[0] == mock.call('query', 'GET', meta='tokens', type='login')
assert call_args[1] == mock.call('login', 'POST', lgname='myusername', lgpassword='mypassword', lgtoken=login_token)
class TestClientApiMethods(TestCase):
def setUp(self):
self.api = mock.patch('mwclient.client.Site.api').start()
self.api.return_value = self.metaResponse()
self.site = mwclient.Site('test.wikipedia.org')
def tearDown(self):
mock.patch.stopall()
def test_revisions(self):
self.api.return_value = {
'query': {'pages': {'1': {
'pageid': 1,
'title': 'Test page',
'revisions': [{
'revid': 689697696,
'timestamp': '2015-11-08T21:52:46Z',
'comment': 'Test comment 1'
}, {
'revid': 689816909,
'timestamp': '2015-11-09T16:09:28Z',
'comment': 'Test comment 2'
}]
}}}}
revisions = [rev for rev in self.site.revisions([689697696, 689816909], prop='content')]
args, kwargs = self.api.call_args
assert kwargs.get('revids') == '689697696|689816909'
assert len(revisions) == 2
assert revisions[0]['pageid'] == 1
assert revisions[0]['pagetitle'] == 'Test page'
assert revisions[0]['revid'] == 689697696
assert revisions[0]['timestamp'] == time.strptime('2015-11-08T21:52:46Z', '%Y-%m-%dT%H:%M:%SZ')
assert revisions[1]['revid'] == 689816909
class TestClientUploadArgs(TestCase):
def setUp(self):
self.raw_call = mock.patch('mwclient.client.Site.raw_call').start()
def configure(self, rights=['read', 'upload']):
self.raw_call.side_effect = [self.metaResponseAsJson(rights=rights)]
self.site = mwclient.Site('test.wikipedia.org')
self.vars = {
'fname': u'Some "ßeta" æøå.jpg',
            'comment': u'Some slightly complex comment π ≈ 3, © Me.jpg',
'token': u'abc+\\'
}
self.raw_call.side_effect = [
# 1st response:
self.makePageResponse(title='File:Test.jpg', imagerepository='local', imageinfo=[{
"comment": "",
"height": 1440,
"metadata": [],
"sha1": "69a764a9cf8307ea4130831a0aa0b9b7f9585726",
"size": 123,
"timestamp": "2013-12-22T07:11:07Z",
"user": "TestUser",
"width": 2160
}]),
# 2nd response:
json.dumps({'query': {'tokens': {'csrftoken': self.vars['token']}}}),
# 3rd response:
json.dumps({
"upload": {
"result": "Success",
"filename": self.vars['fname'],
"imageinfo": []
}
})
]
def tearDown(self):
mock.patch.stopall()
def test_upload_args(self):
# Test that methods are called, and arguments sent as expected
self.configure()
self.site.upload(file=StringIO('test'), filename=self.vars['fname'], comment=self.vars['comment'])
args, kwargs = self.raw_call.call_args
data = args[1]
files = args[2]
assert data.get('action') == 'upload'
assert data.get('filename') == self.vars['fname']
assert data.get('comment') == self.vars['comment']
assert data.get('token') == self.vars['token']
assert 'file' in files
def test_upload_missing_filename(self):
self.configure()
with pytest.raises(TypeError):
self.site.upload(file=StringIO('test'))
    def test_upload_ambiguous_args(self):
self.configure()
with pytest.raises(TypeError):
self.site.upload(filename='Test', file=StringIO('test'), filekey='abc')
def test_upload_missing_upload_permission(self):
self.configure(rights=['read'])
with pytest.raises(mwclient.errors.InsufficientPermission):
self.site.upload(filename='Test', file=StringIO('test'))
class TestClientGetTokens(TestCase):
def setUp(self):
self.raw_call = mock.patch('mwclient.client.Site.raw_call').start()
def configure(self, version='1.24'):
self.raw_call.return_value = self.metaResponseAsJson(version=version)
self.site = mwclient.Site('test.wikipedia.org')
responses.reset()
def tearDown(self):
mock.patch.stopall()
def test_token_new_system(self):
# Test get_token for MW >= 1.24
self.configure(version='1.24')
self.raw_call.return_value = json.dumps({
'query': {'tokens': {'csrftoken': 'sometoken'}}
})
self.site.get_token('edit')
args, kwargs = self.raw_call.call_args
data = args[1]
assert 'intoken' not in data
assert data.get('type') == 'csrf'
assert 'csrf' in self.site.tokens
assert self.site.tokens['csrf'] == 'sometoken'
assert 'edit' not in self.site.tokens
def test_token_old_system_without_specifying_title(self):
# Test get_token for MW < 1.24
self.configure(version='1.23')
self.raw_call.return_value = self.makePageResponse(edittoken='sometoken', title='Test')
self.site.get_token('edit')
args, kwargs = self.raw_call.call_args
data = args[1]
assert 'type' not in data
assert data.get('intoken') == 'edit'
assert 'edit' in self.site.tokens
assert self.site.tokens['edit'] == 'sometoken'
assert 'csrf' not in self.site.tokens
def test_token_old_system_with_specifying_title(self):
# Test get_token for MW < 1.24
self.configure(version='1.23')
self.raw_call.return_value = self.makePageResponse(edittoken='sometoken', title='Some page')
self.site.get_token('edit', title='Some page')
args, kwargs = self.raw_call.call_args
data = args[1]
assert self.site.tokens['edit'] == 'sometoken'
if __name__ == '__main__':
unittest.main()
mwclient-0.10.0/test/test_listing.py 0000664 0003720 0003720 00000006645 13521150461 020326 0 ustar travis travis 0000000 0000000 # encoding=utf-8
from __future__ import print_function
import unittest
import pytest
import logging
import requests
import responses
import mock
import mwclient
from mwclient.listing import List, GeneratorList
try:
import json
except ImportError:
import simplejson as json
if __name__ == "__main__":
print()
print("Note: Running in stand-alone mode. Consult the README")
print(" (section 'Contributing') for advice on running tests.")
print()
class TestList(unittest.TestCase):
def setUp(self):
pass
def setupDummyResponses(self, mock_site, result_member, ns=None):
if ns is None:
ns = [0, 0, 0]
mock_site.get.side_effect = [
{
'continue': {
'apcontinue': 'Kre_Mbaye',
'continue': '-||'
},
'query': {
result_member: [
{
"pageid": 19839654,
"ns": ns[0],
"title": "Kre'fey",
},
{
"pageid": 19839654,
"ns": ns[1],
"title": "Kre-O",
}
]
}
},
{
'query': {
result_member: [
{
"pageid": 30955295,
"ns": ns[2],
"title": "Kre-O Transformers",
}
]
}
},
]
@mock.patch('mwclient.client.Site')
def test_list_continuation(self, mock_site):
# Test that the list fetches all three responses
# and yields dicts when return_values not set
lst = List(mock_site, 'allpages', 'ap', limit=2)
self.setupDummyResponses(mock_site, 'allpages')
vals = [x for x in lst]
assert len(vals) == 3
assert type(vals[0]) == dict
@mock.patch('mwclient.client.Site')
def test_list_with_str_return_value(self, mock_site):
# Test that the List yields strings when return_values is string
lst = List(mock_site, 'allpages', 'ap', limit=2, return_values='title')
self.setupDummyResponses(mock_site, 'allpages')
vals = [x for x in lst]
assert len(vals) == 3
assert type(vals[0]) == str
@mock.patch('mwclient.client.Site')
def test_list_with_tuple_return_value(self, mock_site):
# Test that the List yields tuples when return_values is tuple
lst = List(mock_site, 'allpages', 'ap', limit=2,
return_values=('title', 'ns'))
self.setupDummyResponses(mock_site, 'allpages')
vals = [x for x in lst]
assert len(vals) == 3
assert type(vals[0]) == tuple
@mock.patch('mwclient.client.Site')
def test_generator_list(self, mock_site):
# Test that the GeneratorList yields Page objects
lst = GeneratorList(mock_site, 'pages', 'p')
self.setupDummyResponses(mock_site, 'pages', ns=[0, 6, 14])
vals = [x for x in lst]
assert len(vals) == 3
assert type(vals[0]) == mwclient.page.Page
assert type(vals[1]) == mwclient.image.Image
assert type(vals[2]) == mwclient.listing.Category
if __name__ == '__main__':
unittest.main()
mwclient-0.10.0/test/test_page.py 0000664 0003720 0003720 00000031322 13521150461 017557 0 ustar travis travis 0000000 0000000 # encoding=utf-8
from __future__ import print_function
import unittest
import pytest
import logging
import requests
import responses
import mock
import mwclient
from mwclient.page import Page
from mwclient.client import Site
from mwclient.listing import Category
from mwclient.errors import APIError, AssertUserFailedError, ProtectedPageError, InvalidPageTitle
try:
import json
except ImportError:
import simplejson as json
if __name__ == "__main__":
print()
print("Note: Running in stand-alone mode. Consult the README")
print(" (section 'Contributing') for advice on running tests.")
print()
class TestPage(unittest.TestCase):
def setUp(self):
pass
@mock.patch('mwclient.client.Site')
def test_api_call_on_page_init(self, mock_site):
# Check that site.get() is called once on Page init
title = 'Some page'
mock_site.get.return_value = {
'query': {'pages': {'1': {}}}
}
page = Page(mock_site, title)
# test that Page called site.get with the right parameters
mock_site.get.assert_called_once_with('query', inprop='protection', titles=title, prop='info')
@mock.patch('mwclient.client.Site')
def test_nonexisting_page(self, mock_site):
# Check that API response results in page.exists being set to False
title = 'Some nonexisting page'
mock_site.get.return_value = {
'query': {'pages': {'-1': {'missing': ''}}}
}
page = Page(mock_site, title)
assert page.exists is False
@mock.patch('mwclient.client.Site')
def test_existing_page(self, mock_site):
# Check that API response results in page.exists being set to True
title = 'Norge'
mock_site.get.return_value = {
'query': {'pages': {'728': {}}}
}
page = Page(mock_site, title)
assert page.exists is True
@mock.patch('mwclient.client.Site')
def test_invalid_title(self, mock_site):
# Check that API page.exists is False for invalid title
title = '[Test]'
mock_site.get.return_value = {
"query": {
"pages": {
"-1": {
"title": "[Test]",
"invalidreason": "The requested page title contains invalid characters: \"[\".",
"invalid": ""
}
}
}
}
with pytest.raises(InvalidPageTitle):
page = Page(mock_site, title)
@mock.patch('mwclient.client.Site')
def test_pageprops(self, mock_site):
        # Check that various page props are read correctly from the API response
title = 'Some page'
mock_site.get.return_value = {
'query': {
'pages': {
'728': {
'contentmodel': 'wikitext',
'counter': '',
'lastrevid': 13355471,
'length': 58487,
'ns': 0,
'pageid': 728,
'pagelanguage': 'nb',
'protection': [],
'title': title,
'touched': '2014-09-14T21:11:52Z'
}
}
}
}
page = Page(mock_site, title)
assert page.exists is True
assert page.redirect is False
assert page.revision == 13355471
assert page.length == 58487
assert page.namespace == 0
assert page.name == title
assert page.page_title == title
@mock.patch('mwclient.client.Site')
def test_protection_levels(self, mock_site):
# If page is protected, check that protection is parsed correctly
title = 'Some page'
mock_site.get.return_value = {
'query': {
'pages': {
'728': {
'protection': [
{
'expiry': 'infinity',
'level': 'autoconfirmed',
'type': 'edit'
},
{
'expiry': 'infinity',
'level': 'sysop',
'type': 'move'
}
]
}
}
}
}
mock_site.rights = ['read', 'edit', 'move']
page = Page(mock_site, title)
assert page.protection == {'edit': ('autoconfirmed', 'infinity'), 'move': ('sysop', 'infinity')}
assert page.can('read') is True
assert page.can('edit') is False # User does not have 'autoconfirmed' right
assert page.can('move') is False # User does not have 'sysop' right
mock_site.rights = ['read', 'edit', 'move', 'autoconfirmed']
assert page.can('edit') is True # User has 'autoconfirmed' right
assert page.can('move') is False # User doesn't have 'sysop' right
mock_site.rights = ['read', 'edit', 'move', 'autoconfirmed', 'editprotected']
assert page.can('edit') is True # User has 'autoconfirmed' right
assert page.can('move') is True # User has 'sysop' right
@mock.patch('mwclient.client.Site')
def test_redirect(self, mock_site):
# Check that page.redirect is set correctly
title = 'Some redirect page'
mock_site.get.return_value = {
"query": {
"pages": {
"796917": {
"contentmodel": "wikitext",
"counter": "",
"lastrevid": 9342494,
"length": 70,
"ns": 0,
"pageid": 796917,
"pagelanguage": "nb",
"protection": [],
"redirect": "",
"title": title,
"touched": "2014-08-29T22:25:15Z"
}
}
}
}
page = Page(mock_site, title)
assert page.exists is True
assert page.redirect is True
@mock.patch('mwclient.client.Site')
def test_captcha(self, mock_site):
# Check that Captcha results in EditError
mock_site.blocked = False
mock_site.rights = ['read', 'edit']
title = 'Norge'
mock_site.get.return_value = {
'query': {'pages': {'728': {'protection': []}}}
}
page = Page(mock_site, title)
mock_site.post.return_value = {
'edit': {'result': 'Failure', 'captcha': {
'type': 'math',
'mime': 'text/tex',
'id': '509895952',
'question': '36 + 4 = '
}}
}
# For now, mwclient will just raise an EditError.
#
with pytest.raises(mwclient.errors.EditError):
page.save('Some text')
class TestPageApiArgs(unittest.TestCase):
def setUp(self):
title = 'Some page'
self.page_text = 'Hello world'
MockSite = mock.patch('mwclient.client.Site').start()
self.site = MockSite()
self.site.get.return_value = {'query': {'pages': {'1': {'title': title}}}}
self.site.rights = ['read']
self.site.api_limit = 500
self.site.version = (1, 32, 0)
self.page = Page(self.site, title)
self.site.get.return_value = {'query': {'pages': {'2': {
'ns': 0, 'pageid': 2, 'revisions': [{'*': 'Hello world', 'timestamp': '2014-08-29T22:25:15Z'}], 'title': title
}}}}
def get_last_api_call_args(self, http_method='POST'):
if http_method == 'GET':
args, kwargs = self.site.get.call_args
else:
args, kwargs = self.site.post.call_args
action = args[0]
args = args[1:]
kwargs.update(args)
return kwargs
def tearDown(self):
mock.patch.stopall()
def test_get_page_text(self):
# Check that page.text() works, and that a correct API call is made
text = self.page.text()
args = self.get_last_api_call_args(http_method='GET')
assert text == self.page_text
assert args == {
'prop': 'revisions',
'rvdir': 'older',
'titles': self.page.page_title,
'uselang': None,
'rvprop': 'content|timestamp',
'rvlimit': '1',
'rvslots': 'main',
}
def test_get_page_text_cached(self):
# Check page.text() caching
self.page.revisions = mock.Mock(return_value=iter([]))
self.page.text()
self.page.text()
# When cache is hit, revisions is not, so call_count should be 1
assert self.page.revisions.call_count == 1
self.page.text(cache=False)
# With cache explicitly disabled, we should hit revisions
assert self.page.revisions.call_count == 2
def test_get_section_text(self):
# Check that the 'rvsection' parameter is sent to the API
text = self.page.text(section=0)
args = self.get_last_api_call_args(http_method='GET')
assert args['rvsection'] == '0'
def test_get_text_expanded(self):
# Check that the 'rvexpandtemplates' parameter is sent to the API
text = self.page.text(expandtemplates=True)
args = self.get_last_api_call_args(http_method='GET')
assert self.site.expandtemplates.call_count == 1
assert args.get('rvexpandtemplates') is None
def test_assertuser_true(self):
# Check that assert=user is sent when force_login=True
self.site.blocked = False
self.site.rights = ['read', 'edit']
self.site.logged_in = True
self.site.force_login = True
self.site.api.return_value = {
'edit': {'result': 'Ok'}
}
self.page.save('Some text')
args = self.get_last_api_call_args()
assert args['assert'] == 'user'
def test_assertuser_false(self):
# Check that assert=user is not sent when force_login=False
self.site.blocked = False
self.site.rights = ['read', 'edit']
self.site.logged_in = False
self.site.force_login = False
self.site.api.return_value = {
'edit': {'result': 'Ok'}
}
self.page.save('Some text')
args = self.get_last_api_call_args()
assert 'assert' not in args
def test_handle_edit_error_assertuserfailed(self):
# Check that AssertUserFailedError is triggered
api_error = APIError('assertuserfailed',
'Assertion that the user is logged in failed',
'See https://en.wikipedia.org/w/api.php for API usage')
with pytest.raises(AssertUserFailedError):
self.page.handle_edit_error(api_error, 'n/a')
def test_handle_edit_error_protected(self):
# Check that ProtectedPageError is triggered
api_error = APIError('protectedpage',
'The "editprotected" right is required to edit this page',
'See https://en.wikipedia.org/w/api.php for API usage')
with pytest.raises(ProtectedPageError) as pp_error:
self.page.handle_edit_error(api_error, 'n/a')
assert pp_error.value.code == 'protectedpage'
assert str(pp_error.value) == 'The "editprotected" right is required to edit this page'
def test_get_page_categories(self):
# Check that page.categories() works, and that a correct API call is made
self.site.get.return_value = {
"batchcomplete": "",
"query": {
"pages": {
"1009371": {
"pageid": 1009371,
"ns": 14,
"title": "Category:1879 births",
},
"1005547": {
"pageid": 1005547,
"ns": 14,
"title": "Category:1955 deaths",
}
}
}
}
cats = list(self.page.categories())
args = self.get_last_api_call_args(http_method='GET')
assert {
'generator': 'categories',
'titles': self.page.page_title,
'iiprop': 'timestamp|user|comment|url|size|sha1|metadata|archivename',
'inprop': 'protection',
'prop': 'info|imageinfo',
'gcllimit': repr(self.page.site.api_limit),
} == args
assert set([c.name for c in cats]) == set([
'Category:1879 births',
'Category:1955 deaths',
])
if __name__ == '__main__':
unittest.main()
mwclient-0.10.0/test/test_sleep.py 0000664 0003720 0003720 00000003047 13521150461 017756 0 ustar travis travis 0000000 0000000 # encoding=utf-8
from __future__ import print_function
import unittest
import time
import mock
import pytest
from mwclient.sleep import Sleepers
from mwclient.sleep import Sleeper
from mwclient.errors import MaximumRetriesExceeded
if __name__ == "__main__":
print()
print("Note: Running in stand-alone mode. Consult the README")
print(" (section 'Contributing') for advice on running tests.")
print()
class TestSleepers(unittest.TestCase):
def setUp(self):
self.sleep = mock.patch('time.sleep').start()
self.max_retries = 10
self.sleepers = Sleepers(self.max_retries, 30)
def tearDown(self):
mock.patch.stopall()
def test_make(self):
sleeper = self.sleepers.make()
assert type(sleeper) == Sleeper
assert sleeper.retries == 0
def test_sleep(self):
sleeper = self.sleepers.make()
sleeper.sleep()
sleeper.sleep()
self.sleep.assert_has_calls([mock.call(0), mock.call(30)])
def test_min_time(self):
sleeper = self.sleepers.make()
sleeper.sleep(5)
self.sleep.assert_has_calls([mock.call(5)])
def test_retries_count(self):
sleeper = self.sleepers.make()
sleeper.sleep()
sleeper.sleep()
assert sleeper.retries == 2
def test_max_retries(self):
sleeper = self.sleepers.make()
for x in range(self.max_retries):
sleeper.sleep()
with pytest.raises(MaximumRetriesExceeded):
sleeper.sleep()
if __name__ == '__main__':
unittest.main()
mwclient-0.10.0/test/test_util.py 0000664 0003720 0003720 00000001505 13521150461 017620 0 ustar travis travis 0000000 0000000 # encoding=utf-8
from __future__ import print_function
import unittest
import time
from mwclient.util import parse_timestamp
if __name__ == "__main__":
print()
print("Note: Running in stand-alone mode. Consult the README")
print(" (section 'Contributing') for advice on running tests.")
print()
class TestUtil(unittest.TestCase):
def test_parse_missing_timestamp(self):
assert time.struct_time((0, 0, 0, 0, 0, 0, 0, 0, 0)) == parse_timestamp(None)
def test_parse_empty_timestamp(self):
assert time.struct_time((0, 0, 0, 0, 0, 0, 0, 0, 0)) == parse_timestamp('0000-00-00T00:00:00Z')
def test_parse_nonempty_timestamp(self):
assert time.struct_time((2015, 1, 2, 20, 18, 36, 4, 2, -1)) == parse_timestamp('2015-01-02T20:18:36Z')
if __name__ == '__main__':
unittest.main()