oops_datedir_repo-0.0.24/0000755000175000017500000000000013274306551017006 5ustar cjwatsoncjwatson00000000000000oops_datedir_repo-0.0.24/PKG-INFO0000644000175000017500000001074213274306551020107 0ustar cjwatsoncjwatson00000000000000Metadata-Version: 1.1 Name: oops_datedir_repo Version: 0.0.24 Summary: OOPS disk serialisation and repository management. Home-page: https://launchpad.net/python-oops-datedir-repo Author: Launchpad Developers Author-email: launchpad-dev@lists.launchpad.net License: UNKNOWN Description: ************************************************************************* python-oops-datedir-repo: A simple disk repository for OOPS Error reports ************************************************************************* Copyright (c) 2011, Canonical Ltd This program is free software: you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation, version 3 only. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. You should have received a copy of the GNU Lesser General Public License along with this program. If not, see . GNU Lesser General Public License version 3 (see the file LICENSE). This is a component of the python-oops project: https://launchpad.net/python-oops. An OOPS report is a report about something going wrong in a piece of software... thus, an 'oops' :) This package provides disk storage, management, and a serialisation format for OOPSes stored in the repository. Programs or services that are generating OOPS reports need this package or other similar ones, if they want to persist the reports. Dependencies ============ * Python 2.6+ * The oops package (https://launchpad.net/python-oops or 'oops' on pypi). 
Testing Dependencies
====================

* fixtures (http://pypi.python.org/pypi/fixtures)

* subunit (http://pypi.python.org/pypi/python-subunit) (optional)

* testtools (http://pypi.python.org/pypi/testtools)

Usage
=====

oops_datedir_repo is an extension package for the oops package.

The DateDirRepo class provides an OOPS publisher (``DateDirRepo.publish``)
which will write OOPSes into the repository.

Retrieving OOPSes can be done by using the low level serializer_rfc822
functions: an OOPS report can be written to a disk file via the
serializer_rfc822.write() function, and read via the matching read() function.

Typical usage::

  >>> config = oops.Config()
  >>> with fixtures.TempDir() as tempdir:
  ...     repo = oops_datedir_repo.DateDirRepo(tempdir.path)
  ...     config.publishers.append(repo.publish)
  ...     ids = config.publish({'oops': '!!!'})

For more information see the oops package documentation or the api docs.

Installation
============

Either run setup.py in an environment with all the dependencies available, or
add the working directory to your PYTHONPATH.

Development
===========

Upstream development takes place at
https://launchpad.net/python-oops-datedir-repo. To set up a working area for
development, if the dependencies are not immediately available, you can use
./bootstrap.py to create bin/buildout, then bin/py to get a python interpreter
with the dependencies available.

To run the tests use the runner of your choice; the test suite is
oops_datedir_repo.tests.test_suite.

For instance:: $ bin/py -m testtools.run oops_datedir_repo.tests.test_suite If you have testrepository you can run the tests with that:: $ testr run Platform: UNKNOWN Classifier: Development Status :: 2 - Pre-Alpha Classifier: Intended Audience :: Developers Classifier: License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL) Classifier: Operating System :: OS Independent Classifier: Programming Language :: Python Classifier: Programming Language :: Python :: 2 Classifier: Programming Language :: Python :: 3 oops_datedir_repo-0.0.24/setup.py0000755000175000017500000000434713274305354020533 0ustar cjwatsoncjwatson00000000000000#!/usr/bin/env python # # Copyright (c) 2011, Canonical Ltd # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU Lesser General Public License as published by # the Free Software Foundation, version 3 only. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU Lesser General Public License for more details. # # You should have received a copy of the GNU Lesser General Public License # along with this program. If not, see . # GNU Lesser General Public License version 3 (see the file LICENSE). 
from distutils.core import setup import os.path with open(os.path.join(os.path.dirname(__file__), 'README')) as f: description = f.read() setup(name="oops_datedir_repo", version="0.0.24", description="OOPS disk serialisation and repository management.", long_description=description, maintainer="Launchpad Developers", maintainer_email="launchpad-dev@lists.launchpad.net", url="https://launchpad.net/python-oops-datedir-repo", packages=['oops_datedir_repo'], package_dir = {'':'.'}, classifiers = [ 'Development Status :: 2 - Pre-Alpha', 'Intended Audience :: Developers', 'License :: OSI Approved :: GNU Library or Lesser General Public License (LGPL)', 'Operating System :: OS Independent', 'Programming Language :: Python', 'Programming Language :: Python :: 2', 'Programming Language :: Python :: 3', ], install_requires = [ 'bson', 'iso8601', 'launchpadlib', # Needed for pruning - perhaps should be optional. 'oops>=0.0.11', 'pytz', 'six', ], extras_require = dict( test=[ 'fixtures', 'testtools', ] ), entry_points=dict( console_scripts=[ # `console_scripts` is a magic name to setuptools 'bsondump = oops_datedir_repo.bsondump:main', 'prune = oops_datedir_repo.prune:main', ]), ) oops_datedir_repo-0.0.24/oops_datedir_repo/0000755000175000017500000000000013274306551022507 5ustar cjwatsoncjwatson00000000000000oops_datedir_repo-0.0.24/oops_datedir_repo/bsondump.py0000644000175000017500000000261013274305354024707 0ustar cjwatsoncjwatson00000000000000# # Copyright (c) 2011, Canonical Ltd # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU Lesser General Public License as published by # the Free Software Foundation, version 3 only. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU Lesser General Public License for more details. 
#
# You should have received a copy of the GNU Lesser General Public License
# along with this program.  If not, see .
# GNU Lesser General Public License version 3 (see the file LICENSE).

"""Print a BSON document for easier human inspection.

This can be used for oopses, which are commonly (though not necessarily)
stored as BSON.

usage: bsondump FILE
"""

from __future__ import absolute_import, print_function

from pprint import pprint
import sys

from oops_datedir_repo import anybson as bson


def main(argv=None):
    if argv is None:
        argv = sys.argv
    if len(argv) != 2:
        # print_function is imported above, so print must be called.
        print(__doc__)
        sys.exit(1)
    # I'd like to use json here, but not everything serializable in bson is
    # easily representable in json - even before getting in to the weird parts,
    # oopses commonly have datetime objects. -- mbp 2011-12-20
    # Read in binary mode and close the file promptly; the file() builtin
    # does not exist on Python 3.
    with open(argv[1], 'rb') as f:
        pprint(bson.loads(f.read()))
oops_datedir_repo-0.0.24/oops_datedir_repo/serializer.py0000644000175000017500000000407213274305354025235 0ustar cjwatsoncjwatson00000000000000
# Copyright (c) 2011, Canonical Ltd
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, version 3 only.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this program.  If not, see .
# GNU Lesser General Public License version 3 (see the file LICENSE).

"""Read from any known serializer.

Where possible using the specific known serializer is better as it is more
efficient and won't suffer false positives if two serializations happen to pun
with each other (unlikely though that is).

Typical usage: >>> fp = file('an-oops', 'rb') >>> report = serializer.read(fp) See the serializer_rfc822 and serializer_bson modules for information about serializing OOPS reports by hand. Generally just using the DateDirRepo.publish method is all that is needed. """ from __future__ import absolute_import, print_function __all__ = [ 'read', ] import bz2 from io import BytesIO from oops_datedir_repo import ( anybson as bson, serializer_bson, serializer_rfc822, ) def read(fp): """Deserialize an OOPS from a bson or rfc822 message. The whole file is read regardless of the OOPS format. It should be opened in binary mode. :raises IOError: If the file has no content. """ # Deal with no-rewindable file pointers. content = fp.read() if len(content) == 0: # This OOPS has no content raise IOError("Empty OOPS Report") if content[0:3] == b"BZh": content = bz2.decompress(content) try: return serializer_bson.read(BytesIO(content)) except (KeyError, ValueError, IndexError, bson.InvalidBSON): return serializer_rfc822.read(BytesIO(content)) oops_datedir_repo-0.0.24/oops_datedir_repo/anybson.py0000644000175000017500000000220313274305354024527 0ustar cjwatsoncjwatson00000000000000# Copyright (c) 2012, Canonical Ltd # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU Lesser General Public License as published by # the Free Software Foundation, version 3 only. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU Lesser General Public License for more details. # # You should have received a copy of the GNU Lesser General Public License # along with this program. If not, see . # GNU Lesser General Public License version 3 (see the file LICENSE). 
from __future__ import absolute_import, print_function __all__ = [ 'dumps', 'loads', ] try: from bson import dumps, loads # Create the exception that won't be raised by this version of # bson class InvalidBSON(Exception): pass except ImportError: from bson import BSON, InvalidBSON def dumps(obj): return BSON.encode(obj) def loads(data): return BSON(data).decode(tz_aware=True) oops_datedir_repo-0.0.24/oops_datedir_repo/prune.py0000644000175000017500000001450613274305354024220 0ustar cjwatsoncjwatson00000000000000# # Copyright (c) 2011, Canonical Ltd # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU Lesser General Public License as published by # the Free Software Foundation, version 3 only. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU Lesser General Public License for more details. # # You should have received a copy of the GNU Lesser General Public License # along with this program. If not, see . # GNU Lesser General Public License version 3 (see the file LICENSE). """Delete OOPSes that are not referenced in the bugtracker. Currently only has support for the Launchpad bug tracker. """ from __future__ import absolute_import, print_function __metaclass__ = type import datetime import logging import optparse from textwrap import dedent import sys from launchpadlib.launchpad import Launchpad from launchpadlib.uris import lookup_service_root from pytz import utc import oops_datedir_repo __all__ = [ 'main', ] class LaunchpadTracker: """Abstracted bug tracker/forums etc - permits testing of main().""" def __init__(self, options): self.lp = Launchpad.login_anonymously( 'oops-prune', options.lpinstance, version='devel') def find_oops_references(self, start_time, end_time, project=None, projectgroup=None): """Find oops references from start_time to end_time. 
:param project: Either None or a project name, or a list of projects. :param projectgroup: Either None or a project group name or a list of project group names. """ projects = set([]) if project is not None: if type(project) is not list: project = [project] projects.update(project) if projectgroup is not None: if type(projectgroup) is not list: projectgroup = [projectgroup] for group in projectgroup: [projects.add(lp_proj.name) for lp_proj in self.lp.project_groups[group].projects] result = set() lp_projects = self.lp.projects one_week = datetime.timedelta(weeks=1) for project in projects: lp_project = lp_projects[project] current_start = start_time while current_start < end_time: current_end = current_start + one_week if current_end > end_time: current_end = end_time logging.info( "Querying OOPS references on %s from %s to %s", project, current_start, current_end) result.update(lp_project.findReferencedOOPS( start_date=current_start, end_date=current_end)) current_start = current_end return result def main(argv=None, tracker=LaunchpadTracker, logging=logging): """Console script entry point.""" if argv is None: argv = sys.argv usage = dedent("""\ %prog [options] The following options must be supplied: --repo And at least one of either --project or --projectgroup e.g. %prog --repo . --projectgroup launchpad-project Will process every member project of launchpad-project. --project and --projectgroup can be supplied multiple times. When run this program will ask Launchpad for OOPS references made since the last date it pruned up to, with an upper limit of one week from today. It then looks in the repository for all oopses created during that date range, and if they are not in the set returned by Launchpad, deletes them. If the repository has never been pruned before, it will pick the earliest datedir present in the repository as the start date. """) description = \ "Delete OOPS reports that are not referenced in a bug tracker." 
parser = optparse.OptionParser( description=description, usage=usage) parser.add_option('--project', action="append", help="Launchpad project to find references in.") parser.add_option('--projectgroup', action="append", help="Launchpad project group to find references in.") parser.add_option('--repo', help="Path to the repository to read from.") parser.add_option( '--lpinstance', help="Launchpad instance to use", default="production") options, args = parser.parse_args(argv[1:]) def needed(*optnames): present = set() for optname in optnames: if getattr(options, optname, None) is not None: present.add(optname) if not present: if len(optnames) == 1: raise ValueError('Option "%s" must be supplied' % optname) else: raise ValueError( 'One of options %s must be supplied' % (optnames,)) needed('repo') needed('project', 'projectgroup') logging.basicConfig( filename='prune.log', filemode='w', level=logging.DEBUG) repo = oops_datedir_repo.DateDirRepo(options.repo) one_week = datetime.timedelta(weeks=1) one_day = datetime.timedelta(days=1) # Only prune OOPS reports more than one week old. prune_until = datetime.datetime.now(utc) - one_week # Ignore OOPS reports we already found references for - older than the last # prune date. try: prune_from = repo.get_config('pruned-until') except KeyError: try: oldest_oops = repo.oldest_date() except ValueError: logging.info("No OOPSes in repo, nothing to do.") return 0 midnight_utc = datetime.time(tzinfo=utc) prune_from = datetime.datetime.combine(oldest_oops, midnight_utc) # The tracker finds all the references for the selected dates. finder = tracker(options) references = finder.find_oops_references( prune_from, datetime.datetime.now(utc), options.project, options.projectgroup) # Then we can delete the unreferenced oopses. repo.prune_unreferenced(prune_from, prune_until, references) # And finally save the fact we have scanned up to the selected date. 
repo.set_config('pruned-until', prune_until) return 0 oops_datedir_repo-0.0.24/oops_datedir_repo/repository.py0000644000175000017500000003010413274305354025276 0ustar cjwatsoncjwatson00000000000000# # Copyright (c) 2011, Canonical Ltd # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU Lesser General Public License as published by # the Free Software Foundation, version 3 only. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU Lesser General Public License for more details. # # You should have received a copy of the GNU Lesser General Public License # along with this program. If not, see . # GNU Lesser General Public License version 3 (see the file LICENSE). """The primary interface to oopses stored on disk - the DateDirRepo.""" from __future__ import absolute_import, print_function __metaclass__ = type __all__ = [ 'DateDirRepo', ] import datetime import errno from functools import partial from hashlib import md5 import os.path import stat from pytz import utc from oops_datedir_repo import ( anybson as bson, serializer, serializer_bson, ) class DateDirRepo: """Publish oopses to a date-dir repository. A date-dir repository is a directory containing: * Zero or one directories called 'metadata'. If it exists this directory contains any housekeeping material needed (such as a metadata.conf ini file). * Zero or more directories named like YYYY-MM-DD, which contain zero or more OOPS reports. OOPS file names can take various forms, but must not end in .tmp - those are considered to be OOPS reports that are currently being written. * The behaviour of this class is to assign OOPS file names by hashing the serialized OOPS to get a unique file name. Other naming schemes are valid - the code doesn't assume anything other than the .tmp limitation above. 
""" def __init__(self, error_dir, serializer=None, inherit_id=False, stash_path=False): """Create a DateDirRepo. :param error_dir: The base directory to write OOPSes into. OOPSes are written into a subdirectory this named after the date (e.g. 2011-12-30). :param serializer: If supplied should be the module (e.g. oops_datedir_repo.serializer_rfc822) to use to serialize OOPSes. Defaults to using serializer_bson. :param inherit_id: If True, use the oops ID (if present) supplied in the report, rather than always assigning a new one. :param stash_path: If True, the filename that the OOPS was written to is stored in the OOPS report under the key 'datedir_repo_filepath'. It is not stored in the OOPS written to disk, only the in-memory model. """ self.root = error_dir if serializer is None: serializer = serializer_bson self.serializer = serializer self.inherit_id = inherit_id self.stash_path = stash_path self.metadatadir = os.path.join(self.root, 'metadata') self.config_path = os.path.join(self.metadatadir, 'config.bson') def publish(self, report, now=None): """Write the report to disk. The report is written to a temporary file, and then renamed to its final location. Programs concurrently reading from a DateDirRepo should ignore files ending in .tmp. :param now: The datetime to use as the current time. Will be determined if not supplied. Useful for testing. """ # We set file permission to: rw-r--r-- (so that reports from # umask-restricted services can be gathered by a tool running as # another user). wanted_file_permission = ( stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH) if now is not None: now = now.astimezone(utc) else: now = datetime.datetime.now(utc) # Don't mess with the original report when changing ids etc. 
original_report = report report = dict(report) md5hash = md5(serializer_bson.dumps(report)).hexdigest() oopsid = 'OOPS-%s' % md5hash prefix = os.path.join(self.root, now.strftime('%Y-%m-%d')) if not os.path.isdir(prefix): try: os.makedirs(prefix) except OSError as err: # EEXIST - dir created by another, concurrent process if err.errno != errno.EEXIST: raise # For directories we need to set the x bits too. os.chmod( prefix, wanted_file_permission | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH) filename = os.path.join(prefix, oopsid) if self.inherit_id: oopsid = report.get('id') or oopsid report['id'] = oopsid with open(filename + '.tmp', 'wb') as f: self.serializer.write(report, f) os.rename(filename + '.tmp', filename) if self.stash_path: original_report['datedir_repo_filepath'] = filename os.chmod(filename, wanted_file_permission) return [report['id']] def republish(self, publisher): """Republish the contents of the DateDirRepo to another publisher. This makes it easy to treat a DateDirRepo as a backing store in message queue environments: if the message queue is down, flush to the DateDirRepo, then later pick the OOPSes up and send them to the message queue environment. For instance: >>> repo = DateDirRepo('.') >>> repo.publish({'some':'report'}) >>> queue = [] >>> def queue_publisher(report): ... queue.append(report) ... return report['id'] >>> repo.republish(queue_publisher) Will scan the disk and send the single found report to queue_publisher, deleting the report afterwards. Empty datedir directories are automatically cleaned up, as are stale .tmp files. If the publisher returns None, signalling that it did not publish the report, then the report is not deleted from disk. 
""" two_days = datetime.timedelta(2) now = datetime.date.today() old = now - two_days for dirname, (y,m,d) in self._datedirs(): date = datetime.date(y, m, d) prune = date < old dirpath = os.path.join(self.root, dirname) files = os.listdir(dirpath) if not files and prune: # Cleanup no longer needed directory. os.rmdir(dirpath) for candidate in map(partial(os.path.join, dirpath), files): if candidate.endswith('.tmp'): if prune: os.unlink(candidate) continue with open(candidate, 'rb') as report_file: try: report = serializer.read(report_file) except IOError as e: if e.args[0] == 'Empty OOPS Report': report = None else: raise if report is not None: oopsid = publisher(report) if (report is None and prune) or (report is not None and oopsid): os.unlink(candidate) def _datedirs(self): """Yield each subdir which looks like a datedir.""" for dirname in os.listdir(self.root): try: y, m, d = dirname.split('-') y = int(y) m = int(m) d = int(d) except ValueError: # Not a datedir continue yield dirname, (y, m, d) def _read_config(self): """Return the current config document from disk.""" try: with open(self.config_path, 'rb') as config_file: return bson.loads(config_file.read()) except IOError as e: if e.errno != errno.ENOENT: raise return {} def get_config(self, key): """Return a key from the repository config. :param key: A key to read from the config. """ return self._read_config()[key] def set_config(self, key, value): """Set config option key to value. This is written to the bson document root/metadata/config.bson :param key: The key to set - anything that can be a key in a bson document. :param value: The value to set - anything that can be a value in a bson document. 
""" config = self._read_config() config[key] = value try: with open(self.config_path + '.tmp', 'wb') as config_file: config_file.write(bson.dumps(config)) except IOError as e: if e.errno != errno.ENOENT: raise os.mkdir(self.metadatadir) with open(self.config_path + '.tmp', 'wb') as config_file: config_file.write(bson.dumps(config)) os.rename(self.config_path + '.tmp', self.config_path) def oldest_date(self): """Return the date of the oldest datedir in the repository. If pruning / resubmission is working this should also be the date of the oldest oops in the repository. """ dirs = list(self._datedirs()) if not dirs: raise ValueError("No OOPSes in repository.") return datetime.date(*sorted(dirs)[0][1]) def prune_unreferenced(self, start_time, stop_time, references): """Delete OOPS reports filed between start_time and stop_time. A report is deleted if all of the following are true: * it is in a datedir covered by [start_time, stop_time] inclusive of the end points. * It is not in the set references. * Its timestamp falls between start_time and stop_time inclusively or it's timestamp is outside the datedir it is in or there is no timestamp on the report. :param start_time: The lower bound to prune within. :param stop_time: The upper bound to prune within. :param references: An iterable of OOPS ids to keep. """ start_date = start_time.date() stop_date = stop_time.date() midnight = datetime.time(tzinfo=utc) for dirname, (y,m,d) in self._datedirs(): dirdate = datetime.date(y, m, d) if dirdate < start_date or dirdate > stop_date: continue dirpath = os.path.join(self.root, dirname) files = os.listdir(dirpath) deleted = 0 for candidate in map(partial(os.path.join, dirpath), files): if candidate.endswith('.tmp'): # Old half-written oops: just remove. 
os.unlink(candidate) deleted += 1 continue with open(candidate, 'rb') as report_file: report = serializer.read(report_file) report_time = report.get('time', None) if (report_time is None or getattr(report_time, 'date', None) is None or report_time.date() < dirdate or report_time.date() > dirdate): # The report is oddly filed or missing a precise # datestamp. Treat it like midnight on the day of the # directory it was placed in - this is a lower bound on # when it was actually created. report_time = datetime.datetime.combine( dirdate, midnight) if (report_time >= start_time and report_time <= stop_time and report['id'] not in references): # Unreferenced and prunable os.unlink(candidate) deleted += 1 if deleted == len(files): # Everything in the directory was deleted. os.rmdir(dirpath) oops_datedir_repo-0.0.24/oops_datedir_repo/serializer_rfc822.py0000644000175000017500000002023113274305354026316 0ustar cjwatsoncjwatson00000000000000# Copyright (c) 2010, 2011, Canonical Ltd # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU Lesser General Public License as published by # the Free Software Foundation, version 3 only. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU Lesser General Public License for more details. # # You should have received a copy of the GNU Lesser General Public License # along with this program. If not, see . # GNU Lesser General Public License version 3 (see the file LICENSE). """Read / Write an OOPS dict as an rfc822 formatted message. This style of OOPS format is very web server specific, not extensible - it should be considered deprecated. The reports this serializer handles always have the following variables (See the python-oops api docs for more information about these variables): * id: The name of this error report. 
* type: The type of the exception that occurred.
* value: The value of the exception that occurred.
* time: The time at which the exception occurred.
* reporter: The reporting program.
* topic: The identifier for the template/script that oopsed.
  [this is written as Page-Id for compatibility with as yet unported tools.]
* branch_nick: A name for the branch of code that was running when the report
  was triggered.
* revno: The revision that the branch was at.
* tb_text: A text version of the traceback.
* username: The user associated with the request.
* url: The URL for the failed request.
* req_vars: The request variables. Either a list of 2-tuples or a dict.
* Informational: A flag, True if the error wasn't fatal - if it was
  'informational'.
  [Deprecated - this is no longer part of the oops report conventions.
  Existing reports with it set are still read, but the key is only
  present if it was truly in the report.]
"""

from __future__ import absolute_import, print_function

__all__ = [
    'read',
    'write',
    ]

__metaclass__ = type

try:
    from email.parser import BytesParser
except ImportError:
    # On Python 2, email.parser.Parser will do well enough, since
    # bytes == str.
    from email.parser import Parser as BytesParser
import logging
import re

import iso8601
import six
from six.moves import intern
from six.moves.urllib_parse import (
    quote,
    unquote,
    )


def read(fp):
    """Deserialize an OOPS from an RFC822 format message."""
    msg = BytesParser().parse(fp, headersonly=True)
    id = msg.get('oops-id')
    exc_type = msg.get('exception-type')
    exc_value = msg.get('exception-value')
    datestr = msg.get('date')
    if datestr is not None:
        date = iso8601.parse_date(datestr)
    else:
        date = None
    topic = msg.get('topic')
    if topic is None:
        # Older reports store the topic under Page-Id.
        topic = msg.get('page-id')
    username = msg.get('user')
    url = msg.get('url')
    try:
        duration = float(msg.get('duration', '-1'))
    except ValueError:
        duration = float(-1)
    informational = msg.get('informational')
    branch_nick = msg.get('branch')
    revno = msg.get('revision')
    reporter = msg.get('oops-reporter')

    # Explicitly use an iterator so we can process the file sequentially.
    lines = iter(msg.get_payload().splitlines(True))

    statement_pat = re.compile(r'^(\d+)-(\d+)(?:@([\w-]+))?\s+(.*)')

    def is_req_var(line):
        return "=" in line and not statement_pat.match(line)

    def is_traceback(line):
        return line.lower().startswith('traceback') or line.startswith(
            '== EXTRA DATA ==')

    req_vars = []
    statements = []
    first_tb_line = ''
    for line in lines:
        first_tb_line = line
        line = line.strip()
        if line == '':
            continue
        else:
            match = statement_pat.match(line)
            if match is not None:
                start, end, db_id, statement = match.groups()
                if db_id is not None:
                    db_id = intern(db_id)  # This string is repeated lots.
                statements.append(
                    [int(start), int(end), db_id, statement])
            elif is_req_var(line):
                key, value = line.split('=', 1)
                req_vars.append([unquote(key), unquote(value)])
            elif is_traceback(line):
                break
    req_vars = dict(req_vars)

    # The rest is traceback.
    tb_text = ''.join([first_tb_line] + list(lines))

    result = dict(id=id, type=exc_type, value=exc_value, time=date,
            topic=topic, tb_text=tb_text, username=username, url=url,
            duration=duration, req_vars=req_vars, timeline=statements,
            branch_nick=branch_nick, revno=revno)
    if informational is not None:
        result['informational'] = informational
    if reporter is not None:
        result['reporter'] = reporter
    return result


def _normalise_whitespace(s):
    """Normalise the whitespace in a bytestring to spaces."""
    if s is None:
        return None # (used by the cast to %s to get 'None')
    return b' '.join(s.split())


def _safestr(obj):
    if isinstance(obj, six.text_type):
        return obj.replace('\\', '\\\\').encode('ASCII', 'backslashreplace')
    # A call to str(obj) could raise anything at all.
    # We'll ignore these errors, and print something
    # useful instead, but also log the error.
    # We disable the pylint warning for the blank except.
    if isinstance(obj, six.binary_type):
        value = obj
    else:
        try:
            value = str(obj)
        except:
            logging.getLogger('oops_datedir_repo.serializer_rfc822').exception(
                'Error while getting a str '
                'representation of an object')
            # The placeholder text was lost in extraction (an empty format
            # string with an argument raises TypeError); this wording is a
            # reconstruction.
            value = '<unprintable %s object>' % (
                str(type(obj).__name__))
        # Some str() calls return unicode objects.
if isinstance(value, six.text_type): return _safestr(value) # encode non-ASCII characters value = value.replace(b'\\', b'\\\\') value = re.sub( br'[\x80-\xff]', lambda match: ('\\x%02x' % ord(match.group(0))).encode('UTF-8'), value) return value def to_chunks(report): """Returns a list of bytestrings making up the serialized oops.""" chunks = [] def header(label, key, optional=True): if optional and key not in report: return value = _safestr(report[key]) value = _normalise_whitespace(value) chunks.append(label.encode('UTF-8') + b': ' + value + b'\n') header('Oops-Id', 'id', optional=False) header('Exception-Type', 'type') header('Exception-Value', 'value') if 'time' in report: chunks.append( ('Date: %s\n' % report['time'].isoformat()).encode('UTF-8')) header('Page-Id', 'topic') header('Branch', 'branch_nick') header('Revision', 'revno') header('User', 'username') header('URL', 'url') header('Duration', 'duration') header('Informational', 'informational') header('Oops-Reporter', 'reporter') chunks.append(b'\n') safe_chars = ';/\\?:@&+$, ()*!' 
    if 'req_vars' in report:
        try:
            items = sorted(report['req_vars'].items())
        except AttributeError:
            items = report['req_vars']
        for key, value in items:
            chunk = '%s=%s\n' % (
                quote(_safestr(key), safe_chars),
                quote(_safestr(value), safe_chars))
            chunks.append(chunk.encode('UTF-8'))
        chunks.append(b'\n')
    if 'timeline' in report:
        for row in report['timeline']:
            (start, end, category, statement) = row[:4]
            chunks.append(
                ('%05d-%05d@' % (start, end)).encode('UTF-8') +
                _safestr(category) + b' ' +
                _normalise_whitespace(_safestr(statement)) + b'\n')
        chunks.append(b'\n')
    if 'tb_text' in report:
        chunks.append(_safestr(report['tb_text']))
    return chunks


def write(report, output):
    """Write a report to a file."""
    output.writelines(to_chunks(report))

oops_datedir_repo-0.0.24/oops_datedir_repo/__init__.py

#
# Copyright (c) 2011, Canonical Ltd
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, version 3 only.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# GNU Lesser General Public License version 3 (see the file LICENSE).

from __future__ import absolute_import, print_function

# same format as sys.version_info: "A tuple containing the five components of
# the version number: major, minor, micro, releaselevel, and serial. All
# values except releaselevel are integers; the release level is 'alpha',
# 'beta', 'candidate', or 'final'. The version_info value corresponding to the
# Python version 2.0 is (2, 0, 0, 'final', 0)."
# Additionally we use a
# releaselevel of 'dev' for unreleased under-development code.
#
# If the releaselevel is 'alpha' then the major/minor/micro components are not
# established at this point, and setup.py will use a version of next-$(revno).
# If the releaselevel is 'final', then the tarball will be major.minor.micro.
# Otherwise it is major.minor.micro~$(revno).
__version__ = (0, 0, 24, 'final', 0)

__all__ = [
    'DateDirRepo',
    ]

from oops_datedir_repo.repository import DateDirRepo

oops_datedir_repo-0.0.24/oops_datedir_repo/serializer_bson.py

# Copyright (c) 2011, Canonical Ltd
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation, version 3 only.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU Lesser General Public License for more details.
#
# You should have received a copy of the GNU Lesser General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
# GNU Lesser General Public License version 3 (see the file LICENSE).

"""Read / Write an OOPS dict as a bson dict.

This style of OOPS format is very extensible and maintains compatibility with
older rfc822 oops code: the previously mandatory keys are populated on read.

Use of bson serializing is recommended.

The reports this serializer handles always have the following variables (See
the python-oops api docs for more information about these variables):

* id: The name of this error report.
* type: The type of the exception that occurred.
* value: The value of the exception that occurred.
* time: The time at which the exception occurred.
* reporter: The reporting program.
* topic: The identifier for the template/script that oopsed.
* tb_text: A text version of the traceback.
* username: The user associated with the request.
* url: The URL for the failed request.
* req_vars: The request variables. Either a list of 2-tuples or a dict.
* branch_nick: A name for the branch of code that was running when the report
  was triggered.
* revno: The revision that the branch was at.
"""

from __future__ import absolute_import, print_function

__all__ = [
    'dumps',
    'read',
    'write',
    ]

__metaclass__ = type

from oops_datedir_repo import anybson as bson


def read(fp):
    """Deserialize an OOPS from a bson message."""
    report = bson.loads(fp.read())
    # Populate the keys that older rfc822-based code treats as mandatory.
    for key in (
            'branch_nick', 'revno', 'type', 'value', 'time', 'topic',
            'username', 'url'):
        report.setdefault(key, None)
    report.setdefault('duration', -1)
    report.setdefault('req_vars', {})
    report.setdefault('tb_text', '')
    report.setdefault('timeline', [])
    return report


def dumps(report):
    """Return a binary string representing report."""
    return bson.dumps(report)


def write(report, fp):
    """Write report to fp."""
    return fp.write(dumps(report))

oops_datedir_repo-0.0.24/README

*************************************************************************
python-oops-datedir-repo: A simple disk repository for OOPS Error reports
*************************************************************************

Copyright (c) 2011, Canonical Ltd

This program is free software: you can redistribute it and/or modify
it under the terms of the GNU Lesser General Public License as published by
the Free Software Foundation, version 3 only.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License
along with this program. If not, see <http://www.gnu.org/licenses/>.
GNU Lesser General Public License version 3 (see the file LICENSE).

This is a component of the python-oops project:
https://launchpad.net/python-oops.

An OOPS report is a report about something going wrong in a piece of
software... thus, an 'oops' :)

This package provides disk storage, management, and a serialisation format for
OOPSes stored in the repository. Programs or services that generate OOPS
reports need this package, or a similar one, if they want to persist the
reports.

Dependencies
============

* Python 2.6+

* The oops package (https://launchpad.net/python-oops or 'oops' on pypi).

Testing Dependencies
====================

* fixtures (http://pypi.python.org/pypi/fixtures)

* subunit (http://pypi.python.org/pypi/python-subunit) (optional)

* testtools (http://pypi.python.org/pypi/testtools)

Usage
=====

oops_datedir_repo is an extension package for the oops package.

The DateDirRepo class provides an OOPS publisher (``DateDirRepo.publish``)
which will write OOPSes into the repository.

OOPSes can be retrieved with the low level serializer_rfc822 functions: an
OOPS report can be written to a disk file via the serializer_rfc822.write()
function, and read via the matching read() function.

Typical usage::

  >>> import fixtures
  >>> import oops
  >>> import oops_datedir_repo
  >>> config = oops.Config()
  >>> with fixtures.TempDir() as tempdir:
  ...    repo = oops_datedir_repo.DateDirRepo('/tmp/demo')
  ...    config.publishers.append(repo.publish)
  ...    ids = config.publish({'oops': '!!!'})

For more information see the oops package documentation or the api docs.

Installation
============

Either run setup.py in an environment with all the dependencies available, or
add the working directory to your PYTHONPATH.

Development
===========

Upstream development takes place at
https://launchpad.net/python-oops-datedir-repo.
To set up a working area for development, if the dependencies are not
immediately available, you can use ./bootstrap.py to create bin/buildout, then
bin/py to get a python interpreter with the dependencies available.

To run the tests use the runner of your choice; the test suite is
oops_datedir_repo.tests.test_suite.

For instance::

  $ bin/py -m testtools.run oops_datedir_repo.tests.test_suite

If you have testrepository you can run the tests with that::

  $ testr run
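
As a rough illustration of the on-disk layout mentioned under Usage, the sketch
below imitates the header section that serializer_rfc822.write() produces. It
uses only the standard library and is deliberately simplified: the name
``sketch_header_chunks`` is hypothetical and not part of oops_datedir_repo, it
covers only four headers, and unlike the real serializer it neither requires
``id`` nor doubles backslashes.

```python
# Illustrative sketch only: a simplified, stdlib-only imitation of the header
# section written by serializer_rfc822. The function name is hypothetical and
# is NOT part of the oops_datedir_repo API.
import datetime


def sketch_header_chunks(report):
    """Return the header lines of a report as a list of bytestrings."""
    chunks = []
    # (label, key) pairs, in the order the serializer emits them.
    fields = [
        ('Oops-Id', 'id'),
        ('Exception-Type', 'type'),
        ('Exception-Value', 'value'),
    ]
    for label, key in fields:
        if key not in report:
            continue
        # Collapse runs of whitespace so each header stays on one line.
        value = ' '.join(str(report[key]).split())
        # Non-ASCII characters are written as backslash escapes.
        line = '%s: %s\n' % (label, value)
        chunks.append(line.encode('ASCII', 'backslashreplace'))
    if 'time' in report:
        chunks.append(
            ('Date: %s\n' % report['time'].isoformat()).encode('ASCII'))
    # A blank line terminates the header section.
    chunks.append(b'\n')
    return chunks


if __name__ == '__main__':
    demo = {
        'id': 'OOPS-1',
        'type': 'ValueError',
        'value': 'bad\n input',
        'time': datetime.datetime(2011, 1, 2, 3, 4, 5),
    }
    for chunk in sketch_header_chunks(demo):
        print(repr(chunk))
```

Returning a list of bytestrings rather than one joined string mirrors how the
package hands its chunks straight to ``output.writelines()``.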