powa-collector-1.2.0/.gitignore
.*.sw*
*.pyc
*.conf
**/__pycache__/
.pypirc
build/
dist/
powa_collector.egg-info/

powa-collector-1.2.0/CHANGELOG
1.2.0:
  New Features:
    - Automatically detect hypopg on remote servers (Julien Rouhaud, thanks to
      github user MikhailSaynukov for the request)
  Bugfixes:
    - Fix sleep time calculation (Marc Cousin)
    - Properly detect -Infinity as an unknown last snapshot (Julien Rouhaud)
    - Properly handle errors happening when retrieving the list of remote
      servers (Julien Rouhaud)
    - Properly detect the stop condition after checking if PoWA must be loaded
      (Julien Rouhaud)
    - Close all threads' connections in case of an uncaught error during a
      snapshot (Marc Cousin)
  Misc:
    - Immediately exit the worker thread if PoWA isn't present or can't be
      loaded (Julien Rouhaud)
    - Improve server list stdout logging when no server is found (Julien
      Rouhaud)
    - Do PoWA extension sanity checks for the dedicated repo connection too
      (Julien Rouhaud)
    - Fix compatibility with python 3.9 (Julien Rouhaud, per report from
      Christoph Berg)

1.1.1:
  Bugfix:
    - Make sure that the repo connection is available when getting the repo
      PoWA version (Julien Rouhaud, thanks to Adrien Nayrat for the report
      and testing the patch)

1.1.0:
  New features:
    - Avoid explicit "LOAD 'powa'" with PoWA 4.1.0, so a superuser isn't
      required anymore when PoWA isn't in shared_preload_libraries (Julien
      Rouhaud)
    - Store postgres and handled extensions versions on the repository server
      (Julien Rouhaud)
  Bug fixes:
    - Handle errors that might happen during snapshot (Julien Rouhaud)

1.0.0:
  New features:
    - Handle the new
query_cleanup query that may be run after getting remote data.
  Bugfix:
    - Let workers quit immediately if they're asked to stop.

0.0.3:
  Bugfix:
    - Support standard_conforming_strings = off
    - Better error message for remote servers lacking the powa extension
      (Thomas Reiss and Julien Rouhaud)

0.0.2:
  Bugfix:
    - Ignore deactivated servers
  Miscellaneous:
    - Set lock_timeout to 2s for every pg connection
    - Fully qualify all objects in SQL queries

0.0.1:
  Initial release

powa-collector-1.2.0/LICENSE
Copyright (c) 2018-2022, The PoWA-team

Permission to use, copy, modify, and distribute this software and its
documentation for any purpose, without fee, and without a written agreement
is hereby granted, provided that the above copyright notice and this
paragraph and the following two paragraphs appear in all copies.

IN NO EVENT SHALL The PoWA-team BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT,
SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS,
ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF The
PoWA-team HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

The PoWA-team SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT
LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS IS" BASIS,
AND The PoWA-team HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT,
UPDATES, ENHANCEMENTS, OR MODIFICATIONS.

powa-collector-1.2.0/README.md
Overview
========

This repository contains the `powa-collector` tool, a simple multi-threaded
python program that performs the snapshots for all the remote servers
configured in a powa repository database (in the **powa_servers** table).

Requirements
============

This program requires python 2.7 or python 3.  The required dependencies are
listed in the **requirements.txt** file.
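For instance, the dependencies can be installed with pip from a source
checkout (a sketch; the virtualenv path is illustrative):

```shell
# create an isolated environment (optional) and install the dependencies
python3 -m venv venv
. venv/bin/activate
pip install -r requirements.txt

# check that the tool starts
python powa-collector.py -V
```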
Configuration
=============

Copy the provided `powa-collector.conf-dist` file to a new
`powa-collector.conf` file, and adapt the **dsn** specification so that it
can connect to the wanted main PoWA repository.

Usage
=====

To start the program, simply run the powa-collector.py program.  A `SIGTERM`
or a `Keyboard Interrupt` on the program will cleanly stop all the threads
and exit the program.  A `SIGHUP` will reload the configuration.

Protocol
========

A minimal communication protocol is implemented, using the LISTEN/NOTIFY
facility provided by postgres, which is used by the powa-web project.  You
can send queries to the collector by sending messages on the
"powa_collector" channel.  The collector will send answers on the channel
you specified, so make sure to listen on it before sending any query so that
you don't miss answers.

The requests are of the following form:

    COMMAND RESPONSE_CHANNEL OPTIONAL_ARGUMENTS

- COMMAND: mandatory argument describing the query.  The following commands
  are supported:
  - RELOAD: reload the configuration and report that the main thread
    successfully received the command.  The reload will be attempted even if
    no response channel was provided.
  - WORKERS_STATUS: return a JSON (srvid is the key, status is the content)
    describing the status of each remote server thread.  The command is
    ignored if no response channel was provided.  This command accepts an
    optional argument to get the status of a single remote server,
    identified by its srvid.  If no worker exists for this server, an empty
    JSON will be returned.
- RESPONSE_CHANNEL: mandatory argument describing the NOTIFY channel the
  client listens on for a response.  '-' can be used if no answer should be
  sent.
- OPTIONAL_ARGUMENTS: space separated list of arguments, specific to the
  underlying command.

The answers are of the form:

    COMMAND STATUS DATA

- COMMAND: same as the command in the query
- STATUS: OK or KO
- DATA: the reason for the failure if status is KO, otherwise the data for
  the answer.
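The protocol above can be exercised with a small psycopg2 client.  The
following is a sketch: the response channel name, function names and DSN are
illustrative, not part of powa-collector itself.

```python
# A sketch of a powa-collector protocol client; the response channel name
# and DSN are illustrative, not part of powa-collector itself.

def parse_answer(payload):
    """Split an answer 'COMMAND STATUS DATA' into its three parts.

    DATA may itself contain spaces (e.g. a JSON document), so split at most
    twice.
    """
    parts = payload.split(' ', 2)
    command = parts[0]
    status = parts[1] if len(parts) > 1 else None
    data = parts[2] if len(parts) > 2 else None
    return (command, status, data)


def query_workers_status(dsn, channel="my_response_channel", timeout=10):
    """Send WORKERS_STATUS and return the parsed answers received in time."""
    import select

    import psycopg2  # imported lazily so parse_answer stays dependency-free

    conn = psycopg2.connect(dsn)
    conn.autocommit = True
    cur = conn.cursor()

    # listen on the response channel *before* sending the query, so the
    # collector's answer can't be missed
    cur.execute('LISTEN "%s"' % channel)
    cur.execute("NOTIFY powa_collector, 'WORKERS_STATUS %s'" % channel)

    # wait for the collector to answer, then drain the notifications
    answers = []
    if select.select([conn], [], [], timeout) != ([], [], []):
        conn.poll()
        while conn.notifies:
            answers.append(parse_answer(conn.notifies.pop(0).payload))

    conn.close()
    return answers
```

With a running collector, calling
`query_workers_status('postgresql://powa_user@localhost:5432/powa')` would
return something like `[('WORKERS_STATUS', 'OK', '{"1": "running"}')]`.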
powa-collector-1.2.0/powa-collector.conf-dist
{
    "repository": {
        "dsn": "postgresql://powa_user@localhost:5432/powa"
    },
    "debug": false
}

powa-collector-1.2.0/powa-collector.py
#!/usr/bin/env python

import getopt
import os
import sys
from powa_collector import PowaCollector, getVersion


def usage(rc):
    """Show tool usage"""
    print("""Usage:
  %s [ -? | --help ] [ -V | --version ]

    -? | --help     Show this message and exit
    -V | --version  Report powa-collector version and exit

See https://powa.readthedocs.io/en/latest/powa-collector/ for more information
about this tool.
""" % os.path.basename(__file__))
    sys.exit(rc)


def main():
    """Instantiates and starts a PowaCollector object"""
    try:
        opts, args = getopt.getopt(sys.argv[1:], "?V", ["help", "version"])
    except getopt.GetoptError as e:
        print(str(e))
        usage(1)

    for o, a in opts:
        if o in ("-?", "--help"):
            usage(0)
        elif o in ("-V", "--version"):
            print("%s version %s" % (os.path.basename(__file__),
                                     getVersion()))
            sys.exit(0)
        else:
            assert False, "unhandled option"

    app = PowaCollector()
    app.main()


if (__name__ == "__main__"):
    main()

powa-collector-1.2.0/powa_collector/__init__.py
"""
PowaCollector: powa-collector main application.

It takes a simple configuration file in json format, where repository.dsn
should point to an URI to connect to the repository server.  The list of
remote servers and their configuration will be retrieved from this repository
server.

It maintains a persistent dedicated connection to the repository server, for
monitoring and communication purpose.  It also starts one thread per remote
server.
These threads are kept in the "workers" dict attribute, with the key being
the textual identifier (host:port).  See powa_worker.py for more details
about those threads.

The main thread will intercept the following signals:

    - SIGHUP: reload configuration and log any change done
    - SIGTERM: cleanly terminate all threads and exit

A minimal communication protocol is implemented, using the LISTEN/NOTIFY
facility provided by postgres.  The dedicated main thread repository
connection listens on the "powa_collector" channel.  A client, such as
powa-web, can send requests on this channel and the main thread will act and
respond accordingly.

The requests are of the following form:

    COMMAND RESPONSE_CHANNEL OPTIONAL_ARGUMENTS

See the README.md file for the full protocol documentation.
"""

from powa_collector.options import (parse_options, get_full_config,
                                    add_servers_config)
from powa_collector.powa_worker import PowaThread
import psycopg2
import select
import logging
import json
import signal

__VERSION__ = '1.2.0'
__VERSION_NUM__ = [int(part) for part in __VERSION__.split('.')]


def getVersion():
    """Return powa_collector's version as a string"""
    return __VERSION__


class PowaCollector():
    """Main powa collector's class.

    This manages all collection tasks.

    Declare all attributes here, we don't want dynamic attributes
    """
    def __init__(self):
        """Instance creator.
Sets logging, signal handlers, and basic structure""" self.workers = {} self.logger = logging.getLogger("powa-collector") self.stopping = False raw_options = parse_options() loglevel = logging.INFO if (raw_options["debug"]): loglevel = logging.DEBUG extra = {'threadname': '-'} logging.basicConfig( format='%(asctime)s %(threadname)s %(levelname)-6s: %(message)s ', level=loglevel) self.logger = logging.LoggerAdapter(self.logger, extra) signal.signal(signal.SIGHUP, self.sighandler) signal.signal(signal.SIGTERM, self.sighandler) def connect(self, options): """Connect to the repository Used for communication with powa-web and users of the communication repository Persistent Threads will use distinct connections """ try: self.logger.debug("Connecting on repository...") self.__repo_conn = psycopg2.connect(options["repository"]['dsn']) self.__repo_conn.autocommit = True self.logger.debug("Connected.") cur = self.__repo_conn.cursor() # Setup a 2s lock_timeout if there's no inherited lock_timeout cur.execute("""SELECT pg_catalog.set_config(name, '2000', false) FROM pg_catalog.pg_settings WHERE name = 'lock_timeout' AND setting = '0'""") cur.execute("SET application_name = %s", ('PoWA collector - main thread' + ' (' + __VERSION__ + ')', )) # Listen on our dedicated powa_collector notify channel cur.execute("LISTEN powa_collector") # Check if powa-archivist is installed on the repository server cur.execute("""SELECT regexp_split_to_array(extversion, '\\.'), extversion FROM pg_catalog.pg_extension WHERE extname = 'powa'""") ver = cur.fetchone() cur.close() if ver is None: self.__repo_conn.close() self.__repo_conn = None self.logger.error("PoWA extension not found on repository " "server") return False elif (int(ver[0][0]) < 4): self.__repo_conn.close() self.__repo_conn = None self.logger.error("Incompatible PoWA version, found %s," " requires at least 4.0.0" % ver[1]) return False except psycopg2.Error as e: self.__repo_conn = None self.logger.error("Error connecting:\n%s", e) 
            return False

        return True

    def process_notification(self):
        """Process PostgreSQL NOTIFY messages.

        These come mainly from the UI, to ask us to reload our configuration,
        or to display the workers status
        """
        if (not self.__repo_conn):
            return

        self.__repo_conn.poll()
        cur = self.__repo_conn.cursor()

        while (self.__repo_conn.notifies):
            notif = self.__repo_conn.notifies.pop(0).payload.split(' ')
            cmd = notif.pop(0)
            channel = "-"
            status = "OK"
            data = None

            # the channel is mandatory, but if the message can be processed
            # without answering, we'll try to
            if (len(notif) > 0):
                channel = notif.pop(0)

            self.logger.debug("Received async command: %s %s %r" %
                              (cmd, channel, notif))

            if (cmd == "RELOAD"):
                self.reload_conf()
                data = 'OK'
            elif (cmd == "WORKERS_STATUS"):
                # ignore the message if no channel was received
                if (channel != '-'):
                    # did the caller request a single server only?  We ignore
                    # anything but the first parameter passed
                    if (len(notif) > 0 and notif[0].isdigit()):
                        w_id = int(notif[0])
                        data = json.dumps(self.list_workers(w_id, False))
                    else:
                        data = json.dumps(self.list_workers(None, False))
            # everything else is unhandled
            else:
                status = 'UNKNOWN'
                data = ''

            # if there was a response channel, reply back
            if (channel != '-'):
                payload = ("%(cmd)s %(status)s %(data)s" %
                           {'cmd': cmd, 'status': status, 'data': data})

                # with default configuration, postgres only accepts up to 8k
                # bytes payload.  If the payload is longer, just warn the
                # caller that it didn't fit.
                # XXX we could implement a multi-part answer, but if we ever
                # reach that point, we should consider moving to a table
                if (len(payload.encode('utf-8')) >= 8000):
                    payload = ("%(cmd)s %(status)s %(data)s" %
                               {'cmd': cmd, 'status': "KO",
                                'data': "ANSWER TOO LONG"})

                cur.execute("""NOTIFY "%(channel)s", '%(payload)s'""" %
                            {'channel': channel,
                             'payload': payload})

        cur.close()

    def main(self):
        """Start the active loop.
Connect or reconnect to the repository and starts threads to manage the monitored servers """ raw_options = parse_options() self.logger.info("Starting powa-collector...") if (not self.connect(raw_options)): exit(1) try: self.config = add_servers_config(self.__repo_conn, raw_options) except psycopg2.Error as e: self.__repo_conn.close() self.__repo_conn = None self.logger.error("Error retrieving the list of remote servers:" "\n%s", e) exit(1) for k, conf in self.config["servers"].items(): self.register_worker(k, self.config["repository"], conf) self.list_workers() try: while (not self.stopping): if (self.__repo_conn is not None): try: cur = self.__repo_conn.cursor() cur.execute("SELECT 1") cur.close() except Exception: self.__repo_conn = None self.logger.warning("Connection was dropped!") if (not self.__repo_conn): self.connect(raw_options) select.select([self.__repo_conn], [], [], 10) self.process_notification() except KeyboardInterrupt: self.logger.debug("KeyboardInterrupt caught") self.logger.info("Stopping all workers and exiting...") self.stop_all_workers() def register_worker(self, name, repository, config): """Add a worker thread to a server""" self.workers[name] = PowaThread(name, repository, config) self.workers[name].start() def stop_all_workers(self): """Ask all worker threads to stop This is asynchronous, no guarantee """ for k, worker in self.workers.items(): worker.ask_to_stop() def sighandler(self, signum, frame): """Manage signal handlers: reload conf on SIGHUB, shutdown on SIGTERM""" if (signum == signal.SIGHUP): self.logger.debug("SIGHUP caught") self.reload_conf() elif (signum == signal.SIGTERM): self.logger.debug("SIGTERM caught") self.stop_all_workers() self.stopping = True else: self.logger.error("Unhandled signal %d" % signum) def list_workers(self, wanted_srvid=None, tostdout=True): """List all workers and their status""" res = {} if (tostdout): self.logger.info('List of workers:') if (tostdout and len(self.workers.items()) == 0): 
self.logger.info('No worker') for k, worker in self.workers.items(): # self.logger.info(" %s%s" % (k, "" if (worker.is_alive()) else # " (stopped)")) worker_srvid = self.config["servers"][k]["srvid"] # ignore this entry if caller want information for only one server if (wanted_srvid is not None and wanted_srvid != worker_srvid): continue status = "Unknown" if (worker.is_stopping()): status = "stopping" elif (worker.is_alive()): status = worker.get_status() else: status = "stopped" if (tostdout): self.logger.info("%r (%s)" % (worker, status)) else: res[worker_srvid] = status return res def reload_conf(self): """Reload configuration: - reparse the configuration - stop and start workers if necessary - for those whose configuration has changed, ask them to reload - update dep versions: recompute powa version's and its dependencies """ self.list_workers() self.logger.info('Reloading...') config_new = get_full_config(self.__repo_conn) # check for removed servers for k, worker in self.workers.items(): if (worker.is_alive()): continue if (worker.is_stopping()): self.logger.warning("The worker %s is stoping" % k) if (k not in config_new["servers"]): self.logger.info("%s has been removed, stopping it..." % k) worker.ask_to_stop() # check for added servers for k in config_new["servers"]: if (k not in self.workers or not self.workers[k].is_alive()): self.logger.info("%s has been added, registering it..." 
% k) self.register_worker(k, config_new["repository"], config_new["servers"][k]) # check for updated configuration for k in config_new["servers"]: cur = config_new["servers"][k] if (not conf_are_equal(cur, self.workers[k].get_config())): self.workers[k].ask_reload(cur) # also try to reconnect if the worker experienced connection issue elif(self.workers[k].get_status() != "running"): self.workers[k].ask_reload(cur) # update stored versions for k in config_new["servers"]: self.workers[k].ask_update_dep_versions() self.config = config_new self.logger.info('Reload done') def conf_are_equal(conf1, conf2): """Compare two configurations, returns True if equal""" for k in conf1.keys(): if (k not in conf2): return False if (conf1[k] != conf2[k]): return False for k in conf2.keys(): if (k not in conf1): return False if (conf1[k] != conf2[k]): return False return True powa-collector-1.2.0/powa_collector/options.py000066400000000000000000000060151420163632500215310ustar00rootroot00000000000000""" Simple configuration file handling, as a JSON. """ import json import os import sys SAMPLE_CONFIG_FILE = """ { "repository": { "dsn": "postgresql://powa_user@localhost:5432/powa", }, "debug": false } """ CONF_LOCATIONS = [ '/etc/powa-collector.conf', os.path.expanduser('~/.config/powa-collector.conf'), os.path.expanduser('~/.powa-collector.conf'), './powa-collector.conf' ] def get_full_config(conn): """Return the full configuration, consisting of the information from the local configuration file and the remote servers stored on the repository database. """ return add_servers_config(conn, parse_options()) def add_servers_config(conn, config): """Add the activated remote servers stored on the repository database to a given configuration JSON. 
    """
    if ("servers" not in config):
        config["servers"] = {}

    cur = conn.cursor()
    cur.execute("""
        SELECT id, hostname, port, username, password, dbname, frequency
        FROM public.powa_servers s
        WHERE s.id > 0
        AND s.frequency > 0
        ORDER BY id
    """)

    for row in cur:
        parms = {}
        parms["host"] = row[1]
        parms["port"] = row[2]
        parms["user"] = row[3]
        if (row[4] is not None):
            parms["password"] = row[4]
        parms["dbname"] = row[5]

        key = row[1] + ':' + str(row[2])

        config["servers"][key] = {}
        config["servers"][key]["dsn"] = parms
        config["servers"][key]["frequency"] = row[6]
        config["servers"][key]["srvid"] = row[0]

    conn.commit()

    return config


def parse_options():
    """Look for the configuration file in all supported locations, parse it
    and return the resulting JSON, also adding the implicit values if needed.
    """
    options = None
    for possible_config in CONF_LOCATIONS:
        options = parse_file(possible_config)
        if (options is not None):
            break

    if (options is None):
        print("Could not find the configuration file in any of the expected"
              + " locations:")
        for possible_config in CONF_LOCATIONS:
            print("\t- %s" % possible_config)
        sys.exit(1)

    if ('repository' not in options or 'dsn' not in options["repository"]):
        print("The configuration file is invalid, it should contain"
              + " a repository.dsn entry")
        print("Place and adapt the following content in one of those "
              "locations:")
        print("\n\t".join([""] + CONF_LOCATIONS))
        print(SAMPLE_CONFIG_FILE)
        sys.exit(1)

    if ('debug' not in options):
        options["debug"] = False

    return options


def parse_file(filepath):
    """Read a configuration file and return the JSON"""
    try:
        return json.load(open(filepath))
    except IOError:
        return None
    except ValueError as e:
        # json.load raises a ValueError (JSONDecodeError) on invalid JSON
        print("Error parsing config file %s:" % filepath)
        print("\t%s" % e)
        sys.exit(1)

powa-collector-1.2.0/powa_collector/powa_worker.py
"""
PowaThread: powa-collector dedicated remote server thread.
One such thread is started per remote server by the main thread.  Each
thread will use 2 connections:

    - a persistent dedicated connection to the remote server, where it'll
      get the source data
    - a connection to the repository server, to write the source data and
      perform the snapshot.  This connection is created and dropped at each
      checkpoint
"""
from decimal import Decimal
import threading
import time
import psycopg2
from psycopg2.extras import DictCursor
import logging
from os import SEEK_SET
import random
import sys
from powa_collector.snapshot import (get_snapshot_functions, get_src_query,
                                     get_tmp_name)
if (sys.version_info < (3, 0)):
    from StringIO import StringIO
else:
    from io import StringIO


class PowaThread (threading.Thread):
    """A Powa collector thread.  Derives from threading.Thread

    Manages a monitored remote server.
    """
    def __init__(self, name, repository, config):
        """Instance creator.  Starts threading and logger"""
        threading.Thread.__init__(self)

        # we use this event to sleep on the worker main loop.  It'll be set
        # by the main thread through one of the public functions, when a
        # SIGHUP was received to notify us to reload our config, or if we
        # should terminate.
Both public functions will first set the required event # before setting this one, to avoid missing an event in case of the # sleep ends at exactly the same time self.__stop_sleep = threading.Event() # this event is set when we should terminate the thread self.__stopping = threading.Event() # this event is set when we should reload the configuration self.__got_sighup = threading.Event() self.__connected = threading.Event() self.name = name self.__repository = repository self.__config = config self.__pending_config = None self.__update_dep_versions = False self.__remote_conn = None self.__repo_conn = None self.__last_repo_conn_errored = False self.logger = logging.getLogger("powa-collector") self.last_time = None extra = {'threadname': self.name} self.logger = logging.LoggerAdapter(self.logger, extra) self.logger.debug("Creating worker %s: %r" % (name, config)) def __repr__(self): return ("%s: %s" % (self.name, self.__config["dsn"])) def __get_powa_version(self, conn): """Get powa's extension version""" cur = conn.cursor() cur.execute("""SELECT regexp_split_to_array(extversion, '\\.'), extversion FROM pg_catalog.pg_extension WHERE extname = 'powa'""") res = cur.fetchone() cur.close() return res def __maybe_load_powa(self, conn): """Loads Powa if it's not already and it's needed. Only supports 4.0+ extension, and this version can be loaded on the fly """ ver = self.__get_powa_version(conn) if (not ver): self.logger.error("PoWA extension not found") self.__disconnect_all() self.__stopping.set() return elif (int(ver[0][0]) < 4): self.logger.error("Incompatible PoWA version, found %s," " requires at least 4.0.0" % ver[1]) self.__disconnect_all() self.__stopping.set() return # make sure the GUC are present in case powa isn't in # shared_preload_librairies. This is only required for powa # 4.0.x. 
if (int(ver[0][0]) == 4 and int(ver[0][1]) == 0): try: cur = conn.cursor() cur.execute("LOAD 'powa'") cur.close() conn.commit() except psycopg2.Error as e: self.logger.error("Could not load extension powa:\n%s" % e) self.__disconnect_all() self.__stopping.set() def __save_versions(self): """Save the versions we collect on the remote server in the repository""" srvid = self.__config["srvid"] if (self.__repo_conn is None): self.__connect() ver = self.__get_powa_version(self.__repo_conn) # Check and update PG and dependencies versions, for powa 4.1+ if (not ver or (int(ver[0][0]) == 4 and int(ver[0][1]) == 0)): return self.logger.debug("Checking postgres and dependencies versions") if (self.__remote_conn is None or self.__repo_conn is None): self.logger.error("Could not check PoWA") return cur = self.__remote_conn.cursor() repo_cur = self.__repo_conn.cursor() cur.execute(""" SELECT setting FROM pg_settings WHERE name = 'server_version' --WHERE name = 'server_version_num' """) server_num = cur.fetchone() repo_cur.execute(""" SELECT version FROM powa_servers WHERE id = %(srvid)s """, {'srvid': srvid}) repo_num = cur.fetchone() if (repo_num is None or repo_num[0] != server_num[0]): try: repo_cur.execute(""" UPDATE powa_servers SET version = %(version)s WHERE id = %(srvid)s """, {'srvid': srvid, 'version': server_num[0]}) self.__repo_conn.commit() except Exception as e: self.logger.warning("Could not save server version" + ": %s" % (e)) self.__repo_conn.rollback() hypo_ver = None repo_cur.execute(""" SELECT extname, version FROM powa_extensions WHERE srvid = %(srvid)s """ % {'srvid': srvid}) exts = repo_cur.fetchall() for ext in exts: if (ext[0] == 'hypopg'): hypo_ver = ext[1] cur.execute(""" SELECT extversion FROM pg_extension WHERE extname = %(extname)s """, {'extname': ext[0]}) remote_ver = cur.fetchone() if (not remote_ver): self.logger.debug("No version found for extension " + "%s on server %d" % (ext[0], srvid)) continue if (ext[1] is None or ext[1] != 
remote_ver[0]): try: repo_cur.execute(""" UPDATE powa_extensions SET version = %(version)s WHERE srvid = %(srvid)s AND extname = %(extname)s """, {'version': remote_ver, 'srvid': srvid, 'extname': ext[0]}) self.__repo_conn.commit() except Exception as e: self.logger.warning("Could not save version for extension " + "%s: %s" % (ext[0], e)) self.__repo_conn.rollback() # Special handling of hypopg, which isn't required to be installed in # the powa dedicated database. cur.execute(""" SELECT default_version FROM pg_available_extensions WHERE name = 'hypopg' """) remote_ver = cur.fetchone() if (remote_ver is None): try: repo_cur.execute(""" DELETE FROM powa_extensions WHERE srvid = %(srvid)s AND extname = 'hypopg' """, {'srvid': srvid, 'hypo_ver': remote_ver}) self.__repo_conn.commit() except Exception as e: self.logger.warning("Could not save version for extension " + "hypopg: %s" % (e)) self.__repo_conn.rollback() elif (remote_ver != hypo_ver): try: if (hypo_ver is None): repo_cur.execute(""" INSERT INTO powa_extensions (srvid, extname, version) VALUES (%(srvid)s, 'hypopg', %(hypo_ver)s) """, {'srvid': srvid, 'hypo_ver': remote_ver}) else: repo_cur.execute(""" UPDATE powa_extensions SET version = %(hypo_ver)s WHERE srvid = %(srvid)s AND extname = 'hypopg' """, {'srvid': srvid, 'hypo_ver': remote_ver}) self.__repo_conn.commit() except Exception as e: self.logger.warning("Could not save version for extension " + "hypopg: %s" % (e)) self.__repo_conn.rollback() self.__disconnect_repo() def __check_powa(self): """Check that Powa is ready on the remote server.""" if (self.__remote_conn is None): self.__connect() if (self.is_stopping()): return # make sure the GUC are present in case powa isn't in # shared_preload_librairies. This is only required for powa # 4.0.x. 
if (self.__remote_conn is not None): self.__maybe_load_powa(self.__remote_conn) if (self.is_stopping()): return # Check and update PG and dependencies versions if possible self.__save_versions() def __reload(self): """Reload configuration Disconnect from everything, read new configuration, reconnect, update dependencies, check Powa is still available The new session could be totally different """ self.logger.info("Reloading configuration") if (self.__pending_config is not None): self.__config = self.__pending_config self.__pending_config = None self.__disconnect_all() self.__connect() if (self.__update_dep_versions): self.__update_dep_versions = False self.__check_powa() self.__got_sighup.clear() def __report_error(self, msg, replace=True): """Store errors in the repository database. replace means we overwrite current stored errors in the database for this server. Else we append""" if (self.__repo_conn is not None): if (type(msg).__name__ == 'list'): error = msg else: error = [msg] srvid = self.__config["srvid"] cur = self.__repo_conn.cursor() cur.execute("SAVEPOINT metas") try: if (replace): cur.execute("""UPDATE public.powa_snapshot_metas SET errors = %s WHERE srvid = %s """, (error, srvid)) else: cur.execute("""UPDATE public.powa_snapshot_metas SET errors = pg_catalog.array_cat(errors, %s) WHERE srvid = %s """, (error, srvid)) cur.execute("RELEASE metas") except psycopg2.Error as e: err = "Could not report error for server %d:\n%s" % (srvid, e) self.logger.warning(err) cur.execute("ROLLBACK TO metas") self.__repo_conn.commit() def __connect(self): """Connect to a remote server Override lock_timeout, application name""" if ('dsn' not in self.__repository or 'dsn' not in self.__config): self.logger.error("Missing connection info") self.__stopping.set() return try: if (self.__repo_conn is None): self.logger.debug("Connecting on repository...") self.__repo_conn = psycopg2.connect(self.__repository['dsn']) self.logger.debug("Connected.") # make sure the GUC are 
present in case powa isn't in
                # shared_preload_libraries.  This is only required for powa
                # 4.0.x.
                self.__maybe_load_powa(self.__repo_conn)

                # Return now if __maybe_load_powa asked to stop
                if (self.is_stopping()):
                    return

                cur = self.__repo_conn.cursor()
                cur.execute("""SELECT pg_catalog.set_config(name, '2000', false)
                    FROM pg_catalog.pg_settings
                    WHERE name = 'lock_timeout'
                    AND setting = '0'""")
                cur.execute("SET application_name = %s",
                            ('PoWA collector - repo_conn for worker '
                             + self.name,))
                cur.close()
                self.__repo_conn.commit()
                self.__last_repo_conn_errored = False

            if (self.__remote_conn is None):
                self.logger.debug("Connecting on remote database...")
                self.__remote_conn = psycopg2.connect(**self.__config['dsn'])
                self.logger.debug("Connected.")

                # make sure the GUC are present in case powa isn't in
                # shared_preload_libraries.  This is only required for powa
                # 4.0.x.
                if (self.__remote_conn is not None):
                    self.__maybe_load_powa(self.__remote_conn)

                # Return now if __maybe_load_powa asked to stop
                if (self.is_stopping()):
                    return

                cur = self.__remote_conn.cursor()
                cur.execute("""SELECT pg_catalog.set_config(name, '2000', false)
                    FROM pg_catalog.pg_settings
                    WHERE name = 'lock_timeout'
                    AND setting = '0'""")
                cur.execute("SET application_name = %s",
                            ('PoWA collector - worker ' + self.name,))
                cur.close()
                self.__remote_conn.commit()

            self.__connected.set()
        except psycopg2.Error as e:
            self.logger.error("Error connecting on %s:\n%s" %
                              (self.__config["dsn"], e))

            if (self.__repo_conn is not None):
                self.__report_error("%s" % (e))
            else:
                self.__last_repo_conn_errored = True

    def __disconnect_all(self):
        """Disconnect from remote server and repository server"""
        if (self.__remote_conn is not None):
            self.logger.info("Disconnecting from remote server")
            self.__remote_conn.close()
            self.__remote_conn = None
        if (self.__repo_conn is not None):
            self.logger.info("Disconnecting from repository")
            self.__disconnect_repo()
        self.__connected.clear()

    def __disconnect_repo(self):
        """Disconnect from repo"""
        if (self.__repo_conn is not None):
            self.__repo_conn.close()
            self.__repo_conn = None

    def __disconnect_all_and_exit(self):
        """Disconnect all and stop the thread"""
        # this is the exit point
        self.__disconnect_all()
        self.logger.info("stopped")
        self.__stopping.clear()

    def __worker_main(self):
        """The thread's main loop.

        Get the latest snapshot timestamp for the remote server and determine
        how long to sleep before performing the next snapshot.  Add a random
        seed to avoid doing all remote servers simultaneously."""
        self.last_time = None

        self.__check_powa()

        # if this worker has been restarted, restore the previous snapshot
        # time to try to keep up on the same frequency
        if (not self.is_stopping() and self.__repo_conn is not None):
            cur = None
            try:
                cur = self.__repo_conn.cursor()
                cur.execute("""SELECT EXTRACT(EPOCH FROM snapts)
                    FROM public.powa_snapshot_metas
                    WHERE srvid = %d
                    """ % self.__config["srvid"])
                row = cur.fetchone()

                if not row:
                    self.logger.error("Server %d was not correctly registered"
                                      " (no powa_snapshot_metas record)"
                                      % self.__config["srvid"])
                    self.logger.debug("Server configuration details:\n%r"
                                      % self.__config)
                    self.logger.error("Stopping worker for server %d"
                                      % self.__config["srvid"])
                    self.__stopping.set()

                if row:
                    self.last_time = row[0]
                    self.logger.debug("Retrieved last snapshot time: %r"
                                      % self.last_time)
                cur.close()
                self.__repo_conn.commit()
            except Exception as e:
                self.logger.warning("Could not retrieve last snapshot"
                                    " time: %s" % (e))
                if (cur is not None):
                    cur.close()
                self.__repo_conn.rollback()

            # Normalize unknown last snapshot time
            if (self.last_time == Decimal('-Infinity')):
                self.last_time = None

        # if this worker was stopped longer than the configured frequency,
        # assign last snapshot time to a random time between now and now minus
        # duration.  This will help to spread the snapshots and avoid activity
        # spikes if the collector itself was stopped for a long time, or if a
        # lot of new servers were added
        if (not self.is_stopping()
                and self.last_time is not None
                and ((time.time() - self.last_time)
                     > self.__config["frequency"])):
            random.seed()
            r = random.randint(0, self.__config["frequency"] - 1)
            self.logger.debug("Spreading snapshot: setting last snapshot to"
                              " %d seconds ago (frequency: %d)" %
                              (r, self.__config["frequency"]))
            self.last_time = time.time() - r

        while (not self.is_stopping()):
            start_time = time.time()
            if (self.__got_sighup.isSet()):
                self.__reload()

            if ((self.last_time is None) or
                    (start_time - self.last_time) >= self.__config["frequency"]):
                try:
                    self.__take_snapshot()
                except psycopg2.Error as e:
                    self.logger.error("Error during snapshot: %s" % e)
                    # It will reconnect automatically at next snapshot
                    self.__disconnect_all()
                self.last_time = time.time()
            time_to_sleep = self.__config["frequency"] - (self.last_time -
                                                          start_time)

            # sleep until the scheduled processing time, or if the main thread
            # asked us to perform an action or if we were asked to stop.
            if (time_to_sleep > 0 and not self.is_stopping()):
                self.__stop_sleep.wait(time_to_sleep)

            # clear the event if it has been set.  We'll process all possible
            # events triggered by it within the next iteration
            if (self.__stop_sleep.isSet()):
                self.__stop_sleep.clear()

        # main loop is over, disconnect and quit
        self.__disconnect_all_and_exit()

    def __take_snapshot(self):
        """Main part of the worker thread.

        This function will call all the query_src functions enabled for the
        target server, insert all the retrieved rows on the repository server
        in unlogged tables, and finally call powa_take_snapshot() on the
        repository server to finish the distant snapshot.  All is done in one
        transaction, so that there won't be concurrency issues if a snapshot
        takes longer than the specified interval.  This also ensures that all
        rows will see the same snapshot timestamp.
        """
        srvid = self.__config["srvid"]

        if (self.is_stopping()):
            return

        self.__connect()

        if (self.__remote_conn is None):
            self.logger.error("No connection to remote server,"
                              " snapshot skipped")
            return

        if (self.__repo_conn is None):
            self.logger.error("No connection to repository server,"
                              " snapshot skipped")
            return

        # get the list of snapshot functions, and their associated query_src
        cur = self.__repo_conn.cursor(cursor_factory=DictCursor)
        cur.execute("SAVEPOINT snapshots")

        try:
            cur.execute(get_snapshot_functions(), (srvid,))
            snapfuncs = cur.fetchall()
            cur.execute("RELEASE snapshots")
        except psycopg2.Error as e:
            cur.execute("ROLLBACK TO snapshots")
            err = "Error while getting snapshot functions:\n%s" % (e)
            self.logger.error(err)
            self.logger.error("Exiting worker for server %s..." % srvid)
            self.__stopping.set()
            return

        cur.close()

        if (not snapfuncs):
            self.logger.info("No datasource configured for server %d" % srvid)
            self.logger.debug("Committing transaction")
            self.__repo_conn.commit()
            self.__disconnect_repo()
            return

        ins = self.__repo_conn.cursor()
        data_src = self.__remote_conn.cursor()
        errors = []

        for snapfunc in snapfuncs:
            if (self.is_stopping()):
                return

            module_name = snapfunc["module"]
            query_source = snapfunc["query_source"]
            cleanup_sql = snapfunc["query_cleanup"]
            function_name = snapfunc["function_name"]

            self.logger.debug("Working on module %s", module_name)

            # get the SQL needed to insert the query_src data on the remote
            # server into the transient unlogged table on the repository
            # server
            if (query_source is None):
                self.logger.warning("No query_source for %s" % function_name)
                continue

            # execute the query_src functions to get local data (srvid 0)
            self.logger.debug("Calling public.%s(0)..." % query_source)
            data_src_sql = get_src_query(query_source, srvid)

            # use a savepoint, maybe the datasource is not setup on the remote
            # server
            data_src.execute("SAVEPOINT src")

            # XXX should we use os.pipe() or a temp file instead, to avoid too
            # much memory consumption?
            buf = StringIO()
            try:
                data_src.copy_expert("COPY (%s) TO stdout" % data_src_sql, buf)
            except psycopg2.Error as e:
                err = "Error while calling public.%s:\n%s" % (query_source, e)
                errors.append(err)
                data_src.execute("ROLLBACK TO src")

            # execute the cleanup query if provided
            if (cleanup_sql is not None):
                data_src.execute("SAVEPOINT src")
                try:
                    self.logger.debug("Calling %s..." % cleanup_sql)
                    data_src.execute(cleanup_sql)
                except psycopg2.Error as e:
                    err = "Error while calling %s:\n%s" % (cleanup_sql, e)
                    errors.append(err)
                    data_src.execute("ROLLBACK TO src")

            if (self.is_stopping()):
                return

            # insert the data to the transient unlogged table
            ins.execute("SAVEPOINT data")
            buf.seek(0, SEEK_SET)
            try:
                ins.copy_expert("COPY %s FROM stdin" %
                                get_tmp_name(query_source), buf)
            except psycopg2.Error as e:
                err = "Error while inserting data:\n%s" % e
                self.logger.warning(err)
                errors.append(err)
                self.logger.warning("Giving up for %s", function_name)
                ins.execute("ROLLBACK TO data")

            buf.close()

        data_src.close()

        if (self.is_stopping()):
            if (len(errors) > 0):
                self.__report_error(errors)
            return

        # call powa_take_snapshot() for the given server
        self.logger.debug("Calling powa_take_snapshot(%d)..." % (srvid))
        sql = ("SELECT public.powa_take_snapshot(%(srvid)d)"
               % {'srvid': srvid})
        try:
            ins.execute("SAVEPOINT powa_take_snapshot")
            ins.execute(sql)
            val = ins.fetchone()[0]
            if (val != 0):
                self.logger.warning("Number of errors during snapshot: %d",
                                    val)
                self.logger.warning(" Check the logs on the repository server")
            ins.execute("RELEASE powa_take_snapshot")
        except psycopg2.Error as e:
            err = ("Error while taking snapshot for server %d:\n%s"
                   % (srvid, e))
            self.logger.warning(err)
            errors.append(err)
            ins.execute("ROLLBACK TO powa_take_snapshot")

        ins.execute("SET application_name = %s",
                    ('PoWA collector - repo_conn for worker ' + self.name,))
        ins.close()

        # we need to report and append errors after calling
        # powa_take_snapshot, since this function will first reset errors
        if (len(errors) > 0):
            self.__report_error(errors, False)

        # and finally commit the transaction
        self.logger.debug("Committing transaction")
        self.__repo_conn.commit()
        self.__remote_conn.commit()

        self.__disconnect_repo()

    def is_stopping(self):
        """Is the thread currently stopping"""
        return self.__stopping.isSet()

    def get_config(self):
        """Returns the thread's config"""
        return self.__config

    def ask_to_stop(self):
        """Ask the thread to stop"""
        self.__stopping.set()
        self.logger.info("Asked to stop...")
        self.__stop_sleep.set()

    def run(self):
        """Start the main loop of the thread"""
        if (not self.is_stopping()):
            self.logger.info("Starting worker")
            self.__worker_main()

    def ask_reload(self, new_config):
        """Ask the thread to reload"""
        self.logger.debug("Reload asked")
        self.__pending_config = new_config
        self.__got_sighup.set()
        self.__stop_sleep.set()

    def ask_update_dep_versions(self):
        """Ask the thread to recompute its dependencies"""
        self.logger.debug("Version dependencies reload asked")
        self.__update_dep_versions = True
        self.__got_sighup.set()
        self.__stop_sleep.set()

    def get_status(self):
        """Get the status: ok, not connected to repo, or not connected to
        remote"""
        if (self.__repo_conn is None and
                self.__last_repo_conn_errored):
            return "no connection to repository server"
        if (self.__remote_conn is None):
            return "no connection to remote server"
        else:
            return "running"
powa-collector-1.2.0/powa_collector/snapshot.py
def get_snapshot_functions():
    """Get the list of enabled functions for snapshotting"""
    # XXX should we ignore entries without query_src?
    return """SELECT module, query_source, query_cleanup, function_name
              FROM public.powa_functions
              WHERE operation = 'snapshot'
              AND enabled
              AND srvid = %s
              ORDER BY priority"""


def get_src_query(src_fct, srvid):
    """Get the SQL query we'll use to get results from a snapshot function"""
    return ("SELECT %(srvid)d, * FROM public.%(fname)s(0)" %
            {'fname': src_fct, 'srvid': srvid})


def get_tmp_name(src_fct):
    """Get the temp table name we'll use to spool changes"""
    return "%s_tmp" % src_fct
powa-collector-1.2.0/requirements.txt
psycopg2
powa-collector-1.2.0/setup.py
import setuptools

__VERSION__ = None

with open("powa_collector/__init__.py", "r") as fh:
    for line in fh:
        if line.startswith('__VERSION__'):
            __VERSION__ = line.split('=')[1].replace("'", '').strip()
            break

requires = ['psycopg2']

setuptools.setup(
    name="powa-collector",
    version=__VERSION__,
    author="powa-team",
    license='Postgresql',
    author_email="rjuju123@gmail.com",
    description="PoWA collector, a collector for performing remote snapshots"
                " with PoWA",
    long_description="See https://powa.readthedocs.io/",
    long_description_content_type="text/markdown",
    url="https://powa.readthedocs.io/",
    packages=setuptools.find_packages(),
    install_requires=requires,
    scripts=["powa-collector.py"],
    classifiers=[
        "Development Status :: 5 - Production/Stable",
        "Programming Language :: Python :: 2",
        "Programming Language :: Python :: 3",
        "License :: OSI
Approved :: PostgreSQL License",
        "Operating System :: OS Independent",
        "Intended Audience :: System Administrators",
        "Topic :: Database :: Front-Ends"
    ],
)