GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc. <http://fsf.org/>

Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

Preamble

The GNU General Public License is a free, copyleft license for software and other kinds of works.

The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too.

When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things.

To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others.

For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights.
Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it.

For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions.

Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users.

Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free.

The precise terms and conditions for copying, distribution and modification follow.

TERMS AND CONDITIONS

0. Definitions.

"This License" refers to version 3 of the GNU General Public License.

"Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks.

"The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations.

To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy. The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work.

A "covered work" means either the unmodified Program or a work based on the Program.

To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well.

To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying.

An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion.

1. Source Code.

The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work.

A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language.

The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it.

The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work.

The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source.

The Corresponding Source for a work in source code form is that same work.

2. Basic Permissions.
All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program. The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law.

You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you.

Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary.

3. Protecting Users' Legal Rights From Anti-Circumvention Law.

No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures.

When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures.

4. Conveying Verbatim Copies.

You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program.

You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee.

5. Conveying Modified Source Versions.

You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions:

  a) The work must carry prominent notices stating that you modified it, and giving a relevant date.

  b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices".

  c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it.

  d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so.

A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate.

6. Conveying Non-Source Forms.

You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways:

  a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange.
  b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge.

  c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b.

  d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements.

  e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d.
A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work.

A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product. A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product.

"Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made.

If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM).

The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network.

Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying.

7. Additional Terms.

"Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions.

When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms:

  a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or

  b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or

  c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or

  d) Limiting the use for publicity purposes of names of licensors or authors of the material; or

  e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or

  f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors.

All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms.

Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way.

8. Termination.

You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11).

However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation.

Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice.

Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10.

9. Acceptance Not Required for Having Copies.

You are not required to accept this License in order to receive or run a copy of the Program. Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so.

10. Automatic Licensing of Downstream Recipients.

Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License.

An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts.

You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it.

11. Patents.

A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based. The work thus licensed is called the contributor's "contributor version".

A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License.

Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version.

In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party.

If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients.
"Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid.

If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it.

A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007.

Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law.

12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program.

13. Use with the GNU Affero General Public License.

Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such.

14. Revised Versions of this License.

The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns.

Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation.
If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program.

Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version.

15. Disclaimer of Warranty.

THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION.

16. Limitation of Liability.

IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.

17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee.

END OF TERMS AND CONDITIONS

How to Apply These Terms to Your New Programs

If you develop a new program, and you want it to be of the greatest possible use to the public, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms.

To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively state the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found.

    <one line to give the program's name and a brief idea of what it does.>
    Copyright (C) <year>  <name of author>

    This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.

    This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.

    You should have received a copy of the GNU General Public License along with this program. If not, see <http://www.gnu.org/licenses/>.

Also add information on how to contact you by electronic and paper mail.

If the program does terminal interaction, make it output a short notice like this when it starts in an interactive mode:

    <program>  Copyright (C) <year>  <name of author>
    This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
    This is free software, and you are welcome to redistribute it under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate parts of the General Public License. Of course, your program's commands might be different; for a GUI interface, you would use an "about box". You should also get your employer (if you work as a programmer) or school, if any, to sign a "copyright disclaimer" for the program, if necessary. For more information on this, and how to apply and follow the GNU GPL, see <http://www.gnu.org/licenses/>. The GNU General Public License does not permit incorporating your program into proprietary programs. If your program is a subroutine library, you may consider it more useful to permit linking proprietary applications with the library. If this is what you want to do, use the GNU Lesser General Public License instead of this License. But first, please read <http://www.gnu.org/philosophy/why-not-lgpl.html>. Obnam NEWS ========== This file summarizes changes between releases of Obnam. Version 1.6.1, released 2013-11-30 ---------------------------------- * Fix Debian package dependencies correctly. Version 1.6, released 2013-11-30 -------------------------------- * Stop logging paramiko exceptions that get converted into another type of exception by the SFTP plugin in Obnam. * `obnam-benchmark` can now use an installed version of larch. Patch by Lars Kruse. * Obnam has been ported to FreeBSD by Itamar Turner-Trauring of HybridCluster. * Backup progress reporting now reports scanned file data, not just backed up file data. This will hopefully be less confusing to people. * The `list-keys`, `client-keys`, and `list-toplevels` commands now obey a new option, `--key-details`, to show the usernames attached to each public key. Patch by Lars Kruse. * New option `--ssh-command` to set the command Obnam runs when invoking ssh. Patch by Lars Kruse. * `obnam clients` can now be used without being an existing client. Patch by Itamar Turner-Trauring.
* New option `--ssh-host-keys-check` to better specify how SSH host keys should be checked. Patch by Itamar Turner-Trauring. Bug fixes: * Fix `obnam list-toplevels` so it doesn't give an error when it's unable to read the per-client directory of another client, when encryption is used. Fix by Lars Kruse. * Fix the encryption plugin to give a better error message when it looks for client directories but fails to find them. Fix by Lars Kruse. * `obnam list-toplevels` got confused when the repository contained extra files, such as "lock" (left there by a previous, crashed Obnam run). It no longer does. Fix by Lars Kruse. * The SFTP plugin now handles another error code (EACCES) when writing a file whose directory does not exist. Patch by Armin Größlinger. * Obnam's manual page now explains how to break long logical lines into multiple physical ones. * The `/~/` path prefix in SFTP URLs works again, at least with sufficiently new versions of Paramiko (1.7.7.1 in Debian wheezy is OK). Reported by Lars Kruse. * The Nagios plugin now reports errors in a way Nagios expects. Patch by Martijn Grendelman. * The Nagios plugin for Obnam now correctly handles the case where a backup repository for a client exists, but does not have a backup yet. Patch by Lars Kruse. * `obnam ls` now handles trailing slashes in filename arguments. Reported by Biltong. * When restoring a backup, Obnam will now continue past errors, instead of aborting with the first one. Patch by Itamar Turner-Trauring. Version 1.5, released 2013-08-08 -------------------------------- Bug fixes: * Terminal progress reporting is now updated only every 0.1 seconds, instead of 0.01 seconds, to reduce terminal emulator CPU usage. Reported by Neal Becker. * Empty exclude patterns are ignored. Previously, a configuration file line such as "exclude = foo, bar," (note trailing comma) would result in an empty pattern, which would match everything, and therefore nothing would be backed up.
Reported by Sharon Kimble. * A FUSE plugin to access (read-only) data from the backup repository has been added. Written by Valery Yundin. Version 1.4, released 2013-03-16 -------------------------------- * The `ls` command now takes filenames as (optional) arguments, instead of a list of generations. Based on patch by Damien Couroussé. * Even more detailed progress reporting during a backup. * Add `--fsck-skip-generations` option to tell fsck not to check any generation metadata. * The default log level is now INFO, instead of DEBUG. This is to be considered a quantum leap in the continuing rise of the maturity level of the software. (Actually, the change is there just to save some disk space and I/O for people who don't want to be involved in Obnam development and don't want to have massive log files.) * The default sizes for the `lru-size` and `upload-queue-size` settings have been reduced, to reduce the memory impact of Obnam. * `obnam restore` now reports transfer statistics at the end, similarly to what `obnam backup` does. Suggested by "S. B.". Bug fixes: * If listing extended attributes for a filesystem that does not support them, Obnam no longer crashes, just silently does not back up extended attributes. Which aren't there anyway. * A bug in handling stat lookup errors was fixed. Reported by Peter Palfrader. Symptom: `AttributeError: 'exceptions.OSError' object has no attribute 'st_ino'` in an error message or log file. * A bug in a restore crashing when failing to set extended attributes on the restored file was fixed. Reported by "S. B.". * Made it clearer what is happening when unlocking the repository due to errors, and fixed it so that a failure to unlock is also an error. Reported by andrewsh. * The dependency on Larch is now for 1.20121216 or newer, since that is needed for fsck to work. * The manual page did not document the client name arguments to the `add-key` and `remove-key` subcommands. Reported by Lars Kruse.
* Restoring symlinks as root would fail. Reported and fixed by David Fries. * Only set ssh user/port if explicitly requested, otherwise let ssh select them. Reported by Michael Goetze, fixed by David Fries. * Fix problem with old version of paramiko and chdir. Fixed by Nick Altmann. * Fix problems with signed vs unsigned values for struct stat fields. Reported by Henning Verbeek. Version 1.3, released 2012-12-16 -------------------------------- * When creating files in the backup repository, Obnam tries to avoid NFS synchronisation problems by first writing a temporary file and then creating a hardlink to the actual filename. This works badly on filesystems that do not allow hard links, such as VFAT. If creating the hardlink fails, Obnam now further tries to use the `open(2)` system call with the `O_EXCL` flag to create the target file. This should allow things to work with both NFS and VFAT. * More detailed progress reporting during the backup. * Manual page now covers the diff subcommand. Patch by Peter Valdemar Mørch. * Speed optimisation patch for backing up files in inode numbering order, from Christophe Vu-Brugier. * A setuid or setgid bit is no longer restored if Obnam is not used by root or the same user as the owner of the restored file. * Many new settings to control "obnam fsck", mainly to reduce the amount of checking being done in order to make it faster. However, fsck has lost some features (checks), which will be added back in a future release. * More frequent fsck progress reporting. Some speed optimisations to fsck. Bug fixes: * Empty values for extended attributes are now backed up correctly. Previously they would cause an infinite loop. * Extended attributes without values are now ignored. This is different from attributes with empty values. Reported by Vladimir Elisseev. * An empty port number in sftp URLs is now handled correctly. Found based on report by Anton Shevtsov.
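The NFS-versus-VFAT creation strategy described in the 1.3 notes above (write a temporary file, hardlink it to the final name, and fall back to `open(2)` with `O_EXCL`) can be sketched roughly like this. This is an illustrative sketch only, not Obnam's actual code; the function name is made up.

```python
import os
import tempfile

def create_exclusively(path, data):
    """Create `path` containing `data`, failing if it already exists.

    First try the NFS-safe dance: write a temporary file and hardlink
    it to the final name (link(2) is atomic even over NFS).  On
    filesystems without hardlinks (e.g. VFAT), fall back to open(2)
    with O_EXCL, which also guarantees exclusive creation.
    """
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)
    try:
        os.link(tmp, path)  # atomic; fails if path already exists
    except OSError:
        # No hardlink support (or the link failed for another reason);
        # O_EXCL still makes creation fail if the target exists.
        fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL)
        try:
            os.write(fd, data)
        finally:
            os.close(fd)
    finally:
        os.remove(tmp)
    return path
```

Either path through the function refuses to overwrite an existing file, which is the property the repository code needs.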
* A bad performance bug when backing up full systems (starting from the filesystem root directory) has been fixed. At the beginning of each generation, Obnam removes any directories that are not part of the current backup roots. This is necessary so that if you change the backup roots, the old stuff doesn't hang around forever. However, when the backup root is the filesystem root, due to the now-fixed bug Obnam would first remove everything, and then back it up all over again. This "worked", but was quite slow. Thanks to Nix for reporting the problem. * Obnam now runs GnuPG explicitly with the "no text mode" setting, to override a "text mode" setting in the user's configuration. The files Obnam encrypts need to be treated as binary, not text files. Reported by Robin Sheat. * A shared B-tree concurrency bug has been fixed: If another instance of Obnam was modifying a shared B-tree, Obnam would crash and abort a backup, possibly leaving lock files lying around. Now a failure to look up a chunk via its checksum is ignored, and the backup continues. * Bugs in how Python OSError exceptions were being raised have been fixed. Error messages should now be somewhat clearer. * An unset or wrongly set variable "full" was fixed in "obnam diff". Reported by ROGERIO DE CARVALHO BASTOS and patched by Peter Valdemar Mørch. * Setuid and setgid bits are now restored correctly, when restore happens as root. Reported by Pavel Kokolemin. * Obnam now complains if no backup roots have been specified. Version 1.2, released 2012-10-06 -------------------------------- * Added a note to `--node-size` that it only affects new B-trees. Thanks, Michael Brown. * New `obnam diff` subcommand to show differences (added/removed/modified files) between two generations, by Peter Valdemar Mørch. * `obnam backup` now logs the names of files that are getting backed up at the INFO level rather than DEBUG.
* The command synopses for backup, restore, and verify commands now make it clearer that Obnam only accepts directories, not individual files, as arguments. (For now.) * The output from the `show` plugin can now be redirected with the `--output=FILE` option. Affected subcommands: `clients`, `generations`, `genids`, `ls`, `diff`, `nagios-last-backup-age`. Bug fixes: * Notify user of errors during backups. * The SFTP plugin now manages to deal with repository paths starting with `/~/` which already exist without crashing. * Character and block device nodes are now restored correctly. Thanks to Martin Dummer for the bug report. * The symmetric key for a toplevel repository directory is re-encrypted when a public key is added to or removed from the toplevel using the `add-key` or `remove-key` subcommands. * Manual page typo fix. Thanks, Steve Kemp. Version 1.1, released 2012-06-30 -------------------------------- * Mark the `--small-files-in-btree` setting as deprecated. * Obnam now correctly checks that `--repository` is set. * Options in `--help` output are now grouped in sensible ways rather than being in one randomly ordered group. * Manual page clarification for `--root` and `verify`. Thanks, Saint Germain. * Remove outdated section from manual page explaining that there is no format conversion. Thanks, Elrond of Samba-TNG. * Added missing information about specifying a user in sftp URLs. Thanks, Joey Hess, for pointing it out. * Manual page clarification on `--keep` from Damien Couroussé. * Make `obnam forget --pretend` report which generations it would remove. Thanks, Neal Becker, for the suggestion. Version 1.0, released 2012-06-01 -------------------------------- * Fixed bug in finding duplicate files during a backup generation. Thanks to Saint Germain for reporting the problem. * Changed version number to 1.0.
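The `obnam diff` subcommand added in 1.2 above classifies files as added, removed, or modified between two generations. The core idea can be sketched as a toy model, comparing generations represented as plain path-to-metadata dicts; Obnam's real implementation walks B-trees, and the function name here is made up for illustration.

```python
def diff_generations(old, new):
    """Classify files as added, removed, or modified between two
    generations, each given as a dict mapping pathname -> metadata
    (any comparable value, e.g. an (mtime, size) tuple)."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    modified = sorted(p for p in set(old) & set(new) if old[p] != new[p])
    return added, removed, modified
```

For example, comparing `{"/a": (1, 2), "/b": (3, 4)}` with `{"/b": (3, 9), "/c": (5, 6)}` yields `(["/c"], ["/a"], ["/b"])`.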
Version 0.30, released 2012-05-30; a RELEASE CANDIDATE ------------------------------------------------------ Only bug fixes, and only in the test suite. * Fix test case problem when `$TMPDIR` lacks `user_xattr`. The extended attributes test won't succeed in that case, and it's pointless to run it. * Fix test case problem when `$TMPDIR` lacks nanosecond timestamps for files. The test case now ignores such timestamps, making the test pass anyway. The timestamp accuracy is not important for this test. Version 0.29, released 2012-05-27; a RELEASE CANDIDATE ------------------------------------------------------ * "obnam backup" now writes performance statistics at the end of a backup run. Search the log for "Backup performance statistics" (INFO level). * "obnam verify" now continues past the first error. Thanks to Rafał Gwiazda for requesting this. * Add an `obnam-viewprof` utility to translate Python profiling output into human-readable text form. * Bug fix: If a file's extended attributes have changed in any way, the change is now backed up. * "obnam fsck" is now a bit faster. * The shared directories in the repository are now locked only during updates, allowing more efficient concurrent backups between several computers. * Obnam now gives a better error message when a backup root is not a directory. Thanks to Edward Allcutt for reporting the error. * The output format of "obnam ls" has changed. It now has one line per file, and includes the full pathname of the file, rather like the output of "ls -lAR". Thanks to Edward Allcutt for the suggestion. * A few optimizations to sftp speed. Small files are still slow. Version 0.28, released 2012-05-10; a BETA release ------------------------------------------------- * `force-lock` should now remove all locks. * Out-of-space errors in the repository now terminate the backup process. Previously, Obnam would continue, ignoring the failure to write.
If you make space in the repository and restart Obnam, it will continue from the previous checkpoint. * The convert5to6 black box test now works even if run by people other than liw. * "obnam backup" now uses a single SFTP connection to the backup repository, rather than opening a new one after each checkpoint generation. Thanks to weinzwang for reporting the problem. * "obnam verify" now obeys the `--quiet` option. * "obnam backup" no longer counts chunks already in the repository in the uploaded amount of data. Version 0.27, released 2012-04-30; a BETA release ------------------------------------------------- * The repository format has again changed in an incompatible manner, so you will need to re-backup everything again. Alternatively, you can try the new `convert5to6` subcommand. See the manual page for details. Make sure you have a copy of the repository before converting; the code is new and may be buggy. * New option `--small-files-in-btree` enables Obnam to store the contents of small files in the per-client B-tree. This is not the default, at least yet, since its impact on real-life performance is unknown, but it should make things go a bit faster for high-latency repository connections. * Some SFTP-related speed optimizations. * Data filtering is now strictly stable and priority-ordered, ensuring that compression always happens before encryption etc. * Repository metadata is never filtered, so that we can be sure that in the future, when we add backwards compatibility, we can detect the format without worrying about any other filtering which might occur. * Forcing of locks is now unconditional and across the entire repository. * Uses the larch 0.30 read-only mode to fix a bug where opening a B-tree rolls back changes someone else is making, even if we only use the tree to read stuff from. * "obnam backup" will now exit with a non-zero exit code if there were any errors during a backup, and the problematic files were skipped.
Thanks, Peter Palfrader, for reporting the bug. * "obnam forget" is now a bit faster. * Hash collisions for filenames are now handled. Version 0.26, released 2012-03-26; a BETA release ------------------------------------------------- * Clients now lock the parts of the backup repository they're using, while making any changes, so that multiple clients can work at the same time without corrupting the repository. * Now depends on larch 0.28, which uses journalling to avoid on-disk inconsistencies and corruption during crashes. * Compression and encryption can now be used together. Version 0.25, released 2012-02-18; a BETA release ------------------------------------------------- * Log files are now created with permissions that allow only the owner to read or write them. This fixes a privacy leak. * The `nagios-last-backup-age` subcommand is useful for setting up Nagios (or similar systems) to check that backups get run properly. Thanks to Peter Palfrader for the patch. * Some clarification on how the forget policy works, prompted by questions from Peter Palfrader. * New settings `ssh-known-hosts` (for choosing which file to check for known host keys), `strict-ssh-host-keys` (for disallowing unknown host keys), and `ssh-key` (for choosing which key file to use for SSH connections) allow better and safer use of ssh. * Checkpoints will now happen even in the middle of files (but between chunks). * The `--pretend` option now works for backups as well. BUG FIXES: * `obnam ls` now shows the correct timestamps for generations. Thanks, Anders Wirzenius. Version 0.24.1, released 2011-12-24; a BETA release ------------------------------------------------- BUG FIXES: * Fix test case for file timestamps with sub-second resolution. Not all filesystems have that, so the test case has been changed to accept lack of sub-second timestamps.
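Both this test fix and the 0.24 timestamp change below come down to sub-second precision. A quick illustration of why a single float cannot carry nanosecond timestamps, and why the two-integer representation (seconds plus nanoseconds) is exact; the sample timestamp value is arbitrary.

```python
# A nanosecond timestamp from the 2011 era, as seconds and nanoseconds.
seconds, nanos = 1323456789, 123456789
exact_ns = seconds * 10**9 + nanos          # the two-integer representation

# Collapsing it into one float (as os.stat().st_mtime does) loses
# precision: a double has only about 15-16 significant digits, so at
# ~1.3e9 seconds its resolution is a few hundred nanoseconds.
as_float = seconds + nanos / 10**9
assert int(as_float * 10**9) != exact_ns    # nanoseconds are gone

# The integer pair round-trips exactly.
assert (exact_ns // 10**9, exact_ns % 10**9) == (seconds, nanos)
```

This is why storing two integers, as 0.24 does, removes the inaccuracies that floating-point timestamps imposed.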
Version 0.24, released 2011-12-18; a BETA release ------------------------------------------------- USER VISIBLE CHANGES: * The way file timestamps (modification and access times) are stored has changed, to fix inaccuracies introduced by the old way. Times are now stored as two integers giving full seconds and nanoseconds past the full second, instead of the weird earlier system that was imposed by Python's use of floating point for the timestamps. This causes the repository format version to be bumped, resulting in a need to start over with an empty repository. * Extended file attributes are now backed up from and restored to local filesystems. They are neither backed up nor restored for live data accessed over SFTP. * If the `--exclude` regular expression is wrong, Obnam now gives an error message and then ignores the regexp, rather than crashing. * There is now a compression plugin, enabled with `--compress-with=gzip`. * De-duplication mode can now be chosen by the user: the new `--deduplicate` setting can be one of `never` (fast, but uses more space); `verify` (slow, but handles hash collisions gracefully); and `fatalist` (fast, but lossy, if there is a hash collision). `fatalist` is the default mode. * Restores now obey the `--dry-run` option. Thanks to Peter Palfrader for the bug report. * New option `--verify-randomly` allows you to check only a part of the backup, instead of everything. * Verify now has some progress reporting. * Forget is now much faster. * Forget now has progress reporting. It is not fast enough to do without, sorry. * Backup now removes any checkpoint generations it created during a backup run, if it succeeds without errors. BUG FIXES: * Now works with a repository on sshfs. Thanks to Dafydd Harries for reporting the problem. * Now depends on a newer version of the larch library, fixing a problem when the Obnam default node size changes and an existing repository has a different size.
* User and group names for sftp live data are no longer queried from the local system. Instead, they're marked as unknown. Version 0.23, released 2011-10-02; a BETA release ------------------------------------------------- USER VISIBLE CHANGES: * `restore` now shows a progress bar. * `fsck` now has more useful progress reporting, and does more checking, including the integrity of file contents. * `fsck` now also checks the integrity of the B-trees in the repository, so that it is not necessary to run `fsck-larch` manually anymore. This works remotely as well, whereas `fsck-larch` only worked on B-trees on the local filesystem. * `force-lock` now gives a warning if the client does not exist in the repository, rather than ignoring it silently. * Subcommands for encryption now give a warning if an encryption key is not given. * The `--fsck-fix` option will now instruct `obnam fsck` to try to fix problems found. For this release, it only means fixing B-tree missing node problems, but more will follow. * The default sizes have been changed for B-tree nodes (256 KiB) and file contents chunks (1 MiB), based on benchmarking. * SFTP protocol use has been optimized, which should result in some more speed. This also highlights the need to change obnam so it can do uploads in the background. DEVELOPER CHANGES: * New `--sftp-delay=100` option can be used to simulate SFTP backups over networks with long round-trip times. * `obnam-benchmark` can now use `--sftp-delay` and other changes to make it more useful. INTERNAL CHANGES: * Got rid of terminal status plugin. Now, the `Application` class provides a `ttystatus.TerminalStatus` instance instead, in the `ts` attribute. Other plugins are supposed to use that for progress reporting and messaging to the user. * The `posix_fadvise` system call is used only if available. This should improve Obnam's portability a bit.
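Guarding the `posix_fadvise` call as described above can be done along these lines. This is a sketch using the modern `os.posix_fadvise` binding, which may be absent on some platforms; Obnam's actual code differs, and the helper name is made up.

```python
import os

def advise_sequential(fd):
    """Tell the kernel we will read file descriptor `fd` sequentially,
    if the platform supports posix_fadvise; otherwise silently do
    nothing, so the code stays portable."""
    if hasattr(os, "posix_fadvise"):
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
```

Callers can use the advice unconditionally; on platforms without the system call the function is simply a no-op.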
Version 0.22, released 2011-08-25; a BETA release ------------------------------------------------- USER VISIBLE CHANGES: * Obnam now reports its current configuration in the log file at startup. This will hopefully remove one round of "did you use the --foo option?" questions between developers and bug reporters. BUG FIXES: * The repository is now unlocked on exit only if it is still locked. * A wrongly caught `GeneratorExit` is now dealt with properly. * Keyboard interrupts are logged, so they don't show up as anonymous errors. CHANGES RELEVANT TO DEVELOPERS ONLY: * `setup.py` has been enhanced to work more like the old `Makefile` did: `clean` removes more artifacts. Instructions in `README` have been updated to point at `setup.py`. * Compiler warning about `_XOPEN_SOURCE` re-definition fixed. * Tests are now again run during a Debian package build. Version 0.21, released 2011-08-23; a BETA release ------------------------------------------------- USER VISIBLE CHANGES: * Obnam will now unlock the repository if there's an error during a backup. For the most part, the `force-lock` operation should now be unnecessary, but it's still there in case it's useful some day. BUG FIXES: * Negative timestamps for files now work. Thanks to Jamil Djadala for reporting the bug. * The documentation for `--checkpoint` units has been fixed. Thanks, user weinzwang from IRC. * The connections to the repository and live data filesystem are now properly closed. This makes benchmark read/write statistics correct. Version 0.20.1, released 2011-08-11; a BETA release ------------------------------------------------- BUG FIXES: * More cases of Unicode strings versus plain strings in filenames over SFTP fixed. Thanks to Tapani Tarvainen. Version 0.20, released 2011-08-09; a BETA release ------------------------------------------------- BUG FIXES: * Non-ASCII filenames over SFTP root now work. (Thanks, Tapani Tarvainen, for the reproducible bug report.)
* The count of files while making a backup now counts all files found, not just those backed up. The old behavior was confusing people. USER VISIBLE CHANGES: * The output of `obnam ls` now formats the columns a little more neatly, so that wide values do not cause misalignment. * The error message when trying to use an encrypted repository without encryption is now better (and suggests missing encryption as the reason). Thanks, chrysn. * Obnam now supports backing up Unix sockets. Version 0.19, released 2011-08-03; a BETA release ------------------------------------------------- INCOMPATIBILITY CHANGES: * We now require version 0.21 of the `larch` library, and this requires bumping the repository format. This means old backup repositories can't be used with this version, and you need to back up everything again. (Please tell me when this becomes a problem.) BUG FIXES: * Found one more place where a file going missing during a backup may cause a crash. * Typo in error message about on-disk formats fixed. (Thanks, Tapani Tarvainen.) * The `--trace` option works again. * `fcntl.F_SETFL` does not seem to work on file descriptors for files owned by root that are read-only to the user running obnam. Worked around by ignoring any problems with setting the flags. * The funnest bug in this release: if no log file was specified with `--log`, the current working directory was excluded from the backup. USER VISIBLE CHANGES: * `obnam(1)` manual page now discusses how configuration files are used. * The manual page describes problems using sftp to access live data. * The documentation for `--no-act` was clarified to say it only works for `forget`. (Thanks, Daniel Silverstone.) * `obnam-benchmark` now has a manual page. * The backup plugin logs files it excludes, so the user can find out what's going on. A confused user is an unhappy user. INTERNAL STUFF: * Tracing statements added to various parts of the code, to help debug mysterious problems.
* All exceptions are derived from `obnamlib.AppException` or `obnamlib.Error`, and those are derived from `cliapp.AppException`, so that the user gets nicer error messages than Python stack traces. * `blackboxtests` is no longer run under fakeroot, because Debian packages are built under fakeroot, and fakeroot within fakeroot causes trouble. However, the point of running tests under fakeroot was to make sure certain kinds of bugs are caught, and since Debian package building runs the tests anyway, the test coverage is not actually diminished. * The `Makefile` has new targets `fast-check` and `network-tests`. The latter runs tests over sftp to localhost. Version 0.18, released 2011-07-20; a BETA release ------------------------------------------------- * The repository format has again changed in an incompatible manner, so you will need to re-backup everything again. (If this is a problem, tell me, and I'll consider adding backwards compatibility before 1.0 is released.) * New option `--exclude-caches` allows automatic exclusion of cache directories that are marked as such. * Obnam now makes files in the repository be read-only, so that they're that much harder to delete by mistake. * Error message about files that can't be backed up now mentions the correct file. * Bugfix: unreadable files and directories no longer cause the backup to fail. The problems are reported, but the backup continues. Thanks to Jeff Epler for reporting the bug. * Speed improvement from Jeff Epler for excluding files from backups. * Various other speed improvements. * Bugfix: restoring symlinks now works even if the symlink is restored before its target. Also, the permissions of the symlink (rather than its target) are now restored correctly. Thanks to Jeff Epler for an exemplary bug report. * New option `--one-file-system`, from Jeff Epler. * New benchmarking tool `obnam-benchmark`, which is more flexible than the old `run-benchmark`. 
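The cache directories that `--exclude-caches` (mentioned in the 0.18 notes above) skips are conventionally marked with a `CACHEDIR.TAG` file beginning with a well-known signature. A minimal detector of that marking might look like this; it is an illustrative sketch, not Obnam's implementation.

```python
import os

# The well-known signature from the Cache Directory Tagging convention.
CACHEDIR_TAG = b"Signature: 8a477f597d28d172789f06886806bc55"

def is_tagged_cache_dir(dirname):
    """Return True if `dirname` contains a CACHEDIR.TAG file whose
    first 43 bytes are the cache-directory signature."""
    tag = os.path.join(dirname, "CACHEDIR.TAG")
    try:
        with open(tag, "rb") as f:
            return f.read(len(CACHEDIR_TAG)) == CACHEDIR_TAG
    except OSError:
        return False
```

A backup tool would call this on each directory it is about to descend into and prune the subtree when it returns True.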
* When encrypting/decrypting data with GnuPG, temporary files are no longer used. * When verifying, `.../foo` and `.../foo/` now work the same way. * New option `--symmetric-key-bits`. * The chunk directory uses more hierarchy levels, and the way chunks are stored there is now user-configurable (but you'll get into trouble if you don't always use the same configuration). This should speed things up a bit once the number of chunks grows very large. * New `--chunkids-per-group` option, for yet more knobs to tweak when searching for optimal performance. * Local files are now opened using `O_NOATIME` so they can be backed up without affecting timestamps. * Now uses the `cliapp` framework for writing command line applications. The primary user-visible effect is that the manpage now has an accurate list of options. * Bugfix: Obnam now again reports VFS I/O statistics. * Bugfix: Obnam can again back up live data that is accessed using sftp. Thanks to Tapani Tarvainen for reporting the problem. Version 0.17, released 2011-05-21; a BETA release ------------------------------------------------- * This is the second BETA release. * The `run-benchmark` script now works with the new version of `seivot`. The only benchmark size is one gibibyte, for now, because Obnam's too slow to do big ones in reasonable time. As an aside, the benchmark script got rewritten in Python, so it can be made more flexible. * Benchmarks are run using encrypted backups. * The kernel buffer cache is dropped before each obnam run, so the benchmark result is more realistic (read: slower). * Obnam now rotates its logs. See `--log-max` and `--log-keep` options in the manual page. The default location for the log file is now `~/.cache/obnam/obnam.log` for people, and `/var/log/obnam.log` for root. * Obnam now restores sparse files correctly. * There have been some speed improvements to Obnam. * The `--repository` option now has the shorter alias `-r`, since it gets used so often. 
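Restoring a sparse file correctly, as mentioned in the 0.17 notes above, amounts to seeking over runs of zero bytes instead of writing them, so the filesystem can leave holes. A simplified sketch, assuming a real file opened in binary read/write mode (the helper name and block size are made up; Obnam's actual restore code differs):

```python
def write_sparsely(f, data, block_size=4096):
    """Write `data` to the open binary file `f`, seeking over all-zero
    blocks instead of writing them, then set the final length.  On real
    files, ftruncate() zero-fills any trailing hole."""
    for i in range(0, len(data), block_size):
        block = data[i:i + block_size]
        if block.count(0) == len(block):
            continue  # leave a hole instead of writing zeroes
        f.seek(i)
        f.write(block)
    f.truncate(len(data))
```

On filesystems that support holes, the skipped blocks consume no disk space, yet reading the file back yields exactly the original bytes.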
* `obnam force-lock` now merely gives an error message, instead of a Python stack trace, if the repository does not exist. * Obnam now does not crash if files go missing during a backup, or can't be read, or there are other problems with them. It will report the problem, but then continue as if it had never heard of the file. * Obnam now supports FIFO files. * Obnam now verifies checksums when it restores files. * Obnam now stores the checksum for the whole file, not just the checksum for each chunk of its contents. * Obnam's own log file is automatically excluded from backups. * Obnam now stores and restores file timestamps to full accuracy, instead of truncating them to whole seconds. * The format of the backup repository has changed in an incompatible way, and Obnam will now refuse to use an old repository. This means you will need to use an old version to restore from them, and need to re-backup everything. Sorry. Version 0.16, released 2011-07-17; a BETA release ------------------------------------------------- * This is the first BETA release. Obnam should now be feature complete for real use. Performance is lacking and there are many bugs remaining. There are no known bugs that would corrupt backed up data, or prevent its recovery. * Add encryption support. See the manual page for how to use it. Version 0.15.1, released 2011-03-21; an ALPHA release ---------------------------------------------------- * Fix `setup.py` to not import `obnamlib`, so it works when building under pbuilder on Debian. Meh. Version 0.15, released 2011-03-21; an ALPHA release ---------------------------------------------------- Bugs fixed: * Manual page GPL copyright blurb is now properly marked up as a comment. (Thanks, Joey Hess.) * README now links to python-lru correctly. (Thanks, Erik Johansson.) Improvements and other changes: * Filenames and directories are backed up in sorted order. This should make it easier to know how far obnam's gotten. 
* The location where backups are stored is now called the repository, instead of the store. Suggested by Joey Hess. * The repository and the target directory for restored data are now both created by Obnam, if they don't already exist. Suggested by Joey Hess. * Better control of logging, using the new `--trace` option. * Manual page now explains making backups a little better. * Default value for `--lru-size` reduced to 500, greatly reducing memory use without, it seems, much decrease in speed. * `obnam verify` now reports success explicitly. Based on question from Joey Hess. * `obnam verify` now accepts both non-option arguments and the `--root` option. Suggested by Joey Hess. * `obnam forget` now accepts "generation specifiers", not just numeric generation ids. This means that `obnam forget latest` works. * I/O statistics are logged more systematically. * `obnam force-lock` introduced, to allow breaking a lock left behind if obnam crashes. But it never does, of course. (Well, except if there's a bug, like when a file changes at the wrong moment.) * `obnam genids` introduced, to list generation ids without any other data. The old command `obnam generations` still works, and lists other info about each generation as well, but that's sometimes bad for scripting. * The `--dump-memory-profile` option now accepts the value `simple`, for reporting basic memory use. It has such a small impact that it's the default. * Obnam now stores the version of the on-disk format in the repository. This should allow it to handle repositories created by a different version and act suitably (hopefully without wiping all your backups). Version 0.14, released 2010-12-29; an ALPHA release ---------------------------------------------------- This version is capable of backing up my laptop's home directory. It is, however, still an ALPHA release, and you should not rely on it as your sole form of backup. It is also slow.
But if you're curious, now would be a good time to try it out a bit. Bug fixes: * `COPYING` now contains GPL version 3, instead of 2. The code was licensed under version 3 already. (Thank you Greg Grossmeier.) * The manual page now uses `-` and `\-` correctly. * `obnam forget` now actually removes data that is no longer used by any generation. * When backing up a new generation, if any of the root directories for the backup got dropped by the user, they are now also removed from the backup generation. Old generations obviously still have them. * Only the per-client B-tree forest should have multiple trees. Now this actually happens, whereas previously sometimes a very large number of new trees would be created in some forests. (What's good for rain forests is not good for saving disk space.) * When recursing through directory trees, obnam no longer follows symlinks to directories. * obnam no longer creates a missing backup store when backing up to a local disk. It never did this when backing up via sftp. (This saves me from figuring out which of `store`, `stor`, and `sorte` is the real directory.) New features and stuff: * `blackboxtest` has been rewritten to use Python's `unittest` framework, rather than a homegrown bad re-implementation of some of it. * `obnam ls` interprets arguments as "genspecs" rather than generation identifiers. This means `obnam ls latest` works, and now `latest` is also the default if you don't give any spec. * `run-benchmarks` now outputs results into a git checkout of , an ikiwiki instance hosted by . The script also puts the results into a suitable sub-directory, adds a page for the RSS feed of benchmark results, and updates the report page that summarizes all stored results. * There is now a 100 GiB benchmark. * Clients are now called clients, instead of hosts. This terminology should be clearer. * The list of clients now stores a random integer identifier for each client (unique within the store). 
The identifier is used as the name of the per-client B-tree directory, rather than the hostname of the client. This should prevent a teeny tiny bit of information leakage. It also makes debugging things much harder. * Various refactorings and prettifications of the code have happened. For example, several classes have been split off from the `store.py` module. This has also resulted in much better test coverage for those classes. * The per-client trees (formerly GenerationStore, now ClientMetadataTree) have a more complicated key now: 4 parts, not 3. This makes it easier to keep separate data about files, and other data that needs to be stored per-generation, such as what the generation id is. * `find-duplicate-chunks`, a tool for finding duplicate chunks of data in files in a directory tree, was added to the tree. I have used it to find out if it is worthwhile to do duplicate chunk removal at all. (It is, at least for my data.) Also, it can be used to find good values for chunk sizes for duplicate detection. * The whole way in which obnam does de-duplication got re-designed and re-implemented. This is tricky stuff, when there is more than one client. * `SftpFS` now uses a hack copied from bzrlib, to use openssh if it is available, and paramiko only if it is not. This speeds up sftp data transfers quite a bit. (Where bzrlib supports more than just openssh, we don't, since I have no way to test the other stuff. Patches welcome.) * The way lists of chunk ids are stored for files got changed. Now we store several ids per list item, which is faster and also saves some space in the B-tree nodes. Also, it is now possible to append to the list, which means the caller does not need to first gather a list of all ids. Such a list gets quite costly when the file is quite big (e.g., in the terabyte range). * A new `--dump-memory-profile` option was added to help with memory profiling using meliae or heapy. 
(Obnam's memory consumption finally got annoying enough that I did something about it.) Removed stuff: * The functional specification was badly outdated, and has been removed. I decided to stop kidding myself that I would keep it up to date. * The store design document has been removed from the store tree. The online version at is the canonical version, and is actually kept up to date. * The benchmark specification has likewise been replaced with . Version 0.13, released 2010-07-13; an ALPHA release ---------------------------------------------------- * Bug fix: a mistake in 0.12 caused checkpoints to happen after each file after the first checkpoint. Now they happen at the right intervals again. * Upload speed is now displayed during backups. * Obnam now tells the kernel that it shouldn't cache data it reads or writes. It is not likely that data being backed up is going to be needed again any time soon, so there's no point in caching it. (The posix_fadvise call is used for this.) * New --lru-size option sets size of LRU cache for nodes in memory. The obnam default is large enough to suit large backups. This uses more memory, but is faster than btree's small default of 100. Version 0.12, released 2010-07-11; an ALPHA release ---------------------------------------------------- * NOTE: This version makes incompatible changes to the way data is stored on-disk. Backups made with older versions are NOT supported. Sorry. * The run-benchmark script has dropped some smaller sizes (they're too fast to be interesting), and adds a 10 GiB test size. * Various speed optimizations. Most importantly, the way file metadata (results of lstat(2)) are encoded has changed. This is the incompatible change from above. It's much faster now, though. * Preliminary support for using SFTP for the backup store added. Hasn't been used much yet, so might well be very buggy. 
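The posix_fadvise behaviour described in the 0.13 notes above can be sketched in Python; this sketch uses `os.posix_fadvise` (available in Python 3.3+ on POSIX systems), not the C wrapper Obnam itself ships, and the guard makes it a no-op on platforms where the call is missing:

```python
import os

def advise_dontneed(path):
    # Tell the kernel that the file's cached pages will not be needed
    # again soon, as Obnam does after reading data during a backup.
    fd = os.open(path, os.O_RDONLY)
    try:
        if hasattr(os, 'posix_fadvise'):
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    finally:
        os.close(fd)
```

Passing 0 for both offset and length applies the advice to the whole file.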
Version 0.11, released 2010-07-05; an ALPHA release ---------------------------------------------------- * Speed optimizations: - chunk identifiers are now sequential, except for the first one, or when there's a collision - chunks are now stored in a more sensible directory hierarchy (instead of one per directory, on average) - adding files to a directory in the backup store is now faster - only store a file's metadata if it has changed * New --exclude=regexp option to exclude files based on pathnames * Obnam now makes checkpoints during backups. If a backup is aborted in the middle and then re-started, it will continue from the latest checkpoint rather than from the beginning of the previous backup run. - New option --checkpoint to set the interval between checkpoints. Defaults to 1 GiB. * Options for various B-tree settings. This is mostly useful for finding the optimal set of defaults, but may be useful in other situations for some people. - New options --chunk-group-size, --chunk-size, --node-size, --upload-queue-size. * Somewhat better progress reporting during backups. Version 0.10, released 2010-06-29; an ALPHA release --------------------------------------------------- * Rewritten from scratch. * Old NEWS file entries removed (see bzr if you're interested). obnam-1.6.1/README0000644000175000017500000001241112246357067013372 0ustar jenkinsjenkinsObnam, a backup program ======================= Obnam is a backup program. Home page --------- The Obnam home page is at , see there for more information. Installation ------------ The source tree contains packaging for Debian. Run `debuild -us -uc -i.git` to build an installation package. On other systems, using the `setup.py` file should work: run "python setup.py --help" for advice. If not, please report a bug. (I've only tested `setup.py` enough to build the Debian package.) 
You need to install my Python B-tree library, and some of my other libraries and tools, which you can get from: * * * (for automatic tests) * * * * * * (for benchmarks) You also need third party libraries: * paramiko: See debian/control for the full set of build dependencies and runtime dependencies on a Debian system. (That set actually gets tested. The above list is maintained manually and may get out of date from time to time.) Use --- To get a quick help summary of options: ./obnam --help To make a backup: ./obnam backup --repository /tmp/mybackup $HOME For more information, see the manual page: man -l obnam.1 Hacking ------- Obnam source code is stored in git for version control purposes; you can get a copy as follows: git clone git://git.liw.fi/obnam The 'master' branch is the main development one. Any bug fixes and features should be developed in a dedicated branch, which gets merged to master when the changes are done and considered good. To build and run automatic tests: ./check ./check --fast # unit tests only, no black box tests ./check --network # requires ssh access to localhost `check` is a wrapper around `python setup.py`, but since using that takes several steps, the script makes things easier. You need my CoverageTestRunner to run tests, see above for where to get it. A couple of scripts exist to run benchmarks and profiles: ./metadata-speed 10000 ./obnam-benchmark --size=1m/100k --results /tmp/benchmark-results viewprof /tmp/benchmark-results/*/*backup-0.prof seivots-summary /tmp/benchmark-results/*/*.seivot | less -S There are two kinds of results: Python profiling output, and `.seivot` files. For the former, `viewprof` is a little helper script I wrote, around the Python pstats module. You can use your own, or get mine from extrautils (). 
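`viewprof` is described above only as a small helper around the Python `pstats` module; a minimal stand-in (an illustrative sketch, not the actual script from extrautils) could look like this:

```python
import cProfile
import pstats

def view_profile(path, count=20):
    # Load a profiler dump and print the most expensive calls,
    # sorted by cumulative time.
    stats = pstats.Stats(path)
    stats.sort_stats('cumulative').print_stats(count)

# Profile something small, then inspect the resulting dump.
cProfile.run('sum(i * i for i in range(10000))', '/tmp/example.prof')
view_profile('/tmp/example.prof', 5)
```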
Running the benchmarks under profiling makes them a little slower (typically around 10% for me, when I've compared), but that's OK: the absolute numbers of the benchmarks are less important than the relative ones. It's nice to be able to look at the profiler output, if a benchmark is surprisingly slow, without having to re-run it. `seivots-summary` is a tool to display summaries of the measurements made during a benchmark run. `seivot` is the tool that makes the measurements. I typically save a number of benchmark results, so that I can see how my changes affect performance over time. If you make any changes, I welcome patches, either as plain diffs, `git format-patch --cover-letter` mails, or public repositories I can merge from. The code layout is roughly like this: obnamlib/ # all the real code obnamlib/plugins/ # the plugin code (see pluginmgr.py) obnam # script to invoke obnam _obnammodule.c # wrapper around some system calls In obnamlib, every code module has a corresponding test module, and "make check" uses CoverageTestRunner to run them pairwise. For each pair, test coverage must be 100% or the test will fail. Mark statements that should not be included in coverage testing with "# pragma: no cover", if you really, really can't write a test. without-tests lists modules that have no test modules. If you want to make a new release of Obnam, I recommend following my release checklist: . Feedback -------- I welcome bug fixes, enhancements, bug reports, suggestions, requests, and other feedback. I prefer e-mail to the mailing list: see for instructions. It would be helpful if you can run `make clean check` before submitting a patch, but it is not strictly required. Legal stuff ----------- Most of the code is written by Lars Wirzenius. (Please provide patches so that can change.) The code is covered by the GNU General Public License, version 3 or later. 
Copyright 2010-2013 Lars Wirzenius This program is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program. If not, see . obnam-1.6.1/_obnammodule.c0000644000175000017500000001772012246357067015327 0ustar jenkinsjenkins/* * _obnammodule.c -- Python extensions for Obnam * * Copyright (C) 2008, 2009 Lars Wirzenius * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License as published by * the Free Software Foundation; either version 3 of the License, or * (at your option) any later version. * * This program is distributed in the hope that it will be useful, * but WITHOUT ANY WARRANTY; without even the implied warranty of * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the * GNU General Public License for more details. * * You should have received a copy of the GNU General Public License along * with this program; if not, write to the Free Software Foundation, Inc., * 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. */ /* * This is a Python extension module written for Obnam, the backup * software. * * This module provides a way to call the posix_fadvise function from * Python. Obnam uses this to set the POSIX_FADV_SEQUENTIAL and * POSIX_FADV_DONTNEED flags, to make sure the kernel knows that it will * read files sequentially and that the data does not need to be cached. * This makes Obnam not trash the disk buffer cache, which is nice. 
*/ #define _FILE_OFFSET_BITS 64 #include #ifndef _XOPEN_SOURCE #define _XOPEN_SOURCE 600 #endif #define _POSIX_C_SOURCE 200809L #include #include #include #include #include #include #include #ifdef __FreeBSD__ #include #define NO_NANOSECONDS 1 #else #include #define NO_NANOSECONDS 0 #endif static PyObject * fadvise_dontneed(PyObject *self, PyObject *args) { #if POSIX_FADV_DONTNEED int fd; /* Can't use off_t for offset and len, since PyArg_ParseTuple doesn't know it. */ unsigned long long offset; unsigned long long len; int ret; if (!PyArg_ParseTuple(args, "iLL", &fd, &offset, &len)) return NULL; ret = posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED); return Py_BuildValue("i", ret); #else return Py_BuildValue("i", 0); #endif } static PyObject * utimensat_wrapper(PyObject *self, PyObject *args) { int ret; const char *filename; long atime_sec, atime_nsec; long mtime_sec, mtime_nsec; #if NO_NANOSECONDS struct timeval tv[2]; #else struct timespec tv[2]; #endif if (!PyArg_ParseTuple(args, "sllll", &filename, &atime_sec, &atime_nsec, &mtime_sec, &mtime_nsec)) return NULL; #if NO_NANOSECONDS tv[0].tv_sec = atime_sec; tv[0].tv_usec = atime_nsec / 1000; tv[1].tv_sec = mtime_sec; tv[1].tv_usec = mtime_nsec / 1000; ret = lutimes(filename, tv); #else tv[0].tv_sec = atime_sec; tv[0].tv_nsec = atime_nsec; tv[1].tv_sec = mtime_sec; tv[1].tv_nsec = mtime_nsec; ret = utimensat(AT_FDCWD, filename, tv, AT_SYMLINK_NOFOLLOW); #endif if (ret == -1) ret = errno; return Py_BuildValue("i", ret); } /* * Since we can't set nanosecond mtime and atimes on some platforms, also * don't retrieve that level of precision from lstat(), so comparisons * work. 
*/ static unsigned long long remove_precision(unsigned long long nanoseconds) { #if NO_NANOSECONDS return nanoseconds - (nanoseconds % 1000); #else return nanoseconds; #endif } static PyObject * lstat_wrapper(PyObject *self, PyObject *args) { int ret; const char *filename; struct stat st = {0}; if (!PyArg_ParseTuple(args, "s", &filename)) return NULL; ret = lstat(filename, &st); if (ret == -1) ret = errno; return Py_BuildValue("iKKKKKKKLLLLKLKLK", ret, (unsigned long long) st.st_dev, (unsigned long long) st.st_ino, (unsigned long long) st.st_mode, (unsigned long long) st.st_nlink, (unsigned long long) st.st_uid, (unsigned long long) st.st_gid, (unsigned long long) st.st_rdev, (long long) st.st_size, (long long) st.st_blksize, (long long) st.st_blocks, (long long) st.st_atim.tv_sec, remove_precision(st.st_atim.tv_nsec), (long long) st.st_mtim.tv_sec, remove_precision(st.st_mtim.tv_nsec), (long long) st.st_ctim.tv_sec, remove_precision(st.st_ctim.tv_nsec)); } static PyObject * llistxattr_wrapper(PyObject *self, PyObject *args) { const char *filename; size_t bufsize; PyObject *o; char* buf; ssize_t n; if (!PyArg_ParseTuple(args, "s", &filename)) return NULL; #ifdef __FreeBSD__ bufsize = extattr_list_link(filename, EXTATTR_NAMESPACE_USER, NULL, 0); buf = malloc(bufsize); n = extattr_list_link(filename, EXTATTR_NAMESPACE_USER, buf, bufsize); if (n >= 0) { /* Convert from length-prefixed BSD style to '\0'-suffixed Linux style. 
*/ size_t i = 0; while (i < n) { unsigned char length = (unsigned char) buf[i]; memmove(buf + i, buf + i + 1, length); buf[i + length] = '\0'; i += length + 1; } o = Py_BuildValue("s#", buf, (int) n); } else { o = Py_BuildValue("i", errno); } free(buf); #else bufsize = 0; o = NULL; do { bufsize += 1024; buf = malloc(bufsize); n = llistxattr(filename, buf, bufsize); if (n >= 0) o = Py_BuildValue("s#", buf, (int) n); else if (n == -1 && errno != ERANGE) o = Py_BuildValue("i", errno); free(buf); } while (o == NULL); #endif return o; } static PyObject * lgetxattr_wrapper(PyObject *self, PyObject *args) { const char *filename; const char *attrname; size_t bufsize; PyObject *o; if (!PyArg_ParseTuple(args, "ss", &filename, &attrname)) return NULL; bufsize = 0; o = NULL; do { bufsize += 1024; char *buf = malloc(bufsize); #ifdef __FreeBSD__ int n = extattr_get_link(filename, EXTATTR_NAMESPACE_USER, attrname, buf, bufsize); #else ssize_t n = lgetxattr(filename, attrname, buf, bufsize); #endif if (n >= 0) o = Py_BuildValue("s#", buf, (int) n); else if (n == -1 && errno != ERANGE) o = Py_BuildValue("i", errno); free(buf); } while (o == NULL); return o; } static PyObject * lsetxattr_wrapper(PyObject *self, PyObject *args) { const char *filename; const char *name; const char *value; int size; int ret; if (!PyArg_ParseTuple(args, "sss#", &filename, &name, &value, &size)) return NULL; #ifdef __FreeBSD__ ret = extattr_set_link(filename, EXTATTR_NAMESPACE_USER, name, value, size); #else ret = lsetxattr(filename, name, value, size, 0); #endif if (ret == -1) ret = errno; return Py_BuildValue("i", ret); } static PyMethodDef methods[] = { {"fadvise_dontneed", fadvise_dontneed, METH_VARARGS, "Call posix_fadvise(2) with POSIX_FADV_DONTNEED argument."}, {"utimensat", utimensat_wrapper, METH_VARARGS, "utimensat(2) wrapper."}, {"lstat", lstat_wrapper, METH_VARARGS, "lstat(2) wrapper; arg is filename, returns tuple."}, {"llistxattr", llistxattr_wrapper, METH_VARARGS, "llistxattr(2) wrapper; 
arg is filename, returns tuple."}, {"lgetxattr", lgetxattr_wrapper, METH_VARARGS, "lgetxattr(2) wrapper; arg is filename, returns tuple."}, {"lsetxattr", lsetxattr_wrapper, METH_VARARGS, "lsetxattr(2) wrapper; arg is filename, returns errno."}, {NULL, NULL, 0, NULL} /* Sentinel */ }; PyMODINIT_FUNC init_obnam(void) { (void) Py_InitModule("_obnam", methods); } obnam-1.6.1/analyze-repository-files0000755000175000017500000000727312246357067017432 0ustar jenkinsjenkins#!/usr/bin/python # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . '''Analyze the files in an Obnam backup repository. For performance reasons, it is best if Obnam does not write too many files per directory, or too large or too small files. 
This program analyzes all the files in an Obnam backup repository, or, indeed, any local directory, and reports the following: * total number of files * sum of lengths of files * number of files per directory: fewest, most, average, median (both number and name of directory) * size of files: smallest, largest, average, median (both size and name of file) ''' import os import stat import sys class Stats(object): def __init__(self): self.dirs = list() self.files = list() def add_dir(self, dirname, count): self.dirs.append((count, dirname)) def add_file(self, filename, size): self.files.append((size, filename)) @property def total_files(self): return len(self.files) @property def sum_of_sizes(self): return sum(size for size, name in self.files) @property def dirsizes(self): self.dirs.sort() num_dirs = len(self.dirs) fewest, fewest_name = self.dirs[0] most, most_name = self.dirs[-1] average = sum(count for count, name in self.dirs) / num_dirs median = self.dirs[num_dirs/2][0] return fewest, fewest_name, most, most_name, average, median @property def filesizes(self): self.files.sort() num_files = len(self.files) smallest, smallest_name = self.files[0] largest, largest_name = self.files[-1] average = sum(size for size, name in self.files) / num_files median = self.files[num_files/2][0] return smallest, smallest_name, largest, largest_name, average, median def main(): stats = Stats() for name in sys.argv[1:]: stat_info = os.lstat(name) if stat.S_ISDIR(stat_info.st_mode): for dirname, subdirs, filenames in os.walk(name): stats.add_dir(dirname, len(filenames) + len(subdirs)) for filename in filenames: pathname = os.path.join(dirname, filename) stat_info = os.lstat(pathname) if stat.S_ISREG(stat_info.st_mode): stats.add_file(pathname, stat_info.st_size) elif stat.S_ISREG(stat_info.st_mode): stats.add_file(name, stat_info.st_size) print "total_files:", stats.total_files print "sum of sizes:", stats.sum_of_sizes fewest, fewest_name, most, most_name, average, median = 
stats.dirsizes print "files per dir:" print " fewest:", fewest, fewest_name print " most:", most, most_name print " average:", average print " median:", median smallest, smallest_name, largest, largest_name, average, median = \ stats.filesizes print "file sizes:" print " smallest:", smallest, smallest_name print " largest:", largest, largest_name print " average:", average print " median:", median if __name__ == '__main__': main() obnam-1.6.1/check0000755000175000017500000000144612246357067013523 0ustar jenkinsjenkins#!/bin/sh # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . set -e python setup.py --quiet clean python setup.py --quiet build_ext -i rm -rf build python setup.py --quiet check "$@" obnam-1.6.1/check-lock-usage-from-log0000755000175000017500000002371012246357067017271 0ustar jenkinsjenkins#!/usr/bin/python # Copyright 2012 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
# # You should have received a copy of the GNU General Public License # along with this program. If not, see . '''Check lock file usage from log files. This program reads a number of Obnam log files, produced with tracing for obnamlib, and analyses them for bugs when using lock files. Each log file is assumed to be produced by a separate Obnam instance. * Have any instances held the same lock during overlapping periods? ''' import cliapp import logging import os import re import time import ttystatus timestamp_pat = \ r'^\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d (?P\d+\.\d+) .*' lock_pat = re.compile( timestamp_pat + r'vfs_local.py:[0-9]*:lock: got lockname=(?P.*)') unlock_pat = re.compile( timestamp_pat + r'vfs_local.py:[0-9]*:unlock: lockname=(?P.*)') writefile_pat = re.compile( timestamp_pat + r'vfs_local.py:[0-9]*:write_file: write_file (?P.*)$') overwritefile_pat = re.compile( timestamp_pat + r'vfs_local.py:[0-9]*:overwrite_file: overwrite_file (?P.*)$') node_open_pat = re.compile( timestamp_pat + r'nodestore_disk.py:\d+:get_node: reading node \d+ from file ' r'(?P.*)$') node_remove_pat = re.compile( timestamp_pat + r'vfs_local.py:\d+:remove: remove (?P.*/nodes/.*)$') rename_pat = re.compile( timestamp_pat + r'vfs_local.py:\d+:rename: rename (?P\S+) (?P\S+)$') class LogEvent(object): def __init__(self, logfile, lineno, timestamp): self.logfile = logfile self.lineno = lineno self.timestamp = timestamp def sortkey(self): return self.timestamp class LockEvent(LogEvent): def __init__(self, logfile, lineno, timestamp, lockname): LogEvent.__init__(self, logfile, lineno, timestamp) self.lockname = lockname def __str__(self): return 'Lock(%s)' % self.lockname class UnlockEvent(LockEvent): def __str__(self): return 'Unlock(%s)' % self.lockname class WriteFileEvent(LogEvent): def __init__(self, logfile, lineno, timestamp, filename): LogEvent.__init__(self, logfile, lineno, timestamp) self.filename = filename def __str__(self): return 'WriteFile(%s)' % self.filename class 
OverwriteFileEvent(WriteFileEvent): def __str__(self): return 'OverwriteFile(%s)' % self.filename class NodeCreateEvent(LogEvent): def __init__(self, logfile, lineno, timestamp, node_id): LogEvent.__init__(self, logfile, lineno, timestamp) self.node_id = node_id def __str__(self): return 'NodeCreate(%s)' % self.node_id class NodeDestroyEvent(NodeCreateEvent): def __str__(self): return 'NodeDestroy(%s)' % self.node_id class NodeReadEvent(NodeCreateEvent): def __str__(self): return 'NodeOpen(%s)' % self.node_id class RenameEvent(LogEvent): def __init__(self, logfile, lineno, timestamp, old, new): LogEvent.__init__(self, logfile, lineno, timestamp) self.old = old self.new = new def __str__(self): return 'Rename(%s -> %s)' % (self.old, self.new) class CheckLocks(cliapp.Application): def setup(self): self.events = [] self.errors = 0 self.latest_opened_node = None self.patterns = [ (lock_pat, self.lock_event), (unlock_pat, self.unlock_event), (writefile_pat, self.writefile_event), (overwritefile_pat, self.overwritefile_event), (node_open_pat, self.read_node_event), (node_remove_pat, self.node_remove_event), (rename_pat, self.rename_event), ] self.ts = ttystatus.TerminalStatus() self.ts.format( 'Reading %ElapsedTime() %Integer(lines): %Pathname(filename)') self.ts['lines'] = 0 def cleanup(self): self.ts.clear() self.analyse_phase_1() self.ts.finish() if self.errors: raise cliapp.AppException('There were %d errors' % self.errors) def error(self, msg): logging.error(msg) self.ts.error(msg) self.errors += 1 def analyse_phase_1(self): self.events.sort(key=lambda e: e.sortkey()) self.events = self.create_node_events(self.events) self.ts.format('Phase 1: %Index(event,events)') self.ts['events'] = self.events self.ts.flush() current_locks = set() current_nodes = set() for e in self.events: self.ts['event'] = e logging.debug( 'analysing: %s:%s: %s: %s' % (e.logfile, e.lineno, repr(e.sortkey()), str(e))) if type(e) is LockEvent: if e.lockname in current_locks: self.error( 
'Re-locking %s: %s:%s:%s' % (e.lockname, e.logfile, e.lineno, e.timestamp)) else: current_locks.add(e.lockname) elif type(e) is UnlockEvent: if e.lockname not in current_locks: self.error( 'Unlocking %s which was not locked: %s:%s:%s' % (e.lockname, e.logfile, e.lineno, e.timestamp)) else: current_locks.remove(e.lockname) elif type(e) in (WriteFileEvent, OverwriteFileEvent): lockname = self.determine_lockfile(e.filename) if lockname and lockname not in current_locks: self.error( '%s:%s: ' 'Write to file %s despite lock %s not existing' % (e.logfile, e.lineno, e.filename, lockname)) elif type(e) is NodeCreateEvent: if e.node_id in current_nodes: self.error( '%s:%s: Node %s already exists' % (e.logfile, e.lineno, e.node_id)) else: current_nodes.add(e.node_id) elif type(e) is NodeDestroyEvent: if e.node_id not in current_nodes: self.error( '%s:%s: Node %s does not exist' % (e.logfile, e.lineno, e.node_id)) else: current_nodes.remove(e.node_id) elif type(e) is NodeReadEvent: if e.node_id not in current_nodes: self.error( '%s:%s: Node %s does not exist' % (e.logfile, e.lineno, e.node_id)) elif type(e) is RenameEvent: if e.old in current_nodes: current_nodes.remove(e.old) current_nodes.add(e.new) else: raise NotImplementedError() def create_node_events(self, events): new = [] for e in events: new.append(e) if type(e) in (WriteFileEvent, OverwriteFileEvent): if '/nodes/' in e.filename: new_e = NodeCreateEvent( e.logfile, e.lineno, e.timestamp, e.filename) new_e.timestamp = e.timestamp new.append(new_e) return new def determine_lockfile(self, filename): if filename.endswith('/lock'): return None toplevel = filename.split('/')[0] if toplevel == 'chunks': return None if toplevel in ('metadata', 'clientlist'): return './lock' return toplevel + '/lock' def process_input(self, name): self.ts['filename'] = name return cliapp.Application.process_input(self, name) def process_input_line(self, filename, line): self.ts['lines'] = self.global_lineno for pat, func in self.patterns: m 
= pat.search(line) if m: event = func(filename, line, m) if event is not None: self.events.append(event) def lock_event(self, filename, line, match): return LockEvent( filename, self.lineno, float(match.group('timestamp')), match.group('lock')) def unlock_event(self, filename, line, match): return UnlockEvent( filename, self.lineno, float(match.group('timestamp')), match.group('lock')) def writefile_event(self, filename, line, match): return WriteFileEvent( filename, self.lineno, float(match.group('timestamp')), match.group('filename')) def overwritefile_event(self, filename, line, match): return OverwriteFileEvent( filename, self.lineno, float(match.group('timestamp')), match.group('filename')) def read_node_event(self, filename, line, match): node_id = match.group('nodeid') if not os.path.basename(node_id).startswith('tmp'): return NodeReadEvent( filename, self.lineno, float(match.group('timestamp')), node_id) def node_remove_event(self, filename, line, match): return NodeDestroyEvent( filename, self.lineno, float(match.group('timestamp')), match.group('nodeid')) def rename_event(self, filename, line, match): return RenameEvent( filename, self.lineno, float(match.group('timestamp')), match.group('old'), match.group('new')) CheckLocks().run() obnam-1.6.1/confs/0000755000175000017500000000000012246357067013623 5ustar jenkinsjenkinsobnam-1.6.1/confs/common.conf0000644000175000017500000000003712246357067015762 0ustar jenkinsjenkins[config] with-encryption = yes obnam-1.6.1/confs/historical-local.conf0000644000175000017500000000011112246357067017714 0ustar jenkinsjenkins[config] profile-name = historical-local size = 4k/4k generations = 1000 obnam-1.6.1/confs/media-local.conf0000644000175000017500000000012412246357067016636 0ustar jenkinsjenkins[config] profile-name = media-local size = 1g/100m file-size = 100m generations = 2 obnam-1.6.1/confs/sourcecode-local.conf0000644000175000017500000000012712246357067017715 0ustar jenkinsjenkins[config] profile-name = 
sourcecode-local size = 1g/10m file-size = 16k generations = 2 obnam-1.6.1/crash-test0000755000175000017500000000403412246357067014517 0ustar jenkinsjenkins#!/bin/sh # Copyright 2012 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . set -eu if [ "$#" != 1 ] then echo "usage: see source" 1>&2 exit 1 fi N="$1" tempdir="$(mktemp -d)" echo "Temporary directory: $tempdir" cat < "$tempdir/conf" [config] repository = $tempdir/repo root = $tempdir/data log = $tempdir/obnam.log trace = larch checkpoint = 1m lock-timeout = 1 log-keep = 16 log-level = debug trace = larch, obnamlib EOF # Do a minimal backup to make sure the repository works at least once, without the crash-limit option mkdir "$tempdir/data" ./obnam backup --no-default-config --config "$tempdir/conf" genbackupdata --create=100m "$tempdir/data" echo "crash-limit = $N" >> "$tempdir/conf" while true do # There's no need to delete this file because the first Exception message # that appears in the file will terminate the test. # rm -f "$tempdir/obnam.log" echo "Trying backup with at most $N writes to repository" ./obnam force-lock --no-default-config --config "$tempdir/conf" 2>/dev/null if ./obnam backup --no-default-config --config "$tempdir/conf" 2>/dev/null then echo "Backup finished ok, done" break fi if ! 
grep -q '^Exception: Crashing as requested' "$tempdir/obnam.log" then echo "Backup terminated because of unrequested crash" 1>&2 exit 1 fi # ./obnam fsck --no-default-config --config "$tempdir/conf" || true done rm -rf "$tempdir" echo "OK" obnam-1.6.1/create-vfat-disk-image0000755000175000017500000000154412246357067016656 0ustar jenkinsjenkins#!/bin/sh # Copyright 2012 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . set -eu filename="$1" size="$2" qemu-img create -f raw "$filename" "$size" #parted "$filename" mklabel msdos #parted "$filename" mkpart primary fat32 0% 100% /sbin/mkfs.vfat "$filename" obnam-1.6.1/dumpobjs0000755000175000017500000000261512246357067014270 0ustar jenkinsjenkins#!/usr/bin/python # Copyright (C) 2009 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . 
import os import sys import time import obnamlib def find_objids(fs): basenames = fs.listdir('.') return [x[:-len('.obj')] for x in basenames if x.endswith('.obj')] fs = obnamlib.LocalFS(sys.argv[1]) repo = obnamlib.Repository(fs, obnamlib.DEFAULT_NODE_SIZE, obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE, None, obnamlib.IDPATH_DEPTH, obnamlib.IDPATH_BITS, obnamlib.IDPATH_SKIP, time.time, 0) for objid in find_objids(fs): obj = repo.get_object(objid) print 'id %s (%s):' % (obj.id, obj.__class__.__name__) for name in obj.fieldnames(): print ' %-10s %s' % (name, repr(getattr(obj, name))) obnam-1.6.1/find-duplicate-chunks0000755000175000017500000000620212246357067016622 0ustar jenkinsjenkins#!/usr/bin/python # # Report duplicate chunks of data in a filesystem. import hashlib import os import subprocess import sys import tempfile def compute(f, chunk_size, offset): chunk = f.read(chunk_size) while len(chunk) == chunk_size: yield hashlib.md5(chunk).hexdigest() chunk = chunk[offset:] + f.read(offset) def compute_checksums(f, chunk_size, offset, dirname): for dirname, subdirs, filenames in os.walk(dirname): for filename in filenames: pathname = os.path.join(dirname, filename) if os.path.isfile(pathname) and not os.path.islink(pathname): ff = file(pathname) for checksum in compute(ff, chunk_size, offset): f.write('%s\n' % checksum) ff.close() def sort_checksums(f, checksums_name): subprocess.check_call(['sort', '-T', '.', '--batch-size', '1000', '-S', '1G', ], stdin=file(checksums_name), stdout=f) def count_duplicates(f, sorted_name): subprocess.check_call(['uniq', '-c'], stdin=file(sorted_name), stdout=f) def make_report(f, counts_name, chunk_size, offset): num_diff_checksums = 0 saved = 0 total = 0 limits = [1] counts = { 1: 0 } for line in file(counts_name): count, checksum = line.split() count = int(count) num_diff_checksums += 1 saved += (count-1) * chunk_size total += count * chunk_size while limits[-1] < count: n = limits[-1] * 10 limits.append(n) counts[n] = 0 for limit in 
limits: if count <= limit: counts[limit] += count break f.write('chunk size: %d\n' % chunk_size) f.write('offset: %d\n' % offset) f.write('#different checksums: %d\n' % num_diff_checksums) f.write('%8s %8s\n' % ('repeats', 'how many')) for limit in limits: f.write('%8d %8d\n' % (limit, counts[limit])) f.write('bytes saved by de-duplication: %d\n' % saved) f.write('%% saved: %f\n' % (100.0*saved/total)) def main(): chunk_size = int(sys.argv[1]) offset = int(sys.argv[2]) dirname = sys.argv[3] prefix = 'data-%04d-%04d' % (chunk_size, offset) checksums_name = prefix + '.checksums' sorted_name = prefix + '.sorted' counts_name = prefix + '.counts' report_name = prefix + '.report' steps = ( (checksums_name, compute_checksums, (chunk_size, offset, dirname)), (sorted_name, sort_checksums, (checksums_name,)), (counts_name, count_duplicates, (sorted_name,)), (report_name, make_report, (counts_name, chunk_size, offset)), ) for filename, func, args in steps: if not os.path.exists(filename): print 'Step:', func.__name__ fd, output_name = tempfile.mkstemp(dir='.') os.close(fd) f = file(output_name, 'w') func(*((f,) + args)) f.close() os.rename(output_name, filename) if __name__ == '__main__': main() obnam-1.6.1/lock-and-increment0000755000175000017500000000225012246357067016112 0ustar jenkinsjenkins#!/usr/bin/python # Copyright 2012 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . 
import os import sys import obnamlib dirname = sys.argv[1] count_to = int(sys.argv[2]) filename = os.path.join(dirname, 'counter') fs = obnamlib.LocalFS('/') lm = obnamlib.LockManager(fs, 60, 'lock-and-increment') for i in range(count_to): lm.lock([dirname]) if fs.exists(filename): data = fs.cat(filename) counter = int(data) counter += 1 fs.overwrite_file(filename, str(counter)) else: fs.write_file(filename, str(1)) lm.unlock([dirname]) obnam-1.6.1/meliae-show0000755000175000017500000000051712246357067014656 0ustar jenkinsjenkins#!/usr/bin/python from meliae import loader from pprint import pprint as pp import sys om = loader.load(sys.argv[1]) om.remove_expensive_references() print om.summarize() print for type_name in sys.argv[2:]: objs = om.get_all(type_name) for obj in objs[:5]: pp(obj.p) print om.summarize(obj) print obnam-1.6.1/metadata-speed0000755000175000017500000000257612246357067015331 0ustar jenkinsjenkins#!/usr/bin/python # Copyright 2010 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . 
import os import random import shutil import sys import time import obnamlib def measure(n, func): start = time.clock() for i in range(n): func() end = time.clock() return end - start def main(): n = int(sys.argv[1]) fs = obnamlib.LocalFS('.') fs.connect() metadata = obnamlib.read_metadata(fs, '.') encoded = obnamlib.encode_metadata(metadata) calibrate = measure(n, lambda: None) encode = measure(n, lambda: obnamlib.encode_metadata(metadata)) decode = measure(n, lambda: obnamlib.decode_metadata(encoded)) print 'encode: %.1f s' % (n/(encode - calibrate)) print 'decode: %.1f s' % (n/(decode - calibrate)) if __name__ == '__main__': main() obnam-1.6.1/obnam0000755000175000017500000000141312246357067013534 0ustar jenkinsjenkins#!/usr/bin/python # Copyright (C) 2009 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . import obnamlib obnamlib.App(progname='obnam', version=obnamlib.__version__).run() obnam-1.6.1/obnam-benchmark0000755000175000017500000002067312246357067015475 0ustar jenkinsjenkins#!/usr/bin/python # # Copyright 2010, 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. 
# # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . import cliapp import ConfigParser import glob import logging import os import shutil import socket import subprocess import tempfile class ObnamBenchmark(cliapp.Application): default_sizes = ['1g/100m'] keyid = '3B1802F81B321347' opers = ('backup', 'restore', 'list_files', 'forget') def add_settings(self): self.settings.string(['results'], 'put results under DIR (%default)', metavar='DIR', default='../benchmarks') self.settings.string(['obnam-branch'], 'use DIR as the obnam branch to benchmark ' '(default: %default)', metavar='DIR', default='.') self.settings.string(['larch-branch'], 'use DIR as the larch branch (default: %default)', metavar='DIR', ) self.settings.string(['seivot-branch'], 'use DIR as the seivot branch ' '(default: installed seivot)', metavar='DIR') self.settings.boolean(['with-encryption'], 'run benchmark using encryption') self.settings.string(['profile-name'], 'short name for benchmark scenario', default='unknown') self.settings.string_list(['size'], 'add PAIR to list of sizes to ' 'benchmark (e.g., 10g/1m)', metavar='PAIR') self.settings.bytesize(['file-size'], 'how big should files be?', default=4096) self.settings.integer(['generations'], 'benchmark N generations (default: %default)', metavar='N', default=5) self.settings.boolean(['use-sftp-repository'], 'access the repository over SFTP ' '(requires ssh to localhost to work)') self.settings.boolean(['use-sftp-root'], 'access the live data over SFTP ' '(requires ssh to localhost to work)') self.settings.integer(['sftp-delay'], 'add artifical delay to sftp transfers ' '(in milliseconds)') self.settings.string(['description'], 'describe benchmark') 
self.settings.boolean(['drop-caches'], 'drop kernel buffer caches') self.settings.string(['seivot-log'], 'seivot log setting') self.settings.boolean(['verify'], 'verify restores') def process_args(self, args): self.require_tmpdir() obnam_revno = self.bzr_revno(self.settings['obnam-branch']) if self.settings['larch-branch']: larch_revno = self.bzr_revno(self.settings['larch-branch']) else: larch_revno = None results = self.results_dir(obnam_revno, larch_revno) obnam_branch = self.settings['obnam-branch'] if self.settings['seivot-branch']: seivot = os.path.join(self.settings['seivot-branch'], 'seivot') else: seivot = 'seivot' generations = self.settings['generations'] tempdir = tempfile.mkdtemp() env = self.setup_gnupghome(tempdir) sizes = self.settings['size'] or self.default_sizes logging.debug('sizes: %s' % repr(sizes)) file_size = self.settings['file-size'] profile_name = self.settings['profile-name'] for pair in sizes: initial, inc = self.parse_size_pair(pair) msg = 'Profile %s, size %s inc %s' % (profile_name, initial, inc) print print msg print '-' * len(msg) print obnam_profile = os.path.join(results, 'obnam--%(op)s-%(gen)s.prof') output = os.path.join(results, 'obnam.seivot') if os.path.exists(output): print ('%s already exists, not re-running benchmark' % output) else: argv = [seivot, '--obnam-branch', obnam_branch, '--incremental-data', inc, '--file-size', str(file_size), '--obnam-profile', obnam_profile, '--generations', str(generations), '--profile-name', profile_name, '--sftp-delay', str(self.settings['sftp-delay']), '--initial-data', initial, '--output', output] if self.settings['larch-branch']: argv.extend(['--larch-branch', self.settings['larch-branch']]) if self.settings['seivot-log']: argv.extend(['--log', self.settings['seivot-log']]) if self.settings['drop-caches']: argv.append('--drop-caches') if self.settings['use-sftp-repository']: argv.append('--use-sftp-repository') if self.settings['use-sftp-root']: argv.append('--use-sftp-root') if 
self.settings['with-encryption']: argv.extend(['--encrypt-with', self.keyid]) if self.settings['description']: argv.extend(['--description', self.settings['description']]) if self.settings['verify']: argv.append('--verify') self.runcmd(argv, env=env) shutil.rmtree(tempdir) def require_tmpdir(self): if 'TMPDIR' not in os.environ: raise cliapp.AppException('TMPDIR is not set. ' 'You would probably run out of space ' 'on /tmp.') if not os.path.exists(os.environ['TMPDIR']): raise cliapp.AppException('TMPDIR points at a non-existent ' 'directory %s' % os.environ['TMPDIR']) logging.debug('TMPDIR=%s' % repr(os.environ['TMPDIR'])) @property def hostname(self): return socket.gethostname() @property def obnam_branch_name(self): obnam_branch = os.path.abspath(self.settings['obnam-branch']) return os.path.basename(obnam_branch) def results_dir(self, obnam_revno, larch_revno): parent = self.settings['results'] parts = [self.hostname, self.obnam_branch_name, str(obnam_revno)] if larch_revno: parts.append(str(larch_revno)) prefix = os.path.join(parent, "-".join(parts)) get_path = lambda counter: "%s-%d" % (prefix, counter) counter = 0 dirname = get_path(counter) while os.path.exists(dirname): counter += 1 dirname = get_path(counter) os.makedirs(dirname) return dirname def setup_gnupghome(self, tempdir): gnupghome = os.path.join(tempdir, 'gnupghome') shutil.copytree('test-gpghome', gnupghome) env = dict(os.environ) env['GNUPGHOME'] = gnupghome return env def bzr_revno(self, branch): p = subprocess.Popen(['bzr', 'revno'], cwd=branch, stdout=subprocess.PIPE) out, err = p.communicate() if p.returncode != 0: raise cliapp.AppException('bzr failed') revno = out.strip() logging.debug('bzr branch %s has revno %s' % (branch, revno)) return revno def parse_size_pair(self, pair): return pair.split('/', 1) if __name__ == '__main__': ObnamBenchmark().run() obnam-1.6.1/obnam-benchmark.1.in0000644000175000017500000000653612246357067016240 0ustar jenkinsjenkins.\" Copyright 2011 Lars Wirzenius 
.\" .\" This program is free software: you can redistribute it and/or modify .\" it under the terms of the GNU General Public License as published by .\" the Free Software Foundation, either version 3 of the License, or .\" (at your option) any later version. .\" .\" This program is distributed in the hope that it will be useful, .\" but WITHOUT ANY WARRANTY; without even the implied warranty of .\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the .\" GNU General Public License for more details. .\" .\" You should have received a copy of the GNU General Public License .\" along with this program. If not, see . .\" .TH OBNAM-BENCHMARK 1 .SH NAME obnam-benchmark \- benchmark obnam .SH SYNOPSIS .SH DESCRIPTION .B obnam-benchmark benchmarks the .BR obnam (1) backup application, by measuring how much time it takes to do a backup, restore, etc, in various scenarios. .B obnam-benchmark uses the .BR seivot (1) tool for actually running the benchmarks, but makes some helpful assumptions about things, to make it simpler to run than running .B seivot directly. .PP Benchmarks are run using two different usage profiles: .I mailspool (all files are small), and .I mediaserver (all files are big). For each profile, test data of the desired total size is generated, backed up, and then several incremental generations are backed up, each adding some more generated test data. Then other operations are run against the backup repository: restoring, listing the contents of, and removing each generation. .PP The result of the benchmark is a .I .seivot file per profile, plus a Python profiler file for each run of .BR obnam . These are stored in .IR ../benchmarks . A set of .I .seivot files can be summarized for comparison with .BR seivots-summary (1). The profiling files can be viewed with the usual Python tools: see the .B pstats module. .PP The benchmarks are run against a version of .B obnam checked out from version control. 
It is not (currently) possible to run the benchmark against an installed version of .BR obnam. Also the .I larch Python library, which .B obnam needs, needs to be checked out from version control. The .B \-\-obnam\-branch and .B \-\-larch\-branch options set the locations, if the defaults are not correct. .SH OPTIONS .SH ENVIRONMENT .TP .BR TMPDIR This variable .I must be set. It controls where the temporary files (generated test data) are stored. If this variable was not set, they'd be put into .IR /tmp , which easily fills up, to the detriment of the entire system. Thus, .B obnam-benchmark requires that the location is set explicitly. (You can still use .I /tmp if you want, but you have to set .B TMPDIR explicitly.) .SH FILES .TP .BR ../benchmarks/ The default directory where results of the benchmark are stored, in a subdirectory named after the branch and revision numbers. .SH EXAMPLE To run a small benchmark: .IP TMPDIR=/var/tmp obnam-benchmark --size=10m/1m .PP To run a benchmark using existing data: .IP TMPDIR=/var/tmp obnam-benchmark --use-existing=$HOME/Mail .PP To view the currently available benchmark results: .IP seivots-summary ../benchmarks/*/*mail*.seivot | less -S .br seivots-summary ../benchmarks/*/*media*.seivot | less -S .PP (You need to run .B seivots-summary once per usage profile.) .SH "SEE ALSO" .BR obnam (1), .BR seivot (1), .BR seivots-summary (1). obnam-1.6.1/obnam-viewprof0000755000175000017500000000176412246357067015374 0ustar jenkinsjenkins#!/usr/bin/python # Copyright 2010 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . import pstats import sys if len(sys.argv) not in [2, 3]: sys.stderr.write('Usage: obnam-viewprof foo.prof [sort-order]\n') sys.exit(1) if len(sys.argv) == 3: order = sys.argv[2] else: order = 'cumulative' p = pstats.Stats(sys.argv[1]) p.strip_dirs() p.sort_stats(order) p.print_stats() p.print_callees() obnam-1.6.1/obnam-viewprof.10000644000175000017500000000255312246357067015535 0ustar jenkinsjenkins.\" Copyright 2012 Lars Wirzenius .\" .\" This program is free software: you can redistribute it and/or modify .\" it under the terms of the GNU General Public License as published by .\" the Free Software Foundation, either version 3 of the License, or .\" (at your option) any later version. .\" .\" This program is distributed in the hope that it will be useful, .\" but WITHOUT ANY WARRANTY; without even the implied warranty of .\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the .\" GNU General Public License for more details. .\" .\" You should have received a copy of the GNU General Public License .\" along with this program. If not, see . .\" .TH OBNAM-VIEWPROF 1 .SH NAME obnam-viewprof \- show Python profiler output .SH SYNOPSIS .B obnam-viewprof .I profile .RI [ sort-order ] .SH DESCRIPTION .B obnam-viewprof shows a plain text version of Python profiler output. You can generate such output from Obnam by setting the .B OBNAM_PROFILE environment variable to a filename. The profile will be written to that filename, and you should give it to .B obnam-viewprof as an argument. .PP The .I sort-order argument defaults to .B cumulative and can be any of the orderings that the Python pstats library supports. .PP .B obnam-viewprof is mainly useful for those developing .BR obnam (1). .SH "SEE ALSO" .BR obnam (1). 
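To complement the obnam-viewprof man page above, here is a minimal, self-contained Python sketch of the same pstats workflow. The profile file path and the profiled expression are illustrative stand-ins generated on the spot; with Obnam you would instead load the file named by the OBNAM_PROFILE environment variable.

```python
# Sketch of the pstats workflow that obnam-viewprof wraps.
# The profile is generated here with cProfile so the example is
# self-contained; the file name is a placeholder, not anything
# Obnam itself writes.
import cProfile
import os
import pstats
import tempfile

prof = os.path.join(tempfile.mkdtemp(), 'example.prof')
cProfile.run('sorted(range(10000), reverse=True)', prof)

p = pstats.Stats(prof)
p.strip_dirs()               # drop leading path info, as obnam-viewprof does
p.sort_stats('cumulative')   # the default sort order obnam-viewprof uses
p.print_stats(5)             # show only the five most expensive entries
```

Any other ordering accepted by `pstats.Stats.sort_stats` (for example `time` or `calls`) can be passed in place of `cumulative`, matching the optional sort-order argument of obnam-viewprof.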
obnam-1.6.1/obnam.1.in0000644000175000017500000004443612246357067014305 0ustar jenkinsjenkins.\" Copyright 2010-2013 Lars Wirzenius .\" .\" This program is free software: you can redistribute it and/or modify .\" it under the terms of the GNU General Public License as published by .\" the Free Software Foundation, either version 3 of the License, or .\" (at your option) any later version. .\" .\" This program is distributed in the hope that it will be useful, .\" but WITHOUT ANY WARRANTY; without even the implied warranty of .\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the .\" GNU General Public License for more details. .\" .\" You should have received a copy of the GNU General Public License .\" along with this program. If not, see . .TH OBNAM 1 .SH NAME obnam \- make, restore, and manipulate backups .SH SYNOPSIS .SH DESCRIPTION .B obnam makes, restores, manipulates, and otherwise deals with backups. It can store backups on a local disk or to a server via sftp. Every backup generation looks like a fresh snapshot, but is really incremental: the user does not need to worry whether it's a full backup or not. Only changed data is backed up, and if a chunk of data is already backed up in another file, that data is re-used. .PP The place where backed up data is placed is called the backup repository. A repository may be, for example, a directory on an sftp server, or a directory on a USB hard disk. A single repository may contain backups from several clients. Their data will intermingle as if they were using separate repositories, but if one client backs up a file, the others may re-use the data. .PP .B obnam command line syntax consists of a .I command possibly followed by arguments. The commands are listed below. .IP \(bu .B backup makes a new backup. The first time it is run, it makes a full backup, after that an incremental one. .IP \(bu .B restore is the opposite of a backup. It copies backed up data from the backup repository to a target directory.
You can restore everything in a generation, or just selected files. .IP \(bu .B clients lists the clients that are backed up to the repository. .IP \(bu .B generations lists every backup generation for a given client, plus some metadata about the generation. .IP \(bu .B genids lists the identifier for every backup generation for a given client. No other information is shown. This can be useful for scripting. .IP \(bu .B ls lists the contents of a given generation, similar to .BR "ls \-lAR" . .IP \(bu .B verify compares data in the backup with actual user data, and makes sure they are identical. It is most useful to run immediately after a backup, to check that it actually worked. It can be run at any time, but if the user data has changed, verification fails even though the backup is OK. .IP \(bu .B forget removes backup generations that are no longer wanted, so that they don't use disk space. Note that after a backup generation is removed the data can't be restored anymore. You can either specify the generations to remove by listing them on the command line, or use the .B \-\-keep option to specify a policy for what to keep (everything else will be removed). .IP \(bu .B fsck checks the internal consistency of the backup repository. It verifies that all clients, generations, directories, files, and all file contents still exist in the backup repository. It may take quite a long time to run. .IP \(bu .B force\-lock removes a lock file for a client in the repository. You should only force a lock if you are sure no-one is accessing that client's data in the repository. A dangling lock might happen, for example, if obnam loses its network connection to the backup repository. .IP \(bu .B client\-keys lists the encryption key associated with each client. .IP \(bu .B list\-keys lists the keys that can access the repository, and which toplevel directories each key can access. Some of the toplevel directories are shared between clients, others are specific to a client.
.IP \(bu .B list\-toplevels is like .BR list\-keys , but lists toplevels and which keys can access them. .IP \(bu .B add\-key adds an encryption key to the repository. By default, the key is added only to the shared toplevel directories, but it can also be added to specific clients: list the names of the clients on the command line. The key is given with the .B \-\-keyid option. Whoever has access to the secret key corresponding to the key id can access the backup repository (the shared toplevels plus specified clients). .IP \(bu .B remove\-key removes a key from the shared toplevel directories, plus any clients specified on the command line. .IP \(bu .B nagios\-last\-backup\-age is a check that exits with non-zero return if a backup age exceeds a certain threshold. It is suitable for use as a check plugin for nagios. Thresholds can be given with the .B \-\-warn-age and .B \-\-critical-age options. .IP \(bu .B diff compares two generations and lists files differing between them. Every output line will be prefixed either by a plus sign (+) for files that were added, a minus sign (-) for files that have been removed, or an asterisk (*) for files that have changed. If only one generation ID is specified on the command line, that generation will be compared with its direct predecessor. If two IDs have been specified, all changes between those two generations will be listed. .IP \(bu .B mount makes the backup repository available via a read-only FUSE filesystem. This means you can look at backed up data using normal tools, such as your GUI file manager, or command line tools such as .BR ls (1), .BR diff (1), and .BR cp (1). You can't make new backups with the mount subcommand, but you can restore data easily. .IP You need to have the FUSE utilities and have permission to use FUSE for this to work. The details will vary between operating systems; in Debian, install the package .I fuse and add yourself to the .I fuse group (you may need to log out and back in again).
.SS "Making backups" When you run a backup, .B obnam uploads data into the backup repository. The data is divided into chunks, and if a chunk already exists in the backup repository, it is not uploaded again. This allows .B obnam to deal with files that have been changed or renamed since the previous backup run. It also allows several backup clients to avoid uploading the same data. If, for example, everyone in the office has a copy of the same sales brochures, only one copy needs to be stored in the backup repository. .PP Every backup run is a .IR generation . In addition, .B obnam will make .I checkpoint generations every now and then. These are exactly like normal generations, but are not guaranteed to be a complete snapshot of the live data. If the backup run needs to be aborted in the middle, the next backup run can continue from the latest checkpoint, avoiding the need to start completely over. .PP If one backup run drops a backup root directory, the older generations will still keep it: nothing changes in the old generations just because there is a new one. If the root was dropped by mistake, it can be added back and the next backup run will re-use the existing data in the backup repository, and will only back up the file metadata (filenames, permissions, etc). .SS "Verifying backups" What good is a backup system you cannot rely on? How can you rely on something you cannot test? The .B "obnam verify" command checks that data in the backup repository matches actual user data. It retrieves one or more files from the repository and compares them to the user data. This is essentially the same as doing a restore, then comparing restored files with the original files using .BR cmp (1), but easier to use.
.PP The output lists files that fail verification for some reason. If you verify everything, it is likely that some files (e.g., parent directories of backup root) may have changed without it being a problem. Note that you will need to specify the whole path to the files or directories to be verified, not relative to the backup root. You still need to specify at least one of the backup roots on the command line or via the .B \-\-root option so that obnam will find the filesystem, in case it is a remote one. .SS "URL syntax" Whenever obnam accepts a URL, it can be either a local pathname, or an .B sftp URL. An sftp URL has the following form: .IP \fBsftp://\fR[\fIuser\fR@]\fIdomain\fR[:\fIport\fR]\fB/path .PP where .I domain is a normal Internet domain name for a server, .I user is your username on that server, .I port is an optional port number, and .I path is a pathname on the server side. Like .BR bzr (1), but unlike the sftp URL standard, the pathname is absolute, unless it starts with .B /~/ in which case it is relative to the user's home directory on the server. .PP See the EXAMPLE section for examples of URLs. .PP You can use .B sftp URLs for the repository, or the live data (root), but note that due to limitations in the protocol, and its implementation in the .B paramiko library, some things will not work very well for accessing live data over .BR sftp . Most importantly, the handling of hardlinks is rather suboptimal. For live data access, you should not end the URL with .B /~/ and should append a dot at the end in this special case. .SS "Generation specifications" When not using the latest generation, you will need to specify which one you need. This will be done with the .B \-\-generation option, which takes a generation specification as its argument. The specification is either the word .IR latest , meaning the latest generation (also the default), or a number.
See the .B generations command to see what generations are available, and what their numbers are. .SS "Policy for keeping and removing backup generations" The .B forget command can follow a policy to automatically keep some and remove other backup generations. The policy is set with the .BR \-\-keep =\fIPOLICY option. .PP .I POLICY is a comma-separated list of rules. Each rule consists of a count and a time period. The time periods are .BR h , .BR d , .BR w , .BR m , and .BR y , for hour, day, week, month, and year. .PP A policy of .I 30d means to keep the latest backup for each day when a backup was made, and keep the last 30 such backups. Any backup matched by any policy rule is kept, and any backups in between will be removed, as will any backups older than the oldest kept backup. .PP As an example, assume backups are taken every hour, on the hour: at 00:00, 01:00, 02:00, and so on, until 23:00. If the .B forget command is run at 23:15, with the above policy, it will keep the backup taken at 23:00 on each day, and remove every other backup that day. It will also remove backups older than 30 days. .PP If backups are made every other day, at noon, .B forget would keep the last 30 backups, or 60 days worth of backups, with the above policy. .PP Note that obnam will only inspect timestamps in the backup repository, and does not care what the actual current time is. This means that if you stop making new backups, the existing ones won't be removed automatically. In essence, obnam pretends the current time is just after the latest backup when .B forget is run. .PP The rules can be given in any order, but will be sorted to ascending order of time period before being applied. (It is an error to give two rules for the same period.) A backup generation is kept if it matches any rule. .PP For example, assume the same backup frequency as above, but a policy of .IR 30d,52w .
This will keep the newest daily backup for each day for thirty days,
.I and
the newest weekly backup for 52 weeks.
Because the hourly backups will be removed daily,
before they have a chance to get saved by a weekly rule,
the effect is that the 23:00 o'clock backup for each day is saved
for a month,
and the 23:00 backup on Sundays is saved for a year.
.PP
If, instead, you use a policy of
.IR 72h,30d,52w ,
.B obnam
would keep the last 72 hourly backups,
and the last backup of each calendar day for 30 days,
and the last backup of each calendar week for 52 weeks.
If the backup frequency was once per day,
.B obnam
would keep the backup of each calendar hour for which a backup
was made, for 72 such backups.
In other words, it would effectively keep the last 72 daily backups.
.PP
Sound confusing?
Just wonder how confused the developer was when writing the code.
.PP
If no policy is given,
.B forget
will keep everything.
.PP
A typical policy might be
.IR 72h,7d,5w,12m ,
which means:
keep the last 72 hourly backups,
the last 7 daily backups,
the last 5 weekly backups and the last 12 monthly backups.
If the backups are systematically run on an hourly basis,
this will mean keeping hourly backups for three days,
daily backups for a week,
weekly backups for a month,
and monthly backups for a year.
.PP
The way the policy works is a bit complicated.
Run
.B forget
with the
.B \-\-pretend
option to make sure you're removing the right ones.
.\"
.SS "Using encryption"
.B obnam
can encrypt all the data it writes to the backup repository.
It uses
.BR gpg (1)
to do the encryption.
You need to create a key pair using
.B "gpg --gen-key"
(or use an existing one),
and then tell
.B obnam
about it using the
.B \-\-encrypt\-with
option.
.SS "Configuration files"
.B obnam
will look for configuration files in a number of locations.
See the FILES section for a list.
All files are treated as if they were one
with the contents of all files catenated.
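The catenation behaviour can be sketched with Python's stock INI parser: read every existing file into one parser, in order, with later files overriding or extending earlier ones, and missing files silently skipped. This is an illustrative sketch only, not Obnam's actual cliapp-based configuration loading; the helper name is invented here.

```python
import os
import tempfile  # used only by the usage example below

try:
    import configparser                  # Python 3
except ImportError:
    import ConfigParser as configparser  # Python 2, as in Obnam's era

def load_config(paths):
    """Merge the [config] sections of all existing files, in order.

    Files that do not exist are silently skipped, matching the manual
    page's note that missing configuration files are not an error.
    """
    parser = configparser.RawConfigParser()
    parser.read(paths)  # read() ignores paths that cannot be opened
    if parser.has_section('config'):
        return dict(parser.items('config'))
    return {}
```

Settings from a site-wide file and a per-user file end up in one dictionary, as if both files had been catenated into a single configuration.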
.PP
The files are in INI format,
and only the
.I [config]
section is used
(any other sections are ignored).
.PP
The long names of options are used as keys for configuration
variables.
Any setting that can be set from the command line can be set in a
configuration file,
in the
.I [config]
section.
.PP
For example,
the options in the following command line:
.sp 1
.RS
obnam --repository=/backup --exclude='\.wav$' backup
.RE
.sp 1
could be replaced with the following configuration file:
.sp 1
.nf
.RS
[config]
repository: /backup
exclude: \.wav$
.RE
.fi
.sp 1
(You can use either
.I foo=value
or
.I foo: value
syntax in the files.)
.PP
The only unusual thing about the files is the way options that can be
used many times are expressed.
All values are put in a single logical line,
separated by commas (and optionally spaces as well).
For example:
.sp 1
.RS
.nf
[config]
exclude = foo, bar, \\.mp3$
.fi
.RE
.sp 1
The above has three values for the
.B exclude
option:
any files that contain the words
.I foo
or
.I bar
anywhere in the fully qualified pathname,
or files with names ending with a period and
.I mp3
(because the exclusions are regular expressions).
.PP
A long logical line can be broken into several physical ones,
by starting a new line at white space,
and indenting the continuation lines:
.sp 1
.RS
.nf
[config]
exclude = foo,
  bar,
  \\.mp3$
.fi
.RE
.sp 1
The above example adds three exclusion patterns.
.SS "Multiple clients and locking"
.B obnam
supports sharing a repository between multiple clients.
The clients can share the file contents (chunks),
so that if client A backs up a large file,
and client B has the same file,
then B does not need to upload the large file to the repository
a second time.
For this to work without confusion,
the clients use a simple locking protocol
that allows only one client at a time to modify
the shared data structures.
Locks do not prevent read-only access at the same time:
this allows you to restore while someone else is backing up.
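A one-writer protocol like this can be sketched as a single exclusive lock file per repository: writers must create it before touching shared data structures, while readers never take it and so are never blocked. This is a simplified illustration under assumptions made here (the class and file names are invented); it is not obnamlib's actual LockManager.

```python
import errno
import os
import tempfile  # used only by the usage example below

class RepositoryLock(object):
    """One exclusive lock file guarding the shared data structures.

    Creating the file with O_EXCL is atomic, so at most one writer
    can hold the lock; read-only operations never call lock().
    """

    def __init__(self, dirname):
        self.lockfile = os.path.join(dirname, 'lock')

    def lock(self):
        try:
            fd = os.open(self.lockfile,
                         os.O_WRONLY | os.O_CREAT | os.O_EXCL)
        except OSError as e:
            if e.errno == errno.EEXIST:
                raise RuntimeError('repository is locked by another client')
            raise
        os.close(fd)

    def unlock(self):
        os.remove(self.lockfile)
```

A second client attempting `lock()` while the file exists fails immediately; a real implementation would retry until a timeout, as the `--lock-timeout` option suggests.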
.PP
Sometimes a read-only operation happens to access a data structure
at the same time as it is being modified.
This can result in a crash.
It will not result in corrupt data,
or incorrect restores.
However, you may need to restart the read-only operation after a crash.
.SS "Repository format conversion"
The
.B convert5to6
subcommand converts a repository of format 5 to format 6.
It is somewhat dangerous!
It modifies the repository in place,
so you should be careful.
You should do a hardlink copy of the repository before converting:
.IP
cp -al repo repo.format5
.PP
You should also run this with local filesystem access to the
repository,
rather than sftp,
to avoid abysmal performance.
.\"---------------------------------------------------------------------
.SH OPTIONS
.SS "Option values"
The
.I SIZE
value in options mentioned above specifies a size in bytes,
with optional suffixes to indicate kilobytes (k), kibibytes (Ki),
megabytes (M), mebibytes (Mi), gigabytes (G), gibibytes (Gi),
terabytes (T), tebibytes (Ti).
The suffixes are case-insensitive.
.\" ------------------------------------------------------------------
.SH "EXIT STATUS"
.B obnam
will exit with zero if everything went well,
and non-zero otherwise.
.SH ENVIRONMENT
.B obnam
will pass on the environment it gets from its parent,
without modification.
It does not obey any unusual environment variables,
but it does obey the usual ones when running external programs,
creating temporary files, etc.
.SH FILES
.I /etc/obnam.conf
.br
.I /etc/obnam/*.conf
.br
.I ~/.obnam.conf
.br
.I ~/.config/obnam/*.conf
.RS
Configuration files for
.BR obnam .
It is not an error for any or all of the files to not exist.
.RE
.SH EXAMPLE
To back up your home directory to a server:
.IP
.nf
obnam backup \-\-repository sftp://your.server/~/backups $HOME
.fi
.PP
To restore your latest backup from the server:
.IP
.nf
obnam restore \-\-repository sftp://your.server/~/backups \\
\-\-to /var/tmp/my.home.dir
.fi
.PP
To restore just one file or directory:
.IP
.nf
obnam restore \-\-repository sftp://your.server/~/backups \\
\-\-to /var/tmp/my.home.dir $HOME/myfile.txt
.fi
.PP
Alternatively, mount the backup repository using the FUSE filesystem
(note that the
.B \-\-to
option is necessary and that the
.B \-\-viewmode
option is usually a good idea):
.IP
.nf
mkdir my-repo
obnam mount \-\-repository sftp://your.server/~/backups \\
\-\-to my-repo \-\-viewmode multiple
cp my-repo/latest/$HOME/myfile.txt
fusermount -u my-repo
.fi
.PP
To check that the backup worked:
.IP
.nf
obnam verify \-\-repository sftp://your.server/~/backups \\
/path/to/file
.fi
.PP
To remove old backups, keeping the newest backup for each day
for ten years:
.IP
.nf
obnam forget \-\-repository sftp://your.server/~/backups \\
\-\-keep 3650d
.fi
.PP
To verify that the backup repository is OK:
.IP
.nf
obnam fsck \-\-repository sftp://your.server/~/backups
.fi
.PP
To view the backed up files in the backup repository using FUSE:
.IP
.nf
obnam mount \-\-to my-fuse \-\-viewmode multiple
ls -lh my-fuse
fusermount -u my-fuse
.fi
.SH "SEE ALSO"
.TP
.BR cliapp (5)
obnam-1.6.1/obnamlib/0000755000175000017500000000000012246357067014276 5ustar jenkinsjenkinsobnam-1.6.1/obnamlib/__init__.py0000644000175000017500000001013612246357067016410 0ustar jenkinsjenkins# Copyright (C) 2009-2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version.
# # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . import cliapp __version__ = '1.6.1' # Import _obnam if it is there. We need to be able to do things without # it, especially at build time, while we're generating manual pages. # If _obnam is not there, substitute a dummy that throws an exception # if used. class DummyExtension(object): def __getattr__(self, name): raise Exception('Trying to use _obnam, but that was not found.') try: import _obnam except ImportError: _obnam = DummyExtension() from pluginmgr import PluginManager class Error(cliapp.AppException): pass class EncryptionError(Error): pass DEFAULT_NODE_SIZE = 256 * 1024 # benchmarked on 2011-09-01 DEFAULT_CHUNK_SIZE = 1024 * 1024 # benchmarked on 2011-09-01 DEFAULT_UPLOAD_QUEUE_SIZE = 128 DEFAULT_LRU_SIZE = 256 DEFAULT_CHUNKIDS_PER_GROUP = 1024 DEFAULT_NAGIOS_WARN_AGE = '27h' DEFAULT_NAGIOS_CRIT_AGE = '8d' # The following values have been determined empirically on a laptop # with an encrypted ext4 filesystem. Other values might be better for # other situations. IDPATH_DEPTH = 3 IDPATH_BITS = 12 IDPATH_SKIP = 13 # Maximum identifier for clients, chunks, files, etc. This is the largest # unsigned 64-bit value. In various places we assume 64-bit field sizes # for on-disk data structures. 
MAX_ID = 2**64 - 1 option_group = { 'perf': 'Performance tweaking', 'devel': 'Development of Obnam itself', } from sizeparse import SizeSyntaxError, UnitNameError, ByteSizeParser from encryption import (generate_symmetric_key, encrypt_symmetric, decrypt_symmetric, get_public_key, get_public_key_user_ids, Keyring, SecretKeyring, encrypt_with_keyring, decrypt_with_secret_keys, SymmetricKeyCache) from hooks import Hook, MissingFilterError, FilterHook, HookManager from pluginbase import ObnamPlugin from vfs import VirtualFileSystem, VfsFactory, VfsTests from vfs_local import LocalFS from metadata import (read_metadata, set_metadata, Metadata, metadata_fields, metadata_verify_fields, encode_metadata, decode_metadata) from repo_interface import ( RepositoryInterface, RepositoryInterfaceTests, RepositoryClientAlreadyExists, RepositoryClientDoesNotExist, RepositoryClientListNotLocked, RepositoryClientListLockingFailed, RepositoryClientLockingFailed, RepositoryClientNotLocked, RepositoryClientKeyNotAllowed, RepositoryClientGenerationUnfinished, RepositoryGenerationKeyNotAllowed, RepositoryGenerationDoesNotExist, RepositoryClientHasNoGenerations, RepositoryFileDoesNotExistInGeneration, RepositoryFileKeyNotAllowed, RepositoryChunkDoesNotExist, RepositoryChunkContentNotInIndexes, RepositoryChunkIndexesNotLocked, RepositoryChunkIndexesLockingFailed, REPO_CLIENT_TEST_KEY, REPO_GENERATION_TEST_KEY, REPO_FILE_TEST_KEY, REPO_FILE_MTIME, REPO_FILE_INTEGER_KEYS) from repo_dummy import RepositoryFormatDummy from repo_fmt_6 import RepositoryFormat6 from repo_tree import RepositoryTree from chunklist import ChunkList from clientlist import ClientList from checksumtree import ChecksumTree from clientmetadatatree import ClientMetadataTree from lockmgr import LockManager from repo import Repository, LockFail, BadFormat from forget_policy import ForgetPolicy from app import App __all__ = locals() obnam-1.6.1/obnamlib/app.py0000644000175000017500000002152612246357067015436 0ustar 
jenkinsjenkins# Copyright (C) 2009, 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . import cliapp import larch import logging import os import socket import StringIO import sys import time import tracing import ttystatus import obnamlib class App(cliapp.Application): '''Main program for backup program.''' def add_settings(self): devel_group = obnamlib.option_group['devel'] perf_group = obnamlib.option_group['perf'] self.settings.string(['repository', 'r'], 'name of backup repository') self.settings.string(['client-name'], 'name of client (%default)', default=self.deduce_client_name()) self.settings.bytesize(['node-size'], 'size of B-tree nodes on disk; only affects new ' 'B-trees so you may need to delete a client ' 'or repository to change this for existing ' 'repositories ' '(default: %default)', default=obnamlib.DEFAULT_NODE_SIZE, group=perf_group) self.settings.bytesize(['chunk-size'], 'size of chunks of file data backed up ' '(default: %default)', default=obnamlib.DEFAULT_CHUNK_SIZE, group=perf_group) self.settings.bytesize(['upload-queue-size'], 'length of upload queue for B-tree nodes ' '(default: %default)', default=obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE, group=perf_group) self.settings.bytesize(['lru-size'], 'size of LRU cache for B-tree nodes ' '(default: %default)', default=obnamlib.DEFAULT_LRU_SIZE, group=perf_group) self.settings.string_list(['trace'], 'add to filename patters for which 
trace ' 'debugging logging happens') self.settings.integer(['idpath-depth'], 'depth of chunk id mapping', default=obnamlib.IDPATH_DEPTH, group=perf_group) self.settings.integer(['idpath-bits'], 'chunk id level size', default=obnamlib.IDPATH_BITS, group=perf_group) self.settings.integer(['idpath-skip'], 'chunk id mapping lowest bits skip', default=obnamlib.IDPATH_SKIP, group=perf_group) self.settings.boolean(['quiet'], 'be silent') self.settings.boolean(['verbose'], 'be verbose') self.settings.boolean(['pretend', 'dry-run', 'no-act'], 'do not actually change anything (works with ' 'backup, forget and restore only, and may only ' 'simulate approximately real behavior)') self.settings.string(['pretend-time'], 'pretend it is TIMESTAMP (YYYY-MM-DD HH:MM:SS); ' 'this is only useful for testing purposes', metavar='TIMESTAMP', group=devel_group) self.settings.integer(['lock-timeout'], 'when locking in the backup repository, ' 'wait TIMEOUT seconds for an existing lock ' 'to go away before giving up', metavar='TIMEOUT', default=60) self.settings.integer(['crash-limit'], 'artificially crash the program after COUNTER ' 'files written to the repository; this is ' 'useful for crash testing the application, ' 'and should not be enabled for real use; ' 'set to 0 to disable (disabled by default)', metavar='COUNTER', group=devel_group) # The following needs to be done here, because it needs # to be done before option processing. This is a bit ugly, # but the best we can do with the current cliapp structure. # Possibly cliapp will provide a better hook for us to use # later on, but this is reality now. 
self.setup_ttystatus() self.pm = obnamlib.PluginManager() self.pm.locations = [self.plugins_dir()] self.pm.plugin_arguments = (self,) self.setup_hooks() self.fsf = obnamlib.VfsFactory() self.pm.load_plugins() self.pm.enable_plugins() self.hooks.call('plugins-loaded') self.settings['log-level'] = 'info' def deduce_client_name(self): return socket.gethostname() def setup_hooks(self): self.hooks = obnamlib.HookManager() self.hooks.new('plugins-loaded') self.hooks.new('config-loaded') self.hooks.new('shutdown') # The Repository class defines some hooks, but the class # won't be instantiated until much after plugins are enabled, # and since all hooks must be defined when plugins are enabled, # we create one instance here, which will immediately be destroyed. # FIXME: This is fugly. obnamlib.Repository(None, 1000, 1000, 100, self.hooks, 10, 10, 10, self.time, 0, '') def plugins_dir(self): return os.path.join(os.path.dirname(obnamlib.__file__), 'plugins') def setup_logging(self): log = self.settings['log'] if log and log != 'syslog' and not os.path.exists(log): fd = os.open(log, os.O_WRONLY | os.O_CREAT, 0600) os.close(fd) cliapp.Application.setup_logging(self) def process_args(self, args): try: if self.settings['quiet']: self.ts.disable() for pattern in self.settings['trace']: tracing.trace_add_pattern(pattern) self.hooks.call('config-loaded') cliapp.Application.process_args(self, args) self.hooks.call('shutdown') except larch.Error, e: logging.critical(str(e)) sys.stderr.write('ERROR: %s\n' % str(e)) sys.exit(1) def setup_ttystatus(self): self.ts = ttystatus.TerminalStatus(period=0.1) if self.settings['quiet']: self.ts.disable() def open_repository(self, create=False, repofs=None): # pragma: no cover logging.debug('opening repository (create=%s)' % create) tracing.trace('repofs=%s' % repr(repofs)) repopath = self.settings['repository'] if repofs is None: repofs = self.fsf.new(repopath, create=create) if self.settings['crash-limit'] > 0: repofs.crash_limit = 
self.settings['crash-limit'] repofs.connect() else: repofs.reinit(repopath) return obnamlib.Repository(repofs, self.settings['node-size'], self.settings['upload-queue-size'], self.settings['lru-size'], self.hooks, self.settings['idpath-depth'], self.settings['idpath-bits'], self.settings['idpath-skip'], self.time, self.settings['lock-timeout'], self.settings['client-name']) def time(self): '''Return current time in seconds since epoch. This is a wrapper around time.time() so that it can be overridden with the --pretend-time setting. ''' s = self.settings['pretend-time'] if s: t = time.strptime(s, '%Y-%m-%d %H:%M:%S') return time.mktime(t) else: return time.time() obnam-1.6.1/obnamlib/checksumtree.py0000644000175000017500000000573712246357067017346 0ustar jenkinsjenkins# Copyright 2010 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . import struct import tracing import obnamlib class ChecksumTree(obnamlib.RepositoryTree): '''Repository map of checksum to integer id. The checksum might be, for example, an MD5 one (as returned by hashlib.md5().digest()). The id would be a chunk id. 
''' def __init__(self, fs, name, checksum_length, node_size, upload_queue_size, lru_size, hooks): tracing.trace('new ChecksumTree name=%s' % name) self.fmt = '!%dsQQ' % checksum_length key_bytes = struct.calcsize(self.fmt) obnamlib.RepositoryTree.__init__(self, fs, name, key_bytes, node_size, upload_queue_size, lru_size, hooks) self.keep_just_one_tree = True def key(self, checksum, chunk_id, client_id): return struct.pack(self.fmt, checksum, chunk_id, client_id) def unkey(self, key): return struct.unpack(self.fmt, key) def add(self, checksum, chunk_id, client_id): tracing.trace('checksum=%s', repr(checksum)) tracing.trace('chunk_id=%s', chunk_id) tracing.trace('client_id=%s', client_id) self.start_changes() key = self.key(checksum, chunk_id, client_id) self.tree.insert(key, '') def find(self, checksum): if self.init_forest() and self.forest.trees: minkey = self.key(checksum, 0, 0) maxkey = self.key(checksum, obnamlib.MAX_ID, obnamlib.MAX_ID) t = self.forest.trees[-1] pairs = t.lookup_range(minkey, maxkey) return [self.unkey(key)[1] for key, value in pairs] else: return [] def remove(self, checksum, chunk_id, client_id): tracing.trace('checksum=%s', repr(checksum)) tracing.trace('chunk_id=%s', chunk_id) tracing.trace('client_id=%s', client_id) self.start_changes() key = self.key(checksum, chunk_id, client_id) self.tree.remove_range(key, key) def chunk_is_used(self, checksum, chunk_id): '''Is a given chunk used by anyone?''' if self.init_forest() and self.forest.trees: minkey = self.key(checksum, chunk_id, 0) maxkey = self.key(checksum, chunk_id, obnamlib.MAX_ID) t = self.forest.trees[-1] return not t.range_is_empty(minkey, maxkey) else: return False obnam-1.6.1/obnamlib/checksumtree_tests.py0000644000175000017500000000517312246357067020562 0ustar jenkinsjenkins# Copyright 2010 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, 
either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . import hashlib import shutil import tempfile import unittest import obnamlib class ChecksumTreeTests(unittest.TestCase): def setUp(self): self.tempdir = tempfile.mkdtemp() fs = obnamlib.LocalFS(self.tempdir) self.hooks = obnamlib.HookManager() self.hooks.new('repository-toplevel-init') self.checksum = hashlib.md5('foo').digest() self.tree = obnamlib.ChecksumTree(fs, 'x', len(self.checksum), obnamlib.DEFAULT_NODE_SIZE, obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE, obnamlib.DEFAULT_LRU_SIZE, self) def tearDown(self): self.tree.commit() shutil.rmtree(self.tempdir) def test_is_empty_initially(self): self.assertEqual(self.tree.find(self.checksum), []) def test_finds_checksums(self): self.tree.add(self.checksum, 1, 3) self.tree.add(self.checksum, 2, 4) self.assertEqual(sorted(self.tree.find(self.checksum)), [1, 2]) def test_finds_only_the_right_checksums(self): self.tree.add(self.checksum, 1, 2) self.tree.add(self.checksum, 3, 4) self.tree.add(hashlib.md5('bar').digest(), 5, 6) self.assertEqual(sorted(self.tree.find(self.checksum)), [1, 3]) def test_removes_checksum(self): self.tree.add(self.checksum, 1, 3) self.tree.add(self.checksum, 2, 4) self.tree.remove(self.checksum, 2, 4) self.assertEqual(self.tree.find(self.checksum), [1]) def test_adds_same_id_only_once(self): self.tree.add(self.checksum, 1, 2) self.tree.add(self.checksum, 1, 2) self.assertEqual(self.tree.find(self.checksum), [1]) def test_unknown_chunk_is_not_used(self): self.assertFalse(self.tree.chunk_is_used(self.checksum, 0)) def test_known_chunk_is_used(self): self.tree.add(self.checksum, 0, 1) 
self.assertTrue(self.tree.chunk_is_used(self.checksum, 0)) obnam-1.6.1/obnamlib/chunklist.py0000644000175000017500000000412312246357067016654 0ustar jenkinsjenkins# Copyright 2010 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . import hashlib import struct import random import tracing import obnamlib class ChunkList(obnamlib.RepositoryTree): '''Repository's list of chunks. The list maps a chunk id to its checksum. The list is implemented as a B-tree, with the 64-bit chunk id as the key, and the checksum as the value. 
''' def __init__(self, fs, node_size, upload_queue_size, lru_size, hooks): tracing.trace('new ChunkList') self.fmt = '!Q' self.key_bytes = struct.calcsize(self.fmt) obnamlib.RepositoryTree.__init__(self, fs, 'chunklist', self.key_bytes, node_size, upload_queue_size, lru_size, hooks) self.keep_just_one_tree = True def key(self, chunk_id): return struct.pack(self.fmt, chunk_id) def add(self, chunk_id, checksum): tracing.trace('chunk_id=%s', chunk_id) tracing.trace('checksum=%s', repr(checksum)) self.start_changes() self.tree.insert(self.key(chunk_id), checksum) def get_checksum(self, chunk_id): if self.init_forest() and self.forest.trees: t = self.forest.trees[-1] return t.lookup(self.key(chunk_id)) raise KeyError(chunk_id) def remove(self, chunk_id): tracing.trace('chunk_id=%s', chunk_id) self.start_changes() key = self.key(chunk_id) self.tree.remove_range(key, key) obnam-1.6.1/obnamlib/chunklist_tests.py0000644000175000017500000000354412246357067020104 0ustar jenkinsjenkins# Copyright 2010 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . 
import shutil import tempfile import unittest import obnamlib class ChunkListTests(unittest.TestCase): def setUp(self): self.tempdir = tempfile.mkdtemp() fs = obnamlib.LocalFS(self.tempdir) self.hooks = obnamlib.HookManager() self.hooks.new('repository-toplevel-init') self.list = obnamlib.ChunkList(fs, obnamlib.DEFAULT_NODE_SIZE, obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE, obnamlib.DEFAULT_LRU_SIZE, self) def tearDown(self): shutil.rmtree(self.tempdir) def test_raises_keyerror_for_missing_chunk(self): self.assertRaises(KeyError, self.list.get_checksum, 0) def test_adds_chunk(self): self.list.add(0, 'checksum') self.assertEqual(self.list.get_checksum(0), 'checksum') def test_adds_second_chunk(self): self.list.add(0, 'checksum') self.list.add(1, 'checksum1') self.assertEqual(self.list.get_checksum(1), 'checksum1') def test_removes_chunk(self): self.list.add(0, 'checksum') self.list.remove(0) self.assertRaises(KeyError, self.list.get_checksum, 0) obnam-1.6.1/obnamlib/clientlist.py0000644000175000017500000001230212246357067017020 0ustar jenkinsjenkins# Copyright 2010 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . import hashlib import logging import struct import random import tracing import obnamlib class ClientList(obnamlib.RepositoryTree): '''Repository's list of clients. The list maps a client name to an arbitrary (string) identifier, which is unique within the repository. 
The list is implemented as a B-tree, with a three-part key: 128-bit MD5 of client name, 64-bit unique identifier, and subkey identifier. The value depends on the subkey: it's either the client's full name, or the public key identifier the client uses to encrypt their backups. The client's identifier is a random, unique 64-bit integer. ''' # subkey values CLIENT_NAME = 0 KEYID = 1 SUBKEY_MAX = 255 def __init__(self, fs, node_size, upload_queue_size, lru_size, hooks): tracing.trace('new ClientList') self.hash_len = len(self.hashfunc('')) self.fmt = '!%dsQB' % self.hash_len self.key_bytes = struct.calcsize(self.fmt) self.minkey = self.hashkey('\x00' * self.hash_len, 0, 0) self.maxkey = self.hashkey('\xff' * self.hash_len, obnamlib.MAX_ID, self.SUBKEY_MAX) obnamlib.RepositoryTree.__init__(self, fs, 'clientlist', self.key_bytes, node_size, upload_queue_size, lru_size, hooks) self.keep_just_one_tree = True def hashfunc(self, string): return hashlib.new('md5', string).digest() def hashkey(self, namehash, client_id, subkey): return struct.pack(self.fmt, namehash, client_id, subkey) def key(self, client_name, client_id, subkey): h = self.hashfunc(client_name) return self.hashkey(h, client_id, subkey) def unkey(self, key): return struct.unpack(self.fmt, key) def random_id(self): return random.randint(0, obnamlib.MAX_ID) def list_clients(self): if self.init_forest() and self.forest.trees: t = self.forest.trees[-1] return [v for k, v in t.lookup_range(self.minkey, self.maxkey) if self.unkey(k)[2] == self.CLIENT_NAME] else: return [] def find_client_id(self, t, client_name): minkey = self.key(client_name, 0, 0) maxkey = self.key(client_name, obnamlib.MAX_ID, self.SUBKEY_MAX) for k, v in t.lookup_range(minkey, maxkey): checksum, client_id, subkey = self.unkey(k) if subkey == self.CLIENT_NAME and v == client_name: return client_id return None def get_client_id(self, client_name): if not self.init_forest() or not self.forest.trees: return None t = self.forest.trees[-1] return 
self.find_client_id(t, client_name)

    def add_client(self, client_name):
        logging.info('Adding client %s' % client_name)
        self.start_changes()
        if self.find_client_id(self.tree, client_name) is None:
            while True:
                candidate_id = self.random_id()
                key = self.key(client_name, candidate_id, self.CLIENT_NAME)
                try:
                    self.tree.lookup(key)
                except KeyError:
                    break
            key = self.key(client_name, candidate_id, self.CLIENT_NAME)
            self.tree.insert(key, client_name)
            logging.debug('Client %s has id %s' % (client_name, candidate_id))

    def remove_client(self, client_name):
        logging.info('Removing client %s' % client_name)
        self.start_changes()
        client_id = self.find_client_id(self.tree, client_name)
        if client_id is not None:
            key = self.key(client_name, client_id, self.CLIENT_NAME)
            self.tree.remove(key)

    def get_client_keyid(self, client_name):
        if self.init_forest() and self.forest.trees:
            t = self.forest.trees[-1]
            client_id = self.find_client_id(t, client_name)
            if client_id is not None:
                key = self.key(client_name, client_id, self.KEYID)
                for k, v in t.lookup_range(key, key):
                    return v
        return None

    def set_client_keyid(self, client_name, keyid):
        logging.info('Setting client %s to use key %s' % (client_name, keyid))
        self.start_changes()
        client_id = self.find_client_id(self.tree, client_name)
        key = self.key(client_name, client_id, self.KEYID)
        if keyid is None:
            self.tree.remove_range(key, key)
        else:
            self.tree.insert(key, keyid)

obnam-1.6.1/obnamlib/clientlist_tests.py

# Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.


import shutil
import tempfile
import unittest

import obnamlib


class ClientListTests(unittest.TestCase):

    def setUp(self):
        self.tempdir = tempfile.mkdtemp()
        fs = obnamlib.LocalFS(self.tempdir)
        self.hooks = obnamlib.HookManager()
        self.hooks.new('repository-toplevel-init')
        self.list = obnamlib.ClientList(fs, obnamlib.DEFAULT_NODE_SIZE,
                                        obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
                                        obnamlib.DEFAULT_LRU_SIZE, self)

    def tearDown(self):
        shutil.rmtree(self.tempdir)

    def test_key_bytes_is_correct_length(self):
        self.assertEqual(self.list.key_bytes,
                         len(self.list.key('foo', 12765, 0)))

    def test_unkey_unpacks_key_correctly(self):
        key = self.list.key('client name', 12765, 42)
        client_hash, client_id, subkey = self.list.unkey(key)
        self.assertEqual(client_id, 12765)
        self.assertEqual(subkey, 42)

    def test_reports_none_as_id_for_nonexistent_client(self):
        self.assertEqual(self.list.get_client_id('foo'), None)

    def test_lists_no_clients_when_tree_does_not_exist(self):
        self.assertEqual(self.list.list_clients(), [])

    def test_added_client_has_integer_id(self):
        self.list.add_client('foo')
        self.assert_(type(self.list.get_client_id('foo')) in [int, long])

    def test_added_client_is_listed(self):
        self.list.add_client('foo')
        self.list.set_client_keyid('foo', 'cafebeef')
        self.assertEqual(self.list.list_clients(), ['foo'])

    def test_removed_client_has_none_id(self):
        self.list.add_client('foo')
        self.list.remove_client('foo')
        self.assertEqual(self.list.get_client_id('foo'), None)

    def test_removed_client_has_no_keys(self):
        self.list.add_client('foo')
        client_id = self.list.get_client_id('foo')
        self.list.remove_client('foo')
        minkey = self.list.key('foo', client_id, 0)
        maxkey = self.list.key('foo', client_id, self.list.SUBKEY_MAX)
        pairs = list(self.list.tree.lookup_range(minkey, maxkey))
        self.assertEqual(pairs, [])

    def test_twice_added_client_exists_only_once(self):
        self.list.add_client('foo')
        self.list.add_client('foo')
        self.assertEqual(self.list.list_clients(), ['foo'])

    def test_adding_handles_hash_collision(self):
        def bad_hash(string):
            return '0' * 16
        self.list.hashfunc = bad_hash
        self.list.add_client('foo')
        self.list.add_client('bar')
        self.assertEqual(sorted(self.list.list_clients()), ['bar', 'foo'])
        self.assertNotEqual(self.list.get_client_id('bar'),
                            self.list.get_client_id('foo'))

    def test_client_has_no_public_key_initially(self):
        self.list.add_client('foo')
        self.assertEqual(self.list.get_client_keyid('foo'), None)

    def test_sets_client_keyid(self):
        self.list.add_client('foo')
        self.list.set_client_keyid('foo', 'cafebeef')
        self.assertEqual(self.list.get_client_keyid('foo'), 'cafebeef')

    def test_remove_client_keyid(self):
        self.list.add_client('foo')
        self.list.set_client_keyid('foo', 'cafebeef')
        self.list.set_client_keyid('foo', None)
        self.assertEqual(self.list.get_client_keyid('foo'), None)

obnam-1.6.1/obnamlib/clientmetadatatree.py

# Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.


import hashlib
import logging
import os
import random
import struct

import tracing

import obnamlib


class ClientMetadataTree(obnamlib.RepositoryTree):

    '''Store per-client metadata about files.
    Actual file contents are stored elsewhere; this stores just the
    metadata about files: names, inode info, and what chunks of data
    they use.

    See http://braawi.org/obnam/ondisk/ for a description of how this
    works.

    '''

    # Filesystem metadata.
    PREFIX_FS_META = 0          # prefix
    FILE_NAME = 0               # subkey type for storing pathnames
    FILE_CHUNKS = 1             # subkey type for list of chunks
    FILE_METADATA = 3           # subkey type for inode fields, etc
    DIR_CONTENTS = 4            # subkey type for list of directory contents
    FILE_DATA = 5               # subkey type for file data (instead of chunk)

    FILE_METADATA_ENCODED = 0   # subkey value for encoded obnamlib.Metadata().

    # References to chunks in this generation.
    # Main key is the chunk id, subkey type is always 0, subkey is file id
    # for file that uses the chunk.
    PREFIX_CHUNK_REF = 1

    # Metadata about the generation. The main key is always the hash of
    # 'generation', subkey type field is always 0.
    PREFIX_GEN_META = 2         # prefix
    GEN_ID = 0                  # subkey type for generation id
    GEN_STARTED = 1             # subkey type for when generation was started
    GEN_ENDED = 2               # subkey type for when generation was ended
    GEN_IS_CHECKPOINT = 3       # subkey type for whether generation is checkpoint
    GEN_FILE_COUNT = 4          # subkey type for count of files+dirs in generation
    GEN_TOTAL_DATA = 5          # subkey type for sum of all file sizes in gen

    # Maximum values for the subkey type field, and the subkey field.
    # Both have a minimum value of 0.
    TYPE_MAX = 255
    SUBKEY_MAX = struct.pack('!Q', obnamlib.MAX_ID)

    def __init__(self, fs, client_dir, node_size, upload_queue_size,
                 lru_size, repo):
        tracing.trace('new ClientMetadataTree, client_dir=%s' % client_dir)
        self.current_time = repo.current_time
        key_bytes = len(self.hashkey(0, self.default_file_id(''), 0, 0))
        obnamlib.RepositoryTree.__init__(self, fs, client_dir, key_bytes,
                                         node_size, upload_queue_size,
                                         lru_size, repo)
        self.genhash = self.default_file_id('generation')
        self.chunkids_per_key = max(1,
                                    int(node_size / 4 / struct.calcsize('Q')))
        self.init_caches()

    def init_caches(self):
        self.known_generations = {}
        self.file_ids = {}

    def default_file_id(self, filename):
        '''Return hash of filename suitable for use as main key.'''
        tracing.trace(repr(filename))
        def hash(s):
            return hashlib.md5(s).digest()[:4]
        dirname = os.path.dirname(filename)
        basename = os.path.basename(filename)
        return hash(dirname) + hash(basename)

    def _bad_default_file_id(self, filename):
        '''For use by unit tests.'''
        return struct.pack('!Q', 0)

    def hashkey(self, prefix, mainhash, subtype, subkey):
        '''Compute a full key.

        The full key consists of four parts:

        * the prefix (0 for filesystem metadata, 1 for chunk refs)
        * a hash of the main key (64 bits)
        * the subkey type (8 bits)
        * the subkey (64 bits)

        These are concatenated.

        mainhash must be a string of 8 bytes.

        subtype must be an integer in the range 0..255, inclusive.

        subkey must be either a string or an integer. If it is a string,
        it will be padded with NUL bytes at the end, if it is less than
        8 bytes, and truncated, if longer. If it is an integer, it will
        be packed as a 64-bit unsigned integer, so the value must fit
        into 64 bits.
        '''

        if type(subkey) == str:
            subkey = (subkey + '\0' * 8)[:8]
            fmt = '!B8sB8s'
        else:
            assert type(subkey) in [int, long]
            fmt = '!B8sBQ'

        return struct.pack(fmt, prefix, mainhash, subtype, subkey)

    def fskey(self, mainhash, subtype, subkey):
        '''Generate key for filesystem metadata.'''
        return self.hashkey(self.PREFIX_FS_META, mainhash, subtype, subkey)

    def fs_unkey(self, key):
        '''Inverse of fskey.'''
        parts = struct.unpack('!B8sB8s', key)
        return parts[1], parts[3]

    def genkey(self, subkey):
        '''Generate key for generation metadata.'''
        return self.hashkey(self.PREFIX_GEN_META, self.genhash, 0, subkey)

    def int2bin(self, integer):
        '''Convert an integer to a binary string representation.'''
        return struct.pack('!Q', integer)

    def chunk_key(self, chunk_id, file_id):
        '''Generate a key for a chunk reference.'''
        return self.hashkey(self.PREFIX_CHUNK_REF, self.int2bin(chunk_id),
                            0, file_id)

    def chunk_unkey(self, key):
        '''Return the chunk and file ids in a chunk key.'''
        parts = struct.unpack('!BQBQ', key)
        return parts[1], parts[3]

    def get_file_id(self, tree, pathname):
        '''Return id for file in a given generation.'''
        if tree in self.file_ids:
            if pathname in self.file_ids[tree]:
                return self.file_ids[tree][pathname]
        else:
            self.file_ids[tree] = {}

        default_file_id = self.default_file_id(pathname)
        minkey = self.fskey(default_file_id, self.FILE_NAME, 0)
        maxkey = self.fskey(default_file_id, self.FILE_NAME, obnamlib.MAX_ID)
        for key, value in tree.lookup_range(minkey, maxkey):
            def_id, file_id = self.fs_unkey(key)
            assert def_id == default_file_id, \
                'def=%s other=%s' % (repr(def_id), repr(default_file_id))
            self.file_ids[tree][value] = file_id
            if value == pathname:
                return file_id

        raise KeyError('%s does not yet have a file-id' % pathname)

    def set_file_id(self, pathname):
        '''Set and return the file-id for a file in current generation.'''
        default_file_id = self.default_file_id(pathname)
        minkey = self.fskey(default_file_id, self.FILE_NAME, 0)
        maxkey = self.fskey(default_file_id, self.FILE_NAME, obnamlib.MAX_ID)
        file_ids = set()
        for key, value in self.tree.lookup_range(minkey, maxkey):
            def_id, file_id = self.fs_unkey(key)
            assert def_id == default_file_id
            if value == pathname:
                return file_id
            file_ids.add(file_id)

        while True:
            n = random.randint(0, obnamlib.MAX_ID)
            file_id = struct.pack('!Q', n)
            if file_id not in file_ids:
                break

        key = self.fskey(default_file_id, self.FILE_NAME, file_id)
        self.tree.insert(key, pathname)
        return file_id

    def _lookup_int(self, tree, key):
        return struct.unpack('!Q', tree.lookup(key))[0]

    def _insert_int(self, tree, key, value):
        return tree.insert(key, struct.pack('!Q', value))

    def commit(self):
        tracing.trace('committing ClientMetadataTree')
        if self.tree:
            now = int(self.current_time())
            self._insert_int(self.tree, self.genkey(self.GEN_ENDED), now)
            genid = self._get_generation_id_or_None(self.tree)
            if genid is not None:
                t = [(self.GEN_FILE_COUNT, 'file_count'),
                     (self.GEN_TOTAL_DATA, 'total_data')]
                for subkey, attr in t:
                    if hasattr(self, attr):
                        self._insert_count(genid, subkey, getattr(self, attr))
        obnamlib.RepositoryTree.commit(self)

    def init_forest(self, *args, **kwargs):
        self.init_caches()
        return obnamlib.RepositoryTree.init_forest(self, *args, **kwargs)

    def start_changes(self, *args, **kwargs):
        self.init_caches()
        return obnamlib.RepositoryTree.start_changes(self, *args, **kwargs)

    def find_generation(self, genid):
        def fill_cache():
            key = self.genkey(self.GEN_ID)
            for t in self.forest.trees:
                t_genid = self._lookup_int(t, key)
                if t_genid == genid:
                    self.known_generations[genid] = t
                    return t

        if self.forest:
            if genid in self.known_generations:
                return self.known_generations[genid]
            t = fill_cache()
            if t is not None:
                return t
        raise KeyError('Unknown generation %s' % genid)

    def list_generations(self):
        if self.forest:
            genids = []
            for t in self.forest.trees:
                genid = self._get_generation_id_or_None(t)
                if genid is not None:
                    genids.append(genid)
            return genids
        else:
            return []

    def start_generation(self):
        tracing.trace('start new generation')
        self.start_changes()
        gen_id = self.forest.new_id()
        now = int(self.current_time())
        self._insert_int(self.tree, self.genkey(self.GEN_ID), gen_id)
        self._insert_int(self.tree, self.genkey(self.GEN_STARTED), now)
        self.file_count = self.get_generation_file_count(gen_id) or 0
        self.total_data = self.get_generation_data(gen_id) or 0

    def set_current_generation_is_checkpoint(self, is_checkpoint):
        tracing.trace('is_checkpoint=%s', is_checkpoint)
        value = 1 if is_checkpoint else 0
        key = self.genkey(self.GEN_IS_CHECKPOINT)
        self._insert_int(self.tree, key, value)

    def get_is_checkpoint(self, genid):
        tree = self.find_generation(genid)
        key = self.genkey(self.GEN_IS_CHECKPOINT)
        try:
            return self._lookup_int(tree, key)
        except KeyError:
            return 0

    def remove_generation(self, genid):
        tracing.trace('genid=%s', genid)
        tree = self.find_generation(genid)
        if tree == self.tree:
            self.tree = None
        self.forest.remove_tree(tree)

    def get_generation_id(self, tree):
        return self._lookup_int(tree, self.genkey(self.GEN_ID))

    def _get_generation_id_or_None(self, tree):
        try:
            return self.get_generation_id(tree)
        except KeyError:  # pragma: no cover
            return None

    def _lookup_time(self, tree, what):
        try:
            return self._lookup_int(tree, self.genkey(what))
        except KeyError:
            return None

    def get_generation_times(self, genid):
        tree = self.find_generation(genid)
        return (self._lookup_time(tree, self.GEN_STARTED),
                self._lookup_time(tree, self.GEN_ENDED))

    def get_generation_data(self, genid):
        return self._lookup_count(genid, self.GEN_TOTAL_DATA)

    def _lookup_count(self, genid, count_type):
        tree = self.find_generation(genid)
        key = self.genkey(count_type)
        try:
            return self._lookup_int(tree, key)
        except KeyError:
            return None

    def _insert_count(self, genid, count_type, count):
        tree = self.find_generation(genid)
        key = self.genkey(count_type)
        return self._insert_int(tree, key, count)

    def get_generation_file_count(self, genid):
        return self._lookup_count(genid, self.GEN_FILE_COUNT)

    def create(self, filename, encoded_metadata):
        tracing.trace('filename=%s', filename)
        file_id = self.set_file_id(filename)
        gen_id = self.get_generation_id(self.tree)
        try:
            old_metadata = self.get_metadata(gen_id, filename)
        except KeyError:
            old_metadata = None
            self.file_count += 1
        else:
            old = obnamlib.decode_metadata(old_metadata)
            if old.isfile():
                self.total_data -= old.st_size or 0

        metadata = obnamlib.decode_metadata(encoded_metadata)
        if metadata.isfile():
            self.total_data += metadata.st_size or 0

        if encoded_metadata != old_metadata:
            tracing.trace('new or changed metadata')
            self.set_metadata(filename, encoded_metadata)

        # Add to parent's contents, unless already there.
        parent = os.path.dirname(filename)
        tracing.trace('parent=%s', parent)
        if parent != filename:  # root dir is its own parent
            basename = os.path.basename(filename)
            parent_id = self.set_file_id(parent)
            key = self.fskey(parent_id, self.DIR_CONTENTS, file_id)
            # We could just insert, but that would cause unnecessary
            # churn in the tree if nothing changes.
            try:
                self.tree.lookup(key)
                tracing.trace('was already in parent')  # pragma: no cover
            except KeyError:
                self.tree.insert(key, basename)
                tracing.trace('added to parent')

    def get_metadata(self, genid, filename):
        tree = self.find_generation(genid)
        file_id = self.get_file_id(tree, filename)
        key = self.fskey(file_id, self.FILE_METADATA,
                         self.FILE_METADATA_ENCODED)
        return tree.lookup(key)

    def set_metadata(self, filename, encoded_metadata):
        tracing.trace('filename=%s', filename)
        file_id = self.set_file_id(filename)
        key1 = self.fskey(file_id, self.FILE_NAME, file_id)
        self.tree.insert(key1, filename)
        key2 = self.fskey(file_id, self.FILE_METADATA,
                          self.FILE_METADATA_ENCODED)
        self.tree.insert(key2, encoded_metadata)

    def remove(self, filename):
        tracing.trace('filename=%s', filename)
        file_id = self.get_file_id(self.tree, filename)
        genid = self.get_generation_id(self.tree)

        self.file_count -= 1

        try:
            encoded_metadata = self.get_metadata(genid, filename)
        except KeyError:
            pass
        else:
            metadata = obnamlib.decode_metadata(encoded_metadata)
            if metadata.isfile():
                self.total_data -= metadata.st_size or 0

        # Remove any children.
        minkey = self.fskey(file_id, self.DIR_CONTENTS, 0)
        maxkey = self.fskey(file_id, self.DIR_CONTENTS, obnamlib.MAX_ID)
        for key, basename in self.tree.lookup_range(minkey, maxkey):
            self.remove(os.path.join(filename, basename))

        # Remove chunk refs.
        for chunkid in self.get_file_chunks(genid, filename):
            key = self.chunk_key(chunkid, file_id)
            self.tree.remove_range(key, key)

        # Remove this file's metadata.
        minkey = self.fskey(file_id, 0, 0)
        maxkey = self.fskey(file_id, self.TYPE_MAX, self.SUBKEY_MAX)
        self.tree.remove_range(minkey, maxkey)

        # Remove filename.
        default_file_id = self.default_file_id(filename)
        key = self.fskey(default_file_id, self.FILE_NAME, file_id)
        self.tree.remove_range(key, key)

        # Also remove from parent's contents.
        parent = os.path.dirname(filename)
        if parent != filename:  # root dir is its own parent
            parent_id = self.set_file_id(parent)
            key = self.fskey(parent_id, self.DIR_CONTENTS, file_id)
            # The range removal will work even if the key does not exist.
            self.tree.remove_range(key, key)

    def listdir(self, genid, dirname):
        tree = self.find_generation(genid)
        try:
            dir_id = self.get_file_id(tree, dirname)
        except KeyError:
            return []
        minkey = self.fskey(dir_id, self.DIR_CONTENTS, 0)
        maxkey = self.fskey(dir_id, self.DIR_CONTENTS, self.SUBKEY_MAX)
        basenames = []
        for key, value in tree.lookup_range(minkey, maxkey):
            basenames.append(value)
        return basenames

    def get_file_chunks(self, genid, filename):
        tree = self.find_generation(genid)
        try:
            file_id = self.get_file_id(tree, filename)
        except KeyError:
            return []
        minkey = self.fskey(file_id, self.FILE_CHUNKS, 0)
        maxkey = self.fskey(file_id, self.FILE_CHUNKS, self.SUBKEY_MAX)
        pairs = tree.lookup_range(minkey, maxkey)
        chunkids = []
        for key, value in pairs:
            chunkids.extend(self._decode_chunks(value))
        return chunkids

    def _encode_chunks(self, chunkids):
        fmt = '!' + ('Q' * len(chunkids))
        return struct.pack(fmt, *chunkids)

    def _decode_chunks(self, encoded):
        size = struct.calcsize('Q')
        count = len(encoded) / size
        fmt = '!' + ('Q' * count)
        return struct.unpack(fmt, encoded)

    def _insert_chunks(self, tree, file_id, i, chunkids):
        key = self.fskey(file_id, self.FILE_CHUNKS, i)
        encoded = self._encode_chunks(chunkids)
        tree.insert(key, encoded)

    def set_file_chunks(self, filename, chunkids):
        tracing.trace('filename=%s', filename)
        tracing.trace('chunkids=%s', repr(chunkids))
        file_id = self.set_file_id(filename)
        minkey = self.fskey(file_id, self.FILE_CHUNKS, 0)
        maxkey = self.fskey(file_id, self.FILE_CHUNKS, self.SUBKEY_MAX)

        for key, value in self.tree.lookup_range(minkey, maxkey):
            for chunkid in self._decode_chunks(value):
                k = self.chunk_key(chunkid, file_id)
                self.tree.remove_range(k, k)
        self.tree.remove_range(minkey, maxkey)

        self.append_file_chunks(filename, chunkids)

    def append_file_chunks(self, filename, chunkids):
        tracing.trace('filename=%s', filename)
        tracing.trace('chunkids=%s', repr(chunkids))

        file_id = self.set_file_id(filename)
        minkey = self.fskey(file_id, self.FILE_CHUNKS, 0)
        maxkey = self.fskey(file_id, self.FILE_CHUNKS, self.SUBKEY_MAX)
        i = self.tree.count_range(minkey, maxkey)

        while chunkids:
            some = chunkids[:self.chunkids_per_key]
            self._insert_chunks(self.tree, file_id, i, some)
            for chunkid in some:
                self.tree.insert(self.chunk_key(chunkid, file_id), '')
            i += 1
            chunkids = chunkids[self.chunkids_per_key:]

    def chunk_in_use(self, gen_id, chunk_id):
        '''Is a chunk used by a generation?'''
        minkey = self.chunk_key(chunk_id, 0)
        maxkey = self.chunk_key(chunk_id, obnamlib.MAX_ID)
        t = self.find_generation(gen_id)
        return not t.range_is_empty(minkey, maxkey)

    def list_chunks_in_generation(self, gen_id):
        '''Return list of chunk ids used in a given generation.'''
        minkey = self.chunk_key(0, 0)
        maxkey = self.chunk_key(obnamlib.MAX_ID, obnamlib.MAX_ID)
        t = self.find_generation(gen_id)
        return list(set(self.chunk_unkey(key)[0]
                        for key, value in t.lookup_range(minkey, maxkey)))

    def set_file_data(self, filename, contents):  # pragma: no cover
        '''Store contents of file, if small, in B-tree instead of chunk.

        The length of the contents should be small enough to fit in a
        B-tree leaf.

        '''
        tracing.trace('filename=%s' % filename)
        tracing.trace('contents=%s' % repr(contents))
        file_id = self.set_file_id(filename)
        key = self.fskey(file_id, self.FILE_DATA, 0)
        self.tree.insert(key, contents)

    def get_file_data(self, gen_id, filename):  # pragma: no cover
        '''Return contents of file, if set, or None.'''
        tree = self.find_generation(gen_id)
        file_id = self.get_file_id(tree, filename)
        key = self.fskey(file_id, self.FILE_DATA, 0)
        try:
            return tree.lookup(key)
        except KeyError:
            return None

obnam-1.6.1/obnamlib/clientmetadatatree_tests.py

# Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
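The `FILE_CHUNKS` encoding in `clientmetadatatree.py` above packs a list of chunk ids into a single B-tree value as big-endian unsigned 64-bit integers. A minimal standalone sketch of that round trip (written in Python 3 for illustration, while obnamlib itself is Python 2; the helper names are illustrative, not part of obnamlib):

```python
import struct

def encode_chunks(chunkids):
    # Pack each chunk id as a big-endian unsigned 64-bit integer ('Q'),
    # mirroring ClientMetadataTree._encode_chunks().
    return struct.pack('!' + 'Q' * len(chunkids), *chunkids)

def decode_chunks(encoded):
    # Recover the id count from the byte length, mirroring
    # ClientMetadataTree._decode_chunks().
    count = len(encoded) // struct.calcsize('Q')
    return list(struct.unpack('!' + 'Q' * count, encoded))

encoded = encode_chunks([1, 2, 3])
assert len(encoded) == 24                     # 3 ids * 8 bytes each
assert decode_chunks(encoded) == [1, 2, 3]    # lossless round trip
```

This fixed-width layout is what lets `chunkids_per_key` be computed up front from the node size: every id costs exactly `struct.calcsize('Q')` bytes.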
import shutil
import stat
import tempfile
import time
import unittest

import obnamlib


class ClientMetadataTreeTests(unittest.TestCase):

    def current_time(self):
        return time.time() if self.now is None else self.now

    def setUp(self):
        self.now = None
        self.tempdir = tempfile.mkdtemp()
        fs = obnamlib.LocalFS(self.tempdir)
        self.hooks = obnamlib.HookManager()
        self.hooks.new('repository-toplevel-init')
        self.client = obnamlib.ClientMetadataTree(
            fs, 'clientid', obnamlib.DEFAULT_NODE_SIZE,
            obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
            obnamlib.DEFAULT_LRU_SIZE, self)
        self.file_size = 123
        self.file_metadata = obnamlib.Metadata(st_mode=stat.S_IFREG | 0666,
                                               st_size=self.file_size)
        self.file_encoded = obnamlib.encode_metadata(self.file_metadata)

    def tearDown(self):
        shutil.rmtree(self.tempdir)

    def test_has_not_current_generation_initially(self):
        self.assertEqual(self.client.tree, None)

    def test_lists_no_generations_initially(self):
        self.assertEqual(self.client.list_generations(), [])

    def test_starts_generation(self):
        self.now = 12765
        self.client.start_generation()
        self.assertNotEqual(self.client.tree, None)

        def lookup(x):
            key = self.client.genkey(x)
            return self.client._lookup_int(self.client.tree, key)

        genid = self.client.get_generation_id(self.client.tree)
        self.assertEqual(lookup(self.client.GEN_ID), genid)
        self.assertEqual(lookup(self.client.GEN_STARTED), 12765)
        self.assertFalse(self.client.get_is_checkpoint(genid))

    def test_starts_second_generation(self):
        self.now = 1
        self.client.start_generation()
        genid1 = self.client.get_generation_id(self.client.tree)
        self.client.commit()
        self.assertEqual(self.client.tree, None)
        self.now = 2
        self.client.start_generation()
        self.assertNotEqual(self.client.tree, None)

        def lookup(x):
            key = self.client.genkey(x)
            return self.client._lookup_int(self.client.tree, key)

        genid2 = self.client.get_generation_id(self.client.tree)
        self.assertEqual(lookup(self.client.GEN_ID), genid2)
        self.assertNotEqual(genid1, genid2)
        self.assertEqual(lookup(self.client.GEN_STARTED), 2)
        self.assertFalse(self.client.get_is_checkpoint(genid2))
        self.assertEqual(self.client.list_generations(), [genid1, genid2])

    def test_sets_is_checkpoint(self):
        self.client.start_generation()
        genid = self.client.get_generation_id(self.client.tree)
        self.client.set_current_generation_is_checkpoint(True)
        self.assert_(self.client.get_is_checkpoint(genid))

    def test_unsets_is_checkpoint(self):
        self.client.start_generation()
        genid = self.client.get_generation_id(self.client.tree)
        self.client.set_current_generation_is_checkpoint(True)
        self.client.set_current_generation_is_checkpoint(False)
        self.assertFalse(self.client.get_is_checkpoint(genid))

    def test_removes_generation(self):
        self.client.start_generation()
        self.client.commit()
        genid = self.client.list_generations()[0]
        self.client.remove_generation(genid)
        self.assertEqual(self.client.list_generations(), [])

    def test_removes_started_generation(self):
        self.client.start_generation()
        self.client.remove_generation(self.client.list_generations()[0])
        self.assertEqual(self.client.list_generations(), [])
        self.assertEqual(self.client.tree, None)

    def test_started_generation_has_start_time(self):
        self.now = 1
        self.client.start_generation()
        genid = self.client.get_generation_id(self.client.tree)
        self.assertEqual(self.client.get_generation_times(genid), (1, None))

    def test_committed_generation_has_times(self):
        self.now = 1
        self.client.start_generation()
        genid = self.client.get_generation_id(self.client.tree)
        self.now = 2
        self.client.commit()
        self.assertEqual(self.client.get_generation_times(genid), (1, 2))

    def test_single_empty_generation_counts_zero_files(self):
        self.client.start_generation()
        genid = self.client.get_generation_id(self.client.tree)
        self.client.commit()
        self.assertEqual(self.client.get_generation_file_count(genid), 0)

    def test_counts_files_in_first_generation(self):
        self.client.start_generation()
        genid = self.client.get_generation_id(self.client.tree)
        self.client.create('/foo', self.file_encoded)
        self.client.commit()
        self.assertEqual(self.client.get_generation_file_count(genid), 1)

    def test_counts_new_files_in_second_generation(self):
        self.client.start_generation()
        self.client.create('/foo', self.file_encoded)
        self.client.commit()
        self.client.start_generation()
        genid = self.client.get_generation_id(self.client.tree)
        self.client.create('/bar', self.file_encoded)
        self.client.commit()
        self.assertEqual(self.client.get_generation_file_count(genid), 2)

    def test_discounts_deleted_files_in_second_generation(self):
        self.client.start_generation()
        self.client.create('/foo', self.file_encoded)
        self.client.commit()
        self.client.start_generation()
        genid = self.client.get_generation_id(self.client.tree)
        self.client.remove('/foo')
        self.client.commit()
        self.assertEqual(self.client.get_generation_file_count(genid), 0)

    def test_does_not_increment_count_for_recreated_files(self):
        self.client.start_generation()
        self.client.create('/foo', self.file_encoded)
        self.client.commit()
        self.client.start_generation()
        genid = self.client.get_generation_id(self.client.tree)
        self.client.create('/foo', self.file_encoded)
        self.client.commit()
        self.assertEqual(self.client.get_generation_file_count(genid), 1)

    def test_single_empty_generation_has_no_data(self):
        self.client.start_generation()
        genid = self.client.get_generation_id(self.client.tree)
        self.client.commit()
        self.assertEqual(self.client.get_generation_data(genid), 0)

    def test_has_data_in_first_generation(self):
        self.client.start_generation()
        genid = self.client.get_generation_id(self.client.tree)
        self.client.create('/foo', self.file_encoded)
        self.client.commit()
        self.assertEqual(self.client.get_generation_data(genid),
                         self.file_size)

    def test_counts_new_data_in_second_generation(self):
        self.client.start_generation()
        self.client.create('/foo', self.file_encoded)
        self.client.commit()
        self.client.start_generation()
        genid = self.client.get_generation_id(self.client.tree)
        self.client.create('/bar', self.file_encoded)
        self.client.commit()
self.assertEqual(self.client.get_generation_data(genid), 2 * self.file_size) def test_counts_replaced_data_in_second_generation(self): self.client.start_generation() self.client.create('/foo', self.file_encoded) self.client.commit() self.client.start_generation() genid = self.client.get_generation_id(self.client.tree) self.client.create('/foo', self.file_encoded) self.client.commit() self.assertEqual(self.client.get_generation_data(genid), self.file_size) def test_discounts_deleted_data_in_second_generation(self): self.client.start_generation() self.client.create('/foo', self.file_encoded) self.client.commit() self.client.start_generation() genid = self.client.get_generation_id(self.client.tree) self.client.remove('/foo') self.client.commit() self.assertEqual(self.client.get_generation_data(genid), 0) def test_does_not_increment_data_for_recreated_files(self): self.client.start_generation() self.client.create('/foo', self.file_encoded) self.client.commit() self.client.start_generation() genid = self.client.get_generation_id(self.client.tree) self.client.create('/foo', self.file_encoded) self.client.commit() self.assertEqual(self.client.get_generation_data(genid), self.file_size) def test_finds_generation_the_first_time(self): self.client.start_generation() tree = self.client.tree genid = self.client.get_generation_id(tree) self.client.commit() self.assertEqual(self.client.find_generation(genid), tree) def test_finds_generation_the_second_time(self): self.client.start_generation() tree = self.client.tree genid = self.client.get_generation_id(tree) self.client.commit() self.client.find_generation(genid) self.assertEqual(self.client.find_generation(genid), tree) def test_find_generation_raises_keyerror_for_empty_forest(self): self.client.init_forest() self.assertRaises(KeyError, self.client.find_generation, 0) def test_find_generation_raises_keyerror_for_unknown_generation(self): self.assertRaises(KeyError, self.client.find_generation, 0) class 
ClientMetadataTreeFileOpsTests(unittest.TestCase): def current_time(self): return time.time() if self.now is None else self.now def setUp(self): self.now = None self.tempdir = tempfile.mkdtemp() fs = obnamlib.LocalFS(self.tempdir) self.hooks = obnamlib.HookManager() self.hooks.new('repository-toplevel-init') self.client = obnamlib.ClientMetadataTree(fs, 'clientid', obnamlib.DEFAULT_NODE_SIZE, obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE, obnamlib.DEFAULT_LRU_SIZE, self) # Force use of filename hash collisions. self.client.default_file_id = self.client._bad_default_file_id self.client.start_generation() self.clientid = self.client.get_generation_id(self.client.tree) self.file_metadata = obnamlib.Metadata(st_mode=stat.S_IFREG | 0666) self.file_encoded = obnamlib.encode_metadata(self.file_metadata) self.dir_metadata = obnamlib.Metadata(st_mode=stat.S_IFDIR | 0777) self.dir_encoded = obnamlib.encode_metadata(self.dir_metadata) def tearDown(self): shutil.rmtree(self.tempdir) def test_has_empty_root_initially(self): self.assertEqual(self.client.listdir(self.clientid, '/'), []) def test_has_no_metadata_initially(self): self.assertRaises(KeyError, self.client.get_metadata, self.clientid, '/foo') def test_sets_metadata(self): self.client.set_metadata('/foo', self.file_encoded) self.assertEqual(self.client.get_metadata(self.clientid, '/foo'), self.file_encoded) def test_creates_file_at_root(self): self.client.create('/foo', self.file_encoded) self.assertEqual(self.client.listdir(self.clientid, '/'), ['foo']) self.assertEqual(self.client.get_metadata(self.clientid, '/foo'), self.file_encoded) def test_removes_file_at_root(self): self.client.create('/foo', self.file_encoded) self.client.remove('/foo') self.assertEqual(self.client.listdir(self.clientid, '/'), []) self.assertRaises(KeyError, self.client.get_metadata, self.clientid, '/foo') def test_creates_directory_at_root(self): self.client.create('/foo', self.dir_encoded) self.assertEqual(self.client.listdir(self.clientid, '/'), 
['foo']) self.assertEqual(self.client.get_metadata(self.clientid, '/foo'), self.dir_encoded) def test_removes_directory_at_root(self): self.client.create('/foo', self.dir_encoded) self.client.remove('/foo') self.assertEqual(self.client.listdir(self.clientid, '/'), []) self.assertRaises(KeyError, self.client.get_metadata, self.clientid, '/foo') def test_creates_directory_and_files_and_subdirs(self): self.client.create('/foo', self.dir_encoded) self.client.create('/foo/foobar', self.file_encoded) self.client.create('/foo/bar', self.dir_encoded) self.client.create('/foo/bar/baz', self.file_encoded) self.assertEqual(self.client.listdir(self.clientid, '/'), ['foo']) self.assertEqual(sorted(self.client.listdir(self.clientid, '/foo')), ['bar', 'foobar']) self.assertEqual(self.client.listdir(self.clientid, '/foo/bar'), ['baz']) self.assertEqual(self.client.get_metadata(self.clientid, '/foo'), self.dir_encoded) self.assertEqual(self.client.get_metadata(self.clientid, '/foo/bar'), self.dir_encoded) self.assertEqual(self.client.get_metadata(self.clientid, '/foo/foobar'), self.file_encoded) self.assertEqual(self.client.get_metadata(self.clientid, '/foo/bar/baz'), self.file_encoded) def test_removes_directory_and_files_and_subdirs(self): self.client.create('/foo', self.dir_encoded) self.client.create('/foo/foobar', self.file_encoded) self.client.create('/foo/bar', self.dir_encoded) self.client.create('/foo/bar/baz', self.file_encoded) self.client.remove('/foo') self.assertEqual(self.client.listdir(self.clientid, '/'), []) self.assertRaises(KeyError, self.client.get_metadata, self.clientid, '/foo') self.assertRaises(KeyError, self.client.get_metadata, self.clientid, '/foo/foobar') self.assertRaises(KeyError, self.client.get_metadata, self.clientid, '/foo/bar') self.assertRaises(KeyError, self.client.get_metadata, self.clientid, '/foo/bar/baz') def test_has_no_file_chunks_initially(self): self.assertEqual(self.client.get_file_chunks(self.clientid, '/foo'), []) def 
test_sets_file_chunks(self): self.client.set_file_chunks('/foo', [1, 2, 3]) self.assertEqual(self.client.get_file_chunks(self.clientid, '/foo'), [1, 2, 3]) def test_appends_file_chunks_to_empty_list(self): self.client.append_file_chunks('/foo', [1, 2, 3]) self.assertEqual(self.client.get_file_chunks(self.clientid, '/foo'), [1, 2, 3]) def test_appends_file_chunks_to_nonempty_list(self): self.client.set_file_chunks('/foo', [1, 2, 3]) self.client.append_file_chunks('/foo', [4, 5, 6]) self.assertEqual(self.client.get_file_chunks(self.clientid, '/foo'), [1, 2, 3, 4, 5, 6]) def test_generation_has_no_chunk_refs_initially(self): minkey = self.client.chunk_key(0, 0) maxkey = self.client.chunk_key(obnamlib.MAX_ID, obnamlib.MAX_ID) self.assertEqual(list(self.client.tree.lookup_range(minkey, maxkey)), []) def test_generation_has_no_chunk_refs_initially(self): minkey = self.client.chunk_key(0, 0) maxkey = self.client.chunk_key(obnamlib.MAX_ID, obnamlib.MAX_ID) self.assertEqual(list(self.client.tree.lookup_range(minkey, maxkey)), []) def test_sets_file_chunks(self): self.client.set_file_chunks('/foo', [1, 2, 3]) self.assertEqual(self.client.get_file_chunks(self.clientid, '/foo'), [1, 2, 3]) def test_generation_has_no_chunk_refs_initially(self): minkey = self.client.chunk_key(0, 0) maxkey = self.client.chunk_key(obnamlib.MAX_ID, obnamlib.MAX_ID) self.assertEqual(list(self.client.tree.lookup_range(minkey, maxkey)), []) def test_set_file_chunks_adds_chunk_refs(self): self.client.set_file_chunks('/foo', [1, 2]) file_id = self.client.get_file_id(self.client.tree, '/foo') minkey = self.client.chunk_key(0, 0) maxkey = self.client.chunk_key(obnamlib.MAX_ID, obnamlib.MAX_ID) self.assertEqual(set(self.client.tree.lookup_range(minkey, maxkey)), set([(self.client.chunk_key(1, file_id), ''), (self.client.chunk_key(2, file_id), '')])) def test_set_file_chunks_removes_now_unused_chunk_refs(self): self.client.set_file_chunks('/foo', [1, 2]) self.client.set_file_chunks('/foo', [1]) file_id = 
self.client.get_file_id(self.client.tree, '/foo') minkey = self.client.chunk_key(0, 0) maxkey = self.client.chunk_key(obnamlib.MAX_ID, obnamlib.MAX_ID) self.assertEqual(list(self.client.tree.lookup_range(minkey, maxkey)), [(self.client.chunk_key(1, file_id), '')]) def test_remove_removes_chunk_refs(self): self.client.set_file_chunks('/foo', [1, 2]) self.client.remove('/foo') minkey = self.client.chunk_key(0, 0) maxkey = self.client.chunk_key(obnamlib.MAX_ID, obnamlib.MAX_ID) self.assertEqual(list(self.client.tree.lookup_range(minkey, maxkey)), []) def test_report_chunk_not_in_use_initially(self): gen_id = self.client.get_generation_id(self.client.tree) self.assertFalse(self.client.chunk_in_use(gen_id, 0)) def test_report_chunk_in_use_after_it_is(self): gen_id = self.client.get_generation_id(self.client.tree) self.client.set_file_chunks('/foo', [0]) self.assertTrue(self.client.chunk_in_use(gen_id, 0)) def test_lists_no_chunks_in_generation_initially(self): gen_id = self.client.get_generation_id(self.client.tree) self.assertEqual(self.client.list_chunks_in_generation(gen_id), []) def test_lists_used_chunks_in_generation(self): gen_id = self.client.get_generation_id(self.client.tree) self.client.set_file_chunks('/foo', [0]) self.client.set_file_chunks('/bar', [1]) self.assertEqual(set(self.client.list_chunks_in_generation(gen_id)), set([0, 1])) def test_lists_chunks_in_generation_only_once(self): gen_id = self.client.get_generation_id(self.client.tree) self.client.set_file_chunks('/foo', [0]) self.client.set_file_chunks('/bar', [0]) self.assertEqual(self.client.list_chunks_in_generation(gen_id), [0]) obnam-1.6.1/obnamlib/encryption.py0000644000175000017500000001650612246357067017052 0ustar jenkinsjenkins# Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any 
later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . import os import shutil import subprocess import tempfile import tracing import obnamlib def generate_symmetric_key(numbits, filename='/dev/random'): '''Generate a random key of at least numbits for symmetric encryption.''' tracing.trace('numbits=%d', numbits) bytes = (numbits + 7) / 8 f = open(filename, 'rb') key = f.read(bytes) f.close() return key.encode('hex') class SymmetricKeyCache(object): '''Cache symmetric keys in memory.''' def __init__(self): self.clear() def get(self, repo, toplevel): if repo in self.repos and toplevel in self.repos[repo]: return self.repos[repo][toplevel] return None def put(self, repo, toplevel, key): if repo not in self.repos: self.repos[repo] = {} self.repos[repo][toplevel] = key def clear(self): self.repos = {} def _gpg_pipe(args, data, passphrase): '''Pipe things through gpg. With the right args, this can be either an encryption or a decryption operation. For safety, we give the passphrase to gpg via a file descriptor. The argument list is modified to include the relevant options for that. The data is fed to gpg via a temporary file, readable only by the owner, to avoid congested pipes. ''' # Open pipe for passphrase, and write it there. If passphrase is # very long (more than 4 KiB by default), this might block. A better # implementation would be to have a loop around select(2) to do pipe # I/O when it can be done without blocking. Patches most welcome. keypipe = os.pipe() os.write(keypipe[1], passphrase + '\n') os.close(keypipe[1]) # Actually run gpg. 
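The comment above invites a select(2)-based replacement for the blocking passphrase write. A simplified sketch of that loop is below; `write_passphrase_nonblocking` is a hypothetical helper name, not part of obnamlib, and a complete fix would additionally multiplex the passphrase pipe with gpg's stdin/stdout so no pipe can fill up unattended:

```python
import os
import select

def write_passphrase_nonblocking(fd, passphrase):
    # Feed the passphrase to gpg one chunk at a time, waiting with
    # select() until the pipe can accept more bytes, so a very long
    # passphrase cannot block the whole write.
    data = (passphrase + '\n').encode('utf-8')
    while data:
        _, writable, _ = select.select([], [fd], [])
        if fd in writable:
            n = os.write(fd, data)   # may write fewer bytes than asked
            data = data[n:]
    os.close(fd)
```

This only removes the blocking `os.write`; if gpg itself stalls, `select` still waits, so production code would watch gpg's output descriptors in the same loop.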
argv = ['gpg', '--passphrase-fd', str(keypipe[0]), '-q', '--batch', '--no-textmode'] + args tracing.trace('argv=%s', repr(argv)) p = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE) out, err = p.communicate(data) os.close(keypipe[0]) # Return output data, or deal with errors. if p.returncode: # pragma: no cover raise obnamlib.EncryptionError(err) return out def encrypt_symmetric(cleartext, key): '''Encrypt data with symmetric encryption.''' return _gpg_pipe(['-c'], cleartext, key) def decrypt_symmetric(encrypted, key): '''Decrypt encrypted data with symmetric encryption.''' return _gpg_pipe(['-d'], encrypted, key) def _gpg(args, stdin='', gpghome=None): '''Run gpg and return its output.''' env = dict() env.update(os.environ) if gpghome is not None: env['GNUPGHOME'] = gpghome tracing.trace('gpghome=%s' % gpghome) argv = ['gpg', '-q', '--batch', '--no-textmode'] + args tracing.trace('argv=%s', repr(argv)) p = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE, env=env) out, err = p.communicate(stdin) # Return output data, or deal with errors. if p.returncode: # pragma: no cover raise obnamlib.EncryptionError(err) return out def get_public_key(keyid, gpghome=None): '''Return the ASCII armored export form of a given public key.''' return _gpg(['--export', '--armor', keyid], gpghome=gpghome) def get_public_key_user_ids(keyid, gpghome=None): # pragma: no cover '''Return the user IDs of a given public key.''' user_ids = [] output = _gpg(['--with-colons', '--list-keys', keyid], gpghome=gpghome) for line in output.splitlines(): token = line.split(":") if len(token) >= 10: user_id = token[9].strip().replace(r'\x3a', ":") if user_id: user_ids.append(user_id) return user_ids class Keyring(object): '''A simplistic representation of GnuPG keyrings. Just enough functionality for obnam's purposes. 
''' _keyring_name = 'pubring.gpg' def __init__(self, encoded=''): self._encoded = encoded self._gpghome = None self._keyids = None def _setup(self): self._gpghome = tempfile.mkdtemp() f = open(self._keyring, 'wb') f.write(self._encoded) f.close() def _cleanup(self): shutil.rmtree(self._gpghome) self._gpghome = None @property def _keyring(self): return os.path.join(self._gpghome, self._keyring_name) def _real_keyids(self): output = self.gpg(False, ['--list-keys', '--with-colons']) keyids = [] for line in output.splitlines(): fields = line.split(':') if len(fields) >= 5 and fields[0] == 'pub': keyids.append(fields[4]) return keyids def keyids(self): if self._keyids is None: self._keyids = self._real_keyids() return self._keyids def __str__(self): return self._encoded def __contains__(self, keyid): return keyid in self.keyids() def _reread_keyring(self): f = open(self._keyring, 'rb') self._encoded = f.read() f.close() self._keyids = None def add(self, key): self.gpg(True, ['--import'], stdin=key) def remove(self, keyid): self.gpg(True, ['--delete-key', '--yes', keyid]) def gpg(self, reread, *args, **kwargs): self._setup() kwargs['gpghome'] = self._gpghome try: result = _gpg(*args, **kwargs) except BaseException: # pragma: no cover self._cleanup() raise else: if reread: self._reread_keyring() self._cleanup() return result class SecretKeyring(Keyring): '''Same as Keyring, but for secret keys.''' _keyring_name = 'secring.gpg' def _real_keyids(self): output = self.gpg(False, ['--list-secret-keys', '--with-colons']) keyids = [] for line in output.splitlines(): fields = line.split(':') if len(fields) >= 5 and fields[0] == 'sec': keyids.append(fields[4]) return keyids def encrypt_with_keyring(cleartext, keyring): '''Encrypt data with all keys in a keyring.''' recipients = [] for keyid in keyring.keyids(): recipients += ['-r', keyid] return keyring.gpg(False, ['-e', '--trust-model', 'always', '--no-encrypt-to', '--no-default-recipient', ] + recipients, stdin=cleartext) def 
decrypt_with_secret_keys(encrypted, gpghome=None): '''Decrypt data using secret keys GnuPG finds on its own.''' return _gpg(['-d'], stdin=encrypted, gpghome=gpghome) obnam-1.6.1/obnamlib/encryption_tests.py0000644000175000017500000001531012246357067020264 0ustar jenkinsjenkins# Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . import os import shutil import subprocess import tempfile import unittest import obnamlib def cat(filename): f = open(filename, 'rb') data = f.read() f.close() return data class SymmetricEncryptionTests(unittest.TestCase): # We don't test the quality of keys or encryption here. Doing that is # hard to do well, and we'll just assume that reading /dev/random # for keys, and using gpg for encryption, is going to work well. # In these tests, we care about making sure we use the tools right, # not that the tools themselves work right. 
def test_generates_key_of_correct_length(self): numbits = 16 key = obnamlib.generate_symmetric_key(numbits, filename='/dev/zero') self.assertEqual(len(key) * 8 / 2, numbits) # /2 for hex encoding def test_generates_key_with_size_rounded_up(self): numbits = 15 key = obnamlib.generate_symmetric_key(numbits, filename='/dev/zero') self.assertEqual(len(key)/2, 2) # /2 for hex encoding def test_encrypts_into_different_string_than_cleartext(self): cleartext = 'hello world' key = 'sekr1t' encrypted = obnamlib.encrypt_symmetric(cleartext, key) self.assertNotEqual(cleartext, encrypted) def test_encrypt_decrypt_round_trip(self): cleartext = 'hello, world' key = 'sekr1t' encrypted = obnamlib.encrypt_symmetric(cleartext, key) decrypted = obnamlib.decrypt_symmetric(encrypted, key) self.assertEqual(decrypted, cleartext) class SymmetricKeyCacheTests(unittest.TestCase): def setUp(self): self.cache = obnamlib.SymmetricKeyCache() self.repo = 'repo' self.repo2 = 'repo2' self.toplevel = 'toplevel' self.key = 'key' self.key2 = 'key2' def test_does_not_have_key_initially(self): self.assertEqual(self.cache.get(self.repo, self.toplevel), None) def test_remembers_key(self): self.cache.put(self.repo, self.toplevel, self.key) self.assertEqual(self.cache.get(self.repo, self.toplevel), self.key) def test_does_not_remember_key_for_different_repo(self): self.cache.put(self.repo, self.toplevel, self.key) self.assertEqual(self.cache.get(self.repo2, self.toplevel), None) def test_remembers_keys_for_both_repos(self): self.cache.put(self.repo, self.toplevel, self.key) self.cache.put(self.repo2, self.toplevel, self.key2) self.assertEqual(self.cache.get(self.repo, self.toplevel), self.key) self.assertEqual(self.cache.get(self.repo2, self.toplevel), self.key2) def test_clears_cache(self): self.cache.put(self.repo, self.toplevel, self.key) self.cache.clear() self.assertEqual(self.cache.get(self.repo, self.toplevel), None) class GetPublicKeyTests(unittest.TestCase): def setUp(self): self.dirname = 
tempfile.mkdtemp() self.gpghome = os.path.join(self.dirname, 'gpghome') shutil.copytree('test-gpghome', self.gpghome) self.keyid = '1B321347' def tearDown(self): shutil.rmtree(self.dirname) def test_exports_key(self): key = obnamlib.get_public_key(self.keyid, gpghome=self.gpghome) self.assert_('-----BEGIN PGP PUBLIC KEY BLOCK-----' in key) class KeyringTests(unittest.TestCase): def setUp(self): self.keyring = obnamlib.Keyring() self.keyid = '3B1802F81B321347' self.key = ''' -----BEGIN PGP PUBLIC KEY BLOCK----- Version: GnuPG v1.4.10 (GNU/Linux) mI0ETY8gwwEEAMrSXBIJseIv9miuwnYlCd7CQCzNb8nHYkpo4o1nEQD3k/h7xj9m /0Gd5kLfF+WLwAxSJYb41JjaKs0FeUexSGNePdNFxn2CCZ4moHH19tTlWGfqCNz7 vcYQpSbPix+zhR7uNqilxtsIrx1iyYwh7L2VKf/KMJ7yXbT+jbAj7fqBABEBAAG0 CFRlc3QgS2V5iLgEEwECACIFAk2PIMMCGwMGCwkIBwMCBhUIAgkKCwQWAgMBAh4B AheAAAoJEDsYAvgbMhNHlEED/1UkiLJ8R3phMRnjLtn+5JobYvOi7WEubnRv1rnN MC4MyhFiLux7Z8p3xwt1Pf2GqL7q1dD91NOx+6KS3d1PFmiM/i1fYalZPbzm1gNr 8sFK2Gxsnd7mmYf2wKIo335Bk21SCmGcNKvmKW2M6ckzPT0q/RZ2hhY9JhHUiLG4 Lu3muI0ETY8gwwEEAMQoiBCQYky52pDamnH5c7FngCM72AkNq/z0+DHqY202gksd Vy63TF7UGIsiCLvY787vPm62sOqYO0uI6PV5xVDGyJh4oI/g2zgNkhXRZrIB1Q+T THp7qSmwQUZv8T+HfgxLiaXDq6oV/HWLElcMQ9ClZ3Sxzlu3ZQHrtmY5XridABEB AAGInwQYAQIACQUCTY8gwwIbDAAKCRA7GAL4GzITR4hgBAClEurTj5n0/21pWZH0 Ljmokwa3FM++OZxO7shc1LIVNiAKfLiPigU+XbvSeVWTeajKkvj5LCVxKQiRSiYB Z85TYTo06kHvDCYQmFOSGrLsZxMyJCfHML5spF9+bej5cepmuNVIdJK5vlgDiVr3 uWUO7gMi+AlnxbfXVCTEgw3xhg== =j+6W -----END PGP PUBLIC KEY BLOCK----- ''' def test_has_no_keys_initially(self): self.assertEqual(self.keyring.keyids(), []) self.assertEqual(str(self.keyring), '') def test_gets_no_keys_from_empty_encoded(self): keyring = obnamlib.Keyring(encoded='') self.assertEqual(keyring.keyids(), []) def test_adds_key(self): self.keyring.add(self.key) self.assertEqual(self.keyring.keyids(), [self.keyid]) self.assert_(self.keyid in self.keyring) def test_removes_key(self): self.keyring.add(self.key) self.keyring.remove(self.keyid) self.assertEqual(self.keyring.keyids(), []) def 
test_export_import_roundtrip_works(self): self.keyring.add(self.key) exported = str(self.keyring) keyring2 = obnamlib.Keyring(exported) self.assertEqual(keyring2.keyids(), [self.keyid]) class SecretKeyringTests(unittest.TestCase): def test_lists_correct_key(self): keyid1 = '3B1802F81B321347' keyid2 = 'DF3D13AA11E69900' seckeys = obnamlib.SecretKeyring(cat('test-gpghome/secring.gpg')) self.assertEqual(sorted(seckeys.keyids()), sorted([keyid1, keyid2])) class PublicKeyEncryptionTests(unittest.TestCase): def test_roundtrip_works(self): cleartext = 'hello, world' passphrase = 'password1' keyring = obnamlib.Keyring(cat('test-gpghome/pubring.gpg')) seckeys = obnamlib.SecretKeyring(cat('test-gpghome/secring.gpg')) encrypted = obnamlib.encrypt_with_keyring(cleartext, keyring) decrypted = obnamlib.decrypt_with_secret_keys(encrypted, gpghome='test-gpghome') self.assertEqual(decrypted, cleartext) obnam-1.6.1/obnamlib/forget_policy.py0000644000175000017500000000764212246357067017526 0ustar jenkinsjenkins# Copyright (C) 2010 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . import re import obnamlib class ForgetPolicy(object): '''Parse and interpret a policy for what to forget and what to keep. See documentation for the --keep option for details. 
''' periods = { 'h': 'hourly', 'd': 'daily', 'w': 'weekly', 'm': 'monthly', 'y': 'yearly', } rule_pat = re.compile(r'(?P\d+)(?P(h|d|w|m|y))') def parse(self, optarg): '''Parse the argument of --keep. Return a dictionary indexed by 'hourly', 'daily', 'weekly', 'monthly', 'yearly', and giving the number of generations to keep for each time period. ''' remaining = optarg m = self.rule_pat.match(remaining) if not m: raise obnamlib.Error('Forget policy syntax error: %s' % optarg) result = dict((y, None) for x, y in self.periods.iteritems()) while m: count = int(m.group('count')) period = self.periods[m.group('period')] if result[period] is not None: raise obnamlib.Error('Forget policy may not ' 'duplicate period (%s): %s' % (period, optarg)) result[period] = count remaining = remaining[m.end():] if not remaining: break if not remaining.startswith(','): raise obnamlib.Error('Forget policy must have rules ' 'separated by commas: %s' % optarg) remaining = remaining[1:] m = self.rule_pat.match(remaining) result.update((x, 0) for x, y in result.iteritems() if y is None) return result def last_in_each_period(self, period, genlist): formats = { 'hourly': '%Y-%m-%d %H', 'daily': '%Y-%m-%d', 'weekly': '%Y-%W', 'monthly': '%Y-%m', 'yearly': '%Y', } matches = [] for genid, dt in genlist: formatted = dt.strftime(formats[period]) if not matches: matches.append((genid, formatted)) elif matches[-1][1] == formatted: matches[-1] = (genid, formatted) else: matches.append((genid, formatted)) return [genid for genid, formatted in matches] def match(self, rules, genlist): '''Match a parsed ruleset against a list of generations and times. The ruleset should be of the form returned by the parse method. genlist should be a list of generation identifiers and timestamps. Identifiers can be anything, timestamps should be an instance of datetime.datetime, with no time zone (it is ignored). genlist should be in ascending order by time: oldest one first. 
Return value is all those pairs from genlist that should be kept (i.e., which match the rules). ''' result_ids = set() for period in rules: genids = self.last_in_each_period(period, genlist) if rules[period]: for genid in genids[-rules[period]:]: result_ids.add(genid) return [(genid, dt) for genid, dt in genlist if genid in result_ids] obnam-1.6.1/obnamlib/forget_policy_tests.py0000644000175000017500000001164312246357067020744 0ustar jenkinsjenkins# Copyright (C) 2010 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . 
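The `--keep` syntax that `ForgetPolicy.parse` accepts can be summarized with a standalone re-implementation (illustrative only — `parse_keep` and `ValueError` stand in for the obnamlib method and `obnamlib.Error`): rules look like `1h,2d,3w,4m,255y`, duplicate periods are rejected, and any period not mentioned defaults to keeping 0 generations.

```python
import re

PERIODS = {'h': 'hourly', 'd': 'daily', 'w': 'weekly',
           'm': 'monthly', 'y': 'yearly'}
RULE = re.compile(r'^(\d+)([hdwmy])$')

def parse_keep(optarg):
    # All five periods are always present in the result; unspecified
    # ones keep 0 generations.
    result = dict((name, 0) for name in PERIODS.values())
    seen = set()
    for part in optarg.split(','):
        m = RULE.match(part)
        if not m:
            raise ValueError('Forget policy syntax error: %s' % optarg)
        period = PERIODS[m.group(2)]
        if period in seen:
            raise ValueError('Forget policy may not duplicate '
                             'period (%s): %s' % (period, optarg))
        seen.add(period)
        result[period] = int(m.group(1))
    return result
```

The parse tests in `forget_policy_tests.py` below exercise exactly these cases: empty string, unknown period letter, duplicated period, and single versus multiple rules.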
import datetime import unittest import obnamlib class ForgetPolicyParseTests(unittest.TestCase): def setUp(self): self.fp = obnamlib.ForgetPolicy() def test_raises_error_for_empty_string(self): self.assertRaises(obnamlib.Error, self.fp.parse, '') def test_raises_error_for_unknown_period(self): self.assertRaises(obnamlib.Error, self.fp.parse, '7x') def test_raises_error_if_period_is_duplicated(self): self.assertRaises(obnamlib.Error, self.fp.parse, '1h,2h') def test_raises_error_rules_not_separated_by_comma(self): self.assertRaises(obnamlib.Error, self.fp.parse, '1h 2d') def test_parses_single_rule(self): self.assertEqual(self.fp.parse('7d'), { 'hourly': 0, 'daily': 7, 'weekly': 0, 'monthly': 0, 'yearly': 0 }) def test_parses_multiple_rules(self): self.assertEqual(self.fp.parse('1h,2d,3w,4m,255y'), { 'hourly': 1, 'daily': 2, 'weekly': 3, 'monthly': 4, 'yearly': 255 }) class ForgetPolicyMatchTests(unittest.TestCase): def setUp(self): self.fp = obnamlib.ForgetPolicy() def match2(self, spec, times): rules = self.fp.parse(spec) return [dt for i, dt in self.fp.match(rules, list(enumerate(times)))] def test_hourly_matches(self): h0m0 = datetime.datetime(2000, 1, 1, 0, 0) h0m59 = datetime.datetime(2000, 1, 1, 0, 59) h1m0 = datetime.datetime(2000, 1, 1, 1, 0) h1m59 = datetime.datetime(2000, 1, 1, 1, 59) self.assertEqual(self.match2('1h', [h0m0, h0m59, h1m0, h1m59]), [h1m59]) def test_two_hourly_matches(self): h0m0 = datetime.datetime(2000, 1, 1, 0, 0) h0m59 = datetime.datetime(2000, 1, 1, 0, 59) h1m0 = datetime.datetime(2000, 1, 1, 1, 0) h1m59 = datetime.datetime(2000, 1, 1, 1, 59) self.assertEqual(self.match2('2h', [h0m0, h0m59, h1m0, h1m59]), [h0m59, h1m59]) def test_daily_matches(self): d1h0 = datetime.datetime(2000, 1, 1, 0, 0) d1h23 = datetime.datetime(2000, 1, 1, 23, 0) d2h0 = datetime.datetime(2000, 1, 2, 0, 0) d2h23 = datetime.datetime(2000, 1, 2, 23, 0) self.assertEqual(self.match2('1d', [d1h0, d1h23, d2h0, d2h23]), [d2h23]) # Not testing weekly matching, since I 
can't figure out how to make # a sensible test case right now. def test_monthly_matches(self): m1d1 = datetime.datetime(2000, 1, 1, 0, 0) m1d28 = datetime.datetime(2000, 1, 28, 0, 0) m2d1 = datetime.datetime(2000, 2, 1, 0, 0) m2d28 = datetime.datetime(2000, 2, 28, 0, 0) self.assertEqual(self.match2('1m', [m1d1, m1d28, m2d1, m2d28]), [m2d28]) def test_yearly_matches(self): y1m1 = datetime.datetime(2000, 1, 1, 0, 0) y1m12 = datetime.datetime(2000, 12, 1, 0, 0) y2m1 = datetime.datetime(2001, 1, 1, 0, 0) y2m12 = datetime.datetime(2001, 12, 1, 0, 0) self.assertEqual(self.match2('1y', [y1m1, y1m12, y2m1, y2m12]), [y2m12]) def test_hourly_and_daily_match_together(self): d1h0m0 = datetime.datetime(2000, 1, 1, 0, 0) d1h0m1 = datetime.datetime(2000, 1, 1, 0, 1) d2h0m0 = datetime.datetime(2000, 1, 2, 0, 0) d2h0m1 = datetime.datetime(2000, 1, 2, 0, 1) d3h0m0 = datetime.datetime(2000, 1, 3, 0, 0) d3h0m1 = datetime.datetime(2000, 1, 3, 0, 1) genlist = list(enumerate([d1h0m0, d1h0m1, d2h0m0, d2h0m1, d3h0m0, d3h0m1])) rules = self.fp.parse('1h,2d') self.assertEqual([dt for genid, dt in self.fp.match(rules, genlist)], [d2h0m1, d3h0m1]) def test_hourly_and_daily_together_when_only_daily_backups(self): d1 = datetime.datetime(2000, 1, 1, 0, 0) d2 = datetime.datetime(2000, 1, 2, 0, 0) d3 = datetime.datetime(2000, 1, 3, 0, 0) self.assertEqual(self.match2('10h,1d', [d1, d2, d3]), [d1, d2, d3]) obnam-1.6.1/obnamlib/hooks.py0000644000175000017500000001402312246357067015773 0ustar jenkinsjenkins# Copyright (C) 2009 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . '''Hooks with callbacks. In order to de-couple parts of the application, especially when plugins are used, hooks can be used. A hook is a location in the application code where plugins may want to do something. Each hook has a name and a list of callbacks. The application defines the name and the location where the hook will be invoked, and the plugins (or other parts of the application) will register callbacks. ''' import logging import tracing import obnamlib class Hook(object): '''A hook.''' EARLY_PRIORITY = 250 DEFAULT_PRIORITY = 500 LATE_PRIORITY = 750 def __init__(self): self.callbacks = [] self.priorities = {} def add_callback(self, callback, priority=DEFAULT_PRIORITY): '''Add a callback to this hook. Return an identifier that can be used to remove this callback. ''' if callback not in self.callbacks: self.priorities[callback] = priority self.callbacks.append(callback) self.callbacks.sort(lambda x,y: cmp(self.priorities[x], self.priorities[y])) return callback def call_callbacks(self, *args, **kwargs): '''Call all callbacks with the given arguments.''' for callback in self.callbacks: callback(*args, **kwargs) def remove_callback(self, callback_id): '''Remove a specific callback.''' if callback_id in self.callbacks: self.callbacks.remove(callback_id) del self.priorities[callback_id] class MissingFilterError(obnamlib.Error): '''Missing tag encountered reading filtered data.''' def __init__(self, tagname): self.tagname = tagname logging.warning("Missing tag: " + repr(tagname)) obnamlib.Error.__init__(self, "Unknown filter tag encountered: %s" % repr(tagname)) class FilterHook(Hook): '''A hook which filters data through callbacks. Every hook of this type accepts a piece of data as its first argument Each callback gets the return value of the previous one as its argument. 
The caller gets the value of the final callback. Other arguments (with or without keywords) are passed as-is to each callback. ''' def __init__(self): Hook.__init__(self) self.bytag = {} def add_callback(self, callback, priority=Hook.DEFAULT_PRIORITY): assert(hasattr(callback, "tag")) assert(hasattr(callback, "filter_read")) assert(hasattr(callback, "filter_write")) self.bytag[callback.tag] = callback return Hook.add_callback(self, callback, priority) def remove_callback(self, callback_id): Hook.remove_callback(self, callback_id) del self.bytag[callback_id.tag] def call_callbacks(self, data, *args, **kwargs): raise NotImplementedError() def run_filter_read(self, data, *args, **kwargs): tag, content = data.split("\0", 1) while tag != "": if tag not in self.bytag: raise MissingFilterError(tag) data = self.bytag[tag].filter_read(content, *args, **kwargs) tag, content = data.split("\0", 1) return content def run_filter_write(self, data, *args, **kwargs): tracing.trace('called') data = "\0" + data for filt in self.callbacks: tracing.trace('calling %s' % filt) new_data = filt.filter_write(data, *args, **kwargs) assert new_data is not None, \ filt.tag + ": Returned None from filter_write()" if data != new_data: tracing.trace('filt.tag=%s' % filt.tag) data = filt.tag + "\0" + new_data tracing.trace('done') return data class HookManager(object): '''Manage the set of hooks the application defines.''' def __init__(self): self.hooks = {} self.filters = {} def new(self, name): '''Create a new hook. If a hook with that name already exists, nothing happens. 
''' if name not in self.hooks: self.hooks[name] = Hook() def new_filter(self, name): '''Create a new filter hook.''' if name not in self.filters: self.filters[name] = FilterHook() def add_callback(self, name, callback, priority=Hook.DEFAULT_PRIORITY): '''Add a callback to a named hook.''' if name in self.hooks: return self.hooks[name].add_callback(callback, priority) else: return self.filters[name].add_callback(callback, priority) def remove_callback(self, name, callback_id): '''Remove a specific callback from a named hook.''' if name in self.hooks: self.hooks[name].remove_callback(callback_id) else: self.filters[name].remove_callback(callback_id) def call(self, name, *args, **kwargs): '''Call callbacks for a named hook, using given arguments.''' self.hooks[name].call_callbacks(*args, **kwargs) def filter_read(self, name, *args, **kwargs): '''Run reader filter for named filter, using given arguments.''' return self.filters[name].run_filter_read(*args, **kwargs) def filter_write(self, name, *args, **kwargs): '''Run writer filter for named filter, using given arguments.''' return self.filters[name].run_filter_write(*args, **kwargs) obnam-1.6.1/obnamlib/hooks_tests.py0000644000175000017500000001555012246357067017223 0ustar jenkinsjenkins# Copyright (C) 2009 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . 
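The `tag\0payload` framing used by `FilterHook.run_filter_write` and `run_filter_read` above can be shown with a standalone sketch (illustrative functions, not the obnamlib classes; bytes literals are used for clarity): writing wraps the payload in each filter's tag, and reading peels tags until the empty tag that marks the cleartext.

```python
import base64

class Base64Filter(object):
    # Minimal filter: tag plus symmetric read/write transforms.
    tag = 'base64'

    def filter_write(self, data):
        return base64.b64encode(data)

    def filter_read(self, data):
        return base64.b64decode(data)

def run_filter_write(filters, data):
    data = b'\0' + data  # empty tag marks the cleartext payload
    for filt in filters:
        data = filt.tag.encode('ascii') + b'\0' + filt.filter_write(data)
    return data

def run_filter_read(filters, data):
    bytag = dict((f.tag.encode('ascii'), f) for f in filters)
    tag, content = data.split(b'\0', 1)
    while tag != b'':
        content = bytag[tag].filter_read(content)
        tag, content = content.split(b'\0', 1)
    return content
```

This mirrors the round trip the `FilterHookTests` below check: encoding `OK` yields `base64\0AE9L`, and reading that frame recovers `OK`.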
import unittest import obnamlib import base64 class HookTests(unittest.TestCase): def setUp(self): self.hook = obnamlib.Hook() def callback(self, *args, **kwargs): self.args = args self.kwargs = kwargs def callback2(self, *args, **kwargs): self.args2 = args self.kwargs2 = kwargs def test_has_no_callbacks_by_default(self): self.assertEqual(self.hook.callbacks, []) def test_adds_callback(self): self.hook.add_callback(self.callback) self.assertEqual(self.hook.callbacks, [self.callback]) def test_adds_callback_only_once(self): self.hook.add_callback(self.callback) self.hook.add_callback(self.callback) self.assertEqual(self.hook.callbacks, [self.callback]) def test_adds_two_callbacks(self): id1 = self.hook.add_callback(self.callback) id2 = self.hook.add_callback(self.callback2, obnamlib.Hook.DEFAULT_PRIORITY + 1) self.assertEqual(self.hook.callbacks, [self.callback, self.callback2]) self.assertNotEqual(id1, id2) def test_adds_callbacks_in_reverse_order(self): id1 = self.hook.add_callback(self.callback) id2 = self.hook.add_callback(self.callback2, obnamlib.Hook.DEFAULT_PRIORITY - 1) self.assertEqual(self.hook.callbacks, [self.callback2, self.callback]) self.assertNotEqual(id1, id2) def test_calls_callback(self): self.hook.add_callback(self.callback) self.hook.call_callbacks('bar', kwarg='foobar') self.assertEqual(self.args, ('bar',)) self.assertEqual(self.kwargs, { 'kwarg': 'foobar' }) def test_removes_callback(self): cb_id = self.hook.add_callback(self.callback) self.hook.remove_callback(cb_id) self.assertEqual(self.hook.callbacks, []) class NeverAddsFilter(object): def __init__(self): self.tag = "never" def filter_read(self, data, *args, **kwargs): self.args = args self.kwargs = kwargs self.wasread = True return data def filter_write(self, data, *args, **kwargs): self.args = args self.kwargs = kwargs self.wasread = False return data class Base64Filter(object): def __init__(self): self.tag = "base64" def filter_read(self, data, *args, **kwargs): self.args = args 
self.kwargs = kwargs self.wasread = True return base64.b64decode(data) def filter_write(self, data, *args, **kwargs): self.args = args self.kwargs = kwargs self.wasread = False return base64.b64encode(data) class FilterHookTests(unittest.TestCase): def setUp(self): self.hook = obnamlib.FilterHook() def test_add_filter_ok(self): self.hook.add_callback(NeverAddsFilter()) def test_never_filter_no_tags(self): self.hook.add_callback(NeverAddsFilter()) self.assertEquals(self.hook.run_filter_write("foo"), "\0foo") def test_never_filter_clean_revert(self): self.hook.add_callback(NeverAddsFilter()) self.assertEquals(self.hook.run_filter_read("\0foo"), "foo") def test_base64_filter_encode(self): self.hook.add_callback(Base64Filter()) self.assertEquals(self.hook.run_filter_write("OK"), "base64\0AE9L") def test_base64_filter_decode(self): self.hook.add_callback(Base64Filter()) self.assertEquals(self.hook.run_filter_read("base64\0AE9L"), "OK") def test_missing_filter_raises(self): self.assertRaises(obnamlib.MissingFilterError, self.hook.run_filter_read, "missing\0") def test_missing_filter_gives_tag(self): try: self.hook.run_filter_read("missing\0") except obnamlib.MissingFilterError, e: self.assertEquals(e.tagname, "missing") def test_can_remove_filters(self): myfilter = NeverAddsFilter() filterid = self.hook.add_callback(myfilter) self.hook.remove_callback(filterid) self.assertEquals(self.hook.callbacks, []) def test_call_callbacks_raises(self): self.assertRaises(NotImplementedError, self.hook.call_callbacks, "") class HookManagerTests(unittest.TestCase): def setUp(self): self.hooks = obnamlib.HookManager() self.hooks.new('foo') def callback(self, *args, **kwargs): self.args = args self.kwargs = kwargs def test_has_no_tests_initially(self): hooks = obnamlib.HookManager() self.assertEqual(hooks.hooks, {}) def test_adds_new_hook(self): self.assert_(self.hooks.hooks.has_key('foo')) def test_adds_new_filter_hook(self): self.hooks.new_filter('bar') self.assert_('bar' in 
self.hooks.filters) def test_adds_callback(self): self.hooks.add_callback('foo', self.callback) self.assertEqual(self.hooks.hooks['foo'].callbacks, [self.callback]) def test_removes_callback(self): cb_id = self.hooks.add_callback('foo', self.callback) self.hooks.remove_callback('foo', cb_id) self.assertEqual(self.hooks.hooks['foo'].callbacks, []) def test_calls_callbacks(self): self.hooks.add_callback('foo', self.callback) self.hooks.call('foo', 'bar', kwarg='foobar') self.assertEqual(self.args, ('bar',)) self.assertEqual(self.kwargs, { 'kwarg': 'foobar' }) def test_filter_write_returns_value_of_callbacks(self): self.hooks.new_filter('bar') self.assertEquals(self.hooks.filter_write('bar', "foo"), "\0foo") def test_filter_read_returns_value_of_callbacks(self): self.hooks.new_filter('bar') self.assertEquals(self.hooks.filter_read('bar', "\0foo"), "foo") def test_add_callbacks_to_filters(self): self.hooks.new_filter('bar') filt = NeverAddsFilter() self.hooks.add_callback('bar', filt) self.assertEquals(self.hooks.filters['bar'].callbacks, [filt]) def test_remove_callbacks_from_filters(self): self.hooks.new_filter('bar') filt = NeverAddsFilter() self.hooks.add_callback('bar', filt) self.hooks.remove_callback('bar', filt) self.assertEquals(self.hooks.filters['bar'].callbacks, []) obnam-1.6.1/obnamlib/lockmgr.py0000644000175000017500000000531412246357067016311 0ustar jenkinsjenkins# Copyright 2012 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


import os
import time

import obnamlib


class LockManager(object):

    '''Lock and unlock sets of directories at once.'''

    def __init__(self, fs, timeout, client):
        self._fs = fs
        self.timeout = timeout
        data = ["[lockfile]"]
        data = data + ["client=" + client]
        data = data + ["pid=%d" % os.getpid()]
        data = data + self._read_boot_id()
        self.data = '\r\n'.join(data)

    def _read_boot_id(self): # pragma: no cover
        try:
            with open("/proc/sys/kernel/random/boot_id", "r") as f:
                boot_id = f.read().strip()
        except:
            return []
        else:
            return ["boot_id=%s" % boot_id]

    def _time(self): # pragma: no cover
        return time.time()

    def _sleep(self): # pragma: no cover
        time.sleep(1)

    def sort(self, dirnames):
        def bytelist(s):
            return [ord(s) for s in str(s)]
        return sorted(dirnames, key=bytelist)

    def _lockname(self, dirname):
        return os.path.join(dirname, 'lock')

    def _lock_one(self, dirname):
        started = self._time()
        while True:
            lock_name = self._lockname(dirname)
            try:
                self._fs.lock(lock_name, self.data)
            except obnamlib.LockFail:
                if self._time() - started >= self.timeout:
                    raise obnamlib.LockFail('Lock timeout: %s' % lock_name)
            else:
                return
            self._sleep()

    def _unlock_one(self, dirname):
        self._fs.unlock(self._lockname(dirname))

    def lock(self, dirnames):
        '''Lock ALL the directories.'''
        we_locked = []
        for dirname in self.sort(dirnames):
            try:
                self._lock_one(dirname)
            except obnamlib.LockFail:
                self.unlock(we_locked)
                raise
            else:
                we_locked.append(dirname)

    def unlock(self, dirnames):
        '''Unlock ALL the directories.'''
        for dirname in self.sort(dirnames):
            self._unlock_one(dirname)

obnam-1.6.1/obnamlib/lockmgr_tests.py

# Copyright 2012  Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


import os
import shutil
import tempfile
import unittest

import obnamlib


class LockManagerTests(unittest.TestCase):

    def locked(self, dirname):
        return os.path.exists(os.path.join(dirname, 'lock'))

    def fake_time(self):
        self.now += 1
        return self.now

    def setUp(self):
        self.tempdir = tempfile.mkdtemp()

        self.dirnames = []
        for x in ['a', 'b', 'c']:
            dirname = os.path.join(self.tempdir, x)
            os.mkdir(dirname)
            self.dirnames.append(dirname)

        self.fs = obnamlib.LocalFS(self.tempdir)

        self.timeout = 10
        self.now = 0
        self.lm = obnamlib.LockManager(self.fs, self.timeout, '')
        self.lm._time = self.fake_time
        self.lm._sleep = lambda: None

    def tearDown(self):
        shutil.rmtree(self.tempdir)

    def test_has_nothing_locked_initially(self):
        for dirname in self.dirnames:
            self.assertFalse(self.locked(dirname))

    def test_locks_single_directory(self):
        self.lm.lock([self.dirnames[0]])
        self.assertTrue(self.locked(self.dirnames[0]))

    def test_unlocks_single_directory(self):
        self.lm.lock([self.dirnames[0]])
        self.lm.unlock([self.dirnames[0]])
        self.assertFalse(self.locked(self.dirnames[0]))

    def test_waits_until_timeout_for_locked_directory(self):
        self.lm.lock([self.dirnames[0]])
        self.assertRaises(obnamlib.LockFail,
                          self.lm.lock, [self.dirnames[0]])
        self.assertTrue(self.now >= self.timeout)

    def test_notices_when_preexisting_lock_goes_away(self):
        self.lm.lock([self.dirnames[0]])
        self.lm._sleep = \
            lambda: os.remove(self.lm._lockname(self.dirnames[0]))
        self.lm.lock([self.dirnames[0]])
        self.assertTrue(True)

    def test_locks_all_directories(self):
        self.lm.lock(self.dirnames)
        for dirname in self.dirnames:
            self.assertTrue(self.locked(dirname))

    def test_unlocks_all_directories(self):
        self.lm.lock(self.dirnames)
        self.lm.unlock(self.dirnames)
        for dirname in self.dirnames:
            self.assertFalse(self.locked(dirname))

    def test_does_not_lock_anything_if_one_lock_fails(self):
        self.lm.lock([self.dirnames[-1]])
        self.assertRaises(obnamlib.LockFail, self.lm.lock, self.dirnames)
        for dirname in self.dirnames[:-1]:
            self.assertFalse(self.locked(dirname))
        self.assertTrue(self.locked(self.dirnames[-1]))

obnam-1.6.1/obnamlib/metadata.py

# Copyright (C) 2009  Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


import errno
import grp
import logging
import os
import pwd
import stat
import struct
import tracing

import obnamlib


metadata_verify_fields = (
    'st_mode', 'st_mtime_sec', 'st_mtime_nsec', 'st_nlink', 'st_size',
    'st_uid', 'groupname', 'username', 'target', 'xattr',
)
metadata_fields = metadata_verify_fields + (
    'st_blocks', 'st_dev', 'st_gid', 'st_ino', 'st_atime_sec',
    'st_atime_nsec', 'md5',
)


class Metadata(object):

    '''Represent metadata for a filesystem entry.

    The metadata for a filesystem entry (file, directory, device, ...)
    consists of its stat(2) result, plus ACL and xattr.

    This class represents them as fields.

    We do not store all stat(2) fields. Here's a commentary on all fields:

        field?          stored? why
        st_atime_sec    yes     mutt compares atime, mtime to see if msg is new
        st_atime_nsec   yes     mutt compares atime, mtime to see if msg is new
        st_blksize      no      no way to restore, not useful backed up
        st_blocks       yes     should restore create holes in file?
        st_ctime        no      no way to restore, not useful backed up
        st_dev          yes     used to restore hardlinks
        st_gid          yes     used to restore group ownership
        st_ino          yes     used to restore hardlinks
        st_mode         yes     used to restore permissions
        st_mtime_sec    yes     used to restore mtime
        st_mtime_nsec   yes     used to restore mtime
        st_nlink        yes     used to restore hardlinks
        st_rdev         no      no use (correct me if I'm wrong about this)
        st_size         yes     user needs it to see size of file in backup
        st_uid          yes     used to restore ownership

    The field 'target' stores the target of a symlink.

    Additionally, the fields 'groupname' and 'username' are stored. They
    contain the textual names that correspond to st_gid and st_uid. When
    restoring, the names will be preferred by default.

    The 'md5' field optionally stores the whole-file checksum for the file.

    The 'xattr' field optionally stores extended attributes encoded as
    a binary blob.

    '''

    def __init__(self, **kwargs):
        for field in metadata_fields:
            setattr(self, field, None)
        for field, value in kwargs.iteritems():
            setattr(self, field, value)

    def isdir(self):
        return self.st_mode is not None and stat.S_ISDIR(self.st_mode)

    def islink(self):
        return self.st_mode is not None and stat.S_ISLNK(self.st_mode)

    def isfile(self):
        return self.st_mode is not None and stat.S_ISREG(self.st_mode)

    def __repr__(self): # pragma: no cover
        fields = ', '.join('%s=%s' % (k, getattr(self, k))
                           for k in metadata_fields)
        return 'Metadata(%s)' % fields

    def __cmp__(self, other):
        for field in metadata_fields:
            ours = getattr(self, field)
            theirs = getattr(other, field)
            if ours == theirs:
                continue
            if ours < theirs:
                return -1
            if ours > theirs:
                return +1
        return 0


# Caching versions of username/groupname lookups.
# These work on the assumption that the mappings from uid/gid do not
# change during the runtime of the backup.

_uid_to_username = {}
def _cached_getpwuid(uid): # pragma: no cover
    if uid not in _uid_to_username:
        _uid_to_username[uid] = pwd.getpwuid(uid)
    return _uid_to_username[uid]

_gid_to_groupname = {}
def _cached_getgrgid(gid): # pragma: no cover
    if gid not in _gid_to_groupname:
        _gid_to_groupname[gid] = grp.getgrgid(gid)
    return _gid_to_groupname[gid]


def get_xattrs_as_blob(fs, filename): # pragma: no cover
    tracing.trace('filename=%s' % filename)

    try:
        names = fs.llistxattr(filename)
    except (OSError, IOError), e:
        if e.errno in (errno.EOPNOTSUPP, errno.EACCES):
            return None
        raise
    tracing.trace('names=%s' % repr(names))
    if not names:
        return None

    values = []
    for name in names[:]:
        tracing.trace('trying name %s' % repr(name))
        try:
            value = fs.lgetxattr(filename, name)
        except OSError, e:
            # On btrfs, at least, this can happen: the filesystem returns
            # a list of attribute names, but then fails when looking up
            # the value for one or more of the names. We pretend that the
            # name was never returned in that case.
            #
            # Obviously this can happen due to race conditions as well.
            if e.errno == errno.ENODATA:
                names.remove(name)
                logging.warning(
                    '%s has extended attribute named %s without value, '
                    'ignoring attribute' % (filename, name))
            else:
                raise
        else:
            tracing.trace('lgetxattr(%s)=%s' % (name, value))
            values.append(value)
    assert len(names) == len(values)

    name_blob = ''.join('%s\0' % name for name in names)
    lengths = [len(v) for v in values]
    fmt = '!' + 'Q' * len(values)
    value_blob = struct.pack(fmt, *lengths) + ''.join(values)
    return ('%s%s%s' %
            (struct.pack('!Q', len(name_blob)), name_blob, value_blob))


def set_xattrs_from_blob(fs, filename, blob): # pragma: no cover
    sizesize = struct.calcsize('!Q')
    name_blob_size = struct.unpack('!Q', blob[:sizesize])[0]
    name_blob = blob[sizesize : sizesize + name_blob_size]
    value_blob = blob[sizesize + name_blob_size : ]

    names = [s for s in name_blob.split('\0')[:-1]]
    fmt = '!' + 'Q' * len(names)
    lengths_size = sizesize * len(names)
    lengths = struct.unpack(fmt, value_blob[:lengths_size])

    pos = lengths_size
    for i, name in enumerate(names):
        value = value_blob[pos:pos + lengths[i]]
        pos += lengths[i]
        fs.lsetxattr(filename, name, value)


def read_metadata(fs, filename, st=None, getpwuid=None, getgrgid=None):
    '''Return object detailing metadata for a filesystem entry.'''
    metadata = Metadata()
    stat_result = st or fs.lstat(filename)

    for field in metadata_fields:
        if field.startswith('st_') and hasattr(stat_result, field):
            setattr(metadata, field, getattr(stat_result, field))

    if stat.S_ISLNK(stat_result.st_mode):
        metadata.target = fs.readlink(filename)
    else:
        metadata.target = ''

    getgrgid = getgrgid or _cached_getgrgid
    try:
        metadata.groupname = getgrgid(metadata.st_gid)[0]
    except KeyError:
        metadata.groupname = None

    getpwuid = getpwuid or _cached_getpwuid
    try:
        metadata.username = getpwuid(metadata.st_uid)[0]
    except KeyError:
        metadata.username = None

    metadata.xattr = get_xattrs_as_blob(fs, filename)

    return metadata


def set_metadata(fs, filename, metadata, getuid=None):
    '''Set metadata for a filesystem entry.

    We only set metadata that can sensibly be set: st_atime, st_mode,
    st_mtime. We also attempt to set ownership (st_gid, st_uid), but
    only if we're running as root. We ignore the username, groupname
    fields: we assume the caller will change st_uid, st_gid accordingly
    if they want to mess with things. This makes the user take care of
    error situations and looking up user preferences.

    '''

    symlink = stat.S_ISLNK(metadata.st_mode)
    if symlink:
        fs.symlink(metadata.target, filename)

    # Set owner before mode, so that a setuid bit does not get reset.
    getuid = getuid or os.getuid
    if getuid() == 0:
        fs.lchown(filename, metadata.st_uid, metadata.st_gid)

    # If we are not the owner, and not root, do not restore setuid/setgid.
    mode = metadata.st_mode
    if getuid() not in (0, metadata.st_uid): # pragma: no cover
        mode = mode & (~stat.S_ISUID)
        mode = mode & (~stat.S_ISGID)
    if symlink:
        fs.chmod_symlink(filename, mode)
    else:
        fs.chmod_not_symlink(filename, mode)

    if metadata.xattr: # pragma: no cover
        set_xattrs_from_blob(fs, filename, metadata.xattr)

    fs.lutimes(filename, metadata.st_atime_sec, metadata.st_atime_nsec,
               metadata.st_mtime_sec, metadata.st_mtime_nsec)


metadata_format = struct.Struct('!Q' +  # flags
                                'Q' +   # st_mode
                                'qQ' +  # st_mtime_sec and _nsec
                                'qQ' +  # st_atime_sec and _nsec
                                'Q' +   # st_nlink
                                'Q' +   # st_size
                                'Q' +   # st_uid
                                'Q' +   # st_gid
                                'Q' +   # st_dev
                                'Q' +   # st_ino
                                'Q' +   # st_blocks
                                'Q' +   # len of groupname
                                'Q' +   # len of username
                                'Q' +   # len of symlink target
                                'Q' +   # len of md5
                                'Q' +   # len of xattr
                                '')


def encode_metadata(metadata):
    flags = 0
    for i, name in enumerate(obnamlib.metadata_fields):
        if getattr(metadata, name) is not None:
            flags |= (1 << i)

    try:
        packed = metadata_format.pack(flags,
                                      metadata.st_mode or 0,
                                      metadata.st_mtime_sec or 0,
                                      metadata.st_mtime_nsec or 0,
                                      metadata.st_atime_sec or 0,
                                      metadata.st_atime_nsec or 0,
                                      metadata.st_nlink or 0,
                                      metadata.st_size or 0,
                                      metadata.st_uid or 0,
                                      metadata.st_gid or 0,
                                      metadata.st_dev or 0,
                                      metadata.st_ino or 0,
                                      metadata.st_blocks or 0,
                                      len(metadata.groupname or ''),
                                      len(metadata.username or ''),
                                      len(metadata.target or ''),
                                      len(metadata.md5 or ''),
                                      len(metadata.xattr or ''))
    except TypeError, e: # pragma: no cover
        logging.error('ERROR: Packing error due to %s' % str(e))
        logging.error('ERROR: st_mode=%s' % repr(metadata.st_mode))
        logging.error('ERROR: st_mtime_sec=%s' % repr(metadata.st_mtime_sec))
        logging.error('ERROR: st_mtime_nsec=%s' % repr(metadata.st_mtime_nsec))
        logging.error('ERROR: st_atime_sec=%s' % repr(metadata.st_atime_sec))
        logging.error('ERROR: st_atime_nsec=%s' % repr(metadata.st_atime_nsec))
        logging.error('ERROR: st_nlink=%s' % repr(metadata.st_nlink))
        logging.error('ERROR: st_size=%s' % repr(metadata.st_size))
        logging.error('ERROR: st_uid=%s' % repr(metadata.st_uid))
        logging.error('ERROR: st_gid=%s' % repr(metadata.st_gid))
        logging.error('ERROR: st_dev=%s' % repr(metadata.st_dev))
        logging.error('ERROR: st_ino=%s' % repr(metadata.st_ino))
        logging.error('ERROR: st_blocks=%s' % repr(metadata.st_blocks))
        logging.error('ERROR: groupname=%s' % repr(metadata.groupname))
        logging.error('ERROR: username=%s' % repr(metadata.username))
        logging.error('ERROR: target=%s' % repr(metadata.target))
        logging.error('ERROR: md5=%s' % repr(metadata.md5))
        logging.error('ERROR: xattr=%s' % repr(metadata.xattr))
        raise

    return (packed +
            (metadata.groupname or '') +
            (metadata.username or '') +
            (metadata.target or '') +
            (metadata.md5 or '') +
            (metadata.xattr or ''))


def decode_metadata(encoded):
    items = metadata_format.unpack_from(encoded)
    flags = items[0]
    pos = [1, metadata_format.size]
    metadata = obnamlib.Metadata()

    def is_present(field):
        i = obnamlib.metadata_fields.index(field)
        return (flags & (1 << i)) != 0

    def decode(field, num_items, inc_offset, getvalue):
        if is_present(field):
            value = getvalue(pos[0], pos[1])
            setattr(metadata, field, value)
            if inc_offset:
                pos[1] += len(value)
        pos[0] += num_items

    def decode_integer(field):
        decode(field, 1, False, lambda i, o: items[i])

    def decode_string(field):
        decode(field, 1, True, lambda i, o: encoded[o:o + items[i]])

    decode_integer('st_mode')
    decode_integer('st_mtime_sec')
    decode_integer('st_mtime_nsec')
    decode_integer('st_atime_sec')
    decode_integer('st_atime_nsec')
    decode_integer('st_nlink')
    decode_integer('st_size')
    decode_integer('st_uid')
    decode_integer('st_gid')
    decode_integer('st_dev')
    decode_integer('st_ino')
    decode_integer('st_blocks')
    decode_string('groupname')
    decode_string('username')
    decode_string('target')
    decode_string('md5')
    decode_string('xattr')

    return metadata

obnam-1.6.1/obnamlib/metadata_tests.py

# Copyright (C) 2009  Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


import os
import stat
import tempfile
import unittest
import platform

import obnamlib


class FakeFS(object):

    def __init__(self):
        self.st_atime_sec = 1
        self.st_atime_nsec = 11
        self.st_blocks = 2
        self.st_dev = 3
        self.st_gid = 4
        self.st_ino = 5
        self.st_mode = 6
        self.st_mtime_sec = 7
        self.st_mtime_nsec = 71
        self.st_nlink = 8
        self.st_size = 9
        self.st_uid = 10
        self.groupname = 'group'
        self.username = 'user'
        self.target = 'target'

    def lstat(self, filename):
        return self

    def readlink(self, filename):
        return 'target'

    def getpwuid(self, uid):
        return (self.username, None, self.st_uid, self.st_gid,
                None, None, None)

    def getgrgid(self, gid):
        return (self.groupname, None, self.st_gid, None)

    def fail_getpwuid(self, uid):
        raise KeyError(uid)

    def fail_getgrgid(self, gid):
        raise KeyError(gid)

    def llistxattr(self, filename):
        return []


class MetadataTests(unittest.TestCase):

    def test_sets_mtime_from_kwarg(self):
        metadata = obnamlib.Metadata(st_mtime_sec=123)
        self.assertEqual(metadata.st_mtime_sec, 123)

    def test_isdir_returns_false_for_regular_file(self):
        metadata = obnamlib.Metadata(st_mode=stat.S_IFREG)
        self.assertFalse(metadata.isdir())

    def test_isdir_returns_true_for_directory(self):
        metadata = obnamlib.Metadata(st_mode=stat.S_IFDIR)
        self.assert_(metadata.isdir())

    def test_isdir_returns_false_when_st_mode_is_not_set(self):
        metadata = obnamlib.Metadata()
        self.assertFalse(metadata.isdir())

    def test_islink_returns_false_for_regular_file(self):
        metadata = obnamlib.Metadata(st_mode=stat.S_IFREG)
        self.assertFalse(metadata.islink())

    def test_islink_returns_true_for_symlink(self):
        metadata = obnamlib.Metadata(st_mode=stat.S_IFLNK)
        self.assert_(metadata.islink())

    def test_islink_returns_false_when_st_mode_is_not_set(self):
        metadata = obnamlib.Metadata()
        self.assertFalse(metadata.islink())

    def test_isfile_returns_true_for_regular_file(self):
        metadata = obnamlib.Metadata(st_mode=stat.S_IFREG)
        self.assert_(metadata.isfile())

    def test_isfile_returns_false_when_st_mode_is_not_set(self):
        metadata = obnamlib.Metadata()
        self.assertFalse(metadata.isfile())

    def test_has_no_md5_by_default(self):
        metadata = obnamlib.Metadata()
        self.assertEqual(metadata.md5, None)

    def test_sets_md5(self):
        metadata = obnamlib.Metadata(md5='checksum')
        self.assertEqual(metadata.md5, 'checksum')

    def test_is_equal_to_itself(self):
        metadata = obnamlib.Metadata(st_mode=stat.S_IFREG)
        self.assertEqual(metadata, metadata)

    def test_less_than_works(self):
        m1 = obnamlib.Metadata(st_size=1)
        m2 = obnamlib.Metadata(st_size=2)
        self.assert_(m1 < m2)

    def test_greater_than_works(self):
        m1 = obnamlib.Metadata(st_size=1)
        m2 = obnamlib.Metadata(st_size=2)
        self.assert_(m2 > m1)


class ReadMetadataTests(unittest.TestCase):

    def setUp(self):
        self.fakefs = FakeFS()

    def test_returns_stat_fields_correctly(self):
        metadata = obnamlib.read_metadata(self.fakefs, 'foo',
                                          getpwuid=self.fakefs.getpwuid,
                                          getgrgid=self.fakefs.getgrgid)
        fields = ['st_atime_sec', 'st_atime_nsec', 'st_blocks', 'st_dev',
                  'st_gid', 'st_ino', 'st_mode', 'st_mtime_sec',
                  'st_mtime_nsec', 'st_nlink', 'st_size', 'st_uid',
                  'groupname', 'username']
        for field in fields:
            self.assertEqual(getattr(metadata, field),
                             getattr(self.fakefs, field),
                             field)

    def test_returns_symlink_fields_correctly(self):
        self.fakefs.st_mode |= stat.S_IFLNK
        metadata = obnamlib.read_metadata(self.fakefs, 'foo',
                                          getpwuid=self.fakefs.getpwuid,
                                          getgrgid=self.fakefs.getgrgid)
        fields = ['st_mode', 'target']
        for field in fields:
            self.assertEqual(getattr(metadata, field),
                             getattr(self.fakefs, field),
                             field)

    def test_reads_username_as_None_if_lookup_fails(self):
        metadata = obnamlib.read_metadata(self.fakefs, 'foo',
                                          getpwuid=self.fakefs.fail_getpwuid,
                                          getgrgid=self.fakefs.fail_getgrgid)
        self.assertEqual(metadata.username, None)


class SetMetadataTests(unittest.TestCase):

    def setUp(self):
        self.metadata = obnamlib.Metadata()
        self.metadata.st_atime_sec = 12765
        self.metadata.st_atime_nsec = 0
        self.metadata.st_mode = 42 | stat.S_IFREG
        self.metadata.st_mtime_sec = 10**9
        self.metadata.st_mtime_nsec = 0
        self.metadata.st_uid = 1234
        self.metadata.st_gid = 5678

        fd, self.filename = tempfile.mkstemp()
        os.close(fd)
        # On some systems (e.g. FreeBSD) /tmp is apparently setgid and
        # default gid of files is therefore not the user's gid.
        os.chown(self.filename, os.getuid(), os.getgid())

        self.fs = obnamlib.LocalFS('/')
        self.fs.connect()

        self.uid_set = None
        self.gid_set = None
        self.fs.lchown = self.fake_lchown

        obnamlib.set_metadata(self.fs, self.filename, self.metadata)
        self.st = os.stat(self.filename)

    def tearDown(self):
        self.fs.close()
        os.remove(self.filename)

    def fake_lchown(self, filename, uid, gid):
        self.uid_set = uid
        self.gid_set = gid

    def test_sets_atime(self):
        self.assertEqual(self.st.st_atime, self.metadata.st_atime_sec)

    def test_sets_mode(self):
        self.assertEqual(self.st.st_mode, self.metadata.st_mode)

    def test_sets_mtime(self):
        self.assertEqual(self.st.st_mtime, self.metadata.st_mtime_sec)

    def test_does_not_set_uid_when_not_running_as_root(self):
        self.assertEqual(self.st.st_uid, os.getuid())

    def test_does_not_set_gid_when_not_running_as_root(self):
        self.assertEqual(self.st.st_gid, os.getgid())

    def test_sets_uid_when_running_as_root(self):
        obnamlib.set_metadata(self.fs, self.filename, self.metadata,
                              getuid=lambda: 0)
        self.assertEqual(self.uid_set, self.metadata.st_uid)

    def test_sets_gid_when_running_as_root(self):
        obnamlib.set_metadata(self.fs, self.filename, self.metadata,
                              getuid=lambda: 0)
        self.assertEqual(self.gid_set, self.metadata.st_gid)

    def test_sets_symlink_target(self):
        self.fs.remove(self.filename)
        self.metadata.st_mode = 0777 | stat.S_IFLNK
        self.metadata.target = 'target'
        obnamlib.set_metadata(self.fs, self.filename, self.metadata)
        self.assertEqual(self.fs.readlink(self.filename), 'target')

    def test_sets_symlink_mtime_perms(self):
        self.fs.remove(self.filename)
        self.metadata.st_mode = 0777 | stat.S_IFLNK
        self.metadata.target = 'target'
        obnamlib.set_metadata(self.fs, self.filename, self.metadata)
        st = os.lstat(self.filename)
        self.assertEqual(st.st_mode, self.metadata.st_mode)
        self.assertEqual(st.st_mtime, self.metadata.st_mtime_sec)


class MetadataCodingTests(unittest.TestCase):

    def equal(self, meta1, meta2):
        for name in dir(meta1):
            if name in obnamlib.metadata.metadata_fields:
                value1 = getattr(meta1, name)
                value2 = getattr(meta2, name)
                self.assertEqual(
                    value1,
                    value2,
                    'attribute %s must be equal (%s vs %s)' %
                        (name, value1, value2))

    def test_round_trip(self):
        metadata = obnamlib.metadata.Metadata(st_mode=1,
                                              st_mtime_sec=2,
                                              st_mtime_nsec=12756,
                                              st_nlink=3,
                                              st_size=4,
                                              st_uid=5,
                                              st_blocks=6,
                                              st_dev=7,
                                              st_gid=8,
                                              st_ino=9,
                                              st_atime_sec=10,
                                              st_atime_nsec=123,
                                              groupname='group',
                                              username='user',
                                              target='target',
                                              md5='checksum')
        encoded = obnamlib.encode_metadata(metadata)
        decoded = obnamlib.decode_metadata(encoded)
        self.equal(metadata, decoded)

    def test_round_trip_for_None_values(self):
        metadata = obnamlib.metadata.Metadata()
        encoded = obnamlib.encode_metadata(metadata)
        decoded = obnamlib.decode_metadata(encoded)
        for name in dir(metadata):
            if name in obnamlib.metadata.metadata_fields:
                self.assertEqual(getattr(decoded, name), None,
                                 'attribute %s must be None' % name)

    def test_round_trip_for_maximum_values(self):
        unsigned_max = 2**64 - 1
        signed_max = 2**63 - 1
        metadata = obnamlib.metadata.Metadata(
            st_mode=unsigned_max,
            st_mtime_sec=signed_max,
            st_mtime_nsec=unsigned_max,
            st_nlink=unsigned_max,
            st_size=signed_max,
            st_uid=unsigned_max,
            st_blocks=signed_max,
            st_dev=unsigned_max,
            st_gid=unsigned_max,
            st_ino=unsigned_max,
            st_atime_sec=signed_max,
            st_atime_nsec=unsigned_max)
        encoded = obnamlib.encode_metadata(metadata)
        decoded = obnamlib.decode_metadata(encoded)
        self.equal(metadata, decoded)

obnam-1.6.1/obnamlib/pluginbase.py

# Copyright (C) 2009  Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


import obnamlib


class ObnamPlugin(obnamlib.pluginmgr.Plugin):

    '''Base class for plugins in Obnam.'''

    def __init__(self, app):
        self.app = app

obnam-1.6.1/obnamlib/pluginbase_tests.py

# Copyright (C) 2009  Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


import unittest

import obnamlib


class FakeApp(object):

    def __init__(self):
        self.hooks = self


class ObnamPluginTests(unittest.TestCase):

    def setUp(self):
        self.fakeapp = FakeApp()
        self.plugin = obnamlib.ObnamPlugin(self.fakeapp)

    def test_has_an_app(self):
        self.assertEqual(self.plugin.app, self.fakeapp)

obnam-1.6.1/obnamlib/pluginmgr.py

# Copyright (C) 2009  Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


'''A generic plugin manager.

The plugin manager finds files with plugins and loads them. It looks
for plugins in a number of locations specified by the caller. To add
a plugin to be loaded, it is enough to put it in one of the locations,
and name it *_plugin.py. (The naming convention is to allow having
other modules as well, such as unit tests, in the same locations.)

'''


import imp
import inspect
import os


class Plugin(object):

    '''Base class for plugins.

    A plugin MUST NOT have any side effects when it is instantiated.
    This is necessary so that it can be safely loaded by unit tests, and
    so that a user interface can allow the user to disable it, even if
    it is installed, with no ill effects. Any side effects that would
    normally happen should occur in the enable() method, and be undone
    by the disable() method. These methods must be callable any number
    of times.

    The subclass MAY define the following attributes:

    * name
    * description
    * version
    * required_application_version

    name is the user-visible identifier for the plugin. It defaults
    to the plugin's classname.

    description is the user-visible description of the plugin. It may
    be arbitrarily long, and can use pango markup language. Defaults
    to the empty string.

    version is the plugin version. Defaults to '0.0.0'. It MUST be a
    sequence of integers separated by periods. If several plugins with
    the same name are found, the newest version is used. Versions are
    compared integer by integer, starting with the first one, and a
    missing integer treated as a zero. If two plugins have the same
    version, either might be used.

    required_application_version gives the version of the minimal
    application version the plugin is written for. The first integer
    must match exactly: if the application is version 2.3.4, the
    plugin's required_application_version must be at least 2 and
    at most 2.3.4 to be loaded. Defaults to 0.

    '''

    @property
    def name(self):
        return self.__class__.__name__

    @property
    def description(self):
        return ''

    @property
    def version(self):
        return '0.0.0'

    @property
    def required_application_version(self):
        return '0.0.0'

    def enable_wrapper(self):
        '''Enable plugin.

        The plugin manager will call this method, which then calls the
        enable method. Plugins should implement the enable method.
        The wrapper method is there to allow an application to provide
        an extended base class that does some application specific
        magic when plugins are enabled or disabled.

        '''
        self.enable()

    def disable_wrapper(self):
        '''Corresponds to enable_wrapper, but for disabling a plugin.'''
        self.disable()

    def enable(self):
        '''Enable the plugin.'''
        raise NotImplementedError()

    def disable(self):
        '''Disable the plugin.'''
        raise NotImplementedError()


class PluginManager(object):

    '''Manage plugins.

    This class finds and loads plugins, and keeps a list of them that
    can be accessed in various ways.

    The locations are set via the locations attribute, which is a list.

    When a plugin is loaded, an instance of its class is created. This
    instance is initialized using normal and keyword arguments specified
    in the plugin manager attributes plugin_arguments and
    plugin_keyword_arguments.

    The version of the application using the plugin manager is set via
    the application_version attribute. This defaults to '0.0.0'.

    '''

    suffix = '_plugin.py'

    def __init__(self):
        self.locations = []
        self._plugins = None
        self._plugin_files = None
        self.plugin_arguments = []
        self.plugin_keyword_arguments = {}
        self.application_version = '0.0.0'

    @property
    def plugin_files(self):
        if self._plugin_files is None:
            self._plugin_files = self.find_plugin_files()
        return self._plugin_files

    @property
    def plugins(self):
        if self._plugins is None:
            self._plugins = self.load_plugins()
        return self._plugins

    def __getitem__(self, name):
        for plugin in self.plugins:
            if plugin.name == name:
                return plugin
        raise KeyError('Plugin %s is not known' % name)

    def find_plugin_files(self):
        '''Find files that may contain plugins.

        This finds all files named *_plugin.py in all locations.
        The returned list is sorted.

        '''
        pathnames = []
        for location in self.locations:
            try:
                basenames = os.listdir(location)
            except os.error:
                continue
            for basename in basenames:
                s = os.path.join(location, basename)
                if s.endswith(self.suffix) and os.path.exists(s):
                    pathnames.append(s)
        return sorted(pathnames)

    def load_plugins(self):
        '''Load plugins from all plugin files.'''
        plugins = dict()
        for pathname in self.plugin_files:
            for plugin in self.load_plugin_file(pathname):
                if plugin.name in plugins:
                    p = plugins[plugin.name]
                    if self.is_older(p.version, plugin.version):
                        plugins[plugin.name] = plugin
                else:
                    plugins[plugin.name] = plugin
        return plugins.values()

    def is_older(self, version1, version2):
        '''Is version1 older than version2?'''
        return self.parse_version(version1) < self.parse_version(version2)

    def load_plugin_file(self, pathname):
        '''Return plugin classes in a plugin file.'''
        name, ext = os.path.splitext(os.path.basename(pathname))
        f = file(pathname, 'r')
        module = imp.load_module(name, f, pathname,
                                 ('.py', 'r', imp.PY_SOURCE))
        f.close()
        plugins = []
        for dummy, member in inspect.getmembers(module, inspect.isclass):
            if issubclass(member, Plugin):
                p = member(*self.plugin_arguments,
                           **self.plugin_keyword_arguments)
                if self.compatible_version(p.required_application_version):
                    plugins.append(p)
        return plugins

    def compatible_version(self, required_application_version):
        '''Check that the plugin is version-compatible with the application.

        This checks the plugin's required_application_version against
        the declared application version and returns True if they are
        compatible, and False if not.

        '''
        req = self.parse_version(required_application_version)
        app = self.parse_version(self.application_version)
        return app[0] == req[0] and app >= req

    def parse_version(self, version):
        '''Parse a string representation of a version into list of ints.'''
        return [int(s) for s in version.split('.')]

    def enable_plugins(self, plugins=None):
        '''Enable all or selected plugins.'''
        for plugin in plugins or self.plugins:
            plugin.enable_wrapper()

    def disable_plugins(self, plugins=None):
        '''Disable all or selected plugins.'''
        for plugin in plugins or self.plugins:
            plugin.disable_wrapper()

obnam-1.6.1/obnamlib/pluginmgr_tests.py

# Copyright (C) 2009  Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
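As an aside, the version rules that PluginManager implements in `parse_version`, `is_older`, and `compatible_version` above can be sketched as standalone functions. This is an illustrative re-implementation in Python 3, not obnam's own code: dotted version strings are parsed into lists of ints, which Python then compares component by component, and a plugin is compatible when the first component matches the application's exactly and the application version is at least the required version.

```python
def parse_version(version):
    # '1.2.3' -> [1, 2, 3]; lists of ints compare numerically,
    # component by component, so '1.10' correctly sorts after '1.2'.
    return [int(s) for s in version.split('.')]

def is_older(version1, version2):
    # True when version1 sorts before version2.
    return parse_version(version1) < parse_version(version2)

def compatible_version(application_version, required_version):
    # First component must match exactly; beyond that, the application
    # must be at least as new as what the plugin requires.
    req = parse_version(required_version)
    app = parse_version(application_version)
    return app[0] == req[0] and app >= req

# String comparison would say '1.10' < '1.2'; numeric comparison does not.
assert is_older('1.2', '1.10')
assert compatible_version('1.2.3', '1.2.3')
assert not compatible_version('1.2.3', '2')
```

This matches the behaviour the test suite below checks for an application version of '1.2.3': required versions '1' and '1.2.3' are accepted, while '0', '2', and '1.2.4' are rejected.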
import unittest from pluginmgr import Plugin, PluginManager class PluginTests(unittest.TestCase): def setUp(self): self.plugin = Plugin() def test_name_is_class_name(self): self.assertEqual(self.plugin.name, 'Plugin') def test_description_is_empty_string(self): self.assertEqual(self.plugin.description, '') def test_version_is_zeroes(self): self.assertEqual(self.plugin.version, '0.0.0') def test_required_application_version_is_zeroes(self): self.assertEqual(self.plugin.required_application_version, '0.0.0') def test_enable_raises_exception(self): self.assertRaises(Exception, self.plugin.enable) def test_disable_raises_exception(self): self.assertRaises(Exception, self.plugin.disable) def test_enable_wrapper_calls_enable(self): self.plugin.enable = lambda: setattr(self, 'enabled', True) self.plugin.enable_wrapper() self.assert_(self.enabled, True) def test_disable_wrapper_calls_disable(self): self.plugin.disable = lambda: setattr(self, 'disabled', True) self.plugin.disable_wrapper() self.assert_(self.disabled, True) class PluginManagerInitialStateTests(unittest.TestCase): def setUp(self): self.pm = PluginManager() def test_locations_is_empty_list(self): self.assertEqual(self.pm.locations, []) def test_plugins_is_empty_list(self): self.assertEqual(self.pm.plugins, []) def test_application_version_is_zeroes(self): self.assertEqual(self.pm.application_version, '0.0.0') def test_plugin_files_is_empty(self): self.assertEqual(self.pm.plugin_files, []) def test_plugin_arguments_is_empty(self): self.assertEqual(self.pm.plugin_arguments, []) def test_plugin_keyword_arguments_is_empty(self): self.assertEqual(self.pm.plugin_keyword_arguments, {}) class PluginManagerTests(unittest.TestCase): def setUp(self): self.pm = PluginManager() self.pm.locations = ['test-plugins', 'not-exist'] self.pm.plugin_arguments = ('fooarg',) self.pm.plugin_keyword_arguments = { 'bar': 'bararg' } self.files = sorted(['test-plugins/hello_plugin.py', 'test-plugins/aaa_hello_plugin.py', 
'test-plugins/oldhello_plugin.py', 'test-plugins/wrongversion_plugin.py']) def test_finds_the_right_plugin_files(self): self.assertEqual(self.pm.find_plugin_files(), self.files) def test_plugin_files_attribute_implicitly_searches(self): self.assertEqual(self.pm.plugin_files, self.files) def test_loads_hello_plugin(self): plugins = self.pm.load_plugins() self.assertEqual(len(plugins), 1) self.assertEqual(plugins[0].name, 'Hello') def test_plugins_attribute_implicitly_searches(self): self.assertEqual(len(self.pm.plugins), 1) self.assertEqual(self.pm.plugins[0].name, 'Hello') def test_initializes_hello_with_correct_args(self): plugin = self.pm['Hello'] self.assertEqual(plugin.foo, 'fooarg') self.assertEqual(plugin.bar, 'bararg') def test_raises_keyerror_for_unknown_plugin(self): self.assertRaises(KeyError, self.pm.__getitem__, 'Hithere') def test_enable_plugins_enables_all_plugins(self): enabled = set() for plugin in self.pm.plugins: plugin.enable = lambda: enabled.add(plugin) self.pm.enable_plugins() self.assertEqual(enabled, set(self.pm.plugins)) def test_disable_plugins_disables_all_plugins(self): disabled = set() for plugin in self.pm.plugins: plugin.disable = lambda: disabled.add(plugin) self.pm.disable_plugins() self.assertEqual(disabled, set(self.pm.plugins)) class PluginManagerCompatibleApplicationVersionTests(unittest.TestCase): def setUp(self): self.pm = PluginManager() self.pm.application_version = '1.2.3' def test_rejects_zero(self): self.assertFalse(self.pm.compatible_version('0')) def test_rejects_two(self): self.assertFalse(self.pm.compatible_version('2')) def test_rejects_one_two_four(self): self.assertFalse(self.pm.compatible_version('1.2.4')) def test_accepts_one(self): self.assert_(self.pm.compatible_version('1')) def test_accepts_one_two_three(self): self.assert_(self.pm.compatible_version('1.2.3')) obnam-1.6.1/obnamlib/plugins/0000755000175000017500000000000012246357067015757 5ustar 
jenkinsjenkinsobnam-1.6.1/obnamlib/plugins/__init__.py0000644000175000017500000000000012246357067020056 0ustar jenkinsjenkinsobnam-1.6.1/obnamlib/plugins/backup_plugin.py0000644000175000017500000007455212246357067021171 0ustar jenkinsjenkins# Copyright (C) 2009, 2010, 2011, 2012 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . import errno import gc import logging import os import re import stat import sys import time import traceback import tracing import ttystatus import obnamlib import larch class ChunkidPool(object): '''Checksum/chunkid mappings that are pending an upload to shared trees.''' def __init__(self): self.clear() def add(self, chunkid, checksum): if checksum not in self._mapping: self._mapping[checksum] = [] self._mapping[checksum].append(chunkid) def __contains__(self, checksum): return checksum in self._mapping def get(self, checksum): return self._mapping.get(checksum, []) def clear(self): self._mapping = {} def __iter__(self): for checksum in self._mapping.keys(): for chunkid in self._mapping[checksum]: yield chunkid, checksum class BackupProgress(object): def __init__(self, ts): self.file_count = 0 self.backed_up_count = 0 self.uploaded_bytes = 0 self.scanned_bytes = 0 self.started = None self._ts = ts self._ts['current-file'] = '' self._ts['scanned-bytes'] = 0 self._ts['uploaded-bytes'] = 0 self._ts.format('%ElapsedTime() ' '%Counter(current-file) ' 'files ' '%ByteSize(scanned-bytes) 
scanned: ' '%String(what)') def clear(self): self._ts.clear() def error(self, msg): self._ts.error(msg) def what(self, what_what): if self.started is None: self.started = time.time() self._ts['what'] = what_what self._ts.flush() def update_progress(self): self._ts['not-shown'] = 'not shown' def update_progress_with_file(self, filename, metadata): self._ts['what'] = filename self._ts['current-file'] = filename self.file_count += 1 def update_progress_with_scanned(self, amount): self.scanned_bytes += amount self._ts['scanned-bytes'] = self.scanned_bytes def update_progress_with_upload(self, amount): self.uploaded_bytes += amount self._ts['uploaded-bytes'] = self.uploaded_bytes def update_progress_with_removed_checkpoint(self, gen): self._ts['checkpoint'] = gen def report_stats(self): size_table = [ (1024**4, 'TiB'), (1024**3, 'GiB'), (1024**2, 'MiB'), (1024**1, 'KiB'), (0, 'B') ] for size_base, size_unit in size_table: if self.uploaded_bytes >= size_base: if size_base > 0: size_amount = float(self.uploaded_bytes) / float(size_base) else: size_amount = float(self.uploaded_bytes) break speed_table = [ (1024**3, 'GiB/s'), (1024**2, 'MiB/s'), (1024**1, 'KiB/s'), (0, 'B/s') ] duration = time.time() - self.started speed = float(self.uploaded_bytes) / duration for speed_base, speed_unit in speed_table: if speed >= speed_base: if speed_base > 0: speed_amount = speed / speed_base else: speed_amount = speed break duration_string = '' seconds = duration if seconds >= 3600: duration_string += '%dh' % int(seconds/3600) seconds %= 3600 if seconds >= 60: duration_string += '%dm' % int(seconds/60) seconds %= 60 if seconds > 0: duration_string += '%ds' % round(seconds) logging.info('Backup performance statistics:') logging.info('* files found: %s' % self.file_count) logging.info('* files backed up: %s' % self.backed_up_count) logging.info('* uploaded data: %s bytes (%s %s)' % (self.uploaded_bytes, size_amount, size_unit)) logging.info('* duration: %s s' % duration) logging.info('* 
average speed: %s %s' % (speed_amount, speed_unit)) self._ts.notify( 'Backed up %d files (of %d found), ' 'uploaded %.1f %s in %s at %.1f %s average speed' % (self.backed_up_count, self.file_count, size_amount, size_unit, duration_string, speed_amount, speed_unit)) class BackupPlugin(obnamlib.ObnamPlugin): def enable(self): backup_group = obnamlib.option_group['backup'] = 'Backing up' perf_group = obnamlib.option_group['perf'] self.app.add_subcommand('backup', self.backup, arg_synopsis='[DIRECTORY]...') self.app.settings.string_list(['root'], 'what to backup') self.app.settings.string_list(['exclude'], 'regular expression for pathnames to ' 'exclude from backup (can be used multiple ' 'times)', group=backup_group) self.app.settings.boolean(['exclude-caches'], 'exclude directories (and their subdirs) ' 'that contain a CACHEDIR.TAG file', group=backup_group) self.app.settings.boolean(['one-file-system'], 'exclude directories (and their subdirs) ' 'that are in a different filesystem', group=backup_group) self.app.settings.bytesize(['checkpoint'], 'make a checkpoint after a given SIZE ' '(%default)', metavar='SIZE', default=1024**3, group=backup_group) self.app.settings.integer(['chunkids-per-group'], 'encode NUM chunk ids per group (%default)', metavar='NUM', default=obnamlib.DEFAULT_CHUNKIDS_PER_GROUP, group=perf_group) self.app.settings.choice(['deduplicate'], ['fatalist', 'never', 'verify'], 'find duplicate data in backed up data ' 'and store it only once; three modes ' 'are available: never de-duplicate, ' 'verify that no hash collisions happen, ' 'or (the default) fatalistically accept ' 'the risk of collisions', metavar='MODE', group=backup_group) self.app.settings.boolean(['leave-checkpoints'], 'leave checkpoint generations at the end ' 'of a successful backup run', group=backup_group) self.app.settings.boolean(['small-files-in-btree'], 'put contents of small files directly into ' 'the per-client B-tree, instead of ' 'separate chunk files; do not use this ' 'as 
it is quite bad for performance', group=backup_group) self.app.settings.string_list( ['testing-fail-matching'], 'development testing helper: simulate failures during backup ' 'for files that match the given regular expressions', metavar='REGEXP') def configure_ttystatus_for_backup(self): self.progress = BackupProgress(self.app.ts) def error(self, msg, exc=None): self.errors = True logging.error(msg) if exc: logging.error(repr(exc)) # FIXME: ttystatus.TerminalStatus.error is quiet if --quiet is used. # That's a bug, so we work around it by writing to stderr directly. sys.stderr.write('ERROR: %s\n' % msg) def parse_checkpoint_size(self, value): p = obnamlib.ByteSizeParser() p.set_default_unit('MiB') return p.parse(value) @property def pretend(self): return self.app.settings['pretend'] def backup(self, args): '''Backup data to repository.''' logging.info('Backup starts') logging.debug( 'Checkpoints every %s bytes' % self.app.settings['checkpoint']) self.app.settings.require('repository') self.app.settings.require('client-name') if not self.app.settings['repository']: raise obnamlib.Error('No --repository setting. ' 'You need to specify it on the command ' 'line or a configuration file.') self.configure_ttystatus_for_backup() self.progress.what('setting up') self.compile_exclusion_patterns() self.memory_dump_counter = 0 self.progress.what('connecting to repository') client_name = self.app.settings['client-name'] if self.pretend: self.repo = self.app.open_repository() self.repo.open_client(client_name) else: self.repo = self.app.open_repository(create=True) self.progress.what('adding client') self.add_client(client_name) self.progress.what('locking client') self.repo.lock_client(client_name) # Need to lock the shared stuff briefly, so encryption etc # gets initialized. 
self.progress.what( 'initialising shared directories') self.repo.lock_shared() self.repo.unlock_shared() self.errors = False self.chunkid_pool = ChunkidPool() try: if not self.pretend: self.progress.what('starting new generation') self.repo.start_generation() self.fs = None roots = self.app.settings['root'] + args if not roots: raise obnamlib.Error('No backup roots specified') self.backup_roots(roots) self.progress.what('committing changes to repository') if not self.pretend: self.progress.what( 'committing changes to repository: locking shared B-trees') self.repo.lock_shared() self.progress.what( 'committing changes to repository: ' 'adding chunks to shared B-trees') self.add_chunks_to_shared() self.progress.what( 'committing changes to repository: ' 'committing client') self.repo.commit_client() self.progress.what( 'committing changes to repository: ' 'committing shared B-trees') self.repo.commit_shared() self.progress.what('closing connection to repository') self.repo.fs.close() self.progress.clear() self.progress.report_stats() logging.info('Backup finished.') self.app.dump_memory_profile('at end of backup run') except BaseException, e: logging.debug('Handling exception %s' % str(e)) logging.debug(traceback.format_exc()) self.unlock_when_error() raise if self.errors: raise obnamlib.Error('There were errors during the backup') def unlock_when_error(self): try: if self.repo.got_client_lock: logging.info('Attempting to unlock client because of error') self.repo.unlock_client() if self.repo.got_shared_lock: logging.info( 'Attempting to unlock shared trees because of error') self.repo.unlock_shared() except BaseException, e2: logging.warning( 'Error while unlocking due to error: %s' % str(e2)) logging.debug(traceback.format_exc()) else: logging.info('Successfully unlocked') def add_chunks_to_shared(self): for chunkid, checksum in self.chunkid_pool: self.repo.put_chunk_in_shared_trees(chunkid, checksum) self.chunkid_pool.clear() def add_client(self, client_name): 
self.repo.lock_root() if client_name not in self.repo.list_clients(): tracing.trace('adding new client %s' % client_name) tracing.trace('client list before adding: %s' % self.repo.list_clients()) self.repo.add_client(client_name) tracing.trace('client list after adding: %s' % self.repo.list_clients()) self.repo.commit_root() self.repo = self.app.open_repository(repofs=self.repo.fs.fs) def compile_exclusion_patterns(self): log = self.app.settings['log'] if log: log = self.app.settings['log'] self.app.settings['exclude'].append(log) for pattern in self.app.settings['exclude']: logging.debug('Exclude pattern: %s' % pattern) self.exclude_pats = [] for x in self.app.settings['exclude']: if x != '': try: self.exclude_pats.append(re.compile(x)) except re.error, e: msg = ( 'error compiling regular expression "%s": %s' % (x, e)) logging.error(msg) self.progress.error(msg) def backup_roots(self, roots): self.progress.what('connecting to repository') self.fs = self.app.fsf.new(roots[0]) self.fs.connect() absroots = [] for root in roots: self.progress.what('determining absolute path for %s' % root) self.fs.reinit(root) absroots.append(self.fs.abspath('.')) if not self.pretend: self.remove_old_roots(absroots) self.checkpoints = [] self.last_checkpoint = 0 self.interval = self.app.settings['checkpoint'] for root in roots: logging.info('Backing up root %s' % root) self.progress.what('connecting to live data %s' % root) self.fs.reinit(root) self.progress.what('scanning for files in %s' % root) absroot = self.fs.abspath('.') self.root_metadata = self.fs.lstat(absroot) for pathname, metadata in self.find_files(absroot): logging.info('Backing up %s' % pathname) try: self.maybe_simulate_error(pathname) if stat.S_ISDIR(metadata.st_mode): self.backup_dir_contents(pathname) elif stat.S_ISREG(metadata.st_mode): assert metadata.md5 is None metadata.md5 = self.backup_file_contents(pathname, metadata) self.backup_metadata(pathname, metadata) except (IOError, OSError), e: msg = 'Can\'t
back up %s: %s' % (pathname, e.strerror) self.error(msg, e) if e.errno == errno.ENOSPC: raise if self.time_for_checkpoint(): self.make_checkpoint() self.progress.what(pathname) self.backup_parents('.') remove_checkpoints = (not self.errors and not self.app.settings['leave-checkpoints'] and not self.pretend) if remove_checkpoints: self.progress.what('removing checkpoints') for gen in self.checkpoints: self.progress.update_progress_with_removed_checkpoint(gen) self.repo.remove_generation(gen) if self.fs: self.fs.close() def maybe_simulate_error(self, pathname): '''Raise an IOError if specified by --testing-fail-matching.''' for pattern in self.app.settings['testing-fail-matching']: if re.search(pattern, pathname): e = errno.ENOENT raise IOError(e, os.strerror(e), pathname) def time_for_checkpoint(self): bytes_since = (self.repo.fs.bytes_written - self.last_checkpoint) return bytes_since >= self.interval def make_checkpoint(self): logging.info('Making checkpoint') self.progress.what('making checkpoint') if not self.pretend: self.checkpoints.append(self.repo.new_generation) self.progress.what('making checkpoint: backing up parents') self.backup_parents('.') self.progress.what('making checkpoint: locking shared B-trees') self.repo.lock_shared() self.progress.what('making checkpoint: adding chunks to shared B-trees') self.add_chunks_to_shared() self.progress.what('making checkpoint: committing per-client B-tree') self.repo.commit_client(checkpoint=True) self.progress.what('making checkpoint: committing shared B-trees') self.repo.commit_shared() self.last_checkpoint = self.repo.fs.bytes_written self.progress.what('making checkpoint: re-opening repository') self.repo = self.app.open_repository(repofs=self.repo.fs.fs) self.progress.what('making checkpoint: locking client') self.repo.lock_client(self.app.settings['client-name']) self.progress.what('making checkpoint: starting a new generation') self.repo.start_generation() self.app.dump_memory_profile('at end of checkpoint') 
self.progress.what('making checkpoint: continuing backup') def find_files(self, root): '''Find all files and directories that need to be backed up. This is a generator. It yields (pathname, metadata) pairs. The caller should not recurse through directories, just backup the directory itself (name, metadata, file list). ''' for pathname, st in self.fs.scan_tree(root, ok=self.can_be_backed_up): tracing.trace('considering %s' % pathname) try: metadata = obnamlib.read_metadata(self.fs, pathname, st=st) self.progress.update_progress_with_file(pathname, metadata) if self.needs_backup(pathname, metadata): self.progress.backed_up_count += 1 yield pathname, metadata else: self.progress.update_progress_with_scanned( metadata.st_size) except GeneratorExit: raise except KeyboardInterrupt: logging.error('Keyboard interrupt') raise except BaseException, e: msg = 'Cannot back up %s: %s' % (pathname, str(e)) self.error(msg, e) def can_be_backed_up(self, pathname, st): if self.app.settings['one-file-system']: if st.st_dev != self.root_metadata.st_dev: logging.debug('Excluding (one-file-system): %s' % pathname) return False for pat in self.exclude_pats: if pat.search(pathname): logging.debug('Excluding (pattern): %s' % pathname) return False if stat.S_ISDIR(st.st_mode) and self.app.settings['exclude-caches']: tag_filename = 'CACHEDIR.TAG' tag_contents = 'Signature: 8a477f597d28d172789f06886806bc55' tag_path = os.path.join(pathname, 'CACHEDIR.TAG') if self.fs.exists(tag_path): # Can't use with, because Paramiko's SFTPFile does not work. f = self.fs.open(tag_path, 'rb') data = f.read(len(tag_contents)) f.close() if data == tag_contents: logging.debug('Excluding (cache dir): %s' % pathname) return False return True def needs_backup(self, pathname, current): '''Does a given file need to be backed up?''' # Directories always require backing up so that backup_dir_contents # can remove stuff that no longer exists from them. 
if current.isdir(): tracing.trace('%s is directory, so needs backup' % pathname) return True if self.pretend: gens = self.repo.list_generations() if not gens: return True gen = gens[-1] else: gen = self.repo.new_generation tracing.trace('gen=%s' % repr(gen)) try: old = self.repo.get_metadata(gen, pathname) except obnamlib.Error, e: # File does not exist in the previous generation, so it # does need to be backed up. tracing.trace('%s not in previous gen, so needs backup' % pathname) tracing.trace('error: %s' % str(e)) tracing.trace(traceback.format_exc()) return True needs = (current.st_mtime_sec != old.st_mtime_sec or current.st_mtime_nsec != old.st_mtime_nsec or current.st_mode != old.st_mode or current.st_nlink != old.st_nlink or current.st_size != old.st_size or current.st_uid != old.st_uid or current.st_gid != old.st_gid or current.xattr != old.xattr) if needs: tracing.trace('%s has changed metadata, so needs backup' % pathname) return needs def backup_parents(self, root): '''Back up parents of root, non-recursively.''' root = self.fs.abspath(root) tracing.trace('backing up parents of %s', root) dummy_metadata = obnamlib.Metadata(st_mode=0777 | stat.S_IFDIR) while True: parent = os.path.dirname(root) try: metadata = obnamlib.read_metadata(self.fs, root) except OSError, e: logging.warning( 'Failed to get metadata for %s: %s: %s' % (root, e.errno or 0, e.strerror)) logging.warning('Using fake metadata instead for %s' % root) metadata = dummy_metadata if not self.pretend: self.repo.create(root, metadata) if root == parent: break root = parent def backup_metadata(self, pathname, metadata): '''Back up metadata for a filesystem object''' tracing.trace('backup_metadata: %s', pathname) if not self.pretend: self.repo.create(pathname, metadata) def backup_file_contents(self, filename, metadata): '''Back up contents of a regular file.''' tracing.trace('backup_file_contents: %s', filename) if self.pretend: tracing.trace('pretending to upload the whole file') 
self.progress.update_progress_with_upload(metadata.st_size) return tracing.trace('setting file chunks to empty') if not self.pretend: self.repo.set_file_chunks(filename, []) tracing.trace('opening file for reading') f = self.fs.open(filename, 'r') summer = self.repo.new_checksummer() max_intree = self.app.settings['node-size'] / 4 if (metadata.st_size <= max_intree and self.app.settings['small-files-in-btree']): contents = f.read() assert len(contents) <= max_intree # FIXME: silly error checking f.close() self.progress.update_progress_with_scanned(len(contents)) self.repo.set_file_data(filename, contents) summer.update(contents) return summer.digest() chunk_size = int(self.app.settings['chunk-size']) chunkids = [] while True: tracing.trace('reading some data') self.progress.update_progress() data = f.read(chunk_size) if not data: tracing.trace('end of data') break tracing.trace('got %d bytes of data' % len(data)) self.progress.update_progress_with_scanned(len(data)) summer.update(data) if not self.pretend: chunkids.append(self.backup_file_chunk(data)) if len(chunkids) >= self.app.settings['chunkids-per-group']: tracing.trace('adding %d chunkids to file' % len(chunkids)) self.repo.append_file_chunks(filename, chunkids) self.app.dump_memory_profile('after appending some ' 'chunkids') chunkids = [] else: self.progress.update_progress_with_upload(len(data)) if not self.pretend and self.time_for_checkpoint(): logging.debug('making checkpoint in the middle of a file') self.repo.append_file_chunks(filename, chunkids) chunkids = [] self.make_checkpoint() tracing.trace('closing file') f.close() if chunkids: assert not self.pretend tracing.trace('adding final %d chunkids to file' % len(chunkids)) self.repo.append_file_chunks(filename, chunkids) self.app.dump_memory_profile('at end of file content backup for %s' % filename) tracing.trace('done backing up file contents') return summer.digest() def backup_file_chunk(self, data): '''Back up a chunk of data by putting it into the
repository.''' def find(): # We ignore lookup errors here intentionally. We're reading # the checksum trees without a lock, so another Obnam may be # modifying them, which can lead to spurious NodeMissing # exceptions, and other errors. We don't care: we'll just # pretend no chunk with the checksum exists yet. try: in_tree = self.repo.find_chunks(checksum) except larch.Error: in_tree = [] return in_tree + self.chunkid_pool.get(checksum) def get(chunkid): return self.repo.get_chunk(chunkid) def put(): self.progress.update_progress_with_upload(len(data)) return self.repo.put_chunk_only(data) def share(chunkid): self.chunkid_pool.add(chunkid, checksum) checksum = self.repo.checksum(data) mode = self.app.settings['deduplicate'] if mode == 'never': return put() elif mode == 'verify': for chunkid in find(): data2 = get(chunkid) if data == data2: return chunkid else: chunkid = put() share(chunkid) return chunkid elif mode == 'fatalist': existing = find() if existing: return existing[0] else: chunkid = put() share(chunkid) return chunkid else: if not hasattr(self, 'bad_deduplicate_reported'): logging.error('unknown --deduplicate setting value') self.bad_deduplicate_reported = True chunkid = put() share(chunkid) return chunkid def backup_dir_contents(self, root): '''Back up the list of files in a directory.''' tracing.trace('backup_dir: %s', root) if self.pretend: return new_basenames = self.fs.listdir(root) old_basenames = self.repo.listdir(self.repo.new_generation, root) for old in old_basenames: pathname = os.path.join(root, old) if old not in new_basenames: self.repo.remove(pathname) # Files that are created after the previous generation will be # added to the directory when they are backed up, so we don't # need to worry about them here. def remove_old_roots(self, new_roots): '''Remove from started generation anything that is not a backup root. 
We recurse from filesystem root directory until getting to one of the new backup roots, or a directory or file that is not a parent of one of the new backup roots. We remove anything that is not a new backup root, or their parent. ''' def is_parent(pathname): if not pathname.endswith(os.sep): pathname += os.sep for new_root in new_roots: if new_root.startswith(pathname): return True return False def helper(dirname): if dirname in new_roots: tracing.trace('is a new root: %s' % dirname) elif is_parent(dirname): tracing.trace('is parent of a new root: %s' % dirname) pathnames = [os.path.join(dirname, x) for x in self.repo.listdir(gen_id, dirname)] for pathname in pathnames: helper(pathname) else: tracing.trace('is extra and removed: %s' % dirname) self.progress.what('removing %s from new generation' % dirname) self.repo.remove(dirname) self.progress.what(msg) assert not self.pretend msg = 'removing old backup roots from new generation' self.progress.what(msg) tracing.trace('new_roots: %s' % repr(new_roots)) gen_id = self.repo.new_generation helper('/') obnam-1.6.1/obnamlib/plugins/compression_plugin.py0000644000175000017500000000401712246357067022252 0ustar jenkinsjenkins# Copyright (C) 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . 
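The `is_parent()` helper in `remove_old_roots()` above is a plain string-prefix test on slash-terminated paths. It can be sketched standalone; the root paths here are hypothetical, chosen only to show which ancestors survive the pruning walk:

```python
import os

def is_parent(pathname, new_roots):
    # True if pathname is an ancestor directory of any backup root.
    # Terminating with os.sep prevents '/home/al' from matching
    # '/home/alice/Documents' by accident.
    if not pathname.endswith(os.sep):
        pathname += os.sep
    return any(root.startswith(pathname) for root in new_roots)

roots = ['/home/alice/Documents']           # hypothetical backup root
assert is_parent('/home', roots)            # ancestor: recursed into
assert is_parent('/home/alice', roots)      # ancestor: recursed into
assert not is_parent('/var', roots)         # unrelated: removed
assert not is_parent('/home/alice/Downloads', roots)  # sibling: removed
```

Anything that is neither a new root nor an ancestor of one gets removed from the freshly started generation, which is how dropping a `--root` also drops its stale files.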
import logging import os import zlib import obnamlib class DeflateCompressionFilter(object): def __init__(self, app): self.tag = "deflate" self.app = app self.warned = False def filter_read(self, data, repo, toplevel): return zlib.decompress(data) def filter_write(self, data, repo, toplevel): how = self.app.settings['compress-with'] if how == 'deflate': data = zlib.compress(data) elif how == 'gzip': if not self.warned: self.app.ts.notify("--compress-with=gzip is deprecated. " + "Use --compress-with=deflate instead") self.warned = True data = zlib.compress(data) return data class CompressionPlugin(obnamlib.ObnamPlugin): def enable(self): self.app.settings.choice(['compress-with'], ['none', 'deflate', 'gzip'], 'use PROGRAM to compress repository with ' '(one of none, deflate)', metavar='PROGRAM') hooks = [ ('repository-data', DeflateCompressionFilter(self.app), obnamlib.Hook.EARLY_PRIORITY), ] for name, callback, prio in hooks: self.app.hooks.add_callback(name, callback, prio) obnam-1.6.1/obnamlib/plugins/convert5to6_plugin.py0000644000175000017500000000575212246357067022116 0ustar jenkinsjenkins# Copyright (C) 2012 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . 
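The `DeflateCompressionFilter` above is a thin wrapper around `zlib`: `filter_write` compresses on the way to the repository and `filter_read` reverses it (the `gzip` choice is just a deprecated alias for the same codec). A minimal round-trip sketch, with module-level functions standing in for the filter methods and a made-up payload:

```python
import zlib

def filter_write(data):
    # What the plugin does for --compress-with=deflate:
    # raw zlib/DEFLATE compression of the repository file contents.
    return zlib.compress(data)

def filter_read(data):
    # Inverse transform applied when reading repository files back.
    return zlib.decompress(data)

payload = b'backup chunk ' * 100        # hypothetical repetitive data
stored = filter_write(payload)
assert filter_read(stored) == payload   # lossless round trip
assert len(stored) < len(payload)       # repetitive data compresses well
```

Because the hook runs at `EARLY_PRIORITY`, compression happens before encryption on write and after decryption on read, so the compressor always sees compressible cleartext.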
import logging import os import re import stat import tracing import zlib import obnamlib class Convert5to6Plugin(obnamlib.ObnamPlugin): '''Convert a version 5 repository to version 6, in place.''' def enable(self): self.app.add_subcommand('convert5to6', self.convert, arg_synopsis='') def convert(self, args): self.app.settings.require('repository') self.rawfs = self.app.fsf.new(self.app.settings['repository']) self.convert_format() self.repo = self.app.open_repository() self.convert_files() def convert_files(self): funcs = [] if self.app.settings['compress-with'] == 'gzip': funcs.append(self.gunzip) if self.app.settings['encrypt-with']: self.symmetric_keys = {} funcs.append(self.decrypt) tracing.trace('funcs=%s' % repr(funcs)) for filename in self.find_files(): logging.debug('converting file %s' % filename) data = self.rawfs.cat(filename) tracing.trace('old data is %d bytes' % len(data)) for func in funcs: data = func(filename, data) tracing.trace('new data is %d bytes' % len(data)) self.repo.fs.overwrite_file(filename, data) def find_files(self): ignored_pat = re.compile(r'^(tmp.*|lock|format|userkeys|key)$') for filename, st in self.rawfs.scan_tree('.'): ignored = ignored_pat.match(os.path.basename(filename)) if stat.S_ISREG(st.st_mode) and not ignored: assert filename.startswith('./') yield filename[2:] def get_symmetric_key(self, filename): toplevel = filename.split('/')[0] tracing.trace('toplevel=%s' % toplevel) if toplevel not in self.symmetric_keys: encoded = self.rawfs.cat(os.path.join(toplevel, 'key')) key = obnamlib.decrypt_with_secret_keys(encoded) self.symmetric_keys[toplevel] = key return self.symmetric_keys[toplevel] def decrypt(self, filename, data): symmetric_key = self.get_symmetric_key(filename) return obnamlib.decrypt_symmetric(data, symmetric_key) def gunzip(self, filename, data): return zlib.decompress(data) def convert_format(self): self.rawfs.overwrite_file('metadata/format', '6\n') 
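`Convert5to6Plugin.find_files()` above decides which repository files to rewrite by matching basenames against a skip-list regex. This sketch restates that filter on its own, with hypothetical repository pathnames for illustration:

```python
import os
import re

# Same pattern as in find_files(): bookkeeping files that must not be
# gunzipped/decrypted during the in-place 5-to-6 conversion.
ignored_pat = re.compile(r'^(tmp.*|lock|format|userkeys|key)$')

def is_ignored(filename):
    return bool(ignored_pat.match(os.path.basename(filename)))

assert is_ignored('./chunks/lock')       # lock files are skipped
assert is_ignored('./tmpAB12')           # temporary files are skipped
assert not is_ignored('./chunks/0/0/1')  # ordinary data file: converted
assert not is_ignored('./clientlist/nodes/0')
```

Only regular files that survive this filter are read via `rawfs.cat`, pushed through the gunzip/decrypt steps, and written back with `overwrite_file`.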
obnam-1.6.1/obnamlib/plugins/encryption_plugin.py

# Copyright (C) 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.


import logging
import os

import obnamlib


class EncryptionPlugin(obnamlib.ObnamPlugin):

    def enable(self):
        encryption_group = obnamlib.option_group['encryption'] = 'Encryption'

        self.app.settings.string(['encrypt-with'],
                                 'PGP key with which to encrypt data '
                                 'in the backup repository',
                                 group=encryption_group)
        self.app.settings.string(['keyid'],
                                 'PGP key id to add to/remove from '
                                 'the backup repository',
                                 group=encryption_group)
        self.app.settings.boolean(['weak-random'],
                                  'use /dev/urandom instead of /dev/random '
                                  'to generate symmetric keys',
                                  group=encryption_group)
        self.app.settings.boolean(['key-details'],
                                  'show additional user IDs for all keys',
                                  group=encryption_group)
        self.app.settings.string(['symmetric-key-bits'],
                                 'size of symmetric key, in bits',
                                 group=encryption_group)

        self.tag = "encrypt1"

        hooks = [
            ('repository-toplevel-init', self.toplevel_init,
             obnamlib.Hook.DEFAULT_PRIORITY),
            ('repository-data', self,
             obnamlib.Hook.LATE_PRIORITY),
            ('repository-add-client', self.add_client,
             obnamlib.Hook.DEFAULT_PRIORITY),
        ]
        for name, callback, rev in hooks:
            self.app.hooks.add_callback(name, callback, rev)

        self._pubkey = None

        self.app.add_subcommand('client-keys', self.client_keys)
        self.app.add_subcommand('list-keys', self.list_keys)
        self.app.add_subcommand('list-toplevels', self.list_toplevels)
        self.app.add_subcommand(
            'add-key', self.add_key, arg_synopsis='[CLIENT-NAME]...')
        self.app.add_subcommand(
            'remove-key', self.remove_key, arg_synopsis='[CLIENT-NAME]...')
        self.app.add_subcommand('remove-client', self.remove_client,
                                arg_synopsis='[CLIENT-NAME]...')

        self._symkeys = obnamlib.SymmetricKeyCache()

    def disable(self):
        self._symkeys.clear()

    @property
    def keyid(self):
        return self.app.settings['encrypt-with']

    @property
    def pubkey(self):
        if self._pubkey is None:
            self._pubkey = obnamlib.get_public_key(self.keyid)
        return self._pubkey

    @property
    def devrandom(self):
        if self.app.settings['weak-random']:
            return '/dev/urandom'
        else:
            return '/dev/random'

    @property
    def symmetric_key_bits(self):
        return int(self.app.settings['symmetric-key-bits'] or '256')

    def _write_file(self, repo, pathname, contents):
        repo.fs.fs.write_file(pathname, contents)

    def _overwrite_file(self, repo, pathname, contents):
        repo.fs.fs.overwrite_file(pathname, contents)

    def toplevel_init(self, repo, toplevel):
        '''Initialize a new toplevel for encryption.'''
        if not self.keyid:
            return

        pubkeys = obnamlib.Keyring()
        pubkeys.add(self.pubkey)

        symmetric_key = obnamlib.generate_symmetric_key(
            self.symmetric_key_bits, filename=self.devrandom)
        encrypted = obnamlib.encrypt_with_keyring(symmetric_key, pubkeys)
        self._write_file(repo, os.path.join(toplevel, 'key'), encrypted)

        encoded = str(pubkeys)
        encrypted = obnamlib.encrypt_symmetric(encoded, symmetric_key)
        self._write_file(repo, os.path.join(toplevel, 'userkeys'), encrypted)

    def filter_read(self, encrypted, repo, toplevel):
        if not self.keyid:
            return encrypted
        symmetric_key = self.get_symmetric_key(repo, toplevel)
        return obnamlib.decrypt_symmetric(encrypted, symmetric_key)

    def filter_write(self, cleartext, repo, toplevel):
        if not self.keyid:
            return cleartext
        symmetric_key = self.get_symmetric_key(repo, toplevel)
        return obnamlib.encrypt_symmetric(cleartext, symmetric_key)

    def get_symmetric_key(self, repo, toplevel):
        key = self._symkeys.get(repo, toplevel)
        if key is None:
            encoded = repo.fs.fs.cat(os.path.join(toplevel, 'key'))
            key = obnamlib.decrypt_with_secret_keys(encoded)
            self._symkeys.put(repo, toplevel, key)
        return key

    def read_keyring(self, repo, toplevel):
        encrypted = repo.fs.fs.cat(os.path.join(toplevel, 'userkeys'))
        encoded = self.filter_read(encrypted, repo, toplevel)
        return obnamlib.Keyring(encoded=encoded)

    def write_keyring(self, repo, toplevel, keyring):
        encoded = str(keyring)
        encrypted = self.filter_write(encoded, repo, toplevel)
        pathname = os.path.join(toplevel, 'userkeys')
        self._overwrite_file(repo, pathname, encrypted)

    def add_to_userkeys(self, repo, toplevel, public_key):
        userkeys = self.read_keyring(repo, toplevel)
        userkeys.add(public_key)
        self.write_keyring(repo, toplevel, userkeys)

    def remove_from_userkeys(self, repo, toplevel, keyid):
        userkeys = self.read_keyring(repo, toplevel)
        if keyid in userkeys:
            logging.debug('removing key %s from %s' % (keyid, toplevel))
            userkeys.remove(keyid)
            self.write_keyring(repo, toplevel, userkeys)
        else:
            logging.debug('unable to remove key %s from %s (not there)' %
                          (keyid, toplevel))

    def rewrite_symmetric_key(self, repo, toplevel):
        symmetric_key = self.get_symmetric_key(repo, toplevel)
        userkeys = self.read_keyring(repo, toplevel)
        encrypted = obnamlib.encrypt_with_keyring(symmetric_key, userkeys)
        self._overwrite_file(repo, os.path.join(toplevel, 'key'), encrypted)

    def add_client(self, clientlist, client_name):
        clientlist.set_client_keyid(client_name, self.keyid)

    def quit_if_unencrypted(self):
        if self.app.settings['encrypt-with']:
            return False
        self.app.output.write('Warning: Encryption not in use.\n')
        self.app.output.write('(Use --encrypt-with to set key.)\n')
        return True

    def client_keys(self, args):
        '''List clients and their keys in the repository.'''
        if self.quit_if_unencrypted():
            return
        repo = self.app.open_repository()
        clients = repo.list_clients()
        for client in clients:
            keyid = repo.clientlist.get_client_keyid(client)
            if keyid is None:
                key_info = 'no key'
            else:
                key_info = self._get_key_string(keyid)
            print client, key_info

    def _find_keys_and_toplevels(self, repo):
        toplevels = repo.fs.listdir('.')
        keys = dict()
        tops = dict()
        for toplevel in [d for d in toplevels if d != 'metadata']:
            # skip files (e.g. 'lock') or empty directories
            if not repo.fs.exists(os.path.join(toplevel, 'key')):
                continue
            try:
                userkeys = self.read_keyring(repo, toplevel)
            except obnamlib.EncryptionError:
                # other client's toplevels are unreadable
                tops[toplevel] = []
                continue
            for keyid in userkeys.keyids():
                keys[keyid] = keys.get(keyid, []) + [toplevel]
                tops[toplevel] = tops.get(toplevel, []) + [keyid]
        return keys, tops

    def _get_key_string(self, keyid):
        verbose = self.app.settings['key-details']
        if verbose:
            user_ids = obnamlib.get_public_key_user_ids(keyid)
            if user_ids:
                return "%s (%s)" % (keyid, ", ".join(user_ids))
        return str(keyid)

    def list_keys(self, args):
        '''List keys and the repository toplevels they're used in.'''
        if self.quit_if_unencrypted():
            return
        repo = self.app.open_repository()
        keys, tops = self._find_keys_and_toplevels(repo)
        for keyid in keys:
            print 'key: %s' % self._get_key_string(keyid)
            for toplevel in keys[keyid]:
                print '  %s' % toplevel

    def list_toplevels(self, args):
        '''List repository toplevel directories and their keys.'''
        if self.quit_if_unencrypted():
            return
        repo = self.app.open_repository()
        keys, tops = self._find_keys_and_toplevels(repo)
        for toplevel in tops:
            print 'toplevel: %s' % toplevel
            for keyid in tops[toplevel]:
                print '  %s' % self._get_key_string(keyid)

    _shared = ['chunklist', 'chunks', 'chunksums', 'clientlist']

    def _find_clientdirs(self, repo, client_names):
        result = []
        for client_name in client_names:
            client_id = repo.clientlist.get_client_id(client_name)
            if client_id:
                result.append(repo.client_dir(client_id))
            else:
                logging.warning("client not found: %s" % client_name)
        return result

    def add_key(self, args):
        '''Add a key to the repository.'''
        if self.quit_if_unencrypted():
            return
        self.app.settings.require('keyid')
        repo = self.app.open_repository()
        keyid = self.app.settings['keyid']
        key = obnamlib.get_public_key(keyid)
        clients = self._find_clientdirs(repo, args)
        for toplevel in self._shared + clients:
            self.add_to_userkeys(repo, toplevel, key)
            self.rewrite_symmetric_key(repo, toplevel)

    def remove_key(self, args):
        '''Remove a key from the repository.'''
        if self.quit_if_unencrypted():
            return
        self.app.settings.require('keyid')
        repo = self.app.open_repository()
        keyid = self.app.settings['keyid']
        clients = self._find_clientdirs(repo, args)
        for toplevel in self._shared + clients:
            self.remove_from_userkeys(repo, toplevel, keyid)
            self.rewrite_symmetric_key(repo, toplevel)

    def remove_client(self, args):
        '''Remove client and its key from repository.'''
        if self.quit_if_unencrypted():
            return
        repo = self.app.open_repository()
        repo.lock_root()
        for client_name in args:
            logging.info('removing client %s' % client_name)
            repo.remove_client(client_name)
        repo.commit_root()

obnam-1.6.1/obnamlib/plugins/force_lock_plugin.py

# Copyright (C) 2009, 2010, 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import logging
import os

import obnamlib


class ForceLockPlugin(obnamlib.ObnamPlugin):

    def enable(self):
        self.app.add_subcommand('force-lock', self.force_lock)

    def force_lock(self, args):
        '''Force a locked repository to be open.'''
        self.app.settings.require('repository')
        self.app.settings.require('client-name')

        repourl = self.app.settings['repository']
        client_name = self.app.settings['client-name']
        logging.info('Forcing lock')
        logging.info('Repository: %s' % repourl)
        logging.info('Client: %s' % client_name)

        try:
            repo = self.app.open_repository()
        except OSError, e:
            raise obnamlib.Error('Repository does not exist '
                                 'or cannot be accessed.\n' + str(e))

        all_clients = repo.list_clients()
        if client_name not in all_clients:
            msg = 'Client does not exist in repository.'
            logging.warning(msg)
            self.app.output.write('Warning: %s\n' % msg)
            return

        all_dirs = ['clientlist', 'chunksums', 'chunklist', 'chunks', '.']
        for client_name in all_clients:
            client_id = repo.clientlist.get_client_id(client_name)
            client_dir = repo.client_dir(client_id)
            all_dirs.append(client_dir)

        for one_dir in all_dirs:
            lockname = os.path.join(one_dir, 'lock')
            if repo.fs.exists(lockname):
                logging.info('Removing lockfile %s' % lockname)
                repo.fs.remove(lockname)
            else:
                logging.info('%s is not locked' % one_dir)

        repo.fs.close()

        return 0

obnam-1.6.1/obnamlib/plugins/forget_plugin.py

# Copyright (C) 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.


import datetime

import obnamlib


class ForgetPlugin(obnamlib.ObnamPlugin):

    '''Forget generations.'''

    def enable(self):
        self.app.add_subcommand('forget', self.forget,
                                arg_synopsis='[GENERATION]...')
        self.app.settings.string(['keep'],
                                 'policy for what generations to keep '
                                 'when forgetting')

    def forget(self, args):
        '''Forget (remove) specified backup generations.'''
        self.app.settings.require('repository')
        self.app.settings.require('client-name')

        self.app.ts['gen'] = None
        self.app.ts['gens'] = []
        self.app.ts.format('forgetting generations: %Index(gen,gens) done')

        self.repo = self.app.open_repository()
        self.repo.lock_client(self.app.settings['client-name'])
        self.repo.lock_shared()
        self.app.dump_memory_profile('at beginning')
        if args:
            self.app.ts['gens'] = args
            for genspec in args:
                self.app.ts['gen'] = genspec
                genid = self.repo.genspec(genspec)
                self.app.ts.notify('Forgetting generation %s' % genid)
                self.remove(genid)
                self.app.dump_memory_profile('after removing %s' % genid)
        elif self.app.settings['keep']:
            genlist = []
            dt = datetime.datetime(1970, 1, 1, 0, 0, 0)
            for genid in self.repo.list_generations():
                start, end = self.repo.get_generation_times(genid)
                genlist.append((genid, dt.fromtimestamp(end)))

            fp = obnamlib.ForgetPolicy()
            rules = fp.parse(self.app.settings['keep'])
            keeplist = fp.match(rules, genlist)
            keepids = set(genid for genid, dt in keeplist)
            removeids = [genid for genid, dt in genlist
                         if genid not in keepids]

            self.app.ts['gens'] = removeids
            for genid in removeids:
                self.app.ts['gen'] = genid
                self.remove(genid)
                self.app.dump_memory_profile('after removing %s' % genid)

        self.repo.commit_client()
        self.repo.commit_shared()
        self.app.dump_memory_profile('after committing')
        self.repo.fs.close()
        self.app.ts.finish()

    def remove(self, genid):
        if self.app.settings['pretend']:
            self.app.ts.notify('Pretending to remove generation %s' % genid)
        else:
            self.repo.remove_generation(genid)

obnam-1.6.1/obnamlib/plugins/fsck_plugin.py

# Copyright (C) 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.


import larch.fsck
import logging
import os
import sys
import ttystatus

import obnamlib


class WorkItem(larch.fsck.WorkItem):

    '''A work item for fsck.

    Whoever creates a WorkItem shall set the ``repo`` to the repository
    being used.

    '''


class CheckChunk(WorkItem):

    def __init__(self, chunkid, checksummer):
        self.chunkid = chunkid
        self.checksummer = checksummer
        self.name = 'chunk %s' % chunkid

    def do(self):
        logging.debug('Checking chunk %s' % self.chunkid)
        if not self.repo.chunk_exists(self.chunkid):
            self.error('chunk %s does not exist' % self.chunkid)
        else:
            data = self.repo.get_chunk(self.chunkid)
            checksum = self.repo.checksum(data)
            try:
                correct = self.repo.chunklist.get_checksum(self.chunkid)
            except KeyError:
                self.error('chunk %s not in chunklist' % self.chunkid)
            else:
                if checksum != correct:
                    self.error('chunk %s has wrong checksum' % self.chunkid)
            if self.chunkid not in self.repo.chunksums.find(checksum):
                self.error('chunk %s not in chunksums' % self.chunkid)
            self.checksummer.update(data)
        self.chunkids_seen.add(self.chunkid)


class CheckFileChecksum(WorkItem):

    def __init__(self, filename, correct, chunkids, checksummer):
        self.filename = filename
        self.name = '%s checksum' % filename
        self.correct = correct
        self.chunkids = chunkids
        self.checksummer = checksummer

    def do(self):
        logging.debug('Checking whole-file checksum for %s' % self.filename)
        if self.correct != self.checksummer.digest():
            self.error('%s whole-file checksum mismatch' % self.name)


class CheckFile(WorkItem):

    def __init__(self, client_name, genid, filename, metadata):
        self.client_name = client_name
        self.genid = genid
        self.filename = filename
        self.metadata = metadata
        self.name = 'file %s:%s:%s' % (client_name, genid, filename)

    def do(self):
        logging.debug('Checking client=%s genid=%s filename=%s' %
                      (self.client_name, self.genid, self.filename))
        if self.repo.current_client != self.client_name:
            self.repo.open_client(self.client_name)
        if self.metadata.isfile() and not self.settings['fsck-ignore-chunks']:
            chunkids = self.repo.get_file_chunks(self.genid, self.filename)
            checksummer = self.repo.new_checksummer()
            for chunkid in chunkids:
                yield CheckChunk(chunkid, checksummer)
            yield CheckFileChecksum(
                self.name, self.metadata.md5, chunkids, checksummer)


class CheckDirectory(WorkItem):

    def __init__(self, client_name, genid, dirname):
        self.client_name = client_name
        self.genid = genid
        self.dirname = dirname
        self.name = 'dir %s:%s:%s' % (client_name, genid, dirname)

    def do(self):
        logging.debug('Checking client=%s genid=%s dirname=%s' %
                      (self.client_name, self.genid, self.dirname))
        if self.repo.current_client != self.client_name:
            self.repo.open_client(self.client_name)
        self.repo.get_metadata(self.genid, self.dirname)
        for basename in self.repo.listdir(self.genid, self.dirname):
            pathname = os.path.join(self.dirname, basename)
            metadata = self.repo.get_metadata(self.genid, pathname)
            if metadata.isdir():
                yield CheckDirectory(self.client_name, self.genid, pathname)
            elif not self.settings['fsck-skip-files']:
                yield CheckFile(
                    self.client_name, self.genid, pathname, metadata)


class CheckGeneration(WorkItem):

    def __init__(self, client_name, genid):
        self.client_name = client_name
        self.genid = genid
        self.name = 'generation %s:%s' % (client_name, genid)

    def do(self):
        logging.debug('Checking client=%s genid=%s' %
                      (self.client_name, self.genid))

        started, ended = self.repo.client.get_generation_times(self.genid)
        if started is None:
            self.error('%s:%s: no generation start time' %
                       (self.client_name, self.genid))
        if ended is None:
            self.error('%s:%s: no generation end time' %
                       (self.client_name, self.genid))

        n = self.repo.client.get_generation_file_count(self.genid)
        if n is None:
            self.error('%s:%s: no file count' %
                       (self.client_name, self.genid))

        n = self.repo.client.get_generation_data(self.genid)
        if n is None:
            self.error('%s:%s: no total data' %
                       (self.client_name, self.genid))

        if self.settings['fsck-skip-dirs']:
            return []
        else:
            return [CheckDirectory(self.client_name, self.genid, '/')]


class CheckGenerationIdsAreDifferent(WorkItem):

    def __init__(self, client_name, genids):
        self.client_name = client_name
        self.genids = list(genids)

    def do(self):
        logging.debug('Checking genid uniqueness for client=%s' %
                      self.client_name)
        done = set()
        while self.genids:
            genid = self.genids.pop()
            if genid in done:
                # Note: the original passed a bare genid to a two-slot
                # format string, which would raise TypeError; fixed here.
                self.error('%s: duplicate generation id %s' %
                           (self.client_name, genid))
            else:
                done.add(genid)


class CheckClientExists(WorkItem):

    def __init__(self, client_name):
        self.client_name = client_name
        self.name = 'does client %s exist?' % client_name

    def do(self):
        logging.debug('Checking client=%s exists' % self.client_name)
        client_id = self.repo.clientlist.get_client_id(self.client_name)
        if client_id is None:
            self.error('Client %s is in client list, but has no id' %
                       self.client_name)


class CheckClient(WorkItem):

    def __init__(self, client_name):
        self.client_name = client_name
        self.name = 'client %s' % client_name

    def do(self):
        logging.debug('Checking client=%s' % self.client_name)
        if self.repo.current_client != self.client_name:
            self.repo.open_client(self.client_name)
        genids = self.repo.list_generations()
        yield CheckGenerationIdsAreDifferent(self.client_name, genids)
        if self.settings['fsck-skip-generations']:
            genids = []
        elif self.settings['fsck-last-generation-only'] and genids:
            genids = genids[-1:]
        for genid in genids:
            yield CheckGeneration(self.client_name, genid)


class CheckClientlist(WorkItem):

    name = 'client list'

    def do(self):
        logging.debug('Checking clientlist')
        clients = self.repo.clientlist.list_clients()
        for client_name in clients:
            if client_name not in self.settings['fsck-ignore-client']:
                yield CheckClientExists(client_name)
        if not self.settings['fsck-skip-per-client-b-trees']:
            for client_name in clients:
                if client_name not in self.settings['fsck-ignore-client']:
                    client_id = self.repo.clientlist.get_client_id(client_name)
                    client_dir = self.repo.client_dir(client_id)
                    yield CheckBTree(str(client_dir))
        for client_name in clients:
            if client_name not in self.settings['fsck-ignore-client']:
                yield CheckClient(client_name)


class CheckForExtraChunks(WorkItem):

    def __init__(self):
        self.name = 'extra chunks'

    def do(self):
        logging.debug('Checking for extra chunks')
        for chunkid in self.repo.list_chunks():
            if chunkid not in self.chunkids_seen:
                self.error('chunk %s not used by anyone' % chunkid)


class CheckBTree(WorkItem):

    def __init__(self, dirname):
        self.dirname = dirname
        self.name = 'B-tree %s' % dirname

    def do(self):
        if not self.repo.fs.exists(self.dirname):
            logging.debug('B-tree %s does not exist, skipping' % self.dirname)
            return
        logging.debug('Checking B-tree %s' % self.dirname)
        fix = self.settings['fsck-fix']
        forest = larch.open_forest(allow_writes=fix, dirname=self.dirname,
                                   vfs=self.repo.fs)
        fsck = larch.fsck.Fsck(forest, self.warning, self.error, fix)
        for work in fsck.find_work():
            yield work


class CheckRepository(WorkItem):

    def __init__(self):
        self.name = 'repository'

    def do(self):
        logging.debug('Checking repository')
        if not self.settings['fsck-skip-shared-b-trees']:
            yield CheckBTree('clientlist')
            yield CheckBTree('chunklist')
            yield CheckBTree('chunksums')
        yield CheckClientlist()


class FsckPlugin(obnamlib.ObnamPlugin):

    def enable(self):
        self.app.add_subcommand('fsck', self.fsck)

        group = 'Integrity checking (fsck)'

        self.app.settings.boolean(
            ['fsck-fix'],
            'should fsck try to fix problems?',
            group=group)
        self.app.settings.boolean(
            ['fsck-ignore-chunks'],
            'ignore chunks when checking repository integrity (assume all '
            'chunks exist and are correct)',
            group=group)
        self.app.settings.string_list(
            ['fsck-ignore-client'],
            'do not check repository data for client NAME',
            metavar='NAME',
            group=group)
        self.app.settings.boolean(
            ['fsck-last-generation-only'],
            'check only the last generation for each client',
            group=group)
        self.app.settings.boolean(
            ['fsck-skip-generations'],
            'do not check any generations',
            group=group)
        self.app.settings.boolean(
            ['fsck-skip-dirs'],
            'do not check anything about directories and their files',
            group=group)
        self.app.settings.boolean(
            ['fsck-skip-files'],
            'do not check anything about files',
            group=group)
        self.app.settings.boolean(
            ['fsck-skip-per-client-b-trees'],
            'do not check per-client B-trees',
            group=group)
        self.app.settings.boolean(
            ['fsck-skip-shared-b-trees'],
            'do not check shared B-trees',
            group=group)

    def configure_ttystatus(self):
        self.app.ts.clear()
        self.app.ts['this_item'] = 0
        self.app.ts['items'] = 0
        self.app.ts.format(
            'Checking %Integer(this_item)/%Integer(items): %String(item)')

    def fsck(self, args):
        '''Verify internal consistency of backup repository.'''
        self.app.settings.require('repository')
        logging.debug('fsck on %s' % self.app.settings['repository'])

        self.configure_ttystatus()

        self.repo = self.app.open_repository()
        self.repo.lock_root()
        client_names = self.repo.list_clients()
        client_dirs = [self.repo.client_dir(
                           self.repo.clientlist.get_client_id(name))
                       for name in client_names]
        self.repo.lockmgr.lock(client_dirs)
        self.repo.lock_shared()

        self.errors = 0
        self.chunkids_seen = set()
        self.work_items = []
        self.add_item(CheckRepository(), append=True)

        final_items = []
        if not self.app.settings['fsck-ignore-chunks']:
            final_items.append(CheckForExtraChunks())

        while self.work_items:
            work = self.work_items.pop(0)
            logging.debug('doing: %s' % str(work))
            self.app.ts['item'] = work
            self.app.ts.increase('this_item', 1)
            pos = 0
            for more in work.do() or []:
                self.add_item(more, pos=pos)
                pos += 1
            if not self.work_items:
                for work in final_items:
                    self.add_item(work, append=True)
                final_items = []

        self.repo.unlock_shared()
        self.repo.lockmgr.unlock(client_dirs)
        self.repo.unlock_root()
        self.repo.fs.close()

        self.app.ts.finish()

        if self.errors:
            sys.exit(1)

    def add_item(self, work, append=False, pos=0):
        logging.debug('adding: %s' % str(work))
        work.warning = self.warning
        work.error = self.error
        work.repo = self.repo
        work.settings = self.app.settings
        work.chunkids_seen = self.chunkids_seen
        if append:
            self.work_items.append(work)
        else:
            self.work_items.insert(pos, work)
        self.app.ts.increase('items', 1)

    def error(self, msg):
        logging.error(msg)
        self.app.ts.error(msg)
        self.errors += 1

    def warning(self, msg):
        logging.warning(msg)
        self.app.ts.notify(msg)
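The fsck driver above is a work queue: each WorkItem's do() may yield further items, and yielded items are inserted at the front of the queue (in order) so a parent's follow-up checks run before unrelated pending work. A self-contained sketch of just that scheduling pattern, with no obnamlib types (Item and run are illustrative stand-ins):

```python
class Item(object):
    '''A stand-in for WorkItem: do() may be a generator of more items.'''

    def __init__(self, name, children=None):
        self.name = name
        self.children = children or []

    def do(self):
        # Like WorkItem.do: yield follow-up work, if any.
        for child in self.children:
            yield child


def run(root):
    '''Drain the queue the way FsckPlugin.fsck does; return visit order.'''
    done = []
    work_items = [root]
    while work_items:
        work = work_items.pop(0)
        done.append(work.name)
        pos = 0
        # Front-insert yielded items in order, so children run
        # before whatever else was already queued.
        for more in work.do() or []:
            work_items.insert(pos, more)
            pos += 1
    return done


order = run(Item('repo', [Item('client', [Item('chunk-1'), Item('chunk-2')]),
                          Item('extra')]))
```

Front insertion is what makes the traversal depth-first: 'client' and its chunks are fully processed before 'extra', just as fsck finishes one generation's files before moving on.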
obnam-1.6.1/obnamlib/plugins/fuse_plugin.py0000644000175000017500000005364612246357067020667 0ustar jenkinsjenkins# Copyright (C) 2013 Valery Yundin # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . import os import stat import sys import logging import errno import struct import signal import obnamlib try: import fuse fuse.fuse_python_api = (0, 2) except ImportError: class Bunch: def __init__(self, **kwds): self.__dict__.update(kwds) fuse = Bunch(Fuse = object) class ObnamFuseOptParse(object): '''Option parsing class for FUSE has to set fuse_args.mountpoint ''' obnam = None def __init__(self, *args, **kw): self.fuse_args = \ 'fuse_args' in kw and kw.pop('fuse_args') or fuse.FuseArgs() if 'fuse' in kw: self.fuse = kw.pop('fuse') def parse_args(self, args=None, values=None): self.fuse_args.mountpoint = self.obnam.app.settings['to'] for opt in self.obnam.app.settings['fuse-opt']: if opt == '-f': self.fuse_args.setmod('foreground') else: self.fuse_args.add(opt) if not hasattr(self.fuse_args, 'ro'): self.fuse_args.add('ro') class ObnamFuseFile(object): fs = None # points to active ObnamFuse object direct_io = False # do not use direct I/O on this file. keep_cache = True # cached file data need not to be invalidated. 
def __init__(self, path, flags, *mode): logging.debug('FUSE file open %s %d', path, flags) if ((flags & os.O_WRONLY) or (flags & os.O_RDWR) or (flags & os.O_CREAT) or (flags & os.O_EXCL) or (flags & os.O_TRUNC) or (flags & os.O_APPEND)): raise IOError(errno.EROFS, 'Read only filesystem') try: self.path = path if path == '/.pid' and self.fs.obnam.app.settings['viewmode'] == 'multiple': self.read = self.read_pid return self.metadata = self.fs.get_metadata(path) # if not a regular file return EINVAL if not stat.S_ISREG(self.metadata.st_mode): raise IOError(errno.EINVAL, 'Invalid argument') self.chunkids = None self.chunksize = None self.lastdata = None self.lastblock = None except: logging.error('Unexpected exception', exc_info=True) raise def read_pid(self, length, offset): pid = str(os.getpid()) if length < len(pid) or offset != 0: return '' else: return pid def fgetattr(self): logging.debug('FUSE file fgetattr') return self.fs.getattr(self.path) def read(self, length, offset): logging.debug('FUSE file read(%s, %d, %d)', self.path, length, offset) try: if length == 0 or offset >= self.metadata.st_size: return '' repo = self.fs.obnam.repo gen, repopath = self.fs.get_gen_path(self.path) # if stored inside B-tree contents = repo.get_file_data(gen, repopath) if contents is not None: return contents[offset:offset+length] # stored in chunks if not self.chunkids: self.chunkids = repo.get_file_chunks(gen, repopath) if len(self.chunkids) == 1: if not self.lastdata: self.lastdata = repo.get_chunk(self.chunkids[0]) return self.lastdata[offset:offset+length] else: chunkdata = None if not self.chunksize: # take the cached value as the first guess for chunksize self.chunksize = self.fs.sizecache.get(gen, self.fs.chunksize) blocknum = offset/self.chunksize blockoffs = offset - blocknum*self.chunksize # read a chunk if guessed blocknum and chunksize make sense if blocknum < len(self.chunkids): chunkdata = repo.get_chunk(self.chunkids[blocknum]) else: chunkdata = '' # check if 
chunkdata is of expected length validate = min(self.chunksize, self.metadata.st_size - blocknum*self.chunksize) if validate != len(chunkdata): if blocknum < len(self.chunkids)-1: # the length of all but last chunks is chunksize self.chunksize = len(chunkdata) else: # guessing failed, get the length of the first chunk self.chunksize = len(repo.get_chunk(self.chunkids[0])) chunkdata = None # save correct chunksize self.fs.sizecache[gen] = self.chunksize if not chunkdata: blocknum = offset/self.chunksize blockoffs = offset - blocknum*self.chunksize if self.lastblock == blocknum: chunkdata = self.lastdata else: chunkdata = repo.get_chunk(self.chunkids[blocknum]) output = [] while True: output.append(chunkdata[blockoffs:blockoffs+length]) readlength = len(chunkdata) - blockoffs if length > readlength and blocknum < len(self.chunkids)-1: length -= readlength blocknum += 1 blockoffs = 0 chunkdata = repo.get_chunk(self.chunkids[blocknum]) else: self.lastblock = blocknum self.lastdata = chunkdata break return ''.join(output) except (OSError, IOError), e: logging.debug('FUSE Expected exception') raise except: logging.exception('Unexpected exception') raise def release(self, flags): logging.debug('FUSE file release %d', flags) self.lastdata = None return 0 def fsync(self, isfsyncfile): logging.debug('FUSE file fsync') return 0 def flush(self): logging.debug('FUSE file flush') return 0 def ftruncate(self, size): logging.debug('FUSE file ftruncate %d', size) return 0 def lock(self, cmd, owner, **kw): logging.debug('FUSE file lock %s %s %s', repr(cmd), repr(owner), repr(kw)) raise IOError(errno.EOPNOTSUPP, 'Operation not supported') class ObnamFuse(fuse.Fuse): '''FUSE main class ''' MAX_METADATA_CACHE = 512 def sigUSR1(self): if self.obnam.app.settings['viewmode'] == 'multiple': repo = self.obnam.app.open_repository() repo.open_client(self.obnam.app.settings['client-name']) generations = [gen for gen in repo.list_generations() if not repo.get_is_checkpoint(gen)] self.obnam.repo 
= repo self.rootstat, self.rootlist = self.multiple_root_list(generations) self.metadatacache.clear() def get_metadata(self, path): #logging.debug('FUSE get_metadata(%s)', path) try: return self.metadatacache[path] except KeyError: if len(self.metadatacache) > self.MAX_METADATA_CACHE: self.metadatacache.clear() metadata = self.obnam.repo.get_metadata(*self.get_gen_path(path)) self.metadatacache[path] = metadata return metadata def get_stat(self, path): logging.debug('FUSE get_stat(%s)', path) metadata = self.get_metadata(path) st = fuse.Stat() st.st_mode = metadata.st_mode st.st_dev = metadata.st_dev st.st_nlink = metadata.st_nlink st.st_uid = metadata.st_uid st.st_gid = metadata.st_gid st.st_size = metadata.st_size st.st_atime = metadata.st_atime_sec st.st_mtime = metadata.st_mtime_sec st.st_ctime = st.st_mtime return st def single_root_list(self, gen): repo = self.obnam.repo mountroot = self.obnam.mountroot rootlist = {} for entry in repo.listdir(gen, mountroot): path = '/' + entry rootlist[path] = self.get_stat(path) rootstat = self.get_stat('/') return (rootstat, rootlist) def multiple_root_list(self, generations): repo = self.obnam.repo mountroot = self.obnam.mountroot rootlist = {} used_generations = [] for gen in generations: path = '/' + str(gen) try: genstat = self.get_stat(path) start, end = repo.get_generation_times(gen) genstat.st_ctime = genstat.st_mtime = end rootlist[path] = genstat used_generations.append(gen) except obnamlib.Error: pass if not used_generations: raise obnamlib.Error('No generations found for %s' % mountroot) latest = used_generations[-1] laststat = rootlist['/' + str(latest)] rootstat = fuse.Stat(**laststat.__dict__) laststat = fuse.Stat(target=str(latest), **laststat.__dict__) laststat.st_mode &= ~(stat.S_IFDIR | stat.S_IFREG) laststat.st_mode |= stat.S_IFLNK rootlist['/latest'] = laststat pidstat = fuse.Stat(**rootstat.__dict__) pidstat.st_mode = stat.S_IFREG | stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH rootlist['/.pid'] = pidstat 
return (rootstat, rootlist) def init_root(self): repo = self.obnam.repo mountroot = self.obnam.mountroot generations = self.obnam.app.settings['generation'] if self.obnam.app.settings['viewmode'] == 'single': if len(generations) != 1: raise obnamlib.Error( 'The single mode wants exactly one generation option') gen = repo.genspec(generations[0]) if mountroot == '/': self.get_gen_path = lambda path: (gen, path) else: self.get_gen_path = (lambda path : path == '/' and (gen, mountroot) or (gen, mountroot + path)) self.rootstat, self.rootlist = self.single_root_list(gen) logging.debug('FUSE single rootlist %s', repr(self.rootlist)) elif self.obnam.app.settings['viewmode'] == 'multiple': # we need the list of all real (non-checkpoint) generations if len(generations) == 1: generations = [gen for gen in repo.list_generations() if not repo.get_is_checkpoint(gen)] if mountroot == '/': def gen_path_0(path): if path.count('/') == 1: gen = path[1:] return (int(gen), mountroot) else: gen, repopath = path[1:].split('/', 1) return (int(gen), mountroot + repopath) self.get_gen_path = gen_path_0 else: def gen_path_n(path): if path.count('/') == 1: gen = path[1:] return (int(gen), mountroot) else: gen, repopath = path[1:].split('/', 1) return (int(gen), mountroot + '/' + repopath) self.get_gen_path = gen_path_n self.rootstat, self.rootlist = self.multiple_root_list(generations) logging.debug('FUSE multiple rootlist %s', repr(self.rootlist)) else: raise obnamlib.Error('Unknown value for viewmode') def __init__(self, *args, **kw): self.obnam = kw['obnam'] ObnamFuseFile.fs = self self.file_class = ObnamFuseFile self.metadatacache = {} self.chunksize = self.obnam.app.settings['chunk-size'] self.sizecache = {} self.rootlist = None self.rootstat = None self.init_root() fuse.Fuse.__init__(self, *args, **kw) def getattr(self, path): try: if path.count('/') == 1: if path == '/': return self.rootstat elif path in self.rootlist: return self.rootlist[path] else: raise obnamlib.Error('ENOENT') 
            else:
                return self.get_stat(path)
        except obnamlib.Error:
            raise IOError(errno.ENOENT, 'No such file or directory')
        except:
            logging.error('Unexpected exception', exc_info=True)
            raise

    def readdir(self, path, fh):
        logging.debug('FUSE readdir(%s, %s)', path, repr(fh))
        try:
            if path == '/':
                listdir = [x[1:] for x in self.rootlist.keys()]
            else:
                listdir = self.obnam.repo.listdir(*self.get_gen_path(path))
            return [fuse.Direntry(name) for name in ['.', '..'] + listdir]
        except obnamlib.Error:
            raise IOError(errno.EINVAL, 'Invalid argument')
        except:
            logging.error('Unexpected exception', exc_info=True)
            raise

    def readlink(self, path):
        try:
            statdata = self.rootlist.get(path)
            if statdata and hasattr(statdata, 'target'):
                return statdata.target
            metadata = self.get_metadata(path)
            if metadata.islink():
                return metadata.target
            else:
                raise IOError(errno.EINVAL, 'Invalid argument')
        except obnamlib.Error:
            raise IOError(errno.ENOENT, 'No such file or directory')
        except:
            logging.error('Unexpected exception', exc_info=True)
            raise

    def statfs(self):
        logging.debug('FUSE statfs')
        try:
            repo = self.obnam.repo
            if self.obnam.app.settings['viewmode'] == 'multiple':
                blocks = sum(repo.client.get_generation_data(gen)
                             for gen in repo.list_generations())
                files = sum(repo.client.get_generation_file_count(gen)
                            for gen in repo.list_generations())
            else:
                gen = self.get_gen_path('/')[0]
                blocks = repo.client.get_generation_data(gen)
                files = repo.client.get_generation_file_count(gen)
            stv = fuse.StatVfs()
            stv.f_bsize = 65536
            stv.f_frsize = 0
            stv.f_blocks = blocks / 65536
            stv.f_bfree = 0
            stv.f_bavail = 0
            stv.f_files = files
            stv.f_ffree = 0
            stv.f_favail = 0
            stv.f_flag = 0
            stv.f_namemax = 255
            #raise OSError(errno.ENOSYS, 'Unimplemented')
            return stv
        except:
            logging.error('Unexpected exception', exc_info=True)
            raise

    def getxattr(self, path, name, size):
        logging.debug('FUSE getxattr %s %s %d', path, name, size)
        try:
            try:
                metadata = self.get_metadata(path)
            except ValueError:
                return 0
            if not metadata.xattr:
                return 0
            blob = metadata.xattr
            sizesize = struct.calcsize('!Q')
            name_blob_size = struct.unpack('!Q', blob[:sizesize])[0]
            name_blob = blob[sizesize : sizesize + name_blob_size]
            name_list = name_blob.split('\0')[:-1]
            if name in name_list:
                value_blob = blob[sizesize + name_blob_size : ]
                idx = name_list.index(name)
                fmt = '!' + 'Q' * len(name_list)
                lengths_size = sizesize * len(name_list)
                lengths_list = struct.unpack(fmt, value_blob[:lengths_size])
                if size == 0:
                    return lengths_list[idx]
                pos = lengths_size + sum(lengths_list[:idx])
                value = value_blob[pos:pos + lengths_list[idx]]
                return value
        except obnamlib.Error:
            raise IOError(errno.ENOENT, 'No such file or directory')
        except:
            logging.error('Unexpected exception', exc_info=True)
            raise

    def listxattr(self, path, size):
        logging.debug('FUSE listxattr %s %d', path, size)
        try:
            metadata = self.get_metadata(path)
            if not metadata.xattr:
                return 0
            blob = metadata.xattr
            sizesize = struct.calcsize('!Q')
            name_blob_size = struct.unpack('!Q', blob[:sizesize])[0]
            if size == 0:
                return name_blob_size
            name_blob = blob[sizesize : sizesize + name_blob_size]
            return name_blob.split('\0')[:-1]
        except obnamlib.Error:
            raise IOError(errno.ENOENT, 'No such file or directory')
        except:
            logging.error('Unexpected exception', exc_info=True)
            raise

    def fsync(self, path, isFsyncFile):
        return 0

    def chmod(self, path, mode):
        raise IOError(errno.EROFS, 'Read only filesystem')

    def chown(self, path, uid, gid):
        raise IOError(errno.EROFS, 'Read only filesystem')

    def link(self, targetPath, linkPath):
        raise IOError(errno.EROFS, 'Read only filesystem')

    def mkdir(self, path, mode):
        raise IOError(errno.EROFS, 'Read only filesystem')

    def mknod(self, path, mode, dev):
        raise IOError(errno.EROFS, 'Read only filesystem')

    def rename(self, oldPath, newPath):
        raise IOError(errno.EROFS, 'Read only filesystem')

    def rmdir(self, path):
        raise IOError(errno.EROFS, 'Read only filesystem')

    def symlink(self, targetPath, linkPath):
        raise IOError(errno.EROFS, 'Read only filesystem')

    def truncate(self, path, size):
        raise IOError(errno.EROFS, 'Read only filesystem')

    def unlink(self, path):
        raise IOError(errno.EROFS, 'Read only filesystem')

    def utime(self, path, times):
        raise IOError(errno.EROFS, 'Read only filesystem')

    def write(self, path, buf, offset):
        raise IOError(errno.EROFS, 'Read only filesystem')

    def setxattr(self, path, name, val, flags):
        raise IOError(errno.EROFS, 'Read only filesystem')

    def removexattr(self, path, name):
        raise IOError(errno.EROFS, 'Read only filesystem')


class MountPlugin(obnamlib.ObnamPlugin):

    '''Mount backup repository as a user-space filesystem.

    At the moment, only a specific generation can be mounted.

    '''

    def enable(self):
        mount_group = obnamlib.option_group['mount'] = 'Mounting with FUSE'
        self.app.add_subcommand('mount', self.mount, arg_synopsis='[ROOT]')
        self.app.settings.choice(['viewmode'],
                                 ['single', 'multiple'],
                                 '"single" directly mounts the specified '
                                 'generation; "multiple" mounts all '
                                 'generations as separate directories',
                                 metavar='MODE',
                                 group=mount_group)
        self.app.settings.string_list(['fuse-opt'],
                                      'options to pass directly to Fuse',
                                      metavar='FUSE',
                                      group=mount_group)

    def mount(self, args):
        '''Mount a backup repository as a FUSE filesystem.

        This subcommand allows you to access backups in an Obnam
        backup repository as normal files and directories. Each
        backed up file or directory can be viewed directly, using
        a graphical file manager or command line tools.
        Example: To mount your backup repository:

            mkdir my-fuse
            obnam mount --viewmode multiple --to my-fuse

        You can then access the backup using commands such as these:

            ls -l my-fuse
            ls -l my-fuse/latest
            diff -u my-fuse/latest/home/liw/README ~/README

        You can also restore files by copying them from the
        my-fuse directory:

            cp -a my-fuse/12765/Maildir ~/Maildir.restored

        To un-mount:

            fusermount -u my-fuse

        '''

        if not hasattr(fuse, 'fuse_python_api'):
            raise obnamlib.Error('Failed to load module "fuse", '
                                 'try installing python-fuse')
        self.app.settings.require('repository')
        self.app.settings.require('client-name')
        self.app.settings.require('to')
        self.repo = self.app.open_repository()
        self.repo.open_client(self.app.settings['client-name'])
        self.mountroot = (['/'] + self.app.settings['root'] + args)[-1]
        if self.mountroot != '/':
            self.mountroot = self.mountroot.rstrip('/')

        logging.debug('FUSE Mounting %s@%s:%s to %s',
                      self.app.settings['client-name'],
                      self.app.settings['generation'],
                      self.mountroot, self.app.settings['to'])

        try:
            ObnamFuseOptParse.obnam = self
            fs = ObnamFuse(obnam=self, parser_class=ObnamFuseOptParse)
            signal.signal(signal.SIGUSR1, lambda s, f: fs.sigUSR1())
            signal.siginterrupt(signal.SIGUSR1, False)
            fs.flags = 0
            fs.multithreaded = 0
            fs.parse()
            fs.main()
        except fuse.FuseError, e:
            raise obnamlib.Error(repr(e))

        self.repo.fs.close()

obnam-1.6.1/obnamlib/plugins/restore_plugin.py:

# Copyright (C) 2009, 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


import logging
import os
import stat
import time

import ttystatus

import obnamlib


class Hardlinks(object):

    '''Keep track of inodes with unrestored hardlinks.'''

    def __init__(self):
        self.inodes = dict()

    def key(self, metadata):
        return '%s:%s' % (metadata.st_dev, metadata.st_ino)

    def add(self, filename, metadata):
        self.inodes[self.key(metadata)] = (filename, metadata.st_nlink)

    def filename(self, metadata):
        key = self.key(metadata)
        if key in self.inodes:
            return self.inodes[key][0]
        else:
            return None

    def forget(self, metadata):
        key = self.key(metadata)
        filename, nlinks = self.inodes[key]
        if nlinks <= 2:
            del self.inodes[key]
        else:
            self.inodes[key] = (filename, nlinks - 1)


class RestorePlugin(obnamlib.ObnamPlugin):

    # A note about the implementation: we need to make sure all the
    # files we restore go into the target directory. We do this by
    # prefixing all filenames we write to with './', and then using
    # os.path.join to put the target directory name at the beginning.
    # The './' business is necessary because os.path.join(a, b) returns
    # just b if b is an absolute path.
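The implementation note above (prefixing restored paths with './' before calling os.path.join) can be demonstrated in isolation. This is a minimal standalone sketch, not part of Obnam itself; the target path is illustrative:

```python
import os.path

target = '/tmp/restore-target'   # illustrative restore target
backup_path = '/etc/passwd'      # paths in a generation are absolute

# os.path.join(a, b) returns just b when b is an absolute path,
# so a naive join would escape the target directory:
assert os.path.join(target, backup_path) == '/etc/passwd'

# Prefixing with './' makes the second component relative, so the
# joined path stays inside the target directory:
safe = os.path.join(target, './' + backup_path)
assert os.path.normpath(safe) == '/tmp/restore-target/etc/passwd'
```

This is why the restore code always writes to './' + pathname relative to the target filesystem.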
def enable(self): self.app.add_subcommand('restore', self.restore, arg_synopsis='[DIRECTORY]...') self.app.settings.string(['to'], 'where to restore') self.app.settings.string_list(['generation'], 'which generation to restore', default=['latest']) @property def write_ok(self): return not self.app.settings['dry-run'] def configure_ttystatus(self): self.app.ts['current'] = '' self.app.ts['total'] = 0 self.app.ts['current-bytes'] = 0 self.app.ts['total-bytes'] = 0 self.app.ts.format('%RemainingTime(current-bytes,total-bytes) ' '%Counter(current) files ' '%ByteSize(current-bytes) ' '(%PercentDone(current-bytes,total-bytes)) ' '%ByteSpeed(current-bytes) ' '%Pathname(current)') def restore(self, args): '''Restore some or all files from a generation.''' self.app.settings.require('repository') self.app.settings.require('client-name') self.app.settings.require('generation') self.app.settings.require('to') logging.debug('restoring generation %s' % self.app.settings['generation']) logging.debug('restoring to %s' % self.app.settings['to']) logging.debug('restoring what: %s' % repr(args)) if not args: logging.debug('no args given, so restoring everything') args = ['/'] self.downloaded_bytes = 0 self.file_count = 0 self.started = time.time() self.repo = self.app.open_repository() self.repo.open_client(self.app.settings['client-name']) if self.write_ok: self.fs = self.app.fsf.new(self.app.settings['to'], create=True) self.fs.connect() else: self.fs = None # this will trigger error if we try to really write self.hardlinks = Hardlinks() self.errors = False generations = self.app.settings['generation'] if len(generations) != 1: raise obnamlib.Error( 'The restore command wants exactly one generation option') gen = self.repo.genspec(generations[0]) self.configure_ttystatus() self.app.ts['total'] = self.repo.client.get_generation_file_count(gen) self.app.ts['total-bytes'] = self.repo.client.get_generation_data(gen) self.app.dump_memory_profile('at beginning after setup') for arg in 
args: self.restore_something(gen, arg) self.app.dump_memory_profile('at restoring %s' % repr(arg)) self.repo.fs.close() if self.write_ok: self.fs.close() self.app.ts.clear() self.report_stats() self.app.ts.finish() if self.errors: raise obnamlib.Error('There were errors when restoring') def restore_something(self, gen, root): for pathname, metadata in self.repo.walk(gen, root, depth_first=True): self.file_count += 1 self.app.ts['current'] = pathname self.restore_safely(gen, pathname, metadata) def restore_safely(self, gen, pathname, metadata): try: dirname = os.path.dirname(pathname) if self.write_ok and not self.fs.exists('./' + dirname): self.fs.makedirs('./' + dirname) set_metadata = True if metadata.isdir(): self.restore_dir(gen, pathname, metadata) elif metadata.islink(): self.restore_symlink(gen, pathname, metadata) elif metadata.st_nlink > 1: link = self.hardlinks.filename(metadata) if link: self.restore_hardlink(pathname, link, metadata) set_metadata = False else: self.hardlinks.add(pathname, metadata) self.restore_first_link(gen, pathname, metadata) else: self.restore_first_link(gen, pathname, metadata) if set_metadata and self.write_ok: try: obnamlib.set_metadata(self.fs, './' + pathname, metadata) except (IOError, OSError), e: msg = ('Could not set metadata: %s: %d: %s' % (pathname, e.errno, e.strerror)) logging.error(msg) self.app.ts.notify(msg) self.errors = True except Exception, e: # Reaching this code path means we've hit a bug, so we log a full traceback. 
msg = "Failed to restore %s:" % (pathname,) logging.exception(msg) self.app.ts.notify(msg + " " + str(e)) self.errors = True def restore_dir(self, gen, root, metadata): logging.debug('restoring dir %s' % root) if self.write_ok: if not self.fs.exists('./' + root): self.fs.mkdir('./' + root) self.app.dump_memory_profile('after recursing through %s' % repr(root)) def restore_hardlink(self, filename, link, metadata): logging.debug('restoring hardlink %s to %s' % (filename, link)) if self.write_ok: self.fs.link('./' + link, './' + filename) self.hardlinks.forget(metadata) def restore_symlink(self, gen, filename, metadata): logging.debug('restoring symlink %s' % filename) def restore_first_link(self, gen, filename, metadata): if stat.S_ISREG(metadata.st_mode): self.restore_regular_file(gen, filename, metadata) elif stat.S_ISFIFO(metadata.st_mode): self.restore_fifo(gen, filename, metadata) elif stat.S_ISSOCK(metadata.st_mode): self.restore_socket(gen, filename, metadata) elif stat.S_ISBLK(metadata.st_mode) or stat.S_ISCHR(metadata.st_mode): self.restore_device(gen, filename, metadata) else: msg = ('Unknown file type: %s (%o)' % (filename, metadata.st_mode)) logging.error(msg) self.app.ts.notify(msg) def restore_regular_file(self, gen, filename, metadata): logging.debug('restoring regular %s' % filename) if self.write_ok: f = self.fs.open('./' + filename, 'wb') summer = self.repo.new_checksummer() try: contents = self.repo.get_file_data(gen, filename) if contents is None: chunkids = self.repo.get_file_chunks(gen, filename) self.restore_chunks(f, chunkids, summer) else: f.write(contents) summer.update(contents) self.downloaded_bytes += len(contents) except obnamlib.MissingFilterError, e: msg = 'Missing filter error during restore: %s' % filename logging.error(msg) self.app.ts.notify(msg) self.errors = True f.close() correct_checksum = metadata.md5 if summer.digest() != correct_checksum: msg = 'File checksum restore error: %s' % filename logging.error(msg) 
self.app.ts.notify(msg) self.errors = True def restore_chunks(self, f, chunkids, checksummer): zeroes = '' hole_at_end = False for chunkid in chunkids: data = self.repo.get_chunk(chunkid) self.verify_chunk_checksum(data, chunkid) checksummer.update(data) self.downloaded_bytes += len(data) if len(data) != len(zeroes): zeroes = '\0' * len(data) if data == zeroes: f.seek(len(data), 1) hole_at_end = True else: f.write(data) hole_at_end = False self.app.ts['current-bytes'] += len(data) if hole_at_end: pos = f.tell() if pos > 0: f.seek(-1, 1) f.write('\0') def verify_chunk_checksum(self, data, chunkid): checksum = self.repo.checksum(data) try: wanted = self.repo.chunklist.get_checksum(chunkid) except KeyError: # Chunk might not be in the tree, but that does not # mean it is invalid. We'll assume it is valid. return if checksum != wanted: raise obnamlib.Error('chunk %s checksum error' % chunkid) def restore_fifo(self, gen, filename, metadata): logging.debug('restoring fifo %s' % filename) if self.write_ok: self.fs.mknod('./' + filename, metadata.st_mode) def restore_socket(self, gen, filename, metadata): logging.debug('restoring socket %s' % filename) if self.write_ok: self.fs.mknod('./' + filename, metadata.st_mode) def restore_device(self, gen, filename, metadata): logging.debug('restoring device %s' % filename) if self.write_ok: self.fs.mknod('./' + filename, metadata.st_mode) def report_stats(self): size_table = [ (1024**4, 'TiB'), (1024**3, 'GiB'), (1024**2, 'MiB'), (1024**1, 'KiB'), (0, 'B') ] for size_base, size_unit in size_table: if self.downloaded_bytes >= size_base: if size_base > 0: size_amount = (float(self.downloaded_bytes) / float(size_base)) else: size_amount = float(self.downloaded_bytes) break speed_table = [ (1024**3, 'GiB/s'), (1024**2, 'MiB/s'), (1024**1, 'KiB/s'), (0, 'B/s') ] duration = time.time() - self.started speed = float(self.downloaded_bytes) / duration for speed_base, speed_unit in speed_table: if speed >= speed_base: if speed_base > 0: 
                    speed_amount = speed / speed_base
                else:
                    speed_amount = speed
                break

        duration_string = ''
        seconds = duration
        if seconds >= 3600:
            duration_string += '%dh' % int(seconds/3600)
            seconds %= 3600
        if seconds >= 60:
            duration_string += '%dm' % int(seconds/60)
            seconds %= 60
        if seconds > 0:
            duration_string += '%ds' % round(seconds)

        logging.info('Restore performance statistics:')
        logging.info('* files restored: %s' % self.file_count)
        logging.info('* downloaded data: %s bytes (%s %s)' %
                     (self.downloaded_bytes, size_amount, size_unit))
        logging.info('* duration: %s s' % duration)
        logging.info('* average speed: %s %s' % (speed_amount, speed_unit))
        self.app.ts.notify(
            'Restored %d files, '
            'downloaded %.1f %s in %s at %.1f %s average speed' %
            (self.file_count, size_amount, size_unit,
             duration_string, speed_amount, speed_unit))

obnam-1.6.1/obnamlib/plugins/sftp_plugin.py:

# Copyright (C) 2009 Lars Wirzenius
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.


import errno
import getpass  # needed by SftpFS._load_from_key_file below
import hashlib
import logging
import os
import pwd
import random
import socket
import stat
import subprocess
import time
import traceback
import urlparse

# As of 2010-07-10, Debian's paramiko package triggers
# RandomPool_DeprecationWarning. This will eventually be fixed. Until
# then, there is no point in spewing the warning to the user, who
# can't do anything about it.
# http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=586925
import warnings
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    import paramiko

import obnamlib


def ioerror_to_oserror(method):
    '''Decorator to convert an IOError exception to OSError.

    Python's os.* raise OSError, mostly, but paramiko's corresponding
    methods raise IOError. This decorator fixes that.

    '''

    def helper(self, filename, *args, **kwargs):
        try:
            return method(self, filename, *args, **kwargs)
        except IOError, e:
            raise OSError(e.errno, e.strerror or str(e), filename)

    return helper


class SSHChannelAdapter(object):

    '''Take an ssh subprocess and pretend it is a paramiko Channel.'''

    # This is inspired by the ssh.py module in bzrlib.

    def __init__(self, proc):
        self.proc = proc

    def send(self, data):
        return os.write(self.proc.stdin.fileno(), data)

    def recv(self, count):
        try:
            return os.read(self.proc.stdout.fileno(), count)
        except socket.error, e:
            if e.args[0] in (errno.EPIPE, errno.ECONNRESET,
                             errno.ECONNABORTED, errno.EBADF):
                # Connection has closed. Paramiko expects an empty string
                # in this case, not an exception.
                return ''
            raise

    def get_name(self):
        return 'obnam SSHChannelAdapter'

    def close(self):
        logging.debug('SSHChannelAdapter.close called')
        for func in [self.proc.stdin.close, self.proc.stdout.close,
                     self.proc.wait]:
            try:
                func()
            except OSError:
                pass


class SftpFS(obnamlib.VirtualFileSystem):

    '''A VFS implementation for SFTP.'''

    # 32 KiB is the chunk size that gives me the fastest speed
    # for sftp transfers. I don't know why the size matters.
chunk_size = 32 * 1024 def __init__(self, baseurl, create=False, settings=None): obnamlib.VirtualFileSystem.__init__(self, baseurl) self.sftp = None self.settings = settings self._roundtrips = 0 self._initial_dir = None self.reinit(baseurl, create=create) # Backwards compatibility with old, deprecated option: if settings and settings['strict-ssh-host-keys']: settings["ssh-host-keys-check"] = "yes" def _delay(self): self._roundtrips += 1 if self.settings: ms = self.settings['sftp-delay'] if ms > 0: time.sleep(ms * 0.001) def log_stats(self): obnamlib.VirtualFileSystem.log_stats(self) logging.info('VFS: baseurl=%s roundtrips=%s' % (self.baseurl, self._roundtrips)) def _to_string(self, str_or_unicode): if type(str_or_unicode) is unicode: return str_or_unicode.encode('utf-8') else: return str_or_unicode def _create_root_if_missing(self): try: self.mkdir(self.path) except OSError, e: # sftp/paramiko does not give us a useful errno so we hope # for the best pass self.create_path_if_missing = False # only create once def connect(self): try_openssh = not self.settings or not self.settings['pure-paramiko'] if not try_openssh or not self._connect_openssh(): self._connect_paramiko() if self.create_path_if_missing: self._create_root_if_missing() self.chdir('.') self._initial_dir = self.getcwd() self.chdir(self.path) def _connect_openssh(self): executable = 'ssh' args = ['-oForwardX11=no', '-oForwardAgent=no', '-oClearAllForwardings=yes', '-oProtocol=2', '-s'] if self.settings and self.settings['ssh-command']: executable = self.settings["ssh-command"] # default user/port from ssh (could be a per host configuration) if self.port: args += ['-p', str(self.port)] if self.user: args += ['-l', self.user] if self.settings and self.settings['ssh-key']: args += ['-i', self.settings['ssh-key']] if (self.settings and self.settings['ssh-host-keys-check'] != "ssh-config"): value = self.settings['ssh-host-keys-check'] args += ['-o', 'StrictHostKeyChecking=%s' % (value,)] if self.settings and 
self.settings['ssh-known-hosts']: args += ['-o', 'UserKnownHostsFile=%s' % self.settings['ssh-known-hosts']] args += [self.host, 'sftp'] # prepend the executable to the argument list args.insert(0, executable) logging.debug('executing openssh: %s' % args) try: proc = subprocess.Popen(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE, close_fds=True) except OSError: return False self.transport = None self.sftp = paramiko.SFTPClient(SSHChannelAdapter(proc)) return True def _connect_paramiko(self): logging.debug( 'connect_paramiko: host=%s port=%s' % (self.host, self.port)) if self.port: remote = (self.host, self.port) else: remote = (self.host) self.transport = paramiko.Transport(remote) self.transport.connect() logging.debug('connect_paramiko: connected') try: self._check_host_key(self.host) except BaseException, e: self.transport.close() self.transport = None raise logging.debug('connect_paramiko: host key checked') self._authenticate(self.user) logging.debug('connect_paramiko: authenticated') self.sftp = paramiko.SFTPClient.from_transport(self.transport) logging.debug('connect_paramiko: end') def _check_host_key(self, hostname): logging.debug('checking ssh host key for %s' % hostname) offered_key = self.transport.get_remote_server_key() known_hosts_path = self.settings['ssh-known-hosts'] known_hosts = paramiko.util.load_host_keys(known_hosts_path) known_keys = known_hosts.lookup(hostname) if known_keys is None: if self.settings['ssh-host-keys-check'] == 'yes': raise obnamlib.Error('No known host key for %s' % hostname) logging.warning('No known host keys for %s; accepting offered key' % hostname) return offered_type = offered_key.get_name() if not known_keys.has_key(offered_type): if self.settings['ssh-host-keys-check'] == 'yes': raise obnamlib.Error('No known type %s host key for %s' % (offered_type, hostname)) logging.warning('No known host key of type %s for %s; accepting ' 'offered key' % (offered_type, hostname)) known_key = known_keys[offered_type] if 
offered_key != known_key: raise obnamlib.Error('SSH server %s offered wrong public key' % hostname) logging.debug('Host key for %s OK' % hostname) def _authenticate(self, username): if not username: username = self._get_username() for key in self._find_auth_keys(): try: self.transport.auth_publickey(username, key) return except paramiko.SSHException: pass raise obnamlib.Error('Can\'t authenticate to SSH server using key.') def _find_auth_keys(self): if self.settings and self.settings['ssh-key']: return [self._load_from_key_file(self.settings['ssh-key'])] else: return self._load_from_agent() def _load_from_key_file(self, filename): try: key = paramiko.RSAKey.from_private_key_file(filename) except paramiko.PasswordRequiredException: password = getpass.getpass('RSA key password for %s: ' % filename) key = paramiko.RSAKey.from_private_key_file(filename, password) return key def _load_from_agent(self): agent = paramiko.Agent() return agent.get_keys() def close(self): logging.debug('SftpFS.close called') self.sftp.close() self.sftp = None if self.transport: self.transport.close() self.transport = None obnamlib.VirtualFileSystem.close(self) self._delay() @ioerror_to_oserror def reinit(self, baseurl, create=False): scheme, netloc, path, query, fragment = urlparse.urlsplit(baseurl) if scheme != 'sftp': raise obnamlib.Error('SftpFS used with non-sftp URL: %s' % baseurl) if '@' in netloc: user, netloc = netloc.split('@', 1) else: user = None if ':' in netloc: host, port = netloc.split(':', 1) if port == '': port = None else: try: port = int(port) except ValueError, e: msg = ('Invalid port number %s in %s: %s' % (port, baseurl, str(e))) logging.error(msg) raise obnamlib.Error(msg) else: host = netloc port = None if path.startswith('/~/'): path = path[3:] self.host = host self.port = port self.user = user self.path = path self.create_path_if_missing = create self._delay() if self.sftp: if create: self._create_root_if_missing() logging.debug('chdir to %s' % path) 
self.sftp.chdir(self._initial_dir) self.sftp.chdir(path) def _get_username(self): return pwd.getpwuid(os.getuid()).pw_name def getcwd(self): self._delay() return self._to_string(self.sftp.getcwd()) @ioerror_to_oserror def chdir(self, pathname): self._delay() self.sftp.chdir(pathname) @ioerror_to_oserror def listdir(self, pathname): self._delay() return [self._to_string(x) for x in self.sftp.listdir(pathname)] def _force_32bit_timestamp(self, timestamp): if timestamp is None: return None max_int32 = 2**31 - 1 # max positive 32 signed integer value if timestamp > max_int32: timestamp -= 2**32 if timestamp > max_int32: timestamp = max_int32 # it's too large, need to lose info return timestamp def _fix_stat(self, pathname, st): # SFTP and/or paramiko fail to return some of the required fields, # so we add them, using faked data. defaults = { 'st_blocks': (st.st_size / 512) + (1 if st.st_size % 512 else 0), 'st_dev': 0, 'st_ino': int(hashlib.md5(pathname).hexdigest()[:8], 16), 'st_nlink': 1, } for name, value in defaults.iteritems(): if not hasattr(st, name): setattr(st, name, value) # Paramiko seems to deal with unsigned timestamps only, at least # in version 1.7.6. We therefore force the timestamps into # a signed 32-bit value. This limits the range, but allows # timestamps that are negative (before 1970). Once paramiko is # fixed, this code can be removed. st.st_mtime_sec = self._force_32bit_timestamp(st.st_mtime) st.st_atime_sec = self._force_32bit_timestamp(st.st_atime) # Within Obnam, we pretend stat results have st_Xtime_sec and # st_Xtime_nsec, but not st_Xtime. Remove those fields. del st.st_mtime del st.st_atime # We only get integer timestamps, so set these explicitly to 0. 
st.st_mtime_nsec = 0 st.st_atime_nsec = 0 return st @ioerror_to_oserror def listdir2(self, pathname): self._delay() attrs = self.sftp.listdir_attr(pathname) pairs = [(self._to_string(st.filename), st) for st in attrs] fixed = [(name, self._fix_stat(name, st)) for name, st in pairs] return fixed def lock(self, lockname, data): try: self.write_file(lockname, data) except OSError, e: raise obnamlib.LockFail('Failure get lock %s' % lockname) def unlock(self, lockname): self._remove_if_exists(lockname) def exists(self, pathname): try: self.lstat(pathname) except OSError: return False else: return True def isdir(self, pathname): self._delay() try: st = self.lstat(pathname) except OSError: return False else: return stat.S_ISDIR(st.st_mode) def mknod(self, pathname, mode): # SFTP does not provide an mknod, so we can't do this. We # raise an exception, so upper layers can handle this (we _could_ # just fail silently, but that would be silly.) raise NotImplementedError('mknod on SFTP: %s' % pathname) @ioerror_to_oserror def mkdir(self, pathname): self._delay() self.sftp.mkdir(pathname) @ioerror_to_oserror def makedirs(self, pathname): parent = os.path.dirname(pathname) if parent and parent != pathname and not self.exists(parent): self.makedirs(parent) self.mkdir(pathname) @ioerror_to_oserror def rmdir(self, pathname): self._delay() self.sftp.rmdir(pathname) @ioerror_to_oserror def remove(self, pathname): self._delay() self.sftp.remove(pathname) def _remove_if_exists(self, pathname): '''Like remove, but OK if file does not exist.''' try: self.remove(pathname) except OSError, e: if e.errno != errno.ENOENT: raise @ioerror_to_oserror def rename(self, old, new): self._delay() self._remove_if_exists(new) self.sftp.rename(old, new) @ioerror_to_oserror def lstat(self, pathname): self._delay() st = self.sftp.lstat(pathname) self._fix_stat(pathname, st) return st @ioerror_to_oserror def lchown(self, pathname, uid, gid): self._delay() if stat.S_ISLNK(self.lstat(pathname).st_mode): 
            logging.warning('NOT changing ownership of symlink %s' % pathname)
        else:
            self.sftp.chown(pathname, uid, gid)

    @ioerror_to_oserror
    def chmod_symlink(self, pathname, mode):
        # SFTP and/or paramiko don't have lchmod at all, so we can't
        # actually do this. However, we at least check that pathname
        # exists.
        self.lstat(pathname)

    @ioerror_to_oserror
    def chmod_not_symlink(self, pathname, mode):
        self._delay()
        self.sftp.chmod(pathname, mode)

    @ioerror_to_oserror
    def lutimes(self, pathname, atime_sec, atime_nsec, mtime_sec,
                mtime_nsec):
        # FIXME: This does not work for symlinks!
        # Sftp does not have a way of doing that. This means if the restore
        # target is over sftp, symlinks and their targets will have wrong
        # mtimes.
        self._delay()
        if not getattr(self, 'lutimes_warned', False):
            logging.warning('lutimes used over SFTP, this does not work '
                            'against symlinks (warning appears only first '
                            'time)')
            self.lutimes_warned = True
        self.sftp.utime(pathname, (atime_sec, mtime_sec))

    def link(self, existing_path, new_path):
        raise obnamlib.Error('Cannot hardlink on SFTP. Sorry.')

    def readlink(self, symlink):
        self._delay()
        return self._to_string(self.sftp.readlink(symlink))

    @ioerror_to_oserror
    def symlink(self, source, destination):
        self._delay()
        self.sftp.symlink(source, destination)

    def open(self, pathname, mode, bufsize=-1):
        self._delay()
        return self.sftp.file(pathname, mode, bufsize=bufsize)

    def cat(self, pathname):
        self._delay()
        f = self.open(pathname, 'r')
        f.prefetch()
        chunks = []
        while True:
            chunk = f.read(self.chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
            self.bytes_read += len(chunk)
        f.close()
        return ''.join(chunks)

    @ioerror_to_oserror
    def write_file(self, pathname, contents):
        try:
            f = self.open(pathname, 'wx')
        except (IOError, OSError), e:
            # When the path to the file to be written does not
            # exist, we try to create the directories below. Note that
            # some SFTP servers return EACCES instead of ENOENT
            # when the path to the file does not exist, so we
            # do not raise an exception here for both ENOENT
            # and EACCES.
            if e.errno != errno.ENOENT and e.errno != errno.EACCES:
                raise
            dirname = os.path.dirname(pathname)
            self.makedirs(dirname)
            f = self.open(pathname, 'wx')
        self._write_helper(f, contents)
        f.close()

    def _tempfile(self, dirname):
        '''Create a new file with a random name, return handle and name.'''

        if dirname:
            try:
                self.makedirs(dirname)
            except OSError:
                # We ignore the error, on the assumption that it was due
                # to the directory already existing. If it didn't exist
                # and the error was for something else, then we'll catch
                # that when we open the file for writing.
                pass

        while True:
            i = random.randint(0, 2**64-1)
            basename = 'tmp.%x' % i
            pathname = os.path.join(dirname, basename)
            try:
                f = self.open(pathname, 'wx', bufsize=self.chunk_size)
            except OSError:
                pass
            else:
                return f, pathname

    @ioerror_to_oserror
    def overwrite_file(self, pathname, contents):
        self._delay()
        dirname = os.path.dirname(pathname)
        f, tempname = self._tempfile(dirname)
        self._write_helper(f, contents)
        f.close()
        self.rename(tempname, pathname)

    def _write_helper(self, f, contents):
        for pos in range(0, len(contents), self.chunk_size):
            chunk = contents[pos:pos + self.chunk_size]
            f.write(chunk)
            self.bytes_written += len(chunk)


class SftpPlugin(obnamlib.ObnamPlugin):

    def enable(self):
        ssh_group = obnamlib.option_group['ssh'] = 'SSH/SFTP'
        devel_group = obnamlib.option_group['devel']

        self.app.settings.integer(['sftp-delay'],
                                  'add an artificial delay (in milliseconds) '
                                  'to all SFTP transfers',
                                  group=devel_group)
        self.app.settings.string(['ssh-key'],
                                 'use FILENAME as the ssh RSA private key for '
                                 'sftp access (default is using keys known '
                                 'to ssh-agent)',
                                 metavar='FILENAME',
                                 group=ssh_group)
        self.app.settings.boolean(['strict-ssh-host-keys'],
                                  'DEPRECATED, use --ssh-host-keys-check '
                                  'instead',
                                  group=ssh_group)
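The overwrite_file method above implements atomic replacement: write the new contents under a random temporary name in the destination directory, then rename over the final name. The same pattern can be sketched against the local filesystem; the function and file names below are illustrative, not Obnam APIs:

```python
import os
import random
import tempfile

def atomic_overwrite(pathname, contents):
    # Write under a random temporary name in the same directory, then
    # rename over the destination. On POSIX, rename within a single
    # filesystem is atomic, so readers see either the old contents or
    # the new, never a partially written file.
    dirname = os.path.dirname(pathname) or '.'
    tmpname = os.path.join(dirname, 'tmp.%x' % random.randint(0, 2**64 - 1))
    with open(tmpname, 'wb') as f:
        f.write(contents)
    os.rename(tmpname, pathname)

workdir = tempfile.mkdtemp()
pathname = os.path.join(workdir, 'data')
atomic_overwrite(pathname, b'first')
atomic_overwrite(pathname, b'second')
with open(pathname, 'rb') as f:
    assert f.read() == b'second'
# No temporary files are left behind after the renames:
assert [n for n in os.listdir(workdir) if n.startswith('tmp.')] == []
```

Obnam's _tempfile additionally retries on name collision by opening with mode 'wx', which fails if the name already exists, so two concurrent writers cannot reuse the same temporary file.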
        self.app.settings.choice(['ssh-host-keys-check'],
                                 ['ssh-config', 'yes', 'no', 'ask'],
                                 'If "yes", require that the ssh host key '
                                 'must be known and correct to be accepted. '
                                 'If "no", do not require that. If "ask", '
                                 'the user is interactively asked to accept '
                                 'new hosts. The default ("ssh-config") is '
                                 'to rely on the settings of the underlying '
                                 'SSH client',
                                 metavar='VALUE',
                                 group=ssh_group)
        self.app.settings.string(['ssh-known-hosts'],
                                 'filename of the user\'s known hosts file '
                                 '(default: %default)',
                                 metavar='FILENAME',
                                 default=
                                     os.path.expanduser('~/.ssh/known_hosts'),
                                 group=ssh_group)
        self.app.settings.string(['ssh-command'],
                                 'alternative executable to be used instead '
                                 'of "ssh" (full path is allowed, no '
                                 'arguments may be added)',
                                 metavar='EXECUTABLE',
                                 group=ssh_group)
        self.app.settings.boolean(['pure-paramiko'],
                                  'do not use openssh even if available, '
                                  'use paramiko only instead',
                                  group=ssh_group)

        self.app.fsf.register('sftp', SftpFS, settings=self.app.settings)

obnam-1.6.1/obnamlib/plugins/show_plugin.py

# Copyright (C) 2009, 2010  Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


import os
import re
import stat
import sys
import time

import obnamlib


class ShowPlugin(obnamlib.ObnamPlugin):

    '''Show information about data in the backup repository.
    This implements commands for listing the contents of root and
    client objects, or the contents of a backup generation.

    '''

    leftists = (2, 3, 6)
    min_widths = (1, 1, 1, 1, 6, 20, 1)

    def enable(self):
        self.app.add_subcommand('clients', self.clients)
        self.app.add_subcommand('generations', self.generations)
        self.app.add_subcommand('genids', self.genids)
        self.app.add_subcommand('ls', self.ls, arg_synopsis='[FILE]...')
        self.app.add_subcommand('diff', self.diff,
                                arg_synopsis='[GENERATION1] GENERATION2')
        self.app.add_subcommand('nagios-last-backup-age',
                                self.nagios_last_backup_age)

        self.app.settings.string(['warn-age'],
                                 'for nagios-last-backup-age: maximum age '
                                 '(by default in hours) for the most '
                                 'recent backup before status is warning. '
                                 'Accepts one char unit specifier '
                                 '(s,m,h,d for seconds, minutes, hours, '
                                 'and days).',
                                 metavar='AGE',
                                 default=obnamlib.DEFAULT_NAGIOS_WARN_AGE)
        self.app.settings.string(['critical-age'],
                                 'for nagios-last-backup-age: maximum age '
                                 '(by default in hours) for the most '
                                 'recent backup before status is critical. '
                                 'Accepts one char unit specifier '
                                 '(s,m,h,d for seconds, minutes, hours, '
                                 'and days).',
                                 metavar='AGE',
                                 default=obnamlib.DEFAULT_NAGIOS_WARN_AGE)

    def open_repository(self, require_client=True):
        self.app.settings.require('repository')
        if require_client:
            self.app.settings.require('client-name')
        self.repo = self.app.open_repository()
        if require_client:
            self.repo.open_client(self.app.settings['client-name'])

    def clients(self, args):
        '''List clients using the repository.'''
        self.open_repository(require_client=False)
        for client_name in self.repo.list_clients():
            self.app.output.write('%s\n' % client_name)
        self.repo.fs.close()

    def generations(self, args):
        '''List backup generations for client.'''
        self.open_repository()
        for gen in self.repo.list_generations():
            start, end = self.repo.get_generation_times(gen)
            is_checkpoint = self.repo.get_is_checkpoint(gen)
            if is_checkpoint:
                checkpoint = ' (checkpoint)'
            else:
                checkpoint = ''
            sys.stdout.write(
                '%s\t%s .. %s (%d files, %d bytes) %s\n' %
                (gen,
                 self.format_time(start),
                 self.format_time(end),
                 self.repo.client.get_generation_file_count(gen),
                 self.repo.client.get_generation_data(gen),
                 checkpoint))
        self.repo.fs.close()

    def nagios_last_backup_age(self, args):
        '''Check if the most recent generation is recent enough.'''
        try:
            self.open_repository()
        except obnamlib.Error, e:
            self.app.output.write('CRITICAL: %s\n' % e)
            sys.exit(2)

        most_recent = None

        warn_age = self._convert_time(self.app.settings['warn-age'])
        critical_age = self._convert_time(self.app.settings['critical-age'])

        for gen in self.repo.list_generations():
            start, end = self.repo.get_generation_times(gen)
            if most_recent is None or start > most_recent:
                most_recent = start
        self.repo.fs.close()

        now = self.app.time()
        if most_recent is None:
            # The repository is empty / the client does not exist.
            self.app.output.write('CRITICAL: no backup found.\n')
            sys.exit(2)
        elif (now - most_recent > critical_age):
            self.app.output.write(
                'CRITICAL: backup is old. last backup was %s.\n' %
                (self.format_time(most_recent)))
            sys.exit(2)
        elif (now - most_recent > warn_age):
            self.app.output.write(
                'WARNING: backup is old. last backup was %s.\n' %
                self.format_time(most_recent))
            sys.exit(1)
        self.app.output.write(
            'OK: backup is recent. last backup was %s.\n' %
            self.format_time(most_recent))

    def genids(self, args):
        '''List generation ids for client.'''
        self.open_repository()
        for gen in self.repo.list_generations():
            sys.stdout.write('%s\n' % gen)
        self.repo.fs.close()

    def ls(self, args):
        '''List contents of a generation.'''
        self.open_repository()

        if len(args) == 0:
            args = ['/']

        for gen in self.app.settings['generation']:
            gen = self.repo.genspec(gen)
            started, ended = self.repo.client.get_generation_times(gen)
            started = self.format_time(started)
            ended = self.format_time(ended)
            self.app.output.write(
                'Generation %s (%s - %s)\n' % (gen, started, ended))
            for ls_file in args:
                ls_file = self.remove_trailing_slashes(ls_file)
                self.show_objects(gen, ls_file)
        self.repo.fs.close()

    def remove_trailing_slashes(self, filename):
        while filename.endswith('/') and filename != '/':
            filename = filename[:-1]
        return filename

    def format_time(self, timestamp):
        return time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(timestamp))

    def isdir(self, gen, filename):
        metadata = self.repo.get_metadata(gen, filename)
        return metadata.isdir()

    def show_objects(self, gen, dirname):
        self.show_item(gen, dirname)
        subdirs = []
        for basename in sorted(self.repo.listdir(gen, dirname)):
            full = os.path.join(dirname, basename)
            if self.isdir(gen, full):
                subdirs.append(full)
            else:
                self.show_item(gen, full)

        for subdir in subdirs:
            self.show_objects(gen, subdir)

    def show_item(self, gen, filename):
        fields = self.fields(gen, filename)
        widths = [
            1,   # mode
            5,   # nlink
            -8,  # owner
            -8,  # group
            10,  # size
            1,   # mtime
            -1,  # name
        ]

        result = []
        for i in range(len(fields)):
            if widths[i] < 0:
                fmt = '%-*s'
            else:
                fmt = '%*s'
            result.append(fmt % (abs(widths[i]), fields[i]))
        self.app.output.write('%s\n' % ' '.join(result))

    def show_diff_for_file(self, gen, fullname, change_char):
        '''Show what has changed for a single file.

        change_char is a single char (+, - or *) indicating whether a
        file got added, removed or altered.

        If --verbose, show all the details as ls shows them, otherwise
        show just the file's full name.

        '''

        if self.app.settings['verbose']:
            sys.stdout.write('%s ' % change_char)
            self.show_item(gen, fullname)
        else:
            self.app.output.write('%s %s\n' % (change_char, fullname))

    def show_diff_for_common_file(self, gen1, gen2, fullname, subdirs):
        changed = False
        if self.isdir(gen1, fullname) != self.isdir(gen2, fullname):
            changed = True
        elif self.isdir(gen2, fullname):
            subdirs.append(fullname)
        else:
            # The file is present in both generations and is not a
            # directory in either; compare the metadata (which includes
            # the MD5 checksum).
            md5_1 = self.repo.get_metadata(gen1, fullname)
            md5_2 = self.repo.get_metadata(gen2, fullname)
            if md5_1 != md5_2:
                changed = True
        if changed:
            self.show_diff_for_file(gen2, fullname, '*')

    def show_diff(self, gen1, gen2, dirname):
        # This set contains the files from the old/src generation.
        set1 = self.repo.listdir(gen1, dirname)
        subdirs = []
        # These are the new/dst generation files.
        for basename in sorted(self.repo.listdir(gen2, dirname)):
            full = os.path.join(dirname, basename)
            if basename in set1:
                # It's in both generations.
                set1.remove(basename)
                self.show_diff_for_common_file(gen1, gen2, full, subdirs)
            else:
                # It's only in set2 - the file/dir got added.
                self.show_diff_for_file(gen2, full, '+')
        for basename in sorted(set1):
            # This was only in gen1 - it got removed.
            full = os.path.join(dirname, basename)
            self.show_diff_for_file(gen1, full, '-')

        for subdir in subdirs:
            self.show_diff(gen1, gen2, subdir)

    def diff(self, args):
        '''Show difference between two generations.'''

        if len(args) not in (1, 2):
            raise obnamlib.Error('Need one or two generations')

        self.open_repository()
        if len(args) == 1:
            gen2 = self.repo.genspec(args[0])
            # Now we have the dst/second generation for show_diff. Use
            # genids/list_generations to find the previous generation.
            genids = self.repo.list_generations()
            index = genids.index(gen2)
            if index == 0:
                raise obnamlib.Error(
                    'Can\'t show first generation. Use \'ls\' instead')
            gen1 = genids[index - 1]
        else:
            gen1 = self.repo.genspec(args[0])
            gen2 = self.repo.genspec(args[1])

        self.show_diff(gen1, gen2, '/')
        self.repo.fs.close()

    def fields(self, gen, full):
        metadata = self.repo.get_metadata(gen, full)

        perms = ['?'] + ['-'] * 9
        tab = [
            (stat.S_IFREG, 0, '-'),
            (stat.S_IFDIR, 0, 'd'),
            (stat.S_IFLNK, 0, 'l'),
            (stat.S_IFIFO, 0, 'p'),
            (stat.S_IRUSR, 1, 'r'),
            (stat.S_IWUSR, 2, 'w'),
            (stat.S_IXUSR, 3, 'x'),
            (stat.S_IRGRP, 4, 'r'),
            (stat.S_IWGRP, 5, 'w'),
            (stat.S_IXGRP, 6, 'x'),
            (stat.S_IROTH, 7, 'r'),
            (stat.S_IWOTH, 8, 'w'),
            (stat.S_IXOTH, 9, 'x'),
        ]
        mode = metadata.st_mode or 0
        for bitmap, offset, char in tab:
            if (mode & bitmap) == bitmap:
                perms[offset] = char
        perms = ''.join(perms)

        timestamp = time.strftime('%Y-%m-%d %H:%M:%S',
                                  time.gmtime(metadata.st_mtime_sec))

        if metadata.islink():
            name = '%s -> %s' % (full, metadata.target)
        else:
            name = full

        return (perms,
                str(metadata.st_nlink or 0),
                metadata.username or '',
                metadata.groupname or '',
                str(metadata.st_size or 0),
                timestamp,
                name)

    def format(self, fields):
        return ' '.join(self.align(self.min_widths[i], fields[i], i)
                        for i in range(len(fields)))

    def align(self, width, field, field_no):
        if field_no in self.leftists:
            return '%-*s' % (width, field)
        else:
            return '%*s' % (width, field)

    def _convert_time(self, s, default_unit='h'):
        m = re.match('([0-9]+)([smhdw])?$', s)
        if m is None:
            raise ValueError
        ticks = int(m.group(1))
        unit = m.group(2)
        if unit is None:
            unit = default_unit

        if unit == 's':
            pass
        elif unit == 'm':
            ticks *= 60
        elif unit == 'h':
            ticks *= 60*60
        elif unit == 'd':
            ticks *= 60*60*24
        elif unit == 'w':
            ticks *= 60*60*24*7
        else:
            raise ValueError

        return ticks

obnam-1.6.1/obnamlib/plugins/verify_plugin.py

# Copyright (C) 2010  Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
import logging
import os
import random
import stat
import sys
import urlparse

import obnamlib


class Fail(obnamlib.Error):

    def __init__(self, filename, reason):
        self.filename = filename
        self.reason = reason

    def __str__(self):
        return '%s: %s' % (self.filename, self.reason)


class VerifyPlugin(obnamlib.ObnamPlugin):

    def enable(self):
        self.app.add_subcommand('verify', self.verify,
                                arg_synopsis='[DIRECTORY]...')
        self.app.settings.integer(['verify-randomly'],
                                  'verify N files randomly from the backup '
                                  '(default is zero, meaning everything)',
                                  metavar='N')

    def verify(self, args):
        '''Verify that live data and backed up data match.'''
        self.app.settings.require('repository')
        self.app.settings.require('client-name')
        self.app.settings.require('generation')
        if len(self.app.settings['generation']) != 1:
            raise obnamlib.Error(
                'verify must be given exactly one generation')

        logging.debug('verifying generation %s' %
                      self.app.settings['generation'])
        if not args:
            self.app.settings.require('root')
            args = self.app.settings['root']
        if not args:
            logging.debug('no roots/args given, so verifying everything')
            args = ['/']
        logging.debug('verifying what: %s' % repr(args))

        self.repo = self.app.open_repository()
        self.repo.open_client(self.app.settings['client-name'])
        self.fs = self.app.fsf.new(args[0])
        self.fs.connect()
        t = urlparse.urlparse(args[0])
        root_url = urlparse.urlunparse((t[0], t[1], '/', t[3], t[4], t[5]))
        logging.debug('t: %s' % repr(t))
        logging.debug('root_url: %s' % repr(root_url))
        self.fs.reinit(root_url)

        self.failed = False
        gen = self.repo.genspec(self.app.settings['generation'][0])

        self.app.ts['done'] = 0
        self.app.ts['total'] = 0
        self.app.ts['filename'] = ''
        if not self.app.settings['quiet']:
            self.app.ts.format(
                '%ElapsedTime() '
                'verifying file %Counter(filename)/%Integer(total) '
                '%PercentDone(done,total): '
                '%Pathname(filename)')

        num_randomly = self.app.settings['verify-randomly']
        if num_randomly == 0:
            self.app.ts['total'] = \
                self.repo.client.get_generation_file_count(gen)
            for filename, metadata in self.walk(gen, args):
                self.app.ts['filename'] = filename
                try:
                    self.verify_metadata(gen, filename, metadata)
                except Fail, e:
                    self.log_fail(e)
                else:
                    if metadata.isfile():
                        try:
                            self.verify_regular_file(gen, filename, metadata)
                        except Fail, e:
                            self.log_fail(e)
                self.app.ts['done'] += 1
        else:
            logging.debug('verifying %d files randomly' % num_randomly)
            self.app.ts['total'] = num_randomly
            self.app.ts.notify('finding all files to choose randomly')

            filenames = [filename
                         for filename, metadata in self.walk(gen, args)
                         if metadata.isfile()]
            chosen = []
            for i in range(min(num_randomly, len(filenames))):
                filename = random.choice(filenames)
                filenames.remove(filename)
                chosen.append(filename)
            for filename in chosen:
                self.app.ts['filename'] = filename
                metadata = self.repo.get_metadata(gen, filename)
                try:
                    self.verify_metadata(gen, filename, metadata)
                    self.verify_regular_file(gen, filename, metadata)
                except Fail, e:
                    self.log_fail(e)
                self.app.ts['done'] += 1

        self.fs.close()
        self.repo.fs.close()
        self.app.ts.finish()
        if self.failed:
            sys.exit(1)

        print "Verify did not find problems."
    def log_fail(self, e):
        msg = 'verify failure: %s: %s' % (e.filename, e.reason)
        logging.error(msg)
        if self.app.settings['quiet']:
            sys.stderr.write('%s\n' % msg)
        else:
            self.app.ts.notify(msg)
        self.failed = True

    def verify_metadata(self, gen, filename, backed_up):
        try:
            live_data = obnamlib.read_metadata(self.fs, filename)
        except OSError, e:
            raise Fail(filename, 'missing or inaccessible: %s' % e.strerror)
        for field in obnamlib.metadata_verify_fields:
            v1 = getattr(backed_up, field)
            v2 = getattr(live_data, field)
            if v1 != v2:
                raise Fail(filename,
                           'metadata change: %s (%s vs %s)' % (field, v1, v2))

    def verify_regular_file(self, gen, filename, metadata):
        logging.debug('verifying regular %s' % filename)
        f = self.fs.open(filename, 'r')
        chunkids = self.repo.get_file_chunks(gen, filename)
        if not self.verify_chunks(f, chunkids):
            raise Fail(filename, 'data changed')
        f.close()

    def verify_chunks(self, f, chunkids):
        for chunkid in chunkids:
            backed_up = self.repo.get_chunk(chunkid)
            live_data = f.read(len(backed_up))
            if backed_up != live_data:
                return False
        return True

    def walk(self, gen, args):
        '''Iterate over each pathname specified by arguments.

        This is a generator.

        '''

        for arg in args:
            scheme, netloc, path, query, fragment = urlparse.urlsplit(arg)
            arg = os.path.normpath(path)
            for x in self.repo.walk(gen, arg):
                yield x

obnam-1.6.1/obnamlib/plugins/vfs_local_plugin.py

# Copyright 2010  Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


import logging
import os
import re
import stat

import obnamlib


class VfsLocalPlugin(obnamlib.ObnamPlugin):

    def enable(self):
        self.app.fsf.register('', obnamlib.LocalFS)

obnam-1.6.1/obnamlib/repo.py

# Copyright (C) 2009-2011  Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
import errno
import hashlib
import larch
import logging
import os
import random
import re
import stat
import struct
import time
import tracing

import obnamlib


class LockFail(obnamlib.Error):
    pass


class BadFormat(obnamlib.Error):
    pass


class HookedFS(object):

    '''A class to filter read/written data through hooks.'''

    def __init__(self, repo, fs, hooks):
        self.repo = repo
        self.fs = fs
        self.hooks = hooks

    def __getattr__(self, name):
        return getattr(self.fs, name)

    def _get_toplevel(self, filename):
        parts = filename.split(os.sep)
        if len(parts) > 1:
            return parts[0]
        else:  # pragma: no cover
            raise obnamlib.Error('File at repository root: %s' % filename)

    def cat(self, filename, runfilters=True):
        data = self.fs.cat(filename)
        toplevel = self._get_toplevel(filename)
        if not runfilters:
            return data
        return self.hooks.filter_read('repository-data', data,
                                      repo=self.repo, toplevel=toplevel)

    def lock(self, filename, data):
        self.fs.lock(filename, data)

    def write_file(self, filename, data, runfilters=True):
        tracing.trace('writing hooked %s' % filename)
        toplevel = self._get_toplevel(filename)
        if runfilters:
            data = self.hooks.filter_write('repository-data', data,
                                           repo=self.repo, toplevel=toplevel)
        self.fs.write_file(filename, data)

    def overwrite_file(self, filename, data, runfilters=True):
        tracing.trace('overwriting hooked %s' % filename)
        toplevel = self._get_toplevel(filename)
        if runfilters:
            data = self.hooks.filter_write('repository-data', data,
                                           repo=self.repo, toplevel=toplevel)
        self.fs.overwrite_file(filename, data)


class Repository(object):

    '''Repository for backup data.

    Backup data is put on a virtual file system
    (obnamlib.VirtualFileSystem instance), in some form that the API of
    this class does not care about.

    The repository may contain data for several clients that share
    encryption keys. Each client is identified by a name.

    The repository has a "root" object, which is conceptually a list of
    client names.
    Each client in turn is conceptually a list of generations, which
    correspond to snapshots of the user data that existed when the
    generation was created.

    Read-only access to the repository does not require locking. Write
    access may affect only the root object, or only a client's own
    data, and thus locking may affect only the root, or only the
    client.

    When a new generation is started, it is a copy-on-write clone of
    the previous generation, and the caller needs to modify the new
    generation to match the current state of user data.

    The file 'metadata/format' at the root of the repository contains
    the version of the repository format it uses. The version is
    specified using a single integer.

    '''

    format_version = 6

    def __init__(self, fs, node_size, upload_queue_size, lru_size, hooks,
                 idpath_depth, idpath_bits, idpath_skip, current_time,
                 lock_timeout, client_name):

        self.current_time = current_time
        self.setup_hooks(hooks or obnamlib.HookManager())
        self.fs = HookedFS(self, fs, self.hooks)
        self.node_size = node_size
        self.upload_queue_size = upload_queue_size
        self.lru_size = lru_size

        hider = hashlib.md5()
        hider.update(client_name)

        self.lockmgr = obnamlib.LockManager(self.fs, lock_timeout,
                                            hider.hexdigest())

        self.got_root_lock = False
        self._open_client_list()
        self.got_shared_lock = False
        self.got_client_lock = False
        self.current_client = None
        self.current_client_id = None
        self.new_generation = None
        self.added_clients = []
        self.removed_clients = []
        self.removed_generations = []
        self.client = None
        self._open_shared()
        self.prev_chunkid = None
        self.chunk_idpath = larch.IdPath('chunks', idpath_depth,
                                         idpath_bits, idpath_skip)
        self._chunks_exists = False

    def _open_client_list(self):
        self.clientlist = obnamlib.ClientList(self.fs, self.node_size,
                                              self.upload_queue_size,
                                              self.lru_size, self)

    def _open_shared(self):
        self.chunklist = obnamlib.ChunkList(self.fs, self.node_size,
                                            self.upload_queue_size,
                                            self.lru_size, self)
        self.chunksums = obnamlib.ChecksumTree(self.fs, 'chunksums',
                                               len(self.checksum('')),
                                               self.node_size,
                                               self.upload_queue_size,
                                               self.lru_size, self)

    def setup_hooks(self, hooks):
        self.hooks = hooks
        self.hooks.new('repository-toplevel-init')
        self.hooks.new_filter('repository-data')
        self.hooks.new('repository-add-client')

    def checksum(self, data):
        '''Return checksum of data.

        The checksum is (currently) MD5.

        '''

        checksummer = self.new_checksummer()
        checksummer.update(data)
        return checksummer.hexdigest()

    def new_checksummer(self):
        '''Return a new checksum algorithm.'''
        return hashlib.md5()

    def acceptable_version(self, version):
        '''Are we compatible with on-disk format?'''
        return self.format_version == version

    def client_dir(self, client_id):
        '''Return name of sub-directory for a given client.'''
        return str(client_id)

    def list_clients(self):
        '''Return list of names of clients using this repository.'''

        self.check_format_version()
        listed = set(self.clientlist.list_clients())
        added = set(self.added_clients)
        removed = set(self.removed_clients)
        clients = listed.union(added).difference(removed)
        return list(clients)

    def require_root_lock(self):
        '''Ensure we have the lock on the repository's root node.'''
        if not self.got_root_lock:
            raise LockFail('have not got lock on root node')

    def require_shared_lock(self):
        '''Ensure we have the lock on the shared B-trees except clientlist.'''
        if not self.got_shared_lock:
            raise LockFail('have not got lock on shared B-trees')

    def require_client_lock(self):
        '''Ensure we have the lock on the currently open client.'''
        if not self.got_client_lock:
            raise LockFail('have not got lock on client')

    def require_open_client(self):
        '''Ensure we have opened the client (r/w or r/o).'''
        if self.current_client is None:
            raise obnamlib.Error('client is not open')

    def require_started_generation(self):
        '''Ensure we have started a new generation.'''
        if self.new_generation is None:
            raise obnamlib.Error('new generation has not started')

    def require_no_root_lock(self):
        '''Ensure we haven't locked root yet.'''
        if self.got_root_lock:
            raise obnamlib.Error('We have already locked root, oops')

    def require_no_shared_lock(self):
        '''Ensure we haven't locked shared B-trees yet.'''
        if self.got_shared_lock:
            raise obnamlib.Error('We have already locked shared B-trees, '
                                 'oops')

    def require_no_client_lock(self):
        '''Ensure we haven't locked the per-client B-tree yet.'''
        if self.got_client_lock:
            raise obnamlib.Error('We have already locked the client, oops')

    def lock_root(self):
        '''Lock root node.

        Raise obnamlib.LockFail if locking fails. Lock will be released
        by commit_root() or unlock_root().

        '''

        tracing.trace('locking root')
        self.require_no_root_lock()
        self.require_no_client_lock()
        self.require_no_shared_lock()
        self.lockmgr.lock(['.'])
        self.check_format_version()
        self.got_root_lock = True
        self.added_clients = []
        self.removed_clients = []
        self._write_format_version(self.format_version)
        self.clientlist.start_changes()

    def unlock_root(self):
        '''Unlock root node without committing changes made.'''
        tracing.trace('unlocking root')
        self.require_root_lock()
        self.added_clients = []
        self.removed_clients = []
        self.lockmgr.unlock(['.'])
        self.got_root_lock = False
        self._open_client_list()

    def commit_root(self):
        '''Commit changes to root node, and unlock it.'''
        tracing.trace('committing root')
        self.require_root_lock()
        for client_name in self.added_clients:
            self.clientlist.add_client(client_name)
            self.hooks.call('repository-add-client',
                            self.clientlist, client_name)
        self.added_clients = []
        for client_name in self.removed_clients:
            client_id = self.clientlist.get_client_id(client_name)
            client_dir = self.client_dir(client_id)
            if client_id is not None and self.fs.exists(client_dir):
                self.fs.rmtree(client_dir)
            self.clientlist.remove_client(client_name)
        self.clientlist.commit()
        self.unlock_root()

    def get_format_version(self):
        '''Return the on-disk format version.

        If the on-disk repository does not have a version yet, return
        None.
        '''

        if self.fs.exists('metadata/format'):
            data = self.fs.cat('metadata/format', runfilters=False)
            lines = data.splitlines()
            line = lines[0]
            try:
                version = int(line)
            except ValueError, e:  # pragma: no cover
                msg = ('Invalid repository format version (%s) -- '
                       'forgot encryption?' % repr(line))
                raise obnamlib.Error(msg)
            return version
        else:
            return None

    def _write_format_version(self, version):
        '''Write the desired format version to the repository.'''
        tracing.trace('write format version')
        if not self.fs.exists('metadata'):
            self.fs.mkdir('metadata')
        self.fs.overwrite_file('metadata/format', '%s\n' % version,
                               runfilters=False)

    def check_format_version(self):
        '''Verify that on-disk format version is compatible.

        If not, raise BadFormat.

        '''

        on_disk = self.get_format_version()
        if on_disk is not None and not self.acceptable_version(on_disk):
            raise BadFormat('On-disk repository format %s is incompatible '
                            'with program format %s; you need to use a '
                            'different version of Obnam' %
                            (on_disk, self.format_version))

    def add_client(self, client_name):
        '''Add a new client to the repository.'''
        tracing.trace('client_name=%s', client_name)
        self.require_root_lock()
        if client_name in self.list_clients():
            raise obnamlib.Error('client %s already exists in repository' %
                                 client_name)
        self.added_clients.append(client_name)

    def remove_client(self, client_name):
        '''Remove a client from the repository.

        This removes all data related to the client, including all
        actual file data unless other clients also use it.

        '''

        tracing.trace('client_name=%s', client_name)
        self.require_root_lock()
        if client_name not in self.list_clients():
            raise obnamlib.Error('client %s does not exist' % client_name)
        self.removed_clients.append(client_name)

    @property
    def shared_dirs(self):
        return [self.chunklist.dirname, self.chunksums.dirname,
                self.chunk_idpath.dirname]

    def lock_shared(self):
        '''Lock the shared B-trees for exclusive write access.

        Raise obnamlib.LockFail if locking fails.
        Lock will be released by commit_shared() or unlock_shared().

        '''

        tracing.trace('locking shared')
        self.require_no_shared_lock()
        self.check_format_version()
        self.lockmgr.lock(self.shared_dirs)
        self.got_shared_lock = True
        tracing.trace('starting changes in chunksums and chunklist')
        self.chunksums.start_changes()
        self.chunklist.start_changes()

        # Initialize the chunks directory for encryption, etc, if it just
        # got created.
        dirname = self.chunk_idpath.dirname
        filenames = self.fs.listdir(dirname)
        if filenames == [] or filenames == ['lock']:
            self.hooks.call('repository-toplevel-init', self, dirname)

    def commit_shared(self):
        '''Commit changes to shared B-trees.'''

        tracing.trace('committing shared')
        self.require_shared_lock()
        self.chunklist.commit()
        self.chunksums.commit()
        self.unlock_shared()

    def unlock_shared(self):
        '''Unlock currently locked shared B-trees.'''
        tracing.trace('unlocking shared')
        self.require_shared_lock()
        self.lockmgr.unlock(self.shared_dirs)
        self.got_shared_lock = False
        self._open_shared()

    def lock_client(self, client_name):
        '''Lock a client for exclusive write access.

        Raise obnamlib.LockFail if locking fails. Lock will be released
        by commit_client() or unlock_client().
        '''

        tracing.trace('client_name=%s', client_name)
        self.require_no_client_lock()
        self.require_no_shared_lock()

        self.check_format_version()
        client_id = self.clientlist.get_client_id(client_name)
        if client_id is None:
            raise LockFail('client %s does not exist' % client_name)

        client_dir = self.client_dir(client_id)
        if not self.fs.exists(client_dir):
            self.fs.mkdir(client_dir)
            self.hooks.call('repository-toplevel-init', self, client_dir)

        self.lockmgr.lock([client_dir])
        self.got_client_lock = True
        self.current_client = client_name
        self.current_client_id = client_id
        self.added_generations = []
        self.removed_generations = []
        self.client = obnamlib.ClientMetadataTree(self.fs, client_dir,
                                                  self.node_size,
                                                  self.upload_queue_size,
                                                  self.lru_size, self)
        self.client.init_forest()

    def unlock_client(self):
        '''Unlock currently locked client, without committing changes.'''
        tracing.trace('unlocking client')
        self.require_client_lock()
        self.new_generation = None
        self._really_remove_generations(self.added_generations)
        self.lockmgr.unlock([self.client.dirname])
        self.client = None  # FIXME: This should remove uncommitted data.
        self.added_generations = []
        self.removed_generations = []
        self.got_client_lock = False
        self.current_client = None
        self.current_client_id = None

    def commit_client(self, checkpoint=False):
        '''Commit changes to and unlock currently locked client.'''
        tracing.trace('committing client (checkpoint=%s)', checkpoint)
        self.require_client_lock()
        self.require_shared_lock()
        commit_client = self.new_generation or self.removed_generations
        if self.new_generation:
            self.client.set_current_generation_is_checkpoint(checkpoint)
        self.added_generations = []
        self._really_remove_generations(self.removed_generations)
        if commit_client:
            self.client.commit()
        self.unlock_client()

    def open_client(self, client_name):
        '''Open a client for read-only operation.'''
        tracing.trace('open r/o client_name=%s' % client_name)
        self.check_format_version()
        client_id = self.clientlist.get_client_id(client_name)
        if client_id is None:
            raise obnamlib.Error('%s is not an existing client' %
                                 client_name)
        self.current_client = client_name
        self.current_client_id = client_id
        client_dir = self.client_dir(client_id)
        self.client = obnamlib.ClientMetadataTree(self.fs, client_dir,
                                                  self.node_size,
                                                  self.upload_queue_size,
                                                  self.lru_size, self)
        self.client.init_forest()

    def list_generations(self):
        '''List existing generations for currently open client.'''
        self.require_open_client()
        return self.client.list_generations()

    def get_is_checkpoint(self, genid):
        '''Is a generation a checkpoint one?'''
        self.require_open_client()
        return self.client.get_is_checkpoint(genid)

    def start_generation(self):
        '''Start a new generation.

        The new generation is a copy-on-write clone of the previous
        one (or empty, if first generation).
        '''

        tracing.trace('start new generation')
        self.require_client_lock()
        if self.new_generation is not None:
            raise obnamlib.Error('Cannot start two new generations')
        self.client.start_generation()
        self.new_generation = \
            self.client.get_generation_id(self.client.tree)
        self.added_generations.append(self.new_generation)
        return self.new_generation

    def _really_remove_generations(self, remove_genids):
        '''Really remove a list of generations.

        This is not part of the public API.

        This does not make any safety checks.

        '''

        def find_chunkids_in_gens(genids):
            chunkids = set()
            for genid in genids:
                x = self.client.list_chunks_in_generation(genid)
                chunkids = chunkids.union(set(x))
            return chunkids

        def find_gens_to_keep():
            return [genid
                    for genid in self.list_generations()
                    if genid not in remove_genids]

        def remove_chunks(chunk_ids):
            for chunk_id in chunk_ids:
                try:
                    checksum = self.chunklist.get_checksum(chunk_id)
                except KeyError:
                    # No checksum, therefore it can't be shared, therefore
                    # we can remove it.
                    self.remove_chunk(chunk_id)
                else:
                    self.chunksums.remove(checksum, chunk_id,
                                          self.current_client_id)
                    if not self.chunksums.chunk_is_used(checksum, chunk_id):
                        self.remove_chunk(chunk_id)

        def remove_gens(genids):
            if self.new_generation is None:
                self.client.start_changes(create_tree=False)
            for genid in genids:
                self.client.remove_generation(genid)

        if not remove_genids:
            return

        self.require_client_lock()
        self.require_shared_lock()

        maybe_remove = find_chunkids_in_gens(remove_genids)
        keep_genids = find_gens_to_keep()
        keep = find_chunkids_in_gens(keep_genids)
        remove = maybe_remove.difference(keep)
        remove_chunks(remove)
        remove_gens(remove_genids)

    def remove_generation(self, gen):
        '''Remove a committed generation.'''
        self.require_client_lock()
        if gen == self.new_generation:
            raise obnamlib.Error('cannot remove started generation')
        self.removed_generations.append(gen)

    def get_generation_times(self, gen):
        '''Return start and end times of a generation.
An unfinished generation has no end time, so None is returned. ''' self.require_open_client() return self.client.get_generation_times(gen) def listdir(self, gen, dirname): '''Return list of basenames in a directory within generation.''' self.require_open_client() return self.client.listdir(gen, dirname) def get_metadata(self, gen, filename): '''Return metadata for a file in a generation.''' self.require_open_client() try: encoded = self.client.get_metadata(gen, filename) except KeyError: raise obnamlib.Error('%s does not exist' % filename) return obnamlib.decode_metadata(encoded) def create(self, filename, metadata): '''Create a new (empty) file in the new generation.''' self.require_started_generation() encoded = obnamlib.encode_metadata(metadata) self.client.create(filename, encoded) def remove(self, filename): '''Remove file or directory or directory tree from generation.''' self.require_started_generation() self.client.remove(filename) def _chunk_filename(self, chunkid): return self.chunk_idpath.convert(chunkid) def put_chunk_only(self, data): '''Put chunk of data into repository. If the same data is already in the repository, it will be put there a second time. It is the caller's responsibility to check that the data is not already in the repository. Return the unique identifier of the new chunk. ''' def random_chunkid(): return random.randint(0, obnamlib.MAX_ID) self.require_started_generation() if self.prev_chunkid is None: self.prev_chunkid = random_chunkid() while True: chunkid = (self.prev_chunkid + 1) % obnamlib.MAX_ID filename = self._chunk_filename(chunkid) try: self.fs.write_file(filename, data) except OSError, e: # pragma: no cover if e.errno == errno.EEXIST: self.prev_chunkid = random_chunkid() continue raise else: tracing.trace('chunkid=%s', chunkid) break self.prev_chunkid = chunkid return chunkid def put_chunk_in_shared_trees(self, chunkid, checksum): '''Put the chunk into the shared trees. 
The chunk is assumed to already exist in the repository, so we just need to add it to the shared trees that map chunkids to checksums and checksums to chunkids. ''' tracing.trace('chunkid=%s', chunkid) tracing.trace('checksum=%s', repr(checksum)) self.require_started_generation() self.require_shared_lock() self.chunklist.add(chunkid, checksum) self.chunksums.add(checksum, chunkid, self.current_client_id) def get_chunk(self, chunkid): '''Return data of chunk with given id.''' self.require_open_client() return self.fs.cat(self._chunk_filename(chunkid)) def chunk_exists(self, chunkid): '''Does a chunk exist in the repository?''' self.require_open_client() return self.fs.exists(self._chunk_filename(chunkid)) def find_chunks(self, checksum): '''Return identifiers of chunks with given checksum. Because of hash collisions, the list may be longer than one. ''' self.require_open_client() return self.chunksums.find(checksum) def list_chunks(self): '''Return list of ids of all chunks in repository.''' result = [] pat = re.compile(r'^.*/.*/[0-9a-fA-F]+$') if self.fs.exists('chunks'): for pathname, st in self.fs.scan_tree('chunks'): if stat.S_ISREG(st.st_mode) and pat.match(pathname): basename = os.path.basename(pathname) result.append(int(basename, 16)) return result def remove_chunk(self, chunk_id): '''Remove a chunk from the repository. Note that this does _not_ remove the chunk from the chunk checksum forest. The caller is not supposed to call us until the chunk is not there anymore. However, it does remove the chunk from the chunk list forest. 
''' tracing.trace('chunk_id=%s', chunk_id) self.require_open_client() self.require_shared_lock() self.chunklist.remove(chunk_id) filename = self._chunk_filename(chunk_id) try: self.fs.remove(filename) except OSError: pass def get_file_chunks(self, gen, filename): '''Return list of ids of chunks belonging to a file.''' self.require_open_client() return self.client.get_file_chunks(gen, filename) def set_file_chunks(self, filename, chunkids): '''Set ids of chunks belonging to a file. File must be in the started generation. ''' self.require_started_generation() self.client.set_file_chunks(filename, chunkids) def append_file_chunks(self, filename, chunkids): '''Append to list of ids of chunks belonging to a file. File must be in the started generation. ''' self.require_started_generation() self.client.append_file_chunks(filename, chunkids) def set_file_data(self, filename, contents): # pragma: no cover '''Store contents of file in B-tree instead of chunks dir.''' self.require_started_generation() self.client.set_file_data(filename, contents) def get_file_data(self, gen, filename): # pragma: no cover '''Returned contents of file stored in B-tree instead of chunks dir.''' self.require_open_client() return self.client.get_file_data(gen, filename) def genspec(self, spec): '''Interpret a generation specification.''' self.require_open_client() gens = self.list_generations() if not gens: raise obnamlib.Error('No generations') if spec == 'latest': return gens[-1] else: try: intspec = int(spec) except ValueError: raise obnamlib.Error('Generation %s is not an integer' % spec) if intspec in gens: return intspec else: raise obnamlib.Error('Generation %s not found' % spec) def walk(self, gen, arg, depth_first=False): '''Iterate over each pathname specified by argument. This is a generator. Each return value is a tuple consisting of a pathname and its corresponding metadata. Directories are recursed into. 
        '''
        arg = os.path.normpath(arg)
        metadata = self.get_metadata(gen, arg)
        if metadata.isdir():
            if not depth_first:
                yield arg, metadata
            kids = self.listdir(gen, arg)
            kidpaths = [os.path.join(arg, kid) for kid in kids]
            for kidpath in kidpaths:
                for x in self.walk(gen, kidpath, depth_first=depth_first):
                    yield x
            if depth_first:
                yield arg, metadata
        else:
            yield arg, metadata

obnam-1.6.1/obnamlib/repo_dummy.py

# Copyright 2013 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
# =*= License: GPL-3+ =*=


import obnamlib


class KeyValueStore(object):

    def __init__(self):
        self._map = {}

    def get_value(self, key, default):
        if key in self._map:
            return self._map[key]
        return default

    def set_value(self, key, value):
        self._map[key] = value

    def remove_value(self, key):
        del self._map[key]

    def items(self):
        return self._map.items()

    def copy(self):
        other = KeyValueStore()
        for key, value in self.items():
            other.set_value(key, value)
        return other


class LockableKeyValueStore(object):

    def __init__(self):
        self.locked = False
        self.data = KeyValueStore()
        self.stashed = None

    def lock(self):
        assert not self.locked
        self.stashed = self.data
        self.data = self.data.copy()
        self.locked = True

    def unlock(self):
        assert self.locked
        self.data = self.stashed
        self.stashed = None
        self.locked = False

    def commit(self):
        assert self.locked
        self.stashed = None
        self.locked = False

    def get_value(self, key, default):
        return self.data.get_value(key, default)

    def set_value(self, key, value):
        self.data.set_value(key, value)

    def remove_value(self, key):
        self.data.remove_value(key)

    def items(self):
        return self.data.items()


class Counter(object):

    def __init__(self):
        self._latest = 0

    def next(self):
        self._latest += 1
        return self._latest


class DummyClient(object):

    def __init__(self, name):
        self.name = name
        self.generation_counter = Counter()
        self.data = LockableKeyValueStore()

    def lock(self):
        if self.data.locked:
            raise obnamlib.RepositoryClientLockingFailed(self.name)
        self.data.lock()

    def _require_lock(self):
        if not self.data.locked:
            raise obnamlib.RepositoryClientNotLocked(self.name)

    def unlock(self):
        self._require_lock()
        self.data.unlock()

    def commit(self):
        self._require_lock()
        self.data.set_value('current-generation', None)
        self.data.commit()

    def get_key(self, key):
        return self.data.get_value(key, '')

    def set_key(self, key, value):
        self._require_lock()
        self.data.set_value(key, value)

    def get_generation_ids(self):
        key = 'generation-ids'
        return self.data.get_value(key, [])

    def create_generation(self):
        self._require_lock()
        if self.data.get_value('current-generation', None) is not None:
            raise obnamlib.RepositoryClientGenerationUnfinished(self.name)
        generation_id = (self.name, self.generation_counter.next())
        ids = self.data.get_value('generation-ids', [])
        self.data.set_value('generation-ids', ids + [generation_id])
        self.data.set_value('current-generation', generation_id)
        if ids:
            prev_gen_id = ids[-1]
            for key, value in self.data.items():
                if self._is_filekey(key):
                    x, gen_id, filename = key
                    if gen_id == prev_gen_id:
                        value = self.data.get_value(key, None)
                        self.data.set_value(
                            self._filekey(generation_id, filename), value)
                elif self._is_filekeykey(key):
                    x, gen_id, filename, k = key
                    if gen_id == prev_gen_id:
                        value = self.data.get_value(key, None)
                        self.data.set_value(
                            self._filekeykey(generation_id, filename, k),
                            value)
                elif self._is_filechunkskey(key):
                    x, gen_id, filename = key
                    if gen_id == prev_gen_id:
                        value = self.data.get_value(key, [])
                        self.data.set_value(
                            self._filechunkskey(generation_id, filename),
                            value)
        return generation_id

    def _require_generation(self, gen_id):
        ids = self.data.get_value('generation-ids', [])
        if gen_id not in ids:
            raise obnamlib.RepositoryGenerationDoesNotExist(self.name)

    def get_generation_key(self, gen_id, key):
        return self.data.get_value(gen_id + (key,), '')

    def set_generation_key(self, gen_id, key, value):
        self._require_lock()
        self.data.set_value(gen_id + (key,), value)

    def remove_generation(self, gen_id):
        self._require_lock()
        self._require_generation(gen_id)
        ids = self.data.get_value('generation-ids', [])
        self.data.set_value('generation-ids', [x for x in ids if x != gen_id])

    def get_generation_chunk_ids(self, gen_id):
        chunk_ids = []
        for key, value in self.data.items():
            if self._is_filechunkskey(key) and key[1] == gen_id:
                chunk_ids.extend(value)
        return chunk_ids

    def interpret_generation_spec(self, genspec):
        ids = self.data.get_value('generation-ids', [])
        if not ids:
            raise obnamlib.RepositoryClientHasNoGenerations(self.name)
        if genspec == 'latest':
            if ids:
                return ids[-1]
        else:
            gen_number = int(genspec)
            if (self.name, gen_number) in ids:
                return (self.name, gen_number)
        raise obnamlib.RepositoryGenerationDoesNotExist(self.name)

    def make_generation_spec(self, generation_id):
        name, gen_number = generation_id
        return str(gen_number)

    def _filekey(self, gen_id, filename):
        return ('file', gen_id, filename)

    def _is_filekey(self, key):
        return (type(key) is tuple and len(key) == 3 and key[0] == 'file')

    def file_exists(self, gen_id, filename):
        return self.data.get_value(self._filekey(gen_id, filename), False)

    def add_file(self, gen_id, filename):
        self.data.set_value(self._filekey(gen_id, filename), True)

    def remove_file(self, gen_id, filename):
        keys = []
        for key, value in self.data.items():
            right_kind = (
                self._is_filekey(key) or
                self._is_filekeykey(key) or
                self._is_filechunkskey(key))
            if right_kind:
                if key[1] == gen_id and key[2] == filename:
                    keys.append(key)
        for k in keys:
            self.data.remove_value(k)

    def _filekeykey(self, gen_id, filename, key):
        return ('filekey', gen_id, filename, key)

    def _is_filekeykey(self, key):
        return (type(key) is tuple and len(key) == 4 and key[0] == 'filekey')

    def _require_file(self, gen_id, filename):
        if not self.file_exists(gen_id, filename):
            raise obnamlib.RepositoryFileDoesNotExistInGeneration(
                self.name, self.make_generation_spec(gen_id), filename)

    _integer_keys = (
        obnamlib.REPO_FILE_MTIME,
    )

    def get_file_key(self, gen_id, filename, key):
        self._require_generation(gen_id)
        self._require_file(gen_id, filename)
        if key in self._integer_keys:
            default = 0
        else:
            default = ''
        return self.data.get_value(
            self._filekeykey(gen_id, filename, key), default)

    def set_file_key(self, gen_id, filename, key, value):
        self._require_generation(gen_id)
        self._require_file(gen_id, filename)
        self.data.set_value(self._filekeykey(gen_id, filename, key), value)

    def _filechunkskey(self, gen_id, filename):
        return ('filechunks', gen_id, filename)

    def _is_filechunkskey(self, key):
        return (
            type(key) is tuple and len(key) == 3 and key[0] == 'filechunks')

    def get_file_chunk_ids(self, gen_id, filename):
        self._require_generation(gen_id)
        self._require_file(gen_id, filename)
        return self.data.get_value(self._filechunkskey(gen_id, filename), [])

    def append_file_chunk_id(self, gen_id, filename, chunk_id):
        self._require_generation(gen_id)
        self._require_file(gen_id, filename)
        chunk_ids = self.get_file_chunk_ids(gen_id, filename)
        self.data.set_value(
            self._filechunkskey(gen_id, filename), chunk_ids + [chunk_id])

    def clear_file_chunk_ids(self, gen_id, filename):
        self._require_generation(gen_id)
        self._require_file(gen_id, filename)
        self.data.set_value(self._filechunkskey(gen_id, filename), [])

    def get_file_children(self, gen_id, filename):
        children = []
        if filename.endswith('/'):
            prefix = filename
        else:
            prefix = filename + '/'
        for key, value in self.data.items():
            if not self._is_filekey(key):
                continue
            x, y, candidate = key
            if candidate == filename:
                continue
            if not candidate.startswith(prefix): # pragma: no cover
                continue
            if '/' in candidate[len(prefix):]:
                continue
            children.append(candidate)
        return children


class DummyClientList(object):

    def __init__(self):
        self.data = LockableKeyValueStore()

    def lock(self):
        if self.data.locked:
            raise obnamlib.RepositoryClientListLockingFailed()
        self.data.lock()

    def unlock(self):
        if not self.data.locked:
            raise obnamlib.RepositoryClientListNotLocked()
        self.data.unlock()

    def commit(self):
        if not self.data.locked:
            raise obnamlib.RepositoryClientListNotLocked()
        self.data.commit()

    def force(self):
        if self.data.locked:
            self.unlock()
        self.lock()

    def _require_lock(self):
        if not self.data.locked:
            raise obnamlib.RepositoryClientListNotLocked()

    def names(self):
        return [k for k, v in self.data.items() if v is not None]

    def __getitem__(self, client_name):
        client = self.data.get_value(client_name, None)
        if client is None:
            raise obnamlib.RepositoryClientDoesNotExist(client_name)
        return client

    def add(self, client_name):
        self._require_lock()
        if self.data.get_value(client_name, None) is not None:
            raise obnamlib.RepositoryClientAlreadyExists(client_name)
        self.data.set_value(client_name, DummyClient(client_name))

    def remove(self, client_name):
        self._require_lock()
        if self.data.get_value(client_name, None) is None:
            raise obnamlib.RepositoryClientDoesNotExist(client_name)
        self.data.set_value(client_name, None)

    def rename(self, old_client_name, new_client_name):
        self._require_lock()
        client = self.data.get_value(old_client_name, None)
        if client is None:
            raise obnamlib.RepositoryClientDoesNotExist(old_client_name)
        if self.data.get_value(new_client_name, None) is not None:
            raise obnamlib.RepositoryClientAlreadyExists(new_client_name)
        self.data.set_value(old_client_name, None)
        self.data.set_value(new_client_name, client)

    def get_client_by_generation_id(self, gen_id):
        client_name, generation_number = gen_id
        return self[client_name]


class ChunkStore(object):

    def __init__(self):
        self.next_chunk_id = Counter()
        self.chunks = {}

    def put_chunk_content(self, content):
        chunk_id = self.next_chunk_id.next()
        self.chunks[chunk_id] = content
        return chunk_id

    def get_chunk_content(self, chunk_id):
        if chunk_id not in self.chunks:
            raise obnamlib.RepositoryChunkDoesNotExist(str(chunk_id))
        return self.chunks[chunk_id]

    def has_chunk(self, chunk_id):
        return chunk_id in self.chunks

    def remove_chunk(self, chunk_id):
        if chunk_id not in self.chunks:
            raise obnamlib.RepositoryChunkDoesNotExist(str(chunk_id))
        del self.chunks[chunk_id]


class ChunkIndexes(object):

    def __init__(self):
        self.data = LockableKeyValueStore()

    def lock(self):
        if self.data.locked:
            raise obnamlib.RepositoryChunkIndexesLockingFailed()
        self.data.lock()

    def _require_lock(self):
        if not self.data.locked:
            raise obnamlib.RepositoryChunkIndexesNotLocked()

    def unlock(self):
        self._require_lock()
        self.data.unlock()

    def commit(self):
        self._require_lock()
        self.data.commit()

    def force(self):
        if self.data.locked:
            self.unlock()
        self.lock()
    def put_chunk(self, chunk_id, chunk_content, client_id):
        self._require_lock()
        self.data.set_value(chunk_id, chunk_content)

    def find_chunk(self, chunk_content):
        for chunk_id, stored_content in self.data.items():
            if stored_content == chunk_content:
                return chunk_id
        raise obnamlib.RepositoryChunkContentNotInIndexes()

    def remove_chunk(self, chunk_id, client_id):
        self._require_lock()
        self.data.set_value(chunk_id, None)


class RepositoryFormatDummy(obnamlib.RepositoryInterface):

    '''Simplistic repository format for testing.

    This class exists to exercise the RepositoryInterfaceTests class.

    '''

    format = 'dummy'

    def __init__(self):
        self._client_list = DummyClientList()
        self._chunk_store = ChunkStore()
        self._chunk_indexes = ChunkIndexes()

    def set_fs(self, fs):
        pass

    def init_repo(self):
        pass

    def get_client_names(self):
        return self._client_list.names()

    def lock_client_list(self):
        self._client_list.lock()

    def unlock_client_list(self):
        self._client_list.unlock()

    def commit_client_list(self):
        self._client_list.commit()

    def force_client_list_lock(self):
        self._client_list.force()

    def add_client(self, client_name):
        self._client_list.add(client_name)

    def remove_client(self, client_name):
        self._client_list.remove(client_name)

    def rename_client(self, old_client_name, new_client_name):
        self._client_list.rename(old_client_name, new_client_name)

    def lock_client(self, client_name):
        self._client_list[client_name].lock()

    def unlock_client(self, client_name):
        self._client_list[client_name].unlock()

    def commit_client(self, client_name):
        self._client_list[client_name].commit()

    def get_allowed_client_keys(self):
        return [obnamlib.REPO_CLIENT_TEST_KEY]

    def get_client_key(self, client_name, key):
        return self._client_list[client_name].get_key(key)

    def set_client_key(self, client_name, key, value):
        if key not in self.get_allowed_client_keys():
            raise obnamlib.RepositoryClientKeyNotAllowed(
                self.format, client_name, key)
        self._client_list[client_name].set_key(key, value)

    def get_client_generation_ids(self, client_name):
        return self._client_list[client_name].get_generation_ids()

    def create_generation(self, client_name):
        return self._client_list[client_name].create_generation()

    def get_allowed_generation_keys(self):
        return [obnamlib.REPO_GENERATION_TEST_KEY]

    def get_generation_key(self, generation_id, key):
        client = self._client_list.get_client_by_generation_id(generation_id)
        return client.get_generation_key(generation_id, key)

    def set_generation_key(self, generation_id, key, value):
        client = self._client_list.get_client_by_generation_id(generation_id)
        if key not in self.get_allowed_generation_keys():
            raise obnamlib.RepositoryGenerationKeyNotAllowed(
                self.format, client.name, key)
        return client.set_generation_key(generation_id, key, value)

    def remove_generation(self, generation_id):
        client = self._client_list.get_client_by_generation_id(generation_id)
        client.remove_generation(generation_id)

    def get_generation_chunk_ids(self, generation_id):
        client = self._client_list.get_client_by_generation_id(generation_id)
        return client.get_generation_chunk_ids(generation_id)

    def interpret_generation_spec(self, client_name, genspec):
        client = self._client_list[client_name]
        return client.interpret_generation_spec(genspec)

    def make_generation_spec(self, generation_id):
        client = self._client_list.get_client_by_generation_id(generation_id)
        return client.make_generation_spec(generation_id)

    def file_exists(self, generation_id, filename):
        client = self._client_list.get_client_by_generation_id(generation_id)
        return client.file_exists(generation_id, filename)

    def add_file(self, generation_id, filename):
        client = self._client_list.get_client_by_generation_id(generation_id)
        return client.add_file(generation_id, filename)

    def remove_file(self, generation_id, filename):
        client = self._client_list.get_client_by_generation_id(generation_id)
        return client.remove_file(generation_id, filename)

    def get_file_key(self, generation_id, filename, key):
        client = self._client_list.get_client_by_generation_id(generation_id)
        if key not in self.get_allowed_file_keys():
            raise obnamlib.RepositoryFileKeyNotAllowed(
                self.format, client.name, key)
        return client.get_file_key(generation_id, filename, key)

    def set_file_key(self, generation_id, filename, key, value):
        client = self._client_list.get_client_by_generation_id(generation_id)
        if key not in self.get_allowed_file_keys():
            raise obnamlib.RepositoryFileKeyNotAllowed(
                self.format, client.name, key)
        client.set_file_key(generation_id, filename, key, value)

    def get_allowed_file_keys(self):
        return [obnamlib.REPO_FILE_TEST_KEY, obnamlib.REPO_FILE_MTIME]

    def get_file_chunk_ids(self, generation_id, filename):
        client = self._client_list.get_client_by_generation_id(generation_id)
        return client.get_file_chunk_ids(generation_id, filename)

    def append_file_chunk_id(self, generation_id, filename, chunk_id):
        client = self._client_list.get_client_by_generation_id(generation_id)
        return client.append_file_chunk_id(generation_id, filename, chunk_id)

    def clear_file_chunk_ids(self, generation_id, filename):
        client = self._client_list.get_client_by_generation_id(generation_id)
        client.clear_file_chunk_ids(generation_id, filename)

    def get_file_children(self, generation_id, filename):
        client = self._client_list.get_client_by_generation_id(generation_id)
        return client.get_file_children(generation_id, filename)

    def put_chunk_content(self, content):
        return self._chunk_store.put_chunk_content(content)

    def get_chunk_content(self, chunk_id):
        return self._chunk_store.get_chunk_content(chunk_id)

    def has_chunk(self, chunk_id):
        return self._chunk_store.has_chunk(chunk_id)

    def remove_chunk(self, chunk_id):
        self._chunk_store.remove_chunk(chunk_id)

    def lock_chunk_indexes(self):
        self._chunk_indexes.lock()

    def unlock_chunk_indexes(self):
        self._chunk_indexes.unlock()

    def commit_chunk_indexes(self):
        self._chunk_indexes.commit()

    def force_chunk_indexes_lock(self):
        self._chunk_indexes.force()

    def put_chunk_into_indexes(self, chunk_id, chunk_content, client_id):
        self._chunk_indexes.put_chunk(chunk_id, chunk_content, client_id)

    def find_chunk_id_by_content(self, chunk_content):
        return self._chunk_indexes.find_chunk(chunk_content)

    def remove_chunk_from_indexes(self, chunk_id, client_id):
        return self._chunk_indexes.remove_chunk(chunk_id, client_id)

    def get_fsck_work_item(self):
        return 'this pretends to be a work item'

obnam-1.6.1/obnamlib/repo_dummy_tests.py

# Copyright 2013 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
# =*= License: GPL-3+ =*=


import obnamlib


class RepositoryFormatDummyTests(obnamlib.RepositoryInterfaceTests):

    def setUp(self):
        self.repo = obnamlib.RepositoryFormatDummy()

obnam-1.6.1/obnamlib/repo_fmt_6.py

# Copyright (C) 2009-2013  Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
# # You should have received a copy of the GNU General Public License # along with this program. If not, see . import errno import hashlib import larch import logging import os import random import re import stat import struct import time import tracing import obnamlib class HookedFS(object): '''A class to filter read/written data through hooks.''' def __init__(self, repo, fs, hooks): self.repo = repo self.fs = fs self.hooks = hooks def __getattr__(self, name): return getattr(self.fs, name) def _get_toplevel(self, filename): parts = filename.split(os.sep) if len(parts) > 1: return parts[0] else: # pragma: no cover raise obnamlib.Error('File at repository root: %s' % filename) def cat(self, filename, runfilters=True): data = self.fs.cat(filename) toplevel = self._get_toplevel(filename) if not runfilters: # pragma: no cover return data return self.hooks.filter_read('repository-data', data, repo=self.repo, toplevel=toplevel) def lock(self, filename, data): self.fs.lock(filename, data) def write_file(self, filename, data, runfilters=True): tracing.trace('writing hooked %s' % filename) toplevel = self._get_toplevel(filename) if runfilters: data = self.hooks.filter_write('repository-data', data, repo=self.repo, toplevel=toplevel) self.fs.write_file(filename, data) def overwrite_file(self, filename, data, runfilters=True): tracing.trace('overwriting hooked %s' % filename) toplevel = self._get_toplevel(filename) if runfilters: data = self.hooks.filter_write('repository-data', data, repo=self.repo, toplevel=toplevel) self.fs.overwrite_file(filename, data) class _OpenClient(object): def __init__(self, client): self.locked = False self.client = client self.current_generation_number = None self.removed_generation_numbers = [] class RepositoryFormat6(obnamlib.RepositoryInterface): format = '6' def __init__(self, lock_timeout=0, node_size=obnamlib.DEFAULT_NODE_SIZE, upload_queue_size=obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE, lru_size=obnamlib.DEFAULT_LRU_SIZE, 
idpath_depth=obnamlib.IDPATH_DEPTH, idpath_bits=obnamlib.IDPATH_BITS, idpath_skip=obnamlib.IDPATH_SKIP, hooks=None): self._lock_timeout = lock_timeout self._node_size = node_size self._upload_queue_size = upload_queue_size self._lru_size = lru_size self._idpath_depth = idpath_depth self._idpath_bits = idpath_bits self._idpath_skip = idpath_skip self._setup_hooks(hooks or obnamlib.HookManager()) self._setup_chunks() def _setup_hooks(self, hooks): self.hooks = hooks self.hooks.new('repository-toplevel-init') self.hooks.new_filter('repository-data') self.hooks.new('repository-add-client') def set_fs(self, fs): self._fs = HookedFS(self, fs, self.hooks) self._lockmgr = obnamlib.LockManager(self._fs, self._lock_timeout, '') self._setup_client_list() self._setup_client() self._setup_chunk_indexes() def init_repo(self): # There is nothing else to be done. pass # Client list handling. def _setup_client_list(self): self._got_client_list_lock = False self._client_list = obnamlib.ClientList( self._fs, self._node_size, self._upload_queue_size, self._lru_size, self) def _raw_lock_client_list(self): if self._got_client_list_lock: raise obnamlib.RepositoryClientListLockingFailed() self._lockmgr.lock(['.']) self._got_client_list_lock = True self._client_list.start_changes() def _raw_unlock_client_list(self): if not self._got_client_list_lock: raise obnamlib.RepositoryClientListNotLocked() self._lockmgr.unlock(['.']) self._setup_client_list() def _require_client_list_lock(self): if not self._got_client_list_lock: raise obnamlib.RepositoryClientListNotLocked() def lock_client_list(self): tracing.trace('locking client list') self._raw_lock_client_list() def unlock_client_list(self): tracing.trace('unlocking client list') self._raw_unlock_client_list() def commit_client_list(self): tracing.trace('committing client list') self._client_list.commit() self._raw_unlock_client_list() def force_client_list_lock(self): tracing.trace('forcing client list lock') if self._got_client_list_lock: 
self._raw_unlock_client_list() self._raw_lock_client_list() def get_client_names(self): return self._client_list.list_clients() def add_client(self, client_name): self._require_client_list_lock() if self._client_list.get_client_id(client_name): raise obnamlib.RepositoryClientAlreadyExists(client_name) self._client_list.add_client(client_name) def remove_client(self, client_name): self._require_client_list_lock() if not self._client_list.get_client_id(client_name): raise obnamlib.RepositoryClientDoesNotExist(client_name) self._client_list.remove_client(client_name) def rename_client(self, old_client_name, new_client_name): self._require_client_list_lock() client_names = self.get_client_names() if old_client_name not in client_names: raise obnamlib.RepositoryClientDoesNotExist(old_client_name) if new_client_name in client_names: raise obnamlib.RepositoryClientAlreadyExists(new_client_name) client_id = self._get_client_id(old_client_name) new_key = self._client_list.key( new_client_name, client_id, self._client_list.CLIENT_NAME) self._client_list.tree.insert(new_key, new_client_name) old_key = self._client_list.key( old_client_name, client_id, self._client_list.CLIENT_NAME) self._client_list.tree.remove(old_key) def _get_client_id(self, client_name): '''Return a client's unique, filesystem-visible id. The id is a random 64-bit integer. ''' return self._client_list.get_client_id(client_name) # Handling of individual clients. def current_time(self): # ClientMetadataTree wants us to provide this method. # FIXME: A better design would be to for us to provide # the class with a function to call. return time.time() def _setup_client(self): # We keep a list of all open clients. An open client may or # may not be locked. Each value in the dict is a tuple of # ClientMetadataTree and is_locked. 
        self._open_clients = {}

    def _open_client(self, client_name):
        if client_name not in self._open_clients:
            tracing.trace('client_name=%s', client_name)
            client_id = self._get_client_id(client_name)
            if client_id is None: # pragma: no cover
                raise obnamlib.RepositoryClientDoesNotExist(client_name)

            client_dir = self._get_client_dir(client_id)
            client = obnamlib.ClientMetadataTree(
                self._fs, client_dir, self._node_size,
                self._upload_queue_size, self._lru_size, self)
            client.init_forest()

            self._open_clients[client_name] = _OpenClient(client)

        return self._open_clients[client_name].client

    def _get_client_dir(self, client_id):
        '''Return name of sub-directory for a given client.'''
        return str(client_id)

    def _client_is_locked(self, client_name):
        if client_name in self._open_clients:
            open_client = self._open_clients[client_name]
            return open_client.locked
        return False

    def _require_client_lock(self, client_name):
        if client_name not in self.get_client_names():
            raise obnamlib.RepositoryClientDoesNotExist(client_name)
        if not self._client_is_locked(client_name):
            raise obnamlib.RepositoryClientNotLocked(client_name)

    def _raw_lock_client(self, client_name):
        tracing.trace('client_name=%s', client_name)

        if self._client_is_locked(client_name):
            raise obnamlib.RepositoryClientLockingFailed(client_name)

        client_id = self._get_client_id(client_name)
        if client_id is None: # pragma: no cover
            raise obnamlib.RepositoryClientDoesNotExist(client_name)

        # Create and initialise the client's own directory, if needed.
        client_dir = self._get_client_dir(client_id)
        if not self._fs.exists(client_dir):
            self._fs.mkdir(client_dir)
            self.hooks.call('repository-toplevel-init', self, client_dir)

        # Actually lock the directory.
        self._lockmgr.lock([client_dir])

        # Remember that we have the lock.
        self._open_client(client_name) # Ensure client is open
        open_client = self._open_clients[client_name]
        open_client.locked = True

    def _raw_unlock_client(self, client_name):
        tracing.trace('client_name=%s', client_name)
        self._require_client_lock(client_name)
        open_client = self._open_clients[client_name]
        self._lockmgr.unlock([open_client.client.dirname])
        del self._open_clients[client_name]

    def lock_client(self, client_name):
        logging.info('Locking client %s' % client_name)
        self._raw_lock_client(client_name)

    def unlock_client(self, client_name):
        logging.info('Unlocking client %s' % client_name)
        self._raw_unlock_client(client_name)

    def commit_client(self, client_name):
        tracing.trace('client_name=%s', client_name)
        self._require_client_lock(client_name)
        open_client = self._open_clients[client_name]
        for gen_number in open_client.removed_generation_numbers:
            open_client.client.remove_generation(gen_number)
        if open_client.current_generation_number:
            open_client.client.commit()
        self._raw_unlock_client(client_name)

    def get_allowed_client_keys(self):
        return []

    def get_client_key(self, client_name, key): # pragma: no cover
        raise obnamlib.RepositoryClientKeyNotAllowed(
            self.format, client_name, key)

    def set_client_key(self, client_name, key, value):
        raise obnamlib.RepositoryClientKeyNotAllowed(
            self.format, client_name, key)

    def get_client_generation_ids(self, client_name):
        client = self._open_client(client_name)
        open_client = self._open_clients[client_name]
        return [
            (client_name, gen_number)
            for gen_number in client.list_generations()
            if gen_number not in open_client.removed_generation_numbers]

    def create_generation(self, client_name):
        tracing.trace('client_name=%s', client_name)
        self._require_client_lock(client_name)

        open_client = self._open_clients[client_name]
        if open_client.current_generation_number is not None:
            raise obnamlib.RepositoryClientGenerationUnfinished(client_name)

        open_client.client.start_generation()
        open_client.current_generation_number = \
            open_client.client.get_generation_id(open_client.client.tree)

        return (client_name, open_client.current_generation_number)

    # Generations for a client.

    def _require_existing_generation(self, generation_id):
        client_name, gen_number = generation_id
        if generation_id not in self.get_client_generation_ids(client_name):
            raise obnamlib.RepositoryGenerationDoesNotExist(client_name)

    def get_allowed_generation_keys(self):
        return []

    def get_generation_key(self, generation_id, key): # pragma: no cover
        client_name, gen_number = generation_id
        raise obnamlib.RepositoryGenerationKeyNotAllowed(
            self.format, client_name, key)

    def set_generation_key(self, generation_id, key, value): # pragma: no cover
        client_name, gen_number = generation_id
        raise obnamlib.RepositoryGenerationKeyNotAllowed(
            self.format, client_name, key)

    def interpret_generation_spec(self, client_name, genspec):
        ids = self.get_client_generation_ids(client_name)
        if not ids:
            raise obnamlib.RepositoryClientHasNoGenerations(client_name)
        if genspec == 'latest':
            return ids[-1]
        for gen_id in ids:
            if str(gen_id[1]) == genspec:
                return gen_id
        raise obnamlib.RepositoryGenerationDoesNotExist(client_name)

    def make_generation_spec(self, gen_id):
        return str(gen_id[1])

    def remove_generation(self, gen_id):
        tracing.trace('gen_id=%s' % repr(gen_id))
        client_name, gen_number = gen_id
        self._require_client_lock(client_name)
        self._require_existing_generation(gen_id)
        open_client = self._open_clients[client_name]
        if gen_number == open_client.current_generation_number:
            open_client.current_generation_number = None
        open_client.removed_generation_numbers.append(gen_number)

    def get_generation_chunk_ids(self, generation_id):
        client_name, gen_number = generation_id
        client = self._open_client(client_name)
        return client.list_chunks_in_generation(gen_number)

    # Chunks and chunk indexes.
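The `interpret_generation_spec` method above resolves a user-supplied spec against the ordered generation list. A standalone sketch of the same lookup logic, with plain `(client_name, gen_number)` tuples standing in for the opaque generation ids and `LookupError` standing in for the obnamlib exception classes:

```python
def interpret_generation_spec(ids, genspec):
    # ids is the ordered list of (client_name, gen_number) pairs,
    # oldest first, as returned by get_client_generation_ids.
    if not ids:
        raise LookupError('client has no generations')
    if genspec == 'latest':
        return ids[-1]
    for gen_id in ids:
        # A numeric spec matches the string form of the generation number.
        if str(gen_id[1]) == genspec:
            return gen_id
    raise LookupError('no generation matches %r' % genspec)

ids = [('fooclient', 2), ('fooclient', 5)]
assert interpret_generation_spec(ids, 'latest') == ('fooclient', 5)
assert interpret_generation_spec(ids, '2') == ('fooclient', 2)
```

Note that `make_generation_spec` is the inverse for the non-`'latest'` case: it returns `str(gen_id[1])`, which round-trips through this lookup.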
    def _setup_chunks(self):
        self._prev_chunk_id = None
        self._chunk_idpath = larch.IdPath(
            'chunks', self._idpath_depth, self._idpath_bits,
            self._idpath_skip)

    def _chunk_filename(self, chunk_id):
        return self._chunk_idpath.convert(chunk_id)

    def _random_chunk_id(self):
        return random.randint(0, obnamlib.MAX_ID)

    def put_chunk_content(self, data):
        if self._prev_chunk_id is None:
            self._prev_chunk_id = self._random_chunk_id()

        while True:
            chunk_id = (self._prev_chunk_id + 1) % obnamlib.MAX_ID
            filename = self._chunk_filename(chunk_id)
            try:
                self._fs.write_file(filename, data)
            except OSError, e: # pragma: no cover
                if e.errno == errno.EEXIST:
                    self._prev_chunk_id = self._random_chunk_id()
                    continue
                raise
            else:
                tracing.trace('chunkid=%s', chunk_id)
                break

        self._prev_chunk_id = chunk_id
        return chunk_id

    def get_chunk_content(self, chunk_id):
        try:
            return self._fs.cat(self._chunk_filename(chunk_id))
        except IOError, e:
            if e.errno == errno.ENOENT:
                raise obnamlib.RepositoryChunkDoesNotExist(str(chunk_id))
            raise # pragma: no cover

    def has_chunk(self, chunk_id):
        return self._fs.exists(self._chunk_filename(chunk_id))

    def remove_chunk(self, chunk_id):
        tracing.trace('chunk_id=%s', chunk_id)
        filename = self._chunk_filename(chunk_id)
        try:
            self._fs.remove(filename)
        except OSError:
            raise obnamlib.RepositoryChunkDoesNotExist(str(chunk_id))

    # Chunk indexes.
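The `put_chunk_content` method above allocates chunk ids sequentially, but restarts at a random id whenever the target file already exists (the `EEXIST` branch). A minimal in-memory sketch of that allocation policy, where an id set stands in for the filesystem's existence check and `MAX_ID` is an assumed stand-in for `obnamlib.MAX_ID`:

```python
import random

MAX_ID = 2 ** 64 - 1  # assumption: stands in for obnamlib.MAX_ID

def allocate_chunk_id(used, prev_id):
    """Pick the next chunk id, restarting randomly on collision."""
    if prev_id is None:
        prev_id = random.randint(0, MAX_ID)
    while True:
        chunk_id = (prev_id + 1) % MAX_ID
        if chunk_id not in used:  # stands in for the EEXIST check
            used.add(chunk_id)
            return chunk_id
        # Collision: jump to a random point in the id space and retry,
        # just as put_chunk_content calls _random_chunk_id().
        prev_id = random.randint(0, MAX_ID)

used = set()
first = allocate_chunk_id(used, None)
second = allocate_chunk_id(used, first)
assert second == (first + 1) % MAX_ID  # sequential in the common case
```

The sequential common case keeps ids clustered (good for the `IdPath` directory layout), while the random restart avoids repeated collisions when two writers race.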
    def _checksum(self, data):
        return hashlib.md5(data).hexdigest()

    def _setup_chunk_indexes(self):
        self._got_chunk_indexes_lock = False
        self._chunklist = obnamlib.ChunkList(
            self._fs, self._node_size, self._upload_queue_size,
            self._lru_size, self)
        self._chunksums = obnamlib.ChecksumTree(
            self._fs, 'chunksums', len(self._checksum('')),
            self._node_size, self._upload_queue_size, self._lru_size,
            self)

    def _chunk_index_dirs_to_lock(self):
        return [
            self._chunklist.dirname,
            self._chunksums.dirname,
            self._chunk_idpath.dirname]

    def _require_chunk_indexes_lock(self):
        if not self._got_chunk_indexes_lock:
            raise obnamlib.RepositoryChunkIndexesNotLocked()

    def _raw_lock_chunk_indexes(self):
        if self._got_chunk_indexes_lock:
            raise obnamlib.RepositoryChunkIndexesLockingFailed()

        self._lockmgr.lock(self._chunk_index_dirs_to_lock())
        self._got_chunk_indexes_lock = True

        tracing.trace('starting changes in chunksums and chunklist')
        self._chunksums.start_changes()
        self._chunklist.start_changes()

        # Initialize the chunks directory for encryption, etc, if it just
        # got created.
        dirname = self._chunk_idpath.dirname
        filenames = self._fs.listdir(dirname)
        if filenames == [] or filenames == ['lock']:
            self.hooks.call('repository-toplevel-init', self, dirname)

    def _raw_unlock_chunk_indexes(self):
        self._require_chunk_indexes_lock()
        self._lockmgr.unlock(self._chunk_index_dirs_to_lock())
        self._setup_chunk_indexes()

    def lock_chunk_indexes(self):
        tracing.trace('locking chunk indexes')
        self._raw_lock_chunk_indexes()

    def unlock_chunk_indexes(self):
        tracing.trace('unlocking chunk indexes')
        self._raw_unlock_chunk_indexes()

    def force_chunk_indexes_lock(self):
        tracing.trace('forcing chunk indexes lock')
        if self._got_chunk_indexes_lock:
            self._raw_unlock_chunk_indexes()
        self._raw_lock_chunk_indexes()

    def commit_chunk_indexes(self):
        tracing.trace('committing chunk indexes')
        self._require_chunk_indexes_lock()
        self._chunklist.commit()
        self._chunksums.commit()
        self._raw_unlock_chunk_indexes()

    def put_chunk_into_indexes(self, chunk_id, data, client_id):
        tracing.trace('chunk_id=%s', chunk_id)
        checksum = self._checksum(data)
        tracing.trace('checksum of data: %s', checksum)
        tracing.trace('client_id=%s', client_id)

        self._require_chunk_indexes_lock()
        self._chunklist.add(chunk_id, checksum)
        self._chunksums.add(checksum, chunk_id, client_id)

    def remove_chunk_from_indexes(self, chunk_id, client_id):
        tracing.trace('chunk_id=%s', chunk_id)
        self._require_chunk_indexes_lock()

        checksum = self._chunklist.get_checksum(chunk_id)
        self._chunksums.remove(checksum, chunk_id, client_id)
        self._chunklist.remove(chunk_id)

    def find_chunk_id_by_content(self, data):
        checksum = self._checksum(data)
        candidates = self._chunksums.find(checksum)
        for chunk_id in candidates:
            chunk_data = self.get_chunk_content(chunk_id)
            if chunk_data == data:
                return chunk_id
        raise obnamlib.RepositoryChunkContentNotInIndexes()

    # Individual files in a generation.
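The two chunk indexes above form a bidirectional mapping: `ChunkList` maps chunk id to checksum, and `ChecksumTree` maps checksum back to chunk ids, with `find_chunk_id_by_content` comparing actual bytes to rule out MD5 collisions. A sketch of that scheme with plain dicts standing in for the on-disk B-trees (all names here are hypothetical stand-ins, and client-id tracking is omitted):

```python
import hashlib

chunklist = {}   # chunk_id -> checksum (stands in for ChunkList)
chunksums = {}   # checksum -> set of chunk_ids (stands in for ChecksumTree)
chunks = {}      # chunk_id -> data (stands in for the chunk files)

def put_chunk_into_indexes(chunk_id, data):
    checksum = hashlib.md5(data).hexdigest()
    chunklist[chunk_id] = checksum
    chunksums.setdefault(checksum, set()).add(chunk_id)

def find_chunk_id_by_content(data):
    # All candidates share a checksum; compare bytes to be safe
    # against hash collisions, as the real method does.
    checksum = hashlib.md5(data).hexdigest()
    for chunk_id in chunksums.get(checksum, ()):
        if chunks[chunk_id] == data:
            return chunk_id
    raise KeyError('content not in indexes')

chunks[7] = b'hello'
put_chunk_into_indexes(7, b'hello')
assert find_chunk_id_by_content(b'hello') == 7
```

This also explains the `len(self._checksum(''))` argument to `ChecksumTree` above: the tree is keyed by fixed-width checksums, and an MD5 hexdigest is always 32 characters.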
    def _require_existing_file(self, generation_id, filename):
        client_name, gen_number = generation_id

        if generation_id not in self.get_client_generation_ids(client_name):
            raise obnamlib.RepositoryGenerationDoesNotExist(client_name)

        if not self.file_exists(generation_id, filename):
            raise obnamlib.RepositoryFileDoesNotExistInGeneration(
                client_name, self.make_generation_spec(generation_id),
                filename)

    def file_exists(self, generation_id, filename):
        client_name, gen_number = generation_id
        client = self._open_client(client_name)
        try:
            client.get_metadata(gen_number, filename)
            return True
        except KeyError:
            return False

    def add_file(self, generation_id, filename):
        client_name, gen_number = generation_id
        self._require_client_lock(client_name)
        client = self._open_client(client_name)
        encoded_metadata = obnamlib.encode_metadata(obnamlib.Metadata())
        client.create(filename, encoded_metadata)

    def remove_file(self, generation_id, filename):
        client_name, gen_number = generation_id
        self._require_client_lock(client_name)
        client = self._open_client(client_name)
        client.remove(filename) # FIXME: Only removes from unfinished gen!

    def get_allowed_file_keys(self):
        return [obnamlib.REPO_FILE_TEST_KEY]

    def get_file_key(self, generation_id, filename, key):
        self._require_existing_file(generation_id, filename)

        client_name, gen_number = generation_id
        client = self._open_client(client_name)

        encoded_metadata = client.get_metadata(gen_number, filename)
        metadata = obnamlib.decode_metadata(encoded_metadata)

        if key == obnamlib.REPO_FILE_MTIME:
            return metadata.st_mtime_sec or 0
        elif key == obnamlib.REPO_FILE_TEST_KEY:
            return metadata.target or ''
        else:
            raise obnamlib.RepositoryFileKeyNotAllowed(
                self.format, client_name, key)

    def set_file_key(self, generation_id, filename, key, value):
        client_name, gen_number = generation_id
        self._require_client_lock(client_name)
        self._require_existing_file(generation_id, filename)

        client = self._open_client(client_name)
        encoded_metadata = client.get_metadata(gen_number, filename)
        metadata = obnamlib.decode_metadata(encoded_metadata)

        if key == obnamlib.REPO_FILE_MTIME:
            metadata.st_mtime_sec = value
        elif key == obnamlib.REPO_FILE_TEST_KEY:
            metadata.target = value
        else:
            raise obnamlib.RepositoryFileKeyNotAllowed(
                self.format, client_name, key)

        encoded_metadata = obnamlib.encode_metadata(metadata)
        # FIXME: Only sets in unfinished generation
        client.set_metadata(filename, encoded_metadata)

    def get_file_chunk_ids(self, generation_id, filename):
        self._require_existing_file(generation_id, filename)
        client_name, gen_number = generation_id
        client = self._open_client(client_name)
        return client.get_file_chunks(gen_number, filename)

    def clear_file_chunk_ids(self, generation_id, filename):
        self._require_existing_file(generation_id, filename)
        client_name, gen_number = generation_id
        self._require_client_lock(client_name)
        client = self._open_client(client_name)
        client.set_file_chunks(filename, []) # FIXME: current gen only

    def append_file_chunk_id(self, generation_id, filename, chunk_id):
        self._require_existing_file(generation_id, filename)
        client_name, gen_number = generation_id
        self._require_client_lock(client_name)
        client = self._open_client(client_name)
        client.append_file_chunks(filename, [chunk_id]) # FIXME: curgen only

    def get_file_children(self, generation_id, filename):
        self._require_existing_file(generation_id, filename)
        client_name, gen_number = generation_id
        client = self._open_client(client_name)
        return [os.path.join(filename, basename)
                for basename in client.listdir(gen_number, filename)]

    # Fsck.

    def get_fsck_work_item(self):
        return []
obnam-1.6.1/obnamlib/repo_fmt_6_tests.py0000644000175000017500000000201512246357067020130 0ustar  jenkinsjenkins
# Copyright (C) 2013 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


import shutil
import tempfile

import obnamlib


class RepositoryFormat6Tests(obnamlib.RepositoryInterfaceTests):

    def setUp(self):
        self.tempdir = tempfile.mkdtemp()
        fs = obnamlib.LocalFS(self.tempdir)
        self.repo = obnamlib.RepositoryFormat6()
        self.repo.set_fs(fs)

    def tearDown(self):
        shutil.rmtree(self.tempdir)
obnam-1.6.1/obnamlib/repo_interface.py0000644000175000017500000016366012246357067017651 0ustar  jenkinsjenkins
# repo_interface.py -- interface class for repository access
#
# Copyright 2013 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
# =*= License: GPL-3+ =*=


import unittest

import obnamlib


# The following is a canonical list of all keys that can be used with
# the repository interface for key/value pairs. Not all formats need
# to support all keys, but they all must support the test keys, for
# the test suite to function.

REPO_CLIENT_TEST_KEY = 0        # string
REPO_GENERATION_TEST_KEY = 1    # string
REPO_FILE_TEST_KEY = 2          # string
REPO_FILE_MTIME = 3             # integer

REPO_FILE_INTEGER_KEYS = (
    REPO_FILE_MTIME,
)

# The following is a key that is NOT allowed for any repository format.
WRONG_KEY = -1


class RepositoryClientListLockingFailed(obnamlib.Error):

    def __init__(self):
        self.msg = 'Repository client list could not be locked'


class RepositoryClientListNotLocked(obnamlib.Error):

    def __init__(self):
        self.msg = 'Repository client list is not locked'


class RepositoryClientAlreadyExists(obnamlib.Error):

    def __init__(self, client_name):
        self.msg = 'Repository client %s already exists' % client_name


class RepositoryClientDoesNotExist(obnamlib.Error):

    def __init__(self, client_name):
        self.msg = 'Repository client %s does not exist' % client_name


class RepositoryClientLockingFailed(obnamlib.Error):

    def __init__(self, client_name):
        self.msg = 'Repository client %s could not be locked' % client_name


class RepositoryClientNotLocked(obnamlib.Error):

    def __init__(self, client_name):
        self.msg = 'Repository client %s is not locked' % client_name


class RepositoryClientKeyNotAllowed(obnamlib.Error):

    def __init__(self, format, client_name, key):
        self.msg = (
            'Client %s uses repository format %s '
            'which does not allow the key %s to be used for clients'
            % (client_name, format, key))


class RepositoryClientGenerationUnfinished(obnamlib.Error):

    def __init__(self, client_name):
        self.msg = (
            'Cannot start new generation for %s: '
            'previous one is not finished yet (programming error)'
            % client_name)


class RepositoryGenerationKeyNotAllowed(obnamlib.Error):

    def __init__(self, format, client_name, key):
        self.msg = (
            'Client %s uses repository format %s '
            'which does not allow the key %s to be used for generations'
            % (client_name, format, key))


class RepositoryGenerationDoesNotExist(obnamlib.Error):

    def __init__(self, client_name):
        self.msg = (
            'Cannot find requested generation for client %s'
            % client_name)


class RepositoryClientHasNoGenerations(obnamlib.Error):

    def __init__(self, client_name):
        self.msg = 'Client %s has no generations' % client_name


class RepositoryFileDoesNotExistInGeneration(obnamlib.Error):

    def __init__(self, client_name, genspec, filename):
        self.msg = (
            'Client %s, generation %s does not have file %s'
            % (client_name, genspec, filename))


class RepositoryFileKeyNotAllowed(obnamlib.Error):

    def __init__(self, format, client_name, key):
        self.msg = (
            'Client %s uses repository format %s '
            'which does not allow the key %s to be used for files'
            % (client_name, format, key))


class RepositoryChunkDoesNotExist(obnamlib.Error):

    def __init__(self, chunk_id_as_string):
        self.msg = "Repository doesn't contain chunk %s" % chunk_id_as_string


class RepositoryChunkContentNotInIndexes(obnamlib.Error):

    def __init__(self):
        self.msg = "Repository chunk indexes do not contain content"


class RepositoryChunkIndexesNotLocked(obnamlib.Error):

    def __init__(self):
        self.msg = 'Repository chunk indexes are not locked'


class RepositoryChunkIndexesLockingFailed(obnamlib.Error):

    def __init__(self):
        self.msg = 'Repository chunk indexes are already locked'


class RepositoryInterface(object):

    '''Abstract interface to Obnam backup repositories.

    An Obnam backup repository stores backups for backup clients.
    As development of Obnam progresses, the details of how things
    are stored can change. This is usually necessary for performance
    improvements.

    To allow Obnam to access, both for reading and writing, any
    version of the repository format, this class defines an interface
    for repository access. Every version of the format implements a
    class with this interface, so that the rest of Obnam can just use
    the interface.

    The interface is high level enough that using the repository is
    convenient, and that it allows a variety of implementations. At
    the same time it concentrates on the needs of repository access
    only.

    The interface also specifies the interface with which the
    implementation accesses the actual filesystem: it is the Obnam
    VFS layer.

        [rest of Obnam code]
                |
                | calls RepositoryInterface API
                |
                V
        [RepositoryFormatX implementing RepositoryInterface API]
                |
                | calls VFS API
                |
                V
        [FooFS implementing VirtualFileSystem API]

    The VFS API implementation is given to the RepositoryInterface
    implementation with the ``set_fs`` method.

    It must be stressed that ALL access to the repository goes via
    an implementation of RepositoryInterface. Further, all the
    implementation classes must be instantiated via RepositoryFactory.

    The abstraction RepositoryInterface provides for repositories
    consists of a few key concepts:

    * A repository contains data about one or more clients.
    * For each client, there is some metadata, plus a list of
      generations.
    * For each generation, there is some metadata, plus a list of
      files (where directories are treated as files).
    * For each file, there is some metadata, plus a list of chunk
      identifiers.
    * File contents data is split into chunks, each given a unique
      identifier.
    * There is optionally some indexing for content based lookups of
      chunks (e.g., look up chunks based on an MD5 checksum).
    * There are three levels of locking: the list of clients, the
      per-client data (information about generations), and the chunk
      lookup indexes are all locked individually.
    * All metadata is stored as key/value pairs, where the key is one
      of a strictly limited, version-specific list of allowed ones,
      and the value is a binary string or a 64-bit integer (the type
      depends on the key). All allowed keys are implicitly set to
      the empty string or 0 if not set otherwise.

    Further, the repository format version implementation is given a
    directory in which it stores the repository, using any number of
    files it wishes. No other files will be in that directory.
    (RepositoryFactory creates the actual directory.) The only
    restriction is that within that directory, the
    ``metadata/format`` file MUST be a plain text file (no
    encryption, compression), containing a single line, giving the
    format of the repository, as an arbitrary string. Each
    RepositoryInterface implementation will work with exactly one
    such format, and have a class attribute ``format`` which contains
    the string.

    There is no method to remove a repository. This is handled
    externally by removing the repository directory and all its
    files. Since that code is generic, it is not needed in the
    interface.

    Each RepositoryInterface implementation can have a custom
    initialiser. RepositoryFactory will know how to call it, giving
    it all the information it needs.

    Generation and chunk identifiers, as returned by this API, are
    opaque objects, which may be compared for equality, but not for
    sorting. A generation id will include information to identify
    the client it belongs to, in order to make it unnecessary to
    always specify the client.

    File metadata (stat fields, etc) are stored using individual
    file keys:

        repo.set_file_key(gen_id, filename, REPO_FILE_KEY_MTIME, mtime)

    This is to allow maximum flexibility in how data is actually
    stored in the repository, and to make the least amount of
    assumptions that will hinder convertibility between repository
    formats.

    However, storing them independently is likely to be expensive,
    and so the implementation may actually pool file key changes to
    a file and only actually encode all of them, as a blob, when the
    API user is finished with a file. There is no API call to
    indicate that explicitly, but the API implementation can deduce
    it by noticing that another file's file key, or other metadata,
    gets set. This design aims to make the API as easy to use as
    possible, by avoiding an extra "I am finished with this file for
    now" method call.

    '''

    # Operations on the repository itself.

    def set_fs(self, fs):
        '''Set the Obnam VFS instance for accessing the filesystem.'''
        raise NotImplementedError()

    def init_repo(self):
        '''Initialize a nearly-empty directory for this format version.

        The repository will contain the file ``metadata/format``,
        with the right contents, but nothing else.

        '''
        raise NotImplementedError()

    # Client list.

    def get_client_names(self):
        '''Return list of client names currently existing in the repository.'''
        raise NotImplementedError()

    def lock_client_list(self):
        '''Lock the client list for changes.'''
        raise NotImplementedError()

    def commit_client_list(self):
        '''Commit changes to client list and unlock it.'''
        raise NotImplementedError()

    def unlock_client_list(self):
        '''Forget changes to client list and unlock it.'''
        raise NotImplementedError()

    def force_client_list_lock(self):
        '''Force the client list lock.

        If the process that locked the client list is dead, this
        method forces the lock open and takes it for the calling
        process instead. Any uncommitted changes by the original
        locker will be lost.

        '''
        raise NotImplementedError()

    def add_client(self, client_name):
        '''Add a client to the client list.

        Raise RepositoryClientAlreadyExists if the client already
        exists.
        '''
        raise NotImplementedError()

    def remove_client(self, client_name):
        '''Remove a client from the client list.'''
        raise NotImplementedError()

    def rename_client(self, old_client_name, new_client_name):
        '''Rename a client to have a new name.'''
        raise NotImplementedError()

    # A particular client.

    def lock_client(self, client_name):
        '''Lock the client for changes.

        This lock must be taken for any changes to the per-client
        data, including any changes to backup generations for the
        client.

        '''
        raise NotImplementedError()

    def commit_client(self, client_name):
        '''Commit changes to client and unlock it.'''
        raise NotImplementedError()

    def unlock_client(self, client_name):
        '''Forget changes to client and unlock it.'''
        raise NotImplementedError()

    def force_client_lock(self, client_name):
        '''Force the client lock.

        If the process that locked the client is dead, this method
        forces the lock open and takes it for the calling process
        instead. Any uncommitted changes by the original locker will
        be lost.

        '''
        raise NotImplementedError()

    def get_allowed_client_keys(self):
        '''Return list of allowed per-client keys for this format.'''
        raise NotImplementedError()

    def get_client_key(self, client_name, key):
        '''Return current value of a key for a given client.

        If not set explicitly, the value is the empty string.

        If the key is not in the list of allowed keys for this
        format, raise RepositoryClientKeyNotAllowed.

        '''
        raise NotImplementedError()

    def set_client_key(self, client_name, key, value):
        '''Set value for a per-client key.'''
        raise NotImplementedError()

    def get_client_generation_ids(self, client_name):
        '''Return a list of opaque ids for generations in a client.

        The list is ordered: the first id in the list is the oldest
        generation. The ids need not be sortable, and they may or
        may not be simple types.

        '''
        raise NotImplementedError()

    def create_generation(self, client_name):
        '''Start a new generation for a client.

        Return the generation id for the new generation. The id
        implicitly also identifies the client.

        '''
        raise NotImplementedError()

    # Generations. The generation id identifies the client as well.

    def get_allowed_generation_keys(self):
        '''Return list of all allowed keys for generations.'''
        raise NotImplementedError()

    def get_generation_key(self, generation_id, key):
        '''Return current value for a generation key.'''
        raise NotImplementedError()

    def set_generation_key(self, generation_id, key, value):
        '''Set a key/value pair for a given generation.'''
        raise NotImplementedError()

    def remove_generation(self, generation_id):
        '''Remove an existing generation.

        The removed generation may be the currently unfinished one.

        '''
        raise NotImplementedError()

    def get_generation_chunk_ids(self, generation_id):
        '''Return list of chunk ids used by a generation.

        Each file lists the chunks it uses, but iterating over all
        files is expensive. This method gives a potentially more
        efficient way of getting the information.

        '''
        raise NotImplementedError()

    def interpret_generation_spec(self, client_name, genspec):
        '''Return the generation id for a user-given specification.

        The specification is a string, and either gives the number
        of a generation, or is the word 'latest'.

        The return value is a generation id usable with the
        RepositoryInterface API.

        '''
        raise NotImplementedError()

    def make_generation_spec(self, gen_id):
        '''Return a generation spec that matches a given generation id.

        If we tell the user the returned string, and they later give
        it to interpret_generation_spec, the same generation id is
        returned.

        '''
        raise NotImplementedError()

    # Individual files and directories in a generation.

    def file_exists(self, generation_id, filename):
        '''Does a file exist in a generation?

        The filename should be the full path to the file.

        '''
        raise NotImplementedError()

    def add_file(self, generation_id, filename):
        '''Adds a file to the generation.

        Any metadata about the file needs to be added with
        set_file_key.

        '''
        raise NotImplementedError()

    def remove_file(self, generation_id, filename):
        '''Removes a file from the given generation.

        The generation MUST be the newly created generation that has
        not yet been committed or unlocked.

        All the file keys associated with the file are also removed.

        '''
        raise NotImplementedError()

    def get_allowed_file_keys(self):
        '''Return list of allowed file keys for this format.'''
        raise NotImplementedError()

    def get_file_key(self, generation_id, filename, key):
        '''Return value for a file key, or empty string.

        The empty string is returned if no value has been set for the
        file key, or the file does not exist.

        '''
        raise NotImplementedError()

    def set_file_key(self, generation_id, filename, key, value):
        '''Set value for a file key.

        It is an error to set the value for a file key if the file
        does not exist yet.

        '''
        raise NotImplementedError()

    def get_file_chunk_ids(self, generation_id, filename):
        '''Get the list of chunk ids for a file.'''
        raise NotImplementedError()

    def clear_file_chunk_ids(self, generation_id, filename):
        '''Clear the list of chunk ids for a file.'''
        raise NotImplementedError()

    def append_file_chunk_id(self, generation_id, filename, chunk_id):
        '''Add a chunk id for a file.

        The chunk id is added to the end of the list of chunk ids,
        so file data ordering is preserved.

        '''
        raise NotImplementedError()

    def get_file_children(self, generation_id, filename):
        '''List contents of a directory.

        This returns a list of full pathnames for all the files in
        the repository that are direct children of the given file.
        This may fail if the given file is not a directory, but that
        is not guaranteed.

        '''
        raise NotImplementedError()

    # Chunks.

    def put_chunk_content(self, data):
        '''Add a new chunk into the repository.

        Return the chunk identifier.
        '''
        raise NotImplementedError()

    def get_chunk_content(self, chunk_id):
        '''Return the contents of a chunk, given its id.'''
        raise NotImplementedError()

    def has_chunk(self, chunk_id):
        '''Does a chunk (still) exist in the repository?'''
        raise NotImplementedError()

    def remove_chunk(self, chunk_id):
        '''Remove chunk from repository, but not chunk indexes.'''
        raise NotImplementedError()

    def lock_chunk_indexes(self):
        '''Locks chunk indexes for updates.'''
        raise NotImplementedError()

    def unlock_chunk_indexes(self):
        '''Unlocks chunk indexes without committing them.'''
        raise NotImplementedError()

    def force_chunk_indexes_lock(self):
        '''Forces a chunk index lock open and takes it for the caller.'''
        raise NotImplementedError()

    def commit_chunk_indexes(self):
        '''Commit changes to chunk indexes.'''
        raise NotImplementedError()

    def put_chunk_into_indexes(self, chunk_id, data, client_id):
        '''Adds a chunk to indexes.

        This does not do any de-duplication.

        The indexes map a chunk id to its checksum, and a checksum
        to both the chunk ids (possibly several!) and the client ids
        for the clients that use the chunk. The client ids are used
        to track when a chunk is no longer used by anyone and can be
        removed.

        '''
        raise NotImplementedError()

    def remove_chunk_from_indexes(self, chunk_id, client_id):
        '''Removes a chunk from indexes, given its id, for a given client.'''
        raise NotImplementedError()

    def find_chunk_id_by_content(self, data):
        '''Finds a chunk id given its content.

        This will raise RepositoryChunkContentNotInIndexes if the
        chunk is not in the indexes. Otherwise it will return one
        chunk id that has exactly the same content. If the indexes
        contain duplicate chunks, any one of them might be returned.

        '''
        raise NotImplementedError()

    # Fsck.

    def get_fsck_work_item(self):
        '''Return an fsck work item for checking this repository.

        The work item may spawn more items.
''' raise NotImplementedError() class RepositoryInterfaceTests(unittest.TestCase): # pragma: no cover '''Tests for implementations of RepositoryInterface. Each implementation of RepositoryInterface should have a corresponding test class, which inherits this class. The test subclass must set ``self.repo`` to an instance of the class to be tested. The repository must be empty and uninitialised. ''' # Tests for repository level things. def test_has_format_attribute(self): self.assertEqual(type(self.repo.format), str) def test_has_set_fs_method(self): # We merely test that set_fs can be called. self.assertEqual(self.repo.set_fs(None), None) # Tests for the client list. def test_has_no_clients_initially(self): self.repo.init_repo() self.assertEqual(self.repo.get_client_names(), []) def test_adds_a_client(self): self.repo.init_repo() self.repo.lock_client_list() self.repo.add_client('foo') self.assertEqual(self.repo.get_client_names(), ['foo']) def test_renames_a_client(self): self.repo.init_repo() self.repo.lock_client_list() self.repo.add_client('foo') self.repo.commit_client_list() self.repo.lock_client_list() self.repo.rename_client('foo', 'bar') self.assertEqual(self.repo.get_client_names(), ['bar']) def test_removes_a_client(self): self.repo.init_repo() self.repo.lock_client_list() self.repo.add_client('foo') self.repo.remove_client('foo') self.assertEqual(self.repo.get_client_names(), []) def test_fails_adding_existing_client(self): self.repo.init_repo() self.repo.lock_client_list() self.repo.add_client('foo') self.assertRaises( obnamlib.RepositoryClientAlreadyExists, self.repo.add_client, 'foo') def test_fails_renaming_nonexistent_client(self): self.repo.init_repo() self.repo.lock_client_list() self.assertRaises( obnamlib.RepositoryClientDoesNotExist, self.repo.rename_client, 'foo', 'bar') def test_fails_renaming_to_existing_client(self): self.repo.init_repo() self.repo.lock_client_list() self.repo.add_client('foo') self.repo.add_client('bar') 
self.repo.commit_client_list() self.repo.lock_client_list() self.assertRaises( obnamlib.RepositoryClientAlreadyExists, self.repo.rename_client, 'foo', 'bar') def test_fails_removing_nonexistent_client(self): self.repo.init_repo() self.repo.lock_client_list() self.assertRaises( obnamlib.RepositoryClientDoesNotExist, self.repo.remove_client, 'foo') def test_raises_lock_error_if_adding_client_without_locking(self): self.repo.init_repo() self.assertRaises( obnamlib.RepositoryClientListNotLocked, self.repo.add_client, 'foo') def test_raises_lock_error_if_renaming_client_without_locking(self): self.repo.init_repo() self.repo.lock_client_list() self.repo.add_client('foo') self.repo.commit_client_list() self.assertRaises( obnamlib.RepositoryClientListNotLocked, self.repo.rename_client, 'foo', 'bar') def test_raises_lock_error_if_removing_client_without_locking(self): self.repo.init_repo() self.assertRaises( obnamlib.RepositoryClientListNotLocked, self.repo.remove_client, 'foo') def test_unlocking_client_list_does_not_add_client(self): self.repo.init_repo() self.repo.lock_client_list() self.repo.add_client('foo') self.repo.unlock_client_list() self.assertEqual(self.repo.get_client_names(), []) def test_unlocking_client_list_does_not_rename_client(self): self.repo.init_repo() self.repo.lock_client_list() self.repo.add_client('foo') self.repo.commit_client_list() self.repo.lock_client_list() self.repo.rename_client('foo', 'bar') self.repo.unlock_client_list() self.assertEqual(self.repo.get_client_names(), ['foo']) def test_unlocking_client_list_does_not_remove_client(self): self.repo.init_repo() self.repo.lock_client_list() self.repo.add_client('foo') self.repo.commit_client_list() self.repo.lock_client_list() self.repo.remove_client('foo') self.repo.unlock_client_list() self.assertEqual(self.repo.get_client_names(), ['foo']) def test_committing_client_list_adds_client(self): self.repo.init_repo() self.repo.lock_client_list() self.repo.add_client('foo') 
        self.repo.commit_client_list()
        self.assertEqual(self.repo.get_client_names(), ['foo'])

    def test_committing_client_list_renames_client(self):
        self.repo.init_repo()
        self.repo.lock_client_list()
        self.repo.add_client('foo')
        self.repo.commit_client_list()
        self.repo.lock_client_list()
        self.repo.rename_client('foo', 'bar')
        self.repo.commit_client_list()
        self.assertEqual(self.repo.get_client_names(), ['bar'])

    def test_committing_client_list_removes_client(self):
        self.repo.init_repo()
        self.repo.lock_client_list()
        self.repo.add_client('foo')
        self.repo.commit_client_list()
        self.repo.lock_client_list()
        self.repo.remove_client('foo')
        self.repo.commit_client_list()
        self.assertEqual(self.repo.get_client_names(), [])

    def test_committing_client_list_removes_lock(self):
        self.repo.init_repo()
        self.repo.lock_client_list()
        self.repo.commit_client_list()
        self.repo.lock_client_list()
        self.assertEqual(self.repo.get_client_names(), [])

    def test_unlocking_client_list_removes_lock(self):
        self.repo.init_repo()
        self.repo.lock_client_list()
        self.repo.unlock_client_list()
        self.repo.lock_client_list()
        self.assertEqual(self.repo.get_client_names(), [])

    def test_locking_client_list_twice_fails(self):
        self.repo.init_repo()
        self.repo.lock_client_list()
        self.assertRaises(
            obnamlib.RepositoryClientListLockingFailed,
            self.repo.lock_client_list)

    def test_unlocking_client_list_when_unlocked_fails(self):
        self.repo.init_repo()
        self.assertRaises(
            obnamlib.RepositoryClientListNotLocked,
            self.repo.unlock_client_list)

    def test_committing_client_list_when_unlocked_fails(self):
        self.repo.init_repo()
        self.assertRaises(
            obnamlib.RepositoryClientListNotLocked,
            self.repo.commit_client_list)

    def test_forces_client_list_lock(self):
        self.repo.init_repo()
        self.repo.lock_client_list()
        self.repo.add_client('bar')
        self.repo.force_client_list_lock()
        self.repo.add_client('foo')
        self.assertEqual(self.repo.get_client_names(), ['foo'])

    # Tests for client specific stuff.
    def setup_client(self):
        self.repo.lock_client_list()
        self.repo.add_client('fooclient')
        self.repo.commit_client_list()

    def test_locking_client_twice_fails(self):
        self.setup_client()
        self.repo.lock_client('fooclient')
        self.assertRaises(
            obnamlib.RepositoryClientLockingFailed,
            self.repo.lock_client, 'fooclient')

    def test_unlocking_client_when_unlocked_fails(self):
        self.setup_client()
        self.assertRaises(
            obnamlib.RepositoryClientNotLocked,
            self.repo.unlock_client, 'fooclient')

    def test_committing_client_when_unlocked_fails(self):
        self.setup_client()
        self.assertRaises(
            obnamlib.RepositoryClientNotLocked,
            self.repo.commit_client, 'fooclient')

    def test_unlocking_nonexistent_client_fails(self):
        self.setup_client()
        self.assertRaises(
            obnamlib.RepositoryClientDoesNotExist,
            self.repo.unlock_client, 'notexist')

    def test_committing_nonexistent_client_fails(self):
        self.setup_client()
        self.assertRaises(
            obnamlib.RepositoryClientDoesNotExist,
            self.repo.commit_client, 'notexist')

    def test_unlocking_client_removes_lock(self):
        self.setup_client()
        self.repo.lock_client('fooclient')
        self.repo.unlock_client('fooclient')
        self.assertEqual(self.repo.lock_client('fooclient'), None)

    def test_committing_client_removes_lock(self):
        self.setup_client()
        self.repo.lock_client('fooclient')
        self.repo.commit_client('fooclient')
        self.assertEqual(self.repo.lock_client('fooclient'), None)

    def test_has_list_of_allowed_client_keys(self):
        self.assertEqual(type(self.repo.get_allowed_client_keys()), list)

    def test_gets_all_allowed_client_keys(self):
        self.setup_client()
        for key in self.repo.get_allowed_client_keys():
            value = self.repo.get_client_key('fooclient', key)
            self.assertEqual(type(value), str)

    def client_test_key_is_allowed(self):
        return (obnamlib.REPO_CLIENT_TEST_KEY in
                self.repo.get_allowed_client_keys())

    def test_has_empty_string_for_client_test_key(self):
        if self.client_test_key_is_allowed():
            self.setup_client()
            value = self.repo.get_client_key(
                'fooclient', obnamlib.REPO_CLIENT_TEST_KEY)
            self.assertEqual(value, '')

    def test_sets_client_key(self):
        if self.client_test_key_is_allowed():
            self.setup_client()
            self.repo.lock_client('fooclient')
            self.repo.set_client_key(
                'fooclient', obnamlib.REPO_CLIENT_TEST_KEY, 'bar')
            value = self.repo.get_client_key(
                'fooclient', obnamlib.REPO_CLIENT_TEST_KEY)
            self.assertEqual(value, 'bar')

    def test_setting_unallowed_client_key_fails(self):
        self.setup_client()
        self.repo.lock_client('fooclient')
        self.assertRaises(
            obnamlib.RepositoryClientKeyNotAllowed,
            self.repo.set_client_key, 'fooclient', WRONG_KEY, '')

    def test_setting_client_key_without_locking_fails(self):
        if self.client_test_key_is_allowed():
            self.setup_client()
            self.assertRaises(
                obnamlib.RepositoryClientNotLocked,
                self.repo.set_client_key,
                'fooclient', obnamlib.REPO_CLIENT_TEST_KEY, 'bar')

    def test_committing_client_preserves_key_changes(self):
        if self.client_test_key_is_allowed():
            self.setup_client()
            self.repo.lock_client('fooclient')
            self.repo.set_client_key(
                'fooclient', obnamlib.REPO_CLIENT_TEST_KEY, 'bar')
            value = self.repo.get_client_key(
                'fooclient', obnamlib.REPO_CLIENT_TEST_KEY)
            self.repo.commit_client('fooclient')
            self.assertEqual(value, 'bar')

    def test_unlocking_client_undoes_key_changes(self):
        if self.client_test_key_is_allowed():
            self.setup_client()
            self.repo.lock_client('fooclient')
            self.repo.set_client_key(
                'fooclient', obnamlib.REPO_CLIENT_TEST_KEY, 'bar')
            self.repo.unlock_client('fooclient')
            value = self.repo.get_client_key(
                'fooclient', obnamlib.REPO_CLIENT_TEST_KEY)
            self.assertEqual(value, '')

    def test_getting_client_key_for_unknown_client_fails(self):
        if self.client_test_key_is_allowed():
            self.setup_client()
            self.assertRaises(
                obnamlib.RepositoryClientDoesNotExist,
                self.repo.get_client_key,
                'notexistclient', obnamlib.REPO_CLIENT_TEST_KEY)

    def test_new_client_has_no_generations(self):
        self.setup_client()
        self.assertEqual(self.repo.get_client_generation_ids('fooclient'), [])

    def test_creates_new_generation(self):
        self.setup_client()
        self.repo.lock_client('fooclient')
        new_id = self.repo.create_generation('fooclient')
        self.assertEqual(
            self.repo.get_client_generation_ids('fooclient'), [new_id])

    def test_creating_generation_fails_current_generation_unfinished(self):
        self.setup_client()
        self.repo.lock_client('fooclient')
        self.repo.create_generation('fooclient')
        self.assertRaises(
            obnamlib.RepositoryClientGenerationUnfinished,
            self.repo.create_generation, 'fooclient')

    def test_creating_generation_fails_if_client_is_unlocked(self):
        self.setup_client()
        self.assertRaises(
            obnamlib.RepositoryClientNotLocked,
            self.repo.create_generation, 'fooclient')

    def test_unlocking_client_removes_created_generation(self):
        self.setup_client()
        self.repo.lock_client('fooclient')
        new_id = self.repo.create_generation('fooclient')
        self.repo.unlock_client('fooclient')
        self.assertEqual(self.repo.get_client_generation_ids('fooclient'), [])

    def test_committing_client_keeps_created_generation(self):
        self.setup_client()
        self.repo.lock_client('fooclient')
        new_id = self.repo.create_generation('fooclient')
        self.repo.commit_client('fooclient')
        self.assertEqual(
            self.repo.get_client_generation_ids('fooclient'), [new_id])

    # Operations on one generation.
    def create_generation(self):
        self.setup_client()
        self.repo.lock_client('fooclient')
        return self.repo.create_generation('fooclient')

    def generation_test_key_is_allowed(self):
        return (obnamlib.REPO_GENERATION_TEST_KEY in
                self.repo.get_allowed_generation_keys())

    def test_has_list_of_allowed_generation_keys(self):
        self.assertEqual(type(self.repo.get_allowed_generation_keys()), list)

    def test_gets_all_allowed_generation_keys(self):
        gen_id = self.create_generation()
        for key in self.repo.get_allowed_generation_keys():
            value = self.repo.get_generation_key(gen_id, key)
            self.assertEqual(type(value), str)

    def test_has_empty_string_for_generation_test_key(self):
        if self.generation_test_key_is_allowed():
            gen_id = self.create_generation()
            value = self.repo.get_generation_key(
                gen_id, obnamlib.REPO_GENERATION_TEST_KEY)
            self.assertEqual(value, '')

    def test_sets_generation_key(self):
        if self.generation_test_key_is_allowed():
            gen_id = self.create_generation()
            self.repo.set_generation_key(
                gen_id, obnamlib.REPO_GENERATION_TEST_KEY, 'bar')
            value = self.repo.get_generation_key(
                gen_id, obnamlib.REPO_GENERATION_TEST_KEY)
            self.assertEqual(value, 'bar')

    def test_setting_unallowed_generation_key_fails(self):
        if self.generation_test_key_is_allowed():
            gen_id = self.create_generation()
            self.assertRaises(
                obnamlib.RepositoryGenerationKeyNotAllowed,
                self.repo.set_generation_key, gen_id, WRONG_KEY, '')

    def test_setting_generation_key_without_locking_fails(self):
        if self.generation_test_key_is_allowed():
            gen_id = self.create_generation()
            self.repo.commit_client('fooclient')
            self.assertRaises(
                obnamlib.RepositoryClientNotLocked,
                self.repo.set_generation_key,
                gen_id, obnamlib.REPO_GENERATION_TEST_KEY, 'bar')

    def test_committing_client_preserves_generation_key_changes(self):
        if self.generation_test_key_is_allowed():
            gen_id = self.create_generation()
            self.repo.set_generation_key(
                gen_id, obnamlib.REPO_GENERATION_TEST_KEY, 'bar')
            value = self.repo.get_generation_key(
                gen_id, obnamlib.REPO_GENERATION_TEST_KEY)
            self.repo.commit_client('fooclient')
            self.assertEqual(value, 'bar')

    def test_unlocking_client_undoes_generation_key_changes(self):
        if self.generation_test_key_is_allowed():
            gen_id = self.create_generation()
            self.repo.set_generation_key(
                gen_id, obnamlib.REPO_GENERATION_TEST_KEY, 'bar')
            self.repo.unlock_client('fooclient')
            value = self.repo.get_generation_key(
                gen_id, obnamlib.REPO_GENERATION_TEST_KEY)
            self.assertEqual(value, '')

    def test_removes_unfinished_generation(self):
        gen_id = self.create_generation()
        self.repo.remove_generation(gen_id)
        self.assertEqual(self.repo.get_client_generation_ids('fooclient'), [])

    def test_removes_finished_generation(self):
        gen_id = self.create_generation()
        self.repo.commit_client('fooclient')
        self.repo.lock_client('fooclient')
        self.repo.remove_generation(gen_id)
        self.assertEqual(self.repo.get_client_generation_ids('fooclient'), [])

    def test_removing_removed_generation_fails(self):
        gen_id = self.create_generation()
        self.repo.remove_generation(gen_id)
        self.assertRaises(
            obnamlib.RepositoryGenerationDoesNotExist,
            self.repo.remove_generation, gen_id)

    def test_removing_generation_without_client_lock_fails(self):
        gen_id = self.create_generation()
        self.repo.commit_client('fooclient')
        self.assertRaises(
            obnamlib.RepositoryClientNotLocked,
            self.repo.remove_generation, gen_id)

    def test_unlocking_client_forgets_generation_removal(self):
        gen_id = self.create_generation()
        self.repo.commit_client('fooclient')
        self.repo.lock_client('fooclient')
        self.repo.remove_generation(gen_id)
        self.repo.unlock_client('fooclient')
        self.assertEqual(
            self.repo.get_client_generation_ids('fooclient'), [gen_id])

    def test_committing_client_actually_removes_generation(self):
        gen_id = self.create_generation()
        self.repo.remove_generation(gen_id)
        self.repo.commit_client('fooclient')
        self.assertEqual(self.repo.get_client_generation_ids('fooclient'), [])

    def test_empty_generation_uses_no_chunk_ids(self):
        gen_id = self.create_generation()
        self.assertEqual(self.repo.get_generation_chunk_ids(gen_id), [])

    def test_interprets_latest_as_a_generation_spec(self):
        gen_id = self.create_generation()
        self.assertEqual(
            self.repo.interpret_generation_spec('fooclient', 'latest'),
            gen_id)

    def test_interpreting_latest_genspec_without_generations_fails(self):
        self.setup_client()
        self.assertRaises(
            obnamlib.RepositoryClientHasNoGenerations,
            self.repo.interpret_generation_spec, 'fooclient', 'latest')

    def test_interprets_generation_spec(self):
        gen_id = self.create_generation()
        genspec = self.repo.make_generation_spec(gen_id)
        self.assertEqual(
            self.repo.interpret_generation_spec('fooclient', genspec),
            gen_id)

    def test_interpreting_generation_spec_for_removed_generation_fails(self):
        # Note that we must have at least one generation, after removing
        # one.
        gen_id = self.create_generation()
        self.repo.commit_client('fooclient')
        self.repo.lock_client('fooclient')
        gen_id_2 = self.repo.create_generation('fooclient')
        genspec = self.repo.make_generation_spec(gen_id)
        self.repo.remove_generation(gen_id)
        self.assertRaises(
            obnamlib.RepositoryGenerationDoesNotExist,
            self.repo.interpret_generation_spec, 'fooclient', genspec)

    # Tests for individual files in a generation.
    def test_file_does_not_exist(self):
        gen_id = self.create_generation()
        self.assertFalse(self.repo.file_exists(gen_id, '/foo/bar'))

    def test_adds_file(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.assertTrue(self.repo.file_exists(gen_id, '/foo/bar'))

    def test_unlocking_forgets_file_add(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.unlock_client('fooclient')
        self.assertFalse(self.repo.file_exists(gen_id, '/foo/bar'))

    def test_committing_remembers_file_add(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.commit_client('fooclient')
        self.assertTrue(self.repo.file_exists(gen_id, '/foo/bar'))

    def test_creating_generation_clones_previous_one(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.commit_client('fooclient')
        self.repo.lock_client('fooclient')
        gen_id_2 = self.repo.create_generation('fooclient')
        self.assertTrue(self.repo.file_exists(gen_id_2, '/foo/bar'))

    def test_removes_added_file_from_current_generation(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.remove_file(gen_id, '/foo/bar')
        self.assertFalse(self.repo.file_exists(gen_id, '/foo/bar'))

    def test_unlocking_forgets_file_removal(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.commit_client('fooclient')
        self.repo.lock_client('fooclient')
        gen_id_2 = self.repo.create_generation('fooclient')
        self.repo.remove_file(gen_id, '/foo/bar')
        self.repo.unlock_client('fooclient')
        self.assertTrue(self.repo.file_exists(gen_id, '/foo/bar'))

    def test_committing_remembers_file_removal(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.commit_client('fooclient')
        self.repo.lock_client('fooclient')
        gen_id_2 = self.repo.create_generation('fooclient')
        self.assertTrue(self.repo.file_exists(gen_id_2, '/foo/bar'))
        self.repo.remove_file(gen_id_2, '/foo/bar')
        self.repo.commit_client('fooclient')
        self.assertTrue(self.repo.file_exists(gen_id, '/foo/bar'))
        self.assertFalse(self.repo.file_exists(gen_id_2, '/foo/bar'))

    def test_has_list_of_allowed_file_keys(self):
        self.assertEqual(type(self.repo.get_allowed_file_keys()), list)

    def test_gets_all_allowed_file_keys(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        for key in self.repo.get_allowed_file_keys():
            value = self.repo.get_file_key(gen_id, '/foo/bar', key)
            if key in REPO_FILE_INTEGER_KEYS:
                self.assertEqual(type(value), int)
            else:
                self.assertEqual(type(value), str)

    def test_has_empty_string_for_file_test_key(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        value = self.repo.get_file_key(
            gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)
        self.assertEqual(value, '')

    def test_get_file_key_fails_for_nonexistent_generation(self):
        gen_id = self.create_generation()
        self.repo.remove_generation(gen_id)
        self.assertRaises(
            obnamlib.RepositoryGenerationDoesNotExist,
            self.repo.get_file_key,
            gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)

    def test_get_file_key_fails_for_forbidden_key(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.assertRaises(
            obnamlib.RepositoryFileKeyNotAllowed,
            self.repo.get_file_key,
            gen_id, '/foo/bar', WRONG_KEY)

    def test_get_file_key_fails_for_nonexistent_file(self):
        gen_id = self.create_generation()
        self.assertRaises(
            obnamlib.RepositoryFileDoesNotExistInGeneration,
            self.repo.get_file_key,
            gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)

    def test_sets_file_key(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.set_file_key(
            gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY, 'yoyo')
        value = self.repo.get_file_key(
            gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)
        self.assertEqual(value, 'yoyo')

    def test_setting_unallowed_file_key_fails(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.assertRaises(
            obnamlib.RepositoryFileKeyNotAllowed,
            self.repo.set_file_key,
            gen_id, '/foo/bar', WRONG_KEY, 'yoyo')

    def test_file_has_zero_mtime_by_default(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        value = self.repo.get_file_key(
            gen_id, '/foo/bar', obnamlib.REPO_FILE_MTIME)
        self.assertEqual(value, 0)

    def test_sets_file_mtime(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.set_file_key(
            gen_id, '/foo/bar', obnamlib.REPO_FILE_MTIME, 123)
        value = self.repo.get_file_key(
            gen_id, '/foo/bar', obnamlib.REPO_FILE_MTIME)
        self.assertEqual(value, 123)

    def test_set_file_key_fails_for_nonexistent_generation(self):
        gen_id = self.create_generation()
        self.repo.remove_generation(gen_id)
        self.assertRaises(
            obnamlib.RepositoryGenerationDoesNotExist,
            self.repo.set_file_key,
            gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY, 'yoyo')

    def test_setting_file_key_for_nonexistent_file_fails(self):
        gen_id = self.create_generation()
        self.assertRaises(
            obnamlib.RepositoryFileDoesNotExistInGeneration,
            self.repo.set_file_key,
            gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY, 'yoyo')

    # FIXME: These tests fail due to ClientMetadataTree brokenness, it seems.
    # They're disabled, for now. The bug is not exposed by existing code,
    # only by the new interface's tests.
    if False:

        def test_removing_file_removes_all_its_file_keys(self):
            gen_id = self.create_generation()
            self.repo.add_file(gen_id, '/foo/bar')
            self.repo.set_file_key(
                gen_id, '/foo/bar', obnamlib.REPO_FILE_MTIME, 123)

            # Remove the file. Key should be removed.
            self.repo.remove_file(gen_id, '/foo/bar')
            self.assertRaises(
                obnamlib.RepositoryFileDoesNotExistInGeneration,
                self.repo.get_file_key,
                gen_id, '/foo/bar', obnamlib.REPO_FILE_MTIME)

            # Add the file back. Key should still be removed.
            self.repo.add_file(gen_id, '/foo/bar')
            value = self.repo.get_file_key(
                gen_id, '/foo/bar', obnamlib.REPO_FILE_MTIME)
            self.assertEqual(value, 0)

        def test_can_add_a_file_then_remove_then_add_it_again(self):
            gen_id = self.create_generation()
            self.repo.add_file(gen_id, '/foo/bar')
            self.assertTrue(self.repo.file_exists(gen_id, '/foo/bar'))
            self.repo.remove_file(gen_id, '/foo/bar')
            self.assertFalse(self.repo.file_exists(gen_id, '/foo/bar'))
            self.repo.add_file(gen_id, '/foo/bar')
            self.assertTrue(self.repo.file_exists(gen_id, '/foo/bar'))

    def test_unlocking_client_forgets_set_file_keys(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.set_file_key(
            gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY, 'yoyo')
        self.repo.unlock_client('fooclient')
        self.assertRaises(
            obnamlib.RepositoryGenerationDoesNotExist,
            self.repo.get_file_key,
            gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)

    def test_committing_client_remembers_set_file_keys(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.set_file_key(
            gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY, 'yoyo')
        self.repo.commit_client('fooclient')
        value = self.repo.get_file_key(
            gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)
        self.assertEqual(value, 'yoyo')

    def test_setting_file_key_does_not_affect_previous_generation(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.set_file_key(
            gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY, 'first')
        self.repo.commit_client('fooclient')
        self.repo.lock_client('fooclient')
        gen_id_2 = self.repo.create_generation('fooclient')
        self.repo.set_file_key(
            gen_id_2, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY, 'second')
        self.repo.commit_client('fooclient')
        value = self.repo.get_file_key(
            gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)
        self.assertEqual(value, 'first')
        value_2 = self.repo.get_file_key(
            gen_id_2, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)
        self.assertEqual(value_2, 'second')

    def test_new_file_has_no_chunk_ids(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.assertEqual(self.repo.get_file_chunk_ids(gen_id, '/foo/bar'), [])

    def test_getting_file_chunk_ids_for_nonexistent_generation_fails(self):
        gen_id = self.create_generation()
        self.repo.remove_generation(gen_id)
        self.assertRaises(
            obnamlib.RepositoryGenerationDoesNotExist,
            self.repo.get_file_chunk_ids, gen_id, '/foo/bar')

    def test_getting_file_chunk_ids_for_nonexistent_file_fails(self):
        gen_id = self.create_generation()
        self.assertRaises(
            obnamlib.RepositoryFileDoesNotExistInGeneration,
            self.repo.get_file_chunk_ids, gen_id, '/foo/bar')

    def test_appends_one_file_chunk_id(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.append_file_chunk_id(gen_id, '/foo/bar', 1)
        self.assertEqual(
            self.repo.get_file_chunk_ids(gen_id, '/foo/bar'), [1])

    def test_appends_two_file_chunk_ids(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.append_file_chunk_id(gen_id, '/foo/bar', 1)
        self.repo.append_file_chunk_id(gen_id, '/foo/bar', 2)
        self.assertEqual(
            self.repo.get_file_chunk_ids(gen_id, '/foo/bar'), [1, 2])

    def test_appending_file_chunk_ids_in_nonexistent_generation_fails(self):
        gen_id = self.create_generation()
        self.repo.remove_generation(gen_id)
        self.assertRaises(
            obnamlib.RepositoryGenerationDoesNotExist,
            self.repo.append_file_chunk_id, gen_id, '/foo/bar', 1)

    def test_appending_file_chunk_ids_to_nonexistent_file_fails(self):
        gen_id = self.create_generation()
        self.assertRaises(
            obnamlib.RepositoryFileDoesNotExistInGeneration,
            self.repo.append_file_chunk_id, gen_id, '/foo/bar', 1)

    def test_adding_chunk_id_to_file_adds_it_to_generation_chunk_ids(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.append_file_chunk_id(gen_id, '/foo/bar', 1)
        self.assertEqual(self.repo.get_generation_chunk_ids(gen_id), [1])

    def test_clears_file_chunk_ids(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.append_file_chunk_id(gen_id, '/foo/bar', 1)
        self.repo.clear_file_chunk_ids(gen_id, '/foo/bar')
        self.assertEqual(self.repo.get_file_chunk_ids(gen_id, '/foo/bar'), [])

    def test_clearing_file_chunk_ids_in_nonexistent_generation_fails(self):
        gen_id = self.create_generation()
        self.repo.remove_generation(gen_id)
        self.assertRaises(
            obnamlib.RepositoryGenerationDoesNotExist,
            self.repo.clear_file_chunk_ids, gen_id, '/foo/bar')

    def test_clearing_file_chunk_ids_for_nonexistent_file_fails(self):
        gen_id = self.create_generation()
        self.assertRaises(
            obnamlib.RepositoryFileDoesNotExistInGeneration,
            self.repo.clear_file_chunk_ids, gen_id, '/foo/bar')

    def test_unlocking_client_forgets_modified_file_chunk_ids(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.append_file_chunk_id(gen_id, '/foo/bar', 1)
        self.repo.commit_client('fooclient')
        self.repo.lock_client('fooclient')
        gen_id_2 = self.repo.create_generation('fooclient')
        self.repo.append_file_chunk_id(gen_id_2, '/foo/bar', 2)
        self.assertEqual(
            self.repo.get_file_chunk_ids(gen_id_2, '/foo/bar'), [1, 2])
        self.repo.unlock_client('fooclient')
        self.assertEqual(
            self.repo.get_file_chunk_ids(gen_id, '/foo/bar'), [1])

    def test_committing_client_remembers_modified_file_chunk_ids(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.repo.append_file_chunk_id(gen_id, '/foo/bar', 1)
        self.repo.commit_client('fooclient')
        self.repo.lock_client('fooclient')
        gen_id_2 = self.repo.create_generation('fooclient')
        self.repo.append_file_chunk_id(gen_id_2, '/foo/bar', 2)
        self.assertEqual(
            self.repo.get_file_chunk_ids(gen_id_2, '/foo/bar'), [1, 2])
        self.repo.commit_client('fooclient')
        self.assertEqual(
            self.repo.get_file_chunk_ids(gen_id, '/foo/bar'), [1])
        self.assertEqual(
            self.repo.get_file_chunk_ids(gen_id_2, '/foo/bar'), [1, 2])

    def test_new_file_has_no_children(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo/bar')
        self.assertEqual(self.repo.get_file_children(gen_id, '/foo/bar'), [])

    def test_gets_file_child(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/foo')
        self.repo.add_file(gen_id, '/foo/bar')
        self.assertEqual(
            self.repo.get_file_children(gen_id, '/foo'), ['/foo/bar'])

    def test_gets_only_immediate_child_for_file(self):
        gen_id = self.create_generation()
        self.repo.add_file(gen_id, '/')
        self.repo.add_file(gen_id, '/foo')
        self.repo.add_file(gen_id, '/foo/bar')
        self.assertEqual(
            self.repo.get_file_children(gen_id, '/'), ['/foo'])

    # Chunk and chunk indexes.

    def test_puts_chunk_into_repository(self):
        chunk_id = self.repo.put_chunk_content('foochunk')
        self.assertTrue(self.repo.has_chunk(chunk_id))
        self.assertEqual(self.repo.get_chunk_content(chunk_id), 'foochunk')

    def test_removes_chunk(self):
        chunk_id = self.repo.put_chunk_content('foochunk')
        self.repo.remove_chunk(chunk_id)
        self.assertFalse(self.repo.has_chunk(chunk_id))
        self.assertRaises(
            obnamlib.RepositoryChunkDoesNotExist,
            self.repo.get_chunk_content, chunk_id)

    def test_removing_nonexistent_chunk_fails(self):
        chunk_id = self.repo.put_chunk_content('foochunk')
        self.repo.remove_chunk(chunk_id)
        self.assertRaises(
            obnamlib.RepositoryChunkDoesNotExist,
            self.repo.remove_chunk, chunk_id)

    def test_adds_chunk_to_indexes(self):
        self.repo.lock_chunk_indexes()
        chunk_id = self.repo.put_chunk_content('foochunk')
        self.repo.put_chunk_into_indexes(chunk_id, 'foochunk', 123)
        self.assertEqual(
            self.repo.find_chunk_id_by_content('foochunk'), chunk_id)

    def test_removes_chunk_from_indexes(self):
        self.repo.lock_chunk_indexes()
        chunk_id = self.repo.put_chunk_content('foochunk')
        self.repo.put_chunk_into_indexes(chunk_id, 'foochunk', 123)
        self.repo.remove_chunk_from_indexes(chunk_id, 123)
        self.assertRaises(
            obnamlib.RepositoryChunkContentNotInIndexes,
            self.repo.find_chunk_id_by_content, 'foochunk')

    def test_putting_chunk_to_indexes_without_locking_them_fails(self):
        chunk_id = self.repo.put_chunk_content('foochunk')
        self.assertRaises(
            obnamlib.RepositoryChunkIndexesNotLocked,
            self.repo.put_chunk_into_indexes,
            chunk_id, 'foochunk', 123)

    def test_removing_chunk_from_indexes_without_locking_them_fails(self):
        chunk_id = self.repo.put_chunk_content('foochunk')
        self.repo.lock_chunk_indexes()
        self.repo.put_chunk_into_indexes(chunk_id, 'foochunk', 123)
        self.repo.commit_chunk_indexes()
        self.assertRaises(
            obnamlib.RepositoryChunkIndexesNotLocked,
            self.repo.remove_chunk_from_indexes, chunk_id, 123)

    def test_unlocking_chunk_indexes_forgets_changes(self):
        chunk_id = self.repo.put_chunk_content('foochunk')
        self.repo.lock_chunk_indexes()
        self.repo.put_chunk_into_indexes(chunk_id, 'foochunk', 123)
        self.repo.unlock_chunk_indexes()
        self.assertRaises(
            obnamlib.RepositoryChunkContentNotInIndexes,
            self.repo.find_chunk_id_by_content, 'foochunk')

    def test_committing_chunk_indexes_remembers_changes(self):
        chunk_id = self.repo.put_chunk_content('foochunk')
        self.repo.lock_chunk_indexes()
        self.repo.put_chunk_into_indexes(chunk_id, 'foochunk', 123)
        self.repo.commit_chunk_indexes()
        self.assertEqual(
            self.repo.find_chunk_id_by_content('foochunk'), chunk_id)

    def test_locking_chunk_indexes_twice_fails(self):
        self.repo.lock_chunk_indexes()
        self.assertRaises(
            obnamlib.RepositoryChunkIndexesLockingFailed,
            self.repo.lock_chunk_indexes)

    def test_unlocking_unlocked_chunk_indexes_fails(self):
        self.assertRaises(
            obnamlib.RepositoryChunkIndexesNotLocked,
            self.repo.unlock_chunk_indexes)

    def test_forces_chunk_index_lock(self):
        self.repo.lock_chunk_indexes()
        self.repo.force_chunk_indexes_lock()
        self.assertEqual(self.repo.unlock_chunk_indexes(), None)

    # Fsck.
    def test_returns_fsck_work_item(self):
        self.assertNotEqual(self.repo.get_fsck_work_item(), None)


# obnam-1.6.1/obnamlib/repo_tests.py

# Copyright (C) 2010-2011  Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.


import hashlib
import os
import shutil
import stat
import tempfile
import time
import unittest

import obnamlib


class RepositoryRootNodeTests(unittest.TestCase):

    def setUp(self):
        self.tempdir = tempfile.mkdtemp()
        self.fs = obnamlib.LocalFS(self.tempdir)
        self.repo = obnamlib.Repository(
            self.fs, obnamlib.DEFAULT_NODE_SIZE,
            obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE, obnamlib.DEFAULT_LRU_SIZE,
            None, obnamlib.IDPATH_DEPTH, obnamlib.IDPATH_BITS,
            obnamlib.IDPATH_SKIP, time.time, 0, '')
        self.otherfs = obnamlib.LocalFS(self.tempdir)
        self.other = obnamlib.Repository(
            self.fs, obnamlib.DEFAULT_NODE_SIZE,
            obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE, obnamlib.DEFAULT_LRU_SIZE,
            None, obnamlib.IDPATH_DEPTH, obnamlib.IDPATH_BITS,
            obnamlib.IDPATH_SKIP, time.time, 0, '')

    def tearDown(self):
        shutil.rmtree(self.tempdir)

    def test_has_format_version(self):
        self.assert_(hasattr(self.repo, 'format_version'))

    def test_accepts_same_format_version(self):
        self.assert_(self.repo.acceptable_version(self.repo.format_version))

    def test_does_not_accept_older_format_version(self):
        older_version = self.repo.format_version - 1
        self.assertFalse(self.repo.acceptable_version(older_version))
    def test_does_not_accept_newer_version(self):
        newer_version = self.repo.format_version + 1
        self.assertFalse(self.repo.acceptable_version(newer_version))

    def test_has_none_version_for_empty_repository(self):
        self.assertEqual(self.repo.get_format_version(), None)

    def test_creates_repository_with_format_version(self):
        self.repo.lock_root()
        self.assertEqual(self.repo.get_format_version(),
                         self.repo.format_version)

    def test_lists_no_clients(self):
        self.assertEqual(self.repo.list_clients(), [])

    def test_has_not_got_root_node_lock(self):
        self.assertFalse(self.repo.got_root_lock)

    def test_locks_root_node(self):
        self.repo.lock_root()
        self.assert_(self.repo.got_root_lock)

    def test_locking_root_node_twice_fails(self):
        self.repo.lock_root()
        self.assertRaises(obnamlib.Error, self.repo.lock_root)

    def test_commit_releases_lock(self):
        self.repo.lock_root()
        self.repo.commit_root()
        self.assertFalse(self.repo.got_root_lock)

    def test_unlock_releases_lock(self):
        self.repo.lock_root()
        self.repo.unlock_root()
        self.assertFalse(self.repo.got_root_lock)

    def test_commit_without_lock_fails(self):
        self.assertRaises(obnamlib.LockFail, self.repo.commit_root)

    def test_unlock_root_without_lock_fails(self):
        self.assertRaises(obnamlib.LockFail, self.repo.unlock_root)

    def test_commit_when_locked_by_other_fails(self):
        self.other.lock_root()
        self.assertRaises(obnamlib.LockFail, self.repo.commit_root)

    def test_unlock_root_when_locked_by_other_fails(self):
        self.other.lock_root()
        self.assertRaises(obnamlib.LockFail, self.repo.unlock_root)

    def test_on_disk_repository_has_no_version_initially(self):
        self.assertEqual(self.repo.get_format_version(), None)

    def test_lock_root_adds_version(self):
        self.repo.lock_root()
        self.assertEqual(self.repo.get_format_version(),
                         self.repo.format_version)

    def test_lock_root_fails_if_format_is_incompatible(self):
        self.repo._write_format_version(0)
        self.assertRaises(obnamlib.BadFormat, self.repo.lock_root)

    def test_list_clients_fails_if_format_is_incompatible(self):
        self.repo._write_format_version(0)
        self.assertRaises(obnamlib.BadFormat, self.repo.list_clients)

    def test_locks_shared(self):
        self.repo.lock_shared()
        self.assertTrue(self.repo.got_shared_lock)

    def test_locking_shared_twice_fails(self):
        self.repo.lock_shared()
        self.assertRaises(obnamlib.Error, self.repo.lock_shared)

    def test_unlocks_shared(self):
        self.repo.lock_shared()
        self.repo.unlock_shared()
        self.assertFalse(self.repo.got_shared_lock)

    def test_unlock_shared_when_locked_by_other_fails(self):
        self.other.lock_shared()
        self.assertRaises(obnamlib.LockFail, self.repo.unlock_shared)

    def test_lock_client_fails_if_format_is_incompatible(self):
        self.repo._write_format_version(0)
        self.assertRaises(obnamlib.BadFormat, self.repo.lock_client, 'foo')

    def test_open_client_fails_if_format_is_incompatible(self):
        self.repo._write_format_version(0)
        self.assertRaises(obnamlib.BadFormat, self.repo.open_client, 'foo')

    def test_adding_client_without_root_lock_fails(self):
        self.assertRaises(obnamlib.LockFail, self.repo.add_client, 'foo')

    def test_adds_client(self):
        self.repo.lock_root()
        self.repo.add_client('foo')
        self.assertEqual(self.repo.list_clients(), ['foo'])

    def test_adds_two_clients_across_commits(self):
        self.repo.lock_root()
        self.repo.add_client('foo')
        self.repo.commit_root()
        self.repo.lock_root()
        self.repo.add_client('bar')
        self.repo.commit_root()
        self.assertEqual(sorted(self.repo.list_clients()), ['bar', 'foo'])

    def test_adds_client_that_persists_after_commit(self):
        self.repo.lock_root()
        self.repo.add_client('foo')
        self.repo.commit_root()
        s2 = obnamlib.Repository(
            self.fs, obnamlib.DEFAULT_NODE_SIZE,
            obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE, obnamlib.DEFAULT_LRU_SIZE,
            None, obnamlib.IDPATH_DEPTH, obnamlib.IDPATH_BITS,
            obnamlib.IDPATH_SKIP, time.time, 0, '')
        self.assertEqual(s2.list_clients(), ['foo'])

    def test_adding_existing_client_fails(self):
        self.repo.lock_root()
        self.repo.add_client('foo')
        self.assertRaises(obnamlib.Error, self.repo.add_client, 'foo')

    def test_removing_client_without_root_lock_fails(self):
        self.assertRaises(obnamlib.LockFail, self.repo.remove_client, 'foo')

    def test_removing_nonexistent_client_fails(self):
        self.repo.lock_root()
        self.assertRaises(obnamlib.Error, self.repo.remove_client, 'foo')

    def test_removing_client_works(self):
        self.repo.lock_root()
        self.repo.add_client('foo')
        self.repo.remove_client('foo')
        self.assertEqual(self.repo.list_clients(), [])

    def test_removing_client_persists_past_commit(self):
        self.repo.lock_root()
        self.repo.add_client('foo')
        self.repo.remove_client('foo')
        self.repo.commit_root()
        self.assertEqual(self.repo.list_clients(), [])

    def test_adding_client_without_commit_does_not_happen(self):
        self.repo.lock_root()
        self.repo.add_client('foo')
        self.repo.unlock_root()
        self.assertEqual(self.repo.list_clients(), [])

    def test_removing_client_without_commit_does_not_happen(self):
        self.repo.lock_root()
        self.repo.add_client('foo')
        self.repo.commit_root()
        self.repo.lock_root()
        self.repo.remove_client('foo')
        self.repo.unlock_root()
        self.assertEqual(self.repo.list_clients(), ['foo'])

    def test_removing_client_that_has_data_removes_the_data_as_well(self):
        self.repo.lock_root()
        self.repo.add_client('foo')
        self.repo.commit_root()
        self.repo.lock_client('foo')
        self.repo.lock_shared()
        self.repo.start_generation()
        self.repo.create('/', obnamlib.Metadata())
        self.repo.commit_client()
        self.repo.commit_shared()
        self.repo.lock_root()
        self.repo.remove_client('foo')
        self.repo.commit_root()
        self.assertEqual(self.repo.list_clients(), [])
        self.assertFalse(self.fs.exists('foo'))


class RepositoryClientTests(unittest.TestCase):

    def setUp(self):
        self.tempdir = tempfile.mkdtemp()
        self.fs = obnamlib.LocalFS(self.tempdir)
        self.repo = obnamlib.Repository(
            self.fs, obnamlib.DEFAULT_NODE_SIZE,
            obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE, obnamlib.DEFAULT_LRU_SIZE,
            None, obnamlib.IDPATH_DEPTH, obnamlib.IDPATH_BITS,
            obnamlib.IDPATH_SKIP, time.time, 0, '')
        self.repo.lock_root()
        self.repo.add_client('client_name')
        self.repo.commit_root()
        self.otherfs = obnamlib.LocalFS(self.tempdir)
        self.other = obnamlib.Repository(self.otherfs,
                                         obnamlib.DEFAULT_NODE_SIZE,
                                         obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
                                         obnamlib.DEFAULT_LRU_SIZE,
                                         None,
                                         obnamlib.IDPATH_DEPTH,
                                         obnamlib.IDPATH_BITS,
                                         obnamlib.IDPATH_SKIP,
                                         time.time, 0, '')
        self.dir_meta = obnamlib.Metadata()
        self.dir_meta.st_mode = stat.S_IFDIR | 0777

    def tearDown(self):
        shutil.rmtree(self.tempdir)

    def test_has_not_got_client_lock(self):
        self.assertFalse(self.repo.got_client_lock)

    def test_locks_client(self):
        self.repo.lock_client('client_name')
        self.assert_(self.repo.got_client_lock)

    def test_locking_client_twice_fails(self):
        self.repo.lock_client('client_name')
        self.assertRaises(obnamlib.Error, self.repo.lock_client,
                          'client_name')

    def test_locking_nonexistent_client_fails(self):
        self.assertRaises(obnamlib.LockFail, self.repo.lock_client, 'foo')

    def test_unlock_client_releases_lock(self):
        self.repo.lock_client('client_name')
        self.repo.unlock_client()
        self.assertFalse(self.repo.got_client_lock)

    def test_commit_client_releases_lock(self):
        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        self.repo.commit_client()
        self.repo.commit_shared()
        self.assertFalse(self.repo.got_client_lock)

    def test_commit_does_not_mark_as_checkpoint_by_default(self):
        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        self.repo.start_generation()
        genid = self.repo.new_generation
        self.repo.commit_client()
        self.repo.commit_shared()
        self.repo.open_client('client_name')
        self.assertFalse(self.repo.get_is_checkpoint(genid))

    def test_commit_marks_as_checkpoint_when_requested(self):
        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        self.repo.start_generation()
        genid = self.repo.new_generation
        self.repo.commit_client(checkpoint=True)
        self.repo.commit_shared()
        self.repo.open_client('client_name')
        self.assert_(self.repo.get_is_checkpoint(genid))

    def test_commit_client_without_lock_fails(self):
        self.assertRaises(obnamlib.LockFail, self.repo.commit_client)

    def test_unlock_client_without_lock_fails(self):
        self.assertRaises(obnamlib.LockFail, self.repo.unlock_client)

    def test_commit_client_when_locked_by_other_fails(self):
        self.other.lock_client('client_name')
        self.assertRaises(obnamlib.LockFail, self.repo.commit_client)

    def test_unlock_client_when_locked_by_other_fails(self):
        self.other.lock_client('client_name')
        self.assertRaises(obnamlib.LockFail, self.repo.unlock_client)

    def test_opens_client_fails_if_client_does_not_exist(self):
        self.assertRaises(obnamlib.Error, self.repo.open_client, 'bad')

    def test_opens_client_even_when_locked_by_other(self):
        self.other.lock_client('client_name')
        self.repo.open_client('client_name')
        self.assert_(True)

    def test_lists_no_generations_when_readonly(self):
        self.repo.open_client('client_name')
        self.assertEqual(self.repo.list_generations(), [])

    def test_lists_no_generations_when_locked(self):
        self.repo.lock_client('client_name')
        self.assertEqual(self.repo.list_generations(), [])

    def test_listing_generations_fails_if_client_is_not_open(self):
        self.assertRaises(obnamlib.Error, self.repo.list_generations)

    def test_not_making_new_generation(self):
        self.assertEqual(self.repo.new_generation, None)

    def test_starting_new_generation_without_lock_fails(self):
        self.assertRaises(obnamlib.LockFail, self.repo.start_generation)

    def test_starting_new_generation_works(self):
        self.repo.lock_client('client_name')
        gen = self.repo.start_generation()
        self.assert_(self.repo.new_generation)
        self.assertEqual(self.repo.new_generation, gen)
        self.assertEqual(self.repo.list_generations(), [gen])

    def test_starting_second_concurrent_new_generation_fails(self):
        self.repo.lock_client('client_name')
        self.repo.start_generation()
        self.assertRaises(obnamlib.Error, self.repo.start_generation)

    def test_second_generation_has_different_id_from_first(self):
        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        gen = self.repo.start_generation()
        self.repo.commit_client()
        self.repo.commit_shared()
        self.repo.lock_client('client_name')
        self.assertNotEqual(gen, self.repo.start_generation())

    def test_new_generation_has_start_time_only(self):
        self.repo.lock_client('client_name')
        gen = self.repo.start_generation()
        start, end = self.repo.get_generation_times(gen)
        self.assertNotEqual(start, None)
        self.assertEqual(end, None)

    def test_commited_generation_has_start_and_end_times(self):
        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        gen = self.repo.start_generation()
        self.repo.commit_client()
        self.repo.commit_shared()
        self.repo.open_client('client_name')
        start, end = self.repo.get_generation_times(gen)
        self.assertNotEqual(start, None)
        self.assertNotEqual(end, None)
        self.assert_(start <= end)

    def test_adding_generation_without_committing_does_not_add_it(self):
        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        self.repo.start_generation()
        self.repo.unlock_client()
        self.repo.unlock_shared()
        self.repo.open_client('client_name')
        self.assertEqual(self.repo.list_generations(), [])

    def test_removing_generation_works(self):
        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        gen = self.repo.start_generation()
        self.repo.commit_client()
        self.repo.commit_shared()
        self.repo.open_client('client_name')
        self.assertEqual(len(self.repo.list_generations()), 1)
        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        self.repo.remove_generation(gen)
        self.repo.commit_client()
        self.repo.commit_shared()
        self.repo.open_client('client_name')
        self.assertEqual(self.repo.list_generations(), [])

    def test_removing_only_second_generation_works(self):
        # Create first generation. It will be empty.
        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        gen1 = self.repo.start_generation()
        self.repo.commit_client()
        self.repo.commit_shared()

        # Create second generation. It will have a file with two chunks.
        # Only one of the chunks will be put into the shared trees.
        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        gen2 = self.repo.start_generation()
        chunk_id1 = self.repo.put_chunk_only('data')
        self.repo.put_chunk_in_shared_trees(chunk_id1, 'checksum')
        chunk_id2 = self.repo.put_chunk_only('data2')
        self.repo.set_file_chunks('/foo', [chunk_id1, chunk_id2])
        self.repo.commit_client()
        self.repo.commit_shared()

        # Do we have the right generations? And the chunk2?
        self.repo.open_client('client_name')
        self.assertEqual(len(self.repo.list_generations()), 2)
        self.assertTrue(self.repo.chunk_exists(chunk_id1))
        self.assertTrue(self.repo.chunk_exists(chunk_id2))

        # Remove second generation. This should remove the chunk too.
        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        self.repo.remove_generation(gen2)
        self.repo.commit_client()
        self.repo.commit_shared()

        # Make sure we have only the first generation, and that the
        # chunks are gone.
        self.repo.open_client('client_name')
        self.assertEqual(self.repo.list_generations(), [gen1])
        self.assertFalse(self.repo.chunk_exists(chunk_id1))
        self.assertFalse(self.repo.chunk_exists(chunk_id2))

    def test_removing_started_generation_fails(self):
        self.repo.lock_client('client_name')
        gen = self.repo.start_generation()
        self.assertRaises(obnamlib.Error,
                          self.repo.remove_generation, gen)

    def test_removing_without_committing_does_not_remove(self):
        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        gen = self.repo.start_generation()
        self.repo.commit_client()
        self.repo.commit_shared()
        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        self.repo.remove_generation(gen)
        self.repo.unlock_client()
        self.repo.unlock_shared()
        self.repo.open_client('client_name')
        self.assertEqual(self.repo.list_generations(), [gen])

    def test_new_generation_has_root_dir_only(self):
        self.repo.lock_client('client_name')
        gen = self.repo.start_generation()
        self.assertEqual(self.repo.listdir(gen, '/'), [])

    def test_create_fails_unless_generation_is_started(self):
        self.assertRaises(obnamlib.Error, self.repo.create, None, None)

    def test_create_adds_file(self):
        self.repo.lock_client('client_name')
        gen = self.repo.start_generation()
        self.repo.create('/', self.dir_meta)
        self.repo.create('/foo', obnamlib.Metadata())
        self.assertEqual(self.repo.listdir(gen, '/'), ['foo'])

    def test_create_adds_two_files(self):
        self.repo.lock_client('client_name')
        gen = self.repo.start_generation()
        self.repo.create('/', self.dir_meta)
        self.repo.create('/foo', obnamlib.Metadata())
        self.repo.create('/bar', obnamlib.Metadata())
        self.assertEqual(sorted(self.repo.listdir(gen, '/')),
                         ['bar', 'foo'])

    def test_create_adds_lots_of_files(self):
        n = 100
        self.repo.lock_client('client_name')
        gen = self.repo.start_generation()
        pathnames = ['/%d' % i for i in range(n)]
        for pathname in pathnames:
            self.repo.create(pathname, obnamlib.Metadata())
        self.assertEqual(sorted(self.repo.listdir(gen, '/')),
                         sorted(os.path.basename(x) for x in pathnames))

    def test_create_adds_dir(self):
        self.repo.lock_client('client_name')
        gen = self.repo.start_generation()
        self.repo.create('/foo', self.dir_meta)
        self.assertEqual(self.repo.listdir(gen, '/foo'), [])

    def test_create_adds_dir_after_file_in_it(self):
        self.repo.lock_client('client_name')
        gen = self.repo.start_generation()
        self.repo.create('/foo/bar', obnamlib.Metadata())
        self.repo.create('/foo', self.dir_meta)
        self.assertEqual(self.repo.listdir(gen, '/foo'), ['bar'])

    def test_gets_metadata_for_dir(self):
        self.repo.lock_client('client_name')
        gen = self.repo.start_generation()
        self.repo.create('/foo', self.dir_meta)
        self.assertEqual(self.repo.get_metadata(gen, '/foo').st_mode,
                         self.dir_meta.st_mode)

    def test_remove_removes_file(self):
        self.repo.lock_client('client_name')
        gen = self.repo.start_generation()
        self.repo.create('/foo', obnamlib.Metadata())
        self.repo.remove('/foo')
        self.assertEqual(self.repo.listdir(gen, '/'), [])

    def test_remove_removes_directory_tree(self):
        self.repo.lock_client('client_name')
        gen = self.repo.start_generation()
        self.repo.create('/foo/bar', obnamlib.Metadata())
        self.repo.remove('/foo')
        self.assertEqual(self.repo.listdir(gen, '/'), [])

    def test_get_metadata_works(self):
        metadata = obnamlib.Metadata()
        metadata.st_size = 123
        self.repo.lock_client('client_name')
        gen = self.repo.start_generation()
        self.repo.create('/foo', metadata)
        received = self.repo.get_metadata(gen, '/foo')
        self.assertEqual(metadata.st_size, received.st_size)

    def test_get_metadata_raises_exception_if_file_does_not_exist(self):
        self.repo.lock_client('client_name')
        gen = self.repo.start_generation()
        self.assertRaises(obnamlib.Error,
                          self.repo.get_metadata, gen, '/foo')


class RepositoryChunkTests(unittest.TestCase):

    def setUp(self):
        self.tempdir = tempfile.mkdtemp()
        self.fs = obnamlib.LocalFS(self.tempdir)
        self.repo = obnamlib.Repository(self.fs,
                                        obnamlib.DEFAULT_NODE_SIZE,
                                        obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
                                        obnamlib.DEFAULT_LRU_SIZE,
                                        None,
                                        obnamlib.IDPATH_DEPTH,
                                        obnamlib.IDPATH_BITS,
                                        obnamlib.IDPATH_SKIP,
                                        time.time, 0, '')
        self.repo.lock_root()
        self.repo.add_client('client_name')
        self.repo.commit_root()
        self.repo.lock_client('client_name')
        self.repo.start_generation()

    def tearDown(self):
        shutil.rmtree(self.tempdir)

    def test_checksum_returns_checksum(self):
        self.assertNotEqual(self.repo.checksum('data'), None)

    def test_put_chunk_returns_id(self):
        self.repo.lock_shared()
        self.assertNotEqual(self.repo.put_chunk_only('data'), None)

    def test_get_chunk_retrieves_what_put_chunk_puts(self):
        self.repo.lock_shared()
        chunkid = self.repo.put_chunk_only('data')
        self.assertEqual(self.repo.get_chunk(chunkid), 'data')

    def test_chunk_does_not_exist(self):
        self.assertFalse(self.repo.chunk_exists(1234))

    def test_chunk_exists_after_it_is_put(self):
        self.repo.lock_shared()
        chunkid = self.repo.put_chunk_only('chunk')
        self.assert_(self.repo.chunk_exists(chunkid))

    def test_removes_chunk(self):
        self.repo.lock_shared()
        chunkid = self.repo.put_chunk_only('chunk')
        self.repo.remove_chunk(chunkid)
        self.assertFalse(self.repo.chunk_exists(chunkid))

    def test_silently_ignores_failure_when_removing_nonexistent_chunk(self):
        self.repo.lock_shared()
        self.assertEqual(self.repo.remove_chunk(0), None)

    def test_find_chunks_finds_what_put_chunk_puts(self):
        self.repo.lock_shared()
        checksum = self.repo.checksum('data')
        chunkid = self.repo.put_chunk_only('data')
        self.repo.put_chunk_in_shared_trees(chunkid, checksum)
        self.assertEqual(self.repo.find_chunks(checksum), [chunkid])

    def test_find_chunks_finds_nothing_if_nothing_is_put(self):
        self.assertEqual(self.repo.find_chunks('checksum'), [])

    def test_handles_checksum_collision(self):
        self.repo.lock_shared()
        checksum = self.repo.checksum('data')
        chunkid1 = self.repo.put_chunk_only('data')
        chunkid2 = self.repo.put_chunk_only('data')
        self.repo.put_chunk_in_shared_trees(chunkid1, checksum)
        self.repo.put_chunk_in_shared_trees(chunkid2, checksum)
        self.assertEqual(set(self.repo.find_chunks(checksum)),
                         set([chunkid1, chunkid2]))

    def test_returns_no_chunks_initially(self):
        self.assertEqual(self.repo.list_chunks(), [])

    def test_returns_chunks_after_they_exist(self):
        self.repo.lock_shared()
        checksum = self.repo.checksum('data')
        chunkids = []
        for i in range(2):
            chunkids.append(self.repo.put_chunk_only('data'))
        self.assertEqual(sorted(self.repo.list_chunks()),
                         sorted(chunkids))


class RepositoryGetSetChunksTests(unittest.TestCase):

    def setUp(self):
        self.tempdir = tempfile.mkdtemp()
        self.fs = obnamlib.LocalFS(self.tempdir)
        self.repo = obnamlib.Repository(self.fs,
                                        obnamlib.DEFAULT_NODE_SIZE,
                                        obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
                                        obnamlib.DEFAULT_LRU_SIZE,
                                        None,
                                        obnamlib.IDPATH_DEPTH,
                                        obnamlib.IDPATH_BITS,
                                        obnamlib.IDPATH_SKIP,
                                        time.time, 0, '')
        self.repo.lock_root()
        self.repo.add_client('client_name')
        self.repo.commit_root()
        self.repo.lock_client('client_name')
        self.gen = self.repo.start_generation()
        self.repo.create('/foo', obnamlib.Metadata())

    def tearDown(self):
        shutil.rmtree(self.tempdir)

    def test_file_has_no_chunks(self):
        self.assertEqual(self.repo.get_file_chunks(self.gen, '/foo'), [])

    def test_sets_chunks_for_file(self):
        self.repo.set_file_chunks('/foo', [1, 2])
        chunkids = self.repo.get_file_chunks(self.gen, '/foo')
        self.assertEqual(sorted(chunkids), [1, 2])

    def test_appends_chunks_to_empty_list(self):
        self.repo.append_file_chunks('/foo', [1, 2])
        chunkids = self.repo.get_file_chunks(self.gen, '/foo')
        self.assertEqual(sorted(chunkids), [1, 2])

    def test_appends_chunks_to_nonempty_list(self):
        self.repo.append_file_chunks('/foo', [1, 2])
        self.repo.append_file_chunks('/foo', [3, 4])
        chunkids = self.repo.get_file_chunks(self.gen, '/foo')
        self.assertEqual(sorted(chunkids), [1, 2, 3, 4])


class RepositoryGenspecTests(unittest.TestCase):

    def setUp(self):
        self.tempdir = tempfile.mkdtemp()
        repodir = os.path.join(self.tempdir, 'repo')
        os.mkdir(repodir)
        fs = obnamlib.LocalFS(repodir)
        self.repo = obnamlib.Repository(fs,
                                        obnamlib.DEFAULT_NODE_SIZE,
                                        obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
                                        obnamlib.DEFAULT_LRU_SIZE,
                                        None,
                                        obnamlib.IDPATH_DEPTH,
                                        obnamlib.IDPATH_BITS,
                                        obnamlib.IDPATH_SKIP,
                                        time.time, 0, '')
        self.repo.lock_root()
        self.repo.add_client('client_name')
        self.repo.commit_root()
        self.repo.lock_client('client_name')
        self.repo.lock_shared()

    def tearDown(self):
        shutil.rmtree(self.tempdir)

    def backup(self):
        gen = self.repo.start_generation()
        self.repo.commit_client()
        self.repo.commit_shared()
        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        return gen

    def test_latest_raises_error_if_there_are_no_generations(self):
        self.assertRaises(obnamlib.Error, self.repo.genspec, 'latest')

    def test_latest_returns_only_generation(self):
        gen = self.backup()
        self.assertEqual(self.repo.genspec('latest'), gen)

    def test_latest_returns_newest_generation(self):
        self.backup()
        gen = self.backup()
        self.assertEqual(self.repo.genspec('latest'), gen)

    def test_other_spec_returns_itself(self):
        gen = self.backup()
        self.assertEqual(self.repo.genspec(str(gen)), gen)

    def test_noninteger_spec_raises_error(self):
        gen = self.backup()
        self.assertNotEqual(gen, 'foo')
        self.assertRaises(obnamlib.Error, self.repo.genspec, 'foo')

    def test_nonexistent_spec_raises_error(self):
        self.backup()
        self.assertRaises(obnamlib.Error, self.repo.genspec, 1234)


class RepositoryWalkTests(unittest.TestCase):

    def setUp(self):
        self.tempdir = tempfile.mkdtemp()
        self.fs = obnamlib.LocalFS(self.tempdir)
        self.repo = obnamlib.Repository(self.fs,
                                        obnamlib.DEFAULT_NODE_SIZE,
                                        obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
                                        obnamlib.DEFAULT_LRU_SIZE,
                                        None,
                                        obnamlib.IDPATH_DEPTH,
                                        obnamlib.IDPATH_BITS,
                                        obnamlib.IDPATH_SKIP,
                                        time.time, 0, '')
        self.repo.lock_root()
        self.repo.add_client('client_name')
        self.repo.commit_root()

        self.dir_meta = obnamlib.Metadata()
        self.dir_meta.st_mode = stat.S_IFDIR | 0777

        self.file_meta = obnamlib.Metadata()
        self.file_meta.st_mode = stat.S_IFREG | 0644

        self.repo.lock_client('client_name')
        self.repo.lock_shared()
        self.gen = self.repo.start_generation()
        self.repo.create('/', self.dir_meta)
        self.repo.create('/foo', self.dir_meta)
        self.repo.create('/foo/bar', self.file_meta)
        self.repo.commit_client()
        self.repo.open_client('client_name')

    def tearDown(self):
        shutil.rmtree(self.tempdir)

    def test_walk_find_everything(self):
        found = list(self.repo.walk(self.gen, '/'))
        self.assertEqual(found, [('/', self.dir_meta),
                                 ('/foo', self.dir_meta),
                                 ('/foo/bar', self.file_meta)])

    def test_walk_find_depth_first(self):
        found = list(self.repo.walk(self.gen, '/', depth_first=True))
        self.assertEqual(found, [('/foo/bar', self.file_meta),
                                 ('/foo', self.dir_meta),
                                 ('/', self.dir_meta)])

obnam-1.6.1/obnamlib/repo_tree.py

# Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.


import larch
import tracing

import obnamlib


class RepositoryTree(object):

    '''A B-tree within an obnamlib.Repository.

    For read-only operation, call init_forest before doing anything.

    For read-write operation, call start_changes before doing
    anything, and commit afterwards. In between, self.tree is the new
    tree to be modified. Note that self.tree is NOT available after
    init_forest.

    After init_forest or start_changes, self.forest is the opened
    forest. Unlike self.tree, it will not go away after commit.

    '''

    def __init__(self, fs, dirname, key_bytes, node_size,
                 upload_queue_size, lru_size, repo):
        self.fs = fs
        self.dirname = dirname
        self.key_bytes = key_bytes
        self.node_size = node_size
        self.upload_queue_size = upload_queue_size
        self.lru_size = lru_size
        self.repo = repo
        self.forest = None
        self.forest_allows_writes = False
        self.tree = None
        self.keep_just_one_tree = False

    def init_forest(self, allow_writes=False):
        if self.forest is None:
            tracing.trace('initializing forest dirname=%s', self.dirname)
            assert self.tree is None
            if not self.fs.exists(self.dirname):
                tracing.trace('%s does not exist', self.dirname)
                return False
            self.forest = larch.open_forest(
                key_size=self.key_bytes,
                node_size=self.node_size,
                dirname=self.dirname,
                upload_max=self.upload_queue_size,
                lru_size=self.lru_size,
                vfs=self.fs,
                allow_writes=allow_writes)
            self.forest_allows_writes = allow_writes
        return True

    def start_changes(self, create_tree=True):
        tracing.trace('start changes for %s', self.dirname)

        if self.forest is None or not self.forest_allows_writes:
            if not self.fs.exists(self.dirname):
                need_init = True
            else:
                filenames = self.fs.listdir(self.dirname)
                need_init = filenames == [] or filenames == ['lock']

            if need_init:
                if not self.fs.exists(self.dirname):
                    tracing.trace('create %s', self.dirname)
                    self.fs.mkdir(self.dirname)
                self.repo.hooks.call('repository-toplevel-init',
                                     self.repo, self.dirname)

            self.forest = None
            self.init_forest(allow_writes=True)

        assert self.forest is not None
        assert self.forest_allows_writes, \
            'it is "%s"' % repr(self.forest_allows_writes)

        if self.tree is None and create_tree:
            if self.forest.trees:
                self.tree = self.forest.new_tree(self.forest.trees[-1])
                tracing.trace('use newest tree %s (of %d)',
                              self.tree.root.id, len(self.forest.trees))
            else:
                self.tree = self.forest.new_tree()
                tracing.trace('new tree root id %s', self.tree.root.id)

    def commit(self):
        tracing.trace('committing')
        if self.forest:
            if self.keep_just_one_tree:
                while len(self.forest.trees) > 1:
                    tracing.trace('not keeping tree with root id %s',
                                  self.forest.trees[0].root.id)
                    self.forest.remove_tree(self.forest.trees[0])
            self.forest.commit()
            self.tree = None

obnam-1.6.1/obnamlib/sizeparse.py

# Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import re

import obnamlib


class UnitError(obnamlib.Error):

    def __str__(self):
        return self.msg


class SizeSyntaxError(UnitError):

    def __init__(self, string):
        self.msg = '"%s" is not a valid size' % string


class UnitNameError(UnitError):

    def __init__(self, string):
        self.msg = '"%s" is not a valid unit' % string


class ByteSizeParser(object):

    '''Parse sizes of data in bytes, kilobytes, kibibytes, etc.'''

    # Note: the named groups below were stripped by the extraction
    # process ('(?P' with no name); they are restored here to match
    # the m.group('size') and m.group('unit') calls in parse().
    pat = re.compile(r'^(?P<size>\d+(\.\d+)?)\s*'
                     r'(?P<unit>[kmg]?i?b?)?$', re.I)

    units = {
        'b': 1,
        'k': 1000,
        'kb': 1000,
        'kib': 1024,
        'm': 1000**2,
        'mb': 1000**2,
        'mib': 1024**2,
        'g': 1000**3,
        'gb': 1000**3,
        'gib': 1024**3,
    }

    def __init__(self):
        self.set_default_unit('B')

    def set_default_unit(self, unit):
        if unit.lower() not in self.units:
            raise UnitNameError(unit)
        self.default_unit = unit

    def parse(self, string):
        m = self.pat.match(string)
        if not m:
            raise SizeSyntaxError(string)
        size = float(m.group('size'))
        unit = m.group('unit')
        if not unit:
            unit = self.default_unit
        elif unit.lower() not in self.units:
            raise UnitNameError(unit)
        factor = self.units[unit.lower()]
        return int(size * factor)

obnam-1.6.1/obnamlib/sizeparse_tests.py

# Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
import unittest

import obnamlib


class ByteSizeParserTests(unittest.TestCase):

    def setUp(self):
        self.p = obnamlib.ByteSizeParser()

    def test_parses_zero(self):
        self.assertEqual(self.p.parse('0'), 0)

    def test_parses_unadorned_size_as_bytes(self):
        self.assertEqual(self.p.parse('123'), 123)

    def test_returns_an_int(self):
        self.assert_(isinstance(self.p.parse('123'), int))

    def test_parses_unadorned_size_using_default_unit(self):
        self.p.set_default_unit('KiB')
        self.assertEqual(self.p.parse('123'), 123 * 1024)

    def test_parses_size_with_byte_unit(self):
        self.assertEqual(self.p.parse('123 B'), 123)

    def test_parses_size_with_kilo_unit(self):
        self.assertEqual(self.p.parse('123 k'), 123 * 1000)

    def test_parses_size_with_kilobyte_unit(self):
        self.assertEqual(self.p.parse('123 kB'), 123 * 1000)

    def test_parses_size_with_kibibyte_unit(self):
        self.assertEqual(self.p.parse('123 KiB'), 123 * 1024)

    def test_parses_size_with_mega_unit(self):
        self.assertEqual(self.p.parse('123 m'), 123 * 1000**2)

    def test_parses_size_with_megabyte_unit(self):
        self.assertEqual(self.p.parse('123 MB'), 123 * 1000**2)

    def test_parses_size_with_mebibyte_unit(self):
        self.assertEqual(self.p.parse('123 MiB'), 123 * 1024**2)

    def test_parses_size_with_giga_unit(self):
        self.assertEqual(self.p.parse('123 g'), 123 * 1000**3)

    def test_parses_size_with_gigabyte_unit(self):
        self.assertEqual(self.p.parse('123 GB'), 123 * 1000**3)

    def test_parses_size_with_gibibyte_unit(self):
        self.assertEqual(self.p.parse('123 GiB'), 123 * 1024**3)

    def test_raises_error_for_empty_string(self):
        self.assertRaises(obnamlib.SizeSyntaxError, self.p.parse, '')

    def test_raises_error_for_missing_size(self):
        self.assertRaises(obnamlib.SizeSyntaxError, self.p.parse, 'KiB')

    def test_raises_error_for_bad_unit(self):
        self.assertRaises(obnamlib.SizeSyntaxError, self.p.parse, '1 km')

    def test_raises_error_for_bad_unit_thats_similar_to_real_one(self):
        self.assertRaises(obnamlib.UnitNameError, self.p.parse, '1 ib')

    def test_raises_error_for_bad_default_unit(self):
        self.assertRaises(obnamlib.UnitNameError,
                          self.p.set_default_unit, 'km')

    def test_size_syntax_error_includes_input_string(self):
        text = 'asdf asdf'
        e = obnamlib.SizeSyntaxError(text)
        self.assert_(text in str(e), str(e))

    def test_unit_name_error_includes_input_string(self):
        text = 'asdf asdf'
        e = obnamlib.UnitNameError(text)
        self.assert_(text in str(e), str(e))

obnam-1.6.1/obnamlib/vfs.py

# Copyright (C) 2008, 2010  Lars Wirzenius
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.


import errno
import logging
import os
import stat
import urlparse

import obnamlib


class VirtualFileSystem(object):

    '''A virtual filesystem interface.

    The backup program needs to access both local and remote files.
    To make it easier to support all kinds of files both locally and
    remotely, we use a custom virtual filesystem interface so that
    all filesystem access is done the same way. This way, we can
    easily support user data and backup repositories in any
    combination of local and remote filesystems.

    This class defines the interface for such virtual filesystems.
    Sub-classes will actually implement the interface.

    When a VFS is instantiated, it is bound to a base URL.
    When accessing the virtual filesystem, all paths are then given
    relative to the base URL. The Unix syntax for files is used for
    the relative paths: directory components separated by slashes,
    and an initial slash indicating the root of the filesystem (in
    this case, the base URL).

    '''

    def __init__(self, baseurl):
        self.baseurl = baseurl
        self.bytes_read = 0
        self.bytes_written = 0
        logging.debug('VFS: __init__: baseurl=%s' % self.baseurl)

    def log_stats(self):
        logging.debug(
            'VFS: baseurl=%s read=%d written=%d' %
            (self.baseurl, self.bytes_read, self.bytes_written))

    def connect(self):
        '''Connect to filesystem.'''

    def close(self):
        '''Close connection to filesystem.'''
        self.log_stats()

    def reinit(self, new_baseurl, create=False):
        '''Go back to the beginning.

        This behaves like instantiating a new instance, but possibly
        faster for things like SftpFS. If there is a network
        connection already open, it will be reused.

        '''

    def abspath(self, pathname):
        '''Return absolute version of pathname.'''
        return os.path.abspath(os.path.join(self.getcwd(), pathname))

    def getcwd(self):
        '''Return current working directory as absolute pathname.'''

    def chdir(self, pathname):
        '''Change current working directory to pathname.'''

    def listdir(self, pathname):
        '''Return list of basenames of entities at pathname.'''

    def listdir2(self, pathname):
        '''Return list of basenames and stats of entities at pathname.

        The stat entity may be an exception object instead, to
        indicate an error.

        '''

    def lock(self, lockname, data):
        '''Create a lock file with the given name.'''

    def unlock(self, lockname):
        '''Remove a lock file.'''

    def exists(self, pathname):
        '''Does the file or directory exist?'''

    def mknod(self, pathname, mode):
        '''Create a filesystem node.'''

    def isdir(self, pathname):
        '''Is it a directory?'''

    def mkdir(self, pathname):
        '''Create a directory.

        Parent directories must already exist.
        '''

    def makedirs(self, pathname):
        '''Create a directory, and missing parents.'''

    def rmdir(self, pathname):
        '''Remove an empty directory.'''

    def rmtree(self, dirname):
        '''Remove a directory tree, including its contents.'''
        if self.isdir(dirname):
            for pathname, st in self.scan_tree(dirname):
                if stat.S_ISDIR(st.st_mode):
                    self.rmdir(pathname)
                else:
                    self.remove(pathname)

    def remove(self, pathname):
        '''Remove a file.'''

    def rename(self, old, new):
        '''Rename a file.'''

    def lstat(self, pathname):
        '''Like os.lstat.'''

    def get_username(self, uid):
        '''Return name for user, or None if not known.'''

    def get_groupname(self, gid):
        '''Return name for group, or None if not known.'''

    def llistxattr(self, pathname):
        '''Return list of names of extended attributes for file.'''
        return []

    def lgetxattr(self, pathname, attrname):
        '''Return value of an extended attribute.'''

    def lsetxattr(self, pathname, attrname, attrvalue):
        '''Set value of an extended attribute.'''

    def lchown(self, pathname, uid, gid):
        '''Like os.lchown.'''

    def chmod_symlink(self, pathname, mode):
        '''Like os.lchmod, for symlinks only.

        This may fail if the pathname is not a symlink (but it may
        not). If the target is a symlink, but the platform (e.g.,
        Linux) does not allow setting the permissions of a symlink,
        the method will silently do nothing.

        '''

    def chmod_not_symlink(self, pathname, mode):
        '''Like os.chmod, for non-symlinks only.

        This may fail if pathname is a symlink (but it may not). It
        MUST NOT be called for a symlink; use chmod_symlink instead.

        '''

    def lutimes(self, pathname, atime_sec, atime_nsec, mtime_sec,
                mtime_nsec):
        '''Like lutimes(2).

        This isn't quite like lutimes, actually. Most importantly, it
        uses nanosecond timestamps rather than microsecond. This is
        important.
''' def link(self, existing_path, new_path): '''Like os.link.''' def readlink(self, symlink): '''Like os.readlink.''' def symlink(self, source, destination): '''Like os.symlink.''' def open(self, pathname, mode): '''Open a file, like the builtin open() or file() function. The return value is a file object like the ones returned by the builtin open() function. ''' def cat(self, pathname): '''Return the contents of a file.''' def write_file(self, pathname, contents): '''Write a new file. The file must not yet exist. The file is not necessarily written atomically, meaning that if the writing fails (connection to server drops, for example), the file might exist in a partial form. The callers need to deal with this. Any directories in pathname will be created if necessary. ''' def overwrite_file(self, pathname, contents): '''Like write_file, but overwrites existing file.''' def scan_tree(self, dirname, ok=None, dirst=None, log=logging.error, error_handler=None): '''Scan a tree for files. Return a generator that returns ``(pathname, stat_result)`` pairs for each file and directory in the tree, in depth-first order. If ``ok`` is not None, it must be a function that determines if a particular file or directory should be returned. It gets the pathname and stat result as arguments, and should return True or False. If it returns False on a directory, ``scan_tree`` will not recurse into the directory. ``dirst`` is for internal optimization, and should not be used by the caller. ``log`` is used by unit tests and should not be used by the caller. Errors from calling ``listdir`` or ``lstat`` are logged, but do not stop the scanning. Such files or directories are not returned, however. If `error_handler` is defined, it is called once for every problem, giving the name and exception as arguments. 
''' error_handler = error_handler or (lambda name, e: None) try: pairs = self.listdir2(dirname) except OSError, e: log('listdir failed: %s: %s' % (e.filename, e.strerror)) error_handler(dirname, e) pairs = [] queue = [] for name, st in pairs: pathname = os.path.join(dirname, name) if isinstance(st, BaseException): error_handler(pathname, st) elif ok is None or ok(pathname, st): if stat.S_ISDIR(st.st_mode): for t in self.scan_tree(pathname, ok=ok, dirst=st): yield t else: queue.append((pathname, st)) for pathname, st in queue: yield pathname, st if dirst is None: try: dirst = self.lstat(dirname) except OSError, e: log('lstat for dir failed: %s: %s' % (e.filename, e.strerror)) return yield dirname, dirst class VfsFactory: '''Create new instances of VirtualFileSystem.''' def __init__(self): self.implementations = {} def register(self, scheme, implementation, **kwargs): if scheme in self.implementations: raise obnamlib.Error('URL scheme %s already registered' % scheme) self.implementations[scheme] = (implementation, kwargs) def new(self, url, create=False): '''Create a new VFS appropriate for a given URL.''' scheme, netloc, path, params, query, fragment = urlparse.urlparse(url) if scheme in self.implementations: klass, kwargs = self.implementations[scheme] return klass(url, create=create, **kwargs) raise obnamlib.Error('Unknown VFS type %s' % url) class VfsTests(object): # pragma: no cover '''Re-useable tests for VirtualFileSystem implementations. The base class can't be usefully instantiated itself. Instead you are supposed to sub-class it and implement the API in a suitable way for yourself. This class implements a number of tests that the API implementation must pass. The implementation's own test class should inherit from this class, and unittest.TestCase. 
    The test sub-class should define a setUp method that sets the
    following:

    * self.fs to an instance of the API implementation sub-class
    * self.basepath to the path to the base of the filesystem

    basepath must be operable as a pathname using os.path tools. If
    the VFS implementation operates remotely and wants to operate on
    a URL like 'http://domain/path' as the baseurl, then basepath
    must be just the path portion of the URL.

    The directory indicated by basepath must exist, but must be empty
    at start.

    '''

    non_ascii_name = u'm\u00e4kel\u00e4'.encode('utf-8')

    def test_abspath_returns_input_for_absolute_path(self):
        self.assertEqual(self.fs.abspath('/foo/bar'), '/foo/bar')

    def test_abspath_returns_absolute_path_for_relative_input(self):
        self.assertEqual(self.fs.abspath('foo'),
                         os.path.join(self.basepath, 'foo'))

    def test_abspath_normalizes_path(self):
        self.assertEqual(self.fs.abspath('foo/..'), self.basepath)

    def test_abspath_returns_plain_string(self):
        self.fs.mkdir(self.non_ascii_name)
        self.fs.chdir(self.non_ascii_name)
        self.assertEqual(type(self.fs.abspath('.')), str)

    def test_reinit_works(self):
        self.fs.chdir('/')
        self.fs.reinit(self.fs.baseurl)
        self.assertEqual(self.fs.getcwd(), self.basepath)

    def test_reinit_to_nonexistent_filename_raises_OSError(self):
        notexist = os.path.join(self.fs.baseurl, 'thisdoesnotexist')
        self.assertRaises(OSError, self.fs.reinit, notexist)

    def test_reinit_creates_target_if_requested(self):
        self.fs.chdir('/')
        new_baseurl = os.path.join(self.fs.baseurl, 'newdir')
        new_basepath = os.path.join(self.basepath, 'newdir')
        self.fs.reinit(new_baseurl, create=True)
        self.assertEqual(self.fs.getcwd(), new_basepath)

    def test_getcwd_returns_dirname(self):
        self.assertEqual(self.fs.getcwd(), self.basepath)

    def test_getcwd_returns_plain_string(self):
        self.fs.mkdir(self.non_ascii_name)
        self.fs.chdir(self.non_ascii_name)
        self.assertEqual(type(self.fs.getcwd()), str)

    def test_chdir_changes_only_fs_cwd_not_process_cwd(self):
        process_cwd = os.getcwd()
self.fs.chdir('/') self.assertEqual(self.fs.getcwd(), '/') self.assertEqual(os.getcwd(), process_cwd) def test_chdir_to_nonexistent_raises_exception(self): self.assertRaises(OSError, self.fs.chdir, '/foobar') def test_chdir_to_relative_works(self): pathname = os.path.join(self.basepath, 'foo') os.mkdir(pathname) self.fs.chdir('foo') self.assertEqual(self.fs.getcwd(), pathname) def test_chdir_to_dotdot_works(self): pathname = os.path.join(self.basepath, 'foo') os.mkdir(pathname) self.fs.chdir('foo') self.fs.chdir('..') self.assertEqual(self.fs.getcwd(), self.basepath) def test_creates_lock_file(self): self.fs.lock('lock', 'lock data') self.assertTrue(self.fs.exists('lock')) self.assertEqual(self.fs.cat('lock'), 'lock data') def test_second_lock_fails(self): self.fs.lock('lock', 'lock data') self.assertRaises(Exception, self.fs.lock, 'lock', 'second lock') self.assertEqual(self.fs.cat('lock'), 'lock data') def test_unlock_removes_lock(self): self.fs.lock('lock', 'lock data') self.fs.unlock('lock') self.assertFalse(self.fs.exists('lock')) def test_exists_returns_false_for_nonexistent_file(self): self.assertFalse(self.fs.exists('foo')) def test_exists_returns_true_for_existing_file(self): self.fs.write_file('foo', '') self.assert_(self.fs.exists('foo')) def test_isdir_returns_false_for_nonexistent_file(self): self.assertFalse(self.fs.isdir('foo')) def test_isdir_returns_false_for_nondir(self): self.fs.write_file('foo', '') self.assertFalse(self.fs.isdir('foo')) def test_isdir_returns_true_for_existing_dir(self): self.fs.mkdir('foo') self.assert_(self.fs.isdir('foo')) def test_listdir_returns_plain_strings_only(self): self.fs.write_file(u'M\u00E4kel\u00E4'.encode('utf-8'), 'data') names = self.fs.listdir('.') types = [type(x) for x in names] self.assertEqual(types, [str]) def test_listdir_raises_oserror_if_directory_does_not_exist(self): self.assertRaises(OSError, self.fs.listdir, 'foo') def test_listdir2_returns_name_stat_pairs(self): funny = 
u'M\u00E4kel\u00E4'.encode('utf-8') self.fs.write_file(funny, 'data') pairs = self.fs.listdir2('.') self.assertEqual(len(pairs), 1) self.assertEqual(len(pairs[0]), 2) name, st = pairs[0] self.assertEqual(type(name), str) self.assertEqual(name, funny) self.assert_(hasattr(st, 'st_mode')) self.assertFalse(hasattr(st, 'st_mtime')) self.assert_(hasattr(st, 'st_mtime_sec')) self.assert_(hasattr(st, 'st_mtime_nsec')) def test_listdir2_returns_plain_strings_only(self): self.fs.write_file(u'M\u00E4kel\u00E4'.encode('utf-8'), 'data') names = [name for name, st in self.fs.listdir2('.')] types = [type(x) for x in names] self.assertEqual(types, [str]) def test_listdir2_raises_oserror_if_directory_does_not_exist(self): self.assertRaises(OSError, self.fs.listdir2, 'foo') def test_mknod_creates_fifo(self): self.fs.mknod('foo', 0600 | stat.S_IFIFO) self.assertEqual(self.fs.lstat('foo').st_mode, 0600 | stat.S_IFIFO) def test_mkdir_raises_oserror_if_directory_exists(self): self.assertRaises(OSError, self.fs.mkdir, '.') def test_mkdir_raises_oserror_if_parent_does_not_exist(self): self.assertRaises(OSError, self.fs.mkdir, 'foo/bar') def test_makedirs_raises_oserror_when_directory_exists(self): self.fs.mkdir('foo') self.assertRaises(OSError, self.fs.makedirs, 'foo') def test_makedirs_creates_directory_when_parent_exists(self): self.fs.makedirs('foo') self.assert_(self.fs.isdir('foo')) def test_makedirs_creates_directory_when_parent_does_not_exist(self): self.fs.makedirs('foo/bar') self.assert_(self.fs.isdir('foo/bar')) def test_rmdir_removes_directory(self): self.fs.mkdir('foo') self.fs.rmdir('foo') self.assertFalse(self.fs.exists('foo')) def test_rmdir_raises_oserror_if_directory_does_not_exist(self): self.assertRaises(OSError, self.fs.rmdir, 'foo') def test_rmdir_raises_oserror_if_directory_is_not_empty(self): self.fs.mkdir('foo') self.fs.write_file('foo/bar', '') self.assertRaises(OSError, self.fs.rmdir, 'foo') def test_rmtree_removes_directory_tree(self): self.fs.mkdir('foo') 
self.fs.write_file('foo/bar', '') self.fs.rmtree('foo') self.assertFalse(self.fs.exists('foo')) def test_rmtree_is_silent_when_target_does_not_exist(self): self.assertEqual(self.fs.rmtree('foo'), None) def test_remove_removes_file(self): self.fs.write_file('foo', '') self.fs.remove('foo') self.assertFalse(self.fs.exists('foo')) def test_remove_raises_oserror_if_file_does_not_exist(self): self.assertRaises(OSError, self.fs.remove, 'foo') def test_rename_renames_file(self): self.fs.write_file('foo', 'xxx') self.fs.rename('foo', 'bar') self.assertFalse(self.fs.exists('foo')) self.assertEqual(self.fs.cat('bar'), 'xxx') def test_rename_raises_oserror_if_file_does_not_exist(self): self.assertRaises(OSError, self.fs.rename, 'foo', 'bar') def test_rename_works_if_target_exists(self): self.fs.write_file('foo', 'foo') self.fs.write_file('bar', 'bar') self.fs.rename('foo', 'bar') self.assertEqual(self.fs.cat('bar'), 'foo') def test_lstat_returns_result_with_all_required_fields(self): st = self.fs.lstat('.') for field in obnamlib.metadata_fields: if field.startswith('st_'): self.assert_(hasattr(st, field), 'stat must return %s' % field) def test_lstat_returns_right_filetype_for_directory(self): st = self.fs.lstat('.') self.assert_(stat.S_ISDIR(st.st_mode)) def test_lstat_raises_oserror_for_nonexistent_entry(self): self.assertRaises(OSError, self.fs.lstat, 'notexists') def test_chmod_not_symlink_sets_permissions_correctly(self): self.fs.mkdir('foo') self.fs.chmod_not_symlink('foo', 0777) self.assertEqual(self.fs.lstat('foo').st_mode & 0777, 0777) def test_chmod_not_symlink_raises_oserror_for_nonexistent_entry(self): self.assertRaises(OSError, self.fs.chmod_not_symlink, 'notexists', 0) def test_chmod_symlink_raises_oserror_for_nonexistent_entry(self): self.assertRaises(OSError, self.fs.chmod_symlink, 'notexists', 0) def test_lutimes_sets_times_correctly(self): self.fs.mkdir('foo') self.fs.lutimes('foo', 1, 2*1000, 3, 4*1000) self.assertEqual(self.fs.lstat('foo').st_atime_sec, 1) 
        # Not all filesystems support sub-second timestamps; those
        # that do not return 0, so we have to accept either that or
        # the correct value, but no other values.
        self.assert_(self.fs.lstat('foo').st_atime_nsec in [0, 2*1000])
        self.assertEqual(self.fs.lstat('foo').st_mtime_sec, 3)
        self.assert_(self.fs.lstat('foo').st_mtime_nsec in [0, 4*1000])

    def test_lutimes_raises_oserror_for_nonexistent_entry(self):
        self.assertRaises(OSError, self.fs.lutimes, 'notexists',
                          1, 2, 3, 4)

    def test_link_creates_hard_link(self):
        self.fs.write_file('foo', 'foo')
        self.fs.link('foo', 'bar')
        st1 = self.fs.lstat('foo')
        st2 = self.fs.lstat('bar')
        self.assertEqual(st1, st2)

    def test_symlink_creates_soft_link(self):
        self.fs.symlink('foo', 'bar')
        target = self.fs.readlink('bar')
        self.assertEqual(target, 'foo')

    def test_readlink_returns_plain_string(self):
        self.fs.symlink(self.non_ascii_name, self.non_ascii_name)
        target = self.fs.readlink(self.non_ascii_name)
        self.assertEqual(target, self.non_ascii_name)
        self.assertEqual(type(target), str)

    def test_symlink_raises_oserror_if_name_exists(self):
        self.fs.write_file('foo', 'foo')
        self.assertRaises(OSError, self.fs.symlink, 'bar', 'foo')

    def test_opens_existing_file_ok_for_reading(self):
        self.fs.write_file('foo', '')
        self.assert_(self.fs.open('foo', 'r'))

    def test_opens_existing_file_ok_for_writing(self):
        self.fs.write_file('foo', '')
        self.assert_(self.fs.open('foo', 'w'))

    def test_open_fails_for_nonexistent_file(self):
        self.assertRaises(IOError, self.fs.open, 'foo', 'r')

    def test_cat_reads_existing_file_ok(self):
        self.fs.write_file('foo', 'bar')
        self.assertEqual(self.fs.cat('foo'), 'bar')

    def test_cat_fails_for_nonexistent_file(self):
        self.assertRaises(IOError, self.fs.cat, 'foo')

    def test_has_read_nothing_initially(self):
        self.assertEqual(self.fs.bytes_read, 0)

    def test_cat_updates_bytes_read(self):
        self.fs.write_file('foo', 'bar')
        self.fs.cat('foo')
        self.assertEqual(self.fs.bytes_read, 3)

    def test_write_fails_if_file_exists_already(self):
self.fs.write_file('foo', 'bar') self.assertRaises(OSError, self.fs.write_file, 'foo', 'foobar') def test_write_creates_missing_directories(self): self.fs.write_file('foo/bar', 'yo') self.assertEqual(self.fs.cat('foo/bar'), 'yo') def test_write_leaves_existing_file_intact(self): self.fs.write_file('foo', 'bar') try: self.fs.write_file('foo', 'foobar') except OSError: pass self.assertEqual(self.fs.cat('foo'), 'bar') def test_overwrite_creates_new_file_ok(self): self.fs.overwrite_file('foo', 'bar') self.assertEqual(self.fs.cat('foo'), 'bar') def test_overwrite_replaces_existing_file(self): self.fs.write_file('foo', 'bar') self.fs.overwrite_file('foo', 'foobar') self.assertEqual(self.fs.cat('foo'), 'foobar') def test_has_written_nothing_initially(self): self.assertEqual(self.fs.bytes_written, 0) def test_write_updates_written(self): self.fs.write_file('foo', 'foo') self.assertEqual(self.fs.bytes_written, 3) def test_overwrite_updates_written(self): self.fs.overwrite_file('foo', 'foo') self.assertEqual(self.fs.bytes_written, 3) def set_up_scan_tree(self): self.dirs = ['foo', 'foo/bar', 'foobar'] self.dirs = [os.path.join(self.basepath, x) for x in self.dirs] for dirname in self.dirs: self.fs.mkdir(dirname) self.dirs.insert(0, self.basepath) self.fs.symlink('foo', 'symfoo') self.pathnames = self.dirs + [os.path.join(self.basepath, 'symfoo')] def test_scan_tree_returns_nothing_if_listdir_fails(self): self.set_up_scan_tree() def raiser(dirname): raise OSError(123, 'oops', dirname) def logerror(msg): pass self.fs.listdir2 = raiser result = list(self.fs.scan_tree(self.basepath, log=logerror)) self.assertEqual(len(result), 1) pathname, st = result[0] self.assertEqual(pathname, self.basepath) def test_scan_tree_returns_the_right_stuff(self): self.set_up_scan_tree() result = list(self.fs.scan_tree(self.basepath)) pathnames = [pathname for pathname, st in result] self.assertEqual(sorted(pathnames), sorted(self.pathnames)) def test_scan_tree_filters_away_unwanted(self): def 
ok(pathname, st): return stat.S_ISDIR(st.st_mode) self.set_up_scan_tree() result = list(self.fs.scan_tree(self.basepath, ok=ok)) pathnames = [pathname for pathname, st in result] self.assertEqual(sorted(pathnames), sorted(self.dirs)) obnam-1.6.1/obnamlib/vfs_local.py0000644000175000017500000003234412246357067016626 0ustar jenkinsjenkins# Copyright (C) 2008 Lars Wirzenius # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License along # with this program; if not, write to the Free Software Foundation, Inc., # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. 
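The `scan_tree` generator documented in `VirtualFileSystem` above returns `(pathname, stat_result)` pairs in depth-first order, with each directory yielded after its contents. The following is a simplified, hypothetical sketch of that traversal using only the plain `os` module; it is not obnam's actual code (the real method also supports `ok` filters, `dirst` caching, and error handlers):

```python
import os
import stat

def scan_tree(dirname):
    """Yield (pathname, stat_result) pairs depth-first, directory last.

    Simplified sketch of VirtualFileSystem.scan_tree; errors and
    filtering hooks from the real API are omitted.
    """
    try:
        names = os.listdir(dirname)
    except OSError:
        return
    files = []
    for name in names:
        pathname = os.path.join(dirname, name)
        st = os.lstat(pathname)
        if stat.S_ISDIR(st.st_mode):
            # Recurse into subdirectories first (depth-first order).
            for pair in scan_tree(pathname):
                yield pair
        else:
            files.append((pathname, st))
    for pair in files:
        yield pair
    # The directory itself comes last, so callers such as rmtree can
    # safely rmdir it after its contents have been handled.
    yield dirname, os.lstat(dirname)
```

This directory-last ordering is what lets `rmtree` in the base class remove a tree with a single pass over `scan_tree` output.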
import errno import fcntl import grp import logging import math import os import pwd import tempfile import time import tracing import obnamlib # O_NOATIME is Linux specific: EXTRA_OPEN_FLAGS = getattr(os, "O_NOATIME", 0) class LocalFSFile(file): def read(self, amount=-1): offset = self.tell() data = file.read(self, amount) if data: fd = self.fileno() obnamlib._obnam.fadvise_dontneed(fd, offset, len(data)) return data def write(self, data): offset = self.tell() file.write(self, data) fd = self.fileno() obnamlib._obnam.fadvise_dontneed(fd, offset, len(data)) class LocalFS(obnamlib.VirtualFileSystem): """A VFS implementation for local filesystems.""" chunk_size = 1024 * 1024 def __init__(self, baseurl, create=False): tracing.trace('baseurl=%s', baseurl) tracing.trace('create=%s', create) obnamlib.VirtualFileSystem.__init__(self, baseurl) self.reinit(baseurl, create=create) # For checking that we do not unlock something we didn't lock # ourselves. self.our_locks = set() # For testing purposes, allow setting a limit on write operations # after which an exception gets raised. If set to None, no crash. self.crash_limit = None self.crash_counter = 0 # Do we have lchmod? self.got_lchmod = hasattr(os, 'lchmod') def maybe_crash(self): # pragma: no cover if self.crash_limit is not None: self.crash_counter += 1 if self.crash_counter >= self.crash_limit: raise Exception('Crashing as requested after %d writes' % self.crash_counter) def reinit(self, baseurl, create=False): # We fake chdir so that it doesn't mess with the caller's # perception of current working directory. This also benefits # unit tests. To do this, we store the baseurl as the cwd. tracing.trace('baseurl=%s', baseurl) tracing.trace('create=%s', create) self.cwd = os.path.abspath(baseurl) if not self.isdir('.'): if create: tracing.trace('creating %s', baseurl) try: os.mkdir(baseurl) except OSError, e: # pragma: no cover # The directory might have been created concurrently # by someone else! 
if e.errno != errno.EEXIST: raise else: err = errno.ENOENT raise OSError(err, os.strerror(err), self.cwd) def getcwd(self): return self.cwd def chdir(self, pathname): tracing.trace('LocalFS(%s).chdir(%s)', self.baseurl, pathname) newcwd = os.path.abspath(self.join(pathname)) if not os.path.isdir(newcwd): raise OSError('%s is not a directory' % newcwd) self.cwd = newcwd def lock(self, lockname, data): tracing.trace('attempting lockname=%s', lockname) try: self.write_file(lockname, data) except OSError, e: if e.errno == errno.EEXIST: raise obnamlib.LockFail("Lock %s already exists" % lockname) else: raise # pragma: no cover tracing.trace('got lockname=%s', lockname) tracing.trace('time=%f' % time.time()) self.our_locks.add(lockname) def unlock(self, lockname): tracing.trace('lockname=%s', lockname) assert lockname in self.our_locks self.remove(lockname) self.our_locks.remove(lockname) tracing.trace('time=%f' % time.time()) def join(self, pathname): return os.path.join(self.cwd, pathname) def remove(self, pathname): tracing.trace('remove %s', pathname) os.remove(self.join(pathname)) self.maybe_crash() def rename(self, old, new): tracing.trace('rename %s %s', old, new) os.rename(self.join(old), self.join(new)) self.maybe_crash() def lstat(self, pathname): (ret, dev, ino, mode, nlink, uid, gid, rdev, size, blksize, blocks, atime_sec, atime_nsec, mtime_sec, mtime_nsec, ctime_sec, ctime_nsec) = obnamlib._obnam.lstat(self.join(pathname)) if ret != 0: raise OSError(ret, os.strerror(ret), pathname) return obnamlib.Metadata( st_dev=dev, st_ino=ino, st_mode=mode, st_nlink=nlink, st_uid=uid, st_gid=gid, st_rdev=rdev, st_size=size, st_blksize=blksize, st_blocks=blocks, st_atime_sec=atime_sec, st_atime_nsec=atime_nsec, st_mtime_sec=mtime_sec, st_mtime_nsec=mtime_nsec, st_ctime_sec=ctime_sec, st_ctime_nsec=ctime_nsec ) def get_username(self, uid): return pwd.getpwuid(uid)[0] def get_groupname(self, gid): return grp.getgrgid(gid)[0] def llistxattr(self, filename): # pragma: no 
cover ret = obnamlib._obnam.llistxattr(self.join(filename)) if type(ret) is int: raise OSError(ret, os.strerror(ret), filename) return [s for s in ret.split('\0') if s] def lgetxattr(self, filename, attrname): # pragma: no cover ret = obnamlib._obnam.lgetxattr(self.join(filename), attrname) if type(ret) is int: raise OSError(ret, os.strerror(ret), filename) return ret def lsetxattr(self, filename, attrname, attrvalue): # pragma: no cover ret = obnamlib._obnam.lsetxattr(self.join(filename), attrname, attrvalue) if ret != 0: raise OSError(ret, os.strerror(ret), filename) def lchown(self, pathname, uid, gid): # pragma: no cover tracing.trace('lchown %s %d %d', pathname, uid, gid) os.lchown(self.join(pathname), uid, gid) # This method is excluded from test coverage because the platform # either has lchmod or doesn't, and accordingly either branch of # the if statement is taken, and the other branch shows up as not # being tested by the unit tests. def chmod_symlink(self, pathname, mode): # pragma: no cover tracing.trace('chmod_symlink %s %o', pathname, mode) if self.got_lchmod: os.lchmod(self.join(pathname), mode) else: self.lstat(pathname) def chmod_not_symlink(self, pathname, mode): tracing.trace('chmod_not_symlink %s %o', pathname, mode) os.chmod(self.join(pathname), mode) def lutimes(self, pathname, atime_sec, atime_nsec, mtime_sec, mtime_nsec): assert atime_sec is not None assert atime_nsec is not None assert mtime_sec is not None assert mtime_nsec is not None ret = obnamlib._obnam.utimensat(self.join(pathname), atime_sec, atime_nsec, mtime_sec, mtime_nsec) if ret != 0: raise OSError(ret, os.strerror(ret), pathname) def link(self, existing, new): tracing.trace('existing=%s', existing) tracing.trace('new=%s', new) os.link(self.join(existing), self.join(new)) self.maybe_crash() def readlink(self, pathname): return os.readlink(self.join(pathname)) def symlink(self, existing, new): tracing.trace('existing=%s', existing) tracing.trace('new=%s', new) 
        os.symlink(existing, self.join(new))
        self.maybe_crash()

    def open(self, pathname, mode):
        tracing.trace('pathname=%s', pathname)
        tracing.trace('mode=%s', mode)
        f = LocalFSFile(self.join(pathname), mode)
        tracing.trace('opened %s', pathname)
        try:
            flags = fcntl.fcntl(f.fileno(), fcntl.F_GETFL)
            flags |= EXTRA_OPEN_FLAGS
            fcntl.fcntl(f.fileno(), fcntl.F_SETFL, flags)
        except IOError, e: # pragma: no cover
            tracing.trace('fcntl F_SETFL failed: %s', repr(e))
            return f # ignore any problems setting flags
        tracing.trace('returning ok')
        return f

    def exists(self, pathname):
        return os.path.exists(self.join(pathname))

    def isdir(self, pathname):
        return os.path.isdir(self.join(pathname))

    def mknod(self, pathname, mode):
        tracing.trace('pathname=%s', pathname)
        tracing.trace('mode=%o', mode)
        os.mknod(self.join(pathname), mode)

    def mkdir(self, pathname):
        tracing.trace('mkdir %s', pathname)
        os.mkdir(self.join(pathname))
        self.maybe_crash()

    def makedirs(self, pathname):
        tracing.trace('makedirs %s', pathname)
        os.makedirs(self.join(pathname))
        self.maybe_crash()

    def rmdir(self, pathname):
        tracing.trace('rmdir %s', pathname)
        os.rmdir(self.join(pathname))
        self.maybe_crash()

    def cat(self, pathname):
        tracing.trace('pathname=%s' % pathname)
        pathname = self.join(pathname)
        f = self.open(pathname, 'rb')
        chunks = []
        while True:
            chunk = f.read(self.chunk_size)
            if not chunk:
                break
            chunks.append(chunk)
            self.bytes_read += len(chunk)
        f.close()
        data = ''.join(chunks)
        return data

    def write_file(self, pathname, contents): # pragma: no cover
        # This is tricky. We need to at least try to support NFS, and
        # various filesystems that do not support hardlinks. On NFS,
        # creating a file with O_EXCL is not guaranteed to be atomic,
        # but adding a link with link(2) is. However, that doesn't
        # work on VFAT, for example. So we try to do both: first
        # create a temporary file with a name guaranteed to not be a
        # name we want to use, and then we rename it using link(2)
        # and remove(2).
# If that fails, we try to create the target file with O_EXCL # and rename the temporary file to that. This is still not 100% # reliable: someone could be mounting VFAT across NFS, for # example, but it's the best we can do. If this paragraph is # wrong, tell the authors. tracing.trace('write_file %s', pathname) tempname = self._write_to_tempfile(pathname, contents) path = self.join(pathname) # Try link(2) for creating target file. try: os.link(tempname, path) except OSError, e: pass else: os.remove(tempname) tracing.trace('link+remove worked') return # Nope, didn't work. Now try with O_EXCL instead. try: fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0666) os.close(fd) os.rename(tempname, path) except OSError, e: # Give up. os.remove(tempname) raise tracing.trace('O_EXCL+rename worked') self.maybe_crash() def overwrite_file(self, pathname, contents): tracing.trace('overwrite_file %s', pathname) tempname = self._write_to_tempfile(pathname, contents) path = self.join(pathname) os.rename(tempname, path) self.maybe_crash() def _write_to_tempfile(self, pathname, contents): path = self.join(pathname) dirname = os.path.dirname(path) if not os.path.exists(dirname): tracing.trace('os.makedirs(%s)' % dirname) os.makedirs(dirname) fd, tempname = tempfile.mkstemp(dir=dirname) os.close(fd) f = self.open(tempname, 'wb') pos = 0 while pos < len(contents): chunk = contents[pos:pos+self.chunk_size] f.write(chunk) pos += len(chunk) self.bytes_written += len(chunk) f.close() return tempname def listdir(self, dirname): return os.listdir(self.join(dirname)) def listdir2(self, dirname): result = [] for name in self.listdir(dirname): try: st = self.lstat(os.path.join(dirname, name)) except OSError, e: # pragma: no cover st = e ino = -1 else: ino = st.st_ino result.append((ino, name, st)) # We sort things in inode order, for speed when doing namei lookups # when backing up. 
result.sort() return [(name, st) for ino, name, st in result] obnam-1.6.1/obnamlib/vfs_local_tests.py0000644000175000017500000000510712246357067020045 0ustar jenkinsjenkins# Copyright (C) 2008 Lars Wirzenius # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 2 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License along # with this program; if not, write to the Free Software Foundation, Inc., # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. import platform import errno import os import shutil import tempfile import unittest import obnamlib from obnamlib import _obnam class LocalFSTests(obnamlib.VfsTests, unittest.TestCase): def setUp(self): self.basepath = tempfile.mkdtemp() self.fs = obnamlib.LocalFS(self.basepath) def tearDown(self): self.fs.close() shutil.rmtree(self.basepath) def test_joins_relative_path_ok(self): self.assertEqual(self.fs.join('foo'), os.path.join(self.basepath, 'foo')) def test_join_treats_absolute_path_as_absolute(self): self.assertEqual(self.fs.join('/foo'), '/foo') def test_get_username_returns_root_for_zero(self): self.assertEqual(self.fs.get_username(0), 'root') def test_get_groupname_returns_root_for_zero(self): root = 'wheel' if platform.system() == 'FreeBSD' else 'root' self.assertEqual(self.fs.get_groupname(0), root) class XAttrTests(unittest.TestCase): '''Tests for extended attributes.''' def setUp(self): fd, self.filename = tempfile.mkstemp() os.close(fd) def test_empty_list(self): '''A new file has no extended attributes.''' 
self.assertEqual(_obnam.llistxattr(self.filename), "") def test_lsetxattr(self): '''lsetxattr() sets an attribute on a file.''' _obnam.lsetxattr(self.filename, "user.key", "value") _obnam.lsetxattr(self.filename, "user.hello", "world") self.assertEqual(sorted(_obnam.llistxattr(self.filename).strip("\0").split("\0")), ["user.hello", "user.key"]) def test_lgetxattr(self): '''lgetxattr() gets the value of an attribute set on the file.''' _obnam.lsetxattr(self.filename, "user.hello", "world") self.assertEqual(_obnam.lgetxattr(self.filename, "user.hello"), "world") obnam-1.6.1/read-live-data-with-sftp0000755000175000017500000000243612246357067017150 0ustar jenkinsjenkins#!/usr/bin/python # Copyright 2012 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . 
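The NFS/VFAT strategy that `LocalFS.write_file` describes above — write to a temporary file in the target directory, publish it with `link(2)` (atomic even on NFS), and fall back to `O_EXCL` plus `rename` on filesystems without hard links — can be sketched standalone. This is a simplified, hypothetical illustration of the technique, not obnam's actual implementation (which also tracks byte counts and chunks the writes):

```python
import os
import tempfile

def write_file_atomically(path, contents):
    """Create `path` with `contents`; raise OSError if it already exists.

    Sketch of the LocalFS.write_file strategy: temp file + link(2),
    with an O_EXCL-and-rename fallback for link-less filesystems.
    """
    dirname = os.path.dirname(path) or '.'
    fd, tempname = tempfile.mkstemp(dir=dirname)
    try:
        os.write(fd, contents)
    finally:
        os.close(fd)
    try:
        # link(2) is atomic and fails with EEXIST if path exists.
        os.link(tempname, path)
    except OSError:
        # No hard links here (e.g. VFAT): claim the name with O_EXCL,
        # then move the temporary file onto it.
        try:
            fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except OSError:
            os.remove(tempname)
            raise
        os.close(fd)
        os.rename(tempname, path)
    else:
        os.remove(tempname)
```

Either path leaves `path` fully written or raises without clobbering an existing file, which is the property the lock files built on `write_file` depend on.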
import stat import sys import ttystatus from obnamlib.plugins.sftp_plugin import SftpFS ts = ttystatus.TerminalStatus(period=0.1) ts['bytes'] = 0 ts.format( '%ElapsedTime() %Counter(pathname) %ByteSize(bytes) ' '%ByteSpeed(bytes) %Pathname(pathname)') url = sys.argv[1] fs = SftpFS(url) fs.connect() for pathname, st in fs.scan_tree('.'): ts['pathname'] = pathname if stat.S_ISREG(st.st_mode): f = fs.open(pathname, 'rb') while True: data = f.read(1024**2) if not data: break ts['bytes'] += len(data) f.close() ts.finish() obnam-1.6.1/run-benchmarks0000755000175000017500000000155712246357067015370 0ustar jenkinsjenkins#!/bin/sh # Copyright 2012 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . set -ue for x in confs/*.conf do if [ "$x" != confs/common.conf ] then ./obnam-benchmark --no-default-config --config=confs/common.conf \ --config="$x" "$@" fi done obnam-1.6.1/sed-in-place0000755000175000017500000000037512246357067014707 0ustar jenkinsjenkins#!/bin/sh # # Do a sed in place for a set of files. This is like GNU sed -i, but # we can't assume GNU sed. 
set -eu sedcmd="$1" shift for filename in "$@" do temp="$(mktemp)" sed "$sedcmd" "$filename" > "$temp" mv "$temp" "$filename" doneobnam-1.6.1/setup.py0000644000175000017500000001275412246357067014236 0ustar jenkinsjenkins#!/usr/bin/python # Copyright (C) 2008-2012 Lars Wirzenius # # This program is free software; you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation; either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License along # with this program; if not, write to the Free Software Foundation, Inc., # 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA. from distutils.core import setup, Extension from distutils.cmd import Command from distutils.command.build import build from distutils.command.clean import clean import glob import os import shutil import subprocess import sys import tempfile # We need to know whether we can run yarn. We do this by checking # the python-markdown version: if it's new enough, we assume yarn # is available, and if it isn't, yarn won't be available since it # won't work with old versions (e.g., the one in Debian squeeze.) 
try:
    import markdown
except ImportError:
    got_yarn = False
else:
    if (hasattr(markdown, 'extensions') and
            hasattr(markdown.extensions, 'Extension')):
        got_yarn = True
    else:
        got_yarn = False


def runcmd(*args, **kwargs):
    try:
        subprocess.check_call(*args, **kwargs)
    except subprocess.CalledProcessError, e:
        sys.stderr.write('ERROR: %s\n' % str(e))
        sys.exit(1)


class GenerateManpage(build):

    def run(self):
        build.run(self)
        print 'building manpages'
        for x in ['obnam', 'obnam-benchmark']:
            with open('%s.1' % x, 'w') as f:
                runcmd(['python', x, '--generate-manpage=%s.1.in' % x,
                        '--output=%s.1' % x], stdout=f)


class CleanMore(clean):

    def run(self):
        clean.run(self)
        for x in ['obnam.1', 'obnam-benchmark.1', '.coverage',
                  'obnamlib/_obnam.so']:
            if os.path.exists(x):
                os.remove(x)
        self.remove_pyc('obnamlib')
        self.remove_pyc('test-plugins')
        if os.path.isdir('build'):
            shutil.rmtree('build')

    def remove_pyc(self, rootdir):
        for dirname, subdirs, basenames in os.walk(rootdir):
            for x in [os.path.join(dirname, base)
                      for base in basenames
                      if base.endswith('.pyc')]:
                os.remove(x)


class Check(Command):

    user_options = [
        ('unit-only', 'u', 'run unit tests only?'),
        ('fast', 'f', 'run fast tests only?'),
        ('network', 'n', 'run network tests to localhost?'),
        ('network-only', 'N', 'only run network tests to localhost?'),
    ]

    def initialize_options(self):
        self.unit_only = False
        self.fast = False
        self.network = False
        self.network_only = False

    def finalize_options(self):
        pass

    def run(self):
        local = not self.network_only
        network = self.network or self.network_only
        fast = self.fast
        slow = not self.fast and not self.unit_only

        if local and (self.unit_only or fast):
            print "run unit tests"
            runcmd(['python', '-m', 'CoverageTestRunner',
                    '--ignore-missing-from=without-tests'])
            os.remove('.coverage')

        if local and fast:
            print "run black box tests"
            runcmd(['cmdtest', 'tests'])
            if got_yarn:
                runcmd(
                    ['yarn', '-s', 'yarns/obnam.sh'] +
                    glob.glob('yarns/*.yarn'))

        num_clients = '2'
        num_generations = '16'

        if local and slow:
            print "run locking tests"
            test_repo = tempfile.mkdtemp()
            runcmd(['./test-locking', num_clients, num_generations,
                    test_repo, test_repo])
            shutil.rmtree(test_repo)

        if local and slow:
            print "run crash test"
            runcmd(['./crash-test', '200'])

        if network and fast:
            print "run sftp tests"
            runcmd(['./test-sftpfs'])

        if network and fast:
            print "re-run black box tests using localhost networking"
            env = dict(os.environ)
            env['OBNAM_TEST_SFTP_ROOT'] = 'yes'
            env['OBNAM_TEST_SFTP_REPOSITORY'] = 'yes'
            runcmd(['cmdtest', 'tests'], env=env)

        if network and slow:
            print "re-run locking tests using localhost networking"
            test_repo = tempfile.mkdtemp()
            repo_url = 'sftp://localhost/%s' % test_repo
            runcmd(['./test-locking', num_clients, num_generations,
                    repo_url, test_repo])
            shutil.rmtree(test_repo)

        print "setup.py check done"


setup(name='obnam',
      version='1.6.1',
      description='Backup software',
      author='Lars Wirzenius',
      author_email='liw@liw.fi',
      url='http://liw.fi/obnam/',
      scripts=['obnam', 'obnam-benchmark', 'obnam-viewprof'],
      packages=['obnamlib', 'obnamlib.plugins'],
      ext_modules=[Extension('obnamlib._obnam', sources=['_obnammodule.c'])],
      data_files=[('share/man/man1', glob.glob('*.1'))],
      cmdclass={
          'build': GenerateManpage,
          'check': Check,
          'clean': CleanMore,
      },
     )
obnam-1.6.1/test-data/
obnam-1.6.1/test-data/repo-format-5-encrypted-gzipped.tar.gz
[binary data omitted]
obnam-1.6.1/test-gpghome/random_seed
[binary data omitted]
obnam-1.6.1/test-gpghome/secring.gpg
[binary data omitted]
obnam-1.6.1/test-gpghome/trustdb.gpg
[binary data omitted]
obnam-1.6.1/test-lock-files
#!/bin/sh
#
# Obnam test: test lock files.
#
# Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

die() {
    echo "$@" 1>&2
    exit 1
}

[ "$#" = 2 ] || die "Bad usage, read source!"

NCLIENTS="$1"
COUNTER="$2"

directory="$(mktemp -d)"
pids="$(mktemp)"

for i in $(seq "$NCLIENTS")
do
    ./lock-and-increment "$directory" "$COUNTER" &
    echo "$!" >> "$pids"
done

errors=0
for pid in $(cat "$pids")
do
    if ! wait "$pid"
    then
        echo "at least one client failed" 1>&2
        errors=1
    fi
done

n=$(cat "$directory/counter")
wanted=$(expr "$COUNTER" '*' "$NCLIENTS")
if [ "$n" != "$wanted" ]
then
    echo "counted to $n should be $wanted" 1>&2
    errors=1
fi

rm -rf "$directory" "$pids"
exit $errors
obnam-1.6.1/test-locking
#!/bin/sh
#
# Obnam test: test locking with multiple clients accessing the same repo.
#
# Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

die() {
    echo "$@" 1>&2
    exit 1
}

[ "$#" = 4 ] || die "Bad usage, read source!"

NCLIENTS="$1"
NGENERATIONS="$2"
repourl="$3"
repopath="$4"

tempdir="$(mktemp -d)"
pids="$tempdir/pids"

echo "Starting backups: $NCLIENTS clients $NGENERATIONS generations"
echo "Using temporary directory $tempdir"

for i in $(seq "$NCLIENTS")
do
    ./test-many-generations "$NGENERATIONS" "$repourl" "$repopath" \
        "client-$i" > "client-$i.output" 2>&1 &
    echo "$!" >> "$pids"
done

echo "Waiting for clients to finish... this may take a long time"
errors=0
for pid in $(cat "$pids")
do
    if ! wait "$pid"
    then
        if [ "$errors" = 0 ]
        then
            echo "at least one client failed" 1>&2
        fi
        errors=1
    fi
done

if [ "$errors" = 0 ]
then
    rm -rf "$tempdir"
fi

exit $errors
obnam-1.6.1/test-many-generations
#!/bin/bash
#
# Obnam test: backup and verify many generations of data.
#
# Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -x

die() {
    echo "$@" 1>&2
    exit 1
}

[ "$#" = 4 ] || die "Bad usage, read source!"
N="$1"
repourl="$2"
repopath="$3"
client="$4"

root="$(mktemp -d)"
log="$(mktemp)"
amount="1k"

conf="$(mktemp)"
cat <<EOF > "$conf"
[config]
client-name = $client
quiet = yes
log = $client.log
trace = obnamlib, larch
repository = $repourl
root = $root
EOF

echo "Configuration:"
cat "$conf"
echo

echo "Making backups"
seq "$N" |
while read gen
do
    genbackupdata --quiet --create="$amount" "$root" --seed="$RANDOM"
    find "$root" -exec touch -d "1970-01-01 00:00:$gen" '{}' ';'
    ./verification-test backup "$repopath" "$root" "$conf" \
        >> "$client-verif.output" 2>&1 || exit 1
done || exit 1

echo "Verifying results"
while true
do
    ./verification-test verify "$repopath" "$root" "$conf"
    ret="$?"
    if [ "$ret" = 0 ]
    then
        break
    elif [ "$ret" != 42 ]
    then
        echo "$client failed verification" 1>&2
        exit 1
    fi
done || exit 1

rm -rf "$conf" "$root" "$log"
obnam-1.6.1/test-paramiko
# Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
'''Test paramiko by doing an sftp copy from localhost.'''

import errno
import os
import paramiko
import pwd
import socket
import subprocess
import sys
import tempfile
import time


class SSHChannelAdapter(object):

    '''Take an ssh subprocess and pretend it is a paramiko Channel.'''

    def __init__(self, proc):
        self.proc = proc

    def send(self, data):
        return os.write(self.proc.stdin.fileno(), data)

    def recv(self, count):
        try:
            return os.read(self.proc.stdout.fileno(), count)
        except socket.error, e:
            if e.args[0] in (errno.EPIPE, errno.ECONNRESET,
                             errno.ECONNABORTED, errno.EBADF):
                # Connection has closed. Paramiko expects an empty string in
                # this case, not an exception.
                return ''
            raise

    def get_name(self):
        return 'obnam SSHChannelAdapter'

    def close(self):
        proc = self.proc
        for func in [proc.stdin.close, proc.stdout.close, proc.wait]:
            try:
                func()
            except OSError:
                pass


username = pwd.getpwuid(os.getuid()).pw_name
host = 'localhost'
port = 22
path = '/tmp/zeroes'
subsystem = 'sftp'

args = ['ssh',
        '-oForwardX11=no', '-oForwardAgent=no',
        '-oClearAllForwardings=yes', '-oProtocol=2']
if port is not None:
    args.extend(['-p', str(port)])
if username is not None:
    args.extend(['-l', username])
args.extend(['-s', host, subsystem])

proc = transport = None
try:
    raise OSError(None, None, None)
    proc = subprocess.Popen(args,
                            stdin=subprocess.PIPE,
                            stdout=subprocess.PIPE,
                            close_fds=True)
except OSError:
    transport = paramiko.Transport((host, port))
    transport.connect()

    agent = paramiko.Agent()
    agent_keys = agent.get_keys()
    for key in agent_keys:
        try:
            transport.auth_publickey(username, key)
            break
        except paramiko.SSHException:
            pass
    else:
        raise Exception('no auth')

    try:
        keys = paramiko.util.load_host_keys(
            os.path.expanduser('~/.ssh/known_hosts'))
    except IOError:
        print '*** Unable to open host keys file'
        sys.exit(1)

    t = transport
    hostname = host
    key = t.get_remote_server_key()
    k = keys.lookup(hostname)
    if k is None:
        print 'unknown host %s (using it anyway)' % hostname
    else:
        key_type = key.get_name()
        if not k.has_key(key_type):
            print 'no key of type %s for %s' % (key_type, hostname)
            print 'accepting host key anyway'
        else:
            host_key = k[key_type]
            if host_key != key:
                print 'wrong host key for %s' % hostname
                sys.exit(1)
            else:
                print 'good host key for %s' % hostname

    sftp = paramiko.SFTPClient.from_transport(transport)
else:
    sftp = paramiko.SFTPClient(SSHChannelAdapter(proc))

n = 0
f = sftp.open(path)
start = time.time()
while True:
    data = f.read(32*1024)
    if not data:
        break
    n += len(data)
end = time.time()
f.close()
sftp.close()
if transport:
    transport.close()
if proc:
    proc.close()

duration = end - start
n = float(n)
print duration, n/1024/1024, 8 * n / duration / 1024 / 1024
obnam-1.6.1/test-plugins/
obnam-1.6.1/test-plugins/aaa_hello_plugin.py
import pluginmgr

class Hello(pluginmgr.Plugin):

    def __init__(self, foo, bar=None):
        self.foo = foo
        self.bar = bar
obnam-1.6.1/test-plugins/hello_plugin.py
import pluginmgr

class Hello(pluginmgr.Plugin):

    def __init__(self, foo, bar=None):
        self.foo = foo
        self.bar = bar

    @property
    def version(self):
        return '0.0.1'
obnam-1.6.1/test-plugins/oldhello_plugin.py
import pluginmgr

class Hello(pluginmgr.Plugin):

    def __init__(self, foo, bar=None):
        self.foo = foo
        self.bar = bar
obnam-1.6.1/test-plugins/wrongversion_plugin.py
# This is a test plugin that requires a newer application version than
# what the test harness specifies.
import pluginmgr

class WrongVersion(pluginmgr.Plugin):

    required_application_version = '9999.9.9'

    def __init__(self, *args, **kwargs):
        pass
obnam-1.6.1/test-sftpfs
#!/usr/bin/python
# Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

'''Test SftpFS.

This can't be part of the normal unit tests, since it requires access
to a (real) ssh server. To run these tests, you must arrange for
localhost to be able to accept ssh connections using the ssh agent.
'''

import logging
import os
import pwd
import shutil
import tempfile
import unittest

import obnamlib
import obnamlib.plugins.sftp_plugin


class SftpTests(unittest.TestCase, obnamlib.VfsTests):

    def setUp(self):
        self.basepath = tempfile.mkdtemp()
        baseurl = 'sftp://localhost%s' % self.basepath
        settings = {
            'pure-paramiko': False,
            'create': True,
            'sftp-delay': 0,
            'ssh-key': '',
            'strict-ssh-host-keys': False,
            'ssh-known-hosts': os.path.expanduser('~/.ssh/known_hosts'),
            'ssh-command': None,
            'ssh-host-keys-check': 'no',
        }
        self.fs = obnamlib.plugins.sftp_plugin.SftpFS(baseurl,
                                                      settings=settings)
        self.fs.connect()

    def tearDown(self):
        self.fs.close()
        shutil.rmtree(self.basepath)

    def test_sets_path_to_absolute_path(self):
        self.assert_(self.fs.path.startswith('/'))

    def test_resolves_magic_homedir_prefix(self):
        baseurl = 'sftp://localhost/~/'
        settings = {
            'pure-paramiko': False,
            'create': True,
            'sftp-delay': 0,
            'ssh-key': '',
            'strict-ssh-host-keys': False,
            'ssh-known-hosts': os.path.expanduser('~/.ssh/known_hosts'),
            'ssh-command': None,
            'ssh-host-keys-check': 'no',
        }
        fs = obnamlib.plugins.sftp_plugin.SftpFS(baseurl,
                                                 settings=settings)
        fs.connect()
        homedir = pwd.getpwuid(os.getuid()).pw_dir
        self.assertEqual(fs._initial_dir, homedir)
        self.assertEqual(fs.getcwd(), homedir)

    def test_initial_cwd_is_basepath(self):
        self.assertEqual(self.fs.getcwd(), self.fs.path)

    def test_link_creates_hard_link(self):
        pass # sftp does not support hardlinking, so not testing it

    def test_mknod_creates_fifo(self):
        self.assertRaises(NotImplementedError, self.fs.mknod, 'foo', 0)

    # Override method from the VfsTests class. SFTP doesn't do sub-second
    # timestamps, so we fix the test here to not set those fields to nonzero.
    def test_lutimes_sets_times_correctly(self):
        self.fs.mkdir('foo')
        self.fs.lutimes('foo', 1, 2*1000, 3, 4*1000)
        self.assertEqual(self.fs.lstat('foo').st_atime_sec, 1)
        self.assertEqual(self.fs.lstat('foo').st_atime_nsec, 0)
        self.assertEqual(self.fs.lstat('foo').st_mtime_sec, 3)
        self.assertEqual(self.fs.lstat('foo').st_mtime_nsec, 0)

    def test_get_username_returns_None_for_zero(self):
        self.assertEqual(self.fs.get_username(0), None)

    def test_get_groupname_returns_None_for_zero(self):
        self.assertEqual(self.fs.get_groupname(0), None)


if __name__ == '__main__':
    logging.basicConfig(filename='/dev/null')
    unittest.main()
obnam-1.6.1/tests/
obnam-1.6.1/tests/anyone-list-clients.script
#!/bin/sh
# Copyright 2013 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

# The clients command can be run even by a client not in the list of clients
# that previously did a backup.
set -eu

$SRCDIR/tests/backup
$SRCDIR/obnam -r $DATADIR/repo --client=someotherclient clients
obnam-1.6.1/tests/anyone-list-clients.stdout
rainyday
obnam-1.6.1/tests/backup
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

# Helper script for cmdtest tests: run a backup.

set -e

$SRCDIR/tests/obnam backup "$(cat $DATADIR/rooturl)" "$@"
obnam-1.6.1/tests/backup-pretend.script
#!/bin/sh
# Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
set -eu $SRCDIR/tests/backup summain -r "$DATADIR/repo/metadata" > "$DATADIR/repo.summain" $SRCDIR/tests/backup --pretend summain -r "$DATADIR/repo/metadata" > "$DATADIR/repo2.summain" diff -u "$DATADIR/repo.summain" "$DATADIR/repo2.summain" obnam-1.6.1/tests/compression+encryption.script0000755000175000017500000000154712246357067021644 0ustar jenkinsjenkins#!/bin/sh # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . set -e gpgkey='3B1802F81B321347' $SRCDIR/tests/backup --encrypt-with="$gpgkey" --compress-with=gzip $SRCDIR/tests/restore --encrypt-with="$gpgkey" --compress-with=gzip $SRCDIR/tests/verify obnam-1.6.1/tests/compression.script0000755000175000017500000000143312246357067017450 0ustar jenkinsjenkins#!/bin/sh # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . 
set -e

$SRCDIR/tests/backup --compress-with=gzip
$SRCDIR/tests/restore --compress-with=gzip
$SRCDIR/tests/verify
obnam-1.6.1/tests/convert5to6.script
#!/bin/sh
# Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

# Do not include roots from previous generations in a new generation,
# when the user stops specifying them.

set -eu

cd "$DATADIR"
tar -xf "$SRCDIR/test-data/repo-format-5-encrypted-gzipped.tar.gz"

"$SRCDIR/tests/obnam" convert5to6 \
    -r repo --encrypt-with=1B321347 --compress-with=gzip
"$SRCDIR/tests/obnam" restore \
    -r repo --encrypt-with=1B321347 --compress-with=gzip --to=restored

(cd "restored/home/liw/obnam/convert-5to6/t" &&
    summain --exclude=Username --exclude=Uid \
        --exclude=Group --exclude=Gid data) \
    > restored.summain

# Not all filesystems support nanosecond timestamps, but that doesn't
# matter for us. So we remove the sub-second timestamps from both
# summain files.
"$SRCDIR/sed-in-place" '/^Mtime:/s/\.[0-9]* /.IGNORED /' \
    data.summain restored.summain

diff -u data.summain restored.summain
obnam-1.6.1/tests/encryption-adds-key.script
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

gpgkey='3B1802F81B321347'
gpgkey2='DF3D13AA11E69900'

$SRCDIR/tests/backup --encrypt-with="$gpgkey"
$SRCDIR/tests/obnam --encrypt-with="$gpgkey" add-key --keyid="$gpgkey2"
$SRCDIR/tests/obnam --encrypt-with="$gpgkey" list-keys | grep '^key:'
obnam-1.6.1/tests/encryption-adds-key.stdout
key: DF3D13AA11E69900
key: 3B1802F81B321347
obnam-1.6.1/tests/encryption-has-client-key-after-backup.script
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

gpgkey='3B1802F81B321347'

$SRCDIR/tests/backup --encrypt-with="$gpgkey"
$SRCDIR/tests/obnam --encrypt-with="$gpgkey" client-keys
obnam-1.6.1/tests/encryption-has-client-key-after-backup.stdout
rainyday 3B1802F81B321347
obnam-1.6.1/tests/encryption-removes-client.script
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

gpgkey='3B1802F81B321347'

$SRCDIR/tests/backup --encrypt-with="$gpgkey"
$SRCDIR/tests/obnam --encrypt-with="$gpgkey" remove-client rainyday
$SRCDIR/tests/obnam --encrypt-with="$gpgkey" client-keys
obnam-1.6.1/tests/encryption-removes-key.script
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

gpgkey='3B1802F81B321347'
gpgkey2='DF3D13AA11E69900'

$SRCDIR/tests/backup --encrypt-with="$gpgkey"
$SRCDIR/tests/obnam --encrypt-with="$gpgkey" add-key --keyid="$gpgkey2"
$SRCDIR/tests/obnam --encrypt-with="$gpgkey" remove-key --keyid="$gpgkey2"
$SRCDIR/tests/obnam --encrypt-with="$gpgkey" list-keys | grep '^key:'
obnam-1.6.1/tests/encryption-removes-key.stdout
key: 3B1802F81B321347
obnam-1.6.1/tests/encryption-replaces-key.script
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -eu

gpgkey='3B1802F81B321347'
fingerprint='4E2AF28A3D824CF2B3F1FE733B1802F81B321347'
gpgkey2='DF3D13AA11E69900'

# Make a backup with the default key ($gpgkey).
$SRCDIR/tests/backup --encrypt-with="$gpgkey"

# Add new key. "rainyday" is the name of the client.
$SRCDIR/tests/obnam --encrypt-with="$gpgkey" add-key --keyid="$gpgkey2" \
    rainyday

# Remove the old key.
$SRCDIR/tests/obnam --encrypt-with="$gpgkey2" remove-key --keyid="$gpgkey" \
    rainyday

# Remove the old key from the gpg keyring.
export GNUPGHOME="$DATADIR/gpg"
gpg --batch --delete-secret-key "$fingerprint" 2>/dev/null

# Verify that the backup is still readable, now with the new key.
$SRCDIR/tests/restore --encrypt-with="$gpgkey2"
$SRCDIR/tests/verify

==== obnam-1.6.1/tests/encryption-use.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

gpgkey='3B1802F81B321347'

$SRCDIR/tests/backup --encrypt-with="$gpgkey"
$SRCDIR/tests/restore --encrypt-with="$gpgkey"
$SRCDIR/tests/verify

==== obnam-1.6.1/tests/exclude-cachedir.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
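[Editorial aside.] The exclude-cachedir script below relies on the Cache Directory Tagging convention: a directory counts as a cache when it contains a file named `CACHEDIR.TAG` that begins with the exact 43-byte signature the script writes. A minimal sketch of such a check — the `dir` path is illustrative only, not something the tests use:

```shell
# Per the cache-tagging convention, the signature is exactly 43 bytes:
# "Signature: " followed by a fixed 32-character hex string.
sig='Signature: 8a477f597d28d172789f06886806bc55'
dir=/tmp/example-cache    # illustrative path only

# A tagged directory starts its CACHEDIR.TAG with the 43-byte signature.
if head -c 43 "$dir/CACHEDIR.TAG" 2>/dev/null | grep -qF "$sig"
then
    echo "cache directory: would be skipped by --exclude-caches"
fi
```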
set -e

cachedir="$DATADIR/data/cache"
mkdir "$cachedir"
echo -n 'Signature: 8a477f597d28d172789f06886806bc55' \
    > "$cachedir/CACHEDIR.TAG"
echo foo > "$cachedir/foo"

$SRCDIR/tests/backup --exclude-caches
$SRCDIR/tests/restore
find "$DATADIR/restored" -name cache

==== obnam-1.6.1/tests/fail-on-bad-file-checksum.exit ====
1

==== obnam-1.6.1/tests/fail-on-bad-file-checksum.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

$SRCDIR/tests/backup

# Remove the chunk checksum list, and then modify the chunks,
# so that we can do a restore without triggering "bad chunk checksum"
# errors. We only want to trigger the whole-file checksum error.
rm -rf "$DATADIR/repo/chunklist/nodes"
rm -rf "$DATADIR/repo/chunklist/refcounts"
rm -rf "$DATADIR/repo/chunklist/metadata"
find "$DATADIR/repo/chunks" -type f |
while read filename
do
    tr '\0-\377' '\200-\377\0-\177' < "$filename" > "$filename.new"
    mv "$filename.new" "$filename"
done

# Restore.
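[Editorial aside.] The `tr '\0-\377' '\200-\377\0-\177'` idiom above deserves a note: it maps every byte n to (n + 128) mod 256, flipping the top bit of each byte, so the chunk contents change while their length stays the same. A standalone demonstration:

```shell
# 'A' is 0x41; flipping the top bit (adding 128 mod 256) gives 0xc1.
printf 'A' | tr '\0-\377' '\200-\377\0-\177' | od -An -tx1    # c1
```

Applying the same pipeline twice restores the original bytes, since adding 128 twice is a no-op modulo 256.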
$SRCDIR/tests/restore

==== obnam-1.6.1/tests/fail-on-bad-file-checksum.stderr ====
ERROR: There were errors when restoring

==== obnam-1.6.1/tests/fail-on-mangled-chunk.exit ====
1

==== obnam-1.6.1/tests/fail-on-mangled-chunk.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

$SRCDIR/tests/backup

# Corrupt chunk files.
find "$DATADIR/repo/chunks" -type f -size +11c |
while read filename
do
    dd if="$filename" of="$filename.new" bs=1 count=10 2>/dev/null
    dd if="$filename" bs=1 skip=10 2>/dev/null | \
        tr '\0-\377' '\200-\377\0-\177' >> "$filename.new"
    mv "$filename.new" "$filename"
done

# Restore. This will fail, and output an error, which contains a chunk id.
# Blot out the id.
if $SRCDIR/tests/restore 2> "$DATADIR/stderr"
then
    cat "$DATADIR/stderr" 1>&2
    exit 0
else
    sed 's/chunk [[:digit:]]* checksum/chunk BLOTTED checksum/' \
        "$DATADIR/stderr" 1>&2
    exit 1
fi

==== obnam-1.6.1/tests/fail-on-mangled-chunk.stderr ====
ERROR: There were errors when restoring

==== obnam-1.6.1/tests/forget-removes-according-to-policy.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

echo aaa > "$DATADIR/data/aaa"
echo ccc > "$DATADIR/data/ccc"

$SRCDIR/tests/backup
$SRCDIR/tests/backup
$SRCDIR/tests/obnam genids > "$DATADIR/genids-1"
$SRCDIR/tests/obnam forget --keep=1d
$SRCDIR/sed-in-place 1d "$DATADIR/genids-1"
$SRCDIR/tests/obnam genids > "$DATADIR/genids-2"
diff -u "$DATADIR/genids-1" "$DATADIR/genids-2"

==== obnam-1.6.1/tests/forget-removes-nothing-by-default.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

echo aaa > "$DATADIR/data/aaa"
echo ccc > "$DATADIR/data/ccc"

$SRCDIR/tests/backup
$SRCDIR/tests/backup
$SRCDIR/tests/obnam genids > "$DATADIR/genids-1"
$SRCDIR/tests/obnam forget
$SRCDIR/tests/obnam genids > "$DATADIR/genids-2"
diff -u "$DATADIR/genids-1" "$DATADIR/genids-2"

==== obnam-1.6.1/tests/forget-removes-nothing-if-pretending.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
set -e

echo aaa > "$DATADIR/data/aaa"
echo ccc > "$DATADIR/data/ccc"

$SRCDIR/tests/backup
$SRCDIR/tests/backup
$SRCDIR/tests/obnam genids > "$DATADIR/genids-1"
$SRCDIR/tests/obnam forget --pretend --keep=1d
$SRCDIR/tests/obnam genids > "$DATADIR/genids-2"
diff -u "$DATADIR/genids-1" "$DATADIR/genids-2"

==== obnam-1.6.1/tests/forget-removes-specified-gens.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

echo aaa > "$DATADIR/data/aaa"
echo ccc > "$DATADIR/data/ccc"

$SRCDIR/tests/backup
$SRCDIR/tests/backup
$SRCDIR/tests/obnam genids > "$DATADIR/genids-1"
$SRCDIR/tests/obnam forget $(head -n1 "$DATADIR/genids-1")
$SRCDIR/sed-in-place 1d "$DATADIR/genids-1"
$SRCDIR/tests/obnam genids > "$DATADIR/genids-2"
diff -u "$DATADIR/genids-1" "$DATADIR/genids-2"

==== obnam-1.6.1/tests/forget-removes-unwanted-leaving-empty-generation.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

dd if=/dev/zero of="$DATADIR/data/aaa" bs=1M count=1 2> /dev/null

$SRCDIR/tests/backup --deduplicate=never
$SRCDIR/tests/backup --deduplicate=never
$SRCDIR/tests/obnam genids > "$DATADIR/genids-1"

# Add an empty generation.
find "$DATADIR/data" -mindepth 1 -delete
$SRCDIR/tests/backup --deduplicate=never

$SRCDIR/tests/obnam forget $(cat "$DATADIR/genids-1")

# Remove encryption metadata, if any.
rm -f "$DATADIR/repo/chunks/key"
rm -f "$DATADIR/repo/chunks/userkeys"

find "$DATADIR/repo/chunks" -type f -ls

==== obnam-1.6.1/tests/forget-removes-unwated-data.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

dd if=/dev/zero of="$DATADIR/data/aaa" bs=1M count=1 2> /dev/null

$SRCDIR/tests/backup --deduplicate=never
$SRCDIR/tests/backup --deduplicate=never
$SRCDIR/tests/obnam genids > "$DATADIR/genids-1"
$SRCDIR/tests/obnam forget $(cat "$DATADIR/genids-1")

# Remove encryption metadata, if any.
rm -f "$DATADIR/repo/chunks/key"
rm -f "$DATADIR/repo/chunks/userkeys"

find "$DATADIR/repo/chunks" -type f -ls

==== obnam-1.6.1/tests/hardlinks.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

# Hardlinks do not work with sftp/paramiko. Skip this test?
if [ "$OBNAM_TEST_SFTP_ROOT" = yes ]
then
    exit 0
fi

echo foobar > "$DATADIR/data/foo"
ln "$DATADIR/data/foo" "$DATADIR/data/bar"

$SRCDIR/tests/backup
$SRCDIR/tests/restore
$SRCDIR/tests/verify

==== obnam-1.6.1/tests/logs-for-owner-only.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

$SRCDIR/tests/backup --log="$DATADIR/obnam.log"

# Print out the permissions.
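[Editorial aside.] The mode printed by the lstat call that follows is expected to be 33152 (see logs-for-owner-only.stdout); decoding that value explains the test's name: 33152 is octal 100600, i.e. a regular file (S_IFREG, 0100000) with owner-only read/write permission bits (0600). A quick check:

```shell
# Decimal 33152 rendered in octal: S_IFREG (0100000) | 0600.
printf '%o\n' 33152    # 100600
```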
python -c "import sys, os; print os.lstat(sys.argv[1]).st_mode" "$DATADIR/obnam.log"

==== obnam-1.6.1/tests/logs-for-owner-only.stdout ====
33152

==== obnam-1.6.1/tests/ls-generation-timestamp.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

# This tests for a bug in Obnam as of 0.24.
# http://liw.fi/obnam/bugs/generation-time-stamps-wrong/

set -e

$SRCDIR/tests/backup --pretend-time='2007-08-12 01:02:03'

# Print out the generation timestamp.
$SRCDIR/tests/obnam ls | head -n1

==== obnam-1.6.1/tests/ls-generation-timestamp.stdout ====
Generation 2 (2007-08-12 01:02:03 - 2007-08-12 01:02:03)

==== obnam-1.6.1/tests/nagios-check.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

opts="--warn-age=1h --critical-age=1d"

# Backup, and pretend to do it during a particular time.
$SRCDIR/tests/backup --pretend-time="1999-01-01 00:00:00"

# Check there's a warning after one hour.
if ! $SRCDIR/tests/obnam $opts nagios-last-backup-age \
    --pretend-time="1999-01-01 01:00:01"
then
    echo 'correctly returned non-zero exit code'
else
    echo 'incorrectly returned zero exit code' 1>&2
    exit 1
fi

# Check there's an error.
if ! $SRCDIR/tests/obnam $opts nagios-last-backup-age \
    --pretend-time="1999-01-02 00:00:01"
then
    echo 'correctly returned non-zero exit code'
else
    echo 'incorrectly returned zero exit code' 1>&2
    exit 1
fi

==== obnam-1.6.1/tests/nagios-check.stdout ====
WARNING: backup is old. last backup was 1999-01-01 00:00:00.
correctly returned non-zero exit code
CRITICAL: backup is old. last backup was 1999-01-01 00:00:00.
correctly returned non-zero exit code

==== obnam-1.6.1/tests/named-pipe.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

# Named pipes do not work with sftp/paramiko. Skip this test?
if [ "$OBNAM_TEST_SFTP_ROOT" = yes ]
then
    exit 0
fi

# Create a named pipe.
mkfifo "$DATADIR/data/pipe"

$SRCDIR/tests/backup
$SRCDIR/tests/restore
$SRCDIR/tests/verify

==== obnam-1.6.1/tests/named-socket.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

# Named sockets do not work with sftp/paramiko. Skip this test?
if [ "$OBNAM_TEST_SFTP_ROOT" = yes ]
then
    exit 0
fi

# Create a named socket.
python -c '
import platform
if platform.system() == "FreeBSD":
    # Cannot mknod sockets on FreeBSD.
    raise SystemExit(0)
import os, stat
filename = os.path.join(os.environ["DATADIR"], "data", "socket")
os.mknod(filename, 0600 | stat.S_IFSOCK)
'

$SRCDIR/tests/backup
$SRCDIR/tests/restore
$SRCDIR/tests/verify

==== obnam-1.6.1/tests/no-roots-from-old-gens.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
# See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

# Do not include roots from previous generations in a new generation,
# when the user stops specifying them.

set -e

mkdir "$DATADIR/data/root1"
echo foo > "$DATADIR/data/root1/foo"

mkdir "$DATADIR/data/root2"
echo bar > "$DATADIR/data/root2/bar"

rooturl=$(cat $DATADIR/rooturl)

# Run the first backup with root1.
$SRCDIR/tests/obnam backup "$rooturl/root1"

# Run the second backup with root2.
$SRCDIR/tests/obnam backup "$rooturl/root2"

# Verify the latest generation has nothing from root1.
$SRCDIR/tests/obnam ls | grep root1 || true

==== obnam-1.6.1/tests/notify-error-during-backup.exit ====
1

==== obnam-1.6.1/tests/notify-error-during-backup.script ====
#!/bin/sh
# Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
set -e

echo aaa > "$DATADIR/data/aaa"
echo ccc > "$DATADIR/data/ccc"

$SRCDIR/tests/backup --testing-fail-matching ccc

==== obnam-1.6.1/tests/notify-error-during-backup.stderr ====
ERROR: Can't back up TMP/data/ccc: No such file or directory
ERROR: There were errors during the backup

==== obnam-1.6.1/tests/obnam ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

# Helper script for cmdtest tests: run obnam with standard test options.

export GNUPGHOME="$DATADIR/gpg"

out="$(mktemp)"
err="$(mktemp)"

$SRCDIR/obnam \
    --client-name=rainyday \
    --quiet \
    --no-default-config \
    -r "$(cat $DATADIR/repourl)" \
    --weak-random \
    --log="$DATADIR/obnam.log" \
    --trace=vfs \
    --trace=repo \
    "$@" > "$out" 2> "$err"
exit=$?

sed "s#$DATADIR#TMP#g" "$out"
sed "s#$DATADIR#TMP#g" "$err" 1>&2

rm -f "$out" "$err"

exit "$exit"

==== obnam-1.6.1/tests/pre-epoch-mtime.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

# This doesn't work for sftp restores. paramiko or sftp can't handle
# negative timestamps. Meh.
if [ "$OBNAM_TEST_SFTP_ROOT" = yes ]
then
    exit 0
fi

# It's possible to have timestamps before the epoch, i.e., negative
# ones. For example, in the UK during DST, "touch -t 197001010000"
# will create one.
echo foo > "$DATADIR/data/foo"
python -c '
import os
os.utime(os.path.join(os.environ["DATADIR"], "data", "foo"), (-3600, -3600))
'

$SRCDIR/tests/backup
$SRCDIR/tests/restore
$SRCDIR/tests/verify

==== obnam-1.6.1/tests/pretend-time.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

$SRCDIR/tests/backup --pretend-time='2007-08-12 01:02:03'

# Print out the generation timestamp.
$SRCDIR/tests/obnam generations | sed 's/([0-9]* files/(xx files/'

==== obnam-1.6.1/tests/pretend-time.stdout ====
2 2007-08-12 01:02:03 .. 2007-08-12 01:02:03 (xx files, 100000 bytes)

==== obnam-1.6.1/tests/remove-checkpoints.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

$SRCDIR/tests/backup --checkpoint=1k

# Check that there's only one generation.
$SRCDIR/tests/obnam genids | wc -l | sed 's/ *//'

==== obnam-1.6.1/tests/remove-checkpoints.stdout ====
1

==== obnam-1.6.1/tests/restore ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

# Helper script for cmdtest tests: restore a backup.
set -e

$SRCDIR/tests/obnam restore -r "$(cat $DATADIR/repourl)" \
    --to "$(cat $DATADIR/restoredurl)" "$@"

==== obnam-1.6.1/tests/restores-compressed-without-being-told-to.script ====
#!/bin/sh
# Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

root="$(cat $DATADIR/rooturl)"

genbackupdata "$DATADIR/data" --create=100k --quiet

$SRCDIR/tests/backup --compress-with=gzip
$SRCDIR/tests/restore
$SRCDIR/tests/verify

==== obnam-1.6.1/tests/restores-identical-data.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
set -e

$SRCDIR/tests/backup
$SRCDIR/tests/restore
$SRCDIR/tests/verify

==== obnam-1.6.1/tests/restores-single-file.script ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

# Create a file in the test data with a name and contents we know.
echo foobar > "$DATADIR/data/foo"

$SRCDIR/tests/backup
$SRCDIR/tests/restore "$DATADIR/data/foo"

# Verify it's OK.
diff "$DATADIR/data/foo" "$DATADIR/restored/$DATADIR/data/foo"

==== obnam-1.6.1/tests/root-is-symlink.script ====
#!/bin/sh
# Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

# It seems paramiko can't create dangling symlinks. Skip this if
# restoring over sftp.
if [ "$OBNAM_TEST_SFTP_ROOT" != yes ]
then
    # Make the backup root be a symlink.
    mv "$DATADIR/data" "$DATADIR/data.real"
    ln -s data.real "$DATADIR/data"

    $SRCDIR/tests/backup
    $SRCDIR/tests/restore

    # This is a workaround of a bug in summain 0.13. Can be removed when
    # a later version of summain is published and installed everywhere.
    mkdir "$DATADIR/restored/$DATADIR/data.real"

    $SRCDIR/tests/verify '
    {
        gsub(/^Name: \/.*\//, "Name: ")
        print
    }
    '
fi

==== obnam-1.6.1/tests/setup ====
#!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.

set -e

# Make a copy of the gpg homedir. This is so that we don't get gpg
# accidentally changing anything, such as the random seed file.
cp -a "$SRCDIR/test-gpghome" "$DATADIR/gpg"
chmod go= "$DATADIR/gpg"

# Generate some test data.
genbackupdata --quiet --create=100k "$DATADIR/data" if [ "$OBNAM_TEST_SFTP_REPOSITORY" = yes ] then REPO="sftp://localhost$DATADIR/repo" else REPO="$DATADIR/repo" fi echo "$REPO" > "$DATADIR/repourl" if [ "$OBNAM_TEST_SFTP_ROOT" = yes ] then ROOT="sftp://localhost$DATADIR/data" RESTORED="sftp://localhost$DATADIR/restored" else ROOT="$DATADIR/data" RESTORED="$DATADIR/restored" fi echo "$ROOT" > "$DATADIR/rooturl" echo "$RESTORED" > "$DATADIR/restoredurl" obnam-1.6.1/tests/sparse-files.script0000755000175000017500000000152612246357067017507 0ustar jenkinsjenkins#!/bin/sh # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . set -e # Create a sparse file. dd if=/dev/zero of="$DATADIR/data/sparse" bs=1M seek=1 count=0 2> /dev/null $SRCDIR/tests/backup $SRCDIR/tests/restore $SRCDIR/tests/verify obnam-1.6.1/tests/symlink.script0000755000175000017500000000177012246357067016601 0ustar jenkinsjenkins#!/bin/sh # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . set -e ln -s . "$DATADIR/data/symlink-with-existing-target" # It seems paramiko can't create dangling symlinks. Skip this if # restoring over sftp. if [ "$OBNAM_TEST_SFTP_ROOT" != yes ] then ln -s this-does-not-exist "$DATADIR/data/symlink-with-missing-target" fi $SRCDIR/tests/backup $SRCDIR/tests/restore $SRCDIR/tests/verify obnam-1.6.1/tests/teardown0000755000175000017500000000135712246357067015434 0ustar jenkinsjenkins#!/bin/sh # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . set -e chmod -R 0755 "$DATADIR" find "$DATADIR" -mindepth 1 -delete obnam-1.6.1/tests/two-generations.script0000755000175000017500000000177212246357067020242 0ustar jenkinsjenkins#!/bin/sh # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. 
# # You should have received a copy of the GNU General Public License # along with this program. If not, see . # Make two backup generations with different contents. Make sure the # last generation can be restored correctly. set -e find "$DATADIR/data" -mindepth 1 -delete echo foo > "$DATADIR/data/foo" $SRCDIR/tests/backup rm "$DATADIR/data/foo" echo bar > "$DATADIR/data/bar" $SRCDIR/tests/backup $SRCDIR/tests/restore $SRCDIR/tests/verify obnam-1.6.1/tests/two-roots.script0000755000175000017500000000276312246357067017063 0ustar jenkinsjenkins#!/bin/sh # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . # Back up two directories at once. set -e mkdir "$DATADIR/data2" echo foo > "$DATADIR/data2/foo" # The .../data directory is already specified by the backup script. $SRCDIR/tests/backup "$(cat $DATADIR/rooturl)2" $SRCDIR/tests/restore # Need to verify manually, since the verify script assumes data only. summain -r "$DATADIR/data" "$DATADIR/data2" > "$DATADIR/data.summain" summain -r "$DATADIR/restored/$DATADIR/data" \ "$DATADIR/restored/$DATADIR/data2" > "$DATADIR/restored.summain" # Timestamps are whole seconds with sftp, so we need to mangle the # summain output to remove sub-second timestamps.
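The timestamp mangling mentioned above can be tried on its own: the sed program deletes the fractional part of summain `Mtime:` lines, so that manifests taken over sftp (whole seconds) and locally (nanoseconds) compare equal. The sample line below is fabricated, and `sed -i` as used here is a GNU extension:

```shell
# Strip sub-second precision from a summain-style Mtime line, as the
# two-roots test does before diffing manifests. Sample input is made up.
sample=$(mktemp)
printf 'Mtime: 2011-08-10 12:00:00.123456789 +0000\n' > "$sample"
sed -i '/^Mtime:/s/\.[[:digit:]]\+ / /' "$sample"
cat "$sample"   # prints: Mtime: 2011-08-10 12:00:00 +0000
```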
if [ "$OBNAM_TEST_SFTP_ROOT" = yes ] then sed -i '/^Mtime:/s/\.[[:digit:]]\+ / /' \ "$DATADIR/data.summain" \ "$DATADIR/restored.summain" fi diff -u "$DATADIR/data.summain" "$DATADIR/restored.summain" obnam-1.6.1/tests/unreadable-dir.exit0000644000175000017500000000000212246357067017416 0ustar jenkinsjenkins1 obnam-1.6.1/tests/unreadable-dir.script0000755000175000017500000000244012246357067017764 0ustar jenkinsjenkins#!/bin/sh # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . set -e echo aaa > "$DATADIR/data/aaa" mkdir -m 0 "$DATADIR/data/bbb" echo ccc > "$DATADIR/data/ccc" if $SRCDIR/tests/backup then exit=0 else exit=1 fi $SRCDIR/tests/restore # Remove the problematic directory so that verify works. # Don't do this if running as root, since in that case # obnam _can_ back it up. (Yes, this is convoluted.) # When removing the directory, make sure the mtime doesn't # change of the parent. 
if [ "$(whoami)" != root ] then touch -r "$DATADIR/data" "$DATADIR/timestamp" rmdir "$DATADIR/data/bbb" touch -r "$DATADIR/timestamp" "$DATADIR/data" fi $SRCDIR/tests/verify exit $exit obnam-1.6.1/tests/unreadable-dir.stderr0000644000175000017500000000014012246357067017753 0ustar jenkinsjenkinsERROR: Can't back up TMP/data/bbb: Permission denied ERROR: There were errors during the backup obnam-1.6.1/tests/unreadable-file.exit0000644000175000017500000000000212246357067017557 0ustar jenkinsjenkins1 obnam-1.6.1/tests/unreadable-file.script0000755000175000017500000000247312246357067020133 0ustar jenkinsjenkins#!/bin/sh # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . set -e echo aaa > "$DATADIR/data/aaa" echo bbb > "$DATADIR/data/bbb" chmod 0 "$DATADIR/data/bbb" echo ccc > "$DATADIR/data/ccc" if $SRCDIR/tests/backup then exit=0 else exit=1 fi $SRCDIR/tests/restore # Remove the problematic directory so that verify works. # Don't do this if running as root, since in that case # obnam _can_ back it up. (Yes, this is convoluted.) # When removing the directory, make sure the mtime doesn't # change of the parent. 
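The `touch -r` dance used by these unreadable-dir/unreadable-file tests is a general trick: save the parent directory's mtime onto a stamp file before modifying it, then copy the saved mtime back afterwards, so a later manifest comparison does not see a spurious change. A standalone sketch (the paths are invented):

```shell
# Preserve a directory's mtime across the removal of one of its entries,
# using a stamp file to carry the timestamp.
set -e
parent=$(mktemp -d)
touch "$parent/child"
stamp=$(mktemp)
touch -r "$parent" "$stamp"   # save the parent's mtime onto the stamp
rm "$parent/child"            # this updates the parent's mtime...
touch -r "$stamp" "$parent"   # ...so copy the saved mtime back
```

After the last `touch -r`, the parent and the stamp carry identical mtimes, as if the removal had never touched the parent.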
if [ "$(whoami)" != root ] then touch -r "$DATADIR/data" "$DATADIR/timestamp" rm -f "$DATADIR/data/bbb" touch -r "$DATADIR/timestamp" "$DATADIR/data" fi $SRCDIR/tests/verify exit $exit obnam-1.6.1/tests/unreadable-file.stderr0000644000175000017500000000014012246357067020114 0ustar jenkinsjenkinsERROR: Can't back up TMP/data/bbb: Permission denied ERROR: There were errors during the backup obnam-1.6.1/tests/use-old-node-size.script0000755000175000017500000000153112246357067020351 0ustar jenkinsjenkins#!/bin/sh # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . set -e $SRCDIR/tests/backup --node-size=4k $SRCDIR/tests/backup --node-size=16k # Check that there's two generations. $SRCDIR/tests/obnam $opts genids | wc -l | sed 's/ *//' obnam-1.6.1/tests/use-old-node-size.stdout0000644000175000017500000000000212246357067020354 0ustar jenkinsjenkins2 obnam-1.6.1/tests/verifies-randomly.script0000755000175000017500000000152612246357067020551 0ustar jenkinsjenkins#!/bin/sh # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. 
# # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . set -e echo aaa > "$DATADIR/data/aaa" echo ccc > "$DATADIR/data/ccc" $SRCDIR/tests/backup $SRCDIR/tests/obnam verify --root="$(cat $DATADIR/rooturl)" --verify-randomly=1 obnam-1.6.1/tests/verifies-randomly.stdout0000644000175000017500000000003612246357067020557 0ustar jenkinsjenkinsVerify did not find problems. obnam-1.6.1/tests/verify0000755000175000017500000000262712246357067015116 0ustar jenkinsjenkins#!/bin/sh # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . # Helper script for cmdtest tests: verify restored data. set -e summain -r "$DATADIR/data" > "$DATADIR/data.summain" summain -r "$DATADIR/restored/$DATADIR/data" > "$DATADIR/restored.summain" # Timestamps are whole seconds with sftp, so we need to mangle the # summain output to remove sub-second timestamps. if [ "$OBNAM_TEST_SFTP_ROOT" = yes ] then "$SRCDIR/sed-in-place" '/^Mtime:/s/\.[[:digit:]]\+ / /' \ "$DATADIR/data.summain" \ "$DATADIR/restored.summain" fi # Allow caller to mangle further. 
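The mangling hook in the verify helper lets a caller pass an awk program that is applied to both manifests before diffing; root-is-symlink.script uses it to strip directory prefixes from `Name:` lines. The same gsub can be exercised on a fabricated manifest line:

```shell
# Apply a caller-style awk mangling program to a fabricated manifest
# line, mirroring the gsub that root-is-symlink.script passes to verify.
manifest=$(mktemp)
printf 'Name: /tmp/datadir/data.real\n' > "$manifest"
awk '{ gsub(/^Name: \/.*\//, "Name: "); print }' "$manifest" > "$manifest.new"
mv "$manifest.new" "$manifest"
cat "$manifest"   # prints: Name: data.real
```

The greedy `.*` makes gsub consume everything up to the last slash, leaving only the basename after `Name: `.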
if [ "x$1" != x ] then for x in "$DATADIR/data.summain" "$DATADIR/restored.summain" do awk "$1" "$x" > "$x.new" mv "$x.new" "$x" done fi diff -u "$DATADIR/data.summain" "$DATADIR/restored.summain" obnam-1.6.1/tests/verify-notices-changes.script0000755000175000017500000000166712246357067021474 0ustar jenkinsjenkins#!/bin/sh # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . set -e echo aaa > "$DATADIR/data/aaa" echo ccc > "$DATADIR/data/ccc" $SRCDIR/tests/backup touch -d "1970-01-02 03:04:05" "$DATADIR/data/aaa" $SRCDIR/tests/obnam verify "$(cat $DATADIR/rooturl)" 2>&1 | sed "s,$DATADIR,TMP,g" | sed '/st_mtime/s/([^)]*)/(...)/' obnam-1.6.1/tests/verify-notices-changes.stdout0000644000175000017500000000010212246357067021466 0ustar jenkinsjenkinsverify failure: TMP/data/aaa: metadata change: st_mtime_sec (...) obnam-1.6.1/tests/xattr-change-only.script0000755000175000017500000000241412246357067020453 0ustar jenkinsjenkins#!/bin/sh # Copyright 2012 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . # Make two backup generations with different contents. Make sure the # last generation can be restored correctly. set -e # This only works with local filesystem access, not over sftp. # It also requires user_xattr to be set on the filesystem. # If both requirements are not met, skip the test. echo foo > "$DATADIR/data/foo" if [ "$OBNAM_TEST_SFTP_ROOT" != yes ] && setfattr --name=user.foo --value=bar "$DATADIR/data/foo" 2>/dev/null then $SRCDIR/tests/backup setfattr --name=user.foo --value=bar "$DATADIR/data/foo" $SRCDIR/tests/backup $SRCDIR/tests/restore $SRCDIR/tests/verify fi obnam-1.6.1/tests/xattr-empty.script0000755000175000017500000000216112246357067017404 0ustar jenkinsjenkins#!/bin/sh # Copyright 2012 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . # Backup an extended attribute that is empty. set -e # This only works with local filesystem access, not over sftp. # It also requires user_xattr to be set on the filesystem. # If both requirements are not met, skip the test. 
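The skip condition in these xattr tests doubles as a capability probe: attempt a setfattr and discard the error output. Factored into a standalone sketch, it looks like this; the variable name is invented, and which branch runs depends on the host filesystem and on whether setfattr is installed at all:

```shell
# Probe whether user extended attributes can be set on a scratch file.
# Falls back to "skipped" when setfattr is missing or the filesystem
# lacks user_xattr support.
probe=$(mktemp)
if command -v setfattr > /dev/null 2>&1 &&
    setfattr --name=user.probe --value=yes "$probe" 2> /dev/null
then
    xattr_status=supported
else
    xattr_status=skipped
fi
echo "xattr probe: $xattr_status"
```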
echo foo > "$DATADIR/data/foo" if [ "$OBNAM_TEST_SFTP_ROOT" != yes ] && setfattr --name=user.foo --value='""' "$DATADIR/data/foo" 2>/dev/null then $SRCDIR/tests/backup $SRCDIR/tests/restore $SRCDIR/tests/verify fi obnam-1.6.1/try-node-chunk-sizes0000755000175000017500000000176012246357067016447 0ustar jenkinsjenkins#!/bin/sh # # Run obnam with various node and chunk sizes, in order to see which ones # are optimal. set -e datadir="$1" sizes="8kib 16kib 32kib 64kib 128kib 256kib 512kib 1024kib" sizes="8kib 16kib" conf="$(mktemp)" rm -rf try-sizes mkdir try-sizes for node_size in $sizes do for chunk_size in $sizes do echo -------------------------------- echo "node=$node_size chunk=$chunk_size" cat << eof > "$conf" [config] chunk-size = $chunk_size node-size = $node_size eof output="try-sizes/${node_size}-${chunk_size}.seivot" $HOME/seivot/trunk/seivot \ --output="$output" \ --drop-caches \ --obnam-config="$conf" \ --generations=2 \ --incremental=1 \ --use-existing="$datadir" \ --obnam-branch=. 
\ --larch-branch=$HOME/larch/trunk \ --description="node=$node_size chunk=$chunk_size" \ --profile-name="real data" done done rm -f "$conf" obnam-1.6.1/try-node-chunk-sizes-plot-results0000755000175000017500000000310612246357067021116 0ustar jenkinsjenkins#!/usr/bin/python import ConfigParser import re import subprocess import sys ops = ['backup', 'restore', 'forget'] def parse_size(spec): m = re.search('\d+', spec) return int(m.group()) def dat_filename(op, node_size): return '%s-%s-%d.dat' % (sys.argv[1], op, node_size) def plotfile(op, node_size): return ('"%s" with lines title "node size %d KiB"' % (dat_filename(op, node_size), node_size)) for op in ops: data = {} for filename in sys.argv[2:]: cp = ConfigParser.ConfigParser() cp.read(filename) desc = cp.get('meta', 'description') words = desc.split() node_size = parse_size(words[0]) chunk_size = parse_size(words[1]) secs = cp.getfloat('0', '%s.real' % op) data[node_size] = data.get(node_size, []) + [(chunk_size, secs)] for node_size in data: points = sorted(data[node_size]) name = dat_filename(op, node_size) with open(name, 'w') as f: for chunk_size, secs in points: f.write('%d %f\n' % (chunk_size, secs)) gnuplot = '''\ set terminal svg dynamic set title "%s %s" set xlabel "chunk size (KiB)" set ylabel "time (s)" ''' % (sys.argv[1], op) gnuplot += ('plot' + ', '.join(plotfile(op, x) for x in sorted(data.keys())) + '\n') script_name = '%s-%s.gnuplot' % (sys.argv[1], op) with open(script_name, 'w') as f: f.write(gnuplot) with open('%s-%s.svg' % (sys.argv[1], op), 'wb') as outp: subprocess.check_call(['gnuplot', script_name], stdout=outp) obnam-1.6.1/verification-test0000755000175000017500000000764612246357067016115 0ustar jenkinsjenkins#!/bin/sh # # Obnam verification test. # # This script runs and verifies backups with Obnam. There are two stages. # In the first stage, the user makes backups frequently for some period # of time (e.g., daily for a week). 
In the second stage, every backup # generation is restored and the restored data compared with the original. # The test succeeds if all generations can be restored and verified # successfully. # # The verification is done using the summain(1) checksumming and manifest # generation tool. Obnam has an internal verification command, but it is # better to use an independent tool. summain(1) is a better choice than, # say, md5sum, since it includes much of the inode metadata in the manifest, # so that restored file permissions, etc., are also verified. # # To use this script, run it in one of the following ways: # # ./verification-test backup REPO DIR # ./verification-test verify REPO DIR # # You can optionally add the name of a configuration file to use as the # fourth argument. # # You must run this script from the Obnam source directory. # The repository must be a local directory. # You can choose any DIR you like, but it should be something that changes # frequently and is small enough that you don't get impatient while # it is getting backed up. # # The filesystem (or the parts inside the specified directory) MUST be # idle from when the backup starts until it ends. Otherwise the test may # fail even though Obnam was working fine: it's just that some file changed # between Obnam backing it up and summain including it in the manifest. # # Copyright 2011 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see .
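The verify stage described above loops over generation IDs, restores each one, and diffs a fresh manifest against the manifest recorded at backup time. A rough standalone sketch of that loop, with `cksum` standing in for summain (so inode metadata is not covered) and `cp` standing in for `obnam restore`; all names are invented:

```shell
# Sketch of the per-generation verify loop: for each "generation",
# rebuild a manifest and compare it with the one saved at backup time.
set -e
work=$(mktemp -d)
mkdir "$work/live"
echo hello > "$work/live/file"

manifest_of() {
    (cd "$1" && find . -type f | sort | while read -r f; do cksum "$f"; done)
}

# "Backup": save a manifest per generation.
for gen in 1 2
do
    manifest_of "$work/live" > "$work/gen$gen.summain"
done

# "Verify": restore each generation and compare manifests.
for gen in 1 2
do
    mkdir "$work/restored$gen"
    cp "$work/live/file" "$work/restored$gen/file"
    manifest_of "$work/restored$gen" > "$work/check$gen.summain"
    diff -u "$work/gen$gen.summain" "$work/check$gen.summain"
done
echo "all generations verified"
```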
set -eux die() { echo "$@" 1>&2 exit 1 } backup() { local repo="$1" local dir="$2" local conf="$3" local client=$(awk '/client-name/ { print $3 }' "$conf") local genids=$(mktemp) ./obnam --no-default-configs --config="$conf" backup || exit 1 while ! ./obnam --no-default-configs --config="$conf" genids > "$genids" do : done local gen=$(tail -n1 "$genids") if [ "x$gen" = "x" ] then die "gen is empty" fi summain "$dir" -r --output="$repo/$client.$gen.summain" } abspath() { case "$1" in /*) echo -n "$1" ;; *) echo -n "$(pwd)/$1" ;; esac } verify() { local repo="$1" local dir=$(abspath "$2") local conf="$3" local tempdir="$(mktemp -d)" local client=$(awk '/client-name/ { print $3 }' "$conf") ./obnam --no-default-configs --config="$conf" genids \ > "$tempdir"/gens || exit 42 while read gen do rm -rf "$tempdir/$gen" ./obnam --no-default-configs --config="$conf" \ restore --to="$tempdir/$gen" --generation="$gen" || \ exit 42 summain "$tempdir/$gen/$dir" -r --output="$tempdir/summain.$gen" if ! diff -u "$repo/$client.$gen.summain" "$tempdir/summain.$gen" then die "generation $gen failed to restore properly, see $tempdir" fi rm -rf "$tempdir/$gen" "$tempdir/summain.$gen" done < "$tempdir/gens" rm -rf "$tempdir" } [ "$#" = 4 ] || die "Bad usage, read source!" 
repo="$2" root="$3" conf="$4" case "$1" in backup) backup "$repo" "$root" "$conf" ;; verify) verify "$repo" "$root" "$conf" ;; *) die "Unknown subcommand $1" ;; esac obnam-1.6.1/without-tests0000644000175000017500000000214312246357067015301 0ustar jenkinsjenkins./setup.py ./obnamlib/__init__.py ./obnamlib/app.py ./obnamlib/vfs.py ./obnamlib/plugins/backup_plugin.py ./obnamlib/plugins/restore_plugin.py ./obnamlib/plugins/show_plugin.py ./obnamlib/plugins/verify_plugin.py ./test-plugins/hello_plugin.py ./test-plugins/oldhello_plugin.py ./test-plugins/aaa_hello_plugin.py ./test-plugins/wrongversion_plugin.py ./obnamlib/plugins/forget_plugin.py ./obnamlib/plugins/fsck_plugin.py ./obnamlib/plugins/vfs_local_plugin.py ./obnamlib/plugins/sftp_plugin.py ./obnamlib/plugins/force_lock_plugin.py ./obnamlib/plugins/__init__.py ./obnamlib/repo_tree.py ./obnamlib/plugins/encryption_plugin.py ./obnamlib/plugins/compression_plugin.py ./.pc/debian-changes-0.22-2/obnamlib/app.py ./.pc/debian-changes-0.22-2/obnamlib/plugins/forget_plugin.py ./.pc/debian-changes-0.22-2/obnamlib/plugins/verify_plugin.py ./.pc/debian-changes-0.22-2/obnamlib/plugins/backup_plugin.py ./.pc/debian-changes-0.22-2/obnamlib/plugins/fsck_plugin.py ./.pc/debian-changes-0.22-2/obnamlib/plugins/restore_plugin.py obnamlib/plugins/convert5to6_plugin.py obnamlib/repo_interface.py obnamlib/plugins/fuse_plugin.py obnam-1.6.1/yarns/0000755000175000017500000000000012246357067013647 5ustar jenkinsjenkinsobnam-1.6.1/yarns/fuse.yarn0000644000175000017500000000656212246357067015515 0ustar jenkinsjenkinsBlack box testing for the Obnam FUSE plugin =========================================== The FUSE plugin gives read-only access to a backup repository. There's a lot of potential corner cases here, but for now, this test suite concentrates on verifying that at least the basics work. 
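The ASSUMING step below checks fuse group membership by exact-matching one line of `groups` output. The idiom is easy to exercise in isolation; `in_group` is a helper name invented for this sketch:

```shell
# Exact group-membership test: split `groups` output into one name per
# line, then match a whole line with grep -Fx (fixed string, full line).
in_group() {
    groups | tr ' ' '\n' | grep -Fx "$1" > /dev/null
}

if in_group "$(id -gn)"   # the primary group is always listed
then
    echo "membership check works"
fi
```

`grep -Fx` avoids both regex surprises and substring false positives (being in group `fuse2` would not count as being in `fuse`).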
SCENARIO Browsing backups with FUSE plugin ASSUMING user is in group fuse GIVEN a live data directory AND a 0 byte file called empty AND a 1 byte file called one AND a 4096 byte file called 4k AND a 10485760 byte file called 10meg AND a hardlink to 4k called 4k-hardlink AND a symlink to 4k called 4k-symlink AND a directory called some-dir AND a symlink to ../4k called some-dir/other-symlink WHEN live data is backed up AND repository is fuse mounted THEN latest generation can be copied correctly from fuse mount FINALLY unmount repository The following sections implement the various steps. We use `$DATADIR/live` for the live data, `$DATADIR/repo` for the repository, and `$DATADIR/mount` as the FUSE mount point. We can only run this test if the user is in the fuse group. This may be a portability concern: this works in Debian GNU/Linux, but might be different in other Linux distros, or on non-Linux systems. IMPLEMENTS ASSUMING user is in group (\S+) groups | tr ' ' '\n' | grep -Fx "$MATCH_1" IMPLEMENTS GIVEN a live data directory mkdir "$DATADIR/live" IMPLEMENTS GIVEN a (\d+) byte file called (\S+) dd if=/dev/zero of="$DATADIR/live/$MATCH_2" bs=1 count="$MATCH_1" IMPLEMENTS GIVEN a hardlink to (\S+) called (\S+) ln "$DATADIR/live/$MATCH_1" "$DATADIR/live/$MATCH_2" IMPLEMENTS GIVEN a symlink to (\S+) called (\S+) ln -s "$DATADIR/live/$MATCH_1" "$DATADIR/live/$MATCH_2" IMPLEMENTS GIVEN a directory called (\S+) mkdir "$DATADIR/live/$MATCH_1" We do the backup, and verify that it can be accessed correctly, by doing a "manifest" of the live data before the backup, and then against the fuse mount, and comparing the two manifests. `manifest` (a shell function in `obnam.sh`) runs summain with useful parameters. It's used twice, and the parameters need to be the same so the results can be compared with diff. summain is a manifest tool. `manifest` additionally mangles the mtime output to be full seconds only: for whatever reason, the fuse mount only shows full seconds. 
This may be a bug (FIXME: find out if it is). `run_obnam` is another shell function, which runs Obnam without the user's configuration files. We don't want the user's configuration to affect the test suite. IMPLEMENTS WHEN live data is backed up manifest "$DATADIR/live" > "$DATADIR/live.summain" run_obnam backup -r "$DATADIR/repo" "$DATADIR/live" IMPLEMENTS WHEN repository is fuse mounted run_obnam clients -r "$DATADIR/repo" mkdir "$DATADIR/mount" run_obnam mount -r "$DATADIR/repo" --to "$DATADIR/mount" --viewmode multiple IMPLEMENTS THEN latest generation can be copied correctly from fuse mount manifest "$DATADIR/mount/latest/$DATADIR/live" > "$DATADIR/latest.summain" diff -u "$DATADIR/live.summain" "$DATADIR/latest.summain" If we mounted the repository, **always** unmount it, even when a step failed. We do not want failed test runs to leave mounts lying around. IMPLEMENTS FINALLY unmount repository if [ -e "$DATADIR/mount" ] then fusermount -u "$DATADIR/mount" fi obnam-1.6.1/yarns/ls.yarn0000644000175000017500000000665112246357067015164 0ustar jenkinsjenkins`obnam ls` ========== `obnam ls` lists contents of repositories. To test this, we start by backing up some live data to a repository, and then we query it in various ways. SCENARIO obnam ls GIVEN a live data directory xyzzy AND a file xyzzy/foo.txt containing "foo" AND a directory xyzzy/bar AND a file xyzzy/bar/yo containing "yo" AND an obnam config called xyzzy.conf AND config xyzzy.conf sets repository to xyzzy.repo AND config xyzzy.conf sets root to xyzzy WHEN "obnam backup" using xyzzy.conf is run THEN plain "obnam ls" using xyzzy.conf contains "bar/yo$" AND "obnam ls" using xyzzy.conf when given xyzzy contains "bar/yo$" There was a bug in Obnam 1.5 (and possibly other versions) where listing the contents of a directory that ends in a slash (but isn't the root directory) failed. The following is a test for that bug.
Implementations =============== Introduction ------------ We run Obnam in `$DATADIR` so that any paths to test data, etc, may be relative to `$DATADIR`. The `run_obnam` shell function in `obnam.sh` allows us to do that safely. Live data creation ------------------ There's several steps to creating live data. First is to create directory for the live data. We let the user choose the name, in case they need to have more than one live data (perhaps for simulating multiple clients). IMPLEMENTS GIVEN a live data directory (\S+) mkdir "$DATADIR/$MATCH_1" Create a file. This is meant for live data creation, but it can actually create any file. The contents printed using `printf`(1), so that escapes may be used. IMPLEMENTS GIVEN a file (\S+) containing "(.*)" printf "$MATCH_1" > "$DATADIR/$MATCH_1" Likewise, a directory may be created. IMPLEMENTS GIVEN a directory (\S+) mkdir "$DATADIR/$MATCH_1" Obnam configuration file creation --------------------------------- Various tests need to set up configuration files with specific settings. IMPLEMENTS GIVEN an obnam config called (\S+) printf '[config]\n' > "$DATADIR/$MATCH_1" IMPLEMENTS GIVEN config (\S+) sets (\S+) to (\S+) printf '%s = %s\n' "$MATCH_2" "$MATCH_3" >> "$DATADIR/$MATCH_1" Backing up with Obnam --------------------- We need to backup with Obnam. IMPLEMENTS WHEN "obnam backup" using (\S+) is run cd "$DATADIR" run_obnam --config "$DATADIR/$MATCH_1" backup Checking `obnam ls` results --------------------------- `obnam ls` produces output that mimicks (but doesn't exactly match) that of `ls -lAR` using GNU coreutils and other popular implementations of the Unix `ls`(1) command. This makes exact output matching a bit tricky to do. Instead, we check whether specific files are in the output. IMPLEMENTS THEN plain "obnam ls" using (\S+) contains "(.+)" cd "$DATADIR" run_obnam --config "$DATADIR/$MATCH_1" ls | grep "$MATCH_2" This is similar to a plain `obnam ls`, but we give it an argument, to restrict what gets listed. 
If the argument is not an absolute path, we make it one, since Obnam expects it to be. It is, unfortunately, difficult to let the user of this IMPLEMENTS specify an absolute path. IMPLEMENTS THEN "obnam ls" using (\S+) when given (\S+) contains "(.+)" cd "$DATADIR" case "$MATCH_2" in /*) path="$MATCH_2" ;; *) path="$(pwd)/$MATCH_2" ;; esac run_obnam --config "$DATADIR/$MATCH_1" ls "$path" | grep "$MATCH_3" obnam-1.6.1/yarns/obnam.sh0000644000175000017500000000174512246357067015302 0ustar jenkinsjenkins# Copyright 2013 Lars Wirzenius # # This program is free software: you can redistribute it and/or modify # it under the terms of the GNU General Public License as published by # the Free Software Foundation, either version 3 of the License, or # (at your option) any later version. # # This program is distributed in the hope that it will be useful, # but WITHOUT ANY WARRANTY; without even the implied warranty of # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the # GNU General Public License for more details. # # You should have received a copy of the GNU General Public License # along with this program. If not, see . # # =*= License: GPL-3+ =*= # Run Obnam in a safe way that ignores any configuration files outside # the test. run_obnam() { "$SRCDIR/obnam" --no-default-config "$@" } # Create a manifest with summain of a directory. manifest() { summain -r "$1" --exclude Ino --exclude Dev | sed '/^Mtime:/s/\.[0-9]* / /' }
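A usage sketch of the wrapper idiom that obnam.sh relies on: pin down the invocation details in one helper so that no test can accidentally pick up per-user state. `run_tool` and its echo payload are stand-ins invented for this sketch; the real helper wraps obnam with `--no-default-config`:

```shell
# Wrapper idiom: one function fixes the invocation details, so every
# caller gets the same isolated behaviour. Here env -i plays the role
# that --no-default-config plays for obnam: no inherited user state.
run_tool() {
    env -i PATH="$PATH" sh -c 'echo "args: $*"' sh "$@"
}

out=$(run_tool backup -r repo)
echo "$out"   # prints: args: backup -r repo
```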