obnam-1.6.1/ 0000755 0001750 0001750 00000000000 12246357067 012513 5 ustar jenkins jenkins obnam-1.6.1/COPYING 0000644 0001750 0001750 00000104513 12246357067 013552 0 ustar jenkins jenkins GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007
Copyright (C) 2007 Free Software Foundation, Inc.
Everyone is permitted to copy and distribute verbatim copies
of this license document, but changing it is not allowed.
Preamble
The GNU General Public License is a free, copyleft license for
software and other kinds of works.
The licenses for most software and other practical works are designed
to take away your freedom to share and change the works. By contrast,
the GNU General Public License is intended to guarantee your freedom to
share and change all versions of a program--to make sure it remains free
software for all its users. We, the Free Software Foundation, use the
GNU General Public License for most of our software; it applies also to
any other work released this way by its authors. You can apply it to
your programs, too.
When we speak of free software, we are referring to freedom, not
price. Our General Public Licenses are designed to make sure that you
have the freedom to distribute copies of free software (and charge for
them if you wish), that you receive source code or can get it if you
want it, that you can change the software or use pieces of it in new
free programs, and that you know you can do these things.
To protect your rights, we need to prevent others from denying you
these rights or asking you to surrender the rights. Therefore, you have
certain responsibilities if you distribute copies of the software, or if
you modify it: responsibilities to respect the freedom of others.
For example, if you distribute copies of such a program, whether
gratis or for a fee, you must pass on to the recipients the same
freedoms that you received. You must make sure that they, too, receive
or can get the source code. And you must show them these terms so they
know their rights.
Developers that use the GNU GPL protect your rights with two steps:
(1) assert copyright on the software, and (2) offer you this License
giving you legal permission to copy, distribute and/or modify it.
For the developers' and authors' protection, the GPL clearly explains
that there is no warranty for this free software. For both users' and
authors' sake, the GPL requires that modified versions be marked as
changed, so that their problems will not be attributed erroneously to
authors of previous versions.
Some devices are designed to deny users access to install or run
modified versions of the software inside them, although the manufacturer
can do so. This is fundamentally incompatible with the aim of
protecting users' freedom to change the software. The systematic
pattern of such abuse occurs in the area of products for individuals to
use, which is precisely where it is most unacceptable. Therefore, we
have designed this version of the GPL to prohibit the practice for those
products. If such problems arise substantially in other domains, we
stand ready to extend this provision to those domains in future versions
of the GPL, as needed to protect the freedom of users.
Finally, every program is threatened constantly by software patents.
States should not allow patents to restrict development and use of
software on general-purpose computers, but in those that do, we wish to
avoid the special danger that patents applied to a free program could
make it effectively proprietary. To prevent this, the GPL assures that
patents cannot be used to render the program non-free.
The precise terms and conditions for copying, distribution and
modification follow.
TERMS AND CONDITIONS
0. Definitions.
"This License" refers to version 3 of the GNU General Public License.
"Copyright" also means copyright-like laws that apply to other kinds of
works, such as semiconductor masks.
"The Program" refers to any copyrightable work licensed under this
License. Each licensee is addressed as "you". "Licensees" and
"recipients" may be individuals or organizations.
To "modify" a work means to copy from or adapt all or part of the work
in a fashion requiring copyright permission, other than the making of an
exact copy. The resulting work is called a "modified version" of the
earlier work or a work "based on" the earlier work.
A "covered work" means either the unmodified Program or a work based
on the Program.
To "propagate" a work means to do anything with it that, without
permission, would make you directly or secondarily liable for
infringement under applicable copyright law, except executing it on a
computer or modifying a private copy. Propagation includes copying,
distribution (with or without modification), making available to the
public, and in some countries other activities as well.
To "convey" a work means any kind of propagation that enables other
parties to make or receive copies. Mere interaction with a user through
a computer network, with no transfer of a copy, is not conveying.
An interactive user interface displays "Appropriate Legal Notices"
to the extent that it includes a convenient and prominently visible
feature that (1) displays an appropriate copyright notice, and (2)
tells the user that there is no warranty for the work (except to the
extent that warranties are provided), that licensees may convey the
work under this License, and how to view a copy of this License. If
the interface presents a list of user commands or options, such as a
menu, a prominent item in the list meets this criterion.
1. Source Code.
The "source code" for a work means the preferred form of the work
for making modifications to it. "Object code" means any non-source
form of a work.
A "Standard Interface" means an interface that either is an official
standard defined by a recognized standards body, or, in the case of
interfaces specified for a particular programming language, one that
is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other
than the work as a whole, that (a) is included in the normal form of
packaging a Major Component, but which is not part of that Major
Component, and (b) serves only to enable use of the work with that
Major Component, or to implement a Standard Interface for which an
implementation is available to the public in source code form. A
"Major Component", in this context, means a major essential component
(kernel, window system, and so on) of the specific operating system
(if any) on which the executable work runs, or a compiler used to
produce the work, or an object code interpreter used to run it.
The "Corresponding Source" for a work in object code form means all
the source code needed to generate, install, and (for an executable
work) run the object code and to modify the work, including scripts to
control those activities. However, it does not include the work's
System Libraries, or general-purpose tools or generally available free
programs which are used unmodified in performing those activities but
which are not part of the work. For example, Corresponding Source
includes interface definition files associated with source files for
the work, and the source code for shared libraries and dynamically
linked subprograms that the work is specifically designed to require,
such as by intimate data communication or control flow between those
subprograms and other parts of the work.
The Corresponding Source need not include anything that users
can regenerate automatically from other parts of the Corresponding
Source.
The Corresponding Source for a work in source code form is that
same work.
2. Basic Permissions.
All rights granted under this License are granted for the term of
copyright on the Program, and are irrevocable provided the stated
conditions are met. This License explicitly affirms your unlimited
permission to run the unmodified Program. The output from running a
covered work is covered by this License only if the output, given its
content, constitutes a covered work. This License acknowledges your
rights of fair use or other equivalent, as provided by copyright law.
You may make, run and propagate covered works that you do not
convey, without conditions so long as your license otherwise remains
in force. You may convey covered works to others for the sole purpose
of having them make modifications exclusively for you, or provide you
with facilities for running those works, provided that you comply with
the terms of this License in conveying all material for which you do
not control copyright. Those thus making or running the covered works
for you must do so exclusively on your behalf, under your direction
and control, on terms that prohibit them from making any copies of
your copyrighted material outside their relationship with you.
Conveying under any other circumstances is permitted solely under
the conditions stated below. Sublicensing is not allowed; section 10
makes it unnecessary.
3. Protecting Users' Legal Rights From Anti-Circumvention Law.
No covered work shall be deemed part of an effective technological
measure under any applicable law fulfilling obligations under article
11 of the WIPO copyright treaty adopted on 20 December 1996, or
similar laws prohibiting or restricting circumvention of such
measures.
When you convey a covered work, you waive any legal power to forbid
circumvention of technological measures to the extent such circumvention
is effected by exercising rights under this License with respect to
the covered work, and you disclaim any intention to limit operation or
modification of the work as a means of enforcing, against the work's
users, your or third parties' legal rights to forbid circumvention of
technological measures.
4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you
receive it, in any medium, provided that you conspicuously and
appropriately publish on each copy an appropriate copyright notice;
keep intact all notices stating that this License and any
non-permissive terms added in accord with section 7 apply to the code;
keep intact all notices of the absence of any warranty; and give all
recipients a copy of this License along with the Program.
You may charge any price or no price for each copy that you convey,
and you may offer support or warranty protection for a fee.
5. Conveying Modified Source Versions.
You may convey a work based on the Program, or the modifications to
produce it from the Program, in the form of source code under the
terms of section 4, provided that you also meet all of these conditions:
a) The work must carry prominent notices stating that you modified
it, and giving a relevant date.
b) The work must carry prominent notices stating that it is
released under this License and any conditions added under section
7. This requirement modifies the requirement in section 4 to
"keep intact all notices".
c) You must license the entire work, as a whole, under this
License to anyone who comes into possession of a copy. This
License will therefore apply, along with any applicable section 7
additional terms, to the whole of the work, and all its parts,
regardless of how they are packaged. This License gives no
permission to license the work in any other way, but it does not
invalidate such permission if you have separately received it.
d) If the work has interactive user interfaces, each must display
Appropriate Legal Notices; however, if the Program has interactive
interfaces that do not display Appropriate Legal Notices, your
work need not make them do so.
A compilation of a covered work with other separate and independent
works, which are not by their nature extensions of the covered work,
and which are not combined with it such as to form a larger program,
in or on a volume of a storage or distribution medium, is called an
"aggregate" if the compilation and its resulting copyright are not
used to limit the access or legal rights of the compilation's users
beyond what the individual works permit. Inclusion of a covered work
in an aggregate does not cause this License to apply to the other
parts of the aggregate.
6. Conveying Non-Source Forms.
You may convey a covered work in object code form under the terms
of sections 4 and 5, provided that you also convey the
machine-readable Corresponding Source under the terms of this License,
in one of these ways:
a) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by the
Corresponding Source fixed on a durable physical medium
customarily used for software interchange.
b) Convey the object code in, or embodied in, a physical product
(including a physical distribution medium), accompanied by a
written offer, valid for at least three years and valid for as
long as you offer spare parts or customer support for that product
model, to give anyone who possesses the object code either (1) a
copy of the Corresponding Source for all the software in the
product that is covered by this License, on a durable physical
medium customarily used for software interchange, for a price no
more than your reasonable cost of physically performing this
conveying of source, or (2) access to copy the
Corresponding Source from a network server at no charge.
c) Convey individual copies of the object code with a copy of the
written offer to provide the Corresponding Source. This
alternative is allowed only occasionally and noncommercially, and
only if you received the object code with such an offer, in accord
with subsection 6b.
d) Convey the object code by offering access from a designated
place (gratis or for a charge), and offer equivalent access to the
Corresponding Source in the same way through the same place at no
further charge. You need not require recipients to copy the
Corresponding Source along with the object code. If the place to
copy the object code is a network server, the Corresponding Source
may be on a different server (operated by you or a third party)
that supports equivalent copying facilities, provided you maintain
clear directions next to the object code saying where to find the
Corresponding Source. Regardless of what server hosts the
Corresponding Source, you remain obligated to ensure that it is
available for as long as needed to satisfy these requirements.
e) Convey the object code using peer-to-peer transmission, provided
you inform other peers where the object code and Corresponding
Source of the work are being offered to the general public at no
charge under subsection 6d.
A separable portion of the object code, whose source code is excluded
from the Corresponding Source as a System Library, need not be
included in conveying the object code work.
A "User Product" is either (1) a "consumer product", which means any
tangible personal property which is normally used for personal, family,
or household purposes, or (2) anything designed or sold for incorporation
into a dwelling. In determining whether a product is a consumer product,
doubtful cases shall be resolved in favor of coverage. For a particular
product received by a particular user, "normally used" refers to a
typical or common use of that class of product, regardless of the status
of the particular user or of the way in which the particular user
actually uses, or expects or is expected to use, the product. A product
is a consumer product regardless of whether the product has substantial
commercial, industrial or non-consumer uses, unless such uses represent
the only significant mode of use of the product.
"Installation Information" for a User Product means any methods,
procedures, authorization keys, or other information required to install
and execute modified versions of a covered work in that User Product from
a modified version of its Corresponding Source. The information must
suffice to ensure that the continued functioning of the modified object
code is in no case prevented or interfered with solely because
modification has been made.
If you convey an object code work under this section in, or with, or
specifically for use in, a User Product, and the conveying occurs as
part of a transaction in which the right of possession and use of the
User Product is transferred to the recipient in perpetuity or for a
fixed term (regardless of how the transaction is characterized), the
Corresponding Source conveyed under this section must be accompanied
by the Installation Information. But this requirement does not apply
if neither you nor any third party retains the ability to install
modified object code on the User Product (for example, the work has
been installed in ROM).
The requirement to provide Installation Information does not include a
requirement to continue to provide support service, warranty, or updates
for a work that has been modified or installed by the recipient, or for
the User Product in which it has been modified or installed. Access to a
network may be denied when the modification itself materially and
adversely affects the operation of the network or violates the rules and
protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided,
in accord with this section must be in a format that is publicly
documented (and with an implementation available to the public in
source code form), and must require no special password or key for
unpacking, reading or copying.
7. Additional Terms.
"Additional permissions" are terms that supplement the terms of this
License by making exceptions from one or more of its conditions.
Additional permissions that are applicable to the entire Program shall
be treated as though they were included in this License, to the extent
that they are valid under applicable law. If additional permissions
apply only to part of the Program, that part may be used separately
under those permissions, but the entire Program remains governed by
this License without regard to the additional permissions.
When you convey a copy of a covered work, you may at your option
remove any additional permissions from that copy, or from any part of
it. (Additional permissions may be written to require their own
removal in certain cases when you modify the work.) You may place
additional permissions on material, added by you to a covered work,
for which you have or can give appropriate copyright permission.
Notwithstanding any other provision of this License, for material you
add to a covered work, you may (if authorized by the copyright holders of
that material) supplement the terms of this License with terms:
a) Disclaiming warranty or limiting liability differently from the
terms of sections 15 and 16 of this License; or
b) Requiring preservation of specified reasonable legal notices or
author attributions in that material or in the Appropriate Legal
Notices displayed by works containing it; or
c) Prohibiting misrepresentation of the origin of that material, or
requiring that modified versions of such material be marked in
reasonable ways as different from the original version; or
d) Limiting the use for publicity purposes of names of licensors or
authors of the material; or
e) Declining to grant rights under trademark law for use of some
trade names, trademarks, or service marks; or
f) Requiring indemnification of licensors and authors of that
material by anyone who conveys the material (or modified versions of
it) with contractual assumptions of liability to the recipient, for
any liability that these contractual assumptions directly impose on
those licensors and authors.
All other non-permissive additional terms are considered "further
restrictions" within the meaning of section 10. If the Program as you
received it, or any part of it, contains a notice stating that it is
governed by this License along with a term that is a further
restriction, you may remove that term. If a license document contains
a further restriction but permits relicensing or conveying under this
License, you may add to a covered work material governed by the terms
of that license document, provided that the further restriction does
not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you
must place, in the relevant source files, a statement of the
additional terms that apply to those files, or a notice indicating
where to find the applicable terms.
Additional terms, permissive or non-permissive, may be stated in the
form of a separately written license, or stated as exceptions;
the above requirements apply either way.
8. Termination.
You may not propagate or modify a covered work except as expressly
provided under this License. Any attempt otherwise to propagate or
modify it is void, and will automatically terminate your rights under
this License (including any patent licenses granted under the third
paragraph of section 11).
However, if you cease all violation of this License, then your
license from a particular copyright holder is reinstated (a)
provisionally, unless and until the copyright holder explicitly and
finally terminates your license, and (b) permanently, if the copyright
holder fails to notify you of the violation by some reasonable means
prior to 60 days after the cessation.
Moreover, your license from a particular copyright holder is
reinstated permanently if the copyright holder notifies you of the
violation by some reasonable means, this is the first time you have
received notice of violation of this License (for any work) from that
copyright holder, and you cure the violation prior to 30 days after
your receipt of the notice.
Termination of your rights under this section does not terminate the
licenses of parties who have received copies or rights from you under
this License. If your rights have been terminated and not permanently
reinstated, you do not qualify to receive new licenses for the same
material under section 10.
9. Acceptance Not Required for Having Copies.
You are not required to accept this License in order to receive or
run a copy of the Program. Ancillary propagation of a covered work
occurring solely as a consequence of using peer-to-peer transmission
to receive a copy likewise does not require acceptance. However,
nothing other than this License grants you permission to propagate or
modify any covered work. These actions infringe copyright if you do
not accept this License. Therefore, by modifying or propagating a
covered work, you indicate your acceptance of this License to do so.
10. Automatic Licensing of Downstream Recipients.
Each time you convey a covered work, the recipient automatically
receives a license from the original licensors, to run, modify and
propagate that work, subject to this License. You are not responsible
for enforcing compliance by third parties with this License.
An "entity transaction" is a transaction transferring control of an
organization, or substantially all assets of one, or subdividing an
organization, or merging organizations. If propagation of a covered
work results from an entity transaction, each party to that
transaction who receives a copy of the work also receives whatever
licenses to the work the party's predecessor in interest had or could
give under the previous paragraph, plus a right to possession of the
Corresponding Source of the work from the predecessor in interest, if
the predecessor has it or can get it with reasonable efforts.
You may not impose any further restrictions on the exercise of the
rights granted or affirmed under this License. For example, you may
not impose a license fee, royalty, or other charge for exercise of
rights granted under this License, and you may not initiate litigation
(including a cross-claim or counterclaim in a lawsuit) alleging that
any patent claim is infringed by making, using, selling, offering for
sale, or importing the Program or any portion of it.
11. Patents.
A "contributor" is a copyright holder who authorizes use under this
License of the Program or a work on which the Program is based. The
work thus licensed is called the contributor's "contributor version".
A contributor's "essential patent claims" are all patent claims
owned or controlled by the contributor, whether already acquired or
hereafter acquired, that would be infringed by some manner, permitted
by this License, of making, using, or selling its contributor version,
but do not include claims that would be infringed only as a
consequence of further modification of the contributor version. For
purposes of this definition, "control" includes the right to grant
patent sublicenses in a manner consistent with the requirements of
this License.
Each contributor grants you a non-exclusive, worldwide, royalty-free
patent license under the contributor's essential patent claims, to
make, use, sell, offer for sale, import and otherwise run, modify and
propagate the contents of its contributor version.
In the following three paragraphs, a "patent license" is any express
agreement or commitment, however denominated, not to enforce a patent
(such as an express permission to practice a patent or covenant not to
sue for patent infringement). To "grant" such a patent license to a
party means to make such an agreement or commitment not to enforce a
patent against the party.
If you convey a covered work, knowingly relying on a patent license,
and the Corresponding Source of the work is not available for anyone
to copy, free of charge and under the terms of this License, through a
publicly available network server or other readily accessible means,
then you must either (1) cause the Corresponding Source to be so
available, or (2) arrange to deprive yourself of the benefit of the
patent license for this particular work, or (3) arrange, in a manner
consistent with the requirements of this License, to extend the patent
license to downstream recipients. "Knowingly relying" means you have
actual knowledge that, but for the patent license, your conveying the
covered work in a country, or your recipient's use of the covered work
in a country, would infringe one or more identifiable patents in that
country that you have reason to believe are valid.
If, pursuant to or in connection with a single transaction or
arrangement, you convey, or propagate by procuring conveyance of, a
covered work, and grant a patent license to some of the parties
receiving the covered work authorizing them to use, propagate, modify
or convey a specific copy of the covered work, then the patent license
you grant is automatically extended to all recipients of the covered
work and works based on it.
A patent license is "discriminatory" if it does not include within
the scope of its coverage, prohibits the exercise of, or is
conditioned on the non-exercise of one or more of the rights that are
specifically granted under this License. You may not convey a covered
work if you are a party to an arrangement with a third party that is
in the business of distributing software, under which you make payment
to the third party based on the extent of your activity of conveying
the work, and under which the third party grants, to any of the
parties who would receive the covered work from you, a discriminatory
patent license (a) in connection with copies of the covered work
conveyed by you (or copies made from those copies), or (b) primarily
for and in connection with specific products or compilations that
contain the covered work, unless you entered into that arrangement,
or that patent license was granted, prior to 28 March 2007.
Nothing in this License shall be construed as excluding or limiting
any implied license or other defenses to infringement that may
otherwise be available to you under applicable patent law.
12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or
otherwise) that contradict the conditions of this License, they do not
excuse you from the conditions of this License. If you cannot convey a
covered work so as to satisfy simultaneously your obligations under this
License and any other pertinent obligations, then as a consequence you may
not convey it at all. For example, if you agree to terms that obligate you
to collect a royalty for further conveying from those to whom you convey
the Program, the only way you could satisfy both those terms and this
License would be to refrain entirely from conveying the Program.
13. Use with the GNU Affero General Public License.
Notwithstanding any other provision of this License, you have
permission to link or combine any covered work with a work licensed
under version 3 of the GNU Affero General Public License into a single
combined work, and to convey the resulting work. The terms of this
License will continue to apply to the part which is the covered work,
but the special requirements of the GNU Affero General Public License,
section 13, concerning interaction through a network will apply to the
combination as such.
14. Revised Versions of this License.
The Free Software Foundation may publish revised and/or new versions of
the GNU General Public License from time to time. Such new versions will
be similar in spirit to the present version, but may differ in detail to
address new problems or concerns.
Each version is given a distinguishing version number. If the
Program specifies that a certain numbered version of the GNU General
Public License "or any later version" applies to it, you have the
option of following the terms and conditions either of that numbered
version or of any later version published by the Free Software
Foundation. If the Program does not specify a version number of the
GNU General Public License, you may choose any version ever published
by the Free Software Foundation.
If the Program specifies that a proxy can decide which future
versions of the GNU General Public License can be used, that proxy's
public statement of acceptance of a version permanently authorizes you
to choose that version for the Program.
Later license versions may give you additional or different
permissions. However, no additional obligations are imposed on any
author or copyright holder as a result of your choosing to follow a
later version.
15. Disclaimer of Warranty.
THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY
APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT
HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY
OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO,
THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM
IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF
ALL NECESSARY SERVICING, REPAIR OR CORRECTION.
16. Limitation of Liability.
IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS
THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY
GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE
USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF
DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD
PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS),
EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF
SUCH DAMAGES.
17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided
above cannot be given local legal effect according to their terms,
reviewing courts shall apply local law that most closely approximates
an absolute waiver of all civil liability in connection with the
Program, unless a warranty or assumption of liability accompanies a
copy of the Program in return for a fee.
END OF TERMS AND CONDITIONS
How to Apply These Terms to Your New Programs
If you develop a new program, and you want it to be of the greatest
possible use to the public, the best way to achieve this is to make it
free software which everyone can redistribute and change under these terms.
To do so, attach the following notices to the program. It is safest
to attach them to the start of each source file to most effectively
state the exclusion of warranty; and each file should have at least
the "copyright" line and a pointer to where the full notice is found.
Copyright (C)
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see .
Also add information on how to contact you by electronic and paper mail.
If the program does terminal interaction, make it output a short
notice like this when it starts in an interactive mode:
Copyright (C)
This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
This is free software, and you are welcome to redistribute it
under certain conditions; type `show c' for details.
The hypothetical commands `show w' and `show c' should show the appropriate
parts of the General Public License. Of course, your program's commands
might be different; for a GUI interface, you would use an "about box".
You should also get your employer (if you work as a programmer) or school,
if any, to sign a "copyright disclaimer" for the program, if necessary.
For more information on this, and how to apply and follow the GNU GPL, see
.
The GNU General Public License does not permit incorporating your program
into proprietary programs. If your program is a subroutine library, you
may consider it more useful to permit linking proprietary applications with
the library. If this is what you want to do, use the GNU Lesser General
Public License instead of this License. But first, please read
.
obnam-1.6.1/NEWS 0000644 0001750 0001750 00000116466 12246357067 013230 0 ustar jenkins jenkins Obnam NEWS
==========
This file summarizes changes between releases of Obnam.
Version 1.6.1, released 2013-11-30
----------------------------------
* Fix Debian package dependencies correctly.
Version 1.6, released 2013-11-30
--------------------------------
* Stop logging paramiko exceptions that get converted into another
type of exception by the SFTP plugin in Obnam.
* `obnam-benchmark` can now use an installed version of larch.
Patch by Lars Kruse.
* Obnam has been ported to FreeBSD by Itamar Turner-Trauring
of HybridCluster.
* Backup progress reporting now reports scanned file data, not just
backed up file data. This will hopefully be less confusing to people.
* The `list-keys`, `client-keys`, and `list-toplevels` commands now
obey a new option, `--key-details`, to show the usernames attached
to each public key. Patch by Lars Kruse.
* New option `--ssh-command` to set the command Obnam runs
when invoking ssh. patch by Lars Kruse.
* `obnam clients` can now be used without being an existing client.
Patch by Itamar Turner-Trauring.
* New option `--ssh-host-keys-check` to better specify how SSH
host keys should be checked. Patch by Itamar Turner-Trauring.
Bug fixes:
* Fix`"obnam list-toplevels` so it doesn't give an error when it's
unable to read the per-client directory of another client, when
encryption is used. Fix by Lars Kruse.
* Fix the encryption plugin to give a better error message when it
looks for client directories but fails to find them. Fix by
Lars Kruse.
* `obnam list-toplevels` got confused when the repository contained
extra files, such as "lock" (left there by a previous, crashed Obnam
run). It no longer does. Fix by Lars Kruse.
* The SFTP plugin now handles another error code (EACCESS) when writing
a file and the directory it should go into not existing. Patch by
Armin Größlinger.
* Obnam's manual page now explains about breaking long logical lines
into multiple physical ones.
* The `/~/` path prefix in SFTP URLs works again, at least with
sufficiently new versions of Paramiko (1.7.7.1 in Debian wheezy is
OK). Reported by Lars Kruse.
* The Nagios plugin to report errors in a way Nagios expects.
Patch by Martijn Grendelman.
* The Nagios plugin for Obnam now correctly handles the case
where a backup repository for a client exists, but does not have
a backup yet. Patch by Lars Kruse.
* `obnam ls` now handles trailing slashes in filename arguments.
Reported by Biltong.
* When restoring a backup, Obnam will now continue past errors,
instead of aborting with the first one. Patch by Itamar
Turner-Trauring.
Version 1.5, released 2013-08-08
--------------------------------
Bug fixes:
* Terminal progress reporting now updated only every 0.1 seconds,
instead of 0.01 seconds, to reduce terminal emulator CPU usage.
Reported by Neal Becker.
* Empty exclude patterns are ignored. Previously, a configuration file
line such as "exclude = foo, bar," (note trailing comma) would result
in an empty pattern, which would match everything, and therefore
nothing would be backed up. Reported by Sharon Kimble.
* A FUSE plugin to access (read-only) data from the backup repository
has been added. Written by Valery Yundin.
Version 1.4, released 2013-03-16
--------------------------------
* The``ls` command now takes filenames as (optional) arguments, instead
of a list of generations. Based on patch by Damien Couroussé.
* Even more detailed progress reporting during a backup.
* Add --fsck-skip-generations option to tell fsck to not check any
generation metadata.
* The default log level is now INFO, instead of DEBUG. This is to be
considered a quantum leap in the continuing rise of the maturity level
of the software. (Actually, the change is there just to save some
disk space and I/O for people who don't want to be involved in Obnam
development and don't want to have massive log files.)
* The default sizes for the `lru-size` and `upload-queue-size` settings
have been reduced, to reduce the memory impact of Obnam.
* `obnam restore` now reports transfer statistics at the end, similarly
to what `obnam backup` does. Suggested by "S. B.".
Bug fixes:
* If listing extended attributes for a filesystem that does not support
them, Obnam no longer crashes, just silently does not backup extended
attributes. Which aren't there anyway.
* A bug in handling stat lookup errors was fixed. Reported by
Peter Palfrader. Symptom: `AttributeError: 'exceptions.OSError'
object has no attribute 'st_ino'` in an error message or log file.
* A bug in a restore crashing when failing to set extended attributes
on the restored file was fixed. Reported by "S. B.".
* Made it clearer what is happening when unlocking the repository due to
errors, and fixed it so that a failure to unlock is also an error.
Reported by andrewsh.
* The dependency on Larch is now for 1.20121216 or newer, since that is
needed for fsck to work.
* The manual page did not document the client name arguments to the
`add-key` and `remove-key` subcommands. Reported by Lars Kruse.
* Restoring symlinks as root would fail. Reported and fixed by
David Fries.
* Only set ssh user/port if explicitily requested, otherwise let ssh
select them. Reported by Michael Goetze, fixed by David Fries.
* Fix problem with old version of paramiko and chdir. Fixed by Nick Altmann.
* Fix problems with signed vs unsigned values for struct stat fields.
Reported by Henning Verbeek.
Version 1.3, released 2012-12-16
--------------------------------
* When creating files in the backup repository, Obnam tries to avoid NFS
synchronisation problems by first writing a temporary file and then
creating a hardlink to the actual filename. This works badly on filesystems
that do not allow hard links, such as VFAT. If creating the hardlink
fails, Obnam now further tries to use the `open(2)` system call with
the `O_EXCL` flag to create the target file. This should allow things
to work with both NFS and VFAT.
* More detailed progress reporting during the backup.
* Manual page now covers the diff subcommand. Patch by Peter Valdemar Mørch.
* Speed optimisation patch for backing up files in inode numbering order,
from Christophe Vu-Brugier.
* A setuid or setgid bit is now not restored if Obnam is not used by root
or the same user as the owner of the restored file.
* Many new settings to control "obnam fsck", mainly to reduce the amount
of checking being done in order to make it faster. However, fsck is
has lost some features (checks), which will be added back in a future
release.
* More frequent fsck progress reporting. Some speed optimisations to fsck.
Bug fixes:
* Empty values for extended attributes are now backed up correctly.
Previously they would cause an infinite loop.
* Extended attributes without values are now ignored. This is different
from attributes with empty values. Reported by Vladimir Elisseev.
* An empty port number in sftp URLs is now handled correctly. Found based
on report by Anton Shevtsov.
* A bad performance bug when backing up full systems (starting from the
filesystem root directory) has been fixed. At the beginning of each
generation, Obnam removes any directories that are not part of the
current backup roots. This is necessary so that if you change the
backup roots, the old stuff doesn't hang around forever. However, when
the backup root is the filesystem root, due to the now-fixed bug Obnam
would first remove everything, and then back it up all over again. This
"worked", but was quite slow. Thanks to Nix for reporting the problem.
* Obnam now runs GnuPG explicitly with the "no text mode" setting, to override
a "text mode" setting in the user's configuration. The files Obnam encrypts
need to be treated as binary, not text files. Reported by Robin Sheat.
* A shared B-tree concurrency bug has been fixed: If another instance of
Obnam was modifying a shared B-tree, Obnam would crash and abort a backup,
possibly leaving lock files lying around. Now a failure to look up a chunk
via its checksum is ignored, and the backup continues.
* Bugs in how Python OSError exceptions were being raises have been fixed.
Error messages should now be somewhat clearer.
* Unset or wrongly set variable "full" fixed in "obnam diff". Reported
by ROGERIO DE CARVALHO BASTOS and patched by Peter Valdemar Mørch.
* Setuid and setgid bits are now restored correctly, when restore happens
as root. Reported by Pavel Kokolemin.
* Obnam now complains if no backup roots have been specfied.
Version 1.2, released 2012-10-06
--------------------------------
* Added a note to `--node-size` that it only affects new B-trees.
Thanks, Michael Brown.
* New `obnam diff` subcommand to show differences (added/removed/modified
files) between two generations, by Peter Valdemar Mørch.
* `obnam backup` now logs the names of files that are getting backed up
at the INFO level rather than DEBUG.
* The command synopsises for backup, restore, and verify commands now
make it clearer that Obnam only accepts directories, not individual
files, as arguments. (For now.)
* The output from the `show` plugin can now be redirected with the
`--output=FILE` option. Affected subcommands: `clients`, `generations`,
`genids`, `ls`, `diff`, `nagios-last-backup-age`.
Bug fixes:
* Notify user of errors during backups.
* The SFTP plugin now manages to deal with repository paths starting
with `/~/` which already exist without crashing.
* Character and block device nodes are now restored correctly.
Thanks to Martin Dummer for the bug report.
* The symmteric key for a toplevel repository directory is re-encrypted
when a public key is added or removed to the toplevel using the
`add-key` or `remove-key` subcommands.
* Manual page typo fix. Thanks, Steve Kemp.
Version 1.1, released 2012-06-30
--------------------------------
* Mark the `--small-files-in-btree` settings as deprecated.
* Obnam now correctly checks that `--repository` is set.
* Options in `--help` output are now grouped in random senseless ways
rather than being in one randomly ordered group.
* Manual page clarification for `--root` and `verify`. Thanks, Saint Germain.
* Remove outdated section from manual page explaining that there is not
format conversion. Thanks, Elrond of Samba-TNG.
* Added missing information about specifying a user in sftp URLs. Thanks,
Joey Hess, for pointing it out.
* Manual page clarification on `--keep` from Damien Couroussé.
* Make `obnam forget` report which generations it would remove without
`--pretend`. Thanks, Neal Becker, for the suggestion.
Version 1.0, released 2012-06-01
--------------------------------
* Fixed bug in finding duplicate files during a backup generation.
Thanks to Saint Germain for reporting the problem.
* Changed version number to 1.0.
Version 0.30, released 2012-05-30; a RELEASE CANDIDATE
------------------------------------------------------
Only bug fixes, and only in the test suite.
* Fix test case problem when `$TMPDIR` lacks `user_xattr`. The extended
attributes test won't succeed in that case, and it's pointless to run it.
* Fix test case problem when `$TMPDIR` lacks nanosecond timestamps for
files. The test case now ignores such timestamps, making the test pass
anyway. The timestamp accuracy is not important for this test.
Version 0.29, released 2012-05-27; a RELEASE CANDIDATE
------------------------------------------------------
* "obnam backup" now writes performance statistics at the end of a backup run.
Search the log for "Backup performance statistics" (INFO level).
* "obnam verify" now continues past the first error. Thanks to Rafał Gwiazda
for requesting this.
* Add an `obnam-viewprof` utility to translate Python profiling output
into human readable text form.
* Bug fix: If a file's extended attributes have changed in any way, the change
is now backed up.
* "obnam fsck" is now a bit faster.
* The shared directories in the repository are now locked only during updates,
allowing more efficient concurrent backups between several computers.
* Obnam now gives a better error message when a backup root is not a
directory. Thanks to Edward Allcutt for reporting the error
().
* The output format of "obnam ls" has changed. It now has one line per
file, and includes the full pathname of the file, rather mimicking the
output of "ls -lAR". Thanks to Edward Allcutt for the suggestion
().
* A few optimizations to sftp speed. Small files are still slow.
Version 0.28, released 2012-05-10; a BETA release
-------------------------------------------------
* `force-lock` should now remove all locks.
* Out-of-space errors in the repository now terminate the backup process.
Previously, Obnam would continue, ignoring the failure to write. If you
make space in the repository and restart Obnam, it will continue from
the previous checkpoint.
* The convert5to6 black box test now works even if run by other people
than liw.
* "obnam backup" now uses a single SFTP connection to the backup repository,
rather than opening a new one after each checkpoint generation. Thanks to
weinzwang for reporting the problem.
* "obnam verify" now obeys the `--quiet` option.
* "obnam backup" no longer counts chunks already in the repository in the
uploaded amount of data.
Version 0.27, released 2012-04-30; a BETA release
-------------------------------------------------
* The repository format has again changed in an incompatible manner,
so you will need to re-backup everything again. Alternatively, you can
try the new `convert5to6` subcommand. See the manual page for details.
Make sure you have a copy of the repository before converting, the
code is new and may be buggy.
* New option `--small-files-in-btree` enables Obnam to store the contents
of small files in the per-client B-tree. This is not the default, at
least yet, since it's impact on real life performance is unknown, but
it should make things go a bit faster for high latency repository
connections.
* Some SFTP related speed optimizations.
* Data filtering is now strictly stable and priority-ordered, ensuring that
compression always happens before encryption etc.
* Repository metadata is never filtered, so that we can be sure that
in future if when we add backwards-compatibility we can detect the format
without worrying about any other filtering which might occur.
* Forcing of locks is now unconditional and across the entire repository.
* Uses the larch 0.30 read-only mode to fix a bug where opening a B-tree
rolls back changes someone else is making, even if we only use the tree
to read stuff from.
* "obnam backup" will now exit with a non-zero exit code if there were
any errors during a backup, and the problematic files were skipped.
Thanks, Peter Palfrader, for reporting the bug.
* "obnam forget" is now a bit faster.
* Hash collisions for filenames are now handled.
Version 0.26, released 2012-03-26; a BETA release
-------------------------------------------------
* Clients now lock the parts of the backup repository they're using,
while making any changes, so that multiple clients can work at the
same time without corrupting the repository.
* Now depends on a larch 0.28, which uses journalling to avoid on-disk
inconsistencies and corruption during crashes.
* Compression and encryption can now be used together.
Version 0.25, released 2012-02-18; a BETA release
-------------------------------------------------
* Log files are now created with permissions that allow only the owner
to read or write them. This fixes a privacy leak.
* The `nagios-last-backup-age` subcommand is useful for setting up Nagios
(or similar systems) to check that backups get run properly. Thanks to
Peter Palfrader for the patch.
* Some clarification on how the forget policy works, prompted by questions
from Peter Palfrader.
* New settings `ssh-known-hosts` (for choosing which file to check for
known host keys), `strict-ssh-host-keys` (for disallowing unknown host
keys), and `ssh-key` (for choosing which key file to use for SSH
connections) allow better and safer use of ssh.
* Checkpoints will now happen even in the middle of files (but between
chunks).
* The `--pretend` option now works for backups as well.
BUG FIXES:
* `obnam ls` now shows the correct timestamps for generations.
Thanks, Anders Wirzenius.
Version 0.24.1, released 2011-12-24; a BETA release
-------------------------------------------------
BUG FIX:
* Fix test case for file timestamps with sub-second resolution. Not all
filesystems have that, so the test case has been changed to accept lack
of sub-second timestamps.
Version 0.24, released 2011-12-18; a BETA release
-------------------------------------------------
USER VISIBLE CHANGES
* The way file timestamps (modification and access times) have changed,
to fix inaccuracies introduced by the old way. Times are now stored
as two integers giving full seconds and nanoseconds past the full
second, instead of the weird earlier system that was imposed by Python's
use of floating point for the timestamps. This causes the repository
format version to be bumped, resulting in a need to start over with an
empty repository.
* Extended file attributes are now backed up from and restored to local
filesystems. They are neither backed up, nor restored for live data
accessed over SFTP.
* If the `--exclude` regular expression is wrong, Obnam now gives an
error message and then ignores the regexp, rather than crashing.
* There is now a compression plugin, enabled with `--compress-with=gzip`.
* De-duplication mode can now be chosen by the user: the new
`--deduplicate` setting can be one of `never` (fast, but uses more space);
`verify` (slow, but handles hash collisions gracefully); and
`fatalist` (fast, but lossy, if there is a hash collision). `fatalist`
is the default mode.
* Restores now obey the `--dry-run` option. Thanks to Peter Palfreder for
the bug report.
* New option `--verify-randomly` allows you to check only a part of the
backup, instead of everything.
* Verify now has some progress reporting.
* Forget is now much faster.
* Forget now has progress reporting. It is not fast enough to do without,
sorry.
* Backup now removes any checkpoint generations it created during a backup
run, if it succeeds without errors.
BUG FIXES:
* Now works with a repository on sshfs. Thanks to Dafydd Harries for
reporting the problem.
* Now depends on a newer version of the larch library, fixing a problem
when the Obnam default node size changes and an existing repository
has a different size.
* User and group names for sftp live data are no longer queried from the
local system. Instead, they're marked as unknown.
Version 0.23, released 2011-10-02; a BETA release
-------------------------------------------------
USER VISIBLE CHANGES:
* `restore` now shows a progress bar.
* `fsck` now has more useful progress reporting, and does more checking,
including the integrity of the contents of file content.
* `fsck` now also checks the integrity of the B-trees in the repository,
so that it is not necessary to run `fsck-larch` manually anymore. This
works remotely as well, whereas `fsck-larch` only worked on B-trees
on the local filesystem.
* `force-lock` now gives a warning if the client does not exist in the
repository.
* Subcommands for encryption now give a warning if encryption key is not
given.
* The `--fsck-fix` option will now instruct `obnam fsck` to try to fix
problems found. For this release, it only means fixing B-tree missing
node problems, but more will follow.
* The default sizes have been changed for B-tree nodes (256 KiB)
and file contents chunks (1 MiB), based on benchmarking.
* SFTP protocol use has been optimized, which should result in some
more speed. This also highlights the need to change obnam so it can
do uploads in the background.
* If a client does not exist in the repository, `force-lock` now gives
a warning to the user, rather than ignoring it silently.
DEVELOPER CHANGES:
* New `--sftp-delay=100` option can be used to simulate SFTP backups over
networks with long round trip times.
* `obnam-benchmark` can now use `--sftp-delay` and other changes to make
it more useful.
INTERNAL CHANGES:
* Got rid of terminal status plugin. Now, the `Application` class provides
a `ttystatus.TerminalStatus` instance instead, in the `ts` attribute.
Other plugings are supposed to use that for progress reporting and
messaging to the user.
* The `posix_fadvise` system call is used only if available. This should
improve Obnam's portability a bit.
Version 0.22, released 2011-08-25; a BETA release
-------------------------------------------------
USER VISIBLE CHANGES:
* Obnam now reports its current configuration in the log file at startup.
This will hopefully remove one round of "did you use the --foo option?"
questions between developers and bug reporters.
BUG FIXES:
* The repository is now unlocked on exit only if it is still locked.
* A wrongly caught `GeneratorExit` is now dealt with properly.
* Keyboard interrupts are logged, so they don't show up as anonymous errors.
CHANGES RELEVANT TO DEVELOPERS ONLY:
* `setup.py` has been enhanced to work more like the old `Makefile` did:
`clean` removes more artifacts. Instructions in `README` have been updated
to point at `setup.py`.
* Compiler warning about `_XOPEN_SOURCE` re-definition fixed.
* Tests are now again run during a Debian package build.
Version 0.21, released 2011-08-23; a BETA release
-------------------------------------------------
USER VISIBLE CHANGES:
* Obnam will now unlock the repository if there's an error during a backup.
For the most part, the `force-lock` operation should now be unnecessary,
but it's still there in case it's useful some day.
BUG FIXES:
* Negative timestamps for files now work. Thanks to Jamil Djadala
for reporting the bug.
* The documentation for --checkpoint units fixed. Thanks, user weinzwang
from IRC.
* The connections to the repository and live data filesystem are now
properly closed. This makes benchmark read/write statistics be correct.
Version 0.20.1, released 2011-08-11; a BETA release
-------------------------------------------------
BUG FIXES:
* More cases of Unicode strings versus plain strings in filenames
over SFTP fixed. Thanks to Tapani Tarvainen.
Version 0.20, released 2011-08-09; a BETA release
-------------------------------------------------
BUG FIXES:
* Non-ASCII filenames over SFTP root now work. (Thanks, Tapani Tarvainen,
for the reproducible bug report.)
* The count of files while making a backup now counts all files found,
not just those backed up. The old behavior was confusing people.
USER VISIBLE CHANGES:
* The output of `obnam ls` now formats the columns a little prettier,
so that wide values do not cause misalignment.
* The error message when trying to use an encrypted repository without
encryption is now better (and suggests missing encryption being the
reason). Thanks, chrysn.
* Obnam now supports backing up of Unix sockets.
Version 0.19, released 2011-08-03; a BETA release
-------------------------------------------------
INCOMPATIBILITY CHANGES:
* We now require version 0.21 of the `larch` library, and this requires
bumping the repository format. This means old backup repositories can't
be used with this version, and you need to back up everything again.
(Please tell me when this becomes a problem.)
BUG FIXES:
* Found one more place where a file going missing during a backup may
cause a crash.
* Typo in error message about on-disk formats fixed.
(Thanks, Tapani Tarvainen.)
* The `--trace` option works again.
* `fcntl.F_SETFL` does not seem to work on file descriptors for files
owned by root that are read-only to the user running obnam. Worked
around by ignoring any problems with setting the flags.
* The funnest bug in this release: if no log file was specified with `--log`,
the current working directory was excluded from the backup.
USER VISIBLE CHANGES:
* `obnam(1)` manual page now discusses how configuration files are used.
* The manual page describes problems using sftp to access live data.
* The documentation for `--no-act` was clarified to say it only works
for `forget. (Thanks, Daniel Silverstone.)
* `obnam-benchmark` now has a manual page.
* The backup plugin logs files it excludes, so the user can find out what's
going on. A confused user is an unhappy user.
INTERNAL STUFF:
* Tracing statements added to various parts of the code, to help debug
mysterious problems.
* All exceptions are derived from `obnamlib.AppException` or
`obnamlib.Error`, and those are derived from `cliapp.AppException`,
so that the user gets nicer error messages than Python stack traces.
* `blackboxtests` is no longer run under fakeroot, because Debian packages
are built under fakeroot, and fakeroot within fakeroot causes trouble.
However, the point of running tests under fakeroot was to make sure
certain kinds of bugs are caught, and since Debian package building runs
the tests anyway, the test coverage is not actually diminished.
* The `Makefile` has new targets `fast-check` and `network-tests`. The
latter runs tests over sftp to localhost.
Version 0.18, released 2011-07-20; a BETA release
-------------------------------------------------
* The repository format has again changed in an incompatible manner,
so you will need to re-backup everything again. (If this is a problem,
tell me, and I'll consider adding backwards compatibility before 1.0
is released.)
* New option `--exclude-caches` allows automatic exclusion of cache
directories that are marked as such.
* Obnam now makes files in the repository be read-only, so that they're
that much harder to delete by mistake.
* Error message about files that can't be backed up now mentions the
correct file.
* Bugfix: unreadable files and directories no longer cause the backup
to fail. The problems are reported, but the backup continues.
Thanks to Jeff Epler for reporting the bug.
* Speed improvement from Jeff Epler for excluding files from backups.
* Various other speed improvements.
* Bugfix: restoring symlinks now works even if the symlink is restored
before its target. Also, the permissions of the symlink (rather than its
target) are now restored correctly. Thanks to Jeff Epler for an
exemplary bug report.
* New option `--one-file-system`, from Jeff Epler.
* New benchmarking tool `obnam-benchmark`, which is more flexible than
the old `run-benchmark`.
* When encrypting/decrypting data with GnuPG, temporary files are no
longer used.
* When verifying, `.../foo` and `.../foo/` now work the same way.
* New option `--symmetric-key-bits`.
* The chunk directory uses more hierarchy levels, and the way chunks
are stored there is now user-configurable (but you'll get into trouble
if you don't always use the same configuration). This should speed
things up a bit once the number of chunks grows very large.
* New `--chunkids-per-group` option, for yet more knobs to tweak when
searching for optimal performance.
* Local files are now opened using `O_NOATIME` so they can be backed
up without affecting timestamps.
* Now uses the `cliapp` framework for writing command line applications.
The primary user-visible effect is that the manpage now has an
accurate list of options.
* Bugfix: Obnam now again reports VFS I/O statistics.
* Bugfix: Obnam can again back up live data that is accessed using sftp.
Thanks to Tapani Tarvainen for reporting the problem.
Version 0.17, released 2011-05-21; a BETA release
-------------------------------------------------
* This is the second BETA release.
* The `run-benchmark` script now works with the new version of `seivot`.
The only benchmark size is one gibibyte, for now, because Obnam's too
slow to do big ones in reasonable time. As an aside, the benchmark
script got rewritten in Python, so it can be made more flexible.
* Benchmarks are run using encrypted backups.
* The kernel buffer cache is dropped before each obnam run, so the
benchmark result is more realistic (read: slower).
* Obnam now rotates its logs. See `--log-max` and `--log-keep` options
in the manual page. The default location for the log file is now
`~/.cache/obnam/obnam.log` for people, and
`/var/log/obnam.log` for root.
* Obnam now restores sparse files correctly.
* There have been some speed improvements to Obnam.
* The `--repository` option now has the shorter alias `-r`, since it
gets used so often.
* `obnam force-lock` now merely gives an error message, instead of a
Python stack trace, if the repository does not exist.
* Obnam now does not crash if files go missing during a backup, or can't
be read, or there are other problems with them. It will report the
problem, but then continue as if it had never heard of the file.
* Obnam now supports FIFO files.
* Obnam now verifies checksums when it restores files.
* Obnam now stores the checksum for the whole file, not just the checksum
for each chunk of its contents.
* Obnam's own log file is automatically excluded from backups.
* Obnam now stores and restores file timestamps to full accuracy,
instead of truncating them to whole seconds.
* The format of the backup repository has changed in an incompatible way,
and Obnam will now refuse to use an old repository. This means you
will need to use an old version to restore from them, and need to
re-backup everything. Sorry.
Version 0.16, released 2011-07-17; a BETA release
-------------------------------------------------
* This is the first BETA release. Obnam should now be feature complete
for real use. Performance is lacking and there are many bugs remaining.
There are no known bugs that would corrupt backed up data, or prevent
its recovery.
* Add encryption support. See the manual page for how to use it.
Version 0.15.1, released 2011-03-21; an ALPHA release
----------------------------------------------------
* Fix `setup.py` to not import `obnamlib`, so it works when building under
pbuilder on Debian. Meh.
Version 0.15, released 2011-03-21; an ALPHA release
----------------------------------------------------
Bugs fixed:
* Manual page GPL copyright blurb is now properly marked up as a comment.
(Thanks, Joey Hess.)
* README now links to python-lru correctly. (Thanks, Erik Johansson.)
Improvements and other changes:
* Filenames and directories are backed up in sorted order. This should
make it easier to know how far obnam's gotten.
* The location where backups are stored is now called the repository,
instead of the store. Suggested by Joey Hess.
* The repository and the target directory for restored data are now
both created by Obnam, if they don't already exist. Suggested by
Joey Hess.
* Better control of logging, using the new `--trace` option.
* Manual page now explains making backups a little better.
* Default value for `--lru-size` reduced to 500, for great improvement
in memory used, without, it seems, much decrease in speed.
* `obnam verify` now reports success explicitly. Based on question
from Joey Hess.
* `obnam verify` now accepts both non-option arguments and the `--root`
option. Suggested by Joey Hess.
* `obnam forget` now accepts "generation specifiers", not just numeric
generation ids. This means that `obnam forget latest` works.
* I/O statistics are logged more systematically.
* `obnam force-lock` introduced, to allow breaking a lock left behind
if obnam crashes. But it never does, of course. (Well, except if there's
a bug, like when a file changes at the wrong moment.)
* `obnam genids` introduced, to list generation ids without any other data.
The old command `obnam generations` still works, and lists other info
about each generation as well, but that's sometimes bad for scripting.
* The `--dump-memory-profile` option now accepts the value `simple`, for
reporting basic memory use. It has such a small impact that it's the
default.
* Obnam now stores the version of the on-disk format in the repository.
This should allow it to handle repositories created by a different
version and act suitably (hopefully without wiping all your backups).
Version 0.14, released 2010-12-29; an ALPHA release
----------------------------------------------------
This version is capable of backing up my laptop's home directory.
It is, however, still an ALPHA release, and you should not rely on
it as your sole form of backup. It is also slow. But if you're
curious, now would be a good time to try it out a bit.
Bug fixes:
* `COPYING` now contains GPL version 3, instead of 2. The code was
licensed under version 3 already. (Thank you Greg Grossmeier.)
* The manual page now uses `-` and `\-` correctly.
* `obnam forget` now actually removes data that is no longer used by
any generation.
* When backing up a new generation, if any of the root directories for
the backup got dropped by the user, they are now also removed from
the backup generation. Old generations obviously still have them.
* Only the per-client B-tree forest should have multiple trees. Now this
actually happens, whereas previously sometimes a very large number of
new trees would be created in some forests. (What's good for rain
forests is not good for saving disk space.)
* When recursing through directory trees, obnam no longer follows
symlinks to directories.
* obnam no longer creates a missing backup store when backing up to
a local disk. It never did this when backing up via sftp. (This
saves me from figuring out which of `store`, `stor`, and `sorte`
is the real directory.)
New features and stuff:
* `blackboxtest` has been rewritten to use Python's `unittest`
framework, rather than a homegrown bad re-implementation of some of it.
* `obnam ls` interprets arguments as "genspecs" rather than generation
identifiers. This means `obnam ls latest` works, and now `latest` is
also the default if you don't give any spec.
* `run-benchmarks` now outputs results into a git checkout of
, an ikiwiki instance hosted by
. The script also puts the results into
a suitable sub-directory, adds a page for the RSS feed of benchmark
results, and updates the report page that summarizes all stored results.
* There is now a 100 GiB benchmark.
* Clients are now called clients, instead of hosts. This terminology should
be clearer.
* The list of clients now stores a random integer identifier for each client
(unique within the store). The identifier is used as the name of the
per-client B-tree directory, rather than the hostname of the client.
This should prevent a teeny tiny bit of information leakage. It also
makes debugging things much harder.
* Various refactorings and prettifications of the code has happened.
For example, several classes have been split off from the `store.py`
module. This has also resulted in much better test coverage for those
classes.
* The per-client trees (formerly GenerationStore, now ClientMetadataTree)
have a more complicated key now: 4 parts, not 3. This makes it easier
to keep separate data about files, and other data that needs to be
stored per-generation, such as what the generation id is.
* `find-duplicate-chunks`, a tool for finding duplicate chunks of data
in a files in a directory tree, was added to the tree. I have used it
to find out if is worthwhile to do duplicate chunk removal at all.
(It is, at least for my data.) Also, it can be used to find good
values for chunk sizes for duplicate detection.
* The whole way in which obnam does de-duplication got re-designed and
re-implemented. This is tricky stuff, when there is more than one client.
* `SftpFS` now uses a hack copied from bzrlib, to use openssh if it is
available, and paramiko only if it is not. This speeds up sftp data
transfers quite a bit. (Where bzrlib supports more than just openssh,
we don't, since I have no way to test the other stuff. Patches welcome.)
* The way lists of chunk ids are stored for files got changed. Now we store
several ids per list item, which is faster and also saves some space
in the B-tree nodes. Also, it is now possible to append to the list,
which means the caller does not need to first gather a list of all ids.
Such a list gets quite costly when the file is quite big (e.g., in the
terabyte size).
* New `--dump-memory-profile` option was added to help do memory profiling
with meliae or heapy have been added. (Obnam's memory consumption finally
got annoying enough that I did something about it.)
Removed stuff:
* The functional specification was badly outdated, and has been removed.
I decided to stop kidding myself that I would keep it up to date.
* The store design document has been removed from the store tree.
The online version at is the
canonical version, and is actually kept up to date.
* The benchmark specification has likewise been replaced with
.
Version 0.13, released 2010-07-13; an ALPHA release
----------------------------------------------------
* Bug fix: a mistake in 0.12 caused checkpoints to happen after each
file after the first checkpoint. Now they happen at the right intervals
again.
* Upload speed is now displayed during backups.
* Obnam now tells the kernel that it shouldn't cache data it reads or
writes. It is not likely that data being backed up is going to be
needed again any time soon, so there's no point in caching it.
(The posix_fadvise call is used for this.)
* New --lru-size option sets size of LRU cache for nodes in memory.
The obnam default is large enough to suit large backups. This uses more
memory, but is faster than btree's small default of 100.
Version 0.12, released 2010-07-11; an ALPHA release
----------------------------------------------------
* NOTE: This version makes incompatible changes to the way data is stored
on-disk. Backups made with older versions are NOT supported. Sorry.
* The run-benchmark script has dropped some smaller sizes (they're too
fast to be interesting), and adds a 10 GiB test size.
* Various speed optimizations. Most importantly, the way file metadata
(results of lstat(2)) are encoded has changed. This is the incompatible
change from above. It's much faster now, though.
* Preliminary support for using SFTP for the backup store added. Hasn't
been used much yet, so might well be very buggy.
Version 0.11, released 2010-07-05; an ALPHA release
----------------------------------------------------
* Speed optimizations:
- chunk identifiers are now sequential, except for the first one, or
when there's a collision
- chunks are now stored in a more sensible directory hierarchy (instead
of one per directory, on average)
- adding files to a directory in the backup store is now faster
- only store a file's metadata that if it is changed
* New --exclude=regexp option to exclude files based on pathnames
* Obnam now makes checkpoints during backups. If a backup is aborted
in the middle and then re-started, it will continue from the latest
checkpoint rather than from the beginning of the previous backup run.
- New option --checkpoint to set the interval between checkpoints.
Defaults to 1 GiB.
* Options for various B-tree settings. This is mostly useful for finding
the optimal set of defaults, but may be useful in other situations for
some people.
- New options --chunk-group-size, --chunk-size, --node-size,
--upload-queue-size.
* Somewhat better progress reporting during backups.
Version 0.10, released 2010-06-29; an ALPHA release
---------------------------------------------------
* Rewritten from scratch.
* Old NEWS file entries removed (see bzr if you're interested).
obnam-1.6.1/README 0000644 0001750 0001750 00000012411 12246357067 013372 0 ustar jenkins jenkins Obnam, a backup program
=======================
Obnam is a backup program.
Home page
---------
The Obnam home page is at , see there
for more information.
Installation
------------
The source tree contains packaging for Debian. Run `debuild -us -uc -i.git` to
build an installation package.
On other systems, using the `setup.py` file should work: run
"python setup.py --help" for advice. If not, please report a bug.
(I've only tested `setup.py` enough for to build the Debian package.)
You need to install my Python B-tree library, and some of my other libraries
and tools, which you can get from:
*
*
* (for automatic tests)
*
*
*
*
*
* (for benchmarks)
You also need third party libraries:
* paramiko:
See debian/control for the full set of build dependencies and runtime
dependencies on a Debian system. (That set actually gets tested. The
above list is maintained manually and may get out of date from time
to time.)
Use
---
To get a quick help summary of options:
./obnam --help
To make a backup:
./obnam backup --repository /tmp/mybackup $HOME
For more information, see the manual page:
man -l obnam.1
Hacking
-------
Obnam source code is stored in git for version control purposes;
you can get a copy as follows:
git clone git://git.liw.fi/obnam
The 'master' branch is the main development one. Any bug fixes and
features should be developed in a dedicated branch, which gets merged
to master when the changes are done and considered good.
To build and run automatic tests:
./check
./check --fast # unit tests only, no black box tests
./check --network # requires ssh access to localhost
`check` is a wrapper around `python setup.py`, but since using that
takes several steps, the script makes things easier.
You need my CoverageTestRunner to run tests, see above for where to get it.
A couple of scripts exist to run benchmarks and profiles:
./metadata-speed 10000
./obnam-benchmark --size=1m/100k --results /tmp/benchmark-results
viewprof /tmp/benchmark-results/*/*backup-0.prof
seivots-summary /tmp/benchmark-results/*/*.seivot | less -S
There are two kinds of results: Python profiling output, and `.seivot`
files.
For the former, `viewprof` is a little helper script I wrote,
around the Python pstats module.
You can use your own, or get mine from extrautils
(). Running the benchmarks under profiling
makes them a little slower (typically around 10% for me, when I've
compared), but that's OK: the absolute numbers of the benchmarks are
less important than the relative ones. It's nice to be able to look at
the profiler output, if a benchmark is surprisingly slow, without
having to re-run it.
`seivots-summary` is a tool to display summaries of the measurements
made during a benchmark run. `seivot` is the tool that makes the
measurements. I typically save a number of benchmark results, so that
I can see how my changes affect performance over time.
If you make any changes, I welcome patches, either as plain diffs,
`git format-patch --cover-letter` mails, or public repositories I can
merge from.
The code layout is roughly like this:
obnamlib/ # all the real code
obnamlib/plugins/ # the plugin code (see pluginmgr.py)
obnam # script to invoke obnam
_obnammodule.c # wrapper around some system calls
In obnamlib, every code module has a corresponding test module,
and "make check" uses CoverageTestRunner to run them pairwise. For
each pair, test coverage must be 100% or the test will fail.
Mark statements that should not be included in coverage test with
"# pragma: no cover", if you really, really can't write a test.
without-tests lists modules that have no test modules.
If you want to make a new release of Obnam, I recommend following
my release checklist: .
Feedback
--------
I welcome bug fixes, enhancements, bug reports, suggestions, requests,
and other feedback. I prefer e-mail the mailing list:
see
for instructions.
It would be helpful if you can run `make clean check` before submitting
a patch, but it is not strictly required.
Legal stuff
-----------
Most of the code is written by Lars Wirzenius. (Please provide patches
so that can change.)
The code is covered by the GNU General Public License, version 3 or later.
Copyright 2010-2013 Lars Wirzenius
This program is free software: you can redistribute it and/or modify
it under the terms of the GNU General Public License as published by
the Free Software Foundation, either version 3 of the License, or
(at your option) any later version.
This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
You should have received a copy of the GNU General Public License
along with this program. If not, see .
obnam-1.6.1/_obnammodule.c 0000644 0001750 0001750 00000017720 12246357067 015327 0 ustar jenkins jenkins /*
* _obnammodule.c -- Python extensions for Obna
*
* Copyright (C) 2008, 2009 Lars Wirzenius
*
* This program is free software; you can redistribute it and/or modify
* it under the terms of the GNU General Public License as published by
* the Free Software Foundation; either version 3 of the License, or
* (at your option) any later version.
*
* This program is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
* GNU General Public License for more details.
*
* You should have received a copy of the GNU General Public License along
* with this program; if not, write to the Free Software Foundation, Inc.,
* 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
*/
/*
* This is a Python extension module written for Obnam, the backup
* software.
*
* This module provides a way to call the posix_fadvise function from
* Python. Obnam uses this to use set the POSIX_FADV_SEQUENTIAL and
* POSIX_FADV_DONTNEED flags, to make sure the kernel knows that it will
* read files sequentially and that the data does not need to be cached.
* This makes Obnam not trash the disk buffer cache, which is nice.
*/
#define _FILE_OFFSET_BITS 64
#include
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 600
#endif
#define _POSIX_C_SOURCE 200809L
#include
#include
#include
#include
#include
#include
#include
#ifdef __FreeBSD__
#include
#define NO_NANOSECONDS 1
#else
#include
#define NO_NANOSECONDS 0
#endif
static PyObject *
fadvise_dontneed(PyObject *self, PyObject *args)
{
#if POSIX_FADV_DONTNEED
int fd;
/* Can't use off_t for offset and len, since PyArg_ParseTuple
doesn't know it. */
unsigned long long offset;
unsigned long long len;
int ret;
if (!PyArg_ParseTuple(args, "iLL", &fd, &offset, &len))
return NULL;
ret = posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);
return Py_BuildValue("i", ret);
#else
return Py_BuildValue("i", 0);
#endif
}
static PyObject *
utimensat_wrapper(PyObject *self, PyObject *args)
{
int ret;
const char *filename;
long atime_sec, atime_nsec;
long mtime_sec, mtime_nsec;
#if NO_NANOSECONDS
struct timeval tv[2];
#else
struct timespec tv[2];
#endif
if (!PyArg_ParseTuple(args, "sllll",
&filename,
&atime_sec,
&atime_nsec,
&mtime_sec,
&mtime_nsec))
return NULL;
#if NO_NANOSECONDS
tv[0].tv_sec = atime_sec;
tv[0].tv_usec = atime_nsec / 1000;
tv[1].tv_sec = mtime_sec;
tv[1].tv_usec = mtime_nsec / 1000;
ret = lutimes(filename, tv);
#else
tv[0].tv_sec = atime_sec;
tv[0].tv_nsec = atime_nsec;
tv[1].tv_sec = mtime_sec;
tv[1].tv_nsec = mtime_nsec;
ret = utimensat(AT_FDCWD, filename, tv, AT_SYMLINK_NOFOLLOW);
#endif
if (ret == -1)
ret = errno;
return Py_BuildValue("i", ret);
}
/*
* Since we can't set nanosecond mtime and atimes on some platforms, also
* don't retrieve that level of precision from lstat(), so comparisons
* work.
*/
static unsigned long long
remove_precision(unsigned long long nanoseconds)
{
#if NO_NANOSECONDS
return nanoseconds - (nanoseconds % 1000);
#else
return nanoseconds;
#endif
}
static PyObject *
lstat_wrapper(PyObject *self, PyObject *args)
{
int ret;
const char *filename;
struct stat st = {0};
if (!PyArg_ParseTuple(args, "s", &filename))
return NULL;
ret = lstat(filename, &st);
if (ret == -1)
ret = errno;
return Py_BuildValue("iKKKKKKKLLLLKLKLK",
ret,
(unsigned long long) st.st_dev,
(unsigned long long) st.st_ino,
(unsigned long long) st.st_mode,
(unsigned long long) st.st_nlink,
(unsigned long long) st.st_uid,
(unsigned long long) st.st_gid,
(unsigned long long) st.st_rdev,
(long long) st.st_size,
(long long) st.st_blksize,
(long long) st.st_blocks,
(long long) st.st_atim.tv_sec,
remove_precision(st.st_atim.tv_nsec),
(long long) st.st_mtim.tv_sec,
remove_precision(st.st_mtim.tv_nsec),
(long long) st.st_ctim.tv_sec,
remove_precision(st.st_ctim.tv_nsec));
}
static PyObject *
llistxattr_wrapper(PyObject *self, PyObject *args)
{
const char *filename;
size_t bufsize;
PyObject *o;
char* buf;
ssize_t n;
if (!PyArg_ParseTuple(args, "s", &filename))
return NULL;
#ifdef __FreeBSD__
bufsize = extattr_list_link(filename, EXTATTR_NAMESPACE_USER, NULL, 0);
buf = malloc(bufsize);
n = extattr_list_link(filename, EXTATTR_NAMESPACE_USER, buf, bufsize);
if (n >= 0) {
/* Convert from length-prefixed BSD style to '\0'-suffixed
Linux style. */
size_t i = 0;
while (i < n) {
unsigned char length = (unsigned char) buf[i];
memmove(buf + i, buf + i + 1, length);
buf[i + length] = '\0';
i += length + 1;
}
o = Py_BuildValue("s#", buf, (int) n);
} else {
o = Py_BuildValue("i", errno);
}
free(buf);
#else
bufsize = 0;
o = NULL;
do {
bufsize += 1024;
buf = malloc(bufsize);
n = llistxattr(filename, buf, bufsize);
if (n >= 0)
o = Py_BuildValue("s#", buf, (int) n);
else if (n == -1 && errno != ERANGE)
o = Py_BuildValue("i", errno);
free(buf);
} while (o == NULL);
#endif
return o;
}
static PyObject *
lgetxattr_wrapper(PyObject *self, PyObject *args)
{
const char *filename;
const char *attrname;
size_t bufsize;
PyObject *o;
if (!PyArg_ParseTuple(args, "ss", &filename, &attrname))
return NULL;
bufsize = 0;
o = NULL;
do {
bufsize += 1024;
char *buf = malloc(bufsize);
#ifdef __FreeBSD__
int n = extattr_get_link(filename, EXTATTR_NAMESPACE_USER, attrname, buf, bufsize);
#else
ssize_t n = lgetxattr(filename, attrname, buf, bufsize);
#endif
if (n >= 0)
o = Py_BuildValue("s#", buf, (int) n);
else if (n == -1 && errno != ERANGE)
o = Py_BuildValue("i", errno);
free(buf);
} while (o == NULL);
return o;
}
static PyObject *
lsetxattr_wrapper(PyObject *self, PyObject *args)
{
const char *filename;
const char *name;
const char *value;
int size;
int ret;
if (!PyArg_ParseTuple(args, "sss#", &filename, &name, &value, &size))
return NULL;
#ifdef __FreeBSD__
ret = extattr_set_link(filename, EXTATTR_NAMESPACE_USER, name, value, size);
#else
ret = lsetxattr(filename, name, value, size, 0);
#endif
if (ret == -1)
ret = errno;
return Py_BuildValue("i", ret);
}
static PyMethodDef methods[] = {
{"fadvise_dontneed", fadvise_dontneed, METH_VARARGS,
"Call posix_fadvise(2) with POSIX_FADV_DONTNEED argument."},
{"utimensat", utimensat_wrapper, METH_VARARGS,
"utimensat(2) wrapper."},
{"lstat", lstat_wrapper, METH_VARARGS,
"lstat(2) wrapper; arg is filename, returns tuple."},
{"llistxattr", llistxattr_wrapper, METH_VARARGS,
"llistxattr(2) wrapper; arg is filename, returns tuple."},
{"lgetxattr", lgetxattr_wrapper, METH_VARARGS,
"lgetxattr(2) wrapper; arg is filename, returns tuple."},
{"lsetxattr", lsetxattr_wrapper, METH_VARARGS,
"lsetxattr(2) wrapper; arg is filename, returns errno."},
{NULL, NULL, 0, NULL} /* Sentinel */
};
PyMODINIT_FUNC
init_obnam(void)
{
(void) Py_InitModule("_obnam", methods);
}
obnam-1.6.1/analyze-repository-files 0000755 0001750 0001750 00000007273 12246357067 017432 0 ustar jenkins jenkins #!/usr/bin/python
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
'''Analyze the files in an Obnam backup repository.
For performance reasons, it is best if Obnam does not write too many
files per directory, or too large or too small files. This program
analyzes all the files in an Obnam backup repository, or, indeed, any
local directory, and reports the following:
* total number of files
* sum of lengths of files
* number of files per directory: fewest, most, average, median
(both number and name of directory)
* size of files: smallest, largest, average, median
(both size and name of file)
'''
import os
import stat
import sys
class Stats(object):
def __init__(self):
self.dirs = list()
self.files = list()
def add_dir(self, dirname, count):
self.dirs.append((count, dirname))
def add_file(self, filename, size):
self.files.append((size, filename))
@property
def total_files(self):
return len(self.files)
@property
def sum_of_sizes(self):
return sum(size for size, name in self.files)
@property
def dirsizes(self):
self.dirs.sort()
num_dirs = len(self.dirs)
fewest, fewest_name = self.dirs[0]
most, most_name = self.dirs[-1]
average = sum(count for count, name in self.dirs) / num_dirs
median = self.dirs[num_dirs/2][0]
return fewest, fewest_name, most, most_name, average, median
@property
def filesizes(self):
self.files.sort()
num_files = len(self.files)
smallest, smallest_name = self.files[0]
largest, largest_name = self.files[-1]
average = sum(size for size, name in self.files) / num_files
median = self.files[num_files/2][0]
return smallest, smallest_name, largest, largest_name, average, median
def main():
stats = Stats()
for name in sys.argv[1:]:
stat_info = os.lstat(name)
if stat.S_ISDIR(stat_info.st_mode):
for dirname, subdirs, filenames in os.walk(name):
stats.add_dir(dirname, len(filenames) + len(subdirs))
for filename in filenames:
pathname = os.path.join(dirname, filename)
stat_info = os.lstat(pathname)
if stat.S_ISREG(stat_info.st_mode):
stats.add_file(pathname, stat_info.st_size)
elif stat.S_ISREG(stat_info.st_mode):
stats.add_file(name, stat_info.st_size)
print "total_files:", stats.total_files
print "sum of sizes:", stats.sum_of_sizes
fewest, fewest_name, most, most_name, average, median = stats.dirsizes
print "files per dir:"
print " fewest:", fewest, fewest_name
print " most:", most, most_name
print " average:", average
print " median:", median
smallest, smallest_name, largest, largest_name, average, median = \
stats.filesizes
print "file sizes:"
print " smallest:", smallest, smallest_name
print " largest:", largest, largest_name
print " average:", average
print " median:", median
if __name__ == '__main__':
main()
obnam-1.6.1/check 0000755 0001750 0001750 00000001446 12246357067 013523 0 ustar jenkins jenkins #!/bin/sh
# Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
set -e
python setup.py --quiet clean
python setup.py --quiet build_ext -i
rm -rf build
python setup.py --quiet check "$@"
obnam-1.6.1/check-lock-usage-from-log 0000755 0001750 0001750 00000023710 12246357067 017271 0 ustar jenkins jenkins #!/usr/bin/python
# Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
'''Check lock file usage from log files.
This program reads a number of Obnam log files, produced with tracing
for obnamlib, and analyses them for bugs when using lock files. Each
log file is assumed to be produced by a separate Obnam instance.
* Have any instances held the same lock during overlapping periods?
'''
import cliapp
import logging
import os
import re
import time
import ttystatus
timestamp_pat = \
r'^\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d (?P\d+\.\d+) .*'
lock_pat = re.compile(
timestamp_pat +
r'vfs_local.py:[0-9]*:lock: got lockname=(?P.*)')
unlock_pat = re.compile(
timestamp_pat +
r'vfs_local.py:[0-9]*:unlock: lockname=(?P.*)')
writefile_pat = re.compile(
timestamp_pat +
r'vfs_local.py:[0-9]*:write_file: write_file (?P.*)$')
overwritefile_pat = re.compile(
timestamp_pat +
r'vfs_local.py:[0-9]*:overwrite_file: overwrite_file (?P.*)$')
node_open_pat = re.compile(
timestamp_pat +
r'nodestore_disk.py:\d+:get_node: reading node \d+ from file '
r'(?P.*)$')
node_remove_pat = re.compile(
timestamp_pat +
r'vfs_local.py:\d+:remove: remove (?P.*/nodes/.*)$')
rename_pat = re.compile(
timestamp_pat +
r'vfs_local.py:\d+:rename: rename (?P\S+) (?P\S+)$')
class LogEvent(object):
def __init__(self, logfile, lineno, timestamp):
self.logfile = logfile
self.lineno = lineno
self.timestamp = timestamp
def sortkey(self):
return self.timestamp
class LockEvent(LogEvent):
def __init__(self, logfile, lineno, timestamp, lockname):
LogEvent.__init__(self, logfile, lineno, timestamp)
self.lockname = lockname
def __str__(self):
return 'Lock(%s)' % self.lockname
class UnlockEvent(LockEvent):
def __str__(self):
return 'Unlock(%s)' % self.lockname
class WriteFileEvent(LogEvent):
def __init__(self, logfile, lineno, timestamp, filename):
LogEvent.__init__(self, logfile, lineno, timestamp)
self.filename = filename
def __str__(self):
return 'WriteFile(%s)' % self.filename
class OverwriteFileEvent(WriteFileEvent):
def __str__(self):
return 'OverwriteFile(%s)' % self.filename
class NodeCreateEvent(LogEvent):
def __init__(self, logfile, lineno, timestamp, node_id):
LogEvent.__init__(self, logfile, lineno, timestamp)
self.node_id = node_id
def __str__(self):
return 'NodeCreate(%s)' % self.node_id
class NodeDestroyEvent(NodeCreateEvent):
def __str__(self):
return 'NodeDestroy(%s)' % self.node_id
class NodeReadEvent(NodeCreateEvent):
def __str__(self):
return 'NodeOpen(%s)' % self.node_id
class RenameEvent(LogEvent):
def __init__(self, logfile, lineno, timestamp, old, new):
LogEvent.__init__(self, logfile, lineno, timestamp)
self.old = old
self.new = new
def __str__(self):
return 'Rename(%s -> %s)' % (self.old, self.new)
class CheckLocks(cliapp.Application):
def setup(self):
self.events = []
self.errors = 0
self.latest_opened_node = None
self.patterns = [
(lock_pat, self.lock_event),
(unlock_pat, self.unlock_event),
(writefile_pat, self.writefile_event),
(overwritefile_pat, self.overwritefile_event),
(node_open_pat, self.read_node_event),
(node_remove_pat, self.node_remove_event),
(rename_pat, self.rename_event),
]
self.ts = ttystatus.TerminalStatus()
self.ts.format(
'Reading %ElapsedTime() %Integer(lines): %Pathname(filename)')
self.ts['lines'] = 0
def cleanup(self):
self.ts.clear()
self.analyse_phase_1()
self.ts.finish()
if self.errors:
raise cliapp.AppException('There were %d errors' % self.errors)
def error(self, msg):
logging.error(msg)
self.ts.error(msg)
self.errors += 1
def analyse_phase_1(self):
self.events.sort(key=lambda e: e.sortkey())
self.events = self.create_node_events(self.events)
self.ts.format('Phase 1: %Index(event,events)')
self.ts['events'] = self.events
self.ts.flush()
current_locks = set()
current_nodes = set()
for e in self.events:
self.ts['event'] = e
logging.debug(
'analysing: %s:%s: %s: %s' %
(e.logfile, e.lineno, repr(e.sortkey()), str(e)))
if type(e) is LockEvent:
if e.lockname in current_locks:
self.error(
'Re-locking %s: %s:%s:%s' %
(e.lockname, e.logfile, e.lineno,
e.timestamp))
else:
current_locks.add(e.lockname)
elif type(e) is UnlockEvent:
if e.lockname not in current_locks:
self.error(
'Unlocking %s which was not locked: %s:%s:%s' %
(e.lockname, e.logfile, e.lineno,
e.timestamp))
else:
current_locks.remove(e.lockname)
elif type(e) in (WriteFileEvent, OverwriteFileEvent):
lockname = self.determine_lockfile(e.filename)
if lockname and lockname not in current_locks:
self.error(
'%s:%s: '
'Write to file %s despite lock %s not existing' %
(e.logfile, e.lineno, e.filename, lockname))
elif type(e) is NodeCreateEvent:
if e.node_id in current_nodes:
self.error(
'%s:%s: Node %s already exists' %
(e.logfile, e.lineno, e.node_id))
else:
current_nodes.add(e.node_id)
elif type(e) is NodeDestroyEvent:
if e.node_id not in current_nodes:
self.error(
'%s:%s: Node %s does not exist' %
(e.logfile, e.lineno, e.node_id))
else:
current_nodes.remove(e.node_id)
elif type(e) is NodeReadEvent:
if e.node_id not in current_nodes:
self.error(
'%s:%s: Node %s does not exist' %
(e.logfile, e.lineno, e.node_id))
elif type(e) is RenameEvent:
if e.old in current_nodes:
current_nodes.remove(e.old)
current_nodes.add(e.new)
else:
raise NotImplementedError()
def create_node_events(self, events):
new = []
for e in events:
new.append(e)
if type(e) in (WriteFileEvent, OverwriteFileEvent):
if '/nodes/' in e.filename:
new_e = NodeCreateEvent(
e.logfile, e.lineno, e.timestamp, e.filename)
new_e.timestamp = e.timestamp
new.append(new_e)
return new
def determine_lockfile(self, filename):
if filename.endswith('/lock'):
return None
toplevel = filename.split('/')[0]
if toplevel == 'chunks':
return None
if toplevel in ('metadata', 'clientlist'):
return './lock'
return toplevel + '/lock'
def process_input(self, name):
self.ts['filename'] = name
return cliapp.Application.process_input(self, name)
def process_input_line(self, filename, line):
self.ts['lines'] = self.global_lineno
for pat, func in self.patterns:
m = pat.search(line)
if m:
event = func(filename, line, m)
if event is not None:
self.events.append(event)
def lock_event(self, filename, line, match):
return LockEvent(
filename, self.lineno, float(match.group('timestamp')),
match.group('lock'))
def unlock_event(self, filename, line, match):
return UnlockEvent(
filename, self.lineno, float(match.group('timestamp')),
match.group('lock'))
def writefile_event(self, filename, line, match):
return WriteFileEvent(
filename, self.lineno, float(match.group('timestamp')),
match.group('filename'))
def overwritefile_event(self, filename, line, match):
return OverwriteFileEvent(
filename, self.lineno, float(match.group('timestamp')),
match.group('filename'))
def read_node_event(self, filename, line, match):
node_id = match.group('nodeid')
if not os.path.basename(node_id).startswith('tmp'):
return NodeReadEvent(
filename, self.lineno, float(match.group('timestamp')),
node_id)
def node_remove_event(self, filename, line, match):
return NodeDestroyEvent(
filename, self.lineno, float(match.group('timestamp')),
match.group('nodeid'))
def rename_event(self, filename, line, match):
return RenameEvent(
filename, self.lineno, float(match.group('timestamp')),
match.group('old'), match.group('new'))
CheckLocks().run()
obnam-1.6.1/confs/ 0000755 0001750 0001750 00000000000 12246357067 013623 5 ustar jenkins jenkins obnam-1.6.1/confs/common.conf 0000644 0001750 0001750 00000000037 12246357067 015762 0 ustar jenkins jenkins [config]
with-encryption = yes
obnam-1.6.1/confs/historical-local.conf 0000644 0001750 0001750 00000000111 12246357067 017714 0 ustar jenkins jenkins [config]
profile-name = historical-local
size = 4k/4k
generations = 1000
obnam-1.6.1/confs/media-local.conf 0000644 0001750 0001750 00000000124 12246357067 016636 0 ustar jenkins jenkins [config]
profile-name = media-local
size = 1g/100m
file-size = 100m
generations = 2
obnam-1.6.1/confs/sourcecode-local.conf 0000644 0001750 0001750 00000000127 12246357067 017715 0 ustar jenkins jenkins [config]
profile-name = sourcecode-local
size = 1g/10m
file-size = 16k
generations = 2
obnam-1.6.1/crash-test 0000755 0001750 0001750 00000004034 12246357067 014517 0 ustar jenkins jenkins #!/bin/sh
# Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
set -eu
if [ "$#" != 1 ]
then
echo "usage: see source" 1>&2
exit 1
fi
N="$1"
tempdir="$(mktemp -d)"
echo "Temporary directory: $tempdir"
cat < "$tempdir/conf"
[config]
repository = $tempdir/repo
root = $tempdir/data
log = $tempdir/obnam.log
trace = larch
checkpoint = 1m
lock-timeout = 1
log-keep = 16
log-level = debug
trace = larch, obnamlib
EOF
# Do a minimal backup to make sure the repository works at least once, without the crash-limit option
mkdir "$tempdir/data"
./obnam backup --no-default-config --config "$tempdir/conf"
genbackupdata --create=100m "$tempdir/data"
echo "crash-limit = $N" >> "$tempdir/conf"
while true
do
# There's no need to delete this file because the first Exception message
# that appears in the file will terminate the test.
# rm -f "$tempdir/obnam.log"
echo "Trying backup with at most $N writes to repository"
./obnam force-lock --no-default-config --config "$tempdir/conf" 2>/dev/null
if ./obnam backup --no-default-config --config "$tempdir/conf" 2>/dev/null
then
echo "Backup finished ok, done"
break
fi
if ! grep -q '^Exception: Crashing as requested' "$tempdir/obnam.log"
then
echo "Backup terminated because of unrequested crash" 1>&2
exit 1
fi
# ./obnam fsck --no-default-config --config "$tempdir/conf" || true
done
rm -rf "$tempdir"
echo "OK"
obnam-1.6.1/create-vfat-disk-image 0000755 0001750 0001750 00000001544 12246357067 016656 0 ustar jenkins jenkins #!/bin/sh
# Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
set -eu
filename="$1"
size="$2"
qemu-img create -f raw "$filename" "$size"
#parted "$filename" mklabel msdos
#parted "$filename" mkpart primary fat32 0% 100%
/sbin/mkfs.vfat "$filename"
obnam-1.6.1/dumpobjs 0000755 0001750 0001750 00000002615 12246357067 014270 0 ustar jenkins jenkins #!/usr/bin/python
# Copyright (C) 2009 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import os
import sys
import time
import obnamlib
def find_objids(fs):
basenames = fs.listdir('.')
return [x[:-len('.obj')] for x in basenames if x.endswith('.obj')]
fs = obnamlib.LocalFS(sys.argv[1])
repo = obnamlib.Repository(fs, obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE, None,
obnamlib.IDPATH_DEPTH,
obnamlib.IDPATH_BITS,
obnamlib.IDPATH_SKIP,
time.time, 0)
for objid in find_objids(fs):
obj = repo.get_object(objid)
print 'id %s (%s):' % (obj.id, obj.__class__.__name__)
for name in obj.fieldnames():
print ' %-10s %s' % (name, repr(getattr(obj, name)))
obnam-1.6.1/find-duplicate-chunks 0000755 0001750 0001750 00000006202 12246357067 016622 0 ustar jenkins jenkins #!/usr/bin/python
#
# Report duplicate chunks of data in a filesystem.
import hashlib
import os
import subprocess
import sys
import tempfile
def compute(f, chunk_size, offset):
chunk = f.read(chunk_size)
while len(chunk) == chunk_size:
yield hashlib.md5(chunk).hexdigest()
chunk = chunk[offset:] + f.read(offset)
def compute_checksums(f, chunk_size, offset, dirname):
for dirname, subdirs, filenames in os.walk(dirname):
for filename in filenames:
pathname = os.path.join(dirname, filename)
if os.path.isfile(pathname) and not os.path.islink(pathname):
ff = file(pathname)
for checksum in compute(ff, chunk_size, offset):
f.write('%s\n' % checksum)
ff.close()
def sort_checksums(f, checksums_name):
subprocess.check_call(['sort',
'-T', '.',
'--batch-size', '1000',
'-S', '1G',
],
stdin=file(checksums_name),
stdout=f)
def count_duplicates(f, sorted_name):
subprocess.check_call(['uniq', '-c'], stdin=file(sorted_name), stdout=f)
def make_report(f, counts_name, chunk_size, offset):
num_diff_checksums = 0
saved = 0
total = 0
limits = [1]
counts = { 1: 0 }
for line in file(counts_name):
count, checksum = line.split()
count = int(count)
num_diff_checksums += 1
saved += (count-1) * chunk_size
total += count * chunk_size
while limits[-1] < count:
n = limits[-1] * 10
limits.append(n)
counts[n] = 0
for limit in limits:
if count <= limit:
counts[limit] += count
break
f.write('chunk size: %d\n' % chunk_size)
f.write('offset: %d\n' % offset)
f.write('#different checksums: %d\n' % num_diff_checksums)
f.write('%8s %8s\n' % ('repeats', 'how many'))
for limit in limits:
f.write('%8d %8d\n' % (limit, counts[limit]))
f.write('bytes saved by de-duplication: %d\n' % saved)
f.write('%% saved: %f\n' % (100.0*saved/total))
def main():
chunk_size = int(sys.argv[1])
offset = int(sys.argv[2])
dirname = sys.argv[3]
prefix = 'data-%04d-%04d' % (chunk_size, offset)
checksums_name = prefix + '.checksums'
sorted_name = prefix + '.sorted'
counts_name = prefix + '.counts'
report_name = prefix + '.report'
steps = (
(checksums_name, compute_checksums, (chunk_size, offset, dirname)),
(sorted_name, sort_checksums, (checksums_name,)),
(counts_name, count_duplicates, (sorted_name,)),
(report_name, make_report, (counts_name, chunk_size, offset)),
)
for filename, func, args in steps:
if not os.path.exists(filename):
print 'Step:', func.__name__
fd, output_name = tempfile.mkstemp(dir='.')
os.close(fd)
f = file(output_name, 'w')
func(*((f,) + args))
f.close()
os.rename(output_name, filename)
if __name__ == '__main__':
main()
obnam-1.6.1/lock-and-increment 0000755 0001750 0001750 00000002250 12246357067 016112 0 ustar jenkins jenkins #!/usr/bin/python
# Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import os
import sys
import obnamlib
dirname = sys.argv[1]
count_to = int(sys.argv[2])
filename = os.path.join(dirname, 'counter')
fs = obnamlib.LocalFS('/')
lm = obnamlib.LockManager(fs, 60, 'lock-and-increment')
for i in range(count_to):
lm.lock([dirname])
if fs.exists(filename):
data = fs.cat(filename)
counter = int(data)
counter += 1
fs.overwrite_file(filename, str(counter))
else:
fs.write_file(filename, str(1))
lm.unlock([dirname])
obnam-1.6.1/meliae-show 0000755 0001750 0001750 00000000517 12246357067 014656 0 ustar jenkins jenkins #!/usr/bin/python
from meliae import loader
from pprint import pprint as pp
import sys
om = loader.load(sys.argv[1])
om.remove_expensive_references()
print om.summarize()
print
for type_name in sys.argv[2:]:
objs = om.get_all(type_name)
for obj in objs[:5]:
pp(obj.p)
print om.summarize(obj)
print
obnam-1.6.1/metadata-speed 0000755 0001750 0001750 00000002576 12246357067 015331 0 ustar jenkins jenkins #!/usr/bin/python
# Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import os
import random
import shutil
import sys
import time
import obnamlib
def measure(n, func):
start = time.clock()
for i in range(n):
func()
end = time.clock()
return end - start
def main():
n = int(sys.argv[1])
fs = obnamlib.LocalFS('.')
fs.connect()
metadata = obnamlib.read_metadata(fs, '.')
encoded = obnamlib.encode_metadata(metadata)
calibrate = measure(n, lambda: None)
encode = measure(n, lambda: obnamlib.encode_metadata(metadata))
decode = measure(n, lambda: obnamlib.decode_metadata(encoded))
print 'encode: %.1f s' % (n/(encode - calibrate))
print 'decode: %.1f s' % (n/(decode - calibrate))
if __name__ == '__main__':
main()
obnam-1.6.1/obnam 0000755 0001750 0001750 00000001413 12246357067 013534 0 ustar jenkins jenkins #!/usr/bin/python
# Copyright (C) 2009 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import obnamlib
obnamlib.App(progname='obnam', version=obnamlib.__version__).run()
obnam-1.6.1/obnam-benchmark 0000755 0001750 0001750 00000020673 12246357067 015475 0 ustar jenkins jenkins #!/usr/bin/python
#
# Copyright 2010, 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import cliapp
import ConfigParser
import glob
import logging
import os
import shutil
import socket
import subprocess
import tempfile
class ObnamBenchmark(cliapp.Application):
default_sizes = ['1g/100m']
keyid = '3B1802F81B321347'
opers = ('backup', 'restore', 'list_files', 'forget')
def add_settings(self):
self.settings.string(['results'], 'put results under DIR (%default)',
metavar='DIR', default='../benchmarks')
self.settings.string(['obnam-branch'],
'use DIR as the obnam branch to benchmark '
'(default: %default)',
metavar='DIR',
default='.')
self.settings.string(['larch-branch'],
'use DIR as the larch branch (default: %default)',
metavar='DIR',
)
self.settings.string(['seivot-branch'],
'use DIR as the seivot branch '
'(default: installed seivot)',
metavar='DIR')
self.settings.boolean(['with-encryption'],
'run benchmark using encryption')
self.settings.string(['profile-name'],
'short name for benchmark scenario',
default='unknown')
self.settings.string_list(['size'],
'add PAIR to list of sizes to '
'benchmark (e.g., 10g/1m)',
metavar='PAIR')
self.settings.bytesize(['file-size'], 'how big should files be?',
default=4096)
self.settings.integer(['generations'],
'benchmark N generations (default: %default)',
metavar='N',
default=5)
self.settings.boolean(['use-sftp-repository'],
'access the repository over SFTP '
'(requires ssh to localhost to work)')
self.settings.boolean(['use-sftp-root'],
'access the live data over SFTP '
'(requires ssh to localhost to work)')
self.settings.integer(['sftp-delay'],
'add artifical delay to sftp transfers '
'(in milliseconds)')
self.settings.string(['description'], 'describe benchmark')
self.settings.boolean(['drop-caches'], 'drop kernel buffer caches')
self.settings.string(['seivot-log'], 'seivot log setting')
self.settings.boolean(['verify'], 'verify restores')
def process_args(self, args):
self.require_tmpdir()
obnam_revno = self.bzr_revno(self.settings['obnam-branch'])
if self.settings['larch-branch']:
larch_revno = self.bzr_revno(self.settings['larch-branch'])
else:
larch_revno = None
results = self.results_dir(obnam_revno, larch_revno)
obnam_branch = self.settings['obnam-branch']
if self.settings['seivot-branch']:
seivot = os.path.join(self.settings['seivot-branch'], 'seivot')
else:
seivot = 'seivot'
generations = self.settings['generations']
tempdir = tempfile.mkdtemp()
env = self.setup_gnupghome(tempdir)
sizes = self.settings['size'] or self.default_sizes
logging.debug('sizes: %s' % repr(sizes))
file_size = self.settings['file-size']
profile_name = self.settings['profile-name']
for pair in sizes:
initial, inc = self.parse_size_pair(pair)
msg = 'Profile %s, size %s inc %s' % (profile_name, initial, inc)
print
print msg
print '-' * len(msg)
print
obnam_profile = os.path.join(results,
'obnam--%(op)s-%(gen)s.prof')
output = os.path.join(results, 'obnam.seivot')
if os.path.exists(output):
print ('%s already exists, not re-running benchmark' %
output)
else:
argv = [seivot,
'--obnam-branch', obnam_branch,
'--incremental-data', inc,
'--file-size', str(file_size),
'--obnam-profile', obnam_profile,
'--generations', str(generations),
'--profile-name', profile_name,
'--sftp-delay', str(self.settings['sftp-delay']),
'--initial-data', initial,
'--output', output]
if self.settings['larch-branch']:
argv.extend(['--larch-branch', self.settings['larch-branch']])
if self.settings['seivot-log']:
argv.extend(['--log', self.settings['seivot-log']])
if self.settings['drop-caches']:
argv.append('--drop-caches')
if self.settings['use-sftp-repository']:
argv.append('--use-sftp-repository')
if self.settings['use-sftp-root']:
argv.append('--use-sftp-root')
if self.settings['with-encryption']:
argv.extend(['--encrypt-with', self.keyid])
if self.settings['description']:
argv.extend(['--description',
self.settings['description']])
if self.settings['verify']:
argv.append('--verify')
self.runcmd(argv, env=env)
shutil.rmtree(tempdir)
def require_tmpdir(self):
if 'TMPDIR' not in os.environ:
raise cliapp.AppException('TMPDIR is not set. '
'You would probably run out of space '
'on /tmp.')
if not os.path.exists(os.environ['TMPDIR']):
raise cliapp.AppException('TMPDIR points at a non-existent '
'directory %s' % os.environ['TMPDIR'])
logging.debug('TMPDIR=%s' % repr(os.environ['TMPDIR']))
@property
def hostname(self):
return socket.gethostname()
@property
def obnam_branch_name(self):
obnam_branch = os.path.abspath(self.settings['obnam-branch'])
return os.path.basename(obnam_branch)
def results_dir(self, obnam_revno, larch_revno):
parent = self.settings['results']
parts = [self.hostname, self.obnam_branch_name, str(obnam_revno)]
if larch_revno:
parts.append(str(larch_revno))
prefix = os.path.join(parent, "-".join(parts))
get_path = lambda counter: "%s-%d" % (prefix, counter)
counter = 0
dirname = get_path(counter)
while os.path.exists(dirname):
counter += 1
dirname = get_path(counter)
os.makedirs(dirname)
return dirname
def setup_gnupghome(self, tempdir):
gnupghome = os.path.join(tempdir, 'gnupghome')
shutil.copytree('test-gpghome', gnupghome)
env = dict(os.environ)
env['GNUPGHOME'] = gnupghome
return env
def bzr_revno(self, branch):
p = subprocess.Popen(['bzr', 'revno'], cwd=branch,
stdout=subprocess.PIPE)
out, err = p.communicate()
if p.returncode != 0:
raise cliapp.AppException('bzr failed')
revno = out.strip()
logging.debug('bzr branch %s has revno %s' % (branch, revno))
return revno
def parse_size_pair(self, pair):
return pair.split('/', 1)
if __name__ == '__main__':
ObnamBenchmark().run()
obnam-1.6.1/obnam-benchmark.1.in 0000644 0001750 0001750 00000006536 12246357067 016240 0 ustar jenkins jenkins .\" Copyright 2011 Lars Wirzenius
.\"
.\" This program is free software: you can redistribute it and/or modify
.\" it under the terms of the GNU General Public License as published by
.\" the Free Software Foundation, either version 3 of the License, or
.\" (at your option) any later version.
.\"
.\" This program is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public License
.\" along with this program. If not, see .
.\"
.TH OBNAM-BENCHMARK 1
.SH NAME
obnam-benchmark \- benchmark obnam
.SH SYNOPSIS
.SH DESCRIPTION
.B obnam-benchmark
benchmarks the
.BR obnam (1)
backup application,
by measuring how much time it takes to do a backup, restore, etc,
in various scenarios.
.B obnam-benchmark
uses the
.BR seivot (1)
tool for actually running the benchmarks,
but makes some helpful assumptions about things,
to make it simpler to run than running
.B seivot
directly.
.PP
Benchmarks are run using two different usage profiles:
.I mailspool
(all files are small), and
.I mediaserver
(all files are big).
For each profile,
test data of the desired total size is generated,
backed up,
and then several incremental generations are backed up,
each adding some more generated test data.
Then other operations are run against the backup repository:
restoring,
listing the contents of,
and removing each generation.
.PP
The result of the benchmark is a
.I .seivot
file per profile,
plus a Python profiler file for each run of
.BR obnam .
These are stored in
.IR ../benchmarks .
A set of
.I .seivot
files can be summarized for comparison with
.BR seivots-summary (1).
The profiling files can be viewed with the usual Python tools:
see the
.B pstats
module.
.PP
The benchmarks are run against a version of
.B obnam
checked out from version control.
It is not (currently) possible to run the benchmark against an installed
version of
.BR obnam.
Also the
.I larch
Python library,
which
.B obnam
needs,
needs to be checked out from version control.
The
.B \-\-obnam\-branch
and
.B \-\-larch\-branch
options set the locations,
if the defaults are not correct.
.SH OPTIONS
.SH ENVIRONMENT
.TP
.BR TMPDIR
This variable
.I must
be set.
It controls where the temporary files (generated test data) is stored.
If this variable was not set,
they'd be put into
.IR /tmp ,
which easily fills up,
to the detriment of the entire system.
Thus.
.B obnam-benchmark
requires that the location is set explicitly.
(You can still use
.I /tmp
if you want, but you have to set
.B TMPDIR
explicitly.)
.SH FILES
.TP
.BR ../benchmarks/
The default directory where results of the benchmark are stored,
in a subdirectory named after the branch and revision numbers.
.SH EXAMPLE
To run a small benchmark:
.IP
TMPDIR=/var/tmp obnam-benchmark --size=10m/1m
.PP
To run a benchmark using existing data:
.IP
TMPDIR=/var/tmp obnam-benchmark --use-existing=$HOME/Mail
.PP
To view the currently available benchmark results:
.IP
seivots-summary ../benchmarks/*/*mail*.seivot | less -S
.br
seivots-summary ../benchmarks/*/*media*.seivot | less -S
.PP
(You need to run
.B seivots-summary
once per usage profile.)
.SH "SEE ALSO"
.BR obnam (1),
.BR seivot (1),
.BR seivots-summary (1).
obnam-1.6.1/obnam-viewprof 0000755 0001750 0001750 00000001764 12246357067 015404 0 ustar jenkins jenkins #!/usr/bin/python
# Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import pstats
import sys
if len(sys.argv) not in [2, 3]:
sys.stderr.write('Usage: obnam-viewprof foo.prof [sort-order]\n')
sys.exit(1)
if len(sys.argv) == 3:
order = sys.argv[2]
else:
order = 'cumulative'
p = pstats.Stats(sys.argv[1])
p.strip_dirs()
p.sort_stats(order)
p.print_stats()
p.print_callees()
obnam-1.6.1/obnam-viewprof.1 0000644 0001750 0001750 00000002553 12246357067 015535 0 ustar jenkins jenkins .\" Copyright 2012 Lars Wirzenius
.\"
.\" This program is free software: you can redistribute it and/or modify
.\" it under the terms of the GNU General Public License as published by
.\" the Free Software Foundation, either version 3 of the License, or
.\" (at your option) any later version.
.\"
.\" This program is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public License
.\" along with this program. If not, see .
.\"
.TH OBNAM-VIEWPROF 1
.SH NAME
obnam-viewprof \- show Python profiler output
.SH SYNOPSIS
.B obnam-viewprof
.I profile
.RI [ sort-order ]
.SH DESCRIPTION
.B obnam-viewprof
shows a plain text version of Python profiler output.
You can generate such output from Obnam by setting the
.B OBNAM_PROFILE
environment variable to a filename.
The profile will be written to that filename,
and you should give it to
.B obnam-viewprof
as an argument.
.PP
The
.I sort-order
argument defaults to
.B cumulative
and can be any of the orderings that the Python pstats library supports.
.PP
.B obnam-viewprof
is mainly useful for those developing
.BR obnam (1).
.SH "SEE ALSO"
.BR obnam (1).
obnam-1.6.1/obnam.1.in 0000644 0001750 0001750 00000044436 12246357067 014311 0 ustar jenkins jenkins .\" Copyright 2010-2013 Lars Wirzenius
.\"
.\" This program is free software: you can redistribute it and/or modify
.\" it under the terms of the GNU General Public License as published by
.\" the Free Software Foundation, either version 3 of the License, or
.\" (at your option) any later version.
.\"
.\" This program is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public License
.\" along with this program. If not, see .
.TH OBNAM 1
.SH NAME
obnam \- make, restore, and manipulate backups
.SH SYNOPSIS
.SH DESCRIPTION
.B obnam
makes, restores, manipulates, and otherwise deals with backups.
It can store backups on a local disk or to a server via sftp.
Every backup generation looks like a fresh snapshot,
but is really incremental:
the user does not need to worry whether it's a full backup or not.
Only changed data is backed up,
and if a chunk of data is already backed up in another file,
that data is re-used.
.PP
The place where backed up data is placed is called the
backup repository.
A repository may be, for example, a directory on an sftp server,
or a directory on a USB hard disk.
A single repository may contain backups from several clients.
Their data will intermingle as if they were using separate repositories,
but if one client backs up a file, the others may re-use the data.
.PP
.B obnam
command line syntax consists of a
.I command
possibly followed by arguments.
The commands are list below.
.IP \(bu
.B backup
makes a new backup.
The first time it is run, it makes a full backup,
after that an incremental one.
.IP \(bu
.B restore
is the opposite of a backup.
It copies backed up data from the backup repository to a target directory.
You can restore everything in a generation,
or just selected files.
.IP \(bu
.B clients
lists the clients that are backed up to the repository.
.IP \(bu
.B generations
lists every backup generation for a given client,
plus some metadata about the generation.
.IP \(bu
.B genids
lists the identifier for every backup generation for a given client.
No other information is shown.
This can be useful for scripting.
.IP \(bu
.B ls
lists the contents of a given generation, similar to
.BR "ls \-lAR" .
.IP \(bu
.B verify
compares data in the backup with actual user data,
and makes sures they are identical.
It is most useful to run immediately after a backup,
to check that it actually worked.
It can be run at any time,
but if the user data has changed,
verification fails even though the backup is OK.
.IP \(bu
.B forget
removes backup generations that are no longer wanted,
so that they don't use disk space.
Note that after a backup generation is removed
the data can't be restored anymore.
You can either specify the generations to remove by listing them
on the command line,
or use the
.B \-\-keep
option to specify a policy for what to keep (everything else will
be removed).
.IP \(bu
.B fsck
checks the internal consistency of the backup repository.
It verifies that all clients, generations, directories, files, and
all file contents still exists in the backup repository.
It may take quite a long time to run.
.IP \(bu
.B force\-lock
removes a lock file for a client in the repository.
You should only force a lock if you are sure no-one is accessing that
client's data in the repository.
A dangling lock might happen, for example, if obnam loses its network
connection to the backup repository.
.IP \(bu
.B client\-keys
lists the encryption key associated with each client.
.IP \(bu
.B list\-keys
lists the keys that can access the repository,
and which toplevel directories each key can access.
Some of the toplevel directories are shared between clients,
others are specific to a client.
.IP \(bu
.B list\-toplevels
is like
.BR list\-keys ,
but lists toplevels and which keys can access them.
.IP \(bu
.B add\-key
adds an encryption key to the repository.
By default, the key is added only to the shared toplevel directories,
but it can also be added to specific clients:
list the names of the clients on the command line.
They key is given with the
.B \-\-keyid
option.
Whoever has access to the secret key corresponding to the key id
can access the backup repository
(the shared toplevels plus specified clients).
.IP \(bu
.B remove\-key
removes a key from the shared toplevel directories,
plus any clients specified on the command line.
.IP \(bu
.B nagios\-last\-backup\-age
is a check that exits with non-zero return if a backup age exceeds a certain
threshold. It is suitable for use as a check plugin for nagios. Thresholds
can be given the
.B \-\-warn-age
and
.B \-\-critical-age
options.
.IP \(bu
.B diff
compares two generations and lists files differing between them. Every output
line will be prefixed either by a plus sign (+) for files that were added, a
minus sign (-) for files that have been removed or an asterisk (*) for files
that have changed. If only one generation ID is specified on the command line
that generation will be compared with its direct predecessor. If two IDs have
been specified, all changes between those two generations will be listed.
.IP \(bu
.B mount
makes the backup repository available via a read-only FUSE filesystem.
This means you can look at backed up data using normal tools,
such as your GUI file manager,
or command line tools such as
.BR ls (1),
.BR diff (1),
and
.BR cp (1).
You can't make new backups with the mount subcommand,
but you can restore data easily.
.IP
You need to have the FUSE utilities and have permission to use FUSE
for this to work.
The details will vary between operating systems;
in Debian, install the package
.I fuse
and add yourself to the
.I fuse
group (you may need to log out and back in again).
.SS "Making backups"
When you run a backup,
.B obnam
uploads data into the backup repository.
The data is divided into chunks,
and if a chunk already exists in the backup repository,
it is not uploaded again.
This allows
.B obnam
to deal with files that have been changed or renamed since the previous
backup run.
It also allows several backup clients to avoid uploading the same data.
If, for example, everyone in the office has a copy of the same sales brochures,
only one copy needs to be stored in the backup repository.
.PP
Every backup run is a
.IR generation .
In addition,
.B obnam
will make
.I checkpoint
generations every now and then.
These are exactly like normal generations,
but are not guaranteed to be a complete snapshot of the live data.
If the backup run needs to be aborted in the middle,
the next backup run can continue from the latest checkpoint,
avoiding the need to start completely over.
.PP
If one backup run drops a backup root directory,
the older generations will still keep it:
nothing changes in the old generations just because there is a new one.
If the root was dropped by mistake,
it can be added back and the next backup run will re-use the existing
data in the backup repository,
and will only back up the file metadata (filenames, permissions, etc).
.SS "Verifying backups"
What good is a backup system you cannot rely on?
How can you rely on something you cannot test?
The
.B "obnam verify"
command checks that data in the backup repository matches actual user data.
It retrieves one or more files from the repository and compares them to
the user data.
This is essentialy the same as doing a restore,
then comparing restored files with the original files using
.BR cmp (1),
but easier to use.
.PP
By default verification happens on all files.
You can also specify the files to be verified by listing them on the
command line.
You should specify the full paths to the files,
not relative to the current directory.
.PP
The output lists files that fail verification for some reason.
If you verify everything, it is likely that some files (e.g.,
parent directories of backup root) may have changed without it
being a problem.
Note that you will need to specify the whole path to the files
or directories to be verified, not relative to the backup root.
You still need to specify at least one of the backup roots
on the command line or via the
.B \-\-root
option so that obnam will find the filesystem, in case it is
a remote one.
.SS "URL syntax"
Whenever obnam accepts a URL, it can be either a local pathname,
or an
.B sftp
URL.
An sftp URL has the following form:
.IP
\fBsftp://\fR[\fIuser\fR@]\fIdomain\fR[:\fIport\fR]\fB/path
.PP
where
.I domain
is a normal Internet domain name for a server,
.I user
is your username on that server,
.I port
is an optional port number,
and
.I path
is a pathname on the server side.
Like
.BR bzr (1),
but unlike the sftp URL standard,
the pathname is absolute,
unless it starts with
.B /~/
in which case it is relative to the user's home directory on the server.
.PP
See the EXAMPLE section for examples of URLs.
.PP
You can use
.B sftp
URLs for the repository, or the live data (root),
but note that due to limitations in the protocol,
and its implementation in the
.B paramiko
library,
some things will not work very well for accessing live data over
.BR sftp .
Most importantly,
the handling of of hardlinks is rather suboptimal.
For live data access, you should not end the URL with
.B /~/
and should append a dot at the end in this special case.
.SS "Generation specifications"
When not using the latest generation,
you will need to specify which one you need.
This will be done with the
.B \-\-generation
option,
which takes a generation specification as its argument.
The specification is either the word
.IR latest ,
meaning the latest generation (also the default),
or a number.
See the
.B generations
command to see what generations are available,
and what their numbers are.
.SS "Policy for keeping and removing backup generations"
The
.B forget
command can follow a policy to automatically keep some and remove
other backup generations.
The policy is set with the
.BR \-\-keep =\fIPOLICY
option.
.PP
.I POLICY
is comma-separated list of rules.
Each rule consists of a count and a time period.
The time periods are
.BR h ,
.BR d ,
.BR w ,
.BR m ,
and
.BR y ,
for hour, day, week, month, and year.
.PP
A policy of
.I 30d
means to keep the latest backup for each day when a backup was made,
and keep the last 30 such backups.
Any backup matched by any policy rule is kept,
and any backups in between will be removed,
as will any backups older than the oldest kept backup.
.PP
As an example, assume backups are taken every hour, on the hour:
at 00:00, 01:00, 02:00, and so on, until 23:00.
If the
.B forget
command is run at 23:15, with the above policy,
it will keep the backup taken at 23:00 on each day,
and remove every other backup that day.
It will also remove backups older than 30 days.
.PP
If backups are made every other day, at noon,
.B forget
would keep the 30 last backups,
or 60 days worth of backups,
with the above policy.
.PP
Note that obnam will only inspect timestamps in the backup repository,
and does not care what the actual current time is.
This means that if you stop making new backups,
the existing ones won't be removed automatically.
In essence, obnam pretends the current time is just after the
latest backup when
.B forget
is run.
.PP
The rules can be given in any order,
but will be sorted to ascending order of time period before applied.
(It is an error to give two rules for the same period.)
A backup generation is kept if it matches any rule.
.PP
For example, assume the same backup frequence as above,
but a policy of
.IR 30d,52w .
This will keep the newest daily backup for each day for thirty days,
.I and
the newest weekly backup for 52 weeks.
Because the hourly backups will be removed daily,
before they have a chance to get saved by a weekly rule,
the effect is that the 23:00 o'clock backup for each day is
saved for a month,
and the 23:00 backup on Sundays is saved for a year.
.PP
If, instead, you use a policy of
.IR 72h,30d,52w ,
.B obnam
would keep the last 72 hourly backups,
and the last backup of each calendar day for 30 days,
and the last backup of each calendar week for 52 weeks.
If the backup frequency was once per day,
.B obnam
would keep the backup of each calendar hour for which a backup was made,
for 72 such backups.
In other words, it would effectively keep the the last 72 daily backups.
.PP
Sound confusing?
Just wonder how confused the developer was when writing the code.
.PP
If no policy is given,
.B forget
will keep everything.
.PP
A typical policy might be
.IR 72h,7d,5w,12m ,
which means: keep
the last 72 hourly backups,
the last 7 daily backups,
the last 5 weekly backups and
the last 12 monthly backups.
If the backups are systematically run on an hourly basis,
this will mean keeping
hourly backups for three days,
daily backups for a week,
weekly backups for a month,
and monthly backups for a year.
.PP
The way the policy works is a bit complicated.
Run
.B forget
with the
.B \-\-pretend
option to make sure you're removing the right ones.
.\"
.SS "Using encryption"
.B obnam
can encrypt all the data it writes to the backup repository.
It uses
.BR gpg (1)
to do the encryption.
You need to create a key pair using
.B "gpg --gen-key"
(or use an existing one),
and then tell
.B obnam
about it using the
.B \-\-encrypt\-with
option.
.SS "Configuration files"
.B obnam
will look for configuration files in a number of locations.
See the FILES section for a list.
All files are treated as if they were one with the contents of all
files catenated.
.PP
The files are in INI format,
and only the
.I [config]
section is used
(any other sections are ignored).
.PP
The long names of options are used as keys for configuration
variables.
Any setting that can be set from the command line can be set in a configuration
file, in the
.I [config]
section.
.PP
For example, the options in the following command line:
.sp 1
.RS
obnam --repository=/backup --exclude='\.wav$' backup
.RE
.sp 1
could be replaced with the following configuration file:
.sp 1
.nf
.RS
[config]
repository: /backup
exclude: \.wav$
.RE
.fi
.sp 1
(You can use either
.I foo=value
or
.I foo: value
syntax in the files.)
.PP
The only unusual thing about the files is the way options that can be
used many times are expressed.
All values are put in a single logical line,
separated by commas
(and optionally spaces as well).
For example:
.sp 1
.RS
.nf
[config]
exclude = foo, bar, \\.mp3$
.fi
.RE
.sp 1
The above has three values for the
.B exclude
option:
any files that contain the words
.I foo
or
.I bar
anywhere in the fully qualified pathname,
or files with names ending with a period and
.I mp3
(because the exclusions are regular expressions).
.PP
A long logical line can be broken into several physical ones,
by starting a new line at white space,
and indenting the continuation lines:
.sp 1
.RS
.nf
[config]
exclude = foo,
bar,
\\.mp3$
.fi
.RE
.sp 1
The above example adds three exclusion patterns.
.SS "Multiple clients and locking"
.B obnam
supports sharing a repository between multiple clients.
The clients can share the file contents (chunks),
so that if client A backs up a large file,
and client B has the same file,
then B does not need to upload the large file to the repository a second time.
For this to work without confusion,
the clients use a simple locking protocol that allows only one client
at a time to modify the shared data structures.
Locks do not prevent read-only access at the same time:
this allows you to restore while someone else is backing up.
.PP
Sometimes a read-only operation happens to access a data structure at the
same time as it is being modified.
This can result in a crash.
It will not result in corrupt data,
or incorrect restores.
However, you may need to restart the read-only operation after a crash.
.SS "Repository format conversion"
The
.B convert5to6
subcommand converts a repository of format 5 to format 6.
It is somewhat dangerous!
It modifies the repository in place,
so you should be careful.
You should do a hardlink copy of the repository before converting:
.IP
cp -al repo repo.format5
.PP
You should also run this with local filesystem access to the repository,
rather than sftp,
to avoid abysmal performance.
.\"---------------------------------------------------------------------
.SH OPTIONS
.SS "Option values"
The
.I SIZE
value in options mentioned above specifies a size in bytes,
with optional suffixes to indicate kilobytes (k), kibibytes (Ki),
megabytes (M), mebibyts (Mi), gigabytes (G), gibibytes (Gi),
terabytes (T), tibibytes (Ti).
The suffixes are case-insensitive.
.\" ------------------------------------------------------------------
.SH "EXIT STATUS"
.B obnam
will exit with zero if everything went well,
and non-zero otherwise.
.SH ENVIRONMENT
.B obnam
will pass on the environment it gets from its parent,
without modification.
It does not obey any unusual environment variables,
but it does obey the usual ones when running external programs,
creating temporary files, etc.
.SH FILES
.I /etc/obnam.conf
.br
.I /etc/obnam/*.conf
.br
.I ~/.obnam.conf
.br
.I ~/.config/obnam/*.conf
.RS
Configuration files for
.BR obnam .
It is not an error for any or all of the files to not exist.
.RE
.SH EXAMPLE
To back up your home directory to a server:
.IP
.nf
obnam backup \-\-repository sftp://your.server/~/backups $HOME
.PP
To restore your latest backup from the server:
.IP
.nf
obnam restore \-\-repository sftp://your.server/~/backups \\
\-\-to /var/tmp/my.home.dir
.PP
To restore just one file or directory:
.IP
.nf
obnam restore \-\-repository sftp://your.server/~/backups \\
\-\-to /var/tmp/my.home.dir $HOME/myfile.txt
.fi
.PP
Alternatively, mount the backup repository using the FUSE filesystem
(note that the
.B \-\-to
option is necessary and that the
.B \-\-viewmode
option is usually a good idea):
.IP
.nf
mkdir my-repo
obnam restore \-\-repository sftp://your.server/~/backups \\
\-\-to my-repo \-\-viewmode multiple
cp my-repo/latest/$HOME/myfile.txt
fusermount -u my-repo
.PP
To check that the backup worked:
.IP
.nf
obnam verify \-\-repository sftp://your.server/~/backups \\
/path/to/file
.PP
To remove old backups, keeping the newest backup for each day for
ten years:
.IP
.nf
obnam forget \-\-repository sftp://your.server/~/backups \\
\-\-keep 3650d
.PP
To verify that the backup repository is OK:
.IP
.nf
obnam fsck \-\-repository sftp://your.server/~/backups
.fi
.PP
To view the backed up files in the backup repository using FUSE:
.IP
.nf
obnam mount \-\-to my-fuse \-\-viewmode multiple
ls -lh my-fuse
fusermount -u my-fuse
.fi
.SH "SEE ALSO"
.TP
.BR cliapp (5)
obnam-1.6.1/obnamlib/ 0000755 0001750 0001750 00000000000 12246357067 014276 5 ustar jenkins jenkins obnam-1.6.1/obnamlib/__init__.py 0000644 0001750 0001750 00000010136 12246357067 016410 0 ustar jenkins jenkins # Copyright (C) 2009-2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import cliapp
__version__ = '1.6.1'
# Import _obnam if it is there. We need to be able to do things without
# it, especially at build time, while we're generating manual pages.
# If _obnam is not there, substitute a dummy that throws an exception
# if used.
class DummyExtension(object):
def __getattr__(self, name):
raise Exception('Trying to use _obnam, but that was not found.')
try:
import _obnam
except ImportError:
_obnam = DummyExtension()
from pluginmgr import PluginManager
class Error(cliapp.AppException):
pass
class EncryptionError(Error):
pass
DEFAULT_NODE_SIZE = 256 * 1024 # benchmarked on 2011-09-01
DEFAULT_CHUNK_SIZE = 1024 * 1024 # benchmarked on 2011-09-01
DEFAULT_UPLOAD_QUEUE_SIZE = 128
DEFAULT_LRU_SIZE = 256
DEFAULT_CHUNKIDS_PER_GROUP = 1024
DEFAULT_NAGIOS_WARN_AGE = '27h'
DEFAULT_NAGIOS_CRIT_AGE = '8d'
# The following values have been determined empirically on a laptop
# with an encrypted ext4 filesystem. Other values might be better for
# other situations.
IDPATH_DEPTH = 3
IDPATH_BITS = 12
IDPATH_SKIP = 13
# Maximum identifier for clients, chunks, files, etc. This is the largest
# unsigned 64-bit value. In various places we assume 64-bit field sizes
# for on-disk data structures.
MAX_ID = 2**64 - 1
option_group = {
'perf': 'Performance tweaking',
'devel': 'Development of Obnam itself',
}
from sizeparse import SizeSyntaxError, UnitNameError, ByteSizeParser
from encryption import (generate_symmetric_key,
encrypt_symmetric,
decrypt_symmetric,
get_public_key,
get_public_key_user_ids,
Keyring,
SecretKeyring,
encrypt_with_keyring,
decrypt_with_secret_keys,
SymmetricKeyCache)
from hooks import Hook, MissingFilterError, FilterHook, HookManager
from pluginbase import ObnamPlugin
from vfs import VirtualFileSystem, VfsFactory, VfsTests
from vfs_local import LocalFS
from metadata import (read_metadata, set_metadata, Metadata, metadata_fields,
metadata_verify_fields, encode_metadata, decode_metadata)
from repo_interface import (
RepositoryInterface,
RepositoryInterfaceTests,
RepositoryClientAlreadyExists,
RepositoryClientDoesNotExist,
RepositoryClientListNotLocked,
RepositoryClientListLockingFailed,
RepositoryClientLockingFailed,
RepositoryClientNotLocked,
RepositoryClientKeyNotAllowed,
RepositoryClientGenerationUnfinished,
RepositoryGenerationKeyNotAllowed,
RepositoryGenerationDoesNotExist,
RepositoryClientHasNoGenerations,
RepositoryFileDoesNotExistInGeneration,
RepositoryFileKeyNotAllowed,
RepositoryChunkDoesNotExist,
RepositoryChunkContentNotInIndexes,
RepositoryChunkIndexesNotLocked,
RepositoryChunkIndexesLockingFailed,
REPO_CLIENT_TEST_KEY,
REPO_GENERATION_TEST_KEY,
REPO_FILE_TEST_KEY,
REPO_FILE_MTIME,
REPO_FILE_INTEGER_KEYS)
from repo_dummy import RepositoryFormatDummy
from repo_fmt_6 import RepositoryFormat6
from repo_tree import RepositoryTree
from chunklist import ChunkList
from clientlist import ClientList
from checksumtree import ChecksumTree
from clientmetadatatree import ClientMetadataTree
from lockmgr import LockManager
from repo import Repository, LockFail, BadFormat
from forget_policy import ForgetPolicy
from app import App
__all__ = locals()
obnam-1.6.1/obnamlib/app.py 0000644 0001750 0001750 00000021526 12246357067 015436 0 ustar jenkins jenkins # Copyright (C) 2009, 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import cliapp
import larch
import logging
import os
import socket
import StringIO
import sys
import time
import tracing
import ttystatus
import obnamlib
class App(cliapp.Application):
'''Main program for backup program.'''
def add_settings(self):
devel_group = obnamlib.option_group['devel']
perf_group = obnamlib.option_group['perf']
self.settings.string(['repository', 'r'], 'name of backup repository')
self.settings.string(['client-name'], 'name of client (%default)',
default=self.deduce_client_name())
self.settings.bytesize(['node-size'],
'size of B-tree nodes on disk; only affects new '
'B-trees so you may need to delete a client '
'or repository to change this for existing '
'repositories '
'(default: %default)',
default=obnamlib.DEFAULT_NODE_SIZE,
group=perf_group)
self.settings.bytesize(['chunk-size'],
'size of chunks of file data backed up '
'(default: %default)',
default=obnamlib.DEFAULT_CHUNK_SIZE,
group=perf_group)
self.settings.bytesize(['upload-queue-size'],
'length of upload queue for B-tree nodes '
'(default: %default)',
default=obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
group=perf_group)
self.settings.bytesize(['lru-size'],
'size of LRU cache for B-tree nodes '
'(default: %default)',
default=obnamlib.DEFAULT_LRU_SIZE,
group=perf_group)
self.settings.string_list(['trace'],
'add to filename patters for which trace '
'debugging logging happens')
self.settings.integer(['idpath-depth'],
'depth of chunk id mapping',
default=obnamlib.IDPATH_DEPTH,
group=perf_group)
self.settings.integer(['idpath-bits'],
'chunk id level size',
default=obnamlib.IDPATH_BITS,
group=perf_group)
self.settings.integer(['idpath-skip'],
'chunk id mapping lowest bits skip',
default=obnamlib.IDPATH_SKIP,
group=perf_group)
self.settings.boolean(['quiet'], 'be silent')
self.settings.boolean(['verbose'], 'be verbose')
self.settings.boolean(['pretend', 'dry-run', 'no-act'],
'do not actually change anything (works with '
'backup, forget and restore only, and may only '
'simulate approximately real behavior)')
self.settings.string(['pretend-time'],
'pretend it is TIMESTAMP (YYYY-MM-DD HH:MM:SS); '
'this is only useful for testing purposes',
metavar='TIMESTAMP',
group=devel_group)
self.settings.integer(['lock-timeout'],
'when locking in the backup repository, '
'wait TIMEOUT seconds for an existing lock '
'to go away before giving up',
metavar='TIMEOUT',
default=60)
self.settings.integer(['crash-limit'],
'artificially crash the program after COUNTER '
'files written to the repository; this is '
'useful for crash testing the application, '
'and should not be enabled for real use; '
'set to 0 to disable (disabled by default)',
metavar='COUNTER',
group=devel_group)
# The following needs to be done here, because it needs
# to be done before option processing. This is a bit ugly,
# but the best we can do with the current cliapp structure.
# Possibly cliapp will provide a better hook for us to use
# later on, but this is reality now.
self.setup_ttystatus()
self.pm = obnamlib.PluginManager()
self.pm.locations = [self.plugins_dir()]
self.pm.plugin_arguments = (self,)
self.setup_hooks()
self.fsf = obnamlib.VfsFactory()
self.pm.load_plugins()
self.pm.enable_plugins()
self.hooks.call('plugins-loaded')
self.settings['log-level'] = 'info'
def deduce_client_name(self):
return socket.gethostname()
def setup_hooks(self):
self.hooks = obnamlib.HookManager()
self.hooks.new('plugins-loaded')
self.hooks.new('config-loaded')
self.hooks.new('shutdown')
# The Repository class defines some hooks, but the class
# won't be instantiated until much after plugins are enabled,
# and since all hooks must be defined when plugins are enabled,
# we create one instance here, which will immediately be destroyed.
# FIXME: This is fugly.
obnamlib.Repository(None, 1000, 1000, 100, self.hooks, 10, 10, 10,
self.time, 0, '')
def plugins_dir(self):
return os.path.join(os.path.dirname(obnamlib.__file__), 'plugins')
def setup_logging(self):
log = self.settings['log']
if log and log != 'syslog' and not os.path.exists(log):
fd = os.open(log, os.O_WRONLY | os.O_CREAT, 0600)
os.close(fd)
cliapp.Application.setup_logging(self)
def process_args(self, args):
try:
if self.settings['quiet']:
self.ts.disable()
for pattern in self.settings['trace']:
tracing.trace_add_pattern(pattern)
self.hooks.call('config-loaded')
cliapp.Application.process_args(self, args)
self.hooks.call('shutdown')
except larch.Error, e:
logging.critical(str(e))
sys.stderr.write('ERROR: %s\n' % str(e))
sys.exit(1)
def setup_ttystatus(self):
self.ts = ttystatus.TerminalStatus(period=0.1)
if self.settings['quiet']:
self.ts.disable()
def open_repository(self, create=False, repofs=None): # pragma: no cover
logging.debug('opening repository (create=%s)' % create)
tracing.trace('repofs=%s' % repr(repofs))
repopath = self.settings['repository']
if repofs is None:
repofs = self.fsf.new(repopath, create=create)
if self.settings['crash-limit'] > 0:
repofs.crash_limit = self.settings['crash-limit']
repofs.connect()
else:
repofs.reinit(repopath)
return obnamlib.Repository(repofs,
self.settings['node-size'],
self.settings['upload-queue-size'],
self.settings['lru-size'],
self.hooks,
self.settings['idpath-depth'],
self.settings['idpath-bits'],
self.settings['idpath-skip'],
self.time,
self.settings['lock-timeout'],
self.settings['client-name'])
def time(self):
'''Return current time in seconds since epoch.
This is a wrapper around time.time() so that it can be overridden
with the --pretend-time setting.
'''
s = self.settings['pretend-time']
if s:
t = time.strptime(s, '%Y-%m-%d %H:%M:%S')
return time.mktime(t)
else:
return time.time()
obnam-1.6.1/obnamlib/checksumtree.py 0000644 0001750 0001750 00000005737 12246357067 017346 0 ustar jenkins jenkins # Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import struct
import tracing
import obnamlib
class ChecksumTree(obnamlib.RepositoryTree):
'''Repository map of checksum to integer id.
The checksum might be, for example, an MD5 one (as returned by
hashlib.md5().digest()). The id would be a chunk id.
'''
def __init__(self, fs, name, checksum_length, node_size,
upload_queue_size, lru_size, hooks):
tracing.trace('new ChecksumTree name=%s' % name)
self.fmt = '!%dsQQ' % checksum_length
key_bytes = struct.calcsize(self.fmt)
obnamlib.RepositoryTree.__init__(self, fs, name, key_bytes, node_size,
upload_queue_size, lru_size, hooks)
self.keep_just_one_tree = True
def key(self, checksum, chunk_id, client_id):
return struct.pack(self.fmt, checksum, chunk_id, client_id)
def unkey(self, key):
return struct.unpack(self.fmt, key)
def add(self, checksum, chunk_id, client_id):
tracing.trace('checksum=%s', repr(checksum))
tracing.trace('chunk_id=%s', chunk_id)
tracing.trace('client_id=%s', client_id)
self.start_changes()
key = self.key(checksum, chunk_id, client_id)
self.tree.insert(key, '')
def find(self, checksum):
if self.init_forest() and self.forest.trees:
minkey = self.key(checksum, 0, 0)
maxkey = self.key(checksum, obnamlib.MAX_ID, obnamlib.MAX_ID)
t = self.forest.trees[-1]
pairs = t.lookup_range(minkey, maxkey)
return [self.unkey(key)[1] for key, value in pairs]
else:
return []
def remove(self, checksum, chunk_id, client_id):
tracing.trace('checksum=%s', repr(checksum))
tracing.trace('chunk_id=%s', chunk_id)
tracing.trace('client_id=%s', client_id)
self.start_changes()
key = self.key(checksum, chunk_id, client_id)
self.tree.remove_range(key, key)
def chunk_is_used(self, checksum, chunk_id):
'''Is a given chunk used by anyone?'''
if self.init_forest() and self.forest.trees:
minkey = self.key(checksum, chunk_id, 0)
maxkey = self.key(checksum, chunk_id, obnamlib.MAX_ID)
t = self.forest.trees[-1]
return not t.range_is_empty(minkey, maxkey)
else:
return False
obnam-1.6.1/obnamlib/checksumtree_tests.py 0000644 0001750 0001750 00000005173 12246357067 020562 0 ustar jenkins jenkins # Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import hashlib
import shutil
import tempfile
import unittest
import obnamlib
class ChecksumTreeTests(unittest.TestCase):
def setUp(self):
self.tempdir = tempfile.mkdtemp()
fs = obnamlib.LocalFS(self.tempdir)
self.hooks = obnamlib.HookManager()
self.hooks.new('repository-toplevel-init')
self.checksum = hashlib.md5('foo').digest()
self.tree = obnamlib.ChecksumTree(fs, 'x', len(self.checksum),
obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
obnamlib.DEFAULT_LRU_SIZE, self)
def tearDown(self):
self.tree.commit()
shutil.rmtree(self.tempdir)
def test_is_empty_initially(self):
self.assertEqual(self.tree.find(self.checksum), [])
def test_finds_checksums(self):
self.tree.add(self.checksum, 1, 3)
self.tree.add(self.checksum, 2, 4)
self.assertEqual(sorted(self.tree.find(self.checksum)), [1, 2])
def test_finds_only_the_right_checksums(self):
self.tree.add(self.checksum, 1, 2)
self.tree.add(self.checksum, 3, 4)
self.tree.add(hashlib.md5('bar').digest(), 5, 6)
self.assertEqual(sorted(self.tree.find(self.checksum)), [1, 3])
def test_removes_checksum(self):
self.tree.add(self.checksum, 1, 3)
self.tree.add(self.checksum, 2, 4)
self.tree.remove(self.checksum, 2, 4)
self.assertEqual(self.tree.find(self.checksum), [1])
def test_adds_same_id_only_once(self):
self.tree.add(self.checksum, 1, 2)
self.tree.add(self.checksum, 1, 2)
self.assertEqual(self.tree.find(self.checksum), [1])
def test_unknown_chunk_is_not_used(self):
self.assertFalse(self.tree.chunk_is_used(self.checksum, 0))
def test_known_chunk_is_used(self):
self.tree.add(self.checksum, 0, 1)
self.assertTrue(self.tree.chunk_is_used(self.checksum, 0))
obnam-1.6.1/obnamlib/chunklist.py 0000644 0001750 0001750 00000004123 12246357067 016654 0 ustar jenkins jenkins # Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import hashlib
import struct
import random
import tracing
import obnamlib
class ChunkList(obnamlib.RepositoryTree):
'''Repository's list of chunks.
The list maps a chunk id to its checksum.
The list is implemented as a B-tree, with the 64-bit chunk id as the
key, and the checksum as the value.
'''
def __init__(self, fs, node_size, upload_queue_size, lru_size, hooks):
tracing.trace('new ChunkList')
self.fmt = '!Q'
self.key_bytes = struct.calcsize(self.fmt)
obnamlib.RepositoryTree.__init__(self, fs, 'chunklist', self.key_bytes,
node_size, upload_queue_size,
lru_size, hooks)
self.keep_just_one_tree = True
def key(self, chunk_id):
return struct.pack(self.fmt, chunk_id)
def add(self, chunk_id, checksum):
tracing.trace('chunk_id=%s', chunk_id)
tracing.trace('checksum=%s', repr(checksum))
self.start_changes()
self.tree.insert(self.key(chunk_id), checksum)
def get_checksum(self, chunk_id):
if self.init_forest() and self.forest.trees:
t = self.forest.trees[-1]
return t.lookup(self.key(chunk_id))
raise KeyError(chunk_id)
def remove(self, chunk_id):
tracing.trace('chunk_id=%s', chunk_id)
self.start_changes()
key = self.key(chunk_id)
self.tree.remove_range(key, key)
obnam-1.6.1/obnamlib/chunklist_tests.py 0000644 0001750 0001750 00000003544 12246357067 020104 0 ustar jenkins jenkins # Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import shutil
import tempfile
import unittest
import obnamlib
class ChunkListTests(unittest.TestCase):
def setUp(self):
self.tempdir = tempfile.mkdtemp()
fs = obnamlib.LocalFS(self.tempdir)
self.hooks = obnamlib.HookManager()
self.hooks.new('repository-toplevel-init')
self.list = obnamlib.ChunkList(fs,
obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
obnamlib.DEFAULT_LRU_SIZE, self)
def tearDown(self):
shutil.rmtree(self.tempdir)
def test_raises_keyerror_for_missing_chunk(self):
self.assertRaises(KeyError, self.list.get_checksum, 0)
def test_adds_chunk(self):
self.list.add(0, 'checksum')
self.assertEqual(self.list.get_checksum(0), 'checksum')
def test_adds_second_chunk(self):
self.list.add(0, 'checksum')
self.list.add(1, 'checksum1')
self.assertEqual(self.list.get_checksum(1), 'checksum1')
def test_removes_chunk(self):
self.list.add(0, 'checksum')
self.list.remove(0)
self.assertRaises(KeyError, self.list.get_checksum, 0)
obnam-1.6.1/obnamlib/clientlist.py 0000644 0001750 0001750 00000012302 12246357067 017020 0 ustar jenkins jenkins # Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import hashlib
import logging
import struct
import random
import tracing
import obnamlib
class ClientList(obnamlib.RepositoryTree):
'''Repository's list of clients.
The list maps a client name to an arbitrary (string) identifier,
which is unique within the repository.
The list is implemented as a B-tree, with a three-part key:
128-bit MD5 of client name, 64-bit unique identifier, and subkey
identifier. The value depends on the subkey: it's either the
client's full name, or the public key identifier the client
uses to encrypt their backups.
The client's identifier is a random, unique 64-bit integer.
'''
# subkey values
CLIENT_NAME = 0
KEYID = 1
SUBKEY_MAX = 255
def __init__(self, fs, node_size, upload_queue_size, lru_size, hooks):
tracing.trace('new ClientList')
self.hash_len = len(self.hashfunc(''))
self.fmt = '!%dsQB' % self.hash_len
self.key_bytes = struct.calcsize(self.fmt)
self.minkey = self.hashkey('\x00' * self.hash_len, 0, 0)
self.maxkey = self.hashkey('\xff' * self.hash_len, obnamlib.MAX_ID,
self.SUBKEY_MAX)
obnamlib.RepositoryTree.__init__(self, fs, 'clientlist',
self.key_bytes, node_size,
upload_queue_size, lru_size, hooks)
self.keep_just_one_tree = True
def hashfunc(self, string):
return hashlib.new('md5', string).digest()
def hashkey(self, namehash, client_id, subkey):
return struct.pack(self.fmt, namehash, client_id, subkey)
def key(self, client_name, client_id, subkey):
h = self.hashfunc(client_name)
return self.hashkey(h, client_id, subkey)
def unkey(self, key):
return struct.unpack(self.fmt, key)
def random_id(self):
return random.randint(0, obnamlib.MAX_ID)
def list_clients(self):
if self.init_forest() and self.forest.trees:
t = self.forest.trees[-1]
return [v
for k, v in t.lookup_range(self.minkey, self.maxkey)
if self.unkey(k)[2] == self.CLIENT_NAME]
else:
return []
def find_client_id(self, t, client_name):
minkey = self.key(client_name, 0, 0)
maxkey = self.key(client_name, obnamlib.MAX_ID, self.SUBKEY_MAX)
for k, v in t.lookup_range(minkey, maxkey):
checksum, client_id, subkey = self.unkey(k)
if subkey == self.CLIENT_NAME and v == client_name:
return client_id
return None
def get_client_id(self, client_name):
if not self.init_forest() or not self.forest.trees:
return None
t = self.forest.trees[-1]
return self.find_client_id(t, client_name)
def add_client(self, client_name):
logging.info('Adding client %s' % client_name)
self.start_changes()
if self.find_client_id(self.tree, client_name) is None:
while True:
candidate_id = self.random_id()
key = self.key(client_name, candidate_id, self.CLIENT_NAME)
try:
self.tree.lookup(key)
except KeyError:
break
key = self.key(client_name, candidate_id, self.CLIENT_NAME)
self.tree.insert(key, client_name)
logging.debug('Client %s has id %s' % (client_name, candidate_id))
def remove_client(self, client_name):
logging.info('Removing client %s' % client_name)
self.start_changes()
client_id = self.find_client_id(self.tree, client_name)
if client_id is not None:
key = self.key(client_name, client_id, self.CLIENT_NAME)
self.tree.remove(key)
def get_client_keyid(self, client_name):
if self.init_forest() and self.forest.trees:
t = self.forest.trees[-1]
client_id = self.find_client_id(t, client_name)
if client_id is not None:
key = self.key(client_name, client_id, self.KEYID)
for k, v in t.lookup_range(key, key):
return v
return None
def set_client_keyid(self, client_name, keyid):
logging.info('Setting client %s to use key %s' % (client_name, keyid))
self.start_changes()
client_id = self.find_client_id(self.tree, client_name)
key = self.key(client_name, client_id, self.KEYID)
if keyid is None:
self.tree.remove_range(key, key)
else:
self.tree.insert(key, keyid)
obnam-1.6.1/obnamlib/clientlist_tests.py 0000644 0001750 0001750 00000007777 12246357067 020266 0 ustar jenkins jenkins # Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import shutil
import tempfile
import unittest
import obnamlib
class ClientListTests(unittest.TestCase):
def setUp(self):
self.tempdir = tempfile.mkdtemp()
fs = obnamlib.LocalFS(self.tempdir)
self.hooks = obnamlib.HookManager()
self.hooks.new('repository-toplevel-init')
self.list = obnamlib.ClientList(fs,
obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
obnamlib.DEFAULT_LRU_SIZE, self)
def tearDown(self):
shutil.rmtree(self.tempdir)
def test_key_bytes_is_correct_length(self):
self.assertEqual(self.list.key_bytes,
len(self.list.key('foo', 12765, 0)))
def test_unkey_unpacks_key_correctly(self):
key = self.list.key('client name', 12765, 42)
client_hash, client_id, subkey = self.list.unkey(key)
self.assertEqual(client_id, 12765)
self.assertEqual(subkey, 42)
def test_reports_none_as_id_for_nonexistent_client(self):
self.assertEqual(self.list.get_client_id('foo'), None)
def test_lists_no_clients_when_tree_does_not_exist(self):
self.assertEqual(self.list.list_clients(), [])
def test_added_client_has_integer_id(self):
self.list.add_client('foo')
self.assert_(type(self.list.get_client_id('foo')) in [int, long])
def test_added_client_is_listed(self):
self.list.add_client('foo')
self.list.set_client_keyid('foo', 'cafebeef')
self.assertEqual(self.list.list_clients(), ['foo'])
def test_removed_client_has_none_id(self):
self.list.add_client('foo')
self.list.remove_client('foo')
self.assertEqual(self.list.get_client_id('foo'), None)
def test_removed_client_has_no_keys(self):
self.list.add_client('foo')
client_id = self.list.get_client_id('foo')
self.list.remove_client('foo')
minkey = self.list.key('foo', client_id, 0)
maxkey = self.list.key('foo', client_id, self.list.SUBKEY_MAX)
pairs = list(self.list.tree.lookup_range(minkey, maxkey))
self.assertEqual(pairs, [])
def test_twice_added_client_exists_only_once(self):
self.list.add_client('foo')
self.list.add_client('foo')
self.assertEqual(self.list.list_clients(), ['foo'])
def test_adding_handles_hash_collision(self):
def bad_hash(string):
return '0' * 16
self.list.hashfunc = bad_hash
self.list.add_client('foo')
self.list.add_client('bar')
self.assertEqual(sorted(self.list.list_clients()), ['bar', 'foo'])
self.assertNotEqual(self.list.get_client_id('bar'),
self.list.get_client_id('foo'))
def test_client_has_no_public_key_initially(self):
self.list.add_client('foo')
self.assertEqual(self.list.get_client_keyid('foo'), None)
def test_sets_client_keyid(self):
self.list.add_client('foo')
self.list.set_client_keyid('foo', 'cafebeef')
self.assertEqual(self.list.get_client_keyid('foo'), 'cafebeef')
def test_remove_client_keyid(self):
self.list.add_client('foo')
self.list.set_client_keyid('foo', 'cafebeef')
self.list.set_client_keyid('foo', None)
self.assertEqual(self.list.get_client_keyid('foo'), None)
obnam-1.6.1/obnamlib/clientmetadatatree.py 0000644 0001750 0001750 00000050141 12246357067 020510 0 ustar jenkins jenkins # Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import hashlib
import logging
import os
import random
import struct
import tracing
import obnamlib
class ClientMetadataTree(obnamlib.RepositoryTree):
'''Store per-client metadata about files.
Actual file contents is stored elsewhere, this stores just the
metadata about files: names, inode info, and what chunks of
data they use.
See http://braawi.org/obnam/ondisk/ for a description of how
this works.
'''
# Filesystem metadata.
PREFIX_FS_META = 0 # prefix
FILE_NAME = 0 # subkey type for storing pathnames
FILE_CHUNKS = 1 # subkey type for list of chunks
FILE_METADATA = 3 # subkey type for inode fields, etc
DIR_CONTENTS = 4 # subkey type for list of directory contents
FILE_DATA = 5 # subkey type for file data (instead of chunk)
FILE_METADATA_ENCODED = 0 # subkey value for encoded obnamlib.Metadata().
# References to chunks in this generation.
# Main key is the chunk id, subkey type is always 0, subkey is file id
# for file that uses the chunk.
PREFIX_CHUNK_REF = 1
# Metadata about the generation. The main key is always the hash of
# 'generation', subkey type field is always 0.
PREFIX_GEN_META = 2 # prefix
GEN_ID = 0 # subkey type for generation id
GEN_STARTED = 1 # subkey type for when generation was started
GEN_ENDED = 2 # subkey type for when generation was ended
GEN_IS_CHECKPOINT = 3 # subkey type for whether generation is checkpoint
GEN_FILE_COUNT = 4 # subkey type for count of files+dirs in generation
GEN_TOTAL_DATA = 5 # subkey type for sum of all file sizes in gen
# Maximum values for the subkey type field, and the subkey field.
# Both have a minimum value of 0.
TYPE_MAX = 255
SUBKEY_MAX = struct.pack('!Q', obnamlib.MAX_ID)
def __init__(self, fs, client_dir, node_size, upload_queue_size, lru_size,
repo):
tracing.trace('new ClientMetadataTree, client_dir=%s' % client_dir)
self.current_time = repo.current_time
key_bytes = len(self.hashkey(0, self.default_file_id(''), 0, 0))
obnamlib.RepositoryTree.__init__(self, fs, client_dir, key_bytes,
node_size, upload_queue_size,
lru_size, repo)
self.genhash = self.default_file_id('generation')
self.chunkids_per_key = max(1,
int(node_size / 4 / struct.calcsize('Q')))
self.init_caches()
def init_caches(self):
self.known_generations = {}
self.file_ids = {}
def default_file_id(self, filename):
'''Return hash of filename suitable for use as main key.'''
tracing.trace(repr(filename))
def hash(s):
return hashlib.md5(s).digest()[:4]
dirname = os.path.dirname(filename)
basename = os.path.basename(filename)
return hash(dirname) + hash(basename)
def _bad_default_file_id(self, filename):
'''For use by unit tests.'''
return struct.pack('!Q', 0)
def hashkey(self, prefix, mainhash, subtype, subkey):
'''Compute a full key.
The full key consists of three parts:
* prefix (0 for filesystem metadata, 1 for chunk refs)
* a hash of mainkey (64 bits)
* the subkey type (8 bits)
* type subkey (64 bits)
These are catenated.
mainhash must be a string of 8 bytes.
subtype must be an integer in the range 0.255, inclusive.
subkey must be either a string or an integer. If it is a string,
it will be padded with NUL bytes at the end, if it is less than
8 bytes, and truncated, if longer. If it is an integer, it will
be converted as a string, and the value must fit into 64 bits.
'''
if type(subkey) == str:
subkey = (subkey + '\0' * 8)[:8]
fmt = '!B8sB8s'
else:
assert type(subkey) in [int, long]
fmt = '!B8sBQ'
return struct.pack(fmt, prefix, mainhash, subtype, subkey)
def fskey(self, mainhash, subtype, subkey):
''''Generate key for filesystem metadata.'''
return self.hashkey(self.PREFIX_FS_META, mainhash, subtype, subkey)
def fs_unkey(self, key):
'''Inverse of fskey.'''
parts = struct.unpack('!B8sB8s', key)
return parts[1], parts[3]
def genkey(self, subkey):
'''Generate key for generation metadata.'''
return self.hashkey(self.PREFIX_GEN_META, self.genhash, 0, subkey)
def int2bin(self, integer):
'''Convert an integer to a binary string representation.'''
return struct.pack('!Q', integer)
def chunk_key(self, chunk_id, file_id):
'''Generate a key for a chunk reference.'''
return self.hashkey(self.PREFIX_CHUNK_REF, self.int2bin(chunk_id),
0, file_id)
def chunk_unkey(self, key):
'''Return the chunk and file ids in a chunk key.'''
parts = struct.unpack('!BQBQ', key)
return parts[1], parts[3]
def get_file_id(self, tree, pathname):
'''Return id for file in a given generation.'''
if tree in self.file_ids:
if pathname in self.file_ids[tree]:
return self.file_ids[tree][pathname]
else:
self.file_ids[tree] = {}
default_file_id = self.default_file_id(pathname)
minkey = self.fskey(default_file_id, self.FILE_NAME, 0)
maxkey = self.fskey(default_file_id, self.FILE_NAME, obnamlib.MAX_ID)
for key, value in tree.lookup_range(minkey, maxkey):
def_id, file_id = self.fs_unkey(key)
assert def_id == default_file_id, \
'def=%s other=%s' % (repr(def_id), repr(default_file_id))
self.file_ids[tree][value] = file_id
if value == pathname:
return file_id
raise KeyError('%s does not yet have a file-id' % pathname)
def set_file_id(self, pathname):
'''Set and return the file-id for a file in current generation.'''
default_file_id = self.default_file_id(pathname)
minkey = self.fskey(default_file_id, self.FILE_NAME, 0)
maxkey = self.fskey(default_file_id, self.FILE_NAME, obnamlib.MAX_ID)
file_ids = set()
for key, value in self.tree.lookup_range(minkey, maxkey):
def_id, file_id = self.fs_unkey(key)
assert def_id == default_file_id
if value == pathname:
return file_id
file_ids.add(file_id)
while True:
n = random.randint(0, obnamlib.MAX_ID)
file_id = struct.pack('!Q', n)
if file_id not in file_ids:
break
key = self.fskey(default_file_id, self.FILE_NAME, file_id)
self.tree.insert(key, pathname)
return file_id
def _lookup_int(self, tree, key):
return struct.unpack('!Q', tree.lookup(key))[0]
def _insert_int(self, tree, key, value):
return tree.insert(key, struct.pack('!Q', value))
def commit(self):
tracing.trace('committing ClientMetadataTree')
if self.tree:
now = int(self.current_time())
self._insert_int(self.tree, self.genkey(self.GEN_ENDED), now)
genid = self._get_generation_id_or_None(self.tree)
if genid is not None:
t = [(self.GEN_FILE_COUNT, 'file_count'),
(self.GEN_TOTAL_DATA, 'total_data')]
for subkey, attr in t:
if hasattr(self, attr):
self._insert_count(genid, subkey, getattr(self, attr))
obnamlib.RepositoryTree.commit(self)
def init_forest(self, *args, **kwargs):
self.init_caches()
return obnamlib.RepositoryTree.init_forest(self, *args, **kwargs)
def start_changes(self, *args, **kwargs):
self.init_caches()
return obnamlib.RepositoryTree.start_changes(self, *args, **kwargs)
def find_generation(self, genid):
def fill_cache():
key = self.genkey(self.GEN_ID)
for t in self.forest.trees:
t_genid = self._lookup_int(t, key)
if t_genid == genid:
self.known_generations[genid] = t
return t
if self.forest:
if genid in self.known_generations:
return self.known_generations[genid]
t = fill_cache()
if t is not None:
return t
raise KeyError('Unknown generation %s' % genid)
def list_generations(self):
if self.forest:
genids = []
for t in self.forest.trees:
genid = self._get_generation_id_or_None(t)
if genid is not None:
genids.append(genid)
return genids
else:
return []
def start_generation(self):
tracing.trace('start new generation')
self.start_changes()
gen_id = self.forest.new_id()
now = int(self.current_time())
self._insert_int(self.tree, self.genkey(self.GEN_ID), gen_id)
self._insert_int(self.tree, self.genkey(self.GEN_STARTED), now)
self.file_count = self.get_generation_file_count(gen_id) or 0
self.total_data = self.get_generation_data(gen_id) or 0
def set_current_generation_is_checkpoint(self, is_checkpoint):
tracing.trace('is_checkpoint=%s', is_checkpoint)
value = 1 if is_checkpoint else 0
key = self.genkey(self.GEN_IS_CHECKPOINT)
self._insert_int(self.tree, key, value)
def get_is_checkpoint(self, genid):
tree = self.find_generation(genid)
key = self.genkey(self.GEN_IS_CHECKPOINT)
try:
return self._lookup_int(tree, key)
except KeyError:
return 0
def remove_generation(self, genid):
tracing.trace('genid=%s', genid)
tree = self.find_generation(genid)
if tree == self.tree:
self.tree = None
self.forest.remove_tree(tree)
def get_generation_id(self, tree):
return self._lookup_int(tree, self.genkey(self.GEN_ID))
def _get_generation_id_or_None(self, tree):
try:
return self.get_generation_id(tree)
except KeyError: # pragma: no cover
return None
def _lookup_time(self, tree, what):
try:
return self._lookup_int(tree, self.genkey(what))
except KeyError:
return None
def get_generation_times(self, genid):
tree = self.find_generation(genid)
return (self._lookup_time(tree, self.GEN_STARTED),
self._lookup_time(tree, self.GEN_ENDED))
def get_generation_data(self, genid):
return self._lookup_count(genid, self.GEN_TOTAL_DATA)
def _lookup_count(self, genid, count_type):
tree = self.find_generation(genid)
key = self.genkey(count_type)
try:
return self._lookup_int(tree, key)
except KeyError:
return None
def _insert_count(self, genid, count_type, count):
tree = self.find_generation(genid)
key = self.genkey(count_type)
return self._insert_int(tree, key, count)
def get_generation_file_count(self, genid):
return self._lookup_count(genid, self.GEN_FILE_COUNT)
def create(self, filename, encoded_metadata):
tracing.trace('filename=%s', filename)
file_id = self.set_file_id(filename)
gen_id = self.get_generation_id(self.tree)
try:
old_metadata = self.get_metadata(gen_id, filename)
except KeyError:
old_metadata = None
self.file_count += 1
else:
old = obnamlib.decode_metadata(old_metadata)
if old.isfile():
self.total_data -= old.st_size or 0
metadata = obnamlib.decode_metadata(encoded_metadata)
if metadata.isfile():
self.total_data += metadata.st_size or 0
if encoded_metadata != old_metadata:
tracing.trace('new or changed metadata')
self.set_metadata(filename, encoded_metadata)
# Add to parent's contents, unless already there.
parent = os.path.dirname(filename)
tracing.trace('parent=%s', parent)
if parent != filename: # root dir is its own parent
basename = os.path.basename(filename)
parent_id = self.set_file_id(parent)
key = self.fskey(parent_id, self.DIR_CONTENTS, file_id)
# We could just insert, but that would cause unnecessary
# churn in the tree if nothing changes.
try:
self.tree.lookup(key)
tracing.trace('was already in parent') # pragma: no cover
except KeyError:
self.tree.insert(key, basename)
tracing.trace('added to parent')
def get_metadata(self, genid, filename):
tree = self.find_generation(genid)
file_id = self.get_file_id(tree, filename)
key = self.fskey(file_id, self.FILE_METADATA,
self.FILE_METADATA_ENCODED)
return tree.lookup(key)
def set_metadata(self, filename, encoded_metadata):
tracing.trace('filename=%s', filename)
file_id = self.set_file_id(filename)
key1 = self.fskey(file_id, self.FILE_NAME, file_id)
self.tree.insert(key1, filename)
key2 = self.fskey(file_id, self.FILE_METADATA,
self.FILE_METADATA_ENCODED)
self.tree.insert(key2, encoded_metadata)
def remove(self, filename):
tracing.trace('filename=%s', filename)
file_id = self.get_file_id(self.tree, filename)
genid = self.get_generation_id(self.tree)
self.file_count -= 1
try:
encoded_metadata = self.get_metadata(genid, filename)
except KeyError:
pass
else:
metadata = obnamlib.decode_metadata(encoded_metadata)
if metadata.isfile():
self.total_data -= metadata.st_size or 0
# Remove any children.
minkey = self.fskey(file_id, self.DIR_CONTENTS, 0)
maxkey = self.fskey(file_id, self.DIR_CONTENTS, obnamlib.MAX_ID)
for key, basename in self.tree.lookup_range(minkey, maxkey):
self.remove(os.path.join(filename, basename))
# Remove chunk refs.
for chunkid in self.get_file_chunks(genid, filename):
key = self.chunk_key(chunkid, file_id)
self.tree.remove_range(key, key)
# Remove this file's metadata.
minkey = self.fskey(file_id, 0, 0)
maxkey = self.fskey(file_id, self.TYPE_MAX, self.SUBKEY_MAX)
self.tree.remove_range(minkey, maxkey)
# Remove filename.
default_file_id = self.default_file_id(filename)
key = self.fskey(default_file_id, self.FILE_NAME, file_id)
self.tree.remove_range(key, key)
# Also remove from parent's contents.
parent = os.path.dirname(filename)
if parent != filename: # root dir is its own parent
parent_id = self.set_file_id(parent)
key = self.fskey(parent_id, self.DIR_CONTENTS, file_id)
# The range removal will work even if the key does not exist.
self.tree.remove_range(key, key)
def listdir(self, genid, dirname):
tree = self.find_generation(genid)
try:
dir_id = self.get_file_id(tree, dirname)
except KeyError:
return []
minkey = self.fskey(dir_id, self.DIR_CONTENTS, 0)
maxkey = self.fskey(dir_id, self.DIR_CONTENTS, self.SUBKEY_MAX)
basenames = []
for key, value in tree.lookup_range(minkey, maxkey):
basenames.append(value)
return basenames
def get_file_chunks(self, genid, filename):
tree = self.find_generation(genid)
try:
file_id = self.get_file_id(tree, filename)
except KeyError:
return []
minkey = self.fskey(file_id, self.FILE_CHUNKS, 0)
maxkey = self.fskey(file_id, self.FILE_CHUNKS, self.SUBKEY_MAX)
pairs = tree.lookup_range(minkey, maxkey)
chunkids = []
for key, value in pairs:
chunkids.extend(self._decode_chunks(value))
return chunkids
def _encode_chunks(self, chunkids):
fmt = '!' + ('Q' * len(chunkids))
return struct.pack(fmt, *chunkids)
def _decode_chunks(self, encoded):
size = struct.calcsize('Q')
count = len(encoded) / size
fmt = '!' + ('Q' * count)
return struct.unpack(fmt, encoded)
def _insert_chunks(self, tree, file_id, i, chunkids):
key = self.fskey(file_id, self.FILE_CHUNKS, i)
encoded = self._encode_chunks(chunkids)
tree.insert(key, encoded)
def set_file_chunks(self, filename, chunkids):
tracing.trace('filename=%s', filename)
tracing.trace('chunkids=%s', repr(chunkids))
file_id = self.set_file_id(filename)
minkey = self.fskey(file_id, self.FILE_CHUNKS, 0)
maxkey = self.fskey(file_id, self.FILE_CHUNKS, self.SUBKEY_MAX)
for key, value in self.tree.lookup_range(minkey, maxkey):
for chunkid in self._decode_chunks(value):
k = self.chunk_key(chunkid, file_id)
self.tree.remove_range(k, k)
self.tree.remove_range(minkey, maxkey)
self.append_file_chunks(filename, chunkids)
def append_file_chunks(self, filename, chunkids):
tracing.trace('filename=%s', filename)
tracing.trace('chunkids=%s', repr(chunkids))
file_id = self.set_file_id(filename)
minkey = self.fskey(file_id, self.FILE_CHUNKS, 0)
maxkey = self.fskey(file_id, self.FILE_CHUNKS, self.SUBKEY_MAX)
i = self.tree.count_range(minkey, maxkey)
while chunkids:
some = chunkids[:self.chunkids_per_key]
self._insert_chunks(self.tree, file_id, i, some)
for chunkid in some:
self.tree.insert(self.chunk_key(chunkid, file_id), '')
i += 1
chunkids = chunkids[self.chunkids_per_key:]
def chunk_in_use(self, gen_id, chunk_id):
'''Is a chunk used by a generation?'''
minkey = self.chunk_key(chunk_id, 0)
maxkey = self.chunk_key(chunk_id, obnamlib.MAX_ID)
t = self.find_generation(gen_id)
return not t.range_is_empty(minkey, maxkey)
def list_chunks_in_generation(self, gen_id):
'''Return list of chunk ids used in a given generation.'''
minkey = self.chunk_key(0, 0)
maxkey = self.chunk_key(obnamlib.MAX_ID, obnamlib.MAX_ID)
t = self.find_generation(gen_id)
return list(set(self.chunk_unkey(key)[0]
for key, value in t.lookup_range(minkey, maxkey)))
def set_file_data(self, filename, contents): # pragma: no cover
'''Store contents of file, if small, in B-tree instead of chunk.
The length of the contents should be small enough to fit in a
B-tree leaf.
'''
tracing.trace('filename=%s' % filename)
tracing.trace('contents=%s' % repr(contents))
file_id = self.set_file_id(filename)
key = self.fskey(file_id, self.FILE_DATA, 0)
self.tree.insert(key, contents)
def get_file_data(self, gen_id, filename): # pragma: no cover
'''Return contents of file, if set, or None.'''
tree = self.find_generation(gen_id)
file_id = self.get_file_id(tree, filename)
key = self.fskey(file_id, self.FILE_DATA, 0)
try:
return tree.lookup(key)
except KeyError:
return None
obnam-1.6.1/obnamlib/clientmetadatatree_tests.py 0000644 0001750 0001750 00000046546 12246357067 021750 0 ustar jenkins jenkins # Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import shutil
import stat
import tempfile
import time
import unittest
import obnamlib
class ClientMetadataTreeTests(unittest.TestCase):
def current_time(self):
return time.time() if self.now is None else self.now
def setUp(self):
self.now = None
self.tempdir = tempfile.mkdtemp()
fs = obnamlib.LocalFS(self.tempdir)
self.hooks = obnamlib.HookManager()
self.hooks.new('repository-toplevel-init')
self.client = obnamlib.ClientMetadataTree(fs, 'clientid',
obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
obnamlib.DEFAULT_LRU_SIZE, self)
self.file_size = 123
self.file_metadata = obnamlib.Metadata(st_mode=stat.S_IFREG | 0666,
st_size=self.file_size)
self.file_encoded = obnamlib.encode_metadata(self.file_metadata)
def tearDown(self):
shutil.rmtree(self.tempdir)
def test_has_not_current_generation_initially(self):
self.assertEqual(self.client.tree, None)
def test_lists_no_generations_initially(self):
self.assertEqual(self.client.list_generations(), [])
def test_starts_generation(self):
self.now = 12765
self.client.start_generation()
self.assertNotEqual(self.client.tree, None)
def lookup(x):
key = self.client.genkey(x)
return self.client._lookup_int(self.client.tree, key)
genid = self.client.get_generation_id(self.client.tree)
self.assertEqual(lookup(self.client.GEN_ID), genid)
self.assertEqual(lookup(self.client.GEN_STARTED), 12765)
self.assertFalse(self.client.get_is_checkpoint(genid))
def test_starts_second_generation(self):
self.now = 1
self.client.start_generation()
genid1 = self.client.get_generation_id(self.client.tree)
self.client.commit()
self.assertEqual(self.client.tree, None)
self.now = 2
self.client.start_generation()
self.assertNotEqual(self.client.tree, None)
def lookup(x):
key = self.client.genkey(x)
return self.client._lookup_int(self.client.tree, key)
genid2 = self.client.get_generation_id(self.client.tree)
self.assertEqual(lookup(self.client.GEN_ID), genid2)
self.assertNotEqual(genid1, genid2)
self.assertEqual(lookup(self.client.GEN_STARTED), 2)
self.assertFalse(self.client.get_is_checkpoint(genid2))
self.assertEqual(self.client.list_generations(), [genid1, genid2])
def test_sets_is_checkpoint(self):
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.client.set_current_generation_is_checkpoint(True)
self.assert_(self.client.get_is_checkpoint(genid))
def test_unsets_is_checkpoint(self):
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.client.set_current_generation_is_checkpoint(True)
self.client.set_current_generation_is_checkpoint(False)
self.assertFalse(self.client.get_is_checkpoint(genid))
def test_removes_generation(self):
self.client.start_generation()
self.client.commit()
genid = self.client.list_generations()[0]
self.client.remove_generation(genid)
self.assertEqual(self.client.list_generations(), [])
def test_removes_started_generation(self):
self.client.start_generation()
self.client.remove_generation(self.client.list_generations()[0])
self.assertEqual(self.client.list_generations(), [])
self.assertEqual(self.client.tree, None)
def test_started_generation_has_start_time(self):
self.now = 1
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.assertEqual(self.client.get_generation_times(genid), (1, None))
def test_committed_generation_has_times(self):
self.now = 1
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.now = 2
self.client.commit()
self.assertEqual(self.client.get_generation_times(genid), (1, 2))
def test_single_empty_generation_counts_zero_files(self):
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.client.commit()
self.assertEqual(self.client.get_generation_file_count(genid), 0)
def test_counts_files_in_first_generation(self):
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.client.create('/foo', self.file_encoded)
self.client.commit()
self.assertEqual(self.client.get_generation_file_count(genid), 1)
def test_counts_new_files_in_second_generation(self):
self.client.start_generation()
self.client.create('/foo', self.file_encoded)
self.client.commit()
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.client.create('/bar', self.file_encoded)
self.client.commit()
self.assertEqual(self.client.get_generation_file_count(genid), 2)
def test_discounts_deleted_files_in_second_generation(self):
self.client.start_generation()
self.client.create('/foo', self.file_encoded)
self.client.commit()
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.client.remove('/foo')
self.client.commit()
self.assertEqual(self.client.get_generation_file_count(genid), 0)
def test_does_not_increment_count_for_recreated_files(self):
self.client.start_generation()
self.client.create('/foo', self.file_encoded)
self.client.commit()
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.client.create('/foo', self.file_encoded)
self.client.commit()
self.assertEqual(self.client.get_generation_file_count(genid), 1)
def test_single_empty_generation_has_no_data(self):
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.client.commit()
self.assertEqual(self.client.get_generation_data(genid), 0)
def test_has_data_in_first_generation(self):
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.client.create('/foo', self.file_encoded)
self.client.commit()
self.assertEqual(self.client.get_generation_data(genid),
self.file_size)
def test_counts_new_files_in_second_generation(self):
self.client.start_generation()
self.client.create('/foo', self.file_encoded)
self.client.commit()
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.client.create('/bar', self.file_encoded)
self.client.commit()
self.assertEqual(self.client.get_generation_data(genid),
2 * self.file_size)
def test_counts_replaced_data_in_second_generation(self):
self.client.start_generation()
self.client.create('/foo', self.file_encoded)
self.client.commit()
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.client.create('/foo', self.file_encoded)
self.client.commit()
self.assertEqual(self.client.get_generation_data(genid),
self.file_size)
def test_discounts_deleted_data_in_second_generation(self):
self.client.start_generation()
self.client.create('/foo', self.file_encoded)
self.client.commit()
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.client.remove('/foo')
self.client.commit()
self.assertEqual(self.client.get_generation_data(genid), 0)
def test_does_not_increment_data_for_recreated_files(self):
self.client.start_generation()
self.client.create('/foo', self.file_encoded)
self.client.commit()
self.client.start_generation()
genid = self.client.get_generation_id(self.client.tree)
self.client.create('/foo', self.file_encoded)
self.client.commit()
self.assertEqual(self.client.get_generation_data(genid),
self.file_size)
def test_finds_generation_the_first_time(self):
self.client.start_generation()
tree = self.client.tree
genid = self.client.get_generation_id(tree)
self.client.commit()
self.assertEqual(self.client.find_generation(genid), tree)
def test_finds_generation_the_second_time(self):
self.client.start_generation()
tree = self.client.tree
genid = self.client.get_generation_id(tree)
self.client.commit()
self.client.find_generation(genid)
self.assertEqual(self.client.find_generation(genid), tree)
def test_find_generation_raises_keyerror_for_empty_forest(self):
self.client.init_forest()
self.assertRaises(KeyError, self.client.find_generation, 0)
def test_find_generation_raises_keyerror_for_unknown_generation(self):
self.assertRaises(KeyError, self.client.find_generation, 0)
class ClientMetadataTreeFileOpsTests(unittest.TestCase):
def current_time(self):
return time.time() if self.now is None else self.now
def setUp(self):
self.now = None
self.tempdir = tempfile.mkdtemp()
fs = obnamlib.LocalFS(self.tempdir)
self.hooks = obnamlib.HookManager()
self.hooks.new('repository-toplevel-init')
self.client = obnamlib.ClientMetadataTree(fs, 'clientid',
obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
obnamlib.DEFAULT_LRU_SIZE,
self)
# Force use of filename hash collisions.
self.client.default_file_id = self.client._bad_default_file_id
self.client.start_generation()
self.clientid = self.client.get_generation_id(self.client.tree)
self.file_metadata = obnamlib.Metadata(st_mode=stat.S_IFREG | 0666)
self.file_encoded = obnamlib.encode_metadata(self.file_metadata)
self.dir_metadata = obnamlib.Metadata(st_mode=stat.S_IFDIR | 0777)
self.dir_encoded = obnamlib.encode_metadata(self.dir_metadata)
def tearDown(self):
shutil.rmtree(self.tempdir)
def test_has_empty_root_initially(self):
self.assertEqual(self.client.listdir(self.clientid, '/'), [])
def test_has_no_metadata_initially(self):
self.assertRaises(KeyError, self.client.get_metadata, self.clientid,
'/foo')
def test_sets_metadata(self):
self.client.set_metadata('/foo', self.file_encoded)
self.assertEqual(self.client.get_metadata(self.clientid, '/foo'),
self.file_encoded)
def test_creates_file_at_root(self):
self.client.create('/foo', self.file_encoded)
self.assertEqual(self.client.listdir(self.clientid, '/'), ['foo'])
self.assertEqual(self.client.get_metadata(self.clientid, '/foo'),
self.file_encoded)
def test_removes_file_at_root(self):
self.client.create('/foo', self.file_encoded)
self.client.remove('/foo')
self.assertEqual(self.client.listdir(self.clientid, '/'), [])
self.assertRaises(KeyError, self.client.get_metadata,
self.clientid, '/foo')
def test_creates_directory_at_root(self):
self.client.create('/foo', self.dir_encoded)
self.assertEqual(self.client.listdir(self.clientid, '/'), ['foo'])
self.assertEqual(self.client.get_metadata(self.clientid, '/foo'),
self.dir_encoded)
def test_removes_directory_at_root(self):
self.client.create('/foo', self.dir_encoded)
self.client.remove('/foo')
self.assertEqual(self.client.listdir(self.clientid, '/'), [])
self.assertRaises(KeyError, self.client.get_metadata,
self.clientid, '/foo')
def test_creates_directory_and_files_and_subdirs(self):
self.client.create('/foo', self.dir_encoded)
self.client.create('/foo/foobar', self.file_encoded)
self.client.create('/foo/bar', self.dir_encoded)
self.client.create('/foo/bar/baz', self.file_encoded)
self.assertEqual(self.client.listdir(self.clientid, '/'), ['foo'])
self.assertEqual(sorted(self.client.listdir(self.clientid, '/foo')),
['bar', 'foobar'])
self.assertEqual(self.client.listdir(self.clientid, '/foo/bar'),
['baz'])
self.assertEqual(self.client.get_metadata(self.clientid, '/foo'),
self.dir_encoded)
self.assertEqual(self.client.get_metadata(self.clientid, '/foo/bar'),
self.dir_encoded)
self.assertEqual(self.client.get_metadata(self.clientid, '/foo/foobar'),
self.file_encoded)
self.assertEqual(self.client.get_metadata(self.clientid,
'/foo/bar/baz'),
self.file_encoded)
def test_removes_directory_and_files_and_subdirs(self):
self.client.create('/foo', self.dir_encoded)
self.client.create('/foo/foobar', self.file_encoded)
self.client.create('/foo/bar', self.dir_encoded)
self.client.create('/foo/bar/baz', self.file_encoded)
self.client.remove('/foo')
self.assertEqual(self.client.listdir(self.clientid, '/'), [])
self.assertRaises(KeyError, self.client.get_metadata,
self.clientid, '/foo')
self.assertRaises(KeyError, self.client.get_metadata,
self.clientid, '/foo/foobar')
self.assertRaises(KeyError, self.client.get_metadata,
self.clientid, '/foo/bar')
self.assertRaises(KeyError, self.client.get_metadata,
self.clientid, '/foo/bar/baz')
def test_has_no_file_chunks_initially(self):
self.assertEqual(self.client.get_file_chunks(self.clientid, '/foo'), [])
def test_sets_file_chunks(self):
self.client.set_file_chunks('/foo', [1, 2, 3])
self.assertEqual(self.client.get_file_chunks(self.clientid, '/foo'),
[1, 2, 3])
def test_appends_file_chunks_to_empty_list(self):
self.client.append_file_chunks('/foo', [1, 2, 3])
self.assertEqual(self.client.get_file_chunks(self.clientid, '/foo'),
[1, 2, 3])
def test_appends_file_chunks_to_nonempty_list(self):
self.client.set_file_chunks('/foo', [1, 2, 3])
self.client.append_file_chunks('/foo', [4, 5, 6])
self.assertEqual(self.client.get_file_chunks(self.clientid, '/foo'),
[1, 2, 3, 4, 5, 6])
def test_generation_has_no_chunk_refs_initially(self):
minkey = self.client.chunk_key(0, 0)
maxkey = self.client.chunk_key(obnamlib.MAX_ID, obnamlib.MAX_ID)
self.assertEqual(list(self.client.tree.lookup_range(minkey, maxkey)),
[])
def test_generation_has_no_chunk_refs_initially(self):
minkey = self.client.chunk_key(0, 0)
maxkey = self.client.chunk_key(obnamlib.MAX_ID, obnamlib.MAX_ID)
self.assertEqual(list(self.client.tree.lookup_range(minkey, maxkey)),
[])
def test_sets_file_chunks(self):
self.client.set_file_chunks('/foo', [1, 2, 3])
self.assertEqual(self.client.get_file_chunks(self.clientid, '/foo'),
[1, 2, 3])
def test_generation_has_no_chunk_refs_initially(self):
minkey = self.client.chunk_key(0, 0)
maxkey = self.client.chunk_key(obnamlib.MAX_ID, obnamlib.MAX_ID)
self.assertEqual(list(self.client.tree.lookup_range(minkey, maxkey)),
[])
def test_set_file_chunks_adds_chunk_refs(self):
self.client.set_file_chunks('/foo', [1, 2])
file_id = self.client.get_file_id(self.client.tree, '/foo')
minkey = self.client.chunk_key(0, 0)
maxkey = self.client.chunk_key(obnamlib.MAX_ID, obnamlib.MAX_ID)
self.assertEqual(set(self.client.tree.lookup_range(minkey, maxkey)),
set([(self.client.chunk_key(1, file_id), ''),
(self.client.chunk_key(2, file_id), '')]))
def test_set_file_chunks_removes_now_unused_chunk_refs(self):
self.client.set_file_chunks('/foo', [1, 2])
self.client.set_file_chunks('/foo', [1])
file_id = self.client.get_file_id(self.client.tree, '/foo')
minkey = self.client.chunk_key(0, 0)
maxkey = self.client.chunk_key(obnamlib.MAX_ID, obnamlib.MAX_ID)
self.assertEqual(list(self.client.tree.lookup_range(minkey, maxkey)),
[(self.client.chunk_key(1, file_id), '')])
def test_remove_removes_chunk_refs(self):
self.client.set_file_chunks('/foo', [1, 2])
self.client.remove('/foo')
minkey = self.client.chunk_key(0, 0)
maxkey = self.client.chunk_key(obnamlib.MAX_ID, obnamlib.MAX_ID)
self.assertEqual(list(self.client.tree.lookup_range(minkey, maxkey)),
[])
def test_report_chunk_not_in_use_initially(self):
gen_id = self.client.get_generation_id(self.client.tree)
self.assertFalse(self.client.chunk_in_use(gen_id, 0))
def test_report_chunk_in_use_after_it_is(self):
gen_id = self.client.get_generation_id(self.client.tree)
self.client.set_file_chunks('/foo', [0])
self.assertTrue(self.client.chunk_in_use(gen_id, 0))
def test_lists_no_chunks_in_generation_initially(self):
gen_id = self.client.get_generation_id(self.client.tree)
self.assertEqual(self.client.list_chunks_in_generation(gen_id), [])
def test_lists_used_chunks_in_generation(self):
gen_id = self.client.get_generation_id(self.client.tree)
self.client.set_file_chunks('/foo', [0])
self.client.set_file_chunks('/bar', [1])
self.assertEqual(set(self.client.list_chunks_in_generation(gen_id)),
set([0, 1]))
def test_lists_chunks_in_generation_only_once(self):
gen_id = self.client.get_generation_id(self.client.tree)
self.client.set_file_chunks('/foo', [0])
self.client.set_file_chunks('/bar', [0])
self.assertEqual(self.client.list_chunks_in_generation(gen_id), [0])
obnam-1.6.1/obnamlib/encryption.py 0000644 0001750 0001750 00000016506 12246357067 017052 0 ustar jenkins jenkins # Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import os
import shutil
import subprocess
import tempfile
import tracing
import obnamlib
def generate_symmetric_key(numbits, filename='/dev/random'):
'''Generate a random key of at least numbits for symmetric encryption.'''
tracing.trace('numbits=%d', numbits)
bytes = (numbits + 7) / 8
f = open(filename, 'rb')
key = f.read(bytes)
f.close()
return key.encode('hex')
class SymmetricKeyCache(object):
'''Cache symmetric keys in memory.'''
def __init__(self):
self.clear()
def get(self, repo, toplevel):
if repo in self.repos and toplevel in self.repos[repo]:
return self.repos[repo][toplevel]
return None
def put(self, repo, toplevel, key):
if repo not in self.repos:
self.repos[repo] = {}
self.repos[repo][toplevel] = key
def clear(self):
self.repos = {}
def _gpg_pipe(args, data, passphrase):
'''Pipe things through gpg.
With the right args, this can be either an encryption or a decryption
operation.
For safety, we give the passphrase to gpg via a file descriptor.
The argument list is modified to include the relevant options for that.
The data is fed to gpg via a temporary file, readable only by
the owner, to avoid congested pipes.
'''
# Open pipe for passphrase, and write it there. If passphrase is
# very long (more than 4 KiB by default), this might block. A better
# implementation would be to have a loop around select(2) to do pipe
# I/O when it can be done without blocking. Patches most welcome.
keypipe = os.pipe()
os.write(keypipe[1], passphrase + '\n')
os.close(keypipe[1])
# Actually run gpg.
argv = ['gpg', '--passphrase-fd', str(keypipe[0]), '-q', '--batch',
'--no-textmode'] + args
tracing.trace('argv=%s', repr(argv))
p = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
stderr=subprocess.PIPE)
out, err = p.communicate(data)
os.close(keypipe[0])
# Return output data, or deal with errors.
if p.returncode: # pragma: no cover
raise obnamlib.EncryptionError(err)
return out
def encrypt_symmetric(cleartext, key):
'''Encrypt data with symmetric encryption.'''
return _gpg_pipe(['-c'], cleartext, key)
def decrypt_symmetric(encrypted, key):
'''Decrypt encrypted data with symmetric encryption.'''
return _gpg_pipe(['-d'], encrypted, key)
def _gpg(args, stdin='', gpghome=None):
'''Run gpg and return its output.'''
env = dict()
env.update(os.environ)
if gpghome is not None:
env['GNUPGHOME'] = gpghome
tracing.trace('gpghome=%s' % gpghome)
argv = ['gpg', '-q', '--batch', '--no-textmode'] + args
tracing.trace('argv=%s', repr(argv))
p = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE,
stderr=subprocess.PIPE, env=env)
out, err = p.communicate(stdin)
# Return output data, or deal with errors.
if p.returncode: # pragma: no cover
raise obnamlib.EncryptionError(err)
return out
def get_public_key(keyid, gpghome=None):
'''Return the ASCII armored export form of a given public key.'''
return _gpg(['--export', '--armor', keyid], gpghome=gpghome)
def get_public_key_user_ids(keyid, gpghome=None): # pragma: no cover
'''Return the ASCII armored export form of a given public key.'''
user_ids = []
output = _gpg(['--with-colons', '--list-keys', keyid], gpghome=gpghome)
for line in output.splitlines():
token = line.split(":")
if len(token) >= 10:
user_id = token[9].strip().replace(r'\x3a', ":")
if user_id:
user_ids.append(user_id)
return user_ids
class Keyring(object):
'''A simplistic representation of GnuPG keyrings.
Just enough functionality for obnam's purposes.
'''
_keyring_name = 'pubring.gpg'
def __init__(self, encoded=''):
self._encoded = encoded
self._gpghome = None
self._keyids = None
def _setup(self):
self._gpghome = tempfile.mkdtemp()
f = open(self._keyring, 'wb')
f.write(self._encoded)
f.close()
def _cleanup(self):
shutil.rmtree(self._gpghome)
self._gpghome = None
@property
def _keyring(self):
return os.path.join(self._gpghome, self._keyring_name)
def _real_keyids(self):
output = self.gpg(False, ['--list-keys', '--with-colons'])
keyids = []
for line in output.splitlines():
fields = line.split(':')
if len(fields) >= 5 and fields[0] == 'pub':
keyids.append(fields[4])
return keyids
def keyids(self):
if self._keyids is None:
self._keyids = self._real_keyids()
return self._keyids
def __str__(self):
return self._encoded
def __contains__(self, keyid):
return keyid in self.keyids()
def _reread_keyring(self):
f = open(self._keyring, 'rb')
self._encoded = f.read()
f.close()
self._keyids = None
def add(self, key):
self.gpg(True, ['--import'], stdin=key)
def remove(self, keyid):
self.gpg(True, ['--delete-key', '--yes', keyid])
def gpg(self, reread, *args, **kwargs):
self._setup()
kwargs['gpghome'] = self._gpghome
try:
result = _gpg(*args, **kwargs)
except BaseException: # pragma: no cover
self._cleanup()
raise
else:
if reread:
self._reread_keyring()
self._cleanup()
return result
class SecretKeyring(Keyring):
'''Same as Keyring, but for secret keys.'''
_keyring_name = 'secring.gpg'
def _real_keyids(self):
output = self.gpg(False, ['--list-secret-keys', '--with-colons'])
keyids = []
for line in output.splitlines():
fields = line.split(':')
if len(fields) >= 5 and fields[0] == 'sec':
keyids.append(fields[4])
return keyids
def encrypt_with_keyring(cleartext, keyring):
'''Encrypt data with all keys in a keyring.'''
recipients = []
for keyid in keyring.keyids():
recipients += ['-r', keyid]
return keyring.gpg(False,
['-e',
'--trust-model', 'always',
'--no-encrypt-to',
'--no-default-recipient',
] + recipients,
stdin=cleartext)
def decrypt_with_secret_keys(encrypted, gpghome=None):
'''Decrypt data using secret keys GnuPG finds on its own.'''
return _gpg(['-d'], stdin=encrypted, gpghome=gpghome)
obnam-1.6.1/obnamlib/encryption_tests.py 0000644 0001750 0001750 00000015310 12246357067 020264 0 ustar jenkins jenkins # Copyright 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import os
import shutil
import subprocess
import tempfile
import unittest
import obnamlib
def cat(filename):
f = open(filename, 'rb')
data = f.read()
f.close()
return data
class SymmetricEncryptionTests(unittest.TestCase):
# We don't test the quality of keys or encryption here. Doing that is
# hard to do well, and we'll just assume that reading /dev/random
# for keys, and using gpg for encryption, is going to work well.
# In these tests, we care about making sure we use the tools right,
# not that the tools themselves work right.
def test_generates_key_of_correct_length(self):
numbits = 16
key = obnamlib.generate_symmetric_key(numbits, filename='/dev/zero')
self.assertEqual(len(key) * 8 / 2, numbits) # /2 for hex encoding
def test_generates_key_with_size_rounded_up(self):
numbits = 15
key = obnamlib.generate_symmetric_key(numbits, filename='/dev/zero')
self.assertEqual(len(key)/2, 2) # /2 for hex encoding
def test_encrypts_into_different_string_than_cleartext(self):
cleartext = 'hello world'
key = 'sekr1t'
encrypted = obnamlib.encrypt_symmetric(cleartext, key)
self.assertNotEqual(cleartext, encrypted)
def test_encrypt_decrypt_round_trip(self):
cleartext = 'hello, world'
key = 'sekr1t'
encrypted = obnamlib.encrypt_symmetric(cleartext, key)
decrypted = obnamlib.decrypt_symmetric(encrypted, key)
self.assertEqual(decrypted, cleartext)
class SymmetricKeyCacheTests(unittest.TestCase):
def setUp(self):
self.cache = obnamlib.SymmetricKeyCache()
self.repo = 'repo'
self.repo2 = 'repo2'
self.toplevel = 'toplevel'
self.key = 'key'
self.key2 = 'key2'
def test_does_not_have_key_initially(self):
self.assertEqual(self.cache.get(self.repo, self.toplevel), None)
def test_remembers_key(self):
self.cache.put(self.repo, self.toplevel, self.key)
self.assertEqual(self.cache.get(self.repo, self.toplevel), self.key)
def test_does_not_remember_key_for_different_repo(self):
self.cache.put(self.repo, self.toplevel, self.key)
self.assertEqual(self.cache.get(self.repo2, self.toplevel), None)
def test_remembers_keys_for_both_repos(self):
self.cache.put(self.repo, self.toplevel, self.key)
self.cache.put(self.repo2, self.toplevel, self.key2)
self.assertEqual(self.cache.get(self.repo, self.toplevel), self.key)
self.assertEqual(self.cache.get(self.repo2, self.toplevel), self.key2)
def test_clears_cache(self):
self.cache.put(self.repo, self.toplevel, self.key)
self.cache.clear()
self.assertEqual(self.cache.get(self.repo, self.toplevel), None)
class GetPublicKeyTests(unittest.TestCase):
def setUp(self):
self.dirname = tempfile.mkdtemp()
self.gpghome = os.path.join(self.dirname, 'gpghome')
shutil.copytree('test-gpghome', self.gpghome)
self.keyid = '1B321347'
def tearDown(self):
shutil.rmtree(self.dirname)
def test_exports_key(self):
key = obnamlib.get_public_key(self.keyid, gpghome=self.gpghome)
self.assert_('-----BEGIN PGP PUBLIC KEY BLOCK-----' in key)
class KeyringTests(unittest.TestCase):
def setUp(self):
self.keyring = obnamlib.Keyring()
self.keyid = '3B1802F81B321347'
self.key = '''
-----BEGIN PGP PUBLIC KEY BLOCK-----
Version: GnuPG v1.4.10 (GNU/Linux)
mI0ETY8gwwEEAMrSXBIJseIv9miuwnYlCd7CQCzNb8nHYkpo4o1nEQD3k/h7xj9m
/0Gd5kLfF+WLwAxSJYb41JjaKs0FeUexSGNePdNFxn2CCZ4moHH19tTlWGfqCNz7
vcYQpSbPix+zhR7uNqilxtsIrx1iyYwh7L2VKf/KMJ7yXbT+jbAj7fqBABEBAAG0
CFRlc3QgS2V5iLgEEwECACIFAk2PIMMCGwMGCwkIBwMCBhUIAgkKCwQWAgMBAh4B
AheAAAoJEDsYAvgbMhNHlEED/1UkiLJ8R3phMRnjLtn+5JobYvOi7WEubnRv1rnN
MC4MyhFiLux7Z8p3xwt1Pf2GqL7q1dD91NOx+6KS3d1PFmiM/i1fYalZPbzm1gNr
8sFK2Gxsnd7mmYf2wKIo335Bk21SCmGcNKvmKW2M6ckzPT0q/RZ2hhY9JhHUiLG4
Lu3muI0ETY8gwwEEAMQoiBCQYky52pDamnH5c7FngCM72AkNq/z0+DHqY202gksd
Vy63TF7UGIsiCLvY787vPm62sOqYO0uI6PV5xVDGyJh4oI/g2zgNkhXRZrIB1Q+T
THp7qSmwQUZv8T+HfgxLiaXDq6oV/HWLElcMQ9ClZ3Sxzlu3ZQHrtmY5XridABEB
AAGInwQYAQIACQUCTY8gwwIbDAAKCRA7GAL4GzITR4hgBAClEurTj5n0/21pWZH0
Ljmokwa3FM++OZxO7shc1LIVNiAKfLiPigU+XbvSeVWTeajKkvj5LCVxKQiRSiYB
Z85TYTo06kHvDCYQmFOSGrLsZxMyJCfHML5spF9+bej5cepmuNVIdJK5vlgDiVr3
uWUO7gMi+AlnxbfXVCTEgw3xhg==
=j+6W
-----END PGP PUBLIC KEY BLOCK-----
'''
def test_has_no_keys_initially(self):
self.assertEqual(self.keyring.keyids(), [])
self.assertEqual(str(self.keyring), '')
def test_gets_no_keys_from_empty_encoded(self):
keyring = obnamlib.Keyring(encoded='')
self.assertEqual(keyring.keyids(), [])
def test_adds_key(self):
self.keyring.add(self.key)
self.assertEqual(self.keyring.keyids(), [self.keyid])
self.assert_(self.keyid in self.keyring)
def test_removes_key(self):
self.keyring.add(self.key)
self.keyring.remove(self.keyid)
self.assertEqual(self.keyring.keyids(), [])
def test_export_import_roundtrip_works(self):
self.keyring.add(self.key)
exported = str(self.keyring)
keyring2 = obnamlib.Keyring(exported)
self.assertEqual(keyring2.keyids(), [self.keyid])
class SecretKeyringTests(unittest.TestCase):
def test_lists_correct_key(self):
keyid1 = '3B1802F81B321347'
keyid2 = 'DF3D13AA11E69900'
seckeys = obnamlib.SecretKeyring(cat('test-gpghome/secring.gpg'))
self.assertEqual(sorted(seckeys.keyids()), sorted([keyid1, keyid2]))
class PublicKeyEncryptionTests(unittest.TestCase):
def test_roundtrip_works(self):
cleartext = 'hello, world'
passphrase = 'password1'
keyring = obnamlib.Keyring(cat('test-gpghome/pubring.gpg'))
seckeys = obnamlib.SecretKeyring(cat('test-gpghome/secring.gpg'))
encrypted = obnamlib.encrypt_with_keyring(cleartext, keyring)
decrypted = obnamlib.decrypt_with_secret_keys(encrypted,
gpghome='test-gpghome')
self.assertEqual(decrypted, cleartext)
obnam-1.6.1/obnamlib/forget_policy.py 0000644 0001750 0001750 00000007642 12246357067 017526 0 ustar jenkins jenkins # Copyright (C) 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import re
import obnamlib
class ForgetPolicy(object):
'''Parse and interpret a policy for what to forget and what to keep.
See documentation for the --keep option for details.
'''
periods = {
'h': 'hourly',
'd': 'daily',
'w': 'weekly',
'm': 'monthly',
'y': 'yearly',
}
rule_pat = re.compile(r'(?P\d+)(?P(h|d|w|m|y))')
def parse(self, optarg):
'''Parse the argument of --keep.
Return a dictionary indexed by 'hourly', 'daily', 'weekly',
'monthly', 'yearly', and giving the number of generations
to keep for each time period.
'''
remaining = optarg
m = self.rule_pat.match(remaining)
if not m:
raise obnamlib.Error('Forget policy syntax error: %s' % optarg)
result = dict((y, None) for x, y in self.periods.iteritems())
while m:
count = int(m.group('count'))
period = self.periods[m.group('period')]
if result[period] is not None:
raise obnamlib.Error('Forget policy may not '
'duplicate period (%s): %s' %
(period, optarg))
result[period] = count
remaining = remaining[m.end():]
if not remaining:
break
if not remaining.startswith(','):
raise obnamlib.Error('Forget policy must have rules '
'separated by commas: %s' % optarg)
remaining = remaining[1:]
m = self.rule_pat.match(remaining)
result.update((x, 0) for x, y in result.iteritems() if y is None)
return result
def last_in_each_period(self, period, genlist):
formats = {
'hourly': '%Y-%m-%d %H',
'daily': '%Y-%m-%d',
'weekly': '%Y-%W',
'monthly': '%Y-%m',
'yearly': '%Y',
}
matches = []
for genid, dt in genlist:
formatted = dt.strftime(formats[period])
if not matches:
matches.append((genid, formatted))
elif matches[-1][1] == formatted:
matches[-1] = (genid, formatted)
else:
matches.append((genid, formatted))
return [genid for genid, formatted in matches]
def match(self, rules, genlist):
'''Match a parsed ruleset against a list of generations and times.
The ruleset should be of the form returned by the parse method.
genlist should be a list of generation identifiers and timestamps.
Identifiers can be anything, timestamps should be an instance
of datetime.datetime, with no time zone (it is ignored).
genlist should be in ascending order by time: oldest one first.
Return value is all those pairs from genlist that should be
kept (i.e., which match the rules).
'''
result_ids = set()
for period in rules:
genids = self.last_in_each_period(period, genlist)
if rules[period]:
for genid in genids[-rules[period]:]:
result_ids.add(genid)
return [(genid, dt) for genid, dt in genlist
if genid in result_ids]
obnam-1.6.1/obnamlib/forget_policy_tests.py 0000644 0001750 0001750 00000011643 12246357067 020744 0 ustar jenkins jenkins # Copyright (C) 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import datetime
import unittest
import obnamlib
class ForgetPolicyParseTests(unittest.TestCase):
def setUp(self):
self.fp = obnamlib.ForgetPolicy()
def test_raises_error_for_empty_string(self):
self.assertRaises(obnamlib.Error, self.fp.parse, '')
def test_raises_error_for_unknown_period(self):
self.assertRaises(obnamlib.Error, self.fp.parse, '7x')
def test_raises_error_if_period_is_duplicated(self):
self.assertRaises(obnamlib.Error, self.fp.parse, '1h,2h')
def test_raises_error_rules_not_separated_by_comma(self):
self.assertRaises(obnamlib.Error, self.fp.parse, '1h 2d')
def test_parses_single_rule(self):
self.assertEqual(self.fp.parse('7d'),
{ 'hourly': 0,
'daily': 7,
'weekly': 0,
'monthly': 0,
'yearly': 0 })
def test_parses_multiple_rules(self):
self.assertEqual(self.fp.parse('1h,2d,3w,4m,255y'),
{ 'hourly': 1,
'daily': 2,
'weekly': 3,
'monthly': 4,
'yearly': 255 })
class ForgetPolicyMatchTests(unittest.TestCase):
def setUp(self):
self.fp = obnamlib.ForgetPolicy()
def match2(self, spec, times):
rules = self.fp.parse(spec)
return [dt for i, dt in self.fp.match(rules, list(enumerate(times)))]
def test_hourly_matches(self):
h0m0 = datetime.datetime(2000, 1, 1, 0, 0)
h0m59 = datetime.datetime(2000, 1, 1, 0, 59)
h1m0 = datetime.datetime(2000, 1, 1, 1, 0)
h1m59 = datetime.datetime(2000, 1, 1, 1, 59)
self.assertEqual(self.match2('1h', [h0m0, h0m59, h1m0, h1m59]),
[h1m59])
def test_two_hourly_matches(self):
h0m0 = datetime.datetime(2000, 1, 1, 0, 0)
h0m59 = datetime.datetime(2000, 1, 1, 0, 59)
h1m0 = datetime.datetime(2000, 1, 1, 1, 0)
h1m59 = datetime.datetime(2000, 1, 1, 1, 59)
self.assertEqual(self.match2('2h', [h0m0, h0m59, h1m0, h1m59]),
[h0m59, h1m59])
def test_daily_matches(self):
d1h0 = datetime.datetime(2000, 1, 1, 0, 0)
d1h23 = datetime.datetime(2000, 1, 1, 23, 0)
d2h0 = datetime.datetime(2000, 1, 2, 0, 0)
d2h23 = datetime.datetime(2000, 1, 2, 23, 0)
self.assertEqual(self.match2('1d', [d1h0, d1h23, d2h0, d2h23]),
[d2h23])
# Not testing weekly matching, since I can't figure out to make
# a sensible test case right now.
def test_monthly_matches(self):
m1d1 = datetime.datetime(2000, 1, 1, 0, 0)
m1d28 = datetime.datetime(2000, 1, 28, 0, 0)
m2d1 = datetime.datetime(2000, 2, 1, 0, 0)
m2d28 = datetime.datetime(2000, 2, 28, 0, 0)
self.assertEqual(self.match2('1m', [m1d1, m1d28, m2d1, m2d28]),
[m2d28])
def test_yearly_matches(self):
y1m1 = datetime.datetime(2000, 1, 1, 0, 0)
y1m12 = datetime.datetime(2000, 12, 1, 0, 0)
y2m1 = datetime.datetime(2001, 1, 1, 0, 0)
y2m12 = datetime.datetime(2001, 12, 1, 0, 0)
self.assertEqual(self.match2('1y', [y1m1, y1m12, y2m1, y2m12]),
[y2m12])
def test_hourly_and_daily_match_together(self):
d1h0m0 = datetime.datetime(2000, 1, 1, 0, 0)
d1h0m1 = datetime.datetime(2000, 1, 1, 0, 1)
d2h0m0 = datetime.datetime(2000, 1, 2, 0, 0)
d2h0m1 = datetime.datetime(2000, 1, 2, 0, 1)
d3h0m0 = datetime.datetime(2000, 1, 3, 0, 0)
d3h0m1 = datetime.datetime(2000, 1, 3, 0, 1)
genlist = list(enumerate([d1h0m0, d1h0m1, d2h0m0, d2h0m1,
d3h0m0, d3h0m1]))
rules = self.fp.parse('1h,2d')
self.assertEqual([dt for genid, dt in self.fp.match(rules, genlist)],
[d2h0m1, d3h0m1])
def test_hourly_and_daily_together_when_only_daily_backups(self):
d1 = datetime.datetime(2000, 1, 1, 0, 0)
d2 = datetime.datetime(2000, 1, 2, 0, 0)
d3 = datetime.datetime(2000, 1, 3, 0, 0)
self.assertEqual(self.match2('10h,1d', [d1, d2, d3]),
[d1, d2, d3])
obnam-1.6.1/obnamlib/hooks.py 0000644 0001750 0001750 00000014023 12246357067 015773 0 ustar jenkins jenkins # Copyright (C) 2009 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
'''Hooks with callbacks.
In order to de-couple parts of the application, especially when plugins
are used, hooks can be used. A hook is a location in the application
code where plugins may want to do something. Each hook has a name and
a list of callbacks. The application defines the name and the location
where the hook will be invoked, and the plugins (or other parts of the
application) will register callbacks.
'''
import logging
import tracing
import obnamlib
class Hook(object):
'''A hook.'''
EARLY_PRIORITY = 250
DEFAULT_PRIORITY = 500
LATE_PRIORITY = 750
def __init__(self):
self.callbacks = []
self.priorities = {}
def add_callback(self, callback, priority=DEFAULT_PRIORITY):
'''Add a callback to this hook.
Return an identifier that can be used to remove this callback.
'''
if callback not in self.callbacks:
self.priorities[callback] = priority
self.callbacks.append(callback)
self.callbacks.sort(lambda x,y: cmp(self.priorities[x],
self.priorities[y]))
return callback
def call_callbacks(self, *args, **kwargs):
'''Call all callbacks with the given arguments.'''
for callback in self.callbacks:
callback(*args, **kwargs)
def remove_callback(self, callback_id):
'''Remove a specific callback.'''
if callback_id in self.callbacks:
self.callbacks.remove(callback_id)
del self.priorities[callback_id]
class MissingFilterError(obnamlib.Error):
'''Missing tag encountered reading filtered data.'''
def __init__(self, tagname):
self.tagname = tagname
logging.warning("Missing tag: " + repr(tagname))
obnamlib.Error.__init__(self, "Unknown filter tag encountered: %s" %
repr(tagname))
class FilterHook(Hook):
'''A hook which filters data through callbacks.
Every hook of this type accepts a piece of data as its first argument
Each callback gets the return value of the previous one as its
argument. The caller gets the value of the final callback.
Other arguments (with or without keywords) are passed as-is to
each callback.
'''
def __init__(self):
Hook.__init__(self)
self.bytag = {}
def add_callback(self, callback, priority=Hook.DEFAULT_PRIORITY):
assert(hasattr(callback, "tag"))
assert(hasattr(callback, "filter_read"))
assert(hasattr(callback, "filter_write"))
self.bytag[callback.tag] = callback
return Hook.add_callback(self, callback, priority)
def remove_callback(self, callback_id):
Hook.remove_callback(self, callback_id)
del self.bytag[callback_id.tag]
def call_callbacks(self, data, *args, **kwargs):
raise NotImplementedError()
def run_filter_read(self, data, *args, **kwargs):
tag, content = data.split("\0", 1)
while tag != "":
if tag not in self.bytag:
raise MissingFilterError(tag)
data = self.bytag[tag].filter_read(content, *args, **kwargs)
tag, content = data.split("\0", 1)
return content
def run_filter_write(self, data, *args, **kwargs):
tracing.trace('called')
data = "\0" + data
for filt in self.callbacks:
tracing.trace('calling %s' % filt)
new_data = filt.filter_write(data, *args, **kwargs)
assert new_data is not None, \
filt.tag + ": Returned None from filter_write()"
if data != new_data:
tracing.trace('filt.tag=%s' % filt.tag)
data = filt.tag + "\0" + new_data
tracing.trace('done')
return data
class HookManager(object):
'''Manage the set of hooks the application defines.'''
def __init__(self):
self.hooks = {}
self.filters = {}
def new(self, name):
'''Create a new hook.
If a hook with that name already exists, nothing happens.
'''
if name not in self.hooks:
self.hooks[name] = Hook()
def new_filter(self, name):
'''Create a new filter hook.'''
if name not in self.filters:
self.filters[name] = FilterHook()
def add_callback(self, name, callback, priority=Hook.DEFAULT_PRIORITY):
'''Add a callback to a named hook.'''
if name in self.hooks:
return self.hooks[name].add_callback(callback, priority)
else:
return self.filters[name].add_callback(callback, priority)
def remove_callback(self, name, callback_id):
'''Remove a specific callback from a named hook.'''
if name in self.hooks:
self.hooks[name].remove_callback(callback_id)
else:
self.filters[name].remove_callback(callback_id)
def call(self, name, *args, **kwargs):
'''Call callbacks for a named hook, using given arguments.'''
self.hooks[name].call_callbacks(*args, **kwargs)
def filter_read(self, name, *args, **kwargs):
'''Run reader filter for named filter, using given arguments.'''
return self.filters[name].run_filter_read(*args, **kwargs)
def filter_write(self, name, *args, **kwargs):
'''Run writer filter for named filter, using given arguments.'''
return self.filters[name].run_filter_write(*args, **kwargs)
obnam-1.6.1/obnamlib/hooks_tests.py 0000644 0001750 0001750 00000015550 12246357067 017223 0 ustar jenkins jenkins # Copyright (C) 2009 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import unittest
import obnamlib
import base64
class HookTests(unittest.TestCase):
def setUp(self):
self.hook = obnamlib.Hook()
def callback(self, *args, **kwargs):
self.args = args
self.kwargs = kwargs
def callback2(self, *args, **kwargs):
self.args2 = args
self.kwargs2 = kwargs
def test_has_no_callbacks_by_default(self):
self.assertEqual(self.hook.callbacks, [])
def test_adds_callback(self):
self.hook.add_callback(self.callback)
self.assertEqual(self.hook.callbacks, [self.callback])
def test_adds_callback_only_once(self):
self.hook.add_callback(self.callback)
self.hook.add_callback(self.callback)
self.assertEqual(self.hook.callbacks, [self.callback])
def test_adds_two_callbacks(self):
id1 = self.hook.add_callback(self.callback)
id2 = self.hook.add_callback(self.callback2,
obnamlib.Hook.DEFAULT_PRIORITY + 1)
self.assertEqual(self.hook.callbacks, [self.callback, self.callback2])
self.assertNotEqual(id1, id2)
def test_adds_callbacks_in_reverse_order(self):
id1 = self.hook.add_callback(self.callback)
id2 = self.hook.add_callback(self.callback2,
obnamlib.Hook.DEFAULT_PRIORITY - 1)
self.assertEqual(self.hook.callbacks, [self.callback2, self.callback])
self.assertNotEqual(id1, id2)
def test_calls_callback(self):
self.hook.add_callback(self.callback)
self.hook.call_callbacks('bar', kwarg='foobar')
self.assertEqual(self.args, ('bar',))
self.assertEqual(self.kwargs, { 'kwarg': 'foobar' })
def test_removes_callback(self):
cb_id = self.hook.add_callback(self.callback)
self.hook.remove_callback(cb_id)
self.assertEqual(self.hook.callbacks, [])
class NeverAddsFilter(object):
def __init__(self):
self.tag = "never"
def filter_read(self, data, *args, **kwargs):
self.args = args
self.kwargs = kwargs
self.wasread = True
return data
def filter_write(self, data, *args, **kwargs):
self.args = args
self.kwargs = kwargs
self.wasread = False
return data
class Base64Filter(object):
def __init__(self):
self.tag = "base64"
def filter_read(self, data, *args, **kwargs):
self.args = args
self.kwargs = kwargs
self.wasread = True
return base64.b64decode(data)
def filter_write(self, data, *args, **kwargs):
self.args = args
self.kwargs = kwargs
self.wasread = False
return base64.b64encode(data)
class FilterHookTests(unittest.TestCase):
def setUp(self):
self.hook = obnamlib.FilterHook()
def test_add_filter_ok(self):
self.hook.add_callback(NeverAddsFilter())
def test_never_filter_no_tags(self):
self.hook.add_callback(NeverAddsFilter())
self.assertEquals(self.hook.run_filter_write("foo"), "\0foo")
def test_never_filter_clean_revert(self):
self.hook.add_callback(NeverAddsFilter())
self.assertEquals(self.hook.run_filter_read("\0foo"), "foo")
def test_base64_filter_encode(self):
self.hook.add_callback(Base64Filter())
self.assertEquals(self.hook.run_filter_write("OK"), "base64\0AE9L")
def test_base64_filter_decode(self):
self.hook.add_callback(Base64Filter())
self.assertEquals(self.hook.run_filter_read("base64\0AE9L"), "OK")
def test_missing_filter_raises(self):
self.assertRaises(obnamlib.MissingFilterError,
self.hook.run_filter_read, "missing\0")
def test_missing_filter_gives_tag(self):
try:
self.hook.run_filter_read("missing\0")
except obnamlib.MissingFilterError, e:
self.assertEquals(e.tagname, "missing")
def test_can_remove_filters(self):
myfilter = NeverAddsFilter()
filterid = self.hook.add_callback(myfilter)
self.hook.remove_callback(filterid)
self.assertEquals(self.hook.callbacks, [])
def test_call_callbacks_raises(self):
self.assertRaises(NotImplementedError, self.hook.call_callbacks, "")
class HookManagerTests(unittest.TestCase):
def setUp(self):
self.hooks = obnamlib.HookManager()
self.hooks.new('foo')
def callback(self, *args, **kwargs):
self.args = args
self.kwargs = kwargs
def test_has_no_tests_initially(self):
hooks = obnamlib.HookManager()
self.assertEqual(hooks.hooks, {})
def test_adds_new_hook(self):
self.assert_(self.hooks.hooks.has_key('foo'))
def test_adds_new_filter_hook(self):
self.hooks.new_filter('bar')
self.assert_('bar' in self.hooks.filters)
def test_adds_callback(self):
self.hooks.add_callback('foo', self.callback)
self.assertEqual(self.hooks.hooks['foo'].callbacks, [self.callback])
def test_removes_callback(self):
cb_id = self.hooks.add_callback('foo', self.callback)
self.hooks.remove_callback('foo', cb_id)
self.assertEqual(self.hooks.hooks['foo'].callbacks, [])
def test_calls_callbacks(self):
self.hooks.add_callback('foo', self.callback)
self.hooks.call('foo', 'bar', kwarg='foobar')
self.assertEqual(self.args, ('bar',))
self.assertEqual(self.kwargs, { 'kwarg': 'foobar' })
def test_filter_write_returns_value_of_callbacks(self):
self.hooks.new_filter('bar')
self.assertEquals(self.hooks.filter_write('bar', "foo"), "\0foo")
def test_filter_read_returns_value_of_callbacks(self):
self.hooks.new_filter('bar')
self.assertEquals(self.hooks.filter_read('bar', "\0foo"), "foo")
def test_add_callbacks_to_filters(self):
self.hooks.new_filter('bar')
filt = NeverAddsFilter()
self.hooks.add_callback('bar', filt)
self.assertEquals(self.hooks.filters['bar'].callbacks, [filt])
def test_remove_callbacks_from_filters(self):
self.hooks.new_filter('bar')
filt = NeverAddsFilter()
self.hooks.add_callback('bar', filt)
self.hooks.remove_callback('bar', filt)
self.assertEquals(self.hooks.filters['bar'].callbacks, [])
obnam-1.6.1/obnamlib/lockmgr.py 0000644 0001750 0001750 00000005314 12246357067 016311 0 ustar jenkins jenkins # Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import os
import time
import obnamlib
class LockManager(object):
'''Lock and unlock sets of directories at once.'''
def __init__(self, fs, timeout, client):
self._fs = fs
self.timeout = timeout
data = ["[lockfile]"]
data = data + ["client=" + client]
data = data + ["pid=%d" % os.getpid()]
data = data + self._read_boot_id()
self.data = '\r\n'.join(data)
def _read_boot_id(self): # pragma: no cover
try:
with open("/proc/sys/kernel/random/boot_id", "r") as f:
boot_id = f.read().strip()
except:
return []
else:
return ["boot_id=%s" % boot_id]
def _time(self): # pragma: no cover
return time.time()
def _sleep(self): # pragma: no cover
time.sleep(1)
def sort(self, dirnames):
def bytelist(s):
return [ord(s) for s in str(s)]
return sorted(dirnames, key=bytelist)
def _lockname(self, dirname):
return os.path.join(dirname, 'lock')
def _lock_one(self, dirname):
started = self._time()
while True:
lock_name = self._lockname(dirname)
try:
self._fs.lock(lock_name, self.data)
except obnamlib.LockFail:
if self._time() - started >= self.timeout:
raise obnamlib.LockFail('Lock timeout: %s' % lock_name)
else:
return
self._sleep()
def _unlock_one(self, dirname):
self._fs.unlock(self._lockname(dirname))
def lock(self, dirnames):
'''Lock ALL the directories.'''
we_locked = []
for dirname in self.sort(dirnames):
try:
self._lock_one(dirname)
except obnamlib.LockFail:
self.unlock(we_locked)
raise
else:
we_locked.append(dirname)
def unlock(self, dirnames):
'''Unlock ALL the directories.'''
for dirname in self.sort(dirnames):
self._unlock_one(dirname)
obnam-1.6.1/obnamlib/lockmgr_tests.py 0000644 0001750 0001750 00000006160 12246357067 017533 0 ustar jenkins jenkins # Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import os
import shutil
import tempfile
import unittest
import obnamlib
class LockManagerTests(unittest.TestCase):
def locked(self, dirname):
return os.path.exists(os.path.join(dirname, 'lock'))
def fake_time(self):
self.now += 1
return self.now
def setUp(self):
self.tempdir = tempfile.mkdtemp()
self.dirnames = []
for x in ['a', 'b', 'c']:
dirname = os.path.join(self.tempdir, x)
os.mkdir(dirname)
self.dirnames.append(dirname)
self.fs = obnamlib.LocalFS(self.tempdir)
self.timeout = 10
self.now = 0
self.lm = obnamlib.LockManager(self.fs, self.timeout, '')
self.lm._time = self.fake_time
self.lm._sleep = lambda: None
def tearDown(self):
shutil.rmtree(self.tempdir)
def test_has_nothing_locked_initially(self):
for dirname in self.dirnames:
self.assertFalse(self.locked(dirname))
def test_locks_single_directory(self):
self.lm.lock([self.dirnames[0]])
self.assertTrue(self.locked(self.dirnames[0]))
def test_unlocks_single_directory(self):
self.lm.lock([self.dirnames[0]])
self.lm.unlock([self.dirnames[0]])
self.assertFalse(self.locked(self.dirnames[0]))
def test_waits_until_timeout_for_locked_directory(self):
self.lm.lock([self.dirnames[0]])
self.assertRaises(obnamlib.LockFail,
self.lm.lock, [self.dirnames[0]])
self.assertTrue(self.now >= self.timeout)
def test_notices_when_preexisting_lock_goes_away(self):
self.lm.lock([self.dirnames[0]])
self.lm._sleep = lambda: os.remove(self.lm._lockname(self.dirnames[0]))
self.lm.lock([self.dirnames[0]])
self.assertTrue(True)
def test_locks_all_directories(self):
self.lm.lock(self.dirnames)
for dirname in self.dirnames:
self.assertTrue(self.locked(dirname))
def test_unlocks_all_directories(self):
self.lm.lock(self.dirnames)
self.lm.unlock(self.dirnames)
for dirname in self.dirnames:
self.assertFalse(self.locked(dirname))
def test_does_not_lock_anything_if_one_lock_fails(self):
self.lm.lock([self.dirnames[-1]])
self.assertRaises(obnamlib.LockFail, self.lm.lock, self.dirnames)
for dirname in self.dirnames[:-1]:
self.assertFalse(self.locked(dirname))
self.assertTrue(self.locked(self.dirnames[-1]))
obnam-1.6.1/obnamlib/metadata.py 0000644 0001750 0001750 00000033420 12246357067 016432 0 ustar jenkins jenkins # Copyright (C) 2009 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import errno
import grp
import logging
import os
import pwd
import stat
import struct
import tracing
import obnamlib
metadata_verify_fields = (
'st_mode', 'st_mtime_sec', 'st_mtime_nsec',
'st_nlink', 'st_size', 'st_uid', 'groupname', 'username', 'target',
'xattr',
)
metadata_fields = metadata_verify_fields + (
'st_blocks', 'st_dev', 'st_gid', 'st_ino', 'st_atime_sec',
'st_atime_nsec', 'md5',
)
class Metadata(object):
'''Represent metadata for a filesystem entry.
The metadata for a filesystem entry (file, directory, device, ...)
consists of its stat(2) result, plus ACL and xattr.
This class represents them as fields.
We do not store all stat(2) fields. Here's a commentary on all fields:
field? stored? why
st_atime_sec yes mutt compares atime, mtime to see ifmsg is new
st_atime_nsec yes mutt compares atime, mtime to see ifmsg is new
st_blksize no no way to restore, not useful backed up
st_blocks yes should restore create holes in file?
st_ctime no no way to restore, not useful backed up
st_dev yes used to restore hardlinks
st_gid yes used to restore group ownership
st_ino yes used to restore hardlinks
st_mode yes used to restore permissions
st_mtime_sec yes used to restore mtime
st_mtime_nsec yes used to restore mtime
st_nlink yes used to restore hardlinks
st_rdev no no use (correct me if I'm wrong about this)
st_size yes user needs it to see size of file in backup
st_uid yes used to restored ownership
The field 'target' stores the target of a symlink.
Additionally, the fields 'groupname' and 'username' are stored. They
contain the textual names that correspond to st_gid and st_uid. When
restoring, the names will be preferred by default.
The 'md5' field optionally stores the whole-file checksum for the file.
The 'xattr' field optionally stores extended attributes encoded as
a binary blob.
'''
def __init__(self, **kwargs):
for field in metadata_fields:
setattr(self, field, None)
for field, value in kwargs.iteritems():
setattr(self, field, value)
def isdir(self):
return self.st_mode is not None and stat.S_ISDIR(self.st_mode)
def islink(self):
return self.st_mode is not None and stat.S_ISLNK(self.st_mode)
def isfile(self):
return self.st_mode is not None and stat.S_ISREG(self.st_mode)
def __repr__(self): # pragma: no cover
fields = ', '.join('%s=%s' % (k, getattr(self, k))
for k in metadata_fields)
return 'Metadata(%s)' % fields
def __cmp__(self, other):
for field in metadata_fields:
ours = getattr(self, field)
theirs = getattr(other, field)
if ours == theirs:
continue
if ours < theirs:
return -1
if ours > theirs:
return +1
return 0
# Caching versions of username/groupname lookups.
# These work on the assumption that the mappings from uid/gid do not
# change during the runtime of the backup.
_uid_to_username = {}
def _cached_getpwuid(uid): # pragma: no cover
if uid not in _uid_to_username:
_uid_to_username[uid] = pwd.getpwuid(uid)
return _uid_to_username[uid]
_gid_to_groupname = {}
def _cached_getgrgid(gid): # pragma: no cover
if gid not in _gid_to_groupname:
_gid_to_groupname[gid] = grp.getgrgid(gid)
return _gid_to_groupname[gid]
def get_xattrs_as_blob(fs, filename): # pragma: no cover
tracing.trace('filename=%s' % filename)
try:
names = fs.llistxattr(filename)
except (OSError, IOError), e:
if e.errno in (errno.EOPNOTSUPP, errno.EACCES):
return None
raise
tracing.trace('names=%s' % repr(names))
if not names:
return None
values = []
for name in names[:]:
tracing.trace('trying name %s' % repr(name))
try:
value = fs.lgetxattr(filename, name)
except OSError, e:
# On btrfs, at least, this can happen: the filesystem returns
# a list of attribute names, but then fails when looking up
# the value for one or more of the names. We pretend that the
# name was never returned in that case.
#
# Obviously this can happen due to race conditions as well.
if e.errno == errno.ENODATA:
names.remove(name)
logging.warning(
'%s has extended attribute named %s without value, '
'ignoring attribute' % (filename, name))
else:
raise
else:
tracing.trace('lgetxattr(%s)=%s' % (name, value))
values.append(value)
assert len(names) == len(values)
name_blob = ''.join('%s\0' % name for name in names)
lengths = [len(v) for v in values]
fmt = '!' + 'Q' * len(values)
value_blob = struct.pack(fmt, *lengths) + ''.join(values)
return ('%s%s%s' %
(struct.pack('!Q', len(name_blob)),
name_blob,
value_blob))
def set_xattrs_from_blob(fs, filename, blob): # pragma: no cover
sizesize = struct.calcsize('!Q')
name_blob_size = struct.unpack('!Q', blob[:sizesize])[0]
name_blob = blob[sizesize : sizesize + name_blob_size]
value_blob = blob[sizesize + name_blob_size : ]
names = [s for s in name_blob.split('\0')[:-1]]
fmt = '!' + 'Q' * len(names)
lengths_size = sizesize * len(names)
lengths = struct.unpack(fmt, value_blob[:lengths_size])
pos = lengths_size
for i, name in enumerate(names):
value = value_blob[pos:pos + lengths[i]]
pos += lengths[i]
fs.lsetxattr(filename, name, value)
def read_metadata(fs, filename, st=None, getpwuid=None, getgrgid=None):
'''Return object detailing metadata for a filesystem entry.'''
metadata = Metadata()
stat_result = st or fs.lstat(filename)
for field in metadata_fields:
if field.startswith('st_') and hasattr(stat_result, field):
setattr(metadata, field, getattr(stat_result, field))
if stat.S_ISLNK(stat_result.st_mode):
metadata.target = fs.readlink(filename)
else:
metadata.target = ''
getgrgid = getgrgid or _cached_getgrgid
try:
metadata.groupname = getgrgid(metadata.st_gid)[0]
except KeyError:
metadata.groupname = None
getpwuid = getpwuid or _cached_getpwuid
try:
metadata.username = getpwuid(metadata.st_uid)[0]
except KeyError:
metadata.username = None
metadata.xattr = get_xattrs_as_blob(fs, filename)
return metadata
def set_metadata(fs, filename, metadata, getuid=None):
'''Set metadata for a filesystem entry.
We only set metadata that can sensibly be set: st_atime, st_mode,
st_mtime. We also attempt to set ownership (st_gid, st_uid), but
only if we're running as root. We ignore the username, groupname
fields: we assume the caller will change st_uid, st_gid accordingly
if they want to mess with things. This makes the user take care
of error situations and looking up user preferences.
'''
symlink = stat.S_ISLNK(metadata.st_mode)
if symlink:
fs.symlink(metadata.target, filename)
# Set owner before mode, so that a setuid bit does not get reset.
getuid = getuid or os.getuid
if getuid() == 0:
fs.lchown(filename, metadata.st_uid, metadata.st_gid)
# If we are not the owner, and not root, do not restore setuid/setgid.
mode = metadata.st_mode
if getuid() not in (0, metadata.st_uid): # pragma: no cover
mode = mode & (~stat.S_ISUID)
mode = mode & (~stat.S_ISGID)
if symlink:
fs.chmod_symlink(filename, mode)
else:
fs.chmod_not_symlink(filename, mode)
if metadata.xattr: # pragma: no cover
set_xattrs_from_blob(fs, filename, metadata.xattr)
fs.lutimes(filename, metadata.st_atime_sec, metadata.st_atime_nsec,
metadata.st_mtime_sec, metadata.st_mtime_nsec)
metadata_format = struct.Struct('!Q' + # flags
'Q' + # st_mode
'qQ' + # st_mtime_sec and _nsec
'qQ' + # st_atime_sec and _nsec
'Q' + # st_nlink
'Q' + # st_size
'Q' + # st_uid
'Q' + # st_gid
'Q' + # st_dev
'Q' + # st_ino
'Q' + # st_blocks
'Q' + # len of groupname
'Q' + # len of username
'Q' + # len of symlink target
'Q' + # len of md5
'Q' + # len of xattr
'')
def encode_metadata(metadata):
flags = 0
for i, name in enumerate(obnamlib.metadata_fields):
if getattr(metadata, name) is not None:
flags |= (1 << i)
try:
packed = metadata_format.pack(flags,
metadata.st_mode or 0,
metadata.st_mtime_sec or 0,
metadata.st_mtime_nsec or 0,
metadata.st_atime_sec or 0,
metadata.st_atime_nsec or 0,
metadata.st_nlink or 0,
metadata.st_size or 0,
metadata.st_uid or 0,
metadata.st_gid or 0,
metadata.st_dev or 0,
metadata.st_ino or 0,
metadata.st_blocks or 0,
len(metadata.groupname or ''),
len(metadata.username or ''),
len(metadata.target or ''),
len(metadata.md5 or ''),
len(metadata.xattr or ''))
except TypeError, e: # pragma: no cover
logging.error('ERROR: Packing error due to %s' % str(e))
logging.error('ERROR: st_mode=%s' % repr(metadata.st_mode))
logging.error('ERROR: st_mtime_sec=%s' % repr(metadata.st_mtime_sec))
logging.error('ERROR: st_mtime_nsec=%s' % repr(metadata.st_mtime_nsec))
logging.error('ERROR: st_atime_sec=%s' % repr(metadata.st_atime_sec))
logging.error('ERROR: st_atime_nsec=%s' % repr(metadata.st_atime_nsec))
logging.error('ERROR: st_nlink=%s' % repr(metadata.st_nlink))
logging.error('ERROR: st_size=%s' % repr(metadata.st_size))
logging.error('ERROR: st_uid=%s' % repr(metadata.st_uid))
logging.error('ERROR: st_gid=%s' % repr(metadata.st_gid))
logging.error('ERROR: st_dev=%s' % repr(metadata.st_dev))
logging.error('ERROR: st_ino=%s' % repr(metadata.st_ino))
logging.error('ERROR: st_blocks=%s' % repr(metadata.st_blocks))
logging.error('ERROR: groupname=%s' % repr(metadata.groupname))
logging.error('ERROR: username=%s' % repr(metadata.username))
logging.error('ERROR: target=%s' % repr(metadata.target))
logging.error('ERROR: md5=%s' % repr(metadata.md5))
logging.error('ERROR: xattr=%s' % repr(metadata.xattr))
raise
return (packed +
(metadata.groupname or '') +
(metadata.username or '') +
(metadata.target or '') +
(metadata.md5 or '') +
(metadata.xattr or ''))
def decode_metadata(encoded):
items = metadata_format.unpack_from(encoded)
flags = items[0]
pos = [1, metadata_format.size]
metadata = obnamlib.Metadata()
def is_present(field):
i = obnamlib.metadata_fields.index(field)
return (flags & (1 << i)) != 0
def decode(field, num_items, inc_offset, getvalue):
if is_present(field):
value = getvalue(pos[0], pos[1])
setattr(metadata, field, value)
if inc_offset:
pos[1] += len(value)
pos[0] += num_items
def decode_integer(field):
decode(field, 1, False, lambda i, o: items[i])
def decode_string(field):
decode(field, 1, True, lambda i, o: encoded[o:o + items[i]])
decode_integer('st_mode')
decode_integer('st_mtime_sec')
decode_integer('st_mtime_nsec')
decode_integer('st_atime_sec')
decode_integer('st_atime_nsec')
decode_integer('st_nlink')
decode_integer('st_size')
decode_integer('st_uid')
decode_integer('st_gid')
decode_integer('st_dev')
decode_integer('st_ino')
decode_integer('st_blocks')
decode_string('groupname')
decode_string('username')
decode_string('target')
decode_string('md5')
decode_string('xattr')
return metadata
obnam-1.6.1/obnamlib/metadata_tests.py 0000644 0001750 0001750 00000025532 12246357067 017661 0 ustar jenkins jenkins # Copyright (C) 2009 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import os
import stat
import tempfile
import unittest
import platform
import obnamlib
class FakeFS(object):
def __init__(self):
self.st_atime_sec = 1
self.st_atime_nsec = 11
self.st_blocks = 2
self.st_dev = 3
self.st_gid = 4
self.st_ino = 5
self.st_mode = 6
self.st_mtime_sec = 7
self.st_mtime_nsec = 71
self.st_nlink = 8
self.st_size = 9
self.st_uid = 10
self.groupname = 'group'
self.username = 'user'
self.target = 'target'
def lstat(self, filename):
return self
def readlink(self, filename):
return 'target'
def getpwuid(self, uid):
return (self.username, None, self.st_uid, self.st_gid,
None, None, None)
def getgrgid(self, gid):
return (self.groupname, None, self.st_gid, None)
def fail_getpwuid(self, uid):
raise KeyError(uid)
def fail_getgrgid(self, gid):
raise KeyError(gid)
def llistxattr(self, filename):
return []
class MetadataTests(unittest.TestCase):
def test_sets_mtime_from_kwarg(self):
metadata = obnamlib.Metadata(st_mtime_sec=123)
self.assertEqual(metadata.st_mtime_sec, 123)
def test_isdir_returns_false_for_regular_file(self):
metadata = obnamlib.Metadata(st_mode=stat.S_IFREG)
self.assertFalse(metadata.isdir())
def test_isdir_returns_true_for_directory(self):
metadata = obnamlib.Metadata(st_mode=stat.S_IFDIR)
self.assert_(metadata.isdir())
def test_isdir_returns_false_when_st_mode_is_not_set(self):
metadata = obnamlib.Metadata()
self.assertFalse(metadata.isdir())
def test_islink_returns_false_for_regular_file(self):
metadata = obnamlib.Metadata(st_mode=stat.S_IFREG)
self.assertFalse(metadata.islink())
def test_islink_returns_true_for_symlink(self):
metadata = obnamlib.Metadata(st_mode=stat.S_IFLNK)
self.assert_(metadata.islink())
def test_islink_returns_false_when_st_mode_is_not_set(self):
metadata = obnamlib.Metadata()
self.assertFalse(metadata.islink())
def test_isfile_returns_true_for_regular_file(self):
metadata = obnamlib.Metadata(st_mode=stat.S_IFREG)
self.assert_(metadata.isfile())
def test_isfile_returns_false_when_st_mode_is_not_set(self):
metadata = obnamlib.Metadata()
self.assertFalse(metadata.isfile())
def test_has_no_md5_by_default(self):
metadata = obnamlib.Metadata()
self.assertEqual(metadata.md5, None)
def test_sets_md5(self):
metadata = obnamlib.Metadata(md5='checksum')
self.assertEqual(metadata.md5, 'checksum')
def test_is_equal_to_itself(self):
metadata = obnamlib.Metadata(st_mode=stat.S_IFREG)
self.assertEqual(metadata, metadata)
def test_less_than_works(self):
m1 = obnamlib.Metadata(st_size=1)
m2 = obnamlib.Metadata(st_size=2)
self.assert_(m1 < m2)
def test_greater_than_works(self):
m1 = obnamlib.Metadata(st_size=1)
m2 = obnamlib.Metadata(st_size=2)
self.assert_(m2 > m1)
class ReadMetadataTests(unittest.TestCase):
def setUp(self):
self.fakefs = FakeFS()
def test_returns_stat_fields_correctly(self):
metadata = obnamlib.read_metadata(self.fakefs, 'foo',
getpwuid=self.fakefs.getpwuid,
getgrgid=self.fakefs.getgrgid)
fields = ['st_atime_sec','st_atime_nsec', 'st_blocks', 'st_dev',
'st_gid', 'st_ino', 'st_mode', 'st_mtime_sec',
'st_mtime_nsec', 'st_nlink', 'st_size', 'st_uid',
'groupname', 'username']
for field in fields:
self.assertEqual(getattr(metadata, field),
getattr(self.fakefs, field),
field)
def test_returns_symlink_fields_correctly(self):
self.fakefs.st_mode |= stat.S_IFLNK;
metadata = obnamlib.read_metadata(self.fakefs, 'foo',
getpwuid=self.fakefs.getpwuid,
getgrgid=self.fakefs.getgrgid)
fields = ['st_mode', 'target']
for field in fields:
self.assertEqual(getattr(metadata, field),
getattr(self.fakefs, field),
field)
def test_reads_username_as_None_if_lookup_fails(self):
metadata = obnamlib.read_metadata(self.fakefs, 'foo',
getpwuid=self.fakefs.fail_getpwuid,
getgrgid=self.fakefs.fail_getgrgid)
self.assertEqual(metadata.username, None)
class SetMetadataTests(unittest.TestCase):
def setUp(self):
self.metadata = obnamlib.Metadata()
self.metadata.st_atime_sec = 12765
self.metadata.st_atime_nsec = 0
self.metadata.st_mode = 42 | stat.S_IFREG
self.metadata.st_mtime_sec = 10**9
self.metadata.st_mtime_nsec = 0
self.metadata.st_uid = 1234
self.metadata.st_gid = 5678
fd, self.filename = tempfile.mkstemp()
os.close(fd)
# On some systems (e.g. FreeBSD) /tmp is apparently setgid and
# default gid of files is therefore not the user's gid.
os.chown(self.filename, os.getuid(), os.getgid())
self.fs = obnamlib.LocalFS('/')
self.fs.connect()
self.uid_set = None
self.gid_set = None
self.fs.lchown = self.fake_lchown
obnamlib.set_metadata(self.fs, self.filename, self.metadata)
self.st = os.stat(self.filename)
def tearDown(self):
self.fs.close()
os.remove(self.filename)
def fake_lchown(self, filename, uid, gid):
self.uid_set = uid
self.gid_set = gid
def test_sets_atime(self):
self.assertEqual(self.st.st_atime, self.metadata.st_atime_sec)
def test_sets_mode(self):
self.assertEqual(self.st.st_mode, self.metadata.st_mode)
def test_sets_mtime(self):
self.assertEqual(self.st.st_mtime, self.metadata.st_mtime_sec)
def test_does_not_set_uid_when_not_running_as_root(self):
self.assertEqual(self.st.st_uid, os.getuid())
def test_does_not_set_gid_when_not_running_as_root(self):
self.assertEqual(self.st.st_gid, os.getgid())
def test_sets_uid_when_running_as_root(self):
obnamlib.set_metadata(self.fs, self.filename, self.metadata,
getuid=lambda: 0)
self.assertEqual(self.uid_set, self.metadata.st_uid)
def test_sets_gid_when_running_as_root(self):
obnamlib.set_metadata(self.fs, self.filename, self.metadata,
getuid=lambda: 0)
self.assertEqual(self.gid_set, self.metadata.st_gid)
def test_sets_symlink_target(self):
self.fs.remove(self.filename)
self.metadata.st_mode = 0777 | stat.S_IFLNK;
self.metadata.target = 'target'
obnamlib.set_metadata(self.fs, self.filename, self.metadata)
self.assertEqual(self.fs.readlink(self.filename), 'target')
def test_sets_symlink_mtime_perms(self):
self.fs.remove(self.filename)
self.metadata.st_mode = 0777 | stat.S_IFLNK;
self.metadata.target = 'target'
obnamlib.set_metadata(self.fs, self.filename, self.metadata)
st = os.lstat(self.filename)
self.assertEqual(st.st_mode, self.metadata.st_mode)
self.assertEqual(st.st_mtime, self.metadata.st_mtime_sec)
class MetadataCodingTests(unittest.TestCase):
def equal(self, meta1, meta2):
for name in dir(meta1):
if name in obnamlib.metadata.metadata_fields:
value1 = getattr(meta1, name)
value2 = getattr(meta2, name)
self.assertEqual(
value1,
value2,
'attribute %s must be equal (%s vs %s)' %
(name, value1, value2))
def test_round_trip(self):
metadata = obnamlib.metadata.Metadata(st_mode=1,
st_mtime_sec=2,
st_mtime_nsec=12756,
st_nlink=3,
st_size=4,
st_uid=5,
st_blocks=6,
st_dev=7,
st_gid=8,
st_ino=9,
st_atime_sec=10,
st_atime_nsec=123,
groupname='group',
username='user',
target='target',
md5='checksum')
encoded = obnamlib.encode_metadata(metadata)
decoded = obnamlib.decode_metadata(encoded)
self.equal(metadata, decoded)
def test_round_trip_for_None_values(self):
metadata = obnamlib.metadata.Metadata()
encoded = obnamlib.encode_metadata(metadata)
decoded = obnamlib.decode_metadata(encoded)
for name in dir(metadata):
if name in obnamlib.metadata.metadata_fields:
self.assertEqual(getattr(decoded, name), None,
'attribute %s must be None' % name)
def test_round_trip_for_maximum_values(self):
unsigned_max = 2**64 - 1
signed_max = 2**63 - 1
metadata = obnamlib.metadata.Metadata(
st_mode=unsigned_max,
st_mtime_sec=signed_max,
st_mtime_nsec=unsigned_max,
st_nlink=unsigned_max,
st_size=signed_max,
st_uid=unsigned_max,
st_blocks=signed_max,
st_dev=unsigned_max,
st_gid=unsigned_max,
st_ino=unsigned_max,
st_atime_sec=signed_max,
st_atime_nsec=unsigned_max)
encoded = obnamlib.encode_metadata(metadata)
decoded = obnamlib.decode_metadata(encoded)
self.equal(metadata, decoded)
obnam-1.6.1/obnamlib/pluginbase.py 0000644 0001750 0001750 00000001505 12246357067 017002 0 ustar jenkins jenkins # Copyright (C) 2009 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import obnamlib
class ObnamPlugin(obnamlib.pluginmgr.Plugin):
'''Base class for plugins in Obnam.'''
def __init__(self, app):
self.app = app
obnam-1.6.1/obnamlib/pluginbase_tests.py 0000644 0001750 0001750 00000002005 12246357067 020220 0 ustar jenkins jenkins # Copyright (C) 2009 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import unittest
import obnamlib
class FakeApp(object):
def __init__(self):
self.hooks = self
class ObnamPluginTests(unittest.TestCase):
def setUp(self):
self.fakeapp = FakeApp()
self.plugin = obnamlib.ObnamPlugin(self.fakeapp)
def test_has_an_app(self):
self.assertEqual(self.plugin.app, self.fakeapp)
obnam-1.6.1/obnamlib/pluginmgr.py 0000644 0001750 0001750 00000017717 12246357067 016671 0 ustar jenkins jenkins # Copyright (C) 2009 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
'''A generic plugin manager.
The plugin manager finds files with plugins and loads them. It looks
for plugins in a number of locations specified by the caller. To add
a plugin to be loaded, it is enough to put it in one of the locations,
and name it *_plugin.py. (The naming convention is to allow having
other modules as well, such as unit tests, in the same locations.)
'''
import imp
import inspect
import os
class Plugin(object):
'''Base class for plugins.
A plugin MUST NOT have any side effects when it is instantiated.
This is necessary so that it can be safely loaded by unit tests,
and so that a user interface can allow the user to disable it,
even if it is installed, with no ill effects. Any side effects
that would normally happen should occur in the enable() method,
and be undone by the disable() method. These methods must be
callable any number of times.
The subclass MAY define the following attributes:
* name
* description
* version
* required_application_version
name is the user-visible identifier for the plugin. It defaults
to the plugin's classname.
description is the user-visible description of the plugin. It may
be arbitrarily long, and can use pango markup language. Defaults
to the empty string.
version is the plugin version. Defaults to '0.0.0'. It MUST be a
sequence of integers separated by periods. If several plugins with
the same name are found, the newest version is used. Versions are
compared integer by integer, starting with the first one, and a
missing integer treated as a zero. If two plugins have the same
version, either might be used.
required_application_version gives the version of the minimal
application version the plugin is written for. The first integer
must match exactly: if the application is version 2.3.4, the
plugin's required_application_version must be at least 2 and
at most 2.3.4 to be loaded. Defaults to 0.
'''
@property
def name(self):
return self.__class__.__name__
@property
def description(self):
return ''
@property
def version(self):
return '0.0.0'
@property
def required_application_version(self):
return '0.0.0'
def enable_wrapper(self):
'''Enable plugin.
The plugin manager will call this method, which then calls the
enable method. Plugins should implement the enable method.
The wrapper method is there to allow an application to provide
an extended base class that does some application specific
magic when plugins are enabled or disabled.
'''
self.enable()
def disable_wrapper(self):
'''Corresponds to enable_wrapper, but for disabling a plugin.'''
self.disable()
def enable(self):
'''Enable the plugin.'''
raise NotImplemented()
def disable(self):
'''Disable the plugin.'''
raise NotImplemented()
class PluginManager(object):
'''Manage plugins.
This class finds and loads plugins, and keeps a list of them that
can be accessed in various ways.
The locations are set via the locations attribute, which is a list.
When a plugin is loaded, an instance of its class is created. This
instance is initialized using normal and keyword arguments specified
in the plugin manager attributes plugin_arguments and
plugin_keyword_arguments.
The version of the application using the plugin manager is set via
the application_version attribute. This defaults to '0.0.0'.
'''
suffix = '_plugin.py'
def __init__(self):
self.locations = []
self._plugins = None
self._plugin_files = None
self.plugin_arguments = []
self.plugin_keyword_arguments = {}
self.application_version = '0.0.0'
@property
def plugin_files(self):
if self._plugin_files is None:
self._plugin_files = self.find_plugin_files()
return self._plugin_files
@property
def plugins(self):
if self._plugins is None:
self._plugins = self.load_plugins()
return self._plugins
def __getitem__(self, name):
for plugin in self.plugins:
if plugin.name == name:
return plugin
raise KeyError('Plugin %s is not known' % name)
def find_plugin_files(self):
'''Find files that may contain plugins.
This finds all files named *_plugin.py in all locations.
The returned list is sorted.
'''
pathnames = []
for location in self.locations:
try:
basenames = os.listdir(location)
except os.error:
continue
for basename in basenames:
s = os.path.join(location, basename)
if s.endswith(self.suffix) and os.path.exists(s):
pathnames.append(s)
return sorted(pathnames)
def load_plugins(self):
'''Load plugins from all plugin files.'''
plugins = dict()
for pathname in self.plugin_files:
for plugin in self.load_plugin_file(pathname):
if plugin.name in plugins:
p = plugins[plugin.name]
if self.is_older(p.version, plugin.version):
plugins[plugin.name] = plugin
else:
plugins[plugin.name] = plugin
return plugins.values()
def is_older(self, version1, version2):
'''Is version1 older than version2?'''
return self.parse_version(version1) < self.parse_version(version2)
def load_plugin_file(self, pathname):
'''Return plugin classes in a plugin file.'''
name, ext = os.path.splitext(os.path.basename(pathname))
f = file(pathname, 'r')
module = imp.load_module(name, f, pathname,
('.py', 'r', imp.PY_SOURCE))
f.close()
plugins = []
for dummy, member in inspect.getmembers(module, inspect.isclass):
if issubclass(member, Plugin):
p = member(*self.plugin_arguments,
**self.plugin_keyword_arguments)
if self.compatible_version(p.required_application_version):
plugins.append(p)
return plugins
def compatible_version(self, required_application_version):
'''Check that the plugin is version-compatible with the application.
This checks the plugin's required_application_version against
the declared application version and returns True if they are
compatible, and False if not.
'''
req = self.parse_version(required_application_version)
app = self.parse_version(self.application_version)
return app[0] == req[0] and app >= req
def parse_version(self, version):
'''Parse a string represenation of a version into list of ints.'''
return [int(s) for s in version.split('.')]
def enable_plugins(self, plugins=None):
'''Enable all or selected plugins.'''
for plugin in plugins or self.plugins:
plugin.enable_wrapper()
def disable_plugins(self, plugins=None):
'''Disable all or selected plugins.'''
for plugin in plugins or self.plugins:
plugin.disable_wrapper()
obnam-1.6.1/obnamlib/pluginmgr_tests.py 0000644 0001750 0001750 00000012124 12246357067 020076 0 ustar jenkins jenkins # Copyright (C) 2009 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import unittest
from pluginmgr import Plugin, PluginManager
class PluginTests(unittest.TestCase):
def setUp(self):
self.plugin = Plugin()
def test_name_is_class_name(self):
self.assertEqual(self.plugin.name, 'Plugin')
def test_description_is_empty_string(self):
self.assertEqual(self.plugin.description, '')
def test_version_is_zeroes(self):
self.assertEqual(self.plugin.version, '0.0.0')
def test_required_application_version_is_zeroes(self):
self.assertEqual(self.plugin.required_application_version, '0.0.0')
def test_enable_raises_exception(self):
self.assertRaises(Exception, self.plugin.enable)
def test_disable_raises_exception(self):
self.assertRaises(Exception, self.plugin.disable)
def test_enable_wrapper_calls_enable(self):
self.plugin.enable = lambda: setattr(self, 'enabled', True)
self.plugin.enable_wrapper()
self.assert_(self.enabled, True)
def test_disable_wrapper_calls_disable(self):
self.plugin.disable = lambda: setattr(self, 'disabled', True)
self.plugin.disable_wrapper()
self.assert_(self.disabled, True)
class PluginManagerInitialStateTests(unittest.TestCase):
def setUp(self):
self.pm = PluginManager()
def test_locations_is_empty_list(self):
self.assertEqual(self.pm.locations, [])
def test_plugins_is_empty_list(self):
self.assertEqual(self.pm.plugins, [])
def test_application_version_is_zeroes(self):
self.assertEqual(self.pm.application_version, '0.0.0')
def test_plugin_files_is_empty(self):
self.assertEqual(self.pm.plugin_files, [])
def test_plugin_arguments_is_empty(self):
self.assertEqual(self.pm.plugin_arguments, [])
def test_plugin_keyword_arguments_is_empty(self):
self.assertEqual(self.pm.plugin_keyword_arguments, {})
class PluginManagerTests(unittest.TestCase):
def setUp(self):
self.pm = PluginManager()
self.pm.locations = ['test-plugins', 'not-exist']
self.pm.plugin_arguments = ('fooarg',)
self.pm.plugin_keyword_arguments = { 'bar': 'bararg' }
self.files = sorted(['test-plugins/hello_plugin.py',
'test-plugins/aaa_hello_plugin.py',
'test-plugins/oldhello_plugin.py',
'test-plugins/wrongversion_plugin.py'])
def test_finds_the_right_plugin_files(self):
self.assertEqual(self.pm.find_plugin_files(), self.files)
def test_plugin_files_attribute_implicitly_searches(self):
self.assertEqual(self.pm.plugin_files, self.files)
def test_loads_hello_plugin(self):
plugins = self.pm.load_plugins()
self.assertEqual(len(plugins), 1)
self.assertEqual(plugins[0].name, 'Hello')
def test_plugins_attribute_implicitly_searches(self):
self.assertEqual(len(self.pm.plugins), 1)
self.assertEqual(self.pm.plugins[0].name, 'Hello')
def test_initializes_hello_with_correct_args(self):
plugin = self.pm['Hello']
self.assertEqual(plugin.foo, 'fooarg')
self.assertEqual(plugin.bar, 'bararg')
def test_raises_keyerror_for_unknown_plugin(self):
self.assertRaises(KeyError, self.pm.__getitem__, 'Hithere')
def test_enable_plugins_enables_all_plugins(self):
enabled = set()
for plugin in self.pm.plugins:
plugin.enable = lambda: enabled.add(plugin)
self.pm.enable_plugins()
self.assertEqual(enabled, set(self.pm.plugins))
def test_disable_plugins_disables_all_plugins(self):
disabled = set()
for plugin in self.pm.plugins:
plugin.disable = lambda: disabled.add(plugin)
self.pm.disable_plugins()
self.assertEqual(disabled, set(self.pm.plugins))
class PluginManagerCompatibleApplicationVersionTests(unittest.TestCase):
def setUp(self):
self.pm = PluginManager()
self.pm.application_version = '1.2.3'
def test_rejects_zero(self):
self.assertFalse(self.pm.compatible_version('0'))
def test_rejects_two(self):
self.assertFalse(self.pm.compatible_version('2'))
def test_rejects_one_two_four(self):
self.assertFalse(self.pm.compatible_version('1.2.4'))
def test_accepts_one(self):
self.assert_(self.pm.compatible_version('1'))
def test_accepts_one_two_three(self):
self.assert_(self.pm.compatible_version('1.2.3'))
obnam-1.6.1/obnamlib/plugins/ 0000755 0001750 0001750 00000000000 12246357067 015757 5 ustar jenkins jenkins obnam-1.6.1/obnamlib/plugins/__init__.py 0000644 0001750 0001750 00000000000 12246357067 020056 0 ustar jenkins jenkins obnam-1.6.1/obnamlib/plugins/backup_plugin.py 0000644 0001750 0001750 00000074552 12246357067 021171 0 ustar jenkins jenkins # Copyright (C) 2009, 2010, 2011, 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import errno
import gc
import logging
import os
import re
import stat
import sys
import time
import traceback
import tracing
import ttystatus
import obnamlib
import larch
class ChunkidPool(object):
'''Checksum/chunkid mappings that are pending an upload to shared trees.'''
def __init__(self):
self.clear()
def add(self, chunkid, checksum):
if checksum not in self._mapping:
self._mapping[checksum] = []
self._mapping[checksum].append(chunkid)
def __contains__(self, checksum):
return checksum in self._mapping
def get(self, checksum):
return self._mapping.get(checksum, [])
def clear(self):
self._mapping = {}
def __iter__(self):
for checksum in self._mapping.keys():
for chunkid in self._mapping[checksum]:
yield chunkid, checksum
class BackupProgress(object):
def __init__(self, ts):
self.file_count = 0
self.backed_up_count = 0
self.uploaded_bytes = 0
self.scanned_bytes = 0
self.started = None
self._ts = ts
self._ts['current-file'] = ''
self._ts['scanned-bytes'] = 0
self._ts['uploaded-bytes'] = 0
self._ts.format('%ElapsedTime() '
'%Counter(current-file) '
'files '
'%ByteSize(scanned-bytes) scanned: '
'%String(what)')
def clear(self):
self._ts.clear()
def error(self, msg):
self._ts.error(msg)
def what(self, what_what):
if self.started is None:
self.started = time.time()
self._ts['what'] = what_what
self._ts.flush()
def update_progress(self):
self._ts['not-shown'] = 'not shown'
def update_progress_with_file(self, filename, metadata):
self._ts['what'] = filename
self._ts['current-file'] = filename
self.file_count += 1
def update_progress_with_scanned(self, amount):
self.scanned_bytes += amount
self._ts['scanned-bytes'] = self.scanned_bytes
def update_progress_with_upload(self, amount):
self.uploaded_bytes += amount
self._ts['uploaded-bytes'] = self.uploaded_bytes
def update_progress_with_removed_checkpoint(self, gen):
self._ts['checkpoint'] = gen
def report_stats(self):
size_table = [
(1024**4, 'TiB'),
(1024**3, 'GiB'),
(1024**2, 'MiB'),
(1024**1, 'KiB'),
(0, 'B')
]
for size_base, size_unit in size_table:
if self.uploaded_bytes >= size_base:
if size_base > 0:
size_amount = float(self.uploaded_bytes) / float(size_base)
else:
size_amount = float(self.uploaded_bytes)
break
speed_table = [
(1024**3, 'GiB/s'),
(1024**2, 'MiB/s'),
(1024**1, 'KiB/s'),
(0, 'B/s')
]
duration = time.time() - self.started
speed = float(self.uploaded_bytes) / duration
for speed_base, speed_unit in speed_table:
if speed >= speed_base:
if speed_base > 0:
speed_amount = speed / speed_base
else:
speed_amount = speed
break
duration_string = ''
seconds = duration
if seconds >= 3600:
duration_string += '%dh' % int(seconds/3600)
seconds %= 3600
if seconds >= 60:
duration_string += '%dm' % int(seconds/60)
seconds %= 60
if seconds > 0:
duration_string += '%ds' % round(seconds)
logging.info('Backup performance statistics:')
logging.info('* files found: %s' % self.file_count)
logging.info('* files backed up: %s' % self.backed_up_count)
logging.info('* uploaded data: %s bytes (%s %s)' %
(self.uploaded_bytes, size_amount, size_unit))
logging.info('* duration: %s s' % duration)
logging.info('* average speed: %s %s' % (speed_amount, speed_unit))
self._ts.notify(
'Backed up %d files (of %d found), '
'uploaded %.1f %s in %s at %.1f %s average speed' %
(self.backed_up_count, self.file_count,
size_amount, size_unit,
duration_string, speed_amount, speed_unit))
class BackupPlugin(obnamlib.ObnamPlugin):
def enable(self):
backup_group = obnamlib.option_group['backup'] = 'Backing up'
perf_group = obnamlib.option_group['perf']
self.app.add_subcommand('backup', self.backup,
arg_synopsis='[DIRECTORY]...')
self.app.settings.string_list(['root'], 'what to backup')
self.app.settings.string_list(['exclude'],
'regular expression for pathnames to '
'exclude from backup (can be used multiple '
'times)',
group=backup_group)
self.app.settings.boolean(['exclude-caches'],
'exclude directories (and their subdirs) '
'that contain a CACHEDIR.TAG file',
group=backup_group)
self.app.settings.boolean(['one-file-system'],
'exclude directories (and their subdirs) '
'that are in a different filesystem',
group=backup_group)
self.app.settings.bytesize(['checkpoint'],
'make a checkpoint after a given SIZE '
'(%default)',
metavar='SIZE',
default=1024**3,
group=backup_group)
self.app.settings.integer(['chunkids-per-group'],
'encode NUM chunk ids per group (%default)',
metavar='NUM',
default=obnamlib.DEFAULT_CHUNKIDS_PER_GROUP,
group=perf_group)
self.app.settings.choice(['deduplicate'],
['fatalist', 'never', 'verify'],
'find duplicate data in backed up data '
'and store it only once; three modes '
'are available: never de-duplicate, '
'verify that no hash collisions happen, '
'or (the default) fatalistically accept '
'the risk of collisions',
metavar='MODE',
group=backup_group)
self.app.settings.boolean(['leave-checkpoints'],
'leave checkpoint generations at the end '
'of a successful backup run',
group=backup_group)
self.app.settings.boolean(['small-files-in-btree'],
'put contents of small files directly into '
'the per-client B-tree, instead of '
'separate chunk files; do not use this '
'as it is quite bad for performance',
group=backup_group)
self.app.settings.string_list(
['testing-fail-matching'],
'development testing helper: simulate failures during backup '
'for files that match the given regular expressions',
metavar='REGEXP')
def configure_ttystatus_for_backup(self):
self.progress = BackupProgress(self.app.ts)
def error(self, msg, exc=None):
self.errors = True
logging.error(msg)
if exc:
logging.error(repr(exc))
# FIXME: ttystatus.TerminalStatus.error is quiet if --quiet is used.
# That's a bug, so we work around it by writing to stderr directly.
sys.stderr.write('ERROR: %s\n' % msg)
def parse_checkpoint_size(self, value):
p = obnamlib.ByteSizeParser()
p.set_default_unit('MiB')
return p.parse(value)
@property
def pretend(self):
return self.app.settings['pretend']
def backup(self, args):
'''Backup data to repository.'''
logging.info('Backup starts')
logging.debug(
'Checkpoints every %s bytes' % self.app.settings['checkpoint'])
self.app.settings.require('repository')
self.app.settings.require('client-name')
if not self.app.settings['repository']:
raise obnamlib.Error('No --repository setting. '
'You need to specify it on the command '
'line or a configuration file.')
self.configure_ttystatus_for_backup()
self.progress.what('setting up')
self.compile_exclusion_patterns()
self.memory_dump_counter = 0
self.progress.what('connecting to repository')
client_name = self.app.settings['client-name']
if self.pretend:
self.repo = self.app.open_repository()
self.repo.open_client(client_name)
else:
self.repo = self.app.open_repository(create=True)
self.progress.what('adding client')
self.add_client(client_name)
self.progress.what('locking client')
self.repo.lock_client(client_name)
# Need to lock the shared stuff briefly, so encryption etc
# gets initialized.
self.progress.what(
'initialising shared directories')
self.repo.lock_shared()
self.repo.unlock_shared()
self.errors = False
self.chunkid_pool = ChunkidPool()
try:
if not self.pretend:
self.progress.what('starting new generation')
self.repo.start_generation()
self.fs = None
roots = self.app.settings['root'] + args
if not roots:
raise obnamlib.Error('No backup roots specified')
self.backup_roots(roots)
self.progress.what('committing changes to repository')
if not self.pretend:
self.progress.what(
'committing changes to repository: locking shared B-trees')
self.repo.lock_shared()
self.progress.what(
'committing changes to repository: '
'adding chunks to shared B-trees')
self.add_chunks_to_shared()
self.progress.what(
'committing changes to repository: '
'committing client')
self.repo.commit_client()
self.progress.what(
'committing changes to repository: '
'committing shared B-trees')
self.repo.commit_shared()
self.progress.what('closing connection to repository')
self.repo.fs.close()
self.progress.clear()
self.progress.report_stats()
logging.info('Backup finished.')
self.app.dump_memory_profile('at end of backup run')
except BaseException, e:
logging.debug('Handling exception %s' % str(e))
logging.debug(traceback.format_exc())
self.unlock_when_error()
raise
if self.errors:
raise obnamlib.Error('There were errors during the backup')
def unlock_when_error(self):
try:
if self.repo.got_client_lock:
logging.info('Attempting to unlock client because of error')
self.repo.unlock_client()
if self.repo.got_shared_lock:
logging.info(
'Attempting to unlock shared trees because of error')
self.repo.unlock_shared()
except BaseException, e2:
logging.warning(
'Error while unlocking due to error: %s' % str(e2))
logging.debug(traceback.format_exc())
else:
logging.info('Successfully unlocked')
def add_chunks_to_shared(self):
for chunkid, checksum in self.chunkid_pool:
self.repo.put_chunk_in_shared_trees(chunkid, checksum)
self.chunkid_pool.clear()
def add_client(self, client_name):
self.repo.lock_root()
if client_name not in self.repo.list_clients():
tracing.trace('adding new client %s' % client_name)
tracing.trace('client list before adding: %s' %
self.repo.list_clients())
self.repo.add_client(client_name)
tracing.trace('client list after adding: %s' %
self.repo.list_clients())
self.repo.commit_root()
self.repo = self.app.open_repository(repofs=self.repo.fs.fs)
def compile_exclusion_patterns(self):
log = self.app.settings['log']
if log:
log = self.app.settings['log']
self.app.settings['exclude'].append(log)
for pattern in self.app.settings['exclude']:
logging.debug('Exclude pattern: %s' % pattern)
self.exclude_pats = []
for x in self.app.settings['exclude']:
if x != '':
try:
self.exclude_pats.append(re.compile(x))
except re.error, e:
msg = (
'error compiling regular expression "%s": %s' % (x, e))
logging.error(msg)
self.progress.error(msg)
def backup_roots(self, roots):
self.progress.what('connecting to to repository')
self.fs = self.app.fsf.new(roots[0])
self.fs.connect()
absroots = []
for root in roots:
self.progress.what('determining absolute path for %s' % root)
self.fs.reinit(root)
absroots.append(self.fs.abspath('.'))
if not self.pretend:
self.remove_old_roots(absroots)
self.checkpoints = []
self.last_checkpoint = 0
self.interval = self.app.settings['checkpoint']
for root in roots:
logging.info('Backing up root %s' % root)
self.progress.what('connecting to live data %s' % root)
self.fs.reinit(root)
self.progress.what('scanning for files in %s' % root)
absroot = self.fs.abspath('.')
self.root_metadata = self.fs.lstat(absroot)
for pathname, metadata in self.find_files(absroot):
logging.info('Backing up %s' % pathname)
try:
self.maybe_simulate_error(pathname)
if stat.S_ISDIR(metadata.st_mode):
self.backup_dir_contents(pathname)
elif stat.S_ISREG(metadata.st_mode):
assert metadata.md5 is None
metadata.md5 = self.backup_file_contents(pathname,
metadata)
self.backup_metadata(pathname, metadata)
except (IOError, OSError), e:
msg = 'Can\'t back up %s: %s' % (pathname, e.strerror)
self.error(msg, e)
if e.errno == errno.ENOSPC:
raise
if self.time_for_checkpoint():
self.make_checkpoint()
self.progress.what(pathname)
self.backup_parents('.')
remove_checkpoints = (not self.errors and
not self.app.settings['leave-checkpoints']
and not self.pretend)
if remove_checkpoints:
self.progress.what('removing checkpoints')
for gen in self.checkpoints:
self.progress.update_progress_with_removed_checkpoint(gen)
self.repo.remove_generation(gen)
if self.fs:
self.fs.close()
def maybe_simulate_error(self, pathname):
'''Raise an IOError if specified by --testing-fail-matching.'''
for pattern in self.app.settings['testing-fail-matching']:
if re.search(pattern, pathname):
e = errno.ENOENT
raise IOError(e, os.strerror(e), pathname)
def time_for_checkpoint(self):
bytes_since = (self.repo.fs.bytes_written - self.last_checkpoint)
return bytes_since >= self.interval
def make_checkpoint(self):
logging.info('Making checkpoint')
self.progress.what('making checkpoint')
if not self.pretend:
self.checkpoints.append(self.repo.new_generation)
self.progress.what('making checkpoint: backing up parents')
self.backup_parents('.')
self.progress.what('making checkpoint: locking shared B-trees')
self.repo.lock_shared()
self.progress.what('making checkpoint: adding chunks to shared B-trees')
self.add_chunks_to_shared()
self.progress.what('making checkpoint: committing per-client B-tree')
self.repo.commit_client(checkpoint=True)
self.progress.what('making checkpoint: committing shared B-trees')
self.repo.commit_shared()
self.last_checkpoint = self.repo.fs.bytes_written
self.progress.what('making checkpoint: re-opening repository')
self.repo = self.app.open_repository(repofs=self.repo.fs.fs)
self.progress.what('making checkpoint: locking client')
self.repo.lock_client(self.app.settings['client-name'])
self.progress.what('making checkpoint: starting a new generation')
self.repo.start_generation()
self.app.dump_memory_profile('at end of checkpoint')
self.progress.what('making checkpoint: continuing backup')
def find_files(self, root):
'''Find all files and directories that need to be backed up.
This is a generator. It yields (pathname, metadata) pairs.
The caller should not recurse through directories, just backup
the directory itself (name, metadata, file list).
'''
for pathname, st in self.fs.scan_tree(root, ok=self.can_be_backed_up):
tracing.trace('considering %s' % pathname)
try:
metadata = obnamlib.read_metadata(self.fs, pathname, st=st)
self.progress.update_progress_with_file(pathname, metadata)
if self.needs_backup(pathname, metadata):
self.progress.backed_up_count += 1
yield pathname, metadata
else:
self.progress.update_progress_with_scanned(
metadata.st_size)
except GeneratorExit:
raise
except KeyboardInterrupt:
logging.error('Keyboard interrupt')
raise
except BaseException, e:
msg = 'Cannot back up %s: %s' % (pathname, str(e))
self.error(msg, e)
def can_be_backed_up(self, pathname, st):
if self.app.settings['one-file-system']:
if st.st_dev != self.root_metadata.st_dev:
logging.debug('Excluding (one-file-system): %s' % pathname)
return False
for pat in self.exclude_pats:
if pat.search(pathname):
logging.debug('Excluding (pattern): %s' % pathname)
return False
if stat.S_ISDIR(st.st_mode) and self.app.settings['exclude-caches']:
tag_filename = 'CACHEDIR.TAG'
tag_contents = 'Signature: 8a477f597d28d172789f06886806bc55'
tag_path = os.path.join(pathname, 'CACHEDIR.TAG')
if self.fs.exists(tag_path):
# Can't use with, because Paramiko's SFTPFile does not work.
f = self.fs.open(tag_path, 'rb')
data = f.read(len(tag_contents))
f.close()
if data == tag_contents:
logging.debug('Excluding (cache dir): %s' % pathname)
return False
return True
def needs_backup(self, pathname, current):
'''Does a given file need to be backed up?'''
# Directories always require backing up so that backup_dir_contents
# can remove stuff that no longer exists from them.
if current.isdir():
tracing.trace('%s is directory, so needs backup' % pathname)
return True
if self.pretend:
gens = self.repo.list_generations()
if not gens:
return True
gen = gens[-1]
else:
gen = self.repo.new_generation
tracing.trace('gen=%s' % repr(gen))
try:
old = self.repo.get_metadata(gen, pathname)
except obnamlib.Error, e:
# File does not exist in the previous generation, so it
# does need to be backed up.
tracing.trace('%s not in previous gen, so needs backup' % pathname)
tracing.trace('error: %s' % str(e))
tracing.trace(traceback.format_exc())
return True
needs = (current.st_mtime_sec != old.st_mtime_sec or
current.st_mtime_nsec != old.st_mtime_nsec or
current.st_mode != old.st_mode or
current.st_nlink != old.st_nlink or
current.st_size != old.st_size or
current.st_uid != old.st_uid or
current.st_gid != old.st_gid or
current.xattr != old.xattr)
if needs:
tracing.trace('%s has changed metadata, so needs backup' % pathname)
return needs
def backup_parents(self, root):
'''Back up parents of root, non-recursively.'''
root = self.fs.abspath(root)
tracing.trace('backing up parents of %s', root)
dummy_metadata = obnamlib.Metadata(st_mode=0777 | stat.S_IFDIR)
while True:
parent = os.path.dirname(root)
try:
metadata = obnamlib.read_metadata(self.fs, root)
except OSError, e:
logging.warning(
'Failed to get metadata for %s: %s: %s' %
(root, e.errno or 0, e.strerror))
logging.warning('Using fake metadata instead for %s' % root)
metadata = dummy_metadata
if not self.pretend:
self.repo.create(root, metadata)
if root == parent:
break
root = parent
def backup_metadata(self, pathname, metadata):
'''Back up metadata for a filesystem object'''
tracing.trace('backup_metadata: %s', pathname)
if not self.pretend:
self.repo.create(pathname, metadata)
def backup_file_contents(self, filename, metadata):
'''Back up contents of a regular file.'''
tracing.trace('backup_file_contents: %s', filename)
if self.pretend:
tracing.trace('pretending to upload the whole file')
self.progress.update_progress_with_upload(metadata.st_size)
return
tracing.trace('setting file chunks to empty')
if not self.pretend:
self.repo.set_file_chunks(filename, [])
tracing.trace('opening file for reading')
f = self.fs.open(filename, 'r')
summer = self.repo.new_checksummer()
max_intree = self.app.settings['node-size'] / 4
if (metadata.st_size <= max_intree and
self.app.settings['small-files-in-btree']):
contents = f.read()
assert len(contents) <= max_intree # FIXME: silly error checking
f.close()
self.progress.update_progress_with_scanned(len(contents))
self.repo.set_file_data(filename, contents)
summer.update(contents)
return summer.digest()
chunk_size = int(self.app.settings['chunk-size'])
chunkids = []
while True:
tracing.trace('reading some data')
self.progress.update_progress()
data = f.read(chunk_size)
if not data:
tracing.trace('end of data')
break
tracing.trace('got %d bytes of data' % len(data))
self.progress.update_progress_with_scanned(len(data))
summer.update(data)
if not self.pretend:
chunkids.append(self.backup_file_chunk(data))
if len(chunkids) >= self.app.settings['chunkids-per-group']:
tracing.trace('adding %d chunkids to file' % len(chunkids))
self.repo.append_file_chunks(filename, chunkids)
self.app.dump_memory_profile('after appending some '
'chunkids')
chunkids = []
else:
self.self.update_progress_with_upload(len(data))
if not self.pretend and self.time_for_checkpoint():
logging.debug('making checkpoint in the middle of a file')
self.repo.append_file_chunks(filename, chunkids)
chunkids = []
self.make_checkpoint()
tracing.trace('closing file')
f.close()
if chunkids:
assert not self.pretend
tracing.trace('adding final %d chunkids to file' % len(chunkids))
self.repo.append_file_chunks(filename, chunkids)
self.app.dump_memory_profile('at end of file content backup for %s' %
filename)
tracing.trace('done backing up file contents')
return summer.digest()
def backup_file_chunk(self, data):
'''Back up a chunk of data by putting it into the repository.'''
def find():
# We ignore lookup errors here intentionally. We're reading
# the checksum trees without a lock, so another Obnam may be
# modifying them, which can lead to spurious NodeMissing
# exceptions, and other errors. We don't care: we'll just
# pretend no chunk with the checksum exists yet.
try:
in_tree = self.repo.find_chunks(checksum)
except larch.Error:
in_tree = []
return in_tree + self.chunkid_pool.get(checksum)
def get(chunkid):
return self.repo.get_chunk(chunkid)
def put():
self.progress.update_progress_with_upload(len(data))
return self.repo.put_chunk_only(data)
def share(chunkid):
self.chunkid_pool.add(chunkid, checksum)
checksum = self.repo.checksum(data)
mode = self.app.settings['deduplicate']
if mode == 'never':
return put()
elif mode == 'verify':
for chunkid in find():
data2 = get(chunkid)
if data == data2:
return chunkid
else:
chunkid = put()
share(chunkid)
return chunkid
elif mode == 'fatalist':
existing = find()
if existing:
return existing[0]
else:
chunkid = put()
share(chunkid)
return chunkid
else:
if not hasattr(self, 'bad_deduplicate_reported'):
logging.error('unknown --deduplicate setting value')
self.bad_deduplicate_reported = True
chunkid = put()
share(chunkid)
return chunkid
def backup_dir_contents(self, root):
'''Back up the list of files in a directory.'''
tracing.trace('backup_dir: %s', root)
if self.pretend:
return
new_basenames = self.fs.listdir(root)
old_basenames = self.repo.listdir(self.repo.new_generation, root)
for old in old_basenames:
pathname = os.path.join(root, old)
if old not in new_basenames:
self.repo.remove(pathname)
# Files that are created after the previous generation will be
# added to the directory when they are backed up, so we don't
# need to worry about them here.
def remove_old_roots(self, new_roots):
'''Remove from started generation anything that is not a backup root.
We recurse from filesystem root directory until getting to one of
the new backup roots, or a directory or file that is not a parent
of one of the new backup roots. We remove anything that is not a
new backup root, or their parent.
'''
def is_parent(pathname):
if not pathname.endswith(os.sep):
pathname += os.sep
for new_root in new_roots:
if new_root.startswith(pathname):
return True
return False
def helper(dirname):
if dirname in new_roots:
tracing.trace('is a new root: %s' % dirname)
elif is_parent(dirname):
tracing.trace('is parent of a new root: %s' % dirname)
pathnames = [os.path.join(dirname, x)
for x in self.repo.listdir(gen_id, dirname)]
for pathname in pathnames:
helper(pathname)
else:
tracing.trace('is extra and removed: %s' % dirname)
self.progress.what('removing %s from new generation' % dirname)
self.repo.remove(dirname)
self.progress.what(msg)
assert not self.pretend
msg = 'removing old backup roots from new generation'
self.progress.what(msg)
tracing.trace('new_roots: %s' % repr(new_roots))
gen_id = self.repo.new_generation
helper('/')
obnam-1.6.1/obnamlib/plugins/compression_plugin.py 0000644 0001750 0001750 00000004017 12246357067 022252 0 ustar jenkins jenkins # Copyright (C) 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import logging
import os
import zlib
import obnamlib
class DeflateCompressionFilter(object):
def __init__(self, app):
self.tag = "deflate"
self.app = app
self.warned = False
def filter_read(self, data, repo, toplevel):
return zlib.decompress(data)
def filter_write(self, data, repo, toplevel):
how = self.app.settings['compress-with']
if how == 'deflate':
data = zlib.compress(data)
elif how == 'gzip':
if not self.warned:
self.app.ts.notify("--compress-with=gzip is deprecated. " +
"Use --compress-with=deflate instead")
self.warned = True
data = zlib.compress(data)
return data
class CompressionPlugin(obnamlib.ObnamPlugin):
def enable(self):
self.app.settings.choice(['compress-with'],
['none', 'deflate', 'gzip'],
'use PROGRAM to compress repository with '
'(one of none, deflate)',
metavar='PROGRAM')
hooks = [
('repository-data', DeflateCompressionFilter(self.app),
obnamlib.Hook.EARLY_PRIORITY),
]
for name, callback, prio in hooks:
self.app.hooks.add_callback(name, callback, prio)
obnam-1.6.1/obnamlib/plugins/convert5to6_plugin.py 0000644 0001750 0001750 00000005752 12246357067 022116 0 ustar jenkins jenkins # Copyright (C) 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import logging
import os
import re
import stat
import tracing
import zlib
import obnamlib
class Convert5to6Plugin(obnamlib.ObnamPlugin):
'''Convert a version 5 repository to version 6, in place.'''
def enable(self):
self.app.add_subcommand('convert5to6', self.convert, arg_synopsis='')
def convert(self, args):
self.app.settings.require('repository')
self.rawfs = self.app.fsf.new(self.app.settings['repository'])
self.convert_format()
self.repo = self.app.open_repository()
self.convert_files()
def convert_files(self):
funcs = []
if self.app.settings['compress-with'] == 'gzip':
funcs.append(self.gunzip)
if self.app.settings['encrypt-with']:
self.symmetric_keys = {}
funcs.append(self.decrypt)
tracing.trace('funcs=%s' % repr(funcs))
for filename in self.find_files():
logging.debug('converting file %s' % filename)
data = self.rawfs.cat(filename)
tracing.trace('old data is %d bytes' % len(data))
for func in funcs:
data = func(filename, data)
tracing.trace('new data is %d bytes' % len(data))
self.repo.fs.overwrite_file(filename, data)
def find_files(self):
ignored_pat = re.compile(r'^(tmp.*|lock|format|userkeys|key)$')
for filename, st in self.rawfs.scan_tree('.'):
ignored = ignored_pat.match(os.path.basename(filename))
if stat.S_ISREG(st.st_mode) and not ignored:
assert filename.startswith('./')
yield filename[2:]
def get_symmetric_key(self, filename):
toplevel = filename.split('/')[0]
tracing.trace('toplevel=%s' % toplevel)
if toplevel not in self.symmetric_keys:
encoded = self.rawfs.cat(os.path.join(toplevel, 'key'))
key = obnamlib.decrypt_with_secret_keys(encoded)
self.symmetric_keys[toplevel] = key
return self.symmetric_keys[toplevel]
def decrypt(self, filename, data):
symmetric_key = self.get_symmetric_key(filename)
return obnamlib.decrypt_symmetric(data, symmetric_key)
def gunzip(self, filename, data):
return zlib.decompress(data)
def convert_format(self):
self.rawfs.overwrite_file('metadata/format', '6\n')
obnam-1.6.1/obnamlib/plugins/encryption_plugin.py 0000644 0001750 0001750 00000026312 12246357067 022105 0 ustar jenkins jenkins # Copyright (C) 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import logging
import os
import obnamlib
class EncryptionPlugin(obnamlib.ObnamPlugin):
def enable(self):
encryption_group = obnamlib.option_group['encryption'] = 'Encryption'
self.app.settings.string(['encrypt-with'],
'PGP key with which to encrypt data '
'in the backup repository',
group=encryption_group)
self.app.settings.string(['keyid'],
'PGP key id to add to/remove from '
'the backup repository',
group=encryption_group)
self.app.settings.boolean(['weak-random'],
'use /dev/urandom instead of /dev/random '
'to generate symmetric keys',
group=encryption_group)
self.app.settings.boolean(['key-details'],
'show additional user IDs for all keys',
group=encryption_group)
self.app.settings.string(['symmetric-key-bits'],
'size of symmetric key, in bits',
group=encryption_group)
self.tag = "encrypt1"
hooks = [
('repository-toplevel-init', self.toplevel_init,
obnamlib.Hook.DEFAULT_PRIORITY),
('repository-data', self,
obnamlib.Hook.LATE_PRIORITY),
('repository-add-client', self.add_client,
obnamlib.Hook.DEFAULT_PRIORITY),
]
for name, callback, rev in hooks:
self.app.hooks.add_callback(name, callback, rev)
self._pubkey = None
self.app.add_subcommand('client-keys', self.client_keys)
self.app.add_subcommand('list-keys', self.list_keys)
self.app.add_subcommand('list-toplevels', self.list_toplevels)
self.app.add_subcommand(
'add-key', self.add_key, arg_synopsis='[CLIENT-NAME]...')
self.app.add_subcommand(
'remove-key', self.remove_key, arg_synopsis='[CLIENT-NAME]...')
self.app.add_subcommand('remove-client', self.remove_client,
arg_synopsis='[CLIENT-NAME]...')
self._symkeys = obnamlib.SymmetricKeyCache()
def disable(self):
self._symkeys.clear()
@property
def keyid(self):
return self.app.settings['encrypt-with']
@property
def pubkey(self):
if self._pubkey is None:
self._pubkey = obnamlib.get_public_key(self.keyid)
return self._pubkey
@property
def devrandom(self):
if self.app.settings['weak-random']:
return '/dev/urandom'
else:
return '/dev/random'
@property
def symmetric_key_bits(self):
return int(self.app.settings['symmetric-key-bits'] or '256')
def _write_file(self, repo, pathname, contents):
repo.fs.fs.write_file(pathname, contents)
def _overwrite_file(self, repo, pathname, contents):
repo.fs.fs.overwrite_file(pathname, contents)
def toplevel_init(self, repo, toplevel):
'''Initialize a new toplevel for encryption.'''
if not self.keyid:
return
pubkeys = obnamlib.Keyring()
pubkeys.add(self.pubkey)
symmetric_key = obnamlib.generate_symmetric_key(
self.symmetric_key_bits,
filename=self.devrandom)
encrypted = obnamlib.encrypt_with_keyring(symmetric_key, pubkeys)
self._write_file(repo, os.path.join(toplevel, 'key'), encrypted)
encoded = str(pubkeys)
encrypted = obnamlib.encrypt_symmetric(encoded, symmetric_key)
self._write_file(repo, os.path.join(toplevel, 'userkeys'), encrypted)
def filter_read(self, encrypted, repo, toplevel):
if not self.keyid:
return encrypted
symmetric_key = self.get_symmetric_key(repo, toplevel)
return obnamlib.decrypt_symmetric(encrypted, symmetric_key)
def filter_write(self, cleartext, repo, toplevel):
if not self.keyid:
return cleartext
symmetric_key = self.get_symmetric_key(repo, toplevel)
return obnamlib.encrypt_symmetric(cleartext, symmetric_key)
def get_symmetric_key(self, repo, toplevel):
key = self._symkeys.get(repo, toplevel)
if key is None:
encoded = repo.fs.fs.cat(os.path.join(toplevel, 'key'))
key = obnamlib.decrypt_with_secret_keys(encoded)
self._symkeys.put(repo, toplevel, key)
return key
def read_keyring(self, repo, toplevel):
encrypted = repo.fs.fs.cat(os.path.join(toplevel, 'userkeys'))
encoded = self.filter_read(encrypted, repo, toplevel)
return obnamlib.Keyring(encoded=encoded)
def write_keyring(self, repo, toplevel, keyring):
encoded = str(keyring)
encrypted = self.filter_write(encoded, repo, toplevel)
pathname = os.path.join(toplevel, 'userkeys')
self._overwrite_file(repo, pathname, encrypted)
def add_to_userkeys(self, repo, toplevel, public_key):
userkeys = self.read_keyring(repo, toplevel)
userkeys.add(public_key)
self.write_keyring(repo, toplevel, userkeys)
def remove_from_userkeys(self, repo, toplevel, keyid):
userkeys = self.read_keyring(repo, toplevel)
if keyid in userkeys:
logging.debug('removing key %s from %s' % (keyid, toplevel))
userkeys.remove(keyid)
self.write_keyring(repo, toplevel, userkeys)
else:
logging.debug('unable to remove key %s from %s (not there)' %
(keyid, toplevel))
def rewrite_symmetric_key(self, repo, toplevel):
symmetric_key = self.get_symmetric_key(repo, toplevel)
userkeys = self.read_keyring(repo, toplevel)
encrypted = obnamlib.encrypt_with_keyring(symmetric_key, userkeys)
self._overwrite_file(repo, os.path.join(toplevel, 'key'), encrypted)
def add_client(self, clientlist, client_name):
clientlist.set_client_keyid(client_name, self.keyid)
def quit_if_unencrypted(self):
if self.app.settings['encrypt-with']:
return False
self.app.output.write('Warning: Encryption not in use.\n')
self.app.output.write('(Use --encrypt-with to set key.)\n')
return True
def client_keys(self, args):
'''List clients and their keys in the repository.'''
if self.quit_if_unencrypted():
return
repo = self.app.open_repository()
clients = repo.list_clients()
for client in clients:
keyid = repo.clientlist.get_client_keyid(client)
if keyid is None:
key_info = 'no key'
else:
key_info = self._get_key_string(keyid)
print client, key_info
def _find_keys_and_toplevels(self, repo):
toplevels = repo.fs.listdir('.')
keys = dict()
tops = dict()
for toplevel in [d for d in toplevels if d != 'metadata']:
# skip files (e.g. 'lock') or empty directories
if not repo.fs.exists(os.path.join(toplevel, 'key')):
continue
try:
userkeys = self.read_keyring(repo, toplevel)
except obnamlib.EncryptionError:
# other client's toplevels are unreadable
tops[toplevel] = []
continue
for keyid in userkeys.keyids():
keys[keyid] = keys.get(keyid, []) + [toplevel]
tops[toplevel] = tops.get(toplevel, []) + [keyid]
return keys, tops
def _get_key_string(self, keyid):
verbose = self.app.settings['key-details']
if verbose:
user_ids = obnamlib.get_public_key_user_ids(keyid)
if user_ids:
return "%s (%s)" % (keyid, ", ".join(user_ids))
return str(keyid)
def list_keys(self, args):
'''List keys and the repository toplevels they're used in.'''
if self.quit_if_unencrypted():
return
repo = self.app.open_repository()
keys, tops = self._find_keys_and_toplevels(repo)
for keyid in keys:
print 'key: %s' % self._get_key_string(keyid)
for toplevel in keys[keyid]:
print ' %s' % toplevel
def list_toplevels(self, args):
'''List repository toplevel directories and their keys.'''
if self.quit_if_unencrypted():
return
repo = self.app.open_repository()
keys, tops = self._find_keys_and_toplevels(repo)
for toplevel in tops:
print 'toplevel: %s' % toplevel
for keyid in tops[toplevel]:
print ' %s' % self._get_key_string(keyid)
_shared = ['chunklist', 'chunks', 'chunksums', 'clientlist']
def _find_clientdirs(self, repo, client_names):
result = []
for client_name in client_names:
client_id = repo.clientlist.get_client_id(client_name)
if client_id:
result.append(repo.client_dir(client_id))
else:
logging.warning("client not found: %s" % client_name)
return result
def add_key(self, args):
'''Add a key to the repository.'''
if self.quit_if_unencrypted():
return
self.app.settings.require('keyid')
repo = self.app.open_repository()
keyid = self.app.settings['keyid']
key = obnamlib.get_public_key(keyid)
clients = self._find_clientdirs(repo, args)
for toplevel in self._shared + clients:
self.add_to_userkeys(repo, toplevel, key)
self.rewrite_symmetric_key(repo, toplevel)
def remove_key(self, args):
'''Remove a key from the repository.'''
if self.quit_if_unencrypted():
return
self.app.settings.require('keyid')
repo = self.app.open_repository()
keyid = self.app.settings['keyid']
clients = self._find_clientdirs(repo, args)
for toplevel in self._shared + clients:
self.remove_from_userkeys(repo, toplevel, keyid)
self.rewrite_symmetric_key(repo, toplevel)
def remove_client(self, args):
'''Remove client and its key from repository.'''
if self.quit_if_unencrypted():
return
repo = self.app.open_repository()
repo.lock_root()
for client_name in args:
logging.info('removing client %s' % client_name)
repo.remove_client(client_name)
repo.commit_root()
obnam-1.6.1/obnamlib/plugins/force_lock_plugin.py 0000644 0001750 0001750 00000004563 12246357067 022025 0 ustar jenkins jenkins # Copyright (C) 2009, 2010, 2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import logging
import os
import obnamlib
class ForceLockPlugin(obnamlib.ObnamPlugin):
def enable(self):
self.app.add_subcommand('force-lock', self.force_lock)
def force_lock(self, args):
'''Force a locked repository to be open.'''
self.app.settings.require('repository')
self.app.settings.require('client-name')
repourl = self.app.settings['repository']
client_name = self.app.settings['client-name']
logging.info('Forcing lock')
logging.info('Repository: %s' % repourl)
logging.info('Client: %s' % client_name)
try:
repo = self.app.open_repository()
except OSError, e:
raise obnamlib.Error('Repository does not exist '
'or cannot be accessed.\n' +
str(e))
all_clients = repo.list_clients()
if client_name not in all_clients:
msg = 'Client does not exist in repository.'
logging.warning(msg)
self.app.output.write('Warning: %s\n' % msg)
return
all_dirs = ['clientlist', 'chunksums', 'chunklist', 'chunks', '.']
for client_name in all_clients:
client_id = repo.clientlist.get_client_id(client_name)
client_dir = repo.client_dir(client_id)
all_dirs.append(client_dir)
for one_dir in all_dirs:
lockname = os.path.join(one_dir, 'lock')
if repo.fs.exists(lockname):
logging.info('Removing lockfile %s' % lockname)
repo.fs.remove(lockname)
else:
logging.info('%s is not locked' % one_dir)
repo.fs.close()
return 0
obnam-1.6.1/obnamlib/plugins/forget_plugin.py 0000644 0001750 0001750 00000006270 12246357067 021202 0 ustar jenkins jenkins # Copyright (C) 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import datetime
import obnamlib
class ForgetPlugin(obnamlib.ObnamPlugin):
'''Forget generations.'''
def enable(self):
self.app.add_subcommand('forget', self.forget,
arg_synopsis='[GENERATION]...')
self.app.settings.string(['keep'],
'policy for what generations to keep '
'when forgetting')
def forget(self, args):
'''Forget (remove) specified backup generations.'''
self.app.settings.require('repository')
self.app.settings.require('client-name')
self.app.ts['gen'] = None
self.app.ts['gens'] = []
self.app.ts.format('forgetting generations: %Index(gen,gens) done')
self.repo = self.app.open_repository()
self.repo.lock_client(self.app.settings['client-name'])
self.repo.lock_shared()
self.app.dump_memory_profile('at beginning')
if args:
self.app.ts['gens'] = args
for genspec in args:
self.app.ts['gen'] = genspec
genid = self.repo.genspec(genspec)
self.app.ts.notify('Forgetting generation %s' % genid)
self.remove(genid)
self.app.dump_memory_profile('after removing %s' % genid)
elif self.app.settings['keep']:
genlist = []
dt = datetime.datetime(1970, 1, 1, 0, 0, 0)
for genid in self.repo.list_generations():
start, end = self.repo.get_generation_times(genid)
genlist.append((genid, dt.fromtimestamp(end)))
fp = obnamlib.ForgetPolicy()
rules = fp.parse(self.app.settings['keep'])
keeplist = fp.match(rules, genlist)
keepids = set(genid for genid, dt in keeplist)
removeids = [genid
for genid, dt in genlist
if genid not in keepids]
self.app.ts['gens'] = removeids
for genid in removeids:
self.app.ts['gen'] = genid
self.remove(genid)
self.app.dump_memory_profile('after removing %s' % genid)
self.repo.commit_client()
self.repo.commit_shared()
self.app.dump_memory_profile('after committing')
self.repo.fs.close()
self.app.ts.finish()
def remove(self, genid):
if self.app.settings['pretend']:
self.app.ts.notify('Pretending to remove generation %s' % genid)
else:
self.repo.remove_generation(genid)
obnam-1.6.1/obnamlib/plugins/fsck_plugin.py 0000644 0001750 0001750 00000032723 12246357067 020644 0 ustar jenkins jenkins # Copyright (C) 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import larch.fsck
import logging
import os
import sys
import ttystatus
import obnamlib
class WorkItem(larch.fsck.WorkItem):
'''A work item for fsck.
Whoever creates a WorkItem shall set the ``repo`` to the repository
being used.
'''
class CheckChunk(WorkItem):
def __init__(self, chunkid, checksummer):
self.chunkid = chunkid
self.checksummer = checksummer
self.name = 'chunk %s' % chunkid
def do(self):
logging.debug('Checking chunk %s' % self.chunkid)
if not self.repo.chunk_exists(self.chunkid):
self.error('chunk %s does not exist' % self.chunkid)
else:
data = self.repo.get_chunk(self.chunkid)
checksum = self.repo.checksum(data)
try:
correct = self.repo.chunklist.get_checksum(self.chunkid)
except KeyError:
self.error('chunk %s not in chunklist' % self.chunkid)
else:
if checksum != correct:
self.error('chunk %s has wrong checksum' % self.chunkid)
if self.chunkid not in self.repo.chunksums.find(checksum):
self.error('chunk %s not in chunksums' % self.chunkid)
self.checksummer.update(data)
self.chunkids_seen.add(self.chunkid)
class CheckFileChecksum(WorkItem):
def __init__(self, filename, correct, chunkids, checksummer):
self.filename = filename
self.name = '%s checksum' % filename
self.correct = correct
self.chunkids = chunkids
self.checksummer = checksummer
def do(self):
logging.debug('Checking whole-file checksum for %s' % self.filename)
if self.correct != self.checksummer.digest():
self.error('%s whole-file checksum mismatch' % self.name)
class CheckFile(WorkItem):
def __init__(self, client_name, genid, filename, metadata):
self.client_name = client_name
self.genid = genid
self.filename = filename
self.metadata = metadata
self.name = 'file %s:%s:%s' % (client_name, genid, filename)
def do(self):
logging.debug('Checking client=%s genid=%s filename=%s' %
(self.client_name, self.genid, self.filename))
if self.repo.current_client != self.client_name:
self.repo.open_client(self.client_name)
if self.metadata.isfile() and not self.settings['fsck-ignore-chunks']:
chunkids = self.repo.get_file_chunks(self.genid, self.filename)
checksummer = self.repo.new_checksummer()
for chunkid in chunkids:
yield CheckChunk(chunkid, checksummer)
yield CheckFileChecksum(
self.name, self.metadata.md5, chunkids, checksummer)
class CheckDirectory(WorkItem):
def __init__(self, client_name, genid, dirname):
self.client_name = client_name
self.genid = genid
self.dirname = dirname
self.name = 'dir %s:%s:%s' % (client_name, genid, dirname)
def do(self):
logging.debug('Checking client=%s genid=%s dirname=%s' %
(self.client_name, self.genid, self.dirname))
if self.repo.current_client != self.client_name:
self.repo.open_client(self.client_name)
self.repo.get_metadata(self.genid, self.dirname)
for basename in self.repo.listdir(self.genid, self.dirname):
pathname = os.path.join(self.dirname, basename)
metadata = self.repo.get_metadata(self.genid, pathname)
if metadata.isdir():
yield CheckDirectory(self.client_name, self.genid, pathname)
elif not self.settings['fsck-skip-files']:
yield CheckFile(
self.client_name, self.genid, pathname, metadata)
class CheckGeneration(WorkItem):
def __init__(self, client_name, genid):
self.client_name = client_name
self.genid = genid
self.name = 'generation %s:%s' % (client_name, genid)
def do(self):
logging.debug('Checking client=%s genid=%s' %
(self.client_name, self.genid))
started, ended = self.repo.client.get_generation_times(self.genid)
if started is None:
self.error('%s:%s: no generation start time' %
(self.client_name, self.genid))
if ended is None:
self.error('%s:%s: no generation end time' %
(self.client_name, self.genid))
n = self.repo.client.get_generation_file_count(self.genid)
if n is None:
self.error('%s:%s: no file count' % (self.client_name, self.genid))
n = self.repo.client.get_generation_data(self.genid)
if n is None:
self.error('%s:%s: no total data' % (self.client_name, self.genid))
if self.settings['fsck-skip-dirs']:
return []
else:
return [CheckDirectory(self.client_name, self.genid, '/')]
class CheckGenerationIdsAreDifferent(WorkItem):
def __init__(self, client_name, genids):
self.client_name = client_name
self.genids = list(genids)
def do(self):
logging.debug('Checking genid uniqueness for client=%s' %
self.client_name)
done = set()
while self.genids:
genid = self.genids.pop()
if genid in done:
self.error('%s: duplicate generation id %s' % genid)
else:
done.add(genid)
class CheckClientExists(WorkItem):
def __init__(self, client_name):
self.client_name = client_name
self.name = 'does client %s exist?' % client_name
def do(self):
logging.debug('Checking client=%s exists' % self.client_name)
client_id = self.repo.clientlist.get_client_id(self.client_name)
if client_id is None:
self.error('Client %s is in client list, but has no id' %
self.client_name)
class CheckClient(WorkItem):
def __init__(self, client_name):
self.client_name = client_name
self.name = 'client %s' % client_name
def do(self):
logging.debug('Checking client=%s' % self.client_name)
if self.repo.current_client != self.client_name:
self.repo.open_client(self.client_name)
genids = self.repo.list_generations()
yield CheckGenerationIdsAreDifferent(self.client_name, genids)
if self.settings['fsck-skip-generations']:
genids = []
elif self.settings['fsck-last-generation-only'] and genids:
genids = genids[-1:]
for genid in genids:
yield CheckGeneration(self.client_name, genid)
class CheckClientlist(WorkItem):
name = 'client list'
def do(self):
logging.debug('Checking clientlist')
clients = self.repo.clientlist.list_clients()
for client_name in clients:
if client_name not in self.settings['fsck-ignore-client']:
yield CheckClientExists(client_name)
if not self.settings['fsck-skip-per-client-b-trees']:
for client_name in clients:
if client_name not in self.settings['fsck-ignore-client']:
client_id = self.repo.clientlist.get_client_id(client_name)
client_dir = self.repo.client_dir(client_id)
yield CheckBTree(str(client_dir))
for client_name in clients:
if client_name not in self.settings['fsck-ignore-client']:
yield CheckClient(client_name)
class CheckForExtraChunks(WorkItem):
def __init__(self):
self.name = 'extra chunks'
def do(self):
logging.debug('Checking for extra chunks')
for chunkid in self.repo.list_chunks():
if chunkid not in self.chunkids_seen:
self.error('chunk %s not used by anyone' % chunkid)
class CheckBTree(WorkItem):
def __init__(self, dirname):
self.dirname = dirname
self.name = 'B-tree %s' % dirname
def do(self):
if not self.repo.fs.exists(self.dirname):
logging.debug('B-tree %s does not exist, skipping' % self.dirname)
return
logging.debug('Checking B-tree %s' % self.dirname)
fix = self.settings['fsck-fix']
forest = larch.open_forest(allow_writes=fix, dirname=self.dirname,
vfs=self.repo.fs)
fsck = larch.fsck.Fsck(forest, self.warning, self.error, fix)
for work in fsck.find_work():
yield work
class CheckRepository(WorkItem):
def __init__(self):
self.name = 'repository'
def do(self):
logging.debug('Checking repository')
if not self.settings['fsck-skip-shared-b-trees']:
yield CheckBTree('clientlist')
yield CheckBTree('chunklist')
yield CheckBTree('chunksums')
yield CheckClientlist()
class FsckPlugin(obnamlib.ObnamPlugin):
def enable(self):
self.app.add_subcommand('fsck', self.fsck)
group = 'Integrity checking (fsck)'
self.app.settings.boolean(
['fsck-fix'],
'should fsck try to fix problems?',
group=group)
self.app.settings.boolean(
['fsck-ignore-chunks'],
'ignore chunks when checking repository integrity (assume all '
'chunks exist and are correct)',
group=group)
self.app.settings.string_list(
['fsck-ignore-client'],
'do not check repository data for cient NAME',
metavar='NAME',
group=group)
self.app.settings.boolean(
['fsck-last-generation-only'],
'check only the last generation for each client',
group=group)
self.app.settings.boolean(
['fsck-skip-generations'],
'do not check any generations',
group=group)
self.app.settings.boolean(
['fsck-skip-dirs'],
'do not check anything about directories and their files',
group=group)
self.app.settings.boolean(
['fsck-skip-files'],
'do not check anything about files',
group=group)
self.app.settings.boolean(
['fsck-skip-per-client-b-trees'],
'do not check per-client B-trees',
group=group)
self.app.settings.boolean(
['fsck-skip-shared-b-trees'],
'do not check shared B-trees',
group=group)
def configure_ttystatus(self):
self.app.ts.clear()
self.app.ts['this_item'] = 0
self.app.ts['items'] = 0
self.app.ts.format(
'Checking %Integer(this_item)/%Integer(items): %String(item)')
def fsck(self, args):
'''Verify internal consistency of backup repository.'''
self.app.settings.require('repository')
logging.debug('fsck on %s' % self.app.settings['repository'])
self.configure_ttystatus()
self.repo = self.app.open_repository()
self.repo.lock_root()
client_names = self.repo.list_clients()
client_dirs = [self.repo.client_dir(
self.repo.clientlist.get_client_id(name))
for name in client_names]
self.repo.lockmgr.lock(client_dirs)
self.repo.lock_shared()
self.errors = 0
self.chunkids_seen = set()
self.work_items = []
self.add_item(CheckRepository(), append=True)
final_items = []
if not self.app.settings['fsck-ignore-chunks']:
final_items.append(CheckForExtraChunks())
while self.work_items:
work = self.work_items.pop(0)
logging.debug('doing: %s' % str(work))
self.app.ts['item'] = work
self.app.ts.increase('this_item', 1)
pos = 0
for more in work.do() or []:
self.add_item(more, pos=pos)
pos += 1
if not self.work_items:
for work in final_items:
self.add_item(work, append=True)
final_items = []
self.repo.unlock_shared()
self.repo.lockmgr.unlock(client_dirs)
self.repo.unlock_root()
self.repo.fs.close()
self.app.ts.finish()
if self.errors:
sys.exit(1)
def add_item(self, work, append=False, pos=0):
logging.debug('adding: %s' % str(work))
work.warning = self.warning
work.error = self.error
work.repo = self.repo
work.settings = self.app.settings
work.chunkids_seen = self.chunkids_seen
if append:
self.work_items.append(work)
else:
self.work_items.insert(pos, work)
self.app.ts.increase('items', 1)
def error(self, msg):
logging.error(msg)
self.app.ts.error(msg)
self.errors += 1
def warning(self, msg):
logging.warning(msg)
self.app.ts.notify(msg)
obnam-1.6.1/obnamlib/plugins/fuse_plugin.py 0000644 0001750 0001750 00000053646 12246357067 020667 0 ustar jenkins jenkins # Copyright (C) 2013 Valery Yundin
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import os
import stat
import sys
import logging
import errno
import struct
import signal
import obnamlib
try:
import fuse
fuse.fuse_python_api = (0, 2)
except ImportError:
class Bunch:
def __init__(self, **kwds):
self.__dict__.update(kwds)
fuse = Bunch(Fuse = object)
class ObnamFuseOptParse(object):
'''Option parsing class for FUSE
has to set fuse_args.mountpoint
'''
obnam = None
def __init__(self, *args, **kw):
self.fuse_args = \
'fuse_args' in kw and kw.pop('fuse_args') or fuse.FuseArgs()
if 'fuse' in kw:
self.fuse = kw.pop('fuse')
def parse_args(self, args=None, values=None):
self.fuse_args.mountpoint = self.obnam.app.settings['to']
for opt in self.obnam.app.settings['fuse-opt']:
if opt == '-f':
self.fuse_args.setmod('foreground')
else:
self.fuse_args.add(opt)
if not hasattr(self.fuse_args, 'ro'):
self.fuse_args.add('ro')
class ObnamFuseFile(object):
fs = None # points to active ObnamFuse object
direct_io = False # do not use direct I/O on this file.
keep_cache = True # cached file data need not to be invalidated.
def __init__(self, path, flags, *mode):
logging.debug('FUSE file open %s %d', path, flags)
if ((flags & os.O_WRONLY) or (flags & os.O_RDWR) or
(flags & os.O_CREAT) or (flags & os.O_EXCL) or
(flags & os.O_TRUNC) or (flags & os.O_APPEND)):
raise IOError(errno.EROFS, 'Read only filesystem')
try:
self.path = path
if path == '/.pid' and self.fs.obnam.app.settings['viewmode'] == 'multiple':
self.read = self.read_pid
return
self.metadata = self.fs.get_metadata(path)
# if not a regular file return EINVAL
if not stat.S_ISREG(self.metadata.st_mode):
raise IOError(errno.EINVAL, 'Invalid argument')
self.chunkids = None
self.chunksize = None
self.lastdata = None
self.lastblock = None
except:
logging.error('Unexpected exception', exc_info=True)
raise
def read_pid(self, length, offset):
pid = str(os.getpid())
if length < len(pid) or offset != 0:
return ''
else:
return pid
def fgetattr(self):
logging.debug('FUSE file fgetattr')
return self.fs.getattr(self.path)
def read(self, length, offset):
logging.debug('FUSE file read(%s, %d, %d)', self.path, length, offset)
try:
if length == 0 or offset >= self.metadata.st_size:
return ''
repo = self.fs.obnam.repo
gen, repopath = self.fs.get_gen_path(self.path)
# if stored inside B-tree
contents = repo.get_file_data(gen, repopath)
if contents is not None:
return contents[offset:offset+length]
# stored in chunks
if not self.chunkids:
self.chunkids = repo.get_file_chunks(gen, repopath)
if len(self.chunkids) == 1:
if not self.lastdata:
self.lastdata = repo.get_chunk(self.chunkids[0])
return self.lastdata[offset:offset+length]
else:
chunkdata = None
if not self.chunksize:
# take the cached value as the first guess for chunksize
self.chunksize = self.fs.sizecache.get(gen, self.fs.chunksize)
blocknum = offset/self.chunksize
blockoffs = offset - blocknum*self.chunksize
# read a chunk if guessed blocknum and chunksize make sense
if blocknum < len(self.chunkids):
chunkdata = repo.get_chunk(self.chunkids[blocknum])
else:
chunkdata = ''
# check if chunkdata is of expected length
validate = min(self.chunksize, self.metadata.st_size - blocknum*self.chunksize)
if validate != len(chunkdata):
if blocknum < len(self.chunkids)-1:
# the length of all but last chunks is chunksize
self.chunksize = len(chunkdata)
else:
# guessing failed, get the length of the first chunk
self.chunksize = len(repo.get_chunk(self.chunkids[0]))
chunkdata = None
# save correct chunksize
self.fs.sizecache[gen] = self.chunksize
if not chunkdata:
blocknum = offset/self.chunksize
blockoffs = offset - blocknum*self.chunksize
if self.lastblock == blocknum:
chunkdata = self.lastdata
else:
chunkdata = repo.get_chunk(self.chunkids[blocknum])
output = []
while True:
output.append(chunkdata[blockoffs:blockoffs+length])
readlength = len(chunkdata) - blockoffs
if length > readlength and blocknum < len(self.chunkids)-1:
length -= readlength
blocknum += 1
blockoffs = 0
chunkdata = repo.get_chunk(self.chunkids[blocknum])
else:
self.lastblock = blocknum
self.lastdata = chunkdata
break
return ''.join(output)
except (OSError, IOError), e:
logging.debug('FUSE Expected exception')
raise
except:
logging.exception('Unexpected exception')
raise
def release(self, flags):
logging.debug('FUSE file release %d', flags)
self.lastdata = None
return 0
def fsync(self, isfsyncfile):
logging.debug('FUSE file fsync')
return 0
def flush(self):
logging.debug('FUSE file flush')
return 0
def ftruncate(self, size):
logging.debug('FUSE file ftruncate %d', size)
return 0
def lock(self, cmd, owner, **kw):
logging.debug('FUSE file lock %s %s %s', repr(cmd), repr(owner), repr(kw))
raise IOError(errno.EOPNOTSUPP, 'Operation not supported')
class ObnamFuse(fuse.Fuse):
'''FUSE main class
'''
MAX_METADATA_CACHE = 512
def sigUSR1(self):
if self.obnam.app.settings['viewmode'] == 'multiple':
repo = self.obnam.app.open_repository()
repo.open_client(self.obnam.app.settings['client-name'])
generations = [gen for gen in repo.list_generations()
if not repo.get_is_checkpoint(gen)]
self.obnam.repo = repo
self.rootstat, self.rootlist = self.multiple_root_list(generations)
self.metadatacache.clear()
def get_metadata(self, path):
#logging.debug('FUSE get_metadata(%s)', path)
try:
return self.metadatacache[path]
except KeyError:
if len(self.metadatacache) > self.MAX_METADATA_CACHE:
self.metadatacache.clear()
metadata = self.obnam.repo.get_metadata(*self.get_gen_path(path))
self.metadatacache[path] = metadata
return metadata
def get_stat(self, path):
logging.debug('FUSE get_stat(%s)', path)
metadata = self.get_metadata(path)
st = fuse.Stat()
st.st_mode = metadata.st_mode
st.st_dev = metadata.st_dev
st.st_nlink = metadata.st_nlink
st.st_uid = metadata.st_uid
st.st_gid = metadata.st_gid
st.st_size = metadata.st_size
st.st_atime = metadata.st_atime_sec
st.st_mtime = metadata.st_mtime_sec
st.st_ctime = st.st_mtime
return st
def single_root_list(self, gen):
repo = self.obnam.repo
mountroot = self.obnam.mountroot
rootlist = {}
for entry in repo.listdir(gen, mountroot):
path = '/' + entry
rootlist[path] = self.get_stat(path)
rootstat = self.get_stat('/')
return (rootstat, rootlist)
def multiple_root_list(self, generations):
repo = self.obnam.repo
mountroot = self.obnam.mountroot
rootlist = {}
used_generations = []
for gen in generations:
path = '/' + str(gen)
try:
genstat = self.get_stat(path)
start, end = repo.get_generation_times(gen)
genstat.st_ctime = genstat.st_mtime = end
rootlist[path] = genstat
used_generations.append(gen)
except obnamlib.Error:
pass
if not used_generations:
raise obnamlib.Error('No generations found for %s' % mountroot)
latest = used_generations[-1]
laststat = rootlist['/' + str(latest)]
rootstat = fuse.Stat(**laststat.__dict__)
laststat = fuse.Stat(target=str(latest), **laststat.__dict__)
laststat.st_mode &= ~(stat.S_IFDIR | stat.S_IFREG)
laststat.st_mode |= stat.S_IFLNK
rootlist['/latest'] = laststat
pidstat = fuse.Stat(**rootstat.__dict__)
pidstat.st_mode = stat.S_IFREG | stat.S_IRUSR | stat.S_IRGRP | stat.S_IROTH
rootlist['/.pid'] = pidstat
return (rootstat, rootlist)
def init_root(self):
repo = self.obnam.repo
mountroot = self.obnam.mountroot
generations = self.obnam.app.settings['generation']
if self.obnam.app.settings['viewmode'] == 'single':
if len(generations) != 1:
raise obnamlib.Error(
'The single mode wants exactly one generation option')
gen = repo.genspec(generations[0])
if mountroot == '/':
self.get_gen_path = lambda path: (gen, path)
else:
self.get_gen_path = (lambda path : path == '/'
and (gen, mountroot)
or (gen, mountroot + path))
self.rootstat, self.rootlist = self.single_root_list(gen)
logging.debug('FUSE single rootlist %s', repr(self.rootlist))
elif self.obnam.app.settings['viewmode'] == 'multiple':
# we need the list of all real (non-checkpoint) generations
if len(generations) == 1:
generations = [gen for gen in repo.list_generations()
if not repo.get_is_checkpoint(gen)]
if mountroot == '/':
def gen_path_0(path):
if path.count('/') == 1:
gen = path[1:]
return (int(gen), mountroot)
else:
gen, repopath = path[1:].split('/', 1)
return (int(gen), mountroot + repopath)
self.get_gen_path = gen_path_0
else:
def gen_path_n(path):
if path.count('/') == 1:
gen = path[1:]
return (int(gen), mountroot)
else:
gen, repopath = path[1:].split('/', 1)
return (int(gen), mountroot + '/' + repopath)
self.get_gen_path = gen_path_n
self.rootstat, self.rootlist = self.multiple_root_list(generations)
logging.debug('FUSE multiple rootlist %s', repr(self.rootlist))
else:
raise obnamlib.Error('Unknown value for viewmode')
def __init__(self, *args, **kw):
self.obnam = kw['obnam']
ObnamFuseFile.fs = self
self.file_class = ObnamFuseFile
self.metadatacache = {}
self.chunksize = self.obnam.app.settings['chunk-size']
self.sizecache = {}
self.rootlist = None
self.rootstat = None
self.init_root()
fuse.Fuse.__init__(self, *args, **kw)
def getattr(self, path):
try:
if path.count('/') == 1:
if path == '/':
return self.rootstat
elif path in self.rootlist:
return self.rootlist[path]
else:
raise obnamlib.Error('ENOENT')
else:
return self.get_stat(path)
except obnamlib.Error:
raise IOError(errno.ENOENT, 'No such file or directory')
except:
logging.error('Unexpected exception', exc_info=True)
raise
def readdir(self, path, fh):
logging.debug('FUSE readdir(%s, %s)', path, repr(fh))
try:
if path == '/':
listdir = [x[1:] for x in self.rootlist.keys()]
else:
listdir = self.obnam.repo.listdir(*self.get_gen_path(path))
return [fuse.Direntry(name) for name in ['.', '..'] + listdir]
except obnamlib.Error:
raise IOError(errno.EINVAL, 'Invalid argument')
except:
logging.error('Unexpected exception', exc_info=True)
raise
def readlink(self, path):
try:
statdata = self.rootlist.get(path)
if statdata and hasattr(statdata, 'target'):
return statdata.target
metadata = self.get_metadata(path)
if metadata.islink():
return metadata.target
else:
raise IOError(errno.EINVAL, 'Invalid argument')
except obnamlib.Error:
raise IOError(errno.ENOENT, 'No such file or directory')
except:
logging.error('Unexpected exception', exc_info=True)
raise
def statfs(self):
logging.debug('FUSE statfs')
try:
repo = self.obnam.repo
if self.obnam.app.settings['viewmode'] == 'multiple':
blocks = sum(repo.client.get_generation_data(gen)
for gen in repo.list_generations())
files = sum(repo.client.get_generation_file_count(gen)
for gen in repo.list_generations())
else:
gen = self.get_gen_path('/')[0]
blocks = repo.client.get_generation_data(gen)
files = repo.client.get_generation_file_count(gen)
stv = fuse.StatVfs()
stv.f_bsize = 65536
stv.f_frsize = 0
stv.f_blocks = blocks/65536
stv.f_bfree = 0
stv.f_bavail = 0
stv.f_files = files
stv.f_ffree = 0
stv.f_favail = 0
stv.f_flag = 0
stv.f_namemax = 255
#raise OSError(errno.ENOSYS, 'Unimplemented')
return stv
except:
logging.error('Unexpected exception', exc_info=True)
raise
def getxattr(self, path, name, size):
logging.debug('FUSE getxattr %s %s %d', path, name, size)
try:
try:
metadata = self.get_metadata(path)
except ValueError:
return 0
if not metadata.xattr:
return 0
blob = metadata.xattr
sizesize = struct.calcsize('!Q')
name_blob_size = struct.unpack('!Q', blob[:sizesize])[0]
name_blob = blob[sizesize : sizesize + name_blob_size]
name_list = name_blob.split('\0')[:-1]
if name in name_list:
value_blob = blob[sizesize + name_blob_size : ]
idx = name_list.index(name)
fmt = '!' + 'Q' * len(name_list)
lengths_size = sizesize * len(name_list)
lengths_list = struct.unpack(fmt, value_blob[:lengths_size])
if size == 0:
return lengths_list[idx]
pos = lengths_size + sum(lengths_list[:idx])
value = value_blob[pos:pos + lengths_list[idx]]
return value
except obnamlib.Error:
raise IOError(errno.ENOENT, 'No such file or directory')
except:
logging.error('Unexpected exception', exc_info=True)
raise
def listxattr(self, path, size):
logging.debug('FUSE listxattr %s %d', path, size)
try:
metadata = self.get_metadata(path)
if not metadata.xattr:
return 0
blob = metadata.xattr
sizesize = struct.calcsize('!Q')
name_blob_size = struct.unpack('!Q', blob[:sizesize])[0]
if size == 0:
return name_blob_size
name_blob = blob[sizesize : sizesize + name_blob_size]
return name_blob.split('\0')[:-1]
except obnamlib.Error:
raise IOError(errno.ENOENT, 'No such file or directory')
except:
logging.error('Unexpected exception', exc_info=True)
raise
def fsync(self, path, isFsyncFile):
return 0
def chmod(self, path, mode):
raise IOError(errno.EROFS, 'Read only filesystem')
def chown(self, path, uid, gid):
raise IOError(errno.EROFS, 'Read only filesystem')
def link(self, targetPath, linkPath):
raise IOError(errno.EROFS, 'Read only filesystem')
def mkdir(self, path, mode):
raise IOError(errno.EROFS, 'Read only filesystem')
def mknod(self, path, mode, dev):
raise IOError(errno.EROFS, 'Read only filesystem')
def rename(self, oldPath, newPath):
raise IOError(errno.EROFS, 'Read only filesystem')
def rmdir(self, path):
raise IOError(errno.EROFS, 'Read only filesystem')
def symlink(self, targetPath, linkPath):
raise IOError(errno.EROFS, 'Read only filesystem')
def truncate(self, path, size):
raise IOError(errno.EROFS, 'Read only filesystem')
def unlink(self, path):
raise IOError(errno.EROFS, 'Read only filesystem')
def utime(self, path, times):
raise IOError(errno.EROFS, 'Read only filesystem')
def write(self, path, buf, offset):
raise IOError(errno.EROFS, 'Read only filesystem')
def setxattr(self, path, name, val, flags):
raise IOError(errno.EROFS, 'Read only filesystem')
def removexattr(self, path, name):
raise IOError(errno.EROFS, 'Read only filesystem')
class MountPlugin(obnamlib.ObnamPlugin):
'''Mount backup repository as a user-space filesystem.
At the momemnt only a specific generation can be mounted
'''
def enable(self):
mount_group = obnamlib.option_group['mount'] = 'Mounting with FUSE'
self.app.add_subcommand('mount', self.mount,
arg_synopsis='[ROOT]')
self.app.settings.choice(['viewmode'],
['single', 'multiple'],
'"single" directly mount specified generation, '
'"multiple" mount all generations as separate directories',
metavar='MODE',
group=mount_group)
self.app.settings.string_list(['fuse-opt'],
'options to pass directly to Fuse',
metavar='FUSE', group=mount_group)
def mount(self, args):
'''Mount a backup repository as a FUSE filesystem.
This subcommand allows you to access backups in an Obnam
backup repository as normal files and directories. Each
backed up file or directory can be viewed directly, using
a graphical file manager or command line tools.
Example: To mount your backup repository:
mkdir my-fuse
obnam mount --viewmode multiple --to my-fuse
You can then access the backup using commands such as these:
ls -l my-fuse
ls -l my-fuse/latest
diff -u my-fuse/latest/home/liw/README ~/README
You can also restore files by copying them from the
my-fuse directory:
cp -a my-fuse/12765/Maildir ~/Maildir.restored
To un-mount:
fusermount -u my-fuse
'''
if not hasattr(fuse, 'fuse_python_api'):
raise obnamlib.Error('Failed to load module "fuse", '
'try installing python-fuse')
self.app.settings.require('repository')
self.app.settings.require('client-name')
self.app.settings.require('to')
self.repo = self.app.open_repository()
self.repo.open_client(self.app.settings['client-name'])
self.mountroot = (['/'] + self.app.settings['root'] + args)[-1]
if self.mountroot != '/':
self.mountroot = self.mountroot.rstrip('/')
logging.debug('FUSE Mounting %s@%s:%s to %s', self.app.settings['client-name'],
self.app.settings['generation'],
self.mountroot, self.app.settings['to'])
try:
ObnamFuseOptParse.obnam = self
fs = ObnamFuse(obnam=self, parser_class=ObnamFuseOptParse)
signal.signal(signal.SIGUSR1, lambda s,f: fs.sigUSR1())
signal.siginterrupt(signal.SIGUSR1, False)
fs.flags = 0
fs.multithreaded = 0
fs.parse()
fs.main()
except fuse.FuseError, e:
raise obnamlib.Error(repr(e))
self.repo.fs.close()
obnam-1.6.1/obnamlib/plugins/restore_plugin.py 0000644 0001750 0001750 00000032046 12246357067 021377 0 ustar jenkins jenkins # Copyright (C) 2009, 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import logging
import os
import stat
import time
import ttystatus
import obnamlib
class Hardlinks(object):
'''Keep track of inodes with unrestored hardlinks.'''
def __init__(self):
self.inodes = dict()
def key(self, metadata):
return '%s:%s' % (metadata.st_dev, metadata.st_ino)
def add(self, filename, metadata):
self.inodes[self.key(metadata)] = (filename, metadata.st_nlink)
def filename(self, metadata):
key = self.key(metadata)
if key in self.inodes:
return self.inodes[key][0]
else:
return None
def forget(self, metadata):
key = self.key(metadata)
filename, nlinks = self.inodes[key]
if nlinks <= 2:
del self.inodes[key]
else:
self.inodes[key] = (filename, nlinks - 1)
class RestorePlugin(obnamlib.ObnamPlugin):
# A note about the implementation: we need to make sure all the
# files we restore go into the target directory. We do this by
# prefixing all filenames we write to with './', and then using
# os.path.join to put the target directory name at the beginning.
# The './' business is necessary because os.path.join(a,b) returns
# just b if b is an absolute path.
def enable(self):
self.app.add_subcommand('restore', self.restore,
arg_synopsis='[DIRECTORY]...')
self.app.settings.string(['to'], 'where to restore')
self.app.settings.string_list(['generation'],
'which generation to restore',
default=['latest'])
@property
def write_ok(self):
return not self.app.settings['dry-run']
def configure_ttystatus(self):
self.app.ts['current'] = ''
self.app.ts['total'] = 0
self.app.ts['current-bytes'] = 0
self.app.ts['total-bytes'] = 0
self.app.ts.format('%RemainingTime(current-bytes,total-bytes) '
'%Counter(current) files '
'%ByteSize(current-bytes) '
'(%PercentDone(current-bytes,total-bytes)) '
'%ByteSpeed(current-bytes) '
'%Pathname(current)')
def restore(self, args):
'''Restore some or all files from a generation.'''
self.app.settings.require('repository')
self.app.settings.require('client-name')
self.app.settings.require('generation')
self.app.settings.require('to')
logging.debug('restoring generation %s' %
self.app.settings['generation'])
logging.debug('restoring to %s' % self.app.settings['to'])
logging.debug('restoring what: %s' % repr(args))
if not args:
logging.debug('no args given, so restoring everything')
args = ['/']
self.downloaded_bytes = 0
self.file_count = 0
self.started = time.time()
self.repo = self.app.open_repository()
self.repo.open_client(self.app.settings['client-name'])
if self.write_ok:
self.fs = self.app.fsf.new(self.app.settings['to'], create=True)
self.fs.connect()
else:
self.fs = None # this will trigger error if we try to really write
self.hardlinks = Hardlinks()
self.errors = False
generations = self.app.settings['generation']
if len(generations) != 1:
raise obnamlib.Error(
'The restore command wants exactly one generation option')
gen = self.repo.genspec(generations[0])
self.configure_ttystatus()
self.app.ts['total'] = self.repo.client.get_generation_file_count(gen)
self.app.ts['total-bytes'] = self.repo.client.get_generation_data(gen)
self.app.dump_memory_profile('at beginning after setup')
for arg in args:
self.restore_something(gen, arg)
self.app.dump_memory_profile('at restoring %s' % repr(arg))
self.repo.fs.close()
if self.write_ok:
self.fs.close()
self.app.ts.clear()
self.report_stats()
self.app.ts.finish()
if self.errors:
raise obnamlib.Error('There were errors when restoring')
def restore_something(self, gen, root):
for pathname, metadata in self.repo.walk(gen, root, depth_first=True):
self.file_count += 1
self.app.ts['current'] = pathname
self.restore_safely(gen, pathname, metadata)
def restore_safely(self, gen, pathname, metadata):
try:
dirname = os.path.dirname(pathname)
if self.write_ok and not self.fs.exists('./' + dirname):
self.fs.makedirs('./' + dirname)
set_metadata = True
if metadata.isdir():
self.restore_dir(gen, pathname, metadata)
elif metadata.islink():
self.restore_symlink(gen, pathname, metadata)
elif metadata.st_nlink > 1:
link = self.hardlinks.filename(metadata)
if link:
self.restore_hardlink(pathname, link, metadata)
set_metadata = False
else:
self.hardlinks.add(pathname, metadata)
self.restore_first_link(gen, pathname, metadata)
else:
self.restore_first_link(gen, pathname, metadata)
if set_metadata and self.write_ok:
try:
obnamlib.set_metadata(self.fs, './' + pathname, metadata)
except (IOError, OSError), e:
msg = ('Could not set metadata: %s: %d: %s' %
(pathname, e.errno, e.strerror))
logging.error(msg)
self.app.ts.notify(msg)
self.errors = True
except Exception, e:
# Reaching this code path means we've hit a bug, so we log a full traceback.
msg = "Failed to restore %s:" % (pathname,)
logging.exception(msg)
self.app.ts.notify(msg + " " + str(e))
self.errors = True
def restore_dir(self, gen, root, metadata):
logging.debug('restoring dir %s' % root)
if self.write_ok:
if not self.fs.exists('./' + root):
self.fs.mkdir('./' + root)
self.app.dump_memory_profile('after recursing through %s' % repr(root))
def restore_hardlink(self, filename, link, metadata):
logging.debug('restoring hardlink %s to %s' % (filename, link))
if self.write_ok:
self.fs.link('./' + link, './' + filename)
self.hardlinks.forget(metadata)
def restore_symlink(self, gen, filename, metadata):
logging.debug('restoring symlink %s' % filename)
def restore_first_link(self, gen, filename, metadata):
if stat.S_ISREG(metadata.st_mode):
self.restore_regular_file(gen, filename, metadata)
elif stat.S_ISFIFO(metadata.st_mode):
self.restore_fifo(gen, filename, metadata)
elif stat.S_ISSOCK(metadata.st_mode):
self.restore_socket(gen, filename, metadata)
elif stat.S_ISBLK(metadata.st_mode) or stat.S_ISCHR(metadata.st_mode):
self.restore_device(gen, filename, metadata)
else:
msg = ('Unknown file type: %s (%o)' %
(filename, metadata.st_mode))
logging.error(msg)
self.app.ts.notify(msg)
def restore_regular_file(self, gen, filename, metadata):
logging.debug('restoring regular %s' % filename)
if self.write_ok:
f = self.fs.open('./' + filename, 'wb')
summer = self.repo.new_checksummer()
try:
contents = self.repo.get_file_data(gen, filename)
if contents is None:
chunkids = self.repo.get_file_chunks(gen, filename)
self.restore_chunks(f, chunkids, summer)
else:
f.write(contents)
summer.update(contents)
self.downloaded_bytes += len(contents)
except obnamlib.MissingFilterError, e:
msg = 'Missing filter error during restore: %s' % filename
logging.error(msg)
self.app.ts.notify(msg)
self.errors = True
f.close()
correct_checksum = metadata.md5
if summer.digest() != correct_checksum:
msg = 'File checksum restore error: %s' % filename
logging.error(msg)
self.app.ts.notify(msg)
self.errors = True
def restore_chunks(self, f, chunkids, checksummer):
zeroes = ''
hole_at_end = False
for chunkid in chunkids:
data = self.repo.get_chunk(chunkid)
self.verify_chunk_checksum(data, chunkid)
checksummer.update(data)
self.downloaded_bytes += len(data)
if len(data) != len(zeroes):
zeroes = '\0' * len(data)
if data == zeroes:
f.seek(len(data), 1)
hole_at_end = True
else:
f.write(data)
hole_at_end = False
self.app.ts['current-bytes'] += len(data)
if hole_at_end:
pos = f.tell()
if pos > 0:
f.seek(-1, 1)
f.write('\0')
def verify_chunk_checksum(self, data, chunkid):
checksum = self.repo.checksum(data)
try:
wanted = self.repo.chunklist.get_checksum(chunkid)
except KeyError:
# Chunk might not be in the tree, but that does not
# mean it is invalid. We'll assume it is valid.
return
if checksum != wanted:
raise obnamlib.Error('chunk %s checksum error' % chunkid)
def restore_fifo(self, gen, filename, metadata):
logging.debug('restoring fifo %s' % filename)
if self.write_ok:
self.fs.mknod('./' + filename, metadata.st_mode)
def restore_socket(self, gen, filename, metadata):
logging.debug('restoring socket %s' % filename)
if self.write_ok:
self.fs.mknod('./' + filename, metadata.st_mode)
def restore_device(self, gen, filename, metadata):
logging.debug('restoring device %s' % filename)
if self.write_ok:
self.fs.mknod('./' + filename, metadata.st_mode)
def report_stats(self):
size_table = [
(1024**4, 'TiB'),
(1024**3, 'GiB'),
(1024**2, 'MiB'),
(1024**1, 'KiB'),
(0, 'B')
]
for size_base, size_unit in size_table:
if self.downloaded_bytes >= size_base:
if size_base > 0:
size_amount = (float(self.downloaded_bytes) /
float(size_base))
else:
size_amount = float(self.downloaded_bytes)
break
speed_table = [
(1024**3, 'GiB/s'),
(1024**2, 'MiB/s'),
(1024**1, 'KiB/s'),
(0, 'B/s')
]
duration = time.time() - self.started
speed = float(self.downloaded_bytes) / duration
for speed_base, speed_unit in speed_table:
if speed >= speed_base:
if speed_base > 0:
speed_amount = speed / speed_base
else:
speed_amount = speed
break
duration_string = ''
seconds = duration
if seconds >= 3600:
duration_string += '%dh' % int(seconds/3600)
seconds %= 3600
if seconds >= 60:
duration_string += '%dm' % int(seconds/60)
seconds %= 60
if seconds > 0:
duration_string += '%ds' % round(seconds)
logging.info('Restore performance statistics:')
logging.info('* files restored: %s' % self.file_count)
logging.info('* downloaded data: %s bytes (%s %s)' %
(self.downloaded_bytes, size_amount, size_unit))
logging.info('* duration: %s s' % duration)
logging.info('* average speed: %s %s' % (speed_amount, speed_unit))
self.app.ts.notify(
'Restored %d files, '
'downloaded %.1f %s in %s at %.1f %s average speed' %
(self.file_count,
size_amount, size_unit,
duration_string, speed_amount, speed_unit))
obnam-1.6.1/obnamlib/plugins/sftp_plugin.py 0000644 0001750 0001750 00000055037 12246357067 020675 0 ustar jenkins jenkins # Copyright (C) 2009 Lars Wirzenius
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
import errno
import hashlib
import logging
import os
import pwd
import random
import socket
import stat
import subprocess
import time
import traceback
import urlparse
# As of 2010-07-10, Debian's paramiko package triggers
# RandomPool_DeprecationWarning. This will eventually be fixed. Until
# then, there is no point in spewing the warning to the user, who can't
# do nothing.
# http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=586925
import warnings
with warnings.catch_warnings():
warnings.simplefilter('ignore')
import paramiko
import obnamlib
def ioerror_to_oserror(method):
'''Decorator to convert an IOError exception to OSError.
Python's os.* raise OSError, mostly, but paramiko's corresponding
methods raise IOError. This decorator fixes that.
'''
def helper(self, filename, *args, **kwargs):
try:
return method(self, filename, *args, **kwargs)
except IOError, e:
raise OSError(e.errno, e.strerror or str(e), filename)
return helper
class SSHChannelAdapter(object):
'''Take an ssh subprocess and pretend it is a paramiko Channel.'''
# This is inspired by the ssh.py module in bzrlib.
def __init__(self, proc):
self.proc = proc
def send(self, data):
return os.write(self.proc.stdin.fileno(), data)
def recv(self, count):
try:
return os.read(self.proc.stdout.fileno(), count)
except socket.error, e:
if e.args[0] in (errno.EPIPE, errno.ECONNRESET, errno.ECONNABORTED,
errno.EBADF):
# Connection has closed. Paramiko expects an empty string in
# this case, not an exception.
return ''
raise
def get_name(self):
return 'obnam SSHChannelAdapter'
def close(self):
logging.debug('SSHChannelAdapter.close called')
for func in [self.proc.stdin.close, self.proc.stdout.close,
self.proc.wait]:
try:
func()
except OSError:
pass
class SftpFS(obnamlib.VirtualFileSystem):
'''A VFS implementation for SFTP.
'''
# 32 KiB is the chunk size that gives me the fastest speed
# for sftp transfers. I don't know why the size matters.
chunk_size = 32 * 1024
def __init__(self, baseurl, create=False, settings=None):
obnamlib.VirtualFileSystem.__init__(self, baseurl)
self.sftp = None
self.settings = settings
self._roundtrips = 0
self._initial_dir = None
self.reinit(baseurl, create=create)
# Backwards compatibility with old, deprecated option:
if settings and settings['strict-ssh-host-keys']:
settings["ssh-host-keys-check"] = "yes"
def _delay(self):
self._roundtrips += 1
if self.settings:
ms = self.settings['sftp-delay']
if ms > 0:
time.sleep(ms * 0.001)
def log_stats(self):
obnamlib.VirtualFileSystem.log_stats(self)
logging.info('VFS: baseurl=%s roundtrips=%s' %
(self.baseurl, self._roundtrips))
def _to_string(self, str_or_unicode):
if type(str_or_unicode) is unicode:
return str_or_unicode.encode('utf-8')
else:
return str_or_unicode
def _create_root_if_missing(self):
try:
self.mkdir(self.path)
except OSError, e:
# sftp/paramiko does not give us a useful errno so we hope
# for the best
pass
self.create_path_if_missing = False # only create once
def connect(self):
try_openssh = not self.settings or not self.settings['pure-paramiko']
if not try_openssh or not self._connect_openssh():
self._connect_paramiko()
if self.create_path_if_missing:
self._create_root_if_missing()
self.chdir('.')
self._initial_dir = self.getcwd()
self.chdir(self.path)
def _connect_openssh(self):
executable = 'ssh'
args = ['-oForwardX11=no', '-oForwardAgent=no',
'-oClearAllForwardings=yes', '-oProtocol=2',
'-s']
if self.settings and self.settings['ssh-command']:
executable = self.settings["ssh-command"]
# default user/port from ssh (could be a per host configuration)
if self.port:
args += ['-p', str(self.port)]
if self.user:
args += ['-l', self.user]
if self.settings and self.settings['ssh-key']:
args += ['-i', self.settings['ssh-key']]
if (self.settings and
self.settings['ssh-host-keys-check'] != "ssh-config"):
value = self.settings['ssh-host-keys-check']
args += ['-o', 'StrictHostKeyChecking=%s' % (value,)]
if self.settings and self.settings['ssh-known-hosts']:
args += ['-o',
'UserKnownHostsFile=%s' %
self.settings['ssh-known-hosts']]
args += [self.host, 'sftp']
# prepend the executable to the argument list
args.insert(0, executable)
logging.debug('executing openssh: %s' % args)
try:
proc = subprocess.Popen(args,
stdin=subprocess.PIPE,
stdout=subprocess.PIPE,
close_fds=True)
except OSError:
return False
self.transport = None
self.sftp = paramiko.SFTPClient(SSHChannelAdapter(proc))
return True
def _connect_paramiko(self):
logging.debug(
'connect_paramiko: host=%s port=%s' % (self.host, self.port))
if self.port:
remote = (self.host, self.port)
else:
remote = (self.host)
self.transport = paramiko.Transport(remote)
self.transport.connect()
logging.debug('connect_paramiko: connected')
try:
self._check_host_key(self.host)
except BaseException, e:
self.transport.close()
self.transport = None
raise
logging.debug('connect_paramiko: host key checked')
self._authenticate(self.user)
logging.debug('connect_paramiko: authenticated')
self.sftp = paramiko.SFTPClient.from_transport(self.transport)
logging.debug('connect_paramiko: end')
def _check_host_key(self, hostname):
logging.debug('checking ssh host key for %s' % hostname)
offered_key = self.transport.get_remote_server_key()
known_hosts_path = self.settings['ssh-known-hosts']
known_hosts = paramiko.util.load_host_keys(known_hosts_path)
known_keys = known_hosts.lookup(hostname)
if known_keys is None:
if self.settings['ssh-host-keys-check'] == 'yes':
raise obnamlib.Error('No known host key for %s' % hostname)
logging.warning('No known host keys for %s; accepting offered key'
% hostname)
return
offered_type = offered_key.get_name()
if not known_keys.has_key(offered_type):
if self.settings['ssh-host-keys-check'] == 'yes':
raise obnamlib.Error('No known type %s host key for %s' %
(offered_type, hostname))
logging.warning('No known host key of type %s for %s; accepting '
'offered key' % (offered_type, hostname))
known_key = known_keys[offered_type]
if offered_key != known_key:
raise obnamlib.Error('SSH server %s offered wrong public key' %
hostname)
logging.debug('Host key for %s OK' % hostname)
def _authenticate(self, username):
if not username:
username = self._get_username()
for key in self._find_auth_keys():
try:
self.transport.auth_publickey(username, key)
return
except paramiko.SSHException:
pass
raise obnamlib.Error('Can\'t authenticate to SSH server using key.')
def _find_auth_keys(self):
if self.settings and self.settings['ssh-key']:
return [self._load_from_key_file(self.settings['ssh-key'])]
else:
return self._load_from_agent()
def _load_from_key_file(self, filename):
try:
key = paramiko.RSAKey.from_private_key_file(filename)
except paramiko.PasswordRequiredException:
password = getpass.getpass('RSA key password for %s: ' %
filename)
key = paramiko.RSAKey.from_private_key_file(filename, password)
return key
def _load_from_agent(self):
agent = paramiko.Agent()
return agent.get_keys()
def close(self):
logging.debug('SftpFS.close called')
self.sftp.close()
self.sftp = None
if self.transport:
self.transport.close()
self.transport = None
obnamlib.VirtualFileSystem.close(self)
self._delay()
@ioerror_to_oserror
def reinit(self, baseurl, create=False):
scheme, netloc, path, query, fragment = urlparse.urlsplit(baseurl)
if scheme != 'sftp':
raise obnamlib.Error('SftpFS used with non-sftp URL: %s' % baseurl)
if '@' in netloc:
user, netloc = netloc.split('@', 1)
else:
user = None
if ':' in netloc:
host, port = netloc.split(':', 1)
if port == '':
port = None
else:
try:
port = int(port)
except ValueError, e:
msg = ('Invalid port number %s in %s: %s' %
(port, baseurl, str(e)))
logging.error(msg)
raise obnamlib.Error(msg)
else:
host = netloc
port = None
if path.startswith('/~/'):
path = path[3:]
self.host = host
self.port = port
self.user = user
self.path = path
self.create_path_if_missing = create
self._delay()
if self.sftp:
if create:
self._create_root_if_missing()
logging.debug('chdir to %s' % path)
self.sftp.chdir(self._initial_dir)
self.sftp.chdir(path)
def _get_username(self):
return pwd.getpwuid(os.getuid()).pw_name
def getcwd(self):
self._delay()
return self._to_string(self.sftp.getcwd())
@ioerror_to_oserror
def chdir(self, pathname):
self._delay()
self.sftp.chdir(pathname)
@ioerror_to_oserror
def listdir(self, pathname):
self._delay()
return [self._to_string(x) for x in self.sftp.listdir(pathname)]
def _force_32bit_timestamp(self, timestamp):
if timestamp is None:
return None
max_int32 = 2**31 - 1 # max positive 32 signed integer value
if timestamp > max_int32:
timestamp -= 2**32
if timestamp > max_int32:
timestamp = max_int32 # it's too large, need to lose info
return timestamp
def _fix_stat(self, pathname, st):
# SFTP and/or paramiko fail to return some of the required fields,
# so we add them, using faked data.
defaults = {
'st_blocks': (st.st_size / 512) +
(1 if st.st_size % 512 else 0),
'st_dev': 0,
'st_ino': int(hashlib.md5(pathname).hexdigest()[:8], 16),
'st_nlink': 1,
}
for name, value in defaults.iteritems():
if not hasattr(st, name):
setattr(st, name, value)
# Paramiko seems to deal with unsigned timestamps only, at least
# in version 1.7.6. We therefore force the timestamps into
# a signed 32-bit value. This limits the range, but allows
# timestamps that are negative (before 1970). Once paramiko is
# fixed, this code can be removed.
st.st_mtime_sec = self._force_32bit_timestamp(st.st_mtime)
st.st_atime_sec = self._force_32bit_timestamp(st.st_atime)
# Within Obnam, we pretend stat results have st_Xtime_sec and
# st_Xtime_nsec, but not st_Xtime. Remove those fields.
del st.st_mtime
del st.st_atime
# We only get integer timestamps, so set these explicitly to 0.
st.st_mtime_nsec = 0
st.st_atime_nsec = 0
return st
@ioerror_to_oserror
def listdir2(self, pathname):
self._delay()
attrs = self.sftp.listdir_attr(pathname)
pairs = [(self._to_string(st.filename), st) for st in attrs]
fixed = [(name, self._fix_stat(name, st)) for name, st in pairs]
return fixed
def lock(self, lockname, data):
try:
self.write_file(lockname, data)
except OSError, e:
raise obnamlib.LockFail('Failure get lock %s' % lockname)
def unlock(self, lockname):
self._remove_if_exists(lockname)
def exists(self, pathname):
try:
self.lstat(pathname)
except OSError:
return False
else:
return True
def isdir(self, pathname):
self._delay()
try:
st = self.lstat(pathname)
except OSError:
return False
else:
return stat.S_ISDIR(st.st_mode)
def mknod(self, pathname, mode):
# SFTP does not provide an mknod, so we can't do this. We
# raise an exception, so upper layers can handle this (we _could_
# just fail silently, but that would be silly.)
raise NotImplementedError('mknod on SFTP: %s' % pathname)
@ioerror_to_oserror
def mkdir(self, pathname):
self._delay()
self.sftp.mkdir(pathname)
@ioerror_to_oserror
def makedirs(self, pathname):
parent = os.path.dirname(pathname)
if parent and parent != pathname and not self.exists(parent):
self.makedirs(parent)
self.mkdir(pathname)
@ioerror_to_oserror
def rmdir(self, pathname):
self._delay()
self.sftp.rmdir(pathname)
@ioerror_to_oserror
def remove(self, pathname):
self._delay()
self.sftp.remove(pathname)
def _remove_if_exists(self, pathname):
'''Like remove, but OK if file does not exist.'''
try:
self.remove(pathname)
except OSError, e:
if e.errno != errno.ENOENT:
raise
@ioerror_to_oserror
def rename(self, old, new):
self._delay()
self._remove_if_exists(new)
self.sftp.rename(old, new)
@ioerror_to_oserror
def lstat(self, pathname):
self._delay()
st = self.sftp.lstat(pathname)
self._fix_stat(pathname, st)
return st
@ioerror_to_oserror
def lchown(self, pathname, uid, gid):
self._delay()
if stat.S_ISLNK(self.lstat(pathname).st_mode):
logging.warning('NOT changing ownership of symlink %s' % pathname)
else:
self.sftp.chown(pathname, uid, gid)
@ioerror_to_oserror
def chmod_symlink(self, pathname, mode):
# SFTP and/or paramiko don't have lchmod at all, so we can't
# actually do this. However, we at least check that pathname
# exists.
self.lstat(pathname)
@ioerror_to_oserror
def chmod_not_symlink(self, pathname, mode):
self._delay()
self.sftp.chmod(pathname, mode)
@ioerror_to_oserror
def lutimes(self, pathname, atime_sec, atime_nsec, mtime_sec, mtime_nsec):
# FIXME: This does not work for symlinks!
# Sftp does not have a way of doing that. This means if the restore
# target is over sftp, symlinks and their targets will have wrong
# mtimes.
self._delay()
if getattr(self, 'lutimes_warned', False):
logging.warning('lutimes used over SFTP, this does not work '
'against symlinks (warning appears only first '
'time)')
self.lutimes_warned = True
self.sftp.utime(pathname, (atime_sec, mtime_sec))
def link(self, existing_path, new_path):
raise obnamlib.Error('Cannot hardlink on SFTP. Sorry.')
def readlink(self, symlink):
self._delay()
return self._to_string(self.sftp.readlink(symlink))
@ioerror_to_oserror
def symlink(self, source, destination):
self._delay()
self.sftp.symlink(source, destination)
def open(self, pathname, mode, bufsize=-1):
self._delay()
return self.sftp.file(pathname, mode, bufsize=bufsize)
def cat(self, pathname):
self._delay()
f = self.open(pathname, 'r')
f.prefetch()
chunks = []
while True:
chunk = f.read(self.chunk_size)
if not chunk:
break
chunks.append(chunk)
self.bytes_read += len(chunk)
f.close()
return ''.join(chunks)
@ioerror_to_oserror
def write_file(self, pathname, contents):
try:
f = self.open(pathname, 'wx')
except (IOError, OSError), e:
# When the path to the file to be written does not
# exist, we try to create the directories below. Note that
# some SFTP servers return EACCES instead of ENOENT
# when the path to the file does not exist, so we
# do not raise an exception here for both ENOENT
# and EACCES.
if e.errno != errno.ENOENT and e.errno != errno.EACCES:
raise
dirname = os.path.dirname(pathname)
self.makedirs(dirname)
f = self.open(pathname, 'wx')
self._write_helper(f, contents)
f.close()
def _tempfile(self, dirname):
'''Create a new file with a random name, return file handle and name.'''
if dirname:
try:
self.makedirs(dirname)
except OSError:
# We ignore the error, on the assumption that it was due
# to the directory already existing. If it didn't exist
# and the error was for something else, then we'll catch
# that when we open the file for writing.
pass
while True:
i = random.randint(0, 2**64-1)
basename = 'tmp.%x' % i
pathname = os.path.join(dirname, basename)
try:
f = self.open(pathname, 'wx', bufsize=self.chunk_size)
except OSError:
pass
else:
return f, pathname
@ioerror_to_oserror
def overwrite_file(self, pathname, contents):
self._delay()
dirname = os.path.dirname(pathname)
f, tempname = self._tempfile(dirname)
self._write_helper(f, contents)
f.close()
self.rename(tempname, pathname)
def _write_helper(self, f, contents):
for pos in range(0, len(contents), self.chunk_size):
chunk = contents[pos:pos + self.chunk_size]
f.write(chunk)
self.bytes_written += len(chunk)
class SftpPlugin(obnamlib.ObnamPlugin):
def enable(self):
ssh_group = obnamlib.option_group['ssh'] = 'SSH/SFTP'
devel_group = obnamlib.option_group['devel']
self.app.settings.integer(['sftp-delay'],
'add an artificial delay (in milliseconds) '
'to all SFTP transfers',
group=devel_group)
self.app.settings.string(['ssh-key'],
'use FILENAME as the ssh RSA private key for '
'sftp access (default is using keys known '
'to ssh-agent)',
metavar='FILENAME',
group=ssh_group)
self.app.settings.boolean(['strict-ssh-host-keys'],
'DEPRECATED, use --ssh-host-keys-check '
'instead',
group=ssh_group)
self.app.settings.choice(['ssh-host-keys-check'],
['ssh-config', 'yes', 'no', 'ask'],
'If "yes", require that the ssh host key must '
'be known and correct to be accepted. If '
'"no", do not require that. If "ask", the '
'user is interactively asked to accept new '
'hosts. The default ("ssh-config") is to '
'rely on the settings of the underlying '
'SSH client',
metavar='VALUE',
group=ssh_group)
self.app.settings.string(['ssh-known-hosts'],
'filename of the user\'s known hosts file '
'(default: %default)',
metavar='FILENAME',
default=
os.path.expanduser('~/.ssh/known_hosts'),
group=ssh_group)
self.app.settings.string(['ssh-command'],
'alternative executable to be used instead '
'of "ssh" (full path is allowed, no '
'arguments may be added)',
metavar='EXECUTABLE',
group=ssh_group)
self.app.settings.boolean(['pure-paramiko'],
'do not use openssh even if available, '
'use paramiko only instead',
group=ssh_group)
self.app.fsf.register('sftp', SftpFS, settings=self.app.settings)
obnam-1.6.1/obnamlib/plugins/show_plugin.py 0000644 0001750 0001750 00000031652 12246357067 020676 0 ustar jenkins jenkins # Copyright (C) 2009, 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import os
import re
import stat
import sys
import time
import obnamlib
class ShowPlugin(obnamlib.ObnamPlugin):
'''Show information about data in the backup repository.
This implements commands for listing contents of root and client
objects, or the contents of a backup generation.
'''
leftists = (2, 3, 6)
min_widths = (1, 1, 1, 1, 6, 20, 1)
def enable(self):
self.app.add_subcommand('clients', self.clients)
self.app.add_subcommand('generations', self.generations)
self.app.add_subcommand('genids', self.genids)
self.app.add_subcommand('ls', self.ls, arg_synopsis='[FILE]...')
self.app.add_subcommand('diff', self.diff,
arg_synopsis='[GENERATION1] GENERATION2')
self.app.add_subcommand('nagios-last-backup-age',
self.nagios_last_backup_age)
self.app.settings.string(['warn-age'],
'for nagios-last-backup-age: maximum age (by '
'default in hours) for the most recent '
'backup before status is warning. '
'Accepts one char unit specifier '
'(s,m,h,d for seconds, minutes, hours, '
'and days.',
metavar='AGE',
default=obnamlib.DEFAULT_NAGIOS_WARN_AGE)
self.app.settings.string(['critical-age'],
'for nagios-last-backup-age: maximum age '
'(by default in hours) for the most '
'recent backup before statis is critical. '
'Accepts one char unit specifier '
'(s,m,h,d for seconds, minutes, hours, '
'and days.',
metavar='AGE',
default=obnamlib.DEFAULT_NAGIOS_WARN_AGE)
def open_repository(self, require_client=True):
self.app.settings.require('repository')
if require_client:
self.app.settings.require('client-name')
self.repo = self.app.open_repository()
if require_client:
self.repo.open_client(self.app.settings['client-name'])
def clients(self, args):
'''List clients using the repository.'''
self.open_repository(require_client=False)
for client_name in self.repo.list_clients():
self.app.output.write('%s\n' % client_name)
self.repo.fs.close()
def generations(self, args):
'''List backup generations for client.'''
self.open_repository()
for gen in self.repo.list_generations():
start, end = self.repo.get_generation_times(gen)
is_checkpoint = self.repo.get_is_checkpoint(gen)
if is_checkpoint:
checkpoint = ' (checkpoint)'
else:
checkpoint = ''
sys.stdout.write('%s\t%s .. %s (%d files, %d bytes) %s\n' %
(gen,
self.format_time(start),
self.format_time(end),
self.repo.client.get_generation_file_count(gen),
self.repo.client.get_generation_data(gen),
checkpoint))
self.repo.fs.close()
def nagios_last_backup_age(self, args):
'''Check if the most recent generation is recent enough.'''
try:
self.open_repository()
except obnamlib.Error, e:
self.app.output.write('CRITICAL: %s\n' % e)
sys.exit(2)
most_recent = None
warn_age = self._convert_time(self.app.settings['warn-age'])
critical_age = self._convert_time(self.app.settings['critical-age'])
for gen in self.repo.list_generations():
start, end = self.repo.get_generation_times(gen)
if most_recent is None or start > most_recent: most_recent = start
self.repo.fs.close()
now = self.app.time()
if most_recent is None:
# the repository is empty / the client does not exist
self.app.output.write('CRITICAL: no backup found.\n')
sys.exit(2)
elif (now - most_recent > critical_age):
self.app.output.write(
'CRITICAL: backup is old. last backup was %s.\n' %
(self.format_time(most_recent)))
sys.exit(2)
elif (now - most_recent > warn_age):
self.app.output.write(
'WARNING: backup is old. last backup was %s.\n' %
self.format_time(most_recent))
sys.exit(1)
self.app.output.write(
'OK: backup is recent. last backup was %s.\n' %
self.format_time(most_recent))
def genids(self, args):
'''List generation ids for client.'''
self.open_repository()
for gen in self.repo.list_generations():
sys.stdout.write('%s\n' % gen)
self.repo.fs.close()
def ls(self, args):
'''List contents of a generation.'''
self.open_repository()
if len(args) is 0:
args = ['/']
for gen in self.app.settings['generation']:
gen = self.repo.genspec(gen)
started, ended = self.repo.client.get_generation_times(gen)
started = self.format_time(started)
ended = self.format_time(ended)
self.app.output.write(
'Generation %s (%s - %s)\n' % (gen, started, ended))
for ls_file in args:
ls_file = self.remove_trailing_slashes(ls_file)
self.show_objects(gen, ls_file)
self.repo.fs.close()
def remove_trailing_slashes(self, filename):
while filename.endswith('/') and filename != '/':
filename = filename[:-1]
return filename
def format_time(self, timestamp):
return time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(timestamp))
def isdir(self, gen, filename):
metadata = self.repo.get_metadata(gen, filename)
return metadata.isdir()
def show_objects(self, gen, dirname):
self.show_item(gen, dirname)
subdirs = []
for basename in sorted(self.repo.listdir(gen, dirname)):
full = os.path.join(dirname, basename)
if self.isdir(gen, full):
subdirs.append(full)
else:
self.show_item(gen, full)
for subdir in subdirs:
self.show_objects(gen, subdir)
def show_item(self, gen, filename):
fields = self.fields(gen, filename)
widths = [
1, # mode
5, # nlink
-8, # owner
-8, # group
10, # size
1, # mtime
-1, # name
]
result = []
for i in range(len(fields)):
if widths[i] < 0:
fmt = '%-*s'
else:
fmt = '%*s'
result.append(fmt % (abs(widths[i]), fields[i]))
self.app.output.write('%s\n' % ' '.join(result))
def show_diff_for_file(self, gen, fullname, change_char):
'''Show what has changed for a single file.
change_char is a single char (+,- or *) indicating whether a file
got added, removed or altered.
If --verbose, just show all the details as ls shows, otherwise
show just the file's full name.
'''
if self.app.settings['verbose']:
sys.stdout.write('%s ' % change_char)
self.show_item(gen, fullname)
else:
self.app.output.write('%s %s\n' % (change_char, fullname))
def show_diff_for_common_file(self, gen1, gen2, fullname, subdirs):
changed = False
if self.isdir(gen1, fullname) != self.isdir(gen2, fullname):
changed = True
elif self.isdir(gen2, fullname):
subdirs.append(fullname)
else:
# Files are both present and neither is a directory
# Check md5 sums
md5_1 = self.repo.get_metadata(gen1, fullname)
md5_2 = self.repo.get_metadata(gen2, fullname)
if md5_1 != md5_2:
changed = True
if changed:
self.show_diff_for_file(gen2, fullname, '*')
def show_diff(self, gen1, gen2, dirname):
# This set contains the files from the old/src generation
set1 = self.repo.listdir(gen1, dirname)
subdirs = []
# These are the new/dst generation files
for basename in sorted(self.repo.listdir(gen2, dirname)):
full = os.path.join(dirname, basename)
if basename in set1:
# Its in both generations
set1.remove(basename)
self.show_diff_for_common_file(gen1, gen2, full, subdirs)
else:
# Its only in set2 - the file/dir got added
self.show_diff_for_file(gen2, full, '+')
for basename in sorted(set1):
# This was only in gen1 - it got removed
full = os.path.join(dirname, basename)
self.show_diff_for_file(gen1, full, '-')
for subdir in subdirs:
self.show_diff(gen1, gen2, subdir)
def diff(self, args):
'''Show difference between two generations.'''
if len(args) not in (1, 2):
raise obnamlib.Error('Need one or two generations')
self.open_repository()
if len(args) == 1:
gen2 = self.repo.genspec(args[0])
# Now we have the dst/second generation for show_diff. Use
# genids/list_generations to find the previous generation
genids = self.repo.list_generations()
index = genids.index(gen2)
if index == 0:
raise obnamlib.Error(
'Can\'t show first generation. Use \'ls\' instead')
gen1 = genids[index - 1]
else:
gen1 = self.repo.genspec(args[0])
gen2 = self.repo.genspec(args[1])
self.show_diff(gen1, gen2, '/')
self.repo.fs.close()
def fields(self, gen, full):
metadata = self.repo.get_metadata(gen, full)
perms = ['?'] + ['-'] * 9
tab = [
(stat.S_IFREG, 0, '-'),
(stat.S_IFDIR, 0, 'd'),
(stat.S_IFLNK, 0, 'l'),
(stat.S_IFIFO, 0, 'p'),
(stat.S_IRUSR, 1, 'r'),
(stat.S_IWUSR, 2, 'w'),
(stat.S_IXUSR, 3, 'x'),
(stat.S_IRGRP, 4, 'r'),
(stat.S_IWGRP, 5, 'w'),
(stat.S_IXGRP, 6, 'x'),
(stat.S_IROTH, 7, 'r'),
(stat.S_IWOTH, 8, 'w'),
(stat.S_IXOTH, 9, 'x'),
]
mode = metadata.st_mode or 0
for bitmap, offset, char in tab:
if (mode & bitmap) == bitmap:
perms[offset] = char
perms = ''.join(perms)
timestamp = time.strftime('%Y-%m-%d %H:%M:%S',
time.gmtime(metadata.st_mtime_sec))
if metadata.islink():
name = '%s -> %s' % (full, metadata.target)
else:
name = full
return (perms,
str(metadata.st_nlink or 0),
metadata.username or '',
metadata.groupname or '',
str(metadata.st_size or 0),
timestamp,
name)
def format(self, fields):
return ' '. join(self.align(widths[i], fields[i], i)
for i in range(len(fields)))
def align(self, width, field, field_no):
if field_no in self.leftists:
return '%-*s' % (width, field)
else:
return '%*s' % (width, field)
def _convert_time(self, s, default_unit='h'):
m = re.match('([0-9]+)([smhdw])?$', s)
if m is None: raise ValueError
ticks = int(m.group(1))
unit = m.group(2)
if unit is None: unit = default_unit
if unit == 's':
None
elif unit == 'm':
ticks *= 60
elif unit == 'h':
ticks *= 60*60
elif unit == 'd':
ticks *= 60*60*24
elif unit == 'w':
ticks *= 60*60*24*7
else:
raise ValueError
return ticks
obnam-1.6.1/obnamlib/plugins/verify_plugin.py 0000644 0001750 0001750 00000015175 12246357067 021224 0 ustar jenkins jenkins # Copyright (C) 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import logging
import os
import random
import stat
import sys
import urlparse
import obnamlib
class Fail(obnamlib.Error):
def __init__(self, filename, reason):
self.filename = filename
self.reason = reason
def __str__(self):
return '%s: %s' % (self.filename, self.reason)
class VerifyPlugin(obnamlib.ObnamPlugin):
def enable(self):
self.app.add_subcommand('verify', self.verify,
arg_synopsis='[DIRECTORY]...')
self.app.settings.integer(['verify-randomly'],
'verify N files randomly from the backup '
'(default is zero, meaning everything)',
metavar='N')
def verify(self, args):
'''Verify that live data and backed up data match.'''
self.app.settings.require('repository')
self.app.settings.require('client-name')
self.app.settings.require('generation')
if len(self.app.settings['generation']) != 1:
raise obnamlib.Error(
'verify must be given exactly one generation')
logging.debug('verifying generation %s' %
self.app.settings['generation'])
if not args:
self.app.settings.require('root')
args = self.app.settings['root']
if not args:
logging.debug('no roots/args given, so verifying everything')
args = ['/']
logging.debug('verifying what: %s' % repr(args))
self.repo = self.app.open_repository()
self.repo.open_client(self.app.settings['client-name'])
self.fs = self.app.fsf.new(args[0])
self.fs.connect()
t = urlparse.urlparse(args[0])
root_url = urlparse.urlunparse((t[0], t[1], '/', t[3], t[4], t[5]))
logging.debug('t: %s' % repr(t))
logging.debug('root_url: %s' % repr(root_url))
self.fs.reinit(root_url)
self.failed = False
gen = self.repo.genspec(self.app.settings['generation'][0])
self.app.ts['done'] = 0
self.app.ts['total'] = 0
self.app.ts['filename'] = ''
if not self.app.settings['quiet']:
self.app.ts.format(
'%ElapsedTime() '
'verifying file %Counter(filename)/%Integer(total) '
'%PercentDone(done,total): '
'%Pathname(filename)')
num_randomly = self.app.settings['verify-randomly']
if num_randomly == 0:
self.app.ts['total'] = \
self.repo.client.get_generation_file_count(gen)
for filename, metadata in self.walk(gen, args):
self.app.ts['filename'] = filename
try:
self.verify_metadata(gen, filename, metadata)
except Fail, e:
self.log_fail(e)
else:
if metadata.isfile():
try:
self.verify_regular_file(gen, filename, metadata)
except Fail, e:
self.log_fail(e)
self.app.ts['done'] += 1
else:
logging.debug('verifying %d files randomly' % num_randomly)
self.app.ts['total'] = num_randomly
self.app.ts.notify('finding all files to choose randomly')
filenames = [filename
for filename, metadata in self.walk(gen, args)
if metadata.isfile()]
chosen = []
for i in range(min(num_randomly, len(filenames))):
filename = random.choice(filenames)
filenames.remove(filename)
chosen.append(filename)
for filename in chosen:
self.app.ts['filename'] = filename
metadata = self.repo.get_metadata(gen, filename)
try:
self.verify_metadata(gen, filename, metadata)
self.verify_regular_file(gen, filename, metadata)
except Fail, e:
self.log_fail(e)
self.app.ts['done'] += 1
self.fs.close()
self.repo.fs.close()
self.app.ts.finish()
if self.failed:
sys.exit(1)
print "Verify did not find problems."
def log_fail(self, e):
msg = 'verify failure: %s: %s' % (e.filename, e.reason)
logging.error(msg)
if self.app.settings['quiet']:
sys.stderr.write('%s\n' % msg)
else:
self.app.ts.notify(msg)
self.failed = True
def verify_metadata(self, gen, filename, backed_up):
try:
live_data = obnamlib.read_metadata(self.fs, filename)
except OSError, e:
raise Fail(filename, 'missing or inaccessible: %s' % e.strerror)
for field in obnamlib.metadata_verify_fields:
v1 = getattr(backed_up, field)
v2 = getattr(live_data, field)
if v1 != v2:
raise Fail(filename,
'metadata change: %s (%s vs %s)' % (field, v1, v2))
def verify_regular_file(self, gen, filename, metadata):
logging.debug('verifying regular %s' % filename)
f = self.fs.open(filename, 'r')
chunkids = self.repo.get_file_chunks(gen, filename)
if not self.verify_chunks(f, chunkids):
raise Fail(filename, 'data changed')
f.close()
def verify_chunks(self, f, chunkids):
for chunkid in chunkids:
backed_up = self.repo.get_chunk(chunkid)
live_data = f.read(len(backed_up))
if backed_up != live_data:
return False
return True
def walk(self, gen, args):
'''Iterate over each pathname specified by arguments.
This is a generator.
'''
for arg in args:
scheme, netloc, path, query, fragment = urlparse.urlsplit(arg)
arg = os.path.normpath(path)
for x in self.repo.walk(gen, arg):
yield x
obnam-1.6.1/obnamlib/plugins/vfs_local_plugin.py 0000644 0001750 0001750 00000001531 12246357067 021657 0 ustar jenkins jenkins # Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import logging
import os
import re
import stat
import obnamlib
class VfsLocalPlugin(obnamlib.ObnamPlugin):
def enable(self):
self.app.fsf.register('', obnamlib.LocalFS)
obnam-1.6.1/obnamlib/repo.py 0000644 0001750 0001750 00000067407 12246357067 015633 0 ustar jenkins jenkins # Copyright (C) 2009-2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import errno
import hashlib
import larch
import logging
import os
import random
import re
import stat
import struct
import time
import tracing
import obnamlib
class LockFail(obnamlib.Error):
pass
class BadFormat(obnamlib.Error):
pass
class HookedFS(object):
'''A class to filter read/written data through hooks.'''
def __init__(self, repo, fs, hooks):
self.repo = repo
self.fs = fs
self.hooks = hooks
def __getattr__(self, name):
return getattr(self.fs, name)
def _get_toplevel(self, filename):
parts = filename.split(os.sep)
if len(parts) > 1:
return parts[0]
else: # pragma: no cover
raise obnamlib.Error('File at repository root: %s' % filename)
def cat(self, filename, runfilters=True):
data = self.fs.cat(filename)
toplevel = self._get_toplevel(filename)
if not runfilters:
return data
return self.hooks.filter_read('repository-data', data,
repo=self.repo, toplevel=toplevel)
def lock(self, filename, data):
self.fs.lock(filename, data)
def write_file(self, filename, data, runfilters=True):
tracing.trace('writing hooked %s' % filename)
toplevel = self._get_toplevel(filename)
if runfilters:
data = self.hooks.filter_write('repository-data', data,
repo=self.repo, toplevel=toplevel)
self.fs.write_file(filename, data)
def overwrite_file(self, filename, data, runfilters=True):
tracing.trace('overwriting hooked %s' % filename)
toplevel = self._get_toplevel(filename)
if runfilters:
data = self.hooks.filter_write('repository-data', data,
repo=self.repo, toplevel=toplevel)
self.fs.overwrite_file(filename, data)
class Repository(object):
'''Repository for backup data.
Backup data is put on a virtual file system
(obnamlib.VirtualFileSystem instance), in some form that
the API of this class does not care about.
The repository may contain data for several clients that share
encryption keys. Each client is identified by a name.
The repository has a "root" object, which is conceptually a list of
client names.
Each client in turn is conceptually a list of generations,
which correspond to snapshots of the user data that existed
when the generation was created.
Read-only access to the repository does not require locking.
Write access may affect only the root object, or only a client's
own data, and thus locking may affect only the root, or only
the client.
When a new generation is started, it is a copy-on-write clone
of the previous generation, and the caller needs to modify
the new generation to match the current state of user data.
The file 'metadata/format' at the root of the repository contains the
version of the repository format it uses. The version is
specified using a single integer.
'''
format_version = 6
def __init__(self, fs, node_size, upload_queue_size, lru_size, hooks,
idpath_depth, idpath_bits, idpath_skip, current_time,
lock_timeout, client_name):
self.current_time = current_time
self.setup_hooks(hooks or obnamlib.HookManager())
self.fs = HookedFS(self, fs, self.hooks)
self.node_size = node_size
self.upload_queue_size = upload_queue_size
self.lru_size = lru_size
hider = hashlib.md5()
hider.update(client_name)
self.lockmgr = obnamlib.LockManager(self.fs, lock_timeout,
hider.hexdigest())
self.got_root_lock = False
self._open_client_list()
self.got_shared_lock = False
self.got_client_lock = False
self.current_client = None
self.current_client_id = None
self.new_generation = None
self.added_clients = []
self.removed_clients = []
self.removed_generations = []
self.client = None
self._open_shared()
self.prev_chunkid = None
self.chunk_idpath = larch.IdPath('chunks', idpath_depth,
idpath_bits, idpath_skip)
self._chunks_exists = False
def _open_client_list(self):
self.clientlist = obnamlib.ClientList(self.fs, self.node_size,
self.upload_queue_size,
self.lru_size, self)
def _open_shared(self):
self.chunklist = obnamlib.ChunkList(self.fs, self.node_size,
self.upload_queue_size,
self.lru_size, self)
self.chunksums = obnamlib.ChecksumTree(self.fs, 'chunksums',
len(self.checksum('')),
self.node_size,
self.upload_queue_size,
self.lru_size, self)
def setup_hooks(self, hooks):
self.hooks = hooks
self.hooks.new('repository-toplevel-init')
self.hooks.new_filter('repository-data')
self.hooks.new('repository-add-client')
def checksum(self, data):
'''Return checksum of data.
The checksum is (currently) MD5.
'''
checksummer = self.new_checksummer()
checksummer.update(data)
return checksummer.hexdigest()
def new_checksummer(self):
'''Return a new checksum algorithm.'''
return hashlib.md5()
def acceptable_version(self, version):
'''Are we compatible with on-disk format?'''
return self.format_version == version
def client_dir(self, client_id):
'''Return name of sub-directory for a given client.'''
return str(client_id)
def list_clients(self):
'''Return list of names of clients using this repository.'''
self.check_format_version()
listed = set(self.clientlist.list_clients())
added = set(self.added_clients)
removed = set(self.removed_clients)
clients = listed.union(added).difference(removed)
return list(clients)
def require_root_lock(self):
'''Ensure we have the lock on the repository's root node.'''
if not self.got_root_lock:
raise LockFail('have not got lock on root node')
def require_shared_lock(self):
'''Ensure we have the lock on the shared B-trees except clientlist.'''
if not self.got_shared_lock:
raise LockFail('have not got lock on shared B-trees')
def require_client_lock(self):
'''Ensure we have the lock on the currently open client.'''
if not self.got_client_lock:
raise LockFail('have not got lock on client')
def require_open_client(self):
'''Ensure we have opened the client (r/w or r/o).'''
if self.current_client is None:
raise obnamlib.Error('client is not open')
def require_started_generation(self):
'''Ensure we have started a new generation.'''
if self.new_generation is None:
raise obnamlib.Error('new generation has not started')
def require_no_root_lock(self):
'''Ensure we haven't locked root yet.'''
if self.got_root_lock:
raise obnamlib.Error('We have already locked root, oops')
def require_no_shared_lock(self):
'''Ensure we haven't locked shared B-trees yet.'''
if self.got_shared_lock:
raise obnamlib.Error('We have already locked shared B-trees, oops')
def require_no_client_lock(self):
'''Ensure we haven't locked the per-client B-tree yet.'''
if self.got_client_lock:
raise obnamlib.Error('We have already locked the client, oops')
def lock_root(self):
'''Lock root node.
Raise obnamlib.LockFail if locking fails. Lock will be released
by commit_root() or unlock_root().
'''
tracing.trace('locking root')
self.require_no_root_lock()
self.require_no_client_lock()
self.require_no_shared_lock()
self.lockmgr.lock(['.'])
self.check_format_version()
self.got_root_lock = True
self.added_clients = []
self.removed_clients = []
self._write_format_version(self.format_version)
self.clientlist.start_changes()
def unlock_root(self):
'''Unlock root node without committing changes made.'''
tracing.trace('unlocking root')
self.require_root_lock()
self.added_clients = []
self.removed_clients = []
self.lockmgr.unlock(['.'])
self.got_root_lock = False
self._open_client_list()
def commit_root(self):
'''Commit changes to root node, and unlock it.'''
tracing.trace('committing root')
self.require_root_lock()
for client_name in self.added_clients:
self.clientlist.add_client(client_name)
self.hooks.call('repository-add-client',
self.clientlist, client_name)
self.added_clients = []
for client_name in self.removed_clients:
client_id = self.clientlist.get_client_id(client_name)
client_dir = self.client_dir(client_id)
if client_id is not None and self.fs.exists(client_dir):
self.fs.rmtree(client_dir)
self.clientlist.remove_client(client_name)
self.clientlist.commit()
self.unlock_root()
def get_format_version(self):
'''Return (major, minor) of the on-disk format version.
If on-disk repository does not have a version yet, return None.
'''
if self.fs.exists('metadata/format'):
data = self.fs.cat('metadata/format', runfilters=False)
lines = data.splitlines()
line = lines[0]
try:
version = int(line)
except ValueError, e: # pragma: no cover
msg = ('Invalid repository format version (%s) -- '
'forgot encryption?' %
repr(line))
raise obnamlib.Error(msg)
return version
else:
return None
def _write_format_version(self, version):
'''Write the desired format version to the repository.'''
tracing.trace('write format version')
if not self.fs.exists('metadata'):
self.fs.mkdir('metadata')
self.fs.overwrite_file('metadata/format', '%s\n' % version,
runfilters=False)
def check_format_version(self):
'''Verify that on-disk format version is compatbile.
If not, raise BadFormat.
'''
on_disk = self.get_format_version()
if on_disk is not None and not self.acceptable_version(on_disk):
raise BadFormat('On-disk repository format %s is incompatible '
'with program format %s; you need to use a '
'different version of Obnam' %
(on_disk, self.format_version))
def add_client(self, client_name):
'''Add a new client to the repository.'''
tracing.trace('client_name=%s', client_name)
self.require_root_lock()
if client_name in self.list_clients():
raise obnamlib.Error('client %s already exists in repository' %
client_name)
self.added_clients.append(client_name)
def remove_client(self, client_name):
'''Remove a client from the repository.
This removes all data related to the client, including all
actual file data unless other clients also use it.
'''
tracing.trace('client_name=%s', client_name)
self.require_root_lock()
if client_name not in self.list_clients():
raise obnamlib.Error('client %s does not exist' % client_name)
self.removed_clients.append(client_name)
@property
def shared_dirs(self):
return [self.chunklist.dirname, self.chunksums.dirname,
self.chunk_idpath.dirname]
def lock_shared(self):
'''Lock a client for exclusive write access.
Raise obnamlib.LockFail if locking fails. Lock will be released
by commit_client() or unlock_client().
'''
tracing.trace('locking shared')
self.require_no_shared_lock()
self.check_format_version()
self.lockmgr.lock(self.shared_dirs)
self.got_shared_lock = True
tracing.trace('starting changes in chunksums and chunklist')
self.chunksums.start_changes()
self.chunklist.start_changes()
# Initialize the chunks directory for encryption, etc, if it just
# got created.
dirname = self.chunk_idpath.dirname
filenames = self.fs.listdir(dirname)
if filenames == [] or filenames == ['lock']:
self.hooks.call('repository-toplevel-init', self, dirname)
def commit_shared(self):
'''Commit changes to shared B-trees.'''
tracing.trace('committing shared')
self.require_shared_lock()
self.chunklist.commit()
self.chunksums.commit()
self.unlock_shared()
def unlock_shared(self):
'''Unlock currently locked shared B-trees.'''
tracing.trace('unlocking shared')
self.require_shared_lock()
self.lockmgr.unlock(self.shared_dirs)
self.got_shared_lock = False
self._open_shared()
def lock_client(self, client_name):
'''Lock a client for exclusive write access.
Raise obnamlib.LockFail if locking fails. Lock will be released
by commit_client() or unlock_client().
'''
tracing.trace('client_name=%s', client_name)
self.require_no_client_lock()
self.require_no_shared_lock()
self.check_format_version()
client_id = self.clientlist.get_client_id(client_name)
if client_id is None:
raise LockFail('client %s does not exist' % client_name)
client_dir = self.client_dir(client_id)
if not self.fs.exists(client_dir):
self.fs.mkdir(client_dir)
self.hooks.call('repository-toplevel-init', self, client_dir)
self.lockmgr.lock([client_dir])
self.got_client_lock = True
self.current_client = client_name
self.current_client_id = client_id
self.added_generations = []
self.removed_generations = []
self.client = obnamlib.ClientMetadataTree(self.fs, client_dir,
self.node_size,
self.upload_queue_size,
self.lru_size, self)
self.client.init_forest()
def unlock_client(self):
'''Unlock currently locked client, without committing changes.'''
tracing.trace('unlocking client')
self.require_client_lock()
self.new_generation = None
self._really_remove_generations(self.added_generations)
self.lockmgr.unlock([self.client.dirname])
self.client = None # FIXME: This should remove uncommitted data.
self.added_generations = []
self.removed_generations = []
self.got_client_lock = False
self.current_client = None
self.current_client_id = None
def commit_client(self, checkpoint=False):
'''Commit changes to and unlock currently locked client.'''
tracing.trace('committing client (checkpoint=%s)', checkpoint)
self.require_client_lock()
self.require_shared_lock()
commit_client = self.new_generation or self.removed_generations
if self.new_generation:
self.client.set_current_generation_is_checkpoint(checkpoint)
self.added_generations = []
self._really_remove_generations(self.removed_generations)
if commit_client:
self.client.commit()
self.unlock_client()
def open_client(self, client_name):
'''Open a client for read-only operation.'''
tracing.trace('open r/o client_name=%s' % client_name)
self.check_format_version()
client_id = self.clientlist.get_client_id(client_name)
if client_id is None:
raise obnamlib.Error('%s is not an existing client' % client_name)
self.current_client = client_name
self.current_client_id = client_id
client_dir = self.client_dir(client_id)
self.client = obnamlib.ClientMetadataTree(self.fs, client_dir,
self.node_size,
self.upload_queue_size,
self.lru_size, self)
self.client.init_forest()
def list_generations(self):
'''List existing generations for currently open client.'''
self.require_open_client()
return self.client.list_generations()
def get_is_checkpoint(self, genid):
'''Is a generation a checkpoint one?'''
self.require_open_client()
return self.client.get_is_checkpoint(genid)
def start_generation(self):
'''Start a new generation.
The new generation is a copy-on-write clone of the previous
one (or empty, if first generation).
'''
tracing.trace('start new generation')
self.require_client_lock()
if self.new_generation is not None:
raise obnamlib.Error('Cannot start two new generations')
self.client.start_generation()
self.new_generation = \
self.client.get_generation_id(self.client.tree)
self.added_generations.append(self.new_generation)
return self.new_generation
def _really_remove_generations(self, remove_genids):
'''Really remove a list of generations.
This is not part of the public API.
This does not make any safety checks.
'''
def find_chunkids_in_gens(genids):
chunkids = set()
for genid in genids:
x = self.client.list_chunks_in_generation(genid)
chunkids = chunkids.union(set(x))
return chunkids
def find_gens_to_keep():
return [genid
for genid in self.list_generations()
if genid not in remove_genids]
def remove_chunks(chunk_ids):
for chunk_id in chunk_ids:
try:
checksum = self.chunklist.get_checksum(chunk_id)
except KeyError:
# No checksum, therefore it can't be shared, therefore
# we can remove it.
self.remove_chunk(chunk_id)
else:
self.chunksums.remove(checksum, chunk_id,
self.current_client_id)
if not self.chunksums.chunk_is_used(checksum, chunk_id):
self.remove_chunk(chunk_id)
def remove_gens(genids):
if self.new_generation is None:
self.client.start_changes(create_tree=False)
for genid in genids:
self.client.remove_generation(genid)
if not remove_genids:
return
self.require_client_lock()
self.require_shared_lock()
maybe_remove = find_chunkids_in_gens(remove_genids)
keep_genids = find_gens_to_keep()
keep = find_chunkids_in_gens(keep_genids)
remove = maybe_remove.difference(keep)
remove_chunks(remove)
remove_gens(remove_genids)
def remove_generation(self, gen):
'''Remove a committed generation.'''
self.require_client_lock()
if gen == self.new_generation:
raise obnamlib.Error('cannot remove started generation')
self.removed_generations.append(gen)
def get_generation_times(self, gen):
'''Return start and end times of a generation.
An unfinished generation has no end time, so None is returned.
'''
self.require_open_client()
return self.client.get_generation_times(gen)
def listdir(self, gen, dirname):
'''Return list of basenames in a directory within generation.'''
self.require_open_client()
return self.client.listdir(gen, dirname)
def get_metadata(self, gen, filename):
'''Return metadata for a file in a generation.'''
self.require_open_client()
try:
encoded = self.client.get_metadata(gen, filename)
except KeyError:
raise obnamlib.Error('%s does not exist' % filename)
return obnamlib.decode_metadata(encoded)
def create(self, filename, metadata):
'''Create a new (empty) file in the new generation.'''
self.require_started_generation()
encoded = obnamlib.encode_metadata(metadata)
self.client.create(filename, encoded)
def remove(self, filename):
'''Remove file or directory or directory tree from generation.'''
self.require_started_generation()
self.client.remove(filename)
def _chunk_filename(self, chunkid):
return self.chunk_idpath.convert(chunkid)
def put_chunk_only(self, data):
'''Put chunk of data into repository.
If the same data is already in the repository, it will be put there
a second time. It is the caller's responsibility to check
that the data is not already in the repository.
Return the unique identifier of the new chunk.
'''
def random_chunkid():
return random.randint(0, obnamlib.MAX_ID)
self.require_started_generation()
if self.prev_chunkid is None:
self.prev_chunkid = random_chunkid()
while True:
chunkid = (self.prev_chunkid + 1) % obnamlib.MAX_ID
filename = self._chunk_filename(chunkid)
try:
self.fs.write_file(filename, data)
except OSError, e: # pragma: no cover
if e.errno == errno.EEXIST:
self.prev_chunkid = random_chunkid()
continue
raise
else:
tracing.trace('chunkid=%s', chunkid)
break
self.prev_chunkid = chunkid
return chunkid
def put_chunk_in_shared_trees(self, chunkid, checksum):
'''Put the chunk into the shared trees.
The chunk is assumed to already exist in the repository, so we
just need to add it to the shared trees that map chunkids to
checksums and checksums to chunkids.
'''
tracing.trace('chunkid=%s', chunkid)
tracing.trace('checksum=%s', repr(checksum))
self.require_started_generation()
self.require_shared_lock()
self.chunklist.add(chunkid, checksum)
self.chunksums.add(checksum, chunkid, self.current_client_id)
def get_chunk(self, chunkid):
'''Return data of chunk with given id.'''
self.require_open_client()
return self.fs.cat(self._chunk_filename(chunkid))
def chunk_exists(self, chunkid):
'''Does a chunk exist in the repository?'''
self.require_open_client()
return self.fs.exists(self._chunk_filename(chunkid))
def find_chunks(self, checksum):
'''Return identifiers of chunks with given checksum.
Because of hash collisions, the list may be longer than one.
'''
self.require_open_client()
return self.chunksums.find(checksum)
def list_chunks(self):
'''Return list of ids of all chunks in repository.'''
result = []
pat = re.compile(r'^.*/.*/[0-9a-fA-F]+$')
if self.fs.exists('chunks'):
for pathname, st in self.fs.scan_tree('chunks'):
if stat.S_ISREG(st.st_mode) and pat.match(pathname):
basename = os.path.basename(pathname)
result.append(int(basename, 16))
return result
def remove_chunk(self, chunk_id):
'''Remove a chunk from the repository.
Note that this does _not_ remove the chunk from the chunk
checksum forest. The caller is not supposed to call us until
the chunk is not there anymore.
However, it does remove the chunk from the chunk list forest.
'''
tracing.trace('chunk_id=%s', chunk_id)
self.require_open_client()
self.require_shared_lock()
self.chunklist.remove(chunk_id)
filename = self._chunk_filename(chunk_id)
try:
self.fs.remove(filename)
except OSError:
pass
def get_file_chunks(self, gen, filename):
'''Return list of ids of chunks belonging to a file.'''
self.require_open_client()
return self.client.get_file_chunks(gen, filename)
def set_file_chunks(self, filename, chunkids):
'''Set ids of chunks belonging to a file.
File must be in the started generation.
'''
self.require_started_generation()
self.client.set_file_chunks(filename, chunkids)
def append_file_chunks(self, filename, chunkids):
'''Append to list of ids of chunks belonging to a file.
File must be in the started generation.
'''
self.require_started_generation()
self.client.append_file_chunks(filename, chunkids)
def set_file_data(self, filename, contents): # pragma: no cover
'''Store contents of file in B-tree instead of chunks dir.'''
self.require_started_generation()
self.client.set_file_data(filename, contents)
def get_file_data(self, gen, filename): # pragma: no cover
'''Returned contents of file stored in B-tree instead of chunks dir.'''
self.require_open_client()
return self.client.get_file_data(gen, filename)
def genspec(self, spec):
'''Interpret a generation specification.'''
self.require_open_client()
gens = self.list_generations()
if not gens:
raise obnamlib.Error('No generations')
if spec == 'latest':
return gens[-1]
else:
try:
intspec = int(spec)
except ValueError:
raise obnamlib.Error('Generation %s is not an integer' % spec)
if intspec in gens:
return intspec
else:
raise obnamlib.Error('Generation %s not found' % spec)
def walk(self, gen, arg, depth_first=False):
'''Iterate over each pathname specified by argument.
This is a generator. Each return value is a tuple consisting
of a pathname and its corresponding metadata. Directories are
recursed into.
'''
arg = os.path.normpath(arg)
metadata = self.get_metadata(gen, arg)
if metadata.isdir():
if not depth_first:
yield arg, metadata
kids = self.listdir(gen, arg)
kidpaths = [os.path.join(arg, kid) for kid in kids]
for kidpath in kidpaths:
for x in self.walk(gen, kidpath, depth_first=depth_first):
yield x
if depth_first:
yield arg, metadata
else:
yield arg, metadata
obnam-1.6.1/obnamlib/repo_dummy.py 0000644 0001750 0001750 00000050525 12246357067 017037 0 ustar jenkins jenkins # Copyright 2013 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
#
# =*= License: GPL-3+ =*=
import obnamlib
class KeyValueStore(object):
def __init__(self):
self._map = {}
def get_value(self, key, default):
if key in self._map:
return self._map[key]
return default
def set_value(self, key, value):
self._map[key] = value
def remove_value(self, key):
del self._map[key]
def items(self):
return self._map.items()
def copy(self):
other = KeyValueStore()
for key, value in self.items():
other.set_value(key, value)
return other
class LockableKeyValueStore(object):
def __init__(self):
self.locked = False
self.data = KeyValueStore()
self.stashed = None
def lock(self):
assert not self.locked
self.stashed = self.data
self.data = self.data.copy()
self.locked = True
def unlock(self):
assert self.locked
self.data = self.stashed
self.stashed = None
self.locked = False
def commit(self):
assert self.locked
self.stashed = None
self.locked = False
def get_value(self, key, default):
return self.data.get_value(key, default)
def set_value(self, key, value):
self.data.set_value(key, value)
def remove_value(self, key):
self.data.remove_value(key)
def items(self):
return self.data.items()
class Counter(object):
def __init__(self):
self._latest = 0
def next(self):
self._latest += 1
return self._latest
class DummyClient(object):
def __init__(self, name):
self.name = name
self.generation_counter = Counter()
self.data = LockableKeyValueStore()
def lock(self):
if self.data.locked:
raise obnamlib.RepositoryClientLockingFailed(self.name)
self.data.lock()
def _require_lock(self):
if not self.data.locked:
raise obnamlib.RepositoryClientNotLocked(self.name)
def unlock(self):
self._require_lock()
self.data.unlock()
def commit(self):
self._require_lock()
self.data.set_value('current-generation', None)
self.data.commit()
def get_key(self, key):
return self.data.get_value(key, '')
def set_key(self, key, value):
self._require_lock()
self.data.set_value(key, value)
def get_generation_ids(self):
key = 'generation-ids'
return self.data.get_value(key, [])
def create_generation(self):
self._require_lock()
if self.data.get_value('current-generation', None) is not None:
raise obnamlib.RepositoryClientGenerationUnfinished(self.name)
generation_id = (self.name, self.generation_counter.next())
ids = self.data.get_value('generation-ids', [])
self.data.set_value('generation-ids', ids + [generation_id])
self.data.set_value('current-generation', generation_id)
if ids:
prev_gen_id = ids[-1]
for key, value in self.data.items():
if self._is_filekey(key):
x, gen_id, filename = key
if gen_id == prev_gen_id:
value = self.data.get_value(key, None)
self.data.set_value(
self._filekey(generation_id, filename), value)
elif self._is_filekeykey(key):
x, gen_id, filename, k = key
if gen_id == prev_gen_id:
value = self.data.get_value(key, None)
self.data.set_value(
self._filekeykey(generation_id, filename, k),
value)
elif self._is_filechunkskey(key):
x, gen_id, filename = key
if gen_id == prev_gen_id:
value = self.data.get_value(key, [])
self.data.set_value(
self._filechunkskey(generation_id, filename),
value)
return generation_id
def _require_generation(self, gen_id):
ids = self.data.get_value('generation-ids', [])
if gen_id not in ids:
raise obnamlib.RepositoryGenerationDoesNotExist(self.name)
def get_generation_key(self, gen_id, key):
return self.data.get_value(gen_id + (key,), '')
def set_generation_key(self, gen_id, key, value):
self._require_lock()
self.data.set_value(gen_id + (key,), value)
def remove_generation(self, gen_id):
self._require_lock()
self._require_generation(gen_id)
ids = self.data.get_value('generation-ids', [])
self.data.set_value('generation-ids', [x for x in ids if x != gen_id])
def get_generation_chunk_ids(self, gen_id):
chunk_ids = []
for key, value in self.data.items():
if self._is_filechunkskey(key) and key[1] == gen_id:
chunk_ids.extend(value)
return chunk_ids
def interpret_generation_spec(self, genspec):
ids = self.data.get_value('generation-ids', [])
if not ids:
raise obnamlib.RepositoryClientHasNoGenerations(self.name)
if genspec == 'latest':
if ids:
return ids[-1]
else:
gen_number = int(genspec)
if (self.name, gen_number) in ids:
return (self.name, gen_number)
raise obnamlib.RepositoryGenerationDoesNotExist(self.name)
def make_generation_spec(self, generation_id):
name, gen_number = generation_id
return str(gen_number)
def _filekey(self, gen_id, filename):
return ('file', gen_id, filename)
def _is_filekey(self, key):
return (type(key) is tuple and len(key) == 3 and key[0] == 'file')
def file_exists(self, gen_id, filename):
return self.data.get_value(self._filekey(gen_id, filename), False)
def add_file(self, gen_id, filename):
self.data.set_value(self._filekey(gen_id, filename), True)
def remove_file(self, gen_id, filename):
keys = []
for key, value in self.data.items():
right_kind = (
self._is_filekey(key) or
self._is_filekeykey(key) or
self._is_filechunkskey(key))
if right_kind:
if key[1] == gen_id and key[2] == filename:
keys.append(key)
for k in keys:
self.data.remove_value(k)
def _filekeykey(self, gen_id, filename, key):
return ('filekey', gen_id, filename, key)
def _is_filekeykey(self, key):
return (type(key) is tuple and len(key) == 4 and key[0] == 'filekey')
def _require_file(self, gen_id, filename):
if not self.file_exists(gen_id, filename):
raise obnamlib.RepositoryFileDoesNotExistInGeneration(
self.name, self.make_generation_spec(gen_id), filename)
_integer_keys = (
obnamlib.REPO_FILE_MTIME,
)
def get_file_key(self, gen_id, filename, key):
self._require_generation(gen_id)
self._require_file(gen_id, filename)
if key in self._integer_keys:
default = 0
else:
default = ''
return self.data.get_value(
self._filekeykey(gen_id, filename, key), default)
def set_file_key(self, gen_id, filename, key, value):
self._require_generation(gen_id)
self._require_file(gen_id, filename)
self.data.set_value(self._filekeykey(gen_id, filename, key), value)
def _filechunkskey(self, gen_id, filename):
return ('filechunks', gen_id, filename)
def _is_filechunkskey(self, key):
return (
type(key) is tuple and len(key) == 3 and key[0] == 'filechunks')
def get_file_chunk_ids(self, gen_id, filename):
self._require_generation(gen_id)
self._require_file(gen_id, filename)
return self.data.get_value(self._filechunkskey(gen_id, filename), [])
def append_file_chunk_id(self, gen_id, filename, chunk_id):
self._require_generation(gen_id)
self._require_file(gen_id, filename)
chunk_ids = self.get_file_chunk_ids(gen_id, filename)
self.data.set_value(
self._filechunkskey(gen_id, filename),
chunk_ids + [chunk_id])
def clear_file_chunk_ids(self, gen_id, filename):
self._require_generation(gen_id)
self._require_file(gen_id, filename)
self.data.set_value(self._filechunkskey(gen_id, filename), [])
def get_file_children(self, gen_id, filename):
children = []
if filename.endswith('/'):
prefix = filename
else:
prefix = filename + '/'
for key, value in self.data.items():
if not self._is_filekey(key):
continue
x, y, candidate = key
if candidate == filename:
continue
if not candidate.startswith(prefix): # pragma: no cover
continue
if '/' in candidate[len(prefix):]:
continue
children.append(candidate)
return children
class DummyClientList(object):
def __init__(self):
self.data = LockableKeyValueStore()
def lock(self):
if self.data.locked:
raise obnamlib.RepositoryClientListLockingFailed()
self.data.lock()
def unlock(self):
if not self.data.locked:
raise obnamlib.RepositoryClientListNotLocked()
self.data.unlock()
def commit(self):
if not self.data.locked:
raise obnamlib.RepositoryClientListNotLocked()
self.data.commit()
def force(self):
if self.data.locked:
self.unlock()
self.lock()
def _require_lock(self):
if not self.data.locked:
raise obnamlib.RepositoryClientListNotLocked()
def names(self):
return [k for k, v in self.data.items() if v is not None]
def __getitem__(self, client_name):
client = self.data.get_value(client_name, None)
if client is None:
raise obnamlib.RepositoryClientDoesNotExist(client_name)
return client
def add(self, client_name):
self._require_lock()
if self.data.get_value(client_name, None) is not None:
raise obnamlib.RepositoryClientAlreadyExists(client_name)
self.data.set_value(client_name, DummyClient(client_name))
def remove(self, client_name):
self._require_lock()
if self.data.get_value(client_name, None) is None:
raise obnamlib.RepositoryClientDoesNotExist(client_name)
self.data.set_value(client_name, None)
def rename(self, old_client_name, new_client_name):
self._require_lock()
client = self.data.get_value(old_client_name, None)
if client is None:
raise obnamlib.RepositoryClientDoesNotExist(old_client_name)
if self.data.get_value(new_client_name, None) is not None:
raise obnamlib.RepositoryClientAlreadyExists(new_client_name)
self.data.set_value(old_client_name, None)
self.data.set_value(new_client_name, client)
def get_client_by_generation_id(self, gen_id):
client_name, generation_number = gen_id
return self[client_name]
class ChunkStore(object):
def __init__(self):
self.next_chunk_id = Counter()
self.chunks = {}
def put_chunk_content(self, content):
chunk_id = self.next_chunk_id.next()
self.chunks[chunk_id] = content
return chunk_id
def get_chunk_content(self, chunk_id):
if chunk_id not in self.chunks:
raise obnamlib.RepositoryChunkDoesNotExist(str(chunk_id))
return self.chunks[chunk_id]
def has_chunk(self, chunk_id):
return chunk_id in self.chunks
def remove_chunk(self, chunk_id):
if chunk_id not in self.chunks:
raise obnamlib.RepositoryChunkDoesNotExist(str(chunk_id))
del self.chunks[chunk_id]
class ChunkIndexes(object):
def __init__(self):
self.data = LockableKeyValueStore()
def lock(self):
if self.data.locked:
raise obnamlib.RepositoryChunkIndexesLockingFailed()
self.data.lock()
def _require_lock(self):
if not self.data.locked:
raise obnamlib.RepositoryChunkIndexesNotLocked()
def unlock(self):
self._require_lock()
self.data.unlock()
def commit(self):
self._require_lock()
self.data.commit()
def force(self):
if self.data.locked:
self.unlock()
self.lock()
def put_chunk(self, chunk_id, chunk_content, client_id):
self._require_lock()
self.data.set_value(chunk_id, chunk_content)
def find_chunk(self, chunk_content):
for chunk_id, stored_content in self.data.items():
if stored_content == chunk_content:
return chunk_id
raise obnamlib.RepositoryChunkContentNotInIndexes()
def remove_chunk(self, chunk_id, client_id):
self._require_lock()
self.data.set_value(chunk_id, None)
class RepositoryFormatDummy(obnamlib.RepositoryInterface):
'''Simplistic repository format for testing.
This class exists to exercise the RepositoryInterfaceTests class.
'''
format = 'dummy'
def __init__(self):
self._client_list = DummyClientList()
self._chunk_store = ChunkStore()
self._chunk_indexes = ChunkIndexes()
def set_fs(self, fs):
pass
def init_repo(self):
pass
def get_client_names(self):
return self._client_list.names()
def lock_client_list(self):
self._client_list.lock()
def unlock_client_list(self):
self._client_list.unlock()
def commit_client_list(self):
self._client_list.commit()
def force_client_list_lock(self):
self._client_list.force()
def add_client(self, client_name):
self._client_list.add(client_name)
def remove_client(self, client_name):
self._client_list.remove(client_name)
def rename_client(self, old_client_name, new_client_name):
self._client_list.rename(old_client_name, new_client_name)
def lock_client(self, client_name):
self._client_list[client_name].lock()
def unlock_client(self, client_name):
self._client_list[client_name].unlock()
def commit_client(self, client_name):
self._client_list[client_name].commit()
def get_allowed_client_keys(self):
return [obnamlib.REPO_CLIENT_TEST_KEY]
def get_client_key(self, client_name, key):
return self._client_list[client_name].get_key(key)
def set_client_key(self, client_name, key, value):
if key not in self.get_allowed_client_keys():
raise obnamlib.RepositoryClientKeyNotAllowed(
self.format, client_name, key)
self._client_list[client_name].set_key(key, value)
def get_client_generation_ids(self, client_name):
return self._client_list[client_name].get_generation_ids()
def create_generation(self, client_name):
return self._client_list[client_name].create_generation()
def get_allowed_generation_keys(self):
return [obnamlib.REPO_GENERATION_TEST_KEY]
def get_generation_key(self, generation_id, key):
client = self._client_list.get_client_by_generation_id(generation_id)
return client.get_generation_key(generation_id, key)
def set_generation_key(self, generation_id, key, value):
client = self._client_list.get_client_by_generation_id(generation_id)
if key not in self.get_allowed_generation_keys():
raise obnamlib.RepositoryGenerationKeyNotAllowed(
self.format, client.name, key)
return client.set_generation_key(generation_id, key, value)
def remove_generation(self, generation_id):
client = self._client_list.get_client_by_generation_id(generation_id)
client.remove_generation(generation_id)
def get_generation_chunk_ids(self, generation_id):
client = self._client_list.get_client_by_generation_id(generation_id)
return client.get_generation_chunk_ids(generation_id)
def interpret_generation_spec(self, client_name, genspec):
client = self._client_list[client_name]
return client.interpret_generation_spec(genspec)
def make_generation_spec(self, generation_id):
client = self._client_list.get_client_by_generation_id(generation_id)
return client.make_generation_spec(generation_id)
def file_exists(self, generation_id, filename):
client = self._client_list.get_client_by_generation_id(generation_id)
return client.file_exists(generation_id, filename)
def add_file(self, generation_id, filename):
client = self._client_list.get_client_by_generation_id(generation_id)
return client.add_file(generation_id, filename)
def remove_file(self, generation_id, filename):
client = self._client_list.get_client_by_generation_id(generation_id)
return client.remove_file(generation_id, filename)
def get_file_key(self, generation_id, filename, key):
client = self._client_list.get_client_by_generation_id(generation_id)
if key not in self.get_allowed_file_keys():
raise obnamlib.RepositoryFileKeyNotAllowed(
self.format, client.name, key)
return client.get_file_key(generation_id, filename, key)
def set_file_key(self, generation_id, filename, key, value):
client = self._client_list.get_client_by_generation_id(generation_id)
if key not in self.get_allowed_file_keys():
raise obnamlib.RepositoryFileKeyNotAllowed(
self.format, client.name, key)
client.set_file_key(generation_id, filename, key, value)
def get_allowed_file_keys(self):
return [obnamlib.REPO_FILE_TEST_KEY, obnamlib.REPO_FILE_MTIME]
def get_file_chunk_ids(self, generation_id, filename):
client = self._client_list.get_client_by_generation_id(generation_id)
return client.get_file_chunk_ids(generation_id, filename)
def append_file_chunk_id(self, generation_id, filename, chunk_id):
client = self._client_list.get_client_by_generation_id(generation_id)
return client.append_file_chunk_id(generation_id, filename, chunk_id)
def clear_file_chunk_ids(self, generation_id, filename):
client = self._client_list.get_client_by_generation_id(generation_id)
client.clear_file_chunk_ids(generation_id, filename)
def get_file_children(self, generation_id, filename):
client = self._client_list.get_client_by_generation_id(generation_id)
return client.get_file_children(generation_id, filename)
def put_chunk_content(self, content):
return self._chunk_store.put_chunk_content(content)
def get_chunk_content(self, chunk_id):
return self._chunk_store.get_chunk_content(chunk_id)
def has_chunk(self, chunk_id):
return self._chunk_store.has_chunk(chunk_id)
def remove_chunk(self, chunk_id):
self._chunk_store.remove_chunk(chunk_id)
def lock_chunk_indexes(self):
self._chunk_indexes.lock()
def unlock_chunk_indexes(self):
self._chunk_indexes.unlock()
def commit_chunk_indexes(self):
self._chunk_indexes.commit()
def force_chunk_indexes_lock(self):
self._chunk_indexes.force()
def put_chunk_into_indexes(self, chunk_id, chunk_content, client_id):
self._chunk_indexes.put_chunk(chunk_id, chunk_content, client_id)
def find_chunk_id_by_content(self, chunk_content):
return self._chunk_indexes.find_chunk(chunk_content)
def remove_chunk_from_indexes(self, chunk_id, client_id):
return self._chunk_indexes.remove_chunk(chunk_id, client_id)
def get_fsck_work_item(self):
return 'this pretends to be a work item'
obnam-1.6.1/obnamlib/repo_dummy_tests.py 0000644 0001750 0001750 00000001536 12246357067 020257 0 ustar jenkins jenkins # Copyright 2013 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
#
# =*= License: GPL-3+ =*=
import obnamlib
class RepositoryFormatDummyTests(obnamlib.RepositoryInterfaceTests):
def setUp(self):
self.repo = obnamlib.RepositoryFormatDummy()
obnam-1.6.1/obnamlib/repo_fmt_6.py 0000644 0001750 0001750 00000056764 12246357067 016732 0 ustar jenkins jenkins # Copyright (C) 2009-2013 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import errno
import hashlib
import larch
import logging
import os
import random
import re
import stat
import struct
import time
import tracing
import obnamlib
class HookedFS(object):
'''A class to filter read/written data through hooks.'''
def __init__(self, repo, fs, hooks):
self.repo = repo
self.fs = fs
self.hooks = hooks
def __getattr__(self, name):
return getattr(self.fs, name)
def _get_toplevel(self, filename):
parts = filename.split(os.sep)
if len(parts) > 1:
return parts[0]
else: # pragma: no cover
raise obnamlib.Error('File at repository root: %s' % filename)
def cat(self, filename, runfilters=True):
data = self.fs.cat(filename)
toplevel = self._get_toplevel(filename)
if not runfilters: # pragma: no cover
return data
return self.hooks.filter_read('repository-data', data,
repo=self.repo, toplevel=toplevel)
def lock(self, filename, data):
self.fs.lock(filename, data)
def write_file(self, filename, data, runfilters=True):
tracing.trace('writing hooked %s' % filename)
toplevel = self._get_toplevel(filename)
if runfilters:
data = self.hooks.filter_write('repository-data', data,
repo=self.repo, toplevel=toplevel)
self.fs.write_file(filename, data)
def overwrite_file(self, filename, data, runfilters=True):
tracing.trace('overwriting hooked %s' % filename)
toplevel = self._get_toplevel(filename)
if runfilters:
data = self.hooks.filter_write('repository-data', data,
repo=self.repo, toplevel=toplevel)
self.fs.overwrite_file(filename, data)
class _OpenClient(object):
def __init__(self, client):
self.locked = False
self.client = client
self.current_generation_number = None
self.removed_generation_numbers = []
class RepositoryFormat6(obnamlib.RepositoryInterface):
format = '6'
def __init__(self,
lock_timeout=0,
node_size=obnamlib.DEFAULT_NODE_SIZE,
upload_queue_size=obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
lru_size=obnamlib.DEFAULT_LRU_SIZE,
idpath_depth=obnamlib.IDPATH_DEPTH,
idpath_bits=obnamlib.IDPATH_BITS,
idpath_skip=obnamlib.IDPATH_SKIP,
hooks=None):
self._lock_timeout = lock_timeout
self._node_size = node_size
self._upload_queue_size = upload_queue_size
self._lru_size = lru_size
self._idpath_depth = idpath_depth
self._idpath_bits = idpath_bits
self._idpath_skip = idpath_skip
self._setup_hooks(hooks or obnamlib.HookManager())
self._setup_chunks()
def _setup_hooks(self, hooks):
self.hooks = hooks
self.hooks.new('repository-toplevel-init')
self.hooks.new_filter('repository-data')
self.hooks.new('repository-add-client')
def set_fs(self, fs):
self._fs = HookedFS(self, fs, self.hooks)
self._lockmgr = obnamlib.LockManager(self._fs, self._lock_timeout, '')
self._setup_client_list()
self._setup_client()
self._setup_chunk_indexes()
def init_repo(self):
# There is nothing else to be done.
pass
# Client list handling.
def _setup_client_list(self):
self._got_client_list_lock = False
self._client_list = obnamlib.ClientList(
self._fs, self._node_size, self._upload_queue_size,
self._lru_size, self)
def _raw_lock_client_list(self):
if self._got_client_list_lock:
raise obnamlib.RepositoryClientListLockingFailed()
self._lockmgr.lock(['.'])
self._got_client_list_lock = True
self._client_list.start_changes()
def _raw_unlock_client_list(self):
if not self._got_client_list_lock:
raise obnamlib.RepositoryClientListNotLocked()
self._lockmgr.unlock(['.'])
self._setup_client_list()
def _require_client_list_lock(self):
if not self._got_client_list_lock:
raise obnamlib.RepositoryClientListNotLocked()
def lock_client_list(self):
tracing.trace('locking client list')
self._raw_lock_client_list()
def unlock_client_list(self):
tracing.trace('unlocking client list')
self._raw_unlock_client_list()
def commit_client_list(self):
tracing.trace('committing client list')
self._client_list.commit()
self._raw_unlock_client_list()
def force_client_list_lock(self):
tracing.trace('forcing client list lock')
if self._got_client_list_lock:
self._raw_unlock_client_list()
self._raw_lock_client_list()
def get_client_names(self):
return self._client_list.list_clients()
def add_client(self, client_name):
self._require_client_list_lock()
if self._client_list.get_client_id(client_name):
raise obnamlib.RepositoryClientAlreadyExists(client_name)
self._client_list.add_client(client_name)
def remove_client(self, client_name):
self._require_client_list_lock()
if not self._client_list.get_client_id(client_name):
raise obnamlib.RepositoryClientDoesNotExist(client_name)
self._client_list.remove_client(client_name)
def rename_client(self, old_client_name, new_client_name):
self._require_client_list_lock()
client_names = self.get_client_names()
if old_client_name not in client_names:
raise obnamlib.RepositoryClientDoesNotExist(old_client_name)
if new_client_name in client_names:
raise obnamlib.RepositoryClientAlreadyExists(new_client_name)
client_id = self._get_client_id(old_client_name)
new_key = self._client_list.key(
new_client_name, client_id, self._client_list.CLIENT_NAME)
self._client_list.tree.insert(new_key, new_client_name)
old_key = self._client_list.key(
old_client_name, client_id, self._client_list.CLIENT_NAME)
self._client_list.tree.remove(old_key)
def _get_client_id(self, client_name):
'''Return a client's unique, filesystem-visible id.
The id is a random 64-bit integer.
'''
return self._client_list.get_client_id(client_name)
# Handling of individual clients.
def current_time(self):
# ClientMetadataTree wants us to provide this method.
# FIXME: A better design would be to for us to provide
# the class with a function to call.
return time.time()
def _setup_client(self):
# We keep a list of all open clients. An open client may or
# may not be locked. Each value in the dict is a tuple of
# ClientMetadataTree and is_locked.
self._open_clients = {}
def _open_client(self, client_name):
if client_name not in self._open_clients:
tracing.trace('client_name=%s', client_name)
client_id = self._get_client_id(client_name)
if client_id is None: # pragma: no cover
raise obnamlib.RepositoryClientDoesNotExist(client_name)
client_dir = self._get_client_dir(client_id)
client = obnamlib.ClientMetadataTree(
self._fs, client_dir, self._node_size,
self._upload_queue_size, self._lru_size, self)
client.init_forest()
self._open_clients[client_name] = _OpenClient(client)
return self._open_clients[client_name].client
def _get_client_dir(self, client_id):
'''Return name of sub-directory for a given client.'''
return str(client_id)
def _client_is_locked(self, client_name):
if client_name in self._open_clients:
open_client = self._open_clients[client_name]
return open_client.locked
return False
def _require_client_lock(self, client_name):
if client_name not in self.get_client_names():
raise obnamlib.RepositoryClientDoesNotExist(client_name)
if not self._client_is_locked(client_name):
raise obnamlib.RepositoryClientNotLocked(client_name)
def _raw_lock_client(self, client_name):
tracing.trace('client_name=%s', client_name)
if self._client_is_locked(client_name):
raise obnamlib.RepositoryClientLockingFailed(client_name)
client_id = self._get_client_id(client_name)
if client_id is None: # pragma: no cover
raise obnamlib.RepositoryClientDoesNotExist(client_name)
# Create and initialise the client's own directory, if needed.
client_dir = self._get_client_dir(client_id)
if not self._fs.exists(client_dir):
self._fs.mkdir(client_dir)
self.hooks.call('repository-toplevel-init', self, client_dir)
# Actually lock the directory.
self._lockmgr.lock([client_dir])
# Remember that we have the lock.
self._open_client(client_name) # Ensure client is open
open_client = self._open_clients[client_name]
open_client.locked = True
def _raw_unlock_client(self, client_name):
tracing.trace('client_name=%s', client_name)
self._require_client_lock(client_name)
open_client = self._open_clients[client_name]
self._lockmgr.unlock([open_client.client.dirname])
del self._open_clients[client_name]
def lock_client(self, client_name):
logging.info('Locking client %s' % client_name)
self._raw_lock_client(client_name)
def unlock_client(self, client_name):
logging.info('Unlocking client %s' % client_name)
self._raw_unlock_client(client_name)
def commit_client(self, client_name):
tracing.trace('client_name=%s', client_name)
self._require_client_lock(client_name)
open_client = self._open_clients[client_name]
for gen_number in open_client.removed_generation_numbers:
open_client.client.remove_generation(gen_number)
if open_client.current_generation_number:
open_client.client.commit()
self._raw_unlock_client(client_name)
def get_allowed_client_keys(self):
return []
def get_client_key(self, client_name, key): # pragma: no cover
raise obnamlib.RepositoryClientKeyNotAllowed(
self.format, client_name, key)
def set_client_key(self, client_name, key, value):
raise obnamlib.RepositoryClientKeyNotAllowed(
self.format, client_name, key)
def get_client_generation_ids(self, client_name):
client = self._open_client(client_name)
open_client = self._open_clients[client_name]
return [
(client_name, gen_number)
for gen_number in client.list_generations()
if gen_number not in open_client.removed_generation_numbers]
def create_generation(self, client_name):
tracing.trace('client_name=%s', client_name)
self._require_client_lock(client_name)
open_client = self._open_clients[client_name]
if open_client.current_generation_number is not None:
raise obnamlib.RepositoryClientGenerationUnfinished(client_name)
open_client.client.start_generation()
open_client.current_generation_number = \
open_client.client.get_generation_id(open_client.client.tree)
return (client_name, open_client.current_generation_number)
# Generations for a client.
def _require_existing_generation(self, generation_id):
client_name, gen_number = generation_id
if generation_id not in self.get_client_generation_ids(client_name):
raise obnamlib.RepositoryGenerationDoesNotExist(client_name)
def get_allowed_generation_keys(self):
return []
def get_generation_key(self, generation_id, key): # pragma: no cover
client_name, gen_number = generation_key
raise obnamlib.RepositoryGenerationKeyNotAllowed(
self.format, client_name, key)
def set_generation_key(self, generation_id, key, value): # pragma: no cover
client_name, gen_number = generation_key
raise obnamlib.RepositoryGenerationKeyNotAllowed(
self.format, client_name, key)
def interpret_generation_spec(self, client_name, genspec):
ids = self.get_client_generation_ids(client_name)
if not ids:
raise obnamlib.RepositoryClientHasNoGenerations(client_name)
if genspec == 'latest':
return ids[-1]
for gen_id in ids:
if str(gen_id[1]) == genspec:
return gen_id
raise obnamlib.RepositoryGenerationDoesNotExist(client_name)
def make_generation_spec(self, gen_id):
return str(gen_id[1])
def remove_generation(self, gen_id):
tracing.trace('gen_id=%s' % repr(gen_id))
client_name, gen_number = gen_id
self._require_client_lock(client_name)
self._require_existing_generation(gen_id)
open_client = self._open_clients[client_name]
if gen_number == open_client.current_generation_number:
open_client.current_generation = None
open_client.removed_generation_numbers.append(gen_number)
def get_generation_chunk_ids(self, generation_id):
client_name, gen_number = generation_id
client = self._open_client(client_name)
return client.list_chunks_in_generation(gen_number)
# Chunks and chunk indexes.
def _setup_chunks(self):
self._prev_chunk_id = None
self._chunk_idpath = larch.IdPath(
'chunks', self._idpath_depth, self._idpath_bits,
self._idpath_skip)
def _chunk_filename(self, chunk_id):
return self._chunk_idpath.convert(chunk_id)
def _random_chunk_id(self):
return random.randint(0, obnamlib.MAX_ID)
def put_chunk_content(self, data):
if self._prev_chunk_id is None:
self._prev_chunk_id = self._random_chunk_id()
while True:
chunk_id = (self._prev_chunk_id + 1) % obnamlib.MAX_ID
filename = self._chunk_filename(chunk_id)
try:
self._fs.write_file(filename, data)
except OSError, e: # pragma: no cover
if e.errno == errno.EEXIST:
self._prev_chunk_id = self._random_chunk_id()
continue
raise
else:
tracing.trace('chunkid=%s', chunk_id)
break
self._prev_chunk_id = chunk_id
return chunk_id
def get_chunk_content(self, chunk_id):
try:
return self._fs.cat(self._chunk_filename(chunk_id))
except IOError, e:
if e.errno == errno.ENOENT:
raise obnamlib.RepositoryChunkDoesNotExist(str(chunk_id))
raise # pragma: no cover
def has_chunk(self, chunk_id):
return self._fs.exists(self._chunk_filename(chunk_id))
def remove_chunk(self, chunk_id):
tracing.trace('chunk_id=%s', chunk_id)
filename = self._chunk_filename(chunk_id)
try:
self._fs.remove(filename)
except OSError:
raise obnamlib.RepositoryChunkDoesNotExist(str(chunk_id))
# Chunk indexes.
def _checksum(self, data):
return hashlib.md5(data).hexdigest()
def _setup_chunk_indexes(self):
self._got_chunk_indexes_lock = False
self._chunklist = obnamlib.ChunkList(
self._fs, self._node_size, self._upload_queue_size,
self._lru_size, self)
self._chunksums = obnamlib.ChecksumTree(
self._fs, 'chunksums', len(self._checksum('')), self._node_size,
self._upload_queue_size, self._lru_size, self)
def _chunk_index_dirs_to_lock(self):
return [
self._chunklist.dirname,
self._chunksums.dirname,
self._chunk_idpath.dirname]
def _require_chunk_indexes_lock(self):
if not self._got_chunk_indexes_lock:
raise obnamlib.RepositoryChunkIndexesNotLocked()
def _raw_lock_chunk_indexes(self):
if self._got_chunk_indexes_lock:
raise obnamlib.RepositoryChunkIndexesLockingFailed()
self._lockmgr.lock(self._chunk_index_dirs_to_lock())
self._got_chunk_indexes_lock = True
tracing.trace('starting changes in chunksums and chunklist')
self._chunksums.start_changes()
self._chunklist.start_changes()
# Initialize the chunks directory for encryption, etc, if it just
# got created.
dirname = self._chunk_idpath.dirname
filenames = self._fs.listdir(dirname)
if filenames == [] or filenames == ['lock']:
self.hooks.call('repository-toplevel-init', self, dirname)
def _raw_unlock_chunk_indexes(self):
self._require_chunk_indexes_lock()
self._lockmgr.unlock(self._chunk_index_dirs_to_lock())
self._setup_chunk_indexes()
def lock_chunk_indexes(self):
tracing.trace('locking chunk indexes')
self._raw_lock_chunk_indexes()
def unlock_chunk_indexes(self):
tracing.trace('unlocking chunk indexes')
self._raw_unlock_chunk_indexes()
def force_chunk_indexes_lock(self):
tracing.trace('forcing chunk indexes lock')
if self._got_chunk_indexes_lock:
self._raw_unlock_chunk_indexes()
self._raw_lock_chunk_indexes()
def commit_chunk_indexes(self):
tracing.trace('committing chunk indexes')
self._require_chunk_indexes_lock()
self._chunklist.commit()
self._chunksums.commit()
self._raw_unlock_chunk_indexes()
def put_chunk_into_indexes(self, chunk_id, data, client_id):
tracing.trace('chunk_id=%s', chunk_id)
checksum = self._checksum(data)
tracing.trace('checksum of data: %s', checksum)
tracing.trace('client_id=%s', client_id)
self._require_chunk_indexes_lock()
self._chunklist.add(chunk_id, checksum)
self._chunksums.add(checksum, chunk_id, client_id)
def remove_chunk_from_indexes(self, chunk_id, client_id):
tracing.trace('chunk_id=%s', chunk_id)
self._require_chunk_indexes_lock()
checksum = self._chunklist.get_checksum(chunk_id)
self._chunksums.remove(checksum, chunk_id, client_id)
self._chunklist.remove(chunk_id)
def find_chunk_id_by_content(self, data):
checksum = self._checksum(data)
candidates = self._chunksums.find(checksum)
for chunk_id in candidates:
chunk_data = self.get_chunk_content(chunk_id)
if chunk_data == data:
return chunk_id
raise obnamlib.RepositoryChunkContentNotInIndexes()
# Individual files in a generation.
def _require_existing_file(self, generation_id, filename):
client_name, gen_number = generation_id
if generation_id not in self.get_client_generation_ids(client_name):
raise obnamlib.RepositoryGenerationDoesNotExist(client_name)
if not self.file_exists(generation_id, filename):
raise obnamlib.RepositoryFileDoesNotExistInGeneration(
client_name, self.make_generation_spec(generation_id),
filename)
def file_exists(self, generation_id, filename):
client_name, gen_number = generation_id
client = self._open_client(client_name)
try:
client.get_metadata(gen_number, filename)
return True
except KeyError:
return False
def add_file(self, generation_id, filename):
client_name, gen_number = generation_id
self._require_client_lock(client_name)
client = self._open_client(client_name)
encoded_metadata = obnamlib.encode_metadata(obnamlib.Metadata())
client.create(filename, encoded_metadata)
def remove_file(self, generation_id, filename):
client_name, gen_number = generation_id
self._require_client_lock(client_name)
client = self._open_client(client_name)
client.remove(filename) # FIXME: Only removes from unfinished gen!
def get_allowed_file_keys(self):
return [obnamlib.REPO_FILE_TEST_KEY]
def get_file_key(self, generation_id, filename, key):
self._require_existing_file(generation_id, filename)
client_name, gen_number = generation_id
client = self._open_client(client_name)
encoded_metadata = client.get_metadata(gen_number, filename)
metadata = obnamlib.decode_metadata(encoded_metadata)
if key == obnamlib.REPO_FILE_MTIME:
return metadata.st_mtime_sec or 0
elif key == obnamlib.REPO_FILE_TEST_KEY:
return metadata.target or ''
else:
raise obnamlib.RepositoryFileKeyNotAllowed(
self.format, client_name, key)
def set_file_key(self, generation_id, filename, key, value):
client_name, gen_number = generation_id
self._require_client_lock(client_name)
self._require_existing_file(generation_id, filename)
client = self._open_client(client_name)
encoded_metadata = client.get_metadata(gen_number, filename)
metadata = obnamlib.decode_metadata(encoded_metadata)
if key == obnamlib.REPO_FILE_MTIME:
metadata.st_mtime_sec = value
elif key == obnamlib.REPO_FILE_TEST_KEY:
metadata.target = value
else:
raise obnamlib.RepositoryFileKeyNotAllowed(
self.format, client_name, key)
encoded_metadata = obnamlib.encode_metadata(metadata)
# FIXME: Only sets in unfinished generation
client.set_metadata(filename, encoded_metadata)
def get_file_chunk_ids(self, generation_id, filename):
self._require_existing_file(generation_id, filename)
client_name, gen_number = generation_id
client = self._open_client(client_name)
return client.get_file_chunks(gen_number, filename)
def clear_file_chunk_ids(self, generation_id, filename):
self._require_existing_file(generation_id, filename)
client_name, gen_number = generation_id
self._require_client_lock(client_name)
client = self._open_client(client_name)
client.set_file_chunks(filename, []) # FIXME: current gen only
def append_file_chunk_id(self, generation_id, filename, chunk_id):
self._require_existing_file(generation_id, filename)
client_name, gen_number = generation_id
self._require_client_lock(client_name)
client = self._open_client(client_name)
client.append_file_chunks(filename, [chunk_id]) # FIXME: curgen only
def get_file_children(self, generation_id, filename):
self._require_existing_file(generation_id, filename)
client_name, gen_number = generation_id
client = self._open_client(client_name)
return [os.path.join(filename, basename)
for basename in client.listdir(gen_number, filename)]
# Fsck.
def get_fsck_work_item(self):
return []
obnam-1.6.1/obnamlib/repo_fmt_6_tests.py 0000644 0001750 0001750 00000002015 12246357067 020130 0 ustar jenkins jenkins # Copyright (C) 2013 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import shutil
import tempfile
import obnamlib
class RepositoryFormat6Tests(obnamlib.RepositoryInterfaceTests):
def setUp(self):
self.tempdir = tempfile.mkdtemp()
fs = obnamlib.LocalFS(self.tempdir)
self.repo = obnamlib.RepositoryFormat6()
self.repo.set_fs(fs)
def tearDown(self):
shutil.rmtree(self.tempdir)
obnam-1.6.1/obnamlib/repo_interface.py 0000644 0001750 0001750 00000163660 12246357067 017651 0 ustar jenkins jenkins # repo_interface.py -- interface class for repository access
#
# Copyright 2013 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
#
# =*= License: GPL-3+ =*=
import unittest
import obnamlib
# The following is a canonical list of all keys that can be used with
# the repository interface for key/value pairs. Not all formats need
# to support all keys, but they all must support the test keys, for
# the test suite to function.
REPO_CLIENT_TEST_KEY = 0 # string
REPO_GENERATION_TEST_KEY = 1 # string
REPO_FILE_TEST_KEY = 2 # string
REPO_FILE_MTIME = 3 # integer
REPO_FILE_INTEGER_KEYS = (
REPO_FILE_MTIME,
)
# The following is a key that is NOT allowed for any repository format.
WRONG_KEY = -1
class RepositoryClientListLockingFailed(obnamlib.Error):
def __init__(self):
self.msg = 'Repository client list could not be locked'
class RepositoryClientListNotLocked(obnamlib.Error):
def __init__(self):
self.msg = 'Repository client list is not locked'
class RepositoryClientAlreadyExists(obnamlib.Error):
def __init__(self, client_name):
self.msg = 'Repository client %s already exists' % client_name
class RepositoryClientDoesNotExist(obnamlib.Error):
def __init__(self, client_name):
self.msg = 'Repository client %s does not exist' % client_name
class RepositoryClientLockingFailed(obnamlib.Error):
def __init__(self, client_name):
self.msg = 'Repository client %s could not be locked' % client_name
class RepositoryClientNotLocked(obnamlib.Error):
def __init__(self, client_name):
self.msg = 'Repository client %s is not locked' % client_name
class RepositoryClientKeyNotAllowed(obnamlib.Error):
def __init__(self, format, client_name, key):
self.msg = (
'Client %s uses repository format %s '
'which does not allow the key %s to be use for clients' %
(format, client_name, key))
class RepositoryClientGenerationUnfinished(obnamlib.Error):
def __init__(self, client_name):
self.msg = (
'Cannot start new generation for %s: '
'previous one is not finished yet (programming error)' %
client_name)
class RepositoryGenerationKeyNotAllowed(obnamlib.Error):
def __init__(self, format, client_name, key):
self.msg = (
'Client %s uses repository format %s '
'which does not allow the key %s to be use for generations' %
(format, client_name, key))
class RepositoryGenerationDoesNotExist(obnamlib.Error):
def __init__(self, client_name):
self.msg = (
'Cannot find requested generation for client %s' %
client_name)
class RepositoryClientHasNoGenerations(obnamlib.Error):
def __init__(self, client_name):
self.msg = 'Client %s has no generations' % client_name
class RepositoryFileDoesNotExistInGeneration(obnamlib.Error):
def __init__(self, client_name, genspec, filename):
self.msg = (
'Client %s, generation %s does not have file %s' %
(client_name, genspec, filename))
class RepositoryFileKeyNotAllowed(obnamlib.Error):
def __init__(self, format, client_name, key):
self.msg = (
'Client %s uses repository format %s '
'which does not allow the key %s to be use for files' %
(client_name, format, key))
class RepositoryChunkDoesNotExist(obnamlib.Error):
def __init__(self, chunk_id_as_string):
self.msg = "Repository doesn't contain chunk %s" % chunk_id_as_string
class RepositoryChunkContentNotInIndexes(obnamlib.Error):
def __init__(self):
self.msg = "Repository chunk indexes do not contain content"
class RepositoryChunkIndexesNotLocked(obnamlib.Error):
def __init__(self):
self.msg = 'Repository chunk indexes are not locked'
class RepositoryChunkIndexesLockingFailed(obnamlib.Error):
def __init__(self):
self.msg = 'Repository chunk indexes are already locked'
class RepositoryInterface(object):
'''Abstract interface to Obnam backup repositories.
An Obnam backup repository stores backups for backup clients.
As development of Obnam progresses, the details of how things
are stored can change. This is usually necessary for performance
improvements.
To allow Obnam to access, both for reading and writing, any
version of the repository format, this class defines an interface
for repository access. Every different version of the format
implements a class with this interface, so that the rest of
Obnam can just use the interface.
The interface is suitably high level that using the repository
is convenient, and that it allows a variety of implementations.
At the same time it concentrates on the needs of repository
access only.
The interface also specifies the interface with which the
implementation accesses the actual filesystem: it is the
Obnam VFS layer.
[rest of Obnam code]
|
| calls RepositoryInterface API
|
V
[RepositoryFormatX implementing RepositoryInterface API]
|
| calls VFS API
|
V
[FooFS implementing VirtualFileSystem API]
The VFS API implementation is given to the RepositoryInterface
implementation with the ``set_fs`` method.
It must be stressed that ALL access to the repository go via
an implemention of RepositoryInterface. Further, all the
implementation classes must be instantiated via RepositoryFactory.
The abstraction RepositoryInterface provides for repositories
consists of a few key concepts:
* A repository contains data about one or more clients.
* For each client, there is some metadata, plus a list of generations.
* For each generation, there is some metadata, plus a list of
files (where directories are treated as files).
* For each file, there is some metadata, plus a list of chunk
identifiers.
* File contents data is split into chunks, each given a unique
identifier.
* There is optionally some indexing for content based lookups of
chunks (e.g., look up chunks based on an MD5 checksum).
* There are three levels of locking: the list of clients,
the per-client data (information about generations), and
the chunk lookup indexes are all locked up individually.
* All metadata is stored as key/value pairs, where the key is one
of a strictly limited, version-specific list of allowed ones,
and the value is a binary string or a 64-bit integer (the type
depends on the key). All allowed keys are implicitly set to
the empty string or 0 if not set otherwise.
Further, the repository format version implementation is given
a directory in which it stores the repository, using any number
of files it wishes. No other files will be in that directory.
(RepositoryFactory creates the actual directory.) The only
restriction is that within that directory, the
``metadata/format``file MUST be a plain text file (no encryption,
compression), containing a single line, giving the format
of the repository, as an arbitrary string. Each RepositoryInterface
implementation will work with exactly one such format, and have
a class attribute ``format`` which contains the string.
There is no method to remove a repository. This is handled
externally by removing the repository directory and all its files.
Since that code is generic, it is not needed in the interface.
Each RepositoryInterface implementation can have a custom
initialiser. RepositoryFactory will know how to call it,
giving it all the information it needs.
Generation and chunk identifiers, as returned by this API, are
opaque objects, which may be compared for equality, but not for
sorting. A generation id will include information to identify
the client it belongs to, in order to make it unnecessary to
always specify the client.
File metadata (stat fields, etc) are stored using individual
file keys:
repo.set_file_key(gen_id, filename, REPO_FILE_KEY_MTIME, mtime)
This is to allow maximum flexibility in how data is actually stored
in the repository, and to make the least amount of assumptions
that will hinder convertability between repository formats.
However, storing them independently is likely to be epxensive,
and so the implementation may actually pool file key changes to
a file and only actually encode all of them, as a blob, when the
API user is finished with a file. There is no API call to indicate
that explicitly, but the API implementation can deduce it by noticing
that another file's file key, or other metadata, gets set. This
design aims to make the API as easy to use as possible, by avoiding
an extra "I am finished with this file for now" method call.
'''
# Operations on the repository itself.
def set_fs(self, fs):
'''Set the Obnam VFS instance for accessing the filesystem.'''
raise NotImplementedError()
def init_repo(self):
'''Initialize a nearly-empty directory for this format version.
The repository will contain the file ``metadata/format``,
with the right contents, but nothing else.
'''
raise NotImplementedError()
# Client list.
def get_client_names(self):
'''Return list of client names currently existing in the repository.'''
raise NotImplementedError()
def lock_client_list(self):
'''Lock the client list for changes.'''
raise NotImplementedError()
def commit_client_list(self):
'''Commit changes to client list and unlock it.'''
raise NotImplementedError()
def unlock_client_list(self):
'''Forget changes to client list and unlock it.'''
raise NotImplementedError()
def force_client_list_lock(self):
'''Force the client list lock.
If the process that locked the client list is dead, this method
forces the lock open and takes it for the calling process instead.
Any uncommitted changes by the original locker will be lost.
'''
raise NotImplementedError()
def add_client(self, client_name):
'''Add a client to the client list.
Raise RepositoryClientAlreadyExists if the client already exists.
'''
raise NotImplementedError()
def remove_client(self, client_name):
'''Remove a client from the client list.'''
raise NotImplementedError()
def rename_client(self, old_client_name, new_client_name):
'''Rename a client to have a new name.'''
raise NotImplementedError()
# A particular client.
def lock_client(self, client_name):
'''Lock the client for changes.
This lock must be taken for any changes to the per-client
data, including any changes to backup generations for the
client.
'''
raise NotImplementedError()
def commit_client(self, client_name):
'''Commit changes to client and unlock it.'''
raise NotImplementedError()
def unlock_client(self, client_name):
'''Forget changes to client and unlock it.'''
raise NotImplementedError()
def force_client_lock(self, client_name):
'''Force the client lock.
If the process that locked the client is dead, this method
forces the lock open and takes it for the calling process instead.
Any uncommitted changes by the original locker will be lost.
'''
raise NotImplementedError()
def get_allowed_client_keys(self):
'''Return list of allowed per-client keys for thist format.'''
raise NotImplementedError()
def get_client_key(self, client_name, key):
'''Return current value of a key for a given client.
If not set explicitly, the value is the empty string.
If the key is not in the list of allowed keys for this
format, raise RepositoryClientKeyNotAllowed.
'''
raise NotImplementedError()
def set_client_key(self, client_name, key, value):
'''Set value for a per-client key.'''
raise NotImplementedError()
def get_client_generation_ids(self, client_name):
'''Return a list of opague ids for generations in a client.
The list is ordered: the first id in the list is the oldest
generation. The ids needs not be sortable, and they may or
may not be simple types.
'''
raise NotImplementedError()
def create_generation(self, client_name):
'''Start a new generation for a client.
Return the generation id for the new generation. The id
implicitly also identifies the client.
'''
raise NotImplementedError()
# Generations. The generation id identifies client as well.
def get_allowed_generation_keys(self):
'''Return list of all allowed keys for generations.'''
raise NotImplementedError()
def get_generation_key(self, generation_id, key):
'''Return current value for a generation key.'''
raise NotImplementedError()
def set_generation_key(self, generation_id, key, value):
'''Set a key/value pair for a given generation.'''
raise NotImplementedError()
def remove_generation(self, generation_id):
'''Remove an existing generation.
The removed generation may be the currently unfinished one.
'''
raise NotImplementedError()
def get_generation_chunk_ids(self, generation_id):
'''Return list of chunk ids used by a generation.
Each file lists the chunks it uses, but iterating over all
files is expensive. This method gives a potentially more
efficient way of getting the information.
'''
raise NotImplementedError()
def interpret_generation_spec(self, client_name, genspec):
'''Return the generation id for a user-given specification.
The specification is a string, and either gives the number
of a generation, or is the word 'latest'.
The return value is a generation id usable with the
RepositoryInterface API.
'''
raise NotImplementedError()
def make_generation_spec(self, gen_id):
'''Return a generation spec that matches a given generation id.
If we tell the user the returned string, and they later give
it to interpret_generation_spec, the same generation id is
returned.
'''
raise NotImplementedError()
# Individual files and directories in a generation.
def file_exists(self, generation_id, filename):
'''Does a file exist in a generation?
The filename should be the full path to the file.
'''
raise NotImplementedError()
def add_file(self, generation_id, filename):
'''Adds a file to the generation.
Any metadata about the file needs to be added with
set_file_key.
'''
raise NotImplementedError()
def remove_file(self, generation_id, filename):
'''Removes a file from the given generation.
The generation MUST be the created, but not committed or
unlocked generation.
All the file keys associated with the file are also removed.
'''
raise NotImplementedError()
def get_allowed_file_keys(self):
'''Return list of allowed file keys for this format.'''
raise NotImplementedError()
def get_file_key(self, generation_id, filename, key):
'''Return value for a file key, or empty string.
The empty string is returned if no value has been set for the
file key, or the file does not exist.
'''
raise NotImplementedError()
def set_file_key(self, generation_id, filename, key, value):
'''Set value for a file key.
It is an error to set the value for a file key if the file does
not exist yet.
'''
raise NotImplementedError()
def get_file_chunk_ids(self, generation_id, filename):
'''Get the list of chunk ids for a file.'''
raise NotImplementedError()
def clear_file_chunk_ids(self, generation_id, filename):
'''Clear the list of chunk ids for a file.'''
raise NotImplementedError()
def append_file_chunk_id(self, generation_id, filename, chunk_id):
'''Add a chunk id for a file.
The chunk id is added to the end of the list of chunk ids,
so file data ordering is preserved..
'''
raise NotImplementedError()
def get_file_children(self, generation_id, filename):
'''List contents of a directory.
This returns a list of full pathnames for all the files in
the repository that are direct children of the given file.
This may fail if the given file is not a directory, but
that is not guaranteed.
'''
raise NotImplementedError()
# Chunks.
def put_chunk_content(self, data):
'''Add a new chunk into the repository.
Return the chunk identifier.
'''
raise NotImplementedError()
def get_chunk_content(self, chunk_id):
'''Return the contents of a chunk, given its id.'''
raise NotImplementedError()
def has_chunk(self, chunk_id):
'''Does a chunk (still) exist in the repository?'''
raise NotImplementedError()
def remove_chunk(self, chunk_id):
'''Remove chunk from repository, but not chunk indexes.'''
raise NotImplementedError()
def lock_chunk_indexes(self):
'''Locks chunk indexes for updates.'''
raise NotImplementedError()
def unlock_chunk_indexes(self):
'''Unlocks chunk indexes without committing them.'''
raise NotImplementedError()
def force_chunk_indexex_lock(self):
'''Forces a chunk index lock open and takes it for the caller.'''
raise NotImplementedError()
def commit_chunk_indexes(self):
'''Commit changes to chunk indexes.'''
raise NotImplementedError()
def put_chunk_into_indexes(self, chunk_id, data, client_id):
'''Adds a chunk to indexes.
This does not do any de-duplication.
The indexes map a chunk id to its checksum, and a checksum
to both the chunk ids (possibly several!) and the client ids
for the clients that use the chunk. The client ids are used
to track when a chunk is no longer used by anyone and can
be removed.
'''
raise NotImplementedError()
def remove_chunk_from_indexes(self, chunk_id, client_id):
'''Removes a chunk from indexes, given its id, for a given client.'''
raise NotImplementedError()
def find_chunk_id_by_content(self, data):
'''Finds a chunk id given its content.
This will raise RepositoryChunkContentNotInIndexes if the
chunk is not in the indexes. Otherwise it will return one
chunk id that has exactly the same content. If the indexes
contain duplicate chunks, any one of the might be returned.
'''
raise NotImplementedError()
# Fsck.
def get_fsck_work_item(self):
'''Return an fsck work item for checking this repository.
The work item may spawn more items.
'''
raise NotImplementedError()
class RepositoryInterfaceTests(unittest.TestCase): # pragma: no cover
'''Tests for implementations of RepositoryInterface.
Each implementation of RepositoryInterface should have a corresponding
test class, which inherits this class. The test subclass must set
``self.repo`` to an instance of the class to be tested. The repository
must be empty and uninitialised.
'''
# Tests for repository level things.
def test_has_format_attribute(self):
self.assertEqual(type(self.repo.format), str)
def test_has_set_fs_method(self):
# We merely test that set_fs can be called.
self.assertEqual(self.repo.set_fs(None), None)
# Tests for the client list.
def test_has_no_clients_initially(self):
self.repo.init_repo()
self.assertEqual(self.repo.get_client_names(), [])
def test_adds_a_client(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.add_client('foo')
self.assertEqual(self.repo.get_client_names(), ['foo'])
def test_renames_a_client(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.add_client('foo')
self.repo.commit_client_list()
self.repo.lock_client_list()
self.repo.rename_client('foo', 'bar')
self.assertEqual(self.repo.get_client_names(), ['bar'])
def test_removes_a_client(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.add_client('foo')
self.repo.remove_client('foo')
self.assertEqual(self.repo.get_client_names(), [])
def test_fails_adding_existing_client(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.add_client('foo')
self.assertRaises(
obnamlib.RepositoryClientAlreadyExists,
self.repo.add_client, 'foo')
def test_fails_renaming_nonexistent_client(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.assertRaises(
obnamlib.RepositoryClientDoesNotExist,
self.repo.rename_client, 'foo', 'bar')
def test_fails_renaming_to_existing_client(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.add_client('foo')
self.repo.add_client('bar')
self.repo.commit_client_list()
self.repo.lock_client_list()
self.assertRaises(
obnamlib.RepositoryClientAlreadyExists,
self.repo.rename_client, 'foo', 'bar')
def test_fails_removing_nonexistent_client(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.assertRaises(
obnamlib.RepositoryClientDoesNotExist,
self.repo.remove_client, 'foo')
def test_raises_lock_error_if_adding_client_without_locking(self):
self.repo.init_repo()
self.assertRaises(
obnamlib.RepositoryClientListNotLocked,
self.repo.add_client, 'foo')
def test_raises_lock_error_if_renaming_client_without_locking(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.add_client('foo')
self.repo.commit_client_list()
self.assertRaises(
obnamlib.RepositoryClientListNotLocked,
self.repo.rename_client, 'foo', 'bar')
def test_raises_lock_error_if_removing_client_without_locking(self):
self.repo.init_repo()
self.assertRaises(
obnamlib.RepositoryClientListNotLocked,
self.repo.remove_client, 'foo')
def test_unlocking_client_list_does_not_add_client(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.add_client('foo')
self.repo.unlock_client_list()
self.assertEqual(self.repo.get_client_names(), [])
def test_unlocking_client_list_does_not_rename_client(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.add_client('foo')
self.repo.commit_client_list()
self.repo.lock_client_list()
self.repo.rename_client('foo', 'bar')
self.repo.unlock_client_list()
self.assertEqual(self.repo.get_client_names(), ['foo'])
def test_unlocking_client_list_does_not_remove_client(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.add_client('foo')
self.repo.commit_client_list()
self.repo.lock_client_list()
self.repo.remove_client('foo')
self.repo.unlock_client_list()
self.assertEqual(self.repo.get_client_names(), ['foo'])
def test_committing_client_list_adds_client(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.add_client('foo')
self.repo.commit_client_list()
self.assertEqual(self.repo.get_client_names(), ['foo'])
def test_committing_client_list_renames_client(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.add_client('foo')
self.repo.commit_client_list()
self.repo.lock_client_list()
self.repo.rename_client('foo', 'bar')
self.repo.commit_client_list()
self.assertEqual(self.repo.get_client_names(), ['bar'])
def test_commiting_client_list_removes_client(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.add_client('foo')
self.repo.commit_client_list()
self.repo.lock_client_list()
self.repo.remove_client('foo')
self.repo.commit_client_list()
self.assertEqual(self.repo.get_client_names(), [])
def test_commiting_client_list_removes_lock(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.commit_client_list()
self.repo.lock_client_list()
self.assertEqual(self.repo.get_client_names(), [])
def test_unlocking_client_list_removes_lock(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.unlock_client_list()
self.repo.lock_client_list()
self.assertEqual(self.repo.get_client_names(), [])
def test_locking_client_list_twice_fails(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.assertRaises(
obnamlib.RepositoryClientListLockingFailed,
self.repo.lock_client_list)
def test_unlocking_client_list_when_unlocked_fails(self):
self.repo.init_repo()
self.assertRaises(
obnamlib.RepositoryClientListNotLocked,
self.repo.unlock_client_list)
def test_committing_client_list_when_unlocked_fails(self):
self.repo.init_repo()
self.assertRaises(
obnamlib.RepositoryClientListNotLocked,
self.repo.commit_client_list)
def test_forces_client_list_lock(self):
self.repo.init_repo()
self.repo.lock_client_list()
self.repo.add_client('bar')
self.repo.force_client_list_lock()
self.repo.add_client('foo')
self.assertEqual(self.repo.get_client_names(), ['foo'])
# Tests for client specific stuff.
def setup_client(self):
self.repo.lock_client_list()
self.repo.add_client('fooclient')
self.repo.commit_client_list()
def test_locking_client_twice_fails(self):
self.setup_client()
self.repo.lock_client('fooclient')
self.assertRaises(
obnamlib.RepositoryClientLockingFailed,
self.repo.lock_client, 'fooclient')
def test_unlocking_client_when_unlocked_fails(self):
self.setup_client()
self.assertRaises(
obnamlib.RepositoryClientNotLocked,
self.repo.unlock_client, 'fooclient')
def test_committing_client_when_unlocked_fails(self):
self.setup_client()
self.assertRaises(
obnamlib.RepositoryClientNotLocked,
self.repo.commit_client, 'fooclient')
def test_unlocking_nonexistent_client_fails(self):
self.setup_client()
self.assertRaises(
obnamlib.RepositoryClientDoesNotExist,
self.repo.unlock_client, 'notexist')
def test_committing_nonexistent_client_fails(self):
self.setup_client()
self.assertRaises(
obnamlib.RepositoryClientDoesNotExist,
self.repo.commit_client, 'notexist')
def test_unlocking_client_removes_lock(self):
self.setup_client()
self.repo.lock_client('fooclient')
self.repo.unlock_client('fooclient')
self.assertEqual(self.repo.lock_client('fooclient'), None)
def test_committing_client_removes_lock(self):
self.setup_client()
self.repo.lock_client('fooclient')
self.repo.commit_client('fooclient')
self.assertEqual(self.repo.lock_client('fooclient'), None)
def test_has_list_of_allowed_client_keys(self):
self.assertEqual(type(self.repo.get_allowed_client_keys()), list)
def test_gets_all_allowed_client_keys(self):
self.setup_client()
for key in self.repo.get_allowed_client_keys():
value = self.repo.get_client_key('fooclient', key)
self.assertEqual(type(value), str)
def client_test_key_is_allowed(self):
return (obnamlib.REPO_CLIENT_TEST_KEY in
self.repo.get_allowed_client_keys())
def test_has_empty_string_for_client_test_key(self):
if self.client_test_key_is_allowed():
self.setup_client()
value = self.repo.get_client_key(
'fooclient', obnamlib.REPO_CLIENT_TEST_KEY)
self.assertEqual(value, '')
def test_sets_client_key(self):
if self.client_test_key_is_allowed():
self.setup_client()
self.repo.lock_client('fooclient')
self.repo.set_client_key(
'fooclient', obnamlib.REPO_CLIENT_TEST_KEY, 'bar')
value = self.repo.get_client_key(
'fooclient', obnamlib.REPO_CLIENT_TEST_KEY)
self.assertEqual(value, 'bar')
def test_setting_unallowed_client_key_fails(self):
self.setup_client()
self.repo.lock_client('fooclient')
self.assertRaises(
obnamlib.RepositoryClientKeyNotAllowed,
self.repo.set_client_key, 'fooclient', WRONG_KEY, '')
def test_setting_client_key_without_locking_fails(self):
if self.client_test_key_is_allowed():
self.setup_client()
self.assertRaises(
obnamlib.RepositoryClientNotLocked,
self.repo.set_client_key,
'fooclient', obnamlib.REPO_CLIENT_TEST_KEY, 'bar')
def test_committing_client_preserves_key_changs(self):
if self.client_test_key_is_allowed():
self.setup_client()
self.repo.lock_client('fooclient')
self.repo.set_client_key(
'fooclient', obnamlib.REPO_CLIENT_TEST_KEY, 'bar')
value = self.repo.get_client_key(
'fooclient', obnamlib.REPO_CLIENT_TEST_KEY)
self.repo.commit_client('fooclient')
self.assertEqual(value, 'bar')
def test_unlocking_client_undoes_key_changes(self):
if self.client_test_key_is_allowed():
self.setup_client()
self.repo.lock_client('fooclient')
self.repo.set_client_key(
'fooclient', obnamlib.REPO_CLIENT_TEST_KEY, 'bar')
self.repo.unlock_client('fooclient')
value = self.repo.get_client_key(
'fooclient', obnamlib.REPO_CLIENT_TEST_KEY)
self.assertEqual(value, '')
def test_getting_client_key_for_unknown_client_fails(self):
if self.client_test_key_is_allowed():
self.setup_client()
self.assertRaises(
obnamlib.RepositoryClientDoesNotExist,
self.repo.get_client_key, 'notexistclient',
obnamlib.REPO_CLIENT_TEST_KEY)
def test_new_client_has_no_generations(self):
self.setup_client()
self.assertEqual(self.repo.get_client_generation_ids('fooclient'), [])
def test_creates_new_generation(self):
self.setup_client()
self.repo.lock_client('fooclient')
new_id = self.repo.create_generation('fooclient')
self.assertEqual(
self.repo.get_client_generation_ids('fooclient'),
[new_id])
def test_creating_generation_fails_current_generation_unfinished(self):
self.setup_client()
self.repo.lock_client('fooclient')
self.repo.create_generation('fooclient')
self.assertRaises(
obnamlib.RepositoryClientGenerationUnfinished,
self.repo.create_generation, 'fooclient')
def test_creating_generation_fails_if_client_is_unlocked(self):
self.setup_client()
self.assertRaises(
obnamlib.RepositoryClientNotLocked,
self.repo.create_generation, 'fooclient')
def test_unlocking_client_removes_created_generation(self):
self.setup_client()
self.repo.lock_client('fooclient')
new_id = self.repo.create_generation('fooclient')
self.repo.unlock_client('fooclient')
self.assertEqual(self.repo.get_client_generation_ids('fooclient'), [])
def test_committing_client_keeps_created_generation(self):
self.setup_client()
self.repo.lock_client('fooclient')
new_id = self.repo.create_generation('fooclient')
self.repo.commit_client('fooclient')
self.assertEqual(
self.repo.get_client_generation_ids('fooclient'),
[new_id])
# Operations on one generation.
def create_generation(self):
self.setup_client()
self.repo.lock_client('fooclient')
return self.repo.create_generation('fooclient')
def generation_test_key_is_allowed(self):
return (obnamlib.REPO_GENERATION_TEST_KEY in
self.repo.get_allowed_generation_keys())
def test_has_list_of_allowed_generation_keys(self):
self.assertEqual(type(self.repo.get_allowed_generation_keys()), list)
def test_gets_all_allowed_generation_keys(self):
gen_id = self.create_generation()
for key in self.repo.get_allowed_generation_keys():
value = self.repo.get_generation_key(gen_id, key)
self.assertEqual(type(value), str)
def test_has_empty_string_for_generation_test_key(self):
if self.generation_test_key_is_allowed():
gen_id = self.create_generation()
value = self.repo.get_generation_key(
gen_id, obnamlib.REPO_GENERATION_TEST_KEY)
self.assertEqual(value, '')
def test_sets_generation_key(self):
if self.generation_test_key_is_allowed():
gen_id = self.create_generation()
self.repo.set_generation_key(
gen_id, obnamlib.REPO_GENERATION_TEST_KEY, 'bar')
value = self.repo.get_generation_key(
gen_id, obnamlib.REPO_GENERATION_TEST_KEY)
self.assertEqual(value, 'bar')
def test_setting_unallowed_generation_key_fails(self):
if self.generation_test_key_is_allowed():
gen_id = self.create_generation()
self.assertRaises(
obnamlib.RepositoryGenerationKeyNotAllowed,
self.repo.set_generation_key, gen_id, WRONG_KEY, '')
def test_setting_generation_key_without_locking_fails(self):
if self.generation_test_key_is_allowed():
gen_id = self.create_generation()
self.repo.commit_client('fooclient')
self.assertRaises(
obnamlib.RepositoryClientNotLocked,
self.repo.set_generation_key,
gen_id, obnamlib.REPO_GENERATION_TEST_KEY, 'bar')
def test_committing_client_preserves_generation_key_changs(self):
if self.generation_test_key_is_allowed():
gen_id = self.create_generation()
self.repo.set_generation_key(
gen_id, obnamlib.REPO_GENERATION_TEST_KEY, 'bar')
value = self.repo.get_generation_key(
gen_id, obnamlib.REPO_GENERATION_TEST_KEY)
self.repo.commit_client('fooclient')
self.assertEqual(value, 'bar')
def test_unlocking_client_undoes_generation_key_changes(self):
if self.generation_test_key_is_allowed():
gen_id = self.create_generation()
self.repo.set_generation_key(
gen_id, obnamlib.REPO_GENERATION_TEST_KEY, 'bar')
self.repo.unlock_client('fooclient')
value = self.repo.get_generation_key(
gen_id, obnamlib.REPO_CLIENT_TEST_KEY)
self.assertEqual(value, '')
def test_removes_unfinished_generation(self):
gen_id = self.create_generation()
self.repo.remove_generation(gen_id)
self.assertEqual(self.repo.get_client_generation_ids('fooclient'), [])
def test_removes_finished_generation(self):
gen_id = self.create_generation()
self.repo.commit_client('fooclient')
self.repo.lock_client('fooclient')
self.repo.remove_generation(gen_id)
self.assertEqual(self.repo.get_client_generation_ids('fooclient'), [])
def test_removing_removed_generation_fails(self):
gen_id = self.create_generation()
self.repo.remove_generation(gen_id)
self.assertRaises(
obnamlib.RepositoryGenerationDoesNotExist,
self.repo.remove_generation, gen_id)
def test_removing_generation_without_client_lock_fails(self):
gen_id = self.create_generation()
self.repo.commit_client('fooclient')
self.assertRaises(
obnamlib.RepositoryClientNotLocked,
self.repo.remove_generation, gen_id)
def test_unlocking_client_forgets_generation_removal(self):
gen_id = self.create_generation()
self.repo.commit_client('fooclient')
self.repo.lock_client('fooclient')
self.repo.remove_generation(gen_id)
self.repo.unlock_client('fooclient')
self.assertEqual(
self.repo.get_client_generation_ids('fooclient'),
[gen_id])
def test_committing_client_actually_removes_generation(self):
gen_id = self.create_generation()
self.repo.remove_generation(gen_id)
self.repo.commit_client('fooclient')
self.assertEqual(self.repo.get_client_generation_ids('fooclient'), [])
def test_empty_generation_uses_no_chunk_ids(self):
gen_id = self.create_generation()
self.assertEqual(self.repo.get_generation_chunk_ids(gen_id), [])
def test_interprets_latest_as_a_generation_spec(self):
gen_id = self.create_generation()
self.assertEqual(
self.repo.interpret_generation_spec('fooclient', 'latest'),
gen_id)
def test_interpreting_latest_genspec_without_generations_fails(self):
self.setup_client()
self.assertRaises(
obnamlib.RepositoryClientHasNoGenerations,
self.repo.interpret_generation_spec, 'fooclient', 'latest')
def test_interprets_generation_spec(self):
gen_id = self.create_generation()
genspec = self.repo.make_generation_spec(gen_id)
self.assertEqual(
self.repo.interpret_generation_spec('fooclient', genspec),
gen_id)
def test_interpreting_generation_spec_for_removed_generation_fails(self):
# Note that we must have at least one generation, after removing
# one.
gen_id = self.create_generation()
self.repo.commit_client('fooclient')
self.repo.lock_client('fooclient')
gen_id_2 = self.repo.create_generation('fooclient')
genspec = self.repo.make_generation_spec(gen_id)
self.repo.remove_generation(gen_id)
self.assertRaises(
obnamlib.RepositoryGenerationDoesNotExist,
self.repo.interpret_generation_spec, 'fooclient', genspec)
# Tests for individual files in a generation.
def test_file_does_not_exist(self):
gen_id = self.create_generation()
self.assertFalse(self.repo.file_exists(gen_id, '/foo/bar'))
def test_adds_file(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.assertTrue(self.repo.file_exists(gen_id, '/foo/bar'))
def test_unlocking_forgets_file_add(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.unlock_client('fooclient')
self.assertFalse(self.repo.file_exists(gen_id, '/foo/bar'))
def test_committing_remembers_file_add(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.commit_client('fooclient')
self.assertTrue(self.repo.file_exists(gen_id, '/foo/bar'))
def test_creating_generation_clones_previous_one(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.commit_client('fooclient')
self.repo.lock_client('fooclient')
gen_id_2 = self.repo.create_generation('fooclient')
self.assertTrue(self.repo.file_exists(gen_id_2, '/foo/bar'))
def test_removes_added_file_from_current_generation(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.remove_file(gen_id, '/foo/bar')
self.assertFalse(self.repo.file_exists(gen_id, '/foo/bar'))
def test_unlocking_forgets_file_removal(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.commit_client('fooclient')
self.repo.lock_client('fooclient')
gen_id_2 = self.repo.create_generation('fooclient')
self.repo.remove_file(gen_id, '/foo/bar')
self.repo.unlock_client('fooclient')
self.assertTrue(self.repo.file_exists(gen_id, '/foo/bar'))
def test_committing_remembers_file_removal(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.commit_client('fooclient')
self.repo.lock_client('fooclient')
gen_id_2 = self.repo.create_generation('fooclient')
self.assertTrue(self.repo.file_exists(gen_id_2, '/foo/bar'))
self.repo.remove_file(gen_id_2, '/foo/bar')
self.repo.commit_client('fooclient')
self.assertTrue(self.repo.file_exists(gen_id, '/foo/bar'))
self.assertFalse(self.repo.file_exists(gen_id_2, '/foo/bar'))
def test_has_list_of_allowed_file_keys(self):
self.assertEqual(type(self.repo.get_allowed_file_keys()), list)
def test_gets_all_allowed_file_keys(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
for key in self.repo.get_allowed_file_keys():
value = self.repo.get_file_key(gen_id, '/foo/bar', key)
if key in REPO_FILE_INTEGER_KEYS:
self.assertEqual(type(value), int)
else:
self.assertEqual(type(value), str)
def test_has_empty_string_for_file_test_key(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
value = self.repo.get_file_key(
gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)
self.assertEqual(value, '')
def test_get_file_key_fails_for_nonexistent_generation(self):
gen_id = self.create_generation()
self.repo.remove_generation(gen_id)
self.assertRaises(
obnamlib.RepositoryGenerationDoesNotExist,
self.repo.get_file_key,
gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)
def test_get_file_key_fails_for_forbidden_key(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.assertRaises(
obnamlib.RepositoryFileKeyNotAllowed,
self.repo.get_file_key,
gen_id, '/foo/bar', WRONG_KEY)
def test_get_file_key_fails_for_nonexistent_file(self):
gen_id = self.create_generation()
self.assertRaises(
obnamlib.RepositoryFileDoesNotExistInGeneration,
self.repo.get_file_key,
gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)
def test_sets_file_key(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.set_file_key(
gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY, 'yoyo')
value = self.repo.get_file_key(
gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)
self.assertEqual(value, 'yoyo')
def test_setting_unallowed_file_key_fails(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.assertRaises(
obnamlib.RepositoryFileKeyNotAllowed,
self.repo.set_file_key, gen_id, '/foo/bar', WRONG_KEY, 'yoyo')
def test_file_has_zero_mtime_by_default(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
value = self.repo.get_file_key(
gen_id, '/foo/bar', obnamlib.REPO_FILE_MTIME)
self.assertEqual(value, 0)
def test_sets_file_mtime(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.set_file_key(
gen_id, '/foo/bar', obnamlib.REPO_FILE_MTIME, 123)
value = self.repo.get_file_key(
gen_id, '/foo/bar', obnamlib.REPO_FILE_MTIME)
self.assertEqual(value, 123)
def test_set_file_key_fails_for_nonexistent_generation(self):
gen_id = self.create_generation()
self.repo.remove_generation(gen_id)
self.assertRaises(
obnamlib.RepositoryGenerationDoesNotExist,
self.repo.set_file_key,
gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY, 'yoyo')
def test_setting_file_key_for_nonexistent_file_fails(self):
gen_id = self.create_generation()
self.assertRaises(
obnamlib.RepositoryFileDoesNotExistInGeneration,
self.repo.set_file_key,
gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY, 'yoyo')
# FIXME: These tests fails due to ClientMetadataTree brokenness, it seems.
# They're disabled, for now. The bug is not exposed by existing code,
# only by the new interface's tests.
if False:
def test_removing_file_removes_all_its_file_keys(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.set_file_key(
gen_id, '/foo/bar', obnamlib.REPO_FILE_MTIME, 123)
# Remove the file. Key should be removed.
self.repo.remove_file(gen_id, '/foo/bar')
self.assertRaises(
obnamlib.RepositoryFileDoesNotExistInGeneration,
self.repo.get_file_key,
gen_id, '/foo/bar', obnamlib.REPO_FILE_MTIME)
# Add the file back. Key should still be removed.
self.repo.add_file(gen_id, '/foo/bar')
value = self.repo.get_file_key(
gen_id, '/foo/bar', obnamlib.REPO_FILE_MTIME)
self.assertEqual(value, 0)
def test_can_add_a_file_then_remove_then_add_it_again(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.assertTrue(self.repo.file_exists(gen_id, '/foo/bar'))
self.repo.remove_file(gen_id, '/foo/bar')
self.assertFalse(self.repo.file_exists(gen_id, '/foo/bar'))
self.repo.add_file(gen_id, '/foo/bar')
self.assertTrue(self.repo.file_exists(gen_id, '/foo/bar'))
def test_unlocking_client_forgets_set_file_keys(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.set_file_key(
gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY, 'yoyo')
self.repo.unlock_client('fooclient')
self.assertRaises(
obnamlib.RepositoryGenerationDoesNotExist,
self.repo.get_file_key,
gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)
def test_committing_client_remembers_set_file_keys(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.set_file_key(
gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY, 'yoyo')
self.repo.commit_client('fooclient')
value = self.repo.get_file_key(
gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)
self.assertEqual(value, 'yoyo')
def test_setting_file_key_does_not_affect_previous_generation(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.set_file_key(
gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY, 'first')
self.repo.commit_client('fooclient')
self.repo.lock_client('fooclient')
gen_id_2 = self.repo.create_generation('fooclient')
self.repo.set_file_key(
gen_id_2, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY, 'second')
self.repo.commit_client('fooclient')
value = self.repo.get_file_key(
gen_id, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)
self.assertEqual(value, 'first')
value_2 = self.repo.get_file_key(
gen_id_2, '/foo/bar', obnamlib.REPO_FILE_TEST_KEY)
self.assertEqual(value_2, 'second')
def test_new_file_has_no_chunk_ids(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.assertEqual(self.repo.get_file_chunk_ids(gen_id, '/foo/bar'), [])
def test_getting_file_chunk_ids_for_nonexistent_generation_fails(self):
gen_id = self.create_generation()
self.repo.remove_generation(gen_id)
self.assertRaises(
obnamlib.RepositoryGenerationDoesNotExist,
self.repo.get_file_chunk_ids, gen_id, '/foo/bar')
def test_getting_file_chunk_ids_for_nonexistent_file_fails(self):
gen_id = self.create_generation()
self.assertRaises(
obnamlib.RepositoryFileDoesNotExistInGeneration,
self.repo.get_file_chunk_ids, gen_id, '/foo/bar')
def test_appends_one_file_chunk_id(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.append_file_chunk_id(gen_id, '/foo/bar', 1)
self.assertEqual(
self.repo.get_file_chunk_ids(gen_id, '/foo/bar'),
[1])
def test_appends_two_file_chunk_ids(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.append_file_chunk_id(gen_id, '/foo/bar', 1)
self.repo.append_file_chunk_id(gen_id, '/foo/bar', 2)
self.assertEqual(
self.repo.get_file_chunk_ids(gen_id, '/foo/bar'),
[1, 2])
def test_appending_file_chunk_ids_in_nonexistent_generation_fails(self):
gen_id = self.create_generation()
self.repo.remove_generation(gen_id)
self.assertRaises(
obnamlib.RepositoryGenerationDoesNotExist,
self.repo.append_file_chunk_id, gen_id, '/foo/bar', 1)
def test_appending_file_chunk_ids_to_nonexistent_file_fails(self):
gen_id = self.create_generation()
self.assertRaises(
obnamlib.RepositoryFileDoesNotExistInGeneration,
self.repo.append_file_chunk_id, gen_id, '/foo/bar', 1)
def test_adding_chunk_id_to_file_adds_it_to_generation_chunk_ids(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.append_file_chunk_id(gen_id, '/foo/bar', 1)
self.assertEqual(self.repo.get_generation_chunk_ids(gen_id), [1])
def test_clears_file_chunk_ids(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.append_file_chunk_id(gen_id, '/foo/bar', 1)
self.repo.clear_file_chunk_ids(gen_id, '/foo/bar')
self.assertEqual(self.repo.get_file_chunk_ids(gen_id, '/foo/bar'), [])
def test_clearing_file_chunk_ids_in_nonexistent_generation_fails(self):
gen_id = self.create_generation()
self.repo.remove_generation(gen_id)
self.assertRaises(
obnamlib.RepositoryGenerationDoesNotExist,
self.repo.clear_file_chunk_ids, gen_id, '/foo/bar')
def test_clearing_file_chunk_ids_for_nonexistent_file_fails(self):
gen_id = self.create_generation()
self.assertRaises(
obnamlib.RepositoryFileDoesNotExistInGeneration,
self.repo.clear_file_chunk_ids, gen_id, '/foo/bar')
def test_unlocking_client_forgets_modified_file_chunk_ids(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.append_file_chunk_id(gen_id, '/foo/bar', 1)
self.repo.commit_client('fooclient')
self.repo.lock_client('fooclient')
gen_id_2 = self.repo.create_generation('fooclient')
self.repo.append_file_chunk_id(gen_id_2, '/foo/bar', 2)
self.assertEqual(
self.repo.get_file_chunk_ids(gen_id_2, '/foo/bar'),
[1, 2])
self.repo.unlock_client('fooclient')
self.assertEqual(
self.repo.get_file_chunk_ids(gen_id, '/foo/bar'),
[1])
def test_committing_child_remembers_modified_file_chunk_ids(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.repo.append_file_chunk_id(gen_id, '/foo/bar', 1)
self.repo.commit_client('fooclient')
self.repo.lock_client('fooclient')
gen_id_2 = self.repo.create_generation('fooclient')
self.repo.append_file_chunk_id(gen_id_2, '/foo/bar', 2)
self.assertEqual(
self.repo.get_file_chunk_ids(gen_id_2, '/foo/bar'),
[1, 2])
self.repo.commit_client('fooclient')
self.assertEqual(
self.repo.get_file_chunk_ids(gen_id, '/foo/bar'),
[1])
self.assertEqual(
self.repo.get_file_chunk_ids(gen_id_2, '/foo/bar'),
[1, 2])
def test_new_file_has_no_children(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo/bar')
self.assertEqual(self.repo.get_file_children(gen_id, '/foo/bar'), [])
def test_gets_file_child(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/foo')
self.repo.add_file(gen_id, '/foo/bar')
self.assertEqual(
self.repo.get_file_children(gen_id, '/foo'),
['/foo/bar'])
def test_gets_only_immediate_child_for_file(self):
gen_id = self.create_generation()
self.repo.add_file(gen_id, '/')
self.repo.add_file(gen_id, '/foo')
self.repo.add_file(gen_id, '/foo/bar')
self.assertEqual(
self.repo.get_file_children(gen_id, '/'),
['/foo'])
# Chunk and chunk indexes.
def test_puts_chunk_into_repository(self):
chunk_id = self.repo.put_chunk_content('foochunk')
self.assertTrue(self.repo.has_chunk(chunk_id))
self.assertEqual(self.repo.get_chunk_content(chunk_id), 'foochunk')
def test_removes_chunk(self):
chunk_id = self.repo.put_chunk_content('foochunk')
self.repo.remove_chunk(chunk_id)
self.assertFalse(self.repo.has_chunk(chunk_id))
self.assertRaises(
obnamlib.RepositoryChunkDoesNotExist,
self.repo.get_chunk_content, chunk_id)
def test_removing_nonexistent_chunk_fails(self):
chunk_id = self.repo.put_chunk_content('foochunk')
self.repo.remove_chunk(chunk_id)
self.assertRaises(
obnamlib.RepositoryChunkDoesNotExist,
self.repo.remove_chunk, chunk_id)
def test_adds_chunk_to_indexes(self):
self.repo.lock_chunk_indexes()
chunk_id = self.repo.put_chunk_content('foochunk')
self.repo.put_chunk_into_indexes(chunk_id, 'foochunk', 123)
self.assertEqual(
self.repo.find_chunk_id_by_content('foochunk'), chunk_id)
def test_removes_chunk_from_indexes(self):
self.repo.lock_chunk_indexes()
chunk_id = self.repo.put_chunk_content('foochunk')
self.repo.put_chunk_into_indexes(chunk_id, 'foochunk', 123)
self.repo.remove_chunk_from_indexes(chunk_id, 123)
self.assertRaises(
obnamlib.RepositoryChunkContentNotInIndexes,
self.repo.find_chunk_id_by_content, 'foochunk')
def test_putting_chunk_to_indexes_without_locking_them_fails(self):
chunk_id = self.repo.put_chunk_content('foochunk')
self.assertRaises(
obnamlib.RepositoryChunkIndexesNotLocked,
self.repo.put_chunk_into_indexes, chunk_id, 'foochunk', 123)
def test_removing_chunk_from_indexes_without_locking_them_fails(self):
chunk_id = self.repo.put_chunk_content('foochunk')
self.repo.lock_chunk_indexes()
self.repo.put_chunk_into_indexes(chunk_id, 'foochunk', 123)
self.repo.commit_chunk_indexes()
self.assertRaises(
obnamlib.RepositoryChunkIndexesNotLocked,
self.repo.remove_chunk_from_indexes, chunk_id, 123)
def test_unlocking_chunk_indexes_forgets_changes(self):
chunk_id = self.repo.put_chunk_content('foochunk')
self.repo.lock_chunk_indexes()
self.repo.put_chunk_into_indexes(chunk_id, 'foochunk', 123)
self.repo.unlock_chunk_indexes()
self.assertRaises(
obnamlib.RepositoryChunkContentNotInIndexes,
self.repo.find_chunk_id_by_content, 'foochunk')
def test_committing_chunk_indexes_remembers_changes(self):
chunk_id = self.repo.put_chunk_content('foochunk')
self.repo.lock_chunk_indexes()
self.repo.put_chunk_into_indexes(chunk_id, 'foochunk', 123)
self.repo.commit_chunk_indexes()
self.assertEqual(
self.repo.find_chunk_id_by_content('foochunk'), chunk_id)
def test_locking_chunk_indexes_twice_fails(self):
self.repo.lock_chunk_indexes()
self.assertRaises(
obnamlib.RepositoryChunkIndexesLockingFailed,
self.repo.lock_chunk_indexes)
def test_unlocking_unlocked_chunk_indexes_fails(self):
self.assertRaises(
obnamlib.RepositoryChunkIndexesNotLocked,
self.repo.unlock_chunk_indexes)
def test_forces_chunk_index_lock(self):
self.repo.lock_chunk_indexes()
self.repo.force_chunk_indexes_lock()
self.assertEqual(self.repo.unlock_chunk_indexes(), None)
# Fsck.
def test_returns_fsck_work_item(self):
self.assertNotEqual(self.repo.get_fsck_work_item(), None)
obnam-1.6.1/obnamlib/repo_tests.py 0000644 0001750 0001750 00000076062 12246357067 017052 0 ustar jenkins jenkins # Copyright (C) 2010-2011 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import hashlib
import os
import shutil
import stat
import tempfile
import time
import unittest
import obnamlib
class RepositoryRootNodeTests(unittest.TestCase):
def setUp(self):
self.tempdir = tempfile.mkdtemp()
self.fs = obnamlib.LocalFS(self.tempdir)
self.repo = obnamlib.Repository(self.fs, obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
obnamlib.DEFAULT_LRU_SIZE, None,
obnamlib.IDPATH_DEPTH,
obnamlib.IDPATH_BITS,
obnamlib.IDPATH_SKIP,
time.time, 0, '')
self.otherfs = obnamlib.LocalFS(self.tempdir)
self.other = obnamlib.Repository(self.fs, obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
obnamlib.DEFAULT_LRU_SIZE, None,
obnamlib.IDPATH_DEPTH,
obnamlib.IDPATH_BITS,
obnamlib.IDPATH_SKIP,
time.time, 0, '')
def tearDown(self):
shutil.rmtree(self.tempdir)
def test_has_format_version(self):
self.assert_(hasattr(self.repo, 'format_version'))
def test_accepts_same_format_version(self):
self.assert_(self.repo.acceptable_version(self.repo.format_version))
def test_does_not_accept_older_format_version(self):
older_version = self.repo.format_version - 1
self.assertFalse(self.repo.acceptable_version(older_version))
def test_does_not_accept_newer_version(self):
newer_version = self.repo.format_version + 1
self.assertFalse(self.repo.acceptable_version(newer_version))
def test_has_none_version_for_empty_repository(self):
self.assertEqual(self.repo.get_format_version(), None)
def test_creates_repository_with_format_version(self):
self.repo.lock_root()
self.assertEqual(self.repo.get_format_version(),
self.repo.format_version)
def test_lists_no_clients(self):
self.assertEqual(self.repo.list_clients(), [])
def test_has_not_got_root_node_lock(self):
self.assertFalse(self.repo.got_root_lock)
def test_locks_root_node(self):
self.repo.lock_root()
self.assert_(self.repo.got_root_lock)
def test_locking_root_node_twice_fails(self):
self.repo.lock_root()
self.assertRaises(obnamlib.Error, self.repo.lock_root)
def test_commit_releases_lock(self):
self.repo.lock_root()
self.repo.commit_root()
self.assertFalse(self.repo.got_root_lock)
def test_unlock_releases_lock(self):
self.repo.lock_root()
self.repo.unlock_root()
self.assertFalse(self.repo.got_root_lock)
def test_commit_without_lock_fails(self):
self.assertRaises(obnamlib.LockFail, self.repo.commit_root)
def test_unlock_root_without_lock_fails(self):
self.assertRaises(obnamlib.LockFail, self.repo.unlock_root)
def test_commit_when_locked_by_other_fails(self):
self.other.lock_root()
self.assertRaises(obnamlib.LockFail, self.repo.commit_root)
def test_unlock_root_when_locked_by_other_fails(self):
self.other.lock_root()
self.assertRaises(obnamlib.LockFail, self.repo.unlock_root)
def test_on_disk_repository_has_no_version_initially(self):
self.assertEqual(self.repo.get_format_version(), None)
def test_lock_root_adds_version(self):
self.repo.lock_root()
self.assertEqual(self.repo.get_format_version(),
self.repo.format_version)
def test_lock_root_fails_if_format_is_incompatible(self):
self.repo._write_format_version(0)
self.assertRaises(obnamlib.BadFormat, self.repo.lock_root)
def test_list_clients_fails_if_format_is_incompatible(self):
self.repo._write_format_version(0)
self.assertRaises(obnamlib.BadFormat, self.repo.list_clients)
def test_locks_shared(self):
self.repo.lock_shared()
self.assertTrue(self.repo.got_shared_lock)
def test_locking_shared_twice_fails(self):
self.repo.lock_shared()
self.assertRaises(obnamlib.Error, self.repo.lock_shared)
def test_unlocks_shared(self):
self.repo.lock_shared()
self.repo.unlock_shared()
self.assertFalse(self.repo.got_shared_lock)
def test_unlock_shared_when_locked_by_other_fails(self):
self.other.lock_shared()
self.assertRaises(obnamlib.LockFail, self.repo.unlock_shared)
def test_lock_client_fails_if_format_is_incompatible(self):
self.repo._write_format_version(0)
self.assertRaises(obnamlib.BadFormat, self.repo.lock_client, 'foo')
def test_open_client_fails_if_format_is_incompatible(self):
self.repo._write_format_version(0)
self.assertRaises(obnamlib.BadFormat, self.repo.open_client, 'foo')
def test_adding_client_without_root_lock_fails(self):
self.assertRaises(obnamlib.LockFail, self.repo.add_client, 'foo')
def test_adds_client(self):
self.repo.lock_root()
self.repo.add_client('foo')
self.assertEqual(self.repo.list_clients(), ['foo'])
def test_adds_two_clients_across_commits(self):
self.repo.lock_root()
self.repo.add_client('foo')
self.repo.commit_root()
self.repo.lock_root()
self.repo.add_client('bar')
self.repo.commit_root()
self.assertEqual(sorted(self.repo.list_clients()), ['bar', 'foo'])
def test_adds_client_that_persists_after_commit(self):
self.repo.lock_root()
self.repo.add_client('foo')
self.repo.commit_root()
s2 = obnamlib.Repository(self.fs, obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
obnamlib.DEFAULT_LRU_SIZE, None,
obnamlib.IDPATH_DEPTH,
obnamlib.IDPATH_BITS,
obnamlib.IDPATH_SKIP,
time.time, 0, '')
self.assertEqual(s2.list_clients(), ['foo'])
def test_adding_existing_client_fails(self):
self.repo.lock_root()
self.repo.add_client('foo')
self.assertRaises(obnamlib.Error, self.repo.add_client, 'foo')
def test_removing_client_without_root_lock_fails(self):
self.assertRaises(obnamlib.LockFail, self.repo.remove_client, 'foo')
def test_removing_nonexistent_client_fails(self):
self.repo.lock_root()
self.assertRaises(obnamlib.Error, self.repo.remove_client, 'foo')
def test_removing_client_works(self):
self.repo.lock_root()
self.repo.add_client('foo')
self.repo.remove_client('foo')
self.assertEqual(self.repo.list_clients(), [])
def test_removing_client_persists_past_commit(self):
self.repo.lock_root()
self.repo.add_client('foo')
self.repo.remove_client('foo')
self.repo.commit_root()
self.assertEqual(self.repo.list_clients(), [])
def test_adding_client_without_commit_does_not_happen(self):
self.repo.lock_root()
self.repo.add_client('foo')
self.repo.unlock_root()
self.assertEqual(self.repo.list_clients(), [])
def test_removing_client_without_commit_does_not_happen(self):
self.repo.lock_root()
self.repo.add_client('foo')
self.repo.commit_root()
self.repo.lock_root()
self.repo.remove_client('foo')
self.repo.unlock_root()
self.assertEqual(self.repo.list_clients(), ['foo'])
def test_removing_client_that_has_data_removes_the_data_as_well(self):
self.repo.lock_root()
self.repo.add_client('foo')
self.repo.commit_root()
self.repo.lock_client('foo')
self.repo.lock_shared()
self.repo.start_generation()
self.repo.create('/', obnamlib.Metadata())
self.repo.commit_client()
self.repo.commit_shared()
self.repo.lock_root()
self.repo.remove_client('foo')
self.repo.commit_root()
self.assertEqual(self.repo.list_clients(), [])
self.assertFalse(self.fs.exists('foo'))
class RepositoryClientTests(unittest.TestCase):
def setUp(self):
self.tempdir = tempfile.mkdtemp()
self.fs = obnamlib.LocalFS(self.tempdir)
self.repo = obnamlib.Repository(self.fs, obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
obnamlib.DEFAULT_LRU_SIZE, None,
obnamlib.IDPATH_DEPTH,
obnamlib.IDPATH_BITS,
obnamlib.IDPATH_SKIP,
time.time, 0, '')
self.repo.lock_root()
self.repo.add_client('client_name')
self.repo.commit_root()
self.otherfs = obnamlib.LocalFS(self.tempdir)
self.other = obnamlib.Repository(self.otherfs,
obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
obnamlib.DEFAULT_LRU_SIZE, None,
obnamlib.IDPATH_DEPTH,
obnamlib.IDPATH_BITS,
obnamlib.IDPATH_SKIP,
time.time, 0, '')
self.dir_meta = obnamlib.Metadata()
self.dir_meta.st_mode = stat.S_IFDIR | 0777
def tearDown(self):
shutil.rmtree(self.tempdir)
def test_has_not_got_client_lock(self):
self.assertFalse(self.repo.got_client_lock)
def test_locks_client(self):
self.repo.lock_client('client_name')
self.assert_(self.repo.got_client_lock)
def test_locking_client_twice_fails(self):
self.repo.lock_client('client_name')
self.assertRaises(obnamlib.Error, self.repo.lock_client,
'client_name')
def test_locking_nonexistent_client_fails(self):
self.assertRaises(obnamlib.LockFail, self.repo.lock_client, 'foo')
def test_unlock_client_releases_lock(self):
self.repo.lock_client('client_name')
self.repo.unlock_client()
self.assertFalse(self.repo.got_client_lock)
def test_commit_client_releases_lock(self):
self.repo.lock_client('client_name')
self.repo.lock_shared()
self.repo.commit_client()
self.repo.commit_shared()
self.assertFalse(self.repo.got_client_lock)
def test_commit_does_not_mark_as_checkpoint_by_default(self):
self.repo.lock_client('client_name')
self.repo.lock_shared()
self.repo.start_generation()
genid = self.repo.new_generation
self.repo.commit_client()
self.repo.commit_shared()
self.repo.open_client('client_name')
self.assertFalse(self.repo.get_is_checkpoint(genid))
def test_commit_marks_as_checkpoint_when_requested(self):
self.repo.lock_client('client_name')
self.repo.lock_shared()
self.repo.start_generation()
genid = self.repo.new_generation
self.repo.commit_client(checkpoint=True)
self.repo.commit_shared()
self.repo.open_client('client_name')
self.assert_(self.repo.get_is_checkpoint(genid))
def test_commit_client_without_lock_fails(self):
self.assertRaises(obnamlib.LockFail, self.repo.commit_client)
def test_unlock_client_without_lock_fails(self):
self.assertRaises(obnamlib.LockFail, self.repo.unlock_client)
def test_commit_client_when_locked_by_other_fails(self):
self.other.lock_client('client_name')
self.assertRaises(obnamlib.LockFail, self.repo.commit_client)
def test_unlock_client_when_locked_by_other_fails(self):
self.other.lock_client('client_name')
self.assertRaises(obnamlib.LockFail, self.repo.unlock_client)
def test_opens_client_fails_if_client_does_not_exist(self):
self.assertRaises(obnamlib.Error, self.repo.open_client, 'bad')
def test_opens_client_even_when_locked_by_other(self):
self.other.lock_client('client_name')
self.repo.open_client('client_name')
self.assert_(True)
def test_lists_no_generations_when_readonly(self):
self.repo.open_client('client_name')
self.assertEqual(self.repo.list_generations(), [])
def test_lists_no_generations_when_locked(self):
self.repo.lock_client('client_name')
self.assertEqual(self.repo.list_generations(), [])
def test_listing_generations_fails_if_client_is_not_open(self):
self.assertRaises(obnamlib.Error, self.repo.list_generations)
def test_not_making_new_generation(self):
self.assertEqual(self.repo.new_generation, None)
def test_starting_new_generation_without_lock_fails(self):
self.assertRaises(obnamlib.LockFail, self.repo.start_generation)
def test_starting_new_generation_works(self):
self.repo.lock_client('client_name')
gen = self.repo.start_generation()
self.assert_(self.repo.new_generation)
self.assertEqual(self.repo.new_generation, gen)
self.assertEqual(self.repo.list_generations(), [gen])
def test_starting_second_concurrent_new_generation_fails(self):
self.repo.lock_client('client_name')
self.repo.start_generation()
self.assertRaises(obnamlib.Error, self.repo.start_generation)
def test_second_generation_has_different_id_from_first(self):
self.repo.lock_client('client_name')
self.repo.lock_shared()
gen = self.repo.start_generation()
self.repo.commit_client()
self.repo.commit_shared()
self.repo.lock_client('client_name')
self.assertNotEqual(gen, self.repo.start_generation())
def test_new_generation_has_start_time_only(self):
self.repo.lock_client('client_name')
gen = self.repo.start_generation()
start, end = self.repo.get_generation_times(gen)
self.assertNotEqual(start, None)
self.assertEqual(end, None)
def test_commited_generation_has_start_and_end_times(self):
self.repo.lock_client('client_name')
self.repo.lock_shared()
gen = self.repo.start_generation()
self.repo.commit_client()
self.repo.commit_shared()
self.repo.open_client('client_name')
start, end = self.repo.get_generation_times(gen)
self.assertNotEqual(start, None)
self.assertNotEqual(end, None)
self.assert_(start <= end)
def test_adding_generation_without_committing_does_not_add_it(self):
self.repo.lock_client('client_name')
self.repo.lock_shared()
self.repo.start_generation()
self.repo.unlock_client()
self.repo.unlock_shared()
self.repo.open_client('client_name')
self.assertEqual(self.repo.list_generations(), [])
def test_removing_generation_works(self):
self.repo.lock_client('client_name')
self.repo.lock_shared()
gen = self.repo.start_generation()
self.repo.commit_client()
self.repo.commit_shared()
self.repo.open_client('client_name')
self.assertEqual(len(self.repo.list_generations()), 1)
self.repo.lock_client('client_name')
self.repo.lock_shared()
self.repo.remove_generation(gen)
self.repo.commit_client()
self.repo.commit_shared()
self.repo.open_client('client_name')
self.assertEqual(self.repo.list_generations(), [])
def test_removing_only_second_generation_works(self):
# Create first generation. It will be empty.
self.repo.lock_client('client_name')
self.repo.lock_shared()
gen1 = self.repo.start_generation()
self.repo.commit_client()
self.repo.commit_shared()
# Create second generation. It will have a file with two chunks.
# Only one of the chunks will be put into the shared trees.
self.repo.lock_client('client_name')
self.repo.lock_shared()
gen2 = self.repo.start_generation()
chunk_id1 = self.repo.put_chunk_only('data')
self.repo.put_chunk_in_shared_trees(chunk_id1, 'checksum')
chunk_id2 = self.repo.put_chunk_only('data2')
self.repo.set_file_chunks('/foo', [chunk_id1, chunk_id2])
self.repo.commit_client()
self.repo.commit_shared()
# Do we have the right generations? And the chunk2?
self.repo.open_client('client_name')
self.assertEqual(len(self.repo.list_generations()), 2)
self.assertTrue(self.repo.chunk_exists(chunk_id1))
self.assertTrue(self.repo.chunk_exists(chunk_id2))
# Remove second generation. This should remove the chunk too.
self.repo.lock_client('client_name')
self.repo.lock_shared()
self.repo.remove_generation(gen2)
self.repo.commit_client()
self.repo.commit_shared()
# Make sure we have only the first generation, and that the
# chunks are gone.
self.repo.open_client('client_name')
self.assertEqual(self.repo.list_generations(), [gen1])
self.assertFalse(self.repo.chunk_exists(chunk_id1))
self.assertFalse(self.repo.chunk_exists(chunk_id2))
def test_removing_started_generation_fails(self):
self.repo.lock_client('client_name')
gen = self.repo.start_generation()
self.assertRaises(obnamlib.Error,
self.repo.remove_generation, gen)
def test_removing_without_committing_does_not_remove(self):
self.repo.lock_client('client_name')
self.repo.lock_shared()
gen = self.repo.start_generation()
self.repo.commit_client()
self.repo.commit_shared()
self.repo.lock_client('client_name')
self.repo.lock_shared()
self.repo.remove_generation(gen)
self.repo.unlock_client()
self.repo.unlock_shared()
self.repo.open_client('client_name')
self.assertEqual(self.repo.list_generations(), [gen])
def test_new_generation_has_root_dir_only(self):
self.repo.lock_client('client_name')
gen = self.repo.start_generation()
self.assertEqual(self.repo.listdir(gen, '/'), [])
def test_create_fails_unless_generation_is_started(self):
self.assertRaises(obnamlib.Error, self.repo.create, None, None)
def test_create_adds_file(self):
self.repo.lock_client('client_name')
gen = self.repo.start_generation()
self.repo.create('/', self.dir_meta)
self.repo.create('/foo', obnamlib.Metadata())
self.assertEqual(self.repo.listdir(gen, '/'), ['foo'])
def test_create_adds_two_files(self):
self.repo.lock_client('client_name')
gen = self.repo.start_generation()
self.repo.create('/', self.dir_meta)
self.repo.create('/foo', obnamlib.Metadata())
self.repo.create('/bar', obnamlib.Metadata())
self.assertEqual(sorted(self.repo.listdir(gen, '/')), ['bar', 'foo'])
def test_create_adds_lots_of_files(self):
n = 100
self.repo.lock_client('client_name')
gen = self.repo.start_generation()
pathnames = ['/%d' % i for i in range(n)]
for pathname in pathnames:
self.repo.create(pathname, obnamlib.Metadata())
self.assertEqual(sorted(self.repo.listdir(gen, '/')),
sorted(os.path.basename(x) for x in pathnames))
def test_create_adds_dir(self):
self.repo.lock_client('client_name')
gen = self.repo.start_generation()
self.repo.create('/foo', self.dir_meta)
self.assertEqual(self.repo.listdir(gen, '/foo'), [])
def test_create_adds_dir_after_file_in_it(self):
self.repo.lock_client('client_name')
gen = self.repo.start_generation()
self.repo.create('/foo/bar', obnamlib.Metadata())
self.repo.create('/foo', self.dir_meta)
self.assertEqual(self.repo.listdir(gen, '/foo'), ['bar'])
def test_gets_metadata_for_dir(self):
self.repo.lock_client('client_name')
gen = self.repo.start_generation()
self.repo.create('/foo', self.dir_meta)
self.assertEqual(self.repo.get_metadata(gen, '/foo').st_mode,
self.dir_meta.st_mode)
def test_remove_removes_file(self):
self.repo.lock_client('client_name')
gen = self.repo.start_generation()
self.repo.create('/foo', obnamlib.Metadata())
self.repo.remove('/foo')
self.assertEqual(self.repo.listdir(gen, '/'), [])
def test_remove_removes_directory_tree(self):
self.repo.lock_client('client_name')
gen = self.repo.start_generation()
self.repo.create('/foo/bar', obnamlib.Metadata())
self.repo.remove('/foo')
self.assertEqual(self.repo.listdir(gen, '/'), [])
def test_get_metadata_works(self):
metadata = obnamlib.Metadata()
metadata.st_size = 123
self.repo.lock_client('client_name')
gen = self.repo.start_generation()
self.repo.create('/foo', metadata)
received = self.repo.get_metadata(gen, '/foo')
self.assertEqual(metadata.st_size, received.st_size)
def test_get_metadata_raises_exception_if_file_does_not_exist(self):
self.repo.lock_client('client_name')
gen = self.repo.start_generation()
self.assertRaises(obnamlib.Error, self.repo.get_metadata,
gen, '/foo')
class RepositoryChunkTests(unittest.TestCase):
def setUp(self):
self.tempdir = tempfile.mkdtemp()
self.fs = obnamlib.LocalFS(self.tempdir)
self.repo = obnamlib.Repository(self.fs, obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
obnamlib.DEFAULT_LRU_SIZE, None,
obnamlib.IDPATH_DEPTH,
obnamlib.IDPATH_BITS,
obnamlib.IDPATH_SKIP,
time.time, 0, '')
self.repo.lock_root()
self.repo.add_client('client_name')
self.repo.commit_root()
self.repo.lock_client('client_name')
self.repo.start_generation()
def tearDown(self):
shutil.rmtree(self.tempdir)
def test_checksum_returns_checksum(self):
self.assertNotEqual(self.repo.checksum('data'), None)
def test_put_chunk_returns_id(self):
self.repo.lock_shared()
self.assertNotEqual(self.repo.put_chunk_only('data'), None)
def test_get_chunk_retrieves_what_put_chunk_puts(self):
self.repo.lock_shared()
chunkid = self.repo.put_chunk_only('data')
self.assertEqual(self.repo.get_chunk(chunkid), 'data')
def test_chunk_does_not_exist(self):
self.assertFalse(self.repo.chunk_exists(1234))
def test_chunk_exists_after_it_is_put(self):
self.repo.lock_shared()
chunkid = self.repo.put_chunk_only('chunk')
self.assert_(self.repo.chunk_exists(chunkid))
def test_removes_chunk(self):
self.repo.lock_shared()
chunkid = self.repo.put_chunk_only('chunk')
self.repo.remove_chunk(chunkid)
self.assertFalse(self.repo.chunk_exists(chunkid))
def test_silently_ignores_failure_when_removing_nonexistent_chunk(self):
self.repo.lock_shared()
self.assertEqual(self.repo.remove_chunk(0), None)
def test_find_chunks_finds_what_put_chunk_puts(self):
self.repo.lock_shared()
checksum = self.repo.checksum('data')
chunkid = self.repo.put_chunk_only('data')
self.repo.put_chunk_in_shared_trees(chunkid, checksum)
self.assertEqual(self.repo.find_chunks(checksum), [chunkid])
def test_find_chunks_finds_nothing_if_nothing_is_put(self):
self.assertEqual(self.repo.find_chunks('checksum'), [])
def test_handles_checksum_collision(self):
self.repo.lock_shared()
checksum = self.repo.checksum('data')
chunkid1 = self.repo.put_chunk_only('data')
chunkid2 = self.repo.put_chunk_only('data')
self.repo.put_chunk_in_shared_trees(chunkid1, checksum)
self.repo.put_chunk_in_shared_trees(chunkid2, checksum)
self.assertEqual(set(self.repo.find_chunks(checksum)),
set([chunkid1, chunkid2]))
def test_returns_no_chunks_initially(self):
self.assertEqual(self.repo.list_chunks(), [])
def test_returns_chunks_after_they_exist(self):
self.repo.lock_shared()
checksum = self.repo.checksum('data')
chunkids = []
for i in range(2):
chunkids.append(self.repo.put_chunk_only('data'))
self.assertEqual(sorted(self.repo.list_chunks()), sorted(chunkids))
class RepositoryGetSetChunksTests(unittest.TestCase):
def setUp(self):
self.tempdir = tempfile.mkdtemp()
self.fs = obnamlib.LocalFS(self.tempdir)
self.repo = obnamlib.Repository(self.fs, obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
obnamlib.DEFAULT_LRU_SIZE, None,
obnamlib.IDPATH_DEPTH,
obnamlib.IDPATH_BITS,
obnamlib.IDPATH_SKIP,
time.time, 0, '')
self.repo.lock_root()
self.repo.add_client('client_name')
self.repo.commit_root()
self.repo.lock_client('client_name')
self.gen = self.repo.start_generation()
self.repo.create('/foo', obnamlib.Metadata())
def tearDown(self):
shutil.rmtree(self.tempdir)
def test_file_has_no_chunks(self):
self.assertEqual(self.repo.get_file_chunks(self.gen, '/foo'), [])
def test_sets_chunks_for_file(self):
self.repo.set_file_chunks('/foo', [1, 2])
chunkids = self.repo.get_file_chunks(self.gen, '/foo')
self.assertEqual(sorted(chunkids), [1, 2])
def test_appends_chunks_to_empty_list(self):
self.repo.append_file_chunks('/foo', [1, 2])
chunkids = self.repo.get_file_chunks(self.gen, '/foo')
self.assertEqual(sorted(chunkids), [1, 2])
def test_appends_chunks_to_nonempty_list(self):
self.repo.append_file_chunks('/foo', [1, 2])
self.repo.append_file_chunks('/foo', [3, 4])
chunkids = self.repo.get_file_chunks(self.gen, '/foo')
self.assertEqual(sorted(chunkids), [1, 2, 3, 4])
class RepositoryGenspecTests(unittest.TestCase):
def setUp(self):
self.tempdir = tempfile.mkdtemp()
repodir = os.path.join(self.tempdir, 'repo')
os.mkdir(repodir)
fs = obnamlib.LocalFS(repodir)
self.repo = obnamlib.Repository(fs, obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
obnamlib.DEFAULT_LRU_SIZE, None,
obnamlib.IDPATH_DEPTH,
obnamlib.IDPATH_BITS,
obnamlib.IDPATH_SKIP,
time.time, 0, '')
self.repo.lock_root()
self.repo.add_client('client_name')
self.repo.commit_root()
self.repo.lock_client('client_name')
self.repo.lock_shared()
def tearDown(self):
shutil.rmtree(self.tempdir)
def backup(self):
gen = self.repo.start_generation()
self.repo.commit_client()
self.repo.commit_shared()
self.repo.lock_client('client_name')
self.repo.lock_shared()
return gen
def test_latest_raises_error_if_there_are_no_generations(self):
self.assertRaises(obnamlib.Error, self.repo.genspec, 'latest')
def test_latest_returns_only_generation(self):
gen = self.backup()
self.assertEqual(self.repo.genspec('latest'), gen)
def test_latest_returns_newest_generation(self):
self.backup()
gen = self.backup()
self.assertEqual(self.repo.genspec('latest'), gen)
def test_other_spec_returns_itself(self):
gen = self.backup()
self.assertEqual(self.repo.genspec(str(gen)), gen)
def test_noninteger_spec_raises_error(self):
gen = self.backup()
self.assertNotEqual(gen, 'foo')
self.assertRaises(obnamlib.Error, self.repo.genspec, 'foo')
def test_nonexistent_spec_raises_error(self):
self.backup()
self.assertRaises(obnamlib.Error, self.repo.genspec, 1234)
class RepositoryWalkTests(unittest.TestCase):
def setUp(self):
self.tempdir = tempfile.mkdtemp()
self.fs = obnamlib.LocalFS(self.tempdir)
self.repo = obnamlib.Repository(self.fs, obnamlib.DEFAULT_NODE_SIZE,
obnamlib.DEFAULT_UPLOAD_QUEUE_SIZE,
obnamlib.DEFAULT_LRU_SIZE, None,
obnamlib.IDPATH_DEPTH,
obnamlib.IDPATH_BITS,
obnamlib.IDPATH_SKIP,
time.time, 0, '')
self.repo.lock_root()
self.repo.add_client('client_name')
self.repo.commit_root()
self.dir_meta = obnamlib.Metadata()
self.dir_meta.st_mode = stat.S_IFDIR | 0777
self.file_meta = obnamlib.Metadata()
self.file_meta.st_mode = stat.S_IFREG | 0644
self.repo.lock_client('client_name')
self.repo.lock_shared()
self.gen = self.repo.start_generation()
self.repo.create('/', self.dir_meta)
self.repo.create('/foo', self.dir_meta)
self.repo.create('/foo/bar', self.file_meta)
self.repo.commit_client()
self.repo.open_client('client_name')
def tearDown(self):
shutil.rmtree(self.tempdir)
def test_walk_find_everything(self):
found = list(self.repo.walk(self.gen, '/'))
self.assertEqual(found,
[('/', self.dir_meta),
('/foo', self.dir_meta),
('/foo/bar', self.file_meta)])
def test_walk_find_depth_first(self):
found = list(self.repo.walk(self.gen, '/', depth_first=True))
self.assertEqual(found,
[('/foo/bar', self.file_meta),
('/foo', self.dir_meta),
('/', self.dir_meta)])
obnam-1.6.1/obnamlib/repo_tree.py 0000644 0001750 0001750 00000010444 12246357067 016637 0 ustar jenkins jenkins # Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import larch
import tracing
import obnamlib
class RepositoryTree(object):
'''A B-tree within an obnamlib.Repository.
For read-only operation, call init_forest before doing anything.
For read-write operation, call start_changes before doing anything,
and commit afterwards. In between, self.tree is the new tree to be
modified. Note that self.tree is NOT available after init_forest.
After init_forest or start_changes, self.forest is the opened forest.
Unlike self.tree, it will not go away after commit.
'''
def __init__(self, fs, dirname, key_bytes, node_size, upload_queue_size,
lru_size, repo):
self.fs = fs
self.dirname = dirname
self.key_bytes = key_bytes
self.node_size = node_size
self.upload_queue_size = upload_queue_size
self.lru_size = lru_size
self.repo = repo
self.forest = None
self.forest_allows_writes = False
self.tree = None
self.keep_just_one_tree = False
def init_forest(self, allow_writes=False):
if self.forest is None:
tracing.trace('initializing forest dirname=%s', self.dirname)
assert self.tree is None
if not self.fs.exists(self.dirname):
tracing.trace('%s does not exist', self.dirname)
return False
self.forest = larch.open_forest(key_size=self.key_bytes,
node_size=self.node_size,
dirname=self.dirname,
upload_max=self.upload_queue_size,
lru_size=self.lru_size,
vfs=self.fs,
allow_writes=allow_writes)
self.forest_allows_writes = allow_writes
return True
def start_changes(self, create_tree=True):
tracing.trace('start changes for %s', self.dirname)
if self.forest is None or not self.forest_allows_writes:
if not self.fs.exists(self.dirname):
need_init = True
else:
filenames = self.fs.listdir(self.dirname)
need_init = filenames == [] or filenames == ['lock']
if need_init:
if not self.fs.exists(self.dirname):
tracing.trace('create %s', self.dirname)
self.fs.mkdir(self.dirname)
self.repo.hooks.call('repository-toplevel-init', self.repo,
self.dirname)
self.forest = None
self.init_forest(allow_writes=True)
assert self.forest is not None
assert self.forest_allows_writes, \
'it is "%s"' % repr(self.forest_allows_writes)
if self.tree is None and create_tree:
if self.forest.trees:
self.tree = self.forest.new_tree(self.forest.trees[-1])
tracing.trace('use newest tree %s (of %d)', self.tree.root.id,
len(self.forest.trees))
else:
self.tree = self.forest.new_tree()
tracing.trace('new tree root id %s', self.tree.root.id)
def commit(self):
tracing.trace('committing')
if self.forest:
if self.keep_just_one_tree:
while len(self.forest.trees) > 1:
tracing.trace('not keeping tree with root id %s',
self.forest.trees[0].root.id)
self.forest.remove_tree(self.forest.trees[0])
self.forest.commit()
self.tree = None
obnam-1.6.1/obnamlib/sizeparse.py 0000644 0001750 0001750 00000004054 12246357067 016660 0 ustar jenkins jenkins # Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import re
import obnamlib
class UnitError(obnamlib.Error):
def __str__(self):
return self.msg
class SizeSyntaxError(UnitError):
def __init__(self, string):
self.msg = '"%s" is not a valid size' % string
class UnitNameError(UnitError):
def __init__(self, string):
self.msg = '"%s" is not a valid unit' % string
class ByteSizeParser(object):
'''Parse sizes of data in bytes, kilobytes, kibibytes, etc.'''
pat = re.compile(r'^(?P\d+(\.\d+)?)\s*'
r'(?P[kmg]?i?b?)?$', re.I)
units = {
'b': 1,
'k': 1000,
'kb': 1000,
'kib': 1024,
'm': 1000**2,
'mb': 1000**2,
'mib': 1024**2,
'g': 1000**3,
'gb': 1000**3,
'gib': 1024**3,
}
def __init__(self):
self.set_default_unit('B')
def set_default_unit(self, unit):
if unit.lower() not in self.units:
raise UnitNameError(unit)
self.default_unit = unit
def parse(self, string):
m = self.pat.match(string)
if not m:
raise SizeSyntaxError(string)
size = float(m.group('size'))
unit = m.group('unit')
if not unit:
unit = self.default_unit
elif unit.lower() not in self.units:
raise UnitNameError(unit)
factor = self.units[unit.lower()]
return int(size * factor)
obnam-1.6.1/obnamlib/sizeparse_tests.py 0000644 0001750 0001750 00000006510 12246357067 020101 0 ustar jenkins jenkins # Copyright 2010 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import unittest
import obnamlib
class ByteSizeParserTests(unittest.TestCase):
def setUp(self):
self.p = obnamlib.ByteSizeParser()
def test_parses_zero(self):
self.assertEqual(self.p.parse('0'), 0)
def test_parses_unadorned_size_as_bytes(self):
self.assertEqual(self.p.parse('123'), 123)
def test_returns_an_int(self):
self.assert_(isinstance(self.p.parse('123'), int))
def test_parses_unadorned_size_using_default_unit(self):
self.p.set_default_unit('KiB')
self.assertEqual(self.p.parse('123'), 123 * 1024)
def test_parses_size_with_byte_unit(self):
self.assertEqual(self.p.parse('123 B'), 123)
def test_parses_size_with_kilo_unit(self):
self.assertEqual(self.p.parse('123 k'), 123 * 1000)
def test_parses_size_with_kilobyte_unit(self):
self.assertEqual(self.p.parse('123 kB'), 123 * 1000)
def test_parses_size_with_kibibyte_unit(self):
self.assertEqual(self.p.parse('123 KiB'), 123 * 1024)
def test_parses_size_with_mega_unit(self):
self.assertEqual(self.p.parse('123 m'), 123 * 1000**2)
def test_parses_size_with_megabyte_unit(self):
self.assertEqual(self.p.parse('123 MB'), 123 * 1000**2)
def test_parses_size_with_mebibyte_unit(self):
self.assertEqual(self.p.parse('123 MiB'), 123 * 1024**2)
def test_parses_size_with_giga_unit(self):
self.assertEqual(self.p.parse('123 g'), 123 * 1000**3)
def test_parses_size_with_gigabyte_unit(self):
self.assertEqual(self.p.parse('123 GB'), 123 * 1000**3)
def test_parses_size_with_gibibyte_unit(self):
self.assertEqual(self.p.parse('123 GiB'), 123 * 1024**3)
def test_raises_error_for_empty_string(self):
self.assertRaises(obnamlib.SizeSyntaxError, self.p.parse, '')
def test_raises_error_for_missing_size(self):
self.assertRaises(obnamlib.SizeSyntaxError, self.p.parse, 'KiB')
def test_raises_error_for_bad_unit(self):
self.assertRaises(obnamlib.SizeSyntaxError, self.p.parse, '1 km')
def test_raises_error_for_bad_unit_thats_similar_to_real_one(self):
self.assertRaises(obnamlib.UnitNameError, self.p.parse, '1 ib')
def test_raises_error_for_bad_default_unit(self):
self.assertRaises(obnamlib.UnitNameError,
self.p.set_default_unit, 'km')
def test_size_syntax_error_includes_input_string(self):
text = 'asdf asdf'
e = obnamlib.SizeSyntaxError(text)
self.assert_(text in str(e), str(e))
def test_unit_name_error_includes_input_string(self):
text = 'asdf asdf'
e = obnamlib.UnitNameError(text)
self.assert_(text in str(e), str(e))
obnam-1.6.1/obnamlib/vfs.py 0000644 0001750 0001750 00000060026 12246357067 015452 0 ustar jenkins jenkins # Copyright (C) 2008, 2010 Lars Wirzenius
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
import errno
import logging
import os
import stat
import urlparse
import obnamlib
class VirtualFileSystem(object):
'''A virtual filesystem interface.
The backup program needs to access both local and remote files.
To make it easier to support all kinds of files both locally and
remotely, we use a custom virtual filesystem interface so that
all filesystem access is done the same way. This way, we can
easily support user data and backup repositories in any combination of
local and remote filesystems.
This class defines the interface for such virtual filesystems.
Sub-classes will actually implement the interface.
When a VFS is instantiated, it is bound to a base URL. When
accessing the virtual filesystem, all paths are then given
relative to the base URL. The Unix syntax for files is used
for the relative paths: directory components separated by
slashes, and an initial slash indicating the root of the
filesystem (in this case, the base URL).
'''
def __init__(self, baseurl):
self.baseurl = baseurl
self.bytes_read = 0
self.bytes_written = 0
logging.debug('VFS: __init__: baseurl=%s' % self.baseurl)
def log_stats(self):
logging.debug(
'VFS: baseurl=%s read=%d written=%d' %
(self.baseurl, self.bytes_read, self.bytes_written))
def connect(self):
'''Connect to filesystem.'''
def close(self):
'''Close connection to filesystem.'''
self.log_stats()
def reinit(self, new_baseurl, create=False):
'''Go back to the beginning.
This behaves like instantiating a new instance, but possibly
faster for things like SftpFS. If there is a network
connection already open, it will be reused.
'''
def abspath(self, pathname):
'''Return absolute version of pathname.'''
return os.path.abspath(os.path.join(self.getcwd(), pathname))
def getcwd(self):
'''Return current working directory as absolute pathname.'''
def chdir(self, pathname):
'''Change current working directory to pathname.'''
def listdir(self, pathname):
'''Return list of basenames of entities at pathname.'''
def listdir2(self, pathname):
'''Return list of basenames and stats of entities at pathname.
The stat entity may be an exception object instead, to indicate
an error.
'''
def lock(self, lockname, data):
'''Create a lock file with the given name.'''
def unlock(self, lockname):
'''Remove a lock file.'''
def exists(self, pathname):
'''Does the file or directory exist?'''
def mknod(self, pathname, mode):
'''Create a filesystem node.'''
def isdir(self, pathname):
'''Is it a directory?'''
def mkdir(self, pathname):
'''Create a directory.
Parent directories must already exist.
'''
def makedirs(self, pathname):
'''Create a directory, and missing parents.'''
def rmdir(self, pathname):
'''Remove an empty directory.'''
def rmtree(self, dirname):
'''Remove a directory tree, including its contents.'''
if self.isdir(dirname):
for pathname, st in self.scan_tree(dirname):
if stat.S_ISDIR(st.st_mode):
self.rmdir(pathname)
else:
self.remove(pathname)
def remove(self, pathname):
'''Remove a file.'''
def rename(self, old, new):
'''Rename a file.'''
def lstat(self, pathname):
'''Like os.lstat.'''
def get_username(self, uid):
'''Return name for user, or None if not known.'''
def get_groupname(self, gid):
'''Return name for group, or None if not known.'''
def llistxattr(self, pathname):
'''Return list of names of extended attributes for file.'''
return []
def lgetxattr(self, pathname, attrname):
'''Return value of an extended attribute.'''
def lsetxattr(self, pathname, attrname, attrvalue):
'''Set value of an extended attribute.'''
def lchown(self, pathname, uid, gid):
'''Like os.lchown.'''
def chmod_symlink(self, pathname, mode):
'''Like os.lchmod, for symlinks only.
This may fail if the pathname is not a symlink (but it may
not). If the target is a symlink, but the platform (e.g.,
Linux) does not allow setting the permissions of a symlink,
the method will silently do nothing.
'''
def chmod_not_symlink(self, pathname, mode):
'''Like os.chmod, for non-symlinks only.
This may fail if pathname is a symlink (but it may not). It
MUST NOT be called for a symlink; use chmod_symlink instead.
'''
def lutimes(self, pathname, atime_sec, atime_nsec, mtime_sec, mtime_nsec):
'''Like lutimes(2).
This isn't quite like lutimes, actually. Most importantly, it uses
nanosecond timestamps rather than microsecond. This is important.
'''
def link(self, existing_path, new_path):
'''Like os.link.'''
def readlink(self, symlink):
'''Like os.readlink.'''
def symlink(self, source, destination):
'''Like os.symlink.'''
def open(self, pathname, mode):
'''Open a file, like the builtin open() or file() function.
The return value is a file object like the ones returned
by the builtin open() function.
'''
def cat(self, pathname):
'''Return the contents of a file.'''
def write_file(self, pathname, contents):
'''Write a new file.
The file must not yet exist. The file is not necessarily written
atomically, meaning that if the writing fails (connection to
server drops, for example), the file might exist in a partial
form. The callers need to deal with this.
Any directories in pathname will be created if necessary.
'''
def overwrite_file(self, pathname, contents):
'''Like write_file, but overwrites existing file.'''
def scan_tree(self, dirname, ok=None, dirst=None, log=logging.error,
error_handler=None):
'''Scan a tree for files.
Return a generator that returns ``(pathname, stat_result)``
pairs for each file and directory in the tree, in
depth-first order.
If ``ok`` is not None, it must be a function that determines
if a particular file or directory should be returned.
It gets the pathname and stat result as arguments, and
should return True or False. If it returns False on a
directory, ``scan_tree`` will not recurse into the
directory.
``dirst`` is for internal optimization, and should not
be used by the caller. ``log`` is used by unit tests and
should not be used by the caller.
Errors from calling ``listdir`` or ``lstat`` are logged,
but do not stop the scanning. Such files or directories are
not returned, however. If `error_handler` is defined, it is
called once for every problem, giving the name and exception
as arguments.
'''
error_handler = error_handler or (lambda name, e: None)
try:
pairs = self.listdir2(dirname)
except OSError, e:
log('listdir failed: %s: %s' % (e.filename, e.strerror))
error_handler(dirname, e)
pairs = []
queue = []
for name, st in pairs:
pathname = os.path.join(dirname, name)
if isinstance(st, BaseException):
error_handler(pathname, st)
elif ok is None or ok(pathname, st):
if stat.S_ISDIR(st.st_mode):
for t in self.scan_tree(pathname, ok=ok, dirst=st):
yield t
else:
queue.append((pathname, st))
for pathname, st in queue:
yield pathname, st
if dirst is None:
try:
dirst = self.lstat(dirname)
except OSError, e:
log('lstat for dir failed: %s: %s' % (e.filename, e.strerror))
return
yield dirname, dirst
class VfsFactory:
'''Create new instances of VirtualFileSystem.'''
def __init__(self):
self.implementations = {}
def register(self, scheme, implementation, **kwargs):
if scheme in self.implementations:
raise obnamlib.Error('URL scheme %s already registered' % scheme)
self.implementations[scheme] = (implementation, kwargs)
def new(self, url, create=False):
'''Create a new VFS appropriate for a given URL.'''
scheme, netloc, path, params, query, fragment = urlparse.urlparse(url)
if scheme in self.implementations:
klass, kwargs = self.implementations[scheme]
return klass(url, create=create, **kwargs)
raise obnamlib.Error('Unknown VFS type %s' % url)
class VfsTests(object): # pragma: no cover
'''Re-useable tests for VirtualFileSystem implementations.
The base class can't be usefully instantiated itself.
Instead you are supposed to sub-class it and implement the API in
a suitable way for yourself.
This class implements a number of tests that the API implementation
must pass. The implementation's own test class should inherit from
this class, and unittest.TestCase.
The test sub-class should define a setUp method that sets the following:
* self.fs to an instance of the API implementation sub-class
* self.basepath to the path to the base of the filesystem
basepath must be operable as a pathname using os.path tools. If
the VFS implemenation operates remotely and wants to operate on a
URL like 'http://domain/path' as the baseurl, then basepath must be
just the path portion of the URL.
The directory indicated by basepath must exist, but must be empty
at start.
'''
non_ascii_name = u'm\u00e4kel\u00e4'.encode('utf-8')
def test_abspath_returns_input_for_absolute_path(self):
self.assertEqual(self.fs.abspath('/foo/bar'), '/foo/bar')
def test_abspath_returns_absolute_path_for_relative_input(self):
self.assertEqual(self.fs.abspath('foo'),
os.path.join(self.basepath, 'foo'))
def test_abspath_normalizes_path(self):
self.assertEqual(self.fs.abspath('foo/..'), self.basepath)
def test_abspath_returns_plain_string(self):
self.fs.mkdir(self.non_ascii_name)
self.fs.chdir(self.non_ascii_name)
self.assertEqual(type(self.fs.abspath('.')), str)
def test_reinit_works(self):
self.fs.chdir('/')
self.fs.reinit(self.fs.baseurl)
self.assertEqual(self.fs.getcwd(), self.basepath)
def test_reinit_to_nonexistent_filename_raises_OSError(self):
notexist = os.path.join(self.fs.baseurl, 'thisdoesnotexist')
self.assertRaises(OSError, self.fs.reinit, notexist)
def test_reinit_creates_target_if_requested(self):
self.fs.chdir('/')
new_baseurl = os.path.join(self.fs.baseurl, 'newdir')
new_basepath = os.path.join(self.basepath, 'newdir')
self.fs.reinit(new_baseurl, create=True)
self.assertEqual(self.fs.getcwd(), new_basepath)
def test_getcwd_returns_dirname(self):
self.assertEqual(self.fs.getcwd(), self.basepath)
def test_getcwd_returns_plain_string(self):
self.fs.mkdir(self.non_ascii_name)
self.fs.chdir(self.non_ascii_name)
self.assertEqual(type(self.fs.getcwd()), str)
def test_chdir_changes_only_fs_cwd_not_process_cwd(self):
process_cwd = os.getcwd()
self.fs.chdir('/')
self.assertEqual(self.fs.getcwd(), '/')
self.assertEqual(os.getcwd(), process_cwd)
def test_chdir_to_nonexistent_raises_exception(self):
self.assertRaises(OSError, self.fs.chdir, '/foobar')
def test_chdir_to_relative_works(self):
pathname = os.path.join(self.basepath, 'foo')
os.mkdir(pathname)
self.fs.chdir('foo')
self.assertEqual(self.fs.getcwd(), pathname)
def test_chdir_to_dotdot_works(self):
pathname = os.path.join(self.basepath, 'foo')
os.mkdir(pathname)
self.fs.chdir('foo')
self.fs.chdir('..')
self.assertEqual(self.fs.getcwd(), self.basepath)
def test_creates_lock_file(self):
self.fs.lock('lock', 'lock data')
self.assertTrue(self.fs.exists('lock'))
self.assertEqual(self.fs.cat('lock'), 'lock data')
def test_second_lock_fails(self):
self.fs.lock('lock', 'lock data')
self.assertRaises(Exception, self.fs.lock, 'lock', 'second lock')
self.assertEqual(self.fs.cat('lock'), 'lock data')
def test_unlock_removes_lock(self):
self.fs.lock('lock', 'lock data')
self.fs.unlock('lock')
self.assertFalse(self.fs.exists('lock'))
def test_exists_returns_false_for_nonexistent_file(self):
self.assertFalse(self.fs.exists('foo'))
def test_exists_returns_true_for_existing_file(self):
self.fs.write_file('foo', '')
self.assert_(self.fs.exists('foo'))
def test_isdir_returns_false_for_nonexistent_file(self):
self.assertFalse(self.fs.isdir('foo'))
def test_isdir_returns_false_for_nondir(self):
self.fs.write_file('foo', '')
self.assertFalse(self.fs.isdir('foo'))
def test_isdir_returns_true_for_existing_dir(self):
self.fs.mkdir('foo')
self.assert_(self.fs.isdir('foo'))
def test_listdir_returns_plain_strings_only(self):
self.fs.write_file(u'M\u00E4kel\u00E4'.encode('utf-8'), 'data')
names = self.fs.listdir('.')
types = [type(x) for x in names]
self.assertEqual(types, [str])
def test_listdir_raises_oserror_if_directory_does_not_exist(self):
self.assertRaises(OSError, self.fs.listdir, 'foo')
def test_listdir2_returns_name_stat_pairs(self):
funny = u'M\u00E4kel\u00E4'.encode('utf-8')
self.fs.write_file(funny, 'data')
pairs = self.fs.listdir2('.')
self.assertEqual(len(pairs), 1)
self.assertEqual(len(pairs[0]), 2)
name, st = pairs[0]
self.assertEqual(type(name), str)
self.assertEqual(name, funny)
self.assert_(hasattr(st, 'st_mode'))
self.assertFalse(hasattr(st, 'st_mtime'))
self.assert_(hasattr(st, 'st_mtime_sec'))
self.assert_(hasattr(st, 'st_mtime_nsec'))
def test_listdir2_returns_plain_strings_only(self):
self.fs.write_file(u'M\u00E4kel\u00E4'.encode('utf-8'), 'data')
names = [name for name, st in self.fs.listdir2('.')]
types = [type(x) for x in names]
self.assertEqual(types, [str])
def test_listdir2_raises_oserror_if_directory_does_not_exist(self):
self.assertRaises(OSError, self.fs.listdir2, 'foo')
def test_mknod_creates_fifo(self):
self.fs.mknod('foo', 0600 | stat.S_IFIFO)
self.assertEqual(self.fs.lstat('foo').st_mode, 0600 | stat.S_IFIFO)
def test_mkdir_raises_oserror_if_directory_exists(self):
self.assertRaises(OSError, self.fs.mkdir, '.')
def test_mkdir_raises_oserror_if_parent_does_not_exist(self):
self.assertRaises(OSError, self.fs.mkdir, 'foo/bar')
def test_makedirs_raises_oserror_when_directory_exists(self):
self.fs.mkdir('foo')
self.assertRaises(OSError, self.fs.makedirs, 'foo')
def test_makedirs_creates_directory_when_parent_exists(self):
self.fs.makedirs('foo')
self.assert_(self.fs.isdir('foo'))
def test_makedirs_creates_directory_when_parent_does_not_exist(self):
self.fs.makedirs('foo/bar')
self.assert_(self.fs.isdir('foo/bar'))
def test_rmdir_removes_directory(self):
self.fs.mkdir('foo')
self.fs.rmdir('foo')
self.assertFalse(self.fs.exists('foo'))
def test_rmdir_raises_oserror_if_directory_does_not_exist(self):
self.assertRaises(OSError, self.fs.rmdir, 'foo')
def test_rmdir_raises_oserror_if_directory_is_not_empty(self):
self.fs.mkdir('foo')
self.fs.write_file('foo/bar', '')
self.assertRaises(OSError, self.fs.rmdir, 'foo')
def test_rmtree_removes_directory_tree(self):
self.fs.mkdir('foo')
self.fs.write_file('foo/bar', '')
self.fs.rmtree('foo')
self.assertFalse(self.fs.exists('foo'))
def test_rmtree_is_silent_when_target_does_not_exist(self):
self.assertEqual(self.fs.rmtree('foo'), None)
def test_remove_removes_file(self):
self.fs.write_file('foo', '')
self.fs.remove('foo')
self.assertFalse(self.fs.exists('foo'))
def test_remove_raises_oserror_if_file_does_not_exist(self):
self.assertRaises(OSError, self.fs.remove, 'foo')
def test_rename_renames_file(self):
self.fs.write_file('foo', 'xxx')
self.fs.rename('foo', 'bar')
self.assertFalse(self.fs.exists('foo'))
self.assertEqual(self.fs.cat('bar'), 'xxx')
def test_rename_raises_oserror_if_file_does_not_exist(self):
self.assertRaises(OSError, self.fs.rename, 'foo', 'bar')
def test_rename_works_if_target_exists(self):
self.fs.write_file('foo', 'foo')
self.fs.write_file('bar', 'bar')
self.fs.rename('foo', 'bar')
self.assertEqual(self.fs.cat('bar'), 'foo')
def test_lstat_returns_result_with_all_required_fields(self):
st = self.fs.lstat('.')
for field in obnamlib.metadata_fields:
if field.startswith('st_'):
self.assert_(hasattr(st, field), 'stat must return %s' % field)
def test_lstat_returns_right_filetype_for_directory(self):
st = self.fs.lstat('.')
self.assert_(stat.S_ISDIR(st.st_mode))
def test_lstat_raises_oserror_for_nonexistent_entry(self):
self.assertRaises(OSError, self.fs.lstat, 'notexists')
def test_chmod_not_symlink_sets_permissions_correctly(self):
self.fs.mkdir('foo')
self.fs.chmod_not_symlink('foo', 0777)
self.assertEqual(self.fs.lstat('foo').st_mode & 0777, 0777)
def test_chmod_not_symlink_raises_oserror_for_nonexistent_entry(self):
self.assertRaises(OSError, self.fs.chmod_not_symlink, 'notexists', 0)
def test_chmod_symlink_raises_oserror_for_nonexistent_entry(self):
self.assertRaises(OSError, self.fs.chmod_symlink, 'notexists', 0)
def test_lutimes_sets_times_correctly(self):
self.fs.mkdir('foo')
self.fs.lutimes('foo', 1, 2*1000, 3, 4*1000)
self.assertEqual(self.fs.lstat('foo').st_atime_sec, 1)
# not all filesystems support sub-second timestamps; those that
# do not, return 0, so we have to accept either that or the correct
# value, but no other vlaues
self.assert_(self.fs.lstat('foo').st_atime_nsec in [0, 2*1000])
self.assertEqual(self.fs.lstat('foo').st_mtime_sec, 3)
self.assert_(self.fs.lstat('foo').st_mtime_nsec in [0, 4*1000])
def test_lutimes_raises_oserror_for_nonexistent_entry(self):
self.assertRaises(OSError, self.fs.lutimes, 'notexists', 1, 2, 3, 4)
def test_link_creates_hard_link(self):
self.fs.write_file('foo', 'foo')
self.fs.link('foo', 'bar')
st1 = self.fs.lstat('foo')
st2 = self.fs.lstat('bar')
self.assertEqual(st1, st2)
def test_symlink_creates_soft_link(self):
self.fs.symlink('foo', 'bar')
target = self.fs.readlink('bar')
self.assertEqual(target, 'foo')
def test_readlink_returns_plain_string(self):
self.fs.symlink(self.non_ascii_name, self.non_ascii_name)
target = self.fs.readlink(self.non_ascii_name)
self.assertEqual(target, self.non_ascii_name)
self.assertEqual(type(target), str)
def test_symlink_raises_oserror_if_name_exists(self):
self.fs.write_file('foo', 'foo')
self.assertRaises(OSError, self.fs.symlink, 'bar', 'foo')
def test_opens_existing_file_ok_for_reading(self):
self.fs.write_file('foo', '')
self.assert_(self.fs.open('foo', 'r'))
def test_opens_existing_file_ok_for_writing(self):
self.fs.write_file('foo', '')
self.assert_(self.fs.open('foo', 'w'))
def test_open_fails_for_nonexistent_file(self):
self.assertRaises(IOError, self.fs.open, 'foo', 'r')
def test_cat_reads_existing_file_ok(self):
self.fs.write_file('foo', 'bar')
self.assertEqual(self.fs.cat('foo'), 'bar')
def test_cat_fails_for_nonexistent_file(self):
self.assertRaises(IOError, self.fs.cat, 'foo')
def test_has_read_nothing_initially(self):
self.assertEqual(self.fs.bytes_read, 0)
def test_cat_updates_bytes_read(self):
self.fs.write_file('foo', 'bar')
self.fs.cat('foo')
self.assertEqual(self.fs.bytes_read, 3)
def test_write_fails_if_file_exists_already(self):
self.fs.write_file('foo', 'bar')
self.assertRaises(OSError, self.fs.write_file, 'foo', 'foobar')
def test_write_creates_missing_directories(self):
self.fs.write_file('foo/bar', 'yo')
self.assertEqual(self.fs.cat('foo/bar'), 'yo')
def test_write_leaves_existing_file_intact(self):
self.fs.write_file('foo', 'bar')
try:
self.fs.write_file('foo', 'foobar')
except OSError:
pass
self.assertEqual(self.fs.cat('foo'), 'bar')
def test_overwrite_creates_new_file_ok(self):
self.fs.overwrite_file('foo', 'bar')
self.assertEqual(self.fs.cat('foo'), 'bar')
def test_overwrite_replaces_existing_file(self):
self.fs.write_file('foo', 'bar')
self.fs.overwrite_file('foo', 'foobar')
self.assertEqual(self.fs.cat('foo'), 'foobar')
def test_has_written_nothing_initially(self):
self.assertEqual(self.fs.bytes_written, 0)
def test_write_updates_written(self):
self.fs.write_file('foo', 'foo')
self.assertEqual(self.fs.bytes_written, 3)
def test_overwrite_updates_written(self):
self.fs.overwrite_file('foo', 'foo')
self.assertEqual(self.fs.bytes_written, 3)
def set_up_scan_tree(self):
self.dirs = ['foo', 'foo/bar', 'foobar']
self.dirs = [os.path.join(self.basepath, x) for x in self.dirs]
for dirname in self.dirs:
self.fs.mkdir(dirname)
self.dirs.insert(0, self.basepath)
self.fs.symlink('foo', 'symfoo')
self.pathnames = self.dirs + [os.path.join(self.basepath, 'symfoo')]
def test_scan_tree_returns_nothing_if_listdir_fails(self):
self.set_up_scan_tree()
def raiser(dirname):
raise OSError(123, 'oops', dirname)
def logerror(msg):
pass
self.fs.listdir2 = raiser
result = list(self.fs.scan_tree(self.basepath, log=logerror))
self.assertEqual(len(result), 1)
pathname, st = result[0]
self.assertEqual(pathname, self.basepath)
def test_scan_tree_returns_the_right_stuff(self):
self.set_up_scan_tree()
result = list(self.fs.scan_tree(self.basepath))
pathnames = [pathname for pathname, st in result]
self.assertEqual(sorted(pathnames), sorted(self.pathnames))
def test_scan_tree_filters_away_unwanted(self):
def ok(pathname, st):
return stat.S_ISDIR(st.st_mode)
self.set_up_scan_tree()
result = list(self.fs.scan_tree(self.basepath, ok=ok))
pathnames = [pathname for pathname, st in result]
self.assertEqual(sorted(pathnames), sorted(self.dirs))
obnam-1.6.1/obnamlib/vfs_local.py 0000644 0001750 0001750 00000032344 12246357067 016626 0 ustar jenkins jenkins # Copyright (C) 2008 Lars Wirzenius
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
import errno
import fcntl
import grp
import logging
import math
import os
import pwd
import tempfile
import time
import tracing
import obnamlib
# O_NOATIME is Linux specific:
EXTRA_OPEN_FLAGS = getattr(os, "O_NOATIME", 0)
class LocalFSFile(file):
def read(self, amount=-1):
offset = self.tell()
data = file.read(self, amount)
if data:
fd = self.fileno()
obnamlib._obnam.fadvise_dontneed(fd, offset, len(data))
return data
def write(self, data):
offset = self.tell()
file.write(self, data)
fd = self.fileno()
obnamlib._obnam.fadvise_dontneed(fd, offset, len(data))
class LocalFS(obnamlib.VirtualFileSystem):
"""A VFS implementation for local filesystems."""
chunk_size = 1024 * 1024
def __init__(self, baseurl, create=False):
tracing.trace('baseurl=%s', baseurl)
tracing.trace('create=%s', create)
obnamlib.VirtualFileSystem.__init__(self, baseurl)
self.reinit(baseurl, create=create)
# For checking that we do not unlock something we didn't lock
# ourselves.
self.our_locks = set()
# For testing purposes, allow setting a limit on write operations
# after which an exception gets raised. If set to None, no crash.
self.crash_limit = None
self.crash_counter = 0
# Do we have lchmod?
self.got_lchmod = hasattr(os, 'lchmod')
def maybe_crash(self): # pragma: no cover
if self.crash_limit is not None:
self.crash_counter += 1
if self.crash_counter >= self.crash_limit:
raise Exception('Crashing as requested after %d writes' %
self.crash_counter)
def reinit(self, baseurl, create=False):
# We fake chdir so that it doesn't mess with the caller's
# perception of current working directory. This also benefits
# unit tests. To do this, we store the baseurl as the cwd.
tracing.trace('baseurl=%s', baseurl)
tracing.trace('create=%s', create)
self.cwd = os.path.abspath(baseurl)
if not self.isdir('.'):
if create:
tracing.trace('creating %s', baseurl)
try:
os.mkdir(baseurl)
except OSError, e: # pragma: no cover
# The directory might have been created concurrently
# by someone else!
if e.errno != errno.EEXIST:
raise
else:
err = errno.ENOENT
raise OSError(err, os.strerror(err), self.cwd)
def getcwd(self):
return self.cwd
def chdir(self, pathname):
tracing.trace('LocalFS(%s).chdir(%s)', self.baseurl, pathname)
newcwd = os.path.abspath(self.join(pathname))
if not os.path.isdir(newcwd):
raise OSError('%s is not a directory' % newcwd)
self.cwd = newcwd
def lock(self, lockname, data):
tracing.trace('attempting lockname=%s', lockname)
try:
self.write_file(lockname, data)
except OSError, e:
if e.errno == errno.EEXIST:
raise obnamlib.LockFail("Lock %s already exists" % lockname)
else:
raise # pragma: no cover
tracing.trace('got lockname=%s', lockname)
tracing.trace('time=%f' % time.time())
self.our_locks.add(lockname)
def unlock(self, lockname):
tracing.trace('lockname=%s', lockname)
assert lockname in self.our_locks
self.remove(lockname)
self.our_locks.remove(lockname)
tracing.trace('time=%f' % time.time())
def join(self, pathname):
return os.path.join(self.cwd, pathname)
def remove(self, pathname):
tracing.trace('remove %s', pathname)
os.remove(self.join(pathname))
self.maybe_crash()
def rename(self, old, new):
tracing.trace('rename %s %s', old, new)
os.rename(self.join(old), self.join(new))
self.maybe_crash()
def lstat(self, pathname):
(ret, dev, ino, mode, nlink, uid, gid, rdev, size, blksize, blocks,
atime_sec, atime_nsec, mtime_sec, mtime_nsec,
ctime_sec, ctime_nsec) = obnamlib._obnam.lstat(self.join(pathname))
if ret != 0:
raise OSError(ret, os.strerror(ret), pathname)
return obnamlib.Metadata(
st_dev=dev,
st_ino=ino,
st_mode=mode,
st_nlink=nlink,
st_uid=uid,
st_gid=gid,
st_rdev=rdev,
st_size=size,
st_blksize=blksize,
st_blocks=blocks,
st_atime_sec=atime_sec,
st_atime_nsec=atime_nsec,
st_mtime_sec=mtime_sec,
st_mtime_nsec=mtime_nsec,
st_ctime_sec=ctime_sec,
st_ctime_nsec=ctime_nsec
)
def get_username(self, uid):
return pwd.getpwuid(uid)[0]
def get_groupname(self, gid):
return grp.getgrgid(gid)[0]
def llistxattr(self, filename): # pragma: no cover
ret = obnamlib._obnam.llistxattr(self.join(filename))
if type(ret) is int:
raise OSError(ret, os.strerror(ret), filename)
return [s for s in ret.split('\0') if s]
def lgetxattr(self, filename, attrname): # pragma: no cover
ret = obnamlib._obnam.lgetxattr(self.join(filename), attrname)
if type(ret) is int:
raise OSError(ret, os.strerror(ret), filename)
return ret
def lsetxattr(self, filename, attrname, attrvalue): # pragma: no cover
ret = obnamlib._obnam.lsetxattr(self.join(filename),
attrname, attrvalue)
if ret != 0:
raise OSError(ret, os.strerror(ret), filename)
def lchown(self, pathname, uid, gid): # pragma: no cover
tracing.trace('lchown %s %d %d', pathname, uid, gid)
os.lchown(self.join(pathname), uid, gid)
# This method is excluded from test coverage because the platform
# either has lchmod or doesn't, and accordingly either branch of
# the if statement is taken, and the other branch shows up as not
# being tested by the unit tests.
def chmod_symlink(self, pathname, mode): # pragma: no cover
tracing.trace('chmod_symlink %s %o', pathname, mode)
if self.got_lchmod:
os.lchmod(self.join(pathname), mode)
else:
self.lstat(pathname)
def chmod_not_symlink(self, pathname, mode):
tracing.trace('chmod_not_symlink %s %o', pathname, mode)
os.chmod(self.join(pathname), mode)
def lutimes(self, pathname, atime_sec, atime_nsec, mtime_sec, mtime_nsec):
assert atime_sec is not None
assert atime_nsec is not None
assert mtime_sec is not None
assert mtime_nsec is not None
ret = obnamlib._obnam.utimensat(self.join(pathname),
atime_sec, atime_nsec,
mtime_sec, mtime_nsec)
if ret != 0:
raise OSError(ret, os.strerror(ret), pathname)
def link(self, existing, new):
tracing.trace('existing=%s', existing)
tracing.trace('new=%s', new)
os.link(self.join(existing), self.join(new))
self.maybe_crash()
def readlink(self, pathname):
return os.readlink(self.join(pathname))
def symlink(self, existing, new):
tracing.trace('existing=%s', existing)
tracing.trace('new=%s', new)
os.symlink(existing, self.join(new))
self.maybe_crash()
def open(self, pathname, mode):
tracing.trace('pathname=%s', pathname)
tracing.trace('mode=%s', mode)
f = LocalFSFile(self.join(pathname), mode)
tracing.trace('opened %s', pathname)
try:
flags = fcntl.fcntl(f.fileno(), fcntl.F_GETFL)
flags |= EXTRA_OPEN_FLAGS
fcntl.fcntl(f.fileno(), fcntl.F_SETFL, flags)
except IOError, e: # pragma: no cover
tracing.trace('fcntl F_SETFL failed: %s', repr(e))
return f # ignore any problems setting flags
tracing.trace('returning ok')
return f
def exists(self, pathname):
return os.path.exists(self.join(pathname))
def isdir(self, pathname):
return os.path.isdir(self.join(pathname))
def mknod(self, pathname, mode):
tracing.trace('pathmame=%s', pathname)
tracing.trace('mode=%o', mode)
os.mknod(self.join(pathname), mode)
def mkdir(self, pathname):
tracing.trace('mkdir %s', pathname)
os.mkdir(self.join(pathname))
self.maybe_crash()
def makedirs(self, pathname):
tracing.trace('makedirs %s', pathname)
os.makedirs(self.join(pathname))
self.maybe_crash()
def rmdir(self, pathname):
tracing.trace('rmdir %s', pathname)
os.rmdir(self.join(pathname))
self.maybe_crash()
def cat(self, pathname):
tracing.trace('pathname=%s' % pathname)
pathname = self.join(pathname)
f = self.open(pathname, 'rb')
chunks = []
while True:
chunk = f.read(self.chunk_size)
if not chunk:
break
chunks.append(chunk)
self.bytes_read += len(chunk)
f.close()
data = ''.join(chunks)
return data
def write_file(self, pathname, contents): # pragma: no cover
# This is tricky. We need to at least try to support NFS, and
# various filesystems that do not support hardlinks. On NFS,
# creating a file with O_EXCL is not guaranteed to be atomic,
# but adding a link with link(2) is. However, that doesn't work
# on VFAT, for example. So we try do both: first create a
# temporary file with a name guaranteed to not be a name we
# want to use, and then we rename it using link(2) and remove(2).
# If that fails, we try to create the target file with O_EXCL
# and rename the temporary file to that. This is still not 100%
# reliable: someone could be mounting VFAT across NFS, for
# example, but it's the best we can do. If this paragraph is
# wrong, tell the authors.
tracing.trace('write_file %s', pathname)
tempname = self._write_to_tempfile(pathname, contents)
path = self.join(pathname)
# Try link(2) for creating target file.
try:
os.link(tempname, path)
except OSError, e:
pass
else:
os.remove(tempname)
tracing.trace('link+remove worked')
return
# Nope, didn't work. Now try with O_EXCL instead.
try:
fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0666)
os.close(fd)
os.rename(tempname, path)
except OSError, e:
# Give up.
os.remove(tempname)
raise
tracing.trace('O_EXCL+rename worked')
self.maybe_crash()
def overwrite_file(self, pathname, contents):
tracing.trace('overwrite_file %s', pathname)
tempname = self._write_to_tempfile(pathname, contents)
path = self.join(pathname)
os.rename(tempname, path)
self.maybe_crash()
def _write_to_tempfile(self, pathname, contents):
path = self.join(pathname)
dirname = os.path.dirname(path)
if not os.path.exists(dirname):
tracing.trace('os.makedirs(%s)' % dirname)
os.makedirs(dirname)
fd, tempname = tempfile.mkstemp(dir=dirname)
os.close(fd)
f = self.open(tempname, 'wb')
pos = 0
while pos < len(contents):
chunk = contents[pos:pos+self.chunk_size]
f.write(chunk)
pos += len(chunk)
self.bytes_written += len(chunk)
f.close()
return tempname
def listdir(self, dirname):
return os.listdir(self.join(dirname))
def listdir2(self, dirname):
result = []
for name in self.listdir(dirname):
try:
st = self.lstat(os.path.join(dirname, name))
except OSError, e: # pragma: no cover
st = e
ino = -1
else:
ino = st.st_ino
result.append((ino, name, st))
# We sort things in inode order, for speed when doing namei lookups
# when backing up.
result.sort()
return [(name, st) for ino, name, st in result]
obnam-1.6.1/obnamlib/vfs_local_tests.py 0000644 0001750 0001750 00000005107 12246357067 020045 0 ustar jenkins jenkins # Copyright (C) 2008 Lars Wirzenius
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 2 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
import platform
import errno
import os
import shutil
import tempfile
import unittest
import obnamlib
from obnamlib import _obnam
class LocalFSTests(obnamlib.VfsTests, unittest.TestCase):
def setUp(self):
self.basepath = tempfile.mkdtemp()
self.fs = obnamlib.LocalFS(self.basepath)
def tearDown(self):
self.fs.close()
shutil.rmtree(self.basepath)
def test_joins_relative_path_ok(self):
self.assertEqual(self.fs.join('foo'),
os.path.join(self.basepath, 'foo'))
def test_join_treats_absolute_path_as_absolute(self):
self.assertEqual(self.fs.join('/foo'), '/foo')
def test_get_username_returns_root_for_zero(self):
self.assertEqual(self.fs.get_username(0), 'root')
def test_get_groupname_returns_root_for_zero(self):
root = 'wheel' if platform.system() == 'FreeBSD' else 'root'
self.assertEqual(self.fs.get_groupname(0), root)
class XAttrTests(unittest.TestCase):
'''Tests for extended attributes.'''
def setUp(self):
fd, self.filename = tempfile.mkstemp()
os.close(fd)
def test_empty_list(self):
'''A new file has no extended attributes.'''
self.assertEqual(_obnam.llistxattr(self.filename), "")
def test_lsetxattr(self):
'''lsetxattr() sets an attribute on a file.'''
_obnam.lsetxattr(self.filename, "user.key", "value")
_obnam.lsetxattr(self.filename, "user.hello", "world")
self.assertEqual(sorted(_obnam.llistxattr(self.filename).strip("\0").split("\0")),
["user.hello", "user.key"])
def test_lgetxattr(self):
'''lgetxattr() gets the value of an attribute set on the file.'''
_obnam.lsetxattr(self.filename, "user.hello", "world")
self.assertEqual(_obnam.lgetxattr(self.filename, "user.hello"), "world")
obnam-1.6.1/read-live-data-with-sftp 0000755 0001750 0001750 00000002436 12246357067 017150 0 ustar jenkins jenkins #!/usr/bin/python
# Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
import stat
import sys
import ttystatus
from obnamlib.plugins.sftp_plugin import SftpFS
ts = ttystatus.TerminalStatus(period=0.1)
ts['bytes'] = 0
ts.format(
'%ElapsedTime() %Counter(pathname) %ByteSize(bytes) '
'%ByteSpeed(bytes) %Pathname(pathname)')
url = sys.argv[1]
fs = SftpFS(url)
fs.connect()
for pathname, st in fs.scan_tree('.'):
ts['pathname'] = pathname
if stat.S_ISREG(st.st_mode):
f = fs.open(pathname, 'rb')
while True:
data = f.read(1024**2)
if not data:
break
ts['bytes'] += len(data)
f.close()
ts.finish()
obnam-1.6.1/run-benchmarks 0000755 0001750 0001750 00000001557 12246357067 015370 0 ustar jenkins jenkins #!/bin/sh
# Copyright 2012 Lars Wirzenius
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see .
set -ue
for x in confs/*.conf
do
if [ "$x" != confs/common.conf ]
then
./obnam-benchmark --no-default-config --config=confs/common.conf \
--config="$x" "$@"
fi
done
obnam-1.6.1/sed-in-place 0000755 0001750 0001750 00000000375 12246357067 014707 0 ustar jenkins jenkins #!/bin/sh
#
# Do a sed in place for a set of files. This is like GNU sed -i, but
# we can't assume GNU sed.
set -eu
sedcmd="$1"
shift
for filename in "$@"
do
temp="$(mktemp)"
sed "$sedcmd" "$filename" > "$temp"
mv "$temp" "$filename"
done obnam-1.6.1/setup.py 0000644 0001750 0001750 00000012754 12246357067 014236 0 ustar jenkins jenkins #!/usr/bin/python
# Copyright (C) 2008-2012 Lars Wirzenius
#
# This program is free software; you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation; either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License along
# with this program; if not, write to the Free Software Foundation, Inc.,
# 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
from distutils.core import setup, Extension
from distutils.cmd import Command
from distutils.command.build import build
from distutils.command.clean import clean
import glob
import os
import shutil
import subprocess
import sys
import tempfile
# We need to know whether we can run yarn. We do this by checking
# the python-markdown version: if it's new enough, we assume yarn
# is available, and if it isn't, yarn won't be available since it
# won't work with old versions (e.g., the one in Debian squeeze.)
try:
import markdown
except ImportError:
got_yarn = False
else:
if (hasattr(markdown, 'extensions') and
hasattr(markdown.extensions, 'Extension')):
got_yarn = True
else:
got_yarn = False
def runcmd(*args, **kwargs):
try:
subprocess.check_call(*args, **kwargs)
except subprocess.CalledProcessError, e:
sys.stderr.write('ERROR: %s\n' % str(e))
sys.exit(1)
class GenerateManpage(build):
def run(self):
build.run(self)
print 'building manpages'
for x in ['obnam', 'obnam-benchmark']:
with open('%s.1' % x, 'w') as f:
runcmd(['python', x, '--generate-manpage=%s.1.in' % x,
'--output=%s.1' % x], stdout=f)
class CleanMore(clean):
def run(self):
clean.run(self)
for x in ['obnam.1', 'obnam-benchmark.1', '.coverage',
'obnamlib/_obnam.so']:
if os.path.exists(x):
os.remove(x)
self.remove_pyc('obnamlib')
self.remove_pyc('test-plugins')
if os.path.isdir('build'):
shutil.rmtree('build')
def remove_pyc(self, rootdir):
for dirname, subdirs, basenames in os.walk(rootdir):
for x in [os.path.join(dirname, base)
for base in basenames
if base.endswith('.pyc')]:
os.remove(x)
class Check(Command):
user_options = [
('unit-only', 'u', 'run unit tests tests only?'),
('fast', 'f', 'run fast tests only?'),
('network', 'n', 'run network tests to localhost?'),
('network-only', 'N', 'only run network tests to localhost?'),
]
def initialize_options(self):
self.unit_only = False
self.fast = False
self.network = False
self.network_only = False
def finalize_options(self):
pass
def run(self):
local = not self.network_only
network = self.network or self.network_only
fast = self.fast
slow = not self.fast and not self.unit_only
if local and (self.unit_only or fast):
print "run unit tests"
runcmd(['python', '-m', 'CoverageTestRunner',
'--ignore-missing-from=without-tests'])
os.remove('.coverage')
if local and fast:
print "run black box tests"
runcmd(['cmdtest', 'tests'])
if got_yarn:
runcmd(
['yarn', '-s', 'yarns/obnam.sh'] +
glob.glob('yarns/*.yarn'))
num_clients = '2'
num_generations = '16'
if local and slow:
print "run locking tests"
test_repo = tempfile.mkdtemp()
runcmd(['./test-locking', num_clients,
num_generations, test_repo, test_repo])
shutil.rmtree(test_repo)
if local and slow:
print "run crash test"
runcmd(['./crash-test', '200'])
if network and fast:
print "run sftp tests"
runcmd(['./test-sftpfs'])
if network and fast:
print "re-run black box tests using localhost networking"
env = dict(os.environ)
env['OBNAM_TEST_SFTP_ROOT'] = 'yes'
env['OBNAM_TEST_SFTP_REPOSITORY'] = 'yes'
runcmd(['cmdtest', 'tests'], env=env)
if network and slow:
print "re-run locking tests using localhost networking"
test_repo = tempfile.mkdtemp()
repo_url = 'sftp://localhost/%s' % test_repo
runcmd(['./test-locking', num_clients,
num_generations, repo_url, test_repo])
shutil.rmtree(test_repo)
print "setup.py check done"
setup(name='obnam',
version='1.6.1',
description='Backup software',
author='Lars Wirzenius',
author_email='liw@liw.fi',
url='http://liw.fi/obnam/',
scripts=['obnam', 'obnam-benchmark', 'obnam-viewprof'],
packages=['obnamlib', 'obnamlib.plugins'],
ext_modules=[Extension('obnamlib._obnam', sources=['_obnammodule.c'])],
data_files=[('share/man/man1', glob.glob('*.1'))],
cmdclass={
'build': GenerateManpage,
'check': Check,
'clean': CleanMore,
},
)
obnam-1.6.1/test-data/ 0000755 0001750 0001750 00000000000 12246357067 014401 5 ustar jenkins jenkins obnam-1.6.1/test-data/repo-format-5-encrypted-gzipped.tar.gz 0000644 0001750 0001750 00000030511 12246357067 023560 0 ustar jenkins jenkins 1O y<ˤ"6"ʾe'[3HBٮ%EdJ(I PJEuox}w