s3cmd-1.1.0-beta3/0000755000175000001440000000000011703443760014351 5ustar mludvigusers00000000000000
s3cmd-1.1.0-beta3/NEWS0000644000175000001440000001674311703443340015055 0ustar mludvigusers00000000000000
s3cmd 1.1.0 - ???
===========
* MultiPart upload enabled for both [put] and [sync]. Default chunk size is 15MB.
* CloudFront invalidation via [sync --cf-invalidate] and [cfinvalinfo].
* Increased socket_timeout from 10 secs to 5 mins.
* Added "Static WebSite" support [ws-create / ws-delete / ws-info] (contributed by Jens Braeuer)
* Force MIME type with --mime-type=abc/xyz, also --guess-mime-type is now on by default, -M is no longer shorthand for --guess-mime-type
* Allow parameters in MIME types, for example: --mime-type="text/plain; charset=utf-8"
* MIME type can be guessed by python-magic which is a lot better than relying on the extension. Contributed by Karsten Sperling.
* Support for environment variables as config values. For instance in ~/.s3cmd put "access_key=$S3_ACCESS_KEY". Contributed by Ori Bar.
* Support for --configure checking access to a specific bucket instead of listing all buckets. Listing buckets requires the S3 ListAllMyBuckets permission which is typically not available to delegated IAM accounts. With this change, s3cmd --configure accepts an (optional) bucket uri as a parameter and if it's provided, the access check will just verify access to this bucket individually. Contributed by Mike Repass.
* Allow STDOUT as a destination even for downloading multiple files. They will be output one after another without any delimiters! Contributed by Rob Wills.

s3cmd 1.0.0 - 2011-01-18
===========
* [sync] now supports --no-check-md5
* Network connections now have 10s timeout
* [sync] now supports bucket-to-bucket synchronisation
* Added [accesslog] command.
* Added access logging for CloudFront distributions using [cfmodify --log]
* Added --acl-grant and --acl-revoke [Timothee Groleau]
* Allow s3:// URI as well as cf:// URI as a distribution name for most CloudFront related commands.
* Support for Reduced Redundancy Storage (--reduced-redundancy)
* Follow symlinks in [put] and [sync] with --follow-symlinks
* Support for CloudFront DefaultRootObject [Luke Andrew]

s3cmd 0.9.9.91 - 2009-10-08
==============
* Fixed invalid reference to a variable in failed upload handling.

s3cmd 0.9.9.90 - 2009-10-06
==============
* New command 'sign' for signing e.g. POST upload policies.
* Fixed handling of filenames that differ only in capitalisation (eg blah.txt vs Blah.TXT).
* Added --verbatim mode, preventing most filenames pre-processing. Good for fixing unreadable buckets.
* Added --recursive support for [cp] and [mv], including multiple-source arguments, --include/--exclude, --dry-run, etc.
* Added --exclude/--include and --dry-run for [del], [setacl].
* Neutralise characters that are invalid in XML to avoid ExpatErrors. http://boodebr.org/main/python/all-about-python-and-unicode
* New command [fixbucket] for fixing invalid object names in a given Bucket. For instance names with invalid characters in them (not sure how people manage to upload them but they do).

s3cmd 0.9.9 - 2009-02-17
===========
New commands:
* Commands for copying and moving objects, within or between buckets: [cp] and [mv] (Andrew Ryan)
* CloudFront support through [cfcreate], [cfdelete], [cfmodify] and [cfinfo] commands. (sponsored by Joseph Denne)
* New command [setacl] for setting ACL on existing objects, use together with --acl-public/--acl-private (sponsored by Joseph Denne)

Other major features:
* Improved source dirname handling for [put], [get] and [sync].
* Recursive and wildcard support for [put], [get] and [del].
* Support for non-recursive [ls].
* Enabled --dry-run for [put], [get] and [sync].
* Allowed removal of non-empty buckets with [rb --force].
* Implemented progress meter (--progress / --no-progress)
* Added --include / --rinclude / --(r)include-from options to override --exclude exclusions.
* Added --add-header option for [put], [sync], [cp] and [mv]. Good for setting e.g. Expires or Cache-control headers.
* Added --list-md5 option for [ls].
* Continue [get] partially downloaded files with --continue
* New option --skip-existing for [get] and [sync].

Minor features and bugfixes:
* Fixed GPG (--encrypt) compatibility with Python 2.6.
* Always send Content-Length header to satisfy some http proxies.
* Fixed installation on Windows and Mac OS X.
* Don't print nasty backtrace on KeyboardInterrupt.
* Should work fine on non-UTF8 systems, provided all the files are in current system encoding.
* System encoding can be overridden using --encoding.
* Improved resistance to communication errors (Connection reset by peer, etc.)

s3cmd 0.9.8.4 - 2008-11-07
=============
* Stabilisation / bugfix release:
* Restored access to upper-case named buckets.
* Improved handling of filenames with Unicode characters.
* Avoid ZeroDivisionError on ultrafast links (for instance on Amazon EC2)
* Re-issue failed requests (e.g. connection errors, internal server errors, etc).
* Sync skips over files that can't be opened instead of terminating the sync completely.
* Doesn't run out of open files quota on sync with lots of files.

s3cmd 0.9.8.3 - 2008-07-29
=============
* Bugfix release. Avoid running out-of-memory in MD5'ing large files.

s3cmd 0.9.8.2 - 2008-06-27
=============
* Bugfix release. Re-upload file if Amazon doesn't send ETag back.

s3cmd 0.9.8.1 - 2008-06-27
=============
* Bugfix release. Fixed 'mb' and 'rb' commands again.

s3cmd 0.9.8 - 2008-06-23
===========
* Added --exclude / --rexclude options for sync command.
* Doesn't require $HOME env variable to be set anymore.
* Better checking of bucket names to Amazon S3 rules.

s3cmd 0.9.7 - 2008-06-05
===========
* Implemented 'sync' from S3 back to local folder, including file attribute restoration.
* Failed uploads are retried on lower speed to improve error resilience.
* Compare MD5 of the uploaded file, compare with checksum reported by S3 and re-upload on mismatch.

s3cmd 0.9.6 - 2008-02-28
===========
* Support for setting / guessing MIME-type of uploaded file
* Correctly follow redirects when accessing buckets created in Europe.
* Introduced 'info' command both for buckets and objects
* Correctly display public URL on uploads
* Updated TODO list for everyone to see where we're heading
* Various small fixes. See ChangeLog for details.

s3cmd 0.9.5 - 2007-11-13
===========
* Support for buckets created in Europe
* Initial 'sync' support, for now local to s3 direction only
* Much better handling of multiple args to put, get and del
* Tries to use ElementTree from any available module
* Support for buckets with over 1000 objects.

s3cmd 0.9.4 - 2007-08-13
===========
* Support for transparent GPG encryption of uploaded files
* HTTP proxy support
* HTTPS protocol support
* Support for non-ASCII characters in uploaded filenames

s3cmd 0.9.3 - 2007-05-26
===========
* New command "du" for displaying size of your data in S3. (Basil Shubin)

s3cmd 0.9.2 - 2007-04-09
===========
* Lots of new documentation
* Allow "get" to stdout (use "-" in place of destination file to get the file contents on stdout)
* Better compatibility with Python 2.4
* Output public HTTP URL for objects stored with Public ACL
* Various bugfixes and improvements

s3cmd 0.9.1 - 2007-02-06
===========
* All commands now use S3-URIs
* Removed hard dependency on Python 2.5
* Experimental support for Python 2.4 (requires external ElementTree module)

s3cmd 0.9.0 - 2007-01-18
===========
* First public release brings support for all basic Amazon S3 operations: Creation and Removal of buckets, Upload (put), Download (get) and Removal (del) of files/objects.
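
The 0.9.7 and 0.9.8.3 entries above mention comparing the MD5 of an uploaded file with the checksum reported by S3, and avoiding out-of-memory conditions while hashing large files. A minimal sketch of that idea follows. It is not s3cmd's own implementation: the helper names `md5_hexdigest` and `upload_matches` are hypothetical, and it assumes the reported checksum is a plain MD5 hex digest wrapped in double quotes (which is not the case for multipart uploads).

```python
import hashlib
import io


def md5_hexdigest(stream, chunk_size=1024 * 1024):
    """Hash a file-like object chunk by chunk so memory use stays bounded."""
    digest = hashlib.md5()
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        digest.update(chunk)
    return digest.hexdigest()


def upload_matches(stream, etag):
    """Compare the local MD5 with a quoted checksum string (assumed format)."""
    return md5_hexdigest(stream) == etag.strip('"').lower()


# Hypothetical usage: an in-memory stream stands in for the uploaded file.
data = io.BytesIO(b"hello world")
print(upload_matches(data, '"5eb63bbbe01eeed093cb22bb8f5acdc3"'))
```

A caller would re-upload the file whenever `upload_matches` returns False.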
s3cmd-1.1.0-beta3/s3cmd.10000644000175000001440000003252711703443535015455 0ustar mludvigusers00000000000000
.TH s3cmd 1
.SH NAME
s3cmd \- tool for managing Amazon S3 storage space and Amazon CloudFront content delivery network
.SH SYNOPSIS
.B s3cmd
[\fIOPTIONS\fR] \fICOMMAND\fR [\fIPARAMETERS\fR]
.SH DESCRIPTION
.PP
.B s3cmd
is a command line client for copying files to/from Amazon S3 (Simple Storage Service) and performing other related tasks, for instance creating and removing buckets, listing objects, etc.
.SH COMMANDS
.PP
.B s3cmd
can do several \fIactions\fR specified by the following \fIcommands\fR.
.TP
s3cmd \fBmb\fR \fIs3://BUCKET\fR
Make bucket
.TP
s3cmd \fBrb\fR \fIs3://BUCKET\fR
Remove bucket
.TP
s3cmd \fBls\fR \fI[s3://BUCKET[/PREFIX]]\fR
List objects or buckets
.TP
s3cmd \fBla\fR \fI\fR
List all objects in all buckets
.TP
s3cmd \fBput\fR \fIFILE [FILE...] s3://BUCKET[/PREFIX]\fR
Put file into bucket
.TP
s3cmd \fBget\fR \fIs3://BUCKET/OBJECT LOCAL_FILE\fR
Get file from bucket
.TP
s3cmd \fBdel\fR \fIs3://BUCKET/OBJECT\fR
Delete file from bucket
.TP
s3cmd \fBsync\fR \fILOCAL_DIR s3://BUCKET[/PREFIX] or s3://BUCKET[/PREFIX] LOCAL_DIR\fR
Synchronize a directory tree to S3
.TP
s3cmd \fBdu\fR \fI[s3://BUCKET[/PREFIX]]\fR
Disk usage by buckets
.TP
s3cmd \fBinfo\fR \fIs3://BUCKET[/OBJECT]\fR
Get various information about Buckets or Files
.TP
s3cmd \fBcp\fR \fIs3://BUCKET1/OBJECT1 s3://BUCKET2[/OBJECT2]\fR
Copy object
.TP
s3cmd \fBmv\fR \fIs3://BUCKET1/OBJECT1 s3://BUCKET2[/OBJECT2]\fR
Move object
.TP
s3cmd \fBsetacl\fR \fIs3://BUCKET[/OBJECT]\fR
Modify Access control list for Bucket or Files
.TP
s3cmd \fBaccesslog\fR \fIs3://BUCKET\fR
Enable/disable bucket access logging
.TP
s3cmd \fBsign\fR \fISTRING-TO-SIGN\fR
Sign arbitrary string using the secret key
.TP
s3cmd \fBfixbucket\fR \fIs3://BUCKET[/PREFIX]\fR
Fix invalid file names in a bucket
.PP
Commands for static WebSites configuration
.TP
s3cmd \fBws-create\fR \fIs3://BUCKET\fR
Create Website from bucket
.TP
s3cmd \fBws-delete\fR \fIs3://BUCKET\fR
Delete Website
.TP
s3cmd \fBws-info\fR \fIs3://BUCKET\fR
Info about Website
.PP
Commands for CloudFront management
.TP
s3cmd \fBcflist\fR \fI\fR
List CloudFront distribution points
.TP
s3cmd \fBcfinfo\fR \fI[cf://DIST_ID]\fR
Display CloudFront distribution point parameters
.TP
s3cmd \fBcfcreate\fR \fIs3://BUCKET\fR
Create CloudFront distribution point
.TP
s3cmd \fBcfdelete\fR \fIcf://DIST_ID\fR
Delete CloudFront distribution point
.TP
s3cmd \fBcfmodify\fR \fIcf://DIST_ID\fR
Change CloudFront distribution point parameters
.TP
s3cmd \fBcfinvalinfo\fR \fIcf://DIST_ID[/INVAL_ID]\fR
Display CloudFront invalidation request(s) status
.SH OPTIONS
.PP
Some of the options specified below can have their default values set in the
.B s3cmd
config file (by default $HOME/.s3cfg). As it's a simple text file feel free to open it with your favorite text editor and make any changes you like.
.TP
\fB\-h\fR, \fB\-\-help\fR
show this help message and exit
.TP
\fB\-\-configure\fR
Invoke interactive (re)configuration tool. Optionally use as '\fB--configure\fR s3://some-bucket' to test access to a specific bucket instead of attempting to list them all.
.TP
\fB\-c\fR FILE, \fB\-\-config\fR=FILE
Config file name. Defaults to $HOME/.s3cfg
.TP
\fB\-\-dump\-config\fR
Dump current configuration after parsing config files and command line options and exit.
.TP
\fB\-n\fR, \fB\-\-dry\-run\fR
Only show what should be uploaded or downloaded but don't actually do it. May still perform S3 requests to get bucket listings and other information though (only for file transfer commands)
.TP
\fB\-e\fR, \fB\-\-encrypt\fR
Encrypt files before uploading to S3.
.TP
\fB\-\-no\-encrypt\fR
Don't encrypt files.
.TP
\fB\-f\fR, \fB\-\-force\fR
Force overwrite and other dangerous operations.
.TP
\fB\-\-continue\fR
Continue getting a partially downloaded file (only for [get] command).
.TP
\fB\-\-skip\-existing\fR
Skip over files that exist at the destination (only for [get] and [sync] commands).
.TP
\fB\-r\fR, \fB\-\-recursive\fR
Recursive upload, download or removal.
.TP
\fB\-\-check\-md5\fR
Check MD5 sums when comparing files for [sync]. (default)
.TP
\fB\-\-no\-check\-md5\fR
Do not check MD5 sums when comparing files for [sync]. Only size will be compared. May significantly speed up transfer but may also miss some changed files.
.TP
\fB\-P\fR, \fB\-\-acl\-public\fR
Store objects with ACL allowing read for anyone.
.TP
\fB\-\-acl\-private\fR
Store objects with default ACL allowing access for you only.
.TP
\fB\-\-acl\-grant\fR=PERMISSION:EMAIL or USER_CANONICAL_ID
Grant stated permission to a given Amazon user. Permission is one of: read, write, read_acp, write_acp, full_control, all
.TP
\fB\-\-acl\-revoke\fR=PERMISSION:USER_CANONICAL_ID
Revoke stated permission for a given Amazon user. Permission is one of: read, write, read_acp, write_acp, full_control, all
.TP
\fB\-\-delete\-removed\fR
Delete remote objects with no corresponding local file [sync]
.TP
\fB\-\-no\-delete\-removed\fR
Don't delete remote objects.
.TP
\fB\-p\fR, \fB\-\-preserve\fR
Preserve filesystem attributes (mode, ownership, timestamps). Default for [sync] command.
.TP
\fB\-\-no\-preserve\fR
Don't store FS attributes
.TP
\fB\-\-exclude\fR=GLOB
Filenames and paths matching GLOB will be excluded from sync
.TP
\fB\-\-exclude\-from\fR=FILE
Read --exclude GLOBs from FILE
.TP
\fB\-\-rexclude\fR=REGEXP
Filenames and paths matching REGEXP (regular expression) will be excluded from sync
.TP
\fB\-\-rexclude\-from\fR=FILE
Read --rexclude REGEXPs from FILE
.TP
\fB\-\-include\fR=GLOB
Filenames and paths matching GLOB will be included even if previously excluded by one of \fB--(r)exclude(-from)\fR patterns
.TP
\fB\-\-include\-from\fR=FILE
Read --include GLOBs from FILE
.TP
\fB\-\-rinclude\fR=REGEXP
Same as --include but uses REGEXP (regular expression) instead of GLOB
.TP
\fB\-\-rinclude\-from\fR=FILE
Read --rinclude REGEXPs from FILE
.TP
\fB\-\-bucket\-location\fR=BUCKET_LOCATION
Datacentre to create bucket in. As of now the datacentres are: US (default), EU, us-west-1, and ap-southeast-1
.TP
\fB\-\-reduced\-redundancy\fR, \fB\-\-rr\fR
Store object with 'Reduced redundancy'. Lower per-GB price. [put, cp, mv]
.TP
\fB\-\-access\-logging\-target\-prefix\fR=LOG_TARGET_PREFIX
Target prefix for access logs (S3 URI) (for [cfmodify] and [accesslog] commands)
.TP
\fB\-\-no\-access\-logging\fR
Disable access logging (for [cfmodify] and [accesslog] commands)
.TP
\fB\-\-default\-mime\-type\fR
Default MIME-type for stored objects. Application default is binary/octet-stream.
.TP
\fB\-\-guess\-mime\-type\fR
Guess MIME-type of files by their extension or mime magic. Fall back to default MIME-Type as specified by \fB--default-mime-type\fR option
.TP
\fB\-\-no\-guess\-mime\-type\fR
Don't guess MIME-type and use the default type instead.
.TP
\fB\-m\fR MIME/TYPE, \fB\-\-mime\-type\fR=MIME/TYPE
Force MIME-type. Override both \fB--default-mime-type\fR and \fB--guess-mime-type\fR.
.TP
\fB\-\-add\-header\fR=NAME:VALUE
Add a given HTTP header to the upload request. Can be used multiple times.
For instance set 'Expires' or 'Cache-Control' headers (or both) using this option if you like.
.TP
\fB\-\-encoding\fR=ENCODING
Override autodetected terminal and filesystem encoding (character set). Autodetected: UTF-8
.TP
\fB\-\-verbatim\fR
Use the S3 name as given on the command line. No pre-processing, encoding, etc. Use with caution!
.TP
\fB\-\-disable\-multipart\fR
Disable multipart upload on files bigger than \fB--multipart-chunk-size-mb\fR
.TP
\fB\-\-multipart\-chunk\-size\-mb\fR=SIZE
Size of each chunk of a multipart upload. Files bigger than SIZE are automatically uploaded as multithreaded-multipart, smaller files are uploaded using the traditional method. SIZE is in Mega-Bytes, default chunk size is 15MB, minimum allowed chunk size is 5MB, maximum is 5GB.
.TP
\fB\-\-list\-md5\fR
Include MD5 sums in bucket listings (only for 'ls' command).
.TP
\fB\-H\fR, \fB\-\-human\-readable\-sizes\fR
Print sizes in human readable form (eg 1kB instead of 1234).
.TP
\fB\-\-ws\-index\fR=WEBSITE_INDEX
Name of index-document (only for [ws-create] command)
.TP
\fB\-\-ws\-error\fR=WEBSITE_ERROR
Name of error-document (only for [ws-create] command)
.TP
\fB\-\-progress\fR
Display progress meter (default on TTY).
.TP
\fB\-\-no\-progress\fR
Don't display progress meter (default on non-TTY).
.TP
\fB\-\-enable\fR
Enable given CloudFront distribution (only for [cfmodify] command)
.TP
\fB\-\-disable\fR
Disable given CloudFront distribution (only for [cfmodify] command)
.TP
\fB\-\-cf\-invalidate\fR
Invalidate the uploaded files in CloudFront. Also see [cfinvalinfo] command.
.TP
\fB\-\-cf\-add\-cname\fR=CNAME
Add given CNAME to a CloudFront distribution (only for [cfcreate] and [cfmodify] commands)
.TP
\fB\-\-cf\-remove\-cname\fR=CNAME
Remove given CNAME from a CloudFront distribution (only for [cfmodify] command)
.TP
\fB\-\-cf\-comment\fR=COMMENT
Set COMMENT for a given CloudFront distribution (only for [cfcreate] and [cfmodify] commands)
.TP
\fB\-\-cf\-default\-root\-object\fR=DEFAULT_ROOT_OBJECT
Set the default root object to return when no object is specified in the URL. Use a relative path, i.e. default/index.html instead of /default/index.html or s3://bucket/default/index.html (only for [cfcreate] and [cfmodify] commands)
.TP
\fB\-v\fR, \fB\-\-verbose\fR
Enable verbose output.
.TP
\fB\-d\fR, \fB\-\-debug\fR
Enable debug output.
.TP
\fB\-\-version\fR
Show s3cmd version (1.1.0-beta3) and exit.
.TP
\fB\-F\fR, \fB\-\-follow\-symlinks\fR
Follow symbolic links as if they are regular files
.SH EXAMPLES
One of the most powerful commands of \fIs3cmd\fR is \fBs3cmd sync\fR used for synchronising complete directory trees to or from remote S3 storage. To some extent \fBs3cmd put\fR and \fBs3cmd get\fR share a similar behaviour with \fBsync\fR.
.PP
Basic usage common in backup scenarios is as simple as:
.nf
   s3cmd sync /local/path/ s3://test-bucket/backup/
.fi
.PP
This command will find all files under /local/path directory and copy them to corresponding paths under s3://test-bucket/backup on the remote side. For example:
.nf
   /local/path/\fBfile1.ext\fR \-> s3://bucket/backup/\fBfile1.ext\fR
   /local/path/\fBdir123/file2.bin\fR \-> s3://bucket/backup/\fBdir123/file2.bin\fR
.fi
.PP
However if the local path doesn't end with a slash the last directory's name is used on the remote side as well.
Compare these with the previous example:
.nf
   s3cmd sync /local/path s3://test-bucket/backup/
.fi
will sync:
.nf
   /local/\fBpath/file1.ext\fR \-> s3://bucket/backup/\fBpath/file1.ext\fR
   /local/\fBpath/dir123/file2.bin\fR \-> s3://bucket/backup/\fBpath/dir123/file2.bin\fR
.fi
.PP
To retrieve the files back from S3 use inverted syntax:
.nf
   s3cmd sync s3://test-bucket/backup/ /tmp/restore/
.fi
that will download files:
.nf
   s3://bucket/backup/\fBfile1.ext\fR \-> /tmp/restore/\fBfile1.ext\fR
   s3://bucket/backup/\fBdir123/file2.bin\fR \-> /tmp/restore/\fBdir123/file2.bin\fR
.fi
.PP
Without the trailing slash on source the behaviour is similar to what has been demonstrated with upload:
.nf
   s3cmd sync s3://test-bucket/backup /tmp/restore/
.fi
will download the files as:
.nf
   s3://bucket/\fBbackup/file1.ext\fR \-> /tmp/restore/\fBbackup/file1.ext\fR
   s3://bucket/\fBbackup/dir123/file2.bin\fR \-> /tmp/restore/\fBbackup/dir123/file2.bin\fR
.fi
.PP
All source file names, the bold ones above, are matched against \fBexclude\fR rules and those that match are then re\-checked against \fBinclude\fR rules to see whether they should be excluded or kept in the source list.
.PP
For the purpose of \fB\-\-exclude\fR and \fB\-\-include\fR matching only the bold file names above are used. For instance only \fBpath/file1.ext\fR is tested against the patterns, not \fI/local/\fBpath/file1.ext\fR
.PP
Both \fB\-\-exclude\fR and \fB\-\-include\fR work with shell-style wildcards (a.k.a. GLOB). For a greater flexibility s3cmd provides Regular-expression versions of the two exclude options named \fB\-\-rexclude\fR and \fB\-\-rinclude\fR. The options with ...\fB\-from\fR suffix (eg \-\-rinclude\-from) expect a filename as an argument. Each line of such a file is treated as one pattern.
.PP
There is only one set of patterns built from all \fB\-\-(r)exclude(\-from)\fR options and similarly for include variant.
Any file excluded with eg \-\-exclude can be put back with a pattern found in \-\-rinclude\-from list.
.PP
Run s3cmd with \fB\-\-dry\-run\fR to verify that your rules work as expected. Use together with \fB\-\-debug\fR to get detailed information about matching file names against exclude and include rules.
.PP
For example to exclude all files with ".jpg" extension except those beginning with a number use:
.PP
\-\-exclude '*.jpg' \-\-rinclude '[0-9].*\.jpg'
.SH SEE ALSO
For the most up to date list of options run
.B s3cmd \-\-help
.br
For more info about usage, examples and other related info visit project homepage at
.br
.B http://s3tools.org
.SH DONATIONS
Please consider a donation if you have found s3cmd useful:
.br
.B http://s3tools.org/donate
.SH AUTHOR
Written by Michal Ludvig and 15+ contributors
.SH CONTACT, SUPPORT
Preferred way to get support is our mailing list:
.I s3tools\-general@lists.sourceforge.net
.SH REPORTING BUGS
Report bugs to
.I s3tools\-bugs@lists.sourceforge.net
.SH COPYRIGHT
Copyright \(co 2007,2008,2009,2010,2011,2012 Michal Ludvig
.br
This is free software. You may redistribute copies of it under the terms of the GNU General Public License version 2. There is NO WARRANTY, to the extent permitted by law.
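
The exclude/include rules described under EXAMPLES can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code s3cmd itself runs: `filter_names` is a hypothetical helper, GLOBs are translated with the standard `fnmatch.translate`, and regular expressions are anchored at the start of the name with `re.match` so that the pattern from the man page behaves as "beginning with a number".

```python
import fnmatch
import re


def filter_names(names, exclude_globs=(), include_regexps=()):
    """Return (kept, excluded) lists of relative file names.

    A name is dropped when it matches any exclude pattern, unless it
    also matches an include pattern, which puts it back.
    """
    exclude_res = [re.compile(fnmatch.translate(g)) for g in exclude_globs]
    include_res = [re.compile(r) for r in include_regexps]
    kept, excluded = [], []
    for name in names:
        is_excluded = any(r.match(name) for r in exclude_res)
        # Names that matched an exclude rule are re-checked against includes.
        if is_excluded and any(r.match(name) for r in include_res):
            is_excluded = False
        (excluded if is_excluded else kept).append(name)
    return kept, excluded


# Mirrors: --exclude '*.jpg' --rinclude '[0-9].*\.jpg'
names = ["path/file1.ext", "holiday.jpg", "2011-trip.jpg"]
kept, excluded = filter_names(names, ["*.jpg"], [r"[0-9].*\.jpg"])
print(kept)
print(excluded)
```

Only the relative names (the bold parts in the examples) are passed in, matching the rule that the leading local or remote prefix takes no part in pattern matching.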
s3cmd-1.1.0-beta3/s3cmd0000755000175000001440000024235211703442504015313 0ustar mludvigusers00000000000000#!/usr/bin/python ## Amazon S3 manager ## Author: Michal Ludvig ## http://www.logix.cz/michal ## License: GPL Version 2 import sys if float("%d.%d" %(sys.version_info[0], sys.version_info[1])) < 2.4: sys.stderr.write("ERROR: Python 2.4 or higher required, sorry.\n") sys.exit(1) import logging import time import os import re import errno import glob import traceback import codecs import locale import subprocess import htmlentitydefs import socket from copy import copy from optparse import OptionParser, Option, OptionValueError, IndentedHelpFormatter from logging import debug, info, warning, error from distutils.spawn import find_executable def output(message): sys.stdout.write(message + "\n") def check_args_type(args, type, verbose_type): for arg in args: if S3Uri(arg).type != type: raise ParameterError("Expecting %s instead of '%s'" % (verbose_type, arg)) def cmd_du(args): s3 = S3(Config()) if len(args) > 0: uri = S3Uri(args[0]) if uri.type == "s3" and uri.has_bucket(): subcmd_bucket_usage(s3, uri) return subcmd_bucket_usage_all(s3) def subcmd_bucket_usage_all(s3): response = s3.list_all_buckets() buckets_size = 0 for bucket in response["list"]: size = subcmd_bucket_usage(s3, S3Uri("s3://" + bucket["Name"])) if size != None: buckets_size += size total_size, size_coeff = formatSize(buckets_size, Config().human_readable_sizes) total_size_str = str(total_size) + size_coeff output(u"".rjust(8, "-")) output(u"%s Total" % (total_size_str.ljust(8))) def subcmd_bucket_usage(s3, uri): bucket = uri.bucket() object = uri.object() if object.endswith('*'): object = object[:-1] try: response = s3.bucket_list(bucket, prefix = object, recursive = True) except S3Error, e: if S3.codes.has_key(e.info["Code"]): error(S3.codes[e.info["Code"]] % bucket) return else: raise bucket_size = 0 for object in response["list"]: size, size_coeff = formatSize(object["Size"], False) bucket_size 
+= size total_size, size_coeff = formatSize(bucket_size, Config().human_readable_sizes) total_size_str = str(total_size) + size_coeff output(u"%s %s" % (total_size_str.ljust(8), uri)) return bucket_size def cmd_ls(args): s3 = S3(Config()) if len(args) > 0: uri = S3Uri(args[0]) if uri.type == "s3" and uri.has_bucket(): subcmd_bucket_list(s3, uri) return subcmd_buckets_list_all(s3) def cmd_buckets_list_all_all(args): s3 = S3(Config()) response = s3.list_all_buckets() for bucket in response["list"]: subcmd_bucket_list(s3, S3Uri("s3://" + bucket["Name"])) output(u"") def subcmd_buckets_list_all(s3): response = s3.list_all_buckets() for bucket in response["list"]: output(u"%s s3://%s" % ( formatDateTime(bucket["CreationDate"]), bucket["Name"], )) def subcmd_bucket_list(s3, uri): bucket = uri.bucket() prefix = uri.object() debug(u"Bucket 's3://%s':" % bucket) if prefix.endswith('*'): prefix = prefix[:-1] try: response = s3.bucket_list(bucket, prefix = prefix) except S3Error, e: if S3.codes.has_key(e.info["Code"]): error(S3.codes[e.info["Code"]] % bucket) return else: raise if cfg.list_md5: format_string = u"%(timestamp)16s %(size)9s%(coeff)1s %(md5)32s %(uri)s" else: format_string = u"%(timestamp)16s %(size)9s%(coeff)1s %(uri)s" for prefix in response['common_prefixes']: output(format_string % { "timestamp": "", "size": "DIR", "coeff": "", "md5": "", "uri": uri.compose_uri(bucket, prefix["Prefix"])}) for object in response["list"]: size, size_coeff = formatSize(object["Size"], Config().human_readable_sizes) output(format_string % { "timestamp": formatDateTime(object["LastModified"]), "size" : str(size), "coeff": size_coeff, "md5" : object['ETag'].strip('"'), "uri": uri.compose_uri(bucket, object["Key"]), }) def cmd_bucket_create(args): s3 = S3(Config()) for arg in args: uri = S3Uri(arg) if not uri.type == "s3" or not uri.has_bucket() or uri.has_object(): raise ParameterError("Expecting S3 URI with just the bucket name set instead of '%s'" % arg) try: response = 
s3.bucket_create(uri.bucket(), cfg.bucket_location) output(u"Bucket '%s' created" % uri.uri()) except S3Error, e: if S3.codes.has_key(e.info["Code"]): error(S3.codes[e.info["Code"]] % uri.bucket()) return else: raise def cmd_website_info(args): s3 = S3(Config()) for arg in args: uri = S3Uri(arg) if not uri.type == "s3" or not uri.has_bucket() or uri.has_object(): raise ParameterError("Expecting S3 URI with just the bucket name set instead of '%s'" % arg) try: response = s3.website_info(uri, cfg.bucket_location) if response: output(u"Bucket %s: Website configuration" % uri.uri()) output(u"Website endpoint: %s" % response['website_endpoint']) output(u"Index document: %s" % response['index_document']) output(u"Error document: %s" % response['error_document']) else: output(u"Bucket %s: Unable to receive website configuration." % (uri.uri())) except S3Error, e: if S3.codes.has_key(e.info["Code"]): error(S3.codes[e.info["Code"]] % uri.bucket()) return else: raise def cmd_website_create(args): s3 = S3(Config()) for arg in args: uri = S3Uri(arg) if not uri.type == "s3" or not uri.has_bucket() or uri.has_object(): raise ParameterError("Expecting S3 URI with just the bucket name set instead of '%s'" % arg) try: response = s3.website_create(uri, cfg.bucket_location) output(u"Bucket '%s': website configuration created." % (uri.uri())) except S3Error, e: if S3.codes.has_key(e.info["Code"]): error(S3.codes[e.info["Code"]] % uri.bucket()) return else: raise def cmd_website_delete(args): s3 = S3(Config()) for arg in args: uri = S3Uri(arg) if not uri.type == "s3" or not uri.has_bucket() or uri.has_object(): raise ParameterError("Expecting S3 URI with just the bucket name set instead of '%s'" % arg) try: response = s3.website_delete(uri, cfg.bucket_location) output(u"Bucket '%s': website configuration deleted." 
% (uri.uri())) except S3Error, e: if S3.codes.has_key(e.info["Code"]): error(S3.codes[e.info["Code"]] % uri.bucket()) return else: raise def cmd_bucket_delete(args): def _bucket_delete_one(uri): try: response = s3.bucket_delete(uri.bucket()) except S3Error, e: if e.info['Code'] == 'BucketNotEmpty' and (cfg.force or cfg.recursive): warning(u"Bucket is not empty. Removing all the objects from it first. This may take some time...") subcmd_object_del_uri(uri.uri(), recursive = True) return _bucket_delete_one(uri) elif S3.codes.has_key(e.info["Code"]): error(S3.codes[e.info["Code"]] % uri.bucket()) return else: raise s3 = S3(Config()) for arg in args: uri = S3Uri(arg) if not uri.type == "s3" or not uri.has_bucket() or uri.has_object(): raise ParameterError("Expecting S3 URI with just the bucket name set instead of '%s'" % arg) _bucket_delete_one(uri) output(u"Bucket '%s' removed" % uri.uri()) def cmd_object_put(args): cfg = Config() s3 = S3(cfg) if len(args) == 0: raise ParameterError("Nothing to upload. Expecting a local file or directory and a S3 URI destination.") ## Normalize URI to convert s3://bkt to s3://bkt/ (trailing slash) destination_base_uri = S3Uri(args.pop()) if destination_base_uri.type != 's3': raise ParameterError("Destination must be S3Uri. Got: %s" % destination_base_uri) destination_base = str(destination_base_uri) if len(args) == 0: raise ParameterError("Nothing to upload. 
Expecting a local file or directory.") local_list, single_file_local = fetch_local_list(args) local_list, exclude_list = filter_exclude_include(local_list) local_count = len(local_list) info(u"Summary: %d local files to upload" % local_count) if local_count > 0: if not destination_base.endswith("/"): if not single_file_local: raise ParameterError("Destination S3 URI must end with '/' (ie must refer to a directory on the remote side).") local_list[local_list.keys()[0]]['remote_uri'] = unicodise(destination_base) else: for key in local_list: local_list[key]['remote_uri'] = unicodise(destination_base + key) if cfg.dry_run: for key in exclude_list: output(u"exclude: %s" % unicodise(key)) for key in local_list: output(u"upload: %s -> %s" % (local_list[key]['full_name_unicode'], local_list[key]['remote_uri'])) warning(u"Exitting now because of --dry-run") return seq = 0 for key in local_list: seq += 1 uri_final = S3Uri(local_list[key]['remote_uri']) extra_headers = copy(cfg.extra_headers) full_name_orig = local_list[key]['full_name'] full_name = full_name_orig seq_label = "[%d of %d]" % (seq, local_count) if Config().encrypt: exitcode, full_name, extra_headers["x-amz-meta-s3tools-gpgenc"] = gpg_encrypt(full_name_orig) try: response = s3.object_put(full_name, uri_final, extra_headers, extra_label = seq_label) except S3UploadError, e: error(u"Upload of '%s' failed too many times. Skipping that file." 
% full_name_orig) continue except InvalidFileError, e: warning(u"File can not be uploaded: %s" % e) continue speed_fmt = formatSize(response["speed"], human_readable = True, floating_point = True) if not Config().progress_meter: output(u"File '%s' stored as '%s' (%d bytes in %0.1f seconds, %0.2f %sB/s) %s" % (unicodise(full_name_orig), uri_final, response["size"], response["elapsed"], speed_fmt[0], speed_fmt[1], seq_label)) if Config().acl_public: output(u"Public URL of the object is: %s" % (uri_final.public_url())) if Config().encrypt and full_name != full_name_orig: debug(u"Removing temporary encrypted file: %s" % unicodise(full_name)) os.remove(full_name) def cmd_object_get(args): cfg = Config() s3 = S3(cfg) ## Check arguments: ## if not --recursive: ## - first N arguments must be S3Uri ## - if the last one is S3 make current dir the destination_base ## - if the last one is a directory: ## - take all 'basenames' of the remote objects and ## make the destination name be 'destination_base'+'basename' ## - if the last one is a file or not existing: ## - if the number of sources (N, above) == 1 treat it ## as a filename and save the object there. ## - if there's more sources -> Error ## if --recursive: ## - first N arguments must be S3Uri ## - for each Uri get a list of remote objects with that Uri as a prefix ## - apply exclude/include rules ## - each list item will have MD5sum, Timestamp and pointer to S3Uri ## used as a prefix. ## - the last arg may be '-' (stdout) ## - the last arg may be a local directory - destination_base ## - if the last one is S3 make current dir the destination_base ## - if the last one doesn't exist check remote list: ## - if there is only one item and its_prefix==its_name ## download that item to the name given in last arg. ## - if there are more remote items use the last arg as a destination_base ## and try to create the directory (incl. all parents). 
## ## In both cases we end up with a list mapping remote object names (keys) to local file names. ## Each item will be a dict with the following attributes # {'remote_uri', 'local_filename'} download_list = [] if len(args) == 0: raise ParameterError("Nothing to download. Expecting S3 URI.") if S3Uri(args[-1]).type == 'file': destination_base = args.pop() else: destination_base = "." if len(args) == 0: raise ParameterError("Nothing to download. Expecting S3 URI.") remote_list = fetch_remote_list(args, require_attribs = False) remote_list, exclude_list = filter_exclude_include(remote_list) remote_count = len(remote_list) info(u"Summary: %d remote files to download" % remote_count) if remote_count > 0: if destination_base == "-": ## stdout is ok for multiple remote files! for key in remote_list: remote_list[key]['local_filename'] = "-" elif not os.path.isdir(destination_base): ## We were either given a file name (existing or not) if remote_count > 1: raise ParameterError("Destination must be a directory or stdout when downloading multiple sources.") remote_list[remote_list.keys()[0]]['local_filename'] = deunicodise(destination_base) elif os.path.isdir(destination_base): if destination_base[-1] != os.path.sep: destination_base += os.path.sep for key in remote_list: remote_list[key]['local_filename'] = destination_base + key else: raise InternalError("WTF? Is it a dir or not? 
-- %s" % destination_base) if cfg.dry_run: for key in exclude_list: output(u"exclude: %s" % unicodise(key)) for key in remote_list: output(u"download: %s -> %s" % (remote_list[key]['object_uri_str'], remote_list[key]['local_filename'])) warning(u"Exiting now because of --dry-run") return seq = 0 for key in remote_list: seq += 1 item = remote_list[key] uri = S3Uri(item['object_uri_str']) ## Encode / Decode destination with "replace" to make sure it's compatible with current encoding destination = unicodise_safe(item['local_filename']) seq_label = "[%d of %d]" % (seq, remote_count) start_position = 0 if destination == "-": ## stdout dst_stream = sys.__stdout__ else: ## File try: file_exists = os.path.exists(destination) try: dst_stream = open(destination, "ab") except IOError, e: if e.errno == errno.ENOENT: basename = destination[:destination.rindex(os.path.sep)] info(u"Creating directory: %s" % basename) os.makedirs(basename) dst_stream = open(destination, "ab") else: raise if file_exists: if Config().get_continue: start_position = dst_stream.tell() elif Config().force: start_position = 0L dst_stream.seek(0L) dst_stream.truncate() elif Config().skip_existing: info(u"Skipping over existing file: %s" % (destination)) continue else: dst_stream.close() raise ParameterError(u"File %s already exists. Use either of --force / --continue / --skip-existing or give it a new name."
% destination) except IOError, e: error(u"Skipping %s: %s" % (destination, e.strerror)) continue response = s3.object_get(uri, dst_stream, start_position = start_position, extra_label = seq_label) if response["headers"].has_key("x-amz-meta-s3tools-gpgenc"): gpg_decrypt(destination, response["headers"]["x-amz-meta-s3tools-gpgenc"]) response["size"] = os.stat(destination)[6] if not Config().progress_meter and destination != "-": speed_fmt = formatSize(response["speed"], human_readable = True, floating_point = True) output(u"File %s saved as '%s' (%d bytes in %0.1f seconds, %0.2f %sB/s)" % (uri, destination, response["size"], response["elapsed"], speed_fmt[0], speed_fmt[1])) def cmd_object_del(args): for uri_str in args: uri = S3Uri(uri_str) if uri.type != "s3": raise ParameterError("Expecting S3 URI instead of '%s'" % uri_str) if not uri.has_object(): if Config().recursive and not Config().force: raise ParameterError("Please use --force to delete ALL contents of %s" % uri_str) elif not Config().recursive: raise ParameterError("File name required, not only the bucket name. 
Alternatively use --recursive") subcmd_object_del_uri(uri_str) def subcmd_object_del_uri(uri_str, recursive = None): s3 = S3(cfg) if recursive is None: recursive = cfg.recursive remote_list = fetch_remote_list(uri_str, require_attribs = False, recursive = recursive) remote_list, exclude_list = filter_exclude_include(remote_list) remote_count = len(remote_list) info(u"Summary: %d remote files to delete" % remote_count) if cfg.dry_run: for key in exclude_list: output(u"exclude: %s" % unicodise(key)) for key in remote_list: output(u"delete: %s" % remote_list[key]['object_uri_str']) warning(u"Exiting now because of --dry-run") return for key in remote_list: item = remote_list[key] response = s3.object_delete(S3Uri(item['object_uri_str'])) output(u"File %s deleted" % item['object_uri_str']) def subcmd_cp_mv(args, process_fce, action_str, message): if len(args) < 2: raise ParameterError("Expecting two or more S3 URIs for " + action_str) dst_base_uri = S3Uri(args.pop()) if dst_base_uri.type != "s3": raise ParameterError("Destination must be S3 URI.
To download a file use 'get' or 'sync'.") destination_base = dst_base_uri.uri() remote_list = fetch_remote_list(args, require_attribs = False) remote_list, exclude_list = filter_exclude_include(remote_list) remote_count = len(remote_list) info(u"Summary: %d remote files to %s" % (remote_count, action_str)) if cfg.recursive: if not destination_base.endswith("/"): destination_base += "/" for key in remote_list: remote_list[key]['dest_name'] = destination_base + key else: for key in remote_list: if destination_base.endswith("/"): remote_list[key]['dest_name'] = destination_base + key else: remote_list[key]['dest_name'] = destination_base if cfg.dry_run: for key in exclude_list: output(u"exclude: %s" % unicodise(key)) for key in remote_list: output(u"%s: %s -> %s" % (action_str, remote_list[key]['object_uri_str'], remote_list[key]['dest_name'])) warning(u"Exiting now because of --dry-run") return seq = 0 for key in remote_list: seq += 1 seq_label = "[%d of %d]" % (seq, remote_count) item = remote_list[key] src_uri = S3Uri(item['object_uri_str']) dst_uri = S3Uri(item['dest_name']) extra_headers = copy(cfg.extra_headers) response = process_fce(src_uri, dst_uri, extra_headers) output(message % { "src" : src_uri, "dst" : dst_uri }) if Config().acl_public: info(u"Public URL is: %s" % dst_uri.public_url()) def cmd_cp(args): s3 = S3(Config()) subcmd_cp_mv(args, s3.object_copy, "copy", "File %(src)s copied to %(dst)s") def cmd_mv(args): s3 = S3(Config()) subcmd_cp_mv(args, s3.object_move, "move", "File %(src)s moved to %(dst)s") def cmd_info(args): s3 = S3(Config()) while (len(args)): uri_arg = args.pop(0) uri = S3Uri(uri_arg) if uri.type != "s3" or not uri.has_bucket(): raise ParameterError("Expecting S3 URI instead of '%s'" % uri_arg) try: if uri.has_object(): info = s3.object_info(uri) output(u"%s (object):" % uri.uri()) output(u" File size: %s" % info['headers']['content-length']) output(u" Last mod: %s" % info['headers']['last-modified']) output(u" MIME type: %s" %
info['headers']['content-type']) output(u" MD5 sum: %s" % info['headers']['etag'].strip('"')) else: info = s3.bucket_info(uri) output(u"%s (bucket):" % uri.uri()) output(u" Location: %s" % info['bucket-location']) acl = s3.get_acl(uri) acl_grant_list = acl.getGrantList() for grant in acl_grant_list: output(u" ACL: %s: %s" % (grant['grantee'], grant['permission'])) if acl.isAnonRead(): output(u" URL: %s" % uri.public_url()) except S3Error, e: if S3.codes.has_key(e.info["Code"]): error(S3.codes[e.info["Code"]] % uri.bucket()) return else: raise def cmd_sync_remote2remote(args): s3 = S3(Config()) # Normalise s3://uri (e.g. assert trailing slash) destination_base = unicode(S3Uri(args[-1])) src_list = fetch_remote_list(args[:-1], recursive = True, require_attribs = True) dst_list = fetch_remote_list(destination_base, recursive = True, require_attribs = True) src_count = len(src_list) dst_count = len(dst_list) info(u"Found %d source files, %d destination files" % (src_count, dst_count)) src_list, exclude_list = filter_exclude_include(src_list) src_list, dst_list, existing_list = compare_filelists(src_list, dst_list, src_remote = True, dst_remote = True) src_count = len(src_list) dst_count = len(dst_list) print(u"Summary: %d source files to copy, %d files at destination to delete" % (src_count, dst_count)) if src_count > 0: ### Populate 'target_uri' only if we've got something to sync from src to dst for key in src_list: src_list[key]['target_uri'] = destination_base + key if cfg.dry_run: for key in exclude_list: output(u"exclude: %s" % unicodise(key)) if cfg.delete_removed: for key in dst_list: output(u"delete: %s" % dst_list[key]['object_uri_str']) for key in src_list: output(u"Sync: %s -> %s" % (src_list[key]['object_uri_str'], src_list[key]['target_uri'])) warning(u"Exiting now because of --dry-run") return # Delete items in destination that are not in source if cfg.delete_removed: if cfg.dry_run: for key in dst_list: output(u"delete: %s" %
dst_list[key]['object_uri_str']) else: for key in dst_list: uri = S3Uri(dst_list[key]['object_uri_str']) s3.object_delete(uri) output(u"deleted: '%s'" % uri) # Perform the synchronization of files timestamp_start = time.time() seq = 0 file_list = src_list.keys() file_list.sort() for file in file_list: seq += 1 item = src_list[file] src_uri = S3Uri(item['object_uri_str']) dst_uri = S3Uri(item['target_uri']) seq_label = "[%d of %d]" % (seq, src_count) extra_headers = copy(cfg.extra_headers) try: response = s3.object_copy(src_uri, dst_uri, extra_headers) output("File %(src)s copied to %(dst)s" % { "src" : src_uri, "dst" : dst_uri }) except S3Error, e: error("File %(src)s could not be copied: %(e)s" % { "src" : src_uri, "e" : e }) total_elapsed = time.time() - timestamp_start outstr = "Done. Copied %d files in %0.1f seconds, %0.2f files/s" % (seq, total_elapsed, seq/total_elapsed) if seq > 0: output(outstr) else: info(outstr) def cmd_sync_remote2local(args): def _parse_attrs_header(attrs_header): attrs = {} for attr in attrs_header.split("/"): key, val = attr.split(":") attrs[key] = val return attrs s3 = S3(Config()) destination_base = args[-1] local_list, single_file_local = fetch_local_list(destination_base, recursive = True) remote_list = fetch_remote_list(args[:-1], recursive = True, require_attribs = True) local_count = len(local_list) remote_count = len(remote_list) info(u"Found %d remote files, %d local files" % (remote_count, local_count)) remote_list, exclude_list = filter_exclude_include(remote_list) remote_list, local_list, existing_list = compare_filelists(remote_list, local_list, src_remote = True, dst_remote = False) local_count = len(local_list) remote_count = len(remote_list) info(u"Summary: %d remote files to download, %d local files to delete" % (remote_count, local_count)) if not os.path.isdir(destination_base): ## We were either given a file name (existing or not) or want STDOUT if remote_count > 1: raise ParameterError("Destination must be a 
directory when downloading multiple sources.") remote_list[remote_list.keys()[0]]['local_filename'] = deunicodise(destination_base) else: if destination_base[-1] != os.path.sep: destination_base += os.path.sep for key in remote_list: local_filename = destination_base + key if os.path.sep != "/": local_filename = os.path.sep.join(local_filename.split("/")) remote_list[key]['local_filename'] = deunicodise(local_filename) if cfg.dry_run: for key in exclude_list: output(u"exclude: %s" % unicodise(key)) if cfg.delete_removed: for key in local_list: output(u"delete: %s" % local_list[key]['full_name_unicode']) for key in remote_list: output(u"download: %s -> %s" % (remote_list[key]['object_uri_str'], remote_list[key]['local_filename'])) warning(u"Exiting now because of --dry-run") return if cfg.delete_removed: for key in local_list: os.unlink(local_list[key]['full_name']) output(u"deleted: %s" % local_list[key]['full_name_unicode']) total_size = 0 total_elapsed = 0.0 timestamp_start = time.time() seq = 0 dir_cache = {} file_list = remote_list.keys() file_list.sort() for file in file_list: seq += 1 item = remote_list[file] uri = S3Uri(item['object_uri_str']) dst_file = item['local_filename'] seq_label = "[%d of %d]" % (seq, remote_count) try: dst_dir = os.path.dirname(dst_file) if not dir_cache.has_key(dst_dir): dir_cache[dst_dir] = Utils.mkdir_with_parents(dst_dir) if dir_cache[dst_dir] == False: warning(u"%s: destination directory not writable: %s" % (file, dst_dir)) continue try: open_flags = os.O_CREAT open_flags |= os.O_TRUNC # open_flags |= os.O_EXCL debug(u"dst_file=%s" % unicodise(dst_file)) # This will have failed should the file exist os.close(os.open(dst_file, open_flags)) # Yeah I know there is a race condition here. Sadly I don't know how to open() in exclusive mode.
dst_stream = open(dst_file, "wb") response = s3.object_get(uri, dst_stream, extra_label = seq_label) dst_stream.close() if response['headers'].has_key('x-amz-meta-s3cmd-attrs') and cfg.preserve_attrs: attrs = _parse_attrs_header(response['headers']['x-amz-meta-s3cmd-attrs']) if attrs.has_key('mode'): os.chmod(dst_file, int(attrs['mode'])) if attrs.has_key('mtime') or attrs.has_key('atime'): mtime = attrs.has_key('mtime') and int(attrs['mtime']) or int(time.time()) atime = attrs.has_key('atime') and int(attrs['atime']) or int(time.time()) os.utime(dst_file, (atime, mtime)) ## FIXME: uid/gid / uname/gname handling comes here! TODO except OSError, e: try: dst_stream.close() except: pass if e.errno == errno.EEXIST: warning(u"%s exists - not overwriting" % (dst_file)) continue if e.errno in (errno.EPERM, errno.EACCES): warning(u"%s not writable: %s" % (dst_file, e.strerror)) continue if e.errno == errno.EISDIR: warning(u"%s is a directory - skipping over" % dst_file) continue raise e except KeyboardInterrupt: try: dst_stream.close() except: pass warning(u"Exiting after keyboard interrupt") return except Exception, e: try: dst_stream.close() except: pass error(u"%s: %s" % (file, e)) continue # We have to keep repeating this call because # Python 2.4 doesn't support try/except/finally # construction :-( try: dst_stream.close() except: pass except S3DownloadError, e: error(u"%s: download failed too many times. Skipping that file." 
% file) continue speed_fmt = formatSize(response["speed"], human_readable = True, floating_point = True) if not Config().progress_meter: output(u"File '%s' stored as '%s' (%d bytes in %0.1f seconds, %0.2f %sB/s) %s" % (uri, unicodise(dst_file), response["size"], response["elapsed"], speed_fmt[0], speed_fmt[1], seq_label)) total_size += response["size"] total_elapsed = time.time() - timestamp_start speed_fmt = formatSize(total_size/total_elapsed, human_readable = True, floating_point = True) # Only print out the result if any work has been done or # if the user asked for verbose output outstr = "Done. Downloaded %d bytes in %0.1f seconds, %0.2f %sB/s" % (total_size, total_elapsed, speed_fmt[0], speed_fmt[1]) if total_size > 0: output(outstr) else: info(outstr) def cmd_sync_local2remote(args): def _build_attr_header(src): import pwd, grp attrs = {} src = deunicodise(src) try: st = os.stat_result(os.stat(src)) except OSError, e: raise InvalidFileError(u"%s: %s" % (unicodise(src), e.strerror)) for attr in cfg.preserve_attrs_list: if attr == 'uname': try: val = pwd.getpwuid(st.st_uid).pw_name except KeyError: attr = "uid" val = st.st_uid warning(u"%s: Owner username not known. Storing UID=%d instead." % (unicodise(src), val)) elif attr == 'gname': try: val = grp.getgrgid(st.st_gid).gr_name except KeyError: attr = "gid" val = st.st_gid warning(u"%s: Owner groupname not known. Storing GID=%d instead." 
% (unicodise(src), val)) else: val = getattr(st, 'st_' + attr) attrs[attr] = val result = "" for k in attrs: result += "%s:%s/" % (k, attrs[k]) return { 'x-amz-meta-s3cmd-attrs' : result[:-1] } s3 = S3(cfg) if cfg.encrypt: error(u"S3cmd 'sync' doesn't yet support GPG encryption, sorry.") error(u"Either use unconditional 's3cmd put --recursive'") error(u"or disable encryption with --no-encrypt parameter.") sys.exit(1) ## Normalize URI to convert s3://bkt to s3://bkt/ (trailing slash) destination_base_uri = S3Uri(args[-1]) if destination_base_uri.type != 's3': raise ParameterError("Destination must be S3Uri. Got: %s" % destination_base_uri) destination_base = str(destination_base_uri) local_list, single_file_local = fetch_local_list(args[:-1], recursive = True) remote_list = fetch_remote_list(destination_base, recursive = True, require_attribs = True) local_count = len(local_list) remote_count = len(remote_list) info(u"Found %d local files, %d remote files" % (local_count, remote_count)) local_list, exclude_list = filter_exclude_include(local_list) if single_file_local and len(local_list) == 1 and len(remote_list) == 1: ## Make remote_key same as local_key for comparison if we're dealing with only one file remote_list_entry = remote_list[remote_list.keys()[0]] # Flush remote_list, by the way remote_list = { local_list.keys()[0] : remote_list_entry } local_list, remote_list, existing_list = compare_filelists(local_list, remote_list, src_remote = False, dst_remote = True) local_count = len(local_list) remote_count = len(remote_list) info(u"Summary: %d local files to upload, %d remote files to delete" % (local_count, remote_count)) if local_count > 0: ## Populate 'remote_uri' only if we've got something to upload if not destination_base.endswith("/"): if not single_file_local: raise ParameterError("Destination S3 URI must end with '/' (ie must refer to a directory on the remote side).") local_list[local_list.keys()[0]]['remote_uri'] = unicodise(destination_base) else: 
for key in local_list: local_list[key]['remote_uri'] = unicodise(destination_base + key) if cfg.dry_run: for key in exclude_list: output(u"exclude: %s" % unicodise(key)) if cfg.delete_removed: for key in remote_list: output(u"delete: %s" % remote_list[key]['object_uri_str']) for key in local_list: output(u"upload: %s -> %s" % (local_list[key]['full_name_unicode'], local_list[key]['remote_uri'])) warning(u"Exiting now because of --dry-run") return if cfg.delete_removed: for key in remote_list: uri = S3Uri(remote_list[key]['object_uri_str']) s3.object_delete(uri) output(u"deleted: '%s'" % uri) uploaded_objects_list = [] total_size = 0 total_elapsed = 0.0 timestamp_start = time.time() seq = 0 file_list = local_list.keys() file_list.sort() for file in file_list: seq += 1 item = local_list[file] src = item['full_name'] uri = S3Uri(item['remote_uri']) seq_label = "[%d of %d]" % (seq, local_count) extra_headers = copy(cfg.extra_headers) try: if cfg.preserve_attrs: attr_header = _build_attr_header(src) debug(u"attr_header: %s" % attr_header) extra_headers.update(attr_header) response = s3.object_put(src, uri, extra_headers, extra_label = seq_label) except InvalidFileError, e: warning(u"File can not be uploaded: %s" % e) continue except S3UploadError, e: error(u"%s: upload failed too many times. Skipping that file."
% item['full_name_unicode']) continue speed_fmt = formatSize(response["speed"], human_readable = True, floating_point = True) if not cfg.progress_meter: output(u"File '%s' stored as '%s' (%d bytes in %0.1f seconds, %0.2f %sB/s) %s" % (item['full_name_unicode'], uri, response["size"], response["elapsed"], speed_fmt[0], speed_fmt[1], seq_label)) total_size += response["size"] uploaded_objects_list.append(uri.object()) total_elapsed = time.time() - timestamp_start total_speed = total_elapsed and total_size/total_elapsed or 0.0 speed_fmt = formatSize(total_speed, human_readable = True, floating_point = True) # Only print out the result if any work has been done or # if the user asked for verbose output outstr = "Done. Uploaded %d bytes in %0.1f seconds, %0.2f %sB/s" % (total_size, total_elapsed, speed_fmt[0], speed_fmt[1]) if total_size > 0: output(outstr) else: info(outstr) if cfg.invalidate_on_cf: if len(uploaded_objects_list) == 0: info("Nothing to invalidate in CloudFront") else: # 'uri' from the last iteration is still valid at this point cf = CloudFront(cfg) result = cf.InvalidateObjects(uri, uploaded_objects_list) if result['status'] == 201: output("Created invalidation request for %d paths" % len(uploaded_objects_list)) output("Check progress with: s3cmd cfinvalinfo cf://%s/%s" % (result['dist_id'], result['request_id'])) def cmd_sync(args): if (len(args) < 2): raise ParameterError("Too few parameters! 
Expected: %s" % commands['sync']['param']) if S3Uri(args[0]).type == "file" and S3Uri(args[-1]).type == "s3": return cmd_sync_local2remote(args) if S3Uri(args[0]).type == "s3" and S3Uri(args[-1]).type == "file": return cmd_sync_remote2local(args) if S3Uri(args[0]).type == "s3" and S3Uri(args[-1]).type == "s3": return cmd_sync_remote2remote(args) raise ParameterError("Invalid source/destination: '%s'" % "' '".join(args)) def cmd_setacl(args): def _update_acl(uri, seq_label = ""): something_changed = False acl = s3.get_acl(uri) debug(u"acl: %s - %r" % (uri, acl.grantees)) if cfg.acl_public == True: if acl.isAnonRead(): info(u"%s: already Public, skipping %s" % (uri, seq_label)) else: acl.grantAnonRead() something_changed = True elif cfg.acl_public == False: # we explicitly check for False, because it could be None if not acl.isAnonRead(): info(u"%s: already Private, skipping %s" % (uri, seq_label)) else: acl.revokeAnonRead() something_changed = True # update acl with arguments # grant first and revoke later, because revoke has priority if cfg.acl_grants: something_changed = True for grant in cfg.acl_grants: acl.grant(**grant) if cfg.acl_revokes: something_changed = True for revoke in cfg.acl_revokes: acl.revoke(**revoke) if not something_changed: return response = s3.set_acl(uri, acl) if response['status'] == 200: if cfg.acl_public in (True, False): output(u"%s: ACL set to %s %s" % (uri, set_to_acl, seq_label)) else: output(u"%s: ACL updated" % uri) s3 = S3(cfg) set_to_acl = cfg.acl_public and "Public" or "Private" if not cfg.recursive: old_args = args args = [] for arg in old_args: uri = S3Uri(arg) if not uri.has_object(): if cfg.acl_public != None: info("Setting bucket-level ACL for %s to %s" % (uri.uri(), set_to_acl)) else: info("Setting bucket-level ACL for %s" % (uri.uri())) if not cfg.dry_run: _update_acl(uri) else: args.append(arg) remote_list = fetch_remote_list(args) remote_list, exclude_list = filter_exclude_include(remote_list) remote_count =
len(remote_list) info(u"Summary: %d remote files to update" % remote_count) if cfg.dry_run: for key in exclude_list: output(u"exclude: %s" % unicodise(key)) for key in remote_list: output(u"setacl: %s" % remote_list[key]['object_uri_str']) warning(u"Exiting now because of --dry-run") return seq = 0 for key in remote_list: seq += 1 seq_label = "[%d of %d]" % (seq, remote_count) uri = S3Uri(remote_list[key]['object_uri_str']) _update_acl(uri, seq_label) def cmd_accesslog(args): s3 = S3(cfg) bucket_uri = S3Uri(args.pop()) if bucket_uri.object(): raise ParameterError("Only bucket name is required for [accesslog] command") if cfg.log_target_prefix == False: accesslog, response = s3.set_accesslog(bucket_uri, enable = False) elif cfg.log_target_prefix: log_target_prefix_uri = S3Uri(cfg.log_target_prefix) if log_target_prefix_uri.type != "s3": raise ParameterError("--log-target-prefix must be an S3 URI") accesslog, response = s3.set_accesslog(bucket_uri, enable = True, log_target_prefix_uri = log_target_prefix_uri, acl_public = cfg.acl_public) else: # cfg.log_target_prefix == None accesslog = s3.get_accesslog(bucket_uri) output(u"Access logging for: %s" % bucket_uri.uri()) output(u" Logging Enabled: %s" % accesslog.isLoggingEnabled()) if accesslog.isLoggingEnabled(): output(u" Target prefix: %s" % accesslog.targetPrefix().uri()) #output(u" Public Access: %s" % accesslog.isAclPublic()) def cmd_sign(args): string_to_sign = args.pop() debug("string-to-sign: %r" % string_to_sign) signature = Utils.sign_string(string_to_sign) output("Signature: %s" % signature) def cmd_fixbucket(args): def _unescape(text): ## # Removes HTML or XML character references and entities from a text string. # # @param text The HTML (or XML) source text. # @return The plain text, as a Unicode string, if necessary.
# # From: http://effbot.org/zone/re-sub.htm#unescape-html def _unescape_fixup(m): text = m.group(0) if not htmlentitydefs.name2codepoint.has_key('apos'): htmlentitydefs.name2codepoint['apos'] = ord("'") if text[:2] == "&#": # character reference try: if text[:3] == "&#x": return unichr(int(text[3:-1], 16)) else: return unichr(int(text[2:-1])) except ValueError: pass else: # named entity try: text = unichr(htmlentitydefs.name2codepoint[text[1:-1]]) except KeyError: pass return text # leave as is text = text.encode('ascii', 'xmlcharrefreplace') return re.sub("&#?\w+;", _unescape_fixup, text) cfg.urlencoding_mode = "fixbucket" s3 = S3(cfg) count = 0 for arg in args: culprit = S3Uri(arg) if culprit.type != "s3": raise ParameterError("Expecting S3Uri instead of: %s" % arg) response = s3.bucket_list_noparse(culprit.bucket(), culprit.object(), recursive = True) r_xent = re.compile("&#x[\da-fA-F]+;") response['data'] = unicode(response['data'], 'UTF-8') keys = re.findall("<Key>(.*?)</Key>", response['data'], re.MULTILINE) debug("Keys: %r" % keys) for key in keys: if r_xent.search(key): info("Fixing: %s" % key) debug("Step 1: Transforming %s" % key) key_bin = _unescape(key) debug("Step 2: ... to %s" % key_bin) key_new = replace_nonprintables(key_bin) debug("Step 3: ... then to %s" % key_new) src = S3Uri("s3://%s/%s" % (culprit.bucket(), key_bin)) dst = S3Uri("s3://%s/%s" % (culprit.bucket(), key_new)) resp_move = s3.object_move(src, dst) if resp_move['status'] == 200: output("File %r renamed to %s" % (key_bin, key_new)) count += 1 else: error("Something went wrong for: %r" % key) error("Please report the problem to s3tools-bugs@lists.sourceforge.net") if count > 0: warning("Fixed %d files' names. Their ACLs were reset to Private." % count) warning("Use 's3cmd setacl --acl-public s3://...'
to make") warning("them publicly readable if required.") def resolve_list(lst, args): retval = [] for item in lst: retval.append(item % args) return retval def gpg_command(command, passphrase = ""): debug("GPG command: " + " ".join(command)) p = subprocess.Popen(command, stdin = subprocess.PIPE, stdout = subprocess.PIPE, stderr = subprocess.STDOUT) p_stdout, p_stderr = p.communicate(passphrase + "\n") debug("GPG output:") for line in p_stdout.split("\n"): debug("GPG: " + line) p_exitcode = p.wait() return p_exitcode def gpg_encrypt(filename): tmp_filename = Utils.mktmpfile() args = { "gpg_command" : cfg.gpg_command, "passphrase_fd" : "0", "input_file" : filename, "output_file" : tmp_filename, } info(u"Encrypting file %(input_file)s to %(output_file)s..." % args) command = resolve_list(cfg.gpg_encrypt.split(" "), args) code = gpg_command(command, cfg.gpg_passphrase) return (code, tmp_filename, "gpg") def gpg_decrypt(filename, gpgenc_header = "", in_place = True): tmp_filename = Utils.mktmpfile(filename) args = { "gpg_command" : cfg.gpg_command, "passphrase_fd" : "0", "input_file" : filename, "output_file" : tmp_filename, } info(u"Decrypting file %(input_file)s to %(output_file)s..." 
% args) command = resolve_list(cfg.gpg_decrypt.split(" "), args) code = gpg_command(command, cfg.gpg_passphrase) if code == 0 and in_place: debug(u"Renaming %s to %s" % (tmp_filename, filename)) os.unlink(filename) os.rename(tmp_filename, filename) tmp_filename = filename return (code, tmp_filename) def run_configure(config_file, args): cfg = Config() options = [ ("access_key", "Access Key", "Access key and Secret key are your identifiers for Amazon S3"), ("secret_key", "Secret Key"), ("gpg_passphrase", "Encryption password", "Encryption password is used to protect your files from reading\nby unauthorized persons while in transfer to S3"), ("gpg_command", "Path to GPG program"), ("use_https", "Use HTTPS protocol", "When using secure HTTPS protocol all communication with Amazon S3\nservers is protected from 3rd party eavesdropping. This method is\nslower than plain HTTP and can't be used if you're behind a proxy"), ("proxy_host", "HTTP Proxy server name", "On some networks all internet access must go through an HTTP proxy.\nTry setting it here if you can't connect to S3 directly"), ("proxy_port", "HTTP Proxy server port"), ] ## Option-specific defaults if getattr(cfg, "gpg_command") == "": setattr(cfg, "gpg_command", find_executable("gpg")) if getattr(cfg, "proxy_host") == "" and os.getenv("http_proxy"): re_match=re.match("(http://)?([^:]+):(\d+)", os.getenv("http_proxy")) if re_match: setattr(cfg, "proxy_host", re_match.groups()[1]) setattr(cfg, "proxy_port", re_match.groups()[2]) try: while 1: output(u"\nEnter new values or accept defaults in brackets with Enter.") output(u"Refer to user manual for detailed description of all options.") for option in options: prompt = option[1] ## Option-specific handling if option[0] == 'proxy_host' and getattr(cfg, 'use_https') == True: setattr(cfg, option[0], "") continue if option[0] == 'proxy_port' and getattr(cfg, 'proxy_host') == "": setattr(cfg, option[0], 0) continue try: val = getattr(cfg, option[0]) if type(val) is bool:
val = val and "Yes" or "No" if val not in (None, ""): prompt += " [%s]" % val except AttributeError: pass if len(option) >= 3: output(u"\n%s" % option[2]) val = raw_input(prompt + ": ") if val != "": if type(getattr(cfg, option[0])) is bool: # Turn 'Yes' into True, everything else into False val = val.lower().startswith('y') setattr(cfg, option[0], val) output(u"\nNew settings:") for option in options: output(u" %s: %s" % (option[1], getattr(cfg, option[0]))) val = raw_input("\nTest access with supplied credentials? [Y/n] ") if val.lower().startswith("y") or val == "": try: # By default, we try to list 'all' buckets which requires # ListAllMyBuckets permission if len(args) == 0: output(u"Please wait, attempting to list all buckets...") S3(Config()).bucket_list("", "") else: # If user specified a bucket name directly, we check it and only it. # Thus, access check can succeed even if user only has access to # a single bucket and lacks the ListAllMyBuckets permission. output(u"Please wait, attempting to list bucket: " + args[0]) uri = S3Uri(args[0]) if uri.type == "s3" and uri.has_bucket(): S3(Config()).bucket_list(uri.bucket(), "") else: raise Exception(u"Invalid bucket uri: " + args[0]) output(u"Success. Your access key and secret key worked fine :-)") output(u"\nNow verifying that encryption works...") if not getattr(cfg, "gpg_command") or not getattr(cfg, "gpg_passphrase"): output(u"Not configured.
Never mind.") else: if not getattr(cfg, "gpg_command"): raise Exception("Path to GPG program not set") if not os.path.isfile(getattr(cfg, "gpg_command")): raise Exception("GPG program not found") filename = Utils.mktmpfile() f = open(filename, "w") f.write(os.sys.copyright) f.close() ret_enc = gpg_encrypt(filename) ret_dec = gpg_decrypt(ret_enc[1], ret_enc[2], False) hash = [ Utils.hash_file_md5(filename), Utils.hash_file_md5(ret_enc[1]), Utils.hash_file_md5(ret_dec[1]), ] os.unlink(filename) os.unlink(ret_enc[1]) os.unlink(ret_dec[1]) if hash[0] == hash[2] and hash[0] != hash[1]: output ("Success. Encryption and decryption worked fine :-)") else: raise Exception("Encryption verification error.") except Exception, e: error(u"Test failed: %s" % (e)) val = raw_input("\nRetry configuration? [Y/n] ") if val.lower().startswith("y") or val == "": continue val = raw_input("\nSave settings? [y/N] ") if val.lower().startswith("y"): break val = raw_input("Retry configuration? [Y/n] ") if val.lower().startswith("n"): raise EOFError() ## Overwrite existing config file, make it user-readable only old_mask = os.umask(0077) try: os.remove(config_file) except OSError, e: if e.errno != errno.ENOENT: raise f = open(config_file, "w") os.umask(old_mask) cfg.dump_config(f) f.close() output(u"Configuration saved to '%s'" % config_file) except (EOFError, KeyboardInterrupt): output(u"\nConfiguration aborted. 
Changes were NOT saved.") return except IOError, e: error(u"Writing config file failed: %s: %s" % (config_file, e.strerror)) sys.exit(1) def process_patterns_from_file(fname, patterns_list): try: fn = open(fname, "rt") except IOError, e: error(e) sys.exit(1) for pattern in fn: pattern = pattern.strip() if re.match("^#", pattern) or re.match("^\s*$", pattern): continue debug(u"%s: adding rule: %s" % (fname, pattern)) patterns_list.append(pattern) return patterns_list def process_patterns(patterns_list, patterns_from, is_glob, option_txt = ""): """ process_patterns(patterns_list, patterns_from, is_glob, option_txt = "") Process --exclude / --include GLOB and REGEXP patterns. 'option_txt' is 'exclude' / 'include' / 'rexclude' / 'rinclude' Returns: patterns_compiled, patterns_textual """ patterns_compiled = [] patterns_textual = {} if patterns_list is None: patterns_list = [] if patterns_from: ## Append patterns from patterns_from for fname in patterns_from: debug(u"processing --%s-from %s" % (option_txt, fname)) patterns_list = process_patterns_from_file(fname, patterns_list) for pattern in patterns_list: debug(u"processing %s rule: %s" % (option_txt, pattern)) if is_glob: pattern = glob.fnmatch.translate(pattern) r = re.compile(pattern) patterns_compiled.append(r) patterns_textual[r] = pattern return patterns_compiled, patterns_textual def get_commands_list(): return [ {"cmd":"mb", "label":"Make bucket", "param":"s3://BUCKET", "func":cmd_bucket_create, "argc":1}, {"cmd":"rb", "label":"Remove bucket", "param":"s3://BUCKET", "func":cmd_bucket_delete, "argc":1}, {"cmd":"ls", "label":"List objects or buckets", "param":"[s3://BUCKET[/PREFIX]]", "func":cmd_ls, "argc":0}, {"cmd":"la", "label":"List all objects in all buckets", "param":"", "func":cmd_buckets_list_all_all, "argc":0}, {"cmd":"put", "label":"Put file into bucket", "param":"FILE [FILE...]
s3://BUCKET[/PREFIX]", "func":cmd_object_put, "argc":2}, {"cmd":"get", "label":"Get file from bucket", "param":"s3://BUCKET/OBJECT LOCAL_FILE", "func":cmd_object_get, "argc":1}, {"cmd":"del", "label":"Delete file from bucket", "param":"s3://BUCKET/OBJECT", "func":cmd_object_del, "argc":1}, #{"cmd":"mkdir", "label":"Make a virtual S3 directory", "param":"s3://BUCKET/path/to/dir", "func":cmd_mkdir, "argc":1}, {"cmd":"sync", "label":"Synchronize a directory tree to S3", "param":"LOCAL_DIR s3://BUCKET[/PREFIX] or s3://BUCKET[/PREFIX] LOCAL_DIR", "func":cmd_sync, "argc":2}, {"cmd":"du", "label":"Disk usage by buckets", "param":"[s3://BUCKET[/PREFIX]]", "func":cmd_du, "argc":0}, {"cmd":"info", "label":"Get various information about Buckets or Files", "param":"s3://BUCKET[/OBJECT]", "func":cmd_info, "argc":1}, {"cmd":"cp", "label":"Copy object", "param":"s3://BUCKET1/OBJECT1 s3://BUCKET2[/OBJECT2]", "func":cmd_cp, "argc":2}, {"cmd":"mv", "label":"Move object", "param":"s3://BUCKET1/OBJECT1 s3://BUCKET2[/OBJECT2]", "func":cmd_mv, "argc":2}, {"cmd":"setacl", "label":"Modify Access control list for Bucket or Files", "param":"s3://BUCKET[/OBJECT]", "func":cmd_setacl, "argc":1}, {"cmd":"accesslog", "label":"Enable/disable bucket access logging", "param":"s3://BUCKET", "func":cmd_accesslog, "argc":1}, {"cmd":"sign", "label":"Sign arbitrary string using the secret key", "param":"STRING-TO-SIGN", "func":cmd_sign, "argc":1}, {"cmd":"fixbucket", "label":"Fix invalid file names in a bucket", "param":"s3://BUCKET[/PREFIX]", "func":cmd_fixbucket, "argc":1}, ## Website commands {"cmd":"ws-create", "label":"Create Website from bucket", "param":"s3://BUCKET", "func":cmd_website_create, "argc":1}, {"cmd":"ws-delete", "label":"Delete Website", "param":"s3://BUCKET", "func":cmd_website_delete, "argc":1}, {"cmd":"ws-info", "label":"Info about Website", "param":"s3://BUCKET", "func":cmd_website_info, "argc":1}, ## CloudFront commands {"cmd":"cflist", "label":"List CloudFront distribution 
points", "param":"", "func":CfCmd.info, "argc":0}, {"cmd":"cfinfo", "label":"Display CloudFront distribution point parameters", "param":"[cf://DIST_ID]", "func":CfCmd.info, "argc":0}, {"cmd":"cfcreate", "label":"Create CloudFront distribution point", "param":"s3://BUCKET", "func":CfCmd.create, "argc":1}, {"cmd":"cfdelete", "label":"Delete CloudFront distribution point", "param":"cf://DIST_ID", "func":CfCmd.delete, "argc":1}, {"cmd":"cfmodify", "label":"Change CloudFront distribution point parameters", "param":"cf://DIST_ID", "func":CfCmd.modify, "argc":1}, #{"cmd":"cfinval", "label":"Invalidate CloudFront objects", "param":"s3://BUCKET/OBJECT [s3://BUCKET/OBJECT ...]", "func":CfCmd.invalidate, "argc":1}, {"cmd":"cfinvalinfo", "label":"Display CloudFront invalidation request(s) status", "param":"cf://DIST_ID[/INVAL_ID]", "func":CfCmd.invalinfo, "argc":1}, ] def format_commands(progname, commands_list): help = "Commands:\n" for cmd in commands_list: help += " %s\n %s %s %s\n" % (cmd["label"], progname, cmd["cmd"], cmd["param"]) return help class OptionMimeType(Option): def check_mimetype(option, opt, value): if re.compile("^[a-z0-9]+/[a-z0-9+\.-]+(;.*)?$", re.IGNORECASE).match(value): return value raise OptionValueError("option %s: invalid MIME-Type format: %r" % (opt, value)) class OptionS3ACL(Option): def check_s3acl(option, opt, value): permissions = ('read', 'write', 'read_acp', 'write_acp', 'full_control', 'all') try: permission, grantee = re.compile("^(\w+):(.+)$", re.IGNORECASE).match(value).groups() if not permission or not grantee: raise if permission in permissions: return { 'name' : grantee, 'permission' : permission.upper() } else: raise OptionValueError("option %s: invalid S3 ACL permission: %s (valid values: %s)" % (opt, permission, ", ".join(permissions))) except: raise OptionValueError("option %s: invalid S3 ACL format: %r" % (opt, value)) class OptionAll(OptionMimeType, OptionS3ACL): TYPE_CHECKER = copy(Option.TYPE_CHECKER) TYPE_CHECKER["mimetype"] = 
OptionMimeType.check_mimetype TYPE_CHECKER["s3acl"] = OptionS3ACL.check_s3acl TYPES = Option.TYPES + ("mimetype", "s3acl") class MyHelpFormatter(IndentedHelpFormatter): def format_epilog(self, epilog): if epilog: return "\n" + epilog + "\n" else: return "" def main(): global cfg commands_list = get_commands_list() commands = {} ## Populate "commands" from "commands_list" for cmd in commands_list: if cmd.has_key("cmd"): commands[cmd["cmd"]] = cmd default_verbosity = Config().verbosity optparser = OptionParser(option_class=OptionAll, formatter=MyHelpFormatter()) #optparser.disable_interspersed_args() config_file = None if os.getenv("HOME"): config_file = os.path.join(os.getenv("HOME"), ".s3cfg") elif os.name == "nt" and os.getenv("USERPROFILE"): config_file = os.path.join(os.getenv("USERPROFILE").decode('mbcs'), "Application Data", "s3cmd.ini") preferred_encoding = locale.getpreferredencoding() or "UTF-8" optparser.set_defaults(encoding = preferred_encoding) optparser.set_defaults(config = config_file) optparser.set_defaults(verbosity = default_verbosity) optparser.add_option( "--configure", dest="run_configure", action="store_true", help="Invoke interactive (re)configuration tool. Optionally use as '--configure s3://some-bucket' to test access to a specific bucket instead of attempting to list them all.") optparser.add_option("-c", "--config", dest="config", metavar="FILE", help="Config file name. Defaults to %default") optparser.add_option( "--dump-config", dest="dump_config", action="store_true", help="Dump current configuration after parsing config files and command line options and exit.") optparser.add_option("-n", "--dry-run", dest="dry_run", action="store_true", help="Only show what should be uploaded or downloaded but don't actually do it.
May still perform S3 requests to get bucket listings and other information though (only for file transfer commands)") optparser.add_option("-e", "--encrypt", dest="encrypt", action="store_true", help="Encrypt files before uploading to S3.") optparser.add_option( "--no-encrypt", dest="encrypt", action="store_false", help="Don't encrypt files.") optparser.add_option("-f", "--force", dest="force", action="store_true", help="Force overwrite and other dangerous operations.") optparser.add_option( "--continue", dest="get_continue", action="store_true", help="Continue getting a partially downloaded file (only for [get] command).") optparser.add_option( "--skip-existing", dest="skip_existing", action="store_true", help="Skip over files that exist at the destination (only for [get] and [sync] commands).") optparser.add_option("-r", "--recursive", dest="recursive", action="store_true", help="Recursive upload, download or removal.") optparser.add_option( "--check-md5", dest="check_md5", action="store_true", help="Check MD5 sums when comparing files for [sync]. (default)") optparser.add_option( "--no-check-md5", dest="check_md5", action="store_false", help="Do not check MD5 sums when comparing files for [sync]. Only size will be compared. May significantly speed up transfer but may also miss some changed files.") optparser.add_option("-P", "--acl-public", dest="acl_public", action="store_true", help="Store objects with ACL allowing read for anyone.") optparser.add_option( "--acl-private", dest="acl_public", action="store_false", help="Store objects with default ACL allowing access for you only.") optparser.add_option( "--acl-grant", dest="acl_grants", type="s3acl", action="append", metavar="PERMISSION:EMAIL or USER_CANONICAL_ID", help="Grant stated permission to a given amazon user. 
Permission is one of: read, write, read_acp, write_acp, full_control, all") optparser.add_option( "--acl-revoke", dest="acl_revokes", type="s3acl", action="append", metavar="PERMISSION:USER_CANONICAL_ID", help="Revoke stated permission for a given amazon user. Permission is one of: read, write, read_acp, write_acp, full_control, all") optparser.add_option( "--delete-removed", dest="delete_removed", action="store_true", help="Delete remote objects with no corresponding local file [sync]") optparser.add_option( "--no-delete-removed", dest="delete_removed", action="store_false", help="Don't delete remote objects.") optparser.add_option("-p", "--preserve", dest="preserve_attrs", action="store_true", help="Preserve filesystem attributes (mode, ownership, timestamps). Default for [sync] command.") optparser.add_option( "--no-preserve", dest="preserve_attrs", action="store_false", help="Don't store FS attributes") optparser.add_option( "--exclude", dest="exclude", action="append", metavar="GLOB", help="Filenames and paths matching GLOB will be excluded from sync") optparser.add_option( "--exclude-from", dest="exclude_from", action="append", metavar="FILE", help="Read --exclude GLOBs from FILE") optparser.add_option( "--rexclude", dest="rexclude", action="append", metavar="REGEXP", help="Filenames and paths matching REGEXP (regular expression) will be excluded from sync") optparser.add_option( "--rexclude-from", dest="rexclude_from", action="append", metavar="FILE", help="Read --rexclude REGEXPs from FILE") optparser.add_option( "--include", dest="include", action="append", metavar="GLOB", help="Filenames and paths matching GLOB will be included even if previously excluded by one of --(r)exclude(-from) patterns") optparser.add_option( "--include-from", dest="include_from", action="append", metavar="FILE", help="Read --include GLOBs from FILE") optparser.add_option( "--rinclude", dest="rinclude", action="append", metavar="REGEXP", help="Same as --include but uses REGEXP
(regular expression) instead of GLOB") optparser.add_option( "--rinclude-from", dest="rinclude_from", action="append", metavar="FILE", help="Read --rinclude REGEXPs from FILE") optparser.add_option( "--bucket-location", dest="bucket_location", help="Datacentre to create bucket in. As of now the datacenters are: US (default), EU, us-west-1, and ap-southeast-1") optparser.add_option( "--reduced-redundancy", "--rr", dest="reduced_redundancy", action="store_true", help="Store object with 'Reduced redundancy'. Lower per-GB price. [put, cp, mv]") optparser.add_option( "--access-logging-target-prefix", dest="log_target_prefix", help="Target prefix for access logs (S3 URI) (for [cfmodify] and [accesslog] commands)") optparser.add_option( "--no-access-logging", dest="log_target_prefix", action="store_false", help="Disable access logging (for [cfmodify] and [accesslog] commands)") optparser.add_option( "--default-mime-type", dest="default_mime_type", action="store_true", help="Default MIME-type for stored objects. Application default is binary/octet-stream.") optparser.add_option( "--guess-mime-type", dest="guess_mime_type", action="store_true", help="Guess MIME-type of files by their extension or mime magic. Fall back to default MIME-Type as specified by --default-mime-type option") optparser.add_option( "--no-guess-mime-type", dest="guess_mime_type", action="store_false", help="Don't guess MIME-type and use the default type instead.") optparser.add_option("-m", "--mime-type", dest="mime_type", type="mimetype", metavar="MIME/TYPE", help="Force MIME-type. Override both --default-mime-type and --guess-mime-type.") optparser.add_option( "--add-header", dest="add_header", action="append", metavar="NAME:VALUE", help="Add a given HTTP header to the upload request. Can be used multiple times. 
For instance set 'Expires' or 'Cache-Control' headers (or both) using this option if you like.") optparser.add_option( "--encoding", dest="encoding", metavar="ENCODING", help="Override autodetected terminal and filesystem encoding (character set). Autodetected: %s" % preferred_encoding) optparser.add_option( "--verbatim", dest="urlencoding_mode", action="store_const", const="verbatim", help="Use the S3 name as given on the command line. No pre-processing, encoding, etc. Use with caution!") optparser.add_option( "--disable-multipart", dest="enable_multipart", action="store_false", help="Disable multipart upload on files bigger than --multipart-chunk-size-mb") optparser.add_option( "--multipart-chunk-size-mb", dest="multipart_chunk_size_mb", type="int", action="store", metavar="SIZE", help="Size of each chunk of a multipart upload. Files bigger than SIZE are automatically uploaded as multithreaded-multipart, smaller files are uploaded using the traditional method. SIZE is in Mega-Bytes, default chunk size is %defaultMB, minimum allowed chunk size is 5MB, maximum is 5GB.") optparser.add_option( "--list-md5", dest="list_md5", action="store_true", help="Include MD5 sums in bucket listings (only for 'ls' command).") optparser.add_option("-H", "--human-readable-sizes", dest="human_readable_sizes", action="store_true", help="Print sizes in human readable form (eg 1kB instead of 1234).") optparser.add_option( "--ws-index", dest="website_index", action="store", help="Name of index-document (only for [ws-create] command)") optparser.add_option( "--ws-error", dest="website_error", action="store", help="Name of error-document (only for [ws-create] command)") optparser.add_option( "--progress", dest="progress_meter", action="store_true", help="Display progress meter (default on TTY).") optparser.add_option( "--no-progress", dest="progress_meter", action="store_false", help="Don't display progress meter (default on non-TTY).") optparser.add_option( "--enable", dest="enable",
action="store_true", help="Enable given CloudFront distribution (only for [cfmodify] command)") optparser.add_option( "--disable", dest="enable", action="store_false", help="Disable given CloudFront distribution (only for [cfmodify] command)") optparser.add_option( "--cf-invalidate", dest="invalidate_on_cf", action="store_true", help="Invalidate the uploaded files in CloudFront. Also see [cfinval] command.") optparser.add_option( "--cf-add-cname", dest="cf_cnames_add", action="append", metavar="CNAME", help="Add given CNAME to a CloudFront distribution (only for [cfcreate] and [cfmodify] commands)") optparser.add_option( "--cf-remove-cname", dest="cf_cnames_remove", action="append", metavar="CNAME", help="Remove given CNAME from a CloudFront distribution (only for [cfmodify] command)") optparser.add_option( "--cf-comment", dest="cf_comment", action="store", metavar="COMMENT", help="Set COMMENT for a given CloudFront distribution (only for [cfcreate] and [cfmodify] commands)") optparser.add_option( "--cf-default-root-object", dest="cf_default_root_object", action="store", metavar="DEFAULT_ROOT_OBJECT", help="Set the default root object to return when no object is specified in the URL. Use a relative path, i.e. default/index.html instead of /default/index.html or s3://bucket/default/index.html (only for [cfcreate] and [cfmodify] commands)") optparser.add_option("-v", "--verbose", dest="verbosity", action="store_const", const=logging.INFO, help="Enable verbose output.") optparser.add_option("-d", "--debug", dest="verbosity", action="store_const", const=logging.DEBUG, help="Enable debug output.") optparser.add_option( "--version", dest="show_version", action="store_true", help="Show s3cmd version (%s) and exit."
% (PkgInfo.version)) optparser.add_option("-F", "--follow-symlinks", dest="follow_symlinks", action="store_true", default=False, help="Follow symbolic links as if they are regular files") optparser.set_usage(optparser.usage + " COMMAND [parameters]") optparser.set_description('S3cmd is a tool for managing objects in '+ 'Amazon S3 storage. It allows for making and removing '+ '"buckets" and uploading, downloading and removing '+ '"objects" from these buckets.') optparser.epilog = format_commands(optparser.get_prog_name(), commands_list) optparser.epilog += ("\nFor more information see the project homepage:\n%s\n" % PkgInfo.url) optparser.epilog += ("\nConsider a donation if you have found s3cmd useful:\n%s/donate\n" % PkgInfo.url) (options, args) = optparser.parse_args() ## Some mucking with logging levels to enable ## debugging/verbose output for config file parser on request logging.basicConfig(level=options.verbosity, format='%(levelname)s: %(message)s', stream = sys.stderr) if options.show_version: output(u"s3cmd version %s" % PkgInfo.version) sys.exit(0) ## Now finally parse the config file if not options.config: error(u"Can't find a config file.
Please use --config option.") sys.exit(1) try: cfg = Config(options.config) except IOError, e: if options.run_configure: cfg = Config() else: error(u"%s: %s" % (options.config, e.strerror)) error(u"Configuration file not available.") error(u"Consider using --configure parameter to create one.") sys.exit(1) ## And again some logging level adjustments ## according to configfile and command line parameters if options.verbosity != default_verbosity: cfg.verbosity = options.verbosity logging.root.setLevel(cfg.verbosity) ## Default to --progress on TTY devices, --no-progress elsewhere ## Can be overridden by actual --(no-)progress parameter cfg.update_option('progress_meter', sys.stdout.isatty()) ## Unsupported features on Win32 platform if os.name == "nt": if cfg.preserve_attrs: error(u"Option --preserve is not yet supported on MS Windows platform. Assuming --no-preserve.") cfg.preserve_attrs = False if cfg.progress_meter: error(u"Option --progress is not yet supported on MS Windows platform.
Assuming --no-progress.") cfg.progress_meter = False ## Pre-process --add-header's and put them to Config.extra_headers SortedDict() if options.add_header: for hdr in options.add_header: try: key, val = hdr.split(":", 1) except ValueError: raise ParameterError("Invalid header format: %s" % hdr) key_inval = re.sub("[a-zA-Z0-9-.]", "", key) if key_inval: key_inval = key_inval.replace(" ", "") key_inval = key_inval.replace("\t", "") raise ParameterError("Invalid character(s) in header name '%s': \"%s\"" % (key, key_inval)) debug(u"Updating Config.Config extra_headers[%s] -> %s" % (key.strip(), val.strip())) cfg.extra_headers[key.strip()] = val.strip() ## --acl-grant/--acl-revoke arguments are pre-parsed by OptionS3ACL() if options.acl_grants: for grant in options.acl_grants: cfg.acl_grants.append(grant) if options.acl_revokes: for grant in options.acl_revokes: cfg.acl_revokes.append(grant) ## Process --(no-)check-md5 if options.check_md5 == False: try: cfg.sync_checks.remove("md5") except Exception: pass if options.check_md5 == True and cfg.sync_checks.count("md5") == 0: cfg.sync_checks.append("md5") ## Update Config with other parameters for option in cfg.option_list(): try: if getattr(options, option) != None: debug(u"Updating Config.Config %s -> %s" % (option, getattr(options, option))) cfg.update_option(option, getattr(options, option)) except AttributeError: ## Some Config() options are not settable from command line pass ## Special handling for tri-state options (True, False, None) cfg.update_option("enable", options.enable) cfg.update_option("acl_public", options.acl_public) ## Check multipart chunk constraints if cfg.multipart_chunk_size_mb < MultiPartUpload.MIN_CHUNK_SIZE_MB: raise ParameterError("Chunk size %d MB is too small, must be >= %d MB. 
Please adjust --multipart-chunk-size-mb" % (cfg.multipart_chunk_size_mb, MultiPartUpload.MIN_CHUNK_SIZE_MB)) if cfg.multipart_chunk_size_mb > MultiPartUpload.MAX_CHUNK_SIZE_MB: raise ParameterError("Chunk size %d MB is too large, must be <= %d MB. Please adjust --multipart-chunk-size-mb" % (cfg.multipart_chunk_size_mb, MultiPartUpload.MAX_CHUNK_SIZE_MB)) ## CloudFront's cf_enable and Config's enable share the same --enable switch options.cf_enable = options.enable ## CloudFront's cf_logging and Config's log_target_prefix share the same --log-target-prefix switch options.cf_logging = options.log_target_prefix ## Update CloudFront options if some were set for option in CfCmd.options.option_list(): try: if getattr(options, option) != None: debug(u"Updating CloudFront.Cmd %s -> %s" % (option, getattr(options, option))) CfCmd.options.update_option(option, getattr(options, option)) except AttributeError: ## Some CloudFront.Cmd.Options() options are not settable from command line pass ## Set output and filesystem encoding for printing out filenames. 
sys.stdout = codecs.getwriter(cfg.encoding)(sys.stdout, "replace") sys.stderr = codecs.getwriter(cfg.encoding)(sys.stderr, "replace") ## Process --exclude and --exclude-from patterns_list, patterns_textual = process_patterns(options.exclude, options.exclude_from, is_glob = True, option_txt = "exclude") cfg.exclude.extend(patterns_list) cfg.debug_exclude.update(patterns_textual) ## Process --rexclude and --rexclude-from patterns_list, patterns_textual = process_patterns(options.rexclude, options.rexclude_from, is_glob = False, option_txt = "rexclude") cfg.exclude.extend(patterns_list) cfg.debug_exclude.update(patterns_textual) ## Process --include and --include-from patterns_list, patterns_textual = process_patterns(options.include, options.include_from, is_glob = True, option_txt = "include") cfg.include.extend(patterns_list) cfg.debug_include.update(patterns_textual) ## Process --rinclude and --rinclude-from patterns_list, patterns_textual = process_patterns(options.rinclude, options.rinclude_from, is_glob = False, option_txt = "rinclude") cfg.include.extend(patterns_list) cfg.debug_include.update(patterns_textual) ## Set socket read()/write() timeout socket.setdefaulttimeout(cfg.socket_timeout) if cfg.encrypt and cfg.gpg_passphrase == "": error(u"Encryption requested but no passphrase set in config file.") error(u"Please re-run 's3cmd --configure' and supply it.") sys.exit(1) if options.dump_config: cfg.dump_config(sys.stdout) sys.exit(0) if options.run_configure: # 'args' may contain the test-bucket URI run_configure(options.config, args) sys.exit(0) if len(args) < 1: error(u"Missing command. Please run with --help for more information.") sys.exit(1) ## Unicodise all remaining arguments: args = [unicodise(arg) for arg in args] command = args.pop(0) try: debug(u"Command: %s" % commands[command]["cmd"]) ## We must do this lookup in extra step to ## avoid catching all KeyError exceptions ## from inner functions. 
cmd_func = commands[command]["func"] except KeyError, e: error(u"Invalid command: %s" % e) sys.exit(1) if len(args) < commands[command]["argc"]: error(u"Not enough parameters for command '%s'" % command) sys.exit(1) try: cmd_func(args) except S3Error, e: error(u"S3 error: %s" % e) sys.exit(1) def report_exception(e): sys.stderr.write(""" !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! An unexpected error has occurred. Please report the following lines to: s3tools-bugs@lists.sourceforge.net !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! """) tb = traceback.format_exc(sys.exc_info()) e_class = str(e.__class__) e_class = e_class[e_class.rfind(".")+1 : -2] sys.stderr.write(u"Problem: %s: %s\n" % (e_class, e)) try: sys.stderr.write("S3cmd: %s\n" % PkgInfo.version) except NameError: sys.stderr.write("S3cmd: unknown version. Module import problem?\n") sys.stderr.write("\n") sys.stderr.write(unicode(tb, errors="replace")) if type(e) == ImportError: sys.stderr.write("\n") sys.stderr.write("Your sys.path contains these entries:\n") for path in sys.path: sys.stderr.write(u"\t%s\n" % path) sys.stderr.write("Now the question is where have the s3cmd modules been installed?\n") sys.stderr.write(""" !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! An unexpected error has occurred. Please report the above lines to: s3tools-bugs@lists.sourceforge.net !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
""") if __name__ == '__main__': try: ## Our modules ## Keep them in try/except block to ## detect any syntax errors in there from S3.Exceptions import * from S3 import PkgInfo from S3.S3 import S3 from S3.Config import Config from S3.SortedDict import SortedDict from S3.S3Uri import S3Uri from S3 import Utils from S3.Utils import * from S3.Progress import Progress from S3.CloudFront import Cmd as CfCmd from S3.CloudFront import CloudFront from S3.FileLists import * from S3.MultiPart import MultiPartUpload main() sys.exit(0) except ImportError, e: report_exception(e) sys.exit(1) except ParameterError, e: error(u"Parameter problem: %s" % e) sys.exit(1) except SystemExit, e: sys.exit(e.code) except KeyboardInterrupt: sys.stderr.write("See ya!\n") sys.exit(1) except Exception, e: report_exception(e) sys.exit(1) # vim:et:ts=4:sts=4:ai s3cmd-1.1.0-beta3/setup.cfg0000644000175000001440000000003411615700335016163 0ustar mludvigusers00000000000000[sdist] formats = gztar,zip s3cmd-1.1.0-beta3/README0000644000175000001440000003151211615700335015227 0ustar mludvigusers00000000000000S3cmd tool for Amazon Simple Storage Service (S3) ================================================= Author: Michal Ludvig S3tools / S3cmd project homepage: http://s3tools.org S3tools / S3cmd mailing lists: * Announcements of new releases: s3tools-announce@lists.sourceforge.net * General questions and discussion about usage s3tools-general@lists.sourceforge.net * Bug reports s3tools-bugs@lists.sourceforge.net Amazon S3 homepage: http://aws.amazon.com/s3 !!! !!! Please consult INSTALL file for installation instructions! !!! What is Amazon S3 ----------------- Amazon S3 provides a managed internet-accessible storage service where anyone can store any amount of data and retrieve it later again. Maximum amount of data in one "object" is 5GB, maximum number of objects is not limited. S3 is a paid service operated by the well known Amazon.com internet book shop. 
Before storing anything into S3 you must sign up for an "AWS" account (where AWS = Amazon Web Services) to obtain a pair of identifiers: Access Key and Secret Key. You will need to give these keys to S3cmd. Think of them as if they were a username and password for your S3 account. Pricing explained ----------------- At the time of this writing the costs of using S3 are (in USD): $0.15 per GB per month of storage space used plus $0.10 per GB - all data uploaded plus $0.18 per GB - first 10 TB / month data downloaded $0.16 per GB - next 40 TB / month data downloaded $0.13 per GB - data downloaded / month over 50 TB plus $0.01 per 1,000 PUT or LIST requests $0.01 per 10,000 GET and all other requests If for instance on 1st of January you upload 2GB of photos in JPEG from your holiday in New Zealand, at the end of January you will be charged $0.30 for using 2GB of storage space for a month, $0.20 for uploading 2GB of data, and a few cents for requests. That comes to slightly over $0.50 for a complete backup of your precious holiday pictures. In February you don't touch it. Your data are still on S3 servers so you pay $0.30 for those two gigabytes, but not a single cent will be charged for any transfer. That comes to $0.30 as an ongoing cost of your backup. Not too bad. In March you allow anonymous read access to some of your pictures and your friends download, say, 500MB of them. As the files are owned by you, you are responsible for the costs incurred. That means at the end of March you'll be charged $0.30 for storage plus $0.09 for the download traffic generated by your friends. There is no minimum monthly contract or a setup fee. What you use is what you pay for. At the beginning my bill used to be like US$0.03 or even nil. That's the pricing model of Amazon S3 in a nutshell. Check Amazon S3 homepage at http://aws.amazon.com/s3 for more details. 
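The arithmetic in the examples above can be reproduced with a few lines of Python. This is only a rough sketch: the function name monthly_cost is made up for illustration, it uses the rates quoted in this section (which may well have changed since), it ignores the per-request charges, and it assumes that less than 10 TB is downloaded in the month.

```python
from decimal import Decimal

# Rates as quoted in this README (USD); treat them as examples only.
STORAGE_PER_GB_MONTH = Decimal("0.15")
UPLOAD_PER_GB = Decimal("0.10")
DOWNLOAD_PER_GB = Decimal("0.18")   # first 10 TB / month tier only


def monthly_cost(stored_gb, uploaded_gb=0, downloaded_gb=0):
    """Estimate one month's bill, leaving out the per-request charges.

    Hypothetical helper, not part of s3cmd. Only valid while the
    monthly download volume stays within the first pricing tier.
    """
    stored = Decimal(str(stored_gb)) * STORAGE_PER_GB_MONTH
    uploaded = Decimal(str(uploaded_gb)) * UPLOAD_PER_GB
    downloaded = Decimal(str(downloaded_gb)) * DOWNLOAD_PER_GB
    return stored + uploaded + downloaded


january = monthly_cost(2, uploaded_gb=2)       # store and upload 2 GB
february = monthly_cost(2)                     # storage only
march = monthly_cost(2, downloaded_gb=0.5)     # friends fetch 500 MB
print("Jan: $%s  Feb: $%s  Mar: $%s" % (january, february, march))
```

Decimal is used instead of floats so that the cents add up exactly.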
Needless to say, all this money is charged by Amazon itself; there is obviously no payment for using S3cmd :-) Amazon S3 basics ---------------- Files stored in S3 are called "objects" and their names are officially called "keys". Since this is sometimes confusing for the users we often refer to the objects as "files" or "remote files". Each object belongs to exactly one "bucket". To describe objects in S3 storage we invented a URI-like schema in the following form: s3://BUCKET or s3://BUCKET/OBJECT Buckets ------- Buckets are sort of like directories or folders with some restrictions: 1) each user can only have 100 buckets at the most, 2) bucket names must be unique amongst all users of S3, 3) buckets can not be nested into a deeper hierarchy and 4) a name of a bucket can only consist of basic alphanumeric characters plus dot (.) and dash (-). No spaces, no accented or UTF-8 letters, etc. It is a good idea to use DNS-compatible bucket names. That for instance means you should not use upper case characters. While DNS compliance is not strictly required, some features described below are not available for buckets with DNS-incompatible names. One step further is using a fully qualified domain name (FQDN) for a bucket - that has even more benefits. * For example "s3://--My-Bucket--" is not DNS compatible. * On the other hand "s3://my-bucket" is DNS compatible but is not FQDN. * Finally "s3://my-bucket.s3tools.org" is DNS compatible and FQDN provided you own the s3tools.org domain and can create the domain record for "my-bucket.s3tools.org". Look for "Virtual Hosts" later in this text for more details regarding FQDN named buckets. Objects (files stored in Amazon S3) ----------------------------------- Unlike for buckets there are almost no restrictions on object names. These can be any UTF-8 strings up to 1024 bytes long. Interestingly enough the object name can contain the forward slash character (/), thus "my/funny/picture.jpg" is a valid object name.
Note that there are no directories or buckets called "my" and "funny" - it is really a single object name called "my/funny/picture.jpg" and S3 does not care at all that it _looks_ like a directory structure. The full URI of such an image could be, for example: s3://my-bucket/my/funny/picture.jpg Public vs Private files ----------------------- The files stored in S3 can be either Private or Public. The Private ones are readable only by the user who uploaded them while the Public ones can be read by anyone. Additionally the Public files can be accessed using HTTP protocol, not only using s3cmd or a similar tool. The ACL (Access Control List) of a file can be set at the time of upload using --acl-public or --acl-private options with 's3cmd put' or 's3cmd sync' commands (see below). Alternatively the ACL can be altered for existing remote files with 's3cmd setacl --acl-public' (or --acl-private) command. Simple s3cmd HowTo ------------------ 1) Register for Amazon AWS / S3 Go to http://aws.amazon.com/s3, click the "Sign up for web service" button in the right column and work through the registration. You will have to supply your Credit Card details in order to allow Amazon to charge you for S3 usage. At the end you should have your Access and Secret Keys. 2) Run "s3cmd --configure" You will be asked for the two keys - copy and paste them from your confirmation email or from your Amazon account page. Be careful when copying them! They are case sensitive and must be entered accurately or you'll keep getting errors about invalid signatures or similar. 3) Run "s3cmd ls" to list all your buckets. As you just started using S3 there are no buckets owned by you as of now. So the output will be empty. 4) Make a bucket with "s3cmd mb s3://my-new-bucket-name" As mentioned above the bucket names must be unique amongst _all_ users of S3. That means the simple names like "test" or "asdf" are already taken and you must make up something more original.
To demonstrate as many features as possible let's create a FQDN-named bucket s3://public.s3tools.org: ~$ s3cmd mb s3://public.s3tools.org Bucket 's3://public.s3tools.org' created 5) List your buckets again with "s3cmd ls" Now you should see your freshly created bucket ~$ s3cmd ls 2009-01-28 12:34 s3://public.s3tools.org 6) List the contents of the bucket ~$ s3cmd ls s3://public.s3tools.org ~$ It's empty, indeed. 7) Upload a single file into the bucket: ~$ s3cmd put some-file.xml s3://public.s3tools.org/somefile.xml some-file.xml -> s3://public.s3tools.org/somefile.xml [1 of 1] 123456 of 123456 100% in 2s 51.75 kB/s done Upload two directory trees into the bucket's virtual 'directory': ~$ s3cmd put --recursive dir1 dir2 s3://public.s3tools.org/somewhere/ File 'dir1/file1-1.txt' stored as 's3://public.s3tools.org/somewhere/dir1/file1-1.txt' [1 of 5] File 'dir1/file1-2.txt' stored as 's3://public.s3tools.org/somewhere/dir1/file1-2.txt' [2 of 5] File 'dir1/file1-3.log' stored as 's3://public.s3tools.org/somewhere/dir1/file1-3.log' [3 of 5] File 'dir2/file2-1.bin' stored as 's3://public.s3tools.org/somewhere/dir2/file2-1.bin' [4 of 5] File 'dir2/file2-2.txt' stored as 's3://public.s3tools.org/somewhere/dir2/file2-2.txt' [5 of 5] As you can see we didn't have to create the /somewhere 'directory'. In fact it's only a filename prefix, not a real directory and it doesn't have to be created in any way beforehand.
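To illustrate the "it's only a filename prefix" point, here is a tiny Python sketch that reproduces the object names from the upload listing above. It is not how s3cmd is implemented internally; the function name remote_uri is invented for this example, and it simply assumes that the destination ends with a slash and that local paths are relative.

```python
def remote_uri(local_path, destination):
    """Return the S3 URI a local file ends up under (illustration only).

    The destination is treated as a plain string prefix: nothing called
    'somewhere/' is created, it just becomes part of each object name.
    """
    return destination + local_path


destination = "s3://public.s3tools.org/somewhere/"
local_files = [
    "dir1/file1-1.txt",
    "dir1/file1-2.txt",
    "dir1/file1-3.log",
    "dir2/file2-1.bin",
    "dir2/file2-2.txt",
]

for number, path in enumerate(local_files, 1):
    print("File '%s' stored as '%s' [%d of %d]"
          % (path, remote_uri(path, destination), number, len(local_files)))
```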
8) Now list the bucket contents again:

   ~$ s3cmd ls s3://public.s3tools.org
                          DIR   s3://public.s3tools.org/somewhere/
   2009-02-10 05:10    123456   s3://public.s3tools.org/somefile.xml

   Use --recursive (or -r) to list all the remote files:

   ~$ s3cmd ls --recursive s3://public.s3tools.org
   2009-02-10 05:10    123456   s3://public.s3tools.org/somefile.xml
   2009-02-10 05:13        18   s3://public.s3tools.org/somewhere/dir1/file1-1.txt
   2009-02-10 05:13         8   s3://public.s3tools.org/somewhere/dir1/file1-2.txt
   2009-02-10 05:13        16   s3://public.s3tools.org/somewhere/dir1/file1-3.log
   2009-02-10 05:13        11   s3://public.s3tools.org/somewhere/dir2/file2-1.bin
   2009-02-10 05:13         8   s3://public.s3tools.org/somewhere/dir2/file2-2.txt

9) Retrieve one of the files back and verify that it hasn't been
   corrupted:

   ~$ s3cmd get s3://public.s3tools.org/somefile.xml some-file-2.xml
   s3://public.s3tools.org/somefile.xml -> some-file-2.xml  [1 of 1]
    123456 of 123456   100% in    3s    35.75 kB/s  done

   ~$ md5sum some-file.xml some-file-2.xml
   39bcb6992e461b269b95b3bda303addf  some-file.xml
   39bcb6992e461b269b95b3bda303addf  some-file-2.xml

   The checksum of the original file matches that of the
   retrieved one. Looks like it worked :-)

   To retrieve a whole 'directory tree' from S3 use recursive get:

   ~$ s3cmd get --recursive s3://public.s3tools.org/somewhere
   File s3://public.s3tools.org/somewhere/dir1/file1-1.txt saved as './somewhere/dir1/file1-1.txt'
   File s3://public.s3tools.org/somewhere/dir1/file1-2.txt saved as './somewhere/dir1/file1-2.txt'
   File s3://public.s3tools.org/somewhere/dir1/file1-3.log saved as './somewhere/dir1/file1-3.log'
   File s3://public.s3tools.org/somewhere/dir2/file2-1.bin saved as './somewhere/dir2/file2-1.bin'
   File s3://public.s3tools.org/somewhere/dir2/file2-2.txt saved as './somewhere/dir2/file2-2.txt'

   Since the destination directory wasn't specified s3cmd
   saved the directory structure in the current working
   directory ('.').

   There is an important difference between:
      get s3://public.s3tools.org/somewhere
   and
      get s3://public.s3tools.org/somewhere/
   (note the trailing slash)
   S3cmd always uses the last path part, i.e. the word
   after the last slash, for naming files.

   In the case of s3://.../somewhere the last path part is
   'somewhere' and therefore the recursive get names
   the local files as somewhere/dir1, somewhere/dir2, etc.

   On the other hand in s3://.../somewhere/ the last path
   part is empty and s3cmd will only create 'dir1' and 'dir2'
   without the 'somewhere/' prefix:

   ~$ s3cmd get --recursive s3://public.s3tools.org/somewhere/ /tmp
   File s3://public.s3tools.org/somewhere/dir1/file1-1.txt saved as '/tmp/dir1/file1-1.txt'
   File s3://public.s3tools.org/somewhere/dir1/file1-2.txt saved as '/tmp/dir1/file1-2.txt'
   File s3://public.s3tools.org/somewhere/dir1/file1-3.log saved as '/tmp/dir1/file1-3.log'
   File s3://public.s3tools.org/somewhere/dir2/file2-1.bin saved as '/tmp/dir2/file2-1.bin'

   See? It's /tmp/dir1 and not /tmp/somewhere/dir1 as it
   was in the previous example.

10) Clean up - delete the remote files and remove the bucket:

   Remove everything under s3://public.s3tools.org/somewhere/

   ~$ s3cmd del --recursive s3://public.s3tools.org/somewhere/
   File s3://public.s3tools.org/somewhere/dir1/file1-1.txt deleted
   File s3://public.s3tools.org/somewhere/dir1/file1-2.txt deleted
   ...

   Now try to remove the bucket:

   ~$ s3cmd rb s3://public.s3tools.org
   ERROR: S3 error: 409 (BucketNotEmpty): The bucket you tried to delete is not empty

   Ouch, we forgot about s3://public.s3tools.org/somefile.xml
   We can force the bucket removal anyway:

   ~$ s3cmd rb --force s3://public.s3tools.org/
   WARNING: Bucket is not empty. Removing all the objects from it first. This may take some time...
   File s3://public.s3tools.org/somefile.xml deleted
   Bucket 's3://public.s3tools.org/' removed

Hints
-----
The basic usage is as simple as described in the previous section.
You can increase the level of verbosity with the -v option and
if you're really keen to know what the program does under
its bonnet run it with -d to see all 'debugging' output.

After configuring it with --configure all available options
are written into your ~/.s3cfg file. It's a text file ready
to be modified in your favourite text editor.

For more information refer to:
* S3cmd / S3tools homepage at http://s3tools.org
* Amazon S3 homepage at http://aws.amazon.com/s3

Enjoy!

Michal Ludvig
* michal@logix.cz
* http://www.logix.cz/michal

s3cmd-1.1.0-beta3/S3/Utils.py
## Amazon S3 manager
## Author: Michal Ludvig
##         http://www.logix.cz/michal
## License: GPL Version 2

import os
import sys
import time
import re
import string
import random
import rfc822
import hmac
import base64
import errno

from logging import debug, info, warning, error

import Config
import Exceptions

# hashlib backported to python 2.4 / 2.5 is not compatible with hmac!
if sys.version_info[0] == 2 and sys.version_info[1] < 6:
    from md5 import md5
    import sha as sha1
else:
    from hashlib import md5, sha1

try:
    import xml.etree.ElementTree as ET
except ImportError:
    import elementtree.ElementTree as ET
from xml.parsers.expat import ExpatError

__all__ = []
def parseNodes(nodes):
    ## WARNING: Ignores text nodes from mixed xml/text.
    ## For instance in <tag1>some text<tag2>other text</tag2></tag1>
    ## the "some text" node will be ignored
    retval = []
    for node in nodes:
        retval_item = {}
        for child in node.getchildren():
            name = child.tag
            if child.getchildren():
                retval_item[name] = parseNodes([child])
            else:
                retval_item[name] = node.findtext(".//%s" % child.tag)
        retval.append(retval_item)
    return retval
__all__.append("parseNodes")

def stripNameSpace(xml):
    """
    stripNameSpace(xml) -- remove top-level AWS namespace
    """
    r = re.compile('^(<?[^>]+?>\s?)(<\w+) xmlns=[\'"](http://[^\'"]+)[\'"](.*)', re.MULTILINE)
    if r.match(xml):
        xmlns = r.match(xml).groups()[2]
        xml = r.sub("\\1\\2\\4", xml)
    else:
        xmlns = None
    return xml, xmlns
__all__.append("stripNameSpace")

def getTreeFromXml(xml):
    xml, xmlns = stripNameSpace(xml)
    try:
        tree = ET.fromstring(xml)
        if xmlns:
            tree.attrib['xmlns'] = xmlns
        return tree
    except ExpatError, e:
        error(e)
        raise Exceptions.ParameterError("Bucket contains invalid filenames. Please run: s3cmd fixbucket s3://your-bucket/")
__all__.append("getTreeFromXml")

def getListFromXml(xml, node):
    tree = getTreeFromXml(xml)
    nodes = tree.findall('.//%s' % (node))
    return parseNodes(nodes)
__all__.append("getListFromXml")

def getDictFromTree(tree):
    ret_dict = {}
    for child in tree.getchildren():
        if child.getchildren():
            ## Complex-type child. Recurse
            content = getDictFromTree(child)
        else:
            content = child.text
        if ret_dict.has_key(child.tag):
            ## Repeated tag: collect all values in a list
            if not type(ret_dict[child.tag]) == list:
                ret_dict[child.tag] = [ret_dict[child.tag]]
            ret_dict[child.tag].append(content or "")
        else:
            ret_dict[child.tag] = content or ""
    return ret_dict
__all__.append("getDictFromTree")

def getTextFromXml(xml, xpath):
    tree = getTreeFromXml(xml)
    if tree.tag.endswith(xpath):
        return tree.text
    else:
        return tree.findtext(xpath)
__all__.append("getTextFromXml")

def getRootTagName(xml):
    tree = getTreeFromXml(xml)
    return tree.tag
__all__.append("getRootTagName")

def xmlTextNode(tag_name, text):
    el = ET.Element(tag_name)
    el.text = unicode(text)
    return el
__all__.append("xmlTextNode")

def appendXmlTextNode(tag_name, text, parent):
    """
    Creates a new Node and sets its content to 'text'.
    Then appends the created Node to the 'parent' element.
    Returns the newly created Node.
    """
    el = xmlTextNode(tag_name, text)
    parent.append(el)
    return el
__all__.append("appendXmlTextNode")

def dateS3toPython(date):
    date = re.compile("(\.\d*)?Z").sub(".000Z", date)
    return time.strptime(date, "%Y-%m-%dT%H:%M:%S.000Z")
__all__.append("dateS3toPython")

def dateS3toUnix(date):
    ## FIXME: This should be timezone-aware.
    ## Currently the argument to strptime() is GMT but mktime()
    ## treats it as "localtime". Anyway...
    return time.mktime(dateS3toPython(date))
__all__.append("dateS3toUnix")

def dateRFC822toPython(date):
    return rfc822.parsedate(date)
__all__.append("dateRFC822toPython")

def dateRFC822toUnix(date):
    return time.mktime(dateRFC822toPython(date))
__all__.append("dateRFC822toUnix")

def formatSize(size, human_readable = False, floating_point = False):
    size = floating_point and float(size) or int(size)
    if human_readable:
        coeffs = ['k', 'M', 'G', 'T']
        coeff = ""
        while size > 2048:
            size /= 1024
            coeff = coeffs.pop(0)
        return (size, coeff)
    else:
        return (size, "")
__all__.append("formatSize")

def formatDateTime(s3timestamp):
    return time.strftime("%Y-%m-%d %H:%M", dateS3toPython(s3timestamp))
__all__.append("formatDateTime")

def convertTupleListToDict(list):
    retval = {}
    for tuple in list:
        retval[tuple[0]] = tuple[1]
    return retval
__all__.append("convertTupleListToDict")

_rnd_chars = string.ascii_letters+string.digits
_rnd_chars_len = len(_rnd_chars)
def rndstr(len):
    retval = ""
    while len > 0:
        retval += _rnd_chars[random.randint(0, _rnd_chars_len-1)]
        len -= 1
    return retval
__all__.append("rndstr")

def mktmpsomething(prefix, randchars, createfunc):
    old_umask = os.umask(0077)
    tries = 5
    while tries > 0:
        dirname = prefix + rndstr(randchars)
        try:
            createfunc(dirname)
            break
        except OSError, e:
            if e.errno != errno.EEXIST:
                os.umask(old_umask)
                raise
        tries -= 1

    os.umask(old_umask)
    return dirname
__all__.append("mktmpsomething")

def mktmpdir(prefix = "/tmp/tmpdir-", randchars = 10):
    return mktmpsomething(prefix, randchars, os.mkdir)
__all__.append("mktmpdir")

def mktmpfile(prefix = "/tmp/tmpfile-", randchars = 20):
    createfunc = lambda filename : os.close(os.open(filename, os.O_CREAT | os.O_EXCL))
    return mktmpsomething(prefix, randchars, createfunc)
__all__.append("mktmpfile")

def hash_file_md5(filename):
    h = md5()
    f = open(filename, "rb")
    while True:
        # Hash 32kB chunks
        data = f.read(32*1024)
        if not data:
            break
        h.update(data)
    f.close()
    return h.hexdigest()
__all__.append("hash_file_md5")

def mkdir_with_parents(dir_name):
    """
    mkdir_with_parents(dir_name)

    Create directory 'dir_name' with all parent directories

    Returns True on success, False otherwise.
    """
    pathmembers = dir_name.split(os.sep)
    tmp_stack = []
    ## Strip trailing path members until an existing directory is found
    while pathmembers and not os.path.isdir(os.sep.join(pathmembers)):
        tmp_stack.append(pathmembers.pop())
    ## Re-add them one by one, creating each missing level
    while tmp_stack:
        pathmembers.append(tmp_stack.pop())
        cur_dir = os.sep.join(pathmembers)
        try:
            debug("mkdir(%s)" % cur_dir)
            os.mkdir(cur_dir)
        except (OSError, IOError), e:
            warning("%s: can not make directory: %s" % (cur_dir, e.strerror))
            return False
        except Exception, e:
            warning("%s: %s" % (cur_dir, e))
            return False
    return True
__all__.append("mkdir_with_parents")

def unicodise(string, encoding = None, errors = "replace"):
    """
    Convert 'string' to Unicode or raise an exception.
    """

    if not encoding:
        encoding = Config.Config().encoding

    if type(string) == unicode:
        return string
    debug("Unicodising %r using %s" % (string, encoding))
    try:
        return string.decode(encoding, errors)
    except UnicodeDecodeError:
        raise UnicodeDecodeError("Conversion to unicode failed: %r" % string)
__all__.append("unicodise")

def deunicodise(string, encoding = None, errors = "replace"):
    """
    Convert unicode 'string' to <type str>, by default replacing
    all invalid characters with '?' or raise an exception.
    """

    if not encoding:
        encoding = Config.Config().encoding

    if type(string) != unicode:
        return str(string)
    debug("DeUnicodising %r using %s" % (string, encoding))
    try:
        return string.encode(encoding, errors)
    except UnicodeEncodeError:
        raise UnicodeEncodeError("Conversion from unicode failed: %r" % string)
__all__.append("deunicodise")

def unicodise_safe(string, encoding = None):
    """
    Convert 'string' to Unicode according to current encoding
    and replace all invalid characters with '?'
    """

    return unicodise(deunicodise(string, encoding), encoding).replace(u'\ufffd', '?')
__all__.append("unicodise_safe")

def replace_nonprintables(string):
    """
    replace_nonprintables(string)

    Replaces all non-printable characters 'ch' in 'string'
    where ord(ch) <= 31 with ^@, ^A, ... ^_
    and the DEL character (ord(ch) == 127) with ^?
    """
    new_string = ""
    modified = 0
    for c in string:
        o = ord(c)
        if (o <= 31):
            new_string += "^" + chr(ord('@') + o)
            modified += 1
        elif (o == 127):
            new_string += "^?"
            modified += 1
        else:
            new_string += c
    if modified and Config.Config().urlencoding_mode != "fixbucket":
        warning("%d non-printable characters replaced in: %s" % (modified, new_string))
    return new_string
__all__.append("replace_nonprintables")

def sign_string(string_to_sign):
    #debug("string_to_sign: %s" % string_to_sign)
    signature = base64.encodestring(hmac.new(Config.Config().secret_key, string_to_sign, sha1).digest()).strip()
    #debug("signature: %s" % signature)
    return signature
__all__.append("sign_string")

def check_bucket_name(bucket, dns_strict = True):
    if dns_strict:
        invalid = re.search("([^a-z0-9\.-])", bucket)
        if invalid:
            raise Exceptions.ParameterError("Bucket name '%s' contains disallowed character '%s'. The only supported ones are: lowercase us-ascii letters (a-z), digits (0-9), dot (.) and hyphen (-)." % (bucket, invalid.groups()[0]))
    else:
        invalid = re.search("([^A-Za-z0-9\._-])", bucket)
        if invalid:
            raise Exceptions.ParameterError("Bucket name '%s' contains disallowed character '%s'. The only supported ones are: us-ascii letters (a-z, A-Z), digits (0-9), dot (.), hyphen (-) and underscore (_)."
                % (bucket, invalid.groups()[0]))

    if len(bucket) < 3:
        raise Exceptions.ParameterError("Bucket name '%s' is too short (min 3 characters)" % bucket)
    if len(bucket) > 255:
        raise Exceptions.ParameterError("Bucket name '%s' is too long (max 255 characters)" % bucket)
    if dns_strict:
        if len(bucket) > 63:
            raise Exceptions.ParameterError("Bucket name '%s' is too long (max 63 characters)" % bucket)
        if re.search("-\.", bucket):
            raise Exceptions.ParameterError("Bucket name '%s' must not contain sequence '-.' for DNS compatibility" % bucket)
        if re.search("\.\.", bucket):
            raise Exceptions.ParameterError("Bucket name '%s' must not contain sequence '..' for DNS compatibility" % bucket)
        if not re.search("^[0-9a-z]", bucket):
            raise Exceptions.ParameterError("Bucket name '%s' must start with a letter or a digit" % bucket)
        if not re.search("[0-9a-z]$", bucket):
            raise Exceptions.ParameterError("Bucket name '%s' must end with a letter or a digit" % bucket)
    return True
__all__.append("check_bucket_name")

def check_bucket_name_dns_conformity(bucket):
    try:
        return check_bucket_name(bucket, dns_strict = True)
    except Exceptions.ParameterError:
        return False
__all__.append("check_bucket_name_dns_conformity")

def getBucketFromHostname(hostname):
    """
    bucket, success = getBucketFromHostname(hostname)

    Only works for hostnames derived from bucket names
    using Config.host_bucket pattern.

    Returns bucket name and a boolean success flag.
    """

    # Create RE pattern from Config.host_bucket
    pattern = Config.Config().host_bucket % { 'bucket' : '(?P<bucket>.*)' }
    m = re.match(pattern, hostname)
    if not m:
        return (hostname, False)
    return m.groups()[0], True
__all__.append("getBucketFromHostname")

def getHostnameFromBucket(bucket):
    return Config.Config().host_bucket % { 'bucket' : bucket }
__all__.append("getHostnameFromBucket")

# vim:et:ts=4:sts=4:ai

s3cmd-1.1.0-beta3/S3/SortedDict.py
## Amazon S3 manager
## Author: Michal Ludvig
##         http://www.logix.cz/michal
## License: GPL Version 2

from BidirMap import BidirMap

class SortedDictIterator(object):
    def __init__(self, sorted_dict, keys):
        self.sorted_dict = sorted_dict
        self.keys = keys

    def next(self):
        try:
            return self.keys.pop(0)
        except IndexError:
            raise StopIteration

class SortedDict(dict):
    def __init__(self, mapping = {}, ignore_case = True, **kwargs):
        """
        WARNING: SortedDict() with ignore_case==True will
                 drop entries differing only in capitalisation!
                 Eg: SortedDict({'auckland':1, 'Auckland':2}).keys() => ['Auckland']
                 With ignore_case==False it's all right
        """
        dict.__init__(self, mapping, **kwargs)
        self.ignore_case = ignore_case

    def keys(self):
        keys = dict.keys(self)
        if self.ignore_case:
            # Translation map
            xlat_map = BidirMap()
            for key in keys:
                xlat_map[key.lower()] = key
            # Lowercase keys
            lc_keys = xlat_map.keys()
            lc_keys.sort()
            return [xlat_map[k] for k in lc_keys]
        else:
            keys.sort()
            return keys

    def __iter__(self):
        return SortedDictIterator(self, self.keys())

if __name__ == "__main__":
    d = { 'AWS' : 1, 'Action' : 2, 'america' : 3, 'Auckland' : 4, 'America' : 5 }
    sd = SortedDict(d)
    print "Wanted: Action, america, Auckland, AWS, [ignore case]"
    print "Got: ",
    for key in sd:
        print "%s," % key,
    print " [used: __iter__()]"
    d = SortedDict(d, ignore_case = False)
    print "Wanted: AWS, Action, Auckland, america, [case sensitive]"
    print "Got: ",
    for key in d.keys():
        print "%s," % key,
    print " [used: keys()]"

# vim:et:ts=4:sts=4:ai

s3cmd-1.1.0-beta3/S3/ACL.py
## Amazon S3 - Access Control List representation
## Author: Michal Ludvig
##         http://www.logix.cz/michal
## License: GPL Version 2

from Utils import getTreeFromXml

try:
    import xml.etree.ElementTree as ET
except ImportError:
    import elementtree.ElementTree as ET

class Grantee(object):
    ALL_USERS_URI = "http://acs.amazonaws.com/groups/global/AllUsers"
    LOG_DELIVERY_URI = "http://acs.amazonaws.com/groups/s3/LogDelivery"

    def __init__(self):
        self.xsi_type = None
        self.tag = None
        self.name = None
        self.display_name = None
        self.permission = None

    def __repr__(self):
        return 'Grantee("%(tag)s", "%(name)s", "%(permission)s")' % {
            "tag" : self.tag,
            "name" : self.name,
            "permission" : self.permission
        }

    def isAllUsers(self):
        return self.tag == "URI" and self.name == Grantee.ALL_USERS_URI

    def isAnonRead(self):
        return self.isAllUsers() and (self.permission == "READ" or self.permission == "FULL_CONTROL")

    def getElement(self):
        el = ET.Element("Grant")
        grantee = ET.SubElement(el, "Grantee", {
            'xmlns:xsi' : 'http://www.w3.org/2001/XMLSchema-instance',
            'xsi:type' : self.xsi_type
        })
        name = ET.SubElement(grantee, self.tag)
        name.text = self.name
        permission = ET.SubElement(el, "Permission")
        permission.text = self.permission
        return el

class GranteeAnonRead(Grantee):
    def __init__(self):
        Grantee.__init__(self)
        self.xsi_type = "Group"
        self.tag = "URI"
        self.name = Grantee.ALL_USERS_URI
        self.permission = "READ"

class GranteeLogDelivery(Grantee):
    def __init__(self, permission):
        """
        permission must be either READ_ACP or WRITE
        """
        Grantee.__init__(self)
        self.xsi_type = "Group"
        self.tag = "URI"
        self.name = Grantee.LOG_DELIVERY_URI
        self.permission = permission

class ACL(object):
    EMPTY_ACL = "<AccessControlPolicy><Owner><ID></ID></Owner><AccessControlList></AccessControlList></AccessControlPolicy>"

    def __init__(self, xml = None):
        if not xml:
            xml = ACL.EMPTY_ACL

        self.grantees = []
        self.owner_id = ""
        self.owner_nick = ""

        tree = getTreeFromXml(xml)
        self.parseOwner(tree)
        self.parseGrants(tree)

    def parseOwner(self, tree):
        self.owner_id = tree.findtext(".//Owner//ID")
        self.owner_nick = tree.findtext(".//Owner//DisplayName")

    def parseGrants(self, tree):
        for grant in tree.findall(".//Grant"):
            grantee = Grantee()
            g = grant.find(".//Grantee")
            grantee.xsi_type = g.attrib['{http://www.w3.org/2001/XMLSchema-instance}type']
            grantee.permission = grant.find('Permission').text
            for el in g:
                if el.tag == "DisplayName":
                    grantee.display_name = el.text
                else:
                    grantee.tag = el.tag
                    grantee.name = el.text
            self.grantees.append(grantee)

    def getGrantList(self):
        acl = []
        for grantee in self.grantees:
            if grantee.display_name:
                user = grantee.display_name
            elif grantee.isAllUsers():
                user = "*anon*"
            else:
                user = grantee.name
            acl.append({'grantee': user, 'permission': grantee.permission})
        return acl

    def getOwner(self):
        return { 'id' : self.owner_id, 'nick' : self.owner_nick }

    def isAnonRead(self):
        for grantee in self.grantees:
            if grantee.isAnonRead():
                return True
        return False

    def grantAnonRead(self):
        if not self.isAnonRead():
            self.appendGrantee(GranteeAnonRead())

    def revokeAnonRead(self):
        self.grantees = [g for g in self.grantees if not g.isAnonRead()]

    def appendGrantee(self, grantee):
        self.grantees.append(grantee)

    def hasGrant(self, name, permission):
        name = name.lower()
        permission = permission.upper()

        for grantee in self.grantees:
            if grantee.name.lower() == name:
                if grantee.permission == "FULL_CONTROL":
                    return True
                elif grantee.permission.upper() == permission:
                    return True

        return False

    def grant(self, name, permission):
        if self.hasGrant(name, permission):
            return

        name = name.lower()
        permission = permission.upper()

        if "ALL" == permission:
            permission = "FULL_CONTROL"

        if "FULL_CONTROL" == permission:
            self.revoke(name, "ALL")

        grantee = Grantee()
        grantee.name = name
        grantee.permission = permission

        if name.find('@') <= -1: # ultra lame attempt to differentiate email ids from canonical ids
            grantee.xsi_type = "CanonicalUser"
            grantee.tag = "ID"
        else:
            grantee.xsi_type = "AmazonCustomerByEmail"
            grantee.tag = "EmailAddress"

        self.appendGrantee(grantee)

    def revoke(self, name, permission):
        name = name.lower()
        permission = permission.upper()

        if "ALL" == permission:
            self.grantees = [g for g in self.grantees if not g.name.lower() == name]
        else:
            self.grantees = [g for g in self.grantees if not (g.name.lower() == name and g.permission.upper() == permission)]

    def __str__(self):
        tree = getTreeFromXml(ACL.EMPTY_ACL)
        tree.attrib['xmlns'] = "http://s3.amazonaws.com/doc/2006-03-01/"
        owner = tree.find(".//Owner//ID")
        owner.text = self.owner_id
        acl = tree.find(".//AccessControlList")
        for grantee in self.grantees:
            acl.append(grantee.getElement())
        return ET.tostring(tree)

if __name__ == "__main__":
    xml = """<?xml version="1.0" encoding="UTF-8"?>
<AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Owner>
    <ID>12345678901234567890</ID>
    <DisplayName>owner-nickname</DisplayName>
</Owner>
<AccessControlList>
    <Grant>
        <Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser">
            <ID>12345678901234567890</ID>
            <DisplayName>owner-nickname</DisplayName>
        </Grantee>
        <Permission>FULL_CONTROL</Permission>
    </Grant>
    <Grant>
        <Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="Group">
            <URI>http://acs.amazonaws.com/groups/global/AllUsers</URI>
        </Grantee>
        <Permission>READ</Permission>
    </Grant>
</AccessControlList>
</AccessControlPolicy>
    """
    acl = ACL(xml)
    print "Grants:", acl.getGrantList()
    acl.revokeAnonRead()
    print "Grants:", acl.getGrantList()
    acl.grantAnonRead()
    print "Grants:", acl.getGrantList()
    print acl

# vim:et:ts=4:sts=4:ai

s3cmd-1.1.0-beta3/S3/PkgInfo.py
package = "s3cmd"
version = "1.1.0-beta3"
url = "http://s3tools.org"
license = "GPL version 2"
short_description = "Command line tool for managing Amazon S3 and CloudFront services"
long_description = """
S3cmd lets you copy files from/to Amazon S3
(Simple Storage Service) using a simple to use
command line client. Supports rsync-like backup,
GPG encryption, and more. Also supports management
of Amazon's CloudFront content delivery network.
"""

# vim:et:ts=4:sts=4:ai

s3cmd-1.1.0-beta3/S3/AccessLog.py
## Amazon S3 - Access Log representation
## Author: Michal Ludvig
##         http://www.logix.cz/michal
## License: GPL Version 2

import S3Uri
from Exceptions import ParameterError
from Utils import getTreeFromXml
from ACL import GranteeAnonRead

try:
    import xml.etree.ElementTree as ET
except ImportError:
    import elementtree.ElementTree as ET

__all__ = []
class AccessLog(object):
    LOG_DISABLED = "<BucketLoggingStatus></BucketLoggingStatus>"
    LOG_TEMPLATE = "<LoggingEnabled><TargetBucket></TargetBucket><TargetPrefix></TargetPrefix></LoggingEnabled>"

    def __init__(self, xml = None):
        if not xml:
            xml = self.LOG_DISABLED
        self.tree = getTreeFromXml(xml)
        self.tree.attrib['xmlns'] = "http://doc.s3.amazonaws.com/2006-03-01"

    def isLoggingEnabled(self):
        return bool(self.tree.find(".//LoggingEnabled"))

    def disableLogging(self):
        el = self.tree.find(".//LoggingEnabled")
        if el:
            self.tree.remove(el)

    def enableLogging(self, target_prefix_uri):
        el = self.tree.find(".//LoggingEnabled")
        if not el:
            el = getTreeFromXml(self.LOG_TEMPLATE)
            self.tree.append(el)
        el.find(".//TargetBucket").text = target_prefix_uri.bucket()
        el.find(".//TargetPrefix").text = target_prefix_uri.object()

    def targetPrefix(self):
        if self.isLoggingEnabled():
            el = self.tree.find(".//LoggingEnabled")
            target_prefix = "s3://%s/%s" % (
                self.tree.find(".//LoggingEnabled//TargetBucket").text,
self.tree.find(".//LoggingEnabled//TargetPrefix").text) return S3Uri.S3Uri(target_prefix) else: return "" def setAclPublic(self, acl_public): le = self.tree.find(".//LoggingEnabled") if not le: raise ParameterError("Logging not enabled, can't set default ACL for logs") tg = le.find(".//TargetGrants") if not acl_public: if not tg: ## All good, it's not been there return else: le.remove(tg) else: # acl_public == True anon_read = GranteeAnonRead().getElement() if not tg: tg = ET.SubElement(le, "TargetGrants") ## What if TargetGrants already exists? We should check if ## AnonRead is there before appending a new one. Later... tg.append(anon_read) def isAclPublic(self): raise NotImplementedError() def __str__(self): return ET.tostring(self.tree) __all__.append("AccessLog") if __name__ == "__main__": from S3Uri import S3Uri log = AccessLog() print log log.enableLogging(S3Uri("s3://targetbucket/prefix/log-")) print log log.setAclPublic(True) print log log.setAclPublic(False) print log log.disableLogging() print log # vim:et:ts=4:sts=4:ai s3cmd-1.1.0-beta3/S3/CloudFront.py0000644000175000001440000007375211700327125017276 0ustar mludvigusers00000000000000## Amazon CloudFront support ## Author: Michal Ludvig ## http://www.logix.cz/michal ## License: GPL Version 2 import sys import time import httplib import random from datetime import datetime from logging import debug, info, warning, error try: import xml.etree.ElementTree as ET except ImportError: import elementtree.ElementTree as ET from Config import Config from Exceptions import * from Utils import getTreeFromXml, appendXmlTextNode, getDictFromTree, dateS3toPython, sign_string, getBucketFromHostname, getHostnameFromBucket from S3Uri import S3Uri, S3UriS3 from FileLists import fetch_remote_list cloudfront_api_version = "2010-11-01" cloudfront_resource = "/%(api_ver)s/distribution" % { 'api_ver' : cloudfront_api_version } def output(message): sys.stdout.write(message + "\n") def pretty_output(label, message): #label = ("%s 
" % label).ljust(20, ".") label = ("%s:" % label).ljust(15) output("%s %s" % (label, message)) class DistributionSummary(object): ## Example: ## ## ## 1234567890ABC ## Deployed ## 2009-01-16T11:49:02.189Z ## blahblahblah.cloudfront.net ## ## example.bucket.s3.amazonaws.com ## ## cdn.example.com ## img.example.com ## What Ever ## true ## def __init__(self, tree): if tree.tag != "DistributionSummary": raise ValueError("Expected xml, got: <%s />" % tree.tag) self.parse(tree) def parse(self, tree): self.info = getDictFromTree(tree) self.info['Enabled'] = (self.info['Enabled'].lower() == "true") if self.info.has_key("CNAME") and type(self.info['CNAME']) != list: self.info['CNAME'] = [self.info['CNAME']] def uri(self): return S3Uri("cf://%s" % self.info['Id']) class DistributionList(object): ## Example: ## ## ## ## 100 ## false ## ## ... handled by DistributionSummary() class ... ## ## def __init__(self, xml): tree = getTreeFromXml(xml) if tree.tag != "DistributionList": raise ValueError("Expected xml, got: <%s />" % tree.tag) self.parse(tree) def parse(self, tree): self.info = getDictFromTree(tree) ## Normalise some items self.info['IsTruncated'] = (self.info['IsTruncated'].lower() == "true") self.dist_summs = [] for dist_summ in tree.findall(".//DistributionSummary"): self.dist_summs.append(DistributionSummary(dist_summ)) class Distribution(object): ## Example: ## ## ## 1234567890ABC ## InProgress ## 2009-01-16T13:07:11.319Z ## blahblahblah.cloudfront.net ## ## ... handled by DistributionConfig() class ... 
## ## def __init__(self, xml): tree = getTreeFromXml(xml) if tree.tag != "Distribution": raise ValueError("Expected xml, got: <%s />" % tree.tag) self.parse(tree) def parse(self, tree): self.info = getDictFromTree(tree) ## Normalise some items self.info['LastModifiedTime'] = dateS3toPython(self.info['LastModifiedTime']) self.info['DistributionConfig'] = DistributionConfig(tree = tree.find(".//DistributionConfig")) def uri(self): return S3Uri("cf://%s" % self.info['Id']) class DistributionConfig(object): ## Example: ## ## ## somebucket.s3.amazonaws.com ## s3://somebucket/ ## http://somebucket.s3.amazonaws.com/ ## true ## ## bu.ck.et ## /cf-somebucket/ ## ## EMPTY_CONFIG = "true" xmlns = "http://cloudfront.amazonaws.com/doc/%(api_ver)s/" % { 'api_ver' : cloudfront_api_version } def __init__(self, xml = None, tree = None): if xml is None: xml = DistributionConfig.EMPTY_CONFIG if tree is None: tree = getTreeFromXml(xml) if tree.tag != "DistributionConfig": raise ValueError("Expected xml, got: <%s />" % tree.tag) self.parse(tree) def parse(self, tree): self.info = getDictFromTree(tree) self.info['Enabled'] = (self.info['Enabled'].lower() == "true") if not self.info.has_key("CNAME"): self.info['CNAME'] = [] if type(self.info['CNAME']) != list: self.info['CNAME'] = [self.info['CNAME']] self.info['CNAME'] = [cname.lower() for cname in self.info['CNAME']] if not self.info.has_key("Comment"): self.info['Comment'] = "" if not self.info.has_key("DefaultRootObject"): self.info['DefaultRootObject'] = "" ## Figure out logging - complex node not parsed by getDictFromTree() logging_nodes = tree.findall(".//Logging") if logging_nodes: logging_dict = getDictFromTree(logging_nodes[0]) logging_dict['Bucket'], success = getBucketFromHostname(logging_dict['Bucket']) if not success: warning("Logging to unparsable bucket name: %s" % logging_dict['Bucket']) self.info['Logging'] = S3UriS3("s3://%(Bucket)s/%(Prefix)s" % logging_dict) else: self.info['Logging'] = None def __str__(self): tree = 
ET.Element("DistributionConfig") tree.attrib['xmlns'] = DistributionConfig.xmlns ## Retain the order of the following calls! appendXmlTextNode("Origin", self.info['Origin'], tree) appendXmlTextNode("CallerReference", self.info['CallerReference'], tree) for cname in self.info['CNAME']: appendXmlTextNode("CNAME", cname.lower(), tree) if self.info['Comment']: appendXmlTextNode("Comment", self.info['Comment'], tree) appendXmlTextNode("Enabled", str(self.info['Enabled']).lower(), tree) # don't create a empty DefaultRootObject element as it would result in a MalformedXML error if str(self.info['DefaultRootObject']): appendXmlTextNode("DefaultRootObject", str(self.info['DefaultRootObject']), tree) if self.info['Logging']: logging_el = ET.Element("Logging") appendXmlTextNode("Bucket", getHostnameFromBucket(self.info['Logging'].bucket()), logging_el) appendXmlTextNode("Prefix", self.info['Logging'].object(), logging_el) tree.append(logging_el) return ET.tostring(tree) class Invalidation(object): ## Example: ## ## ## id ## status ## date ## ## /image1.jpg ## /image2.jpg ## /videos/movie.flv ## my-batch ## ## def __init__(self, xml): tree = getTreeFromXml(xml) if tree.tag != "Invalidation": raise ValueError("Expected xml, got: <%s />" % tree.tag) self.parse(tree) def parse(self, tree): self.info = getDictFromTree(tree) def __str__(self): return str(self.info) class InvalidationList(object): ## Example: ## ## ## ## Invalidation ID ## 2 ## true ## ## [Second Invalidation ID] ## Completed ## ## ## [First Invalidation ID] ## Completed ## ## def __init__(self, xml): tree = getTreeFromXml(xml) if tree.tag != "InvalidationList": raise ValueError("Expected xml, got: <%s />" % tree.tag) self.parse(tree) def parse(self, tree): self.info = getDictFromTree(tree) def __str__(self): return str(self.info) class InvalidationBatch(object): ## Example: ## ## ## /image1.jpg ## /image2.jpg ## /videos/movie.flv ## /sound%20track.mp3 ## my-batch ## def __init__(self, reference = None, distribution 
= None, paths = []): if reference: self.reference = reference else: if not distribution: distribution="0" self.reference = "%s.%s.%s" % (distribution, datetime.strftime(datetime.now(),"%Y%m%d%H%M%S"), random.randint(1000,9999)) self.paths = [] self.add_objects(paths) def add_objects(self, paths): self.paths.extend(paths) def get_reference(self): return self.reference def __str__(self): tree = ET.Element("InvalidationBatch") for path in self.paths: if path[0] != "/": path = "/" + path appendXmlTextNode("Path", path, tree) appendXmlTextNode("CallerReference", self.reference, tree) return ET.tostring(tree) class CloudFront(object): operations = { "CreateDist" : { 'method' : "POST", 'resource' : "" }, "DeleteDist" : { 'method' : "DELETE", 'resource' : "/%(dist_id)s" }, "GetList" : { 'method' : "GET", 'resource' : "" }, "GetDistInfo" : { 'method' : "GET", 'resource' : "/%(dist_id)s" }, "GetDistConfig" : { 'method' : "GET", 'resource' : "/%(dist_id)s/config" }, "SetDistConfig" : { 'method' : "PUT", 'resource' : "/%(dist_id)s/config" }, "Invalidate" : { 'method' : "POST", 'resource' : "/%(dist_id)s/invalidation" }, "GetInvalList" : { 'method' : "GET", 'resource' : "/%(dist_id)s/invalidation" }, "GetInvalInfo" : { 'method' : "GET", 'resource' : "/%(dist_id)s/invalidation/%(request_id)s" }, } ## Maximum attempts of re-issuing failed requests _max_retries = 5 dist_list = None def __init__(self, config): self.config = config ## -------------------------------------------------- ## Methods implementing CloudFront API ## -------------------------------------------------- def GetList(self): response = self.send_request("GetList") response['dist_list'] = DistributionList(response['data']) if response['dist_list'].info['IsTruncated']: raise NotImplementedError("List is truncated. 
Ask s3cmd author to add support.") ## TODO: handle Truncated return response def CreateDistribution(self, uri, cnames_add = [], comment = None, logging = None, default_root_object = None): dist_config = DistributionConfig() dist_config.info['Enabled'] = True dist_config.info['Origin'] = uri.host_name() dist_config.info['CallerReference'] = str(uri) dist_config.info['DefaultRootObject'] = default_root_object if comment == None: dist_config.info['Comment'] = uri.public_url() else: dist_config.info['Comment'] = comment for cname in cnames_add: if dist_config.info['CNAME'].count(cname) == 0: dist_config.info['CNAME'].append(cname) if logging: dist_config.info['Logging'] = S3UriS3(logging) request_body = str(dist_config) debug("CreateDistribution(): request_body: %s" % request_body) response = self.send_request("CreateDist", body = request_body) response['distribution'] = Distribution(response['data']) return response def ModifyDistribution(self, cfuri, cnames_add = [], cnames_remove = [], comment = None, enabled = None, logging = None, default_root_object = None): if cfuri.type != "cf": raise ValueError("Expected CFUri instead of: %s" % cfuri) # Get current dist status (enabled/disabled) and Etag info("Checking current status of %s" % cfuri) response = self.GetDistConfig(cfuri) dc = response['dist_config'] if enabled != None: dc.info['Enabled'] = enabled if comment != None: dc.info['Comment'] = comment if default_root_object != None: dc.info['DefaultRootObject'] = default_root_object for cname in cnames_add: if dc.info['CNAME'].count(cname) == 0: dc.info['CNAME'].append(cname) for cname in cnames_remove: while dc.info['CNAME'].count(cname) > 0: dc.info['CNAME'].remove(cname) if logging != None: if logging == False: dc.info['Logging'] = False else: dc.info['Logging'] = S3UriS3(logging) response = self.SetDistConfig(cfuri, dc, response['headers']['etag']) return response def DeleteDistribution(self, cfuri): if cfuri.type != "cf": raise ValueError("Expected CFUri instead 
of: %s" % cfuri) # Get current dist status (enabled/disabled) and Etag info("Checking current status of %s" % cfuri) response = self.GetDistConfig(cfuri) if response['dist_config'].info['Enabled']: info("Distribution is ENABLED. Disabling first.") response['dist_config'].info['Enabled'] = False response = self.SetDistConfig(cfuri, response['dist_config'], response['headers']['etag']) warning("Waiting for Distribution to become disabled.") warning("This may take several minutes, please wait.") while True: response = self.GetDistInfo(cfuri) d = response['distribution'] if d.info['Status'] == "Deployed" and d.info['Enabled'] == False: info("Distribution is now disabled") break warning("Still waiting...") time.sleep(10) headers = {} headers['if-match'] = response['headers']['etag'] response = self.send_request("DeleteDist", dist_id = cfuri.dist_id(), headers = headers) return response def GetDistInfo(self, cfuri): if cfuri.type != "cf": raise ValueError("Expected CFUri instead of: %s" % cfuri) response = self.send_request("GetDistInfo", dist_id = cfuri.dist_id()) response['distribution'] = Distribution(response['data']) return response def GetDistConfig(self, cfuri): if cfuri.type != "cf": raise ValueError("Expected CFUri instead of: %s" % cfuri) response = self.send_request("GetDistConfig", dist_id = cfuri.dist_id()) response['dist_config'] = DistributionConfig(response['data']) return response def SetDistConfig(self, cfuri, dist_config, etag = None): if etag == None: debug("SetDistConfig(): Etag not set. 
Fetching it first.") etag = self.GetDistConfig(cfuri)['headers']['etag'] debug("SetDistConfig(): Etag = %s" % etag) request_body = str(dist_config) debug("SetDistConfig(): request_body: %s" % request_body) headers = {} headers['if-match'] = etag response = self.send_request("SetDistConfig", dist_id = cfuri.dist_id(), body = request_body, headers = headers) return response def InvalidateObjects(self, uri, paths): # uri could be either cf:// or s3:// uri cfuri = self.get_dist_name_for_bucket(uri) if len(paths) > 999: try: tmp_filename = Utils.mktmpfile() f = open(tmp_filename, "w") f.write("\n".join(paths)+"\n") f.close() warning("Request to invalidate %d paths (max 999 supported)" % len(paths)) warning("All the paths are now saved in: %s" % tmp_filename) except: pass raise ParameterError("Too many paths to invalidate") invalbatch = InvalidationBatch(distribution = cfuri.dist_id(), paths = paths) debug("InvalidateObjects(): request_body: %s" % invalbatch) response = self.send_request("Invalidate", dist_id = cfuri.dist_id(), body = str(invalbatch)) response['dist_id'] = cfuri.dist_id() if response['status'] == 201: inval_info = Invalidation(response['data']).info response['request_id'] = inval_info['Id'] debug("InvalidateObjects(): response: %s" % response) return response def GetInvalList(self, cfuri): if cfuri.type != "cf": raise ValueError("Expected CFUri instead of: %s" % cfuri) response = self.send_request("GetInvalList", dist_id = cfuri.dist_id()) response['inval_list'] = InvalidationList(response['data']) return response def GetInvalInfo(self, cfuri): if cfuri.type != "cf": raise ValueError("Expected CFUri instead of: %s" % cfuri) if cfuri.request_id() is None: raise ValueError("Expected CFUri with Request ID") response = self.send_request("GetInvalInfo", dist_id = cfuri.dist_id(), request_id = cfuri.request_id()) response['inval_status'] = Invalidation(response['data']) return response ## -------------------------------------------------- ## Low-level methods 
    ## for handling CloudFront requests
    ## --------------------------------------------------

    def send_request(self, op_name, dist_id = None, request_id = None, body = None, headers = None, retries = _max_retries):
        # Use a fresh dict for every call: a mutable default argument
        # would be shared (and modified) across unrelated requests.
        if headers is None:
            headers = {}
        operation = self.operations[op_name]
        if body:
            headers['content-type'] = 'text/plain'
        request = self.create_request(operation, dist_id, request_id, headers)
        conn = self.get_connection()
        debug("send_request(): %s %s" % (request['method'], request['resource']))
        conn.request(request['method'], request['resource'], body, request['headers'])
        http_response = conn.getresponse()
        response = {}
        response["status"] = http_response.status
        response["reason"] = http_response.reason
        response["headers"] = dict(http_response.getheaders())
        response["data"] = http_response.read()
        conn.close()

        debug("CloudFront: response: %r" % response)

        if response["status"] >= 500:
            e = CloudFrontError(response)
            if retries:
                warning(u"Retrying failed request: %s" % op_name)
                warning(unicode(e))
                warning("Waiting %d sec..." % self._fail_wait(retries))
                time.sleep(self._fail_wait(retries))
                # Pass every argument in its declared position so that
                # request_id, body and headers survive the retry.
                return self.send_request(op_name, dist_id, request_id, body, headers, retries - 1)
            else:
                raise e

        if response["status"] < 200 or response["status"] > 299:
            raise CloudFrontError(response)

        return response

    def create_request(self, operation, dist_id = None, request_id = None, headers = None):
        resource = cloudfront_resource + (
            operation['resource'] % { 'dist_id' : dist_id, 'request_id' : request_id })

        if not headers:
            headers = {}

        if headers.has_key("date"):
            if not headers.has_key("x-amz-date"):
                headers["x-amz-date"] = headers["date"]
            del(headers["date"])

        if not headers.has_key("x-amz-date"):
            headers["x-amz-date"] = time.strftime("%a, %d %b %Y %H:%M:%S +0000", time.gmtime())

        signature = self.sign_request(headers)
        headers["Authorization"] = "AWS "+self.config.access_key+":"+signature

        request = {}
        request['resource'] = resource
        request['headers'] = headers
        request['method'] = operation['method']

        return request

    def sign_request(self, headers):
        string_to_sign = headers['x-amz-date']
        signature = sign_string(string_to_sign)
        debug(u"CloudFront.sign_request('%s') = %s" % (string_to_sign, signature))
        return signature

    def get_connection(self):
        if self.config.proxy_host != "":
            raise ParameterError("CloudFront commands don't work from behind an HTTP proxy")
        return httplib.HTTPSConnection(self.config.cloudfront_host)

    def _fail_wait(self, retries):
        # Wait a few seconds. The more it fails the more we wait.
        return (self._max_retries - retries + 1) * 3

    def get_dist_name_for_bucket(self, uri):
        if (uri.type == "cf"):
            return uri
        if (uri.type != "s3"):
            raise ParameterError("CloudFront or S3 URI required instead of: %s" % uri)

        debug("_get_dist_name_for_bucket(%r)" % uri)
        if CloudFront.dist_list is None:
            response = self.GetList()
            CloudFront.dist_list = {}
            for d in response['dist_list'].dist_summs:
                if d.info.has_key("S3Origin"):
                    CloudFront.dist_list[getBucketFromHostname(d.info['S3Origin']['DNSName'])[0]] = d.uri()
                else:
                    # Skip over distributions with CustomOrigin
                    continue
            debug("dist_list: %s" % CloudFront.dist_list)
        try:
            return CloudFront.dist_list[uri.bucket()]
        except Exception, e:
            debug(e)
            raise ParameterError("Unable to translate S3 URI to CloudFront distribution name: %s" % uri)

class Cmd(object):
    """
    Class that implements CloudFront commands
    """

    class Options(object):
        cf_cnames_add = []
        cf_cnames_remove = []
        cf_comment = None
        cf_enable = None
        cf_logging = None
        cf_default_root_object = None

        def option_list(self):
            return [opt for opt in dir(self) if opt.startswith("cf_")]

        def update_option(self, option, value):
            setattr(Cmd.options, option, value)

    options = Options()

    @staticmethod
    def _parse_args(args):
        cf = CloudFront(Config())
        cfuris = []
        for arg in args:
            uri = cf.get_dist_name_for_bucket(S3Uri(arg))
            cfuris.append(uri)
        return cfuris

    @staticmethod
    def info(args):
        cf = CloudFront(Config())
        if not args:
            response = cf.GetList()
            for d in response['dist_list'].dist_summs:
                if d.info.has_key("S3Origin"):
                    origin =
S3UriS3.httpurl_to_s3uri(d.info['S3Origin']['DNSName']) elif d.info.has_key("CustomOrigin"): origin = "http://%s/" % d.info['CustomOrigin']['DNSName'] else: origin = "" pretty_output("Origin", origin) pretty_output("DistId", d.uri()) pretty_output("DomainName", d.info['DomainName']) if d.info.has_key("CNAME"): pretty_output("CNAMEs", ", ".join(d.info['CNAME'])) pretty_output("Status", d.info['Status']) pretty_output("Enabled", d.info['Enabled']) output("") else: cfuris = Cmd._parse_args(args) for cfuri in cfuris: response = cf.GetDistInfo(cfuri) d = response['distribution'] dc = d.info['DistributionConfig'] if dc.info.has_key("S3Origin"): origin = S3UriS3.httpurl_to_s3uri(dc.info['S3Origin']['DNSName']) elif dc.info.has_key("CustomOrigin"): origin = "http://%s/" % dc.info['CustomOrigin']['DNSName'] else: origin = "" pretty_output("Origin", origin) pretty_output("DistId", d.uri()) pretty_output("DomainName", d.info['DomainName']) if dc.info.has_key("CNAME"): pretty_output("CNAMEs", ", ".join(dc.info['CNAME'])) pretty_output("Status", d.info['Status']) pretty_output("Comment", dc.info['Comment']) pretty_output("Enabled", dc.info['Enabled']) pretty_output("DfltRootObject", dc.info['DefaultRootObject']) pretty_output("Logging", dc.info['Logging'] or "Disabled") pretty_output("Etag", response['headers']['etag']) @staticmethod def create(args): cf = CloudFront(Config()) buckets = [] for arg in args: uri = S3Uri(arg) if uri.type != "s3": raise ParameterError("Bucket can only be created from a s3:// URI instead of: %s" % arg) if uri.object(): raise ParameterError("Use s3:// URI with a bucket name only instead of: %s" % arg) if not uri.is_dns_compatible(): raise ParameterError("CloudFront can only handle lowercase-named buckets.") buckets.append(uri) if not buckets: raise ParameterError("No valid bucket names found") for uri in buckets: info("Creating distribution from: %s" % uri) response = cf.CreateDistribution(uri, cnames_add = Cmd.options.cf_cnames_add, comment = 
Cmd.options.cf_comment, logging = Cmd.options.cf_logging, default_root_object = Cmd.options.cf_default_root_object) d = response['distribution'] dc = d.info['DistributionConfig'] output("Distribution created:") pretty_output("Origin", S3UriS3.httpurl_to_s3uri(dc.info['Origin'])) pretty_output("DistId", d.uri()) pretty_output("DomainName", d.info['DomainName']) pretty_output("CNAMEs", ", ".join(dc.info['CNAME'])) pretty_output("Comment", dc.info['Comment']) pretty_output("Status", d.info['Status']) pretty_output("Enabled", dc.info['Enabled']) pretty_output("DefaultRootObject", dc.info['DefaultRootObject']) pretty_output("Etag", response['headers']['etag']) @staticmethod def delete(args): cf = CloudFront(Config()) cfuris = Cmd._parse_args(args) for cfuri in cfuris: response = cf.DeleteDistribution(cfuri) if response['status'] >= 400: error("Distribution %s could not be deleted: %s" % (cfuri, response['reason'])) output("Distribution %s deleted" % cfuri) @staticmethod def modify(args): cf = CloudFront(Config()) if len(args) > 1: raise ParameterError("Too many parameters. 
Modify one Distribution at a time.") try: cfuri = Cmd._parse_args(args)[0] except IndexError, e: raise ParameterError("No valid Distribution URI found.") response = cf.ModifyDistribution(cfuri, cnames_add = Cmd.options.cf_cnames_add, cnames_remove = Cmd.options.cf_cnames_remove, comment = Cmd.options.cf_comment, enabled = Cmd.options.cf_enable, logging = Cmd.options.cf_logging, default_root_object = Cmd.options.cf_default_root_object) if response['status'] >= 400: error("Distribution %s could not be modified: %s" % (cfuri, response['reason'])) output("Distribution modified: %s" % cfuri) response = cf.GetDistInfo(cfuri) d = response['distribution'] dc = d.info['DistributionConfig'] pretty_output("Origin", S3UriS3.httpurl_to_s3uri(dc.info['Origin'])) pretty_output("DistId", d.uri()) pretty_output("DomainName", d.info['DomainName']) pretty_output("Status", d.info['Status']) pretty_output("CNAMEs", ", ".join(dc.info['CNAME'])) pretty_output("Comment", dc.info['Comment']) pretty_output("Enabled", dc.info['Enabled']) pretty_output("DefaultRootObject", dc.info['DefaultRootObject']) pretty_output("Etag", response['headers']['etag']) @staticmethod def invalinfo(args): cf = CloudFront(Config()) cfuris = Cmd._parse_args(args) requests = [] for cfuri in cfuris: if cfuri.request_id(): requests.append(str(cfuri)) else: inval_list = cf.GetInvalList(cfuri) try: for i in inval_list['inval_list'].info['InvalidationSummary']: requests.append("/".join(["cf:/", cfuri.dist_id(), i["Id"]])) except: continue for req in requests: cfuri = S3Uri(req) inval_info = cf.GetInvalInfo(cfuri) st = inval_info['inval_status'].info pretty_output("URI", str(cfuri)) pretty_output("Status", st['Status']) pretty_output("Created", st['CreateTime']) pretty_output("Nr of paths", len(st['InvalidationBatch']['Path'])) pretty_output("Reference", st['InvalidationBatch']['CallerReference']) output("") # vim:et:ts=4:sts=4:ai s3cmd-1.1.0-beta3/S3/Progress.py0000644000175000001440000001360111700327105017004 0ustar 
mludvigusers00000000000000## Amazon S3 manager ## Author: Michal Ludvig ## http://www.logix.cz/michal ## License: GPL Version 2 import sys import datetime import Utils class Progress(object): _stdout = sys.stdout def __init__(self, labels, total_size): self._stdout = sys.stdout self.new_file(labels, total_size) def new_file(self, labels, total_size): self.labels = labels self.total_size = total_size # Set initial_position to something in the # case we're not counting from 0. For instance # when appending to a partially downloaded file. # Setting initial_position will let the speed # be computed right. self.initial_position = 0 self.current_position = self.initial_position self.time_start = datetime.datetime.now() self.time_last = self.time_start self.time_current = self.time_start self.display(new_file = True) def update(self, current_position = -1, delta_position = -1): self.time_last = self.time_current self.time_current = datetime.datetime.now() if current_position > -1: self.current_position = current_position elif delta_position > -1: self.current_position += delta_position #else: # no update, just call display() self.display() def done(self, message): self.display(done_message = message) def output_labels(self): self._stdout.write(u"%(source)s -> %(destination)s %(extra)s\n" % self.labels) self._stdout.flush() def display(self, new_file = False, done_message = None): """ display(new_file = False[/True], done = False[/True]) Override this method to provide a nicer output. 
""" if new_file: self.output_labels() self.last_milestone = 0 return if self.current_position == self.total_size: print_size = Utils.formatSize(self.current_position, True) if print_size[1] != "": print_size[1] += "B" timedelta = self.time_current - self.time_start sec_elapsed = timedelta.days * 86400 + timedelta.seconds + float(timedelta.microseconds)/1000000.0 print_speed = Utils.formatSize((self.current_position - self.initial_position) / sec_elapsed, True, True) self._stdout.write("100%% %s%s in %.2fs (%.2f %sB/s)\n" % (print_size[0], print_size[1], sec_elapsed, print_speed[0], print_speed[1])) self._stdout.flush() return rel_position = selfself.current_position * 100 / self.total_size if rel_position >= self.last_milestone: self.last_milestone = (int(rel_position) / 5) * 5 self._stdout.write("%d%% ", self.last_milestone) self._stdout.flush() return class ProgressANSI(Progress): ## http://en.wikipedia.org/wiki/ANSI_escape_code SCI = '\x1b[' ANSI_hide_cursor = SCI + "?25l" ANSI_show_cursor = SCI + "?25h" ANSI_save_cursor_pos = SCI + "s" ANSI_restore_cursor_pos = SCI + "u" ANSI_move_cursor_to_column = SCI + "%uG" ANSI_erase_to_eol = SCI + "0K" ANSI_erase_current_line = SCI + "2K" def display(self, new_file = False, done_message = None): """ display(new_file = False[/True], done_message = None) """ if new_file: self.output_labels() self._stdout.write(self.ANSI_save_cursor_pos) self._stdout.flush() return timedelta = self.time_current - self.time_start sec_elapsed = timedelta.days * 86400 + timedelta.seconds + float(timedelta.microseconds)/1000000.0 if (sec_elapsed > 0): print_speed = Utils.formatSize((self.current_position - self.initial_position) / sec_elapsed, True, True) else: print_speed = (0, "") self._stdout.write(self.ANSI_restore_cursor_pos) self._stdout.write(self.ANSI_erase_to_eol) self._stdout.write("%(current)s of %(total)s %(percent)3d%% in %(elapsed)ds %(speed).2f %(speed_coeff)sB/s" % { "current" : 
str(self.current_position).rjust(len(str(self.total_size))), "total" : self.total_size, "percent" : self.total_size and (self.current_position * 100 / self.total_size) or 0, "elapsed" : sec_elapsed, "speed" : print_speed[0], "speed_coeff" : print_speed[1] }) if done_message: self._stdout.write(" %s\n" % done_message) self._stdout.flush() class ProgressCR(Progress): ## Uses CR char (Carriage Return) just like other progress bars do. CR_char = chr(13) def display(self, new_file = False, done_message = None): """ display(new_file = False[/True], done_message = None) """ if new_file: self.output_labels() return timedelta = self.time_current - self.time_start sec_elapsed = timedelta.days * 86400 + timedelta.seconds + float(timedelta.microseconds)/1000000.0 if (sec_elapsed > 0): print_speed = Utils.formatSize((self.current_position - self.initial_position) / sec_elapsed, True, True) else: print_speed = (0, "") self._stdout.write(self.CR_char) output = " %(current)s of %(total)s %(percent)3d%% in %(elapsed)4ds %(speed)7.2f %(speed_coeff)sB/s" % { "current" : str(self.current_position).rjust(len(str(self.total_size))), "total" : self.total_size, "percent" : self.total_size and (self.current_position * 100 / self.total_size) or 0, "elapsed" : sec_elapsed, "speed" : print_speed[0], "speed_coeff" : print_speed[1] } self._stdout.write(output) if done_message: self._stdout.write(" %s\n" % done_message) self._stdout.flush() # vim:et:ts=4:sts=4:ai s3cmd-1.1.0-beta3/S3/__init__.py0000644000175000001440000000000011615700335016731 0ustar mludvigusers00000000000000s3cmd-1.1.0-beta3/S3/FileLists.py0000644000175000001440000003456011703442425017113 0ustar mludvigusers00000000000000## Create and compare lists of files/objects ## Author: Michal Ludvig ## http://www.logix.cz/michal ## License: GPL Version 2 from S3 import S3 from Config import Config from S3Uri import S3Uri from SortedDict import SortedDict from Utils import * from Exceptions import ParameterError from logging import 
debug, info, warning, error import os import glob __all__ = ["fetch_local_list", "fetch_remote_list", "compare_filelists", "filter_exclude_include"] def _fswalk_follow_symlinks(path): ''' Walk filesystem, following symbolic links (but without recursion), on python2.4 and later If a recursive directory link is detected, emit a warning and skip. ''' assert os.path.isdir(path) # only designed for directory argument walkdirs = set([path]) targets = set() for dirpath, dirnames, filenames in os.walk(path): for dirname in dirnames: current = os.path.join(dirpath, dirname) target = os.path.realpath(current) if os.path.islink(current): if target in targets: warning("Skipping recursively symlinked directory %s" % dirname) else: walkdirs.add(current) targets.add(target) for walkdir in walkdirs: for value in os.walk(walkdir): yield value def _fswalk(path, follow_symlinks): ''' Directory tree generator path (str) is the root of the directory tree to walk follow_symlinks (bool) indicates whether to descend into symbolically linked directories ''' if follow_symlinks: return _fswalk_follow_symlinks(path) return os.walk(path) def filter_exclude_include(src_list): info(u"Applying --exclude/--include") cfg = Config() exclude_list = SortedDict(ignore_case = False) for file in src_list.keys(): debug(u"CHECK: %s" % file) excluded = False for r in cfg.exclude: if r.search(file): excluded = True debug(u"EXCL-MATCH: '%s'" % (cfg.debug_exclude[r])) break if excluded: ## No need to check for --include if not excluded for r in cfg.include: if r.search(file): excluded = False debug(u"INCL-MATCH: '%s'" % (cfg.debug_include[r])) break if excluded: ## Still excluded - ok, action it debug(u"EXCLUDE: %s" % file) exclude_list[file] = src_list[file] del(src_list[file]) continue else: debug(u"PASS: %s" % (file)) return src_list, exclude_list def fetch_local_list(args, recursive = None): def _get_filelist_local(local_uri): info(u"Compiling list of local files...") if local_uri.isdir(): local_base = 
deunicodise(local_uri.basename()) local_path = deunicodise(local_uri.path()) filelist = _fswalk(local_path, cfg.follow_symlinks) single_file = False else: local_base = "" local_path = deunicodise(local_uri.dirname()) filelist = [( local_path, [], [deunicodise(local_uri.basename())] )] single_file = True loc_list = SortedDict(ignore_case = False) for root, dirs, files in filelist: rel_root = root.replace(local_path, local_base, 1) for f in files: full_name = os.path.join(root, f) if not os.path.isfile(full_name): continue if os.path.islink(full_name): if not cfg.follow_symlinks: continue relative_file = unicodise(os.path.join(rel_root, f)) if os.path.sep != "/": # Convert non-unix dir separators to '/' relative_file = "/".join(relative_file.split(os.path.sep)) if cfg.urlencoding_mode == "normal": relative_file = replace_nonprintables(relative_file) if relative_file.startswith('./'): relative_file = relative_file[2:] sr = os.stat_result(os.lstat(full_name)) loc_list[relative_file] = { 'full_name_unicode' : unicodise(full_name), 'full_name' : full_name, 'size' : sr.st_size, 'mtime' : sr.st_mtime, ## TODO: Possibly more to save here... } return loc_list, single_file cfg = Config() local_uris = [] local_list = SortedDict(ignore_case = False) single_file = False if type(args) not in (list, tuple): args = [args] if recursive == None: recursive = cfg.recursive for arg in args: uri = S3Uri(arg) if not uri.type == 'file': raise ParameterError("Expecting filename or directory instead of: %s" % arg) if uri.isdir() and not recursive: raise ParameterError("Use --recursive to upload a directory: %s" % arg) local_uris.append(uri) for uri in local_uris: list_for_uri, single_file = _get_filelist_local(uri) local_list.update(list_for_uri) ## Single file is True if and only if the user ## specified one local URI and that URI represents ## a FILE. Ie it is False if the URI was of a DIR ## and that dir contained only one FILE. That's not ## a case of single_file==True. 
if len(local_list) > 1: single_file = False return local_list, single_file def fetch_remote_list(args, require_attribs = False, recursive = None): def _get_filelist_remote(remote_uri, recursive = True): ## If remote_uri ends with '/' then all remote files will have ## the remote_uri prefix removed in the relative path. ## If, on the other hand, the remote_uri ends with something else ## (probably alphanumeric symbol) we'll use the last path part ## in the relative path. ## ## Complicated, eh? See an example: ## _get_filelist_remote("s3://bckt/abc/def") may yield: ## { 'def/file1.jpg' : {}, 'def/xyz/blah.txt' : {} } ## _get_filelist_remote("s3://bckt/abc/def/") will yield: ## { 'file1.jpg' : {}, 'xyz/blah.txt' : {} } ## Furthermore a prefix-magic can restrict the return list: ## _get_filelist_remote("s3://bckt/abc/def/x") yields: ## { 'xyz/blah.txt' : {} } info(u"Retrieving list of remote files for %s ..." % remote_uri) s3 = S3(Config()) response = s3.bucket_list(remote_uri.bucket(), prefix = remote_uri.object(), recursive = recursive) rem_base_original = rem_base = remote_uri.object() remote_uri_original = remote_uri if rem_base != '' and rem_base[-1] != '/': rem_base = rem_base[:rem_base.rfind('/')+1] remote_uri = S3Uri("s3://%s/%s" % (remote_uri.bucket(), rem_base)) rem_base_len = len(rem_base) rem_list = SortedDict(ignore_case = False) break_now = False for object in response['list']: if object['Key'] == rem_base_original and object['Key'][-1] != os.path.sep: ## We asked for one file and we got that file :-) key = os.path.basename(object['Key']) object_uri_str = remote_uri_original.uri() break_now = True rem_list = {} ## Remove whatever has already been put to rem_list else: key = object['Key'][rem_base_len:] ## Beware - this may be '' if object['Key']==rem_base !! 
object_uri_str = remote_uri.uri() + key rem_list[key] = { 'size' : int(object['Size']), 'timestamp' : dateS3toUnix(object['LastModified']), ## Sadly it's upload time, not our lastmod time :-( 'md5' : object['ETag'][1:-1], 'object_key' : object['Key'], 'object_uri_str' : object_uri_str, 'base_uri' : remote_uri, } if break_now: break return rem_list cfg = Config() remote_uris = [] remote_list = SortedDict(ignore_case = False) if type(args) not in (list, tuple): args = [args] if recursive == None: recursive = cfg.recursive for arg in args: uri = S3Uri(arg) if not uri.type == 's3': raise ParameterError("Expecting S3 URI instead of '%s'" % arg) remote_uris.append(uri) if recursive: for uri in remote_uris: objectlist = _get_filelist_remote(uri) for key in objectlist: remote_list[key] = objectlist[key] else: for uri in remote_uris: uri_str = str(uri) ## Wildcards used in remote URI? ## If yes we'll need a bucket listing... if uri_str.find('*') > -1 or uri_str.find('?') > -1: first_wildcard = uri_str.find('*') first_questionmark = uri_str.find('?') if first_questionmark > -1 and first_questionmark < first_wildcard: first_wildcard = first_questionmark prefix = uri_str[:first_wildcard] rest = uri_str[first_wildcard+1:] ## Only request recursive listing if the 'rest' of the URI, ## i.e. 
the part after first wildcard, contains '/' need_recursion = rest.find('/') > -1 objectlist = _get_filelist_remote(S3Uri(prefix), recursive = need_recursion) for key in objectlist: ## Check whether the 'key' matches the requested wildcards if glob.fnmatch.fnmatch(objectlist[key]['object_uri_str'], uri_str): remote_list[key] = objectlist[key] else: ## No wildcards - simply append the given URI to the list key = os.path.basename(uri.object()) if not key: raise ParameterError(u"Expecting S3 URI with a filename or --recursive: %s" % uri.uri()) remote_item = { 'base_uri': uri, 'object_uri_str': unicode(uri), 'object_key': uri.object() } if require_attribs: response = S3(cfg).object_info(uri) remote_item.update({ 'size': int(response['headers']['content-length']), 'md5': response['headers']['etag'].strip('"\''), 'timestamp' : dateRFC822toUnix(response['headers']['date']) }) remote_list[key] = remote_item return remote_list def compare_filelists(src_list, dst_list, src_remote, dst_remote): def __direction_str(is_remote): return is_remote and "remote" or "local" # We don't support local->local sync, use 'rsync' or something like that instead ;-) assert(not(src_remote == False and dst_remote == False)) info(u"Verifying attributes...") cfg = Config() exists_list = SortedDict(ignore_case = False) debug("Comparing filelists (direction: %s -> %s)" % (__direction_str(src_remote), __direction_str(dst_remote))) debug("src_list.keys: %s" % src_list.keys()) debug("dst_list.keys: %s" % dst_list.keys()) for file in src_list.keys(): debug(u"CHECK: %s" % file) if dst_list.has_key(file): ## Was --skip-existing requested? 
            if cfg.skip_existing:
                debug(u"IGNR: %s (used --skip-existing)" % (file))
                exists_list[file] = src_list[file]
                del(src_list[file])
                ## Remove from destination-list, all that is left there will be deleted
                del(dst_list[file])
                continue

            attribs_match = True
            ## Check size first
            if 'size' in cfg.sync_checks and dst_list[file]['size'] != src_list[file]['size']:
                debug(u"XFER: %s (size mismatch: src=%s dst=%s)" % (file, src_list[file]['size'], dst_list[file]['size']))
                attribs_match = False

            ## Check MD5
            compare_md5 = 'md5' in cfg.sync_checks
            # Multipart-uploaded files don't have a valid MD5 sum - it ends with "...-NN"
            # The extra parentheses make the multipart test apply only when an MD5 check was requested.
            if compare_md5 and ((src_remote == True and src_list[file]['md5'].find("-") >= 0) or (dst_remote == True and dst_list[file]['md5'].find("-") >= 0)):
                compare_md5 = False
                info(u"Disabled MD5 check for %s" % file)
            if attribs_match and compare_md5:
                try:
                    if src_remote == False and dst_remote == True:
                        src_md5 = hash_file_md5(src_list[file]['full_name'])
                        dst_md5 = dst_list[file]['md5']
                    elif src_remote == True and dst_remote == False:
                        src_md5 = src_list[file]['md5']
                        dst_md5 = hash_file_md5(dst_list[file]['full_name'])
                    elif src_remote == True and dst_remote == True:
                        src_md5 = src_list[file]['md5']
                        dst_md5 = dst_list[file]['md5']
                except (IOError,OSError), e:
                    # MD5 sum verification failed - ignore that file altogether
                    debug(u"IGNR: %s (disappeared)" % (file))
                    warning(u"%s: file disappeared, ignoring." % (file))
                    del(src_list[file])
                    del(dst_list[file])
                    continue

                if src_md5 != dst_md5:
                    ## Checksums are different.
attribs_match = False debug(u"XFER: %s (md5 mismatch: src=%s dst=%s)" % (file, src_md5, dst_md5)) if attribs_match: ## Remove from source-list, all that is left there will be transferred debug(u"IGNR: %s (transfer not needed)" % file) exists_list[file] = src_list[file] del(src_list[file]) ## Remove from destination-list, all that is left there will be deleted del(dst_list[file]) return src_list, dst_list, exists_list # vim:et:ts=4:sts=4:ai s3cmd-1.1.0-beta3/S3/BidirMap.py0000644000175000001440000000205511700327105016670 0ustar mludvigusers00000000000000## Amazon S3 manager ## Author: Michal Ludvig ## http://www.logix.cz/michal ## License: GPL Version 2 class BidirMap(object): def __init__(self, **map): self.k2v = {} self.v2k = {} for key in map: self.__setitem__(key, map[key]) def __setitem__(self, key, value): if self.v2k.has_key(value): if self.v2k[value] != key: raise KeyError("Value '"+str(value)+"' already in use with key '"+str(self.v2k[value])+"'") try: del(self.v2k[self.k2v[key]]) except KeyError: pass self.k2v[key] = value self.v2k[value] = key def __getitem__(self, key): return self.k2v[key] def __str__(self): return self.v2k.__str__() def getkey(self, value): return self.v2k[value] def getvalue(self, key): return self.k2v[key] def keys(self): return [key for key in self.k2v] def values(self): return [value for value in self.v2k] # vim:et:ts=4:sts=4:ai s3cmd-1.1.0-beta3/S3/S3Uri.py0000644000175000001440000001464011701432613016153 0ustar mludvigusers00000000000000## Amazon S3 manager ## Author: Michal Ludvig ## http://www.logix.cz/michal ## License: GPL Version 2 import os import re import sys from BidirMap import BidirMap from logging import debug import S3 from Utils import unicodise, check_bucket_name_dns_conformity class S3Uri(object): type = None _subclasses = None def __new__(self, string): if not self._subclasses: ## Generate a list of all subclasses of S3Uri self._subclasses = [] dict = sys.modules[__name__].__dict__ for something in dict: if 
type(dict[something]) is not type(self): continue if issubclass(dict[something], self) and dict[something] != self: self._subclasses.append(dict[something]) for subclass in self._subclasses: try: instance = object.__new__(subclass) instance.__init__(string) return instance except ValueError, e: continue raise ValueError("%s: not a recognized URI" % string) def __str__(self): return self.uri() def __unicode__(self): return self.uri() def __repr__(self): return "<%s: %s>" % (self.__class__.__name__, self.__unicode__()) def public_url(self): raise ValueError("This S3 URI does not have Anonymous URL representation") def basename(self): return self.__unicode__().split("/")[-1] class S3UriS3(S3Uri): type = "s3" _re = re.compile("^s3://([^/]+)/?(.*)", re.IGNORECASE) def __init__(self, string): match = self._re.match(string) if not match: raise ValueError("%s: not a S3 URI" % string) groups = match.groups() self._bucket = groups[0] self._object = unicodise(groups[1]) def bucket(self): return self._bucket def object(self): return self._object def has_bucket(self): return bool(self._bucket) def has_object(self): return bool(self._object) def uri(self): return "/".join(["s3:/", self._bucket, self._object]) def is_dns_compatible(self): return check_bucket_name_dns_conformity(self._bucket) def public_url(self): if self.is_dns_compatible(): return "http://%s.s3.amazonaws.com/%s" % (self._bucket, self._object) else: return "http://s3.amazonaws.com/%s/%s" % (self._bucket, self._object) def host_name(self): if self.is_dns_compatible(): return "%s.s3.amazonaws.com" % (self._bucket) else: return "s3.amazonaws.com" @staticmethod def compose_uri(bucket, object = ""): return "s3://%s/%s" % (bucket, object) @staticmethod def httpurl_to_s3uri(http_url): m=re.match("(https?://)?([^/]+)/?(.*)", http_url, re.IGNORECASE) hostname, object = m.groups()[1:] hostname = hostname.lower() if hostname == "s3.amazonaws.com": ## old-style url: http://s3.amazonaws.com/bucket/object if object.count("/") 
== 0: ## no object given bucket = object object = "" else: ## bucket/object bucket, object = object.split("/", 1) elif hostname.endswith(".s3.amazonaws.com"): ## new-style url: http://bucket.s3.amazonaws.com/object bucket = hostname[:-(len(".s3.amazonaws.com"))] else: raise ValueError("Unable to parse URL: %s" % http_url) return S3Uri("s3://%(bucket)s/%(object)s" % { 'bucket' : bucket, 'object' : object }) class S3UriS3FS(S3Uri): type = "s3fs" _re = re.compile("^s3fs://([^/]*)/?(.*)", re.IGNORECASE) def __init__(self, string): match = self._re.match(string) if not match: raise ValueError("%s: not a S3fs URI" % string) groups = match.groups() self._fsname = groups[0] self._path = unicodise(groups[1]).split("/") def fsname(self): return self._fsname def path(self): return "/".join(self._path) def uri(self): return "/".join(["s3fs:/", self._fsname, self.path()]) class S3UriFile(S3Uri): type = "file" _re = re.compile("^(\w+://)?(.*)") def __init__(self, string): match = self._re.match(string) groups = match.groups() if groups[0] not in (None, "file://"): raise ValueError("%s: not a file:// URI" % string) self._path = unicodise(groups[1]).split("/") def path(self): return "/".join(self._path) def uri(self): return "/".join(["file:/", self.path()]) def isdir(self): return os.path.isdir(self.path()) def dirname(self): return os.path.dirname(self.path()) class S3UriCloudFront(S3Uri): type = "cf" _re = re.compile("^cf://([^/]*)/*(.*)", re.IGNORECASE) def __init__(self, string): match = self._re.match(string) if not match: raise ValueError("%s: not a CloudFront URI" % string) groups = match.groups() self._dist_id = groups[0] self._request_id = groups[1] != "/" and groups[1] or None def dist_id(self): return self._dist_id def request_id(self): return self._request_id def uri(self): uri = "cf://" + self.dist_id() if self.request_id(): uri += "/" + self.request_id() return uri if __name__ == "__main__": uri = S3Uri("s3://bucket/object") print "type() =", type(uri) print "uri 
=", uri print "uri.type=", uri.type print "bucket =", uri.bucket() print "object =", uri.object() print uri = S3Uri("s3://bucket") print "type() =", type(uri) print "uri =", uri print "uri.type=", uri.type print "bucket =", uri.bucket() print uri = S3Uri("s3fs://filesystem1/path/to/remote/file.txt") print "type() =", type(uri) print "uri =", uri print "uri.type=", uri.type print "path =", uri.path() print uri = S3Uri("/path/to/local/file.txt") print "type() =", type(uri) print "uri =", uri print "uri.type=", uri.type print "path =", uri.path() print uri = S3Uri("cf://1234567890ABCD/") print "type() =", type(uri) print "uri =", uri print "uri.type=", uri.type print "dist_id =", uri.dist_id() print # vim:et:ts=4:sts=4:ai s3cmd-1.1.0-beta3/S3/Config.py0000644000175000001440000001674611701534227016430 0ustar mludvigusers00000000000000## Amazon S3 manager ## Author: Michal Ludvig ## http://www.logix.cz/michal ## License: GPL Version 2 import logging from logging import debug, info, warning, error import re import os import Progress from SortedDict import SortedDict class Config(object): _instance = None _parsed_files = [] _doc = {} access_key = "" secret_key = "" host_base = "s3.amazonaws.com" host_bucket = "%(bucket)s.s3.amazonaws.com" simpledb_host = "sdb.amazonaws.com" cloudfront_host = "cloudfront.amazonaws.com" verbosity = logging.WARNING progress_meter = True progress_class = Progress.ProgressCR send_chunk = 4096 recv_chunk = 4096 list_md5 = False human_readable_sizes = False extra_headers = SortedDict(ignore_case = True) force = False enable = None get_continue = False skip_existing = False recursive = False acl_public = None acl_grants = [] acl_revokes = [] proxy_host = "" proxy_port = 3128 encrypt = False dry_run = False preserve_attrs = True preserve_attrs_list = [ 'uname', # Verbose owner Name (e.g. 'root') 'uid', # Numeric user ID (e.g. 0) 'gname', # Group name (e.g. 'users') 'gid', # Numeric group ID (e.g. 
100) 'atime', # Last access timestamp 'mtime', # Modification timestamp 'ctime', # Creation timestamp 'mode', # File mode (e.g. rwxr-xr-x = 755) #'acl', # Full ACL (not yet supported) ] delete_removed = False _doc['delete_removed'] = "[sync] Remove remote S3 objects when local file has been deleted" gpg_passphrase = "" gpg_command = "" gpg_encrypt = "%(gpg_command)s -c --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s" gpg_decrypt = "%(gpg_command)s -d --verbose --no-use-agent --batch --yes --passphrase-fd %(passphrase_fd)s -o %(output_file)s %(input_file)s" use_https = False bucket_location = "US" default_mime_type = "binary/octet-stream" guess_mime_type = True mime_type = "" enable_multipart = True multipart_chunk_size_mb = 15 # MB # List of checks to be performed for 'sync' sync_checks = ['size', 'md5'] # 'weak-timestamp' # List of compiled REGEXPs exclude = [] include = [] # Dict mapping compiled REGEXPs back to their textual form debug_exclude = {} debug_include = {} encoding = "utf-8" urlencoding_mode = "normal" log_target_prefix = "" reduced_redundancy = False follow_symlinks = False socket_timeout = 300 invalidate_on_cf = False website_index = "index.html" website_error = "" website_endpoint = "http://%(bucket)s.s3-website-%(location)s.amazonaws.com/" ## Creating a singleton def __new__(self, configfile = None): if self._instance is None: self._instance = object.__new__(self) return self._instance def __init__(self, configfile = None): if configfile: self.read_config_file(configfile) def option_list(self): retval = [] for option in dir(self): ## Skip attributes that start with underscore or are not string, int or bool option_type = type(getattr(Config, option)) if option.startswith("_") or \ not (option_type in ( type("string"), # str type(42), # int type(True))): # bool continue retval.append(option) return retval def read_config_file(self, configfile): cp = ConfigParser(configfile) for option in 
self.option_list():
            self.update_option(option, cp.get(option))
        self._parsed_files.append(configfile)

    def dump_config(self, stream):
        ConfigDumper(stream).dump("default", self)

    def update_option(self, option, value):
        if value is None:
            return
        #### Handle environment reference
        if str(value).startswith("$"):
            return self.update_option(option, os.getenv(str(value)[1:]))
        #### Special treatment of some options
        ## verbosity must be known to "logging" module
        if option == "verbosity":
            try:
                setattr(Config, "verbosity", logging._levelNames[value])
            except KeyError:
                error("Config: verbosity level '%s' is not valid" % value)
        ## allow yes/no, true/false, on/off and 1/0 for boolean options
        elif type(getattr(Config, option)) is type(True):   # bool
            if str(value).lower() in ("true", "yes", "on", "1"):
                setattr(Config, option, True)
            elif str(value).lower() in ("false", "no", "off", "0"):
                setattr(Config, option, False)
            else:
                error("Config: value of option '%s' must be Yes or No, not '%s'" % (option, value))
        elif type(getattr(Config, option)) is type(42):     # int
            try:
                setattr(Config, option, int(value))
            except ValueError, e:
                error("Config: value of option '%s' must be an integer, not '%s'" % (option, value))
        else:                                               # string
            setattr(Config, option, value)

class ConfigParser(object):
    def __init__(self, file, sections = []):
        self.cfg = {}
        self.parse_file(file, sections)

    def parse_file(self, file, sections = []):
        debug("ConfigParser: Reading file '%s'" % file)
        if type(sections) != type([]):
            sections = [sections]
        in_our_section = True
        f = open(file, "r")
        r_comment = re.compile("^\s*#.*")
        r_empty = re.compile("^\s*$")
        r_section = re.compile("^\[([^\]]+)\]")
        r_data = re.compile("^\s*(?P<key>\w+)\s*=\s*(?P<value>.*)")
        r_quotes = re.compile("^\"(.*)\"\s*$")
        for line in f:
            if r_comment.match(line) or r_empty.match(line):
                continue
            is_section = r_section.match(line)
            if is_section:
                section = is_section.groups()[0]
                in_our_section = (section in sections) or (len(sections) == 0)
                continue
            is_data = r_data.match(line)
            if
is_data and in_our_section:
                data = is_data.groupdict()
                if r_quotes.match(data["value"]):
                    data["value"] = data["value"][1:-1]
                self.__setitem__(data["key"], data["value"])
                if data["key"] in ("access_key", "secret_key", "gpg_passphrase"):
                    print_value = (data["value"][:2]+"...%d_chars..."+data["value"][-1:]) % (len(data["value"]) - 3)
                else:
                    print_value = data["value"]
                debug("ConfigParser: %s->%s" % (data["key"], print_value))
                continue
            warning("Ignoring invalid line in '%s': %s" % (file, line))

    def __getitem__(self, name):
        return self.cfg[name]

    def __setitem__(self, name, value):
        self.cfg[name] = value

    def get(self, name, default = None):
        if self.cfg.has_key(name):
            return self.cfg[name]
        return default

class ConfigDumper(object):
    def __init__(self, stream):
        self.stream = stream

    def dump(self, section, config):
        self.stream.write("[%s]\n" % section)
        for option in config.option_list():
            self.stream.write("%s = %s\n" % (option, getattr(config, option)))

# vim:et:ts=4:sts=4:ai
s3cmd-1.1.0-beta3/S3/Exceptions.py0000644000175000001440000000512311700327105017321 0ustar mludvigusers00000000000000
## Amazon S3 manager - Exceptions library
## Author: Michal Ludvig
## http://www.logix.cz/michal
## License: GPL Version 2

from Utils import getTreeFromXml, unicodise, deunicodise
from logging import debug, info, warning, error

try:
    import xml.etree.ElementTree as ET
except ImportError:
    import elementtree.ElementTree as ET

class S3Exception(Exception):
    def __init__(self, message = ""):
        self.message = unicodise(message)

    def __str__(self):
        ## Call unicode(self) instead of self.message because
        ## __unicode__() method could be overridden in subclasses!
return deunicodise(unicode(self)) def __unicode__(self): return self.message ## (Base)Exception.message has been deprecated in Python 2.6 def _get_message(self): return self._message def _set_message(self, message): self._message = message message = property(_get_message, _set_message) class S3Error (S3Exception): def __init__(self, response): self.status = response["status"] self.reason = response["reason"] self.info = { "Code" : "", "Message" : "", "Resource" : "" } debug("S3Error: %s (%s)" % (self.status, self.reason)) if response.has_key("headers"): for header in response["headers"]: debug("HttpHeader: %s: %s" % (header, response["headers"][header])) if response.has_key("data"): tree = getTreeFromXml(response["data"]) error_node = tree if not error_node.tag == "Error": error_node = tree.find(".//Error") for child in error_node.getchildren(): if child.text != "": debug("ErrorXML: " + child.tag + ": " + repr(child.text)) self.info[child.tag] = child.text self.code = self.info["Code"] self.message = self.info["Message"] self.resource = self.info["Resource"] def __unicode__(self): retval = u"%d " % (self.status) retval += (u"(%s)" % (self.info.has_key("Code") and self.info["Code"] or self.reason)) if self.info.has_key("Message"): retval += (u": %s" % self.info["Message"]) return retval class CloudFrontError(S3Error): pass class S3UploadError(S3Exception): pass class S3DownloadError(S3Exception): pass class S3RequestError(S3Exception): pass class S3ResponseError(S3Exception): pass class InvalidFileError(S3Exception): pass class ParameterError(S3Exception): pass # vim:et:ts=4:sts=4:ai s3cmd-1.1.0-beta3/S3/S3.py0000644000175000001440000011137711703440625015504 0ustar mludvigusers00000000000000## Amazon S3 manager ## Author: Michal Ludvig ## http://www.logix.cz/michal ## License: GPL Version 2 import sys import os, os.path import time import httplib import logging import mimetypes import re from logging import debug, info, warning, error from stat import ST_SIZE try: 
from hashlib import md5 except ImportError: from md5 import md5 from Utils import * from SortedDict import SortedDict from AccessLog import AccessLog from ACL import ACL, GranteeLogDelivery from BidirMap import BidirMap from Config import Config from Exceptions import * from MultiPart import MultiPartUpload from S3Uri import S3Uri try: import magic try: ## https://github.com/ahupp/python-magic magic_ = magic.Magic(mime=True) def mime_magic(file): return magic_.from_file(file) except (TypeError, AttributeError): ## Older python-magic versions magic_ = magic.open(magic.MAGIC_MIME) magic_.load() def mime_magic(file): return magic_.file(file) except ImportError, e: if str(e).find("magic") >= 0: magic_message = "Module python-magic is not available." else: magic_message = "Module python-magic can't be used (%s)." % e.message magic_message += " Guessing MIME types based on file extensions." magic_warned = False def mime_magic(file): global magic_warned if (not magic_warned): warning(magic_message) magic_warned = True return mimetypes.guess_type(file)[0] __all__ = [] class S3Request(object): def __init__(self, s3, method_string, resource, headers, params = {}): self.s3 = s3 self.headers = SortedDict(headers or {}, ignore_case = True) self.resource = resource self.method_string = method_string self.params = params self.update_timestamp() self.sign() def update_timestamp(self): if self.headers.has_key("date"): del(self.headers["date"]) self.headers["x-amz-date"] = time.strftime("%a, %d %b %Y %H:%M:%S +0000", time.gmtime()) def format_param_str(self): """ Format URL parameters from self.params and returns ?parm1=val1&parm2=val2 or an empty string if there are no parameters. Output of this function should be appended directly to self.resource['uri'] """ param_str = "" for param in self.params: if self.params[param] not in (None, ""): param_str += "&%s=%s" % (param, self.params[param]) else: param_str += "&%s" % param return param_str and "?" 
+ param_str[1:] def sign(self): h = self.method_string + "\n" h += self.headers.get("content-md5", "")+"\n" h += self.headers.get("content-type", "")+"\n" h += self.headers.get("date", "")+"\n" for header in self.headers.keys(): if header.startswith("x-amz-"): h += header+":"+str(self.headers[header])+"\n" if self.resource['bucket']: h += "/" + self.resource['bucket'] h += self.resource['uri'] debug("SignHeaders: " + repr(h)) signature = sign_string(h) self.headers["Authorization"] = "AWS "+self.s3.config.access_key+":"+signature def get_triplet(self): self.update_timestamp() self.sign() resource = dict(self.resource) ## take a copy resource['uri'] += self.format_param_str() return (self.method_string, resource, self.headers) class S3(object): http_methods = BidirMap( GET = 0x01, PUT = 0x02, HEAD = 0x04, DELETE = 0x08, POST = 0x10, MASK = 0x1F, ) targets = BidirMap( SERVICE = 0x0100, BUCKET = 0x0200, OBJECT = 0x0400, MASK = 0x0700, ) operations = BidirMap( UNDFINED = 0x0000, LIST_ALL_BUCKETS = targets["SERVICE"] | http_methods["GET"], BUCKET_CREATE = targets["BUCKET"] | http_methods["PUT"], BUCKET_LIST = targets["BUCKET"] | http_methods["GET"], BUCKET_DELETE = targets["BUCKET"] | http_methods["DELETE"], OBJECT_PUT = targets["OBJECT"] | http_methods["PUT"], OBJECT_GET = targets["OBJECT"] | http_methods["GET"], OBJECT_HEAD = targets["OBJECT"] | http_methods["HEAD"], OBJECT_DELETE = targets["OBJECT"] | http_methods["DELETE"], OBJECT_POST = targets["OBJECT"] | http_methods["POST"], ) codes = { "NoSuchBucket" : "Bucket '%s' does not exist", "AccessDenied" : "Access to bucket '%s' was denied", "BucketAlreadyExists" : "Bucket '%s' already exists", } ## S3 sometimes sends HTTP-307 response redir_map = {} ## Maximum attempts of re-issuing failed requests _max_retries = 5 def __init__(self, config): self.config = config def get_connection(self, bucket): if self.config.proxy_host != "": return httplib.HTTPConnection(self.config.proxy_host, self.config.proxy_port) else: if 
self.config.use_https: return httplib.HTTPSConnection(self.get_hostname(bucket)) else: return httplib.HTTPConnection(self.get_hostname(bucket)) def get_hostname(self, bucket): if bucket and check_bucket_name_dns_conformity(bucket): if self.redir_map.has_key(bucket): host = self.redir_map[bucket] else: host = getHostnameFromBucket(bucket) else: host = self.config.host_base debug('get_hostname(%s): %s' % (bucket, host)) return host def set_hostname(self, bucket, redir_hostname): self.redir_map[bucket] = redir_hostname def format_uri(self, resource): if resource['bucket'] and not check_bucket_name_dns_conformity(resource['bucket']): uri = "/%s%s" % (resource['bucket'], resource['uri']) else: uri = resource['uri'] if self.config.proxy_host != "": uri = "http://%s%s" % (self.get_hostname(resource['bucket']), uri) debug('format_uri(): ' + uri) return uri ## Commands / Actions def list_all_buckets(self): request = self.create_request("LIST_ALL_BUCKETS") response = self.send_request(request) response["list"] = getListFromXml(response["data"], "Bucket") return response def bucket_list(self, bucket, prefix = None, recursive = None): def _list_truncated(data): ## can either be "true" or "false" or be missing completely is_truncated = getTextFromXml(data, ".//IsTruncated") or "false" return is_truncated.lower() != "false" def _get_contents(data): return getListFromXml(data, "Contents") def _get_common_prefixes(data): return getListFromXml(data, "CommonPrefixes") uri_params = {} truncated = True list = [] prefixes = [] while truncated: response = self.bucket_list_noparse(bucket, prefix, recursive, uri_params) current_list = _get_contents(response["data"]) current_prefixes = _get_common_prefixes(response["data"]) truncated = _list_truncated(response["data"]) if truncated: if current_list: uri_params['marker'] = self.urlencode_string(current_list[-1]["Key"]) else: uri_params['marker'] = self.urlencode_string(current_prefixes[-1]["Prefix"]) debug("Listing continues after '%s'" % 
uri_params['marker'])
            list += current_list
            prefixes += current_prefixes
        response['list'] = list
        response['common_prefixes'] = prefixes
        return response

    def bucket_list_noparse(self, bucket, prefix = None, recursive = None, uri_params = {}):
        if prefix:
            uri_params['prefix'] = self.urlencode_string(prefix)
        if not self.config.recursive and not recursive:
            uri_params['delimiter'] = "/"
        request = self.create_request("BUCKET_LIST", bucket = bucket, **uri_params)
        response = self.send_request(request)
        #debug(response)
        return response

    def bucket_create(self, bucket, bucket_location = None):
        headers = SortedDict(ignore_case = True)
        body = ""
        if bucket_location and bucket_location.strip().upper() != "US":
            bucket_location = bucket_location.strip()
            if bucket_location.upper() == "EU":
                bucket_location = bucket_location.upper()
            else:
                bucket_location = bucket_location.lower()
            body  = "<CreateBucketConfiguration><LocationConstraint>"
            body += bucket_location
            body += "</LocationConstraint></CreateBucketConfiguration>"
            debug("bucket_location: " + body)
            check_bucket_name(bucket, dns_strict = True)
        else:
            check_bucket_name(bucket, dns_strict = False)
        if self.config.acl_public:
            headers["x-amz-acl"] = "public-read"
        request = self.create_request("BUCKET_CREATE", bucket = bucket, headers = headers)
        response = self.send_request(request, body)
        return response

    def bucket_delete(self, bucket):
        request = self.create_request("BUCKET_DELETE", bucket = bucket)
        response = self.send_request(request)
        return response

    def get_bucket_location(self, uri):
        request = self.create_request("BUCKET_LIST", bucket = uri.bucket(), extra = "?location")
        response = self.send_request(request)
        location = getTextFromXml(response['data'], "LocationConstraint")
        if not location or location in [ "", "US" ]:
            location = "us-east-1"
        elif location == "EU":
            location = "eu-west-1"
        return location

    def bucket_info(self, uri):
        # For now reports only "Location". One day perhaps more.
        response = {}
        response['bucket-location'] = self.get_bucket_location(uri)
        return response

    def website_info(self, uri, bucket_location = None):
        headers = SortedDict(ignore_case = True)
        bucket = uri.bucket()
        body = ""

        request = self.create_request("BUCKET_LIST", bucket = bucket, extra="?website")
        try:
            response = self.send_request(request, body)
            response['index_document'] = getTextFromXml(response['data'], ".//IndexDocument//Suffix")
            response['error_document'] = getTextFromXml(response['data'], ".//ErrorDocument//Key")
            response['website_endpoint'] = self.config.website_endpoint % { "bucket" : uri.bucket(), "location" : self.get_bucket_location(uri)}
            return response
        except S3Error, e:
            if e.status == 404:
                debug("Could not get /?website - website probably not configured for this bucket")
                return None
            raise

    def website_create(self, uri, bucket_location = None):
        headers = SortedDict(ignore_case = True)
        bucket = uri.bucket()
        body = '<WebsiteConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
        body += '  <IndexDocument>'
        body += ('    <Suffix>%s</Suffix>' % self.config.website_index)
        body += '  </IndexDocument>'
        if self.config.website_error:
            body += '  <ErrorDocument>'
            body += ('    <Key>%s</Key>' % self.config.website_error)
            body += '  </ErrorDocument>'
        body += '</WebsiteConfiguration>'

        request = self.create_request("BUCKET_CREATE", bucket = bucket, extra="?website")
        debug("About to send request '%s' with body '%s'" % (request, body))
        response = self.send_request(request, body)
        debug("Received response '%s'" % (response))

        return response

    def website_delete(self, uri, bucket_location = None):
        headers = SortedDict(ignore_case = True)
        bucket = uri.bucket()
        body = ""

        request = self.create_request("BUCKET_DELETE", bucket = bucket, extra="?website")
        debug("About to send request '%s' with body '%s'" % (request, body))
        response = self.send_request(request, body)
        debug("Received response '%s'" % (response))

        if response['status'] != 204:
            raise S3ResponseError("Expected status 204: %s" % response)

        return response

    def object_put(self, filename, uri, extra_headers = None, extra_label = ""):
        # TODO TODO
        # Make it consistent with stream-oriented object_get()
        if
uri.type != "s3": raise ValueError("Expected URI type 's3', got '%s'" % uri.type) if not os.path.isfile(filename): raise InvalidFileError(u"%s is not a regular file" % unicodise(filename)) try: file = open(filename, "rb") size = os.stat(filename)[ST_SIZE] except (IOError, OSError), e: raise InvalidFileError(u"%s: %s" % (unicodise(filename), e.strerror)) headers = SortedDict(ignore_case = True) if extra_headers: headers.update(extra_headers) ## MIME-type handling content_type = self.config.mime_type if not content_type and self.config.guess_mime_type: content_type = mime_magic(filename) if not content_type: content_type = self.config.default_mime_type debug("Content-Type set to '%s'" % content_type) headers["content-type"] = content_type ## Other Amazon S3 attributes if self.config.acl_public: headers["x-amz-acl"] = "public-read" if self.config.reduced_redundancy: headers["x-amz-storage-class"] = "REDUCED_REDUNDANCY" ## Multipart decision multipart = False if self.config.enable_multipart: if size > self.config.multipart_chunk_size_mb * 1024 * 1024: multipart = True if multipart: # Multipart requests are quite different... drop here return self.send_file_multipart(file, headers, uri, size) ## Not multipart... 
headers["content-length"] = size request = self.create_request("OBJECT_PUT", uri = uri, headers = headers) labels = { 'source' : unicodise(filename), 'destination' : unicodise(uri.uri()), 'extra' : extra_label } response = self.send_file(request, file, labels) return response def object_get(self, uri, stream, start_position = 0, extra_label = ""): if uri.type != "s3": raise ValueError("Expected URI type 's3', got '%s'" % uri.type) request = self.create_request("OBJECT_GET", uri = uri) labels = { 'source' : unicodise(uri.uri()), 'destination' : unicodise(stream.name), 'extra' : extra_label } response = self.recv_file(request, stream, labels, start_position) return response def object_delete(self, uri): if uri.type != "s3": raise ValueError("Expected URI type 's3', got '%s'" % uri.type) request = self.create_request("OBJECT_DELETE", uri = uri) response = self.send_request(request) return response def object_copy(self, src_uri, dst_uri, extra_headers = None): if src_uri.type != "s3": raise ValueError("Expected URI type 's3', got '%s'" % src_uri.type) if dst_uri.type != "s3": raise ValueError("Expected URI type 's3', got '%s'" % dst_uri.type) headers = SortedDict(ignore_case = True) headers['x-amz-copy-source'] = "/%s/%s" % (src_uri.bucket(), self.urlencode_string(src_uri.object())) ## TODO: For now COPY, later maybe add a switch? 
headers['x-amz-metadata-directive'] = "COPY" if self.config.acl_public: headers["x-amz-acl"] = "public-read" if self.config.reduced_redundancy: headers["x-amz-storage-class"] = "REDUCED_REDUNDANCY" # if extra_headers: # headers.update(extra_headers) request = self.create_request("OBJECT_PUT", uri = dst_uri, headers = headers) response = self.send_request(request) return response def object_move(self, src_uri, dst_uri, extra_headers = None): response_copy = self.object_copy(src_uri, dst_uri, extra_headers) debug("Object %s copied to %s" % (src_uri, dst_uri)) if getRootTagName(response_copy["data"]) == "CopyObjectResult": response_delete = self.object_delete(src_uri) debug("Object %s deleted" % src_uri) return response_copy def object_info(self, uri): request = self.create_request("OBJECT_HEAD", uri = uri) response = self.send_request(request) return response def get_acl(self, uri): if uri.has_object(): request = self.create_request("OBJECT_GET", uri = uri, extra = "?acl") else: request = self.create_request("BUCKET_LIST", bucket = uri.bucket(), extra = "?acl") response = self.send_request(request) acl = ACL(response['data']) return acl def set_acl(self, uri, acl): if uri.has_object(): request = self.create_request("OBJECT_PUT", uri = uri, extra = "?acl") else: request = self.create_request("BUCKET_CREATE", bucket = uri.bucket(), extra = "?acl") body = str(acl) debug(u"set_acl(%s): acl-xml: %s" % (uri, body)) response = self.send_request(request, body) return response def get_accesslog(self, uri): request = self.create_request("BUCKET_LIST", bucket = uri.bucket(), extra = "?logging") response = self.send_request(request) accesslog = AccessLog(response['data']) return accesslog def set_accesslog_acl(self, uri): acl = self.get_acl(uri) debug("Current ACL(%s): %s" % (uri.uri(), str(acl))) acl.appendGrantee(GranteeLogDelivery("READ_ACP")) acl.appendGrantee(GranteeLogDelivery("WRITE")) debug("Updated ACL(%s): %s" % (uri.uri(), str(acl))) self.set_acl(uri, acl) def 
set_accesslog(self, uri, enable, log_target_prefix_uri = None, acl_public = False):
        request = self.create_request("BUCKET_CREATE", bucket = uri.bucket(), extra = "?logging")
        accesslog = AccessLog()
        if enable:
            accesslog.enableLogging(log_target_prefix_uri)
            accesslog.setAclPublic(acl_public)
        else:
            accesslog.disableLogging()
        body = str(accesslog)
        debug(u"set_accesslog(%s): accesslog-xml: %s" % (uri, body))
        try:
            response = self.send_request(request, body)
        except S3Error, e:
            if e.info['Code'] == "InvalidTargetBucketForLogging":
                info("Setting up log-delivery ACL for target bucket.")
                self.set_accesslog_acl(S3Uri("s3://%s" % log_target_prefix_uri.bucket()))
                response = self.send_request(request, body)
            else:
                raise
        return accesslog, response

    ## Low level methods
    def urlencode_string(self, string, urlencoding_mode = None):
        if type(string) == unicode:
            string = string.encode("utf-8")

        if urlencoding_mode is None:
            urlencoding_mode = self.config.urlencoding_mode

        if urlencoding_mode == "verbatim":
            ## Don't do any pre-processing
            return string

        encoded = ""
        ## List of characters that must be escaped for S3
        ## Haven't found this in any official docs
        ## but my tests show it's more or less correct.
        ## If you start getting InvalidSignature errors
        ## from S3 check the error headers returned
        ## from S3 to see whether the list has
        ## changed.
        for c in string:
            # I'm not sure how to know in what encoding
            # 'object' is. Apparently "type(object)==str"
            # but the contents is a string of unicode
            # bytes, e.g. '\xc4\x8d\xc5\xafr\xc3\xa1k'
            # Don't know what it will do on non-utf8
            # systems.
            # [hope that sounds reassuring ;-)]
            o = ord(c)
            if (o < 0x20 or o == 0x7f):
                if urlencoding_mode == "fixbucket":
                    encoded += "%%%02X" % o
                else:
                    error(u"Non-printable character 0x%02x in: %s" % (o, string))
                    error(u"Please report it to s3tools-bugs@lists.sourceforge.net")
                    encoded += replace_nonprintables(c)
            elif (o == 0x20 or  # Space and below
                  o == 0x22 or  # "
                  o == 0x23 or  # #
                  o == 0x25 or  # % (escape character)
                  o == 0x26 or  # &
                  o == 0x2B or  # + (or it would become <space>)
                  o == 0x3C or  # <
                  o == 0x3E or  # >
                  o == 0x3F or  # ?
                  o == 0x60 or  # `
                  o >= 123):    # { and above, including >= 128 for UTF-8
                encoded += "%%%02X" % o
            else:
                encoded += c
        debug("String '%s' encoded to '%s'" % (string, encoded))
        return encoded

    def create_request(self, operation, uri = None, bucket = None, object = None, headers = None, extra = None, **params):
        resource = { 'bucket' : None, 'uri' : "/" }

        if uri and (bucket or object):
            raise ValueError("Both 'uri' and either 'bucket' or 'object' parameters supplied")
        ## If URI is given use that instead of bucket/object parameters
        if uri:
            bucket = uri.bucket()
            object = uri.has_object() and uri.object() or None

        if bucket:
            resource['bucket'] = str(bucket)
            if object:
                resource['uri'] = "/" + self.urlencode_string(object)
        if extra:
            resource['uri'] += extra

        method_string = S3.http_methods.getkey(S3.operations[operation] & S3.http_methods["MASK"])

        request = S3Request(self, method_string, resource, headers, params)

        debug("CreateRequest: resource[uri]=" + resource['uri'])
        return request

    def _fail_wait(self, retries):
        # Wait a few seconds. The more it fails the more we wait.
return (self._max_retries - retries + 1) * 3 def send_request(self, request, body = None, retries = _max_retries): method_string, resource, headers = request.get_triplet() debug("Processing request, please wait...") if not headers.has_key('content-length'): headers['content-length'] = body and len(body) or 0 try: # "Stringify" all headers for header in headers.keys(): headers[header] = str(headers[header]) conn = self.get_connection(resource['bucket']) uri = self.format_uri(resource) debug("Sending request method_string=%r, uri=%r, headers=%r, body=(%i bytes)" % (method_string, uri, headers, len(body or ""))) conn.request(method_string, uri, body, headers) response = {} http_response = conn.getresponse() response["status"] = http_response.status response["reason"] = http_response.reason response["headers"] = convertTupleListToDict(http_response.getheaders()) response["data"] = http_response.read() debug("Response: " + str(response)) conn.close() except Exception, e: if retries: warning("Retrying failed request: %s (%s)" % (resource['uri'], e)) warning("Waiting %d sec..." % self._fail_wait(retries)) time.sleep(self._fail_wait(retries)) return self.send_request(request, body, retries - 1) else: raise S3RequestError("Request failed for: %s" % resource['uri']) if response["status"] == 307: ## RedirectPermanent redir_bucket = getTextFromXml(response['data'], ".//Bucket") redir_hostname = getTextFromXml(response['data'], ".//Endpoint") self.set_hostname(redir_bucket, redir_hostname) warning("Redirected to: %s" % (redir_hostname)) return self.send_request(request, body) if response["status"] >= 500: e = S3Error(response) if retries: warning(u"Retrying failed request: %s" % resource['uri']) warning(unicode(e)) warning("Waiting %d sec..." 
                        % self._fail_wait(retries))
                time.sleep(self._fail_wait(retries))
                return self.send_request(request, body, retries - 1)
            else:
                raise e

        if response["status"] < 200 or response["status"] > 299:
            raise S3Error(response)

        return response

    def send_file(self, request, file, labels, throttle = 0, retries = _max_retries, offset = 0, chunk_size = -1):
        method_string, resource, headers = request.get_triplet()
        size_left = size_total = headers.get("content-length")
        if self.config.progress_meter:
            progress = self.config.progress_class(labels, size_total)
        else:
            info("Sending file '%s', please wait..." % file.name)
        timestamp_start = time.time()
        try:
            conn = self.get_connection(resource['bucket'])
            conn.connect()
            conn.putrequest(method_string, self.format_uri(resource))
            for header in headers.keys():
                conn.putheader(header, str(headers[header]))
            conn.endheaders()
        except Exception, e:
            if self.config.progress_meter:
                progress.done("failed")
            if retries:
                warning("Retrying failed request: %s (%s)" % (resource['uri'], e))
                warning("Waiting %d sec..."
                        % self._fail_wait(retries))
                time.sleep(self._fail_wait(retries))
                # Connection error -> same throttle value
                return self.send_file(request, file, labels, throttle, retries - 1, offset, chunk_size)
            else:
                raise S3UploadError("Upload failed for: %s" % resource['uri'])
        file.seek(offset)
        md5_hash = md5()
        try:
            while (size_left > 0):
                #debug("SendFile: Reading up to %d bytes from '%s'" % (self.config.send_chunk, file.name))
                data = file.read(min(self.config.send_chunk, size_left))
                md5_hash.update(data)
                conn.send(data)
                if self.config.progress_meter:
                    progress.update(delta_position = len(data))
                size_left -= len(data)
                if throttle:
                    time.sleep(throttle)
            md5_computed = md5_hash.hexdigest()
            response = {}
            http_response = conn.getresponse()
            response["status"] = http_response.status
            response["reason"] = http_response.reason
            response["headers"] = convertTupleListToDict(http_response.getheaders())
            response["data"] = http_response.read()
            response["size"] = size_total
            conn.close()
            debug(u"Response: %s" % response)
        except Exception, e:
            if self.config.progress_meter:
                progress.done("failed")
            if retries:
                if retries < self._max_retries:
                    throttle = throttle and throttle * 5 or 0.01
                warning("Upload failed: %s (%s)" % (resource['uri'], e))
                warning("Retrying on lower speed (throttle=%0.2f)" % throttle)
                warning("Waiting %d sec..." % self._fail_wait(retries))
                time.sleep(self._fail_wait(retries))
                # Connection error -> same throttle value
                return self.send_file(request, file, labels, throttle, retries - 1, offset, chunk_size)
            else:
                debug("Giving up on '%s' %s" % (file.name, e))
                raise S3UploadError("Upload failed for: %s" % resource['uri'])

        timestamp_end = time.time()
        response["elapsed"] = timestamp_end - timestamp_start
        response["speed"] = response["elapsed"] and float(response["size"]) / response["elapsed"] or float(-1)

        if self.config.progress_meter:
            ## The above conn.close() takes some time -> update() progress meter
            ## to correct the average speed.
            ## Otherwise people will complain that
            ## 'progress' and response["speed"] are inconsistent ;-)
            progress.update()
            progress.done("done")

        if response["status"] == 307:
            ## TemporaryRedirect
            redir_bucket = getTextFromXml(response['data'], ".//Bucket")
            redir_hostname = getTextFromXml(response['data'], ".//Endpoint")
            self.set_hostname(redir_bucket, redir_hostname)
            warning("Redirected to: %s" % (redir_hostname))
            return self.send_file(request, file, labels, offset = offset, chunk_size = chunk_size)

        # S3 from time to time doesn't send ETag back in a response :-(
        # Force re-upload here.
        if not response['headers'].has_key('etag'):
            response['headers']['etag'] = ''

        if response["status"] < 200 or response["status"] > 299:
            try_retry = False
            if response["status"] >= 500:
                ## AWS internal error - retry
                try_retry = True
            elif response["status"] >= 400:
                err = S3Error(response)
                ## Retriable client error?
                if err.code in [ 'BadDigest', 'OperationAborted', 'TokenRefreshRequired', 'RequestTimeout' ]:
                    try_retry = True

            if try_retry:
                if retries:
                    warning("Upload failed: %s (%s)" % (resource['uri'], S3Error(response)))
                    warning("Waiting %d sec..." % self._fail_wait(retries))
                    time.sleep(self._fail_wait(retries))
                    return self.send_file(request, file, labels, throttle, retries - 1, offset, chunk_size)
                else:
                    warning("Too many failures. Giving up on '%s'" % (file.name))
                    raise S3UploadError

            ## Non-recoverable error
            raise S3Error(response)

        debug("MD5 sums: computed=%s, received=%s" % (md5_computed, response["headers"]["etag"]))
        if response["headers"]["etag"].strip('"\'') != md5_hash.hexdigest():
            warning("MD5 Sums don't match!")
            if retries:
                warning("Retrying upload of %s" % (file.name))
                return self.send_file(request, file, labels, throttle, retries - 1, offset, chunk_size)
            else:
                warning("Too many failures. 
Giving up on '%s'" % (file.name))
                raise S3UploadError

        return response

    def send_file_multipart(self, file, headers, uri, size):
        chunk_size = self.config.multipart_chunk_size_mb * 1024 * 1024
        upload = MultiPartUpload(self, file, uri, headers)
        upload.upload_all_parts()
        response = upload.complete_multipart_upload()
        response["speed"] = 0 # XXX
        response["size"] = size
        return response

    def recv_file(self, request, stream, labels, start_position = 0, retries = _max_retries):
        method_string, resource, headers = request.get_triplet()
        if self.config.progress_meter:
            progress = self.config.progress_class(labels, 0)
        else:
            info("Receiving file '%s', please wait..." % stream.name)
        timestamp_start = time.time()
        try:
            conn = self.get_connection(resource['bucket'])
            conn.connect()
            conn.putrequest(method_string, self.format_uri(resource))
            for header in headers.keys():
                conn.putheader(header, str(headers[header]))
            if start_position > 0:
                debug("Requesting Range: %d .. end" % start_position)
                conn.putheader("Range", "bytes=%d-" % start_position)
            conn.endheaders()
            response = {}
            http_response = conn.getresponse()
            response["status"] = http_response.status
            response["reason"] = http_response.reason
            response["headers"] = convertTupleListToDict(http_response.getheaders())
            debug("Response: %s" % response)
        except Exception, e:
            if self.config.progress_meter:
                progress.done("failed")
            if retries:
                warning("Retrying failed request: %s (%s)" % (resource['uri'], e))
                warning("Waiting %d sec..."
                        % self._fail_wait(retries))
                time.sleep(self._fail_wait(retries))
                # Connection error -> same throttle value
                return self.recv_file(request, stream, labels, start_position, retries - 1)
            else:
                raise S3DownloadError("Download failed for: %s" % resource['uri'])

        if response["status"] == 307:
            ## TemporaryRedirect
            response['data'] = http_response.read()
            redir_bucket = getTextFromXml(response['data'], ".//Bucket")
            redir_hostname = getTextFromXml(response['data'], ".//Endpoint")
            self.set_hostname(redir_bucket, redir_hostname)
            warning("Redirected to: %s" % (redir_hostname))
            return self.recv_file(request, stream, labels)

        if response["status"] < 200 or response["status"] > 299:
            raise S3Error(response)

        if start_position == 0:
            # Only compute MD5 on the fly if we're downloading from the beginning.
            # Otherwise the result would be nonsense.
            md5_hash = md5()
        size_left = int(response["headers"]["content-length"])
        size_total = start_position + size_left
        current_position = start_position

        if self.config.progress_meter:
            progress.total_size = size_total
            progress.initial_position = current_position
            progress.current_position = current_position

        try:
            while (current_position < size_total):
                this_chunk = size_left > self.config.recv_chunk and self.config.recv_chunk or size_left
                data = http_response.read(this_chunk)
                stream.write(data)
                if start_position == 0:
                    md5_hash.update(data)
                current_position += len(data)
                ## Call progress meter from here...
                if self.config.progress_meter:
                    progress.update(delta_position = len(data))
            conn.close()
        except Exception, e:
            if self.config.progress_meter:
                progress.done("failed")
            if retries:
                warning("Retrying failed request: %s (%s)" % (resource['uri'], e))
                warning("Waiting %d sec..."
                        % self._fail_wait(retries))
                time.sleep(self._fail_wait(retries))
                # Connection error -> same throttle value
                return self.recv_file(request, stream, labels, current_position, retries - 1)
            else:
                raise S3DownloadError("Download failed for: %s" % resource['uri'])

        stream.flush()
        timestamp_end = time.time()

        if self.config.progress_meter:
            ## The above stream.flush() may take some time -> update() progress meter
            ## to correct the average speed. Otherwise people will complain that
            ## 'progress' and response["speed"] are inconsistent ;-)
            progress.update()
            progress.done("done")

        if start_position == 0:
            # Only compute MD5 on the fly if we were downloading from the beginning
            response["md5"] = md5_hash.hexdigest()
        else:
            # Otherwise try to compute MD5 of the output file
            try:
                response["md5"] = hash_file_md5(stream.name)
            except IOError, e:
                if e.errno != errno.ENOENT:
                    warning("Unable to open file: %s: %s" % (stream.name, e))
                warning("Unable to verify MD5. Assume it matches.")
                response["md5"] = response["headers"]["etag"]

        response["md5match"] = response["headers"]["etag"].find(response["md5"]) >= 0
        response["elapsed"] = timestamp_end - timestamp_start
        response["size"] = current_position
        response["speed"] = response["elapsed"] and float(response["size"]) / response["elapsed"] or float(-1)
        if response["size"] != start_position + long(response["headers"]["content-length"]):
            # content-length is a string; convert it before adding to start_position
            warning("Reported size (%s) does not match received size (%s)" % (
                start_position + long(response["headers"]["content-length"]), response["size"]))
        debug("ReceiveFile: Computed MD5 = %s" % response["md5"])
        if not response["md5match"]:
            warning("MD5 signatures do not match: computed=%s, received=%s" % (
                response["md5"], response["headers"]["etag"]))
        return response
__all__.append("S3")

# vim:et:ts=4:sts=4:ai
s3cmd-1.1.0-beta3/S3/SimpleDB.py
## Amazon SimpleDB library
## Author: Michal Ludvig
##         http://www.logix.cz/michal
## License: GPL Version 2
"""
Low-level class for working with Amazon SimpleDB
"""

import time
import urllib
import base64
import hmac
import sha
import httplib
from logging import debug, info, warning, error

from Utils import convertTupleListToDict
from SortedDict import SortedDict
from Exceptions import *

class SimpleDB(object):
    # API Version
    # See http://docs.amazonwebservices.com/AmazonSimpleDB/2007-11-07/DeveloperGuide/
    Version = "2007-11-07"
    SignatureVersion = 1

    def __init__(self, config):
        self.config = config

    ## ------------------------------------------------
    ## Methods implementing SimpleDB API
    ## ------------------------------------------------

    def ListDomains(self, MaxNumberOfDomains = 100):
        '''
        Lists all domains associated with our Access Key. Returns
        domain names up to the limit set by MaxNumberOfDomains.
        '''
        parameters = SortedDict()
        parameters['MaxNumberOfDomains'] = MaxNumberOfDomains
        return self.send_request("ListDomains", DomainName = None, parameters = parameters)

    def CreateDomain(self, DomainName):
        return self.send_request("CreateDomain", DomainName = DomainName)

    def DeleteDomain(self, DomainName):
        return self.send_request("DeleteDomain", DomainName = DomainName)

    def PutAttributes(self, DomainName, ItemName, Attributes):
        parameters = SortedDict()
        parameters['ItemName'] = ItemName
        seq = 0
        for attrib in Attributes:
            if type(Attributes[attrib]) == type(list()):
                for value in Attributes[attrib]:
                    parameters['Attribute.%d.Name' % seq] = attrib
                    parameters['Attribute.%d.Value' % seq] = unicode(value)
                    seq += 1
            else:
                parameters['Attribute.%d.Name' % seq] = attrib
                parameters['Attribute.%d.Value' % seq] = unicode(Attributes[attrib])
                seq += 1
        ## TODO:
        ## - support for Attribute.N.Replace
        ## - support for multiple values for one attribute
        return self.send_request("PutAttributes", DomainName = DomainName, parameters = parameters)

    def GetAttributes(self, DomainName, ItemName, Attributes = []):
        parameters = SortedDict()
        parameters['ItemName'] = ItemName
        seq = 0
        for attrib in Attributes:
            parameters['AttributeName.%d' % seq] = attrib
            seq += 1
        return self.send_request("GetAttributes", DomainName = DomainName, parameters = parameters)

    def DeleteAttributes(self, DomainName, ItemName, Attributes = {}):
        """
        Remove specified Attributes from ItemName.
        Attributes parameter can be either:
        - not specified, in which case the whole Item is removed
        - list, e.g. ['Attr1', 'Attr2'] in which case these attributes are removed
        - dict, e.g. {'Attr1' : 'One', 'Attr2' : 'Two'} in which case the
          specified values are removed from multi-value attributes.
        """
        parameters = SortedDict()
        parameters['ItemName'] = ItemName
        seq = 0
        for attrib in Attributes:
            parameters['Attribute.%d.Name' % seq] = attrib
            if type(Attributes) == type(dict()):
                parameters['Attribute.%d.Value' % seq] = unicode(Attributes[attrib])
            seq += 1
        return self.send_request("DeleteAttributes", DomainName = DomainName, parameters = parameters)

    def Query(self, DomainName, QueryExpression = None, MaxNumberOfItems = None, NextToken = None):
        parameters = SortedDict()
        if QueryExpression:
            parameters['QueryExpression'] = QueryExpression
        if MaxNumberOfItems:
            parameters['MaxNumberOfItems'] = MaxNumberOfItems
        if NextToken:
            parameters['NextToken'] = NextToken
        return self.send_request("Query", DomainName = DomainName, parameters = parameters)
        ## Handle NextToken?
        ## Or maybe not - let the upper level do it

    ## ------------------------------------------------
    ## Low-level methods for handling SimpleDB requests
    ## ------------------------------------------------

    def send_request(self, *args, **kwargs):
        request = self.create_request(*args, **kwargs)
        #debug("Request: %s" % repr(request))
        conn = self.get_connection()
        conn.request("GET", self.format_uri(request['uri_params']))
        http_response = conn.getresponse()
        response = {}
        response["status"] = http_response.status
        response["reason"] = http_response.reason
        response["headers"] = convertTupleListToDict(http_response.getheaders())
        response["data"] = http_response.read()
        conn.close()

        if response["status"] < 200 or response["status"] > 299:
            debug("Response: " + str(response))
            raise S3Error(response)

        return response

    def create_request(self, Action, DomainName, parameters = None):
        if not parameters:
            parameters = SortedDict()
        parameters['AWSAccessKeyId'] = self.config.access_key
        parameters['Version'] = self.Version
        parameters['SignatureVersion'] = self.SignatureVersion
        parameters['Action'] = Action
        parameters['Timestamp'] = time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime())
        if DomainName:
            parameters['DomainName'] = DomainName
        parameters['Signature'] = self.sign_request(parameters)
        parameters.keys_return_lowercase = False
        uri_params = urllib.urlencode(parameters)
        request = {}
        request['uri_params'] = uri_params
        request['parameters'] = parameters
        return request

    def sign_request(self, parameters):
        h = ""
        parameters.keys_sort_lowercase = True
        parameters.keys_return_lowercase = False
        for key in parameters:
            h += "%s%s" % (key, parameters[key])
        #debug("SignRequest: %s" % h)
        return base64.encodestring(hmac.new(self.config.secret_key, h, sha).digest()).strip()

    def get_connection(self):
        if self.config.proxy_host != "":
            return httplib.HTTPConnection(self.config.proxy_host, self.config.proxy_port)
        else:
            if self.config.use_https:
                return httplib.HTTPSConnection(self.config.simpledb_host)
            else:
                return \
                    httplib.HTTPConnection(self.config.simpledb_host)

    def format_uri(self, uri_params):
        if self.config.proxy_host != "":
            uri = "http://%s/?%s" % (self.config.simpledb_host, uri_params)
        else:
            uri = "/?%s" % uri_params
        #debug('format_uri(): ' + uri)
        return uri

# vim:et:ts=4:sts=4:ai
s3cmd-1.1.0-beta3/S3/MultiPart.py
## Amazon S3 Multipart upload support
## Author: Jerome Leclanche
## License: GPL Version 2

import os
from stat import ST_SIZE
from logging import debug, info, warning, error
from Utils import getTextFromXml, formatSize, unicodise
from Exceptions import S3UploadError

class MultiPartUpload(object):

    MIN_CHUNK_SIZE_MB = 5       # 5MB
    MAX_CHUNK_SIZE_MB = 5120    # 5GB
    MAX_FILE_SIZE = 5497558138880   # 5TB (5 * 1024**4 bytes)

    def __init__(self, s3, file, uri, headers_baseline = {}):
        self.s3 = s3
        self.file = file
        self.uri = uri
        self.parts = {}
        self.headers_baseline = headers_baseline
        self.upload_id = self.initiate_multipart_upload()

    def initiate_multipart_upload(self):
        """
        Begin a multipart upload
        http://docs.amazonwebservices.com/AmazonS3/latest/API/index.html?mpUploadInitiate.html
        """
        request = self.s3.create_request("OBJECT_POST", uri = self.uri, headers = self.headers_baseline, extra = "?uploads")
        response = self.s3.send_request(request)
        data = response["data"]
        self.upload_id = getTextFromXml(data, "UploadId")
        return self.upload_id

    def upload_all_parts(self):
        """
        Execute a full multipart upload on a file
        Returns the seq/etag dict
        TODO use num_processes to thread it
        """
        if not self.upload_id:
            raise RuntimeError("Attempting to use a multipart upload that has not been initiated.")

        size_left = file_size = os.stat(self.file.name)[ST_SIZE]
        self.chunk_size = self.s3.config.multipart_chunk_size_mb * 1024 * 1024
        nr_parts = file_size / self.chunk_size + (file_size % self.chunk_size and 1)
        debug("MultiPart: Uploading %s in %d parts" % (self.file.name, nr_parts))

        seq = 1
        while size_left > 0:
            offset = self.chunk_size * (seq -
                                        1)
            current_chunk_size = min(file_size - offset, self.chunk_size)
            size_left -= current_chunk_size
            labels = {
                'source' : unicodise(self.file.name),
                'destination' : unicodise(self.uri.uri()),
                'extra' : "[part %d of %d, %s]" % (seq, nr_parts, "%d%sB" % formatSize(current_chunk_size, human_readable = True))
            }
            try:
                self.upload_part(seq, offset, current_chunk_size, labels)
            except:
                error(u"Upload of '%s' part %d failed. Aborting multipart upload." % (self.file.name, seq))
                self.abort_upload()
                raise
            seq += 1

        debug("MultiPart: Upload finished: %d parts", seq - 1)

    def upload_part(self, seq, offset, chunk_size, labels):
        """
        Upload a file chunk
        http://docs.amazonwebservices.com/AmazonS3/latest/API/index.html?mpUploadUploadPart.html
        """
        # TODO implement Content-MD5
        debug("Uploading part %i of %r (%s bytes)" % (seq, self.upload_id, chunk_size))
        headers = { "content-length": chunk_size }
        query_string = "?partNumber=%i&uploadId=%s" % (seq, self.upload_id)
        request = self.s3.create_request("OBJECT_PUT", uri = self.uri, headers = headers, extra = query_string)
        response = self.s3.send_file(request, self.file, labels, offset = offset, chunk_size = chunk_size)
        self.parts[seq] = response["headers"]["etag"]
        return response

    def complete_multipart_upload(self):
        """
        Finish a multipart upload
        http://docs.amazonwebservices.com/AmazonS3/latest/API/index.html?mpUploadComplete.html
        """
        debug("MultiPart: Completing upload: %s" % self.upload_id)

        parts_xml = []
        part_xml = "<Part><PartNumber>%i</PartNumber><ETag>%s</ETag></Part>"
        # S3 expects the parts listed in ascending PartNumber order
        for seq, etag in sorted(self.parts.items()):
            parts_xml.append(part_xml % (seq, etag))
        body = "<CompleteMultipartUpload>%s</CompleteMultipartUpload>" % ("".join(parts_xml))

        headers = { "content-length": len(body) }
        request = self.s3.create_request("OBJECT_POST", uri = self.uri, headers = headers, extra = "?uploadId=%s" % (self.upload_id))
        response = self.s3.send_request(request, body = body)

        return response

    def abort_upload(self):
        """
        Abort multipart upload
        http://docs.amazonwebservices.com/AmazonS3/latest/API/index.html?mpUploadAbort.html
        """
        debug("MultiPart: Aborting upload: %s" %
                self.upload_id)
        request = self.s3.create_request("OBJECT_DELETE", uri = self.uri, extra = "?uploadId=%s" % (self.upload_id))
        response = self.s3.send_request(request)
        return response

# vim:et:ts=4:sts=4:ai
s3cmd-1.1.0-beta3/setup.py
from distutils.core import setup
import sys
import os

import S3.PkgInfo

if float("%d.%d" % sys.version_info[:2]) < 2.4:
    sys.stderr.write("Your Python version %d.%d.%d is not supported.\n" % sys.version_info[:3])
    sys.stderr.write("S3cmd requires Python 2.4 or newer.\n")
    sys.exit(1)

try:
    import xml.etree.ElementTree as ET
    print "Using xml.etree.ElementTree for XML processing"
except ImportError, e:
    sys.stderr.write(str(e) + "\n")
    try:
        import elementtree.ElementTree as ET
        print "Using elementtree.ElementTree for XML processing"
    except ImportError, e:
        sys.stderr.write(str(e) + "\n")
        sys.stderr.write("Please install ElementTree module from\n")
        sys.stderr.write("http://effbot.org/zone/element-index.htm\n")
        sys.exit(1)

try:
    ## Remove 'MANIFEST' file to force
    ## distutils to recreate it.
    ## Only in "sdist" stage. Otherwise
    ## it makes life difficult to packagers.
    if sys.argv[1] == "sdist":
        os.unlink("MANIFEST")
except:
    pass

## Re-create the manpage
## (Beware! Perl script on the loose!!)
if sys.argv[1] == "sdist":
    if os.stat_result(os.stat("s3cmd.1")).st_mtime < os.stat_result(os.stat("s3cmd")).st_mtime:
        sys.stderr.write("Re-create man page first!\n")
        sys.stderr.write("Run: ./s3cmd --help | ./format-manpage.pl > s3cmd.1\n")
        sys.exit(1)

## Don't install manpages and docs when $S3CMD_PACKAGING is set
## This was a requirement of Debian package maintainer.
if not os.getenv("S3CMD_PACKAGING"):
    man_path = os.getenv("S3CMD_INSTPATH_MAN") or "share/man"
    doc_path = os.getenv("S3CMD_INSTPATH_DOC") or "share/doc/packages"
    data_files = [
        (doc_path+"/s3cmd", [ "README", "INSTALL", "NEWS" ]),
        (man_path+"/man1", [ "s3cmd.1" ] ),
    ]
else:
    data_files = None

## Main distutils info
setup(
    ## Content description
    name = S3.PkgInfo.package,
    version = S3.PkgInfo.version,
    packages = [ 'S3' ],
    scripts = ['s3cmd'],
    data_files = data_files,

    ## Packaging details
    author = "Michal Ludvig",
    author_email = "michal@logix.cz",
    url = S3.PkgInfo.url,
    license = S3.PkgInfo.license,
    description = S3.PkgInfo.short_description,
    long_description = """
%s

Authors:
--------
    Michal Ludvig
""" % (S3.PkgInfo.long_description)
    )

# vim:et:ts=4:sts=4:ai
s3cmd-1.1.0-beta3/PKG-INFO
Metadata-Version: 1.0
Name: s3cmd
Version: 1.1.0-beta3
Summary: Command line tool for managing Amazon S3 and CloudFront services
Home-page: http://s3tools.org
Author: Michal Ludvig
Author-email: michal@logix.cz
License: GPL version 2
Description:
        S3cmd lets you copy files from/to Amazon S3
        (Simple Storage Service) using a simple to use
        command line client. Supports rsync-like backup,
        GPG encryption, and more. Also supports management
        of Amazon's CloudFront content delivery network.

        Authors:
        --------
            Michal Ludvig

Platform: UNKNOWN
s3cmd-1.1.0-beta3/INSTALL
Installation of s3cmd package
=============================

Author:
    Michal Ludvig

S3tools / S3cmd project homepage:
    http://s3tools.sourceforge.net

Amazon S3 homepage:
    http://aws.amazon.com/s3

!!!
!!! Please consult README file for setup, usage and examples!
!!!

Package formats
---------------
S3cmd is distributed in two formats:

1) Prebuilt RPM file - should work on most RPM-based
   distributions

2) Source .tar.gz package


Installation of RPM package
---------------------------
As user "root" run:

    rpm -ivh s3cmd-X.Y.Z.noarch.rpm

where X.Y.Z is the most recent s3cmd release version.

You may be informed about missing dependencies
on Python or some libraries. Please consult your
distribution documentation on ways to solve the problem.

Installation of source .tar.gz package
--------------------------------------
There are three options to run s3cmd from source tarball:

1) S3cmd program as distributed in s3cmd-X.Y.Z.tar.gz
   can be run directly from where you untar'ed the package.

2) Or you may want to move "s3cmd" file and "S3" subdirectory
   to some other path. Make sure that "S3" subdirectory ends up
   in the same place where you move the "s3cmd" file.

   For instance if you decide to move s3cmd to your $HOME/bin
   you will have $HOME/bin/s3cmd file and $HOME/bin/S3 directory
   with a number of support files.

3) The cleanest and most recommended approach is to run

    python setup.py install

   You will however need Python "distutils" module for this
   to work. It is often part of the core python package (e.g.
   in OpenSuse Python 2.5 package) or it can be installed using your
   package manager, e.g. in Debian use

    apt-get install python2.4-setuptools

   Again, consult your distribution documentation on how to
   find out the actual package name and how to install it then.


Note to distribution package maintainers
----------------------------------------
Define shell environment variable S3CMD_PACKAGING=yes if you
don't want setup.py to install manpages and doc files. You'll
have to install them manually in your .spec or similar package
build scripts.

On the other hand if you want setup.py to install manpages
and docs, but to other than default path, define env
variables $S3CMD_INSTPATH_MAN and $S3CMD_INSTPATH_DOC.
Check out setup.py for details and default values.


Where to get help
-----------------
If in doubt, or if something doesn't work as expected,
get back to us via mailing list:

    s3tools-general@lists.sourceforge.net

For more information refer to:
* S3cmd / S3tools homepage at http://s3tools.sourceforge.net

Enjoy!

Michal Ludvig
* michal@logix.cz
* http://www.logix.cz/michal
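
Sketch for package maintainers
------------------------------
The packaging variables described in the note to package maintainers
can be illustrated with a small standalone sketch. It mirrors the
data_files logic in setup.py, but resolve_data_files() is a hypothetical
helper introduced here for illustration only; it is not part of s3cmd,
and setup.py itself reads the variables with os.getenv().

```python
import os

def resolve_data_files(env=None):
    """Return the data_files list that setup.py would install,
    or None when S3CMD_PACKAGING is set (hypothetical helper)."""
    if env is None:
        env = os.environ
    # Any non-empty S3CMD_PACKAGING value disables manpage/doc installation.
    if env.get("S3CMD_PACKAGING"):
        return None
    # Fall back to the defaults used in setup.py when the paths are not overridden.
    man_path = env.get("S3CMD_INSTPATH_MAN") or "share/man"
    doc_path = env.get("S3CMD_INSTPATH_DOC") or "share/doc/packages"
    return [
        (doc_path + "/s3cmd", ["README", "INSTALL", "NEWS"]),
        (man_path + "/man1", ["s3cmd.1"]),
    ]
```

Calling resolve_data_files({"S3CMD_PACKAGING": "yes"}) returns None,
while an empty environment yields the default share/man and
share/doc/packages locations.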